ASL Fingerspelling to English Translation


Translating the American Sign Language alphabet to the English alphabet.

Project status: Under Development

Internet of Things, Artificial Intelligence


Overview / Usage

The initial idea for this project is to create an application where a person can communicate in ASL in front of a webcam and then the equivalent in English appears as a caption.
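Below is a minimal sketch of this intended pipeline, assuming OpenCV for the webcam capture and caption overlay; the detect_hand and classify_letter helpers are hypothetical placeholders for the detection and classification steps described under Methodology / Approach.

```python
# Sketch of the intended webcam-to-caption loop. detect_hand() and
# classify_letter() are placeholders for the models described below.
import cv2

def detect_hand(frame):
    # Placeholder for the hand detector: return a bounding box (x, y, w, h)
    # or None when no hand is found.
    return None

def classify_letter(hand_crop):
    # Placeholder for the classifier: return the predicted ASL letter.
    return ""

cap = cv2.VideoCapture(0)            # open the default webcam
caption = ""
while True:
    ok, frame = cap.read()
    if not ok:
        break
    box = detect_hand(frame)
    if box is not None:
        x, y, w, h = box
        caption += classify_letter(frame[y:y + h, x:x + w])
    # draw the running caption on the frame
    cv2.putText(frame, caption, (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
    cv2.imshow("ASL Fingerspelling", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```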

--- Why is this important? ---

AI for Social Good: This project would contribute to accessibility for people with hearing disabilities (think of video consultations or voice assistants like Alexa, which are inaccessible to deaf and mute people).

WHAT IS ASL?

It is important to understand that American Sign Language (ASL) is a different language from English; it has its own grammar, vocabulary, etc. To understand a word in ASL you need to consider not only the hand gesture, but also its position with respect to the face, the position of the arms, and how fast the person is making the signs - all of these factor into the meaning of a gesture. Moreover, ASL is not a direct translation of English. For example, to say "I am 18 years old and have no brothers or sisters", the direct transcript in ASL would be something like "I – how-old? – 18 and have none brothers and none sisters." All these differences make general ASL-to-English translation much more complex.

Methodology / Approach

To make the project more approachable, I concentrated on ASL Fingerspelling.

--- Why? ---

  • Fingerspelling is static (except for the letters J and Z), so there is no need for context.
  • Useful: fingerspelling is used by native ASL signers to (1) spell out names, (2) sign words that are not in the dictionary yet, and (3) emphasize a word.

APPROACH:

(1) Hand Detection (find where the hand is in the image to make the classification easier).

For this, I used the TensorFlow Object Detection API with the ssd_mobilenet_v1_coco model.

Dataset: I initially used the EgoHands dataset, but then moved to a dataset I labelled myself from videos of people signing in ASL (a small dataset of only ~300 images).
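Below is a minimal sketch of how the exported detector could be run on one frame, assuming a TF1-style frozen graph (frozen_inference_graph.pb) exported with the Object Detection API; the file paths and the 0.5 score threshold are assumptions, not the project's actual settings.

```python
# Run the exported hand detector on a single image and crop the best detection.
import numpy as np
import tensorflow as tf
import cv2

PATH_TO_FROZEN_GRAPH = "hand_detector/frozen_inference_graph.pb"  # assumed path

# Load the frozen detection graph exported by the Object Detection API.
detection_graph = tf.Graph()
with detection_graph.as_default():
    graph_def = tf.compat.v1.GraphDef()
    with tf.io.gfile.GFile(PATH_TO_FROZEN_GRAPH, "rb") as f:
        graph_def.ParseFromString(f.read())
    tf.compat.v1.import_graph_def(graph_def, name="")

with tf.compat.v1.Session(graph=detection_graph) as sess:
    frame = cv2.imread("frame.jpg")                      # any BGR test image
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    # The OD API graphs take a batch of uint8 images and return normalized boxes.
    boxes, scores = sess.run(
        ["detection_boxes:0", "detection_scores:0"],
        feed_dict={"image_tensor:0": np.expand_dims(rgb, axis=0)})
    h, w = frame.shape[:2]
    if scores[0][0] > 0.5:                               # keep the best detection
        ymin, xmin, ymax, xmax = boxes[0][0]
        hand = frame[int(ymin * h):int(ymax * h), int(xmin * w):int(xmax * w)]
        cv2.imwrite("hand_crop.jpg", hand)
```

At inference time, the resulting crop is what gets passed on to the classifier in step (2).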

(2) Image Classification (classify the cropped hand image using a model trained on a large dataset of the ASL alphabet).

For this, I used a pre-trained image classification model from TensorFlow: inception_v3.
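As a rough illustration of this step, here is a minimal transfer-learning sketch using tf.keras's InceptionV3 backbone; the directory layout, the class count (24 static letters), and the training hyperparameters are assumptions, and the original training setup may differ.

```python
# Transfer learning: pre-trained InceptionV3 backbone + new classification head.
import tensorflow as tf

NUM_CLASSES = 24          # static letters only (J and Z involve motion)
IMG_SIZE = (299, 299)     # InceptionV3's native input size

# Assumed layout: asl_alphabet/train/<letter>/*.jpg
train_ds = tf.keras.utils.image_dataset_from_directory(
    "asl_alphabet/train", image_size=IMG_SIZE, batch_size=32)

base = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", pooling="avg")
base.trainable = False    # train only the new head at first

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # scale pixels to [-1, 1]
    base,
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=5)
```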

Technologies Used

For now, the only Intel technology I have used is the Intel NUC, on which I trained my models (which has made everything SO much faster), but in the future I will use the OpenVINO toolkit and the NCS2 to optimize everything.

Repository

https://github.com/sanchzs/asl-fingerspelling
