Object-detecting smart glasses for patients with sight disorders

Debdut Goswami

Kolkata, West Bengal


The idea is to make smart glasses that help blind people understand their surroundings. The glasses will analyse the surroundings and describe them to the wearer via audio feedback.

Project status: Concept

Internet of Things, Artificial Intelligence

Intel Technologies
DevCloud, Movidius NCS

Overview / Usage

Why is this idea unique?

Glasses that help blind people see do exist, but they are very expensive and mostly unaffordable for common people. By using computer vision, we can make this much cheaper.

Idea:

The idea is to make smart glasses that help blind people understand their surroundings. The glasses will analyse the surroundings and describe them to the wearer via audio feedback. They will act like an assistant, constantly telling the person about the things near him/her, and will even inform the wearer if anything is approaching. We can also analyse the depth of field, so the glasses can tell how far objects are from the person, and we can measure the speed at which various objects are moving. In addition, a haptic feedback module can warn the wearer about fast-approaching objects or other dangers. These are secondary features; the primary objective is to explain the surroundings.

Features:

  1. Explaining the surroundings (primary)
  2. Estimating the approximate speed of various vehicles
  3. Warning about approaching dangers
  4. Haptic feedback for critical dangers (e.g., a vehicle heading directly towards the wearer)
  5. Emergency mode (if the user meets with any mishap, the glasses will automatically inform the nearest hospital and police station)
  6. Pulse monitoring (an additional feature)
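The distance and speed features above can be approximated from the camera geometry alone. A minimal sketch, assuming a pinhole camera model; the focal length and the known object heights are illustrative values, not calibrated ones:

```python
# Rough distance/speed estimation from bounding-box geometry.
# Assumes a pinhole camera model; FOCAL_LENGTH_PX and the known
# object heights below are illustrative, not calibrated values.

FOCAL_LENGTH_PX = 700.0                         # assumed focal length, pixels
KNOWN_HEIGHTS_M = {"person": 1.7, "car": 1.5}   # rough real-world heights

def estimate_distance_m(label, box_height_px):
    """Distance ~= real_height * focal_length / pixel_height."""
    real_height = KNOWN_HEIGHTS_M.get(label)
    if real_height is None or box_height_px <= 0:
        return None
    return real_height * FOCAL_LENGTH_PX / box_height_px

def estimate_speed_mps(dist_prev_m, dist_now_m, dt_s):
    """Approach speed from the change in estimated distance (positive = approaching)."""
    if dt_s <= 0:
        return 0.0
    return (dist_prev_m - dist_now_m) / dt_s

d1 = estimate_distance_m("car", 70)   # car occupies 70 px of the frame
d2 = estimate_distance_m("car", 100)  # one second later: 100 px (closer)
print(d1, d2, estimate_speed_mps(d1, d2, 1.0))
```

With the assumed numbers, a car growing from 70 px to 100 px in one second is approaching at roughly 4.5 m/s, which is the kind of signal the haptic warning would key on.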

Methodology / Approach

Implementation:

We need to make smart glasses that have a mini camera mounted at the bridge, a Wi-Fi chip to transmit the data, and an earpiece to tell the wearer about his/her surroundings.

I plan to use deep learning, and specifically the Faster R-CNN algorithm, for object detection and other analyses of the video obtained from the camera module on the glasses. Depending on the confidence of each prediction, the detected objects will be described to the wearer via the earpiece.
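As a sketch of this step, torchvision ships a pretrained Faster R-CNN that could be used while prototyping. The `run_detector` wrapper and the 0.7 confidence threshold are assumptions, and only detections above that threshold are turned into text for the earpiece:

```python
def describe_detections(labels, scores, threshold=0.7):
    """Turn raw detections into a short sentence for text-to-speech."""
    kept = [lab for lab, sc in zip(labels, scores) if sc >= threshold]
    if not kept:
        return "Nothing recognised nearby."
    return "I can see " + ", ".join(kept) + "."

def run_detector(frame):
    """Run torchvision's pretrained Faster R-CNN on one RGB frame.

    Heavy dependency, so imported lazily; on the final device the model
    would instead be converted to run on the Neural Compute Stick.
    """
    import torch
    from torchvision.models.detection import fasterrcnn_resnet50_fpn
    model = fasterrcnn_resnet50_fpn(pretrained=True).eval()
    with torch.no_grad():
        out = model([torch.as_tensor(frame).permute(2, 0, 1) / 255.0])[0]
    return out["labels"].tolist(), out["scores"].tolist()

# Low-confidence detections (the 0.40 "dog") are filtered out.
print(describe_detections(["person", "car", "dog"], [0.95, 0.80, 0.40]))
```

The threshold trades false alarms against missed objects; for a safety device it would have to be tuned on real footage rather than fixed at 0.7.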

One thing to keep in mind is that this will operate in the real world, so it needs to be fast: if the analysis and prediction lag behind the scene, the delay itself might cause a mishap. I intend to use the Faster R-CNN approach, but it requires high computational power to train and run the model. We also need to account for the latency involved in the overall transmission process.

Now there are three ways of doing it:

  1. Online (using the cloud)
  2. Offline (using a Raspberry Pi or another development board)
  3. Edge computing

1. Online:

We can stream the video to the cloud, let the machine learning model predict the objects, and then send the predictions back to the smart glasses.

Drawbacks:

  1. Needs access to a high-speed internet connection to function.
  2. Needs a low-latency server to host the model, because high latency delays the spoken predictions.

Benefits:

  1. Much cheaper compared to the offline alternative.
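The online path can be sketched as a thin client on the glasses that uploads each frame and speaks the reply. The endpoint URL and the JSON response shape (`{"objects": [...]}`) are assumptions for illustration:

```python
import json

PREDICT_URL = "https://example.com/predict"  # hypothetical cloud endpoint

def parse_prediction(body):
    """Turn the server's JSON reply into a sentence for the earpiece."""
    objects = json.loads(body).get("objects", [])
    if not objects:
        return "Nothing detected."
    return "Ahead of you: " + ", ".join(objects) + "."

def send_frame(jpeg_bytes):
    """POST one JPEG frame to the cloud model and return the spoken text."""
    import requests  # lazy import: only needed on the device itself
    resp = requests.post(
        PREDICT_URL,
        data=jpeg_bytes,
        headers={"Content-Type": "image/jpeg"},
        timeout=2,  # a late answer is useless for a safety device
    )
    return parse_prediction(resp.text)

print(parse_prediction('{"objects": ["person", "car"]}'))
```

The short `timeout` reflects the latency concern above: a prediction that arrives seconds late is worse than no prediction.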

2. Offline:

In this case, we deploy the machine learning model on a Raspberry Pi: the glasses stream the video to the Rpi, the model predicts the objects, and the Rpi sends the predictions back to the smart glasses.

Drawbacks:

  1. The initial cost will be high, as we need to provide every consumer with an Rpi.
  2. The wearer needs to carry around an additional item, roughly the size of an external hard disk.

Benefits:

  1. Lower latency, so much less risk of a delayed prediction.
  2. No need for internet connectivity.

3. Edge Computing

Edge computing is the practice of processing data near the edge of the network, where the data is generated, instead of in a centralized data-processing warehouse. It brings computation and data storage closer to where they are needed, improving response times and saving bandwidth.

I personally consider this the best solution, because uploading streaming video to the cloud is technically difficult and would significantly increase the monthly cloud bill. Moreover, the required high-speed internet connection is hard to guarantee, especially in India. And the Rpi alone simply cannot perform such heavy computation on such a low-power board.

That is what brings us to the concept of edge computing. We are going to use the Intel Neural Compute Stick 2 (INCS2) to run the predictions at the edge of the network, which largely eliminates the need for the cloud.

One thing to keep in mind is that the Intel Neural Compute Stick 2 needs a host to work: we plug the INCS2 into the Rpi.
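On the software side, the NCS2 is driven through Intel's OpenVINO toolkit. A minimal sketch, assuming the trained network has already been converted to OpenVINO IR format; the `model.xml` path is a placeholder, and the exact API may differ between OpenVINO releases:

```python
def top_labels(scores, class_names, k=3):
    """Pure helper: pick the k highest-scoring class names."""
    ranked = sorted(zip(scores, class_names), reverse=True)
    return [name for _, name in ranked[:k]]

def infer_on_ncs2(frame):
    """Run one frame through an IR model on the Neural Compute Stick 2.

    Lazy import so the rest of the code runs without OpenVINO installed;
    the "MYRIAD" device name selects the NCS2.
    """
    from openvino.runtime import Core
    core = Core()
    model = core.read_model("model.xml")        # placeholder IR path
    compiled = core.compile_model(model, "MYRIAD")
    return compiled([frame])[compiled.output(0)]

print(top_labels([0.1, 0.9, 0.4], ["dog", "person", "car"], k=2))
```

Compiling once at startup and reusing the compiled model per frame, as above, is what keeps per-frame latency low on the stick.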

As such, this method has no major drawback and combines the benefits of both the online and offline methods, so I prefer this approach.

Designing the connections:

  1. Mount a camera on the glasses.
  2. Attach a Wi-Fi module at the end of the glasses (this will be used to send the video from the camera to the Rpi).
  3. Set up the Rpi with the Intel Neural Compute Stick 2 (the stick will run the predictions).
  4. Send the predictions back to the transmitting point.
  5. Attach a small earpiece to the glasses, which will speak the predictions using text-to-speech.
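The last step can be sketched with `pyttsx3` for offline text-to-speech (an assumption; any TTS engine available on the Rpi would do), with the message-building kept as plain Python:

```python
def build_announcement(objects, distances_m=None):
    """Compose the sentence the earpiece will speak.

    distances_m is an optional parallel list; None entries mean the
    distance for that object could not be estimated.
    """
    if not objects:
        return "All clear."
    parts = []
    for i, obj in enumerate(objects):
        if distances_m and distances_m[i] is not None:
            parts.append(f"{obj} about {distances_m[i]:.0f} metres away")
        else:
            parts.append(obj)
    return "Attention: " + "; ".join(parts) + "."

def speak(text):
    """Speak on the device; pyttsx3 runs offline, which suits the Rpi."""
    import pyttsx3  # lazy import: only needed on the glasses' Rpi
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

print(build_announcement(["car", "person"], [12.0, None]))
```

Keeping sentence construction separate from the TTS engine makes it easy to test the wording, and to swap the engine later without touching the logic.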

Note:

We need to deploy the trained model onto the Rpi and have all its dependencies installed. We also need to pair the external Wi-Fi module with the Rpi’s on-board Wi-Fi so the two can communicate with each other.
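For the camera-to-Rpi link, simple length-prefixed framing over TCP is enough to carry JPEG frames reliably. A sketch showing only the pack/unpack helpers; the socket endpoints, port, and the framing protocol itself are assumptions:

```python
import struct

HEADER = struct.Struct(">I")  # 4-byte big-endian frame length prefix

def pack_frame(jpeg_bytes):
    """Prefix a JPEG frame with its length for sending over TCP."""
    return HEADER.pack(len(jpeg_bytes)) + jpeg_bytes

def unpack_frame(buffer):
    """Return (frame, remaining_buffer), or (None, buffer) if incomplete.

    TCP delivers a byte stream, not messages, so the receiver keeps a
    buffer and peels off one complete frame at a time.
    """
    if len(buffer) < HEADER.size:
        return None, buffer
    (length,) = HEADER.unpack_from(buffer)
    end = HEADER.size + length
    if len(buffer) < end:
        return None, buffer
    return buffer[HEADER.size:end], buffer[end:]

packet = pack_frame(b"\xff\xd8fake-jpeg\xff\xd9")
frame, rest = unpack_frame(packet + b"next")
print(frame, rest)
```

The same framing can carry the prediction text on the way back, so one tiny protocol serves both directions of the link.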

Usefulness:

This is not the most sophisticated solution for helping those with impaired eyesight; glasses that actually help blind people see are more sophisticated, but they are costly and mostly unaffordable.

This can be a much cheaper alternative, more of an assistant that guides you along the way. These glasses won’t help blind people see; instead, they act like an assistant, which can seriously benefit those with vision problems.

Preparing the Dataset:

We don’t need to build much of a dataset ourselves, because many datasets of real-world scenes (roads, festivals, etc.) already exist. We might need to create region-specific datasets, since images from the US obviously won’t work well in India.

We can also use Intel’s Computer Vision Annotation Tool (CVAT), an open-source tool for annotating digital images and videos. Its main function is to provide users with convenient annotation instruments, and Intel designed it as a versatile service with many powerful features.

Technologies Used

I plan to use Python for coding the machine learning model, mainly because of its immense range of pre-written libraries. Some of the libraries I plan on using for this project are as follows:

  • PIL,
  • NumPy,
  • Pandas,
  • OpenCV,
  • TensorFlow (mainly Keras),
  • PyTorch,
  • scikit-learn,
  • And a few more supporting libraries like Matplotlib, Seaborn, etc.

Hardware required:

  • Raspberry Pi (with built-in WiFi chip)
  • Intel Neural Compute Stick 2 (INCS2)
  • Rpi Camera module
  • WiFi module
  • Earpiece
  • Glasses