"Namaste" Artificial Intelligence

Rishiraj Acharya

Kolkata, West Bengal

A trigger word detection model: you record a clip of yourself talking, and the model rings a chime when it detects you saying "Namaste", a customary Indian greeting. It can also be extended to run on your laptop so that every time you say "Namaste" it starts your favorite action.

Project status: Published/In Market

Artificial Intelligence

Groups
Student Developers for AI

Intel Technologies
Intel Python

Overview / Usage

In this project we construct a speech dataset and implement a model for trigger word detection, more commonly known as keyword or wake word detection.

Models of this kind are part of everyday life: "Ok Google" wakes Google Assistant and "Hey Siri" wakes Siri. In this project the trigger word is "Namaste", a customary Indian greeting. Every time the model hears someone say "Namaste", it plays an alert sound.

The model can also run on a computer so that saying "Namaste" triggers any desired action, such as turning on connected lights in the house or opening a specified app.
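
As a rough illustration of how a detection could be turned into an action, the sketch below scans the model's per-time-step probabilities and calls a user-supplied function once per detection. The function names, the 0.5 threshold and the `min_gap` spacing are assumptions made for this example, not values taken from the project.

```python
from typing import Callable, List, Sequence


def detect_triggers(
    probs: Sequence[float],
    threshold: float = 0.5,   # assumed decision threshold
    min_gap: int = 75,        # assumed minimum spacing between two triggers
) -> List[int]:
    """Return the time-step indices at which a trigger should fire.

    A trigger fires when the probability exceeds `threshold` and at least
    `min_gap` steps have passed since the previous trigger, so that one
    spoken "Namaste" does not fire many times in a row.
    """
    fired: List[int] = []
    last = None
    for step, p in enumerate(probs):
        if p > threshold and (last is None or step - last >= min_gap):
            fired.append(step)
            last = step
    return fired


def run_actions(probs: Sequence[float], action: Callable[[int], None], **kwargs) -> int:
    """Call `action(step)` once per detected trigger and return the count."""
    steps = detect_triggers(probs, **kwargs)
    for step in steps:
        action(step)
    return len(steps)


if __name__ == "__main__":
    # Toy model output: silence, one detection burst, silence, a second burst.
    toy_probs = [0.1] * 10 + [0.9] * 5 + [0.1] * 10 + [0.8] * 3
    run_actions(toy_probs, lambda s: print(f"Namaste detected at step {s}"), min_gap=8)
```

The `action` callback is where a chime, a light switch call or an app launch would go; the spacing rule only exists to keep one utterance from firing the action repeatedly.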

Methodology / Approach

  1. Creation of the speech dataset by recording positive examples of the word "Namaste", negative examples (other words), and background noise. The spectrogram of each raw recording is then computed and divided into time intervals.
  2. Generation of a large training set by synthesizing artificial examples: positive or negative words are overlaid on background noise, since natural data is hard to collect and label.
  3. Development of a model using a 1-D convolutional layer to extract low-level features and produce an output of smaller dimension, two GRU layers to read the sequence of inputs, and a Dense layer with sigmoid activation to estimate the probability that the output is 1. The model can be trained with the Adam optimizer and binary cross-entropy loss.
  4. Testing of the model with metrics such as F1 score or precision/recall.
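
Steps 2 and 4 above can be sketched without any audio library. The code below only decides where word clips would sit on a background clip, builds the matching label sequence, and computes precision, recall and F1. Everything specific here is an assumption for illustration: the 10,000 ms background, the 1,375 label steps, the 50-step label span placed right after each positive clip, and all function names. With Pydub, the actual mixing would presumably be done with `AudioSegment.overlay(clip, position=start)`.

```python
import random
from typing import Dict, List, Optional, Tuple

Segment = Tuple[int, int]  # inclusive (start, end) positions in the background, in ms


def overlaps(candidate: Segment, taken: List[Segment]) -> bool:
    """True if `candidate` shares any position with an already placed clip."""
    start, end = candidate
    return any(start <= t_end and end >= t_start for t_start, t_end in taken)


def place_clip(clip_len: int, background_len: int, taken: List[Segment],
               rng: random.Random, max_tries: int = 100) -> Optional[Segment]:
    """Pick a random free slot for a clip, or return None if none is found."""
    for _ in range(max_tries):
        start = rng.randint(0, background_len - clip_len)
        candidate = (start, start + clip_len - 1)
        if not overlaps(candidate, taken):
            taken.append(candidate)
            return candidate
    return None


def label_positive(labels: List[int], end_pos: int, background_len: int,
                   span: int = 50) -> None:
    """Set up to `span` label steps to 1 right after a positive clip ends."""
    n_steps = len(labels)
    end_step = int(end_pos * n_steps / background_len)
    for step in range(end_step + 1, min(end_step + 1 + span, n_steps)):
        labels[step] = 1


def make_example(background_len: int, n_steps: int, positive_lens: List[int],
                 negative_lens: List[int], rng: random.Random
                 ) -> Tuple[Dict[str, List[Segment]], List[int]]:
    """Plan one synthetic example: clip placements plus the label sequence."""
    taken: List[Segment] = []
    labels = [0] * n_steps
    placed: Dict[str, List[Segment]] = {"positive": [], "negative": []}
    for kind, lengths in (("positive", positive_lens), ("negative", negative_lens)):
        for clip_len in lengths:
            segment = place_clip(clip_len, background_len, taken, rng)
            if segment is None:
                continue  # no free slot found; skip this clip
            placed[kind].append(segment)
            if kind == "positive":
                label_positive(labels, segment[1], background_len)
    return placed, labels


def precision_recall_f1(y_true: List[int], y_pred: List[int]) -> Tuple[float, float, float]:
    """Binary precision, recall and F1, with 0.0 when a ratio is undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Only positive clips change the labels; negative clips occupy space on the background but leave the labels at 0, which is what lets the model learn to ignore other words.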

Technologies Used

AudioSegment from Pydub, NumPy, Keras, TensorFlow

Repository

https://github.com/rishiraj-acharya/Namaste-AI
