Accent Classification of Nigerian English Speakers
Stanley Dukor
Uses deep learning on spectrogram images of audio data to detect the accent of native Nigerian speakers.
Project status: Under Development
Groups
Student Developers for AI
Overview / Usage
One of the major challenges in speech recognition is understanding speech from non-native English speakers. Accent classification can improve an automatic speech recognition system by identifying the ethnicity of a speaker and switching to a speech recognition system trained for that particular accent. Accent recognition, because it identifies a speaker's ethnicity, is also crucial in applications such as crime investigation. In this project, I attempt to address this problem by classifying about an hour of accented voice clips, assigning each clip to one of the target Nigerian native languages.
Methodology / Approach
The approach to accent classification in this project consists of feature extraction followed by machine learning classifiers. The three target languages are Igbo, Yoruba, and Hausa. I plan to gather audio data for these languages, split the recordings into short audio files based on the audio sample rate to form a large dataset, and denoise the audio signal using techniques such as Gaussian smoothing and median filtering. I will then use acoustic features such as MFCCs and spectrograms to represent the audio signals as images, and apply PCA to these features to reduce the data dimensionality while capturing the important variations in the data. I will explore several machine learning classifiers for assigning Nigerian English speakers to one of the three languages: k-Nearest Neighbors, Support Vector Classifier, Multi-Layer Perceptron, and Convolutional Neural Network. My target is an accuracy of 96% when classifying the accents of previously unseen Nigerian English speakers.
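The splitting and denoising steps can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's actual pipeline: the helper names `split_into_clips` and `median_filter`, the clip length, the kernel size, and the toy signal are all hypothetical, and a real pipeline would operate on arrays loaded from audio files.

```python
from statistics import median


def split_into_clips(samples, sample_rate, clip_seconds=3.0):
    """Split a 1-D sequence of samples into equal-length clips.

    The clip length in samples is sample_rate * clip_seconds. A trailing
    remainder shorter than one clip is dropped.
    """
    clip_len = int(sample_rate * clip_seconds)
    if clip_len <= 0:
        raise ValueError("clip length must be at least one sample")
    n_clips = len(samples) // clip_len
    return [list(samples[i * clip_len:(i + 1) * clip_len]) for i in range(n_clips)]


def median_filter(samples, kernel_size=3):
    """Replace each sample with the median of a window centred on it.

    Edges are handled by repeating the first and last sample, so every
    window has exactly kernel_size values.
    """
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    samples = list(samples)
    if not samples:
        return []
    half = kernel_size // 2
    padded = [samples[0]] * half + samples + [samples[-1]] * half
    return [median(padded[i:i + kernel_size]) for i in range(len(samples))]


# Toy signal (hypothetical): 10 samples at a 4 Hz "sample rate" with one spike.
signal = [0.0, 0.1, 9.0, 0.1, 0.0, 0.1, 0.0, 0.1, 0.0, 0.1]
clips = split_into_clips(signal, sample_rate=4, clip_seconds=1.0)
denoised = [median_filter(clip) for clip in clips]
```

With these toy settings the signal yields two clips of four samples each, and the median filter suppresses the isolated spike in the first clip. The kernel size trades noise suppression against smearing of short speech events, so it would need tuning on real recordings.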
Technologies Used
I am using the Librosa Python library to convert the audio data to spectrogram images, and the PyTorch deep learning framework to train models on those images.