Language Identification using CNN PyTorch

New version link : V2

Problem statement

The goal of this project is to build an application to identify 4 Indian languages from spoken audio.

Solution Proposed

We use deep learning to identify the spoken language from audio. The model is a custom language identification network built with the PyTorch framework. An API takes an audio.mp3 file as input and predicts the language. The application is dockerized and deployed on GCP.
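To make the idea of a custom CNN for language identification concrete, here is a minimal sketch. It is not the repository's actual network: the class name `LanguageIDNet`, the layer sizes, and the input shape (a batch of single-channel mel spectrograms) are all assumptions chosen for illustration.

```python
import torch
from torch import nn

NUM_LANGUAGES = 4  # Hindi, Kannada, Tamil, Telugu (from the dataset description)


class LanguageIDNet(nn.Module):
    """Hypothetical small CNN over spectrograms; not the repository's network."""

    def __init__(self, num_classes: int = NUM_LANGUAGES) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2),
            # Global average pooling makes the network independent of clip length.
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, 1, n_mels, time_frames)
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))


model = LanguageIDNet().eval()
# Assumed input: 2 clips, 64 mel bands, 157 time frames (illustrative values only).
dummy_batch = torch.randn(2, 1, 64, 157)
with torch.no_grad():
    logits = model(dummy_batch)
predicted = logits.argmax(dim=1)  # one class index per clip
```

The network returns one score per language for each clip, and the index of the largest score is taken as the prediction.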

Dataset Used

The dataset contains audio samples in 4 Indian languages. Each sample is 5 seconds long. The dataset was created from regional videos available on YouTube.

The dataset is limited to Indian languages but could be extended to others.

Languages present in the dataset: Hindi, Kannada, Tamil, Telugu.
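The dataset description above translates into a small amount of bookkeeping code: a label mapping for the four languages and a helper that forces every clip to 5 seconds. The sketch below is illustrative only. The label order, the 16 kHz sample rate, and the name `fix_length` are assumptions, since the README does not state them.

```python
LANGUAGES = ["Hindi", "Kannada", "Tamil", "Telugu"]  # from the dataset description
CLIP_SECONDS = 5       # each sample is 5 seconds long
SAMPLE_RATE = 16_000   # assumption: the sample rate is not stated in the README

# Map language names to class indices and back (the order is an assumption).
LABEL_TO_INDEX = {name: i for i, name in enumerate(LANGUAGES)}
INDEX_TO_LABEL = {i: name for name, i in LABEL_TO_INDEX.items()}


def fix_length(samples: list[float], sample_rate: int = SAMPLE_RATE) -> list[float]:
    """Trim or zero-pad a mono waveform to exactly CLIP_SECONDS."""
    target = CLIP_SECONDS * sample_rate
    if len(samples) >= target:
        return samples[:target]
    return samples + [0.0] * (target - len(samples))
```

Fixing the clip length keeps every training example the same shape, which simplifies batching.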

How to run?

Step 1: Clone the repository

git clone "https://github.com/Deep-Learning-01/language-identification-using-cnn-pytorch.git" repository

Step 2: Open the repository and create a conda environment

conda create -p env python=3.10 -y

conda activate env/

Step 3: Install the requirements

pip install -r requirements.txt

Step 4: Export the environment variables

export AWS_ACCESS_KEY_ID=<AWS_ACCESS_KEY_ID>

export AWS_SECRET_ACCESS_KEY=<AWS_SECRET_ACCESS_KEY>

export AWS_DEFAULT_REGION=<AWS_DEFAULT_REGION>

Before running the application server, make sure your S3 bucket exists and is empty.
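A missing credential usually surfaces only when the application first touches S3, so it can help to check the three variables before starting the server. The following is a minimal sketch; the helper name `missing_aws_vars` is hypothetical and not part of the repository.

```python
import os

# The three variables exported in Step 4.
REQUIRED_AWS_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
)


def missing_aws_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All AWS environment variables are set.")
```

This only checks that the variables are present. It does not verify that the credentials are valid or that the bucket is empty.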

Step 5: Run the application server

python app.py

Step 6: Train the model

http://localhost:8080/train

Step 7: Run predictions

http://localhost:8080
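Steps 6 and 7 can also be driven from a script instead of a browser. The sketch below uses only the standard library. It assumes that the /train route responds to a plain GET request, as it would when opened in a browser; the helper names `endpoint` and `trigger_training` are hypothetical.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

BASE_URL = "http://localhost:8080"  # address used in Steps 6 and 7


def endpoint(path: str = "") -> str:
    """Build a full URL for a route on the local application server."""
    return urljoin(BASE_URL + "/", path.lstrip("/"))


def trigger_training(timeout: float = 3600.0) -> int:
    """Request the /train route and return the HTTP status code.

    Assumes the route accepts GET and returns only after training ends,
    hence the generous (arbitrary) timeout.
    """
    with urlopen(endpoint("train"), timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    print("Training endpoint:", endpoint("train"))
    print("Status:", trigger_training())
```

The server from Step 5 must already be running before `trigger_training` is called.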

Run locally

Check if the Dockerfile is available in the project directory

Build the Docker image

docker build -t langapp .

Run the Docker image

docker run -d -p 8080:8080 <IMAGEID>

👨‍💻 Tech Stack Used

Python
Flask
PyTorch
Docker
CNN

🌐 Infrastructure Required.

AWS S3
GAR (Google Artifact Registry)
GCE (Google Compute Engine)
GitHub Actions

`src` is the main package folder. It contains:

Artifact: Stores all artifacts created by running the application.

Components: Contains the components of the machine learning pipeline:

DataIngestion
DataTransformation
ModelTrainer
ModelEvaluation
ModelPusher
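The components listed above run one after another, each consuming what the earlier ones produced. The sketch below shows that ordering only. The stage names come from the list, while `run_pipeline`, `make_placeholder`, and the placeholder outputs are hypothetical stand-ins for the real classes in `src`.

```python
from typing import Callable, Dict

# Stage names taken from the component list; the order follows that list.
STAGE_ORDER = [
    "DataIngestion",
    "DataTransformation",
    "ModelTrainer",
    "ModelEvaluation",
    "ModelPusher",
]

Stage = Callable[[Dict[str, str]], str]


def run_pipeline(stages: Dict[str, Stage]) -> Dict[str, str]:
    """Run every stage in order, passing it the artifacts produced so far."""
    artifacts: Dict[str, str] = {}
    for name in STAGE_ORDER:
        artifacts[name] = stages[name](artifacts)
    return artifacts


def make_placeholder(name: str) -> Stage:
    """Create a dummy stage that only reports how many stages ran before it."""
    def stage(artifacts: Dict[str, str]) -> str:
        return f"{name} artifact (after {len(artifacts)} earlier stages)"
    return stage


placeholder_stages = {name: make_placeholder(name) for name in STAGE_ORDER}
result = run_pipeline(placeholder_stages)
```

In a real pipeline each stage would return a structured artifact, such as file paths or metrics, rather than a string.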

Custom Logger and Exceptions are used in the project for better debugging purposes.
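As an illustration of how a custom exception can help debugging, the sketch below wraps a low-level error and records the file and line where it occurred. The names `PipelineError`, `safe_divide`, and the logger name are hypothetical; the repository's own classes may look different.

```python
import logging

# Hypothetical logger name; the project defines its own logger.
logger = logging.getLogger("language_identification")


class PipelineError(Exception):
    """Wrap another exception and record where it was raised."""

    def __init__(self, message: str, cause: BaseException) -> None:
        tb = cause.__traceback__
        # Walk to the innermost frame, where the error actually occurred.
        while tb is not None and tb.tb_next is not None:
            tb = tb.tb_next
        self.location = "unknown location"
        if tb is not None:
            self.location = f"{tb.tb_frame.f_code.co_filename}:{tb.tb_lineno}"
        super().__init__(
            f"{message} [{type(cause).__name__}: {cause}] at {self.location}"
        )


def safe_divide(a: float, b: float) -> float:
    """Example operation that converts a low-level error into PipelineError."""
    try:
        return a / b
    except ZeroDivisionError as exc:
        logger.error("Division failed: %s", exc)
        raise PipelineError("Could not compute ratio", exc) from exc
```

Calling `safe_divide(1, 0)` logs the failure and raises a `PipelineError` whose message includes the original error type and its source location.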

Conclusion

This project can be used by any organization to identify the language spoken in videos and other audio files.

=====================================================================

About

Draft 1 of the language identification project (https://github.com/aravind-selvam/language_identification-using-cnn-and-audio-processing.git)

Apache License 2.0
