xmagedo / SADA-Dataset-Dialect-Explorer

SADA-Dataset-Dialect-Explorer

The SADA-Dataset-Dialect-Explorer is a Python project for visualizing and exploring audio from the SADA (Saudi Audio Dataset for Arabic) dataset, filtered by dialect. It lets users load individual audio files, plot their waveforms and spectrograms, and display the metadata associated with each file: ShowName, ProcessedText, SpeakerGender, SpeakerAge, and SpeakerDialect.

Features

- Filter the dataset by dialect and display random audio samples.
- Visualize audio waveforms and spectrograms.
- Display relevant metadata for each audio file.
- Use the Librosa library for audio analysis and visualization.
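
The features listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. The `FileName` column, the dialect labels, and every function name here are assumptions made for the example. The toy rows are invented and are not taken from SADA. The filtering helpers use only the standard library. The plotting function assumes `librosa`, `numpy`, and `matplotlib` are installed, and imports them lazily so the rest of the sketch runs without them.

```python
import csv
import io
import random

# Metadata columns named in this README.
METADATA_FIELDS = ["ShowName", "ProcessedText", "SpeakerGender", "SpeakerAge", "SpeakerDialect"]


def load_rows(csv_file):
    """Read metadata rows from an open CSV file into a list of dicts."""
    return list(csv.DictReader(csv_file))


def filter_by_dialect(rows, dialect):
    """Keep only the rows whose SpeakerDialect matches `dialect` exactly."""
    return [row for row in rows if row["SpeakerDialect"] == dialect]


def sample_rows(rows, k, seed=None):
    """Pick up to k random rows; pass a seed for a reproducible selection."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))


def describe(row):
    """Format the README's metadata fields for one row as printable text."""
    return "\n".join(f"{name}: {row[name]}" for name in METADATA_FIELDS)


def plot_waveform_and_spectrogram(audio_path):
    """Plot the waveform and a dB-scaled spectrogram of one audio file."""
    # Imported lazily so the helpers above work without the audio stack installed.
    import numpy as np
    import librosa
    import librosa.display
    import matplotlib.pyplot as plt

    y, sr = librosa.load(audio_path, sr=None)  # sr=None keeps the native sample rate
    spec_db = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)

    fig, (ax_wave, ax_spec) = plt.subplots(2, 1, figsize=(10, 6))
    librosa.display.waveshow(y, sr=sr, ax=ax_wave)
    ax_wave.set_title("Waveform")
    img = librosa.display.specshow(spec_db, sr=sr, x_axis="time", y_axis="hz", ax=ax_spec)
    ax_spec.set_title("Spectrogram (dB)")
    fig.colorbar(img, ax=ax_spec, format="%+2.0f dB")
    fig.tight_layout()
    return fig


# Invented toy rows for illustration only; these are not taken from SADA.
# "FileName" is an assumed column name for the audio file path.
TOY_CSV = """FileName,ShowName,ProcessedText,SpeakerGender,SpeakerAge,SpeakerDialect
a.wav,Show A,text one,Male,Adult,Najdi
b.wav,Show B,text two,Female,Adult,Hijazi
c.wav,Show A,text three,Male,Young,Najdi
"""

rows = load_rows(io.StringIO(TOY_CSV))
najdi = filter_by_dialect(rows, "Najdi")
for row in sample_rows(najdi, k=1, seed=0):
    print(describe(row))
```

With real data, the same flow would open the dataset's metadata CSV instead of the toy string, then pass each sampled row's audio path to `plot_waveform_and_spectrogram`.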

Future Updates

- Implement Transformer-based NLP models for dialect identification and classification.
- Improve the user interface to make it more interactive and user-friendly.
- Extend the project to support additional features such as sentiment analysis, speaker diarization, and automatic speech recognition (ASR).

Languages

Language: Jupyter Notebook 100.0%