mfcc

There are 5 repositories under the mfcc topic.

ddbourgin / numpy-ml
Machine learning, in numpy
machine-learning neural-networks topic-modeling gaussian-mixture-models hidden-markov-models gradient-boosting bayesian-inference wavenet vae resnet lstm wgan-gp attention reinforcement-learning good-turing-smoothing knn mfcc gaussian-processes word2vec
Language:Python 16145
aubio / aubio
a library for audio and music analysis
audio music analysis c python sound extraction annotation onset pitch beat tempo-tracking mfcc
Language:C 3519
libAudioFlux / audioFlux
A library for audio and music analysis, feature extraction.
audio audio-analysis audio-features python music music-information-retrieval spectrogram wavelet-transform spectral-analysis wavelet-analysis time-frequency-analysis mfcc pitch mir music-analysis signal-processing audio-processing deep-learning machine-learning
Language:C 3031
x4nth055 / emotion-recognition-using-speech
Building and training a speech emotion recognizer that predicts human emotions using Python, scikit-learn and Keras
machine-learning speech-emotion-recognition emotion-recognition emotion-recognizer sklearn kneighborsclassifier random-forest-classifier mfcc feature-extraction emotion-detection keras support-vector-machine gradient-boosting mlp-classifier librosa deep-learning neural-networks recurrent-neural-networks
Language:Python 646
ar1st0crat / NWaves
.NET DSP library with a lot of audio processing functions
audio dsp filtering sound-effects feature-extraction psychoacoustics sound-synthesis wav mfcc lpc pitch resampling mir signal fda noise time-stretch adaptive-filtering wavelets
Language:C# 507
SuperKogito / spafe
:sound: spafe: Simplified Python Audio Features Extraction
python dsp audio music audio-analysis music-information-retrieval features-extraction mfcc filterbank signal-processing frequency frequency-analysis time-frequency-analysis frequencies voice sound beat pitch speech-processing gammatone-filterbanks
Language:Python 477
adamstark / Gist
A C++ Library for Audio Analysis
audio-analysis c-plus-plus pitch-tracking mfcc onset-detection music audio gist music-information-retrieval mir spectral-analysis fft
Language:C++ 378
gionanide / Speech_Signal_Processing_and_Classification
Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification, that is, in developing two-class classifiers that can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject. The mathematical modeling of the speech production system in humans suggests that an all-pole system function is justified [1-3]. As a consequence, linear prediction coefficients (LPCs) constitute a first choice for modeling the magnitude of the short-term spectrum of speech. LPC-derived cepstral coefficients are guaranteed to discriminate between the system (e.g., vocal tract) contribution and that of the excitation. Taking into account the characteristics of the human ear, the mel-frequency cepstral coefficients (MFCCs) emerged as descriptive features of the speech spectral envelope. Similarly to MFCCs, the perceptual linear prediction coefficients (PLPs) could also be derived. These traditional features, so to speak, will be tested against agnostic features extracted by convolutional neural networks (CNNs) (e.g., auto-encoders) [4]. The pattern recognition step will be based on Gaussian mixture model based classifiers, K-nearest neighbor classifiers, Bayes classifiers, as well as deep neural networks. The Massachusetts Eye and Ear Infirmary Dataset (MEEI-Dataset) [5] will be exploited. At the application level, a library for feature extraction and classification in Python will be developed. Credible publicly available resources, such as KALDI, will be used toward achieving our goal. Comparisons will be made against [6-8].
speech-processing mfcc linear-prediction-coefficients classifier speech-utterance feature-extraction support-vector-machines gaussian-mixture-models long-short-term-memory principal-component-analysis kernel-pca linear-discriminant-analysis isomap locally-linear-embedding spectral-embedding spectral-clustering natural-language-processing nlp nltk
Language:Python 249
sp-nitech / SPTK
A suite of speech signal processing tools
audio-processing cepstrum cpp dsp lpc lsp mfcc signal-processing speech speech-processing sptk unix-command
Language:C++ 232
jsingh811 / pyAudioProcessing
Audio feature extraction and classification
audio-data feature-extraction classify classify-audio mfcc mfcc-features mfcc-extractor gfcc gfcc-features gfcc-extractor spectral-features chroma-features classifier-options classify-audio-samples wav-files classifier audio-files hyperparameter-tuning pyaudioprocessing
Language:Python 226
SuperKogito / Voice-based-gender-recognition
:sound: :boy: :girl: Voice based gender recognition using Mel-frequency cepstrum coefficients (MFCC) and Gaussian mixture models (GMM)
gender-recognition gender-detection gender-classification gmm mfcc gender-recognition-by-voice voice mel-frequencies gaussian-mixture-models signal machine-learning data-science vocal speech gender scikit-learn scikit-learn-python speaker
Language:Python 213
ewan-xu / LibrosaCpp
LibrosaCpp is a C++ implementation of librosa to compute short-time Fourier transform coefficients, mel spectrogram or MFCC
eigen librosa mfcc
Language:C++ 201
csukuangfj / kaldifeat
Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd - Provide C++ & Python API
kaldi features-extraction mfcc plp fbank python online-feature-extractor streaming-feature-extractor pytorch cpp
Language:C++ 197
sp-nitech / diffsptk
A differentiable version of SPTK
dsp sptk pytorch python cepstrum digital-signal-processing lpc mfcc deep-learning ddsp signal-processing pqmf lsp plp cqt mdct stft gmm k-means nmf
Language:Python 180
SuyashMore / MevonAI-Speech-Emotion-Recognition
Identify the emotion of multiple speakers in an Audio Segment
artificial-intelligence colab-notebook convolutional-neural-networks deep-learning diarization emotion-analysis emotion-recognition keras-tensorflow machine-learning mfcc mfcc-analysis speech-processing uis-rnn
Language:C 167
tympanix / subsync
Synchronize your subtitles using machine learning
neural-network subtitles mfcc machine-learning speech-detection shift-subtitle subtitle delay shift fix subsync
Language:Python 152
amanbasu / speech-emotion-recognition
Detecting emotions using MFCC features of human speech using Deep Learning
deep-learning emotion emotion-recognition mfcc rnn speech-recognition tensorflow
Language:Jupyter Notebook 130
ZhuoZhuoCrayon / AcousticKeyBoard-Web
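Acoustic keyboard | ❓ A wild idea: build a "toy" that can tell which key was pressed from the sound of the keystroke, as a way to learn signal processing / deep learning / Android / Django.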
deep-learning django lstm mfcc tensorflow
Language:Python 86
GauravWaghmare / Speaker-Identification
A program for automatic speaker identification using deep learning techniques.
speaker-recognition speaker-verification keras mfcc
Language:Python 84
MycroftAI / sonopy
A simple audio feature extraction library
audio-processing mfcc spectrogram sound mel-spectrogram library
Language:Python 79
ZitengWang / python_kaldi_features
Python code to extract MFCC and FBANK speech features for Kaldi
mfcc kaldi
Language:Python 65
mathquis / node-personal-wakeword
Personal wake word detector
wakeword hotword-detection hotword-detector node dtw mfcc
Language:JavaScript 63
k-farruh / speech-accent-detection
People speak a language with an accent, and a particular accent necessarily reflects a person's linguistic background. The model identifies the accent from an audio recording. Its results could be used to determine accents and to help students learning English reduce or improve their accents through training.
native-speakers accent english-languages accent-detection mfcc
Language:Python 60
georgid / AlignmentDuration
Lyrics-to-audio alignment system based on machine learning algorithms: hidden Markov models with Viterbi forced alignment. The alignment is explicitly aware of the durations of musical notes. The phonetic models are classified with an MLP deep neural network.
python htk lyrics duration decoding deep-learning hidden-markov-model alignment synchronization mfcc signal-processing music music-information-retrieval upf gmm neural-networks research
Language:Python 57
zafarrafii / Zaf-Python
Zafar's Audio Functions in Python for audio signal analysis: STFT, inverse STFT, mel filterbank, mel spectrogram, MFCC, CQT kernel, CQT spectrogram, CQT chromagram, DCT, DST, MDCT, inverse MDCT.
python stft dct dst mdct inverse-stft cqt-kernel cqt-spectrogram chromagram inverse-mdct mfcc mel-filterbank short-time-fourier-transform mel-frequency-cepstral-coefficients discrete-cosine-transform discrete-sine-transform mel-spectrogram constant-q-transform modified-discrete-cosine-transform audio-signal-processing
Language:Jupyter Notebook 56
SuperKogito / Voice-based-speaker-identification
:sound: :boy: :girl: :woman: :man: Speaker identification using voice MFCCs and GMM
speaker-identification speaker-recognition gmm mfcc voice mel-frequency-cepstral-coefficients mel-frequencies gaussian-mixture-models signal machine-learning vocal speech scikit-learn scikit-learn-python
Language:Python 54
supikiti / PNCC
An implementation of Power Normalized Cepstral Coefficients: PNCC
deep-learning mfcc pncc robustness speech-enhancement speech-processing speech-recognition
Language:Python 52
alicex2020 / Deep-Learning-Lie-Detection
Use machine learning models to detect lies based solely on acoustic speech information
machine-learning deep-learning mfcc mfcc-analysis lie-detector acoustic-features pitch-tracking support-vector-machines ensemble-learning ensemble-model ensemble-classifier ensemble-machine-learning
Language:Jupyter Notebook 51
aubio / vamp-aubio-plugins
aubio plugins for Vamp
aubio vamp-plugins tempo-tracking tempo-detection mfcc beat-detection beat-tracking tempo beat onset onset-detection audio music music-information-retrieval analysis
Language:C++ 48
zafarrafii / Zaf-Matlab
Zafar's Audio Functions in Matlab for audio signal analysis: STFT, inverse STFT, mel filterbank, mel spectrogram, MFCC, CQT kernel, CQT spectrogram, CQT chromagram, DCT, DST, MDCT, inverse MDCT.
stft istft chromagram mfcc dct dst mdct imdct matlab cqt-kernel cqt-spectrogram mel-filterbank mel-spectrogram short-time-fourier-transform mel-frequency-cepstral-coefficients discrete-cosine-transform discrete-sine-transform modified-discrete-cosine-transform constant-q-transform audio-signal-processing
Language:Jupyter Notebook 48
sheelabhadra / Emergency-Vehicle-Detection
Python implementation of papers on emergency vehicle detection using audio signals
self-driving-car audio-processing machine-learning mfcc wavelets pitch-detection emergency-response neural-network emergency-vehicle-detection
Language:Jupyter Notebook 47
mechanicalsea / spectra
Spectra extraction tutorials based on torch and torchaudio.
filterbank mfcc pytorch voice-activity-detection
Language:Jupyter Notebook 41
pulakk / Live-Audio-MFCC
Live Audio MFCC Visualization in the browser using Web Audio API - https://pulakk.github.io/Live-Audio-MFCC/tutorial
node-js meyda p5-sketches webaudioapi mfcc
Language:JavaScript 41
dydtjr1128 / Speaker-Recognition-using-NN
Speaker Recognition using Neural Network & Linear Regression
speaker-recognition mfcc neural-network linear-regression python nn machine-learning voice-recognition
Language:Jupyter Notebook 37
zhengyima / DTW_Digital_Voice_Recognition
Speech recognition of the digits 0-9 based on DTW and MFCC features. DTW, MFCC, speech recognition, Chinese and English data, endpoint detection, Digital Voice Recognition.
dtw mfcc voice-recognition digital-signal-processing machine-learning dynamic-programming
Language:Python 37
skaws2003 / pytorch-mfcc
A pytorch implementation of MFCC.
mfcc pytorch
Language:Python 33
