YC's repositories
Crossmodal_AEC
Code for paper "Crossmodal ASR Error Correction with Discrete Speech Units"
SER-on-WER-and-Fusion
Code for paper "Speech Emotion Recognition with ASR Transcripts: A Comprehensive Study on Word Error Rate and Fusion Techniques"
CMU-MultimodalSDK
CMU MultimodalSDK is a machine learning platform for developing advanced multimodal models and for easily accessing and processing multimodal datasets.
Language: Python; License: NOASSERTION
Co-attention
initial commit
Language: Python
fadtk
A simple library for Fréchet Audio Distance (FAD) calculation
Language: Python; License: MIT
Language: Python
speechbrain
A PyTorch-based Speech Toolkit
Language: Python; License: Apache-2.0
Language: Jupyter Notebook
Language: SCSS; License: MIT