Nguyễn Văn Anh Tuấn's starred repositories
crewai-experiments
Experiments with local models as well as models available through an API
speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
ABigSurvey
A collection of 1000+ survey papers on Natural Language Processing (NLP) and Machine Learning (ML).
audio-captioning
Audio captioning - DCASE challenge 2023 task 6a
LLocalSearch
LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a chain of LLMs to find the answer. The user can see the progress of the agents and the final answer. No OpenAI or Google API keys are needed.
Pytorch_mixture-of-experts
PyTorch implementation of MoE (mixture of experts)
taming-transformers
Taming Transformers for High-Resolution Image Synthesis
descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
Leaderboard
SpeechIO Leaderboard: a large, robust, comprehensive benchmarking platform for Automatic Speech Recognition.
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Diffusion-GAN
Official PyTorch implementation for the paper: Diffusion-GAN: Training GANs with Diffusion
semantic-router
Superfast AI decision making and intelligent processing of multi-modal data.
speechbrain
A PyTorch-based Speech Toolkit
Conv-Tasnet-for-speech-enchancement-and-seperation
A state-of-the-art time-domain network for speech separation that also performs well on speech enhancement and music separation
Robust-E2E-ASR
This repository contains the code for our upcoming paper "An Investigation of End-to-End Models for Robust Speech Recognition" at ICASSP 2021.