Nguyễn Văn Anh Tuấn's repositories
noisy-student-training-asr
PyTorch implementation of Noisy Student Training for Automatic Speech Recognition and Automatic Pronunciation Error Detection
image2latex
Image-to-LaTeX conversion using an encoder-decoder architecture
tuanio.github.io
This is an academic blog
whisper-ctc
Whisper encoder (extracted from the pretrained model) with a linear layer on top, trained with the CTC criterion
ling-wav2vec2
Official implementation of LingWav2Vec2: Linguistic-augmented Wav2Vec2 for Mispronunciation Detection
EfficientConformer-Edit
[ASRU 2021] Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition
fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.
hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
LaVy-revised
A pioneering Vietnamese Multimodal Large Language Model
llava-working-space
Vistral-V: Visual Instruction Tuning for Vistral - Vietnamese Large Vision-Language Model.
MaskCycleGAN-VC
Fork of https://github.com/GANtastic3/MaskCycleGAN-VC
MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
SpeechUVCGANv2
Rethinking CycleGAN: Improving Quality of GANs for Unpaired Image-to-Image Translation