Fu-An Chao's starred repositories
Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
ms-swift
Use PEFT or full-parameter training to fine-tune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
self-supervised-phone-segmentation
Phoneme segmentation using pre-trained speech models
portfolYOU
A beautiful portfolio Jekyll theme that works with GitHub Pages.
articulatory
Deep Articulatory Synthesis and Inversion
accent-recog-slt2022
Repository for Accent Recognition (Hackathon @SLT2022)
SB_loss_PA
This repository is the implementation of the paper, "Score-balanced Loss for Multi-aspect Pronunciation Assessment" (Interspeech 2023).
INTERSPEECH-2023-24-Papers
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
gop-dnn-epadb
Goodness of Pronunciation using Kaldi on the Epa-DB database
python-audio-effects
Apply audio effects such as reverb and EQ directly to audio files or NumPy ndarrays.
SpeechPrompt
Interspeech 2022: "SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks". Speech processing with the prompting paradigm.
wav2vec2-live
Live speech recognition using Facebook's wav2vec 2.0 model.
automated-english-transcription-grader
Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions (ACL 2020)
sequence-labeler
Neural network sequence labeling model
huggingsound
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
Robust-E2E-ASR
This repository contains the code for our upcoming paper, "An Investigation of End-to-End Models for Robust Speech Recognition", at ICASSP 2021.
PhoneFortifiedPerceptualLoss
Improving Perceptual Quality by Phone-Fortified Perceptual Loss using Wasserstein Distance for Speech Enhancement