Fu-An Chao (Fuann)

Company: National Taiwan Normal University

Location: Taipei, Taiwan

Home Page: https://fuann.github.io

Fu-An Chao's starred repositories

litgpt

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Language: Python | License: Apache-2.0 | Stargazers: 10693 | Issues: 0

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 1482 | Issues: 0

Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | Stargazers: 1221 | Issues: 0

ms-swift

Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)

Language: Python | License: Apache-2.0 | Stargazers: 4240 | Issues: 0

pyreft

ReFT: Representation Finetuning for Language Models

Language: Python | License: Apache-2.0 | Stargazers: 1153 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 127 | Issues: 0

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 13201 | Issues: 0
Language: Python | License: MIT | Stargazers: 20 | Issues: 0

self-supervised-phone-segmentation

Phoneme segmentation using pre-trained speech models

Language: Python | License: GPL-3.0 | Stargazers: 52 | Issues: 0
Language: Python | License: CC-BY-4.0 | Stargazers: 252 | Issues: 0

portfolYOU

A beautiful portfolio Jekyll theme that works with GitHub Pages.

Language: HTML | License: MIT | Stargazers: 989 | Issues: 0

articulatory

Deep Articulatory Synthesis and Inversion

Language: Python | License: Apache-2.0 | Stargazers: 43 | Issues: 0

accent-recog-slt2022

Repository for Accent Recognition (Hackathon @SLT2022)

Language: Jupyter Notebook | License: MIT | Stargazers: 22 | Issues: 0

SB_loss_PA

This repository is the implementation of the paper, "Score-balanced Loss for Multi-aspect Pronunciation Assessment" (Interspeech 2023).

Language: Python | License: BSD-3-Clause | Stargazers: 15 | Issues: 0

INTERSPEECH-2023-24-Papers

INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!

License: MIT | Stargazers: 641 | Issues: 0

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Language: Python | License: BSD-2-Clause | Stargazers: 12465 | Issues: 0

gop-dnn-epadb

Goodness of Pronunciation using Kaldi on Epa-DB database

Language: Python | Stargazers: 33 | Issues: 0

python-audio-effects

Apply audio effects such as reverb and EQ directly to audio files or NumPy ndarrays.

Language: Python | License: MIT | Stargazers: 385 | Issues: 0

SpeechPrompt

Interspeech 2022: "SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks". Speech processing with the prompting paradigm.

Language: Python | Stargazers: 97 | Issues: 0

wav2vec2-live

Live speech recognition using Facebook's wav2vec 2.0 model.

Language: Python | License: MIT | Stargazers: 327 | Issues: 0

automated-english-transcription-grader

Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions (ACL 2020)

Language: Python | License: Apache-2.0 | Stargazers: 7 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 931 | Issues: 0

sequence-labeler

Neural network sequence labeling model

Language: Python | Stargazers: 252 | Issues: 0
License: CC0-1.0 | Stargazers: 33 | Issues: 0

huggingsound

HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools

Language: Python | License: MIT | Stargazers: 432 | Issues: 0

gopt

Code for the ICASSP 2022 paper "Transformer-Based Multi-Aspect Multi-Granularity Non-native English Speaker Pronunciation Assessment".

Language: Python | License: BSD-3-Clause | Stargazers: 150 | Issues: 0

pase

Problem Agnostic Speech Encoder

Language: Python | License: MIT | Stargazers: 439 | Issues: 0

Robust-E2E-ASR

This repository contains the code for our upcoming paper "An Investigation of End-to-End Models for Robust Speech Recognition" at ICASSP 2021.

Language: Python | License: MIT | Stargazers: 46 | Issues: 0

PhoneFortifiedPerceptualLoss

Improving Perceptual Quality by Phone-Fortified Perceptual Loss using Wasserstein Distance for Speech Enhancement

Language: Python | License: MIT | Stargazers: 73 | Issues: 0

s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 2268 | Issues: 0