Xiangyu Zhao's starred repositories
Combined_Dataset_for_Speech_Emotion_Recognition
A combined collection of 8 English speech datasets for speech emotion recognition (SER).
audiomentations
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
depression-detect
Predicting depression from acoustic features of speech using a Convolutional Neural Network.
speechbrain
A PyTorch-based Speech Toolkit
openspeech
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
audiolm-pytorch
Implementation of AudioLM, a SOTA language modeling approach to audio generation from Google Research, in PyTorch
Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
CVPR2024-Papers-with-Code
A collection of CVPR 2024 papers and open-source projects
connected-components-3d
Connected components on discrete and continuous multilabel 3D & 2D images. Handles 26, 18, and 6 connected variants; periodic boundaries (4, 8, & 6)
learning_research
My personal research experience
CLIP-Driven-Universal-Model
[ICCV 2023] CLIP-Driven Universal Model; ranked first in the MSD competition.
AbdomenAtlas
[NeurIPS 2023] AbdomenAtlas 1.0 (5,195 CT volumes + 9 annotated classes)
text-generation-webui
A Gradio web UI for Large Language Models.
mmsegmentation
OpenMMLab Semantic Segmentation Toolbox and Benchmark.