dqqcasia's starred repositories
youtube-dl
Command-line program to download videos from YouTube.com and other video sites
whisper.cpp
Port of OpenAI's Whisper model in C/C++
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Awesome-Chinese-LLM
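A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, including base models, domain-specific fine-tunes and applications, datasets, and tutorials.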
Llama-Chinese
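Llama Chinese community. The Llama3 online demo and fine-tuned models are now available, and the latest Llama3 learning materials are collected in real time. All code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and available for commercial use.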
Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
llama-recipes
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Demo apps to showcase Meta Llama3 for WhatsApp & Messenger.
seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
lm-evaluation-harness
A framework for few-shot evaluation of language models.
Baichuan-7B
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
deep_learning_curriculum
Language model alignment-focused deep learning curriculum
llm-hallucination-survey
Reading list on hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"
bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
MEGABYTE-pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch
Speech-Resources
Speech-related labs, companies, resources, internships, and more. Recommendations and self-nominations are welcome.
BIG-Bench-Hard
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
sft_datasets
A collection of open-source SFT datasets, updated on an ongoing basis.