Qingsong Liu's repositories
Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
bark
🔊 Text-Prompted Generative Audio Model
dreamtalk
Official implementation of the paper "DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models"
Emu
Emu: An Open Multimodal Generalist
FiT
FiT: Flexible Vision Transformer for Diffusion Model
generative-models
Generative Models by Stability AI
genmusic_demo_list
A list of demo websites for automatic music generation research
GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
llama.cpp
Port of Facebook's LLaMA model in C/C++
LLaMA2-Accessory
An Open-source Toolkit for LLM Development
LLaVA
Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities.
LLM-groundedDiffusion
LLM-grounded Diffusion (LMD): Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
Monkey
Monkey (LMM): a large multimodal model, the "little monkey" from Huazhong University of Science and Technology (HUST)
Multi-Modality-Arena
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!
MultimodalOCR
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
nougat
Implementation of Nougat: Neural Optical Understanding for Academic Documents
Open-AnimateAnyone
Unofficial Implementation of Animate Anyone
open_flamingo
An open-source framework for training large multimodal models.
PhotoMaker
PhotoMaker
Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks, and resources for prompt engineering
text-generation-inference
Large Language Model Text Generation Inference