chenyangMl

Chen Yang's starred repositories

stable-diffusion-webui

Stable Diffusion web UI

Language: Python | AGPL-3.0 | 136769 1057 7550

grok-1

Grok open release

Language: Python | Apache-2.0 | 49208 561 202

FFmpeg

Mirror of https://git.ffmpeg.org/ffmpeg.git

Language: C | NOASSERTION | 44030 14380

ComfyUI

The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.

Language: Python | GPL-3.0 | 43385 343 2579

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python | MPL-2.0 | 32272 273 1068

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python | MIT | 29833 190 982

ChatTTS

A generative speech model for daily dialogue.

Language: Python | AGPL-3.0 | 28298 168 416

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | Apache-2.0 | 27829 188 4393

llama3

The official Meta Llama 3 GitHub site

Language: Python | NOASSERTION | 24710 209 208

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda | MIT | 22330 219 125

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | Apache-2.0 | 18381 158 1416

ML-YouTube-Courses

📺 Discover the latest machine learning / AI courses on YouTube.

CC0-1.0 | 14591 351 17

onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

Language: C++ | MIT | 13694 244 6293

InstantID

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Language: Python | Apache-2.0 | 10580 122 207

xmake

🔥 A cross-platform build utility based on Lua

Language: Lua | Apache-2.0 | 9608 141 3097

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Language: Python | MIT | 8809 82 36

EMO

Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

7260 316 258

insanely-fast-whisper

Language: Jupyter Notebook | Apache-2.0 | 7049 61 178

glog

C++ implementation of the Google logging module

Language: C++ | BSD-3-Clause | 6933 261 570

StoryDiffusion

Create Magic Story!

Language: Jupyter Notebook | Apache-2.0 | 5574 85 130

NExT-GPT

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Language: Python | BSD-3-Clause | 3108 60 91

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language: Python | MIT | 1932 29 78

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python | Apache-2.0 | 1601 20 44

Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Language: Python | NOASSERTION | 1287 25 62

vocos

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

Language: Python | MIT | 715 34 46

agents

Build real-time multimodal AI applications 🤖🎙️📹

Language: Python | Apache-2.0 | 709 25 92

AnyGPT

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

Language: Python | 670 26 24

lilianweng.github.io

My personal page

Language: HTML | 410 27 11

llama2.c-zh

llama2.c-zh: a small language model that supports Chinese-language scenarios

Language: Python | 138 4 6

keyword-spot

An end-to-end voice wake-up (keyword spotting) toolkit, from model training to model inference.

Language: Python | MIT | 5500