ChungKingExpress's starred repositories

stable-diffusion

A latent text-to-image diffusion model

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 66785 | Issues: 555 | Issues: 705

cs-video-courses

List of Computer Science courses with video lectures.

GPT-SoVITS

Even 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning).

Language: Python | License: MIT | Stargazers: 29516 | Issues: 189 | Issues: 971

so-vits-svc

SoftVC VITS Singing Voice Conversion

Language: Python | License: AGPL-3.0 | Stargazers: 24857 | Issues: 174 | Issues: 130

Umi-OCR

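Free, open-source, offline OCR software. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning and generating QR codes. Ships with built-in language libraries for multiple languages.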

Language: Python | License: MIT | Stargazers: 23449 | Issues: 136 | Issues: 503

gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"

Language: Python | License: NOASSERTION | Stargazers: 22011 | Issues: 637 | Issues: 262

AnimateAnyone

Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation

ChatGLM3

ChatGLM3 series: Open Bilingual Chat LLMs

Language: Python | License: Apache-2.0 | Stargazers: 13129 | Issues: 99 | Issues: 756

open_clip

An open source implementation of CLIP.

Language: Python | License: NOASSERTION | Stargazers: 9276 | Issues: 76 | Issues: 454

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 9260 | Issues: 97 | Issues: 630

BELLE

BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

Language: HTML | License: Apache-2.0 | Stargazers: 7735 | Issues: 108 | Issues: 439

PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Language: C++ | License: MIT | Stargazers: 7673 | Issues: 75 | Issues: 151

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 7615 | Issues: 89 | Issues: 1616

TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Language: Python | License: Apache-2.0 | Stargazers: 7372 | Issues: 110 | Issues: 150

BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 4494 | Issues: 34 | Issues: 190

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language: Python | License: MIT | Stargazers: 4298 | Issues: 59 | Issues: 138

ChineseNLPCorpus

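A collection of Chinese natural language processing datasets, intended as material for everyday experiments. Additions submitted for merging are welcome.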

Emu

Emu Series: Generative Multimodal Models from BAAI

Language: Python | License: Apache-2.0 | Stargazers: 1576 | Issues: 21 | Issues: 85

ALBEF

Code for ALBEF: a new vision-language pre-training method

Language: Python | License: BSD-3-Clause | Stargazers: 1461 | Issues: 11 | Issues: 139
Language: Python | License: Apache-2.0 | Stargazers: 1101 | Issues: 13 | Issues: 92

ChatLM-mini-Chinese

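A small 0.2B Chinese chat model (ChatLM-Chinese-0.2B). All code for the full pipeline is open-sourced: dataset sources, data cleaning, tokenizer training, model pretraining, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks and includes a fine-tuning example for triple information extraction.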

Language: Python | License: Apache-2.0 | Stargazers: 1011 | Issues: 12 | Issues: 44

summarize-from-feedback

Code for "Learning to summarize from human feedback"

Language: Python | License: NOASSERTION | Stargazers: 972 | Issues: 149 | Issues: 21

PPO-for-Beginners

A simple and well-styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8

Language: Python | License: MIT | Stargazers: 683 | Issues: 10 | Issues: 8

ppo-implementation-details

The source code for the blog post "The 37 Implementation Details of Proximal Policy Optimization"

Language: Python | License: NOASSERTION | Stargazers: 587 | Issues: 3 | Issues: 6

LLaMA-Pro

[ACL 2024] Progressive LLaMA with Block Expansion.

Language: Python | License: Apache-2.0 | Stargazers: 446 | Issues: 20 | Issues: 31

SeqGPT

SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding

Language: Python | License: Apache-2.0 | Stargazers: 201 | Issues: 4 | Issues: 14

Multi-CPR

[SIGIR 2022] Multi-CPR: A Multi-Domain Chinese Dataset for Passage Retrieval

lm-human-preference-details

RLHF implementation details of OAI's 2019 codebase

Language: Python | License: MIT | Stargazers: 138 | Issues: 4 | Issues: 7

ChineseSquad

A Chinese machine reading comprehension dataset