yechenzhi's starred repositories

generative-models

Generative Models by Stability AI

Language: Python | License: MIT | Stargazers: 23533 | Issues: 252 | Issues: 287

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

trl

Train transformer language models with reinforcement learning.

Language: Python | License: Apache-2.0 | Stargazers: 8881 | Issues: 75 | Issues: 1017

PaLM-rlhf-pytorch

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

Language: Python | License: MIT | Stargazers: 7654 | Issues: 143 | Issues: 46
Language: Python | License: Apache-2.0 | Stargazers: 7030 | Issues: 67 | Issues: 69

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Language: Python | License: MIT | Stargazers: 5973 | Issues: 36 | Issues: 956

gpt-fast

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Language: Python | License: BSD-3-Clause | Stargazers: 5392 | Issues: 63 | Issues: 96

Firefly

Firefly: a large-model training tool that supports training Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Language: Python | License: NOASSERTION | Stargazers: 4978 | Issues: 35 | Issues: 177

torchtune

A Native-PyTorch Library for LLM Fine-tuning

Language: Python | License: BSD-3-Clause | Stargazers: 3674 | Issues: 44 | Issues: 381

opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Language: Python | License: Apache-2.0 | Stargazers: 3402 | Issues: 25 | Issues: 431

scenic

Scenic: A Jax Library for Computer Vision Research and Beyond

Language: Python | License: Apache-2.0 | Stargazers: 3167 | Issues: 39 | Issues: 245

awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Language: Python | License: AGPL-3.0 | Stargazers: 2584 | Issues: 46 | Issues: 0

big_vision

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2069 | Issues: 37 | Issues: 51

AutoPrompt

A framework for prompt tuning using Intent-based Prompt Calibration

Language: Python | License: Apache-2.0 | Stargazers: 1921 | Issues: 10 | Issues: 24

direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Language: Python | License: Apache-2.0 | Stargazers: 1905 | Issues: 19 | Issues: 77

MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Language: Python | License: Apache-2.0 | Stargazers: 1860 | Issues: 23 | Issues: 84

self-rewarding-lm-pytorch

Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI

Language: Python | License: MIT | Stargazers: 1272 | Issues: 23 | Issues: 17

safe-rlhf

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Language: Python | License: Apache-2.0 | Stargazers: 1260 | Issues: 17 | Issues: 82

self-attention-cv

Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.

Language: Python | License: MIT | Stargazers: 1150 | Issues: 20 | Issues: 15

SPIN

The official implementation of Self-Play Fine-Tuning (SPIN)

Language: Python | License: Apache-2.0 | Stargazers: 905 | Issues: 12 | Issues: 30

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Language: Python | License: MIT | Stargazers: 781 | Issues: 20 | Issues: 29

HALOs

A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

Language: Python | License: Apache-2.0 | Stargazers: 643 | Issues: 6 | Issues: 20

ao

Custom data types and layouts for training and inference

Language: Python | License: BSD-3-Clause | Stargazers: 440 | Issues: 26 | Issues: 93

Vita-CLIP

Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023]

Language: Python | License: MIT | Stargazers: 101 | Issues: 7 | Issues: 10

disco

A Toolkit for Distributional Control of Generative Models

Language: Python | License: NOASSERTION | Stargazers: 68 | Issues: 4 | Issues: 0

POVID

[Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning

Language: Python | License: Apache-2.0 | Stargazers: 57 | Issues: 3 | Issues: 8

Self-Explore

Self-Explore to avoid the pit! Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards

LoL-RL

Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients

Language: Python | Stargazers: 23 | Issues: 2 | Issues: 0