yechenzhi

Firefly: a large-model training tool that supports training Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Language: Python | 5346 54 267

cleanrl

High-quality single-file implementations of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Language: Python | License: NOASSERTION | 4978 35 177

torchtune

A Native-PyTorch Library for LLM Fine-tuning

Language: Python | License: BSD-3-Clause | 3674 44 381

opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Language: Python | License: Apache-2.0 | 3402 25 431

scenic

Scenic: A Jax Library for Computer Vision Research and Beyond

Language: Python | License: Apache-2.0 | 3167 39 245

awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

License: Apache-2.0 | 3103 58 3

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Language: Python | License: AGPL-3.0 | 2584 460

big_vision

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Language: Jupyter Notebook | License: Apache-2.0 | 2069 37 51

AutoPrompt

A framework for prompt tuning using Intent-based Prompt Calibration

Language: Python | License: Apache-2.0 | 1921 10 24

direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Language: Python | License: Apache-2.0 | 1905 19 77

MoE-LLaVA

Mixture-of-Experts for Large Vision-Language Models

Language: Python | License: Apache-2.0 | 1860 23 84

self-rewarding-lm-pytorch

Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI

Language: Python | License: MIT | 1272 23 17

safe-rlhf

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Language: Python | License: Apache-2.0 | 1260 17 82

self-attention-cv

Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.

Language: Python | License: MIT | 1150 20 15

SPIN

The official implementation of Self-Play Fine-Tuning (SPIN)

Language: Python | License: Apache-2.0 | 905 12 30

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Language: Python | License: MIT | 781 20 29

HALOs

A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

Language: Python | License: Apache-2.0 | 643 6 20

ao

Custom data types and layouts for training and inference

Language: Python | License: BSD-3-Clause | 440 26 93

Vita-CLIP

Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023]

Language: Python | License: MIT | 101 7 10

disco

A Toolkit for Distributional Control of Generative Models

Language: Python | License: NOASSERTION | 68 40

POVID

[Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning

Language: Python | License: Apache-2.0 | 57 3 8

Self-Explore

Self-Explore to avoid the pit! Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards

Language: Python | 33 1 1

LoL-RL

Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients

Language: Python | 23 20