puyuan1996

蒲源's starred repositories

PromptAgent

This is the official repo for "PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization". PromptAgent is a novel automatic prompt optimization method that autonomously crafts prompts equivalent in quality to those handcrafted by experts, i.e., expert-level prompts.

Language: Python · Apache-2.0 · 15000

FoRL

A library for First-order Reinforcement Learning algorithms

Language: Python · 300

PWM

PWM: Policy Learning with Large World Models

Language: Jupyter Notebook · MIT · 2500

nano-llama31

nanoGPT style version of Llama 3.1

Language: Python · 49900

graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system

Language: Python · MIT · 1389100

DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

MIT · 319800

onnx

Open standard for machine learning interoperability

Language: Python · Apache-2.0 · 1739600

XTRA

On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning

Language: Python · Apache-2.0 · 1700

ellm

Language: Python · 5500

lm-human-preferences

Code for the paper Fine-Tuning Language Models from Human Preferences

Language: Python · MIT · 118500

trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Language: Python · MIT · 441100

RL4LMs

A modular RL library to fine-tune language models to human preferences

Language: Python · Apache-2.0 · 214400

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Language: Python · Apache-2.0 · 186600

dm_control

Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.

Language: Python · Apache-2.0 · 367100

mistral-inference

Official inference library for Mistral models

Language: Jupyter Notebook · Apache-2.0 · 938500

soft-moe-pytorch

Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch

Language: Python · MIT · 22900

torchtune

A Native-PyTorch Library for LLM Fine-tuning

Language: Python · BSD-3-Clause · 372300

LanguageAgentTreeSearch

[ICML 2024] Official repository for "Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models"

Language: Python · MIT · 55900

MCTS-DPO

This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.

Language: Python · Apache-2.0 · 4400

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python · MIT · 2035000