蒲源 (puyuan1996)

Company: China

Location: Shenzhen

蒲源's starred repositories

PromptAgent

This is the official repo for "PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization". PromptAgent is a novel automatic prompt optimization method that autonomously crafts prompts equivalent in quality to those handcrafted by experts, i.e., expert-level prompts.

Language: Python | License: Apache-2.0 | Stargazers: 150 | Issues: 0

FoRL

A library for First-order Reinforcement Learning algorithms

Language: Python | Stargazers: 3 | Issues: 0

PWM

PWM: Policy Learning with Large World Models

Language: Jupyter Notebook | License: MIT | Stargazers: 25 | Issues: 0

nano-llama31

nanoGPT style version of Llama 3.1

Language: Python | Stargazers: 499 | Issues: 0

graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system

Language: Python | License: MIT | Stargazers: 13891 | Issues: 0

DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

License: MIT | Stargazers: 3198 | Issues: 0

onnx

Open standard for machine learning interoperability

Language: Python | License: Apache-2.0 | Stargazers: 17396 | Issues: 0

XTRA

On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning

Language: Python | License: Apache-2.0 | Stargazers: 17 | Issues: 0
Language: Python | Stargazers: 55 | Issues: 0

lm-human-preferences

Code for the paper Fine-Tuning Language Models from Human Preferences

Language: Python | License: MIT | Stargazers: 1185 | Issues: 0

trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Language: Python | License: MIT | Stargazers: 4411 | Issues: 0

RL4LMs

A modular RL library to fine-tune language models to human preferences

Language: Python | License: Apache-2.0 | Stargazers: 2144 | Issues: 0

OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)

Language: Python | License: Apache-2.0 | Stargazers: 1866 | Issues: 0

dm_control

Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.

Language: Python | License: Apache-2.0 | Stargazers: 3671 | Issues: 0

mistral-inference

Official inference library for Mistral models

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 9385 | Issues: 0

soft-moe-pytorch

Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch

Language: Python | License: MIT | Stargazers: 229 | Issues: 0

torchtune

A Native-PyTorch Library for LLM Fine-tuning

Language: Python | License: BSD-3-Clause | Stargazers: 3723 | Issues: 0

LanguageAgentTreeSearch

[ICML 2024] Official repository for "Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models"

Language: Python | License: MIT | Stargazers: 559 | Issues: 0

MCTS-DPO

This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.

Language: Python | License: Apache-2.0 | Stargazers: 44 | Issues: 0

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT | Stargazers: 20350 | Issues: 0

rl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

Language: Python | License: MIT | Stargazers: 2064 | Issues: 0

rl-learned-optimization

Official Implementation of "Can Learned Optimization Make Reinforcement Learning Less Difficult"

Language: Python | License: Apache-2.0 | Stargazers: 8 | Issues: 0

godot

Godot Engine – Multi-platform 2D and 3D game engine

Language: C++ | License: MIT | Stargazers: 87524 | Issues: 0

avalon

A 3D video game environment and benchmark designed from scratch for reinforcement learning research

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 177 | Issues: 0

GenerativeRL

Python library for solving reinforcement learning (RL) problems using generative models (e.g. Diffusion Models).

Language: Python | License: Apache-2.0 | Stargazers: 39 | Issues: 0

introRL

Intro to Reinforcement Learning (强化学习纲要)

License: MIT | Stargazers: 3180 | Issues: 0

Gymnasium

An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)

Language: Python | License: MIT | Stargazers: 6439 | Issues: 0

glow

Code for reproducing results in "Glow: Generative Flow with Invertible 1x1 Convolutions"

Language: Python | License: MIT | Stargazers: 3104 | Issues: 0

SiT

Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"

Language: Python | License: MIT | Stargazers: 565 | Issues: 0

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Language: Python | License: Apache-2.0 | Stargazers: 5211 | Issues: 0