Yifu Yuan's repositories
Uni-RLHF-Platform
Uni-RLHF platform for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR2024)
Clean-Offline-RLHF
Offline RLHF codebase implementation for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR2024)
euclid-iclr2023
Official implementation for "EUCLID: Towards efficient unsupervised reinforcement learning with multi-choice dynamics model" (ICLR2023)
BabyAI-text
We perform functional grounding of LLMs' knowledge in BabyAI-Text
ED2
The ED2 implementation
Mini-Uni-RLHF
Minimal implementation for easy-to-use RLHF annotation
Best-README-Template
An awesome README template to jumpstart your projects!
cleanrl
High-quality single-file implementations of deep reinforcement learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
diffusion_policy
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
diffusion_reward
Official implementation of the paper "Diffusion Reward: Learning Rewards via Conditional Video Diffusion"
dreamerv2
Mastering Atari with Discrete World Models
dreamerv3-torch
Implementation of DreamerV3 in PyTorch.
drqv2
DrQ-v2: Improved Data-Augmented Reinforcement Learning
Everything-LLMs-And-Robotics
The world's largest GitHub Repository for LLMs + Robotics
learning-from-scratch
Repository for "On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline"
PreferenceTransformer
Preference Transformer: Modeling Human Preferences using Transformers for RL (ICLR2023 Accepted)
pytorch3d
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
RLHF
RLHF
robohive
A unified framework for robot learning
robomimic
robomimic: A Modular Framework for Robot Learning from Demonstration
tdmpc
Code for "Temporal Difference Learning for Model Predictive Control"
tdmpc2
Code for "TD-MPC2: Scalable, Robust World Models for Continuous Control"
text2reward
Code for the paper "Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning"
unstable_baselines
Re-implementations of SOTA RL algorithms.
v-d4rl
Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations