Hongyi Guo's repositories
trading_strategy
Course project for SJTU EE359 Data Mining (advised by Prof. Bo Yuan), in which we use reinforcement learning to decide on a trading strategy.
secure_connectivity
Course project for SJTU EE447: Mobile Internet, advised by Prof. Luoyi Fu and Prof. Xinbing Wang. The task is to design a defense strategy that predicts and protects the edges most likely to be attacked.
self_alignment
Retrieval-Augmented Self-Alignment (RASA)
alignment-handbook
Robust recipes to align language models with human and AI preferences
alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
auto_literature
Automatically arrange literature
end-to-end-negotiator
Deal or No Deal? End-to-End Learning for Negotiation Dialogues
exploration-by-disagreement
[ICML 2019] TensorFlow Code for Self-Supervised Exploration via Disagreement
hyperparallel_machine_learning
Course repo for IV-J
look_for_words
Looking for words? Try me.
multiagent-particle-envs
Code for a multi-agent particle environment used in the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"
OvercookedGPT
An OpenAI Gym environment to evaluate the ability of LLMs (e.g., GPT-4, Claude) in long-horizon reasoning and task planning in dynamic multi-agent settings.
peer_bc_ct
Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms
RAIN
Official implementation of [RAIN: Your Language Models Can Align Themselves without Finetuning]
rl-baselines-zoo
A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.
safe-rlhf
Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
trl
Train transformer language models with reinforcement learning.
troubleshooting
All the issues I have encountered, continuously updated