gohsyi

Hongyi Guo's repositories

PeerLoss

Learning with Noisy Labels by adopting a peer prediction loss function.

Language: Python · MIT · 36 4 1

trading_strategy

Course project for SJTU EE359 Data Mining (advised by Prof. Bo Yuan), in which we use reinforcement learning to choose a trading strategy.

Language: Python · 5 10

Course project for SJTU EE447: Mobile Internet, advised by Prof. Luoyi Fu and Prof. Xinbing Wang. The task is to design a defense strategy that predicts and protects the edges most likely to be attacked.

Language: Python · MIT · 3 10

tslda

Replication of the paper "Topic Modeling based Sentiment Analysis on Social Media for Stock Market Prediction".

Language: Python · MIT · 2 2 1

self_alignment

Retrieval-Augmented Self-Alignment (RASA)

Language: Jupyter Notebook · 1 10

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python · Apache-2.0 · 000

alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Language: Jupyter Notebook · Apache-2.0 · 000

auto_literature

Automatically arrange literature

000

baselines

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Language: Python · MIT · 010

cheatsheet

020

CoPiEr

Co-training for Policy Learning

Language: C · 020

end-to-end-negotiator

Deal or No Deal? End-to-End Learning for Negotiation Dialogues

Language: Python · NOASSERTION · 010

exploration-by-disagreement

[ICML 2019] TensorFlow Code for Self-Supervised Exploration via Disagreement

Language: Python · 010

gohsyi.github.io

Language: CSS · NOASSERTION · 010

hyperparallel_machine_learning

Course repo for IV-J

Language: Python · 010

L_DMI

Code for the NeurIPS 2019 paper "L_DMI: An Information-theoretic Noise-robust Loss Function"

Language: Python · 020

LightZero

Language: Python · Apache-2.0 · 000

look_for_words

Looking for words? Try me.

Language: Python · MIT · 020

multiagent-particle-envs

Code for a multi-agent particle environment used in the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"

Language: Python · MIT · 020

OvercookedGPT

An OpenAI Gym environment for evaluating the ability of LLMs (e.g., GPT-4, Claude) to perform long-horizon reasoning and task planning in dynamic multi-agent settings.

Language: Python · MIT · 000

peer_bc_ct

Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms

Language: Python · 010

RAIN

Official implementation of [RAIN: Your Language Models Can Align Themselves without Finetuning]

Language: Python · BSD-2-Clause · 000

rl-baselines-zoo

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.

Language: Python · MIT · 010

safe-rlhf

Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Language: Python · Apache-2.0 · 000

survey

MIT · 020

taxi

CUMCM 2019, Problem C

Language: Python · 020

tianshou

An elegant, flexible, and superfast PyTorch deep Reinforcement Learning platform.

Language: Python · MIT · 020

torch-ac

Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms: A2C and PPO

Language: Python · MIT · 020

trl

Train transformer language models with reinforcement learning.

Apache-2.0 · 000

troubleshooting

All the issues I have encountered, continuously updated

010