Hongyi Guo (gohsyi)

Company: Northwestern University

Location: Evanston

Hongyi Guo's repositories

PeerLoss

Learning with Noisy Labels by adopting a peer prediction loss function.

Language: Python | License: MIT | Stargazers: 36 | Issues: 4 | Issues: 1

cluster_optimization

Course project of SJTU EE357: Computer Network, advised by Prof. Na Ruan. We implemented and improved "A Hierarchical Framework of Cloud Resource Allocation and Power Management using Deep Reinforcement Learning" and achieved a good trade-off between power usage and job latency.

Language: Python | Stargazers: 30 | Issues: 3 | Issues: 0

SegNet-ADDA

Course project of SJTU CS386: Digital Image Processing, advised by Prof. Bin Sheng, in which I implemented SegNet combined with ADDA.

Language: Jupyter Notebook | Stargazers: 7 | Issues: 2 | Issues: 0

trading_strategy

Course project of SJTU EE359: Data Mining, advised by Prof. Bo Yuan, in which we use reinforcement learning to decide on a trading strategy.

Language: Python | Stargazers: 5 | Issues: 1 | Issues: 0

secure_connectivity

Course project of SJTU EE447: Mobile Internet, advised by Prof. Luoyi Fu and Prof. Xinbing Wang. The task is to design a defense strategy that predicts and protects the edges that are most likely to be attacked.

Language: Python | License: MIT | Stargazers: 3 | Issues: 0 | Issues: 0

tslda

Replication of the paper "Topic Modeling based Sentiment Analysis on Social Media for Stock Market Prediction".

Language: Python | License: MIT | Stargazers: 2 | Issues: 2 | Issues: 1

self_alignment

Retrieval-Augmented Self-Alignment (RASA)

Language: Jupyter Notebook | Stargazers: 1 | Issues: 0 | Issues: 0

alignment-handbook

Robust recipes to align language models with human and AI preferences

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

alpaca_eval

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

baselines

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

Stargazers: 0 | Issues: 2 | Issues: 0

CoPiEr

Co-training for Policy Learning

Language: C | Stargazers: 0 | Issues: 2 | Issues: 0

end-to-end-negotiator

Deal or No Deal? End-to-End Learning for Negotiation Dialogues

License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

exploration-by-disagreement

[ICML 2019] TensorFlow Code for Self-Supervised Exploration via Disagreement

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

Language: CSS | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

hyperparallel_machine_learning

course repo for IV-J

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

L_DMI

Code for NeurIPS 2019 Paper, "L_DMI: An Information-theoretic Noise-robust Loss Function"

Language: Python | Stargazers: 0 | Issues: 2 | Issues: 0

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

look_for_words

Looking for words? Try me.

Language: Python | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

multiagent-particle-envs

Code for a multi-agent particle environment used in the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"

Language: Python | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

OvercookedGPT

An OpenAI Gym environment to evaluate the ability of LLMs (e.g., GPT-4, Claude) in long-horizon reasoning and task planning in dynamic multi-agent settings.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

peer_bc_ct

Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

RAIN

Official implementation of "RAIN: Your Language Models Can Align Themselves without Finetuning"

License: BSD-2-Clause | Stargazers: 0 | Issues: 0 | Issues: 0

rl-baselines-zoo

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

safe-rlhf

Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

taxi

CUMCM 2019, Problem C

Language: Python | Stargazers: 0 | Issues: 2 | Issues: 0

tianshou

An elegant, flexible, and superfast PyTorch deep Reinforcement Learning platform.

Language: Python | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

torch-ac

Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms: A2C and PPO

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

troubleshooting

All the issues I have encountered, continuously updated.

Stargazers: 0 | Issues: 1 | Issues: 0