Yan Song's repositories

Master-thesis

Policy gradient planning in MBRL using probabilistic models.

Language: Jupyter Notebook | Stargazers: 5 | Issues: 2 | Issues: 0
Language: Python | Stargazers: 4 | Issues: 0 | Issues: 0

NLP-project

Abstractive Summarisation

Language: Jupyter Notebook | Stargazers: 4 | Issues: 0 | Issues: 0

Gibbs-sampler

coursework

Language: Jupyter Notebook | Stargazers: 1 | Issues: 1 | Issues: 0

malib

A parallel framework for population-based multi-agent reinforcement learning.

Language: Python | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

Network-analysis

toy software

Language: Java | Stargazers: 1 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | Stargazers: 1 | Issues: 1 | Issues: 0
Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

envpool

C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

gym

A toolkit for developing and comparing reinforcement learning algorithms.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

LLM_Tree_Search

The official implementation of the paper "Alphazero-like Tree-Search can guide large language model decoding and training".

Stargazers: 0 | Issues: 0 | Issues: 0

ma-gym

A collection of multi-agent environments based on OpenAI Gym.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

minigrid-rl

RL experiments using the MiniGrid Gym environment.

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

overcooked_ai

A benchmark environment for fully cooperative human-AI performance.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

safe-rlhf

Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0