Sandy Tanwisuth's repositories
meltingpot
A suite of test scenarios for multi-agent reinforcement learning.
alpaca-lora
Instruct-tune LLaMA on consumer hardware
awesome-model-based-RL
A curated list of awesome model based RL resources (continually updated)
contrastive_metrics
Code for the paper "Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making"
distributional-sr
Official implementation of the δ-model presented in the paper "A Distributional Analogue to the Successor Representation".
effective-horizon
Code and data for the paper "Bridging RL Theory and Practice with the Effective Horizon"
hanabi.github.io
A list of Hanabi strategies
hidden-context
Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF"
icvf_release
Public code for "Reinforcement Learning from Passive Data via Latent Intentions"
JaxMARL-minimal-information
Multi-Agent Reinforcement Learning with JAX
lab2d
A customisable 2D platform for agent-based AI research
maddpg
Code for the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"
Mava
🦁 A research-friendly codebase for fast experimentation of multi-agent reinforcement learning in JAX
maxtext
A simple, performant and scalable Jax LLM!
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Neural-Network-Zero-to-Hero
Writing key libraries and core architectures from scratch, following the tutorials of the Neural Networks: Zero to Hero course by Andrej Karpathy.
overcooked_ai
A benchmark environment for fully cooperative human-AI performance.
paper-reviewer-matcher
Linear programming solver for paper-reviewer matching and mind-matching
pax
Scalable Opponent Shaping Experiments in JAX
purejaxrl
Really Fast End-to-End Jax RL Implementations
pycid
Library for graphical models of decision making, based on pgmpy and networkx
ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
rliable
[NeurIPS'21 Outstanding Paper] Library for reliable evaluation on RL and ML benchmarks, even with only a handful of seeds.
SAELens
Training Sparse Autoencoders on Language Models