Sandy Tanwisuth's repositories
meltingpot
A suite of test scenarios for multi-agent reinforcement learning.
alpaca-lora
Instruct-tune LLaMA on consumer hardware
awesome-model-based-RL
A curated list of awesome model based RL resources (continually updated)
contrastive_metrics
Code for the paper "Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making"
distributional-sr
Official implementation of the δ-model presented in the paper "A Distributional Analogue to the Successor Representation".
effective-horizon
Code and data for the paper "Bridging RL Theory and Practice with the Effective Horizon"
hanabi.github.io
A list of Hanabi strategies
hidden-context
Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF"
icvf_release
Public code for "Reinforcement Learning from Passive Data via Latent Intentions"
JaxMARL-minimal-information
Multi-Agent Reinforcement Learning with JAX
lab2d
A customisable 2D platform for agent-based AI research
maddpg
Code for the MADDPG algorithm from the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"
Mava
🦁 A research-friendly codebase for fast experimentation of multi-agent reinforcement learning in JAX
maxtext
A simple, performant and scalable Jax LLM!
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Neural-Network-Zero-to-Hero
Writing key libraries and core architectures from scratch, following the tutorials of the Neural Networks: Zero to Hero course by Andrej Karpathy.
overcooked_ai
A benchmark environment for fully cooperative human-AI performance.
paper-reviewer-matcher
Linear programming solver for paper-reviewer matching and mind-matching
pax
Scalable Opponent Shaping Experiments in JAX
purejaxrl
Really Fast End-to-End Jax RL Implementations
pycid
Library for graphical models of decision making, based on pgmpy and networkx
ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
rliable
[NeurIPS'21 Outstanding Paper] Library for reliable evaluation on RL and ML benchmarks, even with only a handful of seeds.
SAELens
Training Sparse Autoencoders on Language Models