TakuyaHiraoka

SeeAct

[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).

Language: Python | License: NOASSERTION | 609 16 41

LeanRL

LeanRL is a fork of CleanRL in which selected PyTorch scripts are optimized for performance using torch.compile and CUDA graphs.

Language: Python | License: NOASSERTION | 412 8 4

torax

TORAX: Tokamak transport simulation in JAX

Language: Python | License: NOASSERTION | 355 17 12

serl

SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning

Language: Python | License: MIT | 329 11 25

awesome-large-multimodal-agents

317 5 2

Open_Duck_Mini

Making a mini version of the BDX droid

Language: Python | License: Apache-2.0 | 217 90

dmc2gym

OpenAI Gym wrapper for the DeepMind Control Suite

Language: Python | License: MIT | 203 5 12

flashbax

⚡ Flashbax: Accelerated Replay Buffers in JAX

Language: Python | License: Apache-2.0 | 203 13 11

xland-minigrid

JAX-accelerated Meta-Reinforcement Learning Environments Inspired by XLand and MiniGrid 🏎️

Language: Python | License: Apache-2.0 | 194 9 15

yay_robot

PyTorch implementation of YAY Robot

Language: Python | 116 6 5

JAX-CORL

Clean single-file implementation of offline RL algorithms in JAX

Language: Python | License: MIT | 88 4 21

purejaxql

Simple single-file baselines for Q-Learning in a pure-GPU setting

Language: Python | License: Apache-2.0 | 87 10

CrossQ

Official code release for "CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity"

Language: Python | License: NOASSERTION | 57 4 8

DrM

DrM, a visual RL algorithm, minimizes the dormant ratio to guide exploration-exploitation trade-offs, achieving significant improvements in sample efficiency and asymptotic performance across diverse domains.

Language: Python | License: MIT | 56 2 4

genrl

[NeurIPS 2024] GenRL: Multimodal foundation world models allow grounding language and video prompts into embodied domains by turning them into sequences of latent world model states. Latent state sequences can be decoded using the decoder of the model, allowing visualization of the expected behavior before training the agent to execute it.

Language: Python | License: MIT | 53 1 1

Takuya Hiraoka's starred repositories

OpenHands

JARVIS

AI-Scientist

ToolBench

PufferLib

DrEureka

RLHF-Reward-Modeling

SeeAct

LeanRL

torax

serl

awesome-large-multimodal-agents

unitree_rl_gym

Open_Duck_Mini

dmc2gym

flashbax

xland-minigrid

crossformer

SMPLOlympics

rejax

yay_robot

JAX-CORL

purejaxql

CrossQ

DrM

genrl

pianomime

minirllab

stable-eureka-ollama

OfflineRLStructuredNonstationarity