Takuya Hiraoka (TakuyaHiraoka)

Location: Tokyo-3, Japan

Home Page: https://takuyahiraoka.github.io

Takuya Hiraoka's starred repositories

OpenHands

🙌 OpenHands: Code Less, Make More

Language: Python | License: MIT | Stargazers: 33106 | Issues: 292 | Issues: 1454

JARVIS

JARVIS, a system to connect LLMs with the ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

Language: Python | License: MIT | Stargazers: 23618 | Issues: 381 | Issues: 178

AI-Scientist

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 7848 | Issues: 93 | Issues: 102

ToolBench

[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.

Language: Python | License: Apache-2.0 | Stargazers: 4791 | Issues: 49 | Issues: 291

PufferLib

Simplifying reinforcement learning for complex game environments

Language: Python | License: MIT | Stargazers: 1162 | Issues: 5 | Issues: 10

DrEureka

Official Repository for "DrEureka: Language Model Guided Sim-To-Real Transfer" (RSS 2024)

Language: Python | License: MIT | Stargazers: 788 | Issues: 9 | Issues: 11

RLHF-Reward-Modeling

Recipes to train reward models for RLHF.

Language: Python | License: Apache-2.0 | Stargazers: 751 | Issues: 20 | Issues: 30

SeeAct

[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).

Language: Python | License: NOASSERTION | Stargazers: 609 | Issues: 16 | Issues: 41

LeanRL

LeanRL is a fork of CleanRL in which selected PyTorch scripts are optimized for performance using compile and cudagraphs.

Language: Python | License: NOASSERTION | Stargazers: 412 | Issues: 8 | Issues: 4

torax

TORAX: Tokamak transport simulation in JAX

Language: Python | License: NOASSERTION | Stargazers: 355 | Issues: 17 | Issues: 12

serl

SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning

Language: Python | License: MIT | Stargazers: 329 | Issues: 11 | Issues: 25
Language: Python | License: BSD-3-Clause | Stargazers: 231 | Issues: 5 | Issues: 12

Open_Duck_Mini

Making a mini version of the BDX droid

Language: Python | License: Apache-2.0 | Stargazers: 217 | Issues: 9 | Issues: 0

dmc2gym

OpenAI Gym wrapper for the DeepMind Control Suite

Language: Python | License: MIT | Stargazers: 203 | Issues: 5 | Issues: 12

flashbax

⚡ Flashbax: Accelerated Replay Buffers in JAX

Language: Python | License: Apache-2.0 | Stargazers: 203 | Issues: 13 | Issues: 11

xland-minigrid

JAX-accelerated Meta-Reinforcement Learning Environments Inspired by XLand and MiniGrid 🏎️

Language: Python | License: Apache-2.0 | Stargazers: 194 | Issues: 9 | Issues: 15
Language: Python | License: NOASSERTION | Stargazers: 153 | Issues: 10 | Issues: 12
Language: Python | License: Apache-2.0 | Stargazers: 145 | Issues: 4 | Issues: 10

yay_robot

PyTorch implementation of YAY Robot

JAX-CORL

Clean single-file implementation of offline RL algorithms in JAX

Language: Python | License: MIT | Stargazers: 88 | Issues: 4 | Issues: 21

purejaxql

Simple single-file baselines for Q-Learning in a pure-GPU setting

Language: Python | License: Apache-2.0 | Stargazers: 87 | Issues: 1 | Issues: 0

CrossQ

Official code release for "CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity"

Language: Python | License: NOASSERTION | Stargazers: 57 | Issues: 4 | Issues: 8

DrM

DrM, a visual RL algorithm, minimizes the dormant ratio to guide exploration-exploitation trade-offs, achieving significant improvements in sample efficiency and asymptotic performance across diverse domains.

Language: Python | License: MIT | Stargazers: 56 | Issues: 2 | Issues: 4

genrl

[NeurIPS 2024] GenRL: Multimodal foundation world models allow grounding language and video prompts into embodied domains, by turning them into sequences of latent world model states. Latent state sequences can be decoded using the decoder of the model, allowing visualization of the expected behavior, before training the agent to execute it.

Language: Python | License: MIT | Stargazers: 53 | Issues: 1 | Issues: 1
Language: Python | Stargazers: 30 | Issues: 0 | Issues: 0

minirllab

Mini RL Lab

Language: Python | License: MIT | Stargazers: 15 | Issues: 2 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 6 | Issues: 1 | Issues: 0

OfflineRLStructuredNonstationarity

Implementation for RLC paper "Offline Reinforcement Learning from Datasets with Structured Non-Stationarity".

Language: Python | License: MIT | Stargazers: 5 | Issues: 2 | Issues: 0