TakuyaHiraoka

Takuya Hiraoka's repositories

Dropout-Q-Functions-for-Doubly-Efficient-Reinforcement-Learning

Source files to replicate experiments in my ICLR 2022 paper.

Language: Python, 58 4 1

Efficient-SRGC-RL-with-a-High-RR-and-Regularization

Source files to replicate experiments in my arXiv 2023 paper.

Language: Python, License: MIT, 100

Soft-Actor-Critic-and-Extensions

PyTorch implementation of Soft Actor-Critic with Prioritized Experience Replay (PER), Emphasizing Recent Experience (ERE), Munchausen RL, D2RL, and parallel environments.

Language: Python, License: MIT, 100

Which-Experiences-Are-Influential-for-RL-Agents

Source files to replicate experiments in my arXiv 2024 paper.

Language: Python, License: MIT, 100

d3rlpy

An offline deep reinforcement learning library

Language: Python, License: MIT, 000

d4rl

A benchmark for offline reinforcement learning.

Language: Python, License: Apache-2.0, 000

deep_bisim4control

Learning Invariant Representations for Reinforcement Learning without Reconstruction

License: NOASSERTION, 000

dm_control

DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.

Language: Python, License: Apache-2.0, 010

ElegantRL

Scalable and Elastic Deep Reinforcement Learning Using PyTorch. Please star. 🔥

License: NOASSERTION, 000

JARVIS

JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

License: MIT, 000

Meta-Model-Based-Meta-Policy-Optimization

Source files to replicate experiments in my ACML 2021 paper.

000

metaworld

An open source robotics benchmark for meta- and multi-task reinforcement learning

License: MIT, 000

mopo

Code for MOPO: Model-based Offline Policy Optimization

Language: Python, License: MIT, 010

mujoco

Multi-Joint dynamics with Contact. A general purpose physics simulator.

License: Apache-2.0, 000

mujoco-maze

Simple maze environments using mujoco-py

License: Apache-2.0, 000

oyster

Implementation of Efficient Off-policy Meta-learning via Probabilistic Context Variables (PEARL)

License: MIT, 000

pianoplayer

Automatic fingering generator for piano scores

Language: Python, License: MIT, 000

pomdp-baselines

Simple (but often Strong) Baselines for POMDPs in PyTorch - ICML 2022

License: MIT, 000

ray

An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

License: Apache-2.0, 000

REDQ

Author's PyTorch implementation of Randomized Ensembled Double Q-Learning (REDQ) algorithm.

License: MIT, 000

rltorch

A simple framework for distributed reinforcement learning in PyTorch.

License: MIT, 000

robopianist

🎹 🤖 A benchmark for high-dimensional robot control.

Language: Python, License: Apache-2.0, 000

robopianist-rl

000

soft-actor-critic.pytorch

A PyTorch implementation of Soft Actor-Critic (SAC).

License: MIT, 000

SparseBaseline

000

TakuyaHiraoka.github.io

Language: HTML, 020

ToolBench

An open platform for training, serving, and evaluating large language models for tool learning.

License: Apache-2.0, 000

DiffRL

Eureka

mbrl-lib
