Takuya Hiraoka's starred repositories
AI-Scientist
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
RLHF-Reward-Modeling
Recipes to train reward models for RLHF.
Open_Duck_Mini
Making a mini version of the BDX droid
xland-minigrid
JAX-accelerated Meta-Reinforcement Learning Environments Inspired by XLand and MiniGrid
genrl
[NeurIPS 2024] GenRL: Multimodal foundation world models allow grounding language and video prompts into embodied domains, by turning them into sequences of latent world model states. Latent state sequences can be decoded using the decoder of the model, allowing visualization of the expected behavior, before training the agent to execute it.
OfflineRLStructuredNonstationarity
Implementation for RLC paper "Offline Reinforcement Learning from Datasets with Structured Non-Stationarity".