takuseno / d3rlpy

An offline deep reinforcement learning library

Home Page: https://takuseno.github.io/d3rlpy

[Question] How can I convert a saved replay_buffer to MDPDataset

HateBunnyPlzzz opened this issue · comments

Hi,
I was training a SAC policy (SACConfig()) on the MountainCarContinuous-v0 environment, and I need to convert a saved/exported replay buffer into an MDPDataset, which I will later use for further research analysis. I was unable to find anything about this in the docs or tutorials.
Here is my code:

import gym
import torch
import d3rlpy
from d3rlpy.algos import ConstantEpsilonGreedy, SACConfig
from d3rlpy.dataset import create_fifo_replay_buffer

env = gym.make("MountainCarContinuous-v0")

# use the GPU when available, otherwise fall back to the CPU
device = "cuda:0" if torch.cuda.is_available() else "cpu"

# data collection while training SAC
sac = SACConfig().create(device=device)
buffer = create_fifo_replay_buffer(limit=100000, env=env)
explorer = ConstantEpsilonGreedy(0.3)
sac.fit_online(env, buffer, explorer, n_steps=100000)

# saving the buffer dataset
with open("SAC_replay_buffer.h5", "w+b") as f:
    buffer.dump(f)

# loading the replay buffer
with open("SAC_replay_buffer.h5", "rb") as f:
    sac_dataset = d3rlpy.dataset.ReplayBuffer.load(f, d3rlpy.dataset.InfiniteBuffer())

Since there does not seem to be an existing method for this conversion, is there a workaround?

@HateBunnyPlzzz Hi, thanks for the issue. In v1, ReplayBuffer and MDPDataset were completely different classes. In v2, however, they are effectively the same. If you look at the MDPDataset class, you'll notice that it simply inherits from ReplayBuffer:

class MDPDataset(ReplayBuffer):

Hi, thank you for the response.
I worked around it by accessing the data through the .episodes property.
Feel free to close the issue.

Both ReplayBuffer and MDPDataset have an episodes property. If your question was how to switch between an infinite buffer and a FIFO buffer, here is an example:

import d3rlpy

# this dataset is infinite
dataset, _ = d3rlpy.datasets.get_dataset("hopper-medium-v0")

# convert this to FIFO buffer
fifo_dataset = d3rlpy.dataset.create_fifo_replay_buffer(limit=1000000, episodes=dataset.episodes)

# convert this to infinite buffer again
infinite_dataset = d3rlpy.dataset.create_infinite_replay_buffer(episodes=fifo_dataset.episodes)
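For the analysis use case in the original question, the episodes can also be flattened into plain step-wise arrays. The following is only a minimal sketch, not part of d3rlpy: episodes_to_arrays is a hypothetical helper, and it assumes each episode exposes observations, actions, rewards and terminated attributes (as d3rlpy v2 episodes appear to) and that observations are single arrays rather than tuples. Stand-in episodes are used so that the sketch runs without d3rlpy installed.

```python
from types import SimpleNamespace

import numpy as np


def episodes_to_arrays(episodes):
    """Flatten episode objects into step-wise arrays (hypothetical helper)."""
    observations = np.concatenate([ep.observations for ep in episodes], axis=0)
    actions = np.concatenate([ep.actions for ep in episodes], axis=0)
    # force rewards into shape (N, 1) so episodes concatenate consistently
    rewards = np.concatenate(
        [np.reshape(ep.rewards, (-1, 1)) for ep in episodes], axis=0
    )

    # mark the last step of every episode as either a terminal or a timeout
    terminals = np.zeros(len(observations), dtype=np.float32)
    timeouts = np.zeros(len(observations), dtype=np.float32)
    end = 0
    for ep in episodes:
        end += len(ep.observations)
        if ep.terminated:
            terminals[end - 1] = 1.0
        else:
            timeouts[end - 1] = 1.0
    return observations, actions, rewards, terminals, timeouts


# stand-in episodes (assumption: same attribute names as d3rlpy v2 episodes)
fake_episodes = [
    SimpleNamespace(
        observations=np.zeros((3, 2), dtype=np.float32),
        actions=np.zeros((3, 1), dtype=np.float32),
        rewards=np.ones((3, 1), dtype=np.float32),
        terminated=True,
    ),
    SimpleNamespace(
        observations=np.ones((2, 2), dtype=np.float32),
        actions=np.ones((2, 1), dtype=np.float32),
        rewards=np.zeros((2, 1), dtype=np.float32),
        terminated=False,
    ),
]

obs, act, rew, term, tout = episodes_to_arrays(fake_episodes)
print(obs.shape, act.shape, rew.shape)
```

With d3rlpy installed, the same helper could presumably be called as episodes_to_arrays(sac_dataset.episodes), and the resulting arrays passed to the d3rlpy.dataset.MDPDataset constructor (observations, actions, rewards, terminals, timeouts) if a freshly built MDPDataset object is really required. Check the constructor signature against your installed version before relying on this.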

Since the issue seems resolved, I'll close it. Feel free to reopen it if there is any further discussion.