google-deepmind / reverb

Reverb is an efficient and easy-to-use data storage and transport system designed for machine learning research

Maintaining episode-level priority while sampling individual transitions from a replay buffer with MinHeap remover

mathieu-reymond opened this issue

Hi,

I have a table containing episodic trajectories, and I assign a priority to each episode. I only want to keep the N episodes with the highest priority, so I use a MinHeap selector as the remover. However, I want to sample individual transitions from these trajectories, not full episodes. I tried adding a second table that stores individual transitions and sampling from that, but because it has its own remover, the items in the two tables no longer match once new data is inserted. Any ideas on how I could achieve this?
On a related note, is it possible for the same item to have two separate priorities: one for sampling (to achieve something like Prioritized Experience Replay) and one for removal?
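
To make the desired behaviour concrete, here is a minimal plain-Python sketch of what I am after. It is not Reverb code and does not use any Reverb API; the `EpisodeBuffer` class and its method names are hypothetical and exist only for illustration. Each episode carries a removal priority that decides eviction (min-heap, as with the MinHeap remover), while each transition carries its own sampling priority.

```python
import heapq
import itertools
import random


class EpisodeBuffer:
    """Hypothetical buffer: evicts whole episodes, samples single transitions."""

    def __init__(self, max_episodes, seed=None):
        self.max_episodes = max_episodes
        self._heap = []      # min-heap of (removal_priority, episode_id)
        self._episodes = {}  # episode_id -> list of (transition, sample_priority)
        self._ids = itertools.count()
        self._rng = random.Random(seed)

    def add_episode(self, transitions, removal_priority, sample_priorities=None):
        # Default to uniform sampling weights when none are given.
        if sample_priorities is None:
            sample_priorities = [1.0] * len(transitions)
        episode_id = next(self._ids)
        self._episodes[episode_id] = list(zip(transitions, sample_priorities))
        heapq.heappush(self._heap, (removal_priority, episode_id))
        # When over capacity, drop the episode with the lowest removal
        # priority together with all of its transitions.
        while len(self._heap) > self.max_episodes:
            _, evicted_id = heapq.heappop(self._heap)
            del self._episodes[evicted_id]

    def sample_transition(self):
        # Flatten the surviving episodes and draw one transition,
        # weighted by the per-transition sampling priority.
        pool = [item for episode in self._episodes.values() for item in episode]
        if not pool:
            raise ValueError("buffer contains no transitions")
        transitions, weights = zip(*pool)
        return self._rng.choices(transitions, weights=weights, k=1)[0]
```

With `max_episodes=2`, adding three episodes with removal priorities 1.0, 3.0 and 2.0 evicts the first one, and `sample_transition()` then only returns transitions from the remaining two. The question is how to get the same coupling between episode-level removal and transition-level sampling with Reverb tables.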

Any help would be greatly appreciated!