Maintaining episode-level priority while sampling individual transitions from a replay buffer with a MinHeap remover
mathieu-reymond opened this issue · comments
Hi,
I have a table containing episodic trajectories. I assign a priority to each episode, and I only want to keep the N episodes with the highest priority. To do that, I use a MinHeap selector as the remover. However, I want to sample individual transitions from all of these trajectories, not full episodes. I have tried adding a second table that stores individual transitions and sampling from it, but since that table has its own remover, the items in the two tables no longer match once new data is inserted. Any ideas on how I could achieve this?
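To make the desired behaviour concrete, here is a minimal plain-Python sketch of what I am after. It does not use the replay library's tables or selectors; `TopNEpisodeBuffer`, `add_episode`, and `sample` are hypothetical names, and uniform sampling over the surviving transitions is an assumption.

```python
import heapq
import itertools
import random


class TopNEpisodeBuffer:
    """Hypothetical sketch: keep the N highest-priority episodes,
    but sample single transitions from all surviving episodes."""

    def __init__(self, max_episodes, seed=None):
        self.max_episodes = max_episodes
        # Min-heap of (priority, insertion_id, transitions).
        self._heap = []
        # Tie-breaker so that transition lists are never compared.
        self._counter = itertools.count()
        self._rng = random.Random(seed)

    def add_episode(self, transitions, priority):
        entry = (priority, next(self._counter), list(transitions))
        if len(self._heap) < self.max_episodes:
            heapq.heappush(self._heap, entry)
        else:
            # Push the new episode, then evict the lowest-priority one
            # (which may be the new episode itself).
            heapq.heappushpop(self._heap, entry)

    def sample(self, batch_size):
        # Flatten the surviving episodes and draw transitions uniformly.
        flat = [t for _, _, episode in self._heap for t in episode]
        return [self._rng.choice(flat) for _ in range(batch_size)]
```

The point is that eviction happens per episode, so an episode's transitions always enter and leave the buffer together, while sampling still works at the transition level.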
On a related note, is it possible to give the same item two separate priorities: one used for sampling (to achieve something like Prioritized Experience Replay) and one used for removal?
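As an illustration of this second question, the following plain-Python sketch stores two priorities per item. Again, it is not the library's API: `DualPriorityBuffer` and its methods are hypothetical, and sampling proportionally to the sampling priority is a simplification of Prioritized Experience Replay (no priority exponent or importance weights).

```python
import random


class DualPriorityBuffer:
    """Hypothetical sketch: each item has one priority for removal
    and a separate priority for sampling."""

    def __init__(self, max_items, seed=None):
        self.max_items = max_items
        # key -> (item, remove_priority, sample_priority)
        self._items = {}
        self._rng = random.Random(seed)

    def insert(self, key, item, remove_priority, sample_priority):
        self._items[key] = (item, remove_priority, sample_priority)
        if len(self._items) > self.max_items:
            # Evict the item with the lowest removal priority.
            victim = min(self._items, key=lambda k: self._items[k][1])
            del self._items[victim]

    def sample(self, batch_size):
        # Draw items with probability proportional to sampling priority.
        entries = list(self._items.values())
        weights = [entry[2] for entry in entries]
        picks = self._rng.choices(entries, weights=weights, k=batch_size)
        return [entry[0] for entry in picks]
```

Here an item with a high sampling priority can still be evicted first if its removal priority is the lowest, which is the decoupling I am asking about.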
Any help would be greatly appreciated!