openai / baselines

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Understanding the replay_buffer implementation

willtop opened this issue

For deepq/replay_buffer.py and common/segment_tree.py, I am a bit confused about how they are implemented:

  1. A segment tree gives O(1) access to the total sum (or min) by just reading the first element. Why does the code call seg_tree.sum() (or seg_tree.min()) instead?
  2. Following on from the previous question: sum() (or min()) triggers the reduce() operation in segment_tree.py. With the default arguments it simply returns the first element, as expected. However, on line 109 of deepq/replay_buffer.py, why is it called with explicit arguments, as self._it_sum.sum(0, len(self._storage) - 1)? Since the remaining elements default to zero, wouldn't reading the first element of the tree as the total sum always give the same result?
  3. If DQN training never needs the sum (or min) over a specific range of leaves, when is the reduce() operation in segment_tree.py needed at all? I don't see the need for it.

These parts have been confusing me for a while. I would really appreciate it if anyone could help clarify the code. Thanks!
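
To make the three questions concrete, here is a minimal sketch of an array-backed segment tree. It is not the code from common/segment_tree.py: the class name `SketchSegmentTree`, the `values` attribute, the root-at-index-1 layout, and the half-open `[start, end)` range are all assumptions chosen for illustration. It shows that the root already holds the reduction over every leaf when unused leaves contain the neutral element, and that a range `reduce()` only gives a different answer when the range leaves out populated leaves.

```python
import operator


class SketchSegmentTree:
    """Illustrative segment tree (hypothetical, not the baselines class)."""

    def __init__(self, capacity, operation, neutral):
        # Capacity must be a power of two so the tree is perfectly balanced.
        assert capacity > 0 and capacity & (capacity - 1) == 0
        self.capacity = capacity
        self.operation = operation
        self.neutral = neutral
        # Index 1 is the root; leaves live at [capacity, 2 * capacity).
        self.values = [neutral] * (2 * capacity)

    def __setitem__(self, idx, val):
        node = idx + self.capacity
        self.values[node] = val
        node //= 2
        # Recompute every ancestor up to the root: O(log capacity).
        while node >= 1:
            self.values[node] = self.operation(
                self.values[2 * node], self.values[2 * node + 1]
            )
            node //= 2

    def reduce(self, start=0, end=None):
        """Reduce over leaves in the half-open range [start, end)."""
        if end is None:
            end = self.capacity
        result = self.neutral
        lo, hi = start + self.capacity, end + self.capacity
        # Assumes a commutative operation such as add or min.
        while lo < hi:
            if lo & 1:
                result = self.operation(result, self.values[lo])
                lo += 1
            if hi & 1:
                hi -= 1
                result = self.operation(result, self.values[hi])
            lo //= 2
            hi //= 2
        return result


# Three priorities stored in a buffer with room for eight.
sum_tree = SketchSegmentTree(8, operator.add, 0.0)
min_tree = SketchSegmentTree(8, min, float("inf"))
for i, priority in enumerate([1.0, 2.0, 3.0]):
    sum_tree[i] = priority
    min_tree[i] = priority

root_sum = sum_tree.values[1]        # O(1) read of the root
full_sum = sum_tree.reduce()         # reduce over every leaf
filled_sum = sum_tree.reduce(0, 3)   # reduce over the filled leaves only
partial_sum = sum_tree.reduce(0, 2)  # leaves out the last filled leaf
root_min = min_tree.values[1]
```

In this sketch `root_sum`, `full_sum`, and `filled_sum` agree because the unused leaves hold the neutral element, while `partial_sum` is smaller because it skips one stored priority. Whether the `end` argument in segment_tree.py is inclusive or exclusive is worth checking against the source, since that decides whether `len(self._storage) - 1` covers the last stored item.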