rlcode / reinforcement-learning

Minimal and Clean Reinforcement Learning Examples

Why use self.batch_size instead of batch_size

JieMEI1994 opened this issue · comments

From `train_model` in reinforcement-learning/2-cartpole/1-dqn/cartpole_dqn.py:

def train_model(self):
    if len(self.memory) < self.train_start:
        return
    batch_size = min(self.batch_size, len(self.memory))
    mini_batch = random.sample(self.memory, batch_size)

    update_input = np.zeros((batch_size, self.state_size))
    update_target = np.zeros((batch_size, self.state_size))
    action, reward, done = [], [], []

    for i in range(self.batch_size):
        update_input[i] = mini_batch[i][0]
        action.append(mini_batch[i][1])
        reward.append(mini_batch[i][2])
        update_target[i] = mini_batch[i][3]
        done.append(mini_batch[i][4])

    target = self.model.predict(update_input)
    target_val = self.target_model.predict(update_target)

    for i in range(self.batch_size):
        # Q Learning: get maximum Q value at s' from target model
        if done[i]:
            target[i][action[i]] = reward[i]
        else:
            target[i][action[i]] = reward[i] + self.discount_factor * (
                np.amax(target_val[i]))

    # and do the model fit!
    self.model.fit(update_input, target, batch_size=self.batch_size,
                   epochs=1, verbose=0)

In this part of the code, why do you use `self.batch_size` after taking the minimum of `self.batch_size` and the length of the memory? Wouldn't the local `batch_size` be better?
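To illustrate the concern, here is a small sketch with made-up sizes (a replay memory of 10 transitions and a configured batch size of 32 standing in for `self.batch_size`). It shows that if the memory were ever shorter than the configured batch size, indexing the sampled minibatch with `self.batch_size` would overrun it, while the local `batch_size` stays safe:

```python
import random
import numpy as np

# Hypothetical numbers for illustration: the configured batch size
# (self.batch_size in the class) exceeds the replay-memory length.
state_size = 4
memory = [(np.zeros(state_size), a, 1.0, np.zeros(state_size), False)
          for a in range(10)]              # only 10 transitions stored
configured_batch_size = 32                 # stands in for self.batch_size

batch_size = min(configured_batch_size, len(memory))   # local variable -> 10
mini_batch = random.sample(memory, batch_size)

# Indexing with the local batch_size is always safe:
safe = [mini_batch[i] for i in range(batch_size)]

# Indexing with the configured size overruns the sampled list:
overran = False
try:
    for i in range(configured_batch_size):
        _ = mini_batch[i]
except IndexError:
    overran = True

print("min() gave batch_size =", batch_size,
      "| overran with configured size:", overran)
```

In the repo's actual code the `train_start` guard prevents this situation from arising, which is why the mismatch goes unnoticed; but using one name consistently would make the loop robust if that guard ever changed.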

The line `batch_size = min(self.batch_size, len(self.memory))` is itself unnecessary, because this guard runs before it:

    if len(self.memory) < self.train_start:
        return

Once the guard passes, the memory already holds at least `self.train_start` transitions, which is no smaller than `self.batch_size`, so the `min` always returns `self.batch_size`.
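To make that concrete, here is a small sketch using what appear to be the repo's defaults in cartpole_dqn.py (`train_start = 1000`, `batch_size = 64`; treat them as assumptions). It mirrors `train_model`'s control flow and checks that once the guard passes, the `min()` can never pick `len(self.memory)`:

```python
# Assumed defaults from cartpole_dqn.py:
train_start = 1000
configured_batch_size = 64

def effective_batch(memory_len):
    """Mirror train_model's control flow: guard first, then min()."""
    if memory_len < train_start:       # the early return in train_model
        return None
    return min(configured_batch_size, memory_len)

# Whenever the guard passes, memory_len >= train_start >= batch_size,
# so min() always returns the configured batch size.
for n in range(0, 5000, 250):
    assert effective_batch(n) in (None, configured_batch_size)

print("min() is redundant whenever the guard passes")
```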
Thank you for letting us know.