wingsweihua / IntelliLight

IntelliLight: A Reinforcement Learning Approach for Intelligent Traffic Light Control

Some Questions About Your Experiment

KamijouToumaKun opened this issue

Hello, I've been following your project and have a few questions:

  1. Judging from runexp.py, it looks like you collect your experiment results during training rather than running the trained model again on the same configuration. Am I misunderstanding?

  2. The quantities used to compute the reward, such as queue length and duration, fluctuate over time. I'm wondering how the performance figures in Tables 6, 7, 8, and 9 are calculated. Do you take the values at the final timestamp, or do you use the average?

  1. runexp.py is sample code for one round of training.
  2. We use the average over all timestamps.
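
To make the difference between the two options in question 2 concrete, here is a minimal sketch. The log layout (timestamp, queue length, duration) and the numbers are made up for illustration and are not the actual output format of runexp.py.

```python
from statistics import mean

# Hypothetical per-timestamp records: (timestamp in s, queue length, duration in s).
# These values are invented for illustration only.
log = [
    (0, 4, 10.0),
    (1, 6, 12.0),
    (2, 5, 14.0),
    (3, 9, 20.0),
]

def average_metrics(records):
    """Average each metric over all timestamps."""
    return {
        "queue_length": mean(r[1] for r in records),
        "duration": mean(r[2] for r in records),
    }

def final_metrics(records):
    """Take each metric from the last timestamp only, for comparison."""
    last = records[-1]
    return {"queue_length": last[1], "duration": last[2]}

averaged = average_metrics(log)
final = final_metrics(log)
```

With fluctuating metrics the two summaries can differ noticeably, so which one you compute matters when comparing against the tables.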

Oh, I see. When I checked the training log generated by runexp.py, I found that the duration value is much better than the one reported in your paper. That's why I got confused.

So if I want to reproduce the results of your experiment, which model should I use? Should I only run the final synthetic one over the 4 configurations? And would you mind telling me how many epochs it takes to reach the best performance? Thanks.

There are 4 synthetic datasets in the paper, with one model for each.
For the uniform data, the number of epochs does not seem to matter once the model has converged.

Hi @wingsweihua, you said "Runexp is a sample code for one round of training."
Did you mean that one round is one episode in deep reinforcement learning?
Also, I see that one round of training consists of 72000 s, i.e. 20 hours. Is that right?
Did you save the model after one round?

@ynuwm I have the same problem. Did you get an answer?

@pilipili520 No, the author hasn't replied. I think my guess is right: after one episode ends, we need to save the model and then resume from it in the next round.
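
A minimal sketch of that guess, not confirmed by the author. `ToyAgent`, `save_checkpoint`, `load_checkpoint`, and `run_rounds` are hypothetical names, and the pickle file stands in for whatever model format the repository actually uses.

```python
import os
import pickle
import tempfile

class ToyAgent:
    """Hypothetical stand-in for the real agent; it only holds a weight list."""

    def __init__(self):
        self.weights = [0.0, 0.0]
        self.rounds_done = 0

    def train_one_round(self):
        # Placeholder for one full round (e.g. 72000 simulated seconds).
        self.weights = [w + 1.0 for w in self.weights]
        self.rounds_done += 1

def save_checkpoint(agent, path):
    """Write the agent state to disk at the end of a round."""
    with open(path, "wb") as f:
        pickle.dump({"weights": agent.weights, "rounds_done": agent.rounds_done}, f)

def load_checkpoint(path):
    """Return a fresh agent, restored from the checkpoint if one exists."""
    agent = ToyAgent()
    if os.path.exists(path):
        with open(path, "rb") as f:
            state = pickle.load(f)
        agent.weights = state["weights"]
        agent.rounds_done = state["rounds_done"]
    return agent

def run_rounds(path, n_rounds):
    """Run n_rounds, resuming from the saved state at the start of each round."""
    for _ in range(n_rounds):
        agent = load_checkpoint(path)
        agent.train_one_round()
        save_checkpoint(agent, path)
    return load_checkpoint(path)

checkpoint_path = os.path.join(tempfile.mkdtemp(), "agent.pkl")
resumed = run_rounds(checkpoint_path, 3)
```

Here each round starts from the state saved by the previous one, so progress carries across rounds even if every round runs as a separate process.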

I'm also having some issues with duration.