A Potential Bug Report
chenwangnida opened this issue
Dear Yining,
I hope this message finds you well. I was using your TSP-improve project on GitHub and noticed a bug that I wanted to bring to your attention.
The issue I encountered is that the 'exchange' parameter does not appear to be reset to 'None' after each full episode of length 4.
I believe this can be solved by moving 'exchange = None' (Line 226) to the line immediately after 'while t < T' (Line 238) in train.py.
Please let me know if I made any mistakes. Thank you for your time and effort in maintaining this project.
Best regards,
Chen
Sorry, I misunderstood the step n: it is not the episode length. It is a parameter that makes the policy update every n steps in your continuous problem :)
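For anyone reading this later, here is a minimal sketch of the distinction, assuming a loop shaped roughly like the one discussed above. It is not the code from train.py: `run_rollout`, `n_step`, and the tuples stored in `exchange` are hypothetical stand-ins used only for illustration. The point is that `exchange` is reset once before `while t < T`, so it carries over between n-step segments, and each segment ends in a policy update rather than in the end of an episode.

```python
def run_rollout(T, n_step):
    """Toy loop: one continuing rollout of T steps, policy update every n_step steps."""
    exchange = None          # reset once per rollout, not once per n-step segment
    t = 0
    updates = 0
    seen = []                # value of `exchange` at the start of each segment
    while t < T:
        seen.append(exchange)
        t_start = t
        while t - t_start < n_step and t < T:
            exchange = (t, t + 1)   # hypothetical stand-in for the action taken at step t
            t += 1
        updates += 1         # stand-in for a policy update on the last segment
    return updates, seen


updates, seen = run_rollout(T=12, n_step=4)
print(updates)  # 3
print(seen)     # [None, (3, 4), (7, 8)]
```

Only the first segment starts with `exchange` set to `None`; the later ones start from the last action of the previous segment, which is what one would expect if n is an update interval and not an episode length.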