yining043 / TSP-improve

An improvement-based deep reinforcement learning algorithm for solving the TSP, presented in the paper https://arxiv.org/abs/1912.05784v2.

A Potential Bug Report

chenwangnida opened this issue · comments

Dear Yining,

I hope this message finds you well. I was using your TSP-improve project on GitHub and noticed a bug that I wanted to bring to your attention.

The issue concerns resetting the 'exchange' parameter back to 'None' after each full episode, whose length is 4.
I believe it can be solved by moving 'exchange = None' (Line 226) to the line immediately after Line 238 ('while t < T') in train.py.

Please let me know if I made any mistakes. Thank you for your time and effort in maintaining this project.

Best regards,
Chen

Sorry, I made a mistake in understanding the step n: it is not the episode length. It is a parameter that makes the policy update every n steps in your continuous problem :)
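
To make the distinction concrete, here is a minimal sketch of how I now read the loop. It is not the actual code in train.py: `run_episode`, `update_steps` and `exchange_history` are hypothetical names, and the tuple assigned to `exchange` is only a placeholder for the action chosen at each step. The assumption is that `exchange` is reset once before `while t < T`, and that every n steps only a policy update happens, so the last exchange carries over into the next n-step segment.

```python
def run_episode(T, n_step):
    """Toy stand-in for a training loop (hypothetical, not the repo's code)."""
    exchange = None          # reset once per episode, before `while t < T`
    update_steps = []        # time steps at which a policy update would happen
    exchange_history = []    # value of `exchange` at the start of each n-step segment
    t = 0
    while t < T:
        exchange_history.append(exchange)
        t_start = t
        # collect up to n_step transitions before updating the policy
        while t - t_start < n_step and t < T:
            exchange = (t, t + 1)   # placeholder for the action chosen at step t
            t += 1
        update_steps.append(t)      # placeholder for one policy update
    return update_steps, exchange_history


updates, seen = run_episode(T=12, n_step=4)
print(updates)  # [4, 8, 12]
print(seen)     # [None, (3, 4), (7, 8)]
```

With T=12 and n=4 the policy is updated three times within one episode, and `exchange` is 'None' only at the very first step. Moving the reset inside the `while` loop would clear it at the start of every n-step segment, which is what I had wrongly proposed.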