ai4co / rl4co

A PyTorch library for all things Reinforcement Learning (RL) for Combinatorial Optimization (CO)

Home Page: https://rl4.co

[BUG] possible bug in Rollout Baseline

fedebotu opened this issue

On both mTSP and PDP, training with the rollout baseline can blow up: the loss starts increasing after some time.
I suspect this may be caused by gradient clipping in PyTorch Lightning, so it needs investigating.

I can confirm the bug: there is a bug in the REINFORCE logic. I will be working on this.
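
For reference, here is a minimal sketch of the REINFORCE loss with a baseline, the kind of logic being discussed. It is not the rl4co implementation, and the thread does not say what the actual bug was. The function name `reinforce_loss` and its arguments are hypothetical, and it assumes the reward is "higher is better" (for example, negative tour length) and that the baseline value comes from a greedy rollout of a frozen policy.

```python
def reinforce_loss(rewards, baseline_values, log_likelihoods):
    """Mean REINFORCE loss with a baseline over a batch (illustrative only).

    rewards:          per-instance reward of the sampled solution (higher is better)
    baseline_values:  per-instance reward of the baseline (e.g. greedy rollout)
    log_likelihoods:  per-instance log-probability of the sampled solution
    """
    if not (len(rewards) == len(baseline_values) == len(log_likelihoods)):
        raise ValueError("all inputs must have the same batch size")
    if not rewards:
        raise ValueError("batch must not be empty")

    # Advantage: how much better the sampled solution is than the baseline.
    advantages = [r - b for r, b in zip(rewards, baseline_values)]

    # Minimizing this loss raises the log-probability of solutions with
    # positive advantage and lowers it for solutions with negative advantage.
    weighted = [a * ll for a, ll in zip(advantages, log_likelihoods)]
    return -sum(weighted) / len(weighted)


# Two instances: the first beats the baseline, the second is worse.
loss = reinforce_loss(
    rewards=[-10.0, -12.0],
    baseline_values=[-11.0, -11.0],
    log_likelihoods=[-2.0, -3.0],
)
print(loss)  # -0.5
```

When checking logic like this, the sign convention is worth verifying first: if costs are passed in where rewards are expected, the advantage flips and the update favors worse-than-baseline solutions.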