ai4co / rl4co

A PyTorch library for all things Reinforcement Learning (RL) for Combinatorial Optimization (CO)

Home Page: https://rl4.co

[Help] The results about the sdvrp environment

kuoiii opened this issue · comments


Hi, when I test the SDVRP env with the config below, the results do not look right. Basically, I would expect the vehicle to revisit the same target location multiple times, but the results and printed actions show that each target location is visited only once.
Also, when I set min_demand to 0 and train the model, it sometimes raises an error because the demand at the start location becomes negative.
We appreciate your work and look forward to your reply. It would also help to have more tutorials and examples on how to load a trained model, since we are unfamiliar with your code structure.
from rl4co.envs import SDVRPEnv

sdvrpenv = SDVRPEnv(
    num_loc=20,
    min_loc=0,
    max_loc=1,
    min_demand=1,
    max_demand=10,
    vehicle_capacity=1.0,
)

Thanks for your interest! Let us answer down here:

Basically, I would expect the vehicle to revisit the same target location multiple times, but the results and printed actions show that each target location is visited only once.
We could not reproduce this result; could you provide more context? For example, in the notebook above, printing the duplicate actions gives:

# Find and print the duplicate actions in each batch element
for actions in out["actions"]:
    unique_elements, counts = torch.unique(actions, return_counts=True)
    duplicates = unique_elements[counts > 1]  # keep only locations visited more than once
    visited_more_than_once = duplicates[duplicates != 0]  # exclude the depot (index 0)
    print(f"Number of duplicate actions: {visited_more_than_once}")
Number of duplicate actions: tensor([3, 5], device='cuda:0')
Number of duplicate actions: tensor([10, 16, 17], device='cuda:0')
Number of duplicate actions: tensor([ 2, 17], device='cuda:0')

You may also see it from the plot:
[image: tour plot]

Plus, when I set min_demand to 0 and train the model, it sometimes raises an error because the demand at the start location becomes negative.

You are correct. If min_demand <= 0, some problems may arise, and it does not make sense from the problem's standpoint: if a node's demand is 0, visiting it is always suboptimal, since there is no reason to go there. We may add a check like assert min_demand > 0. Thanks for pointing this out!
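Such a guard could look like the following sketch. This is illustrative only: `validate_demand_bounds` is a hypothetical helper, not part of the actual rl4co codebase.

```python
def validate_demand_bounds(min_demand: float, max_demand: float) -> None:
    """Reject demand ranges that make the SDVRP ill-posed.

    A node with zero or negative demand is never worth visiting, so we
    fail fast at env construction instead of crashing during training.
    """
    if min_demand <= 0:
        raise ValueError(f"min_demand must be positive, got {min_demand}")
    if max_demand < min_demand:
        raise ValueError(
            f"max_demand ({max_demand}) must be >= min_demand ({min_demand})"
        )

validate_demand_bounds(1, 10)  # the config from this issue passes
```

A check like this would run once in the env's `__init__`, so it adds no per-step overhead.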

We appreciate your work and look forward to your reply. It would also help to have more tutorials and examples on how to load a trained model, since we are unfamiliar with your code structure.

We made some small fixes so that checkpointing works well in notebooks too (checkpoint loading was mostly based on Hydra, but we would like it to be accessible to anyone!). We will also refactor the LitModule class (follow along in this issue) to make it easier to use, more modular, and better documented. Make sure to check the updated notebook here, where we added more comments as well as checkpoint saving and loading, logging, and more testing! :)
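In the meantime, the generic PyTorch state-dict route always works as a manual fallback. The sketch below is not rl4co's actual API: `TinyPolicy` is a stand-in for a trained model, and the point is only the save/restore pattern.

```python
import os
import tempfile

import torch
from torch import nn


class TinyPolicy(nn.Module):
    """Stand-in for a trained policy network (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 2)

    def forward(self, x):
        return self.proj(x)


# Save the trained model's weights to a checkpoint file...
model = TinyPolicy()
path = os.path.join(tempfile.mkdtemp(), "policy.ckpt")
torch.save(model.state_dict(), path)

# ...then restore them into a fresh instance for inference.
restored = TinyPolicy()
restored.load_state_dict(torch.load(path))
restored.eval()
```

Since the library's training modules are PyTorch Lightning modules, the higher-level equivalent is Lightning's `Model.load_from_checkpoint(path)`; the notebook linked above covers that flow.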

Closing since answered - feel free to re-open should you have any further issues ;)