ai4co / rl4co

A PyTorch library for all things Reinforcement Learning (RL) for Combinatorial Optimization (CO)

Home Page: https://rl4.co


Use my own test set (TSP / CVRP Lib)

WYF99111 opened this issue · comments

After training, I want to test performance with my own test set. How can I achieve this?

Hi @WYF99111 !
Could you tell us more about your test set? Is it of the same kind as the one generated by generate_data.py or something different?

I want to use the CVRP benchmark on CVRPLIB, how can I do it?

Hi @WYF99111,

You can test your dataset by overriding some of the input tensordict fields:

```python
policy = AttentionModelPolicy()  # assume the trained parameters are loaded
td = policy.env.reset(batch_size=[n])  # assuming you are evaluating "n" instances
td["loc"] = your_loc
td["depot"] = your_depot_loc
td["demand"] = your_demand
out = policy(td)
print(out["reward"])
```

Note that the code can break when batch_size=[1]. For evaluating a single instance, you can duplicate the location, depot, and demand, perform the same routine, and take any of the rewards from out. We will fix this issue soon.
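For concreteness, a minimal sketch of that single-instance workaround (assuming your_loc, your_depot_loc, and your_demand hold the tensors for one instance, with the same field names as above) could look like this:

```python
# Hypothetical sketch: duplicate a single instance to avoid the batch_size=[1] issue
n_copies = 2  # any value > 1 works
td = policy.env.reset(batch_size=[n_copies])
td["loc"] = your_loc.unsqueeze(0).repeat(n_copies, 1, 1)       # [n_copies, num_loc, 2]
td["depot"] = your_depot_loc.unsqueeze(0).repeat(n_copies, 1)  # [n_copies, 2]
td["demand"] = your_demand.unsqueeze(0).repeat(n_copies, 1)    # [n_copies, num_loc]

out = policy(td)
print(out["reward"][0])  # all copies are identical, so any of the rewards works
```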

Hello, can you give me an example of td['loc'], td['depot'], and td['demand']? I don't know much about their data types.

Same as the answer from @cbhua on Slack, reporting it here:

You can run this minimalistic example and print the td to check the shape of each feature:

```python
from rl4co.envs import TSPEnv

# Environment
env = TSPEnv(num_loc=20)

# Get an example random tensordict
td = env.reset(batch_size=[64])
print(td)
```
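Since the question is specifically about the CVRP fields, the same trick should also work with CVRPEnv so you can inspect exactly those keys (a small sketch, assuming the default random generator):

```python
from rl4co.envs import CVRPEnv

# Generate a random CVRP tensordict and print it to see each key's shape and dtype,
# then build your own tensors to match those shapes
env = CVRPEnv(num_loc=20)
td = env.reset(batch_size=[64])
print(td)
```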

Next time, please try to answer the question in only one place, as it would make it easier for us to track issues :)

I have a similar issue. I want to train the model for CVRPEnv on my own dataset, and I wrote my own data parser and generator like this:

```python
def __init__(self, td_test):
    self.max_loc = max_coord(td_test["locs"][0, :, :])
    self.min_loc = min_coord(td_test["locs"][0, :, :])
    self.num_loc = td_test["demand"].shape[1]  # depot counts as location here
    self.depot = td_test["depot"][0, :]
    self.max_dem = max_demand(td_test["demand"][0, :])
    self.min_demand = min_demand(td_test["demand"][0, :])
    self.capacity = td_test["capacity"][0].item()
```
in order to parametrize the environment:

```python
env2 = CVRPEnv(
    num_loc=vrp_size,
    max_demand=max_demand,
    min_demand=min_demand,
    max_loc=max_loc,
    min_loc=min_loc,
    capacity=capacity,
    vehicle_capacity=capacity,
)
```

What I am facing is that the CAPACITIES dict from Kool in CVRPEnv is somehow linked deep into the models, so even when you set the "capacity" and "demand" variables (e.g. between 10 and 100) in CVRPEnv, the output tensors look like this:

```python
vehicle_capacity: tensor([[144], [144]])
demand: tensor([0.4167, 0.1500, 1.0000, 0.5667, 0.3000, 0.7500, 0.5167, 0.8667, 0.1167, 0.3500])
```

Obviously, the demand is divided by the capacity taken from the CAPACITIES dict in def generate_data(self, batch_size) in CVRPEnv, but when you have a different number of customers, it returns an error because no capacity for that size is found in the dict.
Additionally, the demand is somehow always normalized by the non-normalized vehicle capacity. I think this behaviour is not desired, is it?

```python
state = env1.reset(batch_size=[6])
# -> vehicle_capacity: tensor([[150], [150], ...])
#    locs: tensor([[101.7827, 611.7601], ...])
#    demand: tensor([0.6286, 0.6429, 0.7286, 0.6571, 0.4714, 0.1714, 0.4143, 0.6000, 0.2571, 0.3286])
```

So, what is the best way to train the model on your own dataset? Thanks!

Hi,

In short, in my opinion, you might want to consider overriding the generate_data method of CVRPEnv. This will allow you to have complete control over generating instances.
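A rough sketch of such an override (not the library's built-in loader; the my_* tensors are hypothetical placeholders for whatever you parse from your own files, using the same keys and shapes as in your snippet above) could look like:

```python
import torch
from tensordict import TensorDict
from rl4co.envs import CVRPEnv

# Hypothetical pre-loaded test set; replace with tensors parsed from your own files
my_locs = torch.rand(128, 50, 2)          # [batch, num_loc, 2] customer coordinates
my_depot = torch.rand(128, 2)             # [batch, 2] depot coordinates
my_demand = torch.rand(128, 50)           # [batch, num_loc] demands (already normalized)
my_capacity = torch.full((128, 1), 40.0)  # [batch, 1] vehicle capacity


class MyCVRPEnv(CVRPEnv):
    """Sketch of a CVRPEnv that serves instances from a fixed test set."""

    def generate_data(self, batch_size):
        batch_size = [batch_size] if isinstance(batch_size, int) else batch_size
        n = batch_size[0]
        return TensorDict(
            {
                "locs": my_locs[:n],
                "depot": my_depot[:n],
                "demand": my_demand[:n],
                "capacity": my_capacity[:n],
            },
            batch_size=[n],
        )
```

The environment can then be used for evaluation (or training) like the default CVRPEnv, without touching the CAPACITIES-based generation at all.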


Additionally, the demand is somehow always normalized by the non-normalized vehicle capacity. I think this behaviour is not desired, is it?

I'm assuming you are referring to the behavior of the code in Line 249 of cvrp.py. During the development of CVRPEnv, we aligned our implementation with Kool's implementation here. The normalization performed in Line 249 is intended to enable a comparison of methods with Kool's implementation.
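For reference, the convention inherited from Kool's code is roughly the following (a sketch of that convention, not the exact rl4co source):

```python
import torch

# Kool et al.'s CVRP convention: integer demands in [1, 9] are divided by a
# size-dependent capacity, so the vehicle capacity the model sees is simply 1.0
CAPACITIES = {10: 20.0, 20: 30.0, 50: 40.0, 100: 50.0}

batch, num_loc = 4, 20
demand = torch.randint(1, 10, (batch, num_loc)).float() / CAPACITIES[num_loc]
vehicle_capacity = 1.0
```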

However, we realize that the use of the CAPACITIES dictionary might not be necessary, as you pointed out. Let's continue discussing this matter to enhance the modularity of CVRPEnv.

@fedebotu, could you also review this?

Hi @Junyoungpark, thank you for your reply. Yes, I meant exactly this line. If you set the vehicle capacity in CVRPEnv to an arbitrary value, the demand will be normalized by /CAPACITIES[num_loc], but the vehicle capacity will not. This leads to output like state = env1.reset(batch_size=[3]) -> vehicle_capacity: tensor([[150], [150], ...]), demand: tensor([0.6286, 0.6429, 0.7286, 0.6571, ...]). I don't know if this has a direct impact on training, or maybe you normalize the vehicle capacity somewhere during the embedding procedure as well (obviously it should be 1 then).
Deleting the /CAPACITIES[num_loc] term somehow leads to further implications: my trainer.fit(model) gets stuck somewhere at the beginning without throwing any exceptions. Thank you for your support. Best regards, Elija

Hi @ElijaDei! Thanks for pointing out this part. This problem comes from some redundant code.

The vehicle_capacity is set by the environment's init parameter (L67 in CVRPEnv), which defaults to 1. The capacity initialized in the generate_data() function will actually be overwritten in the reset() step (L149 in CVRPEnv).

But the /CAPACITIES[num_loc] term shouldn't be deleted, since it is the normalization for the node demands.

We will keep this issue open since we want to implement native loading for TSP/CVRP Libs soon!

Hi there!

We now have two tutorial notebooks on how to test your trained model on TSPLib/CVRPLib (5-test-on-tsplib.ipynb and 6-test-on-cvrplib.ipynb) under notebooks/tutorials. 🎉

In these notebooks we cleaned up the pipeline for testing the model, including guidance on downloading and preparing the datasets, loading your trained model, and testing with greedy, augmentation, and sampling decoding.
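As a quick taste before opening the notebooks, switching decoding modes typically only changes the decode_type argument (a hedged sketch; policy and td are assumed to be prepared as in the earlier snippets, and the augmentation-based evaluation is covered in the notebooks themselves):

```python
# Greedy decoding: one deterministic solution per instance
out_greedy = policy(td.clone(), decode_type="greedy")
print(out_greedy["reward"].mean())

# Sampling decoding: stochastic solutions; repeat it and keep the best reward per instance
out_sampling = policy(td.clone(), decode_type="sampling")
print(out_sampling["reward"].mean())
```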

Have fun!

The current code for testing on TSPLib and CVRPLib contains a lot of data loading, tensor conversion, and testing-loop code. As a next step, we could wrap these into utility functions for clearer and more convenient usage. We could also support more benchmark libraries for broader coverage.