dongkwan-kim / SuperGAT

[ICLR 2021] How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision

Home Page:https://openreview.net/forum?id=Wi5KUNlqWty


What is the `best_test_perf_at_best_val` ?

GabbySuwichaya opened this issue · comments

commented

Hello @dongkwan-kim,
After posting a couple of questions, I realized I should say this: your paper is awesome! When I saw the real implementation of your paper, I got so excited that I totally forgot to mention it.

Before I ask a couple more questions, please know that I am very new to this area and would like to learn from your work as well.

My questions are as follows:

  • Could you please explain why best_test_perf_at_best_val and test_perf_at_best_val get worse when I choose to save and then load a previously trained model? (The captured ## RESULTS SUMMARY ## is provided below.)

  • What insight do best_test_perf_at_best_val and test_perf_at_best_val provide?

  • I am particularly interested in the link prediction task. I would like to know how your work scales with the number of nodes and edges. I am looking for a method for my application, which has 2000-4000 nodes and 200-400 edges.

## RESULTS SUMMARY ##
best_test_perf: 0.854 +- 0.002
best_test_perf_at_best_val: 0.606 +- 0.383
best_val_perf: 0.833 +- 0.004
test_perf_at_best_val: 0.605 +- 0.382
## RESULTS DETAILS ##
best_test_perf: [0.852, 0.854, 0.855, 0.856, 0.856, 0.851, 0.852]
best_test_perf_at_best_val: [0.85, 0.853, 0.847, 0.852, 0.0, 0.84, 0.0]
best_val_perf: [0.824, 0.832, 0.834, 0.836, 0.836, 0.836, 0.836]
test_perf_at_best_val: [0.849, 0.845, 0.847, 0.852, 0.0, 0.84, 0.0]
Time for runs (s): 107.82641579399933

Could you please explain why best_test_perf_at_best_val and test_perf_at_best_val get worse when I choose to save and then load a previously trained model? (The captured ## RESULTS SUMMARY ## is provided below.)

Although I have not used the save_model and load_model methods (as I told you before), I can guess what causes your situation.
If you successfully load the model, the best_val_perf of that model will also be loaded (https://github.com/dongkwan-kim/SuperGAT/blob/f337f448d6/SuperGAT/main.py#L296).
So, if your training-after-loading never reaches a val_perf higher than the loaded best_val_perf, the *_at_best_val metrics will not be updated.

Note: I might be wrong.
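If it helps, here is a minimal sketch of the update logic I am describing; the function and helper names are hypothetical, not the actual code in SuperGAT/main.py:

```python
# Hypothetical sketch of the *_at_best_val update logic (not the actual
# code in SuperGAT/main.py). `evaluate` is an assumed helper that returns
# (val_perf, test_perf) for the current model.

def run_training(model, num_epochs, evaluate):
    best_val_perf = getattr(model, "best_val_perf", 0.0)  # restored by loading
    best_test_perf = 0.0
    test_perf_at_best_val = 0.0

    for epoch in range(num_epochs):
        val_perf, test_perf = evaluate(model)
        best_test_perf = max(best_test_perf, test_perf)

        # Only updated when the current run beats the loaded best_val_perf.
        # If the loaded value is already high, this branch may never fire,
        # which would leave test_perf_at_best_val at 0.0 as in your log.
        if val_perf > best_val_perf:
            best_val_perf = val_perf
            test_perf_at_best_val = test_perf

    return best_test_perf, best_val_perf, test_perf_at_best_val
```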

What insight do best_test_perf_at_best_val and test_perf_at_best_val provide?

Looking at best_test_perf_at_best_val and test_perf_at_best_val, we can tell whether we need additional regularization such as early stopping.
If best_test_perf_at_best_val is much larger than test_perf_at_best_val, we believe the model is overfitted.
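If you do see such a gap, a generic early-stopping loop is the usual fix. A minimal sketch (not code from this repository; train_one_epoch and evaluate are assumed helpers):

```python
# Generic early-stopping sketch: stop once the validation metric has not
# improved for `patience` consecutive epochs, so the final model stays
# close to the one selected by the best validation performance.

def train_with_early_stopping(model, train_one_epoch, evaluate,
                              max_epochs=300, patience=100):
    best_val_perf, epochs_without_improvement = 0.0, 0
    for epoch in range(max_epochs):
        train_one_epoch(model)         # assumed helper: one training epoch
        val_perf, _ = evaluate(model)  # assumed helper: (val_perf, test_perf)
        if val_perf > best_val_perf:
            best_val_perf, epochs_without_improvement = val_perf, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_val_perf
```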

I am particularly interested in the link prediction task. I would like to know how your work scales with the number of nodes and edges. I am looking for a method for my application, which has 2000-4000 nodes and 200-400 edges.

The statistics of the datasets I used are described in the appendix. I think you will have no problem using SuperGAT, since a graph with 2000-4000 nodes and 200-400 edges is small and really sparse.

Note: our main focus when writing the paper was the node classification task, so I cannot guarantee that our model outperforms famous baselines such as GCN, GAT, and GIN on the link prediction task.
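If you end up building your own link prediction pipeline for such a small graph, one hedged sketch of preparing the edge splits uses PyTorch Geometric's RandomLinkSplit transform; the node counts and features below are placeholders, and the LinkPlanetoid dataset in this repository handles its own splits internally:

```python
# Illustrative only: split a small custom graph into train/val/test edge sets
# for link prediction using PyTorch Geometric's RandomLinkSplit transform.
import torch
from torch_geometric.data import Data
from torch_geometric.transforms import RandomLinkSplit

num_nodes, num_features = 3000, 16                   # placeholder sizes
x = torch.randn(num_nodes, num_features)             # placeholder node features
edge_index = torch.randint(0, num_nodes, (2, 400))   # placeholder edges

data = Data(x=x, edge_index=edge_index)
train_data, val_data, test_data = RandomLinkSplit(num_val=0.1, num_test=0.2)(data)
# Each split carries edge_label / edge_label_index (positive and sampled
# negative edges) that a GNN encoder can score for link prediction.
```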

commented

@dongkwan-kim Thank you very much.

I see.

Then, I wonder how you normally use your work?
This is because I noticed that when I used the following command, the training data is also used for testing and validation. (Or did I do something wrong?)

python3 SuperGAT/main.py   --dataset-class LinkPlanetoid    --dataset-name CiteSeer   --custom-key EV13NSO8-ES-Link     --num-gpus-total 2  

Therefore, this leads me to the question: do you always need to train a new model for every new data sample?

I am quite curious because I am more used to the deep learning approach where training and testing are clearly separated phases, for example, using training data only for training and testing data only for testing.

Or did I do anything wrong?

You are not doing anything wrong.
The problem is that my code is purely for research purposes; it does not consider a pipeline for production, transfer learning, or many other settings.
So, if you want to run other experiments, you will have to build your own code for your own setting.

do you always need to train a new model for every new data sample?

For all PyTorch models, including SuperGAT, you can separate the execution paths of training and testing.
However, this has not been implemented in this repository, and I believe you can easily implement it yourself.
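For example, a minimal sketch of that separation with plain PyTorch checkpointing (the tiny linear model here is only a placeholder, not SuperGAT itself):

```python
# Minimal sketch of separated training and testing phases via checkpointing.
# The placeholder nn.Linear stands in for whatever trained model you build.
import torch
import torch.nn as nn

model = nn.Linear(16, 2)                        # placeholder "trained" model

# --- training phase: after your training loop, persist the weights ---
torch.save(model.state_dict(), "checkpoint.pt")

# --- testing phase: can run later, even in a separate script ---
model = nn.Linear(16, 2)
model.load_state_dict(torch.load("checkpoint.pt"))
model.eval()
with torch.no_grad():
    logits = model(torch.randn(4, 16))          # placeholder test batch
```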

Thank you!

commented

Thanks so much!