Error for running the code

Question

Jimmy-7664 opened this issue 2 years ago

I got several errors while running the code.
a).
For the forecasting part,

I ran the command given in Readme.md, "python step/run.py --cfg='step/step_METR-LA.py' --gpus='0'", and got:
2022-09-29 15:15:25,915 - easytorch-launcher - INFO - Launching EasyTorch training.
Traceback (most recent call last):
File "step/run.py", line 20, in <module>
launch_training(args.cfg, args.gpus)
File "/home/ght/tsformer/STEP/basicts/launcher.py", line 19, in launch_training
easytorch.launch_training(cfg=cfg, gpus=gpus, node_rank=node_rank)
File "/home/ght/anaconda3/envs/torch/lib/python3.8/site-packages/easytorch/launcher/launcher.py", line 58, in launch_training
cfg = init_cfg(cfg, node_rank == 0)
File "/home/ght/anaconda3/envs/torch/lib/python3.8/site-packages/easytorch/config/utils.py", line 210, in init_cfg
cfg = import_config(cfg, verbose=save)
File "/home/ght/anaconda3/envs/torch/lib/python3.8/site-packages/easytorch/config/utils.py", line 173, in import_config
cfg = __import__(path, fromlist=[cfg_name]).CFG
ModuleNotFoundError: No module named 'step.step_METR-LA'
There also seem to be scripts for only one dataset (METR-LA), and none for others such as PEMS04.

b).
For the pretraining part,
I ran "python step/run.py --cfg='step/TSFormer_METR-LA.py' --gpus='0,1,2,3'",
and got:

2022-09-29 15:05:58,680 - easytorch-training - ERROR - Traceback (most recent call last):
File "/home/ght/anaconda3/envs/torch/lib/python3.8/site-packages/easytorch/launcher/launcher.py", line 30, in training_func
runner.train(cfg)
File "/home/ght/anaconda3/envs/torch/lib/python3.8/site-packages/easytorch/core/runner.py", line 361, in train
self.on_epoch_end(epoch)
File "/home/ght/tsformer/STEP/basicts/runners/base_runner.py", line 141, in on_epoch_end
if self.test_data_loader is not None and epoch % self.test_interval == 0:
AttributeError: 'TSFormerRunner' object has no attribute 'test_data_loader'

I checked the scripts, and test_data_loader does not seem to be defined.
How can I solve these issues? Looking forward to your reply.
: )

S22 · Answer 1 · Thu Sep 29 2022 15:40:50 GMT+0800 (China Standard Time)

Thanks for your question~

1. This is a typo. File names are case sensitive, so the correct command is "python step/run.py --cfg='step/STEP_METR-LA.py' --gpus='0'".
2. STEP has just been refactored, and I'm testing the other datasets (PEMS-BAY and PEMS04) to make sure there are no performance issues. My computing resources are limited, so this may take a few days. I will upload scripts for the PEMS04 and PEMS-BAY datasets very soon.
3. I cannot reproduce this error in the latest version of STEP. The test_data_loader is built in here. Could you provide more information?
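The case-sensitivity problem in the first point can be caught before launching training. The snippet below is a minimal sketch, not part of STEP or EasyTorch: the helper names `find_case_insensitive_matches` and `check_cfg_path` are hypothetical, and it only assumes that the `--cfg` value is a path to a file on disk.

```python
from pathlib import Path


def find_case_insensitive_matches(cfg_path):
    """Return names of files in cfg_path's directory that match its name ignoring case."""
    path = Path(cfg_path)
    parent = path.parent
    if not parent.is_dir():
        return []
    wanted = path.name.lower()
    # Compare against the names as actually stored on disk.
    return sorted(p.name for p in parent.iterdir() if p.name.lower() == wanted)


def check_cfg_path(cfg_path):
    """Build a short message saying whether cfg_path is spelled exactly as on disk."""
    name = Path(cfg_path).name
    matches = find_case_insensitive_matches(cfg_path)
    if name in matches:
        return f"OK: {cfg_path} exists with this exact spelling."
    if matches:
        return f"Not found: {cfg_path}. Did you mean {matches[0]}?"
    return f"Not found: {cfg_path}, and no similarly named file exists."
```

Calling `check_cfg_path("step/step_METR-LA.py")` from the repository root would, under these assumptions, point at the differently cased file name instead of failing later with a ModuleNotFoundError.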

S22 · Answer 2 · Thu Sep 29 2022 16:19:11 GMT+0800 (China Standard Time)

For the third question, I added an initialization of test_data_loader to avoid possible problems after the config file is edited (e.g., if all arguments related to the test process are removed).
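A minimal sketch of this kind of defensive initialization is shown below. It is not the actual BasicTS code: the class `SketchRunner`, the method `init_test`, and the `"TEST"`/`"DATA"` config keys are hypothetical stand-ins. Only the condition in `on_epoch_end` is taken from the traceback above.

```python
class SketchRunner:
    """Hypothetical runner illustrating a safe default for test_data_loader."""

    def __init__(self, test_interval=1):
        self.test_interval = test_interval
        # Default to None so the attribute always exists,
        # even when the config has no test section.
        self.test_data_loader = None
        self.tested_epochs = []

    def init_test(self, cfg):
        """Set the test loader only if the config defines a test section."""
        test_cfg = cfg.get("TEST")
        if test_cfg is not None:
            # Stand-in for building a real data loader.
            self.test_data_loader = list(test_cfg["DATA"])

    def on_epoch_end(self, epoch):
        # Same condition as in the traceback; with the default above it
        # evaluates to False instead of raising AttributeError.
        if self.test_data_loader is not None and epoch % self.test_interval == 0:
            self.tested_epochs.append(epoch)
```

With this default in place, a config without any test arguments simply skips testing at the end of each epoch.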

Haotian Gao · Answer 3 · Thu Sep 29 2022 19:20:24 GMT+0800 (China Standard Time)

Thanks for your timely reply. Due to the limitation of computing resources, I will try again tomorrow.

S22 · Answer 4 · Tue Oct 11 2022 00:18:42 GMT+0800 (China Standard Time)

Testing on all these datasets has now been completed.
You can now find configurations for all datasets.
Moreover, our training logs are available in training_logs/TSFormer_METR-LA.log, training_logs/TSFormer_PEMS04.log, and training_logs/TSFormer_PEMS04.log, and our pre-trained TSFormers for each dataset are placed in the tsformer_ckpt folder.