Data preparation failed in Colab
rwbfd opened this issue · comments
I have been trying to replicate the training results based on the original GitHub notebook in the repository. However, the data preparation step does not work. When I run the command python ./DiffSBDD/process_crossdock.py /content/crossdocked_pocket10 --no_H y, the following error occurs:
#failed: 100000: 100% 100000/100000 [00:10<00:00, 9586.86it/s]
Traceback (most recent call last):
File "/content/./DiffSBDD/process_crossdock.py", line 353, in <module>
lig_coords = np.concatenate(lig_coords, axis=0)
File "<__array_function__ internals>", line 5, in concatenate
ValueError: need at least one array to concatenate
Similar but more cryptic errors arise for the other datasets. When the script is run in a Colab cell, the error is
Traceback (most recent call last):
File "/content/./DiffSBDD/process_bindingmoad.py", line 450, in <module>
with open(f'data/moad_{split}.txt', 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'data/moad_test.txt
Since Colab cells sometimes behave unexpectedly, I decided to run the script from the command line instead. Now the error is the same as for the CrossDocked script:
#failed: 130: 100%|█| 130/130 [00:00<00:00, 9719.25it/s
Traceback (most recent call last):
File "/content/DiffSBDD/process_bindingmoad.py", line 571, in <module>
lig_coords = np.concatenate(lig_coords, axis=0)
File "<__array_function__ internals>", line 5, in concatenate
ValueError: need at least one array to concatenate
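For reference, this ValueError can be reproduced in isolation: np.concatenate raises it whenever it is given an empty list. Both progress bars above report a #failed count equal to the total, so presumably no entry was processed successfully and lig_coords stayed empty. The sketch below illustrates that with a hypothetical helper, safe_concatenate, which is not part of DiffSBDD; it only shows how an explicit check would turn the cryptic message into a clearer one.

```python
import numpy as np


def safe_concatenate(chunks, name="lig_coords"):
    """Concatenate a list of arrays, failing with a clearer message if it is empty.

    Hypothetical helper for illustration only; not part of DiffSBDD.
    """
    if len(chunks) == 0:
        raise RuntimeError(
            f"'{name}' is empty: every entry failed, so there is nothing to "
            "concatenate. Check the input directory and file layout."
        )
    return np.concatenate(chunks, axis=0)


if __name__ == "__main__":
    # Two successfully processed ligands with 3 and 2 atoms respectively.
    coords = safe_concatenate([np.zeros((3, 3)), np.ones((2, 3))])
    print(coords.shape)

    # An empty list reproduces the situation from the traceback.
    try:
        np.concatenate([], axis=0)
    except ValueError as err:
        print("numpy says:", err)
```

If this is indeed what happens, the real question is why every entry fails, which the traceback alone does not reveal.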
I have posted the Colab notebook here.
Any help will be greatly appreciated.
Have you solved this problem? I have the same problem.
Hi @rwbfd and @xiaoxiannv999,
sorry for the slow response. With the given information, it is quite hard for me to see what goes wrong.
However, I can say that the error FileNotFoundError: [Errno 2] No such file or directory: 'data/moad_test.txt' is caused by hard-coded paths to the training, validation and test lists. These lists are downloaded together with the code repository and can be found in the data/ subdirectory. Because the paths are hard-coded, process_bindingmoad.py should be run from the main directory (DiffSBDD/). Alternatively, you could change the paths in the script (here).
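A minimal sketch of the first option, launching the script with the repository root as the working directory so that relative paths like data/moad_test.txt resolve. The helper names (missing_split_files, run_from_repo_root) are hypothetical, and the split names "train" and "val" are assumptions; only "test" appears in the traceback. In Colab, a plain %cd /content/DiffSBDD before calling the script should have the same effect.

```python
import subprocess
import sys
from pathlib import Path

# Assumed split names; only 'test' is confirmed by the traceback.
SPLITS = ("train", "val", "test")


def missing_split_files(repo_dir, splits=SPLITS):
    """Return the names of split lists that are absent under <repo_dir>/data/."""
    data_dir = Path(repo_dir) / "data"
    return [
        f"moad_{split}.txt"
        for split in splits
        if not (data_dir / f"moad_{split}.txt").is_file()
    ]


def run_from_repo_root(repo_dir, script, *args):
    """Run `script` with `repo_dir` as the working directory.

    Hypothetical wrapper: it checks for the split lists first, then starts
    the script so that hard-coded relative paths resolve against repo_dir.
    """
    missing = missing_split_files(repo_dir)
    if missing:
        raise FileNotFoundError(f"missing in {Path(repo_dir) / 'data'}: {missing}")
    return subprocess.run([sys.executable, script, *args], cwd=repo_dir, check=True)


# Example call (the script arguments are placeholders, fill in your own):
# run_from_repo_root("/content/DiffSBDD", "process_bindingmoad.py", "<your arguments>")
```

The check runs before the script starts, so a wrong directory is reported immediately rather than after the processing loop.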
Hello @arneschneuing, I raised another question, #18. Can you answer it? I guess it may have the same cause.