hazdzz / GCN

A PyTorch implementation of GCN, following the original paper.

ValueError: Cannot load file containing pickled data when allow_pickle=False

hwkang91 opened this issue · comments

Hi,

I'm trying to run your code in a Linux virtual environment that matches your requirement.txt.
However, when I run gcn.py, the error below shows up:

[screenshot of the traceback]

It seems line 25 of dataloader.py causes the error: --> dir_adj = sp.load_npz(os.path.join(dataset_path, 'dir_adj.npz'))

I searched for the error and found relevant information at the following link: numpy/numpy#14120

sparse.load_npz has always set allow_pickle=False, since files saved by save_npz do not (or should not) contain pickled data. allow_pickle=True will not be turned on, for security reasons.
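
For context: np.load raises this exact ValueError for any file whose header is neither the NumPy magic string nor a ZIP archive, not only for files that genuinely contain pickled data, and scipy's load_npz always passes allow_pickle=False. A minimal sketch that reproduces the message with a stand-in file (fake.npz is just an illustrative name, not a file from this repo):

import numpy as np
import scipy.sparse as sp

# Write a plain-text file with an .npz extension. Its header is neither the
# NumPy magic string nor a ZIP archive, so np.load falls back to the pickle
# branch and, because allow_pickle is False, raises the ValueError from above.
with open('fake.npz', 'w') as f:
    f.write('this is not a real .npz file\n')

try:
    sp.load_npz('fake.npz')  # load_npz always calls np.load with allow_pickle=False
except ValueError as e:
    print(e)  # "Cannot load file containing pickled data when allow_pickle=False"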

Can you help me solve this problem?

Thank you so much!

The picture is hard to see, so I've copied the error text below:

(gcn) root@b75d1d6d0a18:/data/notebook/Gilon_Jaeho20210620/Thesis/GCN# python gcn.py
Training configs: Namespace(K=2, dataset='ogbn-arxiv', droprate=0.1, early_stopping_patience=30, enable_bias=True, enable_cuda=True, epochs=10000, gso='sym_renorm_adj', lr=0.001, mode='test', model='gcn', n_hid=64, opt='adam', seed=100, weight_decay=0)
Traceback (most recent call last):
File "gcn.py", line 198, in
feature, filter, label, idx_train, idx_val, idx_test, n_feat, n_class = process_data(device, dataset, gso)
File "gcn.py", line 111, in process_data
feature, adj, label, idx_train, idx_val, idx_test, n_feat, n_class = dataloader.load_citation_data(dataset)
File "/data/notebook/Gilon_Jaeho20210620/Thesis/GCN/script/dataloader.py", line 25, in load_citation_data
dir_adj = sp.load_npz(os.path.join(dataset_path, 'dir_adj.npz')) ## allow_pickle=True를 추가함.
File "/root/anaconda3/envs/gcn/lib/python3.8/site-packages/scipy/sparse/_matrix_io.py", line 123, in load_npz
with np.load(file, **PICKLE_KWARGS) as loaded:
File "/root/anaconda3/envs/gcn/lib/python3.8/site-packages/numpy/lib/npyio.py", line 445, in load
raise ValueError("Cannot load file containing pickled data "

That's because I've used Git Large File Storage (Git LFS) to store the .csv and .npz files. If the repository is cloned without Git LFS installed, the .npz files on disk are only small pointer text files rather than the actual data, so sp.load_npz fails with exactly this error; installing Git LFS and running git lfs pull in the clone fetches the real files.
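
A quick way to check whether that is what happened: a real .npz file is a ZIP archive whose first bytes are PK, while a Git LFS pointer stub is a small text file whose first line starts with "version https://git-lfs.github.com/spec/v1". A minimal sketch, assuming the dataset lives under data/ogbn-arxiv/ (this path is a guess; point it at wherever dataloader.py builds dataset_path):

import os

# Hypothetical path; adjust to match the dataset_path used in dataloader.py.
path = os.path.join('data', 'ogbn-arxiv', 'dir_adj.npz')

with open(path, 'rb') as f:
    head = f.read(64)

if head.startswith(b'PK'):
    # ZIP magic bytes: this is a genuine .npz archive.
    print('Looks like a real .npz file; sp.load_npz should work.')
elif head.startswith(b'version https://git-lfs'):
    # Git LFS pointer stub: the actual data was never downloaded.
    print('This is a Git LFS pointer stub, not the actual data.')
    print('Install Git LFS and run "git lfs pull" inside the repo to fetch the real file.')
else:
    print('Unrecognized header:', head[:32])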