Hello, please tell me how to use generated samples to train

Question

liuxiuxuhaodong opened this issue 6 years ago · comments

I used the Market-1501 dataset to train DCGAN and generated a number of pictures, but I do not know how to train the baseline with the generated samples, for example how the generated pictures should be named and which class they should get. Please tell me, thank you.

gq · Answer 1 · Mon Jun 25 2018 22:19:08 GMT+0800 (China Standard Time)

Hi, try reading the source code of train_baseline.py and prepare.py. You do not need to understand every detail, just how the model reads the dataset (the path). It's not difficult. Best wishes!
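
As a rough illustration of what "how the model reads the dataset" can look like: the sketch below assumes the training script loads images from one sub-folder per class (the layout torchvision's ImageFolder expects), which should be confirmed in prepare.py. Market-1501 file names start with the person ID (for example 0002_c1s1_000451_03.jpg), so real images are grouped by that prefix. The function name arrange_by_identity and the gen_0000 folder for generated images are hypothetical, not taken from the repository.

```python
import shutil
from pathlib import Path


def arrange_by_identity(src_dir, dst_dir, generated_dir=None, generated_class="gen_0000"):
    """Copy images into one sub-folder per class under dst_dir.

    Real images are grouped by the ID prefix of the file name (the part
    before the first underscore). Generated images carry no identity, so
    they all go into a single, hypothetical class folder.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)

    for image in sorted(src_dir.glob("*.jpg")):
        person_id = image.name.split("_")[0]
        target = dst_dir / person_id
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy(image, target / image.name)

    if generated_dir is not None:
        target = dst_dir / generated_class
        target.mkdir(parents=True, exist_ok=True)
        for image in sorted(Path(generated_dir).glob("*.jpg")):
            shutil.copy(image, target / image.name)

    # Return the class folder names so the caller can check the class count.
    return sorted(p.name for p in dst_dir.iterdir() if p.is_dir())
```

The returned list of folder names gives the number of classes the loader would see, which is worth comparing with the number of outputs the model is built with.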

liuxiuxuhaodong · Answer 2 · Tue Jun 26 2018 15:17:26 GMT+0800 (China Standard Time)

Well, I did it, but when I ran train_baseline.py I got the following error:
Traceback (most recent call last):
File "/home/dl/Person-reid-GAN/train_baseline.py", line 339, in
os.mkdir(dir_name)
FileNotFoundError: [Errno 2] No such file or directory: './model/ft_DesNet121'

gq · Answer 3 · Tue Jun 26 2018 15:41:34 GMT+0800 (China Standard Time)

Just create a new folder named model.
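
The error comes from os.mkdir, which creates only the last path component and fails when the parent folder (./model here) does not exist. As an alternative to creating the folder by hand, a minimal sketch using os.makedirs is shown below; the helper name ensure_dir is hypothetical and is not part of train_baseline.py.

```python
import os


def ensure_dir(dir_name):
    """Create dir_name together with any missing parent folders."""
    # exist_ok=True keeps a second run from failing when the folder is already there.
    os.makedirs(dir_name, exist_ok=True)
    return dir_name
```

Calling ensure_dir('./model/ft_DesNet121') in place of os.mkdir(dir_name) would create both ./model and the sub-folder in one step.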

liuxiuxuhaodong · Answer 4 · Tue Jun 26 2018 16:43:28 GMT+0800 (China Standard Time)

Thank you very much!

liuxiuxuhaodong · Answer 5 · Tue Jun 26 2018 17:17:43 GMT+0800 (China Standard Time)

Excuse me, after creating a new folder named model I ran it again and hit a new error:
RuntimeError: cuda runtime error (10) : invalid device ordinal at torch/csrc/cuda/Module.cpp:84
I looked for solutions on the internet. A blog writer describes a similar problem at https://blog.csdn.net/shincling/article/details/78919282, but I am only just starting to learn PyTorch. Could you help me resolve this problem?

liuxiuxuhaodong · Answer 6 · Tue Jun 26 2018 17:19:13 GMT+0800 (China Standard Time)

My computer has only one GPU.

gq · Answer 7 · Thu Jul 05 2018 16:10:37 GMT+0800 (China Standard Time)

Just change it to single-GPU training mode:
torch.cuda.set_device(gpu_ids[0])
and delete this line: model=nn.DataParallel(model,device_ids=[0,1,2]) # multi-GPU
For more details, you can search the internet.
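
"Invalid device ordinal" generally means the code asked for a GPU index that does not exist on the machine, as with device_ids=[0,1,2] on a single-GPU computer. A small sketch of filtering the requested indices against the number of available devices follows; the helper usable_gpu_ids is hypothetical, and the torch part only runs when PyTorch and a CUDA device are present.

```python
def usable_gpu_ids(requested_ids, device_count):
    """Keep only the requested GPU indices that exist on this machine."""
    return [i for i in requested_ids if 0 <= i < device_count]


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        gpu_ids = usable_gpu_ids([0, 1, 2], torch.cuda.device_count())
        # Single-GPU mode: select the first valid device and skip nn.DataParallel.
        torch.cuda.set_device(gpu_ids[0])
        print("using GPU", gpu_ids[0])
```

On a machine with one GPU the filtered list is [0], so set_device receives a valid index.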

liuxiuxuhaodong · Answer 8 · Thu Jul 05 2018 16:56:09 GMT+0800 (China Standard Time)

Thank you very much, I have solved my problem and it runs successfully.

liuxiuxuhaodong · Answer 9 · Sat Jul 07 2018 14:55:32 GMT+0800 (China Standard Time)

I am sorry to trouble you again. I ran train_baseline.py successfully a few days ago, but I hit a new problem when I ran it again. I read your demo but could not find a solution. The output is as follows:
train Loss: 291.3983 Acc: 0.0109
/pytorch/torch/lib/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [0,0,0] Assertion indexValue >= 0 && indexValue < src.sizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [2,0,0] Assertion indexValue >= 0 && indexValue < src.sizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [6,0,0] Assertion indexValue >= 0 && indexValue < src.sizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [7,0,0] Assertion indexValue >= 0 && indexValue < src.sizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [10,0,0] Assertion indexValue >= 0 && indexValue < src.sizes[dim] failed.
/pytorch/torch/lib/THC/THCTensorScatterGather.cu:97: void THCudaTensor_gatherKernel(TensorInfo<Real, IndexType>, TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [11,0,0] Assertion indexValue >= 0 && indexValue < src.sizes[dim] failed.
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generated/../THCReduceAll.cuh line=339 error=59 : device-side assert triggered
Traceback (most recent call last):
File "train_baseline.py", line 346, in
num_epochs=130)
File "train_baseline.py", line 246, in train_model
loss = criterion(outputs,labels,flags)
File "/home/dl/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 357, in __call__
result = self.forward(*input, **kwargs)
File "train_baseline.py", line 173, in forward
return loss.mean()
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/torch/lib/THC/generated/../THCReduceAll.cuh:339
I did not modify your code, so I am a little confused.

liuxiuxuhaodong · Answer 10 · Sat Jul 07 2018 15:45:27 GMT+0800 (China Standard Time)

I have solved it, thank you.

flychen321 · Answer 11 · Sat Jul 07 2018 17:21:30 GMT+0800 (China Standard Time)

I ran into it too. How did you solve it?

vincy · Answer 12 · Wed May 01 2019 15:43:25 GMT+0800 (China Standard Time)

Me too. Has anyone solved it?

vincy · Answer 13 · Wed May 01 2019 16:56:59 GMT+0800 (China Standard Time)

Can you tell me how to solve it? Thank you.
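
The thread does not record how this was resolved. The assertion indexValue >= 0 && indexValue < src.sizes[dim] in a gather kernel generally means an index is out of range, and inside a classification loss that usually points to a label that is negative or not smaller than the number of model outputs. One possible cause, offered only as a guess, is a training folder whose class count no longer matches the model. The sketch below is a hypothetical check, not code from train_baseline.py, that lists such labels before the loss is computed.

```python
def find_bad_labels(labels, num_classes):
    """Return (position, label) pairs whose label lies outside [0, num_classes)."""
    bad = []
    for position, label in enumerate(labels):
        label = int(label)
        if not 0 <= label < num_classes:
            bad.append((position, label))
    return bad
```

Under these assumptions it could be called just before criterion(outputs,labels,flags) with labels.tolist() and outputs.size(1). Running once on the CPU, or with the environment variable CUDA_LAUNCH_BLOCKING=1, usually gives a traceback that points at the failing operation rather than at a later line.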