dataset
PINK512 opened this issue · comments
Excuse me, I am a beginner at this.
I cannot run the code successfully. Could you please answer my questions?
1. Where should I download the initial dataset used in your paper?
2. In what order should I run the code?
I will continue to search on the Internet for answers. Thank you in advance.
Hello @PINK512
Sorry for my late reply!
Here are the answers to your 2 questions:
- The cucumber dataset that we used belongs to a national project so we can't share it. Sorry, but you'll need to work with your own dataset at the moment. Please note that if the dataset is not cucumber, you'll need to re-train the LFLSeg module.
- Here is the order
- You first need to train the LFLSeg module (only if your dataset is not cucumber leaf). Refer to this: LFLSeg tutorial
- Set up the environment using the command:
pip install -r requirements.txt
- Then arrange your dataset and train LeafGAN (refer to https://github.com/IyatomiLab/LeafGAN#readme)
Feel free to ask me anything!
Hi, @huuquan1994
Thank you for the answer. I still have some questions.
1. After I trained the LFLSeg module, should I run train.py directly?
2. The pictures in the trainA and trainB folders are healthy and disease pictures, right?
3. Is it correct to set up the dataset folders as follows:
H2B:
trainA
trainA_mask
testA
trainB
trainB_mask
testB
4. Is the image generated by prepare_mask.py the same as the output file generated by the LFLSeg module?
I don't know which one is the pretrain path.
Thank you for your consideration.
@PINK512
Let me answer your questions one by one
- After I trained the LFLSeg module, should I run train.py directly?
Yes, if you've trained the LFLSeg module and prepared all the training data, you can start training with train.py
- The pictures in the trainA and trainB folders are healthy and disease pictures, right?
In our paper, trainA and trainB are healthy and disease images, respectively. But note that LeafGAN is based on CycleGAN and you can train your model to translate between arbitrary domains. For example, depending on your purposes, you can add disease images to the trainA folder if you want.
- Is it correct to set up the dataset folders as follows:
H2B:
trainA
trainA_mask
testA
trainB
trainB_mask
testB
Yes, this is correct. Note that if you train your model with mask images (trainA_mask, trainB_mask), you don't need to load the LFLSeg module.
Please refer to the README for more details.
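Since the expected layout is easy to get wrong, a small sanity-check script can verify it before training. The folder names below come from this thread (H2B = healthy-to-disease); the `check_dataset` helper itself is hypothetical, not part of the LeafGAN repo:

```python
import os

# Expected LeafGAN dataset layout from this thread (H2B = healthy-to-disease).
# The folder names follow the CycleGAN convention; check_dataset is just an
# illustrative helper, not code from the LeafGAN repository.
EXPECTED = ["trainA", "trainA_mask", "trainB", "trainB_mask", "testA", "testB"]

def check_dataset(root):
    """Return the list of expected sub-folders missing under `root`."""
    return [d for d in EXPECTED if not os.path.isdir(os.path.join(root, d))]

if __name__ == "__main__":
    missing = check_dataset("H2B")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Dataset layout looks OK")
```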
- Is the image generated by prepare_mask.py the same as the output file generated by the LFLSeg module?
Yes, they are the same. To save time in training, it's recommended to use the LFLSeg module to segment the training leaf data beforehand.
The --pretrain_path option is the path to the trained LFLSeg module. If you're working with cucumber data, you can refer to the LFLSeg page to download the pre-trained model.
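The pre-segmentation step boils down to turning the segmentation module's per-pixel leaf probabilities into binary masks. A minimal sketch of that binarization, where `prob_map` stands in for the kind of probability map a module like LFLSeg might output (the threshold value is an assumption, not the repo's actual setting):

```python
import numpy as np

def probs_to_mask(prob_map, threshold=0.5):
    """Binarize a per-pixel leaf-probability map into a 0/1 mask.

    Illustrative only: the actual prepare_mask.py / LFLSeg code may
    post-process its output differently.
    """
    return (prob_map >= threshold).astype(np.uint8)

probs = np.array([[0.9, 0.2],
                  [0.7, 0.4]])  # toy 2x2 "probability map"
mask = probs_to_mask(probs)
# -> [[1, 0], [1, 0]]
```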
Hi @huuquan1994
I use the tomato dataset.
I have run train.py, but it gives an error as follows:
I can get rec_A, but can't get rec_B.
And this is my dataset folder:
I put the healthy and disease images into trainA and trainB, respectively.
I don't know what's wrong with it.
Thanks in advance.
@PINK512
Correct me if I'm wrong, but it seems to me that the code in leaf_gan_model.py has changed, hasn't it?
For LeafGAN (or CycleGAN-based methods), rec_B & rec_A are created when you call the forward function.
Then the networks are trained by calling the backward_G & backward_D functions.
Please refer to the original CycleGAN paper for more details!
As I see in your error logs, please check the forward function in leaf_gan_model.py.
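To make the forward/backward split concrete, here is a minimal sketch of the CycleGAN-style forward pass that produces both reconstructions. The "generators" are toy invertible functions standing in for the real networks, so this only illustrates the data flow, not the actual leaf_gan_model.py implementation:

```python
# Toy stand-ins for the two generators of a CycleGAN-style model.
def G_A(x):   # maps domain A (e.g. healthy) -> domain B (diseased)
    return x + 1.0

def G_B(y):   # maps domain B -> domain A
    return y - 1.0

def forward(real_A, real_B):
    """Forward pass creating fake_A/fake_B and the reconstructions."""
    fake_B = G_A(real_A)   # A -> B
    rec_A = G_B(fake_B)    # back to A; should match real_A
    fake_A = G_B(real_B)   # B -> A
    rec_B = G_A(fake_A)    # back to B; should match real_B
    return fake_B, rec_A, fake_A, rec_B

fake_B, rec_A, fake_A, rec_B = forward(0.0, 5.0)
# With these toy generators the cycle is exact: rec_A == 0.0, rec_B == 5.0
```

If rec_B is missing in your run, the second half of this flow (real_B -> fake_A -> rec_B) is the part of the forward function to inspect.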
@huuquan1994
I have run the whole code.
I used the unmasked pictures to train first, and then I used the masked ones.
When I run test.py with new unmasked pictures, the result is as follows:
Is there anything wrong here?
Thank you in advance.
@PINK512
Thanks for your question!
By looking at your result, I can't really tell what is wrong here. There are many factors to consider. For example, how big is your dataset? How many epochs did you train the model for?
At first glance, it seems to me that your model has not had enough training (but again, I'm not 100% sure).
Plus, I'd advise you to check and try different hyper-parameters!
Hi @huuquan1994
I have a question about training. I have trained for 200 epochs with 1,000 disease and 1,000 healthy pictures.
When I used unmasked pictures, the code could generate the masks and learn the features from the leaf. I could get a fake disease picture from a healthy one.
However, when I used masked pictures for training, the result was bad; it looks like the model cannot learn the features.
Therefore, I am confused about the difference between the two training ways.
The mask images are supposed to be inputs of the discriminators only. The generators generate full images (without masking). This way, from the GAN loss, the generators will be forced to generate symptoms in the masked area only.
I see that the 3rd and 4th images above are masked which are not the right outputs of LeafGAN.
The masked images are trained together with normal leaf images. To train with the masked images, make sure to include --dataset_mode unaligned_masked in the command line.
Please also refer to our paper for more details!
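The role of the masks can be sketched in a few lines: the generator outputs a full image, and only the discriminator sees the masked version, so the adversarial signal covers the leaf region alone. The names here are illustrative stand-ins, not the repo's actual functions:

```python
import numpy as np

def discriminator_input(generated_full_image, leaf_mask):
    """Mask the generator's full-image output before feeding D.

    Sketch of the idea described above: background pixels are zeroed,
    so the GAN loss cannot push G to change anything outside the leaf.
    """
    return generated_full_image * leaf_mask

gen = np.array([[0.9, 0.2],
                [0.4, 0.8]])   # fake full image from G (toy 2x2)
mask = np.array([[1.0, 0.0],
                 [1.0, 0.0]])  # 1 = leaf pixel, 0 = background
d_in = discriminator_input(gen, mask)
# Left column (leaf) keeps its values; right column (background) is zeroed.
```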
I used the correct code, with --dataset_mode unaligned_masked.
The 3rd image is the input and the 4th is the output of the generator.
The generated leaf does not show the lesion.
Masked images are the input of the discriminator D only, not the input of the generator G.
If the images in your trainA and trainB (not in trainA_masked and trainB_masked) are full leaf images, the input of the generator must be full leaf images (not masked images). Please refer to Fig. 2b in our paper!
You mentioned the 3rd image is the input (masked version), which I assume might look different from the full leaf images in your training data!? (Note that this is just my assumption, since the generated results also depend on many factors.)
@huuquan1994
Hi!
So far, leaves with disease spots can be generated, but the boundary between the disease region and the source region is very sharp. Could you please tell me which parameters I should modify to make the pictures more realistic?
@PINK512
Sorry for my late response!
For this problem, I think there are two main reasons:
- The leaf segmentation is not accurate enough to cover the leaf area.
- The cycle-consistency loss term isn't tuned correctly.
I think you could try increasing the coefficients of the cycle-consistency loss (i.e., --lambda_A and --lambda_B), then check whether it reduces your problem.
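What those two coefficients weight can be sketched as the standard CycleGAN cycle-consistency term, an L1 distance between each input and its reconstruction. The arrays below are toy stand-ins for images, and the default values are only illustrative:

```python
import numpy as np

def l1(a, b):
    """Mean absolute error between two image-like arrays."""
    return np.mean(np.abs(a - b))

def cycle_loss(real_A, rec_A, real_B, rec_B, lambda_A=10.0, lambda_B=10.0):
    # Larger lambdas force reconstructions to stay closer to the inputs,
    # which tends to preserve the original leaf outside the disease region.
    return lambda_A * l1(real_A, rec_A) + lambda_B * l1(real_B, rec_B)

real_A = np.zeros((2, 2)); rec_A = np.full((2, 2), 0.1)
real_B = np.ones((2, 2));  rec_B = np.full((2, 2), 0.8)
loss = cycle_loss(real_A, rec_A, real_B, rec_B)
# lambda_A * 0.1 + lambda_B * 0.2, i.e. roughly 3.0 here
```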
@PINK512
Sorry for the late response!
By default, CycleGAN/LeafGAN uses a batch size of 1. I haven't tried writing code to train with a batch size > 1, but I think it's possible if you modify the Image Pooling (image buffer) mechanism in CycleGAN. (See the details in the CycleGAN paper, Section 4, Implementation - Training details.)
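The image buffer mentioned above feeds the discriminator a mix of the newest fake image and previously generated ones, which stabilizes training. A simplified per-image sketch of that mechanism (following the description in the CycleGAN paper; the real implementation handles tensors and batches):

```python
import random

class ImagePool:
    """Simplified sketch of CycleGAN's image buffer (per-image version)."""

    def __init__(self, pool_size=50):
        self.pool_size = pool_size
        self.images = []

    def query(self, image):
        if self.pool_size == 0:              # buffer disabled
            return image
        if len(self.images) < self.pool_size:
            self.images.append(image)        # fill the buffer first
            return image
        if random.random() < 0.5:            # swap with a stored image
            idx = random.randrange(len(self.images))
            old, self.images[idx] = self.images[idx], image
            return old
        return image                         # pass the new image through
```

To support a batch size > 1, one option would be to call `query` once per image in the batch and stack the results, which is the part of the original code that would need modifying.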