chenxin061 / pdarts

Codes for our paper "Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation"


For other datasets, is it better to use the pre-searched network from CIFAR, or to search on the dataset itself?

rtrobin opened this issue · comments

I want to try PDARTS on other datasets. Is it better to use the network transferred from the pre-searched CIFAR model, or to search on the dataset itself? For other NAS methods, perhaps only the former is feasible. Since PDARTS is time-efficient in searching, maybe I should search on the dataset directly?

Any suggestion is appreciated. Thanks. :)

@rtrobin Hello, I'm also working on applying PDARTS to different tasks!

Let me get straight to it: in general, I believe searching for a model is always better than using a pre-searched one, because the algorithm optimizes the architecture with respect to your data.

From my experience, though, it is quite hard to search directly on another dataset, because you still need to set the hyperparameters carefully, taking the data into consideration.

I also retrained the PDARTS architecture on my own data; it turned out quite good (but it does not beat the human-designed baselines, since I didn't change many of the parameters). It seems that more effort is required to fine-tune the pre-searched model.

GL,

@Catosine Thanks for the reply.

That's exactly my concern too. Compared to traditional deep learning architectures, it is even harder to tune the hyperparameters here. Is there any guidance on what each parameter actually means in practice?

@rtrobin Would you mind raising a specific case for your question?

@Catosine I don't have anything useful to share right now. I'm doing some rough investigation and trials on a simple, small dataset. I'll share something once I expand my trials.

@rtrobin I've done some rough searches and gained a little experience with the parameters. You can show me what you have now, and I may be able to tell you something useful if I've run into the same situation.
Again, thanks for keeping me updated :)

As @Catosine said, you should tune some hyperparameters if you want to search on a new dataset. However, if you expect better performance on the new dataset, my suggestion is to search on it rather than transfer existing architectures, although I believe the released PDARTS architecture can work well.
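For anyone landing here, the "progressive" part of the search being discussed can be sketched in plain Python: the search runs in stages, and at each stage the supernet gets deeper while the weakest candidate operations are pruned (this is what per-stage flags like the repo's `--add_layers` and `--dropout_rate` control). The stage depths and kept-op counts below follow my reading of the paper's CIFAR setup, and the scores are made-up numbers for illustration, not real search results:

```python
# Toy sketch of P-DARTS' progressive search schedule (illustration only,
# not code from the repo). At each stage the network is made deeper and
# low-scoring candidate operations are dropped.

OPS = ["none", "max_pool_3x3", "avg_pool_3x3", "skip_connect",
       "sep_conv_3x3", "sep_conv_5x5", "dil_conv_3x3", "dil_conv_5x5"]

# (depth in cells, number of candidate ops kept) per stage
STAGES = [(5, 8), (11, 5), (17, 3)]

def progressive_prune(op_scores, stages=STAGES):
    """Keep the top-k candidate ops at each stage, k shrinking as depth grows."""
    candidates = list(op_scores)
    schedule = []
    for depth, keep in stages:
        # rank the surviving candidates by their architecture weight
        candidates = sorted(candidates, key=op_scores.get, reverse=True)[:keep]
        schedule.append((depth, list(candidates)))
    return schedule

# Made-up architecture weights for demonstration:
scores = dict(zip(OPS, [0.01, 0.05, 0.04, 0.20, 0.30, 0.15, 0.14, 0.11]))
for depth, kept in progressive_prune(scores):
    print(depth, kept)
```

The point of the sketch: the hyperparameters that need tuning on a new dataset are largely the per-stage ones (how fast depth grows, how aggressively ops are pruned, how much dropout is placed on skip-connections at each stage).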

@rtrobin Hi there! I've got some data: compared to the pre-searched model, the model searched on my own data does achieve better accuracy.

@Catosine How much improvement did you get on your own data, compared to a manually designed architecture?

@JarveeLee It went from 95% top-1 validation accuracy to 99%.

Wow! Great!

@Catosine Thanks for the update. Good job.

I haven't done NAS work since June. Maybe you could share your hyperparameter strategy here for others' further research, in case they find this post.

@Catosine @rtrobin Hi, I've been enjoying your conversation. But I ran into some trouble when applying PDARTS to other datasets.

One of the datasets is VOC2012, which is generally used for semantic segmentation. For this dataset, I just changed the input, the criterion, and other related parameters, like the number of channels. Unfortunately, the result is bad: the loss does not converge, and instead oscillates between 4 and 6.
The other is ECSSD, which is used for saliency detection. I did the same as with the previous dataset: I trained the searched architecture on ECSSD, and the results are still bad.
So I wonder whether I forgot some necessary operation, or whether PDARTS simply performs well only on classification tasks like CIFAR and ImageNet.

Any suggestion is appreciated. Thanks. :)
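One sanity check worth adding for the VOC2012 setup described above: for segmentation, the network must emit per-pixel logits of shape `(N, C, H, W)`, and the criterion must average the negative log-likelihood over every pixel against integer targets of shape `(N, H, W)`. A minimal NumPy sketch of that loss (nothing here is from the PDARTS repo):

```python
import numpy as np

def pixelwise_cross_entropy(logits, targets):
    """Mean per-pixel cross-entropy.

    logits:  float array, shape (N, C, H, W) -- class scores per pixel
    targets: int array,   shape (N, H, W)    -- class index per pixel
    """
    # numerically stable log-softmax over the class axis
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    n, c, h, w = logits.shape
    # pick the log-probability of the true class at every pixel
    ni, hi, wi = np.ix_(np.arange(n), np.arange(h), np.arange(w))
    picked = log_probs[ni, targets, hi, wi]
    return -picked.mean()

# Uniform logits over 21 classes (VOC: 20 classes + background):
logits = np.zeros((2, 21, 4, 4))
targets = np.random.randint(0, 21, size=(2, 4, 4))
loss = pixelwise_cross_entropy(logits, targets)
```

A useful reference point: a model predicting uniformly over C classes gives a loss of log(C), which is about 3.04 for 21 VOC classes. A loss that sits well above that (4 to 6, as reported above) suggests the logit/target shapes or the criterion, rather than the architecture, may be at fault.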

@LiuTingWed Hello!
First things first: PDARTS does work well on other datasets, such as face identification. I've run experiments with it and it is much better than ResNet. But I cannot guarantee it works on every dataset. My personal take is that, without major changes to the supernet or the operations, it can only handle classification tasks. It is definitely a good idea to try different candidate operations and block structures, because at least one thing is certain: the operations and blocks were carefully designed for CIFAR and ImageNet.
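Concretely, the candidate set referred to above is the `PRIMITIVES` list in the DARTS family of repos (in the actual code, each name must also map to an `nn.Module` in the `OPS` dict in `operations.py`). Swapping candidates is mostly a matter of editing that list; the sketch below uses the standard DARTS/PDARTS names, and `dil_conv_5x5_r4` is a hypothetical new op for illustration:

```python
# The standard candidate operations searched over in DARTS/P-DARTS
# (mirrors the PRIMITIVES list in those repos):
PRIMITIVES = [
    "none",
    "max_pool_3x3",
    "avg_pool_3x3",
    "skip_connect",
    "sep_conv_3x3",
    "sep_conv_5x5",
    "dil_conv_3x3",
    "dil_conv_5x5",
]

def swap_op(primitives, old, new):
    """Return a copy of the candidate list with one op replaced."""
    if old not in primitives:
        raise ValueError(f"{old!r} is not a candidate op")
    return [new if op == old else op for op in primitives]

# Hypothetical example: trade the 5x5 separable conv for a wider
# dilated conv when the task needs a larger receptive field.
custom = swap_op(PRIMITIVES, "sep_conv_5x5", "dil_conv_5x5_r4")
```

Keeping the list the same length means the rest of the search code (architecture-weight tensors, pruning schedule) needs no other changes.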

In addition, there are follow-up works on DARTS. The latest should be FairDARTS, which recognized the convergence issue and gave a pretty good solution. You might be interested in it.

GL,
PF