explore various backbone architectures
jeremyjordan opened this issue · comments
Define your research question and variables
torchvision provides a number of backbone architectures for us to use. We should explore these to see which performs best on our dataset.
We can also use: https://rwightman.github.io/pytorch-image-models/
State your hypothesis
ResNeXt-101-32x8d has the top ImageNet score among the torchvision models, so it's reasonable to guess that it could give us the best performance on our dataset as well.
Describe your experimental methods
- Implement backbone networks for a few torchvision models (we can skip ones like AlexNet and VGGNet)
- Train the models using reasonable hyperparameters and log to Weights and Biases
- Put together a Report in WandB presenting the results
- efficientnet_b3a, bs 64, SGD, lr 0.01, OneCycle schedule, dropout 0.35, epochs 50
- tf_efficientnet_b6, bs 32, SGD, lr 0.0125, OneCycle schedule, dropout 0.2, epochs 30
- resnet101, bs 64, SGD, lr 0.025, OneCycle schedule, dropout 0.2, epochs 30
- resnest101e, bs 64, SGD, lr 0.025, OneCycle schedule, dropout 0.2, epochs 35
- resnest200e, bs 32, SGD, lr 0.0125, OneCycle schedule, dropout 0.2, epochs 35
- resnest200e, bs 32, SGD, lr 0.008, OneCycle schedule, dropout 0.3, epochs 35
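For reference, here is a minimal sketch of how one of the runs above (resnet101, bs 64, SGD, lr 0.025, OneCycle, 30 epochs) might map onto PyTorch's optimizer and scheduler. The dataset size, the momentum value, and the stand-in model are assumptions, and the listed lr is treated as the OneCycle peak (`max_lr`), which may not match how the actual runs were configured.

```python
import torch
from torch import nn

# Hyperparameters taken from the resnet101 run listed above.
BATCH_SIZE = 64
MAX_LR = 0.025
EPOCHS = 30

# Assumptions, not from the original runs.
NUM_TRAIN_IMAGES = 6400  # hypothetical training set size
MOMENTUM = 0.9           # hypothetical; the runs only say "SGD"

steps_per_epoch = NUM_TRAIN_IMAGES // BATCH_SIZE
model = nn.Linear(8, 2)  # tiny placeholder standing in for the real backbone

optimizer = torch.optim.SGD(model.parameters(), lr=MAX_LR, momentum=MOMENTUM)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=MAX_LR,
    epochs=EPOCHS,
    steps_per_epoch=steps_per_epoch,
)

# Walk the schedule without real data, recording the learning rate
# that each optimizer step would use.
lrs = []
for _ in range(EPOCHS * steps_per_epoch):
    lrs.append(scheduler.get_last_lr()[0])
    optimizer.step()   # no gradients here, so parameters are left unchanged
    scheduler.step()   # OneCycleLR is stepped once per batch, not per epoch

print(f"start lr={lrs[0]:.5f}, peak lr={max(lrs):.5f}, final lr={lrs[-1]:.8f}")
```

With the default OneCycleLR settings the learning rate warms up from `max_lr / 25` to the peak over the first 30% of steps, then anneals toward zero.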
I put together a very basic WandB report to summarize these experiment runs.
https://wandb.ai/jeremytjordan/flowers/reports/Explore-Baseline-Models--VmlldzoyOTA5MDY