antorsae / google-retrieval-challenge-2019-fastai-starter

fast.ai starter kit for Google Landmark Retrieval 2019 challenge

Google Landmark Retrieval 2019 Competition fast.ai Starter Pack

The code here is all you need to make a first submission to the Google Landmark Retrieval 2019 Competition. It is based on fastai library release 1.0.47 and borrows helper code from the great cnnimageretrieval-pytorch library. The latter gives much better results than the code in this repo, but it is not set up to produce a submission out of the box and takes 3 days to converge, compared to 45 min here.

Making a first submission

  1. Install the fastai library, specifically version 1.0.47.

  2. Install the faiss library: conda install faiss-gpu cudatoolkit=9.0 -c pytorch -y

  3. Clone this repository.

  4. Start downloading the competition data. This takes a long time, but in the meantime you can already run the code, because the code here needs the competition data only for submission, not for training.
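
Since the steps above pin a specific fastai release, it may help to confirm what is actually installed before opening the notebooks. The following is a minimal sketch using only the Python standard library; the helper name check_version and the EXPECTED table are illustrative and not part of this repository. A faiss installed through conda may not be visible to this kind of check, so simply trying `import faiss` is likely the easier test there.

```python
from importlib import metadata

# Version pinned in the install steps above (illustrative table, not from the repo).
EXPECTED = {"fastai": "1.0.47"}


def check_version(package, expected=None):
    """Return 'ok', 'missing', or 'mismatch (found X)' for an installed package."""
    try:
        found = metadata.version(package)
    except metadata.PackageNotFoundError:
        return "missing"
    if expected is not None and found != expected:
        return f"mismatch (found {found})"
    return "ok"


if __name__ == "__main__":
    for name, version in EXPECTED.items():
        print(f"{name}: {check_version(name, version)}")
```

A "mismatch" result suggests reinstalling the pinned release before running the notebooks.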

Notebooks

  1. download-and-create-microtrain - downloads all the auxiliary data for training and validation.
  2. validation-no-training - experiments with pretrained networks and sets up the validation procedure.
  3. training-validate-bad - trains DenseNet121 on the created micro-train set in 45 min and experiments with post-processing. It works as described, but only by luck: many different "subclusters" (the labels used here) depict the same landmark. So do not use this approach to train on all 19k subclusters.
  4. training-validate-good-full - instead, uses "clusters" as labels, which gives much better results.
  5. submission-trained - creates a first submission. Warning: this can take a long time (~4-12 hours) because of the dataset size.
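
The difference between notebooks 3 and 4 comes down to which label each training image receives. The sketch below illustrates the idea on toy data: several subclusters that depict the same landmark are collapsed into one cluster label, so a classifier is no longer penalized for confusing two views of the same place. All names and values here (image_to_subcluster, subcluster_to_cluster, relabel) are hypothetical and do not reflect how the notebooks store their labels.

```python
from collections import Counter

# Hypothetical toy data: image id -> subcluster label.
image_to_subcluster = {
    "img_a": 0,
    "img_b": 0,
    "img_c": 1,
    "img_d": 2,
    "img_e": 2,
}

# Hypothetical mapping: subclusters 0 and 1 show the same landmark (cluster 10).
subcluster_to_cluster = {0: 10, 1: 10, 2: 11}


def relabel(image_labels, label_map):
    """Replace each image's subcluster label with its parent cluster label."""
    return {image: label_map[sub] for image, sub in image_labels.items()}


image_to_cluster = relabel(image_to_subcluster, subcluster_to_cluster)

# Fewer, larger classes: three subcluster labels become two cluster labels.
print("subcluster sizes:", Counter(image_to_subcluster.values()))
print("cluster sizes:   ", Counter(image_to_cluster.values()))
```

With subcluster labels, img_a and img_c would count as different classes even though they show the same landmark; after relabelling they share one class.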

Languages

Jupyter Notebook 99.6%, Python 0.4%