HaoMood / pwa

Python implementation of PWA for image retrieval.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

      PWA: Part-based weighting Aggregation of Deep Convolutional Features
                              for Image Retrieval


DESCRIPTIONS
    PWA is an unsupervised method for image retrieval. The idea of PWA is that,
    Each feature map is activated by different parts or patterns of objects. We
    can use feature maps as part detectors to highlight the discriminative parts
    of objects and suppress the noise of background.

    The PWA weights are computed as follows. We first compute variances for
    each dimension of the pooled descriptor among the whole dataset. Then we
    select 25 channels (~17%) with the largest variances. We regard feature
    maps corresponding to those channels as part detectors since the responses
    of them are significantly different among the various objects. The PWA
    weights are the l2-normalized and sqrt-normalized selected spatial maps.

    We compute one weighted-pooled descriptor for each one of the PWA weights
    in turn by weighted sum-pool the deep descriptors. Therefore, 150
    D-dimensional weighted-pooled descriptors are selected. We concatenate
    those into a long vector and perform l2-normalized, PCA whitening to
    4096-dimensional, which is the final descriptor for retrieval.

    We learn the PCA whitening parameters from a separate set of images. To be
    comparable with related works, we learn the PCA whitening parameters on
    Oxford when testing on Paris and vice versa, as accustomed.

    We used cropped images for query, i.e., we regard the cropped region as
    input for the CNN and extract featrures.


REFERENCE
    J. Xu, C. Shi, C. Qi, C. Wang, and B. Xiao. Part-based weighting
    aggregation of deep convolutional features for image retrieval. arXiv
    preprint arXiv: 1705.01247.


PREREQUIREMENTS
    Caffe with Python interface supported.
    Python2.7 with Numpy, PIL, sklearn supported.


LAYOUT
    Data
    ./data/oxford5k/
        ./data/oxford5k/channels.npy     # Selected channels
        ./data/oxford5k/conv/            # Convolution feature maps
            ./data/oxford5k/conv/vgg16/all/
            ./data/oxford5k/conv/vgg16/crop/
        ./data/oxford5k/groundtruth/     # 55*4=220 query groundtruth
        ./data/oxford5k/image/           # Raw .jpg images
            ./data/oxford5k/image/all/   # 5063 database images
            ./data/oxford5k/image/crop/  # 55 crop query images
    ./data/paris6k/
        # Similar to oxford5k, with 6392 database and 55 query images

    Documentations
    ./doc/                         # Automatically generated documents
    ./README                       # This file

    Library
    ./lib/                         # Third-party library files

    Source Code
    ./src/get/                     # Extract crop query and pool5 features
    ./src/pwa/                     # Compute PWA descriptor and evaluate


USAGE
    Step 1. Extract crop query images and extract pool5 features.
        # Extract cropped query images.
        $ ./src/get/get_query.py --dataset oxford5k
        $ ./src/get/get_query.py --dataset paris6k

        # Extract pool5 features.
        $ ./src/get/get_conv.py --dataset oxford5k --model vgg16 --gpu 0
        $ ./src/get/get_conv.py --dataset paris6k --model vgg16 --gpu 0


    Step 2. Compute PWA descriptors and evaluate.
        # Select channels.
        $ ./src/pwa/select_channels.py --dataset oxford5k --model vgg16
            --channels 25
        $ ./src/pwa/select_channels.py --dataset paris6k --model vgg16
            --channels 25

        # Evaluate.
        $ ./src/pwa/evaluate.py --test oxford5k --via paris6k --model vgg16
            --dim 4096
        $ ./src/pwa/evaluate.py --test paris6k --via oxford5k --model vgg16
            --dim 4096


AUTHOR
    Hao Zhang: zhangh0214@gmail.com


LICENSE
    CC BY-SA 3.0


NOTE
    20 images out of 6412 Paris images are broken. As a common practice,
    we manually removed them:
        paris_louvre_000136.jpg
        paris_louvre_000146.jpg
        paris_moulinrouge_000422.jpg
        paris_museedorsay_001059.jpg
        paris_notredame_000188.jpg
        paris_pantheon_000284.jpg
        paris_pantheon_000960.jpg
        paris_pantheon_000974.jpg
        paris_pompidou_000195.jpg
        paris_pompidou_000196.jpg
        paris_pompidou_000201.jpg
        paris_pompidou_000467.jpg
        paris_pompidou_000640.jpg
        paris_sacrecoeur_000299.jpg
        paris_sacrecoeur_000330.jpg
        paris_sacrecoeur_000353.jpg
        paris_triomphe_000662.jpg
        paris_triomphe_000833.jpg
        paris_triomphe_000863.jpg
        paris_triomphe_000867.jpg

About

Python implementation of PWA for image retrieval.


Languages

Language:Python 93.2%Language:MATLAB 6.8%