JDAI-CV / LIO

Look-into-Object: Self-supervised Structure Modeling for Object Recognition (CVPR 2020)

About OEL module

JohnDreamer opened this issue

Could you please provide the code for the OEL module and the data pre-processing that obtains the mask by calculating the correlation with other image patches?

The OEL code is integrated with the classification model; you can find it in classification/.

OEL has no data pre-processing step that produces the mask: the module is self-supervised and trained jointly with the backbone. The ground-truth mask is calculated from positive images (as stated in Section 3.1).

The correlation is calculated in classification/utils/rela/rela_cal.py, and the OEL loss is calculated in lines 92-97 of classification/utils/train_util_index.py.
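To make the idea of a correlation-based mask concrete, here is a minimal sketch of one way such a mask could be computed from feature maps. It is not the code in rela_cal.py. It assumes the correlation is a dot product between feature vectors scaled by the channel count, that each position keeps its best match in every positive image, and that the result is averaged over the positive images. The function name `correlation_mask` and its argument layout are hypothetical.

```python
import torch


def correlation_mask(feat, pos_feats):
    """Hypothetical helper, not taken from the LIO repository.

    feat:      (C, H, W) feature map of the image being trained on
    pos_feats: (P, C, H, W) feature maps of P positive images
    returns:   (H, W) mask with one value per spatial position of `feat`
    """
    C, H, W = feat.shape
    P = pos_feats.shape[0]
    anchor = feat.reshape(C, H * W).t()            # (HW, C)
    pos = pos_feats.reshape(P, C, H * W)           # (P, C, HW)
    # Dot product between every anchor position and every position
    # of every positive image (assumed similarity measure).
    sim = torch.matmul(anchor.unsqueeze(0), pos)   # (P, HW, HW)
    # Keep the best match per positive image, scaled by channel count.
    best = sim.max(dim=2).values / C               # (P, HW)
    # Average over the positive images.
    return best.mean(dim=0).reshape(H, W)
```

Positions whose features recur across the positive images receive high values, which is what lets the mask be learned without any annotation.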

For the CUB dataset, you can download the ground-truth segmentation masks from the official site. Here is a small script that builds the ground-truth masks for LIO.

import os
import os.path as osp
import pickle

import cv2
import numpy as np
import torch
import torch.nn.functional as F
import tqdm

if __name__ == '__main__':
    base_path = 'segmentations'
    in_size = 448
    target_size = 14
    chunk_size = in_size // target_size
    sub_classes = sorted(os.listdir(base_path))
    result = {}
    for sub_class in tqdm.tqdm(sub_classes):
        for img_name in sorted(os.listdir(osp.join(base_path, sub_class))):
            img_path = osp.join(base_path, sub_class, img_name)
            # np.float was removed in NumPy 1.24; use the explicit np.float64
            img = cv2.imread(img_path).astype(np.float64).sum(axis=-1)
            H, W = img.shape
            img = torch.from_numpy(img).view(1, 1, H, W)
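            # Resize to the network input resolution, then binarize (foreground > 0)
            img = F.interpolate(img, (in_size, in_size), mode='bilinear', align_corners=False).view(in_size, in_size)
            img = (img > 0).float()
            # Split into target_size x target_size blocks of chunk_size x chunk_size pixels
            img = img.view(target_size, chunk_size, target_size, chunk_size)
            img = img.permute(1, 3, 0, 2).contiguous()
            img = img.view(chunk_size*chunk_size, target_size, target_size)
            # Foreground fraction of each block, in [0, 1]
            img = torch.sum(img, dim=0) / (chunk_size * chunk_size)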
            # img_mean = img.mean()
            # img = (img > img_mean).float()
            result[img_name.replace('.png', '.jpg')] = img
    with open(f'std_mask_{target_size}x.pkl', 'wb') as f:
        pickle.dump(result, f, protocol=4)

Note: only the LIO w/ GM configuration in Table 4 uses the CUB ground-truth masks, and only for the ablation study.
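The view/permute/sum sequence in the script above is a block average: each of the 14x14 output cells holds the foreground fraction of one 32x32 block. The sketch below checks this against `F.avg_pool2d` on a random binary mask; the random input is a stand-in for a real segmentation image and is only an assumption for illustration.

```python
import torch
import torch.nn.functional as F

in_size, target_size = 448, 14
chunk_size = in_size // target_size  # 32 pixels per block side

# Random binary mask standing in for a thresholded segmentation image
binary = (torch.rand(in_size, in_size) > 0.5).float()

# Block averaging as written in the script
blocks = binary.view(target_size, chunk_size, target_size, chunk_size)
blocks = blocks.permute(1, 3, 0, 2).contiguous()
blocks = blocks.view(chunk_size * chunk_size, target_size, target_size)
manual = blocks.sum(dim=0) / (chunk_size * chunk_size)

# The same reduction expressed as non-overlapping average pooling
pooled = F.avg_pool2d(binary.view(1, 1, in_size, in_size), kernel_size=chunk_size)
pooled = pooled.view(target_size, target_size)

assert torch.allclose(manual, pooled, atol=1e-6)
```

Either form yields soft masks in [0, 1]; the commented-out thresholding lines in the script would turn them into hard masks if that is preferred.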