dk-liang / FIDTM

[IEEE TMM] Focal Inverse Distance Transform Maps for Crowd Localization

Question on JHU dataset

GloamXun opened this issue

Is there a bug in fidt_generate_jhu.py? I'm trying to train this model on the JHU dataset, and I noticed that for some samples the size of img differs from the size of fidt_map. I added a print statement to image.py:

import scipy.spatial
from PIL import Image
import scipy.io as io
import scipy
import numpy as np
import h5py
import cv2


def load_data_fidt(img_path, args, train=True):
    gt_path = img_path.replace('.jpg', '.h5').replace('images', 'gt_fidt_map_2048')
    img = Image.open(img_path).convert('RGB')

    while True:
        try:
            gt_file = h5py.File(gt_path)
            k = np.asarray(gt_file['kpoint'])
            fidt_map = np.asarray(gt_file['fidt_map'])
            break
        except OSError:
            print("path is wrong, can not load ", img_path)
            cv2.waitKey(1000)  # Wait a bit

    img = img.copy()
    fidt_map = fidt_map.copy()
    k = k.copy()
    print(img.size, fidt_map.shape)  # added: PIL size is (width, height), array shape is (height, width)
    return img, fidt_map, k

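The output shows mismatches for some samples. Note that img.size is (width, height) while fidt_map.shape is (height, width), so a matching pair appears with the two numbers swapped: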

(968, 681) (681, 968)
(1023, 575) (575, 1023)
(2048, 1365) (1365, 2048)
(1280, 720) (720, 1280)
(2048, 1356) (1356, 2048)
(852, 480) (512, 909) #difference
(2250, 1500) (1365, 2048) #difference
(2692, 3297) (2048, 1672) #difference
(1023, 575) (575, 1023)
(2000, 1115) (1115, 2000)
(3840, 2160) (1152, 2048) #difference
(1000, 600) (600, 1000)
(1637, 1070) (1070, 1637)
(653, 282) (512, 1186) #difference
(1280, 853) (853, 1280)
(1200, 600) (600, 1200)

I also tried training the model on the ShanghaiA dataset, and it works fine.
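To make the comparison less error-prone than reading the printed pairs by eye, here is a minimal sketch of a check that accounts for the (width, height) versus (height, width) ordering and reports the scale implied by each mismatch. The helper names sizes_match and implied_scale are hypothetical and not part of the FIDTM repository; the pairs are copied from the output above. Judging only from these numbers, the mismatched maps look like they were generated from a resized copy of the image (longer side capped at 2048, shorter side raised to at least 512), but that is a guess from the printed sizes, not something confirmed in fidt_generate_jhu.py.

```python
def sizes_match(img_size, map_shape):
    """Return True if a PIL size (width, height) agrees with an array shape (height, width)."""
    width, height = img_size
    return (height, width) == tuple(map_shape[:2])


def implied_scale(img_size, map_shape):
    """Return the (x, y) scale factors that would map the image size onto the map shape."""
    width, height = img_size
    map_height, map_width = map_shape[:2]
    return map_width / width, map_height / height


# (img.size, fidt_map.shape) pairs copied from the printed output above.
pairs = [
    ((968, 681), (681, 968)),
    ((852, 480), (512, 909)),
    ((2250, 1500), (1365, 2048)),
    ((2692, 3297), (2048, 1672)),
    ((3840, 2160), (1152, 2048)),
    ((653, 282), (512, 1186)),
]

for img_size, map_shape in pairs:
    if not sizes_match(img_size, map_shape):
        scale_x, scale_y = implied_scale(img_size, map_shape)
        print(img_size, map_shape, "scale x=%.4f y=%.4f" % (scale_x, scale_y))
```

In every mismatched pair the x and y scales come out nearly equal, which is what a uniform resize would produce. If that guess is right, one possible workaround (again an assumption, not the authors' fix) would be to resize the image to the map's size inside load_data_fidt with PIL's Image.resize, which takes the target size as (width, height).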