QuickDraw dataset not present
sawant-nidhish opened this issue · comments
Dear Sir.
Firstly, thank you for your amazing research. I am facing a problem. After going through your code, I found that there are only 1035 files (sketches) in picture_files, which is downloaded by running dataloader/download.sh. The paper mentions that the models are trained on the QuickDraw 3.8M dataset. Since most of the dataset is missing, the train/test split in dataloader_image.py does not work properly:
`class SketchDataset(data.Dataset):
    def __init__(self, root, split, resize):
        self.split = split.lower()
        assert(self.split=='train' or self.split=='test')
        if resize:
            transforms_list = [
                transforms.Resize(299),
                lambda x: np.asarray(x),
            ]
        else:
            transforms_list = [
                lambda x: np.asarray(x),
            ]
        _NUM_VALIDATION = 345000
        _RANDOM_SEED = 0
        photo_filenames, _ = _get_filenames_and_classes(root)
        random.seed(_RANDOM_SEED)
        random.shuffle(photo_filenames)
        if self.split == "train":
            self.image_list = photo_filenames[_NUM_VALIDATION:]
        elif self.split == "test":
            self.image_list = photo_filenames[:_NUM_VALIDATION]
        self.transform = transforms.Compose(transforms_list)
    def __getitem__(self, index):
        image = Image.open(self.image_list[index])
        image = self.transform(image)
        return image
    def __len__(self):
        return len(self.image_list)`
Here the _NUM_VALIDATION variable is set to 345000, but the dataset consists of only 1035 images (sketches). Hence, the train set is allotted 0 images, whereas the test set gets all 1035. I printed the number of images by adding print("DATASET SIZE:",i) in dataloader.py:
def _get_filenames_and_classes(dataset_dir):
    quickdraw_root = dataset_dir
    directories = []
    class_names = []
    for filename in os.listdir(quickdraw_root):
        path = os.path.join(quickdraw_root, filename)
        if os.path.isdir(path):
            directories.append(path)
            class_names.append(filename)
    i=0
    photo_filenames = []
    for directory in directories:
        for filename in os.listdir(directory):
            i=i+1
            path = os.path.join(directory, filename)
            photo_filenames.append(path)
    print("DATASET SIZE:",i)
    return photo_filenames, sorted(class_names)
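
The split sizes described above can be reproduced without the images. This is only a minimal sketch: the file names are placeholders standing in for the 1035 downloaded sketches, and the slicing mirrors the quoted SketchDataset code.

```python
import random

# Placeholder names standing in for the 1035 downloaded sketch files.
filenames = [f"sketch_{i}.png" for i in range(1035)]
num_validation = 345000  # same value as _NUM_VALIDATION in the quoted code

random.seed(0)
random.shuffle(filenames)

# The slice start lies past the end of the list, so the result is empty.
train_files = filenames[num_validation:]
# The slice end is clipped to the list length, so every file lands here.
test_files = filenames[:num_validation]

print("train:", len(train_files), "test:", len(test_files))
```

Python slicing never raises for out-of-range bounds, which is why the empty train split goes unnoticed until the data loader is built.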
Due to this discrepancy, the following error comes up:
raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0
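
As a possible stopgap until the full dataset is available, the split could fall back to a fraction of the files whenever the requested validation size is not smaller than the dataset. The sketch below is an assumption on my side, not code from the repository: split_filenames and fallback_fraction are hypothetical names, and the 10% default is arbitrary.

```python
import random


def split_filenames(filenames, num_validation=345000, fallback_fraction=0.1, seed=0):
    """Hypothetical helper: shuffle the files and return (train, test) lists."""
    files = list(filenames)
    if len(files) < 2:
        raise ValueError("need at least two files to build a train/test split")
    random.Random(seed).shuffle(files)
    if num_validation >= len(files):
        # Assumption: hold out a fraction instead of a fixed count when the
        # dataset is smaller than the requested validation size.
        num_validation = max(1, int(len(files) * fallback_fraction))
    return files[num_validation:], files[:num_validation]


# Usage with placeholder names for the 1035 downloaded sketches.
train_files, test_files = split_filenames(f"sketch_{i}.png" for i in range(1035))
print("train:", len(train_files), "test:", len(test_files))
```

This only keeps the train split from being empty so that the data loader can be constructed. It does not replace the missing QuickDraw data that the paper's results rely on.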
Please look into this matter. I hope to hear from you soon.
Once again, thank you for your work. This research will help me solve a lot of problems.