Lab41 / attalos

Joint Vector Spaces

Visual Genome Train Test Splits

ymt123 opened this issue

Create train/test splits for visual_genome.

Create a histogram of the classes in the train and test splits to ensure we aren't introducing bias.

  1. The hash function supposedly takes care of bias. Bias is traditionally handled through randomization, but you believe the hash function will achieve the same effect naturally.
  2. Per our discussion today, it is best to define these splits through lists, since that gives us additional flexibility. Our file structure already includes file lists anyway. (This may also be where Domino could help us: we add the file lists to the project for replicability.)
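
As a rough illustration of the two points above, the following is a minimal sketch of a hash-based split that is then stored as explicit file lists, plus a per-split class histogram for the bias check. It is not the project's actual implementation. The function names, the choice of MD5, the 20% test fraction, and the `labels_by_image` mapping (image ID to a list of class labels) are all assumptions made for the example.

```python
import hashlib
from collections import Counter


def assign_split(image_id, test_fraction=0.2):
    """Deterministically map an image ID to 'train' or 'test'.

    Assumption: MD5 of the ID's string form is used only as a stable,
    roughly uniform bucketing function, not for security.
    """
    digest = hashlib.md5(str(image_id).encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 1000  # bucket in [0, 999]
    return "test" if bucket < test_fraction * 1000 else "train"


def build_split_lists(image_ids, test_fraction=0.2):
    """Group image IDs into explicit per-split lists."""
    splits = {"train": [], "test": []}
    for image_id in image_ids:
        splits[assign_split(image_id, test_fraction)].append(image_id)
    return splits


def class_histograms(splits, labels_by_image):
    """Count class occurrences per split so the distributions can be compared."""
    return {
        name: Counter(
            label
            for image_id in ids
            for label in labels_by_image.get(image_id, ())
        )
        for name, ids in splits.items()
    }


def write_file_list(path, image_ids):
    """Persist a split as a plain-text list, one image ID per line."""
    with open(path, "w") as handle:
        for image_id in image_ids:
            handle.write(f"{image_id}\n")
```

Because the assignment depends only on the image ID, rerunning the script reproduces the same split, and the written lists can be checked into the project or edited by hand later. Comparing the relative class frequencies in the `train` and `test` counters is one way to check whether the hash actually avoided introducing bias.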