lmb-freiburg / Multimodal-Future-Prediction

The official repository for the CVPR 2019 paper "Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction"

Questions about dataset creation

AlexanderRadovic opened this issue · comments

Hey, thanks for sharing your code! Is the code specifically for generating the SDD dataset public? I'm interested in playing around with different target times. Thanks!

Thanks for your interest in our code.

We can share most of the processing code, which reproduces the structure of our processed dataset, including both the image and float files.

I have added two files, extract_frames.py and parse_annotation.py, which you need to run in that order. This creates the structure of the dataset, including the image and float files.

Then you need to write your own script which creates the scene.txt file for each scene. The structure of the scene.txt file is already explained in the README:
scene.txt: each line represents one testing sequence and has the following format: tracking_id img_0,img_1,img_2,img_future.

You can create it by iterating over the files generated by the two scripts above and writing the scene.txt file in the format described. Unfortunately, our own script for creating scene.txt operates on a different, private data structure and therefore cannot be shared. However, writing your own script should not be too hard, and you are welcome to write to us for clarification.
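As a starting point, here is a minimal sketch of such a script. It is not our actual code: the frame naming scheme, the per-track frame lists, and the 15/180-frame sampling offsets are placeholders that you would need to adapt to the output of the two scripts and to your own target times.

```python
import os

# Placeholder sampling offsets: three history frames spaced 15 frames apart
# and a future frame 180 frames ahead (adapt to your target times).
HISTORY_STEP = 15
FUTURE_STEP = 180


def frame_path(scene_dir, frame_idx):
    # Placeholder naming scheme for the extracted frames.
    return os.path.join(scene_dir, '{}.jpg'.format(frame_idx))


def write_scene_file(scene_dir, track_frames):
    """Write scene.txt for one scene.

    track_frames maps a tracking_id to the sorted list of frame indices in
    which that track is annotated (collect this from the parsed annotations).
    """
    lines = []
    for tracking_id, frames in track_frames.items():
        annotated = set(frames)
        for x in frames:
            needed = [x, x + HISTORY_STEP, x + 2 * HISTORY_STEP, x + FUTURE_STEP]
            # Keep a sequence only if the track is annotated in all four frames
            # and the corresponding images were actually extracted.
            if all(idx in annotated for idx in needed) and \
                    all(os.path.exists(frame_path(scene_dir, idx)) for idx in needed):
                images = ','.join(frame_path(scene_dir, idx) for idx in needed)
                lines.append('{} {}'.format(tracking_id, images))
    with open(os.path.join(scene_dir, 'scene.txt'), 'w') as f:
        f.write('\n'.join(lines) + '\n')
```

The existence checks only skip sequences whose frames were not extracted or whose track is not annotated in all four frames; which sequences you keep per scene is up to you.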

Best,

Ah this is perfect, thanks so much for the rapid response! Skimming through the files, I think this is exactly what I needed.

Best,
-Alex

@os1a
Can you explain a little more how the scene.txt files were created? I understand the format, and that the first three frames are x, x + 15, and x + 30 while the fourth is x + 180, but I don't know:

  • how are the tracking_ids chosen, and do they matter?
  • what determines the number of lines per file?
  • can it happen that the object is out of the picture in the fourth frame?

I also have one question regarding parse_annotation.py:

  • is it safe to filter out occluded images? Could that cause prediction problems when the ground truth is occluded?