ClementPinard / FlowNetPytorch

Pytorch implementation of FlowNet by Dosovitskiy et al.

random crop resolution 320*448

poincarelee opened this issue · comments

Hi,
Thanks for your pioneering work and great contribution.
I saw in your paper that you crop the FlyingChairs images to 512×384, while in main.py the random crop is 320×448. Could you tell me why you chose this resolution?
If I wanted to use a real-scene dataset with a resolution of e.g. 1920×1080, how should I crop the images?

Hello,

FlowNet is fully convolutional, so it does not require a particular input resolution. If your machine is beefy enough, you can feed full-HD images to the network directly.

The reason we crop the images to 320×448 instead of using the full resolution is to allow data augmentation such as random rotation, or translating one of the images to create an artificial optical flow. If we stayed at full resolution, we would end up with many dead zones where no pixel is present, and thus no optical flow can be learned.
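To make the translation trick concrete, here is a minimal numpy sketch of the idea (the function names are illustrative, not the repo's actual API): shifting the second image by a constant offset adds that offset to every ground-truth flow vector, and a subsequent crop discards the empty border the shift leaves behind.

```python
import numpy as np

def translate_with_flow(img1, img2, flow, tx, ty):
    """Toy version of 'translate one image to create artificial flow':
    shift img2 by (tx, ty) pixels and add that constant displacement
    to the ground-truth flow. Not the repo's actual implementation."""
    h, w = img2.shape[:2]
    shifted = np.zeros_like(img2)
    # copy the part of img2 that stays inside the frame after the shift
    shifted[max(ty, 0):h + min(ty, 0), max(tx, 0):w + min(tx, 0)] = \
        img2[max(-ty, 0):h + min(-ty, 0), max(-tx, 0):w + min(-tx, 0)]
    new_flow = flow.copy()
    new_flow[..., 0] += tx  # x displacement grows by tx
    new_flow[..., 1] += ty  # y displacement grows by ty
    return img1, shifted, new_flow

def center_crop_pair(img1, img2, flow, ch, cw):
    """Crop all three arrays identically, discarding the dead zone
    (the zero border) introduced by the shift."""
    h, w = flow.shape[:2]
    y, x = (h - ch) // 2, (w - cw) // 2
    return (img1[y:y + ch, x:x + cw],
            img2[y:y + ch, x:x + cw],
            flow[y:y + ch, x:x + cw])
```

Cropping after the shift is exactly why a margin below full resolution is useful: without it, the zeroed border pixels have no valid correspondence and would pollute the training loss.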

Thanks a lot for such a quick reply.
Good explanation, I completely got it.

Hmm, I have another question: for my real-scene dataset (which has images and ground-truth flow), do I need to use co_transform?

co_transform is only used during training. It provides a set of transformations that modify both the images and the resulting optical flow. If you don't use data augmentation for your training, you might not need co_transforms.

See the base dataset for optical flow here: https://github.com/ClementPinard/FlowNetPytorch/blob/master/datasets/listdataset.py — as you can see, if no co_transform object is given, it is simply ignored. I do encourage you to use data augmentation for your training, though :)
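The pattern can be sketched like this (a simplified, numpy-only stand-in for the real listdataset.py; class and function names here are illustrative): the joint transform is optional, and when present it must modify the images and the flow consistently.

```python
import numpy as np

class FlowPairDataset:
    """Minimal sketch of the optional-co_transform pattern used in
    listdataset.py; not the repo's actual class."""

    def __init__(self, samples, co_transform=None):
        self.samples = samples          # list of (img1, img2, flow) arrays
        self.co_transform = co_transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        img1, img2, flow = self.samples[idx]
        if self.co_transform is not None:
            # applied jointly so images and flow stay consistent
            [img1, img2], flow = self.co_transform([img1, img2], flow)
        return (img1, img2), flow

def hflip(images, flow):
    """Example co_transform: horizontal flip. Flipping the width axis
    also requires negating the x component of the flow."""
    images = [np.ascontiguousarray(im[:, ::-1]) for im in images]
    flow = np.ascontiguousarray(flow[:, ::-1])
    flow[..., 0] *= -1
    return images, flow
```

Omitting co_transform simply skips the branch, which is why the same dataset class works for both augmented training and plain evaluation.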

If you only use the network for inference, you won't need it. See an example here: https://github.com/ClementPinard/FlowNetPytorch/blob/master/run_inference.py