Multitask Tracking
TODO:
- Express coordinates in 224 x 224 pixel space instead of the -4 to 4 range
- Visualize coordinates on the image
- Build Gaussian attention maps from the previous frame
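
A rough sketch of the coordinate rescaling and the Gaussian attention map from the list above. Everything here is an assumption for illustration, not the project's actual code: the mapping is taken to be linear with -4 landing on pixel 0 and 4 on pixel 223, the map is centered on the previous frame's target position, and the names `to_pixels`, `gaussian_map` and the default `sigma` are hypothetical.

```python
import math

def to_pixels(value, size=224, lo=-4.0, hi=4.0):
    """Linearly map a coordinate from [lo, hi] to pixel space [0, size - 1].

    Assumption: lo maps to pixel 0 and hi maps to pixel size - 1.
    """
    return (value - lo) / (hi - lo) * (size - 1)

def gaussian_map(cx, cy, size=224, sigma=10.0):
    """Return a size x size list of rows with a 2D Gaussian peaking at (cx, cy).

    (cx, cy) is assumed to be the target position in the previous frame,
    already in pixel units. The peak value is 1.0 (no normalization).
    """
    denom = 2.0 * sigma ** 2
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / denom) for x in range(size)]
        for y in range(size)
    ]

# Example: previous-frame position given in the old -4 to 4 range.
prev_x, prev_y = to_pixels(0.0), to_pixels(-4.0)
attention = gaussian_map(prev_x, prev_y)
```

The map is indexed as `attention[row][column]`, i.e. `[y][x]`. Whether the y axis needs flipping depends on how the original -4 to 4 coordinates were defined, which these notes do not say.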
Possible modifications:
- regress offsets, not coordinates
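
A minimal sketch of what regressing offsets could look like, assuming the offset is measured from the previous frame's position in pixel units. The helper names `to_offset` and `apply_offset` are hypothetical.

```python
def to_offset(prev_xy, curr_xy):
    """Training target: displacement from the previous position to the current one."""
    return (curr_xy[0] - prev_xy[0], curr_xy[1] - prev_xy[1])

def apply_offset(prev_xy, offset):
    """Inference: recover the current position from a predicted offset."""
    return (prev_xy[0] + offset[0], prev_xy[1] + offset[1])

# Example with made-up pixel positions.
prev_xy = (100, 50)
curr_xy = (103, 48)
target = to_offset(prev_xy, curr_xy)
recovered = apply_offset(prev_xy, target)
```

With this setup the network output stays small and centered near zero when the target moves little between frames, at the cost of errors accumulating if each frame's prediction is fed into the next.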
To keep in mind and other ideas:
- occlusion network
- motion network