TRI-ML / RAM

Implementation for Object Permanence Emerges in a Random Walk along Memory

Object Permanence Emerges in a Random Walk along Memory

A self-supervised approach for learning representations that localize objects under occlusion:

Object Permanence Emerges in a Random Walk along Memory,
Pavel Tokmakov, Allan Jabri, Jie Li, Adrien Gaidon,
arXiv technical report (arXiv 2204.01784)

@inproceedings{tokmakov2022object,
  title={Object Permanence Emerges in a Random Walk along Memory},
  author={Tokmakov, Pavel and Jabri, Allan and Li, Jie and Gaidon, Adrien},
  booktitle={ICML},
  year={2022}
}

Abstract

This paper proposes a self-supervised objective for learning representations that localize objects under occlusion - a property known as object permanence. A central question is the choice of learning signal in cases of total occlusion. Rather than directly supervising the locations of invisible objects, we propose a self-supervised objective that requires neither human annotation, nor assumptions about object dynamics. We show that object permanence can emerge by optimizing for temporal coherence of memory: we fit a Markov walk along a space-time graph of memories, where the states in each time step are non-Markovian features from a sequence encoder. This leads to a memory representation that stores occluded objects and predicts their motion, to better localize them. The resulting model outperforms existing approaches on several datasets of increasing complexity and realism, despite requiring minimal supervision, and hence being broadly applicable.
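To make the "Markov walk along a space-time graph" idea concrete, here is a minimal, dependency-free sketch of a generic random-walk objective. It is not the code from this repo. The function names (`transition`, `walk_distribution`, `walk_loss`), the dot-product similarity, and the temperature value are all assumptions chosen for illustration. Each frame is a list of node feature vectors, consecutive frames are connected by softmax-normalized similarities, and the loss is the negative log-probability that a walker starting at one node ends at a chosen target node in the last frame.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transition(src, dst, temperature=0.1):
    """Row-stochastic transition matrix between two frames.

    Entry [i][j] is the probability of stepping from node i in `src`
    to node j in `dst`, from a softmax over dot-product similarities.
    (Hypothetical helper; similarity and temperature are assumptions.)
    """
    return [
        softmax([sum(a * b for a, b in zip(u, v)) / temperature for v in dst])
        for u in src
    ]


def walk_distribution(frames, start, temperature=0.1):
    """Distribution over last-frame nodes for a walker starting at `start`."""
    dist = [1.0 if i == start else 0.0 for i in range(len(frames[0]))]
    for src, dst in zip(frames, frames[1:]):
        probs = transition(src, dst, temperature)
        # One Markov step: push the current distribution through the matrix.
        dist = [
            sum(dist[i] * probs[i][j] for i in range(len(src)))
            for j in range(len(dst))
        ]
    return dist


def walk_loss(frames, start, target, temperature=0.1):
    """Negative log-probability of ending the walk at `target`."""
    return -math.log(walk_distribution(frames, start, temperature)[target] + 1e-12)
```

In this sketch only the start and target nodes are supervised, and the intermediate frames receive no direct signal. That loosely mirrors the setting described in the abstract, where the locations of invisible objects are not supervised directly. In a real model the features would come from a learned sequence encoder and the loss would be minimized by gradient descent.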

Installation

Please refer to INSTALL.md for installation instructions.

Benchmark Evaluation and Training

After installation, follow the instructions in DATA.md to set up the datasets. Then see GETTING_STARTED.md to reproduce the results in the paper. Scripts for all experiments are provided in the experiments folder.

License

RAM is built on PermaTrack and CenterTrack, both of which are released under the MIT License. Some CenterTrack code comes from third parties under different licenses; please check the CenterTrack repo for details. In addition, this repo uses py-motmetrics and the TAO codebase for computing Track AP. The ConvGRU implementation is adopted from this repo. See NOTICE for details. Please note the license of each dataset: most of the datasets used in this project are under non-commercial licenses.

About

License: GNU General Public License v3.0

Languages: Python 97.4%, HTML 1.2%, Shell 0.8%, Dockerfile 0.3%, Makefile 0.2%