lawpdas / LVLT

Long-term Vision-Language Tracking

LVLT: Long-term Vision-Language Tracking

We annotate the popular long-term tracking dataset, LTB50, with dense language descriptions. Based on this language-annotated dataset, we extend traditional Long-term visual Tracking (LT) to Long-term Vision-Language Tracking (LVLT).

Language Description Annotator

We also provide an annotation toolkit built with the tkinter package. Launch it from the repository root:

python -m lib.gui 

  • Text Box: The upper box shows the previous description; the lower box is for annotating the current frame. Type a language description into the lower box and click the Save button (or press the Enter key).
  • Keys Ctrl+Up and Ctrl+Down (buttons |< and >|): choose video
  • Keys Ctrl+Left and Ctrl+Right (buttons < and >): choose frame
  • Keys Shift+Left and Shift+Right (buttons << and >>): fast-backward and fast-forward
  • Keys Alt+Left and Alt+Right (buttons @<< and >>@): jump to the previous description or to the next description
  • Key Enter (button Save): save the description of the current frame
  • Key Delete (button Clear): clear the description of the current frame
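
For readers who want to adapt the shortcuts, the sketch below shows one way such key bindings could be wired up in tkinter. It is not the code in lib.gui: the names AnnotationState, bind_keys, and FAST_STEP are hypothetical, the fast-forward step of 10 frames is an assumption, and video selection (Ctrl+Up/Ctrl+Down) is left out for brevity. Only the event sequences such as <Control-Left> and <Return> are standard tkinter.

```python
from dataclasses import dataclass, field

FAST_STEP = 10  # assumed step size; the toolkit's real value is not documented here


@dataclass
class AnnotationState:
    """Hypothetical per-video state: current frame plus sparse descriptions."""
    num_frames: int
    frame: int = 0
    descriptions: dict = field(default_factory=dict)  # frame index -> text

    def step(self, delta):
        # Clamp so navigation never leaves the video.
        self.frame = max(0, min(self.num_frames - 1, self.frame + delta))

    def jump_to_description(self, direction):
        # Move to the nearest annotated frame before (-1) or after (+1).
        if direction < 0:
            candidates = [f for f in self.descriptions if f < self.frame]
            if candidates:
                self.frame = max(candidates)
        else:
            candidates = [f for f in self.descriptions if f > self.frame]
            if candidates:
                self.frame = min(candidates)

    def save(self, text):
        # Ignore empty input so Enter on a blank box changes nothing.
        text = text.strip()
        if text:
            self.descriptions[self.frame] = text

    def clear(self):
        self.descriptions.pop(self.frame, None)


def bind_keys(widget, state, get_text):
    """Attach the shortcuts to any object with a tkinter-style bind() method."""
    bindings = {
        "<Control-Left>": lambda event: state.step(-1),
        "<Control-Right>": lambda event: state.step(+1),
        "<Shift-Left>": lambda event: state.step(-FAST_STEP),
        "<Shift-Right>": lambda event: state.step(+FAST_STEP),
        "<Alt-Left>": lambda event: state.jump_to_description(-1),
        "<Alt-Right>": lambda event: state.jump_to_description(+1),
        "<Return>": lambda event: state.save(get_text()),
        "<Delete>": lambda event: state.clear(),
    }
    for sequence, handler in bindings.items():
        widget.bind(sequence, handler)
    return bindings
```

In a real window, widget would be the tkinter.Tk root and get_text would read the lower text box, for example lambda: entry.get(). Keeping the state separate from the widgets means the navigation logic can be exercised without opening a display.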

Citation

If you find this project useful in your research, please consider citing:

@article{DBLP:journals/corr/abs-1804-07056,
  author    = {Alan Lukezic and
               Luka Cehovin Zajc and
               Tom{\'{a}}s Voj{\'{\i}}r and
               Jiri Matas and
               Matej Kristan},
  title     = {Now you see me: evaluating performance in long-term visual tracking},
  eprinttype = {arXiv},
  eprint    = {1804.07056},
}

References

  • Now you see me: evaluating performance in long-term visual tracking.
    Alan Lukežič, Luka Čehovin Zajc, Tomáš Vojíř, Jiří Matas, Matej Kristan. arXiv, 1804.07056.

License

LVLT is released under the GPL-3.0 License.
