momatools

Home page: https://moma.stanford.edu/

MOMA-LRG

MOMA-LRG is a dataset dedicated to multi-object, multi-actor activity parsing.

Installation

git clone https://github.com/d1ngn1gefe1/momatools
cd momatools
pip install .

You can install all the dependencies needed for MOMA-LRG by running

pip install -r requirements.txt

Warning

Note that the dependency on pygraphviz requires graphviz to be installed first, which can be done with sudo apt-get install graphviz graphviz-dev on Linux or with brew install graphviz (Homebrew) on macOS.

Requirements:

  • Python 3.7+
  • ffmpeg (only for preprocessing): pip install ffmpeg-python
  • jsbeautifier (for better visualization of json files): pip install jsbeautifier

Requirements for data visualization (a usage sketch follows this list):

  • distinctipy: a lightweight package for generating visually distinct colors
  • Graphviz: sudo apt-get install graphviz graphviz-dev
  • PyGraphviz: a Python interface to the Graphviz graph layout and visualization package
  • seaborn: a data visualization library based on matplotlib
  • torchvision: image and video datasets, models, and transforms for PyTorch
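
For reference, here is a minimal sketch of how two of these packages (distinctipy and pygraphviz) could be combined to render a toy scene graph. The node names, edge labels, and output file are made up for illustration and are not taken from the dataset or from run_visualize.py.

from distinctipy import distinctipy
import pygraphviz as pgv

# Toy scene graph: two actors and one object, with directed relationship edges.
# All names below are illustrative only.
nodes = ["actor_1", "actor_2", "object_1"]
edges = [("actor_1", "object_1", "holding"),
         ("actor_2", "actor_1", "talking to")]

# One visually distinct color per node (RGB floats in [0, 1]).
colors = distinctipy.get_colors(len(nodes))

graph = pgv.AGraph(directed=True)
for name, (r, g, b) in zip(nodes, colors):
    hex_color = "#{:02x}{:02x}{:02x}".format(int(r * 255), int(g * 255), int(b * 255))
    graph.add_node(name, style="filled", fillcolor=hex_color)
for src, dst, label in edges:
    graph.add_edge(src, dst, label=label)

graph.layout(prog="dot")            # requires the system graphviz package
graph.draw("toy_scene_graph.png")   # illustrative output path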

Hierarchy

Level  Concept                     Representation
1      Activity                    Semantic label
2      Sub-activity                Temporal boundary and semantic label
3      Higher-order interaction    Spatial-temporal scene graph
       ┣━ Entity                   Graph node with bounding box, instance label, and semantic label
       ┃   ┣━ Actor                -
       ┃   ┗━ Object               -
       ┗━ Predicate                -
           ┣━ Relationship         Directed edge as a triplet (source node, semantic label, and target node)
           ┗━ Attribute            Semantic label of a graph node as a pair (source node, semantic label)
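
To make the containment between levels concrete, the sketch below mirrors the hierarchy as plain Python dataclasses. The class and field names follow the annotation syntax shown later in this README; they are illustrative and are not part of the momatools API. Transitive and intransitive actions, which follow the same pattern as relationships and attributes, are omitted for brevity.

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Entity:                      # graph node: an actor or an object
    id: str
    class_name: str
    bbox: Tuple[float, float, float, float]   # (x, y, width, height)

@dataclass
class Relationship:                # directed edge: (source node, semantic label, target node)
    class_name: str
    source_id: str
    target_id: str

@dataclass
class Attribute:                   # (source node, semantic label) pair
    class_name: str
    source_id: str

@dataclass
class HigherOrderInteraction:      # level 3: spatial-temporal scene graph
    id: str
    time: float
    actors: List[Entity] = field(default_factory=list)
    objects: List[Entity] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)
    attributes: List[Attribute] = field(default_factory=list)

@dataclass
class SubActivity:                 # level 2: temporal boundary and semantic label
    id: str
    class_name: str
    start_time: float
    end_time: float
    higher_order_interactions: List[HigherOrderInteraction] = field(default_factory=list)

@dataclass
class Activity:                    # level 1: semantic label
    id: str
    class_name: str
    start_time: float
    end_time: float
    sub_activities: List[SubActivity] = field(default_factory=list)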

Dataset directory layout

Download the dataset into a directory named dir_moma with the structure below; a quick layout check is sketched after the tree. The anns directory requires roughly 1.8 GB of space and the videos directory requires roughly 436 GB.

$ tree dir_moma
.
├── anns
│    ├── anns.json
│    ├── split_std.json
│    ├── split_fs.json
│    ├── clips.json
│    └── taxonomy
└── videos
     ├── all
     ├── raw
     ├── activity_fr
     ├── activity
     ├── sub_activity_fr
     ├── sub_activity
     ├── interaction
     ├── interaction_frames
     └── interaction_video
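
After downloading, a sanity check like the one below can confirm that the layout matches the tree above. The dir_moma path is an assumption; point it at wherever you placed the dataset.

from pathlib import Path

# Root of the downloaded dataset (adjust this path to your setup).
dir_moma = Path("dir_moma")

# Files and sub-directories from the layout shown above.
ann_files = ["anns.json", "split_std.json", "split_fs.json", "clips.json"]
video_dirs = ["all", "raw", "activity_fr", "activity", "sub_activity_fr",
              "sub_activity", "interaction", "interaction_frames", "interaction_video"]

missing = [f"anns/{name}" for name in ann_files if not (dir_moma / "anns" / name).is_file()]
if not (dir_moma / "anns" / "taxonomy").is_dir():
    missing.append("anns/taxonomy")
missing += [f"videos/{name}" for name in video_dirs if not (dir_moma / "videos" / name).is_dir()]

print("dataset layout looks complete" if not missing else f"missing entries: {missing}")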

Scripts

tests/run_preproc.py: Pre-processes the dataset. You do not need to run this script; the released dataset has already been pre-processed.

tests/run_visualize.py: Visualizes annotations and dataset statistics.

Annotations

In this version, we include:

  • 148 hours of videos.
  • 1,412 activity instances from 20 activity classes, ranging from 31s to 600s in duration with an average of 241s.
  • 15,842 sub-activity instances from 91 sub-activity classes, ranging from 3s to 31s in duration with an average of 9s.
  • 161,265 higher-order interaction instances.
  • 636,194 image actor instances and 104,564 video actor instances from 26 classes.
  • 349,034 image object instances and 47,494 video object instances from 126 classes.
  • 984,941 relationship instances from 19 classes.
  • 261,249 attribute instances from 4 classes.
  • 52,072 transitive action instances from 33 classes.
  • 442,981 intransitive action instances from 9 classes.

Below, we show the syntax of the MOMA-LRG annotations. A minimal parsing sketch follows the schema.

[
  {
    "file_name": str,
    "num_frames": int,
    "width": int,
    "height": int,
    "duration": float,

    // an activity
    "activity": {
      "id": str,
      "class_name": str,
      "start_time": float,
      "end_time": float,

      "sub_activities": [
        // a sub-activity
        {
          "id": str,
          "class_name": str,
          "start_time": float,
          "end_time": float,

          "higher_order_interactions": [
            // a higher-order interaction
            {
              "id": str,
              "time": float,

              "actors": [
                // an actor
                {
                  "id": str,
                  "class_name": str,
                  "bbox": [x, y, width, height]
                },
                ...
              ],

              "objects": [
                // an object
                {
                  "id": str,
                  "class_name": str,
                  "bbox": [x, y, width, height]
                },
                ...
              ],

              "relationships": [
                // a relationship
                {
                  "class_name": str,
                  "source_id": str,
                  "target_id": str
                },
                ...
              ],

              "attributes": [
                // an attribute
                {
                  "class_name": str,
                  "source_id": str
                },
                ...
              ],

              "transitive_actions": [
                // a transitive action
                {
                  "class_name": str,
                  "source_id": str,
                  "target_id": str
                },
                ...
              ],

              "intransitive_actions": [
                // an intransitive action
                {
                  "class_name": str,
                  "source_id": str
                },
                ...
              ]
            }
          ]
        },
        ...
      ]
    }
  },
  ...
]
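
As an illustration of how this schema can be consumed, the sketch below loads anns/anns.json with Python's standard json module and tallies instances at each level. It touches only the fields documented above; the dir_moma path is an assumption based on the layout shown earlier.

import json
from pathlib import Path

# Load the annotation file from the directory layout described earlier.
with open(Path("dir_moma") / "anns" / "anns.json") as f:
    anns = json.load(f)

num_acts = num_sacts = num_hois = num_actors = num_objects = 0
for record in anns:
    activity = record["activity"]
    num_acts += 1
    for sact in activity["sub_activities"]:
        num_sacts += 1
        for hoi in sact["higher_order_interactions"]:
            num_hois += 1
            num_actors += len(hoi["actors"])
            num_objects += len(hoi["objects"])

print(f"{num_acts} activities, {num_sacts} sub-activities, {num_hois} higher-order interactions")
print(f"{num_actors} actor instances, {num_objects} object instances")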

Class distributions

Per-class distribution charts are provided for activities, sub-activities, actors, objects, relationships, attributes, transitive actions, and intransitive actions.
