COCO is a large-scale object detection, segmentation, and captioning dataset.
- 330K images (>200K labeled)
- 1.5 million object instances
- 80 object categories, 91 stuff categories
- 5 captions per image
- 250,000 people with keypoints
- 15,440,132 boxes on 600 categories
- 30,113,078 image-level labels on 19,794 categories
The dataset consists of approximately 380,000 video segments, each 15-20 s long, extracted from 240,000 different publicly visible YouTube videos. All of these segments were human-annotated with high-precision classifications and bounding boxes at 1 frame per second.
Related blog post.
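To get a feel for the scale implied by the segment figures above, the number of annotated frames can be bounded with simple arithmetic. This is only a rough sketch: the helper name `annotated_frame_range` is made up for illustration, and it assumes every second of every segment carries an annotation, which the description does not guarantee.

```python
def annotated_frame_range(num_segments, min_seconds, max_seconds, fps=1):
    """Return (low, high) bounds on the number of annotated frames.

    Assumption: each segment is annotated at `fps` frames per second
    for its whole duration, with no skipped frames.
    """
    low = num_segments * min_seconds * fps
    high = num_segments * max_seconds * fps
    return low, high


# Figures quoted in the description: ~380,000 segments of 15-20 s at 1 fps.
low, high = annotated_frame_range(380_000, 15, 20)
print(f"{low:,} to {high:,} annotated frames")
```

Under these assumptions the estimate is 5.7 to 7.6 million annotated frames; treat it as an order-of-magnitude figure, not an official count.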
DensePose-COCO, a large-scale ground-truth dataset with image-to-surface correspondences manually annotated on 50K COCO images.
DensePose-RCNN: https://github.com/facebookresearch/DensePose
- 100,000 HD video sequences covering over 1,100 hours of driving
- 100,000 images of road object detections
- 10,000 images of instance segmentation
- 100,000 images of driveable area & lane markings
Their tasks of interest are: stereo, optical flow, visual odometry, 3D object detection and 3D tracking.
Parsing replay files provides highly detailed match data.
StarCraft: Brood War replay dataset with 65,646 games (365 GB, 1,535 million frames, and 496 million player actions).
Enjoy the game :P
And learn more at AlphaGo Teach.
A knowledge base, an ongoing effort to connect structured image concepts to language.
- 108,077 Images
- 5.4 Million Region Descriptions
- 1.7 Million Visual Question Answers
- 3.8 Million Object Instances
- 2.8 Million Attributes
- 2.3 Million Relationships
- Everything Mapped to Wordnet Synsets
Gym is a toolkit for developing and comparing reinforcement learning algorithms.
Datasets ranging from CS to physics. Featured CS datasets:
- NewsQA: 12,744 stories with 119,633 Question-Answer Pairs.
- Frames: Human-human goal-oriented dataset with 1,369 dialogues. (a.k.a. Maluuba Frames)
- Dual Word Embeddings Trained on Bing Queries
https://silviogiancola.github.io/SoccerNet/
A Scalable Dataset for Action Spotting in Soccer Videos.