STAIR-Lab-CIT / STAIR-captions

STAIR captions: large-scale Japanese image caption dataset

Home Page: http://captions.stair.center

STAIR Captions

We developed a large-scale Japanese image caption dataset named STAIR Captions. The STAIR Captions website is http://captions.stair.center.

Annotation Format

The STAIR Captions dataset is provided as JSON files. Its annotation format follows that of MS-COCO:

annotation{
  "id"                : int,
  "image_id"          : int,
  "caption"           : str,
  "tokenized_caption" : str,
}

For details of the annotation format, please see the MS-COCO download page.
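As a rough illustration of how records in this format might be consumed, the Python sketch below groups captions by image. It rests on assumptions not stated above: that the records sit under a top-level "annotations" key, as they do in MS-COCO caption files, and that the caller supplies the path to a downloaded JSON file. The function names load_annotations and group_captions are hypothetical helpers, not part of any STAIR Captions tooling.

```python
import json
from collections import defaultdict


def load_annotations(path):
    """Read a caption JSON file and return its list of annotation records."""
    # UTF-8 is specified explicitly because the captions are Japanese text.
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    # Assumption: records are stored under "annotations", as in MS-COCO.
    return data["annotations"]


def group_captions(annotations, tokenized=False):
    """Map each image_id to the list of captions written for that image."""
    # Pick which of the two caption fields from the format above to collect.
    field = "tokenized_caption" if tokenized else "caption"
    grouped = defaultdict(list)
    for ann in annotations:
        grouped[ann["image_id"]].append(ann[field])
    return dict(grouped)
```

Calling group_captions(load_annotations(path)) on a downloaded file would then give a dictionary keyed by image_id, and passing tokenized=True selects the "tokenized_caption" field instead of "caption".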

Publications

  • Yuya Yoshikawa, Yutaro Shigeto, Akikazu Takeuchi, "STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset," Annual Meeting of the Association for Computational Linguistics (ACL), Short Paper, 2017. [arXiv]
  • Yuya Yoshikawa, Yutaro Shigeto, Akikazu Takeuchi, "STAIR Captions: A Large-Scale Japanese Image Caption Dataset," The 23rd Annual Meeting of the Association for Natural Language Processing (NLP2017), 2017. (In Japanese) [PDF]

License

Creative Commons Attribution 4.0 License.
