SJTU-ViSYS / TextSLAM-Dataset

🤖 Dataset for TextSLAM: Visual SLAM with Semantic Planar Text Features. (ICRA2020 & TPAMI2023)

TextSLAM-Dataset: Text-oriented Semantic Dataset

Project: TextSLAM: Visual SLAM with Semantic Planar Text Features

Authors: Boying Li, Danping Zou, Yuan Huang, Xinghan Niu, Ling Pei and Wenxian Yu.

🏠 [Project]   📝 [Paper]   🔥 [Code]   🔧 [Extra Evaluation Tool]

This repository contains TextSLAM-Dataset, the Text-oriented Semantic Dataset.

Overview

⭐ TextSLAM-Dataset is a robust and expansive text-oriented semantic dataset covering various real-world scenarios, both indoor and outdoor, accompanied by comprehensive ground truth:

  • This is the first text-oriented dataset for SLAM methods.
  • Covers diverse indoor and outdoor scenes, including rich scene text in various sizes, fonts, languages, and backgrounds.
  • Covers complex real-world environments with rich semantic objects and multiple challenges, such as complex occlusion, glass reflection, dynamic pedestrians, and illumination changes.
  • Provides high-precision pose and mapping ground truth.
  • Provides image retrieval ground truth for the day-night sequences, serving as a valuable resource for visual localization tasks.

⭐ Dataset Overview:

  • Comprises 36 sequences covering a mix of indoor and outdoor scenes.
  • Collected with the Intel RS-D455 depth camera.
  • Provides text extraction results for each sequence to allow fair comparisons. The text detection and recognition results in our paper are from AttentionOCR. More advanced text extractors can be substituted if they are available.
  • Refer to our paper for the performance of state-of-the-art SLAM algorithms on this dataset.

Overview of TextSLAM dataset (outdoor)
 

Overview of TextSLAM dataset Ground truth (outdoor)
 

Our accompanying videos are now available on YouTube (click the images below to open) and Bilibili: 1-outdoor, 2-night, 3-rapid.

video video video

⭐ Please consider citing the following papers in your publications if this project helps your work.

@article{li2023textslam,
  title={TextSLAM: Visual SLAM with Semantic Planar Text Features},
  author={Li, Boying and Zou, Danping and Huang, Yuan and Niu, Xinghan and Pei, Ling and Yu, Wenxian},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
  year={2023}
}

@inproceedings{li2020textslam,
  title={TextSLAM: Visual SLAM with Planar Text Features},
  author={Li, Boying and Zou, Danping and Sartori, Daniele and Pei, Ling and Yu, Wenxian},
  booktitle={IEEE International Conference on Robotics and Automation (ICRA)},
  year={2020}
}

Dataset Download

Sequences are grouped by the scene in which they were collected. In each download table, the 'All' link downloads all the data for a single sequence. Links for the individual items ('Images', 'Texts', 'Ground Truth', 'Image List') are provided in the following columns.

A. Indoor Scene Download


Overview of Indoor Environment & Sensors
 

➡️ Indoor sequences (10 sequences): BaiduYun Link, Google Link
Refer to yaml/GeneralMotion.yaml in the TextSLAM code and Table-2 in our paper.

➡️ Indoor sequences for loop test (8 sequences): BaiduYun Link, Google Link
Refer to yaml/AIndoorLoop.yaml in the TextSLAM code and Table-4 in our paper.

B. Large Indoor Scene Download


Large Indoor Environment
 

➡️ Large Indoor sequences for loop test (9 sequences): BaiduYun Link, Google Link
Refer to yaml/LIndoorLoop.yaml in the TextSLAM code and Table-5 in our paper.

C. Outdoor Scene Download


Outdoor Environment & Sensors
 

➡️ Day Sequences (8 sequences): BaiduYun Link, Google Link
Refer to yaml/Outdoor.yaml in the TextSLAM code and Table-6 in our paper.

➡️ Night Sequences (1 sequence): BaiduYun Link, Google Link
Refer to Figure-23 in our paper.

Notation:

Each downloaded sequence is organized as follows:

<Sequence name>
│
├── <images>
├────────	[timestamp].png
├──────── .......
├── <text>
├────────	[timestamp]_dete.txt		   // detection result for [timestamp].png. Each line: u1,v1,u2,v2,u3,v3,u4,v4
├────────	[timestamp]_mean.txt		   // recognition results for [timestamp].png. Each line: meaning, confidence
├──────── .......
├── <Exper.txt>                      		  // image list for this sequence
├── <gt.txt>                         		  // Each line (TUM format): timestamp tx ty tz qx qy qz qw
<Intrinsic Parameters>          		  // First line: fx, fy, cx, cy; Second line: k1, k2, p1, p2, k3
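
The line formats listed above can be read with a few small helpers. The following Python sketch is illustrative only and is not part of the TextSLAM code: the function names are hypothetical, the recognized text is assumed to be separated from its confidence by the last comma on the line, and the intrinsics file is assumed to hold plain numbers separated by commas or whitespace.

```python
from typing import List, Tuple


def parse_detection_line(line: str) -> List[Tuple[float, float]]:
    """Parse 'u1,v1,u2,v2,u3,v3,u4,v4' into four (u, v) corner points."""
    values = [float(v) for v in line.strip().split(",")]
    if len(values) != 8:
        raise ValueError(f"expected 8 values, got {len(values)}")
    return [(values[i], values[i + 1]) for i in range(0, 8, 2)]


def parse_recognition_line(line: str) -> Tuple[str, float]:
    """Parse 'meaning, confidence' into (text, confidence).

    Assumption: the confidence follows the LAST comma, so recognized
    text that itself contains commas is kept intact.
    """
    text, confidence = line.strip().rsplit(",", 1)
    return text.strip(), float(confidence)


def parse_gt_line(line: str):
    """Parse a TUM-format pose line: timestamp tx ty tz qx qy qz qw.

    The timestamp is kept as a string so it can be matched against
    the [timestamp].png image names without rounding issues.
    """
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    tx, ty, tz, qx, qy, qz, qw = (float(v) for v in fields[1:])
    return fields[0], (tx, ty, tz), (qx, qy, qz, qw)


def parse_intrinsics(first_line: str, second_line: str):
    """Build a 3x3 camera matrix and a distortion list from two lines.

    Assumption: values are separated by commas and/or whitespace.
    """
    def numbers(line: str) -> List[float]:
        return [float(v) for v in line.replace(",", " ").split()]

    fx, fy, cx, cy = numbers(first_line)
    k1, k2, p1, p2, k3 = numbers(second_line)
    camera_matrix = [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]
    return camera_matrix, [k1, k2, p1, p2, k3]
```

The distortion list keeps the order k1, k2, p1, p2, k3 given above, which is also the order OpenCV expects for its five-element distortion vector.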

For night sequence:

<match_gt.txt>                     		  // Each line: [night_image_name].png [matched day_image_name].png

Specifically, the [matched day_image_name].png is from Seq_02 in the 4_Outdoor sequences.
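
As a rough illustration of how match_gt.txt could be used for image retrieval, the Python sketch below loads the night-to-day pairs and scores a set of predicted matches against them. The helper names and the exact-match top-1 score are assumptions made for this example; they are not the evaluation protocol from the paper.

```python
from typing import Dict, Iterable


def parse_match_gt(lines: Iterable[str]) -> Dict[str, str]:
    """Map each night image name to its matched day image name.

    Each line is expected to be '[night_image_name].png [day_image_name].png';
    blank or malformed lines are skipped.
    """
    matches: Dict[str, str] = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        night_name, day_name = parts
        matches[night_name] = day_name
    return matches


def top1_accuracy(predicted: Dict[str, str], ground_truth: Dict[str, str]) -> float:
    """Fraction of night images whose predicted day image equals the ground truth.

    Hypothetical metric for illustration: only exact name matches count.
    """
    if not ground_truth:
        return 0.0
    hits = sum(
        1 for night_name, day_name in ground_truth.items()
        if predicted.get(night_name) == day_name
    )
    return hits / len(ground_truth)
```

In practice the lines would come from reading match_gt.txt, for example `parse_match_gt(open(path))` with `path` pointing at the downloaded file.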

License

TextSLAM-Dataset is licensed under a CC BY-NC-SA 4.0 License and is released for non-commercial research purposes only.
