3dv-casia / LSLM_VLoc

Lightweight Structured Line Map Based Visual Localization

LSLM_VLoc: Lightweight Structured Line Map Based Visual Localization

This repository contains the implementation of the paper "Lightweight Structured Line Map Based Visual Localization" by Hongmin Liu, Chengyang Cao, Hanqiao Ye, Hainan Cui, Wei Gao, Xing Wang, and Shuhan Shen.

Abstract

Visual localization, also known as camera pose estimation, is a crucial component of many applications, such as robotics, autonomous driving, and augmented reality. Traditional visual localization algorithms typically run on point cloud maps generated by algorithms such as Structure-from-Motion (SfM) or Simultaneous Localization and Mapping (SLAM). However, point features are sensitive to weak textures and illumination changes. In addition, the generated 3D point cloud maps often contain millions of points, posing higher demands on device storage and computing resources. To address these challenges, we propose a visual localization algorithm based on lightweight structured line maps. Instead of extracting and matching point features in the images, we select line segments that represent structured scene information as image features. These line segments are then used to construct a lightweight line map containing rich structured scene information. Then, the camera pose is estimated through a series of steps including line extraction, matching, initial pose estimation, and pose refinement. Experimental results on benchmark datasets demonstrate that compared to the current state-of-the-art visual localization methods, our method achieves competitive localization accuracy while significantly reducing the memory footprint of the 3D map.

LSLM_VLoc

Pipeline

The pipeline of LSLM_VLoc can be divided into four steps:

  • Inputs: Query images and database images.

  • Step 1: Using the camera poses of the reference images provided by a standard point-based SfM algorithm as input, construct a 3D scene line map offline. This line map serves as the pre-built map for visual localization.

  • Step 2: Establish 2D-3D line correspondences through line segment detection and a coarse-to-fine hierarchical matching strategy.

  • Step 3: Estimate the initial camera pose with the designed Group-RANSAC PnL algorithm.

  • Step 4: Iteratively refine the initial pose using a reprojection loss function designed specifically for line segments to obtain the final six-degree-of-freedom camera pose.

  • Outputs: The six-degree-of-freedom pose of the camera at the time the query image was taken, consisting of a three-degree-of-freedom rotation matrix R and a three-degree-of-freedom translation vector t.
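Step 4 refines the pose by minimizing a reprojection loss for line segments. The README does not spell out that loss, so the following is only a minimal sketch of one common formulation and not the authors' implementation: project the two endpoints of a 3D map line with the current pose, then measure the perpendicular pixel distance from the endpoints of the matched 2D segment to the projected line. The function names, the distortion-free pinhole model, and the world-to-camera convention x_cam = R X + t are all assumptions made for illustration.

```python
import math

def project(K, R, t, X):
    """Project 3D world point X to pixel coordinates (hypothetical helper).

    Assumes a distortion-free pinhole camera, x_cam = R @ X + t,
    and that the point lies in front of the camera (positive depth).
    """
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    u = K[0][0] * Xc[0] / Xc[2] + K[0][2]
    v = K[1][1] * Xc[1] / Xc[2] + K[1][2]
    return (u, v)

def line_through(p, q):
    """Homogeneous 2D line (a, b, c) through pixels p and q, scaled so a^2 + b^2 = 1."""
    a = p[1] - q[1]
    b = q[0] - p[0]
    c = p[0] * q[1] - q[0] * p[1]
    n = math.hypot(a, b)
    return (a / n, b / n, c / n)

def line_reprojection_error(K, R, t, P1, P2, d1, d2):
    """Mean point-to-line distance in pixels.

    P1, P2: endpoints of the 3D map line; d1, d2: endpoints of the
    matched 2D segment detected in the query image.
    """
    a, b, c = line_through(project(K, R, t, P1), project(K, R, t, P2))
    dist1 = abs(a * d1[0] + b * d1[1] + c)
    dist2 = abs(a * d2[0] + b * d2[1] + c)
    return 0.5 * (dist1 + dist2)
```

Under these assumptions, a refinement step would sum this error over all 2D-3D line correspondences and minimize it with respect to R and t.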

The following figure shows the pipeline of LSLM_VLoc:

[Figure: LSLM_VLoc pipeline]

Datasets

We perform experiments on the Cambridge Landmarks dataset and the Aachen Day-Night dataset.

Download the Cambridge Landmarks dataset from: https://www.repository.cam.ac.uk/items/53788265-cb98-42ee-b85b-7a0cbc8eddb3

export dataset=datasets/cambridge
export scenes=( "KingsCollege" "OldHospital" "StMarysChurch" "ShopFacade" "GreatCourt" )
export IDs=( "251342" "251340" "251294" "251336" "251291" )
for i in "${!scenes[@]}"; do
wget https://www.repository.cam.ac.uk/bitstream/handle/1810/${IDs[i]}/${scenes[i]}.zip -P $dataset \
&& unzip $dataset/${scenes[i]}.zip -d $dataset && rm $dataset/${scenes[i]}.zip; done
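After the download finishes, it can be useful to confirm that every scene was extracted. The sketch below assumes that each archive unpacks into a directory named after its scene directly under the dataset root; that layout and the helper name `missing_scenes` are assumptions, not something the README specifies.

```python
from pathlib import Path

# Scene names taken from the download commands above.
CAMBRIDGE_SCENES = ["KingsCollege", "OldHospital", "StMarysChurch", "ShopFacade", "GreatCourt"]

def missing_scenes(dataset_root, scenes=CAMBRIDGE_SCENES):
    """Return the scene names that have no directory under dataset_root.

    Assumes one directory per scene (hypothetical layout check).
    """
    root = Path(dataset_root)
    return [scene for scene in scenes if not (root / scene).is_dir()]

if __name__ == "__main__":
    missing = missing_scenes("datasets/cambridge")
    print("all scenes present" if not missing else f"missing: {missing}")
```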

Download the Aachen Day-Night dataset from: https://data.ciirc.cvut.cz/public/projects/2020VisualLocalization/Aachen-Day-Night/

export dataset=datasets/aachen
wget -r -np -nH -R "index.html*,aachen_v1_1.zip" --cut-dirs=4  https://data.ciirc.cvut.cz/public/projects/2020VisualLocalization/Aachen-Day-Night/ -P $dataset
unzip $dataset/images/database_and_query_images.zip -d $dataset

Code

We are actively preparing the source code for release. If you are interested in this project, keep an eye on this repository; the code will be available soon.

Results

A line map constructed for the Old Hospital scene of the Cambridge Landmarks dataset:

[Figure: line map of the Old Hospital scene]

Experimental results of current state-of-the-art visual localization methods on the Old Hospital scene:

| Method | Map Size | Median Error (m/°) |
|---|---|---|
| Active Search | 200MB | 0.52/1.12 |
| HLoc | 800MB | 0.15/0.30 |
| PixLoc | ~600MB | 0.16/0.30 |
| GoMatch | ~12MB | 2.83/8.14 |
| BPnPNet+SuperPoint | ~12MB | 24.8/162.99 |
| PtLine | >800MB | 0.15/0.31 |
| SRC | 40MB | 0.38/0.50 |
| DSAC++ | 207MB | 0.20/0.30 |
| PoseNet | 50MB | 2.31/5.38 |
| MS-Transformer | ~70MB | 1.81/2.39 |
| SANet | ~260MB | 0.32/0.50 |
| CROSSFIRE | 50MB | 0.43/0.70 |
| Ours | ~3MB | 0.34/0.86 |
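The errors above are median position errors in meters and median orientation errors in degrees. The README does not state the exact evaluation protocol, so the sketch below only illustrates a convention that is common in visual localization benchmarks: position error as the distance between estimated and ground-truth camera centers, and orientation error as the angle of the relative rotation. The function names and the convention x_cam = R x_world + t are assumptions.

```python
import math
import statistics

def rotation_error_deg(R_est, R_gt):
    """Angle in degrees of the relative rotation between R_est and R_gt."""
    # trace(R_est^T R_gt) equals the sum of elementwise products of the two matrices.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against rounding before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def camera_center(R, t):
    """Camera center c = -R^T t, assuming x_cam = R x_world + t."""
    return [-sum(R[j][i] * t[j] for j in range(3)) for i in range(3)]

def position_error_m(R_est, t_est, R_gt, t_gt):
    """Euclidean distance between estimated and ground-truth camera centers."""
    return math.dist(camera_center(R_est, t_est), camera_center(R_gt, t_gt))

def median_errors(per_query_errors):
    """Median (position, orientation) over a list of per-query (m, deg) pairs."""
    positions = [p for p, _ in per_query_errors]
    rotations = [r for _, r in per_query_errors]
    return statistics.median(positions), statistics.median(rotations)
```

Taking the median rather than the mean keeps a few badly localized queries from dominating the reported number.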
