liuruijin17 / 3DLSCPTR

This is an official repository of Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

3DLSCP: Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints

  • Predicting 3D lanes and camera pose from a single image.
  • Learning via geometry constraints to improve performances on both tasks.
  • This work has been accepted by AAAI2022. Codes in this repo is a eariler version, the latest version will be released at CLGo.

3DLSCPTRDEMO

3DLSCPTRCOMP

Model Zoo

The pretrained models are stored in 3DLSCPTRZoos/

Set Envirionment

  • Linux ubuntu 16.04
  • GeForce RTX 3090
  • Python 3.8.5
  • CUDA 11.1

Create virtualenv environment

python3 -m venv 3dlscptr

Activate it

source 3dlscptr/bin/activate

Then install dependencies

pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt

Data Preparation

Download and extract ApolloSim from yuliangguo/3D_Lane_Synthetic_Dataset

We expect the directory structure to be the following:

3dlscptr/
3DLSCPTR/
3DLSCPTRZoos/
Apollo_Sim_3D_Lane_Release/

Evaluation

Pv-stage:

(1) Balanced scenes

python test.py Pv-stage_standard

(2) Rarely observed scenes

python test.py Pv-stage_rare_subset

(3) Scenes with visual variations

python test.py Pv-stage_illus_chg

Tv-stage:

(1) Balanced scenes

python test.py Tv-stage_standard

(2) Rarely observed scenes

python test.py Tv-stage_rare_subset

(3) Scenes with visual variations

python test.py Tv-stage_illus_chg

Pv-Tv (Firstly, you must run three commands of Pv-stage to get predicted camera poses!):

(1) Balanced scenes

python test.py Tv-stage_standard --predcam

(2) Rarely observed scenes

python test.py Tv-stage_rare_subset --predcam

(3) Scenes with visual variations

python test.py Tv-stage_illus_chg --predcam

Evaluation results

Scene Method GTCP Height(cm) Pitch(o) F-Score AP X error near X error far Z error near Z error far
3D-LaneNet Yes 86.4 89.3 0.068 0.477 0.015 0.202
Balanced Gen-LaneNet Yes 88.1 90.1 0.061 0.496 0.012 0.214
Scenes Pv-stage(ours) No 0.031 0.136 88.5 90.4 0.095 0.477 0.040 0.277
Pv-Tv(ours) No 0.031 0.136 89.5 91.3 0.091 0.450 0.041 0.281
3D-LaneNet Yes 72.0 74.6 0.166 0.855 0.039 0.521
Rarely Observed Gen-LaneNet Yes 78.0 79.0 0.139 0.903 0.030 0.539
Scenes Pv-stage(ours) No 0.069 0.295 75.1 76.5 0.210 0.906 0.084 0.652
Pv-Tv(ours) No 0.069 0.295 79.7 81.4 0.207 0.860 0.092 0.661
3D-LaneNet Yes 72.5 74.9 0.115 0.601 0.032 0.230
Scenes with Gen-LaneNet Yes 85.3 87.2 0.074 0.538 0.015 0.232
visual variations Pv-stage(ours) No 0.078 0.164 85.8 87.5 0.091 0.523 0.050 0.330
Pv-Tv(ours) No 0.078 0.164 84.9 86.6 0.103 0.501 0.050 0.308

Comparisons of the upper bounds. All methods are fed with perfect camera poses during testing phase. GTSeg means the requirement of ground truth lane segmentation.

Scene Method GTSeg F-Score AP
3D-LaneNet No 86.4 89.3
Balanced Gen-LaneNet No 88.1 90.1
Scenes 3D-GeoNet Yes 91.8 93.8
Tv-stage(ours) No 90.7 92.6
3D-LaneNet No 72.0 74.6
Rarely Observed Gen-LaneNet No 78.0 79.0
Scenes 3D-GoeNet Yes 84.7 86.6
Tv-stage(ours) No 85.7 87.8
3D-LaneNet No 72.5 74.9
Scenes with Gen-LaneNet No 85.3 87.2
visual variations 3D-GeoNet Yes 90.2 92.3
Tv-stage(ours) No 86.1 88.0

Comparisons of resource consumption. 1 MAC is approx. 2 FLOPs. PP means the requirement of post processing.

Method FPS MACs(G) Para(M) PP
3D-LaneNet 53 60.47 20.6 Yes
Gen-LaneNet 60 9.85 3.4 Yes
Pv-Tv(ours) 75 0.861 1.5 No

Training

Corresponding codes will be released after acceptance.

Acknowledgements

Gen-LaneNet

LSTR

About

This is an official repository of Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints

License:BSD 3-Clause "New" or "Revised" License


Languages

Language:Python 100.0%