3lLobo / EagleAIs

Computer vision project.

Eagle AI

banner

SDAIA hackathon project.

Demo Video

Demo notebook

Project Description

Eagle AIs is a modular, scalable, cutting-edge solution for the automated detection, localization and mitigation of street damage, resulting in a modern road network with an automated maintenance workflow.

Engineered and prototyped by a team of young PhD candidates, the solution combines Deep Learning, cloud computing and augmented reality.

We solve the challenge of detection, localization and severity estimation of road-damage, provide exact geo-coordinates and 3D-reconstruction and eliminate the need for patrolling and specialized vehicles.

Data can be provided by ordinary citizens through a mobile-phone camera. The geo-coordinates and timestamp give an exact reference, while the images are processed in batches in a cloud environment. From a single image we detect different kinds of street damage, estimate their size and severity, and generate a complete 3D reconstruction of the scene for on-site inspection with augmented reality.

Besides the standard detection of potholes, we increased fidelity by adding 2 classes: longitudinal road damage and sand accumulation. Potholes pose the most prominent risk to vehicle and driver and require immediate attention. Smaller and often more extended road damage, such as cracks and water puddles, is assigned lower priority but should still be targeted for pre-emptive care. Sand accumulation is a separate class because it hinders visibility and can cover severe damage, so it should be investigated more closely before deciding on appropriate action.

The severity of the damage is estimated by combining a Laplacian filter, a classical and well-established computer vision technique, with 3D reconstruction for depth estimation. Combined with the GPS coordinates from the phone, this yields a modular severity heatmap in a panoptic bird's-eye view that can be overlaid on popular map providers such as Google Maps. The heatmap is intended to optimize the coordination and logistics of repair missions. Project Eagle AIs is scalable, low-cost and highly performant.
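As a rough illustration of the severity heatmap described above, the sketch below bins geo-referenced detections into a local grid and keeps the worst severity per cell. It is not taken from the project code: the cell size, the function names, the coordinates and the max-per-cell aggregation are all assumptions made for the example.

```python
import math
from collections import defaultdict

CELL_METERS = 25.0            # hypothetical grid resolution
EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def cell_index(lat, lon, ref_lat, ref_lon, cell=CELL_METERS):
    """Map a GPS fix to (row, col) on a local grid (equirectangular approximation)."""
    north = math.radians(lat - ref_lat) * EARTH_RADIUS_M
    east = math.radians(lon - ref_lon) * EARTH_RADIUS_M * math.cos(math.radians(ref_lat))
    return (math.floor(north / cell), math.floor(east / cell))


def severity_heatmap(detections, ref_lat, ref_lon):
    """Keep the highest severity (assumed to lie in [0, 1]) seen in each grid cell."""
    grid = defaultdict(float)
    for lat, lon, severity in detections:
        key = cell_index(lat, lon, ref_lat, ref_lon)
        grid[key] = max(grid[key], severity)
    return dict(grid)


# Made-up detections: (latitude, longitude, severity)
detections = [
    (24.7136, 46.6753, 0.9),
    (24.71361, 46.67531, 0.4),  # about one metre away, lands in the same cell
    (24.7200, 46.6753, 0.2),    # roughly 700 m further north
]
heatmap = severity_heatmap(detections, ref_lat=24.7136, ref_lon=46.6753)
```

The resulting dictionary maps grid cells to a severity value and could be rendered as coloured tiles on top of any map provider.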

Screenshots

obj_detection segmented_overlay augmented_reality

How to run

Install dependencies:

poetry install

Load the data into data/ folder.

Semantic segmentation:

poetry run python3 src/sem_seg_dpt.py

Depth estimation:

poetry run python3 src/depth_est_dpt.py
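For orientation, a minimal sketch of monocular depth estimation with a DPT checkpoint through the Hugging Face `transformers` pipeline. This is not necessarily what `src/depth_est_dpt.py` does: the checkpoint name `Intel/dpt-large` and both function names are assumptions, and the heavy import is deferred so the helper can be used without the model installed.

```python
def normalize_depth(values):
    """Rescale raw relative depth values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def estimate_depth(image_path, model_name="Intel/dpt-large"):
    """Run monocular depth estimation (needs transformers, torch and Pillow).

    The checkpoint name is an assumption; the result is relative, not metric, depth.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    depth_pipe = pipeline("depth-estimation", model=model_name)
    result = depth_pipe(image_path)
    return result["predicted_depth"]  # tensor of relative depth values


# The helper works on any flat list of depth values:
scaled = normalize_depth([2.0, 4.0, 6.0])
```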

Dataset

https://smartathon.hackerearth.com/

Approach

  1. Semantic Segmentation with NVIDIA VIT and DPTlarge-ade (https://huggingface.co/Intel/dpt-large-ade). ✅
  2. Image depth estimation with MiDaS and DPTlarge - Tesla was the first to replace Lidar with monocular depth estimation! ✅
  3. LabelStudio for manual annotation. ✅
  4. Pothole detection with YOLOv8 - fine-tuned on manually annotated dataset. ✅
  5. Classic CV: Canny edge detection, Hough transform, Laplacian of Gaussian.
  6. Fancy Video:
    1. Overlay with segmentation. ✅
    2. Depth indicator with distance measure.
    3. Increasing pothole count.
    4. Pothole bounding box and severity score.
    5. Street damage barometer.
  7. Demo Video.
  8. Write-up in paper style.
  9. Submit!
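Steps 5 and 6.4 of the list above can be sketched together: a Laplacian response measures how rough the surface looks inside a bounding box, and the depth relief measures how deep the damage is. The example is dependency-free and purely illustrative; the equal weighting, the function names and the tiny patches are assumptions, and the Gaussian smoothing of a full Laplacian of Gaussian is omitted for brevity.

```python
def laplacian_response(gray):
    """Mean absolute 4-neighbour Laplacian over the interior pixels of a 2D grid."""
    height, width = len(gray), len(gray[0])
    total, count = 0.0, 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            total += abs(lap)
            count += 1
    return total / count if count else 0.0


def severity_score(gray_patch, depth_patch, edge_weight=0.5):
    """Blend edge strength and depth relief into a score in [0, 1].

    gray_patch: 8-bit grayscale values inside the bounding box.
    depth_patch: depth values for the same pixels, assumed normalized to [0, 1].
    edge_weight: hypothetical weighting, not a value from the project.
    """
    edges = min(laplacian_response(gray_patch) / 255.0, 1.0)
    flat = [d for row in depth_patch for d in row]
    relief = max(flat) - min(flat)
    return edge_weight * edges + (1.0 - edge_weight) * relief


# Intact road: uniform brightness and uniform depth.
flat_road = severity_score([[100] * 3] * 3, [[0.5] * 3] * 3)

# Toy pothole: dark centre pixel that is also deeper than its surroundings.
pothole = severity_score(
    [[100, 100, 100], [100, 0, 100], [100, 100, 100]],
    [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]],
)
```

In practice the same idea would normally be written with OpenCV or NumPy on full-resolution crops.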

Label Studio

Manual labeling was done in Label Studio. The 'docker-compose.yml' file is adapted from the labelImg repo. To start Label Studio, create the required folder structure with g+rw permissions:

  • data
    • import
    • export
    • media
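The folder structure above can be created with a few lines of Python. This is a sketch under assumptions: the function name is hypothetical, and the `root` argument stands for whichever directory the 'docker-compose.yml' file expects the `data` folder in.

```python
import stat
from pathlib import Path


def create_label_studio_dirs(root):
    """Create data/import, data/export and data/media with group read/write access."""
    base = Path(root) / "data"
    created = []
    for name in ("import", "export", "media"):
        folder = base / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    # Add g+rw on top of whatever permissions the folders already have.
    for folder in [base, *created]:
        mode = folder.stat().st_mode
        folder.chmod(mode | stat.S_IRGRP | stat.S_IWGRP)
    return created
```

Calling `create_label_studio_dirs("labelImg")` would place the folders next to the compose file, assuming that is where it lives.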

Then run:

cd labelImg && docker-compose up

Open the browser at localhost:8080, set up the project and import the images for labeling.

~30 images were annotated for 3 classes:

  • pothole
  • streetdamage
  • sand on road

image

The annotations are exported in COCO format.
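YOLO-style training expects one text file per image with normalized, centre-based boxes, whereas COCO stores absolute `[x, y, width, height]` boxes anchored at the top-left corner. A minimal conversion sketch follows; the function names and the sample annotation are made up for illustration, and the class index is simply the rank of the COCO category id.

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO [x, y, w, h] box (pixels) to YOLO (cx, cy, w, h) in [0, 1]."""
    x, y, w, h = bbox
    return ((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)


def convert(coco):
    """Return {file_name: [label lines]} from a parsed COCO annotation dict."""
    sizes = {img["id"]: (img["width"], img["height"]) for img in coco["images"]}
    names = {img["id"]: img["file_name"] for img in coco["images"]}
    cat_ids = sorted(cat["id"] for cat in coco["categories"])
    class_index = {cat_id: i for i, cat_id in enumerate(cat_ids)}

    labels = {}
    for ann in coco["annotations"]:
        img_w, img_h = sizes[ann["image_id"]]
        cx, cy, bw, bh = coco_to_yolo(ann["bbox"], img_w, img_h)
        line = f"{class_index[ann['category_id']]} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}"
        labels.setdefault(names[ann["image_id"]], []).append(line)
    return labels


# Made-up example with a single pothole annotation.
sample = {
    "images": [{"id": 1, "file_name": "0024.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 0, "name": "pothole"}],
    "annotations": [{"image_id": 1, "category_id": 0, "bbox": [160, 120, 320, 240]}],
}
labels = convert(sample)
```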

Fine-tuning YOLOv8

Follow these instructions to save your labeled data and train a custom model.

We use the YOLOv8 model.
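A hedged sketch of what fine-tuning could look like with the `ultralytics` package. The dataset paths, the `yolov8n.pt` starting weights, the epoch count and both function names are assumptions, not the settings used for this project; only the three class names come from the annotation step.

```python
CLASS_NAMES = ["pothole", "streetdamage", "sand on road"]


def dataset_yaml(root, train="images/train", val="images/val", names=CLASS_NAMES):
    """Build the text of a YOLO dataset config (paths are placeholders)."""
    lines = [f"path: {root}", f"train: {train}", f"val: {val}", "names:"]
    lines += [f"  {i}: {name}" for i, name in enumerate(names)]
    return "\n".join(lines) + "\n"


def fine_tune(data_yaml_path, weights="yolov8n.pt", epochs=50, imgsz=640):
    """Fine-tune a pretrained YOLOv8 model (requires the ultralytics package)."""
    from ultralytics import YOLO  # imported lazily: optional heavy dependency

    model = YOLO(weights)
    return model.train(data=data_yaml_path, epochs=epochs, imgsz=imgsz)


config_text = dataset_yaml("datasets/eagle")  # hypothetical dataset location
```

Writing `config_text` to a file such as `data.yaml` and passing its path to `fine_tune` would start a training run.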

Remarks

Experimenting with image pre-processing techniques reveals that reducing the exposure of the image improves the prediction quality of both the semantic segmentation and the depth estimation models. This might be because the road darkens more drastically than buildings, peripheral objects and sky, or because of the notorious over-exposure of desert scenery.

0024contrast

0024exposure
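The exposure adjustment can be approximated by scaling pixel intensities. The sketch below is an assumption about how such a step might look, not the pre-processing actually used: it scales 8-bit values linearly, halving them for every photographic stop, and the function name is hypothetical.

```python
def reduce_exposure(pixels, stops=1.0):
    """Darken 8-bit RGB pixels; each stop halves the intensity (linear approximation)."""
    gain = 2.0 ** (-stops)
    return [
        [tuple(min(255, round(channel * gain)) for channel in pixel) for pixel in row]
        for row in pixels
    ]


# One-row toy image with two RGB pixels.
image = [[(200, 180, 160), (255, 255, 255)]]
darker = reduce_exposure(image, stops=1.0)
```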

CUDA

When working on WSL2, CUDA is not readily available. Install it from the official website, then apply this tweak to your shell's profile.

Congrats, you are squared away with CUDA!
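One way to confirm that the setup worked is to ask PyTorch whether it can see a GPU. This check assumes PyTorch is the framework in use; the function name is hypothetical, and it simply reports `False` when PyTorch is not installed.

```python
def cuda_available():
    """Return True if PyTorch is installed and can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


has_gpu = cuda_available()
print("CUDA available:", has_gpu)
```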

Resources

Pretrained models

DPT paper

About

License: MIT License


Languages

Jupyter Notebook 97.3%, Python 2.7%