
Drone Semantic Segmentation

Deep Learning project for binary and multiclass semantic segmentation on drone imagery.
Developed as part of the SICOM S9 "Acceleration Material" course.

📌 Overview

This repository provides a complete pipeline for semantic segmentation on drone-acquired datasets using state-of-the-art models such as UNet, SegFormer, and UFormer.
It supports training, evaluation, prediction visualization, and distributed training (e.g. on the Gricad cluster).


πŸ“ Project Structure

semantic-segmentation-drone-data/
├── src/
│   ├── python/
│   │   ├── droneDataset.py     # Dataset and preprocessing logic
│   │   ├── metrics.py          # Metrics: PA, MPA, IoU, mIoU
│   │   ├── model.py            # Model definitions (UNet, SegFormer, UFormer)
│   │   ├── trainer.py          # Training, validation, and testing logic
│   │   └── vizualization.py    # Visualization utilities
│   ├── get_curves.py           # Plot training curves from CSV logs
│   ├── main.py                 # Train/validate/test a model
│   └── predict.py              # Generate predictions from a trained model
├── outputs/
│   ├── MultiUnet/
│   │   └── predictions.zip
│   └── SegFormer/
│       └── predictions.zip
├── config.yaml                 # Main configuration file
└── README.md                   # Project documentation
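
For reference, the metrics listed for metrics.py (PA, MPA, IoU, mIoU) can all be derived from a per-class confusion matrix. The snippet below is a minimal illustrative sketch of that computation, not necessarily the exact implementation in src/python/metrics.py:

# Illustrative PA / IoU / mIoU computation from a confusion matrix
# (a sketch only; not necessarily what src/python/metrics.py does).
import numpy as np

def pixel_accuracy(confusion: np.ndarray) -> float:
    return float(np.diag(confusion).sum() / np.maximum(confusion.sum(), 1e-9))

def iou_per_class(confusion: np.ndarray) -> np.ndarray:
    """confusion[i, j] = number of pixels of true class i predicted as class j."""
    tp = np.diag(confusion).astype(float)
    fp = confusion.sum(axis=0) - tp
    fn = confusion.sum(axis=1) - tp
    return tp / np.maximum(tp + fp + fn, 1e-9)

def mean_iou(confusion: np.ndarray) -> float:
    return float(iou_per_class(confusion).mean())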

🚀 Getting Started

1. Clone the Repository

git clone https://github.com/your-username/semantic-segmentation-drone-data.git
cd semantic-segmentation-drone-data

2. Install Dependencies

Make sure you're using Python ≥3.8 and a virtual environment:

python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -r requirements.txt

⚙️ Configuration

Update the config.yaml file to modify:

  • Dataset paths
  • Model architecture (UNet, SegFormer, etc.)
  • Training hyperparameters
  • Output paths
  • Distributed training settings
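
For illustration, the scripts can be expected to read this file with PyYAML. The sketch below is an assumption about usage; only the distributed.active flag is referenced elsewhere in this README, and the access pattern is otherwise hypothetical:

# Minimal sketch of loading config.yaml (assumes PyYAML; key access is illustrative).
import yaml

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

# distributed.active is the flag referenced below: 0 for local runs, 1 for multi-GPU.
if cfg["distributed"]["active"]:
    print("Distributed training enabled")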

📈 Usage

πŸ‹οΈ Train a model

python ./src/main.py

Make sure distributed.active is set to 0 in config.yaml when running locally.
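
For orientation only, a minimal single-GPU training step for a multiclass segmentation model could look like the sketch below. It uses segmentation_models_pytorch (listed in the requirements); the class count, encoder, and tensor shapes are placeholders, and this is not the repo's actual trainer.py logic:

# Minimal single-GPU training-step sketch (illustrative; not the repo's trainer.py).
import torch
import segmentation_models_pytorch as smp

NUM_CLASSES = 6  # placeholder; take the real value from config.yaml
device = "cuda" if torch.cuda.is_available() else "cpu"

model = smp.Unet(encoder_name="resnet34", encoder_weights=None,
                 in_channels=3, classes=NUM_CLASSES).to(device)
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

images = torch.randn(4, 3, 256, 256, device=device)                  # dummy image batch
masks = torch.randint(0, NUM_CLASSES, (4, 256, 256), device=device)  # dummy label masks

optimizer.zero_grad()
logits = model(images)            # (batch, NUM_CLASSES, H, W)
loss = criterion(logits, masks)
loss.backward()
optimizer.step()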

πŸ” Make predictions

python ./src/predict.py

Update the model checkpoint path in config.yaml.
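
As an illustration of what prediction involves, a minimal inference sketch is shown below; the checkpoint path, model construction, and class count are assumptions rather than the actual predict.py logic:

# Minimal inference sketch (illustrative; paths and model setup are hypothetical).
import torch
import segmentation_models_pytorch as smp

NUM_CLASSES = 6  # placeholder; take the real value from config.yaml

def predict(checkpoint_path: str, image: torch.Tensor) -> torch.Tensor:
    """Return an (H, W) mask of class indices for a (3, H, W) float image tensor."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = smp.Unet(encoder_name="resnet34", encoder_weights=None,
                     in_channels=3, classes=NUM_CLASSES).to(device)
    model.load_state_dict(torch.load(checkpoint_path, map_location=device))
    model.eval()
    with torch.no_grad():
        logits = model(image.unsqueeze(0).to(device))
    return logits.argmax(dim=1).squeeze(0).cpu()

# Example usage (hypothetical checkpoint path):
# mask = predict("outputs/MultiUnet/model.pt", torch.rand(3, 512, 512))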

📊 Plot learning curves

python ./src/get_curves.py
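
For illustration, plotting curves from a CSV training log could look like the sketch below; the log path and column names are hypothetical, not the repo's actual schema:

# Sketch of plotting training curves from a CSV log (path and columns are hypothetical).
import pandas as pd
import matplotlib.pyplot as plt

log = pd.read_csv("outputs/MultiUnet/train_log.csv")
plt.plot(log["epoch"], log["train_loss"], label="train loss")
plt.plot(log["epoch"], log["val_loss"], label="val loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.savefig("outputs/MultiUnet/curves.png")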

🖥️ Distributed Training on Gricad

🔧 Setup

  1. Enable distributed training:

    distributed:
      active: 1
  2. Create a .sh script with the following:

    export CUDA_VISIBLE_DEVICES=0,1,2,3
    torchrun --nproc_per_node=4 src/main.py

Use localhost as the master node (Gricad allocates it). If a port conflict occurs, change the port manually.
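
For context, the kind of process-group setup that a torchrun launch implies is sketched below; this is the generic PyTorch DDP pattern, not necessarily how main.py is structured:

# Generic DDP setup sketch for a torchrun launch (illustrative; not the repo's main.py).
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def setup_ddp(model: torch.nn.Module) -> torch.nn.Module:
    # torchrun sets MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE, and LOCAL_RANK.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    return DDP(model.cuda(local_rank), device_ids=[local_rank])

With this pattern, each process also needs a torch.utils.data.DistributedSampler on its DataLoader so that ranks see disjoint batches.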


📦 Requirements

All dependencies are listed in requirements.txt.

Key packages include:

  • PyTorch
  • torchvision
  • segmentation_models_pytorch
  • transformers
  • matplotlib / plotly
  • PyYAML / pandas / PIL

📌 Notes

  • Prediction and curve-plotting scripts must be run with distributed mode off.
  • Some models require additional install steps depending on your PyTorch version (e.g. the Hugging Face SegFormer).
  • Set TF_ENABLE_ONEDNN_OPTS=0 for compatibility when using CPU backends (see the snippet below).
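
A minimal way to apply that last setting from Python, before importing the affected libraries, is sketched here:

# Disable oneDNN optimizations for CPU runs (set before importing the affected libraries).
import os
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "0"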

📷 Results

Below is a sample prediction result from the MultiUNet model on the drone dataset:

MultiUNet Prediction

The UNet model successfully segments large and clearly defined classes such as moving objects and landable areas. However, it shows limitations with smaller or less contrasted elements, sometimes misclassifying obstacles or blending class boundaries. This result highlights UNet's solid performance in general structure recognition, but also its relative weakness in fine-grained or context-dependent segmentation tasks.

Below is a sample prediction result from the SegFormer model:

SegFormer Prediction

This SegFormer prediction shows robust performance on large and structured regions, especially the water body, which is segmented with high precision. The model also correctly identifies many surrounding obstacles and patches of nature, even in complex, cluttered urban scenery. While some minor confusion persists between moving and obstacle classes in dense zones, the overall segmentation is consistent and well-aligned with the ground truth. This reflects SegFormer's strong ability to model long-range dependencies and handle heterogeneous scenes with fine details.


🧑‍💻 Contributors

  • Louis Carron (Main Developer)
