Deep Learning project for binary and multiclass semantic segmentation on drone imagery.
Developed as part of the SICOM S9 "Acceleration Material" course.
This repository provides a complete pipeline for semantic segmentation on drone-acquired datasets using state-of-the-art models like UNet, SegFormer, and UFormer.
It supports training, evaluation, prediction visualization, and distributed training (for Gricad cluster).
semantic-segmentation-drone-data/
├── src/
│   ├── python/
│   │   ├── droneDataset.py      # Dataset and preprocessing logic
│   │   ├── metrics.py           # Metrics: PA, MPA, IoU, mIoU
│   │   ├── model.py             # Model definitions (UNet, SegFormer, UFormer)
│   │   ├── trainer.py           # Training, validation, and testing logic
│   │   └── vizualization.py     # Visualization utilities
│   ├── get_curves.py            # Plot training curves from CSV logs
│   ├── main.py                  # Train/validate/test a model
│   └── predict.py               # Generate predictions from a trained model
├── outputs/
│   ├── MultiUnet/
│   │   └── predictions.zip
│   └── SegFormer/
│       └── predictions.zip
├── config.yaml                  # Main configuration file
└── README.md                    # Project documentation
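The metrics listed above for `metrics.py` (PA, MPA, IoU, mIoU) are standard segmentation metrics. As an illustrative sketch (not the repository's actual code), they can be derived from a per-class confusion matrix:

```python
# Illustrative sketch of segmentation metrics; the real metrics.py may differ.
import numpy as np

def confusion(pred, gt, num_classes):
    # Rows index the ground-truth class, columns the predicted class:
    # bincount over (gt * C + pred) yields a C x C confusion matrix.
    idx = gt.reshape(-1) * num_classes + pred.reshape(-1)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def pixel_accuracy(cm):
    # PA: fraction of correctly labeled pixels
    return np.diag(cm).sum() / cm.sum()

def mean_iou(cm):
    # mIoU: intersection-over-union averaged over classes present in the data
    inter = np.diag(cm).astype(float)
    union = cm.sum(axis=0) + cm.sum(axis=1) - inter
    ious = np.where(union > 0, inter / np.maximum(union, 1), np.nan)
    return np.nanmean(ious)

pred = np.array([[0, 1], [1, 1]])
gt   = np.array([[0, 1], [0, 1]])
cm = confusion(pred, gt, num_classes=2)
```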
git clone https://github.com/your-username/semantic-segmentation-drone-data.git
cd semantic-segmentation-drone-data
Make sure you're using Python ≥ 3.8 and a virtual environment:
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install -r requirements.txt
Update the `config.yaml` file to modify:
- Dataset paths
- Model architecture (`UNet`, `SegFormer`, etc.)
- Training hyperparameters
- Output paths
- Distributed training settings
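The exact keys depend on the repository's `config.yaml`; a hypothetical layout covering the sections above might look like:

```yaml
# Hypothetical layout — check the actual config.yaml for the real key names.
dataset:
  images: /path/to/drone/images
  masks: /path/to/drone/masks
model:
  name: UNet            # or SegFormer, UFormer
training:
  epochs: 50
  batch_size: 8
  lr: 1.0e-4
output:
  dir: ./outputs
distributed:
  active: 0             # set to 1 for multi-GPU runs on the Gricad cluster
```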
python ./src/main.py
Make sure `distributed: active` is disabled in `config.yaml` if running locally.
python ./src/predict.py
Update the model checkpoint path in `config.yaml`.
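Internally, generating a prediction image boils down to taking the argmax over per-class logits and mapping class indices to colors. A minimal sketch of that step (names and palette here are illustrative, not the repo's API):

```python
# Hedged sketch of turning model logits into an RGB prediction mask,
# as predict.py presumably does. The palette and class count are made up.
import numpy as np

PALETTE = np.array(
    [[0, 0, 0], [0, 128, 0], [0, 0, 255], [255, 0, 0]],  # 4 hypothetical classes
    dtype=np.uint8,
)

def logits_to_mask(logits):
    # logits: (C, H, W) array of per-class scores
    classes = logits.argmax(axis=0)   # (H, W) class-index map
    return PALETTE[classes]           # (H, W, 3) RGB mask

logits = np.random.rand(4, 8, 8)
mask = logits_to_mask(logits)
```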
python ./src/get_curves.py
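`get_curves.py` plots training curves from the CSV logs; its exact column names aren't shown here. The core idea, reading a log with pandas and plotting loss and mIoU with matplotlib, can be sketched as (column names are assumptions):

```python
# Hedged sketch of plotting training curves from a CSV log. The real
# get_curves.py may use different column names and styling.
import io
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend, also works on a cluster node
import matplotlib.pyplot as plt

# Stand-in for a real CSV log file produced during training
csv_log = io.StringIO(
    "epoch,train_loss,val_loss,val_miou\n"
    "1,0.91,0.88,0.31\n"
    "2,0.64,0.70,0.42\n"
    "3,0.48,0.61,0.47\n"
)
df = pd.read_csv(csv_log)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(df["epoch"], df["train_loss"], label="train loss")
ax1.plot(df["epoch"], df["val_loss"], label="val loss")
ax1.set_xlabel("epoch")
ax1.legend()
ax2.plot(df["epoch"], df["val_miou"], label="val mIoU")
ax2.set_xlabel("epoch")
ax2.legend()
fig.savefig("curves.png")
```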
- Enable distributed training in `config.yaml`: `distributed: active: 1`
- Create a `.sh` launch script containing `export CUDA_VISIBLE_DEVICES=0,1,2,3` followed by `torchrun --nproc_per_node=4 src/main.py`
- Use `localhost` as the master node (Gricad allocates it). If port conflicts occur, change the port manually.
All dependencies are listed in `requirements.txt`.
Key packages include:
- PyTorch
- torchvision
- segmentation_models_pytorch
- transformers
- matplotlib / plotly
- PyYAML / pandas / PIL
- Predictions and curves must be run with distributed mode off.
- Some models require custom install steps depending on your PyTorch version (e.g. for HuggingFace SegFormer).
- Set `TF_ENABLE_ONEDNN_OPTS=0` for compatibility when using CPU backends.
Below is a sample prediction result from the MultiUNet model on the drone dataset:
The UNet model successfully segments large and clearly defined classes such as moving objects and landable areas. However, it shows limitations with smaller or less contrasted elements, sometimes misclassifying obstacles or blending class boundaries. This result highlights UNet's solid performance in general structure recognition, but also its relative weakness in fine-grained or context-dependent segmentation tasks.
A sample prediction result from the SegFormer model
This SegFormer prediction shows robust performance on large and structured regions, especially the water body, which is segmented with high precision. The model also correctly identifies many surrounding obstacles and patches of nature, even in complex, cluttered urban scenery. While some minor confusion persists between moving and obstacle classes in dense zones, the overall segmentation is consistent and well-aligned with the ground truth. This reflects SegFormer's strong ability to model long-range dependencies and handle heterogeneous scenes with fine details.
- Louis Carron - Main Developer