VSydorskyy / hubmap_2022_htt_solution

Codebase for HuBMAP + HPA - Hacking the Human Body: Human Torus Team solution (3rd Place)

Setting up environment

conda env create -f environment.yml
pip install -e . --no-deps

Prepare Data and CV split

cd data
kaggle competitions download -c hubmap-organ-segmentation
unzip hubmap-organ-segmentation.zip
cd ../
python scripts/create_cv_split.py data/train.csv --save_path data/cv_split5_v2.npy
Download the additional HPA data into the data/hpa folder and unzip all archives
Download all additional GTEX data into data/gtex:
- https://www.kaggle.com/datasets/sakvaua/gtex-pseudo-humantorusteam
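The internals of scripts/create_cv_split.py are not described in this README, so the following is only a minimal sketch of how a 5-fold split stratified by organ could be built. The helper name make_stratified_folds, the round-robin assignment, and the idea of stratifying on an organ label are assumptions made for illustration, not the repository's implementation.

```python
import random
from collections import defaultdict


def make_stratified_folds(labels, n_folds=5, seed=42):
    """Return a list of (train_idx, val_idx) pairs, stratified by label.

    Hypothetical helper: samples of each label are shuffled and dealt
    round-robin into folds, so every fold gets a similar label mix.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    fold_members = [[] for _ in range(n_folds)]
    offset = 0
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            fold_members[(position + offset) % n_folds].append(idx)
        offset += len(indices)  # continue dealing where the last label stopped

    all_indices = set(range(len(labels)))
    folds = []
    for members in fold_members:
        val_idx = sorted(members)
        train_idx = sorted(all_indices - set(members))
        folds.append((train_idx, val_idx))
    return folds


# Toy labels; in practice they would be read from a column of data/train.csv.
organs = ["kidney"] * 10 + ["lung"] * 5 + ["spleen"] * 5
for fold_id, (train_idx, val_idx) in enumerate(make_stratified_folds(organs)):
    print(fold_id, len(train_idx), len(val_idx))
```

The real script writes its result to the path given by --save_path; how it serializes the folds is not shown here.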

Train First models without Pseudo

CUDA_VISIBLE_DEVICES="{gpu_num}" python scripts/main_train.py train_configs/unet_no_pseudo.py
CUDA_VISIBLE_DEVICES="{gpu_num}" python scripts/main_train.py train_configs/unetpp_no_pseudo.py

Perform Inference (Optional)

Find the experiment names in logdirs
Open notebooks/inference_notebook.ipynb and set EXP_NAME and CONFIG to match unet_no_pseudo.py (or unetpp_no_pseudo.py)
Run the notebook to get OOF metrics and other evaluation results
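For orientation, the competition scores submissions with the Dice coefficient. The snippet below is a minimal sketch of an out-of-fold (OOF) Dice computation on binary masks; the function names and the nested-list mask representation are assumptions for illustration, and the notebook itself may compute its metrics differently.

```python
def dice_score(pred_mask, true_mask):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    pred_flat = [p for row in pred_mask for p in row]
    true_flat = [t for row in true_mask for t in row]
    if len(pred_flat) != len(true_flat):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred_flat, true_flat))
    total = sum(pred_flat) + sum(true_flat)
    if total == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return 2.0 * intersection / total


def mean_oof_dice(pairs):
    """Average Dice over (pred_mask, true_mask) pairs collected from all folds."""
    scores = [dice_score(pred, true) for pred, true in pairs]
    return sum(scores) / len(scores)


# Toy example: one pixel overlaps, three foreground pixels in total.
pred = [[1, 1], [0, 0]]
true = [[1, 0], [0, 0]]
print(round(dice_score(pred, true), 4))
```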

Create Pseudo Labels (Optional)

Use notebooks/create_pseudo_gtex.ipynb (for GTEX) and notebooks/create_pseudo.ipynb (for HPA) to create pseudo labels. Re-run the notebook for each organ, changing the organ path and name in the config.

Download Pseudo Labels

You can download the pseudo labels directly from Kaggle: https://www.kaggle.com/datasets/vladimirsydor/hubmap-2022-add-data-labels-v2 . The dataset contains them in both .csv (RLE) and .png (soft) formats.
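To work with the .csv variant, the run-length strings have to be turned back into masks. The sketch below assumes the usual Kaggle segmentation convention (space-separated start/length pairs, 1-indexed starts, column-major pixel order); check this against the dataset before relying on it. The function names are hypothetical.

```python
def rle_decode(rle, height, width):
    """Decode 'start length start length ...' into a height x width 0/1 mask.

    Assumes 1-indexed starts and column-major pixel order
    (top to bottom, then left to right).
    """
    flat = [0] * (height * width)
    tokens = [int(t) for t in rle.split()]
    for start, length in zip(tokens[0::2], tokens[1::2]):
        for pos in range(start - 1, start - 1 + length):
            flat[pos] = 1
    # flat is column-major: pixel (row, col) lives at col * height + row
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


def rle_encode(mask):
    """Inverse of rle_decode for a nested-list 0/1 mask."""
    height, width = len(mask), len(mask[0])
    flat = [mask[row][col] for col in range(width) for row in range(height)]
    runs = []
    run_start = None
    for pos, value in enumerate(flat):
        if value and run_start is None:
            run_start = pos
        elif not value and run_start is not None:
            runs += [run_start + 1, pos - run_start]
            run_start = None
    if run_start is not None:  # run reaches the last pixel
        runs += [run_start + 1, len(flat) - run_start]
    return " ".join(str(r) for r in runs)


mask = [[0, 1], [1, 1], [0, 0]]
encoded = rle_encode(mask)
print(encoded)
print(rle_decode(encoded, height=3, width=2) == mask)
```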

Create final Pseudo datasets

HPA

Use notebooks/prepare_pseudo_train_data.ipynb to aggregate all HPA pseudo labels into one dataframe and folder. V2 refers to the first pseudo-labeling iteration (K=1), and V3 (V3 Full) refers to the second pseudo-labeling iteration (K=2).

GTEX

Use the same notebook, notebooks/prepare_pseudo_train_data.ipynb, to aggregate all GTEX pseudo labels into one dataframe and folder. Simply change data/hpa to data/gtex.

Train Final models with Pseudo

CUDA_VISIBLE_DEVICES="{gpu_num}" python scripts/main_train.py train_configs/unet.py
CUDA_VISIBLE_DEVICES="{gpu_num}" python scripts/main_train.py train_configs/unetmitb3.py
CUDA_VISIBLE_DEVICES="{gpu_num}" python scripts/main_train.py train_configs/unetmitb5.py
CUDA_VISIBLE_DEVICES="{gpu_num}" python scripts/main_train.py train_configs/unetpp.py
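If you prefer to queue the four runs instead of starting them by hand, a small launcher along these lines could work. It is a sketch, not part of the repository: the names FINAL_CONFIGS, build_command and run_all are hypothetical, and it only wraps the commands listed above. By default it prints the commands without executing them.

```python
import os
import subprocess

# Config paths copied from the training commands in this README.
FINAL_CONFIGS = [
    "train_configs/unet.py",
    "train_configs/unetmitb3.py",
    "train_configs/unetmitb5.py",
    "train_configs/unetpp.py",
]


def build_command(config_path):
    """Build the argument list for one training run."""
    return ["python", "scripts/main_train.py", config_path]


def run_all(configs, gpu_num, dry_run=True):
    """Run (or just print) the training commands one after another."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_num))
    commands = [build_command(config) for config in configs]
    for command in commands:
        if dry_run:
            print(f'CUDA_VISIBLE_DEVICES="{gpu_num}" ' + " ".join(command))
        else:
            # check=True stops the queue if a run exits with an error
            subprocess.run(command, env=env, check=True)
    return commands


if __name__ == "__main__":
    run_all(FINAL_CONFIGS, gpu_num=0, dry_run=True)
```

Call run_all with dry_run=False from the repository root to actually start training.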

Evaluate ensemble

Use notebooks/inference_notebook_ensem.ipynb to evaluate the ensemble. Define EXP_NAME, CONFIG and ADITIONAL_MODELS using the experiments trained in the previous steps (see logdirs).
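This README does not say how the notebook combines the models. One common approach, shown here purely as an assumption, is to average the soft masks of all models pixel by pixel and then threshold the result. The function names and the 0.5 threshold are illustrative choices.

```python
def average_predictions(model_predictions):
    """Pixel-wise mean of soft masks (nested lists of floats in [0, 1]).

    model_predictions holds one mask per model, all of the same shape.
    """
    n_models = len(model_predictions)
    height = len(model_predictions[0])
    width = len(model_predictions[0][0])
    return [
        [sum(pred[r][c] for pred in model_predictions) / n_models
         for c in range(width)]
        for r in range(height)
    ]


def binarize(soft_mask, threshold=0.5):
    """Turn a soft mask into a 0/1 mask (threshold value is an assumption)."""
    return [[1 if p > threshold else 0 for p in row] for row in soft_mask]


# Toy example with two models and a 1 x 2 image.
unet_pred = [[0.9, 0.2]]
unetpp_pred = [[0.7, 0.6]]
ensemble = average_predictions([unet_pred, unetpp_pred])
print(binarize(ensemble))
```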
