# Codebase for HuBMAP + HPA - Hacking the Human Body: Human Torus Team solution (3rd Place)
- `conda env create -f environment.yml`
- `pip install -e . --no-deps`
- `cd data`
- `kaggle competitions download -c hubmap-organ-segmentation`
- `unzip hubmap-organ-segmentation.zip`
- `cd ../`
- `python scripts/create_cv_split.py data/train.csv --save_path data/cv_split5_v2.npy`
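The actual split logic lives in `scripts/create_cv_split.py`; as a rough, hypothetical sketch of what a 5-fold split stratified by organ might look like (the column name `organ` and the stratification scheme are assumptions, not the script's verified behavior):

```python
import numpy as np
import pandas as pd

def make_cv_split(df: pd.DataFrame, n_folds: int = 5, seed: int = 42) -> np.ndarray:
    """Assign each row a fold id, spreading each organ evenly across folds.

    NOTE: illustrative sketch only; the repo's create_cv_split.py may use a
    different strategy. Assumes df has an `organ` column.
    """
    rng = np.random.default_rng(seed)
    folds = np.zeros(len(df), dtype=np.int64)
    for _, idx in df.groupby("organ").indices.items():
        idx = rng.permutation(idx)           # shuffle rows within the organ
        folds[idx] = np.arange(len(idx)) % n_folds  # round-robin fold assignment
    return folds

# e.g. np.save("data/cv_split5_v2.npy", make_cv_split(pd.read_csv("data/train.csv")))
```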
- Download the HPA additional data to the `data/hpa` folder and unzip it all:
- https://www.kaggle.com/datasets/igorkrashenyi/liver-hpa-pt0
- https://www.kaggle.com/datasets/igorkrashenyi/liver-hpa-pt2
- https://www.kaggle.com/datasets/igorkrashenyi/liver-hpa-pt1
- https://www.kaggle.com/datasets/igorkrashenyi/hap-kidney-dataset-pt1
- https://www.kaggle.com/datasets/igorkrashenyi/kidney-hpa-dataset-pt0
- https://www.kaggle.com/datasets/igorkrashenyi/hpa-colon-dataset
- https://www.kaggle.com/datasets/igorkrashenyi/hpa-spleen-dataset-pt1
- https://www.kaggle.com/datasets/igorkrashenyi/hpa-spleen-dataset-pt0
- https://www.kaggle.com/datasets/igorkrashenyi/hpa-prostate-dataset
- https://www.kaggle.com/datasets/igorkrashenyi/lung-hpa-dataset
- Download all GTEX additional data into the `data/gtex` folder
- `CUDA_VISIBLE_DEVICES="{gpu_num}" python scripts/main_train.py train_configs/unet_no_pseudo.py`
- `CUDA_VISIBLE_DEVICES="{gpu_num}" python scripts/main_train.py train_configs/unetpp_no_pseudo.py`
- Find the experiment names in `logdirs`
- Use `notebooks/inference_notebook.ipynb`: set `EXP_NAME` and `CONFIG` from `unet_no_pseudo.py` (or `unetpp_no_pseudo.py`)
- Run the notebook to get OOF metrics and other evaluation results
- Use `notebooks/create_pseudo_gtex.ipynb` and `notebooks/create_pseudo.ipynb` to create pseudo labels: `create_pseudo_gtex` is for GTEX and `create_pseudo` for HPA. For each organ, re-run the notebook after changing the organ path and name in the config.
- Alternatively, you can download the pseudo labels directly from Kaggle: https://www.kaggle.com/datasets/vladimirsydor/hubmap-2022-add-data-labels-v2 . It contains them in both .csv (RLE) and .png (soft) formats.
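Since the pseudo labels above ship as RLE strings in .csv files, a minimal sketch of Kaggle-style RLE encoding/decoding (column-major order, 1-indexed starts; the repo's own mask utilities may differ):

```python
import numpy as np

def rle_encode(mask: np.ndarray) -> str:
    """Encode a binary mask as a Kaggle-style RLE string (column-major, 1-indexed)."""
    pixels = mask.flatten(order="F")
    pixels = np.concatenate([[0], pixels, [0]])           # pad so edge runs are detected
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1     # 1-indexed run boundaries
    runs[1::2] -= runs[::2]                               # convert end positions to lengths
    return " ".join(str(x) for x in runs)

def rle_decode(rle: str, shape: tuple) -> np.ndarray:
    """Decode a Kaggle-style RLE string back into a binary mask of `shape`."""
    s = rle.split()
    starts = np.asarray(s[0::2], dtype=int) - 1           # back to 0-indexed
    lengths = np.asarray(s[1::2], dtype=int)
    mask = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for start, length in zip(starts, lengths):
        mask[start:start + length] = 1
    return mask.reshape(shape, order="F")
```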
- Use `notebooks/prepare_pseudo_train_data.ipynb` to aggregate all HPA pseudo labels into one dataframe and folder. V2 refers to the first pseudo-labeling iteration (K=1) and V3 (V3 Full) to the second (K=2).
- Use `notebooks/prepare_pseudo_train_data.ipynb` in the same way to aggregate the GTEX pseudo labels: simply change `data/hpa` to `data/gtex`.
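As a rough sketch of what such an aggregation step amounts to (the `id`/`rle` column names and per-organ CSV layout here are illustrative assumptions, not the notebook's actual schema):

```python
from pathlib import Path
import pandas as pd

def aggregate_pseudo(csv_dir: str) -> pd.DataFrame:
    """Concatenate per-organ pseudo-label CSVs into one dataframe.

    NOTE: illustrative sketch; assumes each CSV holds one organ's pseudo
    labels. A `source` column records which file each row came from.
    """
    frames = [
        pd.read_csv(path).assign(source=path.stem)
        for path in sorted(Path(csv_dir).glob("*.csv"))
    ]
    return pd.concat(frames, ignore_index=True)
```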
- `CUDA_VISIBLE_DEVICES="{gpu_num}" python scripts/main_train.py train_configs/unet.py`
- `CUDA_VISIBLE_DEVICES="{gpu_num}" python scripts/main_train.py train_configs/unetmitb3.py`
- `CUDA_VISIBLE_DEVICES="{gpu_num}" python scripts/main_train.py train_configs/unetmitb5.py`
- `CUDA_VISIBLE_DEVICES="{gpu_num}" python scripts/main_train.py train_configs/unetpp.py`
- Use `notebooks/inference_notebook_ensem.ipynb` to evaluate the ensemble: just define `EXP_NAME`, `CONFIG` and `ADITIONAL_MODELS` using the trained experiments from the previous steps (from `logdirs`)
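The core idea behind ensembling segmentation models is averaging their per-pixel probability maps before thresholding; a minimal sketch (the mean-then-threshold scheme and the 0.5 cutoff are assumptions, not the notebook's verified settings):

```python
import numpy as np

def ensemble_predict(prob_maps: list, threshold: float = 0.5) -> np.ndarray:
    """Average per-model probability maps and threshold to a binary mask.

    NOTE: illustrative sketch; `prob_maps` is a list of same-shaped arrays
    of per-pixel foreground probabilities, one per model in the ensemble.
    """
    mean = np.mean(np.stack(prob_maps, axis=0), axis=0)  # pixelwise model average
    return (mean > threshold).astype(np.uint8)
```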