MIC-DKFZ / nnDetection

nnDetection is a self-configuring framework for 3D (volumetric) medical object detection which can be applied to new data sets without manual intervention. It includes guides for 12 data sets that were used to develop and evaluate the performance of the proposed method.


The dataset to be used for, say, Rib Fracture detection

DSRajesh opened this issue:

In the paper (Baumgartner M., Jäger P.F., Isensee F., Maier-Hein K.H. (2021) nnDetection: A Self-configuring Method for Medical Object Detection. In: de Bruijne M. et al. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2021), we can see the following text:

"Data sets. An overview of all data sets and their properties can be found in the
supplementary material. Out of the 13 data sets, we used 10 for development and
validation of nnDetection. These are further divided into a training pool (4 data
sets: CADA [21], LIDC-IDRI [1,9], RibFrac [10] and Kits19 [6].) and validation
pool (6 data sets: ProstateX [12,3], ADAM [22], Medical Segmentation Decathlon
Liver, Pancreas, Hepatic Vessel and Colon [19]). While in the training pool we
used all data for development and report 5-fold cross-validation results, in the
validation pool roughly 40% of each data set was split off as held-out test set
before development".

Does this mean that the lung nodule detection (LUNA) model was trained on all of the following data {Rib fractures, LIDC nodules, Kits19, CADA} as a single foreground class, or as multiple foreground classes? And what does it mean to say that the entire LUNA dataset, containing 888 scans, is a "test / hold-out dataset entirely held out from the model training"?

Or is it meant to indicate that the LIDC data (comprising 7371 nodules) was used for training and LUNA was used for testing? That would not make sense, since LUNA is a subset of LIDC. The text in the paper also seems to contradict the scripts in https://github.com/MIC-DKFZ/nnDetection/blob/main/projects/Task016_Luna/scripts/prepare.py, which appear to use LUNA alone for training the model.

Can you kindly clarify the following:

  1. What dataset was used for the evaluation of lung nodules in the figure at https://github.com/MIC-DKFZ/nnDetection/blob/main/docs/results/nnDetectionV001.md#luna, which reports 92.7% at 0.5 false positives per scan?
  2. In https://github.com/MIC-DKFZ/nnDetection/blob/main/docs/results/nnDetectionV001.md#luna, the LUNA set is missing from the test pool, which makes the paper and the repository inconsistent.

Thanks

Dear @DSRajesh,

Since LUNA has a predefined split and many previous methods already use it, we did not perform 5-fold cross-validation there; instead, we followed DeepLung (https://github.com/wentaozhu/DeepLung) and performed a 10-fold cross-validation with the provided splits. This is why LUNA is not part of the main figure of nnDetection and its results are reported in a separate figure.
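For reference, the LUNA16 release distributes its 888 scans across ten folders, subset0 through subset9, which define those standard splits. The following is a minimal sketch of the resulting cross-validation loop; the download path and the .mhd file layout are assumptions, and this is not nnDetection's internal code:

```python
# Minimal sketch of 10-fold cross-validation over LUNA16's predefined
# subset0..subset9 folders; it mirrors the protocol, not nnDetection internals.
from pathlib import Path

luna_root = Path("/data/LUNA16")  # assumed download location

# Each predefined subset folder becomes one fold.
folds = [sorted((luna_root / f"subset{i}").glob("*.mhd")) for i in range(10)]

for test_idx in range(10):
    test_scans = folds[test_idx]
    train_scans = [s for i, fold in enumerate(folds) if i != test_idx for s in fold]
    # Train a model from scratch on train_scans, evaluate on test_scans,
    # then average the FROC results over all ten folds.
    print(f"fold {test_idx}: {len(train_scans)} train / {len(test_scans)} test")
```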

Each dataset in nnDetection is treated on its own; we never do transfer learning (i.e. pretrain on one dataset and fine-tune on another).
Nevertheless, nnDetection can be applied to new datasets, i.e. it generalises to new problems via its self-configuring planning. (Note: models still have to be trained from scratch on each new dataset; the goal is to evaluate the generalisation of the self-configuring process.) This was done for the datasets in the test pool, including LUNA. While LUNA is a subset of LIDC, the configuration nnDetection derives is still different between the two datasets, so we evaluated the generalisation capabilities of nnDetection on LUNA as well.
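To make the "apply to a new dataset" step concrete: nnDetection only needs the raw data in its expected task layout before the self-configuring planning runs. Below is a minimal sketch of that preparation, assuming the layout described in the repository readme; the task name, paths and case names are placeholders:

```python
# Minimal sketch of laying out a new task for nnDetection's self-configuring
# pipeline; task number/name and paths are placeholders, see the repository
# readme for the authoritative format.
import json
from pathlib import Path

det_data = Path("/data/nnDet_raw")       # assumed ${det_data} root
task = det_data / "Task100_NewData"      # hypothetical new task

# Raw images and labels live under raw_splitted/.
for sub in ("imagesTr", "labelsTr", "imagesTs", "labelsTs"):
    (task / "raw_splitted" / sub).mkdir(parents=True, exist_ok=True)

# Each training case pairs an image (case000_0000.nii.gz), an instance
# segmentation (case000.nii.gz) and a JSON file mapping every instance id
# in the segmentation to a zero-based class id.
case_json = {"instances": {"1": 0, "2": 0}}  # two lesions, both class 0
(task / "raw_splitted" / "labelsTr" / "case000.json").write_text(
    json.dumps(case_json, indent=4)
)
```

Once the data is in place, the planning and preprocessing entry point (nndet_prep in the current readme) derives the configuration for the new dataset, and training then starts from scratch for each fold.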

Best, Michael
