Data size

Question

enochsol opened this issue 2 years ago

For each class (Lepidic, Acinar, Papillary, Micropapillary, Solid, and Benign), how many patches did you use for training?
Your paper says: "For the training set, pathologists annotated 4,161 crops from 245 images, about 17 crops per image. These rectangular crops varied in size (mean: 718×771 pixels, standard deviation: 645×701 pixels, median: 429×473 pixels)". Why are the crops rectangular? What do the mean, standard deviation, and median refer to?
Your paper says: "For the development set, our pathologists annotated 1,068 square patches of 224×224 pixels for classic examples of each pattern." Did the pathologists annotate patches, or should this say crops? If the annotation was at the patch level, why was the development set handled differently from the training set?

Thank you so much!!

Joseph DiPalma · Answer 1 · Wed Feb 02 2022 05:36:28 GMT+0800 (China Standard Time)

For each class (Lepidic, Acinar, Papillary, Micropapillary, Solid, and Benign), how many patches did you use for training?

We used approximately 8,000 patches per class. See the "Materials and Methods" section of the paper for further details.

Your paper says: "For the training set, pathologists annotated 4,161 crops from 245 images, about 17 crops per image. These rectangular crops varied in size (mean: 718×771 pixels, standard deviation: 645×701 pixels, median: 429×473 pixels)". Why are the crops rectangular? What do the mean, standard deviation, and median refer to?

I'm not sure what this question is asking. What shape would you expect the crops to be? Please elaborate more so we can help.

Your paper says: "For the development set, our pathologists annotated 1,068 square patches of 224×224 pixels for classic examples of each pattern." Did the pathologists annotate patches, or should this say crops? If the annotation was at the patch level, why was the development set handled differently from the training set?

The pathologists labeled the development set at the patch level to ensure each patch was truly representative of its class. In the training set, a small number of patches are likely mislabeled because crop-level annotation is less precise.

Enoch Solomon · Answer 2 · Sat Feb 05 2022 04:20:29 GMT+0800 (China Standard Time)

For each class (Lepidic, Acinar, Papillary, Micropapillary, Solid, and Benign), how many patches did you use for training?

We used approximately 8,000 patches per class. See the "Materials and Methods" section of the paper for further details.

Lepidic has 515 crops, Acinar has 691 crops, ... Approximately how many patches are there per crop? Did you discard any patches from a crop?

Your paper says: "For the training set, pathologists annotated 4,161 crops from 245 images, about 17 crops per image. These rectangular crops varied in size (mean: 718×771 pixels, standard deviation: 645×701 pixels, median: 429×473 pixels)". Why are the crops rectangular? What do the mean, standard deviation, and median refer to?

I'm not sure what this question is asking. What shape would you expect the crops to be? Please elaborate more so we can help.

I would expect the crops annotated by the pathologists to be roughly circular, and the patches to be rectangular. I may be confusing the terms. Please clarify and confirm: WSI --> Annotated Crops --> Patches.
What does "mean: 718×771 pixels, standard deviation: 645×701 pixels, median: 429×473 pixels" imply?

Your paper says: "For the development set, our pathologists annotated 1,068 square patches of 224×224 pixels for classic examples of each pattern." Did the pathologists annotate patches, or should this say crops? If the annotation was at the patch level, why was the development set handled differently from the training set?

The pathologists labeled the development set at the patch level to ensure each patch was truly representative of its class. In the training set, a small number of patches are likely mislabeled because crop-level annotation is less precise.

It makes sense. Thank you again!!

Joseph DiPalma · Answer 3 · Sat Feb 12 2022 02:24:08 GMT+0800 (China Standard Time)

Lepidic has 515 crops, Acinar has 691 crops, ... Approximately how many patches are there per crop? Did you discard any patches from a crop?

The number of patches per crop varies widely because the annotated regions differ from slide to slide. We removed only patches consisting mostly of white background; all other patches were kept.
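To make the "patches per crop" point concrete, here is a minimal sketch of how the count could depend on crop size, together with one possible white-background check. It is not the authors' code: the 224-pixel patch side is borrowed from the development set description, and the non-overlapping stride, the `white_level` of 220, and the `max_white_frac` of 0.5 are all assumptions chosen for illustration.

```python
def patches_per_crop(height, width, patch=224, stride=224):
    """Count the full patches that fit in a crop when tiling on a regular grid.

    patch and stride are assumed values, not taken from the paper.
    """
    if height < patch or width < patch:
        return 0
    rows = (height - patch) // stride + 1
    cols = (width - patch) // stride + 1
    return rows * cols


def is_mostly_white(gray_patch, white_level=220, max_white_frac=0.5):
    """Return True when more than max_white_frac of the pixels are bright.

    gray_patch is a 2D list of 0-255 intensities; both thresholds are
    hypothetical and only illustrate the idea of a background filter.
    """
    pixels = [value for row in gray_patch for value in row]
    white = sum(value >= white_level for value in pixels)
    return white / len(pixels) > max_white_frac


# A crop at the reported mean size versus one at the reported median size.
print(patches_per_crop(718, 771))
print(patches_per_crop(429, 473))
```

Under these assumptions a crop at the mean size yields several times as many patches as one at the median size, which is one way to see why no single "patches per crop" figure applies. An overlapping stride would raise both counts.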

I would expect the crops annotated by the pathologists to be roughly circular, and the patches to be rectangular. I may be confusing the terms. Please clarify and confirm: WSI --> Annotated Crops --> Patches.

Your understanding of the steps is correct.

What does "mean: 718×771 pixels, standard deviation: 645×701 pixels, median: 429×473 pixels" imply?

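This means that the mean height and width of an annotated region are 718 and 771 pixels, respectively. The remaining values read the same way: the standard deviation is 645 pixels in height and 701 in width, and the median is 429 pixels in height and 473 in width.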
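As an illustration of how such per-dimension statistics are computed, the sketch below summarizes heights and widths separately. The `crop_sizes` list is made-up data, not the paper's, and whether the paper reports a sample or a population standard deviation is not stated in this thread; `statistics.stdev` (sample) is used here as an assumption.

```python
import statistics

# Hypothetical (height, width) pairs in pixels; NOT the paper's annotations.
crop_sizes = [(224, 260), (300, 310), (429, 473), (900, 1000), (2000, 2100)]

heights = [h for h, _ in crop_sizes]
widths = [w for _, w in crop_sizes]


def summarize(values):
    """Mean, sample standard deviation, and median of one dimension."""
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),
        "median": statistics.median(values),
    }


height_stats = summarize(heights)
width_stats = summarize(widths)
print("height:", height_stats)
print("width:", width_stats)
```

In this toy data a few large crops pull the mean well above the median, the same pattern as in the reported figures (mean 718×771 versus median 429×473), which suggests that most annotated regions are smaller than the mean.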

Enoch Solomon · Answer 4 · Sat Feb 12 2022 02:27:37 GMT+0800 (China Standard Time)

Thank you so much!!