About Dartmouth Lung Cancer Histology Dataset

Question

chenqz1998 opened this issue 3 years ago

I downloaded the dataset but could not find the annotations. There are only whole-slide images (WSIs) without annotations. How can I get the annotations made by the pathologists?

George Batchkala · Answer 1 · Thu Jul 22 2021 22:09:57 GMT+0800 (China Standard Time)

The WSI-level annotations are in the MetaData_Release_1.0.csv file. The Class column specifies each WSI's predominant pattern.
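
As a minimal sketch of reading those slide-level labels, the snippet below loads the CSV with Python's standard library and counts how many slides fall into each class. Only the file name MetaData_Release_1.0.csv and the Class column come from this thread; the helper name load_slide_labels is hypothetical, and no other columns of the file are assumed.

```python
import csv
from collections import Counter


def load_slide_labels(csv_path, label_column="Class"):
    """Read the metadata CSV and tally the slide-level classes.

    `label_column` defaults to "Class", the column named in this thread.
    Returns (rows, counts): rows is a list of dicts, one per slide, and
    counts maps each class label to its number of slides.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    counts = Counter(row[label_column] for row in rows)
    return rows, counts


if __name__ == "__main__":
    # Assumes the file sits in the current working directory.
    rows, counts = load_slide_labels("MetaData_Release_1.0.csv")
    for label, n in counts.most_common():
        print(f"{label}: {n}")
```

This gives one label per slide only; it does not recover any region-level annotations.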

Joseph DiPalma · Answer 2 · Thu Jul 22 2021 22:14:36 GMT+0800 (China Standard Time)

@GeorgeBatch You are correct that the WSI-level annotations are in the CSV file.

@chenqz1998 The pathologists did not provide region-level annotations for the test set, only a slide-level class. We have annotations only for the training and validation sets, and we cannot make those datasets public at this time.

George Batchkala · Answer 3 · Thu Jul 22 2021 22:19:04 GMT+0800 (China Standard Time)

@JosephDiPalma Do you think you could make the training and validation sets available on a per-request basis, like the test set? Perhaps the request could include more questions, or even require an official request from a research group, so that there is more accountability for what happens to the data.

Joseph DiPalma · Answer 4 · Thu Jul 22 2021 22:20:35 GMT+0800 (China Standard Time)

@GeorgeBatch I don't know, but I can definitely check. It would depend on the privacy restrictions set by our institution.

George Batchkala · Answer 5 · Thu Jul 22 2021 22:25:32 GMT+0800 (China Standard Time)

@JosephDiPalma Thank you very much! That would be very helpful!

Can you also please check if the same applies to the pre-trained weights?

Qingzhong Chen · Answer 6 · Thu Jul 22 2021 23:22:40 GMT+0800 (China Standard Time)

I see. Yes, I know that WSI-level class annotations are provided in the MetaData_Release_1.0.csv file. But I would like the region-level annotations (e.g. the areas shown in Fig. 4 of the paper "Pathologist-level classification of histologic patterns on resected lung adenocarcinoma slides with deep neural networks") to train and test a model for detection.

Naofumi Tomita · Answer 7 · Fri Jul 23 2021 00:11:04 GMT+0800 (China Standard Time)

@chenqz1998 The ROI annotations presented in the figure are available only for selected slides in the test set, as they are meant for visualization and error analysis; they are therefore not part of the dataset.