mahmoodlab / PathomicFusion

Fusing Histology and Genomics via Deep Learning - IEEE TMI

Home Page: http://www.mahmoodlab.org

Regarding reproducing the GBMLGG grade classification

omniaalwazzan opened this issue · comments

Hi

Thanks for sharing the dataset and code.

I am trying to use a simple CNN classifier to classify the grade of the GBMLGG dataset into 3 classes (2, 3, 4).

I am using the same training/testing split as the Pathomic Fusion paper. However, my training accuracy is very high (97~98%) while my validation accuracy is relatively low (72%), and the training and validation losses show the same gap!

My backbone network is VGG11_bn.

Is there something wrong with what I am doing? Can I use the grade of the tumour as the label for its patches and classify them on that basis, or is that not possible?

Thanks

I can tell from the t-SNE visualization that the network gets very confused between grade 2 and grade 3.

[t-SNE visualization]

What are your thoughts on this?
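For context, a minimal sketch of how such a t-SNE plot can be produced from patch features; the `features` and `grades` arrays are dummy stand-ins for the penultimate-layer CNN features and grade labels of each patch:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Dummy stand-ins for the penultimate-layer CNN features and grade labels.
features = np.random.randn(300, 512)
grades = np.random.choice([2, 3, 4], size=300)

# Embed the high-dimensional features in 2D.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)

# One scatter series per grade; heavily overlapping clusters (e.g. grades 2
# and 3) indicate that the features do not separate those classes well.
for grade in (2, 3, 4):
    mask = grades == grade
    plt.scatter(embedding[mask, 0], embedding[mask, 1], s=5, label=f"Grade {grade}")
plt.legend()
plt.show()
```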

Hi @omniaalwazzan - what was the size of the input patches? Were you following the same scheme as implemented in the paper (e.g., training with 512 x 512 patches and macro-averaging the predicted scores for each image)?

See also the Jupyter Notebook here for reproducing the exact numbers reported in the paper.

Hello, Richard.

Thank you very much for this useful information.

I'm trying to build a single model for each modality (images, clinical data, and genes), and then build a fusion model once I have a firm understanding of each. However, because I am new to everything (PyTorch, histology images, genes, etc.), I am trying to start slowly. I first replicated the CLAM pipeline to have a benchmark. Now I'm trying to build the fusion model, but with simple code. So you might see me asking questions in all of your multimodal GitHub repositories :)
I hope that makes my point clear!

No, I did not stick to the parameters specified in the Pathomic Fusion paper. I used VGG11_bn, lr = 0.0004, the Adam optimizer, and the following augmentation parameters: RandomRotation(5), RandomHorizontalFlip(0.5), RandomCrop(224, padding=2). The normalisation used the ImageNet mean and standard deviation.

After labelling the patches with their original ROI label, I trained the model on the (512x512) patches. Did I miss something?
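For reference, a minimal sketch of that setup (dataset, dataloader, and training loop omitted; the pretrained weights and the 3-way head for grades 2, 3, and 4 are assumptions based on the description above):

```python
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as T

# Augmentations and normalisation as described above.
train_transform = T.Compose([
    T.RandomRotation(5),
    T.RandomHorizontalFlip(0.5),
    T.RandomCrop(224, padding=2),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet mean
                std=[0.229, 0.224, 0.225]),  # ImageNet std
])

# VGG11_bn backbone; replace the final classifier layer with a 3-way head.
model = models.vgg11_bn(pretrained=True)
model.classifier[6] = nn.Linear(model.classifier[6].in_features, 3)

optimizer = torch.optim.Adam(model.parameters(), lr=0.0004)
criterion = nn.CrossEntropyLoss()
```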

Sorry, I didn't quite understand "macro-averaging the predicted scores for each image", could you explain?

Hi @omniaalwazzan - since the ROI images are of size [1024 x 1024], I subdivide each ROI into [512 x 512] image patches. For all patches from the same patient, I then macro-average the predicted risk scores to get that patient's risk score.
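In pseudocode, the scheme is roughly as follows (the function names here are illustrative, not from the repo):

```python
import numpy as np

def subdivide_roi(roi):
    """Split a [1024, 1024, C] ROI array into four [512, 512, C] patches."""
    return [roi[i:i + 512, j:j + 512] for i in (0, 512) for j in (0, 512)]

def patient_level_scores(patch_scores_by_patient):
    """Macro-average the patch-level predicted scores of each patient.

    patch_scores_by_patient: dict mapping patient ID -> list of patch scores.
    """
    return {pid: float(np.mean(scores))
            for pid, scores in patch_scores_by_patient.items()}
```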

The evaluation is performed in this function here:

def poolSurvTestPD(ckpt_name='./checkpoints/TCGA_GBMLGG/surv_15_rnaseq/', model='pathgraphomic_fusion', split='test', zscore=False, agg_type='Hazard_mean'):
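As an illustration (not the repo's exact code), `agg_type='Hazard_mean'` effectively groups the per-patch hazard predictions by patient and takes the mean; the column names below are assumptions:

```python
import pandas as pd

# Toy per-patch predictions for two patients.
preds = pd.DataFrame({
    "patient_id": ["A", "A", "A", "A", "B", "B"],
    "Hazard":     [0.8, 0.7, 0.9, 0.6, 0.2, 0.3],
})

# Patient-level hazard = mean of that patient's patch-level hazards.
patient_hazard = preds.groupby("patient_id")["Hazard"].mean()
print(patient_hazard)  # A: 0.75, B: 0.25
```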

I see, thanks, that helps me understand how you obtained the patient-level label.

One more question, if you don't mind answering!

I can tell from the dataset that ROIs from the same patient always have the same tumour grade. Does this also happen in real life?

I'm asking this question because your lab group is made up of people from different fields, so I thought you might know the answer :)