Zero-We / BCL

A Bayesian collaborative learning framework for whole-slide image classification

Rainydu184 opened this issue

Hi, your work is very fascinating!
I would like to replicate your work on the TCGA-Lung dataset. I noticed that there are several empty .md files in the repo, for example /dataset/README.md and /dataset_csv/README.md. Is there content still to be written for them?
Could you please provide a brief explanation of how the features extracted through CLAM should be applied in your repo and what format the cross-validation files should be in?

Hi, the README.md files in /dataset and /dataset_csv are just leftovers that I forgot to delete. The dataloader.py file is in /dataset, and the .csv files for dataset splitting are stored in /dataset_csv (I have uploaded a csv file for the Camelyon16 dataset as a reference).

Once you have extracted the patch features of each WSI and stored them in .pt files, run M2_update_MIL_classifier.py with feat_dir set to the directory that holds the patch features. This produces t0_primary_attn.pkl, which stores the attention scores of all patches in each WSI. Then run E_pseudo_labeling.py to generate pseudo labels, and M1_update_feat_encoder.py to optimize the patch feature encoder. Extract new patch features with the updated feature encoder, and finally run M2 again to start the next iteration.
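To make the order of the steps described above explicit, here is a minimal sketch that only lists them per round. The script names come from this thread; the helper `em_schedule` and the wording of each step are hypothetical and not part of the repository, and the script does not invoke anything.

```python
def em_schedule(n_rounds):
    """Return the ordered steps for n_rounds of the iteration described above.

    Hypothetical helper for illustration only; it builds a list of
    step descriptions and does not run any of the scripts.
    """
    steps = ["extract initial patch features into .pt files"]
    for t in range(n_rounds):
        steps.append(f"round {t}: M2_update_MIL_classifier.py (feat_dir = current features)")
        steps.append(f"round {t}: E_pseudo_labeling.py")
        steps.append(f"round {t}: M1_update_feat_encoder.py")
        steps.append(f"round {t}: re-extract patch features with the updated encoder")
    return steps


if __name__ == "__main__":
    for step in em_schedule(2):
        print(step)
```

Each round ends with re-extraction, so the next call to M2 sees features from the updated encoder.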

Hi!

I think I have obtained the correct training through M2_update_MIL_classifier.py. But I got t0_primary.pth and t1_primary_attn.pkl, instead of t0_primary_attn.pkl.
[Screenshot 2023-08-23 142225]
I found that the E_pseudo_labeling.py file only contains the processing for "camelyon", so I made some simple modifications to lines 80 and 134.

if args.dset == 'camelyon' or args.dset == 'tcga_lung':

When running E_pseudo_labeling.py, I encountered strange AUC and ACC values, along with warnings.
[Screenshot 2023-08-23 144620]

For MIL tasks, there is a difference between TCGA_Lung and Camelyon.

In Camelyon, "normal" samples are negative cases, and "tumor" samples are positive cases, resulting in only two types of patches (negative or positive). However, in TCGA_Lung, there are no negative samples, and LUAD and LUSC are different categories of positive samples, resulting in three types of patches: negative (normal tissue in WSI), LUAD, and LUSC.

For TCGA_Lung, there should be some differences in operations compared to Camelyon. Could you please provide some suggestions?

Hi, it is fine to get t0_primary.pth and t1_primary_attn.pkl: t0_primary stores the model weights at t0, and t1_primary_attn stores the attention weights used for pseudo labeling at t1.
You have noticed an important point: Camelyon and TCGA Lung differ because there are no "normal" samples in TCGA Lung. For typing tasks like TCGA Lung, we therefore add a category called "background" to the patch classification branch. During pseudo labeling, patches with higher attention weights are assigned the WSI class, and patches with lower attention weights are assigned "background".
The WSI classification branch has no "background" category, so the patch classifier has one more class than the WSI classifier. As you can see in the framework figure, the patch classifier has one more output node.
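The "background" assignment described above can be sketched as follows. This is an illustration under assumptions, not the code in E_pseudo_labeling.py: the function name, the `top_frac` cutoff, and the choice of a simple top-k split are all hypothetical.

```python
import numpy as np


def assign_pseudo_labels(attn, wsi_label, n_wsi_classes, top_frac=0.1):
    """Give the highest-attention patches the slide class, the rest "background".

    attn: 1-D sequence of attention weights, one per patch.
    wsi_label: slide-level class index in [0, n_wsi_classes).
    top_frac: hypothetical fraction of patches treated as high attention.
    """
    attn = np.asarray(attn, dtype=float)
    background = n_wsi_classes  # extra index that exists only for the patch classifier
    labels = np.full(attn.shape[0], background, dtype=int)
    k = max(1, int(round(top_frac * attn.shape[0])))
    top_idx = np.argsort(attn)[::-1][:k]  # indices of the k largest attention weights
    labels[top_idx] = wsi_label
    return labels


# Two WSI classes (e.g. LUAD = 0, LUSC = 1), so "background" becomes index 2.
attn = [0.02, 0.40, 0.05, 0.30, 0.03, 0.01, 0.04, 0.05, 0.06, 0.04]
print(assign_pseudo_labels(attn, wsi_label=1, n_wsi_classes=2, top_frac=0.2))
```

With two WSI classes the patch labels take values in {0, 1, 2}, which is why the patch classifier needs one more output node than the WSI classifier.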

Thanks for your reply!
Is the AUC score normal? Can I just ignore the warning messages?

The AUC score looks abnormal, so you will need to debug further.

Hi, @Zero-We :
As mentioned in your article, the allocation of pseudo-labels to patches is crucial.

I noticed that the allocation of pseudo-labels requires both the MIL Classifier and Patch Classifier. While I can easily understand how the MIL Classifier can be initialized by classifying the entire image, I'm curious about how to initialize the Patch Classifier. This will have an impact on the initial allocation of pseudo-labels.

Using random initialization may be problematic.

Yes, you're right, pseudo labeling is important.

Actually, we just randomly initialize the Patch Classifier, because the initial pseudo-label assignment does not use the scores produced by the Patch Classifier, only the attention weights. In other words, all classifier scores are set to 1 in the initial round (see lines 60-62 in E_pseudo_labeling.py). So the initialization of the Patch Classifier has no effect on the initial pseudo-label assignment.
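A small sketch of why random initialization does not matter in the first round. It assumes, for illustration only, that the patch confidence is the product of attention weight and classifier score; the function `patch_confidence` is hypothetical and the actual combination in E_pseudo_labeling.py may differ.

```python
def patch_confidence(attn, cls_scores=None):
    """Combine attention weights with patch classifier scores.

    Assumption for illustration: confidence = attention * classifier score.
    With cls_scores=None (initial round) every score is set to 1,
    so the result depends on the attention weights alone.
    """
    if cls_scores is None:
        cls_scores = [1.0] * len(attn)
    return [a * s for a, s in zip(attn, cls_scores)]


attn = [0.2, 0.8]
print(patch_confidence(attn))                # initial round: attention only
print(patch_confidence(attn, [0.5, 0.25]))   # later rounds: scores reweight attention
```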

Get it! Thanks for your reply.

Hi Zero-We!
I'm sorry to disturb you again, but I found that the ACC for WSI classification is still poor when following the instructions in the readme.

I think I should utilize the updated weights to re-extract features between the execution of M1_update_feat_encoder.py and M2_update_MIL_classifier.py. Is that correct?

I noticed that your function "extract_feature_clean" seems to be able to perform this task. Is its functionality complete? What is the difference between using it and initializing CLAM's ResNet50 with the new weights for feature extraction?

Yes, you should re-extract features before running "M2_update_MIL_classifier.py" again, and then repeat the EM iteration.

"extract_feature_clean.py" is used to re-extract features, and the two approaches give the same result. I recommend "extract_feature_clean.py" because it loads patch images directly instead of extracting patches from the WSI every time.

Hi @Zero-We :
Thank you very much for your patient explanation. After 4 iterations on the Camelyon16 dataset, I have achieved performance very similar to that described in your article.

Additionally, I have identified the reason for the unsuccessful results on the TCGA_lung dataset. It seems that your "E_pseudo_labeling.py" is specifically designed for pseudo labeling on Camelyon16. Could you please share code that is applicable to multi-class classification?

I have updated 'E_pseudo_labeling.py'.

Thanks for everything you've done!
I am also trying to implement it, and I found that your Multi-Class Attention here also seems to be designed only for binary classification.
According to your article, it should have a separate path_attention_head for each class. If I simply modify the number of final classes here, it can only calculate the confidence based on the attn value of the predicted class, rather than the attn value of its actual label.

Is my understanding correct? There is a significant difference between Camelyon16's binary classification and other multi-instance multi-class problems.

For multi-class classification, you have to replace the argument in this line with 'n_classes=n_classes'.

Thanks for your prompt response.

Hi @Zero-We :

Quoting your reply: "You have to replace with 'n_classes=n_classes' in this line for multi-class classification."

I tried this modification, but it resulted in an error.
It seems that the shape of atten (A_path) has changed to [n_classes, num_patch], and is then further reshaped into [1, n_classes*num_patch].
[Screenshot 2023-09-21 100157]

I made some modifications to the original code as follows, so that atten can be correctly computed with feature(wsi_trans).
[Screenshot 2023-09-21 101410]

Then the error was propagated to the loss calculation, and I obtained the following error. The shapes of the returned results logit, y_prob, and Y_hat have been changed to [2, 2], [2, 2], and [2, 1], respectively. I think the correct shapes should be [1, 2], [1, 2], and [1, 1], respectively.
[Screenshot 2023-09-21 101333]

The correct model should be consistent with Fig.3 in your article. How can I implement it? Could you please provide some suggestions?
[Screenshot 2023-09-21 102620]

Please refer to 'models/cls_model_multi.py'.
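For readers following the shape discussion above, here is a minimal NumPy sketch of one way to get per-class attention while keeping the outputs at [1, n_classes], [1, n_classes], and [1, 1]. It is not the contents of models/cls_model_multi.py. It assumes a plain linear attention layer and one linear scorer per class, and every name in it (`multi_branch_forward`, `W_attn`, `W_cls`, `b_cls`) is hypothetical.

```python
import numpy as np


def multi_branch_forward(H, W_attn, W_cls, b_cls):
    """Per-class attention pooling over one bag of patch features.

    H:      [num_patch, D]   patch features of one WSI
    W_attn: [D, n_classes]   one attention vector per class (assumed linear)
    W_cls:  [n_classes, D]   one linear scorer per class
    b_cls:  [n_classes]
    """
    A = (H @ W_attn).T                       # [n_classes, num_patch]
    A = np.exp(A - A.max(axis=1, keepdims=True))
    A = A / A.sum(axis=1, keepdims=True)     # softmax over patches, per class
    M = A @ H                                # [n_classes, D], one bag embedding per class
    # Class c is scored from its own bag embedding M[c] only.
    logits = (np.sum(M * W_cls, axis=1) + b_cls)[None, :]    # [1, n_classes]
    z = np.exp(logits - logits.max(axis=1, keepdims=True))
    y_prob = z / z.sum(axis=1, keepdims=True)                 # [1, n_classes]
    Y_hat = y_prob.argmax(axis=1)[:, None]                    # [1, 1]
    return logits, y_prob, Y_hat, A


rng = np.random.default_rng(0)
num_patch, D, n_classes = 5, 8, 2
H = rng.normal(size=(num_patch, D))
logits, y_prob, Y_hat, A = multi_branch_forward(
    H,
    rng.normal(size=(D, n_classes)),
    rng.normal(size=(n_classes, D)),
    np.zeros(n_classes),
)
print(logits.shape, y_prob.shape, Y_hat.shape, A.shape)
```

The key point is that A keeps its [n_classes, num_patch] shape and is multiplied with the features directly, rather than being flattened to [1, n_classes*num_patch]. The attention row of a slide's actual label, A[label], is then available for pseudo labeling regardless of the predicted class.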

Thanks!