yeerwen / UniSeg

MICCAI 2023 Paper (Early Acceptance)


Sum over inhomogeneous array for calculating metrics

mborhi opened this issue

The error originates in finish_online_evaluation: the line self.online_eval_tp = np.sum(self.online_eval_tp, 0) fails with

setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (50,) + inhomogeneous part.

The list self.online_eval_tp is accumulated in the run_iteration method of the UniSegTrainer class by calling run_online_evaluation. That function appends lists of varying sizes (depending on the number of targets in the output parameter) to self.online_eval_tp (and to the other statistics lists). As described above, such a ragged list cannot be summed along axis 0.
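A minimal, self-contained reproduction (the counts below are made up; only the ragged shapes matter):

```python
import numpy as np

# Hypothetical per-batch TP counts: one batch comes from a 3-target task,
# the next from a 2-target task, so the inner lists differ in length.
online_eval_tp = [
    [10, 20, 30],  # batch from a task with 3 targets
    [5, 15],       # batch from a task with 2 targets
]

# Under NumPy >= 1.24 this raises:
# ValueError: setting an array element with a sequence. The requested
# array has an inhomogeneous shape after 1 dimensions. ...
np.sum(online_eval_tp, 0)
```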

I believe this is due to the following:

  • Each task (within MOTS, e.g. kidney, liver, lung, etc.) has a different number of targets (3 or 2)
  • The mapping from tasks to the number of outputs is fixed
  • The code truncates the output of the model to match the task's (fixed) specified number of channels
  • This causes a misalignment when calculating the scores (a different number of columns per task)
  • Because the number of columns is fixed per task, this will always result in an error when tasks with differing numbers of targets are mixed (see the sketch after this list)
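One way to sidestep the ragged sum, sketched here under the assumption that the statistics only ever need to be reduced within a single task (the helper names record_batch and total_tp are hypothetical, not part of UniSeg's trainer):

```python
from collections import defaultdict
import numpy as np

# Keep one list of per-batch statistics per task, so every entry in a
# given list has the same length and the axis-0 sum is well defined.
online_eval_tp = defaultdict(list)  # task_id -> list of per-class TP vectors

def record_batch(task_id, tp_per_class):
    # tp_per_class has exactly as many entries as the task has targets.
    online_eval_tp[task_id].append(tp_per_class)

def total_tp(task_id):
    # Entries are homogeneous within a task, so this no longer raises.
    return np.sum(online_eval_tp[task_id], 0)
```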

Indeed, I have encountered the same error after a NumPy update. To avoid it, I recommend reverting your NumPy installation to version 1.23.4.

Thank you for your quick response. I am seeking clarification as to how this operation should then be interpreted, since with version 1.23.4 np.sum(*, 0) flattens the ragged input list instead of summing it elementwise.
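For reference, a sketch of what the pre-1.24 behaviour amounts to: the ragged list becomes a 1-D object array, and reducing Python lists with + concatenates them:

```python
import numpy as np

# Build the object array explicitly (NumPy <= 1.23 did this implicitly,
# with a VisibleDeprecationWarning, when given a ragged list).
ragged = np.empty(2, dtype=object)
ragged[:] = [[10, 20, 30], [5, 15]]

# Reducing with + concatenates the lists rather than summing per class.
print(np.sum(ragged, 0))  # -> [10, 20, 30, 5, 15]
```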

Thus, the Dice score is computed as the average of the Dice score of each batch i and class c, Dice_{batch_i, class_c} (as computed on line 724).

Based on my understanding, given a dataset of 3 batches and 1 class, the global Dice score is computed as
Dice = 1/3 (Dice_{batch1, cls1} + Dice_{batch2, cls1} + Dice_{batch3, cls1}),
and the computation extends analogously to C classes and N batches.

In contrast, the global Dice score should be computed from the global TP, FN, and FP counts as
Dice = 2 * TP_global / (2 * TP_global + FN_global + FP_global).
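A small worked example of the difference, with made-up counts for a single class over three batches:

```python
import numpy as np

# Hypothetical per-batch counts for one class.
tp = np.array([10, 2, 30])
fp = np.array([2, 8, 5])
fn = np.array([3, 6, 4])

# Mean of per-batch Dice scores (what averaging Dice_{batch, class} yields).
mean_dice = np.mean(2 * tp / (2 * tp + fp + fn))  # ~0.6306

# Global Dice from the pooled counts.
global_dice = 2 * tp.sum() / (2 * tp.sum() + fp.sum() + fn.sum())  # 0.75
```

The two values generally disagree whenever the class prevalence varies across batches, which is why the interpretation of the averaged value matters.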

Hence, the interpretation of the elements of self.all_val_eval_metrics, which are calculated by taking the mean of global_dc_per_class, is unclear.

Is this the intended functionality, and if so, how should the resulting metrics be interpreted?

Hi,

I believe it's not necessary to focus too much on the validation values. They are used only for monitoring and do not determine which epoch's checkpoint is used as the final one; the checkpoint from the last epoch is always the final checkpoint.

Best regards,
Yiwen Ye

I understand, thank you for your responses.

Best,
Marcell