Lack of Uncertainty normalization

Question

LIKP0 opened this issue 9 months ago · comments

Hi, thanks for your great work!

I think there is a problem in the uncertainty computation code. As stated in your paper, the uncertainty threshold should ramp up from 0.75 to 1, right? But in my experiments, the uncertainty values easily reach 3.

I think we should renormalize the uncertainty to 0~1 right after the line uncertainty = -1.0 * torch.sum(preds * torch.log(preds + 1e-6), dim=1, keepdim=True)
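
As a rough illustration of the normalization being proposed, here is a minimal scalar sketch in plain Python rather than torch. The helper names entropy and normalized_entropy are hypothetical and not taken from the repository. It assumes the maximum entropy for C classes is ln(C), which equals ln 2 only in the two-class case.

```python
import math

def entropy(probs, eps=1e-6):
    """Shannon entropy (natural log) of one pixel's class probabilities.

    Mirrors the per-pixel form -sum(p * log(p + eps)) from the issue,
    written for a plain list instead of a tensor (hypothetical helper).
    """
    return -sum(p * math.log(p + eps) for p in probs)

def normalized_entropy(probs, eps=1e-6):
    """Entropy divided by its maximum ln(C) for C classes, giving roughly 0 to 1."""
    return entropy(probs, eps) / math.log(len(probs))

print(entropy([0.5, 0.5]))             # close to ln 2 for two classes
print(normalized_entropy([0.5, 0.5]))  # close to 1
print(normalized_entropy([0.25] * 4))  # close to 1 for four classes as well
```

With more than two classes the unnormalized maximum exceeds ln 2, so the divisor would need to follow the class count.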

Would you be willing to fix this? Thank you!

JiaLin Li · Answer 1 · Tue Oct 17 2023 16:26:54 GMT+0800 (China Standard Time)

Sorry, I see that you already handle this on the threshold side: threshold = (0.75+0.25*ramps.sigmoid_rampup(iter_num, max_iterations))*np.log(2).

Since the maximum uncertainty value is ln2, I think it would be clearer to normalize the uncertainty itself, like this:

uncertainty = uncertainty / math.log(2) # normalize uncertainty to 0 to 1, cuz ln2 is the max value
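
To make the relationship between the two variants concrete, here is a small sketch comparing a threshold scaled by ln 2 against an uncertainty divided by ln 2. It is an illustration under assumptions, not the repository's code: the ramp shape exp(-5 * (1 - t)^2) is assumed for sigmoid_rampup, the mask is assumed to be uncertainty < threshold, and both mask_* function names are hypothetical.

```python
import math

def sigmoid_rampup(current, rampup_length):
    """Assumed ramp shape exp(-5 * (1 - t)^2), rising towards 1 as t goes to 1."""
    if rampup_length == 0:
        return 1.0
    t = min(max(current / rampup_length, 0.0), 1.0)
    return math.exp(-5.0 * (1.0 - t) ** 2)

def mask_scaled_threshold(uncertainty, iter_num, max_iterations):
    """Variant from the issue: raw entropy against a threshold scaled by ln 2."""
    ramp = sigmoid_rampup(iter_num, max_iterations)
    threshold = (0.75 + 0.25 * ramp) * math.log(2)
    return uncertainty < threshold

def mask_normalized_uncertainty(uncertainty, iter_num, max_iterations):
    """Proposed variant: entropy divided by ln 2 against a 0.75-to-1 threshold."""
    ramp = sigmoid_rampup(iter_num, max_iterations)
    threshold = 0.75 + 0.25 * ramp
    return uncertainty / math.log(2) < threshold

# Both variants select the same values, apart from floating-point
# rounding exactly at the boundary.
for u in (0.1, 0.4, 0.55, 0.6, 0.69):
    print(u, mask_scaled_threshold(u, 3000, 6000),
          mask_normalized_uncertainty(u, 3000, 6000))
```

The two forms differ only in where the ln 2 factor sits, so the choice is about readability rather than behavior.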

The *np.log(2) factor is really confusing. Reading your open-source code feels like solving a detective case...

Anyway, thanks for your great work again.

woailunhua · Answer 2 · Mon Dec 11 2023 17:04:48 GMT+0800 (China Standard Time)

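Bro, does this uncertainty_map computed from information entropy actually help? In my reproduction it performs worse than without the uncertainty_map??? I'm so frustrated.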

JiaLin Li · Answer 3 · Mon Dec 11 2023 17:21:12 GMT+0800 (China Standard Time)

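@woailunhua It depends on the task and the data; it may work better on noisier data. In my own use I saw an improvement, but a very small one (0.1%), so I stopped using it. There have also been many new papers on noisy labels, mean teacher, and semi-supervised learning in the past two years. If you want this kind of method you can use a newer architecture; there is no need to keep using this one.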

woailunhua · Answer 4 · Tue Dec 12 2023 10:58:46 GMT+0800 (China Standard Time)

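May I ask one more question? In the paper, the uncertainty ranges from 0 to ln2. But the minimum cannot possibly be that low. Take a binary classification whose output is one-hot: if a pixel's predicted value in channel 0 is 0.001 and the corresponding pixel in channel 1 is 0.999, then after softmax one becomes 0.26 and the other 0.73, and -0.26ln(0.26)-0.73ln(0.73) is already above 0.58 at the very least. Is the paper wrong? All the semi-supervised papers I have read from the past two years just stack various pseudo-label tricks on top of the Mean Teacher architecture to pad out publications = =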

JiaLin Li · Answer 5 · Tue Dec 12 2023 11:10:57 GMT+0800 (China Standard Time)

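@woailunhua The value in the paper is the normalized one. As in the snippet I wrote above, you divide the computed uncertainty by the maximum possible uncertainty to normalize it to 0~1, which makes it easier to set an uncertainty threshold.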

uncertainty = uncertainty / math.log(2) # normalize uncertainty to 0 to 1, cuz ln2 is the max value

woailunhua · Answer 6 · Tue Dec 12 2023 11:32:44 GMT+0800 (China Standard Time)

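Thank you, after your explanation I roughly understand where it comes from. The code does not divide the uncertainty by ln2; instead it multiplies the threshold by ln2, which amounts to the same thing. This code leaves me speechless.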

…

------------------ Original message ------------------ Subject: Re: [yulequan/UA-MT] Lack of Uncertainty normalization (Issue #18)

woailunhua · Answer 7 · Wed Dec 20 2023 16:07:41 GMT+0800 (China Standard Time)

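Sorry to bother you again. The smallest threshold is 0.75*ln2=0.52, but the minimum I get from -xlnx-(1-x)ln(1-x) is around 0.58. The paper gives the range as 0-1, in other words 0-ln2, so where does the 0 come from? The entropy over the channels cannot be that small, can it?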

woailunhua · Answer 8 · Wed Dec 20 2023 16:16:36 GMT+0800 (China Standard Time)

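First example, a very confident prediction: say a voxel's predicted value in channel 0 is 0.999 and in channel 1 is 0.001. After softmax, one is 0.73 and the other is 0.269. -0.73ln0.73-0.269ln(0.269)=0.582.
Second example, a fairly confident prediction: say a voxel's predicted value in channel 0 is 0.9 and in channel 1 is 0.1. After softmax, one is 0.69 and the other is 0.31. -0.69ln0.69-0.31ln(0.31)=0.62.
Third example, an ambiguous prediction: say a voxel's predicted value in channel 0 is 0.55 and in channel 1 is 0.45. After softmax, one is 0.525 and the other is 0.475, so -0.525ln0.525-0.475ln0.475=0.69. As we can see, the smallest entropy is only about 0.58, while the author's minimum threshold is 0.75*ln2=0.52. How did the author arrive at a range of 0-ln2?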
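
The three examples above can be reproduced with a short sketch in plain Python. It also illustrates one possible reading of the 0.58 floor: when softmax is applied to two values that already lie in [0, 1], their gap is at most 1, which bounds the entropy from below, whereas unbounded logits with a large gap give an entropy close to 0. Whether the network outputs in the repository are bounded this way is an assumption here, and the helper names softmax2 and entropy are hypothetical.

```python
import math

def softmax2(a, b):
    """Numerically stable softmax over two values."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)

def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# The three examples from the comment above (channel 0, channel 1).
for a, b in [(0.999, 0.001), (0.9, 0.1), (0.55, 0.45)]:
    probs = softmax2(a, b)
    print(a, b, [round(p, 3) for p in probs], round(entropy(probs), 3))

# Inputs restricted to [0, 1]: the largest possible gap is 1,
# so the entropy cannot fall below roughly 0.58.
floor_bounded = entropy(softmax2(1.0, 0.0))

# Unbounded logits with a large gap: the entropy approaches 0.
near_zero = entropy(softmax2(10.0, -10.0))

print(floor_bounded, near_zero)
```

Under this reading, the lower end of the 0 to ln2 range is reachable only when the values fed to softmax can be far apart.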

JiaLin Li · Answer 9 · Wed Dec 20 2023 20:55:46 GMT+0800 (China Standard Time)

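Not everything written in the paper is necessarily correct. Besides, uncertainty by definition only considers 0~1; with a softmax in the network the computed values will certainly differ, and the paper does not say to divide by ln2 either, right? In my experiments the minimum uncertainty was indeed 0.58 as well. Don't dwell on this, it's not worth it.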