If the confidence levels for the original and distorted inputs are both high and similar, can the plausibility constraint have a negative effect?

QiushiYang opened this issue 2 months ago · comments

The overall idea of this paper is really interesting! However, I am a bit confused about the specific design of the adaptive plausibility constraint. Contrasting predictions on its own may help with biases inherited from the LLM or the VLM datasets, i.e., tokens to which the distorted input assigns higher confidence, but it may hurt in other cases:

(1) For a correct prediction on the original input, if the predictions for the original and distorted inputs are similar, the correct token is wrongly penalized, i.e., a false negative;
(2) For an incorrect prediction (i.e., a hallucination), if it does not stem from LLM or VLM bias and the two predictions are dissimilar (the distorted one being smaller), the hallucinated token is wrongly favored, i.e., a false positive.
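
To make the two cases concrete, here is a minimal numeric sketch of the contrast step alone, without the plausibility constraint. It assumes the contrast has the form (1 + alpha) * logits_orig - alpha * logits_dist; the logit values, the 4-token vocabulary, and alpha = 1.0 are made up for illustration and do not come from the paper or the repository.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of logits.
    z = np.exp(x - np.max(x))
    return z / z.sum()

def contrast(logits_orig, logits_dist, alpha=1.0):
    # Assumed contrastive form: amplify the original logits and subtract the distorted ones.
    return (1 + alpha) * logits_orig - alpha * logits_dist

# Case (1): token 0 is correct, and both inputs give the same logits (illustrative values).
orig_1 = np.array([5.0, 1.0, 0.0, -1.0])
dist_1 = np.array([5.0, 1.0, 0.0, -1.0])
out_1 = contrast(orig_1, dist_1)

# Case (2): token 0 is a hallucination that the distorted input scores lower (illustrative values).
orig_2 = np.array([4.0, 3.0, 0.0, -1.0])
dist_2 = np.array([2.0, 3.0, 0.0, -1.0])
out_2 = contrast(orig_2, dist_2)

print("case 1:", softmax(orig_1).round(3), "->", softmax(out_1).round(3))
print("case 2:", softmax(orig_2).round(3), "->", softmax(out_2).round(3))
```

With these toy numbers, identical logits in case (1) leave the distribution unchanged, while in case (2) the contrast raises the probability of the hallucinated token 0. Whether real model logits behave like either toy case is exactly what I am unsure about.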

The adaptive plausibility constraint selects high-confidence predictions for prediction contrasting. However, if the confidence levels in the two cases above are high, these false positives and false negatives are also selected, which introduces misleading constraints.

Moreover, I understand that if the threshold is very high, only the highest-confidence prediction remains. However, I notice that beta is set to 0.1, so it seems that multiple candidates usually remain (right?), and the false positives and false negatives may still exist. If beta is set so high that only the top candidate remains, does VCD become equivalent to taking max(logit) on most samples?
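
Here is a small sketch of how I understand the constraint to interact with beta. It assumes a token is kept when its probability under the original input is at least beta times the maximum probability, and that the contrast is applied only to the kept tokens. The function name `vcd_next_token_probs`, the logits, and alpha = 1.0 are hypothetical, chosen for illustration, and this is not the repository's implementation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax; entries equal to -inf get probability 0.
    z = np.exp(x - np.max(x))
    return z / z.sum()

def vcd_next_token_probs(logits_orig, logits_dist, alpha=1.0, beta=0.1):
    # Assumed plausibility constraint: keep tokens with p_orig >= beta * max(p_orig).
    p_orig = softmax(logits_orig)
    keep = p_orig >= beta * p_orig.max()
    # Assumed contrast, applied only to kept tokens; the rest are masked out.
    scores = (1 + alpha) * logits_orig - alpha * logits_dist
    scores = np.where(keep, scores, -np.inf)
    return softmax(scores), keep

# Illustrative logits over a 4-token vocabulary (made-up values).
logits_orig = np.array([3.0, 2.5, 0.0, -2.0])
logits_dist = np.array([3.0, 1.0, 0.0, -2.0])

probs_low, keep_low = vcd_next_token_probs(logits_orig, logits_dist, beta=0.1)
probs_high, keep_high = vcd_next_token_probs(logits_orig, logits_dist, beta=1.0)

print("beta=0.1 kept:", keep_low, "argmax:", int(np.argmax(probs_low)))
print("beta=1.0 kept:", keep_high, "argmax:", int(np.argmax(probs_high)))
print("greedy on original logits:", int(np.argmax(logits_orig)))
```

In this toy example, beta = 0.1 keeps two candidates and the contrast changes the selected token, while beta = 1.0 keeps only the top candidate, so the choice coincides with argmax over the original logits.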

I am confused about the analysis above. Could you help me interpret it? Thanks a lot!