Why does jointly training a preference predictor with rating data not help much with preference prediction?

Question

LinB203 opened this issue a year ago

I wonder why jointly training a preference predictor with rating data does not help much with preference prediction.
Could you explain it in detail?

KeqiangSun · Answer 1 · Tue Jul 11 2023 10:17:59 GMT+0800 (China Standard Time)

Thank you for your interest in our work! We tried jointly training an aesthetic predictor on top of the CLIP image encoder, using Simulacra and AVA. The model structure is exactly the same as the official aesthetic predictor. Empirically, preference accuracy does not improve compared to training on the preference data alone.
This may indicate that preference data and rating data characterize different aspects of an image. Preference data is annotated by pairwise comparison, while rating data is not. Moreover, the AVA ratings are not conditioned on prompts, and Simulacra, which we observed to be relatively noisy, has a domain shift from the preference data.
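
As an illustration only, the "preference accuracy" mentioned above can be sketched as the fraction of annotated pairs in which the model scores the human-preferred image higher. The function name and the data layout are assumptions made for this example, not the authors' evaluation code.

```python
def preference_accuracy(score_pairs):
    """Fraction of pairs where the human-preferred image gets the higher score.

    score_pairs: list of (score_preferred, score_rejected) tuples.
    This layout is an assumption for the sketch.
    """
    if not score_pairs:
        raise ValueError("need at least one pair")
    correct = sum(1 for preferred, rejected in score_pairs if preferred > rejected)
    return correct / len(score_pairs)


# Hypothetical model scores for three annotated pairs.
pairs = [(0.9, 0.2), (0.4, 0.6), (0.7, 0.1)]
accuracy = preference_accuracy(pairs)
```

A rating dataset has no such pairs: each image carries one absolute score, so it cannot be evaluated with this metric directly.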

lb203 · Answer 2 · Tue Jul 11 2023 13:04:37 GMT+0800 (China Standard Time)

You mean that you trained a score predictor on Simulacra and AVA, evaluated it on preference data (with accuracy or another metric), and found no improvement. Am I right?

Blakey Wu · Answer 3 · Wed Jul 12 2023 12:40:25 GMT+0800 (China Standard Time)

No. In this setting, the preference predictor is trained jointly with the score predictor. They use different heads and different data during training, and they are evaluated separately. They share only the image encoder.
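
The setup described here can be sketched as a toy forward pass in plain Python, under stated assumptions. The thread only says that the two heads share the image encoder and see different data; the linear encoder, the dot-product preference head, the linear rating head, the specific loss forms, and all names below are hypothetical choices for illustration.

```python
import math


def encode_image(pixels, encoder_weights):
    """Toy stand-in for the shared image encoder: a single linear layer."""
    return [sum(w * x for w, x in zip(row, pixels)) for row in encoder_weights]


def preference_score(feature, prompt_embedding):
    """Preference head (assumed form): dot product with a prompt embedding."""
    return sum(f * p for f, p in zip(feature, prompt_embedding))


def rating_score(feature, rating_weights, rating_bias):
    """Rating head (assumed form): a separate linear layer with no prompt input."""
    return sum(f * w for f, w in zip(feature, rating_weights)) + rating_bias


def joint_loss(pref_batch, rating_batch, params, rating_weight=1.0):
    """Sum of a pairwise preference loss and a rating regression loss.

    Both terms pass through params["encoder"], so both would send gradients
    into the shared encoder; the heads are otherwise independent.
    """
    enc = params["encoder"]

    # Pairwise term: negative log-softmax probability of the preferred image.
    pref_loss = 0.0
    for prompt, preferred, rejected in pref_batch:
        s_pos = preference_score(encode_image(preferred, enc), prompt)
        s_neg = preference_score(encode_image(rejected, enc), prompt)
        pref_loss += math.log(1.0 + math.exp(s_neg - s_pos))
    pref_loss /= len(pref_batch)

    # Rating term: mean squared error against absolute scores.
    rate_loss = 0.0
    for image, rating in rating_batch:
        feature = encode_image(image, enc)
        pred = rating_score(feature, params["rating_w"], params["rating_b"])
        rate_loss += (pred - rating) ** 2
    rate_loss /= len(rating_batch)

    return pref_loss + rating_weight * rate_loss, pref_loss, rate_loss


# Hypothetical 2-dimensional example with an identity encoder.
params = {"encoder": [[1.0, 0.0], [0.0, 1.0]], "rating_w": [0.5, 0.5], "rating_b": 0.0}
pref_batch = [([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])]  # (prompt, preferred, rejected)
rating_batch = [([2.0, 2.0], 2.0)]                   # (image, rating)
total, pref_loss, rate_loss = joint_loss(pref_batch, rating_batch, params)
```

In this sketch only the encoder is shared, so the rating data can affect preference prediction solely through the features the encoder learns. Each head would then be evaluated on its own data.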

lb203 · Answer 4 · Wed Jul 12 2023 12:41:45 GMT+0800 (China Standard Time)

thx.