tgxs002 / HPSv2

Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis

Why does jointly training a preference predictor with rating data not help much with preference prediction?

LinB203 opened this issue

I wonder why jointly training a preference predictor with rating data does not help much with preference prediction.
Could you explain it in detail?

Thank you for your interest in our work! We tried jointly training an aesthetic predictor on top of the CLIP image encoder, using Simulacra and AVA. Its structure is exactly the same as the official aesthetic predictor. Empirically, preference accuracy does not improve compared with training on the preference data alone.
This may indicate that preference data and rating data characterize different aspects of an image. Preference data is annotated through pairwise comparison, while rating data is not. Moreover, the rating data from AVA is not conditioned on prompts, and Simulacra is relatively noisy in our observation, so the rating data has a domain shift from the preference data.
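
To make the distinction above concrete, here is a minimal sketch of how a pairwise preference accuracy could be computed, next to the shape of a rating record. The record layouts, the `score` callable, and the tie-breaking rule are assumptions for illustration, not the repository's actual data format or evaluation code.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical record layouts, for illustration only.
# Preference record: (prompt, image_a, image_b, index of the preferred image).
PreferencePair = Tuple[str, str, str, int]
# Rating record: (image, absolute score). There is no prompt and no comparison.
RatedImage = Tuple[str, float]


def preference_accuracy(
    score: Callable[[str, str], float],
    pairs: Sequence[PreferencePair],
) -> float:
    """Fraction of pairs in which the higher-scored image is the annotated choice.

    `score(prompt, image)` is any prompt-conditioned scorer (an assumption here).
    Ties are resolved in favor of image_a, which is an arbitrary choice.
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    correct = 0
    for prompt, image_a, image_b, preferred in pairs:
        predicted = 0 if score(prompt, image_a) >= score(prompt, image_b) else 1
        correct += int(predicted == preferred)
    return correct / len(pairs)
```

A rating record carries only an absolute score for a single image, so it supervises neither the comparison nor the prompt dependence that this metric measures.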

So you mean that you trained a score predictor on Simulacra and AVA, evaluated it on the preference data (using accuracy or another metric), and found no improvement. Am I right?

No. In this setting, the preference predictor is trained jointly with the score predictor. They use different heads and different training data, and they are evaluated separately. They share only the image encoder.
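
The setup described here can be sketched in plain Python as a joint objective over two heads that share one encoder. Everything below is a hypothetical stand-in: the single linear layer replaces the CLIP image encoder, the image-text dot product as the preference score and the linear rating head are assumptions, and `rating_weight` is an illustrative parameter that the thread does not mention.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def encode_image(image: Vector, encoder_weights: Sequence[Vector]) -> Vector:
    """Shared image encoder (stand-in: one linear layer)."""
    return [dot(row, image) for row in encoder_weights]


def preference_score(image_feature: Vector, text_feature: Vector) -> float:
    """Preference head (assumed): image-text similarity."""
    return dot(image_feature, text_feature)


def predicted_rating(image_feature: Vector, head_weights: Vector, bias: float) -> float:
    """Rating head (assumed): linear regressor on the shared image feature."""
    return dot(image_feature, head_weights) + bias


def pairwise_loss(score_preferred: float, score_rejected: float) -> float:
    """-log softmax probability of the preferred image, computed stably."""
    diff = score_rejected - score_preferred
    return max(diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def joint_loss(
    encoder_weights: Sequence[Vector],
    rating_head_weights: Vector,
    rating_bias: float,
    preference_batch: Sequence[Tuple[Vector, Vector, Vector]],  # (text feature, preferred image, rejected image)
    rating_batch: Sequence[Tuple[Vector, float]],  # (image, target rating)
    rating_weight: float = 1.0,
) -> float:
    """Sum of the two task losses; only `encoder_weights` is used by both."""
    pref_total = 0.0
    for text_feature, preferred, rejected in preference_batch:
        s_pref = preference_score(encode_image(preferred, encoder_weights), text_feature)
        s_rej = preference_score(encode_image(rejected, encoder_weights), text_feature)
        pref_total += pairwise_loss(s_pref, s_rej)
    rate_total = 0.0
    for image, target in rating_batch:
        feature = encode_image(image, encoder_weights)
        rate_total += (predicted_rating(feature, rating_head_weights, rating_bias) - target) ** 2
    pref_mean = pref_total / len(preference_batch)
    rate_mean = rate_total / len(rating_batch)
    return pref_mean + rating_weight * rate_mean
```

Because each batch comes from its own dataset and flows through its own head, the rating data can influence preference prediction only through the shared encoder weights. Evaluating the two heads separately then amounts to reporting preference accuracy for one and a rating error for the other.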

thx.