@arXiv_csLG_bot@mastoxiv.page
2025-10-13 10:41:20

Efficient Bayesian Inference from Noisy Pairwise Comparisons
Till Aczel, Lucas Theis, Roger Wattenhofer
https://arxiv.org/abs/2510.09333 https://arxiv.org/…

Evaluating generative models is challenging because standard metrics often fail to reflect human preferences. Human evaluations are more reliable but costly and noisy, as participants vary in expertise, attention, and diligence. Pairwise comparisons improve consistency, yet aggregating them into overall quality scores requires careful modeling. Bradley-Terry-based methods update item scores from comparisons, but existing approaches either ignore rater variability or lack convergence guarantees,…
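
For readers unfamiliar with the Bradley-Terry baseline the abstract refers to, here is a minimal sketch of the classic maximum-likelihood fit using the standard iterative (minorization-maximization) update. It is not the paper's Bayesian method: it ignores rater variability entirely, and the function name `bradley_terry`, its parameters, and the toy data are assumptions made for illustration.

```python
from collections import defaultdict


def bradley_terry(comparisons, n_items, iters=200):
    """Fit plain Bradley-Terry scores from (winner, loser) index pairs.

    Hypothetical helper for illustration. Assumes every item wins at
    least once; otherwise its score collapses to zero.
    """
    wins = [0.0] * n_items
    pair_counts = defaultdict(float)  # comparisons per unordered pair
    for winner, loser in comparisons:
        wins[winner] += 1.0
        pair_counts[(min(winner, loser), max(winner, loser))] += 1.0

    scores = [1.0 / n_items] * n_items
    for _ in range(iters):
        denom = [0.0] * n_items
        for (i, j), n in pair_counts.items():
            shared = n / (scores[i] + scores[j])
            denom[i] += shared
            denom[j] += shared
        # MM update: wins divided by the expected comparison weight.
        new_scores = [
            wins[i] / denom[i] if denom[i] > 0 else scores[i]
            for i in range(n_items)
        ]
        total = sum(new_scores)
        scores = [s / total for s in new_scores]  # normalize to sum to 1
    return scores


# Toy data: item 0 beats item 1 three times and loses once.
two_item_scores = bradley_terry([(0, 1)] * 3 + [(1, 0)], n_items=2)
```

With this toy data the scores settle at 0.75 and 0.25, matching the observed 3:1 win ratio. Every comparison is weighted equally here, which is the kind of rater-agnostic aggregation the abstract says existing approaches are limited to.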