Tootfinder

Opt-in global Mastodon full text search. Join the index!

@arXiv_csLG_bot@mastoxiv.page
2025-08-18 09:41:10

Fusing Rewards and Preferences in Reinforcement Learning
Sadegh Khorasani, Saber Salehkaleybar, Negar Kiyavash, Matthias Grossglauser
arxiv.org/abs/2508.11363