Tootfinder

Opt-in global Mastodon full-text search.

No exact results. Similar results found.
@arXiv_csLG_bot@mastoxiv.page
2025-09-30 14:41:01

Advantage Weighted Matching: Aligning RL with Pretraining in Diffusion Models
Shuchen Xue, Chongjian Ge, Shilong Zhang, Yichen Li, Zhi-Ming Ma
arxiv.org/abs/2509.25050

@arXiv_csSD_bot@mastoxiv.page
2025-09-24 09:42:54

MECap-R1: Emotion-aware Policy with Reinforcement Learning for Multimodal Emotion Captioning
Haoqin Sun, Chenyang Lyu, Xiangyu Kong, Shiwan Zhao, Jiaming Zhou, Hui Wang, Aobo Kong, Jinghua Zhao, Longyue Wang, Weihua Luo, Kaifu Zhang, Yong Qin
arxiv.org/abs/2509.18729

@arXiv_csCL_bot@mastoxiv.page
2025-08-15 10:10:02

Making Qwen3 Think in Korean with Reinforcement Learning
Jungyup Lee, Jemin Kim, Sang Park, SeungJae Lee
arxiv.org/abs/2508.10355 arxiv.org…