Tootfinder

Opt-in global Mastodon full text search. Join the index!

No exact results. Similar results found.
@usul@piaille.fr
2025-10-27 04:51:28

Thanks, Trump
In Argentina, Javier Milei's party wins the midterm legislative elections by a wide margin
lemonde.fr/international/artic

@arXiv_csAI_bot@mastoxiv.page
2025-08-28 09:32:41

SWIRL: A Staged Workflow for Interleaved Reinforcement Learning in Mobile GUI Control
Quanfeng Lu, Zhantao Ma, Shuai Zhong, Jin Wang, Dahai Yu, Michael K. Ng, Ping Luo
arxiv.org/abs/2508.20018

@arXiv_csCL_bot@mastoxiv.page
2025-07-28 09:58:01

GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
Lakshya A Agrawal, Shangyin Tan, Dilara Soylu, Noah Ziems, Rishi Khare, Krista Opsahl-Ong, Arnav Singhvi, Herumb Shandilya, Michael J Ryan, Meng Jiang, Christopher Potts, Koushik Sen, Alexandros G. Dimakis, Ion Stoica, Dan Klein, Matei Zaharia, Omar Khattab
arx…

@arXiv_csRO_bot@mastoxiv.page
2025-07-28 09:32:01

ReCoDe: Reinforcement Learning-based Dynamic Constraint Design for Multi-Agent Coordination
Michael Amir, Guang Yang, Zhan Gao, Keisuke Okumura, Heedo Woo, Amanda Prorok
arxiv.org/abs/2507.19151

@arXiv_csHC_bot@mastoxiv.page
2025-07-28 09:12:41

DxHF: Providing High-Quality Human Feedback for LLM Alignment via Interactive Decomposition
Danqing Shi, Furui Cheng, Tino Weinkauf, Antti Oulasvirta, Mennatallah El-Assady
arxiv.org/abs/2507.18802

@arXiv_csCV_bot@mastoxiv.page
2025-08-26 12:32:47

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Weiyun Wang, Zhangwei Gao, Lixin Gu, Hengjun Pu, Long Cui, Xingguang Wei, Zhaoyang Liu, Linglin Jing, Shenglong Ye, Jie Shao, Zhaokai Wang, Zhe Chen, Hongjie Zhang, Ganlin Yang, Haomin Wang, Qi Wei, Jinhui Yin, Wenhao Li, Erfei Cui, Guanzhou Chen, Zichen Ding, Changyao Tian, Zhenyu Wu, Jingjing Xie, Zehao Li, Bowen Yang, Yuchen Duan, Xuehui Wang, Songze Li, Xiangyu Zhao, Haodong Duan, Nianche…

@usul@piaille.fr
2025-08-27 14:34:07

"Oskar," Ukrainian colonel: "A mass mobilization would be an unpopular but necessary decision"
lemonde.fr/international/artic

@arXiv_csAI_bot@mastoxiv.page
2025-09-26 08:36:31

UserRL: Training Interactive User-Centric Agent via Reinforcement Learning
Cheng Qian, Zuxin Liu, Akshara Prabhakar, Jielin Qiu, Zhiwei Liu, Haolin Chen, Shirley Kokane, Heng Ji, Weiran Yao, Shelby Heinecke, Silvio Savarese, Caiming Xiong, Huan Wang
arxiv.org/abs/2509.19736

@arXiv_csCV_bot@mastoxiv.page
2025-09-26 10:19:41

MOSS-ChatV: Reinforcement Learning with Process Reasoning Reward for Video Temporal Reasoning
Sicheng Tao, Jungang Li, Yibo Yan, Junyan Zhang, Yubo Gao, Hanqian Li, ShuHang Xun, Yuxuan Fan, Hong Chen, Jianxiang He, Xuming Hu
arxiv.org/abs/2509.21113

@arXiv_csHC_bot@mastoxiv.page
2025-08-26 08:27:56

Increasing Interaction Fidelity: Training Routines for Biomechanical Models in HCI
Michał Patryk Miazga, Patrick Ebel
arxiv.org/abs/2508.16581