Tootfinder

Opt-in global Mastodon full-text search. Join the index!

@arXiv_csCV_bot@mastoxiv.page
2025-10-15 10:44:21

CoIRL-AD: Collaborative-Competitive Imitation-Reinforcement Learning in Latent World Models for Autonomous Driving
Xiaoji Zheng, Ziyuan Yang, Yanhao Chen, Yuhang Peng, Yuanrong Tang, Gengyuan Liu, Bokui Chen, Jiangtao Gong
arxiv.org/abs/2510.12560

@arXiv_csLG_bot@mastoxiv.page
2025-10-15 08:21:22

GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving
Ruida Wang, Jiarui Yao, Rui Pan, Shizhe Diao, Tong Zhang
arxiv.org/abs/2510.11769

@arXiv_csRO_bot@mastoxiv.page
2025-10-15 09:22:41

Pretraining in Actor-Critic Reinforcement Learning for Robot Motion Control
Jiale Fan, Andrei Cramariuc, Tifanny Portela, Marco Hutter
arxiv.org/abs/2510.12363

@arXiv_eessSY_bot@mastoxiv.page
2025-10-15 08:05:41

Physics-Informed Reinforcement Learning for Large-Scale EV Smart Charging Considering Distribution Network Voltage Constraints
Stavros Orfanoudakis, Frans Oliehoek, Peter Palensky, Pedro P. Vergara
arxiv.org/abs/2510.12335

@arXiv_csLG_bot@mastoxiv.page
2025-10-15 10:47:41

Expert or not? Assessing data quality in offline reinforcement learning
Arip Asadulaev, Fakhri Karray, Martin Takac
arxiv.org/abs/2510.12638

@arXiv_csRO_bot@mastoxiv.page
2025-10-15 10:12:01

Residual MPC: Blending Reinforcement Learning with GPU-Parallelized Model Predictive Control
Se Hwan Jeon, Ho Jae Lee, Seungwoo Hong, Sangbae Kim
arxiv.org/abs/2510.12717

@arXiv_csLG_bot@mastoxiv.page
2025-10-15 10:46:41

Laminar: A Scalable Asynchronous RL Post-Training Framework
Guangming Sheng, Yuxuan Tong, Borui Wan, Wang Zhang, Chaobo Jia, Xibin Wu, Yuqi Wu, Xiang Li, Chi Zhang, Yanghua Peng, Haibin Lin, Xin Liu, Chuan Wu
arxiv.org/abs/2510.12633

@arXiv_csCL_bot@mastoxiv.page
2025-10-15 14:19:21

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[2/5]:
- DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical...
Yichun Feng, Jiawei Wang, Lu Zhou, Zhen Lei, Yixue Li

@arXiv_csRO_bot@mastoxiv.page
2025-10-15 09:49:31

A Task-Efficient Reinforcement Learning Task-Motion Planner for Safe Human-Robot Cooperation
Gaoyuan Liu, Joris de Winter, Kelly Merckaert, Denis Steckelmacher, Ann Nowe, Bram Vanderborght
arxiv.org/abs/2510.12477

@arXiv_qbioNC_bot@mastoxiv.page
2025-12-11 08:43:31

Prefrontal scaling of reward prediction error readout gates reinforcement-derived adaptive behavior in primates
Tian Sang, Yichun Huang, Fangwei Zhong, Miao Wang, Shiqi Yu, Jiahui Li, Yuanjing Feng, Yizhou Wang, Kwok Sze Chai, Ravi S. Menon, Meiyun Wang, Fang Fang, Zheng Wang
arxiv.org/abs/2512.09761 arxiv.org/pdf/2512.09761 arxiv.org/html/2512.09761
arXiv:2512.09761v1 Announce Type: new
Abstract: Reinforcement learning (RL) enables adaptive behavior across species via reward prediction errors (RPEs), but the neural origins of species-specific adaptability remain unknown. Integrating RL modeling, transcriptomics, and neuroimaging during reversal learning, we discovered convergent RPE signatures - shared monoaminergic/synaptic gene upregulation and neuroanatomical representations, yet humans outperformed macaques behaviorally. Single-trial decoding showed RPEs guided choices similarly in both species, but humans disproportionately recruited dorsal anterior cingulate (dACC) and dorsolateral prefrontal cortex (dlPFC). Cross-species alignment uncovered that macaque prefrontal circuits encode human-like optimal RPEs yet fail to translate them into action. Adaptability scaled not with RPE encoding fidelity, but with the areal extent of dACC/dlPFC recruitment governing RPE-to-action transformation. These findings resolve an evolutionary puzzle: behavioral performance gaps arise from executive cortical readout efficiency, not encoding capacity.
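As a rough illustration of the reward-prediction-error (RPE) mechanism this abstract builds on: in the standard Rescorla-Wagner/temporal-difference formulation, each trial's RPE is delta = r - V(a), and the chosen option's value is nudged toward the outcome by V(a) += alpha * delta. Below is a minimal sketch of RPE-driven learning in a two-armed reversal-learning task of the kind used in the study; the function name, parameters (alpha, beta, reversal point), and reward probabilities are illustrative assumptions, not the paper's actual model.

```python
import math
import random

# Minimal sketch (assumed parameters, not the paper's model): a
# Rescorla-Wagner learner with softmax choice on a two-armed bandit
# whose reward contingencies flip mid-session (reversal learning).
def reversal_learning(trials=400, alpha=0.3, beta=5.0, reverse_at=200):
    V = [0.5, 0.5]            # value estimates for the two arms
    p_reward = [0.8, 0.2]     # arm 0 pays off more often at first
    better_choices = 0
    for t in range(trials):
        if t == reverse_at:
            p_reward.reverse()              # contingencies flip mid-session
        # softmax over the two value estimates picks an arm
        p0 = 1.0 / (1.0 + math.exp(-beta * (V[0] - V[1])))
        a = 0 if random.random() < p0 else 1
        r = 1.0 if random.random() < p_reward[a] else 0.0
        delta = r - V[a]                    # reward prediction error (RPE)
        V[a] += alpha * delta               # RPE-scaled value update
        better_choices += p_reward[a] == max(p_reward)  # chose the better arm?
    return better_choices / trials

print(f"fraction of better-arm choices: {reversal_learning():.2f}")
```

In this toy setup, adaptability shows up as how quickly choices track the mid-session flip; the paper's claim is that the cross-species gap lies in the prefrontal readout turning such RPEs into action, not in the RPE signal itself.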

@arXiv_csRO_bot@mastoxiv.page
2025-10-15 10:10:51

Reflection-Based Task Adaptation for Self-Improving VLA
Baicheng Li, Dong Wu, Zike Yan, Xinchen Liu, Zecui Zeng, Lusong Li, Hongbin Zha
arxiv.org/abs/2510.12710