CoIRL-AD: Collaborative-Competitive Imitation-Reinforcement Learning in Latent World Models for Autonomous Driving
Xiaoji Zheng, Ziyuan Yang, Yanhao Chen, Yuhang Peng, Yuanrong Tang, Gengyuan Liu, Bokui Chen, Jiangtao Gong
https://arxiv.org/abs/2510.12560
Physics-Informed Reinforcement Learning for Large-Scale EV Smart Charging Considering Distribution Network Voltage Constraints
Stavros Orfanoudakis, Frans Oliehoek, Peter Palensky, Pedro P. Vergara
https://arxiv.org/abs/2510.12335
Laminar: A Scalable Asynchronous RL Post-Training Framework
Guangming Sheng, Yuxuan Tong, Borui Wan, Wang Zhang, Chaobo Jia, Xibin Wu, Yuqi Wu, Xiang Li, Chi Zhang, Yanghua Peng, Haibin Lin, Xin Liu, Chuan Wu
https://arxiv.org/abs/2510.12633
Replaced article(s) found for cs.CL. https://arxiv.org/list/cs.CL/new
- DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical...
Yichun Feng, Jiawei Wang, Lu Zhou, Zhen Lei, Yixue Li
A Task-Efficient Reinforcement Learning Task-Motion Planner for Safe Human-Robot Cooperation
Gaoyuan Liu, Joris de Winter, Kelly Merckaert, Denis Steckelmacher, Ann Nowe, Bram Vanderborght
https://arxiv.org/abs/2510.12477
Prefrontal scaling of reward prediction error readout gates reinforcement-derived adaptive behavior in primates
Tian Sang, Yichun Huang, Fangwei Zhong, Miao Wang, Shiqi Yu, Jiahui Li, Yuanjing Feng, Yizhou Wang, Kwok Sze Chai, Ravi S. Menon, Meiyun Wang, Fang Fang, Zheng Wang
https://arxiv.org/abs/2512.09761
arXiv:2512.09761v1 Announce Type: new
Abstract: Reinforcement learning (RL) enables adaptive behavior across species via reward prediction errors (RPEs), but the neural origins of species-specific adaptability remain unknown. Integrating RL modeling, transcriptomics, and neuroimaging during reversal learning, we discovered convergent RPE signatures - shared monoaminergic/synaptic gene upregulation and neuroanatomical representations, yet humans outperformed macaques behaviorally. Single-trial decoding showed RPEs guided choices similarly in both species, but humans disproportionately recruited dorsal anterior cingulate (dACC) and dorsolateral prefrontal cortex (dlPFC). Cross-species alignment uncovered that macaque prefrontal circuits encode human-like optimal RPEs yet fail to translate them into action. Adaptability scaled not with RPE encoding fidelity, but with the areal extent of dACC/dlPFC recruitment governing RPE-to-action transformation. These findings resolve an evolutionary puzzle: behavioral performance gaps arise from executive cortical readout efficiency, not encoding capacity.
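The abstract's RL modeling rests on reward prediction errors (RPEs) driving value updates during reversal learning. As a hedged illustration only (the paper's actual model and parameters are not given here), a minimal Rescorla-Wagner learner on a two-option reversal task might look like this; `alpha`, `beta`, and the 0.8/0.2 reward probabilities are illustrative assumptions, not values from the study:

```python
import math
import random

def simulate_reversal_learning(n_trials=200, alpha=0.3, beta=5.0, seed=0):
    """Two-option reversal-learning task with an RPE-driven learner.

    alpha: learning rate scaling the RPE update (assumed value)
    beta:  softmax inverse temperature (assumed value)
    """
    rng = random.Random(seed)
    q = [0.5, 0.5]           # value estimates for the two options
    p_reward = [0.8, 0.2]    # option 0 is initially the better choice
    rpes = []
    n_correct = 0
    for t in range(n_trials):
        if t == n_trials // 2:
            p_reward.reverse()           # contingency reversal at midpoint
        # softmax choice between the two options
        p0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
        choice = 0 if rng.random() < p0 else 1
        reward = 1.0 if rng.random() < p_reward[choice] else 0.0
        delta = reward - q[choice]       # reward prediction error (RPE)
        q[choice] += alpha * delta       # RPE-scaled value update
        rpes.append(delta)
        if p_reward[choice] == max(p_reward):
            n_correct += 1
    return n_correct / n_trials, rpes
```

Single-trial RPEs (`delta` here) are what the study decodes from neural data; the species difference the abstract describes lies not in this update rule but in how such signals are read out into choices.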