How Reinforcement Learning After Next-Token Prediction Facilitates Learning
Nikolaos Tsilivis, Eran Malach, Karen Ullrich, Julia Kempe
https://arxiv.org/abs/2510.11495
CoIRL-AD: Collaborative-Competitive Imitation-Reinforcement Learning in Latent World Models for Autonomous Driving
Xiaoji Zheng, Ziyuan Yang, Yanhao Chen, Yuhang Peng, Yuanrong Tang, Gengyuan Liu, Bokui Chen, Jiangtao Gong
https://arxiv.org/abs/2510.12560
Residual MPC: Blending Reinforcement Learning with GPU-Parallelized Model Predictive Control
Se Hwan Jeon, Ho Jae Lee, Seungwoo Hong, Sangbae Kim
https://arxiv.org/abs/2510.12717
On Sunday, October 5, 2025, my wife and I are participating in the Canadian Cancer Society's Run for the Cure. This national fundraiser brings together communities across the country who will run or walk in support of all Canadians impacted by breast cancer. We are walking in honour of my mother, my two sisters, and both our daughters - all of whom have been touched by this terrible disease.
Our family isn't special: 1 in 8 women will face a breast cancer diagnosis in their lifetime. It is the most commonly diagnosed cancer among Canadian women, and all of us likely know someone who has battled it in the past or is fighting it today. I encourage all Canadians to donate in honour of someone they know, or simply to help their communities fight breast cancer. This cause is particularly important to us, especially this year, as we support our daughters, and I hope all Canadians get involved.
#CanadianCancerSociety #RunForTheCure
https://support.cancer.ca/site/TR/RunfortheCure/RFTC_NW_even_?px=15178695&pg=personal&fr_id=30455
Compositional shield synthesis for safe reinforcement learning in partial observability
Steven Carr, Georgios Bakirtzis, Ufuk Topcu
https://arxiv.org/abs/2509.12085
Strategic Cyber Defense via Reinforcement Learning-Guided Combinatorial Auctions
Mai Pham, Vikrant Vaze, Peter Chin
https://arxiv.org/abs/2509.10983
Biased-Attention Guided Risk Prediction for Safe Decision-Making at Unsignalized Intersections
Chengyang Dong, Nan Guo
https://arxiv.org/abs/2510.12428
Thought Purity: Defense Paradigm For Chain-of-Thought Attack
Zihao Xue, Zhen Bi, Long Ma, Zhenlin Hu, Yan Wang, Zhenfang Liu, Qing Sheng, Jie Xiao, Jungang Lou
https://arxiv.org/abs/2507.12314
Real-Time Defense Against Coordinated Cyber-Physical Attacks: A Robust Constrained Reinforcement Learning Approach
Saman Mazaheri Khamaneh, Tong Wu, Wei Sun, Cong Chen
https://arxiv.org/abs/2509.10999 …
Few-shot Vision-based Human Activity Recognition with MLLM-based Visual Reinforcement Learning
Wenqi Zheng, Yutaka Arakawa
https://arxiv.org/abs/2508.10371