
2025-06-05 07:23:33
SLAC: Simulation-Pretrained Latent Action Space for Whole-Body Real-World RL
Jiaheng Hu, Peter Stone, Roberto Martín-Martín
https://arxiv.org/abs/2506.04147
This https://arxiv.org/abs/2505.24298 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…
CRScore : Reinforcement Learning with Verifiable Tool and AI Feedback for Code Review
Manav Nitin Kapadnis, Atharva Naik, Carolyn Rose
https://arxiv.org/abs/2506.00296
This https://arxiv.org/abs/2505.19641 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…
A US federal judge orders the USAGM to immediately disburse RFE/RL's May funding of ~$12M, following a similar order last month for its April funding (Radio Free Europe/Radio Liberty)
https://www.rferl.org/a/rfe-rl-order-lamberth-court-funding-/33429…
This https://arxiv.org/abs/2505.23527 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…
Learning-based primal-dual optimal control of discrete-time stochastic systems with multiplicative noise
Xiushan Jiang, Weihai Zhang
https://arxiv.org/abs/2506.02613
Ensemble-MIX: Enhancing Sample Efficiency in Multi-Agent RL Using Ensemble Methods
Tom Danino, Nahum Shimkin
https://arxiv.org/abs/2506.02841 https://
This https://arxiv.org/abs/2404.17589 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csIR_…
I am still alive. so many things happened both rl-wise and fandom-wise. most importantly my health deteriorated really badly lol
Multimodal Mathematical Reasoning with Diverse Solving Perspective
Wenhao Shi, Zhiqiang Hu, Yi Bin, Yang Yang, See-Kiong Ng, Heng Tao Shen
https://arxiv.org/abs/2507.02804
This https://arxiv.org/abs/2505.23703 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…
This https://arxiv.org/abs/2308.13140 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control
Zijie Xu, Tong Bu, Zecheng Hao, Jianhao Ding, Zhaofei Yu
https://arxiv.org/abs/2505.24161
Reentrant localization in a quasiperiodic chain with correlated hopping sequences
Sourav Karmakar, Sudin Ganguly, Santanu K. Maiti
https://arxiv.org/abs/2506.02716
This https://arxiv.org/abs/2505.23585 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…
Joint Modeling for Learning Decision-Making Dynamics in Behavioral Experiments
Yuan Bian, Xingche Guo, Yuanjia Wang
https://arxiv.org/abs/2506.02394 https:…
A Reinforcement Learning-Based Telematic Routing Protocol for the Internet of Underwater Things
Mohammadhossein Homaei, Mehran Tarif, Agustin Di Bartolo, Oscar Mogollon Gutierrez, Mar Avila
https://arxiv.org/abs/2506.00133
MOTIF: Modular Thinking via Reinforcement Fine-tuning in LLMs
Purbesh Mitra, Sennur Ulukus
https://arxiv.org/abs/2507.02851 https://a…
Pearl: Automatic Code Optimization Using Deep Reinforcement Learning
Djamel Rassem Lamouri, Iheb Nassim Aouadj, Smail Kourta, Riyadh Baghdadi
https://arxiv.org/abs/2506.01880
Accelerated Portfolio Optimization and Option Pricing with Reinforcement Learning
Hadi Keramati, Samaneh Jazayeri
https://arxiv.org/abs/2507.01972 https://…
BASIL: Best-Action Symbolic Interpretable Learning for Evolving Compact RL Policies
Kourosh Shahnazari, Seyed Moein Ayyoubzadeh, Mohammadali Keshtparvar
https://arxiv.org/abs/2506.00328
This https://arxiv.org/abs/2506.01755 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…
Agnostic Reinforcement Learning: Foundations and Algorithms
Gene Li
https://arxiv.org/abs/2506.01884 https://arxiv.org/pdf/2506.01884…
This https://arxiv.org/abs/2504.18253 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
A note on the uniqueness properties of solutions for the Schrödinger-Korteweg de Vries system
Eddye Bustamante, José Jiménez Urrea, Jorge Mejía
https://arxiv.org/abs/2507.01733
This https://arxiv.org/abs/2505.24034 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…
This https://arxiv.org/abs/2503.07792 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…
This https://arxiv.org/abs/2505.22642 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
RL-based Adaptive Task Offloading in Mobile-Edge Computing for Future IoT Networks
Ziad Qais Al Abbasi, Khaled M. Rabie, Xingwang Li, Wali Ullah Khan, Asma Abu Samah
https://arxiv.org/abs/2506.22474
This https://arxiv.org/abs/2505.18780 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
Reinforcement Learning for Discrete-time LQG Mean Field Social Control Problems with Unknown Dynamics
Hanfang Zhang, Bing-Chang Wang, Shuo Chen
https://arxiv.org/abs/2507.01420
Multi-task Offline Reinforcement Learning for Online Advertising in Recommender Systems
Langming Liu, Wanyu Wang, Chi Zhang, Bo Li, Hongzhi Yin, Xuetao Wei, Wenbo Su, Bo Zheng, Xiangyu Zhao
https://arxiv.org/abs/2506.23090
ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning
Ruiyang Zhou, Shuozhe Li, Amy Zhang, Liu Leqi
https://arxiv.org/abs/2507.02834
Self-correcting Reward Shaping via Language Models for Reinforcement Learning Agents in Games
António Afonso, Iolanda Leite, Alessandro Sestini, Florian Fuchs, Konrad Tollmar, Linus Gisslén
https://arxiv.org/abs/2506.23626
Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning
Fangyu Lei, Jinxiang Meng, Yiming Huang, Tinghong Chen, Yun Zhang, Shizhu He, Jun Zhao, Kang Liu
https://arxiv.org/abs/2506.01710
This https://arxiv.org/abs/2503.18616 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
QForce-RL: Quantized FPGA-Optimized Reinforcement Learning Compute Engine
Anushka Jha, Tanushree Dewangan, Mukul Lokhande, Santosh Kumar Vishvakarma
https://arxiv.org/abs/2506.07046
This https://arxiv.org/abs/2506.00691 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…
Learned Controllers for Agile Quadrotors in Pursuit-Evasion Games
Alejandro Sanchez Roncero, Olov Andersson, Petter Ogren
https://arxiv.org/abs/2506.02849 …
An Error Bound for Aggregation in Approximate Dynamic Programming
Yuchao Li, Dimitri Bertsekas
https://arxiv.org/abs/2507.01324 https://
This https://arxiv.org/abs/2505.21119 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…
Data-assimilated model-informed reinforcement learning
Defne E. Ozan, Andrea Nóvoa, Georgios Rigas, Luca Magri
https://arxiv.org/abs/2506.01755 https…
Homogenization of Multi-agent Learning Dynamics in Finite-state Markov Games
Yann Kerzreho (ENS Paris Saclay)
https://arxiv.org/abs/2506.21079 https://
AURA: Agentic Upskilling via Reinforced Abstractions
Alvin Zhu, Yusuke Tanaka, Dennis Hong
https://arxiv.org/abs/2506.02507 https://a…
Reinforcement Learning for Optimal Control of Spin Magnetometers
Logan W. Cooke, Stefanie Czischek
https://arxiv.org/abs/2506.21475 https://
This https://arxiv.org/abs/2505.16401 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…
Complex Model Transformations by Reinforcement Learning with Uncertain Human Guidance
Kyanna Dagenais, Istvan David
https://arxiv.org/abs/2506.20883 https:…
Reward Balancing Revisited: Enhancing Offline Reinforcement Learning for Recommender Systems
Wenzheng Shu, Yanxiang Zeng, Yongxiang Tang, Teng Sha, Ning Luo, Yanhua Cheng, Xialong Liu, Fan Zhou, Peng Jiang
https://arxiv.org/abs/2506.22112
Can you see how I learn? Human observers' inferences about Reinforcement Learning agents' learning processes
Bernhard Hilpert, Muhan Hou, Kim Baraka, Joost Broekens
https://arxiv.org/abs/2506.13583
This https://arxiv.org/abs/2501.07985 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
A Forget-and-Grow Strategy for Deep Reinforcement Learning Scaling in Continuous Control
Zilin Kang, Chenyuan Hu, Yu Luo, Zhecheng Yuan, Ruijie Zheng, Huazhe Xu
https://arxiv.org/abs/2507.02712
This https://arxiv.org/abs/2409.10289 has been replaced.
link: https://scholar.google.com/scholar?q=a
This https://arxiv.org/abs/2411.14622 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
Learning optimal treatment strategies for intraoperative hypotension using deep reinforcement learning
Esra Adiyeke, Tianqi Liu, Venkata Sai Dheeraj Naganaboina, Han Li, Tyler J. Loftus, Yuanfang Ren, Benjamin Shickel, Matthew M. Ruppert, Karandeep Singh, Ruogu Fang, Parisa Rashidi, Azra Bihorac, Tezcan Ozrazgat-Baslanti
https://
Testifying before Congress, Kari Lake said reform at USAGM "was not possible" but the CEOs of RFE/RL, RFA and MBN said she had not met with them even once (Scott Nover/Washington Post)
https://www.washingtonpost.com/style/media/2025/06…
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling
Zengzhi Wang, Fan Zhou, Xuefeng Li, Pengfei Liu
https://arxiv.org/abs/2506.20512 http…
Reinforcement Learning with Data Bootstrapping for Dynamic Subgoal Pursuit in Humanoid Robot Navigation
Chengyang Peng, Zhihao Zhang, Shiting Gong, Sankalp Agrawal, Keith A. Redmill, Ayonga Hereid
https://arxiv.org/abs/2506.02206
This https://arxiv.org/abs/2504.14870 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…
"Being on Mastodon":
in RL:
Nerd level ▪️ ◾ ◼️ 🔟
in the Fediverse:
Nerd level 0️⃣ ▪️ ◾ ◼️
RL-Driven Semantic Compression Model Selection and Resource Allocation in Semantic Communication Systems
Xinyi Lin, Peizheng Li, Adnan Aijaz
https://arxiv.org/abs/2506.18660
This https://arxiv.org/abs/2503.02189 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csMA_…
This https://arxiv.org/abs/2412.05718 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…
Multi-Timescale Hierarchical Reinforcement Learning for Unified Behavior and Control of Autonomous Driving
Guizhe Jin, Zhuoren Li, Bo Leng, Ran Yu, Lu Xiong
https://arxiv.org/abs/2506.23771
RL-Guided MPC for Autonomous Greenhouse Control
Salim Msaad, Murray Harraway, Robert D. McAllister
https://arxiv.org/abs/2506.13278 https://
Mechanical Intelligence-Aware Curriculum Reinforcement Learning for Humanoids with Parallel Actuation
Yusuke Tanaka, Alvin Zhu, Quanyou Wang, Dennis Hong
https://arxiv.org/abs/2507.00273
Stream or buy R.E.M.'s new EP of "Radio Free Europe" with proceeds to support Radio Free Europe / Radio Liberty, which is having its funding cut by Trump. Merch also available with proceeds to RFE/RL.
#PublicRadio #Music
Learning Interpretable Rules from Neural Networks: Neurosymbolic AI for Radar Hand Gesture Recognition
Sarah Seifi, Tobias Sukianto, Cecilia Carbonelli, Lorenzo Servadei, Robert Wille
https://arxiv.org/abs/2506.22443
Learning to Explore: An In-Context Learning Approach for Pure Exploration
Alessio Russo, Ryan Welch, Aldo Pacchiano
https://arxiv.org/abs/2506.01876 https:…
Disturbance-Aware Adaptive Compensation in Hybrid Force-Position Locomotion Policy for Legged Robots
Yang Zhang, Buqing Nie, Zhanxiang Cao, Yangqing Fu, Yue Gao
https://arxiv.org/abs/2506.00472
Regret-Optimal Q-Learning with Low Cost for Single-Agent and Federated Reinforcement Learning
Haochen Zhang, Zhong Zheng, Lingzhou Xue
https://arxiv.org/abs/2506.04626
DriveMind: A Dual-VLM based Reinforcement Learning Framework for Autonomous Driving
Dawood Wasif, Terrence J Moore, Chandan K Reddy, Jin-Hee Cho
https://arxiv.org/abs/2506.00819
Reasoning with Exploration: An Entropy Perspective
Daixuan Cheng, Shaohan Huang, Xuekai Zhu, Bo Dai, Wayne Xin Zhao, Zhenliang Zhang, Furu Wei
https://arxiv.org/abs/2506.14758
Graphs Meet AI Agents: Taxonomy, Progress, and Future Opportunities
Yuanchen Bei, Weizhi Zhang, Siwen Wang, Weizhi Chen, Sheng Zhou, Hao Chen, Yong Li, Jiajun Bu, Shirui Pan, Yizhou Yu, Irwin King, Fakhri Karray, Philip S. Yu
https://arxiv.org/abs/2506.18019
Hybrid Learning for Cold-Start-Aware Microservice Scheduling in Dynamic Edge Environments
Jingxi Lu, Wenhao Li, Jianxiong Guo, Xingjian Ding, Zhiqing Tang, Tian Wang, Weijia Jia
https://arxiv.org/abs/2505.22424
This https://arxiv.org/abs/2505.20751 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
Make Your AUV Adaptive: An Environment-Aware Reinforcement Learning Framework For Underwater Tasks
Yimian Ding, Jingzehua Xu, Guanwen Xie, Shuai Zhang, Yi Li
https://arxiv.org/abs/2506.15082
ReVeal: Self-Evolving Code Agents via Iterative Generation-Verification
Yiyang Jin, Kunzhao Xu, Hang Li, Xueting Han, Yanmin Zhou, Cheng Li, Jing Bai
https://arxiv.org/abs/2506.11442
This https://arxiv.org/abs/2506.04168 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…
VerIF: Verification Engineering for Reinforcement Learning in Instruction Following
Hao Peng, Yunjia Qi, Xiaozhi Wang, Bin Xu, Lei Hou, Juanzi Li
https://arxiv.org/abs/2506.09942 …
Diffusion-RL for Scalable Resource Allocation for 6G Networks
Salar Nouri, Mojdeh Karbalaee Motalleb, Vahid Shah-Mansouri
https://arxiv.org/abs/2506.07880 …
Skill-Nav: Enhanced Navigation with Versatile Quadrupedal Locomotion via Waypoint Interface
Dewei Wang, Chenjia Bai, Chenhui Li, Jiyuan Shi, Yan Ding, Chi Zhang, Bin Zhao
https://arxiv.org/abs/2506.21853
Partially Observable Residual Reinforcement Learning for PV-Inverter-Based Voltage Control in Distribution Grids
Sarra Bouchkati, Ramil Sabirov, Steffen Kortmann, Andreas Ulbig
https://arxiv.org/abs/2506.19353
On a few pitfalls in KL divergence gradient estimation for RL
Yunhao Tang, Rémi Munos
https://arxiv.org/abs/2506.09477 https://…
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs
Ring Team, Bin Hu, Cai Chen, Deng Zhao, Ding Liu, Dingnan Jin, Feng Zhu, Hao Dai, Hongzhi Luan, Jia Guo, Jiaming Liu, Jiewei Wu, Jun Mei, Jun Zhou, Junbo Zhao, Junwu Xiong, Kaihong Zhang, Kuan Xu, Lei Liang, Liang Jiang, Liangcheng Fu, Longfei Zheng, Qiang Gao, Qing Cui, Quan Wan, Shaomian Zheng, Shuaicheng Li, Tongkai Yang, Wang Ren, Xiaodong Yan, Xiaopei Wan, Xiaoyun Feng, Xin Zhao, Xinxing Yang, Xinyu …
This https://arxiv.org/abs/2506.04147 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
Eliciting Reasoning in Language Models with Cognitive Tools
Brown Ebouky, Andrea Bartezzaghi, Mattia Rigotti
https://arxiv.org/abs/2506.12115 https://
MOORL: A Framework for Integrating Offline-Online Reinforcement Learning
Gaurav Chaudhary, Wassim Uddin Mondal, Laxmidhar Behera
https://arxiv.org/abs/2506.09574
This https://arxiv.org/abs/2505.00546 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…
Hierarchical Reinforcement Learning and Value Optimization for Challenging Quadruped Locomotion
Jeremiah Coholich, Muhammad Ali Murtaza, Seth Hutchinson, Zsolt Kira
https://arxiv.org/abs/2506.20036
Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop
Justin Kerr, Kush Hari, Ethan Weber, Chung Min Kim, Brent Yi, Tyler Bonnen, Ken Goldberg, Angjoo Kanazawa
https://arxiv.org/abs/2506.10968
Robots and Children that Learn Together: Improving Knowledge Retention by Teaching Peer-Like Interactive Robots
Imene Tarakli, Samuele Vinanzi, Richard Moore, Alessandro Di Nuovo
https://arxiv.org/abs/2506.18365
MoRE: Mixture of Residual Experts for Humanoid Lifelike Gaits Learning on Complex Terrains
Dewei Wang, Xinmiao Wang, Xinzhe Liu, Jiyuan Shi, Yingnan Zhao, Chenjia Bai, Xuelong Li
https://arxiv.org/abs/2506.08840
Learning Dexterous Object Handover
Daniel Frau-Alfaro, Julio Castaño-Amoros, Santiago Puente, Pablo Gil, Roberto Calandra
https://arxiv.org/abs/2506.16822
This https://arxiv.org/abs/2503.04280 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
Policy-Based Trajectory Clustering in Offline Reinforcement Learning
Hao Hu, Xinqi Wang, Simon Shaolei Du
https://arxiv.org/abs/2506.09202 https://
This https://arxiv.org/abs/2409.17469 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
Quadrotor Morpho-Transition: Learning vs Model-Based Control Strategies
Ioannis Mandralis, Richard M. Murray, Morteza Gharib
https://arxiv.org/abs/2506.14039