Non-conflicting Energy Minimization in Reinforcement Learning based Robot Control
Skand Peri, Akhil Perincherry, Bikram Pandit, Stefan Lee
https://arxiv.org/abs/2509.01765
ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning
Ruiyang Zhou, Shuozhe Li, Amy Zhang, Liu Leqi
https://arxiv.org/abs/2507.02834
Self-Reinforcing Prototype Evolution with Dual-Knowledge Cooperation for Semi-Supervised Lifelong Person Re-Identification
Kunlun Xu, Fan Zhuo, Jiangmeng Li, Xu Zou, Jiahuan Zhou
https://arxiv.org/abs/2507.01884
Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models
Chengao Li, Hanyu Zhang, Yunkun Xu, Hongyan Xue, Xiang Ao, Qing He
https://arxiv.org/abs/2507.01915
Energy Efficient Trajectory Control and Resource Allocation in Multi-UAV-assisted MEC via Deep Reinforcement Learning
Saichao Liu, Geng Sun, Chuang Zhang, Xuejie Liu, Jiacheng Wang, Changyuan Zhao, Dusit Niyato
https://arxiv.org/abs/2508.00261
SAKURAONE: Empowering Transparent and Open AI Platforms through Private-Sector HPC Investment in Japan
Fumikazu Konishi
https://arxiv.org/abs/2507.02124
VRAgent-R1: Boosting Video Recommendation with MLLM-based Agents via Reinforcement Learning
Siran Chen, Boyu Chen, Chenyun Yu, Yuxiao Luo, Ouyang Yi, Lei Cheng, Chengxiang Zhuo, Zang Li, Yali Wang
https://arxiv.org/abs/2507.02626
Beyond expected value: geometric mean optimization for long-term policy performance in reinforcement learning
Xinyi Sheng, Dominik Baumann
https://arxiv.org/abs/2508.21443
CTBC: Contact-Triggered Blind Climbing for Wheeled Bipedal Robots with Instruction Learning and Reinforcement Learning
Rankun Li, Hao Wang, Qi Li, Zhuo Han, Yifei Chu, Linqi Ye, Wende Xie, Wenlong Liao
https://arxiv.org/abs/2509.02986
Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[5/5]:
- Search-Based Credit Assignment for Offline Preference-Based Reinforcement Learning
Xiancheng Gao, Yufeng Shi, Wengang Zhou, Houqiang Li