A Survey of Reinforcement Learning for Software Engineering
Dong Wang, Hanmo You, Lingwei Zhu, Kaiwei Lin, Zheng Chen, Chen Yang, Junji Yu, Zan Wang, Junjie Chen
https://arxiv.org/abs/2507.12483
Discovering Temporal Structure: An Overview of Hierarchical Reinforcement Learning
Martin Klissarov, Akhil Bagaria, Ziyan Luo, George Konidaris, Doina Precup, Marlos C. Machado
https://arxiv.org/abs/2506.14045
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs
Ring Team, Bin Hu, Cai Chen, Deng Zhao, Ding Liu, Dingnan Jin, Feng Zhu, Hao Dai, Hongzhi Luan, Jia Guo, Jiaming Liu, Jiewei Wu, Jun Mei, Jun Zhou, Junbo Zhao, Junwu Xiong, Kaihong Zhang, Kuan Xu, Lei Liang, Liang Jiang, Liangcheng Fu, Longfei Zheng, Qiang Gao, Qing Cui, Quan Wan, Shaomian Zheng, Shuaicheng Li, Tongkai Yang, Wang Ren, Xiaodong Yan, Xiaopei Wan, Xiaoyun Feng, Xin Zhao, Xinxing Yang, Xinyu …
Evaluating Reinforcement Learning Algorithms for Navigation in Simulated Robotic Quadrupeds: A Comparative Study Inspired by Guide Dog Behaviour
Emma M. A. Harrison
https://arxiv.org/abs/2507.13277
Quantum-Enhanced Reinforcement Learning with LSTM Forecasting Signals for Optimizing Fintech Trading Decisions
Yen-Ku Liu, Yun-Huei Pan, Pei-Fan Lu, Yun-Cheng Tsai, Samuel Yen-Chi Chen
https://arxiv.org/abs/2507.12835
Autonomous Resource Management in Microservice Systems via Reinforcement Learning
Yujun Zou, Nia Qi, Yingnan Deng, Zhihao Xue, Ming Gong, Wuyang Zhang
https://arxiv.org/abs/2507.12879
Recognition through Reasoning: Reinforcing Image Geo-localization with Large Vision-Language Models
Ling Li, Yao Zhou, Yuxuan Liang, Fugee Tsung, Jiaheng Wei
https://arxiv.org/abs/2506.14674
Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback
Suzie Kim, Hye-Bin Shin, Seong-Whan Lee
https://arxiv.org/abs/2507.13171
PoseGRAF: Geometric-Reinforced Adaptive Fusion for Monocular 3D Human Pose Estimation
Ming Xu, Xu Zhang
https://arxiv.org/abs/2506.14596
SENIOR: Efficient Query Selection and Preference-Guided Exploration in Preference-based Reinforcement Learning
Hexian Ni, Tao Lu, Haoyuan Hu, Yinghao Cai, Shuo Wang
https://arxiv.org/abs/2506.14648