Tootfinder

Opt-in global Mastodon full text search.

No exact results. Similar results found.
@arXiv_csRO_bot@mastoxiv.page
2025-09-03 13:20:13

Non-conflicting Energy Minimization in Reinforcement Learning based Robot Control
Skand Peri, Akhil Perincherry, Bikram Pandit, Stefan Lee
arxiv.org/abs/2509.01765

@arXiv_csLG_bot@mastoxiv.page
2025-07-04 10:22:21

ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning
Ruiyang Zhou, Shuozhe Li, Amy Zhang, Liu Leqi
arxiv.org/abs/2507.02834

@arXiv_csCV_bot@mastoxiv.page
2025-07-03 10:28:30

Self-Reinforcing Prototype Evolution with Dual-Knowledge Cooperation for Semi-Supervised Lifelong Person Re-Identification
Kunlun Xu, Fan Zhuo, Jiangmeng Li, Xu Zou, Jiahuan Zhou
arxiv.org/abs/2507.01884

@arXiv_csCL_bot@mastoxiv.page
2025-07-03 10:13:10

Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models
Chengao Li, Hanyu Zhang, Yunkun Xu, Hongyan Xue, Xiang Ao, Qing He
arxiv.org/abs/2507.01915

@arXiv_csNI_bot@mastoxiv.page
2025-08-04 09:37:30

Energy Efficient Trajectory Control and Resource Allocation in Multi-UAV-assisted MEC via Deep Reinforcement Learning
Saichao Liu, Geng Sun, Chuang Zhang, Xuejie Liu, Jiacheng Wang, Changyuan Zhao, Dusit Niyato
arxiv.org/abs/2508.00261

@arXiv_csDC_bot@mastoxiv.page
2025-07-04 07:55:01

SAKURAONE: Empowering Transparent and Open AI Platforms through Private-Sector HPC Investment in Japan
Fumikazu Konishi
arxiv.org/abs/2507.02124

@arXiv_csMM_bot@mastoxiv.page
2025-07-04 08:44:01

VRAgent-R1: Boosting Video Recommendation with MLLM-based Agents via Reinforcement Learning
Siran Chen, Boyu Chen, Chenyun Yu, Yuxiao Luo, Ouyang Yi, Lei Cheng, Chengxiang Zhuo, Zang Li, Yali Wang
arxiv.org/abs/2507.02626

@arXiv_csLG_bot@mastoxiv.page
2025-09-01 09:47:22

Beyond expected value: geometric mean optimization for long-term policy performance in reinforcement learning
Xinyi Sheng, Dominik Baumann
arxiv.org/abs/2508.21443

@arXiv_csRO_bot@mastoxiv.page
2025-09-04 09:46:21

CTBC: Contact-Triggered Blind Climbing for Wheeled Bipedal Robots with Instruction Learning and Reinforcement Learning
Rankun Li, Hao Wang, Qi Li, Zhuo Han, Yifei Chu, Linqi Ye, Wende Xie, Wenlong Liao
arxiv.org/abs/2509.02986

@arXiv_csLG_bot@mastoxiv.page
2025-09-04 13:37:16

Replaced article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[5/5]:
- Search-Based Credit Assignment for Offline Preference-Based Reinforcement Learning
Xiancheng Gao, Yufeng Shi, Wengang Zhou, Houqiang Li