Tootfinder

Opt-in global Mastodon full-text search.

@arXiv_csRO_bot@mastoxiv.page
2025-06-05 07:23:33

SLAC: Simulation-Pretrained Latent Action Space for Whole-Body Real-World RL
Jiaheng Hu, Peter Stone, Roberto Martín-Martín
arxiv.org/abs/2506.04147

@arXiv_csLG_bot@mastoxiv.page
2025-06-05 10:59:18

This arxiv.org/abs/2505.24298 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_csSE_bot@mastoxiv.page
2025-06-03 07:29:49

CRScore++: Reinforcement Learning with Verifiable Tool and AI Feedback for Code Review
Manav Nitin Kapadnis, Atharva Naik, Carolyn Rose
arxiv.org/abs/2506.00296

@arXiv_csAI_bot@mastoxiv.page
2025-06-05 09:38:19

This arxiv.org/abs/2505.19641 has been replaced.
initial toot: mastoxiv.page/@arXiv_csAI_…

@Mediagazer@mstdn.social
2025-05-31 01:16:05

A US federal judge orders the USAGM to immediately disburse RFE/RL's May funding of ~$12M, following a similar order last month for its April funding (Radio Free Europe/Radio Liberty)
rferl.org/a/rfe-rl-order-lambe

@arXiv_csLG_bot@mastoxiv.page
2025-06-05 10:58:34

This arxiv.org/abs/2505.23527 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_mathOC_bot@mastoxiv.page
2025-06-04 07:43:39

Learning-based primal-dual optimal control of discrete-time stochastic systems with multiplicative noise
Xiushan Jiang, Weihai Zhang
arxiv.org/abs/2506.02613

@arXiv_eessSY_bot@mastoxiv.page
2025-06-04 07:41:37

Ensemble-MIX: Enhancing Sample Efficiency in Multi-Agent RL Using Ensemble Methods
Tom Danino, Nahum Shimkin
arxiv.org/abs/2506.02841

@arXiv_csIR_bot@mastoxiv.page
2025-06-05 09:39:39

This arxiv.org/abs/2404.17589 has been replaced.
initial toot: mastoxiv.page/@arXiv_csIR_…

@mino@blorbo.social
2025-07-04 13:37:58

I am still alive. so many things happened both rl-wise and fandom-wise. most importantly my health deteriorated really badly lol

@arXiv_csCL_bot@mastoxiv.page
2025-07-04 09:42:51

Multimodal Mathematical Reasoning with Diverse Solving Perspective
Wenhao Shi, Zhiqiang Hu, Yi Bin, Yang Yang, See-Kiong Ng, Heng Tao Shen
arxiv.org/abs/2507.02804

@arXiv_csAI_bot@mastoxiv.page
2025-06-05 09:40:08

This arxiv.org/abs/2505.23703 has been replaced.
initial toot: mastoxiv.page/@arXiv_csAI_…

@arXiv_csRO_bot@mastoxiv.page
2025-06-04 13:36:56

This arxiv.org/abs/2308.13140 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_csNE_bot@mastoxiv.page
2025-06-02 07:20:02

Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control
Zijie Xu, Tong Bu, Zecheng Hao, Jianhao Ding, Zhaofei Yu
arxiv.org/abs/2505.24161

@arXiv_condmatdisnn_bot@mastoxiv.page
2025-06-04 07:37:55

Reentrant localization in a quasiperiodic chain with correlated hopping sequences
Sourav Karmakar, Sudin Ganguly, Santanu K. Maiti
arxiv.org/abs/2506.02716

@arXiv_csLG_bot@mastoxiv.page
2025-06-05 10:58:49

This arxiv.org/abs/2505.23585 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_statME_bot@mastoxiv.page
2025-06-04 07:50:52

Joint Modeling for Learning Decision-Making Dynamics in Behavioral Experiments
Yuan Bian, Xingche Guo, Yuanjia Wang
arxiv.org/abs/2506.02394

@arXiv_csNI_bot@mastoxiv.page
2025-06-03 07:22:55

A Reinforcement Learning-Based Telematic Routing Protocol for the Internet of Underwater Things
Mohammadhossein Homaei, Mehran Tarif, Agustin Di Bartolo, Oscar Mogollon Gutierrez, Mar Avila
arxiv.org/abs/2506.00133

@arXiv_csCL_bot@mastoxiv.page
2025-07-04 09:52:11

MOTIF: Modular Thinking via Reinforcement Fine-tuning in LLMs
Purbesh Mitra, Sennur Ulukus
arxiv.org/abs/2507.02851 a…

@arXiv_csPL_bot@mastoxiv.page
2025-06-03 07:24:17

Pearl: Automatic Code Optimization Using Deep Reinforcement Learning
Djamel Rassem Lamouri, Iheb Nassim Aouadj, Smail Kourta, Riyadh Baghdadi
arxiv.org/abs/2506.01880

@arXiv_qfinPM_bot@mastoxiv.page
2025-07-04 08:50:11

Accelerated Portfolio Optimization and Option Pricing with Reinforcement Learning
Hadi Keramati, Samaneh Jazayeri
arxiv.org/abs/2507.01972

@arXiv_csLG_bot@mastoxiv.page
2025-06-03 22:01:41

This arxiv.org/abs/2505.23527 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:18:52

BASIL: Best-Action Symbolic Interpretable Learning for Evolving Compact RL Policies
Kourosh Shahnazari, Seyed Moein Ayyoubzadeh, Mohammadali Keshtparvar
arxiv.org/abs/2506.00328

@arXiv_eessSY_bot@mastoxiv.page
2025-06-04 13:44:52

This arxiv.org/abs/2506.01755 has been replaced.
initial toot: mastoxiv.page/@arXiv_ees…

@arXiv_csLG_bot@mastoxiv.page
2025-06-03 08:21:48

Agnostic Reinforcement Learning: Foundations and Algorithms
Gene Li
arxiv.org/abs/2506.01884 arxiv.org/pdf/2506.01884…

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 17:51:44

This arxiv.org/abs/2504.18253 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_mathAP_bot@mastoxiv.page
2025-07-03 10:06:10

A note on the uniqueness properties of solutions for the Schrödinger-Korteweg de Vries system
Eddye Bustamante, José Jiménez Urrea, Jorge Mejía
arxiv.org/abs/2507.01733

@arXiv_csLG_bot@mastoxiv.page
2025-06-03 22:02:07

This arxiv.org/abs/2505.24034 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 17:57:30

This arxiv.org/abs/2503.07792 has been replaced.
initial toot: mastoxiv.page/@arXiv_csAI_…

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 18:00:56

This arxiv.org/abs/2505.22642 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_csNI_bot@mastoxiv.page
2025-07-01 08:30:23

RL-based Adaptive Task Offloading in Mobile-Edge Computing for Future IoT Networks
Ziad Qais Al Abbasi, Khaled M. Rabie, Xingwang Li, Wali Ullah Khan, Asma Abu Samah
arxiv.org/abs/2506.22474

@arXiv_csRO_bot@mastoxiv.page
2025-06-04 14:07:36

This arxiv.org/abs/2505.18780 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_mathOC_bot@mastoxiv.page
2025-07-03 09:02:00

Reinforcement Learning for Discrete-time LQG Mean Field Social Control Problems with Unknown Dynamics
Hanfang Zhang, Bing-Chang Wang, Shuo Chen
arxiv.org/abs/2507.01420

@arXiv_csIR_bot@mastoxiv.page
2025-07-01 08:11:53

Multi-task Offline Reinforcement Learning for Online Advertising in Recommender Systems
Langming Liu, Wanyu Wang, Chi Zhang, Bo Li, Hongzhi Yin, Xuetao Wei, Wenbo Su, Bo Zheng, Xiangyu Zhao
arxiv.org/abs/2506.23090

@arXiv_csLG_bot@mastoxiv.page
2025-07-04 10:22:21

ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning
Ruiyang Zhou, Shuozhe Li, Amy Zhang, Liu Leqi
arxiv.org/abs/2507.02834

@arXiv_csAI_bot@mastoxiv.page
2025-07-01 11:22:43

Self-correcting Reward Shaping via Language Models for Reinforcement Learning Agents in Games
António Afonso, Iolanda Leite, Alessandro Sestini, Florian Fuchs, Konrad Tollmar, Linus Gisslén
arxiv.org/abs/2506.23626

@arXiv_csCL_bot@mastoxiv.page
2025-06-03 08:19:46

Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning
Fangyu Lei, Jinxiang Meng, Yiming Huang, Tinghong Chen, Yun Zhang, Shizhu He, Jun Zhao, Kang Liu
arxiv.org/abs/2506.01710

@arXiv_csRO_bot@mastoxiv.page
2025-06-04 14:04:26

This arxiv.org/abs/2503.18616 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_csAR_bot@mastoxiv.page
2025-06-10 07:17:22

QForce-RL: Quantized FPGA-Optimized Reinforcement Learning Compute Engine
Anushka Jha, Tanushree Dewangan, Mukul Lokhande, Santosh Kumar Vishvakarma
arxiv.org/abs/2506.07046

@arXiv_csLG_bot@mastoxiv.page
2025-06-05 11:00:37

This arxiv.org/abs/2506.00691 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_csRO_bot@mastoxiv.page
2025-06-04 07:51:37

Learned Controllers for Agile Quadrotors in Pursuit-Evasion Games
Alejandro Sanchez Roncero, Olov Andersson, Petter Ogren
arxiv.org/abs/2506.02849

@arXiv_mathOC_bot@mastoxiv.page
2025-07-03 08:33:50

An Error Bound for Aggregation in Approximate Dynamic Programming
Yuchao Li, Dimitri Bertsekas
arxiv.org/abs/2507.01324

@arXiv_csLG_bot@mastoxiv.page
2025-06-03 21:57:41

This arxiv.org/abs/2505.21119 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_eessSY_bot@mastoxiv.page
2025-06-03 07:56:29

Data-assimilated model-informed reinforcement learning
Defne E. Ozan, Andrea Nóvoa, Georgios Rigas, Luca Magri
arxiv.org/abs/2506.01755

@arXiv_statML_bot@mastoxiv.page
2025-06-27 09:13:19

Homogenization of Multi-agent Learning Dynamics in Finite-state Markov Games
Yann Kerzreho (ENS Paris Saclay)
arxiv.org/abs/2506.21079

@arXiv_csRO_bot@mastoxiv.page
2025-06-04 07:40:36

AURA: Agentic Upskilling via Reinforced Abstractions
Alvin Zhu, Yusuke Tanaka, Dennis Hong
arxiv.org/abs/2506.02507 a…

@arXiv_quantph_bot@mastoxiv.page
2025-06-27 10:10:39

Reinforcement Learning for Optimal Control of Spin Magnetometers
Logan W. Cooke, Stefanie Czischek
arxiv.org/abs/2506.21475

@arXiv_csLG_bot@mastoxiv.page
2025-06-03 21:45:05

This arxiv.org/abs/2505.16401 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_csSE_bot@mastoxiv.page
2025-06-27 08:07:49

Complex Model Transformations by Reinforcement Learning with Uncertain Human Guidance
Kyanna Dagenais, Istvan David
arxiv.org/abs/2506.20883

@arXiv_csIR_bot@mastoxiv.page
2025-06-30 09:51:20

Reward Balancing Revisited: Enhancing Offline Reinforcement Learning for Recommender Systems
Wenzheng Shu, Yanxiang Zeng, Yongxiang Tang, Teng Sha, Ning Luo, Yanhua Cheng, Xialong Liu, Fan Zhou, Peng Jiang
arxiv.org/abs/2506.22112

@arXiv_csHC_bot@mastoxiv.page
2025-06-17 10:52:09

Can you see how I learn? Human observers' inferences about Reinforcement Learning agents' learning processes
Bernhard Hilpert, Muhan Hou, Kim Baraka, Joost Broekens
arxiv.org/abs/2506.13583

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 17:33:32

This arxiv.org/abs/2501.07985 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_csLG_bot@mastoxiv.page
2025-07-04 10:17:11

A Forget-and-Grow Strategy for Deep Reinforcement Learning Scaling in Continuous Control
Zilin Kang, Chenyuan Hu, Yu Luo, Zhecheng Yuan, Ruijie Zheng, Huazhe Xu
arxiv.org/abs/2507.02712

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 16:58:05

This arxiv.org/abs/2409.10289 has been replaced.
link: scholar.google.com/scholar?q=a

@arXiv_csRO_bot@mastoxiv.page
2025-06-04 13:55:17

This arxiv.org/abs/2411.14622 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_qbioQM_bot@mastoxiv.page
2025-05-29 07:37:22

Learning optimal treatment strategies for intraoperative hypotension using deep reinforcement learning
Esra Adiyeke, Tianqi Liu, Venkata Sai Dheeraj Naganaboina, Han Li, Tyler J. Loftus, Yuanfang Ren, Benjamin Shickel, Matthew M. Ruppert, Karandeep Singh, Ruogu Fang, Parisa Rashidi, Azra Bihorac, Tezcan Ozrazgat-Baslanti

@Mediagazer@mstdn.social
2025-06-25 20:15:52

Testifying before Congress, Kari Lake said reform at USAGM "was not possible" but the CEOs of RFE/RL, RFA and MBN said she had not met with them even once (Scott Nover/Washington Post)
washingtonpost.com/style/media

@arXiv_csCL_bot@mastoxiv.page
2025-06-26 09:40:40

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling
Zengzhi Wang, Fan Zhou, Xuefeng Li, Pengfei Liu
arxiv.org/abs/2506.20512

@arXiv_csRO_bot@mastoxiv.page
2025-06-04 07:31:18

Reinforcement Learning with Data Bootstrapping for Dynamic Subgoal Pursuit in Humanoid Robot Navigation
Chengyang Peng, Zhihao Zhang, Shiting Gong, Sankalp Agrawal, Keith A. Redmill, Ayonga Hereid
arxiv.org/abs/2506.02206

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 18:02:32

This arxiv.org/abs/2504.14870 has been replaced.
initial toot: mastoxiv.page/@arXiv_csAI_…

@funkvolk@mastodon.social
2025-06-12 15:43:24

"being on Mastodon":
in RL:
Nerd level ▪️ ◾ ◼️ 🔟
in the Fediverse:
Nerd level 0️⃣ ▪️ ◾ ◼️

@arXiv_csNI_bot@mastoxiv.page
2025-06-24 10:29:30

RL-Driven Semantic Compression Model Selection and Resource Allocation in Semantic Communication Systems
Xinyi Lin, Peizheng Li, Adnan Aijaz
arxiv.org/abs/2506.18660

@arXiv_csMA_bot@mastoxiv.page
2025-06-10 16:36:19

This arxiv.org/abs/2503.02189 has been replaced.
initial toot: mastoxiv.page/@arXiv_csMA_…

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 17:35:45

This arxiv.org/abs/2412.05718 has been replaced.
initial toot: mastoxiv.page/@arXiv_csAI_…

@arXiv_csRO_bot@mastoxiv.page
2025-07-01 11:46:33

Multi-Timescale Hierarchical Reinforcement Learning for Unified Behavior and Control of Autonomous Driving
Guizhe Jin, Zhuoren Li, Bo Leng, Ran Yu, Lu Xiong
arxiv.org/abs/2506.23771

@arXiv_eessSY_bot@mastoxiv.page
2025-06-17 10:56:09

RL-Guided MPC for Autonomous Greenhouse Control
Salim Msaad, Murray Harraway, Robert D. McAllister
arxiv.org/abs/2506.13278

@davej@dice.camp
2025-06-07 16:07:33

I think I need to figure out how to make these.
I don’t have great #luck with VTTs, but my rolls with RL math rocks are TERRIBLE. I had one Irish GM offer to mail #dice to me in Australia, and a Mexican GM advised—straight-faced, no less—I seek out a santero.
I feel my

@arXiv_csRO_bot@mastoxiv.page
2025-07-02 08:42:10

Mechanical Intelligence-Aware Curriculum Reinforcement Learning for Humanoids with Parallel Actuation
Yusuke Tanaka, Alvin Zhu, Quanyou Wang, Dennis Hong
arxiv.org/abs/2507.00273

@DrPlanktonguy@ecoevo.social
2025-05-06 19:25:58

Stream or buy R.E.M.'s new EP of "Radio Free Europe" with proceeds to support Radio Free Europe / Radio Liberty, which is having its funding cut by Trump. Merch also available with proceeds to RFE/RL.
#PublicRadio #Music

@arXiv_csLG_bot@mastoxiv.page
2025-07-01 08:19:33

Learning Interpretable Rules from Neural Networks: Neurosymbolic AI for Radar Hand Gesture Recognition
Sarah Seifi, Tobias Sukianto, Cecilia Carbonelli, Lorenzo Servadei, Robert Wille
arxiv.org/abs/2506.22443

@arXiv_csLG_bot@mastoxiv.page
2025-06-03 08:21:44

Learning to Explore: An In-Context Learning Approach for Pure Exploration
Alessio Russo, Ryan Welch, Aldo Pacchiano
arxiv.org/abs/2506.01876

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 07:52:50

Disturbance-Aware Adaptive Compensation in Hybrid Force-Position Locomotion Policy for Legged Robots
Yang Zhang, Buqing Nie, Zhanxiang Cao, Yangqing Fu, Yue Gao
arxiv.org/abs/2506.00472

@arXiv_statML_bot@mastoxiv.page
2025-06-06 07:39:46

Regret-Optimal Q-Learning with Low Cost for Single-Agent and Federated Reinforcement Learning
Haochen Zhang, Zhong Zheng, Lingzhou Xue
arxiv.org/abs/2506.04626

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 08:00:08

DriveMind: A Dual-VLM based Reinforcement Learning Framework for Autonomous Driving
Dawood Wasif, Terrence J Moore, Chandan K Reddy, Jin-Hee Cho
arxiv.org/abs/2506.00819

@arXiv_csCL_bot@mastoxiv.page
2025-06-18 09:15:18

Reasoning with Exploration: An Entropy Perspective
Daixuan Cheng, Shaohan Huang, Xuekai Zhu, Bo Dai, Wayne Xin Zhao, Zhenliang Zhang, Furu Wei
arxiv.org/abs/2506.14758

@arXiv_csAI_bot@mastoxiv.page
2025-06-24 10:49:20

Graphs Meet AI Agents: Taxonomy, Progress, and Future Opportunities
Yuanchen Bei, Weizhi Zhang, Siwen Wang, Weizhi Chen, Sheng Zhou, Hao Chen, Yong Li, Jiajun Bu, Shirui Pan, Yizhou Yu, Irwin King, Fakhri Karray, Philip S. Yu
arxiv.org/abs/2506.18019

@arXiv_csNI_bot@mastoxiv.page
2025-05-29 07:21:45

Hybrid Learning for Cold-Start-Aware Microservice Scheduling in Dynamic Edge Environments
Jingxi Lu, Wenhao Li, Jianxiong Guo, Xingjian Ding, Zhiqing Tang, Tian Wang, Weijia Jia
arxiv.org/abs/2505.22424

@arXiv_csRO_bot@mastoxiv.page
2025-06-02 10:27:15

This arxiv.org/abs/2505.20751 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_eessSY_bot@mastoxiv.page
2025-06-19 08:44:37

Make Your AUV Adaptive: An Environment-Aware Reinforcement Learning Framework For Underwater Tasks
Yimian Ding, Jingzehua Xu, Guanwen Xie, Shuai Zhang, Yi Li
arxiv.org/abs/2506.15082

@arXiv_csSE_bot@mastoxiv.page
2025-06-16 10:18:09

ReVeal: Self-Evolving Code Agents via Iterative Generation-Verification
Yiyang Jin, Kunzhao Xu, Hang Li, Xueting Han, Yanmin Zhou, Cheng Li, Jing Bai
arxiv.org/abs/2506.11442

@arXiv_csLG_bot@mastoxiv.page
2025-06-10 19:22:39

This arxiv.org/abs/2506.04168 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_csCL_bot@mastoxiv.page
2025-06-12 09:05:02

VerIF: Verification Engineering for Reinforcement Learning in Instruction Following
Hao Peng, Yunjia Qi, Xiaozhi Wang, Bin Xu, Lei Hou, Juanzi Li
arxiv.org/abs/2506.09942

@arXiv_csNI_bot@mastoxiv.page
2025-06-10 08:00:22

Diffusion-RL for Scalable Resource Allocation for 6G Networks
Salar Nouri, Mojdeh Karbalaee Motalleb, Vahid Shah-Mansouri
arxiv.org/abs/2506.07880

@arXiv_csRO_bot@mastoxiv.page
2025-06-30 09:22:30

Skill-Nav: Enhanced Navigation with Versatile Quadrupedal Locomotion via Waypoint Interface
Dewei Wang, Chenjia Bai, Chenhui Li, Jiyuan Shi, Yan Ding, Chi Zhang, Bin Zhao
arxiv.org/abs/2506.21853

@arXiv_eessSY_bot@mastoxiv.page
2025-06-25 09:34:10

Partially Observable Residual Reinforcement Learning for PV-Inverter-Based Voltage Control in Distribution Grids
Sarra Bouchkati, Ramil Sabirov, Steffen Kortmann, Andreas Ulbig
arxiv.org/abs/2506.19353

@arXiv_csLG_bot@mastoxiv.page
2025-06-12 10:00:21

On a few pitfalls in KL divergence gradient estimation for RL
Yunhao Tang, Rémi Munos
arxiv.org/abs/2506.09477

@arXiv_csCL_bot@mastoxiv.page
2025-06-18 09:12:51

Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs
Ring Team, Bin Hu, Cai Chen, Deng Zhao, Ding Liu, Dingnan Jin, Feng Zhu, Hao Dai, Hongzhi Luan, Jia Guo, Jiaming Liu, Jiewei Wu, Jun Mei, Jun Zhou, Junbo Zhao, Junwu Xiong, Kaihong Zhang, Kuan Xu, Lei Liang, Liang Jiang, Liangcheng Fu, Longfei Zheng, Qiang Gao, Qing Cui, Quan Wan, Shaomian Zheng, Shuaicheng Li, Tongkai Yang, Wang Ren, Xiaodong Yan, Xiaopei Wan, Xiaoyun Feng, Xin Zhao, Xinxing Yang, Xinyu …

@arXiv_csRO_bot@mastoxiv.page
2025-06-10 17:29:49

This arxiv.org/abs/2506.04147 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_csCL_bot@mastoxiv.page
2025-06-17 09:24:23

Eliciting Reasoning in Language Models with Cognitive Tools
Brown Ebouky, Andrea Bartezzaghi, Mattia Rigotti
arxiv.org/abs/2506.12115

@arXiv_csLG_bot@mastoxiv.page
2025-06-12 10:03:21

MOORL: A Framework for Integrating Offline-Online Reinforcement Learning
Gaurav Chaudhary, Wassim Uddin Mondal, Laxmidhar Behera
arxiv.org/abs/2506.09574

@arXiv_csLG_bot@mastoxiv.page
2025-06-10 19:18:05

This arxiv.org/abs/2505.00546 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_csRO_bot@mastoxiv.page
2025-06-26 08:38:10

Hierarchical Reinforcement Learning and Value Optimization for Challenging Quadruped Locomotion
Jeremiah Coholich, Muhammad Ali Murtaza, Seth Hutchinson, Zsolt Kira
arxiv.org/abs/2506.20036

@arXiv_csRO_bot@mastoxiv.page
2025-06-13 09:11:30

Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop
Justin Kerr, Kush Hari, Ethan Weber, Chung Min Kim, Brent Yi, Tyler Bonnen, Ken Goldberg, Angjoo Kanazawa
arxiv.org/abs/2506.10968

@arXiv_csRO_bot@mastoxiv.page
2025-06-24 11:57:40

Robots and Children that Learn Together: Improving Knowledge Retention by Teaching Peer-Like Interactive Robots
Imene Tarakli, Samuele Vinanzi, Richard Moore, Alessandro Di Nuovo
arxiv.org/abs/2506.18365

@arXiv_csRO_bot@mastoxiv.page
2025-06-11 08:35:15

MoRE: Mixture of Residual Experts for Humanoid Lifelike Gaits Learning on Complex Terrains
Dewei Wang, Xinmiao Wang, Xinzhe Liu, Jiyuan Shi, Yingnan Zhao, Chenjia Bai, Xuelong Li
arxiv.org/abs/2506.08840

@arXiv_csRO_bot@mastoxiv.page
2025-06-23 11:54:50

Learning Dexterous Object Handover
Daniel Frau-Alfaro, Julio Castaño-Amoros, Santiago Puente, Pablo Gil, Roberto Calandra
arxiv.org/abs/2506.16822

@arXiv_csRO_bot@mastoxiv.page
2025-06-10 17:11:09

This arxiv.org/abs/2503.04280 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_csLG_bot@mastoxiv.page
2025-06-12 08:43:51

Policy-Based Trajectory Clustering in Offline Reinforcement Learning
Hao Hu, Xinqi Wang, Simon Shaolei Du
arxiv.org/abs/2506.09202

@arXiv_csRO_bot@mastoxiv.page
2025-06-06 09:42:54

This arxiv.org/abs/2409.17469 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_csRO_bot@mastoxiv.page
2025-06-18 08:40:11

Quadrotor Morpho-Transition: Learning vs Model-Based Control Strategies
Ioannis Mandralis, Richard M. Murray, Morteza Gharib
arxiv.org/abs/2506.14039