Tootfinder

No exact results. Similar results found.

@arXiv_csAI_bot@mastoxiv.page
2025-10-15 10:18:51

Inclusive Fitness as a Key Step Towards More Advanced Social Behaviors in Multi-Agent Reinforcement Learning Settings
Andries Rosseau, Raphaël Avalos, Ann Nowé
https://arxiv.org/abs/2510.12555

The competitive and cooperative forces of natural selection have driven the evolution of intelligence for millions of years, culminating in nature's vast biodiversity and the complexity of human minds. Inspired by this process, we propose a novel multi-agent reinforcement learning framework where each agent is assigned a genotype and where reward functions are modelled after the concept of inclusive fitness. An agent's genetic material may be shared with other agents, and our inclusive reward f…

@arXiv_csLG_bot@mastoxiv.page
2025-10-14 13:38:18

Context-Aware Model-Based Reinforcement Learning for Autonomous Racing
Emran Yasser Moustafa, Ivana Dusparic
https://arxiv.org/abs/2510.11501 https://arxiv…

Autonomous vehicles have shown promising potential to be a groundbreaking technology for improving the safety of road users. For these vehicles, as well as many other safety-critical robotic technologies, to be deployed in real-world applications, we require algorithms that can generalize well to unseen scenarios and data. Model-based reinforcement learning algorithms (MBRL) have demonstrated state-of-the-art performance and data efficiency across a diverse set of domains. However, these algori…

@arXiv_csRO_bot@mastoxiv.page
2025-10-13 09:35:20

Model-Based Lookahead Reinforcement Learning for in-hand manipulation
Alexandre Lopes, Catarina Barata, Plinio Moreno
https://arxiv.org/abs/2510.08884 https://

In-Hand Manipulation, like many other dexterous tasks, remains a difficult challenge in robotics, combining complex dynamic systems with the capability to control and manoeuvre various objects using its actuators. This work presents the application of a previously developed hybrid Reinforcement Learning (RL) Framework to an In-Hand Manipulation task, verifying that it is capable of improving the performance of the task. The model combines concepts of both Model-Free and Model-Based Reinforcement L…

@arXiv_eessSY_bot@mastoxiv.page
2025-10-15 08:05:41

Physics-Informed Reinforcement Learning for Large-Scale EV Smart Charging Considering Distribution Network Voltage Constraints
Stavros Orfanoudakis, Frans Oliehoek, Peter Palensky, Pedro P. Vergara
https://arxiv.org/abs/2510.12335

Electric Vehicles (EVs) offer substantial flexibility for grid services, yet large-scale, uncoordinated charging can threaten voltage stability in distribution networks. Existing Reinforcement Learning (RL) approaches for smart charging often disregard physical grid constraints or have limited performance for complex large-scale tasks, limiting their scalability and real-world applicability. This paper introduces a physics-informed (PI) RL algorithm that integrates a differentiable power flow m…

@arXiv_csNI_bot@mastoxiv.page
2025-10-14 10:44:08

A Flexible Multi-Agent Deep Reinforcement Learning Framework for Dynamic Routing and Scheduling of Latency-Critical Services
Vincenzo Norman Vitale, Antonia Maria Tulino, Andreas F. Molisch, Jaime Llorca
https://arxiv.org/abs/2510.11535

Timely delivery of delay-sensitive information over dynamic, heterogeneous networks is increasingly essential for a range of interactive applications, such as industrial automation, self-driving vehicles, and augmented reality. However, most existing network control solutions target only average delay performance, falling short of providing strict End-to-End (E2E) peak latency guarantees. This paper addresses the challenge of reliably delivering packets within application-imposed deadlines by l…

@arXiv_csCL_bot@mastoxiv.page
2025-10-15 10:44:11

Reasoning Pattern Matters: Learning to Reason without Human Rationales
Chaoxu Pang, Yixuan Cao, Ping Luo
https://arxiv.org/abs/2510.12643 https://arxiv.org…

Large Language Models (LLMs) have demonstrated remarkable reasoning capabilities under the widely adopted SFT+RLVR paradigm, which first performs Supervised Fine-Tuning (SFT) on human-annotated reasoning trajectories (rationales) to establish initial reasoning behaviors, then applies Reinforcement Learning with Verifiable Rewards (RLVR) to optimize the model using verifiable signals without golden rationales. However, annotating high-quality rationales for the SFT stage remains prohibitively ex…

@arXiv_eessSP_bot@mastoxiv.page
2025-10-10 08:31:48

Utilizing Model-Free Reinforcement Learning for Optimizing Secure Multi-Party Computation Protocols
Javad Sayyadi, Mahdi Nangir, Mahmood Mohassel Feghhi, Hamid Sayyadi
https://arxiv.org/abs/2510.07814 …

In this manuscript, we explore the application of model-free reinforcement learning in optimizing secure multiparty computation (SMPC) protocols. SMPC is a crucial tool for performing computations on private data without the need to disclose it, holding significant importance in various domains, including information security and privacy. However, the efficiency of current protocols is often suboptimal due to computational and communicational complexities. Our proposed approach leverages model-…

@arXiv_csLG_bot@mastoxiv.page
2025-10-02 11:08:41

Multi-Actor Multi-Critic Deep Deterministic Reinforcement Learning with a Novel Q-Ensemble Method
Andy Wu, Chun-Cheng Lin, Rung-Tzuo Liaw, Yuehua Huang, Chihjung Kuo, Chia Tong Weng
https://arxiv.org/abs/2510.01083

Reinforcement learning has attracted much attention in recent years due to its rapid development and rich applications, especially in control systems and robotics. When tackling real-world applications with reinforcement learning methods, the corresponding Markov decision process may have a huge discrete or even continuous state/action space. Deep reinforcement learning has been studied for years as a way to handle these issues through deep learning, and one promising branch is the actor-critic architectu…

@arXiv_csLG_bot@mastoxiv.page
2025-10-01 11:55:47

Extensions of Robbins-Siegmund Theorem with Applications in Reinforcement Learning
Xinyu Liu, Zixuan Xie, Shangtong Zhang
https://arxiv.org/abs/2509.26442 https://

The Robbins-Siegmund theorem establishes the convergence of stochastic processes that are almost supermartingales and is foundational for analyzing a wide range of stochastic iterative algorithms in stochastic approximation and reinforcement learning (RL). However, its original form has a significant limitation as it requires the zero-order term to be summable. In many important RL applications, this summable condition, however, cannot be met. This limitation motivates us to extend the Robbins-…

@arXiv_csLG_bot@mastoxiv.page
2025-10-13 12:17:31

Crosslisted article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[4/5]:
- Application of Deep Reinforcement Learning to At-the-Money S&P 500 Options Hedging
Zofia Bracha, Paweł Sakowski, Jakub Michańków
