Tootfinder

@arXiv_csLG_bot@mastoxiv.page
2025-08-20 10:07:40

MACTAS: Self-Attention-Based Module for Inter-Agent Communication in Multi-Agent Reinforcement Learning
Maciej Wojtala, Bogusz Stefańczyk, Dominik Bogucki, Łukasz Lepak, Jakub Strykowski, Paweł Wawrzyński
https://arxiv.org/abs/2508.13661

Communication is essential for the collective execution of complex tasks by human agents, motivating interest in communication mechanisms for multi-agent reinforcement learning (MARL). However, existing communication protocols in MARL are often complex and non-differentiable. In this work, we introduce a self-attention-based communication module that exchanges information between the agents in MARL. Our proposed approach is fully differentiable, allowing agents to learn to generate messages in …
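
As a rough illustration of what a self-attention-based exchange between agents can look like, the sketch below mixes per-agent state vectors with scaled dot-product attention. It is not the MACTAS module: the function names are hypothetical, and queries, keys, and values are simply the raw agent states with no learned projections. Every operation is smooth, so the same computation written in an autodiff framework would be differentiable end to end.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_exchange(states):
    """Mix agent states with scaled dot-product self-attention.

    states: one equal-length vector per agent (hypothetical representation).
    Returns one aggregated message per agent. Assumption: queries, keys and
    values are the raw states; a real module would learn projections.
    """
    d = len(states[0])
    scale = math.sqrt(d)
    mixed = []
    for query in states:
        # Similarity of this agent's state to every agent's state.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in states]
        weights = softmax(scores)
        # Each received message is a convex combination of all agents' states.
        mixed.append([sum(w * v[j] for w, v in zip(weights, states)) for j in range(d)])
    return mixed
```

Calling `attention_exchange([[1.0, 0.0], [0.0, 1.0]])` returns one vector per agent, each weighted toward that agent's own state because its self-similarity score is the largest.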

@arXiv_csLG_bot@mastoxiv.page
2025-08-20 10:13:10

Reinforcement Learning-based Adaptive Path Selection for Programmable Networks
José Eduardo Zerna Torres, Marios Avgeris, Chrysa Papagianni, Gergely Pongrácz, István Gódor, Paola Grosso
https://arxiv.org/abs/2508.13806

This work presents a proof-of-concept implementation of a distributed, in-network reinforcement learning (IN-RL) framework for adaptive path selection in programmable networks. By combining Stochastic Learning Automata (SLA) with real-time telemetry data collected via In-Band Network Telemetry (INT), the proposed system enables local, data-driven forwarding decisions that adapt dynamically to congestion conditions. The system is evaluated on a Mininet-based testbed using P4-programmable BMv2 sw…
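
To make the idea of a stochastic learning automaton choosing among paths concrete, here is a minimal sketch using the classic linear reward-inaction update. The abstract does not say which SLA scheme the authors use, so the update rule, the function names, the fixed per-path delays, and the delay threshold that stands in for telemetry feedback are all assumptions made for illustration.

```python
import random


def choose_path(probs, rng):
    """Sample a path index according to the automaton's action probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reward_inaction_update(probs, chosen, rewarded, lr=0.1):
    """Linear reward-inaction: on reward, move probability mass toward the
    chosen path; on no reward, leave the probabilities unchanged."""
    if not rewarded:
        return list(probs)
    return [
        p + lr * (1.0 - p) if i == chosen else p * (1.0 - lr)
        for i, p in enumerate(probs)
    ]


def simulate(delays, threshold, steps=200, lr=0.1, seed=0):
    """Hypothetical loop: a path is 'rewarded' when its (fixed, assumed)
    delay is below the threshold, standing in for telemetry feedback."""
    rng = random.Random(seed)
    probs = [1.0 / len(delays)] * len(delays)
    for _ in range(steps):
        path = choose_path(probs, rng)
        probs = reward_inaction_update(probs, path, delays[path] < threshold, lr)
    return probs
```

With `simulate([5.0, 20.0], threshold=10.0)`, only the first path is ever rewarded, so its selection probability grows toward 1 while the probabilities always sum to 1.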

@arXiv_csRO_bot@mastoxiv.page
2025-08-21 09:15:39

Efficient Environment Design for Multi-Robot Navigation via Continuous Control
Jahid Chowdhury Choton, John Woods, William Hsu
https://arxiv.org/abs/2508.14105

Multi-robot navigation and path planning in continuous state and action spaces with uncertain environments remains an open challenge. Deep Reinforcement Learning (RL) is one of the most popular paradigms for solving this task, but its real-world application has been limited due to sample inefficiency and long training periods. Moreover, the existing works using RL for multi-robot navigation lack formal guarantees while designing the environment. In this paper, we introduce an efficient and high…

@arXiv_csAI_bot@mastoxiv.page
2025-08-14 07:30:22

Value Function Initialization for Knowledge Transfer and Jump-start in Deep Reinforcement Learning
Soumia Mehimeh
https://arxiv.org/abs/2508.09277

Value function initialization (VFI) is an effective way to achieve a jumpstart in reinforcement learning (RL) by leveraging value estimates from prior tasks. While this approach is well established in tabular settings, extending it to deep reinforcement learning (DRL) poses challenges due to the continuous nature of the state-action space, the noisy approximations of neural networks, and the impracticality of storing all past models for reuse. In this work, we address these challenges and intro…

@arXiv_condmatmeshall_bot@mastoxiv.page
2025-09-18 09:34:51

Twist-modulated magnetic interactions in bilayer van der Waals materials
Tomas T. Osterholt, D. O. Oriekhov, Lumen Eek, Cristiane Morais Smith, Rembert A. Duine
https://arxiv.org/abs/2509.14122

The ability to control magnetic interactions at the nanoscale is crucial for the development of next-generation spintronic devices and functional magnetic materials. In this work, we investigate theoretically, by means of many-body perturbation theory, how interlayer twisting modulates magnetic interactions in bilayer van der Waals systems composed of two ferromagnetic layers. We demonstrate that the relative strengths of the interlayer Heisenberg exchange interaction, the Dzyaloshinskii-Moriya…

@arXiv_csRO_bot@mastoxiv.page
2025-09-17 10:22:40

Integrating Trajectory Optimization and Reinforcement Learning for Quadrupedal Jumping with Terrain-Adaptive Landing
Renjie Wang, Shangke Lyu, Xin Lang, Wei Xiao, Donglin Wang
https://arxiv.org/abs/2509.12776

Jumping is an essential component of quadruped robots' locomotion capabilities and includes dynamic take-off and adaptive landing. Existing quadrupedal jumping studies have mainly focused on the stance and flight phases, assuming flat landing ground, which is impractical in many real-world cases. This work proposes a safe landing framework that achieves adaptive landing on rough terrain by combining Trajectory Optimization (TO) and Reinforcement Learning (RL). The RL agent l…

@arXiv_csRO_bot@mastoxiv.page
2025-09-16 11:25:26

Quantum deep reinforcement learning for humanoid robot navigation task
Romerik Lokossou, Birhanu Shimelis Girma, Ozan K. Tonguz, Ahmed Biyabani
https://arxiv.org/abs/2509.11388

Classical reinforcement learning (RL) methods often struggle in complex, high-dimensional environments because of their extensive parameter requirements and challenges posed by stochastic, non-deterministic settings. This study introduces quantum deep reinforcement learning (QDRL) to train humanoid agents efficiently. While previous quantum RL models focused on smaller environments, such as wheeled robots and robotic arms, our work pioneers the application of QDRL to humanoid robotics, specific…

@arXiv_csLG_bot@mastoxiv.page
2025-08-12 12:07:03

Stackelberg Coupling of Online Representation Learning and Reinforcement Learning
Fernando Martinez, Tao Li, Yingdong Lu, Juntao Chen
https://arxiv.org/abs/2508.07452

Integrated, end-to-end learning of representations and policies remains a cornerstone of deep reinforcement learning (RL). However, to address the challenge of learning effective features from a sparse reward signal, recent trends have shifted towards adding complex auxiliary objectives or fully decoupling the two processes, often at the cost of increased design complexity. This work proposes an alternative to both decoupling and naive end-to-end learning, arguing that performance can be signif…

@arXiv_csLG_bot@mastoxiv.page
2025-09-08 10:07:20

Greener Deep Reinforcement Learning: Analysis of Energy and Carbon Efficiency Across Atari Benchmarks
Jason Gardner, Ayan Dutta, Swapnoneel Roy, O. Patrick Kreidl, Ladislau Boloni
https://arxiv.org/abs/2509.05273

The growing computational demands of deep reinforcement learning (DRL) have raised concerns about the environmental and economic costs of training large-scale models. While algorithmic efficiency in terms of learning performance has been extensively studied, the energy requirements, greenhouse gas emissions, and monetary costs of DRL algorithms remain largely unexplored. In this work, we present a systematic benchmarking study of the energy consumption of seven state-of-the-art DRL algorithms, …
