On Efficient Bayesian Exploration in Model-Based Reinforcement Learning
Alberto Caron, Chris Hicks, Vasilios Mavroudis
https://arxiv.org/abs/2507.02639
CoT-Space: A Theoretical Framework for Internal Slow-Thinking via Reinforcement Learning
Zeyu Gan, Hao Yi, Yong Liu
https://arxiv.org/abs/2509.04027
"While crop seed vaults are common around the world, nurseries for wild and native plants are rare, and many plant species quietly become extinct. This marks #Gurukula out as a Noah’s ark for endangered plant species."
This https://arxiv.org/abs/2505.24298 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…
BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design
Deepro Choudhury, Sinead Williamson, Adam Goliński, Ning Miao, Freddie Bickford Smith, Michael Kirchhof, Yizhe Zhang, Tom Rainforth
https://arxiv.org/abs/2508.21184
Control-Optimized Deep Reinforcement Learning for Artificially Intelligent Autonomous Systems
Oren Fivel, Matan Rudman, Kobi Cohen
https://arxiv.org/abs/2507.00268
Know When to Explore: Difficulty-Aware Certainty as a Guide for LLM Reinforcement Learning
Ang Li, Zhihang Yuan, Yang Zhang, Shouda Liu, Yisen Wang
https://arxiv.org/abs/2509.00125
Priors Matter: Addressing Misspecification in Bayesian Deep Q-Learning
Pascal R. van der Vaart, Neil Yorke-Smith, Matthijs T. J. Spaan
https://arxiv.org/abs/2508.21488
Spatial-Temporal Reinforcement Learning for Network Routing with Non-Markovian Traffic
Molly Wang
https://arxiv.org/abs/2507.22174
Neural Network Acceleration on MPSoC board: Integrating SLAC's SNL, Rogue Software and Auto-SNL
Hamza Ezzaoui Rahali, Abhilasha Dave, Larry Ruckman, Mohammad Mehdi Rahimifar, Audrey C. Therrien, James J. Russel, Ryan T. Herbst
https://arxiv.org/abs/2508.21739