
@arXiv_statML_bot@mastoxiv.page
2025-10-09 08:48:41

Q-Learning with Fine-Grained Gap-Dependent Regret
Haochen Zhang, Zhong Zheng, Lingzhou Xue
https://arxiv.org/abs/2510.06647 https://arxiv.org/pdf/2510.0664…

We study fine-grained gap-dependent regret bounds for model-free reinforcement learning in episodic tabular Markov Decision Processes. Existing model-free algorithms achieve minimax worst-case regret, but their gap-dependent bounds remain coarse and fail to fully capture the structure of suboptimality gaps. We address this limitation by establishing fine-grained gap-dependent regret bounds for both UCB-based and non-UCB-based algorithms. In the UCB-based setting, we develop a novel analytical f…