Tootfinder

Opt-in global Mastodon full text search.

@arXiv_csLG_bot@mastoxiv.page
2025-09-30 14:38:11

Learning Distinguishable Representations in Deep Q-Networks for Linear Transfer
Sooraj Sathish, Keshav Goyal, Raghuram Bharadwaj Diddigi
arxiv.org/abs/2509.24947

@arXiv_csGT_bot@mastoxiv.page
2025-09-30 07:47:34

Grouped Satisficing Paths in Pure Strategy Games: a Topological Perspective
Yanqing Fu, Chao Huang, Chenrun Wang, Zhuping Wang
arxiv.org/abs/2509.23157

@arXiv_csAI_bot@mastoxiv.page
2025-08-28 08:42:31

ReST-RL: Achieving Accurate Code Reasoning of LLMs with Optimized Self-Training and Decoding
Sining Zhoubian, Dan Zhang, Yuxiao Dong, Jie Tang
arxiv.org/abs/2508.19576

@arXiv_csCR_bot@mastoxiv.page
2025-08-28 07:51:31

Towards Production-Worthy Simulation for Autonomous Cyber Operations
Konur Tholl, Mariam El Mezouar, Ranwa Al Mallah
arxiv.org/abs/2508.19278

@arXiv_eessSY_bot@mastoxiv.page
2025-08-25 09:14:50

Reinforcement Learning-based Control via Y-wise Affine Neural Networks (YANNs)
Austin Braniff, Yuhe Tian
arxiv.org/abs/2508.16474 arxiv.org…

@Dragofix@veganism.social
2025-10-24 15:30:10

A closer look at Peru’s Amazon reveals new mining trends, deforestation news.mongabay.com/2025/10/a-cl

@arXiv_csCL_bot@mastoxiv.page
2025-09-25 10:44:52

Embedding Domain Knowledge for Large Language Models via Reinforcement Learning from Augmented Generation
Chaojun Nie, Jun Zhou, Guanxiang Wang, Shisong Wud, Zichen Wang
arxiv.org/abs/2509.20162

@arXiv_csLG_bot@mastoxiv.page
2025-09-26 10:31:01

Tree Search for LLM Agent Reinforcement Learning
Yuxiang Ji, Ziyu Ma, Yong Wang, Guanhua Chen, Xiangxiang Chu, Liaoni Wu
arxiv.org/abs/2509.21240

@arXiv_csCR_bot@mastoxiv.page
2025-08-27 09:59:23

Attackers Strike Back? Not Anymore - An Ensemble of RL Defenders Awakens for APT Detection
Sidahmed Benabderrahmane, Talal Rahwan
arxiv.org/abs/2508.19072

@arXiv_csLG_bot@mastoxiv.page
2025-08-25 09:59:30

RL Is Neither a Panacea Nor a Mirage: Understanding Supervised vs. Reinforcement Learning Fine-Tuning for LLMs
Hangzhan Jin, Sicheng Lv, Sifan Wu, Mohammad Hamdaqa
arxiv.org/abs/2508.16546