Tootfinder

Opt-in global Mastodon full-text search.

No exact results. Similar results found.
@dcm@social.sunet.se
2025-06-05 14:23:15

Another of my forays into AI ethics is just out! This time the focus is on the ethics (or lack thereof) of Reinforcement Learning Feedback (RLF) techniques aimed at increasing the 'alignment' of LLMs.
The paper is the fruit of joint work by a great team of collaborators, among whom @… and @…

@radioeinsmusicbot@mastodonapp.uk
2025-06-05 10:42:10

🇺🇦 Now playing on radioeins...
CocoRosie:
🎵 Rainbowarriors
#NowPlaying #CocoRosie
pentafonica.bandcamp.com/track
open.spotify.com/track/1nq0L0i

@arXiv_csLG_bot@mastoxiv.page
2025-06-05 11:00:37

This arxiv.org/abs/2506.00691 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_csRO_bot@mastoxiv.page
2025-06-05 07:22:51

Autonomous Vehicle Lateral Control Using Deep Reinforcement Learning with MPC-PID Demonstration
Chengdong Wu, Sven Kirchner, Nils Purschke, Alois C. Knoll
arxiv.org/abs/2506.04040

@arXiv_statML_bot@mastoxiv.page
2025-06-06 07:39:46

Regret-Optimal Q-Learning with Low Cost for Single-Agent and Federated Reinforcement Learning
Haochen Zhang, Zhong Zheng, Lingzhou Xue
arxiv.org/abs/2506.04626

@arXiv_csSE_bot@mastoxiv.page
2025-06-05 07:23:42

Boosting Open-Source LLMs for Program Repair via Reasoning Transfer and LLM-Guided Reinforcement Learning
Xunzhu Tang, Jacques Klein, Tegawendé F. Bissyandé
arxiv.org/abs/2506.03921

@arXiv_mathOC_bot@mastoxiv.page
2025-06-06 07:28:02

Optimal-PhiBE: A PDE-based Model-free framework for Continuous-time Reinforcement Learning
Yuhua Zhu, Yuming Zhang, Haoyu Zhang
arxiv.org/abs/2506.05208

@arXiv_csLG_bot@mastoxiv.page
2025-06-05 10:58:49

This arxiv.org/abs/2505.23585 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_csRO_bot@mastoxiv.page
2025-06-04 07:40:36

AURA: Agentic Upskilling via Reinforced Abstractions
Alvin Zhu, Yusuke Tanaka, Dennis Hong
arxiv.org/abs/2506.02507 a…

@arXiv_csLG_bot@mastoxiv.page
2025-06-05 10:59:18

This arxiv.org/abs/2505.24298 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…