Tootfinder

Opt-in global Mastodon full text search. Join the index!

No exact results. Similar results found.
@vosje62@mastodon.nl
2025-06-26 16:22:01

Hodgepodge of odd vehicles on the bike path leaves Amsterdam ‘despondent’: ‘Our city is not an amusement park’ | de #Volkskrant

Legally it will all be in order, she says. ‘But I sometimes feel truly abandoned by the national government, which grants permission for all of this. This way we cannot make our streets safer.’
@arXiv_csCL_bot@mastoxiv.page
2025-06-27 09:58:19

Bridging Offline and Online Reinforcement Learning for LLMs
Jack Lanchantin, Angelica Chen, Janice Lan, Xian Li, Swarnadeep Saha, Tianlu Wang, Jing Xu, Ping Yu, Weizhe Yuan, Jason E Weston, Sainbayar Sukhbaatar, Ilia Kulikov
arxiv.org/abs/2506.21495 arxiv.org/pdf/2506.21495 arxiv.org/html/2506.21495
arXiv:2506.21495v1 Announce Type: new
Abstract: We investigate the effectiveness of reinforcement learning methods for finetuning large language models when transitioning from offline to semi-online to fully online regimes for both verifiable and non-verifiable tasks. Our experiments cover training on verifiable math as well as non-verifiable instruction following with a set of benchmark evaluations for both. Across these settings, we extensively compare online and semi-online Direct Preference Optimization and Group Reward Policy Optimization objectives, and surprisingly find similar performance and convergence between these variants, which all strongly outperform offline methods. We provide a detailed analysis of the training dynamics and hyperparameter selection strategies to achieve optimal results. Finally, we show that multi-tasking with verifiable and non-verifiable rewards jointly yields improved performance across both task types.

@heiseonline@social.heise.de
2025-08-26 05:07:00

Study: Wind farms could release chemical substances into the sea
Researchers have identified 228 substances that the installations could potentially release. Some of the emissions are avoidable, they say.

@arXiv_astrophSR_bot@mastoxiv.page
2025-08-27 08:34:43

Chemical evolution imprints in the rare isotopes of nearby M dwarfs
Darío González Picos, Ignas Snellen, Sam de Regt
arxiv.org/abs/2508.18424

@relcfp@mastodon.social
2025-06-28 06:10:16

Call for Papers: „Kunstbäume in Text und Bild der Vormoderne 2.0. – Kommunikation in ökologischen Verflechtungen“ (Artificial Trees in Premodern Text and Image 2.0: Communication in Ecological Entanglements) at Goethe-Universität Frankfurt am Main, 30 October to 1 November 2025
ift.tt/YcTsv7C
TOC: Dutch Crossing: Journal of Low Countries Studies, vol. 46, no. 3 (March 2022) Dutch…

@arXiv_grqc_bot@mastoxiv.page
2025-07-28 08:08:41

Doubly Separable Spacetimes and Symmetry Constraints on their Self-Gravitating Matter Content
Prashant Kocherlakota, Ramesh Narayan
arxiv.org/abs/2507.18706

@arXiv_mathAG_bot@mastoxiv.page
2025-08-26 09:17:56

Involution on the Graded Grothendieck Ring of Varieties and $\mathbb{D}$-Singularities
Andrew Burke
arxiv.org/abs/2508.17587 arxiv.org/pdf/…

@arXiv_csDS_bot@mastoxiv.page
2025-08-26 08:30:06

Towards Constant Time Multi-Call Rumor Spreading on Small-Set Expanders
Emilio Cruciani, Sebastian Forster, Tijn de Vos
arxiv.org/abs/2508.18017

@arXiv_csCV_bot@mastoxiv.page
2025-06-25 10:31:50

A Comparative Study of NAFNet Baselines for Image Restoration
Vladislav Esaulov, M. Moein Esfahani
arxiv.org/abs/2506.19845

@relcfp@mastodon.social
2025-06-28 06:10:14

CFP: Kunstbäume in Text und Bild der Vormoderne 2.0. – Kommunikation in ökologischen Verflechtungen (Artificial Trees in Premodern Text and Image 2.0: Communication in Ecological Entanglements), Frankfurt am Main (13 July 2025)
ift.tt/5IP1Dab
TOC: Dutch Crossing: Journal of Low Countries Studies, vol. 46, no. 3 (March 2022) Dutch…
via Input 4 RELCFP