Tootfinder

Opt-in global Mastodon full-text search. Join the index!

@dcm@social.sunet.se
2025-06-29 13:48:52

Some good venting by Steve Klabnik about the sorry state of significant chunks of the AI debate today:
"What is breaking my brain a little bit is that all of the discussion online around AI is so incredibly polarized. This isn’t a “the middle is always right” sort of thing either, to be clear. It’s more that both the pro-AI and anti-AI sides are loudly proclaiming things that are pretty trivially verifiable as not true."

@arXiv_csCL_bot@mastoxiv.page
2025-06-27 09:58:19

Bridging Offline and Online Reinforcement Learning for LLMs
Jack Lanchantin, Angelica Chen, Janice Lan, Xian Li, Swarnadeep Saha, Tianlu Wang, Jing Xu, Ping Yu, Weizhe Yuan, Jason E Weston, Sainbayar Sukhbaatar, Ilia Kulikov
arxiv.org/abs/2506.21495 arxiv.org/pdf/2506.21495 arxiv.org/html/2506.21495
arXiv:2506.21495v1 Announce Type: new
Abstract: We investigate the effectiveness of reinforcement learning methods for finetuning large language models when transitioning from offline to semi-online to fully online regimes for both verifiable and non-verifiable tasks. Our experiments cover training on verifiable math as well as non-verifiable instruction following with a set of benchmark evaluations for both. Across these settings, we extensively compare online and semi-online Direct Preference Optimization and Group Reward Policy Optimization objectives, and surprisingly find similar performance and convergence between these variants, which all strongly outperform offline methods. We provide a detailed analysis of the training dynamics and hyperparameter selection strategies to achieve optimal results. Finally, we show that multi-tasking with verifiable and non-verifiable rewards jointly yields improved performance across both task types.

@arXiv_csCR_bot@mastoxiv.page
2025-07-29 09:50:01

Cryptographic Data Exchange for Nuclear Warheads
Neil Perry, Daniil Zhukov
arxiv.org/abs/2507.20074 arxiv.org/pdf/2507.20074

@arXiv_csCE_bot@mastoxiv.page
2025-07-30 07:33:51

Improving Neural Network Training using Dynamic Learning Rate Schedule for PINNs and Image Classification
D. Veerababu, Ashwin A. Raikar, Prasanta K. Ghosh
arxiv.org/abs/2507.21749

@heiseonline@social.heise.de
2025-06-24 06:06:00

Germany is Europe's e-bike stronghold, and prices are falling
Nowhere else in Europe is as much money made with e-bikes. Business did take a hit recently, but in the long run there is likely no way around them.

@sean@scoat.es
2025-05-28 15:20:21

Unlocking my own understanding of and ability to build #Swift macros feels like a superpower.
…something something great responsibility, though.
Synthesizing boilerplate and statically-verifiable elements like custom function calls based on macro input… is magic—the good kind.
`@GET("/logs/{userId}/{timing}")`
↘️

Screen recording of my IDE completing `ApiController.$Routing.logs.resolvedPath` based on the above declaration.

@arXiv_eessSY_bot@mastoxiv.page
2025-05-30 07:23:52

Latent Representations for Control Design with Provable Stability and Safety Guarantees
Paul Lutkus, Kaiyuan Wang, Lars Lindemann, Stephen Tu
arxiv.org/abs/2505.23210

@arXiv_csMA_bot@mastoxiv.page
2025-07-29 07:56:51

Towards Multi-Agent Economies: Enhancing the A2A Protocol with Ledger-Anchored Identities and x402 Micropayments for AI Agents
Awid Vaziry, Sandro Rodriguez Garzon, Axel Küpper
arxiv.org/abs/2507.19550

@arXiv_csPL_bot@mastoxiv.page
2025-05-28 07:20:40

VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification
Patrick Yubeaton, Andre Nakkab, Weihua Xiao, Luca Collini, Ramesh Karri, Chinmay Hegde, Siddharth Garg
arxiv.org/abs/2505.20302

@arXiv_csCL_bot@mastoxiv.page
2025-06-26 09:40:50

When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs
Ammar Khairi, Daniel D'souza, Ye Shen, Julia Kreutzer, Sara Hooker
arxiv.org/abs/2506.20544