Tootfinder

Opt-in global Mastodon full text search. Join the index!

No exact results. Similar results found.
@heiseonline@social.heise.de
2025-09-08 12:06:00

Bundes-Klinik-Atlas goes offline, consumer advocates voice criticism
The Klinik-Atlas, published by the Federal Ministry of Health in May 2024, is going offline again after only one year. The consumer protection centre is critical of the move.

@arXiv_csCV_bot@mastoxiv.page
2025-07-10 07:52:21

Advancing Offline Handwritten Text Recognition: A Systematic Review of Data Augmentation and Generation Techniques
Yassin Hussein Rassul, Aram M. Ahmed, Polla Fattah, Bryar A. Hassan, Arwaa W. Abdulkareem, Tarik A. Rashid, Joan Lu
arxiv.org/abs/2507.06275

@arXiv_csHC_bot@mastoxiv.page
2025-07-10 09:23:21

Tailoring deep learning for real-time brain-computer interfaces: From offline models to calibration-free online decoding
Martin Wimpff, Jan Zerfowski, Bin Yang
arxiv.org/abs/2507.06779

@arXiv_csLG_bot@mastoxiv.page
2025-07-11 10:23:31

Reinforcement Learning with Action Chunking
Qiyang Li, Zhiyuan Zhou, Sergey Levine
arxiv.org/abs/2507.07969 arxiv.org/pdf/2507.07969 arxiv.org/html/2507.07969
arXiv:2507.07969v1 Announce Type: new
Abstract: We present Q-chunking, a simple yet effective recipe for improving reinforcement learning (RL) algorithms for long-horizon, sparse-reward tasks. Our recipe is designed for the offline-to-online RL setting, where the goal is to leverage an offline prior dataset to maximize the sample-efficiency of online learning. Effective exploration and sample-efficient learning remain central challenges in this setting, as it is not obvious how the offline data should be utilized to acquire a good exploratory policy. Our key insight is that action chunking, a technique popularized in imitation learning where sequences of future actions are predicted rather than a single action at each timestep, can be applied to temporal difference (TD)-based RL methods to mitigate the exploration challenge. Q-chunking adopts action chunking by directly running RL in a 'chunked' action space, enabling the agent to (1) leverage temporally consistent behaviors from offline data for more effective online exploration and (2) use unbiased $n$-step backups for more stable and efficient TD learning. Our experimental results demonstrate that Q-chunking exhibits strong offline performance and online sample efficiency, outperforming prior best offline-to-online methods on a range of long-horizon, sparse-reward manipulation tasks.
toXiv_bot_toot
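
As a rough illustration of the recipe described in the abstract above, the sketch below trains a critic over whole action chunks with an unbiased H-step TD backup. It is a minimal sketch under assumed shapes and hyperparameters: the chunk length H, the network widths, and the dummy batch are illustrative choices, not the authors' implementation.

# Minimal sketch (not the authors' code): a critic over action chunks of length H,
# trained with an H-step TD target built from the H observed rewards.
import torch
import torch.nn as nn

H = 4                       # chunk length (illustrative choice)
STATE_DIM, ACT_DIM = 8, 2   # illustrative dimensions
GAMMA = 0.99

# The critic scores a state together with a flattened chunk of H actions.
q_net = nn.Sequential(nn.Linear(STATE_DIM + H * ACT_DIM, 256), nn.ReLU(), nn.Linear(256, 1))
q_target = nn.Sequential(nn.Linear(STATE_DIM + H * ACT_DIM, 256), nn.ReLU(), nn.Linear(256, 1))
q_target.load_state_dict(q_net.state_dict())

def chunked_td_target(rewards, next_state, next_chunk):
    # rewards: (B, H) real rewards observed while executing the chunk, so the
    # H-step return is unbiased; bootstrap only after the full chunk.
    discounts = GAMMA ** torch.arange(H, dtype=torch.float32)
    h_step_return = (rewards * discounts).sum(dim=1, keepdim=True)
    with torch.no_grad():
        bootstrap = q_target(torch.cat([next_state, next_chunk], dim=1))
    return h_step_return + (GAMMA ** H) * bootstrap

# One critic update on a dummy batch, just to show the shapes involved.
B = 32
state = torch.randn(B, STATE_DIM)
chunk = torch.randn(B, H * ACT_DIM)        # H consecutive actions, flattened
rewards = torch.randn(B, H)
next_state = torch.randn(B, STATE_DIM)
next_chunk = torch.randn(B, H * ACT_DIM)   # e.g. sampled from the current chunked policy

target = chunked_td_target(rewards, next_state, next_chunk)
q_pred = q_net(torch.cat([state, chunk], dim=1))
loss = nn.functional.mse_loss(q_pred, target)
loss.backward()

The point of backing up over whole chunks is that all H rewards inside a chunk come from the environment, so the H-step return stays unbiased while value still propagates H steps at a time.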

@metacurity@infosec.exchange
2025-07-08 06:56:30

cyberscoop.com/call-of-duty-re
Call of Duty takes PC game offline after multiple reports of RCE attacks on players

@arXiv_quantph_bot@mastoxiv.page
2025-09-10 10:25:21

Variational Quantum Circuits in Offline Contextual Bandit Problems
Lukas Schulte, Daniel Hein, Steffen Udluft, Thomas A. Runkler
arxiv.org/abs/2509.07633

@ncoca@social.coop
2025-08-11 23:16:50

After two days offline, coming back online and seeing the news about the targeted assassination of the #AlJazeera journalists by #Israel was a tough, painful return to reality.
So many crimes, no accountability.

@stf@chaos.social
2025-07-10 16:57:27

ok, #farnell you really don't want to sell any products anymore? either your "website is offline for maintenance" wtf, really in 2025? or you 403 me and say (typos included):
> It appear that the our security software has identified something in this session that it is unsure of.
fine, i'll take my money elsewhere... buggers.

@j_honegger@swiss.social
2025-07-10 19:45:13

From #AnnafromUkraine @AnnafromUkraine@youtube.com
RUSSIA OFFLINE: MOBILE & WIRED INTERNET DISAPPEARED Vlog 1100: War in #Ukraine
The #internet "crashed" in

@arXiv_csLG_bot@mastoxiv.page
2025-07-11 10:23:41

EXPO: Stable Reinforcement Learning with Expressive Policies
Perry Dong, Qiyang Li, Dorsa Sadigh, Chelsea Finn
arxiv.org/abs/2507.07986 arxiv.org/pdf/2507.07986 arxiv.org/html/2507.07986
arXiv:2507.07986v1 Announce Type: new
Abstract: We study the problem of training and fine-tuning expressive policies with online reinforcement learning (RL) given an offline dataset. Training expressive policy classes with online RL presents a unique challenge of stable value maximization. Unlike simpler Gaussian policies commonly used in online RL, expressive policies like diffusion and flow-matching policies are parameterized by a long denoising chain, which hinders stable gradient propagation from actions to policy parameters when optimizing against some value function. Our key insight is that we can address stable value maximization by avoiding direct optimization over value with the expressive policy and instead constructing an on-the-fly RL policy to maximize Q-value. We propose Expressive Policy Optimization (EXPO), a sample-efficient online RL algorithm that utilizes an on-the-fly policy to maximize value with two parameterized policies -- a larger expressive base policy trained with a stable imitation learning objective and a lightweight Gaussian edit policy that edits the actions sampled from the base policy toward a higher value distribution. The on-the-fly policy optimizes the actions from the base policy with the learned edit policy and chooses the value-maximizing action from the base and edited actions for both sampling and temporal-difference (TD) backup. Our approach yields up to 2-3x improvement in sample efficiency on average over prior methods, both in the setting of fine-tuning a pretrained policy given offline data and in leveraging offline data to train online.
toXiv_bot_toot
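
To make the on-the-fly policy described in the abstract concrete, here is a minimal sketch of its action-selection step: sample from an expressive base policy, let a lightweight Gaussian edit policy perturb that sample, and keep whichever action the critic values more. The MLP stand-ins, the edit_std parameter, and the additive-edit form are assumptions for illustration, not the paper's actual parameterization.

# Minimal sketch (not the authors' code): an on-the-fly policy that keeps
# whichever of the base and edited actions the critic scores higher.
import torch
import torch.nn as nn

STATE_DIM, ACT_DIM = 8, 2   # illustrative dimensions

# Stand-in for an expressive (e.g. diffusion or flow-matching) base policy.
base_policy = nn.Sequential(nn.Linear(STATE_DIM, 256), nn.ReLU(), nn.Linear(256, ACT_DIM))
# Lightweight Gaussian edit policy: predicts a shift for the base action.
edit_policy = nn.Sequential(nn.Linear(STATE_DIM + ACT_DIM, 64), nn.ReLU(), nn.Linear(64, ACT_DIM))
q_net = nn.Sequential(nn.Linear(STATE_DIM + ACT_DIM, 256), nn.ReLU(), nn.Linear(256, 1))

def on_the_fly_action(state, edit_std=0.1):
    # Sample a base action, edit it, and return the value-maximizing of the two.
    with torch.no_grad():
        a_base = base_policy(state)
        delta = edit_policy(torch.cat([state, a_base], dim=-1))
        a_edit = a_base + delta + edit_std * torch.randn_like(a_base)   # Gaussian edit
        q_base = q_net(torch.cat([state, a_base], dim=-1))
        q_edit = q_net(torch.cat([state, a_edit], dim=-1))
        return torch.where(q_edit > q_base, a_edit, a_base)

state = torch.randn(4, STATE_DIM)
print(on_the_fly_action(state).shape)   # torch.Size([4, 2])

In the full algorithm the abstract says the same value-maximizing choice also feeds the TD backup; only the sampling side is shown here.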