Tootfinder

Opt-in global Mastodon full-text search. Join the index!

@cheryanne@aus.social
2026-02-17 23:50:39

The Affluent CEO Show: The Select Circle's Codes To Next Level Scaling With Wealth Sovereignty And Ease
This isn't just another business podcast. It's a transformational journey for high-achieving female entrepreneurs who are ready to stop playing small and start leading with unapologetic power...
Great Australian Pods Podcast Directory:

The Affluent CEO Show: The Select Circle's Codes To Next Level Scaling With Wealth Sovereignty And Ease
Screenshot of the podcast listing on the Great Australian Pods website

@ocrampal@mastodon.social
2026-04-17 13:22:49

We have become master cartographers of the "given." By scaling computation, we have built maps so vast and detailed that we often mistake the territory for the grid.
ocrampal.com/the-question-ai-w

@Techmeme@techhub.social
2026-02-18 07:20:54

Q&A with Ramp CEO Eric Glyman on scaling the expense management company to over $1B revenue, the "SaaS apocalypse", using AI agents to review expenses, and more (Cheeky Pint)
cheekypint.substack.com/p/ramp

@burger_jaap@mastodon.social
2026-04-15 14:17:06

RE: #V2G pilot in Sweden with Energy Bank and Vattenfall is now scaling to 200 cars

@davidaugust@mastodon.online
2026-03-14 16:43:28

Working on commercial stuff with an #acting #coaching client, I played with scaling the work up and down: still grounding things in what’s really happening, but watching them find the small internal version they welcome us to join them in was very good to see.

@primonatura@mstdn.social
2026-02-17 18:00:12

"University of Houston’s Carbon Capture Breakthrough Cuts Costs, but Scale Remains Uncertain"
#US #USA #America #CarbonCapture

@v_i_o_l_a@openbiblio.social
2026-02-04 10:34:31

"Scaling ORCID Adoption: Technical and Organizational Approaches Within a Research Organization"
#ORCID iD use…

@aardrian@toot.cafe
2026-02-06 11:59:18

[1/?]
So this happened while I was up a mountain:
joshtumath.uk/posts/2026-01-27
But the post didn’t have a demo. Last night I made one:

Using chrome://flags to enable Experimental Web Platform Features in the latest Canary.

@Techmeme@techhub.social
2026-03-14 09:31:41

An interview with SemiAnalysis CEO Dylan Patel on logic, memory, and power bottlenecks in scaling AI compute, Nvidia securing TSMC N3 allocation early, and more (Dwarkesh Patel/Dwarkesh Podcast)
dwarkesh.com/p/dylan-patel

@Mediagazer@mstdn.social
2026-04-07 18:30:41

In Q1, McClatchy began rolling out a "content scaling agent" that summarizes and repurposes reporters' work with new headlines, prompting some employee pushback (Corbin Bolies/The Wrap)
thewrap.com/media-platforms/jo

@Xexyz@mastodon.me.uk
2026-03-06 00:15:25

Tomb Raider: Greece and Egypt
Having decried the lack of verticality in the Peruvian levels, I found St Francis's Folly more than made amends. The main part of the level has you scaling up and down a central room, opening doors with levers and solving puzzles in rooms named after Greek mythology. The permanence of enemy deaths is very noticeable here, where the bats you shoot at the top of the room can sometimes be found lying on the ground at the bottom; when using original…

German bargain grocer Aldi plans to open more than 180 U.S. stores this year.
The chain, which has been rapidly expanding since inflation surged in 2021, has set a goal of 3,200 stores globally by the end of 2028.
Trader Joe’s has been on an expansion spree.
Many Dollar General stores have begun selling fresh produce.
“Discounters are in this really tricky spot, where price is paramount, but they’re up against the likes of Walmart,” Moran said.
Walmart and Amaz…

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:34:51

UrbanFM: Scaling Urban Spatio-Temporal Foundation Models
Wei Chen, Yuqian Wu, Junle Chen, Xiaofang Zhou, Yuxuan Liang
arxiv.org/abs/2602.20677 arxiv.org/pdf/2602.20677 arxiv.org/html/2602.20677
arXiv:2602.20677v1 Announce Type: new
Abstract: Urban systems, as dynamic complex systems, continuously generate spatio-temporal data streams that encode the fundamental laws of human mobility and city evolution. While AI for Science has witnessed the transformative power of foundation models in disciplines like genomics and meteorology, urban computing remains fragmented due to "scenario-specific" models, which are overfitted to specific regions or tasks, hindering their generalizability. To bridge this gap and advance spatio-temporal foundation models for urban systems, we adopt scaling as the central perspective and systematically investigate two key questions: what to scale and how to scale. Grounded in first-principles analysis, we identify three critical dimensions: heterogeneity, correlation, and dynamics, aligning these principles with the fundamental scientific properties of urban spatio-temporal data. Specifically, to address heterogeneity through data scaling, we construct WorldST. This billion-scale corpus standardizes diverse physical signals, such as traffic flow and speed, from over 100 global cities into a unified data format. To enable computation scaling for modeling correlations, we introduce the MiniST unit, a novel split mechanism that discretizes continuous spatio-temporal fields into learnable computational units to unify representations of grid-based and sensor-based observations. Finally, addressing dynamics via architecture scaling, we propose UrbanFM, a minimalist self-attention architecture designed with limited inductive biases to autonomously learn dynamic spatio-temporal dependencies from massive data. Furthermore, we establish EvalST, the largest-scale urban spatio-temporal benchmark to date. Extensive experiments demonstrate that UrbanFM achieves remarkable zero-shot generalization across unseen cities and tasks, marking a pivotal first step toward large-scale urban spatio-temporal foundation models.
toXiv_bot_toot

@burger_jaap@mastodon.social
2026-02-12 07:35:54

All the legal battles are about who can exploit users even more. These German processes will not lead to cost reduction through scaling and standardisation, nor will they make EV charging more accessible.

@juliangro@social.linux.pizza
2026-01-23 17:02:57

For some reason, my home server was using the "performance" CPU scaling governor. Changing it to "conservative" seems to have lowered power consumption for the whole machine by almost 10%. This saves more than 25€ per year in electricity cost.
You can check your own CPU scaling governor with `cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor`.
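
For anyone wanting to script the change, here is a minimal Python sketch (my own illustration, assuming the standard Linux cpufreq sysfs layout and root privileges for the write):

```python
# Read each core's scaling governor and switch it to a chosen one.
# Sketch only: assumes the usual /sys/devices/system/cpu layout and
# needs root to write. Changes do not persist across reboots.
from pathlib import Path

def set_governor(governor: str = "conservative") -> None:
    cpu_root = Path("/sys/devices/system/cpu")
    for gov_file in sorted(cpu_root.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        current = gov_file.read_text().strip()
        if current != governor:
            gov_file.write_text(governor)  # requires root
            print(f"{gov_file.parent.parent.name}: {current} -> {governor}")

if __name__ == "__main__":
    set_governor("conservative")
```

To make the choice persist across reboots, the usual route is a small systemd unit or `cpupower frequency-set -g conservative` run at boot.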

@Techmeme@techhub.social
2026-03-05 06:01:44

OpenAI is scaling back shopping directly inside ChatGPT via Instant Checkout; checkouts will instead take place in the specific apps that plug into ChatGPT (The Information)
theinformation.com/articles/op

@michabbb@social.vivaldi.net
2026-04-07 23:14:29

Modular architecture: customize Directus to match your branding and workflows. Extensions, custom endpoints, hooks — all possible.
☁️ Cloud or Self-Hosted — Your Choice
Self-service cloud starting at $15/month with auto-scaling & global CDN. Or deploy locally/on-prem.
One-click deploy via #Railway included.
📊 The Numbers Speak for Themselves:
⭐ 34.5k GitHub St…

@primonatura@mstdn.social
2026-02-10 14:00:37

"The Ocean Cleanup removed a record 25 million kilos of plastic in 2025 (and they’re just getting started)"
#Oceans #Environment #Plastic

@stefan@gardenstate.social
2026-01-31 23:53:43

Firefox is choking on video on my 4K monitor on Kubuntu. YouTube's Stats for Nerds shows 10-20% of frames dropped.
Chromium is fine. I think removing all the monitor scaling helps but does not fully fix it.
#ubuntu #linux

@Sustainable2050@mastodon.energy
2026-03-24 17:06:09

And there we are. Trump is already blackmailing Europe with 'his' LNG. We urgently need to reduce our natural gas consumption by speeding up the energy transition: focus on energy efficiency, electrification, and scaling up EU production of biomethane, which is developing far too slowly to meet its 2030 target.

@arXiv_csPF_bot@mastoxiv.page
2026-04-01 07:49:27

Time is Not Compute: Scaling Laws for Wall-Clock Constrained Training on Consumer GPUs
Yi Liu
arxiv.org/abs/2603.28823 arxiv.org/pdf/2603.28823 arxiv.org/html/2603.28823
arXiv:2603.28823v1 Announce Type: new
Abstract: Scaling laws relate model quality to compute budget (FLOPs), but practitioners face wall-clock time constraints, not compute budgets. We study optimal model sizing under fixed time budgets from 5 minutes to 24 hours on consumer GPUs (RTX 4090). Across 70 runs spanning 50M--1031M parameters, we find: (1)~at each time budget a U-shaped curve emerges where too-small models overfit and too-large models undertrain; (2)~optimal model size follows $N^* \propto t^{0.60}$, growing \emph{faster} than Chinchilla's $N^* \propto C^{0.50}$, with $\alpha = 0.60 \pm 0.07$ robustly exceeding compute-optimal across all sensitivity analyses; (3)~a \emph{dual U-shape mechanism}: short-budget U-curves arise from compute bottlenecks, while long-budget U-curves emerge from data bottlenecks (overfitting), with an intermediate regime where the U-curve temporarily disappears. These findings have immediate implications for researchers training on consumer hardware, where wall-clock time -- not FLOPs -- is the binding constraint. We release all code, logs, and 70 experimental configurations.
toXiv_bot_toot
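
To make the fitted exponent concrete, here is a toy extrapolation under $N^* \propto t^{0.60}$; the anchor point (100M parameters optimal at a one-hour budget) is invented for illustration and is not a number from the paper:

```python
# Toy extrapolation of time-optimal model size under N* ∝ t^0.60,
# with a Chinchilla-style 0.50 exponent shown for contrast.
# The anchor (100M params optimal at 1 hour) is made up for illustration.
ANCHOR_T_H, ANCHOR_N = 1.0, 100e6

def optimal_params(t_hours: float, alpha: float) -> float:
    return ANCHOR_N * (t_hours / ANCHOR_T_H) ** alpha

for t in (5 / 60, 1.0, 6.0, 24.0):
    n_time = optimal_params(t, 0.60) / 1e6
    n_chin = optimal_params(t, 0.50) / 1e6
    print(f"{t:6.2f} h: N* ≈ {n_time:7.1f}M (exp 0.60) vs {n_chin:7.1f}M (exp 0.50)")
```

The steeper exponent means that as the wall-clock budget grows, the optimal model grows noticeably faster than a compute-optimal rule would suggest.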

@Techmeme@techhub.social
2026-01-20 00:01:24

Cursor's recent experiment involved running hundreds of AI agents for nearly a week to build a web browser, writing 1M lines of code across 1,000 files (Simon Willison/Simon Willison's Weblog)
simonwillison.net/2026/Jan/19/

@niklaskorz@rheinneckar.social
2026-02-24 12:55:45

Practically just waiting for #Gnome 50 to release to give it another shot. Stabilization of VRR and fractional scaling, plus so many improvements to the Gnome Mutter compositor make me curious.
Weirdly this is a controversial opinion, but I absolutely love the #Adwaita design language and …

@azonenberg@ioc.exchange
2026-03-24 06:42:34

@… wrt recent github tickets (on my phone and can't comment on the thread yet): the point of making it an env var is that *you do not need to use the UI to set it*.
If you're stuck in an edge case with broken scaling auto detection and the UI is tiny or enormous you probably won't be able to effectively interact with the preferenc…

@arXiv_csOS_bot@mastoxiv.page
2026-02-10 19:37:06

Replaced article(s) found for cs.OS. arxiv.org/list/cs.OS/new
[1/1]:
- Flare: Anomaly Diagnostics for Divergent LLM Training in GPU Clusters of Thousand-Plus Scale
Weihao Cui, Ji Zhang, Han Zhao, Chao Liu, Jian Sha, Bingsheng He, Minyi Guo, Quan Chen
arxiv.org/abs/2502.05413 mastoxiv.page/@arXiv_csOS_bot/
- Towards High-Goodput LLM Serving with Prefill-decode Multiplexing
Chen, Cui, Zhao, Xu, Fan, Chen, Zhou, Sun, He, Chen
arxiv.org/abs/2504.14489 mastoxiv.page/@arXiv_csOS_bot/
- Scaling Data Center TCP to Terabits with Laminar
Rajath Shashidhara, Antoine Kaufmann, Simon Peter
arxiv.org/abs/2504.19058 mastoxiv.page/@arXiv_csNI_bot/
toXiv_bot_toot

@Mediagazer@mstdn.social
2026-03-07 00:56:07

Q&A with Elizabeth Hansen Shapiro on her new report, to be published next week, which argues local news funders should pick winners, scale up, and force mergers (Richard Tofel/Nieman Lab)
niemanlab.org/2026/03/its-time

@peterhoneyman@a2mi.social
2026-01-22 21:12:46

reading a prelim paper on scaling up gpu-accelerated database query engines and feeling kinda gobsmacked at where that world is at
i remember when we built a “massive” memory machine at princeton with … i think it was 256 MB of RAM. (it sat idle except when ken thompson was logged in and building hash tables for chess endgames which was most of the time)

@arXiv_csCL_bot@mastoxiv.page
2026-03-31 10:10:07

Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design
Bin Zhu, Qianghuai Jia, Tian Lan, Junyang Ren, Feng Gu, Feihu Jiang, Longyue Wang, Zhao Xu, Weihua Luo
arxiv.org/abs/2603.28376 arxiv.org/pdf/2603.28376 arxiv.org/html/2603.28376
arXiv:2603.28376v1 Announce Type: new
Abstract: Deep research agents autonomously conduct open-ended investigations, integrating complex information retrieval with multi-step reasoning across diverse sources to solve real-world problems. To sustain this capability on long-horizon tasks, reliable verification is critical during both training and inference. A major bottleneck in existing paradigms stems from the lack of explicit verification mechanisms in QA data synthesis, trajectory construction, and test-time scaling. Errors introduced at each stage propagate downstream and degrade the overall agent performance. To address this, we present Marco DeepResearch, a deep research agent optimized with a verification-centric framework design at three levels: \textbf{(1)~QA Data Synthesis:} We introduce verification mechanisms to graph-based and agent-based QA synthesis to control question difficulty while ensuring answers are unique and correct; \textbf{(2)~Trajectory Construction:} We design a verification-driven trajectory synthesis method that injects explicit verification patterns into training trajectories; and \textbf{(3)~Test-time scaling:} We use Marco DeepResearch itself as a verifier at inference time and effectively improve performance on challenging questions. Extensive experimental results demonstrate that our proposed Marco DeepResearch agent significantly outperforms 8B-scale deep research agents on most challenging benchmarks, such as BrowseComp and BrowseComp-ZH. Crucially, under a maximum budget of 600 tool calls, Marco DeepResearch even surpasses or approaches several 30B-scale agents, like Tongyi DeepResearch-30B.
toXiv_bot_toot

@wyri@toot-toot.wyrihaxim.us
2026-02-22 19:24:28

@… how many do you have? Got 160 pods on my cluster with the 4 nodes running; it more than doubles when things start scaling

@arXiv_csDS_bot@mastoxiv.page
2026-02-10 21:08:46

Replaced article(s) found for cs.DS. arxiv.org/list/cs.DS/new
[1/1]:
- Fully Dynamic Adversarially Robust Correlation Clustering in Polylogarithmic Update Time
Vladimir Braverman, Prathamesh Dharangutte, Shreyas Pai, Vihan Shah, Chen Wang
arxiv.org/abs/2411.09979 mastoxiv.page/@arXiv_csDS_bot/
- A Simple and Combinatorial Approach to Proving Chernoff Bounds and Their Generalizations
William Kuszmaul
arxiv.org/abs/2501.03488 mastoxiv.page/@arXiv_csDS_bot/
- The Structural Complexity of Matrix-Vector Multiplication
Emile Anand, Jan van den Brand, Rose McCarty
arxiv.org/abs/2502.21240 mastoxiv.page/@arXiv_csDS_bot/
- Clustering under Constraints: Efficient Parameterized Approximation Schemes
Sujoy Bhore, Ameet Gadekar, Tanmay Inamdar
arxiv.org/abs/2504.06980 mastoxiv.page/@arXiv_csDS_bot/
- Minimizing Envy and Maximizing Happiness in Graphical House Allocation
Anubhav Dhar, Ashlesha Hota, Palash Dey, Sudeshna Kolay
arxiv.org/abs/2505.00296 mastoxiv.page/@arXiv_csDS_bot/
- Fast and Simple Densest Subgraph with Predictions
Thai Bui, Luan Nguyen, Hoa T. Vu
arxiv.org/abs/2505.12600 mastoxiv.page/@arXiv_csDS_bot/
- Compressing Suffix Trees by Path Decompositions
Becker, Cenzato, Gagie, Kim, Koerkamp, Manzini, Prezza
arxiv.org/abs/2506.14734 mastoxiv.page/@arXiv_csDS_bot/
- Improved sampling algorithms and functional inequalities for non-log-concave distributions
Yuchen He, Zhehan Lei, Jianan Shao, Chihao Zhang
arxiv.org/abs/2507.11236 mastoxiv.page/@arXiv_csDS_bot/
- Deterministic Lower Bounds for $k$-Edge Connectivity in the Distributed Sketching Model
Peter Robinson, Ming Ming Tan
arxiv.org/abs/2507.11257 mastoxiv.page/@arXiv_csDS_bot/
- Optimally detecting uniformly-distributed $\ell_2$ heavy hitters in data streams
Santhoshini Velusamy, Huacheng Yu
arxiv.org/abs/2509.07286 mastoxiv.page/@arXiv_csDS_bot/
- Uncrossed Multiflows and Applications to Disjoint Paths
Chandra Chekuri, Guyslain Naves, Joseph Poremba, F. Bruce Shepherd
arxiv.org/abs/2511.00254 mastoxiv.page/@arXiv_csDS_bot/
- Dynamic Matroids: Base Packing and Covering
Tijn de Vos, Mara Grilnberger
arxiv.org/abs/2511.15460 mastoxiv.page/@arXiv_csDS_bot/
- Branch-width of connectivity functions is fixed-parameter tractable
Tuukka Korhonen, Sang-il Oum
arxiv.org/abs/2601.04756 mastoxiv.page/@arXiv_csDS_bot/
- CoinPress: Practical Private Mean and Covariance Estimation
Sourav Biswas, Yihe Dong, Gautam Kamath, Jonathan Ullman
arxiv.org/abs/2006.06618
- The Ideal Membership Problem and Abelian Groups
Andrei A. Bulatov, Akbar Rafiey
arxiv.org/abs/2201.05218
- Bridging Classical and Quantum: Group-Theoretic Approach to Quantum Circuit Simulation
Daksh Shami
arxiv.org/abs/2407.19575 mastoxiv.page/@arXiv_quantph_b
- Young domination on Hamming rectangles
Janko Gravner, Matjaž Krnc, Martin Milanič, Jean-Florent Raymond
arxiv.org/abs/2501.03788 mastoxiv.page/@arXiv_mathCO_bo
- On the Space Complexity of Online Convolution
Joel Daniel Andersson, Amir Yehudayoff
arxiv.org/abs/2505.00181 mastoxiv.page/@arXiv_csCC_bot/
- Universal Solvability for Robot Motion Planning on Graphs
Anubhav Dhar, Pranav Nyati, Tanishq Prasad, Ashlesha Hota, Sudeshna Kolay
arxiv.org/abs/2506.18755 mastoxiv.page/@arXiv_csCC_bot/
- Colorful Minors
Evangelos Protopapas, Dimitrios M. Thilikos, Sebastian Wiederrecht
arxiv.org/abs/2507.10467
- Learning fermionic linear optics with Heisenberg scaling and physical operations
Aria Christensen, Andrew Zhao
arxiv.org/abs/2602.05058
toXiv_bot_toot

@almad@fosstodon.org
2026-02-25 00:30:25

“AI Prototyping Fallacy”: Because I’m used to code doing the same thing over and over again and scaling reasonably, I’m quite sure the same approach can be used for evaluating LLMs!

@Techmeme@techhub.social
2026-03-09 19:13:18

Jay Graber is stepping down as Bluesky CEO, saying a "seasoned operator focused on scaling and execution" is needed; VC Toni Schneider is named interim CEO (Kate Knibbs/Wired)
wired.com/story/bluesky-ceo-ja

@seeingwithsound@mas.to
2026-01-20 19:22:07

Contrasting success in neurotechnology: Sensory substitution, brain–computer interfaces, and the limits of dimensional reduction researchgate.net/publication/3

Contrasting outcomes in neurotechnology: Perceptual robustness versus neural scaling complexity.

@mrysav@social.linux.pizza
2026-02-05 21:37:32

I've been a GNOME user primarily for... probably over a decade at this point, occasionally trying #KDEPlasma. Nothing particularly bad about GNOME but recently been having some issues with scaling and theming so decided to try Plasma 6 and I think it might stick this time!

@arXiv_astrophGA_bot@mastoxiv.page
2026-04-02 09:35:03

Crosslisted article(s) found for astro-ph.GA. arxiv.org/list/astro-ph.GA/new
[1/1]:
- Analytical Scaling of Relativistic Drag in the Interstellar Medium
Lucky Gangwar

@cheryanne@aus.social
2026-02-24 18:08:38

Australian Business Podcast
Australia's top business podcast for growing and scaling your business from idea to a 7-figure exit...
Great Australian Pods Podcast Directory: greataustralianpods.com/austra

Australian Business Podcast 
Screenshot of the podcast listing on the Great Australian Pods website

@Techmeme@techhub.social
2026-02-27 15:45:54

OpenAI says ChatGPT has 900M weekly active users, 50M consumer subscribers, and weekly Codex users have more than tripled since the start of the year to 1.6M (OpenAI)
openai.com/index/scaling-ai-fo

@burger_jaap@mastodon.social
2026-02-02 12:58:42

National Grid DSO's contracted flexibility numbers are impressive. The domestic segment is showing particularly strong growth, with EV charging accounting for the majority of this. Many small units are coming together to make a big contribution.
dso.nationalg…

Scaling up flexibility

Page shows FY 23/24 > FY 24/25 > FY 25/26 growth in assets, capacity and dispatch
Zero carbon leading the way

Chart shows domestic EV charging points providing approximately 40% of capacity, and over 80% of the number of assets.

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:42:41

Scaling Vision Transformers: Evaluating DeepSpeed for Image-Centric Workloads
Huy Trinh, Rebecca Ma, Zeqi Yu, Tahsin Reza
arxiv.org/abs/2602.21081 arxiv.org/pdf/2602.21081 arxiv.org/html/2602.21081
arXiv:2602.21081v1 Announce Type: new
Abstract: Vision Transformers (ViTs) have demonstrated remarkable potential in image processing tasks by utilizing self-attention mechanisms to capture global relationships within data. However, their scalability is hindered by significant computational and memory demands, especially for large-scale models with many parameters. This study aims to leverage DeepSpeed, a highly efficient distributed training framework that is commonly used for language models, to enhance the scalability and performance of ViTs. We evaluate intra- and inter-node training efficiency across multiple GPU configurations on various datasets like CIFAR-10 and CIFAR-100, exploring the impact of distributed data parallelism on training speed, communication overhead, and overall scalability (strong and weak scaling). By systematically varying software parameters, such as batch size and gradient accumulation, we identify key factors influencing performance of distributed training. The experiments in this study provide a foundational basis for applying DeepSpeed to image-related tasks. Future work will extend these investigations to deepen our understanding of DeepSpeed's limitations and explore strategies for optimizing distributed training pipelines for Vision Transformers.
toXiv_bot_toot
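
For readers unfamiliar with the knobs being varied here, a minimal DeepSpeed-style configuration expressed as a Python dict; the values are placeholders, not the settings used in the study:

```python
# Minimal DeepSpeed-style config sketch showing the parameters the study
# varies (batch size, gradient accumulation). Placeholder values only.
WORLD_SIZE = 8  # number of GPUs, hypothetical

ds_config = {
    "train_micro_batch_size_per_gpu": 32,     # per-GPU batch
    "gradient_accumulation_steps": 4,         # trades memory for effective batch
    "train_batch_size": 32 * 4 * WORLD_SIZE,  # must equal micro * accum * world size
    "fp16": {"enabled": True},                # mixed precision
    "zero_optimization": {"stage": 1},        # shard optimizer state across ranks
}
```

The invariant in the comment (effective batch = micro batch × accumulation steps × world size) is what couples these software parameters to strong and weak scaling behaviour.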

@arXiv_physicsinsdet_bot@mastoxiv.page
2026-02-03 09:41:32

Proton Energy Dependence of Radiation Induced Low Gain Avalanche Detector Degradation
Veronika Kraus, Marcos Fernandez Garcia, Luca Menzio, Michael Moll
arxiv.org/abs/2602.01800 arxiv.org/pdf/2602.01800 arxiv.org/html/2602.01800
arXiv:2602.01800v1 Announce Type: new
Abstract: Low Gain Avalanche Detectors (LGADs) are key components for precise timing measurements in high-energy physics experiments, including the High Luminosity upgrades of the current LHC detectors. Their performance is, however, limited by radiation induced degradation of the gain layer, primarily driven by acceptor removal. This study presents a systematic comparison of how the degradation evolves with different incident proton energies, using LGADs from Hamamatsu Photonics (HPK) and The Institute of Microelectronics of Barcelona (IMB-CNM) irradiated with 18 MeV, 24 MeV, 400 MeV and 23 GeV protons and fluences up to 2.5x10^15 p/cm2. Electrical characterization is used to extract the acceptor removal coefficients for different proton energies, whereas IR TCT measurements offer complementary insight into the gain evolution in LGADs after irradiation. Across all devices, lower energy protons induce stronger gain layer degradation, confirming expectations. However, 400 MeV protons consistently appear less damaging than both lower and higher energy protons, an unexpected deviation from a monotonic energy trend. Conversion of proton fluences to 1 MeV neutron-equivalent fluences reduces but does not eliminate these differences, indicating that the standard Non-Ionizing Energy Loss (NIEL) scaling does not fully account for the underlying defect formation mechanisms at different energies and requires revision when considering irradiation fields that contain a broader spectrum of particle types and energies.
toXiv_bot_toot

@arXiv_physicsfludyn_bot@mastoxiv.page
2026-02-26 08:31:01

Prandtl number dependence of rotating internally heated convection
Rodolfo Ostilla-Mónico, Ali Arslan
arxiv.org/abs/2602.21860 arxiv.org/pdf/2602.21860 arxiv.org/html/2602.21860
arXiv:2602.21860v1 Announce Type: new
Abstract: We investigate the influence of the Prandtl number ($Pr$) on penetrative internally heated convection (IHC) in both non-rotating and rotating regimes using three-dimensional direct numerical simulations. By varying $Pr$ between 0.1 and 100, we show that the global mean temperature $\langle \overline{T} \rangle$ is not very sensitive to $Pr$, and is primarily controlled by the dynamics of the unstably stratified top boundary layer. In contrast, the Prandtl number dictates the behavior of the lower, stably stratified region and affects the vertical convective heat flux $\langle \overline{wT} \rangle$. In the non-rotating case, low $Pr$ fluids exhibit a ``symmetry recovery'' where turbulent stirring agitates the stable layer, whereas high $Pr$ fluids transition toward a ``dead zone'' of suppressed fluctuations. Under rotation, we find that $\langle \overline{wT} \rangle$ is enhanced across all Prandtl numbers, though global cooling efficiency, measured by the reduction in $\langle \overline{T} \rangle$, is only improved for $Pr\ge1$ due to the emergence of Ekman pumping. These results demonstrate that while IHC shares some scaling similarities with Rayleigh-B\'enard convection at the top boundary, the internal stratification creates a unique sensitivity to $Pr$ that is critical for understanding heat transport in planetary and stellar interiors.
toXiv_bot_toot

@Techmeme@techhub.social
2026-02-05 23:25:47

Sources: Apple wound down plans for an AI-based virtual health coach in recent weeks; Eddy Cue has told colleagues that Apple needs to move faster in health (Mark Gurman/Bloomberg)
bloomberg.com/news/articles/20

@burger_jaap@mastodon.social
2026-01-28 13:47:14

Taking a step closer to scaling up residential #V2G, a nationwide network of Swedish electricians will start offering installations of the Ambibox DC bidirectional charger (compatible with cars from the VW Group) in February.

@arXiv_csCL_bot@mastoxiv.page
2026-03-31 11:12:28

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[1/5]:
- Beyond In-Distribution Success: Scaling Curves of CoT Granularity for Language Model Generalization
Ru Wang, Wei Huang, Selena Song, Haoyu Zhang, Qian Niu, Yusuke Iwasawa, Yutaka Matsuo, Jiaxian Guo
arxiv.org/abs/2502.18273 mastoxiv.page/@arXiv_csCL_bot/
- Benchmarking NLP-supported Language Sample Analysis for Swiss Children's Speech
Anja Ryser, Yingqiang Gao, Sarah Ebling
arxiv.org/abs/2504.00780 mastoxiv.page/@arXiv_csCL_bot/
- Cultural Biases of Large Language Models and Humans in Historical Interpretation
Fabio Celli, Georgios Spathulas
arxiv.org/abs/2504.02572 mastoxiv.page/@arXiv_csCL_bot/
- BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text
Jiageng Wu, et al.
arxiv.org/abs/2504.19467 mastoxiv.page/@arXiv_csCL_bot/
- Understanding the Anchoring Effect of LLM with Synthetic Data: Existence, Mechanism, and Potentia...
Yiming Huang, Biquan Bie, Zuqiu Na, Weilin Ruan, Songxin Lei, Yutao Yue, Xinlei He
arxiv.org/abs/2505.15392 mastoxiv.page/@arXiv_csCL_bot/
- Just as Humans Need Vaccines, So Do Models: Model Immunization to Combat Falsehoods
Raza, Qureshi, Farooq, Lotif, Chadha, Pandya, Emmanouilidis
arxiv.org/abs/2505.17870 mastoxiv.page/@arXiv_csCL_bot/
- LingoLoop Attack: Trapping MLLMs via Linguistic Context and State Entrapment into Endless Loops
Fu, Jiang, Hong, Li, Guo, Yang, Chen, Zhang
arxiv.org/abs/2506.14493 mastoxiv.page/@arXiv_csCL_bot/
- GHTM: A Graph-based Hybrid Topic Modeling Approach with a Benchmark Dataset for the Low-Resource ...
Farhana Haque, Md. Abdur Rahman, Sumon Ahmed
arxiv.org/abs/2508.00605 mastoxiv.page/@arXiv_csCL_bot/
- Link Prediction for Event Logs in the Process Industry
Anastasia Zhukova, Thomas Walton, Christian E. Lobmüller, Bela Gipp
arxiv.org/abs/2508.09096 mastoxiv.page/@arXiv_csCL_bot/
- AirQA: A Comprehensive QA Dataset for AI Research with Instance-Level Evaluation
Huang, Cao, Zhang, Kang, Wang, Wang, Luo, Zheng, Qian, Chen, Yu
arxiv.org/abs/2509.16952 mastoxiv.page/@arXiv_csCL_bot/
- Multi-View Attention Multiple-Instance Learning Enhanced by LLM Reasoning for Cognitive Distortio...
Jun Seo Kim, Hyemi Kim, Woo Joo Oh, Hongjin Cho, Hochul Lee, Hye Hyeon Kim
arxiv.org/abs/2509.17292 mastoxiv.page/@arXiv_csCL_bot/
- Dual-Space Smoothness for Robust and Balanced LLM Unlearning
Han Yan, Zheyuan Liu, Meng Jiang
arxiv.org/abs/2509.23362 mastoxiv.page/@arXiv_csCL_bot/
- The Rise of AfricaNLP: Contributions, Contributors, Community Impact, and Bibliometric Analysis
Tadesse Destaw Belay, et al.
arxiv.org/abs/2509.25477 mastoxiv.page/@arXiv_csCL_bot/
- Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Reco...
Srivastav, Zheng, Bezzam, Le Bihan, Koluguri, Żelasko, Majumdar, Moumen, Gandhi
arxiv.org/abs/2510.06961 mastoxiv.page/@arXiv_csCL_bot/
- Neuron-Level Analysis of Cultural Understanding in Large Language Models
Taisei Yamamoto, Ryoma Kumon, Danushka Bollegala, Hitomi Yanaka
arxiv.org/abs/2510.08284 mastoxiv.page/@arXiv_csCL_bot/
- CLMN: Concept based Language Models via Neural Symbolic Reasoning
Yibo Yang
arxiv.org/abs/2510.10063 mastoxiv.page/@arXiv_csCL_bot/
- Schema for In-Context Learning
Chen, Chen, Wang, Leong, Fung, Bernales, Aspuru-Guzik
arxiv.org/abs/2510.13905 mastoxiv.page/@arXiv_csCL_bot/
- Evaluating Latent Knowledge of Public Tabular Datasets in Large Language Models
Matteo Silvestri, Fabiano Veglianti, Flavio Giorgi, Fabrizio Silvestri, Gabriele Tolomei
arxiv.org/abs/2510.20351 mastoxiv.page/@arXiv_csCL_bot/
- LuxIT: A Luxembourgish Instruction Tuning Dataset from Monolingual Seed Data
Julian Valline, Cedric Lothritz, Siwen Guo, Jordi Cabot
arxiv.org/abs/2510.24434 mastoxiv.page/@arXiv_csCL_bot/
- Surfacing Subtle Stereotypes: A Multilingual, Debate-Oriented Evaluation of Modern LLMs
Muhammed Saeed, Muhammad Abdul-mageed, Shady Shehata
arxiv.org/abs/2511.01187 mastoxiv.page/@arXiv_csCL_bot/
toXiv_bot_toot

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:39:11

Extending $\mu$P: Spectral Conditions for Feature Learning Across Optimizers
Akshita Gupta, Marieme Ngom, Sam Foreman, Venkatram Vishwanath
arxiv.org/abs/2602.20937 arxiv.org/pdf/2602.20937 arxiv.org/html/2602.20937
arXiv:2602.20937v1 Announce Type: new
Abstract: Several variations of adaptive first-order and second-order optimization methods have been proposed to accelerate and scale the training of large language models. The performance of these optimization routines is highly sensitive to the choice of hyperparameters (HPs), which are computationally expensive to tune for large-scale models. Maximal update parameterization $(\mu$P$)$ is a set of scaling rules which aims to make the optimal HPs independent of the model size, thereby allowing the HPs tuned on a smaller (computationally cheaper) model to be transferred to train a larger, target model. Despite promising results for SGD and Adam, deriving $\mu$P for other optimizers is challenging because the underlying tensor programming approach is difficult to grasp. Building on recent work that introduced spectral conditions as an alternative to tensor programs, we propose a novel framework to derive $\mu$P for a broader class of optimizers, including AdamW, ADOPT, LAMB, Sophia, Shampoo and Muon. We implement our $\mu$P derivations on multiple benchmark models and demonstrate zero-shot learning rate transfer across increasing model width for the above optimizers. Further, we provide empirical insights into depth-scaling parameterization for these optimizers.
toXiv_bot_toot
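
As a rough intuition for what a spectral condition looks like in practice, the sketch below rescales a layer's update so its spectral norm hits a width-aware target of $\eta\sqrt{\text{fan-out}/\text{fan-in}}$; this is my own toy illustration of the general idea, not the paper's derivations for AdamW, Shampoo, Muon, and the rest:

```python
# Toy illustration of a spectral condition: force each update's spectral
# norm to eta * sqrt(fan_out / fan_in) so its effect is width-independent.
# A sketch of the concept only, not the paper's per-optimizer rules.
import torch

def spectrally_scaled_update(grad: torch.Tensor, eta: float) -> torch.Tensor:
    fan_out, fan_in = grad.shape
    target = eta * (fan_out / fan_in) ** 0.5
    sigma_max = torch.linalg.matrix_norm(grad, ord=2)  # largest singular value
    return grad * (target / (sigma_max + 1e-12))

g = torch.randn(1024, 256)
u = spectrally_scaled_update(g, eta=0.1)
print(torch.linalg.matrix_norm(u, ord=2))  # ≈ 0.1 * sqrt(1024/256) = 0.2
```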

@arXiv_csDS_bot@mastoxiv.page
2026-02-03 08:07:36

Fast $k$-means Seeding Under The Manifold Hypothesis
Poojan Shah, Shashwat Agrawal, Ragesh Jaiswal
arxiv.org/abs/2602.01104 arxiv.org/pdf/2602.01104 arxiv.org/html/2602.01104
arXiv:2602.01104v1 Announce Type: new
Abstract: We study beyond worst case analysis for the $k$-means problem where the goal is to model typical instances of $k$-means arising in practice. Existing theoretical approaches provide guarantees under certain assumptions on the optimal solutions to $k$-means, making them difficult to validate in practice. We propose the manifold hypothesis, where data obtained in ambient dimension $D$ concentrates around a low dimensional manifold of intrinsic dimension $d$, as a reasonable assumption to model real world clustering instances. We identify key geometric properties of datasets which have theoretically predictable scaling laws depending on the quantization exponent $\varepsilon = 2/d$ using techniques from optimum quantization theory. We show how to exploit these regularities to design a fast seeding method called $\operatorname{Qkmeans}$ which provides $O(\rho^{-2} \log k)$ approximate solutions to the $k$-means problem in time $O(nD) + \widetilde{O}(\varepsilon^{1+\rho}\rho^{-1}k^{1+\gamma})$, where the exponent $\gamma = \varepsilon + \rho$ for an input parameter $\rho < 1$. This allows us to obtain new runtime-quality tradeoffs. We perform a large scale empirical study across various domains to validate our theoretical predictions and algorithm performance to bridge theory and practice for beyond worst case data clustering.
toXiv_bot_toot
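
For context, the standard $k$-means++ seeding is the baseline such fast-seeding work competes with; a minimal NumPy sketch of that baseline (not the paper's $\operatorname{Qkmeans}$):

```python
# Minimal k-means++ seeding: D^2 sampling, the common O(nkD) baseline.
# This is the standard method, not the paper's Qkmeans algorithm.
import numpy as np

def kmeans_pp_seed(X: np.ndarray, k: int, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    centers = [X[rng.integers(n)]]                # first center chosen uniformly
    d2 = np.sum((X - centers[0]) ** 2, axis=1)    # squared distances to nearest center
    for _ in range(k - 1):
        centers.append(X[rng.choice(n, p=d2 / d2.sum())])  # D^2 sampling
        d2 = np.minimum(d2, np.sum((X - centers[-1]) ** 2, axis=1))
    return np.stack(centers)

X = np.random.default_rng(1).normal(size=(1000, 8))
print(kmeans_pp_seed(X, k=5).shape)  # (5, 8)
```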

@Techmeme@techhub.social
2026-03-31 18:20:52

Monzo is shuttering its US operations to focus on scaling in the UK and Europe; source: it will lay off ~50 employees and close clients' accounts in June (Aisha S Gani/Bloomberg)
bloomberg.com/news/articles/20

@burger_jaap@mastodon.social
2026-02-23 08:14:04

"Pod is now scaling its smart charging capabilities across its existing customer base and extending access to rewards beyond early trials and subscribers to its all-inclusive Pod Drive charging service. Available to thousands more #EV drivers across the UK, Pod could pay out well over £1m in rewards by the end of the year."

@arXiv_physicsfludyn_bot@mastoxiv.page
2026-02-26 08:21:00

Frequency-Dependent Magnetic modulation of deposition morphology
S. K. Saroj, P. K. Panigrahi
arxiv.org/abs/2602.21789 arxiv.org/pdf/2602.21789 arxiv.org/html/2602.21789
arXiv:2602.21789v1 Announce Type: new
Abstract: This paper presents a novel approach for magnetic modulation of deposition morphology in an evaporating ferrofluid droplet. The magnetic field strength and ferrofluid concentration are kept unchanged, while the actuation frequencies are varied from 0.016 Hz to 5 Hz. In the absence of a magnetic field, a coffee-ring formation is observed and consistent with previous studies\cite{deegan1997capillary,deegan2000contact,saroj2019drying}. The application of a time-dependent magnetic field significantly modifies the deposition morphology. The periodic magnetic field induces the formation of multiple concentric rings during evaporation. The number of rings initially increases with increasing actuation frequency of the electromagnet. However, beyond a critical actuation frequency ($f_c = 0.2\,\text{Hz}$), the number of rings decreases. At higher actuation frequencies, magnetic particles preferentially deposit in the central region of the droplet, resulting in suppression of the coffee-ring effect. Additionally, the thickness of the inner rings and the ring spacing decrease with increasing actuation frequency up to critical actuation frequency. The transition from multi-ring formation to coffee-ring suppression is governed by the competition among magnetic forcing, capillary flow, and particle diffusion. The underlying physical mechanisms responsible for droplet dynamics and deposition morphology under periodic magnetic fields are evaluated using scaling arguments. The results demonstrate that diffusive particle transport plays a dominant role in determining the deposition pattern. A non-dimensional magnetic switching number, based on the magnetic perturbation timescale, is introduced as a control parameter to characterize the frequency-dependent deposition behavior.
toXiv_bot_toot

@Techmeme@techhub.social
2026-03-27 22:21:12

Anthropic adjusts Claude session limits and says users will use up their limits faster during peak hours, amid compute strain due to Claude's new popularity (Brent D. Griffiths/Business Insider)
businessinsider.com/claude-usa

@arXiv_csCL_bot@mastoxiv.page
2026-03-31 10:12:07

Compressing Transformer Language Models via Matrix Product Operator Decomposition: A Case Study on PicoGPT
Younes Javanmard, Tanmoy Pandit, Masoud Mardani
arxiv.org/abs/2603.28534 arxiv.org/pdf/2603.28534 arxiv.org/html/2603.28534
arXiv:2603.28534v1 Announce Type: new
Abstract: Transformer-based language models achieve strong performance across NLP tasks, but their quadratic parameter scaling with hidden dimension makes deployment on resource-constrained hardware expensive. We study Matrix Product Operator (MPO) decomposition as a principled compression method for transformers. MPO factorises weight matrices into chains of low-rank cores, with approximation quality controlled by the bond dimension chi. We replace every nn.Linear layer in PicoGPT, a GPT-2-style character-level language model with about 1M parameters, with an MPOLinear module parameterised as an MPO chain. Cores are initialised either by TT-SVD from pretrained dense weights or from random initialisation, and trained using standard PyTorch autograd without a custom backward pass. We derive balanced factorisation schemes for the five distinct weight shapes in PicoGPT and evaluate bond dimensions chi in {4, 8, 16, 32} on Tiny Shakespeare. MPO compression achieves up to 13x compression per transformer block at chi = 4. At chi = 16, the model uses 191,872 parameters instead of 1,020,224 while retaining 97.7% of baseline token accuracy (51.6% vs 52.8%). Reconstruction error follows the expected trend and is lower for three-site than two-site factorisations at the same bond dimension. The chi = 8 model gives the best accuracy per parameter, exceeding the dense baseline by 2.7x on this metric. These results show that MPO parameterisation is a practical and theoretically grounded alternative to low-rank methods and unstructured pruning for transformer compression.
toXiv_bot_toot
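
In its simplest two-core form, an MPO factorization of a weight matrix reduces to a truncated SVD, with the retained rank playing the role of the bond dimension chi. A hedged PyTorch sketch of that special case (the paper chains more cores per layer; this is not its MPOLinear):

```python
# Two-core special case of MPO compression: W ≈ B @ A via truncated SVD
# (TT-SVD with a single cut). Rank `chi` plays the bond-dimension role.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LowRankLinear(nn.Module):
    def __init__(self, dense: nn.Linear, chi: int):
        super().__init__()
        U, S, Vh = torch.linalg.svd(dense.weight.detach(), full_matrices=False)
        self.A = nn.Parameter(Vh[:chi] * S[:chi, None])  # (chi, in_features)
        self.B = nn.Parameter(U[:, :chi])                # (out_features, chi)
        self.bias = dense.bias

    def forward(self, x):
        return F.linear(x @ self.A.T, self.B, self.bias)

dense = nn.Linear(256, 256)         # 65,536 weight parameters
low = LowRankLinear(dense, chi=16)  # 2 * 256 * 16 = 8,192 (8x smaller)
x = torch.randn(4, 256)
print((low(x) - dense(x)).abs().max())  # error depends on W's spectrum
```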

@Techmeme@techhub.social
2026-03-25 08:01:48

Memo: Sam Altman says OpenAI's next model finished pretraining, and moves Safety to Research and Security to Scaling; Fidji Simo becomes CEO of "AGI Deployment" (Alex Heath/Sources)
sources.news/p/why-openai-kill

@Techmeme@techhub.social
2026-03-23 17:45:45

Q&A with Jensen Huang, who says "we've achieved AGI", on running Nvidia, AI scaling laws, OpenClaw, future of coding, data centers in space, China, and more (Lex Fridman)
lexfridman.com/jensen-huang-tr

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:45:11

Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking
Ravi Ghadia, Maksim Abraham, Sergei Vorobyov, Max Ryabinin
arxiv.org/abs/2602.21196 arxiv.org/pdf/2602.21196 arxiv.org/html/2602.21196
arXiv:2602.21196v1 Announce Type: new
Abstract: Efficiently processing long sequences with Transformer models usually requires splitting the computations across accelerators via context parallelism. The dominant approaches in this family of methods, such as Ring Attention or DeepSpeed Ulysses, enable scaling over the context dimension but do not focus on memory efficiency, which limits the sequence lengths they can support. More advanced techniques, such as Fully Pipelined Distributed Transformer or activation offloading, can further extend the possible context length at the cost of training throughput. In this paper, we present UPipe, a simple yet effective context parallelism technique that performs fine-grained chunking at the attention head level. This technique significantly reduces the activation memory usage of self-attention, breaking the activation memory barrier and unlocking much longer context lengths. Our approach reduces intermediate tensor memory usage in the attention layer by as much as 87.5$\%$ for 32B Transformers, while matching previous context parallelism techniques in terms of training speed. UPipe can support the context length of 5M tokens when training Llama3-8B on a single 8$\times$H100 node, improving upon prior methods by over 25$\%$.
toXiv_bot_toot
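
Because attention heads are independent, the headwise-chunking idea can be demonstrated exactly on one device: process a few heads at a time so only that chunk's intermediate tensors are alive. A single-GPU sketch of the principle (my illustration, not UPipe's distributed implementation):

```python
# Headwise chunking: attend over groups of heads sequentially so the
# intermediate attention tensors cover only h_chunk heads at a time.
import torch
import torch.nn.functional as F

def chunked_attention(q, k, v, h_chunk: int):
    # q, k, v: (batch, heads, seq, head_dim); heads are independent,
    # so the per-chunk results concatenate to the exact full answer.
    outs = []
    for h0 in range(0, q.shape[1], h_chunk):
        sl = slice(h0, h0 + h_chunk)
        outs.append(F.scaled_dot_product_attention(q[:, sl], k[:, sl], v[:, sl]))
    return torch.cat(outs, dim=1)

q, k, v = (torch.randn(2, 16, 1024, 64) for _ in range(3))
full = F.scaled_dot_product_attention(q, k, v)
print(torch.allclose(chunked_attention(q, k, v, h_chunk=4), full, atol=1e-5))
```

The training-time memory win additionally requires not retaining every chunk's activations at once, which is where the paper's pipelining across devices comes in.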

@arXiv_csCL_bot@mastoxiv.page
2026-03-31 11:13:03

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[4/5]:
- Retrieving Climate Change Disinformation by Narrative
Upravitelev, Solopova, Jakob, Sahitaj, Möller, Schmitt
arxiv.org/abs/2603.22015 mastoxiv.page/@arXiv_csCL_bot/
- PaperVoyager: Building Interactive Web with Visual Language Models
Dasen Dai, Biao Wu, Meng Fang, Wenhao Wang
arxiv.org/abs/2603.22999 mastoxiv.page/@arXiv_csCL_bot/
- Continual Robot Skill and Task Learning via Dialogue
Weiwei Gu, Suresh Kondepudi, Anmol Gupta, Lixiao Huang, Nakul Gopalan
arxiv.org/abs/2409.03166 mastoxiv.page/@arXiv_csRO_bot/
- Shifting Perspectives: Steering Vectors for Robust Bias Mitigation in LLMs
Zara Siddique, Irtaza Khalid, Liam D. Turner, Luis Espinosa-Anke
arxiv.org/abs/2503.05371 mastoxiv.page/@arXiv_csLG_bot/
- SkillFlow: Scalable and Efficient Agent Skill Retrieval System
Fangzhou Li, Pagkratios Tagkopoulos, Ilias Tagkopoulos
arxiv.org/abs/2504.06188 mastoxiv.page/@arXiv_csAI_bot/
- Large Language Models for Computer-Aided Design: A Survey
Licheng Zhang, Bach Le, Naveed Akhtar, Siew-Kei Lam, Tuan Ngo
arxiv.org/abs/2505.08137 mastoxiv.page/@arXiv_csLG_bot/
- Structured Agent Distillation for Large Language Model
Liu, Kong, Dong, Yang, Li, Tang, Yuan, Niu, Zhang, Zhao, Lin, Huang, Wang
arxiv.org/abs/2505.13820 mastoxiv.page/@arXiv_csLG_bot/
- VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction
Fan, Zhang, Li, Zhang, Chen, Hu, Wang, Qu, Zhou, Wang, Yan, Xu, Theiss, Chen, Li, Tu, Wang, Ranjan
arxiv.org/abs/2505.20279 mastoxiv.page/@arXiv_csCV_bot/
- Learning to Diagnose Privately: DP-Powered LLMs for Radiology Report Classification
Bhattacharjee, Tian, Rubin, Lo, Merchant, Hanson, Gounley, Tandon
arxiv.org/abs/2506.04450 mastoxiv.page/@arXiv_csCR_bot/
- L-MARS: Legal Multi-Agent Workflow with Orchestrated Reasoning and Agentic Search
Ziqi Wang, Boqin Yuan
arxiv.org/abs/2509.00761 mastoxiv.page/@arXiv_csAI_bot/
- Your Models Have Thought Enough: Training Large Reasoning Models to Stop Overthinking
Han, Huang, Liao, Jiang, Lu, Zhao, Wang, Zhou, Jiang, Liang, Zhou, Sun, Yu, Xiao
arxiv.org/abs/2509.23392 mastoxiv.page/@arXiv_csAI_bot/
- Person-Centric Annotations of LAION-400M: Auditing Bias and Its Transfer to Models
Leander Girrbach, Stephan Alaniz, Genevieve Smith, Trevor Darrell, Zeynep Akata
arxiv.org/abs/2510.03721 mastoxiv.page/@arXiv_csCV_bot/
- Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
Zhang, Hu, Upasani, Ma, Hong, Kamanuru, Rainton, Wu, Ji, Li, Thakker, Zou, Olukotun
arxiv.org/abs/2510.04618 mastoxiv.page/@arXiv_csLG_bot/
- Mitigating Premature Exploitation in Particle-based Monte Carlo for Inference-Time Scaling
Giannone, Xu, Nayak, Awhad, Sudalairaj, Xu, Srivastava
arxiv.org/abs/2510.05825 mastoxiv.page/@arXiv_csLG_bot/
- Complete asymptotic type-token relationship for growing complex systems with inverse power-law co...
Pablo Rosillo-Rodes, Laurent Hébert-Dufresne, Peter Sheridan Dodds
arxiv.org/abs/2511.02069 mastoxiv.page/@arXiv_physicsso
- ViPRA: Video Prediction for Robot Actions
Sandeep Routray, Hengkai Pan, Unnat Jain, Shikhar Bahl, Deepak Pathak
arxiv.org/abs/2511.07732 mastoxiv.page/@arXiv_csRO_bot/
- AISAC: An Integrated multi-agent System for Transparent, Retrieval-Grounded Scientific Assistance
Chandrachur Bhattacharya, Sibendu Som
arxiv.org/abs/2511.14043
- VideoARM: Agentic Reasoning over Hierarchical Memory for Long-Form Video Understanding
Yufei Yin, Qianke Meng, Minghao Chen, Jiajun Ding, Zhenwei Shao, Zhou Yu
arxiv.org/abs/2512.12360 mastoxiv.page/@arXiv_csCV_bot/
- RadImageNet-VQA: A Large-Scale CT and MRI Dataset for Radiologic Visual Question Answering
Léo Butsanets, Charles Corbière, Julien Khlaut, Pierre Manceron, Corentin Dancette
arxiv.org/abs/2512.17396 mastoxiv.page/@arXiv_csCV_bot/
- Measuring all the noises of LLM Evals
Sida Wang
arxiv.org/abs/2512.21326 mastoxiv.page/@arXiv_csLG_bot/
toXiv_bot_toot

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:45:31

Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs
Yining Hong, Huang Huang, Manling Li, Li Fei-Fei, Jiajun Wu, Yejin Choi
arxiv.org/abs/2602.21198 arxiv.org/pdf/2602.21198 arxiv.org/html/2602.21198
arXiv:2602.21198v1 Announce Type: new
Abstract: Embodied LLMs endow robots with high-level task reasoning, but they cannot reflect on what went wrong or why, turning deployment into a sequence of independent trials where mistakes repeat rather than accumulate into experience. Drawing upon human reflective practitioners, we introduce Reflective Test-Time Planning, which integrates two modes of reflection: \textit{reflection-in-action}, where the agent uses test-time scaling to generate and score multiple candidate actions using internal reflections before execution; and \textit{reflection-on-action}, which uses test-time training to update both its internal reflection model and its action policy based on external reflections after execution. We also include retrospective reflection, allowing the agent to re-evaluate earlier decisions and perform model updates with hindsight for proper long-horizon credit assignment. Experiments on our newly-designed Long-Horizon Household benchmark and MuJoCo Cupboard Fitting benchmark show significant gains over baseline models, with ablative studies validating the complementary roles of reflection-in-action and reflection-on-action. Qualitative analyses, including real-robot trials, highlight behavioral correction through reflection.
toXiv_bot_toot

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:39:51

Does Order Matter: Connecting The Law of Robustness to Robust Generalization
Himadri Mandal, Vishnu Varadarajan, Jaee Ponde, Aritra Das, Mihir More, Debayan Gupta
arxiv.org/abs/2602.20971 arxiv.org/pdf/2602.20971 arxiv.org/html/2602.20971
arXiv:2602.20971v1 Announce Type: new
Abstract: Bubeck and Sellke (2021) pose as an open problem the connection between the law of robustness and robust generalization. The law of robustness states that overparameterization is necessary for models to interpolate robustly; in particular, robust interpolation requires the learned function to be Lipschitz. Robust generalization asks whether small robust training loss implies small robust test loss. We resolve this problem by explicitly connecting the two for arbitrary data distributions. Specifically, we introduce a nontrivial notion of robust generalization error and convert it into a lower bound on the expected Rademacher complexity of the induced robust loss class. Our bounds recover the $\Omega(n^{1/d})$ regime of Wu et al.\ (2023) and show that, up to constants, robust generalization does not change the order of the Lipschitz constant required for smooth interpolation. We conduct experiments to probe the predicted scaling with dataset size and model capacity, testing whether empirical behavior aligns more closely with the predictions of Bubeck and Sellke (2021) or Wu et al.\ (2023). For MNIST, we find that the lower-bound Lipschitz constant scales on the order predicted by Wu et al.\ (2023). Informally, to obtain low robust generalization error, the Lipschitz constant must lie in a range that we bound, and the allowable perturbation radius is linked to the Lipschitz scale.
toXiv_bot_toot

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 16:07:37

Replaced article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[1/6]:
- Towards Attributions of Input Variables in a Coalition
Xinhao Zheng, Huiqi Deng, Quanshi Zhang
arxiv.org/abs/2309.13411
- Knee or ROC
Veronica Wendt, Jacob Steiner, Byunggu Yu, Caleb Kelly, Justin Kim
arxiv.org/abs/2401.07390
- Rethinking Disentanglement under Dependent Factors of Variation
Antonio Almudévar, Alfonso Ortega
arxiv.org/abs/2408.07016 mastoxiv.page/@arXiv_csLG_bot/
- Minibatch Optimal Transport and Perplexity Bound Estimation in Discrete Flow Matching
Etrit Haxholli, Yeti Z. Gurbuz, Ogul Can, Eli Waxman
arxiv.org/abs/2411.00759 mastoxiv.page/@arXiv_csLG_bot/
- Predicting Subway Passenger Flows under Incident Situation with Causality
Xiannan Huang, Shuhan Qiu, Quan Yuan, Chao Yang
arxiv.org/abs/2412.06871 mastoxiv.page/@arXiv_csLG_bot/
- Characterizing LLM Inference Energy-Performance Tradeoffs across Workloads and GPU Scaling
Paul Joe Maliakel, Shashikant Ilager, Ivona Brandic
arxiv.org/abs/2501.08219 mastoxiv.page/@arXiv_csLG_bot/
- Universality of Benign Overfitting in Binary Linear Classification
Ichiro Hashimoto, Stanislav Volgushev, Piotr Zwiernik
arxiv.org/abs/2501.10538 mastoxiv.page/@arXiv_csLG_bot/
- Safe Reinforcement Learning for Real-World Engine Control
Julian Bedei, Lucas Koch, Kevin Badalian, Alexander Winkler, Patrick Schaber, Jakob Andert
arxiv.org/abs/2501.16613 mastoxiv.page/@arXiv_csLG_bot/
- A Statistical Learning Perspective on Semi-dual Adversarial Neural Optimal Transport Solvers
Roman Tarasov, Petr Mokrov, Milena Gazdieva, Evgeny Burnaev, Alexander Korotin
arxiv.org/abs/2502.01310
- Improving the Convergence of Private Shuffled Gradient Methods with Public Data
Shuli Jiang, Pranay Sharma, Zhiwei Steven Wu, Gauri Joshi
arxiv.org/abs/2502.03652 mastoxiv.page/@arXiv_csLG_bot/
- Using the Path of Least Resistance to Explain Deep Networks
Sina Salek, Joseph Enguehard
arxiv.org/abs/2502.12108 mastoxiv.page/@arXiv_csLG_bot/
- Distributional Vision-Language Alignment by Cauchy-Schwarz Divergence
Wenzhe Yin, Zehao Xiao, Pan Zhou, Shujian Yu, Jiayi Shen, Jan-Jakob Sonke, Efstratios Gavves
arxiv.org/abs/2502.17028 mastoxiv.page/@arXiv_csLG_bot/
- Armijo Line-search Can Make (Stochastic) Gradient Descent Provably Faster
Sharan Vaswani, Reza Babanezhad
arxiv.org/abs/2503.00229 mastoxiv.page/@arXiv_csLG_bot/
- Semantic Parallelism: Redefining Efficient MoE Inference via Model-Data Co-Scheduling
Yan Li, Zhenyu Zhang, Zhengang Wang, Pengfei Chen, Pengfei Zheng
arxiv.org/abs/2503.04398 mastoxiv.page/@arXiv_csLG_bot/
- A Survey on Federated Fine-tuning of Large Language Models
Wu, Tian, Li, Sun, Tam, Zhou, Liao, Xiong, Guo, Li, Xu
arxiv.org/abs/2503.12016 mastoxiv.page/@arXiv_csLG_bot/
- Towards Trustworthy GUI Agents: A Survey
Yucheng Shi, Wenhao Yu, Jingyuan Huang, Wenlin Yao, Wenhu Chen, Ninghao Liu
arxiv.org/abs/2503.23434 mastoxiv.page/@arXiv_csLG_bot/
- CONTINA: Confidence Interval for Traffic Demand Prediction with Coverage Guarantee
Chao Yang, Xiannan Huang, Shuhan Qiu, Yan Cheng
arxiv.org/abs/2504.13961 mastoxiv.page/@arXiv_csLG_bot/
- Regularity and Stability Properties of Selective SSMs with Discontinuous Gating
Nikola Zubić, Davide Scaramuzza
arxiv.org/abs/2505.11602 mastoxiv.page/@arXiv_csLG_bot/
- RECON: Robust symmetry discovery via Explicit Canonical Orientation Normalization
Alonso Urbano, David W. Romero, Max Zimmer, Sebastian Pokutta
arxiv.org/abs/2505.13289 mastoxiv.page/@arXiv_csLG_bot/
- RefLoRA: Refactored Low-Rank Adaptation for Efficient Fine-Tuning of Large Models
Yilang Zhang, Bingcong Li, Georgios B. Giannakis
arxiv.org/abs/2505.18877 mastoxiv.page/@arXiv_csLG_bot/
- SuperMAN: Interpretable and Expressive Networks over Temporally Sparse Heterogeneous Data
Bechler-Speicher, Zerio, Huri, Vestergaard, Gilad-Bachrach, Jess, Bhatt, Sazonovs
arxiv.org/abs/2505.19193 mastoxiv.page/@arXiv_csLG_bot/
toXiv_bot_toot

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 16:07:58

Replaced article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[3/6]:
- Towards Scalable Oversight via Partitioned Human Supervision
Ren Yin, Takashi Ishida, Masashi Sugiyama
arxiv.org/abs/2510.22500 mastoxiv.page/@arXiv_csLG_bot/
- ContextPilot: Fast Long-Context Inference via Context Reuse
Yinsicheng Jiang, Yeqi Huang, Liang Cheng, Cheng Deng, Xuan Sun, Luo Mai
arxiv.org/abs/2511.03475 mastoxiv.page/@arXiv_csLG_bot/
- Metabolomic Biomarker Discovery for ADHD Diagnosis Using Interpretable Machine Learning
Nabil Belacel, Mohamed Rachid Boulassel
arxiv.org/abs/2601.11283 mastoxiv.page/@arXiv_csLG_bot/
- PhysE-Inv: A Physics-Encoded Inverse Modeling approach for Arctic Snow Depth Prediction
Akila Sampath, Vandana Janeja, Jianwu Wang
arxiv.org/abs/2601.17074
- SAGE-5GC: Security-Aware Guidelines for Evaluating Anomaly Detection in the 5G Core Network
Cristian Manca, Christian Scano, Giorgio Piras, Fabio Brau, Maura Pintor, Battista Biggio
arxiv.org/abs/2602.03596
- LORE: Jointly Learning the Intrinsic Dimensionality and Relative Similarity Structure From Ordina...
Anand, Helbling, Davenport, Berman, Alagapan, Rozell
arxiv.org/abs/2602.04192
- Towards Robust Scaling Laws for Optimizers
Alexandra Volkova, Mher Safaryan, Christoph H. Lampert, Dan Alistarh
arxiv.org/abs/2602.07712 mastoxiv.page/@arXiv_csLG_bot/
- Do We Need Adam? Surprisingly Strong and Sparse Reinforcement Learning with SGD in LLMs
Sagnik Mukherjee, Lifan Yuan, Pavan Jayasinha, Dilek Hakkani-Tür, Hao Peng
arxiv.org/abs/2602.07729 mastoxiv.page/@arXiv_csLG_bot/
- AceGRPO: Adaptive Curriculum Enhanced Group Relative Policy Optimization for Autonomous Machine L...
Yuzhu Cai, Zexi Liu, Xinyu Zhu, Cheng Wang, Siheng Chen
arxiv.org/abs/2602.07906 mastoxiv.page/@arXiv_csLG_bot/
- VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training
Guobin Shen, Chenxiao Zhao, Xiang Cheng, Lei Huang, Xing Yu
arxiv.org/abs/2602.10693 mastoxiv.page/@arXiv_csLG_bot/
- KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models
Zukang Xu, Zhixiong Zhao, Xing Hu, Zhixuan Chen, Dawei Yang
arxiv.org/abs/2602.11184 mastoxiv.page/@arXiv_csLG_bot/
- MUSE: Multi-Tenant Model Serving With Seamless Model Updates
Correia, Ferreira, Martins, Bento, Guerreiro, Pereira, Gomes, Bono, Ferreira, Bizarro
arxiv.org/abs/2602.11776 mastoxiv.page/@arXiv_csLG_bot/
- Pawsterior: Variational Flow Matching for Structured Simulation-Based Inference
Jorge Carrasco-Pollo, Floor Eijkelboom, Jan-Willem van de Meent
arxiv.org/abs/2602.13813 mastoxiv.page/@arXiv_csLG_bot/
- Silent Inconsistency in Data-Parallel Full Fine-Tuning: Diagnosing Worker-Level Optimization Misa...
Hong Li, Zhen Zhou, Honggang Zhang, Yuping Luo, Xinyue Wang, Han Gong, Zhiyuan Liu
arxiv.org/abs/2602.14462 mastoxiv.page/@arXiv_csLG_bot/
- Divine Benevolence is an $x^2$: GLUs scale asymptotically faster than MLPs
Alejandro Francisco Queiruga
arxiv.org/abs/2602.14495 mastoxiv.page/@arXiv_csLG_bot/
- \"UberWeb: Insights from Multilingual Curation for a 20-Trillion-Token Dataset
DatologyAI, et al.
arxiv.org/abs/2602.15210 mastoxiv.page/@arXiv_csLG_bot/
- GLM-5: from Vibe Coding to Agentic Engineering
GLM-5-Team, et al.
arxiv.org/abs/2602.15763 mastoxiv.page/@arXiv_csLG_bot/
- Anatomy of Capability Emergence: Scale-Invariant Representation Collapse and Top-Down Reorganizat...
Jayadev Billa
arxiv.org/abs/2602.15997 mastoxiv.page/@arXiv_csLG_bot/
- AI-CARE: Carbon-Aware Reporting Evaluation Metric for AI Models
KC Santosh, Srikanth Baride, Rodrigue Rizk
arxiv.org/abs/2602.16042 mastoxiv.page/@arXiv_csLG_bot/
- Beyond Message Passing: A Symbolic Alternative for Expressive and Interpretable Graph Learning
Chuqin Geng, Li Zhang, Haolin Ye, Ziyu Zhao, Yuhe Jiang, Tara Saba, Xinyu Wang, Xujie Si
arxiv.org/abs/2602.16947 mastoxiv.page/@arXiv_csLG_bot/
toXiv_bot_toot

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 12:33:48

Crosslisted article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[3/3]:
- Functional Continuous Decomposition
Teymur Aghayev
arxiv.org/abs/2602.20857 mastoxiv.page/@arXiv_eessSP_bo
- SpatiaLQA: A Benchmark for Evaluating Spatial Logical Reasoning in Vision-Language Models
Xie, Zhang, Shan, Zhu, Tang, Wei, Song, Wan, Song
arxiv.org/abs/2602.20901 mastoxiv.page/@arXiv_csCV_bot/
- Some Simple Economics of AGI
Christian Catalini, Xiang Hui, Jane Wu
arxiv.org/abs/2602.20946 mastoxiv.page/@arXiv_econGN_bo
- Multimodal MRI Report Findings Supervised Brain Lesion Segmentation with Substructures
Yubin Ge, Yongsong Huang, Xiaofeng Liu
arxiv.org/abs/2602.20994 mastoxiv.page/@arXiv_eessIV_bo
- MIP Candy: A Modular PyTorch Framework for Medical Image Processing
Tianhao Fu, Yucheng Chen
arxiv.org/abs/2602.21033 mastoxiv.page/@arXiv_csCV_bot/
- Empirically Calibrated Conditional Independence Tests
Milleno Pan, Antoine de Mathelin, Wesley Tansey
arxiv.org/abs/2602.21036 mastoxiv.page/@arXiv_statME_bo
- Is Multi-Distribution Learning as Easy as PAC Learning: Sharp Rates with Bounded Label Noise
Rafael Hanashiro, Abhishek Shetty, Patrick Jaillet
arxiv.org/abs/2602.21039 mastoxiv.page/@arXiv_statML_bo
- Position-Aware Sequential Attention for Accurate Next Item Recommendations
Timur Nabiev, Evgeny Frolov
arxiv.org/abs/2602.21052 mastoxiv.page/@arXiv_csIR_bot/
- Motivation is Something You Need
Mehdi Acheli, Walid Gaaloul
arxiv.org/abs/2602.21064 mastoxiv.page/@arXiv_csAI_bot/
- An Enhanced Projection Pursuit Tree Classifier with Visual Methods for Assessing Algorithmic Impr...
Natalia da Silva, Dianne Cook, Eun-Kyung Lee
arxiv.org/abs/2602.21130 mastoxiv.page/@arXiv_statML_bo
- Complexity of Classical Acceleration for $\ell_1$-Regularized PageRank
Kimon Fountoulakis, David Martínez-Rubio
arxiv.org/abs/2602.21138 mastoxiv.page/@arXiv_mathOC_bo
- LUMEN: Longitudinal Multi-Modal Radiology Model for Prognosis and Diagnosis
Jiang, Yang, Nath, Parida, Kulkarni, Xu, Xu, Anwar, Roth, Linguraru
arxiv.org/abs/2602.21142 mastoxiv.page/@arXiv_csCV_bot/
- A Benchmark for Deep Information Synthesis
Debjit Paul, et al.
arxiv.org/abs/2602.21143 mastoxiv.page/@arXiv_csAI_bot/
- Scaling State-Space Models on Multiple GPUs with Tensor Parallelism
Anurag Dutt, Nimit Shah, Hazem Masarani, Anshul Gandhi
arxiv.org/abs/2602.21144 mastoxiv.page/@arXiv_csDC_bot/
- Not Just How Much, But Where: Decomposing Epistemic Uncertainty into Per-Class Contributions
Mame Diarra Toure, David A. Stephens
arxiv.org/abs/2602.21160 mastoxiv.page/@arXiv_statML_bo
- Aletheia tackles FirstProof autonomously
Tony Feng, et al.
arxiv.org/abs/2602.21201 mastoxiv.page/@arXiv_csAI_bot/
- Squint: Fast Visual Reinforcement Learning for Sim-to-Real Robotics
Abdulaziz Almuzairee, Henrik I. Christensen
arxiv.org/abs/2602.21203 mastoxiv.page/@arXiv_csRO_bot/
toXiv_bot_toot