Tootfinder

Opt-in global Mastodon full-text search. Join the index!

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:45:01

Statistical Query Lower Bounds for Smoothed Agnostic Learning
Ilias Diakonikolas, Daniel M. Kane
arxiv.org/abs/2602.21191 arxiv.org/pdf/2602.21191 arxiv.org/html/2602.21191
arXiv:2602.21191v1 Announce Type: new
Abstract: We study the complexity of smoothed agnostic learning, recently introduced by \cite{CKKMS24}, in which the learner competes with the best classifier in a target class under slight Gaussian perturbations of the inputs. Specifically, we focus on the prototypical task of agnostically learning halfspaces under subgaussian distributions in the smoothed model. The best known upper bound for this problem relies on $L_1$-polynomial regression and has complexity $d^{\tilde{O}(1/\sigma^2) \log(1/\epsilon)}$, where $\sigma$ is the smoothing parameter and $\epsilon$ is the excess error. Our main result is a Statistical Query (SQ) lower bound providing formal evidence that this upper bound is close to best possible. In more detail, we show that (even for Gaussian marginals) any SQ algorithm for smoothed agnostic learning of halfspaces requires complexity $d^{\Omega(1/\sigma^{2} \log(1/\epsilon))}$. This is the first non-trivial lower bound on the complexity of this task and nearly matches the known upper bound. Roughly speaking, we show that applying $L_1$-polynomial regression to a smoothed version of the function is essentially best possible. Our techniques involve finding a moment-matching hard distribution by way of linear programming duality. This dual program corresponds exactly to finding a low-degree approximating polynomial to the smoothed version of the target function (which turns out to be the same condition required for the $L_1$-polynomial regression to work). Our explicit SQ lower bound then comes from proving lower bounds on this approximation degree for the class of halfspaces.
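
For intuition, here is a minimal sketch of the $L_1$-polynomial regression upper-bound algorithm the abstract refers to: fit a low-degree polynomial under absolute loss, then threshold. The degree, the IRLS solver, and the assumption of +/-1 labels are illustrative choices, not the paper's implementation.

```python
# Sketch of L1 polynomial regression: low-degree fit under absolute
# loss (approximated by IRLS), then an error-minimizing threshold.
import numpy as np
from itertools import combinations_with_replacement

def poly_features(X, degree):
    """All monomials of total degree <= degree, constant included."""
    n, d = X.shape
    cols = [np.ones(n)]
    for deg in range(1, degree + 1):
        for idx in combinations_with_replacement(range(d), deg):
            cols.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(cols)

def l1_poly_regression(X, y, degree, iters=100):
    Phi = poly_features(X, degree)
    w = np.linalg.lstsq(Phi, y, rcond=None)[0]          # L2 warm start
    for _ in range(iters):                              # IRLS for the L1 fit
        sw = 1.0 / np.sqrt(np.abs(Phi @ w - y) + 1e-8)
        w = np.linalg.lstsq(Phi * sw[:, None], y * sw, rcond=None)[0]
    p = Phi @ w
    thresholds = np.unique(p)
    errs = [(np.sign(p - t) != y).mean() for t in thresholds]
    t_best = thresholds[int(np.argmin(errs))]
    return lambda Xq: np.sign(poly_features(Xq, degree) @ w - t_best)
```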

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 16:08:08

Replaced article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[4/6]:
- Neural Proposals, Symbolic Guarantees: Neuro-Symbolic Graph Generation with Hard Constraints
Chuqin Geng, Li Zhang, Mark Zhang, Haolin Ye, Ziyu Zhao, Xujie Si
arxiv.org/abs/2602.16954 mastoxiv.page/@arXiv_csLG_bot/
- Multi-Probe Zero Collision Hash (MPZCH): Mitigating Embedding Collisions and Enhancing Model Fres...
Ziliang Zhao, et al.
arxiv.org/abs/2602.17050 mastoxiv.page/@arXiv_csLG_bot/
- MASPO: Unifying Gradient Utilization, Probability Mass, and Signal Reliability for Robust and Sam...
Fu, Lin, Fang, Zheng, Hu, Shao, Qin, Pan, Zeng, Cai
arxiv.org/abs/2602.17550 mastoxiv.page/@arXiv_csLG_bot/
- A Theoretical Framework for Modular Learning of Robust Generative Models
Corinna Cortes, Mehryar Mohri, Yutao Zhong
arxiv.org/abs/2602.17554 mastoxiv.page/@arXiv_csLG_bot/
- Multi-Round Human-AI Collaboration with User-Specified Requirements
Sima Noorani, Shayan Kiyani, Hamed Hassani, George Pappas
arxiv.org/abs/2602.17646 mastoxiv.page/@arXiv_csLG_bot/
- NEXUS: A compact neural architecture for high-resolution spatiotemporal air quality forecasting i...
Rampunit Kumar, Aditya Maheshwari
arxiv.org/abs/2602.19654 mastoxiv.page/@arXiv_csLG_bot/
- Augmenting Lateral Thinking in Language Models with Humor and Riddle Data for the BRAINTEASER Task
Mina Ghashami, Soumya Smruti Mishra
arxiv.org/abs/2405.10385 mastoxiv.page/@arXiv_csCL_bot/
- Watermarking Language Models with Error Correcting Codes
Patrick Chao, Yan Sun, Edgar Dobriban, Hamed Hassani
arxiv.org/abs/2406.10281 mastoxiv.page/@arXiv_csCR_bot/
- Learning to Control Unknown Strongly Monotone Games
Siddharth Chandak, Ilai Bistritz, Nicholas Bambos
arxiv.org/abs/2407.00575 mastoxiv.page/@arXiv_csMA_bot/
- Classification and reconstruction for single-pixel imaging with classical and quantum neural netw...
Sofya Manko, Dmitry Frolovtsev
arxiv.org/abs/2407.12506 mastoxiv.page/@arXiv_quantph_b
- Statistical Inference for Temporal Difference Learning with Linear Function Approximation
Weichen Wu, Gen Li, Yuting Wei, Alessandro Rinaldo
arxiv.org/abs/2410.16106 mastoxiv.page/@arXiv_statML_bo
- Big data approach to Kazhdan-Lusztig polynomials
Abel Lacabanne, Daniel Tubbenhauer, Pedro Vaz
arxiv.org/abs/2412.01283 mastoxiv.page/@arXiv_mathRT_bo
- MoEMba: A Mamba-based Mixture of Experts for High-Density EMG-based Hand Gesture Recognition
Mehran Shabanpour, Kasra Rad, Sadaf Khademi, Arash Mohammadi
arxiv.org/abs/2502.17457 mastoxiv.page/@arXiv_eessSP_bo
- Tightening Optimality gap with confidence through conformal prediction
Miao Li, Michael Klamkin, Russell Bent, Pascal Van Hentenryck
arxiv.org/abs/2503.04071 mastoxiv.page/@arXiv_statML_bo
- SEED: Towards More Accurate Semantic Evaluation for Visual Brain Decoding
Juhyeon Park, Peter Yongho Kim, Jiook Cha, Shinjae Yoo, Taesup Moon
arxiv.org/abs/2503.06437 mastoxiv.page/@arXiv_csCV_bot/
- How much does context affect the accuracy of AI health advice?
Prashant Garg, Thiemo Fetzer
arxiv.org/abs/2504.18310 mastoxiv.page/@arXiv_econGN_bo
- Reproducing and Improving CheXNet: Deep Learning for Chest X-ray Disease Classification
Daniel J. Strick, Carlos Garcia, Anthony Huang, Thomas Gardos
arxiv.org/abs/2505.06646 mastoxiv.page/@arXiv_eessIV_bo
- Sharp Gaussian approximations for Decentralized Federated Learning
Soham Bonnerjee, Sayar Karmakar, Wei Biao Wu
arxiv.org/abs/2505.08125 mastoxiv.page/@arXiv_statML_bo
- HoloLLM: Multisensory Foundation Model for Language-Grounded Human Sensing and Reasoning
Chuhao Zhou, Jianfei Yang
arxiv.org/abs/2505.17645 mastoxiv.page/@arXiv_csCV_bot/
- A Copula Based Supervised Filter for Feature Selection in Diabetes Risk Prediction Using Machine ...
Agnideep Aich, Md Monzur Murshed, Sameera Hewage, Amanda Mayeaux
arxiv.org/abs/2505.22554 mastoxiv.page/@arXiv_statML_bo
- Synthesis of discrete-continuous quantum circuits with multimodal diffusion models
Florian Fürrutter, Zohim Chandani, Ikko Hamamura, Hans J. Briegel, Gorka Muñoz-Gil
arxiv.org/abs/2506.01666 mastoxiv.page/@arXiv_quantph_b

@Techmeme@techhub.social
2025-12-10 00:16:07

Chinese AI startup Z.ai releases the GLM-4.6V series of vision models, with support for native function calling, available in 106B and 9B parameter versions (Carl Franzen/VentureBeat)
venturebeat.com/ai/z-ai-debuts

@arXiv_qbioNC_bot@mastoxiv.page
2025-12-10 08:38:00

Manifolds and Modules: How Function Develops in a Neural Foundation Model
Johannes Bertram, Luciano Dyballa, T. Anderson Keller, Savik Kinger, Steven W. Zucker
arxiv.org/abs/2512.07869 arxiv.org/pdf/2512.07869 arxiv.org/html/2512.07869
arXiv:2512.07869v1 Announce Type: new
Abstract: Foundation models have shown remarkable success in fitting biological visual systems; however, their black-box nature inherently limits their utility for understanding brain function. Here, we peek inside a SOTA foundation model of neural activity (Wang et al., 2025) as a physiologist might, characterizing each 'neuron' based on its temporal response properties to parametric stimuli. We analyze how different stimuli are represented in neural activity space by building decoding manifolds, and we analyze how different neurons are represented in stimulus-response space by building neural encoding manifolds. We find that the different processing stages of the model (i.e., the feedforward encoder, recurrent, and readout modules) each exhibit qualitatively different representational structures in these manifolds. The recurrent module shows a jump in capabilities over the encoder module by 'pushing apart' the representations of different temporal stimulus patterns, while the readout module achieves biological fidelity by using numerous specialized feature maps rather than biologically plausible mechanisms. Overall, we present this work as a study of the inner workings of a prominent neural foundation model, gaining insights into the biological relevance of its internals through the novel analysis of its neurons' joint temporal response patterns.
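
A minimal sketch of the two manifold constructions described above: stimuli as points in neural-activity space, and neurons as points in stimulus-response space. The array shapes and the PCA embedding are assumptions for illustration, not the paper's pipeline.

```python
# Decoding manifold: one point per stimulus (flattened population
# response). Encoding manifold: one point per neuron (its joint
# stimulus-time response). Both embedded by PCA.
import numpy as np

rng = np.random.default_rng(0)
resp = rng.standard_normal((50, 200, 40))   # (stimuli, neurons, time bins)

def pca_embed(M, k=3):
    """Rows of M projected onto their top-k principal components."""
    Mc = M - M.mean(axis=0)
    _, _, Vt = np.linalg.svd(Mc, full_matrices=False)
    return Mc @ Vt[:k].T

decoding = pca_embed(resp.reshape(50, -1))                      # (50, 3)
encoding = pca_embed(resp.transpose(1, 0, 2).reshape(200, -1))  # (200, 3)
```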

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:39:51

Does Order Matter: Connecting The Law of Robustness to Robust Generalization
Himadri Mandal, Vishnu Varadarajan, Jaee Ponde, Aritra Das, Mihir More, Debayan Gupta
arxiv.org/abs/2602.20971 arxiv.org/pdf/2602.20971 arxiv.org/html/2602.20971
arXiv:2602.20971v1 Announce Type: new
Abstract: Bubeck and Sellke (2021) pose as an open problem the connection between the law of robustness and robust generalization. The law of robustness states that overparameterization is necessary for models to interpolate robustly; in particular, robust interpolation requires the learned function to be Lipschitz. Robust generalization asks whether small robust training loss implies small robust test loss. We resolve this problem by explicitly connecting the two for arbitrary data distributions. Specifically, we introduce a nontrivial notion of robust generalization error and convert it into a lower bound on the expected Rademacher complexity of the induced robust loss class. Our bounds recover the $\Omega(n^{1/d})$ regime of Wu et al. (2023) and show that, up to constants, robust generalization does not change the order of the Lipschitz constant required for smooth interpolation. We conduct experiments to probe the predicted scaling with dataset size and model capacity, testing whether empirical behavior aligns more closely with the predictions of Bubeck and Sellke (2021) or Wu et al. (2023). For MNIST, we find that the lower-bound Lipschitz constant scales on the order predicted by Wu et al. (2023). Informally, to obtain low robust generalization error, the Lipschitz constant must lie in a range that we bound, and the allowable perturbation radius is linked to the Lipschitz scale.
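
For reference, the law of robustness invoked above can be paraphrased as follows. This is a hedged restatement from memory; the exact constants and regularity conditions are in Bubeck and Sellke (2021).

```latex
% Hedged paraphrase of the Bubeck--Sellke law of robustness: any
% p-parameter model fitting n noisy d-dimensional samples below the
% noise level must have a large Lipschitz constant, so O(1)-Lipschitz
% interpolation needs p on the order of nd parameters.
\[
  \frac{1}{n}\sum_{i=1}^{n} \bigl(f(x_i) - y_i\bigr)^2 \le \sigma^2 - \varepsilon
  \quad \Longrightarrow \quad
  \operatorname{Lip}(f) \;\gtrsim\; \sqrt{\frac{nd}{p}} .
\]
```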

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:36:11

Deep unfolding of MCMC kernels: scalable, modular & explainable GANs for high-dimensional posterior sampling
Jonathan Spence, Tobías I. Liaudat, Konstantinos Zygalakis, Marcelo Pereyra
arxiv.org/abs/2602.20758 arxiv.org/pdf/2602.20758 arxiv.org/html/2602.20758
arXiv:2602.20758v1 Announce Type: new
Abstract: Markov chain Monte Carlo (MCMC) methods are fundamental to Bayesian computation, but can be computationally intensive, especially in high-dimensional settings. Push-forward generative models, such as generative adversarial networks (GANs), variational auto-encoders and normalising flows, offer a computationally efficient alternative for posterior sampling. However, push-forward models are opaque as they lack the modularity of Bayes' theorem, leading to poor generalisation with respect to changes in the likelihood function. In this work, we introduce a novel approach to GAN architecture design by applying deep unfolding to Langevin MCMC algorithms. This paradigm maps fixed-step iterative algorithms onto modular neural networks, yielding architectures that are both flexible and amenable to interpretation. Crucially, our design allows key model parameters to be specified at inference time, offering robustness to changes in the likelihood parameters. We train these unfolded samplers end-to-end using a supervised regularized Wasserstein GAN framework for posterior sampling. Through extensive Bayesian imaging experiments, we demonstrate that our proposed approach achieves high sampling accuracy and excellent computational efficiency, while retaining the physics consistency, adaptability and interpretability of classical MCMC strategies.
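
As a rough illustration of the unfolding idea (a sketch under assumed shapes, not the authors' architecture), each network layer can mirror one unadjusted Langevin step, with the likelihood score passed in at inference time so the likelihood can change without retraining.

```python
# Sketch of deep-unfolded Langevin sampling: each layer mimics
#   x_{k+1} = x_k + step * (likelihood score + learned prior score) + noise.
# Layer widths and the MLP prior score are illustrative assumptions.
import torch
import torch.nn as nn

class UnfoldedLangevinLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.step = nn.Parameter(torch.tensor(0.1))   # learned step size
        self.prior_score = nn.Sequential(             # learned grad log prior
            nn.Linear(dim, 128), nn.SiLU(), nn.Linear(128, dim))

    def forward(self, x, likelihood_score):
        noise = torch.randn_like(x)
        return (x
                + self.step * likelihood_score(x)
                + self.step * self.prior_score(x)
                + torch.sqrt(2 * self.step.abs()) * noise)

class UnfoldedSampler(nn.Module):
    """A fixed number of unfolded Langevin steps, trained end-to-end."""
    def __init__(self, dim, n_layers=10):
        super().__init__()
        self.layers = nn.ModuleList(
            UnfoldedLangevinLayer(dim) for _ in range(n_layers))

    def forward(self, z, likelihood_score):
        for layer in self.layers:
            z = layer(z, likelihood_score)
        return z
```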

@arXiv_csGT_bot@mastoxiv.page
2025-12-10 07:44:21

The Theory of Strategic Evolution: Games with Endogenous Players and Strategic Replicators
Kevin Vallier
arxiv.org/abs/2512.07901 arxiv.org/pdf/2512.07901 arxiv.org/html/2512.07901
arXiv:2512.07901v1 Announce Type: new
Abstract: This paper develops the Theory of Strategic Evolution, a general model for systems in which the population of players, strategies, and institutional rules evolve together. The theory extends replicator dynamics to settings with endogenous players, multi-level selection, innovation, constitutional change, and meta-governance. The central mathematical object is a Poiesis stack: a hierarchy of strategic layers linked by cross-level gain matrices. Under small-gain conditions, the system admits a global Lyapunov function and satisfies selection, tracking, and stochastic stability results at every finite depth. We prove that the class is closed under block extension, innovation events, heterogeneous utilities, continuous strategy spaces, and constitutional evolution. The closure theorem shows that no new dynamics arise at higher levels and that unrestricted self-modification cannot preserve Lyapunov structure. The theory unifies results from evolutionary game theory, institutional design, innovation dynamics, and constitutional political economy, providing a general mathematical model of long run strategic adaptation.
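
For readers outside evolutionary game theory, the single-layer replicator dynamics the theory generalizes look like this. The rock-paper-scissors payoffs are a standard toy instance; the paper's endogenous-player, multi-level machinery is not reproduced here.

```python
# Toy single-layer replicator dynamics: strategy i grows when its
# fitness exceeds the population average.
import numpy as np

def replicator_step(x, A, dt=0.01):
    f = A @ x                        # fitness of each strategy
    x = x + dt * x * (f - x @ f)     # replicator update
    x = np.clip(x, 0.0, None)
    return x / x.sum()               # stay on the probability simplex

A = np.array([[0.0, 1.0, -1.0],      # rock-paper-scissors payoffs
              [-1.0, 0.0, 1.0],
              [1.0, -1.0, 0.0]])
x = np.array([0.5, 0.3, 0.2])
for _ in range(1000):
    x = replicator_step(x, A)
print(x)  # orbits around the mixed equilibrium (1/3, 1/3, 1/3)
```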

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 12:33:22

Crosslisted article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[1/3]:
- SMaRT: Online Reusable Resource Assignment and an Application to Mediation in the Kenyan Judiciary
Farabi, Pinto, Lu, Ramos-Maqueda, Das, Deeb, Sautmann
arxiv.org/abs/2602.18431 mastoxiv.page/@arXiv_csCY_bot/
- Benchmarking Distilled Language Models: Performance and Efficiency in Resource-Constrained Settings
Sachin Gopal Wani, Eric Page, Ajay Dholakia, David Ellison
arxiv.org/abs/2602.20164 mastoxiv.page/@arXiv_csCL_bot/
- VISION-ICE: Video-based Interpretation and Spatial Identification of Arrhythmia Origins via Neura...
Dorsa EPMoghaddam, Feng Gao, Drew Bernard, Kavya Sinha, Mehdi Razavi, Behnaam Aazhang
arxiv.org/abs/2602.20165 mastoxiv.page/@arXiv_csCV_bot/
- Benchmarking Early Deterioration Prediction Across Hospital-Rich and MCI-Like Emergency Triage Un...
KMA Solaiman, Joshua Sebastian, Karma Tobden
arxiv.org/abs/2602.20168 mastoxiv.page/@arXiv_csCY_bot/
- Cross-Chirality Generalization by Axial Vectors for Hetero-Chiral Protein-Peptide Interaction Design
Yang, Tian, Jia, Zhang, Zheng, Wang, Su, He, Liu, Lan
arxiv.org/abs/2602.20176 mastoxiv.page/@arXiv_qbioBM_bo
- Enhancing Heat Sink Efficiency in MOSFETs using Physics Informed Neural Networks: A Systematic St...
Aniruddha Bora, Isabel K. Alvarez, Julie Chalfant, Chryssostomos Chryssostomidis
arxiv.org/abs/2602.20177 mastoxiv.page/@arXiv_csNE_bot/
- Data-Driven Deep MIMO Detection: Network Architectures and Generalization Analysis
Yongwei Yi, Xinping Yi, Wenjin Wang, Xiao Li, Shi Jin
arxiv.org/abs/2602.20178 mastoxiv.page/@arXiv_eessSP_bo
- OrgFlow: Generative Modeling of Organic Crystal Structures from Molecular Graphs
Mohammadmahdi Vahediahmar, Matthew A. McDonald, Feng Liu
arxiv.org/abs/2602.20195 mastoxiv.page/@arXiv_condmatmt
- KEMP-PIP: A Feature-Fusion Based Approach for Pro-inflammatory Peptide Prediction
Soumik Deb Niloy, Md. Fahmid-Ul-Alam Juboraj, Swakkhar Shatabda
arxiv.org/abs/2602.20198 mastoxiv.page/@arXiv_qbioQM_bo
- Regressor-guided Diffusion Model for De Novo Peptide Sequencing with Explicit Mass Control
Shaorong Chen, Jingbo Zhou, Jun Xia
arxiv.org/abs/2602.20209 mastoxiv.page/@arXiv_qbioQM_bo
- The Sim-to-Real Gap in MRS Quantification: A Systematic Deep Learning Validation for GABA
Zien Ma, S. M. Shermer, Oktay Karakuş, Frank C. Langbein
arxiv.org/abs/2602.20289 mastoxiv.page/@arXiv_eessSP_bo
- Gap-Dependent Bounds for Nearly Minimax Optimal Reinforcement Learning with Linear Function Appro...
Haochen Zhang, Zhong Zheng, Lingzhou Xue
arxiv.org/abs/2602.20297 mastoxiv.page/@arXiv_statML_bo
- Multilevel Determinants of Overweight and Obesity Among U.S. Children Aged 10-17: Comparative Eva...
Joyanta Jyoti Mondal
arxiv.org/abs/2602.20303 mastoxiv.page/@arXiv_csAI_bot/
- An artificial intelligence framework for end-to-end rare disease phenotyping from clinical notes ...
Shyr, Hu, Tinker, Cassini, Byram, Hamid, Fabbri, Wright, Peterson, Bastarache, Xu
arxiv.org/abs/2602.20324 mastoxiv.page/@arXiv_csAI_bot/
- Circuit Tracing in Vision-Language Models: Understanding the Internal Mechanisms of Multimodal Th...
Jingcheng Yang, Tianhu Xiong, Shengyi Qian, Klara Nahrstedt, Mingyuan Wu
arxiv.org/abs/2602.20330 mastoxiv.page/@arXiv_csCV_bot/
- No One Size Fits All: QueryBandits for Hallucination Mitigation
Nicole Cho, William Watson, Alec Koppel, Sumitra Ganesh, Manuela Veloso
arxiv.org/abs/2602.20332 mastoxiv.page/@arXiv_csCL_bot/
- Learning During Detection: Continual Learning for Neural OFDM Receivers via DMRS
Mohanad Obeed, Ming Jian
arxiv.org/abs/2602.20361 mastoxiv.page/@arXiv_csIT_bot/
- Detecting and Mitigating Group Bias in Heterogeneous Treatment Effects
Joel Persson, Jurriën Bakker, Dennis Bohle, Stefan Feuerriegel, Florian von Wangenheim
arxiv.org/abs/2602.20383 mastoxiv.page/@arXiv_statME_bo
- Selecting Optimal Variable Order in Autoregressive Ising Models
Shiba Biswal, Marc Vuffray, Andrey Y. Lokhov
arxiv.org/abs/2602.20394 mastoxiv.page/@arXiv_statML_bo

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:34:50

Regularized Random Fourier Features and Finite Element Reconstruction for Operator Learning in Sobolev Space
Xinyue Yu, Hayden Schaeffer
arxiv.org/abs/2512.17884 arxiv.org/pdf/2512.17884 arxiv.org/html/2512.17884
arXiv:2512.17884v1 Announce Type: new
Abstract: Operator learning is a data-driven approximation of mappings between infinite-dimensional function spaces, such as the solution operators of partial differential equations. Kernel-based operator learning can offer accurate, theoretically justified approximations that require less training than standard methods. However, such methods can become computationally prohibitive for large training sets and can be sensitive to noise. We propose a regularized random Fourier feature (RRFF) approach, coupled with a finite element reconstruction map (RRFF-FEM), for learning operators from noisy data. The method uses random features drawn from multivariate Student's $t$ distributions, together with frequency-weighted Tikhonov regularization that suppresses high-frequency noise. We establish high-probability bounds on the extreme singular values of the associated random feature matrix and show that when the number of features $N$ scales like $m \log m$ with the number of training samples $m$, the system is well-conditioned, which yields estimation and generalization guarantees. Detailed numerical experiments on benchmark PDE problems, including advection, Burgers', Darcy flow, Helmholtz, Navier-Stokes, and structural mechanics, demonstrate that RRFF and RRFF-FEM are robust to noise and achieve improved performance with reduced training time compared to the unregularized random feature model, while maintaining competitive accuracy relative to kernel and neural operator methods.
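
A minimal sketch of the core RRFF recipe as the abstract describes it: Student's t frequencies plus a Tikhonov penalty that grows with frequency magnitude. The cosine feature form, the weighting power, and the toy data are assumptions, not the paper's code.

```python
# Regularized random Fourier features: t-distributed frequencies with
# a frequency-weighted ridge that suppresses high-frequency noise.
import numpy as np

rng = np.random.default_rng(0)

def rrff_fit(X, y, n_features=500, df=3.0, lam=1e-3):
    Omega = rng.standard_t(df, size=(n_features, X.shape[1]))
    b = rng.uniform(0.0, 2 * np.pi, n_features)
    Phi = np.cos(X @ Omega.T + b)                             # random features
    pen = lam * (1.0 + np.linalg.norm(Omega, axis=1) ** 2)    # freq-weighted ridge
    w = np.linalg.solve(Phi.T @ Phi + np.diag(pen), Phi.T @ y)
    return lambda Xq: np.cos(Xq @ Omega.T + b) @ w

# Toy usage: recover sin(x) from noisy 1-D samples.
X = rng.uniform(-3.0, 3.0, (200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
predict = rrff_fit(X, y)
```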

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:32:10

Polyharmonic Cascade
Yuriy N. Bakhvalov
arxiv.org/abs/2512.17671 arxiv.org/pdf/2512.17671 arxiv.org/html/2512.17671
arXiv:2512.17671v1 Announce Type: new
Abstract: This paper presents a deep machine learning architecture, the "polyharmonic cascade" -- a sequence of packages of polyharmonic splines, where each layer is rigorously derived from the theory of random functions and the principles of indifference. This makes it possible to approximate nonlinear functions of arbitrary complexity while preserving global smoothness and a probabilistic interpretation. For the polyharmonic cascade, a training method alternative to gradient descent is proposed: instead of directly optimizing the coefficients, one solves a single global linear system on each batch with respect to the function values at fixed "constellations" of nodes. This yields synchronized updates of all layers, preserves the probabilistic interpretation of individual layers and theoretical consistency with the original model, and scales well: all computations reduce to 2D matrix operations efficiently executed on a GPU. Fast learning without overfitting on MNIST is demonstrated.
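
To make the "single linear solve instead of gradient descent" idea concrete, here is one layer in isolation: a standard polyharmonic (thin-plate) spline fit at fixed nodes. The r^2 log r kernel and the small ridge term are common choices assumed for illustration; the affine term of a full thin-plate spline is omitted.

```python
# One polyharmonic-spline layer fit by a single global linear solve,
# the training primitive the abstract describes.
import numpy as np

def tps_kernel(r):
    """Polyharmonic kernel r^2 log r (zero at r = 0)."""
    with np.errstate(divide="ignore", invalid="ignore"):
        k = r ** 2 * np.log(r)
    return np.where(r > 0.0, k, 0.0)

def fit_polyharmonic(nodes, values, ridge=1e-10):
    r = np.linalg.norm(nodes[:, None] - nodes[None, :], axis=-1)
    K = tps_kernel(r) + ridge * np.eye(len(nodes))   # global linear system
    coef = np.linalg.solve(K, values)                # one solve, no SGD
    def predict(X):
        rq = np.linalg.norm(X[:, None] - nodes[None, :], axis=-1)
        return tps_kernel(rq) @ coef
    return predict

# Toy usage: interpolate a 2-D function at random nodes.
rng = np.random.default_rng(0)
nodes = rng.uniform(-1.0, 1.0, (30, 2))
f = fit_polyharmonic(nodes, np.sin(nodes[:, 0]) * nodes[:, 1])
```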