Tootfinder

Opt-in global Mastodon full-text search. Join the index!

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:45:01

Statistical Query Lower Bounds for Smoothed Agnostic Learning
Ilias Diakonikolas, Daniel M. Kane
arxiv.org/abs/2602.21191 arxiv.org/pdf/2602.21191 arxiv.org/html/2602.21191
arXiv:2602.21191v1 Announce Type: new
Abstract: We study the complexity of smoothed agnostic learning, recently introduced by \cite{CKKMS24}, in which the learner competes with the best classifier in a target class under slight Gaussian perturbations of the inputs. Specifically, we focus on the prototypical task of agnostically learning halfspaces under subgaussian distributions in the smoothed model. The best known upper bound for this problem relies on $L_1$-polynomial regression and has complexity $d^{\tilde{O}(1/\sigma^2) \log(1/\epsilon)}$, where $\sigma$ is the smoothing parameter and $\epsilon$ is the excess error. Our main result is a Statistical Query (SQ) lower bound providing formal evidence that this upper bound is close to best possible. In more detail, we show that (even for Gaussian marginals) any SQ algorithm for smoothed agnostic learning of halfspaces requires complexity $d^{\Omega(1/\sigma^{2} \log(1/\epsilon))}$. This is the first non-trivial lower bound on the complexity of this task and nearly matches the known upper bound. Roughly speaking, we show that applying $L_1$-polynomial regression to a smoothed version of the function is essentially best possible. Our techniques involve finding a moment-matching hard distribution by way of linear programming duality. This dual program corresponds exactly to finding a low-degree approximating polynomial to the smoothed version of the target function (which turns out to be the same condition required for the $L_1$-polynomial regression to work). Our explicit SQ lower bound then comes from proving lower bounds on this approximation degree for the class of halfspaces.
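
For intuition, here is a minimal sketch of the $L_1$-polynomial regression upper-bound algorithm the abstract refers to: fit a low-degree polynomial under absolute loss, then threshold. The degree, the IRLS solver, and the assumption of +/-1 labels are illustrative choices, not the paper's implementation.

```python
# Sketch of L1 polynomial regression: low-degree fit under absolute
# loss (approximated by IRLS), then an error-minimizing threshold.
import numpy as np
from itertools import combinations_with_replacement

def poly_features(X, degree):
    """All monomials of total degree <= degree, constant included."""
    n, d = X.shape
    cols = [np.ones(n)]
    for deg in range(1, degree + 1):
        for idx in combinations_with_replacement(range(d), deg):
            cols.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(cols)

def l1_poly_regression(X, y, degree, iters=100):
    Phi = poly_features(X, degree)
    w = np.linalg.lstsq(Phi, y, rcond=None)[0]          # L2 warm start
    for _ in range(iters):                              # IRLS for the L1 fit
        sw = 1.0 / np.sqrt(np.abs(Phi @ w - y) + 1e-8)
        w = np.linalg.lstsq(Phi * sw[:, None], y * sw, rcond=None)[0]
    p = Phi @ w
    thresholds = np.unique(p)
    errs = [(np.sign(p - t) != y).mean() for t in thresholds]
    t_best = thresholds[int(np.argmin(errs))]
    return lambda Xq: np.sign(poly_features(Xq, degree) @ w - t_best)
```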

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 16:08:08

Replaced article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[4/6]:
- Neural Proposals, Symbolic Guarantees: Neuro-Symbolic Graph Generation with Hard Constraints
Chuqin Geng, Li Zhang, Mark Zhang, Haolin Ye, Ziyu Zhao, Xujie Si
arxiv.org/abs/2602.16954 mastoxiv.page/@arXiv_csLG_bot/
- Multi-Probe Zero Collision Hash (MPZCH): Mitigating Embedding Collisions and Enhancing Model Fres...
Ziliang Zhao, et al.
arxiv.org/abs/2602.17050 mastoxiv.page/@arXiv_csLG_bot/
- MASPO: Unifying Gradient Utilization, Probability Mass, and Signal Reliability for Robust and Sam...
Fu, Lin, Fang, Zheng, Hu, Shao, Qin, Pan, Zeng, Cai
arxiv.org/abs/2602.17550 mastoxiv.page/@arXiv_csLG_bot/
- A Theoretical Framework for Modular Learning of Robust Generative Models
Corinna Cortes, Mehryar Mohri, Yutao Zhong
arxiv.org/abs/2602.17554 mastoxiv.page/@arXiv_csLG_bot/
- Multi-Round Human-AI Collaboration with User-Specified Requirements
Sima Noorani, Shayan Kiyani, Hamed Hassani, George Pappas
arxiv.org/abs/2602.17646 mastoxiv.page/@arXiv_csLG_bot/
- NEXUS: A compact neural architecture for high-resolution spatiotemporal air quality forecasting i...
Rampunit Kumar, Aditya Maheshwari
arxiv.org/abs/2602.19654 mastoxiv.page/@arXiv_csLG_bot/
- Augmenting Lateral Thinking in Language Models with Humor and Riddle Data for the BRAINTEASER Task
Mina Ghashami, Soumya Smruti Mishra
arxiv.org/abs/2405.10385 mastoxiv.page/@arXiv_csCL_bot/
- Watermarking Language Models with Error Correcting Codes
Patrick Chao, Yan Sun, Edgar Dobriban, Hamed Hassani
arxiv.org/abs/2406.10281 mastoxiv.page/@arXiv_csCR_bot/
- Learning to Control Unknown Strongly Monotone Games
Siddharth Chandak, Ilai Bistritz, Nicholas Bambos
arxiv.org/abs/2407.00575 mastoxiv.page/@arXiv_csMA_bot/
- Classification and reconstruction for single-pixel imaging with classical and quantum neural netw...
Sofya Manko, Dmitry Frolovtsev
arxiv.org/abs/2407.12506 mastoxiv.page/@arXiv_quantph_b
- Statistical Inference for Temporal Difference Learning with Linear Function Approximation
Weichen Wu, Gen Li, Yuting Wei, Alessandro Rinaldo
arxiv.org/abs/2410.16106 mastoxiv.page/@arXiv_statML_bo
- Big data approach to Kazhdan-Lusztig polynomials
Abel Lacabanne, Daniel Tubbenhauer, Pedro Vaz
arxiv.org/abs/2412.01283 mastoxiv.page/@arXiv_mathRT_bo
- MoEMba: A Mamba-based Mixture of Experts for High-Density EMG-based Hand Gesture Recognition
Mehran Shabanpour, Kasra Rad, Sadaf Khademi, Arash Mohammadi
arxiv.org/abs/2502.17457 mastoxiv.page/@arXiv_eessSP_bo
- Tightening Optimality gap with confidence through conformal prediction
Miao Li, Michael Klamkin, Russell Bent, Pascal Van Hentenryck
arxiv.org/abs/2503.04071 mastoxiv.page/@arXiv_statML_bo
- SEED: Towards More Accurate Semantic Evaluation for Visual Brain Decoding
Juhyeon Park, Peter Yongho Kim, Jiook Cha, Shinjae Yoo, Taesup Moon
arxiv.org/abs/2503.06437 mastoxiv.page/@arXiv_csCV_bot/
- How much does context affect the accuracy of AI health advice?
Prashant Garg, Thiemo Fetzer
arxiv.org/abs/2504.18310 mastoxiv.page/@arXiv_econGN_bo
- Reproducing and Improving CheXNet: Deep Learning for Chest X-ray Disease Classification
Daniel J. Strick, Carlos Garcia, Anthony Huang, Thomas Gardos
arxiv.org/abs/2505.06646 mastoxiv.page/@arXiv_eessIV_bo
- Sharp Gaussian approximations for Decentralized Federated Learning
Soham Bonnerjee, Sayar Karmakar, Wei Biao Wu
arxiv.org/abs/2505.08125 mastoxiv.page/@arXiv_statML_bo
- HoloLLM: Multisensory Foundation Model for Language-Grounded Human Sensing and Reasoning
Chuhao Zhou, Jianfei Yang
arxiv.org/abs/2505.17645 mastoxiv.page/@arXiv_csCV_bot/
- A Copula Based Supervised Filter for Feature Selection in Diabetes Risk Prediction Using Machine ...
Agnideep Aich, Md Monzur Murshed, Sameera Hewage, Amanda Mayeaux
arxiv.org/abs/2505.22554 mastoxiv.page/@arXiv_statML_bo
- Synthesis of discrete-continuous quantum circuits with multimodal diffusion models
Florian Fürrutter, Zohim Chandani, Ikko Hamamura, Hans J. Briegel, Gorka Muñoz-Gil
arxiv.org/abs/2506.01666 mastoxiv.page/@arXiv_quantph_b

@Techmeme@techhub.social
2025-12-10 00:16:07

Chinese AI startup Z.ai releases the GLM-4.6V series of vision models, with support for native function calling, available in 106B and 9B parameter versions (Carl Franzen/VentureBeat)
venturebeat.com/ai/z-ai-debuts

@arXiv_qbioNC_bot@mastoxiv.page
2025-12-10 08:38:00

Manifolds and Modules: How Function Develops in a Neural Foundation Model
Johannes Bertram, Luciano Dyballa, T. Anderson Keller, Savik Kinger, Steven W. Zucker
arxiv.org/abs/2512.07869 arxiv.org/pdf/2512.07869 arxiv.org/html/2512.07869
arXiv:2512.07869v1 Announce Type: new
Abstract: Foundation models have shown remarkable success in fitting biological visual systems; however, their black-box nature inherently limits their utility for understanding brain function. Here, we peek inside a SOTA foundation model of neural activity (Wang et al., 2025) as a physiologist might, characterizing each 'neuron' based on its temporal response properties to parametric stimuli. We analyze how different stimuli are represented in neural activity space by building decoding manifolds, and we analyze how different neurons are represented in stimulus-response space by building neural encoding manifolds. We find that the different processing stages of the model (i.e., the feedforward encoder, recurrent, and readout modules) each exhibit qualitatively different representational structures in these manifolds. The recurrent module shows a jump in capabilities over the encoder module by 'pushing apart' the representations of different temporal stimulus patterns, while the readout module achieves biological fidelity by using numerous specialized feature maps rather than biologically plausible mechanisms. Overall, we present this work as a study of the inner workings of a prominent neural foundation model, gaining insights into the biological relevance of its internals through the novel analysis of its neurons' joint temporal response patterns.
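
A minimal sketch of the two manifold constructions described above: stimuli as points in neural-activity space, and neurons as points in stimulus-response space. The array shapes and the PCA embedding are assumptions for illustration, not the paper's pipeline.

```python
# Decoding manifold: one point per stimulus (flattened population
# response). Encoding manifold: one point per neuron (its joint
# stimulus-time response). Both embedded by PCA.
import numpy as np

rng = np.random.default_rng(0)
resp = rng.standard_normal((50, 200, 40))   # (stimuli, neurons, time bins)

def pca_embed(M, k=3):
    """Rows of M projected onto their top-k principal components."""
    Mc = M - M.mean(axis=0)
    _, _, Vt = np.linalg.svd(Mc, full_matrices=False)
    return Mc @ Vt[:k].T

decoding = pca_embed(resp.reshape(50, -1))                      # (50, 3)
encoding = pca_embed(resp.transpose(1, 0, 2).reshape(200, -1))  # (200, 3)
```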

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:39:51

Does Order Matter: Connecting The Law of Robustness to Robust Generalization
Himadri Mandal, Vishnu Varadarajan, Jaee Ponde, Aritra Das, Mihir More, Debayan Gupta
arxiv.org/abs/2602.20971 arxiv.org/pdf/2602.20971 arxiv.org/html/2602.20971
arXiv:2602.20971v1 Announce Type: new
Abstract: Bubeck and Sellke (2021) pose as an open problem the connection between the law of robustness and robust generalization. The law of robustness states that overparameterization is necessary for models to interpolate robustly; in particular, robust interpolation requires the learned function to be Lipschitz. Robust generalization asks whether small robust training loss implies small robust test loss. We resolve this problem by explicitly connecting the two for arbitrary data distributions. Specifically, we introduce a nontrivial notion of robust generalization error and convert it into a lower bound on the expected Rademacher complexity of the induced robust loss class. Our bounds recover the $\Omega(n^{1/d})$ regime of Wu et al. (2023) and show that, up to constants, robust generalization does not change the order of the Lipschitz constant required for smooth interpolation. We conduct experiments to probe the predicted scaling with dataset size and model capacity, testing whether empirical behavior aligns more closely with the predictions of Bubeck and Sellke (2021) or Wu et al. (2023). For MNIST, we find that the lower-bound Lipschitz constant scales on the order predicted by Wu et al. (2023). Informally, to obtain low robust generalization error, the Lipschitz constant must lie in a range that we bound, and the allowable perturbation radius is linked to the Lipschitz scale.
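
For reference, the law of robustness invoked above can be paraphrased as follows. This is a hedged restatement from memory; the exact constants and regularity conditions are in Bubeck and Sellke (2021).

```latex
% Hedged paraphrase of the Bubeck--Sellke law of robustness: any
% p-parameter model fitting n noisy d-dimensional samples below the
% noise level must have a large Lipschitz constant, so O(1)-Lipschitz
% interpolation needs p on the order of nd parameters.
\[
  \frac{1}{n}\sum_{i=1}^{n} \bigl(f(x_i) - y_i\bigr)^2 \le \sigma^2 - \varepsilon
  \quad \Longrightarrow \quad
  \operatorname{Lip}(f) \;\gtrsim\; \sqrt{\frac{nd}{p}} .
\]
```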

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:36:11

Deep unfolding of MCMC kernels: scalable, modular & explainable GANs for high-dimensional posterior sampling
Jonathan Spence, Tobías I. Liaudat, Konstantinos Zygalakis, Marcelo Pereyra
arxiv.org/abs/2602.20758 arxiv.org/pdf/2602.20758 arxiv.org/html/2602.20758
arXiv:2602.20758v1 Announce Type: new
Abstract: Markov chain Monte Carlo (MCMC) methods are fundamental to Bayesian computation, but can be computationally intensive, especially in high-dimensional settings. Push-forward generative models, such as generative adversarial networks (GANs), variational auto-encoders and normalising flows, offer a computationally efficient alternative for posterior sampling. However, push-forward models are opaque as they lack the modularity of Bayes' theorem, leading to poor generalisation with respect to changes in the likelihood function. In this work, we introduce a novel approach to GAN architecture design by applying deep unfolding to Langevin MCMC algorithms. This paradigm maps fixed-step iterative algorithms onto modular neural networks, yielding architectures that are both flexible and amenable to interpretation. Crucially, our design allows key model parameters to be specified at inference time, offering robustness to changes in the likelihood parameters. We train these unfolded samplers end-to-end using a supervised regularized Wasserstein GAN framework for posterior sampling. Through extensive Bayesian imaging experiments, we demonstrate that our proposed approach achieves high sampling accuracy and excellent computational efficiency, while retaining the physics consistency, adaptability and interpretability of classical MCMC strategies.
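
As a rough illustration of the unfolding idea (a sketch under assumed shapes, not the authors' architecture), each network layer can mirror one unadjusted Langevin step, with the likelihood score passed in at inference time so the likelihood can change without retraining.

```python
# Sketch of deep-unfolded Langevin sampling: each layer mimics
#   x_{k+1} = x_k + step * (likelihood score + learned prior score) + noise.
# Layer widths and the MLP prior score are illustrative assumptions.
import torch
import torch.nn as nn

class UnfoldedLangevinLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.step = nn.Parameter(torch.tensor(0.1))   # learned step size
        self.prior_score = nn.Sequential(             # learned grad log prior
            nn.Linear(dim, 128), nn.SiLU(), nn.Linear(128, dim))

    def forward(self, x, likelihood_score):
        noise = torch.randn_like(x)
        return (x
                + self.step * likelihood_score(x)
                + self.step * self.prior_score(x)
                + torch.sqrt(2 * self.step.abs()) * noise)

class UnfoldedSampler(nn.Module):
    """A fixed number of unfolded Langevin steps, trained end-to-end."""
    def __init__(self, dim, n_layers=10):
        super().__init__()
        self.layers = nn.ModuleList(
            UnfoldedLangevinLayer(dim) for _ in range(n_layers))

    def forward(self, z, likelihood_score):
        for layer in self.layers:
            z = layer(z, likelihood_score)
        return z
```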

@arXiv_csGT_bot@mastoxiv.page
2025-12-10 07:44:21

The Theory of Strategic Evolution: Games with Endogenous Players and Strategic Replicators
Kevin Vallier
arxiv.org/abs/2512.07901 arxiv.org/pdf/2512.07901 arxiv.org/html/2512.07901
arXiv:2512.07901v1 Announce Type: new
Abstract: This paper develops the Theory of Strategic Evolution, a general model for systems in which the population of players, strategies, and institutional rules evolve together. The theory extends replicator dynamics to settings with endogenous players, multi-level selection, innovation, constitutional change, and meta-governance. The central mathematical object is a Poiesis stack: a hierarchy of strategic layers linked by cross-level gain matrices. Under small-gain conditions, the system admits a global Lyapunov function and satisfies selection, tracking, and stochastic stability results at every finite depth. We prove that the class is closed under block extension, innovation events, heterogeneous utilities, continuous strategy spaces, and constitutional evolution. The closure theorem shows that no new dynamics arise at higher levels and that unrestricted self-modification cannot preserve Lyapunov structure. The theory unifies results from evolutionary game theory, institutional design, innovation dynamics, and constitutional political economy, providing a general mathematical model of long run strategic adaptation.
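
For readers outside evolutionary game theory, the single-layer replicator dynamics the theory generalizes look like this. The rock-paper-scissors payoffs are a standard toy instance; the paper's endogenous-player, multi-level machinery is not reproduced here.

```python
# Toy single-layer replicator dynamics: strategy i grows when its
# fitness exceeds the population average.
import numpy as np

def replicator_step(x, A, dt=0.01):
    f = A @ x                        # fitness of each strategy
    x = x + dt * x * (f - x @ f)     # replicator update
    x = np.clip(x, 0.0, None)
    return x / x.sum()               # stay on the probability simplex

A = np.array([[0.0, 1.0, -1.0],      # rock-paper-scissors payoffs
              [-1.0, 0.0, 1.0],
              [1.0, -1.0, 0.0]])
x = np.array([0.5, 0.3, 0.2])
for _ in range(1000):
    x = replicator_step(x, A)
print(x)  # orbits around the mixed equilibrium (1/3, 1/3, 1/3)
```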

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 12:33:22

Crosslisted article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[1/3]:
- SMaRT: Online Reusable Resource Assignment and an Application to Mediation in the Kenyan Judiciary
Farabi, Pinto, Lu, Ramos-Maqueda, Das, Deeb, Sautmann
arxiv.org/abs/2602.18431 mastoxiv.page/@arXiv_csCY_bot/
- Benchmarking Distilled Language Models: Performance and Efficiency in Resource-Constrained Settings
Sachin Gopal Wani, Eric Page, Ajay Dholakia, David Ellison
arxiv.org/abs/2602.20164 mastoxiv.page/@arXiv_csCL_bot/
- VISION-ICE: Video-based Interpretation and Spatial Identification of Arrhythmia Origins via Neura...
Dorsa EPMoghaddam, Feng Gao, Drew Bernard, Kavya Sinha, Mehdi Razavi, Behnaam Aazhang
arxiv.org/abs/2602.20165 mastoxiv.page/@arXiv_csCV_bot/
- Benchmarking Early Deterioration Prediction Across Hospital-Rich and MCI-Like Emergency Triage Un...
KMA Solaiman, Joshua Sebastian, Karma Tobden
arxiv.org/abs/2602.20168 mastoxiv.page/@arXiv_csCY_bot/
- Cross-Chirality Generalization by Axial Vectors for Hetero-Chiral Protein-Peptide Interaction Design
Yang, Tian, Jia, Zhang, Zheng, Wang, Su, He, Liu, Lan
arxiv.org/abs/2602.20176 mastoxiv.page/@arXiv_qbioBM_bo
- Enhancing Heat Sink Efficiency in MOSFETs using Physics Informed Neural Networks: A Systematic St...
Aniruddha Bora, Isabel K. Alvarez, Julie Chalfant, Chryssostomos Chryssostomidis
arxiv.org/abs/2602.20177 mastoxiv.page/@arXiv_csNE_bot/
- Data-Driven Deep MIMO Detection: Network Architectures and Generalization Analysis
Yongwei Yi, Xinping Yi, Wenjin Wang, Xiao Li, Shi Jin
arxiv.org/abs/2602.20178 mastoxiv.page/@arXiv_eessSP_bo
- OrgFlow: Generative Modeling of Organic Crystal Structures from Molecular Graphs
Mohammadmahdi Vahediahmar, Matthew A. McDonald, Feng Liu
arxiv.org/abs/2602.20195 mastoxiv.page/@arXiv_condmatmt
- KEMP-PIP: A Feature-Fusion Based Approach for Pro-inflammatory Peptide Prediction
Soumik Deb Niloy, Md. Fahmid-Ul-Alam Juboraj, Swakkhar Shatabda
arxiv.org/abs/2602.20198 mastoxiv.page/@arXiv_qbioQM_bo
- Regressor-guided Diffusion Model for De Novo Peptide Sequencing with Explicit Mass Control
Shaorong Chen, Jingbo Zhou, Jun Xia
arxiv.org/abs/2602.20209 mastoxiv.page/@arXiv_qbioQM_bo
- The Sim-to-Real Gap in MRS Quantification: A Systematic Deep Learning Validation for GABA
Zien Ma, S. M. Shermer, Oktay Karakuş, Frank C. Langbein
arxiv.org/abs/2602.20289 mastoxiv.page/@arXiv_eessSP_bo
- Gap-Dependent Bounds for Nearly Minimax Optimal Reinforcement Learning with Linear Function Appro...
Haochen Zhang, Zhong Zheng, Lingzhou Xue
arxiv.org/abs/2602.20297 mastoxiv.page/@arXiv_statML_bo
- Multilevel Determinants of Overweight and Obesity Among U.S. Children Aged 10-17: Comparative Eva...
Joyanta Jyoti Mondal
arxiv.org/abs/2602.20303 mastoxiv.page/@arXiv_csAI_bot/
- An artificial intelligence framework for end-to-end rare disease phenotyping from clinical notes ...
Shyr, Hu, Tinker, Cassini, Byram, Hamid, Fabbri, Wright, Peterson, Bastarache, Xu
arxiv.org/abs/2602.20324 mastoxiv.page/@arXiv_csAI_bot/
- Circuit Tracing in Vision-Language Models: Understanding the Internal Mechanisms of Multimodal Th...
Jingcheng Yang, Tianhu Xiong, Shengyi Qian, Klara Nahrstedt, Mingyuan Wu
arxiv.org/abs/2602.20330 mastoxiv.page/@arXiv_csCV_bot/
- No One Size Fits All: QueryBandits for Hallucination Mitigation
Nicole Cho, William Watson, Alec Koppel, Sumitra Ganesh, Manuela Veloso
arxiv.org/abs/2602.20332 mastoxiv.page/@arXiv_csCL_bot/
- Learning During Detection: Continual Learning for Neural OFDM Receivers via DMRS
Mohanad Obeed, Ming Jian
arxiv.org/abs/2602.20361 mastoxiv.page/@arXiv_csIT_bot/
- Detecting and Mitigating Group Bias in Heterogeneous Treatment Effects
Joel Persson, Jurriën Bakker, Dennis Bohle, Stefan Feuerriegel, Florian von Wangenheim
arxiv.org/abs/2602.20383 mastoxiv.page/@arXiv_statME_bo
- Selecting Optimal Variable Order in Autoregressive Ising Models
Shiba Biswal, Marc Vuffray, Andrey Y. Lokhov
arxiv.org/abs/2602.20394 mastoxiv.page/@arXiv_statML_bo

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:34:50

Regularized Random Fourier Features and Finite Element Reconstruction for Operator Learning in Sobolev Space
Xinyue Yu, Hayden Schaeffer
arxiv.org/abs/2512.17884 arxiv.org/pdf/2512.17884 arxiv.org/html/2512.17884
arXiv:2512.17884v1 Announce Type: new
Abstract: Operator learning is a data-driven approximation of mappings between infinite-dimensional function spaces, such as the solution operators of partial differential equations. Kernel-based operator learning can offer accurate, theoretically justified approximations that require less training than standard methods. However, such methods can become computationally prohibitive for large training sets and can be sensitive to noise. We propose a regularized random Fourier feature (RRFF) approach, coupled with a finite element reconstruction map (RRFF-FEM), for learning operators from noisy data. The method uses random features drawn from multivariate Student's $t$ distributions, together with frequency-weighted Tikhonov regularization that suppresses high-frequency noise. We establish high-probability bounds on the extreme singular values of the associated random feature matrix and show that when the number of features $N$ scales like $m \log m$ with the number of training samples $m$, the system is well-conditioned, which yields estimation and generalization guarantees. Detailed numerical experiments on benchmark PDE problems, including advection, Burgers', Darcy flow, Helmholtz, Navier-Stokes, and structural mechanics, demonstrate that RRFF and RRFF-FEM are robust to noise and achieve improved performance with reduced training time compared to the unregularized random feature model, while maintaining competitive accuracy relative to kernel and neural operator methods.
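
A minimal sketch of the core RRFF recipe as the abstract describes it: Student's t frequencies plus a Tikhonov penalty that grows with frequency magnitude. The cosine feature form, the weighting power, and the toy data are assumptions, not the paper's code.

```python
# Regularized random Fourier features: t-distributed frequencies with
# a frequency-weighted ridge that suppresses high-frequency noise.
import numpy as np

rng = np.random.default_rng(0)

def rrff_fit(X, y, n_features=500, df=3.0, lam=1e-3):
    Omega = rng.standard_t(df, size=(n_features, X.shape[1]))
    b = rng.uniform(0.0, 2 * np.pi, n_features)
    Phi = np.cos(X @ Omega.T + b)                             # random features
    pen = lam * (1.0 + np.linalg.norm(Omega, axis=1) ** 2)    # freq-weighted ridge
    w = np.linalg.solve(Phi.T @ Phi + np.diag(pen), Phi.T @ y)
    return lambda Xq: np.cos(Xq @ Omega.T + b) @ w

# Toy usage: recover sin(x) from noisy 1-D samples.
X = rng.uniform(-3.0, 3.0, (200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
predict = rrff_fit(X, y)
```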

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:32:10

Polyharmonic Cascade
Yuriy N. Bakhvalov
arxiv.org/abs/2512.17671 arxiv.org/pdf/2512.17671 arxiv.org/html/2512.17671
arXiv:2512.17671v1 Announce Type: new
Abstract: This paper presents a deep machine learning architecture, the "polyharmonic cascade" -- a sequence of packages of polyharmonic splines, where each layer is rigorously derived from the theory of random functions and the principles of indifference. This makes it possible to approximate nonlinear functions of arbitrary complexity while preserving global smoothness and a probabilistic interpretation. For the polyharmonic cascade, a training method alternative to gradient descent is proposed: instead of directly optimizing the coefficients, one solves a single global linear system on each batch with respect to the function values at fixed "constellations" of nodes. This yields synchronized updates of all layers, preserves the probabilistic interpretation of individual layers and theoretical consistency with the original model, and scales well: all computations reduce to 2D matrix operations efficiently executed on a GPU. Fast learning without overfitting on MNIST is demonstrated.
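
To make the "single linear solve instead of gradient descent" idea concrete, here is one layer in isolation: a standard polyharmonic (thin-plate) spline fit at fixed nodes. The r^2 log r kernel and the small ridge term are common choices assumed for illustration; the affine term of a full thin-plate spline is omitted.

```python
# One polyharmonic-spline layer fit by a single global linear solve,
# the training primitive the abstract describes.
import numpy as np

def tps_kernel(r):
    """Polyharmonic kernel r^2 log r (zero at r = 0)."""
    with np.errstate(divide="ignore", invalid="ignore"):
        k = r ** 2 * np.log(r)
    return np.where(r > 0.0, k, 0.0)

def fit_polyharmonic(nodes, values, ridge=1e-10):
    r = np.linalg.norm(nodes[:, None] - nodes[None, :], axis=-1)
    K = tps_kernel(r) + ridge * np.eye(len(nodes))   # global linear system
    coef = np.linalg.solve(K, values)                # one solve, no SGD
    def predict(X):
        rq = np.linalg.norm(X[:, None] - nodes[None, :], axis=-1)
        return tps_kernel(rq) @ coef
    return predict

# Toy usage: interpolate a 2-D function at random nodes.
rng = np.random.default_rng(0)
nodes = rng.uniform(-1.0, 1.0, (30, 2))
f = fit_polyharmonic(nodes, np.sin(nodes[:, 0]) * nodes[:, 1])
```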