2025-10-15 09:49:02
Statistical Guarantees for High-Dimensional Stochastic Gradient Descent
Jiaqi Li, Zhipeng Lou, Johannes Schmidt-Hieber, Wei Biao Wu
https://arxiv.org/abs/2510.12013
Correlating Cross-Iteration Noise for DP-SGD using Model Curvature
Xin Gu, Yingtai Xiao, Guanlin He, Jiamu Bai, Daniel Kifer, Kiwan Maeng
https://arxiv.org/abs/2510.05416
S-D-RSM: Stochastic Distributed Regularized Splitting Method for Large-Scale Convex Optimization Problems
Maoran Wang, Xingju Cai, Yongxin Chen
https://arxiv.org/abs/2511.10133 https://arxiv.org/pdf/2511.10133 https://arxiv.org/html/2511.10133
arXiv:2511.10133v1 Announce Type: new
Abstract: This paper investigates large-scale distributed composite convex optimization problems, motivated by a broad range of applications including multi-agent systems, federated learning, smart grids, wireless sensor networks, and compressed sensing. Stochastic gradient descent (SGD) and its variants are commonly employed to solve such problems. However, existing algorithms often rely on vanishing step sizes, strong convexity assumptions, or substantial computational overhead to ensure convergence or obtain favorable complexity. To bridge the gap between theory and practice, we integrate consensus optimization and operator splitting techniques (see Problem Reformulation) to develop a novel stochastic splitting algorithm, termed the \emph{stochastic distributed regularized splitting method} (S-D-RSM). In practice, S-D-RSM performs parallel updates of proximal mappings and gradient information for only a randomly selected subset of agents at each iteration. By introducing regularization terms, it effectively mitigates consensus discrepancies among distributed nodes. In contrast to conventional stochastic methods, our theoretical analysis establishes that S-D-RSM achieves global convergence without requiring diminishing step sizes or strong convexity assumptions. Furthermore, it achieves an iteration complexity of $\mathcal{O}(1/\epsilon)$ with respect to both the objective function value and the consensus error. Numerical experiments show that S-D-RSM achieves up to 2--3$\times$ speedup over state-of-the-art baselines, while maintaining comparable or better accuracy. These results not only validate the algorithm's theoretical guarantees but also demonstrate its effectiveness in practical tasks such as compressed sensing and empirical risk minimization.
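The abstract describes the algorithm's ingredients (random agent activation, parallel proximal/gradient updates, a regularization term that damps consensus error) but not the exact update rules. The Python snippet below is a minimal illustrative sketch of such a scheme under those assumptions; the function names (sdrsm_sketch, prox_l1), the specific update order, and the step/regularization parameters are placeholders, not the paper's S-D-RSM.

```python
import numpy as np

def prox_l1(v, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def sdrsm_sketch(grads, prox_ops, dim, n_agents, step=0.05, reg=1.0,
                 subset_size=2, n_iters=500, seed=None):
    """Illustrative stochastic distributed splitting loop (not the paper's exact updates).

    grads[i](x)   -- gradient of the smooth part f_i at the local iterate x
    prox_ops[i]   -- proximal mapping (v, step) -> argmin of g_i plus quadratic
    Only a random subset of agents refreshes its gradient/proximal update each
    iteration; a quadratic regularization term pulls local iterates toward the
    running consensus average.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros((n_agents, dim))          # local copies, one row per agent
    x_bar = np.zeros(dim)                  # consensus estimate
    for _ in range(n_iters):
        active = rng.choice(n_agents, size=subset_size, replace=False)
        for i in active:                   # parallel across agents in practice
            # gradient step on f_i plus pull toward the consensus average
            v = x[i] - step * (grads[i](x[i]) + reg * (x[i] - x_bar))
            # proximal step on the nonsmooth part g_i
            x[i] = prox_ops[i](v, step)
        x_bar = x.mean(axis=0)             # refresh the consensus estimate
    return x_bar

# Hypothetical usage: distributed lasso, f_i(x) = 0.5||A_i x - b_i||^2, g_i = 0.1||x||_1
rng = np.random.default_rng(0)
A = [rng.standard_normal((20, 5)) for _ in range(4)]
x_true = np.array([1.0, 0.0, 0.0, -2.0, 0.0])
b = [Ai @ x_true for Ai in A]
grads = [lambda x, Ai=Ai, bi=bi: Ai.T @ (Ai @ x - bi) for Ai, bi in zip(A, b)]
prox_ops = [lambda v, t: prox_l1(v, 0.1 * t) for _ in range(4)]
print(sdrsm_sketch(grads, prox_ops, dim=5, n_agents=4, step=0.02, n_iters=500))
```

The sketch only mirrors the structure claimed in the abstract, namely partial agent activation with a fixed step size and a regularized consensus pull; the paper's convergence guarantees apply to its own update rules, not to this toy loop.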
Quantitative Convergence Analysis of Projected Stochastic Gradient Descent for Non-Convex Losses via the Goldstein Subdifferential
Yuping Zheng, Andrew Lamperski
https://arxiv.org/abs/2510.02735
Effective continuous equations for adaptive SGD: a stochastic analysis view
Luca Callisti, Marco Romito, Francesco Triggiano
https://arxiv.org/abs/2509.21614
Federated Learning of Quantile Inference under Local Differential Privacy
Leheng Cai, Qirui Hu, Shuyuan Wu
https://arxiv.org/abs/2509.21800
DIVEBATCH: Accelerating Model Training Through Gradient-Diversity Aware Batch Size Adaptation
Yuen Chen, Yian Wang, Hari Sundaram
https://arxiv.org/abs/2509.16173
Risk Comparisons in Linear Regression: Implicit Regularization Dominates Explicit Regularization
Jingfeng Wu, Peter L. Bartlett, Jason D. Lee, Sham M. Kakade, Bin Yu
https://arxiv.org/abs/2509.17251