MAPGD: Multi-Agent Prompt Gradient Descent for Collaborative Prompt Optimization
Yichen Han, Bojun Liu, Zhengpeng Zhou, Guanyu Liu, Zeng Zhang, Yang Yang, Wenli Wang, Isaac N Shi, Yunyan, Lewei He, Tianyu Shi
https://arxiv.org/abs/2509.11361
A Parallelizable Approach for Characterizing NE in Zero-Sum Games After a Linear Number of Iterations of Gradient Descent
Taemin Kim, James P. Bailey
https://arxiv.org/abs/2507.11366
Understanding Outer Optimizers in Local SGD: Learning Rates, Momentum, and Acceleration
Ahmed Khaled, Satyen Kale, Arthur Douillard, Chi Jin, Rob Fergus, Manzil Zaheer
https://arxiv.org/abs/2509.10439
Deep Equilibrium models for Poisson Imaging Inverse problems via Mirror Descent
Christian Daniele, Silvia Villa, Samuel Vaiter, Luca Calatroni
https://arxiv.org/abs/2507.11461
A Differentiable Surrogate Model for the Generation of Radio Pulses from In-Ice Neutrino Interactions
Philipp Pilar, Martin Ravn, Christian Glaser, Niklas Wahlström
https://arxiv.org/abs/2509.10274
Discovery of energy landscapes towards optimized quantum transport: Environmental effects and long-range tunneling
Maggie Lawrence, Matthew Pocrnic, Erin Fung, Juan Carrasquilla, Erik M. Gauger, Dvira Segal
https://arxiv.org/abs/2508.09371
Randomized HyperSteiner: A Stochastic Delaunay Triangulation Heuristic for the Hyperbolic Steiner Minimal Tree
Aniss Aiman Medbouhi, Alejandro García-Castellanos, Giovanni Luca Marchetti, Daniel Pelt, Erik J Bekkers, Danica Kragic
https://arxiv.org/abs/2510.09328
PLRV-O: Advancing Differentially Private Deep Learning via Privacy Loss Random Variable Optimization
Qin Yang, Nicholas Stout, Meisam Mohammady, Han Wang, Ayesha Samreen, Christopher J Quinn, Yan Yan, Ashish Kundu, Yuan Hong
https://arxiv.org/abs/2509.06264
Projected Gradient Descent for Constrained Decision-Dependent Optimization
Zifan Wang, Changxin Liu, Thomas Parisini, Michael M. Zavlanos, Karl H. Johansson
https://arxiv.org/abs/2508.08856
MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML
Haoyu Dong, Pengkun Zhang, Mingzhe Lu, Yanzhen Shen, Guolin Ke
https://arxiv.org/abs/2509.06806
On the $O(1/T)$ Convergence of Alternating Gradient Descent-Ascent in Bilinear Games
Tianlong Nan, Shuvomoy Das Gupta, Garud Iyengar, Christian Kroer
https://arxiv.org/abs/2510.03855
Randomized coordinate gradient descent almost surely escapes strict saddle points
Ziang Chen, Yingzhou Li, Zihao Li
https://arxiv.org/abs/2508.07535
Objective Soups: Multilingual Multi-Task Modeling for Speech Processing
A F M Saif, Lisha Chen, Xiaodong Cui, Songtao Lu, Brian Kingsbury, Tianyi Chen
https://arxiv.org/abs/2508.09228
Phase diagram and eigenvalue dynamics of stochastic gradient descent in multilayer neural networks
Chanju Park (Swansea University), Biagio Lucini (Queen Mary University of London), Gert Aarts (Swansea University)
https://arxiv.org/abs/2509.01349
Comparative Analysis of Novel NIRMAL Optimizer Against Adam and SGD with Momentum
Nirmal Gaud, Surej Mouli, Preeti Katiyar, Vaduguru Venkata Ramya
https://arxiv.org/abs/2508.04293
Replaced article(s) found for cs.LO. https://arxiv.org/list/cs.LO/new
[1/1]:
- Compact Rule-Based Classifier Learning via Gradient Descent
Javier Fumanal-Idocin, Raquel Fernandez-Peralta, Javier Andreu-Perez
Correlating Cross-Iteration Noise for DP-SGD using Model Curvature
Xin Gu, Yingtai Xiao, Guanlin He, Jiamu Bai, Daniel Kifer, Kiwan Maeng
https://arxiv.org/abs/2510.05416
Towards Fast Option Pricing PDE Solvers Powered by PIELM
Akshay Govind Srinivasan, Anuj Jagannath Said, Sathwik Pentela, Vikas Dwivedi, Balaji Srinivasan
https://arxiv.org/abs/2510.04322
Differentially Private Two-Stage Gradient Descent for Instrumental Variable Regression
Haodong Liang, Yanhao Jin, Krishnakumar Balasubramanian, Lifeng Lai
https://arxiv.org/abs/2509.22794
Linear Convergence of Gradient Descent for Quadratically Regularized Optimal Transport
Alberto González-Sanz, Marcel Nutz, Andrés Riveros Valdevenito
https://arxiv.org/abs/2509.08547
Information Entropy-Based Scheduling for Communication-Efficient Decentralized Learning
Jaiprakash Nagar, Zheng Chen, Marios Kountouris, Photios A. Stavrou
https://arxiv.org/abs/2507.17426
On the Perturbed Projection-Based Distributed Gradient-Descent Algorithm: A Fully-Distributed Adaptive Redesign
Tarek Bazizi, Mohamed Maghenem, Paolo Frasca, Antonio Loría, Elena Panteley
https://arxiv.org/abs/2509.03443
Quantitative Convergence Analysis of Projected Stochastic Gradient Descent for Non-Convex Losses via the Goldstein Subdifferential
Yuping Zheng, Andrew Lamperski
https://arxiv.org/abs/2510.02735
GRADSTOP: Early Stopping of Gradient Descent via Posterior Sampling
Arash Jamshidi, Lauri Seppäläinen, Katsiaryna Haitsiukevich, Hoang Phuc Hau Luu, Anton Björklund, Kai Puolamäki
https://arxiv.org/abs/2508.19028
Lightweight Gradient Descent Optimization for Mitigating Hardware Imperfections in RIS Systems
Pedro H. C. de Souza (National Institute of Telecommunications), Luiz A. M. Pereira (National Institute of Telecommunications), Faustino R. Gómez (National Institute of Telecommunications), Elsa M. Materón (National Institute of Telecommunications), Jorge Ricardo Mejía-Salazar (National Institute of Telecommunications)
Crosslisted article(s) found for cs.CE. https://arxiv.org/list/cs.CE/new
[1/1]:
- Fast training of accurate physics-informed neural networks without gradient descent
Datar, Kapoor, Chandra, Sun, Bolager, Burak, Veselovska, Fornasier, Dietrich
Harmonized Gradient Descent for Class Imbalanced Data Stream Online Learning
Han Zhou, Hongpeng Yin, Xuanhong Deng, Yuyu Huang, Hao Ren
https://arxiv.org/abs/2508.11353
Replaced article(s) found for stat.ML. https://arxiv.org/list/stat.ML/new
[2/2]:
- Gradient Descent with Large Step Sizes: Chaos and Fractal Convergence Region
Shuang Liang, Guido Montúfar
Towards understanding Accelerated Stein Variational Gradient Flow – Analysis of Generalized Bilinear Kernels for Gaussian target distributions
Viktor Stein, Wuchen Li
https://arxiv.org/abs/2509.04008
Risk Comparisons in Linear Regression: Implicit Regularization Dominates Explicit Regularization
Jingfeng Wu, Peter L. Bartlett, Jason D. Lee, Sham M. Kakade, Bin Yu
https://arxiv.org/abs/2509.17251
Accelerated Gradient Methods with Biased Gradient Estimates: Risk Sensitivity, High-Probability Guarantees, and Large Deviation Bounds
Mert G\"urb\"uzbalaban, Yasa Syed, Necdet Serhat Aybat
https://arxiv.org/abs/2509.13628
A Universal Banach–Bregman Framework for Stochastic Iterations: Unifying Stochastic Mirror Descent, Learning and LLM Training
Johnny R. Zhang (Independent Researcher), Xiaomei Mi (University of Manchester), Gaoyuan Du (Amazon), Qianyi Sun (Microsoft), Shiqi Wang (Meta), Jiaxuan Li (Amazon), Wenhua Zhou (Independent Researcher)
https://arx…
Stochastic optimization powers the scalability of modern artificial intelligence, spanning machine learning, deep learning, reinforcement learning, and large language model training. Yet, existing theory remains largely confined to Hilbert spaces, relying on inner-product frameworks and orthogonality. This paradigm fails to capture non-Euclidean settings, such as mirror descent on simplices, Bregman proximal methods for sparse learning, natural gradient descent in information geometry, or Kullb…
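As a minimal illustration of the first non-Euclidean setting the abstract names, entropic mirror descent on the probability simplex replaces the Euclidean gradient step with a multiplicative update induced by the negative-entropy Bregman divergence. The sketch below is a generic textbook instance in Python (NumPy only), not the paper's framework; the function name and step size are illustrative.

import numpy as np

def mirror_descent_simplex(grad, x0, step=0.5, iters=200):
    # Entropic mirror descent (exponentiated gradient): the negative-entropy
    # Bregman geometry turns the gradient step into a multiplicative update,
    # and renormalization plays the role of projection onto the simplex.
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x * np.exp(-step * grad(x))  # step in dual (log) coordinates
        x /= x.sum()                     # Bregman projection back onto the simplex
    return x

# Toy usage: minimize the linear objective <c, x> over the simplex;
# the iterates concentrate on the smallest coordinate of c.
c = np.array([0.9, 0.1, 0.5])
print(mirror_descent_simplex(lambda x: c, np.ones(3) / 3))  # ~ [0., 1., 0.]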
Escaping Saddle Points via Curvature-Calibrated Perturbations: A Complete Analysis with Explicit Constants and Empirical Validation
Faruk Alpay, Hamdi Alakkad
https://arxiv.org/abs/2508.16540
Universal and Transferable Adversarial Attack on Large Language Models Using Exponentiated Gradient Descent
Sajib Biswas, Mao Nishino, Samuel Jacob Chacko, Xiuwen Liu
https://arxiv.org/abs/2508.14853
From Sublinear to Linear: Fast Convergence in Deep Networks via Locally Polyak-Łojasiewicz Regions
Agnideep Aich, Ashit Baran Aich, Bruce Wade
https://arxiv.org/abs/2507.21429
A Frank-Wolfe Algorithm for Strongly Monotone Variational Inequalities
Reza Rahimi Baghbadorani, Peyman Mohajerin Esfahani, Sergio Grammatico
https://arxiv.org/abs/2510.03842
Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[2/4]:
- Convergence Properties of Natural Gradient Descent for Minimizing KL Divergence
Adwait Datar, Nihat Ay
Norm-Constrained Flows and Sign-Based Optimization: Theory and Algorithms
Valentin Leplat, Sergio Mayorga, Roland Hildebrand, Alexander Gasnikov
https://arxiv.org/abs/2508.18510
Replaced article(s) found for math.OC. https://arxiv.org/list/math.OC/new
[1/1]:
- FastPart: Over-Parameterized Stochastic Gradient Descent for Sparse optimisation on Measures
Yohann De Castro, Sébastien Gadat, Clément Marteau
Replaced article(s) found for math.OC. https://arxiv.org/list/math.OC/new
[1/1]:
- The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization
Constantinos Daskalakis, Ioannis Panageas
Active-set Newton-MR methods for nonconvex optimization problems with bound constraints
Ernesto G. Birgin, Geovani N. Grapiglia, Diaulas S. Marcondes
https://arxiv.org/abs/2508.20967
Polyak Stepsize: Estimating Optimal Functional Values Without Parameters or Prior Knowledge
Farshed Abdukhakimov, Cuong Anh Pham, Samuel Horváth, Martin Takáč, Slavomír Hanzely
https://arxiv.org/abs/2508.17288