2025-10-14 11:35:58
Nonlinearly Preconditioned Gradient Methods: Momentum and Stochastic Analysis
Konstantinos Oikonomidis, Jan Quan, Panagiotis Patrinos
https://arxiv.org/abs/2510.11312 https://…
Statistical Guarantees for High-Dimensional Stochastic Gradient Descent
Jiaqi Li, Zhipeng Lou, Johannes Schmidt-Hieber, Wei Biao Wu
https://arxiv.org/abs/2510.12013 https://
Forward and backward error bounds for a mixed precision preconditioned conjugate gradient algorithm
Thomas Bake, Erin Carson, Yuxin Ma
https://arxiv.org/abs/2510.11379 https://
Optimal gradient estimates for conductivity problems with imperfect low-conductivity interfaces
Hongjie Dong, Haigang Li, Yan Zhao
https://arxiv.org/abs/2510.10615 https://
Grad-CL: Source Free Domain Adaptation with Gradient Guided Feature Disalignment
Rini Smita Thakur, Rajeev Ranjan Dwivedi, Vinod K Kurmi
https://arxiv.org/abs/2509.10134 https:/…
Gradient-flowed operator product expansion without IR renormalons
Martin Beneke (TU Munich), Hiromasa Takaura (Kyoto University)
https://arxiv.org/abs/2510.12193 https://…
Gradient-based search of quantum phases: discovering unconventional fractional Chern insulators
André Grossi Fonseca, Eric Wang, Sachin Vaidya, Patrick J. Ledwith, Ashvin Vishwanath, Marin Soljačić
https://arxiv.org/abs/2509.10438
SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models
Chengyu Wang, Paria Rashidinejad, DiJia Su, Song Jiang, Sid Wang, Siyan Zhao, Cai Zhou, Shannon Zejiang Shen, Feiyu Chen, Tommi Jaakkola, Yuandong Tian, Bo Liu
https://arxiv.org/abs/2510.09541
Adaptive Conditional Gradient Descent
Abbas Khademi, Antonio Silveti-Falls
https://arxiv.org/abs/2510.11440 https://arxiv.org/pdf/2510.11440
An Invitation to Obstruction Bundle Gluing Through Morse Flow Lines
Ipsita Datta, Yuan Yao
https://arxiv.org/abs/2510.10393 https://arxiv.org/pdf/2510.1039…
Modified Loss of Momentum Gradient Descent: Fine-Grained Analysis
Matias D. Cattaneo, Boris Shigida
https://arxiv.org/abs/2509.08483 https://arxiv.org/pdf/…
Temporal Variabilities Limit Convergence Rates in Gradient-Based Online Optimization
Bryan Van Scoy, Gianluca Bianchin
https://arxiv.org/abs/2510.12512 https://
Liouville results for $(p,q)$-Laplacian elliptic equations with source terms involving gradient nonlinearities
Mousomi Bhakta, Anup Biswas, Roberta Filippucci
https://arxiv.org/abs/2510.12486
Statistical Benchmarking of Optimization Methods for Variational Quantum Eigensolver under Quantum Noise
Silvie Illésová, Tomáš Bezděk, Vojtěch Novák, Bruno Senjean, Martin Beseda
https://arxiv.org/abs/2510.08727
Reliability Sensitivity with Response Gradient
Siu-Kui Au, Zi-Jun Cao
https://arxiv.org/abs/2510.09315 https://arxiv.org/pdf/2510.09315
Google rolls out its new gradient "G" icon company-wide, saying it "now represents all of Google ... and visually reflects our evolution in the AI era" (Abner Li/9to5Google)
https://9to5google.com/2025/09/29/google-g-gradient-company-icon/
Simple Projection Variants Improve ColBERT Performance
Benjamin Clavié, Sean Lee, Rikiya Takehi, Aamir Shakir, Makoto P. Kato
https://arxiv.org/abs/2510.12327 https://
Locally Permuted Low Rank Column-wise Sensing
Ahmed Ali Abbasi, Namrata Vaswani
https://arxiv.org/abs/2509.09820 https://arxiv.org/pdf/2509.09820
A Gradient Guided Diffusion Framework for Chance Constrained Programming
Boyang Zhang, Zhiguo Wang, Ya-Feng Liu
https://arxiv.org/abs/2510.12238 https://ar…
Predictive Spike Timing Enables Distributed Shortest Path Computation in Spiking Neural Networks
Simen Storesund, Kristian Valset Aars, Robin Dietrich, Nicolai Waniek
https://arxiv.org/abs/2509.10077
A framework for realisable data-driven active flow control using model predictive control applied to a simplified truck wake
Alberto Solera-Rico, Carlos Sanmiguel Vila, Stefano Discetti
https://arxiv.org/abs/2510.11600
From Morse Functions to Lefschetz Fibrations on Cotangent Bundles
Emmanuel Giroux
https://arxiv.org/abs/2510.10669 https://arxiv.org/pdf/2510.10669
A Differentiable Surrogate Model for the Generation of Radio Pulses from In-Ice Neutrino Interactions
Philipp Pilar, Martin Ravn, Christian Glaser, Niklas Wahlström
https://arxiv.org/abs/2509.10274
The Hidden Width of Deep ResNets: Tight Error Bounds and Phase Diagrams
Lénaïc Chizat
https://arxiv.org/abs/2509.10167 https://arxiv.org/pdf/2…
Replaced article(s) found for cs.GR. https://arxiv.org/list/cs.GR/new
[1/1]:
- GASP: A Gradient-Aware Shortest Path Algorithm for Boundary-Confined Visualization of 2-Manifold ...
Sefat E. Rahman, Tushar M. Athawale, Paul Rosen
Building Gradient by Gradient: Decentralised Energy Functions for Bimanual Robot Assembly
Alexander L. Mitchell, Joe Watson, Ingmar Posner
https://arxiv.org/abs/2510.04696 https…
Crosslisted article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[1/3]:
- Gradient-Sign Masking for Task Vector Transport Across Pre-Trained Models
Rinaldi, Panariello, Salici, Liu, Ciccone, Porrello, Calderara
On curvature estimates for four-dimensional gradient Ricci solitons
Huai-Dong Cao
https://arxiv.org/abs/2510.06059 https://arxiv.org/pdf/2510.06059
Gradient-Guided Furthest Point Sampling for Robust Training Set Selection
Morris Trestman, Stefan Gugler, Felix A. Faber, O. A. von Lilienfeld
https://arxiv.org/abs/2510.08906 h…
New Classes of Non-monotone Variational Inequality Problems Solvable via Proximal Gradient on Smooth Gap Functions
Lei Zhao, Daoli Zhu, Shuzhong Zhang
https://arxiv.org/abs/2510.12105
Stable High-Order Vortices in Spin-Orbit-Coupled Spin-1 Bose-Einstein Condensates
Xin-Feng Zhang, Huan-Bo Luo, Josep Batle, Bin Liu, Yongyao Li
https://arxiv.org/abs/2510.09832 …
Rotational radial shear in the low solar photosphere. Direct detection from high-resolution spectro-imaging
T. Corbard (Université Côte d'Azur, Observatoire de la Côte d'Azur, CNRS, Laboratoire Lagrange, Nice, France), M. Faurobert (Université Côte d'Azur, Observatoire de la Côte d'Azur, CNRS, Laboratoire Lagrange, Nice, France), B. Gelly (CNRS-IRL2009, Tenerife, Spain), R. Douet (CNRS-IRL2009, Tenerife, Spain), D. Laforgue (CNRS-IRL2009, Tenerife, S…
SMEC: Rethinking Matryoshka Representation Learning for Retrieval Embedding Compression
Biao Zhang, Lixin Chen, Tong Liu, Bo Zheng
https://arxiv.org/abs/2510.12474 https://
Running man spreads his arms like the wings of an airplane
If you're in a hurry to get more running photos check my Behance at https://www.behance.net/gallery/234496843/Ithaca-5-and-10-2025
Effective Atom Theory: Gradient-Driven ab initio Materials Design
Justin Tahmassebpur, Brandon Li, Boris Barron, Héctor Abruña, Peter Frazier, Tomás Arias
https://arxiv.org/abs/2509.07180
A Computable Measure of Suboptimality for Entropy-Regularised Variational Objectives
Clémentine Chazal, Heishiro Kanagawa, Zheyang Shen, Anna Korba, Chris J. Oates
https://arxiv.org/abs/2509.10393
Thermal gradient-driven skyrmion dynamics with near-zero skyrmion Hall angle
Yogesh Kumar, Hurmal Saren, Pintu Das
https://arxiv.org/abs/2510.07020 https://
GCond: Gradient Conflict Resolution via Accumulation-based Stabilization for Large-Scale Multi-Task Learning
Evgeny Alves Limarenko, Anastasiia Alexandrovna Studenikina
https://arxiv.org/abs/2509.07252
Replaced article(s) found for hep-ph. https://arxiv.org/list/hep-ph/new
[1/2]:
- Simple Gradient Flow Equation for the Bounce Solution
Ryosuke Sato
https://
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[4/8]:
- Boosting Adversarial Transferability via Commonality-Oriented Gradient Optimization
Yanting Gao, Yepeng Liu, Junming Liu, Qi Zhang, Hongyun Zhang, Duoqian Miao, Cairong Zhao
Thermodynamically Consistent Continuum Theory of Magnetic Particles in High-Gradient Fields
Marko Tesanovic, Daniel M. Markiewitz, Marcus L. Popp, Martin Z. Bazant, Sonja Berensmeier
https://arxiv.org/abs/2510.07552
Reading Between the Lines: Towards Reliable Black-box LLM Fingerprinting via Zeroth-order Gradient Estimation
Shuo Shao, Yiming Li, Hongwei Yao, Yifei Chen, Yuchen Yang, Zhan Qin
https://arxiv.org/abs/2510.06605
Active Subspaces in Infinite Dimension
Poorbita Kundu, Nathan Wycoff
https://arxiv.org/abs/2510.11871 https://arxiv.org/pdf/2510.11871
On the maximum bound principle and energy dissipation of exponential time differencing methods for the chiral liquid crystal blue phases
Wenshuai Hu, Guanghua Ji
https://arxiv.org/abs/2510.12499
Infinite Interacting Brownian Motions and EVI Gradient Flows
Kohei Suzuki
https://arxiv.org/abs/2509.06869 https://arxiv.org/pdf/2509.06869
(Adaptive) Scaled gradient methods beyond locally Hölder smoothness: Lyapunov analysis, convergence rate and complexity
Susan Ghaderi, Morteza Rahimi, Yves Moreau, Masoud Ahookhosh
https://arxiv.org/abs/2511.10425 https://arxiv.org/pdf/2511.10425 https://arxiv.org/html/2511.10425
arXiv:2511.10425v1 Announce Type: new
Abstract: This paper addresses the unconstrained minimization of smooth convex functions whose gradients are locally Hölder continuous. We analyze the Scaled Gradient Algorithm (SGA) under these local smoothness assumptions, proving its global convergence and iteration complexity. Furthermore, under local strong convexity and the Kurdyka-Łojasiewicz (KL) inequality, we establish linear convergence rates and provide explicit complexity bounds. In particular, we show that when the gradient is locally Lipschitz continuous, SGA attains linear convergence for any KL exponent. We then introduce and analyze an adaptive variant of SGA (AdaSGA), which automatically adjusts the scaling and step-size parameters; for this method, we show global convergence and derive local linear rates under strong convexity.
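A minimal numpy sketch of the scaled-gradient iteration discussed above, assuming a diagonal scaling and an Armijo backtracking step size; the paper's actual SGA/AdaSGA scaling and step-size rules are not reproduced, and the badly scaled quadratic is purely illustrative.

import numpy as np

def scaled_gradient_descent(f, grad, x0, scale, c=1e-4, shrink=0.5,
                            tol=1e-8, max_iter=500):
    # Scaled gradient step x+ = x + t*d with d = -D(x) * grad(x),
    # t chosen by Armijo backtracking (sufficient decrease).
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        d = -scale(x) * g                      # diagonal preconditioning
        t, fx = 1.0, f(x)
        while f(x + t * d) > fx + c * t * g.dot(d):
            t *= shrink
        x = x + t * d
    return x

# Toy problem: convex quadratic with badly scaled coordinates.
H = np.diag([1.0, 100.0])
f = lambda x: 0.5 * x @ H @ x
grad = lambda x: H @ x
scale = lambda x: 1.0 / np.diag(H)             # ideal scaling for this toy
print(scaled_gradient_descent(f, grad, np.array([1.0, 1.0]), scale))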
Galaxy Metallicity Gradients in the Reionization Epoch from the FIRE-2 Simulations
Xunda Sun, Xin Wang, Fangzhou Jiang, Houjun Mo, Luis C. Ho, Qianqiao Zhou, Xiangcheng Ma, Hu Zhan, Andrew Wetzel, Russell L. Graf, Philip F. Hopkins, Dusan Keres, Jonathan Stern
https://arxiv.org/abs/2510.08997
Argus: JAX state-space filtering for gravitational wave detection with a pulsar timing array
Tom Kimpson, Nicholas J. O'Neill, Patrick M. Meyers, Andrew Melatos
https://arxiv.org/abs/2510.11077
Evidence for easy-plane XY ferromagnetism in heavy-fermion quantum-critical CeRh6Ge4
Riku Yamamoto, Sejun Park, Zachary W. Riedel, Phurba Sherpa, Joe D. Thompson, Filip Ronning, Eric D. Bauer, Adam P. Dioguardi, Michihiro Hirata
https://arxiv.org/abs/2510.12006
Data-Driven Energy Estimation for Virtual Servers Using Combined System Metrics and Machine Learning
Amandip Sangha
https://arxiv.org/abs/2509.09991 https://
Global Convergence of Four-Layer Matrix Factorization under Random Initialization
Minrui Luo, Weihang Xu, Xiang Gao, Maryam Fazel, Simon Shaolei Du
https://arxiv.org/abs/2511.09925 https://arxiv.org/pdf/2511.09925 https://arxiv.org/html/2511.09925
arXiv:2511.09925v1 Announce Type: new
Abstract: Gradient descent dynamics on the deep matrix factorization problem has been extensively studied as a simplified theoretical model for deep neural networks. Although the convergence theory for two-layer matrix factorization is well-established, no global convergence guarantee for general deep matrix factorization under random initialization has been established to date. To address this gap, we provide a polynomial-time global convergence guarantee for randomly initialized gradient descent on four-layer matrix factorization, given certain conditions on the target matrix and a standard balanced regularization term. Our analysis employs new techniques to show saddle-avoidance properties of gradient descent dynamics, and extends previous theories to characterize the change in eigenvalues of layer weights.
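The objective studied in this abstract can be written down directly. Below is a small numpy sketch of randomly initialized gradient descent on a four-layer factorization with a balanced regularization term; the factor shapes, initialization scale, regularization weight, step size, and the loss-decrease safeguard are illustrative assumptions, not the paper's setup.

import numpy as np

rng = np.random.default_rng(0)
n = 8
M = rng.standard_normal((n, n))                # illustrative target matrix
W = [0.3 * rng.standard_normal((n, n)) for _ in range(4)]   # random init
lam, lr = 1.0, 0.01                            # balance weight, step size

def loss(W):
    P = W[3] @ W[2] @ W[1] @ W[0] - M
    bal = sum(np.linalg.norm(W[i+1].T @ W[i+1] - W[i] @ W[i].T)**2
              for i in range(3))
    return 0.5 * np.linalg.norm(P)**2 + 0.25 * lam * bal

for _ in range(5000):
    P = W[3] @ W[2] @ W[1] @ W[0] - M
    # gradient of the fit term 0.5 * ||W4 W3 W2 W1 - M||_F^2 per factor
    g = [(W[3] @ W[2] @ W[1]).T @ P,
         (W[3] @ W[2]).T @ P @ W[0].T,
         W[3].T @ P @ (W[1] @ W[0]).T,
         P @ (W[2] @ W[1] @ W[0]).T]
    # gradient of (lam/4) * sum_i ||W_{i+1}^T W_{i+1} - W_i W_i^T||_F^2
    for i in range(3):
        D = W[i+1].T @ W[i+1] - W[i] @ W[i].T  # symmetric residual
        g[i+1] += lam * (W[i+1] @ D)
        g[i]   -= lam * (D @ W[i])
    W_try = [Wi - lr * gi for Wi, gi in zip(W, g)]
    if loss(W_try) <= loss(W):
        W = W_try
    else:
        lr *= 0.5                              # crude safeguard for the sketch

print("final loss:", loss(W))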
Stability of asymptotically conical gradient Kähler-Ricci expanders
Longteng Chen
https://arxiv.org/abs/2510.06850 https://arxiv.org/pdf/2510.06850
Decoupling pressure gradient history effects in turbulent boundary layers through high-Reynolds number experiments
Ahmad Zarei, Mitchell Lozier, Rahul Deshpande, Ivan Marusic
https://arxiv.org/abs/2509.07545
Understanding Outer Optimizers in Local SGD: Learning Rates, Momentum, and Acceleration
Ahmed Khaled, Satyen Kale, Arthur Douillard, Chi Jin, Rob Fergus, Manzil Zaheer
https://arxiv.org/abs/2509.10439 …
Homogenization of rate-independent elastoplastic spring network models with non-local random fields
Simone Hermann
https://arxiv.org/abs/2509.09872 https://
Boundary-Guided Policy Optimization for Memory-efficient RL of Diffusion Large Language Models
Nianyi Lin, Jiajie Zhang, Lei Hou, Juanzi Li
https://arxiv.org/abs/2510.11683 http…
PLRV-O: Advancing Differentially Private Deep Learning via Privacy Loss Random Variable Optimization
Qin Yang, Nicholas Stout, Meisam Mohammady, Han Wang, Ayesha Samreen, Christopher J Quinn, Yan Yan, Ashish Kundu, Yuan Hong
https://arxiv.org/abs/2509.06264
Low-Discrepancy Set Post-Processing via Gradient Descent
François Clément, Linhang Huang, Woorim Lee, Cole Smidt, Braeden Sodt, Xuan Zhang
https://arxiv.org/abs/2511.10496 https://arxiv.org/pdf/2511.10496 https://arxiv.org/html/2511.10496
arXiv:2511.10496v1 Announce Type: new
Abstract: The construction of low-discrepancy sets, used for uniform sampling and numerical integration, has recently seen great improvements based on optimization and machine learning techniques. However, these methods are computationally expensive, often requiring days of computation or access to GPU clusters. We show that simple gradient descent-based techniques allow for comparable results when starting with a reasonably uniform point set. Not only is this method much more efficient and accessible, but it can be applied as post-processing to any low-discrepancy set generation method for a variety of standard discrepancy measures.
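A hedged illustration of the post-processing idea: starting from a random point set, run gradient descent on the squared L2 star discrepancy, which has a closed form due to Warnock. The paper's exact objective and gradient computation are not reproduced; this sketch uses finite-difference gradients to stay short, which is only viable for small point sets.

import numpy as np

def l2_star_discrepancy_sq(X):
    # Warnock's closed form for the squared L2 star discrepancy in [0,1]^d.
    N, d = X.shape
    t1 = (1.0 / 3.0) ** d
    t2 = -(2.0 ** (1 - d) / N) * np.sum(np.prod(1.0 - X**2, axis=1))
    mx = np.maximum(X[:, None, :], X[None, :, :])   # pairwise coordinate maxima
    t3 = np.sum(np.prod(1.0 - mx, axis=2)) / N**2
    return t1 + t2 + t3

def descend(X, lr=0.5, steps=300, eps=1e-6):
    # Finite-difference gradient descent, projecting back to the unit cube.
    X = X.copy()
    for _ in range(steps):
        base = l2_star_discrepancy_sq(X)
        g = np.zeros_like(X)
        for idx in np.ndindex(*X.shape):            # numeric gradient (small N only)
            Xp = X.copy(); Xp[idx] += eps
            g[idx] = (l2_star_discrepancy_sq(Xp) - base) / eps
        X = np.clip(X - lr * g, 0.0, 1.0)
    return X

rng = np.random.default_rng(1)
X = rng.random((32, 2))                             # a "reasonably uniform" start
print("before:", l2_star_discrepancy_sq(X) ** 0.5)
print("after: ", l2_star_discrepancy_sq(descend(X)) ** 0.5)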
Hybrid Quantum-Classical Policy Gradient for Adaptive Control of Cyber-Physical Systems: A Comparative Study of VQC vs. MLP
Aueaphum Aueawatthanaphisut, Nyi Wunna Tun
https://arxiv.org/abs/2510.06010
Stochastic Gradient Descent for Incomplete Tensor Linear Systems
Anna Ma, Deanna Needell, Alexander Xue
https://arxiv.org/abs/2510.07630 https://arxiv.org/…
Zeroes of Eigenfunctions of Schrödinger Operators after Schwartzman
Willie Wai-Yeung Wong
https://arxiv.org/abs/2509.09739 https://arxiv.org/pdf/250…
Curvature pinching of asymptotically conical gradient expanding Ricci solitons
Huai-Dong Cao, Junming Xie
https://arxiv.org/abs/2510.05075 https://arxiv.or…
S-D-RSM: Stochastic Distributed Regularized Splitting Method for Large-Scale Convex Optimization Problems
Maoran Wang, Xingju Cai, Yongxin Chen
https://arxiv.org/abs/2511.10133 https://arxiv.org/pdf/2511.10133 https://arxiv.org/html/2511.10133
arXiv:2511.10133v1 Announce Type: new
Abstract: This paper investigates large-scale distributed composite convex optimization problems, with motivations from a broad range of applications, including multi-agent systems, federated learning, smart grids, wireless sensor networks, compressed sensing, and so on. Stochastic gradient descent (SGD) and its variants are commonly employed to solve such problems. However, existing algorithms often rely on vanishing step sizes, strong convexity assumptions, or entail substantial computational overhead to ensure convergence or obtain favorable complexity. To bridge the gap between theory and practice, we integrate consensus optimization and operator splitting techniques (see Problem Reformulation) to develop a novel stochastic splitting algorithm, termed the stochastic distributed regularized splitting method (S-D-RSM). In practice, S-D-RSM performs parallel updates of proximal mappings and gradient information for only a randomly selected subset of agents at each iteration. By introducing regularization terms, it effectively mitigates consensus discrepancies among distributed nodes. In contrast to conventional stochastic methods, our theoretical analysis establishes that S-D-RSM achieves global convergence without requiring diminishing step sizes or strong convexity assumptions. Furthermore, it achieves an iteration complexity of $\mathcal{O}(1/\epsilon)$ with respect to both the objective function value and the consensus error. Numerical experiments show that S-D-RSM achieves up to 2–3× speedup compared to state-of-the-art baselines, while maintaining comparable or better accuracy. These results not only validate the algorithm's theoretical guarantees but also demonstrate its effectiveness in practical tasks such as compressed sensing and empirical risk minimization.
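A schematic sketch of the update pattern the abstract describes: at each iteration only a random subset of agents performs a proximal/gradient update, and a regularized consensus step merges the local copies. The concrete rules below (distributed LASSO, soft-thresholding, the rho-weighted pull toward consensus) are hypothetical stand-ins, not the S-D-RSM iterations from the paper.

import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# Distributed LASSO: min_x sum_i 0.5 * ||A_i x - b_i||^2 + lam * ||x||_1,
# one (A_i, b_i) block per agent.
rng = np.random.default_rng(0)
m, n, agents, lam = 20, 10, 5, 0.1
A = [rng.standard_normal((m, n)) for _ in range(agents)]
x_true = np.zeros(n); x_true[:3] = 1.0
b = [Ai @ x_true + 0.01 * rng.standard_normal(m) for Ai in A]

x = [np.zeros(n) for _ in range(agents)]   # local copies
z = np.zeros(n)                            # consensus variable
alpha, rho = 0.005, 0.5                    # step size, consensus regularization

for _ in range(500):
    # only a random subset of agents updates at this iteration
    subset = rng.choice(agents, size=2, replace=False)
    for i in subset:
        grad_i = A[i].T @ (A[i] @ x[i] - b[i])
        # local proximal-gradient step pulled toward the consensus variable
        x[i] = soft_threshold(x[i] - alpha * (grad_i + rho * (x[i] - z)),
                              alpha * lam / agents)
    z = np.mean(x, axis=0)                 # consensus update

print("recovered solution:", np.round(z, 2))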
Gradient Flows of Interfacial Energies: Curvature Agents and Incompressibility
Keith Promislow, Truong Vu, Brian Wetton
https://arxiv.org/abs/2509.07380 https://
On the Theory of Continual Learning with Gradient Descent for Neural Networks
Hossein Taheri, Avishek Ghosh, Arya Mazumdar
https://arxiv.org/abs/2510.05573 https://
On the optimization dynamics of RLVR: Gradient gap and step size thresholds
Joe Suk, Yaqi Duan
https://arxiv.org/abs/2510.08539 https://arxiv.org/pdf/2510.…
Replaced article(s) found for math.OC. https://arxiv.org/list/math.OC/new
[1/1]:
- A robust BFGS algorithm for unconstrained nonlinear optimization problems
Yaguang Yang
https://arxiv.org/abs/1212.5929
- Quantum computing and the stable set problem
Aljaž Krpan, Janez Povh, Dunja Pucher
https://arxiv.org/abs/2405.12845 https://mastoxiv.page/@arXiv_mathOC_bot/112483516437815686
- Mean Field Game with Reflected Jump Diffusion Dynamics: A Linear Programming Approach
Zongxia Liang, Xiang Yu, Keyu Zhang
https://arxiv.org/abs/2508.20388 https://mastoxiv.page/@arXiv_mathOC_bot/115111048711698998
- Differential Dynamic Programming for the Optimal Control Problem with an Ellipsoidal Target Set a...
Sungjun Eom, Gyunghoon Park
https://arxiv.org/abs/2509.07546 https://mastoxiv.page/@arXiv_mathOC_bot/115179281556444440
- On the Moreau envelope properties of weakly convex functions
Marien Renaud, Arthur Leclaire, Nicolas Papadakis
https://arxiv.org/abs/2509.13960 https://mastoxiv.page/@arXiv_mathOC_bot/115224514482363803
- Automated algorithm design via Nevanlinna-Pick interpolation
Ibrahim K. Ozaslan, Tryphon T. Georgiou, Mihailo R. Jovanovic
https://arxiv.org/abs/2509.21416 https://mastoxiv.page/@arXiv_mathOC_bot/115286533597711930
- Optimal Control of a Bioeconomic Crop-Energy System with Energy Reinvestment
Othman Cherkaoui Dekkaki
https://arxiv.org/abs/2510.11381 https://mastoxiv.page/@arXiv_mathOC_bot/115372322896073250
- Point Convergence Analysis of the Accelerated Gradient Method for Multiobjective Optimization: Co...
Yingdong Yin
https://arxiv.org/abs/2510.26382 https://mastoxiv.page/@arXiv_mathOC_bot/115468018035252078
- History-Aware Adaptive High-Order Tensor Regularization
Chang He, Bo Jiang, Yuntian Jiang, Chuwen Zhang, Shuzhong Zhang
https://arxiv.org/abs/2511.05788
- Equivalence of entropy solutions and gradient flows for pressureless 1D Euler systems
José Antonio Carrillo, Sondre Tesdal Galtung
https://arxiv.org/abs/2312.04932 https://mastoxiv.page/@arXiv_mathAP_bot/111560077272113052
- Kernel Modelling of Fading Memory Systems
Yongkang Huo, Thomas Chaffey, Rodolphe Sepulchre
https://arxiv.org/abs/2403.11945 https://mastoxiv.page/@arXiv_eessSY_bot/112121123836064435
- The Maximum Theoretical Ground Speed of the Wheeled Vehicle
Altay Zhakatayev, Mukatai Nemerebayev
https://arxiv.org/abs/2502.15341 https://mastoxiv.page/@arXiv_physicsclassph_bot/114057765769441123
- Hessian stability and convergence rates for entropic and Sinkhorn potentials via semiconcavity
Giacomo Greco, Luca Tamanini
https://arxiv.org/abs/2504.11133 https://mastoxiv.page/@arXiv_mathPR_bot/114346453424694503
- Optimizing the ground state energy of the three-dimensional magnetic Dirichlet Laplacian with con...
Matthias Baur
https://arxiv.org/abs/2504.21597 https://mastoxiv.page/@arXiv_mathph_bot/114431404740241516
- A localized consensus-based sampling algorithm
Arne Bouillon, Alexander Bodard, Panagiotis Patrinos, Dirk Nuyens, Giovanni Samaey
https://arxiv.org/abs/2505.24861 https://mastoxiv.page/@arXiv_mathNA_bot/114612580684567066
- A Novel Sliced Fused Gromov-Wasserstein Distance
Moritz Piening, Robert Beinert
https://arxiv.org/abs/2508.02364 https://mastoxiv.page/@arXiv_csLG_bot/114976243138728278
- Minimal Regret Walras Equilibria for Combinatorial Markets via Duality, Integrality, and Sensitiv...
Aloïs Duguet, Tobias Harks, Martin Schmidt, Julian Schwarz
https://arxiv.org/abs/2511.09021 https://mastoxiv.page/@arXiv_csGT_bot/115541243299714775
Evaluating the Impact of Adversarial Attacks on Traffic Sign Classification using the LISA Dataset
Nabeyou Tadessa, Balaji Iyangar, Mashrur Chowdhury
https://arxiv.org/abs/2509.06835
Computing Wasserstein Barycenters through Gradient Flows
Eduardo Fernandes Montesuma, Yassir Bendou, Mike Gartrell
https://arxiv.org/abs/2510.04602 https://
Asymptotic behaviour of the weak inverse anisotropic mean curvature flow
Chaoqun Gao, Yong Wei, Rong Zhou
https://arxiv.org/abs/2510.08168 https://arxiv.or…
Weight Initialization and Variance Dynamics in Deep Neural Networks and Large Language Models
Yankun Han
https://arxiv.org/abs/2510.09423 https://arxiv.org…
Linear Convergence of a Unified Primal--Dual Algorithm for Convex--Concave Saddle Point Problems with Quadratic Growth
Cody Melcher, Afrooz Jalilzadeh, Erfan Yazdandoost Hamedani
https://arxiv.org/abs/2510.11990
A gradient estimate for the linearized translator equation
Kyeongsu Choi, Robert Haslhofer, Or Hershkovits
https://arxiv.org/abs/2509.07629 https://arxiv.o…
Accelerated stochastic first-order method for convex optimization under heavy-tailed noise
Chuan He, Zhaosong Lu
https://arxiv.org/abs/2510.11676 https://a…
Linear Algebra Problems Solved by Using Damped Dynamical Systems on the Stiefel Manifold
M Gulliksson, A Oleynik, M Ogren, R Bakhshandeh-Chamazkoti
https://arxiv.org/abs/2510.10535
NeST-BO: Fast Local Bayesian Optimization via Newton-Step Targeting of Gradient and Hessian Information
Wei-Ting Tang, Akshay Kudva, Joel A. Paulson
https://arxiv.org/abs/2510.05516
Statistical Inference for Gradient Boosting Regression
Haimo Fang, Kevin Tan, Giles Hooker
https://arxiv.org/abs/2509.23127 https://arxiv.org/pdf/2509.2312…
Learning Mean-Field Games through Mean-Field Actor-Critic Flow
Mo Zhou, Haosheng Zhou, Ruimeng Hu
https://arxiv.org/abs/2510.12180 https://arxiv.org/pdf/25…
Flatness-Aware Stochastic Gradient Langevin Dynamics
Stefano Bruno, Youngsik Hwang, Jaehyeon An, Sotirios Sabanis, Dong-Young Lim
https://arxiv.org/abs/2510.02174 https://
Gradient regularity for widely degenerate parabolic equations
Michael Strunk
https://arxiv.org/abs/2510.07999 https://arxiv.org/pdf/2510.07999
Convexity of Optimization Curves: Local Sharp Thresholds, Robustness Impossibility, and New Counterexamples
Le Duc Hieu
https://arxiv.org/abs/2509.08954 https://
Convergence of Stochastic Gradient Methods for Wide Two-Layer Physics-Informed Neural Networks
Bangti Jin, Longjun Wu
https://arxiv.org/abs/2508.21571 https://
AdaBet: Gradient-free Layer Selection for Efficient Training of Deep Neural Networks
Irene Tenison, Soumyajit Chatterjee, Fahim Kawsar, Mohammad Malekzadeh
https://arxiv.org/abs/2510.03101
Global Solutions to Non-Convex Functional Constrained Problems with Hidden Convexity
Ilyas Fatkhullin, Niao He, Guanghui Lan, Florian Wolf
https://arxiv.org/abs/2511.10626 https://arxiv.org/pdf/2511.10626 https://arxiv.org/html/2511.10626
arXiv:2511.10626v1 Announce Type: new
Abstract: Constrained non-convex optimization is fundamentally challenging, as global solutions are generally intractable and constraint qualifications may not hold. However, in many applications, including safe policy optimization in control and reinforcement learning, such problems possess hidden convexity, meaning they can be reformulated as convex programs via a nonlinear invertible transformation. Typically, such transformations are implicit or unknown, making a direct link with the convex program impossible. On the other hand, (sub-)gradients with respect to the original variables are often accessible or can be easily estimated, which motivates algorithms that operate directly in the original (non-convex) problem space using standard (sub-)gradient oracles. In this work, we develop the first algorithms to provably solve such non-convex problems to global minima. First, using a modified inexact proximal point method, we establish global last-iterate convergence guarantees with $\widetilde{\mathcal{O}}(\varepsilon^{-3})$ oracle complexity in the non-smooth setting. For smooth problems, we propose a new bundle-level type method based on linearly constrained quadratic subproblems, improving the oracle complexity to $\widetilde{\mathcal{O}}(\varepsilon^{-1})$. Surprisingly, despite non-convexity, our methodology does not require any constraint qualifications, can handle hidden convex equality constraints, and achieves complexities matching those for solving unconstrained hidden convex optimization.
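A generic sketch of the outer/inner structure of an inexact proximal point method on a toy hidden-convex objective f(x) = g(u(x)), with g convex and u an invertible nonlinear reparametrization; the paper's modified method, its accuracy conditions, and the complexity-optimal inner solves are not reproduced.

import numpy as np

def inexact_proximal_point(grad, x0, mu=1.0, outer=50, inner=100, lr=0.01):
    # Outer loop: approximately minimize f(y) + (mu/2) * ||y - x_k||^2
    # with a fixed budget of inner gradient steps (hence "inexact").
    x = np.asarray(x0, dtype=float)
    for _ in range(outer):
        y = x.copy()
        for _ in range(inner):
            y -= lr * (grad(y) + mu * (y - x))
        x = y
    return x

# Hidden convexity: f(x) = 0.5 * ||u(x) - 1||^2 with u(x) = x^3 componentwise,
# nonconvex in x but convex in u. Only gradients in x-space are used.
g_grad = lambda u: u - 1.0
u  = lambda x: x**3
du = lambda x: 3.0 * x**2                   # diagonal Jacobian of u
grad = lambda x: du(x) * g_grad(u(x))       # chain rule in the original variables

print(inexact_proximal_point(grad, x0=np.array([2.0, 0.5])))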
On fundamental properties of high-order forward-backward envelope
Alireza Kabgani, Masoud Ahookhosh
https://arxiv.org/abs/2511.10421 https://arxiv.org/pdf/2511.10421 https://arxiv.org/html/2511.10421
arXiv:2511.10421v1 Announce Type: new
Abstract: This paper studies the fundamental properties of the high-order forward-backward splitting mapping (HiFBS) and its associated forward-backward envelope (HiFBE) through the lens of high-order regularization for nonconvex composite functions. Specifically, we (i) establish the boundedness and uniform boundedness of HiFBS, along with the Hölder and Lipschitz continuity of HiFBE; (ii) derive an explicit form for the subdifferentials of HiFBE; and (iii) investigate necessary and sufficient conditions for the differentiability and weak smoothness of HiFBE under suitable assumptions. By leveraging the prox-regularity of $g$ and the concept of $p$-calmness, we further demonstrate the local single-valuedness and continuity of HiFBS, which in turn guarantee the differentiability of HiFBE in neighborhoods of calm points. This paves the way for the development of gradient-based algorithms tailored to nonconvex composite optimization problems.
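A numerical illustration of the envelope object, assuming the standard high-order definition HiFBE(x) = min_y f(x) + f'(x)(y - x) + g(y) + |y - x|^p / (p * gamma), with HiFBS(x) the minimizer; the paper's precise normalization and assumptions may differ. Brute force in one dimension:

import numpy as np

def hifbs_hifbe(f, df, g, x, gamma=0.5, p=3.0):
    # Evaluate the high-order forward-backward envelope at x by grid search,
    # returning the minimizing point (HiFBS) and the envelope value (HiFBE).
    grid = np.linspace(x - 5.0, x + 5.0, 20001)
    vals = f(x) + df(x) * (grid - x) + g(grid) + np.abs(grid - x)**p / (p * gamma)
    i = np.argmin(vals)
    return grid[i], vals[i]

# Composite example: smooth f(x) = 0.5 x^2, nonsmooth g(x) = |x|.
f  = lambda x: 0.5 * x**2
df = lambda x: x
g  = lambda x: np.abs(x)

for x in [2.0, 0.5, -1.0]:
    y, env = hifbs_hifbe(f, df, g, x)
    print(f"x={x:5.2f}  HiFBS(x)={y:7.4f}  HiFBE(x)={env:7.4f}  phi(x)={f(x)+g(x):7.4f}")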
Minimizing smooth Kurdyka-Łojasiewicz functions via generalized descent methods: Convergence rate and complexity
Masoud Ahookhosh, Susan Ghaderi, Alireza Kabgani, Morteza Rahimi
https://arxiv.org/abs/2511.10414 https://arxiv.org/pdf/2511.10414 https://arxiv.org/html/2511.10414
arXiv:2511.10414v1 Announce Type: new
Abstract: This paper addresses the generalized descent algorithm (DEAL) for minimizing smooth functions, which is analyzed under the Kurdyka-Łojasiewicz (KL) inequality. In particular, the suggested algorithm guarantees a sufficient decrease by adapting to the cost function's geometry. We leverage the KL property to establish the global convergence, convergence rates, and complexity. A particular focus is placed on the linear convergence of generalized descent methods. We show that the constant step-size and Armijo line search strategies along a generalized descent direction satisfy our generalized descent condition. Additionally, for nonsmooth functions, by leveraging smoothing techniques such as the forward-backward and high-order Moreau envelopes, we show that the boosted proximal gradient method (BPGA) and the boosted high-order proximal-point method (BPPA) are specific cases of DEAL. It is notable that if the order of the high-order proximal term is chosen in a certain way (depending on the KL exponent), then the sequence generated by BPPA converges linearly for an arbitrary KL exponent. Our preliminary numerical experiments on inverse problems and LASSO demonstrate the efficiency of the proposed methods, validating our theoretical findings.
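The boosted high-order proximal-point idea (BPPA) can be caricatured in one dimension: solve a high-order proximal subproblem, then extrapolate ("boost") along the resulting direction while the objective keeps decreasing. The grid solve, doubling line search, and parameters below are illustrative assumptions, not the paper's method.

import numpy as np

def bppa_sketch(f, x0, lam=1.0, p=3.0, outer=30):
    # At each step solve y = argmin_u f(u) + |u - x|^p / (p * lam)
    # by brute force on a grid, then boost along d = y - x.
    x = float(x0)
    for _ in range(outer):
        grid = np.linspace(x - 4.0, x + 4.0, 8001)
        y = grid[np.argmin(f(grid) + np.abs(grid - x)**p / (p * lam))]
        d, t = y - x, 1.0
        while f(x + 2.0 * t * d) < f(x + t * d):   # simple doubling line search
            t *= 2.0
        x = x + t * d
    return x

f = lambda u: (u**2 - 1.0)**2          # nonconvex, minima at u = +/-1
print(bppa_sketch(f, x0=3.0))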
Value bounds and Convergence Analysis for Averages of LRP attributions
Alexander Binder, Nastaran Takmil-Homayouni, Urun Dogan
https://arxiv.org/abs/2509.08963 https://
Inductive inference of gradient-boosted decision trees on graphs for insurance fraud detection
Félix Vandervorst, Bruno Deprez, Wouter Verbeke, Tim Verdonck
https://arxiv.org/abs/2510.05676
Balancing Utility and Privacy: Dynamically Private SGD with Random Projection
Zhanhong Jiang, Md Zahid Hasan, Nastaran Saadati, Aditya Balu, Chao Liu, Soumik Sarkar
https://arxiv.org/abs/2509.09485
Linear Convergence of Gradient Descent for Quadratically Regularized Optimal Transport
Alberto González-Sanz, Marcel Nutz, Andrés Riveros Valdevenito
https://arxiv.org/abs/2509.08547
Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[2/5]:
- Convergence Analysis of Asynchronous Federated Learning with Gradient Compression for Non-Convex ...
Diying Yang, Yingwei Hou, Weigang Wu
Approximate Bregman proximal gradient algorithm with variable metric Armijo--Wolfe line search
Kiwamu Fujiki, Shota Takahashi, Akiko Takeda
https://arxiv.org/abs/2510.06615 http…
Data-driven multifidelity and multiscale topology optimization based on phasor-based evolutionary de-homogenization
Shuzhi Xu, Yifan Guo, Hiroki Kawabe, Kentaro Yaji
https://arxiv.org/abs/2510.08830
Correlating Cross-Iteration Noise for DP-SGD using Model Curvature
Xin Gu, Yingtai Xiao, Guanlin He, Jiamu Bai, Daniel Kifer, Kiwan Maeng
https://arxiv.org/abs/2510.05416 https:…
Stochastic versus Deterministic in Stochastic Gradient Descent
Runze Li, Jintao Xu, Wenxun Xing
https://arxiv.org/abs/2509.02912 https://arxiv.org/pdf/2509…
Quantitative Convergence Analysis of Projected Stochastic Gradient Descent for Non-Convex Losses via the Goldstein Subdifferential
Yuping Zheng, Andrew Lamperski
https://arxiv.org/abs/2510.02735
Robust Tangent Space Estimation via Laplacian Eigenvector Gradient Orthogonalization
Dhruv Kohli, Sawyer J. Robertson, Gal Mishne, Alexander Cloninger
https://arxiv.org/abs/2510.02308
Finding a Multiple Follower Stackelberg Equilibrium: A Fully First-Order Method
April Niu, Kai Wang, Juba Ziani
https://arxiv.org/abs/2509.08161 https://ar…
Towards understanding Accelerated Stein Variational Gradient Flow -- Analysis of Generalized Bilinear Kernels for Gaussian target distributions
Viktor Stein, Wuchen Li
https://arxiv.org/abs/2509.04008 …
On the Perturbed Projection-Based Distributed Gradient-Descent Algorithm: A Fully-Distributed Adaptive Redesign
Tarek Bazizi, Mohamed Maghenem, Paolo Frasca, Antonio Loría, Elena Panteley
https://arxiv.org/abs/2509.03443