2026-01-28 14:18:50
Replaced article(s) found for physics.atom-ph. https://arxiv.org/list/physics.atom-ph/new
[1/1]:
- Relativistic corrections of order $m\alpha^6$: singular operators and regularization
Vladimir I Korobov
Replaced article(s) found for physics.atom-ph. https://arxiv.org/list/physics.atom-ph/new
[1/1]:
- Relativistic corrections of order $m\alpha^6$: singular operators and regularization
Vladimir I Korobov
Mitigating Forgetting in Low Rank Adaptation
Joanna Sliwa, Frank Schneider, Philipp Hennig, Jose Miguel Hernandez-Lobato
https://arxiv.org/abs/2512.17720 https://arxiv.org/pdf/2512.17720 https://arxiv.org/html/2512.17720
arXiv:2512.17720v1 Announce Type: new
Abstract: Parameter-efficient fine-tuning methods, such as Low-Rank Adaptation (LoRA), enable fast specialization of large pre-trained models to different downstream applications. However, this process often leads to catastrophic forgetting of the model's prior domain knowledge. We address this issue with LaLoRA, a weight-space regularization technique that applies a Laplace approximation to Low-Rank Adaptation. Our approach estimates the model's confidence in each parameter and constrains updates in high-curvature directions, preserving prior knowledge while enabling efficient target-domain learning. By applying the Laplace approximation only to the LoRA weights, the method remains lightweight. We evaluate LaLoRA by fine-tuning a Llama model for mathematical reasoning and demonstrate an improved learning-forgetting trade-off, which can be directly controlled via the method's regularization strength. We further explore different loss landscape curvature approximations for estimating parameter confidence, analyze the effect of the data used for the Laplace approximation, and study robustness across hyperparameters.
toXiv_bot_toot
On fundamental properties of high-order forward-backward envelope
Alireza Kabgani, Masoud Ahookhosh
https://arxiv.org/abs/2511.10421 https://arxiv.org/pdf/2511.10421 https://arxiv.org/html/2511.10421
arXiv:2511.10421v1 Announce Type: new
Abstract: This paper studies the fundamental properties of the high-order forward-backward splitting mapping (HiFBS) and its associated forward-backward envelope (HiFBE) through the lens of high-order regularization for nonconvex composite functions. Specifically, we (i) establish the boundedness and uniform boundedness of HiFBS, along with the H\"older and Lipschitz continuity of HiFBE; (ii) derive an explicit form for the subdifferentials of HiFBE; and (iii) investigate necessary and sufficient conditions for the differentiability and weak smoothness of HiFBE under suitable assumptions. By leveraging the prox-regularity of $g$ and the concept of $p$-calmness, we further demonstrate the local single-valuedness and continuity of HiFBS, which in turn guarantee the differentiability of HiFBE in neighborhoods of calm points. This paves the way for the development of gradient-based algorithms tailored to nonconvex composite optimization problems.
toXiv_bot_toot
Regularized Random Fourier Features and Finite Element Reconstruction for Operator Learning in Sobolev Space
Xinyue Yu, Hayden Schaeffer
https://arxiv.org/abs/2512.17884 https://arxiv.org/pdf/2512.17884 https://arxiv.org/html/2512.17884
arXiv:2512.17884v1 Announce Type: new
Abstract: Operator learning is a data-driven approximation of mappings between infinite-dimensional function spaces, such as the solution operators of partial differential equations. Kernel-based operator learning can offer accurate, theoretically justified approximations that require less training than standard methods. However, they can become computationally prohibitive for large training sets and can be sensitive to noise. We propose a regularized random Fourier feature (RRFF) approach, coupled with a finite element reconstruction map (RRFF-FEM), for learning operators from noisy data. The method uses random features drawn from multivariate Student's $t$ distributions, together with frequency-weighted Tikhonov regularization that suppresses high-frequency noise. We establish high-probability bounds on the extreme singular values of the associated random feature matrix and show that when the number of features $N$ scales like $m \log m$ with the number of training samples $m$, the system is well-conditioned, which yields estimation and generalization guarantees. Detailed numerical experiments on benchmark PDE problems, including advection, Burgers', Darcy flow, Helmholtz, Navier-Stokes, and structural mechanics, demonstrate that RRFF and RRFF-FEM are robust to noise and achieve improved performance with reduced training time compared to the unregularized random feature model, while maintaining competitive accuracy relative to kernel and neural operator tests.
toXiv_bot_toot
Calibratable Disambiguation Loss for Multi-Instance Partial-Label Learning
Wei Tang, Yin-Fang Yang, Weijia Zhang, Min-Ling Zhang
https://arxiv.org/abs/2512.17788 https://arxiv.org/pdf/2512.17788 https://arxiv.org/html/2512.17788
arXiv:2512.17788v1 Announce Type: new
Abstract: Multi-instance partial-label learning (MIPL) is a weakly supervised framework that extends the principles of multi-instance learning (MIL) and partial-label learning (PLL) to address the challenges of inexact supervision in both instance and label spaces. However, existing MIPL approaches often suffer from poor calibration, undermining classifier reliability. In this work, we propose a plug-and-play calibratable disambiguation loss (CDL) that simultaneously improves classification accuracy and calibration performance. The loss has two instantiations: the first one calibrates predictions based on probabilities from the candidate label set, while the second one integrates probabilities from both candidate and non-candidate label sets. The proposed CDL can be seamlessly incorporated into existing MIPL and PLL frameworks. We provide a theoretical analysis that establishes the lower bound and regularization properties of CDL, demonstrating its superiority over conventional disambiguation losses. Experimental results on benchmark and real-world datasets confirm that our CDL significantly enhances both classification and calibration performance.
toXiv_bot_toot
Global Convergence of Four-Layer Matrix Factorization under Random Initialization
Minrui Luo, Weihang Xu, Xiang Gao, Maryam Fazel, Simon Shaolei Du
https://arxiv.org/abs/2511.09925 https://arxiv.org/pdf/2511.09925 https://arxiv.org/html/2511.09925
arXiv:2511.09925v1 Announce Type: new
Abstract: Gradient descent dynamics on the deep matrix factorization problem is extensively studied as a simplified theoretical model for deep neural networks. Although the convergence theory for two-layer matrix factorization is well-established, no global convergence guarantee for general deep matrix factorization under random initialization has been established to date. To address this gap, we provide a polynomial-time global convergence guarantee for randomly initialized gradient descent on four-layer matrix factorization, given certain conditions on the target matrix and a standard balanced regularization term. Our analysis employs new techniques to show saddle-avoidance properties of gradient decent dynamics, and extends previous theories to characterize the change in eigenvalues of layer weights.
toXiv_bot_toot
S-D-RSM: Stochastic Distributed Regularized Splitting Method for Large-Scale Convex Optimization Problems
Maoran Wang, Xingju Cai, Yongxin Chen
https://arxiv.org/abs/2511.10133 https://arxiv.org/pdf/2511.10133 https://arxiv.org/html/2511.10133
arXiv:2511.10133v1 Announce Type: new
Abstract: This paper investigates the problems large-scale distributed composite convex optimization, with motivations from a broad range of applications, including multi-agent systems, federated learning, smart grids, wireless sensor networks, compressed sensing, and so on. Stochastic gradient descent (SGD) and its variants are commonly employed to solve such problems. However, existing algorithms often rely on vanishing step sizes, strong convexity assumptions, or entail substantial computational overhead to ensure convergence or obtain favorable complexity. To bridge the gap between theory and practice, we integrate consensus optimization and operator splitting techniques (see Problem Reformulation) to develop a novel stochastic splitting algorithm, termed the \emph{stochastic distributed regularized splitting method} (S-D-RSM). In practice, S-D-RSM performs parallel updates of proximal mappings and gradient information for only a randomly selected subset of agents at each iteration. By introducing regularization terms, it effectively mitigates consensus discrepancies among distributed nodes. In contrast to conventional stochastic methods, our theoretical analysis establishes that S-D-RSM achieves global convergence without requiring diminishing step sizes or strong convexity assumptions. Furthermore, it achieves an iteration complexity of $\mathcal{O}(1/\epsilon)$ with respect to both the objective function value and the consensus error. Numerical experiments show that S-D-RSM achieves up to 2--3$\times$ speedup compared to state-of-the-art baselines, while maintaining comparable or better accuracy. These results not only validate the algorithm's theoretical guarantees but also demonstrate its effectiveness in practical tasks such as compressed sensing and empirical risk minimization.
toXiv_bot_toot
Replaced article(s) found for math.OC. https://arxiv.org/list/math.OC/new
[1/1]:
- A robust BFGS algorithm for unconstrained nonlinear optimization problems
Yaguang Yang
https://arxiv.org/abs/1212.5929
- Quantum computing and the stable set problem
Alja\v{z} Krpan, Janez Povh, Dunja Pucher
https://arxiv.org/abs/2405.12845 https://mastoxiv.page/@arXiv_mathOC_bot/112483516437815686
- Mean Field Game with Reflected Jump Diffusion Dynamics: A Linear Programming Approach
Zongxia Liang, Xiang Yu, Keyu Zhang
https://arxiv.org/abs/2508.20388 https://mastoxiv.page/@arXiv_mathOC_bot/115111048711698998
- Differential Dynamic Programming for the Optimal Control Problem with an Ellipsoidal Target Set a...
Sungjun Eom, Gyunghoon Park
https://arxiv.org/abs/2509.07546 https://mastoxiv.page/@arXiv_mathOC_bot/115179281556444440
- On the Moreau envelope properties of weakly convex functions
Marien Renaud, Arthur Leclaire, Nicolas Papadakis
https://arxiv.org/abs/2509.13960 https://mastoxiv.page/@arXiv_mathOC_bot/115224514482363803
- Automated algorithm design via Nevanlinna-Pick interpolation
Ibrahim K. Ozaslan, Tryphon T. Georgiou, Mihailo R. Jovanovic
https://arxiv.org/abs/2509.21416 https://mastoxiv.page/@arXiv_mathOC_bot/115286533597711930
- Optimal Control of a Bioeconomic Crop-Energy System with Energy Reinvestment
Othman Cherkaoui Dekkaki
https://arxiv.org/abs/2510.11381 https://mastoxiv.page/@arXiv_mathOC_bot/115372322896073250
- Point Convergence Analysis of the Accelerated Gradient Method for Multiobjective Optimization: Co...
Yingdong Yin
https://arxiv.org/abs/2510.26382 https://mastoxiv.page/@arXiv_mathOC_bot/115468018035252078
- History-Aware Adaptive High-Order Tensor Regularization
Chang He, Bo Jiang, Yuntian Jiang, Chuwen Zhang, Shuzhong Zhang
https://arxiv.org/abs/2511.05788
- Equivalence of entropy solutions and gradient flows for pressureless 1D Euler systems
Jos\'e Antonio Carrillo, Sondre Tesdal Galtung
https://arxiv.org/abs/2312.04932 https://mastoxiv.page/@arXiv_mathAP_bot/111560077272113052
- Kernel Modelling of Fading Memory Systems
Yongkang Huo, Thomas Chaffey, Rodolphe Sepulchre
https://arxiv.org/abs/2403.11945 https://mastoxiv.page/@arXiv_eessSY_bot/112121123836064435
- The Maximum Theoretical Ground Speed of the Wheeled Vehicle
Altay Zhakatayev, Mukatai Nemerebayev
https://arxiv.org/abs/2502.15341 https://mastoxiv.page/@arXiv_physicsclassph_bot/114057765769441123
- Hessian stability and convergence rates for entropic and Sinkhorn potentials via semiconcavity
Giacomo Greco, Luca Tamanini
https://arxiv.org/abs/2504.11133 https://mastoxiv.page/@arXiv_mathPR_bot/114346453424694503
- Optimizing the ground state energy of the three-dimensional magnetic Dirichlet Laplacian with con...
Matthias Baur
https://arxiv.org/abs/2504.21597 https://mastoxiv.page/@arXiv_mathph_bot/114431404740241516
- A localized consensus-based sampling algorithm
Arne Bouillon, Alexander Bodard, Panagiotis Patrinos, Dirk Nuyens, Giovanni Samaey
https://arxiv.org/abs/2505.24861 https://mastoxiv.page/@arXiv_mathNA_bot/114612580684567066
- A Novel Sliced Fused Gromov-Wasserstein Distance
Moritz Piening, Robert Beinert
https://arxiv.org/abs/2508.02364 https://mastoxiv.page/@arXiv_csLG_bot/114976243138728278
- Minimal Regret Walras Equilibria for Combinatorial Markets via Duality, Integrality, and Sensitiv...
Alo\"is Duguet, Tobias Harks, Martin Schmidt, Julian Schwarz
https://arxiv.org/abs/2511.09021 https://mastoxiv.page/@arXiv_csGT_bot/115541243299714775
toXiv_bot_toot