Tootfinder

Opt-in global Mastodon full text search.

@arXiv_physicsatomph_bot@mastoxiv.page
2026-01-28 14:18:50

Replaced article(s) found for physics.atom-ph. arxiv.org/list/physics.atom-ph
[1/1]:
- Relativistic corrections of order $m\alpha^6$: singular operators and regularization
Vladimir I Korobov

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:33:00

Mitigating Forgetting in Low Rank Adaptation
Joanna Sliwa, Frank Schneider, Philipp Hennig, Jose Miguel Hernandez-Lobato
arxiv.org/abs/2512.17720 arxiv.org/pdf/2512.17720 arxiv.org/html/2512.17720
arXiv:2512.17720v1 Announce Type: new
Abstract: Parameter-efficient fine-tuning methods, such as Low-Rank Adaptation (LoRA), enable fast specialization of large pre-trained models to different downstream applications. However, this process often leads to catastrophic forgetting of the model's prior domain knowledge. We address this issue with LaLoRA, a weight-space regularization technique that applies a Laplace approximation to Low-Rank Adaptation. Our approach estimates the model's confidence in each parameter and constrains updates in high-curvature directions, preserving prior knowledge while enabling efficient target-domain learning. By applying the Laplace approximation only to the LoRA weights, the method remains lightweight. We evaluate LaLoRA by fine-tuning a Llama model for mathematical reasoning and demonstrate an improved learning-forgetting trade-off, which can be directly controlled via the method's regularization strength. We further explore different loss landscape curvature approximations for estimating parameter confidence, analyze the effect of the data used for the Laplace approximation, and study robustness across hyperparameters.
toXiv_bot_toot
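
The abstract above describes constraining LoRA updates in high-curvature directions. A minimal sketch of that general idea, not the authors' implementation, is a curvature-weighted quadratic penalty around a reference point. The diagonal curvature estimate, the function names, and the parameter names (`theta_ref`, `curvature`, `strength`) are all assumptions made for illustration; the abstract says several curvature approximations are explored but does not specify them.

```python
import numpy as np


def laplace_penalty(theta, theta_ref, curvature, strength):
    """Quadratic penalty 0.5 * strength * sum_i curvature_i * (theta_i - theta_ref_i)^2.

    Assumption: the curvature is approximated by a diagonal (one value per weight).
    """
    diff = np.asarray(theta, dtype=float) - np.asarray(theta_ref, dtype=float)
    weights = np.asarray(curvature, dtype=float)
    return 0.5 * strength * float(np.sum(weights * diff**2))


def laplace_penalty_grad(theta, theta_ref, curvature, strength):
    """Gradient of laplace_penalty with respect to theta."""
    diff = np.asarray(theta, dtype=float) - np.asarray(theta_ref, dtype=float)
    return strength * np.asarray(curvature, dtype=float) * diff


# Hypothetical two-parameter example: the first weight has high curvature
# (the model is "confident" about it), the second has low curvature.
theta_ref = np.zeros(2)
curvature = np.array([10.0, 0.1])
cost_high = laplace_penalty([1.0, 0.0], theta_ref, curvature, strength=1.0)
cost_low = laplace_penalty([0.0, 1.0], theta_ref, curvature, strength=1.0)
```

In this sketch a unit step along the high-curvature weight costs more than the same step along the low-curvature weight, and `strength` scales the whole penalty, which mirrors the controllable learning-forgetting trade-off the abstract mentions.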

@arXiv_mathOC_bot@mastoxiv.page
2025-11-14 09:44:20

On fundamental properties of high-order forward-backward envelope
Alireza Kabgani, Masoud Ahookhosh
arxiv.org/abs/2511.10421 arxiv.org/pdf/2511.10421 arxiv.org/html/2511.10421
arXiv:2511.10421v1 Announce Type: new
Abstract: This paper studies the fundamental properties of the high-order forward-backward splitting mapping (HiFBS) and its associated forward-backward envelope (HiFBE) through the lens of high-order regularization for nonconvex composite functions. Specifically, we (i) establish the boundedness and uniform boundedness of HiFBS, along with the Hölder and Lipschitz continuity of HiFBE; (ii) derive an explicit form for the subdifferentials of HiFBE; and (iii) investigate necessary and sufficient conditions for the differentiability and weak smoothness of HiFBE under suitable assumptions. By leveraging the prox-regularity of $g$ and the concept of $p$-calmness, we further demonstrate the local single-valuedness and continuity of HiFBS, which in turn guarantee the differentiability of HiFBE in neighborhoods of calm points. This paves the way for the development of gradient-based algorithms tailored to nonconvex composite optimization problems.
toXiv_bot_toot

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:34:50

Regularized Random Fourier Features and Finite Element Reconstruction for Operator Learning in Sobolev Space
Xinyue Yu, Hayden Schaeffer
arxiv.org/abs/2512.17884 arxiv.org/pdf/2512.17884 arxiv.org/html/2512.17884
arXiv:2512.17884v1 Announce Type: new
Abstract: Operator learning is a data-driven approximation of mappings between infinite-dimensional function spaces, such as the solution operators of partial differential equations. Kernel-based operator learning can offer accurate, theoretically justified approximations that require less training than standard methods. However, they can become computationally prohibitive for large training sets and can be sensitive to noise. We propose a regularized random Fourier feature (RRFF) approach, coupled with a finite element reconstruction map (RRFF-FEM), for learning operators from noisy data. The method uses random features drawn from multivariate Student's $t$ distributions, together with frequency-weighted Tikhonov regularization that suppresses high-frequency noise. We establish high-probability bounds on the extreme singular values of the associated random feature matrix and show that when the number of features $N$ scales like $m \log m$ with the number of training samples $m$, the system is well-conditioned, which yields estimation and generalization guarantees. Detailed numerical experiments on benchmark PDE problems, including advection, Burgers', Darcy flow, Helmholtz, Navier-Stokes, and structural mechanics, demonstrate that RRFF and RRFF-FEM are robust to noise and achieve improved performance with reduced training time compared to the unregularized random feature model, while maintaining competitive accuracy relative to kernel and neural operator tests.
toXiv_bot_toot
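
The abstract above combines random features drawn from a Student's $t$ distribution with frequency-weighted Tikhonov regularization. The following is a rough one-dimensional sketch of that combination, not the RRFF or RRFF-FEM method itself: the cosine feature map, the penalty weight `1 + freq**2`, the toy target `sin(3x)`, and all function names are assumptions chosen for illustration, and the finite element reconstruction step is omitted.

```python
import numpy as np


def make_features(x, freqs, phases):
    """Random Fourier feature matrix with entries cos(freq_j * x_i + phase_j)."""
    return np.cos(np.outer(x, freqs) + phases)


def fit_frequency_weighted_ridge(A, y, freqs, lam):
    """Solve (A^T A + lam * diag(1 + freqs^2)) c = A^T y.

    Assumption: the weight 1 + freqs^2 penalizes high-frequency features more;
    the abstract does not state the exact weighting.
    """
    penalty = lam * (1.0 + freqs**2)
    gram = A.T @ A + np.diag(penalty)
    return np.linalg.solve(gram, A.T @ y)


rng = np.random.default_rng(0)
m, N = 200, 50                                   # training samples, random features
x = rng.uniform(-1.0, 1.0, size=m)
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(m)   # noisy toy data

freqs = rng.standard_t(df=3, size=N)             # Student's t frequencies
phases = rng.uniform(0.0, 2.0 * np.pi, size=N)
A = make_features(x, freqs, phases)
coef = fit_frequency_weighted_ridge(A, y, freqs, lam=1e-2)
```

Increasing `lam` shrinks the coefficients, and the frequency weighting shrinks those attached to large `freqs` the most, which is one simple way to damp high-frequency noise.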

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:33:50

Calibratable Disambiguation Loss for Multi-Instance Partial-Label Learning
Wei Tang, Yin-Fang Yang, Weijia Zhang, Min-Ling Zhang
arxiv.org/abs/2512.17788 arxiv.org/pdf/2512.17788 arxiv.org/html/2512.17788
arXiv:2512.17788v1 Announce Type: new
Abstract: Multi-instance partial-label learning (MIPL) is a weakly supervised framework that extends the principles of multi-instance learning (MIL) and partial-label learning (PLL) to address the challenges of inexact supervision in both instance and label spaces. However, existing MIPL approaches often suffer from poor calibration, undermining classifier reliability. In this work, we propose a plug-and-play calibratable disambiguation loss (CDL) that simultaneously improves classification accuracy and calibration performance. The loss has two instantiations: the first one calibrates predictions based on probabilities from the candidate label set, while the second one integrates probabilities from both candidate and non-candidate label sets. The proposed CDL can be seamlessly incorporated into existing MIPL and PLL frameworks. We provide a theoretical analysis that establishes the lower bound and regularization properties of CDL, demonstrating its superiority over conventional disambiguation losses. Experimental results on benchmark and real-world datasets confirm that our CDL significantly enhances both classification and calibration performance.
toXiv_bot_toot

@arXiv_mathOC_bot@mastoxiv.page
2025-11-14 09:19:00

Global Convergence of Four-Layer Matrix Factorization under Random Initialization
Minrui Luo, Weihang Xu, Xiang Gao, Maryam Fazel, Simon Shaolei Du
arxiv.org/abs/2511.09925 arxiv.org/pdf/2511.09925 arxiv.org/html/2511.09925
arXiv:2511.09925v1 Announce Type: new
Abstract: Gradient descent dynamics on the deep matrix factorization problem is extensively studied as a simplified theoretical model for deep neural networks. Although the convergence theory for two-layer matrix factorization is well-established, no global convergence guarantee for general deep matrix factorization under random initialization has been established to date. To address this gap, we provide a polynomial-time global convergence guarantee for randomly initialized gradient descent on four-layer matrix factorization, given certain conditions on the target matrix and a standard balanced regularization term. Our analysis employs new techniques to show saddle-avoidance properties of gradient descent dynamics, and extends previous theories to characterize the change in eigenvalues of layer weights.
toXiv_bot_toot

@arXiv_mathOC_bot@mastoxiv.page
2025-11-14 09:37:10

S-D-RSM: Stochastic Distributed Regularized Splitting Method for Large-Scale Convex Optimization Problems
Maoran Wang, Xingju Cai, Yongxin Chen
arxiv.org/abs/2511.10133 arxiv.org/pdf/2511.10133 arxiv.org/html/2511.10133
arXiv:2511.10133v1 Announce Type: new
Abstract: This paper investigates the problem of large-scale distributed composite convex optimization, with motivations from a broad range of applications, including multi-agent systems, federated learning, smart grids, wireless sensor networks, compressed sensing, and so on. Stochastic gradient descent (SGD) and its variants are commonly employed to solve such problems. However, existing algorithms often rely on vanishing step sizes or strong convexity assumptions, or entail substantial computational overhead to ensure convergence or obtain favorable complexity. To bridge the gap between theory and practice, we integrate consensus optimization and operator splitting techniques (see Problem Reformulation) to develop a novel stochastic splitting algorithm, termed the stochastic distributed regularized splitting method (S-D-RSM). In practice, S-D-RSM performs parallel updates of proximal mappings and gradient information for only a randomly selected subset of agents at each iteration. By introducing regularization terms, it effectively mitigates consensus discrepancies among distributed nodes. In contrast to conventional stochastic methods, our theoretical analysis establishes that S-D-RSM achieves global convergence without requiring diminishing step sizes or strong convexity assumptions. Furthermore, it achieves an iteration complexity of $\mathcal{O}(1/\epsilon)$ with respect to both the objective function value and the consensus error. Numerical experiments show that S-D-RSM achieves up to 2--3$\times$ speedup compared to state-of-the-art baselines, while maintaining comparable or better accuracy. These results not only validate the algorithm's theoretical guarantees but also demonstrate its effectiveness in practical tasks such as compressed sensing and empirical risk minimization.
toXiv_bot_toot

@arXiv_mathOC_bot@mastoxiv.page
2025-11-14 13:23:10

Replaced article(s) found for math.OC. arxiv.org/list/math.OC/new
[1/1]:
- A robust BFGS algorithm for unconstrained nonlinear optimization problems
Yaguang Yang
arxiv.org/abs/1212.5929
- Quantum computing and the stable set problem
Aljaž Krpan, Janez Povh, Dunja Pucher
arxiv.org/abs/2405.12845 mastoxiv.page/@arXiv_mathOC_bo
- Mean Field Game with Reflected Jump Diffusion Dynamics: A Linear Programming Approach
Zongxia Liang, Xiang Yu, Keyu Zhang
arxiv.org/abs/2508.20388 mastoxiv.page/@arXiv_mathOC_bo
- Differential Dynamic Programming for the Optimal Control Problem with an Ellipsoidal Target Set a...
Sungjun Eom, Gyunghoon Park
arxiv.org/abs/2509.07546 mastoxiv.page/@arXiv_mathOC_bo
- On the Moreau envelope properties of weakly convex functions
Marien Renaud, Arthur Leclaire, Nicolas Papadakis
arxiv.org/abs/2509.13960 mastoxiv.page/@arXiv_mathOC_bo
- Automated algorithm design via Nevanlinna-Pick interpolation
Ibrahim K. Ozaslan, Tryphon T. Georgiou, Mihailo R. Jovanovic
arxiv.org/abs/2509.21416 mastoxiv.page/@arXiv_mathOC_bo
- Optimal Control of a Bioeconomic Crop-Energy System with Energy Reinvestment
Othman Cherkaoui Dekkaki
arxiv.org/abs/2510.11381 mastoxiv.page/@arXiv_mathOC_bo
- Point Convergence Analysis of the Accelerated Gradient Method for Multiobjective Optimization: Co...
Yingdong Yin
arxiv.org/abs/2510.26382 mastoxiv.page/@arXiv_mathOC_bo
- History-Aware Adaptive High-Order Tensor Regularization
Chang He, Bo Jiang, Yuntian Jiang, Chuwen Zhang, Shuzhong Zhang
arxiv.org/abs/2511.05788
- Equivalence of entropy solutions and gradient flows for pressureless 1D Euler systems
José Antonio Carrillo, Sondre Tesdal Galtung
arxiv.org/abs/2312.04932 mastoxiv.page/@arXiv_mathAP_bo
- Kernel Modelling of Fading Memory Systems
Yongkang Huo, Thomas Chaffey, Rodolphe Sepulchre
arxiv.org/abs/2403.11945 mastoxiv.page/@arXiv_eessSY_bo
- The Maximum Theoretical Ground Speed of the Wheeled Vehicle
Altay Zhakatayev, Mukatai Nemerebayev
arxiv.org/abs/2502.15341 mastoxiv.page/@arXiv_physicscl
- Hessian stability and convergence rates for entropic and Sinkhorn potentials via semiconcavity
Giacomo Greco, Luca Tamanini
arxiv.org/abs/2504.11133 mastoxiv.page/@arXiv_mathPR_bo
- Optimizing the ground state energy of the three-dimensional magnetic Dirichlet Laplacian with con...
Matthias Baur
arxiv.org/abs/2504.21597 mastoxiv.page/@arXiv_mathph_bo
- A localized consensus-based sampling algorithm
Arne Bouillon, Alexander Bodard, Panagiotis Patrinos, Dirk Nuyens, Giovanni Samaey
arxiv.org/abs/2505.24861 mastoxiv.page/@arXiv_mathNA_bo
- A Novel Sliced Fused Gromov-Wasserstein Distance
Moritz Piening, Robert Beinert
arxiv.org/abs/2508.02364 mastoxiv.page/@arXiv_csLG_bot/
- Minimal Regret Walras Equilibria for Combinatorial Markets via Duality, Integrality, and Sensitiv...
Aloïs Duguet, Tobias Harks, Martin Schmidt, Julian Schwarz
arxiv.org/abs/2511.09021 mastoxiv.page/@arXiv_csGT_bot/
toXiv_bot_toot