SupCLAP: Controlling Optimization Trajectory Drift in Audio-Text Contrastive Learning with Support Vector Regularization
Jiehui Luo, Yuguo Yin, Yuxin Xie, Jinghan Ru, Xianwei Zhuang, Minghua He, Aofan Liu, Zihan Xiong, Dongchao Yang
https://arxiv.org/abs/2509.21033
Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[2/8]:
- Regularization can make diffusion models more efficient
Mahsa Taheri, Johannes Lederer
Mitigating Forgetting in Low Rank Adaptation
Joanna Sliwa, Frank Schneider, Philipp Hennig, Jose Miguel Hernandez-Lobato
https://arxiv.org/abs/2512.17720 https://arxiv.org/pdf/2512.17720 https://arxiv.org/html/2512.17720
arXiv:2512.17720v1 Announce Type: new
Abstract: Parameter-efficient fine-tuning methods, such as Low-Rank Adaptation (LoRA), enable fast specialization of large pre-trained models to different downstream applications. However, this process often leads to catastrophic forgetting of the model's prior domain knowledge. We address this issue with LaLoRA, a weight-space regularization technique that applies a Laplace approximation to Low-Rank Adaptation. Our approach estimates the model's confidence in each parameter and constrains updates in high-curvature directions, preserving prior knowledge while enabling efficient target-domain learning. By applying the Laplace approximation only to the LoRA weights, the method remains lightweight. We evaluate LaLoRA by fine-tuning a Llama model for mathematical reasoning and demonstrate an improved learning-forgetting trade-off, which can be directly controlled via the method's regularization strength. We further explore different loss landscape curvature approximations for estimating parameter confidence, analyze the effect of the data used for the Laplace approximation, and study robustness across hyperparameters.
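The weight-space regularizer described in this abstract can be illustrated on a toy linear model. The sketch below is not the paper's implementation: it uses a diagonal Gauss-Newton curvature estimate as a stand-in for the Laplace-approximation confidence, and all sizes, data, and constants are illustrative. The idea is the same, though: penalize fine-tuning updates in proportion to the prior task's curvature so that high-confidence directions move less.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy prior task: a linear model y = X @ w with "pre-trained" weights w0.
d = 8
X_prior = rng.normal(size=(200, d))
w0 = rng.normal(size=d)
y_prior = X_prior @ w0

# Diagonal curvature of the prior squared-error loss (Gauss-Newton
# diagonal; a simple stand-in for the Laplace/Fisher estimate).
F = np.mean(X_prior ** 2, axis=0)

# New target task with a shifted ground-truth weight vector.
X_new = rng.normal(size=(100, d))
w_true = w0 + rng.normal(scale=0.5, size=d)
y_new = X_new @ w_true

def finetune(lam, steps=2000, lr=0.05):
    # Gradient descent on the new-task loss plus the curvature-weighted
    # penalty lam * sum_i F_i * (w_i - w0_i)^2 that constrains updates
    # in high-curvature (high-confidence) directions.
    w = w0.copy()
    for _ in range(steps):
        grad = X_new.T @ (X_new @ w - y_new) / len(y_new) + lam * F * (w - w0)
        w = w - lr * grad
    return w

w_plain = finetune(lam=0.0)   # unregularized fine-tuning
w_reg = finetune(lam=1.0)     # curvature-regularized fine-tuning

def prior_mse(w):
    # "Forgetting" proxy: error on the prior task after fine-tuning.
    return np.mean((X_prior @ w - y_prior) ** 2)
```

With the penalty active, the fine-tuned weights retain more prior-task accuracy while still reducing the new-task loss, and the regularization strength directly controls this learning-forgetting trade-off, as in the abstract.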
toXiv_bot_toot
Crosslisted article(s) found for cs.CL. https://arxiv.org/list/cs.CL/new
[2/3]:
- Rediscovering Entropy Regularization: Adaptive Coefficient Unlocks Its Potential for LLM Reinforc...
Xiaoyun Zhang, Xiaojian Yuan, Di Huang, Wang You, Chen Hu, Jingqing Ruan, Kejiang Chen, Xing Hu
A Comprehensive Survey of Website Fingerprinting Attacks and Defenses in Tor: Advances and Open Challenges
Yuwen Cui, Guangjing Wang, Khanh Vu, Kai Wei, Kehan Shen, Zhengyuan Jiang, Xiao Han, Ning Wang, Zhuo Lu, Yao Liu
https://arxiv.org/abs/2510.11804
Crosslisted article(s) found for cs.AI. https://arxiv.org/list/cs.AI/new
[4/8]:
- DM1: MeanFlow with Dispersive Regularization for 1-Step Robotic Manipulation
Guowei Zou, Haitao Wang, Hejun Wu, Yukun Qian, Yuhang Wang, Weibing Li
Contrastive Representation Regularization for Vision-Language-Action Models
Taeyoung Kim, Jimin Lee, Myungkyu Koo, Dongyoung Kim, Kyungmin Lee, Changyeon Kim, Younggyo Seo, Jinwoo Shin
https://arxiv.org/abs/2510.01711
Instance-Aware Robust Consistency Regularization for Semi-Supervised Nuclei Instance Segmentation
Zenan Lin, Wei Li, Jintao Chen, Zihao Wu, Wenxiong Kang, Changxin Gao, Liansheng Wang, Jin-Gang Yu
https://arxiv.org/abs/2510.09329
FM-IRL: Flow-Matching for Reward Modeling and Policy Regularization in Reinforcement Learning
Zhenglin Wan, Jingxuan Wu, Xingrui Yu, Chubin Zhang, Mingcong Lei, Bo An, Ivor Tsang
https://arxiv.org/abs/2510.09222
Optimizing Cross-Domain Transfer for Universal Machine Learning Interatomic Potentials
Jaesun Kim, Jinmu You, Yutack Park, Yunsung Lim, Yujin Kang, Jisu Kim, Haekwan Jeon, Deokgi Hong, Seung Yul Lee, Saerom Choi, Yongdeok Kim, Jae W. Lee, Seungwu Han
https://arxiv.org/abs/2510.11241
Macroeconomic Forecasting and Machine Learning
Ta-Chung Chi, Ting-Han Fan, Raffaele M. Ghigliazza, Domenico Giannone, Zixuan (Kevin) Wang
https://arxiv.org/abs/2510.11008
Token-Level Policy Optimization: Linking Group-Level Rewards to Token-Level Aggregation via Markov Likelihood
Xingyu Lin, Yilin Wen, En Wang, Du Su, Wenbin Liu, Chenfu Bao, Zhonghou Lv
https://arxiv.org/abs/2510.09369
Continual Action Quality Assessment via Adaptive Manifold-Aligned Graph Regularization
Kanglei Zhou, Qingyi Pan, Xingxing Zhang, Hubert P. H. Shum, Frederick W. B. Li, Xiaohui Liang, Liyuan Wang
https://arxiv.org/abs/2510.06842
Crosslisted article(s) found for math.MG. https://arxiv.org/list/math.MG/new
[1/1]:
- Optimal Regularization Under Uncertainty: Distributional Robustness and Convexity Constraints
Oscar Leong, Eliza O'Reilly, Yong Sheng Soh
Regularized Random Fourier Features and Finite Element Reconstruction for Operator Learning in Sobolev Space
Xinyue Yu, Hayden Schaeffer
https://arxiv.org/abs/2512.17884 https://arxiv.org/pdf/2512.17884 https://arxiv.org/html/2512.17884
arXiv:2512.17884v1 Announce Type: new
Abstract: Operator learning is a data-driven approximation of mappings between infinite-dimensional function spaces, such as the solution operators of partial differential equations. Kernel-based operator learning approaches can offer accurate, theoretically justified approximations that require less training than standard methods. However, they can become computationally prohibitive for large training sets and can be sensitive to noise. We propose a regularized random Fourier feature (RRFF) approach, coupled with a finite element reconstruction map (RRFF-FEM), for learning operators from noisy data. The method uses random features drawn from multivariate Student's $t$ distributions, together with frequency-weighted Tikhonov regularization that suppresses high-frequency noise. We establish high-probability bounds on the extreme singular values of the associated random feature matrix and show that when the number of features $N$ scales like $m \log m$ with the number of training samples $m$, the system is well-conditioned, which yields estimation and generalization guarantees. Detailed numerical experiments on benchmark PDE problems, including advection, Burgers', Darcy flow, Helmholtz, Navier-Stokes, and structural mechanics, demonstrate that RRFF and RRFF-FEM are robust to noise and achieve improved performance with reduced training time compared to the unregularized random feature model, while maintaining competitive accuracy relative to kernel and neural operator methods.
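The two ingredients named in this abstract — Student's $t$ random frequencies and frequency-weighted Tikhonov regularization — can be sketched in a 1-D regression toy (illustrative only; the paper works with operators and a finite element reconstruction, not scalar regression, and the bandwidth and penalty constants below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy 1-D target on [-1, 1].
m, N = 200, 400                # training samples, random features
x = np.sort(rng.uniform(-1, 1, m))
y = np.sin(3 * np.pi * x) + 0.1 * rng.normal(size=m)

# Heavy-tailed random frequencies from a Student's t distribution;
# the bandwidth factor 3 is an illustrative choice.
omega = 3.0 * rng.standard_t(df=4, size=N)
phase = rng.uniform(0, 2 * np.pi, N)
Phi = np.cos(np.outer(x, omega) + phase)      # m x N feature matrix

# Frequency-weighted Tikhonov regularization: features with larger
# |omega| are penalized more, damping high-frequency noise fitting.
lam = 1e-3
W = np.diag(lam * (1.0 + omega ** 2))
c = np.linalg.solve(Phi.T @ Phi + W, Phi.T @ y)

train_rmse = np.sqrt(np.mean((Phi @ c - y) ** 2))
```

The weighted penalty plays the role of the frequency-dependent regularizer in the abstract: it leaves low-frequency features nearly free while shrinking coefficients on high-frequency features, which is what suppresses noise amplification.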
Replaced article(s) found for eess.SY. https://arxiv.org/list/eess.SY/new
[1/1]:
- Sparse dynamic network reconstruction through L1-regularization of a Lyapunov equation
Belaustegui, Arango, Rossi-Pool, Leonard, Franci
Calibratable Disambiguation Loss for Multi-Instance Partial-Label Learning
Wei Tang, Yin-Fang Yang, Weijia Zhang, Min-Ling Zhang
https://arxiv.org/abs/2512.17788 https://arxiv.org/pdf/2512.17788 https://arxiv.org/html/2512.17788
arXiv:2512.17788v1 Announce Type: new
Abstract: Multi-instance partial-label learning (MIPL) is a weakly supervised framework that extends the principles of multi-instance learning (MIL) and partial-label learning (PLL) to address the challenges of inexact supervision in both instance and label spaces. However, existing MIPL approaches often suffer from poor calibration, undermining classifier reliability. In this work, we propose a plug-and-play calibratable disambiguation loss (CDL) that simultaneously improves classification accuracy and calibration performance. The loss has two instantiations: the first one calibrates predictions based on probabilities from the candidate label set, while the second one integrates probabilities from both candidate and non-candidate label sets. The proposed CDL can be seamlessly incorporated into existing MIPL and PLL frameworks. We provide a theoretical analysis that establishes the lower bound and regularization properties of CDL, demonstrating its superiority over conventional disambiguation losses. Experimental results on benchmark and real-world datasets confirm that our CDL significantly enhances both classification and calibration performance.
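For context on what a "disambiguation loss" is: the conventional candidate-set loss that CDL is compared against (the paper's CDL itself is not reproduced here) penalizes the negative log of the total probability mass the model places on the candidate label set. A minimal sketch:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def candidate_set_loss(logits, candidate_mask):
    """Conventional PLL disambiguation loss (baseline, not the paper's CDL).

    logits: (n, k) class scores; candidate_mask: (n, k) boolean,
    True where the class is in the sample's candidate label set.
    """
    p = softmax(logits)
    p_cand = (p * candidate_mask).sum(axis=-1)
    return -np.log(np.clip(p_cand, 1e-12, 1.0)).mean()

# Example: 2 samples, 4 classes, candidate sets {0, 1} and {2}.
logits = np.array([[2.0, 0.5, 0.1, -1.0],
                   [0.0, 0.0, 3.0, 0.0]])
mask = np.array([[True, True, False, False],
                 [False, False, True, False]])
loss = candidate_set_loss(logits, mask)
```

The loss drops toward zero as probability mass concentrates inside the candidate sets; the paper's contribution is a calibratable variant of this kind of objective, with a second instantiation that also uses the non-candidate set.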
Replaced article(s) found for stat.ML. https://arxiv.org/list/stat.ML/new
[1/2]:
- Multiparameter regularization and aggregation in the context of polynomial functional regression
Gizewski, Holzleitner, Mayer-Suess, Pereverzyev, Pereverzyev
Dirichlet-Prior Shaping: Guiding Expert Specialization in Upcycled MoEs
Leyla Mirvakhabova, Babak Ehteshami Bejnordi, Gaurav Kumar, Hanxue Liang, Wanru Zhao, Paul Whatmough
https://arxiv.org/abs/2510.01185
On fundamental properties of high-order forward-backward envelope
Alireza Kabgani, Masoud Ahookhosh
https://arxiv.org/abs/2511.10421 https://arxiv.org/pdf/2511.10421 https://arxiv.org/html/2511.10421
arXiv:2511.10421v1 Announce Type: new
Abstract: This paper studies the fundamental properties of the high-order forward-backward splitting mapping (HiFBS) and its associated forward-backward envelope (HiFBE) through the lens of high-order regularization for nonconvex composite functions. Specifically, we (i) establish the boundedness and uniform boundedness of HiFBS, along with the H\"older and Lipschitz continuity of HiFBE; (ii) derive an explicit form for the subdifferentials of HiFBE; and (iii) investigate necessary and sufficient conditions for the differentiability and weak smoothness of HiFBE under suitable assumptions. By leveraging the prox-regularity of $g$ and the concept of $p$-calmness, we further demonstrate the local single-valuedness and continuity of HiFBS, which in turn guarantee the differentiability of HiFBE in neighborhoods of calm points. This paves the way for the development of gradient-based algorithms tailored to nonconvex composite optimization problems.
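The high-order envelope studied here generalizes the classic forward-backward envelope, which is the quadratic case $p = 2$ of the model $f(x) + \langle \nabla f(x), u - x\rangle + \tfrac{1}{p\gamma}\|u - x\|^p + g(u)$. A sketch of that base case for $f(x) = \tfrac12\|Ax - b\|^2$ and $g(x) = \mu\|x\|_1$ (illustrative problem data):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(20, 10))
b = rng.normal(size=20)
mu = 0.1
gamma = 1.0 / np.linalg.norm(A, 2) ** 2   # step size 1/L, L = ||A||_2^2

def f(x): return 0.5 * np.sum((A @ x - b) ** 2)
def gradf(x): return A.T @ (A @ x - b)
def g(x): return mu * np.sum(np.abs(x))

def prox_g(v, t):
    # Soft-thresholding: proximal map of t * mu * ||.||_1.
    return np.sign(v) * np.maximum(np.abs(v) - t * mu, 0.0)

def fbe(x):
    # Classic (p = 2) forward-backward envelope: value of the partially
    # linearized model at the forward-backward step
    # T(x) = prox_{gamma g}(x - gamma * gradf(x)).
    u = prox_g(x - gamma * gradf(x), gamma)
    return (f(x) + gradf(x) @ (u - x)
            + np.sum((u - x) ** 2) / (2 * gamma) + g(u))

x0 = np.zeros(10)
x = x0.copy()
for _ in range(1000):                      # forward-backward splitting
    x = prox_g(x - gamma * gradf(x), gamma)
```

Two properties worth checking numerically: the envelope lower-bounds the composite objective $f + g$ everywhere (since the model's minimizer does at least as well as $u = x$), and it coincides with it at fixed points of the forward-backward map — the differentiability of this envelope near such points is exactly what the abstract's analysis extends to the high-order case.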
Replaced article(s) found for stat.ME. https://arxiv.org/list/stat.ME/new
[1/1]:
- Joint identification of spatially variable genes via a network-assisted Bayesian regularization a...
Mingcong Wu, Yang Li, Shuangge Ma, Mengyun Wu
Global Convergence of Four-Layer Matrix Factorization under Random Initialization
Minrui Luo, Weihang Xu, Xiang Gao, Maryam Fazel, Simon Shaolei Du
https://arxiv.org/abs/2511.09925 https://arxiv.org/pdf/2511.09925 https://arxiv.org/html/2511.09925
arXiv:2511.09925v1 Announce Type: new
Abstract: Gradient descent dynamics on the deep matrix factorization problem is extensively studied as a simplified theoretical model for deep neural networks. Although the convergence theory for two-layer matrix factorization is well-established, no global convergence guarantee for general deep matrix factorization under random initialization has been established to date. To address this gap, we provide a polynomial-time global convergence guarantee for randomly initialized gradient descent on four-layer matrix factorization, given certain conditions on the target matrix and a standard balanced regularization term. Our analysis employs new techniques to show saddle-avoidance properties of gradient decent dynamics, and extends previous theories to characterize the change in eigenvalues of layer weights.
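The setting in this abstract can be reproduced at toy scale (sizes, initialization scale, and step size below are illustrative, not the paper's): randomly initialized gradient descent on a four-layer factorization $W_4W_3W_2W_1 \approx M$ with the standard balanced regularizer $\sum_i \|W_{i+1}^\top W_{i+1} - W_i W_i^\top\|_F^2$.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 4
M = rng.normal(size=(d, d))
W = [0.5 * rng.normal(size=(d, d)) for _ in range(4)]   # random init
lam, lr = 0.1, 0.005

def product(W):
    P = W[0]
    for Wi in W[1:]:
        P = Wi @ P
    return P                                # W4 W3 W2 W1

def loss(W):
    fit = np.sum((product(W) - M) ** 2)
    bal = sum(np.sum((W[i + 1].T @ W[i + 1] - W[i] @ W[i].T) ** 2)
              for i in range(3))
    return fit + lam * bal

init_loss = loss(W)
for _ in range(8000):
    R = product(W) - M
    grads = []
    for i in range(4):
        Aup = np.eye(d)
        for Wj in W[i + 1:]:
            Aup = Wj @ Aup                  # product of layers above W_i
        Blo = np.eye(d)
        for Wj in W[:i]:
            Blo = Wj @ Blo                  # product of layers below W_i
        g = 2 * Aup.T @ R @ Blo.T           # gradient of the fit term
        if i < 3:                           # W_i is the lower factor in D_i
            D = W[i + 1].T @ W[i + 1] - W[i] @ W[i].T
            g -= 4 * lam * D @ W[i]
        if i > 0:                           # W_i is the upper factor in D_{i-1}
            D = W[i].T @ W[i] - W[i - 1] @ W[i - 1].T
            g += 4 * lam * W[i] @ D
        grads.append(g)
    W = [Wi - lr * gi for Wi, gi in zip(W, grads)]
final_loss = loss(W)
```

On a generic small instance like this, gradient descent drives the regularized objective well below its initial value; the abstract's contribution is a polynomial-time guarantee that this happens globally from random initialization.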
S-D-RSM: Stochastic Distributed Regularized Splitting Method for Large-Scale Convex Optimization Problems
Maoran Wang, Xingju Cai, Yongxin Chen
https://arxiv.org/abs/2511.10133 https://arxiv.org/pdf/2511.10133 https://arxiv.org/html/2511.10133
arXiv:2511.10133v1 Announce Type: new
Abstract: This paper investigates large-scale distributed composite convex optimization problems, with motivations from a broad range of applications, including multi-agent systems, federated learning, smart grids, wireless sensor networks, and compressed sensing. Stochastic gradient descent (SGD) and its variants are commonly employed to solve such problems. However, existing algorithms often rely on vanishing step sizes, strong convexity assumptions, or entail substantial computational overhead to ensure convergence or obtain favorable complexity. To bridge the gap between theory and practice, we integrate consensus optimization and operator splitting techniques (see Problem Reformulation) to develop a novel stochastic splitting algorithm, termed the \emph{stochastic distributed regularized splitting method} (S-D-RSM). In practice, S-D-RSM performs parallel updates of proximal mappings and gradient information for only a randomly selected subset of agents at each iteration. By introducing regularization terms, it effectively mitigates consensus discrepancies among distributed nodes. In contrast to conventional stochastic methods, our theoretical analysis establishes that S-D-RSM achieves global convergence without requiring diminishing step sizes or strong convexity assumptions. Furthermore, it achieves an iteration complexity of $\mathcal{O}(1/\epsilon)$ with respect to both the objective function value and the consensus error. Numerical experiments show that S-D-RSM achieves up to 2--3$\times$ speedup compared to state-of-the-art baselines, while maintaining comparable or better accuracy. These results not only validate the algorithm's theoretical guarantees but also demonstrate its effectiveness in practical tasks such as compressed sensing and empirical risk minimization.
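The flavor of the abstract's update — proximal steps for a random subset of agents, with a regularization term pulling local variables toward consensus — can be sketched on distributed least squares. This is a generic randomized proximal-consensus sketch, not the paper's S-D-RSM updates; all constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n_agents, d = 8, 5
w_true = rng.normal(size=d)

# Each agent holds local data for f_i(x) = 0.5 * ||A_i x - b_i||^2.
A = [rng.normal(size=(10, d)) for _ in range(n_agents)]
b = [Ai @ w_true + 0.1 * rng.normal(size=10) for Ai in A]
rho = 5.0                                   # consensus regularization strength

x = [np.zeros(d) for _ in range(n_agents)]  # local variables
z = np.zeros(d)                             # consensus point
for _ in range(300):
    # Only a random subset of agents updates at each iteration.
    S = rng.choice(n_agents, size=3, replace=False)
    for i in S:
        # Proximal step regularized toward the consensus point:
        # argmin_x f_i(x) + (rho/2) * ||x - z||^2 (closed form here).
        x[i] = np.linalg.solve(A[i].T @ A[i] + rho * np.eye(d),
                               A[i].T @ b[i] + rho * z)
    z = np.mean(x, axis=0)                  # re-average local variables

def global_obj(w):
    return sum(0.5 * np.sum((Ai @ w - bi) ** 2) for Ai, bi in zip(A, b))
```

With a constant rho and constant effective step size, the consensus iterate still settles near the global least-squares solution; the paper's analysis establishes this kind of behavior for general composite convex problems without diminishing step sizes.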
Replaced article(s) found for math.OC. https://arxiv.org/list/math.OC/new
[1/1]:
- A robust BFGS algorithm for unconstrained nonlinear optimization problems
Yaguang Yang
https://arxiv.org/abs/1212.5929
- Quantum computing and the stable set problem
Aljaž Krpan, Janez Povh, Dunja Pucher
https://arxiv.org/abs/2405.12845 https://mastoxiv.page/@arXiv_mathOC_bot/112483516437815686
- Mean Field Game with Reflected Jump Diffusion Dynamics: A Linear Programming Approach
Zongxia Liang, Xiang Yu, Keyu Zhang
https://arxiv.org/abs/2508.20388 https://mastoxiv.page/@arXiv_mathOC_bot/115111048711698998
- Differential Dynamic Programming for the Optimal Control Problem with an Ellipsoidal Target Set a...
Sungjun Eom, Gyunghoon Park
https://arxiv.org/abs/2509.07546 https://mastoxiv.page/@arXiv_mathOC_bot/115179281556444440
- On the Moreau envelope properties of weakly convex functions
Marien Renaud, Arthur Leclaire, Nicolas Papadakis
https://arxiv.org/abs/2509.13960 https://mastoxiv.page/@arXiv_mathOC_bot/115224514482363803
- Automated algorithm design via Nevanlinna-Pick interpolation
Ibrahim K. Ozaslan, Tryphon T. Georgiou, Mihailo R. Jovanovic
https://arxiv.org/abs/2509.21416 https://mastoxiv.page/@arXiv_mathOC_bot/115286533597711930
- Optimal Control of a Bioeconomic Crop-Energy System with Energy Reinvestment
Othman Cherkaoui Dekkaki
https://arxiv.org/abs/2510.11381 https://mastoxiv.page/@arXiv_mathOC_bot/115372322896073250
- Point Convergence Analysis of the Accelerated Gradient Method for Multiobjective Optimization: Co...
Yingdong Yin
https://arxiv.org/abs/2510.26382 https://mastoxiv.page/@arXiv_mathOC_bot/115468018035252078
- History-Aware Adaptive High-Order Tensor Regularization
Chang He, Bo Jiang, Yuntian Jiang, Chuwen Zhang, Shuzhong Zhang
https://arxiv.org/abs/2511.05788
- Equivalence of entropy solutions and gradient flows for pressureless 1D Euler systems
José Antonio Carrillo, Sondre Tesdal Galtung
https://arxiv.org/abs/2312.04932 https://mastoxiv.page/@arXiv_mathAP_bot/111560077272113052
- Kernel Modelling of Fading Memory Systems
Yongkang Huo, Thomas Chaffey, Rodolphe Sepulchre
https://arxiv.org/abs/2403.11945 https://mastoxiv.page/@arXiv_eessSY_bot/112121123836064435
- The Maximum Theoretical Ground Speed of the Wheeled Vehicle
Altay Zhakatayev, Mukatai Nemerebayev
https://arxiv.org/abs/2502.15341 https://mastoxiv.page/@arXiv_physicsclassph_bot/114057765769441123
- Hessian stability and convergence rates for entropic and Sinkhorn potentials via semiconcavity
Giacomo Greco, Luca Tamanini
https://arxiv.org/abs/2504.11133 https://mastoxiv.page/@arXiv_mathPR_bot/114346453424694503
- Optimizing the ground state energy of the three-dimensional magnetic Dirichlet Laplacian with con...
Matthias Baur
https://arxiv.org/abs/2504.21597 https://mastoxiv.page/@arXiv_mathph_bot/114431404740241516
- A localized consensus-based sampling algorithm
Arne Bouillon, Alexander Bodard, Panagiotis Patrinos, Dirk Nuyens, Giovanni Samaey
https://arxiv.org/abs/2505.24861 https://mastoxiv.page/@arXiv_mathNA_bot/114612580684567066
- A Novel Sliced Fused Gromov-Wasserstein Distance
Moritz Piening, Robert Beinert
https://arxiv.org/abs/2508.02364 https://mastoxiv.page/@arXiv_csLG_bot/114976243138728278
- Minimal Regret Walras Equilibria for Combinatorial Markets via Duality, Integrality, and Sensitiv...
Aloïs Duguet, Tobias Harks, Martin Schmidt, Julian Schwarz
https://arxiv.org/abs/2511.09021 https://mastoxiv.page/@arXiv_csGT_bot/115541243299714775
Crosslisted article(s) found for stat.ML. https://arxiv.org/list/stat.ML/new
[3/3]:
- Overlap-Adaptive Regularization for Conditional Average Treatment Effect Estimation
Valentyn Melnychuk, Dennis Frauen, Jonas Schweisthal, Stefan Feuerriegel
Physics-informed learning under mixing: How physical knowledge speeds up learning
Anna Scampicchio, Leonardo F. Toso, Rahel Rickenbach, James Anderson, Melanie N. Zeilinger
https://arxiv.org/abs/2509.24801
Bundle Network: a Machine Learning-Based Bundle Method
Francesca Demelas, Joseph Le Roux, Antonio Frangioni, Mathieu Lacroix, Emiliano Traversi, Roberto Wolfler Calvo
https://arxiv.org/abs/2509.24736