Tootfinder

Opt-in global Mastodon full-text search. Join the index!

@arXiv_csLG_bot@mastoxiv.page
2025-07-14 08:19:51

Low-rank Momentum Factorization for Memory Efficient Training
Pouria Mahdavinia, Mehrdad Mahdavi
arxiv.org/abs/2507.08091 arxiv.org/pdf/2507.08091 arxiv.org/html/2507.08091
arXiv:2507.08091v1 Announce Type: new
Abstract: Fine-tuning large foundation models presents significant memory challenges due to stateful optimizers like AdamW, often requiring several times more GPU memory than inference. While memory-efficient methods like parameter-efficient fine-tuning (e.g., LoRA) and optimizer state compression exist, recent approaches like GaLore bridge these by using low-rank gradient projections and subspace moment accumulation. However, such methods may struggle with fixed subspaces or computationally costly offline resampling (e.g., requiring full-matrix SVDs). We propose Momentum Factorized SGD (MoFaSGD), which maintains a dynamically updated low-rank SVD representation of the first-order momentum, closely approximating its full-rank counterpart throughout training. This factorization enables a memory-efficient fine-tuning method that adaptively updates the optimization subspace at each iteration. Crucially, MoFaSGD leverages the computed low-rank momentum factors to perform efficient spectrally normalized updates, offering an alternative to subspace moment accumulation. We establish theoretical convergence guarantees for MoFaSGD, proving it achieves an optimal rate for non-convex stochastic optimization under standard assumptions. Empirically, we demonstrate MoFaSGD's effectiveness on large language model alignment benchmarks, achieving a competitive trade-off between memory reduction (comparable to LoRA) and performance compared to state-of-the-art low-rank optimization methods. Our implementation is available at github.com/pmahdavi/MoFaSGD.
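The abstract above describes the core loop of MoFaSGD: maintain a truncated-SVD factorization of the first-order momentum, refresh it as each gradient arrives, and step along the spectrally normalized directions. The sketch below is a loose, illustrative reading of that description only; the function name, hyperparameters, and the use of a full `np.linalg.svd` per step (the paper maintains the factorization incrementally and memory-efficiently) are assumptions, not the authors' implementation — see github.com/pmahdavi/MoFaSGD for the real one.

```python
import numpy as np

def mofasgd_step(W, grad, U, S, Vt, beta=0.9, lr=1e-3, rank=4):
    """One illustrative optimizer step on a weight matrix W.

    (U, S, Vt) is the current rank-`rank` SVD factorization of the
    momentum. All details here are guessed from the abstract.
    """
    # Blend the new stochastic gradient into the reconstructed momentum.
    M = beta * (U * S) @ Vt + (1.0 - beta) * grad
    # Re-factorize and truncate back to rank r. A full SVD is used here
    # purely for clarity; the paper's point is to avoid exactly this cost
    # by updating the factors dynamically.
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    U, S, Vt = U[:, :rank], S[:rank], Vt[:rank, :]
    # Spectrally normalized update: step along the singular directions
    # U @ Vt (an orthogonal-factor product with unit spectral norm),
    # discarding the raw singular-value magnitudes.
    W = W - lr * (U @ Vt)
    return W, U, S, Vt

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))
grad = rng.standard_normal((8, 6))
# Cold-start the momentum factors at zero.
U, S, Vt = np.zeros((8, 4)), np.zeros(4), np.zeros((4, 6))
W, U, S, Vt = mofasgd_step(W, grad, U, S, Vt)
```

Note the memory angle this is meant to illustrate: for a d×d layer the optimizer stores only the rank-r factors (roughly 2dr + r numbers) instead of a dense d×d momentum buffer, which is where the LoRA-comparable memory footprint comes from.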

@arXiv_csAI_bot@mastoxiv.page
2025-09-03 13:54:33

Dynamic Speculative Agent Planning
Yilin Guan, Wenyue Hua, Qingfeng Lan, Sun Fei, Dujian Ding, Devang Acharya, Chi Wang, William Yang Wang
arxiv.org/abs/2509.01920

@arXiv_csLG_bot@mastoxiv.page
2025-09-10 10:23:31

IP-Basis PINNs: Efficient Multi-Query Inverse Parameter Estimation
Shalev Manor, Mohammad Kohandel
arxiv.org/abs/2509.07245 arxiv.org/pdf/2…

@arXiv_statME_bot@mastoxiv.page
2025-06-30 09:08:20

Change Point Localization and Inference in Dynamic Multilayer Networks
Fan Wang, Kyle Ritscher, Yik Lun Kei, Xin Ma, Oscar Hernan Madrid Padilla
arxiv.org/abs/2506.21878

@arXiv_csCE_bot@mastoxiv.page
2025-08-04 08:51:31

Online Fine-Tuning of Carbon Emission Predictions using Real-Time Recurrent Learning for State Space Models
Julian Lemmel, Manuel Kranzl, Adam Lamine, Philipp Neubauer, Radu Grosu, Sophie Neubauer
arxiv.org/abs/2508.00804

@arXiv_csCV_bot@mastoxiv.page
2025-08-22 10:17:41

MapKD: Unlocking Prior Knowledge with Cross-Modal Distillation for Efficient Online HD Map Construction
Ziyang Yan, Ruikai Li, Zhiyong Cui, Bohan Li, Han Jiang, Yilong Ren, Aoyong Li, Zhenning Li, Sijia Wen, Haiyang Yu
arxiv.org/abs/2508.15653

@arXiv_eessSY_bot@mastoxiv.page
2025-08-29 08:55:11

Delay-adaptive Control of Nonlinear Systems with Approximate Neural Operator Predictors
Luke Bhan, Miroslav Krstic, Yuanyuan Shi
arxiv.org/abs/2508.20367

@arXiv_csCV_bot@mastoxiv.page
2025-08-26 12:32:47

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Weiyun Wang, Zhangwei Gao, Lixin Gu, Hengjun Pu, Long Cui, Xingguang Wei, Zhaoyang Liu, Linglin Jing, Shenglong Ye, Jie Shao, Zhaokai Wang, Zhe Chen, Hongjie Zhang, Ganlin Yang, Haomin Wang, Qi Wei, Jinhui Yin, Wenhao Li, Erfei Cui, Guanzhou Chen, Zichen Ding, Changyao Tian, Zhenyu Wu, Jingjing Xie, Zehao Li, Bowen Yang, Yuchen Duan, Xuehui Wang, Songze Li, Xiangyu Zhao, Haodong Duan, Nianche…

@arXiv_statCO_bot@mastoxiv.page
2025-08-04 09:12:41

Online Rolling Controlled Sequential Monte Carlo
Liwen Xue, Axel Finke, Adam M. Johansen
arxiv.org/abs/2508.00696 arxiv.org/pdf/2508.00696

@arXiv_csLG_bot@mastoxiv.page
2025-08-20 10:15:40

Revisiting Diffusion Q-Learning: From Iterative Denoising to One-Step Action Generation
Thanh Nguyen, Chang D. Yoo
arxiv.org/abs/2508.13904