Tootfinder

Opt-in global Mastodon full text search. Join the index!

@arXiv_statML_bot@mastoxiv.page
2025-07-14 09:01:32

Optimal and Practical Batched Linear Bandit Algorithm
Sanghoon Yu, Min-hwan Oh
arxiv.org/abs/2507.08438 arxiv.org/pdf…

@arXiv_csLG_bot@mastoxiv.page
2025-07-14 07:56:42

Tree-Structured Parzen Estimator Can Solve Black-Box Combinatorial Optimization More Efficiently
Kenshin Abe, Yunzhuo Wang, Shuhei Watanabe
arxiv.org/abs/2507.08053 arxiv.org/pdf/2507.08053 arxiv.org/html/2507.08053
arXiv:2507.08053v1 Announce Type: new
Abstract: The tree-structured Parzen estimator (TPE) is a versatile hyperparameter optimization (HPO) method supported by popular HPO tools. Since these HPO tools have been developed in line with the trend of deep learning (DL), problem setups common in the DL domain, such as multi-objective and multi-fidelity optimization, have been discussed for TPE. However, the practical applications of HPO are not limited to DL, and black-box combinatorial optimization is actively used in some domains, e.g., chemistry and biology. Since combinatorial optimization has been an untouched, yet very important, topic for TPE, we propose an efficient combinatorial optimization algorithm for TPE. In this paper, we first generalize the categorical kernel with the numerical kernel in TPE, enabling us to introduce a distance structure into the categorical kernel. We then discuss modifications of the newly developed kernel to handle a large combinatorial search space; these modifications reduce the time complexity of the kernel calculation with respect to the size of the combinatorial search space. In experiments on synthetic problems, we verify that our proposed method identifies better solutions with fewer evaluations than the original TPE. Our algorithm is available in Optuna, an open-source framework for HPO.
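The abstract notes that the algorithm ships in Optuna, so a minimal sketch of black-box combinatorial optimization through Optuna's public TPE interface may help for orientation. The four-letter alphabet, the target sequence, and the Hamming-distance objective below are illustrative assumptions, not the paper's benchmark problems.

# Illustrative sketch: TPE over a purely categorical (combinatorial) search space in Optuna.
import optuna

CHOICES = ["A", "C", "G", "T"]                 # toy sequence-design alphabet (assumption)
TARGET = ["G", "A", "T", "A", "C", "A"]        # hypothetical optimum the sampler must find

def objective(trial: optuna.Trial) -> float:
    # Each position is a categorical decision, so the space is combinatorial: 4**6 candidates.
    seq = [trial.suggest_categorical(f"x{i}", CHOICES) for i in range(len(TARGET))]
    # Black-box score: Hamming distance to the target, unknown to the sampler.
    return float(sum(a != b for a, b in zip(seq, TARGET)))

study = optuna.create_study(
    direction="minimize",
    sampler=optuna.samplers.TPESampler(seed=0),
)
study.optimize(objective, n_trials=200)
print(study.best_params, study.best_value)

Whether the distance-aware categorical kernel described in the abstract is active by default, or requires a particular Optuna release or sampler option, is not stated in the toot.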

@heiseonline@social.heise.de
2025-06-13 13:52:00

heise | Small household, small consumption: optimizing electricity costs for single-person households
There have never been as many single-person households in Germany as there are today, so we look at how these small households can optimize their electricity costs.

@arXiv_csRO_bot@mastoxiv.page
2025-06-13 08:06:50

Multi-Timescale Dynamics Model Bayesian Optimization for Plasma Stabilization in Tokamaks
Rohit Sonker, Alexandre Capone, Andrew Rothstein, Hiro Josep Farre Kaga, Egemen Kolemen, Jeff Schneider
arxiv.org/abs/2506.10287

@arXiv_csSE_bot@mastoxiv.page
2025-06-13 08:18:30

AdaptiveLLM: A Framework for Selecting Optimal Cost-Efficient LLM for Code-Generation Based on CoT Length
Junhang Cheng, Fang Liu, Chengru Wu, Li Zhang
arxiv.org/abs/2506.10525

@arXiv_quantph_bot@mastoxiv.page
2025-07-14 09:46:52

Towards solving large QUBO problems using quantum algorithms: improving the LogQ scheme
Yagnik Chatterjee, Jérémie Messud
arxiv.org/abs/2507.08489

@migueldeicaza@mastodon.social
2025-06-13 22:43:01

This session has a plot twist at the end that qualifies as a cinematic masterpiece of the year:
developer.apple.com/videos/pla

@arXiv_eessSY_bot@mastoxiv.page
2025-06-13 08:11:00

Learning-Based Stable Optimal Control for Infinite-Time Nonlinear Regulation Problems
Han Wang, Di Wu, Lin Cheng, Shengping Gong, Xu Huang
arxiv.org/abs/2506.10291

@arXiv_csLG_bot@mastoxiv.page
2025-07-14 08:30:42

PDE-aware Optimizer for Physics-informed Neural Networks
Hardik Shukla, Manurag Khullar, Vismay Churiwala
arxiv.org/abs/2507.08118 arxiv.org/pdf/2507.08118 arxiv.org/html/2507.08118
arXiv:2507.08118v1 Announce Type: new
Abstract: Physics-Informed Neural Networks (PINNs) have emerged as a powerful framework for solving partial differential equations (PDEs) by embedding physical constraints into the loss function. However, standard optimizers such as Adam often struggle to balance competing loss terms, particularly in stiff or ill-conditioned systems. In this work, we propose a PDE-aware optimizer that adapts parameter updates based on the variance of per-sample PDE residual gradients. This method addresses gradient misalignment without incurring the heavy computational costs of second-order optimizers such as SOAP. We benchmark the PDE-aware optimizer against Adam and SOAP on 1D Burgers', Allen-Cahn and Korteweg-de Vries(KdV) equations. Across both PDEs, the PDE-aware optimizer achieves smoother convergence and lower absolute errors, particularly in regions with sharp gradients. Our results demonstrate the effectiveness of PDE residual-aware adaptivity in enhancing stability in PINNs training. While promising, further scaling on larger architectures and hardware accelerators remains an important direction for future research.
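The update rule is only described at a high level in the abstract, so the following PyTorch fragment is a hedged sketch of one way a variance-aware step on PDE residual gradients could look. The toy network, the Burgers-type residual, the micro-batch variance estimate, and the damping rule are all assumptions for illustration, not the authors' optimizer.

# Hedged sketch: damp parameter updates where PDE-residual gradients vary strongly across micro-batches.
import torch

torch.manual_seed(0)
net = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
params = list(net.parameters())

def burgers_residual(xt):
    # Residual of u_t + u * u_x - nu * u_xx for the 1D Burgers' equation (illustrative, nu = 0.01).
    xt = xt.clone().requires_grad_(True)
    u = net(xt)
    du = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
    u_x, u_t = du[:, 0], du[:, 1]
    u_xx = torch.autograd.grad(u_x.sum(), xt, create_graph=True)[0][:, 0]
    return u_t + u.squeeze(-1) * u_x - 0.01 * u_xx

lr, eps = 1e-3, 1e-8
for step in range(200):
    xt = torch.rand(64, 2) * 2 - 1                      # collocation points, columns (x, t)
    grads = []
    for chunk in xt.chunk(8):                           # micro-batches stand in for per-sample gradients
        loss = burgers_residual(chunk).pow(2).mean()
        g = torch.autograd.grad(loss, params)
        grads.append(torch.cat([gi.flatten() for gi in g]))
    G = torch.stack(grads)                              # (num_micro_batches, num_params)
    mean_g, std_g = G.mean(dim=0), G.std(dim=0)
    with torch.no_grad():
        flat_step = lr * mean_g / (std_g + eps)         # shrink steps along noisy residual-gradient directions
        offset = 0
        for p in params:
            n = p.numel()
            p -= flat_step[offset:offset + n].view_as(p)
            offset += n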

@arXiv_csLG_bot@mastoxiv.page
2025-07-14 08:19:51

Low-rank Momentum Factorization for Memory Efficient Training
Pouria Mahdavinia, Mehrdad Mahdavi
arxiv.org/abs/2507.08091 arxiv.org/pdf/2507.08091 arxiv.org/html/2507.08091
arXiv:2507.08091v1 Announce Type: new
Abstract: Fine-tuning large foundation models presents significant memory challenges due to stateful optimizers like AdamW, often requiring several times more GPU memory than inference. While memory-efficient methods like parameter-efficient fine-tuning (e.g., LoRA) and optimizer state compression exist, recent approaches like GaLore bridge these by using low-rank gradient projections and subspace moment accumulation. However, such methods may struggle with fixed subspaces or computationally costly offline resampling (e.g., requiring full-matrix SVDs). We propose Momentum Factorized SGD (MoFaSGD), which maintains a dynamically updated low-rank SVD representation of the first-order momentum, closely approximating its full-rank counterpart throughout training. This factorization enables a memory-efficient fine-tuning method that adaptively updates the optimization subspace at each iteration. Crucially, MoFaSGD leverages the computed low-rank momentum factors to perform efficient spectrally normalized updates, offering an alternative to subspace moment accumulation. We establish theoretical convergence guarantees for MoFaSGD, proving it achieves an optimal rate for non-convex stochastic optimization under standard assumptions. Empirically, we demonstrate MoFaSGD's effectiveness on large language model alignment benchmarks, achieving a competitive trade-off between memory reduction (comparable to LoRA) and performance compared to state-of-the-art low-rank optimization methods. Our implementation is available at github.com/pmahdavi/MoFaSGD.
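The core idea, keeping a weight matrix's first-order momentum as a low-rank SVD factorization that is refreshed each step and used for a spectrally normalized update, can be sketched in a few lines of PyTorch. This is an illustrative toy, not the released MoFaSGD code: the rank, the truncate-by-SVD refresh, and the least-squares problem are assumptions, and for clarity the sketch reconstructs the full momentum matrix each step, which the actual method is designed to avoid.

# Illustrative sketch: rank-r momentum kept as (U, S, V) and a spectrally normalized step U @ V.T.
import torch

def lowrank_momentum_step(W, grad, U, S, V, lr=1e-2, beta=0.9, rank=4):
    # Blend the new gradient into the (approximate) momentum, then re-truncate to rank `rank`.
    M = beta * (U @ torch.diag(S) @ V.T) + (1.0 - beta) * grad
    U_new, S_new, Vh_new = torch.linalg.svd(M, full_matrices=False)
    U, S, V = U_new[:, :rank], S_new[:rank], Vh_new[:rank, :].T
    # Spectrally normalized update: use the singular directions, not the raw singular values.
    W = W - lr * (U @ V.T)
    return W, U, S, V

# Toy usage on a random least-squares problem (illustrative).
torch.manual_seed(0)
W = torch.randn(64, 32)
U, S, V = torch.zeros(64, 4), torch.zeros(4), torch.zeros(32, 4)
X, Y = torch.randn(128, 64), torch.randn(128, 32)
for _ in range(100):
    grad = X.T @ (X @ W - Y) / X.shape[0]     # gradient of 0.5 * ||X W - Y||^2 / n
    W, U, S, V = lowrank_momentum_step(W, grad, U, S, V)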