
2025-08-26 07:51:36
Balancing the exploration-exploitation trade-off in active learning for surrogate model-based reliability analysis via multi-objective optimization
Jonathan A. Moran, Pablo G. Morato
https://arxiv.org/abs/2508.18170
General Proximal Quasi-Newton Methods based on model functions for nonsmooth nonconvex problems
Xiaoxi Jia, Peter Ochs
https://arxiv.org/abs/2507.18363 https://
Conservative quantum offline model-based optimization
Kristian Sotirov, Annie E. Paine, Savvas Varsamopoulos, Antonio A. Gentile, Osvaldo Simeone
https://arxiv.org/abs/2506.19714 …
Bayesian preference elicitation for decision support in multiobjective optimization
Felix Huber, Sebastian Rojas Gonzalez, Raul Astudillo
https://arxiv.org/abs/2507.16999
Can Limited Liability Increase Stability for Banks: A Dynamic Portfolio Approach
Deb Narayan Barik, Siddhartha P. Chakrabarty
https://arxiv.org/abs/2507.16494 https://
Learning in Repeated Multi-Objective Stackelberg Games with Payoff Manipulation
Phurinut Srisawad, Juergen Branke, Long Tran-Thanh
https://arxiv.org/abs/2508.14705 https://
Online survival analysis with quantile regression
Yi Deng, Shuwei Li, Liuquan Sun, Baoxue Zhang
https://arxiv.org/abs/2507.15696 https://
Efficient Visual Appearance Optimization by Learning from Prior Preferences
Zhipeng Li, Yi-Chi Liao, Christian Holz
https://arxiv.org/abs/2507.15355 https:…
An unconditional lower bound for the active-set method in convex quadratic maximization
Eleon Bach, Yann Disser, Sophie Huiberts, Nils Mosis
https://arxiv.org/abs/2507.16648
The Intrinsic Riemannian Proximal Gradient Method for Convex Optimization
Ronny Bergmann, Hajg Jasa, Paula John, Max Pfeffer
https://arxiv.org/abs/2507.16055
Reconstruction Codes for Deletions and Insertions: Connection, Distinction, and Construction
Yubo Sun, Gennian Ge
https://arxiv.org/abs/2508.14386 https://…
ALMA-IMF XIX: C18O (J=2-1): Measurements of turbulence in 15 massive protoclusters
A. Koley, A. M. Stutz, F. Louvet, F. Motte, A. Ginsburg, R. Galván-Madrid, R. H. Álvarez-Gutiérrez, P. Sanhueza, T. Baug, N. Sandoval-Garrido, J. Salinas, G. Busquet, J. Braine, H.-L. Liu, T. Csengeri, A. Gusdorf, M. Fernández-López, N. Cunningham, L. Bronfman, M. Bonfand
Local Differential Privacy for Distributed Stochastic Aggregative Optimization with Guaranteed Optimality
Ziqin Chen, Yongqiang Wang
https://arxiv.org/abs/2506.15106
A new 1D $V_p$ and $V_s$ velocity model of the western Rift of Corinth, Greece, using a fully non-linear tomography algorithm
Mark S. Noble (GEOSCIENCES), Alexandrine Gesret (GEOAZUR 7329), Hélène Lyon-Caen (GEOAZUR 7329), Anne Deschamps (GEOAZUR 7329)
https://arxiv.org/abs/2506.16222
Enhanced Ideal Objective Vector Estimation for Evolutionary Multi-Objective Optimization
Ruihao Zheng, Zhenkun Wang, Yin Wu, Maoguo Gong
https://arxiv.org/abs/2505.21903
Nonconvex Nonsmooth Multicomposite Optimization and Its Applications to Recurrent Neural Networks
Lingzi Jin, Xiao Wang, Xiaojun Chen
https://arxiv.org/abs/2506.17884
Estimating quantile treatments without strict overlap
Marco Avella-Medina, Richard Davis, Gennady Samorodnitsky
https://arxiv.org/abs/2506.18215 https://…
An analysis of the fragmentation function of gluon at next-to-leading order approximation
H. S. Nakhaei, G. R. Boroun
https://arxiv.org/abs/2508.05256 https://
Multi-Tier UAV Edge Computing for Low Altitude Networks Towards Long-Term Energy Stability
Yufei Ye, Shijian Gao, Xinhu Zheng, Liuqing Yang
https://arxiv.org/abs/2508.14601 http…
BEASST: Behavioral Entropic Gradient based Adaptive Source Seeking for Mobile Robots
Donipolo Ghimire, Aamodh Suresh, Carlos Nieto-Granda, Solmaz S. Kia
https://arxiv.org/abs/2508.10363
An End-to-End Multi-objective Ensemble Ranking Framework for Video Recommendation
Tiantian He, Minzhi Xie, Runtong Li, Xiaoxiao Xu, Jiaqi Yu, Zixiu Wang, Lantao Hu, Han Li, Kun Gai
https://arxiv.org/abs/2508.05093
Three-Dimensional Isotropic STED Nanoscopy using a Single Objective
Renlong Zhang, Xiaoyu Weng, Haoxian Zhou, Luwei Wang, Fangrui Lin, Wei Yan, Xiumin Gao, Bin Yu, Danying Lin, Liwei Liu, Chenshuang Zhang, Kayla K. Green, Ewoud R. E. Schmidt, Songlin Zhuang, Junle Qu
https://arxiv.org/abs/2507.06718
S2WTM: Spherical Sliced-Wasserstein Autoencoder for Topic Modeling
Suman Adhya, Debarshi Kumar Sanyal
https://arxiv.org/abs/2507.12451 https://
Frank-Wolfe algorithm for star-convex functions
R. Diaz Millan, Orizon Pereira Ferreira, Julien Ugon
https://arxiv.org/abs/2507.17272 https://arxiv.org/pdf…
Dynamic Regret Reduces to Kernelized Static Regret
Andrew Jacobsen, Alessandro Rudi, Francesco Orabona, Nicolo Cesa-Bianchi
https://arxiv.org/abs/2507.05478
Single and multi-objective optimal designs for group testing experiments
Chi-Kuang Yeh, Weng Kee Wong, Julie Zhou
https://arxiv.org/abs/2508.08445 https://…
A shape optimisation of mutual inductances among coils
Toru Takahashi, Tatsuya Tokito, Yi Cui, Toshiro Matsumoto
https://arxiv.org/abs/2506.14085 https://
An Efficient Network-aware Direct Search Method for Influence Maximization
Matteo Bergamaschi, Sara Venturini, Francesco Tudisco, Francesco Rinaldi
https://arxiv.org/abs/2508.12164
The Generalized Matrix Separation Problem: Algorithms
Xuemei Chen, Owen Deen
https://arxiv.org/abs/2507.17069 https://arxiv.org/pdf/2507.17069
GLASD: A Loss-Function-Agnostic Global Optimizer for Robust Correlation Estimation under Data Contamination and Heavy Tails
Priyam Das
https://arxiv.org/abs/2506.14801
Sub-sampled Trust-Region Methods with Deterministic Worst-Case Complexity Guarantees
Max L. N. Goncalves, Geovani N. Grapiglia
https://arxiv.org/abs/2507.17556 https://
Constraint Optimized Multichannel Mixer-limiter Design
Yuancheng Luo, Dmitriy Yamkovoy, Guillermo Garcia
https://arxiv.org/abs/2507.06769 https://
Adaptive Data Augmentation for Thompson Sampling
Wonyoung Kim
https://arxiv.org/abs/2506.14479 https://arxiv.org/pdf/2506.14479
A trust-region framework for optimization using Hermite kernel surrogate models
Sven Ullmann, Tobias Ehring, Robin Herkert, Bernard Haasdonk
https://arxiv.org/abs/2507.01729
ZipMPC: Compressed Context-Dependent MPC Cost via Imitation Learning
Rahel Rickenbach, Alan A. Lahoud, Erik Schaffernicht, Melanie N. Zeilinger, Johannes A. Stork
https://arxiv.org/abs/2507.13088
First Order Algorithm on an Optimization Problem with Improved Convergence when Problem is Convex
Chee-Khian Sim
https://arxiv.org/abs/2508.13302 https://a…
Decoupling Geometry from Optimization in 2D Irregular Cutting and Packing Problems: an Open-Source Collision Detection Engine
Jeroen Gardeyn, Tony Wauters, Greet Vanden Berghe
https://arxiv.org/abs/2508.08341
This https://arxiv.org/abs/2412.17780 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_qbi…
An Optimization-Based Framework for Solving Forward-Backward Stochastic Differential Equations: Convergence Analysis and Error Bounds
Yutian Wang, Yuan-Hua Ni, Xun Li
https://arxiv.org/abs/2507.15234
Learning Encodings by Maximizing State Distinguishability: Variational Quantum Error Correction
Nico Meyer, Christopher Mutschler, Andreas Maier, Daniel D. Scherer
https://arxiv.org/abs/2506.11552
Array-Aware Ambisonics and HRTF Encoding for Binaural Reproduction With Wearable Arrays
Yhonatan Gayer, Vladimir Tourbabin, Zamir Ben Hur, David Lou Alon, Boaz Rafaely
https://arxiv.org/abs/2507.11091
Inference on the value of linear programs
Leonard Goff, Eric Mbakop
https://arxiv.org/abs/2506.06776 https://arxiv.org/pdf/2506.06776…
Experimental Scheme for Polarizing the Boron Nuclei
William R. Milner, Richard G. Milner
https://arxiv.org/abs/2508.06561 https://arxiv.org/pdf/2508.06561
The pursuit of happiness
Debora Princepe, Onofrio Mazzarisi, Erol Akcay, Simon A. Levin, Matteo Marsili
https://arxiv.org/abs/2506.10537 https://
Inverse Optimal Control with Constraint Relaxation
Rahel Rickenbach, Amon Lahr, Melanie N. Zeilinger
https://arxiv.org/abs/2507.11392 https://
Unrolling Nonconvex Graph Total Variation for Image Denoising
Songlin Wei, Gene Cheung, Fei Chen, Ivan Selesnick
https://arxiv.org/abs/2506.02381 https://
On the Effect of Instruction Tuning Loss on Generalization
Anwoy Chatterjee, H S V N S Kowndinya Renduchintala, Sumit Bhatia, Tanmoy Chakraborty
https://arxiv.org/abs/2507.07817
Kurdyka-Łojasiewicz exponent via square parametrization
Wenqing Ouyang
https://arxiv.org/abs/2506.10110 https://arxiv.org/pdf/2506.…
Solving Distance-Based Optimization Problems Using Optical Hardware
Guangyao Li, Richard Zhipeng Wang, Natalia G. Berloff
https://arxiv.org/abs/2507.11378 …
Multi-Objective Covariance Matrix Adaptation MAP-Annealing
Shihan Zhao, Stefanos Nikolaidis
https://arxiv.org/abs/2505.20712 https://…
Guidelines for Gaze-based Neural Preliminary Diagnosis
Mayar Elfares, Salma Younis, Pascal Reisert, Ralf Küsters, Tobias Renner, Andreas Bulling
https://arxiv.org/abs/2506.08517
Thompson Sampling in Function Spaces via Neural Operators
Rafael Oliveira, Xuesong Wang, Kian Ming A. Chai, Edwin V. Bonilla
https://arxiv.org/abs/2506.21894
A Distributional View of High Dimensional Optimization
Felix Benning
https://arxiv.org/abs/2507.16315 https://arxiv.org/pdf/2507.1631…
Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions
Simon Matrenok, Skander Moalla, Caglar Gulcehre
https://arxiv.org/abs/2507.08068 https://arxiv.org/pdf/2507.08068 https://arxiv.org/html/2507.08068
arXiv:2507.08068v1 Announce Type: new
Abstract: Aligning large language models with pointwise absolute rewards has so far required online, on-policy algorithms such as PPO and GRPO. In contrast, simpler methods that can leverage offline or off-policy data, such as DPO and REBEL, are limited to learning from preference pairs or relative signals. To bridge this gap, we introduce Quantile Reward Policy Optimization (QRPO), which learns from pointwise absolute rewards while preserving the simplicity and offline applicability of DPO-like methods. QRPO uses quantile rewards to enable regression to the closed-form solution of the KL-regularized RL objective. This reward yields an analytically tractable partition function, removing the need for relative signals to cancel this term. Moreover, QRPO scales with increased compute to estimate quantile rewards, opening a new dimension for pre-computation scaling. Empirically, QRPO consistently achieves top performance on chat and coding evaluations -- reward model scores, AlpacaEval 2, and LeetCode -- compared to DPO, REBEL, and SimPO across diverse datasets and 8B-scale models. Finally, we find that training with robust rewards instead of converting them to preferences induces less length bias.
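A rough sketch of why a quantile reward could make the partition function tractable. It assumes (the abstract does not spell this out) that the quantile reward of a completion is the fraction of reference-policy samples with a lower or equal raw reward, so that it is approximately uniform on [0, 1] under the reference policy. Under that assumption the partition function is E[exp(U / beta)] for U ~ Uniform(0, 1), which integrates to beta * (exp(1 / beta) - 1). The function names and the parameter `beta` are hypothetical and not taken from the paper.

```python
import math
import random


def quantile_reward(reward, reference_rewards):
    """Empirical CDF of `reward` among raw rewards of reference-policy samples.

    Assumption: this is only one plausible reading of "quantile reward".
    """
    below_or_equal = sum(1 for r in reference_rewards if r <= reward)
    return below_or_equal / len(reference_rewards)


def uniform_partition_function(beta):
    """Closed form of E[exp(U / beta)] for U ~ Uniform(0, 1)."""
    return beta * (math.exp(1.0 / beta) - 1.0)


def monte_carlo_partition_function(beta, num_samples=200_000, seed=0):
    """Numerical check of the closed form by sampling U ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    total = sum(math.exp(rng.random() / beta) for _ in range(num_samples))
    return total / num_samples


if __name__ == "__main__":
    print(quantile_reward(2.0, [1.0, 2.0, 3.0, 4.0]))
    print(uniform_partition_function(0.5))
    print(monte_carlo_partition_function(0.5))
```

The Monte Carlo estimate should agree with the closed form up to sampling noise, which is the sense in which no relative signal is needed to cancel the partition function in this toy setting.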
A Generalized $\ell_1$-Merit Function SQP Method Using Function Approximations with Tunable Accuracy
Dane S. Grundvig, Matthias Heinkenschloss
https://arxiv.org/abs/2507.06199
Toroidal area-preserving parameterizations of genus-one closed surfaces
Marco Sutti, Mei-Heng Yueh
https://arxiv.org/abs/2508.05111 https://arxiv.org/pdf/2…
WhiSQA: Non-Intrusive Speech Quality Prediction Using Whisper Encoder Features
George Close, Kris Hong, Thomas Hain, Stefan Goetze
https://arxiv.org/abs/2508.02210 https://
Distributionally Robust Control with Constraints on Linear Unidimensional Projections
Alexandros E. Tzikas, Lukas Fiechtner, Arec Jamgochian, Mykel J. Kochenderfer
https://arxiv.org/abs/2508.07121
Physics-Informed Neural Network Approach to Quark-Antiquark Color Flux Tube
Wei Kou, Xiaoxuan Lin, Bing'ang Guo, Xurong Chen
https://arxiv.org/abs/2506.03513
A bang-bang solution with infinitely many switching points for a parabolic boundary control problem with terminal observation
Constantin Christof
https://arxiv.org/abs/2506.12768 …
A Robust Optimization Framework for Flexible Industrial Energy Scheduling: Application to a Cement Plant with Market Participation
Sebastián Rojas-Innocenti, Enrique Baeyens, Alejandro Martín-Crespo, Sergio Saludes-Rodil, Fernando Frechoso Escudero
https://arxiv.org/abs/2506.10824
Lyapunov analysis for FISTA under strong convexity
Luis M. Briceño-Arias
https://arxiv.org/abs/2506.11785 https://arxiv.org/pdf/250…
EXPO: Stable Reinforcement Learning with Expressive Policies
Perry Dong, Qiyang Li, Dorsa Sadigh, Chelsea Finn
https://arxiv.org/abs/2507.07986 https://arxiv.org/pdf/2507.07986 https://arxiv.org/html/2507.07986
arXiv:2507.07986v1 Announce Type: new
Abstract: We study the problem of training and fine-tuning expressive policies with online reinforcement learning (RL) given an offline dataset. Training expressive policy classes with online RL presents a unique challenge of stable value maximization. Unlike simpler Gaussian policies commonly used in online RL, expressive policies like diffusion and flow-matching policies are parameterized by a long denoising chain, which hinders stable gradient propagation from actions to policy parameters when optimizing against some value function. Our key insight is that we can address stable value maximization by avoiding direct optimization over value with the expressive policy and instead construct an on-the-fly RL policy to maximize Q-value. We propose Expressive Policy Optimization (EXPO), a sample-efficient online RL algorithm that utilizes an on-the-fly policy to maximize value with two parameterized policies -- a larger expressive base policy trained with a stable imitation learning objective and a light-weight Gaussian edit policy that edits the actions sampled from the base policy toward a higher value distribution. The on-the-fly policy optimizes the actions from the base policy with the learned edit policy and chooses the value maximizing action from the base and edited actions for both sampling and temporal-difference (TD) backup. Our approach yields up to 2-3x improvement in sample efficiency on average over prior methods both in the setting of fine-tuning a pretrained policy given offline data and in leveraging offline data to train online.
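A minimal sketch of the action-selection step the abstract describes: sample actions from a base policy, edit each one, and keep the candidate with the highest Q-value. Everything here is a toy assumption for illustration: the callables `base_policy`, `edit_policy`, and `q_function`, the parameter `num_samples`, and the one-dimensional actions are hypothetical and not taken from the paper, and no training of either policy is shown.

```python
def on_the_fly_action(state, base_policy, edit_policy, q_function, num_samples=4):
    """Return the value-maximizing action among base and edited candidates."""
    candidates = []
    for _ in range(num_samples):
        base_action = base_policy(state)
        edited_action = edit_policy(state, base_action)
        # Both the original and the edited action stay in the candidate pool.
        candidates.extend([base_action, edited_action])
    return max(candidates, key=lambda action: q_function(state, action))


# Toy stand-ins (hypothetical): scalar actions, Q peaks at action = 1.0.
def toy_base_policy(state):
    return 0.0


def toy_edit_policy(state, action):
    return action + 0.5  # small edit nudging the base action


def toy_q_function(state, action):
    return -(action - 1.0) ** 2


if __name__ == "__main__":
    print(on_the_fly_action(None, toy_base_policy, toy_edit_policy, toy_q_function))
```

In this toy setup the edited action 0.5 lies closer to the Q-maximizer than the base action 0.0, so it is the one selected; if an edit lowered the value, the unedited base action would be kept instead.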
Inequalities for Optimization of Classification Algorithms: A Perspective Motivated by Diagnostic Testing
Paul N. Patrone, Anthony J. Kearsley
https://arxiv.org/abs/2508.01065 h…
This https://arxiv.org/abs/2505.21356 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csSD_…
Non-linear Multi-objective Optimization with Probabilistic Branch and Bound
Hao Huang, Zelda B. Zabinsky
https://arxiv.org/abs/2506.04554 https://
Optimal Task Offloading with Firm Deadlines for Mobile Edge Computing Systems
Khai Doan, Wesley Araujo, Evangelos Kranakis, Ioannis Lambadaris, Yannis Viniotis, Wonjae Shin
https://arxiv.org/abs/2506.09180
Revisiting Randomized Smoothing: Nonsmooth Nonconvex Optimization Beyond Global Lipschitz Continuity
Jingfan Xia, Zhenwei Lin, Qi Deng
https://arxiv.org/abs/2508.13496 https://
A DC-Reformulation for Gradient-$L^0$-Constrained Problems in Function Spaces
Bastian Dittrich, Evelyn Herberg, Roland Herzog, Georg Müller
https://arxiv.org/abs/2506.11917
Sensitivity of Optimal Control Solutions and Quantities of Interest with Respect to Component Functions
Jonathan R. Cangelosi, Matthias Heinkenschloss
https://arxiv.org/abs/2506.10804
Economic Model Predictive Control with a Non-Fixed Reference Trajectory for Optimal Microgrid Dispatch
Avik Ghosh, Adil Khurram, Jan Kleissl, Sonia Martinez
https://arxiv.org/abs/2506.22406
Heavy-ball dynamics with Hessian-driven damping for non-convex optimization under the Łojasiewicz condition
Vassilis Apidopoulos, Vasiliki Mavrogeorgou, Theodoros G. Tsironis
https://arxiv.org/abs/2506.11705
Global Descent Method for Non-convex Multi-objective Optimization Problems
Bikram Adhikary, Md Abu Talhamainuddin Ansary, Savin Treanta
https://arxiv.org/abs/2507.22390 https://…
BSDE Approach for $\alpha$-Potential Stochastic Differential Games
Xin Guo, Xun Li, Liangquan Zhang
https://arxiv.org/abs/2507.13256 https://
Recursive Bound-Constrained AdaGrad with Applications to Multilevel and Domain Decomposition Minimization
Serge Gratton, Alena Kopaničáková, Philippe Toint
https://arxiv.org/abs/2507.11513
Automatic Generation of Explicit Quadratic Programming Solvers
Maximilian Schaller, Daniel Arnström, Alberto Bemporad, Stephen Boyd
https://arxiv.org/abs/2506.11513
High Probability Convergence of Distributed Clipped Stochastic Gradient Descent with Heavy-tailed Noise
Yuchen Yang, Kaihong Lu, Long Wang
https://arxiv.org/abs/2506.11647
Convergence of Momentum-Based Optimization Algorithms with Time-Varying Parameters
Mathukumalli Vidyasagar
https://arxiv.org/abs/2506.11904 https://…
This https://arxiv.org/abs/2405.08485 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_mat…
Set Smoothness Unlocks Clarke Hyper-stationarity in Bilevel Optimization
He Chen, Jiajin Li, Anthony Man-cho So
https://arxiv.org/abs/2506.04587 https://…
A Model-Free Extremum Seeking Controller with Application to Tracking a Nonlinear Chemical Reaction
Alexander Zuyev, Victoria Grushkovska
https://arxiv.org/abs/2507.07749
A linesearch-based derivative-free method for noisy black-box problems
Alberto De Santis, Giampaolo Liuzzi, Stefano Lucidi
https://arxiv.org/abs/2508.00495 https://
This https://arxiv.org/abs/2403.06708 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_mat…
A derivative-free regularization algorithm for equality constrained nonlinear least squares problems
Xi Chen, Jinyan Fan
https://arxiv.org/abs/2507.05623 h…
Projected Gradient Descent for Constrained Decision-Dependent Optimization
Zifan Wang, Changxin Liu, Thomas Parisini, Michael M. Zavlanos, Karl H. Johansson
https://arxiv.org/abs/2508.08856
Almost Sure Convergence for the Last Iterate of Stochastic Gradient Descent Schemes
Marcel Hudiani
https://arxiv.org/abs/2507.07281 https://
A Generalized Analytical Framework for the Nonlinear Best-Worst Method
Harshit M. Ratandhara, Mohit Kumar
https://arxiv.org/abs/2508.06048 https://arxiv.or…
A Cubic Regularization Method for Multiobjective Optimization
Douglas S. Gonçalves, Max L. N. Gonçalves, Jefferson G. Melo
https://arxiv.org/abs/2506.08181
Combinatorial Algorithm for Tropical Linearly Factorized Programming
Yuki Nishida
https://arxiv.org/abs/2507.07596 https://arxiv.org/…
This https://arxiv.org/abs/2404.17386 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_mat…
On Relatively Smooth Optimization over Riemannian Manifolds
Chang He, Jiaxiang Li, Bo Jiang, Shiqian Ma, Shuzhong Zhang
https://arxiv.org/abs/2508.03048 https://
This https://arxiv.org/abs/2410.14899 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_mat…
Was Residual Penalty and Neural Operators All We Needed for Solving Optimal Control Problems?
Oliver G. S. Lundqvist, Fabricio Oliveira
https://arxiv.org/abs/2506.04742
A Parameter-free Decentralized Algorithm for Composite Convex Optimization
Xiaokai Chen, Ilya Kuruzov, Gesualdo Scutari, Alexander Gasnikov
https://arxiv.org/abs/2508.01466 http…
This https://arxiv.org/abs/2407.04562 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_mat…
Perturbed Gradient Descent Algorithms are Small-Disturbance Input-to-State Stable
Leilei Cui, Zhong-Ping Jiang, Eduardo D. Sontag, Richard D. Braatz
https://arxiv.org/abs/2507.02131
Convergence Rate Analysis for Monotone Accelerated Proximal Gradient Method
Zepeng Wang, Juan Peypouquet
https://arxiv.org/abs/2507.00939 https://
Affine-Invariant Global Non-Asymptotic Convergence Analysis of BFGS under Self-Concordance
Qiujiang Jin, Aryan Mokhtari
https://arxiv.org/abs/2507.00361 ht…