
2025-10-10 11:16:19
DYNAMIX: RL-based Adaptive Batch Size Optimization in Distributed Machine Learning Systems
Yuanjun Dai, Keqiang He, An Wang
https://arxiv.org/abs/2510.08522
Sequentially Auditing Differential Privacy
Tomás González, Mateo Dulce-Rubio, Aaditya Ramdas, Mónica Ribero
https://arxiv.org/abs/2509.07055
Adaptive Execution Scheduler for DataDios SmartDiff
Aryan Poduri
https://arxiv.org/abs/2510.07811 https://arxiv.org/pdf/2510.07811
Deep Fuzzy Optimization for Batch-Size and Nearest Neighbors in Optimal Robot Motion Planning
Liding Zhang, Qiyang Zong, Yu Zhang, Zhenshan Bing, Alois Knoll
https://arxiv.org/abs/2508.20884
The Length of Functional Batch and PIR Codes
Altan B. Kilic, Alberto Ravagnani, Flavio Salizzoni
https://arxiv.org/abs/2508.02586
ASPEN: An Additional Sampling Penalty Method for Finite-Sum Optimization Problems with Nonlinear Equality Constraints
Nataša Krejić, Nataša Krklec Jerinkić, Tijana Ostojić, Nemanja Vučićević
https://arxiv.org/abs/2508.02299
DIVEBATCH: Accelerating Model Training Through Gradient-Diversity Aware Batch Size Adaptation
Yuen Chen, Yian Wang, Hari Sundaram
https://arxiv.org/abs/2509.16173
Efficient Distributed Training via Dual Batch Sizes and Cyclic Progressive Learning
Kuan-Wei Lu, Ding-Yong Hong, Pangfeng Liu, Jan-Jan Wu
https://arxiv.org/abs/2509.26092
Now out in #TMLR:
🍇 GRAPES: Learning to Sample Graphs for Scalable Graph Neural Networks 🍇
There's lots of work on sampling subgraphs for GNNs, but relatively little on making this sampling process _adaptive_. That is, learning to select the data from the graph that is relevant for your task.
We introduce an RL-based and a GFlowNet-based sampler and show that the approach perf…
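To make the idea of a learned, adaptive sampler concrete, here is a minimal toy sketch: per-node logits define a sampling distribution over neighbors, and a REINFORCE-style (score-function) update nudges the logits toward nodes that earn reward downstream. The graph, reward, and all names are invented for illustration; this is not the GRAPES implementation, just the general RL-sampler pattern it builds on.

```python
import math
import random

random.seed(0)

# Toy graph: node -> neighbor list (illustrative only).
graph = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2], 4: [0]}

# One learnable logit per node: the sampler's preference for including it.
logits = {v: 0.0 for v in graph}

def sample_neighbors(v, k=2):
    """Draw k neighbors of v with probability softmax(logits)."""
    nbrs = graph[v]
    weights = [math.exp(logits[u]) for u in nbrs]
    total = sum(weights)
    probs = [w / total for w in weights]
    return random.choices(nbrs, weights=probs, k=k), probs

def reinforce_update(v, chosen, probs, reward, lr=0.1):
    """Score-function gradient: d log p(u) / d logit_u = count(u) - k * p(u)."""
    for u, p in zip(graph[v], probs):
        grad = chosen.count(u) - len(chosen) * p
        logits[u] += lr * reward * grad

# Stand-in for the downstream GNN: reward subgraphs that include node 2.
for _ in range(200):
    chosen, probs = sample_neighbors(0)
    reward = 1.0 if 2 in chosen else 0.0
    reinforce_update(0, chosen, probs, reward)
```

After training, the sampler concentrates probability on the rewarded node; in the real setting the reward would come from the task loss of the GNN run on the sampled subgraph.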
Theory of Scaling Laws for In-Context Regression: Depth, Width, Context and Time
Blake Bordelon, Mary I. Letey, Cengiz Pehlevan
https://arxiv.org/abs/2510.01098
Access Paths for Efficient Ordering with Large Language Models
Fuheng Zhao, Jiayue Chen, Yiming Pan, Tahseen Rabbani, Divyakant Agrawal, Amr El Abbadi
https://arxiv.org/abs/2509.00303
NeST-BO: Fast Local Bayesian Optimization via Newton-Step Targeting of Gradient and Hessian Information
Wei-Ting Tang, Akshay Kudva, Joel A. Paulson
https://arxiv.org/abs/2510.05516
A Highly Scalable TDMA for GPUs and Its Application to Flow Solver Optimization
Seungchan Kim, Jihoo Kim, Sanghyun Ha, Donghyun You
https://arxiv.org/abs/2509.03933
Nearly Minimax Discrete Distribution Estimation in Kullback-Leibler Divergence with High Probability
Dirk van der Hoeven, Julia Olkhovskaia, Tim van Erven
https://arxiv.org/abs/2507.17316
Tighter Privacy Analysis for Truncated Poisson Sampling
Arun Ganesh
https://arxiv.org/abs/2508.15089 https://arxiv.org/pdf/2508.15089
Fisher-Orthogonal Projection Methods for Natural Gradient Descent with Large Batches
Yishun Lu, Wesley Armour
https://arxiv.org/abs/2508.13898
Faster and Memory-Efficient Training of Sequential Recommendation Models for Large Catalogs
Maxim Zhelnin, Dmitry Redko, Volkov Daniil, Anna Volodkevich, Petr Sokerin, Valeriy Shevchenko, Egor Shvetsov, Alexey Vasilev, Darya Denisova, Ruslan Izmailov, Alexey Zaytsev
https://arxiv.org/abs/2509.09682
SparseServe: Unlocking Parallelism for Dynamic Sparse Attention in Long-Context LLM Serving
Qihui Zhou, Peiqi Yin, Pengfei Zuo, James Cheng
https://arxiv.org/abs/2509.24626
Prompt Curriculum Learning for Efficient LLM Post-Training
Zhaolin Gao, Joongwon Kim, Wen Sun, Thorsten Joachims, Sid Wang, Richard Yuanzhe Pang, Liang Tan
https://arxiv.org/abs/2510.01135
Efficient Hyperparameter Tuning via Trajectory Invariance Principle
Bingrui Li, Jiaxin Wen, Zhanpeng Zhou, Jun Zhu, Jianfei Chen
https://arxiv.org/abs/2509.25049
GRAFT: Gradient-Aware Fast MaxVol Technique for Dynamic Data Sampling
Ashish Jha, Anh-Huy Phan, Razan Dibo, Valentin Leplat
https://arxiv.org/abs/2508.13653
Toward Efficient SpMV in Sparse LLMs via Block Extraction and Compressed Storage
Junqing Lin, Jingwei Sun, Mingge Lu, Guangzhong Sun
https://arxiv.org/abs/2507.12205
Unveiling m-Sharpness Through the Structure of Stochastic Gradient Noise
Haocheng Luo, Mehrtash Harandi, Dinh Phung, Trung Le
https://arxiv.org/abs/2509.18001
Understanding Outer Optimizers in Local SGD: Learning Rates, Momentum, and Acceleration
Ahmed Khaled, Satyen Kale, Arthur Douillard, Chi Jin, Rob Fergus, Manzil Zaheer
https://arxiv.org/abs/2509.10439