Existence of ghost-eliminating constraints in multivielbein theory
J. Flinckman, S. F. Hassan
https://arxiv.org/abs/2510.03014
Mean and quantile regression in the copula setting: properties, sharp bounds and a note on estimation
Henrik Kaiser, Wolfgang Trutschnig
https://arxiv.org/abs/2510.03804
A note on new type degenerate Stirling numbers of the first kind
Taekyun Kim, Dae San Kim, Kyo-Shin Hwang, Dmitry V. Dolgy
https://arxiv.org/abs/2509.03415
Cooperative Sensing Enhanced UAV Path-Following and Obstacle Avoidance with Variable Formation
Changheng Wang, Zhiqing Wei, Wangjun Jiang, Haoyue Jiang, Zhiyong Feng
https://arxiv.org/abs/2508.21316
@… ...we should definitely use it as a cautionary example; beyond that, we should of course focus primarily on Europe, though at the same time the US's slide into a fascist state will also have a considerable impact on Europe whether we want it or not (it is already underway)
Constructions of Efficiently Implementable Boolean Functions with Provable Nonlinearity/Resiliency/Algebraic Immunity Trade-Offs
Palash Sarkar
https://arxiv.org/abs/2510.01720
Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live
Hanchen Li, Qiuyang Mang, Runyuan He, Qizheng Zhang, Huanzhi Mao, Xiaokun Chen, Alvin Cheung, Joseph Gonzalez, Ion Stoica
https://arxiv.org/abs/2511.02230 https://arxiv.org/pdf/2511.02230 https://arxiv.org/html/2511.02230
arXiv:2511.02230v1 Announce Type: new
Abstract: Agentic LLM applications interleave LLM generation requests with tool calls. These tool calls break the continuity of the workflow by creating pauses between LLM requests, posing many challenges for the serving system, especially in multi-turn scenarios. Each pause potentially causes KV cache eviction and extra waiting time before entering the continuous batch for the following LLM request. Since these pauses occur at every call, the problem grows increasingly severe as the number of turns in an agentic program grows. Previous works either fail to incorporate information from the tool call, evicting KV cache and thereby incurring repetitive prefill or reloading, or ignore the continuity of a multi-turn program, creating waiting time between turns that increases per-request latency.
We present Continuum, a serving system that optimizes job completion time for multi-turn agent workloads by combining tool-aware KV cache timeouts with program-level scheduling. By predicting tool call durations in agentic workflows, Continuum selectively pins the KV cache in GPU memory with a time-to-live value based on the total turn number. Combined with program-level first-come-first-serve scheduling, Continuum prevents scheduling bubbles, preserves multi-turn continuity, and optimizes throughput for complex agentic workflows. By modeling the variability of tool calls and the continuity of agent programs, Continuum outperforms state-of-the-art baselines. Our evaluation on real-world agentic workloads (SWE-Bench and BFCL) with Llama-3.1 8B/70B models shows that Continuum significantly improves average job completion times and remains performant across different hardware setups and DRAM offloading schemes. Preview code is available at: https://github.com/Hanchenli/vllm-continuum
Optimal Control of ODE Car-Following Models: Applications to Mixed-Autonomy Platoon Control via Coupled Autonomous Vehicles
Arwa Alanqary, Alexandre M. Bayen, Xiaoqian Gong, Anish Gollakota, Alexander Keimer, Ashish Pandian
https://arxiv.org/abs/2508.19417
Binomial edge ideals of Cameron-Walker graphs
Takayuki Hibi, Sara Saeedi Madani
https://arxiv.org/abs/2509.01150 https://arxiv.org/pdf/2509.01150
Random burning of the Euclidean lattice
Guillaume Blanc, Alice Contat
https://arxiv.org/abs/2509.02562 https://arxiv.org/pdf/2509.02562