ProphetKV: User-Query-Driven Selective Recomputation for Efficient KV Cache Reuse in Retrieval-Augmented Generation
Shihao Wang, Jiahao Chen, Yanqi Pan, Hao Huang, Yichen Hao, Xiangyu Zou, Wen Xia, Wentao Zhang, Haitao Wang, Junhong Li, Chongyang Qiu, Pengfei Wang
https://arxiv.org/abs/2602.02579 https://arxiv.org/pdf/2602.02579 https://arxiv.org/html/2602.02579
arXiv:2602.02579v1 Announce Type: new
Abstract: The prefill stage of long-context Retrieval-Augmented Generation (RAG) is severely bottlenecked by computational overhead. To mitigate this, recent methods assemble pre-calculated KV caches of the RAG documents retrieved for a user query and reprocess a selected subset of tokens to recover cross-attention across these caches. However, we identify a fundamental "crowding-out effect" in current token selection criteria: globally salient but query-irrelevant tokens saturate the limited recomputation budget, displacing the tokens truly essential for answering the user query and degrading inference accuracy.
We propose ProphetKV, a user-query-driven KV Cache reuse method for RAG scenarios. ProphetKV dynamically prioritizes tokens based on their semantic relevance to the user query and employs a dual-stage recomputation pipeline to fuse layer-wise attention metrics into a high-utility set. By ensuring the recomputation budget is dedicated to bridging the informational gap between retrieved context and the user query, ProphetKV achieves high-fidelity attention recovery with minimal overhead. Our extensive evaluation results show that ProphetKV retains 96%-101% of full-prefill accuracy with only a 20% recomputation ratio, while achieving accuracy improvements of 8.8%-24.9% on RULER and 18.6%-50.9% on LongBench over the state-of-the-art approaches (e.g., CacheBlend, EPIC, and KVShare).
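For intuition, here is a minimal sketch of query-driven token selection under a fixed recomputation budget. It assumes dense token and query embeddings are available; the function name and the cosine-similarity criterion are illustrative stand-ins, not the paper's actual relevance metric or pipeline:

```python
import numpy as np

def select_recompute_tokens(token_embs, query_emb, budget_ratio=0.2):
    # Rank cached context tokens by cosine similarity to the user query
    # and keep only the top fraction allowed by the recomputation budget
    # (a hypothetical stand-in for ProphetKV's relevance-driven criterion).
    sims = token_embs @ query_emb / (
        np.linalg.norm(token_embs, axis=1) * np.linalg.norm(query_emb) + 1e-9
    )
    k = max(1, int(budget_ratio * len(token_embs)))
    return np.argsort(-sims)[:k]  # indices of tokens to recompute

# Toy usage: 10 cached tokens of dimension 4, 20% recomputation budget.
rng = np.random.default_rng(0)
embs = rng.normal(size=(10, 4))
q = rng.normal(size=4)
chosen = select_recompute_tokens(embs, q, budget_ratio=0.2)
```

The point of the sketch is the budget constraint: a globally salient token that scores low against this query never enters the selected set, which is the "crowding-out" failure mode the abstract describes.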
toXiv_bot_toot
I'll be in the #fosdem translations dev room this afternoon, speaking about something completely unrelated to my usual topics.
Don't expect a ready-to-use project though; it's more about sharing a story of creative problem solving. :blobcatartist:
There’s a second wrinkle to the OP’s critique beyond “abstractions should be better.”
The fundamental thing that makes programming hard is bridging the gap between ambiguous natural language and an unambiguous programming language. That’s hard.
That’s hard partly because the things that make a language unambiguous make such a language deeply unintuitive to humans, no matter how much it resembles English. BUT…
…the other reason it’s hard is that it forces you to decide •exactly• what you want.
This is what social media (and the internet) exists for.
(Also glad that Rami has bridging turned on, so I can repost it here easily! 😁) https://fed.brid.gy/r/https://bsky.app/profile/did:plc:jye22xkea3jqsabskhfec347/post/3mf3ngx67rs2k
Spatially-informed transformers: Injecting geostatistical covariance biases into self-attention for spatio-temporal forecasting
Yuri Calleo
https://arxiv.org/abs/2512.17696 https://arxiv.org/pdf/2512.17696 https://arxiv.org/html/2512.17696
arXiv:2512.17696v1 Announce Type: new
Abstract: The modeling of high-dimensional spatio-temporal processes presents a fundamental dichotomy between the probabilistic rigor of classical geostatistics and the flexible, high-capacity representations of deep learning. While Gaussian processes offer theoretical consistency and exact uncertainty quantification, their prohibitive computational scaling renders them impractical for massive sensor networks. Conversely, modern transformer architectures excel at sequence modeling but inherently lack a geometric inductive bias, treating spatial sensors as permutation-invariant tokens without a native understanding of distance. In this work, we propose a spatially-informed transformer, a hybrid architecture that injects a geostatistical inductive bias directly into the self-attention mechanism via a learnable covariance kernel. By formally decomposing the attention structure into a stationary physical prior and a non-stationary data-driven residual, we impose a soft topological constraint that favors spatially proximal interactions while retaining the capacity to model complex dynamics. We demonstrate the phenomenon of "Deep Variography", where the network successfully recovers the true spatial decay parameters of the underlying process end-to-end via backpropagation. Extensive experiments on synthetic Gaussian random fields and real-world traffic benchmarks confirm that our method outperforms state-of-the-art graph neural networks. Furthermore, rigorous statistical validation confirms that the proposed method delivers not only superior predictive accuracy but also well-calibrated probabilistic forecasts, effectively bridging the gap between physics-aware modeling and data-driven learning.
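A toy sketch of the core idea, using plain single-head NumPy attention rather than the paper's architecture (function names and the exponential kernel choice are my assumptions): the stationary spatial prior enters the attention logits additively in log space, with the decay parameter `phi` playing the role of the learnable kernel parameter:

```python
import numpy as np

def spatial_bias(coords, phi):
    # Stationary exponential covariance kernel C(d) = exp(-d / phi) over
    # pairwise sensor distances; phi is the spatial decay parameter that
    # the paper learns end-to-end ("Deep Variography").
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.exp(-d / phi)

def biased_attention(q, k, v, coords, phi, alpha=1.0):
    # Scaled dot-product attention with an additive log-space spatial
    # prior, so spatially proximal sensors receive higher attention weight.
    logits = q @ k.T / np.sqrt(q.shape[-1])
    logits = logits + alpha * np.log(spatial_bias(coords, phi) + 1e-9)
    w = np.exp(logits - logits.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)
    return w @ v

# Three sensors on a line; with uninformative queries/keys, the attention
# pattern falls back to the spatial prior.
coords = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 10.0]])
q = k = np.zeros((3, 4))
v = np.eye(3)
out = biased_attention(q, k, v, coords, phi=2.0)
```

Because the bias is additive in log space, the data-driven logits can still override the prior when the content demands it, which matches the abstract's "soft topological constraint" framing.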
LI.FI, which provides businesses with price comparisons of crypto exchange rates and bridging fees, raised $29M, bringing its total funding to ~$52M (Carlos Garcia/Fortune)
https://fortune.com/2025/12/11/exclusive-crypto-startup-li-fi-raises-29-million/
Climate action is finally moving from promises to practice. The Global Implementation Accelerator is bridging the gap between national climate plans and real-world results.
Here's what's working: Private capital restoring Brazilian forests. Unilever switching to renewables. H&M and IKEA modernizing Vietnam's grid. Companies in Taiwan pooling demand for clean energy.
The breakthrough? Aligning business needs with national climate goals creates wins on both sides.
Replaced article(s) found for cs.DS. https://arxiv.org/list/cs.DS/new
[1/1]:
- Fully Dynamic Adversarially Robust Correlation Clustering in Polylogarithmic Update Time
Vladimir Braverman, Prathamesh Dharangutte, Shreyas Pai, Vihan Shah, Chen Wang
https://arxiv.org/abs/2411.09979 https://mastoxiv.page/@arXiv_csDS_bot/113502653187863544
- A Simple and Combinatorial Approach to Proving Chernoff Bounds and Their Generalizations
William Kuszmaul
https://arxiv.org/abs/2501.03488 https://mastoxiv.page/@arXiv_csDS_bot/113791396712128907
- The Structural Complexity of Matrix-Vector Multiplication
Emile Anand, Jan van den Brand, Rose McCarty
https://arxiv.org/abs/2502.21240 https://mastoxiv.page/@arXiv_csDS_bot/114097340825270885
- Clustering under Constraints: Efficient Parameterized Approximation Schemes
Sujoy Bhore, Ameet Gadekar, Tanmay Inamdar
https://arxiv.org/abs/2504.06980 https://mastoxiv.page/@arXiv_csDS_bot/114312444050875805
- Minimizing Envy and Maximizing Happiness in Graphical House Allocation
Anubhav Dhar, Ashlesha Hota, Palash Dey, Sudeshna Kolay
https://arxiv.org/abs/2505.00296 https://mastoxiv.page/@arXiv_csDS_bot/114437013364446063
- Fast and Simple Densest Subgraph with Predictions
Thai Bui, Luan Nguyen, Hoa T. Vu
https://arxiv.org/abs/2505.12600 https://mastoxiv.page/@arXiv_csDS_bot/114538936921930134
- Compressing Suffix Trees by Path Decompositions
Becker, Cenzato, Gagie, Kim, Koerkamp, Manzini, Prezza
https://arxiv.org/abs/2506.14734 https://mastoxiv.page/@arXiv_csDS_bot/114703384646892523
- Improved sampling algorithms and functional inequalities for non-log-concave distributions
Yuchen He, Zhehan Lei, Jianan Shao, Chihao Zhang
https://arxiv.org/abs/2507.11236 https://mastoxiv.page/@arXiv_csDS_bot/114862112197588124
- Deterministic Lower Bounds for $k$-Edge Connectivity in the Distributed Sketching Model
Peter Robinson, Ming Ming Tan
https://arxiv.org/abs/2507.11257 https://mastoxiv.page/@arXiv_csDS_bot/114862223634372292
- Optimally detecting uniformly-distributed $\ell_2$ heavy hitters in data streams
Santhoshini Velusamy, Huacheng Yu
https://arxiv.org/abs/2509.07286 https://mastoxiv.page/@arXiv_csDS_bot/115178875220889588
- Uncrossed Multiflows and Applications to Disjoint Paths
Chandra Chekuri, Guyslain Naves, Joseph Poremba, F. Bruce Shepherd
https://arxiv.org/abs/2511.00254 https://mastoxiv.page/@arXiv_csDS_bot/115490402963680492
- Dynamic Matroids: Base Packing and Covering
Tijn de Vos, Mara Grilnberger
https://arxiv.org/abs/2511.15460 https://mastoxiv.page/@arXiv_csDS_bot/115580946319285096
- Branch-width of connectivity functions is fixed-parameter tractable
Tuukka Korhonen, Sang-il Oum
https://arxiv.org/abs/2601.04756 https://mastoxiv.page/@arXiv_csDS_bot/115864074799755995
- CoinPress: Practical Private Mean and Covariance Estimation
Sourav Biswas, Yihe Dong, Gautam Kamath, Jonathan Ullman
https://arxiv.org/abs/2006.06618
- The Ideal Membership Problem and Abelian Groups
Andrei A. Bulatov, Akbar Rafiey
https://arxiv.org/abs/2201.05218
- Bridging Classical and Quantum: Group-Theoretic Approach to Quantum Circuit Simulation
Daksh Shami
https://arxiv.org/abs/2407.19575 https://mastoxiv.page/@arXiv_quantph_bot/112874282709517475
- Young domination on Hamming rectangles
Janko Gravner, Matjaž Krnc, Martin Milanič, Jean-Florent Raymond
https://arxiv.org/abs/2501.03788 https://mastoxiv.page/@arXiv_mathCO_bot/113791421814248215
- On the Space Complexity of Online Convolution
Joel Daniel Andersson, Amir Yehudayoff
https://arxiv.org/abs/2505.00181 https://mastoxiv.page/@arXiv_csCC_bot/114437005955255553
- Universal Solvability for Robot Motion Planning on Graphs
Anubhav Dhar, Pranav Nyati, Tanishq Prasad, Ashlesha Hota, Sudeshna Kolay
https://arxiv.org/abs/2506.18755 https://mastoxiv.page/@arXiv_csCC_bot/114737342714568702
- Colorful Minors
Evangelos Protopapas, Dimitrios M. Thilikos, Sebastian Wiederrecht
https://arxiv.org/abs/2507.10467
- Learning fermionic linear optics with Heisenberg scaling and physical operations
Aria Christensen, Andrew Zhao
https://arxiv.org/abs/2602.05058