"A pair of US lawmakers are calling for an investigation into how easily spies can steal information based on devices’ electromagnetic and acoustic leaks—a spying trick the NSA once codenamed TEMPEST"
https://www.wired.com/story/how-vulnerable
"Christopher Bishop’s 2006 book “Pattern Recognition and Machine Learning,” arguably one of the triggers of the current popularity of machine learning, is quite literally a book about applied mathematics, diving into probabilities, linear algebra, neural networks, Markov models, and combinatorics. And rightfully so; if your objective is to find a job as an engineer at OpenAI, knowing a thing or two about eigenvalues and eigenvectors is definitely going to be useful."
Rebuilding public trust in AI requires meaningful citizen engagement, transparent governance, and robust legislation. Technology itself is not the problem. The issue is that few people trust institutions to deploy it wisely and for their benefit. This makes the first step to answer the following question: What’s in it for me?
🇺🇦 #NowPlaying on #KEXP's #Early
Confidence Man:
🎵 Angry Girl
#ConfidenceMan
https://confidenceman.bandcamp.com/track/angry-girl-chai-version
https://open.spotify.com/track/2PXULQ9Lo1AmU7eMnnBnxp
So, I have an answer to my previous question about GPU transfer efficiency.
Original code: write data to a staging buffer on the CPU, vkCmdCopyBuffer to GPU-local memory, then run the int-to-float32 conversion on the GPU out of that buffer. The copy operation shows 50% SM occupancy by compute warps and 50% unallocated warp slots in active SMs.
GPU memory write bandwidth is sitting around 2%, about 1.9 ms copy/shader run time.
ProphetKV: User-Query-Driven Selective Recomputation for Efficient KV Cache Reuse in Retrieval-Augmented Generation
Shihao Wang, Jiahao Chen, Yanqi Pan, Hao Huang, Yichen Hao, Xiangyu Zou, Wen Xia, Wentao Zhang, Haitao Wang, Junhong Li, Chongyang Qiu, Pengfei Wang
https://arxiv.org/abs/2602.02579 https://arxiv.org/pdf/2602.02579 https://arxiv.org/html/2602.02579
arXiv:2602.02579v1 Announce Type: new
Abstract: The prefill stage of long-context Retrieval-Augmented Generation (RAG) is severely bottlenecked by computational overhead. To mitigate this, recent methods assemble pre-calculated KV caches of retrieved RAG documents (by a user query) and reprocess selected tokens to recover cross-attention between these pre-calculated KV caches. However, we identify a fundamental "crowding-out effect" in current token selection criteria: globally salient but user-query-irrelevant tokens saturate the limited recomputation budget, displacing the tokens truly essential for answering the user query and degrading inference accuracy.
We propose ProphetKV, a user-query-driven KV Cache reuse method for RAG scenarios. ProphetKV dynamically prioritizes tokens based on their semantic relevance to the user query and employs a dual-stage recomputation pipeline to fuse layer-wise attention metrics into a high-utility set. By ensuring the recomputation budget is dedicated to bridging the informational gap between retrieved context and the user query, ProphetKV achieves high-fidelity attention recovery with minimal overhead. Our extensive evaluation results show that ProphetKV retains 96%-101% of full-prefill accuracy with only a 20% recomputation ratio, while achieving accuracy improvements of 8.8%-24.9% on RULER and 18.6%-50.9% on LongBench over the state-of-the-art approaches (e.g., CacheBlend, EPIC, and KVShare).
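The "crowding-out effect" described in the abstract can be illustrated with a toy example. This is only a sketch under stated assumptions, not the paper's algorithm: the function `select_tokens`, both score lists, and the single-stage top-k selection are hypothetical and stand in for whatever layer-wise attention metrics ProphetKV actually fuses.

```python
def select_tokens(scores, budget_ratio):
    """Return the indices of the top-scoring tokens that fit the budget.

    Hypothetical helper: a plain top-k selection, not ProphetKV's
    dual-stage pipeline.
    """
    budget = max(1, int(len(scores) * budget_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])


# Made-up scores for 10 cached tokens (assumption, for illustration only).
global_salience = [0.9, 0.8, 0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.7, 0.1]
query_relevance = [0.1, 0.1, 0.2, 0.9, 0.8, 0.1, 0.2, 0.1, 0.1, 0.3]

# With a 20% recomputation ratio only 2 of the 10 tokens are recomputed.
by_salience = select_tokens(global_salience, 0.2)
by_query = select_tokens(query_relevance, 0.2)

print(by_salience)  # tokens picked by global salience
print(by_query)     # tokens picked by relevance to the user query
```

In this toy setup the salience-based selection spends the whole budget on tokens 0 and 1, so the query-relevant tokens 3 and 4 are never recomputed. That is the displacement the abstract calls crowding out.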
Perfect Network Resilience in Polynomial Time
Matthias Bentert, Stefan Schmid
https://arxiv.org/abs/2602.03827 https://arxiv.org/pdf/2602.03827 https://arxiv.org/html/2602.03827
arXiv:2602.03827v1 Announce Type: new
Abstract: Modern communication networks support local fast rerouting mechanisms to quickly react to link failures: nodes store a set of conditional rerouting rules which define how to forward an incoming packet in case of incident link failures. The rerouting decisions at any node $v$ must rely solely on local information available at $v$: the link from which a packet arrived at $v$, the target of the packet, and the incident link failures at $v$. Ideally, such rerouting mechanisms provide perfect resilience: any packet is routed from its source to its target as long as the two are connected in the underlying graph after the link failures. Already in their seminal paper at ACM PODC '12, Feigenbaum, Godfrey, Panda, Schapira, Shenker, and Singla showed that perfect resilience cannot always be achieved. While the design of local rerouting algorithms has received much attention since then, we still lack a detailed understanding of when perfect resilience is achievable.
This paper closes this gap and presents a complete characterization of when perfect resilience can be achieved. This characterization also allows us to design an $O(n)$-time algorithm to decide whether a given instance is perfectly resilient and an $O(nm)$-time algorithm to compute perfectly resilient rerouting rules whenever it is. Our algorithm is also attractive for the simple structure of the rerouting rules it uses, known as skipping in the literature: alternative links are chosen according to an ordered priority list (per in-port), where failed links are simply skipped. Intriguingly, our result also implies that in the context of perfect resilience, skipping rerouting rules are as powerful as more general rerouting rules. This partially answers a long-standing open question by Chiesa, Nikolaevskiy, Mitrovic, Gurtov, Madry, Schapira, and Shenker [IEEE/ACM Transactions on Networking, 2017] in the affirmative.
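The skipping rule described in the abstract (an ordered priority list per in-port, with failed links skipped) can be sketched as follows. This is a minimal illustration, not the paper's construction: the function names, the `(in_port, target)` rule table, and the example links are all hypothetical, and the sketch says nothing about how to compute priority lists that are perfectly resilient.

```python
def next_hop(priority_list, failed_links):
    """Return the first link in the priority list that has not failed.

    Returns None when every listed link has failed.
    """
    for link in priority_list:
        if link not in failed_links:
            return link
    return None


def forward(rules, in_port, target, failed_links):
    """Pick an out-link using only local information at the node:
    the in-port, the packet's target, and the incident link failures."""
    return next_hop(rules[(in_port, target)], failed_links)


# Hypothetical rule table at one node with incident links "a", "b", "c".
# A packet for target "t" arriving on "a" prefers "b", then "c", and
# bounces back over "a" only as a last resort.
rules_at_v = {
    ("a", "t"): ["b", "c", "a"],
    ("b", "t"): ["c", "a", "b"],
}

print(forward(rules_at_v, "a", "t", failed_links=set()))
print(forward(rules_at_v, "a", "t", failed_links={"b"}))
print(forward(rules_at_v, "a", "t", failed_links={"b", "c"}))
```

The decision depends only on the in-port, the target, and the locally visible failures, which matches the locality constraint stated in the abstract.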
I plan to bow out of NYT #Wordle after tomorrow. I’ll miss the friendly competition with the #OldGal & #YoungPups. Their announcement that they’ll start reusing previous answers as of Monday, Feb 2 was enough t…
My motor skills are so shot, or put another way: hear me yelling at my computer, "Will I ever stop erasing the wrong thing?!"
A few more of the #News items shared especially often here recently:
IT attack affects the IT systems of the police evidence office