I'd rather see a redesign of the back compared to the 17. The new camera bump is okay by me, but the square below it is rather meh https://www.mobiflip.de/apple-das-iphone-18-pro-bekommt-ein-redesign-auf-der-front/
Oh, sweet memory. This is where I spent 3 years during my apprenticeship at SIEMENS-Nixdorf. Sad to see this place being abandoned.
https://www.radiohochstift.de/aktionen/lokale-aktionen-und…
Public health groups are suing the Environmental Protection Agency (#EPA) over its approval of a #PFAS “forever chemical” #insecticide that industry research found likely reduces testicle size, lowers sperm coun…
📉 Reduces token consumption by 99%! From ~47,000 tokens to ~400 tokens when using 6 MCP servers with 60 tools - massive savings for AI agent workflows
⚡ Connection pooling with lazy-spawn daemon featuring 60s idle timeout for optimal performance - no manual start/stop needed
🎛️ Tool filtering via allowedTools and disabledTools config to control which tools are available - supports glob patterns like read_* or *file* (see the sketch below this list)
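For illustration, a minimal sketch of how that kind of glob-based tool filtering could work. Only the allowedTools/disabledTools keys and the example patterns come from the post above; the function name and the allow-then-deny semantics are assumptions:

```python
from fnmatch import fnmatch

def filter_tools(tools, allowed_tools=None, disabled_tools=None):
    # Hypothetical helper: keep a tool name if it matches at least one
    # allowedTools pattern (when an allow-list is given) and matches no
    # disabledTools pattern. Glob syntax as in the post: read_*, *file*.
    def matches(name, patterns):
        return any(fnmatch(name, p) for p in patterns)

    kept = []
    for name in tools:
        if allowed_tools is not None and not matches(name, allowed_tools):
            continue  # not covered by the allow-list
        if disabled_tools and matches(name, disabled_tools):
            continue  # explicitly disabled
        kept.append(name)
    return kept

# Expose only read-style tools, but never anything file-related:
print(filter_tools(
    ["read_text", "read_file", "write_file", "list_dir"],
    allowed_tools=["read_*"],
    disabled_tools=["*file*"],
))  # -> ['read_text']
```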
Anthropic details how it had to redesign its take-home test for hiring performance engineers as Claude kept defeating it, and releases the original test (Anthropic)
https://www.anthropic.com/engineering/AI-resistant-technical-evaluations
Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking
Ravi Ghadia, Maksim Abraham, Sergei Vorobyov, Max Ryabinin
https://arxiv.org/abs/2602.21196 https://arxiv.org/pdf/2602.21196 https://arxiv.org/html/2602.21196
arXiv:2602.21196v1 Announce Type: new
Abstract: Efficiently processing long sequences with Transformer models usually requires splitting the computations across accelerators via context parallelism. The dominant approaches in this family of methods, such as Ring Attention or DeepSpeed Ulysses, enable scaling over the context dimension but do not focus on memory efficiency, which limits the sequence lengths they can support. More advanced techniques, such as Fully Pipelined Distributed Transformer or activation offloading, can further extend the possible context length at the cost of training throughput. In this paper, we present UPipe, a simple yet effective context parallelism technique that performs fine-grained chunking at the attention head level. This technique significantly reduces the activation memory usage of self-attention, breaking the activation memory barrier and unlocking much longer context lengths. Our approach reduces intermediate tensor memory usage in the attention layer by as much as 87.5% for 32B Transformers, while matching previous context parallelism techniques in terms of training speed. UPipe can support a context length of 5M tokens when training Llama3-8B on a single 8×H100 node, improving upon prior methods by over 25%.
toXiv_bot_toot
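The headwise chunking idea behind UPipe can be illustrated in a toy, single-device form: compute attention a few heads at a time, so the quadratic (seq × seq) score tensor never materializes for all heads at once. A minimal sketch, with function name and shapes of my choosing; the actual method additionally distributes heads across accelerators and manages training-time activation memory, which this omits:

```python
import torch

def headwise_chunked_attention(q, k, v, chunk_heads=1):
    # q, k, v: (heads, seq, dim). Attention is computed chunk_heads heads
    # at a time, so a (chunk, seq, seq) score tensor is the only quadratic
    # intermediate alive at any point. (For training, each chunk's scores
    # would still need recomputation/checkpointing so the backward pass
    # does not retain them all.)
    heads, seq, dim = q.shape
    scale = dim ** -0.5
    out = torch.empty_like(q)
    for h in range(0, heads, chunk_heads):
        s = slice(h, h + chunk_heads)
        scores = (q[s] @ k[s].transpose(-2, -1)) * scale
        out[s] = torch.softmax(scores, dim=-1) @ v[s]
    return out

# Chunked and unchunked results agree; only peak memory differs.
q = k = v = torch.randn(8, 1024, 64)
torch.testing.assert_close(
    headwise_chunked_attention(q, k, v, chunk_heads=2),
    headwise_chunked_attention(q, k, v, chunk_heads=8),
)
```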
Ukraine retakes 90% of Kupyansk and gains ground near Pokrovsk: https://benborges.xyz/2025/12/18/ukraine-retakes-of-kupyansk-and.html
🇺🇦 Now playing on radioeins...
Zoot Woman:
🎵 Taken It All
#NowPlaying #ZootWoman
https://zootwoman.bandcamp.com/track/taken-it-all-redesigned
https://open.spotify.com/track/2C1a9QRxNhNHJpOUKwwTnA