Tootfinder

Opt-in global Mastodon full text search. Join the index!

No exact results. Similar results found.
@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:45:11

Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking
Ravi Ghadia, Maksim Abraham, Sergei Vorobyov, Max Ryabinin
arxiv.org/abs/2602.21196 arxiv.org/pdf/2602.21196 arxiv.org/html/2602.21196
arXiv:2602.21196v1 Announce Type: new
Abstract: Efficiently processing long sequences with Transformer models usually requires splitting the computations across accelerators via context parallelism. The dominant approaches in this family of methods, such as Ring Attention or DeepSpeed Ulysses, enable scaling over the context dimension but do not focus on memory efficiency, which limits the sequence lengths they can support. More advanced techniques, such as Fully Pipelined Distributed Transformer or activation offloading, can further extend the possible context length at the cost of training throughput. In this paper, we present UPipe, a simple yet effective context parallelism technique that performs fine-grained chunking at the attention head level. This technique significantly reduces the activation memory usage of self-attention, breaking the activation memory barrier and unlocking much longer context lengths. Our approach reduces intermediate tensor memory usage in the attention layer by as much as 87.5% for 32B Transformers, while matching previous context parallelism techniques in terms of training speed. UPipe supports a context length of 5M tokens when training Llama3-8B on a single 8×H100 node, improving upon prior methods by over 25%.
toXiv_bot_toot
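
The head-level chunking idea from the abstract can be illustrated with a toy, single-device NumPy sketch: attention is computed over a few heads at a time, so the full (heads, seq, seq) score tensor is never materialized at once. This is only an illustration of the chunking principle — function names are invented here, and the paper's actual method is a distributed context-parallel implementation, not this loop.

```python
import numpy as np

def attention(q, k, v):
    # q, k, v: (heads, seq, head_dim); plain softmax attention
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def headwise_chunked_attention(q, k, v, chunk=2):
    # Process `chunk` heads at a time: the (heads, seq, seq) score
    # tensor only ever exists for `chunk` heads, cutting peak
    # intermediate memory by a factor of heads / chunk.
    outs = [attention(q[i:i + chunk], k[i:i + chunk], v[i:i + chunk])
            for i in range(0, q.shape[0], chunk)]
    return np.concatenate(outs, axis=0)
```

Because each head's attention is independent, the chunked result is numerically identical to computing all heads at once; only the peak size of the score tensor changes.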

@rasterweb@mastodon.social
2026-04-25 04:55:01

10 things my wife learned from her first 100 miles e-biking to work
electrek.co/2023/12/11/everyth

Decide in advance whether you will unlock your device or provide the passcode for a search.
Your overall likelihood of experiencing a device search is low
(e.g., less than 0.01% of international travelers are selected),
but depending on what information you carry, the impact of a search may be quite high.
If you plan to unlock your device for a search or provide the passcode, ensure your devices are prepared:
☐ Upload any information you would like to keep in clo…

@anderelampe@chaos.social
2026-04-22 06:34:40

My work as a scientist is very varied, because I work in a core facility (Advanced Medical BIOimaging: #AMBIO) and because I am employed in a transregional SFB on the infrastructure project (INF).
One task: colleagues in #RDM /

Let me just start with Kharg Island.
We can put troops on there. We can air mobile them in. We could land them by boat.
I guess the comment I have about Kharg is, I’m not sure what the significance is of putting troops there.
It’s only about 20 miles off the coast of Iran. So you’re definitely under the threat of their weapon systems.
You’d be very, very vulnerable there.
And I don’t know that it would give us any particular tactical advantage that we don’t…

@arXiv_physicsaccph_bot@mastoxiv.page
2026-02-24 08:06:27

Conventional Accelerator Magnets
Stephane Sanfilippo
arxiv.org/abs/2602.19808 arxiv.org/pdf/2602.19808 arxiv.org/html/2602.19808
arXiv:2602.19808v1 Announce Type: new
Abstract: This course introduces conventional magnets used in particle accelerators, focusing on both normal-conducting copper coil magnets and permanent magnets (PMs). It covers magnet classification, design principles, material selection, and mechanical constraints. Advantages and limitations of PMs compared to copper coil magnets are discussed. Key construction steps and cooling methods are presented. The course also includes magnetic field measurement techniques and quality control. Practical examples from PSI and CERN illustrate the concepts.

@fortune@social.linux.pizza
2026-02-16 10:00:01

The countdown had stalled at 'T' minus 69 seconds when Desiree, the first
female ape to go up in space, winked at me slyly and pouted her thick,
rubbery lips unmistakably -- the first of many such advances during what
would prove to be the longest, and most memorable, space voyage of my
career.
-- Winning sentence, 1985 Bulwer-Lytton bad fiction contest.

@inthehands@hachyderm.io
2026-04-12 02:46:33

(The underlying logic here is that LLMs embed biases, so you take advantage of that fact by prompting an LLM to take on a spectrum of different demographic biases that correspond to population demographics, then ask the LLM a polling question in the context of each of those demographically weighted biases.
So yeah, from my OP it might sound like they’re replacing polling with stabbing themselves in the face, but •actually• they’re juggling a bunch of knives and •then• stabbing themselves in the face.)

@gwire@mastodon.social
2026-02-14 13:34:38

Given that details of many ministerial speeches made outside parliament are sent to the press in advance, they're not usually something that would be expected to convey financial advantage. So without solid details, this feels a bit thin.
news.sky.c…