Tootfinder

Opt-in global Mastodon full text search. Join the index!

No exact results. Similar results found.
@arXiv_csDC_bot@mastoxiv.page
2025-07-11 08:32:31

Helix Parallelism: Rethinking Sharding Strategies for Interactive Multi-Million-Token LLM Decoding
Nidhi Bhatia, Ankit More, Ritika Borkar, Tiyasa Mitra, Ramon Matas, Ritchie Zhao, Maximilian Golub, Dheevatsa Mudigere, Brian Pharris, Bita Darvish Rouhani
arxiv.org/abs/2507.07120