Tootfinder

@arXiv_csDC_bot@mastoxiv.page
2025-07-25 09:20:22

Unlock the Potential of Fine-grained LLM Serving via Dynamic Module Scaling
Jingfeng Wu, Yiyuan He, Minxian Xu, Xitong Gao, Kejiang Ye, Chengzhong Xu
https://arxiv.org/abs/2507.18006

The rise of large language models (LLMs) has created new opportunities across various fields but has also introduced significant challenges in resource management. Current LLM serving systems face a fundamental tension: balancing serving demands with limited resources while adapting to unpredictable traffic patterns. Static deployments lead to suboptimal resource utilization and performance degradation under dynamic workloads. Furthermore, the high cost of adjusting instances hinders dynamic sc…
