2025-10-20 02:35:34
Alibaba Cloud details a GPU pooling system that it claims cut the number of Nvidia H20 GPUs required by 82% when serving dozens of LLMs of up to 72B parameters (Vincent Chow/South China Morning Post)
https://www.scmp.com/business/article/3329