Tootfinder

Opt-in global Mastodon full text search.

No exact results. Similar results found.
@arXiv_csDC_bot@mastoxiv.page
2025-08-26 09:54:06

Zen-Attention: A Compiler Framework for Dynamic Attention Folding on AMD NPUs
Aadesh Deshmukh, Venkata Yaswanth Raparti, Samuel Hsu
arxiv.org/abs/2508.17593

@arXiv_csET_bot@mastoxiv.page
2025-09-23 09:46:40

Evaluating the Energy Efficiency of NPU-Accelerated Machine Learning Inference on Embedded Microcontrollers
Anastasios Fanariotis, Theofanis Orphanoudakis, Vasilis Fotopoulos
arxiv.org/abs/2509.17533

@arXiv_csAR_bot@mastoxiv.page
2025-09-19 07:31:01

eIQ Neutron: Redefining Edge-AI Inference with Integrated NPU and Compiler Innovations
Lennart Bamberg, Filippo Minnella, Roberto Bosio, Fabrizio Ottati, Yuebin Wang, Jongmin Lee, Luciano Lavagno, Adam Fuks
arxiv.org/abs/2509.14388

@arXiv_csAR_bot@mastoxiv.page
2025-10-08 07:40:29

From Principles to Practice: A Systematic Study of LLM Serving on Multi-core NPUs
Tianhao Zhu, Dahu Feng, Erhu Feng, Yubin Xia
arxiv.org/abs/2510.05632

@arXiv_csAI_bot@mastoxiv.page
2025-10-01 11:31:17

Benchmarking Deep Learning Convolutions on Energy-constrained CPUs
Enrique Galvez (ALSOC), Adrien Cassagne (ALSOC), Alix Munier (ALSOC), Manuel Bouyer
arxiv.org/abs/2509.26217

@arXiv_csDC_bot@mastoxiv.page
2025-08-01 07:36:50

H2SGEMM: Emulating FP32 GEMM on Ascend NPUs using FP16 Units with Precision Recovery and Cache-Aware Optimization
Weicheng Xue, Baisong Xu, Kai Yang, Yongxiang Liu, Dengdeng Fan, Pengxiang Xu, Yonghong Tian
arxiv.org/abs/2507.23387

@arXiv_csAI_bot@mastoxiv.page
2025-10-01 17:40:14

Replaced article(s) found for cs.AI. arxiv.org/list/cs.AI/new
[8/9]:
- MindVL: Towards Efficient and Effective Training of Multimodal Large Language Models on Ascend NPUs
Feilong Chen, Yijiang Liu, Yi Huang, Hao Wang, Miren Tian, Ya-Qi Yu, Minghui Liao, Jihao Wu

@arXiv_csDC_bot@mastoxiv.page
2025-09-30 09:13:41

Scaling LLM Test-Time Compute with Mobile NPU on Smartphones
Zixu Hao, Jianyu Wei, Tuowei Wang, Minxing Huang, Huiqiang Jiang, Shiqi Jiang, Ting Cao, Ju Ren
arxiv.org/abs/2509.23324

@arXiv_csDC_bot@mastoxiv.page
2025-10-08 07:36:59

Tiny but Mighty: A Software-Hardware Co-Design Approach for Efficient Multimodal Inference on Battery-Powered Small Devices
Yilong Li, Shuai Zhang, Yijing Zeng, Hao Zhang, Xinmiao Xiong, Jingyu Liu, Pan Hu, Suman Banerjee
arxiv.org/abs/2510.05109

@arXiv_csDC_bot@mastoxiv.page
2025-08-04 12:08:41

Replaced article(s) found for cs.DC. arxiv.org/list/cs.DC/new
[1/1]:
- SGEMM-cube: Emulating FP32 GEMM on Ascend NPUs Using FP16 Cube Units with Precision Recovery
Weicheng Xue, Baisong Xu, Kai Yang, Yongxiang Liu, Dengdeng Fan, Pengxiang Xu, Yonghong Tian