2025-09-12 10:12:19
ButterflyQuant: Ultra-low-bit LLM Quantization through Learnable Orthogonal Butterfly Transforms
Bingxin Xu, Zhen Dong, Oussama Elachqar, Yuzhang Shang
https://arxiv.org/abs/2509.09679
Spatial-Temporal Multi-Scale Quantization for Flexible Motion Generation
Zan Wang, Jingze Zhang, Yixin Chen, Baoxiong Jia, Wei Liang, Siyuan Huang
https://arxiv.org/abs/2508.08991
Finite Scalar Quantization Enables Redundant and Transmission-Robust Neural Audio Compression at Low Bit-rates
Harry Julia, Rachel Beeson, Lohith Konathala, Johanna Ulin, Jiameng Gao
https://arxiv.org/abs/2509.09550
HERO: Hardware-Efficient RL-based Optimization Framework for NeRF Quantization
Yipu Zhang, Chaofang Ma, Jinming Ge, Lin Jiang, Jiang Xu, Wei Zhang
https://arxiv.org/abs/2510.09010
Quantization of charged fields in the presence of intense electromagnetic fields
Álvaro Álvarez-Domínguez
https://arxiv.org/abs/2510.09447
Profiling Large Language Model Inference on Apple Silicon: A Quantization Perspective
Afsara Benazir, Felix Xiaozhu Lin
https://arxiv.org/abs/2508.08531
Quantization of the electromagnetic fields from single atomic or molecular radiators
Valerica Raicu
https://arxiv.org/abs/2509.07359 https://arxiv.org/pdf/…
Quantization of bounded symplectic domains associated with compact Lie groups
Alexey A. Sharapov
https://arxiv.org/abs/2509.05931 https://arxiv.org/pdf/250…
Training Dynamics Impact Post-Training Quantization Robustness
Albert Catalan-Tatjer, Niccolò Ajroldi, Jonas Geiping
https://arxiv.org/abs/2510.06213
Hierarchical Spatial Algorithms for High-Resolution Image Quantization and Feature Extraction
Noor Islam S. Mohammad
https://arxiv.org/abs/2510.08449
Recursive algorithm for constructing antisymmetric fermionic states in first quantization mapping
E. Rule, I. A. Chernyshev, I. Stetcu, J. Carlson, R. Weiss
https://arxiv.org/abs/2509.07279
Sharpness-Aware Data Generation for Zero-shot Quantization
Dung Hoang-Anh, Cuong Pham Trung Le, Jianfei Cai, Thanh-Toan Do
https://arxiv.org/abs/2510.07018
Optimizing Fronthaul Quantization for Flexible User Load in Cell-Free Massive MIMO
Fabian Göttsch, Max Franke, Arash Pourdamghani, Giuseppe Caire, Stefan Schmid
https://arxiv.org/abs/2510.06734 …
Rate-Adaptive Semantic Communication via Multi-Stage Vector Quantization
Jinsung Park, Junyong Shin, Yongjeong Oh, Jihun Park, Yo-Seb Jeon
https://arxiv.org/abs/2510.02646 https…
Supersymmetric lattice theories on curved space
David Berenstein, Simon Catterall
https://arxiv.org/abs/2509.08885 https://arxiv.org/pdf/2509.08885
The Uneven Impact of Post-Training Quantization in Machine Translation
Benjamin Marie, Atsushi Fujita
https://arxiv.org/abs/2508.20893 https://arxiv.org/pd…
The Schrödinger equation for a spherically symmetric system, its structure and solutions
R. I. Ayala Oña, T. P. Shestakova
https://arxiv.org/abs/2509.08523
CSI Compression Beyond Latents: End-to-End Hybrid Attention-CNN Networks with Entropy Regularization
Maryam Ansarifard, Mostafa Rahmani, Mohit K. Sharma, Kishor C. Joshi, George Exarchakos, Alister Burr
https://arxiv.org/abs/2509.08776
LIV-Decoherence on Gravitational Cat States
Iarley P. Lobo, Kelvin Sampaio, Gislaine Varão, Moises Rojas, Valdir B. Bezerra
https://arxiv.org/abs/2510.08828
Quantization of spin circular photogalvanic effect in altermagnetic Weyl semimetals
Hiroki Yoshida, Jan Priessnitz, Libor Šmejkal, Shuichi Murakami
https://arxiv.org/abs/2509.05620
Duality between polyhedral approximation of value functions and optimal quantization of measures
Abdellah Bulaich Mehamdi, Wim van Ackooij, Luce Brotcorne, Stéphane Gaubert, Quentin Jacquet
https://arxiv.org/abs/2509.04101
Huawei's Zurich Lab unveils SINQ, an open-source quantization method that it claims can reduce LLM memory use by 60-70% without significant quality loss (Carl Franzen/VentureBeat)
https://venturebeat.com/ai/huaweis-new-open-source-techni…
Integrating Pruning with Quantization for Efficient Deep Neural Networks Compression
Sara Makenali, Babak Rokh, Ali Azarpeyvand
https://arxiv.org/abs/2509.04244
Robust Residual Finite Scalar Quantization for Neural Compression
Xiaoxu Zhu
https://arxiv.org/abs/2508.15860 https://arxiv.org/pdf/2508.15860
FineServe: Precision-Aware KV Slab and Two-Level Scheduling for Heterogeneous Precision LLM Serving
Kyungmin Bin, Seungbeom Choi, Jimyoung Son, Jieun Choi, Daseul Bae, Daehyeon Baek, Kihyo Moon, Minsung Jang, Hyojung Lee
https://arxiv.org/abs/2509.06261
Dynamical Quantum Multigraphs
Kassahun H. Betre, Nathan Lewis
https://arxiv.org/abs/2509.08296 https://arxiv.org/pdf/2509.08296
Crosslisted article(s) found for physics.optics. https://arxiv.org/list/physics.optics/new
[1/1]:
- Quantization of the electromagnetic fields from single atomic or molecular radiators
Valerica Raicu
A Theoretically-Grounded Codebook for Digital Semantic Communications
Lingyi Wang, Rashed Shelim, Walid Saad, Naren Ramakrishnan
https://arxiv.org/abs/2510.07108 https://…
Deformation quantization of a hessian KV- structure on $\mathbb{R}^2$
Herguey Mopeng, Prosper Rosaire Mama Assandje, Joseph Dongho, Armand Tsimi
https://arxiv.org/abs/2509.23228
Replaced article(s) found for cs.AR. https://arxiv.org/list/cs.AR/new
[1/1]:
- KLLM: Fast LLM Inference with K-Means Quantization
Xueying Wu, Baijun Zhou, Zhihui Gao, Yuzhe Fu, Qilin Zheng, Yintao He, Hai Li
Cat: Post-training quantization error reduction via cluster-based affine transformation
Ali Zoljodi, Radu Timofte, Masoud Daneshtalab
https://arxiv.org/abs/2509.26277 https://…
From Faraday and Maxwell to Quantum Physics. The later story of the Electromagnetic Vector Potential
Tuck Choy, Miguel Ortuno
https://arxiv.org/abs/2509.04486
Crosslisted article(s) found for physics.chem-ph. https://arxiv.org/list/physics.chem-ph/new
[1/1]:
- Recursive algorithm for constructing antisymmetric fermionic states in first quantization mapping
E. Rule, I. A. Chernyshev, I. Stetcu, J. Carlson, R. Weiss
Mixture of Many Zero-Compute Experts: A High-Rate Quantization Theory Perspective
Yehuda Dar
https://arxiv.org/abs/2510.03151 https://arxiv.org/pdf/2510.03…
Progressive Semantic Residual Quantization for Multimodal-Joint Interest Modeling in Music Recommendation
Shijia Wang, Tianpei Ouyang, Qiang Xiao, Dongjing Wang, Yintao Ren, Songpei Xu, Da Guo, Chuanjiang Luo
https://arxiv.org/abs/2508.20359
Distributed Platoon Control Under Quantization: Stability Analysis and Privacy Preservation
Kaixiang Zhang, Zhaojian Li, Wei Lin
https://arxiv.org/abs/2510.05959 https://…
AQUA-LLM: Evaluating Accuracy, Quantization, and Adversarial Robustness Trade-offs in LLMs for Cybersecurity Question Answering
Onat Gungor, Roshan Sood, Harold Wang, Tajana Rosing
https://arxiv.org/abs/2509.13514
Batalin-Fradkin-Vilkovisky Quantization of FLPR model
Ansha S. Nair, Saurabh Gupta
https://arxiv.org/abs/2509.05632 https://arxiv.org/pdf/2509.05632…
A quantization of the $\operatorname{SL}_2(\mathbb{C})$-Chern-Simons invariant of tangle exteriors
Calvin McPhail-Snyder
https://arxiv.org/abs/2509.02365
Studying $\textrm{QED}_3$ with radial quantization on the lattice -- I. Free limit
Peter A. Boyle, Richard C. Brower, George T. Fleming, Emanuel Katz, Nobuyuki Matsumoto, Rohan Misra
https://arxiv.org/abs/2510.03085
Replaced article(s) found for cond-mat.str-el. https://arxiv.org/list/cond-mat.str-el/new
[1/1]:
- Exact quantization of topological order parameter in SU($N$) spin models, $N$-ality transformatio...
Hang Su, Yuan Yao, Akira Furusaki
How Quantization Shapes Bias in Large Language Models
Federico Marcuzzi, Xuefei Ning, Roy Schwartz, Iryna Gurevych
https://arxiv.org/abs/2508.18088
$C^0$-rigidity of Legendrians and coisotropics via sheaf quantization
Tomohiro Asano, Yuichi Ike, Christopher Kuo, Wenyuan Li
https://arxiv.org/abs/2510.01746
DiVeQ: Differentiable Vector Quantization Using the Reparameterization Trick
Mohammad Hassan Vali, Tom Bäckström, Arno Solin
https://arxiv.org/abs/2509.26469 https…
Quantization of blow-up masses for the Finsler $N$-Liouville equation
Xia Huang, Yuan Li, Dong Ye, Feng Zhou
https://arxiv.org/abs/2508.16080 https://arxiv…
Post-Training Quantization via Residual Truncation and Zero Suppression for Diffusion Models
Donghoon Kim, Dongyoung Lee, Ik Joon Chang, Sung-Ho Bae
https://arxiv.org/abs/2509.26436
Word2Spike: Poisson Rate Coding for Associative Memories and Neuromorphic Algorithms
Archit Kalra, Midhun Sadanand
https://arxiv.org/abs/2509.07361
Non-Lagrangian Construction of Anyons via Flux Quantization in Cohomotopy
Hisham Sati, Urs Schreiber
https://arxiv.org/abs/2509.02577 https://arxiv.org/pdf…
Classical Polymerization of the Bianchi I Model with Deformed Poisson Structure
Babak Vakili
https://arxiv.org/abs/2510.06628 https://arxiv.org/pdf/2510.06…
Using Landau quantization to probe disorder in semiconductor heterostructures
Asser Elsayed, Davide Costa, Lucas E. A. Stehouwer, Alberto Tosato, Mario Lodari, Brian Paquelet Wuetz, Davide Degli Esposti, Giordano Scappucci
https://arxiv.org/abs/2510.02794
LiquidGEMM: Hardware-Efficient W4A8 GEMM Kernel for High-Performance LLM Serving
Huanqi Hu, Bowen Xiao, Shixuan Sun, Jianian Yin, Zhexi Zhang, Xiang Luo, Chengquan Jiang, Weiqi Xu, Xiaoying Jia, Xin Liu, Minyi Guo
https://arxiv.org/abs/2509.01229
Binary Weight Multi-Bit Activation Quantization for Compute-in-Memory CNN Accelerators
Wenyong Zhou, Zhengwu Liu, Yuan Ren, Ngai Wong
https://arxiv.org/abs/2508.21524 https://…
Distributed Detection and Bandwidth Allocation with Hybrid Quantized and Full-Precision Observations over Multiplicative Fading Channels
Linlin Mao, Zeping Sui, Michail Matthaiou, Hongbin Li
https://arxiv.org/abs/2510.06429
AMAQ: Adaptive Mixed-bit Activation Quantization for Collaborative Parameter Efficient Fine-tuning
Yurun Song, Zhuoyi Yang, Ian G. Harris, Sangeetha Abdu Jyothi
https://arxiv.org/abs/2510.05468
Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs
Haokun Lin, Haobo Xu, Yichen Wu, Ziyu Guo, Renrui Zhang, Zhichao Lu, Ying Wei, Qingfu Zhang, Zhenan Sun
https://arxiv.org/abs/2508.14896
Spectrogram Patch Codec: A 2D Block-Quantized VQ-VAE and HiFi-GAN for Neural Speech Coding
Luis Felipe Chary, Miguel Arjona Ramirez
https://arxiv.org/abs/2509.02244
Operator Algebras and Third Quantization
Yidong Chen, Marius Junge, Nima Lashkari
https://arxiv.org/abs/2509.02293 https://arxiv.org/pdf/2509.02293
DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling
Yubo Gao, Renbo Tu, Gennady Pekhimenko, Nandita Vijaykumar
https://arxiv.org/abs/2509.03472
A Node-Aware Dynamic Quantization Approach for Graph Collaborative Filtering
Lin Li, Chunyang Li, Yu Yin, Xiaohui Tao, Jianwei Zhang
https://arxiv.org/abs/2508.16516
Analysis of polymerized superconducting circuits
Sean Crowe, Stefan Evans, Alexei Smolyaninov
https://arxiv.org/abs/2509.18016 https://arxiv.org/pdf/2509.1…
Systematic Characterization of LLM Quantization: A Performance, Energy, and Quality Perspective
Tianyao Shi, Yi Ding
https://arxiv.org/abs/2508.16712
Stratification of the half-density quantization of the Jeffrey-Weitsman-Witten invariants
Adrian Chitan
https://arxiv.org/abs/2509.17656 https://arxiv.org/…
Continuum Fractons: Quantization and the Many Body Problem
Ylias Sadki, Abhishodh Prakash, S. L. Sondhi
https://arxiv.org/abs/2510.00110 https://arxiv.org/…
Thermodynamics of Kerr black hole: Tsallis-Cirto composition law and entropy quantization
G. E. Volovik
https://arxiv.org/abs/2509.00748 https://arxiv.org/…
Coproducts for affine super-Yangian and Weyl groupoid action
Vasiliy Volkov, Vladimir Stukopin
https://arxiv.org/abs/2510.04221 https://arxiv.org/pdf/2510.…
Efficient Quantization-Aware Neural Receivers: Beyond Post-Training Quantization
SaiKrishna Saketh Yellapragada, Esa Ollila, Mario Costa
https://arxiv.org/abs/2509.13786 https:/…
TinyMusician: On-Device Music Generation with Knowledge Distillation and Mixed Precision Quantization
Hainan Wang, Mehdi Hosseinzadeh, Reza Rawassizadeh
https://arxiv.org/abs/2509.00914
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer
Ziyuan Huang, DanDan Zheng, Cheng Zou, Rui Liu, Xiaolong Wang, Kaixiang Ji, Weilong Chai, Jianxin Sun, Libin Wang, Yongjie Lv, Taozhi Huang, Jiajia Liu, Qingpei Guo, Ming Yang, Jingdong Chen, Jun Zhou
https://arxiv.org/abs/2510.06590
Phonological Representation Learning for Isolated Signs Improves Out-of-Vocabulary Generalization
Lee Kezar, Zed Sehyr, Jesse Thomason
https://arxiv.org/abs/2509.04745
Bohr-Sommerfeld quantization conditions for Schrodinger operator: the Method of Microlocal Wronskian and Gram Matrix
Abdelwaheb Ifa, Michel Rouleux
https://arxiv.org/abs/2509.23514
Simultaneous Quantization and Reduction of Constrained Systems
Jianhao M. Yang
https://arxiv.org/abs/2509.22747 https://arxiv.org/pdf/2509.22747
Enhancing Communication Efficiency in FL with Adaptive Gradient Quantization and Communication Frequency Optimization
Asadullah Tariq, Tariq Qayyum, Mohamed Adel Serhani, Farag Sallabi, Ikbal Taleb, Ezedin S. Barka
https://arxiv.org/abs/2509.23419
All-optical band structure reconstruction and onset of Landau quantization of Dirac fermions
Josef Riepl (Department of Physics, University of Regensburg, Regensburg, Germany), Marc Aichner (Department of Physics, University of Regensburg, Regensburg, Germany, LSI, CNRS, CEA/DRF/IRAMIS, Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France), Nikolai N. Mikhailov (A.V. Rzhanov Institute of Semiconductor Physics, Novosibirsk, Russia), Sergei A. Dvoretsky (A.V. Rzhanov I…
Can Less Precise Be More Reliable? A Systematic Evaluation of Quantization's Impact on CLIP Beyond Accuracy
Aymen Bouguerra, Daniel Montoya, Alexandra Gomez-Villa, Fabio Arnez, Chokri Mraidha
https://arxiv.org/abs/2509.21173
Deep Multiple Quantization Network on Long Behavior Sequence for Click-Through Rate Prediction
Zhuoxing Wei, Qi Liu, Qingchen Xie
https://arxiv.org/abs/2508.20865
Stochastic dynamics for group field theories II: Methods for nonequilibrium renormalization group
Vincent Lahoche, Dine Ousmane Samary
https://arxiv.org/abs/2509.05507
Beat-Based Rhythm Quantization of MIDI Performances
Maximilian Wachter, Sebastian Murgul, Michael Heizmann
https://arxiv.org/abs/2508.19262 https://arxiv.o…
Fair-GPTQ: Bias-Aware Quantization for Large Language Models
Irina Proskurina, Guillaume Metzler, Julien Velcin
https://arxiv.org/abs/2509.15206 https://ar…
AI-Driven Fronthaul Link Compression in Wireless Communication Systems: Review and Method Design
Keqin Zhang
https://arxiv.org/abs/2509.04805 https://arxiv…
Optimal Remainder Estimates in the Quantization of Complex Projective Spaces
Tommaso Aschieri, Błażej Ruba, Jan Philip Solovej
https://arxiv.org/abs/2508.19968
SIRA: Scaled-Integer Range Analysis for Optimizing FPGA Dataflow Neural Network Accelerators
Yaman Umuroglu, Christoph Berganski, Felix Jentzsch, Michal Danilowicz, Tomasz Kryjak, Charalampos Bezaitis, Magnus Sjalander, Ian Colbert, Thomas Preusser, Jakoba Petri-Koenig, Michaela Blott
https://arxiv.org/abs/2508.21493
Scene-Aware Vectorized Memory Multi-Agent Framework with Cross-Modal Differentiated Quantization VLMs for Visually Impaired Assistance
Xiangxiang Wang, Xuanyu Wang, YiJia Luo, Yongbin Yu, Manping Fan, Jingtao Zhang, Liyong Ren
https://arxiv.org/abs/2508.18177
Language Model Based Text-to-Audio Generation: Anti-Causally Aligned Collaborative Residual Transformers
Juncheng Wang, Chao Xu, Cheng Yu, Zhe Hu, Haoyu Xie, Guoqi Yu, Lei Shang, Shujun Wang
https://arxiv.org/abs/2510.04577
Renormalization of Chern-Simons Wilson Loops via Flux Quantization in Cohomotopy
Hisham Sati, Urs Schreiber
https://arxiv.org/abs/2509.25336 https://arxiv.…
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs
Wei Huang, Yi Ge, Shuai Yang, Yicheng Xiao, Huizi Mao, Yujun Lin, Hanrong Ye, Sifei Liu, Ka Chun Cheung, Hongxu Yin, Yao Lu, Xiaojuan Qi, Song Han, Yukang Chen
https://arxiv.org/abs/2510.11696
$\gamma$-Quant: Towards Learnable Quantization for Low-bit Pattern Recognition
Mishal Fatima, Shashank Agnihotri, Marius Bock, Kanchana Vaishnavi Gandikota, Kristof Van Laerhoven, Michael Moeller, Margret Keuper
https://arxiv.org/abs/2509.22448
Exact WKB Formulation of Quantization and Particle Production in Time-Dependent Backgrounds
Ryo Namba, Motoo Suzuki
https://arxiv.org/abs/2509.19194
LLM Compression: How Far Can We Go in Balancing Size and Performance?
Sahil Sk, Debasish Dhal, Sonal Khosla, Sk Shahid, Sambit Shekhar, Akash Dhaka, Shantipriya Parida, Dilip K. Prasad, Ondřej Bojar
https://arxiv.org/abs/2508.11318
AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models
Sangjun Lee, Seung-taek Woo, Jungyu Jin, Changhun Lee, Eunhyeok Park
https://arxiv.org/abs/2509.12019
On the Quantization of the Electromagnetic Field with Magnetic Monopoles
Kanan Anwar
https://arxiv.org/abs/2509.17284 https://arxiv.org/pdf/2509.17284
Making Pose Representations More Expressive and Disentangled via Residual Vector Quantization
Sukhyun Jeong, Hong-Gi Shin, Yong-Hoon Choi
https://arxiv.org/abs/2508.14561 https:…
Chunk Based Speech Pre-training with High Resolution Finite Scalar Quantization
Yun Tang, Cindy Tseng
https://arxiv.org/abs/2509.15579 https://arxiv.org/pd…
CHORD: Customizing Hybrid-precision On-device Model for Sequential Recommendation with Device-cloud Collaboration
Tianqi Liu, Kairui Fu, Shengyu Zhang, Wenyan Fan, Zhaocheng Du, Jieming Zhu, Fan Wu, Fei Wu
https://arxiv.org/abs/2510.03038
GDNSQ: Gradual Differentiable Noise Scale Quantization for Low-bit Neural Networks
Sergey Salishev, Ian Akhremchik
https://arxiv.org/abs/2508.14004
Exploiting Information Redundancy in Attention Maps for Extreme Quantization of Vision Transformers
Lucas Maisonnave, Karim Haroun, Tom Pegeot
https://arxiv.org/abs/2508.16311 h…
Unstable mode and the Unruh-DeWitt detector
Bruno S. Felipe, João P. M. Pitelli
https://arxiv.org/abs/2508.20993 https://arxiv.org/pdf/2508.20993
Enhancing Model Privacy in Federated Learning with Random Masking and Quantization
Zhibo Xu, Jianhao Zhu, Jingwen Xu, Changze Lv, Zisu Huang, Xiaohua Wang, Muling Wu, Qi Qian, Xiaoqing Zheng, Xuanjing Huang
https://arxiv.org/abs/2508.18911
Image-Conditioned 3D Gaussian Splat Quantization
Xinshuang Liu, Runfa Blark Li, Keito Suzuki, Truong Nguyen
https://arxiv.org/abs/2508.15372 https://arxiv.…
Q-Palette: Fractional-Bit Quantizers Toward Optimal Bit Allocation for Efficient LLM Deployment
Deokjae Lee, Hyun Oh Song
https://arxiv.org/abs/2509.20214
CARVQ: Corrective Adaptor with Group Residual Vector Quantization for LLM Embedding Compression
Dayin Gou, Sanghyun Byun, Nilesh Malpeddi, Gabrielle De Micheli, Prathamesh Vaste, Jacob Song, Woo Seong Chung
https://arxiv.org/abs/2510.12721