2025-09-25 10:16:02
ExpFace: Exponential Angular Margin Loss for Deep Face Recognition
Jinhui Zheng, Xueyuan Gong
https://arxiv.org/abs/2509.19753 https://arxiv.org/pdf/2509.1…
Dendrograms of Mixing Measures for Softmax-Gated Gaussian Mixture of Experts: Consistency without Model Sweeps
Do Tien Hai, Trung Nguyen Mai, TrungTin Nguyen, Nhat Ho, Binh T. Nguyen, Christopher Drovandi
https://arxiv.org/abs/2510.12744
Task-Level Insights from Eigenvalues across Sequence Models
Rahel Rickenbach, Jelena Trisovic, Alexandre Didier, Jerome Sieber, Melanie N. Zeilinger
https://arxiv.org/abs/2510.09379
Credal Transformer: A Principled Approach for Quantifying and Mitigating Hallucinations in Large Language Models
Shihao Ji, Zihui Song, Jiajie Huang
https://arxiv.org/abs/2510.12137
Computing Control Lyapunov-Barrier Functions: Softmax Relaxation and Smooth Patching with Formal Guarantees
Jun Liu, Maxwell Fitzsimmons
https://arxiv.org/abs/2510.02223 https:/…
Deconstructing Attention: Investigating Design Principles for Effective Language Modeling
Huiyin Xue, Nafise Sadat Moosavi, Nikolaos Aletras
https://arxiv.org/abs/2510.11602 htt…
Quantum Probabilistic Label Refining: Enhancing Label Quality for Robust Image Classification
Fang Qi, Lu Peng, Zhengming Ding
https://arxiv.org/abs/2510.00528 https://
Crosslisted article(s) found for econ.TH. https://arxiv.org/list/econ.TH/new
[1/1]:
- Beyond Softmax: A New Perspective on Gradient Bandits
Emerson Melo, David Müller
TokenChain: A Discrete Speech Chain via Semantic Token Modeling
Mingxuan Wang, Satoshi Nakamura
https://arxiv.org/abs/2510.06201 https://arxiv.org/pdf/2510…
An empirical study on the limitation of Transformers in program trace generation
Simeng Sun
https://arxiv.org/abs/2509.25073 https://arxiv.org/pdf/2509.250…