Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation
Xingyang Li, Muyang Li, Tianle Cai, Haocheng Xi, Shuo Yang, Yujun Lin, Lvmin Zhang, Songlin Yang, Jinbo Hu, Kelly Peng, Maneesh Agrawala, Ion Stoica, Kurt Keutzer, Song Han
https://arxiv.org/abs/2506.19852
Box-Constrained Softmax Function and Its Application for Post-Hoc Calibration
Kyohei Atarashi, Satoshi Oyama, Hiromi Arai, Hisashi Kashima
https://arxiv.org/abs/2506.10572
Intrinsic and Extrinsic Organized Attention: Softmax Invariance and Network Sparsity
Oluwadamilola Fasina, Ruben V. C. Pohle, Pei-Chun Su, Ronald R. Coifman
https://arxiv.org/abs/2506.15541
NDCG-Consistent Softmax Approximation with Accelerated Convergence
Yuanhao Pu, Defu Lian, Xiaolong Chen, Xu Huang, Jin Chen, Enhong Chen
https://arxiv.org/abs/2506.09454
Advancing Loss Functions in Recommender Systems: A Comparative Study with a Rényi Divergence-Based Solution
Shengjia Zhang, Jiawei Chen, Changdong Li, Sheng Zhou, Qihao Shi, Yan Feng, Chun Chen, Can Wang
https://arxiv.org/abs/2506.15120
Evaluating Logit-Based GOP Scores for Mispronunciation Detection
Aditya Kamlesh Parikh, Cristian Tejedor-Garcia, Catia Cucchiarini, Helmer Strik
https://arxiv.org/abs/2506.12067