
2025-06-12 10:00:11
NDCG-Consistent Softmax Approximation with Accelerated Convergence
Yuanhao Pu, Defu Lian, Xiaolong Chen, Xu Huang, Jin Chen, Enhong Chen
https://arxiv.org/abs/2506.09454
This https://arxiv.org/abs/2405.06003 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_sta…
This https://arxiv.org/abs/2505.17282 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…
Joint Beamforming and Integer User Association using a GNN with Gumbel-Softmax Reparameterizations
Qing Lyu, Mai Vu
https://arxiv.org/abs/2506.05241
Differentiable Reward Optimization for LLM based TTS system
Changfeng Gao, Zhihao Du, Shiliang Zhang
https://arxiv.org/abs/2507.05911
Intrinsic and Extrinsic Organized Attention: Softmax Invariance and Network Sparsity
Oluwadamilola Fasina, Ruben V. C. Pohle, Pei-Chun Su, Ronald R. Coifman
https://arxiv.org/abs/2506.15541
This https://arxiv.org/abs/2303.17475 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…
Box-Constrained Softmax Function and Its Application for Post-Hoc Calibration
Kyohei Atarashi, Satoshi Oyama, Hiromi Arai, Hisashi Kashima
https://arxiv.org/abs/2506.10572
Neural Jumps for Option Pricing
Duosi Zheng, Hanzhong Guo, Yanchu Liu, Wei Huang
https://arxiv.org/abs/2506.05137 https://arxiv.org/p…
Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation
Xingyang Li, Muyang Li, Tianle Cai, Haocheng Xi, Shuo Yang, Yujun Lin, Lvmin Zhang, Songlin Yang, Jinbo Hu, Kelly Peng, Maneesh Agrawala, Ion Stoica, Kurt Keutzer, Song Han
https://arxiv.org/abs/2506.19852…
Refining Datapath for Microscaling ViTs
Can Xiao, Jianyi Cheng, Aaron Zhao
https://arxiv.org/abs/2505.22194 https://arxiv.org/pdf/250…
ASDA: Audio Spectrogram Differential Attention Mechanism for Self-Supervised Representation Learning
Junyu Wang, Tianrui Wang, Meng Ge, Longbiao Wang, Jianwu Dang
https://arxiv.org/abs/2507.02666
Advancing Loss Functions in Recommender Systems: A Comparative Study with a Rényi Divergence-Based Solution
Shengjia Zhang, Jiawei Chen, Changdong Li, Sheng Zhou, Qihao Shi, Yan Feng, Chun Chen, Can Wang
https://arxiv.org/abs/2506.15120
Evaluating Logit-Based GOP Scores for Mispronunciation Detection
Aditya Kamlesh Parikh, Cristian Tejedor-Garcia, Catia Cucchiarini, Helmer Strik
https://arxiv.org/abs/2506.12067
SystolicAttention: Fusing FlashAttention within a Single Systolic Array
Jiawei Lin, Guokai Chen, Yuanlong Li, Thomas Bourgeat
https://arxiv.org/abs/2507.11331