Contextualized Automatic Speech Recognition with Dynamic Vocabulary Prediction and Activation
Zhennan Lin, Kaixun Huang, Wei Ren, Linju Yang, Lei Xie
https://arxiv.org/abs/2505.23077
Deep biasing improves automatic speech recognition (ASR) performance by incorporating contextual phrases. However, most existing methods enhance the subwords of a contextual phrase as independent units, which can compromise the integrity of the phrase and reduce accuracy. In this paper, we propose an encoder-based phrase-level contextualized ASR method that leverages dynamic vocabulary prediction and activation. We introduce architectural optimizations and integrate a bias loss to ex…