LLM-based phoneme-to-grapheme for phoneme-based speech recognition
Te Ma, Min Bi, Saierdaer Yusuyin, Hao Huang, Zhijian Ou
https://arxiv.org/abs/2506.04711
Grapheme-Coherent Phonemic and Prosodic Annotation of Speech by Implicit and Explicit Grapheme Conditioning
Hien Ohnaka, Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto
https://arxiv.org/abs/2506.04527
Adaptability of ASR Models on Low-Resource Language: A Comparative Study of Whisper and Wav2Vec-BERT on Bangla
Md Sazzadul Islam Ridoy, Sumi Akter, Md. Aminur Rahman
https://arxiv.org/abs/2507.01931
Unsupervised Rhythm and Voice Conversion to Improve ASR on Dysarthric Speech
Karl El Hajal, Enno Hermann, Sevada Hovsepyan, Mathew Magimai.-Doss
https://arxiv.org/abs/2506.01618
AURA: Agent for Understanding, Reasoning, and Automated Tool Use in Voice-Driven Tasks
Leander Melroy Maben, Gayathri Ganesh Lakshmy, Srijith Radhakrishnan, Siddhant Arora, Shinji Watanabe
https://arxiv.org/abs/2506.23049
NAVER LABS Europe Submission to the Instruction-following Track
Beomseok Lee, Marcely Zanon Boito, Laurent Besacier, Ioan Calapodescu
https://arxiv.org/abs/2506.01808
Fine-Tuning ASR for Stuttered Speech: Personalized vs. Generalized Approaches
Dena Mujtaba, Nihar Mahapatra
https://arxiv.org/abs/2506.00853
This https://arxiv.org/abs/2505.24200 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csSD_…
Efficient Multilingual ASR Finetuning via LoRA Language Experts
Jiahong Li, Yiwen Shao, Jianheng Zhuo, Chenda Li, Liliang Tang, Dong Yu, Yanmin Qian
https://arxiv.org/abs/2506.21555