Canary-1B-v2 & Parakeet-TDT-0.6B-v3: Efficient and High-Performance Models for Multilingual ASR and AST
Monica Sekoyan, Nithin Rao Koluguri, Nune Tadevosyan, Piotr Zelasko, Travis Bartley, Nick Karpov, Jagadeesh Balam, Boris Ginsburg
https://arxiv.org/abs/2509.14128
The Trump administration has vowed to crack down on what it calls hate speech. It has labeled antifa, a loosely organized anti-fascist group, a terrorist organization.
And it has sought to punish figures such as TV host Jimmy Kimmel for statements perceived as critical of conservative activists.
What the First Amendment makes clear is that it does not just protect the rights of speakers who say things with which Americans agree.
Or, as the Supreme Court said in a separate deci…
Meta introduces Omnilingual Automatic Speech Recognition, a suite of AI models providing automatic speech recognition capabilities for more than 1,600 languages (Carl Franzen/VentureBeat)
https://venturebeat.com/ai/meta-returns-to-open-source-ai…
Proficiency-Aware Adaptation and Data Augmentation for Robust L2 ASR
Ling Sun, Charlotte Zhu, Shuju Shi
https://arxiv.org/abs/2510.10738 https://arxiv.org/…
Articulation-Informed ASR: Integrating Articulatory Features into ASR via Auxiliary Speech Inversion and Cross-Attention Fusion
Ahmed Adel Attia, Jing Liu, Carol Espy Wilson
https://arxiv.org/abs/2510.08585
Accent-Invariant Automatic Speech Recognition via Saliency-Driven Spectrogram Masking
Mohammad Hossein Sameti, Sepehr Harfi Moridani, Ali Zarean, Hossein Sameti
https://arxiv.org/abs/2510.09528
Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation
Vaibhav Srivastav, Steven Zheng, Eric Bezzam, Eustache Le Bihan, Nithin Koluguri, Piotr Żelasko, Somshubra Majumdar, Adel Moumen, Sanchit Gandhi
https://arxiv.org/abs/2510.06961
Listening or Reading? Evaluating Speech Awareness in Chain-of-Thought Speech-to-Text Translation
Jacobo Romero-Díaz, Gerard I. Gállego, Oriol Pareras, Federico Costa, Javier Hernando, Cristina España-Bonet
https://arxiv.org/abs/2510.03115
Revisiting Direct Speech-to-Text Translation with Speech LLMs: Better Scaling than CoT Prompting?
Oriol Pareras, Gerard I. Gállego, Federico Costa, Cristina España-Bonet, Javier Hernando
https://arxiv.org/abs/2510.03093