Tootfinder

Opt-in global Mastodon full text search.

@aardrian@toot.cafe
2025-10-28 22:02:42

I appreciate Hidde’s ability to take preposterous claims by a major accessibility tooling maker and the rampant thievery of IP by LLM makers, consider their dehumanizing output, and propose this gentle, whisper-soft rebuke:
hidde.blog/teaching-llms-or-co

@hikingdude@mastodon.social
2025-10-28 20:34:30

'Dem Boden ganz nah' #FotoVorschlag 'close to the ground'
I think this could fit the topic. I took this photo while we were on vacation at the #balticSea. That was quite a while ago, but the memories are all fond. 🙂

A timeless monochrome scene unfolds along a windswept sandy path, where weathered wooden posts stand as silent sentinels. The foreground post, richly textured with deep grooves and cracks, dominates the frame, its rough bark telling stories of seasons passed. Beyond it, the path stretches into the distance, flanked by more posts that gradually fade into soft focus, guiding the eye toward the gentle rise of a dune. Delicate grasses, caught in mid-sway, add a whisper of movement to the otherwise …

@kexpmusicbot@mastodonapp.uk
2025-10-26 01:39:20

🇺🇦 #NowPlaying on KEXP's #Audioasis
Delvon Lamarr Organ Trio:
🎵 Careless Whisper
#DelvonLamarrOrganTrio
delvonlamarrorgantrio.bandcamp
open.spotify.com/track/4xmkuzw

@arXiv_csCL_bot@mastoxiv.page
2025-07-29 08:31:31

HITSZ's End-To-End Speech Translation Systems Combining Sequence-to-Sequence Auto Speech Recognition Model and Indic Large Language Model for IWSLT 2025 in Indic Track
Xuchen Wei, Yangxin Wu, Yaoyin Zhang, Henglyu Liu, Kehai Chen, Xuefeng Bai, Min Zhang
arxiv.org/abs/2507.19616

@arXiv_csSD_bot@mastoxiv.page
2025-08-29 09:08:01

Speech Emotion Recognition via Entropy-Aware Score Selection
ChenYi Chua, JunKai Wong, Chengxin Chen, Xiaoxiao Miao
arxiv.org/abs/2508.20796

@guerda@ruhr.social
2025-08-16 09:17:15

Cool feature. I'm not a big AI fan, but at the same time I do see real potential in transcription. And being able to generate SRT files seamlessly and offline is great.
FFmpeg 8.0 integrates Whisper: local audio transcription without the cloud | heise online

@aral@mastodon.ar.al
2025-10-24 16:19:42

“I sat beside a boy who tried to smile at me. I couldn’t return a real smile. Tears welled in my eyes as I realized words could never reach the horrors his soul had witnessed. All I could do was place my hand gently on his shoulder and whisper, ‘You are not alone.’”
@…

@arXiv_csLG_bot@mastoxiv.page
2025-10-15 10:49:51

Structured Sparsity and Weight-adaptive Pruning for Memory and Compute efficient Whisper models
Prasenjit K Mudi, Anshi Sachan, Dahlia Devapriya, Sheetal Kalyani
arxiv.org/abs/2510.12666

@arXiv_csCV_bot@mastoxiv.page
2025-09-23 13:09:31

Does Audio Matter for Modern Video-LLMs and Their Benchmarks?
Geewook Kim, Minjoon Seo
arxiv.org/abs/2509.17901 arxiv.org/pdf/2509.17901

@Mediagazer@mstdn.social
2025-09-19 11:02:11

How the outrage over Jimmy Kimmel's remarks on Charlie Kirk ballooned, starting with one X user, who monitors late night shows for liberal bias, posting a clip (Stuart A. Thompson/New York Times)
nytimes.com/2025/09/19/technol

@malik@Mastodon.Social
2025-08-19 21:05:46

@… Jordi, I hear MacWhisper is now blazingly fast with Nvidia tech. Did you ever finish the export that we can use for podcasts with multiple speakers plus chapter markers − .srt?
I’m always waiting to use MacWhisper, but I can’t because the output file doesn’t help me.
And I am one of thousands.

@arXiv_csCL_bot@mastoxiv.page
2025-08-26 11:56:56

Zero-shot Context Biasing with Trie-based Decoding using Synthetic Multi-Pronunciation
Changsong Liu, Yizhou Peng, Eng Siong Chng
arxiv.org/abs/2508.17796

@arXiv_eessAS_bot@mastoxiv.page
2025-10-13 08:37:20

BaldWhisper: Faster Whisper with Head Shearing and Layer Merging
Yaya Sy, Christophe Cerisara, Irina Illina
arxiv.org/abs/2510.08599 arxiv.…

@thomasfuchs@hachyderm.io
2025-10-03 22:59:42

The three words you can whisper into the ear of any support AI chatbot to get their attention
credit card chargeback

@arXiv_csCL_bot@mastoxiv.page
2025-09-22 10:08:51

VOX-KRIKRI: Unifying Speech and Language through Continuous Fusion
Dimitrios Damianos, Leon Voukoutis, Georgios Paraskevopoulos, Vassilis Katsouros
arxiv.org/abs/2509.15667

@BBC6MusicBot@mastodonapp.uk
2025-09-20 08:30:01

🇺🇦 #NowPlaying on #BBC6Music's #RadcliffeAndMaconie
Cecile Campbell:
🎵 Whisper to Me
#CecileCampbell
basswiserecords.bandcamp.com/t
open.spotify.com/track/5xgPgAW

@arXiv_csSD_bot@mastoxiv.page
2025-09-11 08:56:43

Behind the Scenes: Mechanistic Interpretability of LoRA-adapted Whisper for Speech Emotion Recognition
Yujian Ma, Jinqiu Sang, Ruizhe Li
arxiv.org/abs/2509.08454

@arXiv_csMA_bot@mastoxiv.page
2025-10-07 08:28:22

Audit the Whisper: Detecting Steganographic Collusion in Multi-Agent LLMs
Om Tailor
arxiv.org/abs/2510.04303 arxiv.org/pdf/2510.04303

@michabbb@social.vivaldi.net
2025-08-14 09:30:10

📹 Creates SRT subtitle files for videos and supports real-time live broadcast transcription
🔄 Seamless workflow integration
🔄 allowing automated processing and data transfer to other applications
🔗 Source URL
heise.de/en/news…

@stefan@gardenstate.social
2025-10-03 13:05:20

Why must we whisper so much? A quite sad season opener feels bad!
#CriticalRole

@escap@azapft.is
2025-08-09 10:22:18

Programming via voice input to an LLM is great fun. Whisper just understood "grobes Scheiß-Skript" ("crude crap script"), and in terms of content that is absolutely correct :D I said "shell script" but was thinking the former...

@arXiv_eessAS_bot@mastoxiv.page
2025-09-26 09:13:01

Variational Low-Rank Adaptation for Personalized Impaired Speech Recognition
Niclas Pokel, Pehuén Moure, Roman Boehringer, Shih-Chii Liu, Yingqiang Gao
arxiv.org/abs/2509.20397

@losttourist@social.chatty.monster
2025-08-01 21:30:43

It's basically impossible to imagine a time where Careless Whisper was a song you'd never heard before. But here it is. #TOTP

@JGraber@mastodon.social
2025-08-15 08:40:14

#Python Friday #292: Extract Text From Audio Files With Whisper - #ai #nlp

@arXiv_csCL_bot@mastoxiv.page
2025-08-14 09:50:22

Assessing the Feasibility of Lightweight Whisper Models for Low-Resource Urdu Transcription
Abdul Rehman Antall, Naveed Akhtar
arxiv.org/abs/2508.09865

@arXiv_csSD_bot@mastoxiv.page
2025-10-10 08:53:58

Leveraging Whisper Embeddings for Audio-based Lyrics Matching
Eleonora Mancini, Joan Serrà, Paolo Torroni, Yuki Mitsufuji
arxiv.org/abs/2510.08176

@arXiv_eessAS_bot@mastoxiv.page
2025-09-30 09:34:11

AISHELL6-whisper: A Chinese Mandarin Audio-visual Whisper Speech Dataset with Speech Recognition Baselines
Cancan Li, Fei Su, Juan Liu, Hui Bu, Yulong Wan, Hongbin Suo, Ming Li
arxiv.org/abs/2509.23833

@arXiv_csCL_bot@mastoxiv.page
2025-09-24 10:57:14

SloPalSpeech: A 2,800-Hour Slovak Speech Corpus from Parliamentary Data
Erik Božík, Marek Šuppa
arxiv.org/abs/2509.19270 arx…

@arXiv_csSD_bot@mastoxiv.page
2025-09-23 08:30:40

Idiosyncratic Versus Normative Modeling of Atypical Speech Recognition: Dysarthric Case Studies
Vishnu Raja, Adithya V Ganesan, Anand Syamkumar, Ritwik Banerjee, H Andrew Schwartz
arxiv.org/abs/2509.16718

@arXiv_csLG_bot@mastoxiv.page
2025-10-10 13:29:33

Crosslisted article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[4/5]:
- Leveraging Whisper Embeddings for Audio-based Lyrics Matching
Eleonora Mancini, Joan Serrà, Paolo Torroni, Yuki Mitsufuji

@arXiv_mathDS_bot@mastoxiv.page
2025-09-09 10:27:52

Isoperimetric-type inequalities for Mather's $\beta$-function of convex billiards
Stefano Baranzini, Misha Bialy, Alfonso Sorrentino
arxiv.org/abs/2509.06915

@arXiv_csCL_bot@mastoxiv.page
2025-08-14 09:54:32

Which one Performs Better? Wav2Vec or Whisper? Applying both in Badini Kurdish Speech to Text (BKSTT)
Renas Adnan, Hossein Hassani
arxiv.org/abs/2508.09957

@arXiv_csSD_bot@mastoxiv.page
2025-08-15 07:56:22

Whisper Smarter, not Harder: Adversarial Attack on Partial Suppression
Zheng Jie Wong, Bingquan Shen
arxiv.org/abs/2508.09994 arxiv.org/pdf…

@arXiv_eessAS_bot@mastoxiv.page
2025-10-07 10:13:42

Probing Whisper for Dysarthric Speech in Detection and Assessment
Zhengjun Yue, Devendra Kayande, Zoran Cvetkovic, Erfan Loweimi
arxiv.org/abs/2510.04219

@arXiv_csSD_bot@mastoxiv.page
2025-09-03 09:27:43

Evaluating the Effectiveness of Transformer Layers in Wav2Vec 2.0, XLS-R, and Whisper for Speaker Identification Tasks
Linus Stuhlmann, Michael Alexander Saxer
arxiv.org/abs/2509.00230

@arXiv_eessAS_bot@mastoxiv.page
2025-10-07 09:27:12

Adapting Diarization-Conditioned Whisper for End-to-End Multi-Talker Speech Recognition
Martin Kocour, Martin Karafiat, Alexander Polok, Dominik Klement, Lukáš Burget, Jan Černocký
arxiv.org/abs/2510.03723

@arXiv_csCL_bot@mastoxiv.page
2025-08-20 08:09:29

Whispering Context: Distilling Syntax and Semantics for Long Speech Transcripts
Duygu Altinok
arxiv.org/abs/2508.13376 arxiv.org/pdf/2508.1…

@michabbb@social.vivaldi.net
2025-08-14 09:30:10

🎵 #FFmpeg 8.0 integrates #Whisper for local audio transcription 🎵 #ai
🔧 Direct integration of #OpenAI

@BBC6MusicBot@mastodonapp.uk
2025-08-17 22:39:56

🇺🇦 #NowPlaying on #BBC6Music's #DreamTime
Wim Mertens:
🎵 Whisper Me
#WimMertens
open.spotify.com/track/0gBLAW3

@arXiv_csSD_bot@mastoxiv.page
2025-07-30 08:27:51

Whilter: A Whisper-based Data Filter for "In-the-Wild" Speech Corpora Using Utterance-level Multi-Task Classification
William Ravenscroft, George Close, Kit Bower-Morris, Jamie Stacey, Dmitry Sityaev, Kris Y. Hong
arxiv.org/abs/2507.21642

@arXiv_eessAS_bot@mastoxiv.page
2025-09-23 09:18:00

Audio-Conditioned Diffusion LLMs for ASR and Deliberation Processing
Mengqi Wang, Zhan Liu, Zengrui Jin, Guangzhi Sun, Chao Zhang, Philip C. Woodland
arxiv.org/abs/2509.16622

@arXiv_csCL_bot@mastoxiv.page
2025-09-15 10:00:11

WhisTLE: Deeply Supervised, Text-Only Domain Adaptation for Pretrained Speech Recognition Transformers
Akshat Pandey, Karun Kumar, Raphael Tang
arxiv.org/abs/2509.10452

@arXiv_csSD_bot@mastoxiv.page
2025-09-19 10:05:51

Towards Building Speech Large Language Models for Multitask Understanding in Low-Resource Languages
Mingchen Shao, Bingshen Mu, Chengyou Wang, Hai Li, Ying Yan, Zhonghua Fu, Lei Xie
arxiv.org/abs/2509.14804

@arXiv_eessAS_bot@mastoxiv.page
2025-08-22 07:39:10

Transsion Multilingual Speech Recognition System for MLC-SLM 2025 Challenge
Xiaoxiao Li, An Zhu, Youhai Jiang, Fengjie Zhu
arxiv.org/abs/2508.14916

@arXiv_csSD_bot@mastoxiv.page
2025-08-05 10:27:00

WhiSQA: Non-Intrusive Speech Quality Prediction Using Whisper Encoder Features
George Close, Kris Hong, Thomas Hain, Stefan Goetze
arxiv.org/abs/2508.02210

@arXiv_eessAS_bot@mastoxiv.page
2025-09-04 08:49:21

Speech Intelligibility Assessment with Uncertainty-Aware Whisper Embeddings and sLSTM
Ryandhimas E. Zezario, Dyah A. M. G. Wisnu, Hsin-Min Wang, Yu Tsao
arxiv.org/abs/2509.03013

@arXiv_csCL_bot@mastoxiv.page
2025-08-11 09:56:29

Large Language Model Data Generation for Enhanced Intent Recognition in German Speech
Theresa Pekarek Rosin, Burak Can Kaplan, Stefan Wermter
arxiv.org/abs/2508.06277

@arXiv_eessAS_bot@mastoxiv.page
2025-09-04 09:03:51

A Study on Zero-Shot Non-Intrusive Speech Intelligibility for Hearing Aids Using Large Language Models
Ryandhimas E. Zezario, Dyah A. M. G. Wisnu, Hsin-Min Wang, Yu Tsao
arxiv.org/abs/2509.03021

@kexpmusicbot@mastodonapp.uk
2025-08-05 19:55:39

🇺🇦 #NowPlaying on #KEXP's #MiddayShow
Kae Tempest:
🎵 Prayers to Whisper
#KaeTempest
open.spotify.com/track/3ZcmA9U

@arXiv_csSD_bot@mastoxiv.page
2025-10-14 11:06:48

Proficiency-Aware Adaptation and Data Augmentation for Robust L2 ASR
Ling Sun, Charlotte Zhu, Shuju Shi
arxiv.org/abs/2510.10738 arxiv.org/…

@arXiv_csCL_bot@mastoxiv.page
2025-10-07 12:05:22

A Low-Resource Speech-Driven NLP Pipeline for Sinhala Dyslexia Assistance
Peshala Perera, Deshan Sumanathilaka
arxiv.org/abs/2510.04750 arx…

@arXiv_eessAS_bot@mastoxiv.page
2025-09-10 08:03:41

Identifying and Calibrating Overconfidence in Noisy Speech Recognition
Mingyue Huo, Yuheng Zhang, Yan Tang
arxiv.org/abs/2509.07195 arxiv.o…

@arXiv_csSD_bot@mastoxiv.page
2025-08-12 10:41:33

Bridging ASR and LLMs for Dysarthric Speech Recognition: Benchmarking Self-Supervised and Generative Approaches
Ahmed Aboeitta, Ahmed Sharshar, Youssef Nafea, Shady Shehata
arxiv.org/abs/2508.08027

@arXiv_csSD_bot@mastoxiv.page
2025-09-10 09:17:01

Competitive Audio-Language Models with Data-Efficient Single-Stage Training on Public Data
Gokul Karthik Kumar, Rishabh Saraf, Ludovick Lepauloux, Abdul Muneer, Billel Mokeddem, Hakim Hacid
arxiv.org/abs/2509.07526

@arXiv_eessAS_bot@mastoxiv.page
2025-08-15 10:47:49

Crosslisted article(s) found for eess.AS. arxiv.org/list/eess.AS/new
[1/1]:
- Whisper Smarter, not Harder: Adversarial Attack on Partial Suppression
Zheng Jie Wong, Bingquan Shen

@arXiv_eessAS_bot@mastoxiv.page
2025-08-08 07:38:52

Keyword Spotting with Hyper-Matched Filters for Small Footprint Devices
Yael Segal-Feldman, Ann R. Bradlow, Matthew Goldrick, Joseph Keshet
arxiv.org/abs/2508.04857

@arXiv_eessAS_bot@mastoxiv.page
2025-08-01 08:24:41

Feature Importance across Domains for Improving Non-Intrusive Speech Intelligibility Prediction in Hearing Aids
Ryandhimas E. Zezario, Sabato M. Siniscalchi, Fei Chen, Hsin-Min Wang, Yu Tsao
arxiv.org/abs/2507.23223