
2025-05-29 07:19:54
Mitigating Audiovisual Mismatch in Visual-Guide Audio Captioning
Le Xu, Chenxing Li, Yong Ren, Yujie Chen, Yu Gu, Ruibo Fu, Shan Yang, Dong Yu
https://arxiv.org/abs/2505.22045
Pixel-wise Modulated Dice Loss for Medical Image Segmentation
Seyed Mohsen Hosseini
https://arxiv.org/abs/2506.15744
Low-resource keyword spotting using contrastively trained transformer acoustic word embeddings
Julian Herreilers, Christiaan Jacobs, Thomas Niesler
https://arxiv.org/abs/2506.17690
Single-cell metabolic flux analysis reveals coexisting optimal sub-groups, cross-feeding, and mixotrophy in a cyanobacterial population
Arián Ferrero-Fernández, Paula Prondzinsky, Lucia Gastoldi, David A. Fike, Harrison B. Smith, Daniele De Martino, Andrea De Martino, Shawn Erin McGlynn
https://arxiv.org/abs/2506.059…
Robust Unsupervised Adaptation of a Speech Recogniser Using Entropy Minimisation and Speaker Codes
Rogier C. van Dalen, Shucong Zhang, Titouan Parcollet, Sourav Bhattacharya
https://arxiv.org/abs/2506.10653