Tootfinder

Opt-in global Mastodon full-text search. Join the index!

No exact results. Similar results found.
@arXiv_csLG_bot@mastoxiv.page
2025-10-13 10:37:10

Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs
Yumin Choi, Dongki Kim, Jinheon Baek, Sung Ju Hwang
arxiv.org/abs/2510.09201

@arXiv_csCV_bot@mastoxiv.page
2025-10-13 10:25:00

Spotlight on Token Perception for Multimodal Reinforcement Learning
Siyuan Huang, Xiaoye Qu, Yafu Li, Yun Luo, Zefeng He, Daizong Liu, Yu Cheng
arxiv.org/abs/2510.09285

@arXiv_csCL_bot@mastoxiv.page
2025-10-13 10:37:40

Multimodal Policy Internalization for Conversational Agents
Zhenhailong Wang, Jiateng Liu, Amin Fazel, Ritesh Sarkhel, Xing Fan, Xiang Li, Chenlei Guo, Heng Ji, Ruhi Sarikaya
arxiv.org/abs/2510.09474

@arXiv_csAI_bot@mastoxiv.page
2025-10-13 09:31:10

Tiny-R1V: Lightweight Multimodal Unified Reasoning Model via Model Merging
Qixiang Yin, Huanjin Yao, Jianghao Chen, Jiaxing Huang, Zhicheng Zhao, Fei Su
arxiv.org/abs/2510.08987

@arXiv_csSD_bot@mastoxiv.page
2025-10-13 08:14:20

Evaluating Hallucinations in Multimodal LLMs with Spoken Queries under Diverse Acoustic Conditions
Hansol Park, Hoseong Ahn, Junwon Moon, Yejin Lee, Kyuhong Shim
arxiv.org/abs/2510.08581

@arXiv_statML_bot@mastoxiv.page
2025-10-13 09:11:10

Interpretable Generative and Discriminative Learning for Multimodal and Incomplete Clinical Data
Albert Belenguer-Llorens, Carlos Sevilla-Salcedo, Janaina Mourao-Miranda, Vanessa Gómez-Verdejo
arxiv.org/abs/2510.09513

@arXiv_csHC_bot@mastoxiv.page
2025-10-13 08:44:20

MLLM as a UI Judge: Benchmarking Multimodal LLMs for Predicting Human Perception of User Interfaces
Reuben A. Luera, Ryan Rossi, Franck Dernoncourt, Samyadeep Basu, Sungchul Kim, Subhojyoti Mukherjee, Puneet Mathur, Ruiyi Zhang, Jihyung Kil, Nedim Lipka, Seunghyun Yoon, Jiuxiang Gu, Zichao Wang, Cindy Xiong Bearfield, Branislav Kveton

@arXiv_csIR_bot@mastoxiv.page
2025-10-13 08:34:50

MRMR: A Realistic and Expert-Level Multidisciplinary Benchmark for Reasoning-Intensive Multimodal Retrieval
Siyue Zhang, Yuan Gao, Xiao Zhou, Yilun Zhao, Tingyu Song, Arman Cohan, Anh Tuan Luu, Chen Zhao
arxiv.org/abs/2510.09510

@arXiv_csAI_bot@mastoxiv.page
2025-10-13 09:20:30

LM Fight Arena: Benchmarking Large Multimodal Models via Game Competition
Yushuo Zheng, Zicheng Zhang, Xiongkuo Min, Huiyu Duan, Guangtao Zhai
arxiv.org/abs/2510.08928

@arXiv_csCV_bot@mastoxiv.page
2025-10-13 10:23:00

Diagnosing Shoulder Disorders Using Multimodal Large Language Models and Consumer-Grade Cameras
Jindong Hong, Wencheng Zhang, Shiqin Qiao, Jianhai Chen, Jianing Qiu, Chuanyang Zheng, Qian Xu, Yun Ji, Qianyue Wen, Weiwei Sun, Hao Li, Huizhen Li, Huichao Wang, Kai Wu, Meng Li, Yijun He, Lingjie Luo, Jiankai Sun
arxiv.org/abs/2510.092…