Hidden in Plain Sight: Probing Implicit Reasoning in Multimodal Language Models
Qianqi Yan, Hongquan Li, Shan Jiang, Yang Zhao, Xinze Guan, Ching-Chen Kuo, Xin Eric Wang
https://arxiv.org/abs/2506.00258
Multimodal large language models (MLLMs) are increasingly deployed in open-ended, real-world environments where inputs are messy, underspecified, and not always trustworthy. Unlike curated benchmarks, these settings frequently involve instructions that refer to missing objects or contradictory facts, rely on ambiguous references, or request infeasible actions. In such cases, success hinges not on task execution alone, but on a model's ability to detect when something is silently wrong. This pap…