Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[3/5]:
- SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial...
Ziyang Gong, Wenhao Li, Oliver Ma, Songyuan Li, Jiayi Ji, Xue Yang, Gen Luo, Junchi Yan, Rongrong Ji
Large Language Models Show Signs of Alignment with Human Neurocognition During Abstract Reasoning
Christopher Pinier, Sonia Acuña Vargas, Mariia Steeghs-Turchina, Dora Matzke, Claire E. Stevenson, Michael D. Nunez
https://arxiv.org/abs/2508.10057
Superstudent intelligence in thermodynamics
Rebecca Loubet, Pascal Zittlau, Marco Hoffmann, Luisa Vollmer, Sophie Fellenz, Heike Leitte, Fabian Jirasek, Johannes Lenhard, Hans Hasse
https://arxiv.org/abs/2506.09822
Outsmarting Linear Neural Networks via an Incoherent Light-Driven Optical Extreme Learner with Data Reverberation
Bofeng Liu, Xu Mei, Sadman Shafi, Tunan Xia, Iam-Choon Khoo, Zhiwen Liu, Xingjie Ni
https://arxiv.org/abs/2508.08428
Beyond Surface-Level Detection: Towards Cognitive-Driven Defense Against Jailbreak Attacks via Meta-Operations Reasoning
Rui Pu, Chaozhuo Li, Rui Ha, Litian Zhang, Lirong Qiu, Xi Zhang
https://arxiv.org/abs/2508.03054
Hidden in the Noise: Unveiling Backdoors in Audio LLMs Alignment through Latent Acoustic Pattern Triggers
Liang Lin, Miao Yu, Kaiwen Luo, Yibo Zhang, Lilan Peng, Dexian Wang, Xuehai Tang, Yuanhe Zhang, Xikang Yang, Zhenhong Zhou, Kun Wang, Yang Liu
https://arxiv.org/abs/2508.02175
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[3/5]:
- CountingFruit: Language-Guided 3D Fruit Counting with Semantic Gaussian Splatting
Fengze Li, Yangle Liu, Jieming Ma, Hai-Ning Liang, Yaochun Shen, Huangxiang Li, Zhijing Wu
Harnessing Patterns to Support the Development of Hybrid Quantum Applications
Daniel Vietz, Martin Beisel, Johanna Barzen, Frank Leymann, Lavinia Stiliadou, Benjamin Weder
https://arxiv.org/abs/2507.00696
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[10/10]:
- Judging the Judges: Can Large Vision-Language Models Fairly Evaluate Chart Comprehension and Reas...
Laskar, Islam, Mahbub, Masry, Rahman, Bhuiyan, Nayeem, Joty, Hoque, Huang
Bhatt Conjectures: On Necessary-But-Not-Sufficient Benchmark Tautology for Human Like Reasoning
Manish Bhatt
https://arxiv.org/abs/2506.11423
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[6/10]:
- CountingFruit: Language-Guided 3D Fruit Counting with Semantic Gaussian Splatting
Fengze Li, Yangle Liu, Jieming Ma, Hai-Ning Liang, Yaochun Shen, Huangxiang Li, Zhijing Wu
Lost in Translation? Converting RegExes for Log Parsing into Dynatrace Pattern Language
Julian Fragner, Christian Macho, Bernhard Dieber, Martin Pinzger
https://arxiv.org/abs/2506.19539
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[3/4]:
- Perceiving Beyond Language Priors: Enhancing Visual Comprehension and Attention in Multimodal Models
Aarti Ghatkesar, Ganesh Venkatesh
Multimodal Behavioral Patterns Analysis with Eye-Tracking and LLM-Based Reasoning
Dongyang Guo, Yasmeen Abdrabou, Enkeleda Thaqi, Enkelejda Kasneci
https://arxiv.org/abs/2507.18252
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[2/4]:
- The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Info...
Li, Shi, Gao, Liu, Wang, Chen, Liu, Zhao, Wang, Metaxas
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[3/9]:
- PriorDiffusion: Leverage Language Prior in Diffusion Models for Monocular Depth Estimation
Zeng, Ni, Wang, Rim, Chung, Yang, Hong, Wong
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[4/9]:
- "Principal Components" Enable A New Language of Images
Xin Wen, Bingchen Zhao, Ismail Elezi, Jiankang Deng, Xiaojuan Qi
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[4/5]:
- VRU-Accident: A Vision-Language Benchmark for Video Question Answering and Dense Captioning for A...
Younggun Kim, Ahmed S. Abdelrahman, Mohamed Abdel-Aty
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[2/5]:
- RadAlign: Advancing Radiology Report Generation with Vision-Language Concept Alignment
Difei Gu, Yunhe Gao, Yang Zhou, Mu Zhou, Dimitris Metaxas
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[3/7]:
- Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Spee...
Jeong Hun Yeo, Minsu Kim, Chae Won Kim, Stavros Petridis, Yong Man Ro
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[1/4]:
- Bi-VLDoc: Bidirectional Vision-Language Modeling for Visually-Rich Document Understanding
Chuwei Luo, Guozhi Tang, Qi Zheng, Cong Yao, Lianwen Jin, Chenliang Li, Yang Xue, Luo Si
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[5/5]:
- Imagine, Verify, Execute: Memory-Guided Agentic Exploration with Vision-Language Models
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[5/7]:
- Calibrated and Robust Foundation Models for Vision-Language and Medical Image Tasks Under Distrib...
Behraj Khan, Tahir Qasim Syed, Nouman M. Durrani, Bilal Naseem, Shabir Ahmad, Rizwan Qureshi
…