Evaluation Awareness Scales Predictably in Open-Weights Large Language Models
Maheep Chaudhary, Ian Su, Nikhil Hooda, Nishith Shankar, Julia Tan, Kevin Zhu, Ashwinee Panda, Ryan Lagasse, Vasu Sharma
https://arxiv.org/abs/2509.13333
Enhancing Cross-task Transfer of Large Language Models via Activation Steering
Xinyu Tang, Zhihao Lv, Xiaoxue Cheng, Junyi Li, Wayne Xin Zhao, Zujie Wen, Zhiqiang Zhang, Jun Zhou
https://arxiv.org/abs/2507.13236
Integer division makes more sense in a statically typed language like C, but it is uncommon in scripting languages. (In Lua, JavaScript, and Python 3, the / operator does float division.)
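A minimal sketch of the difference, assuming Python 3 semantics. The helper c_style_div is a hypothetical name introduced only for illustration; it mimics C's integer division, which truncates toward zero, whereas Python's // operator floors toward negative infinity.

```python
def c_style_div(a, b):
    """Hypothetical helper: integer division that truncates toward zero, as C does."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

true_quotient = 7 / 2            # float division, even with two int operands
floor_quotient = 7 // 2          # explicit floor division
neg_floor = -7 // 2              # floors toward negative infinity
neg_trunc = c_style_div(-7, 2)   # truncates toward zero, like C

print(true_quotient, floor_quotient, neg_floor, neg_trunc)  # 3.5 3 -4 -3
```

The two integer results only differ when the operands have opposite signs, which is one reason a scripting language may prefer to keep / as float division and make integer division an explicit, separate operator.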
Universal Jailbreak Suffixes Are Strong Attention Hijackers
Matan Ben-Tov, Mor Geva, Mahmood Sharif
https://arxiv.org/abs/2506.12880
What's in the Box? Reasoning about Unseen Objects from Multimodal Cues
Lance Ying, Daniel Xu, Alicia Zhang, Katherine M. Collins, Max H. Siegel, Joshua B. Tenenbaum
https://arxiv.org/abs/2506.14212
MME-Emotion: A Holistic Evaluation Benchmark for Emotional Intelligence in Multimodal Large Language Models
Fan Zhang, Zebang Cheng, Chong Deng, Haoxuan Li, Zheng Lian, Qian Chen, Huadai Liu, Wen Wang, Yi-Fan Zhang, Renrui Zhang, Ziyu Guo, Zhihong Zhu, Hao Wu, Haixin Wang, Yefeng Zheng, Xiaojiang Peng, Xian Wu, Kun Wang, Xiangang Li, Jieping Ye, Pheng-Ann Heng
…
Learning the Topic, Not the Language: How LLMs Classify Online Immigration Discourse Across Languages
Andrea Nasuto, Stefano Maria Iacus, Francisco Rowe, Devika Jain
https://arxiv.org/abs/2508.06435
Machine Mirages: Defining the Undefined
Hamidou Tembine
https://arxiv.org/abs/2506.13990 https://arxiv.org/pdf/2506.13990
Learning to Detect Unknown Jailbreak Attacks in Large Vision-Language Models: A Unified and Accurate Approach
Shuang Liang, Zhihao Xu, Jialing Tao, Hui Xue, Xiting Wang
https://arxiv.org/abs/2508.09201
VLM-3D: End-to-End Vision-Language Models for Open-World 3D Perception
Fuhao Chang, Shuxin Li, Yabei Li, Lei He
https://arxiv.org/abs/2508.09061 https://arx…