Is #AI really just dumb statistics? "Olympiad-level physics problem-solving presents a significant challenge for both humans and artificial intelligence (AI), as it requires a sophisticated integration of precise calculation, abstract reasoning, and a fundamental grasp of physical principles," says the (abstract of the) paper https://arxiv.org/abs/2511.10515: "The Chinese Physics Olympiad (CPhO), renowned for its complexity and depth, serves as an ideal and rigorous testbed for these advanced capabilities. In this paper, we introduce LOCA-R (LOgical Chain Augmentation for Reasoning), an improved version of the LOCA framework adapted for complex reasoning, and apply it to the CPhO 2025 theory examination. LOCA-R achieves a near-perfect score of 313 out of 320 points, solidly surpassing the highest-scoring human competitor and significantly outperforming all baseline methods." Oops ...?
Explicit Reasoning Makes Better Judges: A Systematic Study on Accuracy, Efficiency, and Robustness
Pratik Jayarao, Himanshu Gupta, Neeraj Varshney, Chaitanya Dwivedi
https://arxiv.org/abs/2509.13332
Towards Inference-time Scaling for Continuous Space Reasoning
Minghan Wang, Thuy-Trang Vu, Ehsan Shareghi, Gholamreza Haffari
https://arxiv.org/abs/2510.12167
CTIArena: Benchmarking LLM Knowledge and Reasoning Across Heterogeneous Cyber Threat Intelligence
Yutong Cheng, Yang Liu, Changze Li, Dawn Song, Peng Gao
https://arxiv.org/abs/2510.11974
Beyond Seeing: Evaluating Multimodal LLMs on Tool-Enabled Image Perception, Transformation, and Reasoning
Xingang Guo, Utkarsh Tyagi, Advait Gosai, Paula Vergara, Ernesto Gabriel Hernández Montoya, Chen Bo Calvin Zhang, Bin Hu, Yunzhong He, Bing Liu, Rakshith Sharma Srinivasa
https://arxiv.org/abs/2510.12712
Reasoning Pattern Matters: Learning to Reason without Human Rationales
Chaoxu Pang, Yixuan Cao, Ping Luo
https://arxiv.org/abs/2510.12643 https://arxiv.org…
Multi-Agent Debate for LLM Judges with Adaptive Stability Detection
Tianyu Hu, Zhen Tan, Song Wang, Huaizhi Qu, Tianlong Chen
https://arxiv.org/abs/2510.12697
PRoH: Dynamic Planning and Reasoning over Knowledge Hypergraphs for Retrieval-Augmented Generation
Xiangjun Zai, Xingyu Tan, Xiaoyang Wang, Qing Liu, Xiwei Xu, Wenjie Zhang
https://arxiv.org/abs/2510.12434
Information-Preserving Reformulation of Reasoning Traces for Antidistillation
Jiayu Ding, Lei Cui, Li Dong, Nanning Zheng, Furu Wei
https://arxiv.org/abs/2510.11545