Interesting explanation of LLM training frameworks and the incentives for confident guessing.
"The authors examined ten major AI benchmarks, including those used by Google, OpenAI and also the top leaderboards that rank AI models. This revealed that nine benchmarks use binary grading systems that award zero points for AIs expressing uncertainty.
" ... When an AI system says “I don’t know”, it receives the same score as giving completely wrong information. The optimal strategy under such evaluation becomes clear: always guess. ...
"More sophisticated approaches like active learning, where AI systems ask clarifying questions to reduce uncertainty, can improve accuracy but further multiply computational requirements. ...
"Users want systems that provide confident answers to any question. Evaluation benchmarks reward systems that guess rather than express uncertainty. Computational costs favour fast, overconfident responses over slow, uncertain ones."
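A quick sketch of why "always guess" follows from binary grading. This is my own illustration, not from the article: the function names and the `wrong_penalty` parameter are hypothetical, and the penalty variant is just one possible alternative scoring rule, not something the quoted benchmarks are said to use.

```python
# Binary grading: correct = 1 point, wrong = 0, "I don't know" = 0.
# Assumption (not from the article): wrong_penalty lets us also model a
# grader that subtracts points for wrong answers.

ABSTAIN_SCORE = 0.0  # "I don't know" always scores zero


def expected_score(p_correct: float, wrong_penalty: float = 0.0) -> float:
    """Expected score of answering when the answer is right with probability p_correct."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def best_action(p_correct: float, wrong_penalty: float = 0.0) -> str:
    """Pick whichever action has the higher expected score."""
    if expected_score(p_correct, wrong_penalty) > ABSTAIN_SCORE:
        return "guess"
    return "abstain"


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(p, best_action(p), best_action(p, wrong_penalty=1.0))
```

With no penalty, guessing beats abstaining whenever the chance of being right is above zero, however small. Only once wrong answers cost something does "I don't know" become the better choice for low-confidence questions.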
=
My comment: "Fast, overconfident responses" sounds a bit similar to "bullshit", does it not?
#ChatGPT #LLMs #SoCalledAI
Towards Understanding Visual Grounding in Visual Language Models
Georgios Pantazopoulos, Eda B. Özyiğit
https://arxiv.org/abs/2509.10345 https://
Toposes with enough points as categories of étale spaces
Sam van Gool, Jérémie Marquès, Umberto Tarantino
https://arxiv.org/abs/2508.09604 https://
Complications from surgery landed my dad back in the hospital late Wednesday. Successful exploratory surgery found and hopefully stopped the internal bleeding.
If a few key numbers remain stable, he might get released today. But in the interim, it’ll have to be a Father’s Day video call to his hospital room.
There are myriad ways in which to view this, but I’m choosing gratefulness. #HappyFathersDay
TFRank: Think-Free Reasoning Enables Practical Pointwise LLM Ranking
Yongqi Fan, Xiaoyang Chen, Dezhi Ye, Jie Liu, Haijin Liang, Jin Ma, Ben He, Yingfei Sun, Tong Ruan
https://arxiv.org/abs/2508.09539 …
The thermal backreaction of a scalar field in de Sitter spacetime
Nikos Irges, Antonis Kalogirou, Fotis Koutroulis
https://arxiv.org/abs/2507.08774 https:/…
Sampling theorems for inverse problems on Riemannian manifolds
Giovanni S. Alberti, Ernesto De Vito, Bianca Gariboldi, Giacomo Gigante
https://arxiv.org/abs/2508.10810 https://
Pointwise explicit estimates for derivatives of solutions to linear parabolic PDEs with Neumann boundary conditions
C Ciccarella
https://arxiv.org/abs/2507.08622
Axis-level Symmetry Detection with Group-Equivariant Representation
Wongyun Yu, Ahyun Seo, Minsu Cho
https://arxiv.org/abs/2508.10740 https://arxiv.org/pdf…