
2025-09-14 09:09:54
Interesting explanation of LLM training frameworks and the incentives for confident guessing.
"The authors examined ten major AI benchmarks, including those used by Google, OpenAI and also the top leaderboards that rank AI models. This revealed that nine benchmarks use binary grading systems that award zero points for AIs expressing uncertainty.
" ... When an AI system says “I don’t know”, it receives the same score as giving completely wrong information. The optimal strategy under such evaluation becomes clear: always guess. ...
"More sophisticated approaches like active learning, where AI systems ask clarifying questions to reduce uncertainty, can improve accuracy but further multiply computational requirements. ...
"Users want systems that provide confident answers to any question. Evaluation benchmarks reward systems that guess rather than express uncertainty. Computational costs favour fast, overconfident responses over slow, uncertain ones."
=
My comment: "Fast, overconfident responses" sounds a bit similar to "bullshit", does it not?
#ChatGPT #LLMs #SoCalledAI