Interesting explanation of LLM training frameworks and the incentives for confident guessing.
"The authors examined ten major AI benchmarks, including those used by Google, OpenAI and also the top leaderboards that rank AI models. This revealed that nine benchmarks use binary grading systems that award zero points for AIs expressing uncertainty.
" ... When an AI system says “I don’t know”, it receives the same score as giving completely wrong information. The optimal strategy under such evaluation becomes clear: always guess. ...
"More sophisticated approaches like active learning, where AI systems ask clarifying questions to reduce uncertainty, can improve accuracy but further multiply computational requirements. ...
"Users want systems that provide confident answers to any question. Evaluation benchmarks reward systems that guess rather than express uncertainty. Computational costs favour fast, overconfident responses over slow, uncertain ones."
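The incentive described in the quoted passage can be sketched with a little arithmetic. This is only an illustration: the function names and the `wrong_penalty` parameter are hypothetical, and the penalized variant is an assumed contrast case, not something the quoted article specifies.

```python
# Sketch of the scoring incentive: binary grading gives 1 point for a correct
# answer and 0 for both a wrong answer and "I don't know".
# `wrong_penalty` is a hypothetical knob for a contrasting scheme that
# subtracts points for wrong answers; it is an assumption, not from the source.

ABSTAIN_SCORE = 0.0  # "I don't know" earns nothing in either scheme


def expected_guess_score(p_correct: float, wrong_penalty: float = 0.0) -> float:
    """Expected score for answering: +1 if right, -wrong_penalty if wrong."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def best_strategy(p_correct: float, wrong_penalty: float = 0.0) -> str:
    """Pick whichever option has the higher expected score."""
    if expected_guess_score(p_correct, wrong_penalty) > ABSTAIN_SCORE:
        return "guess"
    return "abstain"


if __name__ == "__main__":
    for p in (0.05, 0.25, 0.75):
        print(p, best_strategy(p), best_strategy(p, wrong_penalty=1.0))
```

With binary grading (`wrong_penalty=0.0`), guessing beats abstaining for any nonzero chance of being right, which is the "always guess" outcome the quote describes. Abstaining only becomes worthwhile once wrong answers cost something.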
=
My comment: "Fast, overconfident responses" sounds a bit similar to "bullshit", does it not?
#ChatGPT #LLMs #SoCalledAI
Some more of the #News items shared most often here recently:
German-Austrian attack on Meta's business model
❝I have noticed that we people privileged by supremacy have a tendency to take this same stance toward newly aware people, a stance which is not ours to assume. We seem to feel that it is our business to meet people who are in the same place that we were just a few short years or decades ago, and meet their shock and surprise and anger and dismay with a skepticism and an impatience we haven't earned.
We say things like "are you surprised?"
We say things like "why does this shock you?"
We say "oh so you're only angry now?"
We say things like "where have you been?"
…
Instead of asking “are you surprised?” say “I was surprised once, too; here's what I know.” Instead of “what took you so long?” say “I just got here recently; here's what I've learned.”❞ https://mastodon.social/@JuliusGoat/114847074342312772
#Wordle 1,516 5/6*
⬜⬜⬜🟩⬜ <1% of 205,256 (61)
⬜🟨⬜🟩⬜ 12% of 231 (12)
🟨🟩⬜🟩⬜ 11% of 28 (3)
⬜🟩🟨🟩⬜ 17% of 594 (1)
🟩🟩🟩🟩🟩 78% of 15,911
WordleBot
Skill 92/99
Luck 29/99
Well, now that I see the answer I understand why I had so much trouble finding it!
It's the #DayOfHelios / Sol's Day / #Sunday! ☀️
"Against Phineus once on a time was the Titan Phaethon [Helios] angered, wroth for the victory of [Phineus] the prophet of Phoibos [Apollon], and robbed him of his sight and sent the shameless Harpies, a winged race to dwell with …
from my link log —
TLD domain name renewal grace periods.
https://www.namecheap.com/support/knowledgebase/article.aspx/9916/2207/tlds-grace-periods/
saved 2025-09-09
Porsche holding company invests money in military startups
Russia's war of aggression worries the Porsche–Piëch family. By funding military technology startups, it wants to help defend democracy and freedom.
https…