"Kremlin Hotline: Hungary colluded with Russia to delist sanctioned oligarchs, companies and banks"
https://vsquare.org/kremlin-hotline-hungary-colluded-with-russia-to-delist-sanctioned-oligarchs-companies-and-bank…
“In this work, we conduct a large-scale simulation of how users might delegate work to LLMs across 52 professional domains. We find that current LLMs are unreliable delegates: even frontier models corrupt an average of 25% of document content over long workflows, with sparse but severe errors that silently compound over time.”
Good to see the issue addressed explicitly, even though the results aren’t surprising—why would anyone expect LLMs to be reliable!?
LLMs Corrupt Your Documents When You Delegate
"Delegation requires trust - the expectation that the LLM will faithfully execute the task without introducing errors into documents. We introduce DELEGATE-52 to study the readiness of AI systems in delegated workflows. DELEGATE-52 simulates long delegated workflows that require in-depth document editing across 52 professional domains, such as coding, crystallography, and music notation. Our large-scale experiment with 19 LLMs reveals …
Israel launched a daylight attack Saturday on Iran’s capital,
with a cloud of smoke rising from the city’s downtown.
The apparent strike happened near the offices of Supreme Leader Ayatollah Ali Khamenei.
It wasn’t immediately clear whether the 86-year-old Khamenei had been in his offices at the time.
He hasn’t been seen publicly in days as tensions with the United States have grown.
But the attack comes as the United States has assembled a vast fleet of fighter …
The Rufián effect.
Podemos opens up to agreements in País Valencià, Murcia and the Canary Islands, and Rufián keeps setting the agenda
https://www.elsaltodiario.com/partidos-politicos/gabriel-rufian-emilio-delgado-sumar-yolanda-diaz-madrid
I'm still not used to this:
If the temperature during the day has been like 25C, the sun has no business setting before 9pm. I'm robbed of like 2 hours of evening daylight on a day that is made for gardening!
Large Language Models (LLMs) are poised to disrupt knowledge work,
with the emergence of delegated work as a new interaction paradigm
(e.g., vibe coding).
Delegation requires trust - the expectation that the LLM will faithfully execute the task without introducing errors into documents.
Our large-scale experiment with 19 LLMs reveals that current models degrade documents during delegation:
even frontier models (Gemini 3.1 Pro, Claude 4.6 Opus, GPT 5.4) c…