Urban Demons IV 👻
📷 Zeiss IKON Super Ikonta 533/16
🎞️ Ilford HP5 Plus 400, expired 1993
If you like my work, buy me a coffee from PayPal https://www.paypal.com/paypalme/ydcdingsite
In New York social circles, he was known as the “Jewish James Bond”:
a refugee from Nazi Germany whose gratitude to his American hosts was such that he volunteered to join the US army and became the CIA’s first station chief in Berlin as a mere twentysomething,
filing early warnings about Soviet activity that have been credited with ringing in the cold war.
Like 007, Peter Sichel also appreciated a fine tipple, and after leaving the US foreign intelligence service it was h…
Filing: Anthropic says it cannot manipulate Claude once the military has deployed it, denying DOD accusations that Anthropic could tamper with models during war (Paresh Dave/Wired)
https://www.wired.com/story/anthropic-denies-sabotage-ai-tools-war-claude/
Is building an LLM inherently problematic? Not necessarily, but there's no good way to do it under capitalism. Is using a local LLM funding these evil companies? No. It's not.
Spelling and grammar checking is one of the few uses of LLMs that is not based on fundamentally failing to understand what an LLM actually is. A statistical model is gonna be *really good* at flagging things that are probably typos (low probability areas). There will be false positives, which is fine if you're actually paying attention...
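As an illustration of that low-probability-flagging idea, here's a toy sketch. Everything in it is made up for demonstration: the tiny corpus, the function names, and the character-bigram scoring (a real checker would use a much larger model over tokens, not characters). The point is just that a statistical model scores plausible strings higher than keyboard-mash typos.

```python
from collections import Counter

# Hypothetical tiny "training corpus" -- a real model would use far more text.
corpus = ["the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog",
          "checking", "spelling", "grammar", "model", "probability"]

# Count character bigrams, with ^ and $ marking word boundaries.
bigrams = Counter()
for word in corpus:
    padded = f"^{word}$"
    for a, b in zip(padded, padded[1:]):
        bigrams[a + b] += 1
total = sum(bigrams.values())

def score(word):
    """Average bigram probability; rare transitions drag the score down."""
    padded = f"^{word}$"
    pairs = list(zip(padded, padded[1:]))
    return sum(bigrams[a + b] / total for a, b in pairs) / len(pairs)

# A plausible word lands in higher-probability regions than a garbled one.
print(score("checker") > score("chxqker"))  # True
```

Low-scoring spans are the "probably a typo" flags; some of them will be false positives, which is exactly the trade-off described above.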
Oh wow, I feel quite sore today. Maybe I didn't do "enough" during winter to keep my muscles in shape.
But it was worth it! (Pictures and video are yet to come). After the snow has mostly melted, it's time again to clean up all kinds of litter.
I once heard a metallic *pling* and knew: new microspikes needed, now. But since the metal plate broke, and not just a chain, I hope the shop might replace it. I mean, broken metal after just ~1 year? Keep fi…
So to follow up on this, I've caught it in action. Models, when quantized a bit, just do a bit more poorly with short contexts. Even going from f32 (as trained) to bf16 (as usually run) to q8, a model tends to do okay for "normal" context windows. At q4 you start feeling like "this model is a little stupid and gets stuck sometimes" (it is! It's just that it's still mostly careening about in the space of "plausible" most of the time. Not good guesswork, but still in the zone). With long contexts, the probability of parameters collapsing to zero is higher, so the more context, the more likely you are to see brokenness.
And then at Q2 (2 bits per parameter) or Q1, the model falls apart completely. Parameters collapse to zero easily. You start seeing "all work and no play makes Jack a dull boy" sorts of behavior, with intense and unscrutinized repetition, followed by a hard stop when it just stops working.
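A rough numpy sketch of why low-bit quantization zeroes out parameters. This is an illustrative round-to-nearest symmetric scheme, not any vendor's actual quantization format: the fewer bits you have, the coarser the grid, and everything smaller than half a grid step rounds to exactly zero.

```python
import numpy as np

def quantize(weights, bits):
    """Symmetric round-to-nearest quantization: snap floats onto a grid of
    2**(bits-1) - 1 levels per sign, then map back to floats."""
    levels = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(weights)) / levels
    return np.round(weights / scale) * scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=100_000)  # small-magnitude weights, as is typical

for bits in (8, 4, 2):
    zeroed = np.mean(quantize(w, bits) == 0.0)
    print(f"q{bits}: {zeroed:.1%} of weights rounded to exactly zero")
```

At 8 bits only a sliver of the smallest weights vanish; at 2 bits (a single level per sign) almost the entire distribution collapses to zero, which is the "falls apart completely" regime.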
And quantization is a parameter that a model vendor can turn relatively easily. (they have to regenerate the model from the base with more quantization, but it's a data transformation on the order of running a terabyte through a straightforward and fast process, not like training).
If you have 1000 customers and enough equipment to handle the requests of 700, going from bf16 to q8 is a no-brainer. Suddenly you can handle the load and have a little spare capacity. They get worse results, probably pay the same per token (or they're on a subscription that hides the cost anyway so you are even freer to make trade-offs. There's a reason that subscription products are kinda poorly described.)
They can also vary this across the day: use the model during quieter periods? Maybe you get an instance running at bf16. Use it during a high-traffic period? You get a Q4 model.
Or intelligent routing is possible. No idea if anyone is doing this, but if they monitor what you send a bit, and you generally shoot for an expensive model for simple requests? They could totally substitute a highly quantized version of the model to answer the question.
There are *so many tricks* that can be pulled here. Some of them are very reasonable trade-offs to make, some tread into outright misleading or fraudulent territory, and it's weirdly hard to draw the line between them.
Weird feeling: while looking for an answer to the question "how do I get the coordinates of the minimal value of a numpy.ndarray?", I found that I had already asked that question ten years ago: https://laurentperrinet.github.io/sciblog/p…
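For the record, the standard numpy idiom (not necessarily the one in the linked post): `argmin` gives a flat index into the array, and `unravel_index` converts it back into per-axis coordinates.

```python
import numpy as np

a = np.array([[3, 1, 4],
              [1, 5, 9],
              [2, 6, 0]])

# argmin returns the index into the flattened array;
# unravel_index maps it back to (row, col) coordinates.
coords = np.unravel_index(np.argmin(a), a.shape)
print(coords)  # (2, 2)
```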
Source: Anthropic is preparing to release Claude Opus 4.7, along with a new AI-powered tool for designing websites and presentations, as soon as this week (Stephanie Palazzolo/The Information)
https://www.theinformation.com/briefings/exclusive…
Alibaba's new Token Hub unit releases Happy Oyster, a new AI world model that can create 3D environments, interactive videos, films, video content, and games (Luz Ding/Bloomberg)
https://www.bloomberg.com/news/articles/2026-04-16/al…