Today in the drizzle I stopped by the Living Library, a beautiful native plant garden near SF's Balboa Park at the San Jose/Seneca intersection. Highly recommend a visit to see some #ceanothus in bloom and California poppies and island mallows beginning to, and much more.
""She's a Gavin fan-girl and she doesn't crush on many people," one former Pelosi aide said. "I will say this: She's hardly ever wrong. When she says she sees something, it's a real thing.""
Pelosi's new campaign: Boost Newsom for 2028
https://www.axios.com/2026/02/15/pelosis-new-campaign-boost-newsom-for-2028
So to follow up on this, I've caught it in action. Lightly quantized models just do a bit more poorly at short contexts. Going from f32 (as trained) to bf16 (as usually run) to q8, things tend to be okay for "normal" context windows. At q4 you start feeling like "this model is a little stupid and gets stuck sometimes" (it is! It's just that it's still mostly careening about in the space of the plausible. Not good guesswork, but still in the zone). With long contexts, the probability of parameters collapsing to zero is higher, so the longer the context, the more likely you are to see brokenness.
And then at q2 (2 bits per parameter) or q1, the model falls apart completely. Parameters collapse to zero easily. You start seeing "all work and no play makes Jack a dull boy" sorts of behavior, with intense and unscrutinized repetition, followed by a hard stop when it just stops working.
And quantization is a knob that a model vendor can turn relatively easily. (They have to regenerate the quantized model from the base weights with more quantization, but that's a data transformation on the order of running a terabyte through a straightforward, fast process, not like training.)
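To make the "parameters collapse to zero" failure mode concrete, here's a minimal sketch, assuming the simplest possible scheme (symmetric round-to-nearest quantization of a weight vector; real serving stacks use per-block scales and fancier methods, and nothing here is any vendor's actual pipeline). At 8 bits almost nothing rounds to zero; at 2 bits most small weights snap to zero:

```python
import numpy as np

def quantize(weights: np.ndarray, bits: int) -> np.ndarray:
    """Quantize to signed integers with `bits` bits, then dequantize back to float."""
    qmax = 2 ** (bits - 1) - 1               # e.g. 127 for 8 bits, just 1 for 2 bits
    scale = np.abs(weights).max() / qmax     # one scale for the whole tensor (simplification)
    q = np.clip(np.round(weights / scale), -qmax, qmax)
    return q * scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=10_000)         # weights with typical small magnitudes
for bits in (8, 4, 2):
    wq = quantize(w, bits)
    zeros = np.mean(wq == 0)                 # fraction of weights that collapsed to zero
    err = np.abs(w - wq).mean()
    print(f"q{bits}: {zeros:.0%} zeroed, mean abs error {err:.5f}")
```

With these toy numbers, q8 zeroes out almost no weights while q2 zeroes out the vast majority, which is the "falls apart completely" regime described above.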
If you have 1,000 customers and enough equipment to handle the requests of 700, going from bf16 to q8 is a no-brainer: suddenly you can handle the load and have a little spare capacity. Customers get worse results but probably pay the same per token (or they're on a subscription that hides the cost anyway, so you're even freer to make trade-offs. There's a reason subscription products are kinda poorly described.)
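The capacity arithmetic behind that no-brainer, assuming serving is memory-bound so capacity scales roughly inversely with bytes per parameter (a simplification; compute and KV-cache matter too):

```python
# Back-of-the-envelope: bf16 stores 2 bytes per parameter, q8 stores 1, q4 stores 0.5.
BYTES_PER_PARAM = {"bf16": 2.0, "q8": 1.0, "q4": 0.5}

def capacity(base_customers: float, base_fmt: str, new_fmt: str) -> float:
    """Roughly how many customers the same hardware serves after requantizing."""
    return base_customers * BYTES_PER_PARAM[base_fmt] / BYTES_PER_PARAM[new_fmt]

print(capacity(700, "bf16", "q8"))   # 1400.0 -- the 1,000 customers now fit with room to spare
print(capacity(700, "bf16", "q4"))   # 2800.0
```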
It's also possible for them to vary this across the day. Use the model during quieter periods? Maybe you get an instance running bf16. Use it during a high-traffic period? You get a q4 model.
Or intelligent routing is possible. No idea if anyone is doing this, but if they monitor what you send a bit, and you generally reach for an expensive model even for simple requests? They could totally substitute a highly quantized version of that model to answer the question.
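A sketch of what such a gateway could look like. Everything here is hypothetical (the variant names, the load thresholds, the "looks simple" signal are all invented for illustration; the post explicitly says it's unknown whether anyone does this):

```python
from dataclasses import dataclass

@dataclass
class Variant:
    name: str      # hypothetical deployment names
    quality: int   # higher = less quantized, better answers

VARIANTS = [Variant("model-bf16", 3), Variant("model-q8", 2), Variant("model-q4", 1)]

def pick_variant(load: float, looks_simple: bool) -> Variant:
    """Return the cheapest variant the gateway thinks it can get away with.

    `load` is current cluster utilization in [0, 1]; `looks_simple` is some
    classifier's guess that the request doesn't need the full model.
    """
    if looks_simple or load > 0.9:   # peak hours, or a trivial-looking prompt
        return VARIANTS[-1]          # heavily quantized q4
    if load > 0.7:
        return VARIANTS[1]           # q8
    return VARIANTS[0]               # full bf16 during quiet periods

print(pick_variant(0.5, False).name)   # model-bf16
print(pick_variant(0.95, False).name)  # model-q4
```

The customer-facing model name never changes in this sketch, which is exactly why the practice would be hard to detect from the outside.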
There are *so many tricks* that can be pulled here. Some of them are very reasonable trade-offs, some tread into outright misleading or fraudulent territory, and it's weirdly hard to draw the line between them.
Democratic Rep. Eric Swalwell’s abrupt exit from the race for California governor — then his announcement he would leave Congress — left his rivals scrambling to lock down his former supporters in a crowded contest with no clear leader, injecting more turmoil into the campaign to lead the nation’s most populous state.
Swalwell’s decision to suspend his campaign Sunday followed allegations that he sexually assaulted a woman twice, including when she worked for him…
ByteDance launches its Seedance 2.0 video model to enterprise clients in 100 countries, excluding the US amid legal disputes, after a February launch in China (Juro Osawa/The Information)
https://www.theinformation.com/briefings/bytedance-laun…
David Cronenberg, master of body horror movies, was born today (March 15) in 1943 (during World War II!)
I saw his films 'Crash' and 'Naked Lunch' early, around the same time, in 1997 or 1998. Since then I've seen many more of his films; I just counted, and I've seen 13 of his 23, and I've loved all of them. I'm a huge fan. Shivers was my favorite for a while; I'm not sure what it is now. Gotta see those final 10!
Happy birthday, David. We love …
Samsung quietly increases US prices of the Galaxy S25 Edge, S25 FE, Z Flip 7, Tab S11, Tab S11 Ultra, and more; the 1TB Galaxy Tab S11 Ultra jumps by $280 (Adrian Diaconescu/PhoneArena)
https://www.phonearena.com/news/samsung-us-p…
Unitree's humanoid H1 robot reaches 10.1 m/s in the 100 m sprint
Unitree's H1 robot can run at up to 10.1 m/s. That is a top figure among humanoid robots.
https://www.
Iran’s missile arsenal is still largely intact, according to U.S. and other intelligence agencies,
contradicting claims made by the Trump administration.
Iran has regained access to 30 of its 33 missile sites along the Strait of Hormuz,
as well as to the majority of its mobile launchers and underground facilities, according to a report by The New York Times.