So to follow up on this, I've caught it in action. Models, when quantized a bit, just do a bit more poorly with short contexts. Even going from f32 (as trained) to bf16 (as usually run) to q8 tends to do okay for "normal" context windows. And at q4 you start feeling like "this model is a little stupid and gets stuck sometimes" (it is! It's just that it's still careening about in the space of "plausible" most of the time. Not good guesswork, but still in the zone). With long contexts, the probability of parameters collapsing to zero is higher, so the more context you have, the more likely you are to see brokenness.
And then at Q2 (2 bits per parameter) or Q1, the model falls apart completely. Parameters collapse to zero easily. You start seeing "all work and no play makes Jack a dull boy" sorts of behavior, with intense and unscrutinized repetition, followed by a hard stop when it just stops working.
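A toy sketch of the "collapse to zero" effect, for anyone who wants to poke at it. It assumes plain symmetric round-to-nearest quantization with a single scale for the whole tensor and made-up Gaussian weights. Real schemes (per-group scales, mixed precision and so on) are more careful than this, so treat it as an illustration of the tendency, not a model of any particular format. The names `quantize` and `zero_fraction` are just for this example.

```python
import random

def quantize(weights, bits):
    """Symmetric round-to-nearest with one scale for the whole tensor (an assumption)."""
    qmax = 2 ** (bits - 1) - 1                 # 127 for 8 bits, 7 for 4 bits, 1 for 2 bits
    scale = max(abs(w) for w in weights) / qmax
    # Snap each weight to the nearest representable level, then map back to floats.
    return [round(w / scale) * scale for w in weights]

def zero_fraction(values):
    """Share of values that ended up exactly zero after quantization."""
    return sum(1 for v in values if v == 0) / len(values)

random.seed(0)
# Made-up weights: small, zero-centered values, not taken from any real model.
weights = [random.gauss(0.0, 0.02) for _ in range(10_000)]

for bits in (8, 4, 2):
    share = zero_fraction(quantize(weights, bits))
    print(f"{bits}-bit: {share:.1%} of weights are exactly zero")
```

With fewer bits there are fewer levels near zero, so more of the small weights get rounded all the way down to it.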
And quantization is a knob that a model vendor can turn relatively easily. (They have to regenerate the model from the base with more quantization, but it's a data transformation on the order of running a terabyte through a straightforward and fast process, nothing like training.)
If you have 1000 customers and enough equipment to handle the requests of 700, going from bf16 to q8 is a no-brainer. Suddenly you can handle the load and have a little spare capacity. Customers get worse results and probably pay the same per token (or they're on a subscription that hides the cost anyway, so you are even freer to make trade-offs. There's a reason that subscription products are kinda poorly described.)
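The back-of-the-envelope arithmetic looks something like this. Big assumption, stated loudly: serving capacity scales inversely with bytes per parameter, which ignores KV cache, compute limits, and the small overhead real quantized formats carry for scales. The byte counts are nominal and `capacity_after` is a made-up helper for this sketch.

```python
# Nominal storage per parameter; real quantized formats add a bit of overhead.
BYTES_PER_PARAM = {"f32": 4.0, "bf16": 2.0, "q8": 1.0, "q4": 0.5}

def capacity_after(customers_now, from_fmt, to_fmt):
    """Customers servable after requantizing, ASSUMING capacity is
    inversely proportional to bytes per parameter (a simplification)."""
    return customers_now * BYTES_PER_PARAM[from_fmt] / BYTES_PER_PARAM[to_fmt]

print(capacity_after(700, "bf16", "q8"))  # 1400.0
```

Under that simplification, hardware that covers 700 customers at bf16 covers about 1400 at q8, which is the "handle the load with spare capacity" situation described above.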
It's also possible for them to vary this across a day. Use models during quieter periods? Maybe you get an instance running at bf16. Use it during a high-use period? You get a Q4 model.
Or intelligent routing is possible. No idea if anyone is doing this, but if they monitor what you send a bit, and you generally shoot for an expensive model for simple requests? They could totally substitute a highly quantized version of the model to answer the question.
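To make the idea concrete, here is a purely hypothetical sketch of what load-based and request-based selection could look like. Nothing here is taken from any vendor: the thresholds, the `looks_simple` flag, and the `pick_variant` function are all invented for illustration.

```python
def pick_variant(load, looks_simple):
    """Hypothetical policy. `load` is the fraction of capacity in use (0.0 to 1.0);
    `looks_simple` is some made-up classifier's guess about the request."""
    if load > 0.9:
        return "q4"        # peak hours: everyone gets the cheapest variant
    if load > 0.7 or looks_simple:
        return "q8"        # busy, or the request doesn't seem to need more
    return "bf16"          # quiet period and a demanding request

print(pick_variant(0.3, looks_simple=False))
print(pick_variant(0.3, looks_simple=True))
print(pick_variant(0.95, looks_simple=False))
```

From the outside, the customer asked for the same model name in all three cases.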
There are •so many tricks• that can be pulled here. Some of them very reasonable to make, some of them treading into outright misleading or fraudulent, and it's weirdly hard to draw the line between them.
Happily it was a straightforward swap of identical models. Didn’t have to replace the sink flange as a result. It was an awkward business of hefting it up to get it aligned and locked into the flange, but it took only a mild amount of swearing to finally get it right.🤬
No leaks. Orders have been given to unfurl the mission accomplished banner.
(The wiring under this sink could use some taming. Whoever did it was certainly not paid by the hour. 🙄 )
The entire machinery of online discourse around building and creating has been so thoroughly captured by entrepreneurial "logic"
that we've lost the language to describe what it feels like to simply make a thing that helps someone,
give it away, and move on with your life.
I've been feeling this for a while now, and I suspect a lot of folks who have the itch to build feel it too, even if they haven't articulated it.
On the first day of the #PTSD intensive, we talked about the shooting. I had felt like I was done with that, that it didn't have anything left for me. But there was something still that filled me with rage... that is still confusing and enraging.
It wasn't actually being shot. It wasn't even the possibility of death. I had been prepared to die. I always knew that was possible. It was something else.
I remember Marc Hokoana's face as he pepper sprayed pacifists, smiling and taunting, joyfully hurting people who he knew were refusing to respond. I remember their flags, the kek flag, literally a Nazi battle flag replaced in 4chan colors with the clover 4chan logo instead of the swastika. How many people have been tortured, have died? How much suffering, that these people not only welcomed but celebrated, joyfully participated in.
The cruelty was the point. It was the plan, the plan he posted to Facebook, the same plan as they have always had, of torturing people until someone responds and then murdering them. Inflicting trauma, responding with overwhelming force, showing how "big and strong" they are because they can always escalate.
Try to stop someone from pepper spraying people, they shoot you. Shoot back, like Michael Reinoehl, and they send a death squad for you. But we keep standing up, so they keep escalating to the slightest imagined infraction. Now they just murder you for being in a car, for filming at a protest, for existing.
The bar for what justifies murder or torture will continue to move lower until there is no one left, or until they can no longer escalate.
The feeling of helplessness is still not the biggest thing though. It's the joy with which they inflict this on us. That's it. That's the thing.
CW: gun violence, abuse dynamics
https://hexmhell.writeas.com/the-creature-ptss-5-day-1
Russini resigns from Athletic following Vrabel pics https://www.espn.com/nfl/story/_/id/48486392/dianna-russini-resigns-athletic-following-mike-vrabel-photos
Just finished Iveliz Explains It All by Andrea Beatriz Arango. Once I picked it up I couldn't put it down till I finished it, in less than a day. A really tender and entrancing novel-in-verse about a Puerto Rican kid struggling with some deep stuff. I loved the way that the journal focalization let deep feelings flow while also giving the reader a bit of a puzzle in the beginning to understand what exactly was going on. Deals with friendship, loss, and mental health (including depression, anxiety, panic attacks, and flashbacks).
#AmReading #ReadingNow
It's good to have many tests in your R package, but it can be a pain to debug failing tests when that happens. {lazytest} to the rescue: only rerun the failing tests, until they pass: #RStats
A citizen of Rome in 117 AD,
under Emperor Trajan,
would've found it difficult to imagine the empire not existing.
The roads, the aqueducts, the legal system, the trade networks stretching from Britain to Mesopotamia:
all of it seemed to be a near-fact of nature, like gravity.
Edward Gibbon gave us six volumes explaining how that feeling turned out to be wrong,
and even he couldn't fully untangle all the causes.
But the overarching theme mig…
Ginn returns to Aviators sideline after DWI arrest https://www.espn.com/united-football-league/story/_/id/48485633/ted-ginn-jr-resumes-coaching-aviators-following-dwi-arrest