Since falling in love with soccer during Euro 2000, I've never been so physically close to the World Cup (the first match in Toronto is later today). But I've never felt so distant from the competition. Few organizations are as good at finding new ways to disgust as FIFA, and the US regime's actions - not just those focused on the World Cup, but those are bad enough - fill me with rage. In the past, I've plugged my nose and enjoyed the matches (I'm not proud). I can't …
So to follow up on this, I've caught it in action. Models, when quantized a bit, just do a bit more poorly with short contexts. Even going from f32 (as trained) to bf16 (as usually run) to q8 tends to do okay for "normal" context windows. And at q4 you start feeling like "this model is a little stupid and gets stuck sometimes" (it is! It's just that it's still careening about in the space of "plausible" most of the time. Not good guesswork, but still in the zone). With long contexts, the probability of parameters collapsing to zero is higher, so the more context you have, the more likely you are to see brokenness.
And then at Q2 (2 bits per parameter) or Q1, the model falls apart completely. Parameters collapse to zero easily. You start seeing "all work and no play makes Jack a dull boy" sorts of behavior, with intense and unscrutinized repetition, followed by a hard stop when it just stops working.
And quantization is a knob that a model vendor can turn relatively easily. (They have to regenerate the model from the base with more quantization, but it's a data transformation on the order of running a terabyte through a straightforward and fast process, nothing like training.)
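A rough way to see the "collapse to zero" effect for yourself. This is a minimal sketch, assuming plain symmetric quantization with one scale for the whole tensor and random Gaussian numbers standing in for real weights (both are my assumptions, not how any particular vendor does it). Real schemes typically use per-block scales and are gentler than this, but the trend is the same: fewer bits, more weights snapping to exactly zero.

```python
import random

def quantize_roundtrip(weights, bits):
    """Symmetric uniform quantization with a single per-tensor scale:
    snap each weight to a signed integer level, then scale it back."""
    levels = 2 ** (bits - 1) - 1          # 127 for 8 bits, 7 for 4 bits, 1 for 2 bits
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) * scale for w in weights]

def zero_fraction(weights, bits):
    """Fraction of weights that are exactly zero after the round trip."""
    restored = quantize_roundtrip(weights, bits)
    return sum(1 for w in restored if w == 0.0) / len(weights)

random.seed(0)
# Hypothetical stand-in for one layer's weights; not taken from a real model.
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]

for bits in (8, 4, 2):
    print(f"q{bits}: {zero_fraction(weights, bits):.1%} of weights round to zero")
```

The sketch stops at 2 bits because this particular formula has no usable levels at 1 bit.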
If you have 1000 customers and enough equipment to handle the requests of 700, going from bf16 to q8 is a no-brainer. Suddenly you can handle the load and have a little spare capacity. Customers get worse results and probably pay the same per token (or they're on a subscription that hides the cost anyway, so you are even freer to make trade-offs. There's a reason that subscription products are kinda poorly described.)
It's also possible for them to vary this across a day. Use models during quieter periods? Maybe you get an instance running the bf16 weights. Use it during a high-use period? You get a Q4 model.
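The back-of-envelope math behind that, as a sketch. The big assumption (mine) is that serving capacity scales inversely with bytes per parameter, which ignores compute limits, KV cache, and batching effects. The `capacity_after` helper is hypothetical, just there to make the arithmetic explicit.

```python
# Nominal storage cost per parameter for each format.
BYTES_PER_PARAM = {"f32": 4.0, "bf16": 2.0, "q8": 1.0, "q4": 0.5}

def capacity_after(current_capacity, current_fmt, new_fmt):
    """Estimate how many customers the same hardware can serve after
    requantizing, ASSUMING capacity is inversely proportional to model size."""
    return current_capacity * BYTES_PER_PARAM[current_fmt] / BYTES_PER_PARAM[new_fmt]

customers = 1000
new_capacity = capacity_after(700, "bf16", "q8")
print(f"capacity {new_capacity:.0f}, headroom {new_capacity - customers:.0f}")
```

Under that assumption, halving the bytes per parameter takes you from short by 300 customers to comfortably covered, without buying a single machine.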
Or intelligent routing is possible. No idea if anyone is doing this, but if they monitor what you send a bit, and you generally shoot for an expensive model for simple requests? They could totally substitute a highly quantized version of the model to answer the question.
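To show how little code the time-of-day and routing tricks would take, here is a purely hypothetical sketch. I have no evidence any vendor does this; the function name, the load thresholds, and the "looks simple" signal are all invented for illustration.

```python
def pick_variant(load, looks_simple):
    """Choose which quantization of the 'same' model serves a request.

    load: fraction of serving capacity in use, 0.0 to 1.0 (hypothetical metric).
    looks_simple: whether some upstream check judged the request easy
                  (hypothetical; how you'd decide that is its own problem).
    """
    if looks_simple:
        return "q4"      # cheap answer for an easy question
    if load > 0.8:       # made-up threshold: peak hours
        return "q4"
    if load > 0.5:       # made-up threshold: busy but not slammed
        return "q8"
    return "bf16"        # quiet period: full-quality weights

print(pick_variant(0.2, False))
print(pick_variant(0.9, False))
```

Nothing in the response would have to tell the customer which branch they landed in, which is exactly the problem.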
There are •so many tricks• that can be pulled here. Some of them very reasonable to make, some of them treading into outright misleading or fraudulent, and it's weirdly hard to draw the line between them.
When I got into tech, things felt fragile. After years of trying to fix things, I spent more years feeling as though the information apocalypse was imminent. Everywhere I turned, something was broken horribly. I can't even count the number of times I've just had to be like, "oh fuck. That's really bad. I knew it was bad, but like... oh fuck."
We have *all* had our identity stolen. I don't even know how many times my social security number has been in a data breach. How many of my medical records are on the market? But yeah, sure, let's accelerate that.
The problem was never that we couldn't find problems. The problem has always been "leadership" being unwilling to invest in fixing them. The problem has always been this mind-set of growth-at-any-cost.
I tuck these things away in my brain, and they sit there gnawing on my sanity, like little RFK worms.
fuck man, kane parsons is actually 20 years old? the director of a franchise i've loved watching for years, now creator of a24's top grossing movie of all time, is actually two years younger than me? my understanding of growth and self worth is limited to a faulty capitalist logic of so-called meritocracy, ignoring all privileges, advantages, or even luck, fueling an inferiority complex in which i can only justify me being less successful than him by thinking of myself as lesser, despite tha…
Any idea about why, sometimes, DNS over UDP might fail with a specific ISP?
So, I have been having random problems with the network at home and, you won't believe it, after hours of researching, it turns out it was DNS. Surprise. The problem is that, apparently, my ISP, or my ISP's router, is dropping UDP packets or otherwise failing.
PS: I have already tried checking in my router's configuration if there is anything like "UDP flood prevention". I cannot find anything…
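One way to narrow this down is to send the same DNS query over UDP and over TCP and compare. A minimal sketch using only Python's standard library; the function names are my own, and it only checks that some reply arrives (it does not parse the answer or verify the query ID).

```python
import socket
import struct

def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in hostname.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN

def probe_udp(server, hostname, timeout=2.0):
    """Return True if any UDP reply arrives before the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(hostname), (server, 53))
            sock.recvfrom(512)
            return True
        except OSError:  # includes timeouts
            return False

def probe_tcp(server, hostname, timeout=2.0):
    """Return True if the 2-byte length prefix of a TCP reply arrives."""
    query = build_query(hostname)
    try:
        with socket.create_connection((server, 53), timeout=timeout) as sock:
            # DNS over TCP prefixes each message with its length.
            sock.sendall(struct.pack("!H", len(query)) + query)
            return len(sock.recv(2)) == 2
    except OSError:
        return False

def udp_loss_rate(server, hostname, attempts=20):
    """Fraction of UDP probes that got no reply."""
    failures = sum(1 for _ in range(attempts) if not probe_udp(server, hostname))
    return failures / attempts
```

Running `udp_loss_rate` against the router's own address and against a public resolver such as 1.1.1.1 should hint at where the packets disappear: if only the router's resolver drops queries, the router is the likely culprit; if both do, and `probe_tcp` still succeeds, the UDP path through the ISP is more suspect.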
Just finished "Starfish" by Akemi Dawn Bowman. It was gripping (I basically barely put it down and finished it in a single day) but also feels flawed in some ways.
Things I liked: a protagonist that I really strongly rooted for, and a resolution that landed with a bit of complexity.
Things I'm feeling a way about: complete lack of depth in interrogating heritage, despite that being a huge theme, some tinges of deus ex machina in how the central conflicts are resolved, and a real lack of good messaging around consent.
#AmReading #ReadingNow #Bookstodon
Fucking around with a Sony A7S that was loaned to me ages ago. I'm finally at the point with the work that I do where I'm feeling the need to be able to run a camera myself, even if it's just to clearly communicate with people I work with.
It has a Sony FE 1.8/50 lens, that's all I've got to work with 😂
#photography
…it’s not even that. It’s just ugly. Bad layouts. Bad margins. Bad proportions. Awkward animations. Flickers and flashes. Content peeking through all the negative space so that the screen is filled with visual noise. It feels designed by committee. It feels pasted together.
The feel of Apple products has covered a lot of ground over the decades. They’ve felt elegant. They’ve felt basic. They’ve felt bauble-y and cute. They’ve felt futuristic. They’ve felt practical. But this is the first time I can recall an Apple product feeling •cheap•.
Pride parade in #Buffalo yesterday.
Awesome performances in floats. Great community support. It’s nice when people aren’t bigots.
Lots of corporate, political involvement. Branded rainbow swag nobody needed.
This gamelan float is how I discovered there’s an org in Buffalo: