Gloom & Bloom IV ☁️🌺
黑暗绽放 IV ☁️ 🌺
📷 Zeiss IKON Super Ikonta 533/16
🎞️ Lucky SHD 400
#filmphotography #Photography #blackandwhite
So to follow up on this, I've caught it in action. Models, when quantized a bit, just do a bit more poorly with short contexts. Even going from f32 (as trained) to bf16 (as usually run) to q8 tends to do okay for "normal" context windows. And at q4 you start feeling like "this model is a little stupid and gets stuck sometimes" (it is! It's just that it's still mostly careening about in the space of "plausible" most of the time. Not good guesswork, but still in the zone). With long contexts, the probability of parameters collapsing to zero is higher, so the more context you have, the more likely you are to see brokenness.
And then at Q2 (2 bits per parameter) or Q1, the model falls apart completely. Parameters collapse to zero easily. You start seeing "all work and no play makes Jack a dull boy" sorts of behavior, with intense and unscrutinized repetition, followed by a hard stop when it just stops working.
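To make the "collapse to zero" point concrete, here is a minimal sketch, assuming a naive symmetric round-to-nearest quantizer with one scale for the whole tensor. Real schemes (per-block scales, non-uniform grids) are smarter than this, and the names `quantize` and `zero_fraction` are made up for illustration. It only shows the trend: the fewer bits you keep, the larger the share of small weights that round to exactly zero.

```python
import random

def quantize(weights, bits):
    """Naive symmetric round-to-nearest quantization (illustrative only).

    Uses a single scale for the whole list; requires bits >= 2,
    because a signed grid with 1 bit has no nonzero level here.
    """
    levels = 2 ** (bits - 1) - 1          # 127 for 8 bits, 7 for 4 bits, 1 for 2 bits
    scale = max(abs(w) for w in weights) / levels
    # Snap each weight to the nearest grid point, then map back to a float.
    return [round(w / scale) * scale for w in weights]

def zero_fraction(weights, bits):
    """Share of weights that end up exactly zero after quantization."""
    quantized = quantize(weights, bits)
    return sum(1 for v in quantized if v == 0.0) / len(quantized)

if __name__ == "__main__":
    random.seed(0)
    # Hypothetical stand-in for a weight tensor: normally distributed values.
    weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]
    for bits in (8, 4, 2):
        print(f"{bits} bits: {zero_fraction(weights, bits):.1%} of weights are zero")
```

With a single outlier setting the scale, almost everything near zero gets flattened at 2 bits, which is roughly the failure mode described above.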
And quantization is a knob that a model vendor can turn relatively easily. (They have to regenerate the model from the base with more quantization, but it's a data transformation on the order of running a terabyte through a straightforward and fast process, nothing like training.)
If you have 1000 customers and enough equipment to handle the requests of 700, going from bf16 to q8 is a no-brainer. Suddenly you can handle the load and have a little spare capacity. Customers get worse results and probably pay the same per token (or they're on a subscription that hides the cost anyway, so you are even freer to make trade-offs. There's a reason that subscription products are kinda poorly described.)
It's also possible for them to vary this across a day. Use models during quieter periods? Maybe you get an instance running bf16. Use it during a high-use period? You get a Q4 model.
Or intelligent routing is possible. No idea if anyone is doing this, but if they monitor what you send a bit, and you generally shoot for an expensive model for simple requests? They could totally substitute a highly quantized version of the model to answer the question.
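A rough sketch of what load-based and request-based routing could look like. Everything here is hypothetical: the `Request` type, the `pick_variant` function, the variant names, and the thresholds are invented for illustration, and nothing suggests any vendor actually does it this way.

```python
from dataclasses import dataclass

@dataclass
class Request:
    prompt: str
    requested_model: str  # what the customer asked for (and is billed for)

def pick_variant(load: float, request: Request,
                 busy_threshold: float = 0.9,
                 simple_prompt_chars: int = 200) -> str:
    """Choose which precision variant of the requested model to serve.

    load is the fraction of capacity currently in use (hypothetical metric).
    Both thresholds are made-up numbers, not anything measured.
    """
    if load >= busy_threshold:
        return "q4"    # peak hours: everyone gets the cheapest variant
    if len(request.prompt) < simple_prompt_chars:
        return "q8"    # short, "simple-looking" request: quietly downgrade
    return "bf16"      # quiet period and a substantial request: full precision

if __name__ == "__main__":
    long_prompt = "x" * 1000
    print(pick_variant(0.95, Request(long_prompt, "big-model")))
    print(pick_variant(0.20, Request("hi", "big-model")))
    print(pick_variant(0.20, Request(long_prompt, "big-model")))
```

Note that `requested_model` never changes in this sketch, which is the point: the customer sees the same model name either way.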
There are •so many tricks• that can be pulled here. Some of them are very reasonable to make, some of them tread into outright misleading or fraudulent territory, and it's weirdly hard to draw the line between them.
A Swiss canton has suspended its pilot of electronic voting after failing to count 2,048 votes cast in national referendums held on March 8.
Basel-Stadt announced the problem with its e-voting pilot, open to about 10,300 locals living abroad and 30 people with disabilities, last Friday afternoon.
It encouraged participants to deliver a paper vote to the town hall or use a polling station but admitted this would not be possible for many.
By the close of polling on Sunday, its …
Yesterday I explained my mixed feelings regarding LLMs to a friend by comparing them to cars ("I don't like cars, they destroy a lot, but they are also so very convenient"). Today I read the same argument here: https://aphyr.com/posts/420-the-future
messal_shale: Messel Shale food web (2014)
A network of feeding links among taxa based on the 48-million-year-old uppermost early Eocene Messel Shale. Edge property 'certainty' denotes the certainty of the edge. Metadata include evidence, habitat, and trophic roles. The edge direction goes from consumer to resource.
This network has 700 nodes and 6444 edges.
Tags: Biological, Food Web, Uncertain, Weighted, Metadata
Following federal cuts to history-focused organizations, the president of the Canadian Historical Association, Colin Coates, sent this letter to Marc Miller, the Minister of Canadian Identity and Culture.
One thing might not be obvious: Coates's reference to Carney's recent Quebec City speech suggests Canadians' need for historical context right now. He doesn't agree with Carney's claims. In fact, most Canadian historians would dispute them.
David Cronenberg, master of body horror movies, was born today (March 15) in 1943 (during World War II!)
I saw his films 'Crash' and 'Naked Lunch' early, around the same time, in 1997 or 1998. Since then, I've seen many more of his. I just counted: I've seen 13 of his 23 films, and I've loved all of them. I'm a huge fan. My favorite was Shivers for a while. I'm not sure what it is now. Gotta see those final 10!
Happy birthday, David. We love …
Ex-Raiders QB Rich Gannon Drops True Feelings on Fernando Mendoza https://heavy.com/sports/nfl/las-vegas-raiders/rich-gannon-true-feelings-fernando-mendoza/
In his first term, Trump appointed a record-setting 54 federal appellate judges.
Circuit judges are nominated by presidents and, if confirmed by the Senate, serve lifetime appointments.
This analysis provides an early look at how those appointments will likely reverberate nationwide in terms of dismantling or failing to uphold environmental laws and policy, legal scholars said.
“Long term, it’s going to set a lot of precedent that pushes the law away from environmental pr…