Tootfinder

@aredridel@kolektiva.social
2026-04-14 14:22:42

So to follow up on this, I've caught it in action. Models, when quantized a bit, just do a bit more poorly with long contexts. Even going from f32 (as trained) to bf16 (as usually run) to q8 tends to do okay for "normal" context windows. At q4 you start feeling like "this model is a little stupid and gets stuck sometimes" (it is! It's just that it's still mostly careening about in the space of "plausible" most of the time. Not good guesswork, but still in the zone). With long contexts, the probability of parameters collapsing to zero is higher, so the more context you have, the more likely you are to see brokenness.
And then at Q2 (2 bits per parameter) or Q1, the model falls apart completely. Parameters collapse to zero easily. You start seeing "all work and no play makes jack a dull boy" sorts of behavior, with intense and unscrutinized repetition, followed by a hard stop when it just stops working.
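A toy sketch of that collapse, not how any vendor actually quantizes: it assumes the simplest scheme (symmetric round-to-nearest with a single scale for the whole tensor) and made-up Gaussian weights. Real formats typically use per-block scales and lose less, but the trend is the same: the fewer bits, the more small weights round to exactly zero. The function names here are illustrative only.

```python
import random

def quantize(weights, bits):
    """Symmetric round-to-nearest quantization, one scale for the whole tensor (toy assumption)."""
    levels = 2 ** (bits - 1) - 1            # 127 for 8 bits, 7 for 4 bits, 1 for 2 bits
    scale = max(abs(w) for w in weights) / levels
    # Snap each weight to the nearest representable step, then map back to a float.
    return [round(w / scale) * scale for w in weights]

def zero_fraction(weights, bits):
    """Fraction of weights that end up as exactly zero after quantization."""
    quantized = quantize(weights, bits)
    return sum(1 for v in quantized if v == 0) / len(quantized)

random.seed(0)
# Hypothetical weights: small values centered on zero.
weights = [random.gauss(0.0, 0.02) for _ in range(10_000)]

for bits in (8, 4, 2):
    print(f"q{bits}: {zero_fraction(weights, bits):.1%} of weights are zero")
```

At 2 bits under this scheme there are only three representable values (negative, zero, positive), so most of the tensor lands on zero.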
And quantization is a dial that a model vendor can turn relatively easily. (They have to regenerate the model from the base with more quantization, but it's a data transformation on the order of running a terabyte through a straightforward and fast process, not like training.)
If you have 1000 customers and enough equipment to handle the requests of 700, going from bf16 to q8 is a no-brainer. Suddenly you can handle the load and have a little spare capacity. The customers get worse results and probably pay the same per token (or they're on a subscription that hides the cost anyway, so you are even freer to make trade-offs. There's a reason that subscription products are kinda poorly described.)
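The back-of-envelope version of that, as a sketch. The big assumption (mine, not measured) is that serving capacity is bound by memory and scales inversely with bits per parameter; real gains are usually smaller because the KV cache, activations, and compute don't shrink the same way.

```python
# Bits per parameter for each format mentioned in the thread.
BITS = {"f32": 32, "bf16": 16, "q8": 8, "q4": 4}

def capacity_after(current_capacity, old_fmt, new_fmt):
    """Idealized capacity after requantizing, assuming capacity ~ 1 / bits per parameter."""
    return current_capacity * BITS[old_fmt] / BITS[new_fmt]

customers = 1000
capacity = 700   # customers the current bf16 deployment can serve

new_capacity = capacity_after(capacity, "bf16", "q8")
print(new_capacity, new_capacity >= customers)   # prints: 1400.0 True
```

Even if the real-world gain is only a fraction of that ideal doubling, it can be enough to cover the gap between 700 and 1000.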
It's also possible for them to vary this across a day. Use models during quieter periods? Maybe you get an instance running at bf16. Use them during a high-use period? You get a Q4 model.
Or intelligent routing is possible. No idea if anyone is doing this, but if they monitor what you send a bit, and you generally shoot for an expensive model for simple requests? They could totally substitute a highly quantized version of the model to answer the question.
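To make the idea concrete, here is a purely hypothetical sketch of what such a router could look like. Nothing here describes a real vendor: `pick_variant`, the load threshold, and the "simple request" token cutoff are all invented for illustration.

```python
def pick_variant(load, prompt_tokens, busy_load=0.8, simple_prompt_tokens=200):
    """Pick which quantization of the same model serves a request (hypothetical policy).

    load: current cluster utilization between 0.0 and 1.0
    prompt_tokens: size of the incoming request, used as a crude complexity signal
    """
    if load >= busy_load:
        return "q4"      # peak hours: everyone gets the cheapest variant
    if prompt_tokens <= simple_prompt_tokens:
        return "q8"      # quiet hours, but the request looks simple
    return "bf16"        # quiet hours and a demanding request

print(pick_variant(load=0.3, prompt_tokens=5000))   # prints: bf16
print(pick_variant(load=0.3, prompt_tokens=50))     # prints: q8
print(pick_variant(load=0.95, prompt_tokens=5000))  # prints: q4
```

The customer sees the same model name in all three cases, which is exactly why the line gets hard to draw.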
There are •so many tricks• that can be pulled here. Some of them very reasonable to make, some of them treading into outright misleading or fraudulent, and it's weirdly hard to draw the line between them.

A citizen of Rome in 117 AD,
under Emperor Trajan,
would've found it difficult to imagine the empire not existing.
The roads, the aqueducts, the legal system, the trade networks stretching from Britain to Mesopotamia:
all of it seemed to be a near-fact of nature, like gravity.
Edward Gibbon gave us six volumes explaining how that feeling turned out to be wrong,
and even he couldn't fully untangle all the causes.
But the overarching theme mig…

@primonatura@mstdn.social
2026-02-14 12:00:06

"Tree planting can combat urban heat, but some neighborhoods are falling behind"
#Trees #Environment

@qurlyjoe@mstdn.social
2026-03-15 01:40:51

#silentSunday
Sprague Lake, #RockyMountainNationalPark

A photo taken at Sprague Lake in Rocky Mountain National Park. It’s been snowing, and is still snowing. The sun rose an hour ago and the clouds are not that deep, so the sky is washed out. Across the bottom 2/3 of the image, dense pine forest appears hazy because of the large snowflakes falling gently. Across the bottom of the image the lake is mostly snow-covered except for what looks like a band of open water but is really ice. There is a log at the right side, and a chunk of tree trunk i…
@tiotasram@kolektiva.social
2026-01-15 02:32:28

Just finished "Far Sector" written by N. K. Jemisin and illustrated by Jamal Campbell. I don't normally go for Marvel/DC comics stuff and this was a good reminder why. Jemisin's authorship was the draw for me here, as well as some curiosity about what I might be missing out on by avoiding the classic comics lineage. I won't go into too much detail about particulars, but suffice to say it ends up feeling to me like a very neoliberal story dressed up in a veneer of radicalism, which is not what I'd expected of Jemisin. Particularly in light of current events, the "good cops" aspects of the storyline ring truly hollow. There's still a lot of neat parts, but I guess I also wound up disappointed by the sci-fi aspects in a lot of ways. I truly think Jemisin is capable of better than this, based on her other (excellent) work.
#AmReading #ReadingNow

@stefanlaser@social.tchncs.de
2026-04-15 14:35:44

On the radio, I hear the German research foundation #DFG defend its recent move to allow #AI in project reviews, just with local setups, just for language clarity – lots of reservations.
I then listen to the most recent episode of Mél’s Data Fix podcast. An anonymous guest (🔥) talks about their daily…

There are some changes for the 2026 tax filing season that people who are 65 years of age and older should be aware of.
The most recent is the enhanced deduction for seniors.
irs.gov/newsroom/2026-filing-s

@primonatura@mstdn.social
2026-04-15 11:00:07

"Say no to pesticides, mix up your lawn – and six more ways to help bees to thrive"
#Environment #Bees #Insects

A trial date has been set for Trump's $10 billion lawsuit against the BBC.
On Thursday, Judge Roy K. Altman of the Southern District of Florida set a provisional start date of ⭐️February 15, 2027, for a two-week trial. 
The lawsuit was filed following the release of an episode from Panorama,
the BBC's investigative documentary series, titled
"Trump: A Second Chance?"
In it, the BBC cut together two parts of Trump's January 2021 speech to …

"The sooner David Ellison takes over that network the better,"
Pete Hegseth said during a morning briefing.
Hegseth's invoking the name of the Paramount Skydance chief executive
— whose company will take control of CNN once its deal to merge with Warner Bros. Discovery is finalized
— amplified the fear many have that the cable news channel will seek to appease the Trump administration.
Hegseth made the remarks after blasting CNN's reporting on t…