Tootfinder

Opt-in global Mastodon full text search. Join the index!

@hw@fediscience.org
2025-12-10 08:50:15

Freudian slip in Meske et al. 2025. arxiv.org/abs/2507.21928 #GenAI #vibecoding #AIResearch

@hw@fediscience.org
2025-12-08 09:34:31

Kevin Xu argues that it's misleading to characterise the US–China AI competition as a race, since there's mutual co-operation and co-optation going on all the time: #AIResearch #LLM

@hw@fediscience.org
2025-10-30 08:23:48

Are you afraid of our new GenAI overlords taking over our jobs soon? According to a new benchmark, the Remote Labor Index by Scale AI and the Center for AI Safety (CAIS), there's no need to be. The best current models are able to solve only around 2% of the tasks in the index: #AIResearch #GenAI

@hw@fediscience.org
2025-10-24 06:39:24

So, the new LLM from Zhipu, GLM 4.6, is about as good at coding as Anthropic's Sonnet 4.5, but roughly 8 times cheaper. It's impressive since, apparently, Zhipu has raised 13x less capital than Anthropic. Additionally, since GLM 4.6 is an open(ish) model, the inference costs will come down rapidly.
The beginning of the end for the #AI investment bubble?
#AIResearch #opensource

@hw@fediscience.org
2025-11-20 08:18:31

This website illustrates nicely how the US lost the competition, at least for now, in open(ish) LLMs: #AIResearch #AGI_hype
/via Wired

@hw@fediscience.org
2025-10-20 07:18:41

Claude Sonnet 4.5 shows significantly increased situational awareness when tested for alignment. Here's a fascinating example from p. 59 of the system card: #anthropology #AIResearch

@hw@fediscience.org
2025-11-04 07:25:48

"I found these ads after I was targeted by one suggesting I join this ethnically ambiguous, dead-eyed family of generic blue hat wearers at the World Series to root on, I guess, the Dodgers."
#AI #generativeAI #meta #AIResearch