
2025-06-07 11:42:03
from my link log —
Designing APIs for humans: Stripe object IDs.
https://dev.to/stripe/designing-apis-for-humans-object-ids-3o5a
saved 2025-05-22
Q&A with Google DeepMind CEO Demis Hassabis on "a 50% chance" of AGI in the next five to 10 years, bad actors and technical risks, AI regulation, jobs, and more (Steven Levy/Wired)
https://www.wired.com/story/google-deepmin
Curious humpback whales approach humans and blow bubble 'smoke' rings
https://phys.org/news/2025-06-curious-humpback-whales-approach-humans.html
Really cool video about why the video games industry is struggling: everybody has to compete with addictive social media for eyeballs and time. And unless whole new markets are opened up (humans are not born quickly enough) there's just no longer a way to create exponential growth. But billionaire investors need that. That's why they would rather invest in AI.
By the way, this is the same reason that cinemas have gotten in trouble (and now even streaming services...)
❝We humans are stability-seeking creatures. Getting accustomed to what used to seem unthinkable can feel like an accomplishment. And when the unthinkable recedes at least a bit…it’s easy to mistake it for proof that the dark times are ending.
But these comparatively small victories don’t alter the direction of our transformation — they don’t even slow it down measurably — even while they appeal to our deep need to normalize.…And so just when we most need to act — while there is indeed room for action and some momentum to the resistance — we tend to be lulled into complacency by the sense of relief on the one hand and boredom on the other.❞
https://www.nytimes.com/2025/05/28/opinion/trump-danger-normalization-shock.html
So I've found my answer after maybe ~30 minutes of effort. First stop was the first search result on Startpage (https://millennialhawk.com/does-poop-have-calories/), which has some evidence of maybe-AI authorship but which is better than a lot of slop. It actually has real links & cites research, so I'll start by looking at the sources.
It claims near the top that poop contains 4.91 kcal per gram (note: 1 kcal = 1 Calorie = 1000 calories, which fact I could find/do trust despite the slop in that search). Now obviously, without a range or mention of an average, this isn't the whole picture, but maybe it's an average to start from? However, the citation link is to a study (https://pubmed.ncbi.nlm.nih.gov/32235930/) which only included 27 people with impaired glucose tolerance and obesity. Might have the cited stat, but it's definitely not a broadly representative one if this is the source. The public abstract does not include the stat cited, and I don't want to pay for the article. I happen to be affiliated with a university library, so I could see if I have access that way, but it's a pain to do and not worth it for this study that I know is too specific. Also most people wouldn't have access that way.
Side note: this doing-the-research project has the nice benefit of letting you see lots of cool stuff you wouldn't have otherwise. The abstract of this study is pretty cool and I learned a bit about gut microbiome changes from just reading it.
My next move was to look among citations in this article to see if I could find something about calorie content of poop specifically. Luckily the article page had indicators for which citations were free to access. I ended up reading/skimming 2 more articles (a few more interesting facts about gut microbiomes were learned) before finding this article whose introduction has what I'm looking for: https://pmc.ncbi.nlm.nih.gov/articles/PMC3127503/
Here's the relevant paragraph:
"""
The alteration of the energy-balance equation, which is defined by the equilibrium of energy intake and energy expenditure (1–5), leads to weight gain. One less-extensively-studied component of the energy-balance equation is energy loss in stools and urine. Previous studies of healthy adults showed that ≈5% of ingested calories were lost in stools and urine (6). Individuals who consume high-fiber diets exhibit a higher fecal energy loss than individuals who consume low-fiber diets with an equivalent energy content (7, 8). Webb and Annis (9) studied stool energy loss in 4 lean and 4 obese individuals and showed a tendency to lower the fecal energy excretion in obese compared with lean study participants.
"""
And there's a good-enough answer if we do some math, along with links to more in-depth reading if we want them. A Mayo Clinic calorie calculator suggests about 2250 Calories per day for me to maintain my weight. I think there's probably a lot of variation in that number, but 5% of it would be very roughly 100 Calories lost in poop per day, so an extremely rough estimate for a range of humans might be 50-200 Calories per day. Interestingly, one of the AI slop pages I found asserted (without citation) 100-200 Calories per day, which kinda checks out. I had no way to trust that number though, and as we saw with the 4.91 kcal/gram figure, its provenance might not be good.
To double-check, I visited this link from the paragraph above: https://www.sciencedirect.com/science/article/abs/pii/S0022316622169853?via=ihub
It's only a 6-person study, but just the abstract has numbers: ~250 kcal/day pooped on a low-fiber diet vs. ~400 kcal/day pooped on a high-fiber diet. That's with intakes of ~2100 and ~2350 kcal respectively, which is close to the number from which I estimated 100 kcal above, so maybe the first estimate from just the 5% number was a bit low.
Glad those numbers were in the abstract, since the full text is paywalled... It's possible this study was also done on some atypical patient group...
Just to come full circle, let's look at that 4.91 kcal/gram number again. A search suggests 14-16 ounces of poop per day is typical, with at least two sources around 14 ounces, or ~400 grams. (AI slop was strong here too, with one including a completely made up table of "studies" that was summarized as 100-200 grams/day). If we believe 400 grams/day of poop, then 4.91 kcal/gram would be almost 2000 kcal/day, which is very clearly ludicrous! So that number was likely some unrelated statistic regurgitated by the AI. I found that number in at least 3 of the slop pages I waded through in my initial search.
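The arithmetic in this thread can be replayed in a few lines. This is only a rough sketch: every input is a number quoted above, and the function and variable names are made up for illustration.

```python
# Rough sanity check of the figures quoted in the thread above.
# All inputs are the thread's own numbers; nothing here is new data.

KCAL_PER_GRAM_CLAIM = 4.91  # figure repeated by the search-result pages
GRAMS_PER_DAY = 400         # ~14 oz/day, as quoted in the thread
DAILY_INTAKE_KCAL = 2250    # calorie-calculator estimate from the thread
LOSS_FRACTION = 0.05        # "≈5% of ingested calories" lost in stools and urine


def loss_from_density(kcal_per_gram: float, grams_per_day: float) -> float:
    """Daily energy loss implied by an energy density and a daily mass."""
    return kcal_per_gram * grams_per_day


def loss_from_fraction(intake_kcal: float, fraction: float) -> float:
    """Daily energy loss implied by a fixed fraction of intake."""
    return intake_kcal * fraction


claimed = loss_from_density(KCAL_PER_GRAM_CLAIM, GRAMS_PER_DAY)
estimate = loss_from_fraction(DAILY_INTAKE_KCAL, LOSS_FRACTION)

# Abstract numbers from the six-person fiber study: (lost, intake) in kcal/day.
fiber_study = {"low-fiber": (250, 2100), "high-fiber": (400, 2350)}
fiber_fractions = {
    diet: lost / intake for diet, (lost, intake) in fiber_study.items()
}

if __name__ == "__main__":
    print(f"4.91 kcal/g x 400 g/day -> {claimed:.0f} kcal/day")
    print(f"5% of 2250 kcal/day     -> {estimate:.1f} kcal/day")
    for diet, frac in fiber_fractions.items():
        print(f"{diet}: {frac:.1%} of intake lost")
```

The density-based figure comes out close to the entire daily intake, which is why it cannot be right, while the fiber study's losses work out to roughly 12% and 17% of intake, somewhat above the 5% rule of thumb.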
This https://arxiv.org/abs/2505.10661 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csHC_…
You will find no answers to the struggles we face as humans by looking to the machines.
Be eager to say “I don’t know… but I want to find out!”
➡️ https://rasterweb.net/raster/2025/06/02/experts-dont-know/
Feel the Force: Contact-Driven Learning from Humans
Ademi Adeniji, Zhuoran Chen, Vincent Liu, Venkatesh Pattabiraman, Raunaq Bhirangi, Siddhant Haldar, Pieter Abbeel, Lerrel Pinto
https://arxiv.org/abs/2506.01944
Researchers in Japan Discover Medicine Capable of Regrowing Third Set of Teeth for Humans - Dentistry Today
https://www.dentistrytoday.com/researchers-in-japan-discover-medicine-capable-of-regrowing-third-set-of-teeth-for-humans/
I've been a Duolingo user since 2013. You can track the enshittification of the product since they went public.
Enshittification is now complete, with humans being replaced by AI. This act of mine is my way of avoiding being treated like a wallet while you mistreat others.
#duolingo #workingClass
Philippe Neau & Antonella Eye Porcelluzzi – Elephant
https://www.clongclongmoo.org/2025/06/05/philippe-neau-antonella-eye-porcelluzzi-elephant/
"The industry perpetuates this state of things, keeping itself in a state of blissful high-hormone idiocy. Software is important, so clearly those who are writing it must be hailed as the holders of some occult knowledge and the purveyors of infinite wisdom. Through bribery, hubris, or ill luck, some of those same assholes find themselves later in management positions, and continue the tradition by hiring more people like themselves, because that is what humans do."
Complexity in the Wake of Artificial Intelligence
Theodore Modis
https://arxiv.org/abs/2506.04269 https://arxiv.org/pdf/2506.04269
Human-Machine Collaboration and Ethical Considerations in Adaptive Cyber-Physical Systems
Zoe Pfister
https://arxiv.org/abs/2507.02578 https://
Speaking of: did those scumbags at the University of Zürich ever face any consequences for their highly unethical work?
https://arstechnica.com/ai/2025/06/reddit-ceo-pledges-site-will-remain-written-by-humans-and-voted-on-by-humans
1945: rent time from humans.
1955: build your own computer!
1965: rent time on an IBM mainframe.
1975: get your own home computer!
1985: rent time on CompuServe.
1995: get a PC with Windows 95!
2005: rent time on AWS.
2015: get an iPhone or an Android!
2025: rent time on ChatGPT.
2035: get your own whatever!
Love this, by D.J. Grothe
We are truly only just getting started. All we have to do is to fail to kill everyone, and things will get better.
"Human civilization has existed for only 3% of the time that anatomically modern humans have existed. And modern industrial civilization has existed for just 2% of that 3% — just 0.06% of the time that anatomically modern humans have existed. Maybe we’re just getting started!"
Towards Human-like Preference Profiling in Sequential Recommendation
Zhongyu Ouyang, Qianlong Wen, Chunhui Zhang, Yanfang Ye, Soroush Vosoughi
https://arxiv.org/abs/2506.02261
This https://arxiv.org/abs/2505.17433 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…
I always find the use of first person plural pronouns in discussions of the distant future to be too cute…
Humans didn’t exist a million years ago. There were some crafty hominid apes but not the sort you could clean up and mistake for a human.
There is no “us” 7 billion years from now. There is very likely no “us” in half a million years. We may or may not have descendants but they won’t be “us” in any sense.
A billion years is longer than Earth has had visible life
I've been using em dashes since I picked up the LaTeX manual in 1986 and I'm not going back just because some text extruding software uses them more than most humans.
Also, my grandfather was a printer and I knew from an early age that "em" and "en" were legit scrabble words.
#OldManYellsAtCloud
Current track: Kalte Nacht – Humans Are Mistakes
#KleineEchos – now live at https://www.mixcloud.com/live/thopan
It's really telling how much of the conversation around AI boils down to,
"Is there any value in humans being able to think?"
Which all too quickly reduces to
"Is there any value in humans?"
https://bsky.app/profile/kevinriggle.b
My Advisor, Her AI and Me: Evidence from a Field Experiment on Human-AI Collaboration and Investment Decisions
Cathy (Liu) Yang, Kevin Bauer, Xitong Li, Oliver Hinz
https://arxiv.org/abs/2506.03707
TRiMM: Transformer-Based Rich Motion Matching for Real-Time multi-modal Interaction in Digital Humans
Yueqian Guo, Tianzhao Li, Xin Lyu, Jiehaolin Chen, Zhaohan Wang, Sirui Xiao, Yurun Chen, Yezi He, Helin Li, Fan Zhang
https://arxiv.org/abs/2506.01077
It's the #DayOfZeus / Jupiter's Day / Thorsday! ⚡
Enraged by #Prometheus stealing fire for the humans, #Zeus, "bound [ready-witted Prometheus] with inextricable bonds, cruel chains,…
Amazon deploys 1 millionth robot, nearing point where machines outnumber humans in warehouses https://ground.news/article/amazon-deploys-its-1-millionth-robot-in-a-sign-of-more-job-automation
Misalignment or misuse? The AGI alignment tradeoff
Max Hellrigel-Holderbaum, Leonard Dung
https://arxiv.org/abs/2506.03755 https://ar…
A Forget-and-Grow Strategy for Deep Reinforcement Learning Scaling in Continuous Control
Zilin Kang, Chenyuan Hu, Yu Luo, Zhecheng Yuan, Ruijie Zheng, Huazhe Xu
https://arxiv.org/abs/2507.02712
So Builder . ai Mechanical Turked things and Microsoft (and others) were none the wiser.
Due diligence: boring stuff that, when you skip it, will catch up with you fast.
“Builder . ai’s platform relied on around 700 engineers based in India who manually wrote code based on customer requests. Despite the company marketing it as AI-generated, most of the work was done by humans behind the scenes.”
At a "couples retreat" for human-AI pairs, users of services like Replika and Nomi grapple with the virtual reality and emotional limits of their partners (Sam Apple/Wired)
https://www.wired.com/story/couples-retrea
Ad on the tube says 'Humans were the beta test. The era of AI employees is here'.
I can't *imagine* why people are a bit resistant to AI! At least offshoring never advertised on the tube. The enshittification of 21st century life continues.
Quanta Magazine authors Janna Levin and Steven Strogatz strike up a conversation with Ellie Pavlick (Research Scientist at Google DeepMind) about the differences and similarities between the way people understand language and what NLP algorithms do, and about the fact that such conversations more often than not shed light on more than the computational side of linguistics.
"Will AI Ever Understand Language Like Humans?"
Telescopes on the Andes glimpse elusive encounters fueled by the very first stars in the universe more than 13 billion years ago by detecting cosmic microwave light signals https://www.404media.co/humans-have-now-seen-the-dawn-of-time-from-ear…
A Hierarchical Integer Linear Programming Approach for Optimizing Team Formation in Education
Aaron Kessler, Tim Scheiber, Heinz Schmitz, Ioanna Lykourentzou
https://arxiv.org/abs/2506.02756
This https://arxiv.org/abs/2503.02077 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csMA_…
Palaeontologists find #collagen #biomarker to identify ancient #Australian #megafauna, notable because it’s more durable…
When humans feel powerless, especially after traumatic events or retraumatization or ~ gestures generally at C-PTSD ~ what often helps is having an area of control over choices, decisions, and outcomes (especially outcomes that have positive side effects like humans liking the action/work/result)
And thus I flew to LA for a weekend and got a tattoo.
#BloomScrolling
DRAG: Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillation
Jennifer Chen, Aidar Myrzakhan, Yaxin Luo, Hassaan Muhammad Khan, Sondos Mahmoud Bsharat, Zhiqiang Shen
https://arxiv.org/abs/2506.01954
This https://arxiv.org/abs/2505.20290 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
For The Love Of Dogs (And Their Humans!)
Great Australian Pods Podcast Directory: #GreatAusPods
Note that nowhere in that definition is there actually any attempt to define or measure “intelligence” — a term which we are scarcely able to define and to measure even for humans!
Note also that the definition is inherently a broad one and a shifting one. It’s relative to humans •and• relative to recent history.
4/
“As is the case with reading and writing a language, code is one of those things where if you don’t use it, you lose it. Early studies indicate that humans who use A.I. could become less creative over time.”
Early studies link: https://dl.acm.org/doi/abs/10.1145/3706598.3714198
Designing Algorithmic Delegates: The Role of Indistinguishability in Human-AI Handoff
Sophie Greenwood, Karen Levy, Solon Barocas, Hoda Heidari, Jon Kleinberg
https://arxiv.org/abs/2506.03102
There was somebody fussing in my replies to my last link to my blog post about Medium (I don’t see them now; they probably blocked me, but their specific words don’t really matter), and the gist of their message was that they didn’t like that site. On the modern internet, if you have an issue with content written by humans, with no surveillance ads, that doesn’t allow AI scraping or AI slop content, with a business model that makes money… I don’t know how to help you. Honestly.
Explicit Residual-Based Scalable Image Coding for Humans and Machines
Yui Tatsumi, Ziyue Zeng, Hiroshi Watanabe
https://arxiv.org/abs/2506.19297 https://…
This https://arxiv.org/abs/2504.07879 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csHC_…
Convergent and divergent connectivity patterns of the arcuate fasciculus in macaques and humans
Jiahao Huang, Ruifeng Li, Wenwen Yu, Anan Li, Xiangning Li, Mingchao Yan, Lei Xie, Qingrun Zeng, Xueyan Jia, Shuxin Wang, Ronghui Ju, Feng Chen, Qingming Luo, Hui Gong, Xiaoquan Yang, Yuanjing Feng, Zheng Wang
https://arxiv.org/abs/25…
Animals are abused and exploited in various ways for the sake of entertainment. LCA strongly opposes the use of animals in entertainment.
Animals have their own needs, interests, and rights, especially the right to engage in their natural behaviors in their natural habitat. https://www.lcanimal.org/…
Reluctant Interaction Inference after Additive Modeling
Yiling Huang, Snigdha Panigrahi, Guo Yu, Jacob Bien
https://arxiv.org/abs/2506.01219 https://
This https://arxiv.org/abs/2308.03734 has been replaced.
link: https://scholar.google.com/scholar?q=a
Aerones, which makes robots that can service wind turbines in about half the time of humans, raised $62M led by Activate Capital and S2G Investments (Virginia Furness/Reuters)
https://www.reuters.com/sustainability/cli
LLMs exhibit "potemkin understanding"!
Hope the methodology here is better than the last LLM-hater arxiv paper that came through
Must read it more carefully...
https://mathstodon.xyz/@gregeganSF/114758840374128081
This https://arxiv.org/abs/2503.08720 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCY_…
A controversial new book:
"We are eating the Earth"
says excess carbon dioxide in the atmosphere is a long-term challenge resulting from an otherwise cheerful story,
in which more people live better lives with fuller bellies and bigger dreams.
Lawyer-turned-science-cop Tim Searchinger discovered that the popular carbon solution of 20 years ago,
-- plant-based biofuels,
-- was a disaster in the making.
His insight: Land used to grow fuel w…
RoboEgo System Card: An Omnimodal Model with Native Full Duplexity
Yiqun Yao, Xiang Li, Xin Jiang, Xuezhi Fang, Naitong Yu, Aixin Sun, Yequan Wang
https://arxiv.org/abs/2506.01934
Requirements Elicitation Follow-Up Question Generation
Yuchen Shen, Anmol Singhal, Travis Breaux
https://arxiv.org/abs/2507.02858 https://
Can LLMs and humans be friends? Uncovering factors affecting human-AI intimacy formation
Yeseon Hong, Junhyuk Choi, Minju Kim, Bugeun Kim
https://arxiv.org/abs/2505.24658
This https://arxiv.org/abs/2409.18745 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
According to my napkin estimations, and also assuming (unreasonably) that we will have cracked 100% fusion efficiency of E=mc² within the next few weeks, humans will have burned up the entire mantle of the Earth in approximately 2200 years.
This is using the 2023 figures for present use (15000 Mtoe, i.e. about 7 tonnes of mass-equivalent annually) and its 2.2% growth rate which, while suddenly up from the long-standing 1.5%, is largely pre-LLMs.
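The napkin estimate can be roughly reproduced. The sketch below takes the 15000 Mtoe and 2.2% figures from the post and uses the conventional 41.868 GJ per tonne of oil equivalent. The mantle mass of about 4.0e24 kg is an assumption that the post does not state, and the helper names are hypothetical.

```python
import math

TOE_JOULES = 41.868e9      # 1 tonne of oil equivalent, in joules
C = 299_792_458.0          # speed of light, m/s
ANNUAL_USE_MTOE = 15_000   # 2023 energy use quoted in the post
GROWTH = 0.022             # annual growth rate quoted in the post
MANTLE_MASS_KG = 4.0e24    # ASSUMPTION: approximate mass of Earth's mantle


def annual_mass_kg(mtoe: float) -> float:
    """Mass converted per year at 100% E = mc^2 efficiency."""
    energy_joules = mtoe * 1e6 * TOE_JOULES
    return energy_joules / C**2


def years_to_consume(total_kg: float, first_year_kg: float, growth: float) -> float:
    """Solve first_year_kg * ((1 + g)^n - 1) / g = total_kg for n."""
    return math.log1p(total_kg * growth / first_year_kg) / math.log1p(growth)


first_year = annual_mass_kg(ANNUAL_USE_MTOE)
years = years_to_consume(MANTLE_MASS_KG, first_year, GROWTH)

if __name__ == "__main__":
    print(f"mass-equivalent of one year's use: {first_year / 1000:.1f} tonnes")
    print(f"years until the mantle is used up: {years:.0f}")
```

Under these assumptions the first-year figure is about 7 tonnes, as in the post, and the total comes out near 2,000 years, in the same ballpark as the post's figure. The exact value depends on the mantle mass chosen and on how the growth is compounded.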
WebChoreArena: Evaluating Web Browsing Agents on Realistic Tedious Web Tasks
Atsuyuki Miyai, Zaiying Zhao, Kazuki Egashira, Atsuki Sato, Tatsumi Sunada, Shota Onohara, Hiromasa Yamanishi, Mashiro Toyooka, Kunato Nishina, Ryoma Maeda, Kiyoharu Aizawa, Toshihiko Yamasaki
https://arxiv.org/abs/2506.01952
Can Machines Philosophize?
Michele Pizzochero, Giorgia Dellaferrera
https://arxiv.org/abs/2507.00675 https://arxiv.org/pdf/2507.00675…
FWIW, I’ve yet to see any indication that the use of LLMs (pseudo-AI) has improved the quality of phishing as measured by how much gets past technical defenses.
LLMs are a great leveler. They produce median texts to fit their prompts. They cannot produce anything that requires creativity. They cannot produce high-quality fakes because they cannot produce high-quality anything. They are a play on the fact that 50% of people are at or below median cognitive capacity.
Scaling Human Judgment in Community Notes with LLMs
Haiwen Li, Soham De, Manon Revel, Andreas Haupt, Brad Miller, Keith Coleman, Jay Baxter, Martin Saveski, Michiel A. Bakker
https://arxiv.org/abs/2506.24118
Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs
Chenjun Xu, Bingbing Wen, Bin Han, Robert Wolfe, Lucy Lu Wang, Bill Howe
https://arxiv.org/abs/2506.00582
This https://arxiv.org/abs/2504.14305 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
Stylometry recognizes human and LLM-generated texts in short samples
Karol Przystalski, Jan K. Argasiński, Iwona Grabska-Gradzińska, Jeremi K. Ochab
https://arxiv.org/abs/2507.00838
This https://arxiv.org/abs/2505.23436 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…
This https://arxiv.org/abs/2402.11871 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
Can LLMs Replace Humans During Code Chunking?
Christopher Glasz, Emily Escamilla, Eric O. Scott, Anand Patel, Jacob Zimmer, Colin Diggs, Michael Doyle, Scott Rosen, Nitin Naik, Justin F. Brunelle, Samruddhi Thaker, Parthav Poudel, Arun Sridharan, Amit Madan, Doug Wendt, William Macke, Thomas Schill
https://arxiv.org/abs/2506.198…
‘It’s death by a thousand cuts’: marine ecologist on the collapse of coral reefs https://www.theguardian.com/environment/ng-interactive/2025/jun/25/tipping-points-coral-oceans-climate-crisis-marine-ecologist
Re this from @…, one of the biggest tells about the current AI hype bubble:
Instead of replacing the work humans don’t want to do, it’s purporting to replace the work executives hate paying for.
Instead of an end to drudgery, they’re pushing an end to purpose and meaning.
And yeah, we’re going to end up cleaning up the AI’s messes. And doing its laundry.
https://mastodon.social/@PavelASamsonov/114598616057210141
In an Oxford study, LLMs correctly identified medical conditions 94.9% of the time when given test scenarios directly, vs. 34.5% when prompted by human subjects (Nick Mokey/VentureBeat)
https://venturebeat.com/ai/just-add-hu
EDEN: Entorhinal Driven Egocentric Navigation Toward Robotic Deployment
Mikolaj Walczak, Romina Aalishah, Wyatt Mackey, Brittany Story, David L. Boothe Jr., Nicholas Waytowich, Xiaomin Lin, Tinoosh Mohsenin
https://arxiv.org/abs/2506.03046
This https://arxiv.org/abs/2501.07071 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…
Who Gets the Kidney? Human-AI Alignment, Indecision, and Moral Values
John P. Dickerson, Hadi Hosseini, Samarth Khanna, Leona Pierce
https://arxiv.org/abs/2506.00079
This https://arxiv.org/abs/2503.05231 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
Monitoring Robustness and Individual Fairness
Ashutosh Gupta, Thomas A. Henzinger, Konstantin Kueffner, Kaushik Mallik, David Pape
https://arxiv.org/abs/2506.00496
Call center agents in Australia, Canada, Greece, and the US say they've been repeatedly mistaken for AI, as the industry rapidly integrates AI alongside humans (Morgan Meaker/Bloomberg)
https://www.bloomberg.com/news/articles/20
Here’s the real actual definition of “artificial intelligence,” the true technical meaning in research and engineering circles when it’s not being used as marketing hype.
Artificial intelligence is anything that
1. humans are generally good at, and
2. computers were recently bad at.
That’s it. That’s all it means. You’ll hear people refine it and dress it up, but that’s the heart of the definition. (Check Wikipedia!)
3/
A Hybrid Approach to Indoor Social Navigation: Integrating Reactive Local Planning and Proactive Global Planning
Arnab Debnath, Gregory J. Stein, Jana Kosecka
https://arxiv.org/abs/2506.02593
This https://arxiv.org/abs/2412.05718 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…
This https://arxiv.org/abs/2412.16772 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCY_…
[Thread] A new US paper shows the best frontier LLMs achieve 0% on hard real-life programming contest problems, a domain where expert humans still excel (Rohan Paul/@rohanpaul_ai)
https://x.com/rohanpaul_ai/status/1934751145400111572
For example:
- Telling apart photos of cats and dogs is “AI.”
- Making up fake but plausible facts on an arbitrary topic is “AI.”
- Walking is “AI.”
- Doing long multiplication is something we might call “intelligence” in humans, but it is not “AI” because computers have •always• been good at it.
- Winning at checkers •used• to be “AI” because computers didn’t use to be able to do that, but now it’s not “AI” because computers have been good at it for too long.
5/
This https://arxiv.org/abs/2505.21432 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
This https://arxiv.org/abs/2309.03678 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…
This https://arxiv.org/abs/2503.03480 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…