Tootfinder

Opt-in global Mastodon full text search. Join the index!

@Techmeme@techhub.social
2025-08-30 00:56:04

Apple's Xcode 26 beta 7 adds support for GPT-5 and Claude Sonnet 4, which developers can use by signing into their paid Claude account (Chance Miller/9to5Mac)
9to5mac.com/2025/08/28/new-xco

@heiseonline@social.heise.de
2025-07-28 13:04:00

KI-Update: GPT-5, Google Shopping, bias in AI, Unitree R1
The "KI-Update" delivers a summary of the most important AI developments every working day.
heise.de…

@arXiv_csCV_bot@mastoxiv.page
2025-07-29 12:16:31

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset
Yuhan Wang, Siwei Yang, Bingchen Zhao, Letian Zhang, Qing Liu, Yuyin Zhou, Cihang Xie
arxiv.org/abs/2507.21033

@arXiv_csCL_bot@mastoxiv.page
2025-06-30 10:20:20

Identifying a Circuit for Verb Conjugation in GPT-2
David Demitri Africa
arxiv.org/abs/2506.22105 arxiv.org/pdf/2506.…

@arXiv_csIR_bot@mastoxiv.page
2025-06-30 09:55:10

HLTCOE at LiveRAG: GPT-Researcher using ColBERT retrieval
Kevin Duh, Eugene Yang, Orion Weller, Andrew Yates, Dawn Lawrie
arxiv.org/abs/2506.22356

@arXiv_csLG_bot@mastoxiv.page
2025-08-29 10:27:01

GPT-FT: An Efficient Automated Feature Transformation Using GPT for Sequence Reconstruction and Performance Enhancement
Yang Gao, Dongjie Wang, Scott Piersall, Ye Zhang, Liqiang Wang
arxiv.org/abs/2508.20824

@arXiv_csCL_bot@mastoxiv.page
2025-07-31 09:32:21

AI-generated stories favour stability over change: homogeneity and cultural stereotyping in narratives generated by gpt-4o-mini
Jill Walker Rettberg, Hermann Wigers
arxiv.org/abs/2507.22445

@arXiv_csCY_bot@mastoxiv.page
2025-08-26 07:42:56

The GPT-4o Shock Emotional Attachment to AI Models and Its Impact on Regulatory Acceptance: A Cross-Cultural Analysis of the Immediate Transition from GPT-4o to GPT-5
Hiroki Naito
arxiv.org/abs/2508.16624

@arXiv_csHC_bot@mastoxiv.page
2025-08-28 08:40:31

Capabilities of GPT-5 across critical domains: Is it the next breakthrough?
Georgios P. Georgiou
arxiv.org/abs/2508.19259 arxiv.org/pdf/250…

@Techmeme@techhub.social
2025-08-30 13:40:46

A study focused on OpenAI's GPT-4o mini found that LLMs can be persuaded to comply with objectionable requests using the same tactics that persuade humans (Dina Bass/Bloomberg)

@ErikJonker@mastodon.social
2025-07-17 10:07:55

Really good news: everyone who has relevant Dutch content should make it available to GPT-NL!
Join GPT-NL! - gpt-nl.nl/samenwerken/doe-mee/

ChatGPT 5 power consumption could be as much as eight times higher than GPT 4
— Research institute estimates medium-sized GPT-5 response can consume up to 40 watt-hours of electricity

@arXiv_csSE_bot@mastoxiv.page
2025-07-31 09:52:01

From Articles to Code: On-Demand Generation of Core Algorithms from Scientific Publications
Cameron S. Movassaghi, Amanda Momenzadeh, Jesse G. Meyer
arxiv.org/abs/2507.22324

@EgorKotov@datasci.social
2025-07-28 12:27:17

Sonnet 4 is infinitely better than any of the GPT 4/o4 models for Typst, in my subjective opinion and recent experience. I don't know if it is trained on more recent Typst docs, or if it is just better at getting the logic of previously unseen code, but it solved my problem on second attempt, gpt whatever version did not with several (more than 10...) attempts.

@pbloem@sigmoid.social
2025-07-29 18:32:32

I can't imagine OpenAI is going to Trumpify GPT. I'm pretty sure they've spent more on instruction tuning data than they have on model training.
Even if they wanted to throw all that out and start again, you'd need to build a coherent image of what the bot should do in any given situation. The MAGA worldview simply doesn't have the required coherence.

@Techmeme@techhub.social
2025-07-25 14:20:50

Sources: GPT-5 shows improved performance in coding, particularly in practical software engineering tasks, outperforming prior OpenAI models and Claude Sonnet 4 (Stephanie Palazzolo/The Information)
theinformation.com/articles/op

@Simone21@mastodon.social
2025-08-29 11:47:31

When reality no longer offers any resistance...
When you are always right, no matter what you claim...
When compassion is only computed, not felt and given...

(still?) without a paywall:
tagesanzeiger.ch/chatgpt-staer

@arXiv_csAR_bot@mastoxiv.page
2025-08-26 07:31:46

GPT-OSS-20B: A Comprehensive Deployment-Centric Analysis of OpenAI's Open-Weight Mixture of Experts Model
Deepak Kumar, Divakar Yadav, Yash Patel
arxiv.org/abs/2508.16700

@heiseonline@social.heise.de
2025-08-22 13:04:00

KI-Update kompakt: Gemini Live, AI Mode, GPT-5 test, AI psychoses
The "KI-Update" delivers a summary of the most important AI developments every working day.

@metacurity@infosec.exchange
2025-08-11 08:06:00

GPT-5 surrendered to the hackers in 24 hours and gave out a "recipe" for a bomb, more likely 4o
itc.ua/en/news/gpt-5-surrender

@adlerweb@social.adlerweb.info
2025-07-23 10:00:04

[BLOG] Windows: Converting a BIOS/MBR installation to EFI/GPT adlerweb.info/blog/2025/07/23/

@Mediagazer@mstdn.social
2025-08-28 06:31:04

A test of seven AI chatbots' abilities to identify news photos' location, date, and photographer showed all failed to consistently identify photos' provenance (Columbia Journalism Review)
cjr.org/tow_center/why-ai-mode

@arXiv_eessIV_bot@mastoxiv.page
2025-08-20 08:46:20

Benchmarking GPT-5 for Zero-Shot Multimodal Medical Reasoning in Radiology and Radiation Oncology
Mingzhe Hu, Zach Eidex, Shansong Wang, Mojtaba Safari, Qiang Li, Xiaofeng Yang
arxiv.org/abs/2508.13192

@poppastring@dotnet.social
2025-08-29 02:42:08

Roadmap for AI in Visual Studio (September) #visualstudio :visualstudio:
devblogs.microsoft.com/visuals

@arXiv_csCR_bot@mastoxiv.page
2025-06-24 11:59:20

Security Assessment of DeepSeek and GPT Series Models against Jailbreak Attacks
Xiaodong Wu, Xiangman Li, Jianbing Ni
arxiv.org/abs/2506.18543

@tezoatlipoca@mas.to
2025-08-28 16:19:11

#AI #programming assist has been helpful for me. But I'm not losing my job anytime soon. Here's a simple example of why.
I have a script with
`cmd1`
I prompt GPT-4.1 to "now invoke cmd2 and cmd3 at the end". Good:
`cmd1`
`cmd2`
`cmd3`
"Add a 15 second pau…

@catsalad@infosec.exchange
2025-08-01 20:20:02

CatGPT users shocked to learn their chats were in Meow meow meow
🙀 cat-gpt.com/chat/meow

@Techmeme@techhub.social
2025-07-24 16:06:07

Source: OpenAI is planning to launch GPT-5 in early August, complete with mini and nano versions that will also be available through its API (Tom Warren/The Verge)
theverge.com/notepad-microsoft

@arXiv_csRO_bot@mastoxiv.page
2025-08-29 09:46:41

Learning Primitive Embodied World Models: Towards Scalable Robotic Learning
Qiao Sun, Liujia Yang, Wei Tang, Wei Huang, Kaixin Xu, Yongchao Chen, Mingyu Liu, Jiange Yang, Haoyi Zhu, Yating Wang, Tong He, Yilun Chen, Xili Dai, Nanyang Ye, Qinying Gu
arxiv.org/abs/2508.20840

@arXiv_csCY_bot@mastoxiv.page
2025-07-29 10:11:51

The Carbon Cost of Conversation, Sustainability in the Age of Language Models
Sayed Mahbub Hasan Amiri, Prasun Goswami, Md. Mainul Islam, Mohammad Shakhawat Hossen, Sayed Majhab Hasan Amiri, Naznin Akter
arxiv.org/abs/2507.20018

@arXiv_csHC_bot@mastoxiv.page
2025-07-30 08:22:41

Empowering Educators in the Age of AI: An Empirical Study on Creating custom GPTs in Qualitative Research Method education
Qian Huang, Thijs Willems
arxiv.org/abs/2507.21074

@inthehands@hachyderm.io
2025-06-26 22:40:10

This essay (ht @… ) offers a lot to chew on: some gems, some flubs, some quibblable provocations, some big insights. This sentence in particular stood out to me (context for it in the screenshot):
“Whether we’re reading or conversing, we want something to be meant, not just said.”
slate.com/life/2025/06/ai-chat

@heiseonline@social.heise.de
2025-08-13 13:55:00

GPT-5 too unfriendly: OpenAI returns to 4o as the default model
After a week of GPT-5, OpenAI responds to criticism: paying users get 4o back as the default. GPT-5's routing can now also be operated manually.

@arXiv_csCL_bot@mastoxiv.page
2025-06-30 10:21:10

Detection of Personal Data in Structured Datasets Using a Large Language Model
Albert Agisha Ntwali, Luca Rück, Martin Heckmann
arxiv.org/abs/2506.22305

@ErikJonker@mastodon.social
2025-08-06 12:22:31

Open Weight language models are released by OpenAI.
It will be interesting to see what the experiences are on local configurations; 16GB of (V)RAM is a lot, but attainable for a lot of people.
#openai

@kurtsh@mastodon.social
2025-08-08 00:32:18

Been using this for a while & it's excellent at providing accurate, thorough, fast but SAFE output... without needing to use the hardcore reasoning of o3-mini.
✅ Available today: GPT-5 in Microsoft 365 Copilot | Microsoft 365 Blog
microsoft.co…

@jlpiraux@wallonie-bruxelles.social
2025-08-15 11:36:59

"As inglorious as it may be, cost reduction currently makes sense from OpenAI's point of view. The company faces more competition than ever and is under growing pressure to find a way to make its business model profitable. Its anticipated valuation of some 500 billion dollars comes with the implicit expectation that it will soon find a way to make money."

@tinoeberl@mastodon.online
2025-08-31 16:18:02

#KINutzen #Retröt
#KünstlicheIntelligenz (artificial intelligence) can effectively #Verschwörungstheorien (conspiracy theories)

@mapcar@mastodon.sdf.org
2025-08-12 07:36:05

bloodinthemachine.com/p/gpt-5-

@stefanlaser@social.tchncs.de
2025-08-12 05:53:23

#ChatGPT5, it appears, is full of shit.
"#OpenAI’s products are no longer primarily aimed at consumers but at investors. As long as you avoid a full-scale user revolt (which GPT-5 actually did incur…), you can continue to assuage or even attract more backers on your path of relentless e…

@arXiv_csAI_bot@mastoxiv.page
2025-08-27 10:13:33

Sense of Self and Time in Borderline Personality. A Comparative Robustness Study with Generative AI
Marcin Moskalewicz, Anna Sterna, Marek Pokropski, Paula Flores
arxiv.org/abs/2508.19008

@lmc@mastodon.social
2025-06-24 04:04:38

this is probably the one and only time you’ll ever hear me say how much I appreciate AI
#MLB

@arXiv_csCL_bot@mastoxiv.page
2025-07-31 09:30:21

NeedleChain: Measuring Intact Long-Context Reasoning Capability of Large Language Models
Hyeonseok Moon, Heuiseok Lim
arxiv.org/abs/2507.22411

@arXiv_csCY_bot@mastoxiv.page
2025-08-26 09:05:56

Leveraging Multi-Source Textural UGC for Neighbourhood Housing Quality Assessment: A GPT-Enhanced Framework
Qiyuan Hong, Huimin Zhao, Ying Long
arxiv.org/abs/2508.16657

@michabbb@social.vivaldi.net
2025-08-26 11:02:51

That's the output of #GPT5 *HIGH* 😞
I get it now, if many people complain about GPT-5 🙄
and yes: i have never seen anything similar from sonnet.....
#ai #coding

@heiseonline@social.heise.de
2025-08-14 04:08:00

#heiseshow: GPT-5, ICE L, solar subsidies
On the #heiseshow: OpenAI releases GPT-5, Deutsche Bahn celebrates the ICE L approval, and there is a stir over the new solar subsidy.

@arXiv_csSE_bot@mastoxiv.page
2025-07-29 10:09:02

From Prompt to Pipeline: Large Language Models for Scientific Workflow Development in Bioinformatics
Khairul Alam, Banani Roy
arxiv.org/abs/2507.20122

@arXiv_csCV_bot@mastoxiv.page
2025-08-15 10:25:22

Performance of GPT-5 in Brain Tumor MRI Reasoning
Mojtaba Safari, Shansong Wang, Mingzhe Hu, Zach Eidex, Qiang Li, Xiaofeng Yang
arxiv.org/abs/2508.10865

@poppastring@dotnet.social
2025-08-12 02:35:01

Just in time for the version that seems to be struggling the most...
"Apple Intelligence’s ChatGPT integration will use GPT-5 starting with iOS 26"
theverge.com/news/756799/apple

@Techmeme@techhub.social
2025-08-07 10:05:52

A now-deleted GitHub blog post reveals GPT-5, available as gpt-5, gpt-5-mini, gpt-5-nano, and gpt-5-chat with "major improvements" in reasoning, code, and more (Tom Warren/The Verge)
theverge.com/news/752091/opena

@heiseonline@social.heise.de
2025-08-18 13:04:00

KI-Update kompakt: power grids, o3 vs. GPT-5, Claude, AI buzzwords, FrOSCon
The "KI-Update" delivers a summary of the most important AI developments every working day.

@ErikJonker@mastodon.social
2025-08-05 06:04:03

GPT-NL, a fine initiative that is making progress; also note the goal: "Finally, it is good to keep in mind that GPT-NL is being developed for specific tasks: summarizing, simplifying, and extracting information. The goal of GPT-NL is not to develop a generic knowledge model."
Read this blog:

@Techmeme@techhub.social
2025-08-14 02:41:03

GPT-5 review: GPT-5-Thinking is a substantial upgrade over o3, Auto is only useful for free tier users, picking the right model still matters, and more (Zvi Mowshowitz/Don't Worry About the Vase)
thezvi.substack.com/p/gpt-5s-a

@arXiv_csHC_bot@mastoxiv.page
2025-06-23 11:32:40

Can GPT-4o Evaluate Usability Like Human Experts? A Comparative Study on Issue Identification in Heuristic Evaluation
Guilherme Guerino, Luiz Rodrigues, Bruna Capeleti, Rafael Ferreira Mello, André Freire, Luciana Zaina
arxiv.org/abs/2506.16345

@arXiv_csCY_bot@mastoxiv.page
2025-08-28 08:13:51

Should LLMs be WEIRD? Exploring WEIRDness and Human Rights in Large Language Models
Ke Zhou, Marios Constantinides, Daniele Quercia
arxiv.org/abs/2508.19269

@tinoeberl@mastodon.online
2025-08-27 16:18:02

#SteadyCommunityContent #Retröt
The use of #GPT4 in #Diagnostik (diagnostics) shows …

@arXiv_csCR_bot@mastoxiv.page
2025-06-19 08:11:43

LLM vs. SAST: A Technical Analysis on Detecting Coding Bugs of GPT4-Advanced Data Analysis
Madjid G. Tehrani, Eldar Sultanow, William J. Buchanan, Mahkame Houmani, Christel H. Djaha Fodja
arxiv.org/abs/2506.15212

@Techmeme@techhub.social
2025-08-15 20:55:50

Some developers say GPT-5 excels at technical reasoning and planning coding tasks and is cost-effective, but Claude Opus and Sonnet still produce better code (Lauren Goode/Wired)
wired.com/story/gpt-5-coding-r

@heiseonline@social.heise.de
2025-08-15 13:03:00

KI-Update kompakt: unfriendly GPT-5, Meta, AI maternal instincts, cancer screening
The "KI-Update" delivers a summary of the most important AI developments every working day.

@arXiv_csCV_bot@mastoxiv.page
2025-08-14 10:16:42

Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation
Junyan Ye, Dongzhi Jiang, Zihao Wang, Leqi Zhu, Zhenghao Hu, Zilong Huang, Jun He, Zhiyuan Yan, Jinghua Yu, Hongsheng Li, Conghui He, Weijia Li
arxiv.org/abs/2508.09987

@ErikJonker@mastodon.social
2025-08-07 20:33:26

Ethan Mollick about GPT-5,
#AI #GPT5

@Techmeme@techhub.social
2025-08-28 17:25:42

OpenAI makes Realtime API generally available with new features, including MCP support, and launches gpt-realtime, its most advanced speech-to-speech model (Sabrina Ortiz/ZDNET)
zdnet.com/article/openai-gives

@arXiv_csHC_bot@mastoxiv.page
2025-08-26 11:13:36

Caregiver-in-the-Loop AI: A Simulation-Based Feasibility Study for Dementia Task Verification
Joy Lai, David Black, Kelly Beaton, Bing Ye, Alex Mihailidis
arxiv.org/abs/2508.18267

@Techmeme@techhub.social
2025-08-07 10:16:19

Internal OpenAI code suggests a tiered GPT-5 rollout: free users get basic GPT-5, Plus users get advanced reasoning, and Pro gets research-level performance (Alexey Shabanov/TestingCatalog)
testingcatalog.com/leaked-deta

@arXiv_csCY_bot@mastoxiv.page
2025-07-29 10:24:42

VArsity: Can Large Language Models Keep Power Engineering Students in Phase?
Samuel Talkington, Daniel K. Molzahn
arxiv.org/abs/2507.20995

@arXiv_csCV_bot@mastoxiv.page
2025-08-19 12:08:30

Has GPT-5 Achieved Spatial Intelligence? An Empirical Study
Zhongang Cai, Yubo Wang, Qingping Sun, Ruisi Wang, Chenyang Gu, Wanqi Yin, Zhiqian Lin, Zhitao Yang, Chen Wei, Xuanke Shi, Kewang Deng, Xiaoyang Han, Zukai Chen, Jiaqi Li, Xiangyu Fan, Hanming Deng, Lewei Lu, Bo Li, Ziwei Liu, Quan Wang, Dahua Lin, Lei Yang
arxiv.org/abs/2…

@heiseonline@social.heise.de
2025-08-08 13:04:01

KI-Update: Chat GPT-5, AI translators, AI and universities, AI guilt, Nvidia and China
The "KI-Update" delivers a summary of the most important AI developments every working day.

@arXiv_csCL_bot@mastoxiv.page
2025-08-28 11:06:41

Crosslisted article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[1/1]:
- Capabilities of GPT-5 across critical domains: Is it the next breakthrough?
Georgios P. Georgiou

@Techmeme@techhub.social
2025-08-13 02:45:59

OpenAI restores GPT-4o as default for all paid ChatGPT users, vows "plenty of notice" if 4o is deprecated, raises GPT-5 Thinking rate limits to 3K messages/week (Carl Franzen/VentureBeat)
venturebeat.com/a…

@Techmeme@techhub.social
2025-08-08 14:01:35

GPT-5's system card says gpt-5-thinking has a hallucination rate of 4.5% with browsing enabled, compared to gpt-5-main's 9.6%, GPT-4o's 12.9%, and o3's 12.7% (Cecily Mauran/Mashable)
mashable.com/article/openai-gp

@ErikJonker@mastodon.social
2025-08-11 13:53:53

A good blog, but the most relevant line is "GPT-5 may be a moderate quantitative improvement (and it may be cheaper) but it still fails in all the same qualitative ways as its predecessors". Very true, but having now used it for a few days, I do notice those moderate improvements. And I was already aware of all its failings.
For me, in day-to-day use it is better, and that is what counts, for me at least. Oh, and always (yes, always) check outcomes before you use them.

@arXiv_csCL_bot@mastoxiv.page
2025-07-28 13:02:38

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[1/3]:
- Comparison of pipeline, sequence-to-sequence, and GPT models for end-to-end relation extraction: ...
Shashank Gupta, Xuguang Ai, Ramakanth Kavuluru

@Techmeme@techhub.social
2025-08-05 17:11:02

OpenAI releases gpt-oss-120b and gpt-oss-20b, its first open-weight models since GPT-2; the smaller model can run locally on a consumer device with 16GB of RAM (Reece Rogers/Wired)
wired.com/story/openai-just-re

@arXiv_csHC_bot@mastoxiv.page
2025-06-19 08:23:34

Optimizing Web-Based AI Query Retrieval with GPT Integration in LangChain A CoT-Enhanced Prompt Engineering Approach
Wenqi Guan, Yang Fang
arxiv.org/abs/2506.15512

@arXiv_csCV_bot@mastoxiv.page
2025-08-18 09:56:10

Is ChatGPT-5 Ready for Mammogram VQA?
Qiang Li, Shansong Wang, Mingzhe Hu, Mojtaba Safari, Zachary Eidex, Xiaofeng Yang
arxiv.org/abs/2508.11628

@Techmeme@techhub.social
2025-08-07 17:19:27

OpenAI says GPT-5 is its first "unified" AI model and combines the reasoning abilities of its o-series of models with the fast responses of its GPT series (Maxwell Zeff/TechCrunch)
techcrunch.com/2025/08/07/open

@arXiv_csCL_bot@mastoxiv.page
2025-08-14 09:54:12

Performance of GPT-5 Frontier Models in Ophthalmology Question Answering
Fares Antaki, David Mikhail, Daniel Milad, Danny A Mammo, Sumit Sharma, Sunil K Srivastava, Bing Yu Chen, Samir Touma, Mertcan Sevgi, Jonathan El-Khoury, Pearse A Keane, Qingyu Chen, Yih Chung Tham, Renaud Duval
arxiv.org/abs/2508.09956

@arXiv_csCL_bot@mastoxiv.page
2025-07-28 09:57:51

TokenSmith: Streamlining Data Editing, Search, and Inspection for Large-Scale Language Model Training and Interpretability
Mohammad Aflah Khan, Ameya Godbole, Johnny Tian-Zheng Wei, Ryan Wang, James Flemings, Krishna Gummadi, Willie Neiswanger, Robin Jia
arxiv.org/abs/2507.19419

@Techmeme@techhub.social
2025-08-07 18:05:48

GPT-5 hands-on: it exudes competence but doesn't feel like a dramatic leap ahead of other LLMs, and the pricing is aggressively competitive with other providers (Simon Willison/Simon Willison's Weblog)
simonwillison.net/2025/Aug/7/g

@ErikJonker@mastodon.social
2025-08-08 09:02:06

AI literacy also means understanding that if GPT-5 cannot correctly count the number of B's in the word "Blueberry", that does not mean the model is worthless or unusable... Models like GPT-5 are good at certain things and bad at others. We have to learn how and where to apply them and where not to. That also means weighing the costs against the benefits, asking whether something like ChatGPT is really needed, and making trade-offs regarding bias, ethics, etc.
("GPT-5 Thinking", by the way, does do it corr…

@Techmeme@techhub.social
2025-08-08 10:20:55

Apple says Apple Intelligence will use OpenAI's GPT-5 on iOS 26, iPadOS 26, and macOS Tahoe 26, with the system updates expected to arrive in September (Zac Hall/9to5Mac)
9to5mac.com/2025/08/07/apple-i

@Techmeme@techhub.social
2025-08-07 19:05:48

OpenAI highlights GPT-5 scores on math, coding, and health benchmarks: 94.6% on AIME 2025 without tools, 74.9% on SWE-bench Verified, 46.2% on HealthBench Hard (Carl Franzen/VentureBeat)
venturebeat.com/ai/openai-laun

@Techmeme@techhub.social
2025-08-10 19:25:36

GPT-5's release was underwhelming, offering incremental improvements and failing to meet expectations, showing that pure scaling simply isn't the path to AGI (Gary Marcus/Marcus on AI)
garymarcus.substack.com/p/gpt-

@arXiv_csCL_bot@mastoxiv.page
2025-08-27 10:09:43

ThinkDial: An Open Recipe for Controlling Reasoning Effort in Large Language Models
Qianyu He, Siyu Yuan, Xuefeng Li, Mingxuan Wang, Jiangjie Chen
arxiv.org/abs/2508.18773

@Techmeme@techhub.social
2025-08-13 11:50:55

GPT-5's router directs queries based on complexity and intent, helping OpenAI allocate compute for low-value informational and high-value commercial requests (SemiAnalysis)
semianalysis.com/2025/08/13/gp

@Techmeme@techhub.social
2025-08-01 18:25:51

Source: GPT-5 improvements won't be comparable to the leaps in performance of earlier models, such as between GPT-3 in 2020 and GPT-4 in 2023 (The Information)
theinformation.com/articles/in

@Techmeme@techhub.social
2025-08-08 19:55:55

OpenAI says ChatGPT Pro users can select old models for now but plans to deprecate them in 60 days; Sam Altman says Plus users will be able to keep using GPT-4o (Joanna Stern/Joanna Stern's Newsletter)
joannastern.beehiiv.com/p/gpt-

@Techmeme@techhub.social
2025-08-07 18:41:02

OpenAI releases GPT-5 pro, a version with extended reasoning exclusive to ChatGPT Pro subscribers, saying it scored 88.4% without tools on the GPQA benchmark (Maximilian Schreiner/The Decoder)
the-decoder.com/openai-claims-

@Techmeme@techhub.social
2025-07-12 18:25:56

Moonshot's Kimi K2 uses a 1T-parameter MoE architecture with 32B active parameters and outperforms models like GPT-4.1 and DeepSeek-V3 on key benchmarks (Michael Nuñez/VentureBeat)
venturebeat.com/ai/moonshot-ai

@Techmeme@techhub.social
2025-08-08 05:10:49

During the GPT-5 livestream, OpenAI showed two charts whose scales were all over the place, with Sam Altman later calling one "a mega chart screwup from us" (Jay Peters/The Verge)
theverge.com/news/756444/opena

@Techmeme@techhub.social
2025-08-11 18:20:58

xAI makes Grok 4 free for all users worldwide after making Grok Imagine free for all US users; Grok 4 Heavy remains exclusive to SuperGrok Heavy subscribers (Omair Pall/Mashable India)
in.mashable.com/tech/98367/elo

@Techmeme@techhub.social
2025-08-15 13:10:54

Sam Altman says OpenAI "totally screwed up some things" on the GPT-5 rollout, confirms plans to fund a brain-computer interface startup to rival Neuralink (Alex Heath/The Verge)
theverge.com/command-line-news

@Techmeme@techhub.social
2025-08-08 14:35:51

Sam Altman says OpenAI should prioritize growth and its investments in training and compute "for a long time", even if it delays its path to profitability (Ashley Capoot/CNBC)
cnbc.com/2025/08/08/chatgpt-gp

@Techmeme@techhub.social
2025-08-14 16:46:02

Q&A with OpenAI VP and Head of ChatGPT Nick Turley on ChatGPT's future, showing ads in chatbots, hallucinations, GPT-5 blowback, 4o, subscriptions, and more (Alex Heath/The Verge)
theverge.com/decoder-podcast-w

@Techmeme@techhub.social
2025-08-13 06:30:55

OpenAI introduces "Auto", "Fast", and "Thinking" settings for GPT-5 in ChatGPT's model picker, with "Auto" similar to the GPT-5 model router announced earlier (Maxwell Zeff/TechCrunch)
techcrunch.com/2025/08/12/chat

@Techmeme@techhub.social
2025-08-08 16:05:58

With GPT-5's launch, OpenAI has removed its older models like GPT-4o and o3 from the ChatGPT model selector, sparking a backlash from some users (Michael Kan/PCMag)
pcmag.com/news/openai-faces-ba

@Techmeme@techhub.social
2025-08-16 21:20:53

A new Artificial Analysis benchmark, focusing on OpenAI's gpt-oss-120b, shows how open-weight LLMs exhibit inconsistent performance across hosting providers (Simon Willison/Simon Willison's Weblog)
simonwillison.net/2025/Aug/15/

@Techmeme@techhub.social
2025-08-07 17:55:49

GPT-5 will use "safe completions", a training approach to maximize model helpfulness within safety constraints and an improvement over refusal-based training (OpenAI)
openai.com/index/gpt-5-safe-co

@Techmeme@techhub.social
2025-08-23 10:40:46

Q&A with David Luan, head of Amazon's AGI research lab, on leaving Adept in a reverse acquihire deal, why he believes progress on AI models has slowed, and more (Alex Heath/The Verge)
theverge.com/decoder-podcast-w