Tootfinder

Opt-in global Mastodon full text search. Join the index!

@heiseonline@social.heise.de
2025-12-12 04:09:00

GPT-5.2: New AI model from OpenAI is meant to better support office work
Just one month after GPT-5.1, the ChatGPT developers are releasing a new AI model. GPT-5.2 is said to produce better spreadsheets, presentations, and code.

@Techmeme@techhub.social
2025-12-11 18:18:02

OpenAI says GPT-5.2 Thinking hallucinates less than GPT-5.1 and has improved reliability for agentic AI needs; pre-release testers include Notion, Box, Shopify (Hayden Field/The Verge)
theverge.com/ai-artificial-int

@Techmeme@techhub.social
2026-02-11 00:15:57

OpenAI updates ChatGPT's deep research tool with GPT-5.2, a full-screen report view, and an option to focus research on specific websites (Matthias Bastian/The Decoder)
the-decoder.com/openais-deep-r

@Techmeme@techhub.social
2025-12-12 07:01:18

GPT-5.2 models match GPT-5 and 5.1 with a 400K context window and 128K max output tokens, but have a newer knowledge cutoff of Aug. 31, 2025 vs. Sept. 30, 2024 (Simon Willison/Simon Willison's Newsletter)
simonw.substack.com/p/gpt-52-a

@heiseonline@social.heise.de
2025-12-12 05:18:00

Friday: Criticism of eID card over money laundering, new OpenAI model as office helper
eID card too easy to obtain by fraud; GPT-5.2 for professional users; Disney versus Google AI over copyright; criticism of the EU over VMware; robot movements explained

@Techmeme@techhub.social
2025-12-11 18:06:51

OpenAI launches GPT-5.2, its "best model yet," in Instant, Thinking, and Pro variants, with significant improvements in writing, coding, and reasoning (Maxwell Zeff/Wired)
wired.com/story/openai-gpt-lau

@Techmeme@techhub.social
2025-12-11 19:16:04

[Thread] GPT-5.2 is now available in the API, priced at $1.75/1M input and $14/1M output tokens; GPT-5.2 Pro is priced at $21/1M input and $168/1M output tokens (@openaidevs)
x.com/openaidevs/status/199918
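
To make the per-token prices in the post above concrete, here is a minimal sketch of the arithmetic. The only figures taken from the post are the four prices; the dictionary keys are illustrative labels (not confirmed API model identifiers), and the function name and the example token counts are hypothetical.

```python
# USD per 1M tokens, as quoted in the post above.
# The keys are illustrative labels, not confirmed API model identifiers.
PRICES = {
    "gpt-5.2": {"input": 1.75, "output": 14.00},
    "gpt-5.2-pro": {"input": 21.00, "output": 168.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the quoted list prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 10_000, 2_000):.4f}")
```

At these rates the Pro variant costs 12 times as much as the base model for both input and output tokens, so any request costs 12 times as much regardless of its input/output mix.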

@Mediagazer@mstdn.social
2026-01-10 07:26:09

Researchers say GPT 4.1, Claude 3.7 Sonnet, Gemini 2.5 Pro, and Grok 3 can reproduce long excerpts from books they were trained on when strategically prompted (Alex Reisner/The Atlantic)
theatlantic.com/technology/202

@Techmeme@techhub.social
2025-12-11 18:45:58

OpenAI says GPT‑5.2 Thinking beats or ties industry professionals on 70.9% of GDPval knowledge work tasks, delivering outputs at >11x the speed and <1% the cost (OpenAI)
openai.com/index/introducing-g

@Techmeme@techhub.social
2025-12-12 17:06:30

Companies are updating insider trading policies to cover prediction markets; Kalshi and others are pushing for federal oversight, including of insider trading (Rocket Drew/The Information)
theinformation.com/articles/po

@Techmeme@techhub.social
2026-01-10 07:25:54

Researchers say GPT 4.1, Claude 3.7 Sonnet, Gemini 2.5 Pro, and Grok 3 can reproduce long excerpts from books they were trained on when strategically prompted (Alex Reisner/The Atlantic)

@tante@tldr.nettime.org
2026-01-28 10:00:05

"What Lin and Cursor achieved was to show that an AI agent can generate millions of lines of code that’s lifted from other projects, and that don’t compile, let alone work."
(Original title: Cursor lies about vibe-coding a web browser with AI)
pivot-to-ai.c…

@digitalnaiv@mastodon.social
2025-12-31 10:19:00

In the #FAZ, Marcus Schwarze presents the top artificial intelligence tools of 2026 and praises the image generation features of Gemini and ChatGPT.
My experience: if both image AIs fail at the German umlaut in "Ausgewählt", you cannot speak of truly good tools. Sorry. That is simply not intelligent at all.

A damning new study could put AI companies on the defensive.
In it, Stanford and Yale researchers found compelling evidence that AI models are actually copying all that data, not “learning” from it.
Specifically, four prominent LLMs (OpenAI’s GPT-4.1, Google’s Gemini 2.5 Pro, xAI’s Grok 3, and Anthropic’s Claude 3.7 Sonnet) happily reproduced lengthy excerpts from popular, and protected, works with a stunning degree of accuracy.
They fou…

@Techmeme@techhub.social
2026-02-02 18:08:02

OpenAI launches a Codex app for macOS, designed to serve as a command center for managing AI agents, and says Codex usage has nearly doubled since mid-December (David Gewirtz/ZDNET)
zdnet.com/article/openai-codex

@Techmeme@techhub.social
2026-01-27 18:31:03

OpenAI for Science launches Prism, a free LaTeX-based text editor that embeds GPT-5.2 to assist in scientific paper drafting and citation management (Will Douglas Heaven/MIT Technology Review)
technologyreview.com/2026/01/2

@Techmeme@techhub.social
2025-11-30 06:40:47

Alibaba Technical Report: Qwen3-VL beats GPT-5 and Gemini 2.5 Pro on visual tasks and has 100% accuracy on "needle-in-a-haystack" tests for 30-minute videos (Jonathan Kemper/The Decoder)
the-decoder.com/qwen3-vl-can-s

@Techmeme@techhub.social
2025-12-18 18:57:30

OpenAI releases GPT‑5.2-Codex, with improvements on long-horizon work through context compaction, stronger performance on large code changes, and more (OpenAI)
openai.com/index/introducing-g

@Techmeme@techhub.social
2026-01-25 01:50:58

Tests show GPT-5.2 on ChatGPT citing Grokipedia as a source on a wide range of queries, including on Iranian conglomerates and Holocaust deniers (Aisha Down/The Guardian)
theguardian.com/technology/202

@mariyadelano@hachyderm.io
2025-11-13 22:00:11

Curious that whenever someone shows me “the cool #AI flow” they built that’s supposed to be impressive, the conversation goes the same way:
Stage 1: “But you don’t understand. You don’t like AI because you haven’t used it right. Let me show you how much you can do with it.”
Stage 2: “Here are the steps in the flow and the instructions I feed to this agent / custom GPT / Claude project. I tell it to do X, reference document Y, and aim for Z.”
Stage 3: “Now, let me show you the results it gives.”
*Writes task, presses to run the prompt.*
Stage 4: “Umm sorry it’s taking a while. It’s fast but not instant. And by the way, the prompt isn’t perfect, you can definitely make it better. I just threw this together real quick the other day. It makes some mistakes, but it’s really good.”
Stage 5: “Uuuuuuh actually don’t look at the output.” *scrolls or stops screen share or pulls device away.*
“You know it’s already doing so well, if I do more prompt engineering it will get really good but I need to give it better instructions. And it ran just fine last night, I don’t know what’s up with it. And this is a cheap model, if we use another model it will be better.”
Stage 6: “You know, you really shouldn’t judge this so much. The technology will improve, it will get there sooner than you know and then you’ll regret not trying it sooner.”
So curious that this keeps happening 🤷‍♀️
#LLMs #work #tech #AIBubble

@Techmeme@techhub.social
2026-01-26 17:50:42

Qwen releases Qwen3-Max-Thinking, its flagship reasoning model that it says demonstrates performance comparable to models such as GPT-5.2 Thinking and Opus 4.5 (Qwen)
qwen.ai/blog?id=qwen3-max-thin

@Techmeme@techhub.social
2025-11-13 20:41:04

Baidu unveils Ernie 5.0, an AI model to process and generate text, images, audio, and video, claiming it beats GPT-5-High and Gemini 2.5 Pro on some benchmarks (Carl Franzen/VentureBeat)
venturebeat.com/ai/baidu-unvei

@Techmeme@techhub.social
2025-11-18 20:55:53

Gemini 3 Pro is priced at $2-$4 per 1M input tokens and $12-$18 per 1M output tokens, cheaper than Claude Sonnet 4.5 but more expensive than GPT-5.1 (Simon Willison/Simon Willison's Weblog)
simonwillison.net/2025/Nov/18/

@Techmeme@techhub.social
2025-12-17 16:15:44

Google makes Gemini 3 Flash the default model in Gemini app and Search's AI mode; it scored 33.7% without tool use on Humanity's Last Exam vs. GPT-5.2's 34.5% (Ivan Mehta/TechCrunch)
techcrunch.com/2025/12/17/goog

@Techmeme@techhub.social
2025-12-16 17:21:00

OpenAI launches FrontierScience, a benchmark to measure models' expert-level scientific reasoning with 700 questions, finding GPT-5.2 is its strongest model (OpenAI)
openai.com/index/frontierscien

@Techmeme@techhub.social
2025-11-13 20:35:45

Anthropic open sources a method to score AI model political evenhandedness; Gemini 2.5 Pro got 97%, Grok 4 96%, Claude Opus 4.1 95%, GPT-5 89%, and Llama 4 66% (Ina Fried/Axios)
axios.com/2025/11/13/anthropic