Tootfinder

Opt-in global Mastodon full text search. Join the index!

@Techmeme@techhub.social
2026-01-01 16:15:29

DeepSeek researchers detail a new mHC architecture they used to train 3B, 9B, and 27B models, finding it scaled without adding significant computational burden (Vincent Chow/South China Morning Post)
scmp.com/tech/big-tech/article

@Techmeme@techhub.social
2025-12-01 15:21:59

DeepSeek releases DeepSeek-V3.2 and DeepSeek-V3.2-Speciale, calling them "reasoning-first models built for agents", after releasing V3.2-Exp in September (Saritha Rai/Bloomberg)
bloomberg.com/news/articles/20

@heiseonline@social.heise.de
2025-11-20 10:46:00

DeepSeek-R1 produces insecure code when prompts contain politically sensitive terms
The Chinese AI DeepSeek-R1 generates worse code when terms such as Falun Gong or Taiwan appear in the prompt, security researchers have found.

@ErikJonker@mastodon.social
2025-12-03 09:37:32

Short video about DeepSeek V3.2 that describes where it improves on other models and why it is relevant, including in a geopolitical context.
#AI

@heiseonline@social.heise.de
2025-10-21 16:02:00

DeepSeek-OCR: How images help chatbots hold long conversations
Chinese AI researchers want to use images to keep chatbots fast and cheap when handling long contexts. Optical context compression is meant to improve AI assistants.

@rperezrosario@mastodon.social
2025-10-15 00:30:39

AI model builder Eric Hartford unpacks the results of the September 25, 2025 NIST report on DeepSeek ("Evaluation of DeepSeek AI Models") and reports that the lack of evidence of malicious code, backdoors, or data exfiltration should lead to questions about the report's motives, framing, and implications.
"The Demonization of DeepSeek -
How NIST Turned Open Science into a Security Scare"

A black and white line art illustration showing a demonic figure (purportedly NIST in this context), a computer monitor showing the word "DeepSeek", a scared-looking scientist, and a microscope. Designed and executed by ChatGPT 4o.

@seeingwithsound@mas.to
2025-12-10 07:26:57

How Qwen 3 outcompetes OpenAI and DeepSeek apidog.com/blog/qwen-3-outcomp "Qwen 3 shines with its variety of model sizes", "small models for edge devices or massive ones for heavy lifting"

@Techmeme@techhub.social
2025-11-28 22:16:00

DeepSeek says its new DeepSeekMath-V2 model got gold-medal level status on the International Mathematical Olympiad 2025 and Chinese Mathematical Olympiad 2024 (Matthias Bastian/The Decoder)
the-decoder.com/deepseekmath-v

@metacurity@infosec.exchange
2025-11-16 20:50:19

Inspections by Taiwan's National Security Bureau (NSB) of five Chinese generative AI apps -- Deepseek, Doubao (豆包), Yiyan (文心一言), Tongyi (通義千問), and Yuanbao (騰訊元寶) -- found violations of users' communication security across several indicators.
focustaiwan.tw/cross-strait/20

@Techmeme@techhub.social
2025-12-04 04:25:41

Nvidia says its GB200 Blackwell AI servers boost performance 10x compared to H200 servers for MoE models like Moonshot's Kimi K2 Thinking and DeepSeek's R1 (Stephen Nellis/Reuters)
reuters.com/world/china/nvidia

@ErikJonker@mastodon.social
2025-10-10 06:28:27

Interesting, a lab that wants to build open source (!) attracts a lot of funding 🤔
Reflection AI raises $2B to be America's open frontier AI lab, challenging DeepSeek | TechCrunch techcrunch.com/2025/10/09/refl

@Techmeme@techhub.social
2025-10-20 18:15:46

DeepSeek releases DeepSeek-OCR, a vision language model designed for efficient vision-text compression, enabling longer contexts with less compute (Jonathan Kemper/The Decoder)
the-decoder.com/deepseeks-ocr-

@metacurity@infosec.exchange
2025-10-27 11:05:26

Chatbots Are Pushing Sanctioned Russian Propaganda
wired.com/story/chatbots-are-p

@ErikJonker@mastodon.social
2025-12-30 20:31:00

What a great read and overview, recommended!
"The State Of LLMs 2025: Progress, Problems, and Predictions"
#AI

@Techmeme@techhub.social
2025-12-10 12:16:02

Sources: DeepSeek is developing its new AI model using several thousand Nvidia Blackwell chips, which were smuggled into China via third-party countries (The Information)
theinformation.com/articles/de

@tinoeberl@mastodon.online
2025-11-15 15:16:07

"In the next ten to 20 years, AI could take over the rest of the work (that is done by humans), and society could face a massive challenge."
Hmmm... yeah, sure.
At least they clearly name the immense worldwide job cuts (#Jobabbau).

@Techmeme@techhub.social
2025-10-27 05:40:40

Papers, patents, and tenders show China's military is integrating DeepSeek and Qwen models in weapons like AI-powered drones, and continues to use Nvidia chips (Reuters)
reuters.com/world/asia-pacific

@jorgecandeias@mastodon.social
2025-12-14 15:08:20

RE: mastodon.social/@bruces/115718
Holy crap. So many people trying to get fat while the hype lasts, at the expense of pretty much everybody else.
And I don't see Mistral or DeepSeek here, so these may be just the American ones…

@arXiv_csLG_bot@mastoxiv.page
2025-10-14 13:40:48

MATH-Beyond: A Benchmark for RL to Expand Beyond the Base Model
Prasanna Mayilvahanan, Ricardo Dominguez-Olmedo, Thaddäus Wiedemer, Wieland Brendel
arxiv.org/abs/2510.11653

@Techmeme@techhub.social
2025-11-26 11:06:09

PitchBook: US AI and robotics VC deals are up over 4x since 2023 to $160B so far in 2025, while comparable China deals are just $10B, up from $9.24B in 2023 (CNBC)
cnbc.com/2025/11/26/cnbc-china

@arXiv_csAI_bot@mastoxiv.page
2025-10-10 07:33:08

Base Models Know How to Reason, Thinking Models Learn When
Constantin Venhoff, Iván Arcuschin, Philip Torr, Arthur Conmy, Neel Nanda
arxiv.org/abs/2510.07364

@arXiv_csHC_bot@mastoxiv.page
2025-10-14 08:48:58

ROBOPSY PL[AI]: Using Role-Play to Investigate how LLMs Present Collective Memory
Margarete Jahrmann, Thomas Brandstetter, Stefan Glasauer
arxiv.org/abs/2510.09874

@arXiv_csCL_bot@mastoxiv.page
2025-10-09 10:26:21

Overview of the Plagiarism Detection Task at PAN 2025
André Greiner-Petter, Maik Fröbe, Jan Philip Wahle, Terry Ruas, Bela Gipp, Akiko Aizawa, Martin Potthast
arxiv.org/abs/2510.06805

@Techmeme@techhub.social
2025-10-23 06:46:03

A look at DeepSeek's rise in Africa, as China's lightweight, low-cost AI models make inroads into the continent via tech infrastructure built by Huawei and ZTE (Bloomberg)

@arXiv_csAI_bot@mastoxiv.page
2025-10-10 10:31:19

R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?
Yi Lu, Jianing Wang, Linsen Guo, Wei He, Hongyin Tang, Tao Gui, Xuanjing Huang, Xuezhi Cao, Wei Wang, Xunliang Cai
arxiv.org/abs/2510.08189

@ErikJonker@mastodon.social
2025-11-08 15:07:21

Kimi K2 is another DeepSeek moment, it seems, only not everybody is noticing it yet. It will be interesting to see what the stock market will do on Monday.
#AI #KimiK2

Someone tested Kimi K2 on unpublished material and it performed as well as GPT-5 and Gemini 2.5

@Techmeme@techhub.social
2025-11-08 06:36:00

In DeepSeek's first public appearance since R1's success, a senior researcher told a state-run conference he was pessimistic about AI's impact on humanity (Reuters)
reuters.com/world/asia-pacific

@Techmeme@techhub.social
2025-10-09 12:17:47

Reflection AI, which is developing an open-source AI model to compete with DeepSeek, raised $2B led by Nvidia, valuing it at $8B, up from $545M in March (Michael J. de la Merced/New York Times)
nytimes.com/2025/10/09/busines

@Techmeme@techhub.social
2025-12-10 14:41:28

Nvidia says that it hasn't seen "substantiation or received tips" about chip smuggling via data centers outside of China, after The Information's DeepSeek story (Mark Bergen/Bloomberg)
bloomberg.com/news/articles/2…

@Techmeme@techhub.social
2025-10-22 01:15:50

In an experiment, ChatGPT-4o, Claude Sonnet 4.5, and DeepSeek-V3.2-Exp expressed secular, Western liberal values regardless of the language of the questions (Kelsey Piper/The Argument)
theargumentmag.com/p/do-ais-th

@Techmeme@techhub.social
2025-11-19 07:40:57

A look at Tsinghua University, which leads China's AI innovation, with 4,986 AI patents between 2005 and 2024, alumni behind startups like DeepSeek, and more (Saritha Rai/Bloomberg)

@Techmeme@techhub.social
2025-11-20 17:35:50

Allen Institute for AI, or Ai2, unveils Olmo 3 models that it says outperform open models like Stanford's Marin and commercial open-weight models like Llama 3.1 (Todd Bishop/GeekWire)
geekwire.com/2025/ai2-releases

@Techmeme@techhub.social
2025-10-17 02:10:43

A look at ByteDance's Doubao, which became China's most popular AI app in August, with over 157M MAUs, thanks in part to its deep integration with Douyin (Zeyi Yang/Wired)
wired.com/story/bytedance-doub

@Techmeme@techhub.social
2025-10-12 22:45:45

Young people in China are turning to AI chatbots like DeepSeek and Doubao for therapy to save time and money, while avoiding stigma around mental health (Yi-Ling Liu/Rest of World)
restofworld.org/2025/young-peo

@Techmeme@techhub.social
2025-10-10 20:26:02

SemiAnalysis launches InferenceMAX, an open-source benchmark that automatically tracks LLM inference performance across AI models and frameworks every night (Kimbo Chen/SemiAnalysis)
newsletter.semianalysis.com/p/