
2025-09-18 16:36:00
Huawei says DeepSeek-R1-Safe, which was trained on 1,000 of its Ascend AI chips, is "nearly 100% successful" in preventing discussion of politically sensitive topics (Eduardo Baptista/Reuters)
https://www.reuters.com/business/media-tel
DeepSeek details V3.1 and says it surpasses R1 on key benchmarks and is customized to work with next-gen Chinese-made AI chips, after unveiling it on August 19 (Bloomberg)
https://www.bloomberg.com/news/articles/2025-08-21/deep…
The Emperor's New Chain-of-Thought: Probing Reasoning Theater Bias in Large Reasoning Models
Qian Wang, Yubo Fan, Zhenheng Tang, Nuo Chen, Wenxuan Wang, Bingsheng He
https://arxiv.org/abs/2507.13758
In a peer-reviewed Nature article, DeepSeek says it has spent $294,000 on training its R1 model and used 512 Nvidia H800 chips (Eduardo Baptista/Reuters)
https://www.reuters.com/world/china/chinas-deepseek-says-its-hit-ai-model-cos…
The geopolitical #aiarmsrace seems largely unimpressed by people proclaiming that #LLMs have plateaued and #AGI is never coming.
Such assessments are relevant for the market, but not so much for count…
A Study on Thinking Patterns of Large Reasoning Models in Code Generation
Kevin Halim, Sin G. Teo, Ruitao Feng, Zhenpeng Chen, Yang Gu, Chong Wang, Yang Liu
https://arxiv.org/abs/2509.13758
Mathematical Computation and Reasoning Errors by Large Language Models
Liang Zhang, Edith Aurora Graf
https://arxiv.org/abs/2508.09932
MATH-Beyond: A Benchmark for RL to Expand Beyond the Base Model
Prasanna Mayilvahanan, Ricardo Dominguez-Olmedo, Thaddäus Wiedemer, Wieland Brendel
https://arxiv.org/abs/2510.11653
MobileLLM-R1: Exploring the Limits of Sub-Billion Language Model Reasoners with Open Training Recipes
Changsheng Zhao, Ernie Chang, Zechun Liu, Chia-Jung Chang, Wei Wen, Chen Lai, Rick Cao, Yuandong Tian, Raghuraman Krishnamoorthi, Yangyang Shi, Vikas Chandra
https://arxiv.org/abs/2509.24945
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing
Jeffrey Amico, Gabriel Passamani Andrade, John Donaghy, Ben Fielding, Tristin Forbus, Harry Grieve, Semih Kara, Jari Kolehmainen, Yihua Lou, Christopher Nies, Edward Phillip Flores Nuño, Diogo Ortega, Shikhar Rastogi, Austin Virts, Matthew J. Wright
https://
Base Models Know How to Reason, Thinking Models Learn When
Constantin Venhoff, Iván Arcuschin, Philip Torr, Arthur Conmy, Neel Nanda
https://arxiv.org/abs/2510.07364
SemiAnalysis launches InferenceMAX, an open-source benchmark that automatically tracks LLM inference performance across AI models and frameworks every night (Kimbo Chen/SemiAnalysis)
https://newsletter.semianalysis.com/p/inferencemax-open-source-inference
Overview of the Plagiarism Detection Task at PAN 2025
André Greiner-Petter, Maik Fröbe, Jan Philip Wahle, Terry Ruas, Bela Gipp, Akiko Aizawa, Martin Potthast
https://arxiv.org/abs/2510.06805
Diamonds in the rough: Transforming SPARCs of imagination into a game concept by leveraging medium sized LLMs
Julian Geheeb, Farhan Abid Ivan, Daniel Dyrda, Miriam Anschütz, Georg Groh
https://arxiv.org/abs/2509.24730
R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?
Yi Lu, Jianing Wang, Linsen Guo, Wei He, Hongyin Tang, Tao Gui, Xuanjing Huang, Xuezhi Cao, Wei Wang, Xunliang Cai
https://arxiv.org/abs/2510.08189
An Iterative LLM Framework for SIBT utilizing RAG-based Adaptive Weight Optimization
Zhuo Xiao, Qinglong Yao, Jingjing Wang, Fugen Zhou, Bo Liu (Image Processing Center, Beihang University, Beijing, China), Haitao Sun (Department of Radiation…
Evaluating Open-Source Large Language Models for Technical Telecom Question Answering
Arina Caraus, Alessio Buscemi, Sumit Kumar, Ion Turcanu
https://arxiv.org/abs/2509.21949
DRQA: Dynamic Reasoning Quota Allocation for Controlling Overthinking in Reasoning Large Language Models
Kaiwen Yan, Xuanqing Shi, Hongcheng Guo, Wenxuan Wang, Zhuosheng Zhang, Chengwei Qin
https://arxiv.org/abs/2508.17803
SLIM: Subtrajectory-Level Elimination for More Effective Reasoning
Xifeng Yao, Chengyuan Ma, Dongyu Lang, Yinhao Ni, Zhiwei Xu, Huarui Xie, Zihao Chen, Guang Shen, Dandan Tu, Yi Bai, Changzheng Zhang
https://arxiv.org/abs/2508.19502
The Impact of Language Mixing on Bilingual LLM Reasoning
Yihao Li, Jiayi Xin, Miranda Muqing Miao, Qi Long, Lyle Ungar
https://arxiv.org/abs/2507.15849