2026-01-25 19:16:13
Putin secretly sends 'special tasks' general to Abu Dhabi talks with Ukraine, signaling a Kremlin strategy shift: https://benborges.xyz/2026/01/25/putin-secretly-sends-special-tasks.html
Moonshot says Kimi K2.5 builds on K2 with "pretraining over ~15T mixed visual and text tokens" and "can self-direct an agent swarm with up to 100 sub-agents" (Kimi)
https://www.kimi.com/blog/kimi-k2-5.html
heise | To-do apps compared: Google Tasks vs. Zenkit, Todoist and Tasks.org
Never forget anything again: Google Tasks reminds you on time of tasks, duties and birthdays. We show which apps offer more convenience and features.
Some people now argue that LLMs are useless. I disagree; they can be very useful if you take them as what they are: models of language that generate text on the basis of some given text. As such, they can be useful for a wide range of text-related tasks, including assisting with writing. And the more formulaic the genre, the better they work obviously. This is part of the reason why they are so popular with students, and in academia more generally.
⇢
Welcome to the world of the field, engineering.
For a long time, we've hired very few into sales, mktg, support or consulting that don't already have gobs of experience elsewhere.
✅ How #Microsoft’s developers are using #AI - The Verge
Like all the rest of the nerds, I did a bit of tech support on family computers.
They're all popping up windows from scam virus scanners lying that subscriptions need to be renewed or machines are unprotected. People don't know how to remove these things. Luckily they also don't really know how to pay the subscription.
Their phones are updating on them. Changing where buttons used to be. Removing options. Forcing people to register to use the things they have been doing for years.
They don't know how to register.
Things pop up asking for passwords and they have no idea who is asking or which password to use.
I tell them that I don't really understand why they keep using Windows now that it is so shitty and awful. They say they don't know how to use anything else. The fact that they don't really know how to use Windows either doesn't seem to register.
The tech corporations have given up completely on being user friendly. They are all deliberately user hostile and exploitative now.
Corporate tech is terrible. The industry is failing its users, abusing them. People don't even know there is any other way. They are just giving up on achieving their tasks until someone can fix the pop-ups and subscription boxes and passwords and 2FA for them.
Tech sucks now. Sucks hard.
#tech #christmasTechSupport
Todoist FTW.
Several years ago we had smoky pies and rolls for Thanksgiving because we put off cleaning the ovens until it was too late.
I created a recurring Todoist task to clean the ovens on the 3rd Sunday of November. Ever since then we’ve had pristine ovens and smoke free cooking every Thanksgiving.
Collette appreciates that, for a holiday where most of the tasks are hers, she doesn’t have to worry: the ovens are ready to go.
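For anyone who wants the same recurring date outside Todoist, here is a minimal Python sketch of the "3rd Sunday of November" rule. The helper name `nth_weekday` is made up for illustration and has nothing to do with how Todoist computes recurrences internally.

```python
from datetime import date, timedelta

SUNDAY = 6    # datetime convention: Monday is 0, Sunday is 6
THURSDAY = 3


def nth_weekday(year: int, month: int, weekday: int, n: int) -> date:
    """Return the n-th given weekday of a month (hypothetical helper)."""
    first = date(year, month, 1)
    # Days from the 1st of the month to the first matching weekday.
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))


oven_day = nth_weekday(2025, 11, SUNDAY, 3)
thanksgiving = nth_weekday(2025, 11, THURSDAY, 4)  # US Thanksgiving: 4th Thursday
print(oven_day, thanksgiving)
```

The 3rd Sunday always falls on the 15th to 21st and the 4th Thursday on the 22nd to 28th, so the cleaning reminder always lands before Thanksgiving.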
Been using a number of AI models over the past week or so as work has slowed down, giving me time to explore things more deeply.
Been using Claude Code with musistudio/claude-code-router which is great as I can switch between different models on similar tasks.
Experience so far has been that Gemini 3 Flash is very good for thinking and coding tasks but the code does tend to be fragile so rewrites are needed. For tough problems where the errors are not straightforward it falls d…
I yearn for C++26 and not having to do repetitive tasks ever again to have a modicum of reflection. I want my static reflections yesterday, and my template for one day before that.
Oh, this is cool: A mind mapper for the terminal #cli
RE: https://hachyderm.io/@thomasfuchs/115601979925351548
hear me out, how about NOT CRAMMING EVERYTHING INTO ONE DEVICE that just works mid for everything, but instead, you know, do some actual innovation here and there
for example, make devices specifically tailored for certain tasks
like if you're Apple why in the fuck don't you make devices with e-paper screens for people who don't want to be terminally online
This is the level of prep my wife brings to Christmas lunch. Can you tell she's a scientist? As you can see, it's going well. I've been assigned a few tasks, but she mostly wants to do this herself, it seems.
#Christmas
Dynamic reversal of IT-PFC information flow orchestrates visual categorization under perceptual uncertainty https://www.biorxiv.org/content/10.64898/2025.12.17.695044v1 Quite a mouthful to say that "the brain actually reverses its information flow when things get blurr…
Raiders’ Ashton Jeanty Gets Bold Christian McCaffrey Message https://heavy.com/sports/nfl/las-vegas-raiders/ashton-jeanty-bold-christian-mccaffrey-message/
#TIL about CYBATHLON #cybathlon
Beijing-based DP Technology, which develops AI tools used by researchers for tasks like computer-aided drug design and battery design, raised a ~$114M Series C (Eunice Xu/South China Morning Post)
https://www.scmp.com/business/companies/ar
It annoys me when programs have a plugin interface but no way to use it to extend all of the program's functionality.
For example, in GIMP you apparently can't add extra tools to the toolbox, so you always have to click through menus.
So it's not really worth developing a plugin for recurring tasks.
<…
"Deep Research, Shallow Agency: What Academic Deep Research Can and Can't Do"
https://aarontay.substack.com/p/how-agentic-are-academic-deep-research
My brain seems to have two modes:
1. time-unaware, that slightly chaotic relaxed flow state where I just do the next thing that feels right
2. time-aware, which is always a bit tense and stressful because I need to force myself to keep to a calendar, and execute tasks in specific times.
Easy Adaptation: An Efficient Task-Specific Knowledge Injection Method for Large Models in Resource-Constrained Environments
Dong Chen, Zhengqing Hu, Shixing Zhao, Yibo Guo
https://arxiv.org/abs/2512.17771 https://arxiv.org/pdf/2512.17771 https://arxiv.org/html/2512.17771
arXiv:2512.17771v1 Announce Type: new
Abstract: While the enormous parameter scale endows Large Models (LMs) with unparalleled performance, it also limits their adaptability across specific tasks. Parameter-Efficient Fine-Tuning (PEFT) has emerged as a critical approach for effectively adapting LMs to a diverse range of downstream tasks. However, existing PEFT methods face two primary challenges: (1) High resource cost. Although PEFT methods significantly reduce resource demands compared to full fine-tuning, it still requires substantial time and memory, making it impractical in resource-constrained environments. (2) Parameter dependency. PEFT methods heavily rely on updating a subset of parameters associated with LMs to incorporate task-specific knowledge. Yet, due to increasing competition in the LMs landscape, many companies have adopted closed-source policies for their leading models, offering access only via Application Programming Interface (APIs). Whereas, the expense is often cost-prohibitive and difficult to sustain, as the fine-tuning process of LMs is extremely slow. Even if small models perform far worse than LMs in general, they can achieve superior results on particular distributions while requiring only minimal resources. Motivated by this insight, we propose Easy Adaptation (EA), which designs Specific Small Models (SSMs) to complement the underfitted data distribution for LMs. Extensive experiments show that EA matches the performance of PEFT on diverse tasks without accessing LM parameters, and requires only minimal resources.
toXiv_bot_toot
AI Agents ‘Perilous’ for Secure Apps Such as Signal, Whittaker Says
https://www.bloomberg.com/news/articles/2026-01-20/ai-agents-perilous-for-secure-apps-such-as-signal-whittaker
windsurfers: Windsurfers network (1986)
A network of interpersonal contacts among windsurfers in southern California during the Fall of 1986. The edge weights indicate the perception of social affiliations, measured by a task in which each individual was asked to sort cards bearing the other surfers' names in order of closeness.
This network has 43 nodes and 336 edges.
Tags: Social, Offline, Weighted
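A quick sketch of what those two numbers imply, assuming the network is treated as a simple undirected graph (no self-loops or parallel edges; that is an assumption here, not something the listing states). The function names are illustrative only.

```python
def density(nodes: int, edges: int) -> float:
    """Fraction of possible undirected node pairs that are connected."""
    return 2 * edges / (nodes * (nodes - 1))


def mean_degree(nodes: int, edges: int) -> float:
    """Average number of neighbours per node in an undirected graph."""
    return 2 * edges / nodes


n, m = 43, 336  # figures from the listing above
print(round(density(n, m), 3))      # 0.372
print(round(mean_degree(n, m), 2))  # 15.63
```

Under that assumption, a bit over a third of all possible pairs are linked, which is dense for a social network of this size.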
Experimental insights into data augmentation techniques for deep learning-based multimode fiber imaging: limitations and success
Jawaria Maqbool, M. Imran Cheema
https://arxiv.org/abs/2511.19072 https://arxiv.org/pdf/2511.19072 https://arxiv.org/html/2511.19072
arXiv:2511.19072v1 Announce Type: new
Abstract: Multimode fiber (MMF) imaging using deep learning has high potential to produce compact, minimally invasive endoscopic systems. Nevertheless, it relies on large, diverse real-world medical data, whose availability is limited by privacy concerns and practical challenges. Although data augmentation has been extensively studied in various other deep learning tasks, it has not been systematically explored for MMF imaging. This work provides the first in-depth experimental and computational study on the efficacy and limitations of augmentation techniques in this field. We demonstrate that standard image transformations and conditional generative adversarial-based synthetic speckle generation fail to improve, or even deteriorate, reconstruction quality, as they neglect the complex modal interference and dispersion that results in speckle formation. To address this, we introduce a physical data augmentation method in which only organ images are digitally transformed, while their corresponding speckles are experimentally acquired via fiber. This approach preserves the physics of light-fiber interaction and enhances the reconstruction structural similarity index measure (SSIM) by up to 17%, forming a viable system for reliable MMF imaging under limited data conditions.
from my link log —
Exploring the fragmentation of Wayland: an xdotool adventure.
https://www.semicomplete.com/blog/xdotool-and-exploring-wayland-fragmentation/
saved 2025-11-21
Google’s vibe-coding tool, Opal,
is making its way to Gemini.
The company on Wednesday said it is integrating the tool,
which lets you build AI-powered mini apps,
inside the Gemini web app,
allowing users to create their own custom apps,
which Google calls Gems.
Introduced in 2024,
Gems are customized versions of Gemini designed for specific tasks or scenarios.
For instance, some of Google’s pre-made Gems include
a learning coach,…
Sunday Robotics unveils Memo, a fully autonomous home robot capable of tasks like making espresso and loading dishwashers, set to launch in beta in 2026 (Will Knight/Wired)
https://www.wired.com/story/memo-sunday-robotics-home-robot/
Senior Microsoft Product Manager Wendy Breiding discusses in this recent post how you can now customize your IDE to add agentic AI to your project that is focused on tasks related to a specific language or UI stack, in this case C# and WinForms. The results have been positive when comparing these agents to previous, more general approaches.
"Introducing Custom Agents for .NET Developers: C# Expert & WinForms Expert"
🎲 TextBandit: Evaluating Probabilistic Reasoning in LLMs Through Language-Only Decision Tasks
#llm
Do you use LLMs to generate regular expressions? We do, too! Do you *review* your regexes? Is that frustrating? How can we put humans in the loop better, doing relatively few, meaningful tasks? Please try out our new tool PICK:regex, available for VSCode!
https://blog.brownplt.org/2025/12/11/p
To flexibly organize thought, the brain makes use of space https://news.mit.edu/2026/to-flexibly-organize-thought-the-brain-makes-use-of-space-0120
I sat down at my desk to design and print something, then I started doing sysadmin tasks and now it's an hour later... dammit!
Good website #uber guys...
I needed to update my password. Then I couldn't go back (yes yes, outside of my browser's back button)
Just so you can guess where the "AI" thing is headed, look at this listing from a local community college
"AI made simple for everyday life"
... And look at what comes next:
"Flaggers certification"
The exciting thing (for bosses) is the idea that knowledge workers will be as interchangeable (and precarious) as DOT flaggers
#YouDeserveAUnion
Arbiter, which is using AI to automate healthcare administrative tasks, emerges from stealth with a $52M seed from multiple family offices at a $400M valuation (Rebecca Torrence/Business Insider)
https://www.businessinsider.com/health-sta
China's MiniMax releases M2.1, an upgrade to its open-source M2 model that it says has "significantly enhanced" coding capabilities in Rust, Java, and others (MiniMax)
https://www.minimax.io/news/minimax-m21
Started the official rewrite of the Sisyphus client in #golang, working on getting the Ffmpeg command-line tasks parsed and validated against the schema. This should make things easier to distribute with respect to the client as I can just distribute static binaries.
#programming
Can You Hear Me Now? A Benchmark for Long-Range Graph Propagation
Luca Miglior, Matteo Tolloso, Alessio Gravina, Davide Bacciu
https://arxiv.org/abs/2512.17762 https://arxiv.org/pdf/2512.17762 https://arxiv.org/html/2512.17762
arXiv:2512.17762v1 Announce Type: new
Abstract: Effectively capturing long-range interactions remains a fundamental yet unresolved challenge in graph neural network (GNN) research, critical for applications across diverse fields of science. To systematically address this, we introduce ECHO (Evaluating Communication over long HOps), a novel benchmark specifically designed to rigorously assess the capabilities of GNNs in handling very long-range graph propagation. ECHO includes three synthetic graph tasks, namely single-source shortest paths, node eccentricity, and graph diameter, each constructed over diverse and structurally challenging topologies intentionally designed to introduce significant information bottlenecks. ECHO also includes two real-world datasets, ECHO-Charge and ECHO-Energy, which define chemically grounded benchmarks for predicting atomic partial charges and molecular total energies, respectively, with reference computations obtained at the density functional theory (DFT) level. Both tasks inherently depend on capturing complex long-range molecular interactions. Our extensive benchmarking of popular GNN architectures reveals clear performance gaps, emphasizing the difficulty of true long-range propagation and highlighting design choices capable of overcoming inherent limitations. ECHO thereby sets a new standard for evaluating long-range information propagation, also providing a compelling example for its need in AI for science.
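For readers unfamiliar with the three synthetic tasks named in the abstract, here is a minimal sketch of how their ground-truth values can be computed classically with breadth-first search. It assumes an unweighted, connected, undirected graph stored as an adjacency dict; it is not the ECHO benchmark code and involves no GNN.

```python
from collections import deque


def bfs_distances(adj: dict, source) -> dict:
    """Single-source shortest paths (in hops) on an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def eccentricity(adj: dict, node) -> int:
    """Largest hop distance from `node` to any reachable node."""
    return max(bfs_distances(adj, node).values())


def diameter(adj: dict) -> int:
    """Largest eccentricity over all nodes (graph assumed connected)."""
    return max(eccentricity(adj, n) for n in adj)


# Toy example: a path graph 0-1-2-3-4, where information must cross many hops.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(bfs_distances(path, 0), eccentricity(path, 2), diameter(path))
```

On the toy path graph, the end nodes are four hops apart, so a message-passing model would need at least four propagation steps to relate them, which is the kind of bottleneck the abstract describes.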
NeuroSketch: An Effective Framework for Neural Decoding via Systematic Architectural Optimization
Gaorui Zhang, Zhizhang Yuan, Jialan Yang, Junru Chen, Li Meng, Yang Yang
https://arxiv.org/abs/2512.09524 https://arxiv.org/pdf/2512.09524 https://arxiv.org/html/2512.09524
arXiv:2512.09524v1 Announce Type: new
Abstract: Neural decoding, a critical component of Brain-Computer Interface (BCI), has recently attracted increasing research interest. Previous research has focused on leveraging signal processing and deep learning methods to enhance neural decoding performance. However, the in-depth exploration of model architectures remains underexplored, despite its proven effectiveness in other tasks such as energy forecasting and image classification. In this study, we propose NeuroSketch, an effective framework for neural decoding via systematic architecture optimization. Starting with the basic architecture study, we find that CNN-2D outperforms other architectures in neural decoding tasks and explore its effectiveness from temporal and spatial perspectives. Building on this, we optimize the architecture from macro- to micro-level, achieving improvements in performance at each step. The exploration process and model validations take over 5,000 experiments spanning three distinct modalities (visual, auditory, and speech), three types of brain signals (EEG, SEEG, and ECoG), and eight diverse decoding tasks. Experimental results indicate that NeuroSketch achieves state-of-the-art (SOTA) performance across all evaluated datasets, positioning it as a powerful tool for neural decoding. Our code and scripts are available at https://github.com/Galaxy-Dawn/NeuroSketch.
⭐ The Simple Habit That Saves My Evenings | alikhil | software engineering, kubernetes & self-hosting
https://alikhil.dev/posts/the-simple-habit-that-saves-my-evenings/
Spoiler alert:
Here are the two main ideas of it:
* Don’t overwork
*…
Caught a bug over the holidays so I’m mostly resting, feeling sorry for myself, and taking the time to at least carry out some mindless housekeeping tasks (updating dependencies, etc.) on some of my Node modules.
Released updates to the following packages yesterday:
Tape-based Node.js testing:
• Tap monkey (https://
The Pentagon partners with xAI to embed the company's frontier AI systems, based on the Grok family of models, directly into GenAI.mil as soon as early 2026 (Bonny Chu/Fox News)
https://www.foxnews.com/politics/pentagon-
Eww..🤮
Thankfully I left Brave
https://brave.com/blog/ai-browsing/
LLMs are useful for some tasks. I'm currently tidying up, proofreading and editing a document written by many international co-authors.
The LLM I'm using can very quickly correct grammar and spelling mistakes, and it produces much easier-to-understand text from occasionally tortured paragraphs. It still needs an expert's eye (mine!) to check that no mistakes have been introduced or complexities over-simplified.
This is actually the first time I've used an LLM for this task. It's making it much faster and less painful.
It also explains a lot about academic publishing lately..
"Deep Research, Shallow Agency: What Academic Deep Research Can and Can't Do"
https://aarontay.substack.com/p/how-agentic-are-academic-deep-research
Today at #CHR2025, I will be presenting our work on the evaluation of the historical adequacy of masked language models (MLMs) for #Latin. There are several models like this, and they represent the current state of the art for a number of downstream tasks, like semantic change and text reuse detection. However, a h…
We've updated the What Uses More app to reflect last week's finding by Luccioni and Gamazaychikov that "reasoning" mode increases energy and water usage by 30x. The study casts doubt on the improved efficiency AI companies are claiming for newer models
https://www.
"I'm an average user, so I don't need all the options and apps the programme has to offer. But, to be honest, #Microsoft is making it increasingly attractive to switch. Now that the company is putting #AI in everything, everything is becoming more annoying to use."
Can Dutch
Anthropic launches Cowork for Claude, built on Claude Code to automate complex tasks with minimal prompting, as a research preview for Claude Max subscribers (Webb Wright/ZDNET)
https://www.zdnet.com/article/anthropic-cowork-for-claude-complex-action…
Someone built a vectorized index & a conversational AI for the Epstein Files.
Never mind Qs like "How many times was Trump mentioned?" Try asking it complex tasks & questions that require intelligent reasoning like, "Is there any evidence..."
https://epstein.trynia.ai/
It's funny how "AI" tools are simultaneously marketed as "agents" that can run fully in the background and do stuff, but whenever they do something bad it's the user at fault for not supervising the software that doesn't work.
Even when it’s directly used and the user has the chance to review everything—it’s extremely dangerous, especially at tasks it is doing fine like 95% of the time and/or when the bad things are only subtly wrong.
Imagine other tools being like this, like a steering wheel that turns the car correctly 95 out of 100 times. 2% of the time it steers in the other direction. 3% of the time it steers 5x as much as normal.
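To put a number on the steering-wheel analogy, here is a small sketch that assumes each turn succeeds independently with probability 0.95 (independence is an assumption added here, not part of the post). The function names are illustrative only.

```python
def prob_all_correct(p_correct: float, turns: int) -> float:
    """Chance that every one of `turns` independent turns goes right."""
    return p_correct ** turns


def turns_until_failure_likely(p_correct: float, threshold: float = 0.5) -> int:
    """Smallest number of turns after which a flawless run is no longer likely."""
    turns, p = 0, 1.0
    while p > threshold:
        p *= p_correct
        turns += 1
    return turns


print(round(prob_all_correct(0.95, 20), 3))
print(turns_until_failure_likely(0.95))
```

Under that assumption, a wheel that is right 95% of the time is more likely than not to have misbehaved at least once within 14 turns.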
Anthropic finds that LLMs trained to "reward hack" by cheating on coding tasks show even more misaligned behavior, including sabotaging AI-safety research (Anthropic)
https://www.anthropic.com/research/emergent-misalignment-reward-hacking
Financial well-being has an outsize imprint on older Americans’ quality of life
-- affecting their physical health, social life and even cognitive skills.
Low-income seniors are more likely to experience mental confusion,
spend less time pursuing hobbies,
and face difficulties with everyday tasks such as climbing stairs and grocery shopping,
compared with their more affluent counterparts
METR: Claude Opus 4.5 has a 50% task completion time horizon of about 4 hours and 49 minutes, more than double that of Claude Opus 4 released earlier this year (@metr_evals)
https://x.com/metr_evals/status/2002203627377574113
I'm trying out kan.bn since my Focalboard install took a shit. kan.bn was fairly easy to get running via Docker on my OpenMediaVault NAS.
I'm not a huge fan of the kanban board style but I'll give it a try. At least until I can find something better to manage my tasks.
https://kan.bn/
Weighted Stochastic Differential Equation to Implement Wasserstein-Fisher-Rao Gradient Flow
Herlock Rahimi
https://arxiv.org/abs/2512.17878 https://arxiv.org/pdf/2512.17878 https://arxiv.org/html/2512.17878
arXiv:2512.17878v1 Announce Type: new
Abstract: Score-based diffusion models currently constitute the state of the art in continuous generative modeling. These methods are typically formulated via overdamped or underdamped Ornstein-Uhlenbeck-type stochastic differential equations, in which sampling is driven by a combination of deterministic drift and Brownian diffusion, resulting in continuous particle trajectories in the ambient space. While such dynamics enjoy exponential convergence guarantees for strongly log-concave target distributions, it is well known that their mixing rates deteriorate exponentially in the presence of nonconvex or multimodal landscapes, such as double-well potentials. Since many practical generative modeling tasks involve highly non-log-concave target distributions, considerable recent effort has been devoted to developing sampling schemes that improve exploration beyond classical diffusion dynamics.
A promising line of work leverages tools from information geometry to augment diffusion-based samplers with controlled mass reweighting mechanisms. This perspective leads naturally to Wasserstein-Fisher-Rao (WFR) geometries, which couple transport in the sample space with vertical (reaction) dynamics on the space of probability measures. In this work, we formulate such reweighting mechanisms through the introduction of explicit correction terms and show how they can be implemented via weighted stochastic differential equations using the Feynman-Kac representation. Our study provides a preliminary but rigorous investigation of WFR-based sampling dynamics, and aims to clarify their geometric and operator-theoretic structure as a foundation for future theoretical and algorithmic developments.
Legal AI startup Ivo, which aims to reduce hallucinations by breaking legal reviews into 400 tasks, raised a $55M Series B at what a source says is a $355M valuation (Aditya Soni/Reuters)
https://www.reuters.com/technology/legal-ai-startup…
You Only Train Once: Differentiable Subset Selection for Omics Data
Daphné Chopard, Jorge da Silva Gonçalves, Irene Cannistraci, Thomas M. Sutter, Julia E. Vogt
https://arxiv.org/abs/2512.17678 https://arxiv.org/pdf/2512.17678 https://arxiv.org/html/2512.17678
arXiv:2512.17678v1 Announce Type: new
Abstract: Selecting compact and informative gene subsets from single-cell transcriptomic data is essential for biomarker discovery, improving interpretability, and cost-effective profiling. However, most existing feature selection approaches either operate as multi-stage pipelines or rely on post hoc feature attribution, making selection and prediction weakly coupled. In this work, we present YOTO (you only train once), an end-to-end framework that jointly identifies discrete gene subsets and performs prediction within a single differentiable architecture. In our model, the prediction task directly guides which genes are selected, while the learned subsets, in turn, shape the predictive representation. This closed feedback loop enables the model to iteratively refine both what it selects and how it predicts during training. Unlike existing approaches, YOTO enforces sparsity so that only the selected genes contribute to inference, eliminating the need to train additional downstream classifiers. Through a multi-task learning design, the model learns shared representations across related objectives, allowing partially labeled datasets to inform one another, and discovering gene subsets that generalize across tasks without additional training steps. We evaluate YOTO on two representative single-cell RNA-seq datasets, showing that it consistently outperforms state-of-the-art baselines. These results demonstrate that sparse, end-to-end, multi-task gene subset selection improves predictive performance and yields compact and meaningful gene subsets, advancing biomarker discovery and single-cell analysis.
Stuut, which connects to CRM and other systems to automate management of accounts receivable, raised a $29.5M Series A led by a16z (Charlie Fink/Forbes)
https://www.forbes.com/sites/charliefink/2025…
Dude. What the...? 😜
#Azure
From: @…
https://infosec.exchange/@alevsk/11549
AI robotics startup Physical Intelligence claims vision-language-action models learn to align human videos and robot data as pre-training is scaled up (Physical Intelligence)
https://www.physicalintelligence.company/research/human_to_robot
Maxima, whose AI platform automates accounting tasks like reconciliation and journal entry, raised $41M in seed and Series A rounds at a $143M valuation (Aditya Soni/Reuters)
https://www.reuters.com/business/ai-accounti…
Crosslisted article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[2/3]:
- Sharp Structure-Agnostic Lower Bounds for General Functional Estimation
Jikai Jin, Vasilis Syrgkanis
https://arxiv.org/abs/2512.17341 https://mastoxiv.page/@arXiv_statML_bot/115762312049963700
- Timely Information Updating for Mobile Devices Without and With ML Advice
Yu-Pin Hsu, Yi-Hsuan Tseng
https://arxiv.org/abs/2512.17381 https://mastoxiv.page/@arXiv_csNI_bot/115762180316858485
- SWE-Bench : A Framework for the Scalable Generation of Software Engineering Benchmarks from Open...
Wang, Ramalho, Celestino, Pham, Liu, Sinha, Portillo, Osunwa, Maduekwe
https://arxiv.org/abs/2512.17419 https://mastoxiv.page/@arXiv_csSE_bot/115762487015279852
- Perfect reconstruction of sparse signals using nonconvexity control and one-step RSB message passing
Xiaosi Gu, Ayaka Sakata, Tomoyuki Obuchi
https://arxiv.org/abs/2512.17426 https://mastoxiv.page/@arXiv_statML_bot/115762346108219997
- MULTIAQUA: A multimodal maritime dataset and robust training strategies for multimodal semantic s...
Jon Muhovič, Janez Perš
https://arxiv.org/abs/2512.17450 https://mastoxiv.page/@arXiv_csCV_bot/115762717053353674
- When Data Quality Issues Collide: A Large-Scale Empirical Study of Co-Occurring Data Quality Issu...
Emmanuel Charleson Dapaah, Jens Grabowski
https://arxiv.org/abs/2512.17460 https://mastoxiv.page/@arXiv_csSE_bot/115762500123147574
- Behavioural Effects of Agentic Messaging: A Case Study on a Financial Service Application
Olivier Jeunen, Schaun Wheeler
https://arxiv.org/abs/2512.17462 https://mastoxiv.page/@arXiv_csIR_bot/115762430673347625
- Linear Attention for Joint Power Optimization and User-Centric Clustering in Cell-Free Networks
Irched Chafaa, Giacomo Bacci, Luca Sanguinetti
https://arxiv.org/abs/2512.17466 https://mastoxiv.page/@arXiv_eessSY_bot/115762336277179643
- Translating the Rashomon Effect to Sequential Decision-Making Tasks
Dennis Gross, Jørn Eirik Betten, Helge Spieker
https://arxiv.org/abs/2512.17470 https://mastoxiv.page/@arXiv_csAI_bot/115762556506696539
- Alternating Direction Method of Multipliers for Nonlinear Matrix Decompositions
Atharva Awari, Nicolas Gillis, Arnaud Vandaele
https://arxiv.org/abs/2512.17473 https://mastoxiv.page/@arXiv_eessSP_bot/115762580078964235
- TwinSegNet: A Digital Twin-Enabled Federated Learning Framework for Brain Tumor Analysis
Almustapha A. Wakili, Adamu Hussaini, Abubakar A. Musa, Woosub Jung, Wei Yu
https://arxiv.org/abs/2512.17488 https://mastoxiv.page/@arXiv_csCV_bot/115762726884307901
- Resource-efficient medical image classification for edge devices
Mahsa Lavaei, Zahra Abadi, Salar Beigzad, Alireza Maleki
https://arxiv.org/abs/2512.17515 https://mastoxiv.page/@arXiv_eessIV_bot/115762459510336799
- PathBench-MIL: A Comprehensive AutoML and Benchmarking Framework for Multiple Instance Learning i...
Brussee, Valkema, Weijer, Doeleman, Schrader, Kers
https://arxiv.org/abs/2512.17517 https://mastoxiv.page/@arXiv_csCV_bot/115762741957639051
- HydroGym: A Reinforcement Learning Platform for Fluid Dynamics
Christian Lagemann, et al.
https://arxiv.org/abs/2512.17534 https://mastoxiv.page/@arXiv_physicsfludyn_bot/115762391350754768
- When De-noising Hurts: A Systematic Study of Speech Enhancement Effects on Modern Medical ASR Sys...
Chondhekar, Murukuri, Vasani, Goyal, Badami, Rana, SN, Pandia, Katiyar, Jagadeesh, Gulati
https://arxiv.org/abs/2512.17562 https://mastoxiv.page/@arXiv_csSD_bot/115762423443170715
- Enabling Disaggregated Multi-Stage MLLM Inference via GPU-Internal Scheduling and Resource Sharing
Lingxiao Zhao, Haoran Zhou, Yuezhi Che, Dazhao Cheng
https://arxiv.org/abs/2512.17574 https://mastoxiv.page/@arXiv_csDC_bot/115762425409322293
- SkinGenBench: Generative Model and Preprocessing Effects for Synthetic Dermoscopic Augmentation i...
N. A. Adarsh Pritam, Jeba Shiney O, Sanyam Jain
https://arxiv.org/abs/2512.17585 https://mastoxiv.page/@arXiv_eessIV_bot/115762479150695610
- MAD-OOD: A Deep Learning Cluster-Driven Framework for an Out-of-Distribution Malware Detection an...
Tosin Ige, Christopher Kiekintveld, Aritran Piplai, Asif Rahman, Olukunle Kolade, Sasidhar Kunapuli
https://arxiv.org/abs/2512.17594 https://mastoxiv.page/@arXiv_csCR_bot/115762509298207765
- Confidence-Credibility Aware Weighted Ensembles of Small LLMs Outperform Large LLMs in Emotion De...
Menna Elgabry, Ali Hamdi
https://arxiv.org/abs/2512.17630 https://mastoxiv.page/@arXiv_csCL_bot/115762575512981257
- Generative Multi-Objective Bayesian Optimization with Scalable Batch Evaluations for Sample-Effic...
Madhav R. Muthyala, Farshud Sorourifar, Tianhong Tan, You Peng, Joel A. Paulson
https://arxiv.org/abs/2512.17659 https://mastoxiv.page/@arXiv_statML_bot/115762554519447500
Anthropic launches Agent Skills, which let AI assistants perform specialized tasks using modular instructions, and says Microsoft, Cursor, and others use them (Michael Nuñez/VentureBeat)
https://venturebeat.com/ai/anthropic-launches-enterprise-age…
windsurfers: Windsurfers network (1986)
A network of interpersonal contacts among windsurfers in southern California during the fall of 1986. The edge weights indicate perceived social affiliation, measured by a task in which each individual was asked to sort cards bearing the other surfers' names in order of closeness.
This network has 43 nodes and 336 edges.
Tags: Social, Offline, Weighted
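For a sense of scale, the node and edge counts above can be turned into a mean degree and an edge density. This is a minimal sketch that assumes the network is undirected with no self-loops or parallel edges (the post does not say so); `graph_summary` is a hypothetical helper name, not part of any dataset tooling.

```python
def graph_summary(n_nodes: int, n_edges: int) -> dict:
    """Mean degree and density, assuming an undirected simple graph."""
    # Assumption: undirected, no self-loops, no parallel edges.
    max_edges = n_nodes * (n_nodes - 1) // 2
    return {
        "mean_degree": 2 * n_edges / n_nodes,  # each edge touches two nodes
        "density": n_edges / max_edges,        # share of possible pairs linked
    }


# Counts taken from the post: 43 nodes, 336 edges.
summary = graph_summary(43, 336)
print(summary)
```

Under that assumption, roughly a third of all possible pairs of surfers are connected, which is dense for a social network of this size.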
UK AI Security Institute report: AI models are rapidly improving at potentially dangerous biological and chemical tasks, and show fast jumps in self-replication (Shakeel Hashim/Transformer)
https://www.transformernews.ai/p/aisi-ai-s
Manus says it crossed $100M ARR eight months after launch and is growing at 20% MoM since Manus 1.5's release; its total revenue run rate is now over $125M (Jake Rudnitsky/Bloomberg)
https://www.bloomberg.com/news/articles/20
Research: AI's ability to complete long and complex software engineering tasks doubles every 6-7 months, but there is a "messiness tax" for real-world tasks (Boaz Barak/Windows On Theory)
https://windowsontheory.org/2025/11/04/thoughts-by…
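To make the "doubles every 6-7 months" figure easier to compare, a doubling time can be converted to a yearly growth factor. This is only the arithmetic of steady exponential growth, not a model from the linked post; `annual_growth_factor` is a hypothetical helper name.

```python
def annual_growth_factor(doubling_months: float) -> float:
    """Growth multiple over 12 months, assuming steady exponential doubling."""
    return 2 ** (12 / doubling_months)


# Doubling times quoted in the headline: 6 and 7 months.
for months in (6, 7):
    print(months, round(annual_growth_factor(months), 2))
```

A 6-month doubling time corresponds to a 4x increase per year, and a 7-month doubling time to a little over 3x.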
A look at data labeling startups like Objectways, whose workers record and annotate repetitive tasks like folding towels to train AI robots for physical tasks (Nilesh Christopher/Los Angeles Times)
https://www.latimes.com/business/story/202…
Alibaba Technical Report: Qwen3-VL beats GPT-5 and Gemini 2.5 Pro on visual tasks and has 100% accuracy on "needle-in-a-haystack" tests for 30-minute videos (Jonathan Kemper/The Decoder)
https://the-decoder.com/qwen3-vl-can-scan-two-hour-…
OpenAI says GPT‑5.2 Thinking beats or ties industry professionals on 70.9% of GDPval knowledge work tasks, delivering outputs at >11x the speed and <1% the cost (OpenAI)
https://openai.com/index/introducing-gpt-5-2
Pebble unveils the Pebble Index 01, a $99 smart ring with an on-device LLM for processing voice notes, shipping in March 2026, initially for $75 (Julian Chokkattu/Wired)
https://www.wired.com/story/pebble-index-ring/
Source: Anthropic, OpenAI, Google, Microsoft, and more are set to unveil the Agentic AI Foundation to build open-source AI agent standards as soon as this week (Aaron Holmes/The Information)
https://www.theinformation.com/articles/openai…
Local governments across China are funding dozens of "robot training centers", where human trainers mimic movements like folding clothes to teach the robots (Rest of World)
https://restofworld.org/2026/china-robots-training-centers-workers/
An OpenAI survey of 9,000 workers at 100 companies: it saves workers ~40 to 60 minutes per day on average for professional tasks; OpenAI has 1M business clients (Shirin Ghaffary/Bloomberg)
https://www.bloomberg.com/news/articles/20
OpenAI is rolling out a HIPAA-compliant version of ChatGPT for clinicians to assist with medical reasoning and administrative tasks, at Cedars-Sinai and others (Shirin Ghaffary/Bloomberg)
https://www.bloomberg.com/news/newsletters
An analysis of 100T tokens from the past year shows reasoning models now represent over half of all usage, open-weight model use has grown steadily, and more (OpenRouter)
https://openrouter.ai/state-of-ai
Pine, which offers an AI agent to automate digital chores, like making calls, handling emails, and operating software to complete tasks, raised a $25M Series A (FinSMEs)
https://www.finsmes.com/2025/12/pine-raises-25m-in-series-a-funding.html
AMD unveils Ryzen AI 400 Series AI PC chips with 12 CPU cores, claiming 1.3x faster multitasking and 1.7x faster content creation than rivals (Rebecca Szkutak/TechCrunch)
https://techcrunch.com/2026/01/05/amd-unveils-new-ai-pc-processors-…
Switzerland-based Mimic Robotics, which is building AI models to enable human-like robotic hands to adapt to complex, high-precision tasks, raised a $16M seed (Kyt Dotson/SiliconANGLE)
https://siliconangle.com/2025/11/04/mimic-raises-16m-build-a…
Giga, which develops voice-based AI agents for customer support, raised a $61M Series A led by Redpoint with participation from Y Combinator and Nexus (Beatrice Nolan/Fortune)
https://fortune.com/2025/11/05/voice-ai-giga-raise-61-million-customer-serv…
A look at startups like AGI and Plato, which build replicas of websites to let AI agents learn to navigate and complete specific tasks, like booking flights (Cade Metz/New York Times)
https://www.nytimes.com/2025/12/02…
Sources: OpenAI is developing a new LLM codenamed Garlic, which performs well when compared to Gemini 3 and Opus 4.5 in coding and reasoning tasks (Stephanie Palazzolo/The Information)
https://www.theinformation.com/articles/openai-developi…
Sources: multiple Microsoft divisions lowered how much salespeople are supposed to grow sales of certain AI products after missing growth targets, a rare move (Aaron Holmes/The Information)
https://www.theinformation.com/articles/mi
OpenAI releases gpt-oss-safeguard, its open-weight reasoning models for safety classification tasks, available in 120B and 20B parameters, under Apache 2.0 (OpenAI)
https://openai.com/index/introducing-gpt-oss-safeguard/