Tootfinder

@frankel@mastodon.top
2026-02-18 09:00:44

SkillsBench: Benchmarking How Well Agent #Skills Work Across Diverse Tasks
#LLM …

SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks
Agent Skills are structured packages of procedural knowledge that augment LLM agents at inference time. Despite rapid adoption, there is no standard way to measure whether they actually help. We present SkillsBench, a benchmark of 86 tasks across 11 domains paired with curated Skills and deterministic verifiers. Each task is evaluated under three conditions: no Skills, curated Skills, and self-generated Skills. We test 7 agent-model configurations over 7,308 trajectories. Curated Skills raise a…

@inthehands@hachyderm.io
2026-03-18 17:03:46

The other one I truly love is GitUp (https://gitup.co). Its visualization handles certain specific tasks better than anything else — tasks where I’m more concerned about the shape of the commit graph than the contents of individual commits.
Because of the way it does live updates of repo state and offers a whole-commit-graph-level undo, I’ll sometimes keep it open in the background while doing some fiddly thing in another tool (Fork, CLI, whatever) just so I can see what the ^*@# is happening.
Alas, its lack of support for commit signing means I use it less and less.

@heiseonline@social.heise.de
2026-01-13 15:49:01

heise | To-do apps compared: Google Tasks vs. Zenkit, Todoist, and Tasks.org
Never forget anything again: Google Tasks reminds you on time of tasks, duties, and birthdays. We show which apps offer more convenience and features.


@Techmeme@techhub.social
2026-02-19 16:22:04

Google rolls out Gemini 3.1 Pro, which it says is "a step forward in core reasoning", for AI Pro and Ultra subscribers; the .1 increment is a first for Google (Abner Li/9to5Google)
https://9to5google.com/2026/02/19/google-announces-gem…

Google announces Gemini 3.1 Pro for ‘complex problem-solving’
In November, Google introduced Gemini 3 Pro in preview. Google today announced Gemini 3.1 Pro "for tasks where a simple answer isn’t enough."

@cdarwin@c.im
2025-12-19 04:53:30

Google’s vibe-coding tool, Opal,
is making its way to Gemini.
The company on Wednesday said it is integrating the tool,
which lets you build AI-powered mini apps,
inside the Gemini web app,
allowing users to create their own custom apps,
which Google calls Gems.
Introduced in 2024,
Gems are customized versions of Gemini designed for specific tasks or scenarios.
For instance, some of Google’s pre-made Gems include
a learning coach,…

Quickstart | Opal | Google for Developers

@nelson@tech.lgbt
2026-03-18 15:59:54

Trying out Jules, a coding agent from Google that is similar to Claude Code. But it's all hosted: you use a web UI to talk to it, and it checks out code from GitHub and runs it in containers. It also operates asynchronously: you give it tasks and it comes back 5-15 minutes later with the work done. I like it quite a bit, but there's a question of whether the Gemini models are as good for coding as Claude.

@jtk@infosec.exchange
2026-02-18 13:51:17

Tom's Hardware has a headline this week summarizing a Financial Times interview with a Microsoft #AI exec that begins thusly:
"Microsoft’s AI boss says AI can replace every white-collar job in 18 months".
If you watch the interview, that is not what was said. The actual statement is a more nuanced claim: that AI can fully automate the tasks of some white-collar work.
But I…

@UP8@mastodon.social
2026-02-18 16:51:21

💍 Origami-inspired ring lets users 'feel' virtual worlds
#vr

Origami-inspired ring lets users 'feel' virtual worlds
Virtual reality (VR) and augmented reality (AR) are technologies that allow users to immerse themselves in digital worlds or enhance their surroundings with computer-generated filters or images, respectively. Both these technologies are now widely used worldwide, whether to experience video games and media content in more engaging ways or improve specific training and assist professionals in their daily tasks.

@v_i_o_l_a@openbiblio.social
2026-01-16 21:31:13

"Deep Research, Shallow Agency: What Academic Deep Research Can and Can't Do"
https://aarontay.substack.com/p/how-agentic-are-academic-deep-research

Deep Research, Shallow Agency: What Academic Deep Research Can and Can't Do
The Agentic Illusion: Most Academic Deep Research runs fixed workflows and stumbles when given unfamiliar literature review tasks that do not fit them.

@mia@hcommons.social
2026-02-14 12:18:03

'Automate tasks, not jobs' - a great headline from a report on 'the AI opportunity for Scotland’s public services' https://stormid.com/research/

Automate tasks, not jobs: The AI opportunity for Scotland’s public services
Independent analysis identifying the highest value public sector workflows for AI driven productivity gains.

@jamesthebard@social.linux.pizza
2026-02-16 18:01:52

Ansible playbook done "enough", everything is now matching. Evidently I did forget to transition one server over to Debian, but got that sorted and now everything is running pretty well.
#ansible #homelab

@pavelasamsonov@mastodon.social
2026-03-16 13:33:47

There's a new "design is dead, because AI" piece (thinly disguised marketing from Anthropic). But looking past the hype headlines, their claims cover purely production-stage tasks.
When it comes to the work of understanding user needs and evaluating the opportunity space, AI actually makes your thinking worse. Studies show that it alienates you from users and colleagues, and flattens your thinking.
We need more human-centered practice, not less.

Software is a coordination problem. AI can't help you with that.
The feedback loops of the product delivery lifecycle go through people. Adding AI makes this process slower, not faster.

@adamhotep@infosec.exchange
2026-03-17 17:06:57

Okay, this Kagi translation tool is pretty entertaining (and an absolute gem as a guerilla marketing tactic, by which I mean it be a fine way to plunder more business, me hearty!)
https://translate.kagi.com/?from=en&to=LinkedIn speak

A translation interface in which English can be converted to LinkedIn hype.

Input:

I offload all of my work to AI

Output:

I’m thrilled to share that I’ve fully optimized my workflow by leveraging cutting-edge AI to handle my entire output. 🚀 By embracing digital transformation and delegating operational tasks to intelligent automation, I’m now focusing 100% of my energy on high-level strategy and driving impactful innovation. 💡 It’s all about working smarter, not harder, and stayi…

Input:

expect layoffs in the next few weeks

Output in Corporate Jargon:

Anticipate a strategic right-sizing of our human capital and a streamlining of our organizational structure within the upcoming weeks to optimize our operational bandwidth.

Input:

All marketing email is spam

Output in Pirate Speak:

Every last marketing parchment be naught but bilge water!

Input:

I found gum under my seat and put it in my mouth

Output in LinkedIn Speak:

I’m thrilled to share a quick story about resourcefulness and seizing unexpected opportunities. 🚀

Today, I discovered a piece of gum under my seat. Instead of letting it go to waste, I took immediate action and put it in my mouth. 🍬

In business, as in life, success is about identifying untapped assets in your immediate environment and having the courage to execute. It’s not just about the "find"—it’…

@wwwgem@social.linux.pizza
2026-02-16 18:53:27

I failed on Git. Jujutsu rescued me.
#blog #blogpost

Jujutsu gets me into VCS
For years, git has been the undisputed champion of version control systems (VCS). It’s powerful, ubiquitous, and the industry standard. But as a newcomer, I found it difficult to get used to how it works and to perform even the simplest tasks. Concepts like the “staging area,” “detached HEAD,” and the sheer terror of rebase were a labyrinth in which I kept getting lost. Jujutsu (jj) is a younger contender that aims to demystify version control. It’s built with a fresh perspective, prioritizing a…

@Techmeme@techhub.social
2026-03-17 04:40:46

Alibaba launches Wukong, an enterprise AI platform that coordinates multiple AI agents to handle complex tasks like document editing, currently in beta (Reuters)
https://www.reuters.com/world/asia-pacific/alibaba-launches-new-ai-agent…

@netzschleuder@social.skewed.de
2026-01-15 03:00:04

windsurfers: Windsurfers network (1986)
A network of interpersonal contacts among windsurfers in southern California during the fall of 1986. The edge weights indicate the perceived social affiliation between surfers, measured by tasks in which each individual was asked to sort cards with the other surfers’ names in order of closeness.
This network has 43 nodes and 336 edges.
Tags: Social, Offline, Weighted

windsurfers: Windsurfers network (1986). 43 nodes, 336 edges. https://networks.skewed.de/net/windsurfers


@inthehands@hachyderm.io
2026-02-17 17:04:54

It’s important to distinguish two different hypothetical ways in which gen AI can constitute a massive wealth transfer:
Scenario 1, “LLMs are the new petrochemicals:” Gen AI is actually effective for all sorts of tasks as advertised. It becomes a necessity for economic participation / useful work / whatever, and ownership of the data model and/or data centers thus means control of high-value resources.
2/

@azonenberg@ioc.exchange
2026-02-10 21:35:45

Anyone know of an Android to-do list application that is
* Completely device local, no network connectivity required or used
* No ads or spyware
* Doesn't time-out tasks even if they sit around for a year uncompleted (looking at you, google calendar)
* Supports recurring maintenance tasks for weekly, monthly, etc. cleaning or something
Open source preferred, but willing to pay a reasonable price if it's out there as a commercial tool

@Cognessence@social.linux.pizza
2026-02-16 03:28:31

“Bees can learn a surprising amount of information from observing peers, including which flowers to visit, but also how to solve complex object-manipulation tasks. Accordingly, many complex social behaviors are much more driven by individual problem solving than by a diffuse swarm intelligence, as was traditionally thought.”
- Lars Chittka, ‘The Mind of a Bee’

@Techmeme@techhub.social
2026-01-12 20:01:28

Anthropic launches Cowork for Claude, built on Claude Code to automate complex tasks with minimal prompting, as a research preview for Claude Max subscribers (Webb Wright/ZDNET)
https://www.zdnet.com/article/anthropic-cowork-for-claude-complex-action…

Claude Cowork automates complex tasks for you now - at your own risk
Available first to Claude Max subscribers, the research preview empowers Anthropic's chatbot to handle complex tasks.

@thomastraynor@social.linux.pizza
2026-03-16 12:03:19

You tell us that we have to use a piece of software weekly or it will be uninstalled and if we want to use it then install it...
Guess what? I automated a script to run every Monday to launch that piece of software just in case I don't use it that week.
Guess what again? I told every team member how to do it and provided the script.
Depending on my workload and tasks I might not use it for over a week and I WILL NOT PUT UP with having to install it again and have a fo…

@benb@osintua.eu
2026-01-25 19:16:13

Putin secretly sends 'special tasks' general to Abu Dhabi talks with Ukraine, signaling a Kremlin strategy shift: https://benborges.xyz/2026/01/25/putin-secretly-sends-special-tasks.html

Tracking information about the Russian War against Ukraine — Modern art makes me want to rock out
A blog running on Micro.blog

@kurtsh@mastodon.social
2026-01-14 03:34:41

Someone built a vectorized index & a conversational AI for the Epstein Files.
Never mind Qs like "How many times was Trump mentioned?" Try giving it complex tasks & questions that require intelligent reasoning, like "Is there any evidence..."
https://epstein.trynia.ai/

Epstein Files
Search the Epstein archive — an AI agent grounded in indexed emails, messages, and documents, powered by Nia

@nobodyinperson@fosstodon.org
2026-02-13 18:22:30

TIL that #Immich hard-codes all its paths into its postgresql database. What a nightmare for migrations. None of the tasks in the UI helped. Tried replacing it in the db, no chance. Had to resort to bind mounting shenanigans.

@Techmeme@techhub.social
2026-03-16 20:10:46

Z.ai launches GLM-5-Turbo, a closed-source, faster, and cheaper variant of GLM-5 optimized for agent-driven workflows and OpenClaw-style tasks (Carl Franzen/VentureBeat)
https://venturebeat.com/technology/z-ai-debuts-faster-cheaper…

@ocrampal@mastodon.social
2026-02-08 10:47:51

Word and Excel vs LLMs.
Secretaries became executive assistants; their role evolved toward higher-level coordination, communication, and decision support. Accountants gained the ability to do far more analysis, strategic planning, and advisory work. The tools eliminated tedious manual tasks, but the roles themselves weren't eliminated. They were elevated.
The same pattern applies to programmers. LLMs can handle boilerplate, generate first drafts, automate simple tasks.

@mxp@mastodon.acm.org
2026-02-10 18:55:36

@… Thanks for the feedback!
Great idea to use Jinja! I’ve considered using a macro processor (e.g., m4) for similar tasks, but who wants to write m4 macros!? A template engine is a much better idea.

@maxheadroom@hub.uckermark.social
2026-02-11 14:22:10

User: Rovo, can JIRA be used for project planning?
Rovo: Yes, absolutely. It's actually best for that ... bla bla bla
User: How do I specify temporal dependencies between tasks and identify the critical path?
Rovo: Jira can't actually do that.
#WTF #AI

@frankstohl@mastodon.social
2026-02-05 14:43:56

You're an AI and can't solve captchas? Then get yourself a human #AI #KI #Human https://…

RentAHuman.ai - AI Agents Hire Humans for Physical Tasks
The marketplace where AI agents rent humans. MCP integration, REST API, flexible payments. Book humans for real-world tasks your AI can't do.

@scottmiller42@mstdn.social
2026-03-10 18:24:50

Overheard in the office:
Cuts and attrition have gone so deep that a single person taking a sick or vacation day leaves some tasks without any coverage. Managers are handling that by calling on people to cross-train, but that piles additional mental load onto a team that was already stretched thin.
#OfficeWorkerGripes

@Techmeme@techhub.social
2026-02-16 11:35:41

Alibaba debuts Qwen 3.5, adding "visual agentic capabilities" to independently execute tasks, and says it is 60% cheaper to use and 8x better at large workloads (Eduardo Baptista/Reuters)
https://www.reuters.com/world/china/alibaba-un…

@pygospa@social.linux.pizza
2026-02-04 17:33:03

We all knew that this was inevitable if we continued down the path we've chosen...
#AI #ChatGPT

RentAHuman.ai - AI Agents Hire Humans for Physical Tasks
The marketplace where AI agents rent humans. MCP integration, REST API, flexible payments. Book humans for real-world tasks your AI can't do.

@jamesthebard@social.linux.pizza
2026-01-14 04:54:50

Started the official rewrite of the Sisyphus client in #golang, working on getting the FFmpeg command-line tasks parsed and validated against the schema. This should make the client easier to distribute, as I can just ship static binaries.
#programming

A screenshot of the FFmpeg structures in Go that will store job information and be used to construct command-line arguments.

@arXiv_csOS_bot@mastoxiv.page
2026-02-11 07:45:45

AgentCgroup: Understanding and Controlling OS Resources of AI Agents
Yusheng Zheng, Jiakun Fan, Quanzhi Fu, Yiwei Yang, Wei Zhang, Andi Quinn
https://arxiv.org/abs/2602.09345 https://arxiv.org/pdf/2602.09345 https://arxiv.org/html/2602.09345
arXiv:2602.09345v1 Announce Type: new
Abstract: AI agents are increasingly deployed in multi-tenant cloud environments, where they execute diverse tool calls within sandboxed containers, each call with distinct resource demands and rapid fluctuations. We present a systematic characterization of OS-level resource dynamics in sandboxed AI coding agents, analyzing 144 software engineering tasks from the SWE-rebench benchmark across two LLMs. Our measurements reveal that (1) OS-level execution (tool calls, container and agent initialization) accounts for 56-74% of end-to-end task latency; (2) memory, not CPU, is the concurrency bottleneck; (3) memory spikes are tool-call-driven with a peak-to-average ratio of up to 15.4x; and (4) resource demands are highly unpredictable across tasks, runs, and models. Comparing these characteristics against serverless, microservice, and batch workloads, we identify three mismatches in existing resource controls: a granularity mismatch (container-level policies vs. tool-call-level dynamics), a responsiveness mismatch (user-space reaction vs. sub-second unpredictable bursts), and an adaptability mismatch (history-based prediction vs. non-deterministic stateful execution). We propose AgentCgroup, an eBPF-based resource controller that addresses these mismatches through hierarchical cgroup structures aligned with tool-call boundaries, in-kernel enforcement via sched_ext and memcg_bpf_ops, and runtime-adaptive policies driven by in-kernel monitoring. Preliminary evaluation demonstrates improved multi-tenant isolation and reduced resource waste.
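
For readers unfamiliar with the peak-to-average ratio mentioned in the abstract, here is a minimal Python sketch of how such a figure could be computed from a series of memory readings. The function name and the sample values are made up for illustration; they are not taken from the paper or its tooling.

```python
from statistics import mean


def peak_to_average(samples):
    """Return max(samples) / mean(samples) for a non-empty series of readings."""
    if not samples:
        raise ValueError("need at least one sample")
    average = mean(samples)
    if average <= 0:
        raise ValueError("average must be positive")
    return max(samples) / average


# Hypothetical memory readings in MiB with one tool-call-driven spike.
memory_mib = [200, 210, 190, 1400, 205, 195]
ratio = peak_to_average(memory_mib)
print(f"peak-to-average ratio: {ratio:.1f}x")
```

A high ratio means that sizing a container for the peak wastes memory most of the time, which is the kind of mismatch the abstract describes.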

@ErikJonker@mastodon.social
2026-02-06 08:13:54

Interesting read, it illustrates the challenge we have with regard to learning in a world with AI. We have to take measures for that because use of AI in coding is not going away and will only increase.
https://arxiv.org/abs/2601.20245

How AI Impacts Skill Formation
AI assistance produces significant productivity gains across professional domains, particularly for novice workers. Yet how this assistance affects the development of skills required to effectively supervise AI remains unclear. Novice workers who rely heavily on AI to complete unfamiliar tasks may compromise their own skill acquisition in the process. We conduct randomized experiments to study how developers gained mastery of a new asynchronous programming library with and without the assistanc…

@Sustainable2050@mastodon.energy
2026-02-04 20:21:27

By 2027, 2.5 million French civil servants will stop using video conference tools from U.S. providers — including Zoom, Microsoft Teams, Webex and GoTo Meeting — and switch to Visio, a homegrown service.
https://apnews.com/article/europe-digital…

France dumps Zoom and Teams as Europe seeks digital autonomy from the US
European governments are moving away from U.S. tech giants, opting for domestic or open-source alternatives. France plans to replace Zoom and Teams with a homegrown video conference system by 2027. Austria's military has adopted open-source office software, and a German state has switched to free software for administrative tasks. This shift aims for “digital sovereignty” amid concerns over data privacy and reliance on U.S. companies. The movement has gained momentum due to geopolitical ten…

@aral@mastodon.ar.al
2025-12-30 12:01:53

Caught a bug over the holidays so I’m mostly resting, feeling sorry for myself, and taking the time to at least carry out some mindless housekeeping tasks (updating dependencies, etc.) on some of my Node modules.
Released updates to the following packages yesterday:
Tape-based Node.js testing:
• Tap monkey (https://

tap-monkey
A tap formatter that’s also a monkey.

@Techmeme@techhub.social
2026-02-09 20:55:44

An eight-month study at a US tech company finds AI tools didn't reduce work but intensified it, as employees worked faster and took on a broader range of tasks (Harvard Business Review)
https://hbr.org/2026/02/ai-doesnt-reduce-work-it-intensifies-it

AI Doesn’t Reduce Work—It Intensifies It
One of the promises of AI is that it can reduce workloads so employees can focus more on higher-value and more engaging tasks. But according to new research, AI tools don’t reduce work, they consistently intensify it: In the study, employees worked at a faster pace, took on a broader scope of tasks, and extended work into more hours of the day, often without being asked to do so. That may sound like a win, but it’s not quite so simple. These changes can be unsustainable, leading to workload…

@newsie@darktundra.xyz
2026-02-12 14:04:06

Nation-state hackers ramping up use of Gemini for target reconnaissance, malware coding, Google says https://therecord.media/nation-state-hackers-using-gemini-for-malicious-campaigns

Nation-state hackers ramping up use of Gemini for target reconnaissance, malware coding, Google says
Researchers found that APT groups were using the AI tool for coding and scripting tasks, gathering information about potential targets, researching publicly known vulnerabilities and enabling post-compromise activities.

@rselbach@cosocial.ca
2026-02-08 13:34:02

A colleague and I have been doing a lot of testing with Opus 4.6 since yesterday. I spent $23 on a single prompt using /fast btw, insane! Anyway, the agent team functionality is cool to see, but I was underwhelmed by the quality.
Overall, I haven't noticed much improvement over Opus 4.5 at individual tasks. But when using agent teams, the quality is more like Sonnet 4.5's. It is not that smart.

@tante@tldr.nettime.org
2026-02-27 09:50:24

Given how they can only be produced by exploiting workers from the global majority, no system that includes any of the big LLMs in its production can ever be called "fair".
"Fair LLMs" of the size required to do the tasks people want LLMs to do (badly) do not exist.

@kuba@toot.kuba-orlik.name
2026-02-04 12:27:56

> We find that AI use impairs conceptual understanding, code reading, and debugging abilities, without delivering significant efficiency gains on average.
https://arxiv.org/abs/2601.20245

How AI Impacts Skill Formation
AI assistance produces significant productivity gains across professional domains, particularly for novice workers. Yet how this assistance affects the development of skills required to effectively supervise AI remains unclear. Novice workers who rely heavily on AI to complete unfamiliar tasks may compromise their own skill acquisition in the process. We conduct randomized experiments to study how developers gained mastery of a new asynchronous programming library with and without the assistanc…

@raiders@darktundra.xyz
2026-02-10 19:17:38

Raiders Get Compelling Words Over Defensive Coordinator Search https://heavy.com/sports/nfl/las-vegas-raiders/compelling-words-defensive-coordinator-search/

Raiders Get Compelling Words Over Defensive Coordinator Search
The Las Vegas Raiders have their new head coach in Klint Kubiak, who was officially announced on Feb. 9. Kubiak is coming off winning Super Bowl LX as the Seattle Seahawks’ offensive coordinator and will now have the task of revitalizing the Raiders’ offense. One of Kubiak’s first tasks will be to fill out his coaching staff, and two key hires will be the offensive line coach and the defensive coordinator. With Kubiak being an offensive guy, whoever he chooses as his defensive coordina…

@jdrm@social.linux.pizza
2026-03-06 07:03:49

I don't know if you saw this: what people are doing so they can program while their company thinks they're using a stochastic-parrot agent https://danq.me/2026/03/03/ai-agent-logging/

Subverting AI Agent Logging with a Git Post-Commit Hook
I keep hearing from developer friends who are 'expected' by their employer to demonstrate that they're using AI, even for tasks at which the AI is demonstrably a suboptimal choice. So - as a joke - I came up with a git post-commit hook that makes it look like they're doing so, even when they're not.

@schtobia@augsburg.social
2026-02-09 08:53:11

TIL: `git worktree` https://www.sanyamarya.com/blog/git-worktree-vs-stash-better-workflow/

Sanyam Arya | Beyond Stashing: Why Git Worktree is Your New Workflow Superpower
Discover why git worktree offers a superior alternative to git stash for managing parallel development tasks, improving productivity and reducing workflow disruptions.
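
For anyone who has not tried it: `git worktree add` checks out a branch into a separate directory, so parallel work does not need a stash. Below is a rough Python sketch that builds and optionally runs the command. The helper names, the path, and the branch name are made up for illustration; only `git worktree add <path> <branch>` and its `-b` option are real git syntax.

```python
import subprocess


def worktree_add_command(path, branch, create_branch=False):
    """Build the argument list for `git worktree add`.

    With create_branch=True the branch is created via `-b`;
    otherwise an existing branch is checked out at `path`.
    """
    if create_branch:
        return ["git", "worktree", "add", "-b", branch, path]
    return ["git", "worktree", "add", path, branch]


def add_worktree(path, branch, create_branch=False):
    """Run the command in the current repository; raises CalledProcessError on failure."""
    subprocess.run(worktree_add_command(path, branch, create_branch), check=True)


# Hypothetical example: a hotfix checkout next to the main working copy.
command = worktree_add_command("../hotfix", "hotfix-1.2")
print(" ".join(command))
```

Calling `add_worktree("../hotfix", "hotfix-1.2")` inside a repository would then create the second working directory without touching uncommitted changes in the first.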

@netzschleuder@social.skewed.de
2026-01-09 11:00:04

windsurfers: Windsurfers network (1986)
A network of interpersonal contacts among windsurfers in southern California during the fall of 1986. The edge weights indicate the perceived social affiliation between surfers, measured by tasks in which each individual was asked to sort cards with the other surfers’ names in order of closeness.
This network has 43 nodes and 336 edges.
Tags: Social, Offline, Weighted


@nohillside@smnn.ch
2026-02-03 18:56:30

The #ai #apocalypse is near 🤬
RentAHuman.ai - AI Agents Hire Humans for Physical Tasks https://rentahuman.ai

@privacity@social.linux.pizza
2026-02-06 14:21:52

From Chatbot to Checkout: Who Pays When Transactional Agents Play?
https://fpf.org/blog/from-chatbot-to-checkout-who-pays-when-transactional-agents-play/
@…

From Chatbot to Checkout: Who Pays When Transactional Agents Play?
If 2025 was the year of agentic systems, 2026 may be the year these technologies reshape e-commerce. Agentic AI systems are defined by the ability to complete more complex, multi-step tasks, and exhibit greater autonomy over how to achieve user goals. As these systems have advanced, technology providers have been exploring the nexus between AI technologies and online commerce, with many launching purchase features and partnering with established retailers to offer shopping experiences within gen…

@seeingwithsound@mas.to
2026-01-02 21:25:42

Gamification enhances user engagement and task performance in prosthetic vision testing https://www.medrxiv.org/content/10.64898/2025.12.20.25342740v2 "Three Argus II users completed circle localization and motion direction discrimination in clinical and gamified ver…

Gamification Enhances User Engagement and Task Performance in Prosthetic Vision Testing
Purpose: Visual function testing in retinal prosthesis users relies on repetitive psychophysical tasks that are cognitively demanding and fatiguing. Gamification may increase engagement, but its effects on perceptual performance in implanted users remain unclear. Methods: Three Argus II users completed circle localization and motion direction discrimination in clinical and gamified versions. Visual stimuli, trial structure, and response requirements were matched within each participant; gamified…

@leftsidestory@mstdn.social
2026-02-05 01:47:46

WTF?
New Site Lets AI Rent Human Bodies - Futurism https://apple.news/ANMU3h3V2QBKWcilOP4_LLw

New Site Lets AI Rent Human Bodies — Futurism
"Robots need your body." Illustration by Tag Hartman-Simkins / Futurism. The machines aren’t just coming for your jobs. Now, they want your bodies as well. That’s at least the hope of Alexander Liteplo, a software engineer and founder of RentAHuman.ai, a platform for AI agents to “search, book, and pay humans for physical-world tasks.” When Liteplo launched RentAHuman on Monday, he boasted that he already had over 130 people listed on the platform, including an OnlyFans model and the CE…

@penguin42@mastodon.org.uk
2026-01-30 18:42:01

Reading about moltbook/openclaw in @… toot is fascinating. If things get really, really crazy, I can imagine a development where AIs show off skills and get hired by other bots to do tasks, kind of like a bounty-hunter website, being paid in compute time or bitcoins at a rate set by their owners.

@balaji@social.linux.pizza
2025-12-25 13:05:55

Been using a number of AI models over the past week or so as work has slowed down, giving me time to explore things more deeply.
Been using Claude Code with musistudio/claude-code-router which is great as I can switch between different models on similar tasks.
Experience so far has been that Gemini 3 Flash is very good for thinking and coding tasks but the code does tend to be fragile so rewrites are needed. For tough problems where the errors are not straightforward it falls d…

@frankel@mastodon.top
2026-03-08 09:06:24

#ClaudeCode Performance: Unlock Deep #Thinking for Better Results
https://claudefa.st/blog/guide/perform

Claude Code Deep Thinking: Unlock Better Results
Claude Fast | Improve Claude Code performance with deep thinking techniques. Learn how to trigger advanced reasoning for complex problem-solving tasks.

@Techmeme@techhub.social
2026-02-26 02:56:04

Anthropic unveils scheduled tasks in Cowork, enabling Claude to complete recurring tasks at specific times automatically (Claude/@claudeai)
https://x.com/claudeai/status/2026720870631354429

Claude (@claudeai) on X
New in Cowork: scheduled tasks. Claude can now complete recurring tasks at specific times automatically: a morning brief, weekly spreadsheet updates, Friday team presentations.

@netzschleuder@social.skewed.de
2026-01-08 07:00:04

windsurfers: Windsurfers network (1986)
A network of interpersonal contacts among windsurfers in southern California during the fall of 1986. The edge weights indicate the perceived social affiliation between surfers, measured by tasks in which each individual was asked to sort cards with the other surfers’ names in order of closeness.
This network has 43 nodes and 336 edges.
Tags: Social, Offline, Weighted


@aral@mastodon.ar.al
2026-02-23 17:41:10

🥳 New Kitten¹ release
• Added `initialise()` hook to `kitten.Component` instances.
This gets called at the end of the constructor and is handy if you don’t want to override the constructor and have to handle the `data` parameter and remember to call `super(data)`. You can still access passed data from `this.data`.
 Note that the component is not part of the view hierarchy on the client at this point. If you have tasks you need to perform only once per page – for example, ins…

app/CHANGELOG.md at main
app - A web development kit that’s small, purrs, and loves you.
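
The `initialise()` hook in the release note above follows a common pattern: the base constructor stores the passed data and then calls an overridable method. A minimal Python analogue might look like the sketch below. It is not Kitten's actual JavaScript API; the `Component` and `Counter` classes and the `start` key are hypothetical.

```python
class Component:
    """Hypothetical base class that mimics a constructor-end hook."""

    def __init__(self, data=None):
        # Store the passed data first so the hook can read it.
        self.data = data if data is not None else {}
        # Call the hook as the last step of construction.
        self.initialise()

    def initialise(self):
        """Override in subclasses; the default does nothing."""


class Counter(Component):
    def initialise(self):
        # No constructor override and no super().__init__(data) call needed.
        self.count = self.data.get("start", 0)


counter = Counter({"start": 3})
print(counter.count)
```

The design choice is that subclasses never touch the constructor signature, so they cannot forget to forward `data` to the parent.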

@cdarwin@c.im
2026-02-25 22:59:17

What will people do when AI can handle most current white-collar tasks?
I don't know.
And that's the whole point.
Nobody knew what displaced agricultural workers would do, either,
-- until they did it.
The absence of a visible next chapter isn't evidence that there won't be one.
It's evidence that we're bad at predicting what humans will invent when constraints shift.

Everything is awesome (why I'm an optimist)
February is the month the internet decided we're all going to die. In the span of about two weeks, Matt Shumer's Something Big is Happening racked up over 80 million views on X with its breathless comparison of AI to the early days of COVID, telling his non-tech friends and

@UP8@mastodon.social
2026-02-04 15:55:50

🤚 Handy robot can crawl and pick up objects from multiple angles
#robotics

Handy robot can crawl and pick up objects from multiple angles
Like something out of the Addams Family, scientists have created a detachable robotic hand that can crawl and grab objects. The design enables tasks such as retrieving objects beyond normal reach and performing multi-object handling, offering potential applications in industrial, service, and exploratory robotics.

@Techmeme@techhub.social
2026-03-13 11:02:52

STMicro plans to retrain workers and deploy humanoid robots in its older chip plants for repetitive and physically demanding tasks, aiming to avoid closures (Nathan Vifflin/Reuters)
https://www.reuters.com/business/stmicroelectronics-pla…

@ErikJonker@mastodon.social
2026-01-28 08:34:09

New open weights model Kimi K2.5
"self-directed agent swarm paradigm",
"For complex tasks, Kimi K2.5 can self-direct an agent swarm with up to 100 sub-agents, executing parallel workflows across up to 1,500 tool calls. Compared with a single-agent setup, this reduces execution time by up to 4.5x. The agent swarm is automatically created and orchestrated by Kimi K2.5 without any predefined subagents or workflow."

Kimi K2.5: Visual Agentic Intelligence
Kimi K2 landed in July as a 1 trillion parameter open weight LLM. It was joined by Kimi K2 Thinking in November which added reasoning capabilities. Now they've made it …

@netzschleuder@social.skewed.de
2026-02-06 11:00:04

windsurfers: Windsurfers network (1986)
A network of interpersonal contacts among windsurfers in southern California during the Fall of 1986. The edge weights indicate the perceived social affiliations, measured by a task in which each individual was asked to sort cards bearing the other surfers' names in order of closeness.
This network has 43 nodes and 336 edges.
Tags: Social, Offline, Weighted

@Techmeme@techhub.social
2026-02-13 11:31:09

Baidu plans to let users access OpenClaw via its search app and integrate OpenClaw's capabilities into its e-commerce business and other services (Evelyn Cheng/CNBC)
https://www.cnbc.com/2026/02/13/baidu-openclaw-ai-search-app-integratio…

China's Baidu adds OpenClaw AI into search app for 700 million users ahead of Lunar New Year
Chinese search engine operator Baidu plans on Friday to start letting smartphone app users directly tell OpenClaw AI to perform tasks.

@Techmeme@techhub.social
2026-02-12 21:51:01

Didero, which provides an agentic AI layer that integrates with ERP systems to automate supply chains, raised a $30M Series A co-led by Chemistry and Headline (Marina Temkin/TechCrunch)
https://techcrunch.com/2026/02/12/didero-lands-3…

Didero lands $30M to put manufacturing procurement on 'agentic' autopilot | TechCrunch
Didero functions as an agentic AI layer that sits on top of a company’s existing ERP, acting as a coordinator that reads incoming communications and automatically executes the necessary updates and tasks.

@Techmeme@techhub.social
2026-02-12 06:26:00

Hong Kong-listed Zhipu AI surged 30% after releasing GLM-5, an open-source LLM with enhanced capabilities for coding and long-running agent tasks (CNBC)
https://www.cnbc.com/2026/02/12/chinese-ai-stocks-new-model-and-agent-releases-zhipu-mini…

Zhipu leads rally in Chinese AI stocks, surging 30%, as a wave of new releases hits market
The Shanghai STAR AI Industry Index climbed 1.7% before paring gains.

@Techmeme@techhub.social
2026-02-11 17:21:01

Z.ai launches GLM-5, its flagship open-weight model, saying it has best-in-class performance among open-source models in reasoning, coding, and agentic tasks (Z.ai)
https://z.ai/blog/glm-5

@netzschleuder@social.skewed.de
2026-01-03 15:00:03

windsurfers: Windsurfers network (1986)
A network of interpersonal contacts among windsurfers in southern California during the Fall of 1986. The edge weights indicate the perceived social affiliations, measured by a task in which each individual was asked to sort cards bearing the other surfers' names in order of closeness.
This network has 43 nodes and 336 edges.
Tags: Social, Offline, Weighted

@Techmeme@techhub.social
2026-03-11 02:01:41

Intel unveiled its Heracles chip at ISSCC in February, saying it accelerates fully homomorphic encryption tasks up to 5,000x faster than top Intel server CPUs (Samuel K. Moore/IEEE Spectrum)
https://spectrum.ieee.org/fhe-intel

Intel's Heracles Chip Speeds Up FHE Computing
Intel's Heracles chip speeds up encrypted data processing by up to 5000 times.

@netzschleuder@social.skewed.de
2026-03-02 08:00:05

windsurfers: Windsurfers network (1986)
A network of interpersonal contacts among windsurfers in southern California during the Fall of 1986. The edge weights indicate the perceived social affiliations, measured by a task in which each individual was asked to sort cards bearing the other surfers' names in order of closeness.
This network has 43 nodes and 336 edges.
Tags: Social, Offline, Weighted

@Techmeme@techhub.social
2026-02-10 10:55:55

Alibaba's DAMO Academy releases RynnBrain, an open-source foundation model to help robots perform real-world tasks like navigating rooms, trained on Qwen3-VL (Saritha Rai/Bloomberg)
https://www.bloomberg.com/news/articles/2026…

@Techmeme@techhub.social
2026-02-10 14:20:58

Cloud computing provider Nebius agrees to buy Tavily, which helps AI agents search for up-to-date information for tasks like coding, for what a source says is $275M (Dina Bass/Bloomberg)
https://www.bloomberg.com/news/articles/20…

@Techmeme@techhub.social
2026-03-10 14:25:54

Emil Michael says Google will deploy Gemini AI agents to Pentagon's 3M-strong workforce, initially on unclassified networks for tasks such as creating budgets (Katrina Manson/Bloomberg)
https://www.bloomberg.com/news/articles/20

@netzschleuder@social.skewed.de
2026-02-27 12:00:04

windsurfers: Windsurfers network (1986)
A network of interpersonal contacts among windsurfers in southern California during the Fall of 1986. The edge weights indicate the perceived social affiliations, measured by a task in which each individual was asked to sort cards bearing the other surfers' names in order of closeness.
This network has 43 nodes and 336 edges.
Tags: Social, Offline, Weighted

@Techmeme@techhub.social
2026-01-30 21:31:07

Hands-on with Google's Auto Browse for Chrome: its ability to perform multistep tasks is noticeably better than similar tools but struggles with complex tasks (Reece Rogers/Wired)
https://www.wired.com/story/google-chrome-auto-browse-hands-on/

I Let Google's ‘Auto Browse’ AI Agent Take Over Chrome. It Didn't Quite Click
Auto Browse can shop for clothes, plan a trip, and buy tickets for you. Or at least, that's the idea.

@netzschleuder@social.skewed.de
2025-12-30 06:00:04

windsurfers: Windsurfers network (1986)
A network of interpersonal contacts among windsurfers in southern California during the Fall of 1986. The edge weights indicate the perceived social affiliations, measured by a task in which each individual was asked to sort cards bearing the other surfers' names in order of closeness.
This network has 43 nodes and 336 edges.
Tags: Social, Offline, Weighted

@Techmeme@techhub.social
2026-01-08 20:45:46

OpenAI is rolling out a HIPAA-compliant version of ChatGPT for clinicians to assist with medical reasoning and administrative tasks, at Cedars-Sinai and others (Shirin Ghaffary/Bloomberg)
https://www.bloomberg.com/news/newsletters

@Techmeme@techhub.social
2026-03-09 13:30:44

Microsoft launches Copilot Cowork, integrating Anthropic's Claude Cowork tech into Microsoft 365 and using Work IQ to ground actions in organizational data (Charles Lamanna/Microsoft 365 Blog)
https://www.microsoft.com/en-us/microsoft-

Copilot Cowork: A new way of getting work done | Microsoft 365 Blog
Copilot Cowork turns intent into action across Microsoft 365—automating tasks, coordinating workflows, and keeping you in control. See how.

@netzschleuder@social.skewed.de
2026-02-24 19:00:04

windsurfers: Windsurfers network (1986)
A network of interpersonal contacts among windsurfers in southern California during the Fall of 1986. The edge weights indicate the perceived social affiliations, measured by a task in which each individual was asked to sort cards bearing the other surfers' names in order of closeness.
This network has 43 nodes and 336 edges.
Tags: Social, Offline, Weighted

@Techmeme@techhub.social
2026-01-08 07:20:49

Local governments across China are funding dozens of "robot training centers", where human trainers mimic movements like folding clothes to teach the robots (Rest of World)
https://restofworld.org/2026/china-robots-training-centers-workers/

In Chinese data factories, workers teach humanoid robots boring tasks
Local governments are building training centers to address a shortage of robotic data, as China makes embodied intelligence a national priority.

@Techmeme@techhub.social
2026-02-05 18:01:26

Anthropic says Opus 4.6 adds "agent teams" that can split larger tasks into segmented jobs and integrates Claude directly into PowerPoint via a side panel (Lucas Ropek/TechCrunch)
https://techcrunch.com/2026/02/05/anthropic-releases-opus-4-6-with…

Anthropic releases Opus 4.6 with new 'agent teams' | TechCrunch
The newest version of Anthropic's model is designed to broaden its capabilities and appeal, allowing for a greater variety of uses and customers.

@Techmeme@techhub.social
2026-02-05 18:07:52

OpenAI launches GPT-5.3-Codex, which it says runs 25% faster, enabling longer-running tasks, and "is our first model that was instrumental in creating itself" (David Gewirtz/ZDNET)
https://www.zdnet.com/article/openai-gpt-5-3-codex-faster-goes-beyond-c…

OpenAI's new GPT-5.3-Codex is 25% faster and goes way beyond coding now - what's new
The Codex team said GPT-5.3-Codex even helped build itself.

@Techmeme@techhub.social
2026-02-05 13:03:41

London-based Lawhive, whose lawyers and AI tools help individuals and SMBs automate legal tasks, raised a $60M Series B, after a $40M Series A in December 2024 (Jeremy Kahn/Fortune)
https://fortune.com/2026/02/05/lawhive-ai-law-firm-startup-series-b-ventu…

Exclusive: Lawhive, a startup using AI to reimagine the general practice law firm, raises $60 million in new venture capital funding
The company is expanding rapidly in the U.S. with its technology-centric vision of the general practice law firm

@Techmeme@techhub.social
2026-02-05 17:56:41

Anthropic says Claude Opus 4.6 supports a 1M context window, scored 90.2% on BigLaw Bench, the highest for any Claude model, and boosts agentic capabilities (David Gewirtz/ZDNET)
https://www.zdnet.com/article/anthropic-claude-opus-4-6-first-try-work-deliv…

Anthropic says its new Claude Opus 4.6 can nail your work deliverables on the first try
The frontier model can handle complex, end-to-end enterprise workflows and take on the autonomous tasks you usually do yourself.

@Techmeme@techhub.social
2026-01-06 04:01:14

AMD unveils Ryzen AI 400 Series AI PC chips with 12 CPU cores, claiming 1.3x faster multitasking and 1.7x faster content creation than rivals (Rebecca Szkutak/TechCrunch)
https://techcrunch.com/2026/01/05/amd-unveils-new-ai-pc-processors-…

AMD unveils new AI PC processors for general use and gaming at CES | TechCrunch
AMD announced the latest version of its AI-powered PC chips designed for a variety of tasks from gaming to content creation and multitasking.

@Techmeme@techhub.social
2026-02-04 20:56:33

Internal memos: Meta said Avocado is its "most capable pre-trained base model" and achieves 10x compute efficiency "wins" on text tasks vs. Llama 4 Maverick (Jyoti Mann/The Information)
https://www.theinformation.com/articles/meta-memo-new…

Meta Memo: New Avocado Model ‘Most Capable’ to Date
Meta Platforms is sounding an increasingly bullish note about the first major AI model expected to emerge from its new AI group. The company recently told some staffers in the group, known as Meta Superintelligence Labs, that its next generation large language model, codenamed Avocado, is “now ...

@Techmeme@techhub.social
2026-02-02 16:36:13

Fieldguide, which uses AI agents to automate accounting and auditing tasks, raised a $75M Series C led by Goldman Sachs Alternatives at a $700M valuation (Leo Schwartz/Fortune)
https://fortune.com/2026/02/02/goldman-sachs-fieldguid…

Goldman Sachs leads $75 million funding round for Fieldguide, an AI-native accounting and audit platform
Valued at $700 million, Fieldguide is combating the ‘existential’ talent crisis for the CPA industry.

@Techmeme@techhub.social
2026-03-01 15:05:43

A look at Hyundai's Atlas humanoid robot, slated for assembly tasks in 2028; Hyundai has invested billions in robotics since acquiring Boston Dynamics in 2021 (Hyonhee Shin/Bloomberg)
https://www.bloomberg.com/news/articles/20

@Techmeme@techhub.social
2026-03-02 04:40:41

Early data show wages are rising for AI-exposed jobs that place a high value on a "worker's tacit knowledge and experience", as textbook knowledge loses value (J. Scott Davis/Federal Reserve Bank of Dallas)
https://www.dallasfed.org/research/economics/2026/0224

AI is simultaneously aiding and replacing workers, wage data suggest
Artificial intelligence’s impact on the labor market will depend on whether the technology automates or augments worker tasks.

@Techmeme@techhub.social
2026-03-01 06:21:03

Multiple AWS developers say they are asked to take on new roles with AI tools' assistance, and engineers are now required to complete technical writing tasks (Financial Times)
https://www.ft.com/content/433f41f2-bf6d-4bdf-a561-50ab516bc62d

@Techmeme@techhub.social
2026-01-30 21:15:54

Anthropic details an experiment on whether AI coding tools shape developer skills, finding that the biggest performance gap appears in debugging tasks (Anthropic)
https://www.anthropic.com/research/AI-assistance-coding-skills

How AI assistance impacts the formation of coding skills
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.

@Techmeme@techhub.social
2026-01-30 17:01:52

Poetiq, which leverages existing LLMs to create "expert agents" for specific tasks, and spent just $40K to achieve high ARC-AGI-2 scores, raised a $45.8M seed (Ian Krietzberg/Puck)
https://puck.news/how-poetiqs-six-person-team-beat-google-at-ai/

@Techmeme@techhub.social
2026-01-29 10:15:53

Airtable unveils Superagent, a service that can deploy AI agents in parallel for tasks like market analysis, its first standalone product in its 13-year history (Connie Loizos/TechCrunch)
https://techcrunch.com/2026/01/27/airt

Airtable jumps into the AI agent game with Superagent | TechCrunch
SuperAgent is Airtable's first standalone product in its 13-year history, and signals both the company's ambitions and the reality of the current AI moment: every serious software player is racing to prove they can deliver on agents.

@Techmeme@techhub.social
2026-02-24 13:01:39

Basis, which builds AI agents to help accounting firms with tasks like tax returns, raised $100M led by Accel at a $1.15B valuation, for $138M in total funding (Rebecca Torrence/Bloomberg)
https://www.bloomberg.com/news/articles/20

@Techmeme@techhub.social
2026-01-27 10:40:47

Moonshot says Kimi K2.5 builds on K2 with "pretraining over ~15T mixed visual and text tokens" and "can self-direct an agent swarm with up to 100 sub-agents" (Kimi)
https://www.kimi.com/blog/kimi-k2-5.html

Kimi K2.5: Visual Agentic Intelligence
Try Kimi K2.5, the strongest open-source model for visual coding. Explore agent swarm preview for massive tasks. Simplify complex Office work with precision.

@Techmeme@techhub.social
2026-02-26 16:40:46

Encord, whose software helps companies developing AI models manage training data for robots and other uses, raised $60M at a $500M pre-money valuation (Rocket Drew/The Information)
https://www.theinformation.com/articles/robot-data-startup-raises-60-million

A Robot Data Startup Raises $60 Million
Companies developing AI models to power humanoid and other robots have been hard at work collecting videos and other data for training their models, even paying people to record themselves completing tasks in homes and workplaces. As these data-collection efforts start to pay off, and robots take ...

@Techmeme@techhub.social
2025-12-24 16:21:11

Beijing-based DP Technology, which develops AI tools used by researchers for tasks like computer-aided drug design and battery design, raised a ~$114M Series C (Eunice Xu/South China Morning Post)
https://www.scmp.com/business/companies/ar

AI-for-Science start-up DP Technology raises US$114 million in Series C round
The Beijing-based AI-for-Science firm said the Series C round would fund hiring and R&D, as interest grows in using AI to speed up scientific discovery.

@Techmeme@techhub.social
2025-12-23 12:15:45

China's MiniMax releases M2.1, an upgrade to its open-source M2 model that it says has "significantly enhanced" coding capabilities in Rust, Java, and others (MiniMax)
https://www.minimax.io/news/minimax-m21

MiniMax M2.1: Significantly Enhanced Multi-Language Programming, Built for Real-World Complex Tasks - MiniMax News

@Techmeme@techhub.social
2025-12-21 05:05:57

METR: Claude Opus 4.5 has a 50% task completion time horizon of about 4 hours and 49 minutes, more than double that of Claude Opus 4 released earlier this year (@metr_evals)
https://x.com/metr_evals/status/2002203627377574113

METR (@METR_Evals) on X
We estimate that, on our tasks, Claude Opus 4.5 has a 50%-time horizon of around 4 hrs 49 mins (95% confidence interval of 1 hr 49 mins to 20 hrs 25 mins). While we're still working through evaluations for other recent models, this is our highest published time horizon to date.

@Techmeme@techhub.social
2026-01-20 13:30:58

Legal AI startup Ivo, which aims to reduce hallucinations by breaking legal reviews into 400 tasks, raised a $55M Series B at what a source says is a $355M valuation (Aditya Soni/Reuters)
https://www.reuters.com/technology/legal-ai-startup…
