Tootfinder

@Techmeme@techhub.social
2025-06-17 10:05:43

[Thread] A new US paper shows the best frontier LLM models achieve 0% on hard real-life Programming Contest problems, domains where expert humans still excel (Rohan Paul/@rohanpaul_ai)
https://x.com/rohanpaul_ai/status/1934751145400111572

Rohan Paul (@rohanpaul_ai) on X
This is really BAD news of LLM's coding skill. ☹️ The best Frontier LLM models achieve 0% on hard real-life Programming Contest problems, domains where expert humans still excel. LiveCodeBench Pro, a benchmark composed of problems from Codeforces, ICPC, and IOI (“International

@jake4480@c.im
2025-06-13 20:11:22

Telescopes on the Andes glimpse elusive encounters fueled by the very first stars in the universe more than 13 billion years ago by detecting cosmic microwave light signals https://www.404media.co/humans-have-now-seen-the-dawn-of-time-from-ear…

CLASS telescopes can detect cosmic microwave light signals from the 'cosmic dawn'. Image of two telescopes, a blue sky, and the research facility by: Deniz Valle and Jullianna Couto

Humans Have Now Seen the Dawn of Time from Earth After Breakthrough
Telescopes perched on the Andes Mountains glimpsed elusive encounters fueled by the first of the first stars in the universe more than 13 billion years ago.

@inthehands@hachyderm.io
2025-06-16 17:07:15

There has long been a quiet debate, currently drowned out but still very much happening, about human replacement vs human augmentation. Think of Garry Kasparov remarking years ago that he thought chess played by humans with computer assistance could be a far more interesting game than either human-only or computer-only chess.
Here’s an argument for augmentation over automation, and note how far it diverges from the current hype despite it being written from a very AI-friendly point of view:
https://digitaleconomy.stanford.edu/news/the-turing-trap-the-promise-peril-of-human-like-artificial-intelligence/

@v_i_o_l_a@openbiblio.social
2025-06-15 19:34:59

"Pen vs. Keyboard? Why the Tool Doesn’t Matter as Much as You Think for Personal Knowledge Management" #KnowledgeManagement

Pen vs. Keyboard? Why the Tool Doesn’t Matter as Much as You Think for Personal Knowledge Management
Fact First Humans have always adapted their note-taking tools to the technologies of their time. From clay tablets and hieroglyphs to parchment notebooks and digital apps, the mediums have changed, but the purpose remains...

@Techmeme@techhub.social
2025-06-14 21:16:27

In an Oxford study, LLMs correctly identified medical conditions 94.9% of the time when given test scenarios directly, vs. 34.5% when prompted by human subjects (Nick Mokey/VentureBeat)
https://venturebeat.com/ai/just-add-hu

Just add humans: Oxford medical study underscores the missing link in chatbot testing
Patients using chatbots to assess their own medical conditions may end up with worse outcomes than conventional methods, according to a new Oxford study.

@cheryanne@aus.social
2025-06-12 03:42:46

For The Love Of Dogs (And Their Humans!)
Great Australian Pods Podcast Directory: #GreatAusPods

For The Love Of Dogs (And Their Humans!)
Screenshot of the podcast listing on the Great Australian Pods website

@TobiasFrech@ijug.social
2025-04-16 07:49:52

I had to interact with some Amazon voice systems this morning. Obviously they used some automated translation to German and it failed. The voice recognition itself wasn't that impressive either. There is still a very long way to go before automated systems reach the quality level of well-trained humans. And I personally am not so sure whether this path leads to the mountaintop or off a cliff.

@newsie@darktundra.xyz
2025-06-12 14:16:42

Humans Have Now Seen the Dawn of Time from Earth After Breakthrough https://www.404media.co/humans-have-now-seen-the-dawn-of-time-from-earth-after-breakthrough/

Humans Have Now Seen the Dawn of Time from Earth After Breakthrough
Telescopes perched on the Andes Mountains glimpsed elusive encounters fueled by the first of the first stars in the universe more than 13 billion years ago.

@jby@ecoevo.social
2025-05-14 22:39:07

“Apocalypse forces us to radically change. But by facing the future with optimism instead of doom, we can transform ourselves into the kinds of people-the kinds of communities—who can survive.”
https://slate.com/technology/2025/05/h

Yes, Humans as a Species Are Headed for Disaster. I Have a Lot of Hope for What Will Come Next.
Writing a book about apocalypses made me more optimistic.

@arXiv_csSE_bot@mastoxiv.page
2025-06-16 10:26:49

LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
Zihan Zheng, Zerui Cheng, Zeyu Shen, Shang Zhou, Kaiyuan Liu, Hansen He, Dongruixuan Li, Stanley Wei, Hangyi Hao, Jianzhu Yao, Peiyao Sheng, Zixuan Wang, Wenhao Chai, Aleksandra Korolova, Peter Henderson, Sanjeev Arora, Pramod Viswanath, Jingbo Shang, Saining Xie
http…

LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
Recent reports claim that large language models (LLMs) now outperform elite humans in competitive programming. Drawing on knowledge from a group of medalists in international algorithmic contests, we revisit this claim, examining how LLMs differ from human experts and where limitations still remain. We introduce LiveCodeBench Pro, a benchmark composed of problems from Codeforces, ICPC, and IOI that are continuously updated to reduce the likelihood of data contamination. A team of Olympiad medal…

@arXiv_csRO_bot@mastoxiv.page
2025-06-16 08:01:29

Control Architecture and Design for a Multi-robotic Visual Servoing System in Automated Manufacturing Environment
Rongfei Li
https://arxiv.org/abs/2506.11387

Control Architecture and Design for a Multi-robotic Visual Servoing System in Automated Manufacturing Environment
The use of robotic technology has drastically increased in manufacturing in the 21st century. But by utilizing their sensory cues, humans still outperform machines, especially in micro scale manufacturing, which requires high-precision robot manipulators. These sensory cues naturally compensate for high levels of uncertainties that exist in the manufacturing environment. Uncertainties in performing manufacturing tasks may come from measurement noise, model inaccuracy, joint compliance (e.g., el…

@fanf@mendeddrum.org
2025-06-07 11:42:03

from my link log —
Designing APIs for humans: Stripe object IDs.
https://dev.to/stripe/designing-apis-for-humans-object-ids-3o5a
saved 2025-05-22

Designing APIs for humans: Object IDs
Choosing your ID type Regardless of what type of business you run, you very likely require...

@joxean@mastodon.social
2025-06-14 23:37:13

A short #piano song I have just composed while watching a storm from my window, thinking about those poor humans out there waiting for the war to end.
The #score is here:

@AimeeMaroux@mastodon.social
2025-06-14 19:44:41

Content warning:

This should not be surprising for anyone who knows how LLMs work but holy shit is this scary!
The article is about regular people whose conspiracy beliefs were encouraged by #ChatGPT.
I think the fact that humans are lonelier than ever makes it easy to prey on a large amount of vulnerable people, which is why #LLM

@grifferz@social.bitfolk.com
2025-05-14 22:18:35

On Saturday I met Freddie greyhound, Mandy greyhound's nephew¹. Mandy passed away in 2021.
Coincidentally both homed by WGW and Freddie now lives a few miles from where Mandy lived her pet life. His human knows Mandy's humans well.
Considering the rather small greyhound gene pool this is not unusual. Mandy's sire has had over 5,000 offspring so far.
#Greyhounds

A small shiny black greyhound boy with a white chest patch sits in the boot of a car laying on his left side propped up and looking towards the left of the camera. He wears a red patterned martingale collar.

A small shiny black greyhound girl standing facing towards the left side of the camera with her ears at half attention. She wears a Star Wars themed fleece coat in black, white and blue and a red patterned martingale collar. She stands on a tarmac path in a cemetery.

@primonatura@mstdn.social
2025-06-11 18:00:25

"Probiotic Can Slow Disease on Coral Reefs: They Have Microbiomes That Benefit, Like Humans"
#Oceans #Environment #CoralReef

Ingredient in Yogurt Can Slow Disease in Reefs as Corals Have Microbiomes That Benefits From Probiotics
A new Smithsonian study demonstrated how a probiotic could save coral reefs at-risk with SCTLD disease in Florida and the Caribbean.

@arXiv_csMM_bot@mastoxiv.page
2025-06-13 08:00:30

Thief of Truth: VR comics about the relationship between AI and humans
Joonhyung Bae
https://arxiv.org/abs/2506.10012 https://arxiv.o…

Thief of Truth: VR comics about the relationship between AI and humans
Thief of Truth is a first-person perspective Virtual Reality (VR) comic that explores the relationship between humans and artificial intelligence (AI). The work tells the story of a mind-uploaded human being reborn as a new subject while interacting with an AI that is looking for the meaning of life. In order to experiment with the expandability of VR comics, the work was produced by focusing on three problems. First, the comic is designed using the viewing control effect of VR. Second, through…

@arXiv_csSE_bot@mastoxiv.page
2025-06-16 10:08:49

Code Researcher: Deep Research Agent for Large Systems Code and Commit History
Ramneet Singh, Sathvik Joel, Abhav Mehrotra, Nalin Wadhwa, Ramakrishna B Bairi, Aditya Kanade, Nagarajan Natarajan
https://arxiv.org/abs/2506.11060

Code Researcher: Deep Research Agent for Large Systems Code and Commit History
Large Language Model (LLM)-based coding agents have shown promising results on coding benchmarks, but their effectiveness on systems code remains underexplored. Due to the size and complexities of systems code, making changes to a systems codebase is a daunting task, even for humans. It requires researching about many pieces of context, derived from the large codebase and its massive commit history, before making changes. Inspired by the recent progress on deep research agents, we design the fi…

@servelan@newsie.social
2025-06-05 20:50:22

Curious humpback whales approach humans and blow bubble 'smoke' rings
https://phys.org/news/2025-06-curious-humpback-whales-approach-humans.html

@gedankenstuecke@scholar.social
2025-06-12 16:56:20

The most shortsighted take on "let's replace human iNaturalist contributors with 'AI'" that I've read comes from people that cite that shit paper about how much more energy efficient "AI" is at solving tasks than humans.
All aspects of human enjoyment and fulfilment that come from volunteering aside: it's not like they'll put all contributors against the wall to save energy now, will they?! Bob Contributor will still drive his car, eat his sandwiches and burn energy 🤦‍♂️

@floheinstein@chaos.social
2025-06-13 12:16:03

EU's Frontex has published a children's book for children who are about to be deported
😲
https://op.europa.eu/en/publication-detail/-/publication/6eea0e99-e464-11ef-be2a-01aa75ed71a1
Official PDFs availabl…

A picture in children's-book style showing an airplane high above the clouds (so high you can see Earth's curvature) with many people of various skin colors sitting inside. Only the pilots aren't discernible as humans.
Title "My guidebook on return"

Toolbox for children in return - Publications Office of the EU
Addressing the return of unaccompanied minors or children within families is a delicate matter that demands a humane approach. This approach must not only respect fundamental rights but also prioritise the child's best interests. Additionally, clear and comprehensive information about the return procedures is essential for those involved, enabling them to prepare as effectively as possible. One of the primary objectives of Frontex is to assist MS throughout all stages of the return procedures. …

@cdarwin@c.im
2025-06-10 15:48:45

In a laboratory setting, humans have accelerated particles — protons, antiprotons, electrons, and positrons — to incredibly high energies: up to the TeV (trillions of electron-volts) scale.
But cosmic rays, also including protons, electrons, and other atomic nuclei, are produced up to far greater energies, at the PeV (quadrillions of electron-volts) scale and beyond.
These very high energy cosmic rays are produced somewhere in our own galaxy:
in natural, astrophysical…

Astronomers close in on the source of the highest energy particles
On Earth, our particle accelerators can reach tera-electron-volt (TeV) energies. Particles from space are thousands of times as energetic.

@teledyn@mstdn.ca
2025-06-01 15:02:41

Researchers in Japan Discover Medicine Capable of Regrowing Third Set of Teeth for Humans - Dentistry Today
https://www.dentistrytoday.com/researchers-in-japan-discover-medicine-capable-of-regrowing-third-set-of-teeth-for-humans/

@rasterweb@mastodon.social
2025-06-02 12:46:21

You will find no answers to the struggles we face as humans by looking to the machines.
Be eager to say “I don’t know… but I want to find out!”
➡️ https://rasterweb.net/raster/2025/06/02/experts-dont-know/

Experts Don’t Know
Humans don't know it all, and that's good...

@arXiv_csRO_bot@mastoxiv.page
2025-06-13 08:56:30

Modeling Trust Dynamics in Robot-Assisted Delivery: Impact of Trust Repair Strategies
Dong Hae Mangalindan, Karthik Kandikonda, Ericka Rovira, Vaibhav Srivastava
https://arxiv.org/abs/2506.10884

Modeling Trust Dynamics in Robot-Assisted Delivery: Impact of Trust Repair Strategies
With increasing efficiency and reliability, autonomous systems are becoming valuable assistants to humans in various tasks. In the context of robot-assisted delivery, we investigate how robot performance and trust repair strategies impact human trust. In this task, while handling a secondary task, humans can choose to either send the robot to deliver autonomously or manually control it. The trust repair strategies examined include short and long explanations, apology and promise, and denial. …

@arXiv_csLG_bot@mastoxiv.page
2025-06-12 08:15:11

Multi-Task Reward Learning from Human Ratings
Mingkang Wu, Devin White, Evelyn Rose, Vernon Lawhern, Nicholas R Waytowich, Yongcan Cao
https://arxiv.org/abs/2506.09183

Multi-Task Reward Learning from Human Ratings
Reinforcement learning from human feedback (RLHF) has become a key factor in aligning model behavior with users' goals. However, while humans integrate multiple strategies when making decisions, current RLHF approaches often simplify this process by modeling human reasoning through isolated tasks such as classification or regression. In this paper, we propose a novel reinforcement learning (RL) method that mimics human decision-making by jointly considering multiple tasks. Specifically, we lever…

@Techmeme@techhub.social
2025-06-04 11:15:51

Q&A with Google DeepMind CEO Demis Hassabis on "a 50% chance" of AGI in the next five to 10 years, bad actors and technical risks, AI regulation, jobs, and more (Steven Levy/Wired)
https://www.wired.com/story/google-deepmin

Google DeepMind’s CEO Thinks AI Will Make Humans Less Selfish
Demis Hassabis says that systems as smart as humans are almost here, and we’ll need to radically change how we think and behave.

@trochee@dair-community.social
2025-06-08 23:52:10

I know I shouldn't still be using this curséd website
but what is happening here with your lesson planning, Duo
Employ humans to write QA validations to keep these questions from being this silly
And stop using LLMs. Please.

A Duolingo puzzle page

The question reads
Choisis l'option qui veut dire « parfait »

[Pick the option that means "parfait"]

The words below are available for selection:
Un chien parfait. Je l'ai vu en ligne, papa.

[A perfect dog. I saw it on line, dad?…]

Both instances of the word "parfait" are circled in blue

@EmilyMoranBarwick@mastodon.social
2025-06-10 19:02:44

"What I'm optimizing for isn't growth... reach, or influence. I'm chasing connection[...]spending my days on things that bring me alive."
"It's about...signals that something genuinely MATTERED to one or more humans...private replies saying they've never felt so seen or understood." Rob Hardy
To fellow #writers

A screenshot of an excerpt from the linked article. It reads:

"Internal Resonance: How did it feel to write and publish this? Did it make me feel alive, both intellectually and somatically? Did it feel like something no one else but me could have created? Did it feel true to who I am, and who I'm becoming? Did the content of this writing matter to the deepest parts of me, beneath all of the cultural stories about who I think I should be and what I should do?

External Resonance: How did people…

A screenshot of an excerpt from the linked article. It reads:

"For the game I'm playing with Ungated, and with my life, understanding these two metrics matters more than anything I'd find in a traditional analytics dashboard. What I'm optimizing for isn't growth, certainty, or control. It's not the maximization of short-term revenue, reach, or influence. Instead, I'm chasing connection—both with myself and with others. I'm trying to design an infinite game, where I spend my days working on thi…

@dcm@social.sunet.se
2025-06-10 13:17:04

A new, updated, streamlined, and generally improved version of The Vector Grounding Problem paper, joint work by @… and me on the meaningfulness (or otherwise) of LLM outputs and internal representations, is now available on arXiv.

The Vector Grounding Problem
The remarkable performance of large language models (LLMs) on complex linguistic tasks has sparked debate about their capabilities. Unlike humans, these models learn language solely from textual data without directly interacting with the world. Yet they generate seemingly meaningful text on diverse topics. This achievement has renewed interest in the classical `Symbol Grounding Problem' -- the question of whether the internal representations and outputs of symbolic AI systems can possess intrin…

@arXiv_csHC_bot@mastoxiv.page
2025-06-13 07:45:10

Extended Creativity: A Conceptual Framework for Understanding Human-AI Creative Relations
Andrea Gaggioli, Sabrina Bartolotta, Andrea Ubaldi, Katusha Gerardini, Eleonora Diletta Sarcinella, Alice Chirico
https://arxiv.org/abs/2506.10249

Extended Creativity: A Conceptual Framework for Understanding Human-AI Creative Relations
Artificial Intelligence holds significant potential to enhance human creativity. However, achieving this vision requires a clearer understanding of how such enhancement can be effectively realized. Adopting the perspective of distributed creativity, we identify three primary modes through which AI can contribute to creative processes: Support, where AI acts as a tool; Synergy, where AI and humans collaborate in complementary ways; and Symbiosis, where human and AI cognition become so integrated…

@david_colquhoun@mstdn.social
2025-06-01 10:15:41

Love this, by D.J. Grothe
We are truly only just getting started. All we have to do is to fail to kill everyone, and things will get better.
"Human civilization has existed for only 3% of the time that anatomically modern humans have existed. And modern industrial civilization has existed for just 2% of that 3% — just 0.06% of the time that anatomically modern humans have existed. Maybe we’re just getting started!"
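
The percentages in the quote can be checked with a quick calculation. A minimal sketch, taking the quote's rounded 3% and 2% figures at face value (the variable names are illustrative only):

```python
# Shares taken from the quote above; both are rounded figures, not precise data.
civilization_share = 0.03  # civilization as a share of anatomically modern human history
industrial_share = 0.02    # industrial civilization as a share of civilization's span

# Industrial civilization as a share of all anatomically modern human history.
industrial_of_total = civilization_share * industrial_share

print(f"{industrial_of_total:.2%}")  # prints 0.06%
```

The product agrees with the 0.06% stated in the quote.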

@arXiv_csCV_bot@mastoxiv.page
2025-06-12 10:15:41

InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions
Zhenzhi Wang, Jiaqi Yang, Jianwen Jiang, Chao Liang, Gaojie Lin, Zerong Zheng, Ceyuan Yang, Dahua Lin
https://arxiv.org/abs/2506.09984

InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions
End-to-end human animation with rich multi-modal conditions, e.g., text, image and audio has achieved remarkable advancements in recent years. However, most existing methods could only animate a single subject and inject conditions in a global manner, ignoring scenarios in which multiple concepts could appear in the same video with rich human-human interactions and human-object interactions. Such a global assumption prevents precise and per-identity control of multiple concepts including humans and …

@arXiv_csCL_bot@mastoxiv.page
2025-06-10 19:05:01

This https://arxiv.org/abs/2506.05142 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCL_…

Do Large Language Models Judge Error Severity Like Humans?
Large Language Models (LLMs) are increasingly used as automated evaluators in natural language generation, yet it remains unclear whether they can accurately replicate human judgments of error severity. In this study, we systematically compare human and LLM assessments of image descriptions containing controlled semantic errors. We extend the experimental framework of van Miltenburg et al. (2020) to both unimodal (text-only) and multimodal (text + image) settings, evaluating four error types: a…

@rperezrosario@mastodon.social
2025-05-21 14:22:26

Quanta Magazine authors Janna Levin and Steven Strogatz strike up a conversation with Ellie Pavlick (Research Scientist at Google DeepMind) about the differences and similarities between the way people understand language and what NLP algorithms do, and about the fact that such conversations more often than not shed light on more than the computational side of linguistics.
"Will AI Ever Understand Language Like Humans?"

Will AI Ever Understand Language Like Humans? | Quanta Magazine
AI may sound like a human, but that doesn’t mean that AI learns like a human. In this episode, Ellie Pavlick explains why understanding how LLMs can process language could unlock deeper insights into both AI and the human mind.

@arXiv_csGR_bot@mastoxiv.page
2025-06-09 07:35:12

Gen4D: Synthesizing Humans and Scenes in the Wild
Jerrin Bright, Zhibo Wang, Yuhao Chen, Sirisha Rambhatla, John Zelek, David Clausi
https://arxiv.org/abs/2506.05397

Gen4D: Synthesizing Humans and Scenes in the Wild
Lack of input data for in-the-wild activities often results in low performance across various computer vision tasks. This challenge is particularly pronounced in uncommon human-centric domains like sports, where real-world data collection is complex and impractical. While synthetic datasets offer a promising alternative, existing approaches typically suffer from limited diversity in human appearance, motion, and scene composition due to their reliance on rigid asset libraries and hand-crafted r…

@joxean@mastodon.social
2025-06-10 18:36:00

Palestinian humans are being exterminated by Israel. Israel must be stopped by any means necessary.

@arXiv_csCR_bot@mastoxiv.page
2025-06-10 16:21:09

This https://arxiv.org/abs/2406.10281 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCR_…

Watermarking Language Models with Error Correcting Codes
Recent progress in large language models enables the creation of realistic machine-generated content. Watermarking is a promising approach to distinguish machine-generated text from human text, embedding statistical signals in the output that are ideally undetectable to humans. We propose a watermarking framework that encodes such signals through an error correcting code. Our method, termed robust binary code (RBC) watermark, introduces no noticeable degradation in quality. We evaluate our wate…

@arXiv_csSD_bot@mastoxiv.page
2025-06-12 07:57:21

Incorporating Linguistic Constraints from External Knowledge Source for Audio-Visual Target Speech Extraction
Wenxuan Wu, Shuai Wang, Xixin Wu, Helen Meng, Haizhou Li
https://arxiv.org/abs/2506.09792

Incorporating Linguistic Constraints from External Knowledge Source for Audio-Visual Target Speech Extraction
Audio-visual target speaker extraction (AV-TSE) models primarily rely on target visual cues to isolate the target speaker's voice from others. We know that humans leverage linguistic knowledge, such as syntax and semantics, to support speech perception. Inspired by this, we explore the potential of pre-trained speech-language models (PSLMs) and pre-trained language models (PLMs) as auxiliary knowledge sources for AV-TSE. In this study, we propose incorporating the linguistic constraints from PS…

@inthehands@hachyderm.io
2025-06-09 16:10:00

Hard to find a single summarizing quote from the post, but it keeps coming back to two closely related ideas:
(1) the tendency of humans to blame themselves for poor tool performance (“oh I should have prompted in •that• way instead, my bad”), and
(2) what we educators call the “hidden curriculum:” people are unaware of learning they have done / habitual effort they are expending, and thus they see their own learning / ongoing effort as zero-cost, obvious, nonexistent, innate personal virtue, etc.
The existence of (2) sets people up for (1); recognizing (2) helps cure (1).
4/

@mia@hcommons.social
2025-06-01 16:27:44

Ad on the tube says 'Humans were the beta test. The era of AI employees is here'.
I can't *imagine* why people are a bit resistant to AI! At least offshoring never advertised on the tube. The enshittification of 21st century life continues.

@thopan@norden.social
2025-06-04 22:34:40

Current track: Kalte Nacht – Humans Are Mistakes
#KleineEchos – now live at https://www.mixcloud.com/live/thopan

THOPAN on Mixcloud Live
Broadcast live to your community of fans and tune in direct to creators from every genre

@tschfflr@fediscience.org
2025-06-04 07:19:14

Students, THIS is how generative AI (ChatGPT et alia) "reads" papers and "understands" content. It's a bullshit machine, a gaslighting machine. It shows the linguistic behavior of a psychopath (is this what we humans average to, if one trains on all our "content" and online behavior?). Yikes.
https://amandaguinzburg.substack.com/p/diabolus-ex-machina

@arXiv_csIR_bot@mastoxiv.page
2025-06-04 07:22:34

Towards Human-like Preference Profiling in Sequential Recommendation
Zhongyu Ouyang, Qianlong Wen, Chunhui Zhang, Yanfang Ye, Soroush Vosoughi
https://arxiv.org/abs/2506.02261

Towards Human-like Preference Profiling in Sequential Recommendation
Sequential recommendation systems aspire to profile users by interpreting their interaction histories, echoing how humans make decisions by weighing experience, relative preference strength, and situational relevance. Yet, existing large language model (LLM)-based recommenders often fall short of mimicking the flexible, context-aware decision strategies humans exhibit, neglecting the structured, dynamic, and context-aware mechanisms fundamental to human behaviors. To bridge this gap, we propose…

@davidaugust@mastodon.online
2025-06-02 07:20:33

So Builder . ai Mechanical Turked things and Microsoft (and others) were none the wiser.
Due diligence: boring stuff that, when you skip it, will catch up with you fast.
“Builder . ai’s platform relied on around 700 engineers based in India who manually wrote code based on customer requests. Despite the company marketing it as AI-generated, most of the work was done by humans behind the scenes.”

London AI startup Builder.ai collapses after 'human-powered' tech revelation

@arXiv_csCY_bot@mastoxiv.page
2025-06-02 09:56:07

This https://arxiv.org/abs/2503.08720 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCY_…

AI for Just Work: Constructing Diverse Imaginations of AI beyond "Replacing Humans"
"why" we develop AI. Lacking critical reflections on the general visions and purposes of AI may make the community vulnerable to manipulation. In this position paper, we explore the "why" question of AI. We denote answers to the "why" question the imaginations of AI, which depict our general visions, frames, and mindsets for the prospects of AI. We identify that the prevailing vision in the AI community is largely a monoculture that emphasizes objectives such as replacing humans and improving p…

@anildash@me.dm
2025-05-28 13:29:52

There was somebody fussing in my replies to my last link to my blog post about Medium (I don’t see them now; they probably blocked me, but their specific words don’t really matter), and the gist of their message was that they didn’t like that site. On the modern internet, if you have an issue with content written by humans, with no surveillance ads, that doesn’t allow AI scraping or AI slop content, with a business model that makes money… I don’t know how to help you. Honestly.

@daniel@social.telemetrydeck.com
2025-06-01 05:18:39

I went to a concentration camp, Neuengamme, yesterday to learn more about the local history. The visit has reinforced my belief that fascism must be stopped at all costs.

A leftover train car for transporting humans

The foundations of barracks marked on the floor

Long strips of cloth inscribed with the names of victims of this concentration camp. There are tens of thousands of names.

@clongclongmoo@social.bau-ha.us
2025-06-05 12:04:50

Philippe Neau & Antonella Eye Porcelluzzi – Elephant
https://www.clongclongmoo.org/2025/06/05/philippe-neau-antonella-eye-porcelluzzi-elephant/

Philippe Neau & Antonella Eye Porcelluzzi – Elephant
[ACP 1452] Philippe Neau & Antonella Eye Porcelluzzi “Elephant” “we are the elephants, the creatures, the humans, the existential cry, the acceptance of life and of its struggle even when we correctly state and express our needs, we are indeed steadily overwhelmed overwhelmed by our feelings and the feelings of the others they actually determine...

@arXiv_csCV_bot@mastoxiv.page
2025-06-10 19:09:31

This https://arxiv.org/abs/2506.03988 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…

RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors
AI-generated images have reached a quality level at which humans are incapable of reliably distinguishing them from real images. To counteract the inherent risk of fraud and disinformation, the detection of AI-generated images is a pressing challenge and an active research topic. While many of the presented methods claim to achieve high detection accuracy, they are usually evaluated under idealized conditions. In particular, the adversarial robustness is often neglected, potentially due to a la…

@rberger@hachyderm.io
2025-04-27 19:48:27

"I think it is a huge mistake for people to assume that they can trust AI when they do not trust each other. The safest way to develop superintelligence is to first strengthen trust between humans, and then cooperate with each other to develop superintelligence in a safe manner. But what we are doing now is exactly the opposite. Instead, all efforts are being directed toward developing a superintelligence."
#AGI #AI
https://www.wired.com/story/questions-answered-by-yuval-noah-harari-for-wired-ai-artificial-intelligence-singularity/

@inthehands@hachyderm.io
2025-06-09 16:04:07

Here’s a thoughtful piece from @…, well worth reading. It says things I hadn’t heard yet articulated so well.
One thing I appreciate immensely: the way Fred’s analytical approach centers humans instead of tech, and takes the subjective experiences of human developers •seriously•.
Fred’s summary in the quoted post gives the core idea, but the larger piece has many sharp thoughts and rewards close reading. I’ll quote a few in the thread below.
1/ https://hachyderm.io/@mononcqc/114653605519944320

@arXiv_csRO_bot@mastoxiv.page
2025-06-13 09:11:30

Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop
Justin Kerr, Kush Hari, Ethan Weber, Chung Min Kim, Brent Yi, Tyler Bonnen, Ken Goldberg, Angjoo Kanazawa
https://arxiv.org/abs/2506.10968

Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop
Humans do not passively observe the visual world -- we actively look in order to act. Motivated by this principle, we introduce EyeRobot, a robotic system with gaze behavior that emerges from the need to complete real-world tasks. We develop a mechanical eyeball that can freely rotate to observe its surroundings and train a gaze policy to control it using reinforcement learning. We accomplish this by first collecting teleoperated demonstrations paired with a 360 camera. This data is imported in…

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 18:14:50

This https://arxiv.org/abs/2505.23436 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…

Emergent Risk Awareness in Rational Agents under Resource Constraints
Advanced reasoning models with agentic capabilities (AI agents) are deployed to interact with humans and to solve sequential decision-making problems under (approximate) utility functions and internal models. When such problems have resource or failure constraints where action sequences may be forcibly terminated once resources are exhausted, agents face implicit trade-offs that reshape their utility-driven (rational) behaviour. Additionally, since these agents are typically commissioned by a h…

@arXiv_csCL_bot@mastoxiv.page
2025-06-10 18:59:51

This https://arxiv.org/abs/2506.00975 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCL_…

NTPP: Generative Speech Language Modeling for Dual-Channel Spoken Dialogue via Next-Token-Pair Prediction
Inspired by the impressive capabilities of GPT-4o, there is growing interest in enabling speech language models (SLMs) to engage in natural, fluid spoken interactions with humans. Recent advancements have led to the development of several SLMs that demonstrate promising results in this area. However, current approaches have yet to fully exploit dual-channel speech data, which inherently captures the structure and dynamics of human conversation. In this work, we systematically explore the use of…

@camerontw@social.coop
2025-03-28 23:46:37

Reminder that pigs are intelligent and social, and even when exploited by humans, do a great service as food waste processors - they are too good to be used disparagingly when describing the brutishly stupid fascists being allowed to run various countries at the moment.
Pigs would never be fascists. (Urban Orwell, hanging out on a hobby farm his wife had to run so that he could write, has a lot to answer for.)

@Dragofix@veganism.social
2025-05-31 00:49:15

Animals are abused and exploited in various ways for the sake of entertainment. LCA strongly opposes the use of animals in entertainment.
Animals have their own needs, interests, and rights, especially the right to engage in their natural behaviors in their natural habitat. https://www.lcanimal.org/…

Last Chance for Animals - Animals in Entertainment
Last Chance for Animals is a national, non-profit organization dedicated to eliminating animal exploitation through education, investigations, legislation, and media attention. The organization believes that animals are highly sentient creatures who exist for their own reasons independent of their service to humans; they should thus not be made to suffer for the latter. LCA therefore opposes the use of animals in food and clothing production, scientific experimentation, and entertainment. Ins…

@arXiv_econGN_bot@mastoxiv.page
2025-06-05 07:22:38

My Advisor, Her AI and Me: Evidence from a Field Experiment on Human-AI Collaboration and Investment Decisions
Cathy (Liu), Yang, Kevin Bauer, Xitong Li, Oliver Hinz
https://arxiv.org/abs/2506.03707

My Advisor, Her AI and Me: Evidence from a Field Experiment on Human-AI Collaboration and Investment Decisions
Amid ongoing policy and managerial debates on keeping humans in the loop of AI decision-making, we investigate whether human involvement in AI-based service production benefits downstream consumers. Partnering with a large savings bank in Europe, we produced pure AI and human-AI collaborative investment advice, passed it to customers, and examined their advice-taking in a field experiment. On the production side, contrary to concerns that humans might inefficiently override AI output, we find t…

@seeingwithsound@mas.to
2025-05-27 20:54:45

There's a lovely discussion on Twitter/X between humans and AI models (Grok and Perplexity) on whether a Neuralink Blindsight brain implant can provide meaningful vision to people born blind https://x.com/waleedd322/status/1927463484800835657

walid (@waleedd322) on X
@seeingwithsound @MarioNawfal @grok @AskPerplexity @grok @AskPerplexity is he right ?

@arXiv_csHC_bot@mastoxiv.page
2025-06-02 07:19:42

Can LLMs and humans be friends? Uncovering factors affecting human-AI intimacy formation
Yeseon Hong, Junhyuk Choi, Minju Kim, Bugeun Kim
https://arxiv.org/abs/2505.24658

Can LLMs and humans be friends? Uncovering factors affecting human-AI intimacy formation
Large language models (LLMs) are increasingly being used in conversational roles, yet little is known about how intimacy emerges in human-LLM interactions. Although previous work emphasized the importance of self-disclosure in human-chatbot interaction, it is questionable whether gradual and reciprocal self-disclosure is also helpful in human-LLM interaction. Thus, this study examined three possible aspects contributing to intimacy formation: gradual self-disclosure, reciprocity, and naturalnes…

@Techmeme@techhub.social
2025-06-09 05:05:37

Cloudflare open sourced an OAuth library mostly written by Claude, showing how AI handles mechanical implementation while humans guide with context and judgment (Max Mitchell)
https://www.maxemitchell.com/writings/i-read-all-of-cloudflares…

Max Mitchell's Personal Portfolio Website
Max Mitchell's personal portfolio website showcasing his photography, YouTube videos, coding projects, and work history.

@arXiv_csGR_bot@mastoxiv.page
2025-06-03 07:22:35

TRiMM: Transformer-Based Rich Motion Matching for Real-Time multi-modal Interaction in Digital Humans
Yueqian Guo, Tianzhao Li, Xin Lyu, Jiehaolin Chen, Zhaohan Wang, Sirui Xiao, Yurun Chen, Yezi He, Helin Li, Fan Zhang
https://arxiv.org/abs/2506.01077

TRiMM: Transformer-Based Rich Motion Matching for Real-Time multi-modal Interaction in Digital Humans
Large Language Model (LLM)-driven digital humans have sparked a series of recent studies on co-speech gesture generation systems. However, existing approaches struggle with real-time synthesis and long-text comprehension. This paper introduces Transformer-Based Rich Motion Matching (TRiMM), a novel multi-modal framework for real-time 3D gesture generation. Our method incorporates three modules: 1) a cross-modal attention mechanism to achieve precise temporal alignment between speech and gesture…

@inthehands@hachyderm.io
2025-06-09 16:42:33

All this brings me back to some text I was writing yesterday for my students, on which I’d appreciate any thoughtful feedback:
❝You can let the computer do the typing for you, but never let it do the thinking for you.
This is doubly true in the current era of AI hype. If the AI optimists are correct (the credible ones, anyway), software development will consist of humans critically evaluating, shaping, and correcting the output of LLMs. If the AI skeptics are correct, then the future will bring mountains of AI slop to decode, disentangle, fix, and/or rewrite. Either way, it is •understanding• and •critically evaluating• code — not merely •generating• it — that will be the truly essential ability. Always has been; will be even more so. •That• is what you are learning here.❞
11/

@arXiv_csCL_bot@mastoxiv.page
2025-06-10 18:55:20

This https://arxiv.org/abs/2505.19914 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCL_…

Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles
Large Language Models (LLMs), such as OpenAI's o1 and DeepSeek's R1, excel at advanced reasoning tasks like math and coding via Reinforcement Learning with Verifiable Rewards (RLVR), but still struggle with puzzles solvable by humans without domain knowledge. We introduce Enigmata, the first comprehensive suite tailored for improving LLMs with puzzle reasoning skills. It includes 36 tasks across seven categories, each with 1) a generator that produces unlimited examples with controllable diffic…

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 08:07:45

Feel the Force: Contact-Driven Learning from Humans
Ademi Adeniji, Zhuoran Chen, Vincent Liu, Venkatesh Pattabiraman, Raunaq Bhirangi, Siddhant Haldar, Pieter Abbeel, Lerrel Pinto
https://arxiv.org/abs/2506.01944

Feel the Force: Contact-Driven Learning from Humans
Controlling fine-grained forces during manipulation remains a core challenge in robotics. While robot policies learned from robot-collected data or simulation show promise, they struggle to generalize across the diverse range of real-world interactions. Learning directly from humans offers a scalable solution, enabling demonstrators to perform skills in their natural embodiment and in everyday environments. However, visual demonstrations alone lack the information needed to infer precise contac…

@arXiv_csSE_bot@mastoxiv.page
2025-06-10 16:59:49

This https://arxiv.org/abs/2502.06994 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csSE_…

SyncMind: Measuring Agent Out-of-Sync Recovery in Collaborative Software Engineering
Software engineering (SE) is increasingly collaborative, with developers working together on shared complex codebases. Effective collaboration in shared environments requires participants -- whether humans or AI agents -- to stay on the same page as their environment evolves. When a collaborator's understanding diverges from the current state -- what we term the out-of-sync challenge -- the collaborator's actions may fail, leading to integration issues. In this work, we introduce SyncMind, a fr…

@inthehands@hachyderm.io
2025-06-09 16:33:19

These pieces are much harsher than Fred’s, much more in the LLM-bashing camp, but feel relevant here:
https://softwarecrisis.dev/letters/llmentalist/
https://pivot-to-ai.com/2025/06/05/generative-ai-runs-on-gambling-addiction-just-one-more-prompt-bro/
Fred’s point is that a bad interaction model creates hidden work, then humans do that work and “the machine claims the praise.” These other two pieces make the point that this phenomenon of giving the machine credit for unrecognized human work is age-old, taps into some deep trapdoors in human cognition.
10/

@arXiv_csHC_bot@mastoxiv.page
2025-06-10 16:37:29

This https://arxiv.org/abs/2411.03295 has been replaced.
link: https://scholar.google.com/scholar?q=a

Examining Human-AI Collaboration for Co-Writing Constructive Comments Online
This paper examines if large language models (LLMs) can help people write constructive comments on divisive social issues due to the difficulty of expressing constructive disagreement online. Through controlled experiments with 600 participants from India and the US, who reviewed and wrote constructive comments on threads related to Islamophobia and homophobia, we observed potential misalignment between how LLMs and humans perceive constructiveness in online comments. While the LLM was more lik…

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 17:48:46

This https://arxiv.org/abs/2501.07071 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…

Value Compass Leaderboard: A Platform for Fundamental and Validated Evaluation of LLMs Values
As Large Language Models (LLMs) achieve remarkable breakthroughs, aligning their values with humans has become imperative for their responsible development and customized applications. However, there still lack evaluations of LLMs values that fulfill three desirable goals. (1) Value Clarification: We expect to clarify the underlying values of LLMs precisely and comprehensively, while current evaluations focus narrowly on safety risks such as bias and toxicity. (2) Evaluation Validity: Existing …

@arXiv_csRO_bot@mastoxiv.page
2025-06-12 08:09:41

Analyzing Key Objectives in Human-to-Robot Retargeting for Dexterous Manipulation
Chendong Xin, Mingrui Yu, Yongpeng Jiang, Zhefeng Zhang, Xiang Li
https://arxiv.org/abs/2506.09384

Analyzing Key Objectives in Human-to-Robot Retargeting for Dexterous Manipulation
Kinematic retargeting from human hands to robot hands is essential for transferring dexterity from humans to robots in manipulation teleoperation and imitation learning. However, due to mechanical differences between human and robot hands, completely reproducing human motions on robot hands is impossible. Existing works on retargeting incorporate various optimization objectives, focusing on different aspects of hand configuration. However, the lack of experimental comparative studies leaves the…

@inthehands@hachyderm.io
2025-05-30 22:07:06

Note that nowhere in that definition is there actually any attempt to define or measure “intelligence” — a term which we are scarcely able to define and to measure even for humans!
Note also that the definition is inherently a broad one and a shifting one. It’s relative to humans •and• relative to recent history.
4/

@Techmeme@techhub.social
2025-06-03 01:30:40

Aerones, which makes robots that can service wind turbines in about half the time of humans, raised $62M led by Activate Capital and S2G Investments (Virginia Furness/Reuters)
https://www.reuters.com/sustainability/cli

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 08:09:11

RoboEgo System Card: An Omnimodal Model with Native Full Duplexity
Yiqun Yao, Xiang Li, Xin Jiang, Xuezhi Fang, Naitong Yu, Aixin Sun, Yequan Wang
https://arxiv.org/abs/2506.01934

RoboEgo System Card: An Omnimodal Model with Native Full Duplexity
Humans naturally process real-world multimodal information in a full-duplex manner. In artificial intelligence, replicating this capability is essential for advancing model development and deployment, particularly in embodied contexts. The development of multimodal models faces two primary challenges: (1) effectively handling more than three modalities-such as vision, audio, and text; and (2) delivering full-duplex responses to rapidly evolving human instructions. To facilitate research on mode…

@arXiv_csRO_bot@mastoxiv.page
2025-06-10 17:07:39

This https://arxiv.org/abs/2502.10090 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…

Manual2Skill: Learning to Read Manuals and Acquire Robotic Skills for Furniture Assembly Using Vision-Language Models
Humans possess an extraordinary ability to understand and execute complex manipulation tasks by interpreting abstract instruction manuals. For robots, however, this capability remains a substantial challenge, as they cannot interpret abstract instructions and translate them into executable actions. In this paper, we present Manual2Skill, a novel framework that enables robots to perform complex assembly tasks guided by high-level manual instructions. Our approach leverages a Vision-Language Mode…

@inthehands@hachyderm.io
2025-06-05 16:26:59

❝We humans are stability-seeking creatures. Getting accustomed to what used to seem unthinkable can feel like an accomplishment. And when the unthinkable recedes at least a bit…it’s easy to mistake it for proof that the dark times are ending.
But these comparatively small victories don’t alter the direction of our transformation — they don’t even slow it down measurably — even while they appeal to our deep need to normalize.…And so just when we most need to act — while there is indeed room for action and some momentum to the resistance — we tend to be lulled into complacency by the sense of relief on the one hand and boredom on the other.❞
https://www.nytimes.com/2025/05/28/opinion/trump-danger-normalization-shock.html

@arXiv_csRO_bot@mastoxiv.page
2025-06-11 08:36:25

Deploying SICNav in the Field: Safe and Interactive Crowd Navigation using MPC and Bilevel Optimization
Sepehr Samavi, Garvish Bhutani, Florian Shkurti, Angela P. Schoellig
https://arxiv.org/abs/2506.08851

Deploying SICNav in the Field: Safe and Interactive Crowd Navigation using MPC and Bilevel Optimization
Safe and efficient navigation in crowded environments remains a critical challenge for robots that provide a variety of service tasks such as food delivery or autonomous wheelchair mobility. Classical robot crowd navigation methods decouple human motion prediction from robot motion planning, which neglects the closed-loop interactions between humans and robots. This lack of a model for human reactions to the robot plan (e.g. moving out of the way) can cause the robot to get stuck. Our proposed …

@arXiv_csAI_bot@mastoxiv.page
2025-06-05 09:37:48

This https://arxiv.org/abs/2505.17433 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…

MemeReaCon: Probing Contextual Meme Understanding in Large Vision-Language Models
Memes have emerged as a popular form of multimodal online communication, where their interpretation heavily depends on the specific context in which they appear. Current approaches predominantly focus on isolated meme analysis, either for harmful content detection or standalone interpretation, overlooking a fundamental challenge: the same meme can express different intents depending on its conversational context. This oversight creates an evaluation gap: although humans intuitively recognize ho…

@arXiv_csHC_bot@mastoxiv.page
2025-06-06 09:39:41

This https://arxiv.org/abs/2505.10661 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csHC_…

It's only fair when I think it's fair: How Gender Bias Alignment Undermines Distributive Fairness in Human-AI Collaboration
Human-AI collaboration is increasingly relevant in consequential areas where AI recommendations support human discretion. However, human-AI teams' effectiveness, capability, and fairness highly depend on human perceptions of AI. Positive fairness perceptions have been shown to foster trust and acceptance of AI recommendations. Yet, work on confirmation bias highlights that humans selectively adhere to AI recommendations that align with their expectations and beliefs -- despite not being necessa…

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 17:35:45

This https://arxiv.org/abs/2412.05718 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…

RLZero: Direct Policy Inference from Language Without In-Domain Supervision
The reward hypothesis states that all goals and purposes can be understood as the maximization of a received scalar reward signal. However, in practice, defining such a reward signal is notoriously difficult, as humans are often unable to predict the optimal behavior corresponding to a reward function. Natural language offers an intuitive alternative for instructing reinforcement learning (RL) agents, yet previous language-conditioned approaches either require costly supervision or test-time tr…

@arXiv_csRO_bot@mastoxiv.page
2025-06-02 10:03:00

This https://arxiv.org/abs/2409.18745 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…

A study on the effects of mixed explicit and implicit communications in human-virtual-agent interactions
Communication between humans and robots (or virtual agents) is essential for interaction and often inspired by human communication, which uses gestures, facial expressions, gaze direction, and other explicit and implicit means. This work presents an interaction experiment where humans and virtual agents interact through explicit (gestures, manual entries using mouse and keyboard, voice, sound, and information on screen) and implicit (gaze direction, location, facial expressions, and raise of ey…

@arXiv_csRO_bot@mastoxiv.page
2025-06-09 08:38:52

3DFlowAction: Learning Cross-Embodiment Manipulation from 3D Flow World Model
Hongyan Zhi, Peihao Chen, Siyuan Zhou, Yubo Dong, Quanxi Wu, Lei Han, Mingkui Tan
https://arxiv.org/abs/2506.06199

3DFlowAction: Learning Cross-Embodiment Manipulation from 3D Flow World Model
Manipulation has long been a challenging task for robots, while humans can effortlessly perform complex interactions with objects, such as hanging a cup on the mug rack. A key reason is the lack of a large and uniform dataset for teaching robots manipulation skills. Current robot datasets often record robot action in different action spaces within a simple scene. This hinders the robot to learn a unified and robust action representation for different robots within diverse scenes. Observing how …

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:21:03

Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs
Chenjun Xu, Bingbing Wen, Bin Han, Robert Wolfe, Lucy Lu Wang, Bill Howe
https://arxiv.org/abs/2506.00582

Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs
Psychology research has shown that humans are poor at estimating their performance on tasks, tending towards underconfidence on easy tasks and overconfidence on difficult tasks. We examine three LLMs, Llama-3-70B-instruct, Claude-3-Sonnet, and GPT-4o, on a range of QA tasks of varying difficulty, and show that models exhibit subtle differences from human patterns of overconfidence: less sensitive to task difficulty, and when prompted to answer based on different personas -- e.g., expert vs laym…

@inthehands@hachyderm.io
2025-05-30 21:15:46

Re this from @…, one of the biggest tells about the current AI hype bubble:
Instead of replacing the work humans don’t want to do, it’s purporting to replace the work executives hate paying for.
Instead of an end to drudgery, they’re pushing an end to purpose and meaning.
And yeah, we’re going to end up cleaning up the AI’s messes. And doing its laundry.
https://mastodon.social/@PavelASamsonov/114598616057210141

@arXiv_csRO_bot@mastoxiv.page
2025-06-09 08:09:12

Where Do We Look When We Teach? Analyzing Human Gaze Behavior Across Demonstration Devices in Robot Imitation Learning
Yutaro Ishida, Takamitsu Matsubara, Takayuki Kanai, Kazuhiro Shintani, Hiroshi Bito
https://arxiv.org/abs/2506.05808

Where Do We Look When We Teach? Analyzing Human Gaze Behavior Across Demonstration Devices in Robot Imitation Learning
Imitation learning for acquiring generalizable policies often requires a large volume of demonstration data, making the process significantly costly. One promising strategy to address this challenge is to leverage the cognitive and decision-making skills of human demonstrators with strong generalization capability, particularly by extracting task-relevant cues from their gaze behavior. However, imitation learning typically involves humans collecting data using demonstration devices that emulate…

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:19:50

Monitoring Robustness and Individual Fairness
Ashutosh Gupta, Thomas A. Henzinger, Konstantin Kueffner, Kaushik Mallik, David Pape
https://arxiv.org/abs/2506.00496

Monitoring Robustness and Individual Fairness
Input-output robustness appears in various different forms in the literature, such as robustness of AI models to adversarial or semantic perturbations and individual fairness of AI models that make decisions about humans. We propose runtime monitoring of input-output robustness of deployed, black-box AI models, where the goal is to design monitors that would observe one long execution sequence of the model, and would raise an alarm whenever it is detected that two similar inputs from the past…

@inthehands@hachyderm.io
2025-05-30 22:02:06

Here’s the real actual definition of “artificial intelligence,” the true technical meaning in research and engineering circles when it’s not being used as marketing hype.
Artificial intelligence is anything that
1. humans are generally good at, and
2. computers were recently bad at.
That’s it. That’s all it means. You’ll hear people refine it and dress it up, but that’s the heart of the definition. (Check Wikipedia!)
3/

@inthehands@hachyderm.io
2025-05-30 22:10:21

For example:
- Telling apart photos of cats and dogs is “AI.”
- Making up fake but plausible facts on an arbitrary topic is “AI.”
- Walking is “AI.”
- Doing long multiplication is something we might call “intelligence” in humans, but it is not “AI” because computers have •always• been good at it.
- Winning at checkers •used• to be “AI” because computers didn’t used to be able to do that, but now it’s not “AI” because computers have been good at it for too long.
5/

@arXiv_csRO_bot@mastoxiv.page
2025-06-04 14:05:06

This https://arxiv.org/abs/2504.14305 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…

Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning
Humans exhibit diverse and expressive whole-body movements. However, attaining human-like whole-body coordination in humanoid robots remains challenging, as conventional approaches that mimic whole-body motions often neglect the distinct roles of upper and lower body. This oversight leads to computationally intensive policy learning and frequently causes robot instability and falls during real-world execution. To address these issues, we propose Adversarial Locomotion and Motion Imitation (ALMI…

@arXiv_csRO_bot@mastoxiv.page
2025-06-05 09:59:19

This https://arxiv.org/abs/2505.20290 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…

EgoZero: Robot Learning from Smart Glasses
Despite recent progress in general purpose robotics, robot policies still lag far behind basic human capabilities in the real world. Humans interact constantly with the physical world, yet this rich data resource remains largely untapped in robot learning. We propose EgoZero, a minimal system that learns robust manipulation policies from human demonstrations captured with Project Aria smart glasses, $\textbf{and zero robot data}$. EgoZero enables: (1) extraction of complete, robot-executable ac…

@arXiv_csRO_bot@mastoxiv.page
2025-06-04 14:02:33

This https://arxiv.org/abs/2503.05231 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…

Kaiwu: A Multimodal Manipulation Dataset and Framework for Robot Learning and Human-Robot Interaction
Cutting-edge robot learning techniques including foundation models and imitation learning from humans all pose huge demands on large-scale and high-quality datasets which constitute one of the bottlenecks in the general intelligent robot fields. This paper presents the Kaiwu multimodal dataset to address the missing real-world synchronized multimodal data problems in the sophisticated assembling scenario, especially with dynamics information and its fine-grained labelling. The dataset first provi…

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 17:58:33

This https://arxiv.org/abs/2505.21432 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…

Hume: Introducing System-2 Thinking in Visual-Language-Action Model
Humans practice slow thinking before performing actual actions when handling complex tasks in the physical world. This thinking paradigm, recently, has achieved remarkable advancement in boosting Large Language Models (LLMs) to solve complex tasks in digital domains. However, the potential of slow thinking remains largely unexplored for robotic foundation models interacting with the physical world. In this work, we propose Hume: a dual-system Vision-Language-Action (VLA) model with value-guided…

@arXiv_csRO_bot@mastoxiv.page
2025-06-04 13:40:40

This https://arxiv.org/abs/2402.11871 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…

From Real World to Logic and Back: Learning Generalizable Relational Concepts For Long Horizon Robot Planning
Humans efficiently generalize from limited demonstrations, but robots still struggle to transfer learned knowledge to complex, unseen tasks with longer horizons and increased complexity. We propose the first known method enabling robots to autonomously invent relational concepts directly from small sets of unannotated, unsegmented demonstrations. The learned symbolic concepts are grounded into logic-based world models, facilitating efficient zero-shot generalization to significantly more comple…

@arXiv_csRO_bot@mastoxiv.page
2025-06-04 07:53:02

EDEN: Entorhinal Driven Egocentric Navigation Toward Robotic Deployment
Mikolaj Walczak, Romina Aalishah, Wyatt Mackey, Brittany Story, David L. Boothe Jr., Nicholas Waytowich, Xiaomin Lin, Tinoosh Mohsenin
https://arxiv.org/abs/2506.03046

EDEN: Entorhinal Driven Egocentric Navigation Toward Robotic Deployment
Deep reinforcement learning agents are often fragile while humans remain adaptive and flexible to varying scenarios. To bridge this gap, we present EDEN, a biologically inspired navigation framework that integrates learned entorhinal-like grid cell representations and reinforcement learning to enable autonomous navigation. Inspired by the mammalian entorhinal-hippocampal system, EDEN allows agents to perform path integration and vector-based navigation using visual and motion sensor data. At th…
