Tootfinder

Opt-in global Mastodon full text search. Join the index!

No exact results. Similar results found.
@tiotasram@kolektiva.social
2025-05-26 12:51:54

Let's say you find a really cool forum online that has lots of good advice on it. It's even got a very active community that's happy to answer questions very quickly, and the community seems to have a wealth of knowledge about all sorts of subjects.
You end up visiting this community often, and trusting its advice for all sorts of everyday questions, the kind you might previously have answered with a web search (of course web search is now full of SEO spam and other crap, so it's become nearly useless).
Then one day, you ask an innocuous question about medicine, and from this community you get the full homeopathy treatment as your answer. Like, somewhat believable on the face of it, with lots of citations to reasonable-seeming articles, except that if you know even a tiny bit about chemistry and biology (which thankfully you do), you know that the homeopathy answers are completely bogus and horribly dangerous (since they offer non-treatments for real diseases). Your opinion of this entire forum suddenly changes. "Oh my God, if they've been homeopathy believers all this time, what other myths have they fed me as facts?"
You stop using the forum for anything, and go back to slogging through SEO crap to answer your everyday questions, because once you realize that this forum is fundamentally untrustworthy, you realize that the value of getting advice from it on any subject is negative: you knew enough to spot the dangerous homeopathy answer, but there might be other such myths that you don't know enough to avoid, and any community willing to go all-in on one myth has shown itself capable of going all-in on any number of others.
...
This has been a parable about large language models.
#AI #LLM

@arXiv_csCV_bot@mastoxiv.page
2025-06-30 10:16:50

MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
Xi Chen, Mingkang Zhu, Shaoteng Liu, Xiaoyang Wu, Xiaogang Xu, Yu Liu, Xiang Bai, Hengshuang Zhao
arxiv.org/abs/2506.22434

@arXiv_csLO_bot@mastoxiv.page
2025-06-30 07:46:30

Negated String Containment is Decidable (Technical Report)
Vojtěch Havlena, Michal Hečko, Lukáš Holík, Ondřej Lengál
arxiv.org/abs/2506.22061

@Techmeme@techhub.social
2025-06-25 06:10:49

Daydream, which raised a $50M seed in June 2024 to build a generative AI shopping agent for fashion, launches in beta, with an app expected later this summer (Hilary Milnes/Vogue Business)
voguebusiness.com/story/techno

@arXiv_csAI_bot@mastoxiv.page
2025-06-24 10:50:20

Action Language BC
Joseph Babb, Joohyung Lee
arxiv.org/abs/2506.18044 arxiv.org/pdf/2506.18044

@ginevra@hachyderm.io
2025-06-20 00:35:29

Language learning has been part of me since high school. I'm solid in 2 non-English languages, crappy but survivable in 2 others. I've played with & started learning others many times.
I'm real busy rn, but language learning could be a fun thing to do for myself & make me feel like I'm still me.
But I'm stumped about my language picks. I learnt the obvious European languages in school; later tried key Asian languages. What do I want to do now?
African languages? I won't be getting a chance to use them much in Aus, & I'm unlikely to get to a stage where I can read literature.
I tried Slovenian/Slovene on a whim & really love it, but I'll never go there. Is the practical but unfun answer to grind out more kanji/hanzi? Or is whimsically learning a language spoken by only 2.5 million people reasonable? I will continue struggling through with Ukrainian, 'cause I think it's important.
#LanguageLearning

@arXiv_eessAS_bot@mastoxiv.page
2025-05-30 07:22:43

Spoken question answering for visual queries
Nimrod Shabtay, Zvi Kons, Avihu Dekel, Hagai Aronowitz, Ron Hoory, Assaf Arbelle
arxiv.org/abs/2505.23308

@arXiv_csRO_bot@mastoxiv.page
2025-06-26 09:45:30

HRIBench: Benchmarking Vision-Language Models for Real-Time Human Perception in Human-Robot Interaction
Zhonghao Shi, Enyu Zhao, Nathaniel Dennler, Jingzhen Wang, Xinyang Xu, Kaleen Shrestha, Mengxue Fu, Daniel Seita, Maja Matarić
arxiv.org/abs/2506.20566

@arXiv_csCL_bot@mastoxiv.page
2025-06-27 09:59:49

Potemkin Understanding in Large Language Models
Marina Mancoridis, Bec Weeks, Keyon Vafa, Sendhil Mullainathan
arxiv.org/abs/2506.21521 arxiv.org/pdf/2506.21521 arxiv.org/html/2506.21521
arXiv:2506.21521v1 Announce Type: new
Abstract: Large language models (LLMs) are regularly evaluated using benchmark datasets. But what justifies making inferences about an LLM's capabilities based on its answers to a curated set of questions? This paper first introduces a formal framework to address this question. The key is to note that the benchmarks used to test LLMs -- such as AP exams -- are also those used to test people. However, this raises an implication: these benchmarks are only valid tests if LLMs misunderstand concepts in ways that mirror human misunderstandings. Otherwise, success on benchmarks only demonstrates potemkin understanding: the illusion of understanding driven by answers irreconcilable with how any human would interpret a concept. We present two procedures for quantifying the existence of potemkins: one using a specially designed benchmark in three domains, the other using a general procedure that provides a lower-bound on their prevalence. We find that potemkins are ubiquitous across models, tasks, and domains. We also find that these failures reflect not just incorrect understanding, but deeper internal incoherence in concept representations.
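The abstract's general lower-bound procedure lends itself to a concrete illustration. Below is a minimal sketch of that incoherence-check idea, not the paper's actual protocol: have a model generate an instance of a concept, then ask the same model, in a fresh context, to grade its own output. The names (ask, potemkin_rate) and prompts are hypothetical placeholders; ask stands in for whatever LLM endpoint you have on hand.

def ask(prompt: str) -> str:
    """Placeholder for a real LLM chat-completion call (assumption, not a real API)."""
    raise NotImplementedError

def potemkin_rate(concepts: list[str]) -> float:
    """Lower-bound estimate: fraction of concepts where the model's
    generation and its own grading disagree."""
    incoherent = 0
    for concept in concepts:
        # Step 1: the model produces what it claims is a valid instance.
        instance = ask(f"Give one example of {concept}. Reply with the example only.")
        # Step 2: the same model, with no memory of step 1, grades that output.
        verdict = ask(
            f"Is the following a correct example of {concept}? "
            f"Answer yes or no.\n\n{instance}"
        )
        if verdict.strip().lower().startswith("no"):
            # The model rejects its own instance: internal incoherence.
            incoherent += 1
    # Only a lower bound: when the two steps agree, they may still share
    # the same misunderstanding, which this check cannot detect.
    return incoherent / len(concepts)

Note that this only catches self-contradiction; it says nothing about potemkins where the model is consistently wrong in a non-human way, which is why a procedure like this can only lower-bound prevalence.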

@arXiv_csCL_bot@mastoxiv.page
2025-06-26 07:33:20

Inference Scaled GraphRAG: Improving Multi Hop Question Answering on Knowledge Graphs
Travis Thompson, Seung-Hwan Lim, Paul Liu, Ruoying He, Dongkuan Xu
arxiv.org/abs/2506.19967