Tootfinder

Opt-in global Mastodon full-text search. Join the index!

@karlauerbach@sfba.social
2025-08-18 23:51:25

I just filed my first complaint against a California attorney, under my own obligations under California's Rule 8.3 (of the Rules of Professional Conduct).
California has a *mandatory* system under which attorneys (I am a member of that klan) *must* report various kinds of misconduct by other attorneys.
In this case that attorney sent out a fishing letter to my incorrect name (but correct address) asserting that they had done actual research with the implication that they…

@arXiv_csCL_bot@mastoxiv.page
2025-08-18 09:22:30

Personalized Distractor Generation via MCTS-Guided Reasoning Reconstruction
Tao Wu, Jingyuan Chen, Wang Lin, Jian Zhan, Mengze Li, Kun Kuang, Fei Wu
arxiv.org/abs/2508.11184

@arXiv_csCV_bot@mastoxiv.page
2025-09-18 10:23:11

Towards Rationale-Answer Alignment of LVLMs via Self-Rationale Calibration
Yuanchen Wu, Ke Yan, Shouhong Ding, Ziyin Zhou, Xiaoqiang Li
arxiv.org/abs/2509.13919

@arXiv_csAI_bot@mastoxiv.page
2025-08-19 10:19:50

Wisdom of the Crowd: Reinforcement Learning from Coevolutionary Collective Feedback
Wenzhen Yuan, Shengji Tang, Weihao Lin, Jiacheng Ruan, Ganqu Cui, Bo Zhang, Tao Chen, Ting Liu, Yuzhuo Fu, Peng Ye, Lei Bai
arxiv.org/abs/2508.12338

@arXiv_csSE_bot@mastoxiv.page
2025-08-18 08:40:10

Hallucination in LLM-Based Code Generation: An Automotive Case Study
Marc Pavel, Nenad Petrovic, Lukasz Mazur, Vahid Zolfaghari, Fengjunjie Pan, Alois Knoll
arxiv.org/abs/2508.11257

Google's AI Overviews are getting mean.
I Googled 'buck2 fixed point caching', speculatively, wondering if Buck2 had any feature like this. The AI Overview started with: "There is no specific feature in Buck2 called 'fixed point caching.' The term appears to be a misunderstanding of how Buck2's caching mechanisms work in a build system."
The overview went on to give an incorrect definition of "fixed point".
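For reference, the standard definition (nothing Buck2-specific is assumed here): a value $x$ is a fixed point of a function $f$ when $f(x) = x$, and fixed-point iteration repeats $x_{n+1} = f(x_n)$ until the output stops changing.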

@arXiv_csIT_bot@mastoxiv.page
2025-08-19 08:32:40

Age of Semantic Information-Aware Wireless Transmission for Remote Monitoring Systems
Xue Han, Biqian Feng, Yongpeng Wu, Xiang-Gen Xia, Wenjun Zhang, Shengli Sun
arxiv.org/abs/2508.12248

@arXiv_hepph_bot@mastoxiv.page
2025-08-19 10:13:50

Resolution of spin crisis, and notes on the Bjorken sum rule, anomaly and constituent quark
J. Pasupathy, Janardhan P. Singh
arxiv.org/abs/2508.12156

@arXiv_csLG_bot@mastoxiv.page
2025-10-14 13:38:38

Learning to Make MISTAKEs: Modeling Incorrect Student Thinking And Key Errors
Alexis Ross, Jacob Andreas
arxiv.org/abs/2510.11502 arxiv.org…

@jlpiraux@wallonie-bruxelles.social
2025-09-15 07:49:44

"Conventionally, the output of an AI is graded in a binary way, rewarding it when it gives a correct response and penalizing it when it gives an incorrect one.
In simple terms, in other words, guessing is rewarded — because it might be right — over an AI admitting it doesn't know the answer, which will be graded as incorrect no matter what.
As a result, through "natural statistical pressures," LLMs are far more prone to hallucinate an answer instead of "ac…
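A minimal sketch of the scoring argument in that quote, with invented probabilities (nothing below comes from the post itself):

# Under binary grading, a guess that is right with probability p scores p
# in expectation, while admitting "I don't know" always scores 0.
def expected_score(p_correct, abstain):
    return 0.0 if abstain else p_correct

for p in (0.1, 0.3, 0.5):
    print(f"p={p}: guess={expected_score(p, False)}, abstain={expected_score(p, True)}")
# Any p > 0 beats abstaining, so the statistical pressure favors guessing.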

@grahamperrin@bsd.cafe
2025-08-17 02:41:46

@… let me guess … the discussion that spammed four lists (ignoring the documented basic rule about never more than two); the one that originated with shouting and swearing in GitHub; the one that proceeded to go off-topic from all four lists; the one that's technically incorrect about the effect of a command.
If you're bored, there's also a twenty-three…

@arXiv_csCL_bot@mastoxiv.page
2025-09-18 09:57:11

Geometric Uncertainty for Detecting and Correcting Hallucinations in LLMs
Edward Phillips, Sean Wu, Soheila Molaei, Danielle Belgrave, Anshul Thakur, David Clifton
arxiv.org/abs/2509.13813

@yaya@jorts.horse
2025-10-16 04:13:32

my favorite thing about my vocab app is that sometimes the incorrect answers construct an incredible parallel reality
please I want to live in the football dimension where there's a goal in the church and it's normal for wedding photos to have people in cleats and football kits


What do people often do in the kitchen?
Pick 1
ithim (eat)
pasálaim an liathróid (pass the ball)

What clothes are often seen in wedding ceremony photos?
Pick 1
bróga peile (cleats)
geansai (jerseys)
léine (shirts)

What can often be found in a church?
Pick 1

leabhar (book)
cúl (goal)

@arXiv_csHC_bot@mastoxiv.page
2025-09-17 10:25:10

Evolution of Programmers' Trust in Generative AI Programming Assistants
Anshul Shah, Thomas Rexin, Elena Tomson, Leo Porter, William G. Griswold, Adalbert Gerald Soosai Raj
arxiv.org/abs/2509.13253

@arXiv_csCL_bot@mastoxiv.page
2025-09-19 10:38:01

SMARTER: A Data-efficient Framework to Improve Toxicity Detection with Explanation via Self-augmenting Large Language Models
Huy Nghiem, Advik Sachdeva, Hal Daumé III
arxiv.org/abs/2509.15174

@arXiv_csCR_bot@mastoxiv.page
2025-09-16 12:00:27

ILA: Correctness via Type Checking for Fully Homomorphic Encryption
Tarakaram Gollamudi, Anitha Gollamudi, Joshua Gancher
arxiv.org/abs/2509.11559

@arXiv_csIR_bot@mastoxiv.page
2025-09-16 09:24:57

ReFineG: Synergizing Small Supervised Models and LLMs for Low-Resource Grounded Multimodal NER
Jielong Tang, Shuang Wang, Zhenxing Wang, Jianxing Yu, Jian Yin
arxiv.org/abs/2509.10975

@arXiv_condmatmtrlsci_bot@mastoxiv.page
2025-09-16 09:41:47

"Adiabatic" Elastic Constants in Hubbard-Corrected Density-Functional Theory DFT U: case UO$_2$
Mahmoud Payami, Samira Sheykhi
arxiv.org/abs/2509.11200

@deprogrammaticaipsum@mas.to
2025-10-05 10:53:38

"If George Boole is the 19th century’s AI scientist, then his contemporary machine learning engineers were Charles Babbage and Ada Lovelace. The Difference Engine, which would be frequently cited as the first example of a (mechanical) programmable digital computer if it had been built at the time, was explicitly designed to _replace_ rather than _augment_ human thought. Just as modern software engineering managers use Jira to avoid thinking about process engineering."

@lilmikesf@c.im
2025-10-12 17:34:11

Attempted #Drumpf & #RFKjr purge of #CDC workers initially fails due to clerical "coding" error.
“The employees who received incorrect notifications were never separated from the agency and have all been notified that they…

@adlerweb@social.adlerweb.info
2025-10-08 21:20:27

Days since I booted a server with incorrect memory slot configuration: 0

Dusty screen

Middle right: System initializing memory
Bottom left: System Halted: No Memory could be configured

@Techmeme@techhub.social
2025-08-05 21:40:52

Wikipedia editors adopt a policy giving admins the authority to quickly delete AI-generated articles that meet certain criteria, like incorrect citations (Emanuel Maiberg/404 Media)
404media.co/wikipedia-editors-

@dennisfaucher@infosec.exchange
2025-10-09 13:41:12

So, since I found a bug in #logseq where pasting formatted notes from MS Teams causes logseq to use incorrect bold markdown syntax ([space]** at the end of a phrase instead of just **), I wrote this sed script to fix the logseq markdown files after I paste content in:
$ cat fix_logseg_bold_journals.sh
#!/bin/bash
cd /Users/faucherd/Documents//Logseq/journals
sed -i '…
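The author's actual sed expression is truncated above and stays that way; purely as an illustration of the described substitution (a stray space before a closing **), a hypothetical Python equivalent might look like the sketch below. The path, glob, and regex are assumptions, not the author's script.

# Hypothetical sketch: strip the stray space before a trailing "**"
# ("word **" -> "word**"); pattern and path are guesses, back up files first.
import pathlib
import re

journals = pathlib.Path("~/Documents/Logseq/journals").expanduser()
for md_file in journals.glob("*.md"):
    text = md_file.read_text()
    fixed = re.sub(r" \*\*(?=\s|$)", "**", text, flags=re.MULTILINE)
    if fixed != text:
        md_file.write_text(fixed)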

@cowboys@darktundra.xyz
2025-10-06 00:54:09

NFL refs got Justin Fields’ ‘SkyCam’ throw in Week 5 vs Cowboys incorrect usatoday.com/story/sports/nfl/

@arXiv_csDB_bot@mastoxiv.page
2025-08-14 07:51:02

AmbiGraph-Eval: Can LLMs Effectively Handle Ambiguous Graph Queries?
Yuchen Tian, Kaixin Li, Hao Chen, Ziyang Luo, Hongzhan Lin, Sebastian Schelter, Lun Du, Jing Ma
arxiv.org/abs/2508.09631

@jake4480@c.im
2025-08-07 13:42:38

Around 20 years ago, I made my first and only change to something on Wikipedia. It was for an underground rap artist. I just made a correction or two to something that was obviously incorrect, and my information was correct; I thought nothing of it. I looked at it the next day, and a Wikipedia editor or mod or whatever changed it back. That was the last time I ever edited anything there. I still look up things on Wikipedia (with a grain of salt) but that experience really bugged me. The Wikip…

@arXiv_csAI_bot@mastoxiv.page
2025-09-15 08:52:41

XAgents: A Unified Framework for Multi-Agent Cooperation via IF-THEN Rules and Multipolar Task Processing Graph
Hailong Yang, Mingxian Gu, Jianqi Wang, Guanjin Wang, Zhaohong Deng
arxiv.org/abs/2509.10054

@arXiv_csHC_bot@mastoxiv.page
2025-08-12 10:09:13

Hide or Highlight: Understanding the Impact of Factuality Expression on User Trust
Hyo Jin Do, Werner Geyer
arxiv.org/abs/2508.07095 arxiv.…

@arXiv_mathOC_bot@mastoxiv.page
2025-08-13 09:25:42

Byzantine-Resilient Decentralized Online Resource Allocation
Runhua Wang, Qing Ling, Hoi-To Wai, Zhi Tian
arxiv.org/abs/2508.08658 arxiv.or…

@Mediagazer@mstdn.social
2025-08-06 08:01:26

Wikipedia editors adopt a policy giving admins the authority to quickly delete AI-generated articles that meet certain criteria, like incorrect citations (Emanuel Maiberg/404 Media)
404media.co/wikipedia-editors-

@arXiv_csCR_bot@mastoxiv.page
2025-10-15 08:48:52

Robust ML-based Detection of Conventional, LLM-Generated, and Adversarial Phishing Emails Using Advanced Text Preprocessing
Deeksha Hareesha Kulal, Chidozie Princewill Arannonu, Afsah Anwar, Nidhi Rastogi, Quamar Niyaz
arxiv.org/abs/2510.11915

@arXiv_csCY_bot@mastoxiv.page
2025-09-30 10:20:11

Opinions can be Incorrect! In our Opinion. On the accuracy principle in data protection law
Dara Hallinan, Frederik Zuiderveen Borgesius
arxiv.org/abs/2509.23848

@nelson@tech.lgbt
2025-09-03 15:15:15

Gemini also asserts my oldest emails are from April 2003 but produces incorrect info when asked for details. Gmail didn't even exist until April 2004 and regular search finds nothing before then. (It does find a lot of Jira spam starting April 8 2004, some things never change.)

@arXiv_statME_bot@mastoxiv.page
2025-08-12 10:01:23

Modelling phenology using ordered categorical generalized additive models
David L Miller
arxiv.org/abs/2508.07789 arxiv.org/pdf/2508.07789

@arXiv_csLG_bot@mastoxiv.page
2025-08-25 10:01:50

Sparse but Wrong: Incorrect L0 Leads to Incorrect Features in Sparse Autoencoders
David Chanin, Adrià Garriga-Alonso
arxiv.org/abs/2508.16560

@robpike@hachyderm.io
2025-09-26 22:15:31

Someone should invent a way for computers to count. At the moment both GMail and GitHub have incorrect message counts in my inbox. Again.
This has happened many times. Given that computers make it possible for me to order toilet paper for delivery by 2pm and then send me hundreds of messages about toilet paper by 5pm, it seems odd to me that they can't count.
But hey, I guess it's hard to count how many things are in a list, especially when the list is empty.

@arXiv_csSE_bot@mastoxiv.page
2025-10-14 09:14:28

OBsmith: Testing JavaScript Obfuscator using LLM-powered sketching
Shan Jiang, Chenguang Zhu, Sarfraz Khurshid
arxiv.org/abs/2510.10066 arx…

@arXiv_statML_bot@mastoxiv.page
2025-10-08 09:01:19

Domain-Shift-Aware Conformal Prediction for Large Language Models
Zhexiao Lin, Yuanyuan Li, Neeraj Sarna, Yuanyuan Gao, Michael von Gablenz
arxiv.org/abs/2510.05566

@arXiv_csCR_bot@mastoxiv.page
2025-10-15 09:59:51

DeepTrust: Multi-Step Classification through Dissimilar Adversarial Representations for Robust Android Malware Detection
Daniel Pulido-Cortázar, Daniel Gibert, Felip Manyà
arxiv.org/abs/2510.12310

@arXiv_csCV_bot@mastoxiv.page
2025-09-12 10:14:19

InterAct: Advancing Large-Scale Versatile 3D Human-Object Interaction Generation
Sirui Xu, Dongting Li, Yucheng Zhang, Xiyan Xu, Qi Long, Ziyin Wang, Yunzhi Lu, Shuchang Dong, Hezi Jiang, Akshat Gupta, Yu-Xiong Wang, Liang-Yan Gui
arxiv.org/abs/2509.09555

@arXiv_csAI_bot@mastoxiv.page
2025-08-14 07:38:52

MEML-GRPO: Heterogeneous Multi-Expert Mutual Learning for RLVR Advancement
Weitao Jia, Jinghui Lu, Haiyang Yu, Siqi Wang, Guozhi Tang, An-Lan Wang, Weijie Yin, Dingkang Yang, Yuxiang Nie, Bin Shan, Hao Feng, Irene Li, Kun Yang, Han Wang, Jingqun Tang, Teng Fu, Changhong Jin, Chao Feng, Xiaohui Lv, Can Huang
arxiv.org/abs/2508.09670…

@arXiv_csCL_bot@mastoxiv.page
2025-10-15 10:27:41

Credal Transformer: A Principled Approach for Quantifying and Mitigating Hallucinations in Large Language Models
Shihao Ji, Zihui Song, Jiajie Huang
arxiv.org/abs/2510.12137

@gwire@mastodon.social
2025-10-03 15:42:18

> All of this is also happening since Google removed support for the ClaimReview standard— a data format that was designed to ensure that this kind of confusion did not happen.
fullfact.org/technology/google
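For context, ClaimReview is a schema.org vocabulary that fact-checkers embed as structured data so search engines can attach a verdict to the claim being checked. A rough sketch of the shape of such a record, with every value invented for illustration:

# Rough sketch of a schema.org ClaimReview record; all values are invented.
claim_review = {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    "url": "https://factchecker.example/checks/123",
    "claimReviewed": "Example claim as originally stated",
    "itemReviewed": {"@type": "Claim", "author": {"@type": "Person", "name": "Example Speaker"}},
    "reviewRating": {"@type": "Rating", "ratingValue": 1, "bestRating": 5, "alternateName": "False"},
    "author": {"@type": "Organization", "name": "Example Fact Checker"},
}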

@tiotasram@kolektiva.social
2025-08-04 15:49:00

Should we teach vibe coding? Here's why not.
Should AI coding be taught in undergrad CS education?
1/2
I teach undergraduate computer science labs, including for intro and more-advanced core courses. I don't publish (non-negligible) scholarly work in the area, but I've got years of craft expertise in course design, and I do follow the academic literature to some degree. In other words, I'm not the world's leading expert, but I have spent a lot of time thinking about course design, and consider myself competent at it, with plenty of direct experience in what knowledge & skills I can expect from students as they move through the curriculum.
I'm also strongly against most uses of what's called "AI" these days (specifically, generative deep neural networks as supplied by our current cadre of techbros). There are a surprising number of completely orthogonal reasons to oppose the use of these systems, and a very limited number of reasonable exceptions (overcoming accessibility barriers is an example). On the grounds of environmental and digital-commons-pollution costs alone, using specifically the largest/newest models is unethical in most cases.
But as any good teacher should, I constantly question these evaluations, because I worry about the impact on my students should I eschew teaching relevant tech for bad reasons (and even for good reasons). I also want to make my reasoning clear to students, who should absolutely question me on this. That inspired me to ask a simple question: ignoring for one moment the ethical objections (which we shouldn't, of course; they're very stark), at what level in the CS major could I expect to teach a course about programming with AI assistance, and expect students to succeed at a more technically demanding final project than a course at the same level where students were banned from using AI? In other words, at what level would I expect students to actually benefit from AI coding "assistance?"
To be clear, I'm assuming that students aren't using AI in other aspects of coursework: the topic of using AI to "help you study" is a separate one (TL;DR: its gross value is not negative, but it's mostly not worth the harm to your metacognitive abilities, which AI-induced changes to the digital commons are making more important than ever).
So what's my answer to this question?
If I'm being incredibly optimistic, senior year. Slightly less optimistic, second year of a masters program. Realistic? Maybe never.
The interesting bit for you-the-reader is: why is this my answer? (Especially given that students would probably self-report significant gains at lower levels.) To start with, [this paper where experienced developers thought that AI assistance sped up their work on real tasks when in fact it slowed it down](arxiv.org/abs/2507.09089) is informative. There are a lot of differences in task between experienced devs solving real bugs and students working on a class project, but it's important to understand that we shouldn't have a baseline expectation that AI coding "assistants" will speed things up in the best of circumstances, and we shouldn't trust self-reports of productivity (or the AI hype machine in general).
Now we might imagine that coding assistants will be better at helping with a student project than at helping with fixing bugs in open-source software, since it's a much easier task. For many programming assignments that have a fixed answer, we know that many AI assistants can just spit out a solution based on prompting them with the problem description (there's another elephant in the room here to do with learning outcomes regardless of project success, but we'll ignore this one too; my focus here is on the complexity a project can reach, not learning outcomes). My question is about more open-ended projects, not assignments with an expected answer. Here's a second study (by one of my colleagues) about novices using AI assistance for programming tasks. It showcases how difficult it is to use AI tools well, and some of the stumbling blocks that novices in particular face.
But what about intermediate students? Might there be some level where the AI is helpful because the task is still relatively simple and the students are good enough to handle it? The problem with this is that as task complexity increases, so does the likelihood of the AI generating (or copying) code that uses more complex constructs which a student doesn't understand. Let's say I have second-year students writing interactive websites with JavaScript. Without a degree of care that those students don't yet know how to exercise, the AI is likely to suggest code that depends on several different frameworks, from React to jQuery, without actually setting up or including those frameworks, and of course these students would be way out of their depth trying to do that. This is a general problem: each programming class carefully limits the specific code frameworks and constructs it expects students to know based on the material it covers. There is no feasible way to limit an AI assistant to a fixed set of constructs or frameworks, using current designs. There are alternate designs where this would be possible (like AI search through adaptation from a controlled library of snippets), but those would be entirely different tools.
So what happens on a sizeable class project where the AI has dropped in buggy code, especially if it uses code constructs the students don't understand? Best case, they realize that they don't understand it and re-prompt, or quickly ask for help from an instructor or TA who helps them get rid of the stuff they don't understand and re-prompt or manually add stuff they do. Average case: they waste several hours and/or sweep the bugs partly under the rug, resulting in a project with significant defects. Students in their second and even third years of a CS major still have a lot to learn about debugging, and usually have significant gaps in their knowledge of even their most comfortable programming language. I do think that regardless of AI we as teachers need to get better at teaching debugging skills, but the knowledge gaps are inevitable because there's just too much to know. In Python, for example, the LLM is going to spit out yields, async functions, try/finally, maybe even something like a while/else, or, with recent training data, the walrus operator. I can't expect even a fraction of 3rd-year students who have worked with Python since their first year to know about all these things, and based on how students approach projects where they have studied all the relevant constructs but have forgotten some, I'm not optimistic that these things will magically become learning opportunities. Student projects are better off working with a limited subset of a full programming language that the students have actually learned, and using AI coding assistants as currently designed makes this impossible. Beyond that, even when the "assistant" just introduces bugs using syntax the students understand, even through their 4th year many students struggle to understand the operation of moderately complex code they've written themselves, let alone code written by someone else. Having access to an AI that will confidently offer incorrect explanations for bugs will make this worse.
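To make that construct list concrete, here is a small contrived sketch (not from the post or any real assignment) packing together the features named above: a generator with yield, async/await, try/finally, while/else, and the walrus operator.

# Contrived illustration of the constructs listed above, nothing more.
import asyncio

def chunks(items, size):
    # generator: yields lists of `size` items at a time
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

async def process(items):
    gen = chunks(items, 3)
    try:
        # walrus operator in the loop condition
        while (batch := next(gen, None)) is not None:
            await asyncio.sleep(0)  # stand-in for async I/O per batch
        else:
            # while/else: runs only because the loop wasn't broken out of
            print("all batches processed")
    finally:
        # try/finally: cleanup runs no matter what
        print("done")

asyncio.run(process(range(10)))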
To be sure a small minority of students will be able to overcome these problems, but that minority is the group that has a good grasp of the fundamentals and has broadened their knowledge through self-study, which earlier AI-reliant classes would make less likely to happen. In any case, I care about the average student, since we already have plenty of stuff about our institutions that makes life easier for a favored few while being worse for the average student (note that our construction of that favored few as the "good" students is a large part of this problem).
To summarize: because AI assistants introduce excess code complexity and difficult-to-debug bugs, they'll slow down rather than speed up project progress for the average student on moderately complex projects. On a fixed deadline, they'll result in worse projects, or necessitate less ambitious project scoping to ensure adequate completion, and I expect this remains broadly true through 4-6 years of study in most programs (don't take this as an endorsement of AI "assistants" for masters students; we've ignored a lot of other problems along the way).
There's a related problem: solving open-ended project assignments well ultimately depends on deeply understanding the problem, and AI "assistants" allow students to put a lot of code in their file without spending much time thinking about the problem or building an understanding of it. This is awful for learning outcomes, but also bad for project success. Getting students to see the value of thinking deeply about a problem is a thorny pedagogical puzzle at the best of times, and allowing the use of AI "assistants" makes the problem much much worse. This is another area I hope to see (or even drive) pedagogical improvement in, for what it's worth.
1/2

@arXiv_csDS_bot@mastoxiv.page
2025-08-08 08:25:02

A Refutation of Elmasry's $\tilde{O}(m \sqrt{n})$-Time Algorithm for Single-Source Shortest Paths
Sunny Atalig, Marek Chrobak
arxiv.org/abs/2508.04872

@gadgetboy@gadgetboy.social
2025-08-28 14:48:37

Sigh...
#ai #claude

A conversation with Claude:

ME: This is completely incorrect. You created your own document or retrieved one from your memory. Please try again. If you're unable to extract the text please just indicate that and do not proceed any further.

Claude: You're absolutely right, and I apologize. I cannot accurately extract the text from this image - the text is too small and unclear for me to read reliably. Instead of admitting this limitation, I incorrectly generated content that wasn't actually fr…

@ubuntourist@mastodon.social
2025-09-25 18:43:47

“The Trump administration has mostly been all about rebuilding statues that have already been torn down, of Confederate generals and other racist tropes and figures from the past” he said. “I would argue that Trump is a racist figure from the past, so why would you tear that down?”

@grumpybozo@toad.social
2025-08-23 16:32:24

I’m not a professional coder, I just write code when I need it written.
Even I know that the most dangerous point in the evolution of a program is the point when it runs without obvious errors. m.phase.org/@parsingphase/1150

@arXiv_csPL_bot@mastoxiv.page
2025-09-03 09:51:03

From Traces to Program Incorrectness: A Type-Theoretic Approach
Yongwei Yuan, Zhe Zhou, Julia Belyakova, Benjamin Delaware, Suresh Jagannathan
arxiv.org/abs/2509.02428

@timfoster@mastodon.social
2025-09-28 09:16:08

Lol, I think this page is missing a big fucking elephant-in-the-room statement:
"Don't allow AI tools that make shit up and frequently make incorrect assertions run anything on any infrastructure, ever. If fact, just stop reading right now, because this was a stupid idea from the beginning."

@andycarolan@social.lol
2025-07-22 15:21:31

I'm seeing some really awful, low effort "may be..." ALT text recently. Clearly generated by an automatic process rather than by a human.
Is bad* alt text worse than no alt text?
*completely incorrect, and misleading
#Accessibility #a11y

@arXiv_csGT_bot@mastoxiv.page
2025-10-06 07:59:09

Deceptive Planning Exploiting Inattention Blindness
Mustafa O. Karabag, Jesse Milzman, Ufuk Topcu
arxiv.org/abs/2510.02714 arxiv.org/pdf/25…

@midtsveen@social.linux.pizza
2025-07-23 02:44:28

It is very funny when you get blocked for sharing a "Comparison of Android-based Operating Systems" that I didn't make; if you think anything in the comparison chart is factually incorrect, you can contribute to it.
eylenburg.github.io/android_co

@stargazer@woof.tech
2025-09-09 14:25:12

#WritersCoffeeClub
7. How much does your writing occupy your thoughts away from the keyboard?
8. What about the current writing milieu do you wish was different?
9. What incorrect assumptions might a reader make about you?
---
7. When I am actively writing, there's the "flow" mode and a "background task" mode.
In flow mode I keep thinkin…

@mgorny@social.treehouse.systems
2025-07-24 03:59:47

#Python world be like:
"Oh, hi, we wrote a new library implementing this spec."
"Hey, it looks like it doesn't conform to the spec, it doesn't pass the examples from it."
"Oh, you're right, we'll fix it ASAP."
…and that was over 3 years ago.
And yet projects keep adding a dependency on this library which has a single "pre-alpha" release 3.5 years ago and whose very first bug report points out it's incorrect.

@nemobis@mamot.fr
2025-08-22 15:15:40

I randomly bought this book in a quirky bookshop in Copenhagen for the sole reason that it said all the wrong things right on the cover.
(Sales: the single most important profession. NLP™: not natural language processing but neuro-linguistic programming. Meta: the Meta Model™ and Meta Publications™.)
I just started reading it and boy oh boy, I was not disappointed. It's outrageously hilarious.
"Persuasion engineering".

"For many years now, the single most important professionals in the world have been ignored by our educational institutions: Sales"
"While it may seem that some of the sentence structures in this book read as grammatically incorrect, they are written for a purpose"
«"Some of them really work hard. They can’t afford these cars. But every time one of them buys one, I smile because I know they are going to be the most motivated they can be just to keep up with the payments. I like my sales people to be a little hungry. There’s nothing better to keep them moving.” And so, he considers them to be self motivated. Anytime one of them starts to slack off a little, he asks them how the new car is.

What you do is you induce a wanton buying state and show them the …

@arXiv_csCL_bot@mastoxiv.page
2025-10-15 10:44:51

Generation Space Size: Understanding and Calibrating Open-Endedness of LLM Generations
Sunny Yu, Ahmad Jabbar, Robert Hawkins, Dan Jurafsky, Myra Cheng
arxiv.org/abs/2510.12699

@arXiv_quantph_bot@mastoxiv.page
2025-08-29 10:08:11

A predictive solution of the EPR paradox
Henryk Gzyl
arxiv.org/abs/2508.20788 arxiv.org/pdf/2508.20788

@arXiv_csAR_bot@mastoxiv.page
2025-08-05 07:32:59

Silent Data Corruption by 10x Test Escapes Threatens Reliable Computing
Subhasish Mitra, Subho Banerjee, Martin Dixon, Rama Govindaraju, Peter Hochschild, Eric X. Liu, Bharath Parthasarathy, Parthasarathy Ranganathan
arxiv.org/abs/2508.01786

@parltrack@eupolicy.social
2025-08-21 17:01:20

Thanks again to the fine person who triggered this. Without people noticing that some things are incorrect, we would not be able to cope with this.

@Techmeme@techhub.social
2025-07-24 15:06:10

The EU says it will investigate whether KKR provided incorrect or misleading information in its €22B acquisition of Telecom Italia's fixed-line network (Foo Yun Chee/Reuters)
reuters.com/legal/litigation/e

@arXiv_csSE_bot@mastoxiv.page
2025-09-12 09:20:09

On Integrating Large Language Models and Scenario-Based Programming for Improving Software Reliability
Ayelet Berzack, Guy Katz
arxiv.org/abs/2509.09194

@arXiv_csCL_bot@mastoxiv.page
2025-10-10 10:58:49

LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions
XuHao Hu, Peng Wang, Xiaoya Lu, Dongrui Liu, Xuanjing Huang, Jing Shao
arxiv.org/abs/2510.08211

@grahamperrin@bsd.cafe
2025-09-30 18:11:01

Errata notice FreeBSD-EN-25:18.freebsd-update ― freebsd-update(8) installs libraries in incorrect order
security.FreeBSD.org/advisorie
This update may be treated as essential for anyone who will use le…

@arXiv_heplat_bot@mastoxiv.page
2025-09-11 08:27:03

Thermodynamic Diagnostics for Complex Langevin Simulations: The Role of Configurational Temperature
Anosh Joseph, Arpith Kumar
arxiv.org/abs/2509.08287

@arXiv_hepph_bot@mastoxiv.page
2025-10-08 08:44:49

Comment on "Unruh effect for neutrinos interacting with accelerated matter"
R. R. S. Oliveira
arxiv.org/abs/2510.05403 arxiv.org/…

@arXiv_csAI_bot@mastoxiv.page
2025-09-10 10:01:11

Unleashing the True Potential of LLMs: A Feedback-Triggered Self-Correction with Long-Term Multipath Decoding
Jipeng Li, Zeyu Gao, Yubin Qi, Hande Dong, Weijian Chen, Qiang Lin
arxiv.org/abs/2509.07676

@arXiv_csLG_bot@mastoxiv.page
2025-10-07 13:05:32

Power Transform Revisited: Numerically Stable, and Federated
Xuefeng Xu, Graham Cormode
arxiv.org/abs/2510.04995 arxiv.org/pdf/2510.04995…

@arXiv_csCV_bot@mastoxiv.page
2025-10-06 09:47:29

AdaRD-key: Adaptive Relevance-Diversity Keyframe Sampling for Long-form Video understanding
Xian Zhang, Zexi Wu, Zinuo Li, Hongming Xu, Luqi Gong, Farid Boussaid, Naoufel Werghi, Mohammed Bennamoun
arxiv.org/abs/2510.02778

@arXiv_csCR_bot@mastoxiv.page
2025-09-10 09:53:21

AgentSentinel: An End-to-End and Real-Time Security Defense Framework for Computer-Use Agents
Haitao Hu, Peng Chen, Yanpeng Zhao, Yuqi Chen
arxiv.org/abs/2509.07764

@arXiv_csCL_bot@mastoxiv.page
2025-10-13 10:30:00

Verifying Chain-of-Thought Reasoning via Its Computational Graph
Zheng Zhao, Yeskendir Koishekenov, Xianjun Yang, Naila Murray, Nicola Cancedda
arxiv.org/abs/2510.09312

@arXiv_csCL_bot@mastoxiv.page
2025-09-12 09:44:39

MetaRAG: Metamorphic Testing for Hallucination Detection in RAG Systems
Channdeth Sok, David Luz, Yacine Haddam
arxiv.org/abs/2509.09360 ar…

@arXiv_csDB_bot@mastoxiv.page
2025-07-31 07:38:51

Scalability, Availability, Reproducibility and Extensibility in Islamic Database Systems
Umar Siddiqui, Habiba Youssef, Adel Sabour, Mohamed Ali
arxiv.org/abs/2507.22384

@arXiv_csSE_bot@mastoxiv.page
2025-07-25 09:37:22

YATE: The Role of Test Repair in LLM-Based Unit Test Generation
Michael Konstantinou, Renzo Degiovanni, Jie M. Zhang, Mark Harman, Mike Papadakis
arxiv.org/abs/2507.18316

@arXiv_csLG_bot@mastoxiv.page
2025-09-04 10:31:41

Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training
Chenlu Ye, Zhou Yu, Ziji Zhang, Hao Chen, Narayanan Sadagopan, Jing Huang, Tong Zhang, Anurag Beniwal
arxiv.org/abs/2509.03403

@arXiv_csAI_bot@mastoxiv.page
2025-10-06 07:30:59

Safe and Efficient In-Context Learning via Risk Control
Andrea Wynn, Metod Jazbec, Charith Peris, Rinat Khaziev, Anqi Liu, Daniel Khashabi, Eric Nalisnick
arxiv.org/abs/2510.02480

@arXiv_hepph_bot@mastoxiv.page
2025-10-01 09:46:18

Magnetic Helicity, Magnetic Monopoles, and Higgs Winding
Hajime Fukuda, Yuta Hamada, Kohei Kamada, Kyohei Mukaida, Fumio Uchida
arxiv.org/abs/2509.25734

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:31:01

SimpleQA Verified: A Reliable Factuality Benchmark to Measure Parametric Knowledge
Lukas Haas, Gal Yona, Giovanni D'Antonio, Sasha Goldshtein, Dipanjan Das
arxiv.org/abs/2509.07968

@arXiv_csLG_bot@mastoxiv.page
2025-10-02 11:06:31

Reinforcement Learning with Verifiable yet Noisy Rewards under Imperfect Verifiers
Xin-Qiang Cai, Wei Wang, Feng Liu, Tongliang Liu, Gang Niu, Masashi Sugiyama
arxiv.org/abs/2510.00915

@arXiv_csSE_bot@mastoxiv.page
2025-09-01 09:00:03

Enhancing Semantic Understanding in Pointer Analysis using Large Language Models
Baijun Cheng, Kailong Wang, Ling Shi, Haoyu Wang, Yao Guo, Ding Li, Xiangqun Chen
arxiv.org/abs/2508.21454

@arXiv_csCV_bot@mastoxiv.page
2025-07-25 10:21:02

SynC: Synthetic Image Caption Dataset Refinement with One-to-many Mapping for Zero-shot Image Captioning
Si-Woo Kim, MinJu Jeon, Ye-Chan Kim, Soeun Lee, Taewhan Kim, Dong-Jin Kim
arxiv.org/abs/2507.18616

@arXiv_csCR_bot@mastoxiv.page
2025-09-30 07:35:10

GPS Spoofing Attacks and Pilot Responses Using a Flight Simulator Environment
Mathilde Durieux, Kayla D. Taylor, Laxima Niure Kandel, Deepti Gupta
arxiv.org/abs/2509.22662

@arXiv_csCL_bot@mastoxiv.page
2025-09-08 10:10:10

Why Language Models Hallucinate
Adam Tauman Kalai, Ofir Nachum, Santosh S. Vempala, Edwin Zhang
arxiv.org/abs/2509.04664 arxiv.org/pdf/2509…

@arXiv_csAI_bot@mastoxiv.page
2025-07-31 08:32:41

Beyond Accuracy: How AI Metacognitive Sensitivity improves AI-assisted Decision Making
ZhaoBin Li, Mark Steyvers
arxiv.org/abs/2507.22365 a…

@arXiv_csCL_bot@mastoxiv.page
2025-10-07 12:14:02

The Geometry of Truth: Layer-wise Semantic Dynamics for Hallucination Detection in Large Language Models
Amir Hameed Mir
arxiv.org/abs/2510.04933

@arXiv_csSE_bot@mastoxiv.page
2025-10-01 09:04:18

APRIL: API Synthesis with Automatic Prompt Optimization and Reinforcement Learning
Hua Zhong, Shan Jiang, Sarfraz Khurshid
arxiv.org/abs/2509.25196

@arXiv_csCL_bot@mastoxiv.page
2025-08-25 10:54:17

Crosslisted article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[2/2]:
- Sparse but Wrong: Incorrect L0 Leads to Incorrect Features in Sparse Autoencoders
David Chanin, Adrià Garriga-Alonso

@arXiv_csAI_bot@mastoxiv.page
2025-08-28 07:43:51

Caught in the Act: a mechanistic approach to detecting deception
Gerard Boxo, Ryan Socha, Daniel Yoo, Shivam Raval
arxiv.org/abs/2508.19505

@arXiv_csLG_bot@mastoxiv.page
2025-09-23 12:51:20

Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLM
Alexander Panfilov, Evgenii Kortukov, Kristina Nikolić, Matthias Bethge, Sebastian Lapuschkin, Wojciech Samek, Ameya Prabhu, Maksym Andriushchenko, Jonas Geiping
arxiv.org/abs/2509.18058

@arXiv_csCL_bot@mastoxiv.page
2025-10-06 07:56:19

Hallucination reduction with CASAL: Contrastive Activation Steering For Amortized Learning
Wannan Yang, Xinchi Qiu, Lei Yu, Yuchen Zhang, Oliver Aobo Yang, Narine Kokhlikyan, Nicola Cancedda, Diego Garcia-Olano
arxiv.org/abs/2510.02324

@arXiv_csAI_bot@mastoxiv.page
2025-09-23 12:06:20

Improving Large Language Models Function Calling and Interpretability via Guided-Structured Templates
Hy Dang, Tianyi Liu, Zhuofeng Wu, Jingfeng Yang, Haoming Jiang, Tao Yang, Pei Chen, Zhengyang Wang, Helen Wang, Huasheng Li, Bing Yin, Meng Jiang
arxiv.org/abs/2509.18076

@arXiv_csCL_bot@mastoxiv.page
2025-10-02 10:31:41

ThinkBrake: Mitigating Overthinking in Tool Reasoning
Minjae Oh, Sangjun Song, Seungkyu Lee, Sungmin Jo, Yohan Jo
arxiv.org/abs/2510.00546

@arXiv_csCL_bot@mastoxiv.page
2025-07-31 09:54:01

Investigating Hallucination in Conversations for Low Resource Languages
Amit Das, Md. Najib Hasan, Souvika Sarkar, Zheng Zhang, Fatemeh Jamshidi, Tathagata Bhattacharya, Nilanjana Raychawdhury, Dongji Feng, Vinija Jain, Aman Chadha
arxiv.org/abs/2507.22720

@arXiv_csCL_bot@mastoxiv.page
2025-07-29 11:47:51

FRED: Financial Retrieval-Enhanced Detection and Editing of Hallucinations in Language Models
Likun Tan, Kuan-Wei Huang, Kevin Wu
arxiv.org/abs/2507.20930

@arXiv_csCL_bot@mastoxiv.page
2025-07-29 07:43:51

Mitigating Geospatial Knowledge Hallucination in Large Language Models: Benchmarking and Dynamic Factuality Aligning
Shengyuan Wang, Jie Feng, Tianhui Liu, Dan Pei, Yong Li
arxiv.org/abs/2507.19586

@arXiv_csCL_bot@mastoxiv.page
2025-08-27 10:16:23

ConfTuner: Training Large Language Models to Express Their Confidence Verbally
Yibo Li, Miao Xiong, Jiaying Wu, Bryan Hooi
arxiv.org/abs/2508.18847

@arXiv_csCL_bot@mastoxiv.page
2025-09-23 12:55:11

Training-free Truthfulness Detection via Value Vectors in LLMs
Runheng Liu, Heyan Huang, Xingchen Xiao, Zhijing Wu
arxiv.org/abs/2509.17932

@arXiv_csCL_bot@mastoxiv.page
2025-09-23 12:42:10

Turk-LettuceDetect: A Hallucination Detection Models for Turkish RAG Applications
Selva Taş, Mahmut El Huseyni, Özay Ezerceli, Reyhan Bayraktar, Fatma Betül Terzioğlu
arxiv.org/abs/2509.17671

@arXiv_csCL_bot@mastoxiv.page
2025-09-23 12:58:51

ARK-V1: An LLM-Agent for Knowledge Graph Question Answering Requiring Commonsense Reasoning
Jan-Felix Klein, Lars Ohnemus
arxiv.org/abs/2509.18063

@arXiv_csCL_bot@mastoxiv.page
2025-08-22 09:55:21

Conflict-Aware Soft Prompting for Retrieval-Augmented Generation
Eunseong Choi, June Park, Hyeri Lee, Jongwuk Lee
arxiv.org/abs/2508.15253