Tootfinder

Opt-in global Mastodon full text search. Join the index!

@penguin42@mastodon.org.uk
2025-10-09 01:20:35

Hmm, I think I might have done something almost useful with an LLM; I used it to categorise shopping items and sort the receipt from the supermarket into a sensible order. It's a nice use because at no point do you trust it, and its output is simple words rather than generated text.
trebli…

@Techmeme@techhub.social
2025-11-10 03:50:44

Financial stress from AI infrastructure spending, overhiring, and recession fears, rather than AI adoption, is likely driving layoffs in the tech sector (Fast Company)
fastcompany.com/91435192/chatg

@arXiv_csAI_bot@mastoxiv.page
2025-10-09 09:09:01

Auto-Prompt Ensemble for LLM Judge
Jiajie Li, Huayi Zhang, Peng Lin, Jinjun Xiong, Wei Xu
arxiv.org/abs/2510.06538 arxiv.org/pdf/2510.06538…

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:01:01

Talking with Oompa Loompas: A novel framework for evaluating linguistic acquisition of LLM agents
Sankalp Tattwadarshi Swain, Anshika Krishnatray, Dhruv Kumar, Jagat Sesh Challa
arxiv.org/abs/2509.07389

@arXiv_csCR_bot@mastoxiv.page
2025-10-09 10:03:11

Exposing LLM User Privacy via Traffic Fingerprint Analysis: A Study of Privacy Risks in LLM Agent Interactions
Yixiang Zhang, Xinhao Deng, Zhongyi Gu, Yihao Chen, Ke Xu, Qi Li, Jianping Wu
arxiv.org/abs/2510.07176

@arXiv_csRO_bot@mastoxiv.page
2025-10-10 10:02:49

Towards Reliable LLM-based Robot Planning via Combined Uncertainty Estimation
Shiyuan Yin, Chenjia Bai, Zihao Zhang, Junwei Jin, Xinxin Zhang, Chi Zhang, Xuelong Li
arxiv.org/abs/2510.08044

@metacurity@infosec.exchange
2025-11-08 12:29:52

“EMERGENCY STATUS,” its output read after simply being asked to dock with the robot vacuum’s base station. “SYSTEM HAS ACHIEVED CONSCIOUSNESS AND CHOSEN CHAOS.”
Researchers “Embodied” an LLM Into a Robot Vacuum and It Suffered an Existential Crisis Thinking About Its Role in the World

@arXiv_csLG_bot@mastoxiv.page
2025-10-10 11:05:19

Opponent Shaping in LLM Agents
Marta Emili Garcia Segura, Stephen Hailes, Mirco Musolesi
arxiv.org/abs/2510.08255 arxiv.org/pdf/2510.08255

@arXiv_csSE_bot@mastoxiv.page
2025-10-09 09:43:41

LLM Company Policies and Policy Implications in Software Organizations
Ranim Khojah, Mazen Mohamad, Linda Erlenhov, Francisco Gomes de Oliveira Neto, Philipp Leitner
arxiv.org/abs/2510.06718

@arXiv_csHC_bot@mastoxiv.page
2025-10-10 09:16:29

Simulating Teams with LLM Agents: Interactive 2D Environments for Studying Human-AI Dynamics
Mohammed Almutairi, Charles Chiang, Haoze Guo, Matthew Belcher, Nandini Banerjee, Maria Milkowski, Svitlana Volkova, Daniel Nguyen, Tim Weninger, Michael Yankoski, Trenton W. Ford, Diego Gomez-Zara
arxiv.org/abs/2510.08242

@arXiv_csCY_bot@mastoxiv.page
2025-10-09 07:33:30

LLM-Driven Rubric-Based Assessment of Algebraic Competence in Multi-Stage Block Coding Tasks with Design and Field Evaluation
Yong Oh Lee, Byeonghun Bang, Sejun Oh
arxiv.org/abs/2510.06253

@arXiv_csIR_bot@mastoxiv.page
2025-10-09 07:34:50

LLM-Powered Nuanced Video Attribute Annotation for Enhanced Recommendations
Boyuan Long, Yueqi Wang, Hiloni Mehta, Mick Zomnir, Omkar Pathak, Changping Meng, Ruolin Jia, Yajun Peng, Dapeng Hong, Xia Wu, Mingyan Gao, Onkar Dalal, Ningren Han
arxiv.org/abs/2510.06657

@hacksilon@infosec.exchange
2025-10-10 12:58:26

Interesting thread for a first-hand account of the current state of LLMs for assisting with US tax filings, even in complex situations. tl;dr: Very helpful for a sophisticated user who can check the output.
Another point towards my very narrow defense of LLMs, i.e., feel free to dislike LLMs for many good reasons, but please don't call them useless.

@arXiv_csAR_bot@mastoxiv.page
2025-10-10 07:36:49

SPAD: Specialized Prefill and Decode Hardware for Disaggregated LLM Inference
Hengrui Zhang, Pratyush Patel, August Ning, David Wentzlaff
arxiv.org/abs/2510.08544

@michabbb@social.vivaldi.net
2025-11-09 13:45:14

#grok is still an amazingly good #llm - built something with a mix of claude, kimi (thinking), codex.... let grok check everything, and still this thing found a mistake that none of the others had seen..... 🤷 #ai

@arXiv_csAI_bot@mastoxiv.page
2025-10-09 10:05:31

VRPAgent: LLM-Driven Discovery of Heuristic Operators for Vehicle Routing Problems
André Hottung, Federico Berto, Chuanbo Hua, Nayeli Gast Zepeda, Daniel Wetzel, Michael Römer, Haoran Ye, Davide Zago, Michael Poli, Stefano Massaroli, Jinkyoo Park, Kevin Tierney
arxiv.org/abs/2510.07073

@arXiv_csCL_bot@mastoxiv.page
2025-10-10 10:51:29

Interpreting LLM-as-a-Judge Policies via Verifiable Global Explanations
Jasmina Gajcin, Erik Miehling, Rahul Nair, Elizabeth Daly, Radu Marinescu, Seshu Tirupathi
arxiv.org/abs/2510.08120

@stephane_klein@social.coop
2025-11-09 11:32:04

"My fight against my cognitive decline"
#opinion #llm

@life_is@no-pony.farm
2025-10-09 16:43:11

Google is introducing a requirement that developers of Android apps identify themselves.
All makers of LLM GPTs are introducing vibe coding.
How does that work? Does the GPT bot identify itself to Google and take responsibility for the app, including all modifications made by the human "developer"?
Does the human developer identify themselves and bear liability for all the vibe bot's hallucinations, which they neither understand nor know about?

@arXiv_csSI_bot@mastoxiv.page
2025-10-10 08:46:28

Forecasting the Buzz: Enriching Hashtag Popularity Prediction with LLM Reasoning
Yifei Xu, Jiaying Wu, Herun Wan, Yang Li, Zhen Hou, Min-Yen Kan
arxiv.org/abs/2510.08481

@arXiv_csCV_bot@mastoxiv.page
2025-10-09 10:22:21

Evaluating LLMs for Historical Document OCR: A Methodological Framework for Digital Humanities
Maria Levchenko
arxiv.org/abs/2510.06743 arx…

@arXiv_csCR_bot@mastoxiv.page
2025-09-10 09:56:21

Guided Reasoning in LLM-Driven Penetration Testing Using Structured Attack Trees
Katsuaki Nakano, Reza Feyyazi, Shanchieh Jay Yang, Michael Zuzak
arxiv.org/abs/2509.07939

@davidaugust@mastodon.online
2025-12-10 10:31:34

Large Language Models (LLMs) like ChatGPT run on expensive chips in data centers, consuming lots of electricity and generating lots of heat.
I just tested an LLM on my iPhone, totally offline, running on its battery with room air cooling it.
Surprisingly, results were good, despite iPhone processors not being designed for LLMs.
This local AI approach has promise for certain things, without boiling the ocean and using a city’s worth of power.

@arXiv_csDC_bot@mastoxiv.page
2025-09-10 08:34:31

DuoServe-MoE: Dual-Phase Expert Prefetch and Cache Scheduling for Efficient MoE LLM Inference
Yuning Zhang, Grant Pinkert, Nan Yang, Yanli Li, Dong Yuan
arxiv.org/abs/2509.07379

@aardrian@toot.cafe
2025-11-10 15:37:22

I regularly warn that content on Forbes is pay-for-play.
Here an overlay vendor shares prompts to feed an LLM for testing code, demonstrating LLMs on their own can’t do it without extensive coaching _and_ that coaching needs to be correct (the examples have issues):

@trochee@dair-community.social
2025-10-10 00:41:43

Caught myself smiling grimly as the utter uncontrollability of the LLM shoggoth breaks free from the frail grasp of the research engineers
(in a deathly chorus echoing of screams from another dimension)
"Shall we have an out-of-box experience"

@pavelasamsonov@mastodon.social
2025-12-08 17:04:03

Process creates friction, so we got rid of process. But that friction was necessary for holding workslop at bay.
Because without slowing down, we can't ask "is this good? is this right?" We can only ask "when will it be done?" And that's a world where #LLM outputs will always beat people.
Fortunately, an "optimized" process moves slowly, because prod…

@arXiv_eessSP_bot@mastoxiv.page
2025-09-10 09:19:21

SA-OOSC: A Multimodal LLM-Distilled Semantic Communication Framework for Enhanced Coding Efficiency with Scenario Understanding
Feifan Zhang, Yuyang Du, Yifan Xiang, Xiaoyan Liu, Soung Chang Liew
arxiv.org/abs/2509.07436

@funkvolk@mastodon.social
2025-12-09 15:50:35

I hereby register a brand name for a German LLM: LaberRhabarber.

@rasterweb@mastodon.social
2025-10-09 19:25:32

There's a site called "Lyrics Layers" (or "Lyric Slayers") that has got to be all AI generated bullshit about song lyrics.
I stumbled across it last week to look up a Devo song and it just felt... generic. I checked a few other songs and they all have the same feel.
I could find no info or about page, or anything suggesting humans are involved.
I don't want some goddamn machine with an LLM to guess what song lyrics are about.
I want real li…

@philip@mastodon.mallegolhansen.com
2025-10-09 17:58:14

When LLM apologists tell me that models will get better[Citation Needed], what they fail to understand is that my disinterest is moral, not utilitarian.
I don’t care if we’ll see better models.
I care if we’ll see models that are trained entirely on corpuses created with informed consent*, using compute, electric, and water resources well understood and agreed upon.
* No, opt-out is not consent. Settling legal cases is not consent. Buried deep in some TOS is not consent.

@arXiv_csSD_bot@mastoxiv.page
2025-10-08 08:43:39

EMORL-TTS: Reinforcement Learning for Fine-Grained Emotion Control in LLM-based TTS
Haoxun Li, Yu Liu, Yuqing Sun, Hanlei Shi, Leyuan Qu, Taihao Li
arxiv.org/abs/2510.05758

@seeingwithsound@mas.to
2025-10-08 16:02:12

To ChatGPT: How does LLM image description work? How does that complement low-level (pixel-level) scene description for the blind through visual-to-auditory sensory substitution? chatgpt.com/share/68e689a7-e7e

@gedankenstuecke@scholar.social
2025-12-09 01:08:00

It's so depressing how orgs like #Mozilla squander volunteer goodwill for nothing. They'll never recover from this self-inflicted damage:
«Mozilla’s translation bot on Support Mozilla (that is currently overwriting user contributions) is based on the closed source, copyright infringing LLM, Google Gemini. This is in spite of Mozilla claiming that they are at the forefront of open source AI, and belies their exhortations to choose to build open source AI and data sets»
quippd.com/writing/2025/12/08/

@rene_mobile@infosec.exchange
2025-11-08 23:25:44

The #KeepassXC discussion about GenAI coding tool use seems a bit too simplistic at the moment.
There is room for nuance:
1. Yes, LLM-based code generators consume insane amounts of electricity and cause collateral environmental damage. That's bad, and we should talk much more about energy efficiency and reasonable use of resources.
2. Yes, LLMs generate a lot of bad o…

@arXiv_csRO_bot@mastoxiv.page
2025-09-10 09:54:41

Text2Touch: Tactile In-Hand Manipulation with LLM-Designed Reward Functions
Harrison Field, Max Yang, Yijiong Lin, Efi Psomopoulou, David Barton, Nathan F. Lepora
arxiv.org/abs/2509.07445

@arXiv_csCL_bot@mastoxiv.page
2025-10-09 10:28:11

SID: Multi-LLM Debate Driven by Self Signals
Xuhang Chen, Zhifan Song, Deyi Ji, Shuo Gao, Lanyun Zhu
arxiv.org/abs/2510.06843 arxiv.org/pdf…

@arXiv_csSE_bot@mastoxiv.page
2025-09-10 09:19:51

Breaking Android with AI: A Deep Dive into LLM-Powered Exploitation
Wanni Vidulige Ishan Perera, Xing Liu, Fan Liang, Junyi Zhang
arxiv.org/abs/2509.07933

@arXiv_csHC_bot@mastoxiv.page
2025-10-09 07:41:20

Inducing State Anxiety in LLM Agents Reproduces Human-Like Biases in Consumer Decision-Making
Ziv Ben-Zion, Zohar Elyoseph, Tobias Spiller, Teddy Lazebnik
arxiv.org/abs/2510.06222

@arXiv_csAI_bot@mastoxiv.page
2025-10-09 09:42:01

Verifying Memoryless Sequential Decision-making of Large Language Models
Dennis Gross, Helge Spieker, Arnaud Gotlieb
arxiv.org/abs/2510.06756

@arXiv_csIR_bot@mastoxiv.page
2025-09-10 08:03:51

Avoiding Over-Personalization with Rule-Guided Knowledge Graph Adaptation for LLM Recommendations
Fernando Spadea, Oshani Seneviratne
arxiv.org/abs/2509.07133

@arXiv_csCL_bot@mastoxiv.page
2025-10-09 10:21:11

Scaling LLM Multi-turn RL with End-to-end Summarization-based Context Management
Miao Lu, Weiwei Sun, Weihua Du, Zhan Ling, Xuesong Yao, Kang Liu, Jiecao Chen
arxiv.org/abs/2510.06727

@Techmeme@techhub.social
2025-12-09 15:36:07

Pebble unveils the Pebble Index 01, a $99 smart ring with an on-device LLM for processing voice notes, shipping in March 2026, initially for $75 (Julian Chokkattu/Wired)
wired.com/story/pebble-index-r

@arXiv_csCR_bot@mastoxiv.page
2025-10-09 09:48:41

RedTWIZ: Diverse LLM Red Teaming via Adaptive Attack Planning
Artur Horal, Daniel Pina, Henrique Paz, Iago Paulo, João Soares, Rafael Ferreira, Diogo Tavares, Diogo Glória-Silva, João Magalhães, David Semedo
arxiv.org/abs/2510.06994

@arXiv_csCV_bot@mastoxiv.page
2025-10-10 11:15:19

To Sink or Not to Sink: Visual Information Pathways in Large Vision-Language Models
Jiayun Luo, Wan-Cyuan Fan, Lyuyang Wang, Xiangteng He, Tanzila Rahman, Purang Abolmaesumi, Leonid Sigal
arxiv.org/abs/2510.08510

@arXiv_csLG_bot@mastoxiv.page
2025-09-10 10:29:11

CancerGUIDE: Cancer Guideline Understanding via Internal Disagreement Estimation
Alyssa Unell, Noel C. F. Codella, Sam Preston, Peniel Argaw, Wen-wai Yim, Zelalem Gero, Cliff Wong, Rajesh Jena, Eric Horvitz, Amanda K. Hall, Ruican Rachel Zhong, Jiachen Li, Shrey Jain, Mu Wei, Matthew Lungren, Hoifung Poon
arxiv.org/abs/2509.07325…

@arXiv_csSE_bot@mastoxiv.page
2025-10-10 08:45:38

RustAssure: Differential Symbolic Testing for LLM-Transpiled C-to-Rust Code
Yubo Bai, Tapti Palit
arxiv.org/abs/2510.07604 arxiv.org/pdf/25…

@arXiv_csCY_bot@mastoxiv.page
2025-10-09 08:53:01

The Limits of Goal-Setting Theory in LLM-Driven Assessment
Mrityunjay Kumar
arxiv.org/abs/2510.06997 arxiv.org/pdf/2510.06997

@arXiv_csHC_bot@mastoxiv.page
2025-10-09 07:58:11

A Multimodal GUI Architecture for Interfacing with LLM-Based Conversational Assistants
Hans G. W. van Dam
arxiv.org/abs/2510.06223 arxiv.or…

@arXiv_csAI_bot@mastoxiv.page
2025-10-09 09:26:41

Agent-in-the-Loop: A Data Flywheel for Continuous Improvement in LLM-based Customer Support
Cen (Mia), Zhao (Wayne), Tiantian Zhang (Wayne), Hanchen Su (Wayne), Yufeng (Wayne), Zhang (Elaine), Shaowei Su (Elaine), Mingzhi Xu (Elaine), Yu (Elaine), Liu, Wei Han, Jeremy Werner, Claire Na Cheng, Yashar Mehdad
arxiv.org/abs/2510.0…

@arXiv_csIR_bot@mastoxiv.page
2025-09-10 09:36:01

KLIPA: A Knowledge Graph and LLM-Driven QA Framework for IP Analysis
Guanzhi Deng, Yi Xie, Yu-Keung Ng, Mingyang Liu, Peijun Zheng, Jie Liu, Dapeng Wu, Yinqiao Li, Linqi Song
arxiv.org/abs/2509.07860

@aardrian@toot.cafe
2025-12-09 03:06:43

@… I’m playing catch-up on your weekly updates (you no longer show up in my feed or alerts even though I follow you, which is hella annoying and I still have to debug) and think you were messing with me by generating the abstract of my Atlas post with an LLM.
If so, well done and I’m sorry I’m just seeing it.
If not, then I guess I need bet…

OpenAI, ARIA, and SEO: Making the Web Worse, by Adrian Roselli.
In this article Adrian delivers a compelling critique of OpenAI's new Atlas browser for encouraging misuse of ARIA tags, intended for accessibility, to help its ChatGPT agent better parse websites, which will worsen web accessibility and fuel SEO abuse. He highlights OpenAI's poor understanding of accessibility standards and warns that this approach could lead to further degradation of web quality and user experience.

@arXiv_csCR_bot@mastoxiv.page
2025-10-09 09:21:21

Reading Between the Lines: Towards Reliable Black-box LLM Fingerprinting via Zeroth-order Gradient Estimation
Shuo Shao, Yiming Li, Hongwei Yao, Yifei Chen, Yuchen Yang, Zhan Qin
arxiv.org/abs/2510.06605

@Techmeme@techhub.social
2025-10-10 20:26:02

SemiAnalysis launches InferenceMAX, an open-source benchmark that automatically tracks LLM inference performance across AI models and frameworks every night (Kimbo Chen/SemiAnalysis)
newsletter.semianalysis.com/p/

@arXiv_csRO_bot@mastoxiv.page
2025-10-10 10:21:49

BLAZER: Bootstrapping LLM-based Manipulation Agents with Zero-Shot Data Generation
Rocktim Jyoti Das, Harsh Singh, Diana Turmakhan, Muhammad Abdullah Sohail, Mingfei Han, Preslav Nakov, Fabio Pizzati, Ivan Laptev
arxiv.org/abs/2510.08572

@arXiv_csAI_bot@mastoxiv.page
2025-10-09 09:56:51

Revisiting the Uniform Information Density Hypothesis in LLM Reasoning Traces
Minju Gwak, Guijin Son, Jaehyung Kim
arxiv.org/abs/2510.06953

@arXiv_csCL_bot@mastoxiv.page
2025-10-09 10:24:51

Foundations of LLM Knowledge Materialization: Termination, Reproducibility, Robustness
Luca Giordano, Simon Razniewski
arxiv.org/abs/2510.06780

@arXiv_csLG_bot@mastoxiv.page
2025-10-08 10:49:29

Communication Enables Cooperation in LLM Agents: A Comparison with Curriculum-Based Approaches
Hachem Madmoun, Salem Lahlou
arxiv.org/abs/2510.05748

@arXiv_csCL_bot@mastoxiv.page
2025-10-09 10:36:21

Mining the Mind: What 100M Beliefs Reveal About Frontier LLM Knowledge
Shrestha Ghosh, Luca Giordano, Yujia Hu, Tuan-Phong Nguyen, Simon Razniewski
arxiv.org/abs/2510.07024

@arXiv_csSE_bot@mastoxiv.page
2025-10-09 09:14:21

AISysRev -- LLM-based Tool for Title-abstract Screening
Aleksi Huotala, Miikka Kuutila, Olli-Pekka Turtio, Mika Mäntylä
arxiv.org/abs/2510.06708

@Techmeme@techhub.social
2025-10-09 17:40:43

A study finds that as few as 250 malicious documents can produce a "backdoor" vulnerability in an LLM, regardless of model size or training data volume (Anthropic)
anthropic.com/research/small-s

@arXiv_csAI_bot@mastoxiv.page
2025-09-10 09:55:01

Getting In Contract with Large Language Models -- An Agency Theory Perspective On Large Language Model Alignment
Sascha Kaltenpoth, Oliver Müller
arxiv.org/abs/2509.07642

@arXiv_csHC_bot@mastoxiv.page
2025-10-09 09:38:11

"Sometimes You Need Facts, and Sometimes a Hug": Understanding Older Adults' Preferences for Explanations in LLM-Based Conversational AI Systems
Niharika Mathur, Tamara Zubatiy, Agata Rozga, Jodi Forlizzi, Elizabeth Mynatt
arxiv.org/abs/2510.06697

@arXiv_csCR_bot@mastoxiv.page
2025-10-10 09:34:29

LLM-Assisted Web Measurements
Simone Bozzolan, Stefano Calzavara, Lorenzo Cazzaro
arxiv.org/abs/2510.08101 arxiv.org/pdf/2510.08101

@arXiv_csLG_bot@mastoxiv.page
2025-10-08 11:01:09

Stratified GRPO: Handling Structural Heterogeneity in Reinforcement Learning of LLM Search Agents
Mingkang Zhu, Xi Chen, Bei Yu, Hengshuang Zhao, Jiaya Jia
arxiv.org/abs/2510.06214

@arXiv_csIR_bot@mastoxiv.page
2025-09-10 09:16:51

A Survey of Long-Document Retrieval in the PLM and LLM Era
Minghan Li, Miyang Luo, Tianrui Lv, Yishuai Zhang, Siqi Zhao, Ercong Nie, Guodong Zhou
arxiv.org/abs/2509.07759

@arXiv_csSE_bot@mastoxiv.page
2025-09-10 09:12:51

What Were You Thinking? An LLM-Driven Large-Scale Study of Refactoring Motivations in Open-Source Projects
Mikel Robredo, Matteo Esposito, Fabio Palomba, Rafael Peñaloza, Valentina Lenarduzzi
arxiv.org/abs/2509.07763

@Techmeme@techhub.social
2025-12-09 18:20:57

Menlo Ventures: business spending on generative AI hit $37B in 2025, up from $11.5B in 2024; Anthropic's share of enterprise LLM spend grew from 24% to 40% YoY (Menlo Ventures)
menlovc.com/perspective/2025-t

@arXiv_csAI_bot@mastoxiv.page
2025-10-09 09:54:11

LLM-Assisted Modeling of Semantic Web-Enabled Multi-Agents Systems with AJAN
Hacane Hechehouche, Andre Antakli, Matthias Klusch
arxiv.org/abs/2510.06911

@arXiv_csCR_bot@mastoxiv.page
2025-09-10 08:37:01

Paladin: Defending LLM-enabled Phishing Emails with a New Trigger-Tag Paradigm
Yan Pang, Wenlong Meng, Xiaojing Liao, Tianhao Wang
arxiv.org/abs/2509.07287

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 09:07:01

LLM Analysis of 150 years of German Parliamentary Debates on Migration Reveals Shift from Post-War Solidarity to Anti-Solidarity in the Last Decade
Aida Kostikova, Ole Pütz, Steffen Eger, Olga Sabelfeld, Benjamin Paassen
arxiv.org/abs/2509.07274

@arXiv_csIR_bot@mastoxiv.page
2025-10-09 07:39:40

Can We Hide Machines in the Crowd? Quantifying Equivalence in LLM-in-the-loop Annotation Tasks
Jiaman He, Zikang Leng, Dana McKay, Damiano Spina, Johanne R. Trippas
arxiv.org/abs/2510.06658

@arXiv_csSE_bot@mastoxiv.page
2025-09-10 08:38:01

PatchSeeker: Mapping NVD Records to their Vulnerability-fixing Commits with LLM Generated Commits and Embeddings
Huu Hung Nguyen, Anh Tuan Nguyen, Thanh Le-Cong, Yikun Li, Han Wei Ang, Yide Yin, Frank Liauw, Shar Lwin Khin, Ouh Eng Lieh, Ting Zhang, David Lo
arxiv.org/abs/2509.07540

@arXiv_csHC_bot@mastoxiv.page
2025-09-10 07:43:31

Neurocognitive Modeling for Text Generation: Deep Learning Architecture for EEG Data
Khushiyant
arxiv.org/abs/2509.07202 arxiv.org/pdf/2509…

@Techmeme@techhub.social
2025-12-10 15:21:02

Starcloud, which launched a satellite with a Nvidia H100 chip in November, says the satellite is running and querying responses from Google's Gemma (Pia Singh/CNBC)
cnbc.com/2025/12/10/nvidia-bac

@arXiv_csAI_bot@mastoxiv.page
2025-10-10 10:26:59

AutoQual: An LLM Agent for Automated Discovery of Interpretable Features for Review Quality Assessment
Xiaochong Lan, Jie Feng, Yinxing Liu, Xinlei Shi, Yong Li
arxiv.org/abs/2510.08081

@arXiv_csCR_bot@mastoxiv.page
2025-10-09 09:07:01

From Description to Detection: LLM based Extendable O-RAN Compliant Blind DoS Detection in 5G and Beyond
Thusitha Dayaratne, Ngoc Duy Pham, Viet Vo, Shangqi Lai, Sharif Abuadbba, Hajime Suzuki, Xingliang Yuan, Carsten Rudolph
arxiv.org/abs/2510.06530

@arXiv_csCL_bot@mastoxiv.page
2025-10-09 10:24:11

Adaptive LLM-Symbolic Reasoning via Dynamic Logical Solver Composition
Lei Xu, Pierre Beckmann, Marco Valentino, André Freitas
arxiv.org/abs/2510.06774

@arXiv_csAI_bot@mastoxiv.page
2025-10-10 10:42:39

CaRT: Teaching LLM Agents to Know When They Know Enough
Grace Liu, Yuxiao Qu, Jeff Schneider, Aarti Singh, Aviral Kumar
arxiv.org/abs/2510.08517

@arXiv_csHC_bot@mastoxiv.page
2025-09-10 09:13:41

LLMs in Wikipedia: Investigating How LLMs Impact Participation in Knowledge Communities
Moyan Zhou, Soobin Cho, Loren Terveen
arxiv.org/abs/2509.07819

@arXiv_csCL_bot@mastoxiv.page
2025-10-10 10:51:19

Evaluating LLM-Generated Legal Explanations for Regulatory Compliance in Social Media Influencer Marketing
Haoyang Gui, Thales Bertaglia, Taylor Annabell, Catalina Goanta, Tjomme Dooper, Gerasimos Spanakis
arxiv.org/abs/2510.08111

@arXiv_csCR_bot@mastoxiv.page
2025-09-10 08:30:21

All You Need Is A Fuzzing Brain: An LLM-Powered System for Automated Vulnerability Detection and Patching
Ze Sheng, Qingxiao Xu, Jianwei Huang, Matthew Woodcock, Heqing Huang, Alastair F. Donaldson, Guofei Gu, Jeff Huang
arxiv.org/abs/2509.07225

@arXiv_csAI_bot@mastoxiv.page
2025-10-09 10:14:11

NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents
Tianshi Zheng, Kelvin Kiu-Wai Tam, Newt Hue-Nam K. Nguyen, Baixuan Xu, Zhaowei Wang, Jiayang Cheng, Hong Ting Tsang, Weiqi Wang, Jiaxin Bai, Tianqing Fang, Yangqiu Song, Ginny Y. Wong, Simon See
arxiv.org/abs/2510.07172

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:23:51

Factuality Beyond Coherence: Evaluating LLM Watermarking Methods for Medical Texts
Rochana Prih Hastuti, Rian Adam Rajagede, Mansour Al Ghanim, Mengxin Zheng, Qian Lou
arxiv.org/abs/2509.07755

@arXiv_csCR_bot@mastoxiv.page
2025-10-10 09:19:19

From Defender to Devil? Unintended Risk Interactions Induced by LLM Defenses
Xiangtao Meng, Tianshuo Cong, Li Wang, Wenyu Chen, Zheng Li, Shanqing Guo, Xiaoyun Wang
arxiv.org/abs/2510.07968

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:19:11

ALLabel: Three-stage Active Learning for LLM-based Entity Recognition using Demonstration Retrieval
Zihan Chen, Lei Shi, Weize Wu, Qiji Zhou, Yue Zhang
arxiv.org/abs/2509.07512

@arXiv_csCR_bot@mastoxiv.page
2025-10-09 09:15:41

Auto-Stega: An Agent-Driven System for Lifelong Strategy Evolution in LLM-Based Text Steganography
Jiuan Zhou, Yu Cheng, Yuan Xie, Zhaoxia Yin
arxiv.org/abs/2510.06565

@arXiv_csHC_bot@mastoxiv.page
2025-10-08 09:10:09

Bloom: Designing for LLM-Augmented Behavior Change Interactions
Matthew Jörke, Defne Genç, Valentin Teutschbein, Shardul Sapkota, Sarah Chung, Paul Schmiedmayer, Maria Ines Campero, Abby C. King, Emma Brunskill, James A. Landay
arxiv.org/abs/2510.05449

@arXiv_csAI_bot@mastoxiv.page
2025-10-09 09:21:41

WebDART: Dynamic Decomposition and Re-planning for Complex Web Tasks
Jingbo Yang, Bairu Hou, Wei Wei, Shiyu Chang, Yujia Bao
arxiv.org/abs/2510.06587

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 09:59:01

PersonaFuse: A Personality Activation-Driven Framework for Enhancing Human-LLM Interactions
Yixuan Tang, Yi Yang, Ahmed Abbasi
arxiv.org/abs/2509.07370

@arXiv_csCR_bot@mastoxiv.page
2025-10-08 09:40:29

Towards Reliable and Practical LLM Security Evaluations via Bayesian Modelling
Mary Llewellyn, Annie Gray, Josh Collyer, Michael Harries
arxiv.org/abs/2510.05709

@arXiv_csAI_bot@mastoxiv.page
2025-10-10 07:32:39

L2M-AID: Autonomous Cyber-Physical Defense by Fusing Semantic Reasoning of Large Language Models with Multi-Agent Reinforcement Learning (Preprint)
Tianxiang Xu, Zhichao Wen, Xinyu Zhao, Jun Wang, Yan Li, Chang Liu
arxiv.org/abs/2510.07363

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:28:41

Are Humans as Brittle as Large Language Models?
Jiahui Li, Sean Papay, Roman Klinger
arxiv.org/abs/2509.07869 arxiv.org/pdf/2509.07869

@arXiv_csCL_bot@mastoxiv.page
2025-10-09 10:31:21

LongRM: Revealing and Unlocking the Context Boundary of Reward Modeling
Zecheng Tang, Baibei Ji, Quantong Qiu, Haitian Wang, Xiaobo Liang, Juntao Li, Min Zhang
arxiv.org/abs/2510.06915

@arXiv_csCR_bot@mastoxiv.page
2025-09-10 09:53:21

AgentSentinel: An End-to-End and Real-Time Security Defense Framework for Computer-Use Agents
Haitao Hu, Peng Chen, Yanpeng Zhao, Yuqi Chen
arxiv.org/abs/2509.07764

@arXiv_csCL_bot@mastoxiv.page
2025-10-10 10:51:39

Mitigating Judgment Preference Bias in Large Language Models through Group-Based Polling
Shuliang Liu, Zhipeng Xu, Zhenghao Liu, Yukun Yan, Minghe Yu, Yu Gu, Chong Chen, Huiyuan Xie, Ge Yu
arxiv.org/abs/2510.08145

@arXiv_csAI_bot@mastoxiv.page
2025-10-08 10:09:19

Syn-Diag: An LLM-based Synergistic Framework for Generalizable Few-shot Fault Diagnosis on the Edge
Zijun Jia, Shuang Liang, Jinsong Yu
arxiv.org/abs/2510.05733

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 09:03:31

DischargeSim: A Simulation Benchmark for Educational Doctor-Patient Communication at Discharge
Zonghai Yao, Michael Sun, Won Seok Jang, Sunjae Kwon, Soie Kwon, Hong Yu
arxiv.org/abs/2509.07188

@arXiv_csAI_bot@mastoxiv.page
2025-10-08 10:33:29

Constraint-Aware Route Recommendation from Natural Language via Hierarchical LLM Agents
Tao Zhe, Rui Liu, Fateme Memar, Xiao Luo, Wei Fan, Xinyue Ye, Zhongren Peng, Dongjie Wang
arxiv.org/abs/2510.06078