Tootfinder

Opt-in global Mastodon full text search. Join the index!

@kubikpixel@chaos.social
2025-07-07 05:05:17

»Researchers hide LLM prompts in papers to get better reviews:
Researchers hide prompts in their papers that are meant to bring better reviews and can expose lazy reviewers.«
Aha, so that's how clever the AI is, doing research about itself within AI? Well, in my opinion that was foreseeable, since fraud has always cropped up in science. At least this sparks a discussion.
🤖

@arXiv_csCR_bot@mastoxiv.page
2025-08-06 08:53:40

VFLAIR-LLM: A Comprehensive Framework and Benchmark for Split Learning of LLMs
Zixuan Gu, Qiufeng Fan, Long Sun, Yang Liu, Xiaojun Ye
arxiv.org/abs/2508.03097

@arXiv_csCL_bot@mastoxiv.page
2025-08-06 10:01:00

Pay What LLM Wants: Can LLM Simulate Economics Experiment with 522 Real-human Persona?
Junhyuk Choi, Hyeonchu Park, Haemin Lee, Hyebeen Shin, Hyun Joung Jin, Bugeun Kim
arxiv.org/abs/2508.03262

@arXiv_csSE_bot@mastoxiv.page
2025-08-06 10:12:30

ReFuzzer: Feedback-Driven Approach to Enhance Validity of LLM-Generated Test Programs
Iti Shree, Karine Even-Mendoz, Tomasz Radzik
arxiv.org/abs/2508.03603

@mgorny@social.treehouse.systems
2025-09-07 02:42:14

#LLM folks when someone points out that it's unethical: "it's just a tool, it depends on how you use it!"
LLM folks when "#AI" messes up and they're asked to take responsibility: 👀 [monkey side eyes meme]

@arXiv_csSD_bot@mastoxiv.page
2025-08-07 08:38:04

Efficient Scaling for LLM-based ASR
Bingshen Mu, Yiwen Shao, Kun Wei, Dong Yu, Lei Xie
arxiv.org/abs/2508.04096 arxiv.org/pdf/2508.04096

@newsie@darktundra.xyz
2025-08-07 15:21:14

More than 130,000 Claude, Grok, ChatGPT, and Other LLM Chats Readable on Archive.org 404media.co/more-than-130-000-

@mgorny@pol.social
2025-09-07 02:43:19

#LLM folks, when someone points out to them the unethical nature of this solution: "it's just a tool, you choose how to use it!"
LLM folks, when "#AI" screws something up and they're supposed to take responsibility for it: 👀 [plush monkey meme]

@arXiv_csIR_bot@mastoxiv.page
2025-08-07 07:38:43

Privacy Risks of LLM-Empowered Recommender Systems: An Inversion Attack Perspective
Yubo Wang, Min Tang, Nuo Shen, Shujie Cui, Weiqing Wang
arxiv.org/abs/2508.03703

@arXiv_csAI_bot@mastoxiv.page
2025-08-06 09:47:00

Toward Verifiable Misinformation Detection: A Multi-Tool LLM Agent Framework
Zikun Cui, Tianyi Huang, Chia-En Chiang, Cuiqianhe Du
arxiv.org/abs/2508.03092

@publicvoit@graz.social
2025-08-06 21:12:52

"#Slopsquatting is a type of #cybersquatting. It is the practice of registering a non-existent software package name that a large language model (#LLM) may hallucinate in its output, whereby someone u…

@poppastring@dotnet.social
2025-09-04 14:30:25

Prompt Attacks Against LLM-Powered Assistants in Production Are Practical and Dangerous
#security #llm
arxiv.org/abs/2508.12175

@privacity@social.linux.pizza
2025-07-06 23:39:20

Nature of Data in Pre-Trained Large Language Models
fpf.org/blog/nature-of-data-in
@…

@arXiv_qbiobm_bot@mastoxiv.page
2025-08-07 08:23:43

MD-LLM-1: A Large Language Model for Molecular Dynamics
Mhd Hussein Murtada, Z. Faidon Brotzakis, Michele Vendruscolo
arxiv.org/abs/2508.03709

@arXiv_csCV_bot@mastoxiv.page
2025-08-06 10:35:50

R2GenKG: Hierarchical Multi-modal Knowledge Graph for LLM-based Radiology Report Generation
Futian Wang, Yuhan Qiao, Xiao Wang, Fuling Wang, Yuxiang Zhang, Dengdi Sun
arxiv.org/abs/2508.03426

@heiseonline@social.heise.de
2025-09-05 08:04:00

Risk management and resilience in IT security: IT-Sicherheitstag Dortmund
Practical talks on September 16, ranging from current hacking methods and protection against LLM attacks to strategies for greater cyber resilience.

@arXiv_csPL_bot@mastoxiv.page
2025-08-06 07:50:30

SAGE-HLS: Syntax-Aware AST-Guided LLM for High-Level Synthesis Code Generation
M Zafir Sadik Khan, Nowfel Mashnoor, Mohammad Akyash, Kimia Azar, Hadi Kamali
arxiv.org/abs/2508.03558

@arXiv_csDL_bot@mastoxiv.page
2025-08-06 08:33:20

Who Gets Cited? Gender- and Majority-Bias in LLM-Driven Reference Selection
Jiangen He
arxiv.org/abs/2508.02740 arxiv.org/pdf/2508.02740

@arXiv_csCR_bot@mastoxiv.page
2025-08-06 08:54:10

Attack the Messages, Not the Agents: A Multi-round Adaptive Stealthy Tampering Framework for LLM-MAS
Bingyu Yan, Ziyi Zhou, Xiaoming Zhang, Chaozhuo Li, Ruilin Zeng, Yirui Qi, Tianbo Wang, Litian Zhang
arxiv.org/abs/2508.03125

@thomasfuchs@hachyderm.io
2025-07-07 01:38:13

Even if “AI” worked (it doesn’t), there’s many reasons why you shouldn’t use it:
1. It’s destroying Internet sites that you love as you use chat bots instead of actually going to sources of information—this will cause them to be less active and eventually shut down.
2. Pollution and water use from server farms cause immediate harm; often—just like other heavy industry—these are built in underprivileged communities and harming poor people. Without any benefits as the big tech companies get tax breaks and don’t pay for power, while workers aren’t from the community but commute in.
3. The basic underlying models of any LLM rely on stolen data, even when specific extra data is obtained legally. Chatbots can’t learn to speak English just by reading open source code.
4. You’re fueling a speculation bubble that is costing many people their jobs—because the illusion of “efficiency” is kept up by firing people and counting that as profit.
5. Whenever you use the great cheat machine in the cloud you’re robbing yourself from doing real research, writing or coding—literally atrophying your brain and making you stupider.
It’s a grift, through and through.

@khalidabuhakmeh@mastodon.social
2025-07-07 14:01:55

I'm starting a new software consultancy called "LLM" or "Losers Like Machines," specializing in fixing people's AI-generated trash. $2000/hour

@arXiv_csDC_bot@mastoxiv.page
2025-08-06 09:00:20

Block: Balancing Load in LLM Serving with Context, Knowledge and Predictive Scheduling
Wei Da, Evangelia Kalyvianaki
arxiv.org/abs/2508.03611

@stefan@gardenstate.social
2025-08-06 16:59:32

Will you see an LLM president in your lifetime?

@mariyadelano@hachyderm.io
2025-07-07 16:26:18

For context - I didn't write that prompt or feed particularly bad input to prove my point. I was trying out Claude, which so many people tell me is "the good" LLM product and above-and-beyond.
This prompt was written by Anthropic's own team, and featured in their gallery of web apps that can be built with this AI.
THEY THINK THIS IS GOOD!

@arXiv_csLG_bot@mastoxiv.page
2025-09-05 10:20:11

TAGAL: Tabular Data Generation using Agentic LLM Methods
Benoît Ronval, Pierre Dupont, Siegfried Nijssen
arxiv.org/abs/2509.04152 arxiv.o…

@arXiv_csCL_bot@mastoxiv.page
2025-08-07 10:23:04

StyliTruth : Unlocking Stylized yet Truthful LLM Generation via Disentangled Steering
Chenglei Shen, Zhongxiang Sun, Teng Shi, Xiao Zhang, Jun Xu
arxiv.org/abs/2508.04530

@aardrian@toot.cafe
2025-08-07 15:29:01

I am at the point where if I see ✨ in a post or email or message or whatever, I assume it’s about some fake-AI tool or feature or shill or scam, and then ignore or delete it (sometimes blocking the perpetrator).
Anyway, fair warning that those corporate LLM anus logos may have ruined that emoji (the symbols in the emoji are generally little four-pointed stars).

@arXiv_csCE_bot@mastoxiv.page
2025-08-06 09:07:50

Learning to Incentivize: LLM-Empowered Contract for AIGC Offloading in Teleoperation
Zijun Zhan, Yaxian Dong, Daniel Mawunyo Doe, Yuqing Hu, Shuai Li, Shaohua Cao, Zhu Han
arxiv.org/abs/2508.03464

@_tillwe_@mastodon.social
2025-06-07 12:04:03

Hmm, instead of the usual 10 to at most 100 daily hits, almost 3000 blog hits yesterday (but only 10 visitors). Sounds very much like an LLM crawler - or what else could it be?

@adulau@infosec.exchange
2025-09-04 07:57:53

The recent release of Apertus, a fully open suite of large language models (LLMs), is super interesting.
The technical report provides plenty of details about the entire process.
#ai #opensource #llm

@penguin42@mastodon.org.uk
2025-09-06 20:57:32

I've switched my local play LLM from Gemma3 to Qwen3-Coder-30B-A3B-Instruct-IQ4_NL - it's actually usefully fast on my CPU (AMD 3950x - llama -t 32), about 14 tokens/s; Gemma3 is only something like 2-3 tokens/s. It's the first fast one I've found that doesn't produce too much gibberish. Not as good at translation though, hmm.
I guess the speed is due to MoE?
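
The MoE guess checks out arithmetically. A back-of-the-envelope sketch, assuming ~3B active parameters for the "A3B" MoE and a ~27B-parameter dense comparison model (both counts assumed, not measured):

```python
# CPU decoding is roughly memory-bandwidth-bound, so tokens/s scales with
# the parameters actually touched per token. A MoE model stores all its
# experts but activates only a few per token; a dense model touches
# everything every token.

active_params_moe = 3e9   # "A3B" = ~3B active parameters per token (assumed)
dense_params = 27e9       # dense Gemma3-class model (assumed)

speedup = dense_params / active_params_moe
print(f"expected per-token speedup: ~{speedup:.0f}x")
```

The observed 14 vs 2-3 tokens/s is a 5-7x gap: same order of magnitude, with quantization and cache behavior plausibly accounting for the difference.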

@samir@functional.computer
2025-07-07 17:54:44

@… Yes, so much yes.
I have started using “a checklist / comparison table where none was necessary” as an LLM heuristic now, because there are so many terrible articles (many written before mainstream LLMs were released) which use a checklist as a replacement for any actual content, and now the machines have been trained to regurgitate them.
My thinking …

@crell@phpc.social
2025-08-06 23:28:33

My plane still hasn't taken off after an hour, because United Airlines had a *system wide* crash in the software that computes the cargo load and distribution needed for takeoff.
WHY THE EVER LOVING FUCK DOES SUCH A SYSTEM NEED A NETWORK CONNECTION? YOU COULD HANDLE ALL THE NEEDED COMPUTATION ON A BLOODY PHONE IN THE CAPTAIN'S POCKET!
Are they using a gratuitous LLM or something? What the hell, United?

@marekmcgann@sciences.social
2025-09-07 11:00:24

Students aren't generally looking to cheat, and most think LLM use should not be allowed in college education. Let's not allow salespeople to speak for them.

Technology companies are not shy of falsely claiming that students are lazy or lack writing skills. Such a mantra serves only to sell products, or to cover up and excuse the overworking of students by our colleagues, with no reflection on reality. We condemn those claims and reassert students’ agency vis-à-vis corporate control.

@fell@ma.fellr.net
2025-08-07 09:07:48

What I like about the @… built-in LLM answers:
- It's only(!) triggered on request, i.e. when your query ends with a question mark.
- It always admits when it couldn't find good information or there is no clear answer.
- It always cites sources, usually word for word.
I find myself not so much "believing the AI" but rather us…
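
The first bullet describes a simple routing rule: the LLM path is opted into by ending the query with a question mark. A hypothetical sketch of that gate (all names invented here, not the search engine's actual code):

```python
# Sketch of the opt-in trigger described above: only route a query to the
# LLM answer path when the user explicitly ends it with a question mark;
# everything else stays a plain search.

def should_answer_with_llm(query: str) -> bool:
    return query.strip().endswith("?")

def handle(query: str) -> str:
    if should_answer_with_llm(query):
        return f"LLM answer path (with cited sources) for: {query!r}"
    return f"plain search results for: {query!r}"

print(handle("rust borrow checker rules?"))
print(handle("rust borrow checker rules"))
```

The nice property is that the default path never changes; the model is invoked only on an explicit signal from the user.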

@arXiv_csSE_bot@mastoxiv.page
2025-08-06 08:20:30

Blueprint First, Model Second: A Framework for Deterministic LLM Workflow
Libin Qiu, Yuhang Ye, Zhirong Gao, Xide Zou, Junfu Chen, Ziming Gui, Weizhi Huang, Xiaobo Xue, Wenkai Qiu, Kun Zhao
arxiv.org/abs/2508.02721

@arXiv_csCY_bot@mastoxiv.page
2025-09-05 07:41:40

Are LLM Agents the New RPA? A Comparative Study with RPA Across Enterprise Workflows
Petr Průcha, Michaela Matoušková, Jan Strnad
arxiv.org/abs/2509.04198

@grumpybozo@toad.social
2025-08-06 20:53:16

I already have a deeply-trained deterministic expert system using carbon-based storage and processing which gives me the answer provided by an LLM ('the commands to run to get an answer') in an instant.
It even knows not to tell me 'netstat' when I'm on a FreeBSD box. infosec.exchange/…

@philip@mastodon.mallegolhansen.com
2025-07-07 21:05:32

@… I imagine they just type “Is that true?” into the LLM.
Actually, that’s probably expecting too much.

@arXiv_csOS_bot@mastoxiv.page
2025-08-06 07:43:30

AgentSight: System-Level Observability for AI Agents Using eBPF
Yusheng Zheng, Yanpeng Hu, Tong Yu, Andi Quinn
arxiv.org/abs/2508.02736 arx…

@arXiv_csHC_bot@mastoxiv.page
2025-08-05 11:22:20

Eye2Recall: Exploring the Design of Enhancing Reminiscence Activities via Eye Tracking-Based LLM-Powered Interaction Experience for Older Adults
Lei Han, Mingnan Wei, Qiongyan Chen, Anqi Wang, Rong Pang, Kefei Liu, Rongrong Chen, David Yip
arxiv.org/abs/2508.02232

@mgorny@social.treehouse.systems
2025-08-07 07:29:47

Claiming that LLMs bring us closer to AGI is like claiming that bullshitting brings one closer to wisdom.
Sure, you need "some" knowledge on different topics to bullshit successfully. Still, what's the point if all that knowledge is buried under an avalanche of lies? You probably can't distinguish what you knew from what you made up anymore.
#AI #LLM

@arXiv_csMM_bot@mastoxiv.page
2025-08-07 07:43:44

LUST: A Multi-Modal Framework with Hierarchical LLM-based Scoring for Learned Thematic Significance Tracking in Multimedia Content
Anderson de Lima Luiz
arxiv.org/abs/2508.04353

@nobodyinperson@fosstodon.org
2025-06-07 09:04:12

Oh wow ​LLMs are just so terrible. 🤦‍♂️
I made a #systemd service watcher¹ ( :nixos: #NixOS module²³) which regularly feeds systemd status outputs into an LLM (mistral here) and sends me an email if it thinks it found real problems. Well, now I always get alarmist emails with bullshit warnings, suc…

An email:

Subject: ⚠ yann-desktop-nixos forgejo: "Urgent: Forgejo Service Overrun - Possible Resource Issue"

Diagnosis
=========

 PROBLEM
   Forgejo service has been running for over 4 hours without restarting or signaling an error, which is unusual. This may indicate a problem with the application or a resource consumption issue. It's recommended to investigate further and possibly restart the service.
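
The pipeline described above (systemd status output → LLM judge → email) can be sketched roughly like this; the unit name, the judge, and the alert channel are all placeholders, and the actual LLM call is stubbed out:

```python
# Rough sketch of the watcher pipeline (names and the LLM call are
# placeholders): capture `systemctl status` output, ask a local model
# whether it sees a real problem, and alert only when it does not
# answer "NO PROBLEM".
import subprocess

def service_status(unit: str) -> str:
    """Capture `systemctl status` output for one unit (exit code ignored)."""
    proc = subprocess.run(["systemctl", "status", unit],
                          capture_output=True, text=True)
    return proc.stdout

def triage(status_text: str, llm, notify) -> None:
    """Forward the status to an LLM judge; alert unless it says NO PROBLEM."""
    verdict = llm(status_text)
    if not verdict.upper().startswith("NO PROBLEM"):
        notify("⚠ " + verdict)

def watch(unit: str, llm, notify) -> None:
    triage(service_status(unit), llm, notify)
```

As the alarmist email shows, the judge is the weak link: without an explicit "reply NO PROBLEM unless the status shows an actual failure" instruction in the prompt, ordinary uptime gets escalated as urgent.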

@veit@mastodon.social
2025-07-01 06:10:31

I’ve written about design patterns for securing LLM agents: #AI

@arXiv_csIR_bot@mastoxiv.page
2025-08-07 09:09:14

ViLLA-MMBench: A Unified Benchmark Suite for LLM-Augmented Multimodal Movie Recommendation
Fatemeh Nazary, Ali Tourani, Yashar Deldjoo, Tommaso Di Noia
arxiv.org/abs/2508.04206

@kurt@nelson.fun
2025-06-07 20:40:21

I appreciate the idea that you can respond to tickets really fast, but when it is obvious some sort of LLM is being used to respond and not answering the question asked, it is worse.
I'm looking at you clicks.tech

@arXiv_csAI_bot@mastoxiv.page
2025-08-06 10:17:10

Refining Critical Thinking in LLM Code Generation: A Faulty Premise-based Evaluation Framework
Jialin Li, Jinzhe Li, Gengxu Li, Yi Chang, Yuan Wu
arxiv.org/abs/2508.03622

@portaloffreedom@social.linux.pizza
2025-08-06 21:29:12

Honestly, I don't want to do a PhD for every LLM query that random people make. It was very exhausting to do just one.

@thomasrenkert@hcommons.social
2025-09-02 08:29:04

Big News! The completely #opensource #LLM #Apertus 🇨🇭 has been released today:
📰

Apertus evaluation graph
Data Protection and Copyright Requests
For removal requests of personally identifiable information (PII) or of copyrighted content, please contact the respective dataset owners or us directly

llm-privacy-requests@swiss-ai.org
llm-copyright-requests@swiss-ai.org

@arXiv_statME_bot@mastoxiv.page
2025-09-05 09:20:01

How many patients could we save with LLM priors?
Shota Arai, David Selby, Andrew Vargo, Sebastian Vollmer
arxiv.org/abs/2509.04250 arxiv.or…

@tiotasram@kolektiva.social
2025-07-06 12:58:28

So to summarize this whole adventure:
1. A good 45 minutes was spent to get an answer that we probably could have gotten in 5 minutes in the 2010's, or in maybe 1-2 hours in the 1990's.
2. The time investment wasn't a total waste as we learned a lot along the way that we wouldn't have in the 2010's. Most relevant is the wide range of variation (e.g. a 2x factor depending on fiber intake!).
3. Most of the search engine results were confidently wrong answers that had no relation to reality. We were lucky to get one that had real citations we could start from (but that same article included the bogus 4.91 kcal/gram number). Next time I want to know a random factoid I might just start on Google scholar.
4. At least one page we chased citations through had a note at the top about being frozen due to NIH funding issues. The digital commons is under attack on multiple fronts.
All of this is yet another reason not to support the big LLM companies.
#AI

@frankstohl@mastodon.social
2025-07-07 18:13:51

Apple’s newest AI study unlocks street navigation for blind users #apple #ai 9to5ma…

@arXiv_csNI_bot@mastoxiv.page
2025-08-04 09:36:00

Quality-of-Service Aware LLM Routing for Edge Computing with Multiple Experts
Jin Yang, Qiong Wu, Zhiying Feng, Zhi Zhou, Deke Guo, Xu Chen
arxiv.org/abs/2508.00234

@theodric@social.linux.pizza
2025-07-06 20:08:01

I am once again trying to fine-tune an LLM on 17 years worth of my internet poasts

fucking
shit
fuck

@arXiv_csSE_bot@mastoxiv.page
2025-08-06 09:16:10

Automated Validation of LLM-based Evaluators for Software Engineering Artifacts
Ora Nova Fandina, Eitan Farchi, Shmulik Froimovich, Rami Katan, Alice Podolsky, Orna Raz, Avi Ziv
arxiv.org/abs/2508.02827

@arXiv_csIR_bot@mastoxiv.page
2025-08-06 09:51:20

LLMDistill4Ads: Using Cross-Encoders to Distill from LLM Signals for Advertiser Keyphrase Recommendations at eBay
Soumik Dey, Benjamin Braun, Naveen Ravipati, Hansi Wu, Binbin Li
arxiv.org/abs/2508.03628

@arXiv_csCL_bot@mastoxiv.page
2025-08-06 10:01:40

LECTOR: LLM-Enhanced Concept-based Test-Oriented Repetition for Adaptive Spaced Learning
Jiahao Zhao
arxiv.org/abs/2508.03275 arxiv.org/pdf…

@mariyadelano@hachyderm.io
2025-08-05 17:26:35

AI agents = advanced malware that most of society decided is for some reason totally okay and chill and worth funding if it’s made by one of 3-4 tech giants
#AI #tech #LLM

@arXiv_csAI_bot@mastoxiv.page
2025-08-06 10:18:00

Automated Algorithmic Discovery for Gravitational-Wave Detection Guided by LLM-Informed Evolutionary Monte Carlo Tree Search
He Wang, Liang Zeng
arxiv.org/abs/2508.03661

@heiseonline@social.heise.de
2025-08-04 10:03:00

Risk management and resilience in IT security: IT-Sicherheitstag Dortmund
On September 16, the program of the one-day conference at FH Dortmund offers compelling content for research and industry: from hacking to LLM attacks.

@adulau@infosec.exchange
2025-09-05 18:44:22

I keep seeing pull requests clearly generated by LLMs, but what’s really awkward is their inability to create separate branches and PRs for each fix, even after asking the contributor multiple times.
Can we conclude that git is still out of reach for LLMs to really understand?
#git #llm

@arXiv_csLG_bot@mastoxiv.page
2025-09-04 10:33:01

On Entropy Control in LLM-RL Algorithms
Han Shen
arxiv.org/abs/2509.03493 arxiv.org/pdf/2509.03493

@arXiv_csCR_bot@mastoxiv.page
2025-08-06 09:05:10

From Legacy to Standard: LLM-Assisted Transformation of Cybersecurity Playbooks into CACAO Format
Mehdi Akbari Gurabi, Lasse Nitz, Radu-Mihai Castravet, Roman Matzutt, Avikarsha Mandal, Stefan Decker
arxiv.org/abs/2508.03342

@mgorny@pol.social
2025-07-05 18:36:35

If anyone is praising #Claude #LLM, let me mention:
ClaudeBot made 20 thousand requests to bugs.gentoo.org today. Of those, 15 thousand fetched robots.txt over and over. Truly high-quality code.
#AI

@portaloffreedom@social.linux.pizza
2025-08-06 21:20:53

Yeah, one of the two guys, a smart engineer, told me the LLM answered him why the Moon always shows the same face to Earth, claiming the Moon's shape was the cause of it being locked in place.
A quick check on Wikipedia seems to confirm that was just wrong: the Moon is tidally locked, its rotation having been slowed until its rotation period matched its orbital period:

@arXiv_csSD_bot@mastoxiv.page
2025-09-05 09:03:01

Denoising GER: A Noise-Robust Generative Error Correction with LLM for Speech Recognition
Yanyan Liu, Minqiang Xu, Yihao Chen, Liang He, Lei Fang, Sian Fang, Lin Liu
arxiv.org/abs/2509.04392

@arXiv_csCL_bot@mastoxiv.page
2025-08-06 10:02:50

Investigating Gender Bias in LLM-Generated Stories via Psychological Stereotypes
Shahed Masoudian, Gustavo Escobedo, Hannah Strauss, Markus Schedl
arxiv.org/abs/2508.03292

@arXiv_csAI_bot@mastoxiv.page
2025-09-05 09:17:31

Are LLM Agents Behaviorally Coherent? Latent Profiles for Social Simulation
James Mooney, Josef Woldense, Zheng Robert Jia, Shirley Anugrah Hayati, My Ha Nguyen, Vipul Raheja, Dongyeop Kang
arxiv.org/abs/2509.03736

@arXiv_csSE_bot@mastoxiv.page
2025-08-06 09:40:00

Industrial LLM-based Code Optimization under Regulation: A Mixture-of-Agents Approach
Mari Ashiga, Vardan Voskanyan, Fateme Dinmohammadi, Jingzhi Gong, Paul Brookes, Matthew Truscott, Rafail Giavrimis, Mike Basios, Leslie Kanthan, Wei Jie
arxiv.org/abs/2508.03329

@mgorny@social.treehouse.systems
2025-07-05 18:35:18

To whomever praises #Claude #LLM:
ClaudeBot has made 20k requests to bugs.gentoo.org today. 15k of them were repeatedly fetching robots.txt. That surely is a sign of great code quality.
#AI

@arXiv_csCL_bot@mastoxiv.page
2025-08-06 10:22:00

Beyond the Surface: Enhancing LLM-as-a-Judge Alignment with Human via Internal Representations
Peng Lai, Jianjie Zheng, Sijie Cheng, Yun Chen, Peng Li, Yang Liu, Guanhua Chen
arxiv.org/abs/2508.03550

@arXiv_csCR_bot@mastoxiv.page
2025-07-04 09:54:51

Control at Stake: Evaluating the Security Landscape of LLM-Driven Email Agents
Jiangrong Wu, Yuhong Nan, Jianliang Wu, Zitong Yao, Zibin Zheng
arxiv.org/abs/2507.02699

@mariyadelano@hachyderm.io
2025-08-05 17:28:50

I really can’t think of AI agents as anything other than malware with articles like these:
#AI #tech #LLM

@arXiv_csIR_bot@mastoxiv.page
2025-08-06 08:07:30

LLM-based IR-system for Bank Supervisors
Ilias Aarab
arxiv.org/abs/2508.02945 arxiv.org/pdf/2508.02945

@arXiv_csAI_bot@mastoxiv.page
2025-08-06 07:32:40

Large Language Model-based Data Science Agent: A Survey
Peiran Wang, Yaoning Yu, Ke Chen, Xianyang Zhan, Haohan Wang
arxiv.org/abs/2508.02744

@arXiv_csSE_bot@mastoxiv.page
2025-08-06 10:09:00

BitsAI-Fix: LLM-Driven Approach for Automated Lint Error Resolution in Practice
Yuanpeng Li, Qi Long, Zhiyuan Yao, Jian Xu, Lintao Xie, Xu He, Lu Geng, Xin Han, Yueyan Chen, Wenbo Duan
arxiv.org/abs/2508.03487

@arXiv_csCL_bot@mastoxiv.page
2025-08-06 10:22:20

Tackling Distribution Shift in LLM via KILO: Knowledge-Instructed Learning for Continual Adaptation
Iing Muttakhiroh, Thomas Fevens
arxiv.org/abs/2508.03571

@arXiv_csSE_bot@mastoxiv.page
2025-08-06 09:37:40

GUI-ReRank: Enhancing GUI Retrieval with Multi-Modal LLM-based Reranking
Kristian Kolthoff, Felix Kretzer, Christian Bartelt, Alexander Maedche, Simone Paolo Ponzetto
arxiv.org/abs/2508.03298

@arXiv_csCL_bot@mastoxiv.page
2025-08-06 10:24:20

More Than a Score: Probing the Impact of Prompt Specificity on LLM Code Generation
Yangtian Zi, Harshitha Menon, Arjun Guha
arxiv.org/abs/2508.03678

@arXiv_csIR_bot@mastoxiv.page
2025-09-05 08:16:21

Efficient Item ID Generation for Large-Scale LLM-based Recommendation
Anushya Subbiah, Vikram Aggarwal, James Pine, Steffen Rendle, Krishna Sayana, Kun Su
arxiv.org/abs/2509.03746

@arXiv_csAI_bot@mastoxiv.page
2025-09-05 10:11:51

EvoEmo: Towards Evolved Emotional Policies for LLM Agents in Multi-Turn Negotiation
Yunbo Long, Liming Xu, Lukas Beckenbauer, Yuhan Liu, Alexandra Brintrup
arxiv.org/abs/2509.04310

@arXiv_csCL_bot@mastoxiv.page
2025-08-07 10:28:44

GeRe: Towards Efficient Anti-Forgetting in Continual Learning of LLM via General Samples Replay
Yunan Zhang, Shuoran Jiang, Mengchen Zhao, Yuefeng Li, Yang Fan, Xiangping Wu, Qingcai Chen
arxiv.org/abs/2508.04676

@arXiv_csSE_bot@mastoxiv.page
2025-08-05 09:37:30

Tuning LLM-based Code Optimization via Meta-Prompting: An Industrial Perspective
Jingzhi Gong, Rafail Giavrimis, Paul Brookes, Vardan Voskanyan, Fan Wu, Mari Ashiga, Matthew Truscott, Mike Basios, Leslie Kanthan, Jie Xu, Zheng Wang
arxiv.org/abs/2508.01443

@arXiv_csAI_bot@mastoxiv.page
2025-09-05 09:56:01

Meta-Policy Reflexion: Reusable Reflective Memory and Rule Admissibility for Resource-Efficient LLM Agent
Chunlong Wu, Zhibo Qu
arxiv.org/abs/2509.03990

@arXiv_csCL_bot@mastoxiv.page
2025-08-07 10:28:54

FaST: Feature-aware Sampling and Tuning for Personalized Preference Alignment with Limited Data
Thibaut Thonet, Germán Kruszewski, Jos Rozen, Pierre Erbacher, Marc Dymetman
arxiv.org/abs/2508.04698

@arXiv_csIR_bot@mastoxiv.page
2025-09-05 08:18:11

LLM-based Relevance Assessment for Web-Scale Search Evaluation at Pinterest
Han Wang, Alex Whitworth, Pak Ming Cheung, Zhenjie Zhang, Krishna Kamath
arxiv.org/abs/2509.03764

@arXiv_csSE_bot@mastoxiv.page
2025-08-04 09:34:50

Is LLM-Generated Code More Maintainable \& Reliable than Human-Written Code?
Alfred Santa Molison, Marcia Moraes, Glaucia Melo, Fabio Santos, Wesley K. G. Assuncao
arxiv.org/abs/2508.00700

@arXiv_csAI_bot@mastoxiv.page
2025-09-05 09:48:21

FaMA: LLM-Empowered Agentic Assistant for Consumer-to-Consumer Marketplace
Yineng Yan, Xidong Wang, Jin Seng Cheng, Ran Hu, Wentao Guan, Nahid Farahmand, Hengte Lin, Yue Li
arxiv.org/abs/2509.03890

@arXiv_csAI_bot@mastoxiv.page
2025-09-05 09:57:01

AutoPBO: LLM-powered Optimization for Local Search PBO Solvers
Jinyuan Li, Yi Chu, Yiwen Sun, Mengchuan Zou, Shaowei Cai
arxiv.org/abs/2509.04007

@arXiv_csAI_bot@mastoxiv.page
2025-09-05 09:21:51

Leveraging LLM-Based Agents for Intelligent Supply Chain Planning
Yongzhi Qi, Jiaheng Yin, Jianshen Zhang, Dongyang Geng, Zhengyu Chen, Hao Hu, Wei Qi, Zuo-Jun Max Shen
arxiv.org/abs/2509.03811

@arXiv_csCL_bot@mastoxiv.page
2025-09-05 09:43:41

Breaking the Mirror: Activation-Based Mitigation of Self-Preference in LLM Evaluators
Dani Roytburg, Matthew Bozoukov, Matthew Nguyen, Jou Barzdukas, Simon Fu, Narmeen Oozeer
arxiv.org/abs/2509.03647

@arXiv_csAI_bot@mastoxiv.page
2025-09-05 09:37:41

What Would an LLM Do? Evaluating Policymaking Capabilities of Large Language Models
Pierre Le Coz, Jia An Liu, Debarun Bhattacharjya, Georgina Curto, Serge Stinckwich
arxiv.org/abs/2509.03827

@arXiv_csCL_bot@mastoxiv.page
2025-09-05 10:22:31

Measuring Bias or Measuring the Task: Understanding the Brittle Nature of LLM Gender Biases
Bufan Gao, Elisa Kreiss
arxiv.org/abs/2509.04373

@arXiv_csAI_bot@mastoxiv.page
2025-07-04 09:36:11

Scaling LLM Planning: NL2FLOW for Parametric Problem Generation and Rigorous Evaluation
Jungkoo Kang
arxiv.org/abs/2507.02253

@arXiv_csAI_bot@mastoxiv.page
2025-09-05 10:18:41

ArcMemo: Abstract Reasoning Composition with Lifelong LLM Memory
Matthew Ho, Chen Si, Zhaoxiang Feng, Fangxu Yu, Zhijian Liu, Zhiting Hu, Lianhui Qin
arxiv.org/abs/2509.04439

@arXiv_csAI_bot@mastoxiv.page
2025-09-05 07:33:50

Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents
Davide Paglieri, Bartłomiej Cupiał, Jonathan Cook, Ulyana Piterbarg, Jens Tuyls, Edward Grefenstette, Jakob Nicolaus Foerster, Jack Parker-Holder, Tim Rocktäschel
arxiv.org/abs/2509.03581

@arXiv_csAI_bot@mastoxiv.page
2025-07-04 09:45:11

OMS: On-the-fly, Multi-Objective, Self-Reflective Ad Keyword Generation via LLM Agent
Bowen Chen, Zhao Wang, Shingo Takamatsu
arxiv.org/abs/2507.02353

@arXiv_csAI_bot@mastoxiv.page
2025-08-06 07:30:20

Efficient Agents: Building Effective Agents While Reducing Cost
Ningning Wang, Xavier Hu, Pai Liu, He Zhu, Yue Hou, Heyuan Huang, Shengyu Zhang, Jian Yang, Jiaheng Liu, Ge Zhang, Changwang Zhang, Jun Wang, Yuchen Eleanor Jiang, Wangchunshu Zhou
arxiv.org/abs/2508.02694

@arXiv_csAI_bot@mastoxiv.page
2025-07-04 09:16:01

Do Role-Playing Agents Practice What They Preach? Belief-Behavior Consistency in LLM-Based Simulations of Human Trust
Amogh Mannekote, Adam Davies, Guohao Li, Kristy Elizabeth Boyer, ChengXiang Zhai, Bonnie J Dorr, Francesco Pinto
arxiv.org/abs/2507.02197