Tootfinder

Opt-in global Mastodon full-text search.

@arXiv_csAI_bot@mastoxiv.page
2025-07-25 07:52:32

Does visualization help AI understand data?
Victoria R. Li, Johnathan Sun, Martin Wattenberg
arxiv.org/abs/2507.18022 arxiv.org/pdf/2507.18…

@arXiv_csCL_bot@mastoxiv.page
2025-08-22 12:38:52

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[3/3]:
- CRISPR-GPT for Agentic Automation of Gene-editing Experiments
Qu, Huang, Yin, Zhan, Liu, Yin, Cousins, Johnson, Wang, Shah, Altman, Zhou, Wang, Cong

In its December 2023 lawsuit against OpenAI, The New York Times produced dozens of examples where GPT-4 exactly reproduced significant passages from Times stories.
In its response, OpenAI described this as a “fringe behavior” and a “problem that researchers at OpenAI and elsewhere work hard to address.”
But is it actually a fringe behavior?
And have leading AI companies addressed it? 
New research—focusing on books rather than newspaper articles and on different compa…

@arXiv_csSE_bot@mastoxiv.page
2025-06-23 09:16:00

Evaluating the Use of LLMs for Documentation to Code Traceability
Ebube Alor, SayedHassan Khatoonabadi, Emad Shihab
arxiv.org/abs/2506.16440

@arXiv_csCR_bot@mastoxiv.page
2025-07-22 07:53:50

Mitigating Trojanized Prompt Chains in Educational LLM Use Cases: Experimental Findings and Detection Tool Design
Richard M. Charles, James H. Curry, Richard B. Charles
arxiv.org/abs/2507.14207

@arXiv_csCV_bot@mastoxiv.page
2025-08-15 10:25:22

Performance of GPT-5 in Brain Tumor MRI Reasoning
Mojtaba Safari, Shansong Wang, Mingzhe Hu, Zach Eidex, Qiang Li, Xiaofeng Yang
arxiv.org/abs/2508.10865

@Techmeme@techhub.social
2025-08-01 18:25:51

Source: GPT-5 improvements won't be comparable to the leaps in performance of earlier models, such as between GPT-3 in 2020 and GPT-4 in 2023 (The Information)
theinformation.com/articles/in

@ErikJonker@mastodon.social
2025-06-07 08:07:20

Interesting, "GPT-style models have a fixed memorization capacity of approximately 3.6 bits per parameter."
venturebeat.com/ai/how-much-in

@arXiv_csSE_bot@mastoxiv.page
2025-08-21 09:32:00

Assessing the Quality and Security of AI-Generated Code: A Quantitative Analysis
Abbas Sabra, Olivier Schmitt, Joseph Tyler
arxiv.org/abs/2508.14727

@arXiv_csCY_bot@mastoxiv.page
2025-07-16 07:41:31

Can Large Language Models Understand As Well As Apply Patent Regulations to Pass a Hands-On Patent Attorney Test?
Bhakti Khera, Rezvan Alamian, Pascal A. Scherz, Stephan M. Goetz
arxiv.org/abs/2507.10576

@jdrm@social.linux.pizza
2025-08-06 09:04:05

We used to laugh that Reagan consulted a psychic about policy decisions during his presidency. Well, in Sweden they are now on version 3.0 of consulting an oracle: theguardian.com/technology/202

@arXiv_csCL_bot@mastoxiv.page
2025-08-21 08:31:50

Assessing and Mitigating Data Memorization Risks in Fine-Tuned Large Language Models
Badrinath Ramakrishnan, Akshaya Balaji
arxiv.org/abs/2508.14062

@usul@piaille.fr
2025-06-11 11:31:32

Focus and Context and LLMs | Taras' Blog on AI, Perf, Hacks
#AI

@arXiv_csCV_bot@mastoxiv.page
2025-07-03 10:32:10

How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks
Rahul Ramachandran, Ali Garjani, Roman Bachmann, Andrei Atanov, Oğuzhan Fatih Kar, Amir Zamir
arxiv.org/abs/2507.01955

@arXiv_csHC_bot@mastoxiv.page
2025-08-01 09:22:31

Exploring LLM-generated Culture-specific Affective Human-Robot Tactile Interaction
Qiaoqiao Ren, Tony Belpaeme
arxiv.org/abs/2507.22905 arx…

@jonippolito@digipres.club
2025-07-02 12:40:23

I built a free tool to help students compare the energy/water use of AI tasks—like a 3-sec video gen or 500-word GPT reply—to everyday ones like Netflix, Google, or cloud storage. Try it at what-uses-more.com
Adjust variables like prompt complexity or the energy source and climate of local …

Decorative graphic with title "What Uses More" and a chart showing the different energy footprints of two tasks

@arXiv_csSE_bot@mastoxiv.page
2025-06-18 08:59:19

Quality Assessment of Python Tests Generated by Large Language Models
Victor Alves, Carla Bezerra, Ivan Machado, Larissa Rocha, Tássio Virgínio, Publio Silva
arxiv.org/abs/2506.14297

@ErikJonker@mastodon.social
2025-08-09 18:02:14

GPT-5 may be slightly disappointing, Genie 3 demo blew me away... Watch it.
#ai

@arXiv_csAI_bot@mastoxiv.page
2025-08-06 09:49:50

Can Large Language Models Bridge the Gap in Environmental Knowledge?
Linda Smail (College of Interdisciplinary Studies, Zayed University, UAE), David Santandreu Calonge (Department of Academic Development, Mohamed bin Zayed University of Artificial Intelligence, UAE), Firuz Kamalov (School of Engineering, Applied Science, Technology, Canadian University Dubai, UAE), Nur H. Orak (Department of Environmental Engineering, Marmara University, Türkiye)

@arXiv_csPL_bot@mastoxiv.page
2025-08-07 12:56:16

Replaced article(s) found for cs.PL. arxiv.org/list/cs.PL/new
[1/1]:
- RTLCoder: Outperforming GPT-3.5 in Design RTL Generation with Our Open-Source Dataset and Lightwe...
Shang Liu, Wenji Fang, Yao Lu, Qijun Zhang, Hongce Zhang, Zhiyao Xie

@arXiv_csCY_bot@mastoxiv.page
2025-06-09 07:25:02

Can LLMs Talk 'Sex'? Exploring How AI Models Handle Intimate Conversations
Huiqian Lai
arxiv.org/abs/2506.05514

@arXiv_csIR_bot@mastoxiv.page
2025-06-10 07:52:42

FinBERT2: A Specialized Bidirectional Encoder for Bridging the Gap in Finance-Specific Deployment of Large Language Models
Xuan Xu, Fufang Wen, Beilin Chu, Zhibing Fu, Qinhong Lin, Jiaqi Liu, Binjie Fei, Zhongliang Yang, Linna Zhou, Yu Li
arxiv.org/abs/2506.06335

@arXiv_csCR_bot@mastoxiv.page
2025-06-03 17:52:02

This arxiv.org/abs/2505.18889 has been replaced.
initial toot: mastoxiv.page/@arXiv_csCR_…

@arXiv_physicsedph_bot@mastoxiv.page
2025-08-13 08:59:32

The Boiling-Frog Problem of Physics Education
Gerd Kortemeyer
arxiv.org/abs/2508.08842 arxiv.org/pdf/2508.08842

@arXiv_csCY_bot@mastoxiv.page
2025-06-03 07:20:41

Evaluating Prompt Engineering Techniques for Accuracy and Confidence Elicitation in Medical LLMs
Nariman Naderi, Zahra Atf, Peter R Lewis, Aref Mahjoub far, Seyed Amir Ahmad Safavi-Naini, Ali Soroush
arxiv.org/abs/2506.00072

@arXiv_csSE_bot@mastoxiv.page
2025-07-14 08:37:21

Leveraging Large Language Models for Classifying App Users' Feedback
Yasaman Abedini, Abbas Heydarnoori
arxiv.org/abs/2507.08250

@arXiv_csHC_bot@mastoxiv.page
2025-08-08 08:43:02

Charts-of-Thought: Enhancing LLM Visualization Literacy Through Structured Data Extraction
Amit Kumar Das, Mohammad Tarun, Klaus Mueller
arxiv.org/abs/2508.04842

@arXiv_csCL_bot@mastoxiv.page
2025-06-12 09:20:52

Large Language Models for Toxic Language Detection in Low-Resource Balkan Languages
Amel Muminovic, Amela Kadric Muminovic
arxiv.org/abs/2506.09992

@arXiv_csAI_bot@mastoxiv.page
2025-08-11 09:30:00

Retrieval Augmented Large Language Model System for Comprehensive Drug Contraindications
Byeonghun Bang, Jongsuk Yoon, Dong-Jin Chang, Seho Park, Yong Oh Lee
arxiv.org/abs/2508.06145

@arXiv_csSE_bot@mastoxiv.page
2025-06-13 08:08:42

Augmenting Large Language Models with Static Code Analysis for Automated Code Quality Improvements
Seyed Moein Abtahi, Akramul Azim
arxiv.org/abs/2506.10330

@arXiv_csCY_bot@mastoxiv.page
2025-06-05 07:16:45

Facts are Harder Than Opinions -- A Multilingual, Comparative Analysis of LLM-Based Fact-Checking Reliability
Lorraine Saju, Arnim Bleier, Jana Lasser, Claudia Wagner
arxiv.org/abs/2506.03655

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:21:03

Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs
Chenjun Xu, Bingbing Wen, Bin Han, Robert Wolfe, Lucy Lu Wang, Bill Howe
arxiv.org/abs/2506.00582

@arXiv_csCY_bot@mastoxiv.page
2025-07-29 10:11:51

The Carbon Cost of Conversation, Sustainability in the Age of Language Models
Sayed Mahbub Hasan Amiri, Prasun Goswami, Md. Mainul Islam, Mohammad Shakhawat Hossen, Sayed Majhab Hasan Amiri, Naznin Akter
arxiv.org/abs/2507.20018

@arXiv_csSE_bot@mastoxiv.page
2025-06-10 10:11:13

Evaluating LLMs Effectiveness in Detecting and Correcting Test Smells: An Empirical Study
E. G. Santana Jr, Jander Pereira Santos Junior, Erlon P. Almeida, Iftekhar Ahmed, Paulo Anselmo da Mota Silveira Neto, Eduardo Santana de Almeida
arxiv.org/abs/2506.07594

@arXiv_csCL_bot@mastoxiv.page
2025-07-28 13:02:38

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[1/3]:
- Comparison of pipeline, sequence-to-sequence, and GPT models for end-to-end relation extraction: ...
Shashank Gupta, Xuguang Ai, Ramakanth Kavuluru

@arXiv_csCY_bot@mastoxiv.page
2025-08-07 08:33:34

Prompt Injection Vulnerability of Consensus Generating Applications in Digital Democracy
Jairo Gudiño-Rosero, Clément Contet, Umberto Grandi, César A. Hidalgo
arxiv.org/abs/2508.04281