
2025-07-25 07:52:32
Does visualization help AI understand data?
Victoria R. Li, Johnathan Sun, Martin Wattenberg
https://arxiv.org/abs/2507.18022 https://arxiv.org/pdf/2507.18…
Replaced article(s) found for cs.CL. https://arxiv.org/list/cs.CL/new
[3/3]:
- CRISPR-GPT for Agentic Automation of Gene-editing Experiments
Qu, Huang, Yin, Zhan, Liu, Yin, Cousins, Johnson, Wang, Shah, Altman, Zhou, Wang, Cong
In its December 2023 lawsuit against OpenAI, The New York Times produced dozens of examples where GPT-4 exactly reproduced significant passages from Times stories.
In its response, OpenAI described this as a “fringe behavior” and a “problem that researchers at OpenAI and elsewhere work hard to address.”
But is it actually a fringe behavior?
And have leading AI companies addressed it?
New research—focusing on books rather than newspaper articles and on different compa…
Evaluating the Use of LLMs for Documentation to Code Traceability
Ebube Alor, SayedHassan Khatoonabadi, Emad Shihab
https://arxiv.org/abs/2506.16440 https:…
Mitigating Trojanized Prompt Chains in Educational LLM Use Cases: Experimental Findings and Detection Tool Design
Richard M. Charles, James H. Curry, Richard B. Charles
https://arxiv.org/abs/2507.14207
Performance of GPT-5 in Brain Tumor MRI Reasoning
Mojtaba Safari, Shansong Wang, Mingzhe Hu, Zach Eidex, Qiang Li, Xiaofeng Yang
https://arxiv.org/abs/2508.10865 https://…
Source: GPT-5's improvements won't be comparable to the leaps in performance between earlier models, such as from GPT-3 in 2020 to GPT-4 in 2023 (The Information)
https://www.theinformation.com/articles/inside-openais-rocky-path-gpt-5
Interesting, "GPT-style models have a fixed memorization capacity of approximately 3.6 bits per parameter."
https://venturebeat.com/ai/how-much-information-do-llms-really-memorize-now-we-know-thanks-to-met…
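A quick back-of-envelope reading of that figure: at a fixed budget of roughly 3.6 bits per parameter, total memorization capacity scales linearly with model size. The parameter counts in the sketch below are round illustrative numbers, not claims about any particular model.

```python
# Back-of-envelope: total memorization capacity implied by ~3.6 bits per parameter.
# The 3.6 figure is the one quoted above; the parameter counts are illustrative
# round numbers, not measurements of any specific model.

BITS_PER_PARAM = 3.6

def memorization_capacity_bytes(num_params: float) -> float:
    """Total capacity in bytes implied by a fixed bits-per-parameter budget."""
    return num_params * BITS_PER_PARAM / 8

for label, n in [("1B params", 1e9), ("7B params", 7e9), ("70B params", 70e9)]:
    gb = memorization_capacity_bytes(n) / 1e9
    print(f"{label}: ~{gb:.2f} GB of raw memorized content")
```

So a 7B-parameter model would top out around 3 GB of memorized content under this estimate, which helps frame how much verbatim training data could plausibly be reproduced.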
Assessing the Quality and Security of AI-Generated Code: A Quantitative Analysis
Abbas Sabra, Olivier Schmitt, Joseph Tyler
https://arxiv.org/abs/2508.14727 https://
Can Large Language Models Understand As Well As Apply Patent Regulations to Pass a Hands-On Patent Attorney Test?
Bhakti Khera, Rezvan Alamian, Pascal A. Scherz, Stephan M. Goetz
https://arxiv.org/abs/2507.10576
We used to laugh at Reagan asking a psychic about policy decisions during his presidency. Well, Sweden is now on version 3.0 of consulting an oracle: https://www.theguardian.com/technology/2025/aug/05/chat-gpt-sw…
Assessing and Mitigating Data Memorization Risks in Fine-Tuned Large Language Models
Badrinath Ramakrishnan, Akshaya Balaji
https://arxiv.org/abs/2508.14062 https://
Focus and Context and LLMs | Taras' Blog on AI, Perf, Hacks
#AI
How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks
Rahul Ramachandran, Ali Garjani, Roman Bachmann, Andrei Atanov, Oğuzhan Fatih Kar, Amir Zamir
https://arxiv.org/abs/2507.01955
Exploring LLM-generated Culture-specific Affective Human-Robot Tactile Interaction
Qiaoqiao Ren, Tony Belpaeme
https://arxiv.org/abs/2507.22905 https://arx…
I built a free tool to help students compare the energy/water use of AI tasks—like a 3-sec video gen or 500-word GPT reply—to everyday ones like Netflix, Google, or cloud storage. Try it at https://what-uses-more.com
Adjust variables like prompt complexity or the energy source and climate of local …
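A minimal sketch of the kind of comparison such a tool can make: convert a task's estimated energy draw into carbon and water figures using a grid intensity and a data-center water factor. All numbers below are illustrative placeholders, not values from the tool or measured data.

```python
# Sketch of an energy/water comparison between an AI task and an everyday task.
# Every numeric input here is an assumed placeholder for illustration only.

def task_footprint(energy_wh: float, grid_gco2_per_kwh: float, wue_l_per_kwh: float):
    """Return (grams of CO2, liters of water) for one task, given local grid
    carbon intensity and data-center water-use effectiveness."""
    kwh = energy_wh / 1000
    return kwh * grid_gco2_per_kwh, kwh * wue_l_per_kwh

# Hypothetical tasks and factors: a grid at 400 gCO2/kWh, cooling at 1.8 L/kWh.
tasks = [
    ("500-word LLM reply (assumed 3 Wh)", 3.0),
    ("1 hour of video streaming (assumed 80 Wh)", 80.0),
]
for label, wh in tasks:
    co2, water = task_footprint(wh, grid_gco2_per_kwh=400, wue_l_per_kwh=1.8)
    print(f"{label}: ~{co2:.1f} g CO2, ~{water:.2f} L water")
```

Changing the grid intensity or water factor is how the "energy source and climate" variables would shift the comparison.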
Quality Assessment of Python Tests Generated by Large Language Models
Victor Alves, Carla Bezerra, Ivan Machado, Larissa Rocha, Tássio Virgínio, Publio Silva
https://arxiv.org/abs/2506.14297
GPT-5 may be slightly disappointing, but the Genie 3 demo blew me away. Watch it.
#ai
Can Large Language Models Bridge the Gap in Environmental Knowledge?
Linda Smail (College of Interdisciplinary Studies, Zayed University, UAE), David Santandreu Calonge (Department of Academic Development, Mohamed bin Zayed University of Artificial Intelligence, UAE), Firuz Kamalov (School of Engineering, Applied Science, Technology, Canadian University Dubai, UAE), Nur H. Orak (Department of Environmental Engineering, Marmara University, Türkiye)
Replaced article(s) found for cs.PL. https://arxiv.org/list/cs.PL/new
[1/1]:
- RTLCoder: Outperforming GPT-3.5 in Design RTL Generation with Our Open-Source Dataset and Lightwe...
Shang Liu, Wenji Fang, Yao Lu, Qijun Zhang, Hongce Zhang, Zhiyao Xie
Can LLMs Talk 'Sex'? Exploring How AI Models Handle Intimate Conversations
Huiqian Lai
https://arxiv.org/abs/2506.05514 https://
FinBERT2: A Specialized Bidirectional Encoder for Bridging the Gap in Finance-Specific Deployment of Large Language Models
Xuan Xu, Fufang Wen, Beilin Chu, Zhibing Fu, Qinhong Lin, Jiaqi Liu, Binjie Fei, Zhongliang Yang, Linna Zhou, Yu Li
https://arxiv.org/abs/2506.06335
This https://arxiv.org/abs/2505.18889 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCR_…
The Boiling-Frog Problem of Physics Education
Gerd Kortemeyer
https://arxiv.org/abs/2508.08842 https://arxiv.org/pdf/2508.08842
Evaluating Prompt Engineering Techniques for Accuracy and Confidence Elicitation in Medical LLMs
Nariman Naderi, Zahra Atf, Peter R Lewis, Aref Mahjoub far, Seyed Amir Ahmad Safavi-Naini, Ali Soroush
https://arxiv.org/abs/2506.00072
Leveraging Large Language Models for Classifying App Users' Feedback
Yasaman Abedini, Abbas Heydarnoori
https://arxiv.org/abs/2507.08250 https://
Charts-of-Thought: Enhancing LLM Visualization Literacy Through Structured Data Extraction
Amit Kumar Das, Mohammad Tarun, Klaus Mueller
https://arxiv.org/abs/2508.04842 https:/…
Large Language Models for Toxic Language Detection in Low-Resource Balkan Languages
Amel Muminovic, Amela Kadric Muminovic
https://arxiv.org/abs/2506.09992
Retrieval Augmented Large Language Model System for Comprehensive Drug Contraindications
Byeonghun Bang, Jongsuk Yoon, Dong-Jin Chang, Seho Park, Yong Oh Lee
https://arxiv.org/abs/2508.06145
Augmenting Large Language Models with Static Code Analysis for Automated Code Quality Improvements
Seyed Moein Abtahi, Akramul Azim
https://arxiv.org/abs/2506.10330
Facts are Harder Than Opinions -- A Multilingual, Comparative Analysis of LLM-Based Fact-Checking Reliability
Lorraine Saju, Arnim Bleier, Jana Lasser, Claudia Wagner
https://arxiv.org/abs/2506.03655
Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs
Chenjun Xu, Bingbing Wen, Bin Han, Robert Wolfe, Lucy Lu Wang, Bill Howe
https://arxiv.org/abs/2506.00582
The Carbon Cost of Conversation, Sustainability in the Age of Language Models
Sayed Mahbub Hasan Amiri, Prasun Goswami, Md. Mainul Islam, Mohammad Shakhawat Hossen, Sayed Majhab Hasan Amiri, Naznin Akter
https://arxiv.org/abs/2507.20018
Evaluating LLMs Effectiveness in Detecting and Correcting Test Smells: An Empirical Study
E. G. Santana Jr, Jander Pereira Santos Junior, Erlon P. Almeida, Iftekhar Ahmed, Paulo Anselmo da Mota Silveira Neto, Eduardo Santana de Almeida
https://arxiv.org/abs/2506.07594
Replaced article(s) found for cs.CL. https://arxiv.org/list/cs.CL/new
[1/3]:
- Comparison of pipeline, sequence-to-sequence, and GPT models for end-to-end relation extraction: ...
Shashank Gupta, Xuguang Ai, Ramakanth Kavuluru
Prompt Injection Vulnerability of Consensus Generating Applications in Digital Democracy
Jairo Gudiño-Rosero, Clément Contet, Umberto Grandi, César A. Hidalgo
https://arxiv.org/abs/2508.04281