
2025-07-25 10:06:32
AraTable: Benchmarking LLMs' Reasoning and Understanding of Arabic Tabular Data
Rana Alshaikh, Israa Alghanmi, Shelan Jeawak
https://arxiv.org/abs/2507.18442 https://…
Understanding Software Engineering Agents: A Study of Thought-Action-Result Trajectories
Islem Bouzenia, Michael Pradel
https://arxiv.org/abs/2506.18824 ht…
SV-LLM: An Agentic Approach for SoC Security Verification using Large Language Models
Dipayan Saha, Shams Tarek, Hasan Al Shaikh, Khan Thamid Hasan, Pavan Sai Nalluri, Md. Ajoad Hasan, Nashmin Alam, Jingbo Zhou, Sujan Kumar Saha, Mark Tehranipoor, Farimah Farahmandi
https://arxiv.org/abs/2506.20415…
Challenges in Grounding Language in the Real World
Peter Lindes, Kaoutar Skiker
https://arxiv.org/abs/2506.17375 https://arxiv.org/pd…
Capturing Visualization Design Rationale
Maeve Hutchinson, Radu Jianu, Aidan Slingsby, Jo Wood, Pranava Madhyastha
https://arxiv.org/abs/2506.16571 https:/…
Evaluating Uncertainty and Quality of Visual Language Action-enabled Robots
Pablo Valle, Chengjie Lu, Shaukat Ali, Aitor Arrieta
https://arxiv.org/abs/2507.17049
Automated Optimization Modeling through Expert-Guided Large Language Model Reasoning
Beinuo Yang, Qishen Zhou, Junyi Li, Xingchen Su, Simon Hu
https://arxiv.org/abs/2508.14410 h…
CodeDiffuser: Attention-Enhanced Diffusion Policy via VLM-Generated Code for Instruction Ambiguity
Guang Yin, Yitong Li, Yixuan Wang, Dale McConachie, Paarth Shah, Kunimatsu Hashimoto, Huan Zhang, Katherine Liu, Yunzhu Li
https://arxiv.org/abs/2506.16652
eSapiens: A Real-World NLP Framework for Multimodal Document Understanding and Enterprise Knowledge Processing
Isaac Shi, Zeyuan Li, Wenli Wang, Lewei He, Yang Yang, Tianyu Shi
https://arxiv.org/abs/2506.16768
StepProof: Step-by-step verification of natural language mathematical proofs
Xiaolin Hu, Qinghua Zhou, Bogdan Grechuk, Ivan Y. Tyukin
https://arxiv.org/abs/2506.10558
DeepRTL2: A Versatile Model for RTL-Related Tasks
Yi Liu, Hongji Zhang, Yunhao Zhou, Zhengyuan Shi, Changran Xu, Qiang Xu
https://arxiv.org/abs/2506.15697 …
Datrics Text2SQL: A Framework for Natural Language to SQL Query Generation
Tetiana Gladkykh, Kyrylo Kirykov
https://arxiv.org/abs/2506.12234 https://
TComQA: Extracting Temporal Commonsense from Text
Lekshmi R Nair, Arun Sankar, Koninika Pal
https://arxiv.org/abs/2508.15274 https://arxiv.org/pdf/2508.152…
GTool: Graph Enhanced Tool Planning with Large Language Model
Wenjie Chen, Wenbin Li, Di Yao, Xuying Meng, Chang Gong, Jingping Bi
https://arxiv.org/abs/2508.12725 https://
Speak2Sign3D: A Multi-modal Pipeline for English Speech to American Sign Language Animation
Kazi Mahathir Rahman, Naveed Imtiaz Nafis, Md. Farhan Sadik, Mohammad Al Rafi, Mehedi Hasan Shahed
https://arxiv.org/abs/2507.06530
ElliottAgents: A Natural Language-Driven Multi-Agent System for Stock Market Analysis and Prediction
Jarosław A. Chudziak, Michał Wawer
https://arxiv.org/abs/2507.03435
LayLens: Improving Deepfake Understanding through Simplified Explanations
Abhijeet Narang, Parul Gupta, Liuyijia Su, Abhinav Dhall
https://arxiv.org/abs/2507.10066
LangNavBench: Evaluation of Natural Language Understanding in Semantic Navigation
Sonia Raychaudhuri, Enrico Cancelli, Tommaso Campari, Lamberto Ballan, Manolis Savva, Angel X. Chang
https://arxiv.org/abs/2507.07299
Rethinking LLM Training through Information Geometry and Quantum Metrics
Riccardo Di Sipio
https://arxiv.org/abs/2506.15830 https://a…
Identifying economic narratives in large text corpora -- An integrated approach using Large Language Models
Tobias Schmidt, Kai-Robin Lange, Matthias Reccius, Henrik Müller, Michael Roos, Carsten Jentsch
https://arxiv.org/abs/2506.15041
This https://arxiv.org/abs/2505.19433 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…
Health Insurance Coverage Rule Interpretation Corpus: Law, Policy, and Medical Guidance for Health Insurance Coverage Understanding
Mike Gartner
https://arxiv.org/abs/2508.03718
Understanding protein function with a multimodal retrieval-augmented foundation model
Timothy Fei Truong Jr, Tristan Bepler
https://arxiv.org/abs/2508.04724 https://
EdgeProfiler: A Fast Profiling Framework for Lightweight LLMs on Edge Using Analytical Model
Alyssa Pinnock, Shakya Jayakody, Kawsher A Roxy, Md Rubel Ahmed
https://arxiv.org/abs/2506.09061
CS-Agent: LLM-based Community Search via Dual-agent Collaboration
Jiahao Hua, Long Yuan, Qingshuai Feng, Qiang Fang, Shan Huang
https://arxiv.org/abs/2508.09549 https://
ITDR: An Instruction Tuning Dataset for Enhancing Large Language Models in Recommendations
Zekun Liu, Xiaowen Huang, Jitao Sang
https://arxiv.org/abs/2508.05667 https://
MobQA: A Benchmark Dataset for Semantic Understanding of Human Mobility Data through Question Answering
Hikaru Asano, Hiroki Ouchi, Akira Kasuga, Ryo Yonetani
https://arxiv.org/abs/2508.11163
AI Conversational Tutors in Foreign Language Learning: A Mixed-Methods Evaluation Study
Nikolaos Avouris
https://arxiv.org/abs/2508.05156 https://arxiv.org…
Vision Language Action Models in Robotic Manipulation: A Systematic Review
Muhayy Ud Din, Waseem Akram, Lyes Saad Saoud, Jan Rosell, Irfan Hussain
https://arxiv.org/abs/2507.10672
LOOP: A Plug-and-Play Neuro-Symbolic Framework for Enhancing Planning in Autonomous Systems
Ronit Virwani, Ruchika Suryawanshi
https://arxiv.org/abs/2508.13371 https://
When Explainability Meets Privacy: An Investigation at the Intersection of Post-hoc Explainability and Differential Privacy in the Context of Natural Language Processing
Mahdi Dhaini, Stephen Meisenbacher, Ege Erdogan, Florian Matthes, Gjergji Kasneci
https://arxiv.org/abs/2508.10482
SpecCLIP: Aligning and Translating Spectroscopic Measurements for Stars
Xiaosheng Zhao, Yang Huang, Guirong Xue, Xiao Kong, Jifeng Liu, Xiaoyu Tang, Timothy C. Beers, Yuan-Sen Ting, A-Li Luo
https://arxiv.org/abs/2507.01939
Unveiling Privacy Policy Complexity: An Exploratory Study Using Graph Mining, Machine Learning, and Natural Language Processing
Vijayalakshmi Ramasamy, Seth Barrett, Gokila Dorai, Jessica Zumbach
https://arxiv.org/abs/2507.02968
A Navigation Framework Utilizing Vision-Language Models
Yicheng Duan, Kaiyu Tang
https://arxiv.org/abs/2506.10172 https://arxiv.org/p…
ReferSplat: Referring Segmentation in 3D Gaussian Splatting
Shuting He, Guangquan Jie, Changshuo Wang, Yun Zhou, Shuming Hu, Guanbin Li, Henghui Ding
https://arxiv.org/abs/2508.08252
Language Models Can Understand Spectra: A Multimodal Model for Molecular Structure Elucidation
Yunyue Su, Jiahui Chen, Zao Jiang, Zhenyi Zhong, Liang Wang, Qiang Liu
https://arxiv.org/abs/2508.08441
MERA Code: A Unified Framework for Evaluating Code Generation Across Tasks
Artem Chervyakov, Alexander Kharitonov, Pavel Zadorozhny, Adamenko Pavel, Rodion Levichev, Dmitrii Vorobev, Dmitrii Salikhov, Aidar Valeev, Alena Pestova, Maria Dziuba, Ilseyar Alimova, Artem Zavgorodnev, Aleksandr Medvedev, Stanislav Moiseev, Elena Bruches, Daniil Grebenkin, Roman Derunets, Vikulov Vladimir, Anton Emelyanov, Dmitrii Babaev, Vladimir V. Ivanov, Valentin Malykh, Alena Fenogenova
MuDRiC: Multi-Dialect Reasoning for Arabic Commonsense Validation
Kareem Elozeiri, Mervat Abassy, Preslav Nakov, Yuxia Wang
https://arxiv.org/abs/2508.13130 https://
State and Memory is All You Need for Robust and Reliable AI Agents
Matthew Muhoberac, Atharva Parikh, Nirvi Vakharia, Saniya Virani, Aco Radujevic, Savannah Wood, Meghav Verma, Dimitri Metaxotos, Jeyaraman Soundararajan, Thierry Masquelin, Alexander G. Godfrey, Sean Gardner, Dobrila Rudnicki, Sam Michael, Gaurav Chopra
https://a…
Querying GI Endoscopy Images: A VQA Approach
Gaurav Parajuli
https://arxiv.org/abs/2507.21165 https://arxiv.org/pdf/2507.21165
Interpretable Robot Control via Structured Behavior Trees and Large Language Models
Ingrid Maéva Chekam, Ines Pastor-Martinez, Ali Tourani, Jose Andres Millan-Romera, Laura Ribeiro, Pedro Miguel Bastos Soares, Holger Voos, Jose Luis Sanchez-Lopez
https://arxiv.org/abs/2508.09621
StreamLink: Large-Language-Model Driven Distributed Data Engineering System
Dawei Feng, Di Mei, Huiri Tan, Lei Ren, Xianying Lou, Zhangxi Tan
https://arxiv.org/abs/2505.21575
NL in the Middle: Code Translation with LLMs and Intermediate Representations
Chi-en Amy Tai, Pengyu Nie, Lukasz Golab, Alexander Wong
https://arxiv.org/abs/2507.08627
Beyond Single Models: Enhancing LLM Detection of Ambiguity in Requests through Debate
Ana Davila, Jacinto Colan, Yasuhisa Hasegawa
https://arxiv.org/abs/2507.12370
Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation
Jiaer Xia, Bingkui Tong, Yuhang Zang, Rui Shao, Kaiyang Zhou
https://arxiv.org/abs/2507.02859
Using Language and Road Manuals to Inform Map Reconstruction for Autonomous Driving
Akshar Tumu, Henrik I. Christensen, Marcell Vazquez-Chanlatte, Chikao Tsuchiya, Dhaval Bhanderi
https://arxiv.org/abs/2506.10317
Understanding User Preferences for Interaction Styles in Conversational Recommender Systems: The Predictive Role of System Qualities, User Experience, and Traits
Raj Mahmud, Shlomo Berkovsky, Mukesh Prasad, A. Baki Kocaballi
https://arxiv.org/abs/2508.02328
False Alarms, Real Damage: Adversarial Attacks Using LLM-based Models on Text-based Cyber Threat Intelligence Systems
Samaneh Shafee, Alysson Bessani, Pedro M. Ferreira
https://arxiv.org/abs/2507.06252
VRAgent-R1: Boosting Video Recommendation with MLLM-based Agents via Reinforcement Learning
Siran Chen, Boyu Chen, Chenyun Yu, Yuxiao Luo, Ouyang Yi, Lei Cheng, Chengxiang Zhuo, Zang Li, Yali Wang
https://arxiv.org/abs/2507.02626
ExpliCIT-QA: Explainable Code-Based Image Table Question Answering
Maximiliano Hormazábal Lagos, Álvaro Bueno Sáez, Pedro Alonso Doval, Jorge Alcalde Vesteiro, Héctor Cerezo-Costas
https://arxiv.org/abs/2507.11694
This https://arxiv.org/abs/2505.07453 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…
Layer-Wise Perturbations via Sparse Autoencoders for Adversarial Text Generation
Huizhen Shu, Xuying Li, Qirui Wang, Yuji Kosuga, Mengqiu Tian, Zhuo Li
https://arxiv.org/abs/2508.10404
SeisCoDE: 3D Seismic Interpretation Foundation Model with Contrastive Self-Distillation Learning
Goodluck Archibong, Ardiansyah Koeshidayatullah, Umair Waheed, Weichang Li, Dicky Harishidayat, Motaz Alfarraj
https://arxiv.org/abs/2505.20518
Position: Intelligent Coding Systems Should Write Programs with Justifications
Xiangzhe Xu, Shiwei Feng, Zian Su, Chengpeng Wang, Xiangyu Zhang
https://arxiv.org/abs/2508.06017 …
This https://arxiv.org/abs/2504.15448 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_eco…
skLEP: A Slovak General Language Understanding Benchmark
Marek Šuppa, Andrej Ridzik, Daniel Hládek, Tomáš Javůrek, Viktória Ondrejová, Kristína Sásiková, Martin Tamajka, Marián Šimko
https://arxiv.org/abs/2506.21508
This https://arxiv.org/abs/2411.03079 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csSE_…
Can User Feedback Help Issue Detection? An Empirical Study on a One-billion-user Online Service System
Shuyao Jiang, Jiazhen Gu, Wujie Zheng, Yangfan Zhou, Michael R. Lyu
https://arxiv.org/abs/2508.00593
How Do LLMs Persuade? Linear Probes Can Uncover Persuasion Dynamics in Multi-Turn Conversations
Brandon Jaipersaud, David Krueger, Ekdeep Singh Lubana
https://arxiv.org/abs/2508.05625