
2025-06-13 07:43:10
StepProof: Step-by-step verification of natural language mathematical proofs
Xiaolin Hu, Qinghua Zhou, Bogdan Grechuk, Ivan Y. Tyukin
https://arxiv.org/abs/2506.10558
A Navigation Framework Utilizing Vision-Language Models
Yicheng Duan, Kaiyu Tang
https://arxiv.org/abs/2506.10172 https://arxiv.org/p…
EdgeProfiler: A Fast Profiling Framework for Lightweight LLMs on Edge Using Analytical Model
Alyssa Pinnock, Shakya Jayakody, Kawsher A Roxy, Md Rubel Ahmed
https://arxiv.org/abs/2506.09061
ReferSplat: Referring Segmentation in 3D Gaussian Splatting
Shuting He, Guangquan Jie, Changshuo Wang, Yun Zhou, Shuming Hu, Guanbin Li, Henghui Ding
https://arxiv.org/abs/2508.08252
ITDR: An Instruction Tuning Dataset for Enhancing Large Language Models in Recommendations
Zekun Liu, Xiaowen Huang, Jitao Sang
https://arxiv.org/abs/2508.05667 https://
LangNavBench: Evaluation of Natural Language Understanding in Semantic Navigation
Sonia Raychaudhuri, Enrico Cancelli, Tommaso Campari, Lamberto Ballan, Manolis Savva, Angel X. Chang
https://arxiv.org/abs/2507.07299
Using Language and Road Manuals to Inform Map Reconstruction for Autonomous Driving
Akshar Tumu, Henrik I. Christensen, Marcell Vazquez-Chanlatte, Chikao Tsuchiya, Dhaval Bhanderi
https://arxiv.org/abs/2506.10317
ElliottAgents: A Natural Language-Driven Multi-Agent System for Stock Market Analysis and Prediction
Jarosław A. Chudziak, Michał Wawer
https://arxiv.org/abs/2507.03435
AI Conversational Tutors in Foreign Language Learning: A Mixed-Methods Evaluation Study
Nikolaos Avouris
https://arxiv.org/abs/2508.05156 https://arxiv.org…
Position: Intelligent Coding Systems Should Write Programs with Justifications
Xiangzhe Xu, Shiwei Feng, Zian Su, Chengpeng Wang, Xiangyu Zhang
https://arxiv.org/abs/2508.06017 …
Speak2Sign3D: A Multi-modal Pipeline for English Speech to American Sign Language Animation
Kazi Mahathir Rahman, Naveed Imtiaz Nafis, Md. Farhan Sadik, Mohammad Al Rafi, Mehedi Hasan Shahed
https://arxiv.org/abs/2507.06530
Understanding protein function with a multimodal retrieval-augmented foundation model
Timothy Fei Truong Jr, Tristan Bepler
https://arxiv.org/abs/2508.04724 https://
Unveiling Privacy Policy Complexity: An Exploratory Study Using Graph Mining, Machine Learning, and Natural Language Processing
Vijayalakshmi Ramasamy, Seth Barrett, Gokila Dorai, Jessica Zumbach
https://arxiv.org/abs/2507.02968
Health Insurance Coverage Rule Interpretation Corpus: Law, Policy, and Medical Guidance for Health Insurance Coverage Understanding
Mike Gartner
https://arxiv.org/abs/2508.03718
This https://arxiv.org/abs/2505.19433 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…
How Do LLMs Persuade? Linear Probes Can Uncover Persuasion Dynamics in Multi-Turn Conversations
Brandon Jaipersaud, David Krueger, Ekdeep Singh Lubana
https://arxiv.org/abs/2508.05625
This https://arxiv.org/abs/2505.07453 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…
False Alarms, Real Damage: Adversarial Attacks Using LLM-based Models on Text-based Cyber Threat Intelligence Systems
Samaneh Shafee, Alysson Bessani, Pedro M. Ferreira
https://arxiv.org/abs/2507.06252
StreamLink: Large-Language-Model Driven Distributed Data Engineering System
Dawei Feng, Di Mei, Huiri Tan, Lei Ren, Xianying Lou, Zhangxi Tan
https://arxiv.org/abs/2505.21575
SpecCLIP: Aligning and Translating Spectroscopic Measurements for Stars
Xiaosheng Zhao, Yang Huang, Guirong Xue, Xiao Kong, Jifeng Liu, Xiaoyu Tang, Timothy C. Beers, Yuan-Sen Ting, A-Li Luo
https://arxiv.org/abs/2507.01939
Understanding User Preferences for Interaction Styles in Conversational Recommender Systems: The Predictive Role of System Qualities, User Experience, and Traits
Raj Mahmud, Shlomo Berkovsky, Mukesh Prasad, A. Baki Kocaballi
https://arxiv.org/abs/2508.02328
VRAgent-R1: Boosting Video Recommendation with MLLM-based Agents via Reinforcement Learning
Siran Chen, Boyu Chen, Chenyun Yu, Yuxiao Luo, Ouyang Yi, Lei Cheng, Chengxiang Zhuo, Zang Li, Yali Wang
https://arxiv.org/abs/2507.02626
skLEP: A Slovak General Language Understanding Benchmark
Marek Šuppa, Andrej Ridzik, Daniel Hládek, Tomáš Javůrek, Viktória Ondrejová, Kristína Sásiková, Martin Tamajka, Marián Šimko
https://arxiv.org/abs/2506.21508
Challenges in Grounding Language in the Real World
Peter Lindes, Kaoutar Skiker
https://arxiv.org/abs/2506.17375 https://arxiv.org/pd…
State and Memory is All You Need for Robust and Reliable AI Agents
Matthew Muhoberac, Atharva Parikh, Nirvi Vakharia, Saniya Virani, Aco Radujevic, Savannah Wood, Meghav Verma, Dimitri Metaxotos, Jeyaraman Soundararajan, Thierry Masquelin, Alexander G. Godfrey, Sean Gardner, Dobrila Rudnicki, Sam Michael, Gaurav Chopra
https://a…
This https://arxiv.org/abs/2411.03079 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csSE_…
Querying GI Endoscopy Images: A VQA Approach
Gaurav Parajuli
https://arxiv.org/abs/2507.21165 https://arxiv.org/pdf/2507.21165
Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation
Jiaer Xia, Bingkui Tong, Yuhang Zang, Rui Shao, Kaiyang Zhou
https://arxiv.org/abs/2507.02859
This https://arxiv.org/abs/2504.15448 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_eco…
DeepRTL2: A Versatile Model for RTL-Related Tasks
Yi Liu, Hongji Zhang, Yunhao Zhou, Zhengyuan Shi, Changran Xu, Qiang Xu
https://arxiv.org/abs/2506.15697 …
AraTable: Benchmarking LLMs' Reasoning and Understanding of Arabic Tabular Data
Rana Alshaikh, Israa Alghanmi, Shelan Jeawak
https://arxiv.org/abs/2507.18442 https://…
Datrics Text2SQL: A Framework for Natural Language to SQL Query Generation
Tetiana Gladkykh, Kyrylo Kirykov
https://arxiv.org/abs/2506.12234 https://
SV-LLM: An Agentic Approach for SoC Security Verification using Large Language Models
Dipayan Saha, Shams Tarek, Hasan Al Shaikh, Khan Thamid Hasan, Pavan Sai Nalluri, Md. Ajoad Hasan, Nashmin Alam, Jingbo Zhou, Sujan Kumar Saha, Mark Tehranipoor, Farimah Farahmandi
https://arxiv.org/abs/2506.20415…
Understanding Software Engineering Agents: A Study of Thought-Action-Result Trajectories
Islem Bouzenia, Michael Pradel
https://arxiv.org/abs/2506.18824 ht…
Capturing Visualization Design Rationale
Maeve Hutchinson, Radu Jianu, Aidan Slingsby, Jo Wood, Pranava Madhyastha
https://arxiv.org/abs/2506.16571 https:/…
eSapiens: A Real-World NLP Framework for Multimodal Document Understanding and Enterprise Knowledge Processing
Isaac Shi, Zeyuan Li, Wenli Wang, Lewei He, Yang Yang, Tianyu Shi
https://arxiv.org/abs/2506.16768
Can User Feedback Help Issue Detection? An Empirical Study on a One-billion-user Online Service System
Shuyao Jiang, Jiazhen Gu, Wujie Zheng, Yangfan Zhou, Michael R. Lyu
https://arxiv.org/abs/2508.00593
SeisCoDE: 3D Seismic Interpretation Foundation Model with Contrastive Self-Distillation Learning
Goodluck Archibong, Ardiansyah Koeshidayatullah, Umair Waheed, Weichang Li, Dicky Harishidayat, Motaz Alfarraj
https://arxiv.org/abs/2505.20518
LayLens: Improving Deepfake Understanding through Simplified Explanations
Abhijeet Narang, Parul Gupta, Liuyijia Su, Abhinav Dhall
https://arxiv.org/abs/2507.10066
Evaluating Uncertainty and Quality of Visual Language Action-enabled Robots
Pablo Valle, Chengjie Lu, Shaukat Ali, Aitor Arrieta
https://arxiv.org/abs/2507.17049
Vision Language Action Models in Robotic Manipulation: A Systematic Review
Muhayy Ud Din, Waseem Akram, Lyes Saad Saoud, Jan Rosell, Irfan Hussain
https://arxiv.org/abs/2507.10672
Identifying economic narratives in large text corpora -- An integrated approach using Large Language Models
Tobias Schmidt, Kai-Robin Lange, Matthias Reccius, Henrik Müller, Michael Roos, Carsten Jentsch
https://arxiv.org/abs/2506.15041
CodeDiffuser: Attention-Enhanced Diffusion Policy via VLM-Generated Code for Instruction Ambiguity
Guang Yin, Yitong Li, Yixuan Wang, Dale McConachie, Paarth Shah, Kunimatsu Hashimoto, Huan Zhang, Katherine Liu, Yunzhu Li
https://arxiv.org/abs/2506.16652
Beyond Single Models: Enhancing LLM Detection of Ambiguity in Requests through Debate
Ana Davila, Jacinto Colan, Yasuhisa Hasegawa
https://arxiv.org/abs/2507.12370
Rethinking LLM Training through Information Geometry and Quantum Metrics
Riccardo Di Sipio
https://arxiv.org/abs/2506.15830 https://a…
ExpliCIT-QA: Explainable Code-Based Image Table Question Answering
Maximiliano Hormazábal Lagos, Álvaro Bueno Sáez, Pedro Alonso Doval, Jorge Alcalde Vesteiro, Héctor Cerezo-Costas
https://arxiv.org/abs/2507.11694
NL in the Middle: Code Translation with LLMs and Intermediate Representations
Chi-en Amy Tai, Pengyu Nie, Lukasz Golab, Alexander Wong
https://arxiv.org/abs/2507.08627
MERA Code: A Unified Framework for Evaluating Code Generation Across Tasks
Artem Chervyakov, Alexander Kharitonov, Pavel Zadorozhny, Adamenko Pavel, Rodion Levichev, Dmitrii Vorobev, Dmitrii Salikhov, Aidar Valeev, Alena Pestova, Maria Dziuba, Ilseyar Alimova, Artem Zavgorodnev, Aleksandr Medvedev, Stanislav Moiseev, Elena Bruches, Daniil Grebenkin, Roman Derunets, Vikulov Vladimir, Anton Emelyanov, Dmitrii Babaev, Vladimir V. Ivanov, Valentin Malykh, Alena Fenogenova