
2025-09-05 10:10:31
Exploring NLP Benchmarks in an Extremely Low-Resource Setting
Ulin Nuha, Adam Jatowt
https://arxiv.org/abs/2509.03962 https://arxiv.org/pdf/2509.03962
A Survey on Data Security in Large Language Models
Kang Chen, Xiuze Zhou, Yuanguo Lin, Jinhe Su, Yuanhui Yu, Li Shen, Fan Lin
https://arxiv.org/abs/2508.02312 https://
GHTM: A Graph based Hybrid Topic Modeling Approach in Low-Resource Bengali Language
Farhana Haque, Md. Abdur Rahman, Sumon Ahmed
https://arxiv.org/abs/2508.00605 https://…
VRAgent-R1: Boosting Video Recommendation with MLLM-based Agents via Reinforcement Learning
Siran Chen, Boyu Chen, Chenyun Yu, Yuxiao Luo, Ouyang Yi, Lei Cheng, Chengxiang Zhuo, Zang Li, Yali Wang
https://arxiv.org/abs/2507.02626
Theories of "Sexuality" in Natural Language Processing Bias Research
Jacob Hobbs
https://arxiv.org/abs/2506.22481 https://a…
When Attention is Beneficial for Learning Wireless Resource Allocation Efficiently?
Jia Guo, Chenyang Yang
https://arxiv.org/abs/2507.02427 https://…
PicoAudio2: Temporal Controllable Text-to-Audio Generation with Natural Language Description
Zihao Zheng, Zeyu Xie, Xuenan Xu, Wen Wu, Chao Zhang, Mengyue Wu
https://arxiv.org/abs/2509.00683
Quantum-Enhanced Natural Language Generation: A Multi-Model Framework with Hybrid Quantum-Classical Architectures
Chi-Sheng Chen, En-Jui Kuo
https://arxiv.org/abs/2508.21332 htt…
Introducing a New Brexit-Related Uncertainty Index: Its Evolution and Economic Consequences
Ismet Gocer, Julia Darby, Serdar Ongan
https://arxiv.org/abs/2507.02439
A Survey: Towards Privacy and Security in Mobile Large Language Models
Honghui Xu, Kaiyang Li, Wei Chen, Danyang Zheng, Zhiyuan Li, Zhipeng Cai
https://arxiv.org/abs/2509.02411 …
SelfAug: Mitigating Catastrophic Forgetting in Retrieval-Augmented Generation via Distribution Self-Alignment
Yuqing Huang, Rongyang Zhang, Qimeng Wang, Chengqiang Lu, Yan Gao, Yi Wu, Yao Hu, Xuyang Zhi, Guiquan Liu, Xin Li, Hao Wang, Enhong Chen
https://arxiv.org/abs/2509.03934
Towards On-Device Personalization: Cloud-device Collaborative Data Augmentation for Efficient On-device Language Model
Zhaofeng Zhong, Wei Yuan, Liang Qu, Tong Chen, Hao Wang, Xiangyu Zhao, Hongzhi Yin
https://arxiv.org/abs/2508.21313
From Sentences to Sequences: Rethinking Languages in Biological System
Ke Liu, Shuanke Shen, Hao Chen
https://arxiv.org/abs/2507.00953 https://
Integrated photonic neuromorphic computing: device, architecture, chip, algorithm
Shuiying Xiang, Chengyang Yu, Yizhi Wang, Xintao Zeng, Yuna Zhang, Dianzhuang Zheng, Xinran Niu, Haowen Zhao, Hanxu Zhou, Yanan Han, Xingxing Guo, Yahui Zhang, Yue Hao
https://arxiv.org/abs/2509.01262
VEDA: Efficient LLM Generation Through Voting-based KV Cache Eviction and Dataflow-flexible Accelerator
Zhican Wang, Hongxiang Fan, Haroon Waris, Gang Wang, Zhenyu Li, Jianfei Jiang, Yanan Sun, Guanghui He
https://arxiv.org/abs/2507.00797
chDzDT: Word-level morphology-aware language model for Algerian social media text
Abdelkrime Aries
https://arxiv.org/abs/2509.01772 https://arxiv.org/pdf/2…
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Yang Wang, Chenghao Xiao, Chia-Yi Hsiao, Zi Yan Chang, Chi-Li Chen, Tyler Loakman, Chenghua Lin
https://arxiv.org/abs/2509.03867
Seeing Through Green: Text-Based Classification and the Firm's Returns from Green Patents
Lapo Santarlasci, Armando Rungi, Antonio Zinilli
https://arxiv.org/abs/2507.02287
Bookmarked: Talking About Muslims in Middle French: The Potential of Word-to-Vector Models for Studying Semantic Relationships in Medieval Languages – DH Lab #Digital_Humanities
Arabic Chatbot Technologies in Education: An Overview
Hicham Bourhil, Yacine El Younoussi
https://arxiv.org/abs/2509.04066 https://arxiv.org/pdf/2509.04066…
LMPVC and Policy Bank: Adaptive voice control for industrial robots with code generating LLMs and reusable Pythonic policies
Ossi Parikka, Roel Pieters
https://arxiv.org/abs/2506.22028
Turning Tabular Foundation Models into Graph Foundation Models
Dmitry Eremeev, Gleb Bazhenov, Oleg Platonov, Artem Babenko, Liudmila Prokhorenkova
https://arxiv.org/abs/2508.20906
"AI Can Help Limit the Spread of Misinformation During Natural Disaster, Study Finds"
#AI #ArtificialIntelligence
GLiDRE: Generalist Lightweight model for Document-level Relation Extraction
Robin Armingaud, Romaric Besançon
https://arxiv.org/abs/2508.00757 https://…
Querying GI Endoscopy Images: A VQA Approach
Gaurav Parajuli
https://arxiv.org/abs/2507.21165 https://arxiv.org/pdf/2507.21165
What Makes a Level Hard in Super Mario Maker 2?
Carlo A. Furia, Andrea Mocci
https://arxiv.org/abs/2507.21078 https://arxiv.org/pdf/2507.21078
Named Entity Recognition of Historical Text via Large Language Model
Shibingfeng Zhang, Giovanni Colavizza
https://arxiv.org/abs/2508.18090 https://arxiv.o…
Breaking Barriers in Software Testing: The Power of AI-Driven Automation
Saba Naqvi, Mohammad Baqar
https://arxiv.org/abs/2508.16025 https://arxiv.org/pdf/…
The Carbon Cost of Conversation, Sustainability in the Age of Language Models
Sayed Mahbub Hasan Amiri, Prasun Goswami, Md. Mainul Islam, Mohammad Shakhawat Hossen, Sayed Majhab Hasan Amiri, Naznin Akter
https://arxiv.org/abs/2507.20018
Revisiting Active Learning under (Human) Label Variation
Cornelia Gruber, Helen Alber, Bernd Bischl, Göran Kauermann, Barbara Plank, Matthias Aßenmacher
https://arxiv.org/abs/2507.02593
Characterizing Communication Patterns in Distributed Large Language Model Inference
Lang Xu, Kaushik Kandadi Suresh, Quentin Anthony, Nawras Alnaasan, Dhabaleswar K. Panda
https://arxiv.org/abs/2507.14392
I randomly bought this book in a quirky bookshop in Copenhagen for the sole reason that it said all the wrong things right on the cover.
(Sales: the single most important profession. NLP™: not natural language processing but neuro-linguistic programming. Meta: the Meta Model™ and Meta Publications™.)
I just started reading it and boy oh boy, I was not disappointed. It's outrageously hilarious.
"Persuasion engineering".
Towards Fair Rankings: Leveraging LLMs for Gender Bias Detection and Measurement
Maryam Mousavian, Zahra Abbasiantaeb, Mohammad Aliannejadi, Fabio Crestani
https://arxiv.org/abs/2506.22372
Natural language processing for African languages
David Ifeoluwa Adelani
https://arxiv.org/abs/2507.00297 https://arxiv.org/pdf/2507.…
ElliottAgents: A Natural Language-Driven Multi-Agent System for Stock Market Analysis and Prediction
Jarosław A. Chudziak, Michał Wawer
https://arxiv.org/abs/2507.03435
H2EAL: Hybrid-Bonding Architecture with Hybrid Sparse Attention for Efficient Long-Context LLM Inference
Zizhuo Fu, Xiaotian Guo, Wenxuan Zeng, Shuzhang Zhong, Yadong Zhang, Peiyu Chen, Runsheng Wang, Le Ye, Meng Li
https://arxiv.org/abs/2508.16653
An advanced AI driven database system
M. Tedeschi, S. Rizwan, C. Shringi, V. Devram Chandgir, S. Belich
https://arxiv.org/abs/2507.17778 https://arxiv.org/…
Integrating Quantized LLMs into Robotics Systems as Edge AI to Leverage their Natural Language Processing Capabilities
Miguel Á. González-Santamarta, Francisco J. Rodríguez-Lera, David Sobrín-Hidalgo, Ángel Manuel Guerrero-Higueras, Vicente Matellán-Olivera
https://arxiv.org/abs/2506.09581
NLP Meets the World: Toward Improving Conversations With the Public About Natural Language Processing Research
Shomir Wilson
https://arxiv.org/abs/2507.10559
On the Effectiveness of LLM-as-a-judge for Code Generation and Summarization
Giuseppe Crupi, Rosalia Tufano, Alejandro Velasco, Antonio Mastropaolo, Denys Poshyvanyk, Gabriele Bavota
https://arxiv.org/abs/2507.16587
A survey of diversity quantification in natural language processing: The why, what, where and how
Louis Estève, Marie-Catherine de Marneffe, Nurit Melnik, Agata Savary, Olha Kanishcheva
https://arxiv.org/abs/2507.20858
DistrAttention: An Efficient and Flexible Self-Attention Mechanism on Modern GPUs
Haolin Jin, Mengbai Xiao, Yuan Yuan, Xiao Zhang, Dongxiao Yu, Guanghui Zhang, Haoliang Wang
https://arxiv.org/abs/2507.17245
CompLex: Music Theory Lexicon Constructed by Autonomous Agents for Automatic Music Generation
Zhejing Hu, Yan Liu, Gong Chen, Bruce X. B. Yu
https://arxiv.org/abs/2508.19603 htt…
Special-Character Adversarial Attacks on Open-Source Language Model
Ephraiem Sarabamoun
https://arxiv.org/abs/2508.14070 https://arxiv.org/pdf/2508.14070…
SonoCraftAR: Towards Supporting Personalized Authoring of Sound-Reactive AR Interfaces by Deaf and Hard of Hearing Users
Jaewook Lee, Davin Win Kyi, Leejun Kim, Jenny Peng, Gagyeom Lim, Jeremy Zhengqi Huang, Dhruv Jain, Jon E. Froehlich
https://arxiv.org/abs/2508.17597
Resource-Efficient Adaptation of Large Language Models for Text Embeddings via Prompt Engineering and Contrastive Fine-tuning
Benedikt Roth, Stephan Rappensperger, Tianming Qiu, Hamza Imamović, Julian Wörmann, Hao Shen
https://arxiv.org/abs/2507.22729
GDLLM: A Global Distance-aware Modeling Approach Based on Large Language Models for Event Temporal Relation Extraction
Jie Zhao, Wanting Ning, Yuxiao Fei, Yubo Feng, Lishuang Li
https://arxiv.org/abs/2508.20828
On the synchronization between Hugging Face pre-trained language models and their upstream GitHub repository
Ajibode Adekunle, Abdul Ali Bangash, Bram Adams, Ahmed E. Hassan
https://arxiv.org/abs/2508.10157
eSapiens: A Real-World NLP Framework for Multimodal Document Understanding and Enterprise Knowledge Processing
Isaac Shi, Zeyuan Li, Wenli Wang, Lewei He, Yang Yang, Tianyu Shi
https://arxiv.org/abs/2506.16768
Adversarial Defence without Adversarial Defence: Enhancing Language Model Robustness via Instance-level Principal Component Removal
Yang Wang, Chenghao Xiao, Yizhi Li, Stuart E. Middleton, Noura Al Moubayed, Chenghua Lin
https://arxiv.org/abs/2507.21750
AI-Powered Legal Intelligence System Architecture: A Comprehensive Framework for Automated Legal Consultation and Analysis
Sean Kalaycioglu, Bob Liu, Colin Hong, Haipeng Xie
https://arxiv.org/abs/2508.17499
Evaluating Large Language Models for Phishing Detection, Self-Consistency, Faithfulness, and Explainability
Shova Kuikel, Aritran Piplai, Palvi Aggarwal
https://arxiv.org/abs/2506.13746
Leveraging Open-Source Large Language Models for Clinical Information Extraction in Resource-Constrained Settings
Luc Builtjes, Joeran Bosma, Mathias Prokop, Bram van Ginneken, Alessa Hering
https://arxiv.org/abs/2507.20859
Using LLMs and Essence to Support Software Practice Adoption
Sonia Nicoletti, Paolo Ciancarini
https://arxiv.org/abs/2508.16445 https://arxiv.org/pdf/2508.…
ITDR: An Instruction Tuning Dataset for Enhancing Large Language Models in Recommendations
Zekun Liu, Xiaowen Huang, Jitao Sang
https://arxiv.org/abs/2508.05667 https://
Lethe: Purifying Backdoored Large Language Models with Knowledge Dilution
Chen Chen, Yuchen Sun, Jiaxin Gao, Xueluan Gong, Qian Wang, Ziyao Wang, Yongsen Zheng, Kwok-Yan Lam
https://arxiv.org/abs/2508.21004
Prediction of mortality and resource utilization in critical care: a deep learning approach using multimodal electronic health records with natural language processing techniques
Yucheng Ruan, Xiang Lan, Daniel J. Tan, Hairil Rizal Abdullah, Mengling Feng
https://arxiv.org/abs/2508.20460
On The Role of Pretrained Language Models in General-Purpose Text Embeddings: A Survey
Meishan Zhang, Xin Zhang, Xinping Zhao, Shouzheng Huang, Baotian Hu, Min Zhang
https://arxiv.org/abs/2507.20783
DINA: A Dual Defense Framework Against Internal Noise and External Attacks in Natural Language Processing
Ko-Wei Chuang, Hen-Hsen Huang, Tsai-Yen Li
https://arxiv.org/abs/2508.05671
Evaluating Scoring Bias in LLM-as-a-Judge
Qingquan Li, Shaoyu Dou, Kailai Shao, Chao Chen, Haixiang Hu
https://arxiv.org/abs/2506.22316 https://
Enhanced Arabic Text Retrieval with Attentive Relevance Scoring
Salah Eddine Bekhouche, Azeddine Benlamoudi, Yazid Bounab, Fadi Dornaika, Abdenour Hadid
https://arxiv.org/abs/2507.23404
Neither Valid nor Reliable? Investigating the Use of LLMs as Judges
Khaoula Chehbouni, Mohammed Haddou, Jackie Chi Kit Cheung, Golnoosh Farnadi
https://arxiv.org/abs/2508.18076 …
Strategic Sample Selection for Improved Clean-Label Backdoor Attacks in Text Classification
Onur Alp Kirci, M. Emre Gursoy
https://arxiv.org/abs/2508.15934 https://
Enhancing Large Language Models through Structured Reasoning
Yubo Dong, Hehe Fan
https://arxiv.org/abs/2506.20241 https://arxiv.org/p…
From BERT to LLMs: Comparing and Understanding Chinese Classifier Prediction in Language Models
Ziqi Zhang, Jianfei Ma, Emmanuele Chersoni, Jieshun You, Zhaoxin Feng
https://arxiv.org/abs/2508.18253
Language Surgery in Multilingual Large Language Models
Joanito Agili Lopo, Muhammad Ravi Shulthan Habibi, Tack Hwa Wong, Muhammad Ilham Ghozali, Fajri Koto, Genta Indra Winata, Peerat Limkonchotiwat, Alham Fikri Aji, Samuel Cahyawijaya
https://arxiv.org/abs/2506.12450
MizanQA: Benchmarking Large Language Models on Moroccan Legal Question Answering
Adil Bahaj, Mounir Ghogho
https://arxiv.org/abs/2508.16357 https://arxiv.o…
An Agile Method for Implementing Retrieval Augmented Generation Tools in Industrial SMEs
Mathieu Bourdin, Anas Neumann, Thomas Paviot, Robert Pellerin, Samir Lamouri
https://arxiv.org/abs/2508.21024
Finance Language Model Evaluation (FLaME)
Glenn Matlin, Mika Okamoto, Huzaifa Pardawala, Yang Yang, Sudheer Chava
https://arxiv.org/abs/2506.15846 https://…
Reservoir Computing as a Language Model
Felix Köster, Atsushi Uchida
https://arxiv.org/abs/2507.15779 https://arxiv.org/pdf/25…
Scalable and consistent few-shot classification of survey responses using text embeddings
Jonas Timmann Mjaaland, Markus Fleten Kreutzer, Halvor Tyseng, Rebeckah K. Fussell, Gina Passante, N. G. Holmes, Anders Malthe-Sørenssen, Tor Ole B. Odden
https://arxiv.org/abs/2508.19836
SDGO: Self-Discrimination-Guided Optimization for Consistent Safety in Large Language Models
Peng Ding, Wen Sun, Dailin Li, Wei Zou, Jiaming Wang, Jiajun Chen, Shujian Huang
https://arxiv.org/abs/2508.15648
AraTable: Benchmarking LLMs' Reasoning and Understanding of Arabic Tabular Data
Rana Alshaikh, Israa Alghanmi, Shelan Jeawak
https://arxiv.org/abs/2507.18442 https://…
Controllable Conversational Theme Detection Track at DSTC 12
Igor Shalyminov, Hang Su, Jake Vincent, Siffi Singh, Jason Cai, James Gung, Raphael Shu, Saab Mansour
https://arxiv.org/abs/2508.18783
Adoption of Explainable Natural Language Processing: Perspectives from Industry and Academia on Practices and Challenges
Mahdi Dhaini, Tobias Müller, Roksoliana Rabets, Gjergji Kasneci
https://arxiv.org/abs/2508.09786
Re-Representation in Sentential Relation Extraction with Sequence Routing Algorithm
Ramazan Ali Bahrami, Ramin Yahyapour
https://arxiv.org/abs/2508.21049 https://
When Explainability Meets Privacy: An Investigation at the Intersection of Post-hoc Explainability and Differential Privacy in the Context of Natural Language Processing
Mahdi Dhaini, Stephen Meisenbacher, Ege Erdogan, Florian Matthes, Gjergji Kasneci
https://arxiv.org/abs/2508.10482
Leveraging Semantic Triples for Private Document Generation with Local Differential Privacy Guarantees
Stephen Meisenbacher, Maulik Chevli, Florian Matthes
https://arxiv.org/abs/2508.20736
Exploiting Primacy Effect To Improve Large Language Models
Bianca Raimondi, Maurizio Gabbrielli
https://arxiv.org/abs/2507.13949 https://
Assessing and Mitigating Data Memorization Risks in Fine-Tuned Large Language Models
Badrinath Ramakrishnan, Akshaya Balaji
https://arxiv.org/abs/2508.14062 https://
Advancing Mental Disorder Detection: A Comparative Evaluation of Transformer and LSTM Architectures on Social Media
Khalid Hasan, Jamil Saquer, Mukulika Ghosh
https://arxiv.org/abs/2507.19511
X-Troll: eXplainable Detection of State-Sponsored Information Operations Agents
Lin Tian, Xiuzhen Zhang, Maria Myung-Hee Kim, Jennifer Biggs, Marian-Andrei Rizoiu
https://arxiv.org/abs/2508.16021
Perspectives in Play: A Multi-Perspective Approach for More Inclusive NLP Systems
Benedetta Muscato, Lucia Passaro, Gizem Gezici, Fosca Giannotti
https://arxiv.org/abs/2506.20209 …
GRILE: A Benchmark for Grammar Reasoning and Explanation in Romanian LLMs
Adrian-Marius Dumitran, Alexandra-Mihaela Danila, Angela-Liliana Dumitran
https://arxiv.org/abs/2508.14279
An Evaluation of Large Language Models on Text Summarization Tasks Using Prompt Engineering Techniques
Walid Mohamed Aly, Taysir Hassan A. Soliman, Amr Mohamed AbdelAziz
https://arxiv.org/abs/2507.05123
Replaced article(s) found for cs.CL. https://arxiv.org/list/cs.CL/new
[1/3]:
- Modeling the Sacred: Considerations when Using Religious Texts in Natural Language Processing
Ben Hutchinson
Toward a Better Localization of Princeton WordNet
Abed Alhakim Freihat
https://arxiv.org/abs/2508.18134 https://arxiv.org/pdf/2508.18134
Affective Polarization across European Parliaments
Bojan Evkoski, Igor Mozetič, Nikola Ljubešić, Petra Kralj Novak
https://arxiv.org/abs/2508.18916 https://…
Verified Language Processing with Hybrid Explainability: A Technical Report
Oliver Robert Fox, Giacomo Bergami, Graham Morgan
https://arxiv.org/abs/2507.05017
Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management
Luis Gasco, Hermenegildo Fabregat, Laura García-Sardiña, Paula Estrella, Daniel Deniz, Alvaro Rodrigo, Rabih Zbib
https://arxiv.org/abs/2507.13275
RoMedQA: The First Benchmark for Romanian Medical Question Answering
Ana-Cristina Rogoz, Radu Tudor Ionescu, Alexandra-Valentina Anghel, Ionut-Lucian Antone-Iordache, Simona Coniac, Andreea Iuliana Ionescu
https://arxiv.org/abs/2508.16390
Tokens with Meaning: A Hybrid Tokenization Approach for NLP
M. Ali Bayram, Ali Arda Fincan, Ahmet Semih Gümüş, Sercan Karakaş, Banu Diri, Savaş Yıldırım, Demircan Çelik
https://arxiv.org/abs/2508.14292
Comparing energy consumption and accuracy in text classification inference
Johannes Zschache, Tilman Hartwig
https://arxiv.org/abs/2508.14170 https://arxiv…
Improving Drug Identification in Overdose Death Surveillance using Large Language Models
Arthur J. Funnell, Panayiotis Petousis, Fabrice Harel-Canada, Ruby Romero, Alex A. T. Bui, Adam Koncsol, Hritika Chaturvedi, Chelsea Shover, David Goodman-Meza
https://arxiv.org/abs/2507.12679
T-REX: Table -- Refute or Entail eXplainer
Tim Luka Horstmann, Baptiste Geisenberger, Mehwish Alam
https://arxiv.org/abs/2508.14055 https://arxiv.org/pdf/2…
Filling the Gap for Uzbek: Creating Translation Resources for Southern Uzbek
Mukhammadsaid Mamasaidov, Azizullah Aral, Abror Shopulatov, Mironshoh Inomjonov
https://arxiv.org/abs/2508.14586
Evolutionary Feature-wise Thresholding for Binary Representation of NLP Embeddings
Soumen Sinha, Shahryar Rahnamayan, Azam Asilian Bidgoli
https://arxiv.org/abs/2507.17025
Extracting Structured Requirements from Unstructured Building Technical Specifications for Building Information Modeling
Insaf Nahri, Romain Pinquié, Philippe Véron, Nicolas Bus, Mathieu Thorel
https://arxiv.org/abs/2508.13833
Doğal Dil İşlemede Tokenizasyon Standartları ve Ölçümü: Türkçe Üzerinden Büyük Dil Modellerinin Karşılaştırmalı Analizi
M. Ali Bayram, Ali Arda Fincan, Ahmet Semih Gümüş, Sercan Karakaş, Banu Diri, Savaş Yıldırım
https://