2025-09-25 08:26:12
Poster: ChatIYP: Enabling Natural Language Access to the Internet Yellow Pages Database
Vasilis Andritsoudis, Pavlos Sermpezis, Ilias Dimitriadis, Athena Vakali
https://arxiv.org/abs/2509.19411
QAgent: A modular Search Agent with Interactive Query Understanding
Yi Jiang, Lei Shen, Lujie Niu, Sendong Zhao, Wenbo Su, Bo Zheng
https://arxiv.org/abs/2510.08383 https://
from my link log —
Pipelined Relational Query Language, PRQL: a simple, powerful, pipelined SQL replacement.
https://prql-lang.org/book/
saved 2025-11-05 https://
Query-Specific GNN: A Comprehensive Graph Representation Learning Method for Retrieval Augmented Generation
Yuchen Yan, Zhihua Liu, Hao Wang, Weiming Li, Xiaoshuai Hao
https://arxiv.org/abs/2510.11541 …
📈 #LogsQL query language provides fast full-text search, advanced analytics, and data extraction/transformation at query time. Can be combined with Unix tools like grep, less, sort, and jq for log analysis.
🎯 Optimized for high cardinality fields like trace_id, user_id, and ip addresses. Supports logs with hundreds of fields (wide events), multitenancy, out-of-order ingestion, live taili…
CardRewriter: Leveraging Knowledge Cards for Long-Tail Query Rewriting on Short-Video Platforms
Peiyuan Gong, Feiran Zhu, Yaqi Yin, Chenglei Dai, Chao Zhang, Kai Zheng, Wentian Bao, Jiaxin Mao, Yi Zhang
https://arxiv.org/abs/2510.10095
Imagine ChatGPT but instead of predicting text it just linked you to the top 3 documents most influential on the probabilities that would have been used to predict that text.
Could even generate some info about which parts of each would have been combined how.
There would still be issues with how training data is sourced and filtered, but these could be solved by crawling normally respecting robots.txt and by paying filterers a fair wage with a more relaxed work schedule and mental health support.
The energy issues are mainly about wild future investment and wasteful query spam, not optimized present-day per-query usage.
Is this "just search?"
Yes, but it would have some advantages for a lot of use cases, mainly in synthesizing results across multiple documents and in leveraging a language model more fully to find relevant stuff.
When we talk about the harms of current corporate LLMs, the opportunity cost of NOT building things like this is part of that.
The equivalent for art would have been so amazing too! "Here are some artists that can do what you want, with examples pulled from their portfolios."
It would be a really cool coding assistant that I'd actually encourage my students to use (with some guidelines).
#AI #GenAI #LLMs
Query-Centric Graph Retrieval Augmented Generation
Yaxiong Wu, Jianyuan Bo, Yongyue Zhang, Sheng Liang, Yong Liu
https://arxiv.org/abs/2509.21237 https://a…
Accelerating LLM Inference with Precomputed Query Storage
Jay H. Park, Youngju Cho, Choungsol Lee, Moonwook Oh, Euiseong Seo
https://arxiv.org/abs/2509.25919 https://
Implementing Semantic Join Operators Efficiently
Immanuel Trummer
https://arxiv.org/abs/2510.08489 https://arxiv.org/pdf/2510.08489
PrediQL: Automated Testing of GraphQL APIs with LLMs
Shaolun Liu, Sina Marefat, Omar Tsai, Yu Chen, Zecheng Deng, Jia Wang, Mohammad A. Tayebi
https://arxiv.org/abs/2510.10407 h…
Unlike keyword search,
semantic search lets you search using natural language.
It looks beyond exact matches to understand the meaning and intent behind your query.
This means it can surface relevant precedents even when they're phrased differently
—something keyword searches often miss.
Semantic search is currently available through an API, but we're already working to bring it to the website—stay tuned!
And don't worry, keyword search isn…
Agentic generative AI for media content discovery at the national football league
Henry Wang, Sirajus Salekin, Jake Lee, Ross Claytor, Shinan Zhang, Michael Chi
https://arxiv.org/abs/2510.07297
Extending ResourceLink: Patterns for Large Dataset Processing in MCP Applications
Scott Frees
https://arxiv.org/abs/2510.05968 https://arxiv.org/pdf/2510.0…
Polars is a lightning-fast DataFrame library/in-memory query engine with parallel execution and cache efficiency. And now you can use it with the tidyverse syntax: #rstats
Query-Kontext: An Unified Multimodal Model for Image Generation and Editing
Yuxin Song, Wenkai Dong, Shizun Wang, Qi Zhang, Song Xue, Tao Yuan, Hu Yang, Haocheng Feng, Hang Zhou, Xinyan Xiao, Jingdong Wang
https://arxiv.org/abs/2509.26641
A Simple but Effective Elaborative Query Reformulation Approach for Natural Language Recommendation
Qianfeng Wen, Yifan Liu, Justin Cui, Joshua Zhang, Anton Korikov, George-Kirollos Saad, Scott Sanner
https://arxiv.org/abs/2510.02656
RAVEN: Realtime Accessibility in Virtual ENvironments for Blind and Low-Vision People
Xinyun Cao, Kexin Phyllis Ju, Chenglin Li, Venkatesh Potluri, Dhruv Jain
https://arxiv.org/abs/2510.06573
SAGE-LD: Towards Scalable and Generalizable End-to-End Language Diarization via Simulated Data Augmentation
Sangmin Lee, Woongjib Choi, Jihyun Kim, Hong-Goo Kang
https://arxiv.org/abs/2510.00582
VL-KnG: Visual Scene Understanding for Navigation Goal Identification using Spatiotemporal Knowledge Graphs
Mohamad Al Mdfaa, Svetlana Lukina, Timur Akhtyamov, Arthur Nigmatzyanov, Dmitrii Nalberskii, Sergey Zagoruyko, Gonzalo Ferrer
https://arxiv.org/abs/2510.01483
AutoMaAS: Self-Evolving Multi-Agent Architecture Search for Large Language Models
Bo Ma, Hang Li, ZeHua Hu, XiaoFan Gui, LuYao Liu, Simon Liu
https://arxiv.org/abs/2510.02669 ht…
On Theoretical Interpretations of Concept-Based In-Context Learning
Huaze Tang, Tianren Peng, Shao-lun Huang
https://arxiv.org/abs/2509.20882 https://arxiv…
SQUARE: Semantic Query-Augmented Fusion and Efficient Batch Reranking for Training-free Zero-Shot Composed Image Retrieval
Ren-Di Wu, Yu-Yen Lin, Huei-Fang Yang
https://arxiv.org/abs/2509.26330
TASP: Topology-aware Sequence Parallelism
Yida Wang (Capital Normal University, Infinigence-AI), Ke Hong (Tsinghua University, Infinigence-AI), Xiuhong Li (Infinigence-AI), Yuanchao Xu (Capital Normal University), Wenxun Wang (Tsinghua University), Guohao Dai (Infinigence-AI, Shanghai Jiao Tong University), Yu Wang (Tsinghua University)
https://
Poseidon: A OneGraph Engine
Brad Bebee, Ümit V. Çatalyürek, Olaf Hartig, Ankesh Khandelwal, Simone Rondelli, Michael Schmidt, Lefteris Sidirourgos, Bryan Thompson
https://arxiv.org/abs/2510.11166
Replaced article(s) found for cs.CL. https://arxiv.org/list/cs.CL/new
[2/3]:
- Query-Level Uncertainty in Large Language Models
Lihu Chen, Gerard de Melo, Fabian M. Suchanek, Gaël Varoquaux
Doc2Query : Topic-Coverage based Document Expansion and its Application to Dense Retrieval via Dual-Index Fusion
Tzu-Lin Kuo, Wei-Ning Chiu, Wei-Yun Ma, Pu-Jen Cheng
https://arxiv.org/abs/2510.09557
Differentially Private Synthetic Text Generation for Retrieval-Augmented Generation (RAG)
Junki Mori, Kazuya Kakizaki, Taiki Miyagawa, Jun Sakuma
https://arxiv.org/abs/2510.06719
The silence of the weights: an investigation of structural pruning strategies for attention-based audio signal architectures
Andrea Diecidue, Carlo Alberto Barbano, Piero Fraternali, Mathieu Fontaine, Enzo Tartaglione
https://arxiv.org/abs/2509.26207
ARUQULA -- An LLM based Text2SPARQL Approach using ReAct and Knowledge Graph Exploration Utilities
Felix Brei, Lorenz Bühmann, Johannes Frey, Daniel Gerber, Lars-Peter Meyer, Claus Stadler, Kirill Bulert
https://arxiv.org/abs/2510.02200
From Reasoning LLMs to BERT: A Two-Stage Distillation Framework for Search Relevance
Runze Xia, Yupeng Ji, Yuxi Zhou, Haodong Liu, Teng Zhang, Piji Li
https://arxiv.org/abs/2510.11056
TaoSR-SHE: Stepwise Hybrid Examination Reinforcement Learning Framework for E-commerce Search Relevance
Pengkun Jiao, Yiming Jin, Jianhui Yang, Chenhe Dong, Zerui Huang, Shaowei Yao, Xiaojiang Zhou, Dan Ou, Haihong Tang
https://arxiv.org/abs/2510.07972
From NL2SQL to NL2GeoSQL: GeoSQL-Eval for automated evaluation of LLMs on PostGIS queries
Shuyang Hou, Haoyue Jiao, Ziqi Liu, Lutong Xie, Guanyu Chen, Shaowen Wu, Xuefeng Guan, Huayi Wu
https://arxiv.org/abs/2509.25264
BESPOKE: Benchmark for Search-Augmented Large Language Model Personalization via Diagnostic Feedback
Hyunseo Kim, Sangam Lee, Kwangwook Seo, Dongha Lee
https://arxiv.org/abs/2509.21106
Bridging Language Gaps: Advances in Cross-Lingual Information Retrieval with Multilingual LLMs
Roksana Goworek, Olivia Macmillan-Scott, Eda B. Özyiğit
https://arxiv.org/abs/2510.00908
Mixture of Thoughts: Learning to Aggregate What Experts Think, Not Just What They Say
Jacob Fein-Ashley, Dhruv Parikh, Rajgopal Kannan, Viktor Prasanna
https://arxiv.org/abs/2509.21164
Play by the Type Rules: Inferring Constraints for LLM Functions in Declarative Programs
Parker Glenn, Alfy Samuel, Daben Liu
https://arxiv.org/abs/2509.20208 https://
CoDA: Agentic Systems for Collaborative Data Visualization
Zichen Chen, Jiefeng Chen, Sercan Ö. Arik, Misha Sra, Tomas Pfister, Jinsung Yoon
https://arxiv.org/abs/2510.03194
Automated Discovery of Test Oracles for Database Management Systems Using LLMs
Qiuyang Mang, Runyuan He, Suyang Zhong, Xiaoxuan Liu, Huanchen Zhang, Alvin Cheung
https://arxiv.org/abs/2510.06663
TaoSR-AGRL: Adaptive Guided Reinforcement Learning Framework for E-commerce Search Relevance
Jianhui Yang, Yiming Jin, Pengkun Jiao, Chenhe Dong, Zerui Huang, Shaowei Yao, Xiaojiang Zhou, Dan Ou, Haihong Tang
https://arxiv.org/abs/2510.08048
QueryGym: Step-by-Step Interaction with Relational Databases
Haritha Ananthakrishanan, Harsha Kokel, Kelsey Sikes, Debarun Bhattacharjya, Michael Katz, Shirin Sohrabi, Kavitha Srinivas
https://arxiv.org/abs/2509.21674
Learning Compact Representations of LLM Abilities via Item Response Theory
Jianhao Chen, Chenxu Wang, Gengrui Zhang, Peng Ye, Lei Bai, Wei Hu, Yuzhong Qu, Shuyue Hu
https://arxiv.org/abs/2510.00844
Reasoning by Exploration: A Unified Approach to Retrieval and Generation over Graphs
Haoyu Han, Kai Guo, Harry Shomer, Yu Wang, Yucheng Chu, Hang Li, Li Ma, Jiliang Tang
https://arxiv.org/abs/2510.07484
Drawing Conclusions from Draws: Rethinking Preference Semantics in Arena-Style LLM Evaluation
Raphael Tang, Crystina Zhang, Wenyan Li, Carmen Lai, Pontus Stenetorp, Yao Lu
https://arxiv.org/abs/2510.02306
F2LLM Technical Report: Matching SOTA Embedding Performance with 6 Million Open-Source Data
Ziyin Zhang, Zihan Liao, Hang Yu, Peng Di, Rui Wang
https://arxiv.org/abs/2510.02294 …
PARROT: A Benchmark for Evaluating LLMs in Cross-System SQL Translation
Wei Zhou, Guoliang Li, Haoyu Wang, Yuxing Han, Xufei Wu, Fan Wu, Xuanhe Zhou
https://arxiv.org/abs/2509.23338
Study on LLMs for Promptagator-Style Dense Retriever Training
Daniel Gwon, Nour Jedidi, Jimmy Lin
https://arxiv.org/abs/2510.02241 https://arxiv.org/pdf/25…