Do LLMs Align with My Task? Evaluating Text-to-SQL via Dataset Alignment
Davood Rafiei, Morgan Lindsay Heisler, Weiwei Zhang, Mohammadreza Pourreza, Yong Zhang
https://arxiv.org/abs/2510.04919
Supervised Fine-Tuning (SFT) is an effective method for adapting Large Language Models (LLMs) to downstream tasks. However, variability in training data can hinder a model's ability to generalize across domains. This paper studies the problem of dataset alignment for Natural Language to SQL (NL2SQL or text-to-SQL), examining how well SFT training data matches the structural characteristics of target queries and how this alignment impacts model performance. We hypothesize that alignment can be a…