Finally, what Xia & Lindell call a "separation problem" is, in our view, a feature of our approach and not a bug.
If, e.g., all languages in a family are polysynthetic (or none are), that's not a statistical artefact – it's the signal. The outcome is strongly associated with genealogy, which shows that family membership captures something genuinely informative about the process. When the model finds that family explains a large share of the variance, that's not a failure – it's evidence that phylogenetic structure dominates the pattern.
So while Xia & Lindell insist that "autocorrelation due to relationships and distance cannot be captured in family or regional-level analyses", we see that as an empirical question – and we treated it as one.
The real test is whether a mixed model that explicitly represents phylogeny and geography performs worse than their alternative, where the entire shared history of languages and environments is effectively collapsed into a single dimension (an eigenvector).
In other words: we model relationships – Xia & Lindell summarise them into one number per language.
Cracking CodeWhisperer: Analyzing Developers' Interactions and Patterns During Programming Tasks
Jeena Javahar, Tanya Budhrani, Manaal Basha, Cleidson R. B. de Souza, Ivan Beschastnikh, Gema Rodriguez-Perez
https://arxiv.org/abs/2510.11516
A systematic comparison of Large Language Models for automated assignment assessment in programming education: Exploring the importance of architecture and vendor
Marcin Jukiewicz
https://arxiv.org/abs/2509.26483
SPATA: Systematic Pattern Analysis for Detailed and Transparent Data Cards
João Vitorino, Eva Maia, Isabel Praça, Carlos Soares
https://arxiv.org/abs/2509.26640 https://…
Geometric filtering effect in expanding Bose-Einstein condensate shells
Andrea Tononi, Maciej Lewenstein, Luis Santos
https://arxiv.org/abs/2510.12309 https://