
2025-10-10 06:06:01
Eyeballing Figure 1 of their response actually seems to support this: the three subregions in the Americas contain nearly 80 % of all polysynthetic languages. In each of them, the median population size lies below the global median. However, if we compare within each of these three regions, polysynthetic languages have a higher median L1_population size than non-polysynthetic ones. Might this pattern point towards a classic Simpson's paradox?
A negative global association arises because polysynthetic languages are concentrated in regions with smaller overall populations, even though within regions the relationship is positive. Once we account for that structure, as our mixed logit models do, the supposed "global" negative effect reverses direction.
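
To make the reversal concrete, here is a minimal sketch with invented numbers. The two regions, the language counts, and the population figures are all hypothetical and are not taken from Figure 1 or from our data. It compares raw medians only and does not stand in for the mixed logit models.

```python
from statistics import median

# Hypothetical toy data, invented for illustration only:
# (region, is_polysynthetic, L1 population)
languages = [
    # Region "A": small populations overall, mostly polysynthetic
    ("A", True, 2_000), ("A", True, 3_000), ("A", True, 4_000), ("A", True, 5_000),
    ("A", False, 1_000), ("A", False, 1_500),
    # Region "B": large populations overall, mostly non-polysynthetic
    ("B", True, 200_000),
    ("B", False, 50_000), ("B", False, 80_000),
    ("B", False, 100_000), ("B", False, 120_000),
]

def median_by_type(rows):
    """Return (median polysynthetic, median non-polysynthetic) population."""
    poly = [pop for _, is_poly, pop in rows if is_poly]
    non_poly = [pop for _, is_poly, pop in rows if not is_poly]
    return median(poly), median(non_poly)

# Pooled comparison that ignores region
global_poly, global_non = median_by_type(languages)

# The same comparison made separately inside each region
within = {
    region: median_by_type([row for row in languages if row[0] == region])
    for region in sorted({row[0] for row in languages})
}

print("global:", global_poly, "vs", global_non)
for region, (poly, non_poly) in within.items():
    print(f"region {region}:", poly, "vs", non_poly)
```

In this toy set the pooled polysynthetic median is the lower one, yet inside each region the polysynthetic median is the higher one. The reversal comes entirely from how the two language types are distributed across regions of different size.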