Tootfinder

Opt-in global Mastodon full text search. Join the index!

@sascha_wolfer@fediscience.org
2025-10-10 06:06:17

Finally, what Xia & Lindell call a "separation problem" is, in our view, a feature of our approach and not a bug.
If, e.g., all languages in a family are polysynthetic (or none are), that’s not a statistical artefact – it’s the signal. The outcome is strongly associated with genealogy, showing that family membership captures something genuinely informative about the process. When the model finds that family explains a large share of the variance, that's not a failure – it's evidence that phylogenetic structure dominates the pattern.
So while Xia & Lindell insist that "autocorrelation due to relationships and distance cannot be captured in family or regional-level analyses", we see that as an empirical question – and we treated it as one.
The real test is whether a mixed model that explicitly represents phylogeny and geography performs worse than their alternative, where the entire shared history of languages and environments is effectively collapsed into a single dimension (an eigenvector).
In other words: we model relationships – Xia & Lindell summarise them into one number per language.
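To make the "separation" situation concrete, here is a minimal sketch that flags families in which every language shares the same binary outcome. The family names and outcome values are hypothetical toy data, and `separated_families` is an illustrative helper, not part of either paper's analysis.

```python
from collections import defaultdict

# Hypothetical toy data: (family, is_polysynthetic). Not taken from any study.
languages = [
    ("FamilyA", 1), ("FamilyA", 1), ("FamilyA", 1),
    ("FamilyB", 0), ("FamilyB", 0),
    ("FamilyC", 1), ("FamilyC", 0), ("FamilyC", 0),
]

def separated_families(rows):
    """Return the families whose members all share one binary outcome."""
    outcomes = defaultdict(set)
    for family, outcome in rows:
        outcomes[family].add(outcome)
    # A family is "separated" when only one outcome value was ever observed.
    return sorted(f for f, seen in outcomes.items() if len(seen) == 1)

print(separated_families(languages))
```

In this toy set, FamilyA and FamilyB are perfectly predicted by family membership, which is the pattern the post argues should be read as signal rather than artefact.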

@sascha_wolfer@fediscience.org
2025-10-10 06:05:44

One other thing: while we don't claim that our mixed-effects logit model is the perfect way to account for non-independence between languages, we don't think it's correct to simply assert, as Xia & Lindell do, that our results are "counterintuitive", that the fixed-effects estimates are "unreliable", and that the high model fits are "unrealistic." Whether a mixed model better captures the data-generating process is ultimately an empirical question, not one to be decided by assertion. Take, for instance, our finding that once random effects for either subregion or language family are included, the estimated effect of L1_population reverses direction – from the negative value reported by Xia & Lindell et al. to a positive one.
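A sign reversal of this kind can arise whenever groups differ in both predictor and outcome levels. The sketch below is only an illustration under strong simplifying assumptions: it uses a linear slope on a continuous outcome and group-mean centring as a stand-in for group-level effects, not the mixed-effects logit model discussed here, and all values and group labels are hypothetical.

```python
from statistics import mean

# Hypothetical toy data: (group, x, y); x stands in for a predictor such as
# L1_population, and groups stand in for families or subregions.
data = [
    ("G1", 1.0, 5.0), ("G1", 2.0, 6.0), ("G1", 3.0, 7.0),
    ("G2", 6.0, 1.0), ("G2", 7.0, 2.0), ("G2", 8.0, 3.0),
]

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def pooled_slope(rows):
    """Slope that ignores group membership entirely."""
    return slope([x for _, x, _ in rows], [y for _, _, y in rows])

def within_group_slope(rows):
    """Slope after subtracting each group's mean from x and y."""
    groups = {}
    for g, x, y in rows:
        groups.setdefault(g, []).append((x, y))
    xs, ys = [], []
    for members in groups.values():
        gx = mean(x for x, _ in members)
        gy = mean(y for _, y in members)
        xs.extend(x - gx for x, _ in members)
        ys.extend(y - gy for _, y in members)
    return slope(xs, ys)

print(pooled_slope(data))        # negative: group differences dominate
print(within_group_slope(data))  # positive: the trend inside each group
```

The pooled slope is negative because the groups sit at different levels, while the slope inside each group is positive; which of the two better describes the real data is the empirical question the post raises.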

@arXiv_csNI_bot@mastoxiv.page
2025-09-30 10:34:01

Continual Learning to Generalize Forwarding Strategies for Diverse Mobile Wireless Networks
Cheonjin Park, Victoria Manfredi, Xiaolan Zhang, Chengyi Liu, Alicia P Wolfe, Dongjin Song, Sarah Tasneem, Bing Wang
arxiv.org/abs/2509.23913

@sascha_wolfer@fediscience.org
2025-07-23 06:22:41

- We investigate the distribution of the feature variables (corpus frequency, dictionary views, part-of-speech, polysemy) over CEFR levels.
- Variable importance analyses show us how important each variable was for the classification of each level.
We conclude: "Thus, our semi-automatic approach offers a practical solution to the limitations of existing CEFR lists, providing a framework for expanding these lists in a systematic and data-driven manner. However, our findings also reveal the importance of human oversight in the process."
Supplementary material contains an ensemble approach to classification and all used/generated data: osf.io/6s9y7/