Modeling the mutational dynamics of very short tandem repeats
Amos Onn (Chair of Experimental Medicine and Therapy Research, University of Regensburg, Bioinformatics Group, Faculty of Mathematics and Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig), Tzipy Marx (Department of Computer Science and Applied Mathematics, Weizmann Institute of Science), Liming Tao (Cellular Tissue Genomics, Genentech), Tamir Biezuner (Department of Computer Science and Applied Mathematics, Weizmann Institute of Science), Ehud Shapiro (Department of Computer Science and Applied Mathematics, Weizmann Institute of Science), Christoph A. Klein (Chair of Experimental Medicine and Therapy Research, University of Regensburg, Fraunhofer Institute for Toxicology and Experimental Medicine Regensburg), Peter F. Stadler (Bioinformatics Group, Faculty of Mathematics and Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Max Planck Institute for Mathematics in the Sciences, Institute for Theoretical Chemistry, University of Vienna, Facultad de Ciencias, Universidad Nacional de Colombia, Center for non-coding RNA in Technology and Health, University of Copenhagen, Santa Fe Institute)
https://arxiv.org/abs/2603.25628 https://arxiv.org/pdf/2603.25628 https://arxiv.org/html/2603.25628
arXiv:2603.25628v1 Announce Type: new
Abstract: Short tandem repeats (STRs) are low-entropy regions in the genome, consisting of a short (1-6 bp) unit that is consecutively repeated multiple times. They are known for high mutational instability, due to so-called stutter mutations, in which the number of units in the run increases or decreases. In particular, STRs with a repeat unit length of 1-2 bp are prone to mutate even within several cell divisions. The extremely rapid accumulation of variation makes them interesting phylogenetic markers for retrospective single-cell lineage reconstruction. Here we model their mutational dynamics at the level of individual repeat unit types and then aggregate length variations over many STR loci, with the aim of obtaining a very fast "molecular clock". We calibrate our model on several datasets with known lineage structure, prepared from cultured cells. We find that the mutational dynamics of STRs are reasonably consistent within a given cell line but vary between cell lines. This suggests that the dynamics are not entirely explained by mutations in caretaker genes; rather, various other factors play a role, possibly including tissue origin and differentiation state. Further data and research are necessary to assess their relative effects.
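To make the idea of a stutter-based "molecular clock" concrete, here is a minimal sketch in Python. It is not the model calibrated in the paper: it assumes a symmetric single-step model in which each locus gains or loses one repeat unit with the same probability `mu` per cell division, independent of repeat unit type and run length. The function names, `mu`, and all parameter values are hypothetical and chosen only for illustration. Under these assumptions the mean squared length shift across loci grows as `mu * t`, so dividing it by `mu` gives a rough estimate of the number of divisions `t`.

```python
import random


def simulate_divisions(lengths, n_divisions, mu, rng):
    """Apply a symmetric single-step stutter model to a list of repeat counts.

    Assumption (illustrative only): at each division every locus mutates
    with probability mu, gaining or losing exactly one repeat unit.
    """
    current = list(lengths)
    for _ in range(n_divisions):
        for i, n in enumerate(current):
            if rng.random() < mu:
                step = rng.choice((-1, 1))
                # Keep at least one repeat unit in the run.
                current[i] = max(1, n + step)
    return current


def mean_squared_shift(ancestor, descendant):
    """Average squared change in repeat count over all loci."""
    total = sum((d - a) ** 2 for a, d in zip(ancestor, descendant))
    return total / len(ancestor)


def estimate_divisions(ancestor, descendant, mu):
    """Moment estimate of elapsed divisions.

    For symmetric +/-1 steps with per-division probability mu, the expected
    squared shift after t divisions is mu * t (ignoring the lower bound).
    """
    return mean_squared_shift(ancestor, descendant) / mu


if __name__ == "__main__":
    rng = random.Random(0)
    ancestor = [15] * 5000          # hypothetical loci, 15 repeat units each
    descendant = simulate_divisions(ancestor, n_divisions=20, mu=0.05, rng=rng)
    print(estimate_divisions(ancestor, descendant, mu=0.05))
```

Aggregating over many loci is what makes the estimate usable: a single locus rarely mutates in a few divisions, while the average over thousands of loci has a much smaller variance.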