On Reconfigurable Bisimulation, with an Application to the Distributed Synthesis Problem
Yehia Abd Alrahman, Nir Piterman
https://arxiv.org/abs/2505.21672 …
from my link log —
Thoughts on hashing in Rust.
https://purplesyringa.moe/blog/thoughts-on-rust-hashing/
saved 2024-12-13
CodeMorph: Mitigating Data Leakage in Large Language Model Assessment
Hongzhou Rao, Yanjie Zhao, Wenjie Zhu, Ling Xiao, Meizhen Wang, Haoyu Wang
https://arxiv.org/abs/2506.17627
A look at the Chile-led Latam-GPT project, which involves 30 Latin American and Caribbean institutions collaborating to release an open-source LLM in September (Cristián Vera-Cruz/Rest of World)
https://restofworld.org/2025/chatgpt-latin-america-alternative-latamgpt…
LLMs are Bayesian, in Expectation, not in Realization
Leon Chlon, Sarah Rashidi, Zein Khamis, MarcAntonio M. Awada
https://arxiv.org/abs/2507.11768 https:/…
InfoFlood: Jailbreaking Large Language Models with Information Overload
Advait Yadav, Haibo Jin, Man Luo, Jun Zhuang, Haohan Wang
https://arxiv.org/abs/2506.12274
AI, AGI, and learning efficiency
My 4-month-old kid is not DDoSing Wikipedia right now, nor will they ever do so before learning to speak, read, or write. Their entire "training corpus" will not top even 100 million "tokens" before they can speak & understand language, and do so with real intentionality.
Just to emphasize that point: 100 words-per-minute times 60 minutes-per-hour times 12 hours-per-day times 365 days-per-year times 4 years is a mere 105,120,000 words. That's a ludicrously *high* estimate of words-per-minute and hours-per-day, and 4 years old (the age of my other kid) is well past the point at which many children have developed basic speech. More likely the available "training data" is at least 1 or 2 orders of magnitude smaller than this.
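Spelled out as a quick sanity check, here is that back-of-the-envelope bound as a tiny Python sketch; the constants are the post's own deliberately generous assumptions, not measurements:

```python
# Rough upper bound on a child's language exposure, using the post's
# deliberately generous assumptions (illustrative only, not data).
words_per_minute = 100   # very high sustained rate
minutes_per_hour = 60
hours_per_day = 12       # also generous
days_per_year = 365
years = 4

upper_bound = (words_per_minute * minutes_per_hour
               * hours_per_day * days_per_year * years)
print(f"{upper_bound:,} words")  # 105,120,000

# The post argues the realistic figure is 1-2 orders of magnitude lower,
# i.e. roughly 1-10 million words, versus billions of tokens for LLMs.
```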
The point here is that large language models, trained as they are on multiple *billions* of tokens, are not developing their behavioral capabilities in a way that's remotely similar to humans, even if you believe those capabilities are similar (they are by certain very biased ways of measurement; they very much aren't by others). This idea that humans must be naturally good at acquiring language is an old one (see e.g. …). #AI #LLM #AGI
Rust vs. C for Python Libraries: Evaluating Rust-Compatible Bindings Toolchains
Isabella Basso do Amaral (University of São Paulo), Renato Cordeiro Ferreira (University of São Paulo, Jheronimus Academy of Data Science, Technical University of Eindhoven, Tilburg University), Alfredo Goldman (University of São Paulo)
https://
This https://arxiv.org/abs/2410.18042 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csPL_…
This https://arxiv.org/abs/2506.02791 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csSE_…