XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs
Yitian Gong, Luozhijie Jin, Ruifan Deng, Dong Zhang, Xin Zhang, Qinyuan Cheng, Zhaoye Fei, Shimin Li, Xipeng Qiu
https://arxiv.org/abs/2506.23325
Semantic Numeration Systems as Dynamical Systems
Alexander Yu. Chunikhin
https://arxiv.org/abs/2507.21295 https://arxiv.org/pdf/2507.21295
Density, asymmetry and citation dynamics in scientific literature
Nathaniel Imel, Zachary Hafen
https://arxiv.org/abs/2506.23366
Fun educational psychology experiment idea if anyone wants to run it by an IRB and get a paper published:
Get an entire compsci 1 class to take the RAADS-R. Then measure how easily/quickly they understood call/return stack semantics when the concept was first introduced (based on quiz/homework score, self reported "this makes sense" level, or something else).
And scatter plot the two against each other. Do you get two neat clusters?
Base-extension Semantics for Intuitionistic Modal Logics
Yll Buzoku, David J. Pym
https://arxiv.org/abs/2507.06834
To add a single example here (feel free to chime in with your own):
Problem: editing code is sometimes tedious because external APIs require boilerplate.
Solutions:
- Use LLM-generated code. Downsides: energy use, code theft, potential legal liability, frequent mistakes, etc. Upsides: popular among some peers, seems easy to use.
- Pick a better library (not always possible).
- Build internal functions to centralize boilerplate code, then use those (benefits: you get a better understanding of the external API, and a more-unit-testable internal code surface; probably less amortized effort).
- Develop a non-LLM system that actually reasons about code at something like the formal semantics level and suggests boilerplate fill-ins based on rules, while foregrounding which rules it's applying so you can see the logic behind the suggestions (needs research).
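The "centralize boilerplate" option above can be sketched concretely. A minimal hypothetical example (the API base URL and helper name are made up for illustration): instead of repeating URL-building boilerplate for an external HTTP API at every call site, wrap it once in an internal function built on the standard library.

```python
from urllib.parse import urlencode, urljoin

API_BASE = "https://api.example.com/"  # hypothetical external service


def build_request_url(path: str, **params) -> str:
    """Centralize the urljoin/urlencode boilerplate behind one internal helper.

    Every call site now goes through this function, so URL-building rules
    (encoding, parameter ordering) live in exactly one unit-testable place.
    """
    url = urljoin(API_BASE, path)
    if params:
        # Sort parameters so the output is deterministic and easy to test.
        url = f"{url}?{urlencode(sorted(params.items()))}"
    return url


# Usage: call sites shrink to one line, and the helper is trivially testable
# without touching the network.
print(build_request_url("v1/items", page=2, q="widgets"))
# https://api.example.com/v1/items?page=2&q=widgets
```

This is the "more-unit-testable internal code surface" point in miniature: the boilerplate is written once, understood once, and tested offline.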
Obviously LLM use in coding goes beyond this single issue, but there are similar analyses for each potential use of LLMs in coding. In all cases there are:
1. Existing practical solutions that require more effort (or in many cases only seem to, and are actually less effort when amortized).
2. Near-term researchable solutions that directly address the problem and which would be much more desirable in the long term.
Thus in addition to disastrous LLM effects on the climate, on data laborers, and on the digital commons, they tend to suck us into cheap-seeming but ultimately costly design practices while also crowding out better long-term solutions. Next time someone suggests how useful LLMs are for some task, try asking yourself (or them) what an ideal solution for that task would look like, and whether LLM use moves us closer to or farther from a world in which that solution exists.
Modal Logic for Stratified Becoming: Actualization Beyond Possible Worlds
Alexandre Le Nepvou
https://arxiv.org/abs/2506.17276
Modeling Uncertainty: From Simulink to Stochastic Hybrid Automata
Pauline Blohm, Felix Schulz, Lisa Willemsen, Anne Remke, Paula Herber
https://arxiv.org/abs/2506.14581
Considerations on Everett J. Nelson's connexive logic
Davide Fazio, Raffaele Mascella
https://arxiv.org/abs/2506.10893
This https://arxiv.org/abs/2506.00512 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csGR_…