2025-09-05 10:02:01
Intermediate Languages Matter: Formal Languages and LLMs affect Neurosymbolic Reasoning
Alexander Beiser, David Penz, Nysret Musliu
https://arxiv.org/abs/2509.04083 https://
What if I ask in alia lingua? Measuring Functional Similarity Across Languages
Debangan Mishra, Arihant Rastogi, Agyeya Negi, Shashwat Goel, Ponnurangam Kumaraguru
https://arxiv.org/abs/2509.04032
unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted
Janus-faces of temporal constraint languages: a dichotomy of expressivity
Johanna Brunar, Michael Pinsker, Moritz Schöbi
https://arxiv.org/abs/2509.04347 https://
Internal languages of locally cartesian closed $(\infty,1)$-categories
El Mehdi Cherradi
https://arxiv.org/abs/2509.03371 https://arxiv.org/pdf/2509.03371
MultiWikiQA: A Reading Comprehension Benchmark in 300 Languages
Dan Saattrup Smart
https://arxiv.org/abs/2509.04111 https://arxiv.org/pdf/2509.04111
Researchers find OpenAI's o1 can analyze languages like a human expert, including inferring the phonological rules of made-up languages without prior knowledge (Steve Nadis/Quanta Magazine)
https://www.quantamagazine.org/in-a-first-
IMO the reason why people yearn for tools that generate code is that programming is broken—everything now is giant layer cakes of huge, complex and intransparent frameworks designed by and for large teams in giant tech companies.
The same tech companies that flooded programming with overly complex tools, endless toolchains, new programming languages du jour every few years, required backwards-compatibility breaking updates and mandatory design overhauls are now selling you “AI” to generate code for the mess they made.
A substitution lemma for multiple context-free languages
Andrew Duncan, Murray Elder, Lisa Frenkel, Mengfan Lyu
https://arxiv.org/abs/2509.02117 https://ar…
from my link log —
Control structures in programming languages: from goto to algebraic effects.
http://xavierleroy.org/control-structures/
saved 2025-11-03
Was recently reminded about the time me and my brothers were studying high school German with the aid of some bilingual story that had each page printed in both languages, and came across the word "Stuhlgang".
From that point forth, any time we came across a stool of the sort you sit on (as opposed to the kind you flush down the toilet) we'd refer to it as a Stuhlgang.
Every day, some 2 billion people around the world use privacy-protection tools supported by the Open Technology Fund.
When people in China escape their government’s firewalls and censorship software
—now so dense that the system has been called the “locknet”
—or when users in Cuba or Myanmar evade cruder internet blocks,
they can access material written in their own languages and read stories they would otherwise never see.
Both the access and some of the informa…
Messing around the whole weekend with the WordPress instance of a friend. He's been using an old plugin for multiple languages. The plugin has been abandoned since 2015 and doesn't support newer PHP versions. WordPress and the other plugins are about to drop support for the old PHP version.
At first I thought about writing a conversion tool to another plugin. Spent some considerable time setting up a test environment as well. Eventually it turned out a group of people forked the problematic p…
Programming in Forth is oddly relaxing. I want to forget about modern languages and embrace antique ones
Replaced article(s) found for cs.PL. https://arxiv.org/list/cs.PL/new
[1/1]:
- Escape with Your Self: Sound and Expressive Bidirectional Typing with Avoidance for Reachability ...
Songlin Jia, Guannan Wei, Siyuan He, Yuyan Bao, Tiark Rompf
The 85th edition of De Programmatica Ipsum is out!
This month, we look at the strategies used by major programming languages to manage memory; in our Vidéothèque section, we learn how C and C manage memory in a video by Ryan Baker; and in the Library section, we review "What Every Programmer Should Know About Memory" by Ulrich Drepper.
Exploring NLP Benchmarks in an Extremely Low-Resource Setting
Ulin Nuha, Adam Jatowt
https://arxiv.org/abs/2509.03962 https://arxiv.org/pdf/2509.03962
Store Languages of Turing Machines and Counter Machines
Noah Friesen, Oscar H. Ibarra, Jozef Jirásek, Ian McQuillan
https://arxiv.org/abs/2509.02828 https://
How inaccurate AI translations of Wikipedia pages, which AI models use for training, may cause a doom spiral that further marginalizes vulnerable languages (Jacob Judah/MIT Technology Review)
https://www.technologyreview.com/2025/09/25/11240…
Are We SOLID Yet? An Empirical Study on Prompting LLMs to Detect Design Principle Violations
Fatih Pehlivan, Arçin Ülkü Ergüzen, Sahand Moslemi Yengejeh, Mayasah Lami, Anil Koyuncu
https://arxiv.org/abs/2509.03093
Characterization of Speech Similarity Between Australian Aboriginal and High-Resource Languages: A Case Study on Dharawal
Ting Dang, Trini Manoj Jeyaseelan, Eliathamby Ambikairajah, Vidhyasaharan Sethu
https://arxiv.org/abs/2509.01419
Milco: Learned Sparse Retrieval Across Languages via a Multilingual Connector
Thong Nguyen, Yibin Lei, Jia-Huei Ju, Eugene Yang, Andrew Yates
https://arxiv.org/abs/2510.00671 ht…
Denny Vrandečić is guiding us through the long history of Wikipedia and the Semantic Web in his brilliant keynote at #ISWC2025, from Aristotle to Wikifunctions, which enables an automatically generated (multilingual) Wikipedia, including in less-represented languages
#semanticweb
Crosslisted article(s) found for cs.PL. https://arxiv.org/list/cs.PL/new
[1/1]:
- Lattice Annotated Temporal (LAT) Logic for Non-Markovian Reasoning
Mukherji, Patil, Aditya, Shakarian, Parkar, Pokala, Dorman, Simari
Shape and word parts combine linearly in the Bouba–Kiki effect https://link.springer.com/article/10.3758/s13414-025-03151-1
L3Cube-IndicHeadline-ID: A Dataset for Headline Identification and Semantic Evaluation in Low-Resource Indian Languages
Nishant Tanksale, Tanmay Kokate, Darshan Gohad, Sarvadnyaa Barate, Raviraj Joshi
https://arxiv.org/abs/2509.02503
wikipedia_link: Wikipedia links (2016)
Networks of hyperlinks among articles on Wikipedia, for all available languages. A directed edge (i,j) indicates that article i hyperlinks to j.
This network has 83330 nodes and 2095962 edges.
Tags: Informational, Web graph, Unweighted
https://networks.skewed.de/n…
Mixing languages can be confusing
#linguistics #languages #language
Replaced article(s) found for cs.PL. https://arxiv.org/list/cs.PL/new
[1/1]:
- Modal Abstractions for Virtualizing Memory Addresses
Ismail Kuru, Colin S. Gordon
h…
Video of my SPLSS 2025 talk “Do Programming Languages Fulfill Requirements? Should They?” is online.
https://www.youtube.com/watch?v=rhC6bBa8Rf8
Meta-Pretraining for Zero-Shot Cross-Lingual Named Entity Recognition in Low-Resource Philippine Languages
David Demitri Africa, Suchir Salhan, Yuval Weiss, Paula Buttery, Richard Diehl Martinez
https://arxiv.org/abs/2509.02160
#memorysafe implementation of the C and C++ programming languages
Safe Memory Reclamation Techniques
Ajay Singh
https://arxiv.org/abs/2509.02457 https://arxiv.org/pdf/2509.02457
Bias beyond Borders: Global Inequalities in AI-Generated Music
Ahmet Solak, Florian Grötschla, Luca A. Lanzendörfer, Roger Wattenhofer
https://arxiv.org/abs/2510.01963
Replaced article(s) found for cs.FL. https://arxiv.org/list/cs.FL/new
[1/1]:
- Autoformalization in the Wild: Assessing LLMs on Real-World Mathematical Definitions
Lan Zhang, Marco Valentino, Andre Freitas
THE LESSER-KNOWN PROGRAMMING LANGUAGES #8: LAIDBACK
This language was developed at the Marin County Center for T'ai Chi,
Mellowness and Computer Programming (now defunct), as an alternative to
the more intense atmosphere in nearby Silicon Valley.
The center was ideal for programmers who liked to soak in hot tubs while
they worked. Unfortunately few programmers could survive there because the
center outlawed Pizza and Coca-Cola in favor of Tofu and Perrier…
Google says NotebookLM's Video Overviews now support 80 languages, and Audio Overviews now provide more detailed non-English summaries (Lauren Forristal/TechCrunch)
https://techcrunch.com/2025/08/25/notebooklms-video-overview-feature-now-su…
Automatic Speech Recognition (ASR) for African Low-Resource Languages: A Systematic Literature Review
Sukairaj Hafiz Imam, Tadesse Destaw Belay, Kedir Yassin Husse, Ibrahim Said Ahmad, Idris Abdulmumin, Hadiza Ali Umar, Muhammad Yahuza Bello, Joyce Nakatumba-Nabende, Seid Muhie Yimam, Shamsuddeen Hassan Muhammad
https://arxiv.org/abs/2510.…
Interesting way to represent uncertain or vague information in #knowledgegraphs (e.g. for easier integration of LLM/deep learning results into KGs) via "Fuzzy OWL". Paper by Fernando Bobillo & Umberto Straccia: Fuzzy Ontology Representation using OWL 2
Crosslisted article(s) found for cs.FL. https://arxiv.org/list/cs.FL/new
[1/1]:
- Simplicity Lies in the Eye of the Beholder: A Strategic Perspective on Controllers in Reactive Sy...
Mickael Randour
from my link log —
Sapir-Whorf does not apply to programming languages.
https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/
saved 2025-08-21
[2025-09-04 Thu (UTC), 1 new article found for cs.PL Programming Languages]
Sequent Calculi for Data-Aware Modal Logics
Carlos Areces (Universidad Nacional de Cordoba,CONICET), Valentin Cassano (Universidad Nacional de Rio Cuarto,CONICET), Danae Dutto (Universidad Nacional de Cordoba,CONICET), Raul Fervari (Universidad Nacional de Cordoba,CONICET)
https://arxiv.org/abs/2510.01868
Scalable Thread-Safety Analysis of Java Classes with CodeQL
Bjørnar Haugstad Jåtten, Simon Boye Jørgensen, Rasmus Petersen, Raúl Pardo
https://arxiv.org/abs/2509.02022
An LLM-enabled semantic-centric framework to consume privacy policies
Rui Zhao, Vladyslav Melnychuk, Jun Zhao, Jesse Wright, Nigel Shadbolt
https://arxiv.org/abs/2509.01716 http…
"Programmers, as users of compilers, experience Wittgenstein’s observation every day; newer programming languages provide more sophisticated ways to express algorithms, thereby expanding the limits of their own programming capacity, LLMs and “vibe coding” notwithstanding."
https://deprogrammaticaipsum.com/vikra…
Crosslisted article(s) found for cs.FL. https://arxiv.org/list/cs.FL/new
[1/1]:
- Identifiability and minimality bounds of quantum and post-quantum models of classical stochastic ...
Paul M. Riechers, Thomas J. Elliott
[2025-09-05 Fri (UTC), 1 new article found for cs.PL Programming Languages]
EuroSpeech: A Multilingual Speech Corpus
Samuel Pfisterer, Florian Grötschla, Luca A. Lanzendörfer, Florian Yan, Roger Wattenhofer
https://arxiv.org/abs/2510.00514
ArabEmoNet: A Lightweight Hybrid 2D CNN-BiLSTM Model with Attention for Robust Arabic Speech Emotion Recognition
Ali Abouzeid, Bilal Elbouardi, Mohamed Maged, Shady Shehata
https://arxiv.org/abs/2509.01401
When Lifetimes Liberate: A Type System for Arenas with Higher-Order Reachability Tracking
Siyuan He, Songlin Jia, Yuyan Bao, Tiark Rompf
https://arxiv.org/abs/2509.04253 https:/…
Virtual Group Knowledge and Group Belief in Topological Evidence Models (Extended Version)
Alexandru Baltag, Malvin Gattinger, Djanira Gomes
https://arxiv.org/abs/2509.00184 htt…
Beyond Single LLMs: Enhanced Code Generation via Multi-Stage Performance-Guided LLM Orchestration
Huashan Chen, Zhenyu Qi, Haotang Li, Hong Chen, Jinfu Chen, Kebin Peng, In Kee Kim, Kyu Hyung Lee, Sen He
https://arxiv.org/abs/2510.01379
[2025-09-05 Fri (UTC), 1 new article found for cs.FL Formal Languages and Automata Theory]
Semantically Reflected Programs
Eduard Kamburjan, Vidar Norstein Klungre, Yuanwei Qu, Rudolf Schlatte, Egor V. Kostylev, Martin Giese, Einar Broch Johnsen
https://arxiv.org/abs/2509.03318
from my link log —
Let's take esoteric programming languages seriously.
https://arxiv.org/abs/2505.15327
saved 2025-10-11 https://dotat.at/:/XKTKR.…
Tencent open sources translation models Hunyuan-MT-7B and Hunyuan-MT-Chimera-7B, which support 33 languages, claiming they beat established models in benchmarks (Jonathan Kemper/The Decoder)
https://the-decoder.com/tencent-open-sources-two-high-performing…
A RoBERTa-Based Functional Syntax Annotation Model for Chinese Texts
Han Xiaohui, Zhang Yunlong, Guo Yuxi
https://arxiv.org/abs/2509.04046 https://arxiv.or…
[2025-09-04 Thu (UTC), 1 new article found for cs.FL Formal Languages and Automata Theory]
Replaced article(s) found for cs.CL. https://arxiv.org/list/cs.CL/new
[3/3]:
- Science Across Languages: Assessing LLM Multilingual Translation of Scientific Papers
Hannah Calzi Kleidermacher, James Zou
word_adjacency: Word Adjacency Networks
Directed Networks of word adjacency in texts of several languages including English, French, Spanish and Japanese.
This network has 2704 nodes and 8300 edges.
Tags: Informational, Language, Unweighted
https://networks.skewed.de/net/word_ad
Agentic Specification Generator for Move Programs
Yu-Fu Fu, Meng Xu, Taesoo Kim
https://arxiv.org/abs/2509.24515 https://arxiv.org/pdf/2509.24515
Replaced article(s) found for cs.PL. https://arxiv.org/list/cs.PL/new
[1/1]:
- Discovering Software Parallelization Points Using Deep Neural Networks
Izavan dos S. Correia, Henrique C. T. Santos, Tiago A. E. Ferreira
MENLO: From Preferences to Proficiency - Evaluating and Modeling Native-like Quality Across 47 Languages
Chenxi Whitehouse, Sebastian Ruder, Tony Lin, Oksana Kurylo, Haruka Takagi, Janice Lam, Nicolò Busetto, Denise Diaz
https://arxiv.org/abs/2509.26601
Replaced article(s) found for cs.PL. https://arxiv.org/list/cs.PL/new
[1/1]:
- Abstract Interpretation of Temporal Safety Effects of Higher Order Programs
Mihai Nicola, Chaitanya Agarwal, Eric Koskinen, Thomas Wies
Crosslisted article(s) found for cs.PL. https://arxiv.org/list/cs.PL/new
[1/1]:
- REFINESTAT: Efficient Exploration for Probabilistic Program Synthesis
Madhav Kanda, Shubham Ugare, Sasa Misailovic
Replaced article(s) found for cs.FL. https://arxiv.org/list/cs.FL/new
[1/1]:
- List of Results on the Černý Conjecture and Reset Thresholds for Synchronizing Automata
Mikhail V. Volkov
Crosslisted article(s) found for cs.FL. https://arxiv.org/list/cs.FL/new
[1/1]:
- Mean-payoff and Energy Discrete Bidding Games
Guy Avni, Suman Sadhukhan
https://…
Languages Still Left Behind: Toward a Better Multilingual Machine Translation Benchmark
Chihiro Taguchi, Seng Mai, Keita Kurabe, Yusuke Sakai, Georgina Agyei, Soudabeh Eslami, David Chiang
https://arxiv.org/abs/2508.20511
Crosslisted article(s) found for cs.FL. https://arxiv.org/list/cs.FL/new
[1/1]:
- Cobham's theorem for the Gaussian integers
Álvaro Bustos-Gajardo, Robbert Fokkink, Reem Yassawi
[2025-09-03 Wed (UTC), 6 new articles found for cs.PL Programming Languages]
[2025-10-03 Fri (UTC), no new articles found for cs.PL Programming Languages]
JGU Mainz's Submission to the WMT25 Shared Task on LLMs with Limited Resources for Slavic Languages: MT and QA
Hossain Shaikh Saadi, Minh Duc Bui, Mario Sanz-Guerrero, Katharina von der Wense
https://arxiv.org/abs/2509.22490
Type-Based Incorrectness Reasoning
Zhe Zhou, Benjamin Delaware, Suresh Jagannathan
https://arxiv.org/abs/2509.01511 https://arxiv.org/pdf/2509.01511…
Flavors of Moonshine: Tiny Specialized ASR Models for Edge Devices
Evan King, Adam Sabra, Manjunath Kudlur, James Wang, Pete Warden
https://arxiv.org/abs/2509.02523 https://
[2025-10-03 Fri (UTC), 1 new article found for cs.FL Formal Languages and Automata Theory]
Regression Language Models for Code
Yash Akhauri, Xingyou Song, Arissa Wongpanich, Bryan Lewandowski, Mohamed S. Abdelfattah
https://arxiv.org/abs/2509.26476 https://
[2025-09-03 Wed (UTC), 4 new articles found for cs.FL Formal Languages and Automata Theory]
Formalizing Linear Motion G-code for Invariant Checking and Differential Testing of Fabrication Tools
Yumeng He, Chandrakana Nandi, Sreepathi Pai
https://arxiv.org/abs/2509.00699
PalmX 2025: The First Shared Task on Benchmarking LLMs on Arabic and Islamic Culture
Fakhraddin Alwajih, Abdellah El Mekki, Hamdy Mubarak, Majd Hawasly, Abubakr Mohamed, Muhammad Abdul-Mageed
https://arxiv.org/abs/2509.02550
Permutation closure for multiple context-free languages
Andrew Duncan, Murray Elder, Lisa Frenkel, Mengfan Lyu
https://arxiv.org/abs/2509.22239 https://arx…
It's All About In-Context Learning! Teaching Extremely Low-Resource Languages to LLMs
Yue Li, Zhixue Zhao, Carolina Scarton
https://arxiv.org/abs/2508.19089 https://
chDzDT: Word-level morphology-aware language model for Algerian social media text
Abdelkrime Aries
https://arxiv.org/abs/2509.01772 https://arxiv.org/pdf/2…
Macro-embedding Compiler Intermediate Languages in Racket
William J. Bowman
https://arxiv.org/abs/2509.19607 https://arxiv.org/pdf/2509.19607
LLM-Based Multi-Task Bangla Hate Speech Detection: Type, Severity, and Target
Md Arid Hasan, Firoj Alam, Md Fahad Hossain, Usman Naseem, Syed Ishtiaque Ahmed
https://arxiv.org/abs/2510.01995
Replaced article(s) found for cs.FL. https://arxiv.org/list/cs.FL/new
[1/1]:
- Time for Timed Monitorability
Thomas M. Grosen, Sean Kauffman, Kim G. Larsen, Martin Zimmermann
The role of synthetic data in Multilingual, Multi-cultural AI systems: Lessons from Indic Languages
Pranjal A. Chitale, Varun Gumma, Sanchit Ahuja, Prashant Kodali, Manan Uppadhyay, Deepthi Sudharsan, Sunayana Sitaram
https://arxiv.org/abs/2509.21294
[2025-10-02 Thu (UTC), 2 new articles found for cs.PL Programming Languages]
[2025-09-02 Tue (UTC), no new articles found for cs.PL Programming Languages]
CorIL: Towards Enriching Indian Language to Indian Language Parallel Corpora and Machine Translation Systems
Soham Bhattacharjee, Mukund K Roy, Yathish Poojary, Bhargav Dave, Mihir Raj, Vandan Mujadia, Baban Gain, Pruthwik Mishra, Arafat Ahsan, Parameswari Krishnamurthy, Ashwath Rao, Gurpreet Singh Josan, Preeti Dubey, Aadil Amin Kak, Anna Rao Kulkarni, Narendra VG, Sunita Arora, Rakesh Balbantray, Prasenjit Majumdar, Karunesh K Arora, Asif Ekbal, Dipti Mishra Sharma
Replaced article(s) found for cs.PL. https://arxiv.org/list/cs.PL/new
[1/1]:
- Complete the Cycle: Reachability Types with Expressive Cyclic References (Extended Version)
Haotian Deng, Siyuan He, Songlin Jia, Yuyan Bao, Tiark Rompf
Crosslisted article(s) found for cs.PL. https://arxiv.org/list/cs.PL/new
[1/1]:
- The WASM Cloak: Evaluating Browser Fingerprinting Defenses Under WebAssembly based Obfuscation
Sakib, Bin Akram, Spracklen, Kalutarage, Wijewickrama, Bilogrevic, Jadliwala
From Formal Language Theory to Statistical Learning: Finite Observability of Subregular Languages
Katsuhiko Hayashi, Hidetaka Kamigaito
https://arxiv.org/abs/2509.22598 https://…