LLM coding is the opposite of DRY
An important principle in software engineering is DRY: Don't Repeat Yourself. We recognize that having the same code copied in more than one place is bad for several reasons:
1. It makes the entire codebase harder to read.
2. It increases maintenance burden, since any problems in the duplicated code need to be solved in more than one place.
3. It makes the code more error-prone and harder to debug, because the copies can drift apart if changes to one aren't transferred to the other (perhaps the person making the change forgot there was a copy).
All modern programming languages make it almost entirely unnecessary to repeat code: we can move the repeated code into a "function" or "module" and then reference it from all the different places it's needed. At a larger scale, someone might write an open-source "library" of such functions or modules, and instead of re-implementing that functionality ourselves, we can use their code, with an acknowledgement. Using another person's library this way is complicated, because now you're dependent on them: if they stop maintaining it or introduce bugs, you've inherited a problem. But even then, you can always copy their project and maintain your own version, which is not much more work than if you had implemented everything yourself from the start. It's a little more complicated than this, but the basic principle holds, and it's a foundational one for software development in general and the open-source movement in particular. The network of "citations" that forms as open-source software builds on other open-source software, and as people contribute patches to each others' projects, is a lot of what makes the movement into a community, and it can lead to collaborations that drive further development. So the DRY principle is important at both small and large scales.
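To put the small-scale version in concrete terms, here's a minimal illustration of collapsing repeated code into one shared helper (the function names and the validation rule are hypothetical examples, not from any particular project):

```python
# Before: the same email check copy-pasted into two functions (a classic
# DRY violation). Hypothetical example code, not from any real project.
#
#     def register_user(email):
#         if "@" not in email or email != email.strip():
#             raise ValueError("invalid email")
#         ...
#
#     def invite_user(email):
#         if "@" not in email or email != email.strip():
#             raise ValueError("invalid email")
#         ...

# After: the shared logic lives in one place and both callers reference it.
def validate_email(email: str) -> str:
    """Raise ValueError for a malformed address; return it unchanged otherwise."""
    if "@" not in email or email != email.strip():
        raise ValueError(f"invalid email: {email!r}")
    return email

def register_user(email: str) -> None:
    print(f"registered {validate_email(email)}")

def invite_user(email: str) -> None:
    print(f"invited {validate_email(email)}")
```

Now a fix to the validation rule happens in exactly one place, and both callers get it for free.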
Unfortunately, the current crop of hyped-up LLM coding systems from the big players is antithetical to DRY at all scales:
- At the library scale, they train on open-source software but then (with some unknown frequency) replicate parts of it line-for-line *without* any citation [1]. The person using the LLM has no way of knowing this happened, and no practical way to check for it. In theory the LLM company could build a system to detect it, but that's unlikely to be profitable unless the courts actually start punishing these license violations, which doesn't seem likely given results so far and how hard it is to discover that the violations are happening. By creating these copies (and mash-ups, along with plenty of less-problematic output), LLM users (enabled and encouraged by the LLM-peddlers) are directly undermining the DRY principle. If we get what the big AI companies claim to want, a massive shift towards machine-authored code, DRY at the library scale will effectively be dead, with each new project simply re-implementing the functionality it needs instead of ever using a library. This might seem to have some upside, since dependency hell is a thing, but the downside in terms of comprehensibility, and therefore maintainability, correctness, and security, will be massive. The eventual lack of new high-quality DRY-respecting code to train the models on will only make the problem worse.
- At the module & function level, AI is probably prone to re-writing rather than re-using the functions it needs, especially in a workflow where a human prompts it for many independent completions. I don't have direct evidence for this part, since I don't use LLM coding models myself except in very specific circumstances, because it's not generally ethical to do so. I do know that when such a model tries to call existing functions, it often guesses incorrectly about the parameters they need, which I'm sure is a headache and a source of bugs for the vibe coders out there. A coding assistant could be designed to take more context into account, using existing lookup tools to fetch accurate function signatures before generating calls (a sketch of what that might look like follows this list); that would probably improve output quality significantly, but I suspect it's the kind of thing that would be seen as too baroque and thus not a priority. I'd love to hear I'm wrong about any of this, but I suspect the consequence is that any medium-or-larger codebase written with LLM tools will carry significant bloat from duplicated functionality, and will have places where better use of existing libraries would have made the code simpler. At a fundamental level, a principle like DRY is not something current LLM training techniques are able to learn: models can imitate it to some degree when asked for large amounts of code at once, but when prompted for many smaller chunks they're asymptotically likely to violate it.
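For what it's worth, the "lookup tools" idea doesn't require anything exotic. Here's a minimal sketch, assuming a Python codebase and using only the standard library's inspect module (my own hypothetical illustration, not any vendor's actual tooling), of how real function signatures could be gathered and placed in the model's context instead of letting it guess parameters:

```python
import importlib
import inspect

def signature_context(module_name: str) -> str:
    """Collect the real signatures (and one-line docs) of a module's functions,
    formatted so they can be pasted into a coding model's context."""
    module = importlib.import_module(module_name)
    lines = []
    for name, obj in inspect.getmembers(module, inspect.isfunction):
        if obj.__module__ != module_name:
            continue  # skip functions re-exported from other modules
        try:
            sig = inspect.signature(obj)
        except ValueError:
            continue  # some callables don't expose a signature
        doc = (inspect.getdoc(obj) or "").splitlines()
        summary = doc[0] if doc else ""
        lines.append(f"{module_name}.{name}{sig}  # {summary}")
    return "\n".join(lines)

if __name__ == "__main__":
    # Example: list the actual signatures of the standard-library json module,
    # so generated calls would match the real parameters rather than guesses.
    print(signature_context("json"))
```

Wiring something like this into the prompt-building step would be straightforward; whether the big tool vendors bother is a different question.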
I think this is an important critique in part because it cuts against the argument that "LLMs are the modern compilers; if you reject them you're just like the people who wanted to keep hand-writing assembly code, and you'll be just as obsolete." Compilers actually represented a great win for abstraction, encapsulation, and DRY in general, and they supported, and are integral to, open-source development, whereas LLMs are set to do the opposite.
[1] To see what this looks like in action in prose, see the example on page 30 of the NYTimes copyright complaint against OpenAI.
#AI #GenAI #LLMs #VibeCoding
from my link log —
Firefox's optimized zip format: reading zip files really quickly.
https://taras.glek.net/posts/optimized-zip-format/
saved 2025-07-03
Just got a small space heater and damn it’s so much more comfortable here now
As a @… fan I’d prefer if my AC/heat pump supported heating mode, but welp it was already installed when I moved here so an electric heater will do it for the 2 days a year it’s actually cold enough for that to be useful here. Getting all that heat from the bedroom to the living room would be a problem anyway.
And I didn’t even need to make a fire hazard in order to use it since a 20A outlet was already around due to the coffee thingy, yay!
It’s been on for just around 40mins and it’s already sooo much better in here. The temp sensor isn’t on the side the heater is pointing at (the sofa), so it’ll take a while to reflect the change, especially since this is a big room (kitchen/dining/living), but just pointing the heater at where I’m sitting is enough to make it a comfortable temperature (and probably even way too hot in a bit).
I’ve been wanting this for a while, but it never felt worth it bc we don’t really have many cold days here. Tho this year we got some more I think, and today was especially cold (9~11°C), so I decided to just do it. Extra points bc it was available on fucking iFood of all places, so it arrived less than an hour after I ordered it lmao.
I had this idea for a Cairn lifepath generator where there are three stages of life and you roll 1d6 for your stats at each stage, and also get appropriate items.
It has not been playtested, it's barely been proofread, but I've been having a lot of fun generating guys
perchance.org/lt8m69fg35
#ttrpg #CairnRpg
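A rough sketch of the procedure as described above, reading "roll 1d6 for your stats at each stage" as one d6 added to each stat per stage (the stage names and item tables below are placeholders I made up; the real tables live in the perchance generator):

```python
import random

# Rough sketch of the lifepath idea: three stages of life, 1d6 added to
# each stat per stage, plus a stage-appropriate item. Stage names and item
# tables here are placeholders, not the actual generator's tables.
STAGES = {
    "childhood": ["wooden toy sword", "lucky pebble", "loyal stray dog"],
    "apprenticeship": ["worn tools", "guild token", "half-finished map"],
    "adulthood": ["sturdy cloak", "old debt ledger", "iron lantern"],
}

def d6() -> int:
    return random.randint(1, 6)

def generate_character() -> tuple[dict, list]:
    stats = {"STR": 0, "DEX": 0, "WIL": 0}
    items = []
    for stage, table in STAGES.items():
        for stat in stats:
            stats[stat] += d6()  # one d6 per stat at each life stage
        items.append((stage, random.choice(table)))
    return stats, items

if __name__ == "__main__":
    stats, items = generate_character()
    print(stats)
    for stage, item in items:
        print(f"{stage}: {item}")
```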
Identifying Long Radio Transients with Accompanying X-Ray Emission as Disk-Jet Precessing Black Holes: The Case of ASKAP J1832-0911
Antonios Nathanail
https://arxiv.org/abs/2506.17389
With the first #AAS246 presser "Cosmic Accelerators and Active Black Holes" came the papers:
- https://iopscience.iop.org/article/10.3847/1538-4357/adb7e0 (Discovery of a Pulsar Wind Nebula Candidate Associated with the Galactic PeVatron 1LHAASO J0343 5254u) and https://iopscience.iop.org/article/10.3847/2515-5172/adccb9 (Swift-XRT Observations and Upper Limits at Five LHAASO Galactic Sources), with the press release https://natsci.msu.edu/news/2025/2025-06-where-did-cosmic-rays-come-from.aspx
- https://iopscience.iop.org/article/10.3847/1538-4357/adccc1 (Radial Profiles of Radio Halos in Massive Galaxy Clusters: Diffuse Giants Over 2 Mpc) etc., with the press release https://www.cfa.harvard.edu/news/record-breaking-cosmic-structure-discovered-colossal-galaxy-cluster
- https://arxiv.org/abs/2504.09676 (Investigating the Emission Mechanism in the Spatially Resolved Jets of Two z ≈ 3 Radio-loud Quasars), with the press releases https://chandra.si.edu/photo/2025/j1610/ and https://www.nasa.gov/image-article/nasas-chandra-sees-surprisingly-strong-black-hole-jet-at-cosmic-noon/
- and the abstract https://ui.adsabs.harvard.edu/abs/2025AAS...24520421S/abstract (NIRCam imaging of NGC 4258).
10th Enemy Encounters Webinar “Rumours and 19th-20th Century Religious Resistance, State Repression and Maoist Campaigns in China”
https://ift.tt/Gdnl5hN
CFP: Failure: Understanding Art as Process, 1150-1750 (Florence, 15-17 Oct 20) Call for…
via Input 4 RELCFP
VOTE: Vision-Language-Action Optimization with Trajectory Ensemble Voting
Juyi Lin, Amir Taherin, Arash Akbari, Arman Akbari, Lei Lu, Guangyu Chen, Taskin Padir, Xiaomeng Yang, Weiwei Chen, Yiqian Li, Xue Lin, David Kaeli, Pu Zhao, Yanzhi Wang
https://arxiv.org/abs/2507.05116
1D Vlasov Simulations of QED Cascades Over Pulsar Polar Caps
Dingyi Ye, Alexander Y. Chen
https://arxiv.org/abs/2507.15804