Tootfinder

Opt-in global Mastodon full text search. Join the index!

@netzschleuder@social.skewed.de
2025-07-31 15:00:05

email_enron: Email network (Enron corpus)
The Enron email corpus, containing all the email communication from the Enron corporation, which was made public as a result of legal action. Nodes are email addresses and node i links to node j if i sent at least one email to address j. Non-Enron email addresses are also present, but only their links to/from Enron addresses are observed.
This network has 36692 nodes and 367662 edges.
Tags: Social, Communication, Unweighted, Multigr…

email_enron: Email network (Enron corpus). 36692 nodes, 367662 edges. https://networks.skewed.de/net/email_enron
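For anyone who wants to poke at this dataset directly, it can be pulled from Netzschleuder by name; a minimal sketch, assuming the graph-tool Python package is installed (its collection.ns accessor fetches networks from networks.skewed.de):

    # Minimal sketch: load the email_enron network from Netzschleuder.
    from graph_tool import collection

    g = collection.ns["email_enron"]        # downloaded on first access
    print(g.num_vertices(), g.num_edges())  # expect 36692 and 367662

    # Edges are directed: i -> j means address i emailed address j at
    # least once; the dataset is tagged as a multigraph, so parallel
    # edges may appear.
    for e in list(g.edges())[:5]:
        print(int(e.source()), "->", int(e.target()))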
@tiotasram@kolektiva.social
2025-07-31 16:25:48

LLM coding is the opposite of DRY
An important principle in software engineering is DRY: Don't Repeat Yourself. We recognize that having the same code copied in more than one place is bad for several reasons:
1. It makes the entire codebase harder to read.
2. It increases maintenance burden, since any problems in the duplicated code need to be solved in more than one place.
3. It makes the code more error-prone and harder to debug, because the copies can drift apart if changes to one aren't transferred to the other (maybe the person making the change has forgotten there was a copy).
All modern programming languages make it almost entirely unnecessary to repeat code: we can move the repeated code into a "function" or "module" and then reference it from all the different places it's needed. At a larger scale, someone might write an open-source "library" of such functions or modules, and instead of re-implementing that functionality ourselves, we can use their code, with an acknowledgement.

Using another person's library this way is complicated, because now you're dependent on them: if they stop maintaining it or introduce bugs, you've inherited a problem. Still, you could always copy their project and maintain your own version, and it would not be much more work than if you had implemented everything yourself from the start. It's a little more complicated than this, but the basic principle holds, and it's a foundational one for software development in general and the open-source movement in particular.

The network of "citations" that forms as open-source software builds on other open-source software, and as people contribute patches to each others' projects, is a lot of what makes the movement into a community, and it can lead to collaborations that drive further development. So the DRY principle is important at both small and large scales.
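To make that concrete, here's a minimal sketch in Python (the names are invented for illustration): the duplicated check moves into one shared function that every caller references.

    # One shared validator instead of the same check copy-pasted into
    # each function that handles an email address.
    def validate_email(email: str) -> str:
        """Reject obviously malformed addresses; return the input unchanged."""
        if "@" not in email or email.startswith("@"):
            raise ValueError(f"malformed email address: {email!r}")
        return email

    def register_user(email: str) -> None:
        validate_email(email)  # single source of truth
        print(f"registered {email}")

    def invite_user(email: str) -> None:
        validate_email(email)  # a fix to validate_email covers both call sites
        print(f"invited {email}")

Any bug fix or behavior change to the validation now lands in exactly one place, which is the whole point of the principle.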
Unfortunately, the current crop of hyped-up LLM coding systems from the big players is antithetical to DRY at all scales:
- At the library scale, they train on open-source software but then (with some unknown frequency) replicate parts of it line-for-line *without* any citation [1]. The person using the LLM has no way of knowing that this happened, or even any way to check for it. In theory the LLM company could build a system for this, but it's not likely to be profitable unless the courts actually start punishing these license violations, which doesn't seem likely based on results so far and the difficulty of finding out that the violations are happening. By creating these copies (and also mash-ups, along with lots of less-problematic stuff), the LLM users (enabled and encouraged by the LLM-peddlers) are directly undermining the DRY principle. If we get what the big AI companies claim to want, which is a massive shift towards machine-authored code, DRY at the library scale will effectively be dead, with each new project simply re-implementing the functionality it needs instead of ever using a library. This might seem to have some upside, since dependency hell is a thing, but the downside in terms of comprehensibility, and therefore maintainability, correctness, and security, will be massive. The eventual lack of new high-quality DRY-respecting code to train the models on will only make this problem worse.
- At the module & function level, AI is probably prone to re-writing rather than re-using the functions or needs, especially with a workflow where a human prompts it for many independent completions. This part I don't have direct evidence for, since I don't use LLM coding models myself except in very specific circumstances because it's not generally ethical to do so. I do know that when it tries to call existing functions, it often guesses incorrectly about the parameters they need, which I'm sure is a headache and source of bugs for the vibe coders out there. An AI could be designed to take more context into account and use existing lookup tools to get accurate function signatures and use them when generating function calls, but even though that would probably significantly improve output quality, I suspect it's the kind of thing that would be seen as too-baroque and thus not a priority. Would love to hear I'm wrong about any of this, but I suspect the consequences are that any medium-or-larger sized codebase written with LLM tools will have significant bloat from duplicate functionality, and will have places where better use of existing libraries would have made the code simpler. At a fundamental level, a principle like DRY is not something that current LLM training techniques are able to learn, and while they can imitate it from their training sets to some degree when asked for large amounts of code, when prompted for many smaller chunks, they're asymptotically likely to violate it.
I think this is an important critique in part because it cuts against the argument that "LLMs are the modern compilers, if you reject them you're just like the people who wanted to keep hand-writing assembly code, and you'll be just as obsolete." Compilers actually represented a great win for abstraction, encapsulation, and DRY in general, and they supported and are integral to open-source development, whereas LLMs are set to do the opposite.
[1] To see what this looks like in action in prose, see the example on page 30 of the NYTimes copyright complaint against OpenAI.
#AI #GenAI #LLMs #VibeCoding

@Dragofix@veganism.social
2025-06-30 23:42:42

Banks bet big on fossil fuels, boosting financing in 2024, report finds news.mongabay.com/2025/06/bank

@arXiv_csCV_bot@mastoxiv.page
2025-07-30 10:40:31

Motion Matters: Motion-guided Modulation Network for Skeleton-based Micro-Action Recognition
Jihao Gu, Kun Li, Fei Wang, Yanyan Wei, Zhiliang Wu, Hehe Fan, Meng Wang
arxiv.org/abs/2507.21977

@arXiv_csLG_bot@mastoxiv.page
2025-07-31 09:35:31

Parametrized Multi-Agent Routing via Deep Attention Models
Salar Basiri, Dhananjay Tiwari, Srinivasa M. Salapaka
arxiv.org/abs/2507.22338 a…

@metacurity@infosec.exchange
2025-07-25 17:55:48

Russian bots pose as The Insider to spread disinfo about Moldova’s President Maia Sandu, alleging she uses orphans to win votes
theins.press/en/news/283467

@arXiv_physicsedph_bot@mastoxiv.page
2025-07-28 08:03:21

Informal Education is Essential to Physics: Findings of the 2024 JNIPER Summit and Recommendations for Action
Alexandra C. Lau, Jessica R. Hoehn, Michael B. Bennett, Claudia Fracchiolla, Kathleen Hinko, Noah Finkelstein, Jacqueline Acres, Lindsey D. Anderson, Shane D. Bergin, Cherie Bornhorst, Turhan K. Carroll, Michael Gregory, Cameron Hares, E. L. Hazlett, Meghan Healy, Erik A Herman, Lindsay R. House, Michele W. McColgan, Brad McLain, Azar Panah, Sarah A. Perdue, Jonathan D. Perry, …

@qurlyjoe@mstdn.social
2025-07-21 02:46:17

A map of the #Fediverse shows every instance and its connections to all the other instances, as a snapshot or live-action. Non-federated instances might or might not be shown, depending on what you're looking for and how you're looking. A wire-frame map could show different skins maybe, to suit the purpose. Kind of like a wire-frame diagram of a brain's network of synapses and shit.

@burger_jaap@mastodon.social
2025-07-10 14:08:52

The most recent heatwave in Brussels (from 30 June to 2 July) led to a 15% increase in electricity consumption, according to the region's DSO.
Assuming that 🍦 production is only a marginal component of this, demineralisation and building insulation measures should be at the forefront of action.

The graph shows electricity consumption on the Sibelga network in Brussels: the average temperatures during the heatwave and reference period are also indicated.
@jlpiraux@wallonie-bruxelles.social
2025-06-10 15:11:02

"L'obstruction, le retard ou l'affaiblissement de réglementations démocratiquement adoptées signalent un changement de priorités en faveur du profit privé Š court terme, au détriment des objectifs sociaux et environnementaux Š long terme"
#RulesToProtect #démocratie

@dichotomiker@dresden.network
2025-07-16 08:08:35

Andor.
Film noir plus some action in a rich environment. (But still very Disney; annoying companion droids, for example.)
It's "Luck makes the galaxy turn, right?" instead of "Remember, the force will be with you, always."

@arXiv_mathPR_bot@mastoxiv.page
2025-06-17 10:27:37

Noise-induced stabilization in a chemical reaction network without boundary effects
Andrea Agazzi, Lucie Laurence
arxiv.org/abs/2506.12163

@primonatura@mstdn.social
2025-07-02 15:00:36

"Banks bet big on fossil fuels, boosting financing in 2024, report finds"
#FossilFuels #Climate #ClimateChange

@arXiv_csNI_bot@mastoxiv.page
2025-06-09 07:49:53

Pegasus: A Universal Framework for Scalable Deep Learning Inference on the Dataplane
Yinchao Zhang, Su Yao, Yong Feng, Kang Chen, Tong Li, Zhuotao Liu, Yi Zhao, Lexuan Zhang, Xiangyu Gao, Feng Xiong, Qi Li, Ke Xu
arxiv.org/abs/2506.05779

@arXiv_csCV_bot@mastoxiv.page
2025-07-17 10:26:40

DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition
Hayat Ullah, Muhammad Ali Shafique, Abbas Khan, Arslan Munir
arxiv.org/abs/2507.12426

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 08:06:25

FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens
Yiming Zhong, Yumeng Liu, Chuyang Xiao, Zemin Yang, Youzhuo Wang, Yufei Zhu, Ye Shi, Yujing Sun, Xinge Zhu, Yuexin Ma
arxiv.org/abs/2506.01583

@arXiv_statAP_bot@mastoxiv.page
2025-06-13 09:40:20

Educational Intervention Re-Wires Social Interactions in Isolated Village Networks
Marios Papamichalis, Laura Forastiere, Edoardo M. Airoldi, Nicholas A. Christakis
arxiv.org/abs/2506.10496

@arXiv_csRO_bot@mastoxiv.page
2025-07-09 09:09:42

Hybrid Diffusion Policies with Projective Geometric Algebra for Efficient Robot Manipulation Learning
Xiatao Sun, Yuxuan Wang, Shuo Yang, Yinxing Chen, Daniel Rakita
arxiv.org/abs/2507.05695