Tootfinder

Opt-in global Mastodon full text search. Join the index!

@heiseonline@social.heise.de
2025-08-02 16:14:00

Belgian court orders blocking of the Internet Archive's Open Library
A Brussels court has issued a very broad website-blocking order. It targets the Open Library as well as shadow libraries such as Z-Library.

@kubikpixel@chaos.social
2025-08-03 06:20:38

"Belgian court orders blocking of the Internet Archive's Open Library:
A Brussels court has issued a very broad website-blocking order. It targets the Open Library as well as shadow libraries such as Z-Library"
Archiving, in whatever technical form, is important and has nothing to do with data theft. Unfortunately, commercial interests often regard it as exactly that.
🤨

@heiseonline@social.heise.de
2025-08-03 05:00:13

Some of the #News items shared most often here recently:
Belgian court orders blocking of the Internet Archive's Open Library

@v_i_o_l_a@openbiblio.social
2025-07-02 12:08:45

"#OpenAccess and #Citation #Impact: Modality, Funding, Publisher, and Disciplinary Trends at the University of Kentucky"

@arXiv_eessIV_bot@mastoxiv.page
2025-08-01 08:22:01

MRpro - open PyTorch-based MR reconstruction and processing package
Felix Frederik Zimmermann, Patrick Schuenke, Christoph S. Aigner, Bill A. Bernhardt, Mara Guastini, Johannes Hammacher, Noah Jaitner, Andreas Kofler, Leonid Lunin, Stefan Martin, Catarina Redshaw Kranich, Jakob Schattenfroh, David Schote, Yanglei Wu, Christoph Kolbitsch

@tiotasram@kolektiva.social
2025-07-31 16:25:48

LLM coding is the opposite of DRY
An important principle in software engineering is DRY: Don't Repeat Yourself. We recognize that having the same code copied in more than one place is bad for several reasons:
1. It makes the entire codebase harder to read.
2. It increases maintenance burden, since any problems in the duplicated code need to be solved in more than one place.
3. Because it becomes possible for the copies to drift apart if changes to one aren't transferred to the other (maybe the person making the change has forgotten there was a copy) it makes the code more error-prone and harder to debug.
All modern programming languages make it almost entirely unnecessary to repeat code: we can move the repeated code into a "function" or "module" and then reference it from all the different places it's needed. At a larger scale, someone might write an open-source "library" of such functions or modules and instead of re-implementing that functionality ourselves, we can use their code, with an acknowledgement. Using another person's library this way is complicated, because now you're dependent on them: if they stop maintaining it or introduce bugs, you've inherited a problem, but still, you could always copy their project and maintain your own version, and it would be not much more work than if you had implemented stuff yourself from the start. It's a little more complicated than this, but the basic principle holds, and it's a foundational one for software development in general and the open-source movement in particular. The network of "citations" as open-source software builds on other open-source software and people contribute patches to each other's projects is a lot of what makes the movement into a community, and it can lead to collaborations that drive further development. So the DRY principle is important at both small and large scales.
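As a minimal sketch of the small-scale case described above: the function names and the name-cleanup task are hypothetical, chosen only for illustration and not taken from the post. The cleanup logic lives in one helper, and both call sites reference it instead of carrying their own copy.

```python
# Hypothetical example: one shared helper instead of two pasted copies.

def normalize_name(raw: str) -> str:
    """Trim whitespace, collapse runs of spaces, and title-case a name."""
    return " ".join(raw.split()).title()


def greet(raw_name: str) -> str:
    # Reuses the helper rather than repeating the cleanup logic inline.
    return f"Hello, {normalize_name(raw_name)}!"


def badge_label(raw_name: str, team: str) -> str:
    # Second call site: any fix to normalize_name reaches both functions.
    return f"{normalize_name(raw_name)} ({team})"
```

If the cleanup rule ever changes, only `normalize_name` needs editing, so the two callers cannot drift apart.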
Unfortunately, the current crop of hyped-up LLM coding systems from the big players are antithetical to DRY at all scales:
- At the library scale, they train on open source software but then (with some unknown frequency) replicate parts of it line-for-line *without* any citation [1]. The person who was using the LLM has no way of knowing that this happened, or even any way to check for it. In theory the LLM company could build a system for this, but it's not likely to be profitable unless the courts actually start punishing these license violations, which doesn't seem likely based on results so far and the difficulty of finding out that the violations are happening. By creating these copies (and also mash-ups, along with lots of less-problematic stuff), the LLM users (enabled and encouraged by the LLM-peddlers) are directly undermining the DRY principle. If we see what the big AI companies claim to want, which is a massive shift towards machine-authored code, DRY at the library scale will effectively be dead, with each new project simply re-implementing the functionality it needs instead of ever using a library. This might seem to have some upside, since dependency hell is a thing, but the downside in terms of comprehensibility and therefore maintainability, correctness, and security will be massive. The eventual lack of new high-quality DRY-respecting code to train the models on will only make this problem worse.
- At the module & function level, AI is probably prone to re-writing rather than re-using the functions it needs, especially with a workflow where a human prompts it for many independent completions. This part I don't have direct evidence for, since I don't use LLM coding models myself except in very specific circumstances because it's not generally ethical to do so. I do know that when it tries to call existing functions, it often guesses incorrectly about the parameters they need, which I'm sure is a headache and source of bugs for the vibe coders out there. An AI could be designed to take more context into account and use existing lookup tools to get accurate function signatures and use them when generating function calls, but even though that would probably significantly improve output quality, I suspect it's the kind of thing that would be seen as too-baroque and thus not a priority. Would love to hear I'm wrong about any of this, but I suspect the consequences are that any medium-or-larger sized codebase written with LLM tools will have significant bloat from duplicate functionality, and will have places where better use of existing libraries would have made the code simpler. At a fundamental level, a principle like DRY is not something that current LLM training techniques are able to learn, and while they can imitate it from their training sets to some degree when asked for large amounts of code, when prompted for many smaller chunks, they're asymptotically likely to violate it.
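To make the "lookup tools" idea concrete, here is a rough sketch of checking a proposed call against a function's real signature using Python's standard `inspect` module. The `resize` function and both helper names are hypothetical and exist only for illustration; nothing here describes how any actual LLM coding tool works.

```python
import inspect


def describe_signature(func) -> str:
    """Return the function's name and parameter list as text."""
    return f"{func.__name__}{inspect.signature(func)}"


def call_matches(func, *args, **kwargs) -> bool:
    """Check whether the arguments fit the signature, without calling func."""
    try:
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError:
        return False
    return True


# Hypothetical existing function that generated code might want to call.
def resize(width: int, height: int, *, keep_aspect: bool = True) -> tuple:
    return (width, height, keep_aspect)
```

Here `call_matches(resize, 10, 20, True)` is rejected because `keep_aspect` is keyword-only, which is the kind of wrong parameter guess the post mentions.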
I think this is an important critique in part because it cuts against the argument that "LLMs are the modern compilers, if you reject them you're just like the people who wanted to keep hand-writing assembly code, and you'll be just as obsolete." Compilers actually represented a great win for abstraction, encapsulation, and DRY in general, and they supported and are integral to open source development, whereas LLMs are set to do the opposite.
[1] to see what this looks like in action in prose, see the example on page 30 of the NYTimes copyright complaint against OpenAI (#AI #GenAI #LLMs #VibeCoding

@frankel@mastodon.top
2025-06-24 16:15:02

Carrot Cache: High-Performance, SSD-Friendly #Caching #Library for #Java

@timbray@cosocial.ca
2025-06-22 01:20:45

So, @… is working on using LLMs to process XML. Except, the models can’t write legal XML. So he’s using the model to generate a sloppy-XML parser: lucumr.pocoo.org/202…

@awinkler@openbiblio.social
2025-07-24 21:12:41

this looks very elegant; I think the mixture of persistence and flexibility (through suffixes, variants, and content negotiation) is very intriguing and it's very unfortunate that so many cultural heritage projects in Germany have opted for DOIs or URNs.
VIA: @…

@boris@cosocial.ca
2025-05-29 04:55:33

Met @… at @… event. #CoSocialCa members in the wild.

Boris open mouth wave selfie with JDD at Internet Archive Canada Permanent Library.

@gap@glammr.us
2025-06-18 10:36:21

Open Letter to CRL from the academic wing of #CripLib - ACRLog
acrlog.or…

@arXiv_eessSY_bot@mastoxiv.page
2025-06-27 09:23:39

DPLib: A Standard Benchmark Library for Distributed Power System Analysis and Optimization
Milad Hasanzadeh, Amin Kargarian
arxiv.org/abs/2506.20819

@ronaldsnijder@mastodon.social
2025-06-18 07:12:40

FAU University Press: Now in the top catalogs for open access publications ub.fau.de/en/2025/06/17/fau-un

@arXiv_csSE_bot@mastoxiv.page
2025-07-22 10:21:10

Harnessing LLMs for Document-Guided Fuzzing of OpenCV Library
Bin Duan, Tarek Mahmud, Meiru Che, Yan Yan, Naipeng Dong, Dan Dongseong Kim, Guowei Yang
arxiv.org/abs/2507.14558

@mia@hcommons.social
2025-07-18 13:47:39

#DH2025 Listening to Victoria and Thea on 'Building a FAIR data future at the Journal of Open Humanities' - I'm hoping you'll see a lot more British Library data papers over time, as along with datasheets for datasets it's a big part of making our open collections findable and usable

@tiotasram@kolektiva.social
2025-06-24 09:39:49

Subtooting since people in the original thread wanted it to be over, but selfishly tagging @… and @… whose opinions I value...
I think that saying "we are not a supply chain" is exactly what open-source maintainers should be doing right now in response to "open source supply chain security" threads.
I can't claim to be an expert and don't maintain any important FOSS stuff, but I do release almost all of my code under open licenses, and I do use many open source libraries, and I have felt the pain of needing to replace an unmaintained library.
There's a certain small-to-mid-scale class of program, including many open-source libraries, which can be built/maintained by a single person, and which to my mind best operate on a "snake growth" model: incremental changes/fixes, punctuated by periodic "skin-shedding" phases where major rewrites or version updates happen. These projects aren't immortal either: as the whole tech landscape around them changes, they become unnecessary and/or people lose interest, so they go unmaintained and eventually break. Each time one of their dependencies breaks (or has a skin-shedding moment) there's a higher probability that they break or shed too, as maintenance needs shoot up at these junctures. Unless you're a company trying to make money from a single long-lived app, it's actually okay that software churns like this, and if you're a company trying to make money, your priorities absolutely should not factor into any decisions people making FOSS software make: we're trying (and to a huge extent succeeding) to make a better world (and/or just have fun with our own hobbies and share that fun with others) that leaves behind the corrosive & planet-destroying plague which is capitalism, and you're trying to personally enrich yourself by embracing that plague. The fact that capitalism is *evil* is not an incidental thing in this discussion.
To make an imperfect analogy, imagine that the peasants of some domain have set up a really-free-market, where they provide each other with free stuff to help each other survive, sometimes doing some barter perhaps but mostly just everyone bringing their surplus. Now imagine the lord of the domain, who is the source of these peasants' immiseration, goes to this market secretly & takes some berries, which he uses as one ingredient in delicious tarts that he then sells for profit. But then the berry-bringer stops showing up to the free market, or starts bringing a different kind of fruit, or even ends up bringing rotten berries by accident. And the lord complains "I have a supply chain problem!" Like, fuck off dude! Your problem is that you *didn't* want to build a supply chain and instead thought you would build your profit-focused business on other people's free stuff. If you were paying the berry-picker, you'd have a supply chain problem, but you weren't, so you really have an "I want more free stuff" problem when you can't be arsed to give away your own stuff for free.
There can be all sorts of problems in the really-free-market, like maybe not enough people bring socks, so the peasants who can't afford socks are going barefoot, and having foot problems, and the peasants put their heads together and see if they can convince someone to start bringing socks, and maybe they can't and things are a bit sad, but the really-free-market was never supposed to solve everyone's problems 100% when they're all still being squeezed dry by their taxes: until they are able to get free of the lord & start building a lovely anarchist society, the really-free-market is a best-effort kind of deal that aims to make things better, and sometimes will fall short. When it becomes the main way goods in society are distributed, and when the people who contribute aren't constantly drained by the feudal yoke, at that point the availability of particular goods is a real problem that needs to be solved, but at that point, it's also much easier to solve. And at *no* point does someone coming into the market to take stuff only to turn around and sell it deserve anything from the market or those contributing to it. They are not a supply chain. They're trying to help each other out, but even then they're doing so freely and without obligation. They might discuss amongst themselves how to better coordinate their mutual aid, but they're not going to end up forcing anyone to bring anything or even expecting that a certain person contribute a certain amount, since the whole point is that the thing is voluntary & free, and they've all got changing life circumstances that affect their contributions. Celebrate whatever shows up at the market, express your desire for things that would be useful, but don't impose a burden on anyone else to bring a specific thing, because otherwise it's fair for them to impose such a burden on you, and now you two are doing your own barter thing that's outside the parameters of the really-free-market.

@arXiv_nlinPS_bot@mastoxiv.page
2025-06-26 08:20:10

rd-spiral: An open-source Python library for learning 2D reaction-diffusion dynamics through pseudo-spectral method
Sandy H. S. Herho, Iwan P. Anwar, Rusmawan Suwarman
arxiv.org/abs/2506.20633

@v_i_o_l_a@openbiblio.social
2025-07-24 12:24:20

"Open Access and Citation Impact: Modality, Funding, Publisher, and Disciplinary Trends at the University of Kentucky" #OpenAccess

@arXiv_statME_bot@mastoxiv.page
2025-07-25 08:26:32

Spatialize v1.0: A Python/C Library for Ensemble Spatial Interpolation
Alvaro F. Egaña, Alejandro Ehrenfeld, Felipe Garrido, María Jesús Valenzuela, Juan F. Sánchez-Pérez
arxiv.org/abs/2507.17867

@gscherer2@social.linux.pizza
2025-06-21 16:36:52

Cactus Flowers. Huntington Library, San Marino, California, USA. June, 2025. #huntingtonlibrary #cactüs #cactusflower

Close up of two cactus flowers. One is just starting to open, the other is open with white central petals surrounded by reddish pink petals. The background is dark green and out of focus.

@arXiv_csCL_bot@mastoxiv.page
2025-07-28 09:57:51

TokenSmith: Streamlining Data Editing, Search, and Inspection for Large-Scale Language Model Training and Interpretability
Mohammad Aflah Khan, Ameya Godbole, Johnny Tian-Zheng Wei, Ryan Wang, James Flemings, Krishna Gummadi, Willie Neiswanger, Robin Jia
arxiv.org/abs/2507.19419

@arXiv_mathNA_bot@mastoxiv.page
2025-06-26 08:31:30

DefElement: an encyclopedia of finite element definitions
Matthew W. Scroggs, Pablo D. Brubeck, Joseph P. Dean, Jørgen S. Dokken, India Marsden
arxiv.org/abs/2506.20188

@lapizistik@social.tchncs.de
2025-06-19 20:37:50

I learned¹ about the Baldwin Library of Historical Children's Literature² that has more than 10000 books scanned and available online. Just great.
It is hosted by the University of Florida. So let's hope that it stays available, i.e. that the Republicans don't find the old children's books from 1750 too woke.³
__
¹via

@v_i_o_l_a@openbiblio.social
2025-07-14 10:00:47

"How to Become an Integrity Sleuth in the Library"
katinamagazine.org/content/art
"Open access agreement management c…

@arXiv_csDC_bot@mastoxiv.page
2025-07-16 09:09:51

A new Dune grid for scalable dynamic adaptivity based on the p4est software library
Carsten Burstedde, Mikhail Kirilin, Robert Klöfkorn
arxiv.org/abs/2507.11386

@arXiv_csCR_bot@mastoxiv.page
2025-07-08 13:04:51

FIDESlib: A Fully-Fledged Open-Source FHE Library for Efficient CKKS on GPUs
Carlos Agulló-Domingo (Universidad de Murcia), Óscar Vera-López (Universidad de Murcia), Seyda Guzelhan (Boston University), Lohit Daksha (Boston University), Aymane El Jerari (Northeastern University), Kaustubh Shivdikar (Advanced Micro Devices), Rashmi Agrawal (Boston University), David Kaeli (Northeastern University), Ajay Joshi (Boston University), José L. Abellán (Universidad…

@Techmeme@techhub.social
2025-06-09 05:05:37

Cloudflare open sourced an OAuth library mostly written by Claude, showing how AI handles mechanical implementation while humans guide with context and judgment (Max Mitchell)
maxemitchell.com/writings/i-re

@arXiv_csMS_bot@mastoxiv.page
2025-05-27 07:22:20

f4ncgb: High Performance Gröbner Basis Computations in Free Algebras
Maximilian Heisinger, Clemens Hofstadler
arxiv.org/abs/2505.19304

@crell@phpc.social
2025-07-14 17:19:23

Is anyone looking for good first-timer OSS contributor issues? Crell/Serde has a few tagged "good first issue" if you're interested.
github.com/Crell/Serde/issues?

@arXiv_csNI_bot@mastoxiv.page
2025-07-08 08:44:40

OpenSN: An Open Source Library for Emulating LEO Satellite Networks
Wenhao Lu, Zhiyuan Wang, Hefan Zhang, Shan Zhang, Hongbin Luo
arxiv.org/abs/2507.03248

@arXiv_qbiobm_bot@mastoxiv.page
2025-06-26 09:13:00

ProCaliper: functional and structural analysis, visualization, and annotation of proteins
Jordan C. Rozum, Hunter Ufford, Alexandria K. Im, Tong Zhang, David D. Pollock, Doo Nam Kim, Song Feng
arxiv.org/abs/2506.19961

@ronaldsnijder@mastodon.social
2025-05-27 08:00:25

I wrote a little blogpost about #AI #bots that are like a plague of locusts on the #OAPEN #Library

@matthiasott@mastodon.social
2025-07-07 08:02:49

I’m trying to help a client pick a good UI framework they can start their product with, but ultimately grow into their own design system and component library. They have started development with React, which isn’t surprising, but they are also open to using a more framework-agnostic approach in the future.
Any suggestions for a really mature and solid, themeable framework as a starting point? Chakra UI? Ark UI? Radix?

@stefan@gardenstate.social
2025-07-09 13:53:17

ATM I don't see any end in sight for me sipping for tailwind. It solves all my problems and doesn't cause any.
Always open to being sold something new but I wanted tailwind since 2017 when I wanted to just use inline css instead of whatever css library I was using.

@azonenberg@ioc.exchange
2025-07-06 07:35:46

Help wanted: Can we get someone to go through the build/link time dependencies of ngscopeclient, identify every third-party open source library we use, and ensure that they're all credited properly in the documentation, and include/link to the text of the appropriate licenses?
github.com/ng…

@mia@hcommons.social
2025-06-11 20:24:27

Very excited about this! Code to access GRIN will help lots of Google Books partners, and the example might open other doors, as well as the obvious benefits of access to data!
'Institutional Books 1.0: A 242B token dataset from Harvard Library's collections, refined for accuracy and usability' arxiv.org/abs/2506…

@arXiv_eessSY_bot@mastoxiv.page
2025-06-18 09:03:03

PGLib-CO2: A Power Grid Library for Computing and Optimizing Carbon Emissions
Young-ho Cho, Min-Seung Ko, Hao Zhu
arxiv.org/abs/2506.14662

@arXiv_csSE_bot@mastoxiv.page
2025-06-24 11:02:40

SAVANT: Vulnerability Detection in Application Dependencies through Semantic-Guided Reachability Analysis
Wang Lingxiang, Quanzhi Fu, Wenjia Song, Gelei Deng, Yi Liu, Dan Williams, Ying Zhang
arxiv.org/abs/2506.17798

@arXiv_csSD_bot@mastoxiv.page
2025-06-17 10:18:09

Video-Guided Text-to-Music Generation Using Public Domain Movie Collections
Haven Kim, Zachary Novack, Weihan Xu, Julian McAuley, Hao-Wen Dong
arxiv.org/abs/2506.12573

@arXiv_mathOC_bot@mastoxiv.page
2025-07-09 08:11:42

MultiObjectiveAlgorithms.jl: a Julia package for solving multi-objective optimization problems
Oscar Dowson, Xavier Gandibleux, Gökhan Kof
arxiv.org/abs/2507.05501

@arXiv_csSE_bot@mastoxiv.page
2025-07-16 09:40:31

How Robust are LLM-Generated Library Imports? An Empirical Study using Stack Overflow
Jasmine Latendresse, SayedHassan Khatoonabadi, Emad Shihab
arxiv.org/abs/2507.10818

@arXiv_hepph_bot@mastoxiv.page
2025-06-03 16:44:54

This arxiv.org/abs/1910.14012 has been replaced.
link: scholar.google.com/scholar?q=a

@mia@hcommons.social
2025-07-04 14:22:45

Nice! Osma Suominen @… from National Library of Finland (Annif & FintoAI)'s 5 points for AI in libraries:
- Use AI to make the world better
- Use the smallest AI that works
- Don't depend on corporate AI
- Evaluate & create data sets
- Be open and transparent

@arXiv_csDC_bot@mastoxiv.page
2025-07-08 10:15:40

Demystifying NCCL: An In-depth Analysis of GPU Communication Protocols and Algorithms
Zhiyi Hu, Siyuan Shen, Tommaso Bonato, Sylvain Jeaugey, Cedell Alexander, Eric Spada, Jeff Hammond, Torsten Hoefler
arxiv.org/abs/2507.04786

@arXiv_csSE_bot@mastoxiv.page
2025-06-16 10:23:49

Understanding API Usage and Testing: An Empirical Study of C Libraries
Ahmed Zaki, Cristian Cadar
arxiv.org/abs/2506.11598

@ronaldsnijder@mastodon.social
2025-06-16 09:42:56

At @oapenbooks.bsky.social, we have updated our #Metadata feeds, to better integrate our #OpenAccess #books into #libraries