Tootfinder

Opt-in global Mastodon full text search. Join the index!

@villavelius@mastodon.online
2025-07-26 11:48:10

"The Seven Capital Sins of Open Science"
1. Worshiping the 'age factor'
2. Ignoring the value of data reuse and complexity
3. Disrespecting other disciplines
4. Publishing data without a supplementary paper
5. Creating and maintaining a nightmare for machines
6. Refusing to support investment in general infrastructure
7. Creating data without a FAIR and explicit data stewardship plan.

@awinkler@openbiblio.social
2025-06-27 11:45:42

Nice discovery today in the #bibliocon25 panel on FAIR data: Karin Schmidgall presents Data, the data offering of the Literaturarchiv Marbach #dla

@andres4ny@social.ridetrans.it
2025-06-26 20:30:52

Ah fuck, people are using LLMs for kernel code. They really are going to fuck over everything, aren't they?
lwn.net/SubscriberLink/1026558

comment from "comex", in a thread discussing a mistake in an LLM-generated commit:

"(Disclaimer: I am not sashal.)

…In other words, you're saying that the patch is buggy because it drops the __read_mostly attribute (which places the data in a different section).

That's a good reminder of how untrustworthy LLMs still are. Even for such a simple patch, the LLM was still able to make a subtle mistake.

To be fair, a human could definitely make the same mistake. And whatever humans revie…
Comment by "adobriyan" showing the commit in question, which replaces a "struct hlist_head event_hash[EVENT_HASHSIZE] __read_mostly" with "DEFINE_HASHTABLE(event_hash, EVENT_HASH_BITS)"

@arXiv_csIR_bot@mastoxiv.page
2025-06-24 11:07:00

A GenAI System for Improved FAIR Independent Biological Database Integration
Syed N. Sakib, Kallol Naha, Sajratul Y. Rubaiat, Hasan M. Jamil
arxiv.org/abs/2506.17934

@arXiv_csDL_bot@mastoxiv.page
2025-05-23 07:17:30

Towards Machine-actionable FAIR Digital Objects with a Typing Model that Enables Operations
Maximilian Inckmann, Nicolas Blumenröhr, Rossella Aversa
arxiv.org/abs/2505.16550

@arXiv_csNI_bot@mastoxiv.page
2025-07-25 09:04:22

Enhanced Velocity-Adaptive Scheme: Joint Fair Access and Age of Information Optimization in Vehicular Networks
Xiao Xu, Qiong Wu, Pingyi Fan, Kezhi Wang, Nan Cheng, Wen Chen, Khaled B. Letaief
arxiv.org/abs/2507.18328

@prachisrivas@masto.ai
2025-06-04 13:13:26

This looks very cool.
'OpenAIRE in collaboration with Area Science Park organizes a hands-on workshop titled “Where LEGO Meets FAIR Data,” designed to introduce the principles of FAIR data through a creative, interactive simulation using LEGO metaphors.'

@mia@hcommons.social
2025-07-18 13:47:39

#DH2025 Listening to Victoria and Thea on 'Building a FAIR data future at the Journal of Open Humanities' - I'm hoping you'll see a lot more British Library data papers over time, as, along with datasheets for datasets, they're a big part of making our open collections findable and usable

@arXiv_csGT_bot@mastoxiv.page
2025-06-23 09:48:20

A Vision for Trustworthy, Fair, and Efficient Socio-Technical Control using Karma Economies
Ezzat Elokda, Andrea Censi, Emilio Frazzoli, Florian Dörfler, Saverio Bolognani
arxiv.org/abs/2506.17115

@arXiv_csCR_bot@mastoxiv.page
2025-06-19 08:06:33

Fair Data Exchange with Constant-Time Proofs
Majid Khabbazian
arxiv.org/abs/2506.14944 arxiv.org/pdf/2506.14944

@arXiv_physicssocph_bot@mastoxiv.page
2025-06-17 11:50:01

A case study: the savings potential thanks to FAIR data in one Materials Science PhD project
Michael Seitz, Nick Garabedian, Ilia Bagov, Christian Greiner
arxiv.org/abs/2506.12043

@privacity@social.linux.pizza
2025-06-05 13:46:54

FPF Unveils Paper on State Data Minimization Trends
fpf.org/blog/fpf-unveils-paper
@…

@Gord1i@fosstodon.org
2025-06-22 08:52:21

As someone who uses #LLM s a fair bit, this sort of hallucination is good for reminding yourself that it's just bashing words together until it looks sort of like what's in its training data, especially in various RAG-type setups

Screenshot from Google search results summary, claiming a 1979 Pink Floyd song was popular during the *1976* Soweto uprising

@arXiv_csDL_bot@mastoxiv.page
2025-06-24 08:59:50

Cost for research -- how cost data of research can be included in open metadata to be reused and evaluated
Julia Bartlewski, Christoph Broschinski, Gernot Deinzer, Cornelia Lang, Dirk Pieper, Bianca Schweighofer, Colin Sippl, Lisa-Marie Stein, Alexander Wagner, Silke Weisheit
arxiv.org/abs/2506.18517

@gwire@mastodon.social
2025-07-21 15:25:58

> "Sites must demonstrate sufficient access to water to support at least 500MW of AI infrastructure."
The actual amounts are left as an exercise.
It's fair enough that it's difficult to guess how much water would be required for an AI data centre in 2030... but in the context of development applications, a figure representing a base minimum might be expected?

@arXiv_statML_bot@mastoxiv.page
2025-06-18 10:27:48

Meta Optimality for Demographic Parity Constrained Regression via Post-Processing
Kazuto Fukuchi
arxiv.org/abs/2506.13947

@arXiv_csDS_bot@mastoxiv.page
2025-07-14 08:23:51

On Fair Epsilon Net and Geometric Hitting Set
Mohsen Dehghankar, Stavros Sintos, Abolfazl Asudeh
arxiv.org/abs/2507.08758

@arXiv_csCY_bot@mastoxiv.page
2025-06-18 08:14:38

hyperFA*IR: A hypergeometric approach to fair rankings with finite candidate pool
Mauritz N. Cartier van Dissel, Samuel Martin-Gutierrez, Lisette Espín-Noboa, Ana María Jaramillo, Fariba Karimi
arxiv.org/abs/2506.14349

@arXiv_eessSY_bot@mastoxiv.page
2025-07-14 09:01:12

Large-Scale Processing and Validation of Grid Data for Assessing the Fair Spatial Distribution of PV Hosting Capacity
Ali Mohamed Ali, Yaser Raeisi, Plouton Grammatikos, Davide Pavanello, Pierre Roduit, Fabrizio Sossan
arxiv.org/abs/2507.08684

@dcm@social.sunet.se
2025-07-03 15:50:59

Interesting reporting and analysis by Gordon Hull about recent decisions in the US courts about AI and copyright:
newappsblog.com/2025/07/ai-and

@arXiv_csCL_bot@mastoxiv.page
2025-07-18 09:46:42

Overview of the TalentCLEF 2025: Skill and Job Title Intelligence for Human Capital Management
Luis Gasco, Hermenegildo Fabregat, Laura García-Sardiña, Paula Estrella, Daniel Deniz, Alvaro Rodrigo, Rabih Zbib
arxiv.org/abs/2507.13275

@arXiv_csLG_bot@mastoxiv.page
2025-07-04 10:14:31

Fair Deepfake Detectors Can Generalize
Harry Cheng, Ming-Hui Liu, Yangyang Guo, Tianyi Wang, Liqiang Nie, Mohan Kankanhalli
arxiv.org/abs/2507.02645

@arXiv_csSI_bot@mastoxiv.page
2025-07-08 08:04:30

Dominance or Fair Play in Social Networks? A Model of Influencer Popularity Dynamic
Franco Galante, Chiara Ravazzi, Luca Vassio, Michele Garetto, Emilio Leonardi
arxiv.org/abs/2507.03448

@tiotasram@kolektiva.social
2025-07-19 07:51:05

AI, AGI, and learning efficiency
My 4-month-old kid is not DDoSing Wikipedia right now, nor will they ever do so before learning to speak, read, or write. Their entire "training corpus" will not top even 100 million "tokens" before they can speak & understand language, and do so with real intentionality.
Just to emphasize that point: 100 words-per-minute times 60 minutes-per-hour times 12 hours-per-day times 365 days-per-year times 4 years is a mere 105,120,000 words. That's a ludicrously *high* estimate of words-per-minute and hours-per-day, and 4 years old (the age of my other kid) is well after basic speech capabilities are developed in many children, etc. More likely the available "training data" is at least 1 or 2 orders of magnitude less than this.
The point here is that large language models, trained as they are on multiple *billions* of tokens, are not developing their behavioral capabilities in a way that's remotely similar to humans, even if you believe those capabilities are similar (they are by certain very biased ways of measurement; they very much aren't by others). This idea that humans must be naturally good at acquiring language is an old one (see e.g. #AI #LLM #AGI
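The word-count estimate in this post can be checked with a short script. This is only a minimal sketch in Python: the constant and function names are illustrative and not from the post, and the figures are the post's own deliberately high assumptions.

```python
# Upper-bound estimate of words a child hears before age 4,
# using the (deliberately high) figures given in the post.
WORDS_PER_MINUTE = 100
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 12
DAYS_PER_YEAR = 365
YEARS = 4


def words_heard(words_per_minute: int, hours_per_day: int, years: int) -> int:
    """Total words heard at a constant rate over the given number of years."""
    return (
        words_per_minute
        * MINUTES_PER_HOUR
        * hours_per_day
        * DAYS_PER_YEAR
        * years
    )


upper_bound = words_heard(WORDS_PER_MINUTE, HOURS_PER_DAY, YEARS)
print(f"upper bound: {upper_bound:,} words")  # upper bound: 105,120,000 words

# The post suggests the realistic figure is 1 or 2 orders of magnitude lower.
for orders in (1, 2):
    print(f"{orders} order(s) lower: {upper_bound // 10**orders:,} words")
```

With these inputs the upper bound matches the 105,120,000 words quoted in the post; dividing by 10 or 100 gives roughly 10.5 million and 1.05 million words.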

@arXiv_csDL_bot@mastoxiv.page
2025-07-03 07:33:40

The hunt for research data: Development of an open-source workflow for tracking institutionally-affiliated research data publications
Bryan M. Gee
arxiv.org/abs/2507.01228

@arXiv_statML_bot@mastoxiv.page
2025-07-17 08:40:20

Incorporating Fairness Constraints into Archetypal Analysis
Aleix Alcacer, Irene Epifanio
arxiv.org/abs/2507.12021 ar…

@arXiv_eessAS_bot@mastoxiv.page
2025-07-16 08:25:51

Standardized Evaluation of Fetal Phonocardiography Processing Methods
Kristóf Müller, Janka Hatvani, Márton Áron Goda, Miklós Koller
arxiv.org/abs/2507.10783

@awinkler@openbiblio.social
2025-07-04 18:05:56

I wouldn't have thought that quantitative analyses of retrospective national bibliographies would be that painful: data access via SRU, OAI, and REST API; another resource has a JSON dump, another one again consists of various ttl for which you have to set up your own sparql endpoint. And I've not even arrived at formats, metadata standards and cataloguing peculiarities 🤯 so everything's #FAIR

@arXiv_csIR_bot@mastoxiv.page
2025-06-03 07:24:50

Curate, Connect, Inquire: A System for Findable Accessible Interoperable and Reusable (FAIR) Human-Robot Centered Datasets
Xingru Zhou, Sadanand Modak, Yao-Cheng Chan, Zhiyun Deng, Luis Sentis, Maria Esteva
arxiv.org/abs/2506.00220

@arXiv_mathOC_bot@mastoxiv.page
2025-06-04 07:47:13

BenLOC: A Benchmark for Learning to Configure MIP Optimizers
Hongpei Li, Ziyan He, Yufei Wang, Wenting Tu, Shanwen Pu, Qi Deng, Dongdong Ge
arxiv.org/abs/2506.02752

@arXiv_qbioOT_bot@mastoxiv.page
2025-06-13 09:29:30

The Cell Ontology in the age of single-cell omics
Shawn Zheng Kai Tan, Aleix Puig-Barbe, Damien Goutte-Gattat, Caroline Eastwood, Brian Aevermann, Alida Avola, James P Balhoff, Ismail Ugur Bayindir, Jasmine Belfiore, Anita Reane Caron, David S Fischer, Nancy George, Benjamin M Gyori, Melissa A Haendel, Charles Tapley Hoyt, Huseyin Kir, Tiago Lubiana, Nicolas Matentzoglu, James A Overton, Beverly Peng, Bjoern Peters, Ellen M Quardokus, Patrick L Ray, Paola Roncaglia, Andrea D Rivera, Ra…

@arXiv_csCR_bot@mastoxiv.page
2025-07-08 13:06:01

BackFed: An Efficient & Standardized Benchmark Suite for Backdoor Attacks in Federated Learning
Thinh Dao, Dung Thuy Nguyen, Khoa D Doan, Kok-Seng Wong
arxiv.org/abs/2507.04903

@arXiv_statAP_bot@mastoxiv.page
2025-06-05 07:39:17

Probabilistic measures afford fair comparisons of AIWP and NWP model output
Tilmann Gneiting, Tobias Biegert, Kristof Kraus, Eva-Maria Walz, Alexander I. Jordan, Sebastian Lerch
arxiv.org/abs/2506.03744

@arXiv_csCY_bot@mastoxiv.page
2025-06-05 09:37:52

This arxiv.org/abs/2505.13469 has been replaced.
initial toot: mastoxiv.page/@arXiv_csCY_…