
2025-07-18 12:24:24
Ukrainian hackers wipe databases at Russia's Gazprom in major cyberattack, intelligence source says: https://benborges.xyz/2025/07/18/ukrainian-hackers-wipe-databases-at.html
Multimodal Data Storage and Retrieval for Embodied AI: A Survey
Yihao Lu, Hao Tang
https://arxiv.org/abs/2508.13901 https://arxiv.org/pdf/2508.13901…
Opening The Black-Box: Explaining Learned Cost Models For Databases
Roman Heinrich, Oleksandr Havrylov, Manisha Luthra, Johannes Wehrstein, Carsten Binnig
https://arxiv.org/abs/2507.14495
Choose Democracy – Resist List
https://choosedemocracy.us/resist-list/
Finally took the time to VACUUM FULL all my #PostgreSQL #databases and empty a runaway table. Dropped from ~90% disk usage to ~25% and a hell of a lot less CPU
Querying Graph-Relational Data
Michael J. Sullivan, Zhibo Chen, Elvis Pranskevichus, Robert J. Simmons, Victor Petrovykh, Aljaž Mur Eržen, Yury Selivanov
https://arxiv.org/abs/2507.16089
"Does the cafe entrance look accessible? Where is the door?" Towards Geospatial AI Agents for Visual Inquiries
Jon E. Froehlich, Jared Hwang, Zeyu Wang, John S. O'Meara, Xia Su, William Huang, Yang Zhang, Alex Fiannaca, Philip Nelson, Shaun Kane
https://arxiv.org/abs/2508.15752
The Rest is Silence: Leveraging Unseen Species Models for Computational Musicology
Fabian C. Moss, Jan Hajič jr., Adrian Nachtwey, Laurent Pugin
https://arxiv.org/abs/2507.14638
Parkes transient events: II. Pulsar single pulses database containing raw data segment
Xuan Yang, S. B. Zhang, Le-Yu Tang, L. Toomey, Xue-Feng Wu
https://arxiv.org/abs/2508.14403
topology: Internet AS graph (2004)
An integrated snapshot of the structure of the Internet at the level of Autonomous Systems (ASs), reconstructed from multiple sources, including the RouteViews and RIPE BGP trace collectors, route servers, looking glasses, and the Internet Routing Registry databases. This snapshot was created around October 2004.
This network has 34761 nodes and 171403 edges.
Tags: Technological, Communication, Unweighted, Multigraph, Timestamps
Efficient and Scalable Self-Healing Databases Using Meta-Learning and Dependency-Driven Recovery
Joydeep Chandra, Prabal Manhas
https://arxiv.org/abs/2507.13757
Bookmarked: calfa-co/lexical-databases: Source files of Calfa.fr web-based dictionary. #Armenisch_Lexikon Source files of Calfa.fr web-base…
I Built a Bloom Filter Data Structure Simulator
https://coffeebytes.dev/en/databases/i-built-a-bloom-filter-data-structure-simulator/
Finite Axiomatizability by Disjunctive Existential Rules
Marco Calautti, Marco Console, Andreas Pieris
https://arxiv.org/abs/2508.11946 https://arxiv.org/p…
A Risk Manager for Intrusion Tolerant Systems: Enhancing HAL 9000 with New Scoring and Data Sources
Tadeu Freitas, Carlos Novo, Inês Dutra, João Soares, Manuel Correia, Benham Shariati, Rolando Martins
https://arxiv.org/abs/2508.13364
Digital-GenAI-Enhanced HCI in DevOps as a Driver of Sustainable Innovation: An Empirical Framework
Jun Cui
https://arxiv.org/abs/2508.13185 https://arxiv.o…
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Enabling Data Dependency-based Query Optimization
Daniel Lindner, Daniel Ritter, Felix Naumann
Knowing when to stop: insights from ecology for building catalogues, collections, and corpora
Jan Hajič jr., Fabian Moss
https://arxiv.org/abs/2507.14614 https://
PHP turned 30 last month. Why does this language that powers 75% of the web—including Facebook, Wikipedia, and WordPress—have such staying power? Find out by registering for this fall's DIG 540, UMaine's online course in building cultural databases.
https://DigitalCuration.UMaine.edu
CRED-SQL: Enhancing Real-world Large Scale Database Text-to-SQL Parsing through Cluster Retrieval and Execution Description
Shaoming Duan, Zirui Wang, Chuanyi Liu, Zhibin Zhu, Yuhao Zhang, Peiyi Han, Liang Yan, Zewu Penge
https://arxiv.org/abs/2508.12769
Software developer Heval Hazal Kurt discusses the pros and cons of relational databases, in comparison with their document-based counterparts, in this July 2025 article. Polyglot systems combining both paradigms are discussed as an ideal solution to differing data access needs within the same project.
"When to Choose NoSQL Over SQL"
A Distributed Learned Hash Table
Shengze Wang, Yi Liu, Xiaoxue Zhang, Liting Hu, Chen Qian
https://arxiv.org/abs/2508.14239 https://arxiv.org/pdf/2508.1423…
NV-like Defects More Common Than Four-Leaf Clovers: A Perspective on High-Throughput Point Defect Data
Joel Davidsson
https://arxiv.org/abs/2508.14223 https://
Are you running open-source databases on Kubernetes? At Berlin Buzzwords 2025, Peter Zaitsev discussed best practices for high availability, security, backups, and disaster recovery. Discover key pitfalls to avoid and learn how Operators can simplify database management for MySQL, MongoDB, and PostgreSQL in Kubernetes environments.
Watch the full session:
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Common Data Format (CDF): A Standardized Format for Match-Data in Football (Soccer)
Anzer, Arnsmeyer, Bauer, Bekkers, Brefeld, Davis, Evans, Kempe, Robertson, Smith, Van Haaren
The Home Office has permitted police to run facial recognition scans against images held in the UK’s passport and immigration databases without notifying Parliament or the public, say privacy campaigners.
https://www.computing.co.uk/news/2025/uk-p
Connecting, enriching & extending Copilot's understanding of "people" in your org to the individual profiles of your HR systems, internal databases, etc. to provide better, more accurate responses.
Original post: https://bsky.app/profile/did:plc:3y3e6
Heading back to Berlin after just 2 days of #IANLS25. Really happy with the 4 special sessions on digital technology. There's a strong focus on databases and corpora, but there were also papers on digital methods. Let's see how things develop over the next triennium!
@…
from my link log —
PostgreSQL replication slots: preventing WAL bloat and other production issues.
https://www.morling.dev/blog/mastering-postgres-replication-slots/
saved 2025-07-08
A bit of actually helpful #Sysadminnery...
So, you don't have a DBA on staff because the databases you "administer" are all designed by external app developers and a DBA would have no work? Your databases have been running w/o intervention for years, aren't generating errors? Getting used to everything kinda running slow?
Schedule an annual date to do the equivale…
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Confidence Estimation for Text-to-SQL in Large Language Models
Sepideh Entezari Maleki, Mohammadreza Pourreza, Davood Rafiei
Google says hackers stole its customers’ data in a breach of its Salesforce database
https://techcrunch.com/2025/08/06/google-says-hackers-stole-its-customers-data-in-a-breach-of-its-salesforce-database/
Efficient Semi-External Breadth-First Search
Xiaolong Wan, Xixian Han
https://arxiv.org/abs/2507.12925 https://arxiv.org/pdf/2507.129…
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Integrated data-driven biotechnology research environments
Rosalia Moreddu
https://…
As Palantir continues to expand its influence within the administration,
the Trump administration has given the company the right to surveil Americans.
In a chilling report, The New York Times notes that the company is already creating
“detailed portraits of Americans based on government data,”
with the Trump administration already seeking
“access to hundreds of data points on citizens and others through government databases,
including their bank accoun…
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- On the Effectiveness of Graph Reordering for Accelerating Approximate Nearest Neighbor Search on GPU
Yutaro Oguri, Mai Nishimura, Yusuke Matsui
Those companies that you pay to get your data out of data brokers’ databases are just creating databases so they can become data brokers, themselves, right?
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Who is Responsible When AI Fails? Mapping Causes, Entities, and Consequences of AI Privacy and Et...
Hilda Hadan, Reza Hadi Mogavi, Leah Zhang-Kennedy, Lennart E. Nacke
> Having to maintain half a dozen cursed #OpenSource pseudo-databases, because people absolutely must use this month's fad.
Another of them betrays community trust and changes license.
> Having to maintain a bunch of independent open source forks of said pseudo-database.
Rinse and repeat.
#Gentoo
Inside ICE’s Supercharged Facial Recognition App of 200 Million Images https://www.404media.co/inside-ices-supercharged-facial-recognition-app-of-200-million-images/
AlDBaran: Towards Blazingly Fast State Commitments for Blockchains
Bernhard Kauer, Aleksandr Petrosyan, Benjamin Livshits
https://arxiv.org/abs/2508.10493 https://
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Clustering with Set Outliers and Applications in Relational Clustering
Vaishali Surianarayanan, Neeraj Kumar, Stavros Sintos
Chat-Driven Text Generation and Interaction for Person Retrieval
Zequn Xie, Chuxin Wang, Sihang Cai, Yeqiang Wang, Shulei Wang, Tao Jin
https://arxiv.org/abs/2509.12662 https://…
Large Language Models in the Data Science Lifecycle: A Systematic Mapping Study
Sai Sanjna Chintakunta, Nathalia Nascimento, Everton Guimaraes
https://arxiv.org/abs/2508.11698 h…
Finding Inter-species Associations on Large Citizen Science Datasets
Jacob Deutsch
https://arxiv.org/abs/2508.14259 https://arxiv.org/pdf/2508.14259…
What are some of the key concepts and design choices behind modern, scalable, high-performance databases? At Berlin Buzzwords 2025, Guy Shtub discussed how a database delivers sub-millisecond 99th-percentile latency at a throughput of millions of operations per second, at scale, and how you can use it.
Watch the full session: https://
[2025-06-23 Mon (UTC), 9 new articles found for cs.DB Databases]
toXiv_bot_toot
A hacking collective calling itself "Scattered LapSus Hunters" has threatened to leak Google databases unless the company sacks two senior employees. Whilst the group has yet to provide any evidence that it holds Google data, Google has recently disclosed a third-party security breach involving Salesforce.
[2025-07-22 Tue (UTC), 5 new articles found for cs.DB Databases]
Contrastive timbre representations for musical instrument and synthesizer retrieval
Gwendal Le Vaillant, Yannick Molle
https://arxiv.org/abs/2509.13285 https://
[2025-08-22 Fri (UTC), 6 new articles found for cs.DB Databases]
[2025-09-22 Mon (UTC), 4 new articles found for cs.DB Databases]
[2025-07-23 Wed (UTC), no new articles found for cs.DB Databases]
Empowering Graph-based Approximate Nearest Neighbor Search with Adaptive Awareness Capabilities
Jiancheng Ruan, Tingyang Chen, Renchi Yang, Xiangyu Ke, Yunjun Gao
https://arxiv.org/abs/2506.15986
Challenges in GenAI and Authentication: a scoping review
Wesley dos Reis Bezerra, Lais Machado Bezerra, Carlos Becker Westphall
https://arxiv.org/abs/2507.11775
A meta-analysis on the performance of machine-learning based language models for sentiment analysis
Elena Rohde, Jonas Klingwort, Christian Borgs
https://arxiv.org/abs/2509.09728
IDSS, a Novel P2P Relational Data Storage Service
Massimo Cafaro, Italo Epicoco, Marco Pulimeno, Lunodzo J. Mwinuka, Lucas Pereira, Hugo Morais
https://arxiv.org/abs/2507.14682
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- NLI4DB: A Systematic Review of Natural Language Interfaces for Databases
Mengyi Liu, Jianqiu Xu
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- TabulaX: Leveraging Large Language Models for Multi-Class Table Transformations
Arash Dargahi Nobari, Davood Rafiei
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Fitting Ontologies and Constraints to Relational Structures
Simon Hosemann, Jean Christoph Jung, Carsten Lutz, Sebastian Rudolph
Rel-HNN: Split Parallel Hypergraph Neural Network for Learning on Relational Databases
Md. Tanvir Alam, Md. Ahasanul Alam, Md Mahmudur Rahman, Md. Mosaddek Khan
https://arxiv.org/abs/2507.12562
[2025-07-21 Mon (UTC), 5 new articles found for cs.DB Databases]
[2025-08-21 Thu (UTC), 3 new articles found for cs.DB Databases]
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Efficient Discovery of Motif Transition Process for Large-Scale Temporal Graphs
Zhiyuan Zheng, Jianpeng Qi, Jiantao Li, Guoqing Chao, Junyu Dong, Yanwei Yu
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- AegisBlock: A Privacy-Preserving Medical Research Framework using Blockchain
Calkin Garg, Omar Rios Cruz, Tessa Andersen, Gaby G. Dagher, Donald Winiecki, Min Long
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Keywords are not always the key: A metadata field analysis for natural language search on open da...
Lisa-Yao Gan, Arunav Das, Johanna Walker, Elena Simperl
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Exploring Distributed Vector Databases Performance on HPC Platforms: A Study with Qdrant
Ockerman, Gueroudji, Oh, Underwood, Chia, Chard, Ross, Venkataraman
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- AMAZe: A Multi-Agent Zero-shot Index Advisor for Relational Databases
Zhaodonghui Li, Haitao Yuan, Jiachen Shi, Hao Zhang, Yu Rong, Gao Cong
Towards a Standard for JSON Document Databases
Elena Botoeva, Julien Corman
https://arxiv.org/abs/2509.12189 https://arxiv.org/pdf/2509.12189
Synthesize, Retrieve, and Propagate: A Unified Predictive Modeling Framework for Relational Databases
Ning Li, Kounianhua Du, Han Zhang, Quan Gan, Minjie Wang, David Wipf, Weinan Zhang
https://arxiv.org/abs/2508.08327
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- MedPix 2.0: A Comprehensive Multimodal Biomedical Data set for Advanced AI Applications with Retr...
Irene Siragusa, Salvatore Contino, Massimo La Ciura, Rosario Alicata, Roberto Pirrone
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- EllieSQL: Cost-Efficient Text-to-SQL with Complexity-Aware Routing
Yizhang Zhu, Runzhi Jiang, Boyan Li, Nan Tang, Yuyu Luo
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Compressive Meta-Learning
Daniel Mas Montserrat, David Bonet, Maria Perera, Xavier Giró-i-Nieto, Alexander G. Ioannidis
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- LLM Agents for Interactive Workflow Provenance: Reference Architecture and Evaluation Methodology
Souza, Poteet, Etz, Rosendo, Gueroudji, Shin, Balaprakash, da Silva
[2025-08-20 Wed (UTC), 3 new articles found for cs.DB Databases]
Query Logs Analytics: A Systematic Literature Review
Dihia Lanasri
https://arxiv.org/abs/2508.13949 https://arxiv.org/pdf/2508.13949
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- The Impact of Modern AI in Metadata Management
Wenli Yang, Rui Fu, Muhammad Bilal Amin, Byeong Kang
[2025-08-19 Tue (UTC), 5 new articles found for cs.DB Databases]
[2025-09-19 Fri (UTC), 3 new articles found for cs.DB Databases]
BridgeScope: A Universal Toolkit for Bridging Large Language Models and Databases
Lianggui Weng, Dandan Liu, Rong Zhu, Bolin Ding, Jingren Zhou
https://arxiv.org/abs/2508.04031 …
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- MINE GRAPH RULE: A New Cypher-like Operator for Mining Association Rules on Property Graphs
Francesco Cambria, Francesco Invernici, Anna Bernasconi, Stefano Ceri
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Consensus-Free Spreadsheet Integration
Brandon Baylor, Eric Daimler, James Hansen, Esteban Montero, Ryan Wisnesky
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Quality Assessment of Tabular Data using Large Language Models and Code Generation
Ashlesha Akella, Akshar Kaul, Krishnasuri Narayanam, Sameep Mehta
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Inconsistency Handling in Prioritized Databases with Universal Constraints: Complexity Analysis a...
Meghyn Bienvenu, Camille Bourgaux
Language Native Lightly Structured Databases for Large Language Model Driven Composite Materials Research
Yuze Liu, Zhaoyuan Zhang, Xiangsheng Zeng, Yihe Zhang, Leping Yu, Lejia Wang, Xi Yu
https://arxiv.org/abs/2509.06093
[2025-08-18 Mon (UTC), 1 new article found for cs.DB Databases]
[2025-07-18 Fri (UTC), 3 new articles found for cs.DB Databases]
[2025-09-18 Thu (UTC), 4 new articles found for cs.DB Databases]
Marlin: Efficient Coordination for Autoscaling Cloud DBMS (Extended Version)
Wenjie Hu, Guanzhou Hu, Mahesh Balakrishnan, Xiangyao Yu
https://arxiv.org/abs/2508.01931 https://…
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- MAPS: A Multilingual Benchmark for Global Agent Performance and Security
Hofman, Brokman, Rachmil, Bose, Pahuja, Shimizu, Starostina, Marchisio, Goldfarb-Tarrant, Vainshtein
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees
Sepanta Zeighami, Shreya Shankar, Aditya Parameswaran
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Text-to-SQL Oriented to the Process Mining Domain: A PT-EN Dataset for Query Translation
Bruno Yui Yamate, Thais Rodrigues Neubauer, Marcelo Fantinato, Sarajane Marques Peres
Zero-Knowledge Verifiable Graph Query Evaluation via Expansion-Centric Operator Decomposition
Hao Wu, Changzheng Wei, Yanhao Wang, Li Lin, Yilong Leng, Shiyu He, Minghao Zhao, Hanghang Wu, Ying Yan, Aoying Zhou
https://arxiv.org/abs/2507.00427
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- QUEST: Query Optimization in Unstructured Document Analysis
Sun, Deng, Chai, Jin, Guo, Han, Yuan, Wang, Cao
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Scalable Graph Indexing using GPUs for Approximate Nearest Neighbor Search
Zhonggen Li, Xiangyu Ke, Yifan Zhu, Bocheng Yu, Baihua Zheng, Yunjun Gao
[2025-07-17 Thu (UTC), 1 new article found for cs.DB Databases]
[2025-09-17 Wed (UTC), 1 new article found for cs.DB Databases]