
2025-07-18 12:24:24
Ukrainian hackers wipe databases at Russia's Gazprom in major cyberattack, intelligence source says: https://benborges.xyz/2025/07/18/ukrainian-hackers-wipe-databases-at.html
Finally took the time to VACUUM FULL all my #PostgreSQL #databases and empty a runaway table. Dropped from ~90% disk usage to ~25% and a hell of a lot less CPU
Efficient and Scalable Self-Healing Databases Using Meta-Learning and Dependency-Driven Recovery
Joydeep Chandra, Prabal Manhas
https://arxiv.org/abs/2507.13757
Multimodal Data Storage and Retrieval for Embodied AI: A Survey
Yihao Lu, Hao Tang
https://arxiv.org/abs/2508.13901 https://arxiv.org/pdf/2508.13901…
topology: Internet AS graph (2004)
An integrated snapshot of the structure of the Internet at the level of Autonomous Systems (ASs), reconstructed from multiple sources, including the RouteViews and RIPE BGP trace collectors, route servers, looking glasses, and the Internet Routing Registry databases. This snapshot was created around October 2004.
This network has 34761 nodes and 171403 edges.
Tags: Technological, Communication, Unweighted, Multigraph, Timestamps
"Does the cafe entrance look accessible? Where is the door?" Towards Geospatial AI Agents for Visual Inquiries
Jon E. Froehlich, Jared Hwang, Zeyu Wang, John S. O'Meara, Xia Su, William Huang, Yang Zhang, Alex Fiannaca, Philip Nelson, Shaun Kane
https://arxiv.org/abs/2508.15752
Opening The Black-Box: Explaining Learned Cost Models For Databases
Roman Heinrich, Oleksandr Havrylov, Manisha Luthra, Johannes Wehrstein, Carsten Binnig
https://arxiv.org/abs/2507.14495
The Rest is Silence: Leveraging Unseen Species Models for Computational Musicology
Fabian C. Moss, Jan Hajič jr., Adrian Nachtwey, Laurent Pugin
https://arxiv.org/abs/2507.14638
Finite Axiomatizability by Disjunctive Existential Rules
Marco Calautti, Marco Console, Andreas Pieris
https://arxiv.org/abs/2508.11946 https://arxiv.org/p…
Exploring Distributed Vector Databases Performance on HPC Platforms: A Study with Qdrant
Seth Ockerman, Amal Gueroudji, Song Young Oh, Robert Underwood, Nicholas Chia, Kyle Chard, Robert Ross, Shivaram Venkataraman
https://arxiv.org/abs/2509.12384
Parkes transient events: II. Pulsar single pulses database containing raw data segment
Xuan Yang, S. B. Zhang, Le-Yu Tang, L. Toomey, Xue-Feng Wu
https://arxiv.org/abs/2508.14403
A Distributed Learned Hash Table
Shengze Wang, Yi Liu, Xiaoxue Zhang, Liting Hu, Chen Qian
https://arxiv.org/abs/2508.14239 https://arxiv.org/pdf/2508.1423…
I Built a Bloom Filter Data Structure Simulator
https://coffeebytes.dev/en/databases/i-built-a-bloom-filter-data-structure-simulator/
Bookmarked: calfa-co/lexical-databases: Source files of Calfa.fr web-based dictionary. #Armenisch_Lexikon
NV-like Defects More Common Than Four-Leaf Clovers: A Perspective on High-Throughput Point Defect Data
Joel Davidsson
https://arxiv.org/abs/2508.14223
CRED-SQL: Enhancing Real-world Large Scale Database Text-to-SQL Parsing through Cluster Retrieval and Execution Description
Shaoming Duan, Zirui Wang, Chuanyi Liu, Zhibin Zhu, Yuhao Zhang, Peiyi Han, Liang Yan, Zewu Penge
https://arxiv.org/abs/2508.12769
A Risk Manager for Intrusion Tolerant Systems: Enhancing HAL 9000 with New Scoring and Data Sources
Tadeu Freitas, Carlos Novo, Inês Dutra, João Soares, Manuel Correia, Benham Shariati, Rolando Martins
https://arxiv.org/abs/2508.13364
Digital-GenAI-Enhanced HCI in DevOps as a Driver of Sustainable Innovation: An Empirical Framework
Jun Cui
https://arxiv.org/abs/2508.13185 https://arxiv.o…
Software developer Heval Hazal Kurt discusses the pros and cons of relational databases, in comparison with their document-based counterparts, in this July 2025 article. Polyglot systems combining both paradigms are discussed as an ideal solution to differing data access needs within the same project.
"When to Choose NoSQL Over SQL"
Finding Inter-species Associations on Large Citizen Science Datasets
Jacob Deutsch
https://arxiv.org/abs/2508.14259 https://arxiv.org/pdf/2508.14259…
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Common Data Format (CDF): A Standardized Format for Match-Data in Football (Soccer)
Anzer, Arnsmeyer, Bauer, Bekkers, Brefeld, Davis, Evans, Kempe, Robertson, Smith, Van Haaren
PHP turned 30 last month. Why does this language that powers 75% of the web—including Facebook, Wikipedia, and WordPress—have such staying power? Find out by registering for this fall's DIG 540, UMaine's online course in building cultural databases.
https://DigitalCuration.UMaine.edu
Connecting, enriching & extending Copilot's understanding of "people" in your org to the individual profiles of your HR systems, internal databases, etc. to provide better, more accurate responses.
Original post: https://bsky.app/profile/did:plc:3y3e6
The Home Office has permitted police to run facial recognition scans against images held in the UK’s passport and immigration databases without notifying Parliament or the public, say privacy campaigners.
https://www.computing.co.uk/news/2025/uk-p
Heading back to Berlin after just 2 days of #IANLS25. Really happy with the 4 special sessions on digital technology. There's a strong focus on databases and corpora, but there were also papers on digital methods. Let's see how things develop over the next triennium!
@…
Are you running open-source databases on Kubernetes? At Berlin Buzzwords 2025, Peter Zaitsev discussed best practices for high availability, security, backups, and disaster recovery. Discover key pitfalls to avoid and learn how Operators can simplify database management for MySQL, MongoDB, and PostgreSQL in Kubernetes environments.
Watch the full session:
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Confidence Estimation for Text-to-SQL in Large Language Models
Sepideh Entezari Maleki, Mohammadreza Pourreza, Davood Rafiei
from my link log —
PostgreSQL replication slots: preventing WAL bloat and other production issues.
https://www.morling.dev/blog/mastering-postgres-replication-slots/
saved 2025-07-08
A bit of actually helpful #Sysadminnery...
So, you don't have a DBA on staff because the databases you "administer" are all designed by external app developers and a DBA would have no work? Your databases have been running w/o intervention for years, aren't generating errors? Getting used to everything kinda running slow?
Schedule an annual date to do the equivale…
Google says hackers stole its customers’ data in a breach of its Salesforce database
https://techcrunch.com/2025/08/06/google-says-hackers-stole-its-customers-data-in-a-breach-of-its-salesforce-database/
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Integrated data-driven biotechnology research environments
Rosalia Moreddu
https://…
As Palantir continues to expand its influence within the administration,
the Trump administration has given the company the right to surveil Americans.
In a chilling report, The New York Times notes that the company is already creating
“detailed portraits of Americans based on government data,”
with the Trump administration already seeking
“access to hundreds of data points on citizens and others through government databases,
including their bank accoun…
Efficient Semi-External Breadth-First Search
Xiaolong Wan, Xixian Han
https://arxiv.org/abs/2507.12925 https://arxiv.org/pdf/2507.129…
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- On the Effectiveness of Graph Reordering for Accelerating Approximate Nearest Neighbor Search on GPU
Yutaro Oguri, Mai Nishimura, Yusuke Matsui
Knowing when to stop: insights from ecology for building catalogues, collections, and corpora
Jan Hajič jr., Fabian Moss
https://arxiv.org/abs/2507.14614
Inside ICE’s Supercharged Facial Recognition App of 200 Million Images https://www.404media.co/inside-ices-supercharged-facial-recognition-app-of-200-million-images/
Those companies that you pay to get your data out of data brokers’ databases are just creating databases so they can become data brokers, themselves, right?
> Having to maintain half a dozen cursed #OpenSource pseudo-databases, because people absolutely must use this month's fad.
Another of them betrays community trust and changes license.
> Having to maintain a bunch of independent open source forks of said pseudo-database.
Rinse and repeat.
#Gentoo
🔧 Integrate tools to your agent in less than 10 lines of code - reuse between multiple agents or frameworks
💬 Query databases in plain English directly from your #IDE - no SQL writing needed
Hi everyone at #MCH2025! We're about to start with the last talk of the day: #BadgeHub! In this talk Francis, Edwin, and Aleksander will explain the Infra, Frameworks, Databases, Backend and Frontend for this new website where Apps for this amazing Badge can be shared. Please reply to this message if…
Large Language Models in the Data Science Lifecycle: A Systematic Mapping Study
Sai Sanjna Chintakunta, Nathalia Nascimento, Everton Guimaraes
https://arxiv.org/abs/2508.11698 h…
AlDBaran: Towards Blazingly Fast State Commitments for Blockchains
Bernhard Kauer, Aleksandr Petrosyan, Benjamin Livshits
https://arxiv.org/abs/2508.10493
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- NLI4DB: A Systematic Review of Natural Language Interfaces for Databases
Mengyi Liu, Jianqiu Xu
Plex tells customers to reset their passwords after suffering data breach that includes email addresses, usernames, and passwords (Lawrence Abrams/BleepingComputer)
https://www.bleepingcomputer.com/news/security/plex-tells-users-to-re…
Chat-Driven Text Generation and Interaction for Person Retrieval
Zequn Xie, Chuxin Wang, Sihang Cai, Yeqiang Wang, Shulei Wang, Tao Jin
https://arxiv.org/abs/2509.12662 https://…
[2025-07-21 Mon (UTC), 5 new articles found for cs.DB Databases]
toXiv_bot_toot
What are some of the key concepts and design choices behind modern, scalable, high-performance databases? At Berlin Buzzwords 2025, Guy Shtub discussed how a database delivers sub-millisecond 99th-percentile latency at a throughput of millions of operations per second, at scale, and how you can use it.
Watch the full session: https://
[2025-07-22 Tue (UTC), 5 new articles found for cs.DB Databases]
toXiv_bot_toot
[2025-08-21 Thu (UTC), 3 new articles found for cs.DB Databases]
toXiv_bot_toot
A hacking collective calling itself "Scattered LapSus Hunters" has threatened to leak Google databases unless the company sacks two senior employees. Whilst the group has yet to provide any evidence that it holds Google data, Google has recently disclosed a third-party security breach involving Salesforce.
[2025-08-22 Fri (UTC), 6 new articles found for cs.DB Databases]
toXiv_bot_toot
Contrastive timbre representations for musical instrument and synthesizer retrieval
Gwendal Le Vaillant, Yannick Molle
https://arxiv.org/abs/2509.13285
IDSS, a Novel P2P Relational Data Storage Service
Massimo Cafaro, Italo Epicoco, Marco Pulimeno, Lunodzo J. Mwinuka, Lucas Pereira, Hugo Morais
https://arxiv.org/abs/2507.14682
Challenges in GenAI and Authentication: a scoping review
Wesley dos Reis Bezerra, Lais Machado Bezerra, Carlos Becker Westphall
https://arxiv.org/abs/2507.11775
A meta-analysis on the performance of machine-learning based language models for sentiment analysis
Elena Rohde, Jonas Klingwort, Christian Borgs
https://arxiv.org/abs/2509.09728
Rel-HNN: Split Parallel Hypergraph Neural Network for Learning on Relational Databases
Md. Tanvir Alam, Md. Ahasanul Alam, Md Mahmudur Rahman, Md. Mosaddek Khan
https://arxiv.org/abs/2507.12562
Artificial Intelligence and Journalism: A Systematic Bibliometric and Thematic Analysis of Global Research
Mohammad Al Masum Molla, Md Manjurul Ahsan
https://arxiv.org/abs/2507.10891
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Efficient Discovery of Motif Transition Process for Large-Scale Temporal Graphs
Zhiyuan Zheng, Jianpeng Qi, Jiantao Li, Guoqing Chao, Junyu Dong, Yanwei Yu
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- AegisBlock: A Privacy-Preserving Medical Research Framework using Blockchain
Calkin Garg, Omar Rios Cruz, Tessa Andersen, Gaby G. Dagher, Donald Winiecki, Min Long
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- TabulaX: Leveraging Large Language Models for Multi-Class Table Transformations
Arash Dargahi Nobari, Davood Rafiei
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Fitting Ontologies and Constraints to Relational Structures
Simon Hosemann, Jean Christoph Jung, Carsten Lutz, Sebastian Rudolph
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- AMAZe: A Multi-Agent Zero-shot Index Advisor for Relational Databases
Zhaodonghui Li, Haitao Yuan, Jiachen Shi, Hao Zhang, Yu Rong, Gao Cong
[2025-08-20 Wed (UTC), 3 new articles found for cs.DB Databases]
toXiv_bot_toot
Towards a Standard for JSON Document Databases
Elena Botoeva, Julien Corman
https://arxiv.org/abs/2509.12189 https://arxiv.org/pdf/2509.12189
Query Logs Analytics: A Systematic Literature Review
Dihia Lanasri
https://arxiv.org/abs/2508.13949 https://arxiv.org/pdf/2508.13949
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- MedPix 2.0: A Comprehensive Multimodal Biomedical Data set for Advanced AI Applications with Retr...
Irene Siragusa, Salvatore Contino, Massimo La Ciura, Rosario Alicata, Roberto Pirrone
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Keywords are not always the key: A metadata field analysis for natural language search on open da...
Lisa-Yao Gan, Arunav Das, Johanna Walker, Elena Simperl
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Exploring Distributed Vector Databases Performance on HPC Platforms: A Study with Qdrant
Ockerman, Gueroudji, Oh, Underwood, Chia, Chard, Ross, Venkataraman
Synthesize, Retrieve, and Propagate: A Unified Predictive Modeling Framework for Relational Databases
Ning Li, Kounianhua Du, Han Zhang, Quan Gan, Minjie Wang, David Wipf, Weinan Zhang
https://arxiv.org/abs/2508.08327
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- The Impact of Modern AI in Metadata Management
Wenli Yang, Rui Fu, Muhammad Bilal Amin, Byeong Kang
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- EllieSQL: Cost-Efficient Text-to-SQL with Complexity-Aware Routing
Yizhang Zhu, Runzhi Jiang, Boyan Li, Nan Tang, Yuyu Luo
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Compressive Meta-Learning
Daniel Mas Montserrat, David Bonet, Maria Perera, Xavier Giró-i-Nieto, Alexander G. Ioannidis
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- LLM Agents for Interactive Workflow Provenance: Reference Architecture and Evaluation Methodology
Souza, Poteet, Etz, Rosendo, Gueroudji, Shin, Balaprakash, da Silva
[2025-08-19 Tue (UTC), 5 new articles found for cs.DB Databases]
toXiv_bot_toot
[2025-09-19 Fri (UTC), 3 new articles found for cs.DB Databases]
toXiv_bot_toot
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- MINE GRAPH RULE: A New Cypher-like Operator for Mining Association Rules on Property Graphs
Francesco Cambria, Francesco Invernici, Anna Bernasconi, Stefano Ceri
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Consensus-Free Spreadsheet Integration
Brandon Baylor, Eric Daimler, James Hansen, Esteban Montero, Ryan Wisnesky
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Quality Assessment of Tabular Data using Large Language Models and Code Generation
Ashlesha Akella, Akshar Kaul, Krishnasuri Narayanam, Sameep Mehta
[2025-08-18 Mon (UTC), 1 new article found for cs.DB Databases]
toXiv_bot_toot
[2025-07-18 Fri (UTC), 3 new articles found for cs.DB Databases]
toXiv_bot_toot
[2025-09-18 Thu (UTC), 4 new articles found for cs.DB Databases]
toXiv_bot_toot
BridgeScope: A Universal Toolkit for Bridging Large Language Models and Databases
Lianggui Weng, Dandan Liu, Rong Zhu, Bolin Ding, Jingren Zhou
https://arxiv.org/abs/2508.04031 …
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- MAPS: A Multilingual Benchmark for Global Agent Performance and Security
Hofman, Brokman, Rachmil, Bose, Pahuja, Shimizu, Starostina, Marchisio, Goldfarb-Tarrant, Vainshtein
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Inconsistency Handling in Prioritized Databases with Universal Constraints: Complexity Analysis a...
Meghyn Bienvenu, Camille Bourgaux
Language Native Lightly Structured Databases for Large Language Model Driven Composite Materials Research
Yuze Liu, Zhaoyuan Zhang, Xiangsheng Zeng, Yihe Zhang, Leping Yu, Lejia Wang, Xi Yu
https://arxiv.org/abs/2509.06093
Marlin: Efficient Coordination for Autoscaling Cloud DBMS (Extended Version)
Wenjie Hu, Guanzhou Hu, Mahesh Balakrishnan, Xiangyao Yu
https://arxiv.org/abs/2508.01931 https://…
[2025-07-17 Thu (UTC), 1 new article found for cs.DB Databases]
toXiv_bot_toot
[2025-09-17 Wed (UTC), 1 new article found for cs.DB Databases]
toXiv_bot_toot
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- QUEST: Query Optimization in Unstructured Document Analysis
Sun, Deng, Chai, Jin, Guo, Han, Yuan, Wang, Cao
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Scalable Graph Indexing using GPUs for Approximate Nearest Neighbor Search
Zhonggen Li, Xiangyu Ke, Yifan Zhu, Bocheng Yu, Baihua Zheng, Yunjun Gao
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees
Sepanta Zeighami, Shreya Shankar, Aditya Parameswaran
Crosslisted article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- Text-to-SQL Oriented to the Process Mining Domain: A PT-EN Dataset for Query Translation
Bruno Yui Yamate, Thais Rodrigues Neubauer, Marcelo Fantinato, Sarajane Marques Peres
Zero-Knowledge Verifiable Graph Query Evaluation via Expansion-Centric Operator Decomposition
Hao Wu, Changzheng Wei, Yanhao Wang, Li Lin, Yilong Leng, Shiyu He, Minghao Zhao, Hanghang Wu, Ying Yan, Aoying Zhou
https://arxiv.org/abs/2507.00427
Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- PROV-AGENT: Unified Provenance for Tracking AI Agent Interactions in Agentic Workflows
Souza, Gueroudji, DeWitt, Rosendo, Ghosal, Ross, Balaprakash, da Silva
[2025-07-16 Wed (UTC), 4 new articles found for cs.DB Databases]
toXiv_bot_toot
[2025-09-16 Tue (UTC), 5 new articles found for cs.DB Databases]
toXiv_bot_toot
Dynamic read & write optimization with TurtleKV
Tony Astolfi, Vidya Silai, Darby Huye, Lan Liu, Raja R. Sambasivan, Johes Bater
https://arxiv.org/abs/2509.10714