2025-08-07 11:30:04
Venice Film Festival Hacked, Attendee Data Compromised
https://www.hollywoodreporter.com/movies/movie-news/venice-film-festival-hacked-data-compromised-1236338374/
Sources: data center startup Crusoe is arranging an employee share sale involving ~$120M worth of shares that values it at ~$13B, up from $10B just weeks ago (The Information)
https://www.theinformation.com/articles/openais-data-cen…
Complex Domain Approach for Reversible Data Hiding and Homomorphic Encryption: General Framework and Application to Dispersed Data
David Megias
https://arxiv.org/abs/2510.03770 …
Anthropic uses questionable dark patterns to obtain users’ consent to the use of AI data in Claude: https://the-decoder.com/anthropic-uses-a-questionable-dark-pattern-to-obtain-user-consent-for-ai-data-use-in-claude/
Google and Cisco have disclosed separate data breaches stemming from voice phishing (vishing) attacks that compromised customer information stored in cloud-based CRM systems.
https://www.computing.co.uk/news/2025/security/google-and-ci…
A first peer-reviewed publication about Neuralink with human data may finally be on the way, but for now there are still no details https://www.businesstimes.com.sg/startups-tech/startups/musks-neuralink-submits-brain-implant-patient-data-journ…
A Type System for Data Privacy Compliance in Active Object Languages
Chinmayi Prabhu Baramashetru (University of Oslo, Norway), Paola Giannini (Università del Piemonte Orientale, Italy), Silvia Lizeth Tapia Tarifa (University of Oslo, Norway), Olaf Owe (University of Oslo, Norway)
https://arxiv.org/abs/2508.03831
Researchers show how a weakness in OpenAI's Connectors let sensitive data be extracted from a Google Drive account using an indirect prompt injection attack (Matt Burgess/Wired)
https://www.wired.com/story/poisoned-document-could-leak-secret-data-chatgpt/
CoDA: Agentic Systems for Collaborative Data Visualization
Zichen Chen, Jiefeng Chen, Sercan Ö. Arik, Misha Sra, Tomas Pfister, Jinsung Yoon
https://arxiv.org/abs/2510.03194
Just noticed that the 30 themes for the 2025 #30DayMapChallenge are now up!
#GISchat
What Do We Mean When We Talk About Data Storytelling?
Leni Yang, Zezhong Wang, Xingyu Lan
https://arxiv.org/abs/2510.04761 https://arxiv.org/pdf/2510.04761…
🎼 Our NFDI4Culture colleague @…, Martin Albrecht-Hohmaier, is presenting here at #gfm25 the Open Educational Resource from the Cultural Research Data Academy team, which was recently published on our Culture Information Portal:
https://www.bleepingcomputer.com/news/security/sandworm-hackers-use-data-wipers-to-disrupt-ukraines-grain-sector/
Sandworm hackers use data wipers to disrupt Ukraine's grain sector
Finding Diamonds in Conversation Haystacks: A Benchmark for Conversational Data Retrieval
Yohan Lee, Yongwoo Song, Sangyeop Kim
https://arxiv.org/abs/2510.02938 https://
from my link log —
Disassembling vs decompressing terabytes of random data with Zig and Capstone: what are the failure rates?
https://jstrieb.github.io/posts/random-instructions/
saved 2025-11-06
Constraint-Preserving Data Generation for Visuomotor Policy Learning
Kevin Lin, Varun Ragunath, Andrew McAlinden, Aaditya Prasad, Jimmy Wu, Yuke Zhu, Jeannette Bohg
https://arxiv.org/abs/2508.03944
PowerPlots: An Open Source Power Grid Visualization and Data Analysis Framework for Academic Research
Noah Rhodes
https://arxiv.org/abs/2510.05063 https://…
Technical specification of a framework for the collection of clinical images and data
Alistair Mackenzie (Royal Surrey NHS Foundation Trust, Guildford, UK), Mark Halling-Brown (Royal Surrey NHS Foundation Trust, Guildford, UK), Ruben van Engen (Dutch Expert Centre for Screening), Carlijn Roozemond (Dutch Expert Centre for Screening), Lucy Warren (Royal Surrey NHS Foundation Trust, Guildford, UK), Dominic Ward (Royal Surrey NHS Foundation Trust, Guildford, UK), Nadia Smith (Royal Surrey…
The QIC-Index: A Novel, Data-Centric Metric for Quantifying the Impact of Research Data Sharing
Martin G. Frasch
https://arxiv.org/abs/2510.03307 https://a…
sp_baboons: Baboons' interactions (2020)
Network of interactions between a group of 20 Guinea baboons living in an enclosure of a Primate Center in France, between June 13th 2019 and July 10th 2019. The data set contains observational and wearable sensors data.
This network has 23 nodes and 3197 edges.
Tags: Social, Animal, Offline, Unweighted, Weighted, Temporal, Metadata
Advocates raise alarm over Pfas pollution from datacenters amid AI boom #environment
Proud of my city (Tucson) to reject a ridiculous Amazon data center development.
The desert is the last place you want to build data centers.
EU Parliament committee votes to advance controversial Europol data sharing proposal https://therecord.media/eu-parliament-committee-votes-europol-data-sharing-agreement
The European Union general data protection regulation: what it is and what it means
Chris Jay Hoofnagle, Bart van der Sloot, Frederik Zuiderveen Borgesius
https://arxiv.org/abs/2510.02861
{nplyr} has helper functions to work on nested dataframes: #rstats #datascience
The Trump Administration Wants Your Voter Registration Data. Why? (Matt Cohen/Democracy Docket)
https://www.democracydocket.com/analysis/the-trump-administration-wants-your-voter-registration-data-why/
http://www.memeorandum.com/250906/p45#a250906p45
Not every day is a sunny day: Synthetic cloud injection for deep land cover segmentation robustness evaluation across data sources
Sara Mobsite, Renaud Hostache, Laure Berti Equille, Emmanuel Roux, Joris Guerin
https://arxiv.org/abs/2510.03006
On 16WW Mains Inlet Water Temperature - Domestic mains water temperature data for 16WW on tap; seasonal min/max about 10C/20C in winter/summer. #dataset #water #temperature -
Partially paywalled but still lots of juicy data. “State of the software engineering job market in 2025”: https://newsletter.pragmaticengineer.com/p/state-of-the-tech-market-in-2025
From which:
Edge-assisted Parallel Uncertain Skyline Processing for Low-latency IoE Analysis
Chuan-Chi Lai, Yan-Lin Chen, Bo-Xin Liu, Chuan-Ming Liu
https://arxiv.org/abs/2508.04596 https:/…
What your brain activity says about you: A review of neuropsychiatric disorders identified in resting-state and sleep EEG data
J. E. M. Scanlon, A. Pelzer, M. Gharleghi, K. C. Fuhrmeister, T. Köllmer, P. Aichroth, R. Göder, C. Hansen, K. I. Wolf
https://arxiv.org/abs/2510.04984
@jacob@mountaincommunity.co This is particularly bad in #Italy, where #WhatsApp is the default means of communication. In other countries I've lived in at least they consider alternatives. In Italy people don't even ask you if you have an account before adding you in school parent groups or contacting you for work over WhatsApp.
„Tech companies’ use of Pfas gas at facilities may mean datacenters’ climate impact is worse than previously thought“ (via @… )
https://www.
Combining the second data release of the European Pulsar Timing Array with low-frequency pulsar data
F. Iraci, A. Chalumeau, C. Tiburzi, J. P. W. Verbiest, A. Possenti, S. C. Susarla, M. A. Krishnakumar, G. M. Shaifullah, J. Antoniadis, M. Bagchi, C. Bassa, R. N. Caballero, B. Cecconi, S. Chen, S. Chowdhury, B. Ciardi, I. Cognard, S. Corbel, S. Desai, D. Deb, J. Girard, A. Golden, J-M. Grießmeier, L. Guillemot, M. Hoeft, H. Hu, F. Jankowski, G. Janssen, B. C. Joshi, S. Kala, E. Kea…
Orthogonal Procrustes problem preserves correlations in synthetic data
Oussama Ounissi, Nicklas Jävergård, Adrian Muntean
https://arxiv.org/abs/2510.02405 https://
This is amazing: The @… have produced a beautiful set of easy-to-access tools for use with their satellite data products.
Some great quick start notebooks and tonnes and tonnes of lovely #EarthSytsem climate data.
(Our PISCO project will also be steadily adding stuff to this as well over the next year and a half)
https://climate.esa.int/en/data/toolbox/
Larry Ellison Is a ‘Shadow President’ in Donald Trump’s America
The Ellison family is cornering the market on attention and data the same way the Vanderbilts did railroads and the Rockefellers did oil.
👉 https://www.wired.com/story/larry-ellison-
KEEP: Integrating Medical Ontologies with Clinical Data for Robust Code Embeddings
Ahmed Elhussein, Paul Meddeb, Abigail Newbury, Jeanne Mirone, Martin Stoll, Gamze Gursoy
https://arxiv.org/abs/2510.05049
Flock Surveillance
refers to the camera and data systems developed by Flock Safety,
-- a private technology company that provides automated license plate recognition and
vehicle-tracking networks to
police departments, homeowners’ associations, and private businesses across the U.S.
🔥This system enables mass tracking of drivers and data sharing across police and private networks
without sufficient oversight,
raising serious concerns about privacy, …
So AI Datacenters are going to space...
https://research.google/blog/exploring
«Lessons in humility & simplicity for 'data science': Garmin's health status»
I blogged about how another #wearable manufacturer went down the road of leaving up data interpretation to humans instead of automating it – and how that relates to "AI" or "automated decision making"
(Responses to this toot become blog comments too)
#quantifiedself #personalscience
RIVM update on wastewater levels and percentage positive for SARS-CoV-2.
After last week's rise, this week shows a decline again.
There are 7 new days in the data, from 27/10 to 2/11, with 95% down to 1% of the monitoring stations respectively. The last three days in particular do not amount to much yet in terms of data.
The weighted average has nonetheless dropped from around 765 to around 665.
#qp2t
https://www.abc.net.au/news/2025-10-06/data-breach-northern-rivers-resilient-homes-program-chatgpt/105855284
NSW flood victims' personal details loaded to ChatGPT in major data breach
"What works in #India will scale better everywhere else. Naturally, the country is a battleground for #AI search."
This is fishing for data and the next #enshittification to roll ou…
Description of CRESST-III lithium aluminate data
G. Angloher (CRESST Collaboration), S. Banik (CRESST Collaboration), G. Benato (CRESST Collaboration), A. Bento (CRESST Collaboration), A. Bertolini (CRESST Collaboration), R. Breier (CRESST Collaboration), C. Bucci (CRESST Collaboration), J. Burkhart (CRESST Collaboration), L. Canonica (CRESST Collaboration), A. D'Addabbo (CRESST Collaboration), S. Di Lorenzo (CRESST Collaboration), L. Einfalt (CRESST Collaboration), A. Erb (CRESST …
Kernel ridge regression under power-law data: spectrum and generalization
Arie Wortsman, Bruno Loureiro
https://arxiv.org/abs/2510.04780 https://arxiv.org/…
I'm putting together notes for novices to do simple data analysis with R and the fact that I'm telling them to "cut and paste this inscrutable block of code at the start of your file" reminds me of nothing so much as when I worked for the ESRI in Dublin in the mid 1980s and we used to run SPSS analyses remotely on UCD's Amdahl by sandwiching our SPSS code (on punch cards) between two decks of IBM Job Control Language cards of which we understood nothing whatsoever.
DualBird, which has created a plug-in for rewritable hardware to accelerate data workloads, raised $25M, including a $16.5M Series A led by Lightspeed (Chris Metinko/Axios)
https://www.axios.com/pro/enterprise-software-deals/2…
Detecting Distillation Data from Reasoning Models
Hengxiang Zhang, Hyeong Kyu Choi, Yixuan Li, Hongxin Wei
https://arxiv.org/abs/2510.04850 https://arxiv.o…
Vector Autoregression (VAR) of Longitudinal Sleep and Self-report Mood Data
Jeff Brozena
https://arxiv.org/abs/2510.02511 https://arxiv.org/pdf/2510.02511
A look at OpenAI's search for the sites of its Stargate data centers in the US; OpenAI has received 800 applications since January and has 20 finalist sites (MacKenzie Sigalos/CNBC)
https://www.cnbc.com/2025/10/05/openai-stargate…
Velocity-Form Data-Enabled Predictive Control of Soft Robots under Unknown External Payloads
Huanqing Wang, Kaixiang Zhang, Kyungjoon Lee, Yu Mei, Vaibhav Srivastava, Jun Sheng, Ziyou Song, Zhaojian Li
https://arxiv.org/abs/2510.04509
Protecting Persona Biometric Data: The Case of Facial Privacy
Lambert Hogenhout, Rinzin Wangmo
https://arxiv.org/abs/2510.03035 https://arxiv.org/pdf/2510.…
twitter_higgs: Twitter, Higgs boson (2012)
Data on tweets related to the announcement of the discovery of a new fundamental particle with the features of the Higgs boson on 4th July 2012. Data covers 1-7 July 2012, and includes four types of networks: followers, retweets, replies, and mentions.
This network has 456626 nodes and 14855842 edges.
Tags: Social, Online, Weighted, Multilayer
Ed tech company fined $5.1 million for poor data security practices leading to hack https://therecord.media/ed-tech-company-fined-5-million-data-breach-security-practices
Say goodbye to the frustrations of copying and pasting data to and from R with Datapasta from @…! Get the package now: https://milesmcbain.github.io/datapasta/…
Robust estimation of causal dose-response relationship using exposure data with dose as an instrumental variable
Jixian Wang, Zhiwei Zhang, Ram Tiwari
https://arxiv.org/abs/2508.04215
https://www.koreatimes.co.kr/business/companies/20251106/kt-accused-of-concealing-major-malware-infection-faces-probe-over-customer-data-breach
Mobile carrier KT is facing mounting scr…
On 16WW Data Collections and Graphs - Open for research #dataset - https://www.earth.org.uk/note-on-data.html
Energy Efficiency in Cloud-Based Big Data Processing for Earth Observation: Gap Analysis and Future Directions
Adhitya Bhawiyuga, Serkan Girgin, Rolf A. de By, Raul Zurita-Milla
https://arxiv.org/abs/2510.02882
The Path of Self-Evolving Large Language Models: Achieving Data-Efficient Learning via Intrinsic Feedback
Hangfan Zhang, Siyuan Xu, Zhimeng Guo, Huaisheng Zhu, Shicheng Liu, Xinrun Wang, Qiaosheng Zhang, Yang Chen, Peng Ye, Lei Bai, Shuyue Hu
https://arxiv.org/abs/2510.02752
RFKJr and one of his ACIP picks, Robert Malone,
appear to be getting ready to restrict the Respiratory Syncytial Virus (RSV) immunizations
(nirsevimab and clesrovimab)
saying there is a vast cover up of vaccine safety data.
"You talk a lot about how unsafe it is to vaccinate children.
Do you know how unsafe it is to not vaccinate them?"
Recall.ai, which offers an API and a desktop SDK for managing meeting recordings and transcripts, raised $38M in a Series B led by Bessemer at a $250M valuation (Paul Gillin/SiliconANGLE)
https://siliconangle.com/2025/09/05/recall-ai-lands-38m-unlock-spoken-…
@… by the way, does it sometimes happen that the Geofabrik data downloads lag a few days behind the actual state of the OSM despite the timestamps seeming current? Or does the picture gen pipeline somehow not clear out stale data sometimes? 🙂
ResCP: Reservoir Conformal Prediction for Time Series Forecasting
Roberto Neglia, Andrea Cini, Michael M. Bronstein, Filippo Maria Bianchi
https://arxiv.org/abs/2510.05060 https…
Data-driven Practical Stabilization of Nonlinear Systems via Chain Policies: Sample Complexity and Incremental Learning
Roy Siegelmann, Enrique Mallada
https://arxiv.org/abs/2510.03982
Scalable Causal Discovery from Recursive Nonlinear Data via Truncated Basis Function Scores and Tests
Joseph Ramsey, Bryan Andrews
https://arxiv.org/abs/2510.04276 https://
Yowzer, take a break from reading all the election news by checking out today's Metacurity for the most critical infosec developments you should know, including
--EU cops bust money launderers who set up crypto fraud network,
--OFAC sanctions DPRK firms for supporting criminal activity,
--Probe reveals how easy it is to intercept EU and NATO sensitive movement data,
--KC PD hack exposes misconduct details,
--Nikkei Slack hack exposes data on 17K employees an…
External Data Extraction Attacks against Retrieval-Augmented Large Language Models
Yu He, Yifei Chen, Yiming Li, Shuo Shao, Leyi Qi, Boheng Li, Dacheng Tao, Zhan Qin
https://arxiv.org/abs/2510.02964
DRIVE-T: A Methodology for Discriminative and Representative Data Viz Item Selection for Literacy Construct and Assessment
Angela Locoro, Silvia Golia, Davide Falessi
https://arxiv.org/abs/2508.04160
Discord says sensitive info stolen during cyberattack on customer service provider https://therecord.media/discord-data-breach-third-party
Automatically describe data and models as text using the {report} package. #rstats
WOW: WAIC-Optimized Gating of Mixture Priors for External Data Borrowing
Shouhao Zhou, Qiuxin Gao, Chenqi Fu, Yanxun Xu
https://arxiv.org/abs/2510.05085 https://
A profile of OpenAI President Greg Brockman and his role in the company's $1.4T infrastructure buildout that's required to reach AGI (Sharon Goldman/Fortune)
https://fortune.com/2025/11/05/openai-greg-brockman-ai-infrastructure-data-c…
Good lord, I hope the data was sufficiently anonymized, but with only three patients, I dunno...
Musk’s Neuralink Submits Brain Implant Patient Data to Journal
https://www.bloomberg.com/news/articles/2025-10-05/musk-s-neu…
MagicGUI: A Foundational Mobile GUI Agent with Scalable Data Pipeline and Reinforcement Fine-tuning
Liujian Tang, Shaokang Dong, Yijia Huang, Minqi Xiang, Hongtao Ruan, Bin Wang, Shuo Li, Zhihui Cao, Hailiang Pang, Heng Kong, He Yang, Mingxu Chai, Zhilin Gao, Xingyu Liu, Yingnan Fu, Jiaming Liu, Tao Gui, Xuanjing Huang, Yu-Gang Jiang, Qi Zhang, Kang Wang, Yunke Zhang, Yuran Wang
AMANDA: Agentic Medical Knowledge Augmentation for Data-Efficient Medical Visual Question Answering
Ziqing Wang, Chengsheng Mao, Xiaole Wen, Yuan Luo, Kaize Ding
https://arxiv.org/abs/2510.02328
Privacy in the Age of AI: A Taxonomy of Data Risks
Grace Billiris, Asif Gill, Madhushi Bandara
https://arxiv.org/abs/2510.02357 https://arxiv.org/pdf/2510.…
https://www.scmp.com/news/asia/australasia/article/3324470/australias-qantas-airways-penalises-ceo-over-data-breach-bonus-cut
Anyone ever hear of a CEO getting their pay docked over a data breach before?
TeachLM: Post-Training LLMs for Education Using Authentic Learning Data
Janos Perczel, Jin Chow, Dorottya Demszky
https://arxiv.org/abs/2510.05087 https://…
Sources: Veeam, which offers data backup and disaster recovery tools, is in advanced talks to acquire data privacy management software maker Securiti for ~$1.8B (Bloomberg)
https://www.bloomberg.com/news/articles/2025-10-06/insigh…
Per-element Secure Aggregation against Data Reconstruction Attacks in Federated Learning
Takumi Suimon, Yuki Koizumi, Junji Takemasa, Toru Hasegawa
https://arxiv.org/abs/2508.04285
https://www.bleepingcomputer.com/news/security/media-giant-nikkei-reports-data-breach-impacting-17-000-people/
Media giant Nikkei reports data breach impacting 17,000 people
Data shows WLFI's sharp drop was driven by shorting and dumping across exchanges, not Justin Sun's token movements; WLFI blames phishing-related compromises (Sam Reynolds/CoinDesk)
https://www.coindesk.com/markets/2025/09/0
So the city of Houston is blaming its firefighters for clicking on a link in an email the city sent to them that intended to share information about the firefighters’ promotion exam but instead linked to a bunch of social security numbers in the clear.
Houston data breach exposes firefighters’ personal info, union says they’re being blamed
A basket tracking stocks of 10 European data center operators and infrastructure providers surged 23% in 2025, topping the Nasdaq 100, driven by the AI boom (Bloomberg)
https://www.bloomberg.com/news/articles/2025-10-04/the-bi…
Sources and documents: Google plans to build an AI data center on Australia's remote Christmas Island after signing a cloud deal with its DOD earlier in 2025 (Kirsty Needham/Reuters)
https://www.reuters.com/world/asia-pacific
Google adds Gemini's Deep Search to Google Finance, which will also have prediction market data from Kalshi and Polymarket for event analysis, first in the US (Aamir Siddiqui/Android Authority)
https://www.androidauthority.com/google-finance…
SaaS giant Workiva discloses data breach after Salesforce attack
https://www.bleepingcomputer.com/news/security/saas-giant-workiva-discloses-data-breach-after-salesforce-attack/?mid=1#cid=3061643
LinkedIn sues a company called ProAPIs for allegedly operating millions of fake accounts to scrape LinkedIn member data and selling it for ~$15,000 per month (Suzanne Smalley/The Record)
https://therecord.media/linkedin-sues-data-scraping-company
At the White House dinner, Zuckerberg said Meta plans to spend "something like at least $600B" through 2028 on data centers and other infrastructure in the US (Kalley Huang/The Information)
https://www.theinformation.com/briefings/meta…
No such thing as a quiet weekend in cybersecurity, so check out today's Metacurity for the infosec developments you might have missed since Friday, including
--Scattered LAPSUS$ Hunters claims theft of 1 billion Salesforce records,
--Game engine developer Unity urges users to patch a severe flaw,
--Hackers stole some Discord user data after a third-party compromise,
--US immigration dramatically expands its spying ability,
--Asahi reverts to pen and paper to …
A joint Databroker Files investigation shows commercially traded mobile phone location data allows the tracking of millions of Europeans, including EU officials (netzpolitik.org)
https://netzpolitik.org/2025/databroker-files-targeting-the-eu/
Check out today's Metacurity for the most crucial cybersecurity developments you should know, including
--CISA plans to fire 54 employees despite court injunction,
--Google reports new ways threat actors can use AI in their attacks,
--KT accused of concealing BPFDoor infection,
--Meta earns $7b a year in scam ads,
--Hackers stole data from Hyundai AutoEver America,
--Chinese court sentences scam operators to death,
--NV ransomware attack took plac…
Things are moving fast in cyber world, so don't miss today's Metacurity for the most critical developments you should know, including
--ShinyHunters is extorting Red Hat, LAPSUS$ teen in UK custody may be connected,
--Jaguar Land Rover to resume production at some sites,
--Apple faces French probe over Siri recordings,
--SEC probes AppLovin's data collection practices,
--Google's CodeMender is a security Swiss army knife,
--Prankster tells W…
A look at data labeling startups like Objectways, whose workers record and annotate repetitive tasks like folding towels to train AI robots for physical tasks (Nilesh Christopher/Los Angeles Times)
https://www.latimes.com/business/story/202…
Google unveils Project Suncatcher to launch two solar-powered satellites, each with four TPUs, into low Earth orbit in 2027, as it seeks to scale AI compute (Reed Albergotti/Semafor)
https://www.semafor.com/article/11/04/2025/google-wants-…