Tootfinder

Opt-in global Mastodon full text search. Join the index!

@erc_bk@fosstodon.org
2025-05-29 14:02:20

Spark SQL pipe (|>) for Spark 4.0.0?!
issues.apache.org/jira/browse/

@berlinbuzzwords@floss.social
2025-05-20 10:02:17

With Apache NiFi, a multimodal data pipelining tool, you can assemble existing and/or custom Java & Python processors into a variety of flows. Join Lester Martin at Berlin Buzzwords this year and watch a rich data pipeline be constructed from Kafka, stored using the Apache Iceberg table format and consumed from Trino.
Learn more:

Apache Iceberg ingestion with Apache NiFi
Photo of Lester Martin
Join us from June 15-17 in Berlin or participate online / berlinbuzzwords.de

@datascience@genomic.social
2025-05-27 10:00:02

The inner workings of Parquet/Arrow data in R: #rstats

@arXiv_csDB_bot@mastoxiv.page
2025-05-29 07:17:06

StreamLink: Large-Language-Model Driven Distributed Data Engineering System
Dawei Feng, Di Mei, Huiri Tan, Lei Ren, Xianying Lou, Zhangxi Tan
arxiv.org/abs/2505.21575

@arXiv_astrophGA_bot@mastoxiv.page
2025-07-29 10:04:32

Unveiling the Sagittarius Dwarf Spheroidal Galaxy Core with Gaia DR3
Ellie K. H. Toguchi-Tani, Daniel R. Hey, Thomas de Boer, Peter M. Frinchaboy, Daniel Huber
arxiv.org/abs/2507.20212

@shaun@mastodon.xyz
2025-07-10 16:11:55

#Apache 2.4.64 is released! It fixes some vulnerabilities, listed here:
httpd.apache.org/security/vuln

@berlinbuzzwords@floss.social
2025-05-29 11:00:17

Modern applications require search capabilities that go beyond basic text matching. They must be fast, accurate, personalised and context-aware. At this year's Berlin Buzzwords, Saurabh Singh will demonstrate how OpenSearch’s latest AI/ML enhancements and engine improvements enable organisations to build intelligent, scalable search experiences that meet these evolving needs.
Learn more:

Session title: From Search to Insight: Leveraging OpenSearch for Scalable, AI-Driven Search Experiences
Saurabh Singh
Join us for Berlin Buzzwords on June 15-17 at Kulturbrauerei or online / berlinbuzzwords.de

@frankel@mastodon.top
2025-05-31 16:05:00

Apache Fury (incubating)
#java #python

@tante@tldr.nettime.org
2025-06-18 21:37:52

Well, it doesn't look like much but I just switched my infra from Apache to Caddy.
Sometimes doing some admin work is good for the soul.
(Also a preparation to install an AI scraper poison service)

@berlinbuzzwords@floss.social
2025-05-23 14:00:11

Join Andrew Musselman and Trevor Grant as they present the latest developments in Mahout's new quantum compute layer, Qumat. They will provide an overview of the project, explain why Qumat was developed, and demonstrate its current capabilities. They will also present a demo of Qumat in action and conclude with calls to action for researchers and engineers who are interested in using and contributing to the project.
Learn more:

Session title: Qumat: Apache Mahout Quantum Compute
Andrew Musselman
Trevor Grant
Join us from June 15-17 in Berlin or participate online / berlinbuzzwords.de

@arXiv_astrophEP_bot@mastoxiv.page
2025-07-17 09:37:10

Palomar and Apache Point Spectrophotometry of Interstellar Comet 3I/ATLAS
Matthew Belyakov, Christoffer Fremling, Matthew J. Graham, Bryce T. Bolin, Mukremin Kilic, Gracyn Jewett, Carey M. Lisse, Carl Ingebretsen, M. Ryleigh Davis, Ian Wong
arxiv.org/abs/2507.11720

@frankel@mastodon.top
2025-06-24 16:15:02

Carrot Cache: High-Performance, SSD-Friendly #Caching #Library for #Java

@berlinbuzzwords@floss.social
2025-05-15 11:00:10

Join Kevin Liang at this year's Berlin Buzzwords, where he will discuss how Apache Solr/Lucene builds dense vector indexes and talk about how he and his team optimised their dense vector setup, sharing the challenges they faced and the best practices they learned along the way.
Learn more:

Session title: Performance Tuning Apache Solr for Dense Vectors
Kevin Liang
Join us from June 15-17 in Berlin or participate online / berlinbuzzwords.de

@grumpybozo@toad.social
2025-07-05 21:07:35

The people committed to DDoSing the #Apache #SpamAssassin RuleQA server seem to have substantial resources. I’ve blocked a lot of them, but they keep coming, asking about things like the May 7 2017 performance of a single rule in one contributor's stats. Not stuff real people want.
Of course…

@cjust@infosec.exchange
2025-07-18 17:48:38

#ShamelesslyStolenFromTheOtherSite

Ryan Petersen (@typesfast):
Board should give him a raise. Without this viral moment, I'd never know that Astronomer is used by enterprise clients to manage Apache Airflow and achieve 70% higher uptime than self-managed Airflow.

@Techmeme@techhub.social
2025-06-10 15:20:57

Mistral launches its first reasoning models: Magistral Small, on Hugging Face under an Apache 2.0 license, and Magistral Medium, in preview on Mistral's Le Chat (Kyle Wiggers/TechCrunch)
techcrunch.com/2025/06/10/mist

@tante@tldr.nettime.org
2025-06-18 07:49:35

Been wanting to build it but I haven't had the time recently: Has anyone integrated iocaine or some similar anti AI scraper tools into apache?

@alm10965@mastodon.social
2025-07-06 11:40:18

Okay!…?
Some kind of schlager hip-hop
RAF Camora x Apache 207 - JUPITER
youtube.com/watch?v=3RvBj77W9sQ
> Stream "JUPITER" here:

@crell@phpc.social
2025-07-03 16:52:03

What's the preferred easy-to-use benchmarking tool these days for testing full HTTP responses? I know ab (apache bench), but it's also very old so I assume there's a new favorite.
This is for mostly informal tests, so ease of use > capability. Must run on Linux CLI.
#PHP

@arXiv_csDC_bot@mastoxiv.page
2025-06-04 07:25:23

Big Data-Driven Fraud Detection Using Machine Learning and Real-Time Stream Processing
Chen Liu, Hengyu Tang, Zhixiao Yang, Ke Zhou, Sangwhan Cha
arxiv.org/abs/2506.02008

@arXiv_astrophSR_bot@mastoxiv.page
2025-06-30 08:22:20

Apache Point Observatory follow-up of ACcelerating Candidate ExopLanet host Stars (APO ACCELS): Ages for 166 Accelerating Stars in the Northern Hemisphere
Anne E. Peck (Department of Astronomy, New Mexico State University), Eric L. Nielsen (Department of Astronomy, New Mexico State University), Robert J. De Rosa (European Southern Observatory), William Thompson (National Research Council, Herzberg Astronomy and Astrophysics), Bruce Macintosh (Department of Astronomy and Astrophysics, U…

@arXiv_astrophGA_bot@mastoxiv.page
2025-07-24 10:00:09

A Galactic Self-Portrait: Density Structure and Integrated Properties of the Milky Way Disk
Julie Imig, Jon A. Holtzman, Gail Zasowski, Jianhui Lian, Nicholas F. Boardman, Alexander Stone-Martinez, J. Ted Mackereth, Moire K. M. Prescott, Rachael L. Beaton, Timothy C. Beers, Dmitry Bizyaev, Michael R. Blanton, Katia Cunha, José G. Fernández-Trincado, Catherine E. Fielder, Sten Hasselquist, Christian R. Hayes, Misha Haywood, Henrik Jönsson, Richard R. Lane, Steven R. M…

@berlinbuzzwords@floss.social
2025-07-16 11:00:19

As data speeds increase, it has become crucial to detect problems as they happen. At this year's Berlin Buzzwords, Olena Kutsenko explained how to build a real-time anomaly detection system using Apache Kafka for streaming, Apache Flink for processing, and AI for pattern recognition, covering Apache Iceberg for storing historical data to improve models.
Watch the full session:

@arXiv_csOS_bot@mastoxiv.page
2025-05-15 08:58:24

This arxiv.org/abs/2504.06151 has been replaced.
initial toot: mastoxiv.page/@arXiv_csOS_…

@arXiv_csSE_bot@mastoxiv.page
2025-06-12 08:16:41

Microservices and Real-Time Processing in Retail IT: A Review of Open-Source Toolchains and Deployment Strategies
Aaditaa Vashisht (Department of Information Science,Engineering, RV College of Engineering, India), Rekha B S (Department of Information Science,Engineering, RV College of Engineering, India)
arxiv.org/abs/25…

@NuclearDisorder@mastodon.social
2025-07-08 04:54:21

69 years ago today: On July 8, 1956, the #Atomtest (nuclear test) "Apache" took place. Operation #Redwing was a US series of 17 nuclear test detonations from May to July 1956. They were conducted by Joint Task Force 7 (JTF7) on the atolls…

Map of all coordinates of "Operation Redwing (nuclear test)" in the Bikini and Eniwetok Atoll test area
Source: OpenStreetMap
License: Open Data Commons Open Database License (ODbL)

@arXiv_astrophEP_bot@mastoxiv.page
2025-05-30 07:29:34

Apache Point rapid response characterization of primitive pre-impact detection asteroid 2024 RW1
Carl Ingebretsen, Bryce T. Bolin, Robert Jedicke, Peter Vereš, Christine H. Chen, Carey M. Lisse, Russet McMillan, Torrie Sutherland, Amanda J. Townsend
arxiv.org/abs/2505.23736

@arXiv_csCR_bot@mastoxiv.page
2025-07-02 07:44:30

Plug. Play. Persist. Inside a Ready-to-Go Havoc C2 Infrastructure
Alessio Di Santo
arxiv.org/abs/2507.00189 arxiv.org…

@berlinbuzzwords@floss.social
2025-07-22 11:00:32

At Berlin Buzzwords 2025, Ved Prakash discussed how Siphon transformed their data pipeline using Apache Iceberg to successfully stream quality data into both Snowflake and Clickhouse simultaneously. In this short talk, you’ll learn about their battle-tested architecture, the performance improvements they’ve achieved, and their strategies for maintaining data consistency across two analytics engines.
Watch the full session:

@shaun@mastodon.xyz
2025-07-09 02:23:07

Got slammed by an unidentified but certainly "#AI"-related #distributed #crawler this week, it drove one site's traffic to 10× average. Today I tired of playing Whac-a-Mole and blocked the two bigge…

Output of a cut, sort, uniq, sort -n job on an Apache format access_log file. It shows around 30K entries per day on July 1, 2, 3, 4. Then suddenly ramping up to 200K and nearly 400K entries on subsequent days. The extra traffic is all from some asshole's "AI" crawler.
Part of an iptables listing from a Linux server. It shows some of my POLICY_DROP_WEB chains which block abusive traffic to 80,443 from various sources. Two rules added today, one for AS136907 (Huawei Cloud) and one for AS45899 (VNPT) have already blocked around 35,000 requests apiece.

@arXiv_csDC_bot@mastoxiv.page
2025-06-04 07:20:45

CityPulse: Real-Time Traffic Data Analytics and Congestion Prediction
Idriss Djiofack Teledjieu, Irzum Shafique
arxiv.org/abs/2506.01971

@arXiv_astrophSR_bot@mastoxiv.page
2025-07-17 08:19:40

Abundances of P, S, and K in 58 bulge spheroid stars from APOGEE
B. Barbuy, H. Ernandes, A. C. S. Friaça, M. S. Camargo, P. da Silva, S. O. Souza, T. Masseron, M. Brauner, D. A. Garcia-Hernandez, J. G. Fernandez-Trincado, K. Cunha, V. V. Smith, A. Pérez-Villegas, C. Chiappini, A. B. A. Queiroz, B. X. Santiago, T. C. Beers, F. Anders, R. P. Schiavon, M. Valentini, D. Minniti, D. Geisler, D. Souto, V. M. Placco, M. Zoccali, S. Feltzing, M. Schultheis, C. Nitschelm

@BBC6MusicBot@mastodonapp.uk
2025-05-31 13:26:50

🇺🇦 #NowPlaying on #BBC6Music's #JamzSupernova
Steven Julien:
🎵 Apache
#StevenJulien
stevenjulien.bandcamp.com/trac
open.spotify.com/track/7dkPQYZ

@berlinbuzzwords@floss.social
2025-05-15 14:00:15

Apache Flink is uniquely positioned to serve as the backbone for AI agents, equipping them with the powerful new tool of stream processing. Join Steffen Hoellinger at this year's Berlin Buzzwords to explore how Flink jobs can be transformed into “Agents”—autonomous, goal-driven entities that dynamically interact with data streams, trigger actions, and adapt their behaviour based on real-time insights.
Learn more:

Session title: Flink Jobs as Agents – Stream Processing for Agentic AI
Steffen Hoellinger
Join us from June 15-17 in Berlin or participate online / berlinbuzzwords.de

@berlinbuzzwords@floss.social
2025-05-21 12:40:04

We're thrilled to announce that @… has rejoined Berlin Buzzwords as a Platinum Partner!
Learn more about OpenSearch: opensearch.org/

Platinum Partner - OpenSearch

@BBC3MusicBot@mastodonapp.uk
2025-07-12 16:45:45

🇺🇦 #NowPlaying on BBCRadio3's #ThisClassicalLife
David Raksin & Johnny Mercer:
🎵 Love Song From "Apache"
#DavidRaksin #JohnnyMercer

@arXiv_astrophSR_bot@mastoxiv.page
2025-07-11 08:42:21

OWLS I: The Olin Wilson Legacy Survey
Brett M. Morris, Leslie Hebb, Suzanne L. Hawley, Kathryn Jones, Jake Romney
arxiv.org/abs/2507.07330

@arXiv_csDC_bot@mastoxiv.page
2025-06-05 07:17:32

Analysis of Server Throughput For Managed Big Data Analytics Frameworks
Emmanouil Anagnostakis, Polyvios Pratikakis
arxiv.org/abs/2506.03854

@berlinbuzzwords@floss.social
2025-07-15 11:00:12

At this year's Berlin Buzzwords, Michal Gancarski led a workshop demonstrating practical ways to deploy, configure, interact with, and utilise the advanced features of Apache Iceberg.
 
Watch the full session: youtu.be/v15EiNQt9R0?si=QDuAZ4

@berlinbuzzwords@floss.social
2025-05-12 16:25:08

Join Adrien Grand and Luca Cavanna at this year's Berlin Buzzwords as they share the fascinating journey to the release of version 10.0 of the popular Java search engine Apache Lucene, discussing the ups and downs, the team effort it took to get there, and much more.
Learn more:

Session title: Shipping Lucene 10.0, 25 years in the making
Adrien Grand
Luca Cavanna
Join us from June 15-17 in Berlin or participate online / berlinbuzzwords.de

@berlinbuzzwords@floss.social
2025-07-08 11:00:04

Apache Solr 9.8 introduces the LLM module, opening the doors to end-to-end natural language query support through vector-backed semantic search (K Nearest Neighbors). At Berlin Buzzwords 2025, Alessandro Benedetti discussed the open-source contributions from both an indexing and query perspective, as well as what's next for Solr in terms of Large Language Model integration.
Watch the full session here:

@berlinbuzzwords@floss.social
2025-07-09 11:00:19

At Berlin Buzzwords 2025, Javier Ramirez shared the journey of developing QuestDB, an Apache 2.0-licensed open-source time-series database, into a much faster analytical database.
Watch the full session: youtu.be/SuxHP3_KOgQ?si=mGVdSK

@berlinbuzzwords@floss.social
2025-07-07 11:00:25

At this year's Berlin Buzzwords, Ilaria Petreti, Anna Ruggero, and Edward Lambe presented an AI Filter Assistant for Statistical Data (SDMX). They demonstrated how large language models can suggest the most effective filters for your natural language queries and assist in refining your results in Apache Solr.
You can watch the full session here: