2025-10-15 13:04:01
I can't see how anything could possibly go wrong with this #MentalHealth
BoN Appetit Team at LeWiDi-2025: Best-of-N Test-time Scaling Can Not Stomach Annotation Disagreements (Yet)
Tomas Ruiz, Siyao Peng, Barbara Plank, Carsten Schwemmer
https://arxiv.org/abs/2510.12516
💊 Vålerenga call for anti-doping changes after artificial pitch causes footballer to fail drug test
https://www.theguardian.com/football/2025/nov/26/valerenga-anti-doping-changes-artificial-pitch-causes-footb…
The Morgan-Pitman Test of Equality of Variances and its Application to Machine Learning Model Evaluation and Selection
Argimiro Arratia, Alejandra Cabaña, Ernesto Mordecki, Gerard Rovira-Parra
https://arxiv.org/abs/2509.12185
PSA about food labeling in the US
We have a gluten detection service dog because many things that should be gluten free/say they’re gluten free are not actually gluten free.
Stuff gets contaminated when growing (e.g. next to wheat field), by shared equipment, in factories, from packaging, during transport and in-store.
Every US consumer should know:
1. The list of ingredients on food isn't exhaustive
2. Allergen labeling:
a) limited to just some allergens
b) manufacturers don't actually have to test
c) "certified" foods are tested—but not continuously
d) testing only works with enough contamination
Some certifications may require batch-testing, but usually they don't.
A "certified gluten free" product may e.g. contain oats which sometimes are contaminated with gluten—but as not every batch is tested it's impossible to know unless you test yourself (hence the service dog).
Even if the product is properly batch-tested, you might get a part of the product that has the allergen in it, whereas the tested part didn't.
Or the threshold was too low (our dog can detect gluten better than any available lab testing equipment; yes, dogs are amazing).
Food products also contain ingredients that do not have to be included on the label when they're "incidental" (included in another ingredient) or if they're considered part of the manufacturing process but not of the final product (e.g. various coatings on factory equipment).
Don't need to list flavors or specific spices either. ¯\_(ツ)_/¯
As for allergens, only those responsible for ~90% of food allergies* have to be specifically declared, and they're not tested for as it's simply based on the ingredients list.
Good luck if you have other allergies.
*milk, egg, fish, Crustacean shellfish, tree nuts, wheat, peanuts, soybeans
Interesting: my Tuesday article on email forwarding naturally showed a few email addresses as examples.
https://www.kuketz-blog.de/anbieter-von-e-mail-aliassen-im-test-mail-aliasse-teil-1/
Already today …
Oh my goodness!
Out of curiosity and for fun, I just checked my demo passwords with several commercial password management services. They are very simple and should not be used anywhere, because they are more than merely guessable.
And what do their websites show me in the password test?!?? Exactly: apparently I had used only secure demo/test passwords. I'll take their message as marketing rather than the reality of professionals 🤦
A Martingale Kernel Two-Sample Test
Anirban Chatterjee, Aaditya Ramdas
https://arxiv.org/abs/2510.11853 https://arxiv.org/pdf/2510.11853
TPSQLi: Test Prioritization for SQL Injection Vulnerability Detection in Web Applications
Guan-Yan Yang, Farn Wang, You-Zong Gu, Ya-Wen Teng, Kuo-Hui Yeh, Ping-Hsueh Ho, Wei-Ling Wen
https://arxiv.org/abs/2509.10920
Spin-induced Quadrupole Moment (SIQM) Test for Eccentric Compact Binaries
Syed U. Naqvi, Chandra Kant Mishra
https://arxiv.org/abs/2509.10675 https://arxiv…
Data-Model Co-Evolution: Growing Test Sets to Refine LLM Behavior
Minjae Lee, Minsuk Kahng
https://arxiv.org/abs/2510.12728 https://arxiv.org/pdf/2510.1272…
The Lovelace Test of Intelligence: Can Humans Recognise and Esteem AI-Generated Art?
Ewelina Gajewska
https://arxiv.org/abs/2509.11371 https://arxiv.org/pd…
😻 New test can flag drugs that could be harmful to cats
#cats
AppLovin shut down Array last quarter, calling it "a test product" that wasn't "economically viable", after allegations it led to unwanted Android app installs (Bloomberg)
https://www.bloomberg.com/news/articles/20
On the universal calibration of Pareto-type linear combination tests
Parijat Chakraborty, F. Richard Guo, Kerby Shedden, Stilian Stoev
https://arxiv.org/abs/2509.12066 https://
Quick teaser / test video of the PIC12F683 project.
Anyone have thoughts on format/audio quality etc?
https://www.youtube.com/watch?v=hQoWavKAhbA
A look behind the doors - deceptive or dreamy? Advent calendars put to the test #News #Nachrichten
I've just bought test materials to evaluate the relative strengths of a flax/balsa sandwich composite versus a carbon/Nomex (aramid) sandwich composite.
Yes, I know the carbon/Nomex will be stronger, but would the flax/balsa be strong enough for my #Tricycle project?
#BikeTooter
I got a pair of rain pants from 33,000ft and they are large enough to fit over my regular pants, so hopefully I can bike in the rain now. It may actually rain tomorrow morning so I might get a chance to test them.
from my link log —
ucs-detect: automatically test the Unicode version and support level of a terminal emulator.
https://ucs-detect.readthedocs.io/
saved 2025-11-15 https:/…
Planning a video "meeting" with friends via #nextcloud talk. I just did a test with my wife and I must admit that I am amazed how easily it just worked!
I'm curious if it works as expected.
Instagram unveils its first TV app, initially available on Amazon's Fire TV as a test with plans to expand to other TV platforms (Kurt Wagner/Bloomberg)
https://www.bloomberg.com/news/articles/2025-12-16/instagram-debuts-ded…
Development of a national thrust test facility for electric propulsion at Robinson Research Institute, New Zealand
Emile Webster, Ben Mallett
https://arxiv.org/abs/2509.10558 ht…
Public transport, Heinerliner, or carsharing after all? My first carsharing test with Book-n-Drive: super easy at first, then the surprise: suddenly a different car, 5 km away! 😅 Sometimes great, sometimes a flop, but it stays exciting. A first price comparison shows that carsharing, Heinerliner, and rail are not that far apart on price #LadeLust ⚡️
Resource-sensitive but language-blind: Community size and not grammatical complexity better predicts the accuracy of Large Language Models in a novel Wug Test
Nikoleta Pantelidou, Evelina Leivada, Paolo Morosi
https://arxiv.org/abs/2510.12463
DIPLODOCUS II: Implementation of transport equations and test cases relevant to micro-scale physics of jetted astrophysical sources
Christopher N. Everett, Marc Klinger-Plaisier, Garret Cotter
https://arxiv.org/abs/2510.12505
D-TPT: Dimensional Entropy Maximization for Calibrating Test-Time Prompt Tuning in Vision-Language Models
Jisu Han, Wonjun Hwang
https://arxiv.org/abs/2510.09473 https://…
A collection of stills from newly released videos of the #Starship returning is now leading https://skyweek.wordpress.com/2025/10/15/auch-suborbital-test-nr-11-verlief-problemlos/ - here are two in full resolution.
Shocked to hear that «After careful consideration, we have decided to stop accepting new customers for Amazon Glacier (original standalone vault-based service) starting on December 15, 2025. There will be no change to the S3 Glacier storage classes as part of this plan». /s
For years I've been duly billed either 0,01 € or 0,00 € per month for some ancient Glacier test I never bothered to clean up. Now that makes me some kind of protected species on
Selection Procedures in Competitive Admission
Nathan Hancart
https://arxiv.org/abs/2510.12653 https://arxiv.org/pdf/2510.12653
#TIL
I always struggled with #TestFirst #Programming because it requires much discipline to delay new features and concentrate on test cases first.
But while I was working on my
"Syberia Remastered" review: Caught between eras
A beautiful graphic adventure gets renovated. But "Syberia Remastered" unfortunately has weaknesses too. Kate Walker struggles with mice and a doppelganger.
…
'systemd-analyze' is a useful if random tool that's part of systemd; it's actually got a whole bunch of different useful bits thrown in. The 'blame', 'plot' and 'critical-chain' subcommands let you debug start up time. 'calendar' and 'timestamp' let you test if your format for a time/date is OK to use in a systemd file; 'verify' lets you check your systemd unit file for errors. There's loads more random bits.
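A minimal sketch of how those subcommands might be scripted from Python. The subcommand names come from the post; the helper names (`build_command`, `run_if_available`), the example time strings, and the unit file path are illustrative assumptions, not anything the post specifies. 'plot' is left out because it writes an SVG to stdout.

```python
import shutil
import subprocess


def build_command(subcommand, *args):
    """Return the argv list for one systemd-analyze invocation."""
    return ["systemd-analyze", subcommand, *args]


# Subcommands named in the post; every argument below is an illustrative assumption.
EXAMPLES = [
    build_command("blame"),            # per-unit start-up times
    build_command("critical-chain"),   # chain of units that delayed boot
    build_command("calendar", "Mon *-*-* 09:00:00"),     # check a calendar expression
    build_command("timestamp", "2025-12-24 18:00:00"),   # check a timestamp string
    build_command("verify", "/etc/systemd/system/example.service"),  # hypothetical unit file
]


def run_if_available(argv):
    """Run argv and return its stdout, or None when the binary is not installed."""
    if shutil.which(argv[0]) is None:
        return None
    done = subprocess.run(argv, capture_output=True, text=True, check=False)
    return done.stdout
```

Calling `run_if_available(EXAMPLES[2])` on a machine with systemd should print how the calendar expression is normalized; on a machine without it, the helper simply returns None.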
Generative AI-Enabled Adaptive Learning Platform: How I Can Help You Pass Your Driving Test?
Riya Gill, Ievgeniia Kuzminykh, Maher Salem, Bogdan Ghita
https://arxiv.org/abs/2509.11438
@… you can use BETA5 now to test hardware compatibility. Install to a USB flash drive.
@…
friends spot in Philly is super fuckin nice and I have been bestowed the honor of sleeping in Da Big Bed while im here :ablobreach:
so gonna just test that out by attempting to nap rn
Managed to put the new Pi music player together before leaving work today. Got it set up with remote control and a quick test of playback. Next stop: Speaker mounting and figuring out if I should bet on physical security through obscurity by making it not easily visible or using a moderately long RCA cable. What’s a reasonable cable length with decent quality cables?
heise | Powerbeats Fit review: earbuds with hooks
The Powerbeats Fit from Beats stay put in the ear and come with ANC. How do they compare with the AirPods Pro 3?
https://www.heise.de/tests…
The Autocar Christmas road test and coffee. Sunday morning is starting well.
#WeirdCarMastodon
On Korovkin-type theorems including exponential test functions on infinite intervals through power series convergence
Dilek Söylemez, Mehmet Ünver
https://arxiv.org/abs/2510.12568
Representation-Based Exploration for Language Models: From Test-Time to Post-Training
Jens Tuyls, Dylan J. Foster, Akshay Krishnamurthy, Jordan T. Ash
https://arxiv.org/abs/2510.11686
Titans Revisited: A Lightweight Reimplementation and Critical Analysis of a Test-Time Memory Model
Gavriel Di Nepi, Federico Siciliano, Fabrizio Silvestri
https://arxiv.org/abs/2510.09551
'That sucks': Who said it? Test your knowledge in the NFL Week 6 quote quiz https://www.espn.com/nfl/story/_/id/46585863/browns-steelers-rams-ravens-chargers-dolphins-patriots-jaguars-jets-week-6-2025-…
Beyond Test Scores: How Academic Rank Shapes Long-Term Outcomes
Emilia Del Bono, Angus Holford, Tommaso Sartori
https://arxiv.org/abs/2510.11973 https://ar…
Jiggled interferometer: Ground-based gravitational wave detector using rapidly-repeated free-falling test masses
Shoki Iwaguchi, Bin Wu, Kurumi Umemura, Tomohiro Ishikawa, Kenji Tsuji, Ryota Nishimura, Yuta Michimura, Yutaro Enomoto, Soichiro Morisaki, Yoichi Aso, Tomotada Akutsu, Keiko Kokeyama, Seiji Kawamura
https://arxiv.org/abs/2509.1…
@…
EDIT: nevermind. I converted the references but I'm still open to feedback if you have any.
do you have any advice for me?
I'm surely doing something wrong.
https://
ArtPerception: ASCII Art-based Jailbreak on LLMs with Recognition Pre-test
Guan-Yan Yang, Tzu-Yu Cheng, Ya-Wen Teng, Farn Wang, Kuo-Hui Yeh
https://arxiv.org/abs/2510.10281 htt…
60 years ago today (maybe), the muir beach acid test, during which the grateful dead & owsley stanley encounter one another for the very first time & owsley promptly freaks out & drives his car into a ditch. muir beach photo from later in the '60s. (sometimes the palo alto acid test is placed on 12/11 & muir beach on 12/18, but rosie mcgee says she met phil lesh for the 1st time on 12/11 & attended palo alto the following week, which seems to settle the matter.) [1/4]…
For the @… I tested three well-known email alias services:
https://www.kuketz-blog.de/anbieter-von-e-mail-alias…
More Screen Time Linked To Lower Test Scores For Elementary Students - Slashdot
https://mobile.slashdot.org/story/25/10/11/0036214/more-screen-time-linked-to-lower-test-scores-for-elementary-students
55 years ago today: on 16 December 1970, four #Atomtests, "Avens-Andorre/Alkermes/Asamite/Cream", took place simultaneously. Operation #Emery was a US series of 24 tests in 1970/71. They were conducted underground at the Nevada Test Site (NTS) and served weapons development.
Raider Nation Origin Stories: Meet James Hollins, America's unsung first Black nuclear test leader https://www.raiders.com/news/raider-nation-origin-stories-meet-james-hollins-america-s-unsung-first-black-nuclear-tes…
why does my state's official driver test practice app look like it was created by a 12yo on codepen
heise | MetaImage photo metadata manager for macOS reviewed
MetaImage helps you view and edit photo metadata.
https://www.heise.de/tests/Fotodaten-Manag
from my link log —
The unreasonable effectiveness of modern sort algorithms.
https://github.com/Voultapher/sort-research-rs/blob/main/writeup/unreasonable/text.md
saved 2025-09-14
A Kolmogorov-Smirnov-Type Test for Dependently Double-Truncated Data
Anne-Marie Toparkus, Rafael Weissbach
https://arxiv.org/abs/2510.11517 https://arxiv.o…
Those responsible have not understood that the time of symbolic projects is over. FCAS would not be a prestige playground for aerospace corporations but a stress test for the much-invoked "strategic sovereignty". If Europe fails this test, dependence on US systems will be cemented, and every further appeal to "European independence" will become an empty shell. | Horst Schulte | Why Europe's fighter jet project is failing
🏃♀️ Air Force announces new force-wide fitness test: 2-mile run, twice per year
#airforce
Agentic Property-Based Testing: Finding Bugs Across the Python Ecosystem
Muhammad Maaz, Liam DeVoe, Zac Hatfield-Dodds, Nicholas Carlini
https://arxiv.org/abs/2510.09907 https:/…
Shares of Tesla closed at a 2025 high on Monday after the company confirmed it is testing driverless vehicles in Austin without a human safety operator (Lora Kolodny/CNBC)
https://www.cnbc.com/2025/12/15/tesla-tests-driverless-cars-in-austin…
"Anno 117" review: Rome for confectioners
Production chains are important, but they are not everything: with "Anno 117", the series plays to its old strengths and scores above all with a gorgeous antiquity.
https://www.…
Trixie update: Ceph got uninstalled from my storage cluster node due to package version conflicts or something.
This is a bit of a problem. (I did test on one node of a 3-way redundant cluster so loss of one isn't the end of the world but it's an annoyance...)
This Week in Sports Trivia: October 16, 2025 https://www.nytimes.com/athletic/6720263/2025/10/16/this-week-in-sports-trivia-october-16-2025/
Learning-To-Measure: In-context Active Feature Acquisition
Yuta Kobayashi, Zilin Jing, Jiayu Yao, Hongseok Namkoong, Shalmali Joshi
https://arxiv.org/abs/2510.12624 https://
Duolingo launches a five-episode anime series in collaboration with Titmouse, premiering October 13 on YouTube, as a marketing campaign (Todd Spangler/Variety)
https://variety.com/2025/digital/news/duolingo-anime-series-final-test-…
»Password managers: test reveals security gaps in user data:
Password managers tested - only three out of ten products fully encrypt all data. With some, the vendors can access passwords«
One more argument for trusting the open-source password manager @… & Co. This is still, by comparatively few priv. & …
Efficient Real-World Deblurring using Single Images: AIM 2025 Challenge Report
Daniel Feijoo, Paula Garrido-Mellado, Marcos V. Conde, Jaesung Rim, Alvaro Garcia, Sunghyun Cho, Radu Timofte
https://arxiv.org/abs/2510.12788
Crosslisted article(s) found for cs.AI. https://arxiv.org/list/cs.AI/new
[9/11]:
- Inducing Uncertainty for Test-Time Privacy
Muhammad H. Ashiq, Peter Triantafillou, Hung Yun Tseng, Grigoris G. Chrysos
"Routine" review: In space, everyone can hear you walk
"Routine" is a short, visually stunning repair visit to a moon base with killer bots that sometimes gets in its own way.
https://www.
Search-based Hyperparameter Tuning for Python Unit Test Generation
Stephan Lukasczyk, Gordon Fraser
https://arxiv.org/abs/2510.08716 https://arxiv.org/pdf/…
Prompting Test-Time Scaling Is A Strong LLM Reasoning Data Augmentation
Sondos Mahmoud Bsharat, Zhiqiang Shen
https://arxiv.org/abs/2510.09599 https://arxi…
LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence
Zixin Yin, Xili Dai, Duomin Wang, Xianfang Zeng, Lionel M. Ni, Gang Yu, Heung-Yeung Shum
https://arxiv.org/abs/2509.12203
⚡ Will electric tractors gain traction? At a pilot event for farmers, researchers see possibilities
https://techxplore.com/news/2025-09-electric-tractors-gain-traction-event.html
Constraint-Guided Unit Test Generation for Machine Learning Libraries
Lukas Krodinger, Altin Hajdari, Stephan Lukasczyk, Gordon Fraser
https://arxiv.org/abs/2510.09108 https://
Top 10 Browser Fingerprint Test Tools in 2025
A browser fingerprint is the unique set of data your browser and device reveal #online, such as system info, fonts, screen size, and #IPaddress. Detecting your #browser
Raiders Face Daunting Climb to Pull Off Week 7 Upset https://www.si.com/nfl/raiders/las-vegas-brock-bowers-pete-carroll-kansas-city-chiefs
36 years ago today: on 14 November 1989, the #USA tested the atomic bomb "Muleshoe". Operation Aqueduct was a series of 10 US #Kernwaffentests conducted underground in 1989 and 1990 at the Nevada Test Site in Nevada.
Preservation of Language Understanding Capabilities in Speech-aware Large Language Models
Marek Kubis, Paweł Skórzewski, Iwona Christop, Mateusz Czyżnikiewicz, Jakub Kubiak, Łukasz Bondaruk, Marcin Lewandowski
https://arxiv.org/abs/2509.12171
How media coverage of Trump's AI EO overstated federal authority over states and overlooked how the order's interstate commerce argument could backfire (Mike Masnick/Techdirt)
https://www.techdirt.com/2025/12/12/trump-pretends-to-…
67 years ago today: on 15 October 1958, the USA conducted "Hamilton", the 15th nuclear test of Operation Hardtack II, a series of 37 #Atomtests at the #Nevada Test Site (NTS). Various methods were tried, from balloon tests to underground explosions.
Sedeve-Kit, a Specification-Driven Development Framework for Building Distributed Systems
Hua Guo, Yunhong Ji, Xuan Zhou
https://arxiv.org/abs/2509.11566 https://
Rethinking Technology Stack Selection with AI Coding Proficiency
Xiaoyu Zhang, Weipeng Jiang, Juan Zhai, Shiqing Ma, Qingshuang Bao, Chenhao Lin, Chao Shen, Tianlin Li, Yang Liu
https://arxiv.org/abs/2509.11132