Tootfinder

@heiseonline@social.heise.de
2025-06-06 17:10:00

heise | Silent keyboard Be Quiet Dark Mount tested: How quiet mechanical can be
No clicking, clattering or scratching: the Dark Mount Silent Tactile from Be Quiet impresses acoustically. What the mechanical keyboard delivers.

@gwire@mastodon.social
2025-06-07 19:09:06

With most apps, "[feature] enable push notifications" is just a line in the changelog, not a segment on BBC News.
https://www.gov.uk/government/news/patients-to-receive-reminders-and-test-results-via-the-nhs-app

Patients to receive reminders and test results via the NHS App
Millions more patients will receive appointments, screening invitations and other important information via the NHS App, as part of £50million upgrade.

@nofollownoindex@deppenkessel.de
2025-06-08 09:34:10

Anecdotal evidence, but... in Albania, combustion-engine taxis apparently no longer pay off! https://www.heise.de/forum/heise-online/Kommentare/Hyundai-Ioniq-5-im…

heise online
News and forums on computers, IT, science, media and politics. Price comparison for hardware and software, plus downloads, from Heise Medien.

@memeorandum@universeodon.com
2025-06-07 18:16:09

Biden's doctor failed to properly assess fitness for office, Obama's doctor says (Paige Winfield Cunningham/Washington Post)
https://www.washingtonpost.com/health/2025/06/07/biden-obama-trump-cognitive-test/
http://www.memeorandum.com/250607/p39#a250607p39

@juandesant@astrodon.social
2025-06-06 21:22:40

Combined Public Service Announcement and Today I Learned: if you have `CLICOLOR=1` on macOS, if you do an `ls` of a directory which is inside iCloud Drive, evicted files (i.e., those that cannot be immediately used, but need to be downloaded first from iCloud) show with a grey background… see below:
First, both files are in iCloud, but not locally available. They have a dotted cloud icon in the iCloud Status field of the Finder window, and a greyed background in the output of the `ls` …

A Finder window showing a folder named "CLICOLOR_Test" with two files: a PDF called "cssday.pdf", and a PNG file with a name that starts with "Screenshot-2025-" and ends with "17.08.34.png". The iCloud Status icon shows a cloud with a down arrow, indicating that it is not currently downloaded.

Screenshot of Terminal.app showing the command `ls -ln` run on the CLICOLOR_Test folder, and showing both files with a gray background in their filename.

Another screenshot of the CLICOLOR_Test folder, now with cssday.pdf showing a down-arrow circle (representing that the file will be kept downloaded), which means it is undoubtedly locally available.

Screenshot of Terminal.app showing the command `ls -ln` run on the CLICOLOR_Test folder, but this time the `cssday.pdf` file shows a white background, while the screenshot file still sports a gray background behind its filename.
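
For readers who would rather check this from a script than from `ls` colors, here is a minimal Python sketch. It assumes that evicted iCloud Drive files carry the macOS `SF_DATALESS` file flag (value `0x40000000` in `<sys/stat.h>`); the toot itself does not say which attribute `ls` keys on, so treat that link as an assumption. The helper names `is_dataless` and `evicted_files` are hypothetical.

```python
import os
from typing import List

# SF_DATALESS as defined in macOS <sys/stat.h>. Assumption: this is the flag
# set on evicted (not locally available) iCloud Drive files.
SF_DATALESS = 0x40000000


def is_dataless(st_flags: int) -> bool:
    """Return True if the given st_flags value has the dataless bit set."""
    return bool(st_flags & SF_DATALESS)


def evicted_files(directory: str) -> List[str]:
    """List entries in `directory` whose stat flags mark them as dataless."""
    names = []
    for entry in os.scandir(directory):
        # st_flags only exists on BSD-like systems such as macOS;
        # elsewhere we fall back to 0, so nothing is reported.
        flags = getattr(entry.stat(follow_symlinks=False), "st_flags", 0)
        if is_dataless(flags):
            names.append(entry.name)
    return sorted(names)
```

On a Mac, something like `evicted_files(os.path.expanduser("~/path/to/CLICOLOR_Test"))` (path hypothetical) could be used to cross-check the grey-background entries.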

@arXiv_csAR_bot@mastoxiv.page
2025-06-09 07:16:42

ScaleRTL: Scaling LLMs with Reasoning Data and Test-Time Compute for Accurate RTL Code Generation
Chenhui Deng, Yun-Da Tsai, Guan-Ting Liu, Zhongzhi Yu, Haoxing Ren
https://arxiv.org/abs/2506.05566

ScaleRTL: Scaling LLMs with Reasoning Data and Test-Time Compute for Accurate RTL Code Generation
Recent advances in large language models (LLMs) have enabled near-human performance on software coding benchmarks, but their effectiveness in RTL code generation remains limited due to the scarcity of high-quality training data. While prior efforts have fine-tuned LLMs for RTL tasks, they do not fundamentally overcome the data bottleneck and lack support for test-time scaling due to their non-reasoning nature. In this work, we introduce ScaleRTL, the first reasoning LLM for RTL coding that scal…

@ThatHoarder@mastodon.online
2025-06-08 14:34:17

What if I test out treating myself like I'm a person who doesn't deserve to be paralysed by shame? What if I test out being compassionate to myself instead of yelling at myself? https://www.

@Techmeme@techhub.social
2025-06-05 02:26:05

Source: Amazon is developing software for humanoid robots to deliver packages and is near completion of an indoor "humanoid park" in San Francisco to test them (Rocket Drew/The Information)
https://www.theinformation.com/articles/am

Amazon Prepares to Test Humanoid Robots for Delivering Packages
Amazon is developing software for humanoid robots that could eventually take the jobs of delivery workers, according to a person who has been involved in the effort. In doing so, Amazon is paving the way to automate a major part of its operation, the delivery of parcels around the world.

@UP8@mastodon.social
2025-06-06 22:10:25

💉 Olympic anti-doping lab puts U.S. meat supply to the test
#food

Olympic anti-doping lab puts U.S. meat supply to the test
Scientists at UCLA's Olympic Analytical Laboratory turned their sophisticated analytical capabilities for testing athlete samples for performance-enhancing drugs to research examining the U.S. meat supply as part of a study led by Texas Tech. The study was designed to investigate concerns that residues of growth promoters used in meat production could potentially cause athletes to test positive.

@azonenberg@ioc.exchange
2025-06-08 12:58:21

Writing a unit test for a newer, faster version of a libscopehal primitive and it wasn't lining up with the original results.
Turns out there was a bug in the original (FindZeroCrossings would fail to correctly detect a crossing between samples 0 and 1).
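
To illustrate the kind of edge case described, here is a small Python sketch of zero-crossing detection with linear interpolation. It is not libscopehal's `FindZeroCrossings` (which is C++ and whose details are not shown in the toot); the function name and signature below are hypothetical. The point is only that the loop has to start at index 0 so the pair of samples 0 and 1 is examined like every other pair.

```python
from typing import List, Sequence


def find_zero_crossings(samples: Sequence[float], threshold: float = 0.0) -> List[float]:
    """Return fractional sample indices where the signal crosses `threshold`."""
    crossings = []
    # Start at index 0 so a crossing between samples 0 and 1 is not skipped.
    for i in range(len(samples) - 1):
        a = samples[i] - threshold
        b = samples[i + 1] - threshold
        # The two samples lie on opposite sides of the threshold; a sample
        # landing exactly on the threshold counts as the end of the crossing.
        if (a < 0 <= b) or (a > 0 >= b):
            # Linear interpolation between sample i and sample i + 1.
            crossings.append(i + a / (a - b))
    return crossings
```

A regression test for the reported bug would feed a signal whose first crossing sits between the first two samples, for example `find_zero_crossings([-1.0, 1.0, 1.0, -1.0])`.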

@raiders@darktundra.xyz
2025-06-08 14:27:08

Raiders Have a Test on the Road Early in the Season https://www.si.com/nfl/raiders/las-vegas-jayden-daniels-washington-commanders-pete-carroll-dan-quinn

Raiders Have a Test on the Road Early in the Season
The Las Vegas Raiders will be facing off against the Washington Commanders in week three of the 2025 NFL season. How will they do?

@simon_brooke@mastodon.scot
2025-06-06 10:32:02

"As I've mentioned before, I'm working (with a group of other volunteers) on producing a Local Place Plan for Auchencairn, and, in the process of doing that, we're using the Place Standard Tool. Data from the Place Standard Tool is delivered as an unwieldy CSV file with 53 columns, most of which contain narrative data.
This is pretty hard to analyse... so I've written a tool to automate summarising the data. And, actually, I'm quite pleased with it." -- m…

Place Standard Tool Summariser: Full Test Run

@roelgrif@mstdn.social
2025-06-07 23:28:53

"There's one way to end this war. You say to Vladimir Putin:
If you don't stop this war, and we agree to the ceasefire terms I dictate, we are bringing Ukraine into NATO within 30 days. Which part of that sentence don't you understand? Do you want to mess with me? Test me.
That's how you end this war."
https://

- YouTube
Enjoy the videos and music you love, upload original content, and share it all with friends, family, and the world on YouTube.

@frankel@mastodon.top
2025-06-07 16:12:07

Why I Don’t Use #Mocking Frameworks and Why You Might Not Need Them Either by @…

Why I Don't Use Mocking Frameworks and Why You Might Not Need Them Either - Martinelli
“I never use mocking frameworks like Mockito. Why? Either I have my test data under control, or I write the methods in a functional way.” When I say this, it usually provokes strong reactions. Mocking has become such a standard part of unit testing that it seems almost rebellious to suggest otherwise. But after many [...]
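
As a rough illustration of the "functional way" mentioned in the quote, here is a Python sketch (the linked article concerns Java and Mockito; this is not taken from it). The `Order` type and `customers_over` function are hypothetical. Because the function receives its data as an argument instead of fetching it from a repository, a test can pass plain values and needs no mock.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Order:
    customer: str
    total_cents: int


def customers_over(orders: Iterable[Order], limit_cents: int) -> List[str]:
    """Pure function: the caller supplies the data, so tests need no mock."""
    return sorted({o.customer for o in orders if o.total_cents > limit_cents})


# The test data is fully under the test's control.
sample = [Order("ann", 500), Order("bob", 1500), Order("bob", 2500)]
result = customers_over(sample, 1000)
```

The code that actually loads orders from a database would then be a thin layer around this function, covered by a separate integration test.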

@macandi@social.heise.de
2025-06-04 06:03:00

heise | Flexible writing workshop: adoc Studio tested
It doesn't always have to be a classic word processor: adoc Studio uses AsciiDoc to format extensive texts and documentation.
https://www.

@geant@mstdn.social
2025-05-07 13:29:11

📣 Calling all GÉANT Project partners…
Got an idea for digital research, data transfer or secure storage solutions to support open science?
The 2025 GÉANT Above-the-Net Services Incubator is officially open for proposals!
This is your opportunity to:
✅ Develop and test your innovative idea
✅ Deliver impact to the whole community through new, shared, open-source services
✅ Help shape the future of GÉANT's Above-the-Net services portfolio
Learn more: …

GÉANT Project Above-the-Net Services Incubator: Call for Proposals | GÉANT CONNECT Online
The GÉANT (GN5-2) Project’s Above-the-Net Services incubator is looking for practical proposals around three areas of service concept investment being developed by the GN5-2 Above-the-Net Services team. These are: Digital Research Environment (DRE) Object Storage Infrastructure Data Movement Infrastructure Through the incubator, the GN5-2 project will offer person months (PMs) to project partners to develop

@PaulWermer@sfba.social
2025-06-07 12:13:16

I've had a number of reasons to want to understand the standards (did you know that the industry work group creating the standards has vested interests in what is/ is not included, what the test method is, etc. So of course you can trust them without knowing the details, right?).
So this👇 is spot on:
https://

Aires (@aires@tiggi.es)
@matildalove@wetdry.world @soatok@furry.engineer ISO: "We created global standards for everyone to follow" Everyone: "Can we see them?" ISO: "No"

@krone@frawas.de
2025-06-06 09:33:22

Test: Solarbank 3 Pro - Power from the balcony: What the power plant really delivers #News #Nachrichten

Test: Solarbank 3 Pro - Power from the balcony: What the power plant really delivers
Self-generated electricity from a "balcony power plant": there are turnkey solutions for this with a buffer battery that [offer] easy installation and ...

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 08:08:25

The Ultimate Test of Superintelligent AI Agents: Can an AI Balance Care and Control in Asymmetric Relationships?
Djallel Bouneffouf, Matthew Riemer, Kush Varshney
https://arxiv.org/abs/2506.01813

The Ultimate Test of Superintelligent AI Agents: Can an AI Balance Care and Control in Asymmetric Relationships?
This paper introduces the Shepherd Test, a new conceptual test for assessing the moral and relational dimensions of superintelligent artificial agents. The test is inspired by human interactions with animals, where ethical considerations about care, manipulation, and consumption arise in contexts of asymmetric power and self-preservation. We argue that AI crosses an important, and potentially dangerous, threshold of intelligence when it exhibits the ability to manipulate, nurture, and instrumen…

@fossunleashed@social.linux.pizza
2025-06-05 17:27:54

A while back I discovered that ttyper (a typing test program for the terminal) can use a custom wordlist. I decided to create one entirely out of words I always misspell. I don't think my WPM will ever recover TT
https://github.com/max-niederman/ttyper

GitHub - max-niederman/ttyper: Terminal-based typing test.
Terminal-based typing test. Contribute to max-niederman/ttyper development by creating an account on GitHub.

@arXiv_statML_bot@mastoxiv.page
2025-06-05 10:03:51

This https://arxiv.org/abs/2504.09567 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_sta…

Conditional Independence Test Based on Transport Maps
Testing conditional independence between two random vectors given a third is a fundamental and challenging problem in statistics, particularly in multivariate nonparametric settings due to the complexity of conditional structures. We propose an innovative framework for testing conditional independence using transport maps. At the population level, we show that two well-defined transport maps can transform the conditional independence test into an unconditional independence test, this substantial…

@heiseonline@social.heise.de
2025-06-02 11:29:00

heise | Click more comfortably: Four ergonomic mice tested
Vertical mice are meant to go easy on the wrists. You benefit from that in office work and, thanks to modern hardware, sometimes in gaming too.

@midtsveen@social.linux.pizza
2025-06-06 18:49:12

Test @…!
#RudolfRocker

The image is a sepia-toned portrait of an older man. He is wearing a dark suit with a white shirt and a dark tie. The man has a full, white beard and mustache, and his hair is curly and graying. He is wearing round, wire-rimmed glasses. The background is a plain, dark color, which contrasts with the subject's lighter hair and beard. The man is looking directly at the camera with a serious expression. The photograph has a vintage quality, suggesting it was taken in the early 20th century.

@Techmeme@techhub.social
2025-06-04 13:21:31

ChatGPT-4o, Claude 3.7 Sonnet, Gemini 2.0 Flash, Llama 4, and Copilot comparison: Claude was the best overall with the highest consistency and no hallucinations (Geoffrey A. Fowler/Washington Post)
https://www.washingtonpost.com/technology/

Review | 5 AI bots took our tough reading test. One was smartest — and it wasn’t ChatGPT.
We challenged AI helpers to decode legal contracts, simplify medical research, speed-read a novel and make sense of Trump speeches. Some of the AI analysis was impressive — and some was downright dumb.

@arXiv_astrophCO_bot@mastoxiv.page
2025-06-06 07:28:42

A novel test of gravity: Does spacetime geometry track matter density?
Camille Bonvin, Nastassia Grimm, Isaac Tutusaus
https://arxiv.org/abs/2506.04387 htt…

A novel test of gravity: Does spacetime geometry track matter density?
We propose a novel test of gravity that combines galaxy clustering with gravitational lensing. In general relativity, the evolution of matter density fluctuations and of the Weyl potential -- the sum of spatial and temporal distortions of the geometry -- are governed by the same growth function. In contrast, alternative theories of gravity that modify the relation between geometry and matter content can lead to differences in these two growths. Exploiting a recent method to directly measure the…

@arXiv_csSE_bot@mastoxiv.page
2025-06-04 07:46:02

GenFair: Systematic Test Generation for Fairness Fault Detection in Large Language Models
Madhusudan Srinivasan, Jubril Abdel
https://arxiv.org/abs/2506.03024

GenFair: Systematic Test Generation for Fairness Fault Detection in Large Language Models
Large Language Models (LLMs) are increasingly deployed in critical domains, yet they often exhibit biases inherited from training data, leading to fairness concerns. This work focuses on the problem of effectively detecting fairness violations, especially intersectional biases that are often missed by existing template-based and grammar-based testing methods. Previous approaches, such as CheckList and ASTRAEA, provide structured or grammar-driven test generation but struggle with low test diver…

@selea@social.linux.pizza
2025-06-02 10:36:15

Test site deployed
This will be running for a few weeks as a test.
#meshtastic

@UP8@mastodon.social
2025-05-07 16:22:42

✒️ Scientists have found a way to 'tattoo' tardigrades
... I think there will be a lot of competition for Ig Nobels this year
https://phys.org/news/2025-04-scientists-tattoo-tardigrades.html

Scientists have found a way to 'tattoo' tardigrades
If you haven't heard of a tardigrade before, prepare to be wowed. These clumsy, eight-legged creatures, nicknamed water bears, are about half a millimeter long and can survive practically anything: freezing temperatures, near starvation, high pressure, radiation exposure, outer space and more. Researchers reporting in the journal Nano Letters took advantage of the tardigrade's nearly indestructible nature and gave the critters tiny "tattoos" to test a microfabrication technique to build microsc…

@deabigt@universeodon.com
2025-06-07 01:13:23

Think movies and shows entirely generated from AI are still a long way off? Check this video made with Veo 3 that starts at just $67/month
Veo3 test // non-existent car show https://www.veo.co/en-us/pricing

@benrosstransit@mastodon.social
2025-06-05 21:49:50

Maryland DOT launches new policy of quick-fix road diets with paint & plastic "bollards" after initial test projects work well. Approach is try it for 6-9 months and see if it works in place of long studies.
https://www.

Maryland plans more ‘quick build’ road safety changes, citing success across state
Planners have a menu of traffic calming measures to consider for each site.

@arXiv_physicschemph_bot@mastoxiv.page
2025-06-06 07:33:24

Localised and Delocalised Charge Distribution in a Diamine Cation and Rydberg Excited State: A Challenging Test for Density Functionals
Benedikt O. Birgisson, Marta Gałyńska, Hemanadhan Myneni, Elvar Ö. Jónsson, Ragnar Bjornsson, Hannes Jónsson
https://arxiv.org/abs/2506.05077

Localised and Delocalised Charge Distribution in a Diamine Cation and Rydberg Excited State: A Challenging Test for Density Functionals
The balance between localised and delocalised electron distribution in the N,N'-dimethylpiperazine (DMP) molecule in the 3s Rydberg excited state and in the fully ionised DMP$^+$ provides a valuable test of density functionals, in particular the weight of Fock exchange (FE) in hybrid functionals and the scaling of explicit orbital-based self-interaction correction (SIC) applied to less elaborate functionals. We present results of calculations using density functionals of all rungs of Jacob's la…

@netzschleuder@social.skewed.de
2025-06-04 10:00:05

hiv_transmission: HIV transmission network (1988-2001)
A set of networks of HIV transmissions between people through sexual, needle-sharing, or social connections, based on combining 8 datasets collected from 1988 to 2001. Metadata includes test results of several diseases, as well as demographic variables such as age, ethnicity, and gender. Networks come in two flavors: egodyads and altdyads. Egodyads are the network among study-participants and their direct partners. Altdyads are the…

hiv_transmission: HIV transmission network (1988-2001). 35229 nodes, 85890 edges. https://networks.skewed.de/net/hiv_transmission

hiv_transmission — HIV transmission network (1988-2001)
A set of networks of HIV transmissions between people through sexual, needle-sharing, or social connections, based on combining 8 datasets collected from 1988 to 2001. Metadata includes test results of several diseases, as well as demographic variables such as age, ethnicity, and gender. Networks come in two flavors: egodyads and altdyads. Egodyads are the network among study-participants and their direct partners. Altdyads are the network among people who are connected to the egodyad network. …

@gfriend@mas.to
2025-05-05 20:51:42

Everything you wanted to know about birthright citizenship. And then some.
Trump’s Birthright Citizenship Arguments at the Supreme Court Are Epically Bad - Slate https://apple.news/AGkCaScEfRkm2VNFajiKAfg

Trump’s Birthright Citizenship Arguments at the Supreme Court Are Epically Bad — Slate
The 14th Amendment’s authors saw Trump’s citizenship test coming and rejected it.

@drbruced@aus.social
2025-06-06 03:51:08

This is one of the better articles I’ve seen on where AI might lead, trying to find a middle ground between “it’s a load of hype” and “it’s going to solve world hunger/kill us all”.
A choice quote:
Which is it: business as usual or the end of the world? “The test of a first-rate intelligence,” F. Scott Fitzgerald famously claimed, “is the ability to hold two opposed ideas in the mind at the same time, and still retain the ability to function.” Reading these reports back-to-back,…

Two Paths for A.I.
Joshua Rothman on two papers about the future of A.I., by deeply knowledgeable experts, which arrive at absurdly divergent conclusions about the threat the tech poses.

@arXiv_statME_bot@mastoxiv.page
2025-06-03 08:06:20

Characterization based Goodness-of-Fit for Generalized Pareto Distribution: A Blend of Stein's Identity and Dynamic Survival Extropy
Gaurav Kandpal, Nitin Gupta
https://arxiv.org/abs/2506.01473

Characterization based Goodness-of-Fit for Generalized Pareto Distribution: A Blend of Stein's Identity and Dynamic Survival Extropy
This paper proposes a goodness of fit test for the generalized Pareto distribution (GPD). Firstly, we provide two characterizations of GPD based on Stein's identity and dynamic survival extropy. These characterizations are used to test GPD separately for the positive and negative shape parameter cases. A Monte Carlo simulation is conducted to provide the critical values and power of the proposed test against a good number of alternatives. Our test is simple to use and it has asymptotic normalit…

@arXiv_csLG_bot@mastoxiv.page
2025-06-03 21:46:06

This https://arxiv.org/abs/2505.17155 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…

TrimR: Verifier-based Training-Free Thinking Compression for Efficient Test-Time Scaling
Large Reasoning Models (LRMs) demonstrate exceptional capability in tackling complex mathematical, logical, and coding tasks by leveraging extended Chain-of-Thought (CoT) reasoning. Test-time scaling methods, such as prolonging CoT with explicit token-level exploration, can push LRMs' accuracy boundaries, but they incur significant decoding overhead. A key inefficiency source is LRMs often generate redundant thinking CoTs, which demonstrate clear structured overthinking and underthinking patter…

@kurt@nelson.fun
2025-06-05 20:51:34

I just did the generic "Applications Engineer" job test for City of SF and it definitely was written by someone who hasn't moved past domain controllers and 2005. I got a "15.0", whatever that means for the rule of the list.

@curiouscat@fosstodon.org
2025-05-06 16:33:00

5 Things You Might Not Know About Me
- I have flown on “Air Force One.” Not technically, since the president was not aboard, but while working for the White House Military Office I flew on the plane on a couple of test flights.
- I spent many Thanksgivings beating John Dower https://curiouscat.com/books/dower

@blakes7bot@mas.torpidity.net
2025-06-05 15:31:35

Series C, Episode 02 - Powerplay
TARRANT: She wasn't exactly gentle with me. [Harmon and a guard enter]
GUARD: What happened?
TARRANT: He went for a gun. He wasn't going to chance taking the voice test.
KLEGG: There are no weapons on him, he's unarmed.
https://blake.torpidity.net/m/302/57

Blakes7Bot | Search
Companion site to the Blakes7Bot Twitter, Mastodon, and Bluesky accounts. Automated lines of Blakes 7 dialog. Includes a script search facility.

@mro@digitalcourage.social
2025-06-05 06:44:49

Hi @…,
as you implemented #RFC9421 you may know - which servers in the wild use it?
#Activitypub

@macandi@social.heise.de
2025-05-30 10:03:00

Mac & i 3/25: Travel accessories, freeing up Mac storage, mail apps tested, iPad 11
How good are the entry-level MacBook and iPad? Which mail app is the best? And which apps and accessories do you need on the go? The new Mac & i explains.

@arXiv_csCR_bot@mastoxiv.page
2025-06-02 07:18:00

Looking for Attention: Randomized Attention Test Design for Validator Monitoring in Optimistic Rollups
Suhyeon Lee
https://arxiv.org/abs/2505.24393 https:/…

Looking for Attention: Randomized Attention Test Design for Validator Monitoring in Optimistic Rollups
Optimistic Rollups (ORUs) significantly enhance blockchain scalability but inherently suffer from the verifier's dilemma, particularly concerning validator attentiveness. Current systems lack mechanisms to proactively ensure validators are diligently monitoring L2 state transitions, creating a vulnerability where fraudulent states could be finalized. This paper introduces the Randomized Attention Test (RAT), a novel L1-based protocol designed to probabilistically challenge validators in ORUs, t…

@fluchtkapsel@nerdculture.de
2025-05-30 12:34:57

Content warning: tech, admin, dns

Today, I got notified about spamhaus not responding anymore to requests from our mailserver due to using an "open resolver".
Huh?
I found the command `dig +short test.openresolver.com TXT @<ip_of_dns_server_to_test>` to test if my DNS server is deemed an open resolver. And yes, the mailserver uses a DNS server that got recognized as an open resolver.
Out of curiosity, I tried the same in my local network where I have a dnsmasq serving DHCP and DNS for my cli…
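
For anyone wanting to script the same check across several DNS servers, here is a hedged Python sketch that only wraps the `dig` command quoted above. It assumes the intended option is dig's `+short` query flag and that `dig` is installed; the function names are hypothetical, and how to interpret the returned TXT answer is left to the openresolver.com service, since the toot does not spell it out.

```python
import subprocess
from typing import List


def open_resolver_check_argv(dns_server_ip: str) -> List[str]:
    """Build the dig command line that queries the test record via one server."""
    return ["dig", "+short", "test.openresolver.com", "TXT", f"@{dns_server_ip}"]


def run_open_resolver_check(dns_server_ip: str, timeout: float = 10.0) -> str:
    """Run dig and return its trimmed output (requires dig and network access)."""
    result = subprocess.run(
        open_resolver_check_argv(dns_server_ip),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # inspect the output instead of raising on non-zero exit
    )
    return result.stdout.strip()
```

Passing the arguments as a list avoids shell quoting problems with the `@` server argument.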

@burger_jaap@mastodon.social
2025-06-04 09:52:59

🇸🇪DSO E.ON Energidistribution (part of 🇩🇪 E.ON SE) is launching a flexibility marketplace with the aim of connecting new production to the grid more quickly. Both sources of #flexibility that increase consumption (🔌⬆️) and sources that curtail production (✂️⬇️) can participate.

Key test for faster connection
In a flexibility market focused on generation load, producers, consumers and energy storage operators can be compensated for temporarily reducing their production or increasing their consumption during specific hours.

Using existing grid capacity more efficiently creates better conditions for faster connections and supports the ongoing electrification of society.

Here's how it works:

Who can participate?

Producers, consumers and operators with energy storage wi…

@trezzer@social.linux.pizza
2025-06-05 10:51:08

Browser benchmarks hot off the press:
Speedometer 3.1
N100 @ Fedora 42
Librewolf 139.0.1-1
Balanced: 7.13 +/- 0.13
Performance: 7.31 +/- 0.22
Powersave: 3.68 +/- 0.10
Brave 1.79.119
Balanced: 11.8 +/- 0.48
Performance: 12.7 +/- 0.52
Powersave: 6.74 +/- 0.28
Vivaldi 7.4.3684.46
Balanced: 11.7 +/- 0.51
Performance: 12.9 +/- 0.52
Powersave: 6.56 +/- 0.23
Note: Not a scientific test, but conditions …

@NFL@darktundra.xyz
2025-06-05 10:02:22

This Week in Sports Trivia: June 5, 2025 https://www.nytimes.com/athletic/6404019/2025/06/05/this-week-in-sports-trivia-june-5-2025/

This Week in Sports Trivia: June 5, 2025
How closely were you following the sports news this week? Find out and test your knowledge by taking The Athletic's weekly quiz.

@arXiv_csAI_bot@mastoxiv.page
2025-06-05 09:37:23

This https://arxiv.org/abs/2505.10981 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…

Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Probability Theory
Recently, scaling test-time compute on Large Language Models (LLM) has garnered wide attention. However, there has been limited investigation of how various reasoning prompting strategies perform as scaling. In this paper, we focus on a standard and realistic scaling setting: majority voting. We systematically conduct experiments on 6 LLMs $\times$ 8 prompting strategies $\times$ 6 benchmarks. Experiment results consistently show that as the sampling time and computational overhead increase, co…
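
The "majority voting" setting named in the abstract can be sketched in a few lines of Python. This is a generic illustration, not the paper's code: the function name is hypothetical, and the tie-breaking rule (earliest sampled answer wins) is an assumption made here for determinism.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(answers: Sequence[Hashable]) -> Hashable:
    """Return the most frequent answer among several sampled model outputs."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    # Counter.most_common orders equal counts by first occurrence,
    # so the earliest sampled answer wins a tie.
    return Counter(answers).most_common(1)[0][0]


# Three sampled answers to the same question; two of them agree.
final_answer = majority_vote(["42", "41", "42"])
```

Scaling test-time compute in this setting means drawing more samples per question before voting.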

@clongclongmoo@social.bau-ha.us
2025-05-20 07:34:57

love sella, AlmabuenA – All my best friends don’t pass the turing test (AlmabuenA Remix)
https://www.clongclongmoo.org/2025/05/20/love-sella-almabuena-all-my-best-friends-dont-pass-the-turing-test-almabue…

love sella, AlmabuenA – All my best friends don’t pass the turing test (AlmabuenA Remix)
love sella, AlmabuenA “All my best friends don’t pass the turing test (AlmabuenA Remix)” It all started when fellow the Hagueians Barend of the legendary trip hop duo AlmabuenA and local underground music legend WivWav got stuck in traffic together. After discussing their musical faves they birthed the conceptual remix exchange that would eventually grow...

@arXiv_csSD_bot@mastoxiv.page
2025-06-03 07:30:02

$\texttt{AVROBUSTBENCH}$: Benchmarking the Robustness of Audio-Visual Recognition Models at Test-Time
Sarthak Kumar Maharana, Saksham Singh Kushwaha, Baoming Zhang, Adrian Rodriguez, Songtao Wei, Yapeng Tian, Yunhui Guo
https://arxiv.org/abs/2506.00358

$\texttt{AVROBUSTBENCH}$: Benchmarking the Robustness of Audio-Visual Recognition Models at Test-Time
While recent audio-visual models have demonstrated impressive performance, their robustness to distributional shifts at test-time remains not fully understood. Existing robustness benchmarks mainly focus on single modalities, making them insufficient for thoroughly assessing the robustness of audio-visual models. Motivated by real-world scenarios where shifts can occur $\textit{simultaneously}$ in both audio and visual modalities, we introduce $\texttt{AVROBUSTBENCH}$, a comprehensive benchmark…

@arXiv_econEM_bot@mastoxiv.page
2025-06-06 07:22:08

Power-boosting in Specification Tests using Kernel Directional Component
Cui Rui, Li Yuhao, Song Xiaojun
https://arxiv.org/abs/2506.04900 https://

Power-boosting in Specification Tests using Kernel Directional Component
We propose power-boosting strategies for kernel-based specification tests in conditional moment models, with a focus on the Kernel Conditional Moment (KCM) test. By decomposing the KCM statistic into spectral components, we demonstrate that truncating poorly estimated directions and selecting kernels based on a non-asymptotic signal-to-noise ratio significantly improves both test power and size control. Our theoretical and simulation results demonstrate that, while divergent component weights m…

@arXiv_csSE_bot@mastoxiv.page
2025-06-04 07:40:22

Towards More Effective Fault Detection in LLM-Based Unit Test Generation
Guancheng Wang, Qinghua Xu, Lionel C. Briand, Kui Liu
https://arxiv.org/abs/2506.02954

Towards More Effective Fault Detection in LLM-Based Unit Test Generation
Unit tests play a vital role in uncovering potential faults in software. While tools like EvoSuite focus on maximizing code coverage, recent advances in large language models (LLMs) have shifted attention toward LLM-based test generation. However, code coverage metrics -- such as line and branch coverage -- remain overly emphasized in reported research, despite being weak indicators of a test suite's fault-detection capability. In contrast, \textit{mutation score} offers a more reliable and str…

@arXiv_astrophGA_bot@mastoxiv.page
2025-06-05 07:29:59

Test of conformal gravity as an alternative to dark matter from the observations of elliptical galaxies
Li-Xue Yue, Da-Ming Chen
https://arxiv.org/abs/2506.03955

Test of conformal gravity as an alternative to dark matter from the observations of elliptical galaxies
As an alternative gravitational theory to General Relativity (GR), the Conformal Gravity (CG) has recently been successfully verified by observations of Type Ia supernovae (SN Ia) and the rotation curves of spiral galaxies. The observations of galaxies only pertain to the non-relativistic form of gravity. In this context, within the framework of the Newtonian theory of gravity (the non-relativistic form of GR), dark matter is postulated to account for the observations. On the other hand, the no…

@floheinstein@chaos.social
2025-06-05 05:02:49

AgentPorn.ai - Where Developers get their fix
https://agentporn.ai/

Green and Pink on Black text in little boxes appearing to be videos of scripts running with names like
Thick API Handles Massive Load Without Timing Out
Raw Dogging Production (No Tests)
Watch Me Penetration Test This Vulnerable Endpoint
My Step-Function Is Stuck In A Loop
Dirty Cache Gets Flushed Hard
Backend Developer Exposes Everything
Young Package Satisfies All Dependencies
Big O Notation Gets Dominated

⚠️ WARNING: Extremely Satisfying Developer Content
🔞 DEVELOPERS ONLY - Watch builds compile with ZERO errors. Git pushes with no conflicts. Docker containers that actually work. AgentPorn.ai

@arXiv_quantph_bot@mastoxiv.page
2025-06-02 10:27:23

This https://arxiv.org/abs/2404.04599 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_qu…

Local Test for Unitarily Invariant Properties of Bipartite Quantum States
We study the power of local test for bipartite quantum states. Our central result is that, for properties of bipartite pure states, unitary invariance on one part implies an optimal (over all global testers) local tester acting only on the other part. As an application, we show that - Purified samples offer no advantage in property testing of mixed states. - A matching lower bound $Ω(r^2/\varepsilon^2)$ for testing the Schmidt rank of bipartite states with perfect completeness, settling an…

@arXiv_csLO_bot@mastoxiv.page
2025-06-06 07:19:16

Trustworthiness Preservation by Copies of Machine Learning Systems
Leonardo Ceragioli, Giuseppe Primiero
https://arxiv.org/abs/2506.05203 https://

Trustworthiness Preservation by Copies of Machine Learning Systems
A common practice of ML systems development concerns the training of the same model under different data sets, and the use of the same (training and test) sets for different learning models. The first case is a desirable practice for identifying high quality and unbiased training conditions. The latter case coincides with the search for optimal models under a common dataset for training. These differently obtained systems have been considered akin to copies. In the quest for responsible AI, a l…

@outer@mas.to
2025-06-03 15:35:24

I'll test it on Apples.
I use the Orion browser on Apple computers. EFF's attempting to tell advertisers to USE EXPLICIT METHODS to control ads, and our privacy.
The Orion browser has its own "advanced" blockers.
But because it has Firefox extensions, it should "work" in Orion (the free browser by search engine company Kagi). Also, on iPads and iPhones, extensions work in Orion, but not Safari.
Privacy is a Human Right.

Morpheus Being (@MorpheusB@aus.social)
#It #Computers #Security #Privacy https://www.eff.org/deeplinks/2025/03/online-tracking-out-control-privacy-badger-can-help-you-fight-back

@arXiv_grqc_bot@mastoxiv.page
2025-06-04 07:51:49

Test Gravitational-Wave Polarizations with Space-Based Detectors
Jun-Shuai Wang, Chang Liu, Ju Chen, Jibo He
https://arxiv.org/abs/2506.02909 https://

Test Gravitational-Wave Polarizations with Space-Based Detectors
In this work, we systematically investigate the capability of space-based gravitational wave detectors in constraining parameters of non-tensor polarization modes. Using Bayesian inference and Fisher Information Matrix methods, we analyze gravitational wave signals from the inspiral phase of supermassive binary black hole mergers. By starting with time-domain signals and applying Fourier transforms, we avoid the use of the stationary phase approximation. We found an asymmetry in the estimation …

@domm@social.linux.pizza
2025-04-22 19:41:29

Today, as an evening amusement, I fixed a very nasty bug in the #Koha test suite that did not occur when running only the affected test file, but only when also running some other tests: https://bugs.koh…

39700 – Fix test case t/db_dependent/Authority/Merge.t broken in 34739
enhancement, P5 - low, assigned to chris, Needs Signoff, in Test Suite, Koha

@arXiv_csCL_bot@mastoxiv.page
2025-06-03 08:20:56

RewardBench 2: Advancing Reward Model Evaluation
Saumya Malik, Valentina Pyatkin, Sander Land, Jacob Morrison, Noah A. Smith, Hannaneh Hajishirzi, Nathan Lambert
https://arxiv.org/abs/2506.01937

RewardBench 2: Advancing Reward Model Evaluation
Reward models are used throughout the post-training of language models to capture nuanced signals from preference data and provide a training target for optimization across instruction following, reasoning, safety, and more domains. The community has begun establishing best practices for evaluating reward models, from the development of benchmarks that test capabilities in specific skill areas to others that test agreement with human preferences. At the same time, progress in evaluation has not…

@arXiv_csRO_bot@mastoxiv.page
2025-06-05 09:52:50

This https://arxiv.org/abs/2505.06787 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csRO_…

Digital-physical testbed for ship autonomy studies in the Marine Cybernetics Laboratory basin
The algorithms developed for Maritime Autonomous Surface Ships (MASS) are often challenging to test on actual vessels due to high operational costs and safety considerations. Simulations offer a cost-effective alternative and eliminate risks, but they may not accurately represent real-world dynamics for the given tasks. Utilizing small-scale model ships and robotic vessels in conjunction with a laboratory basin provides an accessible testing environment for the early stages of validation proces…

@jerome@jasette.facil.services
2025-06-04 13:46:46

No quote post yet
Remote fetch replies is set as experimental and also asynchronous, unclear if it would be ready for the final release. https://mastodon.social/@MastodonEngineering/114625074809479231

Mastodon Engineering (@MastodonEngineering@mastodon.social)
The first beta release of Mastodon 4.4.0 is ready for testing! Many user-facing improvements: featured content, updates for lists, follower management, new emoji, and refreshed audio and media players. For server owners: new legal features, moderation tweaks, announcements updates, upgraded software stack, and some experimental feature options. Please test it out, and provide feedback 💬 https://github.com/mastodon/mastodon/releases/tag/v4.4.0-beta.1

@metacurity@infosec.exchange
2025-06-02 12:23:16

Bitcoin options trading venue BitMEX discovered an operational security mistake in a thwarted attack by N. Korea's Lazarus Group, which revealed the attackers' IP address and uncovered at least 10 potential accounts used to test or develop its malware.
h…

@arXiv_astrophIM_bot@mastoxiv.page
2025-06-04 07:48:08

Verification of the Timing System for the X-ray Imaging and Spectroscopy Mission in the GPS Unsynchronized Mode
Megumi Shidatsu, Yukikatsu Terada, Takashi Kominato, So Kato, Ryohei Sato, Minami Sakama, Takumi Shioiri, Yugo Motogami, Yuuki Niida, Chulsoo Kang, Toshihiro Takagi, Taichi Nakamoto, Chikara Natsukari, Makoto S. Tashiro, Kenichi Toda, Hironori Maejima, Shin Watanabe, Ryo Iizuka, Rie Sato, Chris Baluta, Katsuhiro Hayashi, Tessei Yoshida, Shoji Ogawa, Yoshiaki Kanemaru, Kotaro Fukushima, Akio Hoshino, Hiromitsu Takahashi, Masayoshi Nobukawa, Tsunefumi Mizuno, Kazuhiro Nakazawa, Shinichiro Uno, Ken Ebisawa, Satoshi Eguchi, Satoru Katsuda, Aya Kubota, Naomi Ota, Atsushi Tanimoto, Yuichi Terashima, Yohko Tsuboi, Yuusuke Uchida, Hideki Uchiyama, Shigeo Yamauchi, Tomokage Yoneyama, Satoshi Yamada, Nagomi Uchida, Matt Holland, Michael Loewenstein, Tahir Yaqoob, Eric D. Miller, Robert S. Hill, Efrain C. Perez-Solis, Morgan D. Waddy, Mark Mekosh, Joseph B. Fox, Isabella S. Brewer, Emily Aldoretta, Koji Mukai, Kenji Hamaguchi, Francois Mernier, Anna Ogorzalek, Katja Pottschmidt, Mihoko Yukita
#toXiv_bot_toot

@arXiv_mathNT_bot@mastoxiv.page
2025-06-06 07:26:15

On the number of divisors of Mersenne numbers
Vjekoslav Kovač, Florian Luca
https://arxiv.org/abs/2506.04883 https://arxiv.org/pd…

On the number of divisors of Mersenne numbers
Denote $f(n):=\sum_{1\le k\le n} τ(2^k-1)$, where $τ$ is the number of divisors function. Motivated by a question of Paul Erdős, we show that the sequence of ratios $f(2n)/f(n)$ is unbounded. We also present conditional results on the divergence of this sequence to infinity. Finally, we test numerically both the conjecture $f(2n)/f(n)\to\infty$ and our sufficient conditions for it to hold.
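
The quantity in the abstract above can be explored for small n with a short script. This is a minimal sketch, not the authors' numerical method: it assumes plain trial division is good enough, which only holds for small k, and the helper names divisor_count and f are chosen here for illustration.

```python
def divisor_count(m: int) -> int:
    """Return tau(m), the number of positive divisors of m, via trial division."""
    count = 1
    p = 2
    while p * p <= m:
        exponent = 0
        while m % p == 0:
            m //= p
            exponent += 1
        # A prime with exponent e contributes a factor (e + 1) to tau.
        count *= exponent + 1
        p += 1
    if m > 1:
        # Whatever remains is a single prime factor with exponent 1.
        count *= 2
    return count


def f(n: int) -> int:
    """Return f(n) = sum of tau(2^k - 1) for 1 <= k <= n."""
    return sum(divisor_count(2**k - 1) for k in range(1, n + 1))


if __name__ == "__main__":
    # Print the ratio f(2n)/f(n) for a few small n (trial division is slow beyond that).
    for n in range(1, 11):
        print(n, f(n), f(2 * n), f(2 * n) / f(n))
```

Trial division becomes impractical once 2^k - 1 has large prime factors, so anything beyond small n would need a real factorization routine.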

@mia@hcommons.social
2025-06-01 16:27:44

Ad on the tube says 'Humans were the beta test. The era of AI employees is here'.
I can't *imagine* why people are a bit resistant to AI! At least offshoring never advertised on the tube. The enshittification of 21st century life continues.

@JorgeStolfi@mas.to
2025-05-28 11:51:50

Testing my new LLM, "#CrokAI":
Q. Explain why the 9th full test of the Starship was a resounding success.
A. Before we can get to Mars, we must perfect the needed technology by colonizing a target that is closer to us. While some think that the Moon could serve this purpose, the Earth is a much more convenient (if challenging) target. In test #9, Starshit, sorry, Starship managed t…

@arXiv_astrophCO_bot@mastoxiv.page
2025-06-04 13:50:14

This https://arxiv.org/abs/2502.02638 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_…

An Alcock-Paczynski Test on Reionization Bubbles for Cosmology
In this paper, we propose an Alcock-Paczyński (AP) test to constrain cosmology using HII bubbles during the Epoch of Reionization. Similarly to cosmic voids, a stack of HII bubbles is spherically symmetric because ionizing fronts propagate isotropically on average (even if individual bubbles may not be spherical), making them standard spheres to be used in an AP test. Upcoming 21-cm observations, from the Square Kilometer Array (SKA) for instance, will contain tomographic information about HII…

@JGraber@mastodon.social
2025-03-18 09:12:38

Close the Loop and Load Test the Improvements - #dotNet
https://improveandrepeat.com/2025/03/close-the-loop-and-load-test-the-improvements/

Close the Loop and Load Test the Improvements - Improve & Repeat

@arXiv_astrophSR_bot@mastoxiv.page
2025-06-06 07:30:34

Going Bayesian on the ages of nearby young stellar systems I. The expansion rate method
J. Olivares, A. Berihuete, H. Bouy
https://arxiv.org/abs/2506.05110

Going Bayesian on the ages of nearby young stellar systems I. The expansion rate method
Context. Determining the ages of young stellar systems is fundamental to test and validate current star-formation theories. Aims. We aim at developing a Bayesian version of the expansion rate method that incorporates the a priori knowledge on the stellar system's age and solves some of the caveats of the traditional frequentist approach. Methods. We upgrade an existing Bayesian hierarchical model with additional parameter hierarchies that include, amongst others, the system's age. For this late…

@krone@frawas.de
2025-06-05 04:40:54

"Krone" ran a test - budget flight from Vienna to Singapore for 217 euros #News #Nachrichten

"Krone" ran a test - budget flight from Vienna to Singapore for 217 euros
From Vienna to Singapore for 217 euros: a new budget airline is shaking up Schwechat Airport and offering long-haul destinations for the first time ...

@kerstinsailer@sciences.social
2025-06-02 19:54:40

🚨 New paper alert 🚨
Together with my co-authors, we compare two different diagnostic clinics of Moorfields Eye Hospital in London regarding their spatial designs and effective patient flows
We highlight the importance of line of sight relationships between diagnostic test stations to ease patient flow and coordination and suggest an ideal clinic configuration based on queuing models
Published

Lanes, clusters, sightlines: modelling patient flow in medical clinics | Buildings & Cities

@arXiv_physicsfludyn_bot@mastoxiv.page
2025-05-29 07:34:27

Reduced order modelling of air puff test for corneal material characterisation
Osama M. Maklad, Muting Hao
https://arxiv.org/abs/2505.22495 https://…

Reduced order modelling of air puff test for corneal material characterisation
Models of the fluid-structure interaction (FSI) model for the air puff test were analysed. Using Abaqus, the air puff test is applied to eyes with varying biomechanical parameters, such as material properties, corneal thickness, and radius. A reduced order model of the air puff (a turbulent impinging jet) has been acquired to decrease simulation time from 48 hours for the FSI model to approximately 12 minutes for the finite element analysis (FEA) model alone. To further accelerate simulations a…

@arXiv_physicsinsdet_bot@mastoxiv.page
2025-06-02 07:34:55

A highly sensitive SF$_6$-based leak test system for JUNO 3-inch PMT underwater electronics boxes
Ziliang Chu, Diru Wu, Miao He, Jilei Xu, Xiaoping Jing, Jian Wang
https://arxiv.org/abs/2505.24142

A highly sensitive SF$_6$-based leak test system for JUNO 3-inch PMT underwater electronics boxes
A total of 25600 3-inch photomultiplier tubes (PMTs), along with their corresponding frontend electronics, have been installed at the Jiangmen Underground Neutrino Observatory (JUNO). These electronics are housed in 200 stainless steel boxes that operate underwater. To verify the sealing integrity of the underwater boxes following integration, we developed an SF$_6$-based leak test system, opting against the typical helium-based system due to helium's ability to penetrate the PMT glass. After a…

@arXiv_csSE_bot@mastoxiv.page
2025-06-03 16:28:32

This https://arxiv.org/abs/2404.10304 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csSE_…

LLM-Powered Test Case Generation for Detecting Bugs in Plausible Programs
Detecting tricky bugs in plausible programs, those that pass existing test suites yet still contain bugs, remains a significant challenge in software testing. To address this problem, we propose TrickCatcher, an LLM-powered approach to generating test cases for uncovering bugs in plausible programs. TrickCatcher operates in three stages: First, it uses an LLM to generate program variants based on the program under test (PUT) and its specification. Second, it employs an LLM to construct an input…

@arXiv_csIR_bot@mastoxiv.page
2025-06-06 07:19:34

Reason-to-Recommend: Using Interaction-of-Thought Reasoning to Enhance LLM Recommendation
Keyu Zhao, Fengli Xu, Yong Li
https://arxiv.org/abs/2506.05069 ht…

Reason-to-Recommend: Using Interaction-of-Thought Reasoning to Enhance LLM Recommendation
Driven by advances in Large Language Models (LLMs), integrating them into recommendation tasks has gained interest due to their strong semantic understanding and prompt flexibility. Prior work encoded user-item interactions or metadata into prompts for recommendations. In parallel, LLM reasoning, boosted by test-time scaling and reinforcement learning, has excelled in fields like mathematics and code, where reasoning traces and correctness signals are clear, enabling high performance and interp…

@arXiv_mathST_bot@mastoxiv.page
2025-06-03 07:35:29

Asymptotic analysis of high-dimensional uniformity tests under heavy-tailed alternatives
Tiefeng Jiang, Tuan Pham
https://arxiv.org/abs/2506.00393 https://…

Asymptotic analysis of high-dimensional uniformity tests under heavy-tailed alternatives
We study the high-dimensional uniformity testing problem, which involves testing whether the underlying distribution is the uniform distribution, given $n$ data points on the $p$-dimensional unit hypersphere. While this problem has been extensively studied in scenarios with fixed $p$, only three testing procedures are known in high-dimensional settings: the Rayleigh test \cite{Cutting-P-V}, the Bingham test \cite{Cutting-P-V2}, and the packing test \cite{Jiang13}. Most existing research focuses…

@arXiv_astrophHE_bot@mastoxiv.page
2025-06-02 10:08:36

This https://arxiv.org/abs/2410.08862 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_…

Test for LISA foreground Gaussianity and stationarity: extreme mass-ratio inspirals
Extreme Mass Ratio Inspirals (EMRIs) are key observational targets for the Laser Interferometer Space Antenna (LISA) mission. Unresolvable EMRI signals contribute to the formation of a gravitational wave background (GWB). Characterizing the statistical features of the GWB from EMRIs is of great importance, as EMRIs will ubiquitously affect large segments of the inference scheme. In this work, we apply a frequentist test for GWB Gaussianity and stationarity, exploring three astrophysically-motiv…

@radioeinsmusicbot@mastodonapp.uk
2025-06-08 09:47:19

🇺🇦 Now playing on radioeins...
Crash Test Dummies:
🎵 God Shuffled His Feet
#NowPlaying #CrashTestDummies
Tracks played on #radioeins as a #Spotify playlist: https://open.spotify.com/playlist/3hdH98B6uyXilhcWxCA6nv

@deabigt@universeodon.com
2025-05-05 18:53:51

Interesting. Pointed some test code at Google to see if the new driver works and got a CAPTCHA. Guess they do not want anyone scraping them like they do.

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:16:56

Control-R: Towards controllable test-time scaling
Di Zhang, Weida Wang, Junxian Li, Xunzhi Wang, Jiatong Li, Jianbo Wu, Jingdi Lei, Haonan He, Peng Ye, Shufei Zhang, Wanli Ouyang, Yuqiang Li, Dongzhan Zhou
https://arxiv.org/abs/2506.00189

Control-R: Towards controllable test-time scaling
This paper aims to address the challenges of underthinking and overthinking in long chain-of-thought (CoT) reasoning for Large Reasoning Models (LRMs) by introducing Reasoning Control Fields (RCF)--a novel test-time approach that injects structured control signals to guide reasoning from a tree search perspective. RCF enables models to adjust reasoning effort according to given control conditions when solving complex tasks. Additionally, we present the Control-R-4K dataset, which consists …

@arXiv_csLG_bot@mastoxiv.page
2025-06-05 10:58:06

This https://arxiv.org/abs/2505.22813 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…

X-Factor: Quality Is a Dataset-Intrinsic Property
In the universal quest to optimize machine-learning classifiers, three factors -- model architecture, dataset size, and class balance -- have been shown to influence test-time performance but do not fully account for it. Previously, evidence was presented for an additional factor that can be referred to as dataset quality, but it was unclear whether this was actually a joint property of the dataset and the model architecture, or an intrinsic property of the dataset itself. If quality is truly d…

@arXiv_csCR_bot@mastoxiv.page
2025-06-03 16:33:44

This https://arxiv.org/abs/2403.11981 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCR_…

Certified Robustness to Clean-Label Poisoning Using Diffusion Denoising
We present a certified defense to clean-label poisoning attacks under $\ell_2$-norm. These attacks work by injecting a small number of poisoning samples (e.g., 1%) that contain bounded adversarial perturbations into the training data to induce a targeted misclassification of a test-time input. Inspired by the adversarial robustness achieved by randomized smoothing, we show how an off-the-shelf diffusion denoising model can sanitize the tampered training data. We extensively test our defense…

@arXiv_astrophCO_bot@mastoxiv.page
2025-06-06 07:28:18

Uncalibrated Cosmic Standards as a Robust Test on Late-Time Cosmological Models
Yihao Wang, Weikang Lin
https://arxiv.org/abs/2506.04333 https://

Uncalibrated Cosmic Standards as a Robust Test on Late-Time Cosmological Models
We present a minimally model-dependent framework for testing late-time cosmological models using Uncalibrated Cosmic Standards (UCS), including standard rulers and standard candles, without relying on absolute calibrations. The method exploits a tight, model-insensitive correlation between the sound horizons at recombination and the drag epoch. By avoiding dependence on pre-recombination physics and the amplitude of the Cosmic Microwave Background (CMB) power spectra, the UCS framework reduces …

@arXiv_statME_bot@mastoxiv.page
2025-06-03 17:27:08

This https://arxiv.org/abs/2410.12201 has been replaced.
link: https://scholar.google.com/scholar?q=a

Data-light Uncertainty Set Merging with Admissibility: Synthetics, Aggregation, and Test Inversion
This article introduces a Synthetics, Aggregation, and Test inversion (SAT) approach for merging diverse and potentially dependent uncertainty sets into a single unified set. The procedure is data-light, relying only on initial sets and control levels, and it adapts to any user-specified initial uncertainty sets, accommodating potentially varying coverage levels. SAT is motivated by the challenge of integrating uncertainty sets when only the initial sets and their control levels are available -…

@arXiv_csSE_bot@mastoxiv.page
2025-06-04 07:28:55

Automated Web Application Testing: End-to-End Test Case Generation with Large Language Models and Screen Transition Graphs
Nguyen-Khang Le, Quan Minh Bui, Minh Ngoc Nguyen, Hiep Nguyen, Trung Vo, Son T. Luu, Shoshin Nomura, Minh Le Nguyen
https://arxiv.org/abs/2506.02529

Automated Web Application Testing: End-to-End Test Case Generation with Large Language Models and Screen Transition Graphs
Web applications are critical to modern software ecosystems, yet ensuring their reliability remains challenging due to the complexity and dynamic nature of web interfaces. Recent advances in large language models (LLMs) have shown promise in automating complex tasks, but limitations persist in handling dynamic navigation flows and complex form interactions. This paper presents an automated system for generating test cases for two key aspects of web application testing: site navigation and form …

@deabigt@universeodon.com
2025-06-05 19:10:34

Run Local LLMs on Hardware from $50 to $50,000 - We Test and Compare! https://youtu.be/mUGsv_IHT-g?si=SsLWOdpLdHYrBFEH

@memeorandum@universeodon.com
2025-06-02 12:05:44

For Senate Majority Leader John Thune, Trump's big bill is a big test (Theodoric Meyer/Washington Post)
https://www.washingtonpost.com/politics/2025/06/02/trump-bill-senate-thune/
http://www.memeorandum.com/250602/p15#a250602p15

@arXiv_astrophSR_bot@mastoxiv.page
2025-06-06 07:30:39

Bayesian ages of local young stellar associations I. Through the expansion rate method
J. Olivares, N. Miret-Roig, P. A. B. Galli, H. Bouy
https://arxiv.org/abs/2506.05130

Bayesian ages of local young stellar associations I. Through the expansion rate method
Context. Local young stellar associations (LYSAs <50 Myr and <150 pc) are important laboratories to test predictions from star-formation theories. Estimating their ages through various dating techniques with minimal biases is thus of paramount importance. Aims. We aim at determining the ages of LYSAs with the expansion rate dating technique. Methods. We estimate the ages of the LYSAs using literature membership lists, publicly available data (astrometry and radial velocities), and a recent open…

@arXiv_csLG_bot@mastoxiv.page
2025-06-05 10:56:04

This https://arxiv.org/abs/2505.14613 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…

Virtual Cells: Predict, Explain, Discover
Drug discovery is fundamentally a process of inferring the effects of treatments on patients, and would therefore benefit immensely from computational models that can reliably simulate patient responses, enabling researchers to generate and test large numbers of therapeutic hypotheses safely and economically before initiating costly clinical trials. Even a more specific model that predicts the functional response of cells to a wide range of perturbations would be tremendously valuable for disco…

@Techmeme@techhub.social
2025-06-02 12:30:38

Paradromics implanted and removed its Connexus brain implant in a patient for ~10 minutes during epilepsy surgery on May 14, a first for the Neuralink rival (Emily Mullin/Wired)
https://www.wired.com/story/paradromics-neuralink-rival-tested-brain-impl…

A Neuralink Rival Just Tested a Brain Implant in a Person
Paradromics, a brain-computer-interface startup, inserted its brain implant in a person—briefly—in an early test of its technology.

@arXiv_csSE_bot@mastoxiv.page
2025-06-04 07:27:15

The Impact of Software Testing with Quantum Optimization Meets Machine Learning
Gopichand Bandarupalli
https://arxiv.org/abs/2506.02090 https://

The Impact of Software Testing with Quantum Optimization Meets Machine Learning
The complexity of modern software systems challenges efficient testing, as traditional machine learning (ML) struggles with large test suites. This research presents a hybrid framework integrating Quantum Annealing with ML to optimize test case prioritization in CI/CD pipelines. Leveraging quantum optimization, it achieves a 25 percent increase in defect detection efficiency and a 30 percent reduction in test execution time versus classical ML, validated on the Defects4J dataset. A simulated CI/CD env…

@arXiv_csSE_bot@mastoxiv.page
2025-06-02 07:21:35

Principal Context-aware Diffusion Guided Data Augmentation for Fault Localization
Shihao Fu, Yan Lei
https://arxiv.org/abs/2505.24079 https://

Principal Context-aware Diffusion Guided Data Augmentation for Fault Localization
Test cases are indispensable for conducting effective fault localization (FL). However, test cases in practice are severely class imbalanced, i.e. the number of failing test cases (the minority class) is much smaller than that of passing ones (the majority class). The severe class imbalance between failing and passing test cases has hindered FL effectiveness. To address this issue, we propose PCD-DAug: a Principal Context-aware Diffusion guided Data Augmentation approach that generates syn…

@arXiv_astrophCO_bot@mastoxiv.page
2025-06-06 09:45:42

This https://arxiv.org/abs/2308.02636 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_…

Learning from Topology: Cosmological Parameter Estimation from the Large-scale Structure
The topology of the large-scale structure of the universe contains valuable information on the underlying cosmological parameters. While persistent homology can extract this topological information, the optimal method for parameter estimation from the tool remains an open question. To address this, we propose a neural network model to map persistence images to cosmological parameters. Through a parameter recovery test, we demonstrate that our model makes accurate and precise estimates, consider…

@arXiv_csSE_bot@mastoxiv.page
2025-06-06 09:38:25

This https://arxiv.org/abs/2502.14948 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csSE_…

Learning to Solve and Verify: A Self-Play Framework for Code and Test Generation
Recent advances in large language models (LLMs) have improved their performance on coding benchmarks. However, improvement is plateauing due to the exhaustion of readily available high-quality data. Prior work has shown the potential of synthetic self-instruct data, but naively training on a model's own outputs can cause error accumulation, especially in coding tasks, where generalization may collapse due to overly simple or erroneous training data, highlighting the need for rigorous quality ch…

@Techmeme@techhub.social
2025-05-31 20:45:48

Twitch plans to start testing the ability to host a vertical livestream and rolls out an open beta of 2k streaming, letting creators stream at 1440p (Jay Peters/The Verge)
https://www.theverge.com/news/677548/twitch-twitchcon-e…

Twitch is getting vertical livestreams
Twitch is announcing a bunch of updates at TwitchCon Europe, including vertical livestreams and an open beta test that lets creators stream at a higher quality.

@arXiv_csSE_bot@mastoxiv.page
2025-06-02 10:02:00

This https://arxiv.org/abs/2502.05368 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csSE_…

Otter: Generating Tests from Issues to Validate SWE Patches
While there has been plenty of work on generating tests from existing code, there has been limited work on generating tests from issues. A correct test must validate the code patch that resolves the issue. This paper focuses on the scenario where that code patch does not yet exist. Doing so supports two major use-cases. First, it supports TDD (test-driven development), the discipline of "test first, write code later" that has well-documented benefits for human software engineers. Second, it als…

@arXiv_astrophCO_bot@mastoxiv.page
2025-06-03 16:47:32

This https://arxiv.org/abs/2405.03024 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_…

CMB low multipole alignments across WMAP and Planck data releases
The first observations of the cosmic microwave background (CMB) from NASA's Wilkinson Microwave Anisotropy Probe (WMAP) led to finding 'alignment' anomalies not expected from fluctuations in the isotropic cosmological model. We study the data of all 8 full-sky public releases since then to test for anomalous alignments and shapes of the first 60 multipoles, i.e., over the range $2\leq l \leq 61$. We use rotationally invariant and covariant statistics to test isotropy of all subsequent WM…

@arXiv_csSE_bot@mastoxiv.page
2025-06-04 07:38:55

A Multi-agent LLM-based JUit Test Generation with Strong Oracles
Qinghua Xu, Guancheng Wang, Lionel Briand, Kui Liu
https://arxiv.org/abs/2506.02943 https:…

A Multi-agent LLM-based JUit Test Generation with Strong Oracles
Unit testing plays a critical role in ensuring software correctness. However, writing unit tests manually is laborious, especially for strongly typed languages like Java, motivating the need for automated approaches. Traditional methods primarily rely on search-based or randomized algorithms to generate tests that achieve high code coverage and produce regression oracles, which are derived from the program's current behavior rather than its intended functionality. Recent advances in large languag…

@arXiv_csSE_bot@mastoxiv.page
2025-06-05 09:44:57

This https://arxiv.org/abs/2506.02943 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csSE_…

A Multi-agent LLM-based JUit Test Generation with Strong Oracles
Unit testing plays a critical role in ensuring software correctness. However, writing unit tests manually is laborious, especially for strongly typed languages like Java, motivating the need for automated approaches. Traditional methods primarily rely on search-based or randomized algorithms to generate tests that achieve high code coverage and produce regression oracles, which are derived from the program's current behavior rather than its intended functionality. Recent advances in large languag…
