
2025-05-24 08:01:05
Software Fairness Testing in Practice
Ronnie de Souza Santos, Matheus de Morais Leca, Reydne Santos, Cleyton Magalhaes
https://arxiv.org/abs/2506.17095 htt…
A Better Vocabulary for #Testing
https://alperenkeles.com/posts/vocab-for-testing/
Have I missed something or is there no way with openssh to exempt certain netblocks from the MaxStartups setting?
There's PerSourcePenaltyExemptList but that's specifically for the "penalties" things which are separate from MaxStartups, right?
https://
Diffusion-Based Hypothesis Testing and Change-Point Detection
Sean Moushegian, Taposh Banerjee, Vahid Tarokh
https://arxiv.org/abs/2506.16089 https://
Testing the variety hypothesis
A. Lerario, P. Roos Hoefgeest, M. Scolamiero, A. Tamai
https://arxiv.org/abs/2507.16705 https://arxiv.…
Functionize, which offers a cloud platform that uses AI to speed up software testing, raised a $41M Series B, bringing its total funding to $67M (Maria Deutscher/SiliconANGLE)
https://siliconangle.com/2025/08/19/functionize-nabs-41m-speed-software-testing…
Joint training and NUCLEAR testing: what are Russia and Belarus preparing? #shorts: https://benborges.xyz/2025/08/19/joint-training-and-nuclear-testing.html
ChatChecker: A Framework for Dialogue System Testing and Evaluation Through Non-cooperative User Simulation
Roman Mayr, Michel Schimpf, Thomas Bohné
https://arxiv.org/abs/2507.16792
Testing Against Tree Ordered Alternatives in One-way ANOVA
Subha Halder, Anjana Mondal, Somesh Kumar
https://arxiv.org/abs/2507.17229 https://arxiv.org/pdf…
The Optimality of a Nested Generalized Pairwise Group Testing Procedure
Yaakov Malinovsky, Viktor Skorniakov
https://arxiv.org/abs/2506.15797 https://
Development of a Standardized Testing Environment for QRNGs based on Semiconductor Laser Phase Noise
Matthias Ostner, Innocenzo De Marco, Christian Roubal
https://arxiv.org/abs/2507.17471
Regression Testing Optimization for ROS-based Autonomous Systems: A Comprehensive Review of Techniques
Yupeng Jiang, Shuaiyi Sun, Xi Zheng
https://arxiv.org/abs/2506.16101
Testing Quantum-Corrected Black Holes with QPOs Observations: A Study of Particle Dynamics and Accretion Flow
G. Mustafa, Sushant G. Ghosh, Orhan Donmez, S. K. Maurya, Shakhzod Orzuev, Farruh Atamurotov
https://arxiv.org/abs/2506.16405
Testing the Lense-Thirring Precession Origin of the QPO in Swift J1727.8-1613
Ruican Ma, Chris Done, Aya Kubota
https://arxiv.org/abs/2506.18857 https://…
{testthat} is great for automatic testing. Here are some tricks for the heavy user: #rstats
Testing Light Unaffiliated Mass Clumps in MACS 0416 on galaxy and galaxy cluster scales using JWST
Marceau Limousin, Derek Perera, Liliya L. R. Williams, Jori Liesenborgs, Gregor Rihtarsic
https://arxiv.org/abs/2506.16034
Wow, yet another Thunderbird update.
It seems that Thunderbird, which ought to be rather stable by now, is getting updated more often than Chrome.
Either somebody is writing crummy code, not testing, or has a low acceptance hurdle for proposed "enhancements."
"The Scale of Russian Sabotage Operations Against Europe’s Critical Infrastructure" by IISS.
https://www.iiss.org/research-paper/2025/08/the-scale-of-russian--sabotage-operations--against-europes-critic…
LED Lighting: Mini Reviews - Real-world testing! - https://www.earth.org.uk/LED-lighting.html
Of course, I don't know 'hardware', as you can tell from my technical description, but I have a sample from another tuning peg gear, and the peg and gear for testing, I get to Home Hardware and they have loose bolts of small dimension. I quickly learn that #6 is too large, #4 is too small and they have no #5's where the thread matches.
But you know what works? Do you remember those little chrome bolts with the hex-wrench heads that used to hold expansion cards in the ibm-pc? Perfect match, only 5mm too long, easily compensated by buying a matching nut and my 53-years owned pawnshop 5-string, my first banjo, is back in action!
Brazil’s Chamber of Deputies Approves Bill Banning Cosmetic Testing on Live Vertebrates https://vegconomist.com/politics-law/brazil-approves-bill-banning-cosmetic-testing-live-vertebrates/
President Epstein List has an insufficiency? Sounds right.
#USpol
Deep Learning Framework Testing via Model Mutation: How Far Are We?
Yanzhou Mu, Rong Wang, Juan Zhai, Chunrong Fang, Xiang Chen, Zhiyuan Peng, Peiran Yang, Ruixiang Qian, Shaoyu Yang, Zhenyu Chen
https://arxiv.org/abs/2506.17638
"‘Shark Skin’ Coating for Airliners May Cut Fuel Use by 4% – Delta is Testing on its 767 Fleet"
#Aviation #Aeroplanes
Characterizing and Testing Configuration Stability in Two-Dimensional Threshold Cellular Automata
Yonatan Nakar, Dana Ron
https://arxiv.org/abs/2507.14569 …
LiveMCP-101: Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries
Ming Yin, Dinghan Shen, Silei Xu, Jianbing Han, Sixun Dong, Mian Zhang, Yebowen Hu, Shujian Liu, Simin Ma, Song Wang, Sathish Reddy Indurthi, Xun Wang, Yiran Chen, Kaiqiang Song
https://arxiv.org/abs/2508.15760
from my link log —
Mix-testing: revealing a new class of compiler concurrency bugs.
https://johnwickerson.wordpress.com/2024/06/28/mix-testing-revealing-a-new-class-of-compiler-bugs/
saved 2024-06-29
Gaussian Sequence Model: Sample Complexities of Testing, Estimation and LFHT
Zeyu Jia, Yury Polyanskiy
https://arxiv.org/abs/2507.16734 https://
Leveraging Optimal Transport for Distributed Two-Sample Testing: An Integrated Transportation Distance-based Framework
Zhengqi Lin, Yan Chen
https://arxiv.org/abs/2506.16047
Microsoft says it is testing a new aggregated gaming library in the Xbox PC app for Windows 11 with Xbox Insiders, integrating "leading PC storefronts" (Jez Corden/Windows Central)
https://www.windowscentral.com/gaming/pc-g
From #AnnafromUkraine @AnnafromUkraine@youtube.com
PROTEST IN KYIV: ANTI-CORRUPTION LAW & MOSCOW NO FLY ZONE Vlog 1113: War in #Ukraine
Why #Moscow has become the new testing gro…
Tuning Random Generators: Property-Based Testing as Probabilistic Programming
Ryan Tjoa, Poorva Garg, Harrison Goldstein, Todd Millstein, Benjamin Pierce, Guy Van den Broeck
https://arxiv.org/abs/2508.14394
Robust Self-Testing of Multiqudit Supersinglet Slater States via Constant Number of Binary Measurements
Arturo Konderak, Wojciech Bruzda, Remigiusz Augusiak
https://arxiv.org/abs/2508.15546
🔥 Ukraine will become a testing ground for the weapons of the future: tests right on the front line: https://benborges.xyz/2025/07/18/ukraine-will-become-a-testing.html
Testing Clustered Equal Predictive Ability with Unknown Clusters
Oguzhan Akgun, Alain Pirotte, Giovanni Urga, Zhenlin Yang
https://arxiv.org/abs/2507.14621
Breaking Single-Tester Limits: Multi-Agent LLMs for Multi-User Feature Testing
Sidong Feng, Changhao Du, Huaxiao Liu, Qingnan Wang, Zhengwei Lv, Mengfei Wang, Chunyang Chen
https://arxiv.org/abs/2506.17539
Towards Effective Complementary Security Analysis using Large Language Models
Jonas Wagner, Simon Müller, Christian Näther, Jan-Philipp Steghöfer, Andreas Both
https://arxiv.org/abs/2506.16899
Source: Netflix is using Runway AI's video generation tools for production; Disney is testing out the tools and talked with Runway about possible uses for them (Rachel Metz/Bloomberg)
https://www.bloomberg.com/news/articles/20
‘Shark Skin’ Coating for Airliners May Cut Fuel Use by 4% – Delta is Testing on its 767 Fleet https://www.goodnewsnetwork.org/shark-skin-coating-for-airliners-may-cut-fuel-use-by-4-delta-is-testing/
Testing gravitational physics by combining DESI DR1 and weak lensing datasets using the E_G estimator
S. J. Rauhut, C. Blake, U. Andrade, H. E. Noriega, J. Aguilar, S. Ahlen, S. BenZvi, D. Bianchi, D. Brooks, T. Claybaugh, A. Cuceu, A. de la Macorra, J. DeRose, P. Doel, N. Emas, S. Ferraro, J. E. Forero-Romero, C. Garcia-Quintero, E. Gaztañaga, G. Gutierrez, S. Heydenreich, K. Honscheid, C. Howlett, D. Huterer, M. Ishak, S. Joudaki, R. Joyce, E. Jullo, R. Kehoe, D. Kirkby, A. Kremin,…
DiCriTest: Testing Scenario Generation for Decision-Making Agents Considering Diversity and Criticality
Qitong Chu, Yufeng Yue, Danya Yao, Huaxin Pei
https://arxiv.org/abs/2508.11514
CT Radiomics-Based Explainable Machine Learning Model for Accurate Differentiation of Malignant and Benign Endometrial Tumors: A Two-Center Study
Tingrui Zhang, Honglin Wu, Zekun Jiang, Yingying Wang, Rui Ye, Huiming Ni, Chang Liu, Jin Cao, Xuan Sun, Rong Shao, Xiaorong Wei, Yingchun Sun
https://arxiv.org/abs/2506.18106
Multi-Angle Rotational Actuation in a 0.8-mm-Thick Preload-Free Piezoelectric Micromotor
Haijia Yu, Mingtong Chen, Zhengbao Yang
https://arxiv.org/abs/2507.17155 https://…
NYC grants Waymo its first permit, which extends through late September, to test up to eight of its autonomous vehicles in Manhattan and Downtown Brooklyn (Samantha Subin/CNBC)
https://www.cnbc.com/2025/08/22/waymo-permit-new-york-city-nyc-rides.html
inrep: A Comprehensive Framework for Adaptive Testing in R
Clievins Selva
https://arxiv.org/abs/2507.15893 https://arxiv.org/pdf/2507…
Plague and cholera in fine proportion?
https://mastodon.online/@9to5Mac/115051912057222137
Optimal Parallel Algorithms for Convex Hulls in 2D and 3D under Noisy Primitive Operations
Michael T. Goodrich, Vinesh Sridhar
https://arxiv.org/abs/2506.17507
Refining Ray-Tracing Accuracy and Efficiency in the Context of FRMCS Urban Railway Channel Predictions
Romain Charbonnier, Thierry Tenoux, Yoann Corre
https://arxiv.org/abs/2506.16236
A Simple Apparatus for Testing PMT Humidity Tolerance
A. Germer, K. Park, C. Skuse, C. Yang, D. S. Parno
https://arxiv.org/abs/2507.13545 https://
On the Feasibility of Quantum Unit Testing
Andriy Miranskyy, José Campos, Anila Mjeda, Lei Zhang, Ignacio García Rodríguez de Guzmán
https://arxiv.org/abs/2507.17235
Testing the dark side of neutrino oscillations with the solar neutrino fog at Dark Matter experiments
Julia Gehrlein, Tanmay Kushwaha
https://arxiv.org/abs/2508.14166 https://…
Apropos of yet another conversation today, I’m a big fan of using automation in WCAG testing.
But I also know WCAG well enough to understand the limitations (and lies) of the tools.
https://adrianroselli.com/2025/04/automated-wcag-testing-is-grrreat.htm…
Hypothesis Testing for Quantifying LLM-Human Misalignment in Multiple Choice Settings
Harbin Hong, Sebastian Caldas, Liu Leqi
https://arxiv.org/abs/2506.14997
Challenges and Practices in Quantum Software Testing and Debugging: Insights from Practitioners
Jake Zappin, Trevor Stalnaker, Oscar Chaparro, Denys Poshyvanyk
https://arxiv.org/abs/2506.17306
Bayesian Optimization-based Search for Agent Control in Automated Game Testing
Carlos Celemin
https://arxiv.org/abs/2508.13121 https://arxiv.org/pdf/2508.1…
BACFuzz: Exposing the Silence on Broken Access Control Vulnerabilities in Web Applications
I Putu Arya Dharmaadi, Mohannad Alhanahnah, Van-Thuan Pham, Fadi Mohsen, Fatih Turkmen
https://arxiv.org/abs/2507.15984
Testing Separability of High-Dimensional Covariance Matrices
Bongjung Sung, Peter D. Hoff
https://arxiv.org/abs/2506.17463 https://ar…
On Continuous Monitoring of Risk Violations under Unknown Shift
Alexander Timans, Rajeev Verma, Eric Nalisnick, Christian A. Naesseth
https://arxiv.org/abs/2506.16416
Revolutionizing Validation and Verification: Explainable Testing Methodologies for Intelligent Automotive Decision-Making Systems
Halit Eris, Stefan Wagner
https://arxiv.org/abs/2506.16876
Testing Homogeneity in a heteroscedastic contaminated normal mixture
Xiaoqing Niu, Pengfei Li, Yuejiao Fu
https://arxiv.org/abs/2507.15630 https://
Microsoft begins testing a Windows 11 feature for sharing the entire desktop with Copilot Vision; it requires first entering a special mode in the Copilot app (Zac Bowden/Windows Central)
https://www.
Possibilities for SETI at High Energy
Brian C. Lacki, Stephen DiKerby
https://arxiv.org/abs/2506.16351 https://arxiv.org/pdf/2506.163…
How Hard is it to be a Star? Convex Geometry and the Real Hierarchy
Marcus Schaefer, Daniel Štefankovič
https://arxiv.org/abs/2506.18818 https://…
In silico evaluation of pramlintide dosing algorithms in artificial pancreas systems
Borja Pons Torres, Iván Sala Mira, Clara Furió-Novejarque, Ricardo Sanz, Pedro García, José-Luis Díez, Jorge Bondia
https://arxiv.org/abs/2506.17790
Hypothesis testing for quantitative trait locus effects in both location and scale in genetic backcross studies
Guanfu Liu, Pengfei Li, Yukun Liu, Xiaolong Pu
https://arxiv.org/abs/2507.14253
Enabling Cyber Security Education through Digital Twins and Generative AI
Vita Santa Barletta, Vito Bavaro, Miriana Calvano, Antonio Curci, Antonio Piccinno, Davide Pio Posa
https://arxiv.org/abs/2507.17518
StaAgent: An Agentic Framework for Testing Static Analyzers
Elijah Nnorom, Md Basim Uddin Ahmed, Jiho Shin, Hung Viet Pham, Song Wang
https://arxiv.org/abs/2507.15892
EvolMathEval: Towards Evolvable Benchmarks for Mathematical Reasoning via Evolutionary Testing
Shengbo Wang, Mingwei Liu, Zike Li, Anji Li, Yanlin Wang, Xin Peng, Zibin Zheng
https://arxiv.org/abs/2508.13003
Google rolls out AI Mode to 180 countries and territories in English, after testing in the US, UK, and India, and plans to add more languages and regions "soon" (Abner Li/9to5Google)
https://9to5google.com/2025/08/21/google-ai-mode-countries-agentic/
Quantifying the Impact of 2D and 3D BAO Measurements on the Cosmic Distance Duality Relation with HII Galaxy observation
Jie Zheng (HNAS), Da-Chun Qiang (HNAS), Zhi-Qiang You (HNAS), Darshan Kumar (HNAS)
https://arxiv.org/abs/2507.17113
Exploring Traffic Simulation and Cybersecurity Strategies Using Large Language Models
Lu Gao, Yongxin Liu, Hongyun Chen, Dahai Liu, Yunpeng Zhang, Jingran Sun
https://arxiv.org/abs/2506.16699
Behavior Driven Development for 3D Games
Fernando Pastor Ricós, Beatriz Marín, I. S. W. B. Prasetya, Tanja E. J. Vos, Joseph Davidson, Karel Hovorka
https://arxiv.org/abs/2506.17057
Modulator-free, self-testing quantum random number generator
Ana Blázquez-Coído, Fadri Grünenfelder, Anthony Martin, Raphael Houlmann, Hugo Zbinden, Davide Rusca
https://arxiv.org/abs/2507.12346
On the Testing of complete causal mediation and its applications
Yichin Tsai, Wan-Tzu Chang, Jia Jyun Sie, Cathy SJ Fann, Iebin Lian
https://arxiv.org/abs/2507.14246
Deep Learning Framework Testing via Heuristic Guidance Based on Multiple Model Measurements
Yinglong Zou, Juan Zhai, Chunrong Fang, Yanzhou Mu, Jiawei Liu, Zhenyu Chen
https://arxiv.org/abs/2507.15181
Multiple Hypothesis Testing To Estimate The Number Of Communities in Stochastic Block Models
Chetkar Jha, Mingyao Li, Ian Barnett
https://arxiv.org/abs/2507.15471
India-based ride-hailing app Rapido starts testing its food delivery service Ownly in Bengaluru, marking its first serious move to challenge Swiggy and Zomato (Jagmeet Singh/TechCrunch)
https://techcrunch.com/2025/08/13/indias-rapido-beg…
An adaptive procedure for detecting replicated signals with $k$-family-wise error rate control
Ninh Tran
https://arxiv.org/abs/2508.15363 https://arxiv.org…
Waymo applied for a NYC permit to test its cars with safety drivers and plans to start collecting mapping data with manually driven cars in Manhattan in July (Andrew J. Hawkins/The Verge)
https://www.theverge.com/news/689093/waymo-nyc-permit…
Testing Autonomous Driving Systems -- What Really Matters and What Doesn't
Changwen Li, Joseph Sifakis, Rongjie Yan, Jian Zhang
https://arxiv.org/abs/2507.13661
RUM: Rule LLM-Based Comprehensive Assessment on Testing Skills
Yue Wang, Zhenyu Chen, Yuan Zhao, Chunrong Fang, Ziyuan Wang, Song Huang
https://arxiv.org/abs/2508.12922 https://…
Google hires NBA star Stephen Curry as a "performance advisor" for its Health, Pixel, and Cloud products, including testing Fitbit's new personal health coach (Jess Weatherbed/The Verge)
https://www.theverge.com/news/762146/google-pixel-stephen-curry…
Navigating the growing field of research on AI for software testing -- the taxonomy for AI-augmented software testing and an ontology-driven literature survey
Ina K. Schieferdecker
https://arxiv.org/abs/2506.14640
In an Oxford study, LLMs correctly identified medical conditions 94.9% of the time when given test scenarios directly, vs. 34.5% when prompted by human subjects (Nick Mokey/VentureBeat)
https://venturebeat.com/ai/just-add-hu
Large Language Models for Unit Testing: A Systematic Literature Review
Quanjun Zhang, Chunrong Fang, Siqi Gu, Ye Shang, Zhenyu Chen, Liang Xiao
https://arxiv.org/abs/2506.15227
CASCADE: LLM-Powered JavaScript Deobfuscator at Google
Shan Jiang, Pranoy Kovuri, David Tao, Zhixun Tan
https://arxiv.org/abs/2507.17691 https://
Inside Google's Reliability Labs, where it stress tests Pixel phones and watches; Google claims the Pixel 10 Pro Fold can withstand 10 years of folding (Julian Chokkattu/Wired)
https://www.wired.com/story/google-reliability-labs-exclusive-look/
AlgoTune: Can Language Models Speed Up General-Purpose Numerical Programs?
Ori Press, Brandon Amos, Haoyu Zhao, Yikai Wu, Samuel K. Ainsworth, Dominik Krupke, Patrick Kidger, Touqir Sajed, Bartolomeo Stellato, Jisun Park, Nathanael Bosch, Eli Meril, Albert Steppi, Arman Zharmagambetov, Fangzhao Zhang, David Perez-Pineiro, Alberto Mercurio, Ni Zhan, Talor Abramovich, Kilian Lieret, Hanlin Zhang, Shirley Huang, Matthias Bethge, Ofir Press
Build It Clean: Large-Scale Detection of Code Smells in Build Scripts
Mahzabin Tamanna, Yash Chandrani, Matthew Burrows, Brandon Wroblewski, Laurie Williams, Dominik Wermke
https://arxiv.org/abs/2506.17948
May the Feedback Be with You! Unlocking the Power of Feedback-Driven Deep Learning Framework Fuzzing via LLMs
Shaoyu Yang, Chunrong Fang, Haifeng Lin, Xiang Chen, Zhenyu Chen
https://arxiv.org/abs/2506.17642
YouTube rolls out a tool to let some creators upload different thumbnails for each video dubbed into a different language, to help expand their global audience (Dan Whateley/Business Insider)
https://www.businessinsider.com/youtube-te
Accountability of Robust and Reliable AI-Enabled Systems: A Preliminary Study and Roadmap
Filippo Scaramuzza, Damian A. Tamburri, Willem-Jan van den Heuvel
https://arxiv.org/abs/2506.16831
You Don't Know Until You Click: Automated GUI Testing for Production-Ready Software Evaluation
Yutong Bian, Xianhao Lin, Yupeng Xie, Tianyang Liu, Mingchen Zhuge, Siyuan Lu, Haoming Tang, Jinlin Wang, Jiayi Zhang, Jiaqi Chen, Xiangru Tang, Yongxin Ni, Sirui Hong, Chenglin Wu
https://arxiv.org/abs/2508.14104
ORFuzz: Fuzzing the "Other Side" of LLM Safety -- Testing Over-Refusal
Haonan Zhang, Dongxia Wang, Yi Liu, Kexin Chen, Jiashui Wang, Xinlei Ying, Long Liu, Wenhai Wang
https://arxiv.org/abs/2508.11222
XAMT: Cross-Framework API Matching for Testing Deep Learning Libraries
Bin Duan, Ruican Dong, Naipeng Dong, Dan Dongseong Kim, Guowei Yang
https://arxiv.org/abs/2508.12546 https…
Extremal Testing for Network Software using LLMs
Rathin Singha, Harry Qian, Srinath Saikrishnan, Tracy Zhao, Ryan Beckett, Siva Kesava Reddy Kakarla, George Varghese
https://arxiv.org/abs/2507.11898