My neighbor wiped out on his bike, breaking a leg, and has been off work for a month. He's been doing illustrations to make some money, so I got one of my cat in a box.
#art #cats #Illustration
i think this attitude — that all opposition is illegitimate and nothing we do can be questioned — is probably pervasive in the White House and helps explain why they keep making terrible political choices
https://bsky.app/profile/jwmueller-pu.bsky.…
RE: https://mastodon.social/@shriramk/115956634402331444
It's also a great illustration of the ambiguity of English pronunciation, e.g., "e" in "earth" vs "heart" and even more so "gh" in "ought" vs &q…
‘All brakes are off’: #Russia’s attempt to rein in illicit market for leaked data backfires https://www.theguardian.com/world/2025/dec/26/russia-selling-personal-data-leaks-probiv-ukraine-spies?CMP=Share_AndroidApp_Other
Not far from here, by a white sun, behind a green star, lived the Steelypips, illustrious, industrious, and they hadn't a care: no spats in their vats, no rules, no schools, no gloom, no evil influence of the moon, no trouble from matter or antimatter -- for they had a machine, a dream of a machine, with springs and gears and perfect in every respect. And they lived with it, and on it, and under it, and inside it, for it was all they had -- first they saved …
Sequential Counterfactual Inference for Temporal Clinical Data: Addressing the Time Traveler Dilemma
Jingya Cheng, Alaleh Azhir, Jiazi Tian, Hossein Estiri
https://arxiv.org/abs/2602.21168 https://arxiv.org/pdf/2602.21168 https://arxiv.org/html/2602.21168
arXiv:2602.21168v1 Announce Type: new
Abstract: Counterfactual inference enables clinicians to ask "what if" questions about patient outcomes, but standard methods assume feature independence and simultaneous modifiability -- assumptions violated by longitudinal clinical data. We introduce the Sequential Counterfactual Framework, which respects temporal dependencies in electronic health records by distinguishing immutable features (chronic diagnoses) from controllable features (lab values) and modeling how interventions propagate through time. Applied to 2,723 COVID-19 patients (383 Long COVID heart failure cases, 2,340 matched controls), we demonstrate that 38-67% of patients with chronic conditions would require biologically impossible counterfactuals under naive methods. We identify a cardiorenal cascade (CKD -> AKI -> HF) with relative risks of 2.27 and 1.19 at each step, illustrating temporal propagation that sequential -- but not naive -- counterfactuals can capture. Our framework transforms counterfactual explanation from "what if this feature were different?" to "what if we had intervened earlier, and how would that propagate forward?" -- yielding clinically actionable insights grounded in biological plausibility.
toXiv_bot_toot
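The distinction the abstract draws between immutable and controllable features, and the way an intervention propagates along the reported cascade, can be sketched in a few lines. Everything below is illustrative: the feature names, the baseline risk, and the choice to multiply the per-step relative risks (which assumes the steps act independently) are assumptions, not the authors' implementation.

```python
# Toy sketch of sequential counterfactuals (illustrative; not the paper's code).

IMMUTABLE = {"ckd_diagnosis"}   # chronic diagnoses: cannot be counterfactually removed
CONTROLLABLE = {"creatinine"}   # lab values: valid intervention targets

def validate_counterfactual(changes: dict) -> dict:
    """Reject edits to immutable features instead of silently flipping them,
    avoiding the 'biologically impossible' counterfactuals of naive methods."""
    invalid = set(changes) & IMMUTABLE
    if invalid:
        raise ValueError(f"biologically impossible counterfactual: {sorted(invalid)}")
    return changes

def propagate(baseline_risk: float, step_rrs: list[float]) -> float:
    """Propagate a baseline risk through a temporal cascade of relative risks.
    Multiplying per-step RRs assumes the steps are independent (a simplification)."""
    risk = baseline_risk
    for rr in step_rrs:
        risk *= rr
    return risk

# CKD -> AKI -> HF with the per-step relative risks from the abstract
# (2.27 and 1.19); the 5% baseline is a made-up illustration.
hf_risk = propagate(0.05, [2.27, 1.19])
```

The point of the sketch is the contrast: a naive counterfactual would flip `ckd_diagnosis` directly, while the sequential framing only permits interventions on controllable features and lets their effect travel forward through the cascade.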
Estimation of Confidence Bounds in Binary Classification using Wilson Score Kernel Density Estimation
Thorbjørn Mosekjær Iversen, Zebin Duan, Frederik Hagelskjær
https://arxiv.org/abs/2602.20947 https://arxiv.org/pdf/2602.20947 https://arxiv.org/html/2602.20947
arXiv:2602.20947v1 Announce Type: new
Abstract: The performance and ease of use of deep learning-based binary classifiers have improved significantly in recent years. This has opened up the potential for automating critical inspection tasks that have traditionally been trusted only to manual work. However, applying binary classifiers in critical operations depends on reliable confidence bounds, so that system performance can be guaranteed up to a given statistical significance. We present Wilson Score Kernel Density Classification, a novel kernel-based method for estimating confidence bounds in binary classification. Its core is the Wilson Score Kernel Density Estimator, a function estimator for confidence bounds in Binomial experiments with conditionally varying success probabilities. We evaluate the method on selective classification across four datasets, illustrating its use as a classification head for any feature extractor, including vision foundation models. It matches the performance of Gaussian Process Classification at lower computational complexity.
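For context, the classical Wilson score lower bound that the paper's estimator builds on is the standard interval for a Binomial proportion. This is textbook material, not code from the paper; the paper's kernel-weighted extension (smoothing success counts over feature space) is not reproduced here.

```python
import math

def wilson_lower_bound(successes: int, n: int, z: float = 1.96) -> float:
    """Wilson score lower confidence bound for a Binomial proportion.
    z = 1.96 corresponds to a two-sided 95% interval."""
    if n == 0:
        return 0.0
    p = successes / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / denom
```

Unlike the naive estimate `successes / n`, the bound stays well-behaved at small `n` and never exceeds the observed proportion, which is what makes it usable as a conservative confidence estimate for selective classification.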
Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training
Anas Barakat, Souradip Chakraborty, Khushbu Pahwa, Amrit Singh Bedi
https://arxiv.org/abs/2602.21189 https://arxiv.org/pdf/2602.21189 https://arxiv.org/html/2602.21189
arXiv:2602.21189v1 Announce Type: new
Abstract: Pass@k is a widely used performance metric for verifiable large language model tasks, including mathematical reasoning, code generation, and short-answer reasoning. It defines success if any of $k$ independently sampled solutions passes a verifier. This multi-sample inference metric has motivated inference-aware fine-tuning methods that directly optimize pass@$k$. However, prior work reports a recurring trade-off: pass@k improves while pass@1 degrades under such methods. This trade-off is practically important because pass@1 often remains a hard operational constraint due to latency and cost budgets, imperfect verifier coverage, and the need for a reliable single-shot fallback. We study the origin of this trade-off and provide a theoretical characterization of when pass@k policy optimization can reduce pass@1 through gradient conflict induced by prompt interference. We show that pass@$k$ policy gradients can conflict with pass@1 gradients because pass@$k$ optimization implicitly reweights prompts toward low-success prompts; when these prompts are what we term negatively interfering, their upweighting can rotate the pass@k update direction away from the pass@1 direction. We illustrate our theoretical findings with large language model experiments on verifiable mathematical reasoning tasks.
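For reference, the pass@k metric the abstract discusses is usually estimated with the standard unbiased estimator from n samples of which c pass the verifier. This is a minimal sketch of that well-known estimator, not code from this paper.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    solutions drawn (without replacement) from n samples, c of them
    correct, passes the verifier. pass@1 reduces to c / n."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)
```

The gap the paper studies is visible even in the estimator: with n = 10 samples and c = 2 correct, pass@1 is 0.2 while pass@5 is about 0.78, so a policy can trade single-shot accuracy for multi-sample coverage.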