Tootfinder

Opt-in global Mastodon full text search. Join the index!

@grumpybozo@toad.social
2025-12-27 03:14:49

They are breaking things of huge value to everyone.
Trump weather balloon cuts have Michigan meteorologists debating forecasts | Interlochen Public Radio interlochenpublicradio.org/202

@michabbb@social.vivaldi.net
2025-12-25 22:05:59

🚀 Enable via new Plugins section at openrouter.ai/settings/plugins - opt-in toggle activates automatic healing layer for all structured output requests
💡 Key insight: A 2% defect rate dropping to 1% means cutting defects, bugs, and support tickets in half - reliability at the margins is wh…

@curiouscat@fosstodon.org
2025-11-25 22:17:35

"users who leave their SSDs unpowered for over a year are risking the integrity of their data. The reliability of QLC NAND has improved over the years, so you should probably consider 2–3 years of unpowered usage as the guardrails. Without power, the voltage stored in the NAND cells can be lost, either resulting in missing data or completely useless drives."

@brichapman@mastodon.social
2025-12-25 23:08:01

Massachusetts is about to see major savings from clean energy. A new analysis shows the state's SMART 3.0 solar-plus-storage program could save ratepayers $313 million annually by 2030.
The key? Pushing out inefficient natural gas plants, cutting reliance on fossil fuels during winter, and slashing 1.6 million metric tons of CO2 per year.

@seeingwithsound@mas.to
2025-12-24 13:57:13

[OT] Dilbert? Salesforce steps back from AI: Executives reveal overconfidence in LLMs, pivot to deterministic automation opentools.ai/news/salesforce-s

@brichapman@mastodon.social
2025-11-25 19:48:00

What happens when you pair solar panels with mini nuclear reactors? Chinese researchers just cracked the code.
Their new microgrid framework combines photovoltaics with small modular reactors, using AI to balance both in real time. The results are striking: 18.7% lower costs, 37.1% fewer emissions, and 98% reliability.
The secret? Smart coordination between battery storage and hydrogen production that adapts on the fly.

@Techmeme@techhub.social
2025-12-20 02:50:55

Sources: Resolve AI, which is developing an autonomous site reliability engineering tool, raised a Series A at multiple valuation tiers, including at $1B (Marina Temkin/TechCrunch)
techcrunch.com/2025/12/19/ex-s

@fanf@mendeddrum.org
2025-12-03 09:42:03

from my link log —
A distributed systems reliability glossary.
antithesis.com/resources/relia
saved 2025-12-03

@publicvoit@graz.social
2025-12-11 15:53:49

Wow, you *really* produce bad #quality if your #EV (with far fewer moving parts) is worse than all the non-electric #vehicles on a list. 😲
Avoid buying

@arXiv_physicsoptics_bot@mastoxiv.page
2025-11-25 10:40:33

Dispersion-Aware Modeling Framework for Parallel Optical Computing
Ziqi Wei, Yuanjian Wan, Yuhu Cheng, Xiao Yu, Peng Xie
arxiv.org/abs/2511.18897 arxiv.org/pdf/2511.18897 arxiv.org/html/2511.18897
arXiv:2511.18897v1 Announce Type: new
Abstract: Optical computing represents a groundbreaking technology that leverages the unique properties of photons, with innate parallelism standing as its most compelling advantage. Parallel optical computing like cascaded Mach-Zehnder interferometers (MZIs) based offers powerful computational capabilities but also introduces new challenges, particularly concerning dispersion due to the introduction of new frequencies. In this work, we extend existing theories of cascaded MZI systems to develop a generalized model tailored for wavelength-multiplexed parallel optical computing. Our comprehensive model incorporates component dispersion characteristics into a wavelength-dependent transfer matrix framework and is experimentally validated. We propose a computationally efficient compensation strategy that reduces global dispersion error within a 40 nm range from 0.22 to 0.039 using edge-spectrum calibration. This work establishes a fundamental framework for dispersion-aware model and error correction in MZI-based parallel optical computing chips, advancing the reliability of multi-wavelength photonic processors.

Want a used car that works?
You’d be wise to not get an older Tesla.
In Consumer Reports’ latest ranking for used cars,
the Elon Musk-run automaker came dead last in terms of reliability,
trailing the top spot by over forty points on a scale from 0 to 100
futurism.co…

@arXiv_csCV_bot@mastoxiv.page
2025-10-15 10:48:31

EReLiFM: Evidential Reliability-Aware Residual Flow Meta-Learning for Open-Set Domain Generalization under Noisy Labels
Kunyu Peng, Di Wen, Kailun Yang, Jia Fu, Yufan Chen, Ruiping Liu, Jiamin Wu, Junwei Zheng, M. Saquib Sarfraz, Luc Van Gool, Danda Pani Paudel, Rainer Stiefelhagen
arxiv.org/abs/2510.12687

@pimterry@toot.cafe
2025-11-18 13:29:52

These AWS & Cloudflare mega-outages are honestly embarrassing as an industry. Eugh. What are we doing???
We have so many tools & processes for ensuring reliability, but somehow two vendors can each single-handedly wipe everything out anytime.

@arXiv_eessSY_bot@mastoxiv.page
2025-10-13 09:39:30

Critical States Identification in Power System via Lattice Partition and Its Application in Reliability Assessment
Han Hu, Wenjie Wan, Feiyu Chen, Xiaoyu Liu, Bo Yu, Kequan Zhao
arxiv.org/abs/2510.09420

@brichapman@mastodon.social
2025-12-20 19:28:00

38 coastal, remote, and island communities are getting a lifeline for their fragile energy grids.
Through the Energy Technology Innovation Partnership Project, they're designing microgrids, exploring local renewable generation, and hardening systems against extreme weather. The goal: reliable, affordable power that can withstand the next storm.

@datascience@genomic.social
2025-11-17 11:00:01

Discover the power of property-based testing in R with the #quickcheck package! Seamlessly integrates with #testthat and offers a variety of generators for atomic vectors, lists, and tibbles. Perfect for ensuring your code's reliability. Check it out:

@arXiv_csCL_bot@mastoxiv.page
2025-10-09 10:40:01

Revisiting Metric Reliability for Fine-grained Evaluation of Machine Translation and Summarization in Indian Languages
Amir Hossein Yari, Kalmit Kulkarni, Ahmad Raza Khan, Fajri Koto
arxiv.org/abs/2510.07061

@jlpiraux@wallonie-bruxelles.social
2025-12-07 16:32:29

Reliability: Tesla is the worst brand on the market, according to the US consumer organization "Consumer Reports".
consumerreports.org/cars/which

@arXiv_csRO_bot@mastoxiv.page
2025-10-10 10:12:39

Reliability of Single-Level Equality-Constrained Inverse Optimal Control
Filip Bečanović (University of Belgrade - School of Electrical Engineering), Kosta Jovanović (University of Belgrade - School of Electrical Engineering), Vincent Bonnet (LAAS-CNRS)
arxiv.org/abs/2510.08406

@arXiv_statME_bot@mastoxiv.page
2025-10-14 11:06:08

Algorithmic analysis of a complex reliability system subject to multiple events with a preventive maintenance strategy and a Bernoulli vacation policy through MMAPs
Juan Eloy Ruiz-Castro, Hugo Alaín Zapata-Ceballos
arxiv.org/abs/2510.11506

@kurtsh@mastodon.social
2025-10-16 03:01:57

Giddy up.
✅ PowerToys 0.95 is here: new Light Switch utility, faster Command Palette, and Peek with Spacebar - Windows Command Line
devblogs.microsoft.com/command

@arXiv_csCY_bot@mastoxiv.page
2025-10-13 07:37:00

Assurance of Frontier AI Built for National Security
Matteo Pistillo, Charlotte Stix
arxiv.org/abs/2510.08792 arxiv.org/pdf/2510.08792

@arXiv_csSE_bot@mastoxiv.page
2025-09-30 10:54:41

Walk the Talk: Is Your Log-based Software Reliability Maintenance System Really Reliable?
Minghua He, Tong Jia, Chiming Duan, Pei Xiao, Lingzhe Zhang, Kangjin Wang, Yifan Wu, Ying Li, Gang Huang
arxiv.org/abs/2509.24352

@arXiv_physicssocph_bot@mastoxiv.page
2025-10-09 09:03:11

Corrigendum to "Degree-Based Approximations for Network Reliability Polynomials". Comment on J. Complex Networks 2025, 13, cnaf001
Xinhan Liu, Piet Van Mieghem
arxiv.org/abs/2510.06247

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:33:50

Calibratable Disambiguation Loss for Multi-Instance Partial-Label Learning
Wei Tang, Yin-Fang Yang, Weijia Zhang, Min-Ling Zhang
arxiv.org/abs/2512.17788 arxiv.org/pdf/2512.17788 arxiv.org/html/2512.17788
arXiv:2512.17788v1 Announce Type: new
Abstract: Multi-instance partial-label learning (MIPL) is a weakly supervised framework that extends the principles of multi-instance learning (MIL) and partial-label learning (PLL) to address the challenges of inexact supervision in both instance and label spaces. However, existing MIPL approaches often suffer from poor calibration, undermining classifier reliability. In this work, we propose a plug-and-play calibratable disambiguation loss (CDL) that simultaneously improves classification accuracy and calibration performance. The loss has two instantiations: the first one calibrates predictions based on probabilities from the candidate label set, while the second one integrates probabilities from both candidate and non-candidate label sets. The proposed CDL can be seamlessly incorporated into existing MIPL and PLL frameworks. We provide a theoretical analysis that establishes the lower bound and regularization properties of CDL, demonstrating its superiority over conventional disambiguation losses. Experimental results on benchmark and real-world datasets confirm that our CDL significantly enhances both classification and calibration performance.

@arXiv_csAI_bot@mastoxiv.page
2025-10-15 10:09:31

PricingLogic: Evaluating LLMs Reasoning on Complex Tourism Pricing Tasks
Yunuo Liu, Dawei Zhu, Zena Al-Khalili, Dai Cheng, Yanjun Chen, Dietrich Klakow, Wei Zhang, Xiaoyu Shen
arxiv.org/abs/2510.12409

@arXiv_csLO_bot@mastoxiv.page
2025-10-15 09:17:02

Proceedings of the International Workshop on Verification of Scientific Software
Stephen F. Siegel, Ganesh Gopalakrishnan
arxiv.org/abs/2510.12314

@arXiv_csOS_bot@mastoxiv.page
2025-09-30 08:02:34

Joyride: Rethinking Linux's network stack design for better performance, security, and reliability
Yanlin Du, Ruslan Nikolaev
arxiv.org/abs/2509.25015

@ErikJonker@mastodon.social
2025-12-05 07:13:09

The National Security Strategy of the US is worth reading. It clearly shows that the USA is no longer an ally of Europe and that NATO should be worried about the reliability of the US. It also shows how completely unhinged and extreme right the US government has become.

@arXiv_csCR_bot@mastoxiv.page
2025-10-14 12:13:08

RAG-Pull: Imperceptible Attacks on RAG Systems for Code Generation
Vasilije Stambolic, Aritra Dhar, Lukas Cavigelli
arxiv.org/abs/2510.11195

@arXiv_qbioNC_bot@mastoxiv.page
2025-10-09 09:05:41

Quantifying spike train synchrony and directionality: Measures and Applications
Thomas Kreuz
arxiv.org/abs/2510.07140 arxiv.org/pdf/2510.07…

@memeorandum@universeodon.com
2025-11-04 22:10:49

New FDA turmoil throws agency's reliability into question (Axios)
axios.com/2025/11/04/fda-staff
memeorandum.com/251104/p122#a2

@Techmeme@techhub.social
2025-12-11 18:18:02

OpenAI says GPT-5.2 Thinking hallucinates less than GPT-5.1 and has improved reliability for agentic AI needs; pre-release testers include Notion, Box, Shopify (Hayden Field/The Verge)
theverge.com/ai-artificial-int

@arXiv_statML_bot@mastoxiv.page
2025-10-14 10:20:58

Efficient Group Lasso Regularized Rank Regression with Data-Driven Parameter Determination
Meixia Lin, Meijiao Shi, Yunhai Xiao, Qian Zhang
arxiv.org/abs/2510.11546

@arXiv_csHC_bot@mastoxiv.page
2025-10-14 09:48:28

Between Knowledge and Care: Evaluating Generative AI-Based IUI in Type 2 Diabetes Management Through Patient and Physician Perspectives
Yibo Meng, Ruiqi Chen, Zhiming Liu, Xiaolan Ding, Yan Guan
arxiv.org/abs/2510.10048

@arXiv_eessSP_bot@mastoxiv.page
2025-10-15 08:27:42

A Deep Multi-Task Learning Approach to Impulsive Noise Parameter Estimation
Abdullahi Mohammad, Bdah Eya, Bassant Selim
arxiv.org/abs/2510.12179

@arXiv_csDC_bot@mastoxiv.page
2025-10-14 08:24:18

Proactive and Reactive Autoscaling Techniques for Edge Computing
Suhrid Gupta, Muhammed Tawfiqul Islam, Rajkumar Buyya
arxiv.org/abs/2510.10166

@trochee@dair-community.social
2025-09-30 21:52:20

log scales like "number of nines" are fun when you start to get to poor levels of performance.
... in a room full of people who would have laughed at them if they had said "let's get up to three nines of reliability",
a team I work with took a KPI to "get loss rates down below 20%" and received wise nods from most of the engineers in the room.
...but that's the equivalent of "let's get up to 0.5 nines"
that's a …
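
The "number of nines" arithmetic in the post above can be sketched in a few lines. This assumes the common definition nines = -log10(failure rate); the function name `nines` and the sample rates are illustrative, not from the post. Under that definition a 20% loss rate works out to roughly 0.7 nines, the same ballpark as the post's informal "0.5 nines" and far short of the three nines that a 0.1% loss rate gives.

```python
import math


def nines(loss_rate: float) -> float:
    """Return the 'number of nines' for a failure (loss) rate.

    Assumes the log-scale definition: nines = -log10(loss_rate).
    For example, a loss rate of 0.001 (99.9% success) is three nines.
    """
    if not 0.0 < loss_rate <= 1.0:
        raise ValueError("loss_rate must be in (0, 1]")
    return -math.log10(loss_rate)


# Compare a classic three-nines target with a "below 20% loss" KPI.
for label, loss in [("three nines target", 0.001), ("20% loss KPI", 0.20)]:
    print(f"{label}: {nines(loss):.2f} nines")
```

Taking the loss rate directly as the input, rather than the success rate, avoids the floating-point error of computing 1 - 0.999 first.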

@arXiv_statME_bot@mastoxiv.page
2025-10-13 09:25:10

Reliability Sensitivity with Response Gradient
Siu-Kui Au, Zi-Jun Cao
arxiv.org/abs/2510.09315 arxiv.org/pdf/2510.09315

@portaloffreedom@social.linux.pizza
2025-10-07 07:27:47

Part of what makes the Steamdeck great is its stand-by reliability.
Because most desktop/console games are designed for long play sessions, they are difficult to consume on the go. Valve, by making standby seamless, allows for these games to be paused and then continued seamlessly.

@arXiv_csCL_bot@mastoxiv.page
2025-09-29 11:15:47

Exploratory Semantic Reliability Analysis of Wind Turbine Maintenance Logs using Large Language Models
Max Malyi, Jonathan Shek, Andre Biscaya
arxiv.org/abs/2509.22366

@jacobgudiol@mastodonsweden.se
2025-12-01 12:50:57

The study involved leading laboratories across multiple countries testing identical samples of gut microbiome bacteria. Results revealed startling inconsistencies, with accuracy measures varying dramatically between laboratories – despite analysing the same samples.
MHRA-led study reveals major inconsistencies in global microbiome research

@arXiv_eessSY_bot@mastoxiv.page
2025-10-07 11:14:12

Power Reserve Capacity from Virtual Power Plants with Reliability and Cost Guarantees
Lorenzo Zapparoli, Blazhe Gjorgiev, Giovanni Sansavini
arxiv.org/abs/2510.04815

@arXiv_csSD_bot@mastoxiv.page
2025-10-13 08:14:20

Evaluating Hallucinations in Multimodal LLMs with Spoken Queries under Diverse Acoustic Conditions
Hansol Park, Hoseong Ahn, Junwon Moon, Yejin Lee, Kyuhong Shim
arxiv.org/abs/2510.08581

@arXiv_astrophSR_bot@mastoxiv.page
2025-10-13 09:20:50

The age and metallicity dependence of the near-infrared absolute magnitude and colour of red clump stars
Hiroki Onozato, Yoshifusa Ita, Yoshikazu Nakada
arxiv.org/abs/2510.09168

@arXiv_condmatmtrlsci_bot@mastoxiv.page
2025-10-14 11:59:18

Optimizing Cross-Domain Transfer for Universal Machine Learning Interatomic Potentials
Jaesun Kim, Jinmu You, Yutack Park, Yunsung Lim, Yujin Kang, Jisu Kim, Haekwan Jeon, Deokgi Hong, Seung Yul Lee, Saerom Choi, Yongdeok Kim, Jae W. Lee, Seungwu Han
arxiv.org/abs/2510.11241

@arXiv_astrophEP_bot@mastoxiv.page
2025-10-14 10:13:28

Analyzing Data Quality and Decay in Mega-Constellations: A Physics-Informed Machine Learning Approach
Katarina Dyreby, Francisco Caldas, Cláudia Soares
arxiv.org/abs/2510.11242

@arXiv_csDL_bot@mastoxiv.page
2025-09-30 08:40:51

The Landscape of problematic papers in the field of non-coding RNA
Ying Lou, Zhengyi Zhou, Guosheng Wang, Zhesi Shen, Menghui Li
arxiv.org/abs/2509.24511

@arXiv_csNI_bot@mastoxiv.page
2025-10-14 09:48:58

Visible Light Communication for Vehicular Networks: A Tutorial
Pedro E. Gória Silva, Eduardo S. Lima, Jules M. Moualeu, Mohamed Korium, Pedro H. J. Nardelli
arxiv.org/abs/2510.11123

@arXiv_csLG_bot@mastoxiv.page
2025-10-15 10:51:31

Few Shot Semi-Supervised Learning for Abnormal Stop Detection from Sparse GPS Trajectories
Muhammad Ayub Sabir, Junbiao Pang, Jiaqi Wu, Fatima Ashraf
arxiv.org/abs/2510.12686

@arXiv_quantph_bot@mastoxiv.page
2025-10-07 11:52:12

Embedding-Aware Noise Modeling of Quantum Annealing
Seon-Geun Jeong, Mai Dinh Cong, Dae-Il Noh, Quoc-Viet Pham, Won-Joo Hwang
arxiv.org/abs/2510.04594

@arXiv_csRO_bot@mastoxiv.page
2025-10-03 10:25:21

Stand Up, NAO! Increasing the Reliability of Stand-Up Motions Through Error Compensation in Position Control
Philip Reichenberg, Tim Laue
arxiv.org/abs/2510.02129

@arXiv_csCV_bot@mastoxiv.page
2025-10-15 10:45:01

MMOT: The First Challenging Benchmark for Drone-based Multispectral Multi-Object Tracking
Tianhao Li, Tingfa Xu, Ying Wang, Haolin Qin, Xu Lin, Jianan Li
arxiv.org/abs/2510.12565

@arXiv_csAI_bot@mastoxiv.page
2025-10-13 09:33:10

TripScore: Benchmarking and rewarding real-world travel planning with fine-grained evaluation
Yincen Qu, Huan Xiao, Feng Li, Hui Zhou, Xiangying Dai
arxiv.org/abs/2510.09011

@arXiv_csSE_bot@mastoxiv.page
2025-10-14 11:15:48

Defects4C: Benchmarking Large Language Model Repair Capability with C/C++ Bugs
Jian Wang, Xiaofei Xie, Qiang Hu, Shangqing Liu, Jiongchi Yu, Jiaolong Klong, Yi Li
arxiv.org/abs/2510.11059

@michabbb@social.vivaldi.net
2025-11-12 03:25:38

📈 Build dashboards: Visualize input/output token usage, sessions & conversations, total costs in USD, terminal type distribution (#VSCode, Apple Terminal), requests per user & tool type usage (Read, Edit, LS, Bash)
🎯 Real insights: Measure ROI & productivity gains, spot performance bottlenecks & reliability issues, track adoption trends & user trust via accept/reject …

@arXiv_csCY_bot@mastoxiv.page
2025-10-14 07:42:01

Bias-Aware AI Chatbot for Engineering Advising at the University of Maryland A. James Clark School of Engineering
Prarthana P. Kartholy, Thandi M. Labor, Neil N. Panchal, Sean H. Wang, Hillary N. Owusu
arxiv.org/abs/2510.09636

@arXiv_csLG_bot@mastoxiv.page
2025-10-15 12:09:26

Crosslisted article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[4/4]:
- EReLiFM: Evidential Reliability-Aware Residual Flow Meta-Learning for Open-Set Domain Generalizat...
Peng, Wen, Yang, Fu, Chen, Liu, Wu, Zheng, Sarfraz, Van Gool, Paudel, Stiefelhagen

@arXiv_csCL_bot@mastoxiv.page
2025-10-15 10:34:21

Shallow Robustness, Deep Vulnerabilities: Multi-Turn Evaluation of Medical LLMs
Blazej Manczak, Eric Lin, Francisco Eiras, James O' Neill, Vaikkunth Mugunthan
arxiv.org/abs/2510.12255

@arXiv_eessSP_bot@mastoxiv.page
2025-10-15 07:53:51

Based on Deep Neural Networks: A Machine Learning-Assisted Channel Estimation Method for MIMO Systems
Haoran He
arxiv.org/abs/2510.11891 ar…

@grumpybozo@toad.social
2025-12-02 22:58:37

I never could have been literate in Chinese. Not that I could not read those 2 characters as different, but I’d never be able to write characters with that fine a distinction with any sort of reliability. American cursive was bad enough. mastodon.social/@mcc/115651531

@arXiv_csHC_bot@mastoxiv.page
2025-10-09 07:41:20

Inducing State Anxiety in LLM Agents Reproduces Human-Like Biases in Consumer Decision-Making
Ziv Ben-Zion, Zohar Elyoseph, Tobias Spiller, Teddy Lazebnik
arxiv.org/abs/2510.06222

@arXiv_csRO_bot@mastoxiv.page
2025-10-14 12:34:58

IntersectioNDE: Learning Complex Urban Traffic Dynamics based on Interaction Decoupling Strategy
Enli Lin, Ziyuan Yang, Qiujing Lu, Jianming Hu, Shuo Feng
arxiv.org/abs/2510.11534

@arXiv_csCR_bot@mastoxiv.page
2025-10-07 11:09:22

RL Is a Hammer and LLMs Are Nails: A Simple Reinforcement Learning Recipe for Strong Prompt Injection
Yuxin Wen, Arman Zharmagambetov, Ivan Evtimov, Narine Kokhlikyan, Tom Goldstein, Kamalika Chaudhuri, Chuan Guo
arxiv.org/abs/2510.04885

@arXiv_csSE_bot@mastoxiv.page
2025-10-14 09:59:28

LLMs are All You Need? Improving Fuzz Testing for MOJO with Large Language Models
Linghan Huang, Peizhou Zhao, Huaming Chen
arxiv.org/abs/2510.10179

@arXiv_csAI_bot@mastoxiv.page
2025-10-09 09:09:01

Auto-Prompt Ensemble for LLM Judge
Jiajie Li, Huayi Zhang, Peng Lin, Jinjun Xiong, Wei Xu
arxiv.org/abs/2510.06538 arxiv.org/pdf/2510.06538…

@arXiv_csCV_bot@mastoxiv.page
2025-10-09 10:18:31

Automated Neural Architecture Design for Industrial Defect Detection
Yuxi Liu, Yunfeng Ma, Yi Tang, Min Liu, Shuai Jiang, Yaonan Wang
arxiv.org/abs/2510.06669

@arXiv_csCL_bot@mastoxiv.page
2025-10-15 10:35:31

Beating Harmful Stereotypes Through Facts: RAG-based Counter-speech Generation
Greta Damo, Elena Cabrio, Serena Villata
arxiv.org/abs/2510.12316

@arXiv_csLG_bot@mastoxiv.page
2025-10-13 10:44:00

What Do Temporal Graph Learning Models Learn?
Abigail J. Hayes, Tobias Schumacher, Markus Strohmaier
arxiv.org/abs/2510.09416 arxiv.org/pdf…

@arXiv_eessSY_bot@mastoxiv.page
2025-10-14 09:06:28

Latent-Feature-Informed Neural ODE Modeling for Lightweight Stability Evaluation of Black-box Grid-Tied Inverters
Jialin Zheng, Zhong Liu, Xiaonan Lu
arxiv.org/abs/2510.09826

@arXiv_eessSP_bot@mastoxiv.page
2025-10-01 10:28:27

Ultra-Reliable Risk-Aggregated Sum Rate Maximization via Model-Aided Deep Learning
Hassaan Hashmi, Spyridon Pougkakiotis, Dionysis Kalogerias
arxiv.org/abs/2509.26311

@arXiv_csSE_bot@mastoxiv.page
2025-10-14 09:14:28

OBsmith: Testing JavaScript Obfuscator using LLM-powered sketching
Shan Jiang, Chenguang Zhu, Sarfraz Khurshid
arxiv.org/abs/2510.10066 arx…

@arXiv_csHC_bot@mastoxiv.page
2025-10-09 09:37:31

"It feels like hard work trying to talk to it": Understanding Older Adults' Experiences of Encountering and Repairing Conversational Breakdowns with AI Systems
Niharika Mathur, Tamara Zubatiy, Agata Rozga, Elizabeth Mynatt
arxiv.org/abs/2510.06690

@arXiv_csCR_bot@mastoxiv.page
2025-10-07 10:38:22

P2P: A Poison-to-Poison Remedy for Reliable Backdoor Defense in LLMs
Shuai Zhao, Xinyi Wu, Shiqian Zhao, Xiaobao Wu, Zhongliang Guo, Yanhao Jia, Anh Tuan Luu
arxiv.org/abs/2510.04503

@arXiv_csAI_bot@mastoxiv.page
2025-10-10 07:55:39

ExpertAgent: Enhancing Personalized Education through Dynamic Planning and Retrieval-Augmented Long-Chain Reasoning
Binrong Zhu, Guiran Liu, Nina Jiang
arxiv.org/abs/2510.07456

@arXiv_csCV_bot@mastoxiv.page
2025-10-09 10:01:41

SDQM: Synthetic Data Quality Metric for Object Detection Dataset Evaluation
Ayush Zenith, Arnold Zumbrun, Neel Raut, Jing Lin
arxiv.org/abs/2510.06596

@arXiv_csCL_bot@mastoxiv.page
2025-10-13 10:38:10

Can We Reliably Rank Model Performance across Domains without Labeled Data?
Veronica Rammouz, Aaron Gonzalez, Carlos Cruzportillo, Adrian Tan, Nicole Beebe, Anthony Rios
arxiv.org/abs/2510.09519

@arXiv_eessSY_bot@mastoxiv.page
2025-10-14 10:36:48

Aggregate Modeling of Air-Conditioner Loads Under Packet-based Control with Both On and Off Grid Access Requests
Mohammad Hassan, Mads R. Almassalkhi
arxiv.org/abs/2510.10651

@arXiv_csRO_bot@mastoxiv.page
2025-10-10 10:02:49

Towards Reliable LLM-based Robot Planning via Combined Uncertainty Estimation
Shiyuan Yin, Chenjia Bai, Zihao Zhang, Junwei Jin, Xinxin Zhang, Chi Zhang, Xuelong Li
arxiv.org/abs/2510.08044

@arXiv_csLG_bot@mastoxiv.page
2025-10-08 10:54:49

Out-of-Distribution Detection from Small Training Sets using Bayesian Neural Network Classifiers
Kevin Raina, Tanya Schmah
arxiv.org/abs/2510.06025

@arXiv_csCV_bot@mastoxiv.page
2025-10-09 10:47:01

GenPilot: A Multi-Agent System for Test-Time Prompt Optimization in Image Generation
Wen Ye, Zhaocheng Liu, Yuwei Gui, Tingyu Yuan, Yunyue Su, Bowen Fang, Chaoyang Zhao, Qiang Liu, Liang Wang
arxiv.org/abs/2510.07217

@arXiv_csAI_bot@mastoxiv.page
2025-10-09 10:13:41

Integrating Domain Knowledge into Process Discovery Using Large Language Models
Ali Norouzifar, Humam Kourani, Marcus Dees, Wil van der Aalst
arxiv.org/abs/2510.07161

@arXiv_csSE_bot@mastoxiv.page
2025-10-08 08:23:29

Adaptive Reinforcement Learning for Dynamic Configuration Allocation in Pre-Production Testing
Yu Zhu
arxiv.org/abs/2510.05147 arxiv.org/pd…

@arXiv_csRO_bot@mastoxiv.page
2025-10-09 09:19:41

UniFField: A Generalizable Unified Neural Feature Field for Visual, Semantic, and Spatial Uncertainties in Any Scene
Christian Maurer, Snehal Jauhri, Sophie Lueth, Georgia Chalvatzaki
arxiv.org/abs/2510.06754

@arXiv_csCL_bot@mastoxiv.page
2025-10-09 10:27:11

Adaptive Tool Generation with Models as Tools and Reinforcement Learning
Chenpeng Wang, Xiaojie Cheng, Chunye Wang, Linfeng Yang, Lei Zhang
arxiv.org/abs/2510.06825

@arXiv_csLG_bot@mastoxiv.page
2025-10-09 10:43:51

Introspection in Learned Semantic Scene Graph Localisation
Manshika Charvi Bissessur, Efimia Panagiotaki, Daniele De Martini
arxiv.org/abs/2510.07053

@arXiv_csCL_bot@mastoxiv.page
2025-10-09 10:35:41

Towards Reliable Retrieval in RAG Systems for Large Legal Datasets
Markus Reuter, Tobias Lingenberg, Rūta Liepiņa, Francesca Lagioia, Marco Lippi, Giovanni Sartor, Andrea Passerini, Burcu Sayin
arxiv.org/abs/2510.06999

@arXiv_csRO_bot@mastoxiv.page
2025-10-09 08:30:21

Constrained Natural Language Action Planning for Resilient Embodied Systems
Grayson Byrd, Corban Rivera, Bethany Kemp, Meghan Booker, Aurora Schmidt, Celso M de Melo, Lalithkumar Seenivasan, Mathias Unberath
arxiv.org/abs/2510.06357

@arXiv_csLG_bot@mastoxiv.page
2025-10-08 10:59:19

lm-Meter: Unveiling Runtime Inference Latency for On-Device Language Models
Haoxin Wang, Xiaolong Tu, Hongyu Ke, Huirong Chai, Dawei Chen, Kyungtae Han
arxiv.org/abs/2510.06126

@arXiv_csCV_bot@mastoxiv.page
2025-10-03 10:01:41

Uncovering Overconfident Failures in CXR Models via Augmentation-Sensitivity Risk Scoring
Han-Jay Shu, Wei-Ning Chiu, Shun-Ting Chang, Meng-Ping Huang, Takeshi Tohyama, Ahram Han, Po-Chih Kuo
arxiv.org/abs/2510.01683

@arXiv_csSE_bot@mastoxiv.page
2025-10-08 08:59:09

UnitTenX: Generating Tests for Legacy Packages with AI Agents Powered by Formal Verification
Yiannis Charalambous, Claudionor N. Coelho Jr, Luis Lamb, Lucas C. Cordeiro
arxiv.org/abs/2510.05441

@arXiv_csCL_bot@mastoxiv.page
2025-10-10 10:51:39

Mitigating Judgment Preference Bias in Large Language Models through Group-Based Polling
Shuliang Liu, Zhipeng Xu, Zhenghao Liu, Yukun Yan, Minghe Yu, Yu Gu, Chong Chen, Huiyuan Xie, Ge Yu
arxiv.org/abs/2510.08145

@arXiv_csRO_bot@mastoxiv.page
2025-10-06 10:00:29

Learning Stability Certificate for Robotics in Real-World Environments
Zhe Shen
arxiv.org/abs/2510.03123 arxiv.org/pdf/2510.03123

@arXiv_csLG_bot@mastoxiv.page
2025-10-06 10:25:19

Real Time Headway Predictions in Urban Rail Systems and Implications for Service Control: A Deep Learning Approach
Muhammad Usama, Haris Koutsopoulos
arxiv.org/abs/2510.03121

@arXiv_csCL_bot@mastoxiv.page
2025-10-06 10:18:09

Evaluating Large Language Models for IUCN Red List Species Information
Shinya Uryu
arxiv.org/abs/2510.02830 arxiv.org/pdf/2510.02830

@arXiv_csLG_bot@mastoxiv.page
2025-10-06 10:23:59

Bayesian E(3)-Equivariant Interatomic Potential with Iterative Restratification of Many-body Message Passing
Soohaeng Yoo Willow, Tae Hyeon Park, Gi Beom Sim, Sung Wook Moon, Seung Kyu Min, D. ChangMo Yang, Hyun Woo Kim, Juho Lee, Chang Woo Myung
arxiv.org/abs/2510.03046

@arXiv_csLG_bot@mastoxiv.page
2025-10-06 10:26:49

PRISM-Physics: Causal DAG-Based Process Evaluation for Physics Reasoning
Wanjia Zhao, Qinwei Ma, Jingzhe Shi, Shirley Wu, Jiaqi Han, Yijia Xiao, Si-Yuan Chen, Xiao Luo, Ludwig Schmidt, James Zou
arxiv.org/abs/2510.03185

@arXiv_csCL_bot@mastoxiv.page
2025-10-06 10:10:29

Knowledge-Graph Based RAG System Evaluation Framework
Sicheng Dong, Vahid Zolfaghari, Nenad Petrovic, Alois Knoll
arxiv.org/abs/2510.02549

@arXiv_csLG_bot@mastoxiv.page
2025-10-03 11:03:11

Addressing Pitfalls in the Evaluation of Uncertainty Estimation Methods for Natural Language Generation
Mykyta Ielanskyi, Kajetan Schweighofer, Lukas Aichberger, Sepp Hochreiter
arxiv.org/abs/2510.02279

@arXiv_csCL_bot@mastoxiv.page
2025-10-03 10:53:41

Learning to Reason for Hallucination Span Detection
Hsuan Su, Ting-Yao Hu, Hema Swetha Koppula, Kundan Krishna, Hadi Pouransari, Cheng-Yu Hsieh, Cem Koc, Joseph Yitan Cheng, Oncel Tuzel, Raviteja Vemulapalli
arxiv.org/abs/2510.02173