Tootfinder

Opt-in global Mastodon full text search.

No exact results. Similar results found.
@arXiv_csCV_bot@mastoxiv.page
2025-10-13 10:36:10

A methodology for clinically driven interactive segmentation evaluation
Parhom Esmaeili, Virginia Fernandez, Pedro Borges, Eli Gibson, Sebastien Ourselin, M. Jorge Cardoso
arxiv.org/abs/2510.09499

@arXiv_csCL_bot@mastoxiv.page
2025-10-13 10:29:10

MaP: A Unified Framework for Reliable Evaluation of Pre-training Dynamics
Jiapeng Wang, Changxin Tian, Kunlong Chen, Ziqi Liu, Jiaxin Mao, Wayne Xin Zhao, Zhiqiang Zhang, Jun Zhou
arxiv.org/abs/2510.09295

@arXiv_csAI_bot@mastoxiv.page
2025-10-13 08:32:50

What Is Your Agent's GPA? A Framework for Evaluating Agent Goal-Plan-Action Alignment
Allison Sihan Jia, Daniel Huang, Nikhil Vytla, Nirvika Choudhury, John C Mitchell, Anupam Datta
arxiv.org/abs/2510.08847

@arXiv_astrophSR_bot@mastoxiv.page
2025-10-13 08:40:50

The Sonora Substellar Atmosphere Models VI. Red Diamondback: Extending Diamondback with SPHINX for Brown Dwarf Early Evolution
C. Evan Davis, Jonathan J. Fortney, Aishwarya Iyer, Sagnick Mukherjee, Caroline V. Morley, Mark S. Marley, Michael Line, Philip S. Muirhead
arxiv.org/abs/2510.08694

@cosmos4u@scicomm.xyz
2025-09-13 00:51:12

The Secular Evolution of #PlanetaryNebula IC 418 and Its Implications for Carbon Star Formation: iopscience.iop.org/article/10. -> HKU Astrophysics Research Captures 130 Years of Evolution of a Dying Star: hku.hk/press/news_detail_28550

Park Service orders changes to staff ratings, a move experts call illegal
A top National Park Service official has instructed park superintendents to limit the number of staff who get top marks in performance reviews, a move that experts say violates federal code and could make it easier to lay off staff.

Parks leadership generally evaluate individual employees annually on a five-point scale, with a three rating given to those who are successful in achieving their go…

@arXiv_csLG_bot@mastoxiv.page
2025-10-13 10:46:10

Automated Evolutionary Optimization for Resource-Efficient Neural Network Training
Ilia Revin, Leon Strelkov, Vadim A. Potemkin, Ivan Kireev, Andrey Savchenko
arxiv.org/abs/2510.09566

@arXiv_csAI_bot@mastoxiv.page
2025-10-13 09:33:10

TripScore: Benchmarking and rewarding real-world travel planning with fine-grained evaluation
Yincen Qu, Huan Xiao, Feng Li, Hui Zhou, Xiangying Dai
arxiv.org/abs/2510.09011

@arXiv_csCL_bot@mastoxiv.page
2025-10-13 10:31:50

ReTraceQA: Evaluating Reasoning Traces of Small Language Models in Commonsense Question Answering
Francesco Maria Molfese, Luca Moroni, Ciro Porcaro, Simone Conia, Roberto Navigli
arxiv.org/abs/2510.09351

@arXiv_csCL_bot@mastoxiv.page
2025-10-13 10:28:20

Inflated Excellence or True Performance? Rethinking Medical Diagnostic Benchmarks with Dynamic Evaluation
Xiangxu Zhang, Lei Li, Yanyun Zhou, Xiao Zhou, Yingying Zhang, Xian Wu
arxiv.org/abs/2510.09275