Tootfinder

Opt-in global Mastodon full text search. Join the index!

No exact results. Similar results found.
@arXiv_csCY_bot@mastoxiv.page
2025-07-10 08:57:01

Deprecating Benchmarks: Criteria and Framework
Ayrton San Joaquin, Rokas Gipiškis, Leon Staufer, Ariel Gil
arxiv.org/abs/2507.06434

@arXiv_csSE_bot@mastoxiv.page
2025-09-09 11:28:22

Efficiently Ranking Software Variants with Minimal Benchmarks
Théo Matricon, Mathieu Acher, Helge Spieker, Arnaud Gotlieb
arxiv.org/abs/2509.06716

@Techmeme@techhub.social
2025-07-10 12:11:22

Artificial Analysis' benchmarks show Grok 4 is the leading AI model, a first for xAI, and its per-token pricing is more expensive than Gemini 2.5 Pro and o3 (@artificialanlys)
x.com/artificialanlys/status/1

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:31:01

SimpleQA Verified: A Reliable Factuality Benchmark to Measure Parametric Knowledge
Lukas Haas, Gal Yona, Giovanni D'Antonio, Sasha Goldshtein, Dipanjan Das
arxiv.org/abs/2509.07968

@crell@phpc.social
2025-09-08 16:55:01

I benchmarked #PHP's native serializer vs code export. You won't believe what I found!
peakd.com/hive-168588/@crell/b

@arXiv_quantph_bot@mastoxiv.page
2025-09-09 12:01:02

Benchmarking Single-Qubit Gates on a Neutral Atom Quantum Processor
Artem Rozanov, Boris Bantysh, Ivan Bobrov, Gleb Struchalin, Stanislav Straupe
arxiv.org/abs/2509.06881

@ErikJonker@mastodon.social
2025-07-10 14:17:00

A new day, a new AI benchmark.
#ai #benchmarks

It’s exactly four weeks ago today that the Jeffrey Epstein story broke, or re-broke in its current form.
On Friday, July 11, the world learned of the tense meeting that took place at the White House that previous Wednesday, in which FBI Deputy Director Dan Bongino clashed with Attorney General Pam Bondi over the handling of the Epstein files.
Bongino was so incensed that he didn’t go to work that Friday and threatened to resign.
He has, at least for now…

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 09:03:31

DischargeSim: A Simulation Benchmark for Educational Doctor-Patient Communication at Discharge
Zonghai Yao, Michael Sun, Won Seok Jang, Sunjae Kwon, Soie Kwon, Hong Yu
arxiv.org/abs/2509.07188

@arXiv_csSE_bot@mastoxiv.page
2025-07-09 08:52:22

CoreCodeBench: A Configurable Multi-Scenario Repository-Level Benchmark
Lingyue Fu, Hao Guan, Bolun Zhang, Haowei Yuan, Yaoming Zhu, Jun Xu, Zongyu Wang, Lin Qiu, Xunliang Cai, Xuezhi Cao, Weiwen Liu, Weinan Zhang, Yong Yu
arxiv.org/abs/2507.05281