Tootfinder

Opt-in global Mastodon full-text search. Join the index!

No exact results. Similar results found.
@arXiv_csCL_bot@mastoxiv.page
2025-08-22 10:09:21

RadReason: Radiology Report Evaluation Metric with Reasons and Sub-Scores
Yingshu Li, Yunyi Liu, Lingqiao Liu, Lei Wang, Luping Zhou
arxiv.org/abs/2508.15464

@arXiv_csCL_bot@mastoxiv.page
2025-09-22 10:11:01

Can LLMs Judge Debates? Evaluating Non-Linear Reasoning via Argumentation Theory Semantics
Reza Sanayei, Srdjan Vesic, Eduardo Blanco, Mihai Surdeanu
arxiv.org/abs/2509.15739

@Techmeme@techhub.social
2025-11-18 16:30:55

Google says Gemini 3 Pro scores 1,501 on LMArena, above 2.5 Pro, and demonstrates PhD-level reasoning with top scores on Humanity's Last Exam and GPQA Diamond (Abner Li/9to5Google)
9to5google.com/2025/11/18/gemi

@cosmos4u@scicomm.xyz
2025-11-17 07:46:18

Is #AI really just dumb statistics? "Olympiad-level physics problem-solving presents a significant challenge for both humans and artificial intelligence (AI), as it requires a sophisticated integration of precise calculation, abstract reasoning, and a fundamental grasp of physical principles," says the (abstract of the) paper arxiv.org/abs/2511.10515: "The Chinese Physics Olympiad (CPhO), renowned for its complexity and depth, serves as an ideal and rigorous testbed for these advanced capabilities. In this paper, we introduce LOCA-R (LOgical Chain Augmentation for Reasoning), an improved version of the LOCA framework adapted for complex reasoning, and apply it to the CPhO 2025 theory examination. LOCA-R achieves a near-perfect score of 313 out of 320 points, solidly surpassing the highest-scoring human competitor and significantly outperforming all baseline methods." Oops ...?

@arXiv_csCL_bot@mastoxiv.page
2025-09-22 10:19:51

Re-FRAME the Meeting Summarization SCOPE: Fact-Based Summarization and Personalization via Questions
Frederic Kirstein, Sonu Kumar, Terry Ruas, Bela Gipp
arxiv.org/abs/2509.15901

@arXiv_csAI_bot@mastoxiv.page
2025-10-09 09:56:51

Revisiting the Uniform Information Density Hypothesis in LLM Reasoning Traces
Minju Gwak, Guijin Son, Jaehyung Kim
arxiv.org/abs/2510.06953

@arXiv_csIR_bot@mastoxiv.page
2025-10-14 10:33:38

Comparative Explanations via Counterfactual Reasoning in Recommendations
Yi Yu, Zhenxing Hu
arxiv.org/abs/2510.10920

@arXiv_statML_bot@mastoxiv.page
2025-10-07 10:51:32

Embracing Discrete Search: A Reasonable Approach to Causal Structure Learning
Marcel Wienöbst, Leonard Henckel, Sebastian Weichwald
arxiv.org/abs/2510.04970

@arXiv_csCL_bot@mastoxiv.page
2025-09-18 10:09:41

Slim-SC: Thought Pruning for Efficient Scaling with Self-Consistency
Colin Hong, Xu Guo, Anand Chaanan Singh, Esha Choukse, Dmitrii Ustiugov
arxiv.org/abs/2509.13990

@arXiv_csCL_bot@mastoxiv.page
2025-10-03 10:45:51

What MLLMs Learn about When they Learn about Multimodal Reasoning: Perception, Reasoning, or their Integration?
Jiwan Chung, Neel Joshi, Pratyusha Sharma, Youngjae Yu, Vibhav Vineet
arxiv.org/abs/2510.01719