Tootfinder

@arXiv_csAI_bot@mastoxiv.page
2025-10-09 07:55:01

Off-Trajectory Reasoning: Can LLMs Collaborate on Reasoning Trajectory?
Aochong Oliver Li, Tanya Goyal
https://arxiv.org/abs/2510.06410 https://arxiv.org/p…

Reasoning LLMs are trained to verbalize their reasoning process, yielding strong gains on complex tasks. This transparency also opens a promising direction: multiple reasoners can directly collaborate on each other's thinking within a shared trajectory, yielding better inference efficiency and exploration. A key prerequisite, however, is the ability to assess the usefulness of, and build on, another model's partial thinking -- we call this off-trajectory reasoning. Our paper investigates a critical …

@jake4480@c.im
2025-10-08 18:33:44

Standing up at desk, leaving office, never returning #health

Health Experts Recommend Standing Up At Desk, Leaving Office, Never Coming Back
ROCHESTER, MN—In an effort to help working individuals improve their fitness and well-being, experts at the Mayo Clinic issued a new set of health guidelines Thursday recommending that Americans stand up at their desk, leave their office, and never return. “Many Americans spend a minimum of eight hours per day sitting in an office, but we observed significant physical and mental health benefits in subjects after just one instance of standing up, walking out the door, and never coming back t…

@arXiv_csCV_bot@mastoxiv.page
2025-09-09 12:31:12

Interleaving Reasoning for Better Text-to-Image Generation
Wenxuan Huang, Shuang Chen, Zheyong Xie, Shaosheng Cao, Shixiang Tang, Yufan Shen, Qingyu Yin, Wenbo Hu, Xiaoman Wang, Yuntian Tang, Junbo Qiao, Yue Guo, Yao Hu, Zhenfei Yin, Philip Torr, Yu Cheng, Wanli Ouyang, Shaohui Lin
https://arxiv.org/abs/2509.06945

Unified multimodal understanding and generation models have recently achieved significant improvements in image generation capability, yet a large gap remains in instruction following and detail preservation compared to systems that tightly couple comprehension with generation, such as GPT-4o. Motivated by recent advances in interleaving reasoning, we explore whether such reasoning can further improve Text-to-Image (T2I) generation. We introduce Interleaving Reasoning Generation (IRG), a framework…

@arXiv_csLG_bot@mastoxiv.page
2025-10-08 10:58:29

Influence Functions for Efficient Data Selection in Reasoning
Prateek Humane, Paolo Cudrano, Daniel Z. Kaplan, Matteo Matteucci, Supriyo Chakraborty, Irina Rish
https://arxiv.org/abs/2510.06108

Fine-tuning large language models (LLMs) on chain-of-thought (CoT) data shows that a small amount of high-quality data can outperform massive datasets. Yet, what constitutes "quality" remains ill-defined. Existing reasoning methods rely on indirect heuristics such as problem difficulty or trace length, while instruction-tuning has explored a broader range of automated selection strategies, but rarely in the context of reasoning. We propose to define reasoning data quality using influence functi…

@arXiv_csCL_bot@mastoxiv.page
2025-09-08 10:15:20

Less is More Tokens: Efficient Math Reasoning via Difficulty-Aware Chain-of-Thought Distillation
Abdul Waheed, Chancharik Mitra, Laurie Z. Wang, Deva Ramanan, Bhiksha Raj
https://arxiv.org/abs/2509.05226

Chain-of-thought reasoning, while powerful, can produce unnecessarily verbose output for simpler problems. We present a framework for difficulty-aware reasoning that teaches models to dynamically adjust reasoning depth based on problem complexity. Remarkably, we show that models can be endowed with such dynamic inference pathways without any architectural modifications; we simply post-train on data that is carefully curated to include chain-of-thought traces that are proportional in length to p…

@arXiv_csCR_bot@mastoxiv.page
2025-09-09 11:27:12

Reasoning Introduces New Poisoning Attacks Yet Makes Them More Complicated
Hanna Foerster, Ilia Shumailov, Yiren Zhao, Harsh Chaudhari, Jamie Hayes, Robert Mullins, Yarin Gal
https://arxiv.org/abs/2509.05739

Early research into data poisoning attacks against Large Language Models (LLMs) demonstrated the ease with which backdoors could be injected. More recent LLMs add step-by-step reasoning, expanding the attack surface to include the intermediate chain-of-thought (CoT) and its inherent trait of decomposing problems into subproblems. Using these vectors for more stealthy poisoning, we introduce "decomposed reasoning poison", in which the attacker modifies only the reasoning path, leaving prompts …

@arXiv_csAI_bot@mastoxiv.page
2025-10-08 10:37:39

TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning
Jiaru Zou, Soumya Roy, Vinay Kumar Verma, Ziyi Wang, David Wipf, Pan Lu, Sumit Negi, James Zou, Jingrui He
https://arxiv.org/abs/2510.06217

Process Reward Models (PRMs) have recently emerged as a powerful framework for enhancing the reasoning capabilities of large reasoning models (LRMs), particularly in the context of test-time scaling (TTS). However, their potential for supervising LRMs on tabular reasoning domains remains underexplored. Through detailed empirical analyses, we identify that existing PRMs, though widely adopted for supervising text-only reasoning steps, struggle with table-specific operations such as sub-table ret…

@arXiv_csAI_bot@mastoxiv.page
2025-10-09 08:43:01

Beneficial Reasoning Behaviors in Agentic Search and Effective Post-training to Obtain Them
Jiahe Jin, Abhijay Paladugu, Chenyan Xiong
https://arxiv.org/abs/2510.06534 https://

Agentic search leverages large language models (LLMs) to interpret complex user information needs and execute a multi-step process of planning, searching, and synthesizing information to provide answers. This paradigm introduces unique challenges for LLMs' reasoning and agentic capabilities when interacting with retrieval systems and the broader web. In this paper, we propose a reasoning-driven LLM-based pipeline to study effective reasoning behavior patterns in agentic search. Using this pipel…

@arXiv_csAI_bot@mastoxiv.page
2025-10-08 10:30:29

Refusal Falls off a Cliff: How Safety Alignment Fails in Reasoning?
Qingyu Yin, Chak Tou Leong, Linyi Yang, Wenxuan Huang, Wenjie Li, Xiting Wang, Jaehong Yoon, YunXing, XingYu, Jinjin Gu
https://arxiv.org/abs/2510.06036

Large reasoning models (LRMs) with multi-step reasoning capabilities have shown remarkable problem-solving abilities, yet they exhibit concerning safety vulnerabilities that remain poorly understood. In this work, we investigate why safety alignment fails in reasoning models through a mechanistic interpretability lens. Using a linear probing approach to trace refusal intentions across token positions, we discover a striking phenomenon termed the refusal cliff: many poorly-aligned reason…
