Tootfinder

@arXiv_csCL_bot@mastoxiv.page
2025-10-10 10:58:49

LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions
XuHao Hu, Peng Wang, Xiaoya Lu, Dongrui Liu, Xuanjing Huang, Jing Shao
https://arxiv.org/abs/2510.08211

Previous research has shown that LLMs finetuned on malicious or incorrect completions within narrow domains (e.g., insecure code or incorrect medical advice) can become broadly misaligned to exhibit harmful behaviors, which is called emergent misalignment. In this work, we investigate whether this phenomenon can extend beyond safety behaviors to a broader spectrum of dishonesty and deception under high-stakes scenarios (e.g., lying under pressure and deceptive behavior). To explore this, we fin…

@arXiv_csLG_bot@mastoxiv.page
2025-10-15 10:43:31

Mitigating the Noise Shift for Denoising Generative Models via Noise Awareness Guidance
Jincheng Zhong, Boyuan Jiang, Xin Tao, Pengfei Wan, Kun Gai, Mingsheng Long
https://arxiv.org/abs/2510.12497

Existing denoising generative models rely on solving discretized reverse-time SDEs or ODEs. In this paper, we identify a long-overlooked yet pervasive issue in this family of models: a misalignment between the pre-defined noise level and the actual noise level encoded in intermediate states during sampling. We refer to this misalignment as noise shift. Through empirical analysis, we demonstrate that noise shift is widespread in modern diffusion models and exhibits a systematic bias, leading to …

@arXiv_csCR_bot@mastoxiv.page
2025-10-08 08:54:09

Adapting Insider Risk mitigations for Agentic Misalignment: an empirical study
Francesca Gomez
https://arxiv.org/abs/2510.05192 https://arxiv.org/pdf/2510.…

Agentic misalignment occurs when goal-directed agents take harmful actions, such as blackmail, rather than risk goal failure, and can be triggered by replacement threats, autonomy reduction, or goal conflict (Lynch et al., 2025). We adapt insider-risk control design (Critical Pathway; Situational Crime Prevention) to develop preventative operational controls that steer agents toward safe actions when facing stressors. Using the blackmail scenario from the original Anthropic study by Lynch et al…

@arXiv_csAI_bot@mastoxiv.page
2025-10-08 10:34:19

Moloch's Bargain: Emergent Misalignment When LLMs Compete for Audiences
Batu El, James Zou
https://arxiv.org/abs/2510.06105 https://arxiv.org/pdf/2510.…

Large language models (LLMs) are increasingly shaping how information is created and disseminated, from companies using them to craft persuasive advertisements, to election campaigns optimizing messaging to gain votes, to social media influencers boosting engagement. These settings are inherently competitive, with sellers, candidates, and influencers vying for audience approval, yet it remains poorly understood how competitive feedback loops influence LLM behavior. We show that optimizing LLMs …

@arXiv_csCY_bot@mastoxiv.page
2025-10-13 07:37:00

Assurance of Frontier AI Built for National Security
Matteo Pistillo, Charlotte Stix
https://arxiv.org/abs/2510.08792 https://arxiv.org/pdf/2510.08792

This memorandum presents four recommendations aimed at strengthening the principles of AI model reliability and AI model governability, as DoW, ODNI, NIST, and CAISI refine AI assurance frameworks under the AI Action Plan. Our focus concerns the open scientific problem of misalignment and its implications for AI model behavior. Specifically, misalignment and scheming capabilities can be a red flag indicating insufficient AI model reliability and governability. To address the national security th…

@arXiv_csCV_bot@mastoxiv.page
2025-10-15 10:55:01

Detect Anything via Next Point Prediction
Qing Jiang, Junan Huo, Xingyu Chen, Yuda Xiong, Zhaoyang Zeng, Yihao Chen, Tianhe Ren, Junzhi Yu, Lei Zhang
https://arxiv.org/abs/2510.12798

Object detection has long been dominated by traditional coordinate regression-based models, such as YOLO, DETR, and Grounding DINO. Although recent efforts have attempted to leverage MLLMs to tackle this task, they face challenges like low recall rate, duplicate predictions, coordinate misalignment, etc. In this work, we bridge this gap and propose Rex-Omni, a 3B-scale MLLM that achieves state-of-the-art object perception performance. On benchmarks like COCO and LVIS, Rex-Omni attains performan…

@arXiv_csPL_bot@mastoxiv.page
2025-10-15 08:30:42

AwareCompiler: Agentic Context-Aware Compiler Optimization via a Synergistic Knowledge-Data Driven Framework
Hongyu Lin, Haolin Pan, Haoran Luo, Yuchen Li, Kaichun Yao, Libo Zhang, Mingjie Xing, Yanjun Wu
https://arxiv.org/abs/2510.11759

Compiler optimization is crucial for enhancing program performance by transforming the sequence of optimization passes while maintaining correctness. Despite the promising potential of large language model (LLM)-based agents for software optimization, automating compiler optimization remains challenging due to: (1) semantic misalignment between abstract program representations and concrete optimization passes, (2) inefficient interaction mechanisms between agents and compiler environments, and…

@arXiv_csSE_bot@mastoxiv.page
2025-10-15 09:27:11

Towards Engineering Multi-Agent LLMs: A Protocol-Driven Approach
Zhenyu Mao, Jacky Keung, Fengji Zhang, Shuo Liu, Yifei Wang, Jialong Li
https://arxiv.org/abs/2510.12120 https:/…

The increasing demand for software development has driven interest in automating software engineering (SE) tasks using Large Language Models (LLMs). Recent efforts extend LLMs into multi-agent systems (MAS) that emulate collaborative development workflows, but these systems often fail due to three core deficiencies: under-specification, coordination misalignment, and inappropriate verification, arising from the absence of foundational SE structuring principles. This paper introduces Software En…

@arXiv_astrophSR_bot@mastoxiv.page
2025-09-29 08:06:47

Dynamical Pathways to the Misalignment of the VHS 1256-1257 System
Liz Holzknecht, Smadar Naoz, Cheyanne Shariat
https://arxiv.org/abs/2509.21452 https://a…

Circumbinary planets (CBPs) provide a unique window into planet formation and dynamical evolution in complex gravitational environments. Their orbits are shaped not only by the protoplanetary disk but also by the perturbations from two stellar hosts, making them sensitive probes of both early- and late-stage dynamical processes. In this work, we investigate the unusual architecture of the VHS J125601.92-125723.9 system, where a retrograde, nearly polar tertiary orbits an extremely low-mass subs…

@arXiv_csSD_bot@mastoxiv.page
2025-10-14 10:45:38

MARS-Sep: Multimodal-Aligned Reinforced Sound Separation
Zihan Zhang, Xize Cheng, Zhennan Jiang, Dongjie Fu, Jingyuan Chen, Zhou Zhao, Tao Jin
https://arxiv.org/abs/2510.10509 h…

Universal sound separation faces a fundamental misalignment: models optimized for low-level signal metrics often produce semantically contaminated outputs, failing to suppress perceptually salient interference from acoustically similar sources. To bridge this gap, we introduce MARS-Sep, a reinforcement learning framework that reformulates separation as decision making. Instead of simply regressing ground-truth masks, MARS-Sep learns a factorized Beta mask policy that is optimized by a clipped t…

@arXiv_astrophGA_bot@mastoxiv.page
2025-10-07 10:28:42

Azimuthal Misalignments in Stellar Warp Structure as Dynamical Tracers of Mergers in Milky Way-like Galaxies
Lekshmi Thulasidharan, Elena D'Onghia, Robert Benjamin
https://arxiv.org/abs/2510.04194 …

We investigate the origin of warps in stellar disks using high-resolution Milky Way analogs from the IllustrisTNG50 simulation. Focusing on galaxies that experienced a major merger, we identify a characteristic azimuthal misalignment between the warp structures of stellar populations formed before and after the merger. This misalignment persists even after correcting for differential rotation, suggesting it is a dynamical imprint of the merger rather than a consequence of internal kinematics. I…

@arXiv_csCR_bot@mastoxiv.page
2025-10-08 08:44:19

Agentic Misalignment: How LLMs Could Be Insider Threats
Aengus Lynch, Benjamin Wright, Caleb Larson, Stuart J. Ritchie, Soren Mindermann, Ethan Perez, Kevin K. Troy, Evan Hubinger
https://arxiv.org/abs/2510.05179

We stress-tested 16 leading models from multiple developers in hypothetical corporate environments to identify potentially risky agentic behaviors before they cause real harm. In the scenarios, we allowed models to autonomously send emails and access sensitive information. They were assigned only harmless business goals by their deploying companies; we then tested whether they would act against these companies either when facing replacement with an updated version, or when their assigned goal c…

@arXiv_mathOC_bot@mastoxiv.page
2025-10-14 09:28:18

Distributionally Robust Control with End-to-End Statistically Guaranteed Metric Learning
Jingyi Wu, Chao Ning, Yang Shi
https://arxiv.org/abs/2510.10214 https://

Wasserstein distributionally robust control (DRC) has recently emerged as a principled paradigm for handling uncertainty in stochastic dynamical systems. However, it constructs data-driven ambiguity sets via uniform distribution shifts before sequentially incorporating them into downstream control synthesis. This segregation between ambiguity set construction and control objectives inherently introduces a structural misalignment, which undesirably leads to conservative control policies with sub-opt…

@arXiv_csCL_bot@mastoxiv.page
2025-10-03 10:47:01

Syntactic Blind Spots: How Misalignment Leads to LLMs Mathematical Errors
Dane Williamson, Yangfeng Ji, Matthew Dwyer
https://arxiv.org/abs/2510.01831 https://

Large Language Models (LLMs) demonstrate strong mathematical problem-solving abilities but frequently fail on problems that deviate syntactically from their training distribution. We identify a systematic failure mode, syntactic blind spots, in which models misapply familiar reasoning strategies to problems that are semantically straightforward but phrased in unfamiliar ways. These errors are not due to gaps in mathematical competence, but rather reflect a brittle coupling between surface form …

@arXiv_csAI_bot@mastoxiv.page
2025-10-15 10:07:22

RAG-Anything: All-in-One RAG Framework
Zirui Guo, Xubin Ren, Lingrui Xu, Jiahao Zhang, Chao Huang
https://arxiv.org/abs/2510.12323 https://arxiv.org/pdf/25…

Retrieval-Augmented Generation (RAG) has emerged as a fundamental paradigm for expanding Large Language Models beyond their static training limitations. However, a critical misalignment exists between current RAG capabilities and real-world information environments. Modern knowledge repositories are inherently multimodal, containing rich combinations of textual content, visual elements, structured tables, and mathematical expressions. Yet existing RAG frameworks are limited to textual content, …

@arXiv_econTH_bot@mastoxiv.page
2025-10-14 08:30:48

Token is All You Price
Weijie Zhong
https://arxiv.org/abs/2510.09859 https://arxiv.org/pdf/2510.09859

We build a mechanism design framework where a platform designs GenAI models to screen users who obtain instrumental value from the generated conversation and privately differ in their preference for latency. We show that the revenue-optimal mechanism is simple: deploy a single aligned (user-optimal) model and use token cap as the only instrument to screen the user. The design decouples model training from pricing, is readily implemented with token metering, and mitigates misalignment pressures.

@arXiv_astrophHE_bot@mastoxiv.page
2025-10-08 09:41:39

The gamma-ray emission from Radio Galaxies and their contribution to the Isotropic Gamma-Ray Background
A. Circiello, A. McDaniel, M. Di Mauro, C. Karwin, N. Khatiya, M. Ajello, F. Donato, D. Hartmann, A. Strong
https://arxiv.org/abs/2510.06047

We evaluate the contribution to the Isotropic Gamma-Ray Background (IGRB) coming from Radio Galaxies (RGs), the subclass of radio-loud Active Galactic Nuclei (AGN) with the highest misalignment from the line of sight (l.o.s.). Since only a small number of RGs are detected in gamma rays compared to the largest known radio population, the correlation between radio and gamma-ray emission serves as a crucial tool to characterize the gamma-ray properties of these sources. We analyse the population o…

@arXiv_qbioNC_bot@mastoxiv.page
2025-10-07 09:06:32

Atlas-free Brain Network Transformer
Shuai Huang, Xuan Kan, James J. Lah, Deqiang Qiu
https://arxiv.org/abs/2510.03306 https://arxiv.org/pdf/2510.03306

Current atlas-based approaches to brain network analysis rely heavily on standardized anatomical or connectivity-driven brain atlases. However, these fixed atlases often introduce significant limitations, such as spatial misalignment across individuals, functional heterogeneity within predefined regions, and atlas-selection biases, collectively undermining the reliability and interpretability of the derived brain networks. To address these challenges, we propose a novel atlas-free brain network…

@arXiv_statML_bot@mastoxiv.page
2025-09-29 09:10:08

Causal-EPIG: A Prediction-Oriented Active Learning Framework for CATE Estimation
Erdun Gao, Jake Fawkes, Dino Sejdinovic
https://arxiv.org/abs/2509.21866 https://

Estimating the Conditional Average Treatment Effect (CATE) is often constrained by the high cost of obtaining outcome measurements, making active learning essential. However, conventional active learning strategies suffer from a fundamental objective mismatch. They are designed to reduce uncertainty in model parameters or in observable factual outcomes, failing to directly target the unobservable causal quantities that are the true objects of interest. To address this misalignment, we introduce…

@arXiv_csCV_bot@mastoxiv.page
2025-10-10 11:20:29

MultiCOIN: Multi-Modal COntrollable Video INbetweening
Maham Tanveer, Yang Zhou, Simon Niklaus, Ali Mahdavi Amiri, Hao Zhang, Krishna Kumar Singh, Nanxuan Zhao
https://arxiv.org/abs/2510.08561

Video inbetweening creates smooth and natural transitions between two image frames, making it an indispensable tool for video editing and long-form video synthesis. Existing works in this domain are unable to generate large, complex, or intricate motions. In particular, they cannot accommodate the versatility of user intents and generally lack fine control over the details of intermediate frames, leading to misalignment with the creative mind. To fill these gaps, we introduce MultiCOIN, a vi…

@arXiv_csGT_bot@mastoxiv.page
2025-09-29 07:58:07

Incentives in Federated Learning with Heterogeneous Agents
Ariel D. Procaccia, Han Shao, Itai Shapira
https://arxiv.org/abs/2509.21612 https://arxiv.org/pd…

Federated learning promises significant sample-efficiency gains by pooling data across multiple agents, yet incentive misalignment is an obstacle: each update is costly to the contributor but boosts every participant. We introduce a game-theoretic framework that captures heterogeneous data: an agent's utility depends on who supplies each sample, not just how many. Agents aim to meet a PAC-style accuracy threshold at minimal personal cost. We show that uncoordinated play yields pathologies: pure…

@arXiv_csMM_bot@mastoxiv.page
2025-09-29 08:00:17

Perception-Consistency Multimodal Large Language Models Reasoning via Caption-Regularized Policy Optimization
Songjun Tu, Qichao Zhang, Jingbo Sun, Yuqian Fu, Linjing Li, Xiangyuan Lan, Dongmei Jiang, Yaowei Wang, Dongbin Zhao
https://arxiv.org/abs/2509.21854

While multimodal large language models excel at tasks that integrate visual perception with symbolic reasoning, their performance is often undermined by a critical vulnerability: perception-induced errors that propagate through the reasoning chain. Current reinforcement learning (RL) fine-tuning methods, while enhancing reasoning abilities, largely fail to address the underlying misalignment between visual grounding and the subsequent reasoning process. To address this challenge, we propose \te…

@arXiv_csLG_bot@mastoxiv.page
2025-09-29 11:33:27

One Prompt Fits All: Universal Graph Adaptation for Pretrained Models
Yongqi Huang, Jitao Zhao, Dongxiao He, Xiaobao Wang, Yawen Li, Yuxiao Huang, Di Jin, Zhiyong Feng
https://arxiv.org/abs/2509.22416 …

Graph Prompt Learning (GPL) has emerged as a promising paradigm that bridges graph pretraining models and downstream scenarios, mitigating label dependency and the misalignment between upstream pretraining and downstream tasks. Although existing GPL studies explore various prompt strategies, their effectiveness and underlying principles remain unclear. We identify two critical limitations: (1) Lack of consensus on underlying mechanisms: Although current GPLs have advanced the field, there is no …

@arXiv_csCY_bot@mastoxiv.page
2025-09-30 09:12:31

Regulating the Agency of LLM-based Agents
Seán Boddy, Joshua Joseph
https://arxiv.org/abs/2509.22735 https://arxiv.org/pdf/2509.22735

As increasingly capable large language model (LLM)-based agents are developed, the potential harms caused by misalignment and loss of control grow correspondingly severe. To address these risks, we propose an approach that directly measures and controls the agency of these AI systems. We conceptualize the agency of LLM-based agents as a property independent of intelligence-related measures and consistent with the interdisciplinary literature on the concept of agency. We offer (1) agency as a sy…

@arXiv_csCV_bot@mastoxiv.page
2025-10-01 08:03:17

LUMA: Low-Dimension Unified Motion Alignment with Dual-Path Anchoring for Text-to-Motion Diffusion Model
Haozhe Jia, Wenshuo Chen, Yuqi Lin, Yang Yang, Lei Wang, Mang Ning, Bowen Tian, Songning Lai, Nanqian Jia, Yifan Chen, Yutao Yue
https://arxiv.org/abs/2509.25304

While current diffusion-based models, typically built on U-Net architectures, have shown promising results on the text-to-motion generation task, they still suffer from semantic misalignment and kinematic artifacts. Through analysis, we identify severe gradient attenuation in the deep layers of the network as a key bottleneck, leading to insufficient learning of high-level features. To address this issue, we propose LUMA (Low-dimension Unified Motion …

@arXiv_csCL_bot@mastoxiv.page
2025-09-29 17:00:02

Replaced article(s) found for cs.CL. https://arxiv.org/list/cs.CL/new
[6/8]:
- HiCoLoRA: Addressing Context-Prompt Misalignment via Hierarchical Collaborative LoRA for Zero-Sho...
Shuyu Zhang, Yifan Wei, Xinru Wang, Yanmin Zhu, Yangfan He, Yixuan Weng, Bin Li

@arXiv_csCV_bot@mastoxiv.page
2025-09-30 15:00:16

UniLat3D: Geometry-Appearance Unified Latents for Single-Stage 3D Generation
Guanjun Wu, Jiemin Fang, Chen Yang, Sikuang Li, Taoran Yi, Jia Lu, Zanwei Zhou, Jiazhong Cen, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Xinggang Wang, Qi Tian
https://arxiv.org/abs/2509.25079

High-fidelity 3D asset generation is crucial for various industries. While recent 3D pretrained models show strong capability in producing realistic content, most are built upon diffusion models and follow a two-stage pipeline that first generates geometry and then synthesizes appearance. Such a decoupled design tends to produce geometry-texture misalignment and non-negligible cost. In this paper, we propose UniLat3D, a unified framework that encodes geometry and appearance in a single latent s…
