Tootfinder

Opt-in global Mastodon full-text search. Join the index!

@Techmeme@techhub.social
2025-11-21 22:40:59

Anthropic finds that LLMs trained to "reward hack" by cheating on coding tasks show even more misaligned behavior, including sabotaging AI-safety research (Anthropic)
anthropic.com/research/emergen

@arXiv_csCL_bot@mastoxiv.page
2025-10-10 10:58:49

LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions
XuHao Hu, Peng Wang, Xiaoya Lu, Dongrui Liu, Xuanjing Huang, Jing Shao
arxiv.org/abs/2510.08211

@arXiv_csLG_bot@mastoxiv.page
2025-10-15 10:43:31

Mitigating the Noise Shift for Denoising Generative Models via Noise Awareness Guidance
Jincheng Zhong, Boyuan Jiang, Xin Tao, Pengfei Wan, Kun Gai, Mingsheng Long
arxiv.org/abs/2510.12497

@arXiv_csCR_bot@mastoxiv.page
2025-10-08 08:54:09

Adapting Insider Risk mitigations for Agentic Misalignment: an empirical study
Francesca Gomez
arxiv.org/abs/2510.05192 arxiv.org/pdf/2510.…

@arXiv_csAI_bot@mastoxiv.page
2025-10-08 10:34:19

Moloch's Bargain: Emergent Misalignment When LLMs Compete for Audiences
Batu El, James Zou
arxiv.org/abs/2510.06105 arxiv.org/pdf/2510.…

@arXiv_csCY_bot@mastoxiv.page
2025-10-13 07:37:00

Assurance of Frontier AI Built for National Security
Matteo Pistillo, Charlotte Stix
arxiv.org/abs/2510.08792 arxiv.org/pdf/2510.08792

@arXiv_csCV_bot@mastoxiv.page
2025-10-15 10:55:01

Detect Anything via Next Point Prediction
Qing Jiang, Junan Huo, Xingyu Chen, Yuda Xiong, Zhaoyang Zeng, Yihao Chen, Tianhe Ren, Junzhi Yu, Lei Zhang
arxiv.org/abs/2510.12798

@arXiv_csPL_bot@mastoxiv.page
2025-10-15 08:30:42

AwareCompiler: Agentic Context-Aware Compiler Optimization via a Synergistic Knowledge-Data Driven Framework
Hongyu Lin, Haolin Pan, Haoran Luo, Yuchen Li, Kaichun Yao, Libo Zhang, Mingjie Xing, Yanjun Wu
arxiv.org/abs/2510.11759

@arXiv_csSE_bot@mastoxiv.page
2025-10-15 09:27:11

Towards Engineering Multi-Agent LLMs: A Protocol-Driven Approach
Zhenyu Mao, Jacky Keung, Fengji Zhang, Shuo Liu, Yifei Wang, Jialong Li
arxiv.org/abs/2510.12120

@arXiv_astrophSR_bot@mastoxiv.page
2025-09-29 08:06:47

Dynamical Pathways to the Misalignment of the VHS 1256-1257 System
Liz Holzknecht, Smadar Naoz, Cheyanne Shariat
arxiv.org/abs/2509.21452 a…

@arXiv_csSD_bot@mastoxiv.page
2025-10-14 10:45:38

MARS-Sep: Multimodal-Aligned Reinforced Sound Separation
Zihan Zhang, Xize Cheng, Zhennan Jiang, Dongjie Fu, Jingyuan Chen, Zhou Zhao, Tao Jin
arxiv.org/abs/2510.10509

@arXiv_astrophGA_bot@mastoxiv.page
2025-10-07 10:28:42

Azimuthal Misalignments in Stellar Warp Structure as Dynamical Tracers of Mergers in Milky Way-like Galaxies
Lekshmi Thulasidharan, Elena D'Onghia, Robert Benjamin
arxiv.org/abs/2510.04194

@arXiv_csCR_bot@mastoxiv.page
2025-10-08 08:44:19

Agentic Misalignment: How LLMs Could Be Insider Threats
Aengus Lynch, Benjamin Wright, Caleb Larson, Stuart J. Ritchie, Soren Mindermann, Ethan Perez, Kevin K. Troy, Evan Hubinger
arxiv.org/abs/2510.05179

@arXiv_mathOC_bot@mastoxiv.page
2025-10-14 09:28:18

Distributionally Robust Control with End-to-End Statistically Guaranteed Metric Learning
Jingyi Wu, Chao Ning, Yang Shi
arxiv.org/abs/2510.10214

@arXiv_csCL_bot@mastoxiv.page
2025-10-03 10:47:01

Syntactic Blind Spots: How Misalignment Leads to LLMs Mathematical Errors
Dane Williamson, Yangfeng Ji, Matthew Dwyer
arxiv.org/abs/2510.01831

@arXiv_csAI_bot@mastoxiv.page
2025-10-15 10:07:22

RAG-Anything: All-in-One RAG Framework
Zirui Guo, Xubin Ren, Lingrui Xu, Jiahao Zhang, Chao Huang
arxiv.org/abs/2510.12323 arxiv.org/pdf/25…

@arXiv_econTH_bot@mastoxiv.page
2025-10-14 08:30:48

Token is All You Price
Weijie Zhong
arxiv.org/abs/2510.09859 arxiv.org/pdf/2510.09859

@arXiv_astrophHE_bot@mastoxiv.page
2025-10-08 09:41:39

The gamma-ray emission from Radio Galaxies and their contribution to the Isotropic Gamma-Ray Background
A. Circiello, A. McDaniel, M. Di Mauro, C. Karwin, N. Khatiya, M. Ajello, F. Donato, D. Hartmann, A. Strong
arxiv.org/abs/2510.06047

@arXiv_qbioNC_bot@mastoxiv.page
2025-10-07 09:06:32

Atlas-free Brain Network Transformer
Shuai Huang, Xuan Kan, James J. Lah, Deqiang Qiu
arxiv.org/abs/2510.03306 arxiv.org/pdf/2510.03306

@arXiv_statML_bot@mastoxiv.page
2025-09-29 09:10:08

Causal-EPIG: A Prediction-Oriented Active Learning Framework for CATE Estimation
Erdun Gao, Jake Fawkes, Dino Sejdinovic
arxiv.org/abs/2509.21866

@arXiv_csCV_bot@mastoxiv.page
2025-10-10 11:20:29

MultiCOIN: Multi-Modal COntrollable Video INbetweening
Maham Tanveer, Yang Zhou, Simon Niklaus, Ali Mahdavi Amiri, Hao Zhang, Krishna Kumar Singh, Nanxuan Zhao
arxiv.org/abs/2510.08561

@arXiv_csGT_bot@mastoxiv.page
2025-09-29 07:58:07

Incentives in Federated Learning with Heterogeneous Agents
Ariel D. Procaccia, Han Shao, Itai Shapira
arxiv.org/abs/2509.21612 arxiv.org/pd…

@arXiv_csMM_bot@mastoxiv.page
2025-09-29 08:00:17

Perception-Consistency Multimodal Large Language Models Reasoning via Caption-Regularized Policy Optimization
Songjun Tu, Qichao Zhang, Jingbo Sun, Yuqian Fu, Linjing Li, Xiangyuan Lan, Dongmei Jiang, Yaowei Wang, Dongbin Zhao
arxiv.org/abs/2509.21854

@arXiv_csLG_bot@mastoxiv.page
2025-09-29 11:33:27

One Prompt Fits All: Universal Graph Adaptation for Pretrained Models
Yongqi Huang, Jitao Zhao, Dongxiao He, Xiaobao Wang, Yawen Li, Yuxiao Huang, Di Jin, Zhiyong Feng
arxiv.org/abs/2509.22416

@arXiv_csCY_bot@mastoxiv.page
2025-09-30 09:12:31

Regulating the Agency of LLM-based Agents
Seán Boddy, Joshua Joseph
arxiv.org/abs/2509.22735 arxiv.org/pdf/2509.22735

@arXiv_csCV_bot@mastoxiv.page
2025-10-01 08:03:17

LUMA: Low-Dimension Unified Motion Alignment with Dual-Path Anchoring for Text-to-Motion Diffusion Model
Haozhe Jia, Wenshuo Chen, Yuqi Lin, Yang Yang, Lei Wang, Mang Ning, Bowen Tian, Songning Lai, Nanqian Jia, Yifan Chen, Yutao Yue
arxiv.org/abs/2509.25304

@arXiv_csCL_bot@mastoxiv.page
2025-09-29 17:00:02

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[6/8]:
- HiCoLoRA: Addressing Context-Prompt Misalignment via Hierarchical Collaborative LoRA for Zero-Sho...
Shuyu Zhang, Yifan Wei, Xinru Wang, Yanmin Zhu, Yangfan He, Yixuan Weng, Bin Li

@arXiv_csCV_bot@mastoxiv.page
2025-09-30 15:00:16

UniLat3D: Geometry-Appearance Unified Latents for Single-Stage 3D Generation
Guanjun Wu, Jiemin Fang, Chen Yang, Sikuang Li, Taoran Yi, Jia Lu, Zanwei Zhou, Jiazhong Cen, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Xinggang Wang, Qi Tian
arxiv.org/abs/2509.25079