Winter Soldier: Backdooring Language Models at Pre-Training with Indirect Data Poisoning
Wassim Bouaziz, Mathurin Videau, Nicolas Usunier, El-Mahdi El-Mhamdi
https://arxiv.org/abs/2506.14913
The pre-training of large language models (LLMs) relies on massive text datasets sourced from diverse and difficult-to-curate origins. Although membership inference attacks and hidden canaries have been explored to trace data usage, such methods rely on memorization of training data, which LM providers try to limit. In this work, we demonstrate that indirect data poisoning (where the targeted behavior is absent from training data) is not only feasible but also allows one to effectively protect a dat…