Tootfinder

Opt-in global Mastodon full-text search.

No exact results. Similar results found.
@arXiv_csCL_bot@mastoxiv.page
2025-09-03 14:30:53

AMBEDKAR: A Multi-level Bias Elimination through a Decoding Approach with Knowledge Augmentation for Robust Constitutional Alignment of Language Models
Snehasis Mukhopadhyay, Aryan Kasat, Shivam Dubey, Rahul Karthikeyan, Dhruv Sood, Vinija Jain, Aman Chadha, Amitava Das
arxiv.org/abs/2509.02133

@arXiv_csCL_bot@mastoxiv.page
2025-08-01 10:18:11

Role-Aware Language Models for Secure and Contextualized Access Control in Organizations
Saeed Almheiri, Yerulan Kongrat, Adrian Santosh, Ruslan Tasmukhanov, Josemaria Vera, Muhammad Dehan Al Kautsar, Fajri Koto
arxiv.org/abs/2507.23465

@arXiv_csCR_bot@mastoxiv.page
2025-08-28 09:42:41

Servant, Stalker, Predator: How An Honest, Helpful, And Harmless (3H) Agent Unlocks Adversarial Skills
David Noever
arxiv.org/abs/2508.19500

@arXiv_qbioQM_bot@mastoxiv.page
2025-07-29 08:50:11

Theoretical modeling and quantitative research on aquatic ecosystems driven by multiple factors
Haizhao Guan, Yiyuan Niu, Chuanjin Zu, Ju Kang
arxiv.org/abs/2507.19553

@arXiv_csLG_bot@mastoxiv.page
2025-07-24 10:03:49

Filter-And-Refine: A MLLM Based Cascade System for Industrial-Scale Video Content Moderation
Zixuan Wang, Jinghao Shi, Hanzhong Liang, Xiang Shen, Vera Wen, Zhiqian Chen, Yifan Wu, Zhixin Zhang, Hongyu Xiong
arxiv.org/abs/2507.17204

@arXiv_csCV_bot@mastoxiv.page
2025-07-04 10:24:11

Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection
Ziqi Miao, Yi Ding, Lijun Li, Jing Shao
arxiv.org/abs/2507.02844

@arXiv_csCR_bot@mastoxiv.page
2025-08-15 08:34:02

Context Misleads LLMs: The Role of Context Filtering in Maintaining Safe Alignment of LLMs
Jinhwa Kim, Ian G. Harris
arxiv.org/abs/2508.10031

@arXiv_csAI_bot@mastoxiv.page
2025-06-05 09:37:48

The paper arxiv.org/abs/2505.17433 has been replaced.
initial toot: mastoxiv.page/@arXiv_csAI_…

@arXiv_csCL_bot@mastoxiv.page
2025-08-19 11:44:00

Context Matters: Incorporating Target Awareness in Conversational Abusive Language Detection
Raneem Alharthi, Rajwa Alharthi, Aiqi Jiang, Arkaitz Zubiaga
arxiv.org/abs/2508.12828

@arXiv_csCL_bot@mastoxiv.page
2025-07-08 14:05:21

Response Attack: Exploiting Contextual Priming to Jailbreak Large Language Models
Ziqi Miao, Lijun Li, Yuan Xiong, Zhenhua Liu, Pengyu Zhu, Jing Shao
arxiv.org/abs/2507.05248