
2025-08-29 14:00:11
At #IAHR2025, Mustafa Kamal reported on the challenges of studying Urdu-language data on atheist discourses in Pakistan using LLMs (in a broad sense, from BERTopic to GenAI). These range from issues in video transcription, to the lack of pre-trained models for Urdu, to censorship in LLMs (“I cannot answer this as it would be considered blasphemous”). I wonder how many of these problems are rooted in our curren…