Tootfinder

Opt-in global Mastodon full text search. Join the index!

@arXiv_csCL_bot@mastoxiv.page
2025-08-04 09:53:00

MELAC: Massive Evaluation of Large Language Models with Alignment of Culture in Persian Language
Farhan Farsi, Farnaz Aghababaloo, Shahriar Shariati Motlagh, Parsa Ghofrani, MohammadAli SadraeiJavaheri, Shayan Bali, Amirhossein Shabani, Farbod Bijary, Ghazal Zamaninejad, AmirMohammad Salehoof, Saeedeh Momtazi
arxiv.org/abs/2508.006…

@arXiv_csCL_bot@mastoxiv.page
2025-07-03 10:03:40

MuRating: A High Quality Data Selecting Approach to Multilingual Large Language Model Pretraining
Zhixun Chen, Ping Guo, Wenhan Han, Yifan Zhang, Binbin Liu, Haobin Lin, Fengze Liu, Yan Zhao, Bingni Zhang, Taifeng Wang, Yin Zheng, Meng Fang
arxiv.org/abs/2507.01785

@netzschleuder@social.skewed.de
2025-09-03 09:00:08

wordnet: WordNet relationships
A network of English words from WordNet. Each node is a word, and each edge denotes a relationship between words (synonymy, hyperonymy, meronymy, etc.). The date at which this network was extracted from WordNet is unknown.
This network has 146005 nodes and 656999 edges.
Tags: Informational, Language, Unweighted

wordnet: WordNet relationships. 146005 nodes, 656999 edges. https://networks.skewed.de/net/wordnet

@arXiv_csCY_bot@mastoxiv.page
2025-09-04 09:10:11

SESGO: Spanish Evaluation of Stereotypical Generative Outputs
Melissa Robles, Catalina Bernal, Denniss Raigoso, Mateo Dulce Rubio
arxiv.org/abs/2509.03329

@arXiv_csIR_bot@mastoxiv.page
2025-07-01 09:04:23

Teaching a Language Model to Speak the Language of Tools
Simeon Emanuilov
arxiv.org/abs/2506.23394 arxiv.org/pdf/2506…

@arXiv_csCL_bot@mastoxiv.page
2025-09-03 14:45:13

Comparative Study of Pre-Trained BERT and Large Language Models for Code-Mixed Named Entity Recognition
Mayur Shirke, Amey Shembade, Pavan Thorat, Madhushri Wagh, Raviraj Joshi
arxiv.org/abs/2509.02514

@arXiv_csHC_bot@mastoxiv.page
2025-08-04 09:28:50

The Manipulative Power of Voice Characteristics: Investigating Deceptive Patterns in Mandarin Chinese Female Synthetic Speech
Shuning Zhang (Tsinghua University, Beijing, China), Han Chen (Wuhan Institute of Technology, Wuhan, China), Yabo Wang (Tsinghua University, Beijing, China), Yiqun Xu (Tsinghua University, Beijing, China), Jiaqi Bai (Tsinghua University, Beijing, China), Yuanyuan Wu (Shanghai Jiaotong University, Shanghai, China), Shixuan Li (Tsinghua University, Beijing, China)…

@v_i_o_l_a@openbiblio.social
2025-07-24 11:36:41

"The Most Important Word in the English Language" #connect

@arXiv_eessSY_bot@mastoxiv.page
2025-07-02 09:17:41

Verifiable Natural Language to Linear Temporal Logic Translation: A Benchmark Dataset and Evaluation Suite
William H English, Chase Walker, Dominic Simon, Sumit Kumar Jha, Rickard Ewetz
arxiv.org/abs/2507.00877

@netzschleuder@social.skewed.de
2025-09-03 20:00:12

wiki_article_words: Wikipedia article-words (en) (2010)
A bipartite network of English Wikipedia articles and the words they contain. The edge weight gives the number of times a word appeared in the connected article.
This network has 276739 nodes and 2941902 edges.
Tags: Informational, Language, Weighted

wiki_article_words: Wikipedia article-words (en) (2010). 276739 nodes, 2941902 edges. https://networks.skewed.de/net/wiki_article_words

@spamless@mastodon.social
2025-07-30 12:11:34

German-English connections: "anecken" and "to egg on." These words don't mean exactly the same thing, but they are related. Etymologically, they seem closely related. English is a Germanic language. And phrasal verbs offer up interesting connections.
#language #linguistics

@relcfp@mastodon.social
2025-08-04 16:10:48

5th Hawaii International Conference on English Language and Literature Studies (HICELLS 2026)
ift.tt/KS5dAaj
updated: Monday, August 4, 2025 - 11:40am
full name / name of organization: Francisco P.…
via Input 4 RELCFP

@hex@kolektiva.social
2025-06-25 22:07:06

As I'm learning Dutch, I'm reminded of the idea that there are people who believe that the bible is to be taken literally. The idea that a several hundred year old translation of a collection of texts in multiple languages, that were themselves translated multiple times between languages, before the whole thing was translated to Latin, then being translated to English, could somehow perfectly reflect the original text... Yeah, it's only possible to believe that if you have no idea how languages work and have never learned another language.
Like, just from linguistic drift alone if the bible were written in King James English you're losing *so* much context. But Hebrew, Aramaic, and Greek translated to Latin, then to English, then to English again?
There are so many things that erg can't be translated, even as a beginner. Dutch and English are two of the closest languages that exist, they're both Germanic languages and they're the closest to each other (other than Frisian). You can't really be much closer, and yet, there are so many things you can't mutually represent. Hebrew and Latin, Aramaic and Latin, Latin and English, Greek and English, these aren't even the same families at all... They're extremely distant. There's absolutely no way to represent concepts from one to another without another book's worth of explanation.
And that ignores all the cultural context, which is mostly lost, and it takes a library and a decade of education to get the stuff that we *do* know.
Only monolingual Americans could come up with an idea so incredibly asinine.

@arXiv_csCL_bot@mastoxiv.page
2025-07-03 10:11:20

Eka-Eval : A Comprehensive Evaluation Framework for Large Language Models in Indian Languages
Samridhi Raj Sinha, Rajvee Sheth, Abhishek Upperwal, Mayank Singh
arxiv.org/abs/2507.01853

@arXiv_csCV_bot@mastoxiv.page
2025-07-30 10:42:51

MetaCLIP 2: A Worldwide Scaling Recipe
Yung-Sung Chuang, Yang Li, Dong Wang, Ching-Feng Yeh, Kehan Lyu, Ramya Raghavendra, James Glass, Lifei Huang, Jason Weston, Luke Zettlemoyer, Xinlei Chen, Zhuang Liu, Saining Xie, Wen-tau Yih, Shang-Wen Li, Hu Xu
arxiv.org/abs/2507.22062

@netzschleuder@social.skewed.de
2025-08-04 09:00:04

word_adjacency: Word Adjacency Networks
Directed Networks of word adjacency in texts of several languages including English, French, Spanish and Japanese.
This network has 7381 nodes and 46281 edges.
Tags: Informational, Language, Unweighted
networks.skewed.de/net/word_ad

word_adjacency: Word Adjacency Networks. 7381 nodes, 46281 edges. https://networks.skewed.de/net/word_adjacency#darwin

@arXiv_csCL_bot@mastoxiv.page
2025-08-04 09:51:20

GHTM: A Graph based Hybrid Topic Modeling Approach in Low-Resource Bengali Language
Farhana Haque, Md. Abdur Rahman, Sumon Ahmed
arxiv.org/abs/2508.00605

@nelson@tech.lgbt
2025-08-31 14:41:27

English is an amazing language. We have two similar if folksy / archaic expressions: "to cotton to" and "to cotton on to". They mean different things. phrases.org.uk/meanings/cotton

@arXiv_csSD_bot@mastoxiv.page
2025-07-01 09:47:03

You Sound a Little Tense: L2 Tailored Clear TTS Using Durational Vowel Properties
Paige Tuttösí, H. Henny Yeung, Yue Wang, Jean-Julien Aucouturier, Angelica Lim
arxiv.org/abs/2506.23367

@arXiv_eessAS_bot@mastoxiv.page
2025-07-02 08:42:20

Mitigating Language Mismatch in SSL-Based Speaker Anonymization
Zhe Zhang, Wen-Chin Huang, Xin Wang, Xiaoxiao Miao, Junichi Yamagishi
arxiv.org/abs/2507.00458

@servelan@newsie.social
2025-07-25 01:34:08

IRS considers eliminating ability to file your taxes in language other than English | The Independent
independent.co.uk/news/world/a

@arXiv_astrophIM_bot@mastoxiv.page
2025-07-28 07:58:01

Recommendations to overcome language barriers in the Vera C. Rubin Observatory Research Ecosystem
José Antonio Alonso Pavón, Andrés Alejandro Plazas Malagón
arxiv.org/abs/2507.18682

@Mediagazer@mstdn.social
2025-07-30 07:50:49

The Kyiv Independent crosses 20,000 paying members, up from 17,500 in May, with ~70% of its revenue coming from readers, and has no plans to add a paywall (Sarah Scire/Nieman Lab)
niemanlab.org/2025/07/how-kyiv

@netzschleuder@social.skewed.de
2025-07-03 03:00:06

wiki_talk: Wikipedia talk networks
Interactions among users of 10 language-specific Wikipedias: Arabic, Chinese, Dutch, English, French, German, Italian, Portuguese, Russian, and Spanish. Nodes are registered wiki editors, and an edge represents a user i having written a message on user j's talk page. Edges are timestamped. The precise dates of the snapshots are uncertain.
This network has 8097 nodes and 63809 edges.
Tags: Social, Communication, Unweighted, Multigraph, …

wiki_talk: Wikipedia talk networks. 8097 nodes, 63809 edges. https://networks.skewed.de/net/wiki_talk#gl

@aral@mastodon.ar.al
2025-07-10 11:55:43

Dear Media,
The story is that he’s a fascist, not that he’s ignorant.
Thank you. frontrange.co/@apnewsbot/11482

@arXiv_csCL_bot@mastoxiv.page
2025-07-02 10:16:20

Contrasting Cognitive Styles in Vision-Language Models: Holistic Attention in Japanese Versus Analytical Focus in English
Ahmed Sabir, Azinović Gasper, Mengsay Loem, Rajesh Sharma
arxiv.org/abs/2507.00700

@hanno@mastodon.social
2025-07-19 19:48:15

I wonder if one day someone can explain to me why Google thinks the fact that I am in a foreign country means I immediately am fluent in that country's language, and prefer it over English or my native language. I mean, there's an Accept-Language header, I think it's older than Google.

@rafa_font@mastodon.online
2025-06-18 19:31:17

You'll never become a NATIVE English speaker
No matter how hard you try, the years in the UK or Ireland, the effort in your accent, or the AI applications you might use to fake it
There is a language wall, made of accents, cultural references and seemingly illogical phrasal verbs and idioms, that we cannot jump
But IT DOESN'T MATTER.
90% of your interactions are probably with other non-native speakers. As long as you understand each other, you're good.

@ginevra@hachyderm.io
2025-06-20 00:35:29

Language learning has been part of me since high school. I'm solid in 2 non-English languages, crappy but survivable in 2 others. I've played with & started learning others many times.
I'm real busy rn, but language learning could be a fun thing to do for myself & make me feel like I'm still me.
But I'm stumped about my language picks. I learnt the obvious European languages in school; later tried key Asian languages. What do I want to do now?
African languages? I won't be getting a chance to use them much in Aus, & I'm unlikely to get to a stage where I can read literature.
I tried Slovenian/Slovene on a whim & really love it, but I'll never go there. Is the practical but unfun answer grind out more kanji/hanzi? Or is whimsically learning a language spoken by only 2.5 million people reasonable? I will continue struggling through with Ukrainian, 'cause I think it's important.
#LanguageLearning

@memeorandum@universeodon.com
2025-07-09 20:45:52

Trump says the president of Liberia, where English is the official language, speaks 'good English' (NBC News)
nbcnews.com/politics/donald-tr
memeorandum.com/250709/p106#a2

@arXiv_csCL_bot@mastoxiv.page
2025-07-02 09:34:40

EfficientXLang: Towards Improving Token Efficiency Through Cross-Lingual Reasoning
Sanchit Ahuja, Praneetha Vaddamanu, Barun Patra
arxiv.org/abs/2507.00246

@arXiv_csPL_bot@mastoxiv.page
2025-06-11 07:48:34

Linguine: A Natural-Language Programming Language with Formal Semantics and a Clean Compiler Pipeline
Lifan Hu
arxiv.org/abs/2506.08396

Liberians angered by Trump’s praise for leader
There was confusion and anger in Liberia on Thursday after Donald Trump praised the English skills of President Joseph Boakai.
“Such good English,” Trump said to Boakai, with visible surprise. “Such beautiful English.”
👉 English has been the west African nation’s official language since the 1800s.
But Trump did not stop there.
“Where did you learn to speak so beautifully?” he continued, as Boakai murmured a respo…

@arXiv_csDL_bot@mastoxiv.page
2025-06-27 07:55:49

Metadata Enrichment of Long Text Documents using Large Language Models
Manika Lamba, You Peng, Sophie Nikolov, Glen Layne-Worthey, J. Stephen Downie
arxiv.org/abs/2506.20918

@mgorny@social.treehouse.systems
2025-07-23 05:59:20

I'm not opposed to neologisms. To the contrary, I do love them, sometimes coining my own or adapting happily. That is, as long as they make the language richer, or perhaps more precise.
What I truly hate is the modern goo that people are speaking, because they don't know their own language well. The business newspeak, so to say.
This is especially bad in Polish where people are randomly polonizing English words for no reason at all.

@markhburton@mstdn.social
2025-09-01 08:09:07

UK State, dancing to Garage's agenda, announces yet more cruelty to people fleeing persecution.
Families of those granted asylum will face
tougher English language standards and a test of financial resources.
Imagine doing that during the Uganda or Kosovo emergencies.
bbc.co.uk/news/article…

@arXiv_qbioNC_bot@mastoxiv.page
2025-06-26 09:02:10

Brains and language models converge on a shared conceptual space across different languages
Zaid Zada, Samuel A Nastase, Jixing Li, Uri Hasson
arxiv.org/abs/2506.20489

@idbrii@mastodon.gamedev.place
2025-08-18 17:42:48

Woah, amazing! Valve is changing reviews so bombs in one region don't tank it everywhere.
> When there are enough reviews written in a particular language, Steam will calculate a review score for that language. The Review Score displayed to users will be based on their primary language. What this means is that some languages may show more positive review scores, while others may show more negative ones, for the same game.

@benb@osintua.eu
2025-08-22 15:45:21

Junior Communications Manager: benborges.xyz/2025/08/22/junio

@sperbsen@discuss.systems
2025-06-25 07:29:12

English-language trailer for "Quartet" just dropped!
chaos.social/@theateru34/11474

@arXiv_csCL_bot@mastoxiv.page
2025-07-02 09:54:30

Natural language processing for African languages
David Ifeoluwa Adelani
arxiv.org/abs/2507.00297 arxiv.org/pdf/2507.…

@arXiv_csAI_bot@mastoxiv.page
2025-07-23 09:56:22

Mind the Gap: Evaluating the Representativeness of Quantitative Medical Language Reasoning LLM Benchmarks for African Disease Burdens
Fred Mutisya (Qhala, Kenya Medical Association), Shikoh Gitau (Qhala), Christine Syovata (Kenya Medical Association), Diana Oigara (Kenya Medical Association), Ibrahim Matende (Kenya Medical Association), Muna Aden (Kenya Medical Association), Munira Ali (Kenya Medical Association), Ryan Nyotu (Kenya Medical Association), Diana Marion (Kenya Medical Asso…

@teledyn@mstdn.ca
2025-07-24 19:03:26

This just occurred to me (too much sun and gin lemonade could be a factor): English is a funny language and when they say Artificial they mean Automated, and when they say Intelligence they don't mean smarts, they mean covertly gathering intel from prospective enemies!
Hence #ArtificialIntelligence, often promoted to General.
The purpose of any system is what it does, not what it consistently fails to do.

@thomasfuchs@hachyderm.io
2025-07-19 03:29:41

“Cellar door” is—according to Tolkien—the most beautiful combination of words in the English language.
But what’s the worst?

@shoppingtonz@mastodon.social
2025-06-29 19:16:50

I made a new friend today in Albion Europe.
ZairGT...I think native language is Spanish but also speaks English.
I found this Tree T4.3 in a blue zone and I went to it back and forth. At one point I find ZairGT(both him and me unflagged, no FW) there...and the gathering duel was "short and bloody" and I realized I have no chance to take that.
I admitted defeat and often went there and did the 1 sign and WP...
part 2 soon...

@arXiv_qfinGN_bot@mastoxiv.page
2025-09-01 08:05:43

A Financial Brain Scan of the LLM
Hui Chen, Antoine Didisheim, Luciano Somoza, Hanqing Tian
arxiv.org/abs/2508.21285 arxiv.org/pdf/2508.212…

@schoedland@digitalcourage.social
2025-08-26 17:36:29
Content warning: Language, Inquiry about a racial slur

In Germany, (sensitive) people no longer say „Z…“, because it’s established that it’s a racial slur.
How about „Gypsy“ in the English speaking world, is it seen as a slur, too?
Would someone introduce themselves (in a casual setting) as a Gypsy or a Romani, rather?

@arXiv_csSE_bot@mastoxiv.page
2025-08-05 11:14:31

Bridging Language Gaps in Open-Source Documentation with Large-Language-Model Translation
Elijah Kayode Adejumo, Brittany Johnson, Mariam Guizani
arxiv.org/abs/2508.02497

@arXiv_csCL_bot@mastoxiv.page
2025-09-03 14:28:43

How Instruction-Tuning Imparts Length Control: A Cross-Lingual Mechanistic Analysis
Elisabetta Rocchetti, Alfio Ferrara
arxiv.org/abs/2509.02075

@netzschleuder@social.skewed.de
2025-08-31 14:00:06

wiki_talk: Wikipedia talk networks
Interactions among users of 10 language-specific Wikipedias: Arabic, Chinese, Dutch, English, French, German, Italian, Portuguese, Russian, and Spanish. Nodes are registered wiki editors, and an edge represents a user i having written a message on user j's talk page. Edges are timestamped. The precise dates of the snapshots are uncertain.
This network has 41452 nodes and 131884 edges.
Tags: Social, Communication, Unweighted, Multigraph…

wiki_talk: Wikipedia talk networks. 41452 nodes, 131884 edges. https://networks.skewed.de/net/wiki_talk#sk

@arXiv_csHC_bot@mastoxiv.page
2025-07-28 09:48:21

RhythmTA: A Visual-Aided Interactive System for ESL Rhythm Training via Dubbing Practice
Chang Chen, Sicheng Song, Shuchang Xu, Zhicheng Li, Huamin Qu, Yanna Lin
arxiv.org/abs/2507.19026

@arXiv_csCL_bot@mastoxiv.page
2025-08-04 10:00:20

MMBERT: Scaled Mixture-of-Experts Multimodal BERT for Robust Chinese Hate Speech Detection under Cloaking Perturbations
Qiyao Xue, Yuchen Dou, Ryan Shi, Xiang Lorraine Li, Wei Gao
arxiv.org/abs/2508.00760

@arXiv_csCY_bot@mastoxiv.page
2025-08-26 10:41:27

Exploring AI-Enabled Test Practice, Affect, and Test Outcomes in Language Assessment
Jill Burstein, Ramsey Cardwell, Ping-Ling Chuang, Allison Michalowski, Steven Nydick
arxiv.org/abs/2508.17108

@arXiv_csCV_bot@mastoxiv.page
2025-07-10 09:14:01

Speak2Sign3D: A Multi-modal Pipeline for English Speech to American Sign Language Animation
Kazi Mahathir Rahman, Naveed Imtiaz Nafis, Md. Farhan Sadik, Mohammad Al Rafi, Mehedi Hasan Shahed
arxiv.org/abs/2507.06530

@netzschleuder@social.skewed.de
2025-07-01 17:00:08

wordnet: WordNet relationships
A network of English words from WordNet. Each node is a word, and each edge denotes a relationship between words (synonymy, hyperonymy, meronymy, etc.). The date at which this network was extracted from WordNet is unknown.
This network has 146005 nodes and 656999 edges.
Tags: Informational, Language, Unweighted

wordnet: WordNet relationships. 146005 nodes, 656999 edges. https://networks.skewed.de/net/wordnet

@netzschleuder@social.skewed.de
2025-08-01 06:00:07

wiki_users: Wikipedia user interaction (2011)
A network derived from interactions between editors of the English language Wikipedia, as derived from the edit histories of 563 wiki pages related to politics. A positive sign indicates positive links such as trust or similarities, and a negative sign indicates distrust or disagreement.
This network has 138592 nodes and 740397 edges.
Tags: Social, Online, Signed

wiki_users: Wikipedia user interaction (2011). 138592 nodes, 740397 edges. https://networks.skewed.de/net/wiki_users

@benb@osintua.eu
2025-08-22 15:45:36

Grants & Institutional Partnerships Manager: benborges.xyz/2025/08/22/grant

@memeorandum@universeodon.com
2025-07-09 21:05:50

Trump asks Liberian president where he learned English, his country's official language (Reuters)
reuters.com/world/africa/trump
memeorandum.com/250709/p109#a2

@arXiv_eessAS_bot@mastoxiv.page
2025-08-27 07:41:32

Toward Responsible ASR for African American English Speakers: A Scoping Review of Bias and Equity in Speech Technology
Jay L. Cunningham, Adinawa Adjagbodjou, Jeffrey Basoah, Jainaba Jawara, Kowe Kadoma, Aaleyah Lewis
arxiv.org/abs/2508.18288

@arXiv_csHC_bot@mastoxiv.page
2025-06-06 09:41:05

This arxiv.org/abs/2505.24195 has been replaced.
initial toot: mastoxiv.page/@arXiv_csHC_…

@arXiv_csCL_bot@mastoxiv.page
2025-07-31 09:49:01

BALSAM: A Platform for Benchmarking Arabic Large Language Models
Rawan Al-Matham, Kareem Darwish, Raghad Al-Rasheed, Waad Alshammari, Muneera Alhoshan, Amal Almazrua, Asma Al Wazrah, Mais Alheraki, Firoj Alam, Preslav Nakov, Norah Alzahrani, Eman alBilali, Nizar Habash, Abdelrahman El-Sheikh, Muhammad Elmallah, Haonan Li, Hamdy Mubarak, Mohamed Anwar, Zaid Alyafeai, Ahmed Abdelali, Nora Altwairesh, Maram Hasanain, Abdulmohsen Al Thubaity, Shady Shehata, Bashar Alhafni, Injy Hamed, Go I…

@netzschleuder@social.skewed.de
2025-07-30 15:00:07

wiki_users: Wikipedia user interaction (2011)
A network derived from interactions between editors of the English language Wikipedia, as derived from the edit histories of 563 wiki pages related to politics. A positive sign indicates positive links such as trust or similarities, and a negative sign indicates distrust or disagreement.
This network has 138592 nodes and 740397 edges.
Tags: Social, Online, Signed

wiki_users: Wikipedia user interaction (2011). 138592 nodes, 740397 edges. https://networks.skewed.de/net/wiki_users

@arXiv_csCY_bot@mastoxiv.page
2025-06-24 09:45:29

Automatic Large Language Models Creation of Interactive Learning Lessons
Jionghao Lin, Jiarui Rao, Yiyang Zhao, Yuting Wang, Ashish Gurung, Amanda Barany, Jaclyn Ocumpaugh, Ryan S. Baker, Kenneth R. Koedinger
arxiv.org/abs/2506.17356

@benb@osintua.eu
2025-08-20 10:55:38

Vertical Video Creator: benborges.xyz/2025/08/20/verti

@netzschleuder@social.skewed.de
2025-08-31 17:00:03

word_adjacency: Word Adjacency Networks
Directed Networks of word adjacency in texts of several languages including English, French, Spanish and Japanese.
This network has 2704 nodes and 8300 edges.
Tags: Informational, Language, Unweighted
networks.skewed.de/net/word_ad

word_adjacency: Word Adjacency Networks. 2704 nodes, 8300 edges. https://networks.skewed.de/net/word_adjacency#japanese

@arXiv_csCL_bot@mastoxiv.page
2025-07-29 08:31:31

HITSZ's End-To-End Speech Translation Systems Combining Sequence-to-Sequence Auto Speech Recognition Model and Indic Large Language Model for IWSLT 2025 in Indic Track
Xuchen Wei, Yangxin Wu, Yaoyin Zhang, Henglyu Liu, Kehai Chen, Xuefeng Bai, Min Zhang
arxiv.org/abs/2507.19616

@arXiv_csHC_bot@mastoxiv.page
2025-07-18 07:36:41

"How to Explore Biases in Speech Emotion AI with Users?" A Speech-Emotion-Acting Study Exploring Age and Language Biases
Josephine Beatrice Skovbo Borre, Malene Gorm Wold, Sara Kjær Rasmussen, Ilhan Aslan
arxiv.org/abs/2507.12580

@arXiv_csCL_bot@mastoxiv.page
2025-06-27 09:56:59

Text2Cypher Across Languages: Evaluating Foundational Models Beyond English
Makbule Gulcin Ozsoy, William Tai
arxiv.org/abs/2506.21445

@netzschleuder@social.skewed.de
2025-06-30 08:00:16

wiki_article_words: Wikipedia article-words (en) (2010)
A bipartite network of English Wikipedia articles and the words they contain. The edge weight gives the number of times a word appeared in the connected article.
This network has 276739 nodes and 2941902 edges.
Tags: Informational, Language, Weighted

wiki_article_words: Wikipedia article-words (en) (2010). 276739 nodes, 2941902 edges. https://networks.skewed.de/net/wiki_article_words

@arXiv_csCL_bot@mastoxiv.page
2025-09-01 09:40:52

BLUEX Revisited: Enhancing Benchmark Coverage with Automatic Captioning
João Guilherme Alves Santos, Giovana Kerche Bonás, Thales Sales Almeida
arxiv.org/abs/2508.21294

@arXiv_csCL_bot@mastoxiv.page
2025-08-28 10:10:11

Dhati : Fine-tuned Large Language Models for Arabic Subjectivity Evaluation
Slimane Bellaouar, Attia Nehar, Soumia Souffi, Mounia Bouameur
arxiv.org/abs/2508.19966

@netzschleuder@social.skewed.de
2025-06-27 12:00:09

wordnet: WordNet relationships
A network of English words from WordNet. Each node is a word, and each edge denotes a relationship between words (synonymy, hyperonymy, meronymy, etc.). The date at which this network was extracted from WordNet is unknown.
This network has 146005 nodes and 656999 edges.
Tags: Informational, Language, Unweighted

wordnet: WordNet relationships. 146005 nodes, 656999 edges. https://networks.skewed.de/net/wordnet

@arXiv_csCL_bot@mastoxiv.page
2025-06-27 09:59:39

skLEP: A Slovak General Language Understanding Benchmark
Marek Šuppa, Andrej Ridzik, Daniel Hládek, Tomáš Javůrek, Viktória Ondrejová, Kristína Sásiková, Martin Tamajka, Marián Šimko
arxiv.org/abs/2506.21508

@arXiv_csCL_bot@mastoxiv.page
2025-07-22 12:24:50

The Impact of Language Mixing on Bilingual LLM Reasoning
Yihao Li, Jiayi Xin, Miranda Muqing Miao, Qi Long, Lyle Ungar
arxiv.org/abs/2507.15849

@arXiv_csCL_bot@mastoxiv.page
2025-08-29 10:22:31

Signs of Struggle: Spotting Cognitive Distortions across Language and Register
Abhishek Kuber, Enrico Liscio, Ruixuan Zhang, Caroline Figueroa, Pradeep K. Murukannaiah
arxiv.org/abs/2508.20771

@arXiv_csCL_bot@mastoxiv.page
2025-06-30 10:22:00

HyperCLOVA X THINK Technical Report
NAVER Cloud HyperCLOVA X Team
arxiv.org/abs/2506.22403 arxiv.org/pdf/2506.22403…

@arXiv_csCL_bot@mastoxiv.page
2025-07-31 09:54:01

Investigating Hallucination in Conversations for Low Resource Languages
Amit Das, Md. Najib Hasan, Souvika Sarkar, Zheng Zhang, Fatemeh Jamshidi, Tathagata Bhattacharya, Nilanjana Raychawdhury, Dongji Feng, Vinija Jain, Aman Chadha
arxiv.org/abs/2507.22720

@netzschleuder@social.skewed.de
2025-08-22 16:00:20

wiki_users: Wikipedia user interaction (2011)
A network derived from interactions between editors of the English language Wikipedia, as derived from the edit histories of 563 wiki pages related to politics. A positive sign indicates positive links such as trust or similarities, and a negative sign indicates distrust or disagreement.
This network has 138592 nodes and 740397 edges.
Tags: Social, Online, Signed

wiki_users: Wikipedia user interaction (2011). 138592 nodes, 740397 edges. https://networks.skewed.de/net/wiki_users

@netzschleuder@social.skewed.de
2025-07-25 22:00:09

wordnet: WordNet relationships
A network of English words from WordNet. Each node is a word, and each edge denotes a relationship between words (synonymy, hyperonymy, meronymy, etc.). The date at which this network was extracted from WordNet is unknown.
This network has 146005 nodes and 656999 edges.
Tags: Informational, Language, Unweighted

wordnet: WordNet relationships. 146005 nodes, 656999 edges. https://networks.skewed.de/net/wordnet

@arXiv_csCL_bot@mastoxiv.page
2025-08-26 12:05:16

Agri-Query: A Case Study on RAG vs. Long-Context LLMs for Cross-Lingual Technical Question Answering
Julius Gun, Timo Oksanen
arxiv.org/abs/2508.18093

@arXiv_csCL_bot@mastoxiv.page
2025-07-24 08:09:19

Text-to-SPARQL Goes Beyond English: Multilingual Question Answering Over Knowledge Graphs through Human-Inspired Reasoning
Aleksandr Perevalov, Andreas Both
arxiv.org/abs/2507.16971

@netzschleuder@social.skewed.de
2025-06-23 19:00:09

wordnet: WordNet relationships
A network of English words from WordNet. Each node is a word, and each edge denotes a relationship between words (synonymy, hyperonymy, meronymy, etc.). The date at which this network was extracted from WordNet is unknown.
This network has 146005 nodes and 656999 edges.
Tags: Informational, Language, Unweighted

wordnet: WordNet relationships. 146005 nodes, 656999 edges. https://networks.skewed.de/net/wordnet

@arXiv_csCL_bot@mastoxiv.page
2025-08-26 12:00:56

Information availability in different languages and various technological constraints related to multilinguism on the Internet
Sonal Khosla, Haridasa Acharya
arxiv.org/abs/2508.17918

@arXiv_csCL_bot@mastoxiv.page
2025-08-22 10:10:51

HebID: Detecting Social Identities in Hebrew-language Political Text
Guy Mor-Lan, Naama Rivlin-Angert, Yael R. Kaplan, Tamir Sheafer, Shaul R. Shenhav
arxiv.org/abs/2508.15483

@arXiv_csCL_bot@mastoxiv.page
2025-06-12 08:54:21

The Emergence of Abstract Thought in Large Language Models Beyond Any Language
Yuxin Chen, Yiran Zhao, Yang Zhang, An Zhang, Kenji Kawaguchi, Shafiq Joty, Junnan Li, Tat-Seng Chua, Michael Qizhe Shieh, Wenxuan Zhang
arxiv.org/abs/2506.09890

@arXiv_csCL_bot@mastoxiv.page
2025-07-25 10:06:32

AraTable: Benchmarking LLMs' Reasoning and Understanding of Arabic Tabular Data
Rana Alshaikh, Israa Alghanmi, Shelan Jeawak
arxiv.org/abs/2507.18442

@arXiv_csCL_bot@mastoxiv.page
2025-08-20 09:36:51

ViExam: Are Vision Language Models Better than Humans on Vietnamese Multimodal Exam Questions?
Vy Tuong Dang, An Vo, Quang Tau, Duc Dm, Daeyoung Kim
arxiv.org/abs/2508.13680

@arXiv_csCL_bot@mastoxiv.page
2025-08-19 11:38:30

Breaking Language Barriers: Equitable Performance in Multilingual Language Models
Tanay Nagar, Grigorii Khvatskii, Anna Sokol, Nitesh V. Chawla
arxiv.org/abs/2508.12662

@arXiv_csCL_bot@mastoxiv.page
2025-07-29 11:42:51

Multilingual Self-Taught Faithfulness Evaluators
Carlo Alfano, Aymen Al Marjani, Zeno Jonke, Amin Mantrach, Saab Mansour, Marcello Federico
arxiv.org/abs/2507.20752

@arXiv_csCL_bot@mastoxiv.page
2025-08-27 11:28:21

Crosslisted article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[1/2]:
- Toward Responsible ASR for African American English Speakers: A Scoping Review of Bias and Equity...
Jay L. Cunningham, Adinawa Adjagbodjou, Jeffrey Basoah, Jainaba Jawara, Kowe Kadoma, Aaleyah Lewis

@arXiv_csCL_bot@mastoxiv.page
2025-08-28 10:02:51

Benchmarking Hindi LLMs: A New Suite of Datasets and a Comparative Analysis
Anusha Kamath, Kanishk Singla, Rakesh Paul, Raviraj Joshi, Utkarsh Vaidya, Sanjay Singh Chauhan, Niranjan Wartikar
arxiv.org/abs/2508.19831

@arXiv_csCL_bot@mastoxiv.page
2025-06-25 13:25:14

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[1/3]:
- Impact of Visual Context on Noisy Multimodal NMT: An Empirical Study for English to Indian Languages
Baban Gain, Dibyanayan Bandyopadhyay, Samrat Mukherjee, Chandranath Adak, Asif Ekbal

@arXiv_csCL_bot@mastoxiv.page
2025-08-18 09:35:00

ToxiFrench: Benchmarking and Enhancing Language Models via CoT Fine-Tuning for French Toxicity Detection
Axel Delaval, Shujian Yang, Haicheng Wang, Han Qiu, Jialiang Lu
arxiv.org/abs/2508.11281

@arXiv_csCL_bot@mastoxiv.page
2025-06-18 08:58:51

AIn't Nothing But a Survey? Using Large Language Models for Coding German Open-Ended Survey Responses on Survey Motivation
Leah von der Heyde, Anna-Carolina Haensch, Bernd Weiß, Jessika Daikeler
arxiv.org/abs/2506.14634

@arXiv_csCL_bot@mastoxiv.page
2025-06-26 09:40:50

When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs
Ammar Khairi, Daniel D'souza, Ye Shen, Julia Kreutzer, Sara Hooker
arxiv.org/abs/2506.20544

@arXiv_csCL_bot@mastoxiv.page
2025-07-18 09:38:42

HATS: Hindi Analogy Test Set for Evaluating Reasoning in Large Language Models
Ashray Gupta, Rohan Joseph, Sunny Rai
arxiv.org/abs/2507.13238

@arXiv_csCL_bot@mastoxiv.page
2025-08-13 10:17:12

Utilizing Multilingual Encoders to Improve Large Language Models for Low-Resource Languages
Imalsha Puranegedara, Themira Chathumina, Nisal Ranathunga, Nisansa de Silva, Surangika Ranathunga, Mokanarangan Thayaparan
arxiv.org/abs/2508.09091

@arXiv_csCL_bot@mastoxiv.page
2025-08-22 10:08:41

Principle Methods of Rendering Non-equivalent Words from Uzbek and Dari to Russian and English
Mohammad Ibrahim Qani
arxiv.org/abs/2508.15453

@arXiv_csCL_bot@mastoxiv.page
2025-08-13 10:05:22

Reveal-Bangla: A Dataset for Cross-Lingual Multi-Step Reasoning Evaluation
Khondoker Ittehadul Islam, Gabriele Sarti
arxiv.org/abs/2508.08933

@arXiv_csCL_bot@mastoxiv.page
2025-07-16 10:33:21

HKGAI-V1: Towards Regional Sovereign Large Language Model for Hong Kong
Sirui Han, Junqi Zhu, Ruiyuan Zhang, Yike Guo
arxiv.org/abs/2507.11502