Tootfinder

Opt-in global Mastodon full text search. Join the index!

@arXiv_csHC_bot@mastoxiv.page
2025-06-06 09:41:05

This arxiv.org/abs/2505.24195 has been replaced.
initial toot: mastoxiv.page/@arXiv_csHC_…

@netzschleuder@social.skewed.de
2025-07-07 12:00:15

wiki_article_words: Wikipedia article-words (en) (2010)
A bipartite network of English Wikipedia articles and the words they contain. The edge weight gives the number of times a word appeared in the connected article.
This network has 276739 nodes and 2941902 edges.
Tags: Informational, Language, Weighted

wiki_article_words: Wikipedia article-words (en) (2010). 276739 nodes, 2941902 edges. https://networks.skewed.de/net/wiki_article_words

@spamless@mastodon.social
2025-05-04 18:53:44

I'm a #language fiend too. Lasswell taps into a big sore point for me as well. If you read this, stay away from the comment section. You'll go nuts!
Opinion | The phrase ‘begs the question’ is begging for oblivion - The Washington Post

@smurthys@hachyderm.io
2025-06-06 01:00:51

SWELL (old-timey use) Vs SWILL. What a difference a letter makes.
#English #language #difference

@gedankenstuecke@scholar.social
2025-06-02 18:08:24

«Sure, people from Latin-language-majority countries also list their “pronouns” in e-mail signatures and on social-media profiles, but this is, I think, largely the subconscious acceptance of the English-speakers’ hegemony over social norms (and queerness?).»
🔥 by @….
I'd even go one further and say "USian English's hegemony, both for online social norms generally and queerness in particular" 🙈
blog.achintyarao.in/post/the-e

@arXiv_csCL_bot@mastoxiv.page
2025-07-03 10:03:40

MuRating: A High Quality Data Selecting Approach to Multilingual Large Language Model Pretraining
Zhixun Chen, Ping Guo, Wenhan Han, Yifan Zhang, Binbin Liu, Haobin Lin, Fengze Liu, Yan Zhao, Bingni Zhang, Taifeng Wang, Yin Zheng, Meng Fang
arxiv.org/abs/2507.01785

@Mediagazer@mstdn.social
2025-06-03 04:15:46

The Belfast News Letter, among the longest-running English dailies, digitizes and puts all its editions online, including the earliest surviving one from 1738 (Michael Savage/The Guardian)
theguardian.com/media/2025/jun

@arXiv_csIR_bot@mastoxiv.page
2025-07-01 09:04:23

Teaching a Language Model to Speak the Language of Tools
Simeon Emanuilov
arxiv.org/abs/2506.23394 arxiv.org/pdf/2506…

@arXiv_csHC_bot@mastoxiv.page
2025-06-02 07:19:26

WikiGap: Promoting Epistemic Equity by Surfacing Knowledge Gaps Between English Wikipedia and other Language Editions
Zining Wang, Yuxuan Zhang, Dongwook Yoon, Nicholas Vincent, Farhan Samir, Vered Shwartz
arxiv.org/abs/2505.24195

@lysander07@sigmoid.social
2025-06-02 07:24:17

At the Semantic Digital Humanities 2025 Workshop, Jose Maldonado-Rodríguez is presenting "Natural Language Querying for Humanities #KnowledgeGraphs A case study on the GOLEM KG". Main contribution is a bilingual dataset (English-Spanish) specifically designed to evaluate automatic text-to-SPARQL translation systems for GOLEM, a specialized humanities KG.
paper:

Jose Maldonado-Rodríguez is presenting "Natural Language Querying for Humanities #KnowledgeGraphs  A case study on the GOLEM KG"
The image shows a presentation slide in a conference room. The slide is titled "Motivation" and discusses bridging the gap between Knowledge Graphs and non-technical researchers. It highlights a user-friendly way of extracting data from structured graphs. The slide features a diagram illustrating a bridge labeled "Text-to-SPARQL" connecting "Non-technical researchers"…

@hex@kolektiva.social
2025-06-25 22:07:06

As I'm learning Dutch, I'm reminded that there are people who believe that the bible is to be taken literally. The idea that a several hundred year old translation of a collection of texts in multiple languages, that were themselves translated multiple times between languages, before the whole thing was translated to Latin, then translated to English, could somehow perfectly reflect the original text... Yeah, it's only possible to believe that if you have no idea how languages work and have never learned another language.
Like, just from linguistic drift alone if the bible were written in King James English you're losing *so* much context. But Hebrew, Aramaic, and Greek translated to Latin, then to English, then to English again?
There are so many things that erg can't be translated, even as a beginner. Dutch and English are two of the closest languages that exist, they're both Germanic languages and they're the closest to each other (other than Frisian). You can't really be much closer, and yet, there are so many things you can't mutually represent. Hebrew and Latin, Aramaic and Latin, Latin and English, Greek and English, these aren't even the same families at all... They're extremely distant. There's absolutely no way to represent concepts from one to another without another book's worth of explanation.
And that ignores all the cultural context, which is mostly lost, and it takes a library and a decade of education to get the stuff that we *do* know.
Only monolingual Americans could come up with an idea so incredibly asinine.

@arXiv_astrophIM_bot@mastoxiv.page
2025-06-04 07:45:38

An Exploratory Framework for Future SETI Applications: Detecting Generative Reactivity via Language Models
Po-Chieh Yu
#toXiv_bot_toot

@midtsveen@social.linux.pizza
2025-06-03 09:38:17

Title: The End of Anarchism?
Author: Luigi Galleani
Topics: #AnarchoCommunism #insurrectionary
Date: 1925
Link:

@chris@mstdn.chrisalemany.ca
2025-05-28 19:02:46

Watching the first Question Period of the 45th Parliament of Canada!
#CanPoli #CdnPoli

@arXiv_eessSY_bot@mastoxiv.page
2025-07-02 09:17:41

Verifiable Natural Language to Linear Temporal Logic Translation: A Benchmark Dataset and Evaluation Suite
William H English, Chase Walker, Dominic Simon, Sumit Kumar Jha, Rickard Ewetz
arxiv.org/abs/2507.00877

@arXiv_csCL_bot@mastoxiv.page
2025-07-03 10:11:20

Eka-Eval : A Comprehensive Evaluation Framework for Large Language Models in Indian Languages
Samridhi Raj Sinha, Rajvee Sheth, Abhishek Upperwal, Mayank Singh
arxiv.org/abs/2507.01853

@arXiv_eessAS_bot@mastoxiv.page
2025-07-02 08:42:20

Mitigating Language Mismatch in SSL-Based Speaker Anonymization
Zhe Zhang, Wen-Chin Huang, Xin Wang, Xiaoxiao Miao, Junichi Yamagishi
arxiv.org/abs/2507.00458

@netzschleuder@social.skewed.de
2025-07-03 03:00:06

wiki_talk: Wikipedia talk networks
Interactions among users of 10 language-specific Wikipedias: Arabic, Chinese, Dutch, English, French, German, Italian, Portuguese, Russian, and Spanish. Nodes are registered wiki editors, and an edge represents a user i having written a message on user j's talk page. Edges are timestamped. The precise dates of the snapshots are uncertain.
This network has 8097 nodes and 63809 edges.
Tags: Social, Communication, Unweighted, Multigraph, …

wiki_talk: Wikipedia talk networks. 8097 nodes, 63809 edges. https://networks.skewed.de/net/wiki_talk#gl

@rafa_font@mastodon.online
2025-06-18 19:31:17

You'll never become a NATIVE English speaker
No matter how hard you try, the years in the UK or Ireland, the effort in your accent, or the AI applications you might use to fake it
There is a language wall, made of accents, cultural references and seemingly illogical phrasal verbs and idioms, that we cannot jump
But IT DOESN'T MATTER.
90% of your interactions are probably with other non-native speakers. As long as you understand each other, you're good.

@wvmierlo@zirk.us
2025-05-18 16:30:27

Stop the decline!
Marked decline in semicolons in English books, study suggests | Language | The Guardian
theguardian.com/science/2025/m

@arXiv_csCL_bot@mastoxiv.page
2025-07-02 10:16:20

Contrasting Cognitive Styles in Vision-Language Models: Holistic Attention in Japanese Versus Analytical Focus in English
Ahmed Sabir, Azinović Gasper, Mengsay Loem, Rajesh Sharma
arxiv.org/abs/2507.00700

@arXiv_csSD_bot@mastoxiv.page
2025-07-01 09:47:03

You Sound a Little Tense: L2 Tailored Clear TTS Using Durational Vowel Properties
Paige Tuttösí, H. Henny Yeung, Yue Wang, Jean-Julien Aucouturier, Angelica Lim
arxiv.org/abs/2506.23367

@ginevra@hachyderm.io
2025-06-20 00:35:29

Language learning has been part of me since high school. I'm solid in 2 non-English languages, crappy but survivable in 2 others. I've played with & started learning others many times.
I'm real busy rn, but language learning could be a fun thing to do for myself & make me feel like I'm still me.
But I'm stumped about my language picks. I learnt the obvious European languages in school; later tried key Asian languages. What do I want to do now?
African languages? I won't be getting a chance to use them much in Aus, & I'm unlikely to get to a stage where I can read literature.
I tried Slovenian/Slovene on a whim & really love it, but I'll never go there. Is the practical but unfun answer grind out more kanji/hanzi? Or is whimsically learning a language spoken by only 2.5 million people reasonable? I will continue struggling through with Ukrainian, 'cause I think it's important.
#LanguageLearning

@arXiv_csPL_bot@mastoxiv.page
2025-06-11 07:48:34

Linguine: A Natural-Language Programming Language with Formal Semantics and a Clean Compiler Pipeline
Lifan Hu
arxiv.org/abs/2506.08396

@arXiv_csDL_bot@mastoxiv.page
2025-06-27 07:55:49

Metadata Enrichment of Long Text Documents using Large Language Models
Manika Lamba, You Peng, Sophie Nikolov, Glen Layne-Worthey, J. Stephen Downie
arxiv.org/abs/2506.20918

@arXiv_csCL_bot@mastoxiv.page
2025-07-02 09:34:40

EfficientXLang: Towards Improving Token Efficiency Through Cross-Lingual Reasoning
Sanchit Ahuja, Praneetha Vaddamanu, Barun Patra
arxiv.org/abs/2507.00246

@kctipton@mas.to
2025-05-30 03:00:09

A new best-speller
apnews.com/article/scripps-nat

@erk709@social.linux.pizza
2025-05-03 01:06:15

Mixing up the two Swedish words "Godkänna" and "Gödkanna" in a project may seem like an innocent mistake, especially if you only speak English. However, they mean very different things... Read more to learn what.
#humor #language

@arXiv_qbioNC_bot@mastoxiv.page
2025-06-26 09:02:10

Brains and language models converge on a shared conceptual space across different languages
Zaid Zada, Samuel A Nastase, Jixing Li, Uri Hasson
arxiv.org/abs/2506.20489

@sperbsen@discuss.systems
2025-06-25 07:29:12

English-language trailer for "Quartet" just dropped!
chaos.social/@theateru34/11474

@netzschleuder@social.skewed.de
2025-06-01 08:00:05

wiki_talk: Wikipedia talk networks
Interactions among users of 10 language-specific Wikipedias: Arabic, Chinese, Dutch, English, French, German, Italian, Portuguese, Russian, and Spanish. Nodes are registered wiki editors, and an edge represents a user i having written a message on user j's talk page. Edges are timestamped. The precise dates of the snapshots are uncertain.
This network has 41424 nodes and 73900 edges.
Tags: Social, Communication, Unweighted, Multigraph,…

wiki_talk: Wikipedia talk networks. 41424 nodes, 73900 edges. https://networks.skewed.de/net/wiki_talk#lv

@arXiv_csCY_bot@mastoxiv.page
2025-06-24 09:45:29

Automatic Large Language Models Creation of Interactive Learning Lessons
Jionghao Lin, Jiarui Rao, Yiyang Zhao, Yuting Wang, Ashish Gurung, Amanda Barany, Jaclyn Ocumpaugh, Ryan S. Baker, Kenneth R. Koedinger
arxiv.org/abs/2506.17356

@lukem@hachyderm.io
2025-06-16 09:21:59

It bothers me big time that many Reddit threads are auto-translated to my native language and indexed in search engines.
As someone who frequently searches for highly local knowledge, I find this annoying.
If I search anything in my native language, it usually means "I want to know what my fellows think", and not "I want to read what the English-speaking community says, just translated to Polish".
Outside of a few specific exceptions, Reddit in general is such a waste of time and data.

@arXiv_csCL_bot@mastoxiv.page
2025-07-02 09:54:30

Natural language processing for African languages
David Ifeoluwa Adelani
arxiv.org/abs/2507.00297 arxiv.org/pdf/2507.…

@shoppingtonz@mastodon.social
2025-06-29 19:16:50

I made a new friend today in Albion Europe.
ZairGT...I think native language is Spanish but also speaks English.
I found this Tree T4.3 in a blue zone and I went to it back and forth. At one point I find ZairGT(both him and me unflagged, no FW) there...and the gathering duel was "short and bloody" and I realized I have no chance to take that.
I admitted defeat and often went there and did the 1 sign and WP...
part 2 soon...

@losttourist@social.chatty.monster
2025-06-14 08:23:17

Of all the many crimes against the English language committed by people from the USA, I think the only one which I'd make a capital offence is that of refusing to pronounce the "L" in "soldering".

@arXiv_eessAS_bot@mastoxiv.page
2025-06-04 07:29:31

Dhvani: A Weakly-supervised Phonemic Error Detection and Personalized Feedback System for Hindi
Arnav Rustagi, Satvik Bajpai, Nimrat Kaur, Siddharth Siddharth
arxiv.org/abs/2506.02166

@netzschleuder@social.skewed.de
2025-07-01 17:00:08

wordnet: WordNet relationships
A network of English words from WordNet. Each node is a word, and each edge denotes a relationship between words (synonymy, hyperonymy, meronymy, etc.). The date at which this network was extracted from WordNet is unknown.
This network has 146005 nodes and 656999 edges.
Tags: Informational, Language, Unweighted

wordnet: WordNet relationships. 146005 nodes, 656999 edges. https://networks.skewed.de/net/wordnet

@ecukier@glammr.us
2025-06-17 16:19:52

I don't know why I'm learning to say "right" and "left" in another language when I can't even keep them straight in English.

@bmariusz@techhub.social
2025-06-26 18:39:17

Day 13 (oh, really? ;))
Registration Form Implementation
I've just finished implementing a registration form with validation and language switching using Next.js and React Hook Form. Now users can register with dynamic language support (English/Polish) and data validation (email, password, phone).
Unfortunately, my account on Write.as has been temporarily blocked, so details about the implementation will be available once the account is unlocked. Stay tuned! 😊

@tezoatlipoca@mas.to
2025-06-11 15:42:48

I got to use `progenitor` in an uncontrived context at work today, and last week it was `prevaricate`, so if you have any fancy English vocabulary needs I'm right over here, under the sign `Fancy Language Person`.

@arXiv_csCL_bot@mastoxiv.page
2025-06-27 09:56:59

Text2Cypher Across Languages: Evaluating Foundational Models Beyond English
Makbule Gulcin Ozsoy, William Tai
arxiv.org/abs/2506.21445

@AdamCoffman@mathstodon.xyz
2025-06-12 16:56:45

temporarily pasting something into MS Word and the Editor pops up with this real helpful tip 📎

The "Formality" feature of the Word editor thinks that this word or phrase may strike a reader as too informal:

Non singular real algebraic plane curves can...

(the English-language joke is that sometimes "real" can informally mean "very" and sometimes in a sarcastic way)

@netzschleuder@social.skewed.de
2025-06-30 08:00:16

wiki_article_words: Wikipedia article-words (en) (2010)
A bipartite network of English Wikipedia articles and the words they contain. The edge weight gives the number of times a word appeared in the connected article.
This network has 276739 nodes and 2941902 edges.
Tags: Informational, Language, Weighted

wiki_article_words: Wikipedia article-words (en) (2010). 276739 nodes, 2941902 edges. https://networks.skewed.de/net/wiki_article_words

@Stomata@social.linux.pizza
2025-06-08 10:26:11

I'm Linuxing my pizza 🙂
#linux #pizza

The image shows a screenshot of a social media post. The post is from a user with the handle "[@]_who_up_instancing_they_host" and includes a profile picture of a light blue elephant. The tweet reads "who up linuxing they pizza" and is timestamped "Mar 14, 2025 7:26 PM" with the language set to English (EN). The post has received 161 boosts and 1 favorite. Below the toot, there are icons for sharing, retweeting, favoriting, bookmarking, and more options, represented by three dots. The backgroun…

@arXiv_csCL_bot@mastoxiv.page
2025-06-27 09:59:39

skLEP: A Slovak General Language Understanding Benchmark
Marek Šuppa, Andrej Ridzik, Daniel Hládek, Tomáš Javůrek, Viktória Ondrejová, Kristína Sásiková, Martin Tamajka, Marián Šimko
arxiv.org/abs/2506.21508

@arXiv_csSD_bot@mastoxiv.page
2025-06-16 07:55:19

GLAP: General contrastive audio-text pretraining across domains and languages
Heinrich Dinkel, Zhiyong Yan, Tianzi Wang, Yongqing Wang, Xingwei Sun, Yadong Niu, Jizhong Liu, Gang Li, Junbo Zhang, Jian Luan
arxiv.org/abs/2506.11350

@netzschleuder@social.skewed.de
2025-06-27 12:00:09

wordnet: WordNet relationships
A network of English words from WordNet. Each node is a word, and each edge denotes a relationship between words (synonymy, hyperonymy, meronymy, etc.). The date at which this network was extracted from WordNet is unknown.
This network has 146005 nodes and 656999 edges.
Tags: Informational, Language, Unweighted

wordnet: WordNet relationships. 146005 nodes, 656999 edges. https://networks.skewed.de/net/wordnet

@arXiv_csCL_bot@mastoxiv.page
2025-06-12 08:54:21

The Emergence of Abstract Thought in Large Language Models Beyond Any Language
Yuxin Chen, Yiran Zhao, Yang Zhang, An Zhang, Kenji Kawaguchi, Shafiq Joty, Junnan Li, Tat-Seng Chua, Michael Qizhe Shieh, Wenxuan Zhang
arxiv.org/abs/2506.09890

@arXiv_csCL_bot@mastoxiv.page
2025-06-30 10:22:00

HyperCLOVA X THINK Technical Report
NAVER Cloud HyperCLOVA X Team
arxiv.org/abs/2506.22403 arxiv.org/pdf/2506.22403…

@arXiv_csHC_bot@mastoxiv.page
2025-06-10 16:40:09

This arxiv.org/abs/2505.05660 has been replaced.
initial toot: mastoxiv.page/@arXiv_csHC_…

@netzschleuder@social.skewed.de
2025-06-23 19:00:09

wordnet: WordNet relationships
A network of English words from WordNet. Each node is a word, and each edge denotes a relationship between words (synonymy, hyperonymy, meronymy, etc.). The date at which this network was extracted from WordNet is unknown.
This network has 146005 nodes and 656999 edges.
Tags: Informational, Language, Unweighted

wordnet: WordNet relationships. 146005 nodes, 656999 edges. https://networks.skewed.de/net/wordnet

@arXiv_qbioOT_bot@mastoxiv.page
2025-06-16 14:57:43

Replaced article(s) found for q-bio.OT. arxiv.org/list/q-bio.OT/new
[1/1]:
English dictionaries, gold and silver standard corpora for biomedical natural language processing...

@netzschleuder@social.skewed.de
2025-06-15 06:00:13

wiki_talk: Wikipedia talk networks
Interactions among users of 10 language-specific Wikipedias: Arabic, Chinese, Dutch, English, French, German, Italian, Portuguese, Russian, and Spanish. Nodes are registered wiki editors, and an edge represents a user i having written a message on user j's talk page. Edges are timestamped. The precise dates of the snapshots are uncertain.
This network has 155820 nodes and 1358426 edges.
Tags: Social, Communication, Unweighted, Multigra…

wiki_talk: Wikipedia talk networks. 155820 nodes, 1358426 edges. https://networks.skewed.de/net/wiki_talk#pl

@arXiv_csCL_bot@mastoxiv.page
2025-06-18 08:58:51

AIn't Nothing But a Survey? Using Large Language Models for Coding German Open-Ended Survey Responses on Survey Motivation
Leah von der Heyde, Anna-Carolina Haensch, Bernd Weiß, Jessika Daikeler
arxiv.org/abs/2506.14634

@arXiv_csHC_bot@mastoxiv.page
2025-06-18 08:23:09

The Teacher's Dilemma: Balancing Trade-Offs in Programming Education for Emergent Bilingual Students
Emma R. Dodoo, Tamara Nelson-Fromm, Mark Guzdial
arxiv.org/abs/2506.14147

@netzschleuder@social.skewed.de
2025-06-15 02:00:07

wiki_users: Wikipedia user interaction (2011)
A network derived from interactions between editors of the English language Wikipedia, as derived from the edit histories of 563 wiki pages related to politics. A positive sign indicates positive links such as trust or similarities, and a negative sign indicates distrust or disagreement.
This network has 138592 nodes and 740397 edges.
Tags: Social, Online, Signed

wiki_users: Wikipedia user interaction (2011). 138592 nodes, 740397 edges. https://networks.skewed.de/net/wiki_users

@arXiv_csCL_bot@mastoxiv.page
2025-06-25 13:25:14

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[1/3]:
- Impact of Visual Context on Noisy Multimodal NMT: An Empirical Study for English to Indian Languages
Baban Gain, Dibyanayan Bandyopadhyay, Samrat Mukherjee, Chandranath Adak, Asif Ekbal

@netzschleuder@social.skewed.de
2025-06-18 13:00:03

word_adjacency: Word Adjacency Networks
Directed Networks of word adjacency in texts of several languages including English, French, Spanish and Japanese.
This network has 7381 nodes and 46281 edges.
Tags: Informational, Language, Unweighted
networks.skewed.de/net/word_ad

word_adjacency: Word Adjacency Networks. 7381 nodes, 46281 edges. https://networks.skewed.de/net/word_adjacency#darwin

@arXiv_csCL_bot@mastoxiv.page
2025-06-26 09:40:50

When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs
Ammar Khairi, Daniel D'souza, Ye Shen, Julia Kreutzer, Sara Hooker
arxiv.org/abs/2506.20544

@arXiv_eessAS_bot@mastoxiv.page
2025-06-16 08:13:49

Intelligibility of Text-to-Speech Systems for Mathematical Expressions
Sujoy Roychowdhury, H. G. Ranjani, Sumit Soman, Nishtha Paul, Subhadip Bandyopadhyay, Siddhanth Iyengar
arxiv.org/abs/2506.11086

@netzschleuder@social.skewed.de
2025-06-14 07:00:09

wordnet: WordNet relationships
A network of English words from WordNet. Each node is a word, and each edge denotes a relationship between words (synonymy, hyperonymy, meronymy, etc.). The date at which this network was extracted from WordNet is unknown.
This network has 146005 nodes and 656999 edges.
Tags: Informational, Language, Unweighted

wordnet: WordNet relationships. 146005 nodes, 656999 edges. https://networks.skewed.de/net/wordnet

@arXiv_csCL_bot@mastoxiv.page
2025-06-10 18:59:31

This arxiv.org/abs/2506.00759 has been replaced.
initial toot: mastoxiv.page/@arXiv_csCL_…