Tootfinder

Opt-in global Mastodon full text search.

@netzschleuder@social.skewed.de
2026-04-09 11:00:05

email_enron: Email network (Enron corpus)
The Enron email corpus, containing all the email communication from the Enron corporation, which was made public as a result of legal action. Nodes are email addresses and node i links to node j if i sent at least one email to address j. Non-Enron email addresses are also present, but only their links to/from Enron addresses are observed.
This network has 36692 nodes and 367662 edges.
Tags: Social, Communication, Unweighted, Multigr…

email_enron: Email network (Enron corpus). 36692 nodes, 367662 edges. https://networks.skewed.de/net/email_enron
@ErikJonker@mastodon.social
2026-04-16 14:02:12

The GPT-NL dataset is described on Hugging Face, huggingface.co/datasets/GPT-NL
Given the stated amount of data, it seems to me the model can only be in the GPT-3 class in terms of parameters? Or is that not how it works?

@kctipton@mas.to
2026-03-18 18:01:01

Corpus Christi Cuts Timeline to Disaster as Abbott Issues Emergency Orders texasobserver.org/corpus-chris With Abbott on the case, changing rules about sucking reservoirs and wells dry, what…

@gwire@mastodon.social
2026-05-05 14:40:32

If AI isn't sentient then why does it sound exactly like the sentient AIs in decades of sci-fi stories that we scanned and added to the training corpus?

@servelan@newsie.social
2026-04-20 23:54:58

Corpus Christi Projects Emergency Water Restrictions in September for Large Industrial Users and 500,000 Customers - Inside Climate News
insideclimatenews.org/news/200

During her detention, Yu said uncertainty weighed heavily on her.
Yu has operated her two West Valley restaurant locations for years, building a reputation for greeting customers with humor and kindness.
Supporters said she has no criminal record and has been active in giving back to the Peoria community.
Her release came after U.S. District Court Judge Krissa Lanham granted a habeas corpus petition filed on her behalf.
The petition argued that her continued detenti…

@davej@dice.camp
2026-05-01 22:45:21

I’m not even American, and I knew that Galveston and Corpus Christi had major flooding disasters in the early 20th century. By all means, sound the alarms on climate change, but for fuck's sake, crack a book once in a while. Posting easily refuted rhetoric does nothing to help the cause. mastodon.cc/@dtm/1165016885218…

@radioeinsmusicbot@mastodonapp.uk
2026-04-20 19:54:59

🇺🇦 Now playing on radioeins...
Corpus Delicti:
🎵 Room 36
#NowPlaying #CorpusDelicti
corpus-delicti.bandcamp.com/tr
open.spotify.com/track/1PaWdzQ

@thomasfuchs@hachyderm.io
2026-03-27 13:12:28

For the 1,000th time: "AI" does not have agency and cannot think and cannot act.
Chatbots cannot "evade safeguards" or "destroy things" or "ignore instructions".
They do literally only one thing and one thing only: string tokens together based on statistics of proximity of tokens in a data corpus.
If you attribute any deeper meaning to this, it's a sign of psychosis and you should absolutely never use chatbots, possibly you should even touch grass.

@vrandecic@mas.to
2026-03-28 10:36:01

Catching a bug in your code after running it on a large corpus for several days but before publishing the results -- I'll take it as a win.

@andres4ny@social.ridetrans.it
2026-02-22 02:10:55

This is such an unfortunate name though 😂
arxiv.org/html/2506.01732v1

@kctipton@mas.to
2026-03-08 23:28:40

Corpus Christi careens toward water catastrophe texastribune.org/2026/03/08/te #Texas #water #infrastructure #drought

@BBC3MusicBot@mastodonapp.uk
2026-04-22 20:45:58

🔊 #NowPlaying on #BBCRadio3:
#TheEssay
- The Death and Life of Christopher Marlowe
Jerry Brotton visits Corpus Christi College in Cambridge, where Kit Marlowe studied and was transformed from scholarship boy to gentleman – and spy.
Relisten now 👇
bbc.co.uk/programmes/m002v6z6

@mia@hcommons.social
2026-03-15 17:55:15

Via the AI4LAM Slack: An Extreme Multi-label Text Classification (XMTC) Library Dataset: What if we took "Use of Practical AI in Digital Libraries" seriously? arxiv.org/abs/2603.10876

@doktrock@toad.social
2026-04-17 00:31:23

50 years ago, 16 April 1976. "13 Oil Rig Workers Die When Survival Capsule Capsizes" in the Gulf of Mexico. They had tried to escape from an offshore drilling platform. #OTD Fargo Forum newspaper.

Portion of a front page of a newspaper, Friday, April 16, 1976. Corpus Christi, Texas (AP). "13 men who scrambled into a saucer-like survival capsule before an oil drilling platform sank in the wind-whipped Gulf of Mexico died later when the capsule capsized, the Coast Guard said today." (story continues.) 
Photo of a capsule, with the caption "this survival capsule is similar to the one that capsized early Friday..."

“The government’s understaffing and high case load is a problem of its own making,”
a federal judge observed.
On Feb. 18, 2026, after a 90-minute hearing,
U.S. District Judge Laura Provinzino of the District of Minnesota imposed a conditional civil contempt fine upon a government attorney handling one of the hundreds of immigration related habeas corpus cases
that have overwhelmed the courts of Minneapolis and St. Paul, Minn., since December.
She ordered Specia…

@arXiv_csCL_bot@mastoxiv.page
2026-03-31 10:11:02

LombardoGraphia: Automatic Classification of Lombard Orthography Variants
Edoardo Signoroni, Pavel Rychlý
arxiv.org/abs/2603.28418 arxiv.org/pdf/2603.28418 arxiv.org/html/2603.28418
arXiv:2603.28418v1 Announce Type: new
Abstract: Lombard, an underresourced language variety spoken by approximately 3.8 million people in Northern Italy and Southern Switzerland, lacks a unified orthographic standard. Multiple orthographic systems exist, creating challenges for NLP resource development and model training. This paper presents the first study of automatic Lombard orthography classification and LombardoGraphia, a curated corpus of 11,186 Lombard Wikipedia samples tagged across 9 orthographic variants, and models for automatic orthography classification. We curate the dataset, processing and filtering raw Wikipedia content to ensure text suitable for orthographic analysis. We train 24 traditional and neural classification models with various features and encoding levels. Our best models achieve 96.06% and 85.78% overall and average class accuracy, though performance on minority classes remains challenging due to data imbalance. Our work provides crucial infrastructure for building variety-aware NLP resources for Lombard.

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:34:51

UrbanFM: Scaling Urban Spatio-Temporal Foundation Models
Wei Chen, Yuqian Wu, Junle Chen, Xiaofang Zhou, Yuxuan Liang
arxiv.org/abs/2602.20677 arxiv.org/pdf/2602.20677 arxiv.org/html/2602.20677
arXiv:2602.20677v1 Announce Type: new
Abstract: Urban systems, as dynamic complex systems, continuously generate spatio-temporal data streams that encode the fundamental laws of human mobility and city evolution. While AI for Science has witnessed the transformative power of foundation models in disciplines like genomics and meteorology, urban computing remains fragmented due to "scenario-specific" models, which are overfitted to specific regions or tasks, hindering their generalizability. To bridge this gap and advance spatio-temporal foundation models for urban systems, we adopt scaling as the central perspective and systematically investigate two key questions: what to scale and how to scale. Grounded in first-principles analysis, we identify three critical dimensions: heterogeneity, correlation, and dynamics, aligning these principles with the fundamental scientific properties of urban spatio-temporal data. Specifically, to address heterogeneity through data scaling, we construct WorldST. This billion-scale corpus standardizes diverse physical signals, such as traffic flow and speed, from over 100 global cities into a unified data format. To enable computation scaling for modeling correlations, we introduce the MiniST unit, a novel split mechanism that discretizes continuous spatio-temporal fields into learnable computational units to unify representations of grid-based and sensor-based observations. Finally, addressing dynamics via architecture scaling, we propose UrbanFM, a minimalist self-attention architecture designed with limited inductive biases to autonomously learn dynamic spatio-temporal dependencies from massive data. Furthermore, we establish EvalST, the largest-scale urban spatio-temporal benchmark to date. Extensive experiments demonstrate that UrbanFM achieves remarkable zero-shot generalization across unseen cities and tasks, marking a pivotal first step toward large-scale urban spatio-temporal foundation models.

@kctipton@mas.to
2026-04-30 16:54:07

This Summer, the American Water Crisis Becomes Real | WIRED #Texas

@thomasfuchs@hachyderm.io
2026-02-13 15:32:02

RE: hachyderm.io/@thomasfuchs/1160
It should also be noted that LLMs do not write code.
They assemble code from bits and pieces of a tokenized text corpus based on how they are statistically interrelated.
LLMs do not think, anticipate, cognize, abstract or have any theory of mind; they do not currently and will not ever have this ability.

@BBC3MusicBot@mastodonapp.uk
2026-05-05 14:45:59

🇺🇦 #NowPlaying on BBCRadio3's #ClassicalLive
William Byrd & Stile Antico:
🎵 Ave Verum Corpus
#WilliamByrd #StileAntico
open.spotify.com/track/7BrXyga

@arXiv_csCL_bot@mastoxiv.page
2026-03-31 11:12:53

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[3/5]:
- Can Small Language Models Handle Context-Summarized Multi-Turn Customer-Service QA? A Synthetic D...
Lakshan Cooray, Deshan Sumanathilaka, Pattigadapa Venkatesh Raju
arxiv.org/abs/2602.00665 mastoxiv.page/@arXiv_csCL_bot/
- SEAD: Self-Evolving Agent for Multi-Turn Service Dialogue
Dai, Gao, Zhang, Wang, Luo, Wang, Wang, Wu, Wang
arxiv.org/abs/2602.03548
- OmniRAG-Agent: Agentic Omnimodal Reasoning for Low-Resource Long Audio-Video Question Answering
Yifan Zhu, Xinyu Mu, Tao Feng, Zhonghong Ou, Yuning Gong, Haoran Luo
arxiv.org/abs/2602.03707
- GreekMMLU: A Native-Sourced Multitask Benchmark for Evaluating Language Models in Greek
Zhang, Konomi, Xypolopoulos, Divriotis, Skianis, Nikolentzos, Stamou, Shang, Vazirgiannis
arxiv.org/abs/2602.05150
- Using LLMs for Knowledge Component-level Correctness Labeling in Open-ended Coding Problems
Zhangqi Duan, Arnav Kankaria, Dhruv Kartik, Andrew Lan
arxiv.org/abs/2602.17542 mastoxiv.page/@arXiv_csCL_bot/
- MetaState: Persistent Working Memory Enhances Reasoning in Discrete Diffusion Language Models
Kejing Xia, Mingzhe Li, Lixuan Wei, Zhenbang Du, Xiangchi Yuan, Dachuan Shi, Qirui Jin, Wenke Lee
arxiv.org/abs/2603.01331 mastoxiv.page/@arXiv_csCL_bot/
- A Browser-based Open Source Assistant for Multimodal Content Verification
Milner, Foster, Karmakharm, Razuvayevskaya, Roberts, Porcellini, Teyssou, Bontcheva
arxiv.org/abs/2603.02842 mastoxiv.page/@arXiv_csCL_bot/
- Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR
Sharma, Shrestha, Poudel, Tiwari, Shrestha, Ghimire, Bal
arxiv.org/abs/2603.07554 mastoxiv.page/@arXiv_csCL_bot/
- Model Merging in the Era of Large Language Models: Methods, Applications, and Future Directions
Mingyang Song, Mao Zheng
arxiv.org/abs/2603.09938 mastoxiv.page/@arXiv_csCL_bot/
- AgentDrift: Unsafe Recommendation Drift Under Tool Corruption Hidden by Ranking Metrics in LLM Ag...
Zekun Wu, Adriano Koshiyama, Sahan Bulathwela, Maria Perez-Ortiz
arxiv.org/abs/2603.12564 mastoxiv.page/@arXiv_csCL_bot/
- GhanaNLP Parallel Corpora: Comprehensive Multilingual Resources for Low-Resource Ghanaian Languages
Gyamfi, Azunre, Moore, Budu, Asare, Owusu, Asiamah
arxiv.org/abs/2603.13793 mastoxiv.page/@arXiv_csCL_bot/
- sebis at ArchEHR-QA 2026: How Much Can You Do Locally? Evaluating Grounded EHR QA on a Single Not...
Ibrahim Ebrar Yurt, Fabian Karl, Tejaswi Choppa, Florian Matthes
arxiv.org/abs/2603.13962 mastoxiv.page/@arXiv_csCL_bot/
- ExPosST: Explicit Positioning with Adaptive Masking for LLM-Based Simultaneous Machine Translation
Yuzhe Shang, Pengzhi Gao, Yazheng Yang, Jiayao Ma, Wei Liu, Jian Luan, Jinsong Su
arxiv.org/abs/2603.14903 mastoxiv.page/@arXiv_csCL_bot/
- BanglaSocialBench: A Benchmark for Evaluating Sociopragmatic and Cultural Alignment of LLMs in Ba...
Tanvir Ahmed Sijan, S. M Golam Rifat, Pankaj Chowdhury Partha, Md. Tanjeed Islam, Md. Musfique Anwar
arxiv.org/abs/2603.15949 mastoxiv.page/@arXiv_csCL_bot/
- EngGPT2: Sovereign, Efficient and Open Intelligence
G. Ciarfaglia, et al.
arxiv.org/abs/2603.16430 mastoxiv.page/@arXiv_csCL_bot/
- HypeLoRA: Hyper-Network-Generated LoRA Adapters for Calibrated Language Model Fine-Tuning
Bartosz Trojan, Filip Gębala
arxiv.org/abs/2603.19278 mastoxiv.page/@arXiv_csCL_bot/
- Automatic Analysis of Collaboration Through Human Conversational Data Resources: A Review
Yi Yu, Maria Boritchev, Chloé Clavel
arxiv.org/abs/2603.19292 mastoxiv.page/@arXiv_csCL_bot/
- Alignment Whack-a-Mole : Finetuning Activates Verbatim Recall of Copyrighted Books in Large Langu...
Xinyue Liu, Niloofar Mireshghallah, Jane C. Ginsburg, Tuhin Chakrabarty
arxiv.org/abs/2603.20957 mastoxiv.page/@arXiv_csCL_bot/
- KG-Hopper: Empowering Compact Open LLMs with Knowledge Graph Reasoning via Reinforcement Learning
Shuai Wang, Yinan Yu
arxiv.org/abs/2603.21440 mastoxiv.page/@arXiv_csCL_bot/

@BBC3MusicBot@mastodonapp.uk
2026-03-30 03:42:23

🇺🇦 #NowPlaying on BBCRadio3's #ThroughTheNight
Jos van Immerseel & Franz Liszt:
🎵 A la Chapelle Sixtine (Miserere de Allegri et Ave verum corpus de Mozart)
#JosvanImmerseel #FranzLiszt

@BBC3MusicBot@mastodonapp.uk
2026-03-28 12:49:47

🇺🇦 #NowPlaying on BBCRadio3's #EarlierWithJoolsHolland
Benjamin Britten, Gerald Moore & Janet Baker:
🎵 Corpus Christi Carol (From A Boy Was Born, Op 3)
#BenjaminBritten #GeraldMoore #JanetBaker
open.spotify.com/track/0pJPmLf

@BBC3MusicBot@mastodonapp.uk
2026-04-26 19:42:26

🇺🇦 #NowPlaying on BBCRadio3's #WordsAndMusic
Roderick Williams, ORA Singers & Suzi Digby:
🎵 Ave verum corpus Re-imagined
#RoderickWilliams #ORASingers #SuziDigby
open.spotify.com/track/7jbVbEm

@BBC3MusicBot@mastodonapp.uk
2026-02-23 12:31:19

🇺🇦 #NowPlaying on BBCRadio3's #EssentialClassics
Tenebrae, Wolfgang Amadeus Mozart, The Chamber Orchestra of Europe & Nigel Short:
🎵 Ave verum corpus, K 618
#Tenebrae #WolfgangAmadeusMozart #NigelShort
open.spotify.com/track/6nYkXGz

@BBC3MusicBot@mastodonapp.uk
2026-02-18 01:46:00

🇺🇦 #NowPlaying on BBCRadio3's #ThroughTheNight
Wolfgang Amadeus Mozart, Coro Maghini, Claudio Chiavazza, Academia Montis Regalis & Alessandro De Marchi:
🎵 Ave verum corpus, K.618
#WolfgangAmadeusMozart #CoroMaghini #ClaudioChiavazza #AcademiaMontisRegalis

@BBC3MusicBot@mastodonapp.uk
2026-04-11 03:43:04

🇺🇦 #NowPlaying on BBCRadio3's #ThroughTheNight
Plamena Mangova & Isaac Albéniz:
🎵 El Corpus en Sevilla from 'Iberia' (Book 1)
#PlamenaMangova #IsaacAlbéniz

@BBC3MusicBot@mastodonapp.uk
2026-02-14 06:43:04

🇺🇦 #NowPlaying on BBCRadio3's #Breakfast
Tenebrae, Wolfgang Amadeus Mozart, The Chamber Orchestra of Europe & Nigel Short:
🎵 Ave verum corpus, K 618
#Tenebrae #WolfgangAmadeusMozart #TheChamberOrchestraofEurope
open.spotify.com/track/6nYkXGz

@BBC3MusicBot@mastodonapp.uk
2026-04-15 02:38:07

🇺🇦 #NowPlaying on BBCRadio3's #ThroughTheNight
Imant Raminsh, Vancouver Chamber Choir & Jon Washburn:
🎵 Ave Verum Corpus
#ImantRaminsh #VancouverChamberChoir #JonWashburn

@BBC3MusicBot@mastodonapp.uk
2026-04-18 06:47:01

🇺🇦 #NowPlaying on BBCRadio3's #Breakfast
William Byrd, Girls' Choir of Canterbury Cathedral, Canterbury Cathedral Girls' Choir & David Newsholme:
🎵 Ave verum corpus, T 92
#WilliamByrd #GirlsChoirofCanterburyCathedral #CanterburyCathedralGirlsChoir #DavidNewsholme

@BBC3MusicBot@mastodonapp.uk
2026-04-16 18:23:44

🇺🇦 #NowPlaying on BBCRadio3's #ClassicalMixtape
The Sixteen, Wolfgang Amadeus Mozart, Academy of St Martin in the Fields & Harry Christophers:
🎵 Ave verum corpus K.618
#TheSixteen #WolfgangAmadeusMozart #AcademyofStMartinintheFields #HarryChristophers