Tootfinder

Opt-in global Mastodon full text search. Join the index!

No exact results. Similar results found.
@david@boles.xyz
2026-04-29 11:34:29

The States That Will Not Be Commanded
There is a class of human experience that answers to no direct order. You cannot tell yourself to fall asleep. The instruction arrives at a locked door. Sleep refuses the simple transaction of command and execution. Instead, it assembles itself once certain conditions are present, and those conditions include, strangely enough, the act of picturing yourself already inside the state you are trying to enter.

@kurtsh@mastodon.social
2026-04-29 20:29:20

People who "want a phone call" instead of just answering a simple question over email:
➡️ Don't want a paper trail
➡️ Are unable to put their thoughts into written words
➡️ Don't know how or are too lazy to type
#thismeetingcouldhavebeenanemail

@arXiv_csCL_bot@mastoxiv.page
2026-03-31 10:12:22

Training data generation for context-dependent rubric-based short answer grading
Pavel Šindelář, Dávid Slivka, Christopher Bouma, Filip Prášil, Ondřej Bojar
arxiv.org/abs/2603.28537 arxiv.org/pdf/2603.28537 arxiv.org/html/2603.28537
arXiv:2603.28537v1 Announce Type: new
Abstract: Every 4 years, the PISA test is administered by the OECD to test the knowledge of teenage students worldwide and allow for comparisons of educational systems. However, having to avoid language differences and annotator bias makes the grading of student answers challenging. For these reasons, it would be interesting to compare methods of automatic student answer grading. To train some of these methods, which require machine learning, or to compute parameters or select hyperparameters for those that do not, a large amount of domain-specific data is needed. In this work, we explore a small number of methods for creating a large-scale training dataset using only a relatively small confidential dataset as a reference, leveraging a set of very simple derived text formats to preserve confidentiality. Using these methods, we successfully created three surrogate datasets that are, at the very least, superficially more similar to the reference dataset than purely the result of prompt-based generation. Early experiments suggest one of these approaches might also lead to improved model training.
toXiv_bot_toot

@kurtsh@mastodon.social
2026-03-28 16:03:27

No, they want your DNA to track you.
Folks, have you seen GATTACA?
▶️ U.S. lawmakers demand answers after Canadian man says border officers made him give DNA sample | CBC News
cbc.ca/news/canada/windsor/us-

@digitalnaiv@mastodon.social
2026-04-26 07:49:02

The scandal is not the attack itself but a system that fails to intercept it. Signal without rules is no substitute for state infrastructure. Caspar Clemens Mierau is right in some respects, but it is telling when someone like Klöckner, among others, falls for a simple phishing attack. That says something about their digital competence.
#Golem

Featuring headliners such as Robert De Niro, Minneapolis Mayor Jacob Frey and journalist Don Lemon, the "State of the Swamp" address is set to continue through Trump's address with live rebuttals. Attendees were encouraged to dress in green frog attire as a symbol of defiance, honoring the frog costumes worn by many anti-Immigration and Customs Enforcement protesters during its occupation of the city. It is also intended to reference the "…

@arXiv_csCL_bot@mastoxiv.page
2026-03-31 11:12:28

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[1/5]:
- Beyond In-Distribution Success: Scaling Curves of CoT Granularity for Language Model Generalization
Ru Wang, Wei Huang, Selena Song, Haoyu Zhang, Qian Niu, Yusuke Iwasawa, Yutaka Matsuo, Jiaxian Guo
arxiv.org/abs/2502.18273 mastoxiv.page/@arXiv_csCL_bot/
- Benchmarking NLP-supported Language Sample Analysis for Swiss Children's Speech
Anja Ryser, Yingqiang Gao, Sarah Ebling
arxiv.org/abs/2504.00780 mastoxiv.page/@arXiv_csCL_bot/
- Cultural Biases of Large Language Models and Humans in Historical Interpretation
Fabio Celli, Georgios Spathulas
arxiv.org/abs/2504.02572 mastoxiv.page/@arXiv_csCL_bot/
- BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text
Jiageng Wu, et al.
arxiv.org/abs/2504.19467 mastoxiv.page/@arXiv_csCL_bot/
- Understanding the Anchoring Effect of LLM with Synthetic Data: Existence, Mechanism, and Potentia...
Yiming Huang, Biquan Bie, Zuqiu Na, Weilin Ruan, Songxin Lei, Yutao Yue, Xinlei He
arxiv.org/abs/2505.15392 mastoxiv.page/@arXiv_csCL_bot/
- Just as Humans Need Vaccines, So Do Models: Model Immunization to Combat Falsehoods
Raza, Qureshi, Farooq, Lotif, Chadha, Pandya, Emmanouilidis
arxiv.org/abs/2505.17870 mastoxiv.page/@arXiv_csCL_bot/
- LingoLoop Attack: Trapping MLLMs via Linguistic Context and State Entrapment into Endless Loops
Fu, Jiang, Hong, Li, Guo, Yang, Chen, Zhang
arxiv.org/abs/2506.14493 mastoxiv.page/@arXiv_csCL_bot/
- GHTM: A Graph-based Hybrid Topic Modeling Approach with a Benchmark Dataset for the Low-Resource ...
Farhana Haque, Md. Abdur Rahman, Sumon Ahmed
arxiv.org/abs/2508.00605 mastoxiv.page/@arXiv_csCL_bot/
- Link Prediction for Event Logs in the Process Industry
Anastasia Zhukova, Thomas Walton, Christian E. Lobmüller, Bela Gipp
arxiv.org/abs/2508.09096 mastoxiv.page/@arXiv_csCL_bot/
- AirQA: A Comprehensive QA Dataset for AI Research with Instance-Level Evaluation
Huang, Cao, Zhang, Kang, Wang, Wang, Luo, Zheng, Qian, Chen, Yu
arxiv.org/abs/2509.16952 mastoxiv.page/@arXiv_csCL_bot/
- Multi-View Attention Multiple-Instance Learning Enhanced by LLM Reasoning for Cognitive Distortio...
Jun Seo Kim, Hyemi Kim, Woo Joo Oh, Hongjin Cho, Hochul Lee, Hye Hyeon Kim
arxiv.org/abs/2509.17292 mastoxiv.page/@arXiv_csCL_bot/
- Dual-Space Smoothness for Robust and Balanced LLM Unlearning
Han Yan, Zheyuan Liu, Meng Jiang
arxiv.org/abs/2509.23362 mastoxiv.page/@arXiv_csCL_bot/
- The Rise of AfricaNLP: Contributions, Contributors, Community Impact, and Bibliometric Analysis
Tadesse Destaw Belay, et al.
arxiv.org/abs/2509.25477 mastoxiv.page/@arXiv_csCL_bot/
- Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Reco...
Srivastav, Zheng, Bezzam, Le Bihan, Koluguri, Żelasko, Majumdar, Moumen, Gandhi
arxiv.org/abs/2510.06961 mastoxiv.page/@arXiv_csCL_bot/
- Neuron-Level Analysis of Cultural Understanding in Large Language Models
Taisei Yamamoto, Ryoma Kumon, Danushka Bollegala, Hitomi Yanaka
arxiv.org/abs/2510.08284 mastoxiv.page/@arXiv_csCL_bot/
- CLMN: Concept based Language Models via Neural Symbolic Reasoning
Yibo Yang
arxiv.org/abs/2510.10063 mastoxiv.page/@arXiv_csCL_bot/
- Schema for In-Context Learning
Chen, Chen, Wang, Leong, Fung, Bernales, Aspuru-Guzik
arxiv.org/abs/2510.13905 mastoxiv.page/@arXiv_csCL_bot/
- Evaluating Latent Knowledge of Public Tabular Datasets in Large Language Models
Matteo Silvestri, Fabiano Veglianti, Flavio Giorgi, Fabrizio Silvestri, Gabriele Tolomei
arxiv.org/abs/2510.20351 mastoxiv.page/@arXiv_csCL_bot/
- LuxIT: A Luxembourgish Instruction Tuning Dataset from Monolingual Seed Data
Julian Valline, Cedric Lothritz, Siwen Guo, Jordi Cabot
arxiv.org/abs/2510.24434 mastoxiv.page/@arXiv_csCL_bot/
- Surfacing Subtle Stereotypes: A Multilingual, Debate-Oriented Evaluation of Modern LLMs
Muhammed Saeed, Muhammad Abdul-mageed, Shady Shehata
arxiv.org/abs/2511.01187 mastoxiv.page/@arXiv_csCL_bot/

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:44:51

Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training
Anas Barakat, Souradip Chakraborty, Khushbu Pahwa, Amrit Singh Bedi
arxiv.org/abs/2602.21189 arxiv.org/pdf/2602.21189 arxiv.org/html/2602.21189
arXiv:2602.21189v1 Announce Type: new
Abstract: Pass@k is a widely used performance metric for verifiable large language model tasks, including mathematical reasoning, code generation, and short-answer reasoning. It defines success if any of $k$ independently sampled solutions passes a verifier. This multi-sample inference metric has motivated inference-aware fine-tuning methods that directly optimize pass@$k$. However, prior work reports a recurring trade-off: pass@k improves while pass@1 degrades under such methods. This trade-off is practically important because pass@1 often remains a hard operational constraint due to latency and cost budgets, imperfect verifier coverage, and the need for a reliable single-shot fallback. We study the origin of this trade-off and provide a theoretical characterization of when pass@k policy optimization can reduce pass@1 through gradient conflict induced by prompt interference. We show that pass@$k$ policy gradients can conflict with pass@1 gradients because pass@$k$ optimization implicitly reweights prompts toward low-success prompts; when these prompts are what we term negatively interfering, their upweighting can rotate the pass@k update direction away from the pass@1 direction. We illustrate our theoretical findings with large language model experiments on verifiable mathematical reasoning tasks.
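The abstract defines pass@k as success if any of k independently sampled solutions passes a verifier. As a minimal sketch (not taken from the paper above), this is commonly estimated with the standard unbiased combinatorial estimator: given n generations per prompt of which c pass, the probability that a random subset of k contains at least one passing sample is 1 - C(n-c, k)/C(n, k).

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: probability that at least one of k
    solutions, drawn without replacement from n generations of which
    c pass the verifier, is correct."""
    if n - c < k:
        # Fewer than k failing samples exist, so every k-subset
        # must contain at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 samples per prompt, 2 of which pass the verifier.
p1 = pass_at_k(10, 2, 1)  # 0.2 -- plain single-shot success rate
p5 = pass_at_k(10, 2, 5)  # higher: any of 5 draws may succeed
```

This also illustrates the trade-off the paper studies: pass@k rewards prompts with low but nonzero success rates far more than pass@1 does, which is the reweighting the authors identify as a source of gradient conflict.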

@Techmeme@techhub.social
2026-02-19 16:22:04

Google rolls out Gemini 3.1 Pro, which it says is "a step forward in core reasoning", for AI Pro and Ultra subscribers; the .1 increment is a first for Google (Abner Li/9to5Google)
9to5google.com/2026/02/19/goog

@Jaffa@social.linux.pizza
2026-02-22 13:22:16

Power loom built by T. Larmuth & Co., in Manchester, around 1860 - now at the Manchester Museum of Science and Industry.
> "By the early 19th century, new machines like this power loom could make cloth more quickly and cheaply than people. Groups of angry handloom weavers raided cotton mills at night. They burnt and broke power looms to protest against the new technology."
This aspect of the Industrial Revolution is now being breathlessly repeated by those who s…

A structure of iron and wood has cotton thread laced through it and looks like a sophisticated, but relatively simple, mechanism.

Science Museum Group. Object no. Y6000.423