Tootfinder

Opt-in global Mastodon full text search. Join the index!

No exact results. Similar results found.
@arXiv_csDS_bot@mastoxiv.page
2026-02-03 08:07:36

Fast $k$-means Seeding Under The Manifold Hypothesis
Poojan Shah, Shashwat Agrawal, Ragesh Jaiswal
arxiv.org/abs/2602.01104 arxiv.org/pdf/2602.01104 arxiv.org/html/2602.01104
arXiv:2602.01104v1 Announce Type: new
Abstract: We study beyond worst case analysis for the $k$-means problem where the goal is to model typical instances of $k$-means arising in practice. Existing theoretical approaches provide guarantees under certain assumptions on the optimal solutions to $k$-means, making them difficult to validate in practice. We propose the manifold hypothesis, where data obtained in ambient dimension $D$ concentrates around a low dimensional manifold of intrinsic dimension $d$, as a reasonable assumption to model real world clustering instances. We identify key geometric properties of datasets which have theoretically predictable scaling laws depending on the quantization exponent $\varepsilon = 2/d$ using techniques from optimum quantization theory. We show how to exploit these regularities to design a fast seeding method called $\operatorname{Qkmeans}$ which provides $O(\rho^{-2} \log k)$ approximate solutions to the $k$-means problem in time $O(nD) \widetilde{O}(\varepsilon^{1 \rho}\rho^{-1}k^{1 \gamma})$; where the exponent $\gamma = \varepsilon \rho$ for an input parameter $\rho < 1$. This allows us to obtain new runtime - quality tradeoffs. We perform a large scale empirical study across various domains to validate our theoretical predictions and algorithm performance to bridge theory and practice for beyond worst case data clustering.
toXiv_bot_toot

@arXiv_csCL_bot@mastoxiv.page
2026-03-31 11:12:28

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[1/5]:
- Beyond In-Distribution Success: Scaling Curves of CoT Granularity for Language Model Generalization
Ru Wang, Wei Huang, Selena Song, Haoyu Zhang, Qian Niu, Yusuke Iwasawa, Yutaka Matsuo, Jiaxian Guo
arxiv.org/abs/2502.18273 mastoxiv.page/@arXiv_csCL_bot/
- Benchmarking NLP-supported Language Sample Analysis for Swiss Children's Speech
Anja Ryser, Yingqiang Gao, Sarah Ebling
arxiv.org/abs/2504.00780 mastoxiv.page/@arXiv_csCL_bot/
- Cultural Biases of Large Language Models and Humans in Historical Interpretation
Fabio Celli, Georgios Spathulas
arxiv.org/abs/2504.02572 mastoxiv.page/@arXiv_csCL_bot/
- BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text
Jiageng Wu, et al.
arxiv.org/abs/2504.19467 mastoxiv.page/@arXiv_csCL_bot/
- Understanding the Anchoring Effect of LLM with Synthetic Data: Existence, Mechanism, and Potentia...
Yiming Huang, Biquan Bie, Zuqiu Na, Weilin Ruan, Songxin Lei, Yutao Yue, Xinlei He
arxiv.org/abs/2505.15392 mastoxiv.page/@arXiv_csCL_bot/
- Just as Humans Need Vaccines, So Do Models: Model Immunization to Combat Falsehoods
Raza, Qureshi, Farooq, Lotif, Chadha, Pandya, Emmanouilidis
arxiv.org/abs/2505.17870 mastoxiv.page/@arXiv_csCL_bot/
- LingoLoop Attack: Trapping MLLMs via Linguistic Context and State Entrapment into Endless Loops
Fu, Jiang, Hong, Li, Guo, Yang, Chen, Zhang
arxiv.org/abs/2506.14493 mastoxiv.page/@arXiv_csCL_bot/
- GHTM: A Graph-based Hybrid Topic Modeling Approach with a Benchmark Dataset for the Low-Resource ...
Farhana Haque, Md. Abdur Rahman, Sumon Ahmed
arxiv.org/abs/2508.00605 mastoxiv.page/@arXiv_csCL_bot/
- Link Prediction for Event Logs in the Process Industry
Anastasia Zhukova, Thomas Walton, Christian E. Lobm\"uller, Bela Gipp
arxiv.org/abs/2508.09096 mastoxiv.page/@arXiv_csCL_bot/
- AirQA: A Comprehensive QA Dataset for AI Research with Instance-Level Evaluation
Huang, Cao, Zhang, Kang, Wang, Wang, Luo, Zheng, Qian, Chen, Yu
arxiv.org/abs/2509.16952 mastoxiv.page/@arXiv_csCL_bot/
- Multi-View Attention Multiple-Instance Learning Enhanced by LLM Reasoning for Cognitive Distortio...
Jun Seo Kim, Hyemi Kim, Woo Joo Oh, Hongjin Cho, Hochul Lee, Hye Hyeon Kim
arxiv.org/abs/2509.17292 mastoxiv.page/@arXiv_csCL_bot/
- Dual-Space Smoothness for Robust and Balanced LLM Unlearning
Han Yan, Zheyuan Liu, Meng Jiang
arxiv.org/abs/2509.23362 mastoxiv.page/@arXiv_csCL_bot/
- The Rise of AfricaNLP: Contributions, Contributors, Community Impact, and Bibliometric Analysis
Tadesse Destaw Belay, et al.
arxiv.org/abs/2509.25477 mastoxiv.page/@arXiv_csCL_bot/
- Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Reco...
Srivastav, Zheng, Bezzam, Le Bihan, Koluguri, \.Zelasko, Majumdar, Moumen, Gandhi
arxiv.org/abs/2510.06961 mastoxiv.page/@arXiv_csCL_bot/
- Neuron-Level Analysis of Cultural Understanding in Large Language Models
Taisei Yamamoto, Ryoma Kumon, Danushka Bollegala, Hitomi Yanaka
arxiv.org/abs/2510.08284 mastoxiv.page/@arXiv_csCL_bot/
- CLMN: Concept based Language Models via Neural Symbolic Reasoning
Yibo Yang
arxiv.org/abs/2510.10063 mastoxiv.page/@arXiv_csCL_bot/
- Schema for In-Context Learning
Chen, Chen, Wang, Leong, Fung, Bernales, Aspuru-Guzik
arxiv.org/abs/2510.13905 mastoxiv.page/@arXiv_csCL_bot/
- Evaluating Latent Knowledge of Public Tabular Datasets in Large Language Models
Matteo Silvestri, Fabiano Veglianti, Flavio Giorgi, Fabrizio Silvestri, Gabriele Tolomei
arxiv.org/abs/2510.20351 mastoxiv.page/@arXiv_csCL_bot/
- LuxIT: A Luxembourgish Instruction Tuning Dataset from Monolingual Seed Data
Julian Valline, Cedric Lothritz, Siwen Guo, Jordi Cabot
arxiv.org/abs/2510.24434 mastoxiv.page/@arXiv_csCL_bot/
- Surfacing Subtle Stereotypes: A Multilingual, Debate-Oriented Evaluation of Modern LLMs
Muhammed Saeed, Muhammad Abdul-mageed, Shady Shehata
arxiv.org/abs/2511.01187 mastoxiv.page/@arXiv_csCL_bot/
toXiv_bot_toot

@raiders@darktundra.xyz
2026-03-21 20:36:31

Raiders 7-Round Mock: Aggressive trades show GM John Spytek means business raiderramble.com/2026/03/21/ra

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:37:31

Regret-Guided Search Control for Efficient Learning in AlphaZero
Yun-Jui Tsai, Wei-Yu Chen, Yan-Ru Ju, Yu-Hung Chang, Ti-Rong Wu
arxiv.org/abs/2602.20809 arxiv.org/pdf/2602.20809 arxiv.org/html/2602.20809
arXiv:2602.20809v1 Announce Type: new
Abstract: Reinforcement learning (RL) agents achieve remarkable performance but remain far less learning-efficient than humans. While RL agents require extensive self-play games to extract useful signals, humans often need only a few games, improving rapidly by repeatedly revisiting states where mistakes occurred. This idea, known as search control, aims to restart from valuable states rather than always from the initial state. In AlphaZero, prior work Go-Exploit applies this idea by sampling past states from self-play or search trees, but it treats all states equally, regardless of their learning potential. We propose Regret-Guided Search Control (RGSC), which extends AlphaZero with a regret network that learns to identify high-regret states, where the agent's evaluation diverges most from the actual outcome. These states are collected from both self-play trajectories and MCTS nodes, stored in a prioritized regret buffer, and reused as new starting positions. Across 9x9 Go, 10x10 Othello, and 11x11 Hex, RGSC outperforms AlphaZero and Go-Exploit by an average of 77 and 89 Elo, respectively. When training on a well-trained 9x9 Go model, RGSC further improves the win rate against KataGo from 69.3% to 78.2%, while both baselines show no improvement. These results demonstrate that RGSC provides an effective mechanism for search control, improving both efficiency and robustness of AlphaZero training. Our code is available at rlg.iis.sinica.edu.tw/papers/r.
toXiv_bot_toot

@arXiv_csCL_bot@mastoxiv.page
2026-03-31 11:12:48

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[2/5]:
- POTSA: A Cross-Lingual Speech Alignment Framework for Speech-to-Text Translation
Li, Cui, Wang, Ge, Huang, Li, Peng, Lu, Tashi, Wang, Dang
arxiv.org/abs/2511.09232 mastoxiv.page/@arXiv_csCL_bot/
- Beyond Elicitation: Provision-based Prompt Optimization for Knowledge-Intensive Tasks
Yunzhe Xu, Zhuosheng Zhang, Zhe Liu
arxiv.org/abs/2511.10465 mastoxiv.page/@arXiv_csCL_bot/
- $\pi$-Attention: Periodic Sparse Transformers for Efficient Long-Context Modeling
Dong Liu, Yanxuan Yu
arxiv.org/abs/2511.10696 mastoxiv.page/@arXiv_csCL_bot/
- Based on Data Balancing and Model Improvement for Multi-Label Sentiment Classification Performanc...
Zijin Su, Huanzhu Lyu, Yuren Niu, Yiming Liu
arxiv.org/abs/2511.14073 mastoxiv.page/@arXiv_csCL_bot/
- HEAD-QA v2: Expanding a Healthcare Benchmark for Reasoning
Alexis Correa-Guill\'en, Carlos G\'omez-Rodr\'iguez, David Vilares
arxiv.org/abs/2511.15355 mastoxiv.page/@arXiv_csCL_bot/
- Towards Hyper-Efficient RAG Systems in VecDBs: Distributed Parallel Multi-Resolution Vector Search
Dong Liu, Yanxuan Yu
arxiv.org/abs/2511.16681 mastoxiv.page/@arXiv_csCL_bot/
- Estonian WinoGrande Dataset: Comparative Analysis of LLM Performance on Human and Machine Transla...
Marii Ojastu, Hele-Andra Kuulmets, Aleksei Dorkin, Marika Borovikova, Dage S\"arg, Kairit Sirts
arxiv.org/abs/2511.17290 mastoxiv.page/@arXiv_csCL_bot/
- A Systematic Study of In-the-Wild Model Merging for Large Language Models
O\u{g}uz Ka\u{g}an Hitit, Leander Girrbach, Zeynep Akata
arxiv.org/abs/2511.21437 mastoxiv.page/@arXiv_csCL_bot/
- CREST: Universal Safety Guardrails Through Cluster-Guided Cross-Lingual Transfer
Lavish Bansal, Naman Mishra
arxiv.org/abs/2512.02711 mastoxiv.page/@arXiv_csCL_bot/
- Multilingual Medical Reasoning for Question Answering with Large Language Models
Pietro Ferrazzi, Aitor Soroa, Rodrigo Agerri
arxiv.org/abs/2512.05658 mastoxiv.page/@arXiv_csCL_bot/
- OnCoCo 1.0: A Public Dataset for Fine-Grained Message Classification in Online Counseling Convers...
Albrecht, Lehmann, Poltermann, Rudolph, Steigerwald, Stieler
arxiv.org/abs/2512.09804 mastoxiv.page/@arXiv_csCL_bot/
- Does Tone Change the Answer? Evaluating Prompt Politeness Effects on Modern LLMs: GPT, Gemini, an...
Hanyu Cai, Binqi Shen, Lier Jin, Lan Hu, Xiaojing Fan
arxiv.org/abs/2512.12812 mastoxiv.page/@arXiv_csCL_bot/
- Beg to Differ: Understanding Reasoning-Answer Misalignment Across Languages
Ovalle, Ross, Ruder, Williams, Ullrich, Ibrahim, Sagun
arxiv.org/abs/2512.22712 mastoxiv.page/@arXiv_csCL_bot/
- Activation Steering for Masked Diffusion Language Models
Adi Shnaidman, Erin Feiglin, Osher Yaari, Efrat Mentel, Amit Levi, Raz Lapid
arxiv.org/abs/2512.24143 mastoxiv.page/@arXiv_csCL_bot/
- JMedEthicBench: A Multi-Turn Conversational Benchmark for Evaluating Medical Safety in Japanese L...
Liu, Li, Niu, Zhang, Xun, Hou, Wang, Iwasawa, Matsuo, Hatakeyama-Sato
arxiv.org/abs/2601.01627 mastoxiv.page/@arXiv_csCL_bot/
- FACTUM: Mechanistic Detection of Citation Hallucination in Long-Form RAG
Dassen, Kotula, Murray, Yates, Lawrie, Kayi, Mayfield, Duh
arxiv.org/abs/2601.05866 mastoxiv.page/@arXiv_csCL_bot/
- {\dag}DAGGER: Distractor-Aware Graph Generation for Executable Reasoning in Math Problems
Zabir Al Nazi, Shubhashis Roy Dipta, Sudipta Kar
arxiv.org/abs/2601.06853 mastoxiv.page/@arXiv_csCL_bot/
- Symphonym: Universal Phonetic Embeddings for Cross-Script Name Matching
Stephen Gadd
arxiv.org/abs/2601.06932 mastoxiv.page/@arXiv_csCL_bot/
- LLMs versus the Halting Problem: Revisiting Program Termination Prediction
Sultan, Armengol-Estape, Kesseli, Vanegue, Shahaf, Adi, O'Hearn
arxiv.org/abs/2601.18987 mastoxiv.page/@arXiv_csCL_bot/
- MuVaC: A Variational Causal Framework for Multimodal Sarcasm Understanding in Dialogues
Diandian Guo, Fangfang Yuan, Cong Cao, Xixun Lin, Chuan Zhou, Hao Peng, Yanan Cao, Yanbing Liu
arxiv.org/abs/2601.20451 mastoxiv.page/@arXiv_csCL_bot/
toXiv_bot_toot

@nobodyinperson@fosstodon.org
2026-02-06 23:18:14

Towards simpler #btrfs raid setup with #disko:
discourse.nixos.org/t/btrfs-ra

@qbi@freie-re.de
2026-03-16 12:11:45

Kennt ihr #Center795?
Das war/ist eine russische Einheit bestehend aus #GRU und #FSB. Die sollten "Spezialaktionen" in der #Ukraine

@JBrickelt963@framapiaf.org
2026-03-05 19:32:26

#Ally43 : #OnRecrute en #Auvergne #HauteLoire
L’association

OFFRE de Stage • Sur Le Plateau D’Ally (éco-tourisme) : Mettre on place et conduire des projets d'animations et sensibilisation dans le cadre de visites et activités de pleine nature. Du 1er avril au 31 août 2026. Stage rémunéré. Logement gratuit possible.

1. Missions Principales : 

• Aide à l'animation t à l'amélioration des visites guidées de différents sites de visites en milieu naturel : une mine d'argent gall-romane, des moulins à vent 
restaurés et un parc éolien.

• Adaptation des visi…
2. Missions secondaires

• Traduction en anglais et/ou langue tierces des topos de visites guidées.
• Tenue de permanences téléphoniques et Prise de réservation en ligne ou téléphonique.

• Élaboration de supports d'animation {cartes, mémos, schémas, etc)

• Regard et amélioration des outil de communication visuelle + brochures,
flyers, affiches.

• Assurer une présence sur les réseaux sociaux : alimentation du site internet, de la page facebook et du compte Instagram.

3. Nous Recherchors

F/H…