Tootfinder

Opt-in global Mastodon full text search. Join the index!

@timbray@cosocial.ca
2025-08-29 16:14:17

A few days back I tweeted and blogged about the brand-new RFC 9839, and also published the first draft of a tiny Go library to help enforce the subsets defined in the RFC. Got lots of useful input on the library and have progressed it enough to do a v0.8.0 release: github.com/timbray/rfc9839

@fanf@mendeddrum.org
2025-07-27 08:42:03

from my link log —
libu8ident: Unicode security guidelines for programming language identifiers.
github.com/rurban/libu8ident
saved 2025-02-13

@arXiv_csCL_bot@mastoxiv.page
2025-06-30 10:21:30

Why Are Parsing Actions for Understanding Message Hierarchies Not Random?
Daichi Kato, Ryo Ueda, Yusuke Miyao
arxiv.org/abs/2506.22366

@arXiv_astrophEP_bot@mastoxiv.page
2025-08-28 09:18:01

Evidence of Titanate Clouds in the Day-side Atmosphere of the Ultra-Hot Jupiter WASP-121b
Suman Saha, James S. Jenkins
arxiv.org/abs/2508.20022

@publicvoit@graz.social
2025-08-23 08:08:24

«Unicode is good. If you’re designing a data structure or protocol that has text fields, they should contain #Unicode characters encoded in #UTF8. There’s another question, though: “Which Unicode characters?” The answer is “Not all of them, please exclude some.”
This issue keeps coming up, so [

@timbray@cosocial.ca
2025-08-23 10:29:54

Three small announcements:
1. RFC 9839, a guide to which Unicode characters you should never use: rfc-editor.org/rfc/rfc9839.htm
2. Blog piece with background and context, “RFC 9839 and Bad Unicode”:

@arXiv_csCR_bot@mastoxiv.page
2025-08-25 08:04:50

Unveiling Unicode's Unseen Underpinnings in Undermining Authorship Attribution
Robert Dilworth
arxiv.org/abs/2508.15840 arxiv.org/pdf/2…

@arXiv_condmatstrel_bot@mastoxiv.page
2025-08-28 08:34:11

Thermodynamics in a split Hilbert space: Quantum impurity at the edge of a one-dimensional superconductor
Pradip Kattel, Abay Zhakenov, Natan Andrei
arxiv.org/abs/2508.19330

@arXiv_physicsfludyn_bot@mastoxiv.page
2025-07-25 08:21:02

Taylor–Aris dispersion of active particles in oscillatory channel flow
Bohan Wang, Weiquan Jiang, Li Zeng, Zi Wu, Ping Wang
arxiv.org/abs/2507.18241

@netzschleuder@social.skewed.de
2025-06-24 05:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@cheeaun@mastodon.social
2025-08-01 04:22:57

Interesting… 🤔 "ICU4X - Solving i18n for client-side and resource-constrained environments" icu4x.unicode.org/
> Why ICU4X?
> Small and fast
> ICU4X floats like a butterfly and stings like a bee
😅🦋🐝

@fanf@mendeddrum.org
2025-08-26 14:42:03

from my link log —
The modern text rendering pipeline: unicode, bidi, segmentation, shaping, …
newroadoldway.com/text1.html
saved 2025-06-24

@arXiv_condmatstatmech_bot@mastoxiv.page
2025-06-23 10:01:30

Microcanonical simulated annealing: Massively parallel Monte Carlo simulations with sporadic random-number generation
M. Bernaschi, L. A. Fernandez, I. González-Adalid Pemartín, E. Marinari, V. Martin-Mayor, G. Parisi, F. Ricci-Tersenghi, J. J. Ruiz-Lorenzo, D. Yllanes
arxiv.org/abs/2506.16240

@arXiv_csAI_bot@mastoxiv.page
2025-08-18 08:44:50

Landmark-Assisted Monte Carlo Planning
David H. Chan, Mark Roberts, Dana S. Nau
arxiv.org/abs/2508.11493 arxiv.org/pdf/2508.11493

@arXiv_statML_bot@mastoxiv.page
2025-07-25 09:27:02

Euclidean Distance Deflation Under High-Dimensional Heteroskedastic Noise
Keyi Li, Yuval Kluger, Boris Landa
arxiv.org/abs/2507.18520 arxiv…

@masta@noc.social
2025-06-23 12:03:18

It has been 0 days since the last unicode fuckup ....
Man, it's 2025 and there are big companies that still can't manage a clean, consistent encoding of non-ASCII characters.

@arXiv_csCV_bot@mastoxiv.page
2025-07-17 10:28:50

MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding
Renjie Li, Ruijie Ye, Mingyang Wu, Hao Frank Yang, Zhiwen Fan, Hezhen Hu, Zhengzhong Tu
arxiv.org/abs/2507.12463

@arXiv_astrophCO_bot@mastoxiv.page
2025-08-21 07:55:20

Breaking the Baryon Density–Hubble Constant Degeneracy in Fast Radio Burst Applications with Associated Gravitational Waves
Joscha N. Jahns-Schindler, Laura G. Spitler
arxiv.org/abs/2508.14434

@arXiv_csCR_bot@mastoxiv.page
2025-08-27 09:56:53

The Double-edged Sword of LLM-based Data Reconstruction: Understanding and Mitigating Contextual Vulnerability in Word-level Differential Privacy Text Sanitization
Stephen Meisenbacher, Alexandra Klymenko, Andreea-Elena Bodea, Florian Matthes
arxiv.org/abs/2508.18976

@gwire@mastodon.social
2025-07-06 14:32:02

If your initial thought on reading about the "Initial Teaching Alphabet" is to check on Unicode status, please see: unicode.org/L2/L2025/25010-scr

@alecsargent@social.linux.pizza
2025-08-22 16:45:03

The Unicode character 🗿 (U+1F5FF) is named "Moyai", which I thought was a typo for "Moai", the stone statues on Easter Island, Chile.
Turns out "Moyai" are statues in Niijima, Japan, which were inspired by the ones on Easter Island.
This makes me a little disappointed but it makes me very happy that the Japanese like our statues.
#chile

@arXiv_csIT_bot@mastoxiv.page
2025-06-25 07:50:00

Poset-Markov Channels: Capacity via Group Symmetry
Eray Unsal Atay, Eitan Levin, Venkat Chandrasekaran, Victoria Kostina
arxiv.org/abs/2506.19305

@netzschleuder@social.skewed.de
2025-07-22 20:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@hiimmrdave@hachyderm.io
2025-06-23 05:47:27

oh you know, just being pedantic about the unicode standard in the wee hours.
as you do.

@timbray@cosocial.ca
2025-07-10 16:32:03

@… There’s some weird shit in the dusty corners of Unicode: unicode.org/charts/PDF/U1FB00.

@stf@chaos.social
2025-06-01 17:39:56

haha, i like to poison my personal data - among others - by using a random combination of unicode homoglyphs. this is a new result on a package i got delivered.

address sticker on a package, with some details blacked out. the name is partly "visible" but a warning is superposed with a box that says "You cannot use unicode text in conjunction with not standard TrueType fonts! Try to embedd TrueType" the rest is missing due to clipping.

@arXiv_csLO_bot@mastoxiv.page
2025-06-18 08:31:54

OSTRICH2: Solver for Complex String Constraints
Matthew Hague, Denghang Hu, Artur Jeż, Anthony W. Lin, Oliver Markgraf, Philipp Rümmer, Zhilin Wu
arxiv.org/abs/2506.14363

@arXiv_csCL_bot@mastoxiv.page
2025-07-08 13:59:41

Emergent Semantics Beyond Token Embeddings: Transformer LMs with Frozen Visual Unicode Representations
A. Bochkov
arxiv.org/abs/2507.04886

@arXiv_csCR_bot@mastoxiv.page
2025-06-17 09:53:21

Universal Jailbreak Suffixes Are Strong Attention Hijackers
Matan Ben-Tov, Mor Geva, Mahmood Sharif
arxiv.org/abs/2506.12880

@arXiv_mathMG_bot@mastoxiv.page
2025-07-22 16:33:57

Replaced article(s) found for math.MG. arxiv.org/list/math.MG/new
[1/1]:
- A note on Erdős matrices and Marcus–Ree inequality
Aman Kushwaha, Raghavendra Tripathi

@arXiv_statME_bot@mastoxiv.page
2025-07-08 10:49:30

A Test for Jumps in Metric-Space Conditional Means
David Van Dijcke
arxiv.org/abs/2507.04560 arxiv.org/pdf/2507.04560…

@arXiv_condmatmtrlsci_bot@mastoxiv.page
2025-06-10 18:23:50

This arxiv.org/abs/2504.09432 has been replaced.
initial toot: mastoxiv.page/@a…

@arXiv_condmatmeshall_bot@mastoxiv.page
2025-06-13 09:01:20

Slip electron flow in GaAs microscale constrictions
Daniil I. Sarypov, Dmitriy A. Pokhabov, Arthur G. Pogosov, Evgeny Yu. Zhdanov, Andrey A. Shevyrin, Alexander A. Shklyaev, Askhat K. Bakarov
arxiv.org/abs/2506.10276

@arXiv_csDC_bot@mastoxiv.page
2025-06-02 07:17:11

EmbAdvisor: Adaptive Cache Management for Sustainable LLM Serving
Yuyang Tian, Desen Sun, Yi Ding, Sihang Liu
arxiv.org/abs/2505.23970

@netzschleuder@social.skewed.de
2025-08-19 02:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@arXiv_grqc_bot@mastoxiv.page
2025-06-05 10:02:35

This arxiv.org/abs/2503.05651 has been replaced.
initial toot: mastoxiv.page/@arXiv_grqc_…

@arXiv_csCR_bot@mastoxiv.page
2025-06-18 08:30:51

Universal Jailbreak Suffixes Are Strong Attention Hijackers
Matan Ben-Tov, Mor Geva, Mahmood Sharif
arxiv.org/abs/2506.12880

@arXiv_mathST_bot@mastoxiv.page
2025-08-07 08:13:24

Computable Bounds for Strong Approximations with Applications
Haoyu Ye, Morgane Austern
arxiv.org/abs/2508.03833 arxiv.org/pdf/2508.03833…

@arXiv_csHC_bot@mastoxiv.page
2025-08-11 09:30:50

Non-programmers Assessing AI-Generated Code: A Case Study of Business Users Analyzing Data
Yuvraj Virk, Dongyu Liu
arxiv.org/abs/2508.06484

@arXiv_astrophCO_bot@mastoxiv.page
2025-06-11 09:37:45

Measurement of the Dispersion–Galaxy Cross-Power Spectrum with the Second CHIME/FRB Catalog
Haochen Wang, Kiyoshi Masui, Shion Andrew, Emmanuel Fonseca, B. M. Gaensler, R. C. Joseph, Victoria M. Kaspi, Bikash Kharel, Adam E. Lanman, Calvin Leung, Lluis Mas-Ribas, Juan Mena-Parra, Kenzie Nimmo, Aaron B. Pearlman, Ue-Li Pen, J. Xavier Prochaska, Ryan Raikman, Kaitlyn Shin, Seth R. Siegel, Kendrick M. Smith, Ingrid H. Stairs

@netzschleuder@social.skewed.de
2025-06-13 05:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@arXiv_astrophSR_bot@mastoxiv.page
2025-07-01 09:51:43

Avalanching together: A model for sympathetic flaring
Louis-Simon Guité, Paul Charbonneau, Antoine Strugarek
arxiv.org/abs/2506.23889

@arXiv_astrophEP_bot@mastoxiv.page
2025-07-04 09:55:41

A Highly Carbon-Rich Dayside and Disequilibrium Chemistry in the Ultra-Hot Jupiter WASP-19b
Suman Saha, James S. Jenkins
arxiv.org/abs/2507.02797

@arXiv_physicsaccph_bot@mastoxiv.page
2025-06-03 16:33:27

This arxiv.org/abs/2505.24299 has been replaced.
initial toot: mastoxiv.page/@arX…

@netzschleuder@social.skewed.de
2025-08-11 21:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@netzschleuder@social.skewed.de
2025-07-11 16:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@netzschleuder@social.skewed.de
2025-06-11 15:00:03

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@netzschleuder@social.skewed.de
2025-08-10 04:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@arXiv_physicsaccph_bot@mastoxiv.page
2025-06-02 07:32:31

Ion-motion simulations of a plasma-wakefield experiment at FLASHForward
D. Kalvik, P. Drobniak, F. Peña, C. A. Lindstrøm, J. Beinortaite, L. Boulton, P. Caminal, J. Garland, G. Loisch, J. B. Svensson, M. Thévenet, S. Wesch, J. Wood, J. Osterhoff, R. D'Arcy, S. Diederichs
arxiv.org/abs/2505.24299

@netzschleuder@social.skewed.de
2025-08-04 20:00:03

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang