Tootfinder

Opt-in global Mastodon full text search. Join the index!

@catsalad@infosec.exchange
2025-10-19 01:43:01

U+00A0: NON-BINARY SPACE
#Unicode

@arXiv_astrophCO_bot@mastoxiv.page
2025-08-21 07:55:20

Breaking the Baryon Density–Hubble Constant Degeneracy in Fast Radio Burst Applications with Associated Gravitational Waves
Joscha N. Jahns-Schindler, Laura G. Spitler
arxiv.org/abs/2508.14434

@gideonstar@mastodon.gideonstar.de
2025-09-20 12:23:56

#encryption

@netzschleuder@social.skewed.de
2025-08-19 02:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@arXiv_csAI_bot@mastoxiv.page
2025-08-18 08:44:50

Landmark-Assisted Monte Carlo Planning
David H. Chan, Mark Roberts, Dana S. Nau
arxiv.org/abs/2508.11493 arxiv.org/pdf/2508.11493

@mgorny@social.treehouse.systems
2025-10-13 11:12:08

Do you want to stop putting dots or some random stuff in forms to skip required fields? The website is too smart and rejects all #Unicode spaces?
You can always use a U+200E Left-To-Right Mark.
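A minimal Python sketch of why this trick can work, assuming the form's check only strips whitespace. The `looks_blank` and `looks_blank_strict` helpers are hypothetical and not taken from any real site's validation. U+00A0 counts as whitespace, while U+200E is a format character (general category Cf) and survives the strip.

```python
import unicodedata

NBSP = "\u00a0"  # NO-BREAK SPACE, general category Zs
LRM = "\u200e"   # LEFT-TO-RIGHT MARK, general category Cf


def looks_blank(value: str) -> bool:
    """Hypothetical naive check: empty once whitespace is stripped."""
    return value.strip() == ""


def looks_blank_strict(value: str) -> bool:
    """Stricter variant: also ignore invisible format characters (Cf)."""
    visible = [ch for ch in value if unicodedata.category(ch) != "Cf"]
    return "".join(visible).strip() == ""


if __name__ == "__main__":
    for label, sample in [("NBSP", NBSP), ("LRM", LRM)]:
        print(label, unicodedata.category(sample),
              looks_blank(sample), looks_blank_strict(sample))
```

A site that wanted to close this loophole could use something like the stricter variant, at the cost of rejecting input that consists only of format characters.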

@arXiv_csCR_bot@mastoxiv.page
2025-09-15 08:45:21

Why Data Anonymization Has Not Taken Off
Matthew J. Schneider, James Bailie, Dawn Iacobucci
arxiv.org/abs/2509.10165 arxiv.org/pdf/2509.101…

@netzschleuder@social.skewed.de
2025-10-18 07:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@cheeaun@mastodon.social
2025-08-01 04:22:57

Interesting… 🤔 "ICU4X - Solving i18n for client-side and resource-constrained environments" icu4x.unicode.org/
> Why ICU4X?
> Small and fast
> ICU4X floats like a butterfly and stings like a bee
😅🦋🐝

@arXiv_nuclex_bot@mastoxiv.page
2025-09-16 08:51:36

New probes of nuclear gluon dynamics through photoproduction of charm in inelastic ultra-peripheral Pb$\unicode{x2014}$Pb collisions with ALICE
Sigurd Nese (for the ALICE Collaboration)
arxiv.org/abs/2509.11814

@arXiv_quantph_bot@mastoxiv.page
2025-09-18 09:36:41

From virtual Z gates to virtual Z pulses
Christopher K. Long, Crispin H. W. Barnes
arxiv.org/abs/2509.13453 arxiv.org/pdf/2509.13453

@trochee@dair-community.social
2025-09-03 05:45:58

This list of Unicode character name errata feels like back-matter for some SF novel
unicode.org/notes/tn27/
Via @…

Screen capture

U+0F0A TIBETAN MARK BKA- SHOG YIG MGO

This character is used to indicate that a document is addressed to a superior (the "petition honorific"), but the Tibetan name actually indicates a superior addressing an inferior ("starting flourish for giving a command").
U+0F0B TIBETAN MARK INTERSYLLABIC TSHEG

The tsheg mark is not restricted to intersyllabic usage, and would have been better named Tibetan mark tsheg.
U+0F0C TIBETAN MARK DELIMITER TSHEG BSTAR

This character is not a de…

@fanf@mendeddrum.org
2025-09-13 19:39:29

the unicode standard was a little slow to catch on to the dangers of overlong UTF-8
lobste.rs/s/mlbsfi/utf_8_is_br
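For readers unfamiliar with the term, here is a small Python sketch of what "overlong" means. It is an illustration only and is not taken from the linked discussion. The `naive_decode_two_byte` helper is hypothetical and deliberately unsafe: it shows how a decoder without a minimum-value check maps the two bytes C0 AF to "/", while Python's built-in codec rejects them.

```python
from typing import Optional

OVERLONG_SLASH = b"\xc0\xaf"  # 2-byte overlong encoding of "/" (U+002F)


def naive_decode_two_byte(data: bytes) -> str:
    """Decode one 2-byte sequence by bit arithmetic only, with no range check.

    Deliberately unsafe: it accepts overlong forms.
    """
    lead, cont = data
    return chr(((lead & 0x1F) << 6) | (cont & 0x3F))


def strict_decode(data: bytes) -> Optional[str]:
    """Decode with Python's built-in UTF-8 codec; return None on invalid input."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    print(repr(naive_decode_two_byte(OVERLONG_SLASH)))
    print(strict_decode(OVERLONG_SLASH))
```

The danger is that a filter scanning raw bytes for "/" would miss the overlong form, while a lenient decoder further down the line would still produce the slash.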

@timbray@cosocial.ca
2025-08-23 10:29:54

Three small announcements:
1. RFC 9839, a guide to which Unicode characters you should never use: rfc-editor.org/rfc/rfc9839.htm
2. Blog piece with background and context, “RFC 9839 and Bad Unicode”:

@arXiv_csLG_bot@mastoxiv.page
2025-09-16 12:46:07

All that structure matches does not glitter
Maya M. Martirossyan, Thomas Egg, Philipp Hoellmer, George Karypis, Mark Transtrum, Adrian Roitberg, Mingjie Liu, Richard G. Hennig, Ellad B. Tadmor, Stefano Martiniani
arxiv.org/abs/2509.12178

@publicvoit@graz.social
2025-08-23 08:08:24

«Unicode is good. If you’re designing a data structure or protocol that has text fields, they should contain #Unicode characters encoded in #UTF8. There’s another question, though: “Which Unicode characters?” The answer is “Not all of them, please exclude some.”
This issue keeps coming up, so [

@timbray@cosocial.ca
2025-08-29 16:14:17

A few days back I tweeted and blogged about the brand-new RFC 9839, and also published the first draft of a tiny Go library to help enforce the subsets defined in the RFC. Got lots of useful input on the library and have progressed it enough to do a v0.8.0 release: github.com/timbray/rfc9839
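A rough Python sketch of the kind of check such a library might perform. This is not the Go library's API, and the ranges below are an assumption based on the categories commonly called problematic (surrogates, legacy control codes, noncharacters). The function names are hypothetical; consult RFC 9839 itself for the normative subset definitions.

```python
from typing import List, Tuple


def is_problematic(cp: int) -> bool:
    """Assumed classification, not the normative RFC 9839 grammar."""
    if 0xD800 <= cp <= 0xDFFF:  # surrogates
        return True
    if cp < 0x20 and cp not in (0x09, 0x0A, 0x0D):  # C0 controls except tab, LF, CR
        return True
    if 0x7F <= cp <= 0x9F:  # DEL and C1 controls
        return True
    if 0xFDD0 <= cp <= 0xFDEF:  # noncharacter block
        return True
    if (cp & 0xFFFE) == 0xFFFE:  # last two code points of every plane
        return True
    return False


def find_problematic(text: str) -> List[Tuple[int, str]]:
    """Return (index, 'U+XXXX') for each flagged code point in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if is_problematic(ord(ch))
    ]


if __name__ == "__main__":
    print(find_problematic("ok\ttext\n"))
    print(find_problematic("a\x00b\ufffe"))
```

Reporting positions rather than silently stripping characters leaves the decision (reject, replace, or log) to the caller.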

@mgorny@social.treehouse.systems
2025-10-01 07:29:43

Honestly, #emoji and icons in #Unicode are a true horror.
Yeah, sure. It's great that you don't have to use <img/> anymore and you can just paste a random Unicode character. You can get graphics into fields where only text was originally intended (like bug summaries). Even better, you can now easily get cool colorful icons on terminal with almost no effort.
However, it is an #accessibility nightmare. People are now encoding *information* in random graphical symbols. Symbols that require huge fonts to render, or huge character tables to describe.
Yeah, a bare <img/> carrying information sucks. However, you can add a *meaningful* alt-text to the image, and accessibility tools can use that text to provide meaningful context. Like "bug fix".
However, emojis and icons are symbolic. The best you can get is some description like "hammer and wrench", so people can kinda figure out that it's probably a "bug fix". Or maybe it was a "maintenance task"? Or you'll get an "unknown character 0x1F6E0". And I'm sure people will enjoy cross-referencing a "legend" of such "unknown characters".
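A small Python sketch of the lookup described above, using the standard `unicodedata` module. The `describe` helper and its fallback wording are hypothetical; real accessibility tools may use richer, localized annotation data rather than raw character names.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return a lowercase Unicode character name, or a fallback for unnamed code points."""
    name = unicodedata.name(ch, None)
    if name is None:
        return f"unknown character 0x{ord(ch):X}"
    return name.lower()


if __name__ == "__main__":
    print(describe("\U0001F6E0"))  # the icon discussed above
    print(describe("\ue000"))      # a private-use code point has no name
```

Either way, the tool can only report what the symbol depicts, not what the author meant by it.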

@fanf@mendeddrum.org
2025-07-27 08:42:03

from my link log —
libu8ident: Unicode security guidelines for programming language identifiers.
github.com/rurban/libu8ident
saved 2025-02-13

@azonenberg@ioc.exchange
2025-08-30 18:31:01

Unicode folks: why are there emoji for "spoon" and combined "fork knife" but no standalone fork and knife?

@cheeaun@mastodon.social
2025-10-09 02:03:21

Test #emojis:
Emoji 1.0 (emojipedia.org/emoji-1.0): 😀😃😄😁😆
Emoji 17.0 (

@arXiv_csHC_bot@mastoxiv.page
2025-08-11 09:30:50

Non-programmers Assessing AI-Generated Code: A Case Study of Business Users Analyzing Data
Yuvraj Virk, Dongyu Liu
arxiv.org/abs/2508.06484

@arXiv_mathST_bot@mastoxiv.page
2025-08-07 08:13:24

Computable Bounds for Strong Approximations with Applications
Haoyu Ye, Morgane Austern
arxiv.org/abs/2508.03833 arxiv.org/pdf/2508.03833…

@netzschleuder@social.skewed.de
2025-09-12 14:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@arXiv_physicsfludyn_bot@mastoxiv.page
2025-07-25 08:21:02

Taylor–Aris dispersion of active particles in oscillatory channel flow
Bohan Wang, Weiquan Jiang, Li Zeng, Zi Wu, Ping Wang
arxiv.org/abs/2507.18241

@arXiv_astrophEP_bot@mastoxiv.page
2025-08-28 09:18:01

Evidence of Titanate Clouds in the Day-side Atmosphere of the Ultra-Hot Jupiter WASP-121b
Suman Saha, James S. Jenkins
arxiv.org/abs/2508.20022

@arXiv_mathOC_bot@mastoxiv.page
2025-10-01 07:52:17

Inverse Optimal Feedback and Gain Margins for Unicycle Stabilization
Kwang Hak Kim, Velimir Todorovski, Miroslav Krstić
arxiv.org/abs/2509.25563

@adlerweb@social.adlerweb.info
2025-08-29 23:27:46

"How do you want the heart? <3 or something else?"
"Unicode!"
"What's that?"
#wampleaks

@arXiv_csCR_bot@mastoxiv.page
2025-10-08 09:50:59

The Five Safes as a Privacy Context
James Bailie, Ruobin Gong
arxiv.org/abs/2510.05803 arxiv.org/pdf/2510.05803

@arXiv_csCL_bot@mastoxiv.page
2025-10-07 12:18:22

Imperceptible Jailbreaking against Large Language Models
Kuofeng Gao, Yiming Li, Chao Du, Xin Wang, Xingjun Ma, Shu-Tao Xia, Tianyu Pang
arxiv.org/abs/2510.05025

@arXiv_mathFA_bot@mastoxiv.page
2025-09-29 08:07:27

Gamma-Convergence of Convex Functions, Conjugates, and Subdifferentials
Rafael Correa, Pedro Pérez-Aros, José Pablo Santander
arxiv.org/abs/2509.21863

@arXiv_quantph_bot@mastoxiv.page
2025-10-13 09:11:50

The charge-singlet measurement toolbox
Abhijit Chakraborty, Randy Lewis, Christine A. Muschik
arxiv.org/abs/2510.08718 arxiv.org/pdf/2510.0…

@netzschleuder@social.skewed.de
2025-08-11 21:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@cheeaun@mastodon.social
2025-09-09 13:52:59

And… the bundle size for JS emoji pickers will increase for every site… every year… 📈🤷‍♂️
mastodon.world/@emojipedia/115

Chart showing the total number of emojis approved by Unicode each year from 2010 to 2025, with a cumulative increase reaching nearly 4,000 emojis by September 2025. The chart differentiates between new emojis approved each year and the total approved emojis.

@arXiv_condmatstrel_bot@mastoxiv.page
2025-08-28 08:34:11

Thermodynamics in a split Hilbert space: Quantum impurity at the edge of a one-dimensional superconductor
Pradip Kattel, Abay Zhakenov, Natan Andrei
arxiv.org/abs/2508.19330

@Erikmitk@mastodon.gamedev.place
2025-09-24 15:35:11

When I highlight text in #Confluence and copy it, my clipboard manager warns me that the clipping is in the double digit MBs (yes, with an M! 42 in this case.).
`osascript -e 'clipboard info'`
«class weba», 42079409, «class RTF », 650, «class HTML», 867, «class utf8», 244, «class ut16», 490, string, 244, Unicode text, 488
The heck is it copying text in a weba forma…

@netzschleuder@social.skewed.de
2025-08-10 04:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@alecsargent@social.linux.pizza
2025-08-22 16:45:03

The Unicode character 🗿 (U+1F5FF) is named "Moyai", which I thought was a typo for "Moai", the stone statues on Easter Island, Chile.
It turns out "Moyai" are statues in Niijima, Japan, which were inspired by the ones on Easter Island.
This makes me a little disappointed, but it makes me very happy that the Japanese like our statues.
#chile

@arXiv_statML_bot@mastoxiv.page
2025-07-25 09:27:02

Euclidean Distance Deflation Under High-Dimensional Heteroskedastic Noise
Keyi Li, Yuval Kluger, Boris Landa
arxiv.org/abs/2507.18520 arxiv…

@arXiv_quantph_bot@mastoxiv.page
2025-09-01 09:38:22

Optimizing sparse quantum state preparation with measurement and feedforward
Yao-Cheng Lu, Han-Hsuan Lin
arxiv.org/abs/2508.21346 arxiv.org…

@arXiv_csCR_bot@mastoxiv.page
2025-08-25 08:04:50

Unveiling Unicode's Unseen Underpinnings in Undermining Authorship Attribution
Robert Dilworth
arxiv.org/abs/2508.15840 arxiv.org/pdf/2…

@fanf@mendeddrum.org
2025-08-26 14:42:03

from my link log —
The modern text rendering pipeline: unicode, bidi, segmentation, shaping, …
newroadoldway.com/text1.html
saved 2025-06-24

@netzschleuder@social.skewed.de
2025-09-04 17:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@netzschleuder@social.skewed.de
2025-08-04 20:00:03

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@arXiv_mathMG_bot@mastoxiv.page
2025-07-22 16:33:57

Replaced article(s) found for math.MG. arxiv.org/list/math.MG/new
[1/1]:
- A note on Erdős matrices and Marcus–Ree inequality
Aman Kushwaha, Raghavendra Tripathi

@netzschleuder@social.skewed.de
2025-09-26 18:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@arXiv_csCR_bot@mastoxiv.page
2025-08-27 09:56:53

The Double-edged Sword of LLM-based Data Reconstruction: Understanding and Mitigating Contextual Vulnerability in Word-level Differential Privacy Text Sanitization
Stephen Meisenbacher, Alexandra Klymenko, Andreea-Elena Bodea, Florian Matthes
arxiv.org/abs/2508.18976

@netzschleuder@social.skewed.de
2025-09-22 16:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang

@netzschleuder@social.skewed.de
2025-07-22 20:00:04

unicodelang: Languages spoken by country (2015)
A bipartite network of languages and the countries in which they are spoken, as estimated by Unicode. Edges are weighted by the proportion of the given country's population that is literate in a particular language.
This network has 868 nodes and 1255 edges.
Tags: Informational, Relatedness, Weighted

unicodelang: Languages spoken by country (2015). 868 nodes, 1255 edges. https://networks.skewed.de/net/unicodelang