Tootfinder

Opt-in global Mastodon full text search.

@Techmeme@techhub.social
2026-02-24 07:20:49

A look at the challenges some AI developers face in building models to extract trillions of high-quality tokens from PDFs, which are hard to parse, for training (Josh Dzieza/The Verge)
theverge.com/ai-artificial-int

@kubikpixel@chaos.social
2026-01-23 10:35:13

"PDF standard gets Brotli compression for 20 percent smaller files:
The PDF Association is introducing Brotli as a new compression filter for PDF 2.0. Tests show files that are on average 20 percent smaller than with Deflate."
Until now I didn't know what Brötli could really be used for, since it is very slow but compresses efficiently. Now I can see that the Google invention makes good sense for PDF.
🥖

@fanf@mendeddrum.org
2025-12-24 21:42:02

from my link log —
Back to the future: the story of Squeak, a practical Smalltalk written in itself.
vpri.org/pdf/tr1997001_backto.
saved 2021-05-24

@gray17@mastodon.social
2026-02-24 00:30:19

finally got around to "move my archive of scanned documents out of google drive" with the help of a lovely program "ocrmypdf", which is basically a python wrapper around tesseract and various pdf tools, but it's a really well done wrapper.
the simple invocation:
`ocrmypdf input.pdf output.pdf`
does what I want. the defaults are sensible. and now I can pdfgrep when I need to find that thing from 20 years ago that I still have for questionable "I do…

@kctipton@mas.to
2026-01-24 19:09:28

Vanderbilt Policy Accelerator - Capping-Credit-Card-Rates.pdf cdn.vanderbilt.edu/vu-URL/wp-c

@karlauerbach@sfba.social
2026-01-24 18:45:28

Resurrected!!! The source of RFK's scientific knowledge, now, once again, available to all.....
"Science Made Stupid"
chrispennello.com/tweller/Scie

@seeingwithsound@mas.to
2025-12-24 20:18:38

(2015, PDF) The rehabilitative potential of auditory to visual sensory substitution devices for the blind las.touro.edu/media/schools-an

@mxp@mastodon.acm.org
2026-02-24 09:44:59

@… However, if I set
diagram:
  engine:
    mermaid:
      mime-type:
        application/pdf: true
        image/svg+xml: false
diagram.lua fails because Inkscape doesn't find pdf2svg.
But I don't see why it even tries to call Inkscape, as mmdc can directly output PDF. The mermaid function looks goo…

@fanf@mendeddrum.org
2026-02-24 15:42:02

from my link log —
Spotting fake face masks. (FFP2/N95/KN95)
bda.org/advice/Coronavirus/Doc
saved 2021-12-23

@fanf@mendeddrum.org
2025-12-25 15:42:01

from my link log —
From collisions to chosen-prefix collisions, applied to full SHA-1.
eprint.iacr.org/2019/459.pdf
saved 2019-05-11