🧠 Bidirectional token classification: unlike autoregressive LLMs, #PrivacyFilter reads input from both directions simultaneously for deeper context awareness, catching subtle #PII that simple pattern-matching or RegEx rules miss
⚡ 1.5B parameter model with only ~50M active parameters …
Replaced article(s) found for cs.FL. https://arxiv.org/list/cs.FL/new
[1/1]:
- Linear Time Subsequence and Supersequence Regex Matching
Antoine Amarilli, Bartlomiej Dudek, Florin Manea, Tina Ringleb, Markus L. Schmid
Finally a package where I can't suspect the author of vibecoding!
"""
Version: 2026.3.32
Fixed segfault.
Version: 2026.3.31
Fixed bug again.
Version: 2026.3.30
Fixed bug.
"""
https://github.com/mrabarnett/mrab-regex/blob/5d51c75da03116e08bb6fb537fae6d8c804cc92c/changelog.txt
(It's also a horribly bad package, with tons of unmaintainable code, heavily relying on CPython internals.)
⚡ What triggered it: GitHub rolled out a new token format for GitHub App installation tokens: ghs__. Base64url encoding uses - chars, which Composer's 2021-era regex didn't permit. Any such token would fail validation and leak.
😱 Making it worse: GitHub Actions' secret masker failed to redact the token. Symfony Console wraps error output with ANSI sequences, bypassing masking. Popular actions like shivammathur/setup-php auto-register GITHUB_TOKEN — no special config needed …
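To make the two failure modes above concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the two patterns are hypothetical stand-ins (not Composer's actual regex), the token is made up, and the masker is a naive exact-substring replace, which may not match how GitHub Actions really masks secrets. It only shows why a character class without `-` rejects a base64url-style token, and why escape sequences landing inside a secret can defeat exact-match redaction.

```python
import re

# Hypothetical strict pattern: letters, digits, dot and underscore, but no hyphen.
STRICT_TOKEN_RE = re.compile(r"[.A-Za-z0-9_]+")
# Hypothetical relaxed pattern that also accepts the base64url hyphen.
RELAXED_TOKEN_RE = re.compile(r"[.A-Za-z0-9_-]+")


def is_valid(token: str, pattern: re.Pattern) -> bool:
    """Return True only if the whole token matches the pattern."""
    return pattern.fullmatch(token) is not None


def mask(output: str, secret: str) -> str:
    """Naive masker (assumption): redact exact occurrences of the secret."""
    return output.replace(secret, "***")


# Made-up token containing a hyphen, as base64url output can.
token = "ghs_abc-DEF_123"

# The strict pattern rejects the token, the relaxed one accepts it.
strict_ok = is_valid(token, STRICT_TOKEN_RE)
relaxed_ok = is_valid(token, RELAXED_TOKEN_RE)

# A plain error message is redacted as expected.
plain = f"Invalid token: {token}"
masked_plain = mask(plain, token)

# Assumed scenario: a styled, line-wrapped error block splits the token
# with ANSI escape sequences, so the exact substring never appears.
half = len(token) // 2
wrapped = (
    f"\x1b[37;41mInvalid token: {token[:half]}\x1b[0m\n"
    f"\x1b[37;41m{token[half:]}\x1b[0m"
)
masked_wrapped = mask(wrapped, token)
```

In this sketch `masked_wrapped` comes back unchanged, so both halves of the token remain readable in the log. A more defensive approach would be to never echo a rejected credential in the error message at all, rather than relying on downstream masking.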
Class too large (too many methods)
− 1 file
🟡 Design / code quality
Duplicate literals − should be constants
− ~7 files
Duplicate / suspicious code blocks
− 1 file
🟢 Minor / low-level issues
Regex improvements
− ~4 files
Redundant / unnecessary code (e.g. temp vars, inherited method)
− ~3 files