2025-09-19 07:44:21
Tokenization Strategies for Low-Resource Agglutinative Languages in Word2Vec: Case Study on Turkish and Finnish
Jinfan Frank Hu
https://arxiv.org/abs/2509.14238
Variable Rate Image Compression via N-Gram Context based Swin-transformer
Priyanka Mudgal, Feng Liu
https://arxiv.org/abs/2510.00058 https://arxiv.org/pdf/…
Regular Expression Indexing for Log Analysis. Extended Version
Ling Zhang, Shaleen Deep, Jignesh M. Patel, Karthikeyan Sankaralingam
https://arxiv.org/abs/2510.10348
Sums of projections with random coefficients
Leonid Pastur, Alexander Pushnitski
https://arxiv.org/abs/2509.21539 https://arxiv.org/pdf/2509.21539
Resolution scaling governs DINOv3 transfer performance in chest radiograph classification
Soroosh Tayebi Arasteh, Mina Shaigan, Christiane Kuhl, Jakob Nikolas Kather, Sven Nebelung, Daniel Truhn
https://arxiv.org/abs/2510.07191
Death of the Novel(ty): Beyond n-Gram Novelty as a Metric for Textual Creativity
Arkadiy Saakyan, Najoung Kim, Smaranda Muresan, Tuhin Chakrabarty
https://arxiv.org/abs/2509.22641
A Formal Framework for Fluency-based Multi-Reference Evaluation in Grammatical Error Correction
Eitan Klinger, Zihao Huang, Tran Minh Nguyen, Emma Jayeon Park, Yige Chen, Yang Gu, Qingyu Gao, Siliang Liu, Mengyang Qiu, Jungyeul Park
https://arxiv.org/abs/2510.06749
RadEval: A framework for radiology text evaluation
Justin Xu, Xi Zhang, Javid Abderezaei, Julie Bauml, Roger Boodoo, Fatemeh Haghighi, Ali Ganjizadeh, Eric Brattain, Dave Van Veen, Zaiqiao Meng, David Eyre, Jean-Benoit Delbrouck
https://arxiv.org/abs/2509.18030
Dorabella Cipher as Musical Inspiration
Bradley Hauer, Colin Choi, Abram Hindle, Scott Smallwood, Grzegorz Kondrak
https://arxiv.org/abs/2509.17950