Tootfinder

Opt-in global Mastodon full text search. Join the index!

@lysander07@sigmoid.social
2025-05-17 07:38:59

In our #ISE2025 lecture last Wednesday, we learned how n-gram language models use the Markov assumption and maximum likelihood estimation to predict the probability of a word occurring in a given context (i.e. the preceding n-1 words in the sequence).
#NLP

Slide from the Information Service Engineering 2025 lecture, 03 Natural Language Processing 02, 2.9, Language Models:
Title: N-Gram Language Model
The probability of a sequence of words can be computed via conditional probability and the Bayes rule (including the chain rule for n words). Approximation is performed via the Markov assumption (dependency only on the last n words) and maximum likelihood estimation (approximating the probabilities of a sequence of words by counting and normalising …
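
For readers who want the formulas spelled out, this is the standard formulation the slide alludes to (written out here for reference, not copied verbatim from the slide):

P(w_1, \dots, w_N) = \prod_{i=1}^{N} P(w_i \mid w_1, \dots, w_{i-1}) \quad \text{(chain rule)}

P(w_i \mid w_1, \dots, w_{i-1}) \approx P(w_i \mid w_{i-n+1}, \dots, w_{i-1}) \quad \text{(Markov assumption)}

P(w_i \mid w_{i-n+1}, \dots, w_{i-1}) \approx \frac{C(w_{i-n+1} \cdots w_i)}{C(w_{i-n+1} \cdots w_{i-1})} \quad \text{(maximum likelihood estimate, with } C \text{ denoting corpus counts)}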
@arXiv_csCL_bot@mastoxiv.page
2025-07-03 10:16:00

Decision-oriented Text Evaluation
Yu-Shiang Huang, Chuan-Ju Wang, Chung-Chi Chen
arxiv.org/abs/2507.01923

@arXiv_eessAS_bot@mastoxiv.page
2025-05-30 07:22:13

NGPU-LM: GPU-Accelerated N-Gram Language Model for Context-Biasing in Greedy ASR Decoding
Vladimir Bataev, Andrei Andrusenko, Lilit Grigoryan, Aleksandr Laptev, Vitaly Lavrukhin, Boris Ginsburg
arxiv.org/abs/2505.22857

@arXiv_csCR_bot@mastoxiv.page
2025-06-23 09:22:59

Malware Classification Leveraging NLP & Machine Learning for Enhanced Accuracy
Bishwajit Prasad Gond, Rajneekant, Pushkar Kishore, Durga Prasad Mohapatra
arxiv.org/abs/2506.16224

@lysander07@sigmoid.social
2025-05-19 14:04:32

Generating Shakespeare-like text with an n-gram language model is straightforward. But don't expect too much of it. It will not be able to recreate a lost Shakespeare play for you ;-) It's merely a parrot, making up well-sounding sentences out of fragments of original Shakespeare texts...
#ise2025

Slide from the Information Service Engineering lecture 04, Natural Language Processing 03, 2.9 Language Models, N-Gram Shakespeare Generation.
The background of the slide shows an AI-generated portrait of William Shakespeare as an ink drawing. There are 4 speech-bubbles around Shakespeare's head, representing artificially generated text based on 1-grams, 2-grams, 3-grams and 4-grams: '
1-gram: To him swallowed confess hear both. Which. Of save on trail for are ay device and rote life have Hill…
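
To make the generation idea concrete, here is a minimal sketch in Python. The toy corpus and all names are illustrative assumptions; the lecture example would be trained on the complete Shakespeare texts:

import random
from collections import defaultdict

# Toy corpus standing in for the Shakespeare training text (assumption:
# the real model would be trained on the complete works).
corpus = (
    "to be or not to be that is the question "
    "whether tis nobler in the mind to suffer "
    "the slings and arrows of outrageous fortune"
).split()

n = 3  # trigram model: condition on the previous 2 words

# Count n-gram continuations: context of n-1 words -> observed next words.
counts = defaultdict(list)
for i in range(len(corpus) - n + 1):
    counts[tuple(corpus[i:i + n - 1])].append(corpus[i + n - 1])

def generate(seed, length=15):
    # Sampling uniformly from the list of observed continuations is
    # equivalent to sampling from the maximum-likelihood distribution.
    out = list(seed)
    for _ in range(length):
        nxt = counts.get(tuple(out[-(n - 1):]))
        if not nxt:  # unseen context: the parrot falls silent
            break
        out.append(random.choice(nxt))
    return " ".join(out)

print(generate(("to", "be")))

With larger n the output sounds more Shakespearean but is increasingly stitched together from verbatim fragments of the training text, which is exactly the parrot effect described above.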
@arXiv_csCL_bot@mastoxiv.page
2025-06-17 09:33:51

Infini-gram mini: Exact n-gram Search at the Internet Scale with FM-Index
Hao Xu, Jiacheng Liu, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi
arxiv.org/abs/2506.12229

@lysander07@sigmoid.social
2025-05-28 05:10:40

Last week, we continued our #ISE2025 lecture on distributional semantics with the introduction of neural language models (NLMs) and compared them to traditional statistical n-gram models.
Benefits of NLMs:
- Capturing Long-Range Dependencies
- Computational and Statistical Tractability
- Improved Generalisation
- Higher Accuracy
@…

The image illustrates the architecture of a Neural Language Model, specifically focusing on Word Vectors II - Neural Language Models. It is part of a presentation on Natural Language Processing, created by the Karlsruhe Institute of Technology (KIT) and FIZ Karlsruhe, as indicated by their logos in the top right corner.

The diagram shows a neural network processing an input word embedding, represented by the phrase "to be or not to." The input is transformed into a d-sized vector representatio…
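
To make the described architecture concrete, here is a minimal sketch of such a feed-forward (Bengio-style) neural language model in plain numpy. The tiny vocabulary and all dimensions are illustrative assumptions, and the weights are random and untrained, so this shows only the forward pass, not the actual model from the slide:

import numpy as np

rng = np.random.default_rng(0)

# Illustrative vocabulary and hyper-parameters (assumptions, not from the slide).
vocab = ["to", "be", "or", "not", "that", "is"]
V, d, h, n_ctx = len(vocab), 8, 16, 5  # vocab size, embedding dim, hidden dim, context length
idx = {w: i for i, w in enumerate(vocab)}

E = rng.normal(0, 0.1, (V, d))           # word embedding matrix
W1 = rng.normal(0, 0.1, (n_ctx * d, h))  # concatenated embeddings -> hidden layer
W2 = rng.normal(0, 0.1, (h, V))          # hidden layer -> output logits

def next_word_distribution(context):
    # Embed each context word into a d-sized vector, concatenate,
    # apply a tanh hidden layer, and softmax over the full vocabulary.
    x = np.concatenate([E[idx[w]] for w in context])
    hidden = np.tanh(x @ W1)
    logits = hidden @ W2
    p = np.exp(logits - logits.max())
    return p / p.sum()

# The slide's example input: "to be or not to"
p = next_word_distribution(["to", "be", "or", "not", "to"])
print({w: round(float(p[idx[w]]), 3) for w in vocab})

Unlike the n-gram model above, every context word contributes through its learned embedding, which is what gives NLMs their improved generalisation and longer-range sensitivity.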
@arXiv_qbiobm_bot@mastoxiv.page
2025-06-17 11:42:50

In Vitro Antibacterial activity of hexane, Chloroform and methanolic extracts of different parts of Acronychia pedunculata grown in Sri Lanka
R. D. Nimantha Karunathilaka, Athige Rajith Niloshan Silva, Chathuranga Bharathee Ranaweera, D. M. R. K. Dissanayake, N. R. M. Nelumdeniya, Ranjith Pathirana, W. D. Ratnasooriya
ar…

@arXiv_csCL_bot@mastoxiv.page
2025-06-19 08:16:24

Oldies but Goldies: The Potential of Character N-grams for Romanian Texts
Dana Lupsa, Sanda-Maria Avram
arxiv.org/abs/2506.15650

@lysander07@sigmoid.social
2025-05-09 08:41:35

Starting in the 1990s, statistical n-gram language models, trained on vast text collections, became the backbone of NLP research. They fueled advancements in nearly all NLP techniques of the era, laying the groundwork for today's AI.
F. Jelinek (1997), Statistical Methods for Speech Recognition, MIT Press, Cambridge, MA
#NLP

Slide from Information Service Engineering 2025, Lecture 02, Natural Language Processing 01, A Brief History of NLP, NLP timeline. The timeline is located in the middle of the slide from top to bottom. The pointer on the timeline indicates the 1990s. On the left, the formula for the conditional probability of a word, given a preceding series of words, is shown. Below, an AI-generated portrait of William Shakespeare is displayed with 4 speech bubbles, representing artificially generated tex…