Moving on to the 1990s: statistical n-gram language models, trained on vast text collections, became the backbone of NLP research. They fueled advances in nearly every NLP technique of the era, laying the groundwork for today's AI. (A toy bigram sketch follows below.)
F. Jelinek (1997). Statistical Methods for Speech Recognition. MIT Press, Cambridge, MA.
#NLP
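A minimal sketch of such a count-based bigram model; the corpus is a toy example and the maximum-likelihood estimates are unsmoothed, purely for illustration:

```python
# Toy statistical bigram language model: estimate P(w_i | w_{i-1}) from counts.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate".split()

# Count how often each word follows each preceding word.
bigram_counts = defaultdict(Counter)
for prev, curr in zip(corpus, corpus[1:]):
    bigram_counts[prev][curr] += 1

def bigram_prob(prev: str, curr: str) -> float:
    """Maximum-likelihood estimate of P(curr | prev); 0.0 for unseen histories."""
    total = sum(bigram_counts[prev].values())
    return bigram_counts[prev][curr] / total if total else 0.0

print(bigram_prob("the", "cat"))  # 2/3 in this toy corpus
```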
The Hype Index: an NLP-driven Measure of Market News Attention
Zheng Cao, Wanchaloem Wunkaew, Helyette Geman
https://arxiv.org/abs/2506.06329
Next stop in our NLP timeline is 2013 and the introduction of low-dimensional dense word vectors - so-called "word embeddings" - grounded in distributional semantics, such as word2vec by Mikolov et al. at Google, which enabled representation learning on text. (A minimal training sketch follows the reference below.)
T. Mikolov et al. (2013). Efficient Estimation of Word Representations in Vector Space.
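A minimal sketch of training such embeddings, assuming the gensim library; the corpus and all hyperparameters are illustrative, not those of the original paper:

```python
# Train word2vec-style embeddings on a toy corpus with gensim.
from gensim.models import Word2Vec

sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "cat", "sat", "on", "the", "mat"],
]

model = Word2Vec(
    sentences,
    vector_size=50,  # dimensionality of the dense word vectors
    window=2,        # context window size
    min_count=1,     # keep every word in this tiny corpus
    sg=1,            # 1 = skip-gram, 0 = CBOW
)

# Each word is now a low-dimensional dense vector usable as a learned representation.
print(model.wv["king"].shape)                 # (50,)
print(model.wv.similarity("king", "queen"))   # cosine similarity of two embeddings
```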
…
Optimizing Storytelling, Improving Audience Retention, and Reducing Waste in the Entertainment Industry
Andrew Cornfeld, Ashley Miller, Mercedes Mora-Figueroa, Kurt Samuels, Anthony Palomba
https://arxiv.org/abs/2506.00076
With the advent of ELIZA, Joseph Weizenbaum's pioneering psychotherapist chatbot, NLP took another major step: pattern-matching and substitution algorithms built on simple regular expressions. (A tiny rule-based sketch follows below.)
J. Weizenbaum (1966). ELIZA—A Computer Program for the Study of Natural Language Communication Between Man and Machine. Communications of the ACM, 9(1): 36–45.
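A minimal sketch of the idea in Python, with a few made-up ELIZA-style rules (not Weizenbaum's original script):

```python
# ELIZA-style response generation: match keyword patterns, substitute the
# captured fragment into a canned reply template.
import re

RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]

def respond(utterance: str) -> str:
    """Return the reply for the first matching rule, else a neutral prompt."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(match.group(1))
    return "Please go on."

print(respond("I am sad about my job"))
# Why do you say you are sad about my job?
```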
Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation
Yuan Gan, Jiaxu Miao, Yunze Wang, Yi Yang
https://arxiv.org/abs/2506.01591
Query Drift Compensation: Enabling Compatibility in Continual Learning of Retrieval Embedding Models
Dipam Goswami, Liying Wang, Bartłomiej Twardowski, Joost van de Weijer
https://arxiv.org/abs/2506.00037
Enhancing Automatic PT Tagging for MEDLINE Citations Using Transformer-Based Models
Victor H. Cid, James Mork
https://arxiv.org/abs/2506.03321
This paper, https://arxiv.org/abs/2504.15448, has been replaced.
initial toot: https://mastoxiv.page/@arXiv_eco…