Tootfinder

Opt-in global Mastodon full text search. Join the index!

@pbloem@sigmoid.social
2025-05-29 18:49:41

We need to stop reasoning about AI like this. Every time it can do something amazing, it's "well, that's just because its training data is so big, everything is bound to be in there."
Then, when it shows some limitation, it's "well, that's because it must not be in the training data."
It can show a shark jumping out of the moon through a computer screen. It can generate stuff that's not in the data. Also, hands counting down are obviously in t…

A screenshot from an Ars Technica article on Google's video AI. The paragraph below reads: Counting down with fingers is difficult for Veo 3, likely because it's not well-represented in the training data. Instead, hands are likely usually shown in a few positions like a fist, a five-finger open palm, a two-finger peace sign, and the number one.
@berlinbuzzwords@floss.social
2025-05-14 14:00:33

LLMs are now part of our daily work, making coding easier. Join Ivan Dolgov at this year's Berlin Buzzwords to learn how JetBrains built an in-house LLM for AI code completion in its products, covering design choices, data preparation, training, and model evaluation.
Learn more:

Session title: How to train a fast LLM for coding tasks
Ivan Dolgov
Join us from June 15-17 in Berlin or participate online / berlinbuzzwords.de
@hw@fediscience.org
2025-04-10 06:13:49

Ai2 now has a tool that lets you trace the outputs of LLMs back to their possible sources in the training material. It's very interesting.
Obviously, this only works with fully open models like their OLMo family. More info here: #LLM #OLMo2 #AI
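To make the idea concrete, here is a minimal sketch of what "tracing an output to its sources" can look like: matching verbatim word n-grams from a model's output against an indexed open training corpus. The corpus, document IDs, and function names below are made up for illustration and are not the Ai2 tool's actual API.

# Toy illustration only (not the Ai2 tool): index word n-grams of an open
# corpus, then report which output spans appear verbatim in which documents.
from collections import defaultdict

def build_ngram_index(corpus: dict[str, str], n: int = 5) -> dict[tuple, list[str]]:
    """Map every word n-gram in the corpus to the documents that contain it."""
    index = defaultdict(list)
    for doc_id, text in corpus.items():
        words = text.lower().split()
        for i in range(len(words) - n + 1):
            index[tuple(words[i:i + n])].append(doc_id)
    return index

def trace_output(output: str, index: dict[tuple, list[str]], n: int = 5):
    """Return (span, matching document IDs) for each output n-gram found in the corpus."""
    words = output.lower().split()
    matches = []
    for i in range(len(words) - n + 1):
        gram = tuple(words[i:i + n])
        if gram in index:
            matches.append((" ".join(gram), sorted(set(index[gram]))))
    return matches

# Hypothetical mini-corpus and model output, just to show the mechanics.
corpus = {
    "doc-001": "counting down with fingers is difficult for many image models",
    "doc-002": "hands are usually shown as a fist or an open palm",
}
output = "Counting down with fingers is difficult for the model to render"
index = build_ngram_index(corpus, n=5)
for span, docs in trace_output(output, index, n=5):
    print(f"{span!r} -> {docs}")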