Tootfinder

Opt-in global Mastodon full text search.

@mszll@datasci.social
2024-11-15 21:42:56

Is "decentralizedwashing" a term yet? See recent #bluesky discussions like: social.wildeboer.net/@jwildebo

@theawely@mamot.fr
2024-12-13 18:48:37

Excited about the new xLSTM model release. There are many well-thought-out design choices compared to transformers: recurrence (which should allow composability), gating (as in Mamba and the LSTM it is based on, which allows time complexity independent of the input size), and state tracking (unlike Mamba and transformers). For now, these advantages aren’t apparent on benchmarks, but most training techniques are kept secret, and the recent advances in LLMs have shown that they matter a lot.