Tootfinder

No exact results. Similar results found.

@arXiv_csCL_bot@mastoxiv.page
2025-06-03 08:20:53

Esoteric Language Models
Subham Sekhar Sahoo, Zhihan Yang, Yash Akhauri, Johnna Liu, Deepansha Singh, Zhoujun Cheng, Zhengzhong Liu, Eric Xing, John Thickstun, Arash Vahdat
https://arxiv.org/abs/2506.01928

Diffusion-based language models offer a compelling alternative to autoregressive (AR) models by enabling parallel and controllable generation. Among this family of models, Masked Diffusion Models (MDMs) achieve the strongest performance but still underperform AR models in perplexity and lack key inference-time efficiency features--most notably, KV caching. In this work, we introduce Eso-LMs, a new family of models that fuses AR and MDM paradigms, enabling smooth interpolation between their perp…

@arXiv_csRO_bot@mastoxiv.page
2025-06-03 08:04:16

Humanoid World Models: Open World Foundation Models for Humanoid Robotics
Muhammad Qasim Ali, Aditya Sridhar, Shahbuland Matiana, Alex Wong, Mohammad Al-Sharman
https://arxiv.org/abs/2506.01182

Humanoid robots have the potential to perform complex tasks in human-centered environments but require robust predictive models to reason about the outcomes of their actions. We introduce Humanoid World Models (HWM), a family of lightweight, open-source, video-based models that forecast future egocentric observations conditioned on actions. We train two types of generative models, Masked Transformers and FlowMatching, on 100 hours of humanoid demonstrations. Additionally, we explore architectural var…

@arXiv_csLG_bot@mastoxiv.page
2025-06-03 22:01:41

This https://arxiv.org/abs/2505.23527 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…

Normalizing Flows are Capable Models for RL
Modern reinforcement learning (RL) algorithms have found success by using powerful probabilistic models, such as transformers, energy-based models, and diffusion/flow-based models. To this end, RL researchers often choose to pay the price of accommodating these models into their algorithms -- diffusion models are expressive, but are computationally intensive due to their reliance on solving differential equations, while autoregressive transformer models are scalable but typically require learni…

@UP8@mastodon.social
2025-05-02 15:14:55

☠️ One-sixth of the planet’s cropland has toxic levels of one or more metals
https://english.elpais.com/science-tech/2025-04-17/one-sixth-of-the-planets-cropland-has-toxic-levels-of-one-or-more-metals.html

A review of tens of thousands of soil samples from Earth reveals high concentrations of arsenic, cadmium, and lead in the pedosphere

@arXiv_mathAG_bot@mastoxiv.page
2025-07-02 08:08:50

Linkage of sheaves of modules
Farhad Rahmati, Khadijeh Sayyari
https://arxiv.org/abs/2507.00200 https://arxiv.org/pdf/2507.00200

Inspired by work on the linkage theory of modules, we define the concept of linkage of sheaves of modules as a generalization of linkage of modules, thereby expressing it in the language of algebraic geometry. We show that the linkedness of sheaves is a local property. As an important result, we show that the sheaf of modules obtained by gluing schemes and gluing linked sheaves of modules is a linked sheaf. Also, it has been shown that for every sheaf of modules on non-domain, it is possible to…

@AimeeMaroux@mastodon.social
2025-06-03 01:16:57

Content warning:

This week's #MythologyMonday theme is #sexwork! Acca Larentia was a mythical woman in Roman mythology. She was a beautiful prostitute (scortum) of roughly the same age as Romulus and Remus. She was awarded to Hercules as a prize in a game of dice by the guardian of his temple, and locked…

Roman fresco of Acca Larentia. She is wearing a long, pink dress and raises both hands as if speaking but most of the painting is sadly destroyed.

@arXiv_csAI_bot@mastoxiv.page
2025-06-03 07:19:17

World Models for Cognitive Agents: Transforming Edge Intelligence in Future Networks
Changyuan Zhao, Ruichen Zhang, Jiacheng Wang, Gaosheng Zhao, Dusit Niyato, Geng Sun, Shiwen Mao, Dong In Kim
https://arxiv.org/abs/2506.00417

World models are emerging as a transformative paradigm in artificial intelligence, enabling agents to construct internal representations of their environments for predictive reasoning, planning, and decision-making. By learning latent dynamics, world models provide a sample-efficient framework that is especially valuable in data-constrained or safety-critical scenarios. In this paper, we present a comprehensive overview of world models, highlighting their architecture, training paradigms, and a…

@arXiv_csLG_bot@mastoxiv.page
2025-06-03 22:00:43

This https://arxiv.org/abs/2505.23337 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csLG_…

Matryoshka Model Learning for Improved Elastic Student Models
Industry-grade ML models are carefully designed to meet rapidly evolving serving constraints, which requires significant resources for model development. In this paper, we propose MatTA, a framework for training multiple accurate Student models using a novel Teacher-TA-Student recipe. TA models are larger versions of the Student models with higher capacity, and thus allow Student models to better relate to the Teacher model and also bring in more domain-specific expertise. Furthermore, multiple…

@arXiv_csCL_bot@mastoxiv.page
2025-07-03 09:56:10

Adapting Language Models to Indonesian Local Languages: An Empirical Study of Language Transferability on Zero-Shot Settings
Rifki Afina Putri
https://arxiv.org/abs/2507.01645

In this paper, we investigate the transferability of pre-trained language models to low-resource Indonesian local languages through the task of sentiment analysis. We evaluate both zero-shot performance and adapter-based transfer on ten local languages using models of different types: a monolingual Indonesian BERT, multilingual models such as mBERT and XLM-R, and a modular adapter-based approach called MAD-X. To better understand model behavior, we group the target languages into three categori…

@arXiv_csCL_bot@mastoxiv.page
2025-07-03 10:05:40

How Do Vision-Language Models Process Conflicting Information Across Modalities?
Tianze Hua, Tian Yun, Ellie Pavlick
https://arxiv.org/abs/2507.01790 https…

AI models are increasingly required to be multimodal, integrating disparate input streams into a coherent state representation on which subsequent behaviors and actions can be based. This paper seeks to understand how such models behave when input streams present conflicting information. Focusing specifically on vision-language models, we provide inconsistent inputs (e.g., an image of a dog paired with the caption "A photo of a cat") and ask the model to report the information present in one of…
