Tootfinder

@Techmeme@techhub.social
2025-09-13 18:01:37

Mira Murati's TML launches a research blog called Connectionism, and shares its work on resolving nondeterminism and achieving reproducible results from LLMs (Maxwell Zeff/TechCrunch)
https://techcrunch.com/2025/09/10/thinking-machines-lab…

Thinking Machines Lab wants to make AI models more consistent | TechCrunch
In a blog post shared Wednesday, Mira Murati's startup offered a rare glimpse into some of the work it's doing to improve AI models.

@ErikJonker@mastodon.social
2025-10-13 13:46:47

Always fun/challenging to read new AI (pre)papers like this. "Base models know how to reason, thinking models learn when".
#AI #Google #reasoning

@arXiv_csCV_bot@mastoxiv.page
2025-10-13 10:41:30

VITA-VLA: Efficiently Teaching Vision-Language Models to Act via Action Expert Distillation
Shaoqi Dong, Chaoyou Fu, Haihan Gao, Yi-Fan Zhang, Chi Yan, Chu Wu, Xiaoyu Liu, Yunhang Shen, Jing Huo, Deqiang Jiang, Haoyu Cao, Yang Gao, Xing Sun, Ran He, Caifeng Shan
https://arxiv.org/abs/2510.09607

Vision-Language Action (VLA) models significantly advance robotic manipulation by leveraging the strong perception capabilities of pretrained vision-language models (VLMs). By integrating action modules into these pretrained models, VLA methods exhibit improved generalization. However, training them from scratch is costly. In this work, we propose a simple yet effective distillation-based framework that equips VLMs with action-execution capability by transferring knowledge from pretrained small…

@arXiv_quantph_bot@mastoxiv.page
2025-08-14 09:41:22

Hybrid Quantum-Classical Latent Diffusion Models for Medical Image Generation
Kübra Yeter-Aydeniz, Nora M. Bauer, Pranay Jain, Max Masnick
https://arxiv.org/abs/2508.09903

Generative learning models in medical research are crucial in developing training data for deep learning models and advancing diagnostic tools, but the problem of high-quality, diverse images is an open topic of research. Quantum-enhanced generative models have been proposed and tested in the literature but have been restricted to small problems below the scale of industry relevance. In this paper, we propose quantum-enhanced diffusion and variational autoencoder (VAE) models and test them on t…

@arXiv_csCL_bot@mastoxiv.page
2025-08-14 09:48:12

Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
Weigao Sun, Jiaxi Hu, Yucheng Zhou, Jusen Du, Disen Lan, Kexin Wang, Tong Zhu, Xiaoye Qu, Yu Zhang, Xiaoyu Mo, Daizong Liu, Yuxuan Liang, Wenliang Chen, Guoqi Li, Yu Cheng
https://arxiv.org/abs/2508.09834

Large Language Models (LLMs) have delivered impressive results in language understanding, generation, and reasoning, and have pushed the ability boundary of multimodal models. Transformer models, as the foundation of modern LLMs, offer a strong baseline with excellent scaling properties. However, the traditional transformer architecture requires substantial computations and poses significant obstacles for large-scale training and practical deployment. In this survey, we offer a systematic examination of…

@Techmeme@techhub.social
2025-08-14 12:46:12

The US NSF and Nvidia partner to fund the Open Multimodal Infrastructure to Accelerate Science project, led by Ai2; NSF is contributing $75M and Nvidia $77M (Kyt Dotson/SiliconANGLE)
https://siliconangle.com/2025/08/14/nsf-nvidi…

NSF and Nvidia partner to develop fully open AI models to lead US science innovation - SiliconANGLE

@arXiv_csCV_bot@mastoxiv.page
2025-10-13 10:24:10

Zero-shot image privacy classification with Vision-Language Models
Alina Elena Baia, Alessio Xompero, Andrea Cavallaro
https://arxiv.org/abs/2510.09253

While specialized learning-based models have historically dominated image privacy prediction, the current literature increasingly favours adopting large Vision-Language Models (VLMs) designed for generic tasks. This trend risks overlooking the performance ceiling set by purpose-built models due to a lack of systematic evaluation. To address this problem, we establish a zero-shot benchmark for image privacy classification, enabling a fair comparison. We evaluate the top-3 open-source VLMs, accor…

@arXiv_csCL_bot@mastoxiv.page
2025-07-14 09:50:42

The Curious Case of Factuality Finetuning: Models' Internal Beliefs Can Improve Factuality
Benjamin Newman, Abhilasha Ravichander, Jaehun Jung, Rui Xin, Hamish Ivison, Yegor Kuznetsov, Pang Wei Koh, Yejin Choi
https://arxiv.org/abs/2507.08371

Language models are prone to hallucination - generating text that is factually incorrect. Finetuning models on high-quality factual information can potentially reduce hallucination, but concerns remain; obtaining factual gold data can be expensive and training on correct but unfamiliar data may potentially lead to even more downstream hallucination. What data should practitioners finetune on to mitigate hallucinations in language models? In this work, we study the relationship between the factu…

@arXiv_csCL_bot@mastoxiv.page
2025-07-14 09:53:52

Diagnosing Failures in Large Language Models' Answers: Integrating Error Attribution into Evaluation Framework
Zishan Xu, Shuyi Xie, Qingsong Lv, Shupei Xiao, Linlin Song, Sui Wenjuan, Fan Lin
https://arxiv.org/abs/2507.08459

With the widespread application of Large Language Models (LLMs) in various tasks, the mainstream LLM platforms generate massive user-model interactions daily. In order to efficiently analyze the performance of models and diagnose failures in their answers, it is essential to develop an automated framework to systematically categorize and attribute errors. However, existing evaluation models lack error attribution capability. In this work, we establish a comprehensive Misattribution Framework wi…

@Techmeme@techhub.social
2025-08-11 22:30:41

Nvidia debuts new Omniverse SDKs and Cosmos world foundation models for robotics devs, including Cosmos Reason, a 7B-parameter reasoning vision language model (Rebecca Szkutak/TechCrunch)
https://techcrunch.com/2025/08/11/nvid

Nvidia unveils new Cosmos world models, infra for robotics and physical uses | TechCrunch
Nvidia on Monday unveiled a set of new world AI models, libraries, and other infrastructure for robotics developers, most notable of which is Cosmos Reason, a 7-billion-parameter "reasoning" vision language model for physical AI applications and robots.
