Tootfinder

No exact results. Similar results found.

@arXiv_csLG_bot@mastoxiv.page
2025-07-31 09:20:21

SourceSplice: Source Selection for Machine Learning Tasks
Ambarish Singh, Romila Pradhan
https://arxiv.org/abs/2507.22186 https://arxiv.org/pdf/2507.22186

SourceSplice: Source Selection for Machine Learning Tasks
Data quality plays a pivotal role in the predictive performance of machine learning (ML) tasks - a challenge amplified by the deluge of data sources available in modern organizations. Prior work in data discovery largely focuses on metadata matching, semantic similarity or identifying tables that should be joined to answer a particular query, but does not consider source quality for high performance of the downstream ML task. This paper addresses the problem of determining the best subset of data sou…

@arXiv_csRO_bot@mastoxiv.page
2025-08-01 09:55:41

Scalable Multi-Task Reinforcement Learning for Generalizable Spatial Intelligence in Visuomotor Agents
Shaofei Cai, Zhancun Mu, Haiwen Xia, Bowei Zhang, Anji Liu, Yitao Liang
https://arxiv.org/abs/2507.23698

Scalable Multi-Task Reinforcement Learning for Generalizable Spatial Intelligence in Visuomotor Agents
While Reinforcement Learning (RL) has achieved remarkable success in language modeling, its triumph hasn't yet fully translated to visuomotor agents. A primary challenge in RL models is their tendency to overfit specific tasks or environments, thereby hindering the acquisition of generalizable behaviors across diverse settings. This paper provides a preliminary answer to this challenge by demonstrating that RL-finetuned visuomotor agents in Minecraft can achieve zero-shot generalization to unse…

@arXiv_csHC_bot@mastoxiv.page
2025-08-01 08:39:41

iLearnRobot: An Interactive Learning-Based Multi-Modal Robot with Continuous Improvement
Kohou Wang, ZhaoXiang Liu, Lin Bai, Kun Fan, Xiang Liu, Huan Hu, Kai Wang, Shiguo Lian
https://arxiv.org/abs/2507.22896

iLearnRobot: An Interactive Learning-Based Multi-Modal Robot with Continuous Improvement
It is crucial that robots' performance can be improved after deployment, as they are inherently likely to encounter novel scenarios never seen before. This paper presents an innovative solution: an interactive learning-based robot system powered by a Multi-modal Large Language Model (MLLM). A key feature of our system is its ability to learn from natural dialogues with non-expert users. We also propose chain of question to clarify the exact intent of the question before providing an answer and d…

@oekologisch_unterwegs@mastodon.online
2025-08-30 16:24:09

The greylag goose (#Graugans), the ancestor of our domestic goose (#Hausgans), impresses with a length of up to 90 cm and a weight of up to 4 kg. These birds are widespread in Europe and western Asia and prefer landscapes with access to fresh water. Their loud cackling is characteristic, but unfortunately I don't have an audio recording yet. 🦢🌾🌍…

Greylag goose (Anser anser): ancestor of the domestic goose
Get to know the greylag goose: weight, breeding season, and the distinctive diet of this widespread goose species

@arXiv_csIR_bot@mastoxiv.page
2025-07-01 07:43:33

Machine Assistant with Reliable Knowledge: Enhancing Student Learning via RAG-based Retrieval
Yongsheng Lian
https://arxiv.org/abs/2506.23026 https://

Machine Assistant with Reliable Knowledge: Enhancing Student Learning via RAG-based Retrieval
We present Machine Assistant with Reliable Knowledge (MARK), a retrieval-augmented question-answering system designed to support student learning through accurate and contextually grounded responses. The system is built on a retrieval-augmented generation (RAG) framework, which integrates a curated knowledge base to ensure factual consistency. To enhance retrieval effectiveness across diverse question types, we implement a hybrid search strategy that combines dense vector similarity with sparse…

@arXiv_csCV_bot@mastoxiv.page
2025-06-30 10:16:50

MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
Xi Chen, Mingkang Zhu, Shaoteng Liu, Xiaoyang Wu, Xiaogang Xu, Yu Liu, Xiang Bai, Hengshuang Zhao
https://arxiv.org/abs/2506.22434

MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
This work explores enabling Chain-of-Thought (CoT) reasoning to link visual cues across multiple images. A straightforward solution is to adapt rule-based reinforcement learning for Vision-Language Models (VLMs). However, such methods typically rely on manually curated question-answer pairs, which can be particularly challenging when dealing with fine-grained visual details and complex logic across images. Inspired by self-supervised visual representation learning, we observe that images contai…

@arXiv_csCL_bot@mastoxiv.page
2025-07-30 10:28:01

Post-Training Large Language Models via Reinforcement Learning from Self-Feedback
Carel van Niekerk, Renato Vukovic, Benjamin Matthias Ruppik, Hsien-chin Lin, Milica Gašić
https://arxiv.org/abs/2507.21931

Post-Training Large Language Models via Reinforcement Learning from Self-Feedback
Large Language Models (LLMs) often produce plausible but poorly-calibrated answers, limiting their reliability on reasoning-intensive tasks. We present Reinforcement Learning from Self-Feedback (RLSF), a post-training stage that uses the model's own confidence as an intrinsic reward, mimicking how humans learn in the absence of external feedback. After a frozen LLM generates several chain-of-thought solutions, we define and compute the confidence of each final answer span and rank the traces ac…

@arXiv_csCL_bot@mastoxiv.page
2025-07-30 10:18:01

Libra: Assessing and Improving Reward Model by Learning to Think
Meng Zhou, Bei Li, Jiahao Liu, Xiaowen Shi, Yang Bai, Rongxiang Weng, Jingang Wang, Xunliang Cai
https://arxiv.org/abs/2507.21645

Libra: Assessing and Improving Reward Model by Learning to Think
Reinforcement learning (RL) has significantly improved the reasoning ability of large language models. However, current reward models underperform in challenging reasoning scenarios and predominant RL training paradigms rely on rule-based or reference-based rewards, which impose two critical limitations: 1) the dependence on finely annotated reference answer to attain rewards; and 2) the requirement for constrained output format. These limitations fundamentally hinder further RL data scaling an…

@arXiv_csCL_bot@mastoxiv.page
2025-07-31 09:53:11

From Sufficiency to Reflection: Reinforcement-Guided Thinking Quality in Retrieval-Augmented Reasoning for LLMs
Jie He, Victor Gutierrez Basulto, Jeff Z. Pan
https://arxiv.org/abs/2507.22716

From Sufficiency to Reflection: Reinforcement-Guided Thinking Quality in Retrieval-Augmented Reasoning for LLMs
Reinforcement learning-based retrieval-augmented generation (RAG) methods enhance the reasoning abilities of large language models (LLMs). However, most rely only on final-answer rewards, overlooking intermediate reasoning quality. This paper analyzes existing RAG reasoning models and identifies three main failure patterns: (1) information insufficiency, meaning the model fails to retrieve adequate support; (2) faulty reasoning, where logical or content-level flaws appear despite sufficient inf…

@arXiv_csCL_bot@mastoxiv.page
2025-08-27 09:54:23

Knowing or Guessing? Robust Medical Visual Question Answering via Joint Consistency and Contrastive Learning
Songtao Jiang, Yuxi Chen, Sibo Song, Yan Zhang, Yeying Jin, Yang Feng, Jian Wu, Zuozhu Liu
https://arxiv.org/abs/2508.18687

Knowing or Guessing? Robust Medical Visual Question Answering via Joint Consistency and Contrastive Learning
In high-stakes medical applications, consistent answering across diverse question phrasings is essential for reliable diagnosis. However, we reveal that current Medical Vision-Language Models (Med-VLMs) exhibit concerning fragility in Medical Visual Question Answering, as their answers fluctuate significantly when faced with semantically equivalent rephrasings of medical questions. We attribute this to two limitations: (1) insufficient alignment of medical concepts, leading to divergent reasoni…
