Tootfinder

@arXiv_csAI_bot@mastoxiv.page
2025-07-01 11:01:03

The Confidence Paradox: Can LLM Know When It's Wrong
Sahil Tripathi, Md Tabrez Nafis, Imran Hussain, Jiechao Gao
https://arxiv.org/abs/2506.23464 https…

Document Visual Question Answering (DocVQA) systems are increasingly deployed in real-world applications, yet they remain ethically opaque, often producing overconfident answers to ambiguous questions or failing to communicate uncertainty in a trustworthy manner. This misalignment between model confidence and actual knowledge poses significant risks, particularly in domains requiring ethical accountability. Existing approaches such as LayoutLMv3, UDOP, and DONUT have advanced SOTA performance by…

@arXiv_csCY_bot@mastoxiv.page
2025-07-01 07:47:03

Computational Analysis of Climate Policy
Carolyn Hicks
https://arxiv.org/abs/2506.22449 https://arxiv.org/pdf/2506.22449

This thesis explores the impact of the Climate Emergency movement on local government climate policy, using computational methods. The Climate Emergency movement sought to accelerate climate action at local government level through the mechanism of Climate Emergency Declarations (CEDs), resulting in a series of commitments from councils to treat climate change as an emergency. With the aim of assessing the potential of current large language models to answer complex policy questions, I first bu…

@arXiv_eessSP_bot@mastoxiv.page
2025-07-01 11:48:23

Automatic Phase Calibration for High-resolution mmWave Sensing via Ambient Radio Anchors
Ruixu Geng, Yadong Li, Dongheng Zhang, Pengcheng Huang, Binquan Wang, Binbin Zhang, Zhi Lu, Yang Hu, Yan Chen
https://arxiv.org/abs/2506.23472

Millimeter-wave (mmWave) radar systems with large arrays have pushed radar sensing into a new era, thanks to their high angular resolution. However, our long-term experiments indicate that array elements exhibit phase drift over time and require periodic phase calibration to maintain high resolution, creating an obstacle for practical high-resolution mmWave sensing. Unfortunately, existing calibration methods are inadequate for periodic recalibration, either because they rely on artificial refer…

@arXiv_csCV_bot@mastoxiv.page
2025-06-30 10:16:50

MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
Xi Chen, Mingkang Zhu, Shaoteng Liu, Xiaoyang Wu, Xiaogang Xu, Yu Liu, Xiang Bai, Hengshuang Zhao
https://arxiv.org/abs/2506.22434

This work explores enabling Chain-of-Thought (CoT) reasoning to link visual cues across multiple images. A straightforward solution is to adapt rule-based reinforcement learning for Vision-Language Models (VLMs). However, such methods typically rely on manually curated question-answer pairs, which can be particularly challenging when dealing with fine-grained visual details and complex logic across images. Inspired by self-supervised visual representation learning, we observe that images contai…

@arXiv_csLO_bot@mastoxiv.page
2025-06-30 07:46:30

Negated String Containment is Decidable (Technical Report)
Vojtěch Havlena, Michal Hečko, Lukáš Holík, Ondřej Lengál
https://arxiv.org/abs/2506.22061

We provide a positive answer to a long-standing open question of the decidability of the not-contains string predicate. Not-contains is practically relevant, for instance in symbolic execution of string-manipulating programs. In particular, we show that the predicate notContains(x1 ... xn, y1 ... ym), where x1 ... xn and y1 ... ym are sequences of string variables constrained by regular languages, is decidable. Decidability of a not-contains predicate combined with chain-free word equations and …

@arXiv_csIR_bot@mastoxiv.page
2025-07-01 07:43:33

Machine Assistant with Reliable Knowledge: Enhancing Student Learning via RAG-based Retrieval
Yongsheng Lian
https://arxiv.org/abs/2506.23026 https://

We present Machine Assistant with Reliable Knowledge (MARK), a retrieval-augmented question-answering system designed to support student learning through accurate and contextually grounded responses. The system is built on a retrieval-augmented generation (RAG) framework, which integrates a curated knowledge base to ensure factual consistency. To enhance retrieval effectiveness across diverse question types, we implement a hybrid search strategy that combines dense vector similarity with sparse…

@arXiv_mathph_bot@mastoxiv.page
2025-06-30 08:17:30

Exponential decay in $O(n)$-invariant quantum spin systems
Jakob E. Björnberg, Kieran Ryan
https://arxiv.org/abs/2506.22254 https://

We consider $O(n)$-invariant and reflection-positive quantum spin systems on the integer lattice in any dimension, and prove that spin-spin correlations decay exponentially fast provided $n$ is large enough. This answers a question of Ueltschi, who proved that for small $n$ there is instead long-range order (for $d$ at least 3).

@arXiv_csAI_bot@mastoxiv.page
2025-07-01 09:54:43

MARBLE: A Hard Benchmark for Multimodal Spatial Reasoning and Planning
Yulun Jiang, Yekun Chai, Maria Brbić, Michael Moor
https://arxiv.org/abs/2506.22992

The ability to process information from multiple modalities and to reason through it step-by-step remains a critical challenge in advancing artificial intelligence. However, existing reasoning benchmarks focus on text-only reasoning, or employ multimodal questions that can be answered by directly retrieving information from a non-text modality. Thus, complex reasoning remains poorly understood in multimodal domains. Here, we present MARBLE, a challenging multimodal reasoning benchmark that is d…

@arXiv_csCY_bot@mastoxiv.page
2025-07-01 09:03:43

Peer Review as Structured Commentary: Immutable Identity, Public Dialogue, and Reproducible Scholarship
Craig Steven Wright
https://arxiv.org/abs/2506.22497

This paper reconceptualises peer review as structured public commentary. Traditional academic validation is hindered by anonymity, latency, and gatekeeping. We propose a transparent, identity-linked, and reproducible system of scholarly evaluation anchored in open commentary. Leveraging blockchain for immutable audit trails and AI for iterative synthesis, we design a framework that incentivises intellectual contribution, captures epistemic evolution, and enables traceable reputational dynamics.…

@arXiv_csIR_bot@mastoxiv.page
2025-06-30 09:55:10

HLTCOE at LiveRAG: GPT-Researcher using ColBERT retrieval
Kevin Duh, Eugene Yang, Orion Weller, Andrew Yates, Dawn Lawrie
https://arxiv.org/abs/2506.22356 …

The HLTCOE LiveRAG submission utilized the GPT-researcher framework for researching the context of the question, filtering the returned results, and generating the final answer. The retrieval system was a ColBERT bi-encoder architecture, which represents a passage with many dense tokens. Retrieval used a local, compressed index of the FineWeb10-BT collection created with PLAID-X, using a model fine-tuned for multilingual retrieval. Query generation from context was done with Qwen2.5-7B-Instruct…
