Tootfinder

Opt-in global Mastodon full-text search.

No exact results. Similar results found.
@arXiv_csCL_bot@mastoxiv.page
2025-09-15 09:55:31

Towards Reliable and Interpretable Document Question Answering via VLMs
Alessio Chen, Simone Giovannini, Andrea Gemelli, Fabio Coppini, Simone Marinai
arxiv.org/abs/2509.10129

@arXiv_csCV_bot@mastoxiv.page
2025-10-15 10:47:01

Reasoning in the Dark: Interleaved Vision-Text Reasoning in Latent Space
Chao Chen, Zhixin Ma, Yongqi Li, Yupeng Hu, Yinwei Wei, Wenjie Li, Liqiang Nie
arxiv.org/abs/2510.12603

@arXiv_csLG_bot@mastoxiv.page
2025-10-15 10:50:21

Keep Calm and Avoid Harmful Content: Concept Alignment and Latent Manipulation Towards Safer Answers
Ruben Belo, Claudia Soares, Marta Guimaraes
arxiv.org/abs/2510.12672

@thomasfuchs@hachyderm.io
2025-09-13 14:18:27

Here’s a healthcare lawyer answering some common questions, including whether it is ethical to get the COVID vaccine, whether you are taking a dose away from someone who really needs it, etc.
(free and shareable link)
patreon.com/posts/138476876/

@arXiv_csNI_bot@mastoxiv.page
2025-09-16 10:38:06

gNB-based Local Breakout for URLLC in industrial 5G
Rajendra Paudyal, Rajendra Upadhyay, Al Nahian Bin Emran, Duminda Wijesekera
arxiv.org/abs/2509.10617

@AthanSpod@social.linux.pizza
2025-08-14 16:17:16

And that's another trip to the Specsavers in Andover done, until I go pick up the new glasses in ~2 weeks.
Once more the staff were lovely and professional, and both offered good information and answered any questions I had.
My bank account is wincing, as I've refreshed all *three* pairs; "intermediate vision" for computer work (most of my day), normal varifocals, and another varifocal pair with a polaroid tint for if it's sunny.

@arXiv_csCV_bot@mastoxiv.page
2025-10-15 14:34:17

Replaced article(s) found for cs.CV. arxiv.org/list/cs.CV/new
[3/5]:
- STRIDE-QA: Visual Question Answering Dataset for Spatiotemporal Reasoning in Urban Driving Scenes
Keishi Ishihara, Kento Sasaki, Tsubasa Takahashi, Daiki Shiono, Yu Yamaguchi

@arXiv_csCV_bot@mastoxiv.page
2025-09-15 09:54:11

LaV-CoT: Language-Aware Visual CoT with Multi-Aspect Reward Optimization for Real-World Multilingual VQA
Jing Huang, Zhiya Tan, Shutao Gong, Fanwei Zeng, Jianshu Li
arxiv.org/abs/2509.10026

@arXiv_csCV_bot@mastoxiv.page
2025-10-13 10:24:30

Hallucination Filtering in Radiology Vision-Language Models Using Discrete Semantic Entropy
Patrick Wienholt, Sophie Caselitz, Robert Siepmann, Philipp Bruners, Keno Bressem, Christiane Kuhl, Jakob Nikolas Kather, Sven Nebelung, Daniel Truhn
arxiv.org/abs/2510.09256

@arXiv_csCV_bot@mastoxiv.page
2025-09-11 09:22:23

Examining Vision Language Models through Multi-dimensional Experiments with Vision and Text Features
Saurav Sengupta, Nazanin Moradinasab, Jiebei Liu, Donald E. Brown
arxiv.org/abs/2509.08266