Tootfinder

Opt-in global Mastodon full text search. Join the index!

@arXiv_csCL_bot@mastoxiv.page
2025-09-11 09:45:53

LLM Ensemble for RAG: Role of Context Length in Zero-Shot Question Answering for BioASQ Challenge
Dima Galat, Diego Molla-Aliod
arxiv.org/abs/2509.08596

@philip@mastodon.mallegolhansen.com
2025-09-12 00:51:15

@… Haha, I appreciate that.
It’s a good question. My mind immediately *wants* to answer all the ones you didn’t ask (Taking guns away from everyone, getting to pick and choose who gets them, etc.)
But the question as (purposefully I’m sure) posed, is a tough one.

@cowboys@darktundra.xyz
2025-09-09 20:29:41

Cowboys starter gets angry at reporters over Micah Parson question sportingnews.com/us/ncaa-footb

@arXiv_csLG_bot@mastoxiv.page
2025-10-10 11:16:09

Better Together: Leveraging Unpaired Multimodal Data for Stronger Unimodal Models
Sharut Gupta, Shobhita Sundaram, Chenyu Wang, Stefanie Jegelka, Phillip Isola
arxiv.org/abs/2510.08492

@Techmeme@techhub.social
2025-10-11 19:45:54

Interviews with security researchers about AI's potential for large-scale destruction, as experts remain divided and global regulatory frameworks lag (Stephen Witt/New York Times)

@eglassman@hci.social
2025-09-11 04:18:25

Sincere question for the HCI community (and all other international research communities who disseminate research primarily at conferences):
1. When any conference is in the US, will international folks risk coming? I already know prominent folks who say they won't.
2. When any conference is outside the US, will any international students within the states risk going? My PhD student has been advised not to, so I'm giving his talk overseas for him.
How will we all …

@arXiv_statML_bot@mastoxiv.page
2025-10-10 09:27:09

High-dimensional Analysis of Synthetic Data Selection
Parham Rezaei, Filip Kovacevic, Francesco Locatello, Marco Mondelli
arxiv.org/abs/2510.08123

@arXiv_csCV_bot@mastoxiv.page
2025-10-09 10:13:31

StaR-KVQA: Structured Reasoning Traces for Implicit-Knowledge Visual Question Answering
Zhihao Wen, Wenkang Wei, Yuan Fang, Xingtong Yu, Hui Zhang, Weicheng Zhu, Xin Zhang
arxiv.org/abs/2510.06638

@kurtsh@mastodon.social
2025-09-11 05:08:27

What a cool trivia question from the Lord of the Rings...
▶️ There was Another Way to Destroy the Ring and Defeat Sauron
youtube.com/watch?v=Qhnc8TbUrK

@egallager@social.treehouse.systems
2025-09-11 19:36:11

Unix question: is there a version of seq(1) for letters instead of numbers?

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:20:11

Avoiding Knowledge Edit Skipping in Multi-hop Question Answering with Guided Decomposition
Yi Liu, Xiangrong Zhu, Xiangyu Liu, Wei Wei, Wei Hu
arxiv.org/abs/2509.07555

@arXiv_statME_bot@mastoxiv.page
2025-09-11 08:33:23

Generative AI as a Safety Net for Survey Question Refinement
Erica Ann Metheney, Lauren Yehle
arxiv.org/abs/2509.08702 arxiv.org/pdf/2509.0…

@saraislet@infosec.exchange
2025-09-11 00:02:50

#AdviceRequested!
We want to buy an electric car! It's exciting but also daunting to make car buying decisions, and harder to evaluate with electric than it was for gas.
Safety and reliability are the highest priorities — which was easier to evaluate with models like the Honda Civic that's been around for decades
Lucid looks really nice, but I question the relia…

@brian_gettler@mas.to
2025-11-10 13:20:24

Dear author,
Thank you for submitting your article manuscript. While we trust that many in the research community would welcome a study of vitamin, drug, and disease interaction as a timely intervention, we question its suitability for publication in a history journal.
The editors

@hex@kolektiva.social
2025-10-11 18:28:48

Here's your regular reminder:
There is no debate over if cars will or will not be part of the future. They will not. They are a luxury we can no longer afford. The question is only if we will choose to rid our future of cars, or allow cars to rid us of our future.
#FuckCars

@frankel@mastodon.top
2025-11-11 09:30:05

Monorepo vs Multi-repo vs #Git submodule vs Git Subtree: A Complete Guide for Developers
levelup.gitcon…

@cdamian@rls.social
2025-10-11 15:38:00

‘It’s a question of humanity’: how a small Spanish town made headlines over its immigration stance | Spain | The Guardian
theguardian.com/world/2025/oct
> …

@akosma@mastodon.online
2025-10-10 18:46:55

"Chatbots are turning on the flattery, patience, and support. Microsoft AI CEO Mustafa Suleyman said the “cool thing” about the company’s AI personal assistant is that it doesn’t “judge you for asking a stupid question.” It exhibits “kindness and empathy.” Here’s the rub: We need people to judge us. We need people to call us out for making stupid statements. Friction and conflict are key to developing resilience and learning how to function in society."

@arXiv_csCY_bot@mastoxiv.page
2025-10-09 08:46:51

Asking For It: Question-Answering for Predicting Rule Infractions in Online Content Moderation
Mattia Samory, Diana Pamfile, Andrew To, Shruti Phadke
arxiv.org/abs/2510.06350

@sascha_wolfer@fediscience.org
2025-10-10 06:05:44

One other thing: while we don't claim that our mixed-effects logit model is the perfect way to account for non-independence between languages, we don't think it's correct, as Xia & Lindell assert, to just claim that our results are "counterintuitive", that the fixed-effects estimates are "unreliable", and that the high model fits are "unrealistic." Whether a mixed model better captures the data-generating process is ultimately an empirical question, not one to be decided by assertion. Take, for instance, our finding that once random effects for either subregion or language family are included, the estimated effect of L1_population reverses direction, from the negative value reported by Xia & Lindell et al. to a positive one.

@NFL@darktundra.xyz
2025-10-10 19:11:32

NFL Week 6 injury report: Bengals' Ja'Marr Chase questionable vs. Packers due to illness

cbssports.com/nfl/news/nfl-wee

@BBC3MusicBot@mastodonapp.uk
2025-10-10 20:45:50

🔊 #NowPlaying on #BBCRadio3:
#TheEssay
- The Meaning and Magic of Music
How does music convey meaning to the listener? Catherine Coldstream examines this question in the context of classical music and religious faith.
Relisten now 👇
bbc.co.uk/programmes/m0029pvm

@arXiv_mathDS_bot@mastoxiv.page
2025-10-10 09:00:19

Stability with respect to periodic switching laws does not imply global stability under arbitrary switching
Ian D. Morris
arxiv.org/abs/2510.08074

@kexpmusicbot@mastodonapp.uk
2025-09-08 06:23:37

🇺🇦 #NowPlaying on KEXP's #Expansions
A.B.O.:
🎵 This Question
#ABO
deepclicks.bandcamp.com/track/

@ThatHoarder@mastodon.online
2025-11-10 09:34:03

It's a question that comes up for a lot of us, so in the latest episode I talk about potential ways to cope with it.
Find the podcast by searching for That Hoarder: Overcome Compulsive Hoarding podcast in your podcast player.
#hoarding #hoardingdisorder

@raiders@darktundra.xyz
2025-11-06 16:28:05

The Biggest Question Facing Each NFL Team in the Second Half of 2025 foxsports.com/stories/nfl/bigg

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:02:41

The Role of Exploration Modules in Small Language Models for Knowledge Graph Question Answering
Yi-Jie Cheng, Oscar Chew, Yun-Nung Chen
arxiv.org/abs/2509.07399

@bobmueller@mastodon.world
2025-10-10 22:00:06

An interesting piece about the death and life of Edgar Allan Poe. #taphephobia
lithub.com/to-haunt-and-be-hau

@arXiv_csAI_bot@mastoxiv.page
2025-10-10 10:18:29

VoiceAgentBench: Are Voice Assistants ready for agentic tasks?
Dhruv Jain, Harshit Shukla, Gautam Rajeev, Ashish Kulkarni, Chandra Khatri, Shubham Agarwal
arxiv.org/abs/2510.07978

@tante@tldr.nettime.org
2025-10-09 21:20:05

"In the end, we are defined not just by our actions, but by the actions we tolerate."
Mike Monteiro with another banger
(Original title: How to eat with others)
buttondown.com/monteiro/archiv

@arXiv_physicsdataan_bot@mastoxiv.page
2025-09-10 08:32:01

Revisiting the Question of Information Content of EXAFS Spectra through a Bayesian Approach
Lucy Haddad, Diego Gianolio, Andrei Sapelkin
arxiv.org/abs/2509.07950

@memeorandum@universeodon.com
2025-11-06 12:51:04

The question Mamdani won't answer (Politico)
politico.com/newsletters/playb
memeorandum.com/251106/p8#a251

@arXiv_csRO_bot@mastoxiv.page
2025-09-10 10:16:41

Temporal Counterfactual Explanations of Behaviour Tree Decisions
Tamlin Love, Antonio Andriella, Guillem Alenyà
arxiv.org/abs/2509.07674

@arXiv_mathNT_bot@mastoxiv.page
2025-09-11 08:32:03

Number of integers represented by families of binary forms III: fewnomials
Etienne Fouvry, Michel Waldschmidt
arxiv.org/abs/2509.08335 arxi…

@arXiv_quantph_bot@mastoxiv.page
2025-09-10 10:25:01

All you need is controlled-V: universality of a standard two-qubit gate by catalytic embedding
Robin Kaarsgaard
arxiv.org/abs/2509.07578 ar…

@arXiv_mathAG_bot@mastoxiv.page
2025-10-09 10:03:51

On the diagonal of quartic hypersurfaces and $(2,3)$-complete intersection $n$-folds
Elia Fiammengo, Morten Lüders
arxiv.org/abs/2510.07111

@servelan@newsie.social
2025-11-09 18:24:34

Now you know they're lying: different responses to the same question, multiple answers, indicate lying about the real reason
Tariffs aren't meant for revenue and will shrink over time, Bessent says
axios.com/2025/11/09/trump-tar

@hex@kolektiva.social
2025-09-11 19:48:52

I also have this idea that somewhere there's a transbian polycule coven who take money for hexes and curses against fascists and use it to fund the revolution.
Unrelated question... if a coven is a legally registered church, shouldn't paying for a hex be a legally tax-deductible charitable donation? Asking for a friend.

@drgeraint@glasgow.social
2025-10-09 21:29:23

Interesting question on road.cc: is there a bit of road that you just hate #cycling on?
For me, a short stretch of southbound Mearns Rd at Mearns Kirk. Slight rise, poor surface, from Eaglesham Rd to the roundabout:

@arXiv_csHC_bot@mastoxiv.page
2025-10-07 09:59:32

Multi-Hop Question Answering: When Can Humans Help, and Where do They Struggle?
Jinyan Su, Claire Cardie, Jennifer Healey
arxiv.org/abs/2510.04493

@grifferz@social.bitfolk.com
2025-10-09 13:04:12

Love it when someone finally responds to me after literally months of silence, I ask a question and they respond correcting me and saying they've taken action now anyway "to avoid further iteration and delay".

@arXiv_astrophSR_bot@mastoxiv.page
2025-09-10 09:33:51

SN 2022xlp: The second-known well-observed, intermediate-luminosity Iax supernova
D. Bánhidi, B. Barna, T. Szalai, J. Vinkó, I. B. Bíró, K. A. Bostroem, I. Csányi, K. W. Davis, R. J. Foley, L. Galbany, S. W. Jha, D. A. Howell, L. A. Kwok, A. Pál, C. Pellegrino, C. Rojas-Bravo, P. Székely, K. Taggart, G. Terreran, S. Tinyanont

@arXiv_csSI_bot@mastoxiv.page
2025-10-10 08:26:09

From Keywords to Clusters: AI-Driven Analysis of YouTube Comments to Reveal Election Issue Salience in 2024
Raisa M. Simoes, Timoteo Kelly, Eduardo J. Simoes, Praveen Rao
arxiv.org/abs/2510.07821

@arXiv_mathGT_bot@mastoxiv.page
2025-09-10 09:11:51

The $L^p$-diameter of the space of contractible loops
Michael Brandenbursky, Egor Shelukhin
arxiv.org/abs/2509.07270 arxiv.org/pdf/2509.072…

@aral@mastodon.ar.al
2025-11-07 18:04:48

“The real question, then, is not ‘what can we do?’, but ‘what are we afraid to do?’ Whose comfort are we protecting when we ask safe questions? Whose illusions do we preserve through politeness? Solidarity is not an optic; it is a disruption. It is noisy, uncomfortable, often isolating. It pulls reputation apart rather than polishing it.

We are too fluent in the language of outrage, too comfortable in the posture of virtue. History will not absolve spectatorship, even when specta…

@cellfourteen@social.petertoushkov.eu
2025-10-07 19:28:19

If you haven't re-watched your 720p pirated copy from 2016 of The Matrix lately, do it now, if only for the showy dialogue at the beginning. So full of hope and righteousness, pure 90s bullet-time fuck-the-establishment attitude.
#Movies #SciFi

Trinity: "It's the question that brought you here. You know the question?"
Neo: "What is the Matrix?"

@arXiv_csDS_bot@mastoxiv.page
2025-10-10 08:36:09

Dynamic Connectivity with Expected Polylogarithmic Worst-Case Update Time
Simon Meierhans, Maximilian Probst Gutenberg
arxiv.org/abs/2510.08297

@davidaugust@mastodon.online
2025-09-08 04:16:33

"When you work on anything, you want to find the range of impulses. which ones get portrayed is another question, but you want to have that complexity and that fullness, even if you’re playing a cartoon character."
—Willem Dafoe
#acting #coaching

@arXiv_astrophIM_bot@mastoxiv.page
2025-10-10 08:38:39

Probing the Origin of Water in Planets within Habitable Zones by HWO
Yasuhiro Hasegawa, Courtney Dressing, Ludmila Carone
arxiv.org/abs/2510.07349

@matzekult@chaos.social
2025-09-09 12:06:08

Martial arts have always been a part of #StarTrek. But we have come quite a way since the famous hand chop, as we can see with the subject of today's #TrekTriviaTuesday question.
As always no googling and no spoiling the answer for others. Please boost after voting! :BoostOK:
V…

@sascha_wolfer@fediscience.org
2025-10-10 06:06:17

Finally, what Xia & Lindell call a "separation problem" is, in our view, a feature of our approach and not a bug.
If, e.g., all languages in a family are polysynthetic (or none are), that’s not a statistical artefact – it’s the signal. The outcome is well associated with genealogy, showing that family membership captures something genuinely informative about the process. When the model finds that family explains a large share of the variance, that's not a failure – it's evidence that phylogenetic structure dominates the pattern.
So while Xia & Lindell insist that "autocorrelation due to relationships and distance cannot be captured in family or regional-level analyses", we see that as an empirical question – and we treated it as one.
The real test is whether a mixed model that explicitly represents phylogeny and geography performs worse than their alternative, where the entire shared history of languages and environments is effectively collapsed into a single dimension (an eigenvector).
In other words: we model relationships – Xia & Lindell summarise them into one number per language.

@arXiv_mathPR_bot@mastoxiv.page
2025-09-11 08:49:23

The Random Walk Pinning Model II: Upper bounds on the free energy and disorder relevance
Quentin Berger, Hubert Lacoin
arxiv.org/abs/2509.08769

@arXiv_csNI_bot@mastoxiv.page
2025-09-10 07:51:41

TEGRA: A Flexible & Scalable NextGen Mobile Core
Bilal Saleem, Omar Basit, Jiayi Meng, Iftekhar Alam, Ajay Thakur, Christian Maciocco, Muhammad Shahbaz, Y. Charlie Hu, Larry Peterson
arxiv.org/abs/2509.07410

@arXiv_csLG_bot@mastoxiv.page
2025-10-10 11:05:59

Counterfactual Identifiability via Dynamic Optimal Transport
Fabio De Sousa Ribeiro, Ainkaran Santhirasekaram, Ben Glocker
arxiv.org/abs/2510.08294

@karlauerbach@sfba.social
2025-09-08 19:33:49

I have a question about the following:
Space-X satellites are relatively low orbit - they go around the world, their radio/signal footprint on the ground goes around the world with them.
So how does a single country, the US, issue "licenses" for radio spectrum that apply outside of US geographic borders?
"SpaceX buys wireless spectrum from EchoStar in $17 billion deal"

@blackknight95857669@social.linux.pizza
2025-11-09 02:02:51

Ordered a refurb Samsung S20 FE from Newegg, arrived today. I question the "Grade A - Excellent" rating that Reebeio gave it. Has mars all around the edge of the case and the back. Also has what looks to be a pressure crack in the back panel next to the cameras, like someone sat on it while in some kind of protective case and just put enough weight on it to crack the phone case a tiny bit.
However, the screen looks perfect and it powers on and boots to initialization mode no …

@arXiv_csCV_bot@mastoxiv.page
2025-09-10 10:43:31

D-LEAF: Localizing and Correcting Hallucinations in Multimodal LLMs via Layer-to-head Attention Diagnostics
Tiancheng Yang, Lin Zhang, Jiaye Lin, Guimin Hu, Di Wang, Lijie Hu
arxiv.org/abs/2509.07864

@Techmeme@techhub.social
2025-09-09 20:16:23

Ramp says it has hit $1B in annualized revenue, after saying it had hit $700M in March; it was valued at $22.5B in July (Julie Bort/TechCrunch)
techcrunch.com/2025/09/09/ramp

@arXiv_csCL_bot@mastoxiv.page
2025-09-11 09:47:23

Memorization in Large Language Models in Medicine: Prevalence, Characteristics, and Implications
Anran Li, Lingfei Qian, Mengmeng Du, Yu Yin, Yan Hu, Zihao Sun, Yihang Fu, Erica Stutz, Xuguang Ai, Qianqian Xie, Rui Zhu, Jimin Huang, Yifan Yang, Siru Liu, Yih-Chung Tham, Lucila Ohno-Machado, Hyunghoon Cho, Zhiyong Lu, Hua Xu, Qingyu Chen

@cowboys@darktundra.xyz
2025-09-11 15:04:58

Cowboys Defense Faces HUGE Question vs Giants! youtube.com/watch?v=Yy1kKR4n28A

@arXiv_csAI_bot@mastoxiv.page
2025-09-10 10:12:31

Aligning LLMs for the Classroom with Knowledge-Based Retrieval -- A Comparative RAG Study
Amay Jain, Liu Cui, Si Chen
arxiv.org/abs/2509.07846

@memeorandum@universeodon.com
2025-11-06 01:50:56

Hakeem Jeffries dodges question on whether Mamdani is future of Democratic Party (Fox News)
foxnews.com/politics/hakeem-je
memeorandum.com/251105/p164#a2

@arXiv_statML_bot@mastoxiv.page
2025-10-10 09:37:19

PAC Learnability in the Presence of Performativity
Ivan Kirev, Lyuben Baltadzhiev, Nikola Konstantinov
arxiv.org/abs/2510.08335 arxiv.org/p…

@NFL@darktundra.xyz
2025-10-07 11:59:49

Could early bye weeks be a good thing? Why there's an advantage and how six teams are approaching them espn.com/nfl/story/_/id/465091

@arXiv_mathAG_bot@mastoxiv.page
2025-09-10 08:56:51

The nonexistence of sections of Stiefel varieties and stably free modules
Sebastian Gant
arxiv.org/abs/2509.07263 arxiv.org/pdf/2509.07263

@arXiv_csCY_bot@mastoxiv.page
2025-10-10 07:54:18

Exploring the Viability of the Updated World3 Model for Examining the Impact of Computing on Planetary Boundaries
Nara Guliyeva, Eshta Bhardwaj, Christoph Becker
arxiv.org/abs/2510.07634

@arXiv_csCL_bot@mastoxiv.page
2025-09-11 09:19:43

Culturally transmitted color categories in LLMs reflect a learning bias toward efficient compression
Nathaniel Imel, Noga Zaslavsky
arxiv.org/abs/2509.08093

@raiders@darktundra.xyz
2025-09-05 19:03:48

Patriots Rookie’s Status in Question vs. Raiders si.com/nfl/patriots/news/new-e

@arXiv_csLG_bot@mastoxiv.page
2025-10-10 11:05:19

Opponent Shaping in LLM Agents
Marta Emili Garcia Segura, Stephen Hailes, Mirco Musolesi
arxiv.org/abs/2510.08255 arxiv.org/pdf/2510.08255

@arXiv_mathDS_bot@mastoxiv.page
2025-10-10 09:14:39

On roundness of rotation sets
Boris Perrot, Jan Boroński, Alex Clark
arxiv.org/abs/2510.08235 arxiv.org/pdf/2510.08235

@arXiv_csCV_bot@mastoxiv.page
2025-09-11 10:07:33

BcQLM: Efficient Vision-Language Understanding with Distilled Q-Gated Cross-Modal Fusion
Sike Xiang, Shuang Chen, Amir Atapour-Abarghouei
arxiv.org/abs/2509.08715

@servelan@newsie.social
2025-09-04 15:01:10

Firefighters question leaders’ role in Washington immigration raid
dailykos.com/stories/2025/9/4/

@cowboys@darktundra.xyz
2025-10-10 15:04:23

Mailbag: Pick your poison stopping run, pass? dallascowboys.com/news/mailbag

@davidaugust@mastodon.online
2025-11-08 16:16:33

"When you work on anything, you want to find the range of impulses. which ones get portrayed is another question, but you want to have that complexity and that fullness, even if you’re playing a cartoon character."
—Willem Dafoe
#acting #coaching

@aral@mastodon.ar.al
2025-09-07 08:13:04

Hey @… – you are complicit in genocide.
Just noting it down for the history books.
#EU #israel

@NFL@darktundra.xyz
2025-10-08 20:21:21

NFL Week 6 injury report: Jalen Carter's status in question for Eagles; Giants shorthanded at WR

cbssports.com/nfl/news/nfl-wee

@cowboys@darktundra.xyz
2025-10-10 14:13:33

Mailbag: Pick your poison stopping run, pass? dallascowboys.com/news/mailbag

@arXiv_csCL_bot@mastoxiv.page
2025-09-08 10:10:40

KERAG: Knowledge-Enhanced Retrieval-Augmented Generation for Advanced Question Answering
Yushi Sun, Kai Sun, Yifan Ethan Xu, Xiao Yang, Xin Luna Dong, Nan Tang, Lei Chen
arxiv.org/abs/2509.04716

@servelan@newsie.social
2025-09-04 16:14:19

Bill Cassidy Traps RFK Jr. With Nobel Prize Question
mediaite.com/media/news/senate

@raiders@darktundra.xyz
2025-11-04 23:18:22

Cowboys will get a chance to answer the question of whether a big trade would help foxsports.com/articles/nfl/cow

@arXiv_mathDS_bot@mastoxiv.page
2025-10-10 08:59:19

Mean dimension and rate-distortion function revisited
Rui Yang
arxiv.org/abs/2510.08051 arxiv.org/pdf/2510.08051

@Techmeme@techhub.social
2025-08-28 05:31:00

A look at India's rationale for banning online real-money games, with IT minister Ashwini Vaishnaw citing 450M people losing a combined ~$2.3B to them (Vivek Kaul/Newslaundry)
newslaundry.com/2025/08/27/the

@arXiv_csCL_bot@mastoxiv.page
2025-10-07 12:02:02

FocusMed: A Large Language Model-based Framework for Enhancing Medical Question Summarization with Focus Identification
Chao Liu, Ling Luo, Tengxiao Lv, Huan Zhuang, Lejing Yu, Jian Wang, Hongfei Lin
arxiv.org/abs/2510.04671

@NFL@darktundra.xyz
2025-09-03 22:16:23

Lamar Jackson contract: Ravens QB sidesteps question about extension, says he's 'not worried about that'

cbssports.com/nfl/news/lamar-j…

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:22:41

MoLoRAG: Bootstrapping Document Understanding via Multi-modal Logic-aware Retrieval
Xixi Wu, Yanchao Tan, Nan Hou, Ruiyang Zhang, Hong Cheng
arxiv.org/abs/2509.07666

@arXiv_csCL_bot@mastoxiv.page
2025-10-10 16:33:51

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[1/7]:
- Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering
Pragya Srivastava, Manuj Malik, Vivek Gupta, Tanuja Ganu, Dan Roth

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:31:01

SimpleQA Verified: A Reliable Factuality Benchmark to Measure Parametric Knowledge
Lukas Haas, Gal Yona, Giovanni D'Antonio, Sasha Goldshtein, Dipanjan Das
arxiv.org/abs/2509.07968

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:28:41

Are Humans as Brittle as Large Language Models?
Jiahui Li, Sean Papay, Roman Klinger
arxiv.org/abs/2509.07869 arxiv.org/pdf/2509.07869

@arXiv_csCL_bot@mastoxiv.page
2025-10-10 11:02:59

Beyond Turn Limits: Training Deep Search Agents with Dynamic Context Window
Qiaoyu Tang, Hao Xiang, Le Yu, Bowen Yu, Yaojie Lu, Xianpei Han, Le Sun, WenJuan Zhang, Pengbo Wang, Shixuan Liu, Zhenru Zhang, Jianhong Tu, Hongyu Lin, Junyang Lin
arxiv.org/abs/2510.08276

@arXiv_csCL_bot@mastoxiv.page
2025-10-10 10:52:09

AI Knowledge Assist: An Automated Approach for the Creation of Knowledge Bases for Conversational AI Agents
Md Tahmid Rahman Laskar, Julien Bouvier Tremblay, Xue-Yong Fu, Cheng Chen, Shashi Bhushan TN
arxiv.org/abs/2510.08149

@arXiv_csCL_bot@mastoxiv.page
2025-10-06 10:17:59

StepChain GraphRAG: Reasoning Over Knowledge Graphs for Multi-Hop Question Answering
Tengjun Ni, Xin Yuan, Shenghong Li, Kai Wu, Ren Ping Liu, Wei Ni, Wenjie Zhang
arxiv.org/abs/2510.02827

@arXiv_csCL_bot@mastoxiv.page
2025-09-08 10:11:10

Research on Multi-hop Inference Optimization of LLM Based on MQUAKE Framework
Zucheng Liang, Wenxin Wei, Kaijie Zhang, Hongyi Chen
arxiv.org/abs/2509.04770

@arXiv_csCL_bot@mastoxiv.page
2025-10-06 08:45:09

AMANDA: Agentic Medical Knowledge Augmentation for Data-Efficient Medical Visual Question Answering
Ziqing Wang, Chengsheng Mao, Xiaole Wen, Yuan Luo, Kaize Ding
arxiv.org/abs/2510.02328

@arXiv_csCL_bot@mastoxiv.page
2025-10-06 10:14:09

Uncertainty as Feature Gaps: Epistemic Uncertainty Quantification of LLMs in Contextual Question-Answering
Yavuz Bakman, Sungmin Kang, Zhiqi Huang, Duygu Nur Yaldiz, Catarina G. Belém, Chenyang Zhu, Anoop Kumar, Alfy Samuel, Salman Avestimehr, Daben Liu, Sai Praneeth Karimireddy
arxiv.org/abs/2510.02671