2025-10-06 07:30:19
BrowserArena: Evaluating LLM Agents on Real-World Web Navigation Tasks
Sagnik Anupam, Davis Brown, Shuo Li, Eric Wong, Hamed Hassani, Osbert Bastani
https://arxiv.org/abs/2510.02418
Knowledge Extraction on Semi-Structured Content: Does It Remain Relevant for Question Answering in the Era of LLMs?
Kai Sun, Yin Huang, Srishti Mehra, Mohammad Kachuee, Xilun Chen, Renjie Tao, Zhaojiang Lin, Andrea Jessee, Nirav Shah, Alex Betty, Yue Liu, Anuj Kumar, Wen-tau Yih, Xin Luna Dong
https://arxiv.org/abs/2509.25107