MaXIFE: Multilingual and Cross-lingual Instruction Following Evaluation
Yile Liu, Ziwei Ma, Xiu Jiang, Jinglu Hu, Jing Chang, Liang Li
https://arxiv.org/abs/2506.01776
A Framework Leveraging Large Language Models for Autonomous UAV Control in Flying Networks
Diana Nunes, Ricardo Amorim, Pedro Ribeiro, André Coelho, Rui Campos
https://arxiv.org/abs/2506.04404
Spore in the Wild: Case Study on Spore.fun, a Real-World Experiment of Sovereign Agent Open-ended Evolution on Blockchain with TEEs
Botao Amber Hu, Helena Rong
https://arxiv.org/abs/2506.04236
CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale
Zhun Wang, Tianneng Shi, Jingxuan He, Matthew Cai, Jialin Zhang, Dawn Song
https://arxiv.org/abs/2506.02548
This https://arxiv.org/abs/2501.16423 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_hepp…
Do we do #ICanHazPDF on Mastodon? https://saemobilus.sae.org/papers/demonstration-a-dme-dimethyl-ether-fuelled-city-bus-2000-01-2005
This https://arxiv.org/abs/2503.07792 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…
Violation of Luttinger's theorem in one-dimensional interacting fermions
Meng Gao, Yin Zhong
https://arxiv.org/abs/2506.04064
RewardBench 2: Advancing Reward Model Evaluation
Saumya Malik, Valentina Pyatkin, Sander Land, Jacob Morrison, Noah A. Smith, Hannaneh Hajishirzi, Nathan Lambert
https://arxiv.org/abs/2506.01937
CPU-Based Layout Design for Picker-to-Parts Pallet Warehouses
Timo Looms, Lin Xie
https://arxiv.org/abs/2506.04266 https://arxiv.org/…