RewardBench 2: Advancing Reward Model Evaluation
Saumya Malik, Valentina Pyatkin, Sander Land, Jacob Morrison, Noah A. Smith, Hannaneh Hajishirzi, Nathan Lambert
https://arxiv.org/abs/2506.01937
CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale
Zhun Wang, Tianneng Shi, Jingxuan He, Matthew Cai, Jialin Zhang, Dawn Song
https://arxiv.org/abs/2506.02548
A High-Performance Evolutionary Multiobjective Community Detection Algorithm
Guilherme O. Santos, Lucas S. Vieira, Giulio Rossetti, Carlos H. G. Ferreira, Gladston Moreira
https://arxiv.org/abs/2506.01752
HASD: Hierarchical Adaption for pathology Slide-level Domain-shift
Jingsong Liu, Han Li, Chen Yang, Michael Deutges, Ario Sadafi, Xin You, Katharina Breininger, Nassir Navab, Peter J. Schüffler
https://arxiv.org/abs/2506.23673
This https://arxiv.org/abs/2505.13476 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_mat…
Characterizing Small Circuit Classes from FAC^0 to FAC^1 via Discrete Ordinary Differential Equations
Melissa Antonelli, Arnaud Durand, Juha Kontinen
https://arxiv.org/abs/2506.23404
Any #flameshot #sway users who might have a problem with fullscreen captures, you want this:
# fix flameshot: float its window borderless at the top-left corner, not fullscreen, and focus it
for_window [app_id="flameshot"] border pixel 0, floating enable, fullscreen disable, move absolute position 0 0, focus
CBF-AFA: Chunk-Based Multi-SSL Fusion for Automatic Fluency Assessment
Papa Séga Wade, Mihai Andries, Ioannis Kanellos, Thierry Moudenc
https://arxiv.org/abs/2506.20243
Dynamics of thin film flows on a vertical fibre with vapor absorption
Souradip Chattopadhyay, Zihao Yu, Y. Sungtaek Ju, Hangjie Ji
https://arxiv.org/abs/2505.22379