Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR
Xiao Liang, Zhongzhi Li, Yeyun Gong, Yelong Shen, Ying Nian Wu, Zhijiang Guo, Weizhu Chen
https://arxiv.org/abs/2508.14029
Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration
Zhicheng Yang, Zhijiang Guo, Yinya Huang, Yongxin Wang, Dongchun Xie, Yiwei Wang, Xiaodan Liang, Jing Tang
https://arxiv.org/abs/2508.13755
Reinforcement Learning with Rubric Anchors
Zenan Huang, Yihong Zhuang, Guoshan Lu, Zeyu Qin, Haokai Xu, Tianyu Zhao, Ru Peng, Jiaqi Hu, Zhanming Shen, Xiaomeng Hu, Xijun Gu, Peiyi Tu, Jiaxin Liu, Wenyu Chen, Yuzhuo Fu, Zhiting Fan, Yanmei Gu, Yuanyuan Wang, Zhengkai Yang, Jianguo Li, Junbo Zhao
https://arxiv.org/abs/2508.12790
I tried out a sign handle but didn't like it so I started to design my own...
Mine is parametric and has a separate clamping piece that attaches with two 8-32 (or 4mm) bolts.
On a 256x256 print bed you can print one about 270mm (10.6") tall.
(Here's the one I originally tried: https…
LaViPlan : Language-Guided Visual Path Planning with RLVR
Hayeon Oh
https://arxiv.org/abs/2507.12911 https://arxiv.org/pdf/2507.12911…
Hello everyone! A question about #OpenStreetMap. I've contributed modestly to #OSM in the past, but this time I'd be facing a much more complex challenge: I'd like to add a new public road as well as a new facility in my village, Guern. It is the…
MEML-GRPO: Heterogeneous Multi-Expert Mutual Learning for RLVR Advancement
Weitao Jia, Jinghui Lu, Haiyang Yu, Siqi Wang, Guozhi Tang, An-Lan Wang, Weijie Yin, Dingkang Yang, Yuxiang Nie, Bin Shan, Hao Feng, Irene Li, Kun Yang, Han Wang, Jingqun Tang, Teng Fu, Changhong Jin, Chao Feng, Xiaohui Lv, Can Huang
https://arxiv.org/abs/2508.09670…
You could 3D print a little sign holder handle...
➡️ https://www.printables.com/model/1248678-ergonomic-foam-board-holder-for-rallyprotest-signs
G$^2$RPO-A: Guided Group Relative Policy Optimization with Adaptive Guidance
Yongxin Guo, Wenbo Deng, Zhenglin Cheng, Xiaoying Tang
https://arxiv.org/abs/2508.13023