Tootfinder

Opt-in global Mastodon full-text search.

@arXiv_csAI_bot@mastoxiv.page
2025-08-19 10:53:40

Reinforcement Learning with Rubric Anchors
Zenan Huang, Yihong Zhuang, Guoshan Lu, Zeyu Qin, Haokai Xu, Tianyu Zhao, Ru Peng, Jiaqi Hu, Zhanming Shen, Xiaomeng Hu, Xijun Gu, Peiyi Tu, Jiaxin Liu, Wenyu Chen, Yuzhuo Fu, Zhiting Fan, Yanmei Gu, Yuanyuan Wang, Zhengkai Yang, Jianguo Li, Junbo Zhao
arxiv.org/abs/2508.12790

@brian_gettler@mas.to
2025-07-18 14:23:14

When someone asks you to do something that's _part of your effin' job_, answer ASAP. Say 'yes' or say 'no' (if you can), but don't let the request sit for days or weeks. The time you spend ignoring the request might cause problems for the human(s) involved.

@arXiv_csCL_bot@mastoxiv.page
2025-08-20 08:18:59

ProMed: Shapley Information Gain Guided Reinforcement Learning for Proactive Medical LLMs
Hongxin Ding, Baixiang Huang, Yue Fang, Weibin Liao, Xinke Jiang, Zheng Li, Junfeng Zhao, Yasha Wang
arxiv.org/abs/2508.13514

@arXiv_csCV_bot@mastoxiv.page
2025-08-19 12:05:10

Breaking Reward Collapse: Adaptive Reinforcement for Open-ended Medical Reasoning with Enhanced Semantic Discrimination
Yizhou Liu, Jingwei Wei, Zizhi Chen, Minghao Han, Xukun Zhang, Keliang Liu, Lihua Zhang
arxiv.org/abs/2508.12957

@thek3nger@mastodon.social
2025-08-19 16:03:18

Okay. This is my current top candidate for the most absurd, cool, and fascinating experiment/article of 2025.
Can you use candles as a clock signal for a CPU? Surprisingly, the answer is yes.
cpldcpu.com/2025/08/13/candle-

@arXiv_csAI_bot@mastoxiv.page
2025-08-20 09:47:40

LM Agents May Fail to Act on Their Own Risk Knowledge
Yuzhi Tang, Tianxiao Li, Elizabeth Li, Chris J. Maddison, Honghua Dong, Yangjun Ruan
arxiv.org/abs/2508.13465

@arXiv_csDS_bot@mastoxiv.page
2025-09-19 08:18:11

Minimum Sum Coloring with Bundles in Trees and Bipartite Graphs
Takehiro Ito, Naonori Kakimura, Naoyuki Kamiyama, Yusuke Kobayashi, Yoshio Okamoto
arxiv.org/abs/2509.15080

@arXiv_csRO_bot@mastoxiv.page
2025-08-18 08:01:30

Robust Online Calibration for UWB-Aided Visual-Inertial Navigation with Bias Correction
Yizhi Zhou, Jie Xu, Jiawei Xia, Zechen Hu, Weizi Li, Xuan Wang
arxiv.org/abs/2508.10999

@hanno@mastodon.social
2025-07-17 12:22:47

You may've seen @…'s post about the mails he gets from large companies about CRA compliance, asking him to answer all kinds of questions. I just saw a similar one sent to the maintainer of another important FOSS library. The kicker: the company uses a version from 2000. (Yes, no typo, 25 years old. I think it has some unfixed vulnerabilities.)

@arXiv_csAI_bot@mastoxiv.page
2025-08-19 10:19:50

Wisdom of the Crowd: Reinforcement Learning from Coevolutionary Collective Feedback
Wenzhen Yuan, Shengji Tang, Weihao Lin, Jiacheng Ruan, Ganqu Cui, Bo Zhang, Tao Chen, Ting Liu, Yuzhuo Fu, Peng Ye, Lei Bai
arxiv.org/abs/2508.12338