Tootfinder

No exact results. Similar results found.

@fanf@mendeddrum.org
2025-09-14 14:42:03

from my link log —
cargo-crev: A web-of-trust code review system for Rust.
https://github.com/crev-dev/cargo-crev
saved 2025-09-09

GitHub - crev-dev/cargo-crev: A cryptographically verifiable code review system for the cargo (Rust) package manager.

@arXiv_csCR_bot@mastoxiv.page
2025-10-14 11:39:28

A Scalable, Privacy-Preserving Decentralized Identity and Verifiable Data Sharing Framework based on Zero-Knowledge Proofs
Hui Yuan
https://arxiv.org/abs/2510.09715

With the proliferation of decentralized applications (DApps), the conflict between the transparency of blockchain technology and user data privacy has become increasingly prominent. While Decentralized Identity (DID) and Verifiable Credentials (VCs) provide a standardized framework for user data sovereignty, achieving trusted identity verification and data sharing without compromising privacy remains a significant challenge. This paper proposes a novel, comprehensive framework that integrates D…

@arXiv_csSE_bot@mastoxiv.page
2025-10-13 09:18:20

Faver: Boosting LLM-based RTL Generation with Function Abstracted Verifiable Middleware
Jianan Mu, Mingyu Shi, Yining Wang, Tianmeng Yang, Bin Sun, Xing Hu, Jing Ye, Huawei Li
https://arxiv.org/abs/2510.08664

LLM-based RTL generation is an interesting research direction, as it holds the potential to liberate the least automated stage in the current chip design. However, due to the substantial semantic gap between high-level specifications and RTL, coupled with limited training data, existing models struggle with generation accuracy. Drawing on human experience, design with verification helps improving accuracy. However, as the RTL testbench data are even more scarce, it is not friendly for LLMs. Alt…

@arXiv_csAI_bot@mastoxiv.page
2025-08-11 09:13:39

SKATE, a Scalable Tournament Eval: Weaker LLMs differentiate between stronger ones using verifiable challenges
Dewi S. W. Gould, Bruno Mlodozeniec, Samuel F. Brown
https://arxiv.org/abs/2508.06111

Evaluating the capabilities and risks of foundation models is paramount, yet current methods demand extensive domain expertise, hindering their scalability as these models rapidly evolve. We introduce SKATE: a novel evaluation framework in which large language models (LLMs) compete by generating and solving verifiable tasks for one another. Our core insight is to treat evaluation as a game: models act as both task-setters and solvers, incentivized to create questions which highlight their own s…

@arXiv_csCV_bot@mastoxiv.page
2025-10-13 10:25:00

Spotlight on Token Perception for Multimodal Reinforcement Learning
Siyuan Huang, Xiaoye Qu, Yafu Li, Yun Luo, Zefeng He, Daizong Liu, Yu Cheng
https://arxiv.org/abs/2510.09285 …

While Reinforcement Learning with Verifiable Rewards (RLVR) has advanced the reasoning capabilities of Large Vision-Language Models (LVLMs), most existing methods in multimodal reasoning neglect the critical role of visual perception within the RLVR optimization process. In this paper, we undertake a pioneering exploration of multimodal RLVR through the novel perspective of token perception, which measures the visual dependency of each generated token. With a granular analysis of Chain-of-Thoug…

@arXiv_csCL_bot@mastoxiv.page
2025-10-10 10:51:29

Interpreting LLM-as-a-Judge Policies via Verifiable Global Explanations
Jasmina Gajcin, Erik Miehling, Rahul Nair, Elizabeth Daly, Radu Marinescu, Seshu Tirupathi
https://arxiv.org/abs/2510.08120

Using LLMs to evaluate text, that is, LLM-as-a-judge, is increasingly being used at scale to augment or even replace human annotations. As such, it is imperative that we understand the potential biases and risks of doing so. In this work, we propose an approach for extracting high-level concept-based global policies from LLM-as-a-Judge. Our approach consists of two algorithms: 1) CLoVE (Contrastive Local Verifiable Explanations), which generates verifiable, concept-based, contrastive local expl…

@arXiv_csAI_bot@mastoxiv.page
2025-08-14 07:38:52

MEML-GRPO: Heterogeneous Multi-Expert Mutual Learning for RLVR Advancement
Weitao Jia, Jinghui Lu, Haiyang Yu, Siqi Wang, Guozhi Tang, An-Lan Wang, Weijie Yin, Dingkang Yang, Yuxiang Nie, Bin Shan, Hao Feng, Irene Li, Kun Yang, Han Wang, Jingqun Tang, Teng Fu, Changhong Jin, Chao Feng, Xiaohui Lv, Can Huang
https://arxiv.org/abs/2508.09670 …

Recent advances demonstrate that reinforcement learning with verifiable rewards (RLVR) significantly enhances the reasoning capabilities of large language models (LLMs). However, standard RLVR faces challenges with reward sparsity, where zero rewards from consistently incorrect candidate answers provide no learning signal, particularly in challenging tasks. To address this, we propose Multi-Expert Mutual Learning GRPO (MEML-GRPO), an innovative framework that utilizes diverse expert prompts as …

@arXiv_csCL_bot@mastoxiv.page
2025-08-14 09:43:42

Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning
Vaishnavi Shrivastava, Ahmed Awadallah, Vidhisha Balachandran, Shivam Garg, Harkirat Behl, Dimitris Papailiopoulos
https://arxiv.org/abs/2508.09726

Large language models trained with reinforcement learning with verifiable rewards tend to trade accuracy for length--inflating response lengths to achieve gains in accuracy. While longer answers may be warranted for harder problems, many tokens are merely "filler": repetitive, verbose text that makes no real progress. We introduce GFPO (Group Filtered Policy Optimization), which curbs this length explosion by sampling larger groups per problem during training and filtering responses to train on…

@arXiv_csAI_bot@mastoxiv.page
2025-08-14 08:58:12

RAGulating Compliance: A Multi-Agent Knowledge Graph for Regulatory QA
Bhavik Agarwal, Hemant Sunil Jomraj, Simone Kaplunov, Jack Krolick, Viktoria Rojkova
https://arxiv.org/abs/2508.09893

Regulatory compliance question answering (QA) requires precise, verifiable information, and domain-specific expertise, posing challenges for Large Language Models (LLMs). In this work, we present a novel multi-agent framework that integrates a Knowledge Graph (KG) of Regulatory triplets with Retrieval-Augmented Generation (RAG) to address these demands. First, agents build and maintain an ontology-free KG by extracting subject--predicate--object (SPO) triplets from regulatory documents and syst…

@arXiv_csCL_bot@mastoxiv.page
2025-08-14 07:32:32

ParallelSearch: Train your LLMs to Decompose Query and Search Sub-queries in Parallel with Reinforcement Learning
Shu Zhao, Tan Yu, Anbang Xu, Japinder Singh, Aaditya Shukla, Rama Akkiraju
https://arxiv.org/abs/2508.09303

Reasoning-augmented search agents such as Search-R1, trained via reinforcement learning with verifiable rewards (RLVR), demonstrate remarkable capabilities in multi-step information retrieval from external knowledge sources. These agents address the limitations of their parametric memory by dynamically gathering relevant facts to address complex reasoning tasks. However, existing approaches suffer from a fundamental architectural limitation: they process search queries strictly sequentially, ev…
