Tootfinder

@arXiv_csCL_bot@mastoxiv.page
2025-08-20 10:01:00

Unintended Misalignment from Agentic Fine-Tuning: Risks and Mitigation
Dongyoon Hahm, Taywon Min, Woogyeol Jin, Kimin Lee
https://arxiv.org/abs/2508.14031 https://

Unintended Misalignment from Agentic Fine-Tuning: Risks and Mitigation
Beyond simple text generation, Large Language Models (LLMs) have evolved into agentic systems capable of planning and interacting with external tools to solve complex tasks. This evolution involves fine-tuning LLMs on agent-specific tasks to enhance their proficiency. However, safety concerns are frequently overlooked during this fine-tuning process. In this work, we show that aligned LLMs can become unintentionally misaligned, leading to a higher likelihood of executing harmful tasks and a red…

@arXiv_csAR_bot@mastoxiv.page
2025-08-21 07:30:59

MAHL: Multi-Agent LLM-Guided Hierarchical Chiplet Design with Adaptive Debugging
Jinwei Tang, Jiayin Qin, Nuo Xu, Pragnya Sudershan Nalla, Yu Cao, Yang (Katie) Zhao, Caiwen Ding
https://arxiv.org/abs/2508.14053

MAHL: Multi-Agent LLM-Guided Hierarchical Chiplet Design with Adaptive Debugging
As program workloads (e.g., AI) increase in size and algorithmic complexity, the primary challenge lies in their high dimensionality, encompassing computing cores, array sizes, and memory hierarchies. To overcome these obstacles, innovative approaches are required. Agile chip design has already benefited from machine learning integration at various stages, including logic synthesis, placement, and routing. With Large Language Models (LLMs) recently demonstrating impressive proficiency in Hardwa…

@arXiv_csAI_bot@mastoxiv.page
2025-08-19 10:51:40

HeroBench: A Benchmark for Long-Horizon Planning and Structured Reasoning in Virtual Worlds
Petr Anokhin, Roman Khalikov, Stefan Rebrikov, Viktor Volkov, Artyom Sorokin, Vincent Bissonnette
https://arxiv.org/abs/2508.12782

HeroBench: A Benchmark for Long-Horizon Planning and Structured Reasoning in Virtual Worlds
Large language models (LLMs) have shown remarkable capabilities in isolated step-by-step reasoning tasks such as mathematics and programming, but their proficiency in long-horizon planning, where solutions require extended, structured sequences of interdependent actions, remains underexplored. Existing benchmarks typically assess LLMs through abstract or low-dimensional algorithmic tasks, failing to capture the complexity of realistic planning environments. We introduce HeroBench, a novel bench…

@arXiv_csSE_bot@mastoxiv.page
2025-09-16 09:50:27

Rethinking Technology Stack Selection with AI Coding Proficiency
Xiaoyu Zhang, Weipeng Jiang, Juan Zhai, Shiqing Ma, Qingshuang Bao, Chenhao Lin, Chao Shen, Tianlin Li, Yang Liu
https://arxiv.org/abs/2509.11132

Rethinking Technology Stack Selection with AI Coding Proficiency
Large language models (LLMs) are now an integral part of software development workflows and are reshaping the whole process. Traditional technology stack selection has not caught up. Most of the existing selection methods focus solely on the inherent attributes of the technology, overlooking whether the LLM can effectively leverage the chosen technology. For example, when generating code snippets using popular libraries like Selenium (one of the most widely used test automation tools with over …

@arXiv_csCL_bot@mastoxiv.page
2025-07-14 09:58:12

The AI Language Proficiency Monitor -- Tracking the Progress of LLMs on Multilingual Benchmarks
David Pomerenke, Jonas Nothnagel, Simon Ostermann
https://arxiv.org/abs/2507.08538 …

The AI Language Proficiency Monitor -- Tracking the Progress of LLMs on Multilingual Benchmarks
To ensure equitable access to the benefits of large language models (LLMs), it is essential to evaluate their capabilities across the world's languages. We introduce the AI Language Proficiency Monitor, a comprehensive multilingual benchmark that systematically assesses LLM performance across up to 200 languages, with a particular focus on low-resource languages. Our benchmark aggregates diverse tasks including translation, question answering, math, and reasoning, using datasets such as FLORES+…

@arXiv_physicssocph_bot@mastoxiv.page
2025-06-25 08:12:20

How trust networks shape students' opinions about the proficiency of artificially intelligent assistants
Yutong Bu, Andrew Melatos, Robin Evans
https://arxiv.org/abs/2506.19655

How trust networks shape students' opinions about the proficiency of artificially intelligent assistants
The rising use of educational tools controlled by artificial intelligence (AI) has provoked a debate about their proficiency. While intrinsic proficiency, especially in tasks such as grading, has been measured and studied extensively, perceived proficiency remains underexplored. Here it is shown through Monte Carlo multi-agent simulations that trust networks among students influence their perceptions of the proficiency of an AI tool. A probabilistic opinion dynamics model is constructed, in whi…

@arXiv_csSE_bot@mastoxiv.page
2025-07-17 09:34:10

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?
Xinyi He, Qian Liu, Mingzhe Du, Lin Yan, Zhijie Fan, Yiming Huang, Zejian Yuan, Zejun Ma
https://arxiv.org/abs/2507.12415

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?
Code performance optimization is paramount in real-world software engineering and critical for production-level systems. While Large Language Models (LLMs) have demonstrated impressive capabilities in code generation and bug fixing, their proficiency in enhancing code performance at the repository level remains largely unexplored. To address this gap, we introduce SWE-Perf, the first benchmark specifically designed to systematically evaluate LLMs on code performance optimization tasks within au…

@arXiv_csCY_bot@mastoxiv.page
2025-08-26 10:55:06

Detecting Struggling Student Programmers using Proficiency Taxonomies
Noga Schwartz, Roy Fairstein, Avi Segal, Kobi Gal
https://arxiv.org/abs/2508.17353 https://

Detecting Struggling Student Programmers using Proficiency Taxonomies
Early detection of struggling student programmers is crucial for providing them with personalized support. While multiple AI-based approaches have been proposed for this problem, they do not explicitly reason about students' programming skills in the model. This study addresses this gap by developing in collaboration with educators a taxonomy of proficiencies that categorizes how students solve coding tasks and is embedded in the detection model. Our model, termed the Proficiency Taxonomy Model…

@arXiv_statME_bot@mastoxiv.page
2025-08-13 09:26:52

Analytics of Adaptive Online Testing in Practice Over a Decade
Hideo Hirose
https://arxiv.org/abs/2508.08643 https://arxiv.org/pdf/2508.08643

Analytics of Adaptive Online Testing in Practice Over a Decade
Adaptive online testing efficiently assesses examinee proficiency by dynamically adjusting the difficulty of test items based on their performance. To achieve this, items are selected so that their difficulty closely matches the test taker's estimated ability at each stage of the test. This alignment implies that the probability of a correct answer tends toward 0.5. However, in practical settings, this probability may not converge to 0.5 unless the test comprises a sufficiently large number of …

@arXiv_astrophHE_bot@mastoxiv.page
2025-07-14 09:43:12

Prospects for sub-GeV astrophysical neutrino detection with IceCube
Per Arne Sevle Myhr (for the IceCube Collaboration), Gwenhaël de Wasseige (for the IceCube Collaboration)
https://arxiv.org/abs/2507.08569

Prospects for sub-GeV astrophysical neutrino detection with IceCube
The IceCube Neutrino Observatory is currently the largest and most sensitive detector for astrophysical neutrinos and has pioneered the field of high-energy neutrino astronomy. Despite being designed with the primary goal of identifying astrophysical TeV neutrinos and their corresponding sources, recent studies, utilising the DeepCore subdetector, have shown IceCube's proficiency in being sensitive to astrophysical neutrinos at GeV energies. Currently, there is a gap in sensitivity between the …

@arXiv_csAI_bot@mastoxiv.page
2025-08-06 10:14:10

VQA support to Arabic Language Learning Educational Tool
Khaled Bachir Delassi (LIM Lab, Amar Telidji University, Laghouat, Algeria), Lakhdar Zeggane (LIM Lab, Amar Telidji University, Laghouat, Algeria), Hadda Cherroun (LIM Lab, Amar Telidji University, Laghouat, Algeria), Abdelhamid Haouhat (LIM Lab, Amar Telidji University, Laghouat, Algeria), Kaoutar Bouzouad (Computer Science Dept., USTHB, Algiers, Algeria)

VQA support to Arabic Language Learning Educational Tool
We address the problem of scarcity of educational Arabic Language Learning tools that advocate modern pedagogical models such as active learning which ensures language proficiency. In fact, we investigate the design and evaluation of an AI-powered educational tool designed to enhance Arabic language learning for non-native speakers with beginner-to-intermediate proficiency level. The tool leverages advanced AI models to generate interactive visual quizzes, deploying Visual Question Answering as…

@arXiv_csDB_bot@mastoxiv.page
2025-07-10 08:18:51

Interactive Text-to-SQL via Expected Information Gain for Disambiguation
Luyu Qiu, Jianing Li, Chi Su, Lei Chen
https://arxiv.org/abs/2507.06467 https://…

Interactive Text-to-SQL via Expected Information Gain for Disambiguation
Relational databases are foundational to numerous domains, including business intelligence, scientific research, and enterprise systems. However, accessing and analyzing structured data often requires proficiency in SQL, which is a skill that many end users lack. With the development of Natural Language Processing (NLP) technology, the Text-to-SQL systems attempt to bridge this gap by translating natural language questions into executable SQL queries via an automated algorithm. Yet, when operat…

@arXiv_csSE_bot@mastoxiv.page
2025-09-16 10:42:47

Do Code Semantics Help? A Comprehensive Study on Execution Trace-Based Information for Code Large Language Models
Jian Wang, Xiaofei Xie, Qiang Hu, Shangqing Liu, Yi Li
https://arxiv.org/abs/2509.11686

Do Code Semantics Help? A Comprehensive Study on Execution Trace-Based Information for Code Large Language Models
Code Large Language Models (Code LLMs) have opened a new era in programming with their impressive capabilities. However, recent research has revealed critical limitations in their ability to reason about runtime behavior and understand the actual functionality of programs, which poses significant challenges for their post-training and practical deployment. Specifically, Code LLMs encounter two principal issues: (1) a lack of proficiency in reasoning about program execution behavior, as they str…

@askesis@qoto.org
2025-07-01 10:58:46

# Philosophical test fails ChatGPT: AI coherence isn’t enough to prove human mind
The research reveals that #ChatGPT does exhibit proficiency in basic coherence building. It maintains consistent dictional and intentional lines by reusing phrases and aligning responses with contextual topics. It also demonstrates some ability to construct rational coherence by offering logically consistent replies…

@arXiv_csRO_bot@mastoxiv.page
2025-07-01 11:44:23

PAC Bench: Do Foundation Models Understand Prerequisites for Executing Manipulation Policies?
Atharva Gundawar, Som Sagar, Ransalu Senanayake
https://arxiv.org/abs/2506.23725

PAC Bench: Do Foundation Models Understand Prerequisites for Executing Manipulation Policies?
Vision-Language Models (VLMs) are increasingly pivotal for generalist robot manipulation, enabling tasks such as physical reasoning, policy generation, and failure detection. However, their proficiency in these high-level applications often assumes a deep understanding of low-level physical prerequisites, a capability that remains largely unverified. For robots to perform actions reliably, they must comprehend intrinsic object properties (e.g., material, weight), action affordances (e.g., grasp…

@arXiv_csLG_bot@mastoxiv.page
2025-08-26 12:25:46

CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics
Weida Wang, Dongchen Huang, Jiatong Li, Tengchao Yang, Ziyang Zheng, Di Zhang, Dong Han, Benteng Chen, Binzhao Luo, Zhiyu Liu, Kunling Liu, Zhiyuan Gao, Shiqi Geng, Wei Ma, Jiaming Su, Xin Li, Shuchen Pu, Yuhan Shui, Qianjia Cheng, Zhihao Dou, Dongfei Cui, Changyong He, Jin Zeng, Zeke Xie, Mao Su, Dongzhan Zhou, Yuqiang Li, Wanli Ouyang, Lei Bai, Yunqi Cai, Xi Dai, Shufei Zhang, Jinguang Cheng, Zh…

CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics
We introduce CMPhysBench, designed to assess the proficiency of Large Language Models (LLMs) in Condensed Matter Physics, as a novel Benchmark. CMPhysBench is composed of more than 520 graduate-level meticulously curated questions covering both representative subfields and foundational theoretical frameworks of condensed matter physics, such as magnetism, superconductivity, strongly correlated systems, etc. To ensure a deep understanding of the problem-solving process, we focus exclusively on ca…

@arXiv_csCL_bot@mastoxiv.page
2025-09-12 10:00:09

All for One: LLMs Solve Mental Math at the Last Token With Information Transferred From Other Tokens
Siddarth Mamidanna, Daking Rai, Ziyu Yao, Yilun Zhou
https://arxiv.org/abs/2509.09650

All for One: LLMs Solve Mental Math at the Last Token With Information Transferred From Other Tokens
Large language models (LLMs) demonstrate proficiency across numerous computational tasks, yet their inner workings remain unclear. In theory, the combination of causal self-attention and multilayer perceptron layers allows every token to access and compute information based on all preceding tokens. In practice, to what extent are such operations present? In this paper, on mental math tasks (i.e., direct math calculation via next-token prediction without explicit reasoning), we investigate this …

@arXiv_eessAS_bot@mastoxiv.page
2025-07-09 09:41:52

ContextASR-Bench: A Massive Contextual Speech Recognition Benchmark
He Wang, Linhan Ma, Dake Guo, Xiong Wang, Lei Xie, Jin Xu, Junyang Lin
https://arxiv.org/abs/2507.05727

ContextASR-Bench: A Massive Contextual Speech Recognition Benchmark
Automatic Speech Recognition (ASR) has been extensively investigated, yet prior evaluative efforts have largely been restricted to contextless paradigms. This constraint stems from the limited proficiency of conventional ASR models in context modeling and their deficiency in memory and reasoning based on world knowledge. Recent breakthroughs in the development of Large Language Models (LLMs) and corresponding Large Audio Language Models (LALMs) have markedly enhanced the visibility of general a…

@arXiv_csSI_bot@mastoxiv.page
2025-06-30 08:23:50

The Missing Link: Joint Legal Citation Prediction using Heterogeneous Graph Enrichment
Lorenz Wendlinger, Simon Alexander Nonn, Abdullah Al Zubaer, Michael Granitzer
https://arxiv.org/abs/2506.22165

The Missing Link: Joint Legal Citation Prediction using Heterogeneous Graph Enrichment
Legal systems heavily rely on cross-citations of legal norms as well as previous court decisions. Practitioners, novices and legal AI systems need access to these relevant data to inform appraisals and judgments. We propose a Graph-Neural-Network (GNN) link prediction model that can identify Case-Law and Case-Case citations with high proficiency through fusion of semantic and topological information. We introduce adapted relational graph convolutions operating on an extended and enriched versio…

@arXiv_csIR_bot@mastoxiv.page
2025-08-26 09:25:46

Demographically-Inspired Query Variants Using an LLM
Marwah Alaofi, Nicola Ferro, Paul Thomas, Falk Scholer, Mark Sanderson
https://arxiv.org/abs/2508.17644 https://

Demographically-Inspired Query Variants Using an LLM
This study proposes a method to diversify queries in existing test collections to reflect some of the diversity of search engine users, aligning with an earlier vision of an 'ideal' test collection. A Large Language Model (LLM) is used to create query variants: alternative queries that have the same meaning as the original. These variants represent user profiles characterised by different properties, such as language and domain proficiency, which are known in the IR literature to influence quer…

@arXiv_mathNA_bot@mastoxiv.page
2025-07-29 10:58:41

Enhancing Complex Injection Mold Design Validation Using Multicombined RV Environments
J. M. Mercado-Colmenero, D. F. Garcia-Molina, B. Gutierrez-Jimenez, C. Martin-Donate
https://arxiv.org/abs/2507.20732

Enhancing Complex Injection Mold Design Validation Using Multicombined RV Environments
The intricate design of real complex injection molds poses significant challenges. Mold design validation often falls to operators with tool-handling experience but limited CAD proficiency. Unlike other industries, the scale and costs of injection mold fabrication hinder prototyping before production. Virtual reality (VR) has emerged as a revolutionary solution offering a safe, immersive, and realistic experience and accessible using QR codes. This paper presents a new multimodal virtual envi…

@arXiv_csAI_bot@mastoxiv.page
2025-08-06 09:54:50

ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools
Shaofeng Yin, Ting Lei, Yang Liu
https://arxiv.org/abs/2508.03284 https://arxiv.org/pdf…

ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools
Integrating external tools into Large Foundation Models (LFMs) has emerged as a promising approach to enhance their problem-solving capabilities. While existing studies have demonstrated strong performance in tool-augmented Visual Question Answering (VQA), recent benchmarks reveal significant gaps in real-world tool-use proficiency, particularly in functionally diverse multimodal settings requiring multi-step reasoning. In this work, we introduce ToolVQA, a large-scale multimodal dataset compri…

@arXiv_csPF_bot@mastoxiv.page
2025-08-26 07:48:06

H2EAL: Hybrid-Bonding Architecture with Hybrid Sparse Attention for Efficient Long-Context LLM Inference
Zizhuo Fu, Xiaotian Guo, Wenxuan Zeng, Shuzhang Zhong, Yadong Zhang, Peiyu Chen, Runsheng Wang, Le Ye, Meng Li
https://arxiv.org/abs/2508.16653

H2EAL: Hybrid-Bonding Architecture with Hybrid Sparse Attention for Efficient Long-Context LLM Inference
Large language models (LLMs) have demonstrated remarkable proficiency in a wide range of natural language processing applications. However, the high energy and latency overhead induced by the KV cache limits the edge deployment, especially for long contexts. Emerging hybrid bonding (HB) technology has been proposed as a promising alternative to conventional near-memory processing (NMP) architectures, offering improved bandwidth efficiency and lower power consumption while exhibiting characteris…

@arXiv_statME_bot@mastoxiv.page
2025-07-04 08:49:01

On the analysis of sequential designs without a specified number of observations
Anna Klimova, Tamás Rudas
https://arxiv.org/abs/2507.02580 https://

On the analysis of sequential designs without a specified number of observations
The paper focuses on sequential experiments for categorical responses in which whether or not a further observation is made depends on the outcome of a previous experiment. Examples include subsequent medical interventions being performed or not depending on the result of a previous intervention, data about offspring, life tables, and repeated educational retraining until a certain proficiency level is achieved. Such experiments do not lead to data with a full Cartesian product structure and,…

@arXiv_csCY_bot@mastoxiv.page
2025-06-24 10:38:00

Optimizing Mastery Learning by Fast-Forwarding Over-Practice Steps
Meng Xia, Robin Schmucker, Conrad Borchers, Vincent Aleven
https://arxiv.org/abs/2506.17577

Optimizing Mastery Learning by Fast-Forwarding Over-Practice Steps
Mastery learning improves learning proficiency and efficiency. However, the overpractice of skills--students spending time on skills they have already mastered--remains a fundamental challenge for tutoring systems. Previous research has reduced overpractice through the development of better problem selection algorithms and the authoring of focused practice tasks. However, few efforts have concentrated on reducing overpractice through step-level adaptivity, which can avoid resource-intensive cur…

@arXiv_csCL_bot@mastoxiv.page
2025-07-31 09:54:01

Investigating Hallucination in Conversations for Low Resource Languages
Amit Das, Md. Najib Hasan, Souvika Sarkar, Zheng Zhang, Fatemeh Jamshidi, Tathagata Bhattacharya, Nilanjana Raychawdhury, Dongji Feng, Vinija Jain, Aman Chadha
https://arxiv.org/abs/2507.22720

Investigating Hallucination in Conversations for Low Resource Languages
Large Language Models (LLMs) have demonstrated remarkable proficiency in generating text that closely resembles human writing. However, they often generate factually incorrect statements, a problem typically referred to as 'hallucination'. Addressing hallucination is crucial for enhancing the reliability and effectiveness of LLMs. While much research has focused on hallucinations in English, our study extends this investigation to conversational data in three languages: Hindi, Farsi, and Mandari…

@arXiv_csAI_bot@mastoxiv.page
2025-08-26 10:10:27

Modular Embedding Recomposition for Incremental Learning
Aniello Panariello, Emanuele Frascaroli, Pietro Buzzega, Lorenzo Bonicelli, Angelo Porrello, Simone Calderara
https://arxiv.org/abs/2508.16463

Modular Embedding Recomposition for Incremental Learning
The advent of pre-trained Vision-Language Models (VLMs) has significantly transformed Continual Learning (CL), mainly due to their zero-shot classification abilities. Such proficiency makes VLMs well-suited for real-world applications, enabling robust performance on novel unseen classes without requiring adaptation. However, fine-tuning remains essential when downstream tasks deviate significantly from the pre-training domain. Prior CL approaches primarily focus on preserving the zero-shot capa…

@arXiv_csSE_bot@mastoxiv.page
2025-08-26 10:16:36

modelSolver: A Symbolic Model-Driven Solver for Power Network Simulation and Monitoring
Izudin Dzafic, Rabih A. Jabr
https://arxiv.org/abs/2508.17882 https://

modelSolver: A Symbolic Model-Driven Solver for Power Network Simulation and Monitoring
The development of advanced software tools for power system analysis requires extensive programming expertise. Even when using open-source tools, programming skills are essential to modify built-in models. This can be particularly challenging for domain experts who lack coding proficiency. This paper introduces modelSolver, a software solution with a new framework centered around symbolic mathematical modeling. The proposed paradigm facilitates defining models through intuitive mathematical exp…

@arXiv_csAI_bot@mastoxiv.page
2025-08-25 09:21:30

Modular Embedding Recomposition for Incremental Learning
Aniello Panariello, Emanuele Frascaroli, Pietro Buzzega, Lorenzo Bonicelli, Angelo Porrello, Simone Calderara
https://arxiv.org/abs/2508.16463

@arXiv_csSE_bot@mastoxiv.page
2025-08-25 09:35:30

AetherCode: Evaluating LLMs' Ability to Win In Premier Programming Competitions
Zihan Wang, Jiaze Chen, Zhicheng Liu, Markus Mak, Yidi Du, Geonsik Moon, Luoqi Xu, Aaron Tua, Kunshuo Peng, Jiayi Lu, Mingfei Xia, Boqian Zou, Chenyang Ran, Guang Tian, Shoutai Zhu, Yeheng Duan, Zhenghui Kang, Zhenxing Lin, Shangshu Li, Qiang Luo, Qingshen Long, Zhiyong Chen, Yihan Xiao, Yurong Wu, Daoguang Zan, Yuyi Fu, Mingxuan Wang, Ming Ding

AetherCode: Evaluating LLMs' Ability to Win In Premier Programming Competitions
Competitive programming has emerged as a critical benchmark for evaluating the reasoning and coding capabilities of Large Language Models (LLMs). Despite impressive progress on existing benchmarks, we argue that current evaluations overstate model proficiency, masking a substantial gap between LLMs and elite human programmers. This gap arises from two key limitations: insufficient difficulty and scope of benchmark problems, and evaluation bias from low-quality test cases. To address these short…

@arXiv_csAI_bot@mastoxiv.page
2025-08-27 10:09:53

Who Is Lagging Behind: Profiling Student Behaviors with Graph-Level Encoding in Curriculum-Based Online Learning Systems
Qian Xiao, Conn Breathnach, Ioana Ghergulescu, Conor O'Sullivan, Keith Johnston, Vincent Wade
https://arxiv.org/abs/2508.18925

Who Is Lagging Behind: Profiling Student Behaviors with Graph-Level Encoding in Curriculum-Based Online Learning Systems
The surge in the adoption of Intelligent Tutoring Systems (ITSs) in education, while being integral to curriculum-based learning, can inadvertently exacerbate performance gaps. To address this problem, student profiling becomes crucial for tracking progress, identifying struggling students, and alleviating disparities among students. Such profiling requires measuring student behaviors and performance across different aspects, such as content coverage, learning intensity, and proficiency in dif…

@arXiv_csSE_bot@mastoxiv.page
2025-08-26 10:37:06

A Large-Scale Study on Developer Engagement and Expertise in Configurable Software System Projects
Karolina M. Milano, Wesley K. G. Assunção, Bruno B. P. Cafeo
https://arxiv.org/abs/2508.18070

A Large-Scale Study on Developer Engagement and Expertise in Configurable Software System Projects
Modern systems operate in multiple contexts making variability a fundamental aspect of Configurable Software Systems (CSSs). Variability, implemented via pre-processor directives (e.g., #ifdef blocks) interleaved with other code and spread across files, complicates maintenance and increases error risk. Despite its importance, little is known about how variable code is distributed among developers or whether conventional expertise metrics adequately capture variable code proficiency. This study …

@arXiv_csSE_bot@mastoxiv.page
2025-07-24 09:18:50

How Do Code Smells Affect Skill Growth in Scratch Novice Programmers?
Ricardo Hidalgo Aragón, Jesús M. González-Barahona, Gregorio Robles
https://arxiv.org/abs/2507.17314

How Do Code Smells Affect Skill Growth in Scratch Novice Programmers?
Context. Code smells, which are recurring anomalies in design or style, have been extensively researched in professional code. However, their significance in block-based projects created by novices is still largely unknown. Block-based environments such as Scratch offer a unique, data-rich setting to examine how emergent design problems intersect with the cultivation of computational-thinking (CT) skills. Objective. This research explores the connection between CT proficiency and design-level c…
