Ladies and gentlemen and others, this is why I recommend hosting your own forge, like Forgejo: https://mastodon.social/@mcc/114536667832141959
Also, as I recently discovered: GitHub's git implementation is pretty dumb and reports unsolvable conflicts that are automatically solved by…
"My productivity is boosted, but ..." Demystifying Users' Perception on AI Coding Assistants
Yunbo Lyu, Zhou Yang, Jieke Shi, Jianming Chang, Yue Liu, David Lo
https://arxiv.org/abs/2508.12285
The Impact of Large Language Models (LLMs) on Code Review Process
Antonio Collante, Samuel Abedu, SayedHassan Khatoonabadi, Ahmad Abdellatif, Ebube Alor, Emad Shihab
https://arxiv.org/abs/2508.11034
Low-rank Momentum Factorization for Memory Efficient Training
Pouria Mahdavinia, Mehrdad Mahdavi
https://arxiv.org/abs/2507.08091
Abstract: Fine-tuning large foundation models presents significant memory challenges due to stateful optimizers like AdamW, often requiring several times more GPU memory than inference. While memory-efficient methods like parameter-efficient fine-tuning (e.g., LoRA) and optimizer state compression exist, recent approaches like GaLore bridge these by using low-rank gradient projections and subspace moment accumulation. However, such methods may struggle with fixed subspaces or computationally costly offline resampling (e.g., requiring full-matrix SVDs). We propose Momentum Factorized SGD (MoFaSGD), which maintains a dynamically updated low-rank SVD representation of the first-order momentum, closely approximating its full-rank counterpart throughout training. This factorization enables a memory-efficient fine-tuning method that adaptively updates the optimization subspace at each iteration. Crucially, MoFaSGD leverages the computed low-rank momentum factors to perform efficient spectrally normalized updates, offering an alternative to subspace moment accumulation. We establish theoretical convergence guarantees for MoFaSGD, proving it achieves an optimal rate for non-convex stochastic optimization under standard assumptions. Empirically, we demonstrate MoFaSGD's effectiveness on large language model alignment benchmarks, achieving a competitive trade-off between memory reduction (comparable to LoRA) and performance compared to state-of-the-art low-rank optimization methods. Our implementation is available at https://github.com/pmahdavi/MoFaSGD.
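For intuition, here is a minimal numpy sketch of the general idea: keep only rank-r SVD factors of the first-order momentum and apply a spectrally normalized update. The function name, the rank choice, and the plain truncated SVD below are illustrative assumptions; the paper maintains the factorization incrementally precisely to avoid recomputing full SVDs, so this is a sketch of the concept, not the authors' implementation.

```python
# Sketch of low-rank momentum factorization (MoFaSGD-style idea).
# NOT the authors' implementation: the truncated SVD here is a
# stand-in for the paper's efficient incremental factor update,
# and materializing M is only done for clarity.
import numpy as np

def mofasgd_step(W, grad, U, S, V, lr=1e-3, beta=0.9, rank=8):
    """One update on a matrix parameter W using rank-`rank` momentum."""
    # Reconstruct the low-rank momentum estimate M ~= U diag(S) V^T
    M = (U * S) @ V.T
    # Standard first-order momentum accumulation
    M = beta * M + (1.0 - beta) * grad
    # Re-factorize and truncate to rank r (the real method updates the
    # factors in place without forming M or running a full SVD)
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    U, S, V = U[:, :rank], S[:rank], Vt[:rank].T
    # Spectrally normalized update: step in the direction U V^T,
    # i.e. the momentum with all singular values set to 1
    W = W - lr * (U @ V.T)
    return W, U, S, V

# Toy usage: only the rank-r factors (U, S, V) persist between steps,
# so optimizer state is O((m + n) r) rather than AdamW's O(m n).
m, n, r = 64, 32, 8
W = np.random.randn(m, n)
U, S, V = np.zeros((m, r)), np.zeros(r), np.zeros((n, r))
grad = np.random.randn(m, n)
W, U, S, V = mofasgd_step(W, grad, U, S, V)
```

The O((m + n) r) state is what makes the memory footprint comparable to LoRA while still adapting the optimization subspace every iteration.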
Social Media Reactions to Open Source Promotions: AI-Powered GitHub Projects on Hacker News
Prachnachai Meakpaiboonwattana, Warittha Tarntong, Thai Mekratanavorakul, Chaiyong Ragkhitwetsagul, Pattaraporn Sangaroonsilp, Raula Kula, Morakot Choetkiertikul, Kenichi Matsumoto, Thanwadee Sunetnanta
https://arxiv.org/abs/2506.12643
ASTRA: Autonomous Spatial-Temporal Red-teaming for AI Software Assistants
Xiangzhe Xu, Guangyu Shen, Zian Su, Siyuan Cheng, Hanxi Guo, Lu Yan, Xuan Chen, Jiasheng Jiang, Xiaolong Jin, Chengpeng Wang, Zhuo Zhang, Xiangyu Zhang
https://arxiv.org/abs/2508.03936
I'm not surprised that GitLab decided to run off a cliff to follow GitHub:
«AI coding bot allows prompt injection with a pull request»
Every day I'm more grateful for @… and @…!
https://pivot-to-ai.com/2025/05/24/ai-coding-bot-allows-prompt-injection-with-a-pull-request/
On the synchronization between Hugging Face pre-trained language models and their upstream GitHub repository
Ajibode Adekunle, Abdul Ali Bangash, Bram Adams, Ahmed E. Hassan
https://arxiv.org/abs/2508.10157
Sorrel: A simple and flexible framework for multi-agent reinforcement learning
Rebekah A. Gelpí, Yibing Ju, Ethan C. Jackson, Yikai Tang, Shon Verch, Claas Voelcker, William A. Cunningham
https://arxiv.org/abs/2506.00228
Understanding the Issue Types in Open Source Blockchain-based Software Projects with the Transformer-based BERTopic
Md Nahidul Islam Opu, Md Shahidul Islam, Sara Rouhani, Shaiful Chowdhury
https://arxiv.org/abs/2506.11451
Prospective Learning in Retrospect
Yuxin Bai, Cecelia Shuai, Ashwin De Silva, Siyu Yu, Pratik Chaudhari, Joshua T. Vogelstein
https://arxiv.org/abs/2507.07965
Abstract: In most real-world applications of artificial intelligence, the distributions of the data and the goals of the learners tend to change over time. The Probably Approximately Correct (PAC) learning framework, which underpins most machine learning algorithms, fails to account for dynamic data distributions and evolving objectives, often resulting in suboptimal performance. Prospective learning is a recently introduced mathematical framework that overcomes some of these limitations. We build on this framework to present preliminary results that improve the algorithm and numerical results, and extend prospective learning to sequential decision-making scenarios, specifically foraging. Code is available at: https://github.com/neurodata/prolearn2.
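As a toy illustration of the prospective idea (hypotheses that take time as an input, so the learner optimizes for future, drifted data rather than the current distribution), here is a hedged sketch. The drift model and features below are invented for illustration and are unrelated to the prolearn2 codebase.

```python
# Toy illustration of prospective learning: instead of one
# time-invariant hypothesis (the PAC setting), fit h(x, t) with time
# as an input so predictions at *future* t track a drifting
# distribution. Hypothetical sketch, not neurodata/prolearn2.
import numpy as np

rng = np.random.default_rng(0)

# Drifting data: the regression slope grows linearly with time t.
def sample(t, n=64):
    x = rng.normal(size=n)
    y = (1.0 + 0.1 * t) * x + 0.1 * rng.normal(size=n)
    return x, y

# Collect a stream of (x, t, y) observations up to the present T.
T = 50
X, Y = [], []
for t in range(T):
    x, y = sample(t)
    X.append(np.stack([x, np.full_like(x, t), x * t], axis=1))
    Y.append(y)
X, Y = np.concatenate(X), np.concatenate(Y)

# Least-squares fit of h(x, t) = a*x + b*t + c*x*t (time-aware).
w, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Prospective prediction at future t = 60, where the true slope has
# drifted to 1 + 0.1*60 = 7; the time-aware model extrapolates it.
x_future = np.array([1.0])
t_future = 60
feat = np.stack([x_future, [t_future], x_future * t_future], axis=1)
print("predicted y at t=60:", feat @ w)  # close to 7.0
# A time-invariant fit would instead land near the average past slope:
print("average past slope:", np.mean([1 + 0.1 * t for t in range(T)]))
```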
LadyBug: A GitHub Bot for UI-Enhanced Bug Localization in Mobile Apps
Junayed Mahmud, James Chen, Terry Achille, Camilo Alvarez-Velez, Darren Dean Bansil, Patrick Ijieh, Samar Karanch, Nadeeshan De Silva, Oscar Chaparro, Andrian Marcus, Kevin Moran
https://arxiv.org/abs/2508.05085
Divide and Conquer: A Large-Scale Dataset and Model for Left-Right Breast MRI Segmentation
Maximilian Rokuss, Benjamin Hamm, Yannick Kirchhoff, Klaus Maier-Hein
https://arxiv.org/abs/2507.13830
The Effects of GitHub Copilot on Computing Students' Programming Effectiveness, Efficiency, and Processes in Brownfield Programming Tasks
Md Istiak Hossain Shihab, Christopher Hundhausen, Ahsun Tariq, Summit Haque, Yunhan Qiao, Brian Mulanda
https://arxiv.org/abs/2506.10051
An Empirical Study on Virtual Reality Software Security Weaknesses
Yifan Xu, Jinfu Chen, Zhenyu Qi, Huashan Chen, Junyi Wang, Pengfei Hu, Feng Liu, Sen He
https://arxiv.org/abs/2507.17324
SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks
Lianghong Guo, Yanlin Wang, Caihua Li, Pengyu Yang, Jiachi Chen, Wei Tao, Yingtian Zou, Duyu Tang, Zibin Zheng
https://arxiv.org/abs/2506.10954
Replaced article(s) found for cs.SE: https://arxiv.org/list/cs.SE/new/
[1/1]: Enhancing Open-Domain Task-Solving Capability of LLMs via Autonomous Tool Integration from GitHub
What Makes ChatGPT Effective for Software Issue Resolution? An Empirical Study of Developer-ChatGPT Conversations in GitHub
Ramtin Ehsani, Sakshi Pathak, Esteban Parra, Sonia Haiduc, Preetha Chatterjee
https://arxiv.org/abs/2506.22390
Encouraging Students' Responsible Use of GenAI in Software Engineering Education: A Causal Model and Two Institutional Applications
Vahid Garousi, Zafar Jafarov, Aytan Movsumova, Atif Namazov, Huseyn Mirzayev
https://arxiv.org/abs/2506.00682
Echoes of AI: Investigating the Downstream Effects of AI Assistants on Software Maintainability
Markus Borg, Dave Hewett, Nadim Hagatulah, Noric Couderc, Emma Söderberg, Donald Graham, Uttam Kini, Dave Farley
https://arxiv.org/abs/2507.00788
QLPro: Automated Code Vulnerability Discovery via LLM and Static Code Analysis Integration
Junze Hu, Xiangyu Jin, Yizhe Zeng, Yuling Liu, Yunpeng Li, Dan Du, Kaiyu Xie, Hongsong Zhu
https://arxiv.org/abs/2506.23644
From Release to Adoption: Challenges in Reusing Pre-trained AI Models for Downstream Developers
Peerachai Banyongrakkul, Mansooreh Zahedi, Patanamon Thongtanunam, Christoph Treude, Haoyu Gao
https://arxiv.org/abs/2506.23234
VulGuard: An Unified Tool for Evaluating Just-In-Time Vulnerability Prediction Models
Duong Nguyen, Manh Tran-Duc, Thanh Le-Cong, Triet Huynh Minh Le, M. Ali Babar, Quyet-Thang Huynh
https://arxiv.org/abs/2507.16685

VulGuard: An Unified Tool for Evaluating Just-In-Time Vulnerability Prediction Models
We present VulGuard, an automated tool designed to streamline the extraction, processing, and analysis of commits from GitHub repositories for Just-In-Time vulnerability prediction (JIT-VP) research. VulGuard automatically mines commit histories, extracts fine-grained code changes, commit messages, and software engineering metrics, and formats them for downstream analysis. In addition, it integrates several state-of-the-art vulnerability prediction models, allowing researchers to train, evaluat…
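For a sense of the mining step such a tool automates, here is a hypothetical GitPython sketch that extracts a few common JIT-style commit features. The feature set and function name are illustrative assumptions, not VulGuard's actual interface or schema.

```python
# Hypothetical sketch of commit mining for JIT vulnerability
# prediction, using GitPython. The features below are a small subset
# of common Kamei-style JIT metrics, chosen for illustration only.
from git import Repo  # pip install GitPython

def extract_commit_features(repo_path, max_count=100):
    repo = Repo(repo_path)
    rows = []
    for commit in repo.iter_commits("HEAD", max_count=max_count):
        stats = commit.stats
        rows.append({
            "sha": commit.hexsha,
            "message": commit.message.strip(),
            "author": commit.author.name,
            "when": commit.committed_datetime.isoformat(),
            # Size metrics: lines added/deleted and files touched
            "lines_added": stats.total["insertions"],
            "lines_deleted": stats.total["deletions"],
            "files_changed": len(stats.files),
            # Diffusion proxy: top-level directories touched
            "dirs_touched": len({p.split("/")[0] for p in stats.files}),
        })
    return rows

# Usage (run inside any git checkout): each row would then be labeled
# vulnerability-inducing or not and fed to a downstream JIT-VP model.
for row in extract_commit_features(".", max_count=5):
    print(row["sha"][:8], row["files_changed"], "files")
```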
Toward Inclusive AI-Driven Development: Exploring Gender Differences in Code Generation Tool Interactions
Manaal Basha, Ivan Beschastnikh, Gema Rodriguez-Perez, Cleidson R. B. de Souza
https://arxiv.org/abs/2507.14770