
2025-09-01 08:55:53
An Empirical Study of Vulnerable Package Dependencies in LLM Repositories
Shuhan Liu, Xing Hu, Xin Xia, David Lo, Xiaohu Yang
https://arxiv.org/abs/2508.21417 https://
North Korean hackers target open-source repositories in new espionage campaign https://therecord.media/north-korean-hackers-targeting-open-source-repositories
It's finally time to get around to moving my projects off the LLM Torment Nexus, formerly #GitHub. The #Gentoo-related projects will move to our own infrastructure; for the near future, GitHub will continue to serve as a mirror / patch submission path. In the future, these functions will probably be taken over by …
TypyBench: Evaluating LLM Type Inference for Untyped Python Repositories
Honghua Dong, Jiacheng Yang, Xun Deng, Yuhe Jiang, Gennady Pekhimenko, Fan Long, Xujie Si
https://arxiv.org/abs/2507.22086
I need to remember to turn off the issues feature for sample GitHub repositories
DataLens: Enhancing Dataset Discovery via Network Topologies
Anaïs Ollagnier (CRISAM, CNRS, MARIANNE), Aline Menin (WIMMICS, Laboratoire I3S - SPARKS)
https://arxiv.org/abs/2507.23515
Recovering Signals in CoRoT Mission (RSCoRoT): I. Short Period Variable Stars
C. E. Ferreira Lopes, A. Papageorgiou, B. L. Canto Martins, M. Catelan, D. Hazarika, I. C. Leão, J. R. De Medeiros, E. Lalounta, P. E. Christopoulou, D. O. Fontinele, R. L. Gomes
https://arxiv.org/abs/2508.21665
Since at least 2009, I've had 4-5 git repositories online with gitweb.cgi for projects I worked on in the mid/late 2000s. Nothing super important, and no activity since 2010 or so, but just sitting there available.
I just had to kill the web server because Facebook's web crawlers are hitting it so hard that they overload the cheap VPS it's on (which I'm using for other things). It has literally been no bother for 15 years until now.
Meta truly is a …
RepoMark: A Code Usage Auditing Framework for Code Large Language Models
Wenjie Qu, Yuguang Zhou, Bo Wang, Wengrui Zheng, Yuexin Li, Jinyuan Jia, Jiaheng Zhang
https://arxiv.org/abs/2508.21432
And, boom, with that, all my servers and VPSs are monitored (25 total, with 3 hubs). Thank you, henrygd - https://github.com/henrygd - for Beszel - https://beszel.dev/
Predicting Maintenance Cessation of Open Source Software Repositories with An Integrated Feature Framework
Yiming Xu, Runzhi He, Hengzhi Ye, Minghui Zhou, Huaimin Wang
https://arxiv.org/abs/2507.21678 …
Knowledge engineering for open science: Building and deploying knowledge bases for metadata standards
Mark A. Musen, Martin J. O'Connor, Josef Hardi, Marcos Martinez-Romero
https://arxiv.org/abs/2507.22391
This post brought to you by finally figuring out what I hate about spreading related code across multiple repositories.
You see, I’m working in infrastructure now, so your code is my data.
For anyone waiting for a bookmark export feature of the #Vanadium browser, here's a workaround that seems to be working:
https://github.com/GrapheneOS/Vanadium/issues/…
User request for access to the main SharePoint site I administer. No, you will NOT get access to the main work SharePoint site. Even though many subsites and file repositories are locked with special permissions, just knowing what is there isn't something we want everyone to see.
Only a few people have access to the main site, and most of them have browse-only access, which is required for their jobs.
Bart Wullems of The Art of Simplicity shares his experience using Microsoft's .NET Source Browser to take a peek at a particular API's internals. The code documentation pages have links to the corresponding GitHub repositories where you can see exactly how any given function is implemented.
"Browse the .NET code base with the .NET Source Browser"
Classifying Issues in Open-source GitHub Repositories
Amir Hossain Raaj, Fairuz Nawer Meem, Sadia Afrin Mim
https://arxiv.org/abs/2507.18982 https://arxiv.…
Survey on longform content in institutional repositories. This is part of a study commissioned by The British Academy and carried out by Information Power Ltd. #OpenAccess #books #AcademicPublishing
Generative Foundation Model for Structured and Unstructured Electronic Health Records
Sonish Sivarajkumar, Hang Zhang, Yuelyu Ji, Maneesh Bilalpur, Xizhi Wu, Chenyu Li, Min Gu Kwak, Shyam Visweswaran, Yanshan Wang
https://arxiv.org/abs/2508.16054
Hooray, parallel downloading of repository metadata was merged for #dnf! 🥳
https://github.com/rpm-software-management/dnf5/issues/307#issuecomment-2…
FreeBSD pkg version 2.2.0 seems to simplify things for users of FreeBSD-kmods repositories.
Big thanks to @…
https://www.reddit.com/r/freebsd/comments/
AI-Powered Legal Intelligence System Architecture: A Comprehensive Framework for Automated Legal Consultation and Analysis
Sean Kalaycioglu, Bob Liu, Colin Hong, Haipeng Xie
https://arxiv.org/abs/2508.17499
Shit, what happened
https://github.com/AasishPokhrel/repository/issues/1
"Does the cafe entrance look accessible? Where is the door?" Towards Geospatial AI Agents for Visual Inquiries
Jon E. Froehlich, Jared Hwang, Zeyu Wang, John S. O'Meara, Xia Su, William Huang, Yang Zhang, Alex Fiannaca, Philip Nelson, Shaun Kane
https://arxiv.org/abs/2508.15752
Detecting Hard-Coded Credentials in Software Repositories via LLMs
Chidera Biringa, Gokhan Kul
https://arxiv.org/abs/2506.13090 https://
Measuring the effectiveness of code review comments in GitHub repositories: A machine learning approach
Shadikur Rahman, Umme Ayman Koana, Hasibul Karim Shanto, Mahmuda Akter, Chitra Roy, Aras M. Ismael
https://arxiv.org/abs/2508.16053
Virtual Multiplex Staining for Histological Images using a Marker-wise Conditioned Diffusion Model
Hyun-Jic Oh, Junsik Kim, Zhiyi Shi, Yichen Wu, Yu-An Chen, Peter K. Sorger, Hanspeter Pfister, Won-Ki Jeong
https://arxiv.org/abs/2508.14681
Today in how #Elsevier ruins my mood: Pure repositories pollute #Unpaywall with garbage data.
https://phabricator.wikimedia.org/T396…
What is going on in the land of Fedora Linux?
First, the latest kernel in Fedora 42 recently went from "working" to "broken" with regard to the video screen on Intel(R) Core(TM) Ultra 7 155H on an ASUSTeK COMPUTER INC. NUC14RVK-B/NUC14RVBU7
Second, the update repositories now all have invalid checksums. (Update: this is now corrected.)
Fedora is Red Hat is IBM - you would think that they could do some regression testing and maybe do a better locking of t…
Added a graphic showing the dependency tree of the #Linux @kernel-vanilla #Fedora coprs to https://fedoraproject.org/wi…
Prometheus: Unified Knowledge Graphs for Issue Resolution in Multilingual Codebases
Zimin Chen, Yue Pan, Siyu Lu, Jiayi Xu, Claire Le Goues, Martin Monperrus, He Ye
https://arxiv.org/abs/2507.19942
CodeEdu: A Multi-Agent Collaborative Platform for Personalized Coding Education
Jianing Zhao, Peng Gao, Jiannong Cao, Zhiyuan Wen, Chen Chen, Jianing Yin, Ruosong Yang, Bo Yuan
https://arxiv.org/abs/2507.13814
I've drafted support for verification of #PyPI provenance for #Gentoo.
You know, the new fancy thing that protects against supply chain attacks on PyPI, and verifies that you're using genuine #GitHub artifacts. Because, you know, GitHub repositories and deployment pipelines are an unlikely attack vector. And you definitely don't need to worry about #Microsoft owning the keys, the repositories and the pipelines at all.
#security #Python #SigStore
Adaptive Parallel Downloader for Large Genomic Datasets
Rasman Mubtasim Swargo, Engin Arslan, Md Arifuzzaman
https://arxiv.org/abs/2508.05511 https://arxiv…
USRN Discovery Pilot: Increasing the Discoverability of Open Access Content Through a National Network
Petr Knoth, Paul Walk, Matteo Cancellieri, Micheal Upshall, Halyna Torchylo, Jennifer Beamer, Kathleen Shearer, Heather Joseph
https://arxiv.org/abs/2508.02379
A Reproducible, Scalable Pipeline for Synthesizing Autoregressive Model Literature
Faruk Alpay, Bugra Kilictas, Hamdi Alakkad
https://arxiv.org/abs/2508.04612 https://
GitTaskBench: A Benchmark for Code Agents Solving Real-World Tasks Through Code Repository Leveraging
Ziyi Ni, Huacan Wang, Shuo Zhang, Shuo Lu, Ziyang He, Wang You, Zhenheng Tang, Yuntao Du, Bill Sun, Hongzhang Liu, Sen Hu, Ronghao Chen, Bo Li, Xin Li, Chen Hu, Binxing Jiao, Daxin Jiang, Pin Lyu
https://arxiv.org/abs/2508.18993
FuncVul: An Effective Function Level Vulnerability Detection Model using LLM and Code Chunk
Sajal Halder, Muhammad Ejaz Ahmed, Seyit Camtepe
https://arxiv.org/abs/2506.19453
LEO: An Open-Source Platform for Linking OMERO with Lab Notebooks and Heterogeneous Metadata Sources
Rodrigo Escobar Díaz Guerrero, Jamile Mohammad Jafari, Tobias Meyer-Zedler, Michael Schmitt, Juergen Popp, Thomas Bocklitz
https://arxiv.org/abs/2508.00654
Reflective Homework as a Learning Tool: Evidence from Comparing Thirteen Years of Dual vs. Single Submission
Madhur Dixit, Kavya Lalbahadur Joshi, Kaveri Bhalchandra Konde, Edward F. Gehringer
https://arxiv.org/abs/2508.09314
EEG Foundation Models for BCI Learn Diverse Features of Electrophysiology
Mattson Ogg, Rahul Hingorani, Diego Luna, Griffin W. Milsap, William G. Coon, Clara A. Scholl
https://arxiv.org/abs/2506.01867
PickleBall: Secure Deserialization of Pickle-based Machine Learning Models
Andreas D. Kellas, Neophytos Christou, Wenxin Jiang, Penghui Li, Laurent Simon, Yaniv David, Vasileios P. Kemerlis, James C. Davis, Junfeng Yang
https://arxiv.org/abs/2508.15987
Does AI Code Review Lead to Code Changes? A Case Study of GitHub Actions
Kexin Sun, Hongyu Kuang, Sebastian Baltes, Xin Zhou, He Zhang, Xiaoxing Ma, Guoping Rong, Dong Shao, Christoph Treude
https://arxiv.org/abs/2508.18771
MagicGUI: A Foundational Mobile GUI Agent with Scalable Data Pipeline and Reinforcement Fine-tuning
Liujian Tang, Shaokang Dong, Yijia Huang, Minqi Xiang, Hongtao Ruan, Bin Wang, Shuo Li, Zhihui Cao, Hailiang Pang, Heng Kong, He Yang, Mingxu Chai, Zhilin Gao, Xingyu Liu, Yingnan Fu, Jiaming Liu, Tao Gui, Xuanjing Huang, Yu-Gang Jiang, Qi Zhang, Kang Wang, Yunke Zhang, Yuran Wang
SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?
Xinyi He, Qian Liu, Mingzhe Du, Lin Yan, Zhijie Fan, Yiming Huang, Zejian Yuan, Zejun Ma
https://arxiv.org/abs/2507.12415
MeAJOR Corpus: A Multi-Source Dataset for Phishing Email Detection
Paulo Mendes (GECAD, ISEP, Polytechnic of Porto, Portugal), Eva Maia (GECAD, ISEP, Polytechnic of Porto, Portugal), Isabel Praça (GECAD, ISEP, Polytechnic of Porto, Portugal)
https://arxiv.org/abs/2507.17978
TRAIL: Joint Inference and Refinement of Knowledge Graphs with Large Language Models
Xinkui Zhao, Haode Li, Yifan Zhang, Guanjie Cheng, Yueshen Xu
https://arxiv.org/abs/2508.04474
This https://arxiv.org/abs/2506.02139 has been replaced.
link: https://scholar.google.com/scholar?q=a
Threshold-Protected Searchable Sharing: Privacy Preserving Aggregated-ANN Search for Collaborative RAG
Ruoyang Rykie Guo
https://arxiv.org/abs/2507.17199 https://
ModeliHub: A Web-based, Federated Analytics Platform for Modelica-centric, Model-based Systems Engineering
Mohamad Omar Nachawati
https://arxiv.org/abs/2506.18790
LMR-BENCH: Evaluating LLM Agent's Ability on Reproducing Language Modeling Research
Shuo Yan, Ruochen Li, Ziming Luo, Zimu Wang, Daoyang Li, Liqiang Jing, Kaiyu He, Peilin Wu, George Michalopoulos, Yue Zhang, Ziyang Zhang, Mian Zhang, Zhiyu Chen, Xinya Du
https://arxiv.org/abs/2506.17335…
A Methodological Framework for LLM-Based Mining of Software Repositories
Vincenzo De Martino, Joel Castaño, Fabio Palomba, Xavier Franch, Silverio Martínez-Fernández
https://arxiv.org/abs/2508.02233
This https://arxiv.org/abs/2505.19165 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…
Empirical Analysis of Temporal and Spatial Fault Characteristics in Multi-Fault Bug Repositories
Dylan Callaghan, Alexandra van der Spuy, Bernd Fischer
https://arxiv.org/abs/2508.08872
Dissecting the SWE-Bench Leaderboards: Profiling Submitters and Architectures of LLM- and Agent-Based Repair Systems
Matias Martinez, Xavier Franch
https://arxiv.org/abs/2506.17208
VulGuard: An Unified Tool for Evaluating Just-In-Time Vulnerability Prediction Models
Duong Nguyen, Manh Tran-Duc, Thanh Le-Cong, Triet Huynh Minh Le, M. Ali Babar, Quyet-Thang Huynh
https://arxiv.org/abs/2507.16685
Interoperable verification and dissemination of software assets in repositories using COAR Notify
Matteo Cancellieri, Martin Docekal, David Pride, Morane Gruenpeter, David Douard, Petr Knoth
https://arxiv.org/abs/2508.02335
Structural and Connectivity Patterns in the Maven Central Software Dependency Network
Daniel Ogenrwot, John Businge, Shaikh Arifuzzaman
https://arxiv.org/abs/2508.13819 https://…
This https://arxiv.org/abs/2504.16449 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCR_…
On the synchronization between Hugging Face pre-trained language models and their upstream GitHub repository
Ajibode Adekunle, Abdul Ali Bangash, Bram Adams, Ahmed E. Hassan
https://arxiv.org/abs/2508.10157
Improving Merge Pipeline Throughput in Continuous Integration via Pull Request Prioritization
Maximilian Jungwirth, Martin Gruber, Gordon Fraser
https://arxiv.org/abs/2508.08342
Bug Classification in Quantum Software: A Rule-Based Framework and Its Evaluation
Mir Mohammad Yousuf, Shabir Ahmad Sofi
https://arxiv.org/abs/2506.10397 h…
Repeton: Structured Bug Repair with ReAct-Guided Patch-and-Test Cycles
Nguyen Phu Vinh, Anh Chung Hoang, Chris Ngo, Truong-Son Hy
https://arxiv.org/abs/2506.08173
This https://arxiv.org/abs/2503.02610 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csSE_…
Bridging Language Gaps in Open-Source Documentation with Large-Language-Model Translation
Elijah Kayode Adejumo, Brittany Johnson, Mariam Guizani
https://arxiv.org/abs/2508.02497
Evaluating Software Supply Chain Security in Research Software
Richard Hegewald, Rebecca Beyer
https://arxiv.org/abs/2508.03856 https://arxiv.org/pdf/2508.…
This https://arxiv.org/abs/2505.23419 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csSE_…