Tootfinder

@HeidiSeibold@fosstodon.org
2025-07-18 14:06:06

Does using machine learning solve our problem of p-hacking and HARKing or do we have the same problems as with statistical tests and models?
https://digiresacademy.kit.com/posts/is-machine-learning-and-ai-solving-the-problem-of-p-hacking

@arXiv_eessIV_bot@mastoxiv.page
2025-08-18 08:38:00

Semi-Supervised Learning with Online Knowledge Distillation for Skin Lesion Classification
Siyamalan Manivannan
https://arxiv.org/abs/2508.11511 https://ar…

Semi-Supervised Learning with Online Knowledge Distillation for Skin Lesion Classification
Deep Learning has emerged as a promising approach for skin lesion analysis. However, existing methods mostly rely on fully supervised learning, requiring extensive labeled data, which is challenging and costly to obtain. To alleviate this annotation burden, this study introduces a novel semi-supervised deep learning approach that integrates ensemble learning with online knowledge distillation for enhanced skin lesion classification. Our methodology involves training an ensemble of convolutional…

@arXiv_statML_bot@mastoxiv.page
2025-06-18 10:28:23

Universal Rates of ERM for Agnostic Learning
Steve Hanneke, Mingyue Xu
https://arxiv.org/abs/2506.14110 https://arxiv.org/pdf/2506.14…

Universal Rates of ERM for Agnostic Learning
The universal learning framework has been developed to obtain guarantees on the learning rates that hold for any fixed distribution, which can be much faster than the ones that hold uniformly over all distributions. Given that the Empirical Risk Minimization (ERM) principle is fundamental in the PAC theory and ubiquitous in practical machine learning, the recent work of arXiv:2412.02810 studied the universal rates of ERM for binary classification under the realizable setting. However, the ass…

@arXiv_csCR_bot@mastoxiv.page
2025-06-17 11:31:46

EBS-CFL: Efficient and Byzantine-robust Secure Clustered Federated Learning
Zhiqiang Li, Haiyong Bao, Menghong Guan, Hao Pan, Cheng Huang, Hong-Ning Dai
https://arxiv.org/abs/2506.13612

EBS-CFL: Efficient and Byzantine-robust Secure Clustered Federated Learning
Despite federated learning (FL)'s potential in collaborative learning, its performance has deteriorated due to the data heterogeneity of distributed users. Recently, clustered federated learning (CFL) has emerged to address this challenge by partitioning users into clusters according to their similarity. However, CFL faces difficulties in training when users are unwilling to share their cluster identities due to privacy concerns. To address these issues, we present an innovative Efficient and R…

@arXiv_csNE_bot@mastoxiv.page
2025-07-17 07:42:50

Emergent Heterogeneous Swarm Control Through Hebbian Learning
Fuda van Diggelen, Tugay Alperen Karagüzel, Andres Garcia Rincon, A. E. Eiben, Dario Floreano, Eliseo Ferrante
https://arxiv.org/abs/2507.11566

Emergent Heterogeneous Swarm Control Through Hebbian Learning
In this paper, we introduce Hebbian learning as a novel method for swarm robotics, enabling the automatic emergence of heterogeneity. Hebbian learning presents a biologically inspired form of neural adaptation that solely relies on local information. By doing so, we resolve several major challenges for learning heterogeneous control: 1) Hebbian learning removes the complexity of attributing emergent phenomena to single agents through local learning rules, thus circumventing the micro-macro prob…

@arXiv_csCV_bot@mastoxiv.page
2025-06-17 09:51:45

Branch, or Layer? Zeroth-Order Optimization for Continual Learning of Vision-Language Models
Ziwei Liu, Borui Kang, Wei Li, Hangjie Yuan, Yanbing Yang, Wenbin Li, Jun Luo, Yifan Zhu, Tao Feng
https://arxiv.org/abs/2506.12409

Branch, or Layer? Zeroth-Order Optimization for Continual Learning of Vision-Language Models
Continual learning in vision-language models (VLMs) faces critical challenges in balancing parameter efficiency, memory consumption, and optimization stability. While First-Order (FO) optimization (e.g., SGD) dominate current approaches, their deterministic gradients often trap models in suboptimal local minima and incur substantial memory overhead. This paper pioneers a systematic exploration of Zeroth-Order (ZO) optimization for vision-language continual learning (VLCL). We first identify the…

@arXiv_csAI_bot@mastoxiv.page
2025-06-18 08:03:26

Discovering Temporal Structure: An Overview of Hierarchical Reinforcement Learning
Martin Klissarov, Akhil Bagaria, Ziyan Luo, George Konidaris, Doina Precup, Marlos C. Machado
https://arxiv.org/abs/2506.14045

Discovering Temporal Structure: An Overview of Hierarchical Reinforcement Learning
Developing agents capable of exploring, planning and learning in complex open-ended environments is a grand challenge in artificial intelligence (AI). Hierarchical reinforcement learning (HRL) offers a promising solution to this challenge by discovering and exploiting the temporal structure within a stream of experience. The strong appeal of the HRL framework has led to a rich and diverse body of literature attempting to discover a useful structure. However, it is still not clear how one might …

@arXiv_quantph_bot@mastoxiv.page
2025-07-18 08:33:32

Sporadic Federated Learning Approach in Quantum Environment to Tackle Quantum Noise
Ratun Rahman, Atit Pokharel, Dinh C. Nguyen
https://arxiv.org/abs/2507.12492

Sporadic Federated Learning Approach in Quantum Environment to Tackle Quantum Noise
Quantum Federated Learning (QFL) is an emerging paradigm that combines quantum computing and federated learning (FL) to enable decentralized model training while maintaining data privacy over quantum networks. However, quantum noise remains a significant barrier in QFL, since modern quantum devices experience heterogeneous noise levels due to variances in hardware quality and sensitivity to quantum decoherence, resulting in inadequate training performance. To address this issue, we propose SpoQ…

@pbloem@sigmoid.social
2025-07-18 09:25:22

Now out in #TMLR:
🍇 GRAPES: Learning to Sample Graphs for Scalable Graph Neural Networks 🍇
There's lots of work on sampling subgraphs for GNNs, but relatively little on making this sampling process _adaptive_. That is, learning to select the data from the graph that is relevant for your task.
We introduce an RL-based and a GFlowNet-based sampler and show that the approach perf…

A diagram of the GRAPES pipeline. It shows a subgraph being sampled in two steps and being fed to a GNN, with a blue line showing the learning signal. The caption reads Figure 1: Overview of GRAPES. First, GRAPES processes a target node (green) by computing node inclusion probabilities on its 1-hop neighbors (shown by node color shade) with a sampling GNN. Given these probabilities, GRAPES samples k nodes. Then, GRAPES repeats this process over nodes in the 2-hop neighborhood. We pass the sampl…

A results table for node classification on heterophilous graphs. Table 2: F1-scores (%) for different sampling methods trained on heterophilous graphs for a batch size of 256, and a sample size of 256 per layer. We report the mean and standard deviation over 10 runs. The best values among the sampling baselines (all except GAS) are in bold, and the second best are underlined. MC stands for multi-class and ML stands for multi-label classification. OOM indicates out of memory.

Performance of samples vs sampling size showing that GRAPES generally performs well across sample sizes, while other samplers often show more variance across sample sizes. The caption reads Figure 4: Comparative analysis of classification accuracy across different sampling sizes for sampling baseline and GRAPES. We repeated each experiment five times: The shaded regions show the 95% confidence intervals.

A diagrammatic illustration of a graph classification task used in one of the theorems. The caption reads Figure 9: An example of a graph for Theorem 1 with eight nodes. Red edges belong to E1, features xi and labels yi are shown beside every node. For nodes v1 and v2 we show the edge e12 as an example. As shown, the label of each node is the second feature of its neighbor, where a red edge connects them. The edge homophily ratio is h=12/28 = 0.43.

@arXiv_condmatstatmech_bot@mastoxiv.page
2025-06-17 10:49:45

Bio-inspired learning algorithm for time series using Loewner equation
Yusuke Shibasaki
https://arxiv.org/abs/2506.12372 https://arxi…

Bio-inspired learning algorithm for time series using Loewner equation
Though the relationship between the theoretical statistical physics and machine learning techniques has been a well-discussed topic, the studies on the mechanism of learning inspired by the biological system are still developing. In this study, we investigate the application methods of Loewner equation to the learning algorithm particularly focusing on its statistical-mechanical aspects. We suggest two simple methods of learning of one-dimensional time series based on the unique encoding proper…

@arXiv_csSE_bot@mastoxiv.page
2025-07-18 09:11:32

A Survey of Reinforcement Learning for Software Engineering
Dong Wang, Hanmo You, Lingwei Zhu, Kaiwei Lin, Zheng Chen, Chen Yang, Junji Yu, Zan Wang, Junjie Chen
https://arxiv.org/abs/2507.12483

A Survey of Reinforcement Learning for Software Engineering
Reinforcement Learning (RL) has emerged as a powerful paradigm for sequential decision-making and has attracted growing interest across various domains, particularly following the advent of Deep Reinforcement Learning (DRL) in 2015. Simultaneously, the rapid advancement of Large Language Models (LLMs) has further fueled interest in integrating RL with LLMs to enable more adaptive and intelligent systems. In the field of software engineering (SE), the increasing complexity of systems and the ris…

@arXiv_csRO_bot@mastoxiv.page
2025-07-18 09:49:52

Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback
Suzie Kim, Hye-Bin Shin, Seong-Whan Lee
https://arxiv.org/abs/2507.13171

Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback
Conventional reinforcement learning (RL) approaches often struggle to learn effective policies under sparse reward conditions, necessitating the manual design of complex, task-specific reward functions. To address this limitation, reinforcement learning from human feedback (RLHF) has emerged as a promising strategy that complements hand-crafted rewards with human-derived evaluation signals. However, most existing RLHF methods depend on explicit feedback mechanisms such as button presses or pr…

@arXiv_statME_bot@mastoxiv.page
2025-06-17 12:18:41

Bayesian inference for the learning rate in Generalised Bayesian inference
Jeong Eun Lee, Sitong Liu, Geoff K. Nicholls
https://arxiv.org/abs/2506.12532 ht…

Bayesian inference for the learning rate in Generalised Bayesian inference
In Generalised Bayesian Inference (GBI), the learning rate and hyperparameters of the loss must be estimated. However, these inference-hyperparameters can't be estimated jointly with the other parameters by giving them a prior, as we discuss. Several methods for estimating the learning rate have been given which elicit and minimise a loss based on the goals of the overall inference (in our case, prediction of new data). However, in some settings there exists an unknown "true" learning rate ab…

@arXiv_csDC_bot@mastoxiv.page
2025-06-17 09:27:27

Optimizing Federated Learning using Remote Embeddings for Graph Neural Networks
Pranjal Naman, Yogesh Simmhan
https://arxiv.org/abs/2506.12425 https://

Optimizing Federated Learning using Remote Embeddings for Graph Neural Networks
Graph Neural Networks (GNNs) have experienced rapid advancements in recent years due to their ability to learn meaningful representations from graph data structures. Federated Learning (FL) has emerged as a viable machine learning approach for training a shared model on decentralized data, addressing privacy concerns while leveraging parallelism. Existing methods that address the unique requirements of federated GNN training using remote embeddings to enhance convergence accuracy are limited by…

@arXiv_csNI_bot@mastoxiv.page
2025-06-17 10:00:49

Learning Best Paths in Quantum Networks
Xuchuang Wang, Maoli Liu, Xutong Liu, Zhuohua Li, Mohammad Hajiesmaili, John C. S. Lui, Don Towsley
https://arxiv.org/abs/2506.12462

Learning Best Paths in Quantum Networks
Quantum networks (QNs) transmit delicate quantum information across noisy quantum channels. Crucial applications, like quantum key distribution (QKD) and distributed quantum computation (DQC), rely on efficient quantum information transmission. Learning the best path between a pair of end nodes in a QN is key to enhancing such applications. This paper addresses learning the best path in a QN in the online learning setting. We explore two types of feedback: "link-level" and "path-level". Link-le…

@arXiv_csIR_bot@mastoxiv.page
2025-07-18 08:46:22

SGCL: Unifying Self-Supervised and Supervised Learning for Graph Recommendation
Weizhi Zhang, Liangwei Yang, Zihe Song, Henrry Peng Zou, Ke Xu, Yuanjie Zhu, Philip S. Yu
https://arxiv.org/abs/2507.13336

SGCL: Unifying Self-Supervised and Supervised Learning for Graph Recommendation
Recommender systems (RecSys) are essential for online platforms, providing personalized suggestions to users within a vast sea of information. Self-supervised graph learning seeks to harness high-order collaborative filtering signals through unsupervised augmentation on the user-item bipartite graph, primarily leveraging a multi-task learning framework that includes both supervised recommendation loss and self-supervised contrastive loss. However, this separate design introduces additional grap…

@Techmeme@techhub.social
2025-08-14 17:30:58

Anthropic expands Claude's Learning Mode, available only to Education users since an April launch, to all users, including two learning variants for Claude Code (Igor Bonifacic/Engadget)
https://www.engadget.com/ai/anthropic-brin

Anthropic brings Claude's learning mode to regular users and devs
Anthropic is bringing Claude's recently added Learning mode to everyone.

@arXiv_csCE_bot@mastoxiv.page
2025-07-18 07:39:22

Quantum-Enhanced Reinforcement Learning with LSTM Forecasting Signals for Optimizing Fintech Trading Decisions
Yen-Ku Liu, Yun-Huei Pan, Pei-Fan Lu, Yun-Cheng Tsai, Samuel Yen-Chi Chen
https://arxiv.org/abs/2507.12835

Quantum-Enhanced Reinforcement Learning with LSTM Forecasting Signals for Optimizing Fintech Trading Decisions
Financial trading environments are characterized by high volatility, numerous macroeconomic signals, and dynamically shifting market regimes, where traditional reinforcement learning methods often fail to deliver breakthrough performance. In this study, we design a reinforcement learning framework tailored for financial systems by integrating quantum circuits. We compare (1) the performance of classical A3C versus quantum A3C algorithms, and (2) the impact of incorporating LSTM-based prediction…

@arXiv_condmatstrel_bot@mastoxiv.page
2025-07-18 08:17:12

Self-learning Monte Carlo Method: A Review
Gaopei Pan, Chuang Chen, Zi Yang Meng
https://arxiv.org/abs/2507.12554 https://arxiv.org/p…

Self-learning Monte Carlo Method: A Review
The Self-Learning Monte Carlo (SLMC) method is a Monte Carlo approach that has emerged in recent years by integrating concepts from machine learning with conventional Monte Carlo techniques. Designed to accelerate the numerical study of interacting many-body systems, SLMC significantly improves sampling efficiency by constructing an effective model -- via machine learning methods -- based on configurations generated by conventional Monte Carlo methods and then proposes global updates based on t…

@cowboys@darktundra.xyz
2025-06-16 23:16:22

Cowboys' 1st-round rookie working to flatten learning curve of life in NFL https://cowboyswire.usatoday.com/story/sports/nfl/cowboys/2025/06/16/cowboys-rookie-tyler-booker-quotes/84233439007/

Cowboys' 1st-round rookie working to flatten learning curve of life in NFL
Rookie Tyler Booker graduated from Alabama in just 3 years. He's tackling the learning curve of life in the NFL with his smarts first.

@cdarwin@c.im
2025-08-15 16:39:10

How does the #brain transfer #MotorSkills between hands?
This study reveals that transfer relies on re-expressing the neural patterns established during initial learning in distributed higher-order brain areas,
offering new insights into learning

Transfer of motor learning is associated with patterns of activity in the default mode network
How does the brain transfer motor skills between hands? This study reveals that transfer relies on re-expressing the neural patterns established during initial learning in distributed higher-order brain areas, offering new insights into learning generalization.

@arXiv_astrophGA_bot@mastoxiv.page
2025-06-18 09:38:38

Multiple machine-learning as a powerful tool for the star clusters analysis
Denilso Camargo
https://arxiv.org/abs/2506.13951 https://…

Multiple machine-learning as a powerful tool for the star clusters analysis
This work proposes a multiple machine learning method (MMLM) aiming to improve the accuracy and robustness in the analysis of star clusters. The MMLM performance is evaluated by applying it to the reanalysis of the old binary cluster candidate - NGC 1605a and NGC 1605b - found by Camargo (2021) (hereafter C21). The binary cluster candidate is analyzed by employing a set of well established machine learning algorithms applied to the Gaia-EDR3 data. Membership probabilities and open clusters (OCs…

@arXiv_mathOC_bot@mastoxiv.page
2025-07-18 09:29:22

Unsupervised Ground Metric Learning
Janis Auffenberg, Jonas Bresch, Oleh Melnyk, Gabriele Steidl
https://arxiv.org/abs/2507.13094 https://

Unsupervised Ground Metric Learning
Data classification without access to labeled samples remains a challenging problem. It usually depends on an appropriately chosen distance between features, a topic addressed in metric learning. Recently, Huizing, Cantini and Peyré proposed to simultaneously learn optimal transport (OT) cost matrices between samples and features of the dataset. This leads to the task of finding positive eigenvectors of a certain nonlinear function that maps cost matrices to OT distances. Having this basic ide…

@arXiv_physicsoptics_bot@mastoxiv.page
2025-06-18 10:00:55

High computational density nanophotonic media for machine learning inference
Zhenyu Zhao, Yichen Pan, Jinlong Xiang, Yujia Zhang, An He, Yaotian Zhao, Youlve Chen, Yu He, Xinyuan Fang, Yikai Su, Min Gu, Xuhan Guo
https://arxiv.org/abs/2506.14269

High computational density nanophotonic media for machine learning inference
Efficient machine learning inference is essential for the rapid adoption of artificial intelligence across various domains. On-chip optical computing has emerged as a transformative solution for accelerating machine learning tasks, owing to its ultra-low power consumption. However, enhancing the computational density of on-chip optical systems remains a significant challenge, primarily due to the difficulties in miniaturizing and integrating key optical interference components. In this work, we h…

@arXiv_csCR_bot@mastoxiv.page
2025-06-18 09:06:19

EBS-CFL: Efficient and Byzantine-robust Secure Clustered Federated Learning
Zhiqiang Li, Haiyong Bao, Menghong Guan, Hao Pan, Cheng Huang, Hong-Ning Dai
https://arxiv.org/abs/2506.13612

@arXiv_csHC_bot@mastoxiv.page
2025-08-18 09:23:20

From Misunderstandings to Learning Opportunities: Leveraging Generative AI in Discussion Forums to Support Student Learning
Stanislav Pozdniakov, Jonathan Brazil, Oleksandra Poquet, Stephan Krusche, Santiago Berrezueta-Guzman, Shazia Sadiq, Hassan Khosravi
https://arxiv.org/abs/2508.11150

From Misunderstandings to Learning Opportunities: Leveraging Generative AI in Discussion Forums to Support Student Learning
In the contemporary educational landscape, particularly in large classroom settings, discussion forums have become a crucial tool for promoting interaction and addressing student queries. These forums foster a collaborative learning environment where students engage with both the teaching team and their peers. However, the sheer volume of content generated in these forums poses two significant interconnected challenges: How can we effectively identify common misunderstandings that arise in stud…

@arXiv_csAR_bot@mastoxiv.page
2025-07-18 08:42:52

WIP: Turning Fake Chips into Learning Opportunities
Haniye Mehraban, Saad Azmeen-ur-Rahman, John Hu
https://arxiv.org/abs/2507.13281 https://

WIP: Turning Fake Chips into Learning Opportunities
This work-in-progress paper presents a case study in which counterfeit TL074 operational amplifiers, discovered in a junior level electronics course, became the basis for a hands on learning experience. Counterfeit integrated circuits (IC) are increasingly common, posing a significant threat to the integrity of undergraduate electronics laboratories. Instead of simply replacing the counterfeit components, we turned the issue into a teaching moment. Students engaged in hands-on diagnostics measu…

@arXiv_statML_bot@mastoxiv.page
2025-08-18 08:24:40

Counterfactual Survival Q Learning for Longitudinal Randomized Trials via Buckley James Boosting
Jeongjin Lee, Jong-Min Kim
https://arxiv.org/abs/2508.11060 https://

Counterfactual Survival Q Learning for Longitudinal Randomized Trials via Buckley James Boosting
We propose a Buckley James (BJ) Boost Q learning framework for estimating optimal dynamic treatment regimes under right censored survival data, tailored for longitudinal randomized clinical trial settings. The method integrates accelerated failure time models with iterative boosting techniques, including componentwise least squares and regression trees, within a counterfactual Q learning framework. By directly modeling conditional survival time, BJ Boost Q learning avoids the restrictive propor…

@arXiv_csLG_bot@mastoxiv.page
2025-07-17 10:13:20

Online Training and Pruning of Deep Reinforcement Learning Networks
Valentin Frank Ingmar Guenter, Athanasios Sideris
https://arxiv.org/abs/2507.11975 http…

Online Training and Pruning of Deep Reinforcement Learning Networks
Scaling deep neural networks (NN) of reinforcement learning (RL) algorithms has been shown to enhance performance when feature extraction networks are used but the gained performance comes at the significant expense of increased computational and memory complexity. Neural network pruning methods have successfully addressed this challenge in supervised learning. However, their application to RL is underexplored. We propose an approach to integrate simultaneous training and pruning within advance…

@arXiv_csCV_bot@mastoxiv.page
2025-06-17 09:40:47

Hierarchical Deep Feature Fusion and Ensemble Learning for Enhanced Brain Tumor MRI Classification
Zahid Ullah, Jihie Kim
https://arxiv.org/abs/2506.12363 …

Hierarchical Deep Feature Fusion and Ensemble Learning for Enhanced Brain Tumor MRI Classification
Accurate brain tumor classification is crucial in medical imaging to ensure reliable diagnosis and effective treatment planning. This study introduces a novel double ensembling framework that synergistically combines pre-trained deep learning (DL) models for feature extraction with optimized machine learning (ML) classifiers for robust classification. The framework incorporates comprehensive preprocessing and data augmentation of brain magnetic resonance images (MRI), followed by deep feature e…

@arXiv_csCL_bot@mastoxiv.page
2025-07-17 08:09:40

Partitioner Guided Modal Learning Framework
Guimin Hu, Yi Xin, Lijie Hu, Zhihong Zhu, Hasti Seifi
https://arxiv.org/abs/2507.11661 https://

Partitioner Guided Modal Learning Framework
Multimodal learning benefits from multiple modal information, and each learned modal representations can be divided into uni-modal that can be learned from uni-modal training and paired-modal features that can be learned from cross-modal interaction. Building on this perspective, we propose a partitioner-guided modal learning framework, PgM, which consists of the modal partitioner, uni-modal learner, paired-modal learner, and uni-paired modal decoder. Modal partitioner segments the learned moda…

@arXiv_csAI_bot@mastoxiv.page
2025-06-18 08:11:22

Enhancing Symbolic Machine Learning by Subsymbolic Representations
Stephen Roth, Lennart Baur, Derian Boer, Stefan Kramer
https://arxiv.org/abs/2506.14569 …

Enhancing Symbolic Machine Learning by Subsymbolic Representations
The goal of neuro-symbolic AI is to integrate symbolic and subsymbolic AI approaches, to overcome the limitations of either. Prominent systems include Logic Tensor Networks (LTN) or DeepProbLog, which offer neural predicates and end-to-end learning. The versatility of systems like LTNs and DeepProbLog, however, makes them less efficient in simpler settings, for instance, for discriminative machine learning, in particular in domains with many constants. Therefore, we follow a different approach:…

@arXiv_quantph_bot@mastoxiv.page
2025-07-18 08:07:02

Quantum Transfer Learning to Boost Dementia Detection
Sounak Bhowmik, Talita Perciano, Himanshu Thapliyal
https://arxiv.org/abs/2507.12485 https://

Quantum Transfer Learning to Boost Dementia Detection
Dementia is a devastating condition with profound implications for individuals, families, and healthcare systems. Early and accurate detection of dementia is critical for timely intervention and improved patient outcomes. While classical machine learning and deep learning approaches have been explored extensively for dementia prediction, these solutions often struggle with high-dimensional biomedical data and large-scale datasets, quickly reaching computational and performance limitations. To a…

@arXiv_csRO_bot@mastoxiv.page
2025-06-18 09:23:35

SENIOR: Efficient Query Selection and Preference-Guided Exploration in Preference-based Reinforcement Learning
Hexian Ni, Tao Lu, Haoyuan Hu, Yinghao Cai, Shuo Wang
https://arxiv.org/abs/2506.14648

SENIOR: Efficient Query Selection and Preference-Guided Exploration in Preference-based Reinforcement Learning
Preference-based Reinforcement Learning (PbRL) methods provide a solution to avoid reward engineering by learning reward models based on human preferences. However, poor feedback- and sample-efficiency remain problems that hinder the application of PbRL. In this paper, we present a novel efficient query selection and preference-guided exploration method, called SENIOR, which could select the meaningful and easy-to-compare behavior segment pairs to improve human feedback-efficiency…

@arXiv_csSE_bot@mastoxiv.page
2025-06-17 10:59:37

Isolating Noisy Labelled Test Cases in Human-in-the-Loop Oracle Learning
Charaka Geethal Kapugama
https://arxiv.org/abs/2506.13273 https://

Isolating Noisy Labelled Test Cases in Human-in-the-Loop Oracle Learning
Incorrectly labelled test cases can adversely affect the training process of human-in-the-loop oracle learning techniques. This paper introduces ISONOISE, a technique designed to identify such mislabelled test cases introduced during human-in-the-loop oracle learning. This technique can be applied to programs taking numeric inputs. Given a compromised automatic test oracle and its training test suite, ISONOISE first isolates the test cases suspected of being mislabelled. This task is performed …

@arXiv_eessIV_bot@mastoxiv.page
2025-06-18 09:04:55

Integrating Radiomics with Deep Learning Enhances Multiple Sclerosis Lesion Delineation
Nadezhda Alsahanova, Pavel Bartenev, Maksim Sharaev, Milos Ljubisavljevic, Taleb Al. Mansoori, Yauhen Statsenko
https://arxiv.org/abs/2506.14524

Integrating Radiomics with Deep Learning Enhances Multiple Sclerosis Lesion Delineation
Background: Accurate lesion segmentation is critical for multiple sclerosis (MS) diagnosis, yet current deep learning approaches face robustness challenges. Aim: This study improves MS lesion segmentation by combining data fusion and deep learning techniques. Materials and Methods: We suggested novel radiomic features (concentration rate and Rényi entropy) to characterize different MS lesion types and fused these with raw imaging data. The study integrated radiomic features with imaging da…

@arXiv_csNI_bot@mastoxiv.page
2025-06-17 09:51:29

Latency Optimization for Wireless Federated Learning in Multihop Networks
Shaba Shaon, Van-Dinh Nguyen, Dinh C. Nguyen
https://arxiv.org/abs/2506.12081 htt…

Latency Optimization for Wireless Federated Learning in Multihop Networks
In this paper, we study a novel latency minimization problem in wireless federated learning (FL) across multi-hop networks. The system comprises multiple routes, each integrating leaf and relay nodes for FL model training. We explore a personalized learning and adaptive aggregation-aware FL (PAFL) framework that effectively addresses data heterogeneity across participating nodes by harmonizing individual and collective learning objectives. We formulate an optimization problem aimed at minimizin…

@arXiv_csCR_bot@mastoxiv.page
2025-06-17 09:52:25

Privacy-Preserving Federated Learning against Malicious Clients Based on Verifiable Functional Encryption
Nina Cai, Jinguang Han
https://arxiv.org/abs/2506.12846

Privacy-Preserving Federated Learning against Malicious Clients Based on Verifiable Functional Encryption
Federated learning is a promising distributed learning paradigm that enables collaborative model training without exposing local client data, thereby protecting data privacy. However, it also brings new threats and challenges. The advancement of model inversion attacks has rendered the plaintext transmission of local models insecure, while the distributed nature of federated learning makes it particularly vulnerable to attacks raised by malicious clients. To protect data privacy and prevent malici…

@Techmeme@techhub.social
2025-06-15 21:30:34

Berlin-based Knowunity, an AI-powered learning platform with 20M users in 15 countries, raised a €27M Series B led by XAnge, bringing its total funding to €45M (Tamara Djurickovic/Tech.eu)
https://tech.eu/2025/06/13/knowunity-raises-eur…

Knowunity raises €27M to scale its personalized AI tutor globally
The Berlin-based learning platform Knowunity has raised €27 million in a Series B round to expand its personalised AI study companion globally. The new funding brings Knowunity’s total raised to €45 million and will be used to further develop its global AI learning companion. Built by and for students, Knowunity was founded in 2020 by Benedict Kurz (CEO), along with Gregor Weber (CPO), Lucas Hild (CTO), and Yannik Prigl (Backend). All were just 17 years old and still in school at the ti…

@arXiv_csAI_bot@mastoxiv.page
2025-06-18 08:06:21

Don't throw the baby out with the bathwater: How and why deep learning for ARC
Jack Cole, Mohamed Osman
https://arxiv.org/abs/2506.14276 https://

Don't throw the baby out with the bathwater: How and why deep learning for ARC
The Abstraction and Reasoning Corpus (ARC-AGI) presents a formidable challenge for AI systems. Despite the typically low performance on ARC, the deep learning paradigm remains the most effective known strategy for generating skillful (state-of-the-art) neural networks (NN) across varied modalities and tasks in vision, language etc. The deep learning paradigm has proven to be able to train these skillful neural networks and learn the abstractions needed in these diverse domains. Our work doubles…

@arXiv_condmatstatmech_bot@mastoxiv.page
2025-06-18 09:29:18

Evolutionary chemical learning in dimerization networks
Alexei V. Tkachenko, Bortolo Matteo Mognetti, Sergei Maslov
https://arxiv.org/abs/2506.14006 https:…

Evolutionary chemical learning in dimerization networks
We present a novel framework for chemical learning based on Competitive Dimerization Networks (CDNs) - systems in which multiple molecular species, e.g. proteins or DNA/RNA oligomers, reversibly bind to form dimers. We show that these networks can be trained in vitro through directed evolution, enabling the implementation of complex learning tasks such as multiclass classification without digital hardware or explicit parameter tuning. Each molecular species functions analogously to a neuron, wi…

@arXiv_csDC_bot@mastoxiv.page
2025-07-17 09:23:20

NineToothed: A Triton-Based High-Level Domain-Specific Language for Machine Learning
Jiacheng Huang, Zimin Li, Yinghui Li, Haojie Wang
https://arxiv.org/abs/2507.11978

NineToothed: A Triton-Based High-Level Domain-Specific Language for Machine Learning
The emergence of deep learning domain-specific languages (DSLs) has substantially reduced the obstacles in developing high-performance, cross-platform compute kernels. However, current DSLs, such as Triton, still demand that developers possess expertise in parallel programming and expose them to many low-level details. This requirement complicates the development process and adds to the difficulty of maintaining compute kernels. Consequently, developing a new programming model that supports ser…

@arXiv_statML_bot@mastoxiv.page
2025-06-17 12:16:33

Variational Learning Finds Flatter Solutions at the Edge of Stability
Avrajit Ghosh, Bai Cong, Rio Yokota, Saiprasad Ravishankar, Rongrong Wang, Molei Tao, Mohammad Emtiyaz Khan, Thomas Möllenhoff
https://arxiv.org/abs/2506.12903

Variational Learning Finds Flatter Solutions at the Edge of Stability
Variational Learning (VL) has recently gained popularity for training deep neural networks and is competitive to standard learning methods. Part of its empirical success can be explained by theories such as PAC-Bayes bounds, minimum description length and marginal likelihood, but there are few tools to unravel the implicit regularization in play. Here, we analyze the implicit regularization of VL through the Edge of Stability (EoS) framework. EoS has previously been used to show that gradient d…

@arXiv_csRO_bot@mastoxiv.page
2025-08-18 08:35:20

GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning
Kelin Yu, Sheng Zhang, Harshit Soora, Furong Huang, Heng Huang, Pratap Tokekar, Ruohan Gao
https://arxiv.org/abs/2508.11049

GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning
Recent advances have shown that video generation models can enhance robot learning by deriving effective robot actions through inverse dynamics. However, these methods heavily depend on the quality of generated data and struggle with fine-grained manipulation due to the lack of environment feedback. While video-based reinforcement learning improves policy robustness, it remains constrained by the uncertainty of video generation and the challenges of collecting large-scale robot datasets for tra…

@arXiv_mathOC_bot@mastoxiv.page
2025-07-17 08:48:40

Convergence Rate of Generalized Nash Equilibrium Learning in Strongly Monotone Games with Linear Constraints
Tatiana Tatarenko, Maryam Kamgarpour
https://arxiv.org/abs/2507.12112 …

Convergence Rate of Generalized Nash Equilibrium Learning in Strongly Monotone Games with Linear Constraints
We consider payoff-based learning of a generalized Nash equilibrium (GNE) in multi-agent systems. Our focus is on games with jointly convex constraints of a linear structure and strongly monotone pseudo-gradients. We present a convergent procedure based on a partial regularization technique and establish the convergence rate of its iterates under one- and two-point payoff-based feedback. To the best of our knowledge, this work is the first one characterizing the convergence speed of iterates to…

@arXiv_csHC_bot@mastoxiv.page
2025-08-18 09:15:50

Human-in-the-Loop Systems for Adaptive Learning Using Generative AI
Bhavishya Tarun, Haoze Du, Dinesh Kannan, Edward F. Gehringer
https://arxiv.org/abs/2508.11062 https://

Human-in-the-Loop Systems for Adaptive Learning Using Generative AI
A Human-in-the-Loop (HITL) approach leverages generative AI to enhance personalized learning by directly integrating student feedback into AI-generated solutions. Students critique and modify AI responses using predefined feedback tags, fostering deeper engagement and understanding. This empowers students to actively shape their learning, with AI serving as an adaptive partner. The system uses a tagging technique and prompt engineering to personalize content, informing a Retrieval-Augmented Gen…

@arXiv_csIR_bot@mastoxiv.page
2025-06-17 09:43:44

A Gradient Meta-Learning Joint Optimization for Beamforming and Antenna Position in Pinching-Antenna Systems
Kang Zhou, Weixi Zhou, Donghong Cai, Xianfu Lei, Yanqing Xu, Zhiguo Ding, Pingzhi Fan
https://arxiv.org/abs/2506.12583

A Gradient Meta-Learning Joint Optimization for Beamforming and Antenna Position in Pinching-Antenna Systems
In this paper, we consider a novel optimization design for multi-waveguide pinching-antenna systems, aiming to maximize the weighted sum rate (WSR) by jointly optimizing beamforming coefficients and antenna position. To handle the formulated non-convex problem, a gradient-based meta-learning joint optimization (GML-JO) algorithm is proposed. Specifically, the original problem is initially decomposed into two sub-problems of beamforming optimization and antenna position optimization through equi…

@arXiv_csSE_bot@mastoxiv.page
2025-06-17 09:57:37

Quantum-Inspired Differentiable Integral Neural Networks (QIDINNs): A Feynman-Based Architecture for Continuous Learning Over Streaming Data
Oscar Boullosa Dapena
https://arxiv.org/abs/2506.12111

Quantum-Inspired Differentiable Integral Neural Networks (QIDINNs): A Feynman-Based Architecture for Continuous Learning Over Streaming Data
Real-time continuous learning over streaming data remains a central challenge in deep learning and AI systems. Traditional gradient-based models such as backpropagation through time (BPTT) face computational and stability limitations when dealing with temporally unbounded data. In this paper, we introduce a novel architecture, Quantum-Inspired Differentiable Integral Neural Networks (QIDINNs), which leverages the Feynman technique of differentiation under the integral sign to formulate neural u…

@arXiv_csLG_bot@mastoxiv.page
2025-07-17 13:51:58

Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[3/5]:
- Uncertainty Quantification for Motor Imagery BCI -- Machine Learning vs. Deep Learning
Joris Suurmeijer, Ivo Pascal de Jong, Matias Valdenegro-Toro, Andreea Ioana Sburlea

@arXiv_csCR_bot@mastoxiv.page
2025-06-18 08:28:45

Privacy-Preserving Federated Learning against Malicious Clients Based on Verifiable Functional Encryption
Nina Cai, Jinguang Han
https://arxiv.org/abs/2506.12846

@arXiv_csCV_bot@mastoxiv.page
2025-06-17 10:19:29

Comparative Analysis of Deep Learning Strategies for Hypertensive Retinopathy Detection from Fundus Images: From Scratch and Pre-trained Models
Yanqiao Zhu
https://arxiv.org/abs/2506.12492

Comparative Analysis of Deep Learning Strategies for Hypertensive Retinopathy Detection from Fundus Images: From Scratch and Pre-trained Models
This paper presents a comparative analysis of deep learning strategies for detecting hypertensive retinopathy from fundus images, a central task in the HRDC challenge~\cite{qian2025hrdc}. We investigate three distinct approaches: a custom CNN, a suite of pre-trained transformer-based models, and an AutoML solution. Our findings reveal a stark, architecture-dependent response to data augmentation. Augmentation significantly boosts the performance of pure Vision Transformers (ViTs), which we hypo…

@arXiv_csCL_bot@mastoxiv.page
2025-06-18 09:12:51

Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs
Ring Team, Bin Hu, Cai Chen, Deng Zhao, Ding Liu, Dingnan Jin, Feng Zhu, Hao Dai, Hongzhi Luan, Jia Guo, Jiaming Liu, Jiewei Wu, Jun Mei, Jun Zhou, Junbo Zhao, Junwu Xiong, Kaihong Zhang, Kuan Xu, Lei Liang, Liang Jiang, Liangcheng Fu, Longfei Zheng, Qiang Gao, Qing Cui, Quan Wan, Shaomian Zheng, Shuaicheng Li, Tongkai Yang, Wang Ren, Xiaodong Yan, Xiaopei Wan, Xiaoyun Feng, Xin Zhao, Xinxing Yang, Xinyu …

Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs
We present Ring-lite, a Mixture-of-Experts (MoE)-based large language model optimized via reinforcement learning (RL) to achieve efficient and robust reasoning capabilities. Built upon the publicly available Ling-lite model, a 16.8 billion parameter model with 2.75 billion activated parameters, our approach matches the performance of state-of-the-art (SOTA) small-scale reasoning models on challenging benchmarks (e.g., AIME, LiveCodeBench, GPQA-Diamond) while activating only one-third of the par…

@arXiv_csAI_bot@mastoxiv.page
2025-06-18 08:01:27

Causality in the human niche: lessons for machine learning
Richard D. Lange, Konrad P. Kording
https://arxiv.org/abs/2506.13803 https://

Causality in the human niche: lessons for machine learning
Humans interpret the world around them in terms of cause and effect and communicate their understanding of the world to each other in causal terms. These causal aspects of human cognition are thought to underlie humans' ability to generalize and learn efficiently in new domains, an area where current machine learning systems are weak. Building human-like causal competency into machine learning systems may facilitate the construction of effective and interpretable AI. Indeed, the machine learnin…

@arXiv_quantph_bot@mastoxiv.page
2025-07-17 10:00:40

Quantum Machine Learning in Multi-Qubit Phase-Space Part I: Foundations
Timothy Heightman, Edward Jiang, Ruth Mora-Soto, Maciej Lewenstein, Marcin Płodzień
https://arxiv.org/abs/2507.12117

Quantum Machine Learning in Multi-Qubit Phase-Space Part I: Foundations
Quantum machine learning (QML) seeks to exploit the intrinsic properties of quantum mechanical systems, including superposition, coherence, and quantum entanglement for classical data processing. However, due to the exponential growth of the Hilbert space, QML faces practical limits in classical simulations with the state-vector representation of quantum system. On the other hand, phase-space methods offer an alternative by encoding quantum states as quasi-probability functions. Building on pri…

@arXiv_statML_bot@mastoxiv.page
2025-06-17 12:12:49

General and Estimable Learning Bound Unifying Covariate and Concept Shifts
Hongbo Chen, Li Charlie Xia
https://arxiv.org/abs/2506.12829 https://

General and Estimable Learning Bound Unifying Covariate and Concept Shifts
Generalization under distribution shift remains a core challenge in modern machine learning, yet existing learning bound theory is limited to narrow, idealized settings and is non-estimable from samples. In this paper, we bridge the gap between theory and practical applications. We first show that existing bounds become loose and non-estimable because their concept shift definition breaks when the source and target supports mismatch. Leveraging entropic optimal transport, we propose new support…

@Techmeme@techhub.social
2025-07-15 16:40:52

Apple seems to be working on adding CUDA support to open-source ML framework MLX, which may mean that code developed using MLX would work with CUDA (Malcolm Owen/AppleInsider)
https://appleinsider.com/articles/25/0

Apple Silicon machine learning code may become more easily portable to Nvidia hardware
A project is trying to cut the cost of making machine learning applications for Nvidia hardware, by developing on an Apple Silicon Mac and exporting it to CUDA.

@arXiv_csLG_bot@mastoxiv.page
2025-07-17 10:28:10

A Bayesian Incentive Mechanism for Poison-Resilient Federated Learning
Daniel Commey, Rebecca A. Sarpong, Griffith S. Klogo, Winful Bagyl-Bac, Garth V. Crosby
https://arxiv.org/abs/2507.12439

A Bayesian Incentive Mechanism for Poison-Resilient Federated Learning
Federated learning (FL) enables collaborative model training across decentralized clients while preserving data privacy. However, its open-participation nature exposes it to data-poisoning attacks, in which malicious actors submit corrupted model updates to degrade the global model. Existing defenses are often reactive, relying on statistical aggregation rules that can be computationally expensive and that typically assume an honest majority. This paper introduces a proactive, economic defense:…

@arXiv_csDC_bot@mastoxiv.page
2025-07-18 08:00:52

Autonomous Resource Management in Microservice Systems via Reinforcement Learning
Yujun Zou, Nia Qi, Yingnan Deng, Zhihao Xue, Ming Gong, Wuyang Zhang
https://arxiv.org/abs/2507.12879

Autonomous Resource Management in Microservice Systems via Reinforcement Learning
This paper proposes a reinforcement learning-based method for microservice resource scheduling and optimization, aiming to address issues such as uneven resource allocation, high latency, and insufficient throughput in traditional microservice architectures. In microservice systems, as the number of services and the load increase, efficiently scheduling and allocating resources such as computing power, memory, and storage becomes a critical research challenge. To address this, the paper employs…

@arXiv_csCV_bot@mastoxiv.page
2025-06-17 09:40:32

EKPC: Elastic Knowledge Preservation and Compensation for Class-Incremental Learning
Huaijie Wang, De Cheng, Lingfeng He, Yan Li, Jie Li, Nannan Wang, Xinbo Gao
https://arxiv.org/abs/2506.12351

EKPC: Elastic Knowledge Preservation and Compensation for Class-Incremental Learning
Class-Incremental Learning (CIL) aims to enable AI models to continuously learn from sequentially arriving data of different classes over time while retaining previously acquired knowledge. Recently, Parameter-Efficient Fine-Tuning (PEFT) methods, like prompt pool-based approaches and adapter tuning, have shown great attraction in CIL. However, these methods either introduce additional parameters that increase memory usage, or rely on rigid regularization techniques which reduce forgetting but …

@arXiv_mathOC_bot@mastoxiv.page
2025-06-17 12:24:17

Research on Optimal Control Problem Based on Reinforcement Learning under Knightian Uncertainty
Ziyu Li, Chen Fei, Weiyin Fei
https://arxiv.org/abs/2506.13207

Research on Optimal Control Problem Based on Reinforcement Learning under Knightian Uncertainty
Considering that the decision-making environment faced by reinforcement learning (RL) agents is full of Knightian uncertainty, this paper describes the exploratory state dynamics equation in Knightian uncertainty to study the entropy-regularized relaxed stochastic control problem in a Knightian uncertainty environment. By employing stochastic analysis theory and the dynamic programming principle under nonlinear expectation, we derive the Hamilton-Jacobi-Bellman (HJB) equation and solve for the …

@arXiv_csCL_bot@mastoxiv.page
2025-07-17 10:05:10

Findings of MEGA: Maths Explanation with LLMs using the Socratic Method for Active Learning
Tosin Adewumi, Foteini Simistira Liwicki, Marcus Liwicki, Viktor Gardelli, Lama Alkhaled, Hamam Mokayed
https://arxiv.org/abs/2507.12079

Findings of MEGA: Maths Explanation with LLMs using the Socratic Method for Active Learning
This paper presents an intervention study on the effects of the combined methods of (1) the Socratic method, (2) Chain of Thought (CoT) reasoning, (3) simplified gamification and (4) formative feedback on university students' Maths learning driven by large language models (LLMs). We call our approach Mathematics Explanations through Games by AI LLMs (MEGA). Some students struggle with Maths and as a result avoid Math-related discipline or subjects despite the importance of Maths across many fie…

@arXiv_csHC_bot@mastoxiv.page
2025-06-16 08:01:09

Conversational AI as a Catalyst for Informal Learning: An Empirical Large-Scale Study on LLM Use in Everyday Learning
Nađa Terzimehić, Babette Bühler, Enkelejda Kasneci
https://arxiv.org/abs/2506.11789

Conversational AI as a Catalyst for Informal Learning: An Empirical Large-Scale Study on LLM Use in Everyday Learning
Large language models have not only captivated the public imagination but have also sparked a profound rethinking of how we learn. In the third year following the breakthrough launch of ChatGPT, everyday informal learning has been transformed as diverse user groups explore these novel tools. Who is embracing LLMs for self-directed learning, and who remains hesitant? What are their reasons for adoption or avoidance? What learning patterns emerge with this novel technological landscape? We presen…

@arXiv_csRO_bot@mastoxiv.page
2025-07-17 10:00:30

EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos
Ruihan Yang, Qinxi Yu, Yecheng Wu, Rui Yan, Borui Li, An-Chieh Cheng, Xueyan Zou, Yunhao Fang, Hongxu Yin, Sifei Liu, Song Han, Yao Lu, Xiaolong Wang
https://arxiv.org/abs/2507.12440

EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos
Real robot data collection for imitation learning has led to significant advancements in robotic manipulation. However, the requirement for robot hardware in the process fundamentally constrains the scale of the data. In this paper, we explore training Vision-Language-Action (VLA) models using egocentric human videos. The benefit of using human videos is not only for their scale but more importantly for the richness of scenes and tasks. With a VLA trained on human video that predicts human wris…

@arXiv_quantph_bot@mastoxiv.page
2025-07-17 10:02:20

BenchRL-QAS: Benchmarking reinforcement learning algorithms for quantum architecture search
Azhar Ikhtiarudin, Aditi Das, Param Thakkar, Akash Kundu
https://arxiv.org/abs/2507.12189

BenchRL-QAS: Benchmarking reinforcement learning algorithms for quantum architecture search
We introduce BenchRL-QAS, a unified benchmarking framework for systematically evaluating reinforcement learning (RL) algorithms in quantum architecture search (QAS) across diverse variational quantum algorithm tasks and system sizes ranging from 2- to 8-qubit. Our study benchmarks nine RL agents including both value-based and policy-gradient methods on representative quantum problems such as variational quantum eigensolver, variational quantum state diagonalization, quantum classification, and …

@arXiv_csCR_bot@mastoxiv.page
2025-08-18 08:24:50

Activate Me!: Designing Efficient Activation Functions for Privacy-Preserving Machine Learning with Fully Homomorphic Encryption
Nges Brian Njungle, Michel A. Kinsy
https://arxiv.org/abs/2508.11575

Activate Me!: Designing Efficient Activation Functions for Privacy-Preserving Machine Learning with Fully Homomorphic Encryption
The growing adoption of machine learning in sensitive areas such as healthcare and defense introduces significant privacy and security challenges. These domains demand robust data protection, as models depend on large volumes of sensitive information for both training and inference. Fully Homomorphic Encryption (FHE) presents a compelling solution by enabling computations directly on encrypted data, maintaining confidentiality across the entire machine learning workflow. However, FHE inherently…

@arXiv_csSE_bot@mastoxiv.page
2025-06-17 09:55:37

The CAISAR Platform: Extending the Reach of Machine Learning Specification and Verification
Michele Alberti (LSL), François Bobot (LSL), Julien Girard-Satabin (LSL), Alban Grastien (LSL), Aymeric Varasse (LSL), Zakaria Chihani (LSL)
https://arxiv.org/abs/2506.12084

The CAISAR Platform: Extending the Reach of Machine Learning Specification and Verification
The formal specification and verification of machine learning programs saw remarkable progress in less than a decade, leading to a profusion of tools. However, diversity may lead to fragmentation, resulting in tools that are difficult to compare, except for very specific benchmarks. Furthermore, this progress is heavily geared towards the specification and verification of a certain class of property, that is, local robustness properties. But while provers are becoming more and more efficient at…

@arXiv_csLG_bot@mastoxiv.page
2025-07-17 10:16:00

Information-Theoretic Generalization Bounds of Replay-based Continual Learning
Wen Wen, Tieliang Gong, Yunjiao Zhang, Zeyu Gao, Weizhan Zhang, Yong-Jin Liu
https://arxiv.org/abs/2507.12043

Information-Theoretic Generalization Bounds of Replay-based Continual Learning
Continual learning (CL) has emerged as a dominant paradigm for acquiring knowledge from sequential tasks while avoiding catastrophic forgetting. Although many CL methods have been proposed to show impressive empirical performance, the theoretical understanding of their generalization behavior remains limited, particularly for replay-based approaches. In this paper, we establish a unified theoretical framework for replay-based CL, deriving a series of information-theoretic bounds that explicitly…

@arXiv_csCV_bot@mastoxiv.page
2025-06-18 09:37:21

CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion
Jiahua Ma, Yiran Qin, Yixiong Li, Xuanqi Liao, Yulan Guo, Ruimao Zhang
https://arxiv.org/abs/2506.14769

CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion
Diffusion Policy (DP) enables robots to learn complex behaviors by imitating expert demonstrations through action diffusion. However, in practical applications, hardware limitations often degrade data quality, while real-time constraints restrict model inference to instantaneous state and scene observations. These limitations seriously reduce the efficacy of learning from expert demonstrations, resulting in failures in object localization, grasp planning, and long-horizon task execution. To add…

@arXiv_statML_bot@mastoxiv.page
2025-06-18 10:27:41

Rademacher learning rates for iterated random functions
Nikola Sandrić
https://arxiv.org/abs/2506.13946 https://arxiv.org/pdf/2…

Rademacher learning rates for iterated random functions
Most existing literature on supervised machine learning assumes that the training dataset is drawn from an i.i.d. sample. However, many real-world problems exhibit temporal dependence and strong correlations between the marginal distributions of the data-generating process, suggesting that the i.i.d. assumption is often unrealistic. In such cases, models naturally include time-series processes with mixing properties, as well as irreducible and aperiodic ergodic Markov chains. Moreover, the lear…

@arXiv_csRO_bot@mastoxiv.page
2025-07-18 08:35:12

Learning to Predict Mobile Robot Stability in Off-Road Environments
Nathaniel Rose, Arif Ahmed, Emanuel Gutierrez-Cornejo, Parikshit Maini
https://arxiv.org/abs/2507.12731

Learning to Predict Mobile Robot Stability in Off-Road Environments
Navigating in off-road environments for wheeled mobile robots is challenging due to dynamic and rugged terrain. Traditional physics-based stability metrics, such as Static Stability Margin (SSM) or Zero Moment Point (ZMP) require knowledge of contact forces, terrain geometry, and the robot's precise center-of-mass that are difficult to measure accurately in real-world field conditions. In this work, we propose a learning-based approach to estimate robot platform stability directly from proprioc…

@arXiv_csLG_bot@mastoxiv.page
2025-07-17 10:20:50

FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale
Boris Bonev, Thorsten Kurth, Ankur Mahesh, Mauro Bisson, Jean Kossaifi, Karthik Kashinath, Anima Anandkumar, William D. Collins, Michael S. Pritchard, Alexander Keller
https://arxiv.org/abs/2507.12144…

FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale
FourCastNet 3 advances global weather modeling by implementing a scalable, geometric machine learning (ML) approach to probabilistic ensemble forecasting. The approach is designed to respect spherical geometry and to accurately model the spatially correlated probabilistic nature of the problem, resulting in stable spectra and realistic dynamics across multiple scales. FourCastNet 3 delivers forecasting accuracy that surpasses leading conventional ensemble models and rivals the best diffusion-ba…

@arXiv_csCV_bot@mastoxiv.page
2025-08-18 09:52:30

Inside Knowledge: Graph-based Path Generation with Explainable Data Augmentation and Curriculum Learning for Visual Indoor Navigation
Daniel Airinei, Elena Burceanu, Marius Leordeanu
https://arxiv.org/abs/2508.11446

Inside Knowledge: Graph-based Path Generation with Explainable Data Augmentation and Curriculum Learning for Visual Indoor Navigation
Indoor navigation is a difficult task, as it generally comes with poor GPS access, forcing solutions to rely on other sources of information. While significant progress continues to be made in this area, deployment to production applications is still lacking, given the complexity and additional requirements of current solutions. Here, we introduce an efficient, real-time and easily deployable deep learning approach, based on visual input only, that can predict the direction towards a target fro…

@arXiv_quantph_bot@mastoxiv.page
2025-07-18 08:38:22

Leveraging Quantum Layers in Classical Neural Networks
Silvie Illésová
https://arxiv.org/abs/2507.12505 https://arxiv.org…

Leveraging Quantum Layers in Classical Neural Networks
Hybrid quantum-classical neural networks represent a promising frontier in the search for improved machine learning models. This thesis explores the integration of quantum layers within classical convolutional neural network architectures, aiming to leverage quantum entanglement and feature mapping to enhance learning capabilities. A detailed methodology for constructing and training such hybrid models is presented, using PyTorch and Qiskit Machine Learning frameworks. Experiments investigate t…

@arXiv_csCR_bot@mastoxiv.page
2025-07-18 07:31:52

Safeguarding Federated Learning-based Road Condition Classification
Sheng Liu, Panos Papadimitratos
https://arxiv.org/abs/2507.12568 https://

Safeguarding Federated Learning-based Road Condition Classification
Federated Learning (FL) has emerged as a promising solution for privacy-preserving autonomous driving, specifically camera-based Road Condition Classification (RCC) systems, harnessing distributed sensing, computing, and communication resources on board vehicles without sharing sensitive image data. However, the collaborative nature of FL-RCC frameworks introduces new vulnerabilities: Targeted Label Flipping Attacks (TLFAs), in which malicious clients (vehicles) deliberately alter their trainin…

@arXiv_statML_bot@mastoxiv.page
2025-06-17 12:30:01

Understanding Learning Invariance in Deep Linear Networks
Hao Duan, Guido Montúfar
https://arxiv.org/abs/2506.13714 https://arx…

Understanding Learning Invariance in Deep Linear Networks
Equivariant and invariant machine learning models exploit symmetries and structural patterns in data to improve sample efficiency. While empirical studies suggest that data-driven methods such as regularization and data augmentation can perform comparably to explicitly invariant models, theoretical insights remain scarce. In this paper, we provide a theoretical comparison of three approaches for achieving invariance: data augmentation, regularization, and hard-wiring. We focus on mean squared e…

@arXiv_csRO_bot@mastoxiv.page
2025-08-18 09:09:00

Multi-Group Equivariant Augmentation for Reinforcement Learning in Robot Manipulation
Hongbin Lin, Juan Rojas, Kwok Wai Samuel Au
https://arxiv.org/abs/2508.11204 https://

Multi-Group Equivariant Augmentation for Reinforcement Learning in Robot Manipulation
Sampling efficiency is critical for deploying visuomotor learning in real-world robotic manipulation. While task symmetry has emerged as a promising inductive bias to improve efficiency, most prior work is limited to isometric symmetries -- applying the same group transformation to all task objects across all timesteps. In this work, we explore non-isometric symmetries, applying multiple independent group transformations across spatial and temporal dimensions to relax these constraints. We intr…

@arXiv_csLG_bot@mastoxiv.page
2025-07-17 10:25:50

Improving Reinforcement Learning Sample-Efficiency using Local Approximation
Mohit Prashant, Arvind Easwaran
https://arxiv.org/abs/2507.12383 https://

Improving Reinforcement Learning Sample-Efficiency using Local Approximation
In this study, we derive Probably Approximately Correct (PAC) bounds on the asymptotic sample-complexity for RL within the infinite-horizon Markov Decision Process (MDP) setting that are sharper than those in existing literature. The premise of our study is twofold: firstly, the further two states are from each other, transition-wise, the less relevant the value of the first state is when learning the $ε$-optimal value of the second; secondly, the amount of 'effort', sample-complexity-wise, ex…

@arXiv_csRO_bot@mastoxiv.page
2025-08-18 09:07:50

Actor-Critic for Continuous Action Chunks: A Reinforcement Learning Framework for Long-Horizon Robotic Manipulation with Sparse Reward
Jiarui Yang, Bin Zhu, Jingjing Chen, Yu-Gang Jiang
https://arxiv.org/abs/2508.11143

Actor-Critic for Continuous Action Chunks: A Reinforcement Learning Framework for Long-Horizon Robotic Manipulation with Sparse Reward
Existing reinforcement learning (RL) methods struggle with long-horizon robotic manipulation tasks, particularly those involving sparse rewards. While action chunking is a promising paradigm for robotic manipulation, using RL to directly learn continuous action chunks in a stable and data-efficient manner remains a critical challenge. This paper introduces AC3 (Actor-Critic for Continuous Chunks), a novel RL framework that learns to generate high-dimensional, continuous action sequences. To mak…

@arXiv_statML_bot@mastoxiv.page
2025-06-17 12:19:53

Random Matrix Theory for Deep Learning: Beyond Eigenvalues of Linear Models
Zhenyu Liao, Michael W. Mahoney
https://arxiv.org/abs/2506.13139 https://

Random Matrix Theory for Deep Learning: Beyond Eigenvalues of Linear Models
Modern Machine Learning (ML) and Deep Neural Networks (DNNs) often operate on high-dimensional data and rely on overparameterized models, where classical low-dimensional intuitions break down. In particular, the proportional regime where the data dimension, sample size, and number of model parameters are all large and comparable, gives rise to novel and sometimes counterintuitive behaviors. This paper extends traditional Random Matrix Theory (RMT) beyond eigenvalue-based analysis of linear mode…

@arXiv_csCR_bot@mastoxiv.page
2025-07-18 09:07:02

A Crowdsensing Intrusion Detection Dataset For Decentralized Federated Learning Models
Chao Feng, Alberto Huertas Celdran, Jing Han, Heqing Ren, Xi Cheng, Zien Zeng, Lucas Krauter, Gerome Bovet, Burkhard Stiller
https://arxiv.org/abs/2507.13313

A Crowdsensing Intrusion Detection Dataset For Decentralized Federated Learning Models
This paper introduces a dataset and experimental study for decentralized federated learning (DFL) applied to IoT crowdsensing malware detection. The dataset comprises behavioral records from benign and eight malware families. A total of 21,582,484 original records were collected from system calls, file system activities, resource usage, kernel events, input/output events, and network records. These records were aggregated into 30-second windows, resulting in 342,106 features used for model trai…

@arXiv_csRO_bot@mastoxiv.page
2025-07-18 09:51:32

Evaluating Reinforcement Learning Algorithms for Navigation in Simulated Robotic Quadrupeds: A Comparative Study Inspired by Guide Dog Behaviour
Emma M. A. Harrison
https://arxiv.org/abs/2507.13277

Evaluating Reinforcement Learning Algorithms for Navigation in Simulated Robotic Quadrupeds: A Comparative Study Inspired by Guide Dog Behaviour
Robots are increasingly integrated across industries, particularly in healthcare. However, many valuable applications for quadrupedal robots remain overlooked. This research explores the effectiveness of three reinforcement learning algorithms in training a simulated quadruped robot for autonomous navigation and obstacle avoidance. The goal is to develop a robotic guide dog simulation capable of path following and obstacle avoidance, with long-term potential for real-world assistance to guide d…

@arXiv_csCR_bot@mastoxiv.page
2025-06-17 09:32:15

InverTune: Removing Backdoors from Multimodal Contrastive Learning Models via Trigger Inversion and Activation Tuning
Mengyuan Sun, Yu Li, Yuchen Liu, Bo Du, Yunjie Ge
https://arxiv.org/abs/2506.12411

InverTune: Removing Backdoors from Multimodal Contrastive Learning Models via Trigger Inversion and Activation Tuning
Multimodal contrastive learning models like CLIP have demonstrated remarkable vision-language alignment capabilities, yet their vulnerability to backdoor attacks poses critical security risks. Attackers can implant latent triggers that persist through downstream tasks, enabling malicious control of model behavior upon trigger presentation. Despite great success in recent defense mechanisms, they remain impractical due to strong assumptions about attacker knowledge or excessive clean data requir…

@arXiv_statML_bot@mastoxiv.page
2025-06-17 12:10:25

A Transfer Learning Framework for Multilayer Networks via Model Averaging
Yongqin Qiu, Xinyu Zhang
https://arxiv.org/abs/2506.12455 https://

A Transfer Learning Framework for Multilayer Networks via Model Averaging
Link prediction in multilayer networks is a key challenge in applications such as recommendation systems and protein-protein interaction prediction. While many techniques have been developed, most rely on assumptions about shared structures and require access to raw auxiliary data, limiting their practicality. To address these issues, we propose a novel transfer learning framework for multilayer networks using a bi-level model averaging method. A $K$-fold cross-validation criterion based on edg…

@arXiv_csCV_bot@mastoxiv.page
2025-08-18 09:53:20

CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models
Xiaoxue Wu, Bingjie Gao, Yu Qiao, Yaohui Wang, Xinyuan Chen
https://arxiv.org/abs/2508.11484

CineTrans: Learning to Generate Videos with Cinematic Transitions via Masked Diffusion Models
Despite significant advances in video synthesis, research into multi-shot video generation remains in its infancy. Even with scaled-up models and massive datasets, the shot transition capabilities remain rudimentary and unstable, largely confining generated videos to single-shot sequences. In this work, we introduce CineTrans, a novel framework for generating coherent multi-shot videos with cinematic, film-style transitions. To facilitate insights into the film editing style, we construct a mul…

@arXiv_csLG_bot@mastoxiv.page
2025-07-18 13:38:26

Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[2/5]:
- Learning Universal Human Mobility Patterns with a Foundation Model for Cross-domain Data Fusion
Haoxuan Ma, Xishun Liao, Yifan Liu, Qinhua Jiang, Chris Stanford, Shangqing Cao, Jiaqi Ma

@arXiv_csRO_bot@mastoxiv.page
2025-06-18 08:40:11

Quadrotor Morpho-Transition: Learning vs Model-Based Control Strategies
Ioannis Mandralis, Richard M. Murray, Morteza Gharib
https://arxiv.org/abs/2506.14039

Quadrotor Morpho-Transition: Learning vs Model-Based Control Strategies
Quadrotor Morpho-Transition, or the act of transitioning from air to ground through mid-air transformation, involves complex aerodynamic interactions and a need to operate near actuator saturation, complicating controller design. In recent work, morpho-transition has been studied from a model-based control perspective, but these approaches remain limited due to unmodeled dynamics and the requirement for planning through contacts. Here, we train an end-to-end Reinforcement Learning (RL) controll…

@arXiv_csCR_bot@mastoxiv.page
2025-07-17 09:01:20

A Privacy-Preserving Framework for Advertising Personalization Incorporating Federated Learning and Differential Privacy
Xiang Li, Yifan Lin, Yuanzhe Zhang
https://arxiv.org/abs/2507.12098

A Privacy-Preserving Framework for Advertising Personalization Incorporating Federated Learning and Differential Privacy
To mitigate privacy leakage and performance issues in personalized advertising, this paper proposes a framework that integrates federated learning and differential privacy. The system combines distributed feature extraction, dynamic privacy budget allocation, and robust model aggregation to balance model accuracy, communication overhead, and privacy protection. Multi-party secure computing and anomaly detection mechanisms further enhance system resilience against malicious attacks. Experimental…

@arXiv_statML_bot@mastoxiv.page
2025-06-18 10:27:34

Beyond Shapley Values: Cooperative Games for the Interpretation of Machine Learning Models
Marouane Il Idrissi, Agathe Fernandes Machado, Arthur Charpentier
https://arxiv.org/abs/2506.13900

Beyond Shapley Values: Cooperative Games for the Interpretation of Machine Learning Models
Cooperative game theory has become a cornerstone of post-hoc interpretability in machine learning, largely through the use of Shapley values. Yet, despite their widespread adoption, Shapley-based methods often rest on axiomatic justifications whose relevance to feature attribution remains debatable. In this paper, we revisit cooperative game theory from an interpretability perspective and argue for a broader and more principled use of its tools. We highlight two general families of efficient al…

@arXiv_csCV_bot@mastoxiv.page
2025-07-18 10:22:02

$\pi^3$: Scalable Permutation-Equivariant Visual Geometry Learning
Yifan Wang, Jianjun Zhou, Haoyi Zhu, Wenzheng Chang, Yang Zhou, Zizun Li, Junyi Chen, Jiangmiao Pang, Chunhua Shen, Tong He
https://arxiv.org/abs/2507.13347

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning
We introduce $π^3$, a feed-forward neural network that offers a novel approach to visual geometry reconstruction, breaking the reliance on a conventional fixed reference view. Previous methods often anchor their reconstructions to a designated viewpoint, an inductive bias that can lead to instability and failures if the reference is suboptimal. In contrast, $π^3$ employs a fully permutation-equivariant architecture to predict affine-invariant camera poses and scale-invariant local point maps …

@arXiv_csLG_bot@mastoxiv.page
2025-07-17 13:52:20

Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[5/5]:
- Machine Learning-Driven Compensation for Non-Ideal Channels in AWG-Based FBG Interrogator
Kazakov, Kulichenko, Kovalev, Treskova, Barma, Malakhov, Oseledets, Shipulin

@arXiv_csRO_bot@mastoxiv.page
2025-07-18 09:44:02

ZipMPC: Compressed Context-Dependent MPC Cost via Imitation Learning
Rahel Rickenbach, Alan A. Lahoud, Erik Schaffernicht, Melanie N. Zeilinger, Johannes A. Stork
https://arxiv.org/abs/2507.13088

ZipMPC: Compressed Context-Dependent MPC Cost via Imitation Learning
The computational burden of model predictive control (MPC) limits its application on real-time systems, such as robots, and often requires the use of short prediction horizons. This not only affects the control performance, but also increases the difficulty of designing MPC cost functions that reflect the desired long-term objective. This paper proposes ZipMPC, a method that imitates a long-horizon MPC behaviour by learning a compressed and context-dependent cost function for a short-horizon MP…

@arXiv_csRO_bot@mastoxiv.page
2025-07-18 09:21:42

DEMONSTRATE: Zero-shot Language to Robotic Control via Multi-task Demonstration Learning
Rahel Rickenbach, Bruce Lee, René Zurbrügg, Carmen Amo Alonso, Melanie N. Zeilinger
https://arxiv.org/abs/2507.12855

DEMONSTRATE: Zero-shot Language to Robotic Control via Multi-task Demonstration Learning
The integration of large language models (LLMs) with control systems has demonstrated significant potential in various settings, such as task completion with a robotic manipulator. A main reason for this success is the ability of LLMs to perform in-context learning, which, however, strongly relies on the design of task examples, closely related to the target tasks. Consequently, employing LLMs to formulate optimal control problems often requires task examples that contain explicit mathematical …

@arXiv_csLG_bot@mastoxiv.page
2025-07-18 13:38:56

Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[5/5]:
- EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos
Yang, Yu, Wu, Yan, Li, Cheng, Zou, Fang, Yin, Liu, Han, Lu, Wang

@arXiv_csRO_bot@mastoxiv.page
2025-07-16 10:04:41

ILCL: Inverse Logic-Constraint Learning from Temporally Constrained Demonstrations
Minwoo Cho, Jaehwi Jang, Daehyung Park
https://arxiv.org/abs/2507.11000 …

ILCL: Inverse Logic-Constraint Learning from Temporally Constrained Demonstrations
We aim to solve the problem of temporal-constraint learning from demonstrations to reproduce demonstration-like logic-constrained behaviors. Learning logic constraints is challenging due to the combinatorially large space of possible specifications and the ill-posed nature of non-Markovian constraints. To figure it out, we introduce a novel temporal-constraint learning method, which we call inverse logic-constraint learning (ILCL). Our method frames ICL as a two-player zero-sum game between 1) …

@arXiv_csLG_bot@mastoxiv.page
2025-06-17 19:04:33

Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[1/8]:
Boosting Resource-Constrained Federated Learning Systems with Guessed Updates

@arXiv_csLG_bot@mastoxiv.page
2025-06-17 19:05:35

Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[7/8]:
Achieving Collective Welfare in Multi-Agent Reinforcement Learning via Suggestion Sharing
