Beyond ReLU: How Activations Affect Neural Kernels and Random Wide Networks
David Holzmüller, Max Schölpple
https://arxiv.org/abs/2506.22429
While the theory of deep learning has made some progress in recent years, much of it is limited to the ReLU activation function. In particular, while the neural tangent kernel (NTK) and the neural network Gaussian process (NNGP) kernel have given theoreticians tractable limiting cases of fully connected neural networks, their properties are poorly understood for most activation functions other than powers of ReLU. Our main contribution is to provide a more general characterization of t…
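
To make concrete why ReLU (and its powers) is the tractable special case the abstract refers to: for ReLU, the Gaussian expectations in the standard NNGP/NTK layer recursions have a closed form (the arc-cosine kernel), whereas for generic activations they do not. The sketch below implements those well-known recursions for a fully connected ReLU network; it is illustrative background, not the paper's method, and the depth/variance parameters are arbitrary example values.

import numpy as np

def relu_nngp_ntk(x1, x2, depth=3, sigma_w2=2.0, sigma_b2=0.0):
    # Sketch of the standard NNGP/NTK recursions for a fully connected
    # ReLU network, using the arc-cosine closed form for the Gaussian
    # expectations. Parameter names/defaults are illustrative assumptions.
    d = x1.shape[0]
    # Layer-0 (input) covariances.
    k11 = sigma_w2 * (x1 @ x1) / d + sigma_b2
    k22 = sigma_w2 * (x2 @ x2) / d + sigma_b2
    k12 = sigma_w2 * (x1 @ x2) / d + sigma_b2
    ntk = k12  # NTK starts from the input covariance
    for _ in range(depth):
        # Closed-form expectations E[relu(u) relu(v)] and E[relu'(u) relu'(v)]
        # for (u, v) jointly Gaussian with covariances (k11, k12, k22).
        cos_t = np.clip(k12 / np.sqrt(k11 * k22), -1.0, 1.0)
        theta = np.arccos(cos_t)
        ex_phi = np.sqrt(k11 * k22) * (np.sin(theta) + (np.pi - theta) * cos_t) / (2 * np.pi)
        ex_dphi = (np.pi - theta) / (2 * np.pi)
        # NNGP layer update (E[relu(u)^2] = k11 / 2 on the diagonal).
        k11, k22 = sigma_w2 * k11 / 2 + sigma_b2, sigma_w2 * k22 / 2 + sigma_b2
        k12_new = sigma_w2 * ex_phi + sigma_b2
        # NTK layer update: Theta^{(l)} = Sigma^{(l)} + Theta^{(l-1)} * sigma_w^2 * E[phi'(u) phi'(v)].
        ntk = k12_new + ntk * sigma_w2 * ex_dphi
        k12 = k12_new
    return k12, ntk  # (NNGP kernel value, NTK value) for the pair (x1, x2)

For a general activation, the expectation ex_phi above has to be replaced by a two-dimensional Gaussian integral (or a Hermite-series expansion), which is exactly where the analysis beyond ReLU becomes harder.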