
2025-06-10 18:56:00
This https://arxiv.org/abs/2505.22083 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_qu…
Dense Associative Memory in a Nonlinear Optical Hopfield Neural Network
Khalid Musa, Santosh Kumar, Michael Katidis, Yu-Ping Huang
https://arxiv.org/abs/2506.07849
Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs
Ziyue Li, Yang Li, Tianyi Zhou
https://arxiv.org/abs/2507.07996 https://arxiv.org/pdf/2507.07996 https://arxiv.org/html/2507.07996
arXiv:2507.07996v1 Announce Type: new
Abstract: Can a pretrained neural network adapt its architecture to different inputs without any finetuning? Do we need all layers for simple tasks, and are they adequate for challenging ones? We find that the layers of a pretrained large language model (LLM) can be manipulated as separate modules to build a better and even shallower model customized for each test sample. In particular, each layer from the pretrained model can be skipped/pruned or repeated multiple times, as in recurrent neural networks (RNNs), and stacked with others in arbitrary orders, yielding a chain-of-layers (CoLa) per sample. This compositional space greatly expands the scope of existing work on looped/recurrent pretrained modules, layer pruning, and early-exit networks. We develop a Monte Carlo Tree Search (MCTS) protocol to explore and identify the optimal CoLa for each sample from math and commonsense reasoning benchmarks. Compared to a static model of fixed depth, CoLa allows shortcut paths (fast thinking), recurrence of the same layer(s) (slow thinking), and combinations of both, offering more flexible, dynamic architectures for different inputs. An extensive analysis of the MCTS-optimized CoLa yields two key findings: (1) for >75% of samples that the original LLM predicts correctly, we can find a shorter CoLa, suggesting large room for improving inference efficiency; (2) for >60% of samples it predicts incorrectly, we can identify a CoLa that yields the correct prediction, suggesting large room for performance enhancement. Our results highlight the shortcomings of using a fixed architecture of pretrained LLMs for inference on different samples and pave the way to unlocking the generalization power of test-time depth adaptation.
toXiv_bot_toot
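
The CoLa idea in the abstract above reduces to executing an arbitrary index sequence over a frozen layer stack. A minimal, self-contained sketch follows; ToyBlock, run_cola, and the hard-coded chain are illustrative stand-ins (not from the paper), where the per-sample chain would instead come from the MCTS search:

# Minimal sketch of per-sample chain-of-layers (CoLa) execution.
# In practice the blocks would be the frozen transformer layers of a
# pretrained LLM; here a toy residual block stands in for one layer.
import torch
import torch.nn as nn

class ToyBlock(nn.Module):
    """Stand-in for one frozen pretrained layer (hidden -> hidden)."""
    def __init__(self, dim):
        super().__init__()
        self.ff = nn.Sequential(nn.Linear(dim, dim), nn.GELU())
    def forward(self, h):
        return h + self.ff(h)  # residual form, so repeating a layer is stable

layers = nn.ModuleList(ToyBlock(64) for _ in range(8))  # 8-"layer" toy model

@torch.no_grad()
def run_cola(layers, h, chain):
    # `chain` lists layer indices: omitting an index skips/prunes that
    # layer; repeating an index applies the same layer recurrently.
    for i in chain:
        h = layers[i](h)
    return h

h = torch.randn(1, 16, 64)                        # (batch, tokens, dim)
out = run_cola(layers, h, [0, 1, 3, 4, 6, 6, 7])  # skip 2 and 5, loop 6 twice

No weights are updated; only the execution order over pretrained modules changes, which is what makes the search purely a test-time adaptation.
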
This https://arxiv.org/abs/2506.05588 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csNE_…
A Linear Generative Framework for Structure-Function Coupling in the Human Brain
Sam Frank Kelemen, Joaquín Goñi, Sérgio Pequito, Arian Ashourvan
https://arxiv.org/abs/2507.06136
This https://arxiv.org/abs/2406.03456 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_qbi…
Fine-Tuning MIDI-to-Audio Alignment using a Neural Network on Piano Roll and CQT Representations
Sebastian Murgul, Moritz Reiser, Michael Heizmann, Christoph Seibert
https://arxiv.org/abs/2506.22237
Preprocessing Methods for Memristive Reservoir Computing for Image Recognition
Rishona Daniels, Duna Wattad, Ronny Ronen, David Saad, Shahar Kvatinsky
https://arxiv.org/abs/2506.05588
ChemReservoir -- An Open-Source Framework for Chemically-Inspired Reservoir Computing
Mehmet Aziz Yirik, Jakob Lykke Andersen, Rolf Fagerberg, Daniel Merkle
https://arxiv.org/abs/2506.04249
This https://arxiv.org/abs/2506.01226 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…
Adaptive Neural Quantum States: A Recurrent Neural Network Perspective
Jake McNaughton, Mohamed Hibat-Allah
https://arxiv.org/abs/2507.18700 https://arxiv.…
Adaptive Market Intelligence: A Mixture of Experts Framework for Volatility-Sensitive Stock Forecasting
Diego Vallarino
https://arxiv.org/abs/2508.02686 https://
Multi-Utterance Speech Separation and Association Trained on Short Segments
Yuzhu Wang, Archontis Politis, Konstantinos Drossos, Tuomas Virtanen
https://arxiv.org/abs/2507.02562
ReCoGNet: Recurrent Context-Guided Network for 3D MRI Prostate Segmentation
Ahmad Mustafa, Reza Rastegar, Ghassan AlRegib
https://arxiv.org/abs/2506.19687 …
NeuralOS: Towards Simulating Operating Systems via Neural Generative Models
Luke Rivard, Sun Sun, Hongyu Guo, Wenhu Chen, Yuntian Deng
https://arxiv.org/abs/2507.08800
React to Surprises: Stable-by-Design Neural Feedback Control and the Youla-REN
Nicholas H. Barbara, Ruigang Wang, Alexandre Megretski, Ian R. Manchester
https://arxiv.org/abs/2506.01226
Organizational Regularities in Recurrent Neural Networks
Claus Metzner, Achim Schilling, Andreas Maier, Patrick Krauss
https://arxiv.org/abs/2505.22047 htt…
Iola Walker: A Mobile Footfall Detection System for Music Composition
Will James
https://arxiv.org/abs/2506.01211 https://arxiv.org/p…
Time Resolution Independent Operator Learning
Diab W. Abueidda, Mbebo Nonna, Panos Pantidis, Mostafa E. Mobasher
https://arxiv.org/abs/2507.02524 https://
Recurrent neural network-based robust control systems with closed-loop regional incremental ISS and application to MPC design
Daniele Ravasio, Marcello Farina, Alessio La Bella, Andrea Ballarino
https://arxiv.org/abs/2506.20334
Hebbian Memory-Augmented Recurrent Networks: Engram Neurons in Deep Learning
Daniel Szelogowski
https://arxiv.org/abs/2507.21474 https://arxiv.org/pdf/2507…
Replaced article(s) found for cond-mat.quant-gas. https://arxiv.org/list/cond-mat.quant-gas/new
[1/1]:
- Recurrent neural network wave functions for Rydberg atom arrays on kagome lattice
Mohamed Hibat-Allah, Ejaaz Merali, Giacomo Torlai, Roger G Melko, Juan Carrasquilla…
Brain-inspired interpretable reservoir computing with resonant recurrent neural networks
Mark A. Kramer
https://arxiv.org/abs/2506.17083 https://
Device-Cloud Collaborative Correction for On-Device Recommendation
Tianyu Zhan, Shengyu Zhang, Zheqi Lv, Jieming Zhu, Jiwei Li, Fan Wu, Fei Wu
https://arxiv.org/abs/2506.12687
Recursive KalmanNet: Deep Learning-Augmented Kalman Filtering for State Estimation with Consistent Uncertainty Quantification
Hassan Mortada, Cyril Falcon, Yanis Kahil, Mathéo Clavaud, Jean-Philippe Michel
https://arxiv.org/abs/2506.11639
Replaced article(s) found for cs.SD. https://arxiv.org/list/cs.SD/new
[1/1]:
- ReMi: A Random Recurrent Neural Network Approach to Music Production
Hugo Chateau-Laurent, Tara Vanhatalo, Wei-Tung Pan, Xavier Hinaut
Sequence-to-Sequence Models with Attention Mechanistically Map to the Architecture of Human Memory Search
Nikolaus Salvatore, Qiong Zhang
https://arxiv.org/abs/2506.17424
Biological Processing Units: Leveraging an Insect Connectome to Pioneer Biofidelic Neural Architectures
Siyu Yu, Zihan Qin, Tingshan Liu, Beiya Xu, R. Jacob Vogelstein, Jason Brown, Joshua T. Vogelstein
https://arxiv.org/abs/2507.10951