
2025-06-17 09:56:33
Device-Cloud Collaborative Correction for On-Device Recommendation
Tianyu Zhan, Shengyu Zhang, Zheqi Lv, Jieming Zhu, Jiwei Li, Fan Wu, Fei Wu
https://arxiv.org/abs/2506.12687
NeuralOS: Towards Simulating Operating Systems via Neural Generative Models
Luke Rivard, Sun Sun, Hongyu Guo, Wenhu Chen, Yuntian Deng
https://arxiv.org/abs/2507.08800
Towards Bio-Inspired Robotic Trajectory Planning via Self-Supervised RNN
Miroslav Cibula, Kristína Malinovská, Matthias Kerzel
https://arxiv.org/abs/2507.02171
This https://arxiv.org/abs/2505.22083 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_qu…
Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs
Ziyue Li, Yang Li, Tianyi Zhou
https://arxiv.org/abs/2507.07996 https://arxiv.org/pdf/2507.07996 https://arxiv.org/html/2507.07996
arXiv:2507.07996v1 Announce Type: new
Abstract: Can a pretrained neural network adapt its architecture to different inputs without any finetuning? Do we need all layers for simple tasks, and are they adequate for challenging tasks? We find that the layers of a pretrained large language model (LLM) can be manipulated as separate modules to build a better, and even shallower, model customized for each test sample. In particular, each layer of the pretrained model can be skipped/pruned or repeated multiple times, as in a recurrent neural network (RNN), and stacked with others in arbitrary orders, yielding a chain-of-layers (CoLa) per sample. This compositional space greatly expands the scope of existing work on looped/recurrent pretrained modules, layer pruning, and early-exit networks. We develop a Monte Carlo Tree Search (MCTS) protocol to explore and identify the optimal CoLa for each sample from math and commonsense reasoning benchmarks. Compared to a static model of fixed depth, CoLa allows shortcut paths (fast thinking), recurrence of the same layer(s) (slow thinking), and combinations of both, offering more flexible, dynamic architectures for different inputs. Our extensive analysis of the MCTS-optimized CoLa yields two key findings: (1) for >75% of samples that the original LLM predicts correctly, we can find a shorter CoLa, suggesting a large space for improving inference efficiency; (2) for >60% of samples that it originally predicts incorrectly, we can identify a CoLa that yields correct predictions, suggesting a large space for performance enhancement. Our results highlight the shortcomings of using a fixed architecture of pretrained LLMs for inference on different samples and pave the way to unlocking the generalization power of test-time depth adaptation.
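To make the chain-of-layers idea concrete, here is a minimal sketch (not the paper's code) of running a pretrained decoder-only model through an arbitrary per-sample sequence of layer indices, with layers skipped or repeated. The GPT-2 model choice and the run_cola helper are illustrative assumptions, and the MCTS search over layer sequences is omitted.

```python
# Minimal sketch of the chain-of-layers (CoLa) idea: treat each decoder layer
# of a pretrained LLM as a reusable module and run a per-sample sequence of
# layer indices, where a layer may be skipped (pruned) or repeated (recurrence).
# Assumptions: GPT-2 as the backbone, no padding, eval mode; this is not the
# authors' implementation and ignores per-architecture details (e.g. KV caches).

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumed backbone: any model exposing transformer.h as a layer list
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

@torch.no_grad()
def run_cola(text, layer_sequence):
    """Run the model with a custom chain of layers.

    layer_sequence: e.g. [0, 1, 1, 5, 11] -- layer 1 repeated (slow thinking),
    layers 2-4 and 6-10 skipped (fast thinking). Layers are applied in the
    given order on a single unpadded sequence.
    """
    ids = tok(text, return_tensors="pt").input_ids
    positions = torch.arange(ids.size(1)).unsqueeze(0)
    h = model.transformer.wte(ids) + model.transformer.wpe(positions)
    for idx in layer_sequence:
        h = model.transformer.h[idx](h)[0]  # each block returns a tuple; keep hidden states
    h = model.transformer.ln_f(h)
    return model.lm_head(h)  # logits over the vocabulary

# Example: a shallower shortcut path vs. a path that loops one layer.
fast_logits = run_cola("2 + 2 =", [0, 3, 6, 9, 11])
slow_logits = run_cola("2 + 2 =", [0, 1, 2, 2, 2, 5, 8, 11])
```

In the paper's setting, the per-sample layer sequence would be selected by an MCTS-style search against task accuracy; the sketch above only shows how a chosen sequence could be executed.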
toXiv_bot_toot
Hybrid Approach for Electricity Price Forecasting using AlexNet and LSTM
Bosubabu Sambana, Kotamsetty Geethika Devi, Bandi Rajeswara Reddy, Galeti Mohammad Hussain, Gownivalla Siddartha
https://arxiv.org/abs/2506.23504
Time Resolution Independent Operator Learning
Diab W. Abueidda, Mbebo Nonna, Panos Pantidis, Mostafa E. Mobasher
https://arxiv.org/abs/2507.02524
Sequence-to-Sequence Models with Attention Mechanistically Map to the Architecture of Human Memory Search
Nikolaus Salvatore, Qiong Zhang
https://arxiv.org/abs/2506.17424