Tootfinder

Opt-in global Mastodon full text search. Join the index!

@arXiv_csLG_bot@mastoxiv.page
2025-10-09 10:37:21

Grouped Differential Attention
Junghwan Lim, Sungmin Lee, Dongseok Kim, Wai Ting Cheung, Beomgyu Kim, Taehwan Kim, Haesol Lee, Junhyeok Lee, Dongpin Oh, Eunhwan Park
arxiv.org/abs/2510.06949

@arXiv_csCL_bot@mastoxiv.page
2025-10-14 13:16:18

Deconstructing Attention: Investigating Design Principles for Effective Language Modeling
Huiyin Xue, Nafise Sadat Moosavi, Nikolaos Aletras
arxiv.org/abs/2510.11602

@arXiv_csAI_bot@mastoxiv.page
2025-10-15 10:10:21

Biased-Attention Guided Risk Prediction for Safe Decision-Making at Unsignalized Intersections
Chengyang Dong, Nan Guo
arxiv.org/abs/2510.12428

@arXiv_csCV_bot@mastoxiv.page
2025-10-09 10:38:51

DADO: A Depth-Attention framework for Object Discovery
Federico Gonzalez, Estefania Talavera, Petia Radeva
arxiv.org/abs/2510.07089 arxiv.o…

@arXiv_csIR_bot@mastoxiv.page
2025-10-14 09:15:38

Integrating Structure-Aware Attention and Knowledge Graphs in Explainable Recommendation Systems
Shuangquan Lyu, Ming Wang, Huajun Zhang, Jiasen Zheng, Junjiang Lin, Xiaoxuan Sun
arxiv.org/abs/2510.10109

@arXiv_condmatmtrlsci_bot@mastoxiv.page
2025-10-15 10:08:21

Self-attention enabled quantum path analysis of high-harmonic generation in solids
Cong Zhao, Xiaozhou Zou
arxiv.org/abs/2510.12443 arxiv.o…

@arXiv_csSD_bot@mastoxiv.page
2025-10-01 09:43:38

The silence of the weights: an investigation of structural pruning strategies for attention-based audio signal architectures
Andrea Diecidue, Carlo Alberto Barbano, Piero Fraternali, Mathieu Fontaine, Enzo Tartaglione
arxiv.org/abs/2509.26207

@arXiv_csLG_bot@mastoxiv.page
2025-10-13 10:44:30

Cross-attention Secretly Performs Orthogonal Alignment in Recommendation Models
Hyunin Lee, Yong Zhang, Hoang Vu Nguyen, Xiaoyi Liu, Namyong Park, Christopher Jung, Rong Jin, Yang Wang, Zhigang Wang, Somayeh Sojoudi, Xue Feng
arxiv.org/abs/2510.09435

@arXiv_eessIV_bot@mastoxiv.page
2025-10-13 08:46:00

Progressive Uncertainty-Guided Evidential U-KAN for Trustworthy Medical Image Segmentation
Zhen Yang, Yansong Ma, Lei Chen
arxiv.org/abs/2510.08949

@arXiv_csCL_bot@mastoxiv.page
2025-10-15 10:27:41

Credal Transformer: A Principled Approach for Quantifying and Mitigating Hallucinations in Large Language Models
Shihao Ji, Zihui Song, Jiajie Huang
arxiv.org/abs/2510.12137

@arXiv_csCV_bot@mastoxiv.page
2025-10-10 11:11:09

Gaze on the Prize: Shaping Visual Attention with Return-Guided Contrastive Learning
Andrew Lee, Ian Chuang, Dechen Gao, Kai Fukazawa, Iman Soltani
arxiv.org/abs/2510.08442

@arXiv_csAI_bot@mastoxiv.page
2025-10-01 11:45:57

HilbertA: Hilbert Attention for Image Generation with Diffusion Models
Shaoyi Zheng, Wenbo Lu, Yuxuan Xia, Haomin Liu, Shengjie Wang
arxiv.org/abs/2509.26538

@arXiv_eessSP_bot@mastoxiv.page
2025-10-15 08:27:42

A Deep Multi-Task Learning Approach to Impulsive Noise Parameter Estimation
Abdullahi Mohammad, Bdah Eya, Bassant Selim
arxiv.org/abs/2510.12179

@arXiv_csCE_bot@mastoxiv.page
2025-10-15 07:36:21

Agent-Based Simulation of a Financial Market with Large Language Models
Ryuji Hashimoto, Takehiro Takayanagi, Masahiro Suzuki, Kiyoshi Izumi
arxiv.org/abs/2510.12189

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:32:50

Spatially-informed transformers: Injecting geostatistical covariance biases into self-attention for spatio-temporal forecasting
Yuri Calleo
arxiv.org/abs/2512.17696 arxiv.org/pdf/2512.17696 arxiv.org/html/2512.17696
arXiv:2512.17696v1 Announce Type: new
Abstract: The modeling of high-dimensional spatio-temporal processes presents a fundamental dichotomy between the probabilistic rigor of classical geostatistics and the flexible, high-capacity representations of deep learning. While Gaussian processes offer theoretical consistency and exact uncertainty quantification, their prohibitive computational scaling renders them impractical for massive sensor networks. Conversely, modern transformer architectures excel at sequence modeling but inherently lack a geometric inductive bias, treating spatial sensors as permutation-invariant tokens without a native understanding of distance. In this work, we propose a spatially-informed transformer, a hybrid architecture that injects a geostatistical inductive bias directly into the self-attention mechanism via a learnable covariance kernel. By formally decomposing the attention structure into a stationary physical prior and a non-stationary data-driven residual, we impose a soft topological constraint that favors spatially proximal interactions while retaining the capacity to model complex dynamics. We demonstrate the phenomenon of "Deep Variography", where the network successfully recovers the true spatial decay parameters of the underlying process end-to-end via backpropagation. Extensive experiments on synthetic Gaussian random fields and real-world traffic benchmarks confirm that our method outperforms state-of-the-art graph neural networks. Furthermore, rigorous statistical validation confirms that the proposed method delivers not only superior predictive accuracy but also well-calibrated probabilistic forecasts, effectively bridging the gap between physics-aware modeling and data-driven learning.
toXiv_bot_toot
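
To make the mechanism described in the abstract above concrete, here is a minimal sketch (not the authors' implementation) of one way to inject a geostatistical covariance bias into self-attention logits, with the kernel's range parameter learned end-to-end by backpropagation. The class name SpatialBiasedAttention and the choice of an exponential covariance kernel are illustrative assumptions; the paper's exact kernel, decomposition, and forecasting head may differ.

# Minimal sketch, assuming PyTorch and an exponential covariance kernel.
# Hypothetical names: SpatialBiasedAttention, log_length_scale.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialBiasedAttention(nn.Module):
    def __init__(self, dim, n_heads):
        super().__init__()
        self.n_heads = n_heads
        self.head_dim = dim // n_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        # Learnable range (length scale) of the stationary covariance prior,
        # recovered end-to-end via backpropagation ("Deep Variography").
        self.log_length_scale = nn.Parameter(torch.zeros(1))

    def forward(self, x, coords):
        # x: (batch, n_sensors, dim); coords: (n_sensors, 2) sensor locations
        b, n, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, n, self.n_heads, self.head_dim).transpose(1, 2)
        k = k.view(b, n, self.n_heads, self.head_dim).transpose(1, 2)
        v = v.view(b, n, self.n_heads, self.head_dim).transpose(1, 2)

        # Non-stationary, data-driven attention logits.
        logits = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5

        # Stationary geostatistical prior: log of an exponential kernel
        # exp(-||s_i - s_j|| / rho), added as a soft bias so spatially
        # proximal sensors attend to each other more strongly.
        dist = torch.cdist(coords, coords)          # (n_sensors, n_sensors)
        rho = self.log_length_scale.exp()
        logits = logits + (-dist / rho)              # broadcast over batch/heads

        attn = F.softmax(logits, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, d)
        return self.out(out)

# Usage example: 8 sensors in 2-D, batch of 4 time windows, 64-dim features.
# layer = SpatialBiasedAttention(dim=64, n_heads=4)
# y = layer(torch.randn(4, 8, 64), torch.rand(8, 2))

The additive bias keeps the standard softmax attention intact, so the data-driven term can override the spatial prior where the dynamics demand it, which is one plausible reading of the "soft topological constraint" described in the abstract.
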

@arXiv_condmatmtrlsci_bot@mastoxiv.page
2025-10-10 08:21:39

Attention to Order: Transformers Discover Phase Transitions via Learnability
Şener Özönder
arxiv.org/abs/2510.07401 arxiv…

@arXiv_csAI_bot@mastoxiv.page
2025-10-01 11:28:27

LMILAtt: A Deep Learning Model for Depression Detection from Social Media Users Enhanced by Multi-Instance Learning Based on Attention Mechanism
Yukun Yang
arxiv.org/abs/2509.26145

@arXiv_csCL_bot@mastoxiv.page
2025-10-03 10:40:01

Learning to Look at the Other Side: A Semantic Probing Study of Word Embeddings in LLMs with Enabled Bidirectional Attention
Zhaoxin Feng, Jianfei Ma, Emmanuele Chersoni, Xiaojing Zhao, Xiaoyi Bao
arxiv.org/abs/2510.01652

@arXiv_csLG_bot@mastoxiv.page
2025-10-07 13:04:42

On Structured State-Space Duality
Jerry Yao-Chieh Hu, Xiwen Zhang, Weimin Wu, Han Liu
arxiv.org/abs/2510.04944 arxiv.org/pdf/2510.04944

@arXiv_csSD_bot@mastoxiv.page
2025-10-10 08:33:28

Personality-Enhanced Multimodal Depression Detection in the Elderly
Honghong Wang, Jing Deng, Rong Zheng
arxiv.org/abs/2510.08004 arxiv.org…

@arXiv_csLG_bot@mastoxiv.page
2025-10-02 11:09:21

Privacy Preserved Federated Learning with Attention-Based Aggregation for Biometric Recognition
Kassahun Azezew, Minyechil Alehegn, Tsega Asresa, Bitew Mekuria, Tizazu Bayh, Ayenew Kassie, Amsalu Tesema, Animut Embiyale
arxiv.org/abs/2510.01113

@arXiv_csCL_bot@mastoxiv.page
2025-10-13 12:01:10

Crosslisted article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[1/2]:
- Limitations of Normalization in Attention Mechanism
Timur Mudarisov, Mikhail Burtsev, Tatiana Petrova, Radu State

@arXiv_csLG_bot@mastoxiv.page
2025-10-10 11:15:29

In-Context Clustering with Large Language Models
Ying Wang, Mengye Ren, Andrew Gordon Wilson
arxiv.org/abs/2510.08466 arxiv.org/pdf/2510.08…

@arXiv_csCV_bot@mastoxiv.page
2025-10-02 10:54:11

Feature Identification for Hierarchical Contrastive Learning
Julius Ott, Nastassia Vysotskaya, Huawei Sun, Lorenzo Servadei, Robert Wille
arxiv.org/abs/2510.00837

@arXiv_csCL_bot@mastoxiv.page
2025-10-13 10:29:50

Mask Tokens as Prophet: Fine-Grained Cache Eviction for Efficient dLLM Inference
Jianuo Huang, Yaojie Zhang, Yicun Yang, Benhao Huang, Biqing Qi, Dongrui Liu, Linfeng Zhang
arxiv.org/abs/2510.09309

@arXiv_csLG_bot@mastoxiv.page
2025-10-01 11:57:07

TASP: Topology-aware Sequence Parallelism
Yida Wang (Capital Normal University, Infinigence-AI), Ke Hong (Tsinghua University, Infinigence-AI), Xiuhong Li (Infinigence-AI), Yuanchao Xu (Capital Normal University), Wenxun Wang (Tsinghua University), Guohao Dai (Infinigence-AI, Shanghai Jiao Tong University), Yu Wang (Tsinghua University)

@arXiv_csLG_bot@mastoxiv.page
2025-10-10 11:13:29

Synthetic Series-Symbol Data Generation for Time Series Foundation Models
Wenxuan Wang, Kai Wu, Yujian Betterest Li, Dan Wang, Xiaoyu Zhang
arxiv.org/abs/2510.08445

@arXiv_csLG_bot@mastoxiv.page
2025-10-06 10:25:29

Signature-Informed Transformer for Asset Allocation
Yoontae Hwang, Stefan Zohren
arxiv.org/abs/2510.03129 arxiv.org/pdf/2510.03129

@arXiv_csLG_bot@mastoxiv.page
2025-10-02 11:07:11

Random Feature Spiking Neural Networks
Maximilian Gollwitzer, Felix Dietrich
arxiv.org/abs/2510.01012 arxiv.org/pdf/2510.01012