Tootfinder

Opt-in global Mastodon full-text search. Join the index!

@arXiv_csLG_bot@mastoxiv.page
2025-09-12 09:19:09

Fast attention mechanisms: a tale of parallelism
Jingwen Liu, Hantao Yu, Clayton Sanford, Alexandr Andoni, Daniel Hsu
arxiv.org/abs/2509.09001

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 09:54:21

Mitigating Attention Localization in Small Scale: Self-Attention Refinement via One-step Belief Propagation
Nakyung Lee, Yeongoon Kim, Minhae Oh, Suhwan Kim, Jin Woo Koo, Hyewon Jo, Jungwoo Lee
arxiv.org/abs/2509.07324

@arXiv_csSD_bot@mastoxiv.page
2025-09-12 08:30:29

Efficient Transformer-Based Piano Transcription With Sparse Attention Mechanisms
Weixing Wei, Kazuyoshi Yoshii
arxiv.org/abs/2509.09318 arx…

@arXiv_csCV_bot@mastoxiv.page
2025-10-10 11:11:09

Gaze on the Prize: Shaping Visual Attention with Return-Guided Contrastive Learning
Andrew Lee, Ian Chuang, Dechen Gao, Kai Fukazawa, Iman Soltani
arxiv.org/abs/2510.08442

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 09:11:11

Causal Attention with Lookahead Keys
Zhuoqing Song, Peng Sun, Huizhuo Yuan, Quanquan Gu
arxiv.org/abs/2509.07301 arxiv.org/pdf/2509.07301…

@arXiv_csLG_bot@mastoxiv.page
2025-10-09 10:37:21

Grouped Differential Attention
Junghwan Lim, Sungmin Lee, Dongseok Kim, Wai Ting Cheung, Beomgyu Kim, Taehwan Kim, Haesol Lee, Junhyeok Lee, Dongpin Oh, Eunhwan Park
arxiv.org/abs/2510.06949

@arXiv_condmatmtrlsci_bot@mastoxiv.page
2025-10-10 08:21:39

Attention to Order: Transformers Discover Phase Transitions via Learnability
Şener Özönder
arxiv.org/abs/2510.07401 arxiv…

@arXiv_csCV_bot@mastoxiv.page
2025-10-09 10:38:51

DADO: A Depth-Attention framework for Object Discovery
Federico Gonzalez, Estefania Talavera, Petia Radeva
arxiv.org/abs/2510.07089 arxiv.o…

@arXiv_eessAS_bot@mastoxiv.page
2025-09-11 09:22:43

Accelerating Diffusion Transformer-Based Text-to-Speech with Transformer Layer Caching
Siratish Sakpiboonchit
arxiv.org/abs/2509.08696 arxi…

@arXiv_csLG_bot@mastoxiv.page
2025-10-10 11:15:29

In-Context Clustering with Large Language Models
Ying Wang, Mengye Ren, Andrew Gordon Wilson
arxiv.org/abs/2510.08466 arxiv.org/pdf/2510.08…

@arXiv_csAI_bot@mastoxiv.page
2025-10-01 11:45:57

HilbertA: Hilbert Attention for Image Generation with Diffusion Models
Shaoyi Zheng, Wenbo Lu, Yuxuan Xia, Haomin Liu, Shengjie Wang
arxiv.org/abs/2509.26538

@arXiv_csCV_bot@mastoxiv.page
2025-09-09 12:26:52

Cortex-Synth: Differentiable Topology-Aware 3D Skeleton Synthesis with Hierarchical Graph Attention
Mohamed Zayaan S
arxiv.org/abs/2509.06705

@arXiv_condmatsuprcon_bot@mastoxiv.page
2025-09-10 08:47:51

Examining density wave correlations in high pressure $\rm{La_3Ni_2O_7}$ through variational Monte Carlo
Yanxin Chen, Haoxiang Chen, Tonghuan Jiang, Ji Chen
arxiv.org/abs/2509.07219

@arXiv_csMA_bot@mastoxiv.page
2025-09-09 08:15:31

Orchestrator: Active Inference for Multi-Agent Systems in Long-Horizon Tasks
Lukas Beckenbauer, Johannes-Lucas Loewe, Ge Zheng, Alexandra Brintrup
arxiv.org/abs/2509.05651

@arXiv_csSD_bot@mastoxiv.page
2025-10-10 08:33:28

Personality-Enhanced Multimodal Depression Detection in the Elderly
Honghong Wang, Jing Deng, Rong Zheng
arxiv.org/abs/2510.08004 arxiv.org…

@arXiv_csIR_bot@mastoxiv.page
2025-09-30 10:46:51

Multi-Item-Query Attention for Stable Sequential Recommendation
Mingshi Xu, Haoren Zhu, Wilfred Siu Hung Ng
arxiv.org/abs/2509.24424 arxiv.…

@arXiv_astrophEP_bot@mastoxiv.page
2025-09-08 08:49:40

Identifying Exoplanets with Deep Learning: A CNN and RNN Classifier for Kepler DR25 and Candidate Vetting
Bibin Thomas, Vittal Bhat M, Salman Arafath Mohammed, Abdul Wase Mohammed, Adis Abebaw Dessalegn, Mohit Mittal
arxiv.org/abs/2509.04793

@peterhoneyman@a2mi.social
2025-08-18 20:00:51

i am determined to read the attention/transformer paper
i even printed it out

Attention Is All You Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with …
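
The toot's preview cuts off here; the operation the abstract refers to is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch of that single operation follows (the single-head, no-projection setup and the toy shapes are illustrative assumptions, not the paper's full multi-head architecture):

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise similarity logits
    scores -= scores.max(axis=-1, keepdims=True)    # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ V                              # attention-weighted mix of values

# Toy self-attention over 4 token vectors of width 8 (Q = K = V).
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(X, X, X)
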
@arXiv_csMM_bot@mastoxiv.page
2025-09-08 07:46:20

An Emotion Recognition Framework via Cross-modal Alignment of EEG and Eye Movement Data
Jianlu Wang, Yanan Wang, Tong Liu
arxiv.org/abs/2509.04938

@arXiv_csLG_bot@mastoxiv.page
2025-10-10 11:13:29

Synthetic Series-Symbol Data Generation for Time Series Foundation Models
Wenxuan Wang, Kai Wu, Yujian Betterest Li, Dan Wang, Xiaoyu Zhang
arxiv.org/abs/2510.08445

@arXiv_csAI_bot@mastoxiv.page
2025-10-01 11:28:27

LMILAtt: A Deep Learning Model for Depression Detection from Social Media Users Enhanced by Multi-Instance Learning Based on Attention Mechanism
Yukun Yang
arxiv.org/abs/2509.26145

@arXiv_csLG_bot@mastoxiv.page
2025-09-05 10:20:51

Attention as an Adaptive Filter
Peter Racioppo
arxiv.org/abs/2509.04154 arxiv.org/pdf/2509.04154

@arXiv_csCL_bot@mastoxiv.page
2025-10-03 10:40:01

Learning to Look at the Other Side: A Semantic Probing Study of Word Embeddings in LLMs with Enabled Bidirectional Attention
Zhaoxin Feng, Jianfei Ma, Emmanuele Chersoni, Xiaojing Zhao, Xiaoyi Bao
arxiv.org/abs/2510.01652

@arXiv_csCV_bot@mastoxiv.page
2025-09-09 12:30:52

BIR-Adapter: A Low-Complexity Diffusion Model Adapter for Blind Image Restoration
Cem Eteke, Alexander Griessel, Wolfgang Kellerer, Eckehard Steinbach
arxiv.org/abs/2509.06904

@arXiv_csIT_bot@mastoxiv.page
2025-09-22 08:37:11

Interplay Between Belief Propagation and Transformer: Differential-Attention Message Passing Transformer
Chin Wa Lau, Xiang Shi, Ziyan Zheng, Haiwen Cao, Nian Guo
arxiv.org/abs/2509.15637

@arXiv_csSD_bot@mastoxiv.page
2025-10-01 09:43:38

The silence of the weights: an investigation of structural pruning strategies for attention-based audio signal architectures
Andrea Diecidue, Carlo Alberto Barbano, Piero Fraternali, Mathieu Fontaine, Enzo Tartaglione
arxiv.org/abs/2509.26207

@arXiv_csCE_bot@mastoxiv.page
2025-09-22 07:31:21

SPH-Net: A Co-Attention Hybrid Model for Accurate Stock Price Prediction
Yiyang Wu, Hanyu Ma, Muxin Ge, Xiaoli Ma, Yadi Liu, Ye Aung Moe, Zeyu Han, Weizheng Xie
arxiv.org/abs/2509.15414

@arXiv_csNI_bot@mastoxiv.page
2025-09-22 08:52:11

Smart Interrupted Routing Based on Multi-head Attention Mask Mechanism-Driven MARL in Software-defined UASNs
Zhenyu Wang, Chuan Lin, Guangjie Han, Shengchao Zhu, Ruoyuan Wu, Tongwei Zhang
arxiv.org/abs/2509.15856

@arXiv_csLG_bot@mastoxiv.page
2025-10-07 13:04:42

On Structured State-Space Duality
Jerry Yao-Chieh Hu, Xiwen Zhang, Weimin Wu, Han Liu
arxiv.org/abs/2510.04944 arxiv.org/pdf/2510.04944

@arXiv_csCV_bot@mastoxiv.page
2025-09-03 15:03:13

Enhancing Fitness Movement Recognition with Attention Mechanism and Pre-Trained Feature Extractors
Shanjid Hasan Nishat, Srabonti Deb, Mohiuddin Ahmed
arxiv.org/abs/2509.02511

@arXiv_csCV_bot@mastoxiv.page
2025-08-25 09:56:40

Attention Mechanism in Randomized Time Warping
Yutaro Hiraoka, Kazuya Okamura, Kota Suto, Kazuhiro Fukui
arxiv.org/abs/2508.16366 arxiv.org…

@arXiv_physicschemph_bot@mastoxiv.page
2025-09-22 08:36:31

DeepMech: A Machine Learning Framework for Chemical Reaction Mechanism Prediction
Manajit Das, Ajnabiul Hoque, Mayank Baranwal, Raghavan B. Sunoj
arxiv.org/abs/2509.15872

@arXiv_eessIV_bot@mastoxiv.page
2025-10-13 08:46:00

Progressive Uncertainty-Guided Evidential U-KAN for Trustworthy Medical Image Segmentation
Zhen Yang, Yansong Ma, Lei Chen
arxiv.org/abs/2510.08949

@arXiv_csCL_bot@mastoxiv.page
2025-09-23 12:57:41

Cross-Attention is Half Explanation in Speech-to-Text Models
Sara Papi, Dennis Fucci, Marco Gaido, Matteo Negri, Luisa Bentivogli
arxiv.org/abs/2509.18010

@arXiv_csIR_bot@mastoxiv.page
2025-10-14 09:15:38

Integrating Structure-Aware Attention and Knowledge Graphs in Explainable Recommendation Systems
Shuangquan Lyu, Ming Wang, Huajun Zhang, Jiasen Zheng, Junjiang Lin, Xiaoxuan Sun
arxiv.org/abs/2510.10109

@arXiv_csLG_bot@mastoxiv.page
2025-09-30 14:44:01

High-Dimensional Analysis of Single-Layer Attention for Sparse-Token Classification
Nicholas Barnfield, Hugo Cui, Yue M. Lu
arxiv.org/abs/2509.25153

@arXiv_csAI_bot@mastoxiv.page
2025-10-15 10:10:21

Biased-Attention Guided Risk Prediction for Safe Decision-Making at Unsignalized Intersections
Chengyang Dong, Nan Guo
arxiv.org/abs/2510.12428

@arXiv_csCR_bot@mastoxiv.page
2025-08-14 07:48:32

Shadow in the Cache: Unveiling and Mitigating Privacy Risks of KV-cache in LLM Inference
Zhifan Luo, Shuo Shao, Su Zhang, Lijing Zhou, Yuke Hu, Chenxu Zhao, Zhihao Liu, Zhan Qin
arxiv.org/abs/2508.09442

@arXiv_csLG_bot@mastoxiv.page
2025-09-25 10:38:22

Pi-Transformer: A Physics-informed Attention Mechanism for Time Series Anomaly Detection
Sepehr Maleki, Negar Pourmoazemi
arxiv.org/abs/2509.19985

@arXiv_physicsfludyn_bot@mastoxiv.page
2025-09-19 08:54:31

On the algebraic stretching dynamics of variable-density mixing in shock-bubble interaction
Xu Han, Bin Yu, Hong Liu
arxiv.org/abs/2509.14607

@arXiv_csSD_bot@mastoxiv.page
2025-09-25 08:47:12

Eliminating stability hallucinations in LLM-based TTS models via attention guidance
ShiMing Wang, ZhiHao Du, Yang Xiang, TianYu Zhao, Han Zhao, Qian Chen, XianGang Li, HanJie Guo, ZhenHua Ling
arxiv.org/abs/2509.19852

@arXiv_eessSP_bot@mastoxiv.page
2025-10-15 08:27:42

A Deep Multi-Task Learning Approach to Impulsive Noise Parameter Estimation
Abdullahi Mohammad, Bdah Eya, Bassant Selim
arxiv.org/abs/2510.12179

@arXiv_csLG_bot@mastoxiv.page
2025-10-02 11:09:21

Privacy Preserved Federated Learning with Attention-Based Aggregation for Biometric Recognition
Kassahun Azezew, Minyechil Alehegn, Tsega Asresa, Bitew Mekuria, Tizazu Bayh, Ayenew Kassie, Amsalu Tesema, Animut Embiyale
arxiv.org/abs/2510.01113

@arXiv_condmatmtrlsci_bot@mastoxiv.page
2025-10-15 10:08:21

Self-attention enabled quantum path analysis of high-harmonic generation in solids
Cong Zhao, Xiaozhou Zou
arxiv.org/abs/2510.12443 arxiv.o…

@arXiv_csCL_bot@mastoxiv.page
2025-10-14 13:16:18

Deconstructing Attention: Investigating Design Principles for Effective Language Modeling
Huiyin Xue, Nafise Sadat Moosavi, Nikolaos Aletras
arxiv.org/abs/2510.11602

@arXiv_csCV_bot@mastoxiv.page
2025-09-05 10:14:41

TEn-CATS: Text-Enriched Audio-Visual Video Parsing with Multi-Scale Category-Aware Temporal Graph
Yaru Chen, Faegheh Sardari, Peiliang Zhang, Ruohao Guo, Yang Xiang, Zhenbo Li, Wenwu Wang
arxiv.org/abs/2509.04086

@arXiv_csLG_bot@mastoxiv.page
2025-08-29 10:08:31

Rethinking Transformer Connectivity: TLinFormer, A Path to Exact, Full Context-Aware Linear Attention
Zhongpan Tang
arxiv.org/abs/2508.20407

@arXiv_csLG_bot@mastoxiv.page
2025-09-15 09:56:11

Multipole Semantic Attention: A Fast Approximation of Softmax Attention for Pretraining
Rupert Mitchell, Kristian Kersting
arxiv.org/abs/2509.10406

@arXiv_csLG_bot@mastoxiv.page
2025-09-29 11:34:37

Physics-informed GNN for medium-high voltage AC power flow with edge-aware attention and line search correction operator
Changhun Kim, Timon Conrad, Redwanul Karim, Julian Oelhaf, David Riebesel, Tomás Arias-Vergara, Andreas Maier, Johann Jäger, Siming Bayer
arxiv.org/abs/2509.22458

@arXiv_csLG_bot@mastoxiv.page
2025-10-06 10:25:29

Signature-Informed Transformer for Asset Allocation
Yoontae Hwang, Stefan Zohren
arxiv.org/abs/2510.03129 arxiv.org/pdf/2510.03129

@arXiv_csCV_bot@mastoxiv.page
2025-08-21 10:13:40

EventSSEG: Event-driven Self-Supervised Segmentation with Probabilistic Attention
Lakshmi Annamalai, Chetan Singh Thakur
arxiv.org/abs/2508.14856

@arXiv_csCE_bot@mastoxiv.page
2025-10-15 07:36:21

Agent-Based Simulation of a Financial Market with Large Language Models
Ryuji Hashimoto, Takehiro Takayanagi, Masahiro Suzuki, Kiyoshi Izumi
arxiv.org/abs/2510.12189

@arXiv_csCL_bot@mastoxiv.page
2025-10-15 10:27:41

Credal Transformer: A Principled Approach for Quantifying and Mitigating Hallucinations in Large Language Models
Shihao Ji, Zihui Song, Jiajie Huang
arxiv.org/abs/2510.12137

@arXiv_csLG_bot@mastoxiv.page
2025-08-15 10:19:32

Natively Trainable Sparse Attention for Hierarchical Point Cloud Datasets
Nicolas Lapautre, Maria Marchenko, Carlos Miguel Patiño, Xin Zhou
arxiv.org/abs/2508.10758

@arXiv_csCV_bot@mastoxiv.page
2025-10-02 10:54:11

Feature Identification for Hierarchical Contrastive Learning
Julius Ott, Nastassia Vysotskaya, Huawei Sun, Lorenzo Servadei, Robert Wille
arxiv.org/abs/2510.00837

@arXiv_csLG_bot@mastoxiv.page
2025-10-01 11:57:07

TASP: Topology-aware Sequence Parallelism
Yida Wang (Capital Normal University, Infinigence-AI), Ke Hong (Tsinghua University, Infinigence-AI), Xiuhong Li (Infinigence-AI), Yuanchao Xu (Capital Normal University), Wenxun Wang (Tsinghua University), Guohao Dai (Infinigence-AI, Shanghai Jiao Tong University), Yu Wang (Tsinghua University)

@arXiv_csCL_bot@mastoxiv.page
2025-09-19 13:23:51

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[1/3]:
- Fast Multipole Attention: A Scalable Multilevel Attention Mechanism for Text and Images
Yanming Kang, Giang Tran, Hans De Sterck

@arXiv_csSD_bot@mastoxiv.page
2025-08-28 07:48:40

Infant Cry Detection In Noisy Environment Using Blueprint Separable Convolutions and Time-Frequency Recurrent Neural Network
Haolin Yu, Yanxiong Li
arxiv.org/abs/2508.19308

@arXiv_csCV_bot@mastoxiv.page
2025-09-30 15:01:16

VideoAnchor: Reinforcing Subspace-Structured Visual Cues for Coherent Visual-Spatial Reasoning
Zhaozhi Wang, Tong Zhang, Mingyue Guo, Yaowei Wang, Qixiang Ye
arxiv.org/abs/2509.25151

@arXiv_eessAS_bot@mastoxiv.page
2025-09-18 09:21:31

Mixture of Low-Rank Adapter Experts in Generalizable Audio Deepfake Detection
Janne Laakkonen, Ivan Kukanov, Ville Hautamäki
arxiv.org/abs/2509.13878

@arXiv_csLG_bot@mastoxiv.page
2025-08-20 10:12:40

PENGUIN: Enhancing Transformer with Periodic-Nested Group Attention for Long-term Time Series Forecasting
Tian Sun, Yuqi Chen, Weiwei Sun
arxiv.org/abs/2508.13773

@arXiv_csLG_bot@mastoxiv.page
2025-10-02 11:07:11

Random Feature Spiking Neural Networks
Maximilian Gollwitzer, Felix Dietrich
arxiv.org/abs/2510.01012 arxiv.org/pdf/2510.01012

@arXiv_csLG_bot@mastoxiv.page
2025-09-26 10:28:01

TyphoonMLA: A Mixed Naive-Absorb MLA Kernel For Shared Prefix
Ahmet Caner Yüzügüler, Ahmet Çelik, Jiawei Zhuang, Lukas Cavigelli
arxiv.org/abs/2509.21081

@arXiv_csCL_bot@mastoxiv.page
2025-08-21 09:57:50

Improving in-context learning with a better scoring function
Omar Naim, Swarnadeep Bhar, Jérôme Bolte, Nicholas Asher
arxiv.org/abs/2508.14685

@arXiv_csSD_bot@mastoxiv.page
2025-08-21 08:58:20

EffiFusion-GAN: Efficient Fusion Generative Adversarial Network for Speech Enhancement
Bin Wen, Tien-Ping Tan
arxiv.org/abs/2508.14525 arxi…

@arXiv_csLG_bot@mastoxiv.page
2025-10-13 10:44:30

Cross-attention Secretly Performs Orthogonal Alignment in Recommendation Models
Hyunin Lee, Yong Zhang, Hoang Vu Nguyen, Xiaoyi Liu, Namyong Park, Christopher Jung, Rong Jin, Yang Wang, Zhigang Wang, Somayeh Sojoudi, Xue Feng
arxiv.org/abs/2510.09435

@arXiv_eessAS_bot@mastoxiv.page
2025-08-13 08:05:32

Joint decoding method for controllable contextual speech recognition based on Speech LLM
Yangui Fang, Jing Peng, Yu Xi, Xu Li, Haoyu Li, Chengwei Zhang, Guohui Zhong, Kai Yu
arxiv.org/abs/2508.08585

@arXiv_csCV_bot@mastoxiv.page
2025-08-20 10:17:30

Self-Aware Adaptive Alignment: Enabling Accurate Perception for Intelligent Transportation Systems
Tong Xiang, Hongxia Zhao, Fenghua Zhu, Yuanyuan Chen, Yisheng Lv
arxiv.org/abs/2508.13823

@arXiv_csLG_bot@mastoxiv.page
2025-08-21 10:08:30

Great GATsBi: Hybrid, Multimodal, Trajectory Forecasting for Bicycles using Anticipation Mechanism
Kevin Riehl, Shaimaa K. El-Baklish, Anastasios Kouvelas, Michail A. Makridis
arxiv.org/abs/2508.14523

@arXiv_csSD_bot@mastoxiv.page
2025-09-17 09:27:30

Timbre-Adaptive Transcription: A Lightweight Architecture with Associative Memory for Dynamic Instrument Separation
Ruigang Li, Yongxu Zhu
arxiv.org/abs/2509.12712

@arXiv_csCL_bot@mastoxiv.page
2025-10-13 12:01:10

Crosslisted article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[1/2]:
- Limitations of Normalization in Attention Mechanism
Timur Mudarisov, Mikhail Burtsev, Tatiana Petrova, Radu State

@arXiv_csCL_bot@mastoxiv.page
2025-10-13 10:29:50

Mask Tokens as Prophet: Fine-Grained Cache Eviction for Efficient dLLM Inference
Jianuo Huang, Yaojie Zhang, Yicun Yang, Benhao Huang, Biqing Qi, Dongrui Liu, Linfeng Zhang
arxiv.org/abs/2510.09309

@arXiv_csCV_bot@mastoxiv.page
2025-09-17 10:58:10

Vi-SAFE: A Spatial-Temporal Framework for Efficient Violence Detection in Public Surveillance
Ligang Chang, Shengkai Xu, Liangchang Shen, Binhan Xu, Junqiao Wang, Tianyu Shi, Yanhui Du
arxiv.org/abs/2509.13210

@arXiv_csLG_bot@mastoxiv.page
2025-08-21 10:08:10

Artificial Intelligence-Based Multiscale Temporal Modeling for Anomaly Detection in Cloud Services
Lian Lian, Yilin Li, Song Han, Renzi Meng, Sibo Wang, Ming Wang
arxiv.org/abs/2508.14503