Tootfinder

@stephane_klein@social.coop
2025-11-21 13:25:32

#OpenRouterAI now offers embedding models. Currently 22 models.
https://notes.sklein.xyz/2025-11-21_1350/

OpenRouter.ai now offers embedding models - Jardin numérique de Stéphane Klein

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:34:10

Exploiting ID-Text Complementarity via Ensembling for Sequential Recommendation
Liam Collins, Bhuvesh Kumar, Clark Mingxuan Ju, Tong Zhao, Donald Loveland, Leonardo Neves, Neil Shah
https://arxiv.org/abs/2512.17820 https://arxiv.org/pdf/2512.17820 https://arxiv.org/html/2512.17820
arXiv:2512.17820v1 Announce Type: new
Abstract: Modern Sequential Recommendation (SR) models commonly utilize modality features to represent items, motivated in large part by recent advancements in language and vision modeling. To do so, several works completely replace ID embeddings with modality embeddings, claiming that modality embeddings render ID embeddings unnecessary because they can match or even exceed ID embedding performance. On the other hand, many works jointly utilize ID and modality features, but posit that complex fusion strategies, such as multi-stage training and/or intricate alignment architectures, are necessary for this joint utilization. However, underlying both these lines of work is a lack of understanding of the complementarity of ID and modality features. In this work, we address this gap by studying the complementarity of ID- and text-based SR models. We show that these models do learn complementary signals, meaning that either should provide performance gain when used properly alongside the other. Motivated by this, we propose a new SR method that preserves ID-text complementarity through independent model training, then harnesses it through a simple ensembling strategy. Despite this method's simplicity, we show it outperforms several competitive SR baselines, implying that both ID and text features are necessary to achieve state-of-the-art SR performance but complex fusion architectures are not.
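The abstract above does not say which ensembling rule is used, so the following is only a minimal sketch of one common option: standardize the per-item scores of two independently trained models and average them. The function names, the `weight` parameter, and the toy scores are all hypothetical and are not taken from the paper.

```python
import statistics

def zscores(scores):
    """Standardize a {item: score} mapping so both models share a scale."""
    mean = statistics.fmean(scores.values())
    std = statistics.pstdev(scores.values()) or 1.0  # guard against constant scores
    return {item: (s - mean) / std for item, s in scores.items()}

def ensemble(id_scores, text_scores, weight=0.5):
    """Weighted average of standardized scores (assumed rule, not the paper's)."""
    id_z, text_z = zscores(id_scores), zscores(text_scores)
    return {item: weight * id_z[item] + (1 - weight) * text_z[item]
            for item in id_scores}

# Toy next-item scores from an ID-based and a text-based model (made up).
id_scores = {"a": 2.0, "b": 1.0, "c": 0.0}
text_scores = {"a": 0.1, "b": 0.9, "c": 0.2}
combined = ensemble(id_scores, text_scores)
ranking = sorted(combined, key=combined.get, reverse=True)
print(ranking)
```

Standardizing first keeps one model from dominating simply because its raw scores have a larger range.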

@arXiv_csCL_bot@mastoxiv.page
2025-10-14 13:16:08

SemCSE-Multi: Multifaceted and Decodable Embeddings for Aspect-Specific and Interpretable Scientific Domain Mapping
Marc Brinner, Sina Zarrieß
https://arxiv.org/abs/2510.11599

SemCSE-Multi: Multifaceted and Decodable Embeddings for Aspect-Specific and Interpretable Scientific Domain Mapping
We propose SemCSE-Multi, a novel unsupervised framework for generating multifaceted embeddings of scientific abstracts, evaluated in the domains of invasion biology and medicine. These embeddings capture distinct, individually specifiable aspects in isolation, thus enabling fine-grained and controllable similarity assessments as well as adaptive, user-driven visualizations of scientific domains. Our approach relies on an unsupervised procedure that produces aspect-specific summarizing sentences…

@arXiv_csCR_bot@mastoxiv.page
2025-10-14 11:41:08

Secret-Key Agreement Through Hidden Markov Modeling of Wavelet Scattering Embeddings
Nora Basha, Bechir Hamdaoui, Attila A. Yavuz, Thang Hoang, Mehran Mozaffari Kermani
https://arxiv.org/abs/2510.09773

Secret-Key Agreement Through Hidden Markov Modeling of Wavelet Scattering Embeddings
Secret-key generation and agreement based on wireless channel reciprocity offers a promising avenue for securing IoT networks. However, existing approaches predominantly rely on the similarity of instantaneous channel measurement samples between communicating devices. This narrow view of reciprocity is often impractical, as it is highly susceptible to noise, asynchronous sampling, channel fading, and other system-level imperfections -- all of which significantly impair key generation performanc…

@awinkler@openbiblio.social
2025-10-17 09:25:18

Regarding the #wikidata #embedding project: are the actual embeddings available anywhere?

@arXiv_csAI_bot@mastoxiv.page
2025-10-13 09:44:00

Auto-scaling Continuous Memory for GUI Agent
Wenyi Wu, Kun Zhou, Ruoxin Yuan, Vivian Yu, Stephen Wang, Zhiting Hu, Biwei Huang
https://arxiv.org/abs/2510.09038 https://

Auto-scaling Continuous Memory for GUI Agent
We study how to endow GUI agents with scalable memory that helps generalize across unfamiliar interfaces and long-horizon tasks. Prior GUI agents compress past trajectories into text tokens, which balloons context length and misses decisive visual cues (e.g., exact widget size and position). We propose a continuous memory that encodes each GUI trajectory into a fixed-length sequence of continuous embeddings using the VLM itself as an encoder; these embeddings are plugged directly into the backbo…

@arXiv_csSD_bot@mastoxiv.page
2025-10-10 08:53:58

Leveraging Whisper Embeddings for Audio-based Lyrics Matching
Eleonora Mancini, Joan Serrà, Paolo Torroni, Yuki Mitsufuji
https://arxiv.org/abs/2510.08176 https://

Leveraging Whisper Embeddings for Audio-based Lyrics Matching
Audio-based lyrics matching can be an appealing alternative to other content-based retrieval approaches, but existing methods often suffer from limited reproducibility and inconsistent baselines. In this work, we introduce WEALY, a fully reproducible pipeline that leverages Whisper decoder embeddings for lyrics matching tasks. WEALY establishes robust and transparent baselines, while also exploring multimodal extensions that integrate textual and acoustic features. Through extensive experiments…

@arXiv_csCL_bot@mastoxiv.page
2025-10-14 13:15:18

LLMAtKGE: Large Language Models as Explainable Attackers against Knowledge Graph Embeddings
Ting Li, Yang Yang, Yipeng Yu, Liang Yao, Guoqing Chao, Ruifeng Xu
https://arxiv.org/abs/2510.11584

LLMAtKGE: Large Language Models as Explainable Attackers against Knowledge Graph Embeddings
Adversarial attacks on knowledge graph embeddings (KGE) aim to disrupt the model's ability of link prediction by removing or inserting triples. A recent black-box method has attempted to incorporate textual and structural information to enhance attack performance. However, it is unable to generate human-readable explanations, and exhibits poor generalizability. In the past few years, large language models (LLMs) have demonstrated powerful capabilities in text comprehension, generation, and reas…

@arXiv_eessIV_bot@mastoxiv.page
2025-10-08 09:38:49

Modulated INR with Prior Embeddings for Ultrasound Imaging Reconstruction
Rémi Delaunay, Christoph Hennersperger, Stefan Wörz
https://arxiv.org/abs/2510.05731 https…

Modulated INR with Prior Embeddings for Ultrasound Imaging Reconstruction
Ultrafast ultrasound imaging enables visualization of rapid physiological dynamics by acquiring data at exceptionally high frame rates. However, this speed often comes at the cost of spatial resolution and image quality due to unfocused wave transmissions and associated artifacts. In this work, we propose a novel modulated Implicit Neural Representation (INR) framework that leverages a coordinate-based neural network conditioned on latent embeddings extracted from time-delayed I/Q channel data …

@arXiv_csIR_bot@mastoxiv.page
2025-10-15 09:34:41

Leveraging Language Semantics for Collaborative Filtering with TextGCN and TextGCN-MLP: Zero-Shot vs In-Domain Performance
Andrei Chernov, Haroon Wahab, Oleg Novitskij
https://arxiv.org/abs/2510.12461 …

Leveraging Language Semantics for Collaborative Filtering with TextGCN and TextGCN-MLP: Zero-Shot vs In-Domain Performance
In recent years, various approaches have been proposed to leverage large language models (LLMs) for incorporating textual information about items into recommender systems. Existing methods primarily focus on either fine-tuning LLMs to generate recommendations or integrating LLM-based embeddings into downstream models. In this work, we follow the latter direction and propose TextGCN, which applies parameter-free graph convolution layers directly over LLM-based item-title embeddings, ins…

@arXiv_csCY_bot@mastoxiv.page
2025-10-07 10:04:52

Quantifying Gender Stereotypes in Japan between 1900 and 1999 with Word Embeddings
Shintaro Sakai, Haewoon Kwak, Jisun An, Akira Matsui
https://arxiv.org/abs/2510.03905 https://…

Quantifying Gender Stereotypes in Japan between 1900 and 1999 with Word Embeddings
We quantify the evolution of gender stereotypes in Japan from 1900 to 1999 using a series of 100 word embeddings, each trained on a corpus from a specific year. We define the gender stereotype value to measure the strength of a word's gender association by computing the difference in cosine similarity of the word to female- versus male-related attribute words. We examine trajectories of gender stereotype across three traditionally gendered domains: Home, Work, and Politics, as well as occupatio…
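The gender stereotype value described above (a difference in cosine similarity to female- versus male-related attribute words) can be sketched as follows. This is a minimal illustration, not the authors' code: averaging over the attribute words, the female-minus-male sign convention, and the toy 2-D vectors are all assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def stereotype_value(word_vec, female_vecs, male_vecs):
    """Mean similarity to female attribute words minus mean similarity
    to male attribute words (aggregation by mean is an assumption)."""
    female = sum(cosine(word_vec, f) for f in female_vecs) / len(female_vecs)
    male = sum(cosine(word_vec, m) for m in male_vecs) / len(male_vecs)
    return female - male

# Toy 2-D "embeddings" standing in for one year's trained word vectors.
female_attrs = [[1.0, 0.0]]
male_attrs = [[0.0, 1.0]]
print(stereotype_value([1.0, 0.0], female_attrs, male_attrs))  # 1.0
```

Under this convention a positive value means the word sits closer to the female attribute words, and a value near zero means no measurable association in that year's embedding space.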

@arXiv_mathAG_bot@mastoxiv.page
2025-10-07 10:43:02

Embeddings of weighted projective spaces
Praise Adeyemo, Dominic Bunnett, Fabián Levicán
https://arxiv.org/abs/2510.05076 https://arxiv.org/pdf…

Embeddings of weighted projective spaces
Let $X$ be a projective toric variety of dimension $n$ and let $L$ be an ample line bundle on $X$. For $k \geq 0$, it is in general difficult to determine whether $L^{\otimes k}$ is very ample and whether it additionally gives a projectively normal embedding. These two properties are equivalent to the very ampleness, respectively normality, of the corresponding polytope. By a result of Ewald-Wessels, both statements are classically known to hold for $k \geq n - 1$. We study embeddings of weigh…

@arXiv_csCV_bot@mastoxiv.page
2025-09-25 10:33:52

Anomaly Detection by Clustering DINO Embeddings using a Dirichlet Process Mixture
Nico Schulthess, Ender Konukoglu
https://arxiv.org/abs/2509.19997 https://

Anomaly Detection by Clustering DINO Embeddings using a Dirichlet Process Mixture
In this work, we leverage informative embeddings from foundational models for unsupervised anomaly detection in medical imaging. For small datasets, a memory-bank of normative features can directly be used for anomaly detection which has been demonstrated recently. However, this is unsuitable for large medical datasets as the computational burden increases substantially. Therefore, we propose to model the distribution of normative DINOv2 embeddings with a Dirichlet Process Mixture model (DPMM),…

@arXiv_csCG_bot@mastoxiv.page
2025-10-14 07:33:20

Rigid-Invariant Sliced Wasserstein via Independent Embeddings
Peilin He, Zakk Heile, Jayson Tran, Alice Wang, Shrikant Chand
https://arxiv.org/abs/2510.10233 https://

Rigid-Invariant Sliced Wasserstein via Independent Embeddings
Comparing probability measures when their supports are related by an unknown rigid transformation is an important challenge in geometric data analysis, arising in shape matching and machine learning. Classical optimal transport (OT) distances, including Wasserstein and sliced Wasserstein, are sensitive to rotations and reflections, while Gromov-Wasserstein (GW) is invariant to isometries but computationally prohibitive for large datasets. We introduce Rigid-Invariant Sliced Wasserstein vi…

@arXiv_csDS_bot@mastoxiv.page
2025-10-13 08:13:20

Random-Shift Revisited: Tight Approximations for Tree Embeddings and L1-Oblivious Routings
Rasmus Kyng, Maximilian Probst Gutenberg, Tim Rieder
https://arxiv.org/abs/2510.09124 …

Random-Shift Revisited: Tight Approximations for Tree Embeddings and L1-Oblivious Routings
We present a new and surprisingly simple analysis of random-shift decompositions -- originally proposed by Miller, Peng, and Xu [SPAA'13]: We show that decompositions for exponentially growing scales $D = 2^0, 2^1, \ldots, 2^{\log_2(\operatorname{diam}(G))}$, have a tight constant trade-off between distance-to-center and separation probability on average across the distance scales -- opposed to a necessary $\Omega(\log n)$ trade-off for a single scale. This almost immediately yields a way to comp…

@arXiv_csNE_bot@mastoxiv.page
2025-09-29 07:45:47

From Embeddings to Equations: Genetic-Programming Surrogates for Interpretable Transformer Classification
Mohammad Sadegh Khorshidi, Navid Yazdanjue, Hassan Gharoun, Mohammad Reza Nikoo, Fang Chen, Amir H. Gandomi
https://arxiv.org/abs/2509.21341

From Embeddings to Equations: Genetic-Programming Surrogates for Interpretable Transformer Classification
We study symbolic surrogate modeling of frozen Transformer embeddings to obtain compact, auditable classifiers with calibrated probabilities. For five benchmarks (SST2G, 20NG, MNIST, CIFAR10, MSC17), embeddings from ModernBERT, DINOv2, and SigLIP are partitioned on the training set into disjoint, information-preserving views via semantic-preserving feature partitioning (SPFP). A cooperative multi-population genetic program (MEGP) then learns additive, closed-form logit programs over these views…

@arXiv_csCL_bot@mastoxiv.page
2025-10-13 10:29:00

One Sentence, Two Embeddings: Contrastive Learning of Explicit and Implicit Semantic Representations
Kohei Oda, Po-Min Chuang, Kiyoaki Shirai, Natthawut Kertkeidkachorn
https://arxiv.org/abs/2510.09293

One Sentence, Two Embeddings: Contrastive Learning of Explicit and Implicit Semantic Representations
Sentence embedding methods have made remarkable progress, yet they still struggle to capture the implicit semantics within sentences. This can be attributed to the inherent limitations of conventional sentence embedding methods that assign only a single vector per sentence. To overcome this limitation, we propose DualCSE, a sentence embedding method that assigns two embeddings to each sentence: one representing the explicit semantics and the other representing the implicit semantics. These embe…

@arXiv_quantph_bot@mastoxiv.page
2025-10-02 10:31:11

Block-Encoding Tensor Networks and QUBO Embeddings
Sebastian Issel
https://arxiv.org/abs/2510.00935 https://arxiv.org/pdf/2510.00935

Block-Encoding Tensor Networks and QUBO Embeddings
We give an algorithm that converts any tensor network (TN) into a sequence of local unitaries whose composition block-encodes the network contraction, suitable for Quantum Eigenvalue / Singular Value Transformation (QET/QSVT). The construction embeds each TN as a local isometry and dilates it to a unitary. Performing this step for every site of the tensor allows the full network to be block-encoded. The theory is agnostic to virtual-bond sizes; for qubit resource counts and examples we assume g…

@cwensel@fosstodon.org
2025-09-25 13:53:55

In case you want to play along, I’ve been experimenting with vector DBs and embedding models for indexing PDFs (white papers) and code.
https://github.com/cwensel/chroma-embedded
Using this over MCP with Claude is quite handy.

GitHub - cwensel/chroma-embedded: A ChromaDB server that has embeddings available
A ChromaDB server that has embeddings available. Contribute to cwensel/chroma-embedded development by creating an account on GitHub.

@arXiv_csSI_bot@mastoxiv.page
2025-09-30 08:38:01

Hybrid Graph Embeddings and Louvain Algorithm for Unsupervised Community Detection
Dalila Khettaf, Djamel Djenouri, Zeinab Rezaeifar, Youcef Djenouri
https://arxiv.org/abs/2509.23411

Hybrid Graph Embeddings and Louvain Algorithm for Unsupervised Community Detection
This paper proposes a novel community detection method that integrates the Louvain algorithm with Graph Neural Networks (GNNs), enabling the discovery of communities without prior knowledge. Compared to most existing solutions, the proposed method does not require prior knowledge of the number of communities. It enhances the Louvain algorithm using node embeddings generated by a GNN to capture richer structural and feature information. Furthermore, it introduces a merging algorithm to refine th…

@arXiv_mathGN_bot@mastoxiv.page
2025-10-03 08:21:01

Forbidden Four Cycle, Star Graphs and Isometric Embeddings
Oleksiy Dovgoshey, Olga Rovenska
https://arxiv.org/abs/2510.01667 https://arxiv.org/pdf/2510.016…

Forbidden Four Cycle, Star Graphs and Isometric Embeddings
We prove the necessary and sufficient conditions under which ultrametric spaces of arbitrary infinite cardinality admit isometric embeddings into ultrametric spaces generated by labeled star graphs.

@arXiv_mathCO_bot@mastoxiv.page
2025-10-01 10:28:37

Optimal Embeddings of Posets in Hypercubes
Tomáš Flídr, Maria-Romina Ivan, Sean Jaffe
https://arxiv.org/abs/2509.26630 https://arxiv.org/pd…

Optimal Embeddings of Posets in Hypercubes
Given a finite poset $\mathcal P$, the hypercube-height, denoted by $h^*(\mathcal P)$, is defined to be the largest $h$ such that, for any natural number $n$, the subsets of $[n]$ of size less than $h$ do not contain an induced copy of $\mathcal P$. The hypercube-width, denoted by $w^*(\mathcal P)$, is the smallest $w$ such that the subsets of $[w]$ of size at most $h^*(\mathcal P)$ contain an induced copy of $\mathcal P$. In other words, $h^*(\mathcal P)$ asks how 'low' can a poset be embedded…

@arXiv_astrophIM_bot@mastoxiv.page
2025-09-30 10:14:31

ASTROCO: Self-Supervised Conformer-Style Transformers for Light-Curve Embeddings
Antony Tan, Pavlos Protopapas, Martina Cádiz-Leyton, Guillermo Cabrera-Vives, Cristobal Donoso-Oliva, Ignacio Becker
https://arxiv.org/abs/2509.24134

ASTROCO: Self-Supervised Conformer-Style Transformers for Light-Curve Embeddings
We present AstroCo, a Conformer-style encoder for irregular stellar light curves. By combining attention with depthwise convolutions and gating, AstroCo captures both global dependencies and local features. On MACHO R-band, AstroCo outperforms Astromer v1 and v2, yielding 70 percent and 61 percent lower error respectively and a relative macro-F1 gain of about 7 percent, while producing embeddings that transfer effectively to few-shot classification. These results highlight AstroCo's potential a…

@arXiv_csSE_bot@mastoxiv.page
2025-10-13 09:19:30

RAG4Tickets: AI-Powered Ticket Resolution via Retrieval-Augmented Generation on JIRA and GitHub Data
Mohammad Baqar
https://arxiv.org/abs/2510.08667 https://

RAG4Tickets: AI-Powered Ticket Resolution via Retrieval-Augmented Generation on JIRA and GitHub Data
Modern software teams frequently encounter delays in resolving recurring or related issues due to fragmented knowledge scattered across JIRA tickets, developer discussions, and GitHub pull requests (PRs). To address this challenge, we propose a Retrieval-Augmented Generation (RAG) framework that integrates Sentence-Transformers for semantic embeddings with FAISS-based vector search to deliver context-aware ticket resolution recommendations. The approach embeds historical JIRA tickets, user comm…
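The embed-then-search pattern described above can be sketched without external dependencies. This is a toy illustration under loud assumptions: a bag-of-words vector stands in for a Sentence-Transformers embedding, a brute-force cosine scan stands in for a FAISS index, and the ticket IDs and texts are invented.

```python
import math
from collections import Counter

def embed(text, vocab):
    """Toy stand-in for a sentence embedding: L2-normalized word counts."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def top_k(query_vec, index, k=2):
    """Brute-force nearest neighbours by cosine similarity
    (vectors are already normalized, so a dot product suffices)."""
    scored = [(sum(q * d for q, d in zip(query_vec, vec)), ticket_id)
              for ticket_id, vec in index]
    scored.sort(reverse=True)
    return [ticket_id for _, ticket_id in scored[:k]]

# Hypothetical historical tickets.
tickets = {
    "T-1": "login page crashes on submit",
    "T-2": "export to csv is slow",
    "T-3": "crash when login token expires",
}
vocab = sorted({w for t in tickets.values() for w in t.lower().split()})
index = [(tid, embed(text, vocab)) for tid, text in tickets.items()]

hits = top_k(embed("login crash", vocab), index, k=2)
print(hits)  # ['T-3', 'T-1']
```

In a RAG setup the retrieved tickets would then be passed to a language model as context; that generation step is omitted here.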

@arXiv_csDC_bot@mastoxiv.page
2025-09-30 08:32:01

OptimES: Optimizing Federated Learning Using Remote Embeddings for Graph Neural Networks
Pranjal Naman, Yogesh Simmhan
https://arxiv.org/abs/2509.22922 https://

OptimES: Optimizing Federated Learning Using Remote Embeddings for Graph Neural Networks
Graph Neural Networks (GNNs) have experienced rapid advancements in recent years due to their ability to learn meaningful representations from graph data structures. However, in most real-world settings, such as financial transaction networks and healthcare networks, this data is localized to different data owners and cannot be aggregated due to privacy concerns. Federated Learning (FL) has emerged as a viable machine learning approach for training a shared model that iteratively aggregates loc…

@arXiv_mathAC_bot@mastoxiv.page
2025-10-15 08:59:41

Asymptotic Syzygies of Weighted Projective Spaces
Boyana Martinova
https://arxiv.org/abs/2510.12708 https://arxiv.org/pdf/2510.12708

Asymptotic Syzygies of Weighted Projective Spaces
By adapting methods of Ein-Erman-Lazarsfeld, we prove an analogue of the Ein-Lazarsfeld result on asymptotic syzygies for Veronese embeddings, in the setting of weighted projective spaces of the form $\mathbb{P}(1^n,2)$.

@arXiv_mathFA_bot@mastoxiv.page
2025-09-30 10:43:41

On continuous embeddings of quantum Sobolev spaces into Schatten classes $\mathfrak{H}_\gamma^{s,p}(G,H) \hookrightarrow S_p(H)$
Alexander Plakhotnikov
https://arxiv.org/abs/2509.24135

On continuous embeddings of quantum Sobolev spaces into Schatten classes $\mathfrak{H}_\gamma^{s,p}(G,H) \hookrightarrow S_p(H)$
The purpose of this work is an attempt to expand the results obtained by A. K. Lakmon and Y. Mensah on embeddings of quantum Sobolev spaces $\mathfrak{H}_\gamma^{s,p}(G,H)$ consisting of Hilbert-Schmidt operators, with $p\neq 2$.

@arXiv_mathGR_bot@mastoxiv.page
2025-10-14 08:52:28

Locally compact strictly convex metric groups are abelian
Taras Banakh, Oles Mazurenko
https://arxiv.org/abs/2510.10755 https://arxiv.org/pdf/2510.10755

Locally compact strictly convex metric groups are abelian
We show that every locally compact strictly convex metric group is abelian, thus answering one problem posed by the authors in their earlier paper. To prove this theorem we first construct the isomorphic embeddings of the real line into the strictly convex metric group using its geodesic properties and characterization of the real line as a unique not monothetic one-parametric metrizable topological group. We proceed to show that all compact subgroups in a strictly convex metric group are trivial…

@arXiv_csLG_bot@mastoxiv.page
2025-10-07 13:06:32

KEEP: Integrating Medical Ontologies with Clinical Data for Robust Code Embeddings
Ahmed Elhussein, Paul Meddeb, Abigail Newbury, Jeanne Mirone, Martin Stoll, Gamze Gursoy
https://arxiv.org/abs/2510.05049

KEEP: Integrating Medical Ontologies with Clinical Data for Robust Code Embeddings
Machine learning in healthcare requires effective representation of structured medical codes, but current methods face a trade-off: knowledge-graph-based approaches capture formal relationships but miss real-world patterns, while data-driven methods learn empirical associations but often overlook structured knowledge in medical terminologies. We present KEEP (Knowledge preserving and Empirically refined Embedding Process), an efficient framework that bridges this gap by combining knowledge grap…

@arXiv_econEM_bot@mastoxiv.page
2025-09-26 08:17:31

Recidivism and Peer Influence with LLM Text Embeddings in Low Security Correctional Facilities
Shanjukta Nath, Jiwon Hong, Jae Ho Chang, Keith Warren, Subhadeep Paul
https://arxiv.org/abs/2509.20634

Recidivism and Peer Influence with LLM Text Embeddings in Low Security Correctional Facilities
We find AI embeddings obtained using a pre-trained transformer-based Large Language Model (LLM) of 80,000-120,000 written affirmations and correction exchanges among residents in low-security correctional facilities to be highly predictive of recidivism. The prediction accuracy is 30% higher with embedding vectors than with only pre-entry covariates. However, since the text embedding vectors are high-dimensional, we perform Zero-Shot classification of these texts to a low-dimensional vector of…

@arXiv_csIR_bot@mastoxiv.page
2025-10-10 08:33:18

ReasonEmbed: Enhanced Text Embeddings for Reasoning-Intensive Document Retrieval
Jianlyu Chen, Junwei Lan, Chaofan Li, Defu Lian, Zheng Liu
https://arxiv.org/abs/2510.08252 http…

ReasonEmbed: Enhanced Text Embeddings for Reasoning-Intensive Document Retrieval
In this paper, we introduce ReasonEmbed, a novel text embedding model developed for reasoning-intensive document retrieval. Our work includes three key technical contributions. First, we propose ReMixer, a new data synthesis method that overcomes the triviality problem prevalent in previous synthetic datasets, enabling large-scale production of 82K high-quality training samples. Second, we design Redapter, a self-adaptive learning algorithm that dynamically adjusts each training sample's weight…

@arXiv_csAI_bot@mastoxiv.page
2025-10-15 12:18:59

Crosslisted article(s) found for cs.AI. https://arxiv.org/list/cs.AI/new
[1/6]:
- Leveraging LLMs, IDEs, and Semantic Embeddings for Automated Move Method Refactoring
Fraol Batole, et al.

@arXiv_csCL_bot@mastoxiv.page
2025-10-15 10:38:41

SMEC: Rethinking Matryoshka Representation Learning for Retrieval Embedding Compression
Biao Zhang, Lixin Chen, Tong Liu, Bo Zheng
https://arxiv.org/abs/2510.12474 https://

SMEC: Rethinking Matryoshka Representation Learning for Retrieval Embedding Compression
Large language models (LLMs) generate high-dimensional embeddings that capture rich semantic and syntactic information. However, high-dimensional embeddings exacerbate computational complexity and storage requirements, thereby hindering practical deployment. To address these challenges, we propose a novel training framework named Sequential Matryoshka Embedding Compression (SMEC). This framework introduces the Sequential Matryoshka Representation Learning (SMRL) method to mitigate gradient varia…
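For context, Matryoshka-style embeddings are trained so that a leading prefix of the vector remains usable on its own. The sketch below shows only the inference-side use of such an embedding (keep the first `dim` coordinates and re-normalize); it does not implement SMEC or SMRL training, and the function name and example vector are hypothetical.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` coordinates and L2-normalize the result."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to normalize
    return [x / norm for x in head]

full = [3.0, 4.0, 12.0]             # made-up 3-D embedding
small = truncate_embedding(full, 2)
print(small)
```

Re-normalizing matters when similarity is computed with a plain dot product, since the truncated prefix no longer has unit length.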

@arXiv_mathRT_bot@mastoxiv.page
2025-10-14 08:52:58

Generalized Rank via Minimal Subposet
Thomas Brüstle, Justin Desrochers, Samuel Leblanc
https://arxiv.org/abs/2510.10837 https://arxiv.org/pdf/2510.…

Generalized Rank via Minimal Subposet
Let $\mathcal{C}$ be a small, connected category with finite hom-sets. We show that if the embedding of a connected subcategory $\mathcal{J}$ is both initial and final, then the restriction of any $\mathcal{C}$-module along $\mathcal{J}$ preserves the generalized rank, or equivalently, the multiplicity of the "entire" interval modules for $\mathcal{C}$ and $\mathcal{J}$. Conversely, we prove that this property characterizes initial and final embeddings when both $\mathcal{C}$ and $\mathcal{J}$ …

@arXiv_hepth_bot@mastoxiv.page
2025-09-25 09:18:12

Holographic Entanglement Entropy in Quiver Theories
Dimitrios Chatzis, Ali Fatemiabhari, Mauro Giliberti, Madison Hammond
https://arxiv.org/abs/2509.19434 https://

Holographic Entanglement Entropy in Quiver Theories
This work presents a study of the entanglement entropy (EE) in a class of four-dimensional ${\cal N}=1$ linear quiver SCFTs deformed by the presence of a VEV. We review the holographic backgrounds dual to these theories, and calculate the EE for different Ryu-Takayanagi embeddings. We allow the embeddings to explore, in addition to the usual spatial direction, the internal coordinate $z$, associated with the quiver degrees of freedom. Via the numerical optimization of splines on triangulations,…

@arXiv_csDB_bot@mastoxiv.page
2025-10-10 13:57:01

Replaced article(s) found for cs.DB. https://arxiv.org/list/cs.DB/new
[1/1]:
- GNN-based Path Embeddings for Efficient and Exact Subgraph Matching (Technical Report)
Yutong Ye, Xiang Lian, Mingsong Chen

@arXiv_csCE_bot@mastoxiv.page
2025-10-10 07:34:18

IKNet: Interpretable Stock Price Prediction via Keyword-Guided Integration of News and Technical Indicators
Jinwoong Kim, Sangjin Park
https://arxiv.org/abs/2510.07661 https://

IKNet: Interpretable Stock Price Prediction via Keyword-Guided Integration of News and Technical Indicators
The increasing influence of unstructured external information, such as news articles, on stock prices has attracted growing attention in financial markets. Despite recent advances, most existing news-based forecasting models represent all articles using sentiment scores or average embeddings that capture the general tone but fail to provide quantitative, context-aware explanations of the impacts of public sentiment on predictions. To address this limitation, we propose an interpretable keyword-g…

@arXiv_csRO_bot@mastoxiv.page
2025-10-03 10:32:01

Do You Know Where Your Camera Is? View-Invariant Policy Learning with Camera Conditioning
Tianchong Jiang, Jingtian Ji, Xiangshan Tan, Jiading Fang, Anand Bhattad, Vitor Guizilini, Matthew R. Walter
https://arxiv.org/abs/2510.02268

Do You Know Where Your Camera Is? View-Invariant Policy Learning with Camera Conditioning
We study view-invariant imitation learning by explicitly conditioning policies on camera extrinsics. Using Plücker embeddings of per-pixel rays, we show that conditioning on extrinsics significantly improves generalization across viewpoints for standard behavior cloning policies, including ACT, Diffusion Policy, and SmolVLA. To evaluate policy robustness under realistic viewpoint shifts, we introduce six manipulation tasks in RoboSuite and ManiSkill that pair "fixed" and "randomized" scene vari…
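As background for the ray embedding mentioned above: a ray through point o with unit direction d can be written in Plücker coordinates as the 6-vector (d, o × d). The sketch below uses that common convention, which may differ in ordering or sign from the paper's; deriving per-pixel rays from camera extrinsics is not shown, and the function names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def plucker_ray(origin, direction):
    """Return (d, o x d) with d normalized to unit length."""
    norm = math.sqrt(sum(x * x for x in direction))
    d = [x / norm for x in direction]
    moment = cross(origin, d)
    return d + moment

ray = plucker_ray([0.0, 0.0, 1.0], [2.0, 0.0, 0.0])
print(ray)  # [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
```

The moment o × d does not change when the origin slides along the ray, so the 6-vector identifies the line itself, not one particular point on it.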

@arXiv_csLG_bot@mastoxiv.page
2025-10-08 10:42:49

Power Mechanism: Private Tabular Representation Release for Model Agnostic Consumption
Praneeth Vepakomma, Kaustubh Ponkshe
https://arxiv.org/abs/2510.05581 https://

Power Mechanism: Private Tabular Representation Release for Model Agnostic Consumption
Traditional collaborative learning approaches are based on sharing of model weights between clients and a server. However, there are advantages to resource efficiency through schemes based on sharing of embeddings (activations) created from the data. Several differentially private methods were developed for sharing of weights while such mechanisms do not exist so far for sharing of embeddings. We propose Ours to learn a privacy encoding network in conjunction with a small utility generation net…

@arXiv_mathGN_bot@mastoxiv.page
2025-11-11 08:34:50

Dimensionality reduction and width of deep neural networks based on topological degree theory
Xiao-Song Yang
https://arxiv.org/abs/2511.06821 https://arxiv.org/pdf/2511.06821 https://arxiv.org/html/2511.06821
arXiv:2511.06821v1 Announce Type: new
Abstract: In this paper we present a mathematical framework on linking of embeddings of compact topological spaces into Euclidean spaces and separability of linked embeddings under a specific class of dimension reduction maps. As applications of the established theory, we provide some fascinating insights into classification and approximation problems in deep learning theory in the setting of deep neural networks.

@arXiv_csIR_bot@mastoxiv.page
2025-10-10 07:46:08

Queries Are Not Alone: Clustering Text Embeddings for Video Search
Peyang Liu, Xi Wang, Ziqiang Cui, Wei Ye
https://arxiv.org/abs/2510.07720 https://arxiv.…

Queries Are Not Alone: Clustering Text Embeddings for Video Search
The rapid proliferation of video content across various platforms has highlighted the urgent need for advanced video retrieval systems. Traditional methods, which primarily depend on directly matching textual queries with video metadata, often fail to bridge the semantic gap between text descriptions and the multifaceted nature of video content. This paper introduces a novel framework, the Video-Text Cluster (VTC), which enhances video retrieval by clustering text queries to capture a broader s…

@arXiv_statAP_bot@mastoxiv.page
2025-09-26 08:21:01

Incorporating LLM Embeddings for Variation Across the Human Genome
Hongqian Niu, Jordan Bryan, Xihao Li, Didong Li
https://arxiv.org/abs/2509.20702 https://

Incorporating LLM Embeddings for Variation Across the Human Genome
Recent advances in large language model (LLM) embeddings have enabled powerful representations for biological data, but most applications to date focus only on gene-level information. We present one of the first systematic frameworks to generate variant-level embeddings across the entire human genome. Using curated annotations from FAVOR, ClinVar, and the GWAS Catalog, we constructed semantic text descriptions for 8.9 billion possible variants and generated embeddings at three scales: 1.5 milli…

@arXiv_csSE_bot@mastoxiv.page
2025-10-13 09:41:10

Vector Graph-Based Repository Understanding for Issue-Driven File Retrieval
Kostiantyn Bevziuk, Andrii Fatula, Svetozar Lashin Yaroslav Opanasenko, Anna Tukhtarova, Ashok Jallepalli Pradeepkumar Sharma, Hritvik Shrivastava
https://arxiv.org/abs/2510.08876

Vector Graph-Based Repository Understanding for Issue-Driven File Retrieval
We present a repository decomposition system that converts large software repositories into a vectorized knowledge graph which mirrors project architectural and semantic structure, capturing semantic relationships and allowing a significant level of automatization of further repository development. The graph encodes syntactic relations such as containment, implementation, references, calls, and inheritance, and augments nodes with LLM-derived summaries and vector embeddings. A hybrid retrieval …

@arXiv_csCG_bot@mastoxiv.page
2025-10-14 07:33:20

Rigid-Invariant Sliced Wasserstein via Independent Embeddings
Peilin He, Zakk Heile, Jayson Tran, Alice Wang, Shrikant Chand
https://arxiv.org/abs/2510.10233 https://

@arXiv_mathCO_bot@mastoxiv.page
2025-10-02 09:47:41

Throttling for metric dimension and its variants
Boris Brimkov, Peter Diao, Jesse Geneson, Carolyn Reinhart, Shen-Fu Tsai, William Wang, Kyle Worley
https://arxiv.org/abs/2510.00530

Throttling for metric dimension and its variants
Metric dimension is a graph parameter that has been applied to robot navigation and finding low-dimensional vector embeddings. Throttling entails minimizing the sum of two available resources when solving certain graph problems. In this paper, we introduce throttling for metric dimension, edge metric dimension, and mixed metric dimension. In the context of vector embeddings, metric dimension throttling finds a low-dimensional, low-magnitude embedding with integer coordinates. We show that compu…

@arXiv_csGR_bot@mastoxiv.page
2025-10-10 09:05:29

Spectral Prefiltering of Neural Fields
Mustafa B. Yaldiz, Ishit Mehta, Nithin Raghavan, Andreas Meuleman, Tzu-Mao Li, Ravi Ramamoorthi
https://arxiv.org/abs/2510.08394 https://

Spectral Prefiltering of Neural Fields
Neural fields excel at representing continuous visual signals but typically operate at a single, fixed resolution. We present a simple yet powerful method to optimize neural fields that can be prefiltered in a single forward pass. Key innovations and features include: (1) We perform convolutional filtering in the input domain by analytically scaling Fourier feature embeddings with the filter's frequency response. (2) This closed-form modulation generalizes beyond Gaussian filtering and supports…

@arXiv_csCV_bot@mastoxiv.page
2025-10-09 10:27:31

Efficient Discriminative Joint Encoders for Large Scale Vision-Language Reranking
Mitchell Keren Taraday, Shahaf Wagner, Chaim Baskin
https://arxiv.org/abs/2510.06820 https://…

Efficient Discriminative Joint Encoders for Large Scale Vision-Language Reranking
Multimodal retrieval still leans on embedding-based models like CLIP for fast vector search over pre-computed image embeddings. Yet, unlike text retrieval, where joint-encoder rerankers are standard, comparable vision-language rerankers are largely absent. We find that seminal joint encoders such as BLIP are severely bottlenecked by an expensive visual feature-extraction stage, preventing practical deployment at scale. Motivated by this bottleneck, we introduce EDJE, an Efficient Discriminativ…

@arXiv_csSD_bot@mastoxiv.page
2025-10-14 11:34:48

Diffusion-Link: Diffusion Probabilistic Model for Bridging the Audio-Text Modality Gap
KiHyun Nam, Jongmin Choi, Hyeongkeun Lee, Jungwoo Heo, Joon Son Chung
https://arxiv.org/abs/2510.11330

Diffusion-Link: Diffusion Probabilistic Model for Bridging the Audio-Text Modality Gap
Contrastive audio-language pretraining yields powerful joint representations, yet a persistent audio-text modality gap limits the benefits of coupling multimodal encoders with large language models (LLMs). We present Diffusion-Link, a diffusion-based modality-bridging module that generatively maps audio embeddings into the text-embedding distribution. The module is trained at the output embedding from the frozen multimodal encoder and implemented as a lightweight network with three residual MLP…

@arXiv_statML_bot@mastoxiv.page
2025-09-30 10:21:41

Define latent spaces by example: optimisation over the outputs of generative models
Samuel Willis, Alexandru I. Stere, Dragos D. Margineantu, Henry T. Oldroyd, John A. Fozard, Carl Henrik Ek, Henry Moss, Erik Bodin
https://arxiv.org/abs/2509.23800

Define latent spaces by example: optimisation over the outputs of generative models
Modern generative AI models such as diffusion and flow matching can sample from rich data distributions, but many downstream tasks -- such as experimental design or creative content generation -- require a higher level of control than unconstrained sampling. The challenge is to efficiently identify outputs that are both probable under the model and satisfy task-specific constraints. We address this by introducing surrogate latent spaces: non-parametric, low-dimensional Euclidean embeddings that…

@arXiv_qbioGN_bot@mastoxiv.page
2025-09-30 08:16:36

Contrastive Learning Enhances Language Model Based Cell Embeddings for Low-Sample Single Cell Transcriptomics
Luxuan Zhang, Douglas Jiang, Qinglong Wang, Haoqi Sun, Feng Tian
https://arxiv.org/abs/2509.23543

Contrastive Learning Enhances Language Model Based Cell Embeddings for Low-Sample Single Cell Transcriptomics
Large language models (LLMs) have shown strong ability in generating rich representations across domains such as natural language processing and generation, computer vision, and multimodal learning. However, their application in biomedical data analysis remains nascent. Single-cell transcriptomic profiling is essential for dissecting cell subtype diversity in development and disease, but rare subtypes pose challenges for scaling laws. We present a computational framework that integrates single-…

@arXiv_mathDG_bot@mastoxiv.page
2025-10-01 09:12:37

The sigma invariant of the $n$ torus, the K3 surface, and Euclidean and elliptic 3d manifolds
Santiago R. Simanca
https://arxiv.org/abs/2509.26079 https://…

The sigma invariant of the $n$ torus, the K3 surface, and Euclidean and elliptic 3d manifolds
On the space of isometric embeddings $f_g$ of metrics $g$ on a manifold $M^n$ into the standard $(\mathbb{S}^{\tilde{n}=\tilde{n}(n)},\tilde{g})$, we consider the total exterior scalar curvature $Θ_{f_g}(M)$, and squared $L^2$ norm of the mean curvature vector $Φ_{f_g}(M)$ and second fundamental form $Π_{f_g}(M)$ functionals of $f_g$, respectively. Then $\mathcal{W}_{f_g}(M) =(1-δ_{n,1})(n/(n-1)) Θ_{f_g}(M) + Φ_{f_g}(M)$ and $\mathcal{D}_{f_g}(M)=(1-δ_{n,1}) (1/(n-1))Θ_{f_g}(M)+Π_{f_g}(M)$ are functionals int…

@arXiv_csAI_bot@mastoxiv.page
2025-10-13 09:29:30

Semantic-Condition Tuning: Fusing Graph Context with Large Language Models for Knowledge Graph Completion
Ruitong Liu, Yan Wen, Te Sun, Yunjia Wu, Pingyang Huang, Zihang Yu, Siyuan Li
https://arxiv.org/abs/2510.08966

Semantic-Condition Tuning: Fusing Graph Context with Large Language Models for Knowledge Graph Completion
Fusing Knowledge Graphs with Large Language Models is crucial for knowledge-intensive tasks like knowledge graph completion. The prevailing paradigm, prefix-tuning, simply concatenates knowledge embeddings with text inputs. However, this shallow fusion overlooks the rich relational semantics within KGs and imposes a significant implicit reasoning burden on the LLM to correlate the prefix with the text. To address these, we propose Semantic-condition Tuning (SCT), a new knowledge injection parad…

@arXiv_physicssocph_bot@mastoxiv.page
2025-10-03 13:32:33

Replaced article(s) found for physics.soc-ph. https://arxiv.org/list/physics.soc-ph/new
[1/1]:
- Multi-Scale Node Embeddings for Graph Modeling and Generation
Riccardo Milocco, Fabian Jansen, Diego Garlaschelli

@arXiv_qbioNC_bot@mastoxiv.page
2025-10-02 08:35:00

Robust State-space Reconstruction of Brain Dynamics via Bootstrap Monte Carlo SSA
Sir-Lord Wiafe, Carter Hinsley, Vince D. Calhoun
https://arxiv.org/abs/2510.00011 https://

Robust State-space Reconstruction of Brain Dynamics via Bootstrap Monte Carlo SSA
Reconstructing latent state-space geometry from time series provides a powerful route to studying nonlinear dynamics across complex systems. Delay-coordinate embedding provides the theoretical basis but assumes long, noise-free recordings, which many domains violate. In neuroimaging, for example, fMRI is short and noisy; low sampling and strong red noise obscure oscillations and destabilize embeddings. We propose bootstrap Monte Carlo SSA with a red-noise null and bootstrap stability to retain …

@arXiv_csCL_bot@mastoxiv.page
2025-10-09 10:39:11

Search-R3: Unifying Reasoning and Embedding Generation in Large Language Models
Yuntao Gui, James Cheng
https://arxiv.org/abs/2510.07048 https://arxiv.org/…

Search-R3: Unifying Reasoning and Embedding Generation in Large Language Models
Despite their remarkable natural language understanding capabilities, Large Language Models (LLMs) have been underutilized for retrieval tasks. We present Search-R3, a novel framework that addresses this limitation by adapting LLMs to generate search embeddings as a direct output of their reasoning process. Our approach exploits LLMs' chain-of-thought capabilities, allowing them to produce more effective embeddings by reasoning step-by-step through complex semantic analyses. We implement this t…

@arXiv_csLG_bot@mastoxiv.page
2025-10-14 13:40:18

Attention Factors for Statistical Arbitrage
Elliot L. Epstein, Rose Wang, Jaewon Choi, Markus Pelger
https://arxiv.org/abs/2510.11616 https://arxiv.org/pdf…

Attention Factors for Statistical Arbitrage
Statistical arbitrage exploits temporal price differences between similar assets. We develop a framework to jointly identify similar assets through factors, identify mispricing and form a trading policy that maximizes risk-adjusted performance after trading costs. Our Attention Factors are conditional latent factors that are the most useful for arbitrage trading. They are learned from firm characteristic embeddings that allow for complex interactions. We identify time-series signals from the re…

@arXiv_mathMG_bot@mastoxiv.page
2025-10-01 08:00:47

Metric Poincaré inequalities for graphs
Dylan J. Altschuler, Pandelis Dodos, Konstantin Tikhomirov, Konstantinos Tyros
https://arxiv.org/abs/2509.25489 https://

Metric Poincaré inequalities for graphs
This article considers embeddings of bounded degree graphs into general metric spaces. Our first main result is a metric analogue of Matoušek's extrapolation that relates the Poincaré constants $γ(G,\varrho^p)$ and $γ(G,\varrho^q)$ for any pair of exponents $0 < p,q < \infty$, any bounded degree expander graph $G$, and any metric space $\mathcal{M}=(M,\varrho)$. Our second main result provides a sharp estimate of the Poincaré constant $γ(G,\varrho)$ in terms of the cardinalities of the ve…

@arXiv_csIR_bot@mastoxiv.page
2025-10-07 09:41:32

Empowering Denoising Sequential Recommendation with Large Language Model Embeddings
Tongzhou Wu, Yuhao Wang, Maolin Wang, Chi Zhang, Xiangyu Zhao
https://arxiv.org/abs/2510.04239

Empowering Denoising Sequential Recommendation with Large Language Model Embeddings
Sequential recommendation aims to capture user preferences by modeling sequential patterns in user-item interactions. However, these models are often influenced by noise such as accidental interactions, leading to suboptimal performance. Therefore, to reduce the effect of noise, some works propose explicitly identifying and removing noisy items. However, we find that simply relying on collaborative information may result in an over-denoising problem, especially for cold items. To overcome these…

@arXiv_eessAS_bot@mastoxiv.page
2025-10-07 09:27:12

Adapting Diarization-Conditioned Whisper for End-to-End Multi-Talker Speech Recognition
Martin Kocour, Martin Karafiat, Alexander Polok, Dominik Klement, Lukáš Burget, Jan Černocký
https://arxiv.org/abs/2510.03723

Adapting Diarization-Conditioned Whisper for End-to-End Multi-Talker Speech Recognition
We propose a speaker-attributed (SA) Whisper-based model for multi-talker speech recognition that combines target-speaker modeling with serialized output training (SOT). Our approach leverages a Diarization-Conditioned Whisper (DiCoW) encoder to extract target-speaker embeddings, which are concatenated into a single representation and passed to a shared decoder. This enables the model to transcribe overlapping speech as a serialized output stream with speaker tags and timestamps. In contrast to…

@arXiv_econGN_bot@mastoxiv.page
2025-09-30 09:12:11

Pixels to Prices: Visual Traits, Market Cycles, and the Economics of NFT Valuation
Samiha Tariq
https://arxiv.org/abs/2509.24879 https://arxiv.org/pdf/2509…

Pixels to Prices: Visual Traits, Market Cycles, and the Economics of NFT Valuation
This paper studies how visual traits and market cycles shape prices in NFT markets. Using 94,039 transactions from 26 major generative Ethereum collections, the analysis extracts 196 machine-quantified image features (covering color, composition, palette structure, geometry, texture, and deep learning embeddings), then applies a three-stage filter process to identify stable predictors for hedonic regression. A static mixed-effects model shows that market sentiment and transparent, interpretable…

@arXiv_csAI_bot@mastoxiv.page
2025-10-10 10:20:19

Language Models Do Not Embed Numbers Continuously
Alex O. Davies, Roussel Nzoyem, Nirav Ajmeri, Telmo M. Silva Filho
https://arxiv.org/abs/2510.08009 https://

Language Models Do Not Embed Numbers Continuously
Recent research has extensively studied how large language models manipulate integers in specific arithmetic tasks, and on a more fundamental level, how they represent numeric values. These previous works have found that language model embeddings can be used to reconstruct the original values; however, they do not evaluate whether language models actually model continuous values as continuous. Using expected properties of the embedding space, including linear reconstruction and principal compon…

@arXiv_csCL_bot@mastoxiv.page
2025-10-03 10:46:41

Detecting LLM-Generated Spam Reviews by Integrating Language Model Embeddings and Graph Neural Network
Xin Liu, Rongwu Xu, Xinyi Jia, Jason Liao, Jiao Sun, Ling Huang, Wei Xu
https://arxiv.org/abs/2510.01801

Detecting LLM-Generated Spam Reviews by Integrating Language Model Embeddings and Graph Neural Network
The rise of large language models (LLMs) has enabled the generation of highly persuasive spam reviews that closely mimic human writing. These reviews pose significant challenges for existing detection systems and threaten the credibility of online platforms. In this work, we first create three realistic LLM-generated spam review datasets using three distinct LLMs, each guided by product metadata and genuine reference reviews. Evaluations by GPT-4.1 confirm the high persuasion and deceptive pote…

@arXiv_csDC_bot@mastoxiv.page
2025-09-30 10:43:21

RServe: Overlapping Encoding and Prefill for Efficient LMM Inference
Tianyu Guo, Tianming Xu, Xianjie Chen, Junru Chen, Nong Xiao, Xianwei Zhang
https://arxiv.org/abs/2509.24381

RServe: Overlapping Encoding and Prefill for Efficient LMM Inference
Large multimodal models (LMMs) typically employ an encoding module to transform multimodal data inputs into embeddings, which are then fed to language models for further processing. However, efficiently serving LMMs remains highly challenging due to the inherent complexity of their inference pipelines. Traditional serving engines co-locate the encoding module and the language model, leading to significant resource interference and tight data dependency. Recent studies have alleviated this issue…

@arXiv_csLG_bot@mastoxiv.page
2025-09-26 10:28:51

LAVA: Explainability for Unsupervised Latent Embeddings
Ivan Stresec, Joana P. Gonçalves
https://arxiv.org/abs/2509.21149 https://arxiv.org/pdf/2509.21…

LAVA: Explainability for Unsupervised Latent Embeddings
Unsupervised black-box models can be drivers of scientific discovery, but remain difficult to interpret. Crucially, discovery hinges on understanding the model output, which is often a multi-dimensional latent embedding rather than a well-defined target. While explainability for supervised learning usually seeks to uncover how input features are used to predict a target, its unsupervised counterpart should relate input features to the structure of the learned latent space. Adaptations of superv…

@arXiv_eessIV_bot@mastoxiv.page
2025-10-02 08:38:01

Latent Representation Learning from 3D Brain MRI for Interpretable Prediction in Multiple Sclerosis
Trinh Ngoc Huynh, Nguyen Duc Kien, Nguyen Hai Anh, Dinh Tran Hiep, Manuela Vaneckova, Tomas Uher, Jeroen Van Schependom, Stijn Denissen, Tran Quoc Long, Nguyen Linh Trung, Guy Nagels
https://arxiv.org/abs/2510.00051

Latent Representation Learning from 3D Brain MRI for Interpretable Prediction in Multiple Sclerosis
We present InfoVAE-Med3D, a latent-representation learning approach for 3D brain MRI that targets interpretable biomarkers of cognitive decline. Standard statistical models and shallow machine learning often lack power, while most deep learning methods behave as black boxes. Our method extends InfoVAE to explicitly maximize mutual information between images and latent variables, producing compact, structured embeddings that retain clinically meaningful content. We evaluate on two cohorts: a lar…

@arXiv_csIR_bot@mastoxiv.page
2025-10-03 08:47:31

Location Matters: Leveraging Multi-Resolution Geo-Embeddings for Housing Search
Ivo Silva (QuintoAndar), Pedro Nogueira (QuintoAndar), Guilherme Bonaldo (QuintoAndar)
https://arxiv.org/abs/2510.01196

Location Matters: Leveraging Multi-Resolution Geo-Embeddings for Housing Search
QuintoAndar Group is Latin America's largest housing platform, revolutionizing property rentals and sales. Headquartered in Brazil, it simplifies the housing process by eliminating paperwork and enhancing accessibility for tenants, buyers, and landlords. With thousands of houses available for each city, users struggle to find the ideal home. In this context, location plays a pivotal role, as it significantly influences property value, access to amenities, and life quality. A great location can …

@arXiv_csDS_bot@mastoxiv.page
2025-09-30 10:30:01

Efficient Sketching and Nearest Neighbor Search Algorithms for Sparse Vector Sets
Sebastian Bruch, Franco Maria Nardini, Cosimo Rulli, Rossano Venturini
https://arxiv.org/abs/2509.24815

Efficient Sketching and Nearest Neighbor Search Algorithms for Sparse Vector Sets
Sparse embeddings of data form an attractive class due to their inherent interpretability: Every dimension is tied to a term in some vocabulary, making it easy to visually decipher the latent space. Sparsity, however, poses unique challenges for Approximate Nearest Neighbor Search (ANNS) which finds, from a collection of vectors, the k vectors closest to a query. To encourage research on this underexplored topic, sparse ANNS featured prominently in a BigANN Challenge at NeurIPS 2023, where appr…

@arXiv_csCL_bot@mastoxiv.page
2025-10-03 10:40:01

Learning to Look at the Other Side: A Semantic Probing Study of Word Embeddings in LLMs with Enabled Bidirectional Attention
Zhaoxin Feng, Jianfei Ma, Emmanuele Chersoni, Xiaojing Zhao, Xiaoyi Bao
https://arxiv.org/abs/2510.01652

Learning to Look at the Other Side: A Semantic Probing Study of Word Embeddings in LLMs with Enabled Bidirectional Attention
Autoregressive Large Language Models (LLMs) demonstrate exceptional performance in language understanding and generation. However, their application in text embedding tasks has been relatively slow, along with the analysis of their semantic representation in probing tasks, due to the constraints of the unidirectional attention mechanism. This paper aims to explore whether such constraints can be overcome by enabling bidirectional attention in LLMs. We tested different variants of the Llama ar…

@arXiv_csGR_bot@mastoxiv.page
2025-10-07 10:00:42

SAEdit: Token-level control for continuous image editing via Sparse AutoEncoder
Ronen Kamenetsky, Sara Dorfman, Daniel Garibi, Roni Paiss, Or Patashnik, Daniel Cohen-Or
https://arxiv.org/abs/2510.05081

SAEdit: Token-level control for continuous image editing via Sparse AutoEncoder
Large-scale text-to-image diffusion models have become the backbone of modern image editing, yet text prompts alone do not offer adequate control over the editing process. Two properties are especially desirable: disentanglement, where changing one attribute does not unintentionally alter others, and continuous control, where the strength of an edit can be smoothly adjusted. We introduce a method for disentangled and continuous editing through token-level manipulation of text embeddings. The ed…

@arXiv_csSD_bot@mastoxiv.page
2025-10-08 08:50:49

FoleyGRAM: Video-to-Audio Generation with GRAM-Aligned Multimodal Encoders
Riccardo Fosco Gramaccioni, Christian Marinoni, Eleonora Grassucci, Giordano Cicchetti, Aurelio Uncini, Danilo Comminiello
https://arxiv.org/abs/2510.05829

FoleyGRAM: Video-to-Audio Generation with GRAM-Aligned Multimodal Encoders
In this work, we present FoleyGRAM, a novel approach to video-to-audio generation that emphasizes semantic conditioning through the use of aligned multimodal encoders. Building on prior advancements in video-to-audio generation, FoleyGRAM leverages the Gramian Representation Alignment Measure (GRAM) to align embeddings across video, text, and audio modalities, enabling precise semantic control over the audio generation process. The core of FoleyGRAM is a diffusion-based audio synthesis model co…

@arXiv_csAI_bot@mastoxiv.page
2025-10-08 12:47:15

Crosslisted article(s) found for cs.AI. https://arxiv.org/list/cs.AI/new
[7/8]:
- Gaussian Embeddings: How JEPAs Secretly Learn Your Data Density
Randall Balestriero, Nicolas Ballas, Mike Rabbat, Yann LeCun

@arXiv_csCV_bot@mastoxiv.page
2025-09-26 10:20:11

WAVECLIP: Wavelet Tokenization for Adaptive-Resolution CLIP
Moshe Kimhi, Erez Koifman, Ehud Rivlin, Eli Schwartz, Chaim Baskin
https://arxiv.org/abs/2509.21153 https://

WAVECLIP: Wavelet Tokenization for Adaptive-Resolution CLIP
We introduce WAVECLIP, a single unified model for adaptive resolution inference in CLIP, enabled by wavelet-based tokenization. WAVECLIP replaces standard patch embeddings with a multi-level wavelet decomposition, enabling the model to process images coarse to fine while naturally supporting multiple resolutions within the same model. At inference time, the model begins with low resolution tokens and refines only when needed, using key-value caching and causal cross-level attention to reuse com…

@arXiv_csSE_bot@mastoxiv.page
2025-10-02 10:05:31

Analyzing Latent Concepts in Code Language Models
Arushi Sharma, Vedant Pungliya, Christopher J. Quinn, Ali Jannesari
https://arxiv.org/abs/2510.00476 https://

Analyzing Latent Concepts in Code Language Models
Interpreting the internal behavior of large language models trained on code remains a critical challenge, particularly for applications demanding trust, transparency, and semantic robustness. We propose Code Concept Analysis (CoCoA): a global post-hoc interpretability framework that uncovers emergent lexical, syntactic, and semantic structures in a code language model's representation space by clustering contextualized token embeddings into human-interpretable concept groups. We propose a hybri…

@arXiv_csLG_bot@mastoxiv.page
2025-10-10 13:29:33

Crosslisted article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[4/5]:
- Leveraging Whisper Embeddings for Audio-based Lyrics Matching
Eleonora Mancini, Joan Serrà, Paolo Torroni, Yuki Mitsufuji

@arXiv_mathDG_bot@mastoxiv.page
2025-09-26 07:43:51

Non-existence of higher-order conformal fundamental forms in odd dimensions
Samuel Blitz
https://arxiv.org/abs/2509.20554 https://arxiv.org/pdf/2509.20554

Non-existence of higher-order conformal fundamental forms in odd dimensions
Conformal fundamental forms populate a minimal generating set for low differential order invariants of conformal hypersurface embeddings. In this work we complete the characterization of conformal fundamental forms by proving the general non-existence of higher-order conformal fundamental forms when the embedded hypersurface is even dimensional.

@arXiv_mathGR_bot@mastoxiv.page
2025-09-25 08:44:12

Ping-pong for basis-conjugating HNN-extension of free group
Vasily Ionin
https://arxiv.org/abs/2509.19635 https://arxiv.org/pdf/2509.19635

Ping-pong for basis-conjugating HNN-extension of free group
We construct a normal form for a multiple HNN-extension of a free group by basis-conjugating embeddings. We provide sufficient conditions on a collection of subgroups to fulfill the requirements of ping-pong lemma. Recall that the pure braid group splits as a semidirect product of free groups. Using our result, we show that certain braids from the first two summands $F_n \rtimes F_{n-1} \subset P_{n+1}$ generate a free subgroup.

@arXiv_csIR_bot@mastoxiv.page
2025-10-14 08:54:38

PairSem: LLM-Guided Pairwise Semantic Matching for Scientific Document Retrieval
Wonbin Kweon, Runchu Tian, SeongKu Kang, Pengcheng Jiang, Zhiyong Lu, Jiawei Han, Hwanjo Yu
https://arxiv.org/abs/2510.09897

PairSem: LLM-Guided Pairwise Semantic Matching for Scientific Document Retrieval
Scientific document retrieval is a critical task for enabling knowledge discovery and supporting research across diverse domains. However, existing dense retrieval methods often struggle to capture fine-grained scientific concepts in texts due to their reliance on holistic embeddings and limited domain understanding. Recent approaches leverage large language models (LLMs) to extract fine-grained semantic entities and enhance semantic matching, but they typically treat entities as independent fr…

@arXiv_csCL_bot@mastoxiv.page
2025-10-14 13:12:58

An Encoder-Integrated PhoBERT with Graph Attention for Vietnamese Token-Level Classification
Ba-Quang Nguyen
https://arxiv.org/abs/2510.11537 https://arxiv…

An Encoder-Integrated PhoBERT with Graph Attention for Vietnamese Token-Level Classification
We propose a novel neural architecture named TextGraphFuseGAT, which integrates a pretrained transformer encoder (PhoBERT) with Graph Attention Networks for token-level classification tasks. The proposed model constructs a fully connected graph over the token embeddings produced by PhoBERT, enabling the GAT layer to capture rich inter-token dependencies beyond those modeled by sequential context alone. To further enhance contextualization, a Transformer-style self-attention layer is applied on …

@arXiv_qbioNC_bot@mastoxiv.page
2025-09-26 12:49:59

Replaced article(s) found for q-bio.NC. https://arxiv.org/list/q-bio.NC/new
[1/1]:
- Interpretable Embeddings of Speech Enhance and Explain Brain Encoding Performance of Audio Models
Riki Shimizu, Richard J. Antonello, Chandan Singh, Nima Mesgarani

@arXiv_eessAS_bot@mastoxiv.page
2025-10-07 10:13:42

Probing Whisper for Dysarthric Speech in Detection and Assessment
Zhengjun Yue, Devendra Kayande, Zoran Cvetkovic, Erfan Loweimi
https://arxiv.org/abs/2510.04219 https://…

Probing Whisper for Dysarthric Speech in Detection and Assessment
Large-scale end-to-end models such as Whisper have shown strong performance on diverse speech tasks, but their internal behavior on pathological speech remains poorly understood. Understanding how dysarthric speech is represented across layers is critical for building reliable and explainable clinical assessment tools. This study probes the Whisper-Medium model encoder for dysarthric speech for detection and assessment (i.e., severity classification). We evaluate layer-wise embeddings with a li…

@arXiv_csSD_bot@mastoxiv.page
2025-10-08 08:15:49

Sparse deepfake detection promotes better disentanglement
Antoine Teissier, Marie Tahon, Nicolas Dugué, Aghilas Sini
https://arxiv.org/abs/2510.05696 https://

Sparse deepfake detection promotes better disentanglement
Due to the rapid progress of speech synthesis, deepfake detection has become a major concern in the speech processing community. Because it is a critical task, systems must not only be efficient and robust, but also provide interpretable explanations. Among the different approaches for explainability, we focus on the interpretation of latent representations. In this paper, we focus on the last layer of embeddings of AASIST, a deepfake detection architecture. We use a TopK activation inspired by…

@arXiv_csIR_bot@mastoxiv.page
2025-10-14 11:05:58

Decoupled Multimodal Fusion for User Interest Modeling in Click-Through Rate Prediction
Alin Fan, Hanqing Li, Sihan Lu, Jingsong Yuan, Jiandong Zhang
https://arxiv.org/abs/2510.11066

Decoupled Multimodal Fusion for User Interest Modeling in Click-Through Rate Prediction
Modern industrial recommendation systems improve recommendation performance by integrating multimodal representations from pre-trained models into ID-based Click-Through Rate (CTR) prediction frameworks. However, existing approaches typically adopt modality-centric modeling strategies that process ID-based and multimodal embeddings independently, failing to capture fine-grained interactions between content semantics and behavioral signals. In this paper, we propose Decoupled Multimodal Fusion (…

@arXiv_csCL_bot@mastoxiv.page
2025-10-13 10:40:30

Hierarchical Indexing with Knowledge Enrichment for Multilingual Video Corpus Retrieval
Yu Wang, Tianhao Tan, Yifei Wang
https://arxiv.org/abs/2510.09553 https://

Hierarchical Indexing with Knowledge Enrichment for Multilingual Video Corpus Retrieval
Retrieving relevant instructional videos from multilingual medical archives is crucial for answering complex, multi-hop questions across language boundaries. However, existing systems either compress hour-long videos into coarse embeddings or incur prohibitive costs for fine-grained matching. We tackle the Multilingual Video Corpus Retrieval (mVCR) task in the NLPCC-2025 M4IVQA challenge with a multi-stage framework that integrates multilingual semantics, domain terminology, and efficient long-…

@arXiv_csLG_bot@mastoxiv.page
2025-10-08 10:45:59

NEO: No-Optimization Test-Time Adaptation through Latent Re-Centering
Alexander Murphy, Michal Danilowski, Soumyajit Chatterjee, Abhirup Ghosh
https://arxiv.org/abs/2510.05635 h…

NEO: No-Optimization Test-Time Adaptation through Latent Re-Centering
Test-Time Adaptation (TTA) methods are often computationally expensive, require a large amount of data for effective adaptation, or are brittle to hyperparameters. Based on a theoretical foundation of the geometry of the latent space, we are able to significantly improve the alignment between source and distribution-shifted samples by re-centering target data embeddings at the origin. This insight motivates NEO -- a hyperparameter-free fully TTA method, that adds no significant compute compared…

@arXiv_csIR_bot@mastoxiv.page
2025-10-14 11:46:09

QDER: Query-Specific Document and Entity Representations for Multi-Vector Document Re-Ranking
Shubham Chatterjee, Jeff Dalton
https://arxiv.org/abs/2510.11589 https://

QDER: Query-Specific Document and Entity Representations for Multi-Vector Document Re-Ranking
Neural IR has advanced through two distinct paths: entity-oriented approaches leveraging knowledge graphs and multi-vector models capturing fine-grained semantics. We introduce QDER, a neural re-ranking model that unifies these approaches by integrating knowledge graph semantics into a multi-vector model. QDER's key innovation lies in its modeling of query-document relationships: rather than computing similarity scores on aggregated embeddings, we maintain individual token and entity representa…

@arXiv_csSD_bot@mastoxiv.page
2025-10-07 08:04:49

Linguistic and Audio Embedding-Based Machine Learning for Alzheimer's Dementia and Mild Cognitive Impairment Detection: Insights from the PROCESS Challenge
Adharsha Sam Edwin Sam Devahi, Sohail Singh Sangha, Prachee Priyadarshinee, Jithin Thilakan, Ivan Fu Xing Tan, Christopher Johann Clarke, Sou Ka Lon, Balamurali B T, Yow Wei Quin, Chen Jer-Ming
https://

Linguistic and Audio Embedding-Based Machine Learning for Alzheimer's Dementia and Mild Cognitive Impairment Detection: Insights from the PROCESS Challenge
Early detection of Alzheimer's Dementia (AD) and Mild Cognitive Impairment (MCI) is critical for timely intervention, yet current diagnostic approaches remain resource-intensive and invasive. Speech, encompassing both acoustic and linguistic dimensions, offers a promising non-invasive biomarker for cognitive decline. In this study, we present a machine learning framework for the PROCESS Challenge, leveraging both audio embeddings and linguistic features derived from spontaneous speech recording…

@arXiv_csIR_bot@mastoxiv.page
2025-10-13 07:45:40

Generative Data Augmentation in Graph Contrastive Learning for Recommendation
Yansong Wang, Qihui Lin, Junjie Huang, Tao Jia
https://arxiv.org/abs/2510.09129 https://

Generative Data Augmentation in Graph Contrastive Learning for Recommendation
Recommendation systems have become indispensable in various online platforms, from e-commerce to streaming services. A fundamental challenge in this domain is learning effective embeddings from sparse user-item interactions. While contrastive learning has recently emerged as a promising solution to this issue, generating augmented views for contrastive learning through most existing random data augmentation methods often leads to the alteration of original semantic information. In this paper, w…

@arXiv_eessAS_bot@mastoxiv.page
2025-10-07 10:06:42

Enhancing Speaker Verification with w2v-BERT 2.0 and Knowledge Distillation guided Structured Pruning
Ze Li, Ming Cheng, Ming Li
https://arxiv.org/abs/2510.04213 https://…

Enhancing Speaker Verification with w2v-BERT 2.0 and Knowledge Distillation guided Structured Pruning
Large-scale self-supervised Pre-Trained Models (PTMs) have shown significant improvements in the speaker verification (SV) task by providing rich feature representations. In this paper, we utilize w2v-BERT 2.0, a model with approximately 600 million parameters trained on 450 million hours of unlabeled data across 143 languages, for the SV task. The MFA structure with Layer Adapter is employed to process the multi-layer feature outputs from the PTM and extract speaker embeddings. Additionally, w…

@arXiv_csLG_bot@mastoxiv.page
2025-10-02 15:19:05

Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[1/7]:
- Learning Dynamic Graph Embeddings with Neural Controlled Differential Equations
Tiexin Qin, Benjamin Walker, Terry Lyons, Hong Yan, Haoliang Li

@arXiv_csAI_bot@mastoxiv.page
2025-09-26 09:56:51

Federation of Agents: A Semantics-Aware Communication Fabric for Large-Scale Agentic AI
Lorenzo Giusti, Ole Anton Werner, Riccardo Taiello, Matilde Carvalho Costa, Emre Tosun, Andrea Protani, Marc Molina, Rodrigo Lopes de Almeida, Paolo Cacace, Diogo Reis Santos, Luigi Serio
https://arxiv.org/abs/2509.20175…

Federation of Agents: A Semantics-Aware Communication Fabric for Large-Scale Agentic AI
We present Federation of Agents (FoA), a distributed orchestration framework that transforms static multi-agent coordination into dynamic, capability-driven collaboration. FoA introduces Versioned Capability Vectors (VCVs): machine-readable profiles that make agent capabilities searchable through semantic embeddings, enabling agents to advertise their capabilities, cost, and limitations. Our architecture combines three key innovations: (1) semantic routing that matches tasks to agents over shar…

@arXiv_csCL_bot@mastoxiv.page
2025-10-06 10:16:39

PGMEL: Policy Gradient-based Generative Adversarial Network for Multimodal Entity Linking
KM Pooja, Cheng Long, Aixin Sun
https://arxiv.org/abs/2510.02726 https://

PGMEL: Policy Gradient-based Generative Adversarial Network for Multimodal Entity Linking
The task of entity linking, which involves associating mentions with their respective entities in a knowledge graph, has received significant attention due to its numerous potential applications. Recently, various multimodal entity linking (MEL) techniques have been proposed, targeted to learn comprehensive embeddings by leveraging both text and vision modalities. The selection of high-quality negative samples can potentially play a crucial role in metric/representation learning. However, to th…

@arXiv_csAI_bot@mastoxiv.page
2025-09-25 09:16:12

Federation of Agents: A Semantics-Aware Communication Fabric for Large-Scale Agentic AI
Lorenzo Giusti, Ole Anton Werner, Riccardo Taiello, Matilde Carvalho Costa, Emre Tosun, Andrea Protani, Marc Molina, Rodrigo Lopes de Almeida, Paolo Cacace, Diogo Reis Santos, Luigi Serio
https://arxiv.org/abs/2509.20175…

Federation of Agents: A Semantics-Aware Communication Fabric for Large-Scale Agentic AI
We present Federation of Agents (FoA), a distributed orchestration framework that transforms static multi-agent coordination into dynamic, capability-driven collaboration. FoA introduces Versioned Capability Vectors (VCVs): machine-readable profiles that make agent capabilities searchable through semantic embeddings, enabling agents to advertise their capabilities, cost, and limitations. Our architecture combines three key innovations: (1) semantic routing that matches tasks to agents over shar…

@arXiv_csSD_bot@mastoxiv.page
2025-09-29 09:43:47

Cross-Dialect Bird Species Recognition with Dialect-Calibrated Augmentation
Jiani Ding, Qiyang Sun, Alican Akman, Björn W. Schuller
https://arxiv.org/abs/2509.22317 https…

Cross-Dialect Bird Species Recognition with Dialect-Calibrated Augmentation
Dialect variation hampers automatic recognition of bird calls collected by passive acoustic monitoring. We address the problem on DB3V, a three-region, ten-species corpus of 8-s clips, and propose a deployable framework built on Time-Delay Neural Networks (TDNNs). Frequency-sensitive normalisation (Instance Frequency Normalisation and a gated Relaxed-IFN) is paired with gradient-reversal adversarial training to learn region-invariant embeddings. A multi-level augmentation scheme combines wavefo…

@arXiv_csLG_bot@mastoxiv.page
2025-09-25 10:44:42

Probability Signature: Bridging Data Semantics and Embedding Structure in Language Models
Junjie Yao, Zhi-Qin John Xu
https://arxiv.org/abs/2509.20124 https://

Probability Signature: Bridging Data Semantics and Embedding Structure in Language Models
The embedding space of language models is widely believed to capture the semantic relationships; for instance, embeddings of digits often exhibit an ordered structure that corresponds to their natural sequence. However, the mechanisms driving the formation of such structures remain poorly understood. In this work, we interpret the embedding structures via the data distribution. We propose a set of probability signatures that reflect the semantic relationships among tokens. Through experiments o…

@arXiv_csIR_bot@mastoxiv.page
2025-10-06 07:35:29

Revisiting Query Variants: The Advantage of Retrieval Over Generation of Query Variants for Effective QPP
Fangzheng Tian, Debasis Ganguly, Craig Macdonald
https://arxiv.org/abs/2510.02512

Revisiting Query Variants: The Advantage of Retrieval Over Generation of Query Variants for Effective QPP
Leveraging query variants (QVs), i.e., queries with potentially similar information needs to the target query, has been shown to improve the effectiveness of query performance prediction (QPP) approaches. Existing QV-based QPP methods generate QVs facilitated by either query expansion or non-contextual embeddings, which may introduce topical drifts and hallucinations. In this paper, we propose a method that retrieves QVs from a training set (e.g., MS MARCO) for a given target query of QPP. To a…

@arXiv_csCL_bot@mastoxiv.page
2025-09-30 14:06:01

How Well Do LLMs Imitate Human Writing Style?
Rebira Jemama, Rajesh Kumar
https://arxiv.org/abs/2509.24930 https://arxiv.org/pdf/2509.24930

How Well Do LLMs Imitate Human Writing Style?
Large language models (LLMs) can generate fluent text, but their ability to replicate the distinctive style of a specific human author remains unclear. We present a fast, training-free framework for authorship verification and style imitation analysis. The method integrates TF-IDF character n-grams with transformer embeddings and classifies text pairs through empirical distance distributions, eliminating the need for supervised training or threshold tuning. It achieves 97.5% accuracy on academ…

@arXiv_csSD_bot@mastoxiv.page
2025-09-25 08:01:12

ArtiFree: Detecting and Reducing Generative Artifacts in Diffusion-based Speech Enhancement
Bhawana Chhaglani, Yang Gao, Julius Richter, Xilin Li, Syavosh Zadissa, Tarun Pruthi, Andrew Lovitt
https://arxiv.org/abs/2509.19495

ArtiFree: Detecting and Reducing Generative Artifacts in Diffusion-based Speech Enhancement
Diffusion-based speech enhancement (SE) achieves natural-sounding speech and strong generalization, yet suffers from key limitations like generative artifacts and high inference latency. In this work, we systematically study artifact prediction and reduction in diffusion-based SE. We show that variance in speech embeddings can be used to predict phonetic errors during inference. Building on these findings, we propose an ensemble inference method guided by semantic consistency across multiple di…

@arXiv_csCL_bot@mastoxiv.page
2025-09-30 14:12:22

jina-reranker-v3: Last but Not Late Interaction for Document Reranking
Feng Wang, Yuqing Li, Han Xiao
https://arxiv.org/abs/2509.25085 https://arxiv.org/pd…

jina-reranker-v3: Last but Not Late Interaction for Document Reranking
jina-reranker-v3 is a 0.6B parameter multilingual document reranker that introduces a novel last but not late interaction. Unlike late interaction models such as ColBERT that perform separate encoding followed by multi-vector matching, our approach conducts causal self-attention between query and documents within the same context window, enabling rich cross-document interactions before extracting contextual embeddings from the last token of each document. This compact architecture achieves stat…

@arXiv_csIR_bot@mastoxiv.page
2025-10-02 09:49:01

Deep Learning-Based Approach for Improving Relational Aggregated Search
Sara Saad Soliman, Ahmed Younes, Islam Elkabani, Ashraf Elsayed
https://arxiv.org/abs/2510.00966 https://…

Deep Learning-Based Approach for Improving Relational Aggregated Search
Due to an information explosion on the internet, there is a need for the development of aggregated search systems that can boost the retrieval and management of content in various formats. To further improve the clustering of Arabic text data in aggregated search environments, this research investigates the application of advanced natural language processing techniques, namely stacked autoencoders and AraBERT embeddings. By transcending the limitations of traditional search engines, which are i…

@arXiv_csIR_bot@mastoxiv.page
2025-09-30 08:20:18

Federated Consistency- and Complementarity-aware Consensus-enhanced Recommendation
Yunqi Mi, Boyang Yan, Guoshuai Zhao, Jialie Shen, Xueming Qian
https://arxiv.org/abs/2509.22659

Federated Consistency- and Complementarity-aware Consensus-enhanced Recommendation
Personalized federated recommendation system (FedRec) has gained significant attention for its ability to preserve privacy in delivering tailored recommendations. To alleviate the statistical heterogeneity challenges among clients and improve personalization, decoupling item embeddings into the server and client-specific views has become a promising way. Among them, the global item embedding table serves as a consensus representation that integrates and reflects the collective patterns across a…
