VL-JEPA: Joint Embedding Predictive Architecture for Vision-language
Delong Chen, Mustafa Shukor, Theo Moutakanni, Willy Chung, Jade Yu, Tejaswi Kasarla, Allen Bolourchi, Yann LeCun, Pascale Fung
https://arxiv.org/abs/2512.10942
Microsoft AI CEO Mustafa Suleyman lays out the company's plans to develop AI self-sufficiency from OpenAI, like releasing its own voice, image, and text models (Sebastian Herrera/Wall Street Journal)
https://www.wsj.com/tech/a…
Empowering Dynamic Urban Navigation with Stereo and Mid-Level Vision
Wentao Zhou, Xuweiyi Chen, Vignesh Rajagopal, Jeffrey Chen, Rohan Chandra, Zezhou Cheng
https://arxiv.org/abs/2512.10956
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[4/5]:
- Fairness-Aware Fine-Tuning of Vision-Language Models for Medical Glaucoma Diagnosis
Zijian Gu, Yuxi Liu, Zhenhao Zhang, Song Wang
Roadmap: Emerging Platforms and Applications of Optical Frequency Combs and Dissipative Solitons
Dmitry Skryabin, Arne Kordts, Richard Zeltner, Ronald Holzwarth, Victor Torres-Company, Tobias Herr, Fuchuan Lei, Qi-Fan Yang, Camille-Sophie Br\`es, John F. Donegan, Hai-Zhong Weng, Delphine Marris-Morini, Adel Bousseksou, Markku Vainio, Thomas Bunel, Matteo Conforti, Arnaud Mussot, Erwan Lucas, Julien Fatome, Yuk Shan Cheng, Derryck T. Reid, Alessia Pasquazi, Marco Peccianti, M. Giudici, M. Marconi, A. Bartolo, N. Vigne, B. Chomet, A. Garnache, G. Beaudoin, I. Sagnes, Richard Burguete, Sarah Hammer, Jonathan Silver
https://arxiv.org/abs/2511.18231 https://arxiv.org/pdf/2511.18231 https://arxiv.org/html/2511.18231
arXiv:2511.18231v1 Announce Type: new
Abstract: The discovery of optical frequency combs (OFCs) has revolutionised science and technology by bridging electronics and photonics, driving major advances in precision measurements, atomic clocks, spectroscopy, telecommunications, and astronomy. However, current OFC systems still require further development to enable broader adoption in fields such as communication, aerospace, defence, and healthcare. There is a growing need for compact, portable OFCs that deliver high output power, robust self-referencing, and application-specific spectral coverage. On the conceptual side, progress toward such systems is hindered by an incomplete understanding of the fundamental principles governing OFC generation in emerging devices and materials, as well as evolving insights into the interplay between soliton and mode-locking effects. This roadmap presents the vision of a diverse group of academic and industry researchers and educators from Europe, along with their collaborators, on the current status and future directions of OFC science. It highlights a multidisciplinary approach that integrates novel physics, engineering innovation, and advanced researcher training. Topics include advances in soliton science as it relates to OFCs, the extension of OFC spectra into the visible and mid-infrared ranges, metrology applications and noise performance of integrated OFC sources, new fibre-based OFC modules, OFC lasers and OFC applications in astronomy.
toXiv_bot_toot
Exploiting ID-Text Complementarity via Ensembling for Sequential Recommendation
Liam Collins, Bhuvesh Kumar, Clark Mingxuan Ju, Tong Zhao, Donald Loveland, Leonardo Neves, Neil Shah
https://arxiv.org/abs/2512.17820 https://arxiv.org/pdf/2512.17820 https://arxiv.org/html/2512.17820
arXiv:2512.17820v1 Announce Type: new
Abstract: Modern Sequential Recommendation (SR) models commonly utilize modality features to represent items, motivated in large part by recent advancements in language and vision modeling. To do so, several works completely replace ID embeddings with modality embeddings, claiming that modality embeddings render ID embeddings unnecessary because they can match or even exceed ID embedding performance. On the other hand, many works jointly utilize ID and modality features, but posit that complex fusion strategies, such as multi-stage training and/or intricate alignment architectures, are necessary for this joint utilization. However, underlying both these lines of work is a lack of understanding of the complementarity of ID and modality features. In this work, we address this gap by studying the complementarity of ID- and text-based SR models. We show that these models do learn complementary signals, meaning that either should provide performance gain when used properly alongside the other. Motivated by this, we propose a new SR method that preserves ID-text complementarity through independent model training, then harnesses it through a simple ensembling strategy. Despite this method's simplicity, we show it outperforms several competitive SR baselines, implying that both ID and text features are necessary to achieve state-of-the-art SR performance but complex fusion architectures are not.
toXiv_bot_toot
Crosslisted article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[1/3]:
- Optimizing Text Search: A Novel Pattern Matching Algorithm Based on Ukkonen's Approach
Xinyu Guan, Shaohua Zhang
https://arxiv.org/abs/2512.16927 https://mastoxiv.page/@arXiv_csDS_bot/115762062326187898
- SpIDER: Spatially Informed Dense Embedding Retrieval for Software Issue Localization
Shravan Chaudhari, Rahul Thomas Jacob, Mononito Goswami, Jiajun Cao, Shihab Rashid, Christian Bock
https://arxiv.org/abs/2512.16956 https://mastoxiv.page/@arXiv_csSE_bot/115762248476963893
- MemoryGraft: Persistent Compromise of LLM Agents via Poisoned Experience Retrieval
Saksham Sahai Srivastava, Haoyu He
https://arxiv.org/abs/2512.16962 https://mastoxiv.page/@arXiv_csCR_bot/115762140339109012
- Colormap-Enhanced Vision Transformers for MRI-Based Multiclass (4-Class) Alzheimer's Disease Clas...
Faisal Ahmed
https://arxiv.org/abs/2512.16964 https://mastoxiv.page/@arXiv_eessIV_bot/115762196702065869
- Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows
Wanghan Xu, et al.
https://arxiv.org/abs/2512.16969 https://mastoxiv.page/@arXiv_csAI_bot/115762050529328276
- PAACE: A Plan-Aware Automated Agent Context Engineering Framework
Kamer Ali Yuksel
https://arxiv.org/abs/2512.16970 https://mastoxiv.page/@arXiv_csAI_bot/115762054461584205
- A Women's Health Benchmark for Large Language Models
Elisabeth Gruber, et al.
https://arxiv.org/abs/2512.17028 https://mastoxiv.page/@arXiv_csCL_bot/115762049873946945
- Perturb Your Data: Paraphrase-Guided Training Data Watermarking
Pranav Shetty, Mirazul Haque, Petr Babkin, Zhiqiang Ma, Xiaomo Liu, Manuela Veloso
https://arxiv.org/abs/2512.17075 https://mastoxiv.page/@arXiv_csCL_bot/115762077400293945
- Disentangled representations via score-based variational autoencoders
Benjamin S. H. Lyo, Eero P. Simoncelli, Cristina Savin
https://arxiv.org/abs/2512.17127 https://mastoxiv.page/@arXiv_statML_bot/115762251753966702
- Biosecurity-Aware AI: Agentic Risk Auditing of Soft Prompt Attacks on ESM-Based Variant Predictors
Huixin Zhan
https://arxiv.org/abs/2512.17146 https://mastoxiv.page/@arXiv_csCR_bot/115762318582013305
- Application of machine learning to predict food processing level using Open Food Facts
Arora, Chauhan, Rana, Aditya, Bhagat, Kumar, Kumar, Semar, Singh, Bagler
https://arxiv.org/abs/2512.17169 https://mastoxiv.page/@arXiv_qbioBM_bot/115762302873829397
- Systemic Risk Radar: A Multi-Layer Graph Framework for Early Market Crash Warning
Sandeep Neela
https://arxiv.org/abs/2512.17185 https://mastoxiv.page/@arXiv_qfinRM_bot/115762275982224870
- Do Foundational Audio Encoders Understand Music Structure?
Keisuke Toyama, Zhi Zhong, Akira Takahashi, Shusuke Takahashi, Yuki Mitsufuji
https://arxiv.org/abs/2512.17209 https://mastoxiv.page/@arXiv_csSD_bot/115762341541572505
- CheXPO-v2: Preference Optimization for Chest X-ray VLMs with Knowledge Graph Consistency
Xiao Liang, Yuxuan An, Di Wang, Jiawei Hu, Zhicheng Jiao, Bin Jing, Quan Wang
https://arxiv.org/abs/2512.17213 https://mastoxiv.page/@arXiv_csCV_bot/115762574180736975
- Machine Learning Assisted Parameter Tuning on Wavelet Transform Amorphous Radial Distribution Fun...
Deriyan Senjaya, Stephen Ekaputra Limantoro
https://arxiv.org/abs/2512.17245 https://mastoxiv.page/@arXiv_condmatmtrlsci_bot/115762447037143855
- AlignDP: Hybrid Differential Privacy with Rarity-Aware Protection for LLMs
Madhava Gaikwad
https://arxiv.org/abs/2512.17251 https://mastoxiv.page/@arXiv_csCR_bot/115762396593872943
- Practical Framework for Privacy-Preserving and Byzantine-robust Federated Learning
Baolei Zhang, Minghong Fang, Zhuqing Liu, Biao Yi, Peizhao Zhou, Yuan Wang, Tong Li, Zheli Liu
https://arxiv.org/abs/2512.17254 https://mastoxiv.page/@arXiv_csCR_bot/115762402470985707
- Verifiability-First Agents: Provable Observability and Lightweight Audit Agents for Controlling A...
Abhivansh Gupta
https://arxiv.org/abs/2512.17259 https://mastoxiv.page/@arXiv_csMA_bot/115762225538364939
- Warmer for Less: A Cost-Efficient Strategy for Cold-Start Recommendations at Pinterest
Saeed Ebrahimi, Weijie Jiang, Jaewon Yang, Olafur Gudmundsson, Yucheng Tu, Huizhong Duan
https://arxiv.org/abs/2512.17277 https://mastoxiv.page/@arXiv_csIR_bot/115762214396869930
- LibriVAD: A Scalable Open Dataset with Deep Learning Benchmarks for Voice Activity Detection
Ioannis Stylianou, Achintya kr. Sarkar, Nauman Dawalatabad, James Glass, Zheng-Hua Tan
https://arxiv.org/abs/2512.17281 https://mastoxiv.page/@arXiv_csSD_bot/115762361858560703
- Penalized Fair Regression for Multiple Groups in Chronic Kidney Disease
Carter H. Nakamoto, Lucia Lushi Chen, Agata Foryciarz, Sherri Rose
https://arxiv.org/abs/2512.17340 https://mastoxiv.page/@arXiv_statME_bot/115762446402738033
toXiv_bot_toot