Finetuning AI Foundation Models to Develop Subgrid-Scale Parameterizations: A Case Study on Atmospheric Gravity Waves
Aman Gupta, Aditi Sheshadri, Sujit Roy, Johannes Schmude, Vishal Gaur, Wei Ji Leong, Manil Maskey, Rahul Ramachandran
https://arxiv.org/abs/2509.03816
FideDiff: Efficient Diffusion Model for High-Fidelity Image Motion Deblurring
Xiaoyang Liu, Zhengyan Zhou, Zihang Xu, Jiezhang Cao, Zheng Chen, Yulun Zhang
https://arxiv.org/abs/2510.01641
LobRA: Multi-tenant Fine-tuning over Heterogeneous Data
Sheng Lin, Fangcheng Fu, Haoyang Li, Hao Ge, Xuanyu Wang, Jiawen Niu, Yaofeng Tu, Bin Cui
https://arxiv.org/abs/2509.01193
Speaker-Conditioned Phrase Break Prediction for Text-to-Speech with Phoneme-Level Pre-trained Language Model
Dong Yang, Yuki Saito, Takaaki Saeki, Tomoki Koriyama, Wataru Nakata, Detai Xin, Hiroshi Saruwatari
https://arxiv.org/abs/2509.00675
Smart Contract Intent Detection with Pre-trained Programming Language Model
Youwei Huang, Jianwen Li, Sen Fang, Yao Li, Peng Yang, Bin Hu, Tao Zhang
https://arxiv.org/abs/2508.20086
Dirichlet-Prior Shaping: Guiding Expert Specialization in Upcycled MoEs
Leyla Mirvakhabova, Babak Ehteshami Bejnordi, Gaurav Kumar, Hanxue Liang, Wanru Zhao, Paul Whatmough
https://arxiv.org/abs/2510.01185
Migration as a Probe: A Generalizable Benchmark Framework for Specialist vs. Generalist Machine-Learned Force Fields in Doped Materials
Yi Cao, Paulette Clancy
https://arxiv.org/abs/2509.00090
Improving Pre-Trained Vision-Language-Action Policies with Model-Based Search
Cyrus Neary, Omar G. Younis, Artur Kuramshin, Ozgur Aslan, Glen Berseth
https://arxiv.org/abs/2508.12211
LTA-L2S: Lexical Tone-Aware Lip-to-Speech Synthesis for Mandarin with Cross-Lingual Transfer Learning
Kang Yang, Yifan Liang, Fangkun Liu, Zhenping Xie, Chengshi Zheng
https://arxiv.org/abs/2509.25670
NSARM: Next-Scale Autoregressive Modeling for Robust Real-World Image Super-Resolution
Xiangtao Kong, Rongyuan Wu, Shuaizheng Liu, Lingchen Sun, Lei Zhang
https://arxiv.org/abs/2510.00820
MixedG2P-T5: G2P-free Speech Synthesis for Mixed-script texts using Speech Self-Supervised Learning and Language Model
Joonyong Park, Daisuke Saito, Nobuaki Minematsu
https://arxiv.org/abs/2509.01391
Universal Inverse Distillation for Matching Models with Real-Data Supervision (No GANs)
Nikita Kornilov, David Li, Tikhon Mavrin, Aleksei Leonov, Nikita Gushchin, Evgeny Burnaev, Iaroslav Koshelev, Alexander Korotin
https://arxiv.org/abs/2509.22459
MicroRCA-Agent: Microservice Root Cause Analysis Method Based on Large Language Model Agents
Pan Tang, Shixiang Tang, Huanqi Pu, Zhiqing Miao, Zhixing Wang
https://arxiv.org/abs/2509.15635
Pre-trained Transformer-models using chronic invasive electrophysiology for symptom decoding without patient-individual training
Timon Merk, Saeed Salehi, Richard M. Koehler, Qiming Cui, Maria Olaru, Amelia Hahn, Nicole R. Provenza, Simon Little, Reza Abbasi-Asl, Phil A. Starr, Wolf-Julian Neumann
https://arxiv.org/abs/2508.10160
Ultra-Fast Language Generation via Discrete Diffusion Divergence Instruct
Haoyang Zheng, Xinyang Liu, Cindy Xiangrui Kong, Nan Jiang, Zheyuan Hu, Weijian Luo, Wei Deng, Guang Lin
https://arxiv.org/abs/2509.25035
A Deep Transfer Learning-Based Low-overhead Beam Prediction in Vehicle Communications
Zhiqiang Xiao, Yuwen Cao, Mondher Bouazizi, Tomoaki Ohtsuki, Shahid Mumtaz
https://arxiv.org/abs/2509.20659
Recidivism and Peer Influence with LLM Text Embeddings in Low Security Correctional Facilities
Shanjukta Nath, Jiwon Hong, Jae Ho Chang, Keith Warren, Subhadeep Paul
https://arxiv.org/abs/2509.20634
A Sentinel-3 foundation model for ocean colour
Geoffrey Dawson, Remy Vandaele, Andrew Taylor, David Moffat, Helen Tamura-Wicks, Sarah Jackson, Rosie Lickorish, Paolo Fraccaro, Hywel Williams, Chunbo Luo, Anne Jones
https://arxiv.org/abs/2509.21273
U-SWIFT: A Unified Surface Wave Inversion Framework with Transformer via Normalization of Dispersion Curves
Tianjian Cheng, Hongrui Xu, Jiayu Feng, Xiongyu Hu, Chaofan Yao
https://arxiv.org/abs/2509.24872
LLMulator: Generalizable Cost Modeling for Dataflow Accelerators with Input-Adaptive Control Flow
Kaiyan Chang, Wenlong Zhu, Shengwen Liang, Huawei Li, Ying Wang
https://arxiv.org/abs/2508.17826
Radio Galaxy Zoo: Morphological classification by Fanaroff-Riley designation using self-supervised pre-training
Nutthawara Buatthaisong, Inigo Val Slijepcevic, Anna M. M. Scaife, Micah Bowles, Andrew Hopkins, Devina Mohan, Stanislav S Shabala, O. Ivy Wong
https://arxiv.org/abs/2509.11988
FusionMAE: large-scale pretrained model to optimize and simplify diagnostic and control of fusion plasma
Zongyu Yang, Zhenghao Yang, Wenjing Tian, Jiyuan Li, Xiang Sun, Guohui Zheng, Songfen Liu, Niannian Wu, Rongpeng Li, Zhaohe Xu, Bo Li, Zhongbing Shi, Zhe Gao, Wei Chen, Xiaoquan Ji, Min Xu, Wulyu Zhong
https://arxiv.org/abs/2509.12945
Knowledge-Driven Hallucination in Large Language Models: An Empirical Study on Process Modeling
Humam Kourani, Anton Antonov, Alessandro Berti, Wil M. P. van der Aalst
https://arxiv.org/abs/2509.15336
Exploring Self-Supervised Audio Models for Generalized Anomalous Sound Detection
Bing Han, Anbai Jiang, Xinhu Zheng, Wei-Qiang Zhang, Jia Liu, Pingyi Fan, Yanmin Qian
https://arxiv.org/abs/2508.12230
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
Yuxian Gu, Qinghao Hu, Shang Yang, Haocheng Xi, Junyu Chen, Song Han, Han Cai
https://arxiv.org/abs/2508.15884
FlowVLA: Thinking in Motion with a Visual Chain of Thought
Zhide Zhong, Haodong Yan, Junfeng Li, Xiangchen Liu, Xin Gong, Wenxuan Song, Jiayi Chen, Haoang Li
https://arxiv.org/abs/2508.18269
Classical Neural Networks on Quantum Devices via Tensor Network Disentanglers: A Case Study in Image Classification
Borja Aizpurua, Sukhbinder Singh, Román Orús
https://arxiv.org/abs/2509.06653
Time-Aware One Step Diffusion Network for Real-World Image Super-Resolution
Tainyi Zhang, Zheng-Peng Duan, Peng-Tao Jiang, Bo Li, Ming-Ming Cheng, Chun-Le Guo, Chongyi Li
https://arxiv.org/abs/2508.16557
DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder
Junyu Chen, Wenkun He, Yuchao Gu, Yuyang Zhao, Jincheng Yu, Junsong Chen, Dongyun Zou, Yujun Lin, Zhekai Zhang, Muyang Li, Haocheng Xi, Ligeng Zhu, Enze Xie, Song Han, Han Cai
https://arxiv.org/abs/2509.25182
SONAR: Self-Distilled Continual Pre-training for Domain Adaptive Audio Representation
Yizhou Zhang, Yuan Gao, Wangjin Zhou, Zicheng Yuan, Keisuke Imoto, Tatsuya Kawahara
https://arxiv.org/abs/2509.15703
In-Context Learning as Nonparametric Conditional Probability Estimation: Risk Bounds and Optimality
Chenrui Liu, Falong Tan, Chuanlong Xie, Yicheng Zeng, Lixing Zhu
https://arxiv.org/abs/2508.08673
PAD3R: Pose-Aware Dynamic 3D Reconstruction from Casual Videos
Ting-Hsuan Liao, Haowen Liu, Yiran Xu, Songwei Ge, Gengshan Yang, Jia-Bin Huang
https://arxiv.org/abs/2509.25183
Replaced article(s) found for physics.geo-ph. https://arxiv.org/list/physics.geo-ph/new
[1/1]:
- PRIME-DP: Pre-trained Integrated Model for Earthquake Data Processing
Ziye Yu, Yuqi Cai, Weitao Wang, Yanru An, Lu Li, Yueyang Xia, Yunpeng Zhang
LMAR: Language Model Augmented Retriever for Domain-specific Knowledge Indexing
Yao Zhao, Yantian Ding, Zhiyue Zhang, Dapeng Yao, Yanxun Xu
https://arxiv.org/abs/2508.05672
Composition and Alignment of Diffusion Models using Constrained Learning
Shervin Khalafi, Ignacio Hounie, Dongsheng Ding, Alejandro Ribeiro
https://arxiv.org/abs/2508.19104
IP-Augmented Multi-Modal Malicious URL Detection Via Token-Contrastive Representation Enhancement and Multi-Granularity Fusion
Ye Tian, Yanqiu Yu, Liangliang Song, Zhiquan Liu, Yanbin Wang, Jianguo Sun
https://arxiv.org/abs/2510.12395
Seeing is Believing: Emotion-Aware Audio-Visual Language Modeling for Expressive Speech Generation
Weiting Tan, Jiachen Lian, Hirofumi Inaguma, Paden Tomasello, Philipp Koehn, Xutai Ma
https://arxiv.org/abs/2508.16188
FS-SAM2: Adapting Segment Anything Model 2 for Few-Shot Semantic Segmentation via Low-Rank Adaptation
Bernardo Forni, Gabriele Lombardi, Federico Pozzi, Mirco Planamente
https://arxiv.org/abs/2509.12105
Replaced article(s) found for cs.SE. https://arxiv.org/list/cs.SE/new
[1/1]:
- "I see models being a whole other thing": An Empirical Study of Pre-Trained Model Naming Conventi...
Wenxin Jiang, Mingyu Kim, Chingwo Cheung, Heesoo Kim, George K. Thiruvathukal, James C. Davis
UNICON: UNIfied CONtinual Learning for Medical Foundational Models
Mohammad Areeb Qazi, Munachiso S Nwadike, Ibrahim Almakky, Mohammad Yaqub, Numan Saeed
https://arxiv.org/abs/2508.14024
Scalable Evaluation for Audio Identification via Synthetic Latent Fingerprint Generation
Aditya Bhattacharjee, Marco Pasini, Emmanouil Benetos
https://arxiv.org/abs/2509.18620
Personalized Product Search Ranking: A Multi-Task Learning Approach with Tabular and Non-Tabular Data
Lalitesh Morishetti, Abhay Kumar, Jonathan Scott, Kaushiki Nag, Gunjan Sharma, Shanu Vashishtha, Rahul Sridhar, Rohit Chatter, Kannan Achan
https://arxiv.org/abs/2508.09636
Amortized In-Context Mixed Effect Transformer Models: A Zero-Shot Approach for Pharmacokinetics
César Ali Ojeda Marin, Wilhelm Huisinga, Purity Kavwele, Niklas Hartung
https://arxiv.org/abs/2508.15659
MAGIC-Enhanced Keyword Prompting for Zero-Shot Audio Captioning with CLIP Models
Vijay Govindarajan, Pratik Patel, Sahil Tripathi, Md Azizul Hoque, Gautam Siddharth Kashyap
https://arxiv.org/abs/2509.12591
Benchmarking CHGNet Universal Machine Learning Interatomic Potential Against DFT and EXAFS: Case of Layered WS2 and MoS2
Pjotrs Žguns, Inga Pudza, Alexei Kuzmin
https://arxiv.org/abs/2509.08498
Can maiBERT Speak for Maithili?
Sumit Yadav, Raju Kumar Yadav, Utsav Maskey, Gautam Siddharth Kashyap, Md Azizul Hoque, Ganesh Gautam
https://arxiv.org/abs/2509.15048
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[2/6]:
- SCoT: Straight Consistent Trajectory for Pre-Trained Diffusion Model Distillations
Zhangkai Wu, Xuhui Fan, Hongyu Wu, Longbing Cao
Mellum: Production-Grade in-IDE Contextual Code Completion with Multi-File Project Understanding
Nikita Pavlichenko, Iurii Nazarov, Ivan Dolgov, Ekaterina Garanina, Dmitry Ustalov, Ivan Bondyrev, Kseniia Lysaniuk, Evgeniia Vu, Kirill Chekmenev, Joseph Shtok, Yaroslav Golubev, Anton Semenkin, Uladzislau Sazanovich
https://arxiv.org/abs/2510…
Mitigating data replication in text-to-audio generative diffusion models through anti-memorization guidance
Francisco Messina, Francesca Ronchini, Luca Comanducci, Paolo Bestagini, Fabio Antonacci
https://arxiv.org/abs/2509.14934
Canary-1B-v2 & Parakeet-TDT-0.6B-v3: Efficient and High-Performance Models for Multilingual ASR and AST
Monica Sekoyan, Nithin Rao Koluguri, Nune Tadevosyan, Piotr Zelasko, Travis Bartley, Nick Karpov, Jagadeesh Balam, Boris Ginsburg
https://arxiv.org/abs/2509.14128
SegDINO3D: 3D Instance Segmentation Empowered by Both Image-Level and Object-Level 2D Features
Jinyuan Qu, Hongyang Li, Xingyu Chen, Shilong Liu, Yukai Shi, Tianhe Ren, Ruitao Jing, Lei Zhang
https://arxiv.org/abs/2509.16098
CDE: Concept-Driven Exploration for Reinforcement Learning
Le Mao, Andrew H. Liu, Renos Zabounidis, Zachary Kingston, Joseph Campbell
https://arxiv.org/abs/2510.08851
BIR-Adapter: A Low-Complexity Diffusion Model Adapter for Blind Image Restoration
Cem Eteke, Alexander Griessel, Wolfgang Kellerer, Eckehard Steinbach
https://arxiv.org/abs/2509.06904
SpeechOp: Inference-Time Task Composition for Generative Speech Processing
Justin Lovelace, Rithesh Kumar, Jiaqi Su, Ke Chen, Kilian Q Weinberger, Zeyu Jin
https://arxiv.org/abs/2509.14298
Representation-Based Exploration for Language Models: From Test-Time to Post-Training
Jens Tuyls, Dylan J. Foster, Akshay Krishnamurthy, Jordan T. Ash
https://arxiv.org/abs/2510.11686
Towards Unveiling Predictive Uncertainty Vulnerabilities in the Context of the Right to Be Forgotten
Wei Qian, Chenxu Zhao, Yangyi Li, Wenqian Ye, Mengdi Huai
https://arxiv.org/abs/2508.07458
CoRA: Covariate-Aware Adaptation of Time Series Foundation Models
Guo Qin, Zhi Chen, Yong Liu, Zhiyuan Shi, Haixuan Liu, Xiangdong Huang, Jianmin Wang, Mingsheng Long
https://arxiv.org/abs/2510.12681
Two-Stage Swarm Intelligence Ensemble Deep Transfer Learning (SI-EDTL) for Vehicle Detection Using Unmanned Aerial Vehicles
Zeinab Ghasemi Darehnaei, Mohammad Shokouhifar, Hossein Yazdanjouei, S. M. J. Rastegar Fatemi
https://arxiv.org/abs/2509.08026

Two-Stage Swarm Intelligence Ensemble Deep Transfer Learning (SI-EDTL) for Vehicle Detection Using Unmanned Aerial Vehicles
This paper introduces SI-EDTL, a two-stage swarm intelligence ensemble deep transfer learning model for detecting multiple vehicles in UAV images. It combines three pre-trained Faster R-CNN feature extractor models (InceptionV3, ResNet50, GoogLeNet) with five transfer classifiers (KNN, SVM, MLP, C4.5, Naïve Bayes), resulting in 15 different base learners. These are aggregated via weighted averaging to classify regions as Car, Van, Truck, Bus, or background. Hyperparameters are optimized with t…
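A minimal sketch of the aggregation step described above, assuming hypothetical interfaces: the predict_proba stub below stands in for a real (Faster R-CNN backbone, transfer classifier) pair, and the per-learner weights would come from the paper's swarm-intelligence hyperparameter search rather than being set by hand. This is an illustration of weighted-average ensembling over the 15 base learners, not the authors' code.

import numpy as np

CLASSES = ["Car", "Van", "Truck", "Bus", "background"]
EXTRACTORS = ["InceptionV3", "ResNet50", "GoogLeNet"]        # Faster R-CNN feature extractors
CLASSIFIERS = ["KNN", "SVM", "MLP", "C4.5", "NaiveBayes"]    # transfer classifiers

def predict_proba(extractor, classifier, region):
    """Placeholder for one base learner's class-probability output on a region proposal."""
    rng = np.random.default_rng(abs(hash((extractor, classifier))) % (2**32))
    p = rng.random(len(CLASSES))
    return p / p.sum()

def ensemble_predict(region, weights):
    """Weighted average of all 15 base learners' probabilities; argmax gives the label."""
    total = np.zeros(len(CLASSES))
    norm = 0.0
    for ext in EXTRACTORS:
        for clf in CLASSIFIERS:
            w = weights.get((ext, clf), 1.0)
            total += w * predict_proba(ext, clf, region)
            norm += w
    return CLASSES[int(np.argmax(total / norm))]

if __name__ == "__main__":
    region = np.zeros((224, 224, 3))  # dummy region proposal
    weights = {(e, c): 1.0 for e in EXTRACTORS for c in CLASSIFIERS}  # in the paper, tuned by swarm search
    print(ensemble_predict(region, weights))

With uniform weights this reduces to plain probability averaging; the swarm-intelligence stage exists precisely to find non-uniform weights that favor the stronger (backbone, classifier) pairs.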
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[3/5]:
- VisionTS : Cross-Modal Time Series Foundation Model with Continual Pre-trained Vision Backbones
Lefei Shen, Mouxiang Chen, Xu Liu, Han Fu, Xiaoxue Ren, Jianling Sun, Zhuo Li, Chenghao Liu