Tootfinder

Opt-in global Mastodon full text search. Join the index!

@arXiv_csLG_bot@mastoxiv.page
2025-09-10 10:29:21

General Demographic Foundation Models for Enhancing Predictive Performance Across Diseases
Li-Chin Chen, Ji-Tian Sheu, Yuh-Jue Chuang
arxiv.org/abs/2509.07330

@arXiv_csCV_bot@mastoxiv.page
2025-09-09 12:30:52

BIR-Adapter: A Low-Complexity Diffusion Model Adapter for Blind Image Restoration
Cem Eteke, Alexander Griessel, Wolfgang Kellerer, Eckehard Steinbach
arxiv.org/abs/2509.06904

@privacity@social.linux.pizza
2025-07-06 23:39:20

Nature of Data in Pre-Trained Large Language Models
fpf.org/blog/nature-of-data-in
@…

@arXiv_quantph_bot@mastoxiv.page
2025-09-09 11:51:22

Classical Neural Networks on Quantum Devices via Tensor Network Disentanglers: A Case Study in Image Classification
Borja Aizpurua, Sukhbinder Singh, Román Orús
arxiv.org/abs/2509.06653

@arXiv_csSD_bot@mastoxiv.page
2025-07-08 11:09:50

Improving BERT for Symbolic Music Understanding Using Token Denoising and Pianoroll Prediction
Jun-You Wang, Li Su
arxiv.org/abs/2507.04776

@arXiv_eessAS_bot@mastoxiv.page
2025-07-10 08:25:51

Pronunciation-Lexicon Free Training for Phoneme-based Crosslingual ASR via Joint Stochastic Approximation
Saierdaer Yusuyin, Te Ma, Hao Huang, Zhijian Ou
arxiv.org/abs/2507.06249

@arXiv_csSE_bot@mastoxiv.page
2025-08-04 09:21:31

SPENCER: Self-Adaptive Model Distillation for Efficient Code Retrieval
Wenchao Gu, Zongyi Lyu, Yanlin Wang, Hongyu Zhang, Cuiyun Gao, Michael R. Lyu
arxiv.org/abs/2508.00546

@arXiv_csNI_bot@mastoxiv.page
2025-08-05 08:47:30

Convolutions are Competitive with Transformers for Encrypted Traffic Classification with Pre-training
Chungang Lin, Weiyao Zhang, Tianyu Zuo, Chao Zha, Yilong Jiang, Ruiqi Meng, Haitong Luo, Xuying Meng, Yujun Zhang
arxiv.org/abs/2508.02001

@arXiv_csCL_bot@mastoxiv.page
2025-08-06 10:18:30

fact check AI at SemEval-2025 Task 7: Multilingual and Crosslingual Fact-checked Claim Retrieval
Pranshu Rastogi
arxiv.org/abs/2508.03475 a…

@arXiv_csCV_bot@mastoxiv.page
2025-09-08 09:47:20

Efficient Video-to-Audio Generation via Multiple Foundation Models Mapper
Gehui Chen, Guan'an Wang, Xiaowen Huang, Jitao Sang
arxiv.org/abs/2509.04957

@pbloem@sigmoid.social
2025-06-26 10:41:24

New pre-print! #ai
**Universal pre-training by iterated random computation.**
⌨️🐒 A monkey behind a typewriter will produce the collected works of Shakespeare eventually.
💻🐒 But what if we put a monkey behind a computer?
⌨️🐒 needs to be lucky enough to type all characters of all of Shakespeare correctly. 💻🐒 only needs to be lucky enough to type a program for Shakespeare.

A table showing one string of random characters next to an emoji of a monkey next to a keyboard (representing a typewriter). Below it, three strings, also of random characters, but with more structure. Some characters and n-grams repeat. Next to these three strings is an emoji of a monkey next to a laptop computer. The caption reads: (⌨️🐒) A string of randomly sampled characters. (💻🐒) The result of passing this string through three randomly initialized neural network models. The latter data is …
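
For a concrete picture of what the 💻🐒 setup amounts to, here is a minimal sketch in PyTorch (not the author's code): a random token string is passed through a few randomly initialized, never-trained networks, and each pass adds structure. The vocabulary size, the LSTM architecture and the categorical sampling step are assumptions for illustration.

import torch
import torch.nn as nn

vocab_size, seq_len = 64, 256
tokens = torch.randint(0, vocab_size, (1, seq_len))  # ⌨️🐒: a string of uniformly random characters

class RandomLM(nn.Module):
    """A small, randomly initialized sequence model that is never trained."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, 128)
        self.rnn = nn.LSTM(128, 128, batch_first=True)
        self.out = nn.Linear(128, vocab_size)
    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)

with torch.no_grad():
    for _ in range(3):  # three random models, as in the figure above
        tokens = torch.distributions.Categorical(logits=RandomLM()(tokens)).sample()

# 💻🐒: `tokens` now shows repeated characters and n-grams, i.e. more structure than
# pure noise, and can serve as synthetic data for universal pre-training.
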
@arXiv_qfinST_bot@mastoxiv.page
2025-08-06 09:20:00

Kronos: A Foundation Model for the Language of Financial Markets
Yu Shi, Zongliang Fu, Shuo Chen, Bohan Zhao, Wei Xu, Changshui Zhang, Jian Li
arxiv.org/abs/2508.02739

@arXiv_csDC_bot@mastoxiv.page
2025-09-03 08:53:33

LobRA: Multi-tenant Fine-tuning over Heterogeneous Data
Sheng Lin, Fangcheng Fu, Haoyang Li, Hao Ge, Xuanyu Wang, Jiawen Niu, Yaofeng Tu, Bin Cui
arxiv.org/abs/2509.01193

@arXiv_physicsfludyn_bot@mastoxiv.page
2025-08-05 08:57:10

Fine-tuning physics-informed neural networks for cavity flows using coordinate transformation
Ryuta Takao, Satoshi Ii
arxiv.org/abs/2508.01122

@arXiv_physicsaoph_bot@mastoxiv.page
2025-09-05 08:18:41

Finetuning AI Foundation Models to Develop Subgrid-Scale Parameterizations: A Case Study on Atmospheric Gravity Waves
Aman Gupta, Aditi Sheshadri, Sujit Roy, Johannes Schmude, Vishal Gaur, Wei Ji Leong, Manil Maskey, Rahul Ramachandran
arxiv.org/abs/2509.03816

@arXiv_statML_bot@mastoxiv.page
2025-08-05 09:17:00

Uncertainty Quantification for Large-Scale Deep Networks via Post-StoNet Modeling
Yan Sun, Faming Liang
arxiv.org/abs/2508.01217 arxiv.org/…

@arXiv_csRO_bot@mastoxiv.page
2025-08-19 10:59:00

Improving Pre-Trained Vision-Language-Action Policies with Model-Based Search
Cyrus Neary, Omar G. Younis, Artur Kuramshin, Ozgur Aslan, Glen Berseth
arxiv.org/abs/2508.12211

@arXiv_eessSP_bot@mastoxiv.page
2025-09-03 11:50:13

Fluid Antenna Port Prediction based on Large Language Models
Yali Zhang, Haifan Yin, Weidong Li, Emil Bjornson, Merouane Debbah
arxiv.org/abs/2509.01121

@arXiv_csAI_bot@mastoxiv.page
2025-06-24 11:48:20

How Robust is Model Editing after Fine-Tuning? An Empirical Study on Text-to-Image Diffusion Models
Feng He, Zhenyang Liu, Marco Valentino, Zhixue Zhao
arxiv.org/abs/2506.18428

@arXiv_csLG_bot@mastoxiv.page
2025-07-31 09:25:31

Using Scaling Laws for Data Source Utility Estimation in Domain-Specific Pre-Training
Oleksiy Ostapenko, Charles Guille-Escuret, Luke Kumar, Max Tian, Denis Kocetkov, Gopeshh Subbaraj, Raymond Li, Joel Lamy-Poirier, Sebastien Paquet, Torsten Scholak
arxiv.org/abs/2507.22250

@arXiv_csCL_bot@mastoxiv.page
2025-07-29 08:31:31

HITSZ's End-To-End Speech Translation Systems Combining Sequence-to-Sequence Auto Speech Recognition Model and Indic Large Language Model for IWSLT 2025 in Indic Track
Xuchen Wei, Yangxin Wu, Yaoyin Zhang, Henglyu Liu, Kehai Chen, Xuefeng Bai, Min Zhang
arxiv.org/abs/2507.19616

@arXiv_csSE_bot@mastoxiv.page
2025-08-28 09:37:11

Smart Contract Intent Detection with Pre-trained Programming Language Model
Youwei Huang, Jianwen Li, Sen Fang, Yao Li, Peng Yang, Bin Hu, Tao Zhang
arxiv.org/abs/2508.20086

@arXiv_csHC_bot@mastoxiv.page
2025-08-15 07:47:02

Pre-trained Transformer-models using chronic invasive electrophysiology for symptom decoding without patient-individual training
Timon Merk, Saeed Salehi, Richard M. Koehler, Qiming Cui, Maria Olaru, Amelia Hahn, Nicole R. Provenza, Simon Little, Reza Abbasi-Asl, Phil A. Starr, Wolf-Julian Neumann
arxiv.org/abs/2508.10160

@arXiv_csCV_bot@mastoxiv.page
2025-08-06 10:37:50

CoPS: Conditional Prompt Synthesis for Zero-Shot Anomaly Detection
Qiyu Chen, Zhen Qu, Wei Luo, Haiming Yao, Yunkang Cao, Yuxin Jiang, Yinan Duan, Huiyuan Luo, Chengkan Lv, Zhengtao Zhang
arxiv.org/abs/2508.03447

@arXiv_physicschemph_bot@mastoxiv.page
2025-09-03 10:11:33

Migration as a Probe: A Generalizable Benchmark Framework for Specialist vs. Generalist Machine-Learned Force Fields in Doped Materials
Yi Cao, Paulette Clancy
arxiv.org/abs/2509.00090

@arXiv_csDC_bot@mastoxiv.page
2025-07-01 09:56:13

QPART: Adaptive Model Quantization and Dynamic Workload Balancing for Accuracy-aware Edge Inference
Xiangchen Li, Saeid Ghafouri, Bo Ji, Hans Vandierendonck, Deepu John, Dimitrios S. Nikolopoulos
arxiv.org/abs/2506.23934

@arXiv_eessAS_bot@mastoxiv.page
2025-09-03 10:17:13

Speaker-Conditioned Phrase Break Prediction for Text-to-Speech with Phoneme-Level Pre-trained Language Model
Dong Yang, Yuki Saito, Takaaki Saeki, Tomoki Koriyama, Wataru Nakata, Detai Xin, Hiroshi Saruwatari
arxiv.org/abs/2509.00675

@arXiv_csIR_bot@mastoxiv.page
2025-08-29 07:54:51

ELIXIR: Efficient and LIghtweight model for eXplaIning Recommendations
Ben Kabongo, Vincent Guigue, Pirmin Lemberger
arxiv.org/abs/2508.20312

@pbloem@sigmoid.social
2025-06-26 10:56:22

After training, we finetune on real-world data. We observe that the models that have been pre-trained with noise converge very quickly compared to a baseline which is trained from scratch.
Moreover, on the other datasets, the UP models retain their zero-shot performance during finetuning. This suggests that there may be a generalization benefit to using a UP model.
All this is at the expense of much longer training, but that cost can be amortized over many tasks.

The results for the finetuning experiment. Six datasets (linux, code, dyck, wp, german and ndfa) and the performance of four models: the baseline and UP trained models and two finetuning datasets. 

The results show that the UP models converge quicker, and that they retain most of their zero-shot performance on the other datasets.
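
A minimal sketch of the finetuning comparison described above, assuming a RandomLM-style model from the earlier sketch and a generic loader of (x, y) token batches (both hypothetical): the same loop is run on a noise-pretrained (UP) model and on a freshly initialized baseline, so their convergence curves can be compared.

import copy
import torch
import torch.nn.functional as F

def finetune(model, loader, steps=1000, lr=3e-4):
    """Finetune `model` on real-world data and return its loss curve."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    curve = []
    for _, (x, y) in zip(range(steps), loader):
        loss = F.cross_entropy(model(x).flatten(0, 1), y.flatten())
        opt.zero_grad(); loss.backward(); opt.step()
        curve.append(loss.item())
    return curve

# up_model: pre-trained on iterated random computation; baseline: same architecture, untrained.
# curves = {name: finetune(copy.deepcopy(m), real_loader)
#           for name, m in [("UP", up_model), ("scratch", baseline)]}
# The post's observation is that the "UP" curve drops much faster than the "scratch" one.
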
@arXiv_csCR_bot@mastoxiv.page
2025-07-25 09:19:22

LoRA-Leak: Membership Inference Attacks Against LoRA Fine-tuned Language Models
Delong Ran, Xinlei He, Tianshuo Cong, Anyu Wang, Qi Li, Xiaoyun Wang
arxiv.org/abs/2507.18302

@arXiv_qbioGN_bot@mastoxiv.page
2025-06-25 08:15:39

eccDNAMamba: A Pre-Trained Model for Ultra-Long eccDNA Sequence Analysis
Zhenke Liu, Jien Li, Ziqi Zhang
arxiv.org/abs/2506.18940

@arXiv_csCL_bot@mastoxiv.page
2025-09-03 14:23:13

chDzDT: Word-level morphology-aware language model for Algerian social media text
Abdelkrime Aries
arxiv.org/abs/2509.01772 arxiv.org/pdf/2…

@arXiv_qbiobm_bot@mastoxiv.page
2025-06-23 08:34:20

Aptamer-protein interaction prediction model based on transformer
Zhichao Yan, Yue Kang, Buyong Ma
arxiv.org/abs/2506.16084

@arXiv_eessIV_bot@mastoxiv.page
2025-06-23 10:01:40

Fast Training-free Perceptual Image Compression
Ziran Zhu, Tongda Xu, Minye Huang, Dailan He, Xingtong Ge, Xinjie Zhang, Ling Li, Yan Wang
arxiv.org/abs/2506.16102

@arXiv_csCV_bot@mastoxiv.page
2025-08-06 10:36:50

MedCAL-Bench: A Comprehensive Benchmark on Cold-Start Active Learning with Foundation Models for Medical Image Analysis
Ning Zhu, Xiaochuan Ma, Shaoting Zhang, Guotai Wang
arxiv.org/abs/2508.03441

@arXiv_condmatmtrlsci_bot@mastoxiv.page
2025-08-13 08:00:22

DiffractGPT: Atomic Structure Determination from X-ray Diffraction Patterns using Generative Pre-trained Transformer
Kamal Choudhary
arxiv.org/abs/2508.08349

@arXiv_csSD_bot@mastoxiv.page
2025-06-23 10:26:40

Hybrid-Sep: Language-queried audio source separation via pre-trained Model Fusion and Adversarial Diffusion Training
Jianyuan Feng, Guangzheng Li, Yangfei Xu
arxiv.org/abs/2506.16833

@arXiv_csCL_bot@mastoxiv.page
2025-06-30 10:22:00

HyperCLOVA X THINK Technical Report
NAVER Cloud HyperCLOVA X Team
arxiv.org/abs/2506.22403 arxiv.org/pdf/2506.22403…

@arXiv_csGR_bot@mastoxiv.page
2025-08-18 07:37:50

StyleMM: Stylized 3D Morphable Face Model via Text-Driven Aligned Image Translation
Seungmi Lee, Kwan Yun, Junyong Noh
arxiv.org/abs/2508.11203

@arXiv_csRO_bot@mastoxiv.page
2025-08-28 08:06:51

LaVA-Man: Learning Visual Action Representations for Robot Manipulation
Chaoran Zhu, Hengyi Wang, Yik Lung Pang, Changjae Oh
arxiv.org/abs/2508.19391

@arXiv_astrophIM_bot@mastoxiv.page
2025-07-29 09:23:41

Finetuning Stellar Spectra Foundation Models with LoRA
Xiaosheng Zhao, Yuan-Sen Ting, Alexander S. Szalay, Yang Huang
arxiv.org/abs/2507.20972

@arXiv_csAR_bot@mastoxiv.page
2025-08-26 08:09:56

LLMulator: Generalizable Cost Modeling for Dataflow Accelerators with Input-Adaptive Control Flow
Kaiyan Chang, Wenlong Zhu, Shengwen Liang, Huawei Li, Ying Wang
arxiv.org/abs/2508.17826

@arXiv_csCV_bot@mastoxiv.page
2025-07-03 10:31:50

FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model
Yukang Cao, Chenyang Si, Jinghao Wang, Ziwei Liu
arxiv.org/abs/2507.01953

@arXiv_qbioQM_bot@mastoxiv.page
2025-06-24 08:45:40

BrainSymphony: A Transformer-Driven Fusion of fMRI Time Series and Structural Connectivity
Moein Khajehnejad, Forough Habibollahi, Adeel Razi
arxiv.org/abs/2506.18314

@arXiv_csCL_bot@mastoxiv.page
2025-08-01 10:20:21

Arabic Hate Speech Identification and Masking in Social Media using Deep Learning Models and Pre-trained Models Fine-tuning
Salam Thabet Doghmash, Motaz Saad
arxiv.org/abs/2507.23661

@arXiv_eessAS_bot@mastoxiv.page
2025-09-03 10:53:13

MixedG2P-T5: G2P-free Speech Synthesis for Mixed-script texts using Speech Self-Supervised Learning and Language Model
Joonyong Park, Daisuke Saito, Nobuaki Minematsu
arxiv.org/abs/2509.01391

@arXiv_csSD_bot@mastoxiv.page
2025-07-25 07:39:21

Speaker Disentanglement of Speech Pre-trained Model Based on Interpretability
Xiaoxu Zhu, Junhua Li
arxiv.org/abs/2507.17851 arxiv.org/pdf/…

@arXiv_qfinPM_bot@mastoxiv.page
2025-07-29 09:09:51

Your AI, Not Your View: The Bias of LLMs in Investment Analysis
Hoyoung Lee, Junhyuk Seo, Suhwan Park, Junhyeong Lee, Wonbin Ahn, Chanyeol Choi, Alejandro Lopez-Lira, Yongjae Lee
arxiv.org/abs/2507.20957

@arXiv_physicsfludyn_bot@mastoxiv.page
2025-07-02 09:44:00

Guided Unconditional and Conditional Generative Models for Super-Resolution and Inference of Quasi-Geostrophic Turbulence
Anantha Narayanan Suresh Babu, Akhil Sadam, Pierre F. J. Lermusiaux
arxiv.org/abs/2507.00719

@arXiv_csLG_bot@mastoxiv.page
2025-08-21 10:15:00

Cross-Modality Controlled Molecule Generation with Diffusion Language Model
Yunzhe Zhang, Yifei Wang, Khanh Vinh Nguyen, Pengyu Hong
arxiv.org/abs/2508.14748

@arXiv_physicsgeoph_bot@mastoxiv.page
2025-08-28 11:51:44

Replaced article(s) found for physics.geo-ph. arxiv.org/list/physics.geo-ph/
[1/1]:
- PRIME-DP: Pre-trained Integrated Model for Earthquake Data Processing
Ziye Yu, Yuqi Cai, Weitao Wang, Yanru An, Lu Li, Yueyang Xia, Yunpeng Zhang

@arXiv_eessIV_bot@mastoxiv.page
2025-06-27 09:17:49

GANet-Seg: Adversarial Learning for Brain Tumor Segmentation with Hybrid Generative Models
Qifei Cui, Xinyu Lu
arxiv.org/abs/2506.21245

@arXiv_csSE_bot@mastoxiv.page
2025-07-22 10:01:10

On the Effect of Token Merging on Pre-trained Models for Code
Mootez Saad, Hao Li, Tushar Sharma, Ahmed E. Hassan
arxiv.org/abs/2507.14423

@arXiv_csSI_bot@mastoxiv.page
2025-08-12 08:42:03

Anatomy of a Machine Learning Ecosystem: 2 Million Models on Hugging Face
Benjamin Laufer, Hamidah Oderinwale, Jon Kleinberg
arxiv.org/abs/2508.06811

@arXiv_csRO_bot@mastoxiv.page
2025-08-26 11:26:46

FlowVLA: Thinking in Motion with a Visual Chain of Thought
Zhide Zhong, Haodong Yan, Junfeng Li, Xiangchen Liu, Xin Gong, Wenxuan Song, Jiayi Chen, Haoang Li
arxiv.org/abs/2508.18269

@arXiv_csIR_bot@mastoxiv.page
2025-08-11 09:31:39

LMAR: Language Model Augmented Retriever for Domain-specific Knowledge Indexing
Yao Zhao, Yantian Ding, Zhiyue Zhang, Dapeng Yao, Yanxun Xu
arxiv.org/abs/2508.05672

@arXiv_csCL_bot@mastoxiv.page
2025-07-03 09:56:10

Adapting Language Models to Indonesian Local Languages: An Empirical Study of Language Transferability on Zero-Shot Settings
Rifki Afina Putri
arxiv.org/abs/2507.01645

@arXiv_csGR_bot@mastoxiv.page
2025-06-24 09:23:49

Morse: Dual-Sampling for Lossless Acceleration of Diffusion Models
Chao Li, Jiawei Fan, Anbang Yao
arxiv.org/abs/2506.18251

@arXiv_statML_bot@mastoxiv.page
2025-08-13 09:32:02

In-Context Learning as Nonparametric Conditional Probability Estimation: Risk Bounds and Optimality
Chenrui Liu, Falong Tan, Chuanlong Xie, Yicheng Zeng, Lixing Zhu
arxiv.org/abs/2508.08673

@arXiv_csCV_bot@mastoxiv.page
2025-07-30 10:40:51

Staining and locking computer vision models without retraining
Oliver J. Sutton, Qinghua Zhou, George Leete, Alexander N. Gorban, Ivan Y. Tyukin
arxiv.org/abs/2507.22000

@arXiv_csLG_bot@mastoxiv.page
2025-07-24 09:54:39

Computer Vision for Real-Time Monkeypox Diagnosis on Embedded Systems
Jacob M. Delgado-López, Ricardo A. Morell-Rodriguez, Sebastián O. Espinosa-Del Rosario, Wilfredo E. Lugo-Beauchamp
arxiv.org/abs/2507.17123

@arXiv_csCR_bot@mastoxiv.page
2025-07-17 08:12:10

Effective Fine-Tuning of Vision Transformers with Low-Rank Adaptation for Privacy-Preserving Image Classification
Haiwei Lin, Shoko Imaizumi, Hitoshi Kiya
arxiv.org/abs/2507.11943

@arXiv_csCL_bot@mastoxiv.page
2025-07-30 10:18:51

Adversarial Defence without Adversarial Defence: Enhancing Language Model Robustness via Instance-level Principal Component Removal
Yang Wang, Chenghao Xiao, Yizhi Li, Stuart E. Middleton, Noura Al Moubayed, Chenghua Lin
arxiv.org/abs/2507.21750

@arXiv_csSD_bot@mastoxiv.page
2025-07-29 07:51:51

Efficient Vocal-Conditioned Music Generation via Soft Alignment Attention and Latent Diffusion
Hei Shing Cheung, Boya Zhang
arxiv.org/abs/2507.19991

@arXiv_eessSP_bot@mastoxiv.page
2025-06-12 08:36:51

Foundation Model-Aided Deep Reinforcement Learning for RIS-Assisted Wireless Communication
Mohammad Ghassemi, Sara Farrag Mobarak, Han Zhang, Ali Afana, Akram Bin Sediq, Melike Erol-Kantarci
arxiv.org/abs/2506.09855

@arXiv_eessIV_bot@mastoxiv.page
2025-08-25 09:04:20

Time-Aware One Step Diffusion Network for Real-World Image Super-Resolution
Tainyi Zhang, Zheng-Peng Duan, Peng-Tao Jiang, Bo Li, Ming-Ming Cheng, Chun-Le Guo, Chongyi Li
arxiv.org/abs/2508.16557

@arXiv_csCV_bot@mastoxiv.page
2025-07-28 10:15:31

Back to the Features: DINO as a Foundation for Video World Models
Federico Baldassarre, Marc Szafraniec, Basile Terver, Vasil Khalidov, Francisco Massa, Yann LeCun, Patrick Labatut, Maximilian Seitzer, Piotr Bojanowski
arxiv.org/abs/2507.19468

@arXiv_csCL_bot@mastoxiv.page
2025-08-25 10:00:40

Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
Yuxian Gu, Qinghao Hu, Shang Yang, Haocheng Xi, Junyu Chen, Song Han, Han Cai
arxiv.org/abs/2508.15884

@arXiv_csIR_bot@mastoxiv.page
2025-08-14 08:15:52

Personalized Product Search Ranking: A Multi-Task Learning Approach with Tabular and Non-Tabular Data
Lalitesh Morishetti, Abhay Kumar, Jonathan Scott, Kaushiki Nag, Gunjan Sharma, Shanu Vashishtha, Rahul Sridhar, Rohit Chatter, Kannan Achan
arxiv.org/abs/2508.09636

@arXiv_csDC_bot@mastoxiv.page
2025-07-22 08:53:20

ACME: Adaptive Customization of Large Models via Distributed Systems
Ziming Dai, Chao Qiu, Fei Gao, Yunfeng Zhao, Xiaofei Wang
arxiv.org/abs/2507.14802

@arXiv_csLG_bot@mastoxiv.page
2025-08-27 10:35:33

Composition and Alignment of Diffusion Models using Constrained Learning
Shervin Khalafi, Ignacio Hounie, Dongsheng Ding, Alejandro Ribeiro
arxiv.org/abs/2508.19104

@arXiv_csSD_bot@mastoxiv.page
2025-08-19 09:25:39

Exploring Self-Supervised Audio Models for Generalized Anomalous Sound Detection
Bing Han, Anbai Jiang, Xinhu Zheng, Wei-Qiang Zhang, Jia Liu, Pingyi Fan, Yanmin Qian
arxiv.org/abs/2508.12230

@arXiv_eessAS_bot@mastoxiv.page
2025-07-03 09:34:00

Generalizable Detection of Audio Deepfakes
Jose A. Lopez, Georg Stemmer, Héctor Cordourier Maruri
arxiv.org/abs/2507.01750

@arXiv_csCL_bot@mastoxiv.page
2025-07-14 09:58:42

DocPolarBERT: A Pre-trained Model for Document Understanding with Relative Polar Coordinate Encoding of Layout Structures
Benno Uthayasooriyar, Antoine Ly, Franck Vermet, Caio Corro
arxiv.org/abs/2507.08606

@arXiv_csRO_bot@mastoxiv.page
2025-06-16 07:49:19

Poutine: Vision-Language-Trajectory Pre-Training and Reinforcement Learning Post-Training Enable Robust End-to-End Autonomous Driving
Luke Rowe, Rodrigue de Schaetzen, Roger Girgis, Christopher Pal, Liam Paull
arxiv.org/abs/2506.11234

@arXiv_csSE_bot@mastoxiv.page
2025-08-22 08:43:40

An Empirical Study of Knowledge Distillation for Code Understanding Tasks
Ruiqi Wang, Zezhou Yang, Cuiyun Gao, Xin Xia, Qing Liao
arxiv.org/abs/2508.15423

@arXiv_csGR_bot@mastoxiv.page
2025-06-26 08:19:00

EditP23: 3D Editing via Propagation of Image Prompts to Multi-View
Roi Bar-On, Dana Cohen-Bar, Daniel Cohen-Or
arxiv.org/abs/2506.20652

@arXiv_csLG_bot@mastoxiv.page
2025-08-22 10:16:41

Amortized In-Context Mixed Effect Transformer Models: A Zero-Shot Approach for Pharmacokinetics
César Ali Ojeda Marin, Wilhelm Huisinga, Purity Kavwele, Niklas Hartung
arxiv.org/abs/2508.15659

@arXiv_csCL_bot@mastoxiv.page
2025-08-22 10:01:01

Exploiting Vocabulary Frequency Imbalance in Language Model Pre-training
Woojin Chung, Jeonghoon Kim
arxiv.org/abs/2508.15390 arxiv.org/pdf…

@arXiv_csCL_bot@mastoxiv.page
2025-09-01 09:40:42

Efficient Code Embeddings from Code Generation Models
Daria Kryvosheieva, Saba Sturua, Michael Günther, Scott Martens, Han Xiao
arxiv.org/abs/2508.21290

@arXiv_csCV_bot@mastoxiv.page
2025-06-17 09:58:49

CLIP-HandID: Vision-Language Model for Hand-Based Person Identification
Nathanael L. Baisa, Babu Pallam, Amudhavel Jayavel
arxiv.org/abs/2506.12447

@arXiv_eessIV_bot@mastoxiv.page
2025-08-20 09:44:50

UNICON: UNIfied CONtinual Learning for Medical Foundational Models
Mohammad Areeb Qazi, Munachiso S Nwadike, Ibrahim Almakky, Mohammad Yaqub, Numan Saeed
arxiv.org/abs/2508.14024

@arXiv_csSD_bot@mastoxiv.page
2025-07-29 09:19:01

Do Not Mimic My Voice: Speaker Identity Unlearning for Zero-Shot Text-to-Speech
Taesoo Kim, Jinju Kim, Dongchan Kim, Jong Hwan Ko, Gyeong-Moon Park
arxiv.org/abs/2507.20140

@arXiv_csCL_bot@mastoxiv.page
2025-08-29 10:18:21

Multi-Lingual Implicit Discourse Relation Recognition with Multi-Label Hierarchical Learning
Nelson Filipe Costa, Leila Kosseim
arxiv.org/abs/2508.20712

@arXiv_csSE_bot@mastoxiv.page
2025-08-20 11:48:46

Replaced article(s) found for cs.SE. arxiv.org/list/cs.SE/new
[1/1]:
- "I see models being a whole other thing": An Empirical Study of Pre-Trained Model Naming Conventi...
Wenxin Jiang, Mingyu Kim, Chingwo Cheung, Heesoo Kim, George K. Thiruvathukal, James C. Davis

@arXiv_csCV_bot@mastoxiv.page
2025-07-16 10:34:41

Implementing Adaptations for Vision AutoRegressive Model
Kaif Shaikh, Antoni Kowalczuk, Franziska Boenisch, Adam Dziedzic
arxiv.org/abs/2507.11441

@arXiv_csGR_bot@mastoxiv.page
2025-07-25 09:26:12

Tiny is not small enough: High-quality, low-resource facial animation models through hybrid knowledge distillation
Zhen Han, Mattias Teye, Derek Yadgaroff, Judith Bütepage
arxiv.org/abs/2507.18352

@arXiv_csCL_bot@mastoxiv.page
2025-08-25 10:03:40

Seeing is Believing: Emotion-Aware Audio-Visual Language Modeling for Expressive Speech Generation
Weiting Tan, Jiachen Lian, Hirofumi Inaguma, Paden Tomasello, Philipp Koehn, Xutai Ma
arxiv.org/abs/2508.16188

@arXiv_csRO_bot@mastoxiv.page
2025-08-11 09:37:49

Bounding Distributional Shifts in World Modeling through Novelty Detection
Eric Jing, Abdeslam Boularias
arxiv.org/abs/2508.06096 arxiv.org…

@arXiv_csLG_bot@mastoxiv.page
2025-08-21 10:10:00

Adaptively Robust LLM Inference Optimization under Prediction Uncertainty
Zixi Chen, Yinyu Ye, Zijie Zhou
arxiv.org/abs/2508.14544 arxiv.or…

@arXiv_csCV_bot@mastoxiv.page
2025-08-22 10:05:51

Transfer learning optimization based on evolutionary selective fine tuning
Jacinto Colan, Ana Davila, Yasuhisa Hasegawa
arxiv.org/abs/2508.15367

@arXiv_eessIV_bot@mastoxiv.page
2025-07-11 09:03:21

Label-Efficient Chest X-ray Diagnosis via Partial CLIP Adaptation
Heet Nitinkumar Dalsania
arxiv.org/abs/2507.07254 a…

@arXiv_csGR_bot@mastoxiv.page
2025-06-24 08:11:39

BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing
Jiacheng Chen, Ramin Mehran, Xuhui Jia, Saining Xie, Sanghyun Woo
arxiv.org/abs/2506.17450

@arXiv_csLG_bot@mastoxiv.page
2025-08-20 10:05:00

In-Context Decision Making for Optimizing Complex AutoML Pipelines
Amir Rezaei Balef, Katharina Eggensperger
arxiv.org/abs/2508.13657 arxiv…

@arXiv_csSD_bot@mastoxiv.page
2025-08-21 09:12:59

ECHO: Frequency-aware Hierarchical Encoding for Variable-length Signal
Yucong Zhang, Juan Liu, Ming Li
arxiv.org/abs/2508.14689 arxiv.org/p…

@arXiv_csLG_bot@mastoxiv.page
2025-08-12 12:07:33

Towards Unveiling Predictive Uncertainty Vulnerabilities in the Context of the Right to Be Forgotten
Wei Qian, Chenxu Zhao, Yangyi Li, Wenqian Ye, Mengdi Huai
arxiv.org/abs/2508.07458

@arXiv_csSD_bot@mastoxiv.page
2025-07-14 08:35:02

Audio Inpainting using Discrete Diffusion Model
Tali Dror, Iftach Shoham, Moshe Buchris, Oren Gal, Haim Permuter, Gilad Katz, Eliya Nachmani
arxiv.org/abs/2507.08333

@arXiv_csLG_bot@mastoxiv.page
2025-08-15 10:07:52

Projected Coupled Diffusion for Test-Time Constrained Joint Generation
Hao Luan, Yi Xian Goh, See-Kiong Ng, Chun Kai Ling
arxiv.org/abs/2508.10531

@arXiv_csLG_bot@mastoxiv.page
2025-07-11 10:23:51

Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs
Ziyue Li, Yang Li, Tianyi Zhou
arxiv.org/abs/2507.07996 arxiv.org/pdf/2507.07996 arxiv.org/html/2507.07996
arXiv:2507.07996v1 Announce Type: new
Abstract: Can a pretrained neural network adapt its architecture to different inputs without any finetuning? Do we need all layers for simple tasks, and are they adequate for challenging tasks? We found that the layers of a pretrained large language model (LLM) can be manipulated as separate modules to build a better and even shallower model customized for each test sample. In particular, each layer from the pretrained model can be skipped/pruned or repeated multiple times as recurrent neural networks (RNN), and stacked with others in arbitrary orders, yielding a chain-of-layers (CoLa) per sample. This compositional space greatly expands the scope of existing works on looped/recurrent pretrained modules, layer pruning, or early-exit networks. We develop a Monte Carlo Tree Search (MCTS) protocol to explore and identify the optimal CoLa for each sample from math and commonsense reasoning benchmarks. Compared to a static model of a fixed depth, CoLa allows shortcut paths (fast thinking), recurrence of the same layer(s) (slow thinking), and combining both, offering more flexible, dynamic architectures for different inputs. We conduct an extensive analysis of the MCTS-optimized CoLa, which leads to two key findings: (1) For >75% of samples with correct predictions by the original LLM, we can find shorter CoLa, suggesting a large space for improving inference efficiency; (2) For >60% of samples with originally incorrect predictions, we can identify CoLa achieving correct predictions, suggesting a large space of performance enhancement. Our results highlight the shortcomings of using a fixed architecture of pre-trained LLMs for inference on different samples and pave the way to unlock the generalization power of test-time depth adaptation.
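
As a rough illustration of the chain-of-layers (CoLa) idea described in the abstract, the sketch below composes the pretrained blocks of GPT-2 in an arbitrary per-sample order, skipping some and repeating others. GPT-2, the hand-picked chain and the omitted MCTS search are assumptions for illustration, not the paper's actual setup.

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()
blocks = lm.transformer.h                      # the 12 pretrained layers as separate modules

def run_chain(input_ids, chain):
    """Forward pass through an arbitrary sequence of layer indices (skips and repeats allowed)."""
    pos = torch.arange(input_ids.size(1))
    h = lm.transformer.wte(input_ids) + lm.transformer.wpe(pos)
    for i in chain:                            # repeat a layer (slow thinking) or skip layers (fast thinking)
        h = blocks[i](h)[0]
    return lm.lm_head(lm.transformer.ln_f(h))  # logits over the vocabulary

ids = tok("2 + 2 =", return_tensors="pt").input_ids
with torch.no_grad():
    logits = run_chain(ids, chain=[0, 1, 1, 4, 11])   # one hypothetical CoLa; MCTS would search this space
print(tok.decode(logits[0, -1].argmax()))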

@arXiv_csCL_bot@mastoxiv.page
2025-07-17 08:05:40

Cross-lingual Few-shot Learning for Persian Sentiment Analysis with Incremental Adaptation
Farideh Majidi, Ziaeddin Beheshtifard
arxiv.org/abs/2507.11634