Tootfinder

Opt-in global Mastodon full text search. Join the index!

@arXiv_csLG_bot@mastoxiv.page
2025-07-31 09:25:31

Using Scaling Laws for Data Source Utility Estimation in Domain-Specific Pre-Training
Oleksiy Ostapenko, Charles Guille-Escuret, Luke Kumar, Max Tian, Denis Kocetkov, Gopeshh Subbaraj, Raymond Li, Joel Lamy-Poirier, Sebastien Paquet, Torsten Scholak
arxiv.org/abs/2507.22250

@arXiv_csCL_bot@mastoxiv.page
2025-06-30 10:22:00

HyperCLOVA X THINK Technical Report
NAVER Cloud HyperCLOVA X Team
arxiv.org/abs/2506.22403 arxiv.org/pdf/2506.22403…

@arXiv_csCV_bot@mastoxiv.page
2025-07-30 10:40:51

Staining and locking computer vision models without retraining
Oliver J. Sutton, Qinghua Zhou, George Leete, Alexander N. Gorban, Ivan Y. Tyukin
arxiv.org/abs/2507.22000

@arXiv_csDC_bot@mastoxiv.page
2025-05-30 07:17:04

Speeding up Model Loading with fastsafetensors
Takeshi Yoshimura, Tatsuhiro Chiba, Manish Sethi, Daniel Waddington, Swaminathan Sundararaman
arxiv.org/abs/2505.23072

@pbloem@sigmoid.social
2025-06-26 10:41:24

New pre-print! #ai
**Universal pre-training by iterated random computation.**
⌨️🐒 A monkey behind a typewriter will produce the collected works of Shakespeare eventually.
💻🐒 But what if we put a monkey behind a computer?
⌨️🐒 needs to be lucky enough to type all characters of all of Shakespeare correctly. 💻🐒 only needs to be lucky enough to type a program for Shakespeare.

A table showing one string of random characters next to an emoji of a monkey next to a keyboard (representing a typewriter). Below it, three strings, also of random characters, but with more structure. Some characters and n-grams repeat. Next to these three strings is an emoji of a monkey next to a laptop computer. The caption reads: (⌨️🐒) A string of randomly sampled characters. (💻🐒) The result of passing this string through three randomly initialized neural network models. The latter data is …
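
A minimal PyTorch sketch of the idea in this toot (illustrative only, not the authors' code): sample a uniformly random character string, then pass it through an untrained, randomly initialized recurrent model and sample characters from its output distribution. The vocabulary, model sizes, and architecture below are assumptions made for the sketch.

```python
import torch

vocab = list("abcdefghijklmnopqrstuvwxyz ")
V = len(vocab)

# (keyboard monkey) a string of uniformly random characters
random_ids = torch.randint(0, V, (64,))
print("random:             ", "".join(vocab[int(i)] for i in random_ids))

# (laptop monkey) feed that string through an untrained recurrent model and
# sample from its output distribution; per the toot, the output shows more
# structure (repeated characters and n-grams) even without any training.
emb = torch.nn.Embedding(V, 32)
rnn = torch.nn.LSTM(32, 64, batch_first=True)
head = torch.nn.Linear(64, V)

with torch.no_grad():
    hidden, _ = rnn(emb(random_ids).unsqueeze(0))
    probs = torch.softmax(head(hidden)[0], dim=-1)
    structured_ids = torch.multinomial(probs, 1).squeeze(-1)
print("through random model:", "".join(vocab[int(i)] for i in structured_ids))
```
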
@arXiv_csMM_bot@mastoxiv.page
2025-05-30 09:54:06

This arxiv.org/abs/2411.17690 has been replaced.
initial toot: mastoxiv.page/@arXiv_csMM_…

@arXiv_csCL_bot@mastoxiv.page
2025-07-30 10:18:51

Adversarial Defence without Adversarial Defence: Enhancing Language Model Robustness via Instance-level Principal Component Removal
Yang Wang, Chenghao Xiao, Yizhi Li, Stuart E. Middleton, Noura Al Moubayed, Chenghua Lin
arxiv.org/abs/2507.21750

@arXiv_csSD_bot@mastoxiv.page
2025-05-30 09:56:27

This arxiv.org/abs/2505.20745 has been replaced.
initial toot: mastoxiv.page/@arXiv_csSD_…

@arXiv_csAI_bot@mastoxiv.page
2025-06-24 11:48:20

How Robust is Model Editing after Fine-Tuning? An Empirical Study on Text-to-Image Diffusion Models
Feng He, Zhenyang Liu, Marco Valentino, Zhixue Zhao
arxiv.org/abs/2506.18428

@arXiv_csDB_bot@mastoxiv.page
2025-05-30 07:16:53

TailorSQL: An NL2SQL System Tailored to Your Query Workload
Kapil Vaidya, Jialin Ding, Sebastian Kosak, David Kernert, Chuan Lei, Xiao Qin, Abhinav Tripathy, Ramesh Balan, Balakrishnan Narayanaswamy, Tim Kraska
arxiv.org/abs/2505.23039

@arXiv_csCL_bot@mastoxiv.page
2025-07-29 08:31:31

HITSZ's End-To-End Speech Translation Systems Combining Sequence-to-Sequence Auto Speech Recognition Model and Indic Large Language Model for IWSLT 2025 in Indic Track
Xuchen Wei, Yangxin Wu, Yaoyin Zhang, Henglyu Liu, Kehai Chen, Xuefeng Bai, Min Zhang
arxiv.org/abs/2507.19616

@pbloem@sigmoid.social
2025-06-26 10:56:22

After training, we finetune on real-world data. We observe that the models that have been pre-trained with noise converge very quickly compared to a baseline which is trained from scratch.
Moreover, on the other datasets, the UP models retain their zero-shot performance during finetuning. This suggests that there may be a generalization benefit to using a UP model.
All this is at the expense of much longer training, but that cost can be amortized over many tasks.

The results for the finetuning experiment. Six datasets (linux, code, dyck, wp, german and ndfa) and the performance of four models: the baseline and UP trained models and two finetuning datasets. 

The results show that the UP models converge quicker, and that they retain most of their zero-shot performance on the other datasets.

@arXiv_csCV_bot@mastoxiv.page
2025-07-28 10:15:31

Back to the Features: DINO as a Foundation for Video World Models
Federico Baldassarre, Marc Szafraniec, Basile Terver, Vasil Khalidov, Francisco Massa, Yann LeCun, Patrick Labatut, Maximilian Seitzer, Piotr Bojanowski
arxiv.org/abs/2507.19468

@privacity@social.linux.pizza
2025-07-06 23:39:20

Nature of Data in Pre-Trained Large Language Models
fpf.org/blog/nature-of-data-in
@…

@arXiv_eessIV_bot@mastoxiv.page
2025-06-27 09:17:49

GANet-Seg: Adversarial Learning for Brain Tumor Segmentation with Hybrid Generative Models
Qifei Cui, Xinyu Lu
arxiv.org/abs/2506.21245

@arXiv_astrophIM_bot@mastoxiv.page
2025-07-29 09:23:41

Finetuning Stellar Spectra Foundation Models with LoRA
Xiaosheng Zhao, Yuan-Sen Ting, Alexander S. Szalay, Yang Huang
arxiv.org/abs/2507.20972

@arXiv_csSE_bot@mastoxiv.page
2025-07-22 10:01:10

On the Effect of Token Merging on Pre-trained Models for Code
Mootez Saad, Hao Li, Tushar Sharma, Ahmed E. Hassan
arxiv.org/abs/2507.14423

@arXiv_qbioGN_bot@mastoxiv.page
2025-06-25 08:15:39

eccDNAMamba: A Pre-Trained Model for Ultra-Long eccDNA Sequence Analysis
Zhenke Liu, Jien Li, Ziqi Zhang
arxiv.org/abs/2506.18940

@arXiv_csSC_bot@mastoxiv.page
2025-05-29 07:21:09

Symbolic Foundation Regressor on Complex Networks
Weiting Liu, Jiaxu Cui, Jiao Hu, En Wang, Bo Yang
arxiv.org/abs/2505.21879

@arXiv_csCR_bot@mastoxiv.page
2025-07-25 09:19:22

LoRA-Leak: Membership Inference Attacks Against LoRA Fine-tuned Language Models
Delong Ran, Xinlei He, Tianshuo Cong, Anyu Wang, Qi Li, Xiaoyun Wang
arxiv.org/abs/2507.18302

@arXiv_csSD_bot@mastoxiv.page
2025-07-29 07:51:51

Efficient Vocal-Conditioned Music Generation via Soft Alignment Attention and Latent Diffusion
Hei Shing Cheung, Boya Zhang
arxiv.org/abs/2507.19991

@arXiv_qfinPM_bot@mastoxiv.page
2025-07-29 09:09:51

Your AI, Not Your View: The Bias of LLMs in Investment Analysis
Hoyoung Lee, Junhyuk Seo, Suhwan Park, Junhyeong Lee, Wonbin Ahn, Chanyeol Choi, Alejandro Lopez-Lira, Yongjae Lee
arxiv.org/abs/2507.20957

@arXiv_qbiobm_bot@mastoxiv.page
2025-06-23 08:34:20

Aptamer-protein interaction prediction model based on transformer
Zhichao Yan, Yue Kang, Buyong Ma
arxiv.org/abs/2506.16084

@arXiv_csGR_bot@mastoxiv.page
2025-06-10 09:05:12

Squeeze3D: Your 3D Generation Model is Secretly an Extreme Neural Compressor
Rishit Dagli, Yushi Guan, Sankeerth Durvasula, Mohammadreza Mofayezi, Nandita Vijaykumar
arxiv.org/abs/2506.07932

@arXiv_csLG_bot@mastoxiv.page
2025-07-24 09:54:39

Computer Vision for Real-Time Monkeypox Diagnosis on Embedded Systems
Jacob M. Delgado-López, Ricardo A. Morell-Rodriguez, Sebastián O. Espinosa-Del Rosario, Wilfredo E. Lugo-Beauchamp
arxiv.org/abs/2507.17123

@arXiv_csSD_bot@mastoxiv.page
2025-06-23 10:26:40

Hybrid-Sep: Language-queried audio source separation via pre-trained Model Fusion and Adversarial Diffusion Training
Jianyuan Feng, Guangzheng Li, Yangfei Xu
arxiv.org/abs/2506.16833

@arXiv_qbioQM_bot@mastoxiv.page
2025-06-24 08:45:40

BrainSymphony: A Transformer-Driven Fusion of fMRI Time Series and Structural Connectivity
Moein Khajehnejad, Forough Habibollahi, Adeel Razi
arxiv.org/abs/2506.18314

@arXiv_csRO_bot@mastoxiv.page
2025-06-16 07:49:19

Poutine: Vision-Language-Trajectory Pre-Training and Reinforcement Learning Post-Training Enable Robust End-to-End Autonomous Driving
Luke Rowe, Rodrigue de Schaetzen, Roger Girgis, Christopher Pal, Liam Paull
arxiv.org/abs/2506.11234

@arXiv_qfinST_bot@mastoxiv.page
2025-06-10 09:46:13

DELPHYNE: A Pre-Trained Model for General and Financial Time Series
Xueying Ding, Aakriti Mittal, Achintya Gopal
arxiv.org/abs/2506.06288

@arXiv_eessIV_bot@mastoxiv.page
2025-06-23 10:01:40

Fast Training-free Perceptual Image Compression
Ziran Zhu, Tongda Xu, Minye Huang, Dailan He, Xingtong Ge, Xinjie Zhang, Ling Li, Yan Wang
arxiv.org/abs/2506.16102

@arXiv_qbioNC_bot@mastoxiv.page
2025-06-04 07:49:14

A Brain Graph Foundation Model: Pre-Training and Prompt-Tuning for Any Atlas and Disorder
Xinxu Wei, Kanhao Zhao, Yong Jiao, Lifang He, Yu Zhang
arxiv.org/abs/2506.02044

@arXiv_csSD_bot@mastoxiv.page
2025-07-25 07:39:21

Speaker Disentanglement of Speech Pre-trained Model Based on Interpretability
Xiaoxu Zhu, Junhua Li
arxiv.org/abs/2507.17851 arxiv.org/pdf/…

@arXiv_eessSP_bot@mastoxiv.page
2025-06-12 08:36:51

Foundation Model-Aided Deep Reinforcement Learning for RIS-Assisted Wireless Communication
Mohammad Ghassemi, Sara Farrag Mobarak, Han Zhang, Ali Afana, Akram Bin Sediq, Melike Erol-Kantarci
arxiv.org/abs/2506.09855

@arXiv_csGR_bot@mastoxiv.page
2025-06-26 08:19:00

EditP23: 3D Editing via Propagation of Image Prompts to Multi-View
Roi Bar-On, Dana Cohen-Bar, Daniel Cohen-Or
arxiv.org/abs/2506.20652

@arXiv_csCL_bot@mastoxiv.page
2025-07-14 09:58:42

DocPolarBERT: A Pre-trained Model for Document Understanding with Relative Polar Coordinate Encoding of Layout Structures
Benno Uthayasooriyar, Antoine Ly, Franck Vermet, Caio Corro
arxiv.org/abs/2507.08606

@arXiv_csCV_bot@mastoxiv.page
2025-06-17 09:58:49

CLIP-HandID: Vision-Language Model for Hand-Based Person Identification
Nathanael L. Baisa, Babu Pallam, Amudhavel Jayavel
arxiv.org/abs/2506.12447

@arXiv_csSD_bot@mastoxiv.page
2025-07-29 09:19:01

Do Not Mimic My Voice: Speaker Identity Unlearning for Zero-Shot Text-to-Speech
Taesoo Kim, Jinju Kim, Dongchan Kim, Jong Hwan Ko, Gyeong-Moon Park
arxiv.org/abs/2507.20140

@arXiv_csCR_bot@mastoxiv.page
2025-06-09 08:05:12

Stealix: Model Stealing via Prompt Evolution
Zhixiong Zhuang, Hui-Po Wang, Maria-Irina Nicolae, Mario Fritz
arxiv.org/abs/2506.05867

@arXiv_eessAS_bot@mastoxiv.page
2025-06-03 07:44:25

Source Tracing of Synthetic Speech Systems Through Paralinguistic Pre-Trained Representations
Girish, Mohd Mujtaba Akhtar, Orchid Chetia Phukan, Drishti Singh, Swarup Ranjan Behera, Pailla Balakrishna Reddy, Arun Balaji Buduru, Rajesh Sharma
arxiv.org/abs/2506.01157

@tiotasram@kolektiva.social
2025-07-19 07:51:05

AI, AGI, and learning efficiency
My 4-month-old kid is not DDoSing Wikipedia right now, nor will they ever do so before learning to speak, read, or write. Their entire "training corpus" will not top even 100 million "tokens" before they can speak & understand language, and do so with real intentionality.
Just to emphasize that point: 100 words-per-minute times 60 minutes-per-hour times 12 hours-per-day times 365 days-per-year times 4 years is a mere 105,120,000 words. That's a ludicrously *high* estimate of words-per-minute and hours-per-day, and 4 years old (the age of my other kid) is well after basic speech capabilities are developed in many children, etc. More likely the available "training data" is at least 1 or 2 orders of magnitude less than this.
The point here is that large language models, trained as they are on multiple *billions* of tokens, are not developing their behavioral capabilities in a way that's remotely similar to humans, even if you believe those capabilities are similar (they are by certain very biased ways of measurement; they very much aren't by others). This idea that humans must be naturally good at acquiring language is an old one (see e.g. #AI #LLM #AGI
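
A quick sanity check of the arithmetic above, using the post's deliberately generous assumptions (illustrative Python):

```python
# Back-of-the-envelope upper bound on a child's "training data" in words.
words_per_minute = 100   # deliberately high
hours_per_day = 12       # deliberately high
years = 4

total_words = words_per_minute * 60 * hours_per_day * 365 * years
print(f"{total_words:,}")  # 105,120,000
```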

@arXiv_csDC_bot@mastoxiv.page
2025-07-22 08:53:20

ACME: Adaptive Customization of Large Models via Distributed Systems
Ziming Dai, Chao Qiu, Fei Gao, Yunfeng Zhao, Xiaofei Wang
arxiv.org/abs/2507.14802

@arXiv_csSE_bot@mastoxiv.page
2025-06-04 07:44:26

How do Pre-Trained Models Support Software Engineering? An Empirical Study in Hugging Face
Alexandra González, Xavier Franch, David Lo, Silverio Martínez-Fernández
arxiv.org/abs/2506.03013

@arXiv_csIR_bot@mastoxiv.page
2025-06-10 16:44:49

This arxiv.org/abs/2506.02916 has been replaced.
initial toot: mastoxiv.page/@arXiv_csIR_…

@arXiv_csGR_bot@mastoxiv.page
2025-06-24 09:23:49

Morse: Dual-Sampling for Lossless Acceleration of Diffusion Models
Chao Li, Jiawei Fan, Anbang Yao
arxiv.org/abs/2506.18251

@arXiv_csCV_bot@mastoxiv.page
2025-07-16 10:34:41

Implementing Adaptations for Vision AutoRegressive Model
Kaif Shaikh, Antoni Kowalczuk, Franziska Boenisch, Adam Dziedzic
arxiv.org/abs/2507.11441

@arXiv_csLG_bot@mastoxiv.page
2025-06-10 19:21:11

This arxiv.org/abs/2505.23868 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_csRO_bot@mastoxiv.page
2025-06-02 10:28:24

This arxiv.org/abs/2505.21906 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_csCR_bot@mastoxiv.page
2025-06-02 10:00:25

This arxiv.org/abs/2411.16746 has been replaced.
initial toot: mastoxiv.page/@arXiv_csCR_…

@arXiv_csDB_bot@mastoxiv.page
2025-06-09 07:27:22

Training-Free Query Optimization via LLM-Based Plan Similarity
Nikita Vasilenko, Alexander Demin, Vladimir Boorlakov
arxiv.org/abs/2506.05853

@arXiv_csGR_bot@mastoxiv.page
2025-07-25 09:26:12

Tiny is not small enough: High-quality, low-resource facial animation models through hybrid knowledge distillation
Zhen Han, Mattias Teye, Derek Yadgaroff, Judith Bütepage
arxiv.org/abs/2507.18352

@arXiv_qbioGN_bot@mastoxiv.page
2025-06-16 09:30:09

Multimodal Modeling of CRISPR-Cas12 Activity Using Foundation Models and Chromatin Accessibility Data
Azim Dehghani Amirabad, Yanfei Zhang, Artem Moskalev, Sowmya Rajesh, Tommaso Mansi, Shuwei Li, Mangal Prakash, Rui Liao
arxiv.org/abs/2506.11182

@arXiv_eessAS_bot@mastoxiv.page
2025-06-17 11:23:45

Stereo sound event localization and detection based on PSELDnet pretraining and BiMamba sequence modeling
Wenmiao Gao, Yang Xiao
arxiv.org/abs/2506.13455

@arXiv_physicsfludyn_bot@mastoxiv.page
2025-07-02 09:44:00

Guided Unconditional and Conditional Generative Models for Super-Resolution and Inference of Quasi-Geostrophic Turbulence
Anantha Narayanan Suresh Babu, Akhil Sadam, Pierre F. J. Lermusiaux
arxiv.org/abs/2507.00719

@arXiv_csDC_bot@mastoxiv.page
2025-07-01 09:56:13

QPART: Adaptive Model Quantization and Dynamic Workload Balancing for Accuracy-aware Edge Inference
Xiangchen Li, Saeid Ghafouri, Bo Ji, Hans Vandierendonck, Deepu John, Dimitrios S. Nikolopoulos
arxiv.org/abs/2506.23934

@arXiv_csLG_bot@mastoxiv.page
2025-06-05 11:00:19

This arxiv.org/abs/2506.00486 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_eessSP_bot@mastoxiv.page
2025-06-12 08:23:31

AI-Driven SEEG Channel Ranking for Epileptogenic Zone Localization
Saeed Hashemi, Genchang Peng, Mehrdad Nourani, Omar Nofal, Jay Harvey
arxiv.org/abs/2506.09255

@arXiv_csCL_bot@mastoxiv.page
2025-06-03 08:19:55

Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training
Pierre-Carl Langlais, Carlos Rosas Hinostroza, Mattia Nee, Catherine Arnett, Pavel Chizhov, Eliot Krzystof Jones, Irène Girard, David Mach, Anastasia Stasenko, Ivan P. Yamshchikov
arxiv.org/abs/2506.01732

@arXiv_csCV_bot@mastoxiv.page
2025-06-04 14:59:51

This arxiv.org/abs/2505.21920 has been replaced.
initial toot: mastoxiv.page/@arXiv_csCV_…

@arXiv_eessIV_bot@mastoxiv.page
2025-07-11 09:03:21

Label-Efficient Chest X-ray Diagnosis via Partial CLIP Adaptation
Heet Nitinkumar Dalsania
arxiv.org/abs/2507.07254 a…

@arXiv_csCR_bot@mastoxiv.page
2025-07-17 08:12:10

Effective Fine-Tuning of Vision Transformers with Low-Rank Adaptation for Privacy-Preserving Image Classification
Haiwei Lin, Shoko Imaizumi, Hitoshi Kiya
arxiv.org/abs/2507.11943

@arXiv_csGR_bot@mastoxiv.page
2025-06-24 08:11:39

BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing
Jiacheng Chen, Ramin Mehran, Xuhui Jia, Saining Xie, Sanghyun Woo
arxiv.org/abs/2506.17450

@arXiv_csIR_bot@mastoxiv.page
2025-06-05 09:42:02

This arxiv.org/abs/2506.02916 has been replaced.
initial toot: mastoxiv.page/@arXiv_csIR_…

@arXiv_physicsmedph_bot@mastoxiv.page
2025-06-05 07:35:05

Personalized MR-Informed Diffusion Models for 3D PET Image Reconstruction
George Webber, Alexander Hammers, Andrew P. King, Andrew J. Reader
arxiv.org/abs/2506.03804

@arXiv_eessAS_bot@mastoxiv.page
2025-06-19 08:41:22

Factorized RVQ-GAN For Disentangled Speech Tokenization
Sameer Khurana, Dominik Klement, Antoine Laurent, Dominik Bobos, Juraj Novosad, Peter Gazdik, Ellen Zhang, Zili Huang, Amir Hussein, Ricard Marxer, Yoshiki Masuyama, Ryo Aihara, Chiori Hori, Francois G. Germain, Gordon Wichern, Jonathan Le Roux
arxiv.org/abs/2506.15…

@arXiv_csSD_bot@mastoxiv.page
2025-06-06 07:21:12

Grapheme-Coherent Phonemic and Prosodic Annotation of Speech by Implicit and Explicit Grapheme Conditioning
Hien Ohnaka, Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto
arxiv.org/abs/2506.04527

@arXiv_csCL_bot@mastoxiv.page
2025-07-17 08:05:40

Cross-lingual Few-shot Learning for Persian Sentiment Analysis with Incremental Adaptation
Farideh Majidi, Ziaeddin Beheshtifard
arxiv.org/abs/2507.11634

@arXiv_astrophIM_bot@mastoxiv.page
2025-06-03 07:41:29

Applying Vision Transformers on Spectral Analysis of Astronomical Objects
Luis Felipe Strano Moraes, Ignacio Becker, Pavlos Protopapas, Guillermo Cabrera-Vives
arxiv.org/abs/2506.00294

@arXiv_csLG_bot@mastoxiv.page
2025-06-10 19:21:44

This arxiv.org/abs/2506.01790 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_csIR_bot@mastoxiv.page
2025-06-04 07:26:13

MMM4Rec: A Transfer-Efficient Framework for Multi-modal Sequential Recommendation
Hao Fan, Yanrong Hu, Kai Fang, Qingyang Liu, Hongjiu Liu
arxiv.org/abs/2506.02916

@arXiv_csCV_bot@mastoxiv.page
2025-07-03 10:31:50

FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model
Yukang Cao, Chenyang Si, Jinghao Wang, Ziwei Liu
arxiv.org/abs/2507.01953

@arXiv_csSD_bot@mastoxiv.page
2025-07-08 11:09:50

Improving BERT for Symbolic Music Understanding Using Token Denoising and Pianoroll Prediction
Jun-You Wang, Li Su
arxiv.org/abs/2507.04776

@arXiv_csLG_bot@mastoxiv.page
2025-06-03 21:30:25

This arxiv.org/abs/2504.19583 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_eessAS_bot@mastoxiv.page
2025-07-15 09:17:51

Enhancing Stereo Sound Event Detection with BiMamba and Pretrained PSELDnet
Wenmiao Gao, Han Yin
arxiv.org/abs/2507.09570

@arXiv_csLG_bot@mastoxiv.page
2025-07-11 10:23:51

Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs
Ziyue Li, Yang Li, Tianyi Zhou
arxiv.org/abs/2507.07996 arxiv.org/pdf/2507.07996 arxiv.org/html/2507.07996
arXiv:2507.07996v1 Announce Type: new
Abstract: Can a pretrained neural network adapt its architecture to different inputs without any finetuning? Do we need all layers for simple tasks, and are they adequate for challenging tasks? We found that the layers of a pretrained large language model (LLM) can be manipulated as separate modules to build a better and even shallower model customized for each test sample. In particular, each layer from the pretrained model can be skipped/pruned or repeated multiple times as recurrent neural networks (RNN), and stacked with others in arbitrary orders, yielding a chain-of-layers (CoLa) per sample. This compositional space greatly expands the scope of existing works on looped/recurrent pretrained modules, layer pruning, or early-exit networks. We develop a Monte Carlo Tree Search (MCTS) protocol to explore and identify the optimal CoLa for each sample from math and commonsense reasoning benchmarks. Compared to a static model of a fixed depth, CoLa allows shortcut paths (fast thinking), recurrence of the same layer(s) (slow thinking), and combining both, offering more flexible, dynamic architectures for different inputs. We conduct an extensive analysis of the MCTS-optimized CoLa, which leads to two key findings: (1) For >75% of samples with correct predictions by the original LLM, we can find shorter CoLa, suggesting a large space for improving inference efficiency; (2) For >60% of samples with originally incorrect predictions, we can identify CoLa achieving correct predictions, suggesting a large space of performance enhancement. Our results highlight the shortcomings of using a fixed architecture of pre-trained LLMs for inference on different samples and pave the way to unlock the generalization power of test-time depth adaptation.
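
A toy PyTorch sketch of the chain-of-layers idea described in the abstract (an illustrative assumption, not the paper's code; small random layers stand in for a pretrained LLM, and the MCTS search over chains is omitted): layers are treated as interchangeable modules that can be skipped or repeated per sample.

```python
import torch

d_model, n_layers = 64, 6
# Stand-ins for the pretrained model's transformer blocks.
layers = torch.nn.ModuleList(
    torch.nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
    for _ in range(n_layers)
)

def run_cola(x, chain):
    # `chain` is a list of layer indices; e.g. [0, 2, 2, 5] skips layers
    # 1, 3 and 4 and repeats layer 2, giving a shallower, partly recurrent
    # architecture for this particular input.
    for idx in chain:
        x = layers[idx](x)
    return x

x = torch.randn(1, 16, d_model)                  # one sample, 16 tokens
out_full = run_cola(x, list(range(n_layers)))    # the original fixed depth
out_cola = run_cola(x, [0, 2, 2, 5])             # a per-sample chain-of-layers
print(out_full.shape, out_cola.shape)
```
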
toXiv_bot_toot

@arXiv_csDC_bot@mastoxiv.page
2025-06-04 07:45:04

Memory-Efficient Split Federated Learning for LLM Fine-Tuning on Heterogeneous Mobile Devices
Xiaopei Chen, Liang Li, Fei Ji, Wen Wu
arxiv.org/abs/2506.02940

@arXiv_csSD_bot@mastoxiv.page
2025-07-14 08:35:02

Audio Inpainting using Discrete Diffusion Model
Tali Dror, Iftach Shoham, Moshe Buchris, Oren Gal, Haim Permuter, Gilad Katz, Eliya Nachmani
arxiv.org/abs/2507.08333

@arXiv_csIR_bot@mastoxiv.page
2025-06-02 09:58:21

This arxiv.org/abs/2410.13230 has been replaced.
initial toot: mastoxiv.page/@arXiv_csIR_…

@arXiv_csSD_bot@mastoxiv.page
2025-06-16 07:56:49

LiLAC: A Lightweight Latent ControlNet for Musical Audio Generation
Tom Baker, Javier Nistal
arxiv.org/abs/2506.11476

@arXiv_csCL_bot@mastoxiv.page
2025-07-03 09:56:10

Adapting Language Models to Indonesian Local Languages: An Empirical Study of Language Transferability on Zero-Shot Settings
Rifki Afina Putri
arxiv.org/abs/2507.01645

@arXiv_eessAS_bot@mastoxiv.page
2025-07-16 08:30:01

Physics-Informed Transfer Learning for Data-Driven Sound Source Reconstruction in Near-Field Acoustic Holography
Xinmeng Luan, Mirco Pezzoli, Fabio Antonacci, Augusto Sarti
arxiv.org/abs/2507.11070

@arXiv_csLG_bot@mastoxiv.page
2025-06-03 08:21:21

IF-GUIDE: Influence Function-Guided Detoxification of LLMs
Zachary Coalson, Juhan Bae, Nicholas Carlini, Sanghyun Hong
arxiv.org/abs/2506.01790

@arXiv_csSD_bot@mastoxiv.page
2025-06-16 08:05:59

Confidence-Based Self-Training for EMG-to-Speech: Leveraging Synthetic EMG for Robust Modeling
Xiaodan Chen, Xiaoxue Gao, Mathias Quoy, Alexandre Pitti, Nancy F. Chen
arxiv.org/abs/2506.11862

@arXiv_eessAS_bot@mastoxiv.page
2025-06-09 08:06:52

TADA: Training-free Attribution and Out-of-Domain Detection of Audio Deepfakes
Adriana Stan, David Combei, Dan Oneata, Hora Cucu
arxiv.org/abs/2506.05802

@arXiv_csSD_bot@mastoxiv.page
2025-07-14 08:06:12

Distilling Spectrograms into Tokens: Fast and Lightweight Bioacoustic Classification for BirdCLEF 2025
Anthony Miyaguchi, Murilo Gustineli, Adrian Cheung
arxiv.org/abs/2507.08236

@arXiv_eessAS_bot@mastoxiv.page
2025-06-13 08:02:30

Joint ASR and Speaker Role Tagging with Serialized Output Training
Anfeng Xu, Tiantian Feng, Shrikanth Narayanan
arxiv.org/abs/2506.10349

@arXiv_eessAS_bot@mastoxiv.page
2025-06-12 08:44:21

Fine-Tuning Large Audio-Language Models with LoRA for Precise Temporal Localization of Prolonged Exposure Therapy Elements
Suhas BN, Andrew M. Sherrill, Jyoti Alaparthi, Dominik Mattioli, Rosa I. Arriaga, Chris W. Wiese, Saeed Abdullah
arxiv.org/abs/2506.09707

@arXiv_eessAS_bot@mastoxiv.page
2025-07-10 08:25:51

Pronunciation-Lexicon Free Training for Phoneme-based Crosslingual ASR via Joint Stochastic Approximation
Saierdaer Yusuyin, Te Ma, Hao Huang, Zhijian Ou
arxiv.org/abs/2507.06249

@arXiv_csSD_bot@mastoxiv.page
2025-06-03 07:27:19

Improving Code Switching with Supervised Fine Tuning and GELU Adapters
Linh Pham
arxiv.org/abs/2506.00291 arxiv.org/p…

@arXiv_csSD_bot@mastoxiv.page
2025-06-03 07:53:53

Attention Is Not Always the Answer: Optimizing Voice Activity Detection with Simple Feature Fusion
Kumud Tripathi, Chowdam Venkata Kumar, Pankaj Wasnik
arxiv.org/abs/2506.01365

@arXiv_eessAS_bot@mastoxiv.page
2025-07-03 09:34:00

Generalizable Detection of Audio Deepfakes
Jose A. Lopez, Georg Stemmer, Héctor Cordourier Maruri
arxiv.org/abs/2507.01750

@arXiv_eessAS_bot@mastoxiv.page
2025-06-02 10:03:28

This arxiv.org/abs/2505.14449 has been replaced.
initial toot: mastoxiv.page/@arXiv_ees…