
2025-09-18 10:18:21
Language models' activations linearly encode training-order recency
Dmitrii Krasheninnikov, Richard E. Turner, David Krueger
https://arxiv.org/abs/2509.14223
In a peer-reviewed Nature article, DeepSeek says it has spent $294,000 on training its R1 model and used 512 Nvidia H800 chips (Eduardo Baptista/Reuters)
https://www.reuters.com/world/china/chinas-deepseek-says-its-hit-ai-model-cos…
Rationalizing Transformer Predictions via End-To-End Differentiable Self-Training
Marc Brinner, Sina Zarrieß
https://arxiv.org/abs/2508.11393
Training-Free Anomaly Generation via Dual-Attention Enhancement in Diffusion Model
Zuo Zuo, Jiahao Dong, Yanyun Qu, Zongze Wu
https://arxiv.org/abs/2508.11550
Substituting Proof of Work in Blockchain with Training-Verified Collaborative Model Computation
Mohammad Ishzaz Asif Rafid, Morsalin Sakib
https://arxiv.org/abs/2508.12138
FlightDiffusion: Revolutionising Autonomous Drone Training with Diffusion Models Generating FPV Video
Valerii Serpiva, Artem Lykov, Faryal Batool, Vladislav Kozlovskiy, Miguel Altamirano Cabrera, Dzmitry Tsetserukou
https://arxiv.org/abs/2509.14082
"In theory, AI model makers could eliminate hallucinations by using a dataset that contains no errors."
I think someone has fundamentally misunderstood the technology. Training a model on a 100% correct dataset does not mean the resulting AI can correctly answer questions that were not in the training data: the model still has to generalize beyond what it memorized, and that generalization can fail even when every training label is right.
Overfitting is a thing.
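A toy sketch of that point (my own illustration, not from the article; the setup and numbers are arbitrary): a high-capacity model fit to ten perfectly clean data points reproduces its training set almost exactly, yet is confidently wrong on inputs it never saw.

import numpy as np

# Ten error-free samples of sin(x): a "100% correct training dataset".
rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0.0, 3.0, size=10))
y_train = np.sin(x_train)

# A degree-9 polynomial has enough capacity to interpolate all ten points.
coeffs = np.polyfit(x_train, y_train, deg=9)

train_err = np.abs(np.polyval(coeffs, x_train) - y_train).max()
print(f"max train error: {train_err:.2e}")  # near zero: training set "learned"

# Questions outside the training data: the model answers, and answers badly.
x_test = np.linspace(3.5, 5.0, 4)
test_err = np.abs(np.polyval(coeffs, x_test) - np.sin(x_test)).max()
print(f"max test error:  {test_err:.2e}")   # large, despite zero label noise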
SimInterview: Transforming Business Education through Large Language Model-Based Simulated Multilingual Interview Training System
Truong Thanh Hung Nguyen, Tran Diem Quynh Nguyen, Hoang Loc Cao, Thi Cam Thanh Tran, Thi Cam Mai Truong, Hung Cao
https://arxiv.org/abs/2508.11873
Mantis: A Simulation-Grounded Foundation Model for Disease Forecasting
Carson Dudley, Reiden Magdaleno, Christopher Harding, Ananya Sharma, Emily Martin, Marisa Eisenberg
https://arxiv.org/abs/2508.12260
ADMIRE-BayesOpt: Accelerated Data MIxture RE-weighting for Language Models with Bayesian Optimization
Shengzhuang Chen, Xu Ouyang, Michael Arthur Leopold Pearce, Thomas Hartvigsen, Jonathan Richard Schwarz
https://arxiv.org/abs/2508.11551
NeMo: A Neuron-Level Modularizing-While-Training Approach for Decomposing DNN Models
Xiaohan Bi, Binhang Qi, Hailong Sun, Xiang Gao, Yue Yu, Xiaojun Liang
https://arxiv.org/abs/2508.11348
Breaking the Aggregation Bottleneck in Federated Recommendation: A Personalized Model Merging Approach
Jundong Chen, Honglei Zhang, Chunxu Zhang, Fangyuan Luo, Yidong Li
https://arxiv.org/abs/2508.12386
NYC-based Protege, which prepares and sells real-world datasets like lab results and sports footage for AI training, raised a $25M Series A led by Footwork (Natasha Mascarenhas/The Information)
https://www.theinformation.com/articles/one-year-old…
WorldForge: Unlocking Emergent 3D/4D Generation in Video Diffusion Model via Training-Free Guidance
Chenxi Song, Yanming Yang, Tong Zhao, Ruibo Li, Chi Zhang
https://arxiv.org/abs/2509.15130
A Large-Scale Web Search Dataset for Federated Online Learning to Rank
Marcel Gregoriadis, Jingwei Kang, Johan Pouwelse
https://arxiv.org/abs/2508.12353
From Hype to Insight: Rethinking Large Language Model Integration in Visual Speech Recognition
Rishabh Jain, Naomi Harte
https://arxiv.org/abs/2509.14880
Efficient Quantization-Aware Neural Receivers: Beyond Post-Training Quantization
SaiKrishna Saketh Yellapragada, Esa Ollila, Mario Costa
https://arxiv.org/abs/2509.13786
CSMoE: An Efficient Remote Sensing Foundation Model with Soft Mixture-of-Experts
Leonard Hackel, Tom Burgert, Begüm Demir
https://arxiv.org/abs/2509.14104
Adversarial Distilled Retrieval-Augmented Guarding Model for Online Malicious Intent Detection
Yihao Guo, Haocheng Bian, Liutong Zhou, Ze Wang, Zhaoyi Zhang, Francois Kawala, Milan Dean, Ian Fischer, Yuantao Peng, Noyan Tokgozoglu, Ivan Barrientos, Riyaaz Shaik, Rachel Li, Chandru Venkataraman, Reza Shifteh Far, Moses Pawar, Venkat Sundaranatha, Michael Xu, Frank Chu
GeoAware-VLA: Implicit Geometry Aware Vision-Language-Action Model
Ali Abouzeid, Malak Mansour, Zezhou Sun, Dezhen Song
https://arxiv.org/abs/2509.14117
From Who Said What to Who They Are: Modular Training-free Identity-Aware LLM Refinement of Speaker Diarization
Yu-Wen Chen, William Ho, Maxim Topaz, Julia Hirschberg, Zoran Kostic
https://arxiv.org/abs/2509.15082
Survey-to-Behavior: Downstream Alignment of Human Values in LLMs via Survey Questions
Shangrui Nie, Florian Mai, David Kaczér, Charles Welch, Zhixue Zhao, Lucie Flek
https://arxiv.org/abs/2508.11414
A Universal Banach–Bregman Framework for Stochastic Iterations: Unifying Stochastic Mirror Descent, Learning and LLM Training
Johnny R. Zhang (Independent Researcher), Xiaomei Mi (University of Manchester), Gaoyuan Du (Amazon), Qianyi Sun (Microsoft), Shiqi Wang (Meta), Jiaxuan Li (Amazon), Wenhua Zhou (Independent Researcher)
https://arx…
Semiparametric Learning from Open-Set Label Shift Data
Siyan Liu, Yukun Liu, Qinglong Tian, Pengfei Li, Jing Qin
https://arxiv.org/abs/2509.14522
AutoPower: Automated Few-Shot Architecture-Level Power Modeling by Power Group Decoupling
Qijun Zhang, Yao Lu, Mengming Li, Zhiyao Xie
https://arxiv.org/abs/2508.12294
A proposal for automated turbulence modelling
Marco Castelletti, Maurizio Quadrio
https://arxiv.org/abs/2509.14140 https://arxiv.org/pdf/2509.14140
MEGAN: Mixture of Experts for Robust Uncertainty Estimation in Endoscopy Videos
Damola Agbelese, Krishna Chaitanya, Pushpak Pati, Chaitanya Parmar, Pooya Mobadersany, Shreyas Fadnavis, Lindsey Surace, Shadi Yarandi, Louis R. Ghanem, Molly Lucas, Tommaso Mansi, Oana Gabriela Cula, Pablo F. Damasceno, Kristopher Standish
https://arxiv.org/ab…
Noise Supervised Contrastive Learning and Feature-Perturbed for Anomalous Sound Detection
Shun Huang, Zhihua Fang, Liang He
https://arxiv.org/abs/2509.13853
Synthetic-to-Real Object Detection using YOLOv11 and Domain Randomization Strategies
Luisa Torquato Niño, Hamza A. A. Gardi
https://arxiv.org/abs/2509.15045
FRIT: Using Causal Importance to Improve Chain-of-Thought Faithfulness
Anand Swaroop, Akshat Nallani, Saksham Uboweja, Adiliia Uzdenova, Michael Nguyen, Kevin Zhu, Sunishchal Dev, Ashwinee Panda, Vasu Sharma, Maheep Chaudhary
https://arxiv.org/abs/2509.13334
Physics-Informed Diffusion Models for Unsupervised Anomaly Detection in Multivariate Time Series
Juhi Soni, Markus Lange-Hegermann, Stefan Windmann
https://arxiv.org/abs/2508.11528
Where to Start Alignment? Diffusion Large Language Model May Demand a Distinct Position
Zhixin Xie, Xurui Song, Jun Luo
https://arxiv.org/abs/2508.12398
Canary-1B-v2 & Parakeet-TDT-0.6B-v3: Efficient and High-Performance Models for Multilingual ASR and AST
Monica Sekoyan, Nithin Rao Koluguri, Nune Tadevosyan, Piotr Zelasko, Travis Bartley, Nick Karpov, Jagadeesh Balam, Boris Ginsburg
https://arxiv.org/abs/2509.14128
CollabVLA: Self-Reflective Vision-Language-Action Model Dreaming Together with Human
Nan Sun, Yongchang Li, Chenxu Wang, Huiying Li, Huaping Liu
https://arxiv.org/abs/2509.14889
Replaced article(s) found for physics.chem-ph. https://arxiv.org/list/physics.chem-ph/new
[1/1]:
- Machine Learning Interatomic Potentials: library for efficient training, model development and si...
Christoph Brunken, et al.
Pre-trained Transformer-models using chronic invasive electrophysiology for symptom decoding without patient-individual training
Timon Merk, Saeed Salehi, Richard M. Koehler, Qiming Cui, Maria Olaru, Amelia Hahn, Nicole R. Provenza, Simon Little, Reza Abbasi-Asl, Phil A. Starr, Wolf-Julian Neumann
https://arxiv.org/abs/2508.10160
A Neural-Network Framework for Tracking and Identification of Cosmic-Ray Nuclei in the RadMap Telescope
Luise Meyer-Hetling, Martin J. Losekamm, Stephan Paul, Thomas Pöschl
https://arxiv.org/abs/2508.12708
Ensembling Large Language Models for Code Vulnerability Detection: An Empirical Evaluation
Zhihong Sun, Jia Li, Yao Wan, Chuanyi Li, Hongyu Zhang, Zhi Jin, Ge Li, Hong Liu, Chen Lyu, Songlin Hu
https://arxiv.org/abs/2509.12629
Accelerating Edge Inference for Distributed MoE Models with Latency-Optimized Expert Placement
Tian Wu, Liming Wang, Zijian Wen, Xiaoxi Zhang, Jingpu Duan, Xianwei Zhang, Jinhang Zuo
https://arxiv.org/abs/2508.12851
How does an AI Weather Model Learn to Forecast Extreme Weather?
Rebecca Baiman, Elizabeth A. Barnes, Ankur Mahesh
https://arxiv.org/abs/2509.10639
Beyond the Rosetta Stone: Unification Forces in Generalization Dynamics
Carter Blum, Katja Filipova, Ann Yuan, Asma Ghandeharioun, Julian Zimmert, Fred Zhang, Jessica Hoffmann, Tal Linzen, Martin Wattenberg, Lucas Dixon, Mor Geva
https://arxiv.org/abs/2508.11017
FS-SAM2: Adapting Segment Anything Model 2 for Few-Shot Semantic Segmentation via Low-Rank Adaptation
Bernardo Forni, Gabriele Lombardi, Federico Pozzi, Mirco Planamente
https://arxiv.org/abs/2509.12105
Unlearning Comparator: A Visual Analytics System for Comparative Evaluation of Machine Unlearning Methods
Jaeung Lee, Suhyeon Yu, Yurim Jang, Simon S. Woo, Jaemin Jo
https://arxiv.org/abs/2508.12730
Back to Ear: Perceptually Driven High Fidelity Music Reconstruction
Kangdi Wang, Zhiyue Wu, Dinghao Zhou, Rui Lin, Junyu Dai, Tao Jiang
https://arxiv.org/abs/2509.14912
ATLAS: AI-Native Receiver Test-and-Measurement by Leveraging AI-Guided Search
Mauro Belgiovine, Suyash Pradhan, Johannes Lange, Michael Löhning, Kaushik Chowdhury
https://arxiv.org/abs/2508.12204
Wisdom of the Crowd: Reinforcement Learning from Coevolutionary Collective Feedback
Wenzhen Yuan, Shengji Tang, Weihao Lin, Jiacheng Ruan, Ganqu Cui, Bo Zhang, Tao Chen, Ting Liu, Yuzhuo Fu, Peng Ye, Lei Bai
https://arxiv.org/abs/2508.12338
Multi-Model Synthetic Training for Mission-Critical Small Language Models
Nolan Platt, Pragyansmita Nayak
https://arxiv.org/abs/2509.13047
Differential Privacy in Federated Learning: Mitigating Inference Attacks with Randomized Response
Ozer Ozturk, Busra Buyuktanir, Gozde Karatas Baydogmus, Kazim Yildiz
https://arxiv.org/abs/2509.13987
Low-rank Orthogonalization for Large-scale Matrix Optimization with Applications to Foundation Model Training
Chuan He, Zhanwang Deng, Zhaosong Lu
https://arxiv.org/abs/2509.11983
FLAMMABLE: A Multi-Model Federated Learning Framework with Multi-Model Engagement and Adaptive Batch Sizes
Shouxu Lin, Zimeng Pan, Yuhang Yao, Haeyoung Noh, Pei Zhang, Carlee Joe-Wong
https://arxiv.org/abs/2510.10380
Toward Embodiment Equivariant Vision-Language-Action Policy
Anzhe Chen, Yifei Yang, Zhenjie Zhu, Kechun Xu, Zhongxiang Zhou, Rong Xiong, Yue Wang
https://arxiv.org/abs/2509.14630
Replaced article(s) found for cs.CV. https://arxiv.org/list/cs.CV/new
[4/4]:
- IRL-VLA: Training an Vision-Language-Action Policy via Reward World Model
Jiang, Gao, Wang, Sun, Wang, Heng, Sun, Tang, Zhu, Chai, Wang, Gu, Jiang, Sun
Mitigating data replication in text-to-audio generative diffusion models through anti-memorization guidance
Francisco Messina, Francesca Ronchini, Luca Comanducci, Paolo Bestagini, Fabio Antonacci
https://arxiv.org/abs/2509.14934
Proxy Model-Guided Reinforcement Learning for Client Selection in Federated Recommendation
Liang Qu, Jianxin Li, Wei Yuan, Penghui Ruan, Yuhui Shi, Hongzhi Yin
https://arxiv.org/abs/2508.10401
Compute as Teacher: Turning Inference Compute Into Reference-Free Supervision
Dulhan Jayalath, Shashwat Goel, Thomas Foster, Parag Jain, Suchin Gururangan, Cheng Zhang, Anirudh Goyal, Alan Schelten
https://arxiv.org/abs/2509.14234
LLM-OREF: An Open Relation Extraction Framework Based on Large Language Models
Hongyao Tu, Liang Zhang, Yujie Lin, Xin Lin, Haibo Zhang, Long Zhang, Jinsong Su
https://arxiv.org/abs/2509.15089
RealMirror: A Comprehensive, Open-Source Vision-Language-Action Platform for Embodied AI
Cong Tai, Zhaoyu Zheng, Haixu Long, Hansheng Wu, Haodong Xiang, Zhengbin Long, Jun Xiong, Rong Shi, Shizhuang Zhang, Gang Qiu, He Wang, Ruifeng Li, Jun Huang, Bin Chang, Shuai Feng, Tao Shen
https://arxiv.org/abs/2509.14687
MeanFlowSE: one-step generative speech enhancement via conditional mean flow
Duojia Li, Shenghui Lu, Hongchen Pan, Zongyi Zhan, Qingyang Hong, Lin Li
https://arxiv.org/abs/2509.14858
Bayesian Signal Separation via Plug-and-Play Diffusion-Within-Gibbs Sampling
Yi Zhang, Rui Guo, Yonina C. Eldar
https://arxiv.org/abs/2509.12857
Morphology-optimized Multi-Scale Fusion: Combining Local Artifacts and Mesoscopic Semantics for Deepfake Detection and Localization
Chao Shuai, Gaojian Wang, Kun Pan, Tong Wu, Fanli Jin, Haohan Tan, Mengxiang Li, Zhenguang Liu, Feng Lin, Kui Ren
https://arxiv.org/abs/2509.13776
On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting
Wenhao Zhang, Yuexiang Xie, Yuchang Sun, Yanxi Chen, Guoyin Wang, Yaliang Li, Bolin Ding, Jingren Zhou
https://arxiv.org/abs/2508.11408
Beyond Data Privacy: New Privacy Risks for Large Language Models
Yuntao Du, Zitao Li, Ninghui Li, Bolin Ding
https://arxiv.org/abs/2509.14278
Dataset Creation for Visual Entailment using Generative AI
Rob Reijtenbach, Suzan Verberne, Gijs Wijnholds
https://arxiv.org/abs/2508.11605
LLM Enhancement with Domain Expert Mental Model to Reduce LLM Hallucination with Causal Prompt Engineering
Boris Kovalerchuk, Brent D. Fegley
https://arxiv.org/abs/2509.10818
Benchmarking Prosody Encoding in Discrete Speech Tokens
Kentaro Onda, Satoru Fukayama, Daisuke Saito, Nobuaki Minematsu
https://arxiv.org/abs/2508.11224
FuXi-β: Towards a Lightweight and Fast Large-Scale Generative Recommendation Model
Yufei Ye, Wei Guo, Hao Wang, Hong Zhu, Yuyang Ye, Yong Liu, Huifeng Guo, Ruiming Tang, Defu Lian, Enhong Chen
https://arxiv.org/abs/2508.10615
Differentially private federated learning for localized control of infectious disease dynamics
Raouf Kerkouche, Henrik Zunker, Mario Fritz, Martin J. Kühn
https://arxiv.org/abs/2509.14024
MoE-TTS: Enhancing Out-of-Domain Text Understanding for Description-based TTS via Mixture-of-Experts
Heyang Xue, Xuchen Song, Yu Tang, Jianyu Chen, Yanru Chen, Yang Li, Yahui Zhou
https://arxiv.org/abs/2508.11326
Verify Distributed Deep Learning Model Implementation Refinement with Iterative Relation Inference
Zhanghan Wang, Ding Ding, Hang Zhu, Haibin Lin, Aurojit Panda
https://arxiv.org/abs/2508.09505
Optimizing Token Choice for Code Watermarking: A RL Approach
Zhimeng Guo, Huaisheng Zhu, Siyuan Xu, Hangfan Zhang, Teng Xiao, Minhao Cheng
https://arxiv.org/abs/2508.11925
LNE-Blocking: An Efficient Framework for Contamination Mitigation Evaluation on Large Language Models
Ruijie Hou, Yueyang Jiao, Hanxu Hu, Yingming Li, Wai Lam, Huajian Zhang, Hongyuan Lu
https://arxiv.org/abs/2509.15218
From Distributional to Quantile Neural Basis Models: the case of Electricity Price Forecasting
Alessandro Brusaferri, Danial Ramin, Andrea Ballarino
https://arxiv.org/abs/2509.14113
ForTIFAI: Fending Off Recursive Training Induced Failure for AI Models
Soheil Zibakhsh Shabgahi, Pedram Aghazadeh, Azalia Mirhosseini, Farinaz Koushanfar
https://arxiv.org/abs/2509.08972
DSPC: Dual-Stage Progressive Compression Framework for Efficient Long-Context Reasoning
Yaxin Gao, Yao Lu, Zongfei Zhang, Jiaqi Nie, Shanqing Yu, Qi Xuan
https://arxiv.org/abs/2509.13723
Residual MPC: Blending Reinforcement Learning with GPU-Parallelized Model Predictive Control
Se Hwan Jeon, Ho Jae Lee, Seungwoo Hong, Sangbae Kim
https://arxiv.org/abs/2510.12717
An Efficient Model-Driven Groupwise Approach for Atlas Construction
Ziwei Zou, Bei Zou, Xiaoyan Kui, Wenqi Lu, Haoran Dou, Arezoo Zakeri, Timothy Cootes, Alejandro F Frangi, Jinming Duan
https://arxiv.org/abs/2508.10743
UTI-LLM: A Personalized Articulatory-Speech Therapy Assistance System Based on Multimodal Large Language Model
Yudong Yang, Xiaokang Liu, Shaofeng Zhao, Rongfeng Su, Nan Yan, Lan Wang
https://arxiv.org/abs/2509.13145
Human-in-Context: Unified Cross-Domain 3D Human Motion Modeling via In-Context Learning
Mengyuan Liu, Xinshun Wang, Zhongbin Fang, Deheng Ye, Xia Li, Tao Tang, Songtao Wu, Xiangtai Li, Ming-Hsuan Yang
https://arxiv.org/abs/2508.10897
CorrectNav: Self-Correction Flywheel Empowers Vision-Language-Action Navigation Model
Zhuoyuan Yu, Yuxing Long, Zihan Yang, Chengyan Zeng, Hongwei Fan, Jiyao Zhang, Hao Dong
https://arxiv.org/abs/2508.10416
All Roads Lead to Rome: Graph-Based Confidence Estimation for Large Language Model Reasoning
Caiqi Zhang, Chang Shu, Ehsan Shareghi, Nigel Collier
https://arxiv.org/abs/2509.12908
MAUI: Reconstructing Private Client Data in Federated Transfer Learning
Ahaan Dabholkar, Atul Sharma, Z. Berkay Celik, Saurabh Bagchi
https://arxiv.org/abs/2509.11451
Diversity First, Quality Later: A Two-Stage Assumption for Language Model Alignment
Zetian Sun, Dongfang Li, Baotian Hu
https://arxiv.org/abs/2508.10530
DiTReducio: A Training-Free Acceleration for DiT-Based TTS via Progressive Calibration
Yanru Huo, Ziyue Jiang, Zuoli Tang, Qingyang Hong, Zhou Zhao
https://arxiv.org/abs/2509.09748
FedBiF: Communication-Efficient Federated Learning via Bits Freezing
Shiwei Li, Qunwei Li, Haozhao Wang, Ruixuan Li, Jianbin Lin, Wenliang Zhong
https://arxiv.org/abs/2509.10161
MAGIC-Enhanced Keyword Prompting for Zero-Shot Audio Captioning with CLIP Models
Vijay Govindarajan, Pratik Patel, Sahil Tripathi, Md Azizul Hoque, Gautam Siddharth Kashyap
https://arxiv.org/abs/2509.12591
ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution
Long Cui, Weiyun Wang, Jie Shao, Zichen Wen, Gen Luo, Linfeng Zhang, Yanting Zhang, Yu Qiao, Wenhai Wang
https://arxiv.org/abs/2510.12793
Task-Specific Dual-Model Framework for Comprehensive Traffic Safety Video Description and Analysis
Blessing Agyei Kyem, Neema Jakisa Owor, Andrews Danyo, Joshua Kofi Asamoah, Eugene Denteh, Tanner Muturi, Anthony Dontoh, Yaw Adu-Gyamfi, Armstrong Aboah
https://arxiv.org/abs/2510.11907
Laminar: A Scalable Asynchronous RL Post-Training Framework
Guangming Sheng, Yuxuan Tong, Borui Wan, Wang Zhang, Chaobo Jia, Xibin Wu, Yuqi Wu, Xiang Li, Chi Zhang, Yanghua Peng, Haibin Lin, Xin Liu, Chuan Wu
https://arxiv.org/abs/2510.12633
FIDELIS: Blockchain-Enabled Protection Against Poisoning Attacks in Federated Learning
Jane Carney, Kushal Upreti, Gaby G. Dagher, Tim Andersen
https://arxiv.org/abs/2508.10042
Active Membership Inference Test (aMINT): Enhancing Model Auditability with Multi-Task Learning
Daniel DeAlcala, Aythami Morales, Julian Fierrez, Gonzalo Mancera, Ruben Tolosana, Javier Ortega-Garcia
https://arxiv.org/abs/2509.07879
Revisiting Data Attribution for Influence Functions
Hongbo Zhu, Angelo Cangelosi
https://arxiv.org/abs/2508.07297 https://arxiv.org/pdf/2508.07297
Logit Arithmetic Elicits Long Reasoning Capabilities Without Training
Yunxiang Zhang, Muhammad Khalifa, Lechen Zhang, Xin Liu, Ayoung Lee, Xinliang Frederick Zhang, Farima Fatahi Bayat, Lu Wang
https://arxiv.org/abs/2510.09354
Training Dynamics Impact Post-Training Quantization Robustness
Albert Catalan-Tatjer, Niccolò Ajroldi, Jonas Geiping
https://arxiv.org/abs/2510.06213
DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training
Haoran Feng, Dizhe Zhang, Xiangtai Li, Bo Du, Lu Qi
https://arxiv.org/abs/2510.11712
Event2Vec: A Geometric Approach to Learning Composable Representations of Event Sequences
Antonin Sulc
https://arxiv.org/abs/2509.12188
LayerSync: Self-aligning Intermediate Layers
Yasaman Haghighi, Bastien van Delft, Mariam Hassan, Alexandre Alahi
https://arxiv.org/abs/2510.12581
Representation-Based Exploration for Language Models: From Test-Time to Post-Training
Jens Tuyls, Dylan J. Foster, Akshay Krishnamurthy, Jordan T. Ash
https://arxiv.org/abs/2510.11686
Exploring Pre-training Across Domains for Few-Shot Surgical Skill Assessment
Dimitrios Anastasiou, Razvan Caramalau, Nazir Sirajudeen, Matthew Boal, Philip Edwards, Justin Collins, John Kelly, Ashwin Sridhar, Maxine Tran, Faiz Mumtaz, Nevil Pavithran, Nader Francis, Danail Stoyanov, Evangelos B. Mazomenos
https://arxiv.org/abs/2509.09327
Exploring Expert Specialization through Unsupervised Training in Sparse Mixture of Experts
Strahinja Nikolic, Ilker Oguz, Demetri Psaltis
https://arxiv.org/abs/2509.10025
Open-sci-ref-0.01: open and reproducible reference baselines for language model and dataset comparison
Marianna Nezhurina, Taishi Nakamura, Timur Carstensen, Niccolò Ajroldi, Ville Komulainen, David Salinas, Jenia Jitsev
https://arxiv.org/abs/2509.09009