Tootfinder

Opt-in global Mastodon full text search. Join the index!

@Techmeme@techhub.social
2025-10-10 20:26:02

SemiAnalysis launches InferenceMAX, an open-source benchmark that automatically tracks LLM inference performance across AI models and frameworks every night (Kimbo Chen/SemiAnalysis)
newsletter.semianalysis.com/p/

@arXiv_csCV_bot@mastoxiv.page
2025-09-11 09:32:43

Boosted Training of Lightweight Early Exits for Optimizing CNN Image Classification Inference
Yehudit Aperstein, Alexander Apartsin
arxiv.org/abs/2509.08318

@arXiv_csAI_bot@mastoxiv.page
2025-07-11 09:38:21

Towards conservative inference in credal networks using belief functions: the case of credal chains
Marco Sangalli, Thomas Krak, Cassio de Campos
arxiv.org/abs/2507.07619

@arXiv_csCL_bot@mastoxiv.page
2025-08-11 10:03:19

SlimInfer: Accelerating Long-Context LLM Inference via Dynamic Token Pruning
Lingkun Long, Rubing Yang, Yushi Huang, Desheng Hui, Ao Zhou, Jianlei Yang
arxiv.org/abs/2508.06447

@arXiv_csDC_bot@mastoxiv.page
2025-08-11 07:40:09

KV Cache Compression for Inference Efficiency in LLMs: A Review
Yanyu Liu (Shandong University of Science and Technology), Jingying Fu (Shandong University of Science and Technology), Sixiang Liu (Shandong University of Science and Technology), Yitian Zou (Shandong University of Science and Technology), You Fu (Shandong University of Science and Technology), Jiehan Zhou (Shandong University of Science and Technology), Shouhua Zhang (University of Oulu)

@arXiv_mathNA_bot@mastoxiv.page
2025-09-11 08:12:43

Tensor-Train Operator Inference
Engin Danis, Duc Truong, Kim Ø. Rasmussen, Boian S. Alexandrov
arxiv.org/abs/2509.08071

@arXiv_csAR_bot@mastoxiv.page
2025-10-10 07:36:49

SPAD: Specialized Prefill and Decode Hardware for Disaggregated LLM Inference
Hengrui Zhang, Pratyush Patel, August Ning, David Wentzlaff
arxiv.org/abs/2510.08544

@arXiv_csCR_bot@mastoxiv.page
2025-08-11 09:36:19

DMFI: Dual-Modality Fine-Tuning and Inference Framework for LLM-Based Insider Threat Detection
Kaichuan Kong, Dongjie Liu, Xiaobo Jin, Guanggang Geng, Zhiying Li, Jian Weng
arxiv.org/abs/2508.05694

@arXiv_statML_bot@mastoxiv.page
2025-10-10 09:22:29

Stick-Breaking Mixture Normalizing Flows with Component-Wise Tail Adaptation for Variational Inference
Seungsu Han, Juyoung Hwang, Won Chang
arxiv.org/abs/2510.07965

@arXiv_mathST_bot@mastoxiv.page
2025-08-11 09:11:19

Consistency of variational inference for Besov priors in non-linear inverse problems
Shaokang Zu, Junxiong Jia, Zhiguo Wang
arxiv.org/abs/2508.06179

@arXiv_csLG_bot@mastoxiv.page
2025-10-10 11:05:29

Mix- and MoE-DPO: A Variational Inference Approach to Direct Preference Optimization
Jason Bohne, Pawel Polak, David Rosenberg, Brian Bloniarz, Gary Kazantsev
arxiv.org/abs/2510.08256

@arXiv_astrophGA_bot@mastoxiv.page
2025-09-10 08:36:51

LIMFAST. IV. Learning High-Redshift Galaxy Formation from Multiline Intensity Mapping with Implicit Likelihood Inference
Guochao Sun, Tri Nguyen, Claude-André Faucher-Giguère, Adam Lidz, Tjitske Starkenburg, Bryan R. Scott, Tzu-Ching Chang, Steven R. Furlanetto
arxiv.org/abs/2509.07060

@arXiv_csAI_bot@mastoxiv.page
2025-07-11 07:33:01

State-Inference-Based Prompting for Natural Language Trading with Game NPCs
Minkyung Kim, Junsik Kim, Hwidong Bae, Woongcheol Yang, Sangdon Park, Sohee Bae
arxiv.org/abs/2507.07203

@arXiv_astrophHE_bot@mastoxiv.page
2025-07-11 10:00:01

A Bayesian Framework for UHECR Source Association and Parameter Inference
Keito Watanabe, Anatoli Fedynitch, Francesca Capel, Hiroyuki Sagawa
arxiv.org/abs/2507.07856

@arXiv_csRO_bot@mastoxiv.page
2025-08-11 09:26:09

ReNiL: Relative Neural Inertial Locator with Any-Scale Bayesian Inference
Kaixuan Wu (School of Computer Science, Wuhan University, Wuhan, China, School of Cyber Science and Engineering, Wuhan University, Wuhan, China), Yuanzhuo Xu (School of Computer Science, Wuhan University, Wuhan, China), Zejun Zhang (University of Southern California, Los Angeles, United States), Weiping Zhu (School of Computer Science, Wuhan University, Wuhan, China), Steve Drew (Department of Electrical and Soft…

@arXiv_csLO_bot@mastoxiv.page
2025-10-10 08:12:48

Dynamic Automated Deduction by Contradiction Separation: The Standard Extension Algorithm
Yang Xu, Xingxing He, Shuwei Chen, Jun Liu, Xiaomei Zhong
arxiv.org/abs/2510.08468

@arXiv_csCV_bot@mastoxiv.page
2025-09-10 10:43:41

Active Membership Inference Test (aMINT): Enhancing Model Auditability with Multi-Task Learning
Daniel DeAlcala, Aythami Morales, Julian Fierrez, Gonzalo Mancera, Ruben Tolosana, Javier Ortega-Garcia
arxiv.org/abs/2509.07879

@arXiv_csIT_bot@mastoxiv.page
2025-10-10 07:51:59

Near-optimal Rank Adaptive Inference of High Dimensional Matrices
Frédéric Zheng, Yassir Jedra, Alexandre Proutiere
arxiv.org/abs/2510.08117

@arXiv_csDC_bot@mastoxiv.page
2025-07-11 09:21:41

KIS-S: A GPU-Aware Kubernetes Inference Simulator with RL-Based Auto-Scaling
Guilin Zhang, Wulan Guo, Ziqi Tan, Qiang Guan, Hailong Jiang
arxiv.org/abs/2507.07932

@arXiv_csSE_bot@mastoxiv.page
2025-09-11 09:07:33

Handling Open-Vocabulary Constructs in Formalizing Specifications: Retrieval-Augmented Parsing with Expert Knowledge
Mohammad Saqib Hasan, Sayontan Ghosh, Dhruv Verma, Geoff Kuenning, Erez Zadok, Scott A. Smolka, Niranjan Balasubramanian
arxiv.org/abs/2509.08808

@arXiv_csCC_bot@mastoxiv.page
2025-07-11 07:32:41

Nonogram: Complexity of Inference and Phase Transition Behavior
Aaron Foote, Danny Krizanc
arxiv.org/abs/2507.07283

@arXiv_statME_bot@mastoxiv.page
2025-08-11 08:22:10

Identifiability and Inference for Generalized Latent Factor Models
Chengyu Cui, Gongjun Xu
arxiv.org/abs/2508.05866

@arXiv_csDB_bot@mastoxiv.page
2025-09-10 07:36:11

JOINT: Join Optimization and Inference via Network Traversal
Szu-Yun Ko, Ethan Chen, Bo-Cian Chang, Alan Shu-Luen Chang
arxiv.org/abs/2509.07230

@arXiv_mathOC_bot@mastoxiv.page
2025-09-10 09:26:01

Differential Dynamic Programming for the Optimal Control Problem with an Ellipsoidal Target Set and Its Statistical Inference
Sungjun Eom, Gyunghoon Park
arxiv.org/abs/2509.07546

@arXiv_csSD_bot@mastoxiv.page
2025-10-10 08:47:29

Attribution-by-design: Ensuring Inference-Time Provenance in Generative Music Systems
Fabio Morreale, Wiebke Hutiri, Joan Serrà, Alice Xiang, Yuki Mitsufuji
arxiv.org/abs/2510.08062

@arXiv_csGT_bot@mastoxiv.page
2025-09-10 07:40:51

Inference of Intrinsic Rewards and Fairness in Multi-Agent Systems
Victor Villin, Christos Dimitrakakis
arxiv.org/abs/2509.07650

@arXiv_hepph_bot@mastoxiv.page
2025-10-10 09:24:39

Simulation-based inference for neutrino interaction model parameter tuning
Karla Tame-Narvaez, Aleksandra Ćiprijanović, Steven Gardiner, Giuseppe Cerati
arxiv.org/abs/2510.07454

@arXiv_qfinRM_bot@mastoxiv.page
2025-09-11 08:14:03

Chaotic Bayesian Inference: Strange Attractors as Risk Models for Black Swan Events
Crystal Rust
arxiv.org/abs/2509.08183

@simon_brooke@mastodon.scot
2025-07-11 11:04:31

"[Chain of reasoning] reports are untrustworthy on principle: they are plausible explanations for plausible responses, and since the inferences involved are more complex, they burn more compute and carbon per query as well as introducing more mistakes"
This is a particularly offensive point about #LLMs: we actually do have a class of systems, inference engines, which do reason and can…

@arXiv_csCR_bot@mastoxiv.page
2025-10-10 08:34:39

Comparison of Fully Homomorphic Encryption and Garbled Circuit Techniques in Privacy-Preserving Machine Learning Inference
Kalyan Cheerla (University of North Texas), Lotfi Ben Othmane (University of North Texas), Kirill Morozov (University of North Texas)
arxiv.org/abs/2510.07457

@arXiv_astrophCO_bot@mastoxiv.page
2025-07-11 09:09:11

Fisher Score Matching for Simulation-Based Forecasting and Inference
Ce Sui, Shivam Pandey, Benjamin D. Wandelt
arxiv.org/abs/2507.07833

@arXiv_econEM_bot@mastoxiv.page
2025-09-11 09:00:03

Posterior inference of attitude-behaviour relationships using latent class choice models
Akshay Vij, Stephane Hess
arxiv.org/abs/2509.08373

@arXiv_eessAS_bot@mastoxiv.page
2025-08-11 07:48:09

NanoCodec: Towards High-Quality Ultra Fast Speech LLM Inference
Edresson Casanova, Paarth Neekhara, Ryan Langman, Shehzeen Hussain, Subhankar Ghosh, Xuesong Yang, Ante Jukić, Jason Li, Boris Ginsburg
arxiv.org/abs/2508.05835

@seeingwithsound@mas.to
2025-10-09 20:21:08

Space impacts temporal processing via a visual-dependent spatially organized neural architecture doi.org/10.1523/jneurosci.1444 "spatial features affected the temporal processing of sighted but not blind people, regardless of age."

@arXiv_csMA_bot@mastoxiv.page
2025-09-10 09:12:11

Towards Generalized Routing: Model and Agent Orchestration for Adaptive and Efficient Inference
Xiyu Guo, Shan Wang, Chunfang Ji, Xuefeng Zhao, Wenhao Xi, Yaoyao Liu, Qinglan Li, Chao Deng, Junlan Feng
arxiv.org/abs/2509.07571

@arXiv_csLG_bot@mastoxiv.page
2025-09-10 10:39:51

MoE-Compression: How the Compression Error of Experts Affects the Inference Accuracy of MoE Model?
Songkai Ma, Zhaorui Zhang, Sheng Di, Benben Liu, Xiaodong Yu, Xiaoyi Lu, Dan Wang
arxiv.org/abs/2509.07727

@arXiv_csAI_bot@mastoxiv.page
2025-08-11 07:32:49

A Framework for Inherently Safer AGI through Language-Mediated Active Inference
Bo Wen
arxiv.org/abs/2508.05766 arxiv.org/pdf/2508.05766

@arXiv_physicschemph_bot@mastoxiv.page
2025-07-11 09:16:21

Physics-Informed Gaussian Process Inference of Liquid Structure from Scattering Data
Harry W. Sullivan, Brennon L. Shanks, Matej Cervenka, Michael P. Hoepfner
arxiv.org/abs/2507.07948

@arXiv_csDC_bot@mastoxiv.page
2025-08-11 07:37:49

EC2MoE: Adaptive End-Cloud Pipeline Collaboration Enabling Scalable Mixture-of-Experts Inference
Zheming Yang, Yunqing Hu, Sheng Sun, Wen Ji
arxiv.org/abs/2508.06024

@arXiv_quantph_bot@mastoxiv.page
2025-10-09 10:48:21

Accelerating Inference for Multilayer Neural Networks with Quantum Computers
Arthur G. Rattew, Po-Wei Huang, Naixu Guo, Lirandë Pira, Patrick Rebentrost
arxiv.org/abs/2510.07195

@arXiv_astrophHE_bot@mastoxiv.page
2025-09-10 09:00:31

When (not) to trust Monte Carlo approximations for hierarchical Bayesian inference
Jack Heinzel, Salvatore Vitale
arxiv.org/abs/2509.07221

@arXiv_eessSP_bot@mastoxiv.page
2025-09-08 08:12:00

Communication-Efficient Collaborative LLM Inference via Distributed Speculative Decoding
Ce Zheng, Tingting Yang
arxiv.org/abs/2509.04576

@arXiv_csAR_bot@mastoxiv.page
2025-09-09 07:31:41

High Utilization Energy-Aware Real-Time Inference Deep Convolutional Neural Network Accelerator
Kuan-Ting Lin, Ching-Te Chiu, Jheng-Yi Chang, Shi-Zong Huang, Yu-Ting Li
arxiv.org/abs/2509.05688

@arXiv_csRO_bot@mastoxiv.page
2025-09-11 08:50:03

SVN-ICP: Uncertainty Estimation of ICP-based LiDAR Odometry using Stein Variational Newton
Shiping Ma, Haoming Zhang, Marc Toussaint
arxiv.org/abs/2509.08069

@arXiv_csDC_bot@mastoxiv.page
2025-09-10 08:34:31

DuoServe-MoE: Dual-Phase Expert Prefetch and Cache Scheduling for Efficient MoE LLM Inference
Yuning Zhang, Grant Pinkert, Nan Yang, Yanli Li, Dong Yuan
arxiv.org/abs/2509.07379

@arXiv_csLO_bot@mastoxiv.page
2025-09-09 08:27:02

Compositional Inductive Invariant Inference via Assume-Guarantee Reasoning
Ian Dardik, Eunsuk Kang
arxiv.org/abs/2509.06250

@arXiv_csLG_bot@mastoxiv.page
2025-10-09 10:50:21

A Multi-Agent Framework for Stateful Inference-Time Search
Arshika Lalan, Rajat Ghosh, Aditya Kolsur, Debojyoti Dutta
arxiv.org/abs/2510.07147

@arXiv_statME_bot@mastoxiv.page
2025-07-11 09:48:51

Late Fusion Multi-task Learning for Semiparametric Inference with Nuisance Parameters
Sohom Bhattacharya, Yongzhuo Chen, Muxuan Liang
arxiv.org/abs/2507.07941

@arXiv_statML_bot@mastoxiv.page
2025-09-09 09:06:22

MOSAIC: Minimax-Optimal Sparsity-Adaptive Inference for Change Points in Dynamic Networks
Yingying Fan, Jingyuan Liu, Jinchi Lv, Ao Sun
arxiv.org/abs/2509.06303

@arXiv_mathNA_bot@mastoxiv.page
2025-10-10 09:28:09

Likelihood-informed Model Reduction for Bayesian Inference of Static Structural Loads
Jakob Scheffels, Elizabeth Qian, Iason Papaioannou, Elisabeth Ullmann
arxiv.org/abs/2510.07950

@arXiv_astrophGA_bot@mastoxiv.page
2025-08-11 08:40:49

Data-driven dust inference at mid-to-high Galactic latitudes using probabilistic machine learning
Matthew O'Callaghan, Kaisey S. Mandel, Gerry Gilmore
arxiv.org/abs/2508.05781

@arXiv_csCV_bot@mastoxiv.page
2025-09-11 07:58:33

Video Parallel Scaling: Aggregating Diverse Frame Subsets for VideoLLMs
Hyungjin Chung, Hyelin Nam, Jiyeon Kim, Hyojun Go, Byeongjun Park, Junho Kim, Joonseok Lee, Seongsu Ha, Byung-Hoon Kim
arxiv.org/abs/2509.08016

@arXiv_csCR_bot@mastoxiv.page
2025-09-09 12:06:22

Imitative Membership Inference Attack
Yuntao Du, Yuetian Chen, Hanshen Xiao, Bruno Ribeiro, Ninghui Li
arxiv.org/abs/2509.06796

@arXiv_mathST_bot@mastoxiv.page
2025-09-10 09:52:01

Bayesian inference with Besov-Laplace priors for spatially inhomogeneous binary classification surfaces
Matteo Giordano
arxiv.org/abs/2509.07439

@arXiv_astrophCO_bot@mastoxiv.page
2025-09-11 07:57:52

Taking the Weight Off: Mitigating Parameter Bias from Catastrophic Outliers in 3×2pt Analysis
Carolyn McDonald Mill, C. Danielle Leonard, Markus Michael Rau, Cora Uhlemann, Shahab Joudaki
arxiv.org/abs/2509.08052

@arXiv_csLG_bot@mastoxiv.page
2025-10-10 11:06:29

Dynamic Features Adaptation in Networking: Toward Flexible training and Explainable inference
Yannis Belkhiter, Seshu Tirupathi, Giulio Zizzo, Merim Dzaferagic, John D. Kelleher
arxiv.org/abs/2510.08303

@arXiv_csAI_bot@mastoxiv.page
2025-09-11 07:41:02

Automatic Failure Attribution and Critical Step Prediction Method for Multi-Agent Systems Based on Causal Inference
Guoqing Ma, Jia Zhu, Hanghui Guo, Weijie Shi, Jiawei Shen, Jingjiang Liu, Yidan Liang
arxiv.org/abs/2509.08682

@arXiv_hepph_bot@mastoxiv.page
2025-09-09 08:11:02

Unbinning global LHC analyses
Henning Bahl, Tilman Plehn, Nikita Schmal
arxiv.org/abs/2509.05409 arxiv.org/pdf/2509.05409

@arXiv_csCL_bot@mastoxiv.page
2025-07-11 10:06:41

Why is Your Language Model a Poor Implicit Reward Model?
Noam Razin, Yong Lin, Jiarui Yao, Sanjeev Arora
arxiv.org/abs/2507.07981

@arXiv_csAR_bot@mastoxiv.page
2025-09-11 08:37:53

BitROM: Weight Reload-Free CiROM Architecture Towards Billion-Parameter 1.58-bit LLM Inference
Wenlun Zhang, Xinyu Li, Shimpei Ando, Kentaro Yoshioka
arxiv.org/abs/2509.08542

@Techmeme@techhub.social
2025-09-05 13:40:43

Baseten, which helps companies launch open-source or custom AI models, raised a $150M Series D led by Bond at a $2.15B valuation, up from $825M in February (Allie Garfinkle/Fortune)
fortune.com/2025/09/05/exclusi

@arXiv_statME_bot@mastoxiv.page
2025-09-11 08:45:33

Doubly robust average treatment effect estimation for survival data
Byeonghee Lee, Joonsung Kang
arxiv.org/abs/2509.08788

@arXiv_statML_bot@mastoxiv.page
2025-09-09 08:54:51

Fisher Random Walk: Automatic Debiasing Contextual Preference Inference for Large Language Model Evaluation
Yichi Zhang, Alexander Belloni, Ethan X. Fang, Junwei Lu, Xiaoan Xu
arxiv.org/abs/2509.05852

@arXiv_mathST_bot@mastoxiv.page
2025-09-09 10:31:32

Statistical Inference for Misspecified Contextual Bandits
Yongyi Guo, Ziping Xu
arxiv.org/abs/2509.06287 arxiv.org/pdf/2509.06287

@arXiv_csCR_bot@mastoxiv.page
2025-09-09 11:48:22

DCMI: A Differential Calibration Membership Inference Attack Against Retrieval-Augmented Generation
Xinyu Gao, Xiangtao Meng, Yingkai Dong, Zheng Li, Shanqing Guo
arxiv.org/abs/2509.06026

@arXiv_statME_bot@mastoxiv.page
2025-09-10 09:49:11

Monitoring Adverse Events Through Bayesian Nonparametric Clustering Across Studies
Shijie Yuan, Kevin Roberts, Noirrit Kiran Chandra, Yuan Ji, Peter Müller
arxiv.org/abs/2509.07267

@arXiv_astrophCO_bot@mastoxiv.page
2025-09-09 09:47:22

Unlocking 21cm Cosmology with SBI: A Beginner friendly NRE for Inference of Astrophysical Parameters
Bisweswar Sen, Abhirup Datta
arxiv.org/abs/2509.06834

@arXiv_csCL_bot@mastoxiv.page
2025-08-11 09:56:09

InfoCausalQA: Can Models Perform Non-explicit Causal Reasoning Based on Infographic?
Keummin Ka, Junhyeong Park, Jahyun Jeon, Youngjae Yu
arxiv.org/abs/2508.06220

@arXiv_csRO_bot@mastoxiv.page
2025-10-07 11:45:42

HyperVLA: Efficient Inference in Vision-Language-Action Models via Hypernetworks
Zheng Xiong, Kang Li, Zilin Wang, Matthew Jackson, Jakob Foerster, Shimon Whiteson
arxiv.org/abs/2510.04898

@arXiv_csCV_bot@mastoxiv.page
2025-09-09 12:30:42

Intraoperative 2D/3D Registration via Spherical Similarity Learning and Inference-Time Differentiable Levenberg-Marquardt Optimization
Minheng Chen, Youyong Kong
arxiv.org/abs/2509.06890

@arXiv_csCR_bot@mastoxiv.page
2025-10-08 09:39:59

Membership Inference Attacks on Tokenizers of Large Language Models
Meng Tong, Yuntao Du, Kejiang Chen, Weiming Zhang, Ninghui Li
arxiv.org/abs/2510.05699

@arXiv_csCL_bot@mastoxiv.page
2025-10-10 10:57:09

Memory Retrieval and Consolidation in Large Language Models through Function Tokens
Shaohua Zhang, Yuan Lin, Hang Li
arxiv.org/abs/2510.08203

@arXiv_statME_bot@mastoxiv.page
2025-09-09 10:21:32

Bayesian Inference for Confounding Variables and Limited Information
Ellis Scharfenaker, Duncan K. Foley
arxiv.org/abs/2509.05520

@arXiv_statML_bot@mastoxiv.page
2025-07-11 08:40:11

Hess-MC2: Sequential Monte Carlo Squared using Hessian Information and Second Order Proposals
Joshua Murphy, Conor Rosato, Andrew Millard, Lee Devlin, Paul Horridge, Simon Maskell
arxiv.org/abs/2507.07461

@arXiv_csLG_bot@mastoxiv.page
2025-10-08 10:49:49

Empirical Comparison of Membership Inference Attacks in Deep Transfer Learning
Yuxuan Bai, Gauri Pradhan, Marlon Tobaben, Antti Honkela
arxiv.org/abs/2510.05753

@arXiv_csAI_bot@mastoxiv.page
2025-10-07 12:16:52

Staircase Streaming for Low-Latency Multi-Agent Inference
Junlin Wang, Jue Wang, Zhen Xu, Ben Athiwaratkun, Bhuwan Dhingra, Ce Zhang, James Zou
arxiv.org/abs/2510.05059

@arXiv_csCV_bot@mastoxiv.page
2025-07-11 10:18:51

TinierHAR: Towards Ultra-Lightweight Deep Learning Models for Efficient Human Activity Recognition on Edge Devices
Sizhen Bian, Mengxi Liu, Vitor Fortes Rey, Daniel Geissler, Paul Lukowicz
arxiv.org/abs/2507.07949

@arXiv_csCL_bot@mastoxiv.page
2025-10-10 10:52:19

DACIP-RC: Domain Adaptive Continual Instruction Pre-Training via Reading Comprehension on Business Conversations
Elena Khasanova, Harsh Saini, Md Tahmid Rahman Laskar, Xue-Yong Fu, Cheng Chen, Shashi Bhushan TN
arxiv.org/abs/2510.08152

@arXiv_csLG_bot@mastoxiv.page
2025-10-08 10:43:39

(Token-Level) InfoRMIA: Stronger Membership Inference and Memorization Assessment for LLMs
Jiashu Tao, Reza Shokri
arxiv.org/abs/2510.05582

@arXiv_csCV_bot@mastoxiv.page
2025-10-10 11:22:49

ReSplat: Learning Recurrent Gaussian Splats
Haofei Xu, Daniel Barath, Andreas Geiger, Marc Pollefeys
arxiv.org/abs/2510.08575

@arXiv_csAI_bot@mastoxiv.page
2025-09-10 10:01:11

Unleashing the True Potential of LLMs: A Feedback-Triggered Self-Correction with Long-Term Multipath Decoding
Jipeng Li, Zeyu Gao, Yubin Qi, Hande Dong, Weijian Chen, Qiang Lin
arxiv.org/abs/2509.07676

@arXiv_csCL_bot@mastoxiv.page
2025-09-09 12:08:12

COMPACT: Common-token Optimized Model Pruning Across Channels and Tokens
Eugene Kwek, Wenpeng Yin
arxiv.org/abs/2509.06836

@arXiv_csLG_bot@mastoxiv.page
2025-10-06 10:27:19

Best-of-Majority: Minimax-Optimal Strategy for Pass@$k$ Inference Scaling
Qiwei Di, Kaixuan Ji, Xuheng Li, Heyang Zhao, Quanquan Gu
arxiv.org/abs/2510.03199

@arXiv_csCR_bot@mastoxiv.page
2025-10-10 08:26:39

PATCH: Mitigating PII Leakage in Language Models with Privacy-Aware Targeted Circuit PatcHing
Anthony Hughes, Vasisht Duddu, N. Asokan, Nikolaos Aletras, Ning Ma
arxiv.org/abs/2510.07452

@arXiv_csAI_bot@mastoxiv.page
2025-10-10 10:22:09

PEAR: Phase Entropy Aware Reward for Efficient Reasoning
Chen Huang, Wei Lu, Wenxuan Zhang
arxiv.org/abs/2510.08026

@arXiv_csCV_bot@mastoxiv.page
2025-10-10 11:21:29

ResAD: Normalized Residual Trajectory Modeling for End-to-End Autonomous Driving
Zhiyu Zheng, Shaoyu Chen, Haoran Yin, Xinbang Zhang, Jialv Zou, Xinggang Wang, Qian Zhang, Lefei Zhang
arxiv.org/abs/2510.08562

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:01:01

Talking with Oompa Loompas: A novel framework for evaluating linguistic acquisition of LLM agents
Sankalp Tattwadarshi Swain, Anshika Krishnatray, Dhruv Kumar, Jagat Sesh Challa
arxiv.org/abs/2509.07389

@arXiv_csLG_bot@mastoxiv.page
2025-10-10 11:02:39

Enhancing Reasoning for Diffusion LLMs via Distribution Matching Policy Optimization
Yuchen Zhu, Wei Guo, Jaemoo Choi, Petr Molodyk, Bo Yuan, Molei Tao, Yongxin Chen
arxiv.org/abs/2510.08233

@arXiv_csCV_bot@mastoxiv.page
2025-10-10 11:19:29

ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation
Guanghao Li, Kerui Ren, Linning Xu, Zhewen Zheng, Changjian Jiang, Xin Gao, Bo Dai, Jian Pu, Mulin Yu, Jiangmiao Pang
arxiv.org/abs/2510.08551

@arXiv_csCL_bot@mastoxiv.page
2025-10-09 10:24:11

Adaptive LLM-Symbolic Reasoning via Dynamic Logical Solver Composition
Lei Xu, Pierre Beckmann, Marco Valentino, André Freitas
arxiv.org/abs/2510.06774

@arXiv_csAI_bot@mastoxiv.page
2025-07-11 09:16:01

Position: We Need An Algorithmic Understanding of Generative AI
Oliver Eberle, Thomas McGee, Hamza Giaffar, Taylor Webb, Ida Momennejad
arxiv.org/abs/2507.07544

@arXiv_csLG_bot@mastoxiv.page
2025-09-10 10:26:01

Learning Generalized Hamiltonian Dynamics with Stability from Noisy Trajectory Data
Luke McLennan, Yi Wang, Ryan Farell, Minh Nguyen, Chandrajit Bajaj
arxiv.org/abs/2509.07280

@arXiv_csCL_bot@mastoxiv.page
2025-08-11 09:55:49

EICAP: Deep Dive in Assessment and Enhancement of Large Language Models in Emotional Intelligence through Multi-Turn Conversations
Nizi Nazar, Ehsaneddin Asgari
arxiv.org/abs/2508.06196

@arXiv_csLG_bot@mastoxiv.page
2025-10-10 11:05:59

Counterfactual Identifiability via Dynamic Optimal Transport
Fabio De Sousa Ribeiro, Ainkaran Santhirasekaram, Ben Glocker
arxiv.org/abs/2510.08294

@arXiv_csCL_bot@mastoxiv.page
2025-08-11 09:55:59

Classification is a RAG problem: A case study on hate speech detection
Richard Willats, Josh Pennington, Aravind Mohan, Bertie Vidgen
arxiv.org/abs/2508.06204

@arXiv_csLG_bot@mastoxiv.page
2025-09-11 10:14:13

Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing
Jeffrey Amico, Gabriel Passamani Andrade, John Donaghy, Ben Fielding, Tristin Forbus, Harry Grieve, Semih Kara, Jari Kolehmainen, Yihua Lou, Christopher Nies, Edward Phillip Flores Nuño, Diogo Ortega, Shikhar Rastogi, Austin Virts, Matthew J. Wright

@arXiv_csCL_bot@mastoxiv.page
2025-09-10 10:17:01

HALT-RAG: A Task-Adaptable Framework for Hallucination Detection with Calibrated NLI Ensembles and Abstention
Saumya Goswami, Siddharth Kurra
arxiv.org/abs/2509.07475

@arXiv_csLG_bot@mastoxiv.page
2025-09-10 10:33:31

EMORF-II: Adaptive EM-based Outlier-Robust Filtering with Correlated Measurement Noise
Arslan Majal, Aamir Hussain Chughtai, Muhammad Tahir
arxiv.org/abs/2509.07415

@arXiv_csLG_bot@mastoxiv.page
2025-09-10 10:23:31

IP-Basis PINNs: Efficient Multi-Query Inverse Parameter Estimation
Shalev Manor, Mohammad Kohandel
arxiv.org/abs/2509.07245

@arXiv_csLG_bot@mastoxiv.page
2025-07-11 10:23:51

Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs
Ziyue Li, Yang Li, Tianyi Zhou
arxiv.org/abs/2507.07996 arxiv.org/pdf/2507.07996 arxiv.org/html/2507.07996
Abstract: Can a pretrained neural network adapt its architecture to different inputs without any finetuning? Do we need all layers for simple tasks, and are they adequate for challenging tasks? We found that the layers of a pretrained large language model (LLM) can be manipulated as separate modules to build a better and even shallower model customized for each test sample. In particular, each layer from the pretrained model can be skipped/pruned or repeated multiple times as recurrent neural networks (RNN), and stacked with others in arbitrary orders, yielding a chain-of-layers (CoLa) per sample. This compositional space greatly expands the scope of existing works on looped/recurrent pretrained modules, layer pruning, or early-exit networks. We develop a Monte Carlo Tree Search (MCTS) protocol to explore and identify the optimal CoLa for each sample from math and commonsense reasoning benchmarks. Compared to a static model of a fixed depth, CoLa allows shortcut paths (fast thinking), recurrence of the same layer(s) (slow thinking), and combining both, offering more flexible, dynamic architectures for different inputs. We conduct an extensive analysis of the MCTS-optimized CoLa, which leads to two key findings: (1) For >75% of samples with correct predictions by the original LLM, we can find shorter CoLa, suggesting a large space for improving inference efficiency; (2) For >60% of samples with originally incorrect predictions, we can identify CoLa achieving correct predictions, suggesting a large space of performance enhancement. Our results highlight the shortcomings of using a fixed architecture of pre-trained LLMs for inference on different samples and pave the way to unlock the generalization power of test-time depth adaptation.
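The chain-of-layers (CoLa) idea in the abstract can be sketched minimally: treat each pretrained layer as a callable module and represent a per-sample architecture as a sequence of layer indices, where omitting an index skips that layer and repeating one applies it recurrently. The toy numeric "layers" below are hypothetical stand-ins for transformer blocks, not the paper's model:

```python
def run_cola(layers, cola, x):
    """Apply layers[i] in order for each index i in the CoLa sequence."""
    for i in cola:
        x = layers[i](x)
    return x

# Toy stand-in "pretrained layers": simple numeric transforms.
layers = [
    lambda x: x + 1,   # layer 0
    lambda x: x * 2,   # layer 1
    lambda x: x - 3,   # layer 2
]

static = run_cola(layers, [0, 1, 2], 5)     # fixed-depth model: ((5+1)*2)-3 = 9
shallow = run_cola(layers, [0, 2], 5)       # layer 1 skipped ("fast thinking"): (5+1)-3 = 3
recurrent = run_cola(layers, [1, 1, 1], 5)  # layer 1 looped ("slow thinking"): 5*2*2*2 = 40
```

The paper's MCTS protocol would search this space of index sequences per sample; the sketch only shows why the space strictly contains the static fixed-depth model as one special case.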