Tootfinder

Opt-in global Mastodon full text search. Join the index!

@arXiv_statME_bot@mastoxiv.page
2025-06-06 09:56:30

This arxiv.org/abs/2212.02658 has been replaced.
link: scholar.google.com/scholar?q=a

@arXiv_csLG_bot@mastoxiv.page
2025-06-05 10:57:22

This arxiv.org/abs/2505.18570 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@Techmeme@techhub.social
2025-06-05 22:45:44

AMD says it has acquired the team behind AI inference chip developer Untether AI, a day after announcing it acquired AI software optimization startup Brium (Dylan Martin/CRN)
crn.com/news/components-periph

@arXiv_csCR_bot@mastoxiv.page
2025-06-06 07:16:46

Membership Inference Attacks on Sequence Models
Lorenzo Rossi, Michael Aerni, Jie Zhang, Florian Tramèr
arxiv.org/abs/2506.05126

@arXiv_csDC_bot@mastoxiv.page
2025-07-04 09:07:51

FlowSpec: Continuous Pipelined Speculative Decoding for Efficient Distributed LLM Inference
Xing Liu, Lizhuo Luo, Ming Tang, Chao Huang
arxiv.org/abs/2507.02620

@cosmos4u@scicomm.xyz
2025-06-06 21:37:51

Deep learning inference with the #EventHorizonTelescope I. Calibration improvements and a comprehensive synthetic data library / II. The ZINGULARITY framework for Bayesian artificial neural networks / III. ZINGULARITY results from the 2017 observations and predictions for future array expansions: aanda.org/articles/aa/full_htm / aanda.org/articles/aa/full_htm / aanda.org/articles/aa/full_htm -> Self-learning neural network cracks iconic black holes: astronomie.nl/nieuws/en/self-l

@ErikJonker@mastodon.social
2025-07-06 19:29:10

Interesting, letting AI models cooperate.
#ai #sakana

@arXiv_csSE_bot@mastoxiv.page
2025-07-04 09:24:41

VeFIA: An Efficient Inference Auditing Framework for Vertical Federated Collaborative Software
Chung-ju Huang, Ziqi Zhang, Yinggui Wang, Binghui Wang, Tao Wei, Leye Wang
arxiv.org/abs/2507.02376

@arXiv_mathAP_bot@mastoxiv.page
2025-06-06 07:24:27

Lipschitz stability for Bayesian inference in porous medium tissue growth models
Tomasz Dębiec, Piotr Gwiazda, Błażej Miasojedow, Katarzyna Ryszewska, Zuzanna Szymańska, Aneta Wróblewska-Kamińska
arxiv.org/abs/2506.04769

@arXiv_mathST_bot@mastoxiv.page
2025-06-06 07:27:36

Classification of Extremal Dependence in Financial Markets via Bootstrap Inference
Qian Hui, Sidney I. Resnick, Tiandong Wang
arxiv.org/abs/2506.04656

@arXiv_astrophCO_bot@mastoxiv.page
2025-06-05 07:29:34

Differentiable Fuzzy Cosmic-Web for Field Level Inference
P. Rosselló, F.-S. Kitaura, D. Forero-Sánchez, F. Sinigaglia, G. Favole
arxiv.org/abs/2506.03969

@Techmeme@techhub.social
2025-07-07 00:50:30

Tokyo-based Sakana AI details a new Monte Carlo tree search-based technique that lets multiple LLMs cooperate on a single task, outperforming individual models (Ben Dickson/VentureBeat)
venturebeat.com/ai/sakana-ais-

@arXiv_csRO_bot@mastoxiv.page
2025-06-04 07:32:59

Active inference as a unified model of collision avoidance behavior in human drivers
Julian F. Schumann, Johan Engstroem, Leif Johnson, Matthew O'Kelly, Joao Messias, Jens Kober, Arkady Zgonnikov
arxiv.org/abs/2506.02215

@arXiv_quantph_bot@mastoxiv.page
2025-06-06 10:14:30

This arxiv.org/abs/2505.24502 has been replaced.
initial toot: mastoxiv.page/@arXiv_qu…

@arXiv_statCO_bot@mastoxiv.page
2025-06-06 07:39:24

Amortized variational transdimensional inference
Laurence Davies, Dan Mackinlay, Rafael Oliveira, Scott A. Sisson
arxiv.org/abs/2506.04749

@arXiv_physicssocph_bot@mastoxiv.page
2025-06-05 07:35:46

Reconstructing North Korea's Plutonium Production History with Bayesian Inference-Based Reprocessing Waste Analysis
Benjamin Jung, Johannes Bosse, Malte Göttsche
arxiv.org/abs/2506.03865

@arXiv_physicsoptics_bot@mastoxiv.page
2025-06-04 07:48:29

Inverse design for robust inference in integrated computational spectrometry
Wenchao Ma, Raphaël Pestourie, Zin Lin, Steven G. Johnson
arxiv.org/abs/2506.02194

@arXiv_csDC_bot@mastoxiv.page
2025-06-05 09:39:19

This arxiv.org/abs/2506.02814 has been replaced.
initial toot: mastoxiv.page/@arXiv_csDC_…

@arXiv_statME_bot@mastoxiv.page
2025-06-06 07:39:41

Bayesian Doubly Robust Causal Inference via Posterior Coupling
Shunichiro Orihara, Tomotaka Momozaki, Shonosuke Sugasawa
arxiv.org/abs/2506.04868

@arXiv_hepph_bot@mastoxiv.page
2025-06-03 07:54:40

Generator Based Inference (GBI)
Chi Lung Cheng, Ranit Das, Runze Li, Radha Mastandrea, Vinicius Mikuni, Benjamin Nachman, David Shih, Gup Singh
arxiv.org/abs/2506.00119

@arXiv_csSD_bot@mastoxiv.page
2025-06-04 13:36:50

This arxiv.org/abs/2505.15380 has been replaced.
initial toot: mastoxiv.page/@arXiv_csSD_…

@arXiv_eessSY_bot@mastoxiv.page
2025-06-06 09:45:38

This arxiv.org/abs/2504.01759 has been replaced.
initial toot: mastoxiv.page/@arXiv_ees…

@arXiv_csCR_bot@mastoxiv.page
2025-06-04 13:39:44

This arxiv.org/abs/2505.23655 has been replaced.
initial toot: mastoxiv.page/@arXiv_csCR_…

@arXiv_mathNA_bot@mastoxiv.page
2025-06-03 07:53:04

Exact operator inference with minimal data
Henrik Rosenberger, Benjamin Sanderse, Giovanni Stabile
arxiv.org/abs/2506.01244

@arXiv_astrophHE_bot@mastoxiv.page
2025-06-06 09:51:31

This arxiv.org/abs/2502.00805 has been replaced.
initial toot: mastoxiv.page/@arXiv_…

@arXiv_astrophIM_bot@mastoxiv.page
2025-06-06 09:46:50

This arxiv.org/abs/2503.18617 has been replaced.
initial toot: mastoxiv.page/@arXiv_…

@arXiv_econEM_bot@mastoxiv.page
2025-06-06 09:38:35

This arxiv.org/abs/2409.14202 has been replaced.
initial toot: mastoxiv.page/@arXiv_eco…

@arXiv_csOH_bot@mastoxiv.page
2025-05-06 09:47:09

This arxiv.org/abs/2504.10667 has been replaced.
link: scholar.google.com/scholar?q=a

@arXiv_csLG_bot@mastoxiv.page
2025-06-05 10:59:04

This arxiv.org/abs/2505.24293 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_eessAS_bot@mastoxiv.page
2025-06-05 09:45:36

This arxiv.org/abs/2505.19931 has been replaced.
initial toot: mastoxiv.page/@arXiv_ees…

@arXiv_astrophSR_bot@mastoxiv.page
2025-06-04 07:45:41

Reappraising the Elatina series: Solar dynamo clocking and inference of orbital periods
F. Stefani, T. Weier, G. M. Horstmann, G. Mamatsashvili
arxiv.org/abs/2506.02628

@arXiv_csAR_bot@mastoxiv.page
2025-07-04 08:43:41

Breaking the HBM Bit Cost Barrier: Domain-Specific ECC for AI Inference Infrastructure
Rui Xie, Asad Ul Haq, Yunhua Fang, Linsen Ma, Sanchari Sen, Swagath Venkataramani, Liu Liu, Tong Zhang
arxiv.org/abs/2507.02654

@arXiv_statME_bot@mastoxiv.page
2025-06-06 07:39:44

The Spurious Factor Dilemma: Robust Inference in Heavy-Tailed Elliptical Factor Models
Jiang Hu, Jiahui Xie, Yangchun Zhang, Wang Zhou
arxiv.org/abs/2506.05116

@arXiv_qbiobm_bot@mastoxiv.page
2025-06-06 09:53:20

This arxiv.org/abs/2411.13280 has been replaced.
initial toot: mastoxiv.page/@arXiv_qbi…

@frankel@mastodon.top
2025-06-03 16:06:02

Pyrefly vs. ty: Comparing Python’s Two New Rust-Based Type Checkers
#types

@arXiv_qfinRM_bot@mastoxiv.page
2025-06-06 09:55:05

This arxiv.org/abs/2504.15268 has been replaced.
initial toot: mastoxiv.page/@arXiv_qfi…

@arXiv_csIR_bot@mastoxiv.page
2025-07-02 09:35:09

EARN: Efficient Inference Acceleration for LLM-based Generative Recommendation by Register Tokens
Chaoqun Yang, Xinyu Lin, Wenjie Wang, Yongqi Li, Teng Sun, Xianjing Han, Tat-Seng Chua
arxiv.org/abs/2507.00715

@arXiv_qbioNC_bot@mastoxiv.page
2025-06-03 07:51:15

Evaluation of "As-Intended" Vehicle Dynamics using the Active Inference Framework
Kazuharu Kidera, Takuma Miyaguchi, Hideyoshi Yanagisawa
arxiv.org/abs/2506.00035

@arXiv_csOS_bot@mastoxiv.page
2025-07-04 07:37:01

Dissecting the Impact of Mobile DVFS Governors on LLM Inference Performance and Energy Efficiency
Zongpu Zhang, Pranab Dash, Y. Charlie Hu, Qiang Xu, Jian Li, Haibing Guan
arxiv.org/abs/2507.02135

@groupnebula563@mastodon.social
2025-07-05 01:32:59

#AI #honeypots huh

@arXiv_csCL_bot@mastoxiv.page
2025-06-30 10:21:40

QuickSilver -- Speeding up LLM Inference through Dynamic Token Halting, KV Skipping, Contextual Token Fusion, and Adaptive Matryoshka Quantization
Danush Khanna, Aditya Kumar Guru, Srivarshinee Sridhar, Zidan Ahmed, Rubhav Bahirwani, Meetu Malhotra, Vinija Jain, Aman Chadha, Amitava Das, Kripabandhu Ghosh
arxiv.org/abs/2…

@arXiv_csLO_bot@mastoxiv.page
2025-06-05 09:40:40

This arxiv.org/abs/2502.03956 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLO_…

@arXiv_csDC_bot@mastoxiv.page
2025-06-04 07:43:47

Adaptive Configuration Selection for Multi-Model Inference Pipelines in Edge Computing
Jinhao Sheng, Zhiqing Tang, Jianxiong Guo, Tian Wang
arxiv.org/abs/2506.02814

@arXiv_quantph_bot@mastoxiv.page
2025-06-06 10:15:03

This arxiv.org/abs/2505.24765 has been replaced.
initial toot: mastoxiv.page/@arXiv_qu…

@arXiv_csDB_bot@mastoxiv.page
2025-06-30 07:57:20

A Survey of LLM Inference Systems
James Pan, Guoliang Li
arxiv.org/abs/2506.21901 arxiv.org/pdf/2506.21901

@arXiv_csGT_bot@mastoxiv.page
2025-06-03 16:18:10

This arxiv.org/abs/2405.00295 has been replaced.
initial toot: mastoxiv.page/@arXiv_csGT_…

@pbloem@sigmoid.social
2025-06-03 12:42:10

Everybody complaining about getting hammered with #AI traffic seems to think that these are crawlers scraping for training data.
How likely is it that this is a complete misconception and this is all inference time?
Most public companies give their crawlers and RAG agents different user agent strings. But what about security services trawling through their data?

@arXiv_csCR_bot@mastoxiv.page
2025-06-04 07:32:03

Privacy Leaks by Adversaries: Adversarial Iterations for Membership Inference Attack
Jing Xue, Zhishen Sun, Haishan Ye, Luo Luo, Xiangyu Chang, Ivor Tsang, Guang Dai
arxiv.org/abs/2506.02711

@arXiv_csLG_bot@mastoxiv.page
2025-07-04 10:22:51

LLM-Driven Treatment Effect Estimation Under Inference Time Text Confounding
Yuchen Ma, Dennis Frauen, Jonas Schweisthal, Stefan Feuerriegel
arxiv.org/abs/2507.02843

@arXiv_hepph_bot@mastoxiv.page
2025-07-04 08:06:11

Neural simulation-based inference of the Higgs trilinear self-coupling via off-shell Higgs production
Aishik Ghosh, Maximilian Griese, Ulrich Haisch, Tae Hyoun Park
arxiv.org/abs/2507.02032

@arXiv_mathST_bot@mastoxiv.page
2025-06-03 07:43:49

Variational Inference for Latent Variable Models in High Dimensions
Chenyang Zhong, Sumit Mukherjee, Bodhisattva Sen
arxiv.org/abs/2506.01893

@arXiv_astrophCO_bot@mastoxiv.page
2025-07-02 09:55:30

Simulation-Efficient Cosmological Inference with Multi-Fidelity SBI
Leander Thiele, Adrian E. Bayer, Naoya Takeishi
arxiv.org/abs/2507.00514

@arXiv_csDC_bot@mastoxiv.page
2025-06-05 07:17:03

Parallel CPU-GPU Execution for LLM Inference on Constrained GPUs
Jiakun Fan, Yanglin Zhang, Xiangchen Li, Dimitrios S. Nikolopoulos
arxiv.org/abs/2506.03296

@arXiv_csSD_bot@mastoxiv.page
2025-06-06 07:21:12

Grapheme-Coherent Phonemic and Prosodic Annotation of Speech by Implicit and Explicit Grapheme Conditioning
Hien Ohnaka, Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto
arxiv.org/abs/2506.04527

@arXiv_csLG_bot@mastoxiv.page
2025-06-05 10:56:37

This arxiv.org/abs/2505.14884 has been replaced.
initial toot: mastoxiv.page/@arXiv_csLG_…

@arXiv_physicsoptics_bot@mastoxiv.page
2025-06-06 07:35:31

Information-Optimal Sensing and Control in High-Intensity Laser Experiments
A. Döpp, C. Eberle, J. Esslinger, S. Howard, F. Irshad, J. Schroeder, N. Weisse, S. Karsch
arxiv.org/abs/2506.04946

@arXiv_astrophIM_bot@mastoxiv.page
2025-06-06 09:46:03

This arxiv.org/abs/2405.05969 has been replaced.
initial toot: mastoxiv.page/@arXiv_…

@arXiv_csAR_bot@mastoxiv.page
2025-06-04 07:17:33

CLONE: Customizing LLMs for Efficient Latency-Aware Inference at the Edge
Chunlin Tian, Xinpeng Qin, Kahou Tam, Li Li, Zijian Wang, Yuanzhe Zhao, Minglei Zhang, Chengzhong Xu
arxiv.org/abs/2506.02847

@arXiv_csSE_bot@mastoxiv.page
2025-07-03 08:43:20

Combining Type Inference and Automated Unit Test Generation for Python
Lukas Krodinger, Stephan Lukasczyk, Gordon Fraser
arxiv.org/abs/2507.01477

@arXiv_csDC_bot@mastoxiv.page
2025-06-05 09:38:43

This arxiv.org/abs/2506.01969 has been replaced.
initial toot: mastoxiv.page/@arXiv_csDC_…

@arXiv_econEM_bot@mastoxiv.page
2025-07-02 08:55:29

Randomization Inference with Sample Attrition
Xinran Li, Peizan Sheng, Zeyang Yu
arxiv.org/abs/2507.00795 arxiv.org/p…

@arXiv_csCR_bot@mastoxiv.page
2025-07-03 09:26:10

Towards Better Attribute Inference Vulnerability Measures
Paul Francis, David Wagner
arxiv.org/abs/2507.01710 arxiv.o…

@arXiv_csRO_bot@mastoxiv.page
2025-06-04 14:06:25

This arxiv.org/abs/2505.07802 has been replaced.
initial toot: mastoxiv.page/@arXiv_csRO_…

@arXiv_quantph_bot@mastoxiv.page
2025-07-02 10:03:50

Quantum Bayesian inference with Support vector states for intrusion detection
Nayema Mridha, Garrv Sipani, Eva R Gaarder, Shah Haque, Radhika Kuttala, Binay P Akhouri, Mohamad M Al Zein, Eric Howard
arxiv.org/abs/2507.00403

@arXiv_statME_bot@mastoxiv.page
2025-06-06 10:00:13

This arxiv.org/abs/2406.10554 has been replaced.
initial toot: mastoxiv.page/@arXiv_sta…

@arXiv_mathST_bot@mastoxiv.page
2025-06-06 07:27:44

At the edge of Donsker's Theorem: Asymptotics of multiscale scan statistics
Johann Köhne, Fabian Mies
arxiv.org/abs/2506.05112

@arXiv_hepph_bot@mastoxiv.page
2025-06-03 07:57:39

Bayesian inference of the magnetic field and chemical potential on holographic jet quenching in heavy-ion collisions
Liqiang Zhu, Zhan Gao, Weiyao Ke, Hanzhong Zhang
arxiv.org/abs/2506.00340

@arXiv_csDC_bot@mastoxiv.page
2025-06-04 07:20:28

FlashMLA-ETAP: Efficient Transpose Attention Pipeline for Accelerating MLA Inference on NVIDIA H20 GPUs
Pencuo Zeren, Qiuming Luo, Rui Mao, Chang Kong
arxiv.org/abs/2506.01969

@arXiv_csCR_bot@mastoxiv.page
2025-06-03 17:27:02

This arxiv.org/abs/2501.16007 has been replaced.
initial toot: mastoxiv.page/@arXiv_csCR_…

@arXiv_csAR_bot@mastoxiv.page
2025-07-04 07:50:51

System-performance and cost modeling of Large Language Model training and inference
Wenzhe Guo, Joyjit Kundu, Uras Tos, Weijiang Kong, Giuliano Sisto, Timon Evenblij, Manu Perumkunnil
arxiv.org/abs/2507.02456

@arXiv_statME_bot@mastoxiv.page
2025-06-04 07:51:57

Simulation-Based Inference for Adaptive Experiments
Brian M Cho, Aurélien Bibaut, Nathan Kallus
arxiv.org/abs/2506.02881

@arXiv_csCR_bot@mastoxiv.page
2025-06-06 09:33:29

This arxiv.org/abs/2409.18858 has been replaced.
initial toot: mastoxiv.page/@arXiv_csCR_…

@arXiv_econEM_bot@mastoxiv.page
2025-07-01 07:44:43

Causal Inference for Aggregated Treatment
Carolina Caetano, Gregorio Caetano, Brantly Callaway, Derek Dyal
arxiv.org/abs/2506.22885

@arXiv_astrophIM_bot@mastoxiv.page
2025-06-03 16:28:24

This arxiv.org/abs/2501.08524 has been replaced.
initial toot: mastoxiv.page/@arXiv_…

@arXiv_csDC_bot@mastoxiv.page
2025-06-06 09:35:28

This arxiv.org/abs/2505.09999 has been replaced.
initial toot: mastoxiv.page/@arXiv_csDC_…

@arXiv_statME_bot@mastoxiv.page
2025-06-03 17:13:55

This arxiv.org/abs/2406.04655 has been replaced.
initial toot: mastoxiv.page/@arXiv_sta…

@arXiv_mathST_bot@mastoxiv.page
2025-07-04 08:30:51

Two-Sample Covariance Inference in High-Dimensional Elliptical Models
Nina Dörnemann
arxiv.org/abs/2507.02640

@arXiv_csLG_bot@mastoxiv.page
2025-06-03 08:21:27

Memory Access Characterization of Large Language Models in CPU Environment and its Potential Impacts
Spencer Banasik
arxiv.org/abs/2506.01827

@arXiv_hepph_bot@mastoxiv.page
2025-07-03 09:14:00

A Frequentist Simulation-Based Inference Treatment of Sterile Neutrino Global Fits
Joshua Villarreal, Julia Woodward, John Hardin, Janet Conrad
arxiv.org/abs/2507.01153

@arXiv_statME_bot@mastoxiv.page
2025-06-06 07:39:35

Robust Estimation in Step-Stress Experiments under Exponential Lifetime Distributions
María Jaenada, Juan Manuel Millán, Leandro Pardo
arxiv.org/abs/2506.04445

@arXiv_csDC_bot@mastoxiv.page
2025-07-03 08:40:00

Deep Recommender Models Inference: Automatic Asymmetric Data Flow Optimization
Giuseppe Ruggeri, Renzo Andri, Daniele Jahier Pagliari, Lukas Cavigelli
arxiv.org/abs/2507.01676

@arXiv_csCR_bot@mastoxiv.page
2025-06-02 10:13:02

This arxiv.org/abs/2505.23655 has been replaced.
initial toot: mastoxiv.page/@arXiv_csCR_…

@arXiv_statME_bot@mastoxiv.page
2025-06-03 08:04:38

Flexible Selective Inference with Flow-based Transport Maps
Sifan Liu, Snigdha Panigrahi
arxiv.org/abs/2506.01150 arx…

@arXiv_csDC_bot@mastoxiv.page
2025-06-04 07:26:40

DistMLIP: A Distributed Inference Platform for Machine Learning Interatomic Potentials
Kevin Han, Bowen Deng, Amir Barati Farimani, Gerbrand Ceder
arxiv.org/abs/2506.02023

@arXiv_statME_bot@mastoxiv.page
2025-06-03 08:04:19

ProjMC$^2$: Scalable and Stable Posterior Inference for Bayesian Spatial Factor Models with Application to Spatial Transcriptomics
Lu Zhang
arxiv.org/abs/2506.01098

@arXiv_csCR_bot@mastoxiv.page
2025-05-30 07:16:47

Keyed Chaotic Tensor Transformations for Secure And Attributable Neural Inference
Peter David Fagan
arxiv.org/abs/2505.23655

@arXiv_csDC_bot@mastoxiv.page
2025-07-01 09:56:13

QPART: Adaptive Model Quantization and Dynamic Workload Balancing for Accuracy-aware Edge Inference
Xiangchen Li, Saeid Ghafouri, Bo Ji, Hans Vandierendonck, Deepu John, Dimitrios S. Nikolopoulos
arxiv.org/abs/2506.23934

@arXiv_statME_bot@mastoxiv.page
2025-06-03 17:17:32

This arxiv.org/abs/2408.06211 has been replaced.
initial toot: mastoxiv.page/@arXiv_sta…

@arXiv_csDC_bot@mastoxiv.page
2025-05-30 07:17:06

Ghidorah: Fast LLM Inference on Edge with Speculative Decoding and Hetero-Core Parallelism
Jinhui Wei, Ye Huang, Yuhui Zhou, Jiazhi Jiang, Jiangsu Du
arxiv.org/abs/2505.23219

@arXiv_csCR_bot@mastoxiv.page
2025-07-02 08:11:40

Find a Scapegoat: Poisoning Membership Inference Attack and Defense to Federated Learning
Wenjin Mo, Zhiyuan Li, Minghong Fang, Mingwei Fang
arxiv.org/abs/2507.00423

@arXiv_statME_bot@mastoxiv.page
2025-07-01 11:09:43

Causal Inference in Panel Data with a Continuous Treatment
Zhiguo Xiao, Peikai Wu
arxiv.org/abs/2506.23226 arxiv.org/…

@arXiv_csDC_bot@mastoxiv.page
2025-07-02 08:09:40

LLM-Mesh: Enabling Elastic Sharing for Serverless LLM Inference
Chuhao Xu, Zijun Li, Quan Chen, Han Zhao, Minyi Guo
arxiv.org/abs/2507.00507

@arXiv_csCR_bot@mastoxiv.page
2025-06-02 07:17:32

Synopsis: Secure and private trend inference from encrypted semantic embeddings
Madelyne Xiao, Palak Jain, Micha Gorelick, Sarah Scheffler
arxiv.org/abs/2505.23880

@arXiv_csDC_bot@mastoxiv.page
2025-06-02 07:17:16

SkyLB: A Locality-Aware Cross-Region Load Balancer for LLM Inference
Tian Xia, Ziming Mao, Jamison Kerney, Ethan J. Jackson, Zhifei Li, Jiarong Xing, Scott Shenker, Ion Stoica
arxiv.org/abs/2505.24095

@arXiv_statME_bot@mastoxiv.page
2025-06-03 17:00:18

This arxiv.org/abs/2304.12414 has been replaced.
initial toot: mastoxiv.page/@arXiv_sta…

@arXiv_csDC_bot@mastoxiv.page
2025-06-30 07:40:59

SiPipe: Bridging the CPU-GPU Utilization Gap for Efficient Pipeline-Parallel LLM Inference
Yongchao He, Bohan Zhao, Zheng Cao
arxiv.org/abs/2506.22033

@arXiv_statME_bot@mastoxiv.page
2025-07-01 11:27:23

Large Language Models for Statistical Inference: Context Augmentation with Applications to the Two-Sample Problem and Regression
Marc Ratkovic
arxiv.org/abs/2506.23862

@arXiv_statME_bot@mastoxiv.page
2025-06-03 08:05:26

Reluctant Interaction Inference after Additive Modeling
Yiling Huang, Snigdha Panigrahi, Guo Yu, Jacob Bien
arxiv.org/abs/2506.01219

@arXiv_csDC_bot@mastoxiv.page
2025-06-05 07:17:54

Cascadia: A Cascade Serving System for Large Language Models
Youhe Jiang, Fangcheng Fu, Wanru Zhao, Stephan Rabanser, Nicholas D. Lane, Binhang Yuan
arxiv.org/abs/2506.04203