Tootfinder

Opt-in global Mastodon full text search. Join the index!

@Techmeme@techhub.social
2026-02-28 03:21:12

Sam Altman says OpenAI reached an agreement with the DOD to deploy its models in DOD's classified network and asks DOD to extend those terms to all AI companies (Sam Altman/@sama)
x.com/sama/status/202757858015

@dichotomiker@dresden.network
2026-02-16 20:13:41

What do influenza, influencers, and the artificial intelligentsia have in common? (2004)
#sociology

@arXiv_csCL_bot@mastoxiv.page
2026-03-31 11:12:53

Replaced article(s) found for cs.CL. arxiv.org/list/cs.CL/new
[3/5]:
- Can Small Language Models Handle Context-Summarized Multi-Turn Customer-Service QA? A Synthetic D...
Lakshan Cooray, Deshan Sumanathilaka, Pattigadapa Venkatesh Raju
arxiv.org/abs/2602.00665 mastoxiv.page/@arXiv_csCL_bot/
- SEAD: Self-Evolving Agent for Multi-Turn Service Dialogue
Dai, Gao, Zhang, Wang, Luo, Wang, Wang, Wu, Wang
arxiv.org/abs/2602.03548
- OmniRAG-Agent: Agentic Omnimodal Reasoning for Low-Resource Long Audio-Video Question Answering
Yifan Zhu, Xinyu Mu, Tao Feng, Zhonghong Ou, Yuning Gong, Haoran Luo
arxiv.org/abs/2602.03707
- GreekMMLU: A Native-Sourced Multitask Benchmark for Evaluating Language Models in Greek
Zhang, Konomi, Xypolopoulos, Divriotis, Skianis, Nikolentzos, Stamou, Shang, Vazirgiannis
arxiv.org/abs/2602.05150
- Using LLMs for Knowledge Component-level Correctness Labeling in Open-ended Coding Problems
Zhangqi Duan, Arnav Kankaria, Dhruv Kartik, Andrew Lan
arxiv.org/abs/2602.17542 mastoxiv.page/@arXiv_csCL_bot/
- MetaState: Persistent Working Memory Enhances Reasoning in Discrete Diffusion Language Models
Kejing Xia, Mingzhe Li, Lixuan Wei, Zhenbang Du, Xiangchi Yuan, Dachuan Shi, Qirui Jin, Wenke Lee
arxiv.org/abs/2603.01331 mastoxiv.page/@arXiv_csCL_bot/
- A Browser-based Open Source Assistant for Multimodal Content Verification
Milner, Foster, Karmakharm, Razuvayevskaya, Roberts, Porcellini, Teyssou, Bontcheva
arxiv.org/abs/2603.02842 mastoxiv.page/@arXiv_csCL_bot/
- Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR
Sharma, Shrestha, Poudel, Tiwari, Shrestha, Ghimire, Bal
arxiv.org/abs/2603.07554 mastoxiv.page/@arXiv_csCL_bot/
- Model Merging in the Era of Large Language Models: Methods, Applications, and Future Directions
Mingyang Song, Mao Zheng
arxiv.org/abs/2603.09938 mastoxiv.page/@arXiv_csCL_bot/
- AgentDrift: Unsafe Recommendation Drift Under Tool Corruption Hidden by Ranking Metrics in LLM Ag...
Zekun Wu, Adriano Koshiyama, Sahan Bulathwela, Maria Perez-Ortiz
arxiv.org/abs/2603.12564 mastoxiv.page/@arXiv_csCL_bot/
- GhanaNLP Parallel Corpora: Comprehensive Multilingual Resources for Low-Resource Ghanaian Languages
Gyamfi, Azunre, Moore, Budu, Asare, Owusu, Asiamah
arxiv.org/abs/2603.13793 mastoxiv.page/@arXiv_csCL_bot/
- sebis at ArchEHR-QA 2026: How Much Can You Do Locally? Evaluating Grounded EHR QA on a Single Not...
Ibrahim Ebrar Yurt, Fabian Karl, Tejaswi Choppa, Florian Matthes
arxiv.org/abs/2603.13962 mastoxiv.page/@arXiv_csCL_bot/
- ExPosST: Explicit Positioning with Adaptive Masking for LLM-Based Simultaneous Machine Translation
Yuzhe Shang, Pengzhi Gao, Yazheng Yang, Jiayao Ma, Wei Liu, Jian Luan, Jinsong Su
arxiv.org/abs/2603.14903 mastoxiv.page/@arXiv_csCL_bot/
- BanglaSocialBench: A Benchmark for Evaluating Sociopragmatic and Cultural Alignment of LLMs in Ba...
Tanvir Ahmed Sijan, S. M Golam Rifat, Pankaj Chowdhury Partha, Md. Tanjeed Islam, Md. Musfique Anwar
arxiv.org/abs/2603.15949 mastoxiv.page/@arXiv_csCL_bot/
- EngGPT2: Sovereign, Efficient and Open Intelligence
G. Ciarfaglia, et al.
arxiv.org/abs/2603.16430 mastoxiv.page/@arXiv_csCL_bot/
- HypeLoRA: Hyper-Network-Generated LoRA Adapters for Calibrated Language Model Fine-Tuning
Bartosz Trojan, Filip Gębala
arxiv.org/abs/2603.19278 mastoxiv.page/@arXiv_csCL_bot/
- Automatic Analysis of Collaboration Through Human Conversational Data Resources: A Review
Yi Yu, Maria Boritchev, Chloé Clavel
arxiv.org/abs/2603.19292 mastoxiv.page/@arXiv_csCL_bot/
- Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Langu...
Xinyue Liu, Niloofar Mireshghallah, Jane C. Ginsburg, Tuhin Chakrabarty
arxiv.org/abs/2603.20957 mastoxiv.page/@arXiv_csCL_bot/
- KG-Hopper: Empowering Compact Open LLMs with Knowledge Graph Reasoning via Reinforcement Learning
Shuai Wang, Yinan Yu
arxiv.org/abs/2603.21440 mastoxiv.page/@arXiv_csCL_bot/
toXiv_bot_toot

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 16:08:18

Replaced article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[5/6]:
- Watermarking Degrades Alignment in Language Models: Analysis and Mitigation
Apurv Verma, NhatHai Phan, Shubhendu Trivedi
arxiv.org/abs/2506.04462 mastoxiv.page/@arXiv_csCL_bot/
- Sensory-Motor Control with Large Language Models via Iterative Policy Refinement
Jônata Tyska Carvalho, Stefano Nolfi
arxiv.org/abs/2506.04867 mastoxiv.page/@arXiv_csAI_bot/
- ICE-ID: A Novel Historical Census Dataset for Longitudinal Identity Resolution
de Carvalho, Popov, Kaatee, Correia, Thórisson, Li, Björnsson, Sigurðarson, Dibangoye
arxiv.org/abs/2506.13792 mastoxiv.page/@arXiv_csAI_bot/
- Feedback-driven recurrent quantum neural network universality
Lukas Gonon, Rodrigo Martínez-Peña, Juan-Pablo Ortega
arxiv.org/abs/2506.16332 mastoxiv.page/@arXiv_quantph_b
- Programming by Backprop: An Instruction is Worth 100 Examples When Finetuning LLMs
Cook, Sapora, Ahmadian, Khan, Rocktaschel, Foerster, Ruis
arxiv.org/abs/2506.18777 mastoxiv.page/@arXiv_csAI_bot/
- Stochastic Quantum Spiking Neural Networks with Quantum Memory and Local Learning
Jiechen Chen, Bipin Rajendran, Osvaldo Simeone
arxiv.org/abs/2506.21324 mastoxiv.page/@arXiv_csNE_bot/
- Enjoying Non-linearity in Multinomial Logistic Bandits: A Minimax-Optimal Algorithm
Pierre Boudart (SIERRA), Pierre Gaillard (Thoth), Alessandro Rudi (PSL, DI-ENS, Inria)
arxiv.org/abs/2507.05306 mastoxiv.page/@arXiv_statML_bo
- Characterizing State Space Model and Hybrid Language Model Performance with Long Context
Saptarshi Mitra, Rachid Karami, Haocheng Xu, Sitao Huang, Hyoukjun Kwon
arxiv.org/abs/2507.12442 mastoxiv.page/@arXiv_csAR_bot/
- Is Exchangeability better than I.I.D to handle Data Distribution Shifts while Pooling Data for Da...
Ayush Roy, Samin Enam, Jun Xia, Won Hwa Kim, Vishnu Suresh Lokhande
arxiv.org/abs/2507.19575 mastoxiv.page/@arXiv_csCV_bot/
- TASER: Table Agents for Schema-guided Extraction and Recommendation
Nicole Cho, Kirsty Fielding, William Watson, Sumitra Ganesh, Manuela Veloso
arxiv.org/abs/2508.13404 mastoxiv.page/@arXiv_csAI_bot/
- Morphology-Aware Peptide Discovery via Masked Conditional Generative Modeling
Nuno Costa, Julija Zavadlav
arxiv.org/abs/2509.02060 mastoxiv.page/@arXiv_qbioBM_bo
- PCPO: Proportionate Credit Policy Optimization for Aligning Image Generation Models
Jeongjae Lee, Jong Chul Ye
arxiv.org/abs/2509.25774 mastoxiv.page/@arXiv_csCV_bot/
- Multi-hop Deep Joint Source-Channel Coding with Deep Hash Distillation for Semantically Aligned I...
Didrik Bergström, Deniz Gündüz, Onur Günlü
arxiv.org/abs/2510.06868 mastoxiv.page/@arXiv_csIT_bot/
- MoMaGen: Generating Demonstrations under Soft and Hard Constraints for Multi-Step Bimanual Mobile...
Chengshu Li, et al.
arxiv.org/abs/2510.18316 mastoxiv.page/@arXiv_csRO_bot/
- A Spectral Framework for Graph Neural Operators: Convergence Guarantees and Tradeoffs
Roxanne Holden, Luana Ruiz
arxiv.org/abs/2510.20954 mastoxiv.page/@arXiv_statML_bo
- Breaking Agent Backbones: Evaluating the Security of Backbone LLMs in AI Agents
Bazinska, Mathys, Casucci, Rojas-Carulla, Davies, Souly, Pfister
arxiv.org/abs/2510.22620 mastoxiv.page/@arXiv_csCR_bot/
- Uncertainty Calibration of Multi-Label Bird Sound Classifiers
Raphael Schwinger, Ben McEwen, Vincent S. Kather, René Heinrich, Lukas Rauch, Sven Tomforde
arxiv.org/abs/2511.08261 mastoxiv.page/@arXiv_csSD_bot/
- Two-dimensional RMSD projections for reaction path visualization and validation
Rohit Goswami (Institute IMX and Lab-COSMO, École polytechnique fédérale de Lausanne)
arxiv.org/abs/2512.07329 mastoxiv.page/@arXiv_physicsch
- Distribution-informed Online Conformal Prediction
Dongjian Hu, Junxi Wu, Shu-Tao Xia, Changliang Zou
arxiv.org/abs/2512.07770 mastoxiv.page/@arXiv_statML_bo
- Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
Ang Lv, Jin Ma, Yiyuan Ma, Siyuan Qiao
arxiv.org/abs/2512.23447 mastoxiv.page/@arXiv_csCL_bot/
toXiv_bot_toot

@Techmeme@techhub.social
2026-02-21 15:21:09

A look at the fundamental questions facing OpenAI: its models have a very large user base but very narrow engagement, incumbents are matching its tech, and more (Benedict Evans)
ben-evans.com/benedictevans/20

@arXiv_physicschemph_bot@mastoxiv.page
2026-03-27 08:19:37

Autotuning T-PaiNN: Enabling Data-Efficient GNN Interatomic Potential Development via Classical-to-Quantum Transfer Learning
Vivienne Pelletier, Vedant Bhat, Daniel J. Rivera, Steven A. Wilson, Christopher L. Muhich
arxiv.org/abs/2603.24752 arxiv.org/pdf/2603.24752 arxiv.org/html/2603.24752
arXiv:2603.24752v1 Announce Type: new
Abstract: Machine-learned interatomic potentials (MLIPs), particularly graph neural network (GNN)-based models, offer a promising route to achieving near-density functional theory (DFT) accuracy at significantly reduced computational cost. However, their practical deployment is often limited by the large volumes of expensive quantum mechanical training data required. In this work, we introduce a transfer learning framework, Transfer-PaiNN (T-PaiNN), that substantially improves the data efficiency of GNN-MLIPs by leveraging inexpensive classical force field data. The approach consists of pretraining a PaiNN MLIP architecture on large-scale datasets generated from classical molecular simulations, followed by fine-tuning (dubbed autotuning) using a comparatively small DFT dataset. We demonstrate the effectiveness of autotuning T-PaiNN on both gas-phase molecular systems (QM9 dataset) and condensed-phase liquid water. Across all cases, T-PaiNN significantly outperforms models trained solely on DFT data, achieving order-of-magnitude reductions in mean absolute error while accelerating training convergence. For example, using the QM9 data set, error reductions of up to 25 times are observed in low-data regimes, while liquid water simulations show improved predictions of energies, forces, and experimentally relevant properties such as density and diffusion. These gains arise from the model's ability to learn general features of the potential energy surface from extensive classical sampling, which are subsequently refined to quantum accuracy. Overall, this work establishes transfer learning from classical force fields as a practical and computationally efficient strategy for developing high-accuracy, data-efficient GNN interatomic potentials, enabling broader application of MLIPs to complex chemical systems.
toXiv_bot_toot
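
Below is a minimal sketch, in PyTorch, of the two-stage pretrain/fine-tune ("autotuning") workflow the abstract describes. The tiny network, synthetic batches, learning rates, and epoch counts are all illustrative stand-ins, not the authors' T-PaiNN code or data.

import torch
import torch.nn as nn

class TinyPotential(nn.Module):
    """Stand-in for a GNN interatomic potential such as PaiNN."""
    def __init__(self, n_features=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, n_features), nn.SiLU(),
            nn.Linear(n_features, n_features), nn.SiLU(),
            nn.Linear(n_features, 1),
        )

    def forward(self, coords):
        # Sum per-atom contributions into a total energy per configuration.
        return self.net(coords).sum(dim=1).squeeze(-1)

def make_batches(n_batches, n_atoms=8, batch=32):
    # Random (coordinates, energy) pairs standing in for real trajectories.
    return [(torch.randn(batch, n_atoms, 3), torch.randn(batch))
            for _ in range(n_batches)]

def train(model, batches, epochs, lr):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for coords, energy in batches:
            loss = nn.functional.mse_loss(model(coords), energy)
            opt.zero_grad(); loss.backward(); opt.step()

model = TinyPotential()
# Stage 1: pretrain on abundant, cheap classical force-field data.
train(model, make_batches(200), epochs=1, lr=1e-3)
# Stage 2: "autotune" on a small, expensive DFT set at a lower learning rate.
train(model, make_batches(5), epochs=20, lr=1e-4)

The point of the sketch is only the data flow: a large classical-data stage fixes the broad shape of the potential energy surface, and a short DFT stage refines it toward quantum accuracy.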

@arXiv_qbioPE_bot@mastoxiv.page
2026-03-30 08:26:42

Evaluating Phylogenetic Comparative Methods under Reticulate Evolutionary Scenarios
Lydia Morley, Emma Lehmberg, Sungsik Kong
arxiv.org/abs/2603.25986 arxiv.org/pdf/2603.25986 arxiv.org/html/2603.25986
arXiv:2603.25986v1 Announce Type: new
Abstract: Phylogenetic comparative methods (PCMs) are widely used to study trait evolution. However, many evolutionary histories involve reticulate evolutionary scenarios, such as hybridization, that violate core assumptions of these methods. In this study, we evaluate how such violations affect the performance of PCMs. In particular, we focus on ancestral character estimation, evolutionary rate estimation, and model selection. We simulate continuous trait evolution on various phylogenetic network topologies and assess the performance of PCMs that assume a bifurcating tree (i.e., the major tree of the network) as the underlying model of evolution. We found that the performance of the tested PCMs was suboptimal. Using random forest, generalized linear models, and model-based clustering, we identified key factors contributing to these inaccuracies. Our results show that frequent and/or recent hybridization accompanied by one or more transgressive events and rapidly evolving traits (i.e., high evolutionary rate) lead to significant estimation error, especially with respect to rate estimation and model choice. These factors substantially shift trait values away from tree-based model expectations, leading to overall increased error in parameter estimates. Our study demonstrates cases in which PCMs that rely on trees are likely to misinterpret biological histories and offers recommendations for researchers studying systems with complex evolutionary histories.
toXiv_bot_toot
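
As a concrete illustration of the simulation design in the abstract, here is a minimal sketch that simulates Brownian-motion trait evolution on a toy 4-taxon bifurcating tree and recovers the evolutionary rate by maximum likelihood. The tree, rate, and root assumptions are invented for illustration, not the paper's setup (which uses networks, where exactly this tree-based estimator degrades).

import numpy as np

rng = np.random.default_rng(0)

# Phylogenetic covariance of a balanced 4-taxon tree of depth 1.0 under
# Brownian motion: C[i, j] = shared branch length between tips i and j.
C = np.array([
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
])
sigma2_true = 2.0  # true BM rate

def simulate_tips(sigma2, n_reps):
    # Tip traits ~ N(0, sigma2 * C), i.e., BM with root state fixed at 0.
    L = np.linalg.cholesky(sigma2 * C)
    return (L @ rng.standard_normal((4, n_reps))).T

def mle_rate(tips):
    # ML estimate of sigma2 under the assumed tree covariance C.
    Cinv = np.linalg.inv(C)
    return float(np.mean([x @ Cinv @ x / 4 for x in tips]))

tips = simulate_tips(sigma2_true, n_reps=2000)
print("estimated rate:", mle_rate(tips))  # close to 2.0 when the tree is right

When the generating history is a network (hybridization mixes lineages), the true tip covariance no longer matches C and the estimator is biased, which is the failure mode the paper quantifies.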

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 12:33:22

Crosslisted article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[1/3]:
- SMaRT: Online Reusable Resource Assignment and an Application to Mediation in the Kenyan Judiciary
Farabi, Pinto, Lu, Ramos-Maqueda, Das, Deeb, Sautmann
arxiv.org/abs/2602.18431 mastoxiv.page/@arXiv_csCY_bot/
- Benchmarking Distilled Language Models: Performance and Efficiency in Resource-Constrained Settings
Sachin Gopal Wani, Eric Page, Ajay Dholakia, David Ellison
arxiv.org/abs/2602.20164 mastoxiv.page/@arXiv_csCL_bot/
- VISION-ICE: Video-based Interpretation and Spatial Identification of Arrhythmia Origins via Neura...
Dorsa EPMoghaddam, Feng Gao, Drew Bernard, Kavya Sinha, Mehdi Razavi, Behnaam Aazhang
arxiv.org/abs/2602.20165 mastoxiv.page/@arXiv_csCV_bot/
- Benchmarking Early Deterioration Prediction Across Hospital-Rich and MCI-Like Emergency Triage Un...
KMA Solaiman, Joshua Sebastian, Karma Tobden
arxiv.org/abs/2602.20168 mastoxiv.page/@arXiv_csCY_bot/
- Cross-Chirality Generalization by Axial Vectors for Hetero-Chiral Protein-Peptide Interaction Design
Yang, Tian, Jia, Zhang, Zheng, Wang, Su, He, Liu, Lan
arxiv.org/abs/2602.20176 mastoxiv.page/@arXiv_qbioBM_bo
- Enhancing Heat Sink Efficiency in MOSFETs using Physics Informed Neural Networks: A Systematic St...
Aniruddha Bora, Isabel K. Alvarez, Julie Chalfant, Chryssostomos Chryssostomidis
arxiv.org/abs/2602.20177 mastoxiv.page/@arXiv_csNE_bot/
- Data-Driven Deep MIMO Detection: Network Architectures and Generalization Analysis
Yongwei Yi, Xinping Yi, Wenjin Wang, Xiao Li, Shi Jin
arxiv.org/abs/2602.20178 mastoxiv.page/@arXiv_eessSP_bo
- OrgFlow: Generative Modeling of Organic Crystal Structures from Molecular Graphs
Mohammadmahdi Vahediahmar, Matthew A. McDonald, Feng Liu
arxiv.org/abs/2602.20195 mastoxiv.page/@arXiv_condmatmt
- KEMP-PIP: A Feature-Fusion Based Approach for Pro-inflammatory Peptide Prediction
Soumik Deb Niloy, Md. Fahmid-Ul-Alam Juboraj, Swakkhar Shatabda
arxiv.org/abs/2602.20198 mastoxiv.page/@arXiv_qbioQM_bo
- Regressor-guided Diffusion Model for De Novo Peptide Sequencing with Explicit Mass Control
Shaorong Chen, Jingbo Zhou, Jun Xia
arxiv.org/abs/2602.20209 mastoxiv.page/@arXiv_qbioQM_bo
- The Sim-to-Real Gap in MRS Quantification: A Systematic Deep Learning Validation for GABA
Zien Ma, S. M. Shermer, Oktay Karakuş, Frank C. Langbein
arxiv.org/abs/2602.20289 mastoxiv.page/@arXiv_eessSP_bo
- Gap-Dependent Bounds for Nearly Minimax Optimal Reinforcement Learning with Linear Function Appro...
Haochen Zhang, Zhong Zheng, Lingzhou Xue
arxiv.org/abs/2602.20297 mastoxiv.page/@arXiv_statML_bo
- Multilevel Determinants of Overweight and Obesity Among U.S. Children Aged 10-17: Comparative Eva...
Joyanta Jyoti Mondal
arxiv.org/abs/2602.20303 mastoxiv.page/@arXiv_csAI_bot/
- An artificial intelligence framework for end-to-end rare disease phenotyping from clinical notes ...
Shyr, Hu, Tinker, Cassini, Byram, Hamid, Fabbri, Wright, Peterson, Bastarache, Xu
arxiv.org/abs/2602.20324 mastoxiv.page/@arXiv_csAI_bot/
- Circuit Tracing in Vision-Language Models: Understanding the Internal Mechanisms of Multimodal Th...
Jingcheng Yang, Tianhu Xiong, Shengyi Qian, Klara Nahrstedt, Mingyuan Wu
arxiv.org/abs/2602.20330 mastoxiv.page/@arXiv_csCV_bot/
- No One Size Fits All: QueryBandits for Hallucination Mitigation
Nicole Cho, William Watson, Alec Koppel, Sumitra Ganesh, Manuela Veloso
arxiv.org/abs/2602.20332 mastoxiv.page/@arXiv_csCL_bot/
- Learning During Detection: Continual Learning for Neural OFDM Receivers via DMRS
Mohanad Obeed, Ming Jian
arxiv.org/abs/2602.20361 mastoxiv.page/@arXiv_csIT_bot/
- Detecting and Mitigating Group Bias in Heterogeneous Treatment Effects
Joel Persson, Jurriën Bakker, Dennis Bohle, Stefan Feuerriegel, Florian von Wangenheim
arxiv.org/abs/2602.20383 mastoxiv.page/@arXiv_statME_bo
- Selecting Optimal Variable Order in Autoregressive Ising Models
Shiba Biswal, Marc Vuffray, Andrey Y. Lokhov
arxiv.org/abs/2602.20394 mastoxiv.page/@arXiv_statML_bo
toXiv_bot_toot

@arXiv_physicschemph_bot@mastoxiv.page
2026-03-27 08:36:07

Deep learning of committor and explainable artificial intelligence analysis for identifying reaction coordinates
Toshifumi Mori, Kei-ichi Okazaki, Kang Kim, Nobuyuki Matubayasi
arxiv.org/abs/2603.25237 arxiv.org/pdf/2603.25237 arxiv.org/html/2603.25237
arXiv:2603.25237v1 Announce Type: new
Abstract: In complex molecular systems, the reaction coordinate (RC) that characterizes transition pathways is essential to understanding the underlying molecular mechanisms. This review surveys a framework for identifying the RC by applying deep learning to the committor, which provides the most reliable measure of progress along a transition path. The inputs to the neural network are collective variables (CVs) expressed as functions of the atomic coordinates of the system, and the corresponding RC is predicted as the output by training the network on the committor as the learning target. Because deep learning models typically operate in a black-box manner, it is difficult to determine which input variables govern the predictions. The incorporation of eXplainable Artificial Intelligence (XAI) techniques enables quantitative assessment of the contributions of individual input variables to the predictions. This approach allows the identification of CVs that play dominant roles and demonstrates that the committor distribution on the surface spanned by the important CVs is separated by well-defined boundaries. The framework provides an explainable deep learning strategy for assigning a molecular mechanism from the RC and is applicable to a wide range of complex molecular systems.
toXiv_bot_toot
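
A minimal sketch of the reviewed pipeline: fit a small network mapping collective variables (CVs) to the committor, then score each CV with a gradient-saliency attribution. Plain saliency is only one simple stand-in for the XAI analyses the review covers, and the synthetic data (committor driven mainly by CV 0) is invented for illustration.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic data: the committor depends strongly on CV 0, weakly on CV 1,
# and not at all on CV 2.
cvs = torch.randn(4096, 3)
committor = torch.sigmoid(3.0 * cvs[:, 0] + 0.3 * cvs[:, 1])

net = nn.Sequential(nn.Linear(3, 32), nn.Tanh(), nn.Linear(32, 1), nn.Sigmoid())
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(500):
    loss = nn.functional.mse_loss(net(cvs).squeeze(-1), committor)
    opt.zero_grad(); loss.backward(); opt.step()

# Attribution: mean |d committor / d CV_i| over the data ranks CV importance.
x = cvs.clone().requires_grad_(True)
net(x).sum().backward()
print("per-CV saliency:", x.grad.abs().mean(dim=0).tolist())  # CV 0 dominates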

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 16:07:58

Replaced article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[3/6]:
- Towards Scalable Oversight via Partitioned Human Supervision
Ren Yin, Takashi Ishida, Masashi Sugiyama
arxiv.org/abs/2510.22500 mastoxiv.page/@arXiv_csLG_bot/
- ContextPilot: Fast Long-Context Inference via Context Reuse
Yinsicheng Jiang, Yeqi Huang, Liang Cheng, Cheng Deng, Xuan Sun, Luo Mai
arxiv.org/abs/2511.03475 mastoxiv.page/@arXiv_csLG_bot/
- Metabolomic Biomarker Discovery for ADHD Diagnosis Using Interpretable Machine Learning
Nabil Belacel, Mohamed Rachid Boulassel
arxiv.org/abs/2601.11283 mastoxiv.page/@arXiv_csLG_bot/
- PhysE-Inv: A Physics-Encoded Inverse Modeling approach for Arctic Snow Depth Prediction
Akila Sampath, Vandana Janeja, Jianwu Wang
arxiv.org/abs/2601.17074
- SAGE-5GC: Security-Aware Guidelines for Evaluating Anomaly Detection in the 5G Core Network
Cristian Manca, Christian Scano, Giorgio Piras, Fabio Brau, Maura Pintor, Battista Biggio
arxiv.org/abs/2602.03596
- LORE: Jointly Learning the Intrinsic Dimensionality and Relative Similarity Structure From Ordina...
Anand, Helbling, Davenport, Berman, Alagapan, Rozell
arxiv.org/abs/2602.04192
- Towards Robust Scaling Laws for Optimizers
Alexandra Volkova, Mher Safaryan, Christoph H. Lampert, Dan Alistarh
arxiv.org/abs/2602.07712 mastoxiv.page/@arXiv_csLG_bot/
- Do We Need Adam? Surprisingly Strong and Sparse Reinforcement Learning with SGD in LLMs
Sagnik Mukherjee, Lifan Yuan, Pavan Jayasinha, Dilek Hakkani-Tür, Hao Peng
arxiv.org/abs/2602.07729 mastoxiv.page/@arXiv_csLG_bot/
- AceGRPO: Adaptive Curriculum Enhanced Group Relative Policy Optimization for Autonomous Machine L...
Yuzhu Cai, Zexi Liu, Xinyu Zhu, Cheng Wang, Siheng Chen
arxiv.org/abs/2602.07906 mastoxiv.page/@arXiv_csLG_bot/
- VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training
Guobin Shen, Chenxiao Zhao, Xiang Cheng, Lei Huang, Xing Yu
arxiv.org/abs/2602.10693 mastoxiv.page/@arXiv_csLG_bot/
- KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models
Zukang Xu, Zhixiong Zhao, Xing Hu, Zhixuan Chen, Dawei Yang
arxiv.org/abs/2602.11184 mastoxiv.page/@arXiv_csLG_bot/
- MUSE: Multi-Tenant Model Serving With Seamless Model Updates
Correia, Ferreira, Martins, Bento, Guerreiro, Pereira, Gomes, Bono, Ferreira, Bizarro
arxiv.org/abs/2602.11776 mastoxiv.page/@arXiv_csLG_bot/
- Pawsterior: Variational Flow Matching for Structured Simulation-Based Inference
Jorge Carrasco-Pollo, Floor Eijkelboom, Jan-Willem van de Meent
arxiv.org/abs/2602.13813 mastoxiv.page/@arXiv_csLG_bot/
- Silent Inconsistency in Data-Parallel Full Fine-Tuning: Diagnosing Worker-Level Optimization Misa...
Hong Li, Zhen Zhou, Honggang Zhang, Yuping Luo, Xinyue Wang, Han Gong, Zhiyuan Liu
arxiv.org/abs/2602.14462 mastoxiv.page/@arXiv_csLG_bot/
- Divine Benevolence is an $x^2$: GLUs scale asymptotically faster than MLPs
Alejandro Francisco Queiruga
arxiv.org/abs/2602.14495 mastoxiv.page/@arXiv_csLG_bot/
- \"UberWeb: Insights from Multilingual Curation for a 20-Trillion-Token Dataset
DatologyAI, et al.
arxiv.org/abs/2602.15210 mastoxiv.page/@arXiv_csLG_bot/
- GLM-5: from Vibe Coding to Agentic Engineering
GLM-5-Team, et al.
arxiv.org/abs/2602.15763 mastoxiv.page/@arXiv_csLG_bot/
- Anatomy of Capability Emergence: Scale-Invariant Representation Collapse and Top-Down Reorganizat...
Jayadev Billa
arxiv.org/abs/2602.15997 mastoxiv.page/@arXiv_csLG_bot/
- AI-CARE: Carbon-Aware Reporting Evaluation Metric for AI Models
KC Santosh, Srikanth Baride, Rodrigue Rizk
arxiv.org/abs/2602.16042 mastoxiv.page/@arXiv_csLG_bot/
- Beyond Message Passing: A Symbolic Alternative for Expressive and Interpretable Graph Learning
Chuqin Geng, Li Zhang, Haolin Ye, Ziyu Zhao, Yuhe Jiang, Tara Saba, Xinyu Wang, Xujie Si
arxiv.org/abs/2602.16947 mastoxiv.page/@arXiv_csLG_bot/
toXiv_bot_toot

@arXiv_csLG_bot@mastoxiv.page
2026-02-25 10:33:41

Sparse Bayesian Deep Functional Learning with Structured Region Selection
Xiaoxian Zhu, Yingmeng Li, Shuangge Ma, Mengyun Wu
arxiv.org/abs/2602.20651 arxiv.org/pdf/2602.20651 arxiv.org/html/2602.20651
arXiv:2602.20651v1 Announce Type: new
Abstract: In modern applications such as ECG monitoring, neuroimaging, wearable sensing, and industrial equipment diagnostics, complex and continuously structured data are ubiquitous, presenting both challenges and opportunities for functional data analysis. However, existing methods face a critical trade-off: conventional functional models are limited by linearity, whereas deep learning approaches lack interpretable region selection for sparse effects. To bridge these gaps, we propose a sparse Bayesian functional deep neural network (sBayFDNN). It learns adaptive functional embeddings through a deep Bayesian architecture to capture complex nonlinear relationships, while a structured prior enables interpretable, region-wise selection of influential domains with quantified uncertainty. Theoretically, we establish rigorous approximation error bounds, posterior consistency, and region selection consistency. These results provide the first theoretical guarantees for a Bayesian deep functional model, ensuring its reliability and statistical rigor. Empirically, comprehensive simulations and real-world studies confirm the effectiveness and superiority of sBayFDNN. Crucially, sBayFDNN excels in recognizing intricate dependencies for accurate predictions and more precisely identifies functionally meaningful regions, capabilities fundamentally beyond existing approaches.
toXiv_bot_toot
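
To make the region-selection idea concrete, here is a rough sketch: functional inputs on a grid pass through learnable per-region gates before a small network, and a sparsity penalty on the gates drives irrelevant regions toward zero. This uses a frequentist L1 gate penalty as a stand-in for the paper's structured Bayesian prior, and the data, region layout, and penalty weight are invented for illustration.

import torch
import torch.nn as nn

torch.manual_seed(0)

GRID, REGIONS = 100, 10           # functional domain split into 10 regions
x = torch.randn(512, GRID)        # curves observed at 100 grid points
y = x[:, 20:30].mean(dim=1) ** 2  # outcome depends only on region 2

gates = nn.Parameter(torch.ones(REGIONS))  # one learnable gate per region
net = nn.Sequential(nn.Linear(GRID, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam([gates, *net.parameters()], lr=1e-2)

for _ in range(1000):
    # Expand each region gate over its 10 grid points and mask the curves.
    mask = gates.repeat_interleave(GRID // REGIONS)
    pred = net(x * mask).squeeze(-1)
    loss = nn.functional.mse_loss(pred, y) + 0.05 * gates.abs().sum()
    opt.zero_grad(); loss.backward(); opt.step()

print("region gates:", [round(g, 2) for g in gates.detach().tolist()])
# The gate for region 2 stays large; gates on irrelevant regions shrink.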