2026-02-28 03:21:12
Sam Altman says OpenAI reached an agreement with the DOD to deploy its models in DOD's classified network and asks DOD to extend those terms to all AI companies (Sam Altman/@sama)
https://x.com/sama/status/2027578580159631610
What do influenza, influenzers, and artificial intelligentsia have in common? (2004)
#sociology
Replaced article(s) found for cs.CL. https://arxiv.org/list/cs.CL/new
[3/5]:
- Can Small Language Models Handle Context-Summarized Multi-Turn Customer-Service QA? A Synthetic D...
Lakshan Cooray, Deshan Sumanathilaka, Pattigadapa Venkatesh Raju
https://arxiv.org/abs/2602.00665 https://mastoxiv.page/@arXiv_csCL_bot/116006686092324902
- SEAD: Self-Evolving Agent for Multi-Turn Service Dialogue
Dai, Gao, Zhang, Wang, Luo, Wang, Wang, Wu, Wang
https://arxiv.org/abs/2602.03548
- OmniRAG-Agent: Agentic Omnimodal Reasoning for Low-Resource Long Audio-Video Question Answering
Yifan Zhu, Xinyu Mu, Tao Feng, Zhonghong Ou, Yuning Gong, Haoran Luo
https://arxiv.org/abs/2602.03707
- GreekMMLU: A Native-Sourced Multitask Benchmark for Evaluating Language Models in Greek
Zhang, Konomi, Xypolopoulos, Divriotis, Skianis, Nikolentzos, Stamou, Shang, Vazirgiannis
https://arxiv.org/abs/2602.05150
- Using LLMs for Knowledge Component-level Correctness Labeling in Open-ended Coding Problems
Zhangqi Duan, Arnav Kankaria, Dhruv Kartik, Andrew Lan
https://arxiv.org/abs/2602.17542 https://mastoxiv.page/@arXiv_csCL_bot/116102514058414603
- MetaState: Persistent Working Memory Enhances Reasoning in Discrete Diffusion Language Models
Kejing Xia, Mingzhe Li, Lixuan Wei, Zhenbang Du, Xiangchi Yuan, Dachuan Shi, Qirui Jin, Wenke Lee
https://arxiv.org/abs/2603.01331 https://mastoxiv.page/@arXiv_csCL_bot/116165314672421581
- A Browser-based Open Source Assistant for Multimodal Content Verification
Milner, Foster, Karmakharm, Razuvayevskaya, Roberts, Porcellini, Teyssou, Bontcheva
https://arxiv.org/abs/2603.02842 https://mastoxiv.page/@arXiv_csCL_bot/116170368271004704
- Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR
Sharma, Shrestha, Poudel, Tiwari, Shrestha, Ghimire, Bal
https://arxiv.org/abs/2603.07554 https://mastoxiv.page/@arXiv_csCL_bot/116204797995674104
- Model Merging in the Era of Large Language Models: Methods, Applications, and Future Directions
Mingyang Song, Mao Zheng
https://arxiv.org/abs/2603.09938 https://mastoxiv.page/@arXiv_csCL_bot/116210189810004206
- AgentDrift: Unsafe Recommendation Drift Under Tool Corruption Hidden by Ranking Metrics in LLM Ag...
Zekun Wu, Adriano Koshiyama, Sahan Bulathwela, Maria Perez-Ortiz
https://arxiv.org/abs/2603.12564 https://mastoxiv.page/@arXiv_csCL_bot/116237800898328349
- GhanaNLP Parallel Corpora: Comprehensive Multilingual Resources for Low-Resource Ghanaian Languages
Gyamfi, Azunre, Moore, Budu, Asare, Owusu, Asiamah
https://arxiv.org/abs/2603.13793 https://mastoxiv.page/@arXiv_csCL_bot/116243544688031749
- sebis at ArchEHR-QA 2026: How Much Can You Do Locally? Evaluating Grounded EHR QA on a Single Not...
Ibrahim Ebrar Yurt, Fabian Karl, Tejaswi Choppa, Florian Matthes
https://arxiv.org/abs/2603.13962 https://mastoxiv.page/@arXiv_csCL_bot/116243646346563497
- ExPosST: Explicit Positioning with Adaptive Masking for LLM-Based Simultaneous Machine Translation
Yuzhe Shang, Pengzhi Gao, Yazheng Yang, Jiayao Ma, Wei Liu, Jian Luan, Jinsong Su
https://arxiv.org/abs/2603.14903 https://mastoxiv.page/@arXiv_csCL_bot/116243711232778054
- BanglaSocialBench: A Benchmark for Evaluating Sociopragmatic and Cultural Alignment of LLMs in Ba...
Tanvir Ahmed Sijan, S. M Golam Rifat, Pankaj Chowdhury Partha, Md. Tanjeed Islam, Md. Musfique Anwar
https://arxiv.org/abs/2603.15949 https://mastoxiv.page/@arXiv_csCL_bot/116249122231759766
- EngGPT2: Sovereign, Efficient and Open Intelligence
G. Ciarfaglia, et al.
https://arxiv.org/abs/2603.16430 https://mastoxiv.page/@arXiv_csCL_bot/116249228411487178
- HypeLoRA: Hyper-Network-Generated LoRA Adapters for Calibrated Language Model Fine-Tuning
Bartosz Trojan, Filip Gębala
https://arxiv.org/abs/2603.19278 https://mastoxiv.page/@arXiv_csCL_bot/116277612915482857
- Automatic Analysis of Collaboration Through Human Conversational Data Resources: A Review
Yi Yu, Maria Boritchev, Chloé Clavel
https://arxiv.org/abs/2603.19292 https://mastoxiv.page/@arXiv_csCL_bot/116277620779254916
- Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Langu...
Xinyue Liu, Niloofar Mireshghallah, Jane C. Ginsburg, Tuhin Chakrabarty
https://arxiv.org/abs/2603.20957 https://mastoxiv.page/@arXiv_csCL_bot/116283538317671552
- KG-Hopper: Empowering Compact Open LLMs with Knowledge Graph Reasoning via Reinforcement Learning
Shuai Wang, Yinan Yu
https://arxiv.org/abs/2603.21440 https://mastoxiv.page/@arXiv_csCL_bot/116283595007808076
toXiv_bot_toot
Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[5/6]:
- Watermarking Degrades Alignment in Language Models: Analysis and Mitigation
Apurv Verma, NhatHai Phan, Shubhendu Trivedi
https://arxiv.org/abs/2506.04462 https://mastoxiv.page/@arXiv_csCL_bot/114635190037336859
- Sensory-Motor Control with Large Language Models via Iterative Policy Refinement
Jônata Tyska Carvalho, Stefano Nolfi
https://arxiv.org/abs/2506.04867 https://mastoxiv.page/@arXiv_csAI_bot/114635187854195641
- ICE-ID: A Novel Historical Census Dataset for Longitudinal Identity Resolution
de Carvalho, Popov, Kaatee, Correia, Thórisson, Li, Björnsson, Sigurðarson, Dibangoye
https://arxiv.org/abs/2506.13792 https://mastoxiv.page/@arXiv_csAI_bot/114703312162525342
- Feedback-driven recurrent quantum neural network universality
Lukas Gonon, Rodrigo Martínez-Peña, Juan-Pablo Ortega
https://arxiv.org/abs/2506.16332 https://mastoxiv.page/@arXiv_quantph_bot/114732532383196043
- Programming by Backprop: An Instruction is Worth 100 Examples When Finetuning LLMs
Cook, Sapora, Ahmadian, Khan, Rocktaschel, Foerster, Ruis
https://arxiv.org/abs/2506.18777 https://mastoxiv.page/@arXiv_csAI_bot/114738213040759661
- Stochastic Quantum Spiking Neural Networks with Quantum Memory and Local Learning
Jiechen Chen, Bipin Rajendran, Osvaldo Simeone
https://arxiv.org/abs/2506.21324 https://mastoxiv.page/@arXiv_csNE_bot/114754367612728319
- Enjoying Non-linearity in Multinomial Logistic Bandits: A Minimax-Optimal Algorithm
Pierre Boudart (SIERRA), Pierre Gaillard (Thoth), Alessandro Rudi (PSL, DI-ENS, Inria)
https://arxiv.org/abs/2507.05306 https://mastoxiv.page/@arXiv_statML_bot/114822374525501660
- Characterizing State Space Model and Hybrid Language Model Performance with Long Context
Saptarshi Mitra, Rachid Karami, Haocheng Xu, Sitao Huang, Hyoukjun Kwon
https://arxiv.org/abs/2507.12442 https://mastoxiv.page/@arXiv_csAR_bot/114867589638074984
- Is Exchangeability better than I.I.D to handle Data Distribution Shifts while Pooling Data for Da...
Ayush Roy, Samin Enam, Jun Xia, Won Hwa Kim, Vishnu Suresh Lokhande
https://arxiv.org/abs/2507.19575 https://mastoxiv.page/@arXiv_csCV_bot/114935399825741861
- TASER: Table Agents for Schema-guided Extraction and Recommendation
Nicole Cho, Kirsty Fielding, William Watson, Sumitra Ganesh, Manuela Veloso
https://arxiv.org/abs/2508.13404 https://mastoxiv.page/@arXiv_csAI_bot/115060386723032051
- Morphology-Aware Peptide Discovery via Masked Conditional Generative Modeling
Nuno Costa, Julija Zavadlav
https://arxiv.org/abs/2509.02060 https://mastoxiv.page/@arXiv_qbioBM_bot/115139546511384706
- PCPO: Proportionate Credit Policy Optimization for Aligning Image Generation Models
Jeongjae Lee, Jong Chul Ye
https://arxiv.org/abs/2509.25774 https://mastoxiv.page/@arXiv_csCV_bot/115298580419859537
- Multi-hop Deep Joint Source-Channel Coding with Deep Hash Distillation for Semantically Aligned I...
Didrik Bergström, Deniz Gündüz, Onur Günlü
https://arxiv.org/abs/2510.06868 https://mastoxiv.page/@arXiv_csIT_bot/115343320768797486
- MoMaGen: Generating Demonstrations under Soft and Hard Constraints for Multi-Step Bimanual Mobile...
Chengshu Li, et al.
https://arxiv.org/abs/2510.18316 https://mastoxiv.page/@arXiv_csRO_bot/115416889485910123
- A Spectral Framework for Graph Neural Operators: Convergence Guarantees and Tradeoffs
Roxanne Holden, Luana Ruiz
https://arxiv.org/abs/2510.20954 https://mastoxiv.page/@arXiv_statML_bot/115445273121677005
- Breaking Agent Backbones: Evaluating the Security of Backbone LLMs in AI Agents
Bazinska, Mathys, Casucci, Rojas-Carulla, Davies, Souly, Pfister
https://arxiv.org/abs/2510.22620 https://mastoxiv.page/@arXiv_csCR_bot/115451397563132982
- Uncertainty Calibration of Multi-Label Bird Sound Classifiers
Raphael Schwinger, Ben McEwen, Vincent S. Kather, René Heinrich, Lukas Rauch, Sven Tomforde
https://arxiv.org/abs/2511.08261 https://mastoxiv.page/@arXiv_csSD_bot/115535982708483824
- Two-dimensional RMSD projections for reaction path visualization and validation
Rohit Goswami (Institute IMX and Lab-COSMO, École polytechnique fédérale de Lausanne)
https://arxiv.org/abs/2512.07329 https://mastoxiv.page/@arXiv_physicschemph_bot/115688910885717951
- Distribution-informed Online Conformal Prediction
Dongjian Hu, Junxi Wu, Shu-Tao Xia, Changliang Zou
https://arxiv.org/abs/2512.07770 https://mastoxiv.page/@arXiv_statML_bot/115689281155541568
- Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
Ang Lv, Jin Ma, Yiyuan Ma, Siyuan Qiao
https://arxiv.org/abs/2512.23447 https://mastoxiv.page/@arXiv_csCL_bot/115808311310246601
toXiv_bot_toot
A look at the fundamental questions facing OpenAI: its models have a very large user base but very narrow engagement, incumbents are matching its tech, and more (Benedict Evans)
https://www.ben-evans.com/benedictevans/2026/2/19/how-will-openai-compete-nkg2x
Autotuning T-PaiNN: Enabling Data-Efficient GNN Interatomic Potential Development via Classical-to-Quantum Transfer Learning
Vivienne Pelletier, Vedant Bhat, Daniel J. Rivera, Steven A. Wilson, Christopher L. Muhich
https://arxiv.org/abs/2603.24752 https://arxiv.org/pdf/2603.24752 https://arxiv.org/html/2603.24752
arXiv:2603.24752v1 Announce Type: new
Abstract: Machine-learned interatomic potentials (MLIPs), particularly graph neural network (GNN)-based models, offer a promising route to achieving near-density functional theory (DFT) accuracy at significantly reduced computational cost. However, their practical deployment is often limited by the large volumes of expensive quantum mechanical training data required. In this work, we introduce a transfer learning framework, Transfer-PaiNN (T-PaiNN), that substantially improves the data efficiency of GNN-MLIPs by leveraging inexpensive classical force field data. The approach consists of pretraining a PaiNN MLIP architecture on large-scale datasets generated from classical molecular simulations, followed by fine-tuning (dubbed autotuning) using a comparatively small DFT dataset. We demonstrate the effectiveness of autotuning T-PaiNN on both gas-phase molecular systems (QM9 dataset) and condensed-phase liquid water. Across all cases, T-PaiNN significantly outperforms models trained solely on DFT data, achieving order-of-magnitude reductions in mean absolute error while accelerating training convergence. For example, using the QM9 data set, error reductions of up to 25 times are observed in low-data regimes, while liquid water simulations show improved predictions of energies, forces, and experimentally relevant properties such as density and diffusion. These gains arise from the model's ability to learn general features of the potential energy surface from extensive classical sampling, which are subsequently refined to quantum accuracy. Overall, this work establishes transfer learning from classical force fields as a practical and computationally efficient strategy for developing high-accuracy, data-efficient GNN interatomic potentials, enabling broader application of MLIPs to complex chemical systems.
toXiv_bot_toot
Evaluating Phylogenetic Comparative Methods under Reticulate Evolutionary Scenarios
Lydia Morley, Emma Lehmberg, Sungsik Kong
https://arxiv.org/abs/2603.25986 https://arxiv.org/pdf/2603.25986 https://arxiv.org/html/2603.25986
arXiv:2603.25986v1 Announce Type: new
Abstract: Phylogenetic comparative methods (PCMs) are widely used to study trait evolution. However, many evolutionary histories involve reticulate evolutionary scenarios, such as hybridization, that violate core assumptions of these methods. In this study, we evaluate how such violations affect the performance of PCMs. In particular, we focus on ancestral character estimation, evolutionary rate estimation, and model selection. We simulate continuous trait evolution on various phylogenetic network topologies and assess the performance of PCMs that assume a bifurcating tree (i.e., the major tree of the network) as the underlying model of evolution. We found that the performance of the tested PCMs was suboptimal. Using random forest, generalized linear models, and model-based clustering, we identified key factors contributing to these inaccuracies. Our results show that frequent and/or recent hybridization accompanied by one or more transgressive events and rapidly evolving traits (i.e., high evolutionary rate) lead to significant estimation error, especially with respect to rate estimation and model choice. These factors substantially shift trait values away from tree-based model expectations, leading to overall increased error in parameter estimates. Our study demonstrates cases in which PCMs that rely on trees are likely to misinterpret biological histories and offers recommendations for researchers studying systems with complex evolutionary histories.
toXiv_bot_toot
Crosslisted article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[1/3]:
- SMaRT: Online Reusable Resource Assignment and an Application to Mediation in the Kenyan Judiciary
Farabi, Pinto, Lu, Ramos-Maqueda, Das, Deeb, Sautmann
https://arxiv.org/abs/2602.18431 https://mastoxiv.page/@arXiv_csCY_bot/116119352329590193
- Benchmarking Distilled Language Models: Performance and Efficiency in Resource-Constrained Settings
Sachin Gopal Wani, Eric Page, Ajay Dholakia, David Ellison
https://arxiv.org/abs/2602.20164 https://mastoxiv.page/@arXiv_csCL_bot/116130101399805837
- VISION-ICE: Video-based Interpretation and Spatial Identification of Arrhythmia Origins via Neura...
Dorsa EPMoghaddam, Feng Gao, Drew Bernard, Kavya Sinha, Mehdi Razavi, Behnaam Aazhang
https://arxiv.org/abs/2602.20165 https://mastoxiv.page/@arXiv_csCV_bot/116130222034322594
- Benchmarking Early Deterioration Prediction Across Hospital-Rich and MCI-Like Emergency Triage Un...
KMA Solaiman, Joshua Sebastian, Karma Tobden
https://arxiv.org/abs/2602.20168 https://mastoxiv.page/@arXiv_csCY_bot/116130239074411770
- Cross-Chirality Generalization by Axial Vectors for Hetero-Chiral Protein-Peptide Interaction Design
Yang, Tian, Jia, Zhang, Zheng, Wang, Su, He, Liu, Lan
https://arxiv.org/abs/2602.20176 https://mastoxiv.page/@arXiv_qbioBM_bot/116130281674122586
- Enhancing Heat Sink Efficiency in MOSFETs using Physics Informed Neural Networks: A Systematic St...
Aniruddha Bora, Isabel K. Alvarez, Julie Chalfant, Chryssostomos Chryssostomidis
https://arxiv.org/abs/2602.20177 https://mastoxiv.page/@arXiv_csNE_bot/116130397676559696
- Data-Driven Deep MIMO Detection: Network Architectures and Generalization Analysis
Yongwei Yi, Xinping Yi, Wenjin Wang, Xiao Li, Shi Jin
https://arxiv.org/abs/2602.20178 https://mastoxiv.page/@arXiv_eessSP_bot/116130257424413457
- OrgFlow: Generative Modeling of Organic Crystal Structures from Molecular Graphs
Mohammadmahdi Vahediahmar, Matthew A. McDonald, Feng Liu
https://arxiv.org/abs/2602.20195 https://mastoxiv.page/@arXiv_condmatmtrlsci_bot/116130271189617558
- KEMP-PIP: A Feature-Fusion Based Approach for Pro-inflammatory Peptide Prediction
Soumik Deb Niloy, Md. Fahmid-Ul-Alam Juboraj, Swakkhar Shatabda
https://arxiv.org/abs/2602.20198 https://mastoxiv.page/@arXiv_qbioQM_bot/116130341315320687
- Regressor-guided Diffusion Model for De Novo Peptide Sequencing with Explicit Mass Control
Shaorong Chen, Jingbo Zhou, Jun Xia
https://arxiv.org/abs/2602.20209 https://mastoxiv.page/@arXiv_qbioQM_bot/116130374083646541
- The Sim-to-Real Gap in MRS Quantification: A Systematic Deep Learning Validation for GABA
Zien Ma, S. M. Shermer, Oktay Karakuş, Frank C. Langbein
https://arxiv.org/abs/2602.20289 https://mastoxiv.page/@arXiv_eessSP_bot/116130267228834775
- Gap-Dependent Bounds for Nearly Minimax Optimal Reinforcement Learning with Linear Function Appro...
Haochen Zhang, Zhong Zheng, Lingzhou Xue
https://arxiv.org/abs/2602.20297 https://mastoxiv.page/@arXiv_statML_bot/116130255458256497
- Multilevel Determinants of Overweight and Obesity Among U.S. Children Aged 10-17: Comparative Eva...
Joyanta Jyoti Mondal
https://arxiv.org/abs/2602.20303 https://mastoxiv.page/@arXiv_csAI_bot/116130097466859145
- An artificial intelligence framework for end-to-end rare disease phenotyping from clinical notes ...
Shyr, Hu, Tinker, Cassini, Byram, Hamid, Fabbri, Wright, Peterson, Bastarache, Xu
https://arxiv.org/abs/2602.20324 https://mastoxiv.page/@arXiv_csAI_bot/116130100089848459
- Circuit Tracing in Vision-Language Models: Understanding the Internal Mechanisms of Multimodal Th...
Jingcheng Yang, Tianhu Xiong, Shengyi Qian, Klara Nahrstedt, Mingyuan Wu
https://arxiv.org/abs/2602.20330 https://mastoxiv.page/@arXiv_csCV_bot/116130463214879334
- No One Size Fits All: QueryBandits for Hallucination Mitigation
Nicole Cho, William Watson, Alec Koppel, Sumitra Ganesh, Manuela Veloso
https://arxiv.org/abs/2602.20332 https://mastoxiv.page/@arXiv_csCL_bot/116130370809116915
- Learning During Detection: Continual Learning for Neural OFDM Receivers via DMRS
Mohanad Obeed, Ming Jian
https://arxiv.org/abs/2602.20361 https://mastoxiv.page/@arXiv_csIT_bot/116130289537785136
- Detecting and Mitigating Group Bias in Heterogeneous Treatment Effects
Joel Persson, Jurriën Bakker, Dennis Bohle, Stefan Feuerriegel, Florian von Wangenheim
https://arxiv.org/abs/2602.20383 https://mastoxiv.page/@arXiv_statME_bot/116130509065601748
- Selecting Optimal Variable Order in Autoregressive Ising Models
Shiba Biswal, Marc Vuffray, Andrey Y. Lokhov
https://arxiv.org/abs/2602.20394 https://mastoxiv.page/@arXiv_statML_bot/116130299369541741
toXiv_bot_toot
Deep learning of committor and explainable artificial intelligence analysis for identifying reaction coordinates
Toshifumi Mori, Kei-ichi Okazaki, Kang Kim, Nobuyuki Matubayasi
https://arxiv.org/abs/2603.25237 https://arxiv.org/pdf/2603.25237 https://arxiv.org/html/2603.25237
arXiv:2603.25237v1 Announce Type: new
Abstract: In complex molecular systems, the reaction coordinate (RC) that characterizes transition pathways is essential to understand underlying molecular mechanisms. This review surveys a framework for identifying the RC by applying deep learning to the committor, which provides the most reliable measure of the progress along a transition path. The inputs to the neural network are collective variables (CVs) expressed as functions of atomic coordinates of the system, and the corresponding RC is predicted as the output by training the network on the committor as the learning target. Because deep learning models typically operate in a black-box manner, it is difficult to determine which input variables govern the predictions. The incorporation of eXplainable Artificial Intelligence (XAI) techniques enables quantitative assessment of the contributions of individual input variables to the predictions. This approach allows the identification of CVs that play dominant roles and demonstrates that the committor distribution on the surface using important CVs is separated by well-defined boundaries. The framework provides an explainable deep learning strategy for assigning a molecular mechanism from the RC and is applicable to a wide range of complex molecular systems.
toXiv_bot_toot
Replaced article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[3/6]:
- Towards Scalable Oversight via Partitioned Human Supervision
Ren Yin, Takashi Ishida, Masashi Sugiyama
https://arxiv.org/abs/2510.22500 https://mastoxiv.page/@arXiv_csLG_bot/115451787490434401
- ContextPilot: Fast Long-Context Inference via Context Reuse
Yinsicheng Jiang, Yeqi Huang, Liang Cheng, Cheng Deng, Xuan Sun, Luo Mai
https://arxiv.org/abs/2511.03475 https://mastoxiv.page/@arXiv_csLG_bot/115502245581974540
- Metabolomic Biomarker Discovery for ADHD Diagnosis Using Interpretable Machine Learning
Nabil Belacel, Mohamed Rachid Boulassel
https://arxiv.org/abs/2601.11283 https://mastoxiv.page/@arXiv_csLG_bot/115921183182326799
- PhysE-Inv: A Physics-Encoded Inverse Modeling approach for Arctic Snow Depth Prediction
Akila Sampath, Vandana Janeja, Jianwu Wang
https://arxiv.org/abs/2601.17074
- SAGE-5GC: Security-Aware Guidelines for Evaluating Anomaly Detection in the 5G Core Network
Cristian Manca, Christian Scano, Giorgio Piras, Fabio Brau, Maura Pintor, Battista Biggio
https://arxiv.org/abs/2602.03596
- LORE: Jointly Learning the Intrinsic Dimensionality and Relative Similarity Structure From Ordina...
Anand, Helbling, Davenport, Berman, Alagapan, Rozell
https://arxiv.org/abs/2602.04192
- Towards Robust Scaling Laws for Optimizers
Alexandra Volkova, Mher Safaryan, Christoph H. Lampert, Dan Alistarh
https://arxiv.org/abs/2602.07712 https://mastoxiv.page/@arXiv_csLG_bot/116046369672796465
- Do We Need Adam? Surprisingly Strong and Sparse Reinforcement Learning with SGD in LLMs
Sagnik Mukherjee, Lifan Yuan, Pavan Jayasinha, Dilek Hakkani-Tür, Hao Peng
https://arxiv.org/abs/2602.07729 https://mastoxiv.page/@arXiv_csLG_bot/116046377539155485
- AceGRPO: Adaptive Curriculum Enhanced Group Relative Policy Optimization for Autonomous Machine L...
Yuzhu Cai, Zexi Liu, Xinyu Zhu, Cheng Wang, Siheng Chen
https://arxiv.org/abs/2602.07906 https://mastoxiv.page/@arXiv_csLG_bot/116046423413650658
- VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training
Guobin Shen, Chenxiao Zhao, Xiang Cheng, Lei Huang, Xing Yu
https://arxiv.org/abs/2602.10693 https://mastoxiv.page/@arXiv_csLG_bot/116057229834947730
- KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models
Zukang Xu, Zhixiong Zhao, Xing Hu, Zhixuan Chen, Dawei Yang
https://arxiv.org/abs/2602.11184 https://mastoxiv.page/@arXiv_csLG_bot/116062537528208461
- MUSE: Multi-Tenant Model Serving With Seamless Model Updates
Correia, Ferreira, Martins, Bento, Guerreiro, Pereira, Gomes, Bono, Ferreira, Bizarro
https://arxiv.org/abs/2602.11776 https://mastoxiv.page/@arXiv_csLG_bot/116062952355379801
- Pawsterior: Variational Flow Matching for Structured Simulation-Based Inference
Jorge Carrasco-Pollo, Floor Eijkelboom, Jan-Willem van de Meent
https://arxiv.org/abs/2602.13813 https://mastoxiv.page/@arXiv_csLG_bot/116085828112928218
- Silent Inconsistency in Data-Parallel Full Fine-Tuning: Diagnosing Worker-Level Optimization Misa...
Hong Li, Zhen Zhou, Honggang Zhang, Yuping Luo, Xinyue Wang, Han Gong, Zhiyuan Liu
https://arxiv.org/abs/2602.14462 https://mastoxiv.page/@arXiv_csLG_bot/116085997857526328
- Divine Benevolence is an $x^2$: GLUs scale asymptotically faster than MLPs
Alejandro Francisco Queiruga
https://arxiv.org/abs/2602.14495 https://mastoxiv.page/@arXiv_csLG_bot/116086011618741857
- ÜberWeb: Insights from Multilingual Curation for a 20-Trillion-Token Dataset
DatologyAI, et al.
https://arxiv.org/abs/2602.15210 https://mastoxiv.page/@arXiv_csLG_bot/116090912256712568
- GLM-5: from Vibe Coding to Agentic Engineering
GLM-5-Team, et al.
https://arxiv.org/abs/2602.15763 https://mastoxiv.page/@arXiv_csLG_bot/116091080686771018
- Anatomy of Capability Emergence: Scale-Invariant Representation Collapse and Top-Down Reorganizat...
Jayadev Billa
https://arxiv.org/abs/2602.15997 https://mastoxiv.page/@arXiv_csLG_bot/116096541546306333
- AI-CARE: Carbon-Aware Reporting Evaluation Metric for AI Models
KC Santosh, Srikanth Baride, Rodrigue Rizk
https://arxiv.org/abs/2602.16042 https://mastoxiv.page/@arXiv_csLG_bot/116096581524696028
- Beyond Message Passing: A Symbolic Alternative for Expressive and Interpretable Graph Learning
Chuqin Geng, Li Zhang, Haolin Ye, Ziyu Zhao, Yuhe Jiang, Tara Saba, Xinyu Wang, Xujie Si
https://arxiv.org/abs/2602.16947 https://mastoxiv.page/@arXiv_csLG_bot/116102426238903124
toXiv_bot_toot
Sparse Bayesian Deep Functional Learning with Structured Region Selection
Xiaoxian Zhu, Yingmeng Li, Shuangge Ma, Mengyun Wu
https://arxiv.org/abs/2602.20651 https://arxiv.org/pdf/2602.20651 https://arxiv.org/html/2602.20651
arXiv:2602.20651v1 Announce Type: new
Abstract: In modern applications such as ECG monitoring, neuroimaging, wearable sensing, and industrial equipment diagnostics, complex and continuously structured data are ubiquitous, presenting both challenges and opportunities for functional data analysis. However, existing methods face a critical trade-off: conventional functional models are limited by linearity, whereas deep learning approaches lack interpretable region selection for sparse effects. To bridge these gaps, we propose a sparse Bayesian functional deep neural network (sBayFDNN). It learns adaptive functional embeddings through a deep Bayesian architecture to capture complex nonlinear relationships, while a structured prior enables interpretable, region-wise selection of influential domains with quantified uncertainty. Theoretically, we establish rigorous approximation error bounds, posterior consistency, and region selection consistency. These results provide the first theoretical guarantees for a Bayesian deep functional model, ensuring its reliability and statistical rigor. Empirically, comprehensive simulations and real-world studies confirm the effectiveness and superiority of sBayFDNN. Crucially, sBayFDNN excels in recognizing intricate dependencies for accurate predictions and more precisely identifies functionally meaningful regions, capabilities fundamentally beyond existing approaches.
toXiv_bot_toot