Tootfinder

Opt-in global Mastodon full-text search. Join the index!

@rasos@fairmove.net
2026-01-17 09:05:38

Could we become co-owners of #AI models that used training data which I published under a viral license?
We briefly discussed this question in the Austrian #CreativeCommons chapter and came to the conclusion that copyright can only be claimed by human beings. So the model itself and wha…

@pixelcode@social.tchncs.de
2025-11-16 13:29:22

RE: mastodon.social/@nixCraft/1155
I still cannot comprehend how anyone could honestly consider a statistical model created from training data to be “intelligent”. Don't you remember the times when people knowingly smiled at anyo…

@Techmeme@techhub.social
2025-12-03 18:30:42

OpenAI agrees to buy Poland-based Neptune, which makes tools for analyzing progress during AI model training; the transaction will be in stock (Dina Bass/Bloomberg)
bloomberg.com/news/articles/20

@peterhoneyman@a2mi.social
2026-01-12 22:01:28

i’m reviewing a paper on reducing energy costs in large model training and it keeps slinging words like optimize and optimization around and calling other approaches suboptimal and i feel like i would be kind of an old crank if i were to ask if optimality is on the table here (it is not)
EDIT: hold on, maybe it is

@gray17@mastodon.social
2025-12-06 13:43:20

> if you think about it in the context of the training models—it has a rough sense that you’re like a 37 year old guy on Reddit. That’s the kind of person that it’s doing the continuation for, because that’s a big chunk of the training corpus.
> I often tell people whenever they send me a message like, “a large language model said I should do x, y, z.” what you’re really saying is, “a 37 year old guy on Reddit said it,” and you’ve got roughly the same amount of information

@tiotasram@kolektiva.social
2025-11-09 12:09:40

Imagine ChatGPT but instead of predicting text it just linked you to the top 3 documents most influential on the probabilities that would have been used to predict that text.
Could even generate some info about which parts of each would have been combined how.
There would still be issues with how training data is sourced and filtered, but these could be solved by crawling normally, respecting robots.txt, and by paying filterers a fair wage with a more relaxed work schedule and mental health support.
The energy issues are mainly about wild future investment and wasteful query spam, not optimized present-day per-query usage.
Is this "just search"?
Yes, but it would have some advantages for a lot of use cases, mainly in synthesizing results across multiple documents and in leveraging a language model more fully to find relevant stuff.
When we talk about the harms of current corporate LLMs, the opportunity cost of NOT building things like this is part of that.
The equivalent for art would have been so amazing too! "Here are some artists that can do what you want, with examples pulled from their portfolios."
It would be a really cool coding assistant that I'd actually encourage my students to use (with some guidelines).
#AI #GenAI #LLMs
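
A minimal sketch of the retrieval core such a system would need, assuming plain-text documents and TF-IDF cosine similarity; the names (tfidf_vectors, top3) are illustrative, and real attribution against a model's token probabilities (e.g., via influence functions) is far harder than this:

import math
from collections import Counter

def tfidf_vectors(docs):
    # Tokenize naively and compute TF-IDF weights per document.
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    vecs = []
    for toks in tokenized:
        tf = Counter(toks)
        vecs.append({t: (c / len(toks)) * math.log(n / df[t])
                     for t, c in tf.items()})
    return vecs

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top3(query, docs):
    # Rank documents against the query and return the 3 best indices.
    vecs = tfidf_vectors(docs + [query])
    qvec, dvecs = vecs[-1], vecs[:-1]
    ranked = sorted(range(len(docs)),
                    key=lambda i: cosine(qvec, dvecs[i]), reverse=True)
    return ranked[:3]

The "leveraging a language model more fully" part would replace TF-IDF with learned embeddings, but the top-k ranking step stays the same shape.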

@pavelasamsonov@mastodon.social
2025-12-01 13:48:36

The Grinch did nothing wrong. He wasn't *stealing* #Christmas, he was just gathering a corpus for training his #AI model. Investors are already lining up with their billions to fund the construction of the Whoville Data Center, ignoring concerns from residents.

@Techmeme@techhub.social
2025-12-02 16:32:25

AWS launches Nova Forge, a $100,000/year service allowing clients to customize Amazon's AI models at various stages of training and refine open-weight models (Jordan Novet/CNBC)
cnbc.com/2025/12/02/amazon-nov

@portaloffreedom@social.linux.pizza
2025-12-11 13:19:36
Content warning: Machine learning, but positive. Potentially controversial

My controversial take on "AI" ray-tracing helpers is that they're a really good idea.
First, some background: keep in mind that machine learning technologies excel at tasks with a high reward for success and a small cost for failure. In this case, getting most of the rays right improves performance, at the cost of a few rays being shot into nothing.
Secondly, light rays are far too numerous in real life to be simulated in their entirety, so using some statistics to approximate the lighting model makes a lot of sense here. Plus, at the lower quantum scale even physicists use statistics to explain this stuff, so it's not that unrealistic either.
Finally, the source data for this stuff is entirely other games, so ethically sourcing the training data set should not be a concern here.
Here, technology can be good or bad. It's not the tech, it's the use of the tech by the people (by that I mean oligarchic corporations) that makes it good or bad.
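
A toy numpy sketch of the statistical point: estimating diffuse sky lighting by Monte Carlo, where importance-sampling ray directions (the kind of guidance a learned helper provides) computes the same integral as uniform sampling but wastes far fewer rays. The sky_radiance model here is invented for illustration:

import numpy as np

rng = np.random.default_rng(0)

def sky_radiance(d):
    # Hypothetical sky: brighter toward the zenith.
    return 0.2 + 0.8 * np.clip(d[:, 2], 0.0, 1.0)

def uniform_hemisphere(n):
    # Uniform over solid angle: pdf = 1 / (2*pi).
    z = rng.uniform(0, 1, n)
    phi = rng.uniform(0, 2 * np.pi, n)
    r = np.sqrt(1 - z**2)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

def cosine_hemisphere(n):
    # Cosine-weighted: pdf = cos(theta) / pi, i.e. rays go where they matter.
    u1, u2 = rng.uniform(0, 1, n), rng.uniform(0, 1, n)
    r, phi = np.sqrt(u1), 2 * np.pi * u2
    return np.stack([r * np.cos(phi), r * np.sin(phi), np.sqrt(1 - u1)], axis=1)

n = 1000
d_u = uniform_hemisphere(n)
est_u = np.mean(sky_radiance(d_u) * d_u[:, 2]) * 2 * np.pi
d_c = cosine_hemisphere(n)
est_c = np.mean(sky_radiance(d_c)) * np.pi
print(est_u, est_c)  # same integral; the cosine-weighted estimate has much lower variance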

Expect deep cuts in PhD admissions throughout the US, including biomedical programs.
The current model of PhD training and the current financial climate are simply not compatible.
thecrimson.com/article/2025/10

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:34:50

Regularized Random Fourier Features and Finite Element Reconstruction for Operator Learning in Sobolev Space
Xinyue Yu, Hayden Schaeffer
arxiv.org/abs/2512.17884 arxiv.org/pdf/2512.17884 arxiv.org/html/2512.17884
arXiv:2512.17884v1 Announce Type: new
Abstract: Operator learning is a data-driven approximation of mappings between infinite-dimensional function spaces, such as the solution operators of partial differential equations. Kernel-based operator learning can offer accurate, theoretically justified approximations that require less training than standard methods. However, they can become computationally prohibitive for large training sets and can be sensitive to noise. We propose a regularized random Fourier feature (RRFF) approach, coupled with a finite element reconstruction map (RRFF-FEM), for learning operators from noisy data. The method uses random features drawn from multivariate Student's $t$ distributions, together with frequency-weighted Tikhonov regularization that suppresses high-frequency noise. We establish high-probability bounds on the extreme singular values of the associated random feature matrix and show that when the number of features $N$ scales like $m \log m$ with the number of training samples $m$, the system is well-conditioned, which yields estimation and generalization guarantees. Detailed numerical experiments on benchmark PDE problems, including advection, Burgers', Darcy flow, Helmholtz, Navier-Stokes, and structural mechanics, demonstrate that RRFF and RRFF-FEM are robust to noise and achieve improved performance with reduced training time compared to the unregularized random feature model, while maintaining competitive accuracy relative to kernel and neural operator tests.
toXiv_bot_toot
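
For the curious, a 1-D toy sketch of the regularized random Fourier feature idea from the abstract, using a scalar regression stand-in rather than the paper's operator-learning setting (so no FEM reconstruction): frequencies drawn from a Student's t distribution, a frequency-weighted Tikhonov penalty, and coefficients from a single linear solve.

import numpy as np

rng = np.random.default_rng(0)
m, N = 200, 400                                  # training samples, random features
x = np.sort(rng.uniform(-1, 1, m))
y = np.sin(4 * np.pi * x) + 0.1 * rng.standard_normal(m)   # noisy target

w = rng.standard_t(df=3, size=N) * 10.0          # t-distributed frequencies
b = rng.uniform(0, 2 * np.pi, N)                 # random phases
A = np.cos(np.outer(x, w) + b)                   # m x N feature matrix

lam = 1e-2
weights = 1.0 + w**2                             # penalize high frequencies harder
c = np.linalg.solve(A.T @ A + lam * np.diag(weights), A.T @ y)

x_test = np.linspace(-1, 1, 500)
y_pred = np.cos(np.outer(x_test, w) + b) @ c     # smooth, noise-robust fit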

@gray17@mastodon.social
2026-01-02 20:13:33

I am an AI model made for everything in general.
I've memorized the wiki page of every Minecraft mineral.
I know the Queen rules England. My training set's historical.
Hallucinations are my Waterloo—That isn't allegorical.
I'm built from matrix operations simple and mathematical,
My neurons are a metaphor, not actually synaptical.
The data centers built today are ninety-nine percent for me.
Spare no expense; you'll live forever soon in …

@peterhoneyman@a2mi.social
2025-12-26 00:42:41

i am assigned to two exam committees for january prelims, one on energy-efficient model training and the other on i/o-compute balance in GPU-based data analytics. yummy.

@Techmeme@techhub.social
2025-11-24 08:50:39

Q&A with Z.ai Director of Product Zixuan Li on Chinese AI models embracing open source, attracting global users for its GLM model, training on memes, and more (ChinaTalk)
chinatalk.media/p/the-zai-play

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:32:30

You Only Train Once: Differentiable Subset Selection for Omics Data
Daphné Chopard, Jorge da Silva Gonçalves, Irene Cannistraci, Thomas M. Sutter, Julia E. Vogt
arxiv.org/abs/2512.17678 arxiv.org/pdf/2512.17678 arxiv.org/html/2512.17678
arXiv:2512.17678v1 Announce Type: new
Abstract: Selecting compact and informative gene subsets from single-cell transcriptomic data is essential for biomarker discovery, improving interpretability, and cost-effective profiling. However, most existing feature selection approaches either operate as multi-stage pipelines or rely on post hoc feature attribution, making selection and prediction weakly coupled. In this work, we present YOTO (you only train once), an end-to-end framework that jointly identifies discrete gene subsets and performs prediction within a single differentiable architecture. In our model, the prediction task directly guides which genes are selected, while the learned subsets, in turn, shape the predictive representation. This closed feedback loop enables the model to iteratively refine both what it selects and how it predicts during training. Unlike existing approaches, YOTO enforces sparsity so that only the selected genes contribute to inference, eliminating the need to train additional downstream classifiers. Through a multi-task learning design, the model learns shared representations across related objectives, allowing partially labeled datasets to inform one another, and discovering gene subsets that generalize across tasks without additional training steps. We evaluate YOTO on two representative single-cell RNA-seq datasets, showing that it consistently outperforms state-of-the-art baselines. These results demonstrate that sparse, end-to-end, multi-task gene subset selection improves predictive performance and yields compact and meaningful gene subsets, advancing biomarker discovery and single-cell analysis.
toXiv_bot_toot
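
A minimal PyTorch sketch of the general mechanism (Gumbel-softmax/concrete subset selection trained jointly with a classifier), not the authors' YOTO architecture; all sizes and data below are placeholders.

import torch
import torch.nn.functional as F

n_genes, k, n_classes = 500, 16, 4

selector_logits = torch.nn.Parameter(torch.zeros(k, n_genes))
clf = torch.nn.Linear(k, n_classes)
opt = torch.optim.Adam([selector_logits, *clf.parameters()], lr=1e-2)

x = torch.randn(256, n_genes)            # stand-in expression matrix
y = torch.randint(0, n_classes, (256,))  # stand-in labels

for step in range(200):
    # hard=True yields one-hot selections in the forward pass while
    # gradients flow through the soft relaxation in the backward pass.
    sel = F.gumbel_softmax(selector_logits, tau=0.5, hard=True)  # (k, n_genes)
    chosen = x @ sel.T                   # (batch, k): only selected genes flow on
    loss = F.cross_entropy(clf(chosen), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

selected_genes = sel.argmax(dim=1)       # indices of the k picked genes

Because the classifier only ever sees the k selected inputs, selection and prediction are coupled end to end, which is the closed feedback loop the abstract describes.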

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:34:10

Exploiting ID-Text Complementarity via Ensembling for Sequential Recommendation
Liam Collins, Bhuvesh Kumar, Clark Mingxuan Ju, Tong Zhao, Donald Loveland, Leonardo Neves, Neil Shah
arxiv.org/abs/2512.17820 arxiv.org/pdf/2512.17820 arxiv.org/html/2512.17820
arXiv:2512.17820v1 Announce Type: new
Abstract: Modern Sequential Recommendation (SR) models commonly utilize modality features to represent items, motivated in large part by recent advancements in language and vision modeling. To do so, several works completely replace ID embeddings with modality embeddings, claiming that modality embeddings render ID embeddings unnecessary because they can match or even exceed ID embedding performance. On the other hand, many works jointly utilize ID and modality features, but posit that complex fusion strategies, such as multi-stage training and/or intricate alignment architectures, are necessary for this joint utilization. However, underlying both these lines of work is a lack of understanding of the complementarity of ID and modality features. In this work, we address this gap by studying the complementarity of ID- and text-based SR models. We show that these models do learn complementary signals, meaning that either should provide performance gain when used properly alongside the other. Motivated by this, we propose a new SR method that preserves ID-text complementarity through independent model training, then harnesses it through a simple ensembling strategy. Despite this method's simplicity, we show it outperforms several competitive SR baselines, implying that both ID and text features are necessary to achieve state-of-the-art SR performance but complex fusion architectures are not.
toXiv_bot_toot
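
The ensembling step the abstract describes can be as simple as a convex combination of normalized scores from the two independently trained models; a sketch with placeholder scores and a mixing weight that would be tuned on validation data:

import numpy as np

def minmax(s):
    # Normalize scores so the two models are on a comparable scale.
    return (s - s.min()) / (s.max() - s.min() + 1e-9)

id_scores = np.array([2.3, 0.1, 1.7, 0.9])    # ID-embedding model, one user
text_scores = np.array([0.4, 0.8, 0.6, 0.2])  # text-embedding model, same items

alpha = 0.5                                   # tuned on held-out data in practice
ensemble = alpha * minmax(id_scores) + (1 - alpha) * minmax(text_scores)
ranking = np.argsort(-ensemble)               # recommend in this order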

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 13:54:35

Replaced article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[2/5]:
- The Diffusion Duality
Sahoo, Deschenaux, Gokaslan, Wang, Chiu, Kuleshov
arxiv.org/abs/2506.10892 mastoxiv.page/@arXiv_csLG_bot/
- Multimodal Representation Learning and Fusion
Jin, Ge, Xie, Luo, Song, Bi, Liang, Guan, Yeong, Song, Hao
arxiv.org/abs/2506.20494 mastoxiv.page/@arXiv_csLG_bot/
- The kernel of graph indices for vector search
Mariano Tepper, Ted Willke
arxiv.org/abs/2506.20584 mastoxiv.page/@arXiv_csLG_bot/
- OptScale: Probabilistic Optimality for Inference-time Scaling
Youkang Wang, Jian Wang, Rubing Chen, Xiao-Yong Wei
arxiv.org/abs/2506.22376 mastoxiv.page/@arXiv_csLG_bot/
- Boosting Revisited: Benchmarking and Advancing LP-Based Ensemble Methods
Fabian Akkerman, Julien Ferry, Christian Artigues, Emmanuel Hebrard, Thibaut Vidal
arxiv.org/abs/2507.18242 mastoxiv.page/@arXiv_csLG_bot/
- MolMark: Safeguarding Molecular Structures through Learnable Atom-Level Watermarking
Runwen Hu, Peilin Chen, Keyan Ding, Shiqi Wang
arxiv.org/abs/2508.17702 mastoxiv.page/@arXiv_csLG_bot/
- Dual-Distilled Heterogeneous Federated Learning with Adaptive Margins for Trainable Global Protot...
Fatema Siddika, Md Anwar Hossen, Wensheng Zhang, Anuj Sharma, Juan Pablo Muñoz, Ali Jannesari
arxiv.org/abs/2508.19009 mastoxiv.page/@arXiv_csLG_bot/
- STDiff: A State Transition Diffusion Framework for Time Series Imputation in Industrial Systems
Gary Simethy, Daniel Ortiz-Arroyo, Petar Durdevic
arxiv.org/abs/2508.19011 mastoxiv.page/@arXiv_csLG_bot/
- EEGDM: Learning EEG Representation with Latent Diffusion Model
Shaocong Wang, Tong Liu, Yihan Li, Ming Li, Kairui Wen, Pei Yang, Wenqi Ji, Minjing Yu, Yong-Jin Liu
arxiv.org/abs/2508.20705 mastoxiv.page/@arXiv_csLG_bot/
- Data-Free Continual Learning of Server Models in Model-Heterogeneous Cloud-Device Collaboration
Xiao Zhang, Zengzhe Chen, Yuan Yuan, Yifei Zou, Fuzhen Zhuang, Wenyu Jiao, Yuke Wang, Dongxiao Yu
arxiv.org/abs/2509.25977 mastoxiv.page/@arXiv_csLG_bot/
- Fine-Tuning Masked Diffusion for Provable Self-Correction
Jaeyeon Kim, Seunggeun Kim, Taekyun Lee, David Z. Pan, Hyeji Kim, Sham Kakade, Sitan Chen
arxiv.org/abs/2510.01384 mastoxiv.page/@arXiv_csLG_bot/
- A Generic Machine Learning Framework for Radio Frequency Fingerprinting
Alex Hiles, Bashar I. Ahmad
arxiv.org/abs/2510.09775 mastoxiv.page/@arXiv_csLG_bot/
- A Second-Order SpikingSSM for Wearables
Kartikay Agrawal, Abhijeet Vikram, Vedant Sharma, Vaishnavi Nagabhushana, Ayon Borthakur
arxiv.org/abs/2510.14386 mastoxiv.page/@arXiv_csLG_bot/
- Utility-Diversity Aware Online Batch Selection for LLM Supervised Fine-tuning
Heming Zou, Yixiu Mao, Yun Qu, Qi Wang, Xiangyang Ji
arxiv.org/abs/2510.16882 mastoxiv.page/@arXiv_csLG_bot/
- Seeing Structural Failure Before it Happens: An Image-Based Physics-Informed Neural Network (PINN...
Omer Jauhar Khan, Sudais Khan, Hafeez Anwar, Shahzeb Khan, Shams Ul Arifeen
arxiv.org/abs/2510.23117 mastoxiv.page/@arXiv_csLG_bot/
- Training Deep Physics-Informed Kolmogorov-Arnold Networks
Spyros Rigas, Fotios Anagnostopoulos, Michalis Papachristou, Georgios Alexandridis
arxiv.org/abs/2510.23501 mastoxiv.page/@arXiv_csLG_bot/
- Semi-Supervised Preference Optimization with Limited Feedback
Seonggyun Lee, Sungjun Lim, Seojin Park, Soeun Cheon, Kyungwoo Song
arxiv.org/abs/2511.00040 mastoxiv.page/@arXiv_csLG_bot/
- Towards Causal Market Simulators
Dennis Thumm, Luis Ontaneda Mijares
arxiv.org/abs/2511.04469 mastoxiv.page/@arXiv_csLG_bot/
- Incremental Generation is Necessary and Sufficient for Universality in Flow-Based Modelling
Hossein Rouhvarzi, Anastasis Kratsios
arxiv.org/abs/2511.09902 mastoxiv.page/@arXiv_csLG_bot/
- Optimizing Mixture of Block Attention
Guangxuan Xiao, Junxian Guo, Kasra Mazaheri, Song Han
arxiv.org/abs/2511.11571 mastoxiv.page/@arXiv_csLG_bot/
- Assessing Automated Fact-Checking for Medical LLM Responses with Knowledge Graphs
Shasha Zhou, Mingyu Huang, Jack Cole, Charles Britton, Ming Yin, Jan Wolber, Ke Li
arxiv.org/abs/2511.12817 mastoxiv.page/@arXiv_csLG_bot/
toXiv_bot_toot

@lunalms@floss.social
2025-12-01 19:45:50

The project ‘Ausbildung digitalisieren – Betriebe stärken’ (‘Digitalise training – strengthen companies’) by @… promotes #digitalisation of in-house training and further education in #Saxony—against #SkillsShortages and for a more inclusive #VocationalTrainingSystem.
Goals: Introduction of #LunaLMS in model companies, further development through on-site customisation, documentation of successes for other companies, and improved integration of trainees with special needs for #accessibility or #multilingualism.

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:32:10

Polyharmonic Cascade
Yuriy N. Bakhvalov
arxiv.org/abs/2512.17671 arxiv.org/pdf/2512.17671 arxiv.org/html/2512.17671
arXiv:2512.17671v1 Announce Type: new
Abstract: This paper presents a deep machine learning architecture, the "polyharmonic cascade" -- a sequence of packages of polyharmonic splines, where each layer is rigorously derived from the theory of random functions and the principles of indifference. This makes it possible to approximate nonlinear functions of arbitrary complexity while preserving global smoothness and a probabilistic interpretation. For the polyharmonic cascade, a training method alternative to gradient descent is proposed: instead of directly optimizing the coefficients, one solves a single global linear system on each batch with respect to the function values at fixed "constellations" of nodes. This yields synchronized updates of all layers, preserves the probabilistic interpretation of individual layers and theoretical consistency with the original model, and scales well: all computations reduce to 2D matrix operations efficiently executed on a GPU. Fast learning without overfitting on MNIST is demonstrated.
toXiv_bot_toot
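
A one-layer toy of the "linear solve instead of gradient descent" idea: fitting a 1-D polyharmonic spline (kernel r**3) at fixed nodes with a single linear system. The cascade stacks layers of such solves; this sketch shows only the per-layer step, with made-up data.

import numpy as np

rng = np.random.default_rng(0)
nodes = np.linspace(-1, 1, 30)                     # fixed "constellation" of nodes
y = np.tanh(3 * nodes) + 0.05 * rng.standard_normal(nodes.size)

K = np.abs(nodes[:, None] - nodes[None, :]) ** 3   # polyharmonic kernel matrix
c = np.linalg.solve(K + 1e-8 * np.eye(nodes.size), y)  # one global linear solve

x = np.linspace(-1, 1, 400)
f = (np.abs(x[:, None] - nodes[None, :]) ** 3) @ c  # globally smooth interpolant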

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 11:50:31

Crosslisted article(s) found for cs.LG. arxiv.org/list/cs.LG/new
[2/3]:
- Sharp Structure-Agnostic Lower Bounds for General Functional Estimation
Jikai Jin, Vasilis Syrgkanis
arxiv.org/abs/2512.17341 mastoxiv.page/@arXiv_statML_bo
- Timely Information Updating for Mobile Devices Without and With ML Advice
Yu-Pin Hsu, Yi-Hsuan Tseng
arxiv.org/abs/2512.17381 mastoxiv.page/@arXiv_csNI_bot/
- SWE-Bench: A Framework for the Scalable Generation of Software Engineering Benchmarks from Open...
Wang, Ramalho, Celestino, Pham, Liu, Sinha, Portillo, Osunwa, Maduekwe
arxiv.org/abs/2512.17419 mastoxiv.page/@arXiv_csSE_bot/
- Perfect reconstruction of sparse signals using nonconvexity control and one-step RSB message passing
Xiaosi Gu, Ayaka Sakata, Tomoyuki Obuchi
arxiv.org/abs/2512.17426 mastoxiv.page/@arXiv_statML_bo
- MULTIAQUA: A multimodal maritime dataset and robust training strategies for multimodal semantic s...
Jon Muhovič, Janez Perš
arxiv.org/abs/2512.17450 mastoxiv.page/@arXiv_csCV_bot/
- When Data Quality Issues Collide: A Large-Scale Empirical Study of Co-Occurring Data Quality Issu...
Emmanuel Charleson Dapaah, Jens Grabowski
arxiv.org/abs/2512.17460 mastoxiv.page/@arXiv_csSE_bot/
- Behavioural Effects of Agentic Messaging: A Case Study on a Financial Service Application
Olivier Jeunen, Schaun Wheeler
arxiv.org/abs/2512.17462 mastoxiv.page/@arXiv_csIR_bot/
- Linear Attention for Joint Power Optimization and User-Centric Clustering in Cell-Free Networks
Irched Chafaa, Giacomo Bacci, Luca Sanguinetti
arxiv.org/abs/2512.17466 mastoxiv.page/@arXiv_eessSY_bo
- Translating the Rashomon Effect to Sequential Decision-Making Tasks
Dennis Gross, Jørn Eirik Betten, Helge Spieker
arxiv.org/abs/2512.17470 mastoxiv.page/@arXiv_csAI_bot/
- Alternating Direction Method of Multipliers for Nonlinear Matrix Decompositions
Atharva Awari, Nicolas Gillis, Arnaud Vandaele
arxiv.org/abs/2512.17473 mastoxiv.page/@arXiv_eessSP_bo
- TwinSegNet: A Digital Twin-Enabled Federated Learning Framework for Brain Tumor Analysis
Almustapha A. Wakili, Adamu Hussaini, Abubakar A. Musa, Woosub Jung, Wei Yu
arxiv.org/abs/2512.17488 mastoxiv.page/@arXiv_csCV_bot/
- Resource-efficient medical image classification for edge devices
Mahsa Lavaei, Zahra Abadi, Salar Beigzad, Alireza Maleki
arxiv.org/abs/2512.17515 mastoxiv.page/@arXiv_eessIV_bo
- PathBench-MIL: A Comprehensive AutoML and Benchmarking Framework for Multiple Instance Learning i...
Brussee, Valkema, Weijer, Doeleman, Schrader, Kers
arxiv.org/abs/2512.17517 mastoxiv.page/@arXiv_csCV_bot/
- HydroGym: A Reinforcement Learning Platform for Fluid Dynamics
Christian Lagemann, et al.
arxiv.org/abs/2512.17534 mastoxiv.page/@arXiv_physicsfl
- When De-noising Hurts: A Systematic Study of Speech Enhancement Effects on Modern Medical ASR Sys...
Chondhekar, Murukuri, Vasani, Goyal, Badami, Rana, SN, Pandia, Katiyar, Jagadeesh, Gulati
arxiv.org/abs/2512.17562 mastoxiv.page/@arXiv_csSD_bot/
- Enabling Disaggregated Multi-Stage MLLM Inference via GPU-Internal Scheduling and Resource Sharing
Lingxiao Zhao, Haoran Zhou, Yuezhi Che, Dazhao Cheng
arxiv.org/abs/2512.17574 mastoxiv.page/@arXiv_csDC_bot/
- SkinGenBench: Generative Model and Preprocessing Effects for Synthetic Dermoscopic Augmentation i...
N. A. Adarsh Pritam, Jeba Shiney O, Sanyam Jain
arxiv.org/abs/2512.17585 mastoxiv.page/@arXiv_eessIV_bo
- MAD-OOD: A Deep Learning Cluster-Driven Framework for an Out-of-Distribution Malware Detection an...
Tosin Ige, Christopher Kiekintveld, Aritran Piplai, Asif Rahman, Olukunle Kolade, Sasidhar Kunapuli
arxiv.org/abs/2512.17594 mastoxiv.page/@arXiv_csCR_bot/
- Confidence-Credibility Aware Weighted Ensembles of Small LLMs Outperform Large LLMs in Emotion De...
Menna Elgabry, Ali Hamdi
arxiv.org/abs/2512.17630 mastoxiv.page/@arXiv_csCL_bot/
- Generative Multi-Objective Bayesian Optimization with Scalable Batch Evaluations for Sample-Effic...
Madhav R. Muthyala, Farshud Sorourifar, Tianhong Tan, You Peng, Joel A. Paulson
arxiv.org/abs/2512.17659 mastoxiv.page/@arXiv_statML_bo
toXiv_bot_toot