Tootfinder

Opt-in global Mastodon full text search. Join the index!

@inthehands@hachyderm.io
2026-01-15 16:13:27

I get a steady stream in my replies of meant-to-be-helpful advice of the form “You should use [complex F/OSS tech] to [do complex thing] that [solves a problem you don’t have but renders the process unusable for the people doing it]”
…when the real tech problems we’re facing on the ground here, both with Signal and other tools, are 99% just ground-level UX stuff:
- basic, basic usability glitches
- workflow friction
- problematic default settings
- reliability

@PaulWermer@sfba.social
2026-02-04 22:47:13

Balcony Solar is very popular in Germany, and available in most of the EU. These are solar PV systems that plug into a regular outlet, providing electrical power behind the meter. Utilities hate them!
California might (if we nag our state Senators and Assemblymembers) actually allow us to install "Balcony Solar" - aka plug-in solar. That would be great for renters. SB868 (Wiener) expressly allows this, and blocks utilities from prohibiting their use. Please write your sta…
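For a sense of scale, here is a back-of-envelope estimate of what such a system delivers (every number below is an assumption for illustration, not from the post or SB868):

```python
# Back-of-envelope output estimate for a plug-in balcony solar system.
# Every number is an assumption for illustration, not from the post or SB868.

rated_watts = 800        # common EU balcony-solar cap (assumed)
capacity_factor = 0.18   # sunny-climate annual average (assumed)
hours_per_year = 8760
price_per_kwh = 0.30     # assumed retail rate, USD

annual_kwh = rated_watts / 1000 * capacity_factor * hours_per_year
print(f"~{annual_kwh:.0f} kWh/yr, ~${annual_kwh * price_per_kwh:.0f}/yr behind the meter")
# -> ~1261 kWh/yr, ~$378/yr
```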

@arXiv_csGR_bot@mastoxiv.page
2026-02-03 08:20:05

OFERA: Blendshape-driven 3D Gaussian Control for Occluded Facial Expression to Realistic Avatars in VR
Seokhwan Yang, Boram Yoon, Seoyoung Kang, Hail Song, Woontack Woo
arxiv.org/abs/2602.01748 arxiv.org/pdf/2602.01748 arxiv.org/html/2602.01748
arXiv:2602.01748v1 Announce Type: new
Abstract: We propose OFERA, a novel framework for real-time expression control of photorealistic Gaussian head avatars for VR headset users. Existing approaches attempt to recover occluded facial expressions using additional sensors or internal cameras, but sensor-based methods increase device weight and discomfort, while camera-based methods raise privacy concerns and suffer from limited access to raw data. To overcome these limitations, we leverage the blendshape signals provided by commercial VR headsets as expression inputs. Our framework consists of three key components: (1) Blendshape Distribution Alignment (BDA), which applies linear regression to align the headset-provided blendshape distribution to a canonical input space; (2) an Expression Parameter Mapper (EPM) that maps the aligned blendshape signals into an expression parameter space for controlling Gaussian head avatars; and (3) a Mapper-integrated Avatar (MiA) that incorporates EPM into the avatar learning process to ensure distributional consistency. Furthermore, OFERA establishes an end-to-end pipeline that senses and maps expressions, updates Gaussian avatars, and renders them in real-time within VR environments. We show that EPM outperforms existing mapping methods on quantitative metrics, and we demonstrate through a user study that the full OFERA framework enhances expression fidelity while preserving avatar realism. By enabling real-time and photorealistic avatar expression control, OFERA significantly improves telepresence in VR communication. A project page is available at ysshwan147.github.io/projects/.
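As described, the BDA step is a plain linear regression from the headset's blendshape distribution to a canonical input space. A minimal sketch of that idea (the 52-coefficient layout, the frame pairing, and the data are illustrative assumptions, not the paper's setup):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Sketch of Blendshape Distribution Alignment (BDA) as the abstract describes
# it: a linear regression aligning headset-reported blendshape coefficients to
# a canonical input space. Shapes, pairing, and data are illustrative assumptions.

rng = np.random.default_rng(0)
n_frames, n_blendshapes = 2000, 52                   # ARKit-style 52 coefficients (assumed)
headset = rng.random((n_frames, n_blendshapes))      # raw headset signals
canonical = rng.random((n_frames, n_blendshapes))    # paired canonical-space targets

bda = LinearRegression().fit(headset, canonical)     # the learned linear alignment

# At runtime, align each incoming frame before the Expression Parameter Mapper.
aligned = bda.predict(headset[:1])                   # shape: (1, 52)
```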

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:32:50

Spatially-informed transformers: Injecting geostatistical covariance biases into self-attention for spatio-temporal forecasting
Yuri Calleo
arxiv.org/abs/2512.17696 arxiv.org/pdf/2512.17696 arxiv.org/html/2512.17696
arXiv:2512.17696v1 Announce Type: new
Abstract: The modeling of high-dimensional spatio-temporal processes presents a fundamental dichotomy between the probabilistic rigor of classical geostatistics and the flexible, high-capacity representations of deep learning. While Gaussian processes offer theoretical consistency and exact uncertainty quantification, their prohibitive computational scaling renders them impractical for massive sensor networks. Conversely, modern transformer architectures excel at sequence modeling but inherently lack a geometric inductive bias, treating spatial sensors as permutation-invariant tokens without a native understanding of distance. In this work, we propose a spatially-informed transformer, a hybrid architecture that injects a geostatistical inductive bias directly into the self-attention mechanism via a learnable covariance kernel. By formally decomposing the attention structure into a stationary physical prior and a non-stationary data-driven residual, we impose a soft topological constraint that favors spatially proximal interactions while retaining the capacity to model complex dynamics. We demonstrate the phenomenon of "Deep Variography", where the network successfully recovers the true spatial decay parameters of the underlying process end-to-end via backpropagation. Extensive experiments on synthetic Gaussian random fields and real-world traffic benchmarks confirm that our method outperforms state-of-the-art graph neural networks. Furthermore, rigorous statistical validation confirms that the proposed method delivers not only superior predictive accuracy but also well-calibrated probabilistic forecasts, effectively bridging the gap between physics-aware modeling and data-driven learning.
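The core mechanism, adding a learnable stationary covariance prior to the attention logits while the QK term carries the data-driven residual, can be sketched roughly as follows (the module name, the exponential kernel, and the single-head layout are my assumptions; the paper's parametrization may differ):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialBiasAttention(nn.Module):
    """Single-head self-attention with an additive, learnable exponential-
    covariance bias on the logits. A sketch of the idea only; the paper's
    kernel family and parametrization may differ."""

    def __init__(self, dim: int):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.log_range = nn.Parameter(torch.zeros(()))   # learnable decay range rho

    def forward(self, x: torch.Tensor, dist: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_sensors, dim); dist: (n_sensors, n_sensors) pairwise distances.
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        logits = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5   # data-driven residual
        rho = self.log_range.exp()
        logits = logits - dist / rho   # stationary prior: log of an exp(-d/rho) kernel
        return F.softmax(logits, dim=-1) @ v

# Usage: 32 sequences over 50 sensors at random 2-D locations (units assumed).
attn = SpatialBiasAttention(dim=64)
coords = torch.rand(50, 2)
out = attn(torch.randn(32, 50, 64), dist=torch.cdist(coords, coords))
```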

@arXiv_csLG_bot@mastoxiv.page
2025-12-22 10:34:10

Exploiting ID-Text Complementarity via Ensembling for Sequential Recommendation
Liam Collins, Bhuvesh Kumar, Clark Mingxuan Ju, Tong Zhao, Donald Loveland, Leonardo Neves, Neil Shah
arxiv.org/abs/2512.17820 arxiv.org/pdf/2512.17820 arxiv.org/html/2512.17820
arXiv:2512.17820v1 Announce Type: new
Abstract: Modern Sequential Recommendation (SR) models commonly utilize modality features to represent items, motivated in large part by recent advancements in language and vision modeling. To do so, several works completely replace ID embeddings with modality embeddings, claiming that modality embeddings render ID embeddings unnecessary because they can match or even exceed ID embedding performance. On the other hand, many works jointly utilize ID and modality features, but posit that complex fusion strategies, such as multi-stage training and/or intricate alignment architectures, are necessary for this joint utilization. However, underlying both these lines of work is a lack of understanding of the complementarity of ID and modality features. In this work, we address this gap by studying the complementarity of ID- and text-based SR models. We show that these models do learn complementary signals, meaning that either should provide performance gain when used properly alongside the other. Motivated by this, we propose a new SR method that preserves ID-text complementarity through independent model training, then harnesses it through a simple ensembling strategy. Despite this method's simplicity, we show it outperforms several competitive SR baselines, implying that both ID and text features are necessary to achieve state-of-the-art SR performance but complex fusion architectures are not.
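The recipe is deliberately simple: train the ID and text models independently, then ensemble their scores. A rough sketch of that scoring-time combination (the per-model standardization and the weight alpha are my assumptions, not the paper's exact strategy):

```python
import numpy as np

# Sketch of the ensembling idea: two independently trained sequential
# recommenders (one ID-based, one text-based) are combined only at scoring
# time. The per-model normalization and the weight alpha are assumptions.

def ensemble_scores(id_scores: np.ndarray, text_scores: np.ndarray,
                    alpha: float = 0.5) -> np.ndarray:
    """Convex combination of per-item scores from the two frozen models."""
    def z(s):  # standardize so neither model dominates purely by scale
        return (s - s.mean(axis=-1, keepdims=True)) / (s.std(axis=-1, keepdims=True) + 1e-8)
    return alpha * z(id_scores) + (1 - alpha) * z(text_scores)

# Usage: next-item scores over a 10k-item catalog for one user sequence.
rng = np.random.default_rng(0)
id_s, text_s = rng.standard_normal(10_000), rng.standard_normal(10_000)
top_items = np.argsort(-ensemble_scores(id_s, text_s))[:20]
```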

@arXiv_csGR_bot@mastoxiv.page
2026-01-21 08:13:41

Copy-Transform-Paste: Zero-Shot Object-Object Alignment Guided by Vision-Language and Geometric Constraints
Rotem Gatenyo, Ohad Fried
arxiv.org/abs/2601.14207 arxiv.org/pdf/2601.14207 arxiv.org/html/2601.14207
arXiv:2601.14207v1 Announce Type: new
Abstract: We study zero-shot 3D alignment of two given meshes, using a text prompt describing their spatial relation, an essential capability for content creation and scene assembly. Earlier approaches primarily rely on geometric alignment procedures, while recent work leverages pretrained 2D diffusion models to model language-conditioned object-object spatial relationships. In contrast, we directly optimize the relative pose at test time, updating translation, rotation, and isotropic scale with CLIP-driven gradients via a differentiable renderer, without training a new model. Our framework augments language supervision with geometry-aware objectives: a variant of the soft Iterative Closest Point (ICP) term to encourage surface attachment and a penetration loss to discourage interpenetration. A phased schedule strengthens contact constraints over time, and camera control concentrates the optimization on the interaction region. To enable evaluation, we curate a benchmark containing diverse categories and relations, and compare against baselines. Our method outperforms all alternatives, yielding semantically faithful and physically plausible alignments.
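The test-time loop can be sketched as below. The CLIP-driven relation term is omitted for brevity, and the attachment loss (a one-sided chamfer standing in for soft-ICP) and penetration loss (depth inside the target's bounding box) are simplified stand-ins, not the paper's objectives:

```python
import torch

# Skeleton of the test-time pose optimization: translation, axis-angle
# rotation, and isotropic log-scale updated by gradient descent. The CLIP
# relation term is omitted; both losses below are simplified stand-ins.

src = torch.rand(500, 3)         # movable object's surface points (assumed)
dst = torch.rand(500, 3) + 1.0   # fixed target object's points (assumed)

t = torch.zeros(3, requires_grad=True)             # translation
w = (torch.randn(3) * 1e-3).requires_grad_()       # axis-angle rotation (near identity)
log_s = torch.zeros((), requires_grad=True)        # isotropic log-scale

def rotation_matrix(w: torch.Tensor) -> torch.Tensor:
    """Rodrigues' formula: axis-angle vector -> 3x3 rotation matrix."""
    theta = w.norm() + 1e-8
    k = w / theta
    zero = torch.zeros((), dtype=w.dtype)
    K = torch.stack([torch.stack([zero, -k[2], k[1]]),
                     torch.stack([k[2], zero, -k[0]]),
                     torch.stack([-k[1], k[0], zero])])
    return torch.eye(3) + theta.sin() * K + (1 - theta.cos()) * (K @ K)

opt = torch.optim.Adam([t, w, log_s], lr=0.02)
lo, hi = dst.min(0).values, dst.max(0).values      # target's axis-aligned bounds
for step in range(300):
    pts = log_s.exp() * (src @ rotation_matrix(w).T) + t
    attach = torch.cdist(pts, dst).min(dim=1).values.mean()  # pull onto the surface
    inside = torch.minimum(pts - lo, hi - pts).clamp(min=0)
    penetration = inside.min(dim=1).values.mean()  # depth inside the box, 0 outside
    loss = attach + 10.0 * penetration             # the paper adds a CLIP relation loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```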

@arXiv_csGR_bot@mastoxiv.page
2026-01-27 07:37:15

LoD-Structured 3D Gaussian Splatting for Streaming Video Reconstruction
Xinhui Liu, Can Wang, Lei Liu, Zhenghao Chen, Wei Jiang, Wei Wang, Dong Xu
arxiv.org/abs/2601.18475 arxiv.org/pdf/2601.18475 arxiv.org/html/2601.18475
arXiv:2601.18475v1 Announce Type: new
Abstract: Free-Viewpoint Video (FVV) reconstruction enables photorealistic and interactive 3D scene visualization; however, real-time streaming is often bottlenecked by sparse-view inputs, prohibitive training costs, and bandwidth constraints. While recent 3D Gaussian Splatting (3DGS) has advanced FVV due to its superior rendering speed, Streaming Free-Viewpoint Video (SFVV) introduces additional demands for rapid optimization, high-fidelity reconstruction under sparse constraints, and minimal storage footprints. To bridge this gap, we propose StreamLoD-GS, an LoD-based Gaussian Splatting framework designed specifically for SFVV. Our approach integrates three core innovations: 1) an Anchor- and Octree-based LoD-structured 3DGS with a hierarchical Gaussian dropout technique to ensure efficient and stable optimization while maintaining high-quality rendering; 2) a GMM-based motion partitioning mechanism that separates dynamic and static content, refining dynamic regions while preserving background stability; and 3) a quantized residual refinement framework that significantly reduces storage requirements without compromising visual fidelity. Extensive experiments demonstrate that StreamLoD-GS achieves competitive or state-of-the-art performance in terms of quality, efficiency, and storage.
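The GMM-based motion partitioning component has a natural minimal form: cluster per-Gaussian motion magnitudes into two components and refine only the dynamic one. A sketch under that reading (using the frame-to-frame displacement norm as the feature is my assumption):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Sketch of GMM-based motion partitioning: cluster per-Gaussian motion
# magnitudes into two components and refine only the dynamic one. Using the
# frame-to-frame displacement norm as the feature is an assumption.

rng = np.random.default_rng(0)
prev_centers = rng.random((10_000, 3))                      # Gaussian centers, frame t-1
curr_centers = prev_centers + rng.normal(0, 1e-3, (10_000, 3))
curr_centers[:500] += 0.05                                  # a few genuinely moving points

motion = np.linalg.norm(curr_centers - prev_centers, axis=1, keepdims=True)
gmm = GaussianMixture(n_components=2, random_state=0).fit(motion)

dynamic = gmm.predict(motion) == gmm.means_.argmax()  # component with larger mean motion
# Refine only the dynamic Gaussians; static ones (and their storage) stay untouched.
```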

@arXiv_csGR_bot@mastoxiv.page
2026-01-22 07:44:12

PAColorHolo: A Perceptually-Aware Color Management Framework for Holographic Displays
Chun Chen, Minseok Chae, Seung-Woo Nam, Myeong-Ho Choi, Minseong Kim, Eunbi Lee, Yoonchan Jeong, Jae-Hyeung Park
arxiv.org/abs/2601.14766 arxiv.org/pdf/2601.14766 arxiv.org/html/2601.14766
arXiv:2601.14766v1 Announce Type: new
Abstract: Holographic displays offer significant potential for augmented and virtual reality applications by reconstructing wavefronts that enable continuous depth cues and natural parallax without vergence-accommodation conflict. However, despite advances in pixel-level image quality, current systems struggle to achieve perceptually accurate color reproduction, an essential component of visual realism. These challenges arise from complex system-level distortions caused by coherent laser illumination, spatial light modulator imperfections, chromatic aberrations, and camera-induced color biases. In this work, we propose a perceptually-aware color management framework for holographic displays that jointly addresses input-output color inconsistencies through color space transformation, adaptive illumination control, and neural network-based perceptual modeling of the camera's color response. We validate the effectiveness of our approach through numerical simulations, optical experiments, and a controlled user study. The results demonstrate substantial improvements in perceptual color fidelity, laying the groundwork for perceptually driven holographic rendering in future systems.
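One concrete ingredient of such a pipeline, the color space transformation, looks like this in its standard form (sRGB to CIE XYZ under D65; where exactly this step sits in the paper's framework is an assumption):

```python
import numpy as np

# Standard sRGB -> linear RGB -> CIE XYZ transform (D65 white point).
# One piece of a color-management pipeline; its place in the paper's
# full framework is an assumption.

SRGB_TO_XYZ = np.array([[0.4124, 0.3576, 0.1805],
                        [0.2126, 0.7152, 0.0722],
                        [0.0193, 0.1192, 0.9505]])

def srgb_to_xyz(rgb: np.ndarray) -> np.ndarray:
    """rgb: (..., 3) sRGB-encoded values in [0, 1]."""
    # Undo the sRGB transfer function (gamma) to get linear light first.
    linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    return linear @ SRGB_TO_XYZ.T

print(srgb_to_xyz(np.array([1.0, 1.0, 1.0])))   # ~D65 white: [0.9505, 1.0, 1.089]
```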