Tootfinder

No exact results. Similar results found.

@arXiv_csLG_bot@mastoxiv.page
2026-06-19 08:44:40

Crosslisted article(s) found for cs.LG. https://arxiv.org/list/cs.LG/new
[8/8]:
- Scalable Training of Spatially Grounded 2D Vision-Language Models for Radiology
Salcan, Ging, Schirrmeister, Arnold, Kotter, Bozorgtabar, Brox

@arXiv_eessIV_bot@mastoxiv.page
2026-06-19 07:41:41

FrequencyFormer: A Co-Designed Sensor-to-Processor Pipeline for Frequency-Domain Vision Transformer Inference
Chengwei Zhou, Ovishake Sen, Xuming Chen, Rishith Paramasivam, Shaahin Angizi, Swarup Bhunia, Baibhab Chatterjee, Gourav Datta
https://arxiv.org/abs/2606.19574

@memeorandum@universeodon.com
2026-06-12 02:50:52

Erika Kirk Lays Out a Vision for the Conservative Christian Woman (Kara Voght/Wall Street Journal)
https://www.wsj.com/style/turning-point-usa-womens-summit-erika-kirk-97abb601?st=dNEf2s
http://www.memeorandum.com/260611/p145#a260611p145

@arXiv_csGR_bot@mastoxiv.page
2026-07-23 07:32:35

LowPowAR: Power-Constrained Tone Mapping for Augmented Reality
Weikai Lin, Sheng Zhao, Ian Ross, Carl Marshall, Sushant Kondguli, Yuhao Zhu
https://arxiv.org/abs/2607.19509 https://arxiv.org/pdf/2607.19509 https://arxiv.org/html/2607.19509
arXiv:2607.19509v1 Announce Type: new
Abstract: Everyday-wearable Augmented Reality (AR) glasses must meet strict power limits, making displays a key target for optimization. We cast display power optimization as a power-constrained tone-mapping problem and propose a human-vision-grounded, learning-based framework that maximizes perceptual quality under a given power budget. We introduce an optimization-friendly tone-mapping operator (TMO) parameterization along with a progressive optimization strategy to effectively navigate the quality-vs-power landscape. We distill the iterative optimization into a lightweight feed-forward neural network for real-time deployment. Subjective experiments show that our method yields better perceptual quality than prior work at the same power budget. Project page: https://horizon-lab.org/lowpowar/.

@arXiv_qbioNC_bot@mastoxiv.page
2026-07-22 07:57:40

Eccentricity-Constrained CNN Training Reveals Adaptive Information Coding Around the Visual Field
Dylan M. Diaz, Margaret M. Henderson
https://arxiv.org/abs/2607.19316 https://arxiv.org/pdf/2607.19316 https://arxiv.org/html/2607.19316
arXiv:2607.19316v1 Announce Type: new
Abstract: In the primate visual system, center-preferring cortical populations have higher spatial resolution and overlap face- and word-selective regions while periphery-preferring populations have lower spatial resolution and overlap scene-selective regions. This "eccentricity bias" may reflect differential task-relevance: central vision may better support fine-grained tasks like face recognition and reading, while peripheral vision may better support scene understanding. To test whether eccentricity-dependent coding can emerge from natural experience, we used egocentric video and eye-tracking data from the Visual Experience Dataset (VEDB). We trained ResNet-18 models using contrastive learning (SimCLR) on frames modified to isolate different eccentricities (gaze-contingent fovea-only crops, periphery-only crops, and periphery-only crops with a NeuroFovea transform applied). We evaluated downstream task performance and model alignment with human fMRI data (Natural Scenes Dataset; encoding models). In-domain VEDB frame classification showed systematic differences between fovea- and periphery-only models across categories, indicating differential informativeness across tasks. On downstream classification, VEDB-pretrained models generalized better to scene categorization (Places365) than face recognition (VGGFace2), with fovea-only models stronger on both. Across visual cortex, VEDB-pretrained models matched neural predictivity of models trained on mid-sized non-egocentric datasets (ImageNet-100), suggesting egocentric data supports emergence of cortically-aligned representations. In scene-selective cortex (PPA, RSC), periphery-only models held a small but consistent advantage in explained variance over fovea-only models, suggesting these regions are aligned with peripheral statistics. Together, these results suggest egocentric experience may adaptively constrain cortical information processing.

@arXiv_csIT_bot@mastoxiv.page
2026-06-11 07:43:08

Vision-Language-Action Models Meet World Models: Embodied Agentic AI for Low-Altitude Wireless Networks
Feibo Jiang, Li Dong, Lei Mao, Kezhi Wang, Cunhua Pan, Dong In Kim, Naofal Al-Dhahir
https://arxiv.org/abs/2606.11618 https://arxiv.org/pdf/2606.11618 https://arxiv.org/html/2606.11618
arXiv:2606.11618v1 Announce Type: new
Abstract: Low-Altitude Wireless Networks (LAWNs), composed of Unmanned Aerial Vehicles (UAVs) and other aerial platforms, provide integrated perception, communication, and computation services in low-altitude airspace. However, deploying large generative models in this domain faces three major challenges: 1) Limited embodied action mapping; 2) Inadequate physical environment modeling; 3) Insufficient closed-loop optimization. To address these challenges, this study proposes an Embodied Agentic UAV framework. Centered on a Vision-Language-Action (VLA) model as the execution core, the framework establishes an end-to-end embodied decision-making pipeline from multimodal environmental perception to continuous control generation. In addition, a World Model (WM) is introduced to capture the coupling between UAV actions and environmental state evolution, thereby supporting environment prediction, policy verification, and dynamic optimization. Furthermore, memory and reflection mechanisms are incorporated to form an adaptive closed-loop optimization paradigm of decision, execution, evaluation, and update, thereby enhancing the system's autonomous decision-making capability and continual evolution ability in complex dynamic environments. Experimental results validate its effectiveness in enabling robust, predictive, and sustainable autonomous control in LAWNs.

@arXiv_qbioNC_bot@mastoxiv.page
2026-07-21 09:41:50

Replaced article(s) found for q-bio.NC. https://arxiv.org/list/q-bio.NC/new
[1/1]:
- The Illusion-Illusion: Vision Language Models See Illusions Where There Are None
Tomer Ullman
https://arxiv.org/abs/2412.18613 https://mastoxiv.page/@arXiv_qbioNC_bot/113740853862270406
- An Intelligent Infrastructure as a Foundation for Modern Science
Satrajit S. Ghosh
https://arxiv.org/abs/2508.10051 https://mastoxiv.page/@arXiv_qbioNC_bot/115031786313942225
- The embodied brain: Bridging the brain, body, and behavior with biorealistic neuromechanical models
Sibo Wang-Chen, Pavan Ramdya
https://arxiv.org/abs/2601.08056 https://mastoxiv.page/@arXiv_qbioNC_bot/115892667909757712
- Microsecond-precision sound localization emerges from slow equilibrium dynamics
Toshio Irino
https://arxiv.org/abs/2607.03890 https://mastoxiv.page/@arXiv_qbioNC_bot/116877637358323806
- A portable solution for simultaneous human movement and mobile EEG acquisition: readiness potenti...
Contreras-Altamirano, Klapprott, Jacobsen, Maanen, Welzel, Debener
https://arxiv.org/abs/2501.05378 https://mastoxiv.page/@arXiv_csNE_bot/113802730861926461
- Geometric origin of adversarial vulnerability in deep learning
Yixiong Ren, Wenkang Du, Jianhui Zhou, Haiping Huang
https://arxiv.org/abs/2509.01235 https://mastoxiv.page/@arXiv_csLG_bot/115140809439719323
- Universal Approximation Theorems for Dynamical Systems with Infinite-Time Horizon Guarantees
Abel Sagodi, Il Memming Park
https://arxiv.org/abs/2602.08640 https://mastoxiv.page/@arXiv_mathDS_bot/116045857797065476

@arXiv_csGR_bot@mastoxiv.page
2026-07-21 07:34:37

Feature-Guided Diffusion for Non-Differentiable Inverse Rendering
Andrei-Timotei Ardelean, Michael Fischer, Tim Weyrich, Tomáš Iser
https://arxiv.org/abs/2607.17411 https://arxiv.org/pdf/2607.17411 https://arxiv.org/html/2607.17411
arXiv:2607.17411v1 Announce Type: new
Abstract: Inverse rendering is traditionally solved via differentiable renderers and gradient descent, which requires substantial problem-specific engineering and is prone to getting stuck in local minima due to ambiguities. Derivative-free approaches alleviate engineering requirements, but often heavily depend on a good problem initialization. In this work, we propose Feature-Informed Diffusion Evolution (FIDE), a fully black-box framework that requires no gradients or specific initialization: the renderer is treated as an opaque function whose only requirement is to produce images. Our key insight is feature guiding: rather than reducing each candidate rendering to a scalar loss value, we use a Vision Transformer (ViT) to extract dense visual features from it. We subsequently use these features to train a diffusion-based candidate proposal model, allowing the network to use visual cues to predict parameters that would match the target image. The candidate solutions proposed by this diffusion model are then refined in a closed loop with a CMA evolution strategy, continuously narrowing the proposal region as optimization progresses. We validate across diverse inverse problems from path tracing, vector splines, Voronoi shaders, and robotics, and demonstrate that feature-guiding substantially improves convergence over scalar-loss baselines and reliably escapes local minima where gradient-based methods stall.
