
2025-07-22 06:03:00
heise | From Text to Image: How Stable Diffusion Works
The freely available Stable Diffusion generates realistic, highly detailed images from text. We explain how the AI works.
https://www.
rd-spiral: An open-source Python library for learning 2D reaction-diffusion dynamics through pseudo-spectral method
Sandy H. S. Herho, Iwan P. Anwar, Rusmawan Suwarman
https://arxiv.org/abs/2506.20633
heise | macOS: Integrating Stable Diffusion into Your Own App
No training required, yet with the full potential of Apple's hardware: how to embed the text-to-image generator in your own app.
https://www.
Q&A with Stability AI CEO Prem Akkaraju on taking over from Emad Mostaque, his movie background, data and copyright, working with creative industries, and more (Melissa Heikkilä/Financial Times)
https://www.ft.com/content/fc4cb659-bf01-4f68-a828-40ff98c24e51
Efficient and Robust Semantic Image Communication via Stable Cascade
Bilal Khalid, Pedro Freire, Sergei K. Turitsyn, Jaroslaw E. Prilepsky
https://arxiv.org/abs/2507.17416
SustainDiffusion: Optimising the Social and Environmental Sustainability of Stable Diffusion Models
Giordano d'Aloisio, Tosin Fadahunsi, Jay Choy, Rebecca Moussa, Federica Sarro
https://arxiv.org/abs/2507.15663
Explicit Monotone Stable Super-Time-Stepping Methods for Finite Time Singularities
Zheng Tan, Tariq D. Aslam, Andrea L. Bertozzi
https://arxiv.org/abs/2507.17062 https://…
Probabilistic approximation of fully nonlinear second-order PIDEs with convergence rates for the universal robust limit theorem
Lianzi Jiang, Mingshang Hu, Gechun Liang
https://arxiv.org/abs/2506.18374
Concept Unlearning by Modeling Key Steps of Diffusion Process
Chaoshuo Zhang, Chenhao Lin, Zhengyu Zhao, Le Yang, Qian Wang, Chao Shen
https://arxiv.org/abs/2507.06526
DLSF: Dual-Layer Synergistic Fusion for High-Fidelity Image Synthesis
Zhen-Qi Chen, Yuan-Fu Yang
https://arxiv.org/abs/2507.13388 https://
Taming Stable Diffusion for Computed Tomography Blind Super-Resolution
Chunlei Li, Yilei Shi, Haoxi Hu, Jingliang Hu, Xiao Xiang Zhu, Lichao Mou
https://arxiv.org/abs/2506.11496
SD-Acc: Accelerating Stable Diffusion through Phase-aware Sampling and Hardware Co-Optimizations
Zhican Wang, Guanghui He, Hongxiang Fan
https://arxiv.org/abs/2507.01309
When There Is No Decoder: Removing Watermarks from Stable Diffusion Models in a No-box Setting
Xiaodong Wu, Tianyi Tang, Xiangman Li, Jianbing Ni, Yong Yu
https://arxiv.org/abs/2507.03646
EXPO: Stable Reinforcement Learning with Expressive Policies
Perry Dong, Qiyang Li, Dorsa Sadigh, Chelsea Finn
https://arxiv.org/abs/2507.07986 https://arxiv.org/pdf/2507.07986 https://arxiv.org/html/2507.07986
arXiv:2507.07986v1 Announce Type: new
Abstract: We study the problem of training and fine-tuning expressive policies with online reinforcement learning (RL) given an offline dataset. Training expressive policy classes with online RL presents a unique challenge of stable value maximization. Unlike simpler Gaussian policies commonly used in online RL, expressive policies like diffusion and flow-matching policies are parameterized by a long denoising chain, which hinders stable gradient propagation from actions to policy parameters when optimizing against some value function. Our key insight is that we can address stable value maximization by avoiding direct optimization over value with the expressive policy and instead constructing an on-the-fly RL policy to maximize Q-value. We propose Expressive Policy Optimization (EXPO), a sample-efficient online RL algorithm that utilizes an on-the-fly policy to maximize value with two parameterized policies: a larger expressive base policy trained with a stable imitation learning objective and a lightweight Gaussian edit policy that edits the actions sampled from the base policy toward a higher value distribution. The on-the-fly policy optimizes the actions from the base policy with the learned edit policy and chooses the value-maximizing action from the base and edited actions for both sampling and temporal-difference (TD) backup. Our approach yields up to 2-3x improvement in sample efficiency on average over prior methods both in the setting of fine-tuning a pretrained policy given offline data and in leveraging offline data to train online.
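The action-selection step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the scalar action space, the additive form of the edit, and the toy policies and Q-function are all hypothetical.

```python
import random

def on_the_fly_action(state, base_policy, edit_policy, q_fn, n_samples=4):
    """Return the highest-Q action among base samples and their edited versions."""
    candidates = []
    for _ in range(n_samples):
        a = base_policy(state)              # sample from the expressive base policy
        a_edit = a + edit_policy(state, a)  # assumption: the edit is an additive correction
        candidates.extend([a, a_edit])
    # Choose the value-maximizing action among base and edited candidates.
    return max(candidates, key=lambda act: q_fn(state, act))

def td_target(reward, next_state, gamma, base_policy, edit_policy, q_fn):
    """TD backup that uses the same on-the-fly policy for the next action."""
    next_action = on_the_fly_action(next_state, base_policy, edit_policy, q_fn)
    return reward + gamma * q_fn(next_state, next_action)

# Toy stand-ins (hypothetical): scalar states and actions.
def toy_base_policy(state):
    return random.gauss(0.0, 1.0)

def toy_edit_policy(state, action):
    return 0.1 * (state - action)  # small nudge toward the state

def toy_q(state, action):
    return -(action - state) ** 2  # highest when action == state

if __name__ == "__main__":
    random.seed(0)
    print(on_the_fly_action(2.0, toy_base_policy, toy_edit_policy, toy_q))
```

Because the arg-max runs over a handful of sampled candidates, no gradient has to flow through the base policy's denoising chain, which is the stability argument the abstract makes.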
Process-aware and high-fidelity microstructure generation using stable diffusion
Hoang Cuong Phan, Minh Tien Tran, Chihun Lee, Hoheok Kim, Sehyok Oh, Dong-Kyu Kim, Ho Won Lee
https://arxiv.org/abs/2507.00459
Towards Multimodal Understanding via Stable Diffusion as a Task-Aware Feature Extractor
Vatsal Agarwal, Matthew Gwilliam, Gefen Kohavi, Eshan Verma, Daniel Ulbricht, Abhinav Shrivastava
https://arxiv.org/abs/2507.07106
A gradient-enhanced approach for stable finite element approximations of reaction-convection-diffusion problems
Soheil Firooz, B. Daya Reddy, Paul Steinmann
https://arxiv.org/abs/2506.01873
Diffusion and heat dissipation in marginally stable linear time-delayed Langevin systems
Xin Wang
https://arxiv.org/abs/2506.23939 https://
Music Boomerang: Reusing Diffusion Models for Data Augmentation and Audio Manipulation
Alexander Fichtinger, Jan Schlüter, Gerhard Widmer
https://arxiv.org/abs/2507.04864
I heard they were working on a middle ground between flat earthers and scientists. What's that supposed to be? Cubic earth??
#humor #flatearth #science
Enhancing Food-Domain Question Answering with a Multimodal Knowledge Graph: Hybrid QA Generation and Diversity Analysis
Srihari K B, Pushpak Bhattacharyya
https://arxiv.org/abs/2507.06571
ME: Trigger Element Combination Backdoor Attack on Copyright Infringement
Feiyu Yang, Siyuan Liang, Aishan Liu, Dacheng Tao
https://arxiv.org/abs/2506.10776
FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale
Boris Bonev, Thorsten Kurth, Ankur Mahesh, Mauro Bisson, Jean Kossaifi, Karthik Kashinath, Anima Anandkumar, William D. Collins, Michael S. Pritchard, Alexander Keller
https://arxiv.org/abs/2507.12144…
This https://arxiv.org/abs/2405.20032 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csNI_…
A fourth-order exponential time differencing scheme with real and distinct poles rational approximation for solving non-linear reaction-diffusion systems
Wisdom Kwame Attipoe, Andreas Kleefeld, Emmanuel Asante-Asamani
https://arxiv.org/abs/2507.01245
Noise Consistency Regularization for Improved Subject-Driven Image Synthesis
Yao Ni, Song Wen, Piotr Koniusz, Anoop Cherian
https://arxiv.org/abs/2506.06483
GenBreak: Red Teaming Text-to-Image Generators Using Large Language Models
Zilong Wang, Xiang Zheng, Xiaosen Wang, Bo Wang, Xingjun Ma, Yu-Gang Jiang
https://arxiv.org/abs/2506.10047
This https://arxiv.org/abs/2402.10346 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_mat…
This https://arxiv.org/abs/2505.04411 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_nli…