DemoFusion: Democratising High-Resolution Image Generation With No $$$

Authors: Ruoyi Du, Dongliang Chang, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma

What

This paper introduces DemoFusion, a method for generating high-resolution images from pre-trained Latent Diffusion Models (LDMs) like SDXL without requiring additional training.

Why

High-resolution image generation is increasingly centralised in a few large companies and placed behind paywalls. This paper counters that trend by making high-resolution generation possible with consumer-grade hardware and open-source models.

How

The authors propose DemoFusion, which extends MultiDiffusion with three key mechanisms: Progressive Upscaling for iteratively enhancing image resolution, Skip Residual for maintaining global consistency, and Dilated Sampling for increasing global semantic coherence during image generation.
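
The sketch below illustrates two of the mechanisms named above, under stated assumptions, using plain NumPy arrays rather than a real diffusion pipeline. Dilated sampling is modelled as strided subsampling of the latent, so that each view spans the whole image at reduced density. The skip residual is modelled as a blend between the current latent and a noised copy of the upsampled lower-resolution latent, with a cosine weight that fades as denoising proceeds. The function names, the (H, W, C) layout, the cosine schedule, and the `alpha` exponent are illustrative assumptions, not the authors' code; progressive upscaling and the denoiser itself are omitted.

```python
import numpy as np


def dilated_views(latent: np.ndarray, s: int) -> list[np.ndarray]:
    """Split an (H, W, C) latent into s*s strided 'global' views.

    Each view takes every s-th position, so it spans the full image
    at 1/s of the resolution along each side.
    """
    h, w, _ = latent.shape
    assert h % s == 0 and w % s == 0, "sides must be divisible by the stride"
    return [latent[i::s, j::s] for i in range(s) for j in range(s)]


def merge_dilated_views(views: list[np.ndarray], s: int) -> np.ndarray:
    """Inverse of dilated_views: interleave the views back into one latent."""
    h, w, c = views[0].shape
    out = np.empty((h * s, w * s, c), dtype=views[0].dtype)
    for k, view in enumerate(views):
        i, j = divmod(k, s)  # same (row offset, column offset) order as above
        out[i::s, j::s] = view
    return out


def cosine_weight(t: int, total_steps: int, alpha: float = 1.0) -> float:
    """Assumed schedule: 1 at t = total_steps (noisiest), 0 at t = 0."""
    base = (1.0 + np.cos(np.pi * (total_steps - t) / total_steps)) / 2.0
    return float(base ** alpha)


def skip_residual(z_t: np.ndarray, z_ref_t: np.ndarray,
                  t: int, total_steps: int, alpha: float = 1.0) -> np.ndarray:
    """Blend the current latent with the noised low-resolution reference.

    z_ref_t stands for the upsampled previous-phase latent, noised to
    step t (how it is produced is outside this sketch).
    """
    c = cosine_weight(t, total_steps, alpha)
    return c * z_ref_t + (1.0 - c) * z_t
```

Because the dilated views are disjoint and together cover every position, merging them reproduces the original latent exactly. In a real pipeline each view would pass through the denoiser before being merged.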

Result

DemoFusion generates high-resolution images with better quality and coherence than baselines such as MultiDiffusion and SDXL+BSRGAN, as shown by qualitative comparisons and by quantitative metrics including FID, IS, and CLIP score.

Limitations and Future Work

Limitations include longer inference time due to progressive upscaling and dependence on the underlying LDM’s performance. Future work could involve training LDMs specifically for DemoFusion or exploring more efficient inference strategies.

Abstract

High-resolution image generation with Generative Artificial Intelligence (GenAI) has immense potential but, due to the enormous capital investment required for training, it is increasingly centralised to a few large corporations, and hidden behind paywalls. This paper aims to democratise high-resolution GenAI by advancing the frontier of high-resolution generation while remaining accessible to a broad audience. We demonstrate that existing Latent Diffusion Models (LDMs) possess untapped potential for higher-resolution image generation. Our novel DemoFusion framework seamlessly extends open-source GenAI models, employing Progressive Upscaling, Skip Residual, and Dilated Sampling mechanisms to achieve higher-resolution image generation. The progressive nature of DemoFusion requires more passes, but the intermediate results can serve as “previews”, facilitating rapid prompt iteration.
