Inversion-by-Inversion: Exemplar-based Sketch-to-Photo Synthesis via Stochastic Differential Equations without Training

Authors: Ximing Xing, Chuang Wang, Haitao Zhou, Zhihao Hu, Chongxuan Li, Dong Xu, Qian Yu

What

This paper introduces Inversion-by-Inversion, a training-free two-stage method for exemplar-based sketch-to-photo synthesis built on stochastic differential equations (SDEs), allowing users to generate photo-realistic images guided by both a sketch (shape) and an exemplar image (appearance).

Why

This paper is important as it addresses the challenge of generating photo-realistic images from sketches, which are inherently sparse, using pre-trained diffusion models. The proposed method effectively combines shape control from sketches with appearance control from exemplar images, advancing the field of sketch-to-photo synthesis.

How

The authors propose a two-stage approach:
1) Shape-enhancing inversion: an uncolored photo is generated from the input sketch by guiding the SDE inversion process with a shape-energy function, emphasizing shape preservation.
2) Full-control inversion: starting from the uncolored photo and an exemplar image, the final photo is generated by guiding the SDE inversion process with both the shape-energy and an appearance-energy function, adding the exemplar's color and texture while retaining the sketch's shape.
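The two stages above can be sketched as a toy energy-guided reverse-SDE loop. This is a minimal illustration, not the paper's implementation: the energy functions here (an L2 distance to the sketch for shape, a mean-intensity match to the exemplar for appearance) are hypothetical stand-ins, the "images" are flat arrays, and the score of the pretrained diffusion model is omitted, leaving only the guidance terms and a shrinking noise injection.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "images": flat arrays standing in for sketch / exemplar samples.
sketch = np.linspace(0.0, 1.0, 16)   # stands in for the shape target
exemplar = np.full(16, 0.7)          # stands in for the color/texture target

def grad_shape_energy(x, sketch):
    # Gradient of a hypothetical L2 shape-energy ||x - sketch||^2.
    return 2.0 * (x - sketch)

def grad_appearance_energy(x, exemplar):
    # Gradient of a hypothetical appearance-energy (mean(x) - mean(exemplar))^2,
    # matching the exemplar's global intensity statistics.
    return 2.0 * (x.mean() - exemplar.mean()) * np.ones_like(x) / x.size

def guided_reverse_sde(x, steps, lam_shape=0.0, lam_app=0.0, step=0.1):
    # Energy-guided reverse loop. A real pipeline would also add the score
    # of a pretrained diffusion model at each step; this sketch keeps only
    # the energy-guidance gradients plus decaying noise.
    for t in range(steps):
        noise = rng.normal(scale=0.01 * (steps - t) / steps, size=x.shape)
        x = x - step * (lam_shape * grad_shape_energy(x, sketch)
                        + lam_app * grad_appearance_energy(x, exemplar)) + noise
    return x

# Stage 1: shape-enhancing inversion (shape guidance only -> "uncolored photo").
x0 = rng.normal(size=16)
uncolored = guided_reverse_sde(x0, steps=200, lam_shape=1.0)

# Stage 2: full-control inversion (shape + appearance guidance together).
final = guided_reverse_sde(uncolored, steps=200, lam_shape=1.0, lam_app=50.0)
```

After stage 1 the sample tracks the sketch's structure; stage 2 then pulls its global statistics toward the exemplar while the shape term keeps it anchored to the sketch.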

Result

The paper shows that Inversion-by-Inversion outperforms existing SDE-based image translation methods in terms of FID score and shape fidelity, demonstrating its ability to generate more realistic and shape-consistent images. The method effectively uses various exemplars, including photos, stroke images, segmentation maps, and style images, showcasing its versatility. The ablation study confirms the importance of both the shape-enhancing step and the energy functions for achieving high-quality results.

Limitations and Future Work

The authors acknowledge that future work could explore alternative shape-energy functions and appearance-energy functions to further enhance the performance. Additionally, investigating the generalization ability of the method to handle more complex scenes and diverse sketch styles is a promising direction.

Abstract

Exemplar-based sketch-to-photo synthesis allows users to generate photo-realistic images based on sketches. Recently, diffusion-based methods have achieved impressive performance on image generation tasks, enabling highly flexible control through text-driven generation or energy functions. However, generating photo-realistic images with color and texture from sketch images remains challenging for diffusion models. Sketches typically consist of only a few strokes, with most regions left blank, making it difficult for diffusion-based methods to produce photo-realistic images. In this work, we propose a two-stage method named "Inversion-by-Inversion" for exemplar-based sketch-to-photo synthesis. This approach includes shape-enhancing inversion and full-control inversion. During the shape-enhancing inversion process, an uncolored photo is generated with the guidance of a shape-energy function. This step is essential to ensure control over the shape of the generated photo. In the full-control inversion process, we propose an appearance-energy function to control the color and texture of the final generated photo. Importantly, our Inversion-by-Inversion pipeline is training-free and can accept different types of exemplars for color and texture control. We conducted extensive experiments to evaluate our proposed method, and the results demonstrate its effectiveness. The code and project can be found at https://ximinng.github.io/inversion-by-inversion-project/.