A Training-Free Plug-and-Play Watermark Framework for Stable Diffusion

Authors: Guokai Zhang, Lanjun Wang, Yuting Su, An-An Liu

What

This paper introduces a training-free, plug-and-play watermarking framework for Stable Diffusion models, enabling the embedding of diverse watermarks in the latent space without requiring any retraining of the SD model itself.

Why

The paper addresses the growing concern of misuse of AI-generated content, particularly with the rapid evolution of SD models. The proposed framework provides a cost-efficient and adaptable solution for watermarking, ensuring traceability and responsibility attribution for generated images.

How

The authors develop a watermark encoder-decoder architecture trained using only the frozen VAE encoder-decoder of SD; no component of the SD model itself is updated. During inference, the compressed watermark is embedded into the latent code after denoising, minimizing the impact on image quality. The framework's generalization ability is analyzed, and extensive experiments evaluate its performance across SD versions and under various attacks.
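The embed-after-denoising idea can be illustrated with a toy NumPy sketch. All names, shapes, and the additive-residual embedding below are illustrative assumptions, not the paper's actual architecture: the real framework uses a learned watermark encoder-decoder, while this sketch uses a fixed pattern added to a stand-in latent code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (shapes are illustrative, matching SD's 4x64x64 latents):
# z0        - denoised latent code from the diffusion process
# watermark - a 32x32 binary watermark image to embed
z0 = rng.standard_normal((4, 64, 64))
watermark = rng.integers(0, 2, size=(32, 32)).astype(np.float64)

def embed_watermark(latent, wm, strength=0.05):
    """Hypothetical embedder: add a small watermark residual to the
    denoised latent, so the VAE decoder's output is barely perturbed
    (the paper's 'watermark invisibility')."""
    # Centre bits to {-1, +1} and upsample 32x32 -> 64x64.
    wm_signal = np.kron(wm * 2.0 - 1.0, np.ones((2, 2)))
    return latent + strength * wm_signal[None, :, :]

def extract_watermark(latent_wm, latent_ref):
    """Hypothetical extractor: recover bits from the embedding residual
    (the real framework decodes from the generated image instead)."""
    residual = (latent_wm - latent_ref).mean(axis=0)            # average channels
    blocks = residual.reshape(32, 2, 32, 2).mean(axis=(1, 3))   # downsample to 32x32
    return (blocks > 0).astype(np.float64)

z_wm = embed_watermark(z0, watermark)
recovered = extract_watermark(z_wm, z0)
print((recovered == watermark).mean())  # bit accuracy; 1.0 in this noise-free toy
```

The key design point the sketch mirrors is that the watermark touches only the latent code after denoising, so the diffusion U-Net and VAE weights stay frozen and the same embedder can be reused across SD versions.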

Result

The proposed framework demonstrates excellent watermark invisibility, achieving high PSNR and SSIM scores while barely affecting image quality (even showing a slight FID improvement). Watermark extraction quality is high, with NC exceeding 96%. The framework generalizes across SD versions (v1-1, v1-4, v1-5) without retraining and is robust to common image manipulations such as blurring, cropping, and noise addition.
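The NC (normalized correlation) score reported above is a standard similarity measure between the original and extracted watermarks. A minimal implementation, assuming the conventional definition (the paper may normalize slightly differently):

```python
import numpy as np

def normalized_correlation(wm, wm_extracted):
    """Normalized correlation (NC) between the original watermark and the
    extracted one: <w, w'> / (||w|| * ||w'||). 1.0 means a perfect match."""
    w = np.asarray(wm, dtype=np.float64).ravel()
    w_ext = np.asarray(wm_extracted, dtype=np.float64).ravel()
    num = np.dot(w, w_ext)
    den = np.sqrt(np.dot(w, w) * np.dot(w_ext, w_ext))
    return num / den

# Usage: a perfectly extracted watermark scores NC = 1.0; a few flipped
# bits lower the score only slightly, which is why thresholds like 96%
# indicate high-quality extraction.
rng = np.random.default_rng(1)
wm = rng.integers(0, 2, size=(32, 32))
print(normalized_correlation(wm, wm))  # 1.0
```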

Limitations and Future Work

The authors acknowledge limitations in handling large-angle rotations, since the watermark is spatially dependent; future work could explore rotation-invariant watermarking techniques. Additionally, while the framework avoids noticeable artifacts overall, localized pixel variations may occur in specific samples and warrant further investigation.

Abstract

Nowadays, the family of Stable Diffusion (SD) models has gained prominence for its high-quality outputs and scalability. This has also raised security concerns on social media, as malicious users can create and disseminate harmful content. Existing approaches train components, or entire SDs, to embed a watermark in generated images for traceability and responsibility attribution. However, in the era of AI-generated content (AIGC), the rapid iteration of SDs makes retraining watermark models costly. To address this, we propose a training-free plug-and-play watermark framework for SDs. Without modifying any components of SDs, we embed diverse watermarks in the latent space, adapting to the denoising process. Our experimental findings reveal that our method effectively harmonizes image quality and watermark invisibility. Furthermore, it performs robustly under various attacks. We have also validated that our method generalizes to multiple versions of SDs, even without retraining the watermark model.