Idempotent Generative Network
Authors: Assaf Shocher, Amil Dravid, Yossi Gandelsman, Inbar Mosseri, Michael Rubinstein, Alexei A. Efros
What
This paper introduces Idempotent Generative Networks (IGN), a novel approach to generative modeling that trains a neural network to be idempotent: applying it a second time does not change the result of the first application.
Why
This paper is significant because it presents a new perspective on generative modeling with distinct advantages: one-step generation, optional sequential refinement, a consistent latent space, and the potential to act as a “global projector” that maps various input distributions onto a target manifold.
How
The authors propose a training methodology with three key objectives: 1) Reconstruction: Data samples should be mapped to themselves. 2) Idempotence: Applying the network twice should yield the same result as applying it once. 3) Tightness: The set of instances mapped to themselves should be minimized. They achieve this through a novel self-adversarial training scheme using a single network.
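The role of the third objective can be illustrated with a toy example. The sketch below is not the authors' training code: it uses hand-written maps on short lists of floats, and every name in it (`identity`, `project_to_unit_box`, `fixed_point_fraction`, and so on) is a hypothetical stand-in chosen for illustration. It assumes the "data" lives in the unit box [0, 1]^4 and the "noise" is a wide Gaussian. The point it makes is that the identity map already satisfies reconstruction and idempotence perfectly, so those two objectives alone cannot produce a useful model; only the size of the fixed-point set tells the two maps apart.

```python
import random


def identity(v):
    """Trivial map: every input is a fixed point."""
    return list(v)


def project_to_unit_box(v):
    """Clamp each coordinate into [0, 1]; a projection, hence idempotent."""
    return [min(1.0, max(0.0, c)) for c in v]


def distance(a, b):
    """Mean squared difference between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def reconstruction_error(f, data):
    """Objective 1: data samples should be mapped to themselves."""
    return sum(distance(f(x), x) for x in data) / len(data)


def idempotence_error(f, noise):
    """Objective 2: applying f twice should match applying it once."""
    return sum(distance(f(f(z)), f(z)) for z in noise) / len(noise)


def fixed_point_fraction(f, noise, tol=1e-12):
    """Rough proxy for objective 3: share of noise samples f leaves unchanged."""
    return sum(distance(f(z), z) <= tol for z in noise) / len(noise)


rng = random.Random(0)
# Assumed toy "target distribution": points inside the unit box.
data = [[rng.random() for _ in range(4)] for _ in range(100)]
# Assumed toy "source distribution": wide Gaussian noise.
noise = [[rng.gauss(0.0, 3.0) for _ in range(4)] for _ in range(100)]

for name, f in [("identity", identity), ("projector", project_to_unit_box)]:
    print(
        name,
        reconstruction_error(f, data),
        idempotence_error(f, noise),
        fixed_point_fraction(f, noise),
    )
```

Both maps score zero on the first two measures, but the identity leaves every noise sample unchanged while the projector leaves almost none unchanged. In this toy setting, that gap is what a tightness objective would penalize.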
Result
The paper provides theoretical guarantees of IGN’s convergence to the target distribution under ideal conditions. Experiments on MNIST and CelebA datasets demonstrate IGN’s ability to generate realistic images from noise, perform latent space manipulations, and project out-of-distribution images (noisy, grayscale, sketches) onto the learned image manifold.
Limitations and Future Work
The authors acknowledge limitations such as mode collapse and blurriness in generated images, and suggest potential remedies: mode-collapse prevention techniques developed for GANs, and perceptual or two-step loss functions. Future work aims to scale up IGN by training on larger datasets to explore its full potential.
Abstract
We propose a new approach for generative modeling based on training a neural network to be idempotent. An idempotent operator is one that can be applied sequentially without changing the result beyond the initial application, namely $f(f(z)) = f(z)$. The proposed model $f$ is trained to map a source distribution (e.g., Gaussian noise) to a target distribution (e.g., realistic images) using the following objectives: (1) Instances from the target distribution should map to themselves, namely $f(x) = x$. We define the target manifold as the set of all instances that $f$ maps to themselves. (2) Instances that form the source distribution should map onto the defined target manifold. This is achieved by optimizing the idempotence term, $f(f(z)) = f(z)$, which encourages the range of $f(z)$ to be on the target manifold. Under ideal assumptions such a process provably converges to the target distribution. This strategy results in a model capable of generating an output in one step, maintaining a consistent latent space, while also allowing sequential applications for refinement. Additionally, we find that by processing inputs from both target and source distributions, the model adeptly projects corrupted or modified data back to the target manifold. This work is a first step towards a “global projector” that enables projecting any input into a target data distribution.
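A small sketch may help make "sequential applications for refinement" concrete. It assumes a hypothetical, imperfectly trained map `imperfect_f` that moves an input only half of the way toward a toy target set (the unit interval per coordinate). Neither the function nor the `step` parameter comes from the paper. Because such a map is only approximately idempotent, the first output is not yet a fixed point, and each further application shrinks the gap between an output and its re-application.

```python
def clamp01(v):
    """Toy target set: each coordinate restricted to [0, 1]."""
    return [min(1.0, max(0.0, c)) for c in v]


def imperfect_f(v, step=0.5):
    """Hypothetical imperfect model: moves part of the way to the target set."""
    target = clamp01(v)
    return [c + step * (t - c) for c, t in zip(v, target)]


def drift(v):
    """Squared distance between f(v) and v; zero exactly at a fixed point."""
    return sum((a - b) ** 2 for a, b in zip(imperfect_f(v), v))


z = [3.0, -2.0, 0.25]        # assumed "noise" input
outputs = [imperfect_f(z)]   # one-step generation
for _ in range(4):           # optional sequential refinement
    outputs.append(imperfect_f(outputs[-1]))

drifts = [drift(y) for y in outputs]
print(outputs)
print(drifts)
```

A perfectly idempotent map would have zero drift after the first application. Here the drift is nonzero but falls with every extra pass, which is the behavior refinement relies on in this toy setting.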