GigaGAN

Large-scale GAN for text-to-image synthesis

About GigaGAN

GigaGAN is a novel GAN architecture that far exceeds the previous limits of GANs and can produce ultra-HD images.

With 1 billion parameters, GigaGAN achieves a lower FID than Stable Diffusion v1.5, DALL·E 2, and Parti-750M. It generates 512px outputs in 0.13 s, orders of magnitude faster than diffusion and autoregressive models, and it inherits the disentangled, continuous, and controllable latent space of GANs. We also train a fast upsampler that can generate 4K images from the low-resolution outputs of text-to-image models.

Highlights: