Generative Models

Generative Adversarial Networks

Generative Adversarial Networks (GANs) are generative models trained through a two-player game between a generator ($G$) and a discriminator ($D$).

  • The Generator ($G$) maps random latent vectors drawn from a prior $p_z$ to synthetic samples.

  • The Discriminator ($D$) estimates whether a sample came from the real data distribution $p_{data}$ or from the generator.
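The two roles can be sketched as a pair of small PyTorch modules. This is a minimal illustration rather than a recommended architecture: the layer sizes, the 2-D sample space, the standard normal choice for the latent prior, and the names `latent_dim`, `data_dim`, `make_generator`, and `make_discriminator` are all assumptions made for the example.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 8, 2  # assumed sizes for a toy 2-D problem

def make_generator() -> nn.Module:
    # G: latent vector z -> synthetic sample
    return nn.Sequential(
        nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim)
    )

def make_discriminator() -> nn.Module:
    # D: sample -> probability that the sample is real
    return nn.Sequential(
        nn.Linear(data_dim, 32), nn.LeakyReLU(0.2), nn.Linear(32, 1), nn.Sigmoid()
    )

G, D = make_generator(), make_discriminator()
z = torch.randn(16, latent_dim)  # 16 draws from the prior (assumed standard normal)
fake = G(z)                      # synthetic samples, shape (16, 2)
p_real = D(fake)                 # D's estimate that each sample is real, shape (16, 1)
print(fake.shape, p_real.shape)
```

Before any training, the discriminator's outputs on these samples carry no information; they only show that the shapes line up.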

Best use

Generating sharp, high-fidelity synthetic images and audio.

Watch out for

Training is often unstable and prone to mode collapse, where the generator produces only a few near-identical kinds of output instead of covering the full data distribution.

Intuition

How to think about this algorithm

Think of the interaction as a game between a counterfeiter and a detective:

  1. Initial Phase: The counterfeiter makes highly unconvincing copies (random noise). The detective easily spots them.

  2. Adversarial Feedback: The counterfeiter receives feedback (their fakes were rejected) and improves their paper and printing methods. The detective also learns to spot newer, more subtle flaws.

  3. Equilibrium: Eventually, the counterfeiter makes flawless bills that are indistinguishable from real currency. The detective has to guess randomly (50% accuracy).

In the ideal equilibrium, generated samples are indistinguishable from real samples, so the discriminator can do no better than guessing. In practice, GANs rarely reach that ideal exactly.

[Interactive diagram: Generative Adversarial Training (GAN). Generated points (pink) move to match a ring-shaped real distribution (cyan) as training proceeds, with the $D = 0.5$ decision boundary overlaid. Contour shading shows the discriminator's output: green where it is confident a point is real, pink where it predicts fake. The generator follows the discriminator's gradients toward regions scored as real. Controls: discriminator learning rate (0.050) and generator gradient step (0.080).]

The Logic

Mathematical core for generative adversarial networks

1. The Minimax Objective Function

The adversarial training process is formulated as a minimax game over the value function $V(D, G)$:

$$\min_{G} \max_{D} V(D, G) = E_{x \sim p_{data}}[\log D(x)] + E_{z \sim p_z}[\log(1 - D(G(z)))]$$

Where:

  • $D(x)$ is the probability that $x$ came from the real data rather than from the generator's distribution $p_g$.

  • $G(z)$ is the generated sample from latent noise $z$.

  • $E_{x \sim p_{data}}[\log D(x)]$ is the expected log-probability that the discriminator assigns to real samples being real.

  • $E_{z \sim p_z}[\log(1 - D(G(z)))]$ is the expected log-probability that the discriminator assigns to generated samples being fake.
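As a worked consequence of the objective above, the following derivation sketches why an ideal discriminator outputs 1/2 at equilibrium. It follows the standard argument from Goodfellow et al. (2014) and assumes that $D$ can represent any function; $p_g$ denotes the distribution of $G(z)$ for $z \sim p_z$.

```latex
% For a fixed generator G, write V as a single integral over x
V(D, G) = \int_x \left[ p_{data}(x) \log D(x) + p_g(x) \log(1 - D(x)) \right] dx

% For a, b > 0, the map y \mapsto a \log y + b \log(1 - y) on (0, 1)
% is maximized at y = a / (a + b), so the optimal discriminator is
D^*_G(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}

% When the generator matches the data, p_g = p_{data}
D^*_G(x) = \frac{1}{2}, \qquad
V(D^*_G, G) = \log \tfrac{1}{2} + \log \tfrac{1}{2} = -\log 4
```

The value $-\log 4$ is the global minimum of the generator's objective under these idealized assumptions; finite networks and finite data only approximate it.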

2. Training Alternations

In practice, training alternates between:

  1. Discriminator step (maximize over $D$): Update the parameters of $D$ to maximize $\log D(x) + \log(1 - D(G(z)))$, averaged over a minibatch.

  2. Generator step (minimize over $G$): Update the parameters of $G$ to minimize $\log(1 - D(G(z)))$ (or, in practice, maximize $\log D(G(z))$ to prevent vanishing gradients early in training).
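The vanishing-gradient remark in the generator step can be checked numerically. The sketch below compares the two generator losses by their gradient with respect to the discriminator's logit, using plain Python and a finite difference. The logit value of -5 is a hypothetical stand-in for early training, when the discriminator confidently rejects generated samples, and the function names are illustrative.

```python
import math

def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))

def saturating_loss(a: float) -> float:
    # Generator minimizes log(1 - D(G(z))), with D(G(z)) = sigmoid(a)
    return math.log(1.0 - sigmoid(a))

def non_saturating_loss(a: float) -> float:
    # Generator minimizes -log D(G(z)), i.e. maximizes log D(G(z))
    return -math.log(sigmoid(a))

def grad(f, a: float, eps: float = 1e-5) -> float:
    # Central finite difference with respect to the logit a
    return (f(a + eps) - f(a - eps)) / (2.0 * eps)

logit = -5.0  # hypothetical early-training value: D(G(z)) is below 0.01
g_sat = grad(saturating_loss, logit)
g_ns = grad(non_saturating_loss, logit)
print(f"D(G(z)) = {sigmoid(logit):.4f}")
print(f"saturating gradient:     {g_sat:.4f}")
print(f"non-saturating gradient: {g_ns:.4f}")
```

Analytically, the saturating gradient is $-D(G(z))$ and the non-saturating gradient is $-(1 - D(G(z)))$, so when the discriminator's output on generated samples is near zero the first almost vanishes while the second stays close to -1.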

Code Example

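generative_adversarial_networks.py · PyTorch example

```python
# One GAN training step (PyTorch)
import torch
import torch.nn as nn

# D and G are nn.Modules defined elsewhere; D must end in a sigmoid and
# return shape (batch, 1). optimizer_D and optimizer_G are torch.optim
# optimizers over the parameters of D and G respectively.
criterion = nn.BCELoss()

def train_step(real_data, noise):
    # 1. Train the discriminator: real samples -> 1, generated samples -> 0
    optimizer_D.zero_grad()
    real_pred = D(real_data)
    d_real_loss = criterion(real_pred, torch.ones_like(real_pred))
    fake_data = G(noise)
    fake_pred = D(fake_data.detach())  # detach: no gradient reaches G here
    d_fake_loss = criterion(fake_pred, torch.zeros_like(fake_pred))
    loss_D = d_real_loss + d_fake_loss
    loss_D.backward()
    optimizer_D.step()

    # 2. Train the generator (non-saturating loss): push D(G(z)) toward 1
    optimizer_G.zero_grad()
    gen_pred = D(fake_data)
    loss_G = criterion(gen_pred, torch.ones_like(gen_pred))
    loss_G.backward()
    optimizer_G.step()
    return loss_D.item(), loss_G.item()
```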
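To see a training step run end to end, the following self-contained sketch trains on a toy 2-D ring. It is a variant of the step above, not the same code: it swaps in `nn.BCEWithLogitsLoss` with a discriminator that outputs raw logits, a common and numerically more stable alternative to a final sigmoid followed by `nn.BCELoss`. The network sizes, learning rates, step count, and the `sample_ring` helper are all assumptions chosen for illustration.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)
latent_dim = 4  # assumed latent size

# Hypothetical toy networks; D returns raw logits (no final sigmoid)
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 2))
D = nn.Sequential(nn.Linear(2, 32), nn.LeakyReLU(0.2), nn.Linear(32, 1))
optimizer_G = torch.optim.Adam(G.parameters(), lr=1e-3)
optimizer_D = torch.optim.Adam(D.parameters(), lr=1e-3)
criterion = nn.BCEWithLogitsLoss()  # applies the sigmoid internally

def sample_ring(n: int) -> torch.Tensor:
    # Real data: noisy points on the unit circle
    theta = 2 * math.pi * torch.rand(n)
    ring = torch.stack([theta.cos(), theta.sin()], dim=1)
    return ring + 0.05 * torch.randn(n, 2)

def train_step(real_data, noise):
    # Discriminator update: real -> 1, generated -> 0
    fake_data = G(noise)
    real_logits, fake_logits = D(real_data), D(fake_data.detach())
    loss_D = (criterion(real_logits, torch.ones_like(real_logits))
              + criterion(fake_logits, torch.zeros_like(fake_logits)))
    optimizer_D.zero_grad()
    loss_D.backward()
    optimizer_D.step()

    # Generator update (non-saturating): make D label generated samples real
    gen_logits = D(fake_data)
    loss_G = criterion(gen_logits, torch.ones_like(gen_logits))
    optimizer_G.zero_grad()
    loss_G.backward()
    optimizer_G.step()
    return loss_D.item(), loss_G.item()

history = [train_step(sample_ring(64), torch.randn(64, latent_dim))
           for _ in range(200)]
print("final loss_D, loss_G:", history[-1])
```

The loss values alone do not indicate sample quality; plotting `G(noise)` against `sample_ring` output is a more direct check on a 2-D problem like this one.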

Strengths

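  • Generates sharp, high-fidelity synthetic images and audio; samples are often sharper than those from likelihood-based models such as Variational Autoencoders.

  • Does not require an explicit density model or the approximation of intractable integrals, unlike Variational Autoencoders, which optimize a bound on the data likelihood.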

  • Latent space interpolation allows smooth blending and semantic manipulation of generated attributes.
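Latent interpolation can be sketched in a few lines. The example below walks linearly between two latent vectors and decodes each point. The generator here is an untrained stand-in used only to show shapes, and `interpolate_latents`, `latent_dim`, and the layer sizes are hypothetical names and values; with a trained generator the decoded samples would blend from one output to the other.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
latent_dim = 8  # assumed latent size

# Stand-in for a trained generator (untrained here, for shape illustration only)
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 2))

def interpolate_latents(z_start: torch.Tensor, z_end: torch.Tensor,
                        steps: int) -> torch.Tensor:
    # Linear interpolation: z_t = (1 - t) * z_start + t * z_end, t in [0, 1]
    t = torch.linspace(0.0, 1.0, steps).unsqueeze(1)  # shape (steps, 1)
    return (1.0 - t) * z_start + t * z_end            # shape (steps, latent_dim)

z_a, z_b = torch.randn(latent_dim), torch.randn(latent_dim)
path = interpolate_latents(z_a, z_b, steps=5)
with torch.no_grad():
    samples = G(path)  # one generated sample per interpolation step
print(path.shape, samples.shape)
```

Linear interpolation is the simplest choice; other schemes exist, and which works best depends on the prior and the model.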

Limitations

  • Training is often unstable and prone to mode collapse, where the generator produces only a few near-identical kinds of output instead of covering the full data distribution.

  • Evaluation is indirect; sample quality is often assessed with proxy metrics such as FID or by downstream performance.

  • Non-convergence: alternating gradient updates can cycle or oscillate around the equilibrium instead of converging to a Nash equilibrium.
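The non-convergence point can be illustrated on a much simpler two-player game than a GAN. The sketch below runs simultaneous gradient descent-ascent on $f(x, y) = xy$, where $x$ minimizes and $y$ maximizes and the only equilibrium is $(0, 0)$. This toy game, the step size, and the function name `simultaneous_gda` are assumptions for illustration; it shows the failure mode, not actual GAN dynamics.

```python
import math

def simultaneous_gda(x: float, y: float, lr: float, steps: int) -> list:
    # Toy game f(x, y) = x * y: x minimizes f, y maximizes f.
    # The unique equilibrium is (0, 0).
    radii = [math.hypot(x, y)]
    for _ in range(steps):
        grad_x, grad_y = y, x                    # df/dx = y, df/dy = x
        x, y = x - lr * grad_x, y + lr * grad_y  # both players update at once
        radii.append(math.hypot(x, y))
    return radii

radii = simultaneous_gda(x=1.0, y=1.0, lr=0.1, steps=100)
print(f"distance from equilibrium: start {radii[0]:.3f}, end {radii[-1]:.3f}")
```

Each update multiplies the distance from the equilibrium by $\sqrt{1 + \text{lr}^2}$, so the iterates spiral outward instead of settling, even though each player follows its own gradient correctly.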

Key Assumptions

Scope conditions and interpretation notes

  1. The training dataset contains sufficient samples to represent the underlying manifold geometry.

  2. Training dynamics can be stabilized enough that the generator improves instead of collapsing to a few modes.

References

Books and papers for deeper study

  • Goodfellow, I. et al. (2014) 'Generative adversarial nets', in Advances in Neural Information Processing Systems (NeurIPS), pp. 2672-2680.