Generative Adversarial Networks
Generative Adversarial Networks (GANs) are generative models trained through a two-player game between a generator ($G$) and a discriminator ($D$).
- The Generator ($G$) maps random latent vectors $z$, drawn from a prior distribution $p_z$, into synthetic samples.
- The Discriminator ($D$) estimates whether a sample came from the real data distribution or from the generator.
Intuition
How to think about this algorithm
Think of the interaction as a game between a counterfeiter (the generator) and a detective (the discriminator):
- Initial Phase: The counterfeiter makes highly unconvincing copies (random noise). The detective easily spots them.
- Adversarial Feedback: The counterfeiter receives feedback (their fakes were rejected) and improves their paper and printing methods. The detective also learns to spot newer, more subtle flaws.
- Equilibrium: Eventually, the counterfeiter makes flawless bills that are indistinguishable from real currency. The detective has to guess randomly (50% accuracy).
In the ideal equilibrium, generated samples are indistinguishable from real samples, so the discriminator can do no better than guessing. In practice, GANs rarely reach that ideal exactly.
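The "no better than guessing" claim can be made concrete with the standard result from the original GAN analysis: for a fixed generator, the best possible discriminator is $D^*(x) = p_{\text{data}}(x) / (p_{\text{data}}(x) + p_g(x))$, where $p_g$ is the density of generated samples. The sketch below evaluates that formula for two hypothetical one-dimensional Gaussian generators (the means, standard deviations, and function names are illustrative assumptions, not part of the original text).

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def optimal_discriminator(x, p_data, p_g):
    """D*(x) = p_data(x) / (p_data(x) + p_g(x)) for a fixed generator."""
    real, fake = p_data(x), p_g(x)
    return real / (real + fake)

# Hypothetical densities: real data is N(0, 1).
p_data = lambda x: gaussian_pdf(x, mean=0.0, std=1.0)
p_g_early = lambda x: gaussian_pdf(x, mean=3.0, std=1.0)    # generator far from the data
p_g_matched = lambda x: gaussian_pdf(x, mean=0.0, std=1.0)  # generator matches the data

for x in (-1.0, 0.0, 1.5, 3.0):
    print(x,
          optimal_discriminator(x, p_data, p_g_early),
          optimal_discriminator(x, p_data, p_g_matched))
```

With the mismatched generator, the optimal discriminator is confident almost everywhere; once the two densities coincide, it outputs 0.5 for every input, which is the detective guessing at random.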
The Logic
Mathematical core for generative adversarial networks
1. The Minimax Objective Function
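The adversarial training process is formulated as a minimax game over the value function $V(D, G)$:
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$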
Where:
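- $D(x)$ is the probability that $x$ came from the real data rather than from the generator's distribution $p_g$.
- $G(z)$ is the generated sample from latent noise $z$.
- $\mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)]$ is the expected log-probability of classifying real samples correctly.
- $\mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$ is the expected log-probability of detecting fake samples.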
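To see how the two expectation terms behave, here is a minimal sketch that estimates the value function from a batch of discriminator outputs. The function name `value_function` and the sample probabilities are illustrative assumptions; in a real setup the inputs would be $D(x)$ on real samples and $D(G(z))$ on generated ones.

```python
import math

def value_function(d_real, d_fake):
    """Monte Carlo estimate of V(D, G) from discriminator outputs.

    d_real: values of D(x) on real samples.
    d_fake: values of D(G(z)) on generated samples.
    All values are assumed to lie strictly between 0 and 1.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A confident, correct discriminator pushes V toward its upper bound of 0.
v_confident = value_function([0.99, 0.98], [0.01, 0.02])
# A discriminator that always outputs 0.5 gives V = -log 4.
v_guessing = value_function([0.5, 0.5], [0.5, 0.5])
print(v_confident, v_guessing)
```

Both terms are logs of probabilities, so the estimate is never positive. The discriminator tries to push it up toward 0, while the generator tries to pull it down by making $D(G(z))$ large.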
2. Training Alternations
In practice, training alternates between:
- Maximizing over $D$: Update the parameters of $D$ to maximize $V(D, G)$.
- Minimizing over $G$: Update the parameters of $G$ to minimize $\log(1 - D(G(z)))$ (or, in practice, maximize $\log D(G(z))$ to prevent vanishing gradients early in training).
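The vanishing-gradient remark can be checked with a little calculus. Assume the discriminator ends in a sigmoid, so that $D(G(z)) = \sigma(a)$ for some logit $a$ (an assumption made here for illustration). The sketch below compares the gradient with respect to $a$ of the original generator loss, $\log(1 - \sigma(a))$, against the alternative loss, $-\log \sigma(a)$, at a point where the discriminator confidently rejects the fake.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def grad_saturating(a):
    # d/da log(1 - sigmoid(a)) = -sigmoid(a)
    return -sigmoid(a)

def grad_non_saturating(a):
    # d/da [-log sigmoid(a)] = -(1 - sigmoid(a))
    return -(1.0 - sigmoid(a))

# Early in training D rejects fakes confidently: logit -6 means D(G(z)) is about 0.0025.
a = -6.0
print(grad_saturating(a), grad_non_saturating(a))
```

At this logit the original loss passes back a gradient close to zero, while the alternative passes back a gradient close to -1, so the generator still receives a usable learning signal when its samples are easy to reject.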
Code Example
generative_adversarial_networks.py · pytorch example
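# One GAN training step in PyTorch. The tiny MLPs and hyperparameters below are
# illustrative placeholders; substitute your own G, D, and optimizers.
import torch
import torch.nn as nn

latent_dim, data_dim = 8, 2
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 32), nn.LeakyReLU(0.2), nn.Linear(32, 1), nn.Sigmoid())
optimizer_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
optimizer_D = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
criterion = nn.BCELoss()  # expects D to output probabilities in (0, 1)

def train_step(real_data, noise):
    # 1. Train the discriminator: real samples -> label 1, generated samples -> label 0
    fake_data = G(noise)
    real_pred = D(real_data)
    fake_pred = D(fake_data.detach())  # detach so this step does not backpropagate into G
    loss_D = (criterion(real_pred, torch.ones_like(real_pred))
              + criterion(fake_pred, torch.zeros_like(fake_pred)))
    optimizer_D.zero_grad()
    loss_D.backward()
    optimizer_D.step()

    # 2. Train the generator with the non-saturating loss: label its fakes as real
    gen_pred = D(fake_data)
    loss_G = criterion(gen_pred, torch.ones_like(gen_pred))
    optimizer_G.zero_grad()
    loss_G.backward()
    optimizer_G.step()
    return loss_D.item(), loss_G.item()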
Strengths
Can generate sharp, high-fidelity synthetic images and audio, often sharper than the blurrier samples typical of Variational Autoencoders.
Does not require explicit density modeling or approximating intractable integrals, unlike likelihood-based models such as Variational Autoencoders.
Latent space interpolation allows smooth blending and semantic manipulation of generated attributes.
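Latent interpolation is simple to sketch: pick two latent vectors, walk along the straight line between them, and decode each intermediate point with the generator. The example below only builds the interpolation path. The helper name `interpolate_latents`, the latent size, and the random endpoints are hypothetical, and the decoding step through a trained $G$ is left as a comment.

```python
import random

def interpolate_latents(z_start, z_end, num_steps):
    """Return num_steps latent vectors evenly spaced on the line from z_start to z_end.

    Requires num_steps >= 2 so that both endpoints are included.
    """
    path = []
    for i in range(num_steps):
        t = i / (num_steps - 1)  # goes from 0.0 to 1.0
        path.append([(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path

random.seed(0)
latent_dim = 8  # hypothetical latent size
z_start = [random.gauss(0.0, 1.0) for _ in range(latent_dim)]
z_end = [random.gauss(0.0, 1.0) for _ in range(latent_dim)]

path = interpolate_latents(z_start, z_end, num_steps=5)
# Each vector in `path` would be passed through a trained generator G to render one frame.
print(len(path), len(path[0]))
```

Straight-line interpolation is the simplest choice; whether the decoded outputs actually blend smoothly depends on the trained generator.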
Limitations
Often unstable to train and prone to mode collapse, where the generator covers only a few modes of the data and produces near-identical samples.
Evaluation is indirect; sample quality is often assessed with proxy metrics such as FID or by downstream performance.
Non-convergence: simultaneous gradient updates can cycle or oscillate in parameter space instead of converging to a Nash equilibrium.
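The non-convergence point can be illustrated on a toy problem that is much simpler than a GAN: the bilinear game $\min_x \max_y \, xy$, whose only equilibrium is $(0, 0)$. The sketch below applies simultaneous gradient descent on $x$ and ascent on $y$. The game, step size, and starting point are illustrative assumptions, not a model of any particular GAN.

```python
import math

def simultaneous_gda(x, y, lr, steps):
    """Simultaneous gradient descent (on x) and ascent (on y) for f(x, y) = x * y.

    Returns the distance from the equilibrium (0, 0) after every step.
    """
    norms = [math.hypot(x, y)]
    for _ in range(steps):
        grad_x, grad_y = y, x  # df/dx = y, df/dy = x
        x, y = x - lr * grad_x, y + lr * grad_y  # both players update at once
        norms.append(math.hypot(x, y))
    return norms

norms = simultaneous_gda(x=1.0, y=1.0, lr=0.1, steps=100)
print(norms[0], norms[-1])
```

Each update multiplies the distance from the equilibrium by $\sqrt{1 + \text{lr}^2}$, so the iterates spiral outward instead of settling, even though each player is following its own gradient.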
Key Assumptions
Scope conditions and interpretation notes
1. The training dataset contains enough samples to represent the geometry of the underlying data manifold.
2. Training dynamics can be stabilized enough that the generator improves instead of collapsing to a few modes.
References
Books and papers for deeper study
Goodfellow, I. et al. (2014) 'Generative adversarial nets', in Advances in Neural Information Processing Systems (NeurIPS), pp. 2672-2680.