Abstract

This lecture gives an overview of Generative Adversarial Networks (GANs), which have many applications in media production. It covers the following topics in detail: theoretical ML background (cross-entropy loss for binary classification), deepfakes, the generator function, the discriminator function, and GAN training using minimax optimization or heuristic optimization. The most notable GAN architectures are presented: cGAN, IcGAN, Convolutional GANs, LSTM-GAN, TP-GAN, Pix2Pix, CycleGAN, StarGAN, GauGAN, DeblurGAN, ID-CGAN, PerceptualGAN, 3D-GAN, MidiNet, StyleGAN, DiscoGAN, PG2.

Their applications include: a) image/video captioning; b) text-to-image synthesis, facial image synthesis, image restyling, image-to-image translation, image de-raining/de-fogging, anime character creation, pose-guided image generation, terrain generation, face aging, image inpainting, image super-resolution, image blending, and object detection; c) video frame prediction, conditional video generation, and video transformation; d) 3D object creation; and e) music score generation.
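
The cross-entropy loss and the two generator objectives named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: it assumes that "heuristic optimization" refers to the commonly used non-saturating generator loss, and all function names are hypothetical. The inputs are discriminator outputs, i.e. probabilities that a sample is real.

```python
import math

EPS = 1e-12  # assumed clamp value to keep log() finite


def _clamp(p):
    """Keep a probability strictly inside (0, 1)."""
    return min(max(p, EPS), 1.0 - EPS)


def binary_cross_entropy(y_true, p_pred):
    """Mean binary cross-entropy between labels in {0, 1} and probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = _clamp(p)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def discriminator_loss(d_real, d_fake):
    """Cross-entropy over one batch: real samples labeled 1, generated samples labeled 0."""
    labels = [1] * len(d_real) + [0] * len(d_fake)
    return binary_cross_entropy(labels, list(d_real) + list(d_fake))


def generator_loss_minimax(d_fake):
    """Minimax generator loss: mean of log(1 - D(G(z))), minimized by the generator."""
    return sum(math.log(1.0 - _clamp(p)) for p in d_fake) / len(d_fake)


def generator_loss_heuristic(d_fake):
    """Non-saturating generator loss: mean of -log D(G(z)), minimized by the generator."""
    return binary_cross_entropy([1] * len(d_fake), d_fake)


# Example: a discriminator that confidently rejects the generated samples.
d_real = [0.9, 0.8]
d_fake = [0.01, 0.02]
print("discriminator loss:", discriminator_loss(d_real, d_fake))
print("generator loss (minimax):", generator_loss_minimax(d_fake))
print("generator loss (heuristic):", generator_loss_heuristic(d_fake))
```

When the discriminator rejects generated samples with high confidence, the minimax generator loss is close to zero and nearly flat, while the heuristic loss is large. This is the usual motivation for preferring the heuristic form early in training.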

GAN architecture.

GAN facial image synthesis.

Video restyling.

Generative-Adversarial-Networks-for-Multimedia-v3.1.1-Summary

Understanding Questionnaire

https://docs.google.com/forms/generative-adversarial-network