Abstract

This lecture gives an overview of CUDA, which has many applications in GPU computing and deep neural networks. It covers the following topics in detail: CPU vs. GPU, GPU microarchitecture, CUDA installation and limitations, heterogeneous computing, CUDA program structure, CUDA execution flow, threads and blocks, CUDA built-in variables, CUDA memories, GPU memory allocation, a GPU optimization guide, and convolution/CNN implementation in CUDA.

CUDA processing flow.

CUDA Memories.

CUDA v2.4.1 - Summary