Ioannis Pitas (pitas@csd.auth.gr), Aristotle University of Thessaloniki, Greece

Tutorial description

2D convolutions play an extremely important role in computer vision and machine learning:

  • they form the first layers of Convolutional Neural Networks (CNNs);
  • they are essential for a number of computer vision tasks, notably template matching and correlation object trackers;
  • they are basic tools for any linear signal/image/video processing and analysis task, e.g., filtering/denoising/restoration.

3D convolutions are very important for machine learning (video analysis through CNNs) and for video filtering/denoising/restoration. 1D convolutions are extensively used in digital signal processing (filtering/denoising) and analysis (also through CNNs).

Therefore, 1D/2D/3D convolution algorithms are very important both for computer vision and machine learning and for signal/image/video processing and analysis. As the computational complexity of their direct computation is of the order O(N^2), O(N^4) and O(N^6), respectively (for signals and kernels with N samples per dimension), their fast execution is a must. One solution is their parallel or distributed computation, e.g., on GPUs or multicore CPUs. This is particularly important in embedded systems, e.g., for on-board execution in autonomous drones and robots.
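To make the O(N^2) figure concrete, here is a minimal Python sketch of a direct 1D linear convolution that also counts the multiplications it performs. The function name `direct_conv1d` is hypothetical, chosen only for illustration; it is not taken from the tutorial material.

```python
def direct_conv1d(x, h):
    """Direct linear convolution: y[n] = sum_k h[k] * x[n - k].

    Returns the output samples and the number of multiplications used.
    """
    n_out = len(x) + len(h) - 1          # length of a linear convolution
    y = [0.0] * n_out
    mults = 0
    for i, xi in enumerate(x):           # every input sample ...
        for k, hk in enumerate(h):       # ... meets every kernel tap
            y[i + k] += xi * hk
            mults += 1
    return y, mults


y, mults = direct_conv1d([1, 2, 3], [1, 1, 1])
print(y, mults)
```

For two length-N inputs the double loop performs N·N multiplications. Repeating the same nested loops over both image axes gives the O(N^4) count for an N×N image and an N×N kernel.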

This area was a popular research topic in the DSP community in the 1980s. Prof. Pitas pioneered this area through his PhD work, devising optimal m-D convolution algorithms with theoretically minimal computational complexity (Pitas m-D convolutions). Recently (2016+), fast convolution algorithms have made a very strong comeback, primarily for embedded GPU convolution computing for real-time machine learning and inference. However, this comeback has been rather shallow, as many people lack the proper mathematical background (algebra, number theory). Furthermore, as these tools are used by a large number of scientists (many of them lacking the appropriate background) through DNN platforms and CV tools, huge inaccuracies and inconsistencies (amounting essentially to gross errors) appear very frequently in the literature and remain largely unnoticed.

This tutorial will give an overview of linear and cyclic convolution, aiming to remedy the mathematical consistency problem described above. It will then present their fast execution through:

a) FFTs, resulting in algorithms with computational complexity of the order O(N log2 N) and O(N^2 log2 N) for 1D and 2D convolutions, respectively, and

b) block-based methods. Optimal Winograd 1D and 2D convolution algorithms, having a theoretically minimal number of computations, will be presented.
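As a rough illustration of item a) and of the difference between linear and cyclic convolution, the sketch below uses a textbook recursive radix-2 FFT written in plain Python. Multiplying two DFTs pointwise yields a cyclic convolution; zero-padding both inputs to at least len(x) + len(h) - 1 samples makes the cyclic result coincide with the linear one. All function names are hypothetical, and the code favors readability over speed; it is not the tutorial's own implementation.

```python
import cmath


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def cyclic_conv_fft(x, h):
    """Cyclic convolution of two equal-length (power-of-two) sequences."""
    n = len(x)
    spectrum = [a * b for a, b in zip(fft(x), fft(h))]
    y = fft(spectrum, inverse=True)
    return [v.real / n for v in y]       # 1/n completes the inverse DFT


def linear_conv_fft(x, h):
    """Linear convolution computed as a zero-padded cyclic convolution."""
    n_out = len(x) + len(h) - 1
    size = 1
    while size < n_out:                  # next power of two >= n_out
        size *= 2
    xp = list(x) + [0] * (size - len(x))
    hp = list(h) + [0] * (size - len(h))
    return cyclic_conv_fft(xp, hp)[:n_out]


# Without padding, the tail of the result wraps around onto the first sample.
print(cyclic_conv_fft([1, 2, 3, 4], [1, 1, 0, 0]))
# With padding, the result matches a direct linear convolution.
print(linear_conv_fft([1, 2, 3], [1, 1, 1]))
```

The three transforms cost O(N log2 N) each and the pointwise product costs O(N), which is where the overall O(N log2 N) figure comes from. Results are floating-point and agree with a direct computation only up to rounding error.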

The tutorial will consist of 2 talks, as detailed below:

1D convolution algorithms

Signals and systems, 1D convolutions, Linear & Cyclic 1D convolutions, Discrete Fourier Transform, Fast Fourier Transform, Winograd algorithm, Nested convolutions, Block convolutions.
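As a small taste of the Winograd algorithm listed above, the following sketch implements the well-known minimal filtering scheme usually written F(2,3): two outputs of a 3-tap FIR filter are obtained with 4 multiplications instead of the 6 needed by the direct method. It is written in the sliding dot-product (correlation) form common in CNN practice; a true convolution would reverse the filter taps first. The function names are hypothetical, and this particular variant is an assumption about what the talk may cover, not a summary of it.

```python
def winograd_f23(d, g):
    """Two outputs of a 3-tap filter over 4 inputs, using 4 multiplications.

    Computes y[i] = d[i]*g[0] + d[i+1]*g[1] + d[i+2]*g[2] for i = 0, 1.
    """
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter-side terms: computed once per filter and reused for every tile,
    # so they are not counted among the per-tile multiplications.
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    # The four data-dependent multiplications.
    m1 = (d0 - d2) * g0
    m2 = (d1 + d2) * u1
    m3 = (d2 - d1) * u2
    m4 = (d1 - d3) * g2
    return [m1 + m2 + m3, m2 - m3 - m4]


def direct_f23(d, g):
    """Reference: the same two outputs with 6 multiplications."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


d = [1.0, 2.0, 3.0, 4.0]
g = [1.0, 0.0, -1.0]
print(winograd_f23(d, g), direct_f23(d, g))
```

Longer signals are handled by sliding this 4-sample tile along the input in steps of 2, which is one instance of the block-based processing mentioned in the outline. The saving in multiplications is paid for with extra additions.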

Fast 2D/3D convolution algorithms

2D linear systems, 2D convolutions, Linear & Cyclic 2D convolutions, 2D Discrete Fourier Transform, 2D Fast Fourier Transform, Winograd algorithms, optimal Pitas m-D convolution algorithms, Block 2D convolution methods, Convolutions in Convolutional Neural Networks (CNNs): convolutional layers, tensors and mathematical formulations. Template matching. Correlation object trackers.

Short bio for lecturer

Prof. Ioannis Pitas (IEEE Fellow, IEEE Distinguished Lecturer, EURASIP Fellow) received the Diploma and PhD degrees in Electrical Engineering, both from the Aristotle University of Thessaloniki, Greece. Since 1994, he has been a Professor at the Department of Informatics of the same University. He has served as a Visiting Professor at several universities.

His current interests are in the areas of image/video processing, machine learning, computer vision, intelligent digital media, human-centered interfaces, affective computing, 3D imaging and biomedical imaging. He has published over 906 papers, contributed to 47 books in his areas of interest and edited or (co-)authored another 11 books. He has also been a member of the program committee of many scientific conferences and workshops. In the past, he served as Associate Editor or co-Editor of 9 international journals and as General or Technical Chair of 4 international conferences. He has participated in 70 R&D projects, primarily funded by the European Union, and is/was principal investigator/researcher in 42 such projects. He has 30500+ citations to his work and an h-index of 82+ (Google Scholar).

Prof. Pitas leads the large European H2020 R&D project MULTIDRONE: https://multidrone.eu/. He is chair of the Autonomous Systems initiative: https://ieeeasi.signalprocessingsociety.org/

Prof. Pitas has delivered 90 invited talks at prestigious conferences and universities worldwide (22 in the past 3 years), as well as 14 short courses and tutorials (11 in the past 3 years, e.g., at ICCV2017, ACCV2018, WACV2019, University of Maryland, Tampere University of Technology, National Taipei University of Technology).

Relevant links

Aerial core project: https://aerial-core.eu/

Multidrone project: https://multidrone.eu/

AIIA Lab, Aristotle University of Thessaloniki: http://www.aiia.csd.auth.gr/EN/

Google scholar: https://scholar.google.gr/citations?user=lWmGADwAAAAJ&hl=el

Target audience

The target audience is researchers, engineers and computer scientists working in the areas of computer vision, machine learning, and image and video processing and analysis who would like to enter the new and exciting field of fast convolution algorithms with applications in computer vision and machine learning. Necessary background: an introductory background in signals and systems, computer vision and machine learning is welcome.

Material to be distributed to attendees

The presentation slides (in PPT or PDF) will be handed to the participants in electronic form. A list of related publications will also be distributed.