CVML Programming short course and workshop on Deep Learning and Computer Vision 2021

Short description

This three-day short course and workshop provides an in-depth presentation of programming tools and techniques for various computer vision and deep learning problems. The target application domains are autonomous systems (e.g., drone cinematography) and digital/social media. The short course consists of three parts (A, B, C), each having lectures and a programming workshop with hands-on lab exercises.

It is sponsored by the Horizon2020 ICT48 flagship R&D project AI4Media https://www.ai4media.eu/ and the International AI Doctoral Academy (AIDA) http://www.i-aida.org/

Part A will focus on Deep Learning and GPU programming. The lectures of this part provide a solid background on Deep Neural Networks (DNN) topics, notably convolutional NNs (CNNs) and deep learning for image classification. Also, parallel GPU and multi-core CPU architectures commonly used to train DNNs will be presented. Two programming workshops will take place. The first one will be on image classification using CNNs, while the second one will be on CUDA programming, focusing on 2D convolution algorithms.

Part B lectures will focus on deep learning algorithms for computer vision, namely on 2D object/face detection and 2D object tracking, giving the attendees the opportunity to master state-of-the-art object detectors and video trackers. The hands-on programming workshop will be on target detection with PyTorch and on how to use OpenCV (the most widely used library for computer vision) for target tracking.

Part C lectures will focus on autonomous UAV cinematography. Before a mission is executed, it is best simulated using drone mission simulation tools. Such simulations will be presented using AirSim. Additionally, participants will have the opportunity to understand video summarization techniques, which can be used to autonomously distill the important parts of the recorded video.

The lectures and programming tools will provide programming skills for the various computer vision and deep learning problems encountered in autonomous systems and autonomous drone applications, e.g., drone cinematography, drone inspection, land/marine surveillance, search and rescue, and 3D modeling.

Lectures and programming workshops will be in English. PDF files will be available at the end of the course.

Part A (8 hours), Deep Learning and GPU programming

Deep neural networks. Convolutional NNs.
Parallel GPU and multi-core CPU architectures. GPU programming.
Image classification with CNNs.
CUDA programming.

Part B (8 hours), Deep Learning for Computer Vision

Deep learning for object/face detection.
2D object tracking.
PyTorch: Understand the core functionalities of an object detector. Training and deployment.
OpenCV programming for object tracking.

Part C (8 hours), Autonomous UAV cinematography

Video summarization.
UAV cinematography.
Video summarization with PyTorch.
Drone cinematography with AirSim.

WHEN?

The course will take place on 25-27 August 2021.

WHERE?

All lectures and workshops will be delivered remotely.

You can find additional information about the city of Thessaloniki and details on how to get to the city here.

HOW?

Each registrant will use her/his own computer for a) participating in the course and b) running the programming exercises. A standard PC with a stable internet connection is required. The participants are also required to have a Google account for the workshop exercises. Finally, instructions on how to install AirSim can be found here: https://microsoft.github.io/AirSim/. All lectures and workshops will be delivered remotely using Zoom. The course link is the following: https://authgr.zoom.us/j/91483406779

PROGRAM

Date/time*	25/08/2021	26/08/2021	27/08/2021
Topic	Deep Learning and GPU programming	Deep Learning for Computer Vision	Autonomous UAV cinematography
8:00-8:30	Registration
	LECTURES	LECTURES	LECTURES
8:30-9:00	Introduction to autonomous systems
9:00-10:00	Deep neural networks – Convolutional NNs	Deep learning for object/face detection	Video summarization
10:00-11:00	Parallel GPU and multi-core CPU architectures – GPU programming	2D object tracking	UAV cinematography
11:00-11:30	Coffee break	Coffee break	Coffee break
	WORKSHOPS	WORKSHOPS	WORKSHOPS
11:30-13:30	Image classification with CNNs.	PyTorch: Understand the core functionalities of an object detector. Training and deployment.	Video summarization with PyTorch
13:30-14:30	Lunch break	Lunch break	Lunch break
14:30-16:30	CUDA programming	OpenCV programming for object tracking	Drone cinematography with AirSim

* Eastern European Summer Time (EEST, UTC+3 hours)

** This programme is indicative. Changes (hopefully small) in lectures/lecturers may be announced without prior notice.

REGISTRATION

Early registration (until 2/08/2021):

• Standard: 200 Euros

• Reduced registration for young professionals (up to 2 years after graduation): 100 Euros

• Unemployed or Undergraduate/MSc/PhD student*: 50 Euros

Late or on-site registration (after 2/08/2021):

• Standard: 210 Euros

• Reduced registration for young professionals (up to 2 years after graduation): 110 Euros

• Unemployed or Undergraduate/MSc/PhD student*: 60 Euros

After the completion of your payment, please fill in the form below:

Complete your registration

Up to 10 PhD students, registered in AUTH or in any VISION CSA https://www.vision4ai.eu or AI4Media https://ai4media.eu/ or HumanE-AI-Net https://www.humane-ai.eu/ University partners, are entitled to 1 free CVML Web Course registration per fall/spring semester on a first-come, first-served (FCFS) basis, with priority given to those working on AI-related topics. This offer is related to the upcoming educational activities of the International AI Doctoral Academy (AIDA) http://www.i-aida.org/ that is co-initiated by these projects.

Other special CVML Web Courses containing any CVML Web Lecture set (16 Lectures) can be made available upon request, to meet your personal learning needs, by sending a message as described in the section IF I HAVE A QUESTION?

A certificate of attendance will be provided.

Upon successful completion of more than 14 CVML Web lecture understanding questionnaires (mark equal to or above 5 in the range 0-10, 10 being excellent) within 2 months after receiving the CVML Web Course material, you can receive a certificate for its successful completion, with or without an entry of 3.5 ECTS (depending on your choice).

All lectures and workshops will be in English.

*** Due to the special COVID-19 circumstances, the 2021 edition of the «Programming short course and workshop on Deep Learning and Computer Vision for Autonomous Systems» will take place as a web course on 25-27 August 2021. Remote participation will be available via teleconferencing. ***

Cancellation policy:

50% refund for cancellation up to 15/07/2021
0% refund afterwards

Presentations

Presentations and lab notes will be available to the attendees.

TOPICS

Part A (first day, 2 lectures, 2 programming exercises) 25/08/2021
Deep Learning and GPU programming

The lectures of Part A provide a solid background on deep neural networks and on parallel GPU and multi-core CPU architectures, including GPU programming for NNs. Image classification with CNNs is covered using DNN programming tools, e.g., PyTorch, Keras, TensorFlow.

The two hands-on programming workshops will be on image classification with CNNs and on CUDA programming.

1. Deep neural networks. Convolutional NNs:
Abstract: From multi-layer Perceptrons to deep architectures. Fully connected layers. Convolutional layers. Tensors and mathematical formulations. Pooling. Training convolutional NNs. Initialization. Data augmentation. Batch Normalization. Dropout. Deployment on embedded systems. Lightweight deep learning. DNN programming tools (e.g., PyTorch, Keras, TensorFlow).
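
As a small worked example of the tensor shape arithmetic behind convolutional and pooling layers, the sketch below computes the spatial output size of a layer along one axis using the standard formula floor((n + 2p - k) / s) + 1. The function name and the layer settings (a 32x32 input, 3x3 convolutions, 2x2 pooling) are illustrative assumptions and are not taken from the course material.

```python
def output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (n + 2 * padding - kernel) // stride + 1


# Hypothetical network: a 32x32 input passed through two conv + pool stages.
size = 32
size = output_size(size, kernel=3, padding=1)  # 3x3 conv, padding 1 -> 32
size = output_size(size, kernel=2, stride=2)   # 2x2 max pooling     -> 16
size = output_size(size, kernel=3)             # 3x3 conv, no padding -> 14
size = output_size(size, kernel=2, stride=2)   # 2x2 max pooling     -> 7
print(size)  # 7
```

The final 7x7 feature map, multiplied by the number of channels, gives the input size of the first fully connected layer.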

2. Parallel GPU and multi-core CPU architectures. GPU programming:
Abstract: The unique architectural features of GPUs are emphasized through a CPU-GPU comparison. The GPU architecture is presented in detail, in terms of ALUs and memory types, in order to introduce the special characteristics of GPU programming. The audience becomes familiar with terms such as grid, block, thread, kernel, etc., and the general layout of a CUDA program is presented. CUDA keywords are explained by presenting simple CUDA programs. Finally, areas where GPU programming achieves outstanding performance are mentioned and 2D convolution algorithm implementations are demonstrated.
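
To make the terms grid, block, thread and kernel concrete, the following sketch emulates a one-dimensional CUDA launch sequentially in plain Python. It is not CUDA code: the helper names (launch_config, emulate_kernel, vector_add) are hypothetical, and the loops only mimic what a GPU would execute in parallel. The index computation mirrors the usual CUDA expression blockIdx.x * blockDim.x + threadIdx.x.

```python
def launch_config(n, threads_per_block=256):
    """Number of blocks needed so that blocks * threads_per_block >= n."""
    blocks = (n + threads_per_block - 1) // threads_per_block  # ceiling division
    return blocks, threads_per_block


def emulate_kernel(kernel, n, threads_per_block=256):
    """Run `kernel` once per (block, thread) pair; a GPU would do this in parallel."""
    blocks, block_dim = launch_config(n, threads_per_block)
    for block_idx in range(blocks):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx)


def vector_add(a, b):
    n = len(a)
    out = [0] * n

    def kernel(block_idx, block_dim, thread_idx):
        i = block_idx * block_dim + thread_idx  # global thread index
        if i < n:  # guard: threads in the last block may fall outside the data
            out[i] = a[i] + b[i]

    emulate_kernel(kernel, n, threads_per_block=4)
    return out


print(launch_config(1000))  # (4, 256)
print(vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))  # [11, 22, 33, 44, 55]
```

The bounds check inside the kernel matters because the grid is rounded up to a whole number of blocks, so some threads have no element to process.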

3. Image classification with CNNs:
Abstract:

4. CUDA programming:
Abstract: 2D and 3D convolutions are very important tools both for computer vision (e.g., for target tracking) and for deep learning (convolutional NNs). Learn how to implement a 2D convolution between an image and a mask with CUDA.
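
For orientation, a CPU reference of the 2D convolution between an image and a mask can be sketched in plain Python as below. This is not the workshop's CUDA implementation; it is a minimal sketch that assumes a "valid" output region (no padding) and list-of-lists inputs, and the function name conv2d is hypothetical. Such a reference is useful for checking the output of a GPU kernel.

```python
def conv2d(image, mask):
    """'Valid' 2D convolution of a list-of-lists image with a smaller mask."""
    ih, iw = len(image), len(image[0])
    mh, mw = len(mask), len(mask[0])
    oh, ow = ih - mh + 1, iw - mw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for y in range(oh):      # in a GPU version, each (y, x) output pixel
        for x in range(ow):  # would typically be computed by one thread
            acc = 0.0
            for i in range(mh):
                for j in range(mw):
                    # True convolution flips the mask in both directions.
                    acc += image[y + i][x + j] * mask[mh - 1 - i][mw - 1 - j]
            out[y][x] = acc
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
mask = [[1, 0],
        [0, -1]]
print(conv2d(image, mask))  # [[4.0, 4.0], [4.0, 4.0]]
```

Dropping the mask flip turns the operation into cross-correlation, which is what deep learning frameworks usually compute under the name "convolution".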

Part B (second day, 2 lectures, 2 programming exercises) 26/08/2021:
Deep Learning for Computer Vision

Part B lectures will focus on deep learning algorithms for computer vision, namely on deep learning for object/face detection and on 2D target tracking. Two programming workshops will take place. The first one will cover the core functionalities of an object detector with PyTorch, including training and deployment. The second one will be on OpenCV programming for object tracking.

1. Deep learning for object/face detection:
Abstract: Recently, Convolutional Neural Networks (CNNs) have been used for object/target (e.g., car, pedestrian, road sign) detection with great results. However, hardware constraints hinder the use of such CNN models on embedded processors for real-time processing. Therefore, various architectures and settings will be examined in order to facilitate and accelerate CNN-based object detectors on embedded systems with limited computational capabilities. The following target detection topics will be presented: Object detection as a search and classification task. Detection as a classification and regression task. Modern architectures for target detection (e.g., RCNN, Faster-RCNN, YOLO, SSD). Lightweight architectures. Data augmentation. Deployment. Evaluation and benchmarking.
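
Two building blocks that commonly appear in the post-processing and evaluation of object detectors are intersection over union (IoU) and greedy non-maximum suppression (NMS). The abstract does not name them explicitly, so their inclusion here is an assumption; the sketch below uses hypothetical function names and boxes in (x1, y1, x2, y2) format.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first with an IoU of 81/119 (about 0.68), so it is suppressed, while the distant third box is kept.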

2. 2D target tracking:
Abstract: Target tracking is a crucial component of many computer vision systems. Many approaches regarding face/object detection and tracking in videos have been proposed. In this lecture, video tracking methods using correlation filters or convolutional neural networks are presented, focusing on video trackers that are capable of achieving real-time performance for long-term tracking on a UAV platform.
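
The basic idea behind correlation-based tracking, locating the target at the peak of a correlation response, can be illustrated with brute-force template matching. Note that this is a simplified stand-in and not one of the learned correlation filter trackers covered in the lecture, which typically compute a similar response map much more efficiently in the Fourier domain. The function names and the tiny synthetic frame are illustrative assumptions.

```python
import math


def ncc(patch, template):
    """Zero-mean normalised cross-correlation of two equal-size 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mp, mt = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mp) * (b - mt) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mp) ** 2 for a in p) * sum((b - mt) ** 2 for b in t))
    return num / den if den > 0 else 0.0


def track(frame, template):
    """Return (row, col) of the patch with the highest correlation response."""
    th, tw = len(template), len(template[0])
    best, best_pos = -2.0, (0, 0)  # NCC lies in [-1, 1], so -2 is a safe start
    for y in range(len(frame) - th + 1):
        for x in range(len(frame[0]) - tw + 1):
            patch = [row[x:x + tw] for row in frame[y:y + th]]
            score = ncc(patch, template)
            if score > best:
                best, best_pos = score, (y, x)
    return best_pos


# Synthetic 5x5 frame containing the target at row 2, column 1.
template = [[1, 2], [3, 4]]
frame = [[0] * 5 for _ in range(5)]
frame[2][1], frame[2][2] = 1, 2
frame[3][1], frame[3][2] = 3, 4
print(track(frame, template))  # (2, 1)
```

In a video tracker, the search would be restricted to a window around the previous target position and the template would be updated over time.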

3. Understand the core functionalities of an object detector. Training and deployment:
Abstract:

4. OpenCV programming for object tracking:
Abstract: The first part of this tutorial is an introduction to the OpenCV library using Python. Students will learn how to perform basic image processing operations, such as reading and displaying an image, extracting ROIs, applying filters, etc. In the second part of the tutorial, the students will learn how to perform visual object tracking in video sequences with correlation-filter-based tracking algorithms and OpenCV.

Part C (third day, 2 lectures, 2 programming exercises) 27/08/2021:
Autonomous UAV cinematography

Part C lectures will focus on video summarization and UAV cinematography. Before a drone mission (e.g., AV shooting, inspection) is executed, it is best simulated using drone mission simulation tools. Such simulations will be presented using AirSim. Additionally, a programming workshop on video summarization with PyTorch will take place.

1. Video summarization:
Abstract:

2. UAV cinematography:
Abstract:

3. Video summarization with PyTorch:
Abstract:

4. Drone cinematography with AirSim:
Abstract:

IF I HAVE A QUESTION?

Contact

LECTURERS & TUTORS

Prof. Ioannis Pitas (IEEE fellow, IEEE Distinguished Lecturer, EURASIP fellow) received the Diploma and Ph.D. degree in Electrical Engineering, both from the Aristotle University of Thessaloniki, Greece. Since 1994, he has been a Professor at the Department of Informatics of the same University. He served as a Visiting Professor at several Universities. His current interests are in the areas of image/video processing, machine learning, computer vision, intelligent digital media, human centered interfaces, affective computing, 3D imaging and biomedical imaging. He is also chair of the Autonomous Systems initiative. (Lecture: Introduction to drone imaging.)

Paraskevi Nousi obtained her BSc in Informatics in 2014 from Aristotle University of Thessaloniki and is currently pursuing her PhD in Computational Intelligence at the Informatics Department of Aristotle University of Thessaloniki. Her research is focused on developing effective and efficient Deep Learning methods for visual analysis tasks, such as Visual Object Tracking, Object Detection and Recognition, and has been influenced by the needs of the H2020 project MULTIDRONE. (Lecture: Deep learning for target detection. Programming workshop: PyTorch: Understand the core functionalities of an object detector. Training and deployment.)

Iason Karakostas received the Diploma of Electrical Engineering in 2017 and is currently a PhD Student at the Artificial Intelligence and Information Analysis Laboratory (AIIA) in the Department of Informatics of AUTH. He has co-authored 8 papers in scientific journals and international conferences and has participated in two European Union-funded R&D projects. His current research interests include machine learning, computer vision, autonomous robotics and intelligent cinematography. (Lecture: 2D target tracking. Programming workshop: OpenCV programming for object tracking.)

Michail Kaseris (PhD Candidate)

Sotirios Papadopoulos (PhD Candidate)

Christos Papaioannidis (PhD Candidate)

Emmanouil Patsiouras (PhD Candidate)

Educational record of Prof. I. Pitas

Prof. I. Pitas was Visiting/Adjunct/Honorary Professor/Researcher and lectured at several Universities: University of Toronto (Canada), University of British Columbia (Canada), EPFL (Switzerland), Chinese Academy of Sciences (China), University of Bristol (UK), Tampere University of Technology (Finland), Yonsei University (Korea), Erlangen-Nurnberg University (Germany), National University of Malaysia, Henan University (China). He delivered 90 invited/keynote lectures in prestigious international Conferences and top Universities worldwide. He ran 17 short courses and tutorials on Autonomous Systems, Computer Vision and Machine Learning, most of them in the past 3 years in many countries, e.g., USA, UK, Italy, Finland, Greece, Australia, New Zealand, Korea, Taiwan, Sri Lanka, Bhutan.

PAST COURSE EDITIONS

2019

Participants: 53, Countries: UK, Germany, Sweden, Norway, Italy, Greece, Croatia, Slovakia.

Registrant comments:

(Anonymous) “… The lectures during the workshops were really good. …”,

(Anonymous) “… Course material was very appealing and perfectly adequate. …”
