This three-day short course and workshop provides an in-depth presentation of programming tools and techniques for various computer vision and deep learning problems. The target application domains are autonomous systems (e.g., drone cinematography) and digital/social media. The short course consists of three parts (A, B, C), each comprising lectures and programming workshops with hands-on lab exercises.
Part A will focus on Deep Learning and GPU programming. The lectures of this part provide a solid background on Deep Neural Networks (DNN) topics, notably convolutional NNs (CNNs) and deep learning for image classification. Also, parallel GPU and multi-core CPU architectures commonly used to train DNNs will be presented. Two programming workshops will take place. The first one will be on image classification using CNNs, while the second one will be on CUDA programming, focusing on 2D convolution algorithms.
Part B lectures will focus on deep learning algorithms for computer vision, namely 2D object/face detection and 2D object tracking, giving attendees the opportunity to master state-of-the-art object detectors and video trackers. The hands-on programming workshops will cover target detection with PyTorch and target tracking with OpenCV, one of the most widely used computer vision libraries.
Part C lectures will focus on autonomous UAV cinematography. A drone mission is best simulated before it is executed, using drone mission simulation tools. Such simulations will be presented using AirSim. Additionally, participants will be introduced to video summarization techniques, which can be used to automatically distill the important parts of the recorded video.
The lectures and programming tools will provide programming skills for the various computer vision and deep learning problems encountered in autonomous systems and autonomous drone applications, e.g., drone cinematography, drone inspection, land/marine surveillance, search and rescue, and 3D modeling.
Lectures and programming workshops will be in English. PDF files will be available at the end of the course.
Part A (8 hours), Deep Learning and GPU programming
- Deep neural networks. Convolutional NNs.
- Parallel GPU and multi-core CPU architectures. GPU programming.
- Image classification with CNNs.
- CUDA programming.
Part B (8 hours), Deep Learning for Computer Vision
- Deep learning for object/face detection.
- 2D object tracking.
- PyTorch: Understand the core functionalities of an object detector. Training and deployment.
- OpenCV programming for object tracking.
Part C (8 hours), Autonomous UAV cinematography
- Video summarization.
- UAV cinematography.
- Video summarization with PyTorch.
- Drone cinematography with AirSim.
The course will take place on 25-27 August 2021.
All lectures and workshops will be delivered remotely.
You can find additional information about the city of Thessaloniki and details on how to get to the city here.
Each registrant will use her/his own computer a) for participating in the course and b) for running the programming exercises. A standard PC with a stable internet connection is required. Participants also need a Google account for the workshop exercises. Finally, instructions on how to install AirSim can be found here: https://microsoft.github.io/AirSim/. All lectures and workshops will be delivered remotely using Zoom. The course link is the following: https://authgr.zoom.us/j/91483406779
|Topic||Deep Learning and GPU programming||Deep Learning for Computer Vision||Autonomous UAV cinematography|
|8:30-9:00||Introduction to autonomous systems|| || |
|9:00-10:00||Deep neural networks – Convolutional NNs||Deep learning for object/face detection||Video summarization|
|10:00-11:00||Parallel GPU and multi-core CPU architectures – GPU programming||2D object tracking||UAV cinematography|
|11:00-11:30||Coffee break||Coffee break||Coffee break|
|11:30-13:30||Image classification with CNNs||PyTorch: Understand the core functionalities of an object detector. Training and deployment.||Video summarization with PyTorch|
|13:30-14:30||Lunch break||Lunch break||Lunch break|
|14:30-16:30||CUDA programming||OpenCV programming for object tracking||Drone cinematography with AirSim|
* Eastern European Summer Time (EEST, UTC+3 hours)
** This programme is indicative and may be modified without prior notice by announcing (hopefully small) changes in lectures/lecturers.
Early registration (till 2/08/2021):
• Standard: 200 Euros
• Reduced registration for young professionals (up to 2 years after graduation): 100 Euros
• Unemployed or Undergraduate/MSc/PhD student*: 50 Euros
Later or on-site registration (after 2/08/2021):
• Standard: 210 Euros
• Reduced registration for young professionals (up to 2 years after graduation): 110 Euros
• Unemployed or Undergraduate/MSc/PhD student*: 60 Euros
After the completion of your payment, please fill in the form below:
Up to 10 PhD students registered in AUTH or in any VISION CSA https://www.vision4ai.eu, AI4Media https://ai4media.eu/ or HumanE-AI-Net https://www.humane-ai.eu/ University partner are entitled to 1 free CVML Web Course registration per fall/spring semester on a first-come, first-served basis, with priority given to those working on AI-related topics. This offer is related to the upcoming educational activities of International AI Doctoral Academy (AIDA) http://www.i-aida.org/ that is co-initiated by these two projects.
Other special CVML Web Courses containing any CVML Web Lecture set (16 Lectures) can be available upon request, to meet your personal learning needs, by sending a message as described in the section IF I HAVE A QUESTION?
A certificate of attendance will be provided.
Upon successful completion of more than 14 CVML Web lecture understanding questionnaires (mark equal to or above 5 in the range 0-10, 10 being excellent) within 2 months after receiving the CVML Web Course material, you can receive a certificate of successful completion, with or without an entry of 3.5 ECTS (depending on your choice).
All lectures and workshops will be in English.
*** Due to the special COVID-19 circumstances, the 2021 edition of the «Programming short course and workshop on Deep Learning and Computer Vision for Autonomous Systems» will take place as a web course on 25-27 August 2021. Remote participation will be available via teleconferencing. ***
- 50% refund for cancellation up to 15/07/2021
- 0% refund afterwards
Presentations and lab notes will be available to the attendees.
Part A (first day, 2 lectures, 2 programming exercises) 25/08/2021
Deep Learning and GPU programming
The lectures of Part A provide a solid background on deep neural networks, on parallel GPU and multi-core CPU architectures, and on GPU programming for NNs. Image classification with CNNs is also covered, along with DNN programming tools (e.g., PyTorch, Keras, TensorFlow).
The hands-on programming workshops will be on image classification with CNNs and on CUDA programming.
1. Deep neural networks. Convolutional NNs:
Abstract: From multi-layer Perceptrons to deep architectures. Fully connected layers. Convolutional layers. Tensors and mathematical formulations. Pooling. Training convolutional NNs. Initialization. Data augmentation. Batch Normalization. Dropout. Deployment on embedded systems. Lightweight deep learning. DNN programming tools (e.g., PyTorch, Keras, TensorFlow).
2. Parallel GPU and multi-core CPU architectures. GPU programming:
Abstract: The GPU's unique architectural features are emphasized through a CPU-GPU comparison. The GPU architecture is described in detail in terms of ALUs and memory types, in order to introduce the special characteristics of GPU programming. The audience becomes familiar with terms such as grid, block, thread and kernel, and the general layout of a CUDA program is presented. CUDA keywords are explained by presenting simple CUDA programs. Finally, areas where GPU programming achieves outstanding performance are mentioned and 2D convolution algorithm implementations are demonstrated.
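The grid, block and thread terms can be illustrated without a GPU. The following is a minimal Python sketch, not CUDA code and not taken from the course material: it sequentially emulates a one-dimensional kernel launch to show how a block index, a block size and a thread index combine into one global index per thread. The helper names `launch` and `vector_add_kernel` are hypothetical and chosen only for this illustration.

```python
def launch(kernel, grid_dim, block_dim, *args):
    """Sequentially emulate a 1D launch: one kernel call per (block, thread) pair."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


def vector_add_kernel(block_idx, block_dim, thread_idx, a, b, out):
    # Same index arithmetic as blockIdx.x * blockDim.x + threadIdx.x in CUDA.
    i = block_idx * block_dim + thread_idx
    # Guard: the last block may contain more threads than remaining elements.
    if i < len(out):
        out[i] = a[i] + b[i]


a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0]
out = [0.0] * len(a)

block_dim = 4
# Ceiling division, so that every element is covered by some thread.
grid_dim = (len(a) + block_dim - 1) // block_dim

launch(vector_add_kernel, grid_dim, block_dim, a, b, out)
print(out)
```

On a real GPU the kernel calls run in parallel rather than in two nested loops, which is why each call may only write its own output element.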
3. Image classification with CNNs:
4. CUDA programming:
Abstract: 2D and 3D convolutions are very important tools both for computer vision (e.g., for target tracking) and for deep learning (convolutional NNs). Learn how to implement a 2D convolution between an image and a mask with CUDA.
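As a reference for what such a convolution computes, here is a minimal pure-Python sketch, assuming an odd-sized mask and zero padding at the image borders. It is not the workshop's CUDA implementation, and the function names `conv2d_pixel` and `conv2d` are hypothetical. Each output pixel depends only on the input image and the mask, so in a CUDA version the body of `conv2d_pixel` could plausibly be the work of a single thread.

```python
def conv2d_pixel(image, mask, row, col):
    """Compute one output pixel of a zero-padded 2D convolution."""
    h, w = len(image), len(image[0])
    mh, mw = len(mask), len(mask[0])
    acc = 0.0
    for i in range(mh):
        for j in range(mw):
            # True convolution flips the mask: mask[i][j] meets this input pixel.
            r = row + mh // 2 - i
            c = col + mw // 2 - j
            if 0 <= r < h and 0 <= c < w:  # outside the image counts as zero
                acc += image[r][c] * mask[i][j]
    return acc


def conv2d(image, mask):
    """Apply conv2d_pixel independently at every pixel position."""
    h, w = len(image), len(image[0])
    return [[conv2d_pixel(image, mask, row, col) for col in range(w)]
            for row in range(h)]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
identity = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]
result = conv2d(image, identity)  # the identity mask reproduces the image
print(result)
```

The sequential double loop over pixels in `conv2d` is the part that a GPU replaces with a grid of threads, one per output pixel.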
Part B (second day, 2 lectures, 2 programming exercises) 26/08/2021:
Deep Learning for Computer Vision
Part B lectures will focus on computer vision algorithms, namely deep learning for object/face detection and 2D target tracking. Two programming workshops will take place. The first one will cover the core functionalities of an object detector in PyTorch, including training and deployment. The second one will be on OpenCV programming for object tracking.
1. Deep learning for object/face detection:
Abstract: Recently, Convolutional Neural Networks (CNNs) have been used for object/target (e.g., car, pedestrian, road sign) detection with great results. However, using such CNN models on embedded processors for real-time processing is hindered by hardware constraints. Various architectures and settings will therefore be examined in order to facilitate and accelerate CNN-based object detectors on embedded platforms with limited computational capabilities. The following target detection topics will be presented: Object detection as search and classification task. Detection as classification and regression task. Modern architectures for target detection (e.g., R-CNN, Faster R-CNN, YOLO, SSD). Lightweight architectures. Data augmentation. Deployment. Evaluation and benchmarking.
2. 2D target tracking:
Abstract: Target tracking is a crucial component of many computer vision systems. Many approaches to face/object detection and tracking in videos have been proposed. In this lecture, video tracking methods using correlation filters or convolutional neural networks are presented, focusing on video trackers that are capable of achieving real-time performance for long-term tracking on a UAV platform.
3. Understand the core functionalities of an object detector. Training and deployment:
4. OpenCV programming for object tracking:
Abstract: The first part of this tutorial is an introduction to the OpenCV library using Python. Students will learn how to perform basic image processing operations, such as reading and displaying an image, extracting regions of interest (ROIs) and applying filters. In the second part of the tutorial, students will learn how to perform visual object tracking in video sequences using correlation-filter-based tracking algorithms in OpenCV.
Part C (third day, 2 lectures, 1 programming exercise) 27/08/2021:
Autonomous UAV cinematography
Part C lectures will focus on video summarization and UAV cinematography for drone missions (e.g., AV shooting, inspection). Drone mission simulations will be presented using AirSim. Additionally, a programming workshop on PyTorch will take place.
1. Video summarization:
2. UAV cinematography:
3. Video summarization with PyTorch:
4. Drone cinematography with AirSim:
IF I HAVE A QUESTION?
LECTURERS & TUTORS
Prof. Ioannis Pitas (IEEE fellow, IEEE Distinguished Lecturer, EURASIP fellow) received the Diploma and Ph.D. degree in Electrical Engineering, both from the Aristotle University of Thessaloniki, Greece. Since 1994, he has been a Professor at the Department of Informatics of the same University. He served as a Visiting Professor at several Universities. His current interests are in the areas of image/video processing, machine learning, computer vision, intelligent digital media, human centered interfaces, affective computing, 3D imaging and biomedical imaging. He is also chair of the Autonomous Systems initiative. (Lecture: Introduction to drone imaging.)
Paraskevi Nousi obtained her BSc in Informatics in 2014 from Aristotle University of Thessaloniki and is currently pursuing her PhD in Computational Intelligence at the Informatics Department of Aristotle University of Thessaloniki. Her research is focused on developing effective and efficient Deep Learning methods for visual analysis tasks, such as Visual Object Tracking, Object Detection and Recognition, and has been influenced by the needs of the H2020 project MULTIDRONE. (Lecture: Deep learning for target detection. Programming workshop: PyTorch: Understand the core functionalities of an object detector. Training and deployment.)
Iason Karakostas received the Diploma of Electrical Engineering in 2017 and is currently a PhD Student at the Artificial Intelligence and Information Analysis Laboratory (AIIA) in the Department of Informatics of AUTH. He has co-authored 8 papers in scientific journals and international conferences and has participated in two European Union-funded R&D projects. His current research interests include machine learning, computer vision, autonomous robotics and intelligent cinematography. (Lecture: 2D target tracking. Programming workshop: OpenCV programming for object tracking.)
Michail Kaseris (PhD Candidate)
Sotirios Papadopoulos (PhD Candidate)
Christos Papaioannidis (PhD Candidate)
Emmanouil Patsiouras (PhD Candidate)
Educational record of Prof. I. Pitas
Prof I. Pitas was Visiting/Adjunct/Honorary Professor/Researcher and lectured at several Universities: University of Toronto (Canada), University of British Columbia (Canada), EPFL (Switzerland), Chinese Academy of Sciences (China), University of Bristol (UK), Tampere University of Technology (Finland), Yonsei University (Korea), Erlangen-Nurnberg University (Germany), National University of Malaysia, Henan University (China). He delivered 90 invited/keynote lectures in prestigious international Conferences and top Universities worldwide. He ran 17 short courses and tutorials on Autonomous Systems, Computer Vision and Machine Learning, most of them in the past 3 years in many countries, e.g., USA, UK, Italy, Finland, Greece, Australia, N. Zealand, Korea, Taiwan, Sri Lanka, Bhutan.
PAST COURSE EDITIONS
Participants: 53, Countries: UK, Germany, Sweden, Norway, Italy, Greece, Croatia, Slovakia.
(Anonymous) «… The lectures during the workshops were really good. …»,
(Anonymous) «… Course material was very appealing and perfectly adequate. …»
If you want to be our sponsor send us an email here: firstname.lastname@example.org
SAMPLE COURSE MATERIAL. RELATED LITERATURE
1) Multidrone Project (MULTIple DRONE platform for media production), funded by the EU (2017-19), within the scope of the H2020 framework, https://multidrone.eu/
2) Semi-Supervised Subclass Support Vector Data Description for image and video classification, V. Mygdalis, A. Iosifidis, A. Tefas, I. Pitas, Neurocomputing, vol. 291, pp. 237-241, 2018
3) Face detection Hindering, P. Chriskos, J. Munro, V. Mygdalis, I. Pitas, Proceedings of the IEEE Global Conference on Signal and Information Processing (GLOBALSIP), Montreal, Canada, 2017
4) 2D visual tracking for sports UAV cinematography applications, O. Zachariadis, V. Mygdalis, I. Mademlis, I. Pitas, Proceedings of the IEEE Global Conference on Signal and Information Processing (GLOBALSIP), Montreal, Canada, 2017
5) Neurons With Paraboloid Decision Boundaries for Improved Neural Network Classification Performance, N. Tsapanos, A. Tefas, N. Nikolaidis and I. Pitas, IEEE Transactions on Neural Networks and Learning Systems (TNNLS), vol. 30, issue 1, pp. 284-294, 2019
6) Convolutional Neural Networks for Visual Information Analysis with Limited Computing Resources, P. Nousi, E. Patsiouras, A. Tefas, I. Pitas, Proceedings of the IEEE International Conference on Image Processing (ICIP), Athens, Greece, 2018
7) Overview of drone cinematography for sports filming, I. Mademlis, V. Mygdalis, C. Raptopoulou, N. Nikolaidis, N. Heise, T. Koch, T. Wagner, A. Messina, F. Negro, S. Metta, I. Pitas, European Conference on Visual Media Production (CVMP), London, UK, 2017
8) Challenges in Autonomous UAV cinematography: An overview, I. Mademlis, V. Mygdalis, N. Nikolaidis, I. Pitas, Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), San Diego, USA, 2018
9) Learning Multi-graph regularization for SVM classification, V. Mygdalis, A. Tefas, I. Pitas, Proceedings of the IEEE International Conference on Image Processing (ICIP), Athens, Greece, 2018
10) UAV Cinematography Constraints Imposed by Visual Target Trackers, I. Karakostas, I. Mademlis, N. Nikolaidis, I. Pitas, Proceedings of the IEEE International Conference on Image Processing (ICIP), Athens, Greece, 2018
11) Efficient camera control using 2D visual information for unmanned aerial vehicle-based cinematography, N. Passalis, A. Tefas, I. Pitas, Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), Florence, Italy, 2018
12) The future of media production through multi-drones’ eyes, A. Messina, S. Metta, M. Montagnuolo, F. Negro, V. Mygdalis, I. Pitas, J. Capitán, A. Torres, S. Boyle, D. Bull, F. Zhang, International Broadcasting Convention (IBC), Amsterdam, Netherlands, 2018
13) Quality Preserving Face De-Identification Against Deep CNNs, P. Chriskos, R. Zhelev, V. Mygdalis, I. Pitas, Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Aalborg, Denmark, 2018
14) Improving Face Pose Estimation using Long-Term Temporal Averaging for Stochastic Optimization, N. Passalis, A. Tefas, Proceedings of the International Conference on Engineering Applications of Neural Networks, EANN 2017, Athens, Greece, 2017
15) Discriminatively Trained Autoencoders for Fast and Accurate Face Recognition, P. Nousi, A. Tefas, Proceedings of the International Conference on Engineering Applications of Neural Networks, EANN, Athens, Greece, 2017
16) Concept Detection and Face Pose Estimation Using Lightweight Convolutional Neural Networks for Steering Drone Video Shooting, N. Passalis, A. Tefas, Proceedings of the European Signal Processing Conference (EUSIPCO), Kos, Greece, 2017
17) Human Crowd Detection for Drone Flight Safety Using Convolutional Neural Networks, M. Tzelepi, A. Tefas, Proceedings of the European Signal Processing Conference (EUSIPCO), Kos, Greece, 2017
18) Lightweight Two-Stream Convolutional Face Detection, D. Triantafyllidou, P. Nousi, A. Tefas, Proceedings of the European Signal Processing Conference (EUSIPCO), Kos, Greece, August, 2017
19) Fast Deep Convolutional Face Detection in the Wild Exploiting Hard Sample Mining, D. Triantafyllidou, P. Nousi, A. Tefas, Big Data Research, Elsevier, vol. 11, pp. 65-76, 2018
20) Learning Bag-of-Features Pooling for Deep Convolutional Neural Networks, N. Passalis, A. Tefas, Proceedings of the International Conference on Computer Vision (ICCV), Venice, Italy, 2017
21) Self-Supervised Auto-encoders for Clustering and Classification, P. Nousi, A. Tefas, Evolving Systems Journal, Springer, pp 1–14, 2018
22) Unsupervised Knowledge Transfer using Similarity Embeddings, N. Passalis, A. Tefas, IEEE Transactions on Neural Networks and Learning Systems (TNNLS), vol. 30, issue 3, pp. 946-950, 2018
23) Recurrent Attention for Deep Neural Object Detection, G. Symeonidis, A. Tefas, Hellenic Conference on Artificial Intelligence (SETN), Rio Patras, Greece, 2018
24) Neural Network Knowledge Transfer using Unsupervised Similarity Matching, N. Passalis, A. Tefas, Proceedings of the International Conference on Pattern Recognition (ICPR), Beijing, China, 2018
25) Deep reinforcement learning for frontal view person shooting using drones, N. Passalis, A. Tefas, Proceedings of the IEEE Conference on Evolving and Adaptive Intelligent Systems (EAIS), Rhodes, Greece, 2018
26) A Multidrone Approach for Autonomous Cinematography Planning, A. Torres-Gonzalez, J. Capitan, R. Cunha, A. Ollero and I. Mademlis, Proceedings of the Iberian Robotics Conference (ROBOT), 2017
27) Decentralized safe conflict resolution for multiple robots in dense scenarios, E. Ferrera, J. Capitán, A.R. Castaño and P.J. Marrón, Robotics and Autonomous Systems, vol. 91, pp. 179-193, 2017
28) Cooperative perimeter surveillance using Bluetooth framework under communication constraints, J.M. Aguilar, P. R. Soria, B.C. Arrue and A. Ollero, Proceedings of the Iberian Robotics Conference (ROBOT), 2017
29) Applying Frontier Cells Based Exploration and Lazy Theta* Path Planning over Single Grid-Based World Representation for Autonomous Inspection of Large 3D Structures with an UAS, M. Faria, I. Maza and A. Viguria, Journal of Intelligent & Robotic Systems, Springer, accepted for publication
30) Discriminative Optimization: Theory and Applications to Computer Vision Problems, J. Vongkulbhisal, F. De la Torre, and J. P. Costeira, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 41, issue 4, pp. 829 – 843, 2018
31) Integrated Visual Servoing Solution to Quadrotor Stabilization and Attitude Estimation Using a Pan and Tilt Camera, D. Cabecinhas, S. Brás, R. Cunha, C. Silvestre, P. Oliveira, IEEE Transactions on Control Systems Technology, vol. 27, issue 1, pp. 14-29, 2017
32) UAL: An Abstraction Layer for Unmanned Aerial Vehicles, F. Real, A. Torres-González, P. Ramón-Soria, J. Capitán and A. Ollero, Proceedings of the International Symposium on Aerial Robotics (ISAR), Philadelphia, PA, USA, 2018
33) Inverse Composition Discriminative Optimization for Point Cloud Registration, J. Vongkulbhisal, B. I. Ugalde, F. De la Torre, J. P. Costeira, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, USA, 2018
34) De-identifying facial images using singular value decomposition and projections, P. Chriskos, O. Zoidi, A. Tefas and I. Pitas, Multimedia Tools and Applications, Springer, vol. 76, issue 3, pp. 3435-3468, 2017
35) Cooperative Unmanned Aerial Systems for Fire Detection, Monitoring and Extinguishing, L. Merino, J.R. Martinez-de Dios, A. Ollero, In «Handbook of Unmanned Aerial Vehicles», ISBN 978-90-481-9706-4, Springer, 2015
36) Shot Type Feasibility in Autonomous UAV Cinematography, I. Karakostas, I. Mademlis, N. Nikolaidis, I. Pitas, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019
37) High-Level Multiple-UAV Cinematography Tools for Covering Outdoor Events, I. Mademlis, V. Mygdalis, N. Nikolaidis, M. Montagnuolo, F. Negro, A. Messina, I. Pitas, IEEE Transactions on Broadcasting, accepted for publication, 2019
38) Autonomous Unmanned Aerial Vehicles Filming in Dynamic Unstructured Outdoor Environments, I. Mademlis, N. Nikolaidis, A. Tefas, I. Pitas, T. Wagner, A. Messina, IEEE Signal Processing Magazine, vol. 36, issue 1, pp. 147-153, 2019
39) Deep Convolutional Feature Histograms for Visual Object Tracking, P. Nousi, A. Tefas, I. Pitas, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019
40) Semantic Map Annotation Through UAV Video Analysis Using Deep Learning Models in ROS, E. Kakaletsis, M. Tzelepi, P.I. Kaplanoglou, C. Symeonidis, N. Nikolaidis, A. Tefas, I. Pitas, Proceedings of the International Conference on Multimedia Modeling (MMM), Thessaloniki, Greece, 2019
41) Exploiting multiplex data relationships in Support Vector Machines, V. Mygdalis, A. Tefas, I. Pitas, Pattern Recognition, Elsevier, vol. 85, pp. 70-77, 2019
42) Deep reinforcement learning for controlling frontal person close-up shooting, N. Passalis, A. Tefas, Neurocomputing, Elsevier, vol. 335, pp. 37-47, 2019
43) Graph Embedded Convolutional Neural Networks in Human Crowd Detection for Drone Flight Safety, M. Tzelepi, A. Tefas, IEEE Transactions on Emerging Topics in Computational Intelligence, accepted for publication, 2019
44) Training Lightweight Deep Convolutional Neural Networks Using Bag-of-Features Pooling, N. Passalis, A. Tefas, IEEE Transactions on Neural Networks and Learning Systems, accepted for publication, 2018
Prof. Ioannis Pitas: https://scholar.google.gr/citations?user=lWmGADwAAAAJ&hl=el
Multidrone project: https://multidrone.eu/
Icarus Research Team: http://icarus.csd.auth.gr/
Laboratory of Artificial Intelligence and Information Analysis: http://www.aiia.csd.auth.gr/
Department of Informatics, Aristotle University of Thessaloniki (AUTH): http://www.csd.auth.gr/en/