Computer Vision, Machine Learning and Autonomous Systems Lectures
After the very successful completion of a) the 2020 Spring edition of the CVML Web Lecture Series and b) the ‘Summer School on Autonomous Systems 2020’ (17-21/8/2020), attracting more than 100 registrants, AIIA Lab (AIIA.CVML research group) offers an asynchronous mode of studying Computer Vision, Machine Learning and Autonomous Systems topics. The CVML Web Lecture list is found below. Sample material for the lecture topics is provided as well.
This asynchronous e-course provides an overview and in-depth presentation of the various computer vision and deep learning problems encountered in autonomous systems perception, e.g., in drone imaging or autonomous car vision. It consists of 20 one-hour lectures (and related material) covering the following domains:
a. Computer vision. After reviewing image acquisition, camera geometry (mapping the 3D world on a 2D image plane) and camera calibration, stereo and multi-view imaging systems are presented for recovering 3D world geometry from 2D images. This is complemented by Structure from Motion (SfM) towards Simultaneous Localization and Mapping (SLAM) for vehicle and/or target localization and visual object tracking and 3D localization. Motion estimation algorithms are also overviewed.
b. Neural Networks and Deep Learning. As there is much hype and often little accuracy when treating these topics, the principles of Machine Learning are presented first, focusing on classification and regression. Then, an introduction to neural networks provides a rigorous formulation of the optimization problems for their training, starting with the Perceptron. It continues with Multilayer Perceptron training through Backpropagation, presenting many related problems, such as over-/under-fitting and generalization. Deep neural networks, notably Convolutional NNs, are the core of this domain nowadays and are overviewed in great detail. Their application to object detection is a very important issue as well, complemented with a presentation of deep semantic image segmentation.
c. Autonomous Systems. First of all, an introduction to Autonomous Systems (AS) provides an overview of various issues related to AS perception and control. Then topics related to autonomous drones are detailed, notably drone mission planning and control and multiple drone imaging. Then Autonomous cars and autonomous marine vehicles are overviewed.
d. CVML algorithms and programming. Various such tools, libraries and frameworks are overviewed: the Robot Operating System (ROS), linear algebra libraries (BLAS), DNN libraries (e.g., cuBLAS, cuDNN) and frameworks (e.g., PyTorch, Tensorflow, Keras). Distributed computing frameworks (Apache Spark) and collaborative SW development tools (e.g., GitHub) are overviewed as well.
e. Signals and Systems. Much confusion exists nowadays in the ML literature, as even mature ML scientists may have no background in Signals and Systems (SS) and confuse even basic notions, e.g., convolutions and correlations. SS principles are overviewed, focusing on fast convolution algorithms, particularly on 2D convolution algorithms, which are an absolute must for CNN libraries/frameworks and many computer vision tasks.
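The convolution/correlation distinction mentioned here can be illustrated in a few lines of plain Python (a minimal sketch, not course material): correlation slides the kernel as-is, while convolution flips it first. For a symmetric kernel the two coincide, which is one source of the confusion; note also that the "convolution" layers of CNN frameworks actually compute correlations.

```python
def correlate1d(x, h):
    """Valid-mode 1D cross-correlation: slide h over x without flipping."""
    n = len(x) - len(h) + 1
    return [sum(x[i + j] * h[j] for j in range(len(h))) for i in range(n)]

def convolve1d(x, h):
    """Valid-mode 1D convolution: correlation with a flipped kernel."""
    return correlate1d(x, h[::-1])

x = [1, 2, 3, 4]
h = [1, 0, -1]
print(correlate1d(x, h))  # [1 - 3, 2 - 4] = [-2, -2]
print(convolve1d(x, h))   # kernel flipped: [3 - 1, 4 - 2] = [2, 2]
```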
You can also self-assess your CVML knowledge by filling in the appropriate questionnaires (one per lecture), and you will be provided with programming exercises to improve your CV programming skills.
You can click on the lecture title to view its description.
- Introduction to computer vision
- Image acquisition, camera geometry
- Stereo and Multiview imaging
- Structure from motion
- Localization and mapping
- Object tracking and 3D localization
- Motion estimation
- Introduction to Machine Learning
- Introduction to neural networks, Perceptron
- Multilayer perceptron. Backpropagation
- Deep neural networks. Convolutional NNs
- Deep learning for object detection
- Deep Semantic Image Segmentation
- Introduction to autonomous systems
- Introduction to multiple drone imaging
- Drone mission planning and control
- Introduction to car vision
- Introduction to autonomous marine vehicles
CVML algorithms and programming
Signals and Systems
CVML PROGRAMMING EXERCISES
You can improve your programming skills on Computer Vision, Machine Learning and Image/Video Processing through programming exercises using OpenCV, PyTorch and CUDA on the following topics:
- Introduction to OpenCV Programming
- CNN image classification
- PyTorch for deep object detection
- OpenCV programming for object tracking
- CUDA programming of 2D convolution algorithms
You will be provided with the programming exercise solutions to check your progress.
More information can be found at: https://aiia.csd.auth.gr/cvml-programming-exercises/.
You can now assess your CVML knowledge and background by performing the exercise described in:
Each questionnaire takes only 15 minutes. You can perform the self-assessment exercise before/after studying the course material to assess your progress. You can do this double self-assessment (before/after study) for free, using the sample lecture study material provided below.
Prof. Ioannis Pitas (IEEE Fellow, IEEE Distinguished Lecturer, EURASIP Fellow) received the Diploma and Ph.D. degrees in Electrical Engineering, both from the Aristotle University of Thessaloniki, Greece. Since 1994, he has been a Professor at the Department of Informatics of the same university. His current interests are in the areas of image/video processing, machine learning, computer vision, intelligent digital media, human-centered interfaces, affective computing, 3D imaging and biomedical imaging. He has published over 860 papers, contributed to 44 books in his areas of interest and edited or (co-)authored another 11 books. He has also been a member of the program committee of many scientific conferences and workshops. In the past, he served as Associate Editor or co-Editor of 9 international journals and General or Technical Chair of 4 international conferences. He has participated in 69 R&D projects, primarily funded by the European Union, and is/was principal investigator/researcher in 41 such projects.
His work has received over 31,600 citations, with an h-index of 85+ (Google Scholar).
Prof. Pitas led the big European H2020 R&D project MULTIDRONE and is principal investigator (AUTH) in the H2020 projects Aerial Core and AI4Media. He is chair of the Autonomous Systems initiative https://ieeeasi.signalprocessingsociety.org/.
Professor Pitas will deliver 16 lectures on deep learning and computer vision.
Educational record of Prof. I. Pitas: He was a Visiting/Adjunct/Honorary Professor/Researcher and lectured at several universities: University of Toronto (Canada), University of British Columbia (Canada), EPFL (Switzerland), Chinese Academy of Sciences (China), University of Bristol (UK), Tampere University of Technology (Finland), Yonsei University (Korea), Erlangen-Nurnberg University (Germany), National University of Malaysia and Henan University (China). He has delivered 90 invited/keynote lectures at prestigious international conferences and top universities worldwide. He has run 17 short courses and tutorials on Autonomous Systems, Computer Vision and Machine Learning, most of them in the past 3 years, in many countries, e.g., USA, UK, Italy, Finland, Greece, Australia, New Zealand, Korea, Taiwan, Sri Lanka and Bhutan.
Registration fee: 141 Euros
Depending on your background and pace, it will take you 1 month (or more) to finish this course and run all questionnaires, at a rate of 4 lectures/week.
All material and correspondence are in English. You will be provided email and/or Skype support for understanding questions for a period of 1 month after your registration.
Upon successful completion of more than 14 questionnaires (mark above 5/10), you will receive a certificate for the successful completion of this course.
1. Introduction to computer vision
Abstract: A detailed introduction to computer vision will be given: image/video sampling, image and video acquisition, camera geometry, stereo and multiview imaging, structure from motion, structure from X, 3D robot localization and mapping, semantic 3D world mapping, 3D object localization, multiview object detection and tracking, and object pose estimation.
Sample Lecture material: Download
2. Image acquisition, camera geometry
Abstract: After a brief introduction to image acquisition and light reflection, the building blocks of modern cameras will be surveyed, along with geometric camera modeling. Several camera models, like pinhole and weak-perspective camera model, will subsequently be presented, with the most commonly used camera calibration techniques closing the lecture.
Sample Lecture material: Download
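The pinhole camera model at the heart of this lecture can be sketched in a few lines of Python (NumPy assumed; the intrinsic parameters below are made-up illustrative values, not course data):

```python
import numpy as np

# Hypothetical intrinsics: focal length 800 px, principal point (320, 240).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

def project(K, X):
    """Project a 3D point X (camera coordinates, Z > 0) with the pinhole model."""
    x = K @ X            # homogeneous image point (u*Z, v*Z, Z)
    return x[:2] / x[2]  # perspective division

X = np.array([0.1, -0.2, 2.0])   # a point 2 m in front of the camera
u, v = project(K, X)             # pixel coordinates (360.0, 160.0)
```

The perspective division by depth Z is what makes the mapping from the 3D world to the 2D image plane non-linear and non-invertible, motivating the stereo and multi-view methods of the following lectures.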
3. Stereo and Multiview imaging
Abstract: The workings of stereoscopic and multiview imaging will be explored in depth, focusing mainly on stereoscopic vision, geometry and camera technologies. Subsequently, the main methods of 3D scene reconstruction from stereoscopic video will be described, along with the basics of multiview imaging.
Sample Lecture material: Download
4. Structure from motion
Abstract: Image-based 3D Shape Reconstruction, Stereo and multiview imaging principles. Feature extraction and matching. Triangulation and Bundle Adjustment. Mathematics of structure from motion. UAV image capturing. Optimal UAV flight trajectory/flight height/viewing angle/image overlap ratio. Pre/post-processing for 3D reconstruction: flat surface smoothing/mesh modification/isolated point removal. Structure from motion applications: 3D face reconstruction from uncalibrated video. 3D landscape reconstruction. 3D building/monument reconstruction and modeling.
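The triangulation step named above can be sketched as linear (DLT) two-view triangulation in Python (NumPy assumed; the projection matrices below form a hypothetical noise-free stereo pair for illustration):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen in two views.
    P1, P2: 3x4 projection matrices; x1, x2: (u, v) image points."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)   # null-space vector = homogeneous 3D point
    X = Vt[-1]
    return X[:3] / X[3]           # dehomogenize

# Hypothetical stereo pair: identity camera and one translated along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])
X_true = np.array([0.2, 0.1, 3.0])
x1 = P1 @ np.append(X_true, 1.0); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1.0); x2 = x2[:2] / x2[2]
X_hat = triangulate(P1, P2, x1, x2)   # recovers X_true (exactly, as data is noise-free)
```

With noisy feature matches, many such points and cameras are refined jointly by bundle adjustment, as covered in the lecture.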
5. Localization and mapping
Abstract: The lecture covers the essential knowledge about how the 2D and/or 3D maps that robots/drones need are obtained, taking measurements that allow them to perceive their environment with appropriate sensors. Semantic mapping covers how to add semantic annotations to the maps, such as POIs, roads and landing sites. Localization is exploited to find the 3D drone or target location based on sensors, specifically using Simultaneous Localization And Mapping (SLAM). Finally, drone localization fusion describes how accuracy in localization and mapping is improved by exploiting the synergies between different sensors.
Lecture material: Download
6. Object tracking and 3D localization
Abstract: Target tracking is a crucial component of many vision systems. Many approaches regarding person/object detection and tracking in videos have been proposed. In this lecture, video tracking methods using correlation filters or convolutional neural networks are presented, focusing on video trackers that are capable of achieving real-time performance for long-term tracking on embedded computing platforms.
Sample Lecture material: Download
7. Motion estimation
Abstract: Motion estimation principles will be analyzed. Starting from 2D and 3D motion models, displacement estimation as well as quality metrics for motion estimation will subsequently be detailed. One of the basic motion estimation techniques, namely block matching, will also be presented, along with three alternative, faster methods. An overview of deep neural motion estimation will be presented. Phase correlation will be described next, followed by optical flow equation methods. Finally, a brief introduction to object detection and tracking will conclude the lecture.
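Block matching, the basic technique named above, can be sketched as an exhaustive SAD search (a toy Python example with NumPy; the frame size, block size and search range below are arbitrary illustrative choices):

```python
import numpy as np

def block_match(ref, cur, top, left, bsize, srange):
    """Exhaustive block matching: find the displacement (dy, dx) of the block of
    frame `cur` at (top, left) within a +/- srange window of frame `ref`,
    minimizing the Sum of Absolute Differences (SAD)."""
    block = cur[top:top + bsize, left:left + bsize]
    best, best_dv = float("inf"), (0, 0)
    for dy in range(-srange, srange + 1):
        for dx in range(-srange, srange + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + bsize > ref.shape[0] or x + bsize > ref.shape[1]:
                continue                      # candidate window out of frame
            sad = np.abs(ref[y:y + bsize, x:x + bsize] - block).sum()
            if sad < best:
                best, best_dv = sad, (dy, dx)
    return best_dv

rng = np.random.default_rng(0)
ref = rng.random((32, 32))
cur = np.roll(ref, shift=(2, -3), axis=(0, 1))  # content moved by (+2, -3)...
dv = block_match(ref, cur, 8, 8, 8, 4)          # ...so the backward match is (-2, +3)
```

The cost of this exhaustive search is what motivates the faster alternatives (e.g., logarithmic searches) covered in the lecture.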
1. Introduction to Machine Learning
Abstract: This lecture will cover the basic concepts of Machine Learning. Supervised, self-supervised, unsupervised, semi-supervised learning. Multi-task Machine Learning. Classification, regression. Object detection, Object tracking. Clustering. Dimensionality reduction, data retrieval. Artificial Neural Networks. Adversarial Machine Learning. Generative Machine Learning. Temporal Machine learning (Recurrent Neural Networks). Continual Learning (few-shot learning, online learning). Reinforcement Learning. Adaptive learning (Knowledge Distillation, Domain adaptation, Transfer learning, Activation Pattern Analysis, Federated learning/Collaborative learning, Ensemble learning). Precise mathematical definitions of ML tasks will be presented.
2. Introduction to neural networks, Perceptron
Abstract: This lecture will cover the basic concepts of Artificial Neural Networks (ANNs): Biological neural models, Perceptron, Activation functions, Loss types, Steepest Gradient Descent, On-line Perceptron training, Batch Perceptron training.
Sample Lecture material: Download
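On-line Perceptron training, as covered in this lecture, can be sketched in pure Python (the AND toy dataset, epoch count and learning rate are illustrative choices):

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """On-line Perceptron training: update the weights after every
    misclassified sample. samples: lists of features; labels: +1/-1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # misclassified (or on the boundary): move the hyperplane toward x
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

# A linearly separable toy set: logical AND with +/-1 labels.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [-1, -1, -1, 1]
w, b = train_perceptron(X, y)
pred = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1 for x in X]
```

By the Perceptron convergence theorem, this loop terminates with a separating hyperplane whenever the data is linearly separable, which is the case here.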
3. Multilayer perceptron. Backpropagation
Abstract: This lecture will cover the basic concepts of Multi-Layer Perceptron (MLP), Training MLP neural networks, Activation functions, Loss types, Gradient descent, Error Backpropagation, Stochastic Gradient Descent, Adaptive Learning Rate Algorithms, Regularization, Evaluation, Generalization.
Sample Lecture material: Download
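A minimal sketch of error backpropagation for a one-hidden-layer MLP trained on XOR with MSE loss and full-batch gradient descent (NumPy assumed; layer sizes, learning rate and iteration count are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)       # XOR targets

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)       # hidden layer, 4 units
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)       # output layer
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

losses = []
for _ in range(2000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(float(((out - y) ** 2).mean()))
    # backward pass: chain rule through MSE and the sigmoids
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # gradient descent step, learning rate 1.0
    W2 -= 1.0 * (h.T @ d_out); b2 -= 1.0 * d_out.sum(0)
    W1 -= 1.0 * (X.T @ d_h);   b1 -= 1.0 * d_h.sum(0)
```

The backward pass is exactly the layer-by-layer application of the chain rule formalized in the lecture; the training loss should decrease over the iterations.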
4. Deep neural networks. Convolutional NNs
Abstract: From multilayer perceptrons to deep architectures. Fully connected layers. Convolutional layers. Tensors and mathematical formulations. Pooling. Training convolutional NNs. Initialization. Data augmentation. Batch Normalization. Dropout. Deployment on embedded systems. Lightweight deep learning.
Sample Lecture material: Download
5. Deep learning for object detection
Abstract: Recently, Convolutional Neural Networks (CNNs) have been used for object/target (e.g., car, pedestrian, road sign) detection with great results. However, using such CNN models on embedded processors for real-time processing is hindered by hardware constraints. To this end, various architectures and settings will be examined in order to facilitate and accelerate the use of CNN-based object detectors on embedded devices with limited computational capabilities. The following target detection topics will be presented: Object detection as a search and classification task. Detection as a classification and regression task. Modern architectures for target detection (e.g., RCNN, Faster-RCNN, YOLO, SSD). Lightweight architectures. Data augmentation. Deployment. Evaluation and benchmarking.
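Detector evaluation and benchmarking, mentioned above, hinge on bounding-box overlap; a minimal Intersection-over-Union (IoU) sketch in Python:

```python
def iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # intersection corners
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes overlapping in a 5x10 region: IoU = 50 / (100 + 100 - 50)
score = iou((0, 0, 10, 10), (5, 0, 15, 10))   # 1/3
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a threshold (0.5 is a common choice), which underlies metrics such as mean Average Precision.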
6. Deep Semantic Image Segmentation
Abstract: Semantic image segmentation is a very important computer vision task with several applications in autonomous systems perception, robotic vision and medical imaging. Recent semantic image segmentation methods rely on deep neural networks and aim to assign a specific class label to each pixel of the input image. This lecture overviews the topic and addresses some of the semantic image segmentation challenges, notably: Deep semantic Image Segmentation architectures. Skip connections. U-nets. BiSeNet. Semantic image segmentation performance, computational complexity and generalization.
1. Introduction to autonomous systems
Abstract: A fully autonomous system can: a) gain information about the environment, b) work for an extended period without human intervention, c) move either all or part of itself throughout its operating environment without human assistance and d) avoid situations that are harmful to people, property, or itself unless those are part of its design specifications.
Key technologies of autonomous systems are overviewed, notably: mission planning and control, perception and intelligence, embedded computing, swarm systems, communications and societal technologies. Several autonomous system applications are presented, notably a) autonomous cars, b) drones and drone swarms, c) autonomous underwater vehicles d) autonomous marine vessels and e) autonomous robots.
2. Introduction to multiple drone systems
Abstract: This lecture will provide the general context for this new and emerging topic, presenting the aims of multiple drone systems and focusing on their sensing and perception. Drone mission formalization, planning and control will be overviewed. Then, multiple drone communication issues will be presented, notably drone2ground communication and multisource video streaming. In drone vision, the challenges (especially from an image/video analysis and computer vision point of view), the important issues to be tackled and the limitations imposed by drone hardware, regulations and safety considerations will be presented. A multiple drone platform will be detailed during the second part of the lecture, beginning with a platform hardware overview, issues and requirements, and proceeding to discuss safety and privacy protection issues. Finally, platform integration will be the closing topic of the lecture, elaborating on drone mission planning, object detection and tracking, UAV-based cinematography, target pose estimation, privacy protection, ethical and regulatory issues, potential landing site detection, crowd detection, semantic map annotation and simulations. Two drone use cases will be overviewed: a) multiple drones in media production and b) drone-based infrastructure surveillance, notably of electrical installations.
3. Drone mission planning and control
Abstract: In this lecture, the audiovisual shooting mission is first formally defined. The introduced audiovisual shooting definitions are encoded in mission planning commands, i.e., a navigation and shooting action vocabulary, and their corresponding parameters. The drone mission commands, as well as the hardware/software architecture required for manual/autonomous mission execution, are described. The software infrastructure includes the planning modules, which assign, monitor and schedule different behaviours/tasks for the drone swarm team according to director and environmental requirements, and the control modules, which execute the planned mission by translating high-level commands into desired drone+camera configurations, producing commands for the autopilot, camera and gimbal of each drone in the swarm.
4. Introduction to car vision
Abstract: In this lecture, an overview of autonomous car technologies will be presented (structure, HW/SW, perception), focusing on car vision. An example autonomous vehicle will be presented as a special case, including its sensors and algorithms. Then, an overview of computer vision applications in this autonomous vehicle will be presented, such as visual odometry, lane detection, road segmentation, etc. The current progress of autonomous driving will also be introduced.
5. Introduction to autonomous marine vehicles
Abstract: Autonomous marine vehicles can be divided into surface vehicles (boats, ships) and underwater ones (submarines). They have many applications in marine transportation and marine/submarine surveillance, and pose many challenges in environment perception/mapping and vehicle control, to be reviewed in this lecture.
1. CVML programming tools
Abstract: This lecture overviews the various SW tools, libraries and environments used in computer vision and machine learning: the Robot Operating System (ROS), libraries (OpenCV, BLAS, cuBLAS, MKL-DNN, cuDNN), DNN frameworks (Neon, Tensorflow, PyTorch, Keras, MXNet), distributed/cloud computing (the MapReduce programming model, Apache Spark) and collaborative SW development tools (GitHub, Bitbucket).
1. Fast 1D convolution algorithms
Abstract: 1D convolutions are extensively used in digital signal processing (filtering/denoising) and analysis (also through CNNs). As their computational complexity is of the order O(N^2), their fast execution is a must.
This lecture will overview linear systems, linear and cyclic convolution and correlation. Then it will present their fast execution through FFTs, resulting in algorithms having computational complexity of the order O(N log2 N). Optimal Winograd 1D convolution algorithms, having a theoretically minimal number of computations, will be presented. Parallel block-based 1D convolution/calculation methods will be overviewed.
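The FFT speed-up described above can be sketched in Python (NumPy assumed); a direct O(N^2) cyclic convolution is included for comparison:

```python
import numpy as np

def cyclic_conv_fft(x, h):
    """N-point cyclic convolution via the FFT: O(N log N) instead of O(N^2).
    Pointwise multiplication in the frequency domain equals cyclic
    convolution in the time domain (cyclic convolution theorem)."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))

def cyclic_conv_direct(x, h):
    """Direct O(N^2) cyclic convolution, for comparison."""
    N = len(x)
    return [sum(x[m] * h[(n - m) % N] for m in range(N)) for n in range(N)]

x = [1.0, 2.0, 3.0, 4.0]
h = [1.0, 0.0, -1.0, 0.0]
# Both give the same result, up to floating-point rounding.
```

Linear convolution of an N-point signal with an M-point kernel is obtained the same way after zero-padding both sequences to length at least N + M - 1.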
2. Fast 2D convolution algorithms
Abstract: 2D convolutions play an extremely important role in machine learning, as they form the first layers of Convolutional Neural Networks (CNNs). They are also very important for computer vision (template matching through correlation, correlation trackers) and in image processing (image filtering/denoising).
Therefore, 2D/3D convolution algorithms are very important both for machine learning and for signal/image/video processing and analysis. As their computational complexity is of the order O(N^4) and O(N^6), respectively, their fast execution is a must.
This lecture will overview 2D linear and cyclic convolution. Then it will present their fast execution through FFTs, resulting in algorithms having computational complexity of the order O(N^2 log2 N). Optimal Winograd 2D convolution algorithms, having a theoretically minimal number of computations, will be presented. Parallel block-based 2D convolution/calculation methods will be overviewed. The use of 2D convolutions in Convolutional Neural Networks will be presented.
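The 2D case can be sketched the same way (NumPy assumed): cyclic 2D convolution via the 2D FFT, checked against a direct O(N^4) quadruple loop on a small example:

```python
import numpy as np

def cyclic_conv2d_fft(x, h):
    """2D cyclic convolution via the 2D FFT: O(N^2 log N) instead of O(N^4).
    h is zero-padded to the shape of x before transforming."""
    return np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(h, s=x.shape)))

rng = np.random.default_rng(0)
x = rng.random((8, 8))
h = np.zeros((8, 8)); h[:3, :3] = rng.random((3, 3))  # 3x3 kernel, zero-padded

# Direct O(N^4) 2D cyclic convolution, for comparison.
N, M = x.shape
direct = np.zeros_like(x)
for n in range(N):
    for m in range(M):
        direct[n, m] = sum(x[k, l] * h[(n - k) % N, (m - l) % M]
                           for k in range(N) for l in range(M))
```

Note that CNN layers actually compute correlations rather than convolutions (no kernel flip), but the same FFT and Winograd machinery applies.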
This course is addressed to any practicing engineer, scientist or student with some knowledge of computer vision and/or machine learning, notably CS, CSE, ECE or EE students, graduates or industry professionals with a relevant background.
IF YOU HAVE A QUESTION
•Prof. Ioannis Pitas: https://scholar.google.gr/citations?user=lWmGADwAAAAJ&hl=el
•Department of Computer Science, Aristotle University of Thessaloniki (AUTH): https://www.csd.auth.gr/en/
•Laboratory of Artificial Intelligence and Information Analysis: http://www.aiia.csd.auth.gr/