DESCRIPTION

Acoustic digital signals are ubiquitous and have many applications in several disciplines, notably in:

  • Digital Media, Social Media (music, voice signals),
  • Biomedical Signal Analysis and Diagnosis,
  • Scientific signal acquisition of any sort, e.g., Environment Sensing, Geophysical Prospecting.

Text can be considered a 1D signal, as it evolves over time (more precisely, as an ordered sequence of letters or words). Its analysis is extremely important in social media applications (e.g., Tweet analysis), in the analysis of any written and/or broadcast text (e.g., news articles in newspapers) and in literary text analysis.

This CVML Web Module focuses on Acoustics, Speech and Text Processing and Analysis and their applications. Music genre recognition can be used to automatically characterize music style.

Neural Speech Recognition can be used for speech-to-text mapping. Natural Language Processing (NLP) can be used, among other things, for vector text embedding, which has numerous applications in text comparison, search and text sentiment analysis.
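As a rough illustration of how a vector text embedding supports text comparison, the sketch below maps each text to a word-count vector and compares vectors by cosine similarity. The word-count embedding is a toy stand-in chosen for this example (real NLP systems typically use learned embeddings), and the names embed and cosine_similarity as well as the sample sentences are assumptions made here, not part of the module.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): a sparse word-count vector, not a learned model."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse count vectors."""
    # Only words present in both vectors contribute to the dot product.
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty text is treated as dissimilar to everything
    return dot / (norm_u * norm_v)


# Hypothetical sample texts, used only to exercise the functions.
query = embed("the concert was great")
doc_a = embed("great concert last night")
doc_b = embed("stock prices fell sharply")

sim_a = cosine_similarity(query, doc_a)  # shares "concert" and "great"
sim_b = cosine_similarity(query, doc_b)  # shares no words

print(sim_a, sim_b)
```

A text search would rank doc_a above doc_b for this query, since it has the higher similarity. Word counts ignore word order and synonyms, which is one reason learned embeddings are preferred in practice.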

Automatic Speech Recognition.

Classical music and Jazz Mel spectrogram.

LECTURE LIST

  1. Music genre recognition
  2. Neural Speech Recognition
  3. Natural Language Processing
  4. NLP and Text Sentiment Analysis
  5. Large Language Models

CVML WEB LECTURE MODULE SCHEDULE

This module has been designed to be mastered within 1 week (or less), provided you have the proper background (at least that of an early undergraduate student in an EE, ECE, CS, CSE or any Engineering or Exact Sciences Department).
We propose that you follow the lecture order given above. You may skip a few lectures that are not of immediate interest to you and study them later.

On average, you can study 4 lectures per week. The related effort is as follows:
Studying the lecture PDF and completing the related comprehension questionnaire: 1-2 hours per lecture (on average, depending on your background).