DESCRIPTION

This course explores the evolution of information retrieval (IR) from traditional techniques to advanced neural and generative models. It begins with an introduction to sparse and dense retrieval methods, covering recent neural models that use learned representations to enhance search effectiveness (e.g. Dense Passage Retrieval) and the research challenges we still face (e.g. hard negative mining, distillation). Next, the course delves into neural retrieval techniques that balance efficiency and precision by optimizing both token interactions (e.g. late interaction models such as ColBERT) and learned representations (e.g. SPLADE). We’ll then explore retrieval-augmented generation (RAG), where retrieval is used to condition generative models, and the role of large language models in combining retrieval with text generation for complex tasks. In the final section, we examine cutting-edge trends in generative IR, including end-to-end differentiable models and other approaches that unify retrieval and generation processes. By the end of the course, students will have a solid understanding of the advantages and disadvantages of these advanced methods, as well as of open research questions in the field of IR.

DETAILS

Course type: Short Course

Institution of lecturer: University of Amsterdam

LECTURER

Prof. Evangelos Kanoulas

Short CV: Evangelos Kanoulas (https://staff.fnwi.uva.nl/e.kanoulas/) is a professor of computer science at the University of Amsterdam, leading the Information Retrieval Lab (https://irlab.science.uva.nl/) at the Informatics Institute. His research focuses on developing evaluation methods and algorithms for search and recommendation, with an emphasis on learning robust models of language that can be used to understand noisy human language, retrieve textual data from large corpora, generate faithful and factual text, and converse with the user. Prior to joining the University of Amsterdam, he was a research scientist at two of the leading companies in search technology, Google and Microsoft, and before that, a Marie Curie fellow and postdoc at the University of Sheffield. His research has been published at SIGIR, CIKM, KDD, WWW, WSDM, EMNLP, and other venues in the fields of IR and NLP. He has proposed and organized numerous search benchmarking competitions as part of the Text Retrieval Conference (TREC) and the Conference and Labs of the Evaluation Forum (CLEF), such as Dynamic Search and TAR. Furthermore, he is a member of the ELLIS Society (https://ellis.eu/) and a co-founder of Ellogon AI (https://ellogon.ai/), a company that focuses on personalizing immunotherapy.