Module Number: ML-4309
Module Title: Data Compression with and without Deep Probabilistic Models
Lecture Type(s): Lecture, Tutorial
ECTS: 6
Workload: 180 h
- Contact time: 60 h / 4 SWS
- Self study: 120 h
Duration: 1 semester
Frequency: In the summer semester
Language of instruction: English
Type of Exam: Written exam

Content

This course covers lossless and lossy data compression, from information theory to applications and from established compression algorithms to novel machine-learning-based methods. Research on data compression has made rapid progress in recent years: novel machine-learning-based methods are now beginning to significantly outperform even the best conventional compression methods, in particular for image and video compression.

We will discuss and prove information-theoretic foundations of compression (e.g., theoretical bounds on the bitrate, rate/distortion theory, and the source/channel separation theorem). Building on these concepts, we will then discuss and analyze various established practical algorithms for data compression (e.g., Huffman Coding, Arithmetic Coding, Asymmetric Numeral Systems, and Bits-Back Coding). Finally, we will cover the emerging field of machine-learning-based data compression, discussing important methods such as variational inference and deep probabilistic models like variational autoencoders.
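To give a flavor of the kind of algorithm covered in the lecture, the following is a minimal, illustrative Python sketch of Huffman coding (an informal example, not course material; the example string and function names are made up):

```python
import heapq
from collections import Counter

def huffman_codes(freqs):
    """Build a prefix-free binary code from symbol frequencies (Huffman's algorithm)."""
    # Heap entries are (weight, tiebreaker, tree); a tree is either a symbol
    # or a (left, right) pair. The tiebreaker keeps tuple comparisons well-defined.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    if len(heap) == 1:  # degenerate case: a single symbol still needs one bit
        return {heap[0][2]: "0"}
    while len(heap) > 1:  # repeatedly merge the two least frequent subtrees
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, counter, (t1, t2)))
        counter += 1
    codes = {}
    def assign(tree, prefix):
        if isinstance(tree, tuple):
            assign(tree[0], prefix + "0")
            assign(tree[1], prefix + "1")
        else:
            codes[tree] = prefix
    assign(heap[0][2], "")
    return codes

def huffman_decode(bits, codes):
    """Decode a bit string by greedily matching prefix-free code words."""
    inverse = {code: sym for sym, code in codes.items()}
    symbols, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:
            symbols.append(inverse[buf])
            buf = ""
    return "".join(symbols)

text = "abracadabra"
codes = huffman_codes(Counter(text))
encoded = "".join(codes[ch] for ch in text)
decoded = huffman_decode(encoded, codes)
```

Because no code word is a prefix of another, the bit stream can be decoded unambiguously without separators. Arithmetic Coding and Asymmetric Numeral Systems, also covered in the course, get closer to the entropy bound by not rounding code lengths to whole bits per symbol.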

Detailed course schedule: https://robamler.github.io/teaching/compress23/
Enrollment (link fixed on 7 April 2023): https://ovidius.uni-tuebingen.de/ilias3/goto.php?target=crs_4116461

Objectives

On the theory side, you will learn information-theoretic bounds for lossless and lossy compression, several algorithms for so-called entropy coding with their respective advantages and disadvantages, and the foundations of probabilistic machine learning, in particular scalable approximate Bayesian inference.
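As a small illustration of the kind of bound referred to above (an informal sketch, not course material): Shannon's source coding theorem states that no lossless code can achieve an expected length per symbol below the entropy H(X) = −Σ p(x) log₂ p(x).

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits: a lower bound on the expected code length per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A made-up source with dyadic probabilities, where the bound is achieved exactly:
probs = [0.5, 0.25, 0.125, 0.125]
lengths = [1, 2, 3, 3]  # code lengths of an optimal prefix-free code for this source
H = entropy_bits(probs)
expected_length = sum(p * l for p, l in zip(probs, lengths))
# Here expected_length equals H (1.75 bits); for non-dyadic probabilities,
# a Huffman code can exceed H by up to 1 bit per symbol.
```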

On the applied side, the tutorials will teach you how to implement entropy coding algorithms in real code, and how to train various types of deep probabilistic machine-learning models, integrate them into data compression algorithms, and evaluate their performance.

Allocation of credits / grading

Type of Class | Status | SWS | Credits | Type of Exam | Exam duration | Evaluation | Calculation of Module (%)
Lecture (V)   | o      | 3   | 4.5     |              |               | g          |
Tutorial (Ü)  | o      | 1   | 1.5     |              |               |            |
Prerequisite for participation: There are no formal prerequisites. Students should already have a sound understanding of multivariate calculus and should be able to write simple programs in Python. Parallel attendance of the course "Probabilistic Machine Learning" will likely be helpful, but is not formally required.
Lecturer / Other: Bamler
Literature

I will recommend relevant literature in the first lecture. However, since this course covers an emerging field of research, there is no canonical reference yet that covers all discussed topics. Specially made and recently revised videos for all covered topics, as well as lecture notes and solutions to the problem sets, will be provided on the course website.

Last offered: Summer semester 2022
Planned for: Summer semester 2023
Assigned Study Areas: INFO-INFO, MEDI-APPL, MEDI-INFO, ML-CS, ML-DIV