Module Number: ML-4309
Module Title: Data Compression with and without Deep Probabilistic Models
Lecture Type(s): Lecture, Tutorial
ECTS: 6
Workload: 180 h
- Contact time: 60 h / 4 SWS
- Self study: 120 h
Duration: 1 semester
Frequency: In the summer semester
Language of instruction: English
Type of Exam: Written exam

Content

This course covers lossless and lossy data compression, from information theory to applications and from established compression algorithms to novel machine-learning-based methods. Research on data compression has made rapid progress in recent years: novel machine-learning-based methods are now beginning to significantly outperform even the best conventional compression methods, in particular for image and video compression.

We will discuss and prove information-theoretic foundations of compression (e.g., theoretical bounds on the bitrate, rate/distortion theory, and the source/channel separation theorem). Building on these concepts, we will then discuss and analyze various established practical algorithms for data compression (e.g., Huffman Coding, Arithmetic Coding, Asymmetric Numeral Systems, and Bits-Back Coding). Finally, we will cover the emerging field of machine-learning-based data compression, discussing important methods such as variational inference and deep probabilistic models like variational autoencoders.
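To give a flavor of the kind of algorithm covered in the lecture, the following is a minimal, illustrative Python sketch of Huffman coding (an informal example, not course material; the example string and function names are made up):

```python
import heapq
from collections import Counter

def huffman_codes(freqs):
    """Build a prefix-free binary code from symbol frequencies (Huffman's algorithm)."""
    # Heap entries are (weight, tiebreaker, tree); a tree is either a symbol
    # or a (left, right) pair. The tiebreaker keeps tuple comparisons well-defined.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    if len(heap) == 1:  # degenerate case: a single symbol still needs one bit
        return {heap[0][2]: "0"}
    while len(heap) > 1:  # repeatedly merge the two least frequent subtrees
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, counter, (t1, t2)))
        counter += 1
    codes = {}
    def assign(tree, prefix):
        if isinstance(tree, tuple):
            assign(tree[0], prefix + "0")
            assign(tree[1], prefix + "1")
        else:
            codes[tree] = prefix
    assign(heap[0][2], "")
    return codes

def huffman_decode(bits, codes):
    """Decode a bit string by greedily matching prefix-free code words."""
    inverse = {code: sym for sym, code in codes.items()}
    symbols, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:
            symbols.append(inverse[buf])
            buf = ""
    return "".join(symbols)

text = "abracadabra"
codes = huffman_codes(Counter(text))
encoded = "".join(codes[ch] for ch in text)
decoded = huffman_decode(encoded, codes)
```

Because no code word is a prefix of another, the bit stream can be decoded unambiguously without separators. Arithmetic Coding and Asymmetric Numeral Systems, also covered in the course, get closer to the entropy bound by not rounding code lengths to whole bits per symbol.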

Detailed course schedule: https://robamler.github.io/teaching/compress23/
Enrollment (link fixed on 7 April 2023): https://ovidius.uni-tuebingen.de/ilias3/goto.php?target=crs_4116461

Objectives

On the theory side, you will learn information-theoretic bounds for lossless and lossy compression, several algorithms for so-called entropy coding with their respective advantages and disadvantages, and the foundations of probabilistic machine learning, in particular scalable approximate Bayesian inference.
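As a small illustration of the kind of bound referred to above (an informal sketch, not course material): Shannon's source coding theorem states that no lossless code can achieve an expected length per symbol below the entropy H(X) = −Σ p(x) log₂ p(x).

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits: a lower bound on the expected code length per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A made-up source with dyadic probabilities, where the bound is achieved exactly:
probs = [0.5, 0.25, 0.125, 0.125]
lengths = [1, 2, 3, 3]  # code lengths of an optimal prefix-free code for this source
H = entropy_bits(probs)
expected_length = sum(p * l for p, l in zip(probs, lengths))
# Here expected_length equals H (1.75 bits); for non-dyadic probabilities,
# a Huffman code can exceed H by up to 1 bit per symbol.
```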

On the applied side, the tutorials will teach you how to implement entropy coding algorithms in real code, and how to train various types of deep probabilistic machine-learning models, integrate them into data compression algorithms, and evaluate their performance.

Allocation of credits / grading

Type of Class | Status | SWS | Credits | Type of Exam | Exam duration | Evaluation | Calculation of Module (%)
Lecture (V)   | o      | 3   | 4.5     |              |               | g          |
Tutorial (Ü)  | o      | 1   | 1.5     |              |               |            |
Prerequisite for participation: There are no formal prerequisites. Students should already have a sound understanding of multivariate calculus and should be able to write simple programs in Python. Parallel attendance of the course "Probabilistic Machine Learning" will likely be helpful, but is not formally required.
Lecturer / Other: Bamler
Literature

I will recommend relevant literature in the first lecture. However, since this course covers an emerging field of research, there is no canonical reference yet that covers all discussed topics. Specially made and recently revised videos for all covered topics, as well as lecture notes and solutions to the problem sets, will be provided on the course website.

Last offered: Summer semester 2022
Planned for: Summer semester 2023
Assigned Study Areas: INFO-INFO, MEDI-APPL, MEDI-INFO, ML-CS, ML-DIV