| Module Number | INFO-4xxx |
|---|---|
| Module Title | AI Safety |
| Lecture Type(s) | Lecture, Tutorial |
| ECTS | 6 |
| Workload | 180 h total: 60 h contact time (4 SWS), 120 h self-study |
| Duration | 1 Semester |
| Frequency | Irregular |
| Language of instruction | English |
| Type of Exam | Project and closed-book exam |
| Content | This course provides a comprehensive introduction to safety and reliability in modern AI systems, with a focus on large language models and AI agents. Students will explore failure modes including adversarial attacks, jailbreaks, prompt injections, and hallucinations, along with approaches to detect and prevent these failures. The curriculum covers alignment challenges such as emergent misalignment, scalable oversight, and AI control methods for managing increasingly capable systems. |
| Objectives | Students will gain hands-on experience with interpretability techniques, evaluation methods, and practical tools for watermarking and detecting AI-generated content, as well as the copyright implications of LLMs. By the end of the course, students will understand both the theoretical foundations and practical aspects of building safer AI systems, including methods for predicting AI capabilities. |
| Allocation of credits / grading | Type of Class · Status · SWS · Credits · Type of Exam · Exam duration · Evaluation · Calculation of Module (%) |
| Prerequisite for participation | There are no specific prerequisites. |
| Lecturer / Other | Varying lecturers |
| Literature | Prerequisites: prior coursework in deep learning, statistical machine learning, or LLMs. |
| Last offered | unknown |
| Planned for | currently not planned |
| Assigned Study Areas | |