Master-Seminar: Multi-modal AI for Medicine (IN2107)

This year’s seminar will look at aspects of multi-modal machine learning in medicine and healthcare, focusing on:

  • Vision language models (VLMs) for medical and healthcare applications
  • Generic multi-modal AI models utilising imaging data, clinical reports, lab test results, electronic health records, and genomics
  • Foundation models for multi-modal medicine


At the end of the module, students should have:

  • a thorough understanding of current research in multi-modal AI in medicine, in particular of foundation models and large vision-language models and their impact on medicine
  • the ability to apply the concepts learned, critically evaluate research in the area, and conceptualise strategies to tackle the issues discussed


  • Each student will choose one paper from a provided list of papers, read it, and give a 15-minute presentation about the paper during the seminar sessions
  • All students are expected to participate actively in discussions during the seminar sessions
  • Each student will write a 2-page report after presenting and discussing the paper


Students are expected to be familiar with:

  • Mathematics basics (graduate level):
    • probability theory
    • linear algebra
    • calculus
  • Machine / deep learning basics, e.g. having completed:
    • Machine Learning (IN2064)
    • Introduction to Deep Learning

Preference might be given to students with:

  • Knowledge of deep learning models in medicine, especially vision and/or language models
  • Completion of related courses from our chair, e.g.:
    • AI in Medicine I
    • AI in Medicine II
  • Work experience in AI / Data Science for Medicine & Healthcare

Information session and sign-up

Information Slides - 2024/25 Winter Semester