Introduction to data linkage

Open to
Government analysts
Training category
Analytical, Data linkage
Type of training
14 hours
Data Science Campus Faculty

Data linkage provides insight, informs policy change and helps answer society’s most important questions by increasing the utility of data.

This self-study course will introduce you to the principles, theory and practice of data linkage. You must have completed the ‘Awareness in data linkage’ course to enrol on this course.

Learning outcomes

By the end of the course, participants will:

  • understand the difficulties involved in data linkage
  • know how to prepare datasets before matching and select matching variables
  • know how to account for partial agreement of matching variables
  • have reviewed different types of linkage methods
  • know how to link very large datasets
  • know how to evaluate the quality of their matches
  • have gained practical experience in linking data
  • have designed their own linkage methods

Participants will also be guided through exercises in coding their own linkage methods in Python. No previous coding experience or non-standard software is needed for these exercises.
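To give a flavour of the topics listed above, the sketch below shows one simple way that partial agreement between matching variables and the linking of larger datasets might be approached. It is not taken from the course materials. The field names (`id`, `name`, `birth_year`), the sample records, the `similarity` and `link` helpers and the 0.85 threshold are all assumptions made for illustration, and it uses only the Python standard library.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-1 similarity ratio, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def link(records_a, records_b, threshold=0.85):
    """Pair records that share a birth year and have similar names.

    The threshold is an illustrative assumption, not a recommended value.
    """
    # Blocking: group dataset B by birth year so that only records in the
    # same block are compared, rather than every possible pair.
    blocks = {}
    for rec in records_b:
        blocks.setdefault(rec["birth_year"], []).append(rec)

    matches = []
    for rec_a in records_a:
        for rec_b in blocks.get(rec_a["birth_year"], []):
            # Partial agreement: names need not be identical, only similar.
            score = similarity(rec_a["name"], rec_b["name"])
            if score >= threshold:
                matches.append((rec_a["id"], rec_b["id"], round(score, 2)))
    return matches


# Hypothetical sample records, invented for this example.
records_a = [
    {"id": "a1", "name": "Catherine Jones", "birth_year": 1980},
    {"id": "a2", "name": "Sam Patel", "birth_year": 1975},
]
records_b = [
    {"id": "b1", "name": "Katherine Jones", "birth_year": 1980},
    {"id": "b2", "name": "Sam Patel", "birth_year": 1990},
]

matches = link(records_a, records_b)
print(matches)
```

Here the two spellings of the first name are linked despite differing by one letter, while the two "Sam Patel" records are never compared because they fall into different birth-year blocks. A real linkage would usually combine several matching variables and check the quality of the resulting matches.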

How to book

Please use your Learning Hub account to access the course online. If you do not have a Learning Hub account, please contact


If you would like more information about this course, please email 

Related courses

Data linkage in R

Data linkage in Python