Module and Programme Catalogue

2018/19 Taught Postgraduate Module Catalogue

EPIB5040M Introduction to Health Data Science

15 credits

Class Size: 45

Module manager: Dr Peter Tennant

Taught: Semester 1 (Sep to Jan)

Year running 2018/19

Pre-requisite qualifications

Academic entry requirements

A first degree in a quantitative or scientific subject area with substantial mathematical, statistical or numeracy components (at least 2:1). We also consider work experience (two years or more) of research in a quantitative subject area. Non-graduates who have successfully completed three years of a UK medical degree, are normally ranked in the top 50% of the year 3 cohort, and wish to take the Health Data Analytics MSc as an intercalated programme will also be accepted.

English language requirements

An overall score of 7.0 on IELTS (International English Language Testing System) with at least 6.0 in writing and no other skill below 6.5; from a TOEFL paper-based test the requirement is a minimum score of 600, with 4.5 in the Test of Written English (TWE); from a TOEFL computer-based test the requirement is a minimum score of 250, with 4.5 TWE; from a TOEFL Internet-based test the requirement is a minimum score of 100, with 25 in the "Writing Skills" score.

This module is mutually exclusive with

EPIB5022M Core Epidemiology

Module replaces

EPIB5022M Core Epidemiology

This module is approved as an Elective

Module summary

The module is designed to provide students with a thorough grounding in the principles of planning, conducting, and critically reviewing data scientific research in the contexts of health and medicine. By the end of the module, students will be confident with: the language and conventions of health data science, calculating and interpreting measures of occurrence and association, designing and evaluating scientific studies in populations, identifying and appraising sources of bias, and using causal diagrams to support causal reasoning.


The objectives of this module are to:

- Introduce the language, conventions, and core principles of the scientific study of health and of disease in populations
- Introduce the key measures for counting and describing the occurrence of health states, events, and disease
- Introduce the key measures for describing the association between exposures and outcomes
- Describe a range of study design approaches for examining health and disease in populations
- Introduce common sources of error and bias when studying health and disease in populations
- Introduce non-deterministic causation and explore different approaches to causal reasoning and distinguishing cause from association
- Emphasise the attitudes and behaviours that lie at the heart of the scientific endeavour.

Learning outcomes
Health data science
By the end of this module the student should be able to:

- Define the scientific study of health and of disease in populations
- Define and provide examples of exposures, outcomes, and populations in the context of health data science
- Explain the inductive scientific method and recognise the focus on refuting, not proving, hypotheses
- Explain why health data science is both a multidisciplinary science and a social science
- Recognise the attitudes and behaviours that exemplify a good data scientist

Descriptive study
By the end of this module the student should be able to:

- Explain the reasons for, and benefits of, descriptive research in health data science
- Define, discriminate between, and calculate the most common measures of occurrence
- Understand the relationship between incidence, prevalence, mortality, and cure rate in determining the burden of disease
- Discuss the role of descriptive research in hypothesis generation
- Identify the appropriate study design to estimate an intended measure of occurrence
- Critically evaluate the relative strengths, weaknesses, and contributions of experimental and observational research
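
As a minimal illustration of the measures of occurrence named in these outcomes, the sketch below computes a prevalence proportion, a cumulative incidence proportion, and an incidence rate. The function names and all cohort figures are hypothetical, chosen for easy arithmetic, and are not taken from the module materials.

```python
def prevalence_proportion(existing_cases: int, population: int) -> float:
    """Share of the population that has the condition at one point in time."""
    return existing_cases / population


def cumulative_incidence(new_cases: int, at_risk_at_start: int) -> float:
    """Share of an initially condition-free group that develops the condition."""
    return new_cases / at_risk_at_start


def incidence_rate(new_cases: int, person_time_at_risk: float) -> float:
    """New cases per unit of person-time at risk (for example, person-years)."""
    return new_cases / person_time_at_risk


# Hypothetical figures (assumptions for illustration only).
population = 1_000                      # everyone surveyed at baseline
existing_cases = 50                     # already have the condition at baseline
at_risk = population - existing_cases   # condition-free people followed up
new_cases = 19                          # develop the condition during follow-up
person_years = 1_900.0                  # follow-up time contributed while at risk

print(f"Prevalence proportion: {prevalence_proportion(existing_cases, population):.3f}")
print(f"Cumulative incidence:  {cumulative_incidence(new_cases, at_risk):.3f}")
print(f"Incidence rate:        {incidence_rate(new_cases, person_years):.3f} per person-year")
```

The three functions share the same arithmetic but differ in their denominators: a population at a point in time, a group at risk at the start of follow-up, and accumulated person-time.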

Analytic studies
By the end of this module the student should be able to:

- Identify and describe common experimental and observational study designs used in health data science and critically evaluate their appropriateness in different contexts and their relative strengths and weaknesses
- Define, discriminate between, and calculate the most common measures of association and attribution
- Critically discuss the validity of the 'hierarchy of evidence'
- Explain the 'Fundamental Problem of Causal Inference' and what approaches are used to improve estimates of the counterfactual
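
To make the measures of association and attribution more concrete, here is a hedged sketch that derives a risk ratio, risk difference, odds ratio, and attributable fraction among the exposed from a single 2x2 table. The `TwoByTwo` class and its counts are hypothetical and only illustrate the standard formulas; they are not the module's own teaching code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a hypothetical cohort, cross-classified by exposure and outcome."""
    exposed_cases: int
    exposed_noncases: int
    unexposed_cases: int
    unexposed_noncases: int

    @property
    def risk_exposed(self) -> float:
        return self.exposed_cases / (self.exposed_cases + self.exposed_noncases)

    @property
    def risk_unexposed(self) -> float:
        return self.unexposed_cases / (self.unexposed_cases + self.unexposed_noncases)

    def risk_ratio(self) -> float:
        return self.risk_exposed / self.risk_unexposed

    def risk_difference(self) -> float:
        return self.risk_exposed - self.risk_unexposed

    def odds_ratio(self) -> float:
        # Cross-product ratio: (a * d) / (b * c).
        return (self.exposed_cases * self.unexposed_noncases) / (
            self.exposed_noncases * self.unexposed_cases
        )

    def attributable_fraction_exposed(self) -> float:
        # Share of risk among the exposed in excess of the unexposed risk.
        return self.risk_difference() / self.risk_exposed


# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed develop the outcome.
table = TwoByTwo(exposed_cases=30, exposed_noncases=70,
                 unexposed_cases=10, unexposed_noncases=90)

print(f"Risk ratio:            {table.risk_ratio():.2f}")
print(f"Risk difference:       {table.risk_difference():.2f}")
print(f"Odds ratio:            {table.odds_ratio():.2f}")
print(f"Attributable fraction: {table.attributable_fraction_exposed():.2f}")
```

With these counts the odds ratio is further from 1 than the risk ratio, which is expected when the outcome is common. Whether any of these quantities can be read causally depends on the study design and the assumptions discussed elsewhere in the module.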

Error, bias, and confounding
By the end of this module the student should be able to:

- Recognise error and bias as universal problems in data scientific study
- Define error and bias, and describe their relationships with precision, accuracy, and sample size
- Explain the limitations of sampling-based confidence intervals and p-values in observational research
- Identify and describe common sources of error and bias and suggest ways to reduce their effect
- Define and identify examples of confounding in observational research
- Describe conceptual approaches to reduce confounding bias, explaining 'conditional exchangeability' and the difference between unobserved and residual confounding
- Critically discuss the importance of representativeness in descriptive and analytic research
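
The idea of confounding can be illustrated with a small stratified example. In the sketch below, every number is hypothetical: the exposure has no association with the outcome within either stratum of a third variable, yet the crude (pooled) risk ratio suggests a strong association because the third variable is linked to both exposure and outcome.

```python
def risk_ratio(cases_exposed: int, n_exposed: int,
               cases_unexposed: int, n_unexposed: int) -> float:
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)


# Hypothetical counts per stratum:
# (cases in exposed, total exposed, cases in unexposed, total unexposed).
strata = {
    "confounder absent": (10, 100, 40, 400),    # risk 0.10 in both groups
    "confounder present": (160, 400, 40, 100),  # risk 0.40 in both groups
}

# Stratum-specific risk ratios: the comparison is made within levels of the confounder.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

# Crude risk ratio: the strata are pooled, ignoring the confounder.
crude_counts = tuple(sum(column) for column in zip(*strata.values()))
crude_rr = risk_ratio(*crude_counts)

for name, value in stratum_rr.items():
    print(f"Risk ratio, {name}: {value:.2f}")
print(f"Crude risk ratio: {crude_rr:.2f}")
```

Here the exposed group is mostly drawn from the high-risk stratum, so pooling mixes the effect of the third variable into the crude comparison. Stratification is only one of several conceptual approaches to reducing confounding bias, and it addresses measured confounders only.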

Causal reasoning
By the end of this module the student should be able to:

- Distinguish between a causal effect and an association
- Identify the features of a non-deterministic cause, and describe how a cause can be probabilistic
- Define component and sufficient causes
- Construct and interpret different causal models, including 'causal pies' and 'causal diagrams', to help with understanding and estimating causal effects
- Critically evaluate practical and criteria-based approaches to aiding causal inference
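
A causal diagram can be represented in code as a directed acyclic graph. The sketch below is an illustration under stated assumptions: the variable names and arrows are invented, the helper functions are hypothetical, and the "common cause" check is a deliberately simple heuristic rather than a full adjustment-set algorithm. It uses `graphlib` from the Python standard library (Python 3.9 or later) to confirm that the diagram has no cycles.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical diagram: each variable maps to the set of its direct causes.
parents = {
    "coffee": {"smoking", "shift_work"},
    "heart_disease": {"coffee", "smoking"},
    "smoking": set(),
    "shift_work": set(),
}


def is_acyclic(graph: dict) -> bool:
    """Return True if the diagram contains no directed cycle."""
    try:
        tuple(TopologicalSorter(graph).static_order())
    except CycleError:
        return False
    return True


def ancestors(graph: dict, node: str, blocked: frozenset = frozenset()) -> set:
    """All causes of `node`, direct or indirect, never passing through `blocked`."""
    found, stack = set(), [node]
    while stack:
        for parent in graph.get(stack.pop(), ()):
            if parent not in found and parent not in blocked:
                found.add(parent)
                stack.append(parent)
    return found


def common_causes(graph: dict, exposure: str, outcome: str) -> set:
    """Variables that cause the exposure and also cause the outcome
    along a path that does not run through the exposure."""
    via_other_paths = ancestors(graph, outcome, blocked=frozenset({exposure}))
    return ancestors(graph, exposure) & via_other_paths


print("Acyclic:", is_acyclic(parents))
print("Common causes:", common_causes(parents, "coffee", "heart_disease"))
```

In this invented diagram, smoking is flagged as a common cause of the exposure and the outcome, whereas shift work is not, because it reaches the outcome only through the exposure.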

Transferable skills
By the end of this module the student should be able to:

- Critically appraise existing research in health data science
- Recognise and practise the professional attitudes and behaviours needed for robust scientific research

Skills outcomes
Recognise and practise the professional attitudes and behaviours needed for robust scientific research.


The module will be delivered by Dr Peter Tennant over 6 weeks, as a blend of face-to-face small group work and lectures, vodcasts (audio-visual presentations), online written material, online formative questions, answers and feedback.

The course will cover the following subjects:

Introduction to health data science, the inductive scientific method, experimental and observational research in populations, study designs, case-reports, case-series, register-based studies, cross-sectional studies, cohort studies, ecological studies, case-control studies, natural experiments, pre-post studies, quasi-experimental studies, randomised controlled trials, randomised crossover trials, the 'hierarchy of evidence', measures of occurrence, prevalence proportions, cumulative incidence proportions, incidence rates, cumulative mortality ratios, mortality rates, case-fatality ratios, stratification, standardisation, measures of association, risk ratios, rate ratios, odds ratios, risk differences, attributable risks and fractions, error, bias, accuracy, precision, random sampling variation, selection bias, information bias, confounding bias, the ecological fallacy, generalisability, causal inference, directed acyclic graphs, causal pies, causal criteria, counterfactual reasoning, probabilistic causation.

Teaching methods

| Delivery type | Number | Length hours | Student hours |
| --- | --- | --- | --- |
| Independent online learning hours | | | 60.00 |
| Private study hours | | | 72.00 |
| Total Contact hours | | | 18.00 |
| Total hours (100hr per 10 credits) | | | 150.00 |

Private study

The module will exploit staged web-based teaching as follows:

- Comprehensive web-based material covering the basics of study design and the statistical analysis of these designs.
- A series of reusable learning objects, as online audio-visual presentations, designed to cover in an entertaining way the basics of epidemiology as a science and the details of study design.
- Online formative assessments, delivered using Minerva, will assess students' learning throughout the web-based material and the audio-visual casts.
- Supplementary reading recommended from text books and significant papers will be set. Students will be set a weekly reading task and the discussion rooms within Minerva will be used to discuss a question about each article.
- An online discussion forum will allow students and lecturers to discuss questions and answers about the material being covered.

Opportunities for Formative Feedback

This will be done in a number of ways:

- Access logs for the online material on Minerva. Although the material is self-paced, reasonable aims will be defined for progress through it.
- Recording and monitoring of progress with online summative questions/answers within Minerva.

Methods of assessment

| Assessment type | Notes | % of formal assessment |
| --- | --- | --- |
| In-course MCQ | Summative, timed, open book assessment delivered securely using Minerva | 15.00 |
| Essay or Dissertation | Short answer questions - 1000 words total | 70.00 |
| In-course MCQ | Summative, timed, open book assessment delivered securely using Minerva | 15.00 |
| Total percentage (Assessment Coursework) | | 100.00 |

Normally resits will be assessed by the same methodology as the first attempt, unless otherwise stated

Reading list

The reading list is available from the Library website

Last updated: 10/04/2019


© Copyright Leeds 2019