# 2022/23 Taught Postgraduate Module Catalogue

## YCHI5081M Statistics and Modelling for Health Sciences

### 15 credits

Class Size: 40

Module manager: Dr Samuel D Relton
Email: s.d.relton@leeds.ac.uk

Taught: Semester 1 (Sep to Jan)

Year running 2022/23

### Pre-requisite qualifications

A 2:1 first degree (or equivalent) in a relevant subject, e.g. Social Sciences, STEMM, or Nursing, OR previous work experience (minimum 2 years) of handling and/or analysing data.

IELTS 7 – minimum of 6.5 in each component

### This module is mutually exclusive with

- NUFF5040M Statistics for Health Sciences
- YCHI5045M Statistics for Health Sciences

This module is not approved as an Elective

### Module summary

This module introduces students to statistical testing, generalized linear models (GLMs) and survival models, which are the foundation for analysing observational healthcare data. By the end of the course students will be able to model healthcare outcomes of interest, including 30-day mortality, treatment costs, and length of stay in hospital, on real-life datasets from sources such as NHS Digital. The module will also convey best practice in model evaluation and validation, based on the TRIPOD and STARD guidelines for reporting of statistical models in medical journals.

### Objectives

Introduction to statistical tests used throughout healthcare research, and the various statistical models commonly employed for binary, continuous, and survival outcomes. Students will learn to utilise these models on real datasets using the R/Python programming languages.

Learning outcomes
1. Apply basic statistical tests to compare measurements from different groups (e.g. t-test, Mann-Whitney U-test, Chi-squared test).
2. Describe the use of generalized linear models and survival models for predictive analytics and critically evaluate published research using such models based on best practice guidelines.
3. Demonstrate a critical understanding of the biases introduced by confounding variables and how to mitigate them in real-life and publicly available datasets.
4. Independently design and critically evaluate a basic statistical model, including nonlinear covariates (e.g. BMI), and interactions (e.g. age and sex).
5. Demonstrate a critical understanding of model selection and evaluation by selecting models using the Akaike Information Criterion and evaluating their performance using, for example, AUC, cross-validation, Q-Q plots, and calibration curves.
6. Critically evaluate the statistical methodology of published research and discuss the pros and cons of alternative approaches.
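
As an informal illustration of outcomes 1 and 5, the sketch below computes the AUC of a binary classifier directly from its definition: the proportion of (positive, negative) pairs in which the positive case receives the higher predicted score, with ties counted as one half. This is the Mann-Whitney U statistic divided by the number of such pairs. The function name `auc_from_scores` and the toy data are assumptions made for this example and are not taken from the module materials.

```python
def auc_from_scores(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as 0.5, so the
    result equals the Mann-Whitney U statistic divided by n_pos * n_neg.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    u = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                u += 1.0
            elif p == n:
                u += 0.5
    return u / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = event occurred, scores = predicted risk.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = auc_from_scores(labels, scores)
print(auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of cases, which is fine for teaching purposes; library implementations typically use a rank-based calculation instead.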

### Syllabus

1. Group comparison using statistical tests
2. Generalized linear models and survival models
3. Controlling for confounders, including nonlinear covariates and interactions
4. Model selection using the Akaike Information Criterion
5. Model evaluation: AUC, k-fold cross-validation, Q-Q plots, calibration curves
6. Signposting to advanced topics: mixed models, regularization, joint modelling
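
To make syllabus items 4 and 5 concrete, here is a minimal sketch in plain Python. It assumes the standard definition AIC = 2k - 2 ln(L), where k is the number of fitted parameters and L the maximised likelihood, and it splits row indices into k contiguous folds for cross-validation. The helper names `aic` and `k_fold_indices` and the log-likelihood values are hypothetical and chosen only for illustration.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n rows.

    In practice the rows would usually be shuffled first; that step is
    omitted here to keep the sketch deterministic.
    """
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    # Spread the remainder over the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


# Hypothetical fitted models: (maximised log-likelihood, number of parameters).
aic_simple = aic(log_likelihood=-120.5, n_params=3)
aic_complex = aic(log_likelihood=-118.9, n_params=5)
preferred = "simple" if aic_simple < aic_complex else "complex"
print(aic_simple, aic_complex, preferred)

folds = list(k_fold_indices(n=10, k=3))
for train, test in folds:
    print(len(train), test)
```

In this made-up comparison the more complex model fits slightly better, but not by enough to offset the penalty for its two extra parameters, so the simpler model has the lower AIC.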

### Teaching methods

| Delivery type | Number | Length hours | Student hours |
|---|---|---|---|
| Group learning | 5 | 1.00 | 5.00 |
| Lecture | 5 | 1.00 | 5.00 |
| Practical | 5 | 3.00 | 15.00 |
| Seminar | 5 | 1.00 | 5.00 |
| Private study hours | | | 120.00 |
| Total Contact hours | | | 30.00 |
| Total hours (100hr per 10 credits) | | | 150.00 |

### Private study

120 hours private study. Students are expected to read further detail on the topics discussed from the reading list and relevant literature on the subject.

### Opportunities for Formative Feedback

Students will submit a statistical analysis plan before beginning their coursework, to allow for feedback on their planned analysis.
Students will also have ample opportunities to discuss specific points with staff during the practical sessions. Staff will work through problems with students and provide guidance.

### Methods of assessment

Coursework
| Assessment type | Notes | % of formal assessment |
|---|---|---|
| Essay | Formative. 500 words. Statistical analysis plan detailing the student's approach to cleaning the data, choosing appropriate features and models, then testing their model. | 0.00 |
| Report | Summative. Project Report. 3000 words. Analysis and validation of a real-life dataset, using the approaches discussed in lectures. Choice of a survival or binary outcome. | 100.00 |
| Total percentage (Assessment Coursework) | | 100.00 |

Normally resits will be assessed by the same methodology as the first attempt, unless otherwise stated.