2021/22 Taught Postgraduate Module Catalogue
COMP5510M Data Science & Analytics for Causal Inference and Prediction
15 credits
Class Size: 80
Module manager: Prof Mark Gilthorpe
Taught: 1 Sep to 31 Jan (adv yr), Semester 1 (Sep to Jan)
Year running 2021/22
Pre-requisite qualifications
Academic entry requirements
A first degree (at least 2:1) in a quantitative or scientific subject area with substantial mathematical, statistical or numeracy components. We also consider work experience (two years or more) of research in a quantitative subject area. Non-graduates who have successfully completed three years of a UK medical degree, are normally ranked in the top 50% of the year 3 cohort, and wish to take the MRes in Data Science & Analytics for Health as an intercalated programme will also be accepted.
English language requirements
An overall score of 7.0 on IELTS (International English Language Testing System) with at least 6.0 in writing and no other skill below 6.5; from a TOEFL paper-based test the requirement is a minimum score of 600, with 4.5 in the Test of Written English (TWE); from a TOEFL computer-based test the requirement is a minimum score of 250, with 4.5 TWE; from a TOEFL Internet-based test the requirement is a minimum score of 100, with 25 in the "Writing Skills" score.
This module is not approved as an Elective
Module summary
The module is designed to give students a comprehensive introduction to linear modelling and equip them with the skills and knowledge necessary to analyse various outcome data types. By the end of the module students will be able to: identify suitable linear models for analysing a variety of different outcome types; fit a linear model using statistical software, including selection of model parameters; and compare models and assess whether the fitted model is appropriate.
Objectives
The module is designed to give students a comprehensive introduction to linear modelling and equip them with the skills and understanding necessary to analyse various outcome data types.
Upon successful completion of the module the students will have demonstrated:
1. in-depth specialist knowledge related to: the methods of least squares and maximum likelihood; the principle of parsimony; and the types and principles (theory, limitations and assumptions) of (generalised) linear modelling – by successful engagement with the 10 Practical sessions and successful completion of assignment (ii), see below;
2. a sophisticated understanding of concepts at the cutting edge of data science, including the utility of linear models for different outcome data types and for prediction and causal inference – by successful engagement with the 11 Lectures and successful completion of assignment (i), see below;
3. proficient intellectual and practical analytical skills relevant for: the critical appraisal of linear models; the application of linear modelling in different contexts/datasets using R statistical software, with covariate subset selection appropriate to either prediction or causal inference, the latter informed by causal diagrams – by successful completion of assignment (i), see below;
4. independent ideas and hypotheses, particularly concerning: decisions and assumptions involved in linear modelling – by successful engagement with the Tutorial and successful completion of assignment (ii), see below;
5. self-reflection with regard to the development of collegial, participative and professional relationships with peers, colleagues, trainers and hosts, including: the responsibilities, constraints, rewards and requirements for publishing robust linear models – by successful engagement with the 10 Practical sessions and Tutorial, see below;
6. an ability to keep abreast of current issues, new developments and professional advances as these impact upon: the implementation of causal theories within theoretical, pure and applied data science – by the successful application of Private Study as evidenced in the successful completion of assignments (i) and (ii), see below.
Statistical analysis skills. Practical modelling skills for observational data and critical evaluation of the use of linear models.
The module will be delivered through a blend of face-to-face lectures and hands-on practical classes, with selected written material provided online to support private study.
The course will cover the following topics:
(a) introduction to linear models and R statistical software for fitting linear models;
(b) correlation and simple linear regression; multiple linear regression (including maximum likelihood estimation);
(c) model fitting, parameter estimation, interpretation and model diagnostics;
(d) generalised linear modelling (including logistic regression analysis and Poisson regression); and
(e) the use of directed acyclic graphs (DAGs) to support statistical modelling for causal inference.
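As a rough illustration of topics (b) and (c), the sketch below fits a simple linear regression by ordinary least squares and reports the correlation coefficient. The module itself uses R; plain Python is used here only as an assumption for illustration. The function name `simple_linear_regression` and the data are hypothetical and are not taken from the module materials.

```python
# Illustrative sketch only: the module uses R; this example uses plain Python.
from math import sqrt


def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope, r), where r is Pearson's correlation.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross-products about the means
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if s_xx == 0:
        raise ValueError("x must not be constant")

    slope = s_xy / s_xx                 # least-squares slope estimate
    intercept = y_bar - slope * x_bar   # fitted line passes through the means
    r = s_xy / sqrt(s_xx * s_yy) if s_yy > 0 else float("nan")
    return intercept, slope, r


# Made-up data lying exactly on the line y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
intercept, slope, r = simple_linear_regression(x, y)
print(intercept, slope, r)
```

With real observational data the points will not lie exactly on a line, so the fitted coefficients should be read alongside model diagnostics, as covered in topic (c).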
Attainment of intended learning outcomes will be assessed using three separate assignments:
(i) critical appraisal of two project reports, each using predictive or causal inference modelling – 33.3% each; and
(ii) a two-hour, unseen written examination – 33.3%.
The rationale is to build student confidence using coursework assessments designed to test the application of knowledge learnt in lectures, practicals and the tutorial, to the sorts of published reports students are likely to encounter in applied data contexts. The examination is then designed to test student knowledge and understanding of modelling theory and robust practice, and to strengthen student confidence and performance in time-limited academic assessment.
|Delivery type|Number|Length hours|Student hours|
|Private study hours| | |117.00|
|Total Contact hours| | |33.00|
|Total hours (100hr per 10 credits)| | |150.00|
Private study
At least 4 hours per week of private study of additional course materials to support lectures and tutorial work. In addition, students are expected to spend 34 hours on each of the two assignments.
Opportunities for Formative Feedback
Formative feedback is provided through in-class interaction with the lecturing staff, one-to-one support and iterative feedback during practicals, and feedback on the two summative report assignments before the exam.
Methods of assessment
|Assessment type|Notes|% of formal assessment|
|Total percentage (Assessment Coursework)| |66.60|
There is no compensation across the two items of coursework. Students need to pass both components.
|Exam type|Exam duration|% of formal assessment|
|Online Time-Limited assessment|48 hr|33.40|
|Total percentage (Assessment Exams)| |33.40|
Normally resits will be assessed by the same methodology as the first attempt, unless otherwise stated.
Reading list
There is no reading list for this module.
Last updated: 15/03/2022 16:12:19