2024/25 Taught Postgraduate Module Catalogue
MATH5772M Multivariate and Cluster Analysis
15 credits
Class Size: 81
Module manager: Dr Arief Gusnanto
Email: A.Gusnanto@leeds.ac.uk
Taught: Semester 1 (Sep to Jan)
Year running 2024/25
Pre-requisite qualifications
MATH2715
This module is mutually exclusive with
MATH3772 | Multivariate Analysis |
This module is approved as an Elective
Module summary
Multivariate datasets are common to all research areas: it is typical that experimental units are measured for (or questioned about) more than one variable at a time. This module covers the extension of univariate statistical techniques for continuous data to a multivariate setting and introduces methods designed specifically for multivariate data analysis (cluster analysis, principal component analysis, multidimensional scaling and factor analysis).
Objectives
By the end of this module, students should be able to:
- relate joint, marginal and conditional distributions and their properties with particular reference to the normal distribution;
- obtain and use Hotelling's T-squared statistic for the one sample and two sample problems;
- derive, discuss the properties of, and interpret principal components;
- use the factor analysis model, and interpret the results of fitting such a model;
- derive, discuss the properties of, and interpret decision rules in discriminant analysis;
- use hierarchical methods on similarity or distance matrices to partition data into clusters;
- use multidimensional scaling to construct low-dimensional representations of data;
- use a statistical package with real data to facilitate an appropriate analysis and write a report giving and interpreting the results.
Syllabus
1. Introduction to multivariate analysis and review of matrix algebra.
2. Multivariate distributions; moments; conditional and marginal distributions; linear combinations.
3. Multivariate normal and Wishart distributions; maximum likelihood estimation.
4. Hotelling's T² test; likelihood vs. union-intersection approach; simultaneous confidence intervals.
5. Compositional data modelling; distributions on a simplex; maximum likelihood estimation.
6. Dimension reduction; principal component and factor analysis; covariance vs. correlation matrix; loading interpretation.
7. Discriminant analysis; maximum likelihood and Bayesian discriminant rules; misclassification probabilities and estimation; Fisher's discriminant rule.
8. Cluster analysis, similarity matrix, distance matrix, hierarchical methods.
9. Multidimensional scaling, metric scaling, non-metric scaling, horseshoe effect.
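As an illustration of syllabus item 4, the one-sample Hotelling's T² statistic can be sketched in a few lines. This is a minimal sketch and not part of the official module material: the function name `hotelling_t2_one_sample` and the choice of NumPy are assumptions made here for illustration. It assumes multivariate normal data with more observations than variables, and uses the standard result that (n - p) / (p (n - 1)) × T² follows an F distribution with (p, n - p) degrees of freedom under the null hypothesis.

```python
import numpy as np


def hotelling_t2_one_sample(x, mu0):
    """One-sample Hotelling's T-squared for H0: population mean equals mu0.

    Hypothetical helper for illustration only.
    x   : (n, p) array of n observations on p variables
    mu0 : length-p hypothesised mean vector
    Returns (T2, F statistic, df1, df2).
    """
    x = np.asarray(x, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    n, p = x.shape
    if n <= p:
        raise ValueError("need more observations than variables")

    xbar = x.mean(axis=0)
    # Unbiased sample covariance matrix (divisor n - 1).
    s = np.atleast_2d(np.cov(x, rowvar=False))
    diff = xbar - mu0

    # T2 = n * (xbar - mu0)' S^{-1} (xbar - mu0), solved without forming S^{-1}.
    t2 = float(n * diff @ np.linalg.solve(s, diff))

    # Scale T2 to an F statistic with (p, n - p) degrees of freedom.
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, f_stat, p, n - p
```

The returned F statistic can be compared with an F(p, n - p) critical value from tables or from a statistical package to obtain a p-value.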
Teaching methods
Delivery type | Number | Length hours | Student hours |
Lecture | 33 | 1.00 | 33.00 |
Practical | 1 | 2.00 | 2.00 |
Private study hours | | | 115.00 |
Total Contact hours | | | 35.00 |
Total hours (100hr per 10 credits) | | | 150.00 |
Private study
Studying and revising course material.
Completing assignments and assessments.
Opportunities for Formative Feedback
Regular problem-solving assignments
Methods of assessment
Coursework
Assessment type | Notes | % of formal assessment |
In-course Assessment | Coursework | 20.00 |
Total percentage (Assessment Coursework) | 20.00 |
There is no resit available for the coursework component of this module. If the module is failed, the coursework mark will be carried forward and added to the resit exam mark with the same weighting as listed above.
Exams
Exam type | Exam duration | % of formal assessment |
Standard exam (closed essays, MCQs etc) | 2 hr 30 mins | 80.00 |
Total percentage (Assessment Exams) | 80.00 |
Normally resits will be assessed by the same methodology as the first attempt, unless otherwise stated.
Reading list
The reading list is available from the Library website.
Last updated: 19/11/2024