Module and Programme Catalogue

2023/24 Undergraduate Module Catalogue

NATS3200 Machine Learning Approaches to Scientific Data Analysis

10 credits
Class Size: 60

Module manager: Dr Stefan Auer
Email: s.auer@leeds.ac.uk

Taught: Semester 2 (Jan to Jun)

Year running 2023/24

Pre-requisite qualifications

NATS2100, or an equivalent module in scientific programming with Python.
Year 1 Mathematics modules in Natural Sciences, or NATS2380, or equivalent mathematics.

Module replaces

None

This module is not approved as a discovery module

Module summary

Statistical machine learning is at the core of the modern world. Online advertising, automated vehicles, stock market trading, transport planning: each uses statistical models to learn from past data and make decisions about the future. Statistical machine learning is a way to rigorously identify patterns in data and to make quantitative predictions. It is how we translate data into knowledge. In this module the fundamental concepts of statistical machine learning are introduced and the student will learn to use several key statistical models widely employed in science and industry.

Objectives

To introduce basic techniques from statistical machine learning for classification and regression using Python.

Learning outcomes
1. Be able to explain the classification and regression problem;
2. Be able to assess the error of a fitted model and explain the fitting algorithm;
3. Understand the statistical foundations of different classification and regression methods;
4. Understand the importance of uncertainty and evaluate the uncertainty in simple model predictions;
5. Be able to perform classification and regression tasks using existing software packages;
6. Be able to carry out and justify a simple statistical model analysis of real world data.


Syllabus

- Introduction to classification and regression;
- Statistical decision theory, loss functions;
- Optimisation, gradient descent, local & global optima;
- Linear regression;
- Logistic regression;
- Tree models;
- Ensemble methods: e.g. Boosting, Random forests.
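
As a rough illustration of how the optimisation and linear regression topics fit together, the following minimal sketch fits a straight line by gradient descent on the mean squared error. It is not taken from the module material: the function names, the toy data, the learning rate and the number of steps are all assumptions chosen for the example.

```python
# Illustrative sketch only: fit y = w*x + b by batch gradient descent on the
# mean squared error. Data, learning rate and step count are made-up values.

def mse(w, b, xs, ys):
    """Mean squared error of the line y = w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys, learning_rate=0.05, steps=5000):
    """Minimise the MSE over (w, b) with plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current line on every data point.
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_b = 2.0 / n * sum(residuals)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data lying exactly on y = 2x + 1 (an assumption for the example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}, MSE = {mse(w, b, xs, ys):.2e}")
```

The squared-error loss for a linear model is convex, so gradient descent with a small enough learning rate approaches the single global optimum here; the distinction between local and global optima matters for less well-behaved loss functions.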

Teaching methods

Delivery type                         Number   Length hours   Student hours
Workshop                              11       2.00           22.00
Lectures                              11       1.00           11.00
Private study hours                                           67.00
Total Contact hours                                           33.00
Total hours (100hr per 10 credits)                            100.00

Private study

Learn course material, perform tasks, create and solve computational problems.

Students required to resit the module would be given a further attempt to complete the tasks over the summer. The problems in those tasksheets have no "standard" solution, so it is not a problem if they have to work on the same problems again.

Opportunities for Formative Feedback

The teaching sessions introduce the course material, and students can ask questions during and after each lecture.

The workshop sessions will be held in computer clusters, where a member of staff will guide students towards solutions to the project and give feedback on the approach being taken and on any technical issues.

Methods of assessment


Coursework
Assessment type                            Notes                 % of formal assessment
In-course Assessment                       Assessed coursework   100.00
Total percentage (Assessment Coursework)                         100.00

Typically there are 6 tasksheets. The first three will not contribute to the module mark, and model solutions will be provided. Tasksheet 1 will be handed out in week 1, Tasksheet 2 in week 2, and Tasksheet 3 in week 3. The remaining 3 tasksheets will each count for 1/3 of the module mark. Tasksheet 4 will be handed out in week 4, Tasksheet 5 in week 6, and Tasksheet 6 in week 8. Because the assessments are weighted equally, the module mark is the simple average of the three assessed marks. In general, the tasksheets cover the material taught in the corresponding weeks.
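
The weighting above amounts to a plain average, which can be sketched as follows. The function name and the example marks are hypothetical and only illustrate the arithmetic; they are not an official calculation.

```python
# Hypothetical helper: the name and the example marks are illustrative only.

def module_mark(assessed_marks):
    """Simple average of the three assessed tasksheet marks (1/3 each)."""
    if len(assessed_marks) != 3:
        raise ValueError("expected marks for exactly three assessed tasksheets")
    return sum(assessed_marks) / 3


# Made-up marks for the three assessed tasksheets.
example = module_mark([62.0, 70.0, 75.0])
print(example)
```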

Reading list

There is no reading list for this module

Last updated: 09/06/2023
