Module and Programme Catalogue

2019/20 Taught Postgraduate Module Catalogue

TRAN5340M Transport Data Science

15 credits
Class Size: 30

Module manager: Dr Robin Lovelace
Email: r.lovelace@leeds.ac.uk

Taught: Semester 2 (Jan to Jun)

Year running 2019/20

Pre-requisite qualifications

Acceptance on to any of the Masters programmes at the Institute for Transport Studies or equivalent experience (if taken as an individual module).

There are no pre-requisite modules, but the one-off 3-hour Introduction to R workshop (semester 1 Computer Skills workshop) is recommended, as is equivalent experience using R, e.g. from completing the free 4-hour online tutorial at https://www.datacamp.com/courses/free-introduction-to-r

This module is not approved as an Elective

Module summary

The quantity, diversity and availability of transport data are increasing rapidly, requiring skills in the management and interrogation of data and databases. Recent years have seen a new wave of 'big data' and 'data science' changing the world, with the Harvard Business Review describing data scientist as the 'sexiest job of the 21st century' (see hbr.org). Transport researchers increasingly need to take data from a wide range of sources and apply non-standard analysis methods to them to inform the decision-making process.

Despite these developments, the transport sector has been slow to adapt to new methods and workflows. The Transport Systems Catapult, for example, identified a skills gap in "skilled technical talent capable of handling and analysing very large datasets compiled from multiple sources" (see ts.catapult.org.uk).

This module takes a highly practical approach to learning about 'data science' tools and their application to investigating transport issues. The focus is on practical data science, enabling attendees to make use of a wide range of datasets to answer real-world transport planning questions.

Objectives

- Understand the structure of transport datasets: spatial, temporal and demographic.
- Understand how to obtain, clean and store transport-related data.
- Gain proficiency in command-line tools for handling large transport datasets.
- Learn machine learning and data modelling techniques.
- Produce data visualisations, both static and interactive.
- Learn where to find large transport datasets and assess data quality.
- Learn how to join together the components of transport data science into a cohesive project portfolio.

Learning outcomes
Students will become confident in working within data science teams on transport problems, in selecting appropriate tools to answer societal and business questions with a range of input data types, and in understanding the wider implications of the increasing use of data science for transport planning.
Specifically, learning outcomes will include the ability to:
- Identify available datasets and access and clean them
- Combine datasets from multiple sources
- Understand what machine learning is, which problems it is appropriate for compared with traditional statistical approaches, and how to implement machine learning techniques
- Visualise and communicate the results of transport data science, and know how to set up interactive web applications
- Articulate the relevance and limitations of data-centric analysis applied to transport problems, compared with other methods

Skills outcomes
Students will gain skills in:
- Importing a range of transport data file formats
- Setting up data science projects to ensure reproducibility
- Data cleaning and manipulation
- Visualisation of large datasets


Syllabus

- Software for practical data science
- The structure of transport data e.g. flows, incidents, origin/destination, GIS
- Data cleaning and subsetting
- Accessing data from web sources
- Processing data using remote services and locally installed software
- Data visualisation
- Machine learning
- Professional and ethical issues of big data in transport data analysis

Teaching methods

Delivery type    Number    Length hours    Student hours
Lecture          5         1.00            5.00
Practical        5         3.00            15.00
Seminar          5         1.00            5.00

Private study hours: 125.00
Total Contact hours: 25.00
Total hours (100hr per 10 credits): 150.00

Private study

Students are expected to spend their study time on software set-up and worked examples, plus background reading for lectures, preparatory work for workshops and assessed coursework.
Unsupervised teamwork practical sessions will be arranged to ensure a complete portfolio is submitted. Students are encouraged to submit the code that generated the portfolio.

Opportunities for Formative Feedback

Progress will be monitored informally during supervised practical sessions.

Methods of assessment


Coursework
Assessment type    Notes                                                  % of formal assessment
Portfolio          Project Portfolio 3000 word limit plus appendices      100.00

Total percentage (Assessment Coursework): 100.00

The project portfolio will be a report of up to 3000 words (excluding appendices) explaining the methods learned and their application to real problems. Students are encouraged to submit reproducible code alongside the report in the appendices. The portfolio will be used as the basis for formative assessment midway through the semester, which will highlight whether any students are struggling. If a portfolio fails the assessment criteria, the student will have an opportunity to resubmit a report outlining what they have learned in the areas in which they are failing.

Reading list

The reading list is available from the Library website

Last updated: 17/12/2019

© Copyright Leeds 2019