Module and Programme Catalogue

2023/24 Taught Postgraduate Module Catalogue

TRAN5340M Transport Data Science

15 credits
Class Size: 40

Module manager: Dr Robin Lovelace
Email: r.lovelace@leeds.ac.uk

Taught: Semester 2 (Jan to Jun)

Year running 2023/24

Pre-requisite qualifications

Students must have access to, and experience of using, recent versions of R (4.0.0 minimum) and RStudio, e.g. installed on their own computers with at least 8 GB RAM (recommended) or via university IT. Attending the one-off 3-hour Introduction to R workshop (semester 1 Computer Skills workshop) and experience of using R (e.g. having used it for work, in previous degrees or having completed an online course) are essential. Students can demonstrate this by showing evidence that they have completed an online course such as the first 4 sessions in the RStudio Primers series https://rstudio.cloud/learn/primers or DataCamp’s Free Introduction to R course: https://www.datacamp.com/courses/free-introduction-to-r. This is an advanced and research-led module. Evidence of substantial programming and data science experience in previous professional or academic work, in languages such as R or Python, also constitutes sufficient pre-requisite knowledge for the course.

This module is not approved as an Elective

Module summary

The quantity, diversity and availability of transport data are increasing rapidly, requiring skills in the management and interrogation of data and databases. Recent years have seen a new wave of 'big data' and 'data science' changing the world, with the Harvard Business Review describing Data Science as the 'sexiest job of the 21st century' (see hbr.org). Transport researchers increasingly need to take data from a wide range of sources and apply a wide range of methods to them to inform the decision-making process.

Despite these developments, the transport sector has been slow to adapt to new methods and workflows. The Transport Systems Catapult, for example, identified a skills gap in "skilled technical talent capable of handling and analysing very large datasets compiled from multiple sources" (see ts.catapult.org.uk).

This module takes a practical approach to learning about data science tools and their application to investigating transport issues. The focus is on practical data science applied to data at the zone, origin-destination, route and route network levels. By the end of the course students will be able to make use of a wide range of datasets to answer real-world transport planning questions.

Objectives

 Understand the structure of transport datasets, from origin-destination to street segment levels
 Understand how to obtain, clean and store transport-related data
 Gain proficiency in command-line tools for handling large transport datasets.
 Produce data visualizations, static and via interactive web maps
 Learn where to find large transport datasets and assess data quality
 Learn how to join together the components of transport data science into a cohesive project portfolio


Learning outcomes
Students will become confident in working in data science teams on transport problems; in selecting appropriate tools to answer societal and business questions with a range of input data types; and in understanding the wider implications of the increasing use of data science for transport planning and engineering.
Specifically, learning outcomes will include the ability to:
 Identify available datasets and access and clean them
 Combine datasets from multiple sources
 Visualise and communicate the results of transport data science, and know how to set up interactive web applications
 Articulate the importance of data science in its wider context

Skills outcomes
Students will gain skills in:
- Importing a range of transport data file formats
- Setting up data science projects to ensure reproducibility
- Data cleaning and manipulation
- Visualisation of large datasets


Syllabus

 Software for practical data science
 The structure of transport data at the level of zones, origin-destination pairs, routes and infrastructure
 Data cleaning and subsetting
 Accessing data from web sources
 Data visualization
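
As an illustration of the syllabus topics above, the following is a minimal sketch of cleaning and summarising origin-destination (OD) data. It is not taken from the module materials, which are based on R; Python is used here only as one of the languages named in the pre-requisites. The zone codes, place names, trip counts and the function names clean_od and trips_by_origin are all hypothetical.

```python
from collections import defaultdict

# Hypothetical zone lookup: zone code -> zone name (made-up data).
zones = {"Z1": "City Centre", "Z2": "Headingley", "Z3": "Chapel Allerton"}

# Hypothetical OD records: one row per origin zone, destination zone and mode.
# Trip counts are strings, as they often are when read from a CSV file.
od_records = [
    {"origin": "Z1", "destination": "Z2", "mode": "bicycle", "trips": "120"},
    {"origin": "Z2", "destination": "Z1", "mode": "bicycle", "trips": "95"},
    {"origin": "Z2", "destination": "Z3", "mode": "bus", "trips": ""},      # missing count
    {"origin": "Z3", "destination": "Z9", "mode": "bus", "trips": "40"},    # unknown zone
    {"origin": "Z1", "destination": "Z1", "mode": "walk", "trips": "300"},  # intra-zonal
    {"origin": "Z3", "destination": "Z1", "mode": "bus", "trips": "60"},
]


def clean_od(records, zones):
    """Keep inter-zonal rows that have a numeric trip count and known zone codes."""
    cleaned = []
    for row in records:
        if not row["trips"].strip().isdigit():
            continue  # drop rows with a missing or non-numeric count
        if row["origin"] not in zones or row["destination"] not in zones:
            continue  # drop rows that cannot be joined to the zone lookup
        if row["origin"] == row["destination"]:
            continue  # drop intra-zonal flows
        cleaned.append({**row, "trips": int(row["trips"])})
    return cleaned


def trips_by_origin(records, zones):
    """Join OD rows to zone names and sum trips per origin zone."""
    totals = defaultdict(int)
    for row in records:
        totals[zones[row["origin"]]] += row["trips"]
    return dict(totals)


od_clean = clean_od(od_records, zones)
origin_totals = trips_by_origin(od_clean, zones)
print(origin_totals)
```

In this sketch the OD table is kept in "long" form, with one row per zone pair, so the same records can be subset by mode or joined to zone attributes before visualisation.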

Teaching methods

Delivery type                         Number   Length (hours)   Student hours
Lecture                               5        1.00             5.00
Practical                             6        2.50             15.00
Seminar                               2        2.50             5.00
Private study hours                                             125.00
Total contact hours                                             25.00
Total hours (100hr per 10 credits)                              150.00

Private study

Students are expected to spend their study time on software set-up and worked examples, plus background reading for lectures, preparatory work for workshops and assessed coursework.
Unsupervised teamwork practical sessions will be arranged to ensure a complete portfolio is submitted. Students are encouraged to submit the code that generated the portfolio.

Opportunities for Formative Feedback

Progress will be monitored informally during supervised practical sessions.

Methods of assessment


Coursework
Assessment type   Notes                                                 % of formal assessment
Portfolio         Project Portfolio 3000 word limit plus appendices     100.00
Total percentage (Assessment Coursework)                                100.00

The project portfolio will be a short (10 pages max, excluding appendices) but high-quality report explaining the methods learned and their application to real problems. Students are encouraged to submit reproducible code alongside the report. This will be used as the basis for formative assessment mid-way through the semester, which will highlight whether any students are struggling. If a portfolio fails the assessment criteria, the student will have an opportunity to resubmit a report outlining what they have learned in the areas in which they are failing.

Reading list

The reading list is available from the Library website

Last updated: 25/10/2023 15:48:48

© Copyright Leeds 2019