2023/24 Taught Postgraduate Module Catalogue
TRAN5340M Transport Data Science
15 credits
Class Size: 40
Module manager: Dr Robin Lovelace
Email: r.lovelace@leeds.ac.uk
Taught: Semester 2 (Jan to Jun)
Year running 2023/24
Pre-requisite qualifications
Students must have access to, and experience of using, recent versions of R (4.0.0 minimum) and RStudio, e.g. installed on their own computers (at least 8 GB RAM recommended) or via university IT. Attendance at the one-off 3-hour Introduction to R workshop (semester 1 Computer Skills workshop) and prior experience of using R (e.g. at work, in a previous degree or through an online course) are essential. Students can demonstrate this by showing evidence that they have completed an online course such as the first 4 sessions in the RStudio Primers series https://rstudio.cloud/learn/primers or DataCamp’s Free Introduction to R course: https://www.datacamp.com/courses/free-introduction-to-r. This is an advanced and research-led module. Evidence of substantial programming and data science experience in previous professional or academic work, in languages such as R or Python, also constitutes sufficient pre-requisite knowledge for the course.
This module is not approved as an Elective
Module summary
The quantity, diversity and availability of transport data are increasing rapidly, requiring skills in the management and interrogation of data and databases. Recent years have seen a new wave of 'big data' and 'data science' changing the world, with the Harvard Business Review describing Data Science as the 'sexiest job of the 21st century' (see hbr.org). Transport researchers increasingly need to take data from a wide range of sources and apply a wide range of methods to them to inform the decision-making process.
Despite these developments, the transport sector has been slow to adapt to new methods and workflows. The Transport Systems Catapult, for example, identified a skills gap in "skilled technical talent capable of handling and analysing very large datasets compiled from multiple sources" (see ts.catapult.org.uk).
This module takes a practical approach to learning about data science tools and their application to investigating transport issues. The focus is on data at zone, origin-destination, route and route network levels. By the end of the course students will be able to make use of a wide range of datasets to answer real-world transport planning questions.
Objectives
Understand the structure of transport datasets, from origin-destination to street segment levels
Understand how to obtain, clean and store transport-related data
Gain proficiency in command-line tools for handling large transport datasets.
Produce data visualizations, both static and as interactive web maps
Learn where to find large transport datasets and assess data quality
Learn how to join together the components of transport data science into a cohesive project portfolio
Learning outcomes
Students will become confident in working in data science teams on transport problems, in selecting appropriate tools to answer societal and business questions with a range of input data types, and in understanding the wider implications of the increasing use of data science for transport planning and engineering.
Specifically, learning outcomes will include the ability to:
Identify available datasets and access and clean them
Combine datasets from multiple sources
Visualise and communicate the results of transport data science, and know how to set up interactive web applications
Articulate the importance of data science in wider context
Skills outcomes
Students will gain skills in:
- Importing a range of transport data file formats
- Setting up data science projects to ensure reproducibility
- Data cleaning and manipulation
- Visualisation of large datasets
Syllabus
Software for practical data science
The structure of transport data at the level of zones, origin-destination pairs, routes and infrastructure
Data cleaning and subsetting
Accessing data from web sources
Data visualization
Teaching methods
Delivery type | Number | Length hours | Student hours |
Lecture | 5 | 1.00 | 5.00 |
Practical | 6 | 2.50 | 15.00 |
Seminar | 2 | 2.50 | 5.00 |
Private study hours | | | 125.00 |
Total Contact hours | | | 25.00 |
Total hours (100hr per 10 credits) | | | 150.00 |
Private study
Students are expected to spend their study time on software set-up and worked examples, plus background reading for lectures, preparatory work for workshops and assessed coursework.
Unsupervised teamwork practical sessions will be arranged to ensure a complete portfolio is submitted. Students are encouraged to submit the code that generated the portfolio.
Opportunities for Formative Feedback
Progress will be monitored informally during supervised practical sessions.
Methods of assessment
Coursework
Assessment type | Notes | % of formal assessment |
Portfolio | Project portfolio, 3000-word limit plus appendices | 100.00 |
Total percentage (Assessment Coursework) | | 100.00 |
The project portfolio will be a short (10 pages max, excluding appendices) but high-quality report explaining the methods learned and their application to real problems. Students are encouraged to submit reproducible code alongside the report. The portfolio will be used as the basis for formative assessment midway through the semester, which will highlight whether any students are struggling. If a portfolio fails the assessment criteria, the student will have an opportunity to resubmit a report outlining what they have learned in the areas in which they are failing.
Reading list
The reading list is available from the Library website
Last updated: 25/10/2023 15:48:48