Module and Programme Catalogue

2021/22 Undergraduate Module Catalogue

CHEM3212 Big Data, Big Science

10 credits

Class Size: 150

Module manager: Dr Stuart Warriner

Taught: Semester 1 (Sep to Jan)

Year running 2021/22

This module is not approved as a discovery module

Module summary

The explosion of information means that many jobs require people to handle large datasets quickly and efficiently, yet graduates often lack these core skills. In science, new insights often come from bringing large amounts of data together in a way that illuminates the problem. In this course you will develop the core skills needed to handle large datasets efficiently. Using examples from across Chemistry, you will see how to extract data using common tools, ranging from advanced manipulations in Excel to simple programming in Python, and how to reach meaningful conclusions. Online tools will help you acquire these key skills, while weekly seminars will let you explore real examples and use these skills to answer scientific questions.


Objectives

To enable students to explore how to handle large datasets in order to extract key scientific information.

Learning outcomes
An understanding of how large datasets can be useful within and outside science
The ability to use advanced Excel capabilities (pivot tables, array functions, multisheet data handling, Solver and statistics packages)
The ability to use simple Python programming to extract data from large datasets
The ability to present data efficiently and concisely


Syllabus

Core concepts in Excel: basic formula construction, conventions and referencing
Aggregating data in Excel: pivot tables and array formulae
Aggregating multisheet data using indirect functions
Using statistics packages with Excel
Fundamental Python programming concepts
Pattern matching and data mining using Python
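
To illustrate the kind of task the Python topics listed above refer to, the following is a minimal sketch rather than course material. The records, field names, pattern and function name are all invented for this example. It uses a regular expression to pull two fields out of each text record and then averages one field per group, much as a pivot table would.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical records, invented purely for illustration.
RECORDS = [
    "run=001 solvent=THF yield=72.5%",
    "run=002 solvent=DMF yield=64.0%",
    "run=003 solvent=THF yield=77.5%",
    "malformed line without the expected fields",
    "run=004 solvent=DMF yield=66.0%",
]

# Named groups capture the solvent label and the numeric yield.
PATTERN = re.compile(
    r"solvent=(?P<solvent>\w+)\s+yield=(?P<yield>\d+(?:\.\d+)?)%"
)


def mean_yield_by_solvent(lines):
    """Return the mean yield per solvent, skipping lines that do not match."""
    grouped = defaultdict(list)
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # ignore records that lack the expected fields
        grouped[match["solvent"]].append(float(match["yield"]))
    return {solvent: mean(values) for solvent, values in grouped.items()}


if __name__ == "__main__":
    for solvent, avg in sorted(mean_yield_by_solvent(RECORDS).items()):
        print(f"{solvent}: {avg:.1f}%")
```

Skipping non-matching lines instead of raising an error is a design choice suited to messy real-world data. A stricter analysis might count or log the rejected records.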

Teaching methods

Due to COVID-19, teaching and assessment activities are being kept under review - see module enrolment pages for information

Delivery type                        Number   Length hours   Student hours
Computer Class                       7        2.00           14.00
Independent online learning hours                            30.00
Private study hours                                          55.00
Total Contact hours                                          15.00
Total hours (100hr per 10 credits)                           100.00

Private study

Lectures serve only to introduce the course and the assessment.
Online courses and examples enable development of the technical skills for data analysis, e.g. advanced Excel functions and the basics of Python programming.
Self-study uses self-taught examples and tests. These tools then support the exercises in the workshops.

Opportunities for Formative Feedback

The workshop sessions will involve working through guided solutions to the project with a member of staff, enabling feedback on the approach being taken and on any technical issues.

The online learning will include self-help exercises so that students can monitor their own progress.

Methods of assessment

Assessment type                            Notes                          % of formal assessment
Project                                    Data Analysis Project Report   100.00
Total percentage (Assessment Coursework)                                  100.00

The project will include a real data analysis exercise framed around a scientific question. Students will have to understand the data provided, determine which data are relevant, extract them using the skills they have acquired, and then present the results as a short report. Students required to resit the module will be given a further attempt to complete the project over the summer.

Reading list

There is no reading list for this module

Last updated: 02/07/2021 10:55:13


© Copyright Leeds 2019