Module and Programme Catalogue

2021/22 Undergraduate Module Catalogue

LISS1031 Data Mining and Text Analytics

10 credits
Class Size: 30

Module manager: Professor Eric Atwell
Email: e.s.atwell@leeds.ac.uk

Taught: 1 Jul to 31 Aug

Year running 2021/22

Pre-requisite qualifications

GPA of 2.8 (US) or equivalent and enrolled at a university
You must have experience of using and creating data files, including Word documents, Excel spreadsheets, PowerPoint presentations, video streaming sites, Wikipedia web-pages, Twitter/Facebook or other social media data.

Module replaces

None

This module is not approved as a discovery module

Module summary

You will be introduced to: data mining inputs and outputs (instances, attributes, classes, concepts); machine learning and data mining with the WEKA toolkit; real-world data-sets and competitions on Kaggle.com; CRISP-DM, the Cross Industry Standard Process for Data Mining; evaluation of data mining and text analytics results; text classification; and text search and Information Retrieval. These principles are introduced in seminars, and you will put them into practice through team-work: taking part in a data mining and text analytics challenge, and presenting your team results in a research report and video presentation.

You are NOT expected to have previous expertise in data-mining, but you should bring your own laptop and be familiar with using and creating data files, including Word documents, Excel spreadsheets, PowerPoint presentations, video streaming sites, Wikipedia web-pages, Twitter/Facebook or other social media data. Before the course, you must have set up user accounts at a suitable video streaming site and at kaggle.com, and downloaded the free WEKA toolkit to your laptop from the WEKA open-access website.

Objectives

You will learn the principles of data mining and text analytics; apply these principles in practical exercises with a data mining toolkit and real data; compare a range of different techniques and algorithms and evaluate their performance.

Learning outcomes
On completion of the module, students should be able to:
1. demonstrate a broad understanding of the concepts, information, practical competencies and techniques in the field of data-mining and text analytics;
2. apply generic and subject specific intellectual qualities to standard situations outside the context in which they were originally studied;
3. appreciate and employ the main methods of enquiry in data-mining and text analytics; and critically evaluate the appropriateness of different methods of enquiry;
4. use a range of techniques to initiate and undertake the analysis of data and text;
5. effectively communicate information, results and analysis in written scientific reports and presentations.

Skills outcomes
Theory and practice of data mining and text analytics


Syllabus

1. Data mining inputs and outputs: instances, attributes, classes, concepts
2. Machine learning and data mining with the WEKA toolkit
3. Real-world data-sets and competitions with Kaggle.com
4. CRISP-DM Cross Industry Standard Process for Data Mining
5. Evaluation of data mining results
6. Text classification
7. Text search and Information Retrieval
8. Evaluation of text analytics results
9. Practical data mining and text analytics challenge
10. Ethical issues in data mining and text analytics

Teaching methods

Delivery type                        Number   Length hours   Student hours
Visit                                1        10.00          10.00
Class tests, exams and assessment    1        3.00           3.00
Fieldwork                            1        8.00           8.00
Practical                            7        2.00           14.00
Seminar                              8        1.00           8.00
Independent online learning hours                            16.00
Private study hours                                          41.00
Total Contact hours                                          43.00
Total hours (100hr per 10 credits)                           100.00

Private study

16 hours pre-course preparatory work (materials available on Minerva). This includes: setting up user accounts at a video streaming site and kaggle.com; downloading the free WEKA toolkit to your laptop from the WEKA open-access website; and producing a one-minute video to introduce yourself, to ensure the skills needed for CW2 have been acquired.
41 hours private study during the module. This includes the time spent on coursework.

Opportunities for Formative Feedback

Formative feedback is available during the practical sessions.

Methods of assessment


Coursework
Assessment type                            Notes                                      % of formal assessment
Report                                     Individual report (6 pages)                60.00
Group Project                              Group presentation via video (5 minutes)   40.00
Total percentage (Assessment Coursework)                                              100.00

Normally resits will be assessed by the same methodology as the first attempt, unless otherwise stated.

Reading list

There is no reading list for this module

Last updated: 30/06/2021 16:24:23

© Copyright Leeds 2019