Module and Programme Catalogue

2021/22 Taught Postgraduate Module Catalogue

COMP5840M Data Mining and Text Analytics

15 credits
Class Size: 250

Module manager: Prof Eric Atwell
Email: e.s.atwell@leeds.ac.uk

Taught: Semester 2 (Jan to Jun)

Year running 2021/22

Pre-requisite qualifications

COMP5450M Knowledge Representation and Reasoning

Pre-requisites

COMP5450M Knowledge Representation and Reasoning

This module is not approved as an Elective

Module summary

Introduction to linguistic theory and terminology. Understand and use algorithms and resources for implementing and evaluating text mining and analytics systems. Develop solutions using open-source and commercial toolkits. Consider the applications of data mining and text analytics through case studies in information retrieval and extraction.

Objectives

On completion of this module, students should be able to:

- understand theory and terminology of empirical modelling of natural language;
- understand and use algorithms, resources and techniques for implementing and evaluating text mining and analytics systems;
- demonstrate familiarity with some of the main text mining and analytics application areas;
- appreciate why unrestricted natural language processing is still a major research task.

Learning outcomes
On completion of this module, students should be able to:

- understand data mining terminology and components of the data mining process;
- Data warehouses;
- Tools and techniques for data cleansing and aggregation;
- Use of machine learning classifiers for data classification;
- Meta data;
- Use of clustering and association tools for data mining;
- Open-source and commercial text mining and text analytics toolkits;
- Web-based text analytics;
- Case studies of current commercial applications.


Syllabus

Introduction to linguistic theory and terminology.

Algorithms and techniques for computer-assisted text processing, focusing on applied and corpus-based problems such as spell checking, collocation and co-occurrence discovery, and text analytics.

Open-source and commercial text mining and text analytics toolkits. Web-based natural language processing.

Case studies of current commercial applications in text mining, including languages beyond English, Arabic data, machine translation, information retrieval, information extraction, chatbots and text classification.

Current research in text analytics.

Teaching methods

Due to COVID-19, teaching and assessment activities are being kept under review - see module enrolment pages for information

Delivery type                         Number   Length hours   Student hours
Laboratory                            12       1.00           12.00
Class tests, exams and assessment     2        2.00           4.00
Lecture                               8        1.00           8.00
Private study hours                                           76.00
Total Contact hours                                           24.00
Total hours (100hr per 10 credits)                            100.00

Opportunities for Formative Feedback

Attendance monitoring and in-lecture interaction.

Methods of assessment

Due to COVID-19, teaching and assessment activities are being kept under review - see module enrolment pages for information


Coursework
Assessment type                             Notes                  % of formal assessment
Online Assessment                           Online test            40.00
Online Assessment                           Online test            40.00
Practical                                   Lab-based assessment   60.00
Total percentage (Assessment Coursework)                           140.00

This module will be reassessed by an online time-constrained assessment

Reading list

There is no reading list for this module

Last updated: 30/06/2021 16:20:48

© Copyright Leeds 2019