2019/20 Taught Postgraduate Module Catalogue
COMP5840M Data Mining and Text Analytics
15 credits
Class Size: 120
Module manager: Dr Eric Atwell
Email: e.s.atwell@leeds.ac.uk
Taught: Semester 2 (Jan to Jun)
Year running 2019/20
Pre-requisite qualifications
COMP5450M Knowledge Representation and Reasoning
Pre-requisites
COMP5450M | Knowledge Representation and Reasoning |
This module is not approved as an Elective
Module summary
Introduction to linguistic theory and terminology. Understand and use algorithms and resources for implementing and evaluating text mining and analytics systems. Develop solutions using open-source and commercial toolkits. Consider the applications of data mining and text analytics through case studies in information retrieval and extraction.
Objectives
On completion of this module, students should be able to:
- understand the theory and terminology of empirical modelling of natural language;
- understand and use algorithms, resources and techniques for implementing and evaluating text mining and analytics systems;
- demonstrate familiarity with some of the main text mining and analytics application areas;
- appreciate why unrestricted natural language processing is still a major research task.
Learning outcomes
On completion of the year/programme students should have provided evidence of being able to:
- demonstrate in-depth, specialist knowledge and mastery of techniques relevant to the discipline and/or demonstrate a sophisticated understanding of concepts, information and techniques at the forefront of the discipline;
- exhibit mastery in the exercise of generic and subject-specific intellectual abilities;
- demonstrate a comprehensive understanding of techniques applicable to their own research or advanced scholarship;
- proactively formulate ideas and hypotheses, and develop, implement and execute plans by which to evaluate these;
- critically and creatively evaluate current issues, research and advanced scholarship in the discipline.
Syllabus
Introduction to linguistic theory and terminology.
Algorithms and techniques for computer-assisted text processing, focusing on applied and corpus-based problems such as spell checking, collocation and co-occurrence discovery, and text analytics.
Open-source and commercial text mining and text analytics toolkits. Web-based natural language processing.
Case studies of current commercial applications in text mining, including languages beyond English, Arabic data, machine translation, information retrieval, information extraction, chatbots and text classification.
Current research in text analytics.
Teaching methods
Delivery type | Number | Length hours | Student hours |
Laboratory | 2 | 1.00 | 2.00 |
Lecture | 22 | 1.00 | 22.00 |
Private study hours | 126.00 |
Total Contact hours | 24.00 |
Total hours (100hr per 10 credits) | 150.00 |
Opportunities for Formative Feedback
Attendance monitoring and in-lecture interaction.
Methods of assessment
Coursework
Assessment type | Notes | % of formal assessment |
Assignment | Coursework | 40.00 |
Total percentage (Assessment Coursework) | 40.00 |
This module is re-assessed by exam only.
Exams
Exam type | Exam duration | % of formal assessment |
Open Book exam | 2 hr | 60.00 |
Total percentage (Assessment Exams) | 60.00 |
Reading list
There is no reading list for this module.
Last updated: 30/04/2019