Professor Max Bramer

Principles of Data Mining (second edition)

Published by Springer-Verlag, 2013. 440 pages. ISBN: 978-1-4471-4883-8 (Print), ISBN: 978-1-4471-4884-5 (Online)

Data Mining, the automatic extraction of implicit and potentially useful information from data, is increasingly used in commercial, scientific and other application areas.

Principles of Data Mining explains and explores the principal techniques of Data Mining: classification, association rule mining and clustering. Each topic is clearly explained and illustrated by detailed worked examples, with a focus on algorithms rather than mathematical formalism. It is written for readers without a strong background in mathematics or statistics, and any formulae used are explained in detail.

This second edition has been expanded to include additional chapters on using frequent pattern trees for Association Rule Mining, comparing classifiers, ensemble classification and dealing with very large volumes of data.

Principles of Data Mining aims to help general readers develop the necessary understanding of what is inside the 'black box' so they can use commercial data mining packages discriminatingly, and to enable advanced readers or academic researchers to understand or contribute to future technical advances in the field.

The book is suitable as a textbook to support courses at undergraduate or postgraduate levels in a wide range of subjects including Computer Science, Business Studies, Marketing, Artificial Intelligence, Bioinformatics and Forensic Science.

Presents the principal techniques of data mining with particular emphasis on explaining and motivating the techniques used
Focuses on understanding of the basic algorithms and awareness of their strengths and weaknesses
Useful as a textbook and also for self-study
Substantially expanded second edition
Each chapter contains practical exercises to enable readers to check their progress, and there is a full glossary of technical terms


Errata

Chap	Page	Line	Change
13	208	4 to 5	Change http://www.ics.uci.edu/mlearn/MLRepository.html to http://www.ics.uci.edu/~mlearn/MLRepository.html
15	236	-5 to -4	Change http://www.ics.uci.edu/mlearn/MLRepository.html to http://www.ics.uci.edu/~mlearn/MLRepository.html
15	227	-10	'the 1% value' should read 'the 10% value'
15	227	-10 to -9	'We can safely reject the null hypothesis' should read 'We can safely accept the null hypothesis'
18	290	-9	'download' should be 'downward'
18	300	-5	Final paragraph. This should start: 'We have found two frequent itemsets ending with item p: {p} and {c, p}'
Index	437	-11 (left-hand column)	Change 'Global Infomation Partition' to 'Global Information Partition'

Last updated October 30th 2016

Software

These web-based programs are provided to support some of the material in Principles of Data Mining (second edition).

Calculation of performance measures (Chapter 12)

Calculation of interestingness measures (Section 17.9)

FP-growth Frequent Pattern Trees algorithm (Chapter 18)

Comparing Classifiers: Calculation of paired t statistic (Chapter 15)
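As a rough illustration of the kind of calculation named in the last item above, the following is a minimal sketch of the standard paired t statistic. It is not the web-based program itself and makes no claim to follow the book's notation; the function name `paired_t_statistic` and the input values are hypothetical, chosen only for this example. It assumes two classifiers have been scored on the same test folds, so that the scores can be compared pair by pair.

```python
import math


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for two paired lists of scores.

    Hypothetical helper for illustration: scores_a[i] and scores_b[i]
    are assumed to come from the same fold or test set.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two pairs")

    # Work with the per-pair differences rather than the raw scores.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = sum(diffs) / n

    # Sample variance of the differences (divisor n - 1).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if var_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")

    # Standard error of the mean difference.
    std_err = math.sqrt(var_diff / n)
    return mean_diff / std_err, n - 1


# Made-up scores for two classifiers on three folds (illustrative only).
t, dof = paired_t_statistic([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(t, dof)
```

The resulting t value would then be compared with a critical value from a t table for `dof` degrees of freedom at the chosen significance level.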

Datasets

Downloadable copies of datasets referred to in the book (all in Inducer format)