Data mining - course description

General information

Course name	Data mining
Course ID	04.2-WE-BizElP-DataMining-Er
Faculty	Faculty of Computer Science, Electrical Engineering and Automatics
Field of study	E-business
Education profile	practical
Level of studies	First-cycle Erasmus programme
Beginning semester	winter term 2019/2020

Course information

Semester	2
ECTS credits	3
Course type	obligatory
Teaching language	English
Author of syllabus	dr hab. inż. Marek Kowal, prof. UZ

Class forms

The class form	Hours per semester (full-time)	Hours per week (full-time)	Hours per semester (part-time)	Hours per week (part-time)	Form of assignment
Lecture	15	1	-	-	Credit with grade
Laboratory	30	2	-	-	Credit with grade

Aim of the course

The course presents the software used for data mining and familiarizes students with methods of data cleaning, data classification, association and sequence rule discovery, and data clustering. It develops practical skills in operating selected data mining systems and in applying data mining methods in e-business (customer segmentation, credit risk scoring, cross-selling strategies, fraud detection).

Prerequisites

Scope

Review and characteristics of the software used for data mining. Introduction to data mining software (SAS). Data structures used in data mining. Types and roles of variables in data mining tasks.

Preparation of data for exploration. Data profiling. Data cleansing. Data sampling. Transformation of variables. Variable selection.
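As an illustration of data cleansing and variable transformation, the following is a minimal sketch in Python (the course itself uses SAS, so the language, the function names, and the sample values are assumptions made only for this example). It fills missing values with the median of the observed values and then rescales the variable to the range [0, 1].

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Linearly rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical customer ages with two missing entries.
ages = [20, None, 30, 40, None, 60]
cleaned = impute_median(ages)
scaled = min_max_scale(cleaned)
```

Median imputation is only one of several possible cleansing strategies; it is chosen here because it is less sensitive to outliers than the mean.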

Data classification. Classification trees, k-nearest neighbors, naive Bayes classifier, neural networks, logistic regression. Measures of classification accuracy. Practical exercises in data classification.
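To make one of the listed classifiers concrete, here is a minimal k-nearest neighbors sketch in Python together with a simple accuracy measure. It is not taken from the course materials (which are based on SAS); the data set, the feature meanings, and the segment labels are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Hypothetical data: (monthly visits, average basket value) -> customer segment.
train = [
    ((1.0, 1.0), "occasional"),
    ((1.5, 2.0), "occasional"),
    ((2.0, 1.5), "occasional"),
    ((8.0, 8.0), "loyal"),
    ((8.5, 9.0), "loyal"),
    ((9.0, 8.5), "loyal"),
]
test_points = [(1.2, 1.4), (8.8, 8.7), (2.5, 2.0)]
test_labels = ["occasional", "loyal", "occasional"]

predictions = [knn_predict(train, p, k=3) for p in test_points]
score = accuracy(test_labels, predictions)
```

The sketch uses Euclidean distance and an unweighted majority vote; other distance measures or distance-weighted voting are equally valid choices.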

Discovering association and sequence rules. Measures of the statistical significance and strength of association and sequence rules. Market basket analysis. Computational complexity of association rule discovery. Discussion of the Apriori and Generalized Sequential Pattern (GSP) algorithms. Practical exercises in association and sequence rule discovery.
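The usual measures for an association rule A -> B are support, confidence, and lift. The sketch below computes them in Python for a small, made-up set of market baskets; the function names and the data are assumptions for illustration only, and the code assumes the antecedent and consequent each occur in at least one transaction.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence relative to the baseline frequency of the consequent."""
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
rule_lift = lift(baskets, {"bread"}, {"milk"})
```

For the rule {bread} -> {milk}, three of the five baskets contain both items and four contain bread, so the confidence is 3/4. Milk also appears in four of five baskets, which gives a lift of 15/16, slightly below 1: in this toy data, buying bread does not make milk more likely than it already is.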

Data clustering. Hierarchical clustering methods. Clustering methods based on iterative optimization. Distance measures used in clustering algorithms. Cluster summaries. Methods for estimating the number of clusters. Practical exercises in data clustering.
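k-means is a common example of clustering by iterative optimization. The following Python sketch of Lloyd's algorithm with Euclidean distance is an illustration only: the syllabus does not name a specific algorithm, and the points and initial centroids are hypothetical.

```python
from math import dist


def k_means(points, centroids, max_iter=100):
    """Alternate assignment and centroid update steps until the centroids stop moving."""
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if not members:
                new_centroids.append(old)  # keep the centroid of an empty cluster
            else:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical two-dimensional customer features.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
labels, centroids = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The initial centroids are passed in explicitly so that the result is reproducible; in practice they are often chosen at random, and the outcome can depend on that choice.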

Teaching methods

Lecture - conventional lecture using a video projector.
Laboratory - practical exercises in the computer laboratory.

Learning outcomes and methods of their verification

Outcome description	Outcome symbols	Methods of verification	The class form

Assignment conditions

Lecture - the passing criterion is to obtain passing grades on tests held at least once per semester.

Laboratory - the passing criterion is to obtain passing grades for laboratory exercises and tests.

Final mark = lecture: 50% + laboratory: 50%

Notes

Modified by dr hab. inż. Marek Kowal, prof. UZ (last modification: 09-12-2019 11:52)
