SylabUZ
Course name | Data mining |
Course code | 04.2-WE-BizElP-DataMining-Er |
Faculty | Wydział Informatyki, Elektrotechniki i Automatyki (Faculty of Computer, Electrical and Control Engineering) |
Field of study | E-business |
Profile | practical |
Type of studies | First-cycle Erasmus programme |
Starting semester | winter semester 2021/2022 |
Semester | 2 |
ECTS credits to be earned | 3 |
Course type | compulsory |
Language of instruction | English |
Syllabus prepared by |
|
Form of classes | Hours per semester (full-time) | Hours per week (full-time) | Hours per semester (part-time) | Hours per week (part-time) | Form of assessment |
Lecture | 15 | 1 | - | - | Graded credit |
Laboratory | 30 | 2 | - | - | Graded credit |
Presenting the software used for data mining. Familiarizing students with data cleaning methods. Presenting data classification methods. Presenting methods for discovering association and sequence rules. Presenting data clustering methods. Developing practical skills in operating selected data mining systems. Developing skills in applying data mining methods in e-business (customer segmentation, credit risk scoring, cross-selling strategies, fraud detection).
Review and characteristics of the software used for data mining. Introduction to data mining software (SAS). Data structures used in data mining. Types and roles of variables in data mining tasks.
Preparation of data for exploration. Data profiling. Data cleansing. Data sampling. Transformation of variables. Variable selection.
Data classification. Classification trees, k-nearest neighbors, naive Bayes classifier, neural networks, logistic regression. Measures of classification accuracy. Practical exercises in data classification.
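The measures of classification accuracy mentioned above can be illustrated with a minimal sketch. It is written in Python rather than SAS purely for illustration; the function names, the choice of measures (accuracy, precision, recall, F1) and the toy labels are assumptions, not part of the course materials.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 as a dictionary.

    A measure whose denominator is zero is reported as 0.0
    (a convention chosen for this sketch).
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = fraudulent transaction, 0 = legitimate one.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
print(classification_metrics(actual, predicted))
```

On imbalanced problems such as fraud detection, accuracy alone can be misleading, which is why precision and recall are reported alongside it.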
Discovering association and sequence rules. Measures describing the statistical importance and strength of association and sequence rules. Market basket analysis. Computational complexity of association rule discovery. Discussion of the Apriori and Generalized Sequential Pattern algorithms. Practical exercises in association and sequence rule discovery.
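Three measures commonly used to describe the importance and strength of an association rule are support, confidence and lift. The sketch below computes them for a single rule on a toy market basket. It assumes these three are among the measures the course covers; the function names and the basket data are hypothetical, and Python is used only for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= set(basket))
    return hits / len(transactions)


def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    s_antecedent = support(transactions, antecedent)
    s_consequent = support(transactions, consequent)
    s_rule = support(transactions, set(antecedent) | set(consequent))
    # confidence = P(consequent | antecedent)
    confidence = s_rule / s_antecedent if s_antecedent else 0.0
    # lift > 1 suggests a positive association, lift < 1 a negative one
    lift = confidence / s_consequent if s_consequent else 0.0
    return s_rule, confidence, lift


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]
s, c, l = rule_measures(baskets, {"bread"}, {"milk"})
print(f"support={s:.3f} confidence={c:.3f} lift={l:.3f}")
```

Apriori uses the fact that support can only decrease as an itemset grows: every subset of a frequent itemset is itself frequent, so candidates with an infrequent subset can be pruned without counting them.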
Data clustering. Methods of hierarchical clustering. Clustering methods based on iterative optimization. Distance measures used in clustering algorithms. Cluster summaries. Methods for estimating the number of clusters. Practical exercises in data clustering.
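A well-known example of clustering by iterative optimization is k-means, sketched below with Euclidean distance. The syllabus does not name a specific algorithm, so the choice of k-means is an assumption. The function names, the toy data and the simplistic initialization (the first k points serve as the starting centroids) are also assumptions made to keep the example deterministic.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iter=100):
    """Cluster points into k groups; return (labels, centroids).

    Initialization takes the first k points as centroids, which keeps
    the sketch deterministic but is not a recommended strategy.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[j])
        if new_labels == labels and new_centroids == centroids:
            break  # nothing changed, so the procedure has converged
        labels, centroids = new_labels, new_centroids
    return labels, centroids


# Hypothetical customers described by two numeric features.
customers = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8.5, 9.5)]
labels, centroids = kmeans(customers, k=2)
print(labels)
print(centroids)
```

The result depends on the initial centroids and on the scale of the variables, so in practice the data are usually standardized and the procedure is restarted from several initializations.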
Lecture - conventional lecture using a video projector.
Laboratory - practical exercises in the computer laboratory.
Outcome description | Outcome symbols | Verification methods | Form of classes |
Lecture - the passing criterion is to obtain passing grades on the tests, which are held at least once per semester.
Laboratory - the passing criterion is to obtain passing marks for the laboratory exercises and tests.
Final mark components = lecture: 50% + teaching laboratory: 50%
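The weighting above amounts to a simple weighted average, shown here as a short sketch. The function name and the example marks are hypothetical. The sketch applies no rounding and does not check that each component was passed, because the syllabus states neither a rounding rule nor a grading scale.

```python
def final_mark(lecture_mark, laboratory_mark,
               lecture_weight=0.5, laboratory_weight=0.5):
    """Weighted average of the lecture and laboratory marks (no rounding)."""
    return lecture_weight * lecture_mark + laboratory_weight * laboratory_mark


# Hypothetical marks: 4.0 for the lecture, 5.0 for the laboratory.
print(final_mark(4.0, 5.0))
```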
1. Hastie T., Tibshirani R., Friedman J.H.: The Elements of Statistical Learning, Springer 2001
Modified by dr hab. inż. Marek Kowal, prof. UZ (last modified: 12-07-2021 11:41)