Practical Applications of Data Mining Systems - course description

General information

Course name	Practical Applications of Data Mining Systems
Course ID	11.3-WK-DEED-PADMS-S22
Faculty	Faculty of Mathematics, Computer Science and Econometrics
Field of study	Data Engineering
Education profile	academic
Level of studies	Second-cycle studies leading to MS degree
Beginning semester	summer term 2023/2024

Course information

Class forms

Aim of the course

Acquiring the modeling skills needed to understand and store big data.
Applying these skills to support decisions in areas such as cancer detection, fraud detection, customer segmentation and machine downtime prediction.
Learning the data mining process and modeling techniques using a single tool, IBM SPSS Modeler.
Building models from selected data, testing them on historical data, and applying them to current data.

Fundamentals of statistics.

Introduction to data mining
- CRISP-DM methodology
- Introduction to SPSS Modeler - a predictive data mining workshop
- SPSS Modeler interface
Data mining process
1. Business understanding
2. Data understanding
3. Data preparation
Modeling techniques
1. Introduction to modeling techniques
2. Cluster analysis (unsupervised learning)
3. Classification and prediction (supervised learning)
4. Classification, training and testing
5. Sampling in classification
6. Predictive Modeling Algorithms in SPSS Modeler
7. Automatic selection of algorithms
Model evaluation
1. Performance evaluation data
2. Accuracy as a performance evaluation tool
3. Overcoming accuracy limits
4. ROC Curves
Implementation on IBM Bluemix
1. Evaluating new data
2. Implementation of a predictive model
3. What is IBM Bluemix?
4. Predictive Modeling: Cloud Deployment
5. SPSS Collaboration and Implementation Services
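
As an illustration of the model evaluation topics listed above (accuracy, its limits, and ROC curves), the following is a minimal sketch in plain Python rather than SPSS Modeler. The toy labels, scores, and function names are hypothetical and serve only to show why accuracy alone can mislead on imbalanced data, where a model that never predicts the rare class still scores well.

```python
# Illustrative sketch only: the data and helper names are hypothetical
# and are not taken from the course materials.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def roc_points(y_true, scores):
    """ROC curve points (FPR, TPR) obtained by sweeping the decision threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Imbalanced toy data: 2 positive cases (e.g. fraud) among 10 records.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A "model" that always predicts the majority class.
always_negative = [0] * len(y_true)
baseline_accuracy = accuracy(y_true, always_negative)

# Scores from a hypothetical model that ranks most positives above negatives.
scores = [0.1, 0.2, 0.15, 0.3, 0.05, 0.4, 0.25, 0.8, 0.7, 0.9]
model_auc = auc(roc_points(y_true, scores))

print(f"Accuracy of always-negative baseline: {baseline_accuracy:.2f}")
print(f"ROC AUC of the scoring model: {model_auc:.4f}")
```

The baseline reaches high accuracy while detecting no positive case at all, which is the kind of limit that threshold-independent measures such as the ROC curve are meant to expose.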

Conventional lecture, problem-based lecture. Laboratory exercises. Discussion.

Outcome description	Outcome symbols	Methods of verification	The class form

The laboratory grade is based on the results of a written test and/or projects (80%) and on class participation (20%).

Modified by dr Maciej Niedziela (last modification: 11-04-2024 16:06)