Big data and business intelligence - course description

General information

Course name	Big data and business intelligence
Course ID	11.3-WE-INFD-BDiaI-Er
Faculty	Faculty of Engineering and Technical Sciences
Field of study	Computer Science
Education profile	academic
Level of studies	Second-cycle Erasmus programme
Beginning semester	winter term 2021/2022

Course information

Semester	2
ECTS credits	5
Course type	obligatory
Teaching language	English
Author of syllabus

Class forms

The class form	Hours per semester (full-time)	Hours per week (full-time)	Hours per semester (part-time)	Hours per week (part-time)	Form of assignment
Lecture	30	2	-	-	Credit with grade
Laboratory	30	2	-	-	Credit with grade

Aim of the course

Teaching students how to choose the right data analysis techniques depending on the scale of the problem being considered and the type of analysis being carried out.
Teaching students to work using modern platforms for data storage and processing.
Teaching students selected techniques to analyze large data sets, mainly textual.

Prerequisites

Introduction to databases, Basics of statistics

Scope

Big Data: An introduction to processing large amounts of data.

Non-relational databases: Review of the basic issues related to relational databases. Advantages and disadvantages of these databases. Basic problems with using relational databases to store and process ever larger and increasingly distributed data sets. Horizontal and vertical scaling of databases. A new concept of databases not based on the traditional relational model. The CAP theorem and the BASE model. Aggregate data models. Key-value, column, document and graph databases. Database replication. Sharing resources in databases. The MapReduce methodology. Presentation of a few selected non-relational database systems (e.g. MongoDB, Cassandra, Redis, Neo4j, Oracle NoSQL Database).
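
As an illustration of the MapReduce methodology listed above, here is a minimal single-machine sketch of a word count in plain Python. It is not taken from the course materials. The function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real framework would distribute the phases across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values collected for one key."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence (illustrative, not distributed)."""
    mapped = chain.from_iterable(map_phase(d) for d in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical input documents.
docs = ["big data big ideas", "data stores scale out"]
counts = word_count(docs)
print(counts)
```

Each map call depends only on its own document and each reduce call only on its own key, which is what would allow both phases to run in parallel on separate machines.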

Selected IT systems: Large-scale business analytics: modern solutions used for transmission, storage and processing of large data sets. Basics of data processing using convolutional neural networks (CNN). The TensorFlow and Keras libraries. Working in the Google Colaboratory cloud environment.

Elements of Text Mining: Introduction to Text Mining. Pre-processing of text documents. Stemming algorithms. Keyword searching. Organization of documents in the form of a term-document matrix (TDM). Selected elements of linear algebra and their application to Text Mining. Clustering and classification of text documents. Creating document summaries. Word clouds. Sentiment analysis. Selected IT systems and libraries for Text Mining.
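
To make the term-document matrix (TDM) concrete, the sketch below builds one from raw counts in plain Python. It is an illustrative assumption rather than the course's own method: the helper names (tokenize, term_document_matrix) are hypothetical, the tokenizer only lowercases and strips basic punctuation, and no stemming or stop-word removal is applied.

```python
def tokenize(text):
    """Lowercase the text and strip basic punctuation from each token."""
    tokens = (t.strip(".,;:!?").lower() for t in text.split())
    return [t for t in tokens if t]


def term_document_matrix(documents):
    """Return (terms, matrix); matrix[i][j] counts term i in document j."""
    tokenized = [tokenize(d) for d in documents]
    terms = sorted({t for doc in tokenized for t in doc})
    matrix = [[doc.count(term) for doc in tokenized] for term in terms]
    return terms, matrix


# Hypothetical two-document corpus.
docs = ["Data mining finds patterns.", "Text mining mines text data."]
terms, tdm = term_document_matrix(docs)

for term, row in zip(terms, tdm):
    print(f"{term:10s} {row}")
```

Rows correspond to terms and columns to documents, so each document becomes a column vector to which linear algebra techniques can then be applied.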

Teaching methods

Lecture, laboratory exercises.

Learning outcomes and methods of their verification

Outcome description	Outcome symbols	Methods of verification	The class form

Assignment conditions

Lecture – the passing condition is to obtain a positive mark on the final test.

Laboratory – the passing condition is to obtain positive marks for all laboratory exercises scheduled during the semester.

Calculation of the final grade: lecture 50% + laboratory 50%
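
The weighting above can be expressed as a short calculation. The sketch below is only illustrative: the function name final_grade is hypothetical, the example marks are made up, and the grading scale and any rounding rule are not stated in this description, so none is applied.

```python
def final_grade(lecture, laboratory):
    """Weighted mean of the two marks: lecture 50% + laboratory 50%."""
    return 0.5 * lecture + 0.5 * laboratory


# Hypothetical marks; rounding to an official grade is not specified here.
print(final_grade(4.0, 5.0))
```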

Notes

Modified by dr hab. inż. Artur Gramacki, prof. UZ (last modification: 08-09-2021 19:00)
