Big data technologies - course description

Course name: Big data technologies
Teaching students how to choose the right data analysis techniques for the scale of the problem and the type of analysis.
Teaching students to work with modern platforms for data storage and processing.
Teaching students selected techniques for analyzing large data sets, mainly textual ones.

Introduction to databases, Basics of statistics


Big Data: An introduction to processing large amounts of data.

Non-relational databases: Review of the basic issues related to relational databases. Advantages and disadvantages of these databases. Basic problems with using relational databases to store and process ever larger amounts of increasingly distributed data. Horizontal and vertical scaling of databases. A new concept of databases not based on the traditional relational model. The CAP theorem and the BASE model. Aggregate data models. Key-value, column-family, document and graph databases. Database replication. Sharing resources in databases. The MapReduce model. Presentation of a few selected non-relational database systems (e.g. MongoDB, Cassandra, Redis, Neo4j, Oracle NoSQL Database).
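
As an illustration of the Map-Reduce topic listed above, the following is a minimal single-process sketch of a word count over text documents, assuming plain Python without any cluster framework. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen for this example only; a real system would run the map and reduce steps on many nodes in parallel.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data needs big storage", "data about data"]
    print(word_count(docs))
```

Because each document is mapped independently and each key is reduced independently, both steps could in principle be distributed across machines, which is the property that makes the approach suitable for large data sets.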

Selected IT systems: Large-scale business analytics: modern solutions for the transmission, storage and processing of large data sets. Basics of data processing using convolutional neural networks (CNN). The TensorFlow and Keras libraries. Working in the Google Colaboratory cloud environment.
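
To give a feel for the basic operation behind convolutional neural networks, the sketch below slides a small kernel over a 2D input in plain Python. It is only an illustration under stated assumptions: the function name `conv2d_valid` is hypothetical, no padding or stride is used, and, as is common in deep learning libraries, the kernel is not flipped (strictly speaking this is cross-correlation). In the course itself such layers would be built with TensorFlow and Keras rather than written by hand.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists) without padding.

    Each output value is the sum of element-wise products between the
    kernel and the image patch it currently covers.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1

    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for di in range(kernel_h):
                for dj in range(kernel_w):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


if __name__ == "__main__":
    # A 4x4 image with a vertical edge between the second and third column.
    image = [[0, 0, 1, 1] for _ in range(4)]
    # A 2x2 kernel that responds to left-to-right increases in intensity.
    kernel = [[-1, 1], [-1, 1]]
    for row in conv2d_valid(image, kernel):
        print(row)
```

The output is largest where the kernel sits on the edge, which is the sense in which a convolutional layer detects local patterns; in a trained network the kernel values are learned rather than fixed by hand.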

lecture: conventional lecture

project: work in groups, practical classes

  1. Pramod J. Sadalage and Martin Fowler: NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence, 2012
  2. Dan Sullivan: NoSQL for Mere Mortals, 2015
  3. François Chollet: Deep Learning with Python, Helion, 2017
  4. TensorFlow and Keras docs:

