Data Analytics

Description

COURSE OBJECTIVES

The course deals with mining very large datasets: analysing them to produce descriptive summaries of their content, test hypotheses, and extract valuable knowledge.

COURSE DESCRIPTION
Unlike other data mining courses, this one deals with datasets whose large size, high update rate, and variety of content (all characteristics of Big Data) mean they cannot be mined with standard techniques. The course therefore covers topics such as similarity measures for very large datasets and data streams, data stream mining, link analysis, clustering, and recommender systems.

LEARNING METHODS
The course consists of theoretical lectures and a practical part. In the practical part, using statistical packages for Python (a language we assume students already know), we will learn how to analyse large datasets and how to interpret, visualise, and diagnose the results and potential problems of a data analysis.

EXAMINATION INFORMATION
Students are examined during the course by means of two theoretical and two practical tests; there is no final exam. The theoretical tests cover the material taught in the lectures, while the practical tests consist of the analysis of large datasets provided during the course.

REFERENCES

  • Required:
    Jure Leskovec, Anand Rajaraman, and Jeffrey David Ullman. Mining of Massive Datasets (2nd edition). Cambridge University Press, 2014.
  • Other books will be suggested during the course; they are not required and can be found in the university library.

People

Crestani F.

Course director

Additional information

Semester
Spring
Academic year
2020-2021
ECTS
6
Language
English
Education
Master of Science in Artificial Intelligence, Core course, Lecture, 1st year
Master of Science in Informatics, Elective course, Lecture, 1st year
Master of Science in Informatics, Elective course, Lecture, 2nd year