
Data Analytics

Description

The course covers mining very large datasets: analysing them to produce descriptive summaries of their content, test hypotheses, and extract valuable knowledge. Unlike other data mining courses, this one deals with datasets that, because of their size, speed of updating, or variety of content, cannot be mined with standard techniques. Topics therefore include similarity measures for very large datasets and data streams, link analysis, clustering, recommender systems, and MapReduce. The course also teaches how to use the R statistical programming language to perform analyses, interpret and visualise the results, and diagnose potential problems.

References

  • Required: Jure Leskovec, Anand Rajaraman and Jeffrey David Ullman. Mining of Massive Datasets (2nd edition). Cambridge University Press, 2014.
  • Suggested: Paul Teetor. R Cookbook. O'Reilly, 2011.

People

Crestani F.

Course director

Ríssola E. A.

Assistant

Additional information

Semester: Spring
Academic year: 2017-2018
ECTS: 6
Education:
Master of Science in Artificial Intelligence, Core course, Lecture, 1st and 2nd year

Master of Science in Financial Technology and Computing, Core course, Lecture, 1st and 2nd year

Master of Science in Informatics, Elective course, Lecture, 1st and 2nd year

Master of Science in Management and Informatics, Elective course, Lecture, 1st and 2nd year

PhD programme of the Faculty of Informatics, Elective course, Lecture, 1st, 2nd and 3rd year