Data Design & Modeling

Description

Data design and modeling provides the foundation for representing, storing and managing structured, semi-structured and unstructured data. Data can be persistent or volatile, processed in batches or in continuous streams.

Students will learn:

  • how to select appropriate data management solutions to meet scalability, availability, consistency, performance and expressiveness requirements
  • big data dimensions: volume, velocity, variety and veracity
  • CRUD primitives (create, read, update, delete) implemented at scale
  • ACID/BASE transactional properties of existing SQL/NoSQL data management technologies
  • NoSQL data models and technologies
  • sharding and replication strategies
  • data analysis pipeline: acquisition, integration, exploration, mining, analytics, interpretation and visualization
  • data quality, provenance, wrangling and cleansing to ensure data is trustworthy

Students will experiment with big data technologies through hands-on use cases and practical use of cloud big data platforms.

 

REFERENCES

  • Martin J. Fowler, Pramodkumar J. Sadalage. NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. Addison-Wesley, 2012. Traditional foundational book on NoSQL practices, mainly from a software engineering perspective.
  • Ted Hills. NoSQL and SQL Data Modeling: Bringing Together Data, Semantics, and Software. Technics Publications, 2016. Good, recent overview of database diversity. Despite the title, it does not really focus on modeling and does not provide a holistic view of the field.
  • Aaron Ploetz, Devram Kandhare, Sudarshan Kadambi, Xun (Brian) Wu. Seven NoSQL Databases in a Week. 2018. Basic, introductory reading on the most popular NoSQL solutions, with a simple, empirical approach that completely misses the big picture on design and decision making.
  • Andreas Meier and Michael Kaufmann. SQL & NoSQL Databases: Models, Languages, Consistency Options and Architectures for Big Data Management. Springer, mid-2019. Yet-to-be-published book, showing how strategic it is to get a good positioning in the field with an appropriate book on NoSQL. It possibly takes a perspective similar to ours.
  • Martin Kleppmann. Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. O’Reilly, 2017. Best-selling book on modern, large-scale data processing systems. It takes a technology- and system-centric perspective that only superficially touches on data models and their impact on system performance.

People

 

Brambilla M.

Course director

Gashi S.

Assistant

Additional information

Semester
Fall
Academic year
2019-2020
ECTS
6
Language
English
Education
Master of Science in Software & Data Engineering, Core course, Lecture, 1st year

PhD programme of the Faculty of Informatics, Elective course, Lecture, 2nd year (4 ECTS)

PhD programme of the Faculty of Informatics, Elective course, Lecture, 1st year (4 ECTS)