Data Design & Modeling
- big data dimensions: volume, velocity, variety, and veracity
- CRUD primitives (create, read, update, delete) implemented at scale
- ACID/BASE transactional properties of existing SQL/NoSQL data management technologies
- NoSQL data models and technologies
- sharding and replication strategies
- data analysis pipeline: acquisition, integration, exploration, mining, analytics, interpretation, and visualization
- data quality, provenance, wrangling, and cleansing to ensure data is worthy of trust
Data design and modeling provides the foundation for representing, storing and managing structured, semi-structured and unstructured data. Data can be persistent or volatile, processed in batches or in continuous streams. Students will learn how to select appropriate data management solutions to deal with scalability, availability, consistency, performance, and expressiveness requirements.
In addition to the introductory classes, students will experiment with big data technologies through hands-on use cases and practical work on cloud big data platforms.
The exam consists of a written session in which students answer theory questions and solve exercises on paper. The written exam accounts for 70% of the mark. Throughout the course, students carry out project work in groups, which counts for the remaining 30% of the mark.