Real-world problems are more often than not fuzzy: they are not well-defined problems with well-defined data to which one can simply apply a statistical model or a machine learning technique. In practice, there is a clear objective, and part of the solution is to define the problem by understanding and refining the underlying data. Solving these problems is an iterative process in which the data is integrated and explored multiple times from different angles, each pass refining the model and one's understanding of it. Data exploration combines interactive visualization and querying as a means to understand the "shape" of the data, break down its complexity, and ask questions or test hypotheses. In this course we will see a number of data exploration and visualization techniques, and how to use them to iteratively solve a given problem by understanding its data. The course will also cover how to explore ultra-large datasets, which tools and techniques exist, and what the challenges and constraints are.
Data Design & Modeling, Software Design & Modeling