Introduction to Data Science
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge or insights from data, whether structured or unstructured. In this course, we begin by studying “data”. Data comes in various forms and shapes. We often distinguish between observational and controlled studies, which result in “observational data” and “experimental data”, respectively. Whereas the latter is the benchmark for many scientific fields, modern sensing technology often produces large quantities of the former. We describe how this distinction relates to the issue of confounding and the role that experimental design can play in addressing it. Next, we focus on how causal inference is able to detect causal structures even in modern observational settings. We develop statistical learning theory, which is the cornerstone of modern data science approaches. We discuss maximum likelihood estimation, introduce statistical testing, and then move on to more advanced techniques.