The course consists of four parts. In the first, we learn how human vision works and how to design charts and visualizations for effective communication. In the second, we use Jupyter, Pandas, and Bokeh to clean, analyze, and visualize data; we also cover how to work with geospatial data and manipulate geocoded objects. In the third, we learn how to index, query, and aggregate data at scale with Elasticsearch, and how to create interactive dashboards with Kibana and Canvas. In the last part, we analyze very large datasets with Apache Spark and discuss how the Spark optimizer works under the hood.
- Learn how to best design charts and visualizations for effective communication
- Acquire practical experience with several frameworks for data analysis and visualization (Jupyter, Pandas, Bokeh, Folium, Elasticsearch, Spark)
- Create a data analysis platform in a group project
- Interactive coding sessions
- Hands-on tutorials
This is an atelier course with no midterm or final exam. Both the pass/fail decision and the final grade are based on a series of assignments and a project.