Data Analytics for Finance I & II
People
Course director
Description
The course is organized in weekly sessions of four hours. There are two parts:
Part 1: Financial Data – in the first half of the semester, we will study the answers to the following questions:
- What is data
– Nature and theory of data, data generating process, measurement, variables, data types
– Scientific papers that make innovative use of data
– Which financial questions can be answered with data
- How to download, store, encode and organize data
– Data encodings
– Introduction to relational databases
– APIs and scraping
- Where to obtain data
– Traditional data sources: Compustat, CRSP, OptionMetrics, Markit, FactSet, Bloomberg
– Alternative data sources: Quandl, microblogging, search engines, blockchain
– Climate data
– Historic datasets
- Fact checking and data checking in the age of AI
- How to analyze numeric data
– Exploratory data analysis with the help of AI
– Derived quantities and data aggregation
- How to analyze textual data
– Traditional analysis of textual data
– LLM-as-model for text analysis
- How to collect your own data
– An introduction to surveys and interviews
- Which econometric models can overcome data shortcomings
– Robust statistics, seasonality
- The data economy and your data strategy
– Data business models
– Open research data principles
Part 2: Data visualization – the second half of the semester is entirely dedicated to data visualization, one of the most powerful tools for analyzing data and communicating your results
- Visualization theory
– Human perception
– The grammar of graphics
– Aesthetics, color
- Static visualizations
– Bar charts, scatter plots, pie charts, line charts, Sankey diagrams, parallel coordinate plots
- Statistical visualizations
– Box and violin plots, Q-Q plots, histograms, tree maps, forest plots, autocorrelograms, Lorenz curves, Venn diagrams
- Tools for data visualization
– Visualizations using ChatGPT and Claude
– Python and R packages
– Flourish and Tableau
- Data maps
– Dot distribution maps, heat maps, choropleths and alternative maps: cartograms, grid and hexagon maps, statebins
- Interactive visualizations
– Basic web technology, user interaction, R Shiny
- Visualization workflow and refinement
Objectives
In May 2017, The Economist ran the headline “Data is the new oil”. They could not have been more right. Data is one of the most valuable raw materials of the 21st century. Beyond powering countless business processes, it is the basis for the artificial intelligence revolution. This is why we created this course, which is entirely devoted to all aspects of financial data, as early as 2019.
The goals of this course are
- to provide students with a thinking framework for data analysis
- to equip them with tools and resources to obtain and manage financial data
- to foster creativity in using and exploring existing datasets as well as in creating new datasets
- to equip students with techniques to visually analyze data and to communicate their results
Teaching mode
In person
Learning methods
This course will take students from theory to practice in four steps. Students prepare by reading a short text or watching a video. New topics are then introduced in a lecture, which is followed by a learning-by-doing laptop session. Finally, students apply their new knowledge in individual homework assignments, which are collected and presented in a student portfolio.
Programming is heavily centred around AI tools, in particular GitHub Copilot. With the help of these AI tools, several programming languages and tools will be used throughout the course: R and Python, a bit of SQL and bash (Linux), as well as Tableau (free student version).
Students are free to use any language or tool for the assignments. Students are required to bring a laptop to every class. Students will also need a ChatGPT Plus license for the second part of the course. Occasionally, we will work with Linux. Students can use a server made available by Peter Gruber or continue to use their own server from Programming in Finance and Economics II.
Examination information
40% – Written midterm exam
60% – Portfolio
Students create an individual portfolio of ca. 20 pages from their work throughout the semester, including
- One discussion of data sets and/or methods
- One discussion of a paper from the literature
- Several data visualizations
Bibliography
- Tukey, J. W. The Future of Data Analysis. The Annals of Mathematical Statistics, 33(1), 1962, pp. 1-67.
Education
- Master of Science in Economics in Finance, Lecture, 2nd year
- Master of Science in Financial Technology and Computing, Lecture, 1st year
Prerequisite
- Advanced Statistics, Mira A., Ghilotti L., Peluso S., SA 2021-2022
- Programming in Finance and Economics I, Gruber P., Montemurro P., SA 2021-2022