Data Analytics for Finance I & II
People
Course director
Description
The course is organized in weekly sessions of four hours. There are two parts:
Part 1: Financial Data – in the first half of the semester, we will study the answers to the following questions:
- What is data
  Nature and theory of data, data generating process, measurement, variables, data types
- What to do with the data
  Scientific papers that make innovative use of data
- Which financial questions can be answered with data
  Economic and financial variables and their classification, flow vs. stock, P vs. Q
- Where to get data
  - Traditional data sources: Compustat, CRSP, OptionMetrics, Markit, FactSet, Bloomberg
  - Alternative data sources: Quandl, microblogging, search engines, blockchain
  - Historical datasets
  - How to collect your own data
- How to check data
  The six data quality dimensions
- How to store data
  Data encodings and documentation
- How to organize data
  Principles of database design
- How to connect the dots between datasets
  Introduction to relational databases
- How to analyze data
  Exploratory data analysis and an introduction to nonparametric statistics
- How to extract (even more) meaningful variables
  Derived quantities, aggregation
- Which econometric models can overcome data shortcomings
  Robust statistics, seasonality
- How to make money with data
  Data business models
- How not to make money with data
  Sharing data (when, why, how), open research data principles
Part 2: Data visualization – the second half of the semester is entirely dedicated to data visualization, one of the most powerful tools for analyzing data and communicating your results:
- Visualization theory
  Perception and aesthetics, color, the grammar of graphics
- Static visualizations
  Bar charts, scatter plots, pie charts, line charts, Sankey diagrams, parallel coordinate plots
- Statistical visualizations
  Box and violin plots, Q-Q plots, histograms, tree maps, forest plots, autocorrelograms, Lorenz curves, Venn diagrams
- Data maps
  Dot distribution maps, heat maps, choropleths and alternative maps: cartograms, grid and hexagon maps, statebins
- Interactive visualizations
  Basic web technology, user interaction, R Shiny
- Visualizations in R
  ggplot2 and Shiny
Objectives
Tukey (1962) defines data analysis as “Procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data.”
The goals of this course are:
- to provide students with the tools and thinking framework for data analysis
- to foster creativity in using existing datasets or creating new ones
- to equip students with techniques to communicate their results
Teaching mode
In person
Learning methodology
This course takes students from theory to practice in four steps. Students first prepare by reading a short text or watching a video. New topics are then introduced in a lecture, which is followed by a learning-by-doing laptop session. Finally, students apply their new knowledge in individual homework assignments, which are collected and presented in a student portfolio.
The R programming language (together with a bit of SQL and Linux) will be used for most of this course. Students are free to use other languages or tools (Python, Tableau, …) for the assignments. Students are required to bring a laptop with R and RStudio installed to every class. Occasionally, we will work with Linux. Students can access a server made available by the lecturer or continue to use their server from Programming in Finance and Economics II.
Examination
40% – Written midterm exam
60% – Portfolio
Students create portfolios of approximately 20 pages from their individual work throughout the semester, including:
- One discussion of data sets and/or methods
- One discussion of a paper from the literature
- Data visualizations
Bibliography
- Tukey, J. W. The Future of Data Analysis. The Annals of Mathematical Statistics, 1962, pp. 1-67.
Education
- Master of Science in Economics in Finance, Lecture, 2nd year
- Master of Science in Financial Technology and Computing, Lecture, 1st year
Prerequisites
- Advanced Statistics, Mira A., Ghilotti L., Peluso S., Autumn Semester 2021-2022
- Programming in Finance and Economics I, Gruber P., Montemurro P., Autumn Semester 2021-2022