Large-scale kernel methods in financial economics
People
(Project lead)
(Co-lead)
Abstract
This project aims to significantly expand what is feasible in modeling social phenomena with large data sets. We pursue this objective with a bottom-up approach that simultaneously integrates model formulation, mathematical foundations, and computational implementation in an unprecedented way. The overarching theme is the sparse representation of high-dimensional and complex data that preserves desired structure arising, for instance, from the theories to be tested. The project requires the development of novel methodologies and can therefore only be accomplished in an interdisciplinary manner, by combining expertise in both Social Sciences and Mathematics.

Our research questions are easily illustrated: many problems in financial economics can be represented as expectations of functions of random variables, that is, as predictions of outcomes conditional on observed variables. While such settings can often be stated concisely, in most cases their ingredients can at best be estimated from sample averages. Moreover, each of the aforementioned objects may carry structure from separate research fields and from separate theories. For instance, the conditional distribution of asset returns is an important topic in econometrics, while decision theorists investigate what properties the economic preferences that generate these expectations ought to have, and there is little overlap between the two.

Our efforts concentrate on learning about these objects jointly, using high-dimensional and large data sets in suitable reproducing kernel Hilbert spaces (RKHS). This setup allows us to identify all the components that make up a model formulation at the same time, with or without connections between them. RKHS models are very versatile and can be carried over from one application to another with minor or no modifications. Furthermore, operations within RKHS modeling are explicit and transparent, and come with rigorous error control.
This transparency is a major advantage over neural networks when it comes to interpreting results. When used with data, objects in an RKHS become tractable through so-called representer theorems, which yield solutions to potentially infinite-dimensional optimization problems through simple numerical operations. A sizable part of our contribution will be the choice and development of RKHS suited to modeling, RKHS modeling under constraints imposed by theory, methods for numerical implementation, and a large number of empirical applications that exploit the new possibilities opened up by these methodological advances.
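
As an illustration of how a representer theorem turns an infinite-dimensional problem into simple numerical operations, the following minimal sketch fits a kernel ridge regression, a standard textbook instance and not necessarily the estimator developed in this project. It estimates a conditional expectation E[Y | X = x] by minimizing the regularized squared loss over an RKHS; the minimizer is a weighted sum of kernel functions centered at the sample points, so the whole problem reduces to one linear solve. The Gaussian kernel, the bandwidth, the penalty `lam`, the function names, and the simulated data are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=0.5):
    """Gram matrix with entries exp(-(a_i - b_j)^2 / (2 * bandwidth^2)), 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))


def fit_kernel_ridge(x, y, lam=1e-3, bandwidth=0.5):
    """Coefficients alpha of the minimizer h = sum_i alpha_i k(x_i, .).

    Minimizes (1/n) * sum_i (y_i - h(x_i))^2 + lam * ||h||^2 over the RKHS.
    By the representer theorem, alpha solves (K + n * lam * I) alpha = y.
    """
    n = len(x)
    K = gaussian_kernel(x, x, bandwidth)
    return np.linalg.solve(K + n * lam * np.eye(n), y)


def predict(x_train, alpha, x_new, bandwidth=0.5):
    """Evaluate the fitted function at new points."""
    return gaussian_kernel(x_new, x_train, bandwidth) @ alpha


# Simulated data (an assumption for illustration): Y = sin(X) + noise,
# so the true conditional expectation is E[Y | X = x] = sin(x).
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=200)
y = np.sin(x) + 0.1 * rng.standard_normal(200)

alpha = fit_kernel_ridge(x, y)
grid = np.linspace(-1.5, 1.5, 7)
estimate = predict(x, alpha, grid)

# Largest deviation from the true conditional mean on the evaluation grid.
max_error = np.max(np.abs(estimate - np.sin(grid)))
```

The fitted function is fully described by the vector `alpha` and the training points, which is one concrete sense in which operations in an RKHS are explicit. Note that the linear solve scales cubically in the sample size, so large data sets would call for a sparse or low-rank representation in place of the dense Gram matrix used here.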