Statistical Inference on Large-Scale Mechanistic Network Models
Many scientific and societal systems can be examined as networks, a growing field of study driven by the advent of big data and greater computational capacity. A network represents relational data, such as Facebook data, as a set of nodes (subjects) connected by links (friendships). Researchers use statistical inference to study a network's evolution and to answer questions about network density, reciprocity of relationships, and triadic closure (if A is B's friend, and B is C's friend, how often is A also C's friend?).

This project bridges the divide between the two prominent paradigms for modeling network structure, mechanistic and statistical, using Approximate Bayesian Computation (ABC). ABC simulates many network configurations under given sets of parameter values, compares the resulting networks with the observed network, and retains those parameter values that yield networks "close" to the one observed. Scale is the central challenge: to study the structure and dynamics of a national social network, for example, we have about 2 million people connected by 600 million communication events. Most statistical models readily lend themselves to inference, but they do not scale beyond a few thousand nodes; mechanistic models, while highly scalable, have no inferential tools available for them.

Our ABC approach makes it possible to carry out inference and model comparison for large-scale mechanistic network models. It brings together ideas and tools from network science, statistics, and physics and applies them to questions that originate in the social and biological sciences, making the project highly interdisciplinary and providing a flexible, generic, and integrated method.
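The simulate-compare-retain loop described above can be sketched as rejection ABC. The sketch below is illustrative only, not the project's actual model: it uses a simple random-graph generator with a single link-probability parameter as a stand-in mechanistic model, network density as the summary statistic, and a uniform prior; the function names and tolerance value are hypothetical choices.

```python
import random

def simulate_edges(n, p, rng):
    # Stand-in mechanistic model: each pair of nodes is linked
    # independently with probability p (the parameter to infer).
    edges = 0
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                edges += 1
    return edges

def density(edges, n):
    # Summary statistic used to compare simulated and observed networks.
    return edges / (n * (n - 1) / 2)

def abc_rejection(obs_density, n, num_draws, eps, rng):
    # Draw parameters from the prior, simulate a network for each,
    # and retain parameters whose networks are "close" to the observed one.
    accepted = []
    for _ in range(num_draws):
        p = rng.random()  # prior: Uniform(0, 1)
        d = density(simulate_edges(n, p, rng), n)
        if abs(d - obs_density) < eps:
            accepted.append(p)
    return accepted

rng = random.Random(0)
n, true_p = 60, 0.3
obs = density(simulate_edges(n, true_p, rng), n)
posterior = abc_rejection(obs, n, num_draws=2000, eps=0.05, rng=rng)
estimate = sum(posterior) / len(posterior)  # posterior mean, near true_p
```

The accepted draws approximate the posterior over the mechanistic parameter; shrinking `eps` or replacing density with richer summaries (e.g. clustering, degree distribution) sharpens the approximation at the cost of more simulations.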