Streamline 2 - An Architecture for Application-Level Data Networking
Recent advances in computer networking and high-speed data transmission technologies have led to a considerable increase in the average bandwidth available to end users. As a result, a number of information-diffusion applications have emerged (e.g., social networking, streaming). What these applications have in common is that they are typically deployed over a wide geographical area, serve a large number of users, and may involve non-trivial data processing.
In particular, given the size and geographical distribution of such an application, we assume that no single computing site can host and process the complete application data. Instead, we envisage the infrastructure as a collection of sites, each composed of one or more datacenters (i.e., clusters of servers). Sites may or may not be in the same geographical region. While multiple datacenters within the same region are deployed for availability, datacenters placed in different regions can reduce the response time experienced by distant users. Our prototypical application is social networking. Although we do not plan to design techniques and protocols specific to this context, we plan to assess our infrastructure using a social network application.
This project takes a layered approach to supporting large-scale applications. We propose to decompose the problem along two axes, communication and state, each corresponding to a system layer. Accordingly, the project will consider efficient, scalable and robust mechanisms for (a) communication and (b) data management and processing over geographically distributed systems. These objectives are closely aligned with Streamline 1. In Streamline 1, we investigated modular approaches to data streaming, focusing on the efficient construction of overlays and on data propagation. The resulting architecture is modular, and some of the ideas developed in Streamline 1 can be extended to handle efficient communication among datacenters. Additionally, in Streamline 1 we considered techniques to manage state in large-scale single-site systems. In particular, we proposed scalable database protocols that we plan to use as the basis for the underlying storage of geographically distributed applications.
Streamline 2 will allow the continuation of a successful collaboration between the University of Lausanne and the University of Lugano and will fund two PhD students as they start their training. As we have done in the past, we plan to submit our findings to high-visibility conferences and journals and to release the resulting software artifacts as open source.