Colloc 2 - Local Replication Strategies for Global Content Management in Large-Scale Distributed Systems
This project is the direct follow-up to the SNF project entitled "Colloc: A Replication Engine for Peer-to-Peer Content Management in Large-Scale Distributed Systems", a joint research effort between the University of Lausanne (Unil) and the University of Lugano (USI).
The Colloc 2 project described hereafter will complete this research by capitalizing on the work already invested. The results obtained so far also compelled us to partially redefine the initial project, which accounts for the 2 years we are requesting for the follow-up project described in this proposal.
Peer-to-Peer Content Management Systems (P2P-CMS) aim at supporting the collaborative work of scattered and fluctuating communities, and thus face many challenges. Researchers in bioinformatics, for instance, who share and collaboratively maintain information related to DNA and amino acid sequences worldwide, constitute such a highly distributed yet highly focused community. The challenges facing P2P-CMS include (1) finding the right level of consistency to achieve optimal performance, as the need for fast data access often supersedes the need for data consistency; (2) dealing with read and write requests whose origins and frequencies change constantly, as the interest in specific pieces of content is continuously shifting; and (3) managing the distributed storage of enormous volumes of data, as it is difficult, if not impossible, to store all the data in one single place. A promising approach consists in tapping idle computing power and storage, since most computers in large organizations happen to be underused.
The objective of the Colloc 2 project, in line with that of Colloc 1, is to lay the foundations of peer-to-peer content management systems able to meet the challenges sketched above. This objective requires a P2P-CMS to constantly adapt its answers to the following questions: "When to replicate?" (replication condition), "What to replicate?" (replica granularity), "Where to replicate?" (replica location), and "How to replicate?" (replication scheme).
More specifically, a P2P-CMS must be able (a) to accommodate dynamic changes in the availability of resources when going large-scale, and (b) to dynamically place replicas according to read/write access patterns. Solutions to these problems can then be seen as distinct layers in the P2P-CMS protocol stack. In Colloc 1, we mainly focused on Problem (a). Yet a significant amount of work is still required to complete this research: more precisely, we plan to further explore the impact of locality on performance and propose solutions to mitigate this impact, and to address Problem (b) by taking advantage of the layers built to solve Problem (a).