Search for contacts, projects,
courses and publications

Cosmmus 2 - An infrastructure for scalable distributed applications

People

 

Pedone F.

(Responsible)

Le L. H.

(Collaborator)

Abstract

Many current online services must meet strict availability and performance requirements. High availability implies tolerating server failures, which calls for redundancy of servers and data, that is, replication. Although replication is needed for fault tolerance, having two or more replicas capable of serving client requests introduces additional complexity. For example, how to avoid that two clients simultaneously book the same airline seat, after each one accesses a different replica? Obviously, replicas cannot act alone and must coordinate. But simply having one replica wait for the acknowledge of the other won’t do it since the other may have crashed (accurately detecting the crash of a replica is a problem in its own). 

State machine replication is a solution to these problems. The idea is to arrange for replicas to receive and execute client requests in the same order. The execution of each request must be deterministic so that replicas produce the same results after the execution of the same requests. Therefore, clients know that a request has been executed after they receive the first response from a replica. In the example above, even though each client contacts a different replica, their requests will be executed by the two replicas in the same order, and the replicas agree that only the client whose request is executed first should book the seat.

State machine replication is a well-established solution to increasing the availability of services and has been deployed in many production systems. When it comes to performance, however, it offers limited capability. Increasing the number of replicas will not translate into more requests execute per time unit since each replica must process all requests. This project investigates extensions to the state machine replication that will make it configurable with respect to both availability and performance.

Additional information

Acronym
Cosmmus 2
Start date
01.10.2016
End date
31.05.2019
Duration
32 Months
Funding sources
SNSF
Status
Ended
Category
Swiss National Science Foundation / Project Funding / Mathematics, Natural and Engineering Sciences (Division II)

Publications