Theoretical basis for the design and evaluation of content-based routing protocols
The content-based communication paradigm is a form of communication based on the idea that some information should be transmitted to receivers based on their interests rather than on an explicit destination address set by the sender. The paradigm works as follows: (1) receivers tell the service what they are interested in by declaring a predicate, which is a Boolean function that, applied to a message, determines whether that message is of interest to the receiver. Then (2) senders simply send messages, without specifying a destination, and (3) the service delivers each message to all the receivers that declared predicates matching the content of the message. Thus, a content-based service acts both as a traditional communication mechanism, transmitting information from senders to receivers, and as an information broker, selecting information of interest on behalf of the receivers. Our general goal is to develop the theoretical foundations for the study of content-based networking. Within this project, we propose to analyze and improve routing from a theoretical viewpoint, and to inform its experimental evaluation. We will specifically focus on routing schemes, which define the routing state that a protocol installs in each router independent of the way in which that state is disseminated among routers. In particular, we propose to study two aspects of content-based routing schemes. First, we want to see how well a scheme realizes the content-based service. We do that by measuring accuracy and cost. We consider a scheme correct when it delivers each message to all interested receivers. For example, a broadcast scheme would be correct as content-based routing scheme. However, it would also be wasteful of resources, as it would deliver messages to many uninterested receivers. So, for each message, we measure total cost (total packet count), false positives (deliveries to uninterested receivers), and false negatives (missed deliveries to interested receivers). Our goal is to characterize these metrics in the presence of various network conditions. Specifically, we want to establish provable bounds for the number of false positives and false negatives for chosen schemes in chosen scenarios. Second, we want to characterize the space complexity of routing schemes. This is the amount of state held by routers to implement a particular scheme. Specifically, we want to establish the asymptotic complexity of a scheme in various types of networks as well as the trade-offs between space complexity and accuracy of the scheme. This again will depend on how many nodes malfunctions and how severe their faults are.