Tight Fault Locality
Shay Kutten David Peleg
Abstract
This paper lays a theoretical foundation for scaling fault tolerant tasks to large and
diversified networks, such as the Internet. In such networks, there are always parts of the
network that failed. On the other hand, various subtasks interest only parts of the network,
and it is desirable that those parts, if nonfaulty, do not suffer from faults in other parts.
Our approach is to refine the previously suggested notion of fault local algorithms (that was
best suited for global tasks) for which the complexity of recovering was proportional to the
number of faults. We refine this notion by introducing the concept of tight fault locality
to deal with problems whose complexity (in the absence of faults) is sublinear in the size of
the network. For a problem whose time complexity on an n-node network is T (n) (where
possibly T (n) = o(n)), a tightly fault local algorithm recovers a legal global state in
O(T (x)) time when the (unknown) number of faults is x.
This concept is illustrated by presenting a general transformation for MIS algorithms
to make them tightly fault local. In particular, our transformation yields an O(log x)
randomized mending algorithm and an O(exp(O(sqrt(log x)))) deterministic mending algorithm
for MIS. The methods used in the transformation may be of interest by themselves.