
VTJNS06

Date post: 04-Jun-2018
Upload: raghuram-seshabhattar
  • 8/13/2019 VTJNS06


    In this paper, we explore how to make TrustedPals applicable in environments with less synchrony. More precisely, we show how to solve the asynchronous version of SMC using asynchronous synchronization algorithms inspired by recent results in fault-tolerant distributed computing: we use an asynchronous consensus algorithm and encapsulate (some very weak) timing assumptions within a device known as a failure detector [7].

    The concept of a failure detector has been investigated in quite some detail in systems with merely crash faults [8]. In such systems, correct processes (i.e., processes which do not crash) must eventually permanently suspect crashed processes. There is very little work on failure detection and consensus in message omission environments. In fact, it is not clear what a sensible definition of a failure detector (and consensus) is in such environments because the notion of a correct process can have several different meanings (e.g., a process with no failures whatsoever or a process which does not crash but just omits some messages). In this work, instead of correct processes, we will consider well-connected processes, i.e., those processes which are able to compute and communicate without omissions with a majority of processes.

    1.1 Related Work

    Our work on TrustedPals can be regarded as building failure detectors for arbitrary (Byzantine) failures, which has been investigated previously (see, for example, Malkhi and Reiter [9], Kihlstrom et al. [10], Doudou et al. [11], Doudou et al. [12], Haeberlen et al. [13], [14], and more recently Haeberlen and Kuznetsov [15]). In contrast to previous works on Byzantine failure detectors, we use security modules to avoid the tar pits of this area. This contrasts TrustedPals with the large body of work that tackles Byzantine faults directly, like Castro and Liskov's Practical Byzantine Fault-Tolerance [16] or more recently the Aardvark system by Clement et al. [17] and the Turquois protocol by Moniz et al. [18]. While being conceptually simpler, Byzantine-tolerant protocols necessarily have to assume a two-thirds majority of correct processes in nonsynchronous settings [19] while TrustedPals needs only a simple majority (due to the availability of the secure overlay network). Next to an improved resilience, TrustedPals by design can provide secrecy of data against attackers, a notion that can only be achieved in Byzantine-tolerant algorithms by applying complex secret sharing mechanisms [20]. All these advantages result from using security modules to constrain the attacker in such a way that Byzantine faults are reduced to general omission faults.

    Delporte-Gallet et al. [21] were the first to investigate nonsynchronous settings in the TrustedPals context, always with the (implicit) motivation to make TrustedPals more practical. Following the approach of Chandra and Toueg [7] (and similar to this paper) they separate the trusted system into an asynchronous consensus layer and a partially synchronous failure detection layer. The main difference, however, is that they assume that transient omissions are masked by a piggybacking scheme while we detect transient omissions and, therefore, we do not need to piggyback all message history. Besides, they solve a different version of consensus than we do: roughly speaking, message omissions can cause processes to only be able to communicate indirectly, and we allow processes to participate in consensus even if they cannot communicate directly. Delporte-Gallet et al. [21] only guarantee that all processes that can communicate directly with each other solve consensus. In contrast, we also allow another set of processes to propose and decide: those which are able to send and receive messages even indirectly. As a minor difference, we focus on the class $\Diamond P$ of eventually perfect failure detectors whereas they [21] implement the $\Omega$ failure detector. Furthermore, they [21] do not describe how to integrate failure detection and consensus within the TrustedPals framework: a realistic adversary who is able to selectively influence the communication messages of the algorithms for failure detection and consensus can cause their consensus algorithm to fail. This problem is partly addressed in a recent paper [22] where consensus and failure detection are integrated for efficiency purposes, not for security.

    Apart from Delporte-Gallet et al. [21], other authors also investigated solving consensus in systems with omission faults. Work by Dolev et al. [23], [24] also follows the failure detector approach to solve consensus; however, they focus on the class $\Diamond S(om)$ of failure detectors. Babaoglu et al. [25] also follow the path of $\Diamond S$ to solve consensus in partitionable systems. Alternatively, Santoro and Widmayer [26] assume a synchronous system model, and Moniz et al. [27] use randomization.

    Recently, solving SMC without security modules has received some attention focusing mainly on two-party protocols [28], [29], [30], [31], [32], [33]. In systems with security modules, Avoine and Vaudenay [34] examined the approach of jointly simulating a TTP. This approach was later extended by Avoine et al. [35] who show that in a system with security modules fair exchange can be reduced to a special form of consensus. They derive a solution to fair exchange in a modular way so that the agreement abstraction can be implemented in diverse manners. Benenson et al. [2] extended this idea to the general problem of SMC and showed that the use of security modules cannot improve the resilience of SMC but enables more efficient solutions for SMC problems. All these papers assume a synchronous network model.

    Ben-Or et al. [36] were the first to investigate solutions to the asynchronous variant of SMC, which is slightly more involved than its synchronous counterpart because the failure to provide input to $F$ by some party cannot be attributed solely to that party; it can also be due to unpredictable delays in the network. This has consequences regarding the resilience of SMC. While it is possible to solve SMC in the synchronous setting with a simple majority of benign processes [37], [38], in asynchronous settings a two-thirds majority is necessary [39].
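
To make the two resilience bounds concrete, the following sketch computes the largest number of faulty processes tolerated under each requirement. The function names are illustrative and do not come from the paper. The sketch assumes that "simple majority" means strictly more than n/2 benign processes and "two-thirds majority" means strictly more than 2n/3 benign processes.

```python
def max_faulty_simple_majority(n: int) -> int:
    """Largest f with n - f > n / 2, i.e. a strict simple majority stays benign."""
    if n < 1:
        raise ValueError("n must be positive")
    return (n - 1) // 2


def max_faulty_two_thirds_majority(n: int) -> int:
    """Largest f with n - f > 2n / 3, i.e. a strict two-thirds majority stays benign."""
    if n < 1:
        raise ValueError("n must be positive")
    return (n - 1) // 3


if __name__ == "__main__":
    # Compare both bounds for a few system sizes.
    for n in (4, 7, 10):
        print(n, max_faulty_simple_majority(n), max_faulty_two_thirds_majority(n))
```

Under these assumptions, a system of 10 processes tolerates 4 faulty processes with a simple majority but only 3 with a two-thirds majority.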

    Correia et al. [40] present a system which employs a real-time distributed security kernel to solve SMC. The architecture is very similar to that of TrustedPals as it also uses the notion of architectural hybridization [41]. However, the adversary model of Correia et al. [40] assumes that the attacker only has remote access to the system while TrustedPals allows the owner of a security module to be the attacker. Like other previous works [2], [34], [35], Correia et al. [40] also assume a synchronous network model at least in a part of the system.

    CORTIÑAS ET AL.: SECURE FAILURE DETECTION AND CONSENSUS IN TRUSTEDPALS 611


    1.2 Contributions

    In this paper, we present a modular redesign of TrustedPals using consensus and failure detection as modules. More specifically, we make the following technical contributions:

    • We show how to solve asynchronous Secure Multiparty Computation by implementing TrustedPals in asynchronous systems with a (weak) failure detector. We do this by reducing the problem of SMC to the problem of uniform consensus in omission failure environments. As a corollary we show that in systems with security modules and weak timing assumptions the resilience of asynchronous SMC can be improved from a two-thirds majority to a simple majority of benign processes.

    • We propose a new definition of connectedness in omission environments. Informally, a process is in-connected if it does not crash and, despite omissions, receives either directly or indirectly all messages that a majority of processes sends to it. Similarly, a process is out-connected if it does not crash and all messages it sends are received by a majority of processes. We also consider well-connected processes, which are those processes that are both in-connected and out-connected.

    • We give a novel definition of consensus in the new omission model, by refining the termination property of consensus ("Every in-connected process eventually decides some value"), and an algorithm which uses the failure detector class $\Diamond P(om)$ to solve consensus. That algorithm is an adaptation of the classic algorithm by Chandra and Toueg for the crash model.

    • We give a novel definition of $\Diamond P$ in the omission model, $\Diamond P(om)$, and we show how to implement it in a system with weak synchrony assumptions in the spirit of partial synchrony.

    • We integrate failure detection and consensus securely in TrustedPals by employing message padding and dummy traffic, tools known from the area of privacy enhancing techniques.

    1.3 Paper Outline

    This paper is structured as follows: In Section 2 we give an overview of TrustedPals, its architecture and the motivation behind the definitions and model, and show how asynchronous SMC can be reduced to uniform consensus. In Section 3 we fully formalize the system model of TrustedPals. In Section 4 we show how to solve consensus using an abstract failure detector of the class $\Diamond P(om)$. In Section 5 we show how to implement the failure detector $\Diamond P(om)$ in the omission failure model under very weak synchrony assumptions. In Section 6 we describe how to integrate failure detection and consensus securely in the TrustedPals framework. We conclude in Section 7.

    2 TRUSTEDPALS IN WEAKLY SYNCHRONOUS SYSTEMS

    We now present an overview of the TrustedPals system architecture and show how the reduction from SMC to consensus can be done in nonsynchronous systems. Additionally, this section is meant as an informal introduction giving a high-level view of the definitions used in the remainder of the paper. All necessary notions are fully formalized later in Section 3.

    2.1 Untrusted and Trusted System

    We formalize the system assumptions within a hybrid model, i.e., the model is divided into two parts (see Fig. 1). The lower part consists of n processes which represent the untrusted hosts. The upper part equally consists of n processes which represent the security modules. Due to the lack of mutual trust among untrusted hosts, we call the former part the untrusted system. Since the security modules trust each other we call the latter part the trusted system.

    The processes in the untrusted system (i.e., the hosts) execute (possibly untrustworthy) user applications like e-banking or e-voting programs. Because of the untrustworthy nature of these processes, they use the trusted system as a subsystem to solve the involved security problems. The trusted system consists of software running inside the security modules. This software must have been certified by some accepted authority. It is not possible for the user to install arbitrary software on the trusted system. The tamper-proof nature of the trusted processes makes it possible to protect stored and transmitted information even from the untrusted processes on which they reside. The authority can be an independent service provider (like a network operator) and is only necessary within the bootstrap phase of the system, not during any operational phases (like running the SMC algorithms).

    Formally, the connection between the untrusted and trusted system is achieved by associating each process in the untrusted system (i.e., each host) with exactly one process in the trusted system (i.e., a security module) and vice versa. Hence, every untrusted process has a trusted pal (an associated trusted process). Since host and security module reside on the same physical machine, we assume that for each association there is a bidirectional eventually timely and secure communication channel, e.g., implemented by shared variables or message passing communication in the host operating system.

    612 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 9, NO. 4, JULY/AUGUST 2012

    Fig. 1. The untrusted and trusted system.


    For better readability, we sometimes refer to untrusted processes simply as hosts and to trusted processes simply as security modules. We refer to TrustedPals as the collection of trusted processes together with the services they provide to the untrusted processes.

    2.2 Defining Asynchronous SMC with TrustedPals

    As mentioned in the introduction, SMC allows the set of hosts to correctly compute some common function $F$ of their local values. While the original work on TrustedPals [2] used the synchronous definition of SMC given by Goldreich [42], we follow the definition of asynchronous SMC given by Ben-Or et al. [36]. The difference between the synchronous and asynchronous form of SMC lies in the fact that, in the asynchronous form, $F$ is not computed on all inputs but rather on a subset of inputs of size at least $n - f$, where $f$ is an upper bound on the number of faulty (i.e., nonbenign) processes. A process is benign if, besides not crashing, it follows its protocol (i.e., it is non-Byzantine).

    More formally, let $x_1, \ldots, x_n$ be the private inputs of the hosts. In asynchronous SMC, the result $r$ is computed on the inputs from a subset $C$ of hosts of size at least $n - f$, i.e., $r = F(y_1, \ldots, y_n)$ where $y_i = x_i$ if host $i$ is in $C$ and some default value (e.g., $y_i = 0$) otherwise. The result $r$ should be computed reliably and securely, i.e., as if all hosts were using a TTP. This means that the individual inputs remain secret to other hosts (apart from what is given away by $r$) and that malicious hosts can neither prevent the computation from taking place nor influence $r$ in favorable ways.
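
The following sketch illustrates only the arithmetic of this definition, not the security guarantees: it builds the input vector from a contributing subset of hosts and applies the function to it. The name `smc_result`, the dictionary-based representation of inputs, and the choice of passing the vector to `F` as a single list are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set


def smc_result(
    F: Callable[[List[int]], int],
    inputs: Dict[int, int],
    contributing: Set[int],
    n: int,
    f: int,
    default: int = 0,
) -> int:
    """Evaluate F on y_1..y_n, where y_i = x_i for hosts in C and `default` otherwise."""
    if len(contributing) < n - f:
        raise ValueError("C must contain at least n - f hosts")
    # Hosts are numbered 1..n; hosts outside C contribute the default value.
    y = [inputs[i] if i in contributing else default for i in range(1, n + 1)]
    return F(y)


if __name__ == "__main__":
    x = {1: 10, 2: 20, 3: 30, 4: 40, 5: 50}
    # Hosts 3 and 5 did not contribute; their slots are filled with the default 0.
    print(smc_result(sum, x, {1, 2, 4}, n=5, f=2))
```

With n = 5 and f = 2, any subset of at least three hosts is acceptable; a smaller subset is rejected.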

    In this paper, we use the following definition of (asynchronous) SMC: A protocol solves secure multiparty computation with TrustedPals if it satisfies the following properties [36], [42]:

    • SMC-Validity. If a host receives a result, then that result was computed by applying $F$ on the inputs from a subset $C$ of hosts of size at least $n - f$, i.e., $r = F(y_1, \ldots, y_n)$ where $y_i = x_i$ if host $i$ is in $C$ and some default value (e.g., $y_i = 0$) otherwise. The set $C$ is the same for all hosts that receive a result.

    • SMC-Agreement. No two values returned by security modules differ.

    • SMC-Termination. Every benign host eventually receives a result from its associated security module.

    • SMC-Privacy. Faulty hosts learn nothing about the input values of benign hosts (apart from what is given away by the result $r$ and the input values of all faulty hosts).

    Recall that these properties abstractly specify what happens when a TTP is used to solve the problem [36], [42].

    2.3 Untrusted System: Assumptions

    Within the untrusted system each pair of hosts is connected by a pair of unidirectional communication channels, one in each direction. We assume that there is a minimal set of reliable channels in the system. The rest of the channels can be lossy. Recall that every message sent through a reliable channel is eventually delivered at the destination. We assume no particular ordering relation on channels.

    2.3.1 Failure Model

    The process failure model we assume in the untrusted system is the Byzantine failure model [43]. A Byzantine process can behave arbitrarily. We will assume a majority of benign processes in the untrusted system.

    2.3.2 Timing

    We assume that a local real-time clock is available to each process in the untrusted system, but clocks are not necessarily synchronized within the network. The untrusted system is assumed to be partially synchronous, meaning that eventually unknown bounds on processing and communication delays hold for the majority of benign processes. The model is a variant of the partial synchrony models of Dwork et al. [44]. The differences are that the bounds must hold for just a majority of processes, and that we assume a set of reliable, eventually timely channels connecting those processes.

    2.4 Trusted System: Assumptions

    The trusted system can be considered as an overlay network (a network that is built on top of another network) over the untrusted system. Nodes in the overlay network can be thought of as being connected by virtual or logical channels. In practice, for example, smart cards could form the overlay network which runs on top of the Internet modeled by the untrusted processes. In the trusted system, each process also has an outgoing and an incoming communication channel with every other process.

    2.4.1 Trust Model

    Within the trusted system we assume that any two communicating parties can establish mutual message confidentiality, message integrity, and message authentication. This can be realized, for example, by exchanging cryptographic keys during a setup phase of the system. As mentioned above, we assume that the code running within the trusted system has been certified by some trusted authority, i.e., nodes in the trusted system may assume that each other's programs have not been tampered with. The trusted authority acts only during the setup phase of the system, not during the operational phase.

    2.4.2 Timing

    Security modules do not have any clock; they have just a simple step counter, whereby a step consists of possibly consuming a message, executing a local computation and possibly sending a message. Passing of time is checked by counting the number of steps executed. Roughly speaking, the timing assumptions for the processes in the trusted system are the same as those of the untrusted system, i.e., we assume partial synchrony. However, as we will explain next, in case the trusted process is associated with an untrusted process which is faulty/malicious, the trusted process may not rely on any timing assumptions whatsoever.
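
As an illustration of measuring time by counting steps, the sketch below implements a timeout that expires after a given number of executed steps. The class `StepTimeout` and its methods are hypothetical; the paper does not prescribe this interface.

```python
class StepTimeout:
    """Timeout measured in executed steps rather than wall-clock time."""

    def __init__(self, timeout_steps: int) -> None:
        if timeout_steps <= 0:
            raise ValueError("timeout_steps must be positive")
        self.timeout_steps = timeout_steps
        self.steps = 0  # the step counter: the module's only notion of time
        self.deadline = timeout_steps

    def step(self) -> None:
        """Record one step (consume a message, compute locally, possibly send)."""
        self.steps += 1

    def expired(self) -> bool:
        """True once at least `timeout_steps` steps have passed since the last restart."""
        return self.steps >= self.deadline

    def restart(self) -> None:
        """Restart the timeout from the current step count."""
        self.deadline = self.steps + self.timeout_steps
```

Because steps are the only time source in such a design, whoever controls the rate at which steps are executed also controls when the timeout fires, which is why a security module on a malicious host cannot rely on timing assumptions.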

    2.4.3 Failure Model

    Like the untrusted system, the trusted system is also prone to attacks. However, the assumptions on the security modules and the possibility to establish secure channels reduce the options of the malicious hosts to attacks on the liveness of the system, i.e., 1) destruction of the security module, 2) interception of messages between the channel and the security module, or 3) changes in the frequency of the step counter. This way, in the trusted system we assume the failure model of general omission and some asynchrony that can only affect those trusted processes associated with faulty processes in the untrusted system (i.e., security modules residing on Byzantine hosts).

    The concept of omission faults, meaning that a process drops a message either while sending (send omission) or while receiving it (receive omission), was introduced by Hadzilacos [45] and later generalized by Perry and Toueg [46]. In the general omission model processes can fail by crashing or experience either send omissions or receive omissions. In our system we allow the possibility of transient omissions, i.e., a process may temporarily drop messages and later on reliably deliver messages again. Of course, permanent omissions are possible too.

    Besides omissions, trusted processes can become arbitrarily slow (asynchronous) although the physical system in which they operate (i.e., the untrusted system) is partially synchronous. This models the effect of two different types of attacks by malicious hosts which we now explain:

    • Timing attacks. Recall that security modules use a step counter to be aware of passing of time. The speed of the step counter is controlled by the associated host. In a timing attack, a malicious host arbitrarily changes the speed of the step counter of its security module. In that way, it can make the security module work faster (although the speed is physically bounded by the host's own clock) or slower (not bounded). As a consequence, the behavior of its security module could become asynchronous and thus the communication through all the virtual channels which are adjacent to that security module would become asynchronous as well.

    • Buffering attacks. As we have previously pointed out, a malicious host can intercept messages between its security module and the communication channels. Removal of an intercepted message is modeled as a message omission. In a buffering attack, the host does not remove the messages but stores them in a buffer and later on injects them into the communication channel after an arbitrary delay. This means that the message is not omitted but communication through that channel may become asynchronous in the trusted system.

    Observe that both attacks affect the timing behavior of the attacked security module. However, buffering attacks are more selective, since a particular communication channel of a security module could be attacked (become asynchronous in the trusted system) without affecting the rest of the communication channels of that security module.

    In addition to the previous types of attacks, message reordering attacks are treated as a particular case of message buffering attacks, since in our algorithms no message is delivered if it is not the expected one (messages carry a unique sequence number). Also, note that the case of buffer overflow in the smart card (e.g., if the attacker buffers a lot of messages and then passes all those messages at the same time to the smart card) can be naturally treated as if the smart card omitted the reception of some message(s).

    To summarize, processes in the trusted system can fail by crashing or omitting messages. Additional types of failure include the process or any of its incoming or outgoing communication channels becoming asynchronous and/or lossy. We will fully formalize this faulty behavior in Section 3.

    2.4.4 Classes of Processes

    Earlier work on failure detectors and partial synchrony [7], [44] assumed a majority of correct processes in order to solve consensus. However, as observed earlier, we allow faulty processes to participate in consensus provided that they keep their ability to compute (no crash) and to communicate without omissions with a majority of processes.

    In order to circumvent some transient omissions, we admit the possibility of indirect communication between two processes. For example, if there are omissions in the communication channel from a process p to another process q, but both of them have no omissions with a third process r, process p could indirectly communicate with q through r without any omission. This way, a process will be considered a well-connected process as long as it is able to communicate with a majority of processes without any omission, even if it has suffered some omissions. The set of well-connected processes will be formalized in Section 3.

    Furthermore, we should notice that the connectedness of a process can be asymmetric, since it can suffer send omissions and receive omissions independently, e.g., a process can be able to send to a majority of processes, but not be able to receive from a majority of processes (because it has had too many receive omissions). Following this motivation, we consider the following classes of processes, based on their ability to communicate (we give only rough and intuitive definitions here and fully formalize these notions later in Section 3):

    • A process is in-connected if it does not crash and it receives all messages that some well-connected process sends to it.

    • A process is out-connected if it does not crash and all messages it sends to some well-connected process are received by that process.

    Based on these definitions, well-connected processes are both in-connected and out-connected. Observe that every out-connected process can send information to any in-connected process with no omissions. Fig. 2 shows an example where arcs represent channels with no omissions. The majority of well-connected processes corresponds to the set $\{x, y, w, r, v\}$. Processes p and q are out-connected, while process s is in-connected. Finally, process u is neither in-connected nor out-connected.
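
The informal classes can be explored with a small graph computation. The sketch below is one possible reading of the rough definitions, not the formalization of Section 3: it takes the directed channels without omissions, treats a process as well-connected if it can exchange messages in both directions (possibly indirectly) with a majority of processes counting itself, and derives the in-connected and out-connected sets from that. The function names, the inclusion of the process itself in the majority count, and the example arcs (Fig. 2 is not reproduced here) are all assumptions.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Process = Hashable


def _reachable(adj: Dict[Process, Set[Process]], start: Process) -> Set[Process]:
    """All processes reachable from `start` (including `start`) along `adj`."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def classify(
    processes: Iterable[Process],
    channels: Iterable[Tuple[Process, Process]],
) -> Tuple[Set[Process], Set[Process], Set[Process]]:
    """Return (well_connected, in_connected, out_connected).

    `processes` holds the non-crashed processes; `channels` holds directed
    pairs (u, v) meaning that messages from u to v are never omitted.
    """
    procs = set(processes)
    fwd: Dict[Process, Set[Process]] = {p: set() for p in procs}
    bwd: Dict[Process, Set[Process]] = {p: set() for p in procs}
    for u, v in channels:
        fwd[u].add(v)
        bwd[v].add(u)
    majority = len(procs) // 2 + 1
    can_reach = {p: _reachable(fwd, p) for p in procs}
    reached_by = {p: _reachable(bwd, p) for p in procs}
    # Well-connected: two-way (possibly indirect) communication with a majority.
    well = {p for p in procs if len(can_reach[p] & reached_by[p]) >= majority}
    # In-connected: some well-connected process can reach it.
    in_connected = {p for p in procs if reached_by[p] & well}
    # Out-connected: it can reach some well-connected process.
    out_connected = {p for p in procs if can_reach[p] & well}
    return well, in_connected, out_connected


if __name__ == "__main__":
    core = ["x", "y", "w", "r", "v"]
    ring = list(zip(core, core[1:] + core[:1]))
    # Hypothetical arcs: a two-way ring among the core, p and q can only send,
    # s can only receive, and u is isolated.
    arcs = set(ring) | {(b, a) for a, b in ring} | {("p", "x"), ("q", "y"), ("r", "s")}
    print(classify(core + ["p", "q", "s", "u"], arcs))
```

In this hypothetical topology with nine processes, the five core processes come out as well-connected, p and q as out-connected only, s as in-connected only, and u as neither, which matches the classification described for Fig. 2.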


    Fig. 2. Examples for classes of processes.


    2.4.5 Connecting the Failure Assumptions

    To connect the failure assumptions from trusted and untrusted systems we make the following assumption: a benign process in the untrusted system, i.e., a benign host, implies that its associated process in the trusted system is well-connected. Since there is a majority of benign hosts, the previous assumption implies that there is a majority of well-connected processes in the trusted system. Observe that a nonbenign host does not necessarily imply a non-well-connected process in the trusted system.

    2.5 Uniform Consensus

    We are now ready to give the definition of consensus used in this paper. Intuitively, in the consensus problem, every process proposes a value, and correct processes must eventually decide on some common value that has been proposed. In the crash model, every correct process is required to eventually decide some value. This is called the Termination property of consensus. The difference between (regular) consensus and uniform consensus lies in the uniform agreement property that demands that noncorrect processes are not allowed to decide differently from correct processes.

    In order to adapt consensus to the omission model, we argue that only the Termination property has to be redefined. This property now involves in-connected processes, since, although they can experience some omissions, in-connected processes will be able to receive the decision. The properties of uniform consensus in the omission model are thus the following:

    • Termination. Every in-connected process eventually decides some value.

    • Integrity. Every process decides at most once.

    • Uniform agreement. No two processes decide differently.

    • Validity. If a process decides v, then v was proposed by some process.
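
The four properties can be checked mechanically on a recorded run. The sketch below is a hypothetical test helper, not part of the paper's algorithms: it takes the proposals, the list of values each process decided, and the set of in-connected processes. Since Termination is a liveness property, the check on a finite run can only confirm that every in-connected process has decided by the end of the recording.

```python
from typing import Dict, Hashable, List, Set


def check_uniform_consensus(
    proposals: Dict[str, Hashable],
    decisions: Dict[str, List[Hashable]],
    in_connected: Set[str],
) -> Dict[str, bool]:
    """Check the four properties on a finite run.

    `decisions[p]` lists every value that process p decided, in order;
    an empty or missing list means p has not decided.
    """
    decided = [v for values in decisions.values() for v in values]
    proposed = set(proposals.values())
    return {
        # Every in-connected process has decided (by the end of the run).
        "termination": all(len(decisions.get(p, [])) >= 1 for p in in_connected),
        # No process decides more than once.
        "integrity": all(len(values) <= 1 for values in decisions.values()),
        # All decided values, from any process, are identical.
        "uniform_agreement": len(set(decided)) <= 1,
        # Every decided value was proposed by some process.
        "validity": all(v in proposed for v in decided),
    }
```

Note that uniform agreement is evaluated over all deciding processes, including those that are not in-connected, which is what distinguishes it from regular agreement.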

    2.6 Solving Asynchronous SMC with TrustedPals

    The original work on TrustedPals [2] reduced the problem of SMC to that of Uniform Interactive Consistency (UIC) in the trusted system. Roughly speaking, within the trusted system, trusted processes exchange the inputs, then compute the function $F$ and synchronize before returning the result back into the untrusted system. The implementation was based on an algorithm for uniform consensus by Parvédy and Raynal [47]. We now argue that the asynchronous variant of SMC can also be solved using uniform consensus in TrustedPals. We give pseudocode in the form of a procedure in Fig. 3.

    For every trusted process $i$, after receiving the input value $x_i$ from its host, the trusted process sends $x_i$ to all other processes in the trusted system. Thereafter, it waits for the receipt of at least $\lceil (n+1)/2 \rceil$ values from other trusted processes and collects the pairs $(j, x_j)$ in a local set $V_i$, where $x_j$ is the value received from process $j$. Next, the set $V_i$ is proposed to uniform consensus. Let $V$ denote the decision value of uniform consensus. From $V$, the process constructs the vector of inputs for $F$ (as used in the definition of SMC), computes $F$ on that vector and returns the result to its host.
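
The procedure just described can be sketched as follows. This is a simplified, single-machine illustration and not the pseudocode of Fig. 3: message exchange is replaced by a dictionary of values already collected, the majority threshold is taken to be n // 2 + 1 (an assumption of this sketch), and `stub_uniform_consensus` is a placeholder that simply picks the proposal of the lowest-numbered proposer. All function names are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Proposal = FrozenSet[Tuple[int, int]]  # a set V_i of pairs (j, x_j)


def build_proposal(n: int, received: Dict[int, int]) -> Proposal:
    """Turn the values collected so far into the set V_i proposed to consensus."""
    if len(received) < n // 2 + 1:
        raise ValueError("not enough values collected yet")
    return frozenset(received.items())


def stub_uniform_consensus(proposals: Dict[int, Proposal]) -> Proposal:
    """Placeholder for uniform consensus: decide the lowest-numbered proposer's set."""
    return proposals[min(proposals)]


def compute_result(
    n: int,
    decision: Proposal,
    F: Callable[[List[int]], int],
    default: int = 0,
) -> int:
    """Build the input vector from the decided set V and apply F to it."""
    values = dict(decision)
    return F([values.get(j, default) for j in range(1, n + 1)])


if __name__ == "__main__":
    n = 5
    x = {1: 3, 2: 1, 3: 4, 4: 1, 5: 5}
    # Each trusted process may have collected values from a different majority.
    heard_from = {1: {1, 2, 3}, 2: {2, 3, 4}, 3: {1, 3, 5}}
    proposals = {i: build_proposal(n, {j: x[j] for j in js}) for i, js in heard_from.items()}
    V = stub_uniform_consensus(proposals)
    # Every process computes F on the same decided set, so all results agree.
    print({i: compute_result(n, V, max) for i in proposals})
```

The point of the sketch is that the processes may propose different sets, but because they all evaluate the function on the single decided set, they return the same result.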

    We now argue that this algorithm implements SMC for a majority of benign hosts if uniform consensus can be solved in the trusted system. Recall that a majority of benign hosts implies that a majority of processes in the trusted system will be well-connected and thus in-connected.

    First consider SMC-Termination. Although the system is asynchronous, the distribution of input values to all other processes will terminate since only a majority of values has to be received and a majority of hosts are benign. Therefore, all the trusted processes of benign hosts will eventually propose a value to uniform consensus. From the Termination property of uniform consensus, every such process will eventually decide and return the computed value to its host.

    Now consider SMC-Validity. Since all security modules on benign hosts enter uniform consensus, they all propose a set $V_i$ containing a majority of values. From the Validity property of uniform consensus, the decided value $V$ on every such process will also contain a majority of values. The uniform agreement property of uniform consensus in turn guarantees that any returned result will be computed on the same vector of inputs. SMC-Agreement trivially follows from the protocol and the uniform agreement property of uniform consensus.

    Proving SMC-Privacy is much more intricate because it depends critically on how the trusted system (i.e., TrustedPals) operates. Intuitively, the trusted system should leak no information other than what is communicated at its interface (i.e., the input of $x_i$ and the output of the result $r$). This will be the subject of Section 6, where we integrate all algorithms in the trusted system such that we achieve a form of unobservability [48], [49], a notion known from the area of privacy-enhancing techniques that theoretically closes all side channels through which confidential information may leak.

    Fig. 4 summarizes the layers and interfaces of the proposed modular architecture for TrustedPals. Message exchange is performed in the transport layer, which is under control of the untrusted host. The security mechanisms for message encryption/decryption run in the layer termed Security on the security module. The failure detection and consensus/SMC algorithms run in the failure detector and consensus/SMC layers, respectively. Finally, in the application layer, which again is under the control of the untrusted host, application software offering user interfaces to consensus/SMC operates.


    Fig. 3. Solving SMC using uniform consensus.


    3 FORMALIZATION OF THE TRUSTED SYSTEM

    As explained above, the reduction from SMC to Consensus assumes that there exists an algorithm for Uniform Consensus in the trusted system. Describing such an algorithm will be the main subject of the remainder of this paper. But before we describe the algorithm, we first formalize all necessary concepts within the trusted system.

    3.1 Processes and Channels

    We model a distributed system as a set of $n > 1$ processes $\Pi = \{p_1, p_2, \ldots, p_n\}$ which are connected through pairwise bidirectional communication channels in a fully connected topology. In the following, we will also use $p$, $q$, $r$, etc., to denote processes. We denote the channel from $p$ to $q$ by $c_{pq}$.

    3.2 Algorithms and Events

    An algorithm A consists of a set of deterministic automata, one for each process. We give our algorithms in an event-based notation and thus assume that a local FIFO event queue is part of the local state of every process. Within an execution step, a process takes an event from the queue, performs a state transition according to the event, and then may send a message or add a new event to the queue. Message arrivals are treated as events too, i.e., when a message arrives, an appropriate event is added to the queue. The message is received by the process when this event is processed. We assume that every process which does not crash executes infinitely many events.

    3.3 Global Clock

    We use a discrete global clock to simplify the presentation of our model. However, no process has access to this clock; it is merely a fictional device. For simplicity we take the range $T$ of the clock to be the set of natural numbers.

    Steps (i.e., event executions) on processes are always associated with a certain global time. We assume a linear model of event execution, i.e., for every instant in time there is at most one event in the system which is executed.

    3.4 Process Failures

    Processes can experience different kinds of failures: crash failures, omission failures, and timing failures.

    A crash failure set $F_c$ is a subset of processes. Informally, $F_c$ contains all processes that will eventually crash.

    A send-omission failure set F so is a relation over. Informally, p; q 2 F so means that process p experiences

    at least one send omission toward q . If p; q 62 F so then pnever experiences a send omission toward q .

    Similarly, a receive-omission failure set F ro is arelation over . Informally, p; q 2 F ro means that processq experiences at least one receive omission from p. So if

    p; q 62 F ro then q never experiences a receive omissionfrom p.

    Some processes may experience timing failures. Timing failures refer to process asynchrony. We define an asynchronous process failure set $F_{ap}$ as a subset of processes. Intuitively, $F_{ap}$ contains all processes which are asynchronous in the system. Processes which are not in $F_{ap}$ are eventually synchronous [44], meaning that their processing speed is eventually bounded. Formally, a process p is synchronous if there exists a known bound on the time between the execution of any two steps of p. A process p is eventually synchronous if there exists a time after which p is synchronous (additionally, the bound can be unknown). Note that this implies that the relative process speed between any pair of eventually synchronous processes is bounded. In our system model, both the bound and the time after which it holds are unknown.

    A process failure set $F = (F_c, F_{so}, F_{ro}, F_{ap})$ is a tuple consisting of a crash failure set, a send-omission failure set, a receive-omission failure set, and an asynchronous process failure set.

    We define the set of correct processes to be the set of all processes that neither crash nor experience any omission nor are asynchronous. We denote this set by $C$. Formally,

    $$C = \{\, p \in \Pi : p \notin F_c \wedge p \notin F_{ap} \wedge \forall q \in \Pi : (p, q) \notin F_{so} \wedge (q, p) \notin F_{ro} \,\}.$$
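As a sanity check of the definition of the correct set, the following Python sketch computes it from the four failure sets. The function name `correct_processes` and the example failure sets are hypothetical; the pair convention follows the text, where a pair (p, q) in the send-omission set means p omits toward q, and a pair (p, q) in the receive-omission set means q omits messages from p.

```python
def correct_processes(procs, F_c, F_so, F_ro, F_ap):
    """Return the processes that neither crash, omit, nor are asynchronous."""
    return {
        p for p in procs
        if p not in F_c and p not in F_ap
        and all((p, q) not in F_so and (q, p) not in F_ro for q in procs)
    }


procs = {"p1", "p2", "p3", "p4"}
F_c = {"p4"}            # p4 eventually crashes
F_so = {("p2", "p3")}   # p2 has a send omission toward p3
F_ro = {("p1", "p3")}   # p3 has a receive omission from p1
F_ap = set()            # no asynchronous processes

C = correct_processes(procs, F_c, F_so, F_ro, F_ap)
```

In this example only p1 is correct: p2 is excluded by its send omission, p3 by its receive omission, and p4 by its crash.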

    As we will see, we do not need to assume the existence of a majority of correct processes. Moreover, the set of correct processes could even be empty.

    3.5 Send and Receive

    Processes can send a message using the Send primitive. The event Send(p, m, q, t) means that at time t process p sends m to q. More precisely, m is inserted into the channel $c_{pq}$ unless p experiences a send omission of m toward q. If $c_{pq}$ is a reliable channel, for any message m inserted into $c_{pq}$, the channel guarantees that eventually an appropriate event is added to the local event queue of process q. However, in case of buffering attacks or channel asynchrony/loss, there may be no time bound for the event to be added at q. If $c_{pq}$ is not reliable or q experiences a receive omission, then m is removed from the channel without adding an appropriate event to the event queue of q. When the event corresponding to m is processed, we say that the message is received at time t, formalized as the occurrence of the event Receive(p, m, q, t). We allow processes to selectively wait for messages.

    No particular order relations are defined in the reception of messages. We assume that every message m from p to q is tagged with a unique sequence number assigned by p.

    616 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 9, NO. 4, JULY/AUGUST 2012

    Fig. 4. The architecture of our system.


    Eventual Strong Accuracy ensures that out-connected processes will eventually be trusted forever. Finally, Strong Completeness avoids trusting forever a process that is not out-connected.

    Our definition of an eventually perfect failure detector results naturally, i.e., at no additional cost, from the uniform (periodical, all-to-all) communication pattern used in our failure detection algorithm. We use this communication pattern in order to avoid side channel attacks (see Section 6) that an adversary could otherwise attempt in Byzantine environments. That said, we conjecture that in a non-Byzantine omissive setting a failure detector of type $\Diamond S$ or $\Omega$ could be strong enough to solve consensus, as is the case in the crash failure model.

    4 $\Diamond P_{om}$-BASED CONSENSUS IN THE TRUSTED SYSTEM

    We now focus our attention on the consensus layer of the TrustedPals architecture (see Fig. 4), in which we implement uniform consensus (as defined in Section 2.5) using the failure detector class $\Diamond P_{om}$ defined in the previous section. Note that we fully operate within the trusted system, i.e., the consensus algorithm itself is asynchronous, meaning that it tolerates arbitrary phases of asynchrony. We first give the consensus algorithm and then prove its correctness.

    4.1 Consensus Algorithm

    Figs. 5 and 6 present an algorithm solving consensus in the omission model using $\Diamond P_{om}$. It is an adaptation of the well-known $\Diamond S$-based Chandra-Toueg consensus algorithm [7] (which also works with a $\Diamond P$ failure detector).

    The use of $\Diamond P_{om}$ by every process p is modeled by means of the following two variables: a Boolean variable I_am_InConnected_p, which provides the In-connectedness property, and a set OutConnected_p, which provides the Strong Completeness and Eventual Strong Accuracy properties.

    The algorithm is based on the rotating coordinator paradigm. It executes in rounds, and each round is coordinated by a single process, which tries to impose a value on the rest of the participants. If it succeeds, then it takes a decision and reliably broadcasts [50] it to all processes, which adopt it. Otherwise, i.e., if the current coordinator is suspected, processes advance to the next round. Each round is divided into four phases: a voting phase, a proposition phase, an acknowledgment phase, and a (potential) decision phase. An adequate use of the $\Diamond P_{om}$ failure detector ensures that, if not earlier, eventually a process that is well-connected will succeed in imposing a value and thus will decide.
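The rotating coordinator paradigm can be sketched in a few lines of Python. This is an illustration only: it assumes the coordinator rule (r mod n) + 1 of the original Chandra-Toueg algorithm [7], which may differ from the exact rule used in Fig. 5, and the names `coordinator`, `schedule`, and `PHASES` are hypothetical.

```python
# Phase names as listed in the text, in execution order.
PHASES = ("voting", "proposition", "acknowledgment", "decision")


def coordinator(round_no, n):
    """1-based index of the coordinator of a round (assumed rule)."""
    return (round_no % n) + 1


def schedule(rounds, n):
    """List (round, coordinator) pairs for the first `rounds` rounds."""
    return [(r, coordinator(r, n)) for r in range(1, rounds + 1)]


plan = schedule(5, 3)
```

With three processes, every process coordinates once in any three consecutive rounds, so a well-connected process is eventually given the chance to impose its value.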

    We now comment on the modifications required to adapt the original Chandra-Toueg algorithm:

    . In Phase 2, the current coordinator waits for a majority of estimates while it considers itself as in-connected in order not to block.

    . In Phase 3, every process p waits for the new estimate proposed by the current coordinator while p considers itself as in-connected and the coordinator as out-connected in order not to block.

    . In Phase 4, if the current coordinator sent a valid estimate in Phase 2, it waits for replies of out-connected processes while it considers itself as in-connected in order not to block.
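The three modified wait conditions can be read as guard predicates. The sketch below is a hedged illustration, not the code of Fig. 5: the function names are hypothetical, the Phase 3 guard ignores NEXT messages, and the Phase 4 guard assumes that "replies of out-connected processes" means a reply from every process currently in the coordinator's OutConnected set.

```python
def phase2_wait_done(estimates_from, n, i_am_in_connected):
    """Coordinator: stop on a majority of estimates or on losing in-connectivity."""
    return len(estimates_from) > n // 2 or not i_am_in_connected


def phase3_wait_done(got_estimate, i_am_in_connected, coord, out_connected):
    """Participant: stop on the new estimate, or when either guard fails."""
    return got_estimate or not i_am_in_connected or coord not in out_connected


def phase4_wait_done(replies_from, out_connected, i_am_in_connected):
    """Coordinator: stop once all out-connected processes replied (assumption)."""
    return out_connected <= replies_from or not i_am_in_connected
```

Each guard becomes true either because the awaited messages arrived or because the failure detector output changed, which is why no process blocks forever in a wait instruction.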

    When a process p sends a message m to another process q, the following relaying approach is assumed: 1) p sends m to all processes, including q, except p itself, and 2) whenever p receives for the first time a message m whose actual destination is another process q, p forwards m to all processes (except the process from which p has received m and p itself). This approach can take advantage of the underlying all-to-all implementation of the $\Diamond P_{om}$ failure detector without generating any extra message apart from the periodical all-to-all communication pattern of the failure detector, as we will see in the next section.
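The relaying rule can be sketched as follows. The class `Relay`, its method names, and the use of message identifiers for duplicate detection are assumptions made for illustration; in TrustedPals the relayed messages travel inside heartbeats rather than as separate sends.

```python
class Relay:
    """Hypothetical per-process relaying state."""

    def __init__(self, me, procs):
        self.me = me
        self.procs = procs
        self.seen = set()  # identifiers of messages already handled

    def originate(self, msg_id):
        """Step 1: the sender sends m to all processes except itself."""
        self.seen.add(msg_id)
        return [r for r in self.procs if r != self.me]

    def on_receive(self, msg_id, dest, sender):
        """Step 2: return (deliver_locally, forward_to) for an incoming copy."""
        first_time = msg_id not in self.seen
        self.seen.add(msg_id)
        if dest == self.me:
            return first_time, []
        if first_time:
            # Forward to everyone except ourselves and the process we got it from.
            return False, [r for r in self.procs if r not in (self.me, sender)]
        return False, []


procs = ["p", "q", "s", "t"]
at_p, at_s, at_q = Relay("p", procs), Relay("s", procs), Relay("q", procs)
first_hop = at_p.originate("m1")
forwarded = at_s.on_receive("m1", "q", "p")
duplicate = at_s.on_receive("m1", "q", "t")
delivered = at_q.on_receive("m1", "q", "p")
```

Here s forwards the first copy of m1 to q and t, ignores the duplicate it later gets from t, and q delivers the message locally.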

    4.2 Correctness Proof

    Since the algorithm is very similar to the one proposed by Chandra and Toueg [7], we only sketch the correctness proof here. First of all, observe that uniform agreement is preserved, because we keep the original mechanism based on majorities to decide on a value. Also, it is easy to see that integrity and validity are satisfied. Finally, in order to show that termination is satisfied, we first show that the algorithm does not block in any of its wait instructions:

    . In Phase 2, if the current coordinator p is not in-connected, it will eventually stop waiting because the failure detector will eventually set I_am_InConnected_p to FALSE. On the other hand, if p is in-connected, it will eventually receive a majority of estimates since by definition there is a majority of well-connected processes in the system. Hence, no coordinator blocks forever in the wait instruction of Phase 2.

    . In Phase 3, every process p waits for the new estimate proposed by the current coordinator or a NEXT message while p considers itself as in-connected and the coordinator as out-connected. Clearly, by the properties of $\Diamond P_{om}$, no process blocks forever in the wait instruction of Phase 3.

    . In Phase 4, the current coordinator waits for replies of out-connected processes while it considers itself as in-connected. Again, by the properties of $\Diamond P_{om}$, no coordinator blocks forever in the wait instruction of Phase 4.

    By the previous facts, eventually some well-connected process c will coordinate a round in which:

    . In Phase 2, the coordinator c will receive a majority of estimates, because I_am_InConnected_c will be permanently set to TRUE (by the properties of $\Diamond P_{om}$) and there is a majority of well-connected processes in the system. Hence, c will send a valid estimate to all processes at the end of Phase 2.

    . In Phase 3, every well-connected process p will receive c's valid estimate, because I_am_InConnected_p will be permanently set to TRUE and c will be permanently in OutConnected_p (by the properties of $\Diamond P_{om}$). Hence, p will send an ACK message to c at the end of Phase 3.


    CORTIÑAS ET AL.: SECURE FAILURE DETECTION AND CONSENSUS IN TRUSTEDPALS 619

    Fig. 5. Solving consensus in the omission model using $\Diamond P_{om}$: main algorithm.


    . In Phase 4, the coordinator c will receive a majority of ACK messages, because I_am_InConnected_c will be permanently set to TRUE, all well-connected processes will be permanently in OutConnected_c (by the properties of $\Diamond P_{om}$), and there is a majority of well-connected processes in the system. Hence, c will R-broadcast the decision, and every in-connected process will eventually decide.

    5 FAILURE DETECTION IN THE TRUSTED SYSTEM

    Recall from Section 2 that at the transport layer TrustedPals merely assumes a partially synchronous system model. On top of this model, we now explain how to implement a failure detector of class $\Diamond P_{om}$, i.e., we focus on the failure detection layer of the TrustedPals architecture (see Fig. 4).

    The failure detection algorithm presented in this section determines the connectivity relations defined in Section 3 and builds a basis for the consensus algorithm of Section 4. The algorithm is based on heartbeat messages that every process sends periodically to the rest of the processes. This scheme provides a kind of delayed message forwarding, which enables indirect communication of information attached to heartbeat messages. At the same time, it provides the support for piggybacking consensus level messages, as we will see in Section 6.

    5.1 Failure Detector Algorithm

    Figs. 7, 8, and 9 present an algorithm implementing $\Diamond P_{om}$ according to the properties defined in Section 3. The algorithm provides to every process p a set of out-connected processes, OutConnected_p, and a Boolean variable, I_am_InConnected_p. The set OutConnected_p will eventually and permanently contain all the out-connected processes in case p is in-connected. Regarding the I_am_InConnected_p variable, it will be TRUE if p is in-connected.

    The algorithm is based on the periodical communication of heartbeat messages from every process to the rest of the processes. Roughly speaking, heartbeat messages carry information about connectivity among processes. When a process p receives a heartbeat message, it uses that connectivity information to update its perception of the connectivity of the rest of the processes. This information, together with p's own connectivity, gives p a view of the current system connectivity, which will be propagated to the rest of the processes attached to p's subsequent heartbeat messages. Next we explain in detail how our algorithm implements this approach.

    Every process p has a matrix $M_p$ of $n \times n$ elements representing connectivity information ($\rightarrow$ relations between every pair of processes). Every heartbeat message from p carries the matrix $M_p$ and a sequence number used to detect message omissions. Received messages are buffered to be delivered in FIFO order. When a heartbeat from process q is received by process p, p updates $M_p$ according to the state of its input channels and with the received $M_q$. Actually, $M_p$ represents the transposed adjacency matrix, a (0,1)-matrix, of a directed graph, where the value of the element $M_p[p][q]$ shows whether there is an arc from q to p. $M_p$ has the information needed to calculate the set of out-connected processes and the value of the I_am_InConnected_p variable. The algorithm calculates powers of the adjacency matrix to find paths of any length between processes, which correspond to transitive relations $q \rightarrow p$ and therefore represent indirect communication paths from q to p. The set of out-connected processes, along with the I_am_InConnected_p variable, is computed in the update_Connectivity procedure (Fig. 8), which is called every time a value of the matrix $M_p$ is changed. Observe that it is important for a process p to check its own in-connectivity to verify the validity of the information contained in $M_p$. The in-connectivity condition of p (more than $\lfloor n/2 \rfloor$ processes communicate properly with it) is checked in the update_Connectivity procedure too, and its value is output to the I_am_InConnected_p variable.
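The matrix-power computation can be sketched in Python as follows. This is not the procedure of Fig. 8, which is not reproduced here; it is a minimal illustration that assumes Boolean matrix products, self-loops on the diagonal (so that the n-th power captures paths of any length up to n), and a majority threshold that counts the process itself. All function names are hypothetical.

```python
def bool_matmul(A, B):
    """Boolean product of two square (0,1)-matrices."""
    n = len(A)
    return [[int(any(A[i][k] and B[k][j] for k in range(n)))
             for j in range(n)] for i in range(n)]


def reachability(M):
    """R[p][q] = 1 iff some path q -> ... -> p exists (M is transposed adjacency)."""
    n = len(M)
    base = [[1 if i == j else M[i][j] for j in range(n)] for i in range(n)]
    R = base
    for _ in range(n - 1):
        R = bool_matmul(R, base)
    return R


def update_connectivity(M, p):
    """Return (I_am_InConnected, OutConnected) for process p (sketch)."""
    n = len(M)
    R = reachability(M)
    in_connected = sum(R[p]) > n // 2               # row p: who reaches p
    out_connected = {q for q in range(n)            # column q: whom q reaches
                     if sum(R[s][q] for s in range(n)) > n // 2}
    return in_connected, out_connected


# Arcs 0->1, 1->2, 2->0 and 0->3; M[p][q] = 1 iff there is an arc from q to p.
M = [[0, 0, 1, 0],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [1, 0, 0, 0]]
result = update_connectivity(M, 0)
```

In this example processes 0, 1, and 2 reach a majority and are reported as out-connected, whereas process 3 reaches only itself and is excluded.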

    In the algorithm, every process p executes three tasks:

    . In Task 1 (line 13), p periodically sends to the rest of the processes a heartbeat message including the matrix $M_p$ and a unique sequence number.

    . In Task 2 (line 20), if p does not receive the expected message from a process q (according to the next_receive_p[q] sequence number) in the expected time, the value of $M_p[p][q]$ is set to 0.

    . In Task 3 (line 27), received messages are processed. The messages p receives from another process q are inserted in a FIFO buffer Buffer_p[q] (line 28), and delivered following the sequence number next_receive_p[q]. Once the next expected message from a process q has been delivered, an empty buffer means that there is no message left from q, so $M_p[p][q]$ is set to 1.
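The interplay of Tasks 2 and 3 for a single peer can be sketched as follows. The class `Link` and its fields are hypothetical stand-ins for the per-peer buffer, the expected sequence number, and the matrix entry described above; starting sequence numbers at 1 is an assumption, and timers are replaced by an explicit `on_timeout` call.

```python
class Link:
    """State that p keeps about one peer q (hypothetical field names)."""

    def __init__(self):
        self.buffer = {}       # seq -> message, stands in for the FIFO buffer
        self.next_receive = 1  # next expected sequence number (assumed start)
        self.connected = 0     # stands in for the matrix entry of p about q

    def on_timeout(self):
        # Task 2: the expected message did not arrive in time.
        self.connected = 0

    def on_message(self, seq, msg):
        # Task 3: buffer the message, then deliver in sequence-number order.
        delivered = []
        self.buffer[seq] = msg
        while self.next_receive in self.buffer:
            delivered.append(self.buffer.pop(self.next_receive))
            self.next_receive += 1
        if delivered and not self.buffer:
            self.connected = 1  # nothing from q is left pending
        return delivered


link = Link()
early = link.on_message(2, "hb2")   # out of order: buffered, not delivered
caught_up = link.on_message(1, "hb1")
```

The out-of-order heartbeat is held back until the missing one arrives, and only then is the peer marked as connected again.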

    The procedure deliver_next_message (Fig. 9) is used to update the adjacency matrix $M_p$ using the information carried by the message. In the procedure, process p copies into $M_p$ the row q of the matrix $M_q$ received from q. This way, p learns about q's input connectivity. With respect to every other process u, a mechanism based on version numbers is used to avoid copying old information about u's input connectivity. Process p will only copy into $M_p$ the row u of $M_q$ if its associated version number is higher.
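The version-based row merge can be sketched as below. This is a hedged reading of the description, not the code of Fig. 9: the function name `merge_rows` is hypothetical, and skipping p's own row (which reflects p's local observations) is an assumption.

```python
def merge_rows(M_p, version_p, M_q, version_q, q, p):
    """Copy row q from q's matrix, and any other row u with a newer version."""
    changed = False
    for u in range(len(M_p)):
        if u == p:
            continue  # assumption: p's own row is maintained locally
        if u == q or version_q[u] > version_p[u]:
            if M_p[u] != M_q[u]:
                M_p[u] = list(M_q[u])
                changed = True
            version_p[u] = max(version_p[u], version_q[u])
    return changed


M_p = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
version_p = [5, 1, 3]
M_q = [[1, 1, 1], [1, 0, 1], [0, 1, 0]]
version_q = [9, 2, 2]
changed = merge_rows(M_p, version_p, M_q, version_q, q=1, p=0)
```

Row 1 is taken from its owner, while row 2 is left untouched because the sender's copy carries an older version number.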

    The periodical exchange of matrices provides a mechanism to indirectly communicate connectivity information, which allows the existence of eventually timely paths corresponding to $\rightarrow$ relations to be determined. This will be the case for processes in the set C and processes that are in- or out-connected as defined in Section 3. Besides, a process p which is not out-connected due to timing/buffering attacks or channel asynchrony could behave as out-connected if several reliable (i.e., neither omissive nor lossy) paths from p to C perfectly alternate in such a way that, eventually, there is always a path from p to C along which no time-out expires. In other words, the set of paths emulates an eventually timely path from p to C, but there is not a stable eventually timely path from p to C. An analogous reasoning


    Fig. 6. Solving consensus in the omission model using $\Diamond P_{om}$: adopting the decision.


    can be made for a process that, not being in-connected, could behave as in-connected. Although this strange behavior might exist, for simplicity we assume that it does not occur. Observe that the assumption of a majority of benign processes connected via $\rightarrow$ relations makes this form of pseudoconnectedness harmless to the system. Indeed, if needed, pseudoconnectedness could be detected by checking whether paths along which no time-out expires remain unchanged (i.e., are stable) or not, although we do not address this issue here.

    5.2 Correctness Proof

    We now show that the algorithm of Figs. 7, 8, and 9 implements $\Diamond P_{om}$ in the omission model.

    Observation 1. At every process p, the matrix $M_p$ is updated with its own connectivity information and with the matrices $M_q$ received in the heartbeat messages. The updated $M_p$ and its version number $Version_p$ are sent with p's next heartbeat message. The local delay in process p to send $M_p$ and $Version_p$ is bounded in the algorithm by the period of Task 1 of p, which is finite if p


    Fig. 7. $\Diamond P_{om}$ in the omission model: main algorithm.


    is eventually synchronous and has not crashed. This way, an indirect communication scheme is obtained.

    Lemma 1. $\forall p, q \in \Pi$, iff $q \rightarrow p$, then eventually and permanently $(M_p)^n[p][q] \geq 1$.

    Proof. If $q \rightarrow p$, then, by definition of $\rightarrow$, eventually and permanently there is a reliable and eventually timely path with no omission of any length from q to p. By Task 1, q periodically sends a heartbeat message including its updated $M_q$ and $Version_q[q]$ to the rest of the processes. Every process r receiving by Task 3 a new $Version_q[q]$ will update from $M_q$, in the procedure deliver_next_message, the row q of $M_r$ as well as $Version_r[q]$. Process r will update matrix $M_r$ too with its own connectivity information: by Task 3 of r, $M_r[r][q]$ is set to 1 every time $Buffer_r[q]$ becomes empty; by Task 2, $M_r[r][q]$ is set to 0 when the next expected message from q is not received timely by r. By Observation 1, $M_r[r][q]$ and $Version_r[r]$ are propagated to every process s if s is eventually synchronous and has not crashed. (Note that, as a particular case, r or s may be p.) When some message is delivered by Task 3 of p, by the procedure deliver_next_message, p will update $M_p$ and calculate $(M_p)^n$ if some value in $M_p$ has changed. By the definition of the relation $\rightarrow$, if $q \rightarrow p$ then $(M_p)^n[p][q]$ will be evaluated eventually and permanently to a positive value. Otherwise, if not $q \rightarrow p$, $(M_p)^n[p][q]$ will not be set to a positive value permanently because, by definition, in every possible path from q to p there will be two processes, r and s, such that not $r \rightarrow s$ (r and s could be q and p), and therefore $(M_p)^n[p][q] = 0$. □

    Lemma 2. $\forall p \in$ in-connected, $\forall q \in$ out-connected, eventually and permanently $q \in OutConnected_p$.

    Proof. By definition of out-connected process, there is some well-connected process r such that $q \rightarrow r$. By definition of in-connected process, there is some well-connected process s such that $s \rightarrow p$. By transitivity, for every well-connected process u, $q \rightarrow u$, and in particular $q \rightarrow s$. By Lemma 1, $(M_s)^n[s][q] \geq 1$. By the procedure deliver_next_message, s will update $M_s$ copying all the rows from at least the rest of the well-connected processes. As a consequence of that, since $|C| > \lfloor n/2 \rfloor$, column q of $(M_s)^n$ will include eventually and permanently more than $\lfloor n/2 \rfloor$ positive values. Again by Lemma 1 and the procedure deliver_next_message, column q of $(M_p)^n$ will have eventually and permanently a positive value for more than $\lfloor n/2 \rfloor$ processes. As a consequence, according to the procedure update_Connectivity, q will be permanently included in the set $OutConnected_p$. □

    Lemma 3. $\forall p \in$ in-connected, $\forall q \notin$ out-connected, q is not permanently in $OutConnected_p$.

    Proof. Since q is not out-connected, there does not exist a well-connected process r such that $q \rightarrow r$. By Lemma 1, $(M_p)^n[r][q] \geq 1$ only when q receives messages timely from r; however, since q is not out-connected, this will not occur permanently. As a consequence, since $|C| > \lfloor n/2 \rfloor$, the number of processes s such that $(M_p)^n[s][q] \geq 1$ will not be permanently greater than $\lfloor n/2 \rfloor$, and by the procedure update_Connectivity, q will not be permanently included in the set $OutConnected_p$. □

    Lemma 4. $\forall p \in \Pi$, iff $p \in$ in-connected, then eventually and permanently I_am_InConnected_p = TRUE.

    Proof. If $p \in$ in-connected, then $\forall r \in C : r \rightarrow p$. By Lemma 1, and following a reasoning similar to the proof of Lemma 2, now applied to row p of $(M_p)^n$, iff $p \in$ in-connected the procedure update_Connectivity will eventually and permanently set I_am_InConnected_p to TRUE (line 45). □

    Theorem 1. The algorithm of Figs. 7, 8, and 9 implements $\Diamond P_{om}$ in the omission model.


    Fig. 9. $\Diamond P_{om}$ in the omission model: procedure deliver_next_message().

    Fig. 8. $\Diamond P_{om}$ in the omission model: procedure update_Connectivity().


    Proof. The strong completeness, eventual strong accuracy, and in-connectivity properties of $\Diamond P_{om}$ are satisfied by Lemmas 3, 2, and 4, respectively. □

    6 INTEGRATING FAILURE DETECTION AND CONSENSUS SECURELY

    Most properties of SMC have their direct counterparts in properties provided by the consensus abstraction used in TrustedPals. One notable difference is the property of SMC-Privacy, which relies on subtle issues not relevant to fault-tolerant synchronization. As an example, it is possible to successfully attack TrustedPals if the attacker can distinguish different message types on the network. Recall that every smart card has to process messages from the consensus protocol and messages from the failure detector (see Fig. 4). Therefore, it is important to integrate the consensus and failure detector algorithms securely in TrustedPals.

    6.1 Types of Messages

    In TrustedPals there are two types of messages sent over the channel: messages by the consensus algorithm and failure detector messages. We call the former protocol messages and the latter heartbeats. Heartbeats are time critical, i.e., they should not be delayed by the transport layer, while protocol messages are asynchronous, i.e., eventual delivery is sufficient for them.

    The idea of TrustedPals is to use heartbeats as the transport mechanism for protocol messages, providing an implicit relaying mechanism to the consensus layer. Every heartbeat has a small fixed-size message field called the payload. Similar to network transport protocols (like IP), protocol messages are inserted into the payload of heartbeats when they are sent. If a protocol message is larger than the size of the payload, it is fragmented into smaller parts which are sent one after the other. Similar to the fragmentation mechanism in IP, unique identifiers and sequence numbers allow the fragments to be pieced together at the destination in the correct order. If there is no protocol message to be sent, the payload of a heartbeat can remain empty. In this way, TrustedPals generates fixed-size messages in fixed (periodic) time intervals.
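The fragmentation step can be sketched as follows. The payload size, the fragment tuple layout, and the function names `fragment` and `reassemble` are assumptions chosen for illustration; the paper does not fix a concrete format.

```python
PAYLOAD_SIZE = 8  # hypothetical payload size in bytes


def fragment(msg_id, data, size=PAYLOAD_SIZE):
    """Split data into (msg_id, seq, total, chunk) fragments of at most `size` bytes."""
    chunks = [data[i:i + size] for i in range(0, len(data), size)] or [b""]
    return [(msg_id, seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


def reassemble(fragments):
    """Rebuild the message once all fragments are present; return None otherwise."""
    if not fragments:
        return None
    total = fragments[0][2]
    by_seq = {seq: chunk for _, seq, _, chunk in fragments}
    if len(by_seq) != total:
        return None
    return b"".join(by_seq[i] for i in range(total))


frags = fragment(7, b"estimate=42;round=3")
restored = reassemble(list(reversed(frags)))
```

Because every fragment carries its sequence number and the total count, the destination can rebuild the message even if the fragments arrive in a different order.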

    Observe that the use of heartbeats as the transport mechanism for protocol messages has a negative impact on the performance of the consensus algorithm, since the consensus messages are delayed until messages of the failure detection layer are sent. This is the price to pay for hiding the communication pattern of the consensus algorithm from the adversary.

    6.2 Avoiding Message and Traffic Analysis

    Traffic analysis refers to an attack technique which tries to derive information about messages by simply analyzing the variation of the times at which they are sent and received. Since we send protocol messages within heartbeats, we increase the difficulty for the attacker to distinguish an empty heartbeat from a full heartbeat by only looking at the timing of traffic. Since heartbeats are sent in an all-to-all pattern, it is also hard to distinguish which process is sending protocol messages to which other process. This approach ensures unobservability regarding protocol messages, a notion known from the area of privacy-enhancing techniques [48], [49].

    Of course, the attacker could simply look into the contents of a heartbeat to discern a full from an empty message. That is why we employ cryptography on the channel. We implement a secure channel satisfying confidentiality, integrity, and authenticity using standard techniques from cryptography [51]. The idea is to use the assumed keying material within the security modules to encrypt all messages and to use message authentication codes or digital signatures to ensure the integrity and authenticity of the channel.

    Important from a message-analysis point of view is that all heartbeats are indistinguishable from each other. As mentioned above, all heartbeat messages have the same length, and if they are encrypted they will ideally look like random data. Hence, from just analyzing the contents of a heartbeat it is impossible to distinguish a full from an empty heartbeat. Note that information about source and destination of a heartbeat must be sent unencrypted to allow routing. However, this information must also be stored within the encrypted part of the message to ensure authenticity.
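The framing idea can be sketched with Python's standard library. The sketch covers only fixed-length padding and a message authentication code over a body that repeats the routing header; it performs no encryption, because the standard library offers no cipher, so it does not by itself hide the payload contents. The frame layout, the all-zero key, and the function names are assumptions made for illustration.

```python
import hashlib
import hmac
import struct

PAYLOAD_SIZE = 16     # hypothetical fixed payload size in bytes
KEY = b"\x00" * 32    # placeholder for keying material held by the security module


def build_heartbeat(src, dst, seq, payload=b""):
    """Return a fixed-length frame: clear header, padded body, MAC tag."""
    if len(payload) > PAYLOAD_SIZE:
        raise ValueError("payload must be fragmented first")
    body = struct.pack("!BBIB", src, dst, seq, len(payload))
    body += payload.ljust(PAYLOAD_SIZE, b"\x00")
    tag = hmac.new(KEY, body, hashlib.sha256).digest()
    return bytes([src, dst]) + body + tag


def open_heartbeat(frame):
    """Verify the tag and the header copy; return (seq, payload) or None."""
    header, body, tag = frame[:2], frame[2:-32], frame[-32:]
    expected = hmac.new(KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    src, dst, seq, length = struct.unpack("!BBIB", body[:7])
    if (src, dst) != (header[0], header[1]):
        return None  # routing header does not match the authenticated copy
    return seq, body[7:7 + length]


empty = build_heartbeat(1, 2, 5)
full = build_heartbeat(1, 2, 6, b"ACK")
```

Both frames have the same length, and a frame whose clear-text routing header was modified in transit is rejected because it no longer matches the authenticated copy inside the body. In a real deployment the body would additionally be encrypted.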

    7 CONCLUSIONS

    We have presented a modular redesign of TrustedPals, a smart card-based security framework capable of efficiently solving Secure Multiparty Computation. The framework is based on a two-part architecture. One part represents the untrusted system, consisting of untrusted, Byzantine hosts. In the other part, representing the trusted system, security modules reduce the security problem to a fault-tolerant consensus among smart cards.

    The modular redesign allows TrustedPals to face the consensus problem in the general-omission failure model, which is more benign than the Byzantine model. Additionally, the trusted system has to deal with attacks which cannot be filtered by smart cards, specifically timing attacks and buffering attacks, resulting in a system model that includes asynchronous processes and/or channels. Accordingly, we model classes of processes and types of communication among them, and give a novel definition of the eventually perfect failure detector class for the omission failure model. The new failure detector properties are based on process connectedness rather than on process correctness. We have proposed a failure detector algorithm which assumes a majority of well-connected processes, and a consensus algorithm using it. Interestingly, the consensus algorithm is an adaptation of the classical $\Diamond S$-based Chandra-Toueg algorithm for the crash model.

    Another relevant aspect of the redesign is the integration of failure detection and consensus into the TrustedPals framework. Since the failure detector follows a heartbeat-based, all-to-all communication pattern, TrustedPals uses heartbeats as the transport mechanism for consensus messages. This approach ensures unobservability. Conceptually, the system is reasonably secure against almost all practical attacks.

    Our algorithms can be improved with respect to efficiency. In particular, our implementation of $\Diamond P_{om}$ can be modified into a weaker failure detector by omitting the all-to-all message exchange pattern, saving a substantial amount of messages. However, the decision to choose $\Diamond P_{om}$ was


    deliberate since the integration into TrustedPals makes all-to-all communication necessary anyway to protect against side channel analysis that could endanger security. Therefore, any such efficiency improvement would be futile in practical systems. Nevertheless, we consider that determining the weakest failure detector for solving consensus in this system is an interesting open question which deserves further research. Also, the space requirements of the failure detector messages (mainly the matrix of bits) can be compressed substantially by special encoding techniques in practice.

    As future work, we intend to implement this approach and perform practical experiments with the system. Within the failure detector implementation, the size of the payload field will be an interesting parameter to choose. It is necessary to find an acceptable tradeoff between security and performance, since the choice of message size improves one at the expense of the other.

    On the theoretical side, it would be interesting to study the minimal storage and communication effort necessary to solve consensus in our model, since we use unbounded buffers in our implementation and the bit complexity of the messages we use is also rather high.

    ACKNOWLEDGMENTS

    Work by the Spanish authors was supported by the Spanish Research Council, under grant HA2005-0078. Research partially supported by the Spanish Research Council, under grant TIN2010-17170, and the Basque Government, under grant IT395-10. Work by the German authors was supported by DAAD PPP Programme Acciones Integradas Hispano Alemanas. A preliminary version of this paper appeared in the proceedings of the 9th International Symposium on Safety, Security and Stabilization (SSS 2007), Paris, France, 2007.

    REFERENCES

    [1] A.C.-C. Yao, "Protocols for Secure Computations (Extended Abstract)," Proc. IEEE 23rd Symp. Foundations of Computer Science (FOCS), pp. 160-164, 1982.

    [2] M. Fort, F.C. Freiling, L.D. Penso, Z. Benenson, and D. Kesdogan, "Trustedpals: Secure Multiparty Computation Implemented with Smart Cards," Proc. 11th European Symp. Research in Computer Security (ESORICS), pp. 34-48, 2006.

    [3] Z. Chen, Java Card Technology for Smart Cards: Architecture and Programmer's Guide. Addison-Wesley Longman Publishing Co., Inc., 2000.

    [4] N. Leavitt, "Will Proposed Standard Make Mobile Phones More Secure?," Computer, vol. 38, no. 12, pp. 20-22, Dec. 2005.

    [5] Certgate GmbH, "Certgate Smart Card," http://www.certgate.com/web_en/products/smartcardmmc.html, 2008.

    [6] N.A. Lynch, Distributed Algorithms. Morgan Kaufmann Publishers, Inc., 1996.

    [7] T.D. Chandra and S. Toueg, "Unreliable Failure Detectors for Reliable Distributed Systems," J. ACM, vol. 43, no. 2, pp. 225-267, 1996.

    [8] F.C. Freiling, R. Guerraoui, and P. Kuznetsov, "The Failure Detector Abstraction," ACM Computing Surveys, vol. 43, no. 2, pp. 1-40, 2011.

    [9] D. Malkhi and M.K. Reiter, "Unreliable Intrusion Detection in Distributed Computations," Proc. 10th Computer Security Foundations Workshop (CSFW), pp. 116-125, 1997.

    [10] K.P. Kihlstrom, L.E. Moser, and P.M. Melliar-Smith, "Byzantine Fault Detectors for Solving Consensus," Computing J., vol. 46, no. 1, pp. 16-35, 2003.

    [11] A. Doudou, B. Garbinato, R. Guerraoui, and A. Schiper, "Muteness Failure Detectors: Specification and Implementation," Proc. Third European Dependable Computing Conf. Dependable Computing (EDCC), pp. 71-87, 1999.

    [12] A. Doudou, B. Garbinato, and R. Guerraoui, "Encapsulating Failure Detection: From Crash to Byzantine Failures," Proc. Seventh Int'l Conf. Reliable Software Technologies, pp. 24-50, 2002.

    [13] A. Haeberlen, P. Kouznetsov, and P. Druschel, "The Case for Byzantine Fault Detection," Proc. Second Workshop Hot Topics in System Dependability (HotDep), 2006.

    [14] A. Haeberlen, P. Kouznetsov, and P. Druschel, "Peerreview: Practical Accountability for Distributed Systems," Proc. 21st ACM SIGOPS Symp. Operating Systems Principles (SOSP), T.C. Bressoud and M.F. Kaashoek, eds., pp. 175-188, 2007.

    [15] A. Haeberlen and P. Kuznetsov, "The Fault Detection Problem," Proc. 13th Int'l Conf. Principles of Distributed Systems (OPODIS), pp. 99-114, 2009.

    [16] M. Castro and B. Liskov, "Practical Byzantine Fault Tolerance and Proactive Recovery," ACM Trans. Computing Systems, vol. 20, no. 4, pp. 398-461, 2002.

    [17] A. Clement, E.L. Wong, L. Alvisi, M. Dahlin, and M. Marchetti, "Making Byzantine Fault Tolerant Systems Tolerate Byzantine Faults," Proc. Sixth USENIX Symp. Networked Systems Design and Implementation (NSDI), pp. 153-168, 2009.

    [18] H. Moniz, N.F. Neves, and M. Correia, "Turquois: Byzantine Consensus in Wireless Ad Hoc Networks," Proc. IEEE/IFIP Int'l Conf. Dependable Systems and Networks (DSN), pp. 537-546, 2010.

    [19] G. Bracha and S. Toueg, "Asynchronous Consensus and Broadcast Protocols," J. ACM, vol. 32, no. 4, pp. 824-840, 1985.

    [20] M. Herlihy and J.D. Tygar, "How to Make Replicated Data Secure," Proc. Conf. Theory and Applications of Cryptographic Techniques on Advances in Cryptology (CRYPTO), pp. 379-391, 1987.

    [21] C. Delporte-Gallet, H. Fauconnier, and F.C. Freiling, Revisiting Failure Detection and Consensus in Omission Failure Environments, Proc. Int'l Colloquium Theoretical Aspects of Computing (ICTAC), pp. 394-408, 2005.

    [22] C. Delporte-Gallet, H. Fauconnier, A. Tielmann, F.C. Freiling, and M. Kilic, Message-Efficient Omission-Tolerant Consensus with Limited Synchrony, Proc. IEEE Int'l Symp. Parallel and Distributed Processing (IPDPS), pp. 1-8, 2009.

    [23] D. Dolev, R. Friedman, I. Keidar, and D. Malkhi, Failure Detectors in Omission Failure Environments, Technical Report TR96-1608, Cornell Univ., Computer Science Dept., 1996.

    [24] D. Dolev, R. Friedman, I. Keidar, and D. Malkhi, Failure Detectors in Omission Failure Environments, Proc. 16th Ann. ACM Symp. Principles of Distributed Computing (PODC), p. 286, 1997.

    [25] Ö. Babaoglu, R. Davoli, and A. Montresor, Group Communication in Partitionable Systems: Specification and Algorithms, IEEE Trans. Software Eng., vol. 27, no. 4, pp. 308-336, Apr. 2001.

    [26] N. Santoro and P. Widmayer, Agreement in Synchronous Networks with Ubiquitous Faults, Theoretical Computer Science, vol. 384, nos. 2/3, pp. 232-249, 2007.

    [27] H. Moniz, N.F. Neves, M. Correia, and P. Veríssimo, Randomization Can Be a Healer: Consensus with Dynamic Omission Failures, Proc. Int'l Symp. Distributed Computing (DISC), pp. 63-77, 2009.

    [28] P.D. MacKenzie, A. Oprea, and M.K. Reiter, Automatic Generation of Two-Party Computations, Proc. ACM Conf. Computer and Comm. Security, pp. 210-219, 2003.

    [29] D. Malkhi, N. Nisan, B. Pinkas, and Y. Sella, Fairplay - Secure Two-Party Computation System, Proc. USENIX Security Symp., pp. 287-302, 2004.

    [30] Y. Lindell and B. Pinkas, An Efficient Protocol for Secure Two-Party Computation in the Presence of Malicious Adversaries, Proc. 26th Ann. Int'l Conf. Advances in Cryptology (EUROCRYPT), pp. 52-78, 2007.

    [31] V. Kolesnikov, Gate Evaluation Secret Sharing and Secure One-Round Two-Party Computation, Proc. 11th Int'l Conf. Theory and Application of Cryptology and Information Security (ASIACRYPT), pp. 136-155, 2005.

    [32] L. Kruger, S. Jha, E.-J. Goh, and D. Boneh, Secure Function Evaluation with Ordered Binary Decision Diagrams, Proc. ACM Conf. Computer and Comm. Security, pp. 410-420, 2006.

    [33] M. Burkhart, M. Strasser, D. Many, and X.A. Dimitropoulos, Sepia: Privacy-Preserving Aggregation of Multi-Domain Network Events and Statistics, Proc. USENIX Security Symp., pp. 223-240, 2010.

    624 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 9, NO. 4, JULY/AUGUST 2012
