Post on 22-Jan-2016
On the Anonymity of Anonymity Systems
Andrei Serjantov
schnur@gmail.com
(anonymous)
Outline
• Anonymity informally
– Anonymity Properties
• Anonymity of Existing Implementations
– Analysis
• Probability, Entropy
• Attacks
– Low Latency
– Intersection
• Conclusion
What is Anonymity?
Actually, we assume humans are tied to computers and anonymize those
Anonymity does not hide the presence of the individuals/computers, just their identity
Anonymity System
This guy does not even know he is on the internet(!)
Anonymity Properties I: Receiver Untraceability
Senders are observable, i.e. the attacker knows that A sent a message to someone
Receivers are not observable, i.e. the attacker does not know if B received a message
[Figure: sender A and receiver B]
Anonymity Properties II: Sender Untraceability
Senders unobservable….
[Figure: sender A and receiver B]
Anonymity Properties III: Unlinkability
Senders and Receivers are observable, but it is not clear who is talking to whom
[Figure: A and B]
Anonymous from Whom? (threat model)
• The observer:
– Can compromise (almost) everything but two users of the system
– Observes and modifies all network traffic
– Observes all network traffic (Global Passive Adversary)
– Observes some network traffic
– Is the service the user is accessing
Properties
• A mix cascade guarantees that a global active attacker cannot distinguish two honest users who send one message each between time t and t’.
– e.g. mixing votes
• DC-net
– (both sender and receiver anonymity)
• Can be expressed formally
Anonymity of Existing Implementations
Mixes
Mix Systems
Timed Mix
Mix System
[Figure: a Sender passes the message "M, 0101011", addressed to R, through mixes A and B to the Receiver. Legend: R - Receiver, A - Mix, B - Mix]
Doing Things Anonymously
• Can provide guarantees for those who wish to send one message < 32K, and suffer the consequences of it not reaching the receiver
• Real life is not like that
– Anonymous email (Mixmaster, Mixminion)
• Send and receive anonymous emails
– Web Browsing (JAP, TOR, Tarzan, Morphmix)
• Wide file size distribution
• Low latency
Anonymity Analysis of Existing Systems
• Define a system, and an adversary
• Take inputs into the system
– e.g. web request message stream
– Email interaction
• Compute observation
Hence figure out how vulnerable the anonymity of a certain activity is to a particular adversary.
Inputs, Model, Observation
[Figure: three panels, each showing Sender 1, Sender 2, Sender 3, mixes M1 and M2, and receivers R1, R2, R3]
System:
Inputs:
Attacker: Global Passive Adversary
Observation:
(transition semantics model of the mixes)
Mix Network
[Figure: network with nodes A, B, C, D, Q and R]
Traditionally {A,B,C,D}
Timed Mix
[Figure: timed mix with inputs A, B, C, D]
{A,B,C,D}
Mix Network
[Figure: network with nodes A, B, C, D, Q and R]
The message arriving at R is much more likely to be from D than from A
Traditionally {A,B,C,D}
Pool Mix
[Figure: pool mix; labels N+M, N, N, M]
• M messages stay in the mix at each round
• Messages to be sent are picked from both the N and the M
• A message might stay in the mix for a very long time (but the probability of this happening is very small)
• The anonymity set of a message leaving at round i includes the senders who sent messages processed during previous rounds
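To make the "very long time, very small probability" point concrete, here is a minimal sketch of how long a message stays in a pool mix. It assumes that in every round the mix holds N+M messages and picks the N outgoing ones uniformly at random, so each message stays with probability M/(N+M). The function names and the example values (N=100, M=10) are illustrative assumptions, not part of the slides.

```python
def rounds_in_mix_distribution(n_new, pool, max_rounds):
    """P(message leaves after exactly k rounds) for k = 1..max_rounds.

    Assumes each message independently stays with probability
    pool / (n_new + pool) in every round (uniform selection).
    """
    stay = pool / (n_new + pool)
    leave = 1.0 - stay
    return [leave * stay ** (k - 1) for k in range(1, max_rounds + 1)]


def expected_rounds(n_new, pool):
    """Mean of the geometric distribution above: 1 / P(leave)."""
    return (n_new + pool) / n_new


# Illustrative values only.
dist = rounds_in_mix_distribution(n_new=100, pool=10, max_rounds=5)
for k, p in enumerate(dist, start=1):
    print(f"leaves after {k} round(s): {p:.6f}")
print("expected rounds:", expected_rounds(100, 10))
```

The probabilities fall off geometrically, which is why long delays are possible but rare under this assumption.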
Adding Probabilities
• Let us add the probability of that event having occurred to each event
• Call this Anonymity Probability Distribution
• So {A,B,C,D} could become:
– {(A,¼), (B, ¼),(C, ¼),(D, ¼)}
– Or, {(A,0.5), (B,0.1),(C,0.1),(D,0.3)}
• The probability distribution you come up with will depend on your observation, (+ knowledge, computational power…)
Entropy
• Ok, what can we do with the probability distribution afterwards?
• From information theory, entropy is the information content of a probability distribution
• Can use this for:
– Measuring anonymity
– Expressing new attacks (ones which do not modify the set, but modify the distribution)
– Comparing effectiveness of attacks
$H = -\sum_i p_i \log(p_i)$
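As a small worked example, the sketch below computes the Shannon entropy of the two anonymity probability distributions given on the "Adding Probabilities" slide. Measuring in bits (log base 2) and the function name `entropy_bits` are assumptions made for illustration; the slides do not fix a base.

```python
import math


def entropy_bits(dist):
    """Shannon entropy, in bits, of a {sender: probability} mapping."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


uniform = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
skewed = {"A": 0.5, "B": 0.1, "C": 0.1, "D": 0.3}

print("uniform:", entropy_bits(uniform))
print("skewed: ", entropy_bits(skewed))
```

Both distributions have the same anonymity set {A,B,C,D}, but the skewed one has lower entropy, which is the sense in which an attack can reduce anonymity without shrinking the set.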
Pool Mix Revisited
• Could not previously compare a pool mix with other mixes
• Now we can!
• Compute the entropy of the geometric distribution
• Pool mix with 100 inputs and 10 “feedbacks” is equivalent to a standard mix with 140 inputs(!!!)
• But, average delay of a message going through a pool mix is greater
• In the above example, 9% chance “of staying for another round”
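The comparison above can be checked with a rough calculation. The sketch below assumes a steady-state pool mix in which every round brings 100 new messages from 100 distinct senders, 10 messages are fed back, and outgoing messages are chosen uniformly; the early rounds, where the pool is not yet full of real messages, are ignored. The function name and the truncation at 200 rounds are assumptions for illustration.

```python
import math


def pool_mix_entropy_bits(n_new, pool, rounds=200):
    """Entropy (bits) of the sender distribution for one outgoing message.

    A message leaving now came from one particular sender of the round
    x rounds ago with probability stay**x / (n_new + pool), where
    stay = pool / (n_new + pool). The sum is truncated after `rounds`
    terms; the remaining tail is negligible for these parameters.
    """
    stay = pool / (n_new + pool)
    h = 0.0
    for x in range(rounds):
        p = stay ** x / (n_new + pool)   # one sender of round (now - x)
        h -= n_new * p * math.log2(p)    # n_new such senders per round
    return h


h = pool_mix_entropy_bits(100, 10)
effective_size = 2 ** h                  # size of a uniform set with equal entropy
print(f"entropy: {h:.3f} bits, equivalent standard mix: {effective_size:.1f} inputs")
print(f"chance of staying another round: {10 / 110:.1%}")
```

Under these assumptions the equivalent standard mix comes out at roughly 140 inputs, in line with the figure on the slide.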
Mix Networks
• Can also compute the anonymity probability distribution in mix networks
• Model and details in [Ser04]
{(A,0.125), (B,0.125), (C,0.25), (D,0.5)}
[Figure: network with nodes A, B, C, D, Q and R]
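One way to see how a distribution like the one above can arise is to push probabilities through a chain of mixes. The sketch below is a deliberate simplification, not the model from [Ser04]: it assumes each mix receives one message per input link and that the message it forwards is equally likely to be any of them. The topology (`mix1`, `mix2`, `mix3`) is hypothetical, chosen only because it reproduces the numbers on the slide; the actual figure may differ.

```python
from fractions import Fraction


def sender_distribution(node, inputs):
    """Distribution over original senders of the message leaving `node`.

    `inputs` maps each mix to the list of nodes feeding it; any node
    that is not a key of `inputs` is treated as an original sender.
    """
    if node not in inputs:
        return {node: Fraction(1)}
    dist = {}
    share = Fraction(1, len(inputs[node]))   # uniform choice among inputs
    for parent in inputs[node]:
        for sender, p in sender_distribution(parent, inputs).items():
            dist[sender] = dist.get(sender, Fraction(0)) + share * p
    return dist


# Hypothetical topology: A,B -> mix1; mix1,C -> mix2; mix2,D -> mix3 -> R
inputs = {
    "mix1": ["A", "B"],
    "mix2": ["mix1", "C"],
    "mix3": ["mix2", "D"],
}
dist = sender_distribution("mix3", inputs)
for sender in sorted(dist):
    print(sender, float(dist[sender]))
```

Senders whose messages pass through more mixes end up with smaller probabilities, which is why the message arriving at R is more likely to be from D than from A.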
Impact of Low Latency and Repeated Communication
-Packet Counting
-Intersection
Connection-based Anonymity Systems
• A number of nodes
– Nodes do not mix, but do onion encryption
• Packets are forwarded along links
• All packets of a connection are forwarded via the same sequence of nodes
[Figure: two panels, “Classical” Network and P2P anonymity system]
The Packet Counting Attack I
• Connection-based Anonymity Systems split the data up into many fairly small packets <1K
• All packets of an anonymous connection travel down the same path
• Thus, counting the packets may reveal which connections go where
• Merely coarse-grained packet counting required
Packet Counting II
• Observe the mix for time t and count packets on each link
• Correlate incoming and outgoing links
– 1075 and 1076
[Figure: a mix with per-link packet counts 3056, 2748, 1353, 1076, 1075, 1804, 2497, 2850]
• Ok if:
– d (mix delay) << t
– t is much smaller than the interval between new connections starting
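The correlation step can be sketched as follows. The packet counts are the ones shown on the slide, but which counts belong to incoming links and which to outgoing links is not recoverable from the transcript, so the split below, the link names, and the tolerance are all assumptions for illustration.

```python
def correlate_links(incoming, outgoing, tolerance):
    """Pair each incoming link with the single outgoing link whose packet
    count is within `tolerance` of it; ambiguous or unmatched links are
    skipped."""
    matches = []
    for in_link, in_count in incoming.items():
        candidates = [
            out_link
            for out_link, out_count in outgoing.items()
            if abs(out_count - in_count) <= tolerance
        ]
        if len(candidates) == 1:
            matches.append((in_link, candidates[0]))
    return matches


# Hypothetical assignment of the slide's counts to links.
incoming = {"in1": 3056, "in2": 1353, "in3": 1076, "in4": 2497}
outgoing = {"out1": 2748, "out2": 1075, "out3": 1804, "out4": 2850}

matches = correlate_links(incoming, outgoing, tolerance=5)
print(matches)
```

Only the links carrying 1076 and 1075 packets pair up here; the other links carry several connections at once, so their counts do not line up.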
Packet Counting – Key Observation
• Packet counting works if the whole connection is lone
– i.e. if it is the only connection on all the links (from the client to the server) it passes through
[Figure annotation: This case may be attackable; we consider it not to be]
Packet Counting – Results
• Hence, we need 2 or more connections on as many links as possible
• In our paper (ESORICS 2003) we define this formally
• Then simulate, showing that
– E.g. 100 nodes, 100 connections via 2-4 nodes: 92% of connections are lone (p2p scenario)
– E.g. 20 nodes, 200 connections via 2-4 nodes: 2.5% of connections lone (classic network)
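A simulation of this kind might look like the sketch below. It uses a simplified notion of "lone" (no other connection shares any directed node-to-node link of the route) and a uniform random route model; both are assumptions, not the formal definition or the simulation setup from the ESORICS 2003 paper, so the printed fractions are not expected to match the slide's numbers.

```python
import random
from collections import Counter


def links_of(route):
    """Directed links (consecutive node pairs) along a route."""
    return list(zip(route, route[1:]))


def lone_fraction(routes):
    """Fraction of connections that share none of their links."""
    usage = Counter(link for route in routes for link in set(links_of(route)))
    lone = sum(
        all(usage[link] == 1 for link in links_of(route)) for route in routes
    )
    return lone / len(routes)


def random_routes(n_nodes, n_connections, rng, min_len=2, max_len=4):
    """Each connection passes through 2-4 distinct, uniformly chosen nodes."""
    return [
        rng.sample(range(n_nodes), rng.randint(min_len, max_len))
        for _ in range(n_connections)
    ]


rng = random.Random(0)  # fixed seed so runs are repeatable
print("100 nodes, 100 connections:", lone_fraction(random_routes(100, 100, rng)))
print("20 nodes, 200 connections: ", lone_fraction(random_routes(20, 200, rng)))
```

Spreading few connections over many nodes should leave most of them lone, while concentrating many connections on few nodes should leave few lone, which is the qualitative contrast the slide draws.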
Repeated Communication
[Figure: Alice and the Steves send through a threshold mix (threshold B+1; labels B, M, N, "To M", "To N"). Caption: As seen by the attacker]
The Model
[Figure: Simplification introduced by the model; label: Alice]
The Results (1000 rounds, B=10)
[Figure: P(Estimate), the estimate of probability of Alice sending to r, plotted against receivers r]
The Results
The Results
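The slides do not spell out the estimator, so the following is only a sketch of one plausible attack of this kind. It assumes a threshold mix that flushes B+1 messages per round, that Alice sends exactly one message per round to one of her regular receivers, that the other B senders pick receivers uniformly at random, and that the attacker knows this background distribution. The receiver count, Alice's receivers, and all names are hypothetical.

```python
import random
from collections import Counter


def estimate_alice(rounds, batch_others, n_receivers, alice_receivers, rng):
    """Estimate P(Alice sends to r) for every receiver r.

    Per round the attacker sees batch_others + 1 deliveries. The observed
    receiver frequencies are a mixture of Alice's behaviour and the uniform
    background, so the known background share is subtracted out.
    """
    counts = Counter()
    for _ in range(rounds):
        counts[rng.choice(alice_receivers)] += 1      # Alice's message
        for _ in range(batch_others):                 # the B other senders
            counts[rng.randrange(n_receivers)] += 1
    total = rounds * (batch_others + 1)
    background = 1.0 / n_receivers
    return [
        (batch_others + 1) * counts[r] / total - batch_others * background
        for r in range(n_receivers)
    ]


# 1000 rounds and B=10 as on the slide; the rest is illustrative.
estimate = estimate_alice(1000, 10, 50, [3, 17], random.Random(0))
ranked = sorted(range(50), key=lambda r: estimate[r], reverse=True)
print("receivers ranked by estimate:", ranked[:5])
```

Individual estimates can be slightly negative because of sampling noise; over enough rounds, Alice's regular receivers would be expected to rise to the top of the ranking.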
Conclusions
• Anonymity is a security property
– not just privacy
• Analysis of anonymity properties important
– Has been a neglected area
– Uses tools from other fields (graph theory, probability)
• Plenty of applications
– Identity management
– Electronic voting
– Anonymous email (whistle blowing)