Exploring Decentralization Dimensions of Social Networking Services: Adversaries and Availability

Thomas Paul* Benjamin Greschbach† Sonja Buchegger† Thorsten Strufe*

*TU Darmstadt & CASED, Darmstadt, Germany
†KTH Royal Institute of Technology, School of Computer Science and Communications, Stockholm, Sweden

{thomas.paul, strufe}@cs.tu-darmstadt.de {bgre, buc}@csc.kth.se

ABSTRACT

Current online Social Networking Services (SNS) are organized around a single provider and while storage and functionality can be distributed, the control over the service belongs to one central entity. This structure raises privacy concerns over the handling of large-scale and at least logically centralized collections of user data. In an effort to protect user privacy and decrease provider dependence, decentralization has been proposed for SNS. This decentralization has effects on availability, opportunities for traffic analysis, resource requirements, cooperation and incentives, trust and accountability for different entities, and performance.

In this paper, we explore the spectrum of SNS implementations from centralized to fully decentralized and several hybrid constellations in between. Taking a systematic approach of SNS layers, decentralization classes, and replication strategies, we investigate the design space and focus on two issues as concrete examples where the contrast of extreme ends of the decentralization spectrum is illustrative, namely potential adversaries and churn-related profile availability. In general, our research indicates that hybrid approaches deserve more attention as both centralized as well as entirely decentralized systems suffer from severe drawbacks.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. HotSocial ’12, August 12, 2012, Beijing, China. Copyright 2012 ACM ISBN 978-1-4503-1549-4 ...$5.00.

1. INTRODUCTION

Decentralizing Social Networking Services (SNS) has frequently been proposed in recent publications to overcome shortcomings of centralized systems. Removing the central provider has various advantages with respect to individual control, general service availability, privacy, attack resilience, and a decreased impact of misconfiguration and individual errors. Eliminating the single point of failure, or central bottleneck, hence is a valid motivation for the decentralization of service providers. Decentralization may increase the scalability and availability of services, helping to prevent service breakdowns or, e. g., politically motivated service shutdowns.

The desire to protect the confidentiality of the vast amount of personally identifiable information that is stored in online social networks like Facebook, Twitter, and Google+ has been the primary driving force behind the numerous attempts to decentralize Social Networking Services. Centralized services act as gatekeepers that try to control the access of third parties to the personal information of a user. While denying it to unauthorized users, the providers can grant access to their affiliates, as well as to those users who have been authorized, and for this purpose they have full access to and full control over all data that is published within the system. Several cryptographic schemes [8, 2, 17, 9] have been proposed to overcome the omniscience of the provider. These approaches, however, cannot achieve anonymous participation, confidentiality of communication acts, or anonymization of communication partners, let alone plausible deniability of participation, since they still rely on a centralized data store.

Several approaches to decentralize the Social Networking Service, or to integrate different SNSs, have been proposed, and some have been implemented. Getsharekit¹ and sociallib² are examples of general APIs that help integrate a few large, existing providers. Shared information hence can be spread over several services, thus making it available to the designated receivers while keeping it partially hidden from overly curious providers. Diaspora*³ and friend-of-a-friend [18] are closely related in that they propose to break up the provider into a few integrated, dedicated servers, using DNS for addressing. The entire profile of a user in this case is hosted on a single, remote provider. The providers, however, are decentralized and hence capable of accessing only the subset of profiles their registered users trusted them with. LifeSocial [6], LotusNet [1], PeerSoN [3], and SafeBook [5] are prominent, fully implemented or at least prototyped examples of systems that provide the service in an entirely decentralized, peer-to-peer (P2P) fashion.

¹ http://www.getsharekit.com
² http://code.google.com/p/sociallib/
³ http://www.diasporaproject.org

Decentralizing the SNS may entail several side effects that have not been satisfactorily addressed by the proposed approaches so far. First, removing dedicated resources from the system, which then is characterized by high churn and very low reliability of the P2P-based service providers, makes guaranteeing availability of profiles very challenging. Second, applying redundancy and replication, which are the primary solutions to such lack of reliability, imposes high costs in terms of storage and communication overheads. Finally, centralized servers act as de facto anonymization mixes with respect to the data transmitted between the participants. Forwarding all messages over a series of decentralized devices, instead, simplifies the analysis of service requests and responses for the purpose of endpoint correlation, identification of individuals, or participation disclosure to third parties. The provider of centralized services has access to this information; decentralization removes this threat but can strengthen other adversaries.

The contributions of this paper are the following:

• We give an overview of possible classes of decentralization.

• Assuming the decision to decentralize the service provision, we analyze new vulnerabilities and attack surfaces that evolve due to the service decentralization.

• We discuss the implications of the degree of decentralization on profile data availability.

The rest of the paper is divided into the six sections of (2) a brief introduction of definitions and the system layer model, (3) the description of different classes of decentralization, (4) adversary models, (5) a discussion of the consequences to security, (6) a discussion on availability and replica placement strategies, and (7) a conclusion and outlook to future work.

2. LAYERED SYSTEM MODEL

We define a Social Networking Service (SNS) as an Internet-based service that allows users to maintain profiles as their digital representations and explicitly declare connections between these profiles. A connection between two profiles can represent friendship, trust, or other kinds of relations between the subjects.

Figure 1: System model with three conceptual layers of a Social Networking Service.

To analyze the attack entry points, and based on [4], we use a three-layer model to describe the Social Networking Service, as depicted in Fig. 1: the communication and transport (CT) layer, the application service (AS) layer, and the social networking (SN) layer. The underlying CT layer is responsible for routing messages between network nodes and will usually comprise physical networks such as the Internet and mobile phone networks. The implementation of the SNS on top of a physical network forms the AS layer, where entities represent distributed applications, used and provided by the involved parties, such as clients representing members of the network, as well as primary and fallback servers of the SNS providers, their delegates, and their affiliates including third party application providers. The SN layer represents the network users and their real-world relations, such as friendships, trust relations and communication.

3. CLASSES OF DECENTRALIZATION

We distinguish three properties of possible architectural approaches in order to systematically analyze different proposed systems.

The (1) decentralization reflects the degree of decentralization of the network nodes. It can be (a) completely centralized, consist of (b) distributed servers, i. e., comprise several distributed centers, or be a (c) peer-to-peer approach, i. e., completely decentralized. The location of (2) data integration distinguishes between (a) local data integration, where isolated datasets are integrated locally at the client side (e. g., usage of several services) and (b) remote data integration, where, e. g., all user data is integrated remotely at the server side. Finally, (3) communication paths describe the difference between (a) direct communication, having the possibility of direct connections between members of the network (potentially after a prior lookup phase), and (b) relayed communication only, where all communication is recursively routed through servers or overlay nodes.

Table 1 shows the resulting classes of decentralization, when considering all meaningful combinations of the three properties. The location of data integration is only relevant for decentralized servers. For centralized systems it is necessarily located at the server side, for P2P systems usually at the client side. While we list a wide range of classes of decentralization, not all of these classes have been used in current proposals for decentralized SNS designs.
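As a rough illustration of how the three properties combine into the classes of Table 1, the following Python sketch encodes each property as an enumeration and looks up the resulting class. The type and function names (`Design`, `classify`) and the dictionary encoding are our own assumptions for illustration and are not part of any of the surveyed systems; only the class labels and section numbers come from the text.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Decentralization(Enum):
    CENTRALIZED = "centralized"
    DECENTRALIZED_SERVERS = "decentralized servers"
    PEER_TO_PEER = "peer-to-peer"


class Integration(Enum):
    REMOTE = "remote"  # data is integrated at the server side
    LOCAL = "local"    # data is integrated at the client side


class Communication(Enum):
    RELAYED = "relayed"  # all traffic passes through servers or overlay nodes
    DIRECT = "direct"    # direct connections between members are possible


class Design(NamedTuple):
    decentralization: Decentralization
    integration: Integration
    communication: Communication


D, I, C = Decentralization, Integration, Communication

# Hypothetical encoding of Table 1; combinations that are not listed
# are treated as "not meaningful" and map to None.
CLASSES = {
    Design(D.CENTRALIZED, I.REMOTE, C.RELAYED): "3.1 Centralized",
    Design(D.CENTRALIZED, I.REMOTE, C.DIRECT): "3.2 Peer-assisted Centralized",
    Design(D.DECENTRALIZED_SERVERS, I.REMOTE, C.RELAYED): "3.3 Decentralized Servers",
    Design(D.DECENTRALIZED_SERVERS, I.REMOTE, C.DIRECT): "3.3 Decentralized Servers (peer-assisted)",
    Design(D.DECENTRALIZED_SERVERS, I.LOCAL, C.RELAYED): "3.4 Common Interface Decentralized Services",
    Design(D.DECENTRALIZED_SERVERS, I.LOCAL, C.DIRECT): "3.4 Common Interface Decentralized Services (peer-assisted)",
    Design(D.PEER_TO_PEER, I.LOCAL, C.RELAYED): "3.5 Pure P2P and Recursive Routing",
    Design(D.PEER_TO_PEER, I.LOCAL, C.DIRECT): "3.6 Pure P2P and Iterative Routing",
}


def classify(design: Design) -> Optional[str]:
    """Return the class label for a design, or None if the combination is not listed."""
    return CLASSES.get(design)
```

For example, `classify(Design(D.PEER_TO_PEER, I.LOCAL, C.DIRECT))` yields the label of Section 3.6, while a centralized design with local integration is not listed and yields `None`.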

3.1 Centralized

A SNS that is based on an (at least logically) centralized network structure is the common case of today's popular applications, such as Facebook, Google+ or others⁴. All traffic is either relayed or at least mediated by the central provider. This introduces a limited mix functionality: an adversary sniffing on the CT layer is unable to correlate communication endpoints, given higher layer confidentiality. Content is stored centrally and the provider is responsible for enforcing user-defined access control policies. All system internal entities are located in one administrative domain, but data access interfaces are usually provided for affiliates, such as third party application servers.

3.2 Peer Assisted Centralized

Peer-assisted centralized SNSs are a combination of a centralized network and a flat P2P system. One logically centralized server is used for registration, identity management and other tasks that can profit from a centralized design. Content distribution, synchronous interactions and other suitable operations, however, can leverage direct connections between the peers, having the central server as a fallback solution. While this class so far has not been proposed for SNS designs, it represents distributed approaches like, e. g., Skype, or other common audio/video conference systems.

3.3 Decentralized Servers

A system of decentralized servers aims to avoid the comprehensive data aggregation in the domain of one SNS provider. Hence it consists of multiple servers, each responsible for one or several clients. The servers host the profile data of assigned clients. Moreover, they act as message storage for incoming messages that cannot be delivered to temporarily unavailable clients. Representing a proxy of their users, the servers are required to be online and available. Diaspora*⁵ is one example in this class, where several users can be hosted on each of the servers. In Vis-a-Vis [13], each server typically hosts a single profile.

Peer-assistance can be used for these approaches to mitigate the server load. Clients are allowed to exchange data directly when both communication partners are online simultaneously.

⁴ e. g., linkedin.com, xing.com, myspace.com, netlog.com

3.4 Common Interface Decentralized Services

This approach consists of several different SNS, where a user maintains one identity in each service. The connection between the SNS is a common user interface for managing the integrated services. One example in this class is onesocialweb⁶. Here again, a peer-assisted version is conceivable, leveraging direct connections between users. These might, however, not cross SNS borders, since maintaining multiple identities would otherwise not make sense any more.

3.5 Pure P2P and Recursive Routing

In P2P systems, user data is usually hosted by the members themselves, while content replication is used to increase availability. Access control is typically realized by encrypting the content, allowing only the intended recipients to decipher it. While in some cases the overlay network serves only as a pragmatic routing substrate, e. g., to implement a DHT, other approaches leverage trust relationships of users to form network edges. Trusted paths for packet transport in combination with source address rewriting at each step [4] provide the possibility to hide the identities in the communication and transport layer, comparable to darknets.

In flat P2P systems, every node represents one subject that is a member of the network. There are no servers and no explicit distinctions between nodes, although high-degree nodes can become more attractive attack targets. SafeBook [5] is one example in this class. Explicitly hierarchical P2P systems, such as SuperNova [14], distinguish between plain clients and dedicated supernodes. The latter are used for bootstrapping new members, managing directories or other tasks that rely on certain properties, such as high availability, performance, or security.

3.6 Pure P2P and Iterative Routing

This class is similar to Section 3.5, with the difference that direct links that are not based on the overlay are allowed, using addresses of the underlying network (e. g., IP addresses). Consequently, this approach does not provide anonymity with respect to the IP layer, but message paths are shorter (fewer hops) and fewer nodes might have access to the messages (or the cipher text). Some decentralized SNS approaches, such as PeerSoN [3], Persona [2] or Decent [10], are based on this concept.

Decentralization      | Integr. | Comm.   | Class
centralized           | remote  | relayed | 3.1 Centralized (e. g., Facebook, Google+)
centralized           | remote  | direct  | 3.2 Peer-assisted Centralized
decentralized servers | remote  | relayed | 3.3 Decentralized Servers (e. g., Diaspora*, Vis-a-Vis)
decentralized servers | remote  | direct  | (see 3.3, peer-assisted version)
decentralized servers | local   | relayed | 3.4 Common Interface Decentralized Services (e. g., onesocialweb)
decentralized servers | local   | direct  | (see 3.4, peer-assisted version)
peer-to-peer          | local   | relayed | 3.5 Pure P2P and Recursive Routing (e. g., SafeBook)
peer-to-peer          | local   | direct  | 3.6 Pure P2P and Iterative Routing (e. g., PeerSoN, Persona)

Table 1: Classification of architectural approaches based on system properties.

⁵ http://www.diasporaproject.org
⁶ http://onesocialweb.org
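The trade-off between recursive routing (Section 3.5) and direct links after a lookup (Section 3.6) can be made concrete with a toy model. The Python sketch below is only an illustration under simplifying assumptions of our own: the overlay path is given as a list of node names, source address rewriting lets only the first hop learn the sender's network address, and in the direct case every node contacted during the lookup sees the sender's address. None of the function names correspond to an existing system.

```python
from typing import Dict, List, Union

Report = Dict[str, Union[int, List[str]]]


def recursive_delivery(path: List[str]) -> Report:
    """Message is relayed hop by hop along the overlay path (Section 3.5)."""
    if len(path) < 2:
        raise ValueError("path needs at least a sender and a receiver")
    relays, receiver = path[1:-1], path[-1]
    return {
        "message_hops": len(path) - 1,
        # Every relay and the receiver handle the message (or its ciphertext).
        "handles_message": relays + [receiver],
        # Assumption: with source address rewriting at each step, only the
        # first hop learns the sender's network address.
        "learns_sender_address": path[1:2],
    }


def direct_delivery(path: List[str]) -> Report:
    """Overlay nodes are used for the lookup only; the message is sent directly (Section 3.6)."""
    if len(path) < 2:
        raise ValueError("path needs at least a sender and a receiver")
    relays, receiver = path[1:-1], path[-1]
    return {
        "message_hops": 1,
        # Only the receiver handles the message itself.
        "handles_message": [receiver],
        # Assumption: every node contacted during the lookup, and the
        # receiver, sees the sender's network address.
        "learns_sender_address": relays + [receiver],
    }


path = ["alice", "n1", "n2", "n3", "bob"]  # hypothetical overlay path
recursive = recursive_delivery(path)
direct = direct_delivery(path)
```

In this model, the recursive variant exposes the message to more nodes but the sender's address to fewer, which mirrors the anonymity versus path-length contrast drawn between the two classes.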

4. ADVERSARY MODELS

To characterize possible adversary models, we first discuss general goals of attackers. Then we address the attack surfaces and attack vectors on the three system layers from [4].

4.1 Adversary Goals

An adversary may want to affect one or more aspects of security and privacy: confidentiality, integrity, or service availability. Thus he may want to deanonymize a target user, get access to confidential information that is not addressed to him, falsify information to influence subjects, or attempt to prevent subjects from using the system by destroying its functionality.

4.2 Attack Surfaces and Vectors

Communication and Transport Layer.

On the CT layer, the natural attack surface for an adversary is the network communication. Network traffic can be sniffed locally, e. g., in a wireless network environment, or more globally, e. g., by a malicious ISP. Besides passively observing traffic, an adversary can interfere with the communication and filter or block traffic. Furthermore, an adversary might try to delay traffic, in order to disrupt time-critical synchronous communication.

An even more active adversary can try to manipulate or inject packets, although on the CT layer this might not be practical, because creating a meaningful packet requires perfect knowledge of several network parameters. Finally, an adversary can perform denial of service (DoS) attacks against the SNS network components.

All these attacks are not specific to the domain of SNS, but have to be taken into account when analyzing the security and privacy properties of these services. Furthermore, different SNS implementations expose different observable information on this layer.

Application and Service Layer.

On the AS layer, several attack surfaces are available for an adversary. Again, network traffic can be sniffed, manipulated, blocked and injected, this time informed by routing characteristics of the AS layer. Adversaries can try to exploit high-value nodes, e. g., high traffic volume nodes or those responsible for routing traffic of a certain target user.

Stored data is another attack surface on the AS layer. Malicious storage nodes do have extended access to private content data they store (plain- or ciphertext, depending on the implementation) and can evaluate request logs to analyze access patterns for this content. Active attackers may even modify or delete stored content. External adversaries can at least crawl the network and harvest accessible storage objects.

Moreover, an adversary can exploit vulnerabilities of the identity management employed by the SNS. Possible attacks in this category comprise impersonation (using the existing ID of a target user to act on her behalf), spoofing (using a new, falsified ID), sybil attacks (creating and orchestrating a large amount of fake IDs), and eclipse attacks (surrounding a target user by adversary-controlled nodes). Adversaries with system-internal friendships to a target user (including indirect ones, e. g., friend-of-a-friend relations) can exploit the extended legitimate data access that comes with this status.

Finally, weaknesses of the protocols employed on the AS layer can be exploited by attackers. Besides obvious security holes that allow for illegitimate actions, such as deleting a profile, other types of attacks might be possible, such as specialized DoS attacks (e. g., flooding of SNS requests). Attacks on APIs for interactions with third-parties, delegates or replicas also fall in this category.

Social Networking Layer.

On the SN layer, the users' vulnerability to social engineering is the main attack surface for adversaries. It allows for, e. g., social pressure, phishing and password theft. Furthermore, an adversary can use background knowledge about a target user that was acquired service-externally (friend adversary). Such knowledge allows the adversary to interpret information that was collected inside the system or to mount inference attacks on sparse raw data that on its own might not have been critical with respect to user privacy (e. g., sparse location data that together with background knowledge about preferred places of the user enables precise localization with high probability).

5. SECURITY DISCUSSION

In the following we compare the different classes of decentralization, described in Section 3, with respect to trust models and the different adversary threats, discussed in Section 4.

5.1 Trust Models

SNS affiliates are part of a broad spectrum of trust relationships with legal trust at one end, and interpersonal social trust at the other end.

A central SNS providing company is a single administrative domain that can be identified by the users and thus can be sued in the case of misbehavior. This allows for legal trust in the company to respect the law of at least the country where it has its registered office.

A decentralized SNS consists of more than one administrative domain storing user data, and these domains may not be registered companies, subject to the legal accountability that this status entails. In addition, the administrative domains may be situated in different countries with different legal systems, further complicating legal trust. Therefore, trust in a decentralized SNS is rather based on technical mechanisms like cryptography and on interpersonal as well as institutional reputation. Distributed server models are divided into several administrative domains, each comprising the machines related to one server. P2P approaches can even be seen as having one administrative domain for each network member. While increasing the number of domains allows for distributing trust over several parties (as all of them have to collude in order to harm the user in the same way as a single provider), it also requires more inter-domain communication links, which are more exposed to adversaries.

5.2 Adversary Threats

Attacks on the social layer do not depend on the technical implementation of the SNS. Some of them can be related to properties of the user-interface (e. g., phishing), but the interface is in general independent of the degree of decentralization in the underlying system. For the following discussion we therefore focus on attacks on the communication and transport layer, as well as on the application and service layer.

Centralized systems (Section 3.1) expose very limited entry points for external adversaries, and by aggregating all user traffic on a small number of machines, the central provider performs a kind of traffic mixing, which makes it hard for external observers to infer sensitive information from traffic analysis. However, in a centralized system the provider has to be trusted, not only not to misuse or sell the massively accumulated private user data, but also to protect it perfectly from unintentional leakages, attacks, and curious employees.

Peer-assisted centralized systems (Section 3.2) open up some more attack surfaces for external adversaries compared to a purely centralized system. As part of the traffic is routed directly between peers, the implicit mix property of centralized systems is lost: network sniffers can infer interactions between peers by monitoring communication partners and traffic volume. The central party still has an almost comprehensive view of the users' activities and remains a major threat to user privacy if it is not fully trusted.

Distributed Servers (Section 3.3) have the potential to combine the properties of centralized and decentralized approaches. Trust requirements are usually distributed over several servers, and users are free to choose a certain server based on experience and reputation. Traitor attacks (i. e., a server first behaves honestly to gain reputation and exploits that trust later) are still possible but mitigated since trust is not mainly based on recent behavior of the servers, but more on the reputation of the server maintaining parties (e. g., communities, companies or private persons).

Relying on a single one of the servers is not required since data can be stored redundantly (e. g., a complete copy of a user's profile at the user's device, or on fallback servers) to minimize data loss in case a single server turns malicious or is no longer maintained.

Pure P2P approaches (Section 3.5 and Section 3.6) do not have the requirement of trusting a central party. While this constitutes a major advantage for user privacy, these systems have to cope with other challenges [7]: The complete decentralization of the network content and functionality implies exposing all system data and metadata to anybody observing the network, including all participants on the path along which a message is forwarded.

Even though content is usually encrypted in this kind of system, metadata about the content, or data generated while managing the content, can reveal sensitive information with the potential to invade the users' privacy. This holds especially for the communication partner identification as well as for the frequency and volume of data exchange. Thus it can give an observer qualitative information about the relation of the communicating members. The size of a storage object can indicate its content type (e. g., the difference between text posts and pictures), statistical information (e. g., the length of a post), or act as a fingerprint to track a specific content, even if it is re-encrypted under different keys when shared by different users.

Finally, the identity and relationship management, e. g., the distribution of cryptographic keys, can leak sensitive information to network-sniffing adversaries, such as the content audience of encrypted objects or friendship status changes of network members. Hierarchical P2P systems have the potential of hiding some of the system internals from outsiders, by entrusting supernodes with certain crucial tasks. The selection mechanism for supernodes is usually based on automatic evaluations of node properties, such as availability. Therefore the design is vulnerable to sybil attacks or other approaches that aim to tamper with the evaluation results in favor of adversary-controlled nodes.

Systems that employ recursive routing (Section 3.5) can facilitate communication anonymization, as identifiers of the communication endpoints can be hidden from external observers. They are characterized by a higher dependence of users on the forwarding nodes, and hence may be more vulnerable to insider attacks.

We conclude that hybrid approaches, as described in Section 3.3, are less exposed to the metadata vulnerabilities of pure P2P approaches since servers exist which act as implicit mixes. Furthermore, there is no central authority with total access to all user data, as in a central SNS. A user joining an SNS based on a hybrid approach still needs to resolve a trade-off: either choose a server maintained by an authority whom he trusts, or take on the challenge of providing and maintaining his own server, thus potentially abandoning the mixing provided by a server that hosts multiple profiles.
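The observation in Section 5.2 that the size of a storage object can act as a fingerprint, even when the object is re-encrypted under different keys, can be illustrated with a small Python sketch. The cipher below is a toy, length-preserving stand-in built from `hashlib.shake_256` purely for illustration; it is not secure and is not the scheme of any system discussed here. The object sizes and the helper names `toy_encrypt` and `link_by_size` are hypothetical.

```python
import hashlib
import os
from typing import Dict, List


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy length-preserving stream cipher (illustration only, NOT secure)."""
    keystream = hashlib.shake_256(key).digest(len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, keystream))


def link_by_size(observed: List[bytes]) -> Dict[int, List[int]]:
    """Group observed ciphertexts (by list index) that share the same length."""
    groups: Dict[int, List[int]] = {}
    for index, blob in enumerate(observed):
        groups.setdefault(len(blob), []).append(index)
    return groups


photo = os.urandom(48_213)  # hypothetical picture, size chosen arbitrarily
post = b"short text post"   # hypothetical text post

# The same photo is shared by two users under independent random keys.
observed = [
    toy_encrypt(os.urandom(32), photo),
    toy_encrypt(os.urandom(32), post),
    toy_encrypt(os.urandom(32), photo),
]

groups = link_by_size(observed)
```

Although the two encrypted copies of the photo differ byte for byte, an observer who only compares lengths places them in the same group and can also tell them apart from the much smaller text post.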

6. PROFILE DATA AVAILABILITY

To allow users to access profile data of other users, the data needs to be available within a reasonable delay. In this section, we give our notion of availability and provide an overview of how the different approaches aim to tackle this issue, since the profile accessibility and thus the availability of the data is one key issue when building an SNS.

For the scope of this paper, we define profile availability to be the fraction of time (baseline: 24/7) in total that a profile is available to others. Approaches based on dedicated storage resources potentially reach full-time (churn related) profile availability by definition. Another notion of availability that is sometimes used for SNS is to restrict the scope to the time when friends, who might need to access the data, are most likely to be online. This, however, leads to more optimistic estimations and does not take into account friendships attempted to be established during offline times or profile requests from friends of friends.

In case of a centralized SNS, the server is the place to store the data and provide access at all times, whether a particular user is online or not. While this can be realized with a distributed set of servers or using content distribution networks, the control over the storage is in the hand of a single entity and we thus consider this a logically centralized setup. In contrast, in the case of fully decentralized approaches using non-reliable profile storage (e. g., P2P), more care needs to be taken to keep data accessible even when the owner of the data is not online. This can be achieved by other mechanisms, such as replication.

Different strategies exist to achieve data availability in a P2P-based, decentralized SNS under condition of churn. First, replicas can be spread randomly across the network. Second, the friends' nodes may hold replicas of the profile data, since relevance of data and good behavior are assumed more likely due to the trust relationship of being friends [5]. Finally, nodes can be selected based on different metrics to store replicas, aiming to achieve the highest possible profile availability while minimizing network traffic and storage overhead. In the following, we discuss these replica placement strategies in terms of their impact on availability.

6.1 Random Selection of Replica Nodes

Randomly storing profile copies at unrelated nodes may lead to a large number of necessary copies. Assuming uniformly random distributed online times, which represents a favorable assumption, leveling out deviations in user density over varying time zones, we make the following back-of-the-envelope calculation to illustrate the worst case, when replicas are only available as long as the SNS session is active (in the remainder we use the term online as a shorthand for this). Given this restriction, the profile availability PA is calculated as:

$$PA = 1 - (1 - OF)^{R} \tag{1}$$

where OF is the fraction of time a node is online onaverage, and R is the number of replicas. This numberof replicas also equals the number of pro�les each nodeneeds to store and serve (for n users, in total R·n pro�lecopies have to be distributed over n nodes).According to a recent study7 Facebook users have 130

friends, and publish 90 items a month on average. Thistranslates to an average of three pro�le updates per day.It further states that the 750 million users included hadbeen online for 700 billion minutes a month in total,

7http://www.internetworld.de/Specials/Facebook/Zahlen-und-Fakten/Facebook-Nutzung-weltweit-Die-o�zielle-Statistik

54

Figure 2: Availability of profiles depending on the number of replicas, for different SNS online times of replicating devices. [Plot: x-axis Replicas, 0 to 100; y-axis Availability, 0 to 1; one curve each for 15 min, 31 min, 40 min, 1.2 h, and 2.4 h of daily online time.]

or 31 minutes per day and user. Schneider et al. [12] analyzed passively monitored network traffic of "tens of thousands of users at different ISPs" in 2008 and showed "that OSN sessions exhibit high variability, with many lasting a very short period of time and a few lasting for hours, with a mean of about 40 minutes". The study hence supports the numbers given in the first source, even if a user might not necessarily have exactly one session per day: both give a general order of magnitude for the time users utilize the service.

To give an idea of how many replicas are necessary to reach a certain availability, we plot the cases in which nodes are part of the network for 2.4 hours a day (10% of the time), 1.2 hours (5%), 40 minutes (about 2.8%), 31 minutes (about 2.2%), and 15 minutes (about 1.0%) per day on average (Fig. 2).

In this light, random peer selection for profile data replication in decentralized SNS does not seem feasible for high availability requirements, especially if a profile is considered accessible only when at least one of the replicating nodes is engaged in an SNS session. Even under simplified, favorable circumstances and realistic session times, profile data availability requires very high replication factors (cf. Fig. 2). Therefore, more sophisticated replica placement strategies have been proposed, as described in the following.
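The calculation behind Eq. (1) can be reproduced with a few lines of Python. The following is a minimal sketch under the same assumptions as above (independent, uniformly distributed online times); the function names and the 90% and 99% targets are illustrative choices and are not taken from the cited studies. Solving Eq. (1) for the number of replicas gives R >= ln(1 - PA) / ln(1 - OF), which the second function evaluates.

```python
import math

MINUTES_PER_DAY = 24 * 60


def profile_availability(online_fraction: float, replicas: int) -> float:
    """Eq. (1): probability that at least one of the replica holders is online."""
    return 1.0 - (1.0 - online_fraction) ** replicas


def replicas_needed(online_fraction: float, target: float) -> int:
    """Smallest number of replicas whose availability reaches the target.

    Obtained by solving Eq. (1) for R; the target values used below are
    illustrative assumptions, not figures from the paper.
    """
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - online_fraction))


# Daily online times of the curves plotted in Fig. 2, in minutes.
daily_online_minutes = {"15 min": 15, "31 min": 31, "40 min": 40, "1.2 h": 72, "2.4 h": 144}

for label, minutes in daily_online_minutes.items():
    fraction = minutes / MINUTES_PER_DAY
    print(
        f"{label:>6}: online fraction {fraction:.3f}, "
        f"availability with 100 replicas {profile_availability(fraction, 100):.3f}, "
        f"replicas for 90%: {replicas_needed(fraction, 0.90)}, "
        f"replicas for 99%: {replicas_needed(fraction, 0.99)}"
    )
```

Under these assumptions, a node population that is online 10% of the time needs 22 replicas to reach 90% availability, whereas 31 minutes of daily online time calls for more than a hundred.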

6.2 Friend Storage

Several approaches propose to choose a user's friends for replication, since they are assumed both to be interested in the content and to cooperate in favor of the user [5, 11]. When profile data is replicated at friends' nodes, however, it is not unlikely that the friends live in the same, or at least nearby, time zones and are thus offline at similar times. Furthermore, storing replicas at friends' nodes causes a bootstrapping problem. When a new user joins the network, she does not have any friendship connections. The profile will therefore not have enough replicas in the network to be sufficiently available to be found by other users.

Sharma et al. [15] conducted an empirical study of availability in friend-to-friend storage systems. They observe that "roughly 50% nodes can achieve at least 90% of coverage". Thus, if every node holds a copy of the data of all friends, only roughly 50% of the datasets can be kept available for at least 90% of the total time.
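The effect of friends being online at similar times can be illustrated with a small deterministic sketch. It is a toy model and not the methodology of Sharma et al. [15]: each replica holder is assumed to be online once per day in a single contiguous session, and the two schedules (evenly spread versus clustered in one evening hour) as well as the number of holders are hypothetical values chosen for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def coverage(start_minutes, session_minutes):
    """Fraction of the day during which at least one replica holder is online.

    Each holder is assumed to be online once per day, in one contiguous
    session of `session_minutes` starting at the given minute of the day.
    """
    online = [False] * MINUTES_PER_DAY
    for start in start_minutes:
        for offset in range(session_minutes):
            online[(start + offset) % MINUTES_PER_DAY] = True
    return sum(online) / MINUTES_PER_DAY


REPLICAS = 12  # hypothetical number of friends holding a replica
SESSION = 40   # minutes per day, the mean session length reported in [12]

# Holders whose sessions are spread evenly around the clock.
spread = [i * MINUTES_PER_DAY // REPLICAS for i in range(REPLICAS)]
# Holders whose sessions all start within the same evening hour (19:00 to 19:55).
clustered = [19 * 60 + i * 5 for i in range(REPLICAS)]

print(f"spread sessions:    {coverage(spread, SESSION):.3f}")
print(f"clustered sessions: {coverage(clustered, SESSION):.3f}")
```

With the same number of replicas and the same session length, the evenly spread schedule covers a third of the day, while the clustered schedule covers only 95 minutes, since overlapping sessions add no availability.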

6.3 Metric-based Replication Strategies

Tegeler et al. introduce Gemstone [16], a more sophisticated approach to selecting replicas for storing SNS content in a decentralized manner. Aiming to reduce the number of copies necessary to achieve a convenient availability, the authors suggest a selection strategy based on (1) the online time, represented by the average online probability, (2) the social relation (a binary friendship indicator), and finally, (3) an "online experience".

SuperNova [14] introduces "super nodes" and "storekeepers" to increase the availability of profile data. A node joining the network first uses a chosen super node to bootstrap and replicate the data until it has enough edges of its own. This approach differentiates between two kinds of edges: friends and storekeepers. Being a storekeeper is a unidirectional connection between nodes, representing the willingness to replicate the profile data of another node. Storekeepers are a manually chosen subset of friends, supplemented by asking the super node to convey additional storekeeping nodes beyond the node's own friendship horizon.

The discussed approaches are effective strategies for selecting replica nodes: they decrease the number of replicas while maximizing profile availability, and thus help to improve the availability of profile data in a P2P-based SNS. Nevertheless, the following issues indicate that more research is needed. So far, metric-based approaches fall short of the standard expectation of 24/7 availability set by centralized systems or other approaches using dedicated services. Furthermore, the reliance on online time measured and advertised by the replicator nodes themselves has some disadvantages: saving bandwidth and storage resources is a strong incentive to lie; the online time might be highly dynamic, so that past behavior is not necessarily a good estimator of future behavior (e.g., weekends vs. business days); and the online time might be considered private information. The "online experience" leads to preferring storage nodes with similar temporal online patterns, which is suboptimal for continuous data availability. Preferring socially related nodes may cause privacy issues, since a social relation may create a strong interest in learning information about the related user by observing SNS usage patterns. Another open issue for replication strategies is not only to maximize availability but also to take bandwidth needs into account, which is especially relevant in the presence of very popular profiles and temporal popularity peaks (e.g., caused by media attention).
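To make the point about temporal online patterns concrete, the following sketch contrasts two selection rules on a hypothetical set of candidate nodes. It implements neither Gemstone [16] nor SuperNova [14]: ranking by total online time and greedy selection by newly covered hours are generic stand-ins, the node names and hourly schedules are invented, and the schedules are assumed to be reported truthfully, which the discussion above identifies as a weak assumption.

```python
def top_by_online_time(candidates, k):
    """Pick the k nodes with the most online hours, ignoring when they are online."""
    chosen = sorted(candidates, key=lambda n: (-len(candidates[n]), n))[:k]
    covered = set().union(*(candidates[n] for n in chosen))
    return chosen, len(covered) / 24


def greedy_by_coverage(candidates, k):
    """Pick up to k nodes, each time taking the one that adds the most uncovered hours.

    `candidates` maps a node name to the set of hours (0-23) in which the node
    is expected to be online. Ties are broken alphabetically for determinism.
    """
    covered = set()
    chosen = []
    remaining = dict(candidates)
    for _ in range(k):
        if not remaining:
            break
        best = max(sorted(remaining), key=lambda n: len(remaining[n] - covered))
        if not remaining[best] - covered:
            break  # no remaining candidate adds coverage
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, len(covered) / 24


# Hypothetical candidates: three evening users with similar patterns, two others.
candidates = {
    "alice": set(range(18, 22)),
    "bob": set(range(19, 23)),
    "carol": set(range(6, 9)),
    "dave": set(range(0, 3)),
    "erin": set(range(17, 23)),
}

print(top_by_online_time(candidates, 3))
print(greedy_by_coverage(candidates, 3))
```

In this example, ranking by online time alone selects three nodes with overlapping evening sessions and covers a quarter of the day, whereas the greedy rule selects nodes with complementary sessions and covers half of it with the same number of replicas.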

7. CONCLUSION

This paper discusses six classes of decentralization for implementations of Social Networking Services. Because the central provision of an SNS has been identified as a potential threat to the privacy of its users, numerous proposals to decentralize the service have been made in the recent past. The common ground of these proposals is the aim of removing an omniscient central entity. Decentralization offers benefits by reducing both the data exploitation surface for the service provider and the existence of a high-value attack target for adversaries, but it comes with several potentially undesired side effects that so far have been broadly disregarded.

This paper introduces a classification of decentralization degrees for Social Networking Services, which can be used to formalize decentralized service architectures and which helps to identify possible attack surfaces as well as classes of adversaries. The classification is subsequently applied to identify drawbacks of purely decentralized approaches, highlighting the fact that pure decentralization may introduce disadvantages regarding privacy as well as availability of profiles.

We have started to investigate hierarchical, hybrid architectures of distributed servers, as they seem a promising alternative that combines positive characteristics of centralized and fully decentralized approaches, with the chance of avoiding their specific flaws.

8. REFERENCES

[1] Luca Maria Aiello and Giancarlo Ruffo. Secure and Flexible Framework for Decentralized Social Network Services. In SESOC, 2010.

[2] Randy Baden, Adam Bender, Neil Spring, Bobby Bhattacharjee, and Daniel Starin. Persona: an online social network with user-defined privacy. In SIGCOMM, 2009.

[3] Sonja Buchegger, Doris Schiöberg, Le Hung Vu, and Anwitaman Datta. PeerSoN: P2P social networking - early experiences and insights. In Workshop on Social Network Systems, 2009.

[4] Leucio-Antonio Cutillo, Mark Manulis, and Thorsten Strufe. Security and Privacy in Online Social Networks. Handbook of Social Network Technologies and Applications, 2010.

[5] Leucio-Antonio Cutillo, Refik Molva, and Thorsten Strufe. Safebook: Feasibility of Transitive Cooperation for Privacy on a Decentralized Social Network. In WoWMoM, 2009.

[6] Kalman Graffi, Sergey Podrajanski, Patrick Mukherjee, Aleksandra Kovacevic, and Ralf Steinmetz. A Distributed Platform for Multimedia Communities. In International Symposium on Multimedia, 2008.

[7] Benjamin Greschbach, Gunnar Kreitz, and Sonja Buchegger. The Devil is in the Metadata - New Privacy Challenges in Decentralised Online Social Networks. In SESOC, 2012.

[8] Saikat Guha, Kevin Tang, and Paul Francis. NOYB: Privacy in Online Social Networks. In First Workshop on Online Social Networks, 2008.

[9] Felix Günther, Mark Manulis, and Thorsten Strufe. Cryptographic Treatment of Private User Profiles. Lecture Notes in Computer Science (LNCS), 7126, 2012.

[10] Sonia Jahid, Shirin Nilizadeh, Prateek Mittal, Nikita Borisov, and Apu Kapadia. DECENT: A Decentralized Architecture for Enforcing Privacy in Online Social Networks. In SESOC, 2012.

[11] Rammohan Narendula. The Case of Decentralized Online Social Networks. Technical report, EPFL, 2012.

[12] Fabian Schneider, Anja Feldmann, Balachander Krishnamurthy, and Walter Willinger. Understanding Online Social Network Usage from a Network Perspective. In IMC, 2009.

[13] Amre Shakimov, Harold Lim, Ramón Cáceres, Landon Cox, Kevin Li, Dongta Liu, and Alexander Varshavsky. Vis-a-Vis: Privacy-Preserving Online Social Networking via Virtual Individual Servers. In ComsNets, 2011.

[14] Rajesh Sharma and Anwitaman Datta. SuperNova: Super-peers Based Architecture for Decentralized Online Social Networks. Technical report, ArXiv e-prints, 2011.

[15] Rajesh Sharma, Anwitaman Datta, Matteo Dell'Amico, and Pietro Michiardi. An Empirical Study of Availability in Friend-to-Friend Storage Systems. In Peer-to-Peer Computing (P2P), 2011 IEEE, 2011.

[16] Florian Tegeler, David Koll, and Xiaoming Fu. Gemstone: Empowering Decentralized Social Networking with High Data Availability. In Globecom, 2011.

[17] Amin Tootoonchian, Stefan Saroiu, Yashar Ganjali, and Alec Wolman. Lockr: Better Privacy for Social Networks. In CoNEXT, 2009.

[18] Ching Man Au Yeung, Ilaria Liccardi, Kanghao Lu, Oshani Seneviratne, and Tim Berners-Lee. Decentralization: The Future of Online Social Networking. In W3C Workshop on the Future of Social Networking, 2009.
