+ All Categories
Home > Documents > Information Leakage through Mobile Analytics Servicescs441/Papers/sec-010.pdf · through mobile...

Information Leakage through Mobile Analytics Servicescs441/Papers/sec-010.pdf · through mobile...

Date post: 04-Aug-2020
Category:
Upload: others
View: 1 times
Download: 0 times
Share this document with a friend
6
Information Leakage through Mobile Analytics Services Terence Chen *† , Imdad Ullah *† , Mohamed Ali Kaafar *? , and Roksana Boreli *† * National ICT Australia University of New South Wales, Australia ? INRIA, France [email protected] ABSTRACT In this paper we investigate the risk of privacy leakage through mobile analytics services and demonstrate the ease with which an external adversary can extract individual’s profile and mobile applications usage information, through two major mobile analytics services, i.e. Google Mobile App Analytics and Flurry. We also demonstrate that it is possi- ble to exploit the vulnerability of analytics services, to influ- ence the ads served to users’ devices, by manipulating the profiles constructed by these services. Both attacks can be performed without the necessity of having an attacker con- trolled app on user’s mobile device. Finally, we discuss po- tential countermeasures (from the perspectives of different parties) that may be utilized to mitigate the risk of individ- ual’s personal information leakage. 1. INTRODUCTION The mobile advertising ecosystem, comprising ad compa- nies (with associated analytics services), app developers, the companies running ad campaigns and mobile users, has be- come a powerful economic force. Increasingly, targeted ads are being served based on user’s information collected by the apps. Potential privacy threats resulting from data collection by third-parties have been extensively studied in the research literature [7, 10, 9], and a number of countermeasures are proposed [6, 11, 8]. the immediate impact of such data col- lection has been so far overlooked. In this paper, we argue and show that, even if the genuine purposes of analytics services are legitimate (i.e. to pro- vide analytics to developer and/or to serve targeted ads), user’s privacy can be leaked by third-party tracking compa- nies due to inadequate protection of data collection phase and the aggregated information process. 
We further argue that inappropriate security measures of the mobile analytics services may threaten the ads eco-system. We consider mobile analytics services as entities that can leak private data to external adversaries, and indeed show Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ACM HotMobile’14, February 26–27, 2014, Santa Barbara, CA, USA. Copyright 2014 ACM 978-1-4503-2742-8 ...$15.00. how user profiles and app usage statistics can be extracted from two major mobile analytics and ads delivery networks (Google Mobile App Analytics and Flurry Analytics 1 ). We exploit, first, the use of globally unique identifiers (IDs) and the lack of mobile device authentication in the mobile ana- lytics data collection. Then, the detailed (per user) report- ing available from the mobile analytics companies. Considering the minimum effort required to obtain the device ID and other identifying information used by mobile analytics company, that can be collected either by monitor- ing any accessible networks, e.g. open wireless access points (APs) or local area network (LAN), or by any app (we note that no permission is required for the app to collect device IDs and related information, e.g. Android ID, device model, OS version, etc), we show how this type of attack presents a serious threat to user’s privacy. We validate the informa- tion leakage through experiments using a set of 44 volunteers Android mobile devices. 
We successfully extract the volun- teer’s profiles from Flurry and Google Analytics and show that it is also possible to spoof a device by setting the iden- tifying parameters in either a (different) mobile device or an emulator. We further show how a malicious adversary can exploit the vulnerability of analytics services to launch an attack on the advertising ecosystem, by perturbing user’s profiles and consequently disrupting the accuracy of targeted ads placement, with potentially serious financial consequences for the advertising ecosystem. Specifically, we demonstrate how user profiles can be distorted by injecting artificial app usage data (from spoofed devices), which will influence the ads served to the original devices. We validate the success of our technique by capturing and analyzing the ads received on the targeted devices. We note that both attacks can be performed without the need for the user to have an attacker controlled app on their device, while exploiting the easy access to the unique iden- tifiers of user profiles in the analytics services and lack of authentication during the data collection and accessing pro- cess. With this work, we aim to highlight the vulnerabilities of current tracking mechanism by demonstrating the potential privacy threats, rather than to design a large scale attack on privacy of mobile users. We hope that our work will inspire the analytics companies to implement more secure and privacy friendly tracking mechanisms. 1 For the convenience, we use Google and Flurry to refer to these two mobile analytics services in this paper
Transcript
Page 1: Information Leakage through Mobile Analytics Servicescs441/Papers/sec-010.pdf · through mobile analytics services and demonstrate the ease with which an external adversary can extract

Information Leakage through Mobile Analytics Services

Terence Chen∗†, Imdad Ullah∗†, Mohamed Ali Kaafar∗?, and Roksana Boreli∗†∗National ICT Australia †University of New South Wales, Australia ?INRIA, France

[email protected]

ABSTRACTIn this paper we investigate the risk of privacy leakagethrough mobile analytics services and demonstrate the easewith which an external adversary can extract individual’sprofile and mobile applications usage information, throughtwo major mobile analytics services, i.e. Google Mobile AppAnalytics and Flurry. We also demonstrate that it is possi-ble to exploit the vulnerability of analytics services, to influ-ence the ads served to users’ devices, by manipulating theprofiles constructed by these services. Both attacks can beperformed without the necessity of having an attacker con-trolled app on user’s mobile device. Finally, we discuss po-tential countermeasures (from the perspectives of differentparties) that may be utilized to mitigate the risk of individ-ual’s personal information leakage.

1. INTRODUCTIONThe mobile advertising ecosystem, comprising ad compa-

nies (with associated analytics services), app developers, thecompanies running ad campaigns and mobile users, has be-come a powerful economic force. Increasingly, targeted adsare being served based on user’s information collected by theapps.

Potential privacy threats resulting from data collection bythird-parties have been extensively studied in the researchliterature [7, 10, 9], and a number of countermeasures areproposed [6, 11, 8]. the immediate impact of such data col-lection has been so far overlooked.

In this paper, we argue and show that, even if the genuinepurposes of analytics services are legitimate (i.e. to pro-vide analytics to developer and/or to serve targeted ads),user’s privacy can be leaked by third-party tracking compa-nies due to inadequate protection of data collection phaseand the aggregated information process. We further arguethat inappropriate security measures of the mobile analyticsservices may threaten the ads eco-system.

We consider mobile analytics services as entities that canleak private data to external adversaries, and indeed show

Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.ACM HotMobile’14, February 26–27, 2014, Santa Barbara, CA, USA.Copyright 2014 ACM 978-1-4503-2742-8 ...$15.00.

how user profiles and app usage statistics can be extractedfrom two major mobile analytics and ads delivery networks(Google Mobile App Analytics and Flurry Analytics1). Weexploit, first, the use of globally unique identifiers (IDs) andthe lack of mobile device authentication in the mobile ana-lytics data collection. Then, the detailed (per user) report-ing available from the mobile analytics companies.

Considering the minimum effort required to obtain thedevice ID and other identifying information used by mobileanalytics company, that can be collected either by monitor-ing any accessible networks, e.g. open wireless access points(APs) or local area network (LAN), or by any app (we notethat no permission is required for the app to collect deviceIDs and related information, e.g. Android ID, device model,OS version, etc), we show how this type of attack presentsa serious threat to user’s privacy. We validate the informa-tion leakage through experiments using a set of 44 volunteersAndroid mobile devices. We successfully extract the volun-teer’s profiles from Flurry and Google Analytics and showthat it is also possible to spoof a device by setting the iden-tifying parameters in either a (different) mobile device or anemulator.

We further show how a malicious adversary can exploitthe vulnerability of analytics services to launch an attackon the advertising ecosystem, by perturbing user’s profilesand consequently disrupting the accuracy of targeted adsplacement, with potentially serious financial consequencesfor the advertising ecosystem. Specifically, we demonstratehow user profiles can be distorted by injecting artificial appusage data (from spoofed devices), which will influence theads served to the original devices. We validate the success ofour technique by capturing and analyzing the ads receivedon the targeted devices.

We note that both attacks can be performed without theneed for the user to have an attacker controlled app on theirdevice, while exploiting the easy access to the unique iden-tifiers of user profiles in the analytics services and lack ofauthentication during the data collection and accessing pro-cess.

With this work, we aim to highlight the vulnerabilities ofcurrent tracking mechanism by demonstrating the potentialprivacy threats, rather than to design a large scale attackon privacy of mobile users. We hope that our work willinspire the analytics companies to implement more secureand privacy friendly tracking mechanisms.

1For the convenience, we use Google and Flurry to refer tothese two mobile analytics services in this paper

Page 2: Information Leakage through Mobile Analytics Servicescs441/Papers/sec-010.pdf · through mobile analytics services and demonstrate the ease with which an external adversary can extract

User%devices% Mobile%apps%

Aggrega2on%%server%

Ad%placement%server%

Adver2ser% Developer%

Analy2cs%server%

App%analy2c%services%

[app$usage]$

[app$usage,$$user$info]$

Mobile%ads/analy2cs%network%

[app$usage,$$user$info]$

Figure 1: Free app eco-system and information flowbetween different parties.

The paper is organized as follows. Section 2 describes themobile applications ecosystem including the analytics ser-vices. Section 3 presents the methodology used for the firstattack, extracting user profile information, while Section 4presents the methodology to implement the influencing ofserved ads by distorting user profiles. Section 5 discussesthe potential countermeasures that may be used to mitigatethe attacks. We conclude in Section 6.

2. BACKGROUNDIn the following, we provide a brief overview of the mobile

applications ecosystem, aim to explain the relationship be-tween users, apps, app developers, advertising network andthe corresponding analytics services.

2.1 The App Ecosystem and Mobile TrackingIn order to maximise the revenue of in-app ads, targeted

ad networks are widely adopted by app developers. Accord-ing to the study in [9], more than half (52.1%) of top 100,000apps downloaded from Android Market contain at least onead library (including analytics libraries). Figure 1 showsa typical free app eco-system and the information flow be-tween different parties. The analytic library (in some casessame as the ad library, e.g. Flurry) which is typically em-bedded into the mobile app collects usage information andpossibly user related attributes, which are sent to an aggre-gation server. Globally unique identifiers (e.g. Android ID,iOS UUID, IMEI, MAC, etc.), available on mobile devices,allow analytics companies to link data collected from differ-ent apps (identified by an appID) to the same individuals.The analytics service derive user profiles from aggregateddata and provide the information to the ad network (e.g.Google AdMob and Flurry) for the purpose of serving tar-geted ads to mobile users.

2.2 Mobile Analytics and Tracking APIAnother incentive for the developers to use analytics ser-

vices is the comprehensive measurement tools that helpthem to evaluate the performance of their apps. With theknowledge learned from large number of other apps andusers, analytics services can also provide audience informa-tion, if available, for example gender, age, location, lan-guage, interests, etc. Figure 2 illustrates the Flurry app

Figure 2: Snapshot of developer portal in FlurryAnalytics

Figure 3: Snapshot of an app usage message(onStartSession message) send by Flurry API

performance dashboard which shows the user interests ofan app compared to global benchmark. Developers are alsoable to view the demographic information of the app usersvia the tags on the left.

In order to use these services, developers are required toembed the tracking APIs to the apps during the develop-ment. Once a user launch the app and agreed on the permis-sion of accessing resources on the device, the tracking APIssend user information, device and app usage information tothe aggregation server over the Internet. Typically the mes-sage is sent using either a HTTP GET or a POST method,e.g. Figure 3 shows the content of a onStartSession mes-sage, reporting the app (identified by APPID) usage of adevice (identified by AndoidID).

3. USER PROFILE EXTRACTIONIn this section we present the methodology to extract user

profiles from mobile analytics services, solely relying on thedevice identifier of the target. In our study, we demonstratethe methods using both Google Analytics and Flurry in theAndroid environment. We refer a user profile as the set ofinformation collected or inferred by the analytics services.Different types of information may be available in differentservices, most of them include basic demographic informa-tion like age, gender, geography, language, etc. Analyticsservices also provide audience’s interests characteristics thatare inferred from other apps usage. For instance, Flurry pro-

Page 3: Information Leakage through Mobile Analytics Servicescs441/Papers/sec-010.pdf · through mobile analytics services and demonstrate the ease with which an external adversary can extract

vides an attribute called“persona”2 in the user profile, whichindicates the interests and behavior of the audience group.While some of these persona tags are quite general, someothers can be consider as sensitive personal information, e.g.“Singles”, “New Mons”, “High Net-Worth Individuals”, etc.We notes that Flurry also allow developer to access detailedusage sessions on different app categories of the app’s audi-ence.

The key technique to extract user profiles from the an-alytics service is to first impersonate the victim’s identityand perform actions on behalf of the victim, then (1) in theGoogle case, to fetch the user profile from a spoofed device,where the profile is simply shown by the Google service asan ads preference setting or (2) in the Flurry case, to injectthe target’s identity into a controlled analytics app, whichtriggers changes in the Flurry audience analysis report, fromwhich the adversary is able to extract the user profile. Inthe following, we first describe how to obtain and spoof adevice’s identity. Then, we detail the user profile extractionfor both cases of Google and Flurry.

3.1 Availability and Spoofing of Device IDEasy access to Device ID. An adversary can capture

victims Android IDs in at least two possible ways. First,the adversary can simply monitor the network, capture theusage reporting message sent by the third-party trackingAPIs and extract the device ID to be utilised for furthercommunication with the analytics service. An example ofsuch a message is shown for the case of Flurry in Figure 3.In a public hotspot scenario, it is then very easy to monitorhundreds if not thousands of IDs. In a confined area, anadversary (e.g. an employer or a colleague) targeting a par-ticular individual can even associate the collected device IDto his target (e.g. employee). It is interesting to note thatGoogle analytic library prevents leakage of device identityby hashing the Android ID, however it cannot stop other adlibraries to transmit such information in plain text (whichbe easily mapped to Google’s hashed device ID).

An alternative way, although may be more challengingin practice, is to obtain the target’s device identifier fromany application (controlled by the adversary) that logs andexports the device’s identity information.

Spoofing of Device ID. Flurry uniquely identifies An-droid users by a combination of device ID (Android ID in ourcase) and device information comprising the device name,model, brand, version and system build. Google howeversimply relies on the Android ID. These parameters, as de-scribed above, can be easily identified by observing the un-protected reporting messages. We note that they are alsoaccessible to any Android app, without the user agreementgiven through the Android permissions. By modifying thevalues of identifying parameters in a rooted Android device,we are able to spoof the identity of another Android device.The corresponding system properties files that contain theparameters in the Android file system are shown in Table 1.

3.2 Extracting User Profiles from GoogleAndroid system allows users to view and manage their in-

app ads preferences3, e.g. to opt-out or to delete interest

2http://support.flurry.com/index.php?title=Analytics/Overview/Lexicon/Personas3Access from Google Settings -> Ads -> Learn more ->Adjust your Ads Settings

Parameter File path in Android file system

Android ID/data/data/com.android.providers.settings/databases/settings.db

ro.build.id

/system/build.propro.build.version.releasero.product.brandro.product.namero.product.devicero.product.model

Table 1: The file path of identifying parameters inAndroid device file system

[device%IDa,%appIDa,%…]%

Target%user%

Real%device%

Spoofed%%device% Adversary%

[device%ID

a]%

[app%usage]%

Aggrega2on%%server%

App%analy2cs%services%

Analy2cs%server%Open%AP%

appIDa%

appIDx%

Figure 4: Privacy leakage attack scenario

categories. The feature retrieves user profile from Googleserver which is identified by the Android ID. As a conse-quence of the device identity spoofing described in Section3.1, an adversary is able to access the victim’s profile on aspoofed device.

3.3 Extracting User Information from FlurryExtracting user profile from Flurry is more challenging,

as Flurry does not allow users to view or edit their Interestsprofiles. In fact, except the initial consent on the access ofresources, many smartphone users may not be aware of theFlurry’s tracking activity.

In Figure 4 we show the basic operations of our profileextraction technique. An adversary spoofs the target de-vice, identified by deviceIDa, using another Android deviceor an emulator. He then uses a bespoke app with a (legit-imate) appIDx, installed on the spoofed device, to triggera usage report message to Flurry. The analytics service isthus manipulated into believing that deviceIDa is using anew application tracked by the system. Consequently, alluser related information is made accessible to the adversarythrough audience analysis of application appIDx.

When the audience report from Flurry targets a uniqueuser, an adversary can easily extract the correspondingstatistics and link them to that single user. Similarly, the ad-versary will be able to access all subsequent changes to thisuser profile, reported at a later time. In our presented tech-nique, since we do impersonate a particular target’s deviceID, we can easily associate the target to a “blank” Flurry-monitored application.

Alternatively, an adversary can derive an individual pro-file from an aggregated audience analysis report by mon-itoring report difference before and after a target ID hasbeen spoofed (and as such has been added to the audi-ence pool). Specifically, the adversary has to take a snap-

Page 4: Information Leakage through Mobile Analytics Servicescs441/Papers/sec-010.pdf · through mobile analytics services and demonstrate the ease with which an external adversary can extract

shot of the audience analysis report Pt at time t, imperson-ates a target’s identity within his controlled Flurry-trackedapplication, and then takes another snapshot of the audi-ence analysis report at Pt+1. The target’s profile is ob-tained by extracting the difference between Pt and Pt+1,i.e. ∆(Pt, Pt+1). However in practice, Flurry service up-dates profile attributes in a weekly basis which means it willtake up to a week to extract a full profile per user.

Finally, it is important to note that Flurry provides afeature called segment to further split the app audience byapplying filters according to e.g. gender, age group and/ordeveloper defined parameter values. This feature allows anadversary to isolate and extract user profiles in a more ef-ficient way. For instance, a possible segment filter can be“only show users who have Android ID value of x” whichresults in the audience profile containing only one user.

3.4 Validation ExperimentsTo demonstrate the private information leakage through

both Google and Flurry analytics services, we design andconduct controlled experiments on a set of 44 volunteers(Andriod users, mostly researchers and university studentsfrom Australia, China, France and US). More specifically, weaim to show that it is possible to not only spoof the identityof an Android device but also to extract each user’s profile.

To aid our experiments, we have developed an Androidapp, which has been installed by the volunteers. This appcollects the devices identifiers and sends them back to ourserver. These identifiers are later used for device spoofing.

After spoofing volunteer’s device identities, we can suc-cessfully access their Google ad preference settings on ourexperimental device. We found that the majority of users inour volunteer sample are not (yet) profiled by Google (80%),7 of them have at least 1 interest category and 2 have de-mographic information, i.e. gender and age group. Interest-ingly, we observed that 2 of our volunteers have opted-outthe targeted ads.

To validate the profile extraction on Flurry, each volunteerinstalled an customized app with a unique appIDx (so thatprofiles subsequently built by Flurry can be differentiated).This application triggers a usage report message (by callingthe onStartSession function of the Flurry SDK, using it’sspecific appIDx). We are thus able to extract the profileof each volunteer from the corresponding Flurry audiencereport, as described in Section 3.3.

To validate the effectiveness of the device spoofing, we setup a second experiment using the same set of collected de-vice IDs. For each device, we initiate a new usage reportmessage to Flurry, using a different appIDx. Our assump-tion is that Flurry will associate each device identifier to theapp corresponding to appIDx. To verify this, we extract thetwo versions of profiles from Flurry for each of the collecteddevice IDs, i.e. the profile corresponding to the volunteers’real devices and to our spoofed devices. We have observedthat the two sets of profiles are identical, confirming thatFlurry cannot distinguish between the spoofed and actualreport messages.

For the Flurry profiles, we found that 84.1% of our volun-teers were already tracked by Flurry (before users install ourapplication). In addition, 56.8% of the profiles have been as-signed at least one persona tags, and 11.4% of them have anestimated gender. By following the Flurry audience report,we are able to observe how user’s application usage evolves

Profile Code avg # unique avg # unique Jaccard index Jaccard indexCategory ads (Google) ads (Flurry) vs. BL (Google) vs. BL (Flurry)Blank BL 42.5 212 0.645 0.92Books & Refs BO 106 260.5 0.32 0.704Business BU 148 219.5 0.275 0.8752Games GA 148 219.5 0.1825 0.608Media ME 166.5 220.5 0.235 0.8705Productivity PR 110 215 0.3325 0.8435Social SO 176 181.5 0.235 0.793

Table 2: Measuring ads received by different profiles

over time. Over a period of 8 weeks monitoring, we foundthat only 25% of the profiles are static, and that 50% of thevolunteers use at least one Flurry-tracked application oncea week, which triggers changes on their profiles.

4. DISTORTING USER PROFILES ANDINFLUENCING SERVED ADS

In this section, we demonstrate the second type of attackthat again uses the analytics service vulnerabilities to reducethe accuracy of analytics results and produce irrelevant tar-geted ads. The main idea is to first impersonate the targetdevice, then “pollute” the profile in mobile analytics servicesby generating artificial app usage reports. This attack has apotential to seriously disrupt the advertising ecosystem andto reduce the profit for both advertisers and ad companies.

4.1 MethodologyThe effectiveness of the attack are validated in two steps.

We first validate the premise that user’s profile is the basisfor targeting, by showing that specific profiles will consis-tently receive highly similar ads and conversely, that a dif-ference in a user’s profile will result in a mobile receivingdissimilar ads. We then perform the ad influence attack, i.e.we perturb selected profiles and demonstrate that the mod-ified profiles indeed receive in-app ads in accordance withthe profile modifications. Both sets of experiments comprisethe following steps.

Profile training and perturbing: We first train newuser profiles on a set of random generated Android IDs(Google and Flurry see these users as new users with blankprofiles), and then run a set of apps from selected inter-est categories, so that Google and Flurry receive app usagereports from these categories and build user profiles accord-ingly. To train a specific profile, we run apps from targetingcategory, e.g. business, for a period of 24 hours. Google re-sponse to the profile changes in a proximate 6 hours intervalwhile Flurry updates audience category usage to the devel-oper once a week. To perturb a profile, we run an app froma different category and tailor the number of app sessions sothat the new category becomes dominant.

Ad collection: In-app ads for both Google and Flurryare delivered via HTTP protocol. Before launching an appthat receive ads, we run tcpdump for Android on the back-ground to monitor the ad traffic. The captured traffic ispulled from the device and ads are extracted from the TCPflows every 10 minutes. Unlike Google, the captured URLsare obfuscated by the Flurry ad network. To identify ads,we follow the redirections until reaching the landing page ofthe ads. Note that we turn on the “test mode” of Flurry adsAPI to avoid ad impression and click fraud.

4.2 Validation of the TargetingMobile targeted advertising is till in early age compared to

browser-based targeting, to validate the affect of user profile

Page 5: Information Leakage through Mobile Analytics Servicescs441/Papers/sec-010.pdf · through mobile analytics services and demonstrate the ease with which an external adversary can extract

to the in-app ads, we compare the similarity of ads receivedusing different user profiles in a control environment. Wefirst select six app categories, for each category we train twoidentical profiles by installing and running apps from theselected category in two devices, denote as profile a and b.The selected categories are: Games (GA), Business (BU),Books (BO) & References, Media (ME) & Video, Productiv-ity (PR) and Social (SO). We then collect ads from all thedevices by running the ad collection apps for a period of 24hours. We also collect ads from two devices with no/blankprofile.

BLaBLbPR

aPR

bBO

aBO

bBU

aBU

bMEa

MEb

SOa

SOb

GAa

GAb

BLaBLbPRaPRbBOaBObBUaBUbMEaMEbSOaSObGAaGAb

L

M

H

(a) Google

BLaBLbBU

aBU

bMEa

MEb

PRa

PRb

SOa

SOb

BOa

BOb

GAa

GAb

BLaBLbBUaBUbMEaMEbPRaPRbSOaSObBOaBObGAaGAb

L

M

H

(b) Flurry

Figure 5: Unique ads similarity between profiles,sorted by Jaccard index v.s. blank profile. (H -high, M - moderate and L - low)

To measure the similarity between the sets of received adsin various profiles, we compute the Jaccard index J(A,B) =|A∩B||A∪B| , where A and B are two sets of unique ads. Table 2

shows an overview of the collected ads across all devices. Weobserve that even if less ads are received from the Google adnetwork, the ads are more diverse as the Jaccard index of theads between different categories and blank profiles suggeststhey are less similar compared to Flurry. Furthermore, wecompare the similarity of ads between categories, the resultsare shown in Figure 5. We can observe strong evidence oftargeting, with 1) profiles a and b in each category havinghigher (lighter colour in Figure 5) Jaccard index values and,2) profiles from different categories having a lower (darkercolour) values.

4.3 Influencing Served AdsTo demonstrate the effectiveness of the ad influencing at-

tack, we use an example of polluting the Game profileswith Business applications4. As a starting point, we usefour devices with identical profiles from the Games cate-gory: GAa, GAb, GAc and GAd. Then from a fifth device,we pollute GAa, GAb by injecting artificial usage reportsof Business applications with the spoofed device IDs. Tomake Business a dominant factor in the modified profile,we “run” significantly more sessions and longer period withartificially Business apps usage. We denote the perturbedprofiles as GABU

a and GABUb . We verify the results of per-

turbation by extracting the profiles from Flurry and Googleusing techniques described in Section 3, finding that the per-turbed profile comprise of more than 98% of Business cate-gories usage and less than 2% of the Game category usage.For Google, we find categories related to both Business and

4We successfully tested with other categories. For simplicity,we only show results based on Games and Business profiles

Games appeared in the perturbed profiles, this suggest con-verting a profile in Google requires a longer period.

GAa BU

GAb BU

BUa

BUb

GAc

GAd

GAaBU

GAbBU

BUa

BUb

GAc

GAd

L

M

H

(a) Google

GAa BU

GAb BU

BUa

BUb

GAc

GAd

GAaBU

GAbBU

BUa

BUb

GAc

GAd

L

M

H

(b) Flurry

Figure 6: Unique ads similarity before and after pro-file perturbation. (H - high, M - moderate and L -low)

To evaluate the difference in served ads, we again compute the Jaccard index between the sets of unique ads received by all the profiles (note that BU_a, BU_b, GA_a and GA_b were measured in the validation experiment). The levels of similarity are shown in Figure 6. The distortion of the Games profiles is more obvious in the Flurry ad network: we observe that the ads received by GA_a^BU and GA_b^BU have much higher similarity to Business profiles than to Games profiles, which suggests that the Flurry ad network has altered the targeting to present a significantly higher number of ads appropriate to the Business profile to the Games user. In the Google case, the ads received by the perturbed profiles and the actual Business and Games profiles exhibit moderate similarity values. However, the similarity between Business and Games ads is rather low. This suggests that the perturbed profiles, GA_a^BU and GA_b^BU, receive both Business and Games related ads. This result is consistent with the observation that both Business and Games components are found in the Google ads preferences.

5. POTENTIAL COUNTERMEASURES

There are a number of user-based solutions to avoid analytics tracking, e.g. MockDroid [4] and PDroid [1], that block information access by mobile applications (including third-party libraries embedded into apps) or alter the information reported to third parties. However, as observed in [4], these solutions may prevent specific applications from functioning properly. In addition, users may actually want to receive useful and meaningful in-app ads. User-side protection may therefore not be the appropriate solution to the problem we have presented here.

To address the privacy concerns raised by third-party tracking, both Android [2] and iOS [3] deprecate permanent unique identifiers (Android ID for Android and UUID for iOS) in their new software releases; instead, a user-specific and unique "advertising identifier" is provided for tracking purposes. This change allows users to reset their advertising identifier, in a similar way as browser cookies may be deleted on desktop computers. However, to the analytics company, every "reset" of the advertising ID creates an additional identity in their database; the user's previous history becomes useless, which also results in inaccurate audience estimation for both advertisers and developers. Therefore, there is a strong incentive for analytics companies to find a replacement identifier, and in fact, in the mobile environment, there are a number of alternatives, for example the MAC address, the IMEI, or a fingerprint of the device. We note that such changes will not affect the identity spoofing attacks if these identifiers are exposed to the attacker.
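A replacement identifier of the kind mentioned above can be illustrated as a hash over stable device attributes; the attribute names and values below are hypothetical, and a real fingerprint would combine many more signals.

```python
import hashlib

def device_fingerprint(attributes):
    """Derive a stable identifier by hashing canonically ordered device attributes."""
    canonical = "|".join(f"{k}={v}" for k, v in sorted(attributes.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

device = {"mac": "02:00:00:aa:bb:cc", "imei": "356938035643809", "model": "Nexus 4"}
print(device_fingerprint(device)[:16])  # stable across runs for the same attributes
```

Note that resetting the advertising ID does not change such a fingerprint, which is exactly why it can serve as a replacement identifier for tracking.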

From an analytics service provider's perspective, it may be possible to prevent individual profile extraction by ensuring the anonymity level of audience reports. For instance, analytics should produce reports only when the audience is composed of a minimum of k significantly different profiles, similar to what is done in the Facebook audience estimation platform [5]. On the other hand, addressing the identity spoofing vulnerability may be more challenging in practice. As a first step, any message containing the device identifier should be protected. Google does take this into consideration by hashing the device IDs. However, as mentioned earlier, Google cannot successfully protect the device ID in isolation; all other ad and analytics libraries need to do likewise.
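The provider-side mitigation above can be sketched as a k-anonymity gate: a report is released only if the matching audience contains at least k distinct profiles. The threshold value and the profile representation are illustrative assumptions.

```python
K_MIN = 10  # assumed minimum number of distinct profiles in a released report

def release_report(audience_profiles, k=K_MIN):
    """Release an audience report only if it covers at least k distinct profiles."""
    distinct = {tuple(sorted(p.items())) for p in audience_profiles}
    if len(distinct) < k:
        return None  # suppress the report: audience too small or too uniform
    return {"audience_size": len(audience_profiles),
            "distinct_profiles": len(distinct)}

small = [{"gender": "m", "age": "25-34"}] * 50  # 50 users but a single profile
print(release_report(small))  # None: report suppressed
```

With such a gate, an attacker who narrows the audience filter down to one spoofed device ID receives no report at all.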

protocol   onStartSession             getAds                      total/hour
           latency      bandwidth     latency      bandwidth      latency         bandwidth
HTTP       160±1 ms     422 B         160±1 ms     340±2 B        4,400±380 ms    9,425±731 B
HTTPS      800±5 ms     3,288 B       800±5 ms     2,000±269 B    8,200±950 ms    390,645±36,611 B

Table 3: HTTP vs. HTTPS communication cost of analytics and ads traffic (Flurry)

As the next level of protection, let us assume that all communications between the user device and the analytics server are secured. In fact, the Flurry API allows the developer to use SSL for this purpose, although this feature is turned off by default. The API documentation5 suggests that additional loading time may be introduced if SSL authentication/encryption is used, due to extra handshake and communication costs. We take the Flurry API as an example to evaluate the communication cost, in terms of latency and bandwidth, for both tracking and ads traffic using the Flurry HTTP and HTTPS methods. The results are shown in Table 3. We found that the response time is on average 5 times longer using HTTPS than HTTP, and that the extra handshakes introduce on average 7 times more bandwidth. If we run the app for 1 hour with the default ad fetching interval, as defined by the Flurry API, HTTPS consumes 390.6 kB of bandwidth, while HTTP consumes only 9.4 kB. These results suggest that enforcing SSL usage would lead to significantly higher processing and communication costs for the user device, and that this is certainly the case for Flurry, given that they claim to be handling as much as 3.5 billion app sessions per day6. Nonetheless, even though protecting the communication through encryption would prevent the adversary from collecting the device ID, it would still not prevent identity spoofing if the ID was known through a different process, e.g. from malicious apps.
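A measurement of the kind summarised in Table 3 can be sketched as a small timing helper; here, dummy in-process calls stand in for the actual HTTP/HTTPS requests to the Flurry endpoints, which are not reproduced.

```python
import time

def mean_latency_ms(request_fn, runs=5):
    """Average wall-clock latency of a request function, in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        request_fn()
        samples.append((time.perf_counter() - start) * 1000)
    return sum(samples) / len(samples)

# Dummy stand-ins for the onStartSession call over HTTP vs. HTTPS;
# a real comparison would issue the actual network requests.
def fake_http():  time.sleep(0.001)
def fake_https(): time.sleep(0.005)  # models the extra TLS handshake cost

print(mean_latency_ms(fake_http), mean_latency_ms(fake_https))
```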

A conventional security solution to mitigate identity spoofing attacks is to rely on a public key infrastructure, and to use certificates and digital signatures to authenticate messages and devices. Beyond the extra communication and processing cost, users may be reluctant to authenticate to services that they may not even be aware of, and such an infrastructure requires an industry-wide effort to implement. How to efficiently authenticate mobile devices while protecting user privacy during the tracking process across applications remains an open question.

5Accessible through http://goo.gl/sB9qs, June 13, 2013
6Accessible through http://goo.gl/XnTyC, Oct 17, 2013
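As a simplified stand-in for the per-device signatures discussed above, the sketch below authenticates usage reports with an HMAC over a shared per-device key; a full PKI deployment would use public-key certificates and asymmetric signatures instead, and all names and key material here are hypothetical.

```python
import hashlib
import hmac

def sign_report(device_key: bytes, report: bytes) -> str:
    """Attach an HMAC tag so the server can reject spoofed usage reports."""
    return hmac.new(device_key, report, hashlib.sha256).hexdigest()

def verify_report(device_key: bytes, report: bytes, tag: str) -> bool:
    """Constant-time check that the report was produced by the key holder."""
    return hmac.compare_digest(sign_report(device_key, report), tag)

key = b"per-device-secret"  # hypothetical key provisioned to one device
report = b"device_id=abc123&category=Games&sessions=5"
tag = sign_report(key, report)

print(verify_report(key, report, tag))                       # True
print(verify_report(key, b"device_id=abc123&spoofed", tag))  # False: spoofed report rejected
```

An attacker who merely knows the device ID (but not the key) can no longer inject usage reports of the kind used in the profile pollution attack of Section 4.3.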

6. CONCLUSION

In this paper, we presented and validated the methodology used to demonstrate the leakage of users' personal information through mobile analytics services. We show this leakage for two major analytics services, Google Mobile App Analytics and Flurry, and additionally demonstrate how a malicious attacker can use it to disrupt the advertising ecosystem. Although the recent modifications in the Android and iOS systems to remove permanent unique identifiers may result in a change in the way analytics companies track users (necessitating the use of a different permanent ID, e.g. a device fingerprint), these changes would not affect the described attacks if such identifiers are exposed.

7. REFERENCES

[1] PDroid – the better privacy protection, December 2011. http://www.xda-developers.com/android/pdroid-the-better-privacy-protection/.

[2] Android "KitKat" update – new privacy features, November 2013. http://www.futureofprivacy.org/2013/11/15/android-kitkat-update-new-privacy-features/.

[3] Using identifiers in your apps, March 2013. https://developer.apple.com/news/?id=3212013a.

[4] A. R. Beresford, A. Rice, N. Skehin, and R. Sohan. MockDroid: trading privacy for application functionality on smartphones. In HotMobile, 2011.

[5] T. Chen, A. Chaabane, P.-U. Tournoux, M. A. Kaafar, and R. Boreli. How Much is too Much? Leveraging Ads Audience Estimation to Evaluate Public Profile Uniqueness. In PETS'13, 2013.

[6] W. Enck, P. Gilbert, B.-G. Chun, L. P. Cox, J. Jung, P. McDaniel, and A. N. Sheth. TaintDroid: an information-flow tracking system for realtime privacy monitoring on smartphones. In Proc. of the 9th USENIX Symposium on OSDI, 2010.

[7] W. Enck, D. Octeau, P. McDaniel, and S. Chaudhuri. A Study of Android Application Security. In Proceedings of the 20th USENIX Conference on Security, SEC'11, 2011.

[8] A. P. Felt, H. J. Wang, A. Moshchuk, S. Hanna, and E. Chin. Permission Re-Delegation: Attacks and Defenses. In Proc. of the 20th USENIX Security Symposium, 2011.

[9] M. C. Grace, W. Zhou, X. Jiang, and A.-R. Sadeghi. Unsafe Exposure Analysis of Mobile In-app Advertisements. In WiSec, 2012.

[10] S. Han. A Study of Third-Party Tracking by Mobile Apps in the Wild. Technical report UW-CSE-12-03-01, University of Washington, 2012.

[11] I. Leontiadis, C. Efstratiou, M. Picone, and C. Mascolo. Don't kill my ads!: balancing privacy in an ad-supported mobile application market. In HotMobile, 2012.
