Date post: 19-Jun-2020

Privacy on the Web

Li Xiong

Department of Mathematics and Computer Science

Emory University

Definitions of Privacy

The right to be let alone (1890s, Brandeis, future US Supreme Court Justice)

a: The quality or state of being apart from company or observation; b: freedom from unauthorized intrusion (Merriam-Webster)

The right of the individual to be protected against intrusion into his personal life or affairs, or those of his family, by direct physical means or by publication of information (Calcutt committee, UK)

Aspects of Privacy

Information privacy
Bodily privacy
Privacy of communications
Territorial privacy

Information Privacy

Establishment of rules governing the collection and handling of personal data

Data about individuals should not be automatically available to other individuals and organizations

The individual must be able to exercise a substantial degree of control over that data and its use.

Information privacy on the web

Large amounts of (personal) data are collected on the web
  Search engine logs
  Personal data and blogs on social network sites
  …

The data are of great value for both individuals and our society.

The data also pose a significant threat to individuals’ privacy.

Data privacy on the web – some case studies

A comparison of privacy practices of Internet service companies
Query log retention and its privacy implications
Information revelation patterns and their privacy implications on social network sites

A race to the bottom: privacy ranking of Internet service companies

Privacy International, 2007

Studied and ranked the privacy practices of key Internet-based companies

Amazon, AOL, Apple, BBC, eBay, Facebook, Friendster, Google, LinkedIn, LiveJournal, Microsoft, MySpace, Skype, Wikipedia, LiveSpace, Yahoo!, YouTube

A Race to the Bottom: Methodologies

Corporate administrative details
Data collection and processing
Data retention
Openness and transparency
Customer and user control
Privacy enhancing innovations and privacy invasive innovations


A race to the bottom: interim results revealed


Why Google

Retains a large quantity of information about users, often for an unstated or indefinite length of time, without clear limitation on subsequent use or disclosure

Maintains records of all search strings with associated IP and time stamps for at least 18-24 months

Additional personal information from user profiles in Orkut

Uses an advanced profiling system for ads

Data privacy on the web – some case studies

A comparison of privacy practices of Internet service companies
Query log retention and its privacy implications
Information revelation patterns and their privacy implications on social network sites

Query Log

AnonID  Query                 QueryTime            ItemRank  ClickURL
217     lottery               2006-03-01 11:58:51  1         http://www.calottery.com
217     lottery               2006-03-27 14:10:38  1         http://www.calottery.com
1268    gall stones           2006-05-11 02:12:51
1268    gallstones            2006-05-11 02:13:02  1         http://www.niddk.nih.gov
1268    ozark horse blankets  2006-03-01 17:39:28  8         http://www.blanketsnmore.com

(Source: AOL Query Log)

A Face Is Exposed for AOL Searcher No. 4417749

20 million Web search queries released by AOL

User 4417749
  “numb fingers”, “60 single men”
  “dog that urinates on everything”
  “landscapers in Lilburn, Ga”
  Several people’s names with the last name Arnold
  “homes sold in shadow lake subdivision gwinnett county georgia”

Thelma Arnold, a 62-year-old widow who lives in Lilburn, Ga., frequently researches her friends’ medical ailments and loves her dogs

Privacy Risks of Query Log

Accidental or malicious disclosure
  Disclosure of information that users intended to keep private, or that may harm them when released
Compelled disclosure to third parties
  Query logs may be subject to subpoena as part of civil litigation between individuals or organizations
Disclosure to the government
  Query logs may be subject to government demands in the context of law enforcement or intelligence investigations
Misuse of user profiles
  The retention of query logs may allow the creation of detailed profiles of individuals’ interests, preferences, and behaviors.

Query Log Retention Rationale

Improving ranking algorithms and quality of search results
Language-based applications such as spelling correction
Query refinement
Personalization
Combating fraud and abuse
Sharing data for academic research
Sharing data for marketing and other commercial purposes

Query log retention

Analyze potential techniques/practices for query log retention
  how well the technique protects privacy
  how well the technique preserves the utility of the query logs
  how well the technique might be implemented as a user control (a mechanism that allows users to choose to apply the technique)

1. Log Deletion

Erase users’ complete query logs
  Deletion may occur as early as when the search engine returns search results to the user.
Privacy: the most privacy-enhancing technique available
Utility: drops to zero once the logs are erased
  If query logs are kept longer before erasure, the search engine can gain some of the benefits of log analysis and storage
User control: straightforward; users either have their logs retained or deleted

2. Hashing queries

Two approaches
  Hash the entire query, so that the resulting log contains a single hash value
  Tokenize the query and hash each token, resulting in a set of hash values
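The two approaches can be sketched as follows. This is a minimal illustration, not the method of any particular search engine: the choice of SHA-256, the lowercase normalization, whitespace tokenization, and the function names are all assumptions made for the example.

```python
import hashlib


def hash_text(text: str) -> str:
    """Return the SHA-256 hex digest of a normalized string (assumed scheme)."""
    normalized = text.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def hash_whole_query(query: str) -> str:
    """Approach 1: the log stores a single hash value for the entire query."""
    return hash_text(query)


def hash_query_tokens(query: str) -> list[str]:
    """Approach 2: split the query on whitespace and hash each token."""
    return [hash_text(token) for token in query.split()]


if __name__ == "__main__":
    print(hash_whole_query("ozark horse blankets"))
    print(hash_query_tokens("ozark horse blankets"))
```

With whole-query hashing, "gall stones" and "gallstones" produce unrelated values; with token hashing, queries that share a word also share that word's hash, which keeps more utility and also leaks more structure.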

2. Hashing queries (privacy protection)

(√ = protected, X = not protected, √ X = partly protected)

√ Profile misuse: the user’s interests and behaviors are no longer available
√ X Government and √ X Third party: cannot ask for all queries associated with a particular user, IP address, or cookie ID, but can still inquire about particular query terms
√ Malicious disclosure: hashed sensitive information is hard to reverse
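The point that a requester can still inquire about particular query terms follows from hashing being deterministic: anyone who knows the hash function can hash a term of interest and look for it in the log. A minimal sketch, assuming an unsalted SHA-256 over the lowercased query; the log contents and identifiers are hypothetical.

```python
import hashlib


def hash_query(query: str) -> str:
    """Unsalted SHA-256 of the lowercased query (assumed hashing scheme)."""
    return hashlib.sha256(query.strip().lower().encode("utf-8")).hexdigest()


# Hypothetical hashed log: (identifier, hashed query) pairs.
hashed_log = [
    ("user-a", hash_query("lottery")),
    ("user-b", hash_query("gallstones")),
    ("user-a", hash_query("ozark horse blankets")),
]


def who_searched_for(term: str, log) -> set:
    """Hash the term of interest and return identifiers with a matching entry."""
    target = hash_query(term)
    return {identifier for identifier, digest in log if digest == target}


if __name__ == "__main__":
    print(who_searched_for("lottery", hashed_log))
```

The reverse direction is harder: listing everything "user-a" searched for yields only hash values, which must be guessed term by term.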

2. Hashing queries (utility)

Viable rationales: ranking, anti-fraud, language-based applications, query refinement
  Viable with token-based hashing, or if the search engine retains aggregate statistics about queries (or some subset of queries) in an unhashed form.
Impeded rationales: personalization, some academic, commercial
  These rely on tying query logs to particular users.

2. Hashing queries (user control)

Straightforward
  The technique’s effectiveness in protecting privacy may actually increase for those who choose to adopt it if not all individuals make use of it (because reverse-engineering attacks rely on statistics).


3. Identifier Deletion

Identifier: IP address, cookie IDs
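A minimal sketch of identifier deletion on a single log record. The record layout and field names are hypothetical. The `truncate_ipv4` variant is an assumption about what a "partial" deletion might look like (dropping the last octet of the IP address), not something the slides specify.

```python
import ipaddress
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical query log record."""
    ip_address: Optional[str]
    cookie_id: Optional[str]
    query: str
    query_time: str


def delete_identifiers(record: LogRecord) -> LogRecord:
    """Full deletion: drop the IP address and cookie ID, keep the query data."""
    return replace(record, ip_address=None, cookie_id=None)


def truncate_ipv4(ip: str) -> str:
    """Partial deletion (assumed variant): zero the last octet of an IPv4 address."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)


if __name__ == "__main__":
    record = LogRecord("192.0.2.10", "cookie-123", "gallstones", "2006-05-11 02:13:02")
    print(delete_identifiers(record))
    print(truncate_ipv4("192.0.2.10"))
```

Timestamps and query content survive deletion, so analyses that do not need to tie queries to a user remain possible.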

3. Identifier Deletion (privacy protection)

√ Profile misuse: any user profile based on a remaining partial identifier is unlikely to be correlated with specific individuals, computers, or browsers
√ Government and √ Third party: difficult to request all query logs corresponding to a specific user, IP address, or cookie ID
X Malicious disclosure: not protected if users query their own personal information

3. Identifier Deletion (utility)

Viable rationales: language-based applications, query refinement
  Those analyzing the data can use other query information (e.g., timestamps and query content).
Impeded rationales: ranking, anti-fraud, personalization, academic, commercial
  These rely on tying query logs to particular users.


4. Hashing Identifiers

Identifier: IP address, cookie IDs
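A minimal sketch of identifier hashing, assuming an unsalted SHA-256 over the IP address; the log entries and field names are hypothetical. It shows the two properties that matter for the analysis of this technique: one user's queries stay linked under the same hashed value, and anyone who already holds the original identifier can recompute the hash and pull that user's queries.

```python
import hashlib


def hash_identifier(identifier: str) -> str:
    """Replace an identifier with its SHA-256 hex digest (assumed scheme)."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()


# Hypothetical raw log entries.
raw_log = [
    {"ip": "192.0.2.10", "query": "lottery"},
    {"ip": "198.51.100.7", "query": "gallstones"},
    {"ip": "192.0.2.10", "query": "landscapers"},
]

# The retained log keeps queries in the clear and only hashes the identifier.
hashed_log = [
    {"user": hash_identifier(entry["ip"]), "query": entry["query"]}
    for entry in raw_log
]


def queries_for(known_ip: str, log) -> list:
    """A party that knows the IP address recomputes the hash and filters the log."""
    target = hash_identifier(known_ip)
    return [entry["query"] for entry in log if entry["user"] == target]


if __name__ == "__main__":
    print(queries_for("192.0.2.10", hashed_log))
```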

4. Hashing Identifiers (privacy protection)

X Profile misuse: it is still possible to link a single user’s queries together
X Government and X Third party: not protected if civil litigants or government authorities possess the input to the hash for a particular user (e.g., the user’s IP address and/or cookie ID)
X Malicious disclosure: not protected if users query their own personal information

4. Hashing Identifiers (utility)

Viable rationales: ranking, language-based applications, query refinement, personalization, academic, commercial
  Hashing preserves the correlation between individual users and their queries.
Impeded rationales: anti-fraud
  It is difficult to recover a fraudulent user’s external identifiers just by looking at the query logs.

5. Scrubbing Query Content

Remove identifying information
  Phone numbers, Social Security numbers, credit card numbers, addresses, and names
Must distinguish identifying information from the remainder of the query content [Xiong and Agichtein 2007]
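A minimal sketch of pattern-based scrubbing. It covers only two of the listed categories, US-style phone numbers and Social Security numbers, using regular expressions chosen for this example. Names and addresses do not follow fixed patterns, which is exactly why distinguishing identifying information from the rest of the query is the hard part; this sketch does not attempt it.

```python
import re

# Assumed formats: SSN as ddd-dd-dddd, phone as ddd-ddd-dddd (also "." or space).
PATTERNS = [
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("[PHONE]", re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b")),
]


def scrub_query(query: str) -> str:
    """Replace each recognized identifying pattern with a placeholder."""
    for placeholder, pattern in PATTERNS:
        query = pattern.sub(placeholder, query)
    return query


if __name__ == "__main__":
    print(scrub_query("call 404-555-0100 about ssn 123-45-6789"))
    print(scrub_query("ozark horse blankets"))
```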

5. Scrubbing Query Content (privacy protection)

X Profile misuse: all the other information of value (the searcher’s interests, tastes, and behaviors) remains available
X Government and X Third party: queries can still be tied to an individual via an identifier
√ Malicious disclosure: but it is still possible to identify an individual with an outside source

5. Scrubbing Query Content (utility)

Viable rationales: ranking, language-based applications, query refinement, anti-fraud, personalization, academic, commercial
  Scrubbing preserves the correlation between individual users and their queries.
Impeded rationales: only those with the specific purpose of analyzing identifying information

5. Scrubbing Query Content (user control)

Two ways
  Scrub everything the search engine deems to be identifying
  User defined

6. Deleting Infrequent Queries

Remove queries that appear infrequently
  The vast majority of queries occur a small number of times [Beitzel et al. 2004; Spink et al. 2001]
Queries that are infrequent right now might not be infrequent forever (new professional athletes or celebrities, new slang, new product names, etc.)
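A minimal sketch of frequency-based deletion. The threshold `min_count` and the lowercase normalization are assumptions; choosing the threshold is the open question, since a query that is rare today may become common later.

```python
from collections import Counter


def delete_infrequent(queries, min_count=2):
    """Keep only queries whose normalized form occurs at least min_count times."""
    counts = Counter(q.strip().lower() for q in queries)
    return [q for q in queries if counts[q.strip().lower()] >= min_count]


if __name__ == "__main__":
    log = ["lottery", "Lottery", "gallstones", "lottery", "ozark horse blankets"]
    print(delete_infrequent(log, min_count=2))
```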

6. Deleting Infrequent Queries (privacy protection)

X Profile misuse: all the other information of value (the searcher’s interests, tastes, and behaviors) remains available
X Government and X Third party: queries can still be tied to an individual via an identifier
√ Malicious disclosure: less identifying information is available, but it is still possible to identify an individual with an outside source

6. Deleting Infrequent Queries (utility)

Viable rationales: ranking, anti-fraud, commercial
  These are based in part on the value of analyzing popular or high-volume queries.
Impeded rationales: language-based applications, query refinement, personalization, academic
  These depend on recognizing rare queries and learning how to offer the searcher suggestions or make adjustments behind the scenes.

6. Deleting Infrequent Queries (user control)

Difficult
  Hard to define what counts as an infrequent query

7. Shortening session

Shorten the length of time that any identifier is associated with an individual [Xiong, 2007].
  A user may be assigned a new identifier every month, day, or hour
  Or when users close their browsers or navigate away from the search engine’s site.
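A minimal sketch of time-based identifier rotation, assuming the logged identifier is derived from a stable identifier plus the index of the current time window. The key, the HMAC construction, and the function name are hypothetical. For the sessions to be truly unlinkable afterwards, the stable identifier and the key would also have to be discarded, which this sketch does not show.

```python
import hashlib
import hmac
from datetime import datetime, timezone

SECRET_KEY = b"hypothetical-secret-key"  # placeholder value for the example


def session_identifier(stable_id: str, when: datetime, window_seconds: int = 3600) -> str:
    """Derive an identifier that changes every window_seconds (hourly by default)."""
    window_index = int(when.timestamp()) // window_seconds
    message = f"{stable_id}|{window_index}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    t1 = datetime(2006, 3, 1, 11, 58, 51, tzinfo=timezone.utc)
    t2 = datetime(2006, 3, 1, 14, 10, 38, tzinfo=timezone.utc)
    print(session_identifier("cookie-123", t1))
    print(session_identifier("cookie-123", t2))
```

Queries inside one window share an identifier and can still be linked; queries in different windows get unrelated identifiers.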

7. Shortening session (privacy protection)

√ Profile misuse: profiles could only be based on data from a narrow window
√ Government and √ Third party: shorter sessions can remove the link between a user and his or her entire query history
X Malicious disclosure: the query content may still contain identifying information, and queries within the same session may still be linked together

7. Shortening session (utility)

Viable rationales: ranking, language-based applications, query refinement, academic
  These are based on queries that occur in close proximity to each other.
Impeded rationales: personalization, anti-fraud, commercial
  These are based on sharing historical profiles of users.

7. Shortening session (user control)

Easy
  Provide the option of clearing identifiers stored on the search engine’s side
  Managing which users have requested shorter sessions and when their sessions expire may be expensive


Conclusion

It is possible to collect/retain query logs while protecting user privacy

Technical challenges to strike the balance between retaining query log utility and protecting user privacy

Policy and user education challenges

Data privacy on the web – some case studies

A comparison of privacy practices of Internet service companies
Query log retention and its privacy implications
Information revelation patterns and their privacy implications on social network sites

Motivation

Mass adoption
  Number of online social networking sites has increased
  Dramatic increase of online network participants each year
Information revelation behavior of participants
  More open than offline social networks

Online Vs. Offline Networks

Offline social networks contain diverse relations.
  Examples – Family, Friend, Co-Worker, Roommate, Acquaintance, Classmate, Teammate, Enemy, etc.
Online social networks reduce relations to simplistic binary relations such as “Friend or not”.
  How does someone qualify as a “Friend or not”? What is the measurement?
  Most users tend to list as a Friend anyone whom they know and do not actively dislike.


Online Vs. Offline Networks

An offline social network may include up to a dozen intimate or significant ties and 1000 to 1700 “acquaintances” or “interactions”.

Online social networks can list hundreds of direct “friends” and include hundreds of thousands of additional “friends” within just three degrees of separation from a subject.
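The jump from hundreds of direct friends to hundreds of thousands within three degrees is plain geometric growth. A rough sketch, assuming every user lists the same number of friends and that friend lists never overlap; the result is a loose upper bound, and real networks overlap heavily, so actual reach is lower.

```python
def reachable_upper_bound(friends_per_user: int, degrees: int) -> int:
    """Upper bound on profiles within the given number of degrees of separation.

    Assumes a fixed friend count per user and no overlap between friend lists.
    """
    return sum(friends_per_user ** d for d in range(1, degrees + 1))


if __name__ == "__main__":
    # With an assumed 100 friends per user: 100 + 100**2 + 100**3 profiles.
    print(reachable_upper_bound(100, 3))
```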

Online Social Networks - Privacy Implications

1. The level of identifiability of the information

2. The possible recipients of the information

3. The possible uses of the information

Online Social Networks - Privacy Implications

1. Level of identifiability
  Sites that don’t expose user identity may provide enough information to identify the profile’s owner
  Examples:
    Face re-identification through photos used across different sites
    Demographic data
    Category-based representations of interests that reveal unique or rare overlaps of hobbies or tastes
  Information revelation (two possibilities)
    Identifying an “anonymous” profile through previous knowledge of the profile owner’s characteristics or traits (identity disclosure)
    Allowing a party to infer previously unknown characteristics or traits about an identified profile (attribute disclosure)

Online Social Networks - Privacy Implications

2. Possible recipients – who has access to the profile information?
  Hosting site / company
  The site’s social network (in some cases site visitors)
  Hackers
  Government agencies

Online Social Networks - Privacy Implications

3. Possible uses – how can social network profile information be used?
  Dependent upon information provided (may be extensive and intimate in some cases)
  Possible uses (risks)
    Identity theft
    Online/physical stalking
    Embarrassment
    Blackmail

Analysis - The Facebook.com

Gross and Acquisti, 2005

In June 2005, the authors searched for all “female” and all “male” profiles for CMU Facebook members using Facebook’s advanced search feature and extracted their profile IDs.

Using the extracted IDs, they downloaded a total of 4540 profiles – virtually the entire CMU Facebook population at the time of the study.


The Facebook.com Demographics


The Facebook.com Types and Amounts of Information Disclosed

In general, CMU Facebook members provided large amounts of information.
  90.8% of profiles contained an image.
  87.8% revealed their birth date.
  39.9% listed a phone number.
  50.8% listed their current residence.
  62.9% listed their relationship status.

Across most categories, the amount of information revealed by female and male users was similar. A notable exception was the phone number, disclosed by substantially more male than female users (47.1% vs. 28.9%).


The Facebook.com Data Validity

Names were manually categorized as one of the following.
  Real Name – Name appears to be real (for example, it can be matched to the visible CMU e-mail address provided at login).
  Partial Name – Only a first name is given.
  Fake Name – Obviously fake name.


The Facebook.com Data Identifiability

The same evaluation was repeated for Friendster, where the profile name is only the first name of the member (which makes Friendster profiles not as identifiable as Facebook profiles).

The Facebook.com Data Identifiability

Friends networks can also contribute to data validity and identifiability, since adding a friend requires explicit confirmation.

Facebook users typically maintain a very large network of friends.
  On average, CMU Facebook members list 78.2 friends at CMU and 54.9 friends at other schools.

The Facebook.com Privacy Implications

The population of Facebook users studied is oblivious, unconcerned, or pragmatic about their personal privacy
  Benefit of selectively revealing data to strangers may appear larger than the perceived costs of privacy invasions.
  Peer pressure or herding behavior.
  Incomplete information about possible privacy implications.
  Service’s user interface may drive unchallenged acceptance of default privacy settings.

Users may put themselves at risk for a variety of attacks on their physical or online persona.
  Personal data is generously provided.
  Limiting privacy preferences are sparingly used.

The Facebook.com Privacy Implications

Stalking

Potential adversary (with an account at the same institution) can determine the likely physical location of the user for large portions of the day based on profile information about
  residence location
  class schedule
  location of last login.

The Facebook.com Privacy Implications

Re-identification
  Demographics
    45.8% list birthday, gender, and current residence. These can be linked to outside, de-identified data sources such as hospital discharge data.
  Face re-identification
  Social Security numbers
    Hometown and birth date can be used to estimate the first three and middle two digits of a Social Security number.
    Possible to obtain the last four digits (often used in unprotected logins and passwords) through social engineering.
Identity theft
  Majority of profiles contain current phone number and residence, which are often used for verification by financial institutions.
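The demographic linkage can be sketched as an exact-match join on birthday, gender, and residence. All records, field names, and values below are invented for illustration. The sketch attaches a name only when exactly one profile matches, since an ambiguous match does not re-identify anyone.

```python
from collections import defaultdict

QUASI_IDENTIFIERS = ("birth_date", "gender", "residence")


def link_records(profiles, deidentified_records):
    """Attach a name to each de-identified record that matches exactly one profile."""
    by_key = defaultdict(list)
    for profile in profiles:
        key = tuple(profile[field] for field in QUASI_IDENTIFIERS)
        by_key[key].append(profile["name"])

    linked = []
    for record in deidentified_records:
        key = tuple(record[field] for field in QUASI_IDENTIFIERS)
        names = by_key.get(key, [])
        if len(names) == 1:  # only an unambiguous match re-identifies the record
            linked.append({**record, "name": names[0]})
    return linked


# Hypothetical public profiles and a hypothetical "de-identified" data set.
profiles = [
    {"name": "Student A", "birth_date": "1985-02-14", "gender": "F", "residence": "Dorm 1"},
    {"name": "Student B", "birth_date": "1984-09-30", "gender": "M", "residence": "Dorm 2"},
]
records = [
    {"birth_date": "1985-02-14", "gender": "F", "residence": "Dorm 1", "diagnosis": "example"},
    {"birth_date": "1983-01-01", "gender": "M", "residence": "Dorm 3", "diagnosis": "example"},
]

if __name__ == "__main__":
    print(link_records(profiles, records))
```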


The Facebook.com Privacy Implications

Digital Dossier

Privacy implications of revealing personal information may extend beyond their immediate impact, which can be limited.

With low and decreasing costs of storing digital information, it is possible to monitor and record the evolution of the network and its users’ profiles, thereby building a digital dossier for its participants.

Users may not be concerned about the visibility of personal information now, but may be later when the data could still be available.


Conclusions

Online social networks are both vaster and looser than their offline counterparts.
  Possible for a profile to be connected to thousands of other profiles through the network’s ties.
In the study of CMU users of Facebook
  Individuals’ willingness to provide large amounts of personal information has been quantified.
  The study shows how unconcerned its users appear to be about privacy risks: personal data is generously provided, and limiting privacy preferences are hardly used.
  Based on the information they provide online, users expose themselves to various physical and cyber risks.


Remember, they are always watching … what can we do?

Some advice from privacy campaigners

Use cash when you can.
Do not give your phone number, social-security number or address, unless you absolutely have to.
Do not fill in questionnaires or respond to telemarketers.
Demand that credit and data-marketing firms produce all information they have on you, correct errors and remove you from marketing lists.
Check your medical records often.
Block caller ID on your phone, and keep your number unlisted.
Never leave your mobile phone on; your movements can be traced.
Do not use store credit or discount cards.
If you must use the Internet, encrypt your e-mail, reject all “cookies” and never give your real name when registering at websites.
Better still, use somebody else’s computer.

Privacy Protection Technologies

Access control
De-identification/anonymization
Statistical databases and inference control
Data perturbation
Cryptographic protocols


Thank you

