
CSIAC - Social Media Analysis and Privacy

Page 1: CSIAC - Social Media Analysis and Privacy

Unclassified // Public Release

153 Brooks Road, Rome, NY | 315.336.3306 | http://ainfosec.com

Joshua White

[email protected]

Senior Computer Engineer, Assured Information Security

http://ainfosec.com

PhD Student of Engineering Science, Clarkson University

Date: Oct 31, 2012

Release: Unclassified // Public

Social Media Analysis and Privacy

Copyright 2012 Assured Information Security, Inc.

Page 2: CSIAC - Social Media Analysis and Privacy

About: Company

AIS (Assured Information Security)

Research and Development of technologies and capabilities to support effective operations within the entirety of the cyber domain.

Leading pioneers in the disciplines of Information Operations including Network Operations, Electronic Warfare, and Computer Network Operations of all types.

Located In:

Rome NY (Corporate Headquarters)

Portland OR

Baltimore MD

Beavercreek OH

San Antonio TX

Colorado Springs, CO

Page 3: CSIAC - Social Media Analysis and Privacy

About: Speaker

Joshua White

Education:

AAS Computer Network Technology (FLCC)

BS / MS Telecommunications (SUNYIT)

PhD Student of Engineering Science (Clarkson University)

Experience:

7+ years Government Contracting in Information Security and Telecommunications Engineering

Areas of Study:

Intrusion Detection Systems

Optical Network Security

Large Dataset Analysis

Distributed Processing

Page 4: CSIAC - Social Media Analysis and Privacy

Overview: The Big Questions

Introduction

The Big Research Questions:

What are social media networks?

What is the privacy problem relating to them?

Who would want this data and why?

What rights of privacy must I protect?

What regulations regarding privacy exist?

What happens if I don't protect the privacy?

Conclusions

References

Page 5: CSIAC - Social Media Analysis and Privacy

What are social media networks?

Page 6: CSIAC - Social Media Analysis and Privacy

Definition

Social Media Networks

DHS identified multiple categories [1]

Search

Video

Maps

Photos

Blog aggregates

Micro-blogs

Traditional social networks

Page 7: CSIAC - Social Media Analysis and Privacy

What's the privacy problem as it relates to these social media networks?

Page 8: CSIAC - Social Media Analysis and Privacy

Problem

Two-Part Problem

End Users

Unsure or unaware of how to properly protect their privacy

Data Collectors

Don't know how to properly maintain the privacy of their datasets

Page 9: CSIAC - Social Media Analysis and Privacy

Problem

Social Media-Networking Sites

Provide a communications method thought by many to be at least somewhat private

Many never change the default security settings associated with their accounts

Example: Percentage of Facebook users by age who change their account security settings to anything other than the default (no security) setting [2]

18-19 years old = 71%

30-39 years old = 67%

50-64 years old = 55%

80% of all users fall within the 18-64 age range

Estimated 20+ million users have no security but must still have a basic expectation of privacy

Provides the largest “Social Network” datasets available for study

Page 10: CSIAC - Social Media Analysis and Privacy

Who would want this data and why?

Page 11: CSIAC - Social Media Analysis and Privacy

Problem Focus: Data Collectors

The problem of user expectations and knowledge of privacy settings is for another discussion

Let's focus on the “larger” problem

Data Collection

What can we collect?

What can we do with the data?

How must we protect the privacy of an individual’s PII contained within the datasets?

Page 12: CSIAC - Social Media Analysis and Privacy

Social Media Networks Awareness

Benefits:

Government

Track locations of persons of interest with reasonable accuracy

“Bad guys” may have protected posts

Sometimes accessible by simply looking at their friend’s posts, or even other sites that they have allowed access within their accounts

Track trends

Who said what, who repeated it?

Is it going to cause a riot or worse yet, a war?

News before “official” reports

Natural disasters, shootings, etc

Page 13: CSIAC - Social Media Analysis and Privacy

Social Media Networks Awareness

Benefits:

Businesses

Directed advertising

Track locations of consumers with reasonable accuracy

Track buying habits and interests

Track trends

Who said what, who repeated it, is something going to affect a brand?

News before “official” reports

Did something happen that will affect the market rapidly?

Natural disasters, news reports, etc

Page 14: CSIAC - Social Media Analysis and Privacy

Social Media Networks Awareness

Benefits:

Academia

Research

Track locations of subjects with reasonable accuracy

Track habits, interests and moods over time

Track trends

Who said what, who repeated it (graph theory)?

Study social networks with the largest datasets ever created

Collaborate with millions

Build prediction models
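The "who said what, who repeated it" question can be framed as a directed graph. The sketch below is only an illustration under stated assumptions: the account names and repost pairs are invented, and `build_repost_graph` and `most_repeated` are hypothetical helpers, not part of any tool named in this talk.

```python
from collections import Counter, defaultdict

# Hypothetical (reposter, original_author) pairs: each pair means
# "reposter repeated a post first written by original_author".
reposts = [
    ("bob", "alice"),
    ("carol", "alice"),
    ("dave", "bob"),
    ("erin", "alice"),
]


def build_repost_graph(pairs):
    """Return a directed graph: original author -> set of accounts that repeated them."""
    graph = defaultdict(set)
    for reposter, original in pairs:
        graph[original].add(reposter)
    return graph


def most_repeated(graph, n=3):
    """Rank authors by how many distinct accounts repeated them."""
    counts = Counter({author: len(repeaters) for author, repeaters in graph.items()})
    return counts.most_common(n)


graph = build_repost_graph(reposts)
ranking = most_repeated(graph)
print(ranking)
```

Ranking by the number of distinct repeaters is the simplest possible measure; a real study would likely use richer graph metrics.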

Page 15: CSIAC - Social Media Analysis and Privacy

Social Media Networks Awareness

It Concerns Groups Differently

Persons of interest

Don't want to be incriminated in things that they may not have done

Consumers

Don't want others to know things about their buying habits that can be used against them

Subjects

Don't want information released that might cause them to be judged by their peers

Some Concerns Everyone Shares

Discrimination

A feeling of (privacy) violation

Page 16: CSIAC - Social Media Analysis and Privacy

Case Study: Twitter

A real-time social media network of microblogs

Various APIs

Search, Live, Historical

Highly accessible

Example: NodeXL offers an MS Excel plugin for quickly grabbing a few thousand samples a day from multiple sites

Large user base

65+ million “tweets” per day

750+ “tweets” per second

International Community

At least 27 languages represented
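The two volume figures above can be checked against each other with a quick calculation. This is a rough average only, assuming traffic is spread evenly across the day, which real traffic is not.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

tweets_per_day = 65_000_000  # daily figure quoted on the slide
tweets_per_second = tweets_per_day / SECONDS_PER_DAY

print(f"{tweets_per_second:.0f} tweets per second on average")
```

The result is a little over 750 per second, so the per-day and per-second figures on the slide agree.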

Page 17: CSIAC - Social Media Analysis and Privacy

Case Study: Twitter

Twitter is used by:

People

Everyday Individuals

Politicians

Celebrities

Professionals

Bad Guys

Objects

Gadgets that tweet (Sensors, bots, computers, spammers)

Labeled Nefarious Groups

Lulzsec

Anonymous

others

Page 18: CSIAC - Social Media Analysis and Privacy

Case Study: Twitter

What's accessible:

Posts contain far more than what's shown in the http://www.twitter.com web interface

Data is accessible as XML or in its native JSON form

Data includes:

Location (Geo fields)

User names / real names

Threading

Track conversations using replies

Track re-tweets

Twitter client software data

Time stamping

Tweet text

And so much more
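To make the field list above concrete, here is a minimal sketch that reads one JSON record and pulls out the items named on the slide. The record is hand-written and its values are invented. The key names (`created_at`, `source`, `in_reply_to_status_id`, `retweeted_status`, `coordinates`, `user`) are an assumption modeled on the tweet objects returned by Twitter's REST API around the time of this talk; the current API may differ. `summarize` is a hypothetical helper.

```python
import json

# Hand-written sample record; all values are invented for illustration.
SAMPLE = """
{
  "created_at": "Wed Oct 31 14:05:00 +0000 2012",
  "text": "Power is out across the east side",
  "source": "web",
  "in_reply_to_status_id": null,
  "coordinates": {"type": "Point", "coordinates": [-75.4557, 43.2128]},
  "user": {"screen_name": "example_user", "name": "Example Person"}
}
"""


def summarize(record):
    """Pull out the fields named on the slide; missing keys become None."""
    user = record.get("user") or {}
    geo = record.get("coordinates") or {}
    return {
        "timestamp": record.get("created_at"),
        "text": record.get("text"),
        "client": record.get("source"),
        "screen_name": user.get("screen_name"),
        "real_name": user.get("name"),
        "lon_lat": geo.get("coordinates"),
        "is_reply": record.get("in_reply_to_status_id") is not None,
        "is_retweet": "retweeted_status" in record,
    }


summary = summarize(json.loads(SAMPLE))
for field, value in summary.items():
    print(f"{field}: {value}")
```

Most of these fields (location, client software, reply threading) never appear in the web interface, which is the point of the slide.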

Page 19: CSIAC - Social Media Analysis and Privacy

Case Study: Twitter

Page 20: CSIAC - Social Media Analysis and Privacy

Case Study: Twitter

What can be done with all of this:

NYC company DataMinr [3]

Reported the death of Bin Laden:

25 minutes after he was killed

13 minutes before the president's address

They saw the first message regarding this only 19 minutes after it happened

They were able to trace even earlier messages that, with the right algorithms, would have shown something was happening before the initial military strike

Reports of a US helicopter flying overhead

Page 21: CSIAC - Social Media Analysis and Privacy

Case Study: Twitter

Consequences

Data on sites like Twitter can be used to:

Predict Social Security numbers with reasonable accuracy [4]

Deduce the gender of an individual from nothing but the message text [5]

Track a person's physical location and create predictable pattern maps

Deny services based on views and opinions expressed

Use posts, even those that were deleted, as evidence in court [6]

So much more
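As an illustration of the "predictable pattern map" idea, the sketch below bins geo-tagged posts by hour of day and by a coarse latitude/longitude grid cell. It is a toy under stated assumptions: the posts are invented, the 0.01-degree cell size is arbitrary, and `pattern_map` is a hypothetical helper rather than a method described in the cited work.

```python
from collections import Counter
from datetime import datetime

# Hypothetical geo-tagged posts for one account: (ISO timestamp, latitude, longitude).
posts = [
    ("2012-10-29T08:05:00", 43.2128, -75.4557),
    ("2012-10-30T08:10:00", 43.2131, -75.4561),
    ("2012-10-31T08:02:00", 43.2126, -75.4552),
    ("2012-10-31T19:30:00", 43.1009, -75.2327),
]


def pattern_map(points, cell=0.01):
    """Count posts per (hour of day, grid cell); cell size is in degrees."""
    counts = Counter()
    for stamp, lat, lon in points:
        hour = datetime.fromisoformat(stamp).hour
        key = (hour, round(lat / cell), round(lon / cell))
        counts[key] += 1
    return counts


counts = pattern_map(posts)
(hour, lat_cell, lon_cell), hits = counts.most_common(1)[0]
print(f"Most frequent: hour {hour}, cell ({lat_cell}, {lon_cell}), {hits} posts")
```

Even this crude count shows a repeated place at a repeated hour, which is why geo fields are treated as PII later in the talk.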

Page 22: CSIAC - Social Media Analysis and Privacy

What rights of privacy must I protect? &

What laws regarding privacy regulation exist?

Page 23: CSIAC - Social Media Analysis and Privacy

Privacy Protection / Regulation

First we need a strict definition for what is and isn't PII (Personally Identifiable Information)

PII is any information that can be used to identify a specific individual

This includes data that can be combined with other sources to identify an individual

Page 24: CSIAC - Social Media Analysis and Privacy

Privacy Protection / Regulation

You decided to use this data, what's next?

Protecting the PII of individuals within the dataset is key, and to some extent dependent on who you are

We're back to:

Government

Businesses

Academics

Let's concentrate on US law during the rest of this talk

Page 25: CSIAC - Social Media Analysis and Privacy

Privacy Protection / Regulation

The US Government

Must protect the privacy of its citizens

Federal: Cannot collect data on citizens without a warrant

States: Cannot collect data on citizens without just cause

Cannot deny citizens the right to use social media networks

Cannot enforce privacy on the individual

Can enforce regulations on the social media companies and those who use the data

Page 26: CSIAC - Social Media Analysis and Privacy

Privacy Protection / Regulation

Businesses in the US

Must protect the privacy of consumers

Must abide by regulations imposed by the government of the jurisdiction in which the site is located

While not required by law, it's good practice to let consumers know what is being done with their data

Page 27: CSIAC - Social Media Analysis and Privacy

Privacy Protection / Regulation

Academics in the US

Must protect the privacy of subjects

This applies even in instances where data is gathered without consent, such as from social media network sites

Consent is not required for the collection of information from these sites

Depending on the specific site's EULA, datasets may:

Not be shared with other researchers outside of the organization

Not be duplicated within a publication

Summation through statistics and results is OK

Page 28: CSIAC - Social Media Analysis and Privacy

What happens if I don't protect the privacy of the individuals within my datasets?

Page 29: CSIAC - Social Media Analysis and Privacy

Consequences

There are obvious legal ramifications for not protecting the privacy of individuals within a dataset

Legal (Federal / State)

Legal personal injury

Not so obvious

Loss of consumer trust / support

Loss of position through ethics violation clauses

Page 30: CSIAC - Social Media Analysis and Privacy

Consequences: Example

Ethics can be tied closely to privacy

Harvard researchers accessed complete Facebook profiles of 1700 students [7]

Data consisted of public profiles collected within the university

Researchers outside the university had to apply for access to the data

Data manual contained statistics about the dataset that did not require the application to be filled out

These statistics were used to identify individuals

Consequently, the researchers lost funding, and the university found that opinion of the school had declined

Researchers were put before the ethics board
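One general way published statistics can identify individuals is through small counts: if only one or two records share a combination of attributes, a summary table effectively points at those people. The sketch below is a generic small-cell check in that spirit. It is not a reconstruction of the Harvard case; the records are invented, the threshold `k=3` is an arbitrary assumption, and `small_cells` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical records: (home state, major, class year). No real data.
records = [
    ("NY", "Economics", 2009),
    ("NY", "Economics", 2009),
    ("NY", "Economics", 2009),
    ("CA", "Economics", 2009),
    ("CA", "Economics", 2009),
    ("MT", "Folklore", 2009),
]


def small_cells(rows, k=3):
    """Return attribute combinations shared by fewer than k records."""
    counts = Counter(rows)
    return {combo: n for combo, n in counts.items() if n < k}


risky = small_cells(records)
for combo, n in sorted(risky.items()):
    print(f"{combo} appears {n} time(s): publishing this cell may identify someone")
```

A check like this could be run on any statistics intended for public release, before the data manual or summary tables leave the organization.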

Page 31: CSIAC - Social Media Analysis and Privacy

Conclusion

Social media network datasets contain PII

PII is not just profile data, it's also unseen fields such as geo-location and data that can be derived from the messages posted

Datasets cannot be shared outside an organization without prior permission if required by the EULA

If the EULA allows for sharing of the data, it still must be properly anonymized
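A first step toward the anonymization mentioned above might look like the following sketch: direct identifiers are replaced with a keyed hash so that conversations can still be linked, and location and client fields are dropped. Everything here is an assumption for illustration, including the field names, the choice of HMAC-SHA256, and the helpers `pseudonym` and `scrub`. It is not sufficient on its own, since message text is kept and, as noted above, PII can be derived from it.

```python
import hashlib
import hmac

# Assumption: the key is kept secret and stored separately from the dataset.
SECRET_KEY = b"replace-with-a-random-secret"

# Hypothetical field names; adjust to the dataset actually collected.
DIRECT_IDENTIFIERS = ("screen_name", "real_name")
DROPPED_FIELDS = ("coordinates", "client")


def pseudonym(value, key=SECRET_KEY):
    """Map an identifier to a stable keyed hash so threads can still be linked."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def scrub(record):
    """Return a copy with identifiers pseudonymized and location/client fields removed."""
    cleaned = {k: v for k, v in record.items() if k not in DROPPED_FIELDS}
    for field in DIRECT_IDENTIFIERS:
        if cleaned.get(field) is not None:
            cleaned[field] = pseudonym(cleaned[field])
    return cleaned


original = {
    "screen_name": "example_user",
    "real_name": "Example Person",
    "coordinates": [-75.4557, 43.2128],
    "client": "web",
    "text": "Power is out across the east side",
}
scrubbed = scrub(original)
print(scrubbed)
```

A keyed hash is used rather than a plain hash so that someone without the key cannot simply hash a list of known usernames and match them against the dataset.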

Page 32: CSIAC - Social Media Analysis and Privacy

References

[1] DHS, Office of Operations Coordination and Planning, “Publicly Available Social Media Monitoring and Situational Awareness Initiative,” June 22, 2010.

[2] Vaidhyanathan, S., “Welcome to the surveillance society,” IEEE Spectrum, vol. 48, no. 6, pp. 48-51, June 2011, doi: 10.1109/MSPEC.2011.5779791.

[3] Brodkin, J., “Bin Laden death-detecting analytics services signs partnership with Twitter,” Ars Technica, Apr. 9, 2012.

[4] Acquisti, A. and Gross, R., “Predicting Social Security Numbers from public data,” Proceedings of the National Academy of Sciences, vol. 106, no. 27, July 7, 2009.

[5] Burger, J., et al., “Discriminating Gender on Twitter,” MITRE Corp., Nov. 2011.

[6] Smith, “No warrant needed, no privacy: Judge rules even deleted tweets can be used in court,” Network World, Apr. 24, 2012.

[7] Parry, M., “Harvard Researchers Accused of Breaching Students' Privacy,” The Chronicle of Higher Education, July 10, 2011.

Page 33: CSIAC - Social Media Analysis and Privacy

Questions
