1
The Internet Hunt Revisited: Personal Information Accessible via the Web
Kay Connelly, Tom Jagatic, Ashraf Khalil, Yong Liu, Katie A Siek, and Sid Stamm
2
Overview
Motivation
This is what we did
Results
Ramifications
Conclusion
3
Motivation
Rick Gates’ Hunt (6/93)
11 years later
Google™
DPPA & information selling[1]
Internet Archiving
Blogs & more...
4
Choosing Targets
Names and gender changed for privacy
Criteria for Selection
Two Targets
One each over 40 and under 25 (net generation)
One has a blog
One on active duty in Iraq
5
“Alice”
In Academia
Website through Work
Over 40
Net presence for several years
Alice™ © Disney Motion Pictures
6
“Bob”
In the US Military
Stationed in the Middle East
Under 25
Maintains a Blog
Bob the Builder™ © Hit Entertainment PLC
7
Tools
Logging & Data Archiving system
Webwhacker™[2]
Onion Router
8
Hunting Procedure
Divide and conquer:
Groups for each target
One for pay sites
Gathering data [figure: Search Web, Record Data, Discuss Collected Data]
Chaining Information
9
Bob’s Father’s Name
Background report has older man's name living at same residence
Personal Blog mentions business venture with father
Business web site mentions product, but no names
Product patent mentions Bob and co-owner's name
Co-owner's name matches background report man's name
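The chain of inferences on this slide can be sketched as a small cross-referencing step. This is only an illustration of the idea, not the procedure the team actually ran: every name, field, and the helper `infer_father` are hypothetical placeholders, and the real sources were read by hand.

```python
# Hypothetical stand-in data; none of these names or fields come from the study.
background_report = {
    "subject": "Bob Example",
    "co_residents": ["Robert Example Sr."],  # older man at the same residence
}
blog_mentions_venture_with_father = True     # from the personal blog
patent = {
    "product": "Example Widget",             # product named on the business site
    "owners": ["Bob Example", "Robert Example Sr."],
}

def infer_father(subject, report, patent_record, blog_links_father):
    """Return patent co-owners who also appear as co-residents in the report."""
    if not blog_links_father:
        # Without the blog's "venture with father" hint the link is unsupported.
        return []
    co_owners = [name for name in patent_record["owners"] if name != subject]
    return [name for name in co_owners if name in report["co_residents"]]

candidates = infer_father(
    background_report["subject"],
    background_report,
    patent,
    blog_mentions_venture_with_father,
)
print(candidates)
```

No single source names the father; the match only appears once the blog, the patent, and the background report are combined.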
10
Estimating Alice’s Savings
Find subject's income range from website
Background report tells us price of current house and prices subject sold other houses for
Official website gives us information on taxes in the area
Deduce subject's savings
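The slides do not give the formula used for the deduction, so the following is only one plausible way to combine the three inputs (income range, house prices, local taxes). All figures, the spending share, and the function `estimate_savings_range` are assumptions made up for illustration. Subtracting the full price of the current house is a deliberate simplification that ignores mortgages.

```python
def estimate_savings_range(income_range, tax_pct, spend_pct, years,
                           house_sale_proceeds, current_house_price):
    """Rough (low, high) savings estimate using integer arithmetic.

    income_range        -- (low, high) yearly income, from the employer website
    tax_pct             -- assumed overall tax rate in percent, from official site
    spend_pct           -- assumed share of after-tax income that is spent
    years               -- assumed number of years at this income
    house_sale_proceeds -- total from earlier house sales (background report)
    current_house_price -- price of the current house (background report)
    """
    def one(income):
        after_tax = income * (100 - tax_pct) // 100
        saved_per_year = after_tax * (100 - spend_pct) // 100
        return saved_per_year * years + house_sale_proceeds - current_house_price

    low, high = income_range
    return one(low), one(high)

# Entirely hypothetical numbers, not Alice's real data.
estimate = estimate_savings_range(
    income_range=(60_000, 80_000),
    tax_pct=25,
    spend_pct=80,
    years=10,
    house_sale_proceeds=300_000,
    current_house_price=350_000,
)
print(estimate)  # (40000, 70000)
```

The result is a range rather than a single figure because the website only yields an income range.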
11
What We Found
Category                   Alice   Bob
Education                  3       17
Career                     6       4
Service                    0       28
Location                   38      25
Family                     62      62
Interests & Social Life    1       58
Political                  0       5
Other                      10      54
Total Information:         120     253
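As a quick consistency check, the per-category counts above can be summed and compared with the reported totals. The numbers are taken from the slide; the variable names are only illustrative.

```python
# (Alice, Bob) fact counts per category, as listed on the slide.
facts = {
    "Education": (3, 17),
    "Career": (6, 4),
    "Service": (0, 28),
    "Location": (38, 25),
    "Family": (62, 62),
    "Interests & Social Life": (1, 58),
    "Political": (0, 5),
    "Other": (10, 54),
}

alice_total = sum(alice for alice, _ in facts.values())
bob_total = sum(bob for _, bob in facts.values())

# Category holding the most facts for each subject.
alice_top = max(facts, key=lambda category: facts[category][0])
bob_top = max(facts, key=lambda category: facts[category][1])

print(alice_total, bob_total)  # 120 253
print(alice_top, bob_top)      # Family Family
```

Both column sums match the "Total Information" row, and Family is the largest category for both subjects.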
12
Facts vs Time [figure: x-axis Date, 05/29 to 09/04; y-axis Accumulated Information; legend: A, B, both]
13
Facts vs Time [figure: x-axis Date, 05/29 to 09/04; y-axis Accumulated Information; legend: A, B, both, alice]
14
Facts vs Time [figure: x-axis Date, 05/29 to 09/04; y-axis Accumulated Information; legend: A, B, both, alice, bob]
15
Facts vs Time [figure: x-axis Date, 05/29 to 09/04; y-axis Accumulated Information (and time); legend: A, B, both, alice, bob, time spent]
16
Where did we find it?
                 Alice            Bob
Category         # src   # fact   # src   # fact
Personal         5       77       3       119
Official         2       2        1       9
Media            0       0        3       13
Contributions    0       0        2       34
Pay Sites        3       75       3       81
17
Control of Information
Not all information is intentionally made available.
Subject   Voluntary   Involuntary
Alice     78          49
Bob       153         107
18
Ramifications
Lots of information readily available
Can find out some troops’ locations overseas
Easy to find a person’s home
How much is available about you?
19
If we were bad...
We could attempt masquerading as an individual
Obtain birth certificates, credit records, marriage licenses
Bribe the person
Find troops’ locations
Coerce information from them (masquerade as another)
Google for Credit Card Info[3]
20
Conclusion
Lots of Information available
More content, less reliable
Best sources were Intelius and the subjects themselves
General searching may spawn more information
Identity theft risk
Advanced technology brings about different privacy concerns
21
References (more in the paper)
1. “US Supreme Court takes up driver’s license data privacy.” Heather Hayes, CNN. May 21, 1999.
2. Webwhacker™ by Blue Squirrel. (http://www.bluesquirrel.com/webwhacker/)
3. “Google search reveals credit-card numbers.” Baker, P and Baker B. CRM Daily. (http://story.news.yahoo.com/news?tmpl=story&u=/nf/26967)