
Better web privacy through automation Umesh Shankar Berkeley EECS.

Date post: 18-Jan-2018

Page 1

Better web privacy through automation

Umesh Shankar
Berkeley EECS

Page 2

Configuration everywhere

•Home wireless access

•Operating System

•Web Browser

Page 3

Correct configuration is difficult

•We have high-level goals…
  •“Allow my computers onto my network and no others”

•…but low-level configuration languages
  •“Did I get the settings right?”

•Not always obvious how to express goals in policy terms or to see if they are met

Page 4

Insecurity ensues

•When policies are too difficult to configure, people disable or bypass them
  •Browser, firewall, and OS policies are left at their defaults

•We lose the benefit of configurability: policy reflecting our specific needs

•Functionality wins over security by default

Must retain security subject to users’ patience

Page 5

Browser cookie recap

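[Diagram: cookie exchange between client and server]

1. Client → Server: GET /index.html
2. Server → Client: index.html, setting cookie CartID=1234
3. Client → Server: GET /cart.jsp, sending cookie CartID=1234
4. Server → Client: cart.jsp (“Your cart #1234 contains…”)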
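The exchange in this recap can be sketched with Python's standard `http.cookies` module. This is only an illustration and not part of the original slides: the cookie name `CartID` and the value `1234` come from the slide, while the helper names `server_set_cookie` and `server_read_cookie` are hypothetical.

```python
from http.cookies import SimpleCookie


def server_set_cookie(cart_id: str) -> str:
    """Build the name=value pair the server sends along with index.html."""
    cookie = SimpleCookie()
    cookie["CartID"] = cart_id
    # OutputString() yields only "name=value", without the header name.
    return cookie["CartID"].OutputString()


def server_read_cookie(cookie_header: str) -> str:
    """Parse the Cookie header the client sends back with GET /cart.jsp."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return cookie["CartID"].value


set_cookie_value = server_set_cookie("1234")     # sent with index.html
cart_id = server_read_cookie(set_cookie_value)   # echoed back by the client
page = f"Your cart #{cart_id} contains…"
```

The server keeps no per-client state here; the cart number travels in the cookie, which is why rejecting the cookie breaks the cart page.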

Page 6

Cookie types

•Four classes based on context and lifetime: {First party, Third party} × {Session, Persistent}

•First party = cookies from the same site as in the URL bar
•Third party = cookies from other sites (e.g., ads)
•Session cookie = lasts until the browser is closed
•Persistent cookie = lasts until its expiration date

•All cookies are sent only to the originating site
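The four classes can be expressed as a small classifier. This is a sketch under stated assumptions: the `Cookie` record and the `classify` function are hypothetical names, and "same site" is simplified to an exact or subdomain match on the domain string.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Cookie:
    domain: str                # site that set the cookie
    expires: Optional[float]   # expiration timestamp, or None for a session cookie


def classify(cookie: Cookie, url_bar_domain: str) -> Tuple[str, str]:
    """Return (context, lifetime) for a cookie, given the site in the URL bar."""
    # Simplification: an exact match or a subdomain counts as the same site.
    same_site = (cookie.domain == url_bar_domain
                 or url_bar_domain.endswith("." + cookie.domain))
    context = "first-party" if same_site else "third-party"
    lifetime = "session" if cookie.expires is None else "persistent"
    return context, lifetime
```

For example, a cookie from an ad server with an expiration date, seen while the URL bar shows a shop, would be classified as third-party and persistent.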

Page 7

Cookies and privacy

•Problem: The same third party cookie can appear on multiple sites => you can be tracked

•If anyone knows who you are, everyone does

[Figure: whitehouse.gov and whitehouse.com each show their own content alongside the same “BUY ME!!!” ad. Both ads send the same tracking cookie to the advertiser, which keeps a database about you.]
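A toy model of the advertiser's side shows why one shared cookie is enough to link visits. Everything here is hypothetical (the `advertiser_db` log, the `serve_ad` function, and the cookie ID `abc123`); only the two site names come from the slide.

```python
from collections import defaultdict

# Hypothetical advertiser-side log: tracking cookie ID -> sites where the ad loaded.
advertiser_db = defaultdict(list)


def serve_ad(tracking_id: str, embedding_site: str) -> None:
    """Record that the browser holding this cookie loaded an ad on the given site."""
    advertiser_db[tracking_id].append(embedding_site)


# The same browser (cookie "abc123") visits both sites from the slide.
serve_ad("abc123", "whitehouse.gov")
serve_ad("abc123", "whitehouse.com")

profile = advertiser_db["abc123"]
```

The cookie is still sent only to the advertiser that set it; the tracking comes from that advertiser being embedded on many sites.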

Page 8

The sad state of the art

• Two unpalatable options:
  • Global policy for all sites
  • Constant barrage of dialogs

• Sufficiently annoying that people accept defaults
  • Functionality trumps privacy concerns
  • Individually tailored policies eliminated

Page 9

Existing approaches

•Browser bars
  •Expose site cookies to the user

•Collect user choices for collaborative filtering
  •Violates user privacy (could be fixed)

•P3P interface
  •A good start
  •P3P still unreliable

Page 10

A new way of thinking

Observations:
1. People don’t care about cookies; they care about privacy and functionality
2. People are equipped to make high-level decisions
3. Mistakes happen: error recovery should be easy

Solutions:
1. Automated optimization: browser forking
2. Easy, high-level choices: side-by-side comparison
3. Recovery mechanism: backtrack + replay

Page 11

The ideal policy

•Allow the set of cookies that maximizes the net benefit to the user

•Net benefit = Benefit – Cost = Functionality – Privacy Loss

•Having an objective function allows us to start to measure policy quality
  •With a default-deny policy, each user-initiated change represents a unit of error
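The objective can be written down as a small sketch. It assumes, beyond what the slide says, that functionality and privacy loss can be scored numerically and that they add up independently across cookies; the function names and the example scores are made up for illustration.

```python
from typing import Dict, Set, Tuple

# Per-cookie scores: name -> (functionality, privacy_loss).
Scores = Dict[str, Tuple[float, float]]


def ideal_policy(cookies: Scores) -> Set[str]:
    """Pick the set of cookies that maximizes net benefit.

    Under the additivity assumption, maximizing the total reduces to keeping
    each cookie whose functionality exceeds its privacy loss.
    """
    return {name for name, (functionality, privacy_loss) in cookies.items()
            if functionality - privacy_loss > 0}


def net_benefit(allowed: Set[str], cookies: Scores) -> float:
    """Net benefit = Functionality - Privacy Loss, summed over allowed cookies."""
    return sum(cookies[name][0] - cookies[name][1] for name in allowed)


# Made-up scores for illustration only.
scores = {"cart_session": (5.0, 1.0), "ad_tracker": (0.0, 3.0)}
allowed = ideal_policy(scores)
```

With these numbers, allowing only `cart_session` gives a net benefit of 4.0, while allowing both cookies gives 1.0.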

Page 12

Challenges

1. Benefit is not revealed until after cookie is taken

2. Cost only revealed through hidden P3P policy

Making informed decisions is difficult

Page 13

Determining benefit

Observation: If a cookie has no benefit, we can reject it outright

Idea: Fork the browser
1. One fork takes the cookie, one doesn’t
2. Maintain fork for a few pages
3. If no difference in the output, assume reject
4. Encode the decision into the policy
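The four steps above can be sketched as follows. This is not the actual browser implementation: the `fetch` callback, the `max_pages` limit of 3, and the decision labels `"reject"` and `"ask-user"` are assumptions chosen for illustration.

```python
from typing import Callable, Dict, List

# A fetcher returns the page output for a URL, with or without the cookie.
Fetcher = Callable[[str, bool], str]


def decide_by_forking(urls: List[str], fetch: Fetcher, max_pages: int = 3) -> str:
    """Compare the two forks over the first few pages and return a decision."""
    for url in urls[:max_pages]:
        with_cookie = fetch(url, True)       # hidden fork takes the cookie
        without_cookie = fetch(url, False)   # visible fork rejects it
        if with_cookie != without_cookie:
            return "ask-user"                # the cookie changes the output
    return "reject"                          # no difference observed


def record_decision(policy: Dict[str, str], site: str, decision: str) -> None:
    """Encode the decision into the per-site policy."""
    policy[site] = decision
```

Exact string equality stands in for the page comparison here; a real comparison has to tolerate ads and other nondeterminism.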

Page 14

Forking the browser (cookies not needed)

VISIBLE Amazon.com (cookies off):
Home page → (View book) → Item page → (Add to cart) → Cart

HIDDEN, FORKED Amazon.com (cookies on):
Home page → (View book) → Item page → (Add to cart) → Cart

. . .

DENY policy assumed because no difference

Page 15

Forking the browser (cookies needed)

VISIBLE VerizonWireless.com (cookies off):
Home page → (View plans) → Enter ZIP → (‘94720’) → Error page

HIDDEN, FORKED VerizonWireless.com (cookies on):
Home page → (View plans) → Enter ZIP → (‘94720’) → List of plans

Switch to other window?

Page 16

User choice

• Last line of defense
• Idea: Expose cost and benefit to the user
  • Show side-by-side output from forking technique, highlighting differences (=benefit)
  • Show relevant P3P information (=cost)

• User can simply decide if with-cookie is better than without-cookie
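One simple way to find the differences to highlight is a line-based diff of the two forks' output. The sketch below uses Python's standard `difflib`; the function name `highlight_differences` is hypothetical, and a line diff is a stand-in for whatever comparison the real tool uses.

```python
import difflib
from typing import List


def highlight_differences(without_cookie: str, with_cookie: str) -> List[str]:
    """Return only the lines that differ between the two forks' output."""
    diff = difflib.ndiff(without_cookie.splitlines(), with_cookie.splitlines())
    # ndiff prefixes: "- " only in the first input, "+ " only in the second,
    # "  " common to both, "? " intraline hints (dropped here).
    return [line for line in diff if line.startswith(("- ", "+ "))]
```

Lines prefixed with "- " appear only without the cookie and lines prefixed with "+ " appear only with it, so the "+" lines are a rough picture of the benefit.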

Page 17

Semantic comparison of web pages

•Necessary for forking and side-by-side highlighting
•Difficult problem on its own
•Sources of error:
  •Advertisements
  •Natural nondeterminism (click trackers, news feeds)

•For now, coarse comparison (viz., page title)
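A coarse title-only comparison might look like the sketch below, built on Python's standard `html.parser`. The class and function names are hypothetical, and this is an illustration of the idea rather than the tool's own code.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_title(html: str) -> str:
    """Extract the stripped title text from an HTML document."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def same_page(html_a: str, html_b: str) -> bool:
    """Coarse comparison: two pages match if their titles match."""
    return page_title(html_a) == page_title(html_b)
```

Comparing only titles ignores ads and news feeds in the body, at the price of missing differences that do not change the title.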

Page 18

Dealing with errors

•Errors are inevitable
  •Prediction algorithm won’t always guess right
  •Humans make mistakes or change their minds

Recovery should be easy and automated

Page 19

“This site requires cookies”
The nightmare page

Important Note: This section of the site requires the use of cookies. Cookies allow us to keep track of who you are and your placement choices… Once you have changed your browser to accept cookies, please go back to the placement area home page and begin making your choices…

1. Select "Preferences" from the Edit menu.
2. Click on the "Advanced" selector.
3. Click the "Cookies" option.
4. Scroll down to the "Cookies" section.
5. To enable:
6. Select "Enable all cookies."
7. Click "OK."

Page 20

“Fix Me”

[Figure: two browsing timelines]

Cookies Off (due to prior policy):
Home page → (View Item) → Item page → (Add item) → Item page → (Show cart) → Empty cart

Click “Fix Me”

Session Cookies On (automatic):
Home page → (View Item, automatic) → Item page → (Add item, automatic) → Cart page

Replay stops because page is different

Page 21

Notes on Backtracking + replaying

• How far to go back?
  • Typically start of site
• Which cookies to enable?
  • Perform an expanding search, starting with first-party (FP) session cookies
• When to stop?
  • When there is a difference from the history
  • If no difference, expand cookie search & retry
• Don’t replay through POSTs (non-idempotent)
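The notes above can be sketched as an expanding search over cookie classes. Several details are assumptions rather than statements from the slides: only the first entry of `SEARCH_ORDER` (first-party session) is given, the rest of the order is a guess, the `fetch` callback and the `Step` tuple are hypothetical, and the history is assumed to begin at the start of the site.

```python
from typing import Callable, List, Optional, Tuple

# Cookie classes in the order the search expands. Only the first entry is
# stated on the slide; the remaining order is an assumption.
SEARCH_ORDER = ["first-party session", "first-party persistent",
                "third-party session", "third-party persistent"]

# A history entry is (method, url, page_seen_by_user).
Step = Tuple[str, str, str]


def fix_me(history: List[Step],
           fetch: Callable[[str, List[str]], str]
           ) -> Optional[Tuple[List[str], int]]:
    """Replay the history with more cookie classes enabled until a page differs.

    Returns (enabled_classes, index_of_first_different_page), or None if no
    cookie setting changed any replayed page.
    """
    for n in range(1, len(SEARCH_ORDER) + 1):
        enabled = SEARCH_ORDER[:n]
        for i, (method, url, seen) in enumerate(history):
            if method == "POST":
                break                  # never replay non-idempotent requests
            if fetch(url, enabled) != seen:
                return enabled, i      # difference from history: stop here
    return None
```

Stopping at the first differing page leaves the user on the page the cookie unlocked, and stopping at the first POST avoids repeating actions such as a purchase.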

Page 22

Why replaying is OK

•No formal proof possible: server can take arbitrary actions in response to a request

•In practice, server state is the problem
  •Without state, most sites are repeatable

•Lack of state (cookies) is why we are replaying!

•Empirically, most problems show up early, so not much to replay

Page 23

Engineering challenges

•Web is dynamic and nondeterministic
  •We handle most JavaScript
  •Heuristics are good in practice at matching elements

•Browsing is nonlinear (e.g., tabs, back button)
  •Still working on multiple tabs
  •Back button is emulated

•Firefox is an immature platform

