Date post: 19-Jan-2016
Shouling Ji, Shukun Yang, and Raheem Beyah Georgia Institute of Technology Ting Wang Lehigh University Changchang Liu and Wei-Han Lee Princeton University PARS: A Uniform and Open-source Password Analysis and Research System
Shouling Ji, Shukun Yang, and Raheem Beyah
Georgia Institute of Technology

Ting Wang
Lehigh University

Changchang Liu and Wei-Han Lee
Princeton University

PARS: A Uniform and Open-source Password Analysis and Research System

INTRODUCTION

• People choose simple passwords.
  • password
  • 123456
  • 111111
  • iloveyou

• People reuse passwords.
  • On average, a user has 6-7 passwords and maintains 25 distinct online accounts. (Florencio et al. [WWW’07][1])

INTRODUCTION

Over the past decade, hundreds of millions of passwords have been leaked.

INTRODUCTION

Why do we worry about leaked passwords?

They can be used to crack other password datasets!

How?

INTRODUCTION

• Text-based passwords still dominate computer system authentication
  • Usability: Personal enough for users to remember
  • Security: Difficult enough for outsiders to guess/access

• Password datasets have been leaked
  • English: LinkedIn, Rockyou, eHarmony, …
  • Chinese: Tianya, CSDN, 7k7k, Duduniu, …
  • German: Gamigo

• Password research has made considerable progress
  • Password Cracking: Markov-based, Structure-based, Dictionary, Rainbow Table, …
  • Password Strength Meter: NIST, Ideal, Markov-based, Structure-based, …
  • Password Management, Measurement, Alternatives, …

RELATED WORK

• Password Cracking

• Algorithms aim to reduce the size of the password search space and to enumerate passwords in decreasing order of likelihood.

• Use expired and reused passwords as training information to create guesses.

• Password Measurement

• Correlations between demographic and behavioral factors have been found. Regional differences result in various password patterns.

• NIST entropy and other traditional password metrics have been found ineffective. Using cracking models to build sophisticated meters has been shown to be more effective.

• Inconsistent feedback of password strength exists among different sorts of strength metrics and meters.

RELATED WORK

Questions:

Many password cracking algorithms have performed reasonably well.

But which one is the most effective?

Many websites have password policies and strength meters.

But are they helpful?

Many password datasets have been leaked and published.

But do they affect the security of other datasets?

OUR CONTRIBUTIONS

• A uniform and comprehensive Password Analysis and Research System - PARS (open-source project)

• Large-scale Password Security Measurement and Analysis

• Future Research Insights: Correlation, Hybrid Cracking, Relative Improvement Ratio, Diversity, …

OUTLINE

• PARS Overview
• Datasets Analysis
• Password Cracking Models
• Hybrid Cracking Feasibility Analysis
• Password Measurement Models – Academic
• Password Measurement Models – Commercial
• Future Research Insights

PARS OVERVIEW

Available in PARS:

• Cracking Module - Attack

• Measurement Module - Defend

• Utility Module - Analyze

[Figure: PARS components: Cracking Algorithms (12), Dataset Analysis (145M), Strength Metrics (15), Insights Booster (RIR, …), Academic Meters (8), Commercial Meters (15)]

PARS OVERVIEW

Cracking Module:

• 12 state-of-the-art cracking algorithms

Measurement Module:

• 15 intra-site and cross-site password strength metrics

• 8 academic password meters

• 15/24 commercial password meters from top-150 websites ranked by Alexa.com

PARS OVERVIEW

Utility Module:

• Data Analysis
  • 8 Datasets of Leaked Passwords
  • Data Processing Unit

• Tools
  • Hashing (MD5)
  • Preprocessing of data
  • Statistical Analysis

• Insights and Future Research
  • Hybrid Password Cracking
  • RIR Metrics
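
The hashing tool listed under the Utility Module can be illustrated with a short sketch. This is not the PARS implementation; the function names `md5_hex` and `hash_passwords` are hypothetical, and the sketch only assumes that a plaintext list is turned into unsalted MD5 digests so that hash-based crackers can be run against it.

```python
import hashlib


def md5_hex(password: str, encoding: str = "utf-8") -> str:
    """Return the hexadecimal MD5 digest of a plaintext password."""
    return hashlib.md5(password.encode(encoding)).hexdigest()


def hash_passwords(passwords):
    """Pair each plaintext password with its MD5 digest, preserving order."""
    return [(pw, md5_hex(pw)) for pw in passwords]


if __name__ == "__main__":
    for plain, digest in hash_passwords(["password", "123456"]):
        print(f"{digest}  {plain}")
```

MD5 appears here only because the slide names it as the supported hash; it is not a suitable choice for storing real passwords.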

PARS OVERVIEW

• Command Line Mode
  • Easy and fast scripting
  • No need to learn how to use individual algorithms
  • Carefully aligned outputs

• GUI Mode
  • Carefully designed, user-friendly interface
  • Support for commercial meter evaluation
  • Visualization of outputs

DATASETS ANALYSIS

Ethics: All datasets were once publicly available and are used for research purposes only.

Contains username, email corresponding to each password

DATASETS ANALYSIS

• Password classifications
  • Lengths: <=6, 7, …, 14, >=15
  • Compositions: Univariate, Bivariate, Trivariate, Qualvariate
    e.g., password123 -> bivariate; password123!@# -> trivariate
  • Structure: LD, L, D, DL, LDL, UD, U, ULD, DLD, LDLD, other
    password123 -> LD
    PAssword123 -> ULD
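
The three classifications above can be sketched in a few lines. This is an illustration rather than PARS code: the function names are hypothetical, the composition is assumed to count how many of the four character classes (lowercase, uppercase, digit, symbol) appear, and the letter `S` for symbol runs is an assumption, since the slide only lists structures built from L, U, and D.

```python
from itertools import groupby


def char_class(ch: str) -> str:
    """Map a character to L (lowercase), U (uppercase), D (digit), or S (other)."""
    if ch.islower():
        return "L"
    if ch.isupper():
        return "U"
    if ch.isdigit():
        return "D"
    return "S"  # assumed label for symbols; not listed on the slide


def length_bucket(pw: str) -> str:
    """Bucket a password by length: '<=6', '7' ... '14', or '>=15'."""
    n = len(pw)
    if n <= 6:
        return "<=6"
    if n >= 15:
        return ">=15"
    return str(n)


def composition(pw: str) -> int:
    """Number of distinct character classes used (1 = univariate, 2 = bivariate, ...)."""
    return len({char_class(ch) for ch in pw})


def structure(pw: str) -> str:
    """Collapse each run of same-class characters into one class letter."""
    return "".join(key for key, _ in groupby(char_class(ch) for ch in pw))


if __name__ == "__main__":
    for pw in ["password123", "password123!@#", "PAssword123"]:
        print(pw, length_bucket(pw), composition(pw), structure(pw))
```

Under these assumptions the slide's own examples come out as stated: `password123` has composition 2 and structure `LD`, and `PAssword123` has structure `ULD`.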

DATASETS ANALYSIS

• Standard Datasets

• 2 million randomly sampled passwords from each of the 8 datasets to ensure fair evaluation

• 8 standard datasets for all evaluations and tests beyond this point
  • 7k7k
  • CSDN
  • Duduniu
  • Renren
  • Tianya
  • LinkedIn
  • Rockyou
  • Gamigo
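
Drawing a fixed-size random sample from a very large leaked dataset can be done in one pass without loading the whole file. The slides do not say how the 2 million passwords were sampled, so the sketch below is only one plausible approach (reservoir sampling, Algorithm R); `reservoir_sample` and its parameters are hypothetical names.

```python
import random


def reservoir_sample(items, k, seed=None):
    """Return k items chosen uniformly at random from an iterable of unknown length.

    If the iterable has fewer than k items, all of them are returned.
    """
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace a reservoir slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


if __name__ == "__main__":
    # Stand-in for reading one password per line from a dataset file.
    passwords = (f"pw{i}" for i in range(100_000))
    print(len(reservoir_sample(passwords, 2_000, seed=42)))
```

Using the same sample size for every dataset keeps cracking percentages comparable across datasets of very different original sizes.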

PASSWORD CRACKING MODELS

• John the Ripper [2]

• A popular community cracking software
  • Bleeding-Jumbo is an open-source community version

• Contains 4 popular modes
  • Single (social profile information)
  • Wordlist (input dictionary/wordlist)
  • Incremental (smart brute-force)
  • Markov (Markov chain/training data)

PASSWORD CRACKING MODELS

• HashCat (v0.50)

• A popular community cracking software

• Contains 4 popular modes
  • Brute-force Mode
  • Dictionary Attack
  • Mask Attack
  • Permutation Attack
  • … etc.

PASSWORD CRACKING MODELS

• Probabilistic Context Free Grammars

• Pcfg Manager (PCFG)

Weir et al. “Password Cracking Using Probabilistic Context-Free Grammars”, S&P 2009.

• Semantic Guesser (VCT)

Veras et al. “On the Semantic Patterns of Passwords and their Security Impact”, NDSS 2014.
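
The structure-based idea behind PCFG cracking can be sketched as follows. This is a heavily simplified illustration, not the Pcfg Manager or Semantic Guesser code: it assumes every terminal (letters, digits, symbols) is learned from the training passwords, it does not distinguish letter case, and it enumerates all guesses and sorts them rather than using an efficient priority-queue generator. All function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import groupby, product


def segments(pw):
    """Split a password into (class, text) runs, e.g. 'abc12' -> [('L', 'abc'), ('D', '12')]."""
    def cls(ch):
        if ch.isalpha():
            return "L"
        if ch.isdigit():
            return "D"
        return "S"
    return [(k, "".join(g)) for k, g in groupby(pw, key=cls)]


def train(passwords):
    """Count base structures such as (L3, D3) and the strings that fill each slot."""
    structures = Counter()
    terminals = defaultdict(Counter)
    for pw in passwords:
        segs = segments(pw)
        structures[tuple((k, len(t)) for k, t in segs)] += 1
        for k, t in segs:
            terminals[(k, len(t))][t] += 1
    return structures, terminals


def guesses(structures, terminals):
    """Return (guess, probability) pairs sorted by decreasing probability."""
    total = sum(structures.values())
    scored = []
    for structure, count in structures.items():
        options = []
        for slot in structure:
            slot_total = sum(terminals[slot].values())
            options.append([(t, c / slot_total) for t, c in terminals[slot].items()])
        for combo in product(*options):
            p = count / total  # probability of the base structure
            for _, p_terminal in combo:
                p *= p_terminal  # times the probability of each slot filler
            scored.append(("".join(t for t, _ in combo), p))
    scored.sort(key=lambda pair: -pair[1])
    return scored


if __name__ == "__main__":
    model = train(["abc123", "abc456", "xyz123", "pass1"])
    for guess, p in guesses(*model):
        print(f"{p:.4f}  {guess}")
```

Note that the model produces `xyz456`, a guess that never appears in the training list, because it recombines a learned structure with learned fillers.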

PASSWORD CRACKING MODELS

• Markov Models

• Fast Dictionary Attack

• Narayanan et al. “Fast dictionary attacks on passwords using time-space tradeoff”, CCS 2005

• N-gram

• Ur et al. “How does your password measure up? The effect of strength meters on password creation”, USENIX 2012

• OMEN and OMEN+

• Durmuth et al. “Leveraging personal information for password cracking”, CoRR 2013
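
What the Markov-based models have in common is a character-level n-gram model trained on known passwords. The sketch below shows only that shared core, under several assumptions that do not come from the cited papers: add-one smoothing, an assumed alphabet of 95 printable ASCII characters plus an end marker, and hypothetical function names. It is not an implementation of OMEN or of any specific cited algorithm.

```python
import math
from collections import Counter, defaultdict

START, END = "\x02", "\x03"  # padding markers assumed not to occur in passwords


def train_ngram(passwords, n=3):
    """Count, for every (n-1)-character context, which character follows it."""
    counts = defaultdict(Counter)
    for pw in passwords:
        padded = START * (n - 1) + pw + END
        for i in range(len(pw) + 1):
            counts[padded[i:i + n - 1]][padded[i + n - 1]] += 1
    return counts


def log2_prob(pw, counts, n=3, alphabet_size=96):
    """Add-one smoothed log2 probability of a password under the n-gram model."""
    padded = START * (n - 1) + pw + END
    total = 0.0
    for i in range(len(pw) + 1):
        seen = counts.get(padded[i:i + n - 1], {})
        numerator = seen.get(padded[i + n - 1], 0) + 1
        denominator = sum(seen.values()) + alphabet_size
        total += math.log2(numerator / denominator)
    return total


if __name__ == "__main__":
    model = train_ngram(["password", "password1", "passw0rd", "iloveyou"])
    for candidate in ["password", "passw0rd1", "zxqvjk"]:
        print(f"{candidate:10s} {log2_prob(candidate, model):8.2f}")
```

A cracker would generate candidates in decreasing order of this probability, while a meter would report the negated log-probability as a strength score.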

PASSWORD CRACKING MODELS

• Others

• Cross-site guessing (DBCBW)

• Das et al. “The tangled web of password reuse”, NDSS 2014

• Transform-based guessing (ZMR)

• Zhang et al. “The security of modern password expiration: An algorithmic framework and empirical analysis”, CCS 2010

PASSWORD CRACKING MODELS

• Summary

PASSWORD CRACKING MODELS

Highest percentage of cracked passwords highlighted

Training-free

BETTER CRACKING PERFORMANCE?

Since no single algorithm excels all the time…

Why not devise a hybrid algorithm to combine the cracking performance?

BETTER CRACKING PERFORMANCE?

• Possible?

• Take the advantages of both PCFG-based and Markov-based algorithms and combine them into a single hybrid version

Hybrid Password Cracking (HPC)

HYBRID PASSWORD CRACKING (HPC)

• Question 1

Is it reasonable/necessary to design an HPC algorithm?

• Question 2

If reasonable, how much improvement can be achieved?

HYBRID PASSWORD CRACKING (HPC)

• Relative Improvement Ratio (RIR)

Under same settings:

A1 -> PCFG, A2 -> OMEN
X: set of 7k7k passwords cracked by Renren-trained A1
Y: set of 7k7k passwords cracked by Renren-trained A2

The RIR of A1 given by A2, denoted by is defined as

PCFG (P), VCT (V), 3gram (3), OMEN (O)

[Figure: Venn diagram of the set of passwords cracked by A1, the set of passwords cracked by A2, and their overlap]
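
The formula itself did not survive in this transcript, so the sketch below rests on an assumption: that the RIR of A1 given A2 is the number of passwords cracked by A2 but not by A1, relative to the number cracked by A1, i.e. |Y \ X| / |X|. That reading matches the Venn diagram and the stated purpose of the metric, but it should be checked against the PARS paper. The function name `rir` is hypothetical.

```python
def rir(x, y):
    """Relative Improvement Ratio of A1 given A2, ASSUMED to be |Y \\ X| / |X|.

    x: set of passwords cracked by A1
    y: set of passwords cracked by A2
    """
    x, y = set(x), set(y)
    if not x:
        raise ValueError("A1 cracked no passwords; the ratio is undefined")
    return len(y - x) / len(x)


if __name__ == "__main__":
    cracked_by_a1 = {"123456", "password", "iloveyou", "qwerty"}
    cracked_by_a2 = {"iloveyou", "qwerty", "abc123"}
    print(rir(cracked_by_a1, cracked_by_a2))  # improvement A2 could bring to A1
    print(rir(cracked_by_a2, cracked_by_a1))  # improvement A1 could bring to A2
```

Under this reading the ratio is asymmetric: an algorithm that cracks few passwords can gain a lot from a stronger partner, while the stronger one gains little in return.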

HYBRID PASSWORD CRACKING (HPC)

• RIR indicates the potential improvement of an algorithm when incorporating the advantages of another algorithm

• Answers to previous 2 questions
  • Every algorithm has room for improvement given the advantage of another algorithm
  • Different choices of algorithms will result in different improvement spaces

PASSWORD MEASUREMENT - ACADEMIC

• Statistics-based
  • Ideal [9]: Assigns entropy score based on probability distribution

• Rule-based
  • NIST [9]: entropy of 1st char is 4 bits; next 7 chars are 2 bits/char; bonus with dictionary check

• Intra-site metrics [10]
  • Min-Entropy
  • Guesswork G
  • Beta-success-rate
  • Alpha-work-factor

• Cross-site metrics [10]
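
The rule-based and intra-site metrics listed above can be made concrete with a short sketch. Two things here are assumptions rather than content from the slides: the NIST rules beyond the first eight characters (1.5 bits for characters 9 to 20, 1 bit afterwards, a 6-bit composition bonus, and a flat 6-bit dictionary bonus for passwords shorter than 20 characters) follow the commonly cited NIST SP 800-63 heuristic in simplified form, and the four distribution metrics use the standard definitions from the guessing-metrics literature, which may differ in detail from reference [10]. All function names are hypothetical.

```python
import math
from collections import Counter


def nist_entropy(pw, dictionary=None):
    """Simplified NIST-style entropy estimate in bits (see assumptions above)."""
    n = len(pw)
    bits = 4.0 if n >= 1 else 0.0              # first character: 4 bits
    bits += 2.0 * max(0, min(n, 8) - 1)        # characters 2-8: 2 bits each
    bits += 1.5 * max(0, min(n, 20) - 8)       # characters 9-20: 1.5 bits each
    bits += 1.0 * max(0, n - 20)               # characters 21+: 1 bit each
    if any(c.isupper() for c in pw) and any(not c.isalpha() for c in pw):
        bits += 6.0                            # composition bonus
    if dictionary is not None and n < 20 and pw.lower() not in dictionary:
        bits += 6.0                            # dictionary-check bonus (flat, simplified)
    return bits


def distribution(passwords):
    """Empirical password probabilities, sorted in decreasing order."""
    counts = Counter(passwords)
    total = sum(counts.values())
    return sorted((c / total for c in counts.values()), reverse=True)


def min_entropy(probs):
    """-log2 of the most common password's probability."""
    return -math.log2(probs[0])


def guesswork(probs):
    """Expected number of guesses when guessing in decreasing probability order."""
    return sum(i * p for i, p in enumerate(probs, start=1))


def beta_success_rate(probs, beta):
    """Fraction of accounts cracked with at most beta guesses per account."""
    return sum(probs[:beta])


def alpha_work_factor(probs, alpha):
    """Smallest number of guesses per account needed to crack a fraction alpha."""
    cumulative = 0.0
    for i, p in enumerate(probs, start=1):
        cumulative += p
        if cumulative >= alpha:
            return i
    return len(probs)


if __name__ == "__main__":
    leaked = ["123456"] * 4 + ["password"] * 2 + ["iloveyou", "qwerty"]
    probs = distribution(leaked)
    print(nist_entropy("password"), min_entropy(probs), guesswork(probs))
    print(beta_success_rate(probs, 2), alpha_work_factor(probs, 0.75))
```

The contrast is the point of the slide: the NIST rule scores a single string by its length and character classes, while the distribution metrics score a whole dataset by how concentrated its passwords are.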

PASSWORD MEASUREMENT - ACADEMIC

• Attack-based
  • Estimate difficulty of attack using a cracking model
  • PCFG [11]
    • Based on the structure-based cracking algorithm
  • Adaptive [9]
    • Based on the Markov-based cracking algorithm
  • Brute-force Markov (BFM) [12]

Traditional entropy does not evaluate the attacker’s effort.

Good metrics try to estimate the effort an attacker needs to crack a specific password/dataset.

PASSWORD MEASUREMENT - ACADEMIC

Training -> Testing

E.g.,

Tianya->CSDN (PCFG)

= Using Tianya to train the PCFG meter and evaluating the CSDN dataset

[Figure: CDF of entropy distribution vs. entropy score]

PASSWORD MEASUREMENT – COMMERCIAL

• From alexa.com

• Top-150 commercial password strength evaluators in 10 specific categories

LL: with minimum length constraint

UL: with maximum length constraint

C1-C6: # of composition policies

N/A: unclear policies

PASSWORD MEASUREMENT – COMMERCIAL

List of 150 websites

PASSWORD MEASUREMENT – COMMERCIAL

• Commercial Meters:

• Target

• Bloomberg

• Yahoo!

• Google

• Home Depot

• …

PASSWORD MEASUREMENT – COMMERCIAL

• We extracted source code from specific online meters and leveraged them directly or rewrote the meters using the algorithms gleaned from the code

• For those checkers that operate at the server-end without , we can initiate requests to check specific passwords

• In our experiments, we evaluate using Google, Yahoo!, Bloomberg and Target’s meters

PASSWORD MEASUREMENT – COMMERCIAL

PASSWORD MEASUREMENT – COMMERCIAL

Strength Dist. VS. Strength Levels (each meter has different “levels”)

CONCLUSIONS

• We proposed and implemented PARS, the first uniform, open-source, and comprehensive password research platform, which offers researchers a convenient toolset and a benchmark platform for new techniques.

• We conducted a large-scale, comparable evaluation using numerous algorithms implemented in PARS, which helps us better understand current password security research.

• We evaluated future research insights such as the feasibility of hybrid password cracking and provided a measure of the effectiveness of HPC design.

Thank you!

CAP Group
http://www.ece.gatech.edu/cap/PARS

