1 © Copyright 2011 EMC Corporation. All rights reserved.

Machine Learning and BigData in Cyber Security

Eyal Kolman, Ph.D.

Research Scientist, RSA

14.5.2015

Today’s Cyber Security Paradigm

Recent Security Attacks

Neiman Marcus, Jan 2014: 1.1 million credit and debit cards

NY City, 2014: 22.8 million private records were exposed

UPS, August 2014: customer details were exposed

Home Depot, September 2014: 56 million credit and debit cards

JP Morgan Chase, 2014: 76 million consumers and 7 million businesses

Sony, November 2014: 47K social security numbers, 5 movies

Goodwill, September 2014: 900K credit and debit cards

Kmart, 2014: unknown number of credit and debit cards

Dairy Queen, August 2014: 600K credit and debit cards

Today’s Cyber Security Paradigm

[Diagram: DB, DB, DB (Variety, Velocity, Volume) → Data Science: Monitor. Analyze. Detect.]

Data Science all the Way

Data Extraction

•Where to place the sniffers?

•How to best use my limited storage?

•Detection of failures

Parsing

•Activities aggregation

•Detection of message modifications

•Automatic synchronization and normalization

Feature Extraction

•Automatic feature definition (“deep learning” style)

•Feature selection

•Dimensionality reduction

Detection

•Anomaly detection

•Pattern recognition

•Behavioral-based analysis

•Low and slow detection

•…

Alerting

•Contextual alerting

•Prioritization

•Alerts filtering

•Grouping

Investigation

•Prediction of analyst next step

•Recommendation systems

•Crowd sourcing

•Feedback generation (explicit and implicit)

Mitigation

•What-if analysis

•Automatic mitigation

Data Extraction

• Where to place the sniffers?

• How to best use my limited storage?

• Detection of failures

Parsing

• Activities aggregation

• Detection of message modifications

• Automatic synchronization and normalization
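The Parsing bullets above can be illustrated with a minimal sketch. It assumes events arrive as dictionaries with ISO 8601 timestamps in mixed time zones; the field names (`user`, `ts`, `bytes`) and the sample events are hypothetical and do not come from the talk. Timestamps are normalized to UTC, then activities are aggregated per user and hour.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical raw events from sources that report time in different zones.
RAW_EVENTS = [
    {"user": "alice", "ts": "2015-05-14T10:00:00+03:00", "bytes": 1200},
    {"user": "alice", "ts": "2015-05-14T07:20:00+00:00", "bytes": 300},
    {"user": "bob", "ts": "2015-05-14T09:05:00+02:00", "bytes": 50},
]


def normalize(event):
    """Return a copy of the event with its timestamp converted to UTC."""
    ts = datetime.fromisoformat(event["ts"]).astimezone(timezone.utc)
    return {**event, "ts": ts}


def aggregate_by_user_hour(events):
    """Count events and sum bytes per (user, UTC hour) bucket."""
    buckets = defaultdict(lambda: {"events": 0, "bytes": 0})
    for event in map(normalize, events):
        hour = event["ts"].replace(minute=0, second=0, microsecond=0)
        bucket = buckets[(event["user"], hour)]
        bucket["events"] += 1
        bucket["bytes"] += event["bytes"]
    return dict(buckets)


AGGREGATED = aggregate_by_user_hour(RAW_EVENTS)
```

Both of Alice's events fall into the same UTC hour once the offsets are removed, which is the point of normalizing before aggregating.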

Feature Extraction

• Automatic feature definition (“deep learning” style)

• Feature selection

• Dimensionality reduction
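As one simple illustration of the feature selection bullet, the sketch below drops features whose variance does not exceed a threshold. This is a generic technique chosen for brevity, not a method named in the talk, and the feature names and values are made up. It does not cover learned features or projection-based dimensionality reduction such as PCA.

```python
from statistics import pvariance


def select_by_variance(rows, names, threshold=0.0):
    """Keep the features whose population variance exceeds the threshold."""
    columns = list(zip(*rows))
    kept = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    reduced = [[row[i] for i in kept] for row in rows]
    return [names[i] for i in kept], reduced


# Hypothetical feature matrix: one row per session.
NAMES = ["bytes_out_mb", "protocol_version", "session_hours"]
ROWS = [
    [68.0, 1.0, 4.0],
    [70.0, 1.0, 3.5],
    [1024.0, 1.0, 15.0],
]

KEPT_NAMES, REDUCED = select_by_variance(ROWS, NAMES)
```

The constant `protocol_version` column carries no information for separating sessions, so it is removed.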

Detection

• Anomaly detection

• Pattern recognition

• Behavioral-based analysis

• Low-and-slow detection

• …
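Low-and-slow activity is spread thinly over time so that it stays under short-window rate limits. A minimal sketch of one way to catch it, assuming events are `(hour_index, source)` pairs: count per hour and also over the whole observation window. The thresholds and host names are hypothetical.

```python
from collections import Counter


def flag_sources(events, per_hour_limit, per_window_limit):
    """Split noisy sources into bursty ones and low-and-slow ones.

    events is a list of (hour_index, source) pairs.
    """
    per_hour = Counter(events)                      # (hour, source) -> count
    per_window = Counter(src for _, src in events)  # source -> total count
    bursty = {src for (_, src), n in per_hour.items() if n > per_hour_limit}
    slow = {
        src
        for src, n in per_window.items()
        if n > per_window_limit and src not in bursty
    }
    return bursty, slow


# Hypothetical failed-login events over a 48-hour window.
EVENTS = (
    [(0, "host-a")] * 50                                   # one loud burst
    + [(h, "host-b") for h in range(48) for _ in range(2)]  # 2 per hour, all day
    + [(h, "host-c") for h in (1, 5, 9)]                    # occasional noise
)

BURSTY, SLOW = flag_sources(EVENTS, per_hour_limit=10, per_window_limit=60)
```

`host-b` never exceeds the hourly limit, yet its 96 events over the window exceed the long-window limit, so only the long-window count reveals it.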

Alerting

• Contextual alerting

• Prioritization

• Alerts filtering

• Grouping
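Filtering, grouping, and prioritization can be combined in a few lines. The sketch below assumes each alert carries a host, a rule name, and a score in [0, 1]; all field names, values, and the `min_score` cutoff are hypothetical. Alerts below the cutoff are dropped, the rest are grouped per host, and groups are ordered by their highest score.

```python
from collections import defaultdict

# Hypothetical alerts produced by the detection stage.
ALERTS = [
    {"host": "srv-1", "rule": "new_device", "score": 0.90},
    {"host": "srv-1", "rule": "large_upload", "score": 0.93},
    {"host": "pc-7", "rule": "odd_hours", "score": 0.40},
    {"host": "pc-9", "rule": "odd_hours", "score": 0.10},
]


def group_and_prioritize(alerts, min_score=0.2):
    """Filter weak alerts, group by host, rank groups by their top score."""
    groups = defaultdict(list)
    for alert in alerts:
        if alert["score"] >= min_score:           # filtering
            groups[alert["host"]].append(alert)   # grouping
    ranked = sorted(
        groups.items(),
        key=lambda item: max(a["score"] for a in item[1]),
        reverse=True,                             # prioritization
    )
    return [(host, [a["rule"] for a in items]) for host, items in ranked]


RANKED = group_and_prioritize(ALERTS)
```

The analyst then sees one entry per host rather than one entry per alert.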

Investigation

• Prediction of analyst next step

• Recommendation systems

• Crowd sourcing

• Feedback generation (explicit and implicit)
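One simple way to predict an analyst's next step is a first-order Markov model over past investigation sessions. The talk does not say which model is used, so this is only an illustrative sketch, and the action names are invented.

```python
from collections import Counter, defaultdict


def train(sessions):
    """Count how often each action is followed by each other action."""
    transitions = defaultdict(Counter)
    for steps in sessions:
        for current, nxt in zip(steps, steps[1:]):
            transitions[current][nxt] += 1
    return transitions


def predict_next(transitions, current):
    """Return the most frequent follower of the current action, or None."""
    followers = transitions.get(current)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical investigation sessions, one list of actions per session.
SESSIONS = [
    ["open_alert", "view_user", "view_device", "close"],
    ["open_alert", "view_user", "block_user"],
    ["open_alert", "view_domain", "close"],
]

MODEL = train(SESSIONS)
SUGGESTION = predict_next(MODEL, "open_alert")
```

The same transition counts could also serve as implicit feedback: steps that analysts rarely take after an alert are weak recommendations.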

Mitigation

• What-if analysis

• Automatic mitigation

The RSA Risk Engine

[Diagram. Components: Activity details, Risk Engine (Behavior, Device, Fraud), Policy Mgr., Case Mgmt. Outcomes: Continue, or Step-up Authentication (Challenge, Out-of-band, Knowledge, Others) followed by Authenticate. Feedback arrows return to the Risk Engine. Values shown: 271, 937.]

Impersonation Detection

The model learns the user’s behavior from his historical data.

• Country (IN, UAE): the user logs in from the UAE for the first time; he is always located in India. Score: 92

• Device (Device1, Device2, Device3): the user logs in from a new, unrecognized device. Score: 90

• Transmitted data [MB]: the user transmits 1 GB; the user’s average is 68 MB. Score: 93

• Session duration [hours]: the session lasts 15 hours; the user’s average is 4 hours. Score: 82

The final score is an aggregation of the features’ scores. Aggregated score: 98
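The slide does not say how each feature is scored or how the scores are aggregated. The sketch below is therefore only one possible reading: categorical features are scored by rarity in the user's history, numeric features by a squashed z-score, and the final score uses a noisy-OR combination. All three choices are assumptions, the history values are invented, and the output is on a 0 to 1 scale and is not meant to reproduce the numbers on the slide.

```python
from statistics import mean, pstdev


def categorical_score(history, value):
    """Rarity of the value in the user's history, with add-one smoothing."""
    return 1.0 - (history.count(value) + 1) / (len(history) + 2)


def numeric_score(history, value):
    """Squash the z-score against the user's history into [0, 1)."""
    spread = pstdev(history) or 1.0  # avoid dividing by zero on flat history
    z = abs(value - mean(history)) / spread
    return z / (1.0 + z)


def aggregate(scores):
    """Noisy-OR: never lower than the highest single feature score."""
    remaining = 1.0
    for score in scores:
        remaining *= 1.0 - score
    return 1.0 - remaining


# Invented history for one user (average 68 MB and 4 hours per session).
COUNTRIES = ["IN"] * 20
DEVICES = ["Device1", "Device2", "Device3"] * 5
MEGABYTES = [60.0, 70.0, 65.0, 75.0, 70.0]
HOURS = [4.0, 3.5, 4.5, 4.0, 4.0]

SCORES = [
    categorical_score(COUNTRIES, "UAE"),    # first login from a new country
    categorical_score(DEVICES, "Device4"),  # unrecognized device
    numeric_score(MEGABYTES, 1024.0),       # 1 GB transmitted
    numeric_score(HOURS, 15.0),             # 15-hour session
]
FINAL = aggregate(SCORES)
```

With noisy-OR, several moderately unusual features reinforce each other, which matches the slide's pattern of an aggregated score above every individual score.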

Suspicious Domains Detection

[Figure: each vertical line represents one feature.]

How long is the path in the URL?

Was the site reached through a referrer?

Was a cookie used when communicating with the site?

Was the site seen by only a few users?

Was the user agent string suspicious?

Is the transmit-to-receive ratio abnormal?

Risk is calculated across multiple features.

Risk is scored between 0 and 1: 1 = riskiest (red), 0 = normal (green).
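A risk score in [0, 1] computed across multiple features could, for example, be a weighted average of per-feature risks. The slide does not state the combination rule, so the weights, the feature names, and the direction of each feature (for instance, treating a missing referrer as risky) are assumptions made for this sketch.

```python
# Hypothetical weights; the slides do not say how features are combined.
WEIGHTS = {
    "long_url_path": 1.0,
    "no_referrer": 1.0,
    "no_cookie": 1.0,
    "few_users": 2.0,
    "suspicious_user_agent": 2.0,
    "abnormal_tx_rx_ratio": 1.0,
}


def domain_risk(features, weights=WEIGHTS):
    """Weighted average of per-feature risks, each expected in [0, 1].

    Missing features count as 0 (normal). The result is 0 for a domain
    with no risky feature and 1 when every feature is at its riskiest.
    """
    total = sum(weights.values())
    weighted = sum(w * features.get(name, 0.0) for name, w in weights.items())
    return weighted / total


# Hypothetical domain seen by few users and reached without a referrer.
EXAMPLE_RISK = domain_risk({"few_users": 1.0, "no_referrer": 1.0})
```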

Ranking Top Suspicious Domains

68% of the top 50 domains are malicious.

Legend: red = malicious, black = benign
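The 68% figure is a precision-at-k measurement with k = 50, which corresponds to 34 malicious domains among the top 50. A minimal sketch of the metric follows; the domain names and labels are synthetic placeholders, not the talk's data.

```python
def precision_at_k(ranked, malicious, k):
    """Fraction of the top-k ranked items that are labeled malicious."""
    top = ranked[:k]
    if not top:
        raise ValueError("k must be positive and the ranking non-empty")
    return sum(1 for domain in top if domain in malicious) / len(top)


# Synthetic ranking of 100 placeholder domains, riskiest first.
RANKED = [f"domain-{i}.example" for i in range(100)]
# Synthetic labels: 34 of the first 50 are marked malicious.
MALICIOUS = set(RANKED[:34])

P_AT_50 = precision_at_k(RANKED, MALICIOUS, 50)
```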

Data Science is not a position. It’s a group.

Data Gurus, Domain Experts, Machine Learning Researchers

Data Science:

• User Verification
• Device-based risk assessment
• Suspicious Domains Detection
• Suspicious Users Detection
• Alerts Prioritization
• Anomalous Communication Detection
• DNS-based Malware Detection

eMail me: [email protected]

