
Modeling Malware Spreading Dynamics

Date post: 23-Mar-2016
Modeling Malware Spreading Dynamics. Michele Garetto (Politecnico di Torino – Italy), Weibo Gong (University of Massachusetts – Amherst – MA), Don Towsley (University of Massachusetts – Amherst – MA). INFOCOM 2003.
Page 1: Modeling Malware Spreading Dynamics

Modeling Malware Spreading Dynamics

Michele Garetto (Politecnico di Torino – Italy)

Weibo Gong (University of Massachusetts – Amherst – MA)

Don Towsley (University of Massachusetts – Amherst – MA)

INFOCOM 2003

Page 2: Modeling Malware Spreading Dynamics

Outline

- Motivation
- Modeling framework: Interactive Markov Chains
- Analysis and simulation of malware propagation dynamics
  - Percolation problem
  - Transient behavior
- Conclusions

Page 3: Modeling Malware Spreading Dynamics

Motivation

- The Internet is an easy and powerful mechanism for propagating malicious software programs: the “malware” (email viruses, worms, …)
- Future malware activity is expected to be more prevalent and virulent, resulting in significantly greater damage and economic losses

Page 4: Modeling Malware Spreading Dynamics

Motivation

- Dynamics of malware propagation are still not well understood
- We would like to:
  - Predict the temporal evolution of an infection process that starts propagating on a network
  - Design and evaluate effective countermeasures
  - Assess the defensibility and vulnerability of different network architectures
- Need to develop mathematical methodologies that are able to capture the spreading characteristics of malware

Page 5: Modeling Malware Spreading Dynamics

Our contribution

- A flexible modeling framework based on Interactive Markov Chains, able to capture the probabilistic nature of malware propagation
- Application of this framework to the case of email viruses:
  - Identification of a “percolation problem”
  - Investigation of the impact of the underlying network topology
- Analytical bounds and approximations validated through extensive simulations

Page 6: Modeling Malware Spreading Dynamics

The “Interactive Markov Chain” (IMC) Modeling Framework

- Global structure (the network)
- Local structure (can vary from node to node)
- Each node is represented by a Markov chain, whose state transitions are influenced by the status of its neighbors: a global network structure, but locally a Markov chain

[Figure: a network of nodes, each holding a local Markov chain; example local state labels: normal, alert, failed; secure, insecure, emergency]

Page 7: Modeling Malware Spreading Dynamics

Computational complexity issue

- The whole system evolves according to a global Markov chain G, whose state space dimension (#G) is equal to the product of the local chain dimensions (#L): #G = (#L)^N
- The solution of the global Markov chain is feasible only for small systems
- Example: 20 nodes with binary status (0 = not infected, 1 = infected) give 2^20 states!
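The growth of the global state space can be checked with a few lines of arithmetic. This is only an illustrative sketch: the function name `global_state_count` is hypothetical, and it assumes every node has the same number of local states, as in the binary example above.

```python
def global_state_count(local_states: int, nodes: int) -> int:
    """Size of the global chain's state space, #G = (#L)^N,
    assuming all N nodes share the same local dimension #L."""
    return local_states ** nodes


# Binary status per node, for a few network sizes.
for n in (10, 20, 40):
    print(n, global_state_count(2, n))
```

Already at 20 binary nodes the global chain has more than a million states, which is why the slides turn to simulation, bounds, and approximations for networks with thousands of nodes.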

Page 8: Modeling Malware Spreading Dynamics

How can we study very large systems (thousands of nodes)?

- Discrete event simulations of the model
  - How many runs? How long?
  - Could be computationally too expensive
  - Do not help to understand the system dynamics
- Analytical bounds and approximations
  - Quick prediction of the system behavior
  - Gross-level approximations can be sufficient
  - Provide insights into the inner dynamics

Page 9: Modeling Malware Spreading Dynamics

E-mail virus propagation

- The virus propagates as an attachment to e-mail messages
- Requires human assistance:
  - A random time elapses before the recipient reads the message
  - The “click” probability
- The virus makes use of the recipient’s address book to send copies of itself

Page 10: Modeling Malware Spreading Dynamics

IMC model: global structure

- We consider the network graph induced by email address books
  - Each node stands for an email address
  - Edges represent social or business relationships
- The resulting graph is expected to have “small world” properties:
  - Small characteristic path length
  - High clustering coefficient

Page 11: Modeling Malware Spreading Dynamics

IMC model: local structure

- 3 statuses for each node:
  - S (Susceptible): the node can be infected by the virus
  - I (Infected): the node has been infected by the virus
  - M (Immune): the node can no longer be infected by the virus
- Discrete time model, tracking for each node “j”:
  - the probability that node “j” is susceptible at time k
  - the probability that node “j” is infected at time k
  - the probability that node “j” is immune at time k

Page 12: Modeling Malware Spreading Dynamics

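IMC model

[Figure: neighboring nodes i and j joined by an edge with weight w_ij; local chain of node j with states S, I, and M and transition probabilities c_j and 1 - c_j]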
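A minimal simulation sketch of a discrete-time S/I/M model of this kind follows. Several details are assumptions, not taken from the slides: at each step a susceptible node j picks one influencing node i (possibly itself) with probability w_ij; if that node is infected, j becomes infected with its click probability c_j and immune otherwise; and the I and M states are treated as absorbing. The names `step` and `ring_influence` are hypothetical.

```python
import random

S, I, M = "S", "I", "M"


def step(status, influence, click, rng):
    """One synchronous time step of the assumed S/I/M influence model.

    status:    list with the current state of every node
    influence: influence[j] is a list of (i, w_ij) pairs whose weights sum to 1
               (the list may contain j itself, i.e. self-influence)
    click:     click[j] is the click probability c_j of node j
    """
    new_status = list(status)
    for j, state in enumerate(status):
        if state != S:
            continue  # assumption: I and M are absorbing in this sketch
        sources = [i for i, _ in influence[j]]
        weights = [w for _, w in influence[j]]
        i = rng.choices(sources, weights=weights, k=1)[0]
        if status[i] == I:
            # The user opens the message: infected w.p. c_j, immune otherwise.
            new_status[j] = I if rng.random() < click[j] else M
    return new_status


def ring_influence(n, self_weight):
    """Hypothetical topology: each node listens to itself with weight s and
    to its two ring neighbours with equal weight (1 - s) / 2."""
    w = (1.0 - self_weight) / 2.0
    return [[(j, self_weight), ((j - 1) % n, w), ((j + 1) % n, w)]
            for j in range(n)]


rng = random.Random(0)
n = 50
status = [S] * n
status[0] = I  # one initially infected node
influence = ring_influence(n, self_weight=0.2)
click = [0.8] * n
for _ in range(200):
    status = step(status, influence, click, rng)
print(status.count(I), status.count(M), status.count(S))
```

Averaging the infected count over many independent runs gives a simulation estimate of the expected number of infected nodes over time, which is the quantity the analytical bounds in the later slides are compared against.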

Page 13: Modeling Malware Spreading Dynamics

Virus propagation model

The numerical solution of the system requires the joint probabilities of neighboring nodes, which are unknown (marked “?” on the slide).

Page 14: Modeling Malware Spreading Dynamics

Virus propagation model

Fundamental questions about malware propagation dynamics:

- What is the final size of the infection outbreak? How many nodes (on average) will be reached by the virus at the end?
- How fast is the malware propagation? What is the (average) number of infected nodes as a function of time?

Page 15: Modeling Malware Spreading Dynamics

The “small-world” model of Watts and Strogatz

- A ring lattice with additional random shortcuts
- Parameters:
  - N = number of nodes
  - S = number of shortcuts (also expressed as a shortcut density)
  - k = lattice connectivity (number of neighbors on each side of a node)
- Example shown on the slide: N = 24, S = 4, k = 3
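The construction described above (a ring lattice plus added shortcuts) can be sketched as follows, using the slide's example values N = 24, S = 4, k = 3. The function name `small_world` is hypothetical. The sketch also assumes that shortcuts are drawn uniformly at random and that a shortcut duplicating an existing edge is redrawn, a choice the slides do not specify.

```python
import random


def small_world(n, k, shortcuts, rng):
    """Ring lattice where each node links to its k nearest neighbours on each
    side, plus `shortcuts` extra edges between randomly chosen node pairs.
    Requires n > 2 * k so that lattice edges are distinct."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for d in range(1, k + 1):
            u = (v + d) % n
            adj[v].add(u)
            adj[u].add(v)
    added = 0
    while added < shortcuts:
        a, b = rng.sample(range(n), 2)  # two distinct endpoints
        if b not in adj[a]:  # assumption: redraw duplicates of existing edges
            adj[a].add(b)
            adj[b].add(a)
            added += 1
    return adj


graph = small_world(n=24, k=3, shortcuts=4, rng=random.Random(0))
edges = sum(len(nbrs) for nbrs in graph.values()) // 2
print(edges)  # 24 * 3 lattice edges + 4 shortcuts = 76
```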

Page 16: Modeling Malware Spreading Dynamics

What is the final size of the infection outbreak?

- Not all of the susceptible nodes necessarily receive a copy of the virus!
- This is a site percolation problem (node occupation probability = click probability)
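The mapping to site percolation can be illustrated with a small Monte Carlo sketch. Each node is "occupied" (its user would click) independently with probability c, and the outbreak is the set of occupied nodes connected to the initially infected node through occupied nodes. The function name `outbreak_size`, the plain ring used as a test graph, and the run counts are illustrative assumptions, not values from the slides.

```python
import random
from collections import deque


def outbreak_size(adj, seed, c, rng):
    """Size of the infected cluster when every node other than the seed is
    occupied (clicks) independently with probability c."""
    occupied = {v: (v == seed or rng.random() < c) for v in adj}
    seen = {seed}
    queue = deque([seed])
    while queue:  # breadth-first search restricted to occupied nodes
        v = queue.popleft()
        for u in adj[v]:
            if occupied[u] and u not in seen:
                seen.add(u)
                queue.append(u)
    return len(seen)


# Hypothetical test graph: a plain ring of 200 nodes, no shortcuts.
n = 200
ring = {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}
rng = random.Random(0)
runs = 500
for c in (0.2, 0.5, 0.8, 1.0):
    avg = sum(outbreak_size(ring, 0, c, rng) for _ in range(runs)) / runs
    print(c, avg)
```

A node that never clicks blocks every path through it, so for small c the outbreak stays confined near the seed even though the underlying graph is connected.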

Page 17: Modeling Malware Spreading Dynamics

Site percolation problem on the small world graph: exact asymptotic result [Moore, Newman 2000]

[Figure: average number of infected sites (logarithmic axis, 0.01 to 10000) versus click probability c (0 to 0.8), with one curve for each value of the shortcut-density parameter: 0.1, 0.01, 0.001, 0.0001]

Page 18: Modeling Malware Spreading Dynamics

Transient analysis of an infection process: bounds

Probability that a node has been reached by the virus:

- Lower bound: property of joint probabilities
- Upper bound: theory of Associated Random Variables

Page 19: Modeling Malware Spreading Dynamics

Analytical bounds

Infinite one-dimensional lattice (click probability = 1)

[Figure: average number of infected nodes (0 to 8000) versus time (0 to 2500), comparing simulation with the upper and lower bounds for k = 1, k = 10, and k = 100]

Page 20: Modeling Malware Spreading Dynamics

Transient analysis of an infection process: approximation

- Linear mixing of lower bound and upper bound
- k = local connectivity of the nodes
- s = self-influence probability
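One way to read "linear mixing" is as a pointwise convex combination of the two bound curves. The sketch below rests on that assumption: the transcript does not preserve the actual formula, so the function name `mix_bounds`, the convention that the coefficient `m` weights the upper bound, and the sample numbers are all hypothetical.

```python
def mix_bounds(lower, upper, m):
    """Pointwise convex combination of two curves sampled at the same times.
    Assumption: m weights the upper bound and (1 - m) the lower bound."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("mixing coefficient must lie in [0, 1]")
    if len(lower) != len(upper):
        raise ValueError("curves must have the same length")
    return [(1.0 - m) * lo + m * up for lo, up in zip(lower, upper)]


# Made-up bound curves, for illustration only.
lower = [1.0, 2.0, 4.0, 7.0]
upper = [1.0, 4.0, 9.0, 16.0]
print(mix_bounds(lower, upper, m=0.25))
```

In the slides the coefficient is a function M(k, s) of the connectivity and the self-influence probability, fitted against simulation; here `m` is simply passed in as a number.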

Page 21: Modeling Malware Spreading Dynamics

Fitting of mixing coefficient M(k,s)

Infinite one-dimensional lattice

[Figure: mixing coefficient M (0 to 0.7) versus connectivity k (1 to 1000, logarithmic axis), with curves for s = 0, s = 1/3, s = 2/3, and s = 0.9]

Page 22: Modeling Malware Spreading Dynamics

Approximate analysis on the small world graph: the impact of topology

(2000 nodes, click probability = 1)

[Figure: average number of infected nodes (0 to 2000) versus time (0 to 1000), model approximation against simulation, for five topologies: k = 10 with no shortcuts; k = 10 with 2 shortcuts; k ~ geom(10) with no shortcuts; k = 10 with 20 shortcuts; fully-connected graph]

Page 23: Modeling Malware Spreading Dynamics

Combining transient analysis and percolation on general topologies

- Probability not to be reached by the virus = initial immunization
- Overestimate of the spreading rate of the virus
- Upper bound of the reaching probability on general topologies
- Global upper bound for the infection process

Page 24: Modeling Malware Spreading Dynamics

Transient analysis and percolation on power-law random graphs

- 10000 nodes, generated with the GLP algorithm (Bu 2002): power-law node degree, small-world properties
- Click probability = 0.5
- m = initial connectivity
- One initially infected node with degree 10

[Figure: average number of infected nodes (0 to 6000) versus time (0 to 4000), simulation against “model approx + bound perc”, for m = 1, m = 2, and m = 4]

Page 25: Modeling Malware Spreading Dynamics

Conclusions

- We have proposed an analytical framework to study the dynamics of malware propagation on a network
- We have obtained useful bounds and approximations to study an infection process on a general topology
- The approach is suitable for analyzing a wide range of “dynamic interactions on networks” (routing protocols, p2p, …)

Page 26: Modeling Malware Spreading Dynamics


The End

Thanks…

Page 27: Modeling Malware Spreading Dynamics

Site percolation on a given small-world graph

[Figure: reaching probability (0.1 to 1) versus node index (0 to 999), comparing simulation, upper bounds for h = 0 and h = 8, the approximation, and lower bounds for h = 8 and h = 0]

