Post on 04-Aug-2020

Structured models with neural networks

Tomáš Pevný

February 28, 2020

Motivation — identification of infected computers

Examples of messages:

http://lop.guardpair.com/affs?addonname=[Enter%20Product%20Name]&affid=9050&subaffid=5774&subID=undefined&clientuid=undefined&origaffid=9050&origsubaffid=5774&href=http%3A%2F%2F7

http://pf.updatewp.org/?v=3.16&pcrc=308403836&LSVRDT=&ty=CHECK

http://rules.similardeals.net/v1.0/whitelist/1052/9050x5774/7online.subsea7.net?partnerName=Lyrics

Motivation — representation of PE file format

[Figure: structure of a PE executable. Parts shown: PE header, Code, Data, Strings, Imports. The call graph connects functions f1 to f8, the control flow graph connects basic blocks b1 to b8, and basic blocks consist of instructions.]

Single-instance learning

Pipeline: extract features, then send to classifier.

$x = (x_1, \dots, x_d), \qquad f(x) \in \{+1, -1\}$

Icon made by Freepik from www.flaticon.com

Multi-instance learning

Pipeline: extract features, then send to classifier.

$x = \{ (x_{1,1}, \dots, x_{1,d}),\ (x_{2,1}, \dots, x_{2,d}),\ \dots,\ (x_{b,1}, \dots, x_{b,d}) \}, \qquad f(x) \in \{+1, -1\}$

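Our solution to multi-instance learning

Instances: $x_1, x_2, x_3, \dots, x_b \in \mathbb{R}^d$

Instance embedding (one vector per instance): $\tilde{x}_i = h(x_i, \theta_h) \in \mathbb{R}^u$

Aggregation (one vector per sample): $\bar{x} = g(\{\tilde{x}_i\}_{i=1}^{b}, \theta_g) \in \mathbb{R}^u$

Classifier: $f(\bar{x}, \theta_f)$

$h(x, \theta_h) : \mathbb{R}^d \to \mathbb{R}^u, \qquad h(x, \theta_h) = \max\{0, x^T \theta_h\}$

$g(\{\tilde{x}_i\}_{i=1}^{b}) = \frac{1}{b} \sum_{i=1}^{b} \tilde{x}_i$

$f(\bar{x}, \theta_f) : \mathbb{R}^u \to \mathbb{R}^2, \qquad f(\bar{x}, \theta_f) = \bar{x}^T \theta_f$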
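The three stages on this slide (ReLU embedding h of every instance, mean aggregation g, linear classifier f) can be sketched in a few lines of NumPy. This is a minimal illustration only: the dimensions, the random parameters, and the function name `mil_forward` are assumptions made for the example, not values from the talk.

```python
import numpy as np

def mil_forward(bag, theta_h, theta_f):
    """Score one sample (bag) given as a (b, d) array, one row per instance."""
    embedded = np.maximum(0.0, bag @ theta_h)  # h: one vector in R^u per instance
    pooled = embedded.mean(axis=0)             # g: one vector in R^u per sample
    return pooled @ theta_f                    # f: linear map R^u -> R^2

# Hypothetical sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
d, u = 4, 3
theta_h = rng.normal(size=(d, u))
theta_f = rng.normal(size=(u, 2))

small_bag = rng.normal(size=(2, d))  # a sample with b = 2 instances
large_bag = rng.normal(size=(7, d))  # a sample with b = 7 instances
scores_small = mil_forward(small_bag, theta_h, theta_f)
scores_large = mil_forward(large_bag, theta_h, theta_f)
```

Because the aggregation is a mean, bags of different sizes map to vectors of the same size, and reordering the instances inside a bag does not change the output.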

Generalized multi-instance learning

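Pipeline: extract features, then send to classifier.

A sample $x$ consists of one feature vector and two bags of instances:

$x^{(1)} = (x^{(1)}_1, \dots, x^{(1)}_{d_1})$

$x^{(2)} = \{ (x^{(2)}_{1,1}, \dots, x^{(2)}_{1,d_2}),\ (x^{(2)}_{2,1}, \dots, x^{(2)}_{2,d_2}),\ \dots,\ (x^{(2)}_{b,1}, \dots, x^{(2)}_{b,d_2}) \}$

$x^{(3)} = \{ (x^{(3)}_{1,1}, \dots, x^{(3)}_{1,d_3}),\ (x^{(3)}_{2,1}, \dots, x^{(3)}_{2,d_3}),\ \dots,\ (x^{(3)}_{c,1}, \dots, x^{(3)}_{c,d_3}) \}$

$x = (x^{(1)}, x^{(2)}, x^{(3)}), \qquad f(x) \in \{+1, -1\}$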
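One plausible way to handle such a sample, sketched below, is to embed each bag separately with its own ReLU layer and mean, concatenate the results with the plain feature vector, and classify the joint vector. The slide does not spell out this construction, so the concatenation step, the parameter names (`theta2`, `theta3`, `theta_f`), and all sizes are assumptions for illustration.

```python
import numpy as np

def embed_bag(bag, theta):
    """ReLU layer applied to every instance, followed by the mean over instances."""
    return np.maximum(0.0, bag @ theta).mean(axis=0)

def generalized_forward(sample, params):
    x1, bag2, bag3 = sample
    joint = np.concatenate([
        x1,                                 # plain feature vector in R^{d1}, used as is
        embed_bag(bag2, params["theta2"]),  # bag of b instances in R^{d2}
        embed_bag(bag3, params["theta3"]),  # bag of c instances in R^{d3}
    ])
    return joint @ params["theta_f"]        # linear classifier on the joint vector

# Hypothetical sizes and random parameters, for illustration only.
rng = np.random.default_rng(1)
d1, d2, d3, u = 3, 4, 5, 2
params = {
    "theta2": rng.normal(size=(d2, u)),
    "theta3": rng.normal(size=(d3, u)),
    "theta_f": rng.normal(size=(d1 + 2 * u, 2)),
}
sample = (rng.normal(size=d1), rng.normal(size=(6, d2)), rng.normal(size=(2, d3)))
scores = generalized_forward(sample, params)
```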

Extension of universal approximation theorem

Let $\mathcal{S}$ be the class of spaces which
1. contains all compact subsets of $\mathbb{R}^d$, $d \in \mathbb{N}$,
2. is closed under finite cartesian products,
3. for each $\mathcal{X} \in \mathcal{S}$ we have $\mathcal{P}(\mathcal{X}) \in \mathcal{S}$ (here we assume that $\mathcal{P}(\mathcal{X})$ is endowed with some metric).

Then for each $\mathcal{X} \in \mathcal{S}$, every continuous function on $\mathcal{X}$ can be arbitrarily well approximated by neural networks.

Identification of infected computers

Detection of infected users — model

[Figure: nested model of a user. Sequences $d_1 \dots d_n$ (node D), $p_1 \dots p_n$ (node P), and key/value sequences $k_1 \dots k_n$, $v_1 \dots v_n$ (node KV, giving $q_1 \dots q_n$, node Q) are combined in node U into URL vectors $u^{(1)}, \dots, u^{(6)}$. URL vectors are aggregated in SLD nodes into $s^{(1)}, s^{(2)}, s^{(3)}$, which are aggregated in the user node into $x_1 \dots x_p$. The classifier $f$ maps this vector to the output $y$.]
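The model above appears to treat a URL as three parts: domain (D), path (P), and query key/value pairs (KV). A minimal sketch of such a decomposition with Python's standard library follows. The exact splitting used in the talk is not stated, so the tokenization on `.` and `/` and the dictionary keys are assumptions; the URL is one of the example messages from the motivation slide.

```python
from urllib.parse import urlsplit, parse_qsl

def decompose_url(url):
    """Split a URL into domain tokens, path tokens, and query key/value pairs."""
    parts = urlsplit(url)
    return {
        "domain": parts.hostname.split("."),                        # assumed: split on dots
        "path": [token for token in parts.path.split("/") if token],  # assumed: split on slashes
        "query": parse_qsl(parts.query, keep_blank_values=True),     # keeps empty values such as LSVRDT=
    }

example = decompose_url(
    "http://pf.updatewp.org/?v=3.16&pcrc=308403836&LSVRDT=&ty=CHECK"
)
```

Each of the three lists has a variable length, which is what makes a bag (multi-instance) representation a natural fit.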

Detection of infected users — results

- Training data from 5-30.10.2017: $3.6 \cdot 10^9$ URLs / $4.45 \cdot 10^6$ users

- Testing data from 3.11.2017: $2 \cdot 10^8$ URLs / $1.5 \cdot 10^6$ users

[Figure: precision and recall curves, both axes from 0 to 1, for Convolution [1], R. Forests [2], MIL - URL, and MIL - 5 min.]

[1] eXpose: A character-level convolutional neural network with embeddings for detecting malicious URLs, file paths and registry keys, J. Saxe and K. Berlin, 2017

[2] Learning detectors of malicious web requests for intrusion detection in network traffic, L. Machlica, K. Bartos, and M. Sofka, 2017

Static analysis of malware binaries — PE file format

[Figure: structure of a PE executable. Parts shown: PE header, Code, Data, Strings, Imports. The call graph connects functions f1 to f8, the control flow graph connects basic blocks b1 to b8, and basic blocks consist of instructions.]

Static analysis of malware binaries — model of a function

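Stages: disassembly output, tokenization, block representation, n-grams, MIL NN.

Disassembly output (three basic blocks):

1. push reg; mov reg reg; mov reg mem; dec reg; jne num
2. mov reg mem; push reg; mov mem reg; call KERNEL32!Disable...
3. xor reg reg; inc reg; pop reg; retn num

[Figure: each block is tokenized into a sequence of integer token IDs; n-grams computed from the token sequences give block vectors x1, x2, x3, which are passed as a bag to the MIL NN.]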
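The tokenization and n-gram stages can be sketched as follows. The vocabulary construction (IDs assigned in order of first appearance), the choice of bigrams, and the function names are assumptions for illustration; the talk does not give the actual token IDs or n-gram order. The instruction strings are copied from the slide, including the elided call target.

```python
from collections import Counter

blocks = [
    ["push reg", "mov reg reg", "mov reg mem", "dec reg", "jne num"],
    ["mov reg mem", "push reg", "mov mem reg", "call KERNEL32!Disable..."],
    ["xor reg reg", "inc reg", "pop reg", "retn num"],
]

def build_vocabulary(blocks):
    """Assign an integer ID to each distinct instruction, in order of first appearance."""
    vocab = {}
    for block in blocks:
        for instruction in block:
            vocab.setdefault(instruction, len(vocab))
    return vocab

def ngram_counts(token_ids, n=2):
    """Count consecutive n-grams of token IDs within one block."""
    return Counter(zip(*(token_ids[i:] for i in range(n))))

vocab = build_vocabulary(blocks)
tokenized = [[vocab[instruction] for instruction in block] for block in blocks]
block_ngrams = [ngram_counts(ids, n=2) for ids in tokenized]
```

Each entry of `block_ngrams` is a sparse count vector for one basic block; the list of them plays the role of the bag x1, x2, x3.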

Static analysis of malware binaries — model of a binary

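[Figure: model of a binary. Block vectors $x_1, \dots, x_n$ each pass through the Block NN (map), are grouped by function $f_1, \dots, f_m$ (groupby), and are pooled with meanmax (reduce). Each pooled vector passes through the Function NN, giving $y_1, \dots, y_m$ (map); these are pooled with meanmax over the whole binary (groupby, reduce) and passed to the Binary NN.]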
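The map, groupby, reduce pattern from the figure can be sketched in NumPy as below. This assumes that "meanmax" means the concatenation of the element-wise mean and maximum, that each NN is a single ReLU layer, and that the Binary NN is linear; the parameter names and sizes are likewise hypothetical.

```python
import numpy as np

def dense_relu(x, theta):
    return np.maximum(0.0, x @ theta)

def meanmax(rows):
    """Pool a (k, u) array into one vector of size 2u: mean and max, concatenated."""
    return np.concatenate([rows.mean(axis=0), rows.max(axis=0)])

def pool_groups(rows, group_ids, n_groups):
    """groupby + reduce: pool the rows of each group (every group must be non-empty)."""
    return np.stack([meanmax(rows[group_ids == g]) for g in range(n_groups)])

def binary_forward(blocks, block_to_function, n_functions, p):
    block_emb = dense_relu(blocks, p["block"])                          # map: Block NN
    functions = pool_groups(block_emb, block_to_function, n_functions)  # groupby + reduce
    function_emb = dense_relu(functions, p["function"])                 # map: Function NN
    binary = meanmax(function_emb)                                      # reduce over all functions
    return binary @ p["binary"]                                         # Binary NN (assumed linear)

# Hypothetical sizes: 6 blocks in 3 functions, as in blocks x1..x3, x4..x5, xn.
rng = np.random.default_rng(2)
d, u = 5, 4
p = {
    "block": rng.normal(size=(d, u)),
    "function": rng.normal(size=(2 * u, u)),
    "binary": rng.normal(size=(2 * u, 2)),
}
blocks = rng.normal(size=(6, d))
block_to_function = np.array([0, 0, 0, 1, 1, 2])
scores = binary_forward(blocks, block_to_function, 3, p)
```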

Static analysis of malware binaries — results

- Training data: $4 \cdot 10^5$ binaries

- Testing data: $5.5 \cdot 10^5$ binaries

- Approx. 10% were malicious

Static analysis of malware binaries — feedback

- avoids specifying imports in the PE header

- loads addresses of library functions to an array

Static analysis of malware binaries — feedback

- adware-related behavior
- creating both visible and invisible windows

Future directions

- Learning representation of hierarchical data
- Learning distances for clustering
- Generative models
- Anomaly / few-shot learning

- Game theory in security
- Finding new attacks under constraints

- Decentralized learning
- Modeling and reasoning over relational data

Modeling the internet

https://www.threatcrowd.org/domain.php?domain=ibuyitttttttttttttttttttttttttttttttttttibuyit.com

Materials

- https://github.com/pevnak/Mill.jl
- https://github.com/pevnak/JsonGrinder.jl

- Discriminative models for multi-instance problems with tree-structure, Tomáš Pevný, Petr Somol, 2016

- Using Neural Network Formalism to Solve Multiple-Instance Problems, Tomáš Pevný, Petr Somol, 2016

- Approximation capability of neural networks on sets of probability measures and tree-structured data, Tomáš Pevný, Vojtěch Kovařík, 2019

- Nested Multiple Instance Learning in Modelling of HTTP network traffic, Tomáš Pevný, Marek Dědič, 2020