Date post: 04-Oct-2020
A General Approach for Efficiently Accelerating Software-based Dynamic Data Flow Tracking on Commodity Hardware

Kangkook Jee, Columbia University

Joint work with Georgios Portokalidis (1), Vasileios Kemerlis (1), Soumyadeep Ghosh (2), David August (2), Angelos Keromytis (1)

(1) Columbia University, (2) Princeton University

Page 2: Talk Outline

•  Data Flow Tracking [VEE 12]
   –  Popular subject of security research
   –  Tagging and tracking of interesting data

•  Taint Flow Algebra (TFA) [NDSS 12]
   –  Intermediate Representation (IR) for DFT
   –  Compiler optimization + DFT-specific optimization

•  ShadowReplica
   –  Decoupling of execution and monitoring
   –  Running each on a different core


Page 4: Data Flow Tracking (DFT)

•  A great security tool with many applications
   –  Tag input data and track them
   –  Software exploits, information misuse or leakage, malware analysis, …

•  Implementation approaches
   –  Hardware assisted: Raksha, RIFLE, …
   –  Source code based: GIFT, …
   –  Binary only: TaintCheck, Dytan, Minemu, libdft, …

Binary-only DFT: most promising, but too slow!

Page 5: DFT: Basic Aspects

•  DFT is characterized by three aspects:

(1)  Data sources: program or memory locations where data of interest enter the system and are subsequently tagged

(2)  Data tracking: the process of propagating data tags according to the program's semantics

(3)  Data sinks: program or memory locations where checks for "tagged" data can be made

[Figure: input from file, network, or keyboard is tagged at the source (1); tags are propagated through shadow memory as instructions such as "mov eax, [ebx]" and "mov [esi], eax" execute (2); output to file or network is checked at the sink (3).]
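The three aspects can be illustrated with a toy model. This is a minimal sketch, not libdft's design: the one-tag-per-address dictionaries, the function names, and the addresses are all assumptions made for illustration.

```python
# Minimal DFT model (illustrative assumption): one boolean tag per address.
shadow = {}   # address -> True if the byte is tagged
memory = {}   # address -> value


def source_read(addr, value):
    """(1) Data source: store the input and tag it."""
    memory[addr] = value
    shadow[addr] = True


def copy(dst, src):
    """(2) Data tracking: the tag follows the data."""
    memory[dst] = memory.get(src, 0)
    shadow[dst] = shadow.get(src, False)


def sink_check(addr):
    """(3) Data sink: report whether tagged data reaches the output."""
    return shadow.get(addr, False)


memory[0x200] = 7            # untagged, program-internal data
source_read(0x100, 0x41)     # e.g. a byte read from the network
copy(0x300, 0x100)           # tagged data propagates
copy(0x400, 0x200)           # untagged data stays untagged
```

A check at the sink then flags 0x300 but not 0x400.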

Pages 6-12: DFT Operation

•  Real memory = address space + register context
•  Shadow memory tracks the metadata (tag) updates

•  Memory copy statement from the original execution:

    dst[idx1] = src[idx0];

•  Corresponding shadow memory update:

    t(dst[idx1]) = t(src[idx0]);

•  The original operation translated into machine code; it requires an intermediate register (reg0):

    mov reg0 ← [src+idx0]
    mov [dst+idx1] ← reg0

•  Instruction-level instrumentation implements the shadow update, one instrumentation unit per original instruction:

    mov reg0 ← [src+idx0]
        mov reg0 ← [t(src+idx0)]
        mov [t(reg0)] ← reg0
    mov [dst+idx1] ← reg0
        mov reg0 ← [t(reg0)]
        mov [t(dst+idx1)] ← reg0

•  2 original instructions + 4 tracking instructions
•  2 instrumentation units
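The instruction count can be reproduced with a small model of the two instrumentation units. This is a sketch under assumptions: the dictionaries, the helper names `track_load` and `track_store`, and the base addresses are hypothetical and do not come from libdft.

```python
# Shadow state (illustrative assumption): one tag per register and per address.
reg_tag = {"reg0": 0}
mem_tag = {}
tracking_ops = 0      # number of executed tracking instructions


def track_load(reg, addr):
    """Unit for `mov reg <- [addr]`: two tracking instructions."""
    global tracking_ops
    tmp = mem_tag.get(addr, 0)    # mov tmp <- [t(addr)]
    reg_tag[reg] = tmp            # mov [t(reg)] <- tmp
    tracking_ops += 2


def track_store(addr, reg):
    """Unit for `mov [addr] <- reg`: two tracking instructions."""
    global tracking_ops
    tmp = reg_tag[reg]            # mov tmp <- [t(reg)]
    mem_tag[addr] = tmp           # mov [t(addr)] <- tmp
    tracking_ops += 2


SRC, DST = 0x1000, 0x2000         # hypothetical base addresses
idx0, idx1 = 4, 8
mem_tag[SRC + idx0] = 1           # src[idx0] is tagged

track_load("reg0", SRC + idx0)    # unit 1: mov reg0 <- [src+idx0]
track_store(DST + idx1, "reg0")   # unit 2: mov [dst+idx1] <- reg0
```

After both units run, the tag of src[idx0] has reached dst[idx1] at the cost of four tracking instructions.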

Page 13: Why So Slow?

•  Framework cost (virtualization cost)
   –  DBI, hypervisor instrumentation

•  DFT cost
   –  Accesses to shadow storage

•  Naïve implementation
   –  No understanding of global context
   –  No understanding of DFT semantics

Page 14: Talk Outline

•  Data Flow Tracking
   –  Popular subject of security research
   –  Tagging and tracking of interesting data

•  Taint Flow Algebra (TFA)
   –  Intermediate Representation (IR) for DFT
   –  Compiler optimization + DFT-specific optimization

•  ShadowReplica
   –  Decoupling of execution and monitoring
   –  Running each on a different core

Page 15: Taint Flow Algebra

•  Application-specific analysis
•  DFT-specific analysis
•  Integrated with libdft
   –  High-performance DFT tool [VEE 2012]
      •  1.46x ~ 8x slowdown (over native execution)
   –  Designed for use with the Pin DBI framework
   –  Open source
      •  http://www.cs.columbia.edu/~vpk/research/libdft

Pages 16-19: Optimizing DFT

Original instructions, each followed by its instrumentation unit:

    mov reg0 ← [src+idx0]                 (original)
        mov reg0 ← [t(src+idx0)]          (instrumentation)
        mov [t(reg0)] ← reg0
    mov [dst+idx1] ← reg0                 (original)
        mov reg0 ← [t(reg0)]              (instrumentation)
        mov [t(dst+idx1)] ← reg0

•  Each instrumentation unit requires head/tail instructions
•  t(*): shadow memory access cost
•  Instrumentation units are relocatable

The two units merged into one:

    mov reg0 ← [src+idx0]
    mov [dst+idx1] ← reg0
        mov reg0 ← [t(src+idx0)]
        mov [t(reg0)] ← reg0
        mov reg0 ← [t(reg0)]
        mov [t(dst+idx1)] ← reg0

•  Fewer instrumentation units (2 → 1)
•  Fewer tracking instructions (4 → 2)
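One way to see how four tracking instructions can shrink to two: once the units are merged, the store to t(reg0) is immediately reloaded, so both can be dropped if the register's tag is not needed afterwards. The sketch below is a simple peephole pass written for illustration. It is not the TFA optimizer, and the assumption that t(reg0) is dead after the unit is supplied by hand.

```python
# Each tracking instruction is (destination, source); "tmp" is a scratch register.
merged_unit = [
    ("tmp", "t(src+idx0)"),
    ("t(reg0)", "tmp"),
    ("tmp", "t(reg0)"),
    ("t(dst+idx1)", "tmp"),
]


def drop_store_reload(unit, dead_after):
    """Remove a store that is immediately reloaded into the register it came
    from, provided the stored location is dead after the unit."""
    out = []
    i = 0
    while i < len(unit):
        if i + 1 < len(unit):
            (dst, src), (dst2, src2) = unit[i], unit[i + 1]
            if src2 == dst and dst2 == src and dst in dead_after:
                i += 2          # store + reload cancel out
                continue
        out.append(unit[i])
        i += 1
    return out


# Assumption for this example: reg0's tag is never read after the unit.
optimized = drop_store_reload(merged_unit, dead_after={"t(reg0)"})
```

With that assumption only the load of t(src+idx0) and the store to t(dst+idx1) remain; without it the pass leaves the unit unchanged.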

Page 20: Execution Model

•  3 components
   –  Profiler, Analyzer, DFT Runtime

•  Static/offline analysis + dynamic runtime
   –  Feedback loop

[Figure: feedback loop between the static profiler, the dynamic profiler, the Analyzer, and the DFT Runtime. Edge labels: basic blocks, control flow information, optimized data tracking, unprocessed basic blocks.]

Page 21: Analyzer

•  Taint Flow Algebra
   –  Represents the binary analysis result
   –  IR tailored to capture DFT semantics

•  Compiler optimizations applied to TFA
   –  Inner (intra) basic block: dead code elimination, algebraic simplification, …
   –  Outer (inter) basic block: data flow analysis

•  DFT-specific considerations
   –  Valid location for each instrumentation unit
   –  Number of instrumentation units

Pages 22-24: TFA Optimization

•  Per basic block analysis
•  Gray instructions: non-tracking instructions (the rows left empty in column (b))
•  Translated into TFA
•  Input operands, output operands
•  Output operands are expressed in terms of input operands
•  Data flow analysis to remove irrelevant outputs

     (a) x86 instruction            (b) TFA transformation         (c) TFA optimization
1:   mov ecx, esi                   ecx1 := esi0                   ecx1 := esi0
2:   movzxb eax, al                 eax1 := 0x1 & eax0
3:   shl ecx, 0x5
4:   add edx, 0x1
5:   lea esi, ptr [ecx+esi]         esi1 := ecx1 | esi0
6:   lea esi, ptr [eax+esi]         esi2 := eax1 | esi1            esi2 := 0x1 & eax0 | esi0
7:   movzxb eax, ptr [edx+esi]      eax2 := 0x1 & [edx0+esi2]      eax2 := 0x1 & [edx0+esi2]
8:   testb al, al
9:   jnzb 0xb7890200
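The step from the TFA transformation to the optimized form can be sketched as substitution followed by removal of overwritten versions. The encoding below (a list of `(mask, operand)` terms per statement) and the function names are assumptions made for this example; they are not the TFA implementation.

```python
import re

# One TFA statement per entry: destination := OR of (mask, operand) terms.
# A mask of None means the operand's tag is used unmasked.
block = [
    ("ecx1", [(None, "esi0")]),
    ("eax1", [(0x1, "eax0")]),
    ("esi1", [(None, "ecx1"), (None, "esi0")]),
    ("esi2", [(None, "eax1"), (None, "esi1")]),
    ("eax2", [(0x1, "[edx0+esi2]")]),
]


def combine(outer, inner):
    """AND two optional masks together."""
    if outer is None:
        return inner
    if inner is None:
        return outer
    return outer & inner


def optimize(stmts):
    expanded = {}                     # destination -> set of leaf terms
    for dst, terms in stmts:
        leaves = set()
        for mask, operand in terms:
            if operand in expanded:   # substitute an earlier definition
                leaves |= {(combine(mask, m), op) for m, op in expanded[operand]}
            else:                     # block input or memory operand
                leaves.add((mask, operand))
        expanded[dst] = leaves
    # Keep only the last version of each register; earlier ones are overwritten.
    last = {}
    for dst, _ in stmts:
        last[re.sub(r"\d+$", "", dst)] = dst
    return {dst: expanded[dst] for dst in last.values()}


result = optimize(block)
```

Here `result` keeps only ecx1, esi2 and eax2, with esi2 reduced to the two leaf terms `0x1 & eax0` and `esi0`. Using sets makes the duplicate `esi0 | esi0` collapse on its own.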

Page 25: TFA Optimization

•  DAG representation
•  Express root nodes in terms of leaf nodes

[Figure: DAG representation of the basic block from (a) x86 instruction. Register-version nodes: ecx1, eax1, esi1, esi2, eax2; leaf nodes: eax0, esi0, 0x1, [edx0+esi2]; operator nodes: & and |.]

Page 26: DFT Runtime

•  Generate and inject optimized tracking code into the baseline DFT platform
   –  Translate the optimized TFA

•  Our prototype extends libdft
•  Code generation of libdft/Pin-aware C code
   –  One function per instrumentation unit
   –  e.g., Firefox: 50K customized functions

Page 27: Evaluation

•  Optimization schemes
   –  Code reduction: simple dead code elimination
      •  Inner, Outer
   –  Code generation: optimized tracking code
      •  TFA Scatter, TFA Aggregation

Category          Optimization scheme   CFG consideration   TFA optimization   Aggregation
Code reduction    Inner                 No                  No                 No
                  Outer                 Yes                 No                 No
Code generation   Scatter               Yes                 Yes                No
                  Aggregation           Yes                 Yes                Yes

Page 28: Evaluation: SPEC CPU2000

•  CPU-intensive workloads
•  TFA's speedup over libdft: 1.90x on average (2.23x at most)
•  ~3x slowdown over native execution

[Figure: normalized slowdown (y-axis, 0 to 12) for crafty, eon, gap, gcc, mcf, parser, perlbmk, twolf, vortex, vpr, and the average; series: libdft, Inner, Outer, TFA scatter, TFA aggr.]

Page 29: Evaluation: Server Applications

•  MySQL's own benchmark suite (sql-bench) and PHP micro benchmark suite (PHPBench)
   –  Plotted representative subsets

[Figure (a) MySQL: normalized slowdown (y-axis, 1 to 7) per test suite: create, alter, insert, ATIS; series: libdft, Inner, Outer, TFA scatter, TFA aggr.]

[Figure (b) PHP: normalized slowdown (y-axis, 4 to 28) per PHPBench benchmark: casing, md5, sha1, average; series: libdft, Inner, Outer, TFA scatter, TFA aggr.]

Page 30: Evaluation: Client Applications

•  Rendering measurement for Alexa's Top 500 sites and the NDSS 2012 site
   –  For the Firefox web browser

•  Dromaeo (http://www.dromaeo.com) JavaScript benchmark suite
   –  For the Firefox and Google Chrome web browsers

[Figure (a) Web site rendering: normalized slowdown (y-axis, 1 to 11) per web site: Gmail, NDSS, Youtube, Facebook; series: libdft, Inner, Outer, TFA scatter, TFA aggr.]

[Figure (b) Javascript: normalized slowdown (y-axis, 6 to 18) per browser: firefox, chrome; series: libdft, Inner, Outer, TFA scatter, TFA aggr.]

Page 31: Talk Outline

•  Data Flow Tracking [VEE 12]
   –  Popular subject of security research
   –  Tagging and tracking of interesting data

•  Taint Flow Algebra (TFA) [NDSS 12]
   –  Intermediate Representation (IR) for DFT
   –  Compiler optimization + DFT-specific optimization

•  ShadowReplica
   –  Decoupling of execution and monitoring
   –  Running each on a different core

Page 32: ShadowReplica: Parallelized DFT

•  Basic idea
   –  Offload the tracking overhead to a separate core

•  Challenges
   –  Communication cost is too high
   –  Synchronization

•  Approach
   –  Minimize the overhead on the primary
   –  Maximize the utilization of the secondary

[Figure: the application collects events and enqueues them into a ring buffer; the analysis side dequeues and analyzes them.]
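The enqueue/dequeue structure can be sketched with two threads and a bounded queue. This is only a model of the decoupling, under clear assumptions: Python's `queue.Queue` stands in for the ring buffer, the block IDs are made up, and CPython threads do not actually run on separate cores.

```python
import queue
import threading

ring = queue.Queue(maxsize=1024)   # bounded FIFO standing in for the ring buffer
DONE = object()                    # sentinel marking the end of the trace
analyzed = []


def primary(blocks):
    """Application side: collect events and enqueue them."""
    for block_id in blocks:
        ring.put(block_id)         # blocks while the buffer is full
    ring.put(DONE)


def secondary():
    """Analysis side: dequeue events and analyze them."""
    while True:
        event = ring.get()
        if event is DONE:
            break
        analyzed.append(event)     # placeholder for the real tracking logic


trace = [0x1234, 0x1240, 0x1234]   # hypothetical block trace
worker = threading.Thread(target=secondary)
worker.start()
primary(trace)
worker.join()
```

The bounded buffer makes the synchronization cost visible: whenever the secondary falls behind, `ring.put` stalls the primary.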

Page 33: ShadowReplica: Execution Model

•  The block 0x1234
   –  Executed on the primary
   –  Monitored on the secondary

The primary (application: collect events, enqueue) writes to the ring buffer:

    Block ID: 0x1234

The secondary (dequeue, analyze) runs the tracking code for that block:

    == Block ID: 0x1234 ==
    ecx1 := esi0
    esi1 := [ebp0 - 0x4]
    edx1 := [ebp0 - 0x14]
    esi2 := 0x1 & eax0 | esi1
    eax2 := 0x1 & [edx1 + esi2]

Page 34: ShadowReplica: Runtime Information

•  The primary enqueues
   –  Effective memory addresses
   –  Block identifiers

The primary writes to the ring buffer:

    Block ID: 0x1234, [ebp0 - 0x4], [ebp0 - 0x14], [edx1 + esi2]

The secondary resolves the memory operands of the block's tracking code from the enqueued effective addresses:

    == Block ID: 0x1234 ==
    ecx1 := esi0
    esi1 := [ebp0 - 0x4]
    edx1 := [ebp0 - 0x14]
    esi2 := 0x1 & eax0 | esi1
    eax2 := 0x1 & [edx1 + esi2]
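How the secondary might consume such a record can be sketched as a per-block handler that receives the enqueued effective addresses. The record layout, the handler table, and the concrete addresses below are hypothetical; only the tag equations are taken from block 0x1234.

```python
reg_tag = {"eax": 0, "ecx": 0, "edx": 0, "esi": 0}
mem_tag = {}


def block_0x1234(eas):
    """Tracking code for block 0x1234; `eas` holds the three enqueued addresses."""
    ea_a, ea_b, ea_c = eas
    eax0, esi0 = reg_tag["eax"], reg_tag["esi"]
    ecx1 = esi0
    esi1 = mem_tag.get(ea_a, 0)            # [ebp0 - 0x4]
    edx1 = mem_tag.get(ea_b, 0)            # [ebp0 - 0x14]
    esi2 = (0x1 & eax0) | esi1
    eax2 = 0x1 & mem_tag.get(ea_c, 0)      # [edx1 + esi2]
    reg_tag.update(ecx=ecx1, edx=edx1, esi=esi2, eax=eax2)


HANDLERS = {0x1234: block_0x1234}          # hypothetical dispatch table


def replay(record):
    """Dispatch one dequeued record: (block_id, ea, ea, ...)."""
    block_id, *eas = record
    HANDLERS[block_id](eas)


mem_tag[0xBFFF0FFC] = 1   # assumption: the word at [ebp0 - 0x4] is tagged
replay((0x1234, 0xBFFF0FFC, 0xBFFF0FEC, 0x0804A010))
```

The secondary never needs the primary's register values in this model, only the addresses that it cannot compute itself.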


Pages 36-37: Optimizing the Primary: Effective Addresses

•  Intra-block optimization
   –  Linear lower bound
      •  3 register variables: eax0, eax1, ebx0
      •  4 memory operands: one redundancy

    Block ID: 0x1234
    [eax0 + 1 × ebx0]
    [eax1 + 2 × ebx0]
    [ebx0 + 4 × eax1]
    [eax0 + 2 × ebx0]

•  Inter-block optimization
   –  Data flow analysis
   –  Use-def equation based on inferability*

[Figure: control flow among BB0, BB1, and BB2, annotated with the effective addresses [eax0 + ebx0], [eax^ + ebx^], [eax^], and [ebx^].]

*  '^' indicates the top-most version of the register variable in the block
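One reading of the "one redundancy" remark is linear dependence: four addresses built from three register variables cannot all be independent, so at most three values need to be enqueued. The sketch below checks this with Gaussian elimination over the rationals. Treating the linear lower bound as a rank computation is an assumption of this example, not a statement of how ShadowReplica computes it.

```python
from fractions import Fraction

# Coefficients of (eax0, eax1, ebx0) in each effective address.
operands = [
    (1, 0, 1),   # [eax0 + 1 × ebx0]
    (0, 1, 2),   # [eax1 + 2 × ebx0]
    (0, 4, 1),   # [ebx0 + 4 × eax1]
    (1, 0, 2),   # [eax0 + 2 × ebx0]
]


def rank(rows):
    """Rank of a small matrix via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


independent = rank(operands)             # addresses that must be enqueued
redundant = len(operands) - independent  # addresses the secondary can derive
```

For these four operands the rank equals the number of register variables, leaving one operand derivable from the others.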

Page 38: Optimizing the Primary: Block Identifiers

•  Minimize the block ID trace by leveraging
   –  Control flow graph (CFG)
   –  Execution count

•  Only Block 4 requires instrumentation
   –  1 instrumentation & 1 execution
   –  The rest can be restored on the secondary

[Figure: control flow graph of Block 0 through Block 4, annotated with execution counts 10000, 10000, 9999, and 1.]


Page 40: Preliminary Evaluation: Bzip2 Compression

•  The primary performs Bzip2 compression against a Linux kernel
•  The cost of dumping EAs and the block trace from the primary
•  Instrumentation for the primary is implemented using the Pin DBI framework

•  NO_OPT: all EAs and BB traces
•  INTRA: intra-block optimization
•  INTER: intra- + inter-block optimization

[Figure: slowdown (y-axis, 1 to 7) per implementation: libdft, TFA, NO_OPT, INTRA, INTER, NULL_PIN. Bar labels as extracted, order not guaranteed: 4.11x, 7.73x, 3.45x, 2.5x, 2.2x, 1.27x.]

Page 41: Candidate Analyses for ShadowReplica

Analysis               Shadow memory   Data dependency   Checking
Data Flow Tracking     ✓               ✓                 ✓
MemCheck               ✓               ✓                 ✓
LockCheck              ✓               ✗                 ✓
Method Counting        ✗               ✗                 ✗
Call Graph Profiling   ✗               ✗                 ✗
Path Profiling         ✗               ✗                 ✗
Cache Simulation       ✓               ✗                 ✗

•  Each analysis has different performance implications
   –  Instrumentation frequencies, size of analysis routines
•  Data dependency across multiple updates inhibits the parallelization of analysis
•  Checking operations require synchronization between an application and an analysis

Page 42: Conclusion

•  Current binary-only DFT implementations are suboptimal
   –  No consideration for DFT semantics
   –  No consideration for global context

•  Proposed a novel scheme that
   –  Combines static and dynamic analysis
   –  Segregates execution and tracking logic

•  Huge speedups for real-world applications
   –  TFA: ~2x
   –  ShadowReplica: constant upper bound on the overhead

