CSTalks-Visualizing Software Behavior-14Sep

Visualizing Software Behavior Wu Yongzheng

14/Sep/2011 NUS SoC CSTalks 1

Problems

• Software is complex
  – Large codebase
  – Interaction between components
  – Components from different vendors
  – Closed source, closed API

• Why understand software?
  – As developer => fewer bugs
  – As administrator => diagnosis
  – Curiosity?

• An execution trace contains software behavior information, but it’s huge.


Software Traces

• Types of traces
  – Instruction trace: records machine instructions
  – Call trace: records function calls
  – System call trace: records system calls
  – Software logs: important events

• System trace
  – System call trace from all processes
  – Mainly resource usage, system & process interaction


WinResMon

• WinResMon: our trace recorder.
• Works on Windows
• Types of events:
  – File: open, read, write, close, rename, …
  – Registry: open, get value, set value, delete, …
  – Network: connect, listen, send, receive, …
  – Process/thread: create, terminate.


Information (fields) in an Event

• PID/TID: process/thread ID
• Program name: path of the program’s EXE
• User name/group: the process’ owner
• Start/end time: event timing in CPU ticks
• Operation type: e.g. file open
• Parameter: type dependent, e.g.
  – file path, system call flags, registry path
  – IP address
• Call stack trace: call stack in the user process
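
The fields above can be pictured as one record per event. The following is a minimal Python sketch, not WinResMon's actual data format: the class name, attribute names and sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Event:
    """One trace event; field names are hypothetical, modeled on the slide."""
    pid: int
    tid: int
    program: str                      # path of the program's EXE
    user: str                         # process owner
    start_tick: int                   # event timing in CPU ticks
    end_tick: int
    operation: str                    # e.g. "file_open"
    parameter: str                    # type dependent: file path, registry path, IP address
    call_stack: Tuple[str, ...] = ()  # call stack in the user process

    @property
    def duration(self) -> int:
        """Time the event took, in CPU ticks."""
        return self.end_tick - self.start_tick


# Made-up sample event.
ev = Event(pid=1234, tid=1, program=r"c:\windows\notepad.exe", user="alice",
           start_tick=1000, end_tick=1250, operation="file_open",
           parameter=r"c:\temp\a.txt")
print(ev.operation, ev.duration)
```

Keeping both start and end ticks is what later allows the same trace to be drawn either event-ordered or time-ordered.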


Why visualize System Traces

• Software is complex
  – Interaction between modules, other software

• Software can be closed source, but interaction is open

• Humans are good at detecting
  – Repeated patterns
  – Anomalies


What is DotPlot?

[Figure: dot plot of two event traces, one per axis, labeled Trace X and Trace Y. One axis reads E A C B E E E D C, the other A C B C D E B C E.]
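
A dot plot places one trace on each axis and marks every cell where the two events match. The sketch below uses the two sequences from the slide. Which sequence is Trace X and which is Trace Y is an assumption, as are the function name `dotplot` and the simplest possible matching rule (equality).

```python
def dotplot(trace_x, trace_y):
    """Return one row per element of trace_y and one column per element of
    trace_x; a cell is True where the two elements are equal."""
    return [[x == y for x in trace_x] for y in trace_y]


# Assumed axis assignment: the horizontal sequence is Trace X.
trace_x = "E A C B E E E D C".split()
trace_y = "A C B C D E B C E".split()

plot = dotplot(trace_x, trace_y)
print("  " + "".join(trace_x))
for label, row in zip(trace_y, plot):
    print(label, "".join("#" if cell else "." for cell in row))
```

Runs of dots along a diagonal correspond to subsequences shared by both traces, which is what makes repeated patterns stand out visually.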


An Example

Visualization comparing: MS PowerPoint, MS Word, OO Word, and OO PowerPoint.

Elements of VDP

[Figure: a VDP with its elements numbered 1 to 5]

1: Extended DotPlot
2, 3: Axis Histogram
4, 5: Barcode

Extended DotPlot

• Matching Rule
  – Defines whether two events match
  – By fields: e.g. “if PIDs and resource paths are the same”, “if program names are the same”

• DP Coloring Rule
  – Defines the color for matched events
  – Traditional DP uses black only
  – Use the RGB model on a black background, CMY on a white background
  – Use regular expressions to specify events
  – E.g. “.*file_open.*” → blue, “.*reg_.*” → cyan
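
The two rule kinds can be sketched in a few lines of Python. This is an illustration under assumptions, not the tool's rule language: events are plain dicts, `event_text` is a made-up way to turn an event into the string the regular expressions are applied to, and the fallback color `"black"` is chosen arbitrarily.

```python
import re


def make_matching_rule(*fields):
    """Build a rule under which two events match when all listed fields are equal."""
    def match(a, b):
        return all(a[f] == b[f] for f in fields)
    return match


# Coloring rules from the slide; the first pattern that matches wins (assumption).
COLOR_RULES = [
    (re.compile(r".*file_open.*"), "blue"),
    (re.compile(r".*reg_.*"), "cyan"),
]


def event_text(ev):
    """Hypothetical textual form of an event that the patterns are applied to."""
    return f'{ev["program"]} {ev["operation"]} {ev["path"]}'


def color_of(ev, rules=COLOR_RULES, default="black"):
    for pattern, color in rules:
        if pattern.fullmatch(event_text(ev)):
            return color
    return default


# Made-up events.
a = {"pid": 7, "program": "xcopy.exe", "operation": "file_open", "path": r"c:\src\a.bin"}
b = {"pid": 7, "program": "xcopy.exe", "operation": "reg_get_value", "path": r"HKLM\Software"}

same_program = make_matching_rule("program")
same_pid_and_path = make_matching_rule("pid", "path")
print(same_program(a, b), same_pid_and_path(a, b), color_of(a), color_of(b))
```

Changing only the matching rule, for example from `"pid", "path"` to `"program"`, produces a different plot from the same pair of traces.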


Event-ordered and Time-ordered

• Each event takes a different amount of time
• The ordering determines the meaning/unit of each axis: event count when event-ordered, elapsed time when time-ordered

[Figure panels: Event-ordered; Time-ordered]

Axis Histogram

• Ticks mark unit time (e.g. 1 second)
• Histogram
  – Event density (time-ordered)
  – Time spent (event-ordered)
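
The two histogram flavors can be computed as follows. This is a rough sketch assuming each event is a `(start, end)` pair in CPU ticks and the trace is non-empty; the function names and the binning by event start time are assumptions.

```python
from collections import Counter


def event_density(events, ticks_per_bin):
    """Time-ordered axis: number of events that start in each time bin."""
    t0 = min(start for start, _ in events)
    counts = Counter((start - t0) // ticks_per_bin for start, _ in events)
    return [counts.get(i, 0) for i in range(max(counts) + 1)]


def time_spent(events, events_per_bin):
    """Event-ordered axis: total duration of each run of consecutive events."""
    durations = [end - start for start, end in events]
    return [sum(durations[i:i + events_per_bin])
            for i in range(0, len(durations), events_per_bin)]


# Made-up trace: (start tick, end tick) per event.
events = [(0, 5), (10, 12), (20, 90), (100, 101), (250, 260)]
print(event_density(events, ticks_per_bin=100))
print(time_spent(events, events_per_bin=2))
```

On a time-ordered axis every bin covers the same amount of time, so the interesting quantity is how many events fall into it; on an event-ordered axis every bin covers the same number of events, so the interesting quantity is how long they took.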


Barcode

• One dimensional
• Highlights user-chosen events
  – E.g. file_open → red
• One or more barcodes (e.g. three below)
• Barcode coloring rules


Example 1: File Copying

Self-comparison, event-ordered
xcopy copying 8 files: 1MB, 10KB, 10MB, 100KB, 1MB, 10KB, 10MB and 100KB
DP match: operation + parameter (pathname)
DP color: magenta → source; cyan → destination; black → other

[Figure labels: File Operation; Source/Dst File Operation; Registry Operation]

File Size

File size is visible.
The two 1MB and two 10MB files are shown.
The two 10KB and two 100KB files are visible only when zoomed in.


Zooming in

DP color : magenta → source; cyan → destination; black → other


A Surprise: Registry Operations

So many registry operations for a console application

[Figure label: Registry Operation]

Another Surprise: DLLs

File operations, but on neither the source nor the destination. More time is spent on DLLs than on a 1MB file.

[Figure labels: File Operation; Source/Dst File Operation; DLLs]

Example 2: Software Build

X: succeeded; Y: failed due to a missing .c file
DP match: program + operation + value (pathname)
DP color: black → any
Bar1 color: black → nmake.exe
Bar2 color: cyan → cl.exe; magenta → link.exe
Bar3 color: cyan → reading .c files; magenta → reading .h files

Number of Executions

X: 4 compiles (cl.exe), 1 link (link.exe)
Y: 3 compiles, 0 links
Y: the third compile doesn’t read any .c or .h file.
Bar2 color: cyan → cl.exe; magenta → link.exe
Bar3 color: cyan → reading .c files; magenta → reading .h files

Similarity & Difference

The two traces are similar. The Y (failed) trace terminates earlier, right before reading a .c file.


Different Matching Rule

[Figure panels: matching by Operation Type; matching by Program Name]

Example 3: Two Idle Windows Machines

• Time-ordered
• 1 hour each
• Different times
• About 750K events each

Anomaly & Repeated Pattern

• Periodic pattern
• Most events in R1
• Most time in R2 alike
• Easily spot anomaly & regular pattern

[Figure regions: R1; R2]

Zoom In

[Figure regions: R1; R2]

R1: Windows Update

• Similar events (darker area) are by Windows Auto Updater
• More file operations, fewer registry operations

Color: magenta → wuauclt.exe (Windows Update)

[Figure labels: File Operation; Registry Operation]


Visualizing Module Dependencies

• The problem
  – There’s a vulnerability in X. Which software uses X?
  – Why does my software use X? I never call it.
  – Is it safe to uninstall X?

• Software module
  – Windows DLLs
  – UNIX .so
  – Java classes, packages


Examples of dependencies (1)

• Binaries used by notepad
  – c:\windows\apppatch\acgenral.dll
  – c:\windows\system32\avgrsstx.dll
  – c:\windows\system32\imm32.dll
  – c:\windows\system32\lpk.dll
  – c:\windows\system32\msacm32.dll
  – c:\windows\system32\msctf.dll
  – c:\windows\system32\msctfime.ime
  – c:\windows\system32\shimeng.dll
  – c:\windows\system32\usp10.dll
  – c:\windows\system32\uxtheme.dll
  – c:\windows\system32\winmm.dll
  – c:\windows\system32\winspool.drv
  – c:\windows\winsxs\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.2600.5512_x-ww_35d4ce83\comctl32.dll


Examples of dependencies (2)

• Simple boot (only Windows installed)
  – DLLs: 154
  – EXEs: 10
  – Drivers: 1
  – Ime: 1

• Typical boot (Windows + applications)
  – DLLs: 274
  – EXEs: 15
  – Telephony/Modem: 6
  – Drivers: 3
  – ActiveX: 2
  – Ime: 1


Visualization (1)

• Basic dependency graph • Graph is too dense


Binary Dependency Visualization

• Two types of nodes: EXE, DLL + etc
• Three types of directed edges
  1. EXE X launches another EXE Y
  2. EXE X loads a DLL Y
  3. A function in binary X calls a function in binary Y

• How are binaries shared among programs?
  – EXE Dependency Graph
  – Only Type 1 and 2 edges
  – Group DLLs by loader

• How do binaries interact?
  – DLL Dependency Graph
  – Only Type 2 and 3 edges
  – Group DLLs manually by functionality or software vendor
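
Both graphs can be seen as filtered views of one typed edge list. The sketch below is illustrative only: the edge data is made up, and "group DLLs by loader" is interpreted here as "DLLs loaded by exactly the same set of EXEs share a group", which is an assumption rather than something the slides spell out.

```python
from collections import defaultdict

LAUNCH, LOAD, CALL = 1, 2, 3  # the three edge types listed above

# Made-up (source, destination, type) edges for illustration.
edges = [
    ("cmd.exe", "xcopy.exe", LAUNCH),
    ("xcopy.exe", "kernel32.dll", LOAD),
    ("cmd.exe", "kernel32.dll", LOAD),
    ("xcopy.exe", "kernel32.dll", CALL),
    ("kernel32.dll", "ntdll.dll", CALL),
]


def subgraph(edges, allowed_types):
    """Keep only the edges whose type is in allowed_types."""
    return [(s, d, t) for s, d, t in edges if t in allowed_types]


def group_dlls_by_loader(edges):
    """Map each set of loading EXEs to the DLLs loaded by exactly that set."""
    loaders = defaultdict(set)
    for src, dst, kind in edges:
        if kind == LOAD:
            loaders[dst].add(src)
    groups = defaultdict(set)
    for dll, exes in loaders.items():
        groups[frozenset(exes)].add(dll)
    return dict(groups)


exe_graph = subgraph(edges, {LAUNCH, LOAD})  # EXE Dependency Graph
dll_graph = subgraph(edges, {LOAD, CALL})    # DLL Dependency Graph
print(len(exe_graph), len(dll_graph), group_dlls_by_loader(edges))
```

Grouping collapses many DLL nodes into one node per loader set, which is one way to thin out a graph that is otherwise too dense to read.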


A more usable Visualization: EXE Dependency Graph

• Grouped dependency graph

[Figure: grouped dependency graph with elements labeled 1 and 2]

Comparing Microsoft Word and Open Office Writer


DLL Dependency Graph: actual binary usage

• Some definitions:
  – An EXE-DLL dependency in a DLL Dependency Graph exists when there is a control transfer from code in executable x to code in DLL y. We say that x has an EXE-DLL dependency on y.
  – A DLL-DLL dependency in a DLL Dependency Graph exists when there is a control transfer from code in DLL x to code in DLL y. We say that x has a DLL-DLL dependency on y.


wget: DLL dependency without grouping


wget: DLL dependency grouped by functionality

Examples of grouping By functionality (GIMP)


Examples of grouping By software vendor (GIMP)


Two Operations

• Diff
  – Compare two graphs.
    • E.g. from the same program but different environment/input
    • E.g. from two related programs
  – Diff graph G1 and G2 to get G3.
• Projection
  – Focus on a particular module X
  – Only show modules that call X or are called by X (recursive definition)
  – Project graph G1 on module M to get G2
  – Not a simple subgraph problem
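
The Diff operation can be approximated with plain set operations on edge sets. This is a minimal sketch under the assumption that G3 simply classifies each edge as belonging to G1 only, G2 only, or both; the slides do not define G3 precisely, and the module names below (including `flash.ocx`) are made up.

```python
def diff_graphs(g1, g2):
    """Classify every edge as present only in g1, only in g2, or in both."""
    g1, g2 = set(g1), set(g2)
    return {"only_g1": g1 - g2, "only_g2": g2 - g1, "common": g1 & g2}


# Hypothetical DLL dependency edges of a browser without and with a plug-in.
without_plugin = {("iexplore.exe", "mshtml.dll"), ("mshtml.dll", "gdi32.dll")}
with_plugin = without_plugin | {("mshtml.dll", "flash.ocx"), ("flash.ocx", "gdi32.dll")}

g3 = diff_graphs(without_plugin, with_plugin)
print(sorted(g3["only_g2"]))
```

Drawing the three classes in different colors would show at a glance which dependencies a change of environment or input introduces.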


Diff of DLL dependency graph of Internet Explorer with Flash and without


Projection of the DLL dependency graph of Internet Explorer on Flash


Firefox using tortoisesvn


Questions?


Visualizing binaries executed

• Call graph is large.
• Group functions to images => DLL dependency graph.
• DLL dependency graph is still large.
• Group DLLs by properties:
  – By functionality: graphics, audio, network…
  – By vendor: microsoft, adobe…
  – By path: C:\windows\system32\*.dll, D:\vmware\*.dll…
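
Grouping by path can be sketched with glob patterns. The example below is an assumption-laden illustration: the group names, the `"other"` fallback, the sample file names and the choice to drop edges inside a group are all made up. It uses `fnmatch.fnmatchcase` on lower-cased paths so the behavior does not depend on the host platform.

```python
from fnmatch import fnmatchcase

# Hypothetical grouping table: (glob pattern, group name), first match wins.
PATH_GROUPS = [
    (r"c:\windows\system32\*.dll", "system32"),
    (r"d:\vmware\*.dll", "vmware"),
]


def group_of(path, rules=PATH_GROUPS, default="other"):
    """Return the name of the first group whose pattern matches the path."""
    for pattern, name in rules:
        if fnmatchcase(path.lower(), pattern):
            return name
    return default


def collapse(edges):
    """Replace each DLL by its group and drop edges that stay inside a group."""
    grouped = set()
    for src, dst in edges:
        gs, gd = group_of(src), group_of(dst)
        if gs != gd:
            grouped.add((gs, gd))
    return grouped


# Made-up DLL-DLL dependency edges.
dll_edges = {
    (r"C:\Windows\System32\user32.dll", r"C:\Windows\System32\gdi32.dll"),
    (r"D:\vmware\vmtools.dll", r"C:\Windows\System32\gdi32.dll"),
}
print(collapse(dll_edges))
```

Grouping by functionality or vendor would follow the same shape with a different `group_of`, for example a manually maintained lookup table.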


Visualizing binaries executed (1)

• Generate call tree, call graph, DLL dependency graph
• PIN tool to collect execution trace
  – The trace includes call, return, thread, context and system call events
  – Call and return events record the stack pointer, PC and target address.
• Not trivial to maintain the call stack by tracking call and return
  – Non-returning function (long jump)
  – Thread, fiber
  – Context
  – Kernel callback
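
One common way to survive missing returns is to use the recorded stack pointer to discard stale frames. The sketch below illustrates that idea only; it is not the talk's implementation and does not use the Pin API. It assumes a single thread, a stack that grows downward, call events of the form `("call", sp, target)` and return events `("ret", sp)` where `sp` after a return equals the `sp` at the matching call. Threads, fibers and kernel callbacks would each need their own shadow stack, which is not shown.

```python
def call_edges(events):
    """Rebuild caller -> callee edges from call/return events of one thread,
    tolerating returns that never happen (e.g. after a long jump)."""
    stack = [(float("inf"), "<root>")]  # shadow stack of (sp at call, function)
    edges = []
    for ev in events:
        sp = ev[1]
        # A live caller always has a call-site sp above the current sp.
        # Frames that do not are dead: they returned or were jumped over.
        while len(stack) > 1 and stack[-1][0] <= sp:
            stack.pop()
        if ev[0] == "call":
            edges.append((stack[-1][1], ev[2]))
            stack.append((sp, ev[2]))
    return edges


# Made-up trace: B long-jumps back into main, so A and B never return.
events = [
    ("call", 120, "main"),
    ("call", 100, "A"),
    ("call", 80, "B"),
    ("call", 100, "C"),   # issued from main after the long jump
    ("ret", 100),
]
print(call_edges(events))
```

Without the stack-pointer check, the call to `C` would be attributed to `B`, because the shadow stack would still show `B` on top.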


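Projection

/* Forward declarations so the functions can be defined top-down. */
void A (void); void B (int i); void C (void); void D (void);

int main (void) { A(); B(1); return 0; }
void A (void) { B(0); }
void B (int i) { if (i) D(); else C(); }
void C (void) {}
void D (void) {}

Full Graph: main, A, B, C, D
Project on A: main, A, B, C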
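
The example shows why projection is not a simple subgraph problem: D is reachable from B in the call graph, but B only calls D when invoked from main, never on a path through A. One reading consistent with the slide is to project on the call tree rather than the call graph, keeping only root-to-leaf paths that pass through the chosen function. The Python sketch below implements that reading; the tree encoding and the function name `project` are assumptions.

```python
def project(tree, target):
    """Return the call edges lying on a root-to-leaf path of the call tree
    that passes through `target`. A node is (name, [children])."""
    edges = set()

    def walk(node, path, seen_target):
        name, children = node
        path = path + [name]
        seen_target = seen_target or name == target
        if not children and seen_target:
            edges.update(zip(path, path[1:]))
        for child in children:
            walk(child, path, seen_target)

    walk(tree, [], False)
    return edges


# Call tree of the C program: main calls A (which calls B(0) -> C) and B(1) -> D.
call_tree = ("main", [("A", [("B", [("C", [])])]),
                      ("B", [("D", [])])])

on_a = project(call_tree, "A")
print(sorted(on_a))
```

Projecting on `main` keeps every path and therefore yields the full graph, while projecting on `A` drops the `main → B → D` path and with it the node D.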

