
Labeling Library Functions in Stripped Binaries

PASTE 2011 Szeged, Hungary September 5, 2011 Labeling Library Functions in Stripped Binaries Emily R. Jacobson, Nathan Rosenblum, and Barton P. Miller Computer Sciences Department University of Wisconsin - Madison
Page 1: Labeling Library Functions  in Stripped Binaries

PASTE 2011, Szeged, Hungary

September 5, 2011

Labeling Library Functions in Stripped Binaries

Emily R. Jacobson, Nathan Rosenblum, and Barton P. Miller

Computer Sciences Department

University of Wisconsin - Madison

Page 2: Labeling Library Functions  in Stripped Binaries

Why Binary Code?

o Source code isn't available
o Source code isn't the right representation

Page 3: Labeling Library Functions  in Stripped Binaries

Binary Tools Need Symbol Tables

o Debugging Tools
    o GDB, IDA Pro, …
o Instrumentation Tools
    o PIN, Dyninst, …
o Static Analysis Tools
    o CodeSurfer/x86, …
o Security Analysis Tools
    o IDA Pro, …

Page 4: Labeling Library Functions  in Stripped Binaries

Restoring Information

Function locations

Complicated by:
o Missing symbol information
o Variability in function layout (e.g. code sharing, outlined basic blocks)
o High degree of indirect control flow

[Figure: program binary whose recovered functions carry placeholder names such as targ80c3bd0 and targ80c3df4]

Page 5: Labeling Library Functions  in Stripped Binaries

Restoring Information

What about semantic information?

o A program's interaction with the operating system (system calls) is encapsulated by wrapper functions

Library fingerprinting: identify functions based on patterns learned from exemplar libraries

[Figure: program binary whose recovered functions carry placeholder names such as targ80c3bd0 and targ80c3df4]

Page 6: Labeling Library Functions  in Stripped Binaries

unstrip = stripped binary parsing + library fingerprinting + binary rewriting

[Figure: program binary in which placeholder names such as targ80c3bd0 and targ80c3df4 are replaced by recovered names such as getpid and accept]

Page 7: Labeling Library Functions  in Stripped Binaries

<accept>:
    mov %ebx,%edx            # save registers
    mov $0x66,%eax           # set up system call arguments
    mov $0x5,%ebx
    lea 0x4(%esp),%ecx
    int $0x80                # invoke a system call
    mov %edx,%ebx            # error check and return
    cmp $0xffffff83,%eax
    jae syscall_error
    ret

Page 8: Labeling Library Functions  in Stripped Binaries

The same function can be realized in a variety of ways in the binary

The three listings of accept below come from glibc 2.2.4 on RHEL with GCC 2.95.3, glibc 2.5 on RHEL with GCC 3.4.4, and glibc 2.5 on RHEL with GCC 4.1.2.

<accept>:
    mov %ebx,%edx
    mov $0x66,%eax
    mov $0x5,%ebx
    lea 0x4(%esp),%ecx
    int $0x80
    mov %edx,%ebx
    cmp $0xffffff83,%eax
    jae syscall_error
    ret

<accept>:
    cmpl $0x0,%gs:0xc
    jne 80f669c
    mov %ebx,%edx
    mov $0x66,%eax
    mov $0x5,%ebx
    lea 0x4(%esp),%ecx
    call *0x814e93c
    mov %edx,%ebx
    cmp $0xffffff83,%eax
    jae syscall_error
    ret
    push %esi
    call enable_asyncancel
    mov %eax,%esi
    mov %ebx,%edx
    mov $0x66,%eax
    mov $0x5,%ebx
    lea 0x8(%esp),%ecx
    call *0x8181578
    mov %edx,%ebx
    xchg %eax,%esi
    call disable_asyncancel
    mov %esi,%eax
    pop %esi
    cmp $0xffffff83,%eax
    jae syscall_error
    ret

<accept>:
    cmpl $0x0,%gs:0xc
    jne 80f669c
    mov %ebx,%edx
    mov $0x66,%eax
    mov $0x5,%ebx
    lea 0x4(%esp),%ecx
    int $0x80
    mov %edx,%ebx
    cmp $0xffffff83,%eax
    jae syscall_error
    ret
    push %esi
    call enable_asyncancel
    mov %eax,%esi
    mov %ebx,%edx
    mov $0x66,%eax
    mov $0x5,%ebx
    lea 0x8(%esp),%ecx
    int $0x80
    mov %edx,%ebx
    xchg %eax,%esi
    call disable_asyncancel
    mov %esi,%eax
    pop %esi
    cmp $0xffffff83,%eax
    jae syscall_error
    ret

Page 9: Labeling Library Functions  in Stripped Binaries

Binary-level Code Variations

o Function inlining
o Code reordering
o Minor code changes
o Alternative code sequences

Page 10: Labeling Library Functions  in Stripped Binaries

Semantic Descriptors

o Rather than recording byte patterns, we take a semantic approach
o Record information that is likely to be invariant across multiple versions of the function

<accept>:
    mov %ebx,%edx
    mov $0x66,%eax
    mov $0x5,%ebx
    lea 0x4(%esp),%ecx
    int $0x80
    mov %edx,%ebx
    cmp $0xffffff83,%eax
    jae 8048300
    ret
    mov %esi,%esi

Instructions that determine the descriptor: int $0x80, mov $0x66,%eax, mov $0x5,%ebx

Semantic descriptor: {<socketcall, 5>}

Page 11: Labeling Library Functions  in Stripped Binaries

Building Semantic Descriptors

We parse an input binary, locate system calls and wrapper function calls, and employ dataflow analysis.

reboot:
    push %ebp
    mov %esp,%ebp
    sub $0x10,%esp
    push %edi
    push %ebx
    mov 0x8(%ebp),%edx
    mov $0xfee1dead,%edi
    mov $0x28121969,%ecx
    push %ebx
    mov %edi,%ebx
    mov $0x58,%eax
    int $0x80
    …

Register values at the system call:
    EAX = 0x58 (reboot)
    EBX = %edi = 0xfee1dead
    ECX = 0x28121969

Semantic descriptor: {<reboot, 0xfee1dead, 0x28121969>}
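The dataflow step above can be illustrated with a minimal sketch. It assumes a hypothetical, simplified instruction form (opcode, source, destination), handles straight-line code only, and uses an illustrative `SYSCALLS` table whose second field (how many argument registers to record) is chosen to mirror the slide's examples. None of these names come from unstrip itself.

```python
# Hypothetical instruction form: (opcode, source, destination), AT&T operand order.
REBOOT = [
    ("push", "%ebp", None),
    ("mov", "%esp", "%ebp"),
    ("sub", "$0x10", "%esp"),
    ("push", "%edi", None),
    ("push", "%ebx", None),
    ("mov", "0x8(%ebp)", "%edx"),
    ("mov", "$0xfee1dead", "%edi"),
    ("mov", "$0x28121969", "%ecx"),
    ("push", "%ebx", None),
    ("mov", "%edi", "%ebx"),
    ("mov", "$0x58", "%eax"),
    ("int", "$0x80", None),
]

# Assumed table: syscall number -> (name, number of argument registers recorded).
SYSCALLS = {0x58: ("reboot", 2), 0x66: ("socketcall", 1), 0x14: ("getpid", 0)}
ARG_REGS = ["%ebx", "%ecx", "%edx"]


def resolve(insns, idx, reg):
    """Walk backward from idx to find the constant held in reg, or None."""
    for i in range(idx - 1, -1, -1):
        op, src, dst = insns[i]
        if dst != reg:
            continue
        if op != "mov":
            return None                     # modified by an operation we do not model
        if src.startswith("$"):
            return int(src[1:], 16)         # immediate constant
        if src.startswith("%"):
            return resolve(insns, i, src)   # register copy: follow the source
        return None                         # loaded from memory: unknown
    return None


def build_descriptor(insns):
    """Return one tuple (name, arg, ...) per int $0x80 found in insns."""
    descriptor = []
    for i, (op, src, _dst) in enumerate(insns):
        if (op, src) != ("int", "$0x80"):
            continue
        number = resolve(insns, i, "%eax")
        name, nargs = SYSCALLS.get(number, ("?", 0))
        args = [resolve(insns, i, r) for r in ARG_REGS[:nargs]]
        descriptor.append((name, *args))
    return descriptor


print(build_descriptor(REBOOT))
```

Here the value of EBX is recovered by following the copy from %edi back to its immediate load, which is the chain shown in the slide's diagram.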

Page 12: Labeling Library Functions  in Stripped Binaries

Building Semantic Descriptors Recursively

sethostid:
    …
    call open
    …
    call write
    …
    mov $0x6,%eax
    int $0x80
    …
    Direct system call: {<close>}

open:
    …
    mov $0x5,%eax
    int $0x80
    …
    Descriptor: {<open, "/etc/hostid", 577, 420>}

write:
    …
    mov $0x4,%eax
    int $0x80
    …
    Descriptor: {<write,?,?,4>}

Resulting descriptor for sethostid: {<close>, <open, "/etc/hostid", 577, 420>, <write,?,?,4>}
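A minimal sketch of the recursive construction follows. The `FUNCTIONS` table, the `bind` helper, and the idea of filling a callee's unknown arguments positionally from the call site are assumptions made for illustration; the slides only show the resulting descriptors.

```python
from collections import Counter

# Hypothetical per-function summaries: direct system calls, plus calls to other
# wrappers together with the argument values known at each call site.
FUNCTIONS = {
    "open": {"syscalls": [("open", "?", "?", "?")], "calls": []},
    "write": {"syscalls": [("write", "?", "?", "?")], "calls": []},
    "sethostid": {
        "syscalls": [("close",)],
        "calls": [
            ("open", ("/etc/hostid", 577, 420)),
            ("write", ("?", "?", 4)),
        ],
    },
}


def bind(call, known_args):
    """Fill unknown ("?") arguments of a callee's system call from the call site."""
    name, *args = call
    merged = [k if a == "?" else a for a, k in zip(args, known_args)]
    return (name, *merged, *args[len(known_args):])


def descriptor(name, functions, active=frozenset()):
    """Multiset of system calls reachable from `name`, including via callees."""
    if name in active:                      # guard against call cycles
        return Counter()
    info = functions[name]
    result = Counter(info["syscalls"])
    for callee, known_args in info["calls"]:
        inner = descriptor(callee, functions, active | {name})
        for call, count in inner.items():
            result[bind(call, known_args)] += count
    return result


print(descriptor("sethostid", FUNCTIONS))
```

A `Counter` is used so that repeated system calls keep their multiplicity.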

Page 13: Labeling Library Functions  in Stripped Binaries

Building a Descriptor Database

unstrip processes a glibc reference library in two steps:

1. Locate wrapper functions
2. Build semantic descriptors

<accept>:
    mov %ebx,%edx
    mov $0x66,%eax
    mov $0x5,%ebx
    lea 0x4(%esp),%ecx
    int $0x80
    …

Descriptor Database:
    {<socketcall, 5>}: accept
    {<socketcall, 4>}: listen
    {<getpid>}: getpid
    …

Page 14: Labeling Library Functions  in Stripped Binaries

Building a Descriptor Database

[Figure: four glibc reference libraries are each processed by unstrip (locate wrapper functions, build semantic descriptors); the resulting entries, such as {<socketcall, 5>}: accept, {<socketcall, 4>}: listen, and {<getpid>}: getpid, are collected in one Descriptor Database]
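One way to picture the database is as a map from each descriptor to the names that produced it across all reference libraries. The sketch below assumes each library has already been reduced to a dictionary from function name to descriptor; `LIB_A`, `LIB_B`, `freeze`, and `build_database` are hypothetical names, and the contents of `LIB_B` are illustrative rather than measured.

```python
from collections import Counter, defaultdict

# Hypothetical reference libraries, already reduced to name -> descriptor.
LIB_A = {
    "accept": [("socketcall", 5)],
    "listen": [("socketcall", 4)],
    "getpid": [("getpid",)],
}
LIB_B = {
    "accept": [("socketcall", 5), ("socketcall", 5), ("futex",)],
    "getpid": [("getpid",)],
}


def freeze(descriptor):
    """Hashable multiset key: ignores order, keeps multiplicity."""
    return frozenset(Counter(descriptor).items())


def build_database(reference_libraries):
    """Map each distinct descriptor to every function name that produced it."""
    database = defaultdict(set)
    for library in reference_libraries:
        for name, descriptor in library.items():
            database[freeze(descriptor)].add(name)
    return dict(database)


db = build_database([LIB_A, LIB_B])
print(len(db), db[freeze([("socketcall", 5)])])
```

Keeping a set of names per descriptor leaves room for functions that turn out to be indistinguishable.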

Page 15: Labeling Library Functions  in Stripped Binaries

Pattern Matching Criteria

o Two stages
    1) Exact matches
    2) Best match based on coverage criterion
o Handle minor code variations by allowing flexible matches

Page 16: Labeling Library Functions  in Stripped Binaries

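Pattern Matching Criteria

A: {<socketcall,5>} (fingerprint from the database)

B: {<socketcall,5>, <socketcall,5>, <futex>} (semantic descriptor from the code)

$A \cap B = \{\, b \in B \mid b \in A \,\}$

$\mathrm{coverage}(A,B) = \dfrac{|A \cap B|}{|B|} = \dfrac{|\{\langle \mathrm{socketcall},5\rangle, \langle \mathrm{socketcall},5\rangle\}|}{|\{\langle \mathrm{socketcall},5\rangle, \langle \mathrm{socketcall},5\rangle, \langle \mathrm{futex}\rangle\}|} = \dfrac{2}{3}$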
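The coverage example can be worked through in a few lines. This sketch reads the slide as counting the elements of B that also occur in A and dividing by the size of B; the normalization by |B| and the parameter names `fingerprint` and `descriptor` are assumptions reconstructed from the example, not a confirmed definition.

```python
def coverage(fingerprint, descriptor):
    """Fraction of descriptor elements that also appear in the fingerprint."""
    if not descriptor:
        return 0.0
    known = set(fingerprint)
    covered = [element for element in descriptor if element in known]
    return len(covered) / len(descriptor)


A = [("socketcall", 5)]                                  # fingerprint from the database
B = [("socketcall", 5), ("socketcall", 5), ("futex",)]   # descriptor from the code

print(coverage(A, B))
```

With these inputs two of the three elements of B are covered, which corresponds to the 2/3 suggested by the slide.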

Page 17: Labeling Library Functions  in Stripped Binaries

Multiple Matches

o It's possible that two or more functions are indistinguishable
o Policy decision: return set of potential matches
o In practice, we've observed 8% of functions have multiple matches, but the size of the match set is small (≤ 3)
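The two-stage matching and the policy of returning a set of candidates can be sketched as follows. The database layout (a list of fingerprint and name-set pairs), the tie-breaking rule, the rejection of zero-coverage candidates, and the alias `__getpid` are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical database: (fingerprint, names that share it).
DATABASE = [
    ([("socketcall", 5)], {"accept"}),
    ([("socketcall", 4)], {"listen"}),
    ([("getpid",)], {"getpid", "__getpid"}),
]


def coverage(fingerprint, descriptor):
    """Fraction of descriptor elements that also appear in the fingerprint."""
    if not descriptor:
        return 0.0
    known = set(fingerprint)
    return sum(1 for element in descriptor if element in known) / len(descriptor)


def match(descriptor, database):
    """Stage 1: exact multiset match. Stage 2: best non-zero coverage."""
    exact = set()
    for fingerprint, names in database:
        if Counter(fingerprint) == Counter(descriptor):
            exact |= names
    if exact:
        return exact

    best, candidates = 0.0, set()
    for fingerprint, names in database:
        score = coverage(fingerprint, descriptor)
        if score > best:
            best, candidates = score, set(names)
        elif score == best and score > 0:
            candidates |= names              # ties are all returned
    return candidates


print(match([("getpid",)], DATABASE))
print(match([("socketcall", 5), ("socketcall", 5), ("futex",)], DATABASE))
```

The first query is resolved in stage one and returns both indistinguishable names; the second has no exact match and falls through to the coverage stage.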

Page 18: Labeling Library Functions  in Stripped Binaries

Identifying Functions in a Stripped Binary

[Figure: stripped binary -> unstrip (using the Descriptor Database) -> unstripped binary]

For each wrapper function {
    1. Build the semantic descriptor.
    2. Search the database for a match (apply two-stage matching process).
    3. Add label to symbol table.
}
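The loop above can be sketched with plain dictionaries standing in for the binary and its symbol table. Everything here is hypothetical: the pairing of addresses with descriptors, the exact-match-only lookup (the coverage stage is omitted for brevity), and the choice to join multiple candidate names with "|".

```python
# Hypothetical exact-match table keyed by descriptor.
EXACT = {
    (("getpid",),): {"getpid"},
    (("socketcall", 5),): {"accept"},
}


def find_matches(descriptor):
    """Stand-in for the two-stage search; only exact matches are modeled."""
    return EXACT.get(tuple(descriptor), set())


def label_functions(wrappers, find_matches):
    """Return address -> label for every wrapper with at least one match."""
    symbols = {}
    for address, descriptor in wrappers.items():
        names = find_matches(descriptor)
        if names:
            symbols[address] = "|".join(sorted(names))
    return symbols


# Hypothetical wrapper functions found in a stripped binary.
wrappers = {
    0x80C3BD0: [("getpid",)],
    0x80C3DF4: [("socketcall", 5)],
    0x80C4000: [("futex",)],
}
symbols = label_functions(wrappers, find_matches)
print({hex(address): name for address, name in symbols.items()})
```

Functions with no match are left out of the result and would keep their placeholder names.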

Page 19: Labeling Library Functions  in Stripped Binaries

Implementation

stripped binary parsing + library fingerprinting + binary rewriting

Page 20: Labeling Library Functions  in Stripped Binaries

Evaluation

o To evaluate across three dimensions of variation, we constructed three data sets:
    o GCC version
    o glibc version
    o distribution vendor
o In each set, compile statically-linked binaries, build a DDB, compare unstrip to IDA Pro's FLIRT
o Evaluation measure is accuracy

Page 21: Labeling Library Functions  in Stripped Binaries

Evaluation Results: GCC Version Study

[Bar chart: accuracy (0 to 1) of unstrip and IDA Pro, GCC 3.4.4 patterns predicting each library; GCC versions 3.4.4, 4.0.2, 4.1.2, 4.2.1]

Page 22: Labeling Library Functions  in Stripped Binaries

Evaluation Results: glibc Version Study

[Bar chart: accuracy (0 to 1) of unstrip and IDA Pro, glibc 2.2.4 patterns predicting each library; glibc versions 2.2.4, 2.3.2, 2.3.4, 2.5, 2.11.1]

Page 23: Labeling Library Functions  in Stripped Binaries

Evaluation Results: Distribution Study

[Bar chart: accuracy (0 to 1) of unstrip and IDA Pro, Fedora patterns predicting each library; distributions Fedora, Mandriva, OpenSuse, Ubuntu]

Page 24: Labeling Library Functions  in Stripped Binaries

unstrip is available at http://www.paradyn.org/html/tools/unstrip.html

Page 25: Labeling Library Functions  in Stripped Binaries

Backup slides follow

Page 26: Labeling Library Functions  in Stripped Binaries

Evaluation Results: GCC Version Study (Temporal: backwards)

[Bar chart: accuracy (0 to 1) of unstrip and IDA Pro, GCC 4.2.1 patterns predicting each library; GCC versions 3.4.4, 4.0.2, 4.1.2, 4.2.1]

Page 27: Labeling Library Functions  in Stripped Binaries

Evaluation Results: glibc Version Study (Temporal: backwards)

[Bar chart: accuracy (0 to 1) of unstrip and IDA Pro, glibc 2.11.1 patterns predicting each library; glibc versions 2.2.4, 2.3.2, 2.3.4, 2.5, 2.11.1]

Page 28: Labeling Library Functions  in Stripped Binaries

Evaluation Results: Distribution Study (one predicts the rest)

[Bar chart: accuracy (0 to 1) of unstrip and IDA Pro, Mandriva patterns predicting each library; distributions Fedora, Mandriva, OpenSuse, Ubuntu]

Page 29: Labeling Library Functions  in Stripped Binaries

Evaluation Results: GCC Version Study (one predicts the rest)

[Bar chart: accuracy (0 to 1) of unstrip and IDA Pro by GNU C Compiler version; GCC versions 3.4.4, 4.0.2, 4.1.2, 4.2.1]

Page 30: Labeling Library Functions  in Stripped Binaries

Evaluation Results: glibc Version Study (one predicts the rest)

[Bar chart: accuracy (0 to 1) of unstrip and IDA Pro by glibc version; glibc versions 2.2.4, 2.3.2, 2.3.4, 2.5, 2.11.1]

Page 31: Labeling Library Functions  in Stripped Binaries

Evaluation Results: Distribution Study (one predicts the rest)

[Bar chart: accuracy (0 to 1) of unstrip and IDA Pro by distribution vendor; distributions Fedora, Mandriva, OpenSuse, Ubuntu]

