Pimp My PE: Parsing Malicious and Malformed Executables


Virus Bulletin 2007

Authors

• Sunbelt Software, Tampa FL
• Anti-Malware SDK team:
  – Casey Sheehan, lead developer
  – Nick Hnatiw, developer / researcher
  – Tom Robinson, developer / researcher
  – Nick Suan, developer / researcher

Purpose

• Chronicles the early development of our detection engine
  – Specifically, the PE parser
  – Building enterprise infrastructure to support development
• Technical issues:
  – Understand malformations prevalent in wild PEs
  – Methods for identifying malicious PEs
  – Reliably parsing PEs
• Pietrek’s article “An In-Depth Look into the Win32 Portable Executable File Format” [3] is a great intro, but much more is needed to successfully process modern PEs
• Virtually all commercial analysis tools have serious issues parsing malicious PEs

Overview

• Introduction
• Technical Background
• Infrastructure
• Image parsing in depth

Part 1: Introduction

The Need

• Ability to parse any PE into a robust internal representation

• Ability to detect and remediate threats

The Problem

• Initial assumption: parsing is easy
  – Simple parser should be able to cope with all samples.
• Reality: malicious samples break parser constantly
• Reaction: are these all corrupted PEs?
• Realization: Windows loader behavior a valuable comparison metric.
  – If Windows loads an image, we had better parse it
  – Corrupted images are, at very least, suspicious.
• In summary:
  – Implementations in the literature perform poorly versus threats in the wild; generally cope poorly with “malformed” images
  – A large percentage of images in the wild are malformed (68%)

The Problem (cont’d)

• The actual problem: building a parser to effectively process modern, malicious PEs
• Key hurdles:
  – Qualify behavior of Windows loader for comparison purposes
  – Analyze and categorize “anomalous” characteristics of sample images which identify malformed images
  – Iteratively improve parser performance (i.e., avoid performance regression)

The Solution

• Iteratively build and test parser
• Constant regression testing
  – Ensure new features don’t cause overall performance to regress
• Verify performance vs. Windows loader
  – Gauge parser performance in absolute terms

Image Anomalies

• Anomaly:
  – specific structural malformation; a particular field malformed a particular way
  – frequently inconsistent with PE specification, or just unusual or suspicious
• Analysis of anomalies and other structural characteristics provides key insight into common image malformations
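
One plausible way to record anomalies of this kind is as bit flags that accumulate while the parser runs. The sketch below is only an illustration in Python: the flag names are hypothetical and are not the anomaly set used by the authors' parser.

```python
from enum import IntFlag
from typing import List


class Anomaly(IntFlag):
    """Hypothetical anomaly bits; the names are illustrative only."""
    SECTION_RVA_IN_HEADER = 0x1   # a section starts inside the header
    WILD_SECTION_SIZE = 0x2       # a section size exceeds the file
    OVERLAPPING_HEADERS = 0x4     # header structures overlap
    WILD_DATA_DIRECTORY = 0x8     # a Data Directory entry points outside the image


def describe(bits: Anomaly) -> List[str]:
    """Return the names of all anomaly bits set in `bits`."""
    return [flag.name for flag in Anomaly if flag.value and flag & bits]


# Anomalies are OR-ed together as they are discovered during a parse.
found = Anomaly(0)
found |= Anomaly.SECTION_RVA_IN_HEADER
found |= Anomaly.WILD_DATA_DIRECTORY
```

Keeping every anomaly as one bit makes the per-file result a single integer, which is convenient to store in a database column and to query with a mask.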

Part 2: Background

Basic PE Structure

• PE Header
• PE Sections
• Overlay (optional)

[Diagram: image layout from start to end: header, sect 1, sect 2, ..., sect n, overlay]

Alignment

• Alignment applies to section mapping
• PE header specifies two sectional alignment values
  – File alignment specifies file mapped alignment
  – Virtual alignment specifies virtual mapped alignment
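
As an illustration, both alignment values can be read straight from the optional header. In the PE format the fields are named FileAlignment and SectionAlignment (the latter is the "virtual alignment" above), and they sit at offsets 36 and 32 of the optional header in both PE32 and PE32+. The function name below is an assumption, and the sketch expects a well-formed header, which is exactly what a robust parser cannot rely on.

```python
import struct
from typing import Tuple


def read_alignments(image: bytes) -> Tuple[int, int]:
    """Return (file_alignment, section_alignment) from a PE header."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C) holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    optional_header = pe_offset + 4 + 20
    # SectionAlignment (offset 32) is immediately followed by FileAlignment (36).
    section_align, file_align = struct.unpack_from("<II", image, optional_header + 32)
    return file_align, section_align
```

`struct.unpack_from` raises `struct.error` when the buffer is too short, so a truncated file surfaces as an exception rather than a silent misread.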

Image Mapping

• Windows loader performs “map and load” operation:
  – Map:
    – Size the view
    – Create view in process VA space
    – Allocate storage
  – Load image section by section
• Our parser mimics this behavior
  – “Source representation”
    • Frequently file mapped (linker output)
    • However we may be given memory mapped image with no corresponding file image
  – “Target representation”
    • Typically virtual mapped

Mapping Translation

• Need to handle both file- and virtual-mapped images
• cImageStream class
  – Accepts any source representation
  – Translates to requested target representation
  – Manages all stream-related details
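
The core of any such translation is converting an address in one mapping to the matching offset in the other. The sketch below shows the virtual-to-file direction in Python. It is not the cImageStream interface, which the slides do not show: the function name and the tuple layout are assumptions, and addresses inside the header are left out for brevity.

```python
def rva_to_file_offset(rva, sections):
    """Translate a relative virtual address to a file offset.

    `sections` is an iterable of
    (virtual_address, virtual_size, raw_offset, raw_size) tuples.
    Returns None if the RVA is not backed by file data.
    """
    for virtual_address, virtual_size, raw_offset, raw_size in sections:
        span = max(virtual_size, raw_size)
        if virtual_address <= rva < virtual_address + span:
            delta = rva - virtual_address
            # Bytes past raw_size exist only in memory (zero fill).
            return raw_offset + delta if delta < raw_size else None
    return None


sections = [(0x1000, 0x800, 0x400, 0x600)]
offset = rva_to_file_offset(0x1010, sections)
```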

Section Size

• Fundamental concept when dealing with sections due to variable section alignment
  – Applies to header and sections
• 3 unique size concepts:
  – Raw size: unpadded data size
  – File size: RoundUp(raw_size, file_align)
    • “File cave”; persistent
  – Virtual size: RoundUp(file_size, virtual_align)
    • “Virtual cave”; transient
  – Be precise!
    • Always explicitly state the size type in source code
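
The three sizes can be worked through with a small Python example. The alignment values (0x200 and 0x1000) and the raw size are example inputs chosen for illustration, not values taken from the presentation.

```python
def round_up(size: int, align: int) -> int:
    """Round `size` up to the next multiple of `align` (align > 0)."""
    return (size + align - 1) // align * align


raw_size = 0x1234                            # unpadded data size (example value)
file_size = round_up(raw_size, 0x200)        # 0x1400
virtual_size = round_up(file_size, 0x1000)   # 0x2000
file_cave = file_size - raw_size             # 0x1CC bytes of padding on disk
virtual_cave = virtual_size - file_size      # 0xC00 bytes of padding in memory only
```

Naming each variable after its size type, as above, is one way to follow the "be precise" advice in code.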

Section Size (cont’d)

• Interesting (and annoying) that raw section size is unavailable
  – Important if you want size of REAL content!
  – E.g., when parsing structures in the header
  – … Or instructions (atoms) in a code section
• In practice, file aligned size is often treated as synonymous with raw size
• Demo:
  – Dump basic white file; identify raw, file, virtual sizes

PE Structure

• PE header:
  – Documents “explicit” image structure
  – Vs. “implicit” structure
• PE section
  – Primary image content
  – Code, data, etc.
  – Described in header’s section table
• Overlay: non-loadable data, appended to PE image
  – Certificates
  – Debug info
  – Malware-specific payload
  – Demo Ganda
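
Because the overlay is whatever follows the last byte described by the header and section table, it can be located by finding the furthest file extent of any section. This is a minimal Python sketch under the assumption of a sane section table. The function name and argument layout are hypothetical, and a malformed image may need more careful handling.

```python
def overlay_offset(header_size, sections, file_length):
    """Return the file offset where the overlay starts, or None if absent.

    `sections` is an iterable of (raw_offset, raw_size) pairs, i.e. the
    PointerToRawData and SizeOfRawData fields of each section table entry.
    """
    end_of_image = header_size
    for raw_offset, raw_size in sections:
        if raw_size:  # sections without file data occupy no file space
            end_of_image = max(end_of_image, raw_offset + raw_size)
    return end_of_image if end_of_image < file_length else None


sections = [(0x400, 0x1000), (0x1400, 0x200)]
start = overlay_offset(0x400, sections, file_length=0x1800)
```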

PE Structural Abstractions

• Metasection:
  – abstraction for header, section, overlay components
• Metadata:
  – predefined data types
  – enumerated in the Data Directory (“DD”)
  – scattered throughout the image (and overlay)

Part 3: Enterprise Infrastructure: Data Management & Analysis

Infrastructure Overview

[Diagram components: BlackFiles, WhiteFiles, AnalysisDB, RegressionDB, Data Warehouse, Analysis Tools (PESWEEP, PeID, RegressionSuite)]

Data Repositories

• PE repository consists of
  – ~9,000 known good PEs (“white collection”)
  – ~70,000 known malicious PEs (“black collection”)
• Images processed through two tools
  – PEiD packer identifier [1]
  – Proprietary static analyzer PeSweep
• Post-process tool output, import into DB
• Mine DB for interesting correlations
  – Data mining is speculative, iterative, time-consuming
  – Results shown here are tip of iceberg

PeSweep Analysis

• Analyzes single file, directory, optional recursion
• For every file processed, generates info on:
  – Whether Windows is able to load it (inferred)
  – Details on how much of the structure the parser is able to parse
  – Entropy values on a sectional basis
  – Header structure
  – Anomaly bits
• Able to create both file and virtually mapped target mappings of the image
• Fully parses “explicit” content (header + metadata): import, export, relocation, resource, etc. values
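
The slides do not say which entropy measure PeSweep computes. A common choice for per-section analysis is Shannon entropy over byte frequencies, sketched here in Python as an assumption rather than a description of the tool. Values near 8 bits per byte usually indicate compressed or encrypted data.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum p * log2(p) over the frequency p of each distinct byte value.
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())


uniform = shannon_entropy(bytes(range(256)))   # every byte value once
constant = shannon_entropy(b"\x00" * 64)       # a single repeated value
```

Applied per section, the function would be called on each section's raw bytes in turn.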

Sample Analysis Results

Section Name Frequency

Sectional Analysis

Overlay Prevalence

Anomaly Frequency

Analysis Summary

• We’re profiling characteristics of known-bad and known-good images
• Distilling these results into general rules for filtering files at runtime
• These rules could help identify suspicious files
  – E.g., the more suspicious a file, the more analysis resources it receives

Analysis Use Case 1

• Goal: Identify Loadable PEs
  – Classify PEs as valid / invalid at runtime
• Approach: synthesized “loader test”
  – Indicates whether Windows will run the file
  – Composed of CreateProcess/LoadLibraryEx
  – Run across NT, 2000, XP, Vista

Loader Test Results

Analysis Data Use Case 2

• Goal: Identify Malicious PEs
  – Obviously a runtime heuristic generating a reliable “Is Suspicious” flag is valuable
• Single query of anomaly bits
  – Identifies 67% of black list
  – Identifies 1.4% of white list
• This could be improved dramatically by increasing the sophistication of our query.
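
A query of this kind can be pictured as a bit mask applied to each file's anomaly bits, with the hit rate computed separately for each collection. The Python sketch below uses toy data and a made-up mask purely for illustration. It does not reproduce the authors' query or their measurements.

```python
def hit_rate(anomaly_bits, query_mask):
    """Fraction of samples with at least one bit of `query_mask` set."""
    if not anomaly_bits:
        return 0.0
    hits = sum(1 for bits in anomaly_bits if bits & query_mask)
    return hits / len(anomaly_bits)


# Toy data for illustration only; these are NOT the authors' measurements.
black = [0b0101, 0b0001, 0b0000, 0b0110]
white = [0b0000, 0b0000, 0b0010, 0b0000]
mask = 0b0101  # hypothetical query: either of two anomaly bits

detection = hit_rate(black, mask)        # share of the black list flagged
false_positive = hit_rate(white, mask)   # share of the white list flagged
```

A more sophisticated query could weight individual bits or require combinations of them, trading detection rate against false positives.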

Part 4: Image Parsing in Depth

PE Parser Class Organization

PE Parsing Flowchart

ImageStream Initialization

• Same MapAndLoad process as before
• Calculate target stream size
  – Sum source stream metasection sizes, according to target stream mapping
• Construct target stream
  – Copy each source metasection at computed offset in target stream
  – Delicate process due to possible structural anomalies
• Parse anomalies are tracked throughout entire parsing process
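
The two passes (size the target, then copy each metasection to its computed offset) can be sketched in Python as follows. This is a deliberately simplified illustration, not the authors' implementation: it lays metasections out contiguously at the target alignment and ignores the VirtualAddress and VirtualSize fields a real section table supplies. The function and parameter names are assumptions.

```python
def round_up(size: int, align: int) -> int:
    """Round `size` up to the next multiple of `align` (align > 0)."""
    return (size + align - 1) // align * align


def build_target(source: bytes, parts, target_align: int) -> bytearray:
    """Build a target stream from (source_offset, raw_size) metasections.

    `parts` lists the header first, then each section, in image order.
    """
    # Pass 1: the target size is the sum of the aligned metasection sizes.
    total = sum(round_up(size, target_align) for _, size in parts)
    target = bytearray(total)
    # Pass 2: copy each metasection to its computed offset in the target.
    offset = 0
    for source_offset, size in parts:
        # Slicing clamps to the end of `source`, so a truncated file
        # yields a short copy instead of an out-of-bounds read.
        data = source[source_offset:source_offset + size]
        target[offset:offset + len(data)] = data
        offset += round_up(size, target_align)
    return target


image = build_target(b"HDRAAAAA", [(0, 3), (3, 5)], target_align=4)
```

Clamping the copy rather than trusting the declared size is one small example of the care needed when fields may be malformed.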

Stream Normalization

• Problem: MapAndLoad process is fragile
  – Image structure can be corrupted in a myriad of different ways
  – Non-validated fields can lead to crashes during mapping and loading
• Solution: preliminary scan of header
  – “Normalization” pass through the header to fix obviously illegal values
  – Guarantee subsequent parse pass succeeds
• Initial results were promising!

Stream Normalization (cont’d)

• Sample “illegal” values:
  – Section table entry RVA falls within the header
  – Section table entry with wild RVA and size values
  – Header structures overlap
  – Wild DD entries
• TinyPE breaks them all! [2]
  – File ends before nominal end of OptHdr!
• Demo
• Summary:
  – Normalization must allow many degenerate cases
  – Less is more
    • none is best ☺
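
In the spirit of "less is more", a check can report suspicious section table values as anomalies without rewriting them. The Python sketch below covers a few of the cases listed above. The labels, the function name, and the `max_image_size` threshold are assumptions made for illustration only.

```python
def check_section(rva, virtual_size, raw_offset, raw_size,
                  header_size, file_length, max_image_size=0x80000000):
    """Return a list of anomaly labels for one section table entry.

    Nothing is modified: the values are reported, not "fixed".
    """
    found = []
    if rva < header_size:
        found.append("rva-in-header")
    if rva + virtual_size > max_image_size:
        found.append("wild-rva-or-virtual-size")
    if raw_size and raw_offset + raw_size > file_length:
        found.append("raw-data-past-eof")
    return found


suspicious = check_section(0x200, 0x1000, 0x400, 0x200,
                           header_size=0x400, file_length=0x500)
clean = check_section(0x1000, 0x1000, 0x400, 0x200,
                      header_size=0x400, file_length=0x600)
```

Reporting instead of repairing leaves room for degenerate but loadable images such as TinyPE.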

In Summary

• Anomaly Mechanism
  – Useful source of info for analysis engine
• Parser Design
  – Hope there are some useful nuggets here..
• Infrastructure
  – Supports ongoing technology improvement and QA
  – Insight into malformations prevalent in the wild
  – Proven useful for technology refinement

Future Work

• Extend
  – Infrastructure
  – Analysis
• Refine heuristics for identifying malware and “suspicious” images
• Build additional tools
  – GUI version of PeSweep
• For now, SDK resources available at http://research.sunbelt-software.com/ViperSDK/
  – PeSweep (cmdline binary; no source ☺)
  – Presentation

Thanks!

[email protected]

References:
[1] PEiD homepage (http://peid.has.it/)
[2] TinyPE (http://www.phreedom.org/solar/code/tinype/)
[3] Matt Pietrek, Under The Hood, An In-Depth Look into the Win32 Portable Executable File Format, MSDN Magazine, April 2002, http://msdn.microsoft.com/msdnmag/issues/02/02/PE

