+ All Categories
Home > Documents > PE-Miner: Mining Structural Information to Detect...

PE-Miner: Mining Structural Information to Detect...

Date post: 04-Jul-2020
Category:
Upload: others
View: 1 times
Download: 0 times
Share this document with a friend
20
PE-Miner: Mining Structural Information to Detect Malicious Executables in Realtime M. Zubair Shafiq 1 , S. Momina Tabish 1,2 , Fauzan Mirza 2,1 , Muddassar Farooq 1 1 Next Generation Intelligent Networks Research Center (nexGIN RC) National University of Computer & Emerging Sciences (FAST-NUCES) Islamabad, 44000, Pakistan {zubair.shafiq,momina.tabish,muddassar.farooq}@nexginrc.org 2 School of Electrical Engineering & Computer Science (SEECS) National University of Sciences & Technology (NUST) Islamabad, 44000, Pakistan [email protected] Abstract. In this paper, we present an accurate and realtime PE-Miner framework that automatically extracts distinguishing features from portable executables (PE) to detect zero-day (i.e. previously unknown) malware. The distinguishing features are extracted using the structural informa- tion standardized by the Microsoft Windows operating system for exe- cutables, DLLs and object files. We follow a threefold research method- ology: (1) identify a set of structural features for PE files which is com- putable in realtime, (2) use an efficient preprocessor for removing re- dundancy in the features’ set, and (3) select an efficient data mining algorithm for final classification between benign and malicious executa- bles. We have evaluated PE-Miner on two malware collections, VX Heavens and Malfease datasets which contain about 11 and 5 thousand malicious PE files respectively. The results of our experiments show that PE-Miner achieves more than 99% detection rate with less than 0.5% false alarm rate for distinguishing between benign and malicious executables. PE- Miner has low processing overheads and takes only 0.244 seconds on the average to scan a given PE file. Finally, we evaluate the robustness and reliability of PE-Miner under several regression tests. Our results show that the extracted features are robust to different packing techniques and PE-Miner is also resilient to majority of crafty evasion strategies. Key words: Data Mining, Malicious Executable Detection, Malware Detection, Portable Executables, Structural Information 1 Introduction A number of non-signature based malware detection techniques have been pro- posed recently. These techniques mostly use heuristic analysis, behavior analysis, or a combination of both to detect malware. Such techniques are being actively investigated because of their ability to detect zero-day malware without any a
Transcript
Page 1: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

PE-Miner: Mining Structural Information toDetect Malicious Executables in Realtime

M. Zubair Shafiq1, S. Momina Tabish1,2, Fauzan Mirza2,1, Muddassar Farooq1

1Next Generation Intelligent Networks Research Center (nexGIN RC)National University of Computer & Emerging Sciences (FAST-NUCES)

Islamabad, 44000, Pakistan{zubair.shafiq,momina.tabish,muddassar.farooq}@nexginrc.org

2School of Electrical Engineering & Computer Science (SEECS)National University of Sciences & Technology (NUST)

Islamabad, 44000, [email protected]

Abstract. In this paper, we present an accurate and realtime PE-Minerframework that automatically extracts distinguishing features from portableexecutables (PE) to detect zero-day (i.e. previously unknown) malware.The distinguishing features are extracted using the structural informa-tion standardized by the Microsoft Windows operating system for exe-cutables, DLLs and object files. We follow a threefold research method-ology: (1) identify a set of structural features for PE files which is com-putable in realtime, (2) use an efficient preprocessor for removing re-dundancy in the features’ set, and (3) select an efficient data miningalgorithm for final classification between benign and malicious executa-bles.We have evaluated PE-Miner on two malware collections, VX Heavensand Malfease datasets which contain about 11 and 5 thousand maliciousPE files respectively. The results of our experiments show that PE-Minerachieves more than 99% detection rate with less than 0.5% false alarmrate for distinguishing between benign and malicious executables. PE-Miner has low processing overheads and takes only 0.244 seconds on theaverage to scan a given PE file. Finally, we evaluate the robustness andreliability of PE-Miner under several regression tests. Our results showthat the extracted features are robust to different packing techniques andPE-Miner is also resilient to majority of crafty evasion strategies.

Key words: Data Mining, Malicious Executable Detection, MalwareDetection, Portable Executables, Structural Information

1 Introduction

A number of non-signature based malware detection techniques have been pro-posed recently. These techniques mostly use heuristic analysis, behavior analysis,or a combination of both to detect malware. Such techniques are being activelyinvestigated because of their ability to detect zero-day malware without any a

Page 2: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

2 M. Zubair Shafiq, S. Momina Tabish, Fauzan Mirza, Muddassar Farooq

priori knowledge about them. Some of them have been integrated into the exist-ing Commercial Off the Shelf Anti Virus (COTS AV) products, but have achievedonly limited success [26], [13]. The most important shortcoming of these tech-niques is that they are not realtime deployable1. We, therefore, believe that thedomain of realtime deployable non-signature based malware detection techniquesis still open to novel research.

Non-signature based malware detection techniques are primarily criticizedbecause of two inherent problems: (1) high fp rate, and (2) large processing over-heads. Consequently, COTS AV products mostly utilize signature based detec-tion schemes that provide low fp rate and have acceptable processing overheads.But it is a well-known fact that signature based malware detection schemes areunable to detect zero-day malware. We cite two reports to highlight the alarmingrate at which new malware is proliferating. The first report is by Symantec thatshows an increase of 468% in the number of malware from 2006 to 2007 [25].The second report shows that the number of malware produced in 2007 alonewas more than the total number of malware produced in the last 20 years [6].These surveys suggest that signature based techniques cannot keep abreast withthe security challenges of the new millennium because not only the size of thesignatures’ database will exponentially increase but also the time of matchingsignatures. These bottlenecks are even more relevant on resource constrainedsmart phones and mobile devices [3]. We, therefore, envision that in near futuresignature based malware detection schemes will not be able to meet the criterionof realtime deployable as well.

We argue that a malware detection scheme which is realtime deployableshould use an intelligent yet simple static analysis technique. In this paper wepropose a framework, called PE-Miner, which uses novel structural features toefficiently detect malicious PE files. PE is a file format which is standardized bythe Microsoft Windows operating systems for executables, dynamically linkedlibraries (DLL), and object files. We follow a threefold research methodology inour static analysis: (1) identify a set of structural features for PE files whichis computable in realtime, (2) use an efficient preprocessor for removing redun-dancy in the features’ set, and (3) select an efficient data mining algorithm forfinal classification. Consequently, our proposed framework consists of three mod-ules: the feature extraction module, the feature selection/preprocessing module,and the detection module.

We have evaluated our proposed detection framework on two independentlycollected malware datasets with different statistics. The first malware datasetis the VX Heavens Virus collection consisting of more than ten thousand mali-cious PE files [27]. The second malware dataset is the Malfease dataset, whichcontains more than five thousand malicious PE files [21]. We also collected morethan one thousand benign PE files from our virology lab, which we use in con-junction with both malware datasets in our study. The results of our experiments

1 We define a technique as realtime deployable if it has three properties: (1) a tp rate(or true positive rate) of approximately 1, (2) an fp rate (or false positive rate) ofapproximately 0, and (3) the file scanning time is comparable to existing COTS AV.

Page 3: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

PE-Miner: Mining Structural Information to Detect Malicious Executables 3

show that our PE-miner framework achieves more than 99% detection rate withless than 0.5% false alarm rate for distinguishing between benign and maliciousexecutables. Further, our framework takes on the average only 0.244 secondsto scan a given PE file. Therefore, we can conclude that PE-Miner is realtimedeployable, and consequently it can be easily integrated into existing COTS AVproducts. PE-Miner framework can also categorize the malicious executables asa function of their payload. This analysis is of great value for system adminis-trators and malware forensic experts. An interested reader can find details inthe accompanying technical report [23].

We have also compared PE-Miner with other promising malware detectionschemes proposed by Perdisci et al. [18], Schultz et al. [22], and Kolter et al.[11]. These techniques use some variation of n-gram analysis for malware de-tection. PE-Miner provides better detection accuracy2 with significantly smallerprocessing overheads compared with these approaches. We believe that the su-perior performance of PE-Miner is attributable to a rich set of novel PE formatspecific structural features, which provides relevant information for better de-tection accuracy [10]. In comparison, n-gram based techniques are more suitablefor classification of loosely structured data; therefore, they fail to exploit formatspecific structural information of a PE file. As a result, they provide lower de-tection rates and have higher processing overheads as compared to PE-Miner.Our experiments also demonstrate that the detection mechanism of PE-Minerdoes not show any significant bias towards packed/non-packed PE files. Finally,we investigate the robustness of PE-Miner against “crafty” attacks which arespecifically designed to evade detection mechanism of PE-Miner. Our resultsshow that PE-Miner is resilient to majority of such evasion attacks.

2 PE-Miner Framework

In this section, we discuss our proposed PE-Miner framework. We set the follow-ing strict requirements on our PE-Miner framework to ensure that our researchis enacted with a product development cycle that has a short time-to-market:

– It must be a pure non-signature based framework with an ability to detectzero-day malicious PE files.

– It must be realtime deployable. To this end, we say that it should have morethan 99% tp rate and less than 1% fp rate. We argue that it is still a challengefor non-signature based techniques to achieve these true and false positiverates. Moreover, its time to scan a PE file must be comparable to those ofexisting COTS AV products.

2 Throughout this text, the terms detection accuracy and Area Under ROC Curve(AUC) are used interchangeably. ROC curves are extensively used in machine learn-ing and data mining to depict the tradeoff between the true positive rate and falsepositive rate of a classifier. The AUC (0 ≤ AUC ≤ 1) is used as a yardstick to de-termine the detection accuracy from ROC curve. Higher values of AUC mean hightp rate and low fp rate [28]. At AUC = 1, tp rate = 1 and fp rate = 0.

Page 4: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

4 M. Zubair Shafiq, S. Momina Tabish, Fauzan Mirza, Muddassar Farooq

Fig. 1. The architecture of our PE-Minerframework Fig. 2. The PE file format

– Its design must be modular that allows for the plug-n-play design philosophy.This feature will be useful in customizing the detection framework to specificrequirements, such as porting it to the file formats used by other operatingsystems.

We have evolved the final modular architecture of our PE-Miner frameworkin a question oriented engineering fashion. In our research, we systematicallyraised following relevant questions, analyzed their potential solutions, and finallyselected the best one through extensive empirical studies.

1. Which PE format specific features can be statically extracted from PE filesto distinguish between benign and malicious files? Moreover, are the formatspecific features better than the existing n-grams or string-based features interms of detection accuracy and efficiency?

2. Do we need to deploy preprocessors on the features’ set? If yes then whichpreprocessors are best suited for the raw features’ set?

3. Which are the best back-end classification algorithms in terms of detectionaccuracy and processing overheads.

Our PE-Miner framework consists of three main modules inline with theabove-mentioned vision: (1) feature extraction, (2) feature preprocessing, and(3) classification (see Figure 1). We now discuss each module separately.

2.1 Feature Extraction

Let us revisit the PE file format [12] before we start discussing the structuralfeatures used in our features’ set. A PE file consists of a PE file header, a sectiontable (section headers) followed by the sections’ data. The PE file header consistsof a MS DOS stub, a PE file signature, a COFF (Common Object File Format)header, and an optional header. It contains important information about a filesuch as the number of sections, the size of the stack and the heap, etc. Thesection table contains important information about the sections that follow it,

Page 5: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

PE-Miner: Mining Structural Information to Detect Malicious Executables 5

Table 1. List of the features extracted from PE files

Feature Description Type Quantity

DLLs referred binary 73COFF file header integer 7

Optional header – standard fields integer 9Optional header – Windows specific fields integer 22

Optional header – data directories integer 30.text section – header fields integer 9.data section – header fields integer 9.rsrc section – header fields integer 9

Resource directory table & resources integer 21

Total 189

Table 2. Mean values of the extracted features. The bold values in every row highlight interesting outliers.

Dataset VX Heavens Malfease

Name of Benign Backdoor Constructor DoS + Flooder Exploit + Worm Trojan Virus -Feature + Sniffer + Virtool Nuker Hacktool

WSOCK32.DLL 0.037 0.503 0.038 0.188 0.353 0.261 0.562 0.242 0.053 0.065WININET.DLL 0.073 0.132 0.009 0.013 0.04 0.141 0.004 0.103 0.019 0.086# Symbols 430.2 2.0E6 14.7 59.4 25.8 3.5E6 38.8 4.1E6 1.0E6 2.7E7

Maj Linker Ver 4.7 14.4 11.2 14.1 12.1 12.3 18.7 12.2 19.3 6.5Init Data Size (E5) 4.4 1.1 0.5 0.4 0.8 0.7 0.4 0.4 0.1 0.6

Maj Img Ver 163.1 1.6 6.3 0.4 0.6 11.2 0.3 6.0 53.6 0.2DLL Char 5.8x103 0.0 0.0 0.0 0.0 24.9 0.0 3.1 230.8 18.7

Exp Tbl Size (E2) 13.7 2.4 1.7 14.1 5.0 0.3 1.2 2.1 0.9 0.05Imp Tbl Size (E2) 5.8 19.2 6.1 7.9 20.8 7.1 23.4 10.3 6.2 4.7Rsrc Tbl Size (E4) 32.6 5.5 1.5 1.4 6.2 1.0 2.6 2.2 0.5 5.9Except Tbl Size 12.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3.5

.data Raw Size (E3) 25.2 8.4 5.6 6.3 6.0 7.9 6.1 5.5 6.7 22.1# Cursors 14.5 6.4 6.7 7.4 6.1 5.9 5.8 6.0 3.0 6.8# Bitmaps 12.6 1.2 0.0 1.0 0.6 0.7 1.2 1.4 2.4 0.5# Icons 17.6 2.5 1.9 2.7 2.0 2.1 1.8 1.9 4.5 2.2

# Dialogs 10.9 3.2 1.5 3.2 1.5 2.0 1.9 1.7 2.2 2.3# Group Cursors 11.6 6.0 6.6 7.2 5.8 5.8 5.4 5.7 2.7 6.7# Group Icons 4.1 1.0 0.7 1.0 0.8 0.7 0.5 0.7 1.5 0.6

such as their name, offset and size. These sections contain the actual data suchas code, initialized data, exports, imports and resources [12], [15].

Figure 2 shows an overview of the PE file format [12], [15]. It is important tonote that the section table contains Relative Virtual Addresses (RVAs) and thepointers to the start of every section. On the other hand, the data directories inan optional header contain references to various tables (such as import, export,resource, etc.) present in different sections. These references, if appropriatelyanalyzed, can provide useful information.

We believe that this structural information about a PE file should be lever-aged to extract features that have the potential to achieve high detection accu-racy. Using this principle, we statically extract a set of large number of featuresfrom a given PE file3. These features are summarized in Table 1. In the discus-sion below, we first intuitively argue about the features that have the potentialto distinguish between benign and malicious files. We then show interesting ob-servations derived from the executable datasets used in our empirical studies.

DLLs referred. The list of DLLs referred in an executable effectively pro-vides an overview of its functionality. For example, if an executable calls WINSOCK.DLL

3 A well-known Microsoft Visual C++ utility, called dumpbin, dumps the relevantinformation which is present inside a given PE file [4]. Another freely available utility,called pedump, also does the required task [20].

Page 6: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

6 M. Zubair Shafiq, S. Momina Tabish, Fauzan Mirza, Muddassar Farooq

or WSOCK.DLL then it is expected to perform network related activities. However,there can be exceptions to this assumption as well. In [22], Schultz et al. have usedthe conjunction of DLL names, with a similar functionality, as binary features.The results of their experiments show that this feature helps to attain reasonabledetection accuracy. However, our pilot experimental studies have revealed thatusing them as individual binary features can reveal more information, and hencecan be more helpful in detecting malicious PE files. In this study, we have used73 core functionality DLLs as features. Their list and functionality is detailedin [23]. Table 2 shows the mean feature values for the two DLLs4. Interestingly,WSOCK32.DLL and WININET.DLL are used by the majority of backdoors, nukers,flooders, hacktools, worms, and trojans to access the resources on the networkand the Internet. Therefore, the applications misusing these DLLs might providea strong indication of a possible covert network activity.

COFF file header. The COFF file header contains important informationsuch as the type of the machine for which the file is intended, the nature ofthe file (DLL, EXE, or OBJ etc.), the number of sections, and the number ofsymbols. It is interesting to note in Table 2 that a reasonable number of symbolsare present in benign executables. The malicious executables, however, eithercontain too many or too few symbols.

Optional header: standard fields. The interesting information in thestandard fields of the optional header include the linker version used to cre-ate an executable, the size of the code, the size of the initialized data, the sizeof the uninitialized data, and the address of the entry point. Table 2 shows thatthe values of major linker version and the size of the initialized data have asignificant difference in the benign and malicious executables. The size of theinitialized data in benign executables is usually significantly higher compared tothose of the malicious executables.

Optional header: Windows specific fields. The Windows specific fieldsof the optional header include information about the operating system version,the image version, the checksum, the size of the stack and the heap. It can beseen in Table 2 that the values of fields such as the major image version andthe DLL characteristics are usually set to zero in the malicious executables. Incomparison, their values are significantly higher in the benign executables.

Optional header: data directories. The data directories of the optionalheader provide pointers to the actual data present in the sections following it. Itincludes the information about export, import, resource, exception, debug, cer-tificate, and base relocation tables. Therefore, it effectively provides a summaryof the contents of an executable. Table 2 highlights that the size of the exporttable is higher for the benign executables and nukers as compared to those ofother malicious executables. Another interesting observation in Table 2 is thatthe backdoors, flooders, worms and trojans mostly have a bigger import tablesize. It can be intuitively argued that they usually import network functionali-ties which increase the size of their import table. The size of the resource table,on the other hand, is higher for the benign executables as compared to those of

4 The details of the datasets and their categorization are available in Section 3.

Page 7: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

PE-Miner: Mining Structural Information to Detect Malicious Executables 7

the malicious executables. The exception table is mostly absent in the maliciousexecutables.

Section headers. The section headers provide important characteristics ofa section such as its address, size, number of relocations, and line numbers. Inthis study, we have only considered text, data and resource sections because theyare commonly present in the executables. Note that the size of the data section(if present) is relatively larger for the benign executables.

Resource directory table & resources. The resource directory table pro-vides an overview of the resources that are present in the resource section of anexecutable file. We consider the actual count of various types of resources thatare present in the resource section of an executable file. The typical examples ofresources include cursors, bitmaps, icons, menus, dialogs, fonts, group cursors,and user defined resources. Intuitively and as shown in Table 2, the number ofthese resources is relatively higher for the benign executables.

2.2 Feature Selection/Preprocessing

We have now identified our features’ set that consists of a number of staticallycomputable features – 189 to be precise – based on the structural informationof the PE files. It is possible that some of the features might not convey usefulinformation in a particular scenario. Therefore, it makes sense to remove orcombine them with other similar features to reduce the dimensionality of ourinput feature space. Moreover, this preprocessing on the raw extracted features’set also reduces the processing overheads in training and testing of classifiers, andcan possibly also improve the detection accuracy of classifiers. In this study, wehave used three well-known features’ selection/preprocessing filters. We providetheir short descriptions in the following text. More details can be found in [29].

Redundant Feature Removal (RFR). We apply this filter to removethose features that do not vary at all or show significantly large variation i.e. theyhave approximately uniform-random behavior. Consequently, this filter removesall features that have either constant values or show a variance above a thresholdor both.

Principal Component Analysis (PCA). The Principal Component Analy-sis (PCA) is a well-known filter for dimensionality reduction. It is especiallyuseful when the input data has high dimensionality – sometimes referred to ascurse of dimensionality. This dimensionality reduction can possibly improve thequality of an analysis on a given data if the dataset consists of highly corre-lated or redundant features. However, this dimensionality reduction may resultin information loss (i.e. reduction in data variance) as well. One has to care-fully choose the appropriate balance for this tradeoff. We apply PCA filter toremove/combine correlated features for dimensionality reduction.

Haar Wavelet Transform (HWT). The principle of this technique isthat the most relevant information is stored with the highest coefficients at eachorder of a transform. The lower order coefficients can be ignored to get onlythe most relevant information. The wavelet transform has also been used fordimensionality reduction. The wavelet transform technique has been extensively

Page 8: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

8 M. Zubair Shafiq, S. Momina Tabish, Fauzan Mirza, Muddassar Farooq

Table 3. Statistics of the data used in this study.

Dataset VX Heavens Malfease

- Benign Backdoor Constructor DoS + Flooder Exploit + Worm Trojan Virus -+ Sniffer + Virtool Nuker Hacktool

Quantity 1, 447 3, 455 367 267 358 243 1, 483 3, 114 1, 052 5, 586Avg. Size 1, 263 270 234 176 298 156 72 136 50 285

(KB)

Min. Size 4 1 4 3 6 4 2 1 2 1(KB)

Max. Size 104, 588 9, 277 5, 832 1, 301 14, 692 1, 924 2, 733 4, 014 1, 332 5, 746(KB)

UPX 17 786 79 15 32 43 353 622 48 470ASPack 2 432 21 16 25 15 66 371 10 187

Misc. Packed 372 325 47 31 58 38 471 170 71 1, 909Borland 15 56 8 15 10 6 13 63 18 11C/C++

Borland Delphi 13 589 13 65 64 8 76 379 71 342Visual Basic 4 719 106 39 126 38 210 674 119 809Visual C++ 526 333 19 51 29 59 89 619 96 351Visual C# 56 0 0 0 1 0 5 1 6 1Misc. Other 9 49 9 2 3 2 4 15 7 5

Non-packed (%) 43.1 50.5 42.2 64.4 65.1 46.5 26.8 56.2 30.1 27.2Packed (%) 27.0 44.7 40.1 23.2 32.1 39.5 60.0 37.4 12.3 46.6

Not Found (%) 29.9 4.8 17.7 12.4 2.8 14.0 13.2 6.4 57.6 26.2

used in the image compression but is never evaluated in the malware detectiondomain. The Haar wavelet is one of the simplest wavelets and is known to providereasonable accuracy. The application of Haar wavelet transform requires inputdata to be normalized. Therefore, we have passed the data through a normalizefilter before applying HWT.

2.3 Classification

Once the dimensionality of the input features’ set is reduced by applying oneof the above-mentioned preprocessing filters, it is given as an input to the well-known data mining algorithms for classification. In this study we have used fiveclassifiers: (1) instance based learner (IBk), (2) decision tree (J48), (3) NaıveBayes (NB), (4) inductive rule learner (RIPPER), and (5) support vector ma-chines using sequential minimal optimization (SMO). An interested reader canfind their details in the accompanying technical report [23].

3 Datasets

In this section, we present an overview of the datasets used in our study. Wehave collected 1, 447 benign PE files from the local network of our virology lab.The collection contains executables such as Packet CAPture (PCAP) file parserscompiled by MS Visual Studio 6.0, compressed installation executables, and MSWindows XP/Vista applications’ executables. The diversity of the benign filesis also evident from their sizes, which range from a minimum of 4 KB to amaximum of 104, 588 KB (see Table 3).

Moreover, we have used two malware collections in our study. First is theVX Heavens Virus Collection, which is labeled and is publicly available for freedownload [27]. We only consider PE files to maintain focus. Our filtered dataset

Page 9: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

PE-Miner: Mining Structural Information to Detect Malicious Executables 9

contains 10, 339 malicious PE files. The second dataset is the Malfease malwaredataset [21], which consists of 5, 586 unlabeled malicious PE files.

In order to conduct a comprehensive study, we further categorize the ma-licious PE files as a function of their payload5. The malicious executables aresubdivided into eight major categories such as virus, trojan, worm, etc [7]. More-over, we have combined some categories that have similar functionality. For ex-ample, we have combined constructor and virtool to create a single constructor+ virtool category. This unification increases the number of malware samplesper category. Brief introductions of every malware category are provided in theaccompanying technical report [23].

Table 3 provides the detailed statistics of the malware used in our study. Itcan be noted that the average size of the malicious executables is smaller thanthat of the benign executables. Further, some executables used in our study areencrypted and/or compressed (packed). The detailed statistics about packingare also tabulated in Table 3. We use PEiD [16] and Protection ID for detectingpacked executables [19]6.

Our analysis shows that VX Heavens Virus collection contains 40.1% packedand 47.2% non-packed PE files. However, approximately 12.7% malicious PEfiles cannot be classified as either packed or non-packed by PEiD and Protec-tion ID. The Malfease collection contains 46.6% packed and 27.2% non-packedmalicious PE files. Similarly, 26.2% malicious PE files cannot be classified aspacked or non-packed. Therefore, we can say that packed/non-packed malwaredistribution in the VX Heavens virus collection is relatively more balanced thanthe Malfease dataset. In our collection of benign files, 43.1% are packed and27.0% are non-packed PE files respectively. Similarly, 29.9% benign files are notdetected by PEiD and Protection ID. An interesting observation is that thebenign PE files are mostly packed using nonstandard and custom developedpackers. We speculate that a significant portion of the packed executables arenot classified as packed because the signatures of their respective packers are notpresent in the database of PEiD or Protection ID. Note that we do not manuallyunpack any PE file prior to the processing of our PE-Miner.

4 Related Work

We now briefly describe the most relevant non-signature based malware detectiontechniques. These techniques are proposed by Perdisci et al. [18], Schultz et al.[22] and Kolter et al. [11]. We briefly summarize their working principles in thefollowing paragraphs but an interested reader can find their detailed descriptionin [23].

In [18], the authors proposed McBoost that uses two classifiers – C1 andC2 – for classification of non-packed and packed PE files respectively. A custom5 Since the Malfease malware collection is unlabeled; therefore, it is not possible to

divide it into different malware categories.6 We acknowledge the fact that PEiD and Protection ID are signature based packer

detectors and can have significant false negatives.

Page 10: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

10 M. Zubair Shafiq, S. Momina Tabish, Fauzan Mirza, Muddassar Farooq

developed unpacker is used to extract the hidden code from the packed PEfiles and the output of the unpacker is given as an input to the C2 classifier.Unfortunately, we could not obtain its source code or binary due to licensingrelated problems. Furthermore, its implementation is not within the scope ofour current work. Consequently, we only evaluate the C1 module of McBoostwhich works only for non-packed PE files. Therefore, we acknowledge that ourMcBoost results should be considered only preliminary.

In [22], Schultz et al. have proposed three independent techniques for detect-ing malicious PE files. The first technique, uses the information about DLLs,function calls and their invocation counts. However, the authors did not provideenough information about the used DLLs and function names; therefore, it is notpossible for us to implement it. But we have implemented the second approach(titled strings) which uses strings as binary features i.e. present or absent. Thethird technique uses two byte words as binary features. This technique is laterimproved in a seminal work by Kolter et al. [11] which uses 4-grams as binaryfeatures. Therefore, we include the technique of Kolter et al. (titled KM ) in ourcomparative evaluation.

5 Experimental Results

We have compared our PE-Miner framework with recently proposed promisingtechniques by Perdisci et al. [18], Schultz et al. [22], and Kolter et al. [11]. Wehave used the standard 10 fold cross-validation process in our experiments, i.e.,the dataset is randomly divided into 10 smaller subsets, where 9 subsets areused for training and 1 subset is used for testing. The process is repeated 10times for every combination. This methodology helps in systematically evaluat-ing the effectiveness of our approach to detect previously unknown (i.e. zero-day)malicious PE files. The ROC curves are generated by varying the threshold onoutput class probability [5], [28]. The AUC is used as a yardstick to determinethe detection accuracy of each approach. We have done the experiments on anIntel Pentium Core 2 Duo 2.19 GHz processor with 2 GB RAM. The MicrosoftWindows XP SP2 is installed on this machine.

5.1 Malicious PE File Detection

In our first experimental study, we attempt to distinguish between benign andmalicious PE files. To get better insights, we have done independent experimentswith benign and each of the eight types of the malicious executables. The fivedata mining algorithms, namely IBk, J48, NB, RIPPER, and SMO, are deployedon top of each approach (namely PE-Miner with RFR, PE-Miner with PCA, PE-Miner with HWT, McBoost (C1 only) by Perdisci et al. [18], strings approachby Schultz et al. [22], and KM by Kolter et al. [11]). This results in a total of 270experimental runs each with 10-fold cross validation. We tabulate our resultsfor this study in Table 4 and now answer different questions that we raised inSection 2 in a chronological fashion.

Page 11: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

PE-Miner: Mining Structural Information to Detect Malicious Executables 11

Table 4. AUCs for detecting the malicious executables. The bold entries in each column represent the best results.

Dataset VX Heavens Malfease

Malware Backdoor + Constructor DoS + Flooder Exploit + Worm Trojan Virus Average -Sniffer + Virtool Nuker Hacktool

PE-Miner — RFRIBK 0.992 0.996 0.995 0.994 0.998 0.979 0.984 0.994 0.992 0.986J48 0.993 0.998 0.987 0.993 0.999 0.979 0.992 0.993 0.992 0.979NB 0.971 0.978 0.966 0.973 0.987 0.972 0.974 0.986 0.976 0.976

RIPPER 0.996 0.996 0.977 0.981 0.999 0.988 0.988 0.996 0.990 0.985SMO 0.991 0.990 0.991 0.993 0.997 0.975 0.978 0.992 0.988 0.963

PE-Miner — PCAIBK 0.989 0.996 0.994 0.995 0.998 0.976 0.984 0.993 0.991 0.984J48 0.980 0.966 0.929 0.960 0.987 0.936 0.951 0.985 0.962 0.945NB 0.961 0.990 0.993 0.996 0.996 0.964 0.956 0.990 0.981 0.898

RIPPER 0.982 0.978 0.996 0.974 0.977 0.949 0.968 0.987 0.976 0.952SMO 0.990 0.992 0.989 0.995 0.995 0.958 0.965 0.992 0.985 0.954

PE-Miner — HWTIBK 0.991 0.996 0.996 0.998 1.000 0.978 0.985 0.995 0.992 0.986J48 0.995 0.997 0.993 0.988 0.997 0.978 0.991 0.999 0.992 0.977NB 0.989 0.982 0.983 0.987 0.990 0.978 0.972 0.990 0.984 0.960

RIPPER 0.994 0.997 0.982 0.990 0.997 0.983 0.990 1.000 0.992 0.987

SMO 0.990 0.995 0.991 0.996 1.000 0.972 0.973 0.994 0.989 0.964

McBoost — C1 onlyIBK 0.941 0.935 0.875 0.960 0.832 0.938 0.930 0.914 0.916 0.949J48 0.866 0.895 0.809 0.893 0.731 0.906 0.902 0.882 0.860 0.860NB 0.831 0.924 0.723 0.889 0.795 0.873 0.886 0.844 0.846 0.817

RIPPER 0.833 0.888 0.744 0.918 0.660 0.866 0.838 0.844 0.824 0.860SMO 0.802 0.887 0.759 0.910 0.678 0.854 0.805 0.827 0.815 0.835

StringsIBK 0.949 0.860 0.902 0.980 0.925 0.928 0.863 0.952 0.920 0.944J48 0.913 0.834 0.862 0.695 0.871 0.908 0.836 0.938 0.857 0.929NB 0.920 0.830 0.882 0.726 0.886 0.901 0.828 0.905 0.860 0.930

RIPPER 0.843 0.797 0.714 0.578 0.712 0.892 0.743 0.929 0.776 0.927SMO 0.855 0.817 0.705 0.775 0.583 0.871 0.756 0.883 0.781 0.933

KMIBK 0.984 0.934 0.983 0.971 0.983 0.987 0.979 0.986 0.976 0.980J48 0.953 0.940 0.916 0.907 0.916 0.957 0.951 0.953 0.937 0.952NB 0.943 0.959 0.961 0.952 0.961 0.968 0.954 0.954 0.957 0.961

RIPPER 0.951 0.944 0.924 0.921 0.924 0.964 0.948 0.948 0.941 0.971SMO 0.949 0.946 0.952 0.927 0.952 0.961 0.940 0.938 0.946 0.960

Which features’ set is the best? Table 4 tabulates the AUCs for PE-Miner using three different preprocessing filters (RFR, PCA and HWT), Mc-Boost, strings and KM [11]. A macro level scan through the table clearly showsthe supremacy of PE-Miner based approaches with AUCs more than 0.99 formost of the malware types and even approaching 1.00 for some malware types.For PE-Miner, RFR and HWT preprocessing lead to the best average resultswith more than 0.99 AUC.

The strings approach gives the worst detection accuracy. The KM approach isbetter than the strings approach but inferior to our PE-Miner. This is expectedbecause the string features are not stable as compiling a given piece of code byusing different compilers leads to different sets of strings. Our analysis showsthat KM approach is more resilient to variation in the string sets because ituses a combination of string and non-string features. The results obtained forKM approach (AUC= 0.95) are also consistent with the results reported in [11].The C1 module of McBoost also provides relatively inferior detection accuracieswhich are as low as 0.66 for exploit+hacktool category. It is important to notethat the C1 module of McBoost is functionally similar to the techniques proposedby Schultz et al. and Kolter et al. The only significant difference is that C1operates only on the code sections of the non-packed PE files whereas the othertechniques operate on complete files.

Page 12: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

12 M. Zubair Shafiq, S. Momina Tabish, Fauzan Mirza, Muddassar Farooq

Table 5. The processing overheads (in seconds/file) of different feature selection, extraction and preprocessingschemes.

PE-Miner McBoost Strings KM(RFR) (PCA) (HWT)

Selection - - - 2.839 5.289 31.499Extraction 0.228 0.228 0.228 0.198 0.130 0.220

Preprocessing 0.007 0.009 0.012 - - -

Total 0.235 0.237 0.240 3.037 5.419 31.719

It is important to emphasize that both strings and KM approaches incurlarge overheads in the feature selection process (see Table 57). Kolter et al. haveconfirmed that their implementation of information gain calculation for featureselection took almost a day for every run. To make our implementation of n-grams more efficient, we use hash map STL containers in the Visual C++ [8].Our experiments show that the feature selection process in KM still takes morethan 31 seconds per file even with our optimized implementation. The optimizedstrings approach takes, on the average, more than 5 seconds per file for featureselection. The optimized McBoost (C1 only) approach takes an average of morethan 2 seconds per file for feature selection8. These approaches have processingoverheads because the time to calculate information gain increases exponentiallywith the number of unique n-grams (or strings). On the other hand, PE-Minerdoes not suffer from such serious bottlenecks. The application of RFR, PCA, orHWT filters takes only about a hundredth of a second.

Which classification algorithm is the best? We can conclude from Table4 that J48 outperforms the rest of the data mining classifiers in terms of thedetection accuracy in most of the cases. Moreover, Table 6 shows that J48 has oneof the smallest processing overheads both in training and testing. RIPPER andIBk closely follow the detection accuracy of J48. However, they are infeasible forrealtime deployment because of the high processing overheads in the training andthe testing phases respectively. The processing overheads of training RIPPER arethe highest among all classifiers. In comparison, IBk does not require a trainingphase but its processing overheads in the testing phase are the highest. Further,Naıve Bayes gives the worst detection accuracy because it assumes independenceamong input features. Intuitively speaking, this assumption does not hold for thefeatures’ sets used in our study. Note that Naıve Bayes has very small learningand testing overheads (see Table 69).

Which malware category is the most challenging to detect? An overviewof Table 4 suggests that the most challenging malware categories are worms andtrojans. The average AUC values of the compared techniques for worms andtrojans are approximately 0.95. The poor detection accuracy is attributed tothe fact that the trojans are inherently designed to appear similar to the benign

7 The results in Table 5 are averaged over 100 runs.8 Note that the complete McBoost system also uses unpacker for extraction of hidden

code. This process is time consuming as reported by the authors in [18].9 The results in Table 6 are averaged over 100 runs.

Page 13: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

PE-Miner: Mining Structural Information to Detect Malicious Executables 13

0 0.02 0.04 0.06 0.08 0.10.9

0.91

0.92

0.93

0.94

0.95

0.96

0.97

0.98

0.99

1

fp ratetp

rat

e

BackdoorConstructorDoSFlooderExploitWormTrojanVirus

Fig. 3. The magnified ROC plots for detecting the malicious executables using PE-Miner utilizing J48 preprocessed with RFR filter.

Table 6. The processing overheads (in seconds/file) of different features and classification algorithms.

IBK J48 NB RIPPER SMO IBK J48 NB RIPPER SMO

Training Testing

PE-Miner (RFR) - 0.008 0.001 0.269 0.199 0.032 0.001 0.002 0.002 0.002PE-Miner (PCA) - 0.007 0.001 0.264 0.179 0.035 0.001 0.001 0.001 0.002PE-Miner (HWT) - 0.007 0.001 0.252 0.147 0.032 0.001 0.002 0.001 0.002

McBoost - 0.021 0.004 1.305 1.122 0.218 0.010 0.007 0.005 0.022Strings - 0.009 0.002 0.799 0.838 0.163 0.003 0.003 0.002 0.003

KM - 0.024 0.004 1.510 1.018 0.254 0.018 0.007 0.005 0.020

Table 7. Realtime deployable analysis of the best techniques

Technique Classifier AUC Scan Time Is Realtime(sec/file) Deployable?

PE-Miner (RFR) J48 0.991 0.244 “Yes”

McBoost IBk 0.926 3.255 NoStrings IBk 0.927 5.582 No

KM IBk 0.977 31.973 NoAVG Free 8.0 [1] - - 0.159 -Panda 7.01 [14] - - 0.131 -

executables. Therefore, it is a difficult challenge to distinguish between trojansand benign PE files. Our PE-Miner still achieves on the average 0.98 AUC forworms and trojans which is quite reasonable. Figure 3 shows that for othermalware categories, PE-Miner (with RFR preprocessor) has AUCs more than0.99.

5.2 Miscellaneous Discussions

We conclude our comparative study with an answer to an important issue: whichof the compared techniques meet the criterion of being realtime deployable? (seeSection 2). We tabulate the AUC and the scan time of the best techniques inTable 7. Moreover, we also show the scan time of two well-known COTS AVproducts for doing the realtime deployable analysis of different non-signaturebased techniques. It is clear that PE-Miner (RFR) with J48 classifier is theonly non-signature based technique that satisfies the criterion of being realtimedeployable. One might argue that PE-Miner framework provides only a small

Page 14: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

14 M. Zubair Shafiq, S. Momina Tabish, Fauzan Mirza, Muddassar Farooq

Table 8. Portion of the developed decision trees for distinguishing between benign and back-door+sniffer

NumMessageTable <= 0| SizeLoadConfigTable <= 0| | TimeDateStamp <= 1000000000| | | NumCursor <= 1| | | | NumAccelerators <= 0| | | | | NumBitmap <= 0: malicious| | | | | NumBitmap > 0: benign| | | | NumAccelerators > 0:malicious| | | NumCursor > 1:malicious

improvement in detection accuracy over the KM approach. But then KM has theworst scan time of 31.97 seconds per file (see Table 7). It is very important tointerpret the results in Table 7 from a security expert’s perspective. For example,if a malware detector scans ten thousand files with an AUC of 0.97, it will notdetect approximately 300 malicious files. In comparison, a detector with an AUCof 0.99 will miss only 100 files, which is a 66.6% improvement in the number ofmissed files [2]. Therefore, we argue that from a security expert’s perspective,even a small improvement in the detection accuracy is significant in the limitingcase when the detection accuracy approaches to 1.00.

An additional benefit of PE-Miner is that it provides insights about thelearning models of different classifiers that can be of great value to malwareforensic experts. We show a partial subtree of J48 for categorizing benign andmalicious PE files in Table 8. The message tables mostly do not exist in thebackdoor+sniffer categories. The TimeDateStamp is usually obfuscated in themalicious executables. The number of resources are generally smaller in maliciousPE files, whereas the benign files tend to have larger number of resources suchas menus, icons, and user defined resources. Similar insights are also providedby the rules developed in the training phase of RIPPER.

In [9], the authors have pointed out several difficulties in parsing PE files.In our experiments, we have also observed various anomalies in parsing thestructure of malicious PE files. Table 9 contains the list of anomalies which wehave observed in parsing malicious PE files. A significant proportion of maliciousPE files have anomalous structure which can crash a naıve PE file parser. Figure4 provides the statistics of anomalies which we have observed in parsing maliciousPE files of VX Heavens and Malfease collections. To this end, we have developeda set of heuristics which successfully handle the above-mentioned anomalies.

6 Robustness and Reliability Analysis of PE-Miner

We have now established the fact that PE-Miner is a realtime deployable schemefor zero-day malware detection. A careful reader might ask whether the statementstill holds if the “ground truth” is now changed as: (1) we cannot trust theclassification of signature based packer detectors PEiD and Protection ID, and(2) a “crafty” attacker can forge the features of malicious files with those of

Page 15: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

PE-Miner: Mining Structural Information to Detect Malicious Executables 15

Table 9. List of the anomalies observed in parsing malicious PE files

ID Description

1 Large number of sections2 SizeOfHeader field is unaligned3 Overlapping DoS and PE headers4 Large virtual size in a section5 Large raw data size in a section6 Zero/Non-zero pair in data directory table7 Large pointer in data directory entry8 Size of section is too large9 Section name garbled (non printable characters)10 There is an unknown overlay region11 Out of file pointer

1 2 3 4 5 6 7 8 9 10 110

10

20

30

40

50

Anomaly ID

File

s (%

)

VX Heavens Malfease

Fig. 4. Statistics of anomalies observed in parsing malicious PE files

benign files to evade detection. In this section, we do a stress and regressiontesting of PE-Miner to analyze robustness of its features and its resilience topotential evasive techniques.

6.1 Robustness Analysis of Extracted Features

It is a well-known fact that signature based packer detector PEiD, which weare using to distinguish between packed and non-packed executables, has ap-proximately 30% false negative rate [17]. In order to convince ourselves that ourextracted features are actually “robust”, we evaluate PE-Miner in four scenar-ios: (1) training PE-Miner on 70% non-packed PE files and 30% packed PE filesand testing on the remaining 70% packed PE files, (2) training PE-Miner onnon-packed PE files only and testing on packed PE files, (3) training PE-Mineron packed PE files only and then testing on non-packed PE files, and (4) testingPE-Miner on a “difficult” dataset that consists of packed benign and non-packedmalicious PE files. We assert that the scenarios (2) and (3) – although unrealis-tic – still provide valuable insight into the extent of bias, that PE-Miner mighthave, towards detection of packed/non-packed executables.

We want to emphasize an important point that there is no confusion about“ground truth” for packed executables in above-mentioned four scenarios becausea packer only detects a file as “packed” if it has its signature in its database. Theconfusion about “ground truth”, however, stems in the fact that a reasonableproportion of packed PE files could be misclassified as non-packed because of

Page 16: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

16 M. Zubair Shafiq, S. Momina Tabish, Fauzan Mirza, Muddassar Farooq

Table 10. An analysis of robustness of extracted features of PE-Miner (RFR) in different scenarios

Dataset VX Heavens Malfease

Malware Backdoor + Constructor DoS + Flooder Exploit + Worm Trojan Virus Average -Sniffer + Virtool Nuker Hacktool

Scenario 1: Detection of packed benign and malicious PE files

IBK 0.999 1.000 1.000 0.999 0.999 0.998 0.999 0.999 0.999 0.812J48 0.996 1.000 1.000 0.999 0.999 0.998 0.993 0.999 0.998 0.991

NB 0.971 0.988 0.963 0.955 0.996 0.980 0.978 0.987 0.977 0.934RIPPER 0.997 0.996 0.999 0.990 0.993 0.985 0.858 0.998 0.977 0.988

SMO 0.985 0.998 1.000 0.996 0.994 0.994 0.985 0.998 0.994 0.706

Scenario 2: Training using non-packed executables only and testing using packed executables

IBK 0.986 0.965 0.912 0.963 0.998 0.993 0.850 0.989 0.957 0.917J48 0.982 0.999 0.998 0.937 0.999 0.963 0.857 0.954 0.961 0.968

NB 0.927 0.899 0.842 0.809 0.966 0.911 0.857 0.965 0.897 0.780RIPPER 0.989 0.995 0.998 0.995 0.986 0.962 0.858 0.853 0.954 0.937

SMO 0.983 0.772 0.905 0.691 0.996 0.737 0.651 0.852 0.823 0.859

Scenario 3: Training using packed executables only and testing using non-packed executables

IBK 0.975 0.965 0.964 0.878 0.793 0.982 0.911 0.904 0.921 0.855J48 0.951 0.908 0.919 0.940 0.726 0.958 0.903 0.881 0.898 0.903

NB 0.685 0.965 0.668 0.633 0.689 0.979 0.688 0.688 0.749 0.789RIPPER 0.979 0.938 0.967 0.972 0.747 0.768 0.840 0.867 0.885 0.904

SMO 0.977 0.941 0.877 0.882 0.536 0.983 0.835 0.904 0.867 0.849

Scenario 4: Detection of packed benign and non-packed malicious PE files (“difficult” dataset)

IBK 0.999 1.000 1.000 0.999 0.998 0.998 0.998 0.994 0.998 0.992J48 0.997 0.986 0.999 0.999 0.999 0.999 0.989 0.993 0.995 0.996

NB 0.954 0.963 0.995 0.988 0.966 0.990 0.975 0.986 0.977 0.948RIPPER 0.998 0.984 0.998 0.993 0.986 0.999 0.992 0.996 0.993 0.948

SMO 0.989 0.996 1.000 0.997 0.996 0.997 0.984 0.992 0.994 0.945

false negative rate of PEiD. Note that the false negatives of PEiD, reported in[17], consist of two types: (1) packed PE files that are misclassified as non-packed,and (2) PE files that are unclassified. We have not included unclassified files inour dataset to remove the false negatives of the second type.

Scenario 1: Detection of packed benign and malicious PE files. Themotivation behind the first scenario is to test if PE-Miner can distinguish be-tween packed benign and packed malware, regardless of the type of packer. In or-der to ensure that our features are not influenced by the type of packing tool usedto encrypt PE files, our “packed-only” dataset contains PE files (both benignand malware) packed using a variety of packers like UPX, ASPack, Armadillo,PECompact, WWPack32, Virogen Crypt 0.75, UPS-Scrambler, PEBundle andPEPack etc. Moreover, the “packed-only” dataset contains on the average 44%and 56% packed malicious and benign PE files respectively. We train PE-Mineron 70% non-packed executables and 30% packed executables and then test it onthe remaining 70% packed executables. The results of our experiments for thisscenario are tabulated in Table 10. We can easily conclude that PE-Miner hasshown good resilience in terms of detecting accuracy once it is tested on packedbenign and malicious PE files from both datasets.

Scenarios 2 and 3: Detection of packed/non-packed malicious PEfiles. In the second experiment, we train PE-Miner on non-packed benign andmalicious PE files and test it on packed benign and malicious PE files. Note thatthis scenario is more challenging because the training dataset contains signifi-cantly less number of packed files compared with the first scenario. In the thirdexperiment, we train PE-Miner on packed benign and malicious PE files and teston non-packed benign and malicious PE files. The results of these experimentsare tabulated in Table 10. It is clear from Table 10 that the detection accuracy

Page 17: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

PE-Miner: Mining Structural Information to Detect Malicious Executables 17

of PE-Miner (RFR-J48) drops to 0.96, when it is trained on non-packed exe-cutables and tested on the packed executables. Likewise, the average detectionaccuracy of PE-Miner (RFR-J48) drops to 0.90 for the third scenario. Remem-ber once we train PE-Miner on “packed only” dataset, then it gets 0% exposureto non-packed files and this explains deterioration in the detection accuracy ofPE-Miner. We conclude that the detection accuracy of PE-Miner, even in theseunrealistic stress testing scenarios, gracefully degrades.

Scenario 4: Detection of packed benign and non-packed maliciousPE files. In [18], the authors report an interesting study about the ability ofdifferent schemes to detect packed/non-packed executables. They show that thedetection accuracy of KM approach degrades on a “difficult” dataset consistingof packed benign and non-packed malicious PE files. According to the authorsin [18], KM shows a bias towards detecting packed PE files as malware and non-packed PE files as benign. We also – in line with this strategy – tested PE-Mineron a “difficult” dataset created from both malware collections used in our study.The results are tabulated in Table 10. It is important to highlight that for theseexperiments PE-Miner is trained on the original datasets but is tested on the“difficult” versions of both datasets. One can conclude from the results in Table10 that PE-Miner does not show any bias towards detecting packed executablesas malicious and non-packed executables as benign.

Our experiments conclude that the extracted features are actually “robust”,and as a result, PE-Miner doest not show any significant bias towards detectionof packed/non-packed executables.

6.2 Reliability of PE-Miner

Now we test PE-Miner on a “crafty” malware dataset, especially designed to cir-cumvent detection by PE-Miner. We particularly focus our attention on the falsenegative rate (or miss detection rate)10 of PE-Miner when we replace featuresin malicious files with those of benign files. It can be argued that if adversariesexactly know our detection methodology, they might be able to design strategiesthat evade detection by PE-Miner. The examples of such strategies could beespecially crafted packing techniques, insertion of dummy resources, obfuscationof address pointers, and other information present in headers etc.

We have conducted an empirical study to analyze the robustness of PE-Miner to such evasive techniques. To this end, we have “crafted” malware filesin the datasets to contain benign-like features. Specifically, we have created seven“crafty” datasets in which for every malware file 5, 10, 30, 50, 100, 150 and 189random features – out of 189 features – are forged with the respective featuresfrom a randomly chosen benign file. We now analyze the false negative rate ofPE-Miner (RFR-J48) across these “crafty” datasets. The results tabulated inTable 11 highlight the robustness of PE-Miner to such crafty attacks. The falsenegative rate of PE-Miner stays below 1% when fifty features are simultaneously

10 The false negative rate is defined by the fraction of malicious files wrongly classifiedas benign.

Page 18: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

18 M. Zubair Shafiq, S. Momina Tabish, Fauzan Mirza, Muddassar Farooq

Table 11. False negative rate for detecting malicious executables with PE-Miner on the “crafty” datasets

Dataset VX Heavens Malfease

Malware Backdoor + Constructor DoS + Flooder Exploit + Worm Trojan Virus Average -Sniffer + Virtool Nuker Hacktool

# Forged Features False negative rate

0/189 0.001 0.000 0.000 0.000 0.004 0.000 0.004 0.007 0.002 0.0015/189 0.002 0.000 0.000 0.000 0.004 0.000 0.004 0.007 0.002 0.00110/189 0.002 0.000 0.000 0.000 0.004 0.011 0.004 0.014 0.004 0.00430/189 0.002 0.003 0.000 0.012 0.023 0.011 0.011 0.014 0.009 0.00450/189 0.002 0.003 0.000 0.012 0.023 0.016 0.011 0.014 0.010 0.004100/189 0.096 0.003 0.000 0.012 0.023 0.050 0.445 0.176 0.101 0.004150/189 0.658 0.003 0.000 0.583 0.795 0.611 0.558 0.221 0.429 0.426189/189 0.996 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.999 0.998

5 10 15 20 25 30 35 40 45 500

20

40

60

80

100

Number of forged features (/189)

Mal

war

e su

cces

sful

ly e

xecu

ted

(%)

Fig. 5. Execution analysis of crafted malware files

forged. For both datasets, the average false negative rate is approximately 5%even when 100 out of 189 features are forged. This shows that a large set offeatures, which cover structural information of almost all portions of a PE file,used by PE-Miner make it very difficult for an attacker to evade detection – evenwhen it manipulates majority of them at the same time.

It should be emphasized that simultaneous manipulation of all features of aPE malware file requires significant level of skill, in-depth knowledge about thestructure of a PE file, and detailed understanding of our detection framework. Ifan attacker tries to randomly forge, using brute-force, the structural features of aPE malware file with those of a benign PE file then he/she will inevitably end upcorrupting the executable image. Consequently, the file will not load successfullyinto memory. We have manually executed the “crafted” malicious executables.The objective is to understand that how many features a “crafty” attacker cansuccessfully forge without ending up corrupting the executable image. The resultsof our experiments are shown in Figure 5. This figure proves our hypothesisthat the probability of having valid PE files decreases exponentially with anincrease in the number of simultaneously forged features. In fact, the successfulexecution probability approaches to zero as the number of simultaneously forgedfeatures approaches to 50. Referring back to Table 11, the average false negativerate of PE-Miner is less than 1% when 50 features are simultaneously forged.Therefore, we argue that it is not a cinch for an attacker to alter malicious PEfiles to circumvent detection by PE-Miner. However, we accept that an attackercan evade the detection capability of PE-Miner if: (1) he/she knows the exact

Page 19: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

PE-Miner: Mining Structural Information to Detect Malicious Executables 19

details of our detection framework – including the detection rules, and (2) alsohas the “craft” to simultaneously manipulate more than 100 structural featureswithout corrupting the executable image.

7 Conclusion

In this paper we present, PE-Miner, a framework for detection of malicious PEfiles. PE-Miner leverages the structural information of PE files and the data min-ing algorithms to provide high detection accuracy with low processing overheads.Our implementation of PE-Miner completes a single-pass scan of all executablesin the dataset (more than 17 thousand) in less than one hour. Therefore it meetsall of our requirements mentioned in Section 2.

We believe that our PE-Miner framework can be ported to Unix and othernon-Windows operating systems. To this end, we have identified similar struc-tural features for the ELF file format in Unix and Unix-like operating systems.Our initial results are promising and show that PE-Miner framework is scalableacross different operating systems. This dimension of our work will be the sub-ject of forthcoming publications. Moreover, PE-Miner framework is also ideallysuited for detecting malicious PE files on resource constrained mobile phones(running mobile variants of Windows) because of its small processing overheads.Finally, we are also doing research to develop techniques to fully remove the biasof PE-Miner in detecting packed/non-packed executables [24].

Acknowledgments

This work is supported in part by the National ICT R&D Fund, Ministry ofInformation Technology, Government of Pakistan. The information, data, com-ments, and views detailed herein may not necessarily reflect the endorsementsof views of the National ICT R&D Fund.

We are thankful to Muhammad Umer for designing experiments to collectstatistics of anomalies observed in parsing malicious PE files. We also acknowl-edge Marcus A. Maloof and Jeremy Z. Kolter for continuous feedbacks regard-ing the implementation of byte sequence approach and the experimental setup.We thank Roberto Perdisci for providing implementation details of McBoost,sharing Malfease malware dataset, and the results of their custom developedunpacker. We also thank VX Heavens moderators for making a huge malwarecollection publicly available and sharing packing statistics of malware. We alsothank Guofei Gu and Syed Ali Khayam for providing useful feedback on aninitial draft of this paper.

Page 20: PE-Miner: Mining Structural Information to Detect ...homepage.divms.uiowa.edu/~mshafiq/files/raid09-zubair.pdf · PE-Miner: Mining Structural Information to Detect Malicious Executables

20 M. Zubair Shafiq, S. Momina Tabish, Fauzan Mirza, Muddassar Farooq

References

1. AVG Free Antivirus, available at http://free.avg.com/.2. S. Axelsson, “The base-rate fallacy and its implications for the difficulty of intrusion detection”,

ACM Conference on Computer and Communications Security (CCS), pp. 1-7, Singapore, 1999.3. J. Cheng, S.H.Y. Wong, H. Yang, S. Lu, “SmartSiren: virus detection and alert for smart-

phones”, International Conference on Mobile Systems, Applications and Services (MobiSys),pp. 258-271, USA, 2007.

4. DUMPBIN utility, Article ID 177429, Revision 4.0, Micorsoft Help and Support, 2005.5. T. Fawcett, “ROC Graphs: Notes and Practical Considerations for Researchers”, TR HPL-2003-

4, HP Labs, USA, 2004.6. F-Secure Corporation, “F-Secure Reports Amount of Malware Grew by 100% during 2007”,

Press release, 2007.7. F-Secure Virus Description Database, available at http://www.f-secure.com/v-descs/.8. hash map, Visual C++ Standard Library, available at http://msdn.microsoft.com/en-us/library/

6x7w9f6z.aspx.9. N. Hnatiw, T. Robinson, C. Sheehan, N. Suan, “PIMP MY PE: Parsing Malicious and Mal-

formed Executables”, Virus Bulletin Conference (VB), Austria, 2007.10. K. Kendall, C. McMillan, “Practical Malware Analysis”, Black Hat Conference, USA, 2007.11. J.Z. Kolter, M.A. Maloof, “Learning to detect malicious executables in the wild”, ACM In-

ternational Conference on Knowledge Discovery and Data Mining (KDD), pp. 470-478, USA,2004.

12. Microsoft Portable Executable and Common Object File Format Specification, Windows Hard-ware Developer Central, Updated March 2008, available at http://www.microsoft.com/whdc/system/platform/firmware/PECOFF.mspx.

13. J. Munro, “Antivirus Research and Detection Techniques”, Antivirus Research and Detec-tion Techniques, ExtremeTech, 2002, available at http://www.extremetech.com/article2/0,2845,367051,00.asp.

14. Panda Antivirus, available at http://www.pandasecurity.com/.15. PE file format, Webster Technical Documentation, available at http://webster.cs.ucr.edu/Page

TechDocs/pe.txt.16. PEiD, available at http://www.peid.info/.17. R. Perdisci, A. Lanzi, W. Lee, “Classification of Packed Executables for Accurate Computer

Virus Detection”, Elsevier Pattern Recognition Letters, 29(14), pp. 1941-1946, 2008.18. R. Perdisci, A. Lanzi, W. Lee, “McBoost: Boosting Scalability in Malware Collection and Analy-

sis Using Statistical Classification of Executables”, Annual Computer Security ApplicationsConference (ACSAC), pp. 301-310, IEEE Press, USA, 2008.

19. Protection ID - the ultimate Protection Scanner, available at http://pid.gamecopyworld.com/.20. M. Pietrek, “An In-Depth Look into the Win32 Portable Executable File Format, Part 2”,

MSDN Magazine, March, 2002.21. Project Malfease, available at http://malfease.oarci.net/.22. M.G. Schultz, E. Eskin, E. Zadok, S.J. Stolfo, “Data mining methods for detection of new

malicious executables”, IEEE Symposium on Security and Privacy (S&P), pp. 38-49, USA,2001.

23. M.Z. Shafiq, S.M. Tabish, F. Mirza, M. Farooq, “A Framework for Efficient Mining of Struc-tural Information to Detect Zero-Day Malicious Portable Executables”, Technical Report, TR-nexGINRC-2009-21, January, 2009, available at http://www.nexginrc.org/papers/tr21-zubair.pdf

24. M.Z. Shafiq, S.M. Tabish, M. Farooq, “PE-Probe: Leveraging Packer Detection and Struc-tural Information to Detect Malicious Portable Executables”, Virus Bulletin Conference (VB),Switzerland, 2009.

25. Symantec Internet Security Threat Reports I-XI (Jan 2002-Jan 2008).26. F. Veldman, “Heuristic Anti-Virus Technology”, International Virus Bulletin Conference, pp.

67-76, USA, 1993.27. VX Heavens Virus Collection, VX Heavens website, available at http://vx.netlux.org.28. S.D Walter, “The partial area under the summary ROC curve”, Statistics in Medicine, 24(13),

pp. 2025-2040, 2005.29. I.H. Witten, E. Frank, “Data mining: Practical machine learning tools and techniques”, Morgan

Kaufmann, 2nd edition, USA, 2005.


Recommended