Browser Guard a Behavior-Based Solution To

Date post: 16-Jul-2015
Author: shanoop-pattanath

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 29, NO. 7, AUGUST 2011

1461

BrowserGuard: A Behavior-Based Solution to Drive-by-Download Attacks

Fu-Hau Hsu, Chang-Kuo Tso, Yi-Chun Yeh, Wei-Jen Wang, and Li-Han Chen

Abstract: Along with an increasing user population of various web applications, browser-based drive-by-download attacks have become one of the most common security threats to the cyber community. A user using a vulnerable browser or browser plug-ins may become a victim of a drive-by-download attack right after visiting a malicious web site. The end result of such attacks is that an attacker can download and execute any code on the victim's host. This paper proposes a runtime, behavior-based solution, BrowserGuard, to protect a browser against drive-by-download attacks. BrowserGuard records the download scenario of every file that is loaded into a host through a browser. Then, based on the download scenario, BrowserGuard blocks the execution of any file that is loaded into a host without the consent of the browser user. Due to its behavior-based detection nature, BrowserGuard does not need to analyze the source file of any web page or the run-time states of any script code, such as JavaScript. BrowserGuard also does not need to maintain any exploit code samples and does not need to query the reputation value of any web site. We utilize the standard BHO mechanism of Windows to implement BrowserGuard on IE 7.0. Experimental results show that BrowserGuard has low performance overhead (less than 2.5%) and no false positives or false negatives for the web pages used in our experiments.

Index Terms: drive-by-download attack, heap spray, malware, Web browser, intrusion detection, system security.

I. INTRODUCTION

In this paper we propose a behavior-based solution, called BrowserGuard, against drive-by-download attacks, which are among the most dangerous security threats nowadays. A drive-by-download attack utilizes the vulnerabilities in a browser or browser plug-ins to download and execute attack code in the address space of the browser without the consent of the browser user. A drive-by-download attack is launched through malicious web sites. When a user of a vulnerable browser visits a malicious web site, the user's host will be compromised immediately. According to [1], more than 1.3% of the query results provided by Google point to a web page that performs drive-by-download attacks. Besides, Frei et al. [2] observed that only 60% of Google users use the latest version of their browsers. The above research results show that there are many drive-by-download traps in the Internet to prey on hosts that use vulnerable browsers or browser plug-ins. Due to the potent destructive power of drive-by-download attacks, many promising solutions have been proposed. However, many of them suffer from non-trivial false positives, false negatives, or performance overhead.

This paper proposes a runtime, behavior-based solution, BrowserGuard, to protect a browser against drive-by-download attacks. BrowserGuard records the download scenario of every file that is loaded into a host through a browser. Then, based on the download scenario, BrowserGuard blocks the execution of any file that is loaded into a host without the consent of the browser user. Due to its behavior-based detection nature, BrowserGuard does not need to analyze the source file of any web page or the run-time states of any script code, such as JavaScript. BrowserGuard also does not need to maintain any exploit code samples and does not need to query the reputation value of any web site. We utilize the standard BHO mechanism (Subsection II-B4) of Windows to implement BrowserGuard on IE 7.0, which is the most popular browser nowadays [3] and is the major target of many drive-by-download attacks [4]. Experimental results show that BrowserGuard has low performance overhead (less than 2.5%) and negligible false positives and false negatives.

The remainder of this paper is organized as follows. Section II discusses the attack model of typical drive-by-download attacks and the background knowledge of BrowserGuard. Section III illustrates the mechanism and implementation details of BrowserGuard. Section IV includes our effectiveness and performance evaluation. Section V discusses other related research on this security problem. Section VI concludes this paper.

Manuscript received 1 August 2010; revised 4 January and 21 February 2011. C.-K. Tso is with the Department of Computer Science and Information Engineering, National Central University, Jhongli City, Taoyuan County, 32001 ROC (e-mail: [email protected]). F.-H. Hsu, Y.-C. Yeh, W.-J. Wang and L.-H. Chen are with National Central University. Digital Object Identifier 10.1109/JSAC.2011.110811.

II. BACKGROUND

In this section we discuss the details of drive-by-download attacks, the APIs used by IE 7.0 to download a file, BHO, and API hooking.

A. Drive-by-Download Attacks

A drive-by-download attack is launched through a web page with crafted malicious content.
The web server that hosts the malicious web page may be owned by an attacker, may have been compromised by an attacker, or may be a benign host that allows other parties to place their content, such as an advertisement, in the web pages of the host. To accomplish a drive-by-download attack, a Malware Bootstrap Function (MBF) must first be injected into the address space of the attacked browser. Then the execution flow must be transferred to the MBF through some vulnerability in the browser or in a browser plug-in. In turn, the MBF will download more malware into the compromised host and execute the malware. An MBF can be injected into the attacked browser either by a

0733-8716/11/$25.00 © 2011 IEEE


JavaScript program or by a long string in some HTML tags in a web page. The vulnerabilities utilized to transfer the execution flow of a browser to an MBF can be divided into the following three types.

C1: misused APIs [5]–[8]
C2: memory corruption errors [9]–[13]
C3: initialization errors [14]

A Type C1 vulnerability is usually created by a browser plug-in, such as an ActiveX control, which erroneously exports a flow-control function to its users. The flow-control function allows its user to transfer the execution of a program to any location specified by the user. A Type C2 vulnerability is usually generated by bugs in browser code or browser plug-ins. The most famous kind is the buffer overflow bug [9]–[12]. Even though many security mechanisms, such as ASLR, DEP, and GS, have been developed to solve the memory corruption problem, attackers continue designing new approaches, such as heap sprays [15], to invalidate the protection. A Type C3 vulnerability is caused by exception conditions that a JavaScript engine cannot handle correctly.

B. IE Components Related to File Download and File Execution

BrowserGuard blocks drive-by-download attacks based on the download and execution scenario of a file. Three components are involved in the file download operation and the file execution operation in IE: the file download component, the file execution component, and the event component. The first two components consist of Application Programming Interfaces (APIs) that are responsible for file download and file execution. The third component consists of various events of a browser. The following subsections discuss these components in detail.

1) File Download Component: An IE browser legitimately downloads a file to a local host for the following four reasons.
First, when browsing a web page, in order to display the web page in the Internet Explorer (IE) browser, IE needs to download all the objects described in the web page, including the source code of the web page, script files, Cascading Style Sheets (CSS), and multimedia files, to local storage. Second, when a browser user clicks a hyperlink to navigate to another web page, IE needs to download the HTML file. Third, when a user clicks the download button in a download dialogue box to download a file, IE downloads the file. (The dialogue box pops up when a user clicks a URL that points to a file that the browser cannot display in its window.) Fourth, when a user clicks the hyperlink of an ASP/PHP/JSP file which creates a new file, or when a user puts the cursor over a hyperlink and clicks the right mouse button to open a context menu and download a file, IE downloads the related file. File download caused by the above four reasons is accomplished by the file download component. Files downloaded to a host due to one of the above reasons are called benign files. Files downloaded to a host through drive-by-download attacks (Subsection II-A) are malware.

Inside the file download component, IE follows the following steps to download a file. First, Internet Explorer calls API InternetConnect to open a connection and receive a handle. Second, using this internet handle, Internet Explorer calls API HttpOpenRequest to assign a name to an object (file). Third, Internet Explorer calls API InternetReadFile to download the file. Finally, Internet Explorer calls API WriteFile to save the file into the Temporary Internet Files directory.

Besides the above execution path, IE uses another execution path to download a file. This execution path is used when an IE user manually downloads a file by clicking the right mouse button or by clicking a hyperlink of a server-side page which creates a new file. When this happens, IE first calls API DoFileDownload to open a download window. Next, it calls APIs InternetConnect, HttpOpenRequest, InternetReadFile, and WriteFile one by one. Finally, IE uses API WriteFile to save the downloaded file in the directory specified by the user.

2) File Execution Component: IE calls API CreateProcessW to execute an executable file. API CreateProcessW in turn calls API CreateProcessInternalW to load the image of the file. After the above operations, IE creates a new child process.

3) Event Component: IE provides various events to indicate the occurrence of different activities related to itself. Event BeforeNavigate2 is one of them. When an object, such as a window element or a frameset element in the DOM architecture, is about to be browsed, BeforeNavigate2 is issued to indicate this activity. The event component of BrowserGuard allows BrowserGuard to decide more precisely under what situation a file is downloaded.

4) Browser Helper Object (BHO) and API Hooking: A Microsoft BHO [16] is a DLL module that is loaded into the address space of an IE browser, called the host browser of the BHO, when the browser starts up. The BHO stays in the address space of the host browser until the browser terminates.
Because a BHO is executed in the same address space as its host browser, the BHO can perform any operation that the host browser is allowed to do. The major component of BrowserGuard is implemented in a BHO, called BrowserGuard-BHO. API hooking [17] is a technique that allows a programmer to intercept function calls, messages, or events passed between software functions. BrowserGuard uses Detours [18] to implement API hooking. Detours replaces the first few instructions of a target function with an unconditional jump to a detour function provided by a user. The detour function then transfers the execution flow of a process back to the original target function.

III. IMPLEMENTATION

This section describes the design principle, design goals, and implementation details of BrowserGuard. Following the file download steps of a browser, BrowserGuard sets several check points in the browser and the Windows kernel to detect covert downloads, and blocks the execution of downloaded malware at runtime.
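The detour idea described in Subsection II-B4 can be illustrated with a minimal Python sketch. It is only an analogy under stated assumptions: Detours rewrites the first machine instructions of a native function, whereas this sketch swaps a function object held in a dictionary. The names install_detour, read_file, and api are hypothetical and appear neither in the paper nor in Detours.

```python
import functools


def install_detour(namespace, name, check):
    """Replace namespace[name] with a wrapper that runs `check` first.

    Analogy only: the saved `target` plays the role of the original
    function that a detour eventually jumps back to.
    """
    target = namespace[name]  # keep a reference to the original function

    @functools.wraps(target)
    def detour(*args, **kwargs):
        check(*args, **kwargs)           # the inserted check point
        return target(*args, **kwargs)   # hand control back to the target

    namespace[name] = detour
    return target


# Hypothetical stand-in for a download API.
def read_file(url):
    return f"contents of {url}"


seen = []                      # URLs observed by the check point
api = {"read_file": read_file}
install_detour(api, "read_file", lambda url: seen.append(url))
result = api["read_file"]("http://example.com/a.html")
```

Every call made through api["read_file"] now passes the check point before the original function runs, which is the property BrowserGuard relies on for its hooked APIs.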

HSU et al.: BROWSERGUARD: A BEHAVIOR-BASED SOLUTION TO DRIVE-BY-DOWNLOAD ATTACKS


[Figure 1 labels: IE with BrowserGuard-BHO and the list server (white-list, blacklist) in user space; BrowserGuard-Kernel in kernel space; a named pipe connecting BrowserGuard-BHO and the list server.]

Fig. 1. Structure, major components, and major data structures of BrowserGuard.

A. Structure of BrowserGuard

As Fig. 1 shows, BrowserGuard consists of a BrowserGuard-BHO in every IE process, a BrowserGuard-Kernel in the kernel space, and a list server process. Each host has only one list server process, but the host may have several browsers executing simultaneously; hence, multiple BrowserGuard-BHOs may exist in a host at the same time. A BrowserGuard-BHO communicates with the list server process through a named pipe. Multiple BrowserGuard-BHOs can communicate with the list server process simultaneously. The list server process contains two lists, a white-list and a blacklist. The white-list records the URLs of benign files (Subsection II-B1) and the hash values of benign executable files. The blacklist records the hash values of detected malicious files.

Downloading benign files to a host due to the first three reasons discussed in Subsection II-B1 triggers the system to issue event BeforeNavigate2, which in turn triggers the execution of the BrowserGuard-BHO function before_navigate to record the URLs from which benign files are downloaded (Subsection II-B1). Besides, BrowserGuard also utilizes BrowserGuard-BHO to hook detour functions (Subsection II-B4) to functions in the file download component and functions in the file execution component. The hooked target functions in the file download component are DoFileDownload, InternetReadFile, and WriteFile. The hooked function in the file execution component is CreateProcessInternalW.

BrowserGuard-Kernel is the kernel component of BrowserGuard. BrowserGuard-Kernel performs the following two tasks to prevent the execution of malware and illegal modifications of the white-list and blacklist. First, BrowserGuard-Kernel ensures that the execution of a program is issued by CreateProcessInternalW, which has been hooked by BrowserGuard. Second, BrowserGuard-Kernel denies a request to modify the white-list if the request is not issued through the code in function before_navigate or DoFileDownload of BrowserGuard.
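The role of the list server can be pictured with a small in-memory Python sketch. It rests on several assumptions: the class ListServer and its method names are hypothetical, the named pipe is replaced by direct method calls, and the caller label is a simplified stand-in for the origin check that BrowserGuard-Kernel performs on white-list updates.

```python
class ListServer:
    """Hypothetical in-memory model of the per-host list server."""

    # Only these two check points may add URLs to the white-list.
    ALLOWED_URL_WRITERS = {"before_navigate", "DoFileDownload"}

    def __init__(self):
        self.whitelist_urls = set()     # URLs of benign files
        self.whitelist_hashes = set()   # hashes of benign executables
        self.blacklist_hashes = set()   # hashes of detected malware

    def add_url(self, caller, url):
        """Record a URL if the request comes from an allowed check point.

        `caller` is an assumption of this sketch; the real system infers
        the origin of a request inside the kernel instead.
        """
        if caller not in self.ALLOWED_URL_WRITERS:
            return False                # request denied
        self.whitelist_urls.add(url)
        return True

    def url_is_benign(self, url):
        return url in self.whitelist_urls


server = ListServer()
accepted = server.add_url("before_navigate", "http://example.com/index.html")
rejected = server.add_url("shellcode", "http://evil.example/payload.exe")
```

A single server object shared by several callers mirrors the one-list-server-per-host arrangement, with each BrowserGuard-BHO acting as a client.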
B. Workflow of BrowserGuard

BrowserGuard blocks drive-by-download attacks by denying the execution of malware (Subsection II-B1). BrowserGuard provides its protection to a host through a two-phase mechanism and a kernel component. In the first phase, namely the filtration phase, BrowserGuard distinguishes malicious files from benign ones based on the situations under which the files are downloaded to a local host. In the second phase, namely the prohibition phase, BrowserGuard denies requests to execute malicious files. The kernel component blocks attempts to bypass BrowserGuard. This and the next subsection describe these techniques.

1) Filtration Phase: To distinguish malicious files from benign ones, BrowserGuard needs to know the situation under which a file is downloaded to a local host. With this information, BrowserGuard can deduce whether a downloaded file is benign or malicious. In order to collect the required information, BrowserGuard installs several check points to monitor the behavior of a browser. The check point before_navigate is a BrowserGuard-BHO function that is bound to event BeforeNavigate2, so that the function is invoked whenever a BeforeNavigate2 event is issued. When before_navigate is called, it records the URL of the related file in the white-list of the list server process. As discussed in the previous subsection, a benign file download that is not triggered by clicking the hyperlink of an ASP/PHP/JSP file which creates a new file, or by clicking the right mouse button, always results in the BeforeNavigate2 event. Even though clicking the right mouse button does not trigger event BeforeNavigate2, this file download request makes IE invoke DoFileDownload to perform the download. DoFileDownload also receives the URL of the file that is going to be downloaded to the local host.

Figure 2 shows the major functions, data structures, and operations involved in the filtration phase. While a user is surfing the WWW, a browser needs to download various files. All these files are placed in a directory called Temporary Internet Files, and they cannot be directly executed without the permission of the browser user. On a BrowserGuard-protected browser, a normal file download triggers the execution of DoFileDownload or before_navigate. Both functions connect to the list server process of a host to record URLs in the white-list of the process. These are the URLs of the files that are going to be downloaded to the host. The real download is performed by API InternetReadFile, after which API WriteFile is called to store the downloaded file. The detour function of InternetReadFile checks whether the URL of the file that this function is asked to download is within the white-list. If the URL is within the white-list, the file is downloaded as usual. If the URL is not within the white-list and the first two bytes of the file are MZ, then after the file is downloaded to the local host, BrowserGuard calculates the hash value of the file and adds the hash value to the blacklist. MZ is the first two bytes of a PE format file, which is the most common format of Windows executable files. Instead of using the filename extension to find executable files, BrowserGuard uses MZ to find them. This prevents an attacker from first naming an executable file with a non-executable filename extension and then changing the extension back to an executable one before executing the file. The hash value is calculated based on the first 512 bytes of a file. BrowserGuard uses the hash value of a file to represent

Fig. 2. Functions, data structures, and operations involved in the filtration phase. [Figure labels: Internet Explorer file download (normal download; download via context menu); event BeforeNavigate2; InternetConnect, HttpOpenRequest, DoFileDownload, InternetReadFile, WriteFile; named pipe to the list server; add to white-list (URL or hash value); check white-list (URL or hash value); add to blacklist; possible route to save a file.]

the file in the blacklist. Hence, BrowserGuard does not need to compare every byte of two files to determine whether they have the same content. This accelerates the prohibition phase and saves storage space.

Besides adding the hash value of a malicious file to the blacklist, the detour function of InternetReadFile also adds the hash value of a benign executable file to the white-list. The hash values in the white-list are then used by the detour function of WriteFile. WriteFile is used to write a file into storage, such as a disk. Thus, when an executable file is written to a disk by WriteFile, its detour function queries the white-list to check whether the hash value of this file was logged by InternetReadFile. If the hash value of the file is not in the white-list, the file must have been transformed from a non-executable file after that file was downloaded to the local host. For example, an attacker may first disguise a malware file as an image file. After the disguised file is downloaded to an attacked host by an MBF, the MBF transforms the file into an executable file and executes it. Because benign files are not supposed to be handled in this way, executable files created on a disk using the above methods are deemed non-benign (i.e., malicious) files. The detour function of WriteFile saves the hash values of these files into the blacklist.

2) Prohibition Phase: Inside an IE browser, CreateProcessInternalW is used to execute an executable file stored on a disk. BrowserGuard hooks this API to ensure that the API will not execute malware. BrowserGuard first calculates the hash value of the executable file. Then BrowserGuard checks whether the white-list and the blacklist contain this hash value. If the blacklist does not contain the hash value but the white-list does, API CreateProcessInternalW runs the executable file. Otherwise, it blocks the execution of the file.

C. Prevention of Checkpoint-Bypassing

The checkpoints installed by BrowserGuard are the critical instructions used to detect downloaded malware and prevent its execution. If an attacker can bypass these checkpoints, she or he can successfully accomplish a drive-by-download attack on a BrowserGuard-protected browser. BrowserGuard utilizes several approaches to ensure that, if the download and execution of a program do not follow the normal path and do not pass the predefined checkpoints, BrowserGuard can detect this and block the execution.

1) Protecting the Checkpoints in DoFileDownload and before_navigate: This subsection describes the approaches that BrowserGuard uses to prevent attackers from adding URLs to the white-list by directly calling DoFileDownload and before_navigate from an MBF or by executing copied versions of these functions in an MBF. DoFileDownload and before_navigate connect to the list server process in a host to record URLs in the white-list of the process. Inside the kernel, BrowserGuard-Kernel ensures that only instructions inside functions DoFileDownload and before_navigate can add a URL to the white-list. In a BrowserGuard-protected browser, the URLs are transmitted from an IE process to a list server process, so kernel functions must be used to accomplish this work. Hence, by recording the return addresses used to return to DoFileDownload or before_navigate from the kernel after the kernel transmits a URL from an IE process to the white-list of a list server process, BrowserGuard-Kernel can infer the addresses of the legal user-space instructions that can initiate the transmission. Thus, BrowserGuard can deny any request to transmit a URL to the white-list of a list server process if the request is not issued through instructions inside DoFileDownload and before_navigate.
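The filtration and prohibition phases of Subsection III-B can be summarized in a short Python sketch. It is a simplified model, not the authors' implementation: the function names are hypothetical, plain sets stand in for the list server, and SHA-256 is an assumption, since the paper only states that the hash is computed over the first 512 bytes of a file.

```python
import hashlib

HASH_PREFIX_LEN = 512  # the paper hashes only the first 512 bytes


def is_pe(data: bytes) -> bool:
    """A PE executable starts with the two bytes 'MZ'."""
    return data[:2] == b"MZ"


def file_hash(data: bytes) -> str:
    # SHA-256 is an assumption; the paper does not name the hash function.
    return hashlib.sha256(data[:HASH_PREFIX_LEN]).hexdigest()


white_urls, white_hashes, black_hashes = set(), set(), set()


def on_internet_read_file(url: str, data: bytes) -> None:
    """Filtration check point at download time."""
    if not is_pe(data):
        return
    if url in white_urls:
        white_hashes.add(file_hash(data))   # benign executable
    else:
        black_hashes.add(file_hash(data))   # covertly downloaded executable


def on_write_file(data: bytes) -> None:
    """Filtration check point at write time: catches transformed files."""
    if is_pe(data) and file_hash(data) not in white_hashes:
        black_hashes.add(file_hash(data))


def may_execute(data: bytes) -> bool:
    """Prohibition phase: run only white-listed, non-blacklisted files."""
    h = file_hash(data)
    return h in white_hashes and h not in black_hashes


# Benign download: the URL was recorded by before_navigate / DoFileDownload.
white_urls.add("http://example.com/setup.exe")
benign = b"MZ" + b"benign" * 100
on_internet_read_file("http://example.com/setup.exe", benign)
on_write_file(benign)

# Drive-by download: the URL never reached the white-list.
malware = b"MZ" + b"\x90" * 600
on_internet_read_file("http://evil.example/m.exe", malware)

# Disguised download: JPEG header first, rewritten as an executable later.
disguised = b"\xff\xd8\xff" + b"\x00" * 600
on_internet_read_file("http://evil.example/pic.jpg", disguised)
restored = b"MZ" + b"\xcc" * 600
on_write_file(restored)
```

In this model only the benign file passes may_execute; the covert download is rejected at download time and the disguised one at write time.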
Besides, even if an MBF directly calls DoFileDownload or before_navigate to add URLs of malware to the white-list, BrowserGuard can still detect the behavior for the following reason. When this behavior occurs, either the execution flow needs to return to the MBF, because the MBF has to download and execute the malware, or the execution flow finally needs to transfer to CreateProcessInternalW, which is the only legal API inside a BrowserGuard-protected browser to create a new process (Subsection III-C2). On a Windows system, due to DEP, the stack segment and data segments of a process are not executable. Hence, an MBF can only be stored in the heap segment of a process. As a result, by checking whether the stack return addresses contain a heap address or the address of CreateProcessInternalW when DoFileDownload or before_navigate is executed, BrowserGuard can prevent these two functions from being directly called by an MBF.

2) Protecting the Checkpoint in CreateProcessInternalW: To prevent an MBF from bypassing the check point inside the detour function hooked to CreateProcessInternalW by directly jumping to the sixth byte of this function, BrowserGuard utilizes a new software interrupt, BGSetFlag. BGSetFlag is an element of BrowserGuard-Kernel. Because BGSetFlag is a software interrupt, after a thread invokes it in the user address space, the thread switches from user mode into kernel mode and the system will store the address (a return address) of the


instruction right after the calling instruction in the kernel mode stack of the thread. For a process, the first execution of BGSetFlag results in the recording of the above return address, called the anchor address of the process, and the setting of the BG_CHECKED flag. Every subsequent execution of BGSetFlag first compares the return address stored in the kernel mode stack of the process with the anchor address. If these two addresses are the same, the BG_CHECKED flag is set; otherwise, the flag is cleared. Inside the kernel address space, the native API in charge of the creation of a new process checks the BG_CHECKED flag first. Only when the flag is set does the API create a new process and then clear the flag. Otherwise, it aborts the process creation request. BrowserGuard adds an invocation of BGSetFlag to the detour function of CreateProcessInternalW to make sure that the detour function is executed whenever a process creates a new process. Besides BrowserGuard-Kernel, the detour function of CreateProcessInternalW is another place where BrowserGuard blocks the execution of malware downloaded by a drive-by-download attack. The same approach is also used to prevent attackers from bypassing the detour function of InternetReadFile.

IV. EVALUATION

In this section, we discuss the results of various experiments that were made to evaluate the effectiveness and efficiency of BrowserGuard. What follows are the specifications of the hosts, operating systems, and browsers used in our experiments. All browsers used in our experiments are IE 7.0 and are executed in a guest machine. The guest machine is executed on a host machine through VMware. The web server used in our tests is installed in a remote Linux machine.

Local client machine:
    Guest machine: OS: Windows XP SP2 (32-bit), Browser: IE 7.0
    VMware 7.0.1 (Memory: 1024 MB)
    Host machine: OS: Windows 7 (32-bit), CPU: Intel Core2Duo P8600 2.4 GHz, Memory: 3 GB
Remote server machine:
    OS: Ubuntu 10.04, Web Server: Apache 2

A. Effectiveness

To evaluate the effectiveness of BrowserGuard, we made various tests to measure the false positives and false negatives produced by BrowserGuard. To test the false positives of BrowserGuard, we chose the 500 top-ranked websites from Alexa [19] and visited them using an IE browser with BrowserGuard. Because surfing these websites did not cause BrowserGuard to issue any drive-by-download attack alert and these websites were not reported by Google as malicious websites, the number of false positives of BrowserGuard for these websites is zero.

In order to evaluate the false negatives of BrowserGuard, we used the Metasploit framework [20] to generate 10 malicious web pages based on the 9 exploits for IE 7.0 listed in Table I. We installed these 10 malicious web pages in a remote server machine. All these web pages contain both shellcode and exploit code used to launch drive-by-download attacks; hence, these web pages compromised our test machines immediately after we used an ordinary IE 7.0 to view them. However, when we used a browser with BrowserGuard to visit these pages, even though the related malware was still downloaded to the local host, all of it was blocked when the shellcode tried to execute it. Hence, the number of false negatives of BrowserGuard for these malicious web pages is zero.

Among the 10 malicious web pages, number 5 is a special one because it first stores the malware in a file with a JPEG image file header. After the disguised file is downloaded to the local host, the file is transformed back into an executable file. BrowserGuard still thwarted the attack of web page number 5. Hence, no matter how attackers encrypt the malware, BrowserGuard can still detect the malware before it is executed.

Table II compares the detection accuracy of BrowserGuard with that of other similar work [21]-[24]. Some work does not provide complete data; hence, in Table II we use N/A to represent the unavailable data. IMC [24] is a signature-based solution; hence, if the database does not contain the signatures for a vulnerability, its false negative rate increases to 48%. Besides, Ding et al. [25] recently proposed an approach to bypass Nozzle's detection. Hence, the comparison shows that BrowserGuard is an accurate solution.

B. Performance Overhead

Static code analysis shows that the performance overhead imposed by BrowserGuard is mainly caused by the following operations. First, when an object is about to be navigated to, BrowserGuard performs some memory accesses to add the URL of the object to the white-list. Second, BrowserGuard needs to read the header of a downloaded file to check whether the first two bytes of the file are MZ. Third, BrowserGuard needs to execute extra code to handle event reception and API hooking.
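The second operation, the header check, can be illustrated with a short sketch. This is not BrowserGuard's code, which is implemented as a BHO on IE 7.0; the function name looks_like_executable and the sample byte strings are hypothetical, and Python is used only for brevity. The sketch assumes the downloaded content is available as a readable binary stream.

```python
import io

# "MZ" is the two-byte signature at the start of a DOS/PE executable header.
MZ_MAGIC = b"MZ"


def looks_like_executable(stream) -> bool:
    """Return True if the first two bytes read from `stream` are 'MZ'.

    `stream` is any binary file-like object positioned at the start
    of the downloaded content (an assumption of this sketch).
    """
    header = stream.read(2)
    return header == MZ_MAGIC


if __name__ == "__main__":
    # Hypothetical sample contents: an MZ-prefixed blob and a blob that
    # starts with the JPEG start-of-image marker (FF D8 FF).
    exe_like = io.BytesIO(b"MZ\x90\x00" + b"\x00" * 60)
    jpeg_like = io.BytesIO(b"\xff\xd8\xff\xe0" + b"\x00" * 60)
    print(looks_like_executable(exe_like))
    print(looks_like_executable(jpeg_like))
```

Reading only two bytes keeps the cost of the check independent of the file size, which is consistent with the observation that the extra time introduced by BrowserGuard is almost fixed.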
To evaluate the performance overhead introduced by BrowserGuard, we measured the time period between the time when a browser issues a request to view a web page and the time when the browser finishes downloading the web page, with and without BrowserGuard. We chose 5 web pages from Alexa Top Sites to make our measurements. For each web site, we measured the time to finish the above operations 2000 times to collect the statistics. The extra time introduced by BrowserGuard is almost fixed; hence, when a user visits a web page, the relative performance overhead imposed by BrowserGuard is determined by the original time needed to download the web page. The original download time is affected by many factors, such as the workload of a web site, the size of a web page, the computation power of a web site, the transmission time of a web page, and so on. To reduce the influence of the above factors, we mirrored all tested web pages in a separate local server. Hence, the whole testing environment is built in a local network so that the measured data are minimally affected by network transmission speed. Overall, the worst-case performance overhead in our tests is 2.5%. Table III lists the results. Figure 3 compares the performance overhead of BrowserGuard with that of other similar work. Some work does not provide complete data; hence, in Fig. 3 we use N/A to mean the unavailable data. Figure 3 shows that BrowserGuard has low performance overhead.

TABLE I
RESULTS OF FALSE NEGATIVE TESTS. IN THIS TABLE, MSB MEANS MICROSOFT SECURITY BULLETIN

Number  MSB       CVE-ID     Description                                         Result
1       MS06-014  2006-0003  RDS.Dataspace ActiveX Control Vulnerability         Blocked
2       MS06-055  2006-4868  VML Fill Method Buffer Overflow                     Blocked
3       MS06-067  2006-4777  Daxctle.ocx Keyframe Function Heap Overflow         Blocked
4       MS07-017  2007-0038  ANI LoadAniIcon Function Buffer Overflow            Blocked
5       MS07-017  2007-0038  The Malicious Executable is encoded in a jpg file.  Blocked
6       MS08-078  2008-4844  Data Binding Memory Corruption                      Blocked
7       MS09-002  2009-0075  CFunctionPointer Uninitialized Memory Corruption    Blocked
8       MS09-072  2009-3672  getElementsByTagName Memory Corruption              Blocked
9       MS10-002  2010-0249  HTML Object Memory Corruption                       Blocked
10      MS10-018  2010-0806  DHTML Behaviors Use-after-free                      Blocked

TABLE II
COMPARISONS OF THE DETECTION ACCURACY BETWEEN BROWSERGUARD AND OTHER SIMILAR WORK

Solution                False Positive Rate  False Negative Rate
BrowserGuard            0%                   0%
Nozzle (50% threshold)  0%                   0%
BuBBle                  N/A                  0%
JSAND                   10.9%                0.2%
IMC                     0%                   48% (0%)

TABLE III
PERFORMANCE OVERHEAD INTRODUCED BY BROWSERGUARD

Mirrored web site  BrowserGuard avg. (sec)  W/O BrowserGuard avg. (sec)  Overhead
news.yahoo.com     4.933                    4.891                        0.9%
w3.org             0.867                    0.857                        1.2%
youtube.com        1.227                    1.211                        1.3%
imdb.com           1.520                    1.483                        2.5%
facebook.com       1.538                    1.527                        0.7%

[Figure 3: bar chart. Vertical axis: Overhead Factors of Page Load Time (%). Bars: BrowserGuard, Nozzle (5% sample rate), Nozzle (25% sample rate), BuBBle, JSAND, IMC.]
Fig. 3. Comparisons of performance overhead between BrowserGuard and other similar work

V. RELATED WORK

In this section, we discuss related work in the literature. Many drive-by-download attacks are triggered by vulnerable ActiveX controls. Microsoft uses Kill-Bit [26] to mitigate this problem. However, Kill-Bit does not patch any executable. Instead, it just blocks the use of certain known vulnerable ActiveX controls. If a particular ActiveX control is marked as unsafe through Kill-Bit, the control will never be invoked by any application. However, attackers can utilize non-ActiveX-control vulnerabilities to launch a drive-by-download attack, and not all vulnerabilities of all ActiveX controls have been disclosed.

Many drive-by-download attacks use heap sprays [15] to accomplish the attacks. Nozzle [21] detects heap spray attacks based on the observation that shellcode used in a heap spray attack is usually prepended with a long NOP sled. Experimental results showed that Nozzle has a small number of false positives and false negatives. However, if an attacker writes NOP sleds and shellcode into a heap string after Nozzle finishes its examination, Nozzle is not able to detect the attack. Besides, elaborately created attack strings can still bypass Nozzle's detection. Manuel Egele et al. [27] utilize the library libemu to emulate x86 instructions in order to detect shellcode stored in a Javascript variable. However, this solution introduces nontrivial performance overhead, and elaborately created attack strings can still bypass its detection.

HSP [28] controls the number and location of int 80 instructions in a process and hides the whereabouts of the only legal int 80 instruction; hence, HSP makes it difficult for attackers to issue a system call, let alone launch a heap spray attack. HSP is a compiler-based solution; hence, the current version of HSP cannot provide protection for statically linked libraries. However, very few, if any, browsers use statically linked libraries.

L. Lu et al. [29] adopt a philosophy similar to that of BrowserGuard: blocking the execution of suspicious executable files. By sandboxing all downloaded objects in a secure zone, their work, Blade, prohibits supervised processes from operating on unauthorized files in the secure zone. Blade captures user behaviors, such as clicking a mouse button on a download-related popup window, to mark files downloaded with user consent as authorized. Blade has a zero error rate. However, it may pose performance issues to non-browser processes because of its secure zone design: whenever a user tries to manipulate a file, the OS must check whether this file is in the secure zone.

Gadaleta et al. [23] defeat heap sprays by randomly inserting interrupt instructions inside every Javascript string variable before storing it in the heap and reverting the modified string to the original string before using it. If the execution flow is redirected to the shellcode stored in a Javascript string variable, the shellcode cannot be successfully executed.
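The insert-and-revert idea described for Gadaleta et al. can be sketched as follows. This is a simplified illustration under stated assumptions, not the BuBBle implementation: the function names protect and restore, the choice of the x86 int3 opcode (0xCC) as the interrupt byte, and the bookkeeping of insertion offsets are all hypothetical choices made for this example.

```python
import random

# x86 breakpoint opcode (int3), used here as the "interrupt instruction" byte.
INT3 = 0xCC


def protect(data: bytes, count: int, rng: random.Random):
    """Insert `count` interrupt bytes at random offsets before heap storage.

    Returns the modified bytes together with the list of insertion
    offsets, which must be kept to revert the modification later.
    """
    buf = bytearray(data)
    offsets = []
    for _ in range(count):
        pos = rng.randint(0, len(buf))  # inclusive bounds; len(buf) appends
        buf.insert(pos, INT3)
        offsets.append(pos)
    return bytes(buf), offsets


def restore(protected: bytes, offsets):
    """Undo `protect` by removing the inserted bytes in reverse order."""
    buf = bytearray(protected)
    for pos in reversed(offsets):
        del buf[pos]
    return bytes(buf)


if __name__ == "__main__":
    rng = random.Random(0)
    payload = b"\x90" * 32 + b"SHELLCODE"  # stand-in for a sprayed string
    stored, where = protect(payload, 4, rng)
    print(stored != payload)
    print(restore(stored, where) == payload)
```

While the string sits in the heap in its modified form, a jump into it would sooner or later reach an interrupt byte instead of running the payload to completion; legitimate uses of the string see the original contents because the modification is reverted first.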


Another popular defense mechanism against drive-by-download attacks is the browser reputation system. In this system, before displaying a web page, the browser automatically connects to a remote database to check the reputation of the web page first. Only web pages with a good reputation are displayed by the browser. Various antivirus vendors' products, such as Norton SafeWeb [30], McAfee SiteAdvisor [31], and Trend Micro TrendProtect [32], adopt this approach to deal with drive-by-download attacks. However, a browser reputation system cannot guarantee that all websites are monitored. Besides, such systems have non-trivial false positives, and it takes a while to update out-of-date or wrong data in the database or to add new data to the database.

Some solutions, such as Provos et al. [1], Moshchuk et al. [33], Capture-HPC [34], and HoneyMonkey [35], use high-interaction honey browsers to visit web sites and monitor the behavior of these web sites in the underlying operating system to detect malicious web pages. The monitored behavior includes the creation of files or new processes and the creation or modification of Registry entries. Cova et al. [22] utilize machine learning and anomaly detection in an emulation environment to automatically detect and analyze malicious Javascript code in malicious web pages. Their solution, JSAND, can simulate the presence of any ActiveX controls or plug-ins required by a web page. Dewald et al. [36] log critical actions triggered by the execution of Javascript code in a web page. Then, applying heuristics to the logs, their solution, ADSandbox, decides whether the web page is malicious. These solutions are not integrated into a browser; hence, they are not able to provide real-time protection to browsers. Moreover, these solutions cannot detect malicious web content when the honey browser does not have the vulnerability that is used by the exploits in the malicious page. Furthermore, it is challenging for them to examine all web pages.

C. Song et al. [24] detect drive-by-download attacks by matching inter-module communication events with predefined vulnerability signatures. However, the signature-based nature of this solution makes it difficult to detect zero-day attacks.

VI. CONCLUSION

Drive-by-download attacks are one of the most severe security threats to computer and network systems nowadays. In this paper, we present BrowserGuard, a runtime, behavior-based solution to drive-by-download attacks. BrowserGuard analyzes the download scenario of every downloaded object. Based on the download scenario, BrowserGuard blocks the execution of any executable file that is downloaded to the host machine without the consent of a user. This lightweight technique introduces no more than 2.5% performance overhead in our tests because no simulation or static web page analysis is required. BrowserGuard also does not need to maintain any attack string signatures or web site reputation data. Experimental results show that BrowserGuard has no false negatives for past exploit samples and no false positives for the top 500 ranked websites. Currently, BrowserGuard is implemented on Windows Internet Explorer 7.0 because most exploits in the wild target this version of IE. Although BrowserGuard only supports IE 7.0 on a Windows system, we believe the defense model of BrowserGuard can serve as a guide to develop similar tools for other browsers.

ACKNOWLEDGMENT

Our work is funded by the National Science Council of Taiwan (ROC), and the numbers of the projects are NSC 99-2220-E-008-001 and NSC 99-2219-E-008-001.

REFERENCES

[1] N. Provos, P. Mavrommatis, M. A. Rajab, and F. Monrose, "All your iFRAMEs point to us," in Proc. 17th conference on USENIX security symposium. USENIX Association, 2008, pp. 1-15.
[2] S. Frei, T. Dübendorfer, G. Ollmann, and M. May, "Understanding the web browser threat: examination of vulnerable online web browser populations and the insecurity iceberg," ETH, Eidgenössische Technische Hochschule Zürich, Communication Systems Group, Tech. Rep., 2008.
[3] NetApplications Company News (December 1, 2008). [Online]. Available: http://www.netapplications.com/newsarticle.aspx?nid=45
[4] National Vulnerability Database. [Online]. Available: http://nvd.nist.gov/
[5] M. Egele, E. Kirda, and C. Kruegel, "Mitigating drive-by download attacks: challenges and open problems," in INetSec 2009. Open Research Problems in Network Security, 2009.
[6] Microsoft Office Snapshot Viewer ActiveX vulnerability. [Online]. Available: http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-2463
[7] Microsoft Security Bulletin MS06-014 - Vulnerability in the Microsoft Data Access Components (MDAC) Function Could Allow Code Execution. [Online]. Available: http://www.microsoft.com/technet/security/Bulletin/MS06-014.mspx
[8] Sina dloader class activex control downloadandinstall method arbitrary file download vulnerability. [Online]. Available: http://www.securityfocus.com/bid/30223/info
[9] C. Cowan, C. Pu, D. Maier, H. Hinton, J. Walpole, P. Bakke, S. Beattie, A. Grier, P. Wagle, and Q. Zhang, "Stackguard: automatic adaptive detection and prevention of buffer-overflow attacks," in Proc. 7th conference on USENIX Security Symposium - Volume 7. USENIX Association, 1998, pp. 5-5.
[10] Aleph One, "Smashing the Stack For Fun and Profit," Phrack Magazine, 1996.
[11] L.-H. Chen, F.-H. Hsu, C.-H. Huang, C.-W. Ou, C.-J. Lin, and S.-C. Liu, "A robust kernel-based solution to control-hijacking buffer overflow attacks," Journal of Information Science and Engineering, vol. 27, no. 3, 2011.
[12] T.-C. Chiueh and F.-H. Hsu, "RAD: a compile-time solution to buffer overflow attacks," in Proc. 21st International Conference on Distributed Computing Systems, 2001, pp. 409-417.
[13] J. Xu, P. Ning, C. Kil, Y. Zhai, and C. Bookholt, "Automatic diagnosis and response to memory corruption vulnerabilities," in Proc. 12th ACM conference on Computer and communications security, ser. CCS '05. ACM, 2005, pp. 223-234.
[14] Microsoft Internet Explorer window() Arbitrary Code Execution Vulnerability. [Online]. Available: http://secunia.com/advisories/15546/
[15] A. Sotirov, "Heap feng shui in JavaScript," BlackHat Europe, 2007.
[16] D. Esposito, Browser Helper Objects: The Browser the Way You Want It. [Online]. Available: http://msdn.microsoft.com/en-us/library/bb250436(VS.85).aspx
[17] I. Ivanov, API hooking revealed. [Online]. Available: http://www.codeproject.com/KB/system/hooksys.aspx
[18] Detours. [Online]. Available: http://research.microsoft.com/en-us/projects/detours/
[19] Alexa Internet. [Online]. Available: http://www.alexa.com
[20] Metasploit. [Online]. Available: http://www.metasploit.com
[21] P. Ratanaworabhan, B. Livshits, and B. Zorn, "NOZZLE: a defense against heap-spraying code injection attacks," in Proc. 18th conference on USENIX security symposium. USENIX Association, 2009, pp. 169-186.
[22] M. Cova, C. Kruegel, and G. Vigna, "Detection and analysis of drive-by-download attacks and malicious javascript code," in Proc. 19th international conference on World wide web, ser. WWW '10. ACM, 2010, pp. 281-290.
[23] F. Gadaleta, Y. Younan, and W. Joosen, "Bubble: A javascript engine level countermeasure against heap-spraying attacks," in Engineering Secure Software and Systems, ser. Lecture Notes in Computer Science, vol. 5965. Springer, 2010, pp. 1-17.
[24] C. Song, J. Zhuge, X. Han, and Z. Ye, "Preventing drive-by download via inter-module communication monitoring," in Proc. 5th ACM Symposium on Information, Computer and Communications Security, ser. ASIACCS '10. ACM, 2010, pp. 124-134.
[25] Y. Ding, T. Wei, T. Wang, Z. Liang, and W. Zou, "Heap taichi: exploiting memory allocation granularity in heap-spraying attacks," in Proc. 26th Annual Computer Security Applications Conference, ser. ACSAC '10. ACM, 2010, pp. 327-336.
[26] Microsoft Security Research & Defense. [Online]. Available: http://blogs.technet.com/srd/archive/2008/02/06/The-Kill_2D00_Bit-FAQ_3A00_-Part-1-of-3.aspx
[27] M. Egele, P. Wurzinger, C. Kruegel, and E. Kirda, "Defending browsers against drive-by downloads: Mitigating heap-spraying code injection attacks," in Proc. 6th International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment, ser. DIMVA '09. Springer-Verlag, 2009, pp. 88-106.
[28] F.-H. Hsu, C.-H. Huang, C.-H. Hsu, C.-W. Ou, L.-H. Chen, and P.-C. Chiu, "HSP: A solution against heap sprays," Journal of Systems and Software, vol. 83, pp. 2227-2236, 2010.
[29] L. Lu, V. Yegneswaran, P. Porras, and W. Lee, "BLADE: an attack-agnostic approach for preventing drive-by malware infections," in Proc. 17th ACM conference on Computer and communications security, ser. CCS '10. ACM, 2010, pp. 440-450.
[30] Norton safe web. [Online]. Available: http://safeweb.norton.com/
[31] McAFee SiteAdvisor. [Online]. Available: http://safeweb.norton.com/
[32] Trend Micro's TrendProtect. [Online]. Available: http://www.trendsecure.com/portal/en-US/tools/security_tools/trendprotect
[33] A. Moshchuk, T. Bragin, D. Deville, S. D. Gribble, and H. M. Levy, "Spyproxy: execution-based detection of malicious web content," in Proc. 16th conference on USENIX Security Symposium. USENIX Association, 2007, pp. 3:1-3:16.
[34] The Honeynet Project. Capture-HPC. [Online]. Available: https://projects.honeynet.org/capture-hpc
[35] Y.-M. Wang, D. Becker, and X. Jiang, "Automated web patrol with strider honeymonkeys: Finding web sites that exploit browser vulnerabilities," in Proc. Symposium on Network and Distributed System Security, 2006.
[36] A. Dewald, T. Holz, and F. C. Freiling, "Adsandbox: sandboxing javascript to fight malicious websites," in Proc. 2010 ACM Symposium on Applied Computing, ser. SAC '10. ACM, 2010, pp. 1859-1864.

Chang-Kuo Tso is a Ph.D. student in the Department of Computer Science and Information Engineering of National Central University. He received his M.S. degree in computer science and information engineering from National Central University, Taoyuan, Taiwan, in 2009. His research focuses on security issues in OS design, mobile devices (especially Windows Mobile and Android), and network security.

Yi-Chun Yeh received the B.S. degree in computer science and engineering from Tatung University in 2007 and the M.S. degree in computer science and information engineering from National Central University in 2009. He is currently working toward the Ph.D. degree in the Department of Computer Science and Information Engineering, National Central University, with Prof. Fu-Hau Hsu. His research interests include malware technology, firmware development, operating systems, and mobile security.

Wei-Jen Wang is an Assistant Professor of Computer Science and Information Engineering at National Central University, Taiwan. He received his B.S. degree and M.S. degree in computer information science from National Chiao Tung University, Taiwan, in 1997 and 1999, respectively. He received his Ph.D. in computer science from Rensselaer Polytechnic Institute in 2006. His research interests include concurrent programming models and languages, cloud/grid/Internet computing, distributed garbage collection, and data hiding.

Li-Han Chen is a Ph.D. student in the Department of Computer Science and Information Engineering of National Central University. He received his M.S. degree in computer science and information engineering from National Central University, Taoyuan, Taiwan, in 2008, and his B.S. degree in chemical engineering from National Tsing Hua University. His research areas include mobile security, operating systems, and network security.

Fu-Hau Hsu received his Ph.D. degree from the Department of Computer Science at Stony Brook University, New York, USA, in 2004. He is an assistant professor at National Central University and has held an appointment in the Department of Computer Science and Information Engineering since August 2005. He is affiliated with the Advanced Defense Lab and the Wireless and Multimedia Lab.

