Machine Learning-based Defense Against Process-Aware Attacks on Industrial Control Systems

Anastasis Keliris∗, Hossein Salehghaffari∗, Brian Cairl∗, Prashanth Krishnamurthy∗, Michail Maniatakos† and Farshad Khorrami∗
∗ Tandon School of Engineering, New York University, New York, USA

† New York University Abu Dhabi, Abu Dhabi, UAE
Email: {anastasis.keliris, h.saleh, bc1152, prashanth.krishnamurthy, michail.maniatakos, khorrami}@nyu.edu

Abstract—The modernization of Industrial Control Systems (ICS), primarily targeting increased efficiency and controllability through integration of Information Technologies (IT), introduced the unwanted side effect of extending the ICS cyber-security threat landscape. ICS are facing new security challenges and are exposed to the same vulnerabilities that plague IT, as demonstrated by the increasing number of incidents targeting ICS. Due to the criticality and unique nature of these systems, it is important to devise novel defense mechanisms that incorporate knowledge of the underlying physical model, and can detect attacks in early phases. To this end, we study a benchmark chemical process, and enumerate the various categories of attack vectors and their practical applicability on hardware controllers in a Hardware-In-The-Loop testbed. Leveraging the observed implications of the categorized attacks on the process, as well as the profile of typical disturbances, we follow a data-driven approach to detect anomalies that are early indicators of malicious activity.

    I. INTRODUCTION

Automatic control systems ensure the stable operation of industrial environments and provide monitoring and management capabilities for the underlying physical processes. Examples of industrial environments include water treatment and water desalination plants, assembly lines and manufacturing processes, chemical processes, and electric power systems. The nature and significance of these environments render them parts of critical infrastructure.

These industrial processes and their associated control systems are typically referred to as Industrial Control Systems (ICS). The two major types of ICS with regards to the nature and topology of the controlled industrial process are: i) Distributed Control System (DCS), where the system is divided into distributed and decentralized subsystems each responsible for its own local process, and ii) Supervisory Control and Data Acquisition (SCADA), where the control of the entire system is centralized and the system typically spans over a large geographical area [1].

Over the past years the hardware and software components of ICS are being upgraded, towards a more modern and “smart” critical infrastructure that has increased efficiency, controllability, and reliability. The addition of computing capabilities and inter/intra-connectivity to ICS promises lower production and maintenance costs, faster emergency response times, fewer incidents, and shorter downtimes. This modernization trend is enabled by the proliferation of cheap general purpose Commercial-Off-The-Shelf (COTS) hardware and software [2]. A contemporary ICS typically incorporates microcontrollers and common-architecture embedded microprocessors (e.g., ARM-based) running commodity operating systems, such as Wind River's VxWorks, Mentor Graphics' Nucleus, and Unix-based Real Time Operating Systems (RTOS). Other advanced features include web servers with graphical user interfaces for configuration and monitoring, File Transfer Protocol (FTP) servers, common networking standards, and remote maintenance capabilities [3].

The use of COTS components in critical infrastructure settings is attractive since it provides the immediate benefit of robust hardware and stable, readily available software modules. At the same time, however, vulnerabilities discovered in COTS products can be promptly ported to industrial environments, extending the cyber-security threat landscape of ICS [4]. In addition, common IT protocols used for ICS communication have known vulnerabilities and exploitation techniques, enabling elaborate attacks. Even the assurances of air-gapped networks are not adequate against motivated attackers, as demonstrated by Stuxnet [5]. Cyber-attacks against ICS are happening at an alarming pace. In 2014, the ICS Cyber Emergency Response Team (ICS-CERT) received and responded to 245 incidents in the US, whereas in 2015 the number of incidents reported grew to 295 [6], [7]. At the same time, the ICS security market is expected to grow to $11.29 billion by 2019 [8]. Table I aggregates information on high-impact ICS attacks from 2000 to date.

ICS security has been traditionally handled using network security and conventional IT security practices. ICS security goals, however, differ greatly from traditional IT security goals. Straightforward adoption of IT security solutions fails to address the coupling between the cyber and physical components in an ICS [14], as well as the demand for high availability of the monitoring and control functions [15]. For example, while an email system can afford short delays in delivering messages, a short disruption of the control process in an ICS can have devastating effects ranging from environmental disasters to significant financial losses, or even loss of life.

Paper 12.2  978-1-4673-8773-6/16/$31.00 © 2016 IEEE  INTERNATIONAL TEST CONFERENCE 1

TABLE I
TIMELINE OF HIGH-IMPACT ATTACKS TARGETING ICS

Description: In 2000, the SCADA system that controlled a Queensland sewage treatment plant was accessed and controlled by a former employee of the software development team, after his job application was rejected [9].
Impact: About 800,000 liters of raw waste were pumped into a nearby river and the grounds of a resort hotel, killing wildlife and plants.

Description: In 2008, the control and monitoring system of the Baku-Tbilisi-Ceyhan oil pipeline in Turkey was attacked by a terrorist organization. The attackers gained entry to the system by exploiting vulnerabilities of the camera communication software, and then disabled the alarm system and manipulated the pressure in the pipeline [10].
Impact: An explosion on the pipeline caused more than 30,000 barrels of oil to spill in an area above a water aquifer, and cost British Petroleum and its partners $5 million a day in transit tariffs during the closure.

Description: In 2009, the system responsible for detecting pipeline leaks for oil derricks off the Southern California coast was hacked by a disgruntled employee [11].
Impact: The monitoring system was disabled temporarily. The coastline was exposed to environmental disasters.

Description: In 2010, the Stuxnet computer worm infected the software of at least 14 industrial sites in Iran, including a uranium enrichment plant. It was introduced to the target system via USB drives and then repeatedly replicated itself [12].
Impact: The worm spied on the operations of the target system. It then used the information it had gathered to control the centrifuges, forcing them to tear themselves apart.

Description: In 2014, sophisticated attackers used spear-phishing and social engineering to gain access to the office network of a steel plant in Germany, from which the attackers then broke into the organization's production network [13].
Impact: Outages on control components and production machines prevented the plant from properly shutting down a blast furnace, resulting in significant physical damages.

In this paper, in order to address the cyber and physical coupling in ICS, we develop a process-aware supervised learning defense strategy that takes into consideration the operational behavior of an ICS to detect attacks in real-time. To better understand the requirements of such a detection module, we analyze the different categories of attacks, and study their implications on the ICS process. The accuracy of our analysis is enhanced by the inclusion of actual hardware in a Hardware-In-The-Loop (HITL) setup, which introduces realistic disturbances to the simulation model and also enables us to demonstrate a complete payload delivery mechanism. The contributions of this paper are twofold:

• Exploration of the attack surface for process-aware payloads and demonstration of complete ICS attack vectors in a controlled lab setup.

• Development of a real-time robust machine learning classifier that can detect several possible abstract categories of attacks and distinguish them from process disturbances, and redundancy-based mitigation through automated control switching.

The rest of the paper is organized as follows: Following the problem formulation in Section II, Section III presents the methodology used in this paper. We provide an overview of the utilized benchmark process in Section IV, followed by an exploration of the different attack categories in Section V. We present our experimental setup in Section VI and justify our decision for incorporating hardware devices in a HITL testbed. Section VII describes two process-aware attacks including the end-to-end payload delivery mechanism, and the attacks' impact on the overall process. Details and experimental results of our proposed online machine learning detection and mitigation strategy are presented in Section VIII. We compare this work with related work in Section IX, and conclude the paper in Section X.

II. BACKGROUND AND PROBLEM FORMULATION

This paper focuses on the development of attack detection and mitigation strategies for ICS, with emphasis on control system implementations on Programmable Logic Controllers (PLCs). ICS are comprised of one or more dynamic processes, sensors, actuators, computational components such as PLCs that implement and execute control algorithms, and communication components. Fig. 1 depicts a simplified layout of ICS components and their interconnections.

    Fig. 1. Simplified layout of ICS architecture

The characteristics of the underlying regulated dynamic processes depend on the application domain, and can include continuous-time or discrete-time dynamical systems (that are, in general, time-varying and non-linear), hybrid combinations of continuous-time/discrete-time, discrete-event or event-driven dynamics, and combinations of multiple dynamic components in a centralized or distributed structure. Furthermore, the control system implementations can also be of a wide variety of structures. However, the fundamental concept in any control system component is a feedback interconnection utilizing sensors and actuators, wherein real-time information from one or more sensors is utilized to compute commands to one or more actuators. For example, in the case of a chemical process, the real-time reading of the pressure in a reaction vessel can be utilized in a feedback loop to compute real-time commands to a valve. The valve regulates the reaction vessel pressure to a desired value, also known as a setpoint in control theory terms. While there is a considerable variety of feedback control methodologies, the Proportional-Integral-Derivative (PID) controller is the most commonly used control structure in industrial applications.
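The feedback loop described above can be sketched in a few lines. The first-order plant model, setpoint, and gains below are illustrative stand-ins for a real reaction vessel and a tuned controller, not the TE process dynamics:

```python
# Hypothetical first-order "reaction vessel" plant under discrete PID control.
# Plant coefficients and gains are illustrative, not taken from the TE model.

def pid_step(error, state, kp, ki, kd, dt):
    """One PID update; `state` carries the running integral and previous error."""
    integral, prev_error = state
    integral += error * dt
    derivative = (error - prev_error) / dt
    u = kp * error + ki * integral + kd * derivative
    return u, (integral, error)

def regulate_pressure(setpoint=2800.0, steps=200, dt=1.8):
    """Drive a toy vessel pressure (kPa) to the setpoint via a valve command."""
    y = 2600.0                # initial vessel pressure (kPa)
    state = (0.0, 0.0)
    for _ in range(steps):
        u, state = pid_step(setpoint - y, state, kp=0.05, ki=0.01, kd=0.0, dt=dt)
        y += dt * (-0.1 * (y - 2600.0) + 5.0 * u)   # toy valve/vessel dynamics
    return y
```

With integral action present the loop removes the steady-state error; a proportional-only version of the same loop would settle with a persistent offset, which is why PI and PID structures dominate in practice.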

Depending on the application domain, feedback control implementations can span a range of time scales, from a few milliseconds for fast electromechanical systems, to several seconds for slow chemical processes. The controller parameters (e.g., the PID gains) are tuned based on closed-loop performance objectives. The numerical values of the controller parameters also depend on the time step utilized in the controller implementation. ICS can have multiple control loops possibly implemented on multiple controller computational platforms, with each of these loops regulating different parts of the overall process. Furthermore, feedback control components can be interconnected in cascade or parallel combinations.

Since the proper operation of an ICS depends on the real-time feedback control loops, an ICS can be susceptible to various types of process-aware attacks that attempt to hamper system performance and stability, by modifying any one of the constituent components, or a combination thereof. The potential points of interest of an ICS from a security perspective are:

• Sensor components
• Computational components (i.e., controllers)
• Actuator components
• Communication from controllers to actuators
• Communication from sensors to controllers
• Remote communication mechanisms

    A. Process-aware attacks

In comparison with generic attacks that target the computational or communication elements, process-aware attacks attempt to utilize knowledge of the dynamic process being controlled, typical sensor and actuator time signals, control algorithms and implementation mechanisms, to negatively impact the closed-loop system. While the attack payload is process-aware, the entry points to the system are generic vulnerabilities of the computational or communication COTS components.

By leveraging vulnerabilities in the computational units that implement the real-time controllers, an attacker can modify control parameters, time step settings, or trigger conditions in an event-driven controller. Modification of the control logic and control parameters can, in general, be targeted to generate various types of effects on the closed-loop system, such as loss of performance or system stability, slow effects over a longer time interval, effects under a specific trigger condition, modified system behavior, etc.

    B. Process-aware defenses

To guard against process-aware attacks and counteract their effects, process-aware resiliency mechanisms are crucial in addition to and in conjunction with best-practice security methods for the computation and communication components in the system [3]. The two main categories of defense strategies for ICS are i) approaches based on dynamic models of the system, and ii) machine learning-based approaches. Dynamic model based approaches utilize the existing model of a system, and aim to detect anomalies that do not conform to the dynamic equations and control laws that govern the system behavior. Machine learning algorithms train models to detect deviations from the normal operation of the system.

A common assumption for both approaches is that an attack will have observable impact. However, the first approach additionally assumes that a representative mathematical model of the system is available, whereas such a model is not required by the second approach. Furthermore, dynamic models often have high complexity and are specific to the instantiation of the system considered in the analysis, in contrast to machine learning algorithms that tend to be less complex and can more efficiently generalize and scale to defend an updated version of the system or another ICS altogether. For these reasons, we adopt a machine learning defense approach in this work, and train a module to detect whether the process has been maliciously modified or tampered with.
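In practice, "detecting deviations from normal operation" typically means scoring summary statistics computed over sliding windows of the measured variables. The particular features below are our own illustrative choice, not ones prescribed by a specific system:

```python
# Illustrative sliding-window feature extraction for a learned detector.
from statistics import mean, pstdev

def window_features(samples):
    """Condense a window of one measured variable into (mean, spread, range),
    features that a trained classifier can score without a plant model."""
    return (mean(samples), pstdev(samples), max(samples) - min(samples))

# A steady window and a drifting window yield visibly different features.
steady   = window_features([2800.0, 2800.5, 2799.5, 2800.2])
drifting = window_features([2800.0, 2805.0, 2815.0, 2830.0])
```

Because such features describe behavior rather than physics, the same extraction pipeline carries over to an updated plant or a different ICS, which is the scalability argument made above.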

    C. Threat model

The threat model considered in this paper is similar to that of the well-known Stuxnet worm [5]. In our threat model, we assume two malicious entities:

The attack designer: A technically capable adversary with partial prior knowledge of the target process and equipment, their implementation specifics, and a sufficient budget. The attack designer can, for example, be a medium to large sized corporation, or a nation state.

The attack launcher: A technically incapable adversary who has physical access to the facilities and ICS devices. The attack launcher can, for example, be low-level staff at the ICS facility who either has financial motives, or was blackmailed into carrying out the attack. The attack launcher possesses a “box”, which automatically carries out the attack when connected to the system without the need for user input, effectively delivering a Red-Team-In-a-Box (RTIB) attack.

    III. METHODOLOGY

To develop a process-aware defense strategy tailored for ICS, we follow a full vulnerability assessment cycle. In general, the goal of vulnerability assessment is to identify potential threats to a system, analyze risks in order to provide mitigation techniques, and prevent any deviation from the system's expected operation. In the case of ICS, the steps of such an approach can be:

Reconnaissance: Gather information regarding the ICS components and configuration details.
Vulnerability discovery: Find possible attack entry-points, implementation weaknesses, and configuration flaws.
Attack vectors and impact analysis: Formulate attack vectors, develop exploits, and analyze their impact.
Mitigation: Develop detection and mitigation techniques for the attack vectors.

We adopt this vulnerability assessment approach because it enables us to explore the categories of attacks against any arbitrary ICS in a structured manner, and because it provides a better understanding of the nature and impact of attacks. This attack-oriented analysis provides insight into the implications of attacks on the overall process, enabling the formulation of requirements for a detection module.


Fig. 2. Tennessee Eastman process schematic

As a benchmark model, we use a dynamic model of a complex, non-linear process, namely the Tennessee Eastman (TE) chemical process. We integrate hardware PLCs in a HITL experimental testbed, which adds realistic disturbances to the simulation model. Furthermore, the existence of hardware is essential for the development and demonstration of a complete attack vector with a payload delivery mechanism, following steps similar to the ones an attacker would take when targeting an ICS, including reconnaissance and vulnerability discovery.

For the detection module we follow a supervised learning approach, and in particular non-linear Support Vector Machines (SVMs). We develop a robust SVM model that can differentiate between disturbances during normal operation and malicious activity, and detect attacks shortly after their deployment on the system. Mitigation of detected attacks is achieved through automated control switching between redundant controllers. Although this redundancy-based strategy may not be cost-efficient, we argue that incident response time and overall system performance are superseding factors to controller costs. In addition, judicious selection of a subset of controllers that are critical to the process can reduce these duplication costs.
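The paper's detector is a non-linear SVM. As a self-contained sketch of the same idea (an RBF-kernel decision boundary separating normal windows from attack windows in feature space), the kernel perceptron below stands in for an SVM; all data, features, and parameters here are synthetic and chosen for illustration:

```python
# Kernel perceptron as a stand-in for a non-linear (RBF-kernel) SVM classifier.
import math
import random

def rbf(x, z, gamma=0.5):
    """Gaussian (RBF) kernel, the same kernel family used with non-linear SVMs."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def train(X, y, epochs=20):
    """alphas[i] counts mistakes on sample i; the decision function is the
    usual kernel expansion over the training set."""
    alphas = [0] * len(X)
    for _ in range(epochs):
        for i, (xi, yi) in enumerate(zip(X, y)):
            score = sum(a * yj * rbf(xj, xi)
                        for a, xj, yj in zip(alphas, X, y) if a)
            if yi * score <= 0:        # misclassified: add this sample
                alphas[i] += 1
    return alphas

def predict(alphas, X, y, x):
    score = sum(a * yj * rbf(xj, x) for a, xj, yj in zip(alphas, X, y) if a)
    return 1 if score >= 0 else -1     # +1: attack window, -1: normal window

# Synthetic 2-D feature windows: normal operation clusters near the origin,
# attacked windows are displaced by an accumulated bias.
random.seed(0)
normal = [(random.gauss(0.0, 0.1), random.gauss(0.1, 0.05)) for _ in range(30)]
attack = [(random.gauss(1.5, 0.2), random.gauss(1.0, 0.2)) for _ in range(30)]
X, y = normal + attack, [-1] * 30 + [1] * 30
alphas = train(X, y)
```

A deployed detector would prefer a margin-maximizing SVM (and a confirmation window before triggering controller switching), but the shape of the learned decision function, a kernel expansion over labeled training windows, is the same.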

    IV. BENCHMARK ICS: TENNESSEE EASTMAN PROCESS

The TE process shown in Fig. 2 is a complex, open-loop unstable industrial process. We selected this process as a benchmark model since it realistically encapsulates the dynamic behavior of real chemical processes [16]. It is composed of five operation units: a reactor, a product condenser, a vapor-liquid separator, a compressor, and a stripper. In this process, gaseous reactants, A through E, are fed to the reactor and a set of chemical reactions generates two liquid products, G and H, and one liquid byproduct, F. The reactor product stream passes through the separator in order to condense the product. The non-condensed products are recycled back to the reactor through the condenser unit. The condensed products are then fed to the stripper unit where liquids G and H are removed. The byproduct is also purged from the stripper.

TABLE II
TE OPERATION MODES

Mode  G to H Mass Ratio  Production Rate (kg/h)
1     50/50              14076
2     10/90              14076
3     90/10              11111

Based on market demands, the TE process can work in three modes of operation, which are defined by the mass ratio of G to H in the product, and the product rate. These modes of operation are listed in Table II. Without loss of generality, in this work, we focus on the first operation mode. The primary control objective of the plant is to maintain the mass ratio of G to H in the product, while satisfying equipment constraints.

We build upon the MATLAB Simulink model of the TE process provided in [17]. We extended the simulation model in order to investigate process-aware attacks and defenses. Furthermore, we incorporated a serial hardware interface that enables communication between the simulation model and a PLC, effectively realizing a HITL testbed. The Simulink model does not consider fast dynamics of some components within the MATLAB simulation (e.g., transmitter lags), but retains realistic behavior of gas phase dynamics and valve lags. The instantiation of the simulator considered in this paper includes 50 states, 41 measured variables with Gaussian noises, 12 manipulated variables, and 13 disturbance signals. A distributed control approach is implemented via 18 Proportional-Integral (PI) controllers, and the simulation time step is set to 1.8 seconds. Note that the large number of measured variables is due to the fact that some variables were included in the original simulation model for monitoring and research purposes, but do not have any impact on the control loops.

    V. ATTACK CATEGORIZATION

In this section, we present the different categories of payloads that can be launched in an ICS environment. The section serves two main purposes: i) exploration of the different options an attack designer may consider when designing a payload, and ii) generation of comprehensive data for the different abstract categories of attacks that will be subsequently used for training a supervised learning detection module.

Towards investigating process-aware payloads, we carried out studies on the TE simulation model to understand the effects of attacks on the various control loops in the system. We have studied all control loops of the model. However, due to the large number (18) of distributed control loops in the process, showing all possible combinations of attacks and their implications on the complete set of loops is neither feasible, nor desirable. Instead, we focus on the implications of attacks on the overall process and present payloads that target one specific control objective in the model, without loss of generality.

Based on attack objectives and vulnerabilities of the specific implementation of an ICS, different components of a control loop may be more attractive from a security perspective. Attacks can be divided into three distinct categories, namely sensor attacks, actuator attacks, and controller attacks. In the following subsections, we present attack examples, as well as their impact, for all three categories. We utilize reactor pressure as a case study, as it is one of the most important variables of the TE process. Variability in reactor pressure and temperature can result in instability of the process. Small increases in pressure can halt the entire process, since the optimal operational value for minimizing production cost is set to 2800 kPa, very close to the shutdown limit of 3000 kPa. Decreasing the reactor pressure leads to increased production costs. These factors render the reactor pressure control loop attractive for attackers interested in negatively affecting the efficiency and stability of the system. For all ensuing scenarios in this section, the G production setpoint and production rate setpoint are set to 53.8% and 23 m³/h respectively, and all attacks are launched at t = 10 h.

Fig. 3. Stripper level response under sensor attack

    A. Sensor Attacks

In sensor attacks, the attacker modifies/spoofs a sensor reading to affect closed-loop system operation. One example of such an attack is modification of the sensor value in a continuous manner, starting with a slow increase, and increasing the rate of variation while the attack progresses. The mathematical model of such an attack is:

ỹ(t) = y_sp + α·e^(β(t−τ))  (1)

where y_sp is the setpoint value of the output, τ is the attack launch time, and α and β are tuning constants.
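Eq. (1) transcribes directly into code. Here the pre-attack reading is assumed to sit at the setpoint for simplicity, and the default α, β, and τ values are illustrative only:

```python
# Sensor-spoofing payload of Eq. (1): exponentially growing bias after tau.
import math

def spoofed_reading(t, y_sp, alpha=0.1, beta=0.5, tau=10.0):
    """Return the reported (spoofed) sensor value at time t; alpha, beta,
    and tau are illustrative tuning constants, not values from the paper."""
    if t < tau:
        return y_sp                  # pre-attack: reading assumed at setpoint
    return y_sp + alpha * math.exp(beta * (t - tau))
```

The injected bias equals only α at t = τ and grows exponentially afterwards, which is why the early phase of such an attack is hard to distinguish from ordinary measurement noise.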

Fig. 3 and Fig. 4 show the effects of a payload falling under this category, launched against the stripper level sensor. The attack influences the process slowly at first, but its effect increases exponentially over time. Under this attack, the stripper level reaches the high shutdown limit after 5.65 h. Moreover, the production rate and operation cost deviate from their setpoints during the attack. Note that maintaining the G production percentage and production flow rate constant, while satisfying safety constraints, are important objectives of the TE process.

Fig. 4. Performance indices of the system under sensor attack

B. Controller Attacks

The second category of payloads targets controllers, and modifies the control parameters of the process, or ultimately the control law itself. For example, modifying the controller PID gains directly influences controller performance, and will possibly result in deterioration of the overall performance of the process. In our case, the attack designer may change either the proportional or integral gain of one of the PI controllers of the TE process. One example would be a change in the gains by a multiplication factor:

k̃_i = λ·k_i  (2)

where λ is a constant, k_i is the original designed gain, and k̃_i is the modified gain value.
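To make the effect of Eq. (2) concrete, the sketch below runs a proportional-only toy loop (a hypothetical first-order plant with illustrative gains, not the TE controllers) and compares the residual tracking error under the designed gain against the same loop with the gain scaled by λ = 0.1:

```python
# Effect of the gain-scaling attack of Eq. (2) on a toy proportional loop.

def closed_loop_error(kp, steps=100, dt=1.0, setpoint=1.0):
    """Hypothetical first-order plant under P-only control; returns the
    absolute tracking error remaining after `steps` updates."""
    y = 0.0
    for _ in range(steps):
        u = kp * (setpoint - y)          # controller output
        y += dt * (-0.2 * y + 0.2 * u)   # illustrative plant dynamics
    return abs(setpoint - y)

nominal  = closed_loop_error(kp=2.0)        # designed gain
attacked = closed_loop_error(kp=0.1 * 2.0)  # Eq. (2) with lambda = 0.1
```

Proportional-only control is used here so the gain's effect shows up as a steady-state offset; with integral action the offset would eventually be removed, and a gain attack instead degrades transient behavior and operating cost, as in the TE results below.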

The effects of one such controller attack, where the proportional gain of the reactor pressure PI controller is multiplied with a constant numerical value, are shown in Fig. 5 and Fig. 6. The attack results in a decrease of the reactor pressure, which has a direct negative effect on the operating cost of the process. Although this attack does not have a large impact on the product quality, it significantly increases the operating cost.

    C. Actuator Attacks

The final category of attacks targets actuator values, in which the payload modifies the actuator values to disrupt the system's operation in a manner difficult to detect, since the actuator values are typically the ones sent to the control center for monitoring purposes. One example of an actuator targeting payload is the addition of a small time-varying bias to the actual actuator value to disguise the attack, and slowly deteriorate the system performance without causing any instability outside the process's operational boundaries. The mathematical model of such a payload is:

ũ(t) = u(t) + a·sin(ωt)  (3)

where a and ω are constant values, u(t) is the actual actuator value, and ũ(t) is the modified actuator value.
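Eq. (3) also transcribes directly; because the bias is bounded by a, each individual command still looks plausible while the injected oscillation accumulates mechanical wear. The amplitude and frequency defaults below are illustrative:

```python
# Actuator-bias payload of Eq. (3): small sinusoidal perturbation of u(t).
import math

def tampered_actuation(u, t, a=0.02, omega=0.5):
    """Superimpose a bounded sinusoidal bias on the true command u(t);
    a and omega are illustrative constants, not values from the paper."""
    return u + a * math.sin(omega * t)
```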

The effects of this attack on the separator level control loop are shown in Fig. 7 and Fig. 8. The percentage of G production, which is the product quality metric, has an oscillatory response under this attack model. Product quality oscillation is a very undesirable process behavior. Additionally, minimizing valve movements is one of the control objectives of the TE process. This attack model causes an oscillatory response of the separator flow valve position (shown in Fig. 7), which will result in faster wear-out of the valve and its subsequent decommissioning. The characteristics of this attack are similar to those of Stuxnet's payload, which destroyed centrifuges by forcing an oscillatory response, reducing their life-span.

Paper 12.2 INTERNATIONAL TEST CONFERENCE 5

Fig. 5. Reactor pressure response under controller attack

Fig. 6. Performance indices of the system under controller attack

    VI. EXPERIMENTAL SETUP: HITL TESTBED

Fig. 7. Separator level response under actuator attack

Fig. 8. Performance indices of the system under actuator attack

An important consideration when performing vulnerability assessments is proper selection of the assessment environment. Assessment environments may include software-only simulation models, production testing, or setup replication, each with its advantages and disadvantages. For example, while production testing and setup replication provide the most accurate results, they are not viable options for ICS. The former is inherently hazardous, given the interactions of ICS with the physical world, and the latter is cost-prohibitive, as it requires duplication of every component in the system. Software simulations have very low design costs, but fail to capture the complexity of ICS and cannot recreate the real-world conditions and interactions of cyber-physical systems. Hybrid methods try to address this trade-off by including one or more hardware components connected to a software simulation model in a HITL setup. This approach inherits the low design cost of software simulations and the realistic disturbances that hardware inclusion contributes to a system. Moreover, from a security perspective, the existence of hardware in the experimental setup enables a more thorough investigation of the system's security, as well as the formulation of complete attack vectors, including payload delivery mechanisms [18]. For the aforementioned reasons, we adopt a HITL experimental setup in this work.

The experimental HITL testbed we developed for studying the TE process is depicted in Fig. 9. The Simulink model described in Section IV was modified by removing one of its control loops, and implementing the equivalent model on a PLC unit. The control loop offloaded to the hardware PLC is a cascade of two PI controllers driven by two sensors, responsible for controlling the reactor's pressure and purge rate. The cascaded PI-to-PI controller implemented on the PLC was tuned to closely match the behavior of its computer-simulated analog. The numerical results from the HITL simulator for any process initial condition and disturbance conditions are very similar to the pure simulation, but also include noise and errors due to multiple practically relevant hardware-related effects (e.g., random noise, baseline drift on analog signal lines, quantization effects on analog I/Os).
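The hardware-related effects listed above could be modeled roughly as follows. This is our own simplified sketch, not the testbed's actual error model; the noise level, drift rate, and A/D resolution are assumptions.

```python
# Sketch (our assumption, not the paper's simulator) of the hardware
# effects named in the text: random line noise, baseline drift, and
# quantization on the analog I/Os.
import random

def quantize(v, full_scale=10.0, bits=12):
    """Round a 0..full_scale volt signal to an n-bit A/D code and back."""
    levels = (1 << bits) - 1
    code = round(max(0.0, min(full_scale, v)) / full_scale * levels)
    return code / levels * full_scale

def corrupt(v, t, noise_std=0.01, drift_per_s=0.001):
    v = v + random.gauss(0.0, noise_std)   # random noise on the line
    v = v + drift_per_s * t                # slow baseline drift
    return quantize(v)                     # A/D quantization

random.seed(0)
clean = 5.0
noisy = corrupt(clean, t=10.0)
assert abs(noisy - clean) < 0.1            # small, realistic perturbation
```

Training and testing against signals with such perturbations is what distinguishes the HITL results from a pure software simulation.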

In terms of hardware, the primary PLC unit used in the HITL testbed is the Wago 750-881, because it is a good example of the transition from legacy-based structures to modern technologies. The Wago 750-881 features a 32-bit ARM CPU running the Nucleus RTOS, and a 32 KB non-volatile memory which holds the ladder logic files. The RTOS includes a web server and an FTP service. In terms of networking, the Wago supports the HTTP, SNTP, and SNMP protocols for diagnostics and management, and EtherNet/IP and Modbus for Fieldbus communication. Two Ethernet ports and a serial interface allow programming and management of the PLC.

Fig. 9. Experimental HITL testbed configuration

For communication between the PC simulation model and the PLC, we utilized a Serial-Interface Board (SIB) with analog-to-digital (A/D) and digital-to-analog (D/A) conversion capabilities. Packets containing values corresponding to the measured signals, which normally feed into the reactor pressure and purge rate PI loops, are transmitted from the simulation host PC to the SIB over a USB-to-serial connection, using Simulink's serial interface functionality. All data are transmitted as 32-bit, single-precision floating point values. Before packetization, measurement values are scaled to accommodate the SIB's D/A output voltage range of 0 to 3.3 volts. Two D/A channels are routed from the SIB to analog signal amplifiers, which rescale these signals to the PLC's analog input range of 0 to 10 volts. In a similar manner, the output of the PLC is downscaled and transmitted via a D/A peripheral. This signal is then routed to an A/D channel on the SIB, which samples the signal and sends the resulting value as serial data back to the Simulink model. This control value is converted to its original units, and applied as an actuation signal, closing the simulated process loop.
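The scaling and float32 packetization steps can be sketched as follows; the measurement span, endianness, and helper names are assumptions made for illustration, not the actual SIB wire protocol.

```python
# Sketch of the scaling and 32-bit float packetization described above.
# The signal span and byte order are illustrative assumptions.
import struct

D2A_MAX = 3.3                                  # SIB D/A output range (V)

def to_dac_volts(value, lo, hi):
    """Linearly scale a measurement in [lo, hi] into the 0..3.3 V D/A range."""
    return (value - lo) / (hi - lo) * D2A_MAX

def pack_measurements(values):
    """Serialize measurements as little-endian float32, one per channel."""
    return struct.pack("<%df" % len(values), *values)

# e.g. a reactor pressure of 2700 kPa on an assumed 2000..3000 kPa span:
v = to_dac_volts(2700.0, 2000.0, 3000.0)       # ~2.31 V
payload = pack_measurements([v])
assert len(payload) == 4                       # one float32 per channel
```

On the receiving side the inverse operations (unpack, rescale to engineering units) close the loop, as the paragraph above describes.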

To achieve time-synchronization between the PLC and the host PC, we used a digital trigger. Each time the SIB receives a data packet from the Simulink program, it triggers a PLC control loop update cycle. The PLC is programmed to read this signal via a digital input peripheral, and subsequently execute a single control loop update cycle. Once the SIB board has sampled the updated PLC controller output, the acquired control-loop output data is sent back to the Simulink host PC, as previously described.

Programming the ladder logic on the PLC and monitoring its progress are performed over the PLC's Ethernet port, using Wago's CODESYS development environment. In addition to incorporating the Wago PLC in our HITL setup, we also implemented an equivalent interface with a Siemens S7-300 PLC unit, which serves as a backup controller for the needs of our mitigation strategy.

    VII. DEVELOPMENT AND IMPACT OF ATTACK VECTORS

In this section, we demonstrate the development of a complete attack vector, and the impact of two process-aware payloads on the overall system. We argue that the steps detailed here are similar to the steps an attack designer would follow when designing a payload and delivery mechanism.

    A. Payload delivery mechanism

In order to compromise the PLC, an attack designer has to find an entry point that does not disrupt the normal operation of the system. The Wago 750-881, as described in the previous section, has two Ethernet ports. Given that our threat model allows for physical access, and under the assumption that one port is utilized for the connection between the controller and the control center while the other is not occupied, we investigated the feasibility of concurrent connections to the PLC. Our initial finding was that the CODESYS environment does not allow concurrent connections to the same PLC from two different machines. However, by reverse engineering the proprietary communication protocol between CODESYS and the PLC with the help of Wireshark, and establishing a connection to the PLC outside CODESYS, we discovered that concurrent connections are possible. To avoid crashing the PLC, we force a 1 ms delay between transmissions of successive packets.

Disassembling the firmware of the Wago 750-881 verified that our PLC suffers from the same vulnerability reported for other Wago products (CVE-2012-3013). In particular, the firmware includes 3 hardcoded credentials which we can leverage to perform privileged operations, such as file operations and controller reset [19]. Furthermore, the FTP service on the PLC does not require any authentication. To automate communications, we developed scripts that mimic legitimate communication, and allow us to send commands and files to the controller, similar to the approach described in [20].

A checksum file is sent to the PLC along with the ladder logic files, for verifying the integrity of the transmitted data on the PLC side. Since a payload with modified ladder logic files must pass this check, we analyzed the checksum algorithm. By comparing the checksums of different files and reverse engineering the CODESYS executable, we derived the checksum algorithm to be a variation of the legacy SYSV algorithm.
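The exact CODESYS variation is not reproduced here; for reference, the classic legacy System V "sum" algorithm it derives from can be sketched as:

```python
# The classic SYSV checksum: a byte sum folded twice into 16 bits.
# (The CODESYS variation the paper derived differs in unspecified details.)

def sysv_checksum(data: bytes) -> int:
    """16-bit System V checksum of a byte string."""
    s = sum(data)
    r = (s & 0xFFFF) + ((s & 0xFFFFFFFF) >> 16)
    return (r & 0xFFFF) + (r >> 16)

print(sysv_checksum(b"ladder logic"))   # prints 1178
```

Because the checksum is keyless, any party who knows the algorithm can recompute it for modified files, which is exactly what the attack path above exploits.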

The payloads we chose to deliver using the above attack path fall under the category of controller attacks discussed in Section V. In particular, we developed two payloads that modify the proportional and integral gains, respectively, of the PI controllers implemented on the PLC. Conforming to the considered threat model, we automated the attack using an Ubuntu BQ Aquaris E4.5 mobile phone that plays the role of the RTIB. We additionally developed an application on the phone, which detects when the phone is connected to a Wago PLC and automatically launches an attack. The technically incapable attack launcher is only required to plug the phone into the PLC, remaining completely oblivious to the inner workings and underlying mechanisms of the attack.

Fig. 10. Reactor pressure before and after proportional gain attack

Fig. 11. Reactor pressure before and after integral gain attack

The complete attack vector is outlined as follows:
• Connect the RTIB to the PLC via the unused Ethernet port
• Establish a communication link
• Log in using the hardcoded credentials
• Download the ladder logic over FTP
• Modify the gain variables in the ladder logic
• Calculate the checksum of the new binary
• Send the modified files to the PLC via FTP
• Force a reload of the boot project

    B. Attack results

Fig. 10 and Fig. 11 show the results of the two attacks, both launched at t = 20h. The first attack replaces the original proportional control gain of the Purge Rate PI control loop with a slightly larger value. This attack causes the controller's performance to degrade, eventually resulting in lower reactor pressure, as seen in Fig. 10, and thus higher operational costs. The second attack, shown in Fig. 11, replaces the integral gain of the same control loop with its inverse value. This affects the reactor pressure, causing it to slowly rise, and eventually exceed the shutdown limit of 3000 kPa, halting the entire process. In both cases, we can observe significant alterations to the process outputs as a result of the attacks.

    VIII. PROCESS-AWARE DETECTION AND MITIGATION

The attacks described in Section V will be used to train a machine learning-based attack detection module, which is the ultimate goal of this work. To detect and counteract malicious activity on an ICS, model-based and empirical knowledge of the typical time signal patterns of the dynamic processes and their dynamic characteristics can be utilized to ascertain if an attack is ongoing. One such process-aware approach is machine learning-based classification. In particular, a general and flexible approach is the Support Vector Machine (SVM), which utilizes machine learning techniques to generate a classifier that can assign input data to one of several categories.

For the needs of our detection mechanism, we trained a primary SVM to detect the presence of an on-going attack, and a bank of secondary, separate SVMs to detect specific categories and types of attacks. SVMs provide significant robustness properties beyond simple range-based classifiers. By utilizing learned knowledge of the run-time interdependencies between multiple information streams, an SVM can provide high detection accuracy. In comparison, simple range-based attack detectors suffer from either a high number of false positives when the ranges are set narrowly enough to detect attacks reliably, or a high number of false negatives when the ranges are widened to reduce false positives.

As with any classifier methodology, the accuracy of an SVM crucially depends on how well it is trained. In the context of the application considered here, the training set for the SVM included large data sets from normal operation of the system, as well as comprehensive data sets under various categories of attack conditions. Furthermore, to accurately discriminate between typical disturbance conditions under normal operation and attack conditions, the training set included data sets under the various disturbance conditions. Our attack detection module utilizes data from the complete set of 12 sensors of the TE process. Based on the accessible signal set y, the classifier operates on a sliding time window of the measurements from that signal set, i.e., the classifier computation at time k utilizes y[k], y[k − 1], . . . , y[k − N + 1], with N being a positive integer, wherein N = 1 corresponds to the simplest case, in which the classifier operates on measurements from each time step separately. To provide robustness to naturally noisy data and reduce false positives, the classifier output yc is time-windowed to generate the overall detection signal s based on the time sequence yc[k], . . . , yc[k − Ns + 1], with Ns being a positive integer. In the SVM-based classifier implementation here, a sliding window with Ns = 50 is utilized to provide robustness against false alarms under disturbances and sensor noise during normal operation, by only declaring an attack detection when more than 90% of the classifier outputs during the sliding time window correspond to attack detections.
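The time-windowing step described above can be sketched as follows, using the Ns = 50 window and the 90% voting threshold from the text; the streaming interface is our own simplification.

```python
# Sketch of the windowed voting on raw classifier outputs y_c: an attack
# is declared only when more than 90% of the last Ns outputs vote "attack".
from collections import deque

def windowed_detector(yc_stream, ns=50, threshold=0.9):
    """Yield the overall detection signal s[k] for each classifier output."""
    window = deque(maxlen=ns)
    for yc in yc_stream:               # yc is 0 (normal) or 1 (attack)
        window.append(yc)
        yield 1 if sum(window) > threshold * ns else 0

# 60 normal samples followed by a sustained attack indication:
stream = [0] * 60 + [1] * 60
s = list(windowed_detector(stream))
assert s[59] == 0                      # no alarm during normal operation
assert s[-1] == 1                      # alarm once >90% of the window is 1
```

A short burst of spurious attack votes (from noise or a disturbance) cannot exceed the 90% threshold, which is how the window suppresses false alarms.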

Fig. 12. SVM-based attack detection under sensor attack; process disturbance injected between 7h and 8h, attack injected at 15h.

Utilizing the RBF kernel and data sets from normal operation and the attack scenarios above under various disturbance conditions, a set of SVMs was trained with N = 1 and Ns = 50 to firstly detect an attack, and secondly identify which specific type or category the attack falls into. We present one of the SVM test vectors from the sensor attack category in Fig. 12. To validate the robustness of our trained models, we applied an unknown payload in addition to a previously unseen disturbance. The disturbance was applied on the A/C ratio with constant B composition during time t = 7h to 8h, with its effect remaining on the system until about time t = 10h, while a malicious payload was launched at time t = 15h. The trained SVM detected the abnormality in the system 0.2h after the attack started. The SVM output is shown in Fig. 12 as a 0 or 1, with 1 denoting attack detection. This demonstrates that our detection SVM can accurately discriminate between a disturbance condition appearing during normal operation and an attack condition. Note that under such disturbance conditions, a simplistic range-based classifier based on maximum and minimum thresholds would either trigger false positive alarms or fail to detect the attack, depending on the looseness of the thresholds.
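As a rough sketch of the training step, the snippet below fits an RBF-kernel SVM on toy two-dimensional data standing in for the 12-sensor TE measurements. scikit-learn's SVC is used here as a stand-in for the authors' actual SVM implementation, and the data are synthetic.

```python
# Minimal sketch (toy data, not TE sensor data): train an RBF-kernel SVM
# to separate "normal" from "attack" feature vectors.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=0.2, size=(100, 2))   # class 0: normal
attack = rng.normal(loc=1.5, scale=0.2, size=(100, 2))   # class 1: attack
X = np.vstack([normal, attack])
y = np.array([0] * 100 + [1] * 100)

clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict([[0.0, 0.0], [1.5, 1.5]]))             # → [0 1]
```

In the paper's setting, each feature vector would instead be a window of the 12 sensor measurements, and separate SVMs would be trained for detection and for attack-type identification.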

    A. Attack detection and mitigation experimental results

We integrated the SVM-based process-aware detection algorithm described above into the developed HITL testbed to detect abnormalities at run-time, and tested the model for two different scenarios of controller attacks. The attacks used the payload delivery method described in Section VII, while the two payloads were controller attacks that modified the proportional and integral gains of the reactor pressure control loop, respectively. Note that neither payload was used in the training phase of the SVM model.

During each test, the process was allowed to run normally for 2 hours with the SVM-based detection module active. At the 2-hour mark, we deployed an attack through the RTIB (Ubuntu phone). Once the SVM detected that an attack occurred, control was switched from the primary PLC (Wago 750-881) to a secondary PLC (Siemens S7-300) that executed the same controller in parallel. The experimental results of our two tests, presented in Fig. 13 and Fig. 14, demonstrate the effectiveness and robustness of our defense strategy. Both attacks are detected, and control is switched to the auxiliary PLC with a small time delay, such that the overall process performance is not greatly affected. Note that the transients after the SVM detects an attack are due to the switching between different controller instantiations, and the time the process requires to correct the short-term effects of an attack.
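The switch-over logic can be sketched as follows; the latching behavior and the controller labels are our simplification of the testbed's supervisory logic, not its actual code.

```python
# Sketch (our simplification): once the SVM flags an attack, actuation
# commands are taken from the backup PLC instead of the primary one.

def select_control(u_primary, u_backup, attack_detected, state):
    """Latch onto the backup controller after the first detection."""
    if attack_detected:
        state["on_backup"] = True
    return u_backup if state["on_backup"] else u_primary

state = {"on_backup": False}
outputs = []
for k, detected in enumerate([0, 0, 1, 0, 0]):   # detection at step 2
    outputs.append(select_control(f"wago[{k}]", f"s7[{k}]", detected, state))
print(outputs)   # ['wago[0]', 'wago[1]', 's7[2]', 's7[3]', 's7[4]']
```

Because both controllers run in parallel, the backup is already tracking the process state at switch-over, which keeps the transient small.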

Fig. 13. Reactor pressure under proportional gain attack with SVM mitigation

Fig. 14. Reactor pressure under integral gain attack with SVM mitigation

The backup PLC should preferably be of a different make and model than the primary unit, making it less likely to suffer from the same vulnerabilities. This architecture introduces redundancy and decreases the overall susceptibility of the control system to attacks, while at the same time increasing the complexity and effort required by an attacker. Moreover, in the event of a false positive alarm, control is switched to an identical controller (from a control-theoretic perspective), reducing the severity of misclassifications.

    IX. RELATED WORK

Due to the increasing complexity and ubiquity of current-day ICS applications, significant research effort has been undertaken on fault-tolerant and resilient control methods [21], [22]. More specifically, machine learning-based approaches have been suggested for the TE process in previous works for fault detection [23], [24]. In comparison, the work presented in this paper focuses on the process-aware security of the process. We consider malicious modifications that have been meticulously crafted and incorporate knowledge of the process mechanics, in contrast to random faults, which have a significantly different impact on the process.

Prior works have studied process-aware attacks and mitigation methods [25]–[28], and simulation-based analysis of the effects of attacks and component failures in ICS applications [29], [30]. These works follow a control-theoretic approach to design and/or detect attacks, but do not take into consideration the noise and artifacts that hardware components can introduce to the process. In addition, to the best of our knowledge, this is the first work that demonstrates a complete attack and its impact, including the design of process-aware payloads and payload delivery mechanisms in a HITL testbed.

This work is closest to [31] and [32]. In [31], the authors use a dynamic linearized model of the TE process and follow a control-theoretic approach to formalize different types of attacks and develop state estimators that can detect malicious behavior. However, that paper only considers sensor attacks, and in addition uses a simplified, linear approximation of the original TE process. Similarly, the authors of [32] investigate a “secure control” methodology, applied on the TE model towards resilient control. The focus is on attacks, specifically sensor attacks, and while the authors briefly discuss possible defense objectives, they do not provide a complete defense strategy. In comparison to these works, we have utilized the non-linear TE model, and have explored attacks that span the entire set of components included in the process (sensors, controllers, actuators), providing a comprehensive process-aware detection and mitigation strategy. Furthermore, none of the discussed methodologies incorporate the realistic disturbances that hardware inclusion contributes to the process.

    X. CONCLUSION

In this paper, we demonstrated a process-aware defense and mitigation strategy using supervised learning. We utilized a HITL testbed of the TE chemical process to investigate and understand process-aware attacks and their impact on the overall process. We presented end-to-end attack vectors, including payload delivery mechanisms. This analysis was made possible by the inclusion of hardware PLCs in the assessment environment. Using the knowledge obtained from the investigation of attack categories, we trained an SVM model that has the ability to detect abnormalities in real-time, and can distinguish between disturbances and malicious behavior. Our model was able to detect all previously unseen tested payloads with small delays, while not triggering false alarms in the presence of disturbances during normal operation.

    ACKNOWLEDGMENT

This project was supported by the U.S. Office of Naval Research under Award N00014-15-1-2182; and by the NYU Center for Cyber Security (New York and Abu Dhabi).

REFERENCES

[1] H. Ernie, M. Assante, and T. Conway, “An abbreviated history of automation & industrial control systems and cybersecurity,” SANS, 2014.

[2] R. Leszczyna, E. Egozcue, L. Tarrafeta, V. F. Villar, R. Estremera, and J. Alonso, “Protecting industrial control systems: recommendations for Europe and Member States,” 2011.

[3] K. Stouffer, J. Falco, and K. Scarfone, “Guide to industrial control systems (ICS) security,” NIST Special Publication, vol. 800, no. 82, 2011.

[4] E. Byres and J. Lowe, “The myths and facts behind cyber security risks for industrial control systems,” in Proceedings of the VDE Kongress, vol. 116, 2004, pp. 213–218.

[5] N. Falliere, L. O. Murchu, and E. Chien, “W32.Stuxnet dossier,” White paper, Symantec Corp., Security Response, vol. 5, 2011.

    [6] ICS-CERT, “ICS-CERT year in review,” [Online]: https://ics-cert.us-cert.gov/sites/default/files/documents/Year in Review FY2014 Final.pdf, 2014.

    [7] ICS-CERT, “ICS-CERT monitor,” [Online]: https://ics-cert.us-cert.gov/sites/default/files/Monitors/ICS-CERT%20Monitor Nov-Dec2015 S508C.pdf, 2015.

    [8] marketsandmarkets, “Industrial Control System (ICS) security market,”[Online]: http://www.marketsandmarkets.com/PressReleases/industrial-control-systems-security-ics, 2015.

    [9] C. Blask, “ICS cybersecurity: Water, water everywhere,” [Online]:http://www.infosecisland.com/blogview/18281-ICS-Cybersecurity-Water-Water-Everywhere.html, Nov. 2011.

[10] J. Robertson and M. Riley, “Mysterious ’08 Turkey pipeline blast opened new cyberwar,” [Online]: http://www.bloomberg.com/news/articles/2014-12-10/mysterious-08-turkey-pipeline-blast-opened-new-cyberwar, Dec. 2014.

[11] D. Kravets, “Feds: Hacker disabled offshore oil platforms’ leak-detection system,” [Online]: http://www.wired.com/2009/03/feds-hacker-dis/, Mar. 2009.

    [12] D. Kushner, “The real story of Stuxnet,” [Online]: http://spectrum.ieee.org/telecom/security/the-real-story-of-stuxnet, Feb. 2013.

[13] E. Kovacs, “Cyberattack on German steel plant caused significant damage,” [Online]: http://www.securityweek.com/cyberattack-german-steel-plant-causes-significant-damage-report, Dec. 2014.

[14] F. Khorrami, P. Krishnamurthy, and R. Karri, “Cybersecurity for control systems: a process-aware perspective,” IEEE Design & Test, vol. 33, no. 5, Oct. 2016.

[15] J. Weiss, “Assuring industrial control system (ICS) cyber security,” [Online]: http://csis.org/files/media/csis/pubs/080825 cyber.pdf.

[16] J. Downs and E. F. Vogel, “A plant-wide industrial process control problem,” Computers & Chemical Engineering, vol. 17, no. 3, pp. 245–255, 1993.

[17] N. L. Ricker, “Tennessee Eastman challenge archive,” http://depts.washington.edu/control/LARRY/TE/download.html.

[18] A. Keliris, C. Konstantinou, N. G. Tsoutsos, R. Baiad, and M. Maniatakos, “Enabling multi-layer cyber-security assessment of industrial control systems through hardware-in-the-loop testbeds,” in 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC). IEEE, 2016, pp. 511–518.

[19] M. Gjendemsjø, “Creating a weapon of mass disruption: Attacking programmable logic controllers,” 2013.

    [20] D. Bond, “3S CODESYS,” http://www.digitalbond.com/tools/basecamp/3s-codesys/, 2012.

[21] P. Mhaskar, J. Liu, and P. D. Christofides, Fault-Tolerant Process Control: Methods and Applications. Springer, 2012.

[22] M. S. Mahmoud and Y. Xia, Analysis and Synthesis of Fault-Tolerant Control Systems. Wiley, 2014.

[23] A. Kulkarni, V. K. Jayaraman, and B. D. Kulkarni, “Knowledge incorporated support vector machines to detect faults in Tennessee Eastman process,” Computers & Chemical Engineering, vol. 29, no. 10, pp. 2128–2133, 2005.

[24] Y. Zhang, “Enhanced statistical analysis of nonlinear processes using KPCA, KICA and SVM,” Chemical Engineering Science, vol. 64, no. 5, pp. 801–811, 2009.

[25] M. Burmester, E. Magkos, and V. Chrissikopoulos, “Modeling security in cyberphysical systems,” International Journal of Critical Infrastructure Protection, vol. 5, no. 3-4, pp. 118–126, Dec. 2012.

[26] C. G. Rieger, K. L. Moore, and T. L. Baldwin, “Resilient control systems: A multi-agent dynamic systems perspective,” in IEEE International Conference on Electro-Information Technology, EIT 2013, May 2013.

[27] M. B. Line, A. Zand, G. Stringhini, and R. Kemmerer, “Targeted attacks against industrial control systems,” in Proceedings of the 2nd Workshop on Smart Energy Grid Security (SEGS ’14), 2014.

[28] F. Pasqualetti, F. Dorfler, and F. Bullo, “Attack detection and identification in cyber-physical systems,” IEEE Trans. Automat. Contr., vol. 58, no. 11, pp. 2715–2729, Nov. 2013.

[29] T. Morris, A. Srivastava, B. Reaves, W. Gao, K. Pavurapu, and R. Reddi, “A control system testbed to validate critical infrastructure protection concepts,” International Journal of Critical Infrastructure Protection, vol. 4, no. 2, pp. 88–103, Aug. 2011.

[30] M. J. Mcdonald and B. T. Richardson, “Position paper: Modeling and simulation for process control system cyber security research, development and applications,” [Online]: http://cimic.rutgers.edu/positionPapers/MichaelMcdonald-paper.pdf.

[31] A. A. Cardenas, S. Amin, Z.-S. Lin, Y.-L. Huang, C.-Y. Huang, and S. Sastry, “Attacks against process control systems: risk assessment, detection, and response,” in Proceedings of the 6th ACM Symposium on Information, Computer and Communications Security (ASIACCS ’11), 2011.

[32] M. Krotofil and A. A. Cárdenas, “Resilience of process control systems to cyber-physical attacks,” in Secure IT Systems. Springer, 2013, pp. 166–182.


