Date posted: 31-Oct-2014 | Category: Technology | Uploaded by: energysec
Integrating Cyber Security Alerts into the Operator Display
Digital Bond, Inc. Michael Toecker, PE
EnergySec 9th Annual Security Summit
• Michael Toecker
Who Am I?
Monitoring and Response of Cyber Security Events Originating from the Control System Parallels the Monitoring and Response of Process Events
The Premise
• ICS Operations was similar to Security Operations
  ◦ ICS had alarms, SecOps had alarms
  ◦ ICS had events, SecOps had events
  ◦ ICS had historical points, SecOps had voluminous logs
  ◦ ICS had 24/7 Operators, SecOps had Analysts (some 24/7, others not)
  ◦ ICS had responsibility for monitoring safe and effective production, SecOps had responsibility for ensuring secure and trusted operations
I Spent a Year working as a Security Guy
in an Operations Environment
ICS Ops vs. SecOps
Task | SecOps | ICSOps
Visualizing Data using Graphs, Charts, etc. | X | X
Providing Status Indicators when parameters went out of normal | X | X
Directing Field Personnel to Take Specific Actions based on Events or Alarms | X | X
Reviewing Logs, Records, and Other Data to Improve Efficiency and Locate Problem Areas | X | X
Investigating for Compliance and Effect on Process, and finding ways to Prevent, Detect and Respond | X | X
What I Often Saw in ICS
Operations was Paralleled in What I was Doing
Parallels
• …was the data.
  ◦ My data was security logs; their data came from process points.
  ◦ But we were both identifying conditions that could impact our production or compliance, and taking some action to correct them
I was an Engineer with Specialized Knowledge of Specific Equipment
What was Different…
• There is an emphasis on procedure and process when faced with issues
• Troubleshooting where advanced knowledge is required is conducted by those with the knowledge
• Operators follow known actions that will return a system to a stable state, usually developed by process engineers.
Operators Monitor & Respond, but Do Not Always Possess Specific Knowledge
The Role of Operators
Why not add some Security?
• Lack of Understanding and Confusion about Computer Security
• Owner Attitude is that Security has nothing to Do with Operations
• Leads to Reduction in Situational Awareness
• Operators Don’t Know What Actions To Take
The Problems
• Proper Notification Reduces Response Time to Security Incidents
• Regulatory Requirements can be Met With Existing Personnel
• Alerts and Events directly to 24/7 Personnel look Awesome as Compensating Controls
The Benefits
Cyber Security Events & Incidents
Detectable w/ Security Monitoring
Security Events Operators Could Respond To
Not a Substitute for a Focused Security Monitoring Program
The Limitations
Monitor and Analyze
Identify Security Conditions
Identify Operational Events
Develop Procedures for Action
Implement Condition and Procedure
Security Monitoring Program Should Feed into Conditions for Operator Alerts
The Role of Security
Monitor Data Points
Identify Process Conditions
Identify Operational Events
Develop Procedures for Action
Implement Condition and Procedure
This looks a lot like Process Intelligence Process, the only difference is the Analysis and Knowledge
…wait a minute.
Identify Specific Clear Cyber Security Events
Determine Events Appropriate for Operator Attention
Create Operations Procedures for Actions
Develop a Detection and Presentation Strategy
The Process
Part 1
Identify Operational Events
…Clear
  • No Ambiguity
  • Straightforward Yes/No Decision Point
…Derivable
  • Sourced Directly from Control Systems Security Data, not from Intuition or Analysis
…Actionable
  • Specific Actions can be taken on receipt of the Event
  • Not Dependent on Other Events, or on Further Analysis
An Operational Cyber Security Event Should Be...
Identify and Define
• Questions to Ask
  ◦ What do my regulations tell me to be concerned with?
  ◦ What do various standards bodies tell me to be concerned with?
  ◦ Do I have specific policy statements that suggest alerting, or 24/7 response?
  ◦ What Lessons Learned Do I have related to Cyber Security?
Identify Cyber Security Conditions to Alert On
Identifying Events
This is my polite way of saying “If You Got Hacked, How Did it Happen?”
List of Security Conditions
Regulations Require Monitoring and Action
Standards suggest an Approach
Security Policy may Specify Conditions
Lessons Learned from Security Incidents
Determine Conditions
• CIP-007 R4 – Malicious Software Prevention
  ◦ Paraphrase: ~...shall use anti-virus software to detect malware on all Cyber Assets within the ESP~
  ◦ Conclusion: I should alert on anti-virus detections
• CIP-005 R3 – Monitoring Electronic Access
  ◦ Paraphrase: ~monitoring processes shall detect and alert for attempts at or actual unauthorized access~
  ◦ Conclusion: Attempts at unauthorized access include incorrect passwords; alert on that.
Regulations, such as NERC CIP, may provide clues as to what events should be monitored
Well, I did say clues…
Regulations
Source: NERC CIP Standards, V3
• Section 3.2.2 – Signs of an Incident
  ◦ ~Too many indicators exist to exhaustively list them~
  ◦ ~Common ones include multiple failed login attempts, deviations from normal network traffic, filenames with unusual characters...~
Standards can help as well, but they are still clues, not firm guidance
Standards
Source: NIST SP 800-61
• What I’ve seen in the past:
  ◦ ~Addition and Modifications of Users shall be conducted through the change control process~
  ◦ ~New Software on Control Systems requires approval by the Senior Manager~
Conditions may exist in your corporation’s IT Security Policies
Policy Remarks
• Good Security Comes with Experience,
  ◦ Most Experience Comes from Failures in Security
• …but it doesn’t have to be YOUR Failures in Security
  ◦ Talk, Listen, Learn
Why Information Sharing is Important.
Lessons Learned
There are tons of events available, but not all are relevant or appropriate for Operations
Complex, Irrelevant
Top Down
• Start from general security conditions
• Trim to Specific Events within those categories
Bottom Up
• Start with Every Potential Event that Could Be Generated
• Trim to Specific Events from the Potentials
There are Two Main Methods for Identifying Events
Methods to Identify
• Specific Classes of Computer Security Events
  ◦ Virus Detection, Failed Logins, Disallowed Ports, etc.
  ◦ Good Source of Some Classes – NIST SP 800-53
• Useful for PC based systems, which often have a huge amount of capacity for security
Top Down is Good For Systems with Many Potential Events
Top Down Approach
Cyber Security Event Class: Virus Detection
Top Down Example
• Enumerate the Security Capabilities of the Device. Examples:
  ◦ Provides Specific Syslog Evidence
  ◦ Sets a Point when a Login Threshold has been reached
• Useful for Devices, where Capability is often limited
Bottom Up is Good for Systems with Limited Capability for Security
Bottom Up Approach
Review of Manuals and Datasheets can identify detectable Cyber Security Events
Bottom Up Example
Source: S&C IntelliRupter Instruction Sheet 766-560
• Top Down
  ◦ Allows you to set criteria, and then delve into system to find triggers to meet it
  ◦ Avoids the complexity of getting into the weeds of system events
  ◦ May miss important conditions due to avoiding those same weeds
• Bottom Up
  ◦ Complex, but most Detailed
  ◦ Requires analysis of many events that will likely never make it in front of an operator
There are advantages and disadvantages of each Approach
Compare and Contrast
• Windows Based Computers are the obvious systems to use Top Down
  ◦ Event Heavy, Highly Complex
  ◦ Events were designed from an incident response perspective, not from an alert perspective
Use Top Down when a system is highly capable of reporting security events to narrow your range
When to Use an Approach
• Systems like PLCs, Controllers, some Network Devices have limited capability to report security status
  ◦ Won’t be able to simply define events, you’ll have to work with what’s there
Use Bottom Up when working with devices that report on few security conditions
When to Use an Approach
Condition | Source
Anti-Virus Detection | NERC CIP-007 R4
User Modified or Added | NERC CIP-007 R5
Security Logs Deleted | NERC CIP-007 R6
Security Logs Full | NERC CIP-007 R6
Excessive Incorrect Login | NERC CIP-007 R6
Use of Removable Media | Good Practice
New Software Installed | IT Policy
Logging Options Changed | IT Policy
The End Result of this Analysis is a List of Conditions to Alert On
List of Conditions
Note: This list is far from comprehensive
Part 2
Appropriate for Operators
• Is the Condition a Clear Cyber Security Event?
• Is the Condition Derivable directly from Logs, Alerts, and other evidence?
• Is the Condition Actionable by Operators?
Not Every Condition is Appropriate for Operator Notification
Appropriate for Operators
Condition | Source
Anti-Virus Detection | NERC CIP-007 R4
User Modified or Added | NERC CIP-007 R5
Security Logs Deleted | NERC CIP-007 R6
Security Logs Full | NERC CIP-007 R6
Excessive Incorrect Login | NERC CIP-007 R6
Use of Removable Media | Good Practice
New Software Installed | IT Policy
Logging Options Changed | IT Policy
Unclear Conditions are Removed from the List
Is it Clear?
Note: This list is far from comprehensive
Condition | Source
Anti-Virus Detection | NERC CIP-007 R4
Security Logs Deleted | NERC CIP-007 R6
Security Logs Full | NERC CIP-007 R6
Excessive Incorrect Login | NERC CIP-007 R6
Use of Removable Media | Lesson Learned
Remove Conditions that Cannot be Derived from Evidence, or that Require Analysis
Is it Derivable?
Note: This list is far from comprehensive
Condition | Detection Method | Reliability
Anti-Virus Detection | Windows Event Log | Very Reliable; test indicates an event is generated on each detection in the SYSTEM log
Security Log Deleted | Windows Event Log | Very Reliable; an explicit event is created on clearing
Excessive Incorrect Login | Windows Event Log | Reliable, so long as the account lockout settings in SECPOL.msc are set correctly
Use of Removable Media | May require 3rd party program | Not Always Possible without 3rd Party Program
How Reliable are the Detection Methods? Do they have potential false positives?
Reliable and Unreliable Conditions
Note: This list is far from comprehensive
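As a hedged illustration of how the Windows Event Log detections above might be keyed, the sketch below maps two well-known Security log event IDs to conditions: 1102 (the audit log was cleared) and 4740 (a user account was locked out), both as used on Windows Vista / Server 2008 and later. The record format (plain dicts with `log` and `event_id`), the `classify` helper and the `av_event_ids` parameter are assumptions made for this example. Anti-virus event IDs are product-specific, so none is hard-coded.

```python
# Sketch only: maps Windows event IDs to operator conditions.
# A single failed logon (4625) is deliberately not mapped: one bad
# password is not "excessive"; the lockout event (4740) is the signal.
SECURITY_LOG_CONDITIONS = {
    1102: "Security Log Deleted",       # "The audit log was cleared"
    4740: "Excessive Incorrect Login",  # "A user account was locked out"
}


def classify(record, av_event_ids=frozenset()):
    """Return the condition name for one event record, or None.

    `record` is a hypothetical dict such as
    {"log": "Security", "event_id": 4740}.
    `av_event_ids` holds the IDs your anti-virus product writes on
    detection (take these from the vendor documentation).
    """
    event_id = record["event_id"]
    if record["log"] == "Security" and event_id in SECURITY_LOG_CONDITIONS:
        return SECURITY_LOG_CONDITIONS[event_id]
    if event_id in av_event_ids:
        return "Anti-Virus Detection"
    return None
```

Anything that returns `None` stays with the security monitoring program and never reaches the operator.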
Remove Conditions that an Operator cannot Realistically take Action On
Is it Actionable?
Condition | Source
Anti-Virus Detection | NERC CIP-007 R4
Security Logs Deleted | NERC CIP-007 R6
Security Logs Full | NERC CIP-007 R6
Excessive Incorrect Login | NERC CIP-007 R6
Why were some of the conditions removed?
An Aside…
Condition | Reason for Removal
User Modified or Added | Not Clear, as there are legitimate reasons for adding or modifying a User, and these reasons aren’t apparent without analysis.
Security Log Full | Not Actionable, as operators should not be doing maintenance and admin functions.
Removable Media | Not Derivable on most systems as is. May require a 3rd party program to do a decent job of this.
Condition | Source
Anti-Virus Detection | NERC CIP-007 R4
User Modified or Added | NERC CIP-007 R5
Security Logs Deleted | NERC CIP-007 R6
Security Logs Full | NERC CIP-007 R6
Excessive Incorrect Login | NERC CIP-007 R6
Use of Removable Media | Good Practice
New Software Installed | IT Policy
Logging Options Changed | IT Policy
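The Clear / Derivable / Actionable screening can be sketched as a three-stage filter over the condition list. This is a minimal illustration, not the author's tooling. The `Condition` class, its `fails` field and the `screen` function are hypothetical names. The "clear" failures recorded for New Software Installed and Logging Options Changed are inferred from their absence after the "Is it Clear?" step, not stated on the slides.

```python
from dataclasses import dataclass
from typing import Optional

CRITERIA = ("clear", "derivable", "actionable")


@dataclass(frozen=True)
class Condition:
    name: str
    source: str
    fails: Optional[str] = None  # first criterion the condition fails, if any


CONDITIONS = [
    Condition("Anti-Virus Detection", "NERC CIP-007 R4"),
    Condition("User Modified or Added", "NERC CIP-007 R5", fails="clear"),
    Condition("Security Logs Deleted", "NERC CIP-007 R6"),
    Condition("Security Logs Full", "NERC CIP-007 R6", fails="actionable"),
    Condition("Excessive Incorrect Login", "NERC CIP-007 R6"),
    Condition("Use of Removable Media", "Good Practice", fails="derivable"),
    Condition("New Software Installed", "IT Policy", fails="clear"),      # inferred
    Condition("Logging Options Changed", "IT Policy", fails="clear"),     # inferred
]


def screen(conditions):
    """Apply the three criteria in order; return survivors after each stage."""
    remaining = list(conditions)
    stages = {}
    for criterion in CRITERIA:
        remaining = [c for c in remaining if c.fails != criterion]
        stages[criterion] = [c.name for c in remaining]
    return stages
```

Calling `screen(CONDITIONS)["actionable"]` leaves the three conditions carried forward into the procedure step: anti-virus detection, security logs deleted, and excessive incorrect login.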
• Example: Removable Media Detection
  ◦ Wasn’t able to do this in Native Windows in a Clear and Derivable manner
  ◦ Use of Third Party tools can change this, making it possible to monitor and alert
A Previously Rejected Condition can become valid with New Information or Technology
When Conditions Change
• USB Based Infection Lesson Learned
  ◦ New USB Showed up in Registry Change
  ◦ Auto-Run Shows up in Registry Change
  ◦ Addition of Programs to the “Run” and “RunOnce” keys in the Registry
  ◦ Copying of Files into “System”, “System32”
• Is this Clear? Definable? Actionable?
Some of the More Advanced Conditions That We Can Define
Let’s Get Crazy…
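One way the “Run” / “RunOnce” condition might be derived is by comparing two snapshots of those keys and reporting anything added or changed. The sketch below assumes the snapshots have already been collected into plain dicts (how they are collected is out of scope here), and `new_autoruns` is a hypothetical helper name. The two key paths are the standard machine-wide autorun locations on Windows.

```python
RUN_KEYS = (
    r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run",
    r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce",
)


def new_autoruns(before, after):
    """Return (key, value_name, command) for entries added or changed.

    `before` and `after` are hypothetical snapshots shaped as
    {key_path: {value_name: command}}.
    """
    findings = []
    for key in RUN_KEYS:
        old = before.get(key, {})
        for name, command in sorted(after.get(key, {}).items()):
            if old.get(name) != command:
                findings.append((key, name, command))
    return findings
```

A non-empty result is Derivable, but whether it is Clear still depends on whether the change went through change control, which is the question the slide poses.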
List of Conditions has been generated, what next?
What Comes Next?
Condition | Detection Method | Reliability
Anti-Virus Detection | Windows Event Log | Very Reliable; test indicates an event is generated on each detection in the SYSTEM log
Event Log Was Cleared | Windows Event Log | Very Reliable; an explicit event is created on clearing
Excessive Incorrect Login | Windows Event Log | Reliable, so long as the account lockout settings in SECPOL.msc are set correctly
Part 3
Create Operations Procedures
• Notifying Operators of Cyber Security Events is useless if the Operator has no action to take
• This guidance typically takes the form of Operational Procedures
• Each Event must have an appropriate action to be taken
This is Now a Procedure Exercise
Operator Actions
• “Notify Lead I&C Engineer by Phone”
• “Isolate Infected System From Network by Disconnecting Ethernet”
• “Call Out via Radio to check if invalid login is from authorized user”
Be Succinct and Specific
Guidelines for Actions
• No IT Administrative Functions
• No Maintenance Functions
• Limit the Analysis Necessary
• …and don’t give them someone else’s work
Keep the Guidance within Operator’s Authorized Abilities
Guidelines for Actions
• Personnel Responsible
• Trigger
• Actions
• Documentation
An Operating Procedure has a few common characteristics
Operating Procedures
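The four characteristics can be captured in a small record, which also makes it easy to check the earlier rule that each event must have an action. This is a sketch under stated assumptions. The `OperatingProcedure` class and `missing_procedures` helper are hypothetical. The action strings are quoted from the slides, but their pairing with a trigger, the "Control Room Operator" role and the documentation text are illustrative placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingProcedure:
    trigger: str        # alert condition that starts the procedure
    responsible: str    # personnel responsible
    actions: tuple      # succinct, specific steps
    documentation: str  # what gets recorded afterwards


PROCEDURES = [
    OperatingProcedure(
        trigger="Anti-Virus Detection",
        responsible="Control Room Operator",  # placeholder role
        actions=(
            "Isolate Infected System From Network by Disconnecting Ethernet",
            "Notify Lead I&C Engineer by Phone",
        ),
        documentation="Entry in the operator log",  # placeholder
    ),
    OperatingProcedure(
        trigger="Excessive Incorrect Login",
        responsible="Control Room Operator",  # placeholder role
        actions=(
            "Call Out via Radio to check if invalid login is from authorized user",
        ),
        documentation="Entry in the operator log",  # placeholder
    ),
]


def missing_procedures(conditions, procedures):
    """Return the alert conditions that have no procedure yet."""
    covered = {p.trigger for p in procedures}
    return [c for c in conditions if c not in covered]
```

With the two procedures above, checking the three remaining conditions reports "Event Log Was Cleared" as still lacking a procedure, so that alert is not yet ready to put in front of an operator.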
Example Operating Procedure
Bring up Example Procedure
• Case in Point – Conficker (MS08-067)
  ◦ Highly Aggressive Worm which impacts network communication
  ◦ Makes use of very reliable exploit in Server service
  ◦ Attempts to brute force accounts
  ◦ Spreads over USB and removable media as well
Some Cyber Security Events may Cause Production Impacts
Worst Case Scenario
• A Highly Aggressive worm like Conficker can have production consequences.
  ◦ Continuing to operate while this is going on is risky.
  ◦ Who makes the decision to halt production? Operator? Shift Supervisor? Plant Manager?
• Make sure the information gets to those who make the decision.
What guidance would prepare an operator for these Alarms?
Worst Case Scenario
Section 5
Present to Operator
• Most Cited:
  ◦ The Alarm Management Handbook
  ◦ The High-Performance HMI Handbook
• Written by Bill Hollifield and Paul Gruhn
  ◦ Of Course, Nothing Specific on Security
There is already a lot of guidance on development of Operator Displays
Operator Displays
• Help Operators Perceive the Important Security Data
• Give Operators Data-in-Context
• Help Them Comprehend the Situation in Terms of the Process
• Help Predict Future Status by Providing Trending
Guidelines for Cyber Security Displays
Operator Displays
- Tough right now… At least without giving access to an SIEM
Cyber Security Master Display
Anti-Virus Status Display
Users Status Display
Removable Media Status Display
Event Log Status Display
Concept Operator Display
Mock Up
• Many HMIs can accept SNMP Traps
  ◦ Often used for alerting when hosts stop communicating
  ◦ Security tools can feed this, in certain conditions
• Security Logs don’t Translate Well into traditional displays
  ◦ How do you ‘trend’ when you have thousands of event ids?
Summary: Limited, and Nowhere Near Ideal
Integration with the HMI
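One possible answer to the trending question is to stop trending raw event IDs: collapse them first into the handful of operator conditions, then count occurrences per hour so each condition becomes a single numeric point an HMI could trend like a process value. The sketch below is an assumption about how that might look, not something the slides prescribe. `hourly_counts` and the `(timestamp, condition)` input format are hypothetical.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(events):
    """Count events per (hour, condition).

    `events` is an iterable of (timestamp, condition_name) pairs, where
    the raw event ID has already been mapped to a condition name.
    """
    counts = Counter()
    for timestamp, condition in events:
        # Truncate to the start of the hour to form the trend bucket.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        counts[(hour, condition)] += 1
    return counts
```

Each `(hour, condition)` count could then be written to a historian tag or sent as an SNMP trap when it crosses a threshold, depending on what the HMI supports.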
Thanks, Mike
Questions?
More Research at S4
• Digital Bond’s S4 Conference in Miami Beach, January 2014
• Got an Idea?
  ◦ Submit a presentation!
• Details on DigitalBond.com