Date post: | 16-Apr-2017 |
Category: |
Technology |
Apache Metron: Community Driven Cyber Security
Ned Shawa & Laurence Da Luz
Hadoop Summit Melbourne - 2016
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Apache Metron Introduction
User Personas & Key Functional Themes
Capabilities and Architecture
Building a Use Case in Metron
Questions
Apache Metron Introduction
Apache Metron Vision
“Apache Metron is a Security Data Analytics Platform (SDAP). As a next
generation security analytics framework, it is designed to consume
and monitor network traffic and machine data within an enterprise
environment. Apache Metron is extensible and is designed to work at a massive scale. It is not a SIEM but
rather the next evolution of a SIEM.”
Cyber Security – Today’s Enterprise Threat
Organizations have recently become targets of complex cyber security breaches that could have been easily prevented
Cyber attacks continuously become more advanced and go undetected using traditional IT security policies and procedures
Cyber Security attacks have increased in visibility and targeted organizations with millions of customers – costing millions in privacy damages
Recent cyber security attacks have been led by highly skilled technical organizations where the attack could have been prevented by known solutions
62 % - Increase in Cyber Security Breaches since 2013
8 months – Average time an advanced security breach goes unnoticed
3 Trillion – Total cost of Cyber Security breaches
1 in 3 – Security professionals are not familiar with cyber security threats
2014 ISACA
Apache Metron – Community Driven Cyber Security
Security Data Lake
Enriched 360 Correlated Searchable Discoverable
Threat Intelligence
3rd Party Feeds Static Rules ML Models IOC Sharing
Pluggable Framework
Parsers Enrichers Threat Intel UI Widgets
Security Application
PCAP Replay Evidence Store Hunting Platform
Apache Metron
Hortonworks and the Apache Metron Community are focused on delivering the next generation cyber security
platform to enable organizations to enrich and analyze all data within their enterprise
Apache Metron – How We Got Here
Apache Metron – Who’s Involved
Apache Metron – Capabilities Overview
Real-Time Security Stream Processing Pipeline
User Personas & Functional Themes
Metron User Personas
Metron’s Key Functional Themes
Platform: Work done to harden the platform for performance, scale, extensibility and maintainability. This also includes capabilities around provisioning, managing and monitoring the application.
Data Collection: Set of data sources that Metron provides capabilities to stream, ingest and parse into the platform.
Data Processing: A set of Storm topologies that perform various actions in real time, including normalization of telemetry data, enrichment, cross-referencing with threat intel feeds, alerting, indexing, and persisting into historical stores.
Data/Integration Services
Portals/UI: Set of portals, dashboards and user interfaces for the different personas.
Apache Metron Logical Architecture
[Figure: Data Collection pipeline in four stages. Source Systems (network traffic, SSH, system log, HTTP(S), file system, email) are collected through PCAP, NiFi processors and Flume. Message Queue (Kafka) holds one topic per source: PCAP, Email, SSH, SysLog, HTTP, DPI. Stream Process and Enrichment (Storm & Spark) runs one topology per topic: PCAP, Email, SSH, SysLog, HTTP, DPI. Data Access is provided by Hive, Solr and HBase over raw (historical) data, the data index and PCAP data.]
Ability to ingest and process over 1.2 million events per second
Capabilities and Architecture
Apache Metron 0.2
Metron 0.2 Streaming and Enrichment
Metron 0.2 Data Ingestion Architecture
Metron 0.2 Parsing / Normalization Topology
Key Points:
• Each new telemetry data source will have its own parser topology
• Two types of parsers available in TP2: Grok and Java
Metron 0.2 Parser Types
Metron parser:
– Input: Read native format data from Kafka topic
– Output: Normalized Metron JSON Object
Grok Parser
– Suitable for structured or semi-structured logs
– Regex-like syntax with pre-defined mappings (less effort to implement)
– Good for lower volumes of data
Java Parser
– Requires custom code (more effort to implement)
– Good for higher volumes of data
Metron 0.2 Enrichment Topology
• Enrich: Add additional information to raw source during streaming
• In-built Geo enrichment (IP to coordinates + City/State/Country)
• Streaming: Allows ML models to score in real time instead of batch
• Threat Intel: Flag alerts against intel feed & determine triage
Stellar Framework
What is it?
– A powerful framework that provides a custom DSL used across different Metron components for querying, transformation and configuring rules.
Why do we need it?
– A variety of components need to determine whether a condition is true and, if so, perform some action.
– For those purposes, this framework provides the DSL to create those conditions and execute a set of actions.
How is Stellar used within Metron today?
1. Filtering, transformations and validations in parser topologies
2. Threat Triage: allocating scores to certain rules based on conditions
3. PCAP CLI: query for pcap data
What does Stellar consist of?
Query Language
• Referencing fields in the enriched JSON
• Simple boolean operations: and, not, or
• Simple comparison operations: <, >, <=, >=
• Determining whether a field exists (via exists)
• Parentheses to make order of operations explicit
• A fixed set of functions which take strings and return boolean, including:
– IN_SUBNET, IS_EMPTY, STARTS_WITH, ENDS_WITH, REGEXP_MATCH, IS_IP, IS_DOMAIN, IS_EMAIL, IS_URL, IS_DATE, IS_INTEGER
• E.g.: IN_SUBNET( ip, '192.168.0.0/24') or ip in [ '10.0.0.1', '10.0.0.2' ] or exists(is_local)
Transformation Language
• A fixed set of transformation functions, including:
– TO_LOWER, TO_UPPER, TO_INTEGER, TO_DOUBLE, TRIM, JOIN, SPLIT, GET_FIRST, GET_LAST, GET, MAP_GET, DOMAIN_TO_TLD, DOMAIN_REMOVE_TLD, URL_TO_HOST, URL_TO_PROTOCOL, URL_TO_PORT, URL_TO_PATH, TO_EPOCH_TIMESTAMP
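To make the query example concrete, here is a rough Python sketch of how the predicate IN_SUBNET( ip, '192.168.0.0/24') or ip in [ '10.0.0.1', '10.0.0.2' ] or exists(is_local) could be evaluated against a single message. It is not how Stellar itself is implemented; the helper names in_subnet, exists and matches, and the sample messages, are assumptions made up for illustration.

```python
import ipaddress


def in_subnet(ip, cidr):
    """Illustrative stand-in for IN_SUBNET: is the address inside the CIDR block?"""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


def exists(message, field):
    """Illustrative stand-in for exists: is the field present in the message?"""
    return field in message


def matches(message):
    """Evaluate the slide's example predicate against one message (a dict)."""
    ip = message.get("ip")
    return (
        (ip is not None and in_subnet(ip, "192.168.0.0/24"))
        or ip in ["10.0.0.1", "10.0.0.2"]
        or exists(message, "is_local")
    )


if __name__ == "__main__":
    samples = [
        {"ip": "192.168.0.42"},
        {"ip": "10.0.0.2"},
        {"ip": "8.8.8.8"},
        {"ip": "8.8.8.8", "is_local": True},
    ]
    for msg in samples:
        print(msg, matches(msg))
```

Only the third sample fails all three conditions, which is the behaviour the boolean "or" chain in the slide's example implies.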
Metron 0.2 Metron JSON Object
Numerous sensors log in different formats. The parser should normalize at least the following subset of fields to the following Metron JSON naming conventions:
Metron 0.2 Metron UI with Kibana 4
Building a Use Case in Metron
Squid Logs (Metron Reference App)
Metron Reference Application: Squid Sensor
What is the Reference App?
– A use case that showcases the following:
1. How to add telemetry events from a new data source (Squid), which covers parsing, filtering, transformation and validation
2. How to see the new events in the Metron UI
3. How to enrich the telemetry events
4. How to do threat intel cross-reference checks against events
5. How to raise alerts
6. How to persist (index, long-term storage) the events
Why do we need it?
– Similar to the famous Java Pet Store app, it provides an app that is constantly updated to showcase new features.
What are the updates to the Metron Reference App with TP2?
– Using the Stellar framework to filter, transform and validate events
– How to work with the new Metron UI to display new events
– Using the Stellar framework to do threat triage
– Streaming enrichments
How do you consume it?
https://cwiki.apache.org/confluence/display/METRON/Metron+Reference+Application
Use Case Setup
• Scenario
– Customer Foo has installed Metron TP2 and they are using the out-of-the-box data sources (PCAP, YAF/Netflow, Snort and Bro). They love Metron!
– But now they want to add a new data source to the platform: Squid proxy logs.
• Customer Foo's requirements are the following:
1. Need to ingest the proxy events from Squid logs in real time
2. The proxy logs have to be parsed into a standardized JSON structure that Metron can understand
3. In real time, the Squid proxy event needs to be enriched with domain/WHOIS information (domain, cert, country, company)
4. In real time, the domain of the proxy event must be checked against threat intel feeds
5. If there is a threat intel hit, an alert needs to be raised
6. The system should provide the ability to configure rules via a custom DSL to prioritize/score different types of alerts
7. The end user must be able to see the new telemetry events and the alerts from the new data source
Metron 0.2 Squid Use Case
Configuration Driven
Step 1a Create Topic
Step 1b NiFi TailFile
Step 2 Define Parser
Step 3 Enrichment Config
Step 4 Configure Alerts
Step 5 Create Dashboard
Squid & its Telemetry Event
• What is Squid?
– Squid is a caching proxy for the Web supporting HTTP, HTTPS, FTP, and more. It reduces bandwidth and improves response times by caching and reusing frequently requested web pages.
• What does a Squid access log look like?
– When you make an outbound HTTP connection to http://www.cnn.com, the following entry is added to a file called access.log:
1461576382.642 161 98.220.218.158 TCP_MISS/200 103701 GET http://www.cnn.com/ - DIRECT/199.27.79.73 text/html
– Fields highlighted in this entry: Unix epoch time, IP of host where connection was made, domain name of the outbound connection
What Metron does to the Squid telemetry in real time
• Convert from Unix epoch to timestamp
• Asset enrichment to enrich IP (hostname, type of device)
• WHOIS enrichment to look up domain name information
• Threat intel to cross-reference IP with intel feed to see if there is a hit
• Index the event into Elastic and persist in HDFS (Security Data Vault)
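As a small illustration of the first step (converting from Unix epoch to a timestamp), the sketch below converts the sample access.log value 1461576382.642 into a UTC datetime and into epoch milliseconds. The function names are illustrative assumptions; Metron's own conversion logic is not shown in the slides.

```python
from datetime import datetime, timezone


def epoch_to_utc(epoch_seconds):
    """Convert Squid's epoch seconds (with fractional part) to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def epoch_to_millis(epoch_seconds):
    """Convert epoch seconds to whole epoch milliseconds."""
    return round(epoch_seconds * 1000)


ts = epoch_to_utc(1461576382.642)
millis = epoch_to_millis(1461576382.642)

if __name__ == "__main__":
    print(ts.isoformat())
    print(millis)
```

The sample entry therefore corresponds to 25 April 2016, 09:26:22 UTC.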
Step 1 Telemetry Ingest
Step 1a Create Topic in Kafka
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper $ZOOKEEPER_HOST:2181 --create --topic squid --partitions 1 --replication-factor 1
Step 1b NiFi TailFile
Ingest Squid logs into the squid Kafka topic via NiFi
Step 2 Configuring the Squid Parser
Defining the Grok Filter for the Squid data
• Grok vs Java: no custom code
• Suitable for structured or semi-structured logs
• Pre-defined mappings, e.g. TIMESTAMP_ISO8601, NUMBER, WORD, HOSTNAME, IP, USERNAME
• Regex-based
Sample Squid log entry:
1461576382.642 161 98.220.218.158 TCP_MISS/200 103701 GET http://www.cnn.com/ - DIRECT/199.27.79.73 text/html
Squid Grok Filter:
SQUID_DELIMITED %{NUMBER:timestamp}.*%{INT:elapsed} %{IP:ip_src_address} %{WORD:action}/%{NUMBER:code} %{NUMBER:bytes} %{WORD:method} %{NOTSPACE:url}.*%{IP:ip_dst_addr}
Pre-defined Grok mappings for IP address and URL are used
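For readers less familiar with Grok, the sketch below approximates the SQUID_DELIMITED filter with a plain Python regular expression and applies it to the sample log entry. It is only an approximation: the hand-written sub-patterns are simplified stand-ins for Grok's pre-defined NUMBER, INT, IP, WORD and NOTSPACE mappings, and the function name parse_squid_line is an assumption. The captured field names mirror those in the slide's filter.

```python
import json
import re

# Simplified IPv4 pattern standing in for Grok's pre-defined IP mapping.
IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"

SQUID_LINE = re.compile(
    r"(?P<timestamp>\d+(?:\.\d+)?)\s+"      # %{NUMBER:timestamp}
    r"(?P<elapsed>\d+)\s+"                  # %{INT:elapsed}
    rf"(?P<ip_src_address>{IPV4})\s+"       # %{IP:ip_src_address}
    r"(?P<action>\w+)/(?P<code>\d+)\s+"     # %{WORD:action}/%{NUMBER:code}
    r"(?P<bytes>\d+)\s+"                    # %{NUMBER:bytes}
    r"(?P<method>\w+)\s+"                   # %{WORD:method}
    r"(?P<url>\S+)"                         # %{NOTSPACE:url}
    rf".*?/(?P<ip_dst_addr>{IPV4})"         # .*%{IP:ip_dst_addr}
)


def parse_squid_line(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = SQUID_LINE.match(line)
    return match.groupdict() if match else None


SAMPLE = ("1461576382.642 161 98.220.218.158 TCP_MISS/200 103701 GET "
          "http://www.cnn.com/ - DIRECT/199.27.79.73 text/html")

if __name__ == "__main__":
    print(json.dumps(parse_squid_line(SAMPLE), indent=2))
```

All values come out as strings here; any type conversion would be a separate step.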
Step 2 Configuring the Squid Parser
Squid Parser and Transform Configuration
• Configuration stored in ZooKeeper
• Configure parser and field transformations
Parser Configurations
– Kafka Topic
– Filter Location
Field Transformations
– Stellar Transformation Language: create 2 additional fields by applying URL_TO_HOST and DOMAIN_REMOVE_SUBDOMAINS
Stellar Transformation Language
DOMAIN_TO_TLD(domain)
DOMAIN_REMOVE_TLD(domain)
URL_TO_HOST(url)
URL_TO_PROTOCOL(url)
…
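The two field transformations can be pictured with the following Python sketch. It is a loose imitation, not Stellar's code: url_to_host and domain_remove_subdomains are hypothetical helpers, the subdomain stripping simply keeps the last two labels (which is wrong for suffixes such as .co.uk), and the output field name full_hostname is an assumption. The name domain_without_subdomains is the one used in the enrichment configuration later in the deck.

```python
from urllib.parse import urlparse


def url_to_host(url):
    """Illustrative stand-in for URL_TO_HOST: extract the host part of a URL."""
    return urlparse(url).hostname


def domain_remove_subdomains(domain):
    """Naive stand-in for DOMAIN_REMOVE_SUBDOMAINS: keep only the last two labels."""
    return ".".join(domain.split(".")[-2:])


def apply_transformations(event):
    """Add the two derived fields to a copy of the parsed event."""
    transformed = dict(event)
    host = url_to_host(event["url"])
    transformed["full_hostname"] = host  # assumed field name
    transformed["domain_without_subdomains"] = domain_remove_subdomains(host)
    return transformed


if __name__ == "__main__":
    print(apply_transformations({"url": "http://www.cnn.com/"}))
```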
Data Ingestion Checkpoint / Tracing an event
Raw Source Data → Metron Storm Parsing → Metron JSON Object
• Numerous sensor logs in different formats
• The parser normalizes a subset of fields
• Data is parsed into the Metron JSON Object
1462366408966.966 161 127.0.0.1 TCP_MISS/200 105413 GET http://www.cnn.com/ - DIRECT/199.27.79.73 text/html
Step 3 Configure Real-time Enrichment
Enriching events with WHOIS information
Bulk Load or Streaming
• Enrichment reference data stored in HBase
• Configuration stored in ZooKeeper
• WHOIS data bulk loaded using Metron framework
• Sample WHOIS data used:
google.com, "Google Inc.", "US", "Dns Admin",874306800000
work.net, "", "US", "PERFECT PRIVACY, LLC",788706000000
capitalone.com, "Capital One Services, Inc.", "US", "Domain Manager",795081600000
cisco.com, "Cisco Technology Inc.", "US", "Info Sec",547988400000
cnn.com, "Turner Broadcasting System, Inc.", "US", "Domain Name Manager",748695600000
Step 3 Configure Real-time Enrichment
Metron Enrichment Bulk Loader Utility
Extractor Configuration (map columns to enrichment data source; identify column to match on):
{
  "config" : {
    "columns" : {
      "domain" : 0,
      "owner" : 1,
      "home_country" : 2,
      "registrar" : 3,
      "domain_created_timestamp" : 4
    },
    "indicator_column" : "domain",
    "type" : "whois",
    "separator" : ","
  },
  "extractor" : "CSV"
}
Enrichment Configuration (configure field to enrichment type mapping):
{
  "zkQuorum" : "$ZOOKEEPER_HOST:2181",
  "sensorToFieldList" : {
    "squid" : {
      "type" : "ENRICHMENT",
      "fieldToEnrichmentTypes" : {
        "domain_without_subdomains" : [ "whois" ]
      }
    }
  }
}
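The effect of these two configurations can be sketched in Python as follows. This is a toy model under stated assumptions: an in-memory dict stands in for the HBase reference table, the function names are invented, and the "whois.<column>" key format for enriched fields is hypothetical rather than Metron's actual naming. The column mapping, indicator column and sample rows are taken from the slides.

```python
import csv
import io

# Column mapping and indicator column, as in the extractor configuration.
COLUMNS = {"domain": 0, "owner": 1, "home_country": 2,
           "registrar": 3, "domain_created_timestamp": 4}
INDICATOR_COLUMN = "domain"

# Two rows of the sample WHOIS data from the slides.
WHOIS_CSV = '''google.com, "Google Inc.", "US", "Dns Admin",874306800000
cnn.com, "Turner Broadcasting System, Inc.", "US", "Domain Name Manager",748695600000
'''


def load_reference_data(text):
    """Parse the CSV and index each record by its indicator column."""
    table = {}
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue
        record = {name: row[idx] for name, idx in COLUMNS.items()}
        table[record[INDICATOR_COLUMN]] = record
    return table


def enrich(event, table, field="domain_without_subdomains"):
    """Copy the event and attach WHOIS columns when the field matches a record."""
    enriched = dict(event)
    record = table.get(event.get(field))
    if record:
        for name, value in record.items():
            if name != INDICATOR_COLUMN:
                enriched[f"whois.{name}"] = value  # hypothetical key format
    return enriched


if __name__ == "__main__":
    whois = load_reference_data(WHOIS_CSV)
    print(enrich({"domain_without_subdomains": "cnn.com"}, whois))
```

Events whose domain has no matching record pass through unchanged.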
Data Enrichment Checkpoint / Tracing an event
Metron JSON Object → Enriched Metron object
• Enrichment data is added to the Metron JSON Object
• Data Enrichment: each event is enriched with WHOIS data based on domain mapping (Owner, Country, Registrar)
Step 4 Configure Threat Intel and Alerting
Alerts via Threat Intel Feeds
• Threat intel feeds are either bulk loaded or streamed
• Similar to enrichment framework
• Mapping to flag any matches between the threat feed and streaming data
• is_alert flag=true is generated on matches
Alert severity via Defined Rules
• Metron 'Threat Triage'
• Define rules based on incoming data
• Use any field within the rules (including newly enriched fields)
• Label alert severity levels based on rule conditions
Step 4a Configure Threat Intel and Alerting
Metron Threat Intel Bulk Loader Utility
Threat Intel Feed Mapping
• Domain is mapped against this threat intel feed
• Alerts generated when a match is hit
• Zeus malware tracker list used
• Feed bulk loaded: domain,source
• Sample threat intel data (malware_intel_feed.csv):
039b1ee.netsolhost.com,abuse.ch
03bbec4.netsolhost.com,abuse.ch
0if1nl6.org,abuse.ch
0x.x.gg,abuse.ch
1st.technology,abuse.ch
76tguy6hh6tgftrt7tg.su,abuse.ch
agiftcard724.com,abuse.ch
…
Extractor configuration (identify column mappings for the threat intel feed; specify column to match on):
{
  "config" : {
    "columns" : {
      "domain" : 0,
      "source" : 1
    },
    "indicator_column" : "domain",
    "type" : "zeusList",
    "separator" : ","
  },
  "extractor" : "CSV"
}
Threat intel configuration (configure field to threat intel mapping):
{
  "zkQuorum" : "$ZOOKEEPER_HOST:2181",
  "sensorToFieldList" : {
    "squid" : {
      "type" : "THREAT_INTEL",
      "fieldToEnrichmentTypes" : {
        "domain_without_subdomains" : [ "zeusList" ]
      }
    }
  }
}
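A minimal Python sketch of the cross-reference step is shown below, assuming the feed has already been loaded into memory as a dict (in Metron the feed is bulk loaded through the framework instead). The function names and the threat_intel_feed and threat_intel_source output fields are hypothetical; only the is_alert flag, the zeusList type and the sample feed rows come from the slides.

```python
# A few rows from the sample malware_intel_feed.csv (columns: domain,source).
ZEUS_FEED = """0if1nl6.org,abuse.ch
1st.technology,abuse.ch
agiftcard724.com,abuse.ch
"""


def load_feed(text):
    """Index the feed by its indicator column (domain)."""
    indicators = {}
    for line in text.splitlines():
        if line.strip():
            domain, source = line.split(",")
            indicators[domain] = source
    return indicators


def check_threat_intel(event, indicators, field="domain_without_subdomains"):
    """Flag the event when the configured field matches an indicator in the feed."""
    flagged = dict(event)
    source = indicators.get(event.get(field))
    if source is not None:
        flagged["is_alert"] = "true"
        flagged["threat_intel_feed"] = "zeusList"   # hypothetical field name
        flagged["threat_intel_source"] = source     # hypothetical field name
    return flagged


if __name__ == "__main__":
    feed = load_feed(ZEUS_FEED)
    print(check_threat_intel({"domain_without_subdomains": "1st.technology"}, feed))
    print(check_threat_intel({"domain_without_subdomains": "cnn.com"}, feed))
```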
Step 4b Alert Triage (Scoring Alerts)
Requirement for scoring a specific type of threat intel alert:
– Rule 1: If the threat intel hit came from the threat intel feed called zeusList, then we want to give that alert a score of 5
– Rule 2: If the URL is neither a .com nor a .net, then we want to give that alert a score of 10
Triage configuration:
• Rule 1: If threat intel exists in the Zeus list, score = 5
• Rule 2: If URL is not a .com or .net, score = 10
• Aggregator defined for when multiple conditions are met
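The two rules and the aggregator can be sketched in Python as below. The slides do not say which aggregator is configured, so using max as the default is an assumption, as are the function names and the threat_intel_feed field used to mark a Zeus hit. In Metron the rule conditions would be written in Stellar rather than Python.

```python
from urllib.parse import urlparse


def is_zeus_hit(event):
    """Rule 1 condition: the threat intel hit came from the zeusList feed."""
    return event.get("threat_intel_feed") == "zeusList"  # hypothetical field


def is_unusual_tld(event):
    """Rule 2 condition: the URL's host is neither a .com nor a .net."""
    host = urlparse(event.get("url", "")).hostname or ""
    return bool(host) and host.rsplit(".", 1)[-1] not in ("com", "net")


# (condition, score) pairs taken from the two rules on the slide.
RULES = [(is_zeus_hit, 5), (is_unusual_tld, 10)]


def triage_score(event, aggregator=max):
    """Combine the scores of all matching rules; max is an assumed aggregator."""
    scores = [score for rule, score in RULES if rule(event)]
    return aggregator(scores) if scores else 0


if __name__ == "__main__":
    alert = {"url": "http://1st.technology/", "threat_intel_feed": "zeusList"}
    print(triage_score(alert))
    print(triage_score(alert, aggregator=sum))
```

Passing a different aggregator (for example sum) shows how the choice changes the final score when both rules match.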
Step 5 Enhance the Metron UI
Visualize Enriched Data and Alerts
• Kibana-driven dashboards
• List and visualize Squid data
• Drill down into Squid data events
• (Example) Trend of Metron-generated alerts for data categorized by the alert risk level
• List of Metron-generated alerts ordered by risk level, with record-level drill-down
Metron Default Dashboard Kibana 4
• Displaying network data collected from the Metron probes
• In-built geo enrichment for default sensors feeds the map view
Key Takeaways…
• Easy Extensibility - The ability to easily add a new data source without writing any code!
• Repeatable Pattern - The reference application represents a repeatable pattern that you can apply to most data sources
• Configuration Driven - End-to-end framework to build real-time enrichment and alerting data pipelines