Post on 25-Jan-2017
Grab some coffee and enjoy the pre-show banter before the top of the hour!
The Briefing Room
Solving the Really Big Tech Problems with IoT
Welcome
Host: Eric Kavanagh
eric.kavanagh@bloorgroup.com @eric_kavanagh
Mission
– Reveal the essential characteristics of enterprise software, good and bad
– Provide a forum for detailed analysis of today’s innovative technologies
– Give vendors a chance to explain their product to savvy analysts
– Allow audience members to pose serious questions… and get answers!
Topics
January: ANALYTICS
February: BIG DATA
March: CLOUD
The Rosetta Stone
– MapReduce wasn’t enough
– Spark is awesome, but still…
– Kafka changed the game
– NiFi is the capstone for Hadoop!
Analyst
Robin Bloor is Chief Analyst at The Bloor Group
robin.bloor@bloorgroup.com @robinbloor
HPE Security
– HPE offers comprehensive data security and privacy solutions for big data, the cloud and the Internet of Things
– Its solution features data encryption, tokenization and key management
– HPE partnered with Hortonworks to enhance its DataFlow solution (powered by Apache NiFi) with data protection
Guest
Reiner Kappenberger, Global Product Management, Big Data, HPE Security – Data Security
Reiner Kappenberger has over 20 years of computer software industry experience focusing on encryption and security for Big Data environments. His background ranges from device management in the telecommunications sector to GIS and database systems. He holds a Diploma in computer science from FH Regensburg, Germany.
Solving the Really Big Tech Problems with IoT HPE Security – Data Security
January 17, 2017
HPE Security – Data Security
We protect the world’s most sensitive data
– Protect the world’s largest brands & neutralize breach impact by securing sensitive data at rest, in use and in motion.
– Over 80 patents & 51 years of expertise
Our solutions
– Provide advanced encryption, tokenization & key management
Market leadership
– Data-centric security solutions used by eight of the top ten U.S. payment processors and nine of the top ten U.S. banks.
– Thousands of enterprise customers across all industries including transportation, retail, financial services, payment processing, banking, insurance, high tech, healthcare, energy, telecom & public sector.
– Email solution used by millions of users and thousands of enterprise & mid-sized businesses including healthcare organizations, regional banks & insurance providers.
– Contribute technology to multiple standards organizations.
Why is securing Hadoop difficult?
Rapid innovation in a well funded open source community
Multiple feeds of data in real time from different sources with different protection needs
Sources shown: Mainframe, MQ, RDBMS, XML, Salesforce, Flat Files
Multiple types of data combined in a Hadoop “Data Lake”
Reduced control if Hadoop clusters are deployed in a cloud environment
Automatic replication of data across multiple nodes once entered into the HDFS data store
Access by many different users with varying analytic needs
Introducing “Data-centric” security
Data ecosystem: Storage, File systems, Databases, Middleware, Data and applications
Traditional IT infrastructure security: Disk encryption, Database encryption, SSL/TLS/firewalls, Authentication management
Threats to data: Malware, Insiders, SQL injection, Traffic interceptors, Credential compromise
Security gaps: four gaps marked between the layers
HPE SecureData data-centric security: data security coverage with end-to-end protection
HPE Format-Preserving Encryption (FPE)
– Supports data of any format: name, address, dates, numbers, etc.
– Preserves referential integrity
– Only applications that need the original value need change
– Used for production protection and data masking
– NIST standard: uses the FF1 mode of AES encryption
Example record
First Name: Gunther Last Name: Robertson SSN: 934-72-2356 DOB: 20-07-1966
Tax ID: 934-72-2356
AES-CBC
Ija&3k24kQotugDF2390^32 0OWioNu2(*872weW Oiuqwriuweuwr%oIUOw1@
Tax ID: 8juYE%Uks&dDFa2345^WFLERG
AES-FPE
First Name: Uywjlqo Last Name: Muwruwwbp SSN: 253-67-2356 DOB: 18-06-1972
Tax ID: 253-67-2356
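The format-preserving idea on this slide can be illustrated with a minimal sketch. This is not FF1 and not HPE's implementation: it is a toy Feistel construction over decimal digits, using HMAC-SHA256 from the Python standard library as the round function. All function names, the demo key and the round count are assumptions made for illustration. It only shows why the output keeps the length and layout of the input, and why the same input always maps to the same output (which is what preserves referential integrity).

```python
import hashlib
import hmac
import string

DIGITS = string.digits


def _round_value(key: bytes, round_no: int, half: str, width: int) -> int:
    """Keyed pseudo-random number in [0, 10**width), derived from the other half."""
    digest = hmac.new(key, bytes([round_no]) + half.encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % (10 ** width)


def _check(digits: str, rounds: int) -> None:
    if len(digits) < 2 or any(ch not in DIGITS for ch in digits) or rounds % 2:
        raise ValueError("need at least 2 decimal digits and an even round count")


def encrypt_digits(key: bytes, digits: str, rounds: int = 10) -> str:
    """Toy Feistel network: digits in, the same number of digits out."""
    _check(digits, rounds)
    split = len(digits) // 2
    a, b = digits[:split], digits[split:]
    for i in range(rounds):
        width = len(a)
        c = (int(a) + _round_value(key, i, b, width)) % (10 ** width)
        a, b = b, str(c).zfill(width)
    return a + b


def decrypt_digits(key: bytes, digits: str, rounds: int = 10) -> str:
    """Run the rounds backwards to recover the original digits."""
    _check(digits, rounds)
    split = len(digits) // 2
    a, b = digits[:split], digits[split:]
    for i in reversed(range(rounds)):
        width = len(b)
        prev_a = (int(b) - _round_value(key, i, a, width)) % (10 ** width)
        a, b = str(prev_a).zfill(width), a
    return a + b


def _apply_formatted(transform, key: bytes, value: str) -> str:
    """Transform only the digits of value; separators such as '-' stay in place."""
    out = iter(transform(key, "".join(ch for ch in value if ch in DIGITS)))
    return "".join(next(out) if ch in DIGITS else ch for ch in value)


def protect_formatted(key: bytes, value: str) -> str:
    return _apply_formatted(encrypt_digits, key, value)


def unprotect_formatted(key: bytes, value: str) -> str:
    return _apply_formatted(decrypt_digits, key, value)


if __name__ == "__main__":
    demo_key = b"demo-key-not-for-production"  # hypothetical key
    protected = protect_formatted(demo_key, "934-72-2356")
    print(protected, unprotect_formatted(demo_key, protected))
```

Because the protected value still looks like an SSN, downstream schemas and joins keep working; only applications that need the original value have to call the decrypt side.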
Hyper Secure Stateless Tokenization (SST)
Credit Card: 4171 5678 8765 4321
– Tokenization for PCI scope reduction
– Replaces token database with a smaller token mapping table
– Token values mapped using random numbers
– Lower costs
− No database hardware, software, replication problems, etc.
– Hyper SST technology is architected to leverage the latest compute-platform advances
SST 8736 5533 4678 9453
Partial SST 4171 5633 4678 4321
Obvious SST 4171 56AZ UYTZ 4321
BIN Mapping 1236 5533 4678 4321
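A rough sketch of the "mapping table instead of token database" idea, under loud assumptions: this is not the Hyper SST algorithm. It is a toy that builds a small static table of random digit permutations once, then tokenizes the middle digits of a card number while keeping the first six and last four, as in the "Partial SST" row above. The function names and the per-position permutation scheme are hypothetical, and the scheme is far weaker than a real tokenization product.

```python
import secrets
import string

DIGITS = string.digits


def build_mapping_table(positions: int) -> list:
    """Create one random permutation of 0-9 per tokenized position (generated once)."""
    rng = secrets.SystemRandom()
    table = []
    for _ in range(positions):
        perm = list(DIGITS)
        rng.shuffle(perm)
        table.append("".join(perm))
    return table


def _map_middle(pan: str, table: list, keep_head: int, keep_tail: int, forward: bool) -> str:
    total = sum(ch in DIGITS for ch in pan)
    if total - keep_head - keep_tail > len(table):
        raise ValueError("mapping table too small for this value")
    out, seen = [], 0
    for ch in pan:
        if ch in DIGITS:
            # Only digits between the preserved head and tail are replaced.
            if keep_head <= seen < total - keep_tail:
                perm = table[seen - keep_head]
                ch = perm[int(ch)] if forward else str(perm.index(ch))
            seen += 1
        out.append(ch)  # spaces and other separators pass through unchanged
    return "".join(out)


def tokenize(pan: str, table: list, keep_head: int = 6, keep_tail: int = 4) -> str:
    return _map_middle(pan, table, keep_head, keep_tail, forward=True)


def detokenize(token: str, table: list, keep_head: int = 6, keep_tail: int = 4) -> str:
    return _map_middle(token, table, keep_head, keep_tail, forward=False)


if __name__ == "__main__":
    table = build_mapping_table(6)
    token = tokenize("4171 5678 8765 4321", table)
    print(token, "->", detokenize(token, table))
```

The point of the sketch is the storage model: the only state is the small table, so there is no per-card token database to store, replicate or keep in sync.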
Granular Policy Managed by HPE SecureData
– A policy consists of Data Formats, Protection, and Data Access Rules
Data Format
– Name: Field or Object
– Type: Alphabets, Formats
– Logic Rules: Meaning Preservation Rules
Protection Method
– FPE, SST, IBSE
– Key Rotation Policy for Encryption, Caching Policy
Dynamic Masking Policy
– Mask Type, Mask
Authentication Policy
– System Auth: LDAP, Groups
– App Auth: PKI, Secret, IP ranges, custom adapters/Java
Authorization
– Groups, Roles: No Access, Access, Masked Access
– App Permissions: Encrypt, Tokenization, Detokenize, Decrypt
Note: Some features vary by platform, use of LDAP, IAM, IDM
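One way to picture such a policy is as a small data structure that ties a data format to a protection method and to per-group access levels. The sketch below is an illustration only: the class, field and group names are hypothetical and do not reflect the HPE SecureData policy schema. It models the three access outcomes named on the slide (No Access, Access, Masked Access) and a simple mask that keeps the last few characters visible.

```python
from dataclasses import dataclass, field
from enum import Enum


class Access(Enum):
    NO_ACCESS = "no access"
    MASKED = "masked access"
    FULL = "access"


@dataclass
class FieldPolicy:
    name: str                      # data format name, e.g. a field or object
    alphabet: str                  # characters the format is made of
    protection: str                # "FPE", "SST" or "IBSE"
    mask_char: str = "X"           # dynamic masking: mask character
    visible_tail: int = 4          # dynamic masking: characters left in clear
    group_access: dict = field(default_factory=dict)  # group name -> Access

    def access_for(self, groups) -> Access:
        """Pick the most permissive level granted to any of the caller's groups."""
        levels = [self.group_access.get(g, Access.NO_ACCESS) for g in groups]
        if Access.FULL in levels:
            return Access.FULL
        if Access.MASKED in levels:
            return Access.MASKED
        return Access.NO_ACCESS

    def render(self, clear_value: str, groups) -> str:
        """Return what a caller in the given groups is allowed to see."""
        level = self.access_for(groups)
        if level is Access.FULL:
            return clear_value
        if level is Access.MASKED:
            hidden = max(len(clear_value) - self.visible_tail, 0)
            return "".join(
                self.mask_char if i < hidden and ch in self.alphabet else ch
                for i, ch in enumerate(clear_value)
            )
        raise PermissionError(f"no access to {self.name}")


if __name__ == "__main__":
    ssn_policy = FieldPolicy(
        name="SSN",
        alphabet="0123456789",
        protection="FPE",
        group_access={"fraud-team": Access.FULL, "support": Access.MASKED},
    )
    print(ssn_policy.render("934-72-2356", ["support"]))
```

In a real deployment the group membership would come from a directory such as LDAP, and the clear value would only exist after an authorized decrypt or detokenize call.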
Securing Sensitive Data in Big Data Platforms and Hadoop
[Diagram: data protection across a big data platform]
Sources: any data source, including public data, sensor data, laptop log files and server log files
Ingest: Sqoop, Flume, NiFi, Storm, Kafka into a “landing zone”
Big data platform: Teradata, Vertica, Hadoop
Processing: Hive UDFs, MapReduce, TDE, SQL, Spark
Consumers: BI tools work on protected data; business processes use protected data; a power user re-identifies data
Leading Telecoms Provider – Big Data Primary Data Flow
[Diagram: primary data flow]
Scale: 60 data sources; 20 million records per day = 1 TB; 250 nodes
Sources: sensitive structured sources
Ingest: Sqoop, Flume, Storm; staging area with HPE SecureData File Processor
Hadoop cluster: Hive UDFs, MapReduce, data cleansing
Downstream: Teradata EDW (UDFs), data virtualization layer, Tableau, analytics & data science
Supporting services: LDAP; HPE SecureData Key Servers & WS APIs
Threats in the IoT Space
[Diagram: the big data platform data flow, with the same labels as the “Securing Sensitive Data in Big Data Platforms and Hadoop” diagram, here marked as back-end infrastructure]
Leading car manufacturer – Big Data primary data flow
[Diagram: primary data flow]
Scale: ~2 billion real-time transactions/day
Sources: sensitive structured data and sources; other real-time data feeds (customer data from dealerships, manufacturers); existing data sets and 3rd-party data, e.g. accident data
Ingest: Sqoop, Flume real-time ingest, IBM DataStage into a “landing zone” with “integration controls”
Hadoop edge nodes: HPE SecureData Hadoop Tools
Hadoop cluster: Hive UDFs, MapReduce
Downstream: data warehouse (UDFs), Cognos, analytics & data science
Supporting services: HPE SecureData Key Servers & WS APIs
Thank you
Perceptions & Questions
Analyst: Robin Bloor
The Nature of Data in Motion
Robin Bloor, PhD
Event Management and Processing
We have gradually entered an event-based IT world.
It brings with it new realities.
We need to consider “data in motion.”
New Realities
– A good deal of (new) data is now sourced from outside the business
– Data has to be governed
– Provenance and lineage matter
– There is no perimeter anywhere: data access permissions and encryption apply everywhere
Events and Event Data
§ Time
§ Geographic location
§ Virtual/logical location
§ Source device
§ Device ID
§ Ownership and actors
§ Data
NiFi v Kafka
NiFi
§ Apache project from NSA (2015)
§ Highly scalable
§ Parallel operation
§ A distributed data flow platform
§ Point to point (pull-push)
§ IoT
Kafka
§ Apache project from LinkedIn (2015)
§ Highly scalable
§ Parallel operation
§ A distributed streaming platform
§ Publish-subscribe (push-pull)
§ Not so much IoT
A View of a Coherent Data Lake
– Data lakes are complex: more complex, for example, than a data warehouse
– It’s becoming obvious that streaming and data flow are inherent to the data lake
– It is the primary place of governance
– There needs to be a strategy for data
[Diagram: data lake architecture]
Sources: Network Devices, IoT, Mobile, Servers, Desktops, Embedded Chips, RFID, The Cloud, Log Files, OSes, VMs, ESB/Messaging Software, Sys Mgt Apps, Social Network Data, Web Services, Data Streams, Workflow, BI & Office Apps, Business Apps, SaaS
Data lake functions: Ingest, Secure, Transform & Aggregate, ETL, Data Cleansing, Metadata Mgt, Life Cycle, Data Storage, Archive
Outputs: Extracts to Databases, Data Marts, Other Apps; Search & Query; BI, Visual'n & Analytics; Real-Time Apps; Other Apps
Compliance and Regulations
– Aside from sector initiatives there are many official regulations: HIPAA, SOX, FISMA, FERPA, GLBA (mainly US legislation)
– Standards (global): PCI-DSS, ISO/IEC 17799 (data should be owned)
– National regulations differ country to country (even in Europe)
– GDPR being negotiated
The Challenges
It is ceasing to be possible to include security as an afterthought.
Security needs to be designed in from the get go.
– How many companies doing big data projects do you believe have security properly organized? Is anything happening that is likely to complicate the situation even more?
– Security often comes with performance penalties. What is the performance cost of the solutions you are advocating?
– Costs? How much budget needs to be allocated? Can you give a feel for this?
– Where do you tend to see NiFi and Kafka (Storm, Flume, Flink…) being used?
– Security needs to be integrated, so encryption needs to shake hands with authentication. How does HPE make this work?
– Are there any environments/applications to which HPE’s security technology is inapplicable, e.g. OLTP, Data Streaming & Streaming Analytics, BI, Mobile, Cloud, etc.?
Upcoming Topics
www.insideanalysis.com
January: ANALYTICS
February: BIG DATA
March: CLOUD
THANK YOU for your ATTENTION!
Some images provided courtesy of Wikimedia Commons