Date posted: 01-Nov-2014
Category: Technology
Uploaded by: sqrrl
sqrrl Secure. Scale. Adapt
Sqrrl Data, Inc., All Rights Reserved
Security of data within Hadoop
Problem
<5% of Data
Solution
General Data Problems
Source: Forrester
What about security?
What is the market saying?
security becomes an “enabler” by making it possible to bring together huge stores of data
You want security to be just as scalable, high-performance and self-organizing as the clusters
most big data technologies don’t have any security features built in
want fine-grained security and policy control at the database level
• With every copy of data, there is an increased risk of unintended disclosure
• Every now and then, people with access and privileges look at records without a legitimate business purpose, e.g., an employee of a banking system looking up their neighbor
A few more risks…
The Perfect Storm
Security Analysis
Customer Support
Customer Profiles
Sales & Marketing
Social Media
Business Improvement
Big Data
Regulations & Breaches
Increased profits
• Big Data is a time bomb, given how things are coming together
• Big Data deployment is growing fast; organizations are rushing into it
• Shortage of Big Data skills
• Big Data security solutions are not effective
• General shortage of security skills
The Perfect Storm
So what can we do?
(Def.) A form of security in which data carries with it the elements of provenance that are required to make policy decisions on its visibility:
• Separate data modeling for security and analysis
• Data comes with security attributes governing its visibility; the data is self-describing
• Reusability of applications across security domains
• Distributed development of ingest and query applications
• Supported by Accumulo’s cell-level security
Data-Centric Security
Data-Centric Security
Within Accumulo, a key is a 5-tuple, consisting of:
• Row: Controls Atomicity
• Column Family: Controls Locality
• Column Qualifier: Controls Uniqueness
• Visibility Label: Controls Access
• Timestamp: Controls Versioning
Row       Col. Fam.     Col. Qual.     Visibility   Timestamp  Value
John Doe  Notes         PCP            PCP_JD       20120912   Patient suffers from an acute …
John Doe  Test Results  Cholesterol    JD|PCP_JD    20120912   183
John Doe  Test Results  Mental Health  JD|PSYCH_JD  20120801   Pass
John Doe  Test Results  X-Ray          JD|PHYS_JD   20120513   1010110110100…
Accumulo Key/Value Example
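To make the visibility column concrete, here is a minimal Python sketch of how a label such as JD|PCP_JD could be checked against a user's authorizations. It is a toy model, not the Accumulo client API: the names Key, is_visible and visible_qualifiers are hypothetical, and the evaluator assumes only the operators & (and), | (or) and parentheses, read left to right. It also assumes that an empty label means the cell is visible to everyone.

```python
from collections import namedtuple

# Field order mirrors the 5-tuple on the slide; the value is stored next to the key.
Key = namedtuple("Key", "row family qualifier visibility timestamp")

OPERATORS = ("&", "|")


def tokenize(expr):
    """Split a label such as "JD|PCP_JD" into terms and the symbols & | ( )."""
    tokens, term = [], ""
    for ch in expr:
        if ch in "&|()":
            if term:
                tokens.append(term)
                term = ""
            tokens.append(ch)
        else:
            term += ch
    if term:
        tokens.append(term)
    return tokens


def is_visible(expr, auths):
    """Return True if the set of authorizations satisfies the label."""
    tokens = tokenize(expr)
    pos = 0

    def parse_term():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = parse_expr()
            pos += 1  # skip the closing ")"
            return inner
        return tok in auths

    def parse_expr():
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] in OPERATORS:
            op = tokens[pos]
            pos += 1
            right = parse_term()
            result = (result and right) if op == "&" else (result or right)
        return result

    # Assumption: an empty label places no restriction on the cell.
    return parse_expr() if tokens else True


# The four John Doe cells from the table (values kept as shown on the slide).
CELLS = [
    (Key("John Doe", "Notes", "PCP", "PCP_JD", 20120912), "Patient suffers from an acute …"),
    (Key("John Doe", "Test Results", "Cholesterol", "JD|PCP_JD", 20120912), "183"),
    (Key("John Doe", "Test Results", "Mental Health", "JD|PSYCH_JD", 20120801), "Pass"),
    (Key("John Doe", "Test Results", "X-Ray", "JD|PHYS_JD", 20120513), "1010110110100…"),
]


def visible_qualifiers(auths):
    """List the column qualifiers a user holding these authorizations can read."""
    return [key.qualifier for key, _ in CELLS if is_visible(key.visibility, auths)]


print(visible_qualifiers({"JD"}))        # holder of the JD authorization
print(visible_qualifiers({"PSYCH_JD"}))  # holder of the PSYCH_JD authorization
```

Because the label travels with each cell, the same check applies no matter which application wrote or reads the data.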
Data-Centric Security
Row  Col    Value
1    Name   Jones
1    Sales  100
1    Age    28
2    Name   Smith
2    Sales  350
2    Age    25
2    Quota  1000

Row  Col    Value
1    Name   Anon1
1    Sales  100
2    Name   Smith
2    Sales  350
2    Quota  1000
User 1 User 2 Data Store
Data-centric security approach allows all the data to be stored on a single platform and only authorized data is returned to the user
Pushing security to the data level simplifies application development and enables more powerful queries
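The idea that the store itself returns only authorized cells can be sketched in a few lines of Python. This is an illustrative model, not Accumulo code: the labels "hr" and "" (unrestricted) are hypothetical, since the slide does not show which labels protect which cells, and scan is a made-up helper name. The anonymized name (Anon1) from the second table is not modeled here.

```python
# Each cell is (row, column, value, label). The rows come from the first
# Row/Col/Value table; the labels are assumptions made for this sketch.
STORE = [
    (1, "Name", "Jones", ""),
    (1, "Sales", 100, ""),
    (1, "Age", 28, "hr"),
    (2, "Name", "Smith", ""),
    (2, "Sales", 350, ""),
    (2, "Age", 25, "hr"),
    (2, "Quota", 1000, ""),
]


def scan(store, auths):
    """Return only the cells whose label is empty or held by the caller."""
    return [
        (row, col, value)
        for row, col, value, label in store
        if not label or label in auths
    ]


# Both users run the same query; the filtering happens at the data level.
for user, auths in [("User 1", {"hr"}), ("User 2", set())]:
    print(user, scan(STORE, auths))
```

The application code is identical for both users, which is the sense in which pushing security down to the data simplifies development: the query logic never needs to know who is asking.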
We now have user access to the data secured. But what about your
HDFS administrators?
Encryption of Files
By encrypting the files we write into HDFS, we further restrict who can access the data!