BigDataRevealed Discovery, PII Isolation, Pattern Detection, Deep Learning, Data Scientist Workbench

Post on 12-Feb-2017

The Pillars of Information Risk Mitigation

Four Pillars of a Complete Solution

Overall Framework

• PII Isolation
• Pattern Detection
• Data Scientist Workbench
• Deep Learning

Sound Scalable and Extensible Architecture

BigDataRevealed Data Discovery for Big Data Hadoop

• Find and locate personally identifiable information (PII)
• Store the location of all PII data identified
• Assist in the remediation of PII data
– Assist in identifying the business process that resulted in the PII violation
– Store the date and time of PII violation discovery
• Real-time social media monitoring

PII Isolation
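
The discovery steps listed above (find PII, store where it was found, store when it was found) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not BDR's implementation: the two regular expressions are deliberately simplistic, and `PiiFinding`, `scan_rows`, and the sample file path are hypothetical names introduced only for this example.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone

# Illustrative, deliberately simple patterns (assumptions, not BDR's rules).
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

@dataclass
class PiiFinding:
    """Hypothetical record of where and when a PII value was found."""
    source: str
    row: int
    column: str
    pii_type: str
    discovered_at: datetime

def scan_rows(source, rows):
    """Return one finding per (row, column, PII type) that matches a pattern."""
    findings = []
    for row_index, row in enumerate(rows):
        for column, value in row.items():
            for pii_type, pattern in PII_PATTERNS.items():
                if pattern.search(str(value)):
                    findings.append(
                        PiiFinding(source, row_index, column, pii_type,
                                   datetime.now(timezone.utc))
                    )
    return findings

# Made-up sample data; the path is a placeholder, nothing is read from HDFS.
rows = [
    {"name": "A. Example", "contact": "a.example@example.com"},
    {"name": "B. Example", "contact": "none", "note": "SSN 123-45-6789 on file"},
]
findings = scan_rows("hdfs:///demo/customers.csv", rows)
for f in findings:
    print(f.source, f.row, f.column, f.pii_type, f.discovered_at.isoformat())
```

Storing the location and discovery time with each finding is what makes later remediation and audit possible; a production scanner would need far stricter patterns and validation than the two shown here.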

• Fraud detection & risk identification
• Anomalies identified from human, mechanical and unusual conditions
• IoT connected device validation
• Detect social media outlier data
– People
– Competitors
– New topics

Pattern Isolation
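
As a rough illustration of anomaly identification on device readings, the sketch below flags values that sit far from the median. The slides do not say which method BDR uses; a median absolute deviation (MAD) rule with the commonly used 3.5 cutoff is swapped in here as a stand-in, and `mad_outliers` and the sample readings are hypothetical.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold.

    Uses the median absolute deviation (MAD), which is less affected by
    extreme values than a mean/standard-deviation rule.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # At least half the values equal the median: flag anything that differs.
        return [i for i, v in enumerate(values) if v != center]
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - center) / mad > threshold
    ]

# Made-up sensor readings with one obviously unusual value at index 4.
readings = [20.1, 19.8, 20.3, 20.0, 35.7, 19.9, 20.2]
print(mad_outliers(readings))
```

A robust statistic is used because a single faulty device can distort a mean-based threshold enough to hide itself.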

• Create columnar metadata to enable the discovery of the locations of data required for reporting
• Build trust by reporting the percentage probability of each pattern found in a column (for example, Email 70%, IP Address 10%, SSN 8%, and 12% blank)
• Identify data quality issues and inconsistencies in column data

Data Scientist
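
The per-column pattern percentages described above can be illustrated with a small profiling sketch. It assumes each value is assigned to the first pattern it fully matches; the regular expressions, `classify`, and `profile_column` are hypothetical and are not taken from BDR. The sample column is constructed to reproduce the split quoted on the slide.

```python
import re
from collections import Counter

# Simplistic illustrative patterns, checked in order (assumptions, not BDR's rules).
PATTERNS = [
    ("Email", re.compile(r"[^@\s]+@[^@\s]+\.[A-Za-z]{2,}")),
    ("IP Address", re.compile(r"(?:\d{1,3}\.){3}\d{1,3}")),
    ("SSN", re.compile(r"\d{3}-\d{2}-\d{4}")),
]

def classify(value):
    """Label one value as Blank, one of the known patterns, or Other."""
    text = "" if value is None else str(value).strip()
    if not text:
        return "Blank"
    for label, pattern in PATTERNS:
        if pattern.fullmatch(text):
            return label
    return "Other"

def profile_column(values):
    """Return the percentage of values falling under each label."""
    total = len(values)
    if total == 0:
        return {}
    counts = Counter(classify(v) for v in values)
    return {label: 100.0 * n / total for label, n in counts.items()}

# Made-up column of 50 values: 35 emails, 5 IP addresses, 4 SSNs, 6 blanks.
column = (["a@example.com"] * 35 + ["10.0.0.1"] * 5
          + ["123-45-6789"] * 4 + [""] * 6)
profile = profile_column(column)
for label, pct in profile.items():
    print(f"{label}: {pct:.0f}%")
```

A column that is mostly one pattern with a small share of others, as in this sample, is also a hint of the data quality issues and inconsistencies the slide mentions.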

• Data Stewards
• Data Analysts
• Business Analysts
• Privacy Officers
• Compliance Officers
• ETL and BI Developers

Stakeholders Who Receive Value From BDR

• BDR deep learning provides context, summarization, and descriptive data tags for email, resume, and RTF documents
• Enables probabilistic searching for tags and summaries
– Highlights and displays results
– Access detailed data, assign tasks and print

Deep Learning

• BDR is available through a preconfigured VM Hadoop instance that can be up and running in hours
• BDR can take advantage of your existing Hadoop implementation
• None of the BDR functions described in this PowerPoint require additional software licensing

Try BigDataRevealed without Risk

[Architecture diagram: BDR-VM - Powered with Apache™ Hadoop®. Labels recovered from the diagram:]

• Hadoop ecosystem components: HDFS, HBase, Hive or Impala, Pig, MapReduce, Spark, Spark Streaming, MLlib / BDRLIB, Deep Learning, Apache Drill or Impala, Tika
• BDR components: D3.js interactive GUI; BigDataRevealed callable Java modules (Map/Reduce, Spark, NLP, Deep Learning) as externalized callable modules; BDR API modules; BDRMR, BDRJAR, BDRPig, BDRJava
• Deployment: departmentalized BDR-Apache-VMware; cloud; Kerberized
• Data sources: MySQL, Oracle/DB2, Teradata; streaming feeds such as social media, machine code, banking …
• BDR functions: BDR Audit Lineage (keeps your data lake from becoming a data swamp); BDR Remediation Task Management; Spark Streaming BDR Risk Alert

Steven Meister
Founder, BigDataRevealed
steven.meister@bigdatarevealed.com
www.bigdatarevealed.com
847-791-7838

BigDataRevealed Contact