Date post: 23-May-2020

Page 1

Confidential

Saama Technologies, Inc

Next Generation Data Management

Synthesizing & Standardizing Clinical Data in Flight

Page 2

Agenda

• Next generation systems & mechanisms for centralizing data while maintaining integrity
• Leveraging data mining for data quality
• Expanding skills matrix and processes to evolve data management

Page 3

Next Generation Systems

Page 4

The Problem Areas

[Diagram labels:]
• Source, Integration, Standardization, Visualization
• Quality, Governance, Technology, Process
• SDV, Meta Data, Connectors, Rules, Transform, Exceptions, Model, Reporting, Tools, Profile, Auditing, Lineage
• Variability, Inconsistency, Single Points of Failure

Page 5

Typical Approach

[Diagram labels:]
• Source, Integration, Standardization, Visualization
• Quality, Governance, Technology, Process
• SDV, Meta Data, Connectors, Rules, Transform, Exceptions, Model, Reporting, Tools, Profile, Auditing, Lineage
• Variability, Inconsistency, Single Points of Failure
• BPAAS, OCM
• Extract, Transform, Load
• Data Warehouse Reporting
• Manual (shown twice)

Page 6

A Novel Approach

[Diagram labels:]
• Source, Integration, Standardization, Visualization
• Quality, Governance, Technology, Process
• SDV, Meta Data, Connectors, Rules, Transform, Exceptions, Model, Reporting, Tools, Profile, Auditing, Lineage
• Variability, Inconsistency, Single Points of Failure
• BPAAS, OCM
• Extract & Load to Lake / Ponds
• Discovery, Reporting, & Analytics
• Data Governance Framework
• Quality Detection & Resolution
• Processing Pipelines
• Center of Analytic Excellence, Business Process Transformation

Page 7

What does Next Generation Look Like?

[Diagram labels:]
• Genomics
• Internal, M&A, External, Syndicated
• Wearable Devices
• High Variety, Volume, & Velocity Data
• Analysis, Organization, Ingestion
• Automated Data Wrangling & Deep Data Science
• Business Aware Data Analytics
• Services Oriented Architecture
• Harmonization, Integration, Analysis, Organized Storage, Provisioning, Aggregation
• Modern Technologies
• Configurable Analytic Applications
• Business Outcomes

Page 8

The Modern Technology Stack

[Architecture diagram; labels grouped under the nearest layer heading, in extraction order:]
• Data Aggregation Layer: HBase, HCatalog, Elastic Search; Reports; Specific Entities; Search Indexes; Centralized & Personalized Reports Expo; Ad Hoc Query; Specific Entities
• Data Processing & Analysis: Analytics via Pig, Python, Spark, R; Data Standards & Quality; Predictive Models; Analysis & RBM; Data Mining
• Data Organization & Storage: Study Data; Patient Data; Industry Data; HDFS; Atlas; Falcon; Ranger | Knox; PV Data; IvRS
• Data Landing Zone
• Data Sources: EDC; SAS; Safety; CTMS/EDC; CRO’s, M&A Systems; Internal Systems; CRO 1 … CRO n; Other/External; CTMS; Etc.
• Delivery of Insights: Reports; Visualization; Ad-hoc report building; Search Interface; Export Utility; Business Analytics Served Via Applications
• Deployment: Private Cloud, On Premise OR Hybrid
• Data Integration: Kafka, Pig, Sqoop, APIs, SDKs; Data Profiling, Filters, Format Conversion, Aggregation
• Modern Big Data Environment

Page 9

Data Management – Data Pipelines

Hortonworks. Web. 18 April 2016.
<http://hortonworks.com/hadoop-tutorial/defining-processing-data-end-end-data-pipeline-apache-falcon/>.

Page 10

Data Management – Data Pipelines

[Pipeline diagram labels, in extraction order:]
• CTMS, EDC, CRO
• Ingest & Apply Rules
• Raw Data
• MCC KPI
• TransCelerate KRI
• Analytic Ready Data
• Leadership, Safety, Clinical Operations
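The "Ingest & Apply Rules" step in the pipeline above could be sketched roughly as follows. This is a minimal illustration, not Saama's implementation: the record layout, the field names (`consent_date`, `visit_date`), the two rules, and the function names are all hypothetical. Records that pass every rule move on as analytic-ready data; the rest are routed to an exceptions list together with the names of the rules they failed.

```python
from datetime import date

# Hypothetical rules; the field names consent_date and visit_date are assumptions.
def dates_in_order(rec):
    """Consent must not be dated after the visit."""
    return rec["consent_date"] <= rec["visit_date"]

def not_on_weekend(rec):
    """Visits dated Saturday or Sunday are routed for review."""
    return rec["visit_date"].weekday() < 5  # Monday=0 ... Sunday=6

RULES = {"dates_in_order": dates_in_order, "not_on_weekend": not_on_weekend}

def apply_rules(records, rules=RULES):
    """Split records into (analytic_ready, exceptions).

    exceptions is a list of (record, [names of failed rules]) pairs.
    """
    ready, exceptions = [], []
    for rec in records:
        failed = [name for name, rule in rules.items() if not rule(rec)]
        if failed:
            exceptions.append((rec, failed))
        else:
            ready.append(rec)
    return ready, exceptions

# Illustrative raw records, as they might arrive from a CTMS/EDC/CRO feed.
raw = [
    {"subject": "001", "consent_date": date(2016, 4, 1), "visit_date": date(2016, 4, 4)},
    {"subject": "002", "consent_date": date(2016, 4, 5), "visit_date": date(2016, 4, 4)},
    {"subject": "003", "consent_date": date(2016, 4, 1), "visit_date": date(2016, 4, 2)},
]
ready, exceptions = apply_rules(raw)
```

Keeping the rules in a named dictionary means each exception carries the reason it was raised, which is useful for the auditing and lineage concerns listed on the earlier slides.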

Page 11

Data Management – Meta Data

Papatheodorou, Irene, et al. "A metadata approach for clinical data management in translational genomics studies in breast cancer." BMC Medical Genomics 2.1 (2009): 1.

Page 12

Data Mining & Data Quality

Page 13

Data Mining for Data Quality

● Cluster analysis: Group data to form classes; maximize intra-cluster similarity and minimize similarity between clusters.
● Association rules discovery: Find frequent rules in the data; popular with market basket analysis.
● Classification (e.g. decision trees): Build a (binary) tree where each node corresponds to a split of attribute values, e.g. "if the weather is sunny, play golf; else don’t play golf."
● Predictive modeling: Build mathematical models (functions) of the data in order to predict unknown or missing values, or future outcomes.
● Outlier detection: Find unusual, rare events (often regarded as noise, these can be the most interesting objects or events in the data); used for fraud detection, network intrusion detection, etc.
● Sequence / time series mining: Find patterns over time (e.g. episodes, clusters).
● Spatial mining: Geographical data analysis.
● Stream mining: Where the data can be read only once (e.g. network data, telecommunications data), special algorithms are necessary.
● Multimedia mining: Images, audio, video.
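As a small illustration of the outlier detection item above, the sketch below flags values outside Tukey's fences (1.5 times the interquartile range beyond the quartiles). The slides do not name a specific method, so the choice of Tukey's fences, the function name, and the sample readings are assumptions made for illustration only.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles() sorts its input; n=4 yields the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

# Hypothetical systolic blood pressure readings for one subject.
readings = [120, 118, 122, 125, 119, 121, 123, 190]
suspect = iqr_outliers(readings)
```

A quartile-based rule is used here rather than a mean and standard deviation rule because a single extreme value inflates the standard deviation and can hide itself.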

Page 14

Getting Started

[Flattened table. Column headings, repeated for each group of checks: Business Rule, Data Quality, Data Fraud. Group labels: Continuous Data, Calculation, Validation, Comparison, Trend Analysis. Checks, in extraction order:]

• Check leading digit preference (actual vs. expected)
• Check leading digit preference
• Too few or too many outliers identified
• Inliers identified
• Too little or too much variability
• Check the skewness of the data
• Correlation between variables
• Degree of interpolation for repeated measurements
• Degree of duplicated results for repeated measurements
• Summary statistics for variables, compared between them
• Ordering of dates to ensure plausibility
• Data recorded at weekends or on public holidays
• Accrual that seems implausible
• Potential invented patterns in the data
• Patterns of missing data
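One way to run the leading digit preference check (actual vs. expected) is a chi-square comparison of observed and expected leading-digit frequencies. The slide does not say where the expected frequencies come from; the sketch below assumes Benford's law, which suits values spanning several orders of magnitude. For measurements with a narrow range, an expected distribution taken from pooled reference data would be a more sensible argument. All function names here are hypothetical.

```python
import math
from collections import Counter

# Assumption: Benford's law supplies the "expected" leading-digit frequencies.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(x):
    """First significant digit of a non-zero number."""
    # Scientific notation puts the first significant digit in position 0.
    return int(f"{abs(x):.15e}"[0])

def leading_digit_chi2(values, expected=BENFORD):
    """Return (chi-square statistic, observed frequency per digit)."""
    digits = [leading_digit(v) for v in values if v != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("no non-zero values to test")
    counts = Counter(digits)
    chi2 = sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in expected.items()
    )
    observed = {d: counts.get(d, 0) / n for d in expected}
    return chi2, observed

# Powers of two are known to follow Benford's law closely.
natural_chi2, _ = leading_digit_chi2([2 ** k for k in range(1, 101)])
# A hypothetical site whose values all start with the digit 5.
suspect_chi2, suspect_obs = leading_digit_chi2([5, 50, 51, 55, 500] * 20)
```

With nine digit categories the statistic has 8 degrees of freedom, so values above roughly 15.5 would be unusual at the 5% level if the expected distribution really applied. A large statistic is a prompt for review, not proof of fraud.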

Page 15

Outcomes of Statistical Analyses for Data Quality

• Fabricated Data
  Missing or outlying values replaced by plausible values via implantation; data trend invention
  Distribution observations:
    • Implantation
    • Abnormally small variability for repeated measurements
• Falsified Data
  Enhancing patient eligibility or treatment efficacy
  Distribution observations:
    • Comparison between patients, measurements, treatment interactions, center, etc.
• Unintentional Data Errors
  Improperly calibrated or imprecise equipment
  Distribution observations:
    • Shift
    • Large variability
• Data Errors Resulting from Human Error
  Data entry:
    • Source to CTMS
  Missing data observations:
    • Frequency comparisons: outlying centers (too much or too little missing data) flagged.

[Diagram labels: Quality, Fraud]
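The frequency comparison for missing data could be sketched as below: each center's missing-data rate is compared with the pooled rate across all centers, and centers that deviate by more than a chosen number of binomial standard errors are flagged, in either direction. The input layout (a dictionary of center id to field values, with `None` for missing), the threshold of 3, and the function name are assumptions for illustration.

```python
import math

def flag_centers_by_missingness(records, z_threshold=3.0):
    """records: dict of center id -> list of field values (None = missing).

    Returns {center: z-score} for centers whose missing rate is unusually
    high or unusually low relative to the pooled rate.
    """
    total = sum(len(v) for v in records.values())
    if total == 0:
        return {}
    missing = sum(x is None for v in records.values() for x in v)
    p = missing / total  # pooled missing rate across all centers
    flagged = {}
    for center, values in records.items():
        n = len(values)
        if n == 0:
            continue
        rate = sum(x is None for x in values) / n
        se = math.sqrt(p * (1 - p) / n)  # binomial standard error at pooled rate
        z = (rate - p) / se if se > 0 else 0.0
        if abs(z) > z_threshold:
            flagged[center] = round(z, 2)
    return flagged

# Hypothetical data: three centers with 5% missing values, one with 40%.
typical = [None] * 5 + [1.0] * 95
centers = {"A": typical, "B": typical, "C": typical, "D": [None] * 40 + [1.0] * 60}
flagged = flag_centers_by_missingness(centers)
```

Because the pooled rate includes the outlying center itself, the other centers shift slightly in the opposite direction; a fairly strict threshold keeps them from being flagged as a side effect.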

Page 16

Outcomes of Statistical Analyses for Data Quality

Page 17

Outcomes of Statistical Analyses for Data Quality

Page 18

Outcomes of Statistical Analyses for Data Quality

Page 19

Outcomes of Statistical Analyses for Data Quality

Page 20

Outcomes of Statistical Analyses for Data Quality

Page 21

Skills Matrix

Page 22

People & Skills

• Data Management
  • Data Manager
  • Database Programmer/Designer
  • Medical Coder
  • Clinical Data Coordinator
  • Quality Control Associate
  • Data Entry Associate
• IT (Big Data)
  • Architect
  • Integration Developers
• Biostatistics
  • Applied Statisticians
  • Data Scientists
• Governance
  • CoE Stewards
  • Custodians

[Diagram labels: Analytic Skills; Technology Skills; Health/Biomedical Informatics Experts; Data Scientists; Big Data Engineers]

*EMC2

Page 23

Saama

Page 24

Saama’s Life Sciences Practice

Life Sciences Solutions

Subject Matter Experts
• Clinical Trials
• R&D | Preclinical
• GMA
• HEOR
• Commercial

Advanced Analytics
• Biostatistics
• Bioinformatics
• Data Science
• Applied Statistics
• Machine Learning

Technology
• Big Data & Connectors
• App/Web Development
• UI/UX
• MDM
• Open Source

Data Standards
• CDISC
• OMOP
• Custom
• Client Plug-In

Design Process Frameworks
• Business Analysis
• Predictive Model Build
• Delivery
• QA

Rapid Configuration | Modular Design | Advanced Data Science | Agile Methodology

Page 25

Analytics Advantage

[Diagram labels:]
• HCM analytics suite for workforce management and operations
• Capturing new customer segments with insights from real world treatment pathways
• Disease incidence and co-morbidities insights from population cohort analysis
• Managed Services (Data & Analytics) / Clinical Operations Solution
• Your Infrastructure
• Updates: Modern Data Warehouse, Cloud, V4, Data Lake, Real-time
• Saama’s Analytic Assets
• Pre-Built Clinical Data Solutions
• Big Data Strategy & Acceleration
• Analytical Solutions
• Managed Services
• Refactoring Existing Data Systems

Page 26

Copyright © 2016, Saama Technologies | Confidential

Analytics Advantage

What We Stand For

New Class of Solution Partner: Fluid Analytics for the digital enterprise
Simultaneous business, technology and services acceleration with Data Science

Heritage of Innovation
Patented technologies, Fluid Analytics Engine Accelerator.

Game-Changing Outcomes for the Global 2000
Multi-million $ business outcomes
5000+ engagements, 99% retention rate

