Clinical Trials & Big Data-Final

Using Big Data To Design & Manage Clinical Trials

An Architect’s Perspective

Manoj Vig


https://www.linkedin.com/in/manojvig Twitter - #manojvig


Disclaimer

I am an employee of Shire Pharmaceuticals. The statements and opinions expressed within this session are my own and do not represent those of Shire.

There are some references to technical design patterns being implemented within Shire, but the explanations of those implementations provided in this session are purely technical.

This presentation outlines general technology direction and trend analysis. Shire has no obligation to pursue any approaches outlined in this document or use any functionality documented or discussed in today’s session.

What is Big Data

Volume (Petabytes of Data)

Variety (Structured, Unstructured, Images, Sounds)

Velocity (Batch, sub-second response, stream, changes in data)

Characteristics of a big data system

Handles large volumes of data

Designed for Scalability & Failover

Supports multiple workloads

Security, multi-tenancy & privacy

Cost effective

Technology framework for Innovation

1. Mobility

2. Social

3. Apache Hadoop (Multiple workloads / Distributed Computing)

Arrival of Mobile Age

Participant Recruitment

Adherence & Engagement

User Interaction

Frequent Data Generation

Remote Data Exchange

Data Generation

Power of Social media

Participant engagement

Patient & Site Identification

Social Listening

Distributed Scale Unstructured Velocity Security Access

Big Data Processing Systems

Using Twitter – Implementation Pattern

Twitter

Twitter API (Multi-threaded data acquisition)

Curation

Filter Algorithms Rank

Location Profile

Distributed, Scalable, Fast & Economical

Key Decision Makers

Targeted Ads

Visualizations

Web/Mobile

Delivery Channels

Automated Process
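The Twitter pattern above (multi-threaded acquisition, then curation, filtering and ranking) can be sketched roughly as follows. This is only an illustration under stated assumptions: `fetch_tweets` is a hypothetical stand-in that returns made-up sample data rather than calling the real Twitter API, and the filter rule and the follower-count ranking are placeholders for whatever algorithms a real deployment would use.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Tweet:
    user: str
    text: str
    location: str
    followers: int


# Made-up sample data keyed by search keyword (assumption, not real tweets).
SAMPLE = {
    "clinical trial": [
        Tweet("dr_a", "Recruiting for a new clinical trial in Boston", "Boston", 5200),
        Tweet("bot1", "clinical trial buy now!!!", "", 3),
    ],
    "rare disease": [
        Tweet("advocate_b", "Living with a rare disease: my story", "London", 880),
    ],
}


def fetch_tweets(keyword):
    """Hypothetical stand-in for one Twitter API search call."""
    return SAMPLE.get(keyword, [])


def acquire(keywords, workers=4):
    """Fetch every keyword in parallel threads and flatten the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        batches = list(pool.map(fetch_tweets, keywords))
    return [tweet for batch in batches for tweet in batch]


def curate(tweets):
    """Filter step: keep tweets that have a location and no spam marker."""
    return [t for t in tweets if t.location and "buy now" not in t.text.lower()]


def rank(tweets):
    """Rank step: most-followed authors first (placeholder scoring)."""
    return sorted(tweets, key=lambda t: t.followers, reverse=True)


def run_pipeline(keywords):
    return rank(curate(acquire(keywords)))


ranked = run_pipeline(["clinical trial", "rare disease"])
for t in ranked:
    print(t.user, t.location, t.followers)
```

The ranked list is what would feed the delivery channels named on the slide (targeted ads, visualizations, web/mobile); those steps are left out of the sketch.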

Apache Hadoop

Security, governance, privacy and Audit

BI Reports & Dashboards

Data Analysts

Data Scientists

Apps(Web + Mobile)

Devices

Data Feeds

Data Service : Multiple data sources, multiple processing workloads and multiple delivery channels

Impala / Tez(Interactive)

HDFS(Hadoop Distributed File System)

MR(Batch)

Spark(Stream, ETL, DS)

Hive(DW)

Robust Cloud Infrastructure(e.g. AWS EC2)

Governance, Security & Audit

YARN (Cluster Resource Manager)

HBase (NoSQL)

Solr (Search)

Spark (MLlib, Graph)

Custom/Proprietary/Visualization Apps

CTMS

Common Data Ingestion

ClinicalTrials.gov

Metadata Data Quality

Searchable Data Catalog

Streaming

CRO Data Feed

Genomic Data
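As a rough illustration of the common data ingestion layer above, the sketch below runs records from several sources through one ingestion function that applies a data quality check, lands the accepted records, and writes a metadata entry for a catalog. Everything here is an assumption made for illustration: the required field names, the in-memory dictionary standing in for HDFS, and the sample records are hypothetical and not taken from any actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical data quality rule: these fields must be present and non-empty.
REQUIRED_FIELDS = ("trial_id", "site")


@dataclass
class CatalogEntry:
    source: str
    ingested_at: str
    record_count: int
    rejected_count: int


def passes_quality(record):
    """Return True when every required field has a non-empty value."""
    return all(record.get(name) not in (None, "") for name in REQUIRED_FIELDS)


def ingest(source, records, landing_zone, catalog):
    """Land the good records for one source and log metadata about the load."""
    good = [r for r in records if passes_quality(r)]
    landing_zone.setdefault(source, []).extend(good)
    entry = CatalogEntry(
        source=source,
        ingested_at=datetime.now(timezone.utc).isoformat(),
        record_count=len(good),
        rejected_count=len(records) - len(good),
    )
    catalog.append(entry)
    return entry


landing_zone = {}  # stands in for the HDFS landing area
catalog = []       # stands in for the searchable data catalog

ctms_entry = ingest(
    "CTMS",
    [{"trial_id": "T-001", "site": "Boston"}, {"trial_id": "", "site": "Lyon"}],
    landing_zone,
    catalog,
)
ingest("CRO Data Feed", [{"trial_id": "T-002", "site": "Leeds"}], landing_zone, catalog)

for entry in catalog:
    print(entry.source, entry.record_count, entry.rejected_count)
```

Routing every source through the same function is what makes the metadata and quality counts comparable across feeds, which is the point of a common ingestion layer.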

Information Overload Problem – Apache Solr

CTMS

Streaming

ClinicalTrials.gov

UK Clinical Trials Gateway

Other R&D Datasets

SAS Datasets

Genomic Datasets

Apache Solr Running on Hadoop Cluster

HDFS(Data Landing)

Apache Solr

Data Indexing

Information Extraction(Spark)

Pattern Recognition(Spark)

Machine Learning(Spark)

Metadata Driven Ontology (HBase)

Data Indexing

Solr APIs

Web UI

Mobile Apps

Desktop Widgets

Dashboards

Data Sources

Consumption

HBase APIs
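The indexing-and-search idea behind this slide can be shown with a toy inverted index. This is a minimal sketch of the general technique only; it is not how Apache Solr is implemented or called, and the class name `TinyIndex`, the document IDs and the sample texts are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Toy inverted index: maps each term to the set of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.docs = {}

    def add(self, doc_id, source, text):
        """Index one document and remember which data source it came from."""
        self.docs[doc_id] = {"source": source, "text": text}
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return IDs of documents that contain every term in the query."""
        terms = tokenize(query)
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(hits)


index = TinyIndex()
index.add("d1", "CTMS", "Phase 2 diabetes trial, site Boston")
index.add("d2", "ClinicalTrials.gov", "Phase 3 asthma trial in adults")
index.add("d3", "SAS Datasets", "Adverse events dataset, asthma phase 3")

for doc_id in index.search("phase 3 asthma"):
    print(doc_id, index.docs[doc_id]["source"])
```

A single query returns matches from different sources, which is how one search layer over many datasets addresses the information overload problem named in the slide title.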

Technology is here to stay

Data Generation speed will accelerate

Data Access will get easier

Device connectivity will increase

Technological disruption is inevitable

Conclusion

Questions?

Further Reading

Are Recommender Systems Now Mainstream? https://icrunchdatanews.com/recommender-systems-now-mainstream/

The Impact of Real-time Computing Systems – Part 1 https://icrunchdatanews.com/impact-real-time-computing-systems-part-1/

The Impact of Real-time Computing Systems – Part 2 https://icrunchdatanews.com/impact-real-time-computing-systems-part-2/

ASCOT: a text mining-based web-service http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3339391/

Date post:	09-Feb-2017
Upload:	manoj-vig