Date post: | 21-Jan-2018 |
Category: |
Technology |
Upload: | perficient-inc |
The Role of Data Lakes
in Healthcare
About Perficient
Perficient is the leading digital transformation
consulting firm serving Global 2000 and enterprise
customers throughout North America.
With unparalleled information technology, management consulting,
and creative capabilities, Perficient and its Perficient Digital agency
deliver vision, execution, and value with outstanding digital
experience, business optimization, and industry solutions.
Perficient Profile
Founded in 1997
Public, NASDAQ: PRFT
2016 revenue $487 million
Major market locations: Allentown, Atlanta, Ann Arbor, Boston, Charlotte,
Chattanooga, Chicago, Cincinnati, Columbus, Dallas,
Denver, Detroit, Fairfax, Houston, Indianapolis, Lafayette,
Milwaukee, Minneapolis, New York City, Northern California,
Oxford (UK), Southern California, St. Louis, Toronto
Global delivery centers in China and India
3,000+ colleagues
Dedicated solution practices
~95% repeat business rate
Alliance partnerships with major technology vendors
Multiple vendor/industry technology and growth awards
Speaker Introductions
Juliet Silver, Director, Enterprise Strategy, Healthcare
Juliet provides strategic thought leadership and leverages her more than 20 years
of healthcare industry, management consulting and technology experience to
support healthcare clients in the realization of their strategic vision.
Jill Corcoran, Senior Technical Architect, Healthcare
Jill has more than 20 years of consulting experience focused on helping clients
solve complex business challenges by providing enterprise, data and business
intelligence architectural solutions that transform the way they think about,
organize, and leverage their data.
Healthcare Data Lake Concepts
Data Lakes in Healthcare
What
A Data Lake, as originally coined, is designed to
hold raw data assets of varied types as they are
received from their sources. Typically the lake is
stored in a Hadoop ecosystem with minimal (if any)
change to the original format and no content
integration or enhancement of the source data.
Why
Healthcare organizations are attracted to the
concept of a data lake as it allows for in-depth
analysis of patient outcomes, fraud, waste and
abuse, R&D for drugs and DME, and clinical trials.
How
A Data Lake offers schema-on-read access to large
amounts of widely varied information that can be
loaded and accessed rapidly. This allows skilled
data scientists to uncover hidden correlations,
obscure patterns, disease trends, and more.
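As a rough illustration of schema-on-read, the sketch below keeps raw records exactly as received and applies a schema only when a reader asks for the data. The records, field names, and the `read_with_schema` helper are hypothetical and are not taken from the presentation; a Hadoop-based lake would use its own tooling for this step.

```python
import json

# Raw events land in the lake exactly as received; no schema is enforced on write.
# The records and field names below are hypothetical.
RAW_LINES = [
    '{"patient_id": "p1", "type": "lab", "test": "LDL", "value": "131"}',
    '{"patient_id": "p2", "type": "note", "text": "Pt reports smoking 1 ppd"}',
    '{"patient_id": "p1", "type": "lab", "test": "LDL", "value": "118", "unit": "mg/dL"}',
]


def read_with_schema(lines, schema):
    """Apply a schema at read time: keep records that contain every field in
    the schema, cast each field with its converter, and skip the rest."""
    rows = []
    for line in lines:
        record = json.loads(line)
        if not all(field in record for field in schema):
            continue  # this record does not fit the reader's schema
        rows.append({field: cast(record[field]) for field, cast in schema.items()})
    return rows


# One reader's view of the raw data: numeric lab results.
LAB_SCHEMA = {"patient_id": str, "test": str, "value": float}
# Another reader's view of the same raw data: free-text notes.
NOTE_SCHEMA = {"patient_id": str, "text": str}

lab_rows = read_with_schema(RAW_LINES, LAB_SCHEMA)
note_rows = read_with_schema(RAW_LINES, NOTE_SCHEMA)
print(lab_rows)
print(note_rows)
```

The same raw lines serve both readers, and neither schema had to exist when the data was loaded.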
The Need for a Data Lake in Healthcare
“Do we need an enterprise data warehouse, a data lake,
or both as part of our overall data architecture?”
• A Data Lake provides the ability to manage the fluid data requirements of
contemporary healthcare organizations as they attempt to rapidly analyze large
volumes of data in batch or real-time from an extensive range of sources in a
variety of formats.
• An enterprise data warehouse provides the strategy-driven, non-volatile
transformed data used to run day-to-day operations and make informed business
decisions based on known processes and thoroughly vetted data leveraging more
traditional reporting, visualization, and analytics.
Data Lake Traits
• Time to value in data delivery is accelerated
• Uses various tools which apply “schema-on-read”
• Introduces and reuses tools and processes that
improve search and general knowledge of the data
content
• Designed for low-cost storage for large data
volumes
• Is highly agile and reconfigurable
Healthcare Data Lake
Use Cases
• Genomic analytics used by health plans
• Improved clinical trials
• Predictive healthcare costs
• Member/Patient 360° view
• Billing opportunities in unstructured text
• Psychographic prescriptive modeling
Use Case: Genomic Analytics Used by Insurers
The Genetic Information Nondiscrimination Act of 2008 (GINA) protects Americans from
discrimination based on their genetic information in both health insurance and employment.
Even so, we have access to the largest-ever collection of human protein-coding genetic
variants (over 10 million variants), from the Exome Aggregation Consortium (ExAC). The challenge
for healthcare is not how to use genomics data but how to deal with the massive amount of it.
Use Case: Improved Clinical Trials
The analysis and design of clinical trials can discover drug combinations with significant
improvements for overall survival and toxicity. Using these statistical models we can develop
optimization models that select treatment regimens that can be tested in clinical trials, based
on the totality of data available on existing.
Existing models can be expanded upon by using published research as an external source
of data during clinical trials.
Use Case: Predictive Healthcare Costs
The data you thought would be useful … was not
• 113 candidate predictors from structured and unstructured data sources
• Structured data was less reliable than unstructured data, which increased the reliance on unstructured data
Unexpected indicators emerged from unstructured content
• Increased the value of the Predictive Model
• 18 accurate indicators or predictors
Predictor Analysis          % Encounters, Structured Data    % Encounters, Unstructured Data
Ejection Fraction (LVEF)    2%                               74%
Smoking Indicator           35% (65% Accurate)               81% (95% Accurate)
Living Arrangements         <1%                              73% (100% Accurate)
Drug and Alcohol Abuse      16%                              81%
Assisted Living             0%                               13%
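The predictor comparison above contrasts how often an indicator is present in structured fields versus free text. A minimal sketch of that kind of measurement follows, assuming a regular expression is enough to pull an ejection fraction out of a note. The encounters, the pattern, and the helper names are hypothetical, and the resulting percentages are not the figures from the slide.

```python
import re

# Hypothetical encounters: a structured LVEF field (often empty) plus a free-text note.
ENCOUNTERS = [
    {"id": 1, "lvef": None, "note": "Echo today. LVEF 35%. Continue beta blocker."},
    {"id": 2, "lvef": 55.0, "note": "Routine follow-up, no complaints."},
    {"id": 3, "lvef": None, "note": "Ejection fraction estimated at 40 %."},
    {"id": 4, "lvef": None, "note": "Patient lives alone; denies tobacco use."},
]

# Matches "LVEF", "EF", or "ejection fraction", then the first percentage
# that follows within a short distance.
LVEF_PATTERN = re.compile(
    r"\b(?:LVEF|EF|ejection fraction)\b[^0-9%]{0,30}(\d{1,2})\s*%",
    re.IGNORECASE,
)


def lvef_from_note(note):
    """Return the ejection fraction mentioned in a note, or None if absent."""
    match = LVEF_PATTERN.search(note)
    return float(match.group(1)) if match else None


def coverage(encounters, getter):
    """Fraction of encounters for which the getter finds a value."""
    found = sum(1 for e in encounters if getter(e) is not None)
    return found / len(encounters)


structured_cov = coverage(ENCOUNTERS, lambda e: e["lvef"])
unstructured_cov = coverage(ENCOUNTERS, lambda e: lvef_from_note(e["note"]))
print(f"structured: {structured_cov:.0%}, unstructured: {unstructured_cov:.0%}")
```

A pattern like this only measures how often the indicator can be found; judging its accuracy, as the slide does, would still require comparison against reviewed records.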
Use Case: Member/Patient 360° View
• Improve decision making
• Enhance patient experience
• Provide a greater opportunity
for improved outcomes
• Improve profitability for both the
provider and the health plan
• Reduce unnecessary and
inefficient processes and procedures
When applied across a large
population of patients you can:
• Predict disease outbreaks
• Identify preventative care
• Develop cures for diseases that touch
specific demographics or patient population segments
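One way to picture a 360° view is as records from separate sources grouped under a shared member identifier. The sketch below assumes every source already carries a common `member_id`, which in practice usually requires identity matching first. The source names, fields, and the `build_360_view` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical extracts from three separate sources, all keyed by member_id.
CLAIMS = [
    {"member_id": "m1", "claim_id": "c10", "amount": 250.0},
    {"member_id": "m1", "claim_id": "c11", "amount": 90.0},
    {"member_id": "m2", "claim_id": "c12", "amount": 1200.0},
]
ENCOUNTERS = [
    {"member_id": "m1", "date": "2017-03-02", "reason": "follow-up"},
]
WEARABLES = [
    {"member_id": "m2", "date": "2017-03-01", "steps": 4200},
]


def build_360_view(**sources):
    """Group records from each named source under the member they belong to."""
    # Every member starts with an empty list per source, so missing data is explicit.
    view = defaultdict(lambda: {name: [] for name in sources})
    for name, records in sources.items():
        for record in records:
            view[record["member_id"]][name].append(record)
    return dict(view)


members = build_360_view(claims=CLAIMS, encounters=ENCOUNTERS, wearables=WEARABLES)
total_m1 = sum(claim["amount"] for claim in members["m1"]["claims"])
print(sorted(members))
print(total_m1)
```

Keeping an empty list for sources with no records makes gaps in a member's picture visible rather than silently absent.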
Use Case: Billing Opportunities in Unstructured Text
• The analysis of unstructured data can provide
significant opportunities for more complete and
fairer billing practices. This information is held
by providers and payers but rarely reviewed as
the amount of detail is overwhelming. Using
keyword searches across vast amounts of data
quickly produces meaningful insight.
• Transcripts of physicians’ notes show pre- and
post-procedure exam tests, labs, and related
minor procedures that were performed but not billed
• A large U.S. health plan compensated on a per-patient
basis discovered co-morbidities that allowed it to
apply risk adjustments to segments of its patient
population
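The keyword-search idea described above can be sketched as follows: scan each note for terms that map to billable items and report the ones with no matching billing record. The keyword map, notes, and billing data are hypothetical, and any hit is only a candidate for human review, not a confirmed missed charge.

```python
import re

# Hypothetical keyword -> billing item map; a real one would come from the
# organization's own coding references.
KEYWORDS = {
    "ekg": "electrocardiogram",
    "chest x-ray": "chest radiograph",
    "cbc": "complete blood count",
}

# Hypothetical transcribed notes, one per encounter.
NOTES = [
    {"encounter": "e1", "text": "Pre-op EKG and CBC obtained. Procedure uneventful."},
    {"encounter": "e2", "text": "Post-procedure chest x-ray clear."},
]

# Items that were actually billed per encounter (hypothetical).
BILLED = {"e1": {"electrocardiogram"}, "e2": set()}


def unbilled_items(notes, billed, keywords):
    """Return, per encounter, the items mentioned in the note but not billed."""
    result = {}
    for note in notes:
        text = note["text"].lower()
        mentioned = {
            item
            for keyword, item in keywords.items()
            if re.search(r"\b" + re.escape(keyword) + r"\b", text)
        }
        missing = mentioned - billed.get(note["encounter"], set())
        if missing:
            result[note["encounter"]] = sorted(missing)
    return result


candidates = unbilled_items(NOTES, BILLED, KEYWORDS)
print(candidates)
```

Plain keyword matching ignores negation ("no EKG performed") and abbreviations outside the map, so it is a starting point rather than a complete text-analytics approach.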
Use Case: Psychographic Prescriptive Modeling
Adding psychographic data from patient healthcare records (PHR) can provide considerable insight into
additional disease risk factors. One example is the Framingham Heart Study: with more than 1,000
published medical papers related to it, it is one of the most widely known evidence-based studies. One of
its key discoveries was that heart disease is affected not only by measurable factors (such as blood pressure
and cholesterol) but also by demographic factors (age, gender, and race) and psychographic factors (values,
attitudes, and lifestyles).
[Figures: Basic Framingham Analysis; Predictor Importance]
Designing and Developing
the Data Lake
Stocking the Data Lake
Provider Data Lake Healthcare Sources
Provider Sources
• Patient Records
• Physician Notes
• Digital Images
• Medical Device
• Financials
• Health Info Sys
External Sources
• Health Plan
• Gov’t Agencies
• Accountable Care Orgs
• Geo-Political
• Wearables
• Research
System Sources
• Security
• Log Data
• Metadata
Web Sources
• Social Media
• Email & Chat
• Web Content
Payer Data Lake Healthcare Sources
Provider Sources
• Provider
• Network Financials
Health Plan
• Claim
• Encounter
• Member
• Marketing
• Rx Claim
System Sources
• Security
• Log Data
• Metadata
Web Sources
• Social Media
• Email & Chat
• Web Content
External Sources
• Gov’t Agencies
• Accountable Care Orgs
• Geo-Political
• Wearables
• Genomic
• PHR / PGHD
• Research
• Survey
• Standard Codes
Big Data Landscape Components
The Enterprise Data Landscape
Introducing Hadoop to the Enterprise Data Landscape
Best Practices
Assessment
• Genuine need based on the 4 Vs
• Understanding the ‘Big’ Picture
• Mature metadata procedures in place
• Active governance with majority participation
Planning
• Executive suite backing and participation
• Fully vetted use cases
• Staffing and training plan
– Infrastructure Architect
– Big Data Architect
– Data Scientist
Implementation
• Start slow in digestible portions (usable POC)
• Employ technical project management
• Maintain strong scope management
• Small set of very skilled users for initial deployment
• Bring all data that will answer the questions
Summary
• Data Lakes deliver the power to share data and rapidly explore, discover, and
predict patterns of risk, cost, and improved outcomes and engagement
• Provides the foundation for research and ad-hoc data science to occur across
a variety of large volume data sets
• Integral to evidence-based care and clinical genetics programs
– Need for genomics data
– Pedigree data
– Personal health information
– Geo data sets
– Psychographic data
• Requires advanced data management and data science skill sets
• Should be governed through an information and data governance structure
– Sets use case and data priorities
– Oversees data risk, security, and compliance
Questions
Type your question into the chat box
Next up:
[Webinar] Harness the Power of Cloud to Drive
Business Innovation – Tuesday, April 25th
[Webinar] Modernize Core Technology to Accelerate
Digital Transformation – Tuesday, May 23rd
Follow Us Online
• Perficient.com/SocialMedia
• Facebook.com/Perficient
• Twitter.com/Perficient_HC
• Blogs.perficient.com/healthcare
Thank You