DE-IDENTIFIED BIOMETRIC DATA WAREHOUSE
higi SH llc © 2017 – confidential and proprietary
James Rapp, Sr. Software Engineer
Daniel Neems, PhD, Data Science Researcher
What is higi?
higi enables consumers to collect and use their biometric data to act on their health by connecting them to partners that want to motivate action.
Founded in 2012
4.5+ million registered account holders
78% of the U.S. population lives within 5 miles of a higi station
80+ integrated health devices, activity trackers and apps
36.5+ million people have used a higi station
136+ million miles of activity logged
198+ million tests completed across the higi network
50+ retailer banners
The largest self-screening connected network in the world.
Over 11,000 stations nationwide
V 1.0 De-identified Analytics Issues
• Rigid Schema
• Costly
• Brittle
• Poor Performance
V 1.5 Analytics Implementation
Table Storage
Blob Storage
Blob Migration (Cloud Service)
Analytics Processor (Cloud Service)
Hadoop (HDInsight)
Reports (CSV)
Automation via Azure Runbook
V 1.5 De-identified Analytics Issues
• Very costly
• Time-consuming queries
• Brittle
• Rigid report output
• Required dev effort to update
• Hard to debug and maintain
V 2.0 Analytics Goals
• Easy to implement
• Easy to maintain
• Flexible
• Inexpensive
• Performant
V 2.0 Analytics Implementation
Table Storage
Blob Storage
Blob Migration (WebJob, timer trigger)
Blob to S3 Migration (WebJob, event-based)
Snowflake
Reporting (Tableau)
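The final hop of this pipeline, from S3 into Snowflake, could be sketched as below. This is only an illustration, not higi's actual loader: the table name `raw_events`, the stage name `s3_events_stage`, the date-based prefix, and the JSON file format are all assumptions. The sketch builds a Snowflake `COPY INTO` statement as a string, which a client would then execute; it assumes the target table has a single VARIANT column so raw records can land unmodified.

```python
# Hypothetical names throughout: table, stage, and prefix are illustrative only.

def build_copy_statement(table: str, stage: str, prefix: str) -> str:
    """Return a Snowflake COPY INTO statement that loads JSON files from a stage."""
    # Reject anything that is not a plain identifier before interpolating it.
    for name in (table, stage):
        if not name.replace("_", "").replace(".", "").isalnum():
            raise ValueError(f"unexpected identifier: {name!r}")
    return (
        f"COPY INTO {table} "
        f"FROM @{stage}/{prefix} "
        "FILE_FORMAT = (TYPE = 'JSON')"
    )


if __name__ == "__main__":
    # Assumed layout: one S3 prefix per month of migrated blobs.
    print(build_copy_statement("raw_events", "s3_events_stage", "2017/06/"))
```

Keeping the load step this thin matches the V 2.0 goals: the raw data lands as-is, and any reshaping happens later in SQL rather than in deployed code.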
How does Snowflake help?
• Faster
• Scalable
• Reports customizable by business
• Devs deal with raw data and leave the analysis to our data science team
Overview of Data Operations
Greg Rumple, Chief Information Officer
Ross Goglia, MBA, Data Science Product Manager
Daniel Neems, PhD, Data Scientist
Khan Siddiqui, MD, Chief Technology Officer and Chief Medical Officer
Robert Bakos, VP of Software/Engineering
Overview of Data Operations
Data Operations
Data Infrastructure
Data Product
Data Science
BI Tools/Dashboards
Data Studies/Publications
Machine Learning
For External Consumption
Data Architecture (2nd generation)
Data Pipeline in the Cloud
Data Architecture (3rd generation)
Data Warehouse in the Cloud
Snowflake Data Warehouse
User registration and demographics
Health (e.g. kiosk) activities
Fitness activities
Lifestyle activities (e.g. gym)
Nutrition activities/purchases
Challenge creation and joining
Score updates
Achievement earning
Points earning
Reward redemption
Connection of 3rd party devices
Participation in loyalty programs
User data share opt-ins (via API)
Community creation and joining
Friend following
Social activities (e.g. sharing, inviting)
Chatter and commenting/liking
Ad views and clicks
Survey taking
Logins across kiosk/web/mobile
Categories: Activity, Integrations, Content, Gamification, Social
Time Saving Scalability
Query on an X-Small Warehouse
Query on an X-Large Warehouse
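The trade-off between the two warehouse sizes can be illustrated with a little arithmetic. Standard Snowflake warehouses bill 1 credit per hour at X-Small and double with each size step, reaching 16 credits per hour at X-Large. The sketch below assumes a hypothetical query that takes 1,600 seconds on X-Small and speeds up in direct proportion to warehouse size, which real queries may not do; the warehouse name `analytics_wh` is also made up, and per-query billing minimums are ignored.

```python
# Credits per hour for standard Snowflake warehouses double with each size step.
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}


def resize_statement(warehouse: str, size: str) -> str:
    """Build the ALTER WAREHOUSE statement that switches to the given size."""
    if size not in CREDITS_PER_HOUR:
        raise ValueError(f"unknown warehouse size: {size!r}")
    return f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{size}'"


def estimated_credits(runtime_seconds: float, size: str) -> float:
    """Credits consumed by a warehouse of this size running for the given time."""
    return CREDITS_PER_HOUR[size] * runtime_seconds / 3600


# Hypothetical runtime; linear speed-up is an idealized assumption.
xs_runtime = 1600.0
xl_runtime = xs_runtime * CREDITS_PER_HOUR["XSMALL"] / CREDITS_PER_HOUR["XLARGE"]

if __name__ == "__main__":
    print(resize_statement("analytics_wh", "XLARGE"))
    print(f"X-Small: {xs_runtime:.0f} s, {estimated_credits(xs_runtime, 'XSMALL'):.3f} credits")
    print(f"X-Large: {xl_runtime:.0f} s, {estimated_credits(xl_runtime, 'XLARGE'):.3f} credits")
```

Under that idealized assumption the larger warehouse finishes 16 times sooner for the same credit spend, which is the time saving this slide points to.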
Data Science and Machine Learning Approach
Demographic Factor A
Demographic Factor B
Behavioral Factor A
Behavioral Factor B
High User Engagement or Positive Health Outcomes
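One way to relate factors like these to a binary outcome is logistic regression. The slides do not say which model higi uses, so this is only a minimal sketch on synthetic data: the four factor names, the assumed effect sizes, and the rule that generates the "engaged" label are all invented for illustration. It fits the weights with plain batch gradient descent and uses no external libraries.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.5, epochs=300):
    """Fit [bias, w1, ..., wn] by batch gradient descent on the log loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))
            grad[0] += p - y
            for j, v in enumerate(x, start=1):
                grad[j] += (p - y) * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


# Synthetic data only: four made-up factors and a made-up rule for "engaged".
random.seed(0)
FACTORS = ["demographic_a", "demographic_b", "behavioral_a", "behavioral_b"]
TRUE_EFFECTS = [1.0, -1.0, 2.0, -2.0]  # assumed for the toy example
rows = [[random.uniform(-1, 1) for _ in FACTORS] for _ in range(500)]
labels = [1 if sum(c * v for c, v in zip(TRUE_EFFECTS, x)) > 0 else 0 for x in rows]

weights = fit_logistic(rows, labels)

if __name__ == "__main__":
    for name, w in zip(FACTORS, weights[1:]):
        print(f"{name}: {w:+.2f}")
```

The sign of each fitted weight indicates whether the factor pushes toward or away from the outcome. Judging statistical significance would need standard errors, which this sketch does not compute.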
Machine Learning Workflow
Predictive analytics and ML models
• retention and engagement prediction
• health-related prediction
• statistically significant drivers of engagement and behavior
• user cluster analysis
• recommendation engines
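As an illustration of the user cluster analysis item, the sketch below runs a basic k-means on toy two-dimensional points. The algorithm choice is an assumption, since the slides do not name one, and the points are made up; in practice each point would be a vector of per-user features.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, iterations=10):
    """Basic k-means: assign each point to its nearest centroid, then recompute."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Toy data: two visibly separate groups of hypothetical users.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])

if __name__ == "__main__":
    for centre, members in zip(centroids, clusters):
        print(centre, members)
```

Seeding the centroids by hand keeps the example deterministic; a real analysis would normally use several random initializations and scale the features first.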
Thank you!
[email protected]@higi.com