DATA SHARING AT HUBZERO
Ann Christine Catlin
Senior Research Scientist
ICT4D Conference
May 28, 2015
COLLECT
EXPLORE
DISCOVER
What is HUBzero?
Open source software platform
used for building
“Science Gateways”
“Collaboratories”
“Hubs”
Platform for scientific research & education
730,877 324,560 nanoHUB.org
422,031 137,460 nees.org
65,389 8,645 HABRIcentral.org
63,419 23,725 GlobalHUB.org
62,918 7,468 ciHUB.org
53,422 26,432 pharmaHUB.org
51,771 11,267 molecularHUB.org
42,710 5,052 iemhub.org
40,234 9,817 vhub.org
37,957 9,416 cceHUB.org
36,588 4,386 PURR
34,879 3,315 DiaGrid.org
~1,800,000 visitors total
60+ Hubs for many disciplines
HUBzero Support at Purdue
Support/Ops, Web/Databases, Analytics, Middleware, High Performance Computing, Rappture Toolkit
groups
projects
hundreds of tools
a hundred databases
more than ten thousand resources
What’s on HUBzero Hubs?
• COLLABORATIVE DATABASES
• Working together with researchers as partners
• Building systems that move research forward
• Improving data technologies with every project
• SELF-SERVE DATABASES
• Researchers building their own databases
• Bringing data sharing to a world of hub users
• Always working to make it easier & more powerful
Data sharing solutions
• COLLECTION
• Clinical, reconnaissance, experiment … data gathering
• Samples, specimens, photos, drawings, reports … annotation & tracking
• Data from external databases
• Data from hospital devices
• Forms, spreadsheets, table creation
• Data processors
• Authorization, access rules
• HIPAA, FOIA
• Databases and repositories
Data technology requirement #1
• EXPLORATION
• View
• Browse
• Navigate
• Search
• Filter
• Link
• Graph
• Map
• Report
• Analyze
• Audit
• Export
Statistics, Authorization, Performance, Dashboards, …
Data technology requirement #2
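The exploration verbs above (search, filter, sort, export) can be illustrated with a minimal sketch. It is not HUBzero code: the record fields (`site`, `year`, `samples`) and the helper names `explore` and `export_csv` are hypothetical, chosen only to show the idea on a tiny in-memory dataset.

```python
import csv
import io

# Hypothetical records standing in for rows of a shared database.
RECORDS = [
    {"site": "Lab A", "year": 2013, "samples": 40},
    {"site": "Lab B", "year": 2014, "samples": 12},
    {"site": "Lab C", "year": 2014, "samples": 95},
]


def explore(records, predicate, sort_key):
    """Filter records with a predicate, then sort them by one field."""
    return sorted((r for r in records if predicate(r)), key=lambda r: r[sort_key])


def export_csv(records, columns):
    """Export the chosen columns of the records as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Filter to one year, sort by sample count, export two columns.
rows_2014 = explore(RECORDS, lambda r: r["year"] == 2014, "samples")
csv_text = export_csv(rows_2014, ["site", "samples"])
```

A real hub would run these operations against a database and enforce access rules first; the sketch only shows how filter, sort, and export compose.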
• What kind of DATA ?
Big data (100 million records), small data (2 experiments from a lab), complex data, simple spreadsheets, repositories (>500,000 files or just 5)
• What kind of COMMUNITIES ?
Global community of researchers, labs at a university, a single graduate student, a government agency, hospital pharmacists across the USA, collaborating research groups on 2 continents, the public
Search, Sort, Filter, Link, Map, Analyze, Graph, Explore
Finding Answers … Discovering
Making Data Powerful
• DataStore
• DataStore lite
• DataStore “library of the future”
• Management
• Forms
• Repository annotation
• Custom data processors
• Dataviewer
Data technology building blocks
WHO ARE THE COLLABORATORS?
WHAT DOES THE DATA LOOK LIKE?
>100 HOSPITALS
> 570 PHYSICIANS
IN 32 COUNTRIES
Thymic malignancy: an orphan disease
UPLOAD SPREADSHEET
55 columns of clinical data
VALIDATE & CURATE DATA
easy auditing and filtering
EXPLORE AND ANALYZE DATA
search, share, export
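The upload, validate, and curate steps above can be sketched as a small row checker. This is an assumption-laden illustration, not the actual database code: the three column names, the allowed stage values, and the function `validate_rows` are hypothetical stand-ins for the 55 clinical columns.

```python
import csv
import io

# Hypothetical per-column rules; each returns True when the value is acceptable.
RULES = {
    "patient_id": lambda v: v.strip() != "",
    "age_at_diagnosis": lambda v: v.isdecimal() and 0 <= int(v) <= 120,
    "stage": lambda v: v in {"I", "II", "III", "IV"},
}


def validate_rows(csv_text):
    """Split uploaded rows into accepted rows and an audit list of problems.

    Each audit entry is (line_number, [names of columns that failed]).
    """
    accepted, issues = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        bad = [col for col, ok in RULES.items() if not ok(row.get(col) or "")]
        if bad:
            issues.append((line_no, bad))
        else:
            accepted.append(row)
    return accepted, issues


# Hypothetical upload: one clean row and three rows with one problem each.
SAMPLE = (
    "patient_id,age_at_diagnosis,stage\n"
    "P001,54,II\n"
    ",61,III\n"
    "P003,abc,IV\n"
    "P004,47,V\n"
)

accepted, issues = validate_rows(SAMPLE)
```

Keeping rejected rows in an audit list, rather than discarding them, is what makes later curation and filtering possible: a contributor can see which line and which column need attention.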
Development of the International Thymic Malignancy Interest Group International Database:
An Unprecedented Resource for the Study of a Rare Group of Tumors
James Huang, MD, Usman Ahmad, MD, Alberto Antonicelli, MD, Ann Christine Catlin, Wentao Fang, MD, Daniel Gomez, MD, Patrick
Loehrer, MD, Marco Lucchi, MD, Edith Marom, MD, Andrew Nicholson, MD, Enrico Ruffini, MD, William Travis, MD, Paul Van Schil, MD,
Heather Wakelee, MD, Xiaopan Yao, PhD, Frank Detterbeck, MD; on behalf of the International Thymic Malignancy Interest Group
International Database Committee and Contributors
Journal of Thoracic Oncology, Volume 9, Number 10. October, 2014
Global retrospective database
Global prospective database
Longitudinal with > 200 data elements per episode
Searchable repositories: annotated images
WHO ARE THE COLLABORATORS? WHAT DOES THE DATA LOOK LIKE?
control intervention
> 300
data elements
Demographics
Behavior
Instruments
Clinical
Laboratory
Adherence
Disclosure Status
Child
Caregiver
Health Provider
clinics
Pediatric HIV disclosure intervention
Collaboration in a resource-limited setting
Ensuring data quality
Sankofa Pediatric HIV Disclosure Intervention Cyber Data Management:
Building Capacity in a Resource-limited Setting and Ensuring Data Quality
Ann Christine Catlin, Sumudinie Fernando, Ruwan Gamage, Lorna Renner, Sampson Antwi, Jonas Kusah Tettey, Kofi Aikins Amisah, Tassos Kyriakides, Xiangyu Cong, Nancy Reynolds, and Elijah Paintsil, on Behalf of the Sankofa Project Team
AIDS Care. DOI 10.1080/09540121.2015.1023246. http://dx.doi.org/10.1080/09540121.2015.1023246
Sankofa: A Multi-Site Collaboration on Pediatric HIV in Ghana
Nancy R. Reynolds, Angela Ofori-Atta, Margaret Lartey, Lorna Renner, Sampson Antwi, Anthony Enimil, Ann Christine Catlin, Sumudinie Fernando, Tassos C. Kyriakides, Elijah Paintsil
AIDS. In press. 2015
Sankofa: making progress …
AMPATH & OpenMRS: understanding data
Disaster and Failure Studies Databases
Exploring and investigating data
College of Pharmacy SafeRX database
Infusion Pump Informatics
smart pumps
infusions
medications limit libraries
alerts
actions taken
Investigation and comparative analytics
Medication safety improvement
• clickable charts & reports
• workflows
• metrics
• pivots
• analytics, comparison, drilldown, deep dive
• peaks, trends, patterns
• high harm index
• 100 hospitals: setting standards for alert analysis
• 4 vendors: setting data standards for vendors
• link to patient outcomes
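The pivots and cross-hospital comparisons listed above can be illustrated with a short sketch. Everything here is assumed for illustration: the record layout (hospital, medication, action taken), the sample values, and the helpers `pivot_counts` and `override_rate` are hypothetical and do not reflect the SafeRX schema or its metrics.

```python
from collections import defaultdict

# Hypothetical alert records: (hospital, medication, action_taken).
ALERTS = [
    ("Hospital A", "heparin", "override"),
    ("Hospital A", "heparin", "reprogram"),
    ("Hospital A", "insulin", "override"),
    ("Hospital B", "heparin", "override"),
    ("Hospital B", "insulin", "cancel"),
    ("Hospital B", "insulin", "reprogram"),
]


def pivot_counts(records, row_index, col_index):
    """Count records in a two-way table keyed by two tuple positions."""
    table = defaultdict(lambda: defaultdict(int))
    for rec in records:
        table[rec[row_index]][rec[col_index]] += 1
    return {row: dict(cols) for row, cols in table.items()}


def override_rate(records, hospital):
    """Fraction of one hospital's alerts that ended in an override."""
    mine = [r for r in records if r[0] == hospital]
    if not mine:
        return 0.0
    return sum(1 for r in mine if r[2] == "override") / len(mine)


# Alerts per hospital and medication, then one comparable metric per hospital.
by_hospital_and_drug = pivot_counts(ALERTS, 0, 1)
rates = {h: override_rate(ALERTS, h) for h in ("Hospital A", "Hospital B")}
```

Normalizing to a rate rather than a raw count is what makes hospitals of different sizes comparable, which is the point of comparative analytics across systems.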
Comparative analytics of infusion pump data across multiple hospital systems
Catlin, Ann Christine, Malloy, William, Arthur, Karen, Gaston, Cynthia, Young, James, Fernando, Sudheera R., Fernando, Ruchith
American Journal of Health-system Pharmacy. Vol 72 Feb 15, 2015 DOI: 10.2146/ajhp140424
Using informatics to improve medical device safety and system thinking
Witz, Steve, Buening, Natalie, Catlin, Ann Christine, Malloy, William, Kindsfater, Julie, Walroth, Todd, Washington, Alana, Zink, Richard
AAMI Horizon, Sep 2;48 Suppl 2:38-43, 2014 DOI: 10.2345/0899-8205-48.s2.38
23,806 charts and reports generated in 2013-2014
by 80 clinical pharmacists, nurses, medication safety analysts
Medication safety improvement … IMPACT
Self-serve databases … DIBBs style
Medical or scientific data to share?
https://datacenterhub.org/
Library of the future … DIBBs style
Exploration and discovery for YOUR data
HUBzero can help you with that
https://hubzero.org