Page 1: Making Large Data Sets Work for You Advantages and Challenges Lesley H Curtis Soko Setoguchi Bradley G Hammill.

Making Large Data Sets Work for You: Advantages and Challenges

Lesley H Curtis

Soko Setoguchi

Bradley G Hammill

Page 2:

Presenter disclosure information

Lesley H Curtis

Large Data Sets: An Overview

FINANCIAL DISCLOSURE: None

UNLABELED/UNAPPROVED USES DISCLOSURE: None

Page 3:

Agenda

Large Data Sets: An Overview

Prescription Drug Data: Advantages, Availability, and Access

Linking Large Data Sets: Why, How, and What Not to Do

Practical Examples

Page 4:

Which large data sets?

Relevant for cardiovascular research

Available to researchers

Potential for linkage

Claims data: federal and commercial

Inpatient registries

Longitudinal cohort studies

Page 5:

Claims data

Derived from payment of bills

Payor-centric

Examples
  Medicare
  Medicaid
  Thomson-Reuters
  United Health Care

Page 6:

Medicare claims data

Inpatient services (Part A)

Outpatient services (Part B)

Physician services (Carrier, Part B)

Durable medical equipment

Home health care

Skilled nursing facilities

Hospice

Page 7:

Medicare claims data elements

What data are available
  Demographics
  Service dates
  Diagnoses
  Procedures
  Hospital / Physician

What data are not available
  Physiological measures
  Test results
  Times of admission, procedures, etc.
  Medications

Page 8:

Medicare claims data coverage

National scope

What patients will be represented?
  Patients enrolled in traditional (fee-for-service) Medicare

What patients will not be represented?
  Patients receiving care through the Veterans Health Administration
  Patients enrolled in Medicare managed care plans

Page 9:

Medicare claims data quality

Main point
  Reliability of specific claims data elements depends on importance for reimbursement

Good data on…
  Major procedures
  Hospitalizations
  Mortality

Inconsistent data on…
  Comorbidities and illness severity
  Procedures with low reimbursement rates

Page 10:

Acquiring CMS claims data

All requests begin with ResDAC (www.resdac.umn.edu)

Cost
  $15K per year of inpatient + denominator data
  $20K per year of 5% data across all files
  $30K+ per year of data for custom requests

Detailed approval process
  Prepare request packet for ResDAC review (4-6 weeks)
  Review by CMS privacy board (4 weeks)
  Request processed by contractor (6-8 weeks)
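
The approval stages above can be added up to estimate the total wait before data arrive. The sketch below is illustrative only: it assumes the three stages run strictly one after another with no gaps or overlap, which the slide does not state, and the names `STAGES` and `total_weeks` are made up for this example.

```python
# Stage durations in weeks as (minimum, maximum), taken from the slide.
# Assumption: stages are sequential, with no overlap and no idle time between them.
STAGES = {
    "ResDAC review of request packet": (4, 6),
    "CMS privacy board review": (4, 4),
    "Contractor processing": (6, 8),
}


def total_weeks(stages):
    """Return the (minimum, maximum) total wait in weeks across all stages."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high


if __name__ == "__main__":
    low, high = total_weeks(STAGES)
    print(f"Estimated wait from request packet to data: {low}-{high} weeks")
```

Under that sequential assumption, the listed stages alone account for roughly three to four months, which is one concrete reason to manage expectations.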

Page 11:

Preparing for CMS claims data

Make space
  16 GB for 100% denominator and inpatient files
  57 GB for 5% denominator, inpatient, outpatient, and carrier* files

Manage expectations
  Time to process files
  Transforming raw claims into usable information
    Coding algorithms
    Coding changes
  Learning curve

Page 12:

The Learning Curve

CMS Data Publications (Curtis, Hammill, et al)

[Chart] Values by year: 2005: 0; 2006: 0; 2007: 1; 2008: 9; 2009: 16; 2010: 28

Page 13:

Claims data

Derived from payment of bills

Payor-centric

Examples
  Medicare
  Medicaid
  Thomson-Reuters
  United Health Care

Page 14:

Commercial claims data elements

What data are typically available
  Demographics
  Service dates
  Diagnoses
  Procedures
  Medications
  Hospital / Physician

What data may not be available
  Physiological measures
  Test results

Page 15:

Commercial claims data coverage

National scope

What patients will be represented?
  Individuals who are commercially insured

What patients will not be represented?
  The uninsured
  Medicare managed care?

Page 16:

Commercial claims data quality

Similar to Medicare claims data
  Reliability of specific claims data elements depends on importance for reimbursement

Good data on…
  Major procedures
  Hospitalizations

Inconsistent data on…
  Mortality
  Comorbidities and illness severity
  Procedures with low reimbursement rates

Page 17:

Preparing for commercial claims data

Cost
  $25-70K depending on size, scope of data request

Size
  100 GB per year of data
  Analysis sample sizes will differ from advertised sample sizes

Manage expectations!

Page 18:

Registry data

Observational cohorts of patients undergoing specific treatments or having specific conditions

Purpose may be to assess…
  Quality of care
  Provider performance
  Treatment safety/effectiveness

Of interest today are hospital-based registries

Page 19:

OPTIMIZE-HF registry

Hospital-based quality improvement program and internet-based registry for heart failure.

2002-2005: 50,000 patients; > 250 hospitals

Transitioned to GWTG-HF in 2005

Page 20:

Registry data coverage

Only patients treated at participating hospitals will be included
  + All patients at these hospitals included regardless of payor
  - Participating hospitals may not be representative of hospitals nationwide

% of group in selected states

State          US Elderly   Medicare FFS   OPTIMIZE-HF
California     10.1%        7.7%           13.8%
Florida        7.4%         7.0%           8.7%
Michigan       3.4%         4.0%           9.5%
New York       6.6%         6.1%           3.5%
Pennsylvania   5.2%         4.4%           6.7%
Texas          6.0%         6.5%           5.4%
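
One way to read the table above is to divide each state's share of the registry by its share of Medicare fee-for-service patients: a ratio above 1 suggests the state is over-represented in OPTIMIZE-HF, and below 1 under-represented. The following is a minimal sketch using only the six rows from the slide. The ratio is a descriptive convenience chosen for this example, not a method described in the presentation, and `SHARES` and `representation_ratios` are hypothetical names.

```python
# Percent of each group located in the state, copied from the slide:
# (US elderly, Medicare FFS, OPTIMIZE-HF)
SHARES = {
    "California": (10.1, 7.7, 13.8),
    "Florida": (7.4, 7.0, 8.7),
    "Michigan": (3.4, 4.0, 9.5),
    "New York": (6.6, 6.1, 3.5),
    "Pennsylvania": (5.2, 4.4, 6.7),
    "Texas": (6.0, 6.5, 5.4),
}


def representation_ratios(shares):
    """Map each state to (registry share) / (Medicare FFS share)."""
    return {state: registry / ffs for state, (_, ffs, registry) in shares.items()}


if __name__ == "__main__":
    ratios = representation_ratios(SHARES)
    # List states from most over-represented to most under-represented.
    for state, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
        label = "over" if ratio > 1 else "under"
        print(f"{state:<13} {ratio:.2f}  ({label}-represented vs. Medicare FFS)")
```

On these figures, Michigan's registry share is more than twice its Medicare fee-for-service share, while New York's is well under its fee-for-service share, which illustrates the slide's point that participating hospitals may not mirror hospitals nationwide.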

Page 21:

Registry data quality

Good data on…
  Many of the things not included in Medicare data:
    Labs, medications, treatment timing, process measures, contraindications (if collected)

Inconsistent data on…
  Post-hospitalization follow-up care
  Outcomes, particularly long-term

Page 22:

Accessing registry data

Networking and partnering
  Many require that analyses be performed at selected analytical centers, which may have long queues

Approval process via steering or executive committee

Page 23:

NHLBI longitudinal cohort studies

Atherosclerosis Risk in Communities Study (ARIC)

Cardiovascular Health Study (CHS)

Framingham Heart Study

Jackson Heart Study

Multi-Ethnic Study of Atherosclerosis

Women’s Health Initiative

Page 24:

Cardiovascular Health Study (CHS)

Prospective, observational study of CV disease in the elderly (Washington Co., MD; Forsyth Co., NC; Sacramento Co., CA; and Pittsburgh, PA).

Baseline exams occurred in 1989-90.

Minority cohort added at Year 5

Annual exams, with ‘major’ exams occurring at year 5 (1992-93) and year 9 (1996-97). Last exam was year 11 (1998-99).

5,201 participants at baseline plus 687 additional minority participants, for a total of 5,888

Page 25:

Cardiovascular Health Study data elements

What data are available
  Demographics
  Medical, personal history
  Physiological measures, test results
  QOL, depression
  Cognitive function

What data are not available
  Service dates
  Procedures
  Hospital/physician

Page 26:

Cardiovascular Health Study data quality

Main point
  Data collected are of high quality

Good data regarding…
  Cardiovascular risk factors
  Cardiovascular endpoints
  General health

Limited data on…
  Non-cardiovascular risk factors
  Non-cardiovascular endpoints

Page 27:

Accessing NHLBI cohort studies

Via the NHLBI data repository
  HIPAA identifiers, geography removed

Via Coordinating Center for identifiable data

Size
  20 MB per year of data

Page 28:

NHLBI-Medicare linked data sets

CMS linked with…
  CHS (1991-2004, 2005-2009 pending)
  Framingham (2000-2009 pending)
  Jackson Heart Study (2000-2009 pending)
  Multi-Ethnic Study of Atherosclerosis (2000-2009 pending)
  Atherosclerosis Risk in Communities
  Women’s Health Initiative

Page 29:

Conclusion

Large data sets abound

Do yourself a favor… manage expectations!

Page 30:

Contact Information

Lesley Curtis

[email protected]
