Linking Large Datasets: Why, How, and What Not To Do
Bradley G Hammill
Duke Clinical Research Institute
Presenter disclosure information
Bradley G Hammill
Linking Large Datasets: Why, How, and What Not To Do
FINANCIAL DISCLOSURE: None
UNLABELED/UNAPPROVED USES DISCLOSURE: None
Acknowledgements
Thanks to:
Lesley Curtis
Adrian Hernandez
Gregg Fonarow
Kevin Schulman
Work initially funded by grant from GSK
Why link Medicare data to registry data?
Typical inpatient registry:
Medications
Vitals
Lab results
Procedures
Clinical history
In-hospital events
etc.
Long-term follow-up?
Why link Medicare data to registry data?
Potential endpoints
Mortality
Readmission
Procedure
Adverse events (based on diagnoses)
Inpatient
Mortality (or censoring)
Why not link Medicare data to registry data?
Linking will not help us address the limitations of either data source
Medicare
No information on VA hospitals or managed care patients
Selective coverage under age 65
Registries
Voluntary participation
May over-represent certain regions or hospital types
Data quality varies
How to link Medicare data with registry data
Direct identifiers
Name, address, SSN, date of birth, etc.
Goal: Identify each registry patient in the Medicare data
Indirect identifiers
Service dates, date of birth (or age), sex
Goal: Identify each registry hospitalization in the Medicare data
Linking registry data to Medicare claims
Step 1. Subset registry data
Step 2. Subset Medicare data
Step 3. Link hospital identifiers
Step 4. Link hospitalization records
Described in:
Hammill BG, Hernandez AF, Peterson ED, Fonarow GC, Schulman KA, Curtis LH. Linking inpatient clinical registry data to Medicare claims data using indirect identifiers. Am Heart J 2009 June;157(6):995-1000.
You will have this conversation [Episode 1]
Me: You know, we can link these data to Medicare.
Adrian: How? We don’t know who the hospitals or the patients are.
Me: Turns out you don’t really need to know those things.
[Brief explanation of how to link]
Adrian: (flustered) This feels like a giant leap of faith.
[Figure series: Percent of unique records within sites, 2007 Medicare HF records, by identifier combination: Admit; Admit + Discharge; Admit + Discharge + DOB; Admit + DOB; Admit + Discharge + 2/3 DOB; Admit + Discharge + Age; Admit + Discharge + Age + Sex; Admit + Age + Sex; Admit ±1d + Discharge + Age + Sex; Admit + Discharge + Age ±1y + Sex]
Distinguishing records (DOB available)
Within sites, what percent of 2007 Medicare HF records are unique given…

Variables                              Unique
Admit, Discharge, DOB, Sex             >99.9%
Admit, DOB, Sex                        >99.9%
Discharge, DOB, Sex                    >99.9%
Admit, Discharge, 2/3 DOB, Sex         99.9%
Admit, Discharge, DOB                  >99.9%
Distinguishing records (Age available)
Within sites, what percent of 2007 Medicare HF records are unique given…

Variables                              Unique
Admit, Discharge, Age, Sex             99.4%
Admit, Discharge ±1d, Age, Sex         98.5%
Admit ±1d, Discharge, Age, Sex         98.4%
Admit, Discharge, Age ±1y, Sex         98.3%
Admit, Discharge, Age                  98.9%
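
The uniqueness percentages on these slides can be reproduced in spirit with a few lines of code. The following is a minimal sketch, not the presenter's implementation: the field names (site, admit, discharge, age, sex) and the toy records are hypothetical, and a record counts as "unique" when no other record at the same site shares its values on the chosen variables.

```python
from collections import Counter

def percent_unique(records, keys):
    """Percent of records whose values on `keys` are unique within their site."""
    def signature(rec):
        # The site is always part of the signature: uniqueness is judged within sites.
        return (rec["site"],) + tuple(rec[k] for k in keys)

    counts = Counter(signature(r) for r in records)
    unique = sum(1 for r in records if counts[signature(r)] == 1)
    return 100.0 * unique / len(records)

# Toy records (hypothetical values, not real claims data)
records = [
    {"site": "A", "admit": "2007-01-03", "discharge": "2007-01-07", "age": 78, "sex": "F"},
    {"site": "A", "admit": "2007-01-03", "discharge": "2007-01-07", "age": 78, "sex": "M"},
    {"site": "A", "admit": "2007-01-03", "discharge": "2007-01-09", "age": 81, "sex": "F"},
    {"site": "B", "admit": "2007-01-03", "discharge": "2007-01-07", "age": 78, "sex": "F"},
]

pct_admit = percent_unique(records, ["admit"])
pct_no_sex = percent_unique(records, ["admit", "discharge", "age"])
pct_full = percent_unique(records, ["admit", "discharge", "age", "sex"])
```

In the toy data, adding variables raises the share of unique records, which is the pattern the slides report for the real 2007 records.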
Distinguishing records, in general
2007 HF Records per Site

Population               Median (Q1, Q3)
All records              456 (194, 1734)
Heart failure, any       89 (22, 391)
Heart failure, primary   64 (20, 168)
CABG procedure           71 (36, 124)
ICD / CRT procedure      19 (6, 50)

Fewer records per site = Higher % unique records
Linking registry data to Medicare claims
Step 1. Subset registry data
Limit to records for patients 65 years or older
Step 2. Subset Medicare data
Step 3. Link hospital identifiers
Step 4. Link hospitalization records
Example registry data to be used for linking
OPTIMIZE-HF population
Adults hospitalized for episodes of new or worsening heart failure
2003–2004
52,879 records from 255 sites overall
39,178 records for patients 65+ (74% of total)
Linking registry data to Medicare claims
Step 1. Subset registry data
Step 2. Subset Medicare data
Limit to records for patients 65 years or older
Limit using similar entry criteria as registry, if possible
Step 3. Link hospital identifiers
Step 4. Link hospitalization records
Example Medicare data to be used for linking
Medicare inpatient population
Hospitalizations with a diagnosis of HF in any position (ICD-9-CM Dx 428.x, 402.x1, 404.x1, 404.x3)
2003–2004
Age 65+
5.5 million records from >5,000 sites overall
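
The diagnosis-based subset described above can be expressed as a small code filter. This is a sketch under stated assumptions: the function names are hypothetical, and it assumes codes may arrive either with or without the decimal point (for example "428.0" or "4280"), so the point is stripped before matching.

```python
import re

# ICD-9-CM heart failure codes from the slide, decimal point removed:
#   428.x          -> 428 followed by one or two digits
#   402.x1         -> 402, any digit, then 1
#   404.x1, 404.x3 -> 404, any digit, then 1 or 3
HF_PATTERN = re.compile(r"^(428\d{1,2}|402\d1|404\d[13])$")

def is_hf_code(code):
    """True if a single ICD-9-CM diagnosis code is one of the HF codes."""
    return bool(HF_PATTERN.match(code.replace(".", "").strip()))

def has_hf_diagnosis(dx_codes):
    """True if HF appears in any diagnosis position of a hospitalization."""
    return any(is_hf_code(c) for c in dx_codes)

# Hypothetical claim with three diagnosis positions
example_claim = ["486", "42822", "25000"]
example_is_hf = has_hf_diagnosis(example_claim)
```

Checking every diagnosis position, not just the primary one, mirrors the "in any position" criterion on the slide.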
Linking registry data to Medicare claims
Step 1. Subset registry data
Step 2. Subset Medicare data
Step 3. Link hospital identifiers
Link records on exact values of all fields (service dates, date of birth, sex)
Use resulting matches to inform links
Step 4. Link hospitalization records
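
Step 3 can be sketched as a tally: for each registry site, count how many of its records match a Medicare record exactly on all fields, grouped by Medicare site. The code below is an illustration only. The field names, the toy data, and the min_matches cutoff are hypothetical; the slides say only that the resulting matches "inform" the site links, so the final decision is left to the analyst.

```python
from collections import Counter, defaultdict

FIELDS = ("admit", "discharge", "dob", "sex")  # assumed field names

def key(rec):
    return tuple(rec[f] for f in FIELDS)

def site_match_counts(registry, medicare):
    """For each registry site, count exact-match records per Medicare site."""
    # Index Medicare records by their exact field values.
    medicare_sites_by_key = defaultdict(set)
    for m in medicare:
        medicare_sites_by_key[key(m)].add(m["site"])

    counts = defaultdict(Counter)
    for r in registry:
        for msite in medicare_sites_by_key.get(key(r), ()):
            counts[r["site"]][msite] += 1
    return counts

def likely_sites(tally, min_matches):
    """Medicare sites with at least min_matches exact matches (hypothetical cutoff)."""
    return [site for site, n in tally.most_common() if n >= min_matches]

# Toy data: registry site 1 corresponds to Medicare site "A";
# site "E" shares one record by chance.
registry = [
    {"site": 1, "admit": "2003-05-01", "discharge": "2003-05-04", "dob": "1930-02-11", "sex": "F"},
    {"site": 1, "admit": "2003-06-10", "discharge": "2003-06-15", "dob": "1925-09-30", "sex": "M"},
    {"site": 1, "admit": "2003-07-02", "discharge": "2003-07-03", "dob": "1934-12-01", "sex": "F"},
]
medicare = [dict(r, site="A") for r in registry] + [dict(registry[0], site="E")]

counts = site_match_counts(registry, medicare)
candidates = likely_sites(counts[1], min_matches=2)
```

A true site pairing should stand out with many exact matches, while chance pairings collect only a handful.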
OPTIMIZE-HF sample site link results

                 Using DOB                        Using Age
OPTIMIZE Site    Medicare Site   Exact Matches    Medicare Site   Exact Matches
1                A               105              A               114
                 E               1                K               7
                 F               1                L               6
                                                  1217 others     5
2                B               589              B               631
                 G               2                M               28
                 40 others       1                N               28
                                                  3420 others     26
3                C               29               C               32
                 D               25               D               27
                 H               1                O               4
                 I               1                938 others      3
4                --              --               P               4
                                                  Q               4
                                                  541 others      3
OPTIMIZE-HF site link results
Of 255 registry sites…
247 (97%) identified in Medicare
All non-VA sites with 25+ records identified
Linking registry data to Medicare claims
Step 1. Subset registry data
Step 2. Subset Medicare data
Step 3. Link hospital identifiers
Step 4. Link hospitalization records
Determine rules to apply
Decide if one-to-one correspondence needed
Go!
Get follow-up data from Medicare
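
Step 4 can be sketched as a sequence of rules applied from strictest to loosest within linked sites. The code below is a hedged illustration, not the published algorithm: the rule order follows the combinations listed in this talk, "2/3 DOB" is interpreted here as at least two of the three date-of-birth parts (year, month, day) agreeing, and the record layout, site_map, and the choice to accept only unambiguous one-to-one matches are assumptions.

```python
from datetime import date

def dob_parts_matching(a, b):
    """Number of date-of-birth parts (year, month, day) that agree."""
    return (a.year == b.year) + (a.month == b.month) + (a.day == b.day)

def same(fields):
    """Build a rule requiring exact agreement on every field in `fields`."""
    return lambda r, m: all(r[f] == m[f] for f in fields)

RULES = [  # applied in order, strictest first
    ("admit+discharge+dob+sex", same(("admit", "discharge", "dob", "sex"))),
    ("admit+dob+sex", same(("admit", "dob", "sex"))),
    ("discharge+dob+sex", same(("discharge", "dob", "sex"))),
    ("admit+discharge+2/3dob+sex",
     lambda r, m: same(("admit", "discharge", "sex"))(r, m)
     and dob_parts_matching(r["dob"], m["dob"]) >= 2),
    ("admit+discharge+dob", same(("admit", "discharge", "dob"))),
]

def link_records(registry, medicare, site_map):
    """Return {registry id: (medicare id, rule name)} with one-to-one links."""
    links, used = {}, set()
    for name, rule in RULES:
        for r in registry:
            if r["id"] in links:
                continue
            candidates = [m for m in medicare
                          if m["id"] not in used
                          and m["site"] in site_map.get(r["site"], ())
                          and rule(r, m)]
            if len(candidates) == 1:  # accept only unambiguous matches
                links[r["id"]] = (candidates[0]["id"], name)
                used.add(candidates[0]["id"])
    return links

def rec(id_, site, admit, discharge, dob, sex):
    return {"id": id_, "site": site, "admit": admit,
            "discharge": discharge, "dob": dob, "sex": sex}

# Toy data (hypothetical)
registry = [
    rec("r1", 1, date(2003, 5, 1), date(2003, 5, 4), date(1930, 2, 11), "F"),
    rec("r2", 1, date(2003, 6, 10), date(2003, 6, 15), date(1925, 9, 30), "M"),
    rec("r3", 1, date(2003, 7, 2), date(2003, 7, 3), date(1934, 12, 1), "F"),
]
medicare = [
    rec("m1", "A", date(2003, 5, 1), date(2003, 5, 4), date(1930, 2, 11), "F"),
    rec("m2", "A", date(2003, 6, 10), date(2003, 6, 16), date(1925, 9, 30), "M"),
    rec("m3", "A", date(2003, 7, 2), date(2003, 7, 3), date(1934, 12, 10), "F"),
    rec("m4", "Z", date(2003, 5, 1), date(2003, 5, 4), date(1930, 2, 11), "F"),
]
links = link_records(registry, medicare, site_map={1: {"A"}})
```

The nested scan is quadratic and only suitable for illustration; at the scale described in this talk, records would need to be indexed by site and service date first.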
OPTIMIZE-HF hospitalization link results
Of 39,178 eligible registry hospitalizations…
31,753 (81%) identified in Medicare
25,964 unique patients

Combinations used                      Records Identified
Admit, Discharge, DOB, Sex             24,750 (86%)
Admit, DOB, Sex                        1,171 (4%)
Discharge, DOB, Sex                    590 (2%)
Admit, Discharge, 2/3 DOB, Sex         2,258 (7%)
Admit, Discharge, DOB                  284 (1%)
You will have this conversation [Episode 2]
Me: This is done using deterministic matching.
Adrian: No, that’s clearly probabilistic matching.
Me: Actually, it’s not.
Adrian: Sure it is. We didn’t have names or SSNs.
Deterministic v. Probabilistic Linking
Deterministic
Rule-based
The rule determines the result
Probabilistic
Based on statistical theory
Characteristics assigned weights and potential links are scored
Data-based score threshold determines the result
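
For contrast with the rule-based approach used in this talk, the sketch below shows what probabilistic scoring typically looks like, using the common Fellegi-Sunter style of log-likelihood weights. This is not the method used for the OPTIMIZE-HF links. The m- and u-probabilities (the chance a field agrees for a true match and for a non-match) and the threshold are made-up values; in practice they would be estimated from the data.

```python
import math

# Hypothetical (m, u) probabilities per field:
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
PARAMS = {
    "admit": (0.95, 0.01),
    "discharge": (0.95, 0.01),
    "dob": (0.97, 0.001),
    "sex": (0.99, 0.5),
}
THRESHOLD = 10.0  # hypothetical; normally chosen from the score distribution

def match_score(a, b):
    """Sum of agreement and disagreement weights across fields."""
    score = 0.0
    for field, (m_prob, u_prob) in PARAMS.items():
        if a[field] == b[field]:
            score += math.log2(m_prob / u_prob)              # agreement weight
        else:
            score += math.log2((1 - m_prob) / (1 - u_prob))  # disagreement weight
    return score

def is_link(a, b):
    return match_score(a, b) >= THRESHOLD

# Toy records (hypothetical)
reg = {"admit": "2003-05-01", "discharge": "2003-05-04", "dob": "1930-02-11", "sex": "F"}
exact = dict(reg)
dob_differs = dict(reg, dob="1931-02-11")

score_exact = match_score(reg, exact)
score_dob_differs = match_score(reg, dob_differs)
```

Here the score and the threshold decide the outcome; in the deterministic approach, the rule itself decides.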
You will have this conversation [Episode 3]
Me: (excited) We were able to link 75% of the eligible records!
Adrian: Golly, that seems low.
Me: It’s about what I expected.
Adrian: But [another registry] said they linked 98%.
Why might registry records not link to Medicare?
[Figure: Sample site. All HF patients, divided into those linked to Medicare and those not linked to Medicare]
Why might registry records not link to Medicare?
In Medicare claims, but…
Inconsistent coding of procedures or diagnoses
Inconsistent service dates or patient info
Not in Medicare claims due to…
Medicare as secondary payer
Medicare managed care enrollment
Age
VA hospital (site-level)
You will have this conversation [Episode 4]
Adrian: The registry didn’t capture [obesity, anemia, etc.]. Now we can use prior claims to get that information.
Me: We’re going to lose a bunch of patients if we try that.
Adrian: But it’s so worth it.
Me: Maybe not for that particular information, though.
Other uses of Medicare data
Utilizing claims prior to registry hospitalization
Requires prior enrollment in Medicare FFS
8% of OPTIMIZE-HF patients did not have 12 months of prior claims
Inpatient data only can be limiting
Need to understand coding limitations
e.g. Anemia is poorly coded
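
The prior-enrollment requirement can be checked before any look-back into earlier claims. The sketch below assumes a hypothetical layout in which each patient has a set of (year, month) pairs with fee-for-service coverage, and it interprets "12 months of prior claims" as coverage in each of the 12 calendar months before the month of the index admission. Both the layout and that interpretation are assumptions.

```python
from datetime import date

def prior_months(index_date, n=12):
    """The n calendar months immediately before the month of index_date."""
    months = []
    year, month = index_date.year, index_date.month
    for _ in range(n):
        month -= 1
        if month == 0:  # roll back into December of the previous year
            year, month = year - 1, 12
        months.append((year, month))
    return months

def has_prior_ffs_coverage(ffs_months, index_date, n=12):
    """True if every one of the n prior months appears in ffs_months."""
    return all(ym in ffs_months for ym in prior_months(index_date, n))

# Hypothetical patient admitted on 15 March 2004
admit = date(2004, 3, 15)
full_coverage = set(prior_months(admit))
gap_coverage = full_coverage - {(2003, 7)}

eligible_full = has_prior_ffs_coverage(full_coverage, admit)
eligible_gap = has_prior_ffs_coverage(gap_coverage, admit)
```

Patients who fail this check would drop out of any analysis that depends on prior claims, which is the loss of patients raised in Episode 4.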
You will have this conversation [Episode 5]
Adrian: I want to validate our registry with these links.
Me: You can’t easily do that with these data.
Adrian: Sure we can, because now we know which Medicare patients are in the registry.
Me: True, but that’s not the whole story.
Validation issues
If you start with the registry population…
You usually do not know exactly who you should find in Medicare claims data
Cannot validate VA sites
Cannot validate managed care patients
Cannot validate younger patients
Assumes all “linkable” records were linked
Validation issues
If you start with the Medicare population…
You usually do not know exactly who you should find in registry data
Physician groups may be the registry participants, not hospitals
Assumes all “linkable” records were linked
Registry may have allowed sampling at larger sites
Do you want to link data with Medicare?
Important caveats
Acquisition requires major investment in claims data and infrastructure
Use of Medicare claims data governed by strict data use agreements (DUA)
Delays in data release are common
[Currently available through 2008]
Why stop at inpatient Medicare data?
Medicare data
Inpatient
Outpatient / Physician
Pharmacy
Mortality (or censoring)
Why stop with Medicare claims data?
Other claims data sources exist
Private insurer databases
But more difficult as smaller % of site hospitalizations covered

Payer                                Age 18-64   Age 65+
Medicare                             15%         89%
Medicaid                             20%         1%
Private                              48%         8%
Other (incl. self-pay, charity)      17%         2%
[Source: 2007 HCUP NIS, excluding maternal/neonate-related admissions]
Conclusion
You can link your registry to Medicare claims
Get long-term follow-up for registry patients 65+ enrolled in fee-for-service Medicare
However…
Manage expectations
Understand claims data limitations