Addressing an Achilles Heel of PCTs: Missing Race and Ethnicity Data in Electronic Health Records Monique L. Anderson, MD Assistant Professor of Medicine Duke Clinical Research Institute Duke University School of Medicine August 28, 2015

Addressing an Achilles Heel of PCTs: Missing Race and Ethnicity Data in Electronic Health Records

Monique L. Anderson, MD

Assistant Professor of Medicine

Duke Clinical Research Institute

Duke University School of Medicine

August 28, 2015

NIH Common Fund Diversity Supplement: Can we examine impact of treatments tested in PCTs by race and ethnicity?

Talk Objectives

1) Discuss the importance of collecting race and ethnicity in PCTs

2) Highlight current challenges with using race and ethnicity data from electronic health records for PCTs

3) Demonstrate imputation methods that could optimize the Collaboratory’s ability to study treatment effect by race and ethnicity

Heterogeneity of Treatment Effect

• RCTs report a single measure of treatment impact, the average treatment effect

• The same treatment can have variable responses in different populations

• HTE is defined as non-random variability in the direction or magnitude of a treatment effect

• HTE or subgroup analyses answer the question, “How likely is a treatment to work for a similar group of individuals?”

http://www.effectivehealthcare.ahrq.gov/ehc/assets/File/Ch_3-User-Guide-to-OCER_130129.pdf.
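The contrast between an average treatment effect and subgroup effects can be sketched with a small calculation. This is a minimal illustration only: the subgroup names and all counts below are hypothetical, and the absolute risk difference is just one possible effect measure.

```python
def risk_difference(events_treated, n_treated, events_control, n_control):
    """Absolute risk difference: event rate on treatment minus event rate on control."""
    return events_treated / n_treated - events_control / n_control


# Hypothetical counts per subgroup:
# (events on treatment, n on treatment, events on control, n on control)
TRIAL = {
    "group_a": (30, 200, 50, 200),
    "group_b": (48, 200, 50, 200),
}


def average_effect(trial):
    """Pool all subgroups and compute one overall risk difference."""
    totals = [sum(counts[i] for counts in trial.values()) for i in range(4)]
    return risk_difference(*totals)


def subgroup_effects(trial):
    """Compute the risk difference separately within each subgroup."""
    return {name: risk_difference(*counts) for name, counts in trial.items()}


if __name__ == "__main__":
    print("average effect:", round(average_effect(TRIAL), 3))
    for name, effect in subgroup_effects(TRIAL).items():
        print(name, round(effect, 3))
```

In this made-up example the pooled estimate sits between two quite different subgroup estimates, which is the situation an HTE analysis is meant to detect.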

Racial HTE in RCTs

• Blacks are more likely to benefit from non-specific vasodilators compared with whites for systolic heart failure (V-HeFT, SOLVD trials).1,2

• Compared with whites, Asians have a higher response rate, longer survival, and greater toxicity from chemotherapy for both non-SCLC and SCLC.3

• Blacks fare worse with a genetically-guided warfarin algorithm compared to a clinically-guided algorithm in the recent Clarification of Optimal Anticoagulation through Genetics (COAG) trial.4

• Compared with whites, blacks have poorer response rates to treatments for Hepatitis C.

• Pegylated IFN with Ribavirin5

• HIV/Hep C Co-infection- Ledipasvir and sofosbuvir 6

• Hispanics have lower response rates and survival for colon cancer compared with whites.7

1. NEJM 2001;344:1351-1357. 2. NEJM 1999;5:178-187. 3. J Thorac Oncol 2009;4(1):37-43. 4. NEJM 2013;369:2283-93. 5. NEJM 2015;373:705-713. 6. NEJM 2004;350:2265-71. 7. Cancer Causes & Control, 2014.

NIH Policies on Minority Population Inclusion and HTE

1993

• NIH Revitalization Act

• Directs the NIH to establish guidelines for inclusion of women and minorities in clinical research

• Established Office of Minority Health and Office of Women’s Health

1994

• NIH Guidelines on The Inclusion of Women and Minorities as Subjects in Clinical Research

• Inclusion of minorities to be addressed in funding proposals and annual progress reports

• Phase III trials must examine HTE where applicable

1997

• OMB standards revised

2000

• Guidelines Updated

• Research Plan, Progress Reports, Competitive Renewal Applications, Final Progress Reports to include plan for subgroup analyses

• Subgroup analyses strongly encouraged in all publication submissions

2001

• NIH Policy on Reporting Race and Ethnicity Data: Subjects in Clinical Research

• OMB revised standards adopted by the NIH

• Inclusion Guidelines Updated to reflect OMB categories

FDA Policies and Guidance for Race and Ethnicity Reporting and HTE Analyses

1988

• Guidelines for the Format and Content of Clinical and Statistical Sections of NDAs

• Emphasized the importance of subgroup analyses; specified race and ethnicity subgroups should be analyzed

1998

• Demographic Rule – Half of NDAs have sufficient analyses

• Sponsors of IND applications to submit annual demographics of enrolled population

• NDA required to submit effectiveness and safety data for demographic subgroups

• Regulation does not apply to devices

2005

• FDA Guidance on Race and Ethnicity Reporting in Clinical Research

• OMB Categories Recommended

2007

• FDAAA 801 - Reporting of Basic Results Mandatory for Applicable Clinical Trial

• Race and ethnicity reporting is optional; age and sex mandatory

2012

• Section 907 FDASIA

• Action Plan released Aug 2014 to improve demographic inclusion, data collection, and analyses

FDASIA Report of Status Collection and Analysis of Race and Ethnicity Data

• Drugs and biologic NDAs all include tabulations and address subset analyses by sex, race, and age

• Whites dominate participation

• Subgroup analyses without sufficient numbers or power to detect differences in most cases

• Devices

• 70% of applications list race/ethnicity

• 20% report race/ethnicity subgroup analyses

• FDA makes data available on 17% of HTE analyses

FDA Report: Collection, Analysis, and Availability of Demographic Subgroup Data for FDA-Approved Medical Products. Aug 2013

CDER: Trial Composition by Race

[Figure: bar chart of percent trial participation (0 to 100) by race (White, Black, Asian, Other) for CDER approved new molecular entities and biologics, 2011: Ticagrelor (ACS), Rivaroxaban (DVT), Azilsartan (HTN), Linagliptin (DM), Indacaterol (COPD), Abiraterone (PC), Telaprevir (Hep C).]

Missing Data in Mini-Sentinel

Pragmatic Clinical Trials - Attractive Option for HTE by Race and Ethnicity?

• Focus on external validity and how interventions work in the real world.

• Draw from health systems serving heterogeneous populations; studies will (with little effort) include more women, elderly, minorities, and low SES populations compared with traditional RCTs.

• Test comparative effectiveness and standard of care practices to determine which are optimal.

Race and Ethnicity in Electronic Health Records

• Low-quality data due to administrator or clinician assignment of race and ethnicity

• Large amounts of missing race and ethnicity data

• Institutional variability in data collection practices

Agreement of Self-Reported versus EHR Race/Ethnicity Data among Veterans Affairs Patients

Boehmer, U. Am J Public Health. 2003

[Figure: bar chart of agreement (0 to 100%) between self-reported race and EHR race, by group: Native American, Asian, African American, Hispanic, Pacific Islander, White, Other. Values shown on the chart: 22.8%, 83.4%, 92.0%, 83.4%, 69.6%, 97.9%.]

Self-Report versus EHR Race and Ethnicity in a FQHC in Cabarrus County: Convenience Sample

• Convenience Sample of 265 patients.

• EHR race was available for 96.4% of sample.

• 32.8% (87/265) of patients did not have agreement between self-report and EHR race.

• Most (n=62) were discrepancies in racial identity among patients of Hispanic ethnicity. EHR race was either unreported or white.

• Of blacks, only 4 were coded as another race.

From Drs. Meredith Nahm and Kristin Newby

Race and Ethnicity Distribution of Health Plan Membership in Kaiser Permanente Southern California

Race                                  Historical members up to     Active members on
                                      May 31, 2011                 January 1, 2009
                                      (n=12,764,185), %            (n=3,323,588), %
White                                 15.1                         25.6
Hispanic                              15.1                         30.1
Black                                 4.2                          7.6
Asian and Pacific Islander            2.9                          6.2
American Indian and Alaska Native     0.1                          0.1
Multiracial                           0.1                          0.2
Other                                 0.9                          1.9
Unknown (missing)                     61.6                         28.3

Derose SF et al. Medical Care Research and Review. 2012:70(3)330-345

Variability in Data Collection of Demonstration Projects

Category Trauma* Proven PPACT TSOS* ICD-Pieces+ LIRE& Anonymous STOP-CRC

Race
White X X X X X X X X
Non-White X
Black or AA X X X X X X X X
Asian X X X X X X X X
NHOPI X X X X X X X
AIAN X X X X X
Multiple Races X
Hispanic X X
Mexican X
Mexican American X
Chicana X
Cuban X
Spanish X
South American X
Indian X
Unknown X X X X X
Other X X X X X

Ethnicity
Hispanic X X X X X

* Combined race and ethnicity format
& LIRE: not all sites collecting the same race categories
+ ICD-Pieces: not all sites collect Hispanic ethnicity

Indirect Estimation for Missing Race Data

• Indirect Estimation for Race and Ethnicity has been encouraged by the Agency for Healthcare Research and Quality and the Institute of Medicine1

• Organizations currently using these data:

• Kaiser-Permanente Geographically Enriched Member Socio-demographics datamart (GEMS)2

• Medicare3

• Health plans (Aetna)

• Several methods developed to estimate missing race data in EHR and administrative records3

• Surname

• Geocoding only

• Bayesian Surname Geocoding

• Bayesian Improved Surname Geocoding

1. IOM. Race, Ethnicity, and Language Data: Standardization for Health Care Quality Improvement. 2009.

2. http://share.kaiserpermanente.org/static/cb_annual report/reports/docs/2011_chapters/cb11_healthy_people.pdf

3. Bonito AJ, et al. Creation of New Race-Ethnicity Codes and SES Indicators for Medicare Beneficiaries. AHRQ Publication No. 08-0029_EF. January 2008.

Duke Medicine Automated Geospatial Infrastructure

EHR

Duke Enterprise Data Warehouse

Addresses are enriched and geocoded using Data Management Studio, USPS, and TomTom

Census-derived sociodemographic variables created and linked to patient record

Patient Registration

TEXTUAL ADDRESS SOURCE DATA

123 Oake Str.
Anytown, NC

· Abbreviations
· Misspellings
· Missing elements

VERIFY: VERIFICATION STATUS DATA

Verification Flag = Yes
Updated on: April 10, 2014

STANDARDIZE: STANDARDIZED ADDRESS DATA

123 OAK STREET
ANYTOWN, NC 12345-4567
DURHAM COUNTY

GEOCODE: GEOCODED DATA

Latitude: 36.008348
Longitude: -78.937205
County FIPS Code: 34567
Block FIPS Code: 345678912345678

Bayesian Improved Surname Geocoding

• Individuals are assigned a set of probabilities for membership in each racial/ethnic group given their surname and place of residence.

• Inputs for calculation:

• the proportion of a selected race given surname

• the proportion of all people in the US who self-report being race i who reside in Census Block Group k

• Data input:

• 2010 Census Data

• 2000 Surname File

• Electronic Health Record (name, address)

• Data output:

• Set of probabilities for 6 races

• BISG race is assigned if a particular probability reaches 0.50.

1. Elliott MN, et al. Health Serv Outcomes Res Method 2009;9:69-83.

2. Derose SF, et al. Medical Care Research and Review 2012;70(3):330-345.
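The calculation described above can be sketched as a Bayes update: the surname-based proportion for each group is multiplied by the share of that group's US population living in the patient's block group, and the products are rescaled to sum to 1. This is a minimal sketch, not the Duke implementation. It assumes surname and residence are independent given race, the function name `bisg_posterior` is hypothetical, and the block-group shares in the example are made-up values (only the ANDERSON surname percentages come from the 2000 Surname File excerpt shown in this deck).

```python
def bisg_posterior(p_race_given_surname, p_blockgroup_given_race):
    """Combine surname and block-group information into P(race | surname, block group).

    p_race_given_surname[r]:    proportion of people with this surname who are race r
    p_blockgroup_given_race[r]: proportion of all people of race r in the US who
                                reside in the patient's Census Block Group
    """
    unnormalized = {
        race: p_surname * p_blockgroup_given_race.get(race, 0.0)
        for race, p_surname in p_race_given_surname.items()
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no race group has nonzero probability for these inputs")
    return {race: value / total for race, value in unnormalized.items()}


# Surname proportions for ANDERSON (percentages from the surname table, divided by 100).
anderson = {
    "white": 0.776, "black": 0.1806, "api": 0.0048,
    "aian": 0.007, "multi": 0.0159, "hispanic": 0.0158,
}

# Hypothetical block-group shares, for illustration only.
example_block_group = {
    "white": 2.0e-6, "black": 3.0e-5, "api": 1.0e-7,
    "aian": 1.0e-6, "multi": 3.0e-6, "hispanic": 1.0e-6,
}

if __name__ == "__main__":
    for race, p in bisg_posterior(anderson, example_block_group).items():
        print(f"{race}: {p:.3f}")
```

With these illustrative inputs, a surname that is mostly white nationally can still yield a majority probability for black when the block group is one where black residents are heavily concentrated, which is the reason for combining the two sources.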

2000 Surname File

Word et al. Demographic Aspects of Surnames for Census 2000

2000 Surname File - Probability of Race/Ethnicity for Supplement Investigators

Name        Rank    prop100k   pctwhite   pctblack   pctapi   pctaian   pct2prace   pcthispanic
ANDERSON    12      282.62     77.6       18.06      0.48     0.7       1.59        1.58
CALIFF      37688   0.21       92.61      3.06       1.62     (S)       (S)         1.26
HERNANDEZ   15      706372     4.55       0.38       0.65     0.27      0.35        93.81

2010 Census Block Group Data Sample Population

Patient   BG-FIPS        TotPop   White   Black   AIAN   Asian   NHOPI   Other   Multi   Hispanic
1         370690604023   1730     539     1149    4      0       0       7       31      63
2         370630020133   2030     1326    409     18     135     1       83      58      198
3         370370201031   3947     2991    323     11     101     2       440     79      849
4         370630018022   1629     205     950     12     15      0       415     32      530
5         370319702002   561      552     0       2      0       0       0       7       0
6         370339305001   1135     630     485     1      0       0       1       18      16
7         371539705002   696      369     256     17     5       0       32      17      63

BISG Probabilities and Race Assignment Examples

BISG assigned race category based on calculated race probability > 0.50

              Bayesian Probabilities
EHR Race      White   Black   AIAN     Asian   Hispanic   Multiple   BISG Imputed Race
Black         0.276   0.706   0.001    0.000   0.006      0.012      Black
White         0.687   0.249   0.005    0.013   0.019      0.027      White
Asian         0.016   0.000   0.000    0.958   0.014      0.012      Asian
Hispanic      0.002   0.007   0.000    0.000   0.99       0.001      Hispanic
White         0.985   0       0.002    0.000   0.000      0.012      White
Black         0.496   0.491   0.0000   0.000   0.002      0.113      Unassigned
Black         0.909   0.041   0.011    0.001   0.021      0.01       White
Unavailable   0.673   0.17    0.044    0.005   0.091      0.016      White

Provided by Duke Enterprise Warehouse, Geospatial Analyst, 8/1/2014
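The assignment rule on this slide (assign the category whose probability exceeds 0.50, otherwise leave the patient unassigned) can be sketched as follows. The function name `assign_race` and the tuple layout are assumptions for illustration; the probability rows are copied from the table above.

```python
RACES = ("White", "Black", "AIAN", "Asian", "Hispanic", "Multiple")


def assign_race(probabilities, threshold=0.50):
    """Return the race whose BISG probability exceeds the threshold, else 'Unassigned'.

    probabilities: six values in the same order as RACES.
    """
    best_index = max(range(len(RACES)), key=lambda i: probabilities[i])
    if probabilities[best_index] > threshold:
        return RACES[best_index]
    return "Unassigned"


# (EHR race, BISG probabilities) rows taken from the examples table.
EXAMPLES = [
    ("Black",    (0.276, 0.706, 0.001, 0.000, 0.006, 0.012)),
    ("Hispanic", (0.002, 0.007, 0.000, 0.000, 0.99, 0.001)),
    ("Black",    (0.496, 0.491, 0.000, 0.000, 0.002, 0.113)),
    ("Black",    (0.909, 0.041, 0.011, 0.001, 0.021, 0.01)),
]

if __name__ == "__main__":
    for ehr_race, probs in EXAMPLES:
        print(f"EHR: {ehr_race:<10} BISG: {assign_race(probs)}")
```

The third row shows why a cut-point leaves some patients unassigned (no probability exceeds 0.50), and the fourth shows a disagreement between EHR race and the imputed race.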

Race and Ethnicity Distribution of Duke University Health Systems

Race and Ethnicity                             Population     Population Excluding Unknown
Unique Patient Records 6/2008-8/20/2015        n=4,604,747    n=3,471,665
White, %                                       54.1           71.6
Black or African-American, %                   16.2           21.4
Asian, %                                       1.3            1.7
American Indian and Alaska Native, %           0.56           0.7
Native Hawaiian or Other Pacific Islander, %   0.06           0.09
Multiracial, %                                 0.26           0.3
Other, %                                       3.0            4.3
Unknown, Unavailable, Null                     24.6           NA
Hispanic Ethnicity, %                          1.6            2.1

Comparison of Patient Race and Ethnicity in Duke and Kaiser Health Systems with US Population

[Figure: bar chart of percent (0 to 90) by race and ethnicity (White, Black, Hispanic, Asian, AIAN, NHOPI, Other, 2 or more) for three series: US Population, Duke Medicine, Kaiser SC.]

Preliminary Analysis of BISG Algorithm in Duke Medicine Patients with Geocoded Address and Common Surname

• Restricted our population to patients with a geocoded address and a surname on the 2000 surname list (2,478,352 patients).

• To determine the accuracy of BISG, we further restricted to Duke patients without missing race and ethnicity data (n=1,985,354).

• Using an initial cut-point of 50%, we assigned a race from BISG probabilities in 81.6% of cases.

• We determined the race distribution of patients with unknown race (n=492,998) in the test population according to the BISG algorithm.

Accuracy Statistics for Patients with Known EHR Race and Ethnicity

                 Sensitivity   Specificity   Positive Predictive Value   Negative Predictive Value
Black            61.0          93.0          75.0                        81.0
White            91.0          65.0          86.0                        75.0
Hispanic         87.0          99.0          56.0                        100.0
Asian            79.0          99.0          77.0                        100.0
AIAN             56.0          100.0         62.0                        100.0
Multiple Races   0.0           100.0         2.0                         100.0
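For readers who want to reproduce this kind of table, the four statistics can be computed per group in a one-versus-rest fashion, treating EHR race as the reference and BISG race as the test. This is a minimal sketch with made-up labels. How unassigned patients were handled in the analysis above is not stated, so counting "Unassigned" as "not in the group" is an assumption, and `one_vs_rest_metrics` is a hypothetical name.

```python
def one_vs_rest_metrics(reference, imputed, group):
    """Sensitivity, specificity, PPV, and NPV for one group versus all others.

    reference: reference labels (here, EHR race), one per patient
    imputed:   imputed labels (here, BISG race); any label other than `group`,
               including "Unassigned", counts as a negative call
    """
    tp = fp = fn = tn = 0
    for ref, imp in zip(reference, imputed):
        if ref == group and imp == group:
            tp += 1
        elif ref == group:
            fn += 1
        elif imp == group:
            fp += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Made-up labels for six patients, for illustration only.
ehr_race = ["Black", "Black", "White", "White", "White", "Hispanic"]
bisg_race = ["Black", "White", "White", "White", "Black", "Hispanic"]

if __name__ == "__main__":
    print(one_vs_rest_metrics(ehr_race, bisg_race, "Black"))
```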

Preliminary Data: Sensitivity/Specificity of BISG for Men and Women

                 Men     Women
Sensitivity
Black            61.0    60.0
White            91.0    91.0
Hispanic         91.0    83.0
Asian            84.0    75.0
AIAN             57.0    55.0
Multiple Races   0.0     0.0
Specificity
Black            93.0    93.0
White            66.0    64.0
Hispanic         99.0    99.0
Asian            99.0    99.0
AIAN             100.0   100.0
Multiple Races   100.0   100.0

SES Index

• Utilized previously validated SES index score1,2

• Created based on measure popularized by Krieger1

• Developed to help understand health and health disparities

• Validated by AHRQ for use in Medicare Data

• SES index - multidimensional construct2 accounting for:

• Wealth - property values

• Income - median household income, % below poverty

• Education - low education, high education

• Housing - crowded households

• Occupation - unemployment

• Assignment of SES index score at block group level

• 211,267 block groups in US

1. Krieger N, et al. Am J Epidemiol. 2003;57(3):186-99.

2. Bonito AJ, et al. Creation of New Race-Ethnicity Codes and SES Indicators for Medicare Beneficiaries. AHRQ Publication No. 08-0029_EF. January 2008.

Integration of SES index score in DEDUCE Research Portal

• SES index1

• Scores then assigned to all patients whose addresses were able to be geocoded

• Range 35-78

• SES index quartiles created for use in research

• SES Q1- 35-48

• SES Q2- 49-51

• SES Q3- 52-55

• SES Q4- 56-78

1. Bonito AJ, et al. Creation of New Race-Ethnicity Codes and SES Indicators for Medicare Beneficiaries. AHRQ Publication No. 08-0029_EF. January 2008.
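Mapping a patient's block-group SES index score to the quartiles listed above is a simple range lookup. A minimal sketch, assuming scores are stored as whole numbers (the cut points on the slide leave no room for fractional values); the function name `ses_quartile` is hypothetical.

```python
# Quartile cut points as listed on the slide: (label, lowest score, highest score).
SES_QUARTILES = (
    ("Q1", 35, 48),
    ("Q2", 49, 51),
    ("Q3", 52, 55),
    ("Q4", 56, 78),
)


def ses_quartile(score):
    """Return the SES quartile label for a whole-number index score.

    Returns None when the score is missing (for example, the address could not
    be geocoded) or falls outside the 35-78 range.
    """
    if score is None:
        return None
    for label, low, high in SES_QUARTILES:
        if low <= score <= high:
            return label
    return None


if __name__ == "__main__":
    for score in (35, 49, 55, 78, None):
        print(score, ses_quartile(score))
```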

Preliminary Data: Accuracy of BISG Assigned Race and Ethnicity among SES subgroups

              SES1   SES2   SES3   SES4
Sensitivity
White         72     89     93     97
Black         85     63     45     20
AIAN          80     47     10     100
Asian         62     65     70     83
Hispanic      94     92     86     73
Multiracial   0      0      0      0
Specificity
White         87     66     50     40
Black         77     90     95     99
AIAN          99     100    100    100
Asian         100    100    100    99
Hispanic      99     99     99     99
Multiracial   100    100    100    100

Distribution of Duke Cohort after assigning BISG to Patients with Unknown Race

Race and Ethnicity                        EHR Study Population   BISG Imputation of Unknown EHR R/E   EHR+BISG
Unique Patient Records 6/2008-8/20/2015   n=2,478,352            n=492,998                            n=2,478,352
White, %                                  57.2                   67.7                                 68.0
Black or African-American, %              19.2                   15.1                                 21.5
Asian, %                                  1.6                    5.2                                  2.5
American Indian and Alaska Native, %      0.5                    0.9                                  0.7
Multiracial, %                            0.3                    0.03                                 0.3
Hispanic, %                               1.3                    11.2                                 3.0
Unknown, Unavailable, Null                19.9                   NA                                   3.9

Limitations

• BISG imputation helps, but is not perfect

• No data on surnames occurring fewer than 100 times.

• Hispanic is listed as a race; Asian and Pacific Islanders combined.

• If block group or surname missing, can’t use BISG but other methods available.

• Slightly lower accuracy with women for some race groups.

• We compared BISG to EHR assigned race; EHR assigned race may not represent self-report in an unknown number of cases.

• Surname list was created in 2005 based on 2000 census data; unclear if a new surname list will be released in 2015.

Next Steps and Ongoing Work

• Over 6-9 months, we will work to optimize use of indirect imputation strategies in Duke EHR.

• We will explore methods to build upon BISG imputation to improve accuracy and will provide measures of certainty for imputed race and ethnicity.

• Over next 12 months, we will create a toolkit for use of indirect estimators in health systems PCTs (race/ethnicity and SES).

• Goal is to work directly with 1-2 PCTs to impute missing race and ethnicity data, as well as provide SES index data.

• Simulation modeling to optimize detection of racial HTE when using electronic health system data

Conclusions

• In summary, we have a unique opportunity to learn more about how treatments in PCTs may differ for minorities.

• Efforts to examine treatment effect by race and ethnicity may be hampered by large amounts of missing race and ethnicity data.

• We have shown acceptable accuracy for large minority groups with BISG and can reduce missing data significantly.

• Reduced accuracy of BISG imputation as SES increases for blacks and Hispanics

• While imputation is a viable interim fix, engaging health systems in long-term solutions to improve data quality is necessary.

Acknowledgements

• Adrian Hernandez, MD

• Robert Califf, MD

• Sohayla Pruitt, MA

• Kingshuk Roy Choudhury, PhD

• Yuliya Lokhnygina, PhD

• Meredith Nahm, PhD

• Judy Stafford, MS

• Darcy Louzao, PhD

• Tammy Reece, Cheri Janning, Liz Wing, Jonathan McCall

Research reported in this presentation was supported by the Common Fund Research Supplements To Promote Diversity In Health Related Research under Award Number 3U54AT007748-02S1 and the Health Care Systems Research Collaboratory Coordinating Center under Award Number 1U54AT007748-01 from the National Center for Complementary and Integrative Health, a center of the National Institutes of Health. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Additional Slides

Adler-Milstein et al. Health Affairs 2014;33(9):1664-1671

2009 AHRQ Report: Quality of Health Care in US

Socioeconomic Status (SES)

• Paucity of data on the availability of SES data in the EHR

• Experience in Duke EHR

• Years of Education– 0% of the time

• Occupation- 0.56% of the time

• About 2500 patients out of >4,400,000 have this data collected.

• Paucity of availability of data in RCTs

• Increasing use of geocoded neighborhood-level SES variables in observational studies

• More recently, the use of SES data within Medicare

Socioeconomic Index Score

Construct    Measure                 Definition
Occupation   Unemployment            Percentage of persons aged 16 years or older in the labor force who are unemployed (and actively seeking work)
Income       Below US Poverty Line   Percentage of persons below the federally defined poverty line
Income       Median Income           Median household income
Wealth       Property Values         Median value of owner-occupied homes
Education    Low Education           Percentage of persons aged >25 years with less than a 12th-grade education
Education    High Education          Percentage of persons aged >25 years with at least 4 years of college
Housing      Crowded households      Percentage of households containing one or more person per room

Implementation of SES Index within Duke Medicine Health System’s Research Infrastructure

Strategies to Improve Quality and Completeness

• Collecting data at a patient’s first visit

• Offering routine staff training

• Incorporating questions into existing admission forms

• Development and enforcement of hospital policies regarding data collection

• Availability of a frequently asked questions and answers document for staff

• For patients, much more receptive to “we are collecting this information to improve the care of all patients”

Gomez. J Health Care Poor Underserved 2014;25:1384-1396

