Post on 12-Jun-2020
For peer review only
Validity of ICD-9-CM Codes for Breast, Lung, and Colorectal Cancers in Three Italian Administrative Healthcare
Databases: A Diagnostic Accuracy Study Protocol
Journal: BMJ Open
Manuscript ID bmjopen-2015-010547
Article Type: Protocol
Date Submitted by the Author: 16-Nov-2015
Complete List of Authors:
Abraha, Iosief; Regional Health Authority of Umbria, Health Planning Service
Serraino, Diego; IRCCS Centro di Riferimento Oncologico Aviano (IT), Epidemiology and Biostatistic Unit
Giovannini, Gianni; Regional Health Authority of Umbria, Health Planning Office
Stracci, Fabrizio; University of Perugia, Public Health Department
Casucci, Paola; Regional Health Authority of Umbria, Health ICT Service
Alessandrini, Giuliana; Regional Health Authority of Umbria, Health ICT Service
Bidoli, Ettore; IRCCS Centro di Riferimento Oncologico, Epidemiology and Biostatistic Unit
Chiari, Rita; Azienda Ospedaliera Perugia, Oncology
De Giorgi, Marcello; Regional Health Authority of Umbria, Health ICT Service
Franchini, David; Regional Health Authority of Umbria, Health ICT Service
Vitale, Maria Francesca; Registro Tumori Regione Campania, ASL NA3 Sud
Fusco, Mario; ASL NA3 Sud, Registro Tumori Regione Campania
Montedori, Alessandro; Regional Health Authority of Umbria, Health Planning Service
Primary Subject Heading: Health services research
Secondary Subject Heading: Oncology, Public health
Keywords: administrative database, validating ICD-9 codes, breast, lung and colorectal cancers
For peer review only - http://bmjopen.bmj.com/site/about/guidelines.xhtml
BMJ Open: first published as 10.1136/bmjopen-2015-010547 on 25 March 2016. Downloaded from http://bmjopen.bmj.com/ on June 19, 2020 by guest. Protected by copyright.
Validity of ICD-9-CM Codes for Breast, Lung, and Colorectal
Cancers in Three Italian Administrative Healthcare
Databases: A Diagnostic Accuracy Study Protocol
Iosief Abraha, Diego Serraino, Gianni Giovannini, Fabrizio Stracci, Paola Casucci, Giuliana
Alessandrini, Ettore Bidoli, Rita Chiari, Marcello De Giorgi, David Franchini, Maria Francesca
Vitale, Mario Fusco, Alessandro Montedori
Author affiliations:
Health Planning Service, Regional Health Authority of Umbria, Perugia, Italy
Iosief Abraha
Gianni Giovannini
Alessandro Montedori
Regional Health Authority of Umbria, Health ICT Service, Perugia, Italy
Paola Casucci
Giuliana Alessandrini
Marcello De Giorgi
David Franchini
Centro di Riferimento Oncologico Aviano, Epidemiology and Biostatistic Unit, Aviano, Italy
Diego Serraino
Ettore Bidoli
Registro Tumori Regione Campania, ASL NA3 Sud, Brusciano (Na), Italy
Mario Fusco
Maria Francesca Vitale
Public Health Department University of Perugia, Perugia, Italy
Fabrizio Stracci
Dipartimento di Oncologia, Azienda Ospedaliera Perugia, Perugia, Italy
Rita Chiari
Corresponding author:
Iosief Abraha, iosief_a@yahoo.it; iabraha@regione.umbria.it
Health Planning Service
Regional Health Authority of Umbria
Via Mario Angeloni 61
06124 Perugia, Italia
Tel. +39 075 504 5251
Fax. +39 075 504 5569
Abstract
Introduction Administrative healthcare databases are useful tools to study healthcare outcomes and to monitor
the health status of a population. Patients with cancer can be identified through disease-specific
codes, prescriptions and physician claims, but prior validation is required to achieve an accurate
case definition. The objective of this protocol is to assess the accuracy of International
Classification of Diseases 9th Revision – Clinical Modification (ICD-9-CM) codes for breast, lung,
and colorectal cancers in identifying patients diagnosed with the respective disease in three Italian
administrative databases.
Methods and analysis Data from the administrative databases of Umbria (910,000 residents), Napoli (1,170,000 residents)
and Friuli-Venezia Giulia (1,227,000 residents) will be considered. The following ICD-9-CM codes will
function as index tests: 233.0 and 174.0-174.9 for breast cancer, 162.0-162.9 for lung cancer, and
153.0-153.9 and 154.8 for colorectal cancer. Randomly selected clinical charts will function as
reference standards. Data abstractors will be trained appropriately and agreement between pairs of
abstractors will be assessed. We will consider two types of algorithms: (a) cancer-specific
algorithms that combine ICD-9-CM diagnosis codes (e.g., malignant neoplasm: 174.0) with
procedure codes (e.g., mastectomy: 85.41-85.48); and (b) generic algorithms that include chemotherapy or
radiotherapy codes, the position of the diagnosis (principal or secondary), and variation in
the time window between two diagnoses. For each algorithm, sensitivity, specificity, and predictive
values will be calculated.
Dissemination Study results will be disseminated widely through peer-reviewed publications and presentations at
national and international conferences.
Article focus
Validation of administrative database diagnosis codes is necessary for health outcome research.
The aim of this protocol is to evaluate the validity of the International Classification of Diseases-
9th Revision – Clinical Modification (ICD-9-CM) codes for breast, lung and colorectal cancers in
three administrative databases.
Strengths and limitations of this study
This study will be the first to validate ICD-9-CM codes of three cancers in three large
administrative databases in Italy.
This study will evaluate the best combination of algorithms through which to identify the disease
of interest.
Validation studies of administrative data codes are setting-specific, and their results are not
generalizable to other settings.
Introduction
As computer technology continues to advance, administrative databases are expanding in
numerous healthcare settings worldwide. These databases anonymously store data about the
healthcare assistance that patients receive, including birth, death, and disease treatment. Usually,
the diagnosis of a disease is associated with a specific code from the International Classification
of Diseases, 9th Revision (ICD-9) or 10th Revision (ICD-10). The ICD is designed to map
health conditions to corresponding generic categories together with specific variations 1. The
merging of individual patient data from administrative databases with other sources (e.g.,
prescription and laboratory data) makes it possible to investigate a wide range of relevant and often
unique public health questions 2, to monitor population health status over time and to perform
population-based pharmacoepidemiological research 2.
To constitute a reliable source for research studies, adequate validation of administrative healthcare
databases is mandatory. While non-clinical information in healthcare databases, such as
demographic and prescription data, is highly accurate 3 4, the validity of registered diagnoses and
procedures is variable 4 5. Determining the accuracy of the latter two categories of clinical
information is vitally important to all potential users and involves confirming the consistency of
information within the databases with the corresponding clinical records of patients 3.
In Italy, all the regional health authorities maintain large healthcare information systems containing
patient data from all hospital and territorial sources. These databases have the potential to address
important issues in post-marketing surveillance 6 7, epidemiology 8, quality performance and health
services research 9. However, there is a concern that their considerable potential as a source of
reliable healthcare information has not been realized since they have not been widely validated. A
systematic review of ICD-9 code validation in Italian administrative databases 10 reported that only a
few regional databases have been validated for a limited number of ICD-9 codes of diseases
including stroke 11 12, gastrointestinal bleeding 13, thrombocytopenia 14, epilepsy 15, infection 16,
chronic obstructive pulmonary disease 17 18, Guillain-Barré syndrome 19, and cancer 20 21. In
addition, the use of these databases was scarce, as only six administrative databases served as
sources for published research articles based on the validated ICD-9 codes. Hence, it is imperative
that regional health authorities systematically validate their databases for critical diseases to
productively use the information they contain.
Breast, colorectal and lung cancers are the most commonly diagnosed neoplasms worldwide, as
well as in Italy22. Consequently, they generate interest in the scientific community and industry as
targets for the development of new drugs, and for governments, given that they are an important
cause of public health and economic burden. For example, variation in the epidemiology of breast23,
colorectal 24 25 and lung 26 cancers, treatment (pharmacological or surgical) administered to patients
suffering from these cancers and potential clinical and economic outcomes27-29, can all be evaluated
using validated administrative databases.
The objective of the present protocol is to evaluate the accuracy of the ICD-9-CM codes related
to breast, lung and colorectal cancers in correctly identifying the respective diseases using three large
Italian administrative healthcare databases.
Methods
Setting and data source
Administrative databases
The target administrative databases are those of the Umbria Region (910,000 residents), Local Health
Unit 3 of Napoli (1,170,000 residents) and the Friuli-Venezia Giulia Region (1,227,000 residents).
Since the early 1990s, all of these administrative databases have collected information from
patient medical records of public and private hospitals, including demographics, hospital
admission and discharge dates, vital statistics, the admitting hospital department, the principal
diagnosis and up to 5 secondary discharge diagnoses, and the principal and up to 5 secondary
surgical and diagnostic procedures. In addition, these databases contain the records of all drug
prescriptions listed in the National Drug Formulary and the basic characteristics of patients’
physicians. Each resident has a unique national identification code with which it is possible to link
the various types of information, corresponding to each person, within the database. In Italy,
healthcare assistance is covered almost entirely by the Italian National Health System (NHS),
therefore most residents’ significant healthcare information can be found within the healthcare
databases.
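The linkage described above can be illustrated with a minimal sketch. It assumes each record is a dictionary carrying a hypothetical `patient_id` field that holds the unique identification code; the field names and sample values are illustrative and are not taken from the actual databases.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def link_by_patient(
    discharges: Iterable[dict], prescriptions: Iterable[dict]
) -> Dict[str, Dict[str, List[dict]]]:
    """Group discharge and prescription records under each patient identifier."""
    linked: Dict[str, Dict[str, List[dict]]] = defaultdict(
        lambda: {"discharges": [], "prescriptions": []}
    )
    for record in discharges:
        linked[record["patient_id"]]["discharges"].append(record)
    for record in prescriptions:
        linked[record["patient_id"]]["prescriptions"].append(record)
    return dict(linked)


# Hypothetical records; identifiers and codes are made up for illustration.
discharges = [
    {"patient_id": "A001", "principal_dx": "174.4"},
    {"patient_id": "B002", "principal_dx": "162.3"},
]
prescriptions = [
    {"patient_id": "A001", "atc_code": "L02BA01"},
]
linked = link_by_patient(discharges, prescriptions)
```

Patients who appear in only one source still receive an entry, with an empty list for the other source.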
Study design
The design to be used is that of a diagnostic accuracy study and will be applied to the three
administrative databases. Each team will validate its own database using a common methodological
approach. For each ICD-9 code (considered to be an index test), a reference standard for the
definition of a disease will be identified using available systematic reviews of studies that have
validated the specific code[6]. When a systematic review is not available, in addition to seeking the
opinions of clinicians, we will systematically search electronic databases of the published medical
literature for relevant primary studies that validated the diagnosis (see companion paper). After
constructing a standardized form for each diagnostic code, a random sample of patients with
specific ICD-9 codes will be obtained from each database and the corresponding charts will be
acquired. Benchmarking between the three administrative databases (rates, inconsistencies in
coding errors and potential differences that need to be resolved before initiating the validation
process) will be performed.
Any reporting or publication of the results from the present study will follow recommended
guidelines to ensure quality of reporting 30 31.
Methodological and statistical analysis
Each team will validate the listed ICD-9 codes within its own database, using the same
methodological approach.
Index test
The following ICD-9-CM codes will function as index tests: 233.0 and 174.0 to 174.9 for breast
cancer; 162.0 to 162.9 for lung cancer; and 153.0 to 153.9 and 154.8 for colorectal cancer. To
enhance the accuracy of the codes to classify a disease, algorithms will be developed (see algorithm
development).
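As an illustration of how the index-test codes could be applied to a discharge record, the following sketch maps a single ICD-9-CM diagnosis code to a cancer site. It assumes codes are stored in dotted form (e.g. "174.3") and treats each stated range as every fourth digit from 0 to 9; both are simplifying assumptions, and the names `INDEX_CODES` and `cancer_site` are hypothetical.

```python
from typing import Optional

# Index-test codes as listed in the protocol text. Each range is expanded to
# all fourth digits 0-9, which is an assumption made for this sketch.
INDEX_CODES = {
    "breast": {"233.0"} | {f"174.{d}" for d in range(10)},
    "lung": {f"162.{d}" for d in range(10)},
    "colorectal": {f"153.{d}" for d in range(10)} | {"154.8"},
}


def cancer_site(icd9cm_code: str) -> Optional[str]:
    """Return the cancer site whose index-test codes include the given code."""
    code = icd9cm_code.strip()
    for site, codes in INDEX_CODES.items():
        if code in codes:
            return site
    return None  # the code is not one of the index tests
```

A database that stores codes without the decimal point would need a normalization step before this lookup.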
Reference standards
Medical charts will function as reference standards (gold standard) to validate the listed codes.
For each cancer, clinical and histological parameters will be retrieved, as well as information related
to tumor size (pT), diagnostic results used to determine the stage of the tumor (TNM), and type of
surgery. In addition, the use of any therapy, including chemotherapy or radiotherapy, either as
standard or adjuvant therapy, will be recorded.
Algorithm development
An algorithm is a set of rules for identifying disease cases from administrative data 32. Elements for
the development of algorithms will be retrieved from the medical literature and by consulting
experienced oncologists. We will consider two types of algorithms: (a) cancer-specific
algorithms that combine ICD-9-CM diagnosis codes (e.g., malignant neoplasm: 174.0) with
procedure codes (e.g., mastectomy: 85.41-85.48); and (b) generic algorithms that include chemotherapy or
radiotherapy codes, the position of the diagnosis (principal or secondary), and variation in
the time window between two diagnoses. Table 1 displays a combination of ICD-9-CM diagnosis
codes and potential surgical, radiotherapy or chemotherapy procedure codes to identify breast,
colorectal, and lung cancer cases in medical charts.
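The two algorithm types can be sketched for breast cancer as follows. This is a minimal illustration, not the study's final algorithm: the `Discharge` record layout and function names are hypothetical, the code sets follow the breast rows of Table 1, and the time-window element is not modelled.

```python
from dataclasses import dataclass, field
from typing import List

BREAST_MALIGNANT = {f"174.{d}" for d in range(10)}           # 174.0-174.9
BREAST_DX = {"233.0"} | BREAST_MALIGNANT                      # adds carcinoma in situ
BREAST_PROCEDURES = (
    {"85.12"}                                                 # incisional breast biopsy
    | {f"85.2{d}" for d in range(0, 6)}                       # 85.20-85.25
    | {f"85.4{d}" for d in range(1, 9)}                       # 85.41-85.48
)
THERAPY_DX = {"V58.0", "V67.0", "V58.1", "V67.2"}             # radiotherapy, chemotherapy


@dataclass
class Discharge:
    principal_dx: str
    secondary_dx: List[str] = field(default_factory=list)
    procedures: List[str] = field(default_factory=list)


def breast_specific(rec: Discharge) -> bool:
    """Type (a): breast diagnosis in principal position plus a breast procedure."""
    return rec.principal_dx in BREAST_DX and any(
        p in BREAST_PROCEDURES for p in rec.procedures
    )


def breast_generic(rec: Discharge) -> bool:
    """Type (b): therapy code in principal position, breast cancer in secondary."""
    return rec.principal_dx in THERAPY_DX and any(
        d in BREAST_MALIGNANT for d in rec.secondary_dx
    )
```

Keeping the two rules as separate functions makes it straightforward to validate each one, and combinations of them, against the chart review.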
Table 1. ICD-9-CM diagnosis and procedure codes for identifying breast, colorectal, and lung
cancer cases in medical charts

Site of cancer | ICD-9-CM principal diagnosis code | ICD-9-CM secondary diagnosis code | ICD-9-CM procedure code
Breast | Carcinoma in situ: 233.0 | Any | Incisional breast biopsy: 85.12
Breast | Malignant neoplasm: 174.0-174.9 | Any | Excision of breast tissue: 85.20-85.25
Breast | | | Mastectomy: 85.41-85.48
Breast | Any | Carcinoma in situ: 233.0 | Any procedure
Breast | Any | Malignant neoplasm: 174.0-174.9 | Any procedure
Breast | Chemotherapy: V58.1, V67.2 | Malignant neoplasm: 174.0-174.9 | Any procedure
Breast | Radiotherapy: V58.0, V67.0 | Malignant neoplasm: 174.0-174.9 | Any procedure
Lung | Malignant neoplasm of trachea, bronchus, and lung: 162.0-162.9 | | Any procedure
Lung | Chemotherapy: V58.1, V67.2 | Malignant neoplasm of trachea, bronchus, and lung: 162.0-162.9 | Any procedure
Lung | Radiotherapy: V58.0, V67.0 | Malignant neoplasm of trachea, bronchus, and lung: 162.0-162.9 | Any procedure
Lung | Any | Malignant neoplasm of trachea, bronchus, and lung: 162.0-162.9 | Administration of any antineoplastic agent: 99.25, 99.28
Colorectal | Malignant neoplasm of colon: 153.0-153.9 | Any | Any procedure
Colorectal | Malignant neoplasm of rectum and rectosigmoid junction: 154.0-154.1, 154.8 | Any | Any procedure
Colorectal | Any | Malignant neoplasm of colon: 153.0-153.9 | Any procedure
Colorectal | Any | Malignant neoplasm of rectum and rectosigmoid junction: 154.0-154.1, 154.8 | Any procedure
Colorectal | Chemotherapy: V58.1, V67.2 | Malignant neoplasm of colon: 153.0-153.9 | Any procedure
Colorectal | Radiotherapy: V58.0, V67.0 | Malignant neoplasm of rectum and rectosigmoid junction: 154.0-154.1, 154.8 | Any procedure

Table adapted from Baldi 2008 (reference 20).
Case selection and sampling method
In general, for each medical diagnosis, we will identify patients discharged over a 3-year
period. From these records, we will use a random number generator to randomly select a subset
of patients for our reference standard abstraction.
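The random selection step can be sketched as below. The function name, the fixed seed, and the sample size are assumptions for illustration; the protocol does not specify them. Sorting the identifiers before sampling and using a seeded generator make the draw reproducible for audit purposes.

```python
import random
from typing import Iterable, List


def sample_charts(patient_ids: Iterable[str], n: int, seed: int) -> List[str]:
    """Draw a reproducible simple random sample of n distinct patients."""
    unique_ids = sorted(set(patient_ids))  # one entry per patient, stable order
    if n > len(unique_ids):
        raise ValueError("sample size exceeds the number of eligible patients")
    rng = random.Random(seed)              # seeded generator, independent of global state
    return rng.sample(unique_ids, n)


# Hypothetical identifiers for patients discharged with a code of interest.
eligible = [f"P{i:04d}" for i in range(1, 501)]
selected = sample_charts(eligible, n=50, seed=2016)
```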
Case ascertainment
A standardized form, for each disease, will be produced to record demographic, clinical, and
laboratory patient data. Clinically trained chart reviewers will examine the medical charts and fill in
the specific form. Before proceeding to full medical chart abstracting, the level of agreement
between pairs of reviewers will be assessed for a limited number of charts.
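Agreement between two reviewers is commonly summarized with Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The sketch below is one plain implementation under the assumption that each reviewer assigns a single categorical judgement per chart (for example, 1 for confirmed cancer and 0 for not confirmed); the function name is hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater1: Sequence[Hashable], rater2: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters who judged the same charts."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("both raters must judge the same non-empty set of charts")
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    counts1, counts2 = Counter(rater1), Counter(rater2)
    # Chance agreement: product of the raters' marginal proportions per category.
    expected = sum(counts1[c] * counts2[c] for c in counts1) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


reviewer_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
reviewer_b = [1, 1, 0, 0, 1, 0, 0, 1, 0, 1]
kappa = cohen_kappa(reviewer_a, reviewer_b)
```

In this made-up example the reviewers agree on 8 of 10 charts and chance agreement is 0.5, so kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6.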
Validation methods and statistical analysis
We will determine inter-rater agreement for chart abstraction by calculating the kappa statistic for
the sample of charts abstracted and reviewed by two investigators. Sensitivity and specificity will
be analyzed separately for each ICD-9 code by constructing 2 x 2 tables. Sensitivity measures the
extent to which an index test (e.g. ICD-9-CM code 233.0) correctly identifies individuals who, according
to a reference standard, possess the characteristic of interest (in this example, carcinoma in situ) in the medical
chart. Specificity measures the degree to which an index test correctly identifies individuals who,
according to a reference standard, lack the characteristic of interest in the medical chart 4. For both
sensitivity and specificity, 95% confidence intervals will be calculated.
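The calculations from a 2 x 2 table can be sketched as follows. The protocol does not state which confidence interval method will be used; the Wilson score interval below is an assumption chosen for illustration, and the function names and counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist
from typing import Dict, Tuple


def wilson_interval(successes: int, total: int, level: float = 0.95) -> Tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion."""
    if total <= 0:
        raise ValueError("total must be positive")
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> Dict[str, tuple]:
    """Point estimates and 95% intervals from the cells of a 2 x 2 table.

    tp/fn: chart-confirmed cases with/without the code;
    fp/tn: chart-negative patients with/without the code.
    """
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
        "ppv": (tp / (tp + fp), wilson_interval(tp, tp + fp)),
        "npv": (tn / (tn + fn), wilson_interval(tn, tn + fn)),
    }


# Made-up counts for illustration only.
measures = accuracy_measures(tp=80, fp=10, fn=20, tn=90)
```

Note that estimating sensitivity and specificity requires chart review of patients without the code as well as those with it, whereas the positive predictive value can be estimated from code-positive patients alone.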
In addition, multiple versions of the diagnostic algorithm will be developed using different logistic
regression models and all combinations of the various variables, including the position of the
diagnosis (primary or secondary), chemotherapy use and surgical procedures. Finally, the accuracy
of the algorithm, stratified by the position of the candidate ICD-9-CM diagnosis code, the hospital and the
district, will be compared using the chi-squared test, with p < 0.05 considered statistically significant.
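A stratified comparison of this kind can be illustrated with a Pearson chi-squared test on a 2 x 2 table, for example confirmed versus unconfirmed cases when the code sits in the principal versus a secondary position. The sketch assumes two strata, a binary outcome, and no continuity correction; with one degree of freedom the p-value equals erfc(sqrt(chi2 / 2)). The function name and counts are hypothetical.

```python
from math import erfc, sqrt
from typing import Tuple


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-squared statistic and p-value for the table [[a, b], [c, d]].

    Rows are strata (e.g. principal vs secondary position); columns are
    outcomes (e.g. confirmed vs not confirmed by chart review).
    """
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column must contain at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / margins
    p_value = erfc(sqrt(chi2 / 2))  # survival function of chi-squared with 1 df
    return chi2, p_value


# Made-up counts: 45/50 confirmed in principal position, 35/50 in secondary.
chi2, p_value = chi_squared_2x2(45, 5, 35, 15)
significant = p_value < 0.05
```

Comparisons across more than two hospitals or districts would need the general r x c form of the test, which this sketch does not cover.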
Discussion
In this protocol, we present the approach we will use to analyze the validity of ICD-9-CM codes for
breast, lung and colorectal cancers, in order to ascertain their incidence and prevalence in
administrative databases representing northern, central and southern Italy.
Administrative databases constitute a valid alternative in situations in which randomized trials are
not able to provide the required evidence, for practical or economic reasons. Epidemiological
studies frequently rely on administrative claims databases, which often contain ICD-9 codes, to
identify cases of specific diseases, such as cancer. These codes have the advantage of
being widely available and requiring less effort and cost than consulting medical charts 4.
Accurate identification of cancer cases using the ICD-9 codes may contribute to monitoring cancer
trends and to proposing interventions to improve cancer care. In 2008, an Italian study developed
and validated an algorithm using a regional administrative database to determine incident cases of
breast, lung, and colorectal cancers and found sensitivities of 76.7%, 80.8%, and
72.4%, respectively 20. The present study will add to the knowledge of the three
cancers given that it covers different areas of Italy.
Ethics and dissemination
Ethical approval has been obtained from the Regional Committee of Umbria. Study results will be
disseminated widely through peer-reviewed publications and presentations at national and
international conferences.
Footnotes
Contributors AM, IA, MF, and DS conceived the study and all authors were responsible for
designing the protocol. IA and AM drafted the protocol manuscript. All authors critically revised
the successive versions of the manuscript and approved the final version.
Funding This study protocol is supported by funding from the National Centre for Disease
Prevention and Control (CCM 2014), Ministry of Health, Italy. The study funder was not involved
in the study design or the writing of the protocol.
Competing interests None.
References
1. World Health Organization. International statistical classification of diseases and health related
problems, 10th revision. Geneva: WHO 1992.
2. Jutte DP, Roos LL, Brownell MD. Administrative record linkage as a tool for public health research. Annu
Rev Public Health 2011;32:91-108.
3. Rawson NSB, Shatin D. Assessing the validity of diagnostic data in large administrative healthcare
utilization databases. In: Hartzema A, Tilson H, Chan K, eds. Pharmacoepidemiology and
Therapeutic Risk Management: Harvey Whitney Books, 2008.
4. West SL, Ritchey ME, Poole C. Validity of Pharmacoepidemiologic Drug and Diagnosis Data.
Pharmacoepidemiology: Wiley-Blackwell, 2012:757-94.
5. Campbell SE, Campbell MK, Grimshaw JM, et al. A systematic review of discharge coding accuracy. J
Public Health Med 2001;23(3):205-11.
6. Traversa G, Bianchi C, Da Cas R, et al. Cohort study of hepatotoxicity associated with nimesulide and
other non-steroidal anti-inflammatory drugs. BMJ (Clinical research ed) 2003;327(7405):18-22.
7. Trifiro G, Patadia V, Schuemie MJ, et al. EU-ADR healthcare database network vs. spontaneous reporting
system database: preliminary comparison of signal detection. Studies in health technology and
informatics 2011;166:25-30.
8. Gini R, Francesconi P, Mazzaglia G, et al. Chronic disease prevalence from Italian administrative
databases in the VALORE project: a validation through comparison of population estimates with
general practice databases and national survey. BMC public health 2013;13:15.
9. Colais P, Pinnarelli L, Fusco D, et al. The impact of a pay-for-performance system on timing to hip fracture
surgery: experience from the Lazio Region (Italy). BMC health services research 2013;13(1):393.
10. Abraha I, Montedori A, Eusebi P, et al. The Current State of Validation of Administrative Healthcare
Databases in Italy: A Systematic Review. Pharmacoepidemiol Drug Saf 2012;21:400-00.
11. Leone MA, Capponi A, Varrasi C, et al. Accuracy of the ICD-9 codes for identifying TIA and stroke in an
Italian automated database. Neurological Sciences 2004;25(5):281-8.
12. Rinaldi R, Vignatelli L, Galeotti M, et al. Accuracy of ICD-9 codes in identifying ischemic stroke in the
General Hospital of Lugo di Romagna (Italy). Neurological Sciences 2003;24(2):65-69.
13. Cattaruzzi C, Troncon MG, Agostinis L, et al. Positive predictive value of ICD-9th codes for upper
gastrointestinal bleeding and perforation in the Sistema Informativo Sanitario Regionale database.
Journal of clinical epidemiology 1999;52(6):499-502.
14. Galdarossa M, Vianello F, Tezza F, et al. Epidemiology of primary and secondary thrombocytopenia: first
analysis of an administrative database in a major Italian institution. Blood Coagul
Fibrinolysis;23(4):271-7.
15. Franchi C, Giussani G, Messina P, et al. Validation of healthcare administrative data for the diagnosis of
epilepsy. Journal of epidemiology and community health 2013;67(12):1019-24.
16. Tinelli M, Mannino S, Lucchi S, et al. Healthcare-acquired infections in rehabilitation units of the
Lombardy Region, Italy. Infection 2002;39(4):353-8.
17. Fano V, D'Ovidio M, del Zio K, et al. [The role of the quality of hospital discharge records on the
comparative evaluation of outcomes: the example of chronic obstructive pulmonary disease
(COPD)]. Epidemiologia e prevenzione 2011;36(3-4):172-79.
18. Faustini A, Canova C, Cascini S, et al. The reliability of hospital and pharmaceutical data to assess
prevalent cases of chronic obstructive pulmonary disease. Copd;9(2):184-96.
19. Bogliun G, Beghi E. Validity of hospital discharge diagnoses for public health surveillance of the Guillain-
Barré syndrome. Neurological Sciences 2002;23(3):113-17.
20. Baldi I, Vicari P, Di Cuonzo D, et al. A high positive predictive value algorithm using hospital
administrative data identified incident cancer cases. Journal of clinical epidemiology
2008;61(4):373-9.
21. Schifano P, Papini P, Agabiti N, et al. Indicators of breast cancer severity and appropriateness of surgery
based on hospital administrative data in the Lazio Region, Italy. BMC public health 2006;6:25.
22. Ferlay J, Soerjomataram I, Dikshit R, et al. Cancer incidence and mortality worldwide: sources, methods
and major patterns in GLOBOCAN 2012. International journal of cancer Journal international du
cancer 2015;136(5):E359-86.
23. Yuan Y, Li M, Yang J, et al. Using administrative data to estimate time to breast cancer diagnosis and
percent of screen-detected breast cancers - a validation study in Alberta, Canada. European journal
of cancer care 2015;24(3):367-75.
24. Deshpande AD, Schootman M, Mayer A. Development of a claims-based algorithm to identify colorectal
cancer recurrence. Annals of epidemiology 2015;25(4):297-300.
25. Quantin C, Benzenine E, Hagi M, et al. Estimation of national colorectal-cancer incidence using claims
databases. Journal of cancer epidemiology 2012;2012:298369.
26. Abdulmalak C, Cottenet J, Beltramo G, et al. Haemoptysis in adults: a 5-year study using the French
nationwide hospital administrative database. The European respiratory journal 2015;46(2):503-11.
27. Dehal A, Abbas A, Johna S. Comorbidity and outcomes after surgery among women with breast cancer:
analysis of nationwide in-patient sample database. Breast cancer research and treatment
2013;139(2):469-76.
28. Konski A. Clinical and economic outcomes analyses of women developing breast cancer in a managed
care organization. American journal of clinical oncology 2005;28(1):51-7.
29. Mittmann N, Liu N, Porter J, et al. Utilization and costs of home care for patients with colorectal cancer:
a population-based study. CMAJ open 2014;2(1):E11-7.
30. Benchimol EI, Manuel DG, To T, et al. Development and use of reporting guidelines for assessing the
quality of validation studies of health administrative data. Journal of clinical epidemiology
2011;64(8):821-9.
31. De Coster C, Quan H, Finlayson A, et al. Identifying priorities in methodological research using ICD-9-CM
and ICD-10 administrative data: report from an international consortium. BMC health services
research 2006;6:77.
32. Lix LM, De Coster C, Currie R. Defining and validating chronic diseases: an administrative data approach:
Manitoba Centre for Health Policy Winnipeg, 2006.
Validity of ICD-9-CM Codes for Breast, Lung, and Colorectal Cancers in Three Italian Administrative Healthcare
Databases: A Diagnostic Accuracy Study Protocol
Journal: BMJ Open
Manuscript ID bmjopen-2015-010547.R1
Article Type: Protocol
Date Submitted by the Author: 31-Jan-2016
Complete List of Authors:
Abraha, Iosief; Regional Health Authority of Umbria, Health Planning Service
Serraino, Diego; IRCCS Centro di Riferimento Oncologico Aviano (IT), Epidemiology and Biostatistic Unit
Giovannini, Gianni; Regional Health Authority of Umbria, Health Planning Office
Stracci, Fabrizio; University of Perugia, Public Health Department
Casucci, Paola; Regional Health Authority of Umbria, Health ICT Service
Alessandrini, Giuliana; Regional Health Authority of Umbria, Health ICT Service
Bidoli, Ettore; IRCCS Centro di Riferimento Oncologico, Epidemiology and Biostatistic Unit
Chiari, Rita; Azienda Ospedaliera Perugia, Oncology
Cirocchi, Roberto; University of Perugia, Surgery
De Giorgi, Marcello; Regional Health Authority of Umbria, Health ICT Service
Franchini, David; Regional Health Authority of Umbria, Health ICT Service
Vitale, Maria Francesca; Registro Tumori Regione Campania, ASL NA3 Sud
Fusco, Mario; ASL NA3 Sud, Registro Tumori Regione Campania
Montedori, Alessandro; Regional Health Authority of Umbria, Health Planning Service
<b>Primary Subject Heading</b>:
Health services research
Secondary Subject Heading: Oncology, Public health
Keywords: administrative database, validating ICD-9 codes, breast, lung and colorectal cancers
Validity of ICD-9-CM Codes for Breast, Lung, and Colorectal
Cancers in Three Italian Administrative Healthcare
Databases: A Diagnostic Accuracy Study Protocol
Iosief Abraha, Diego Serraino, Gianni Giovannini, Fabrizio Stracci, Paola Casucci, Giuliana
Alessandrini, Ettore Bidoli, Rita Chiari, Roberto Cirocchi, Marcello De Giorgi, David Franchini,
Maria Francesca Vitale, Mario Fusco, Alessandro Montedori
Author affiliations:
Health Planning Service, Regional Health Authority of Umbria, Perugia, Italy
Iosief Abraha
Gianni Giovannini
Alessandro Montedori
Regional Health Authority of Umbria, Health ICT Service, Perugia, Italy
Paola Casucci
Giuliana Alessandrini
Marcello De Giorgi
David Franchini
Centro di Riferimento Oncologico Aviano, Epidemiology and Biostatistic Unit, Aviano (PN), Italy
Diego Serraino
Ettore Bidoli
Registro Tumori Regione Campania, ASL NA3 Sud, Brusciano (Na), Italy
Mario Fusco
Maria Francesca Vitale
Public Health Department, University of Perugia, Perugia, Italy
Fabrizio Stracci
Dipartimento di Oncologia, Azienda Ospedaliera Perugia, Perugia, Italy
Rita Chiari
Department of Digestive Surgery and Liver Unit, University of Perugia, Perugia, Italy
Roberto Cirocchi
Corresponding author:
Iosief Abraha, iosief_a@yahoo.it; iabraha@regione.umbria.it
Health Planning Service
Regional Health Authority of Umbria
Via Mario Angeloni 61
06124 Perugia, Italia
Tel. +39 075 504 5251
Fax. +39 075 504 5569
Abstract
Introduction Administrative healthcare databases are useful tools to study healthcare outcomes and to monitor
the health status of a population. Patients with cancer can be identified through disease-specific
codes, prescriptions and physician claims, but prior validation is required to achieve an accurate
case definition. The objective of this protocol is to assess the accuracy of International
Classification of Diseases 9th Revision – Clinical Modification (ICD-9-CM) codes for breast, lung
and colorectal cancers in identifying patients diagnosed with the respective disease in three Italian
administrative databases.
Methods and analysis Data from the administrative databases of Umbria Region (910,000 residents), Local Health Unit 3
of Napoli (1,170,000 residents) and Friuli-Venezia Giulia Region (1,227,000 residents) will be
considered. In each administrative database, patients with a first diagnosis of breast,
lung or colorectal cancer between 2012 and 2014 will be identified using the following groups of
ICD-9-CM codes in primary position: (a) 233.0 and (b) 174.x for breast cancer; (c) 162.x for lung
cancer; (d) 153.x for colon cancer; and (e) 154.0-154.1 and 154.8 for rectal cancer. Only incident
cases will be considered, that is, cases with the same diagnosis in the five years
(2007-2011) before the period of interest will be excluded. A random sample of cases and non-cases
will be selected from each administrative database and the corresponding medical charts will be
assessed for validation by pairs of trained, independent reviewers. Case ascertainment within the
medical charts will be based on (a) the presence of a primary nodular lesion in the breast, lung or
colon-rectum, documented with imaging or endoscopy and (b) cytological or histological
documentation of cancer from a primary or metastatic site. Sensitivity and specificity with 95%
confidence intervals will be calculated.
Dissemination Study results will be disseminated widely through peer-reviewed publications and presentations at
national and international conferences.
Strengths and limitations of this study
The study will evaluate the validity of the International Classification of Diseases, 9th Revision –
Clinical Modification (ICD-9-CM) codes for breast, lung and colorectal cancers in three large
Italian administrative databases.
The strength of this study is that it will use medical chart review to ascertain cancer cases.
Once these administrative databases have been validated for breast, lung and colorectal cancers,
they can be used for outcome research, including pharmacoepidemiology, health services research
and quality of care research.
This study will be the first to validate ICD-9-CM codes for three cancers in three large
administrative databases in Italy.
Validation studies of administrative data are specific to their context, so the results may not be
generalizable to other settings.
Introduction
As computer technology continues to advance, administrative databases are expanding in
numerous healthcare settings worldwide. These databases anonymously store data about the
healthcare assistance that patients receive, including birth, death and disease treatment. Usually,
the diagnosis of a disease is associated with a specific code from the International Classification
of Diseases, 9th Revision (ICD-9) or 10th Revision (ICD-10). The ICD is designed to map
health conditions to corresponding generic categories together with specific variations [1].
Merging individual patient data from administrative databases with other sources (e.g.,
prescription and laboratory data) makes it possible to investigate a wide range of relevant and
often unique public health questions [2], to monitor population health status over time and to
perform population-based pharmacoepidemiological research [2-4].
To constitute a reliable source for research studies, administrative healthcare databases must be
adequately validated. While non-clinical information in healthcare databases, such as
demographic and prescription data, is highly accurate [5, 6], the validity of registered diagnoses
and procedures is variable [6, 7]. Determining the accuracy of the latter two categories of clinical
information is vitally important to all potential users and involves confirming the consistency of
information within the databases with the corresponding clinical records of patients [5].
In Italy, all the Regional Health Authorities maintain large healthcare information systems
containing patient data from all hospital and territorial sources. These databases have the potential
to address important issues in post-marketing surveillance [8, 9], epidemiology [10], quality
performance and health services research [11]. However, there is a concern that their considerable
potential as a source of reliable healthcare information has not been realized since they have not
been widely validated. A systematic review of ICD-9 code validation in Italian administrative
databases [12] reported that only a few regional databases have been validated for a limited number
of ICD-9 codes of diseases including stroke [13, 14], gastrointestinal bleeding [15],
thrombocytopenia [16], epilepsy [17],
infections [18], chronic obstructive pulmonary disease [19, 20], Guillain-Barré syndrome [21] and
cancers [22, 23]. In addition, the use of these databases was scarce, as only six administrative
databases served as sources for published research articles based on the validated ICD-9 codes.
Hence, it is imperative that Regional Health Authorities systematically validate their databases for
critical diseases to productively use the information they contain.
Breast, colorectal and lung cancers are the most commonly diagnosed neoplasms worldwide, as
well as in Italy [24]. Consequently, they are of interest to the scientific community and industry as
targets for the development of new drugs, and to governments, given that they are an important
cause of public health and economic burden. For example, variation in the epidemiology of
breast [25], colorectal [26, 27] and lung [28] cancers, the treatment (pharmacological or surgical)
administered to patients suffering from these cancers, and potential clinical and economic
outcomes [29-31] can all be evaluated using validated administrative databases.
The objective of the present protocol is to evaluate the accuracy of the ICD-9-CM codes related to
breast, lung and colorectal cancers, in correctly identifying the respective diseases using three large
Italian administrative healthcare databases.
Methods
Setting and data source
Administrative databases
Since the early 1990s, local and regional Italian healthcare administrative databases have
collected information from all patient medical records from public and private hospitals, including
demographics, hospital admission and discharge dates, vital statistics, the admitting hospital
department, the principal diagnosis and a maximum of 5 secondary discharge diagnoses, and the
principal and 5 secondary surgical and diagnostic procedures. In addition, these databases contain
the records of all drug prescriptions listed in the National Drug Formulary and the basic
characteristics of patients’ physicians. Each resident has a unique national identification code that
makes it possible to link the various types of information corresponding to each person within
the database. In Italy, healthcare assistance is covered almost entirely by the Italian National Health
System (NHS); therefore, most residents’ significant healthcare information can be found within the
healthcare databases.
The target administrative databases for the present study will be from the Umbria Region (910,000
residents), Local Health Unit 3 of Napoli (1,170,000 residents) and the Friuli-Venezia Giulia
Region (1,227,000 residents). For each database the corresponding Unit (Regional Health Authority
of Umbria for Umbria Region, Registro Tumori Regione Campania for Local Health Unit 3 of
Napoli, and Centro di Riferimento Oncologico Aviano for Friuli-Venezia Giulia Region) will
conduct the same validation process.
Source population
The source population will consist of permanent residents, aged 18 years or above, of Umbria
Region, Local Health Unit 3 of Napoli and the Friuli-Venezia Giulia Region. Any resident who has
been discharged from hospital with a diagnosis of breast, lung or colorectal cancer will be
considered. Residents who have been hospitalised outside the territory covered by the respective
database will be excluded from the analysis because of the difficulty of obtaining their medical charts.
Case selection and sampling method
In each administrative database, patients with a first diagnosis of breast, lung or
colorectal cancer between 2012 and 2014 will be identified using the following groups of
ICD-9-CM codes located in primary position: (a) 233.0 and (b) 174.x for breast cancer; (c) 162.x for
lung cancer; (d) 153.x for colon cancer; and (e) 154.0-154.1 and 154.8 for rectal cancer. Only
incident cases will be considered, that is, cases with the same diagnosis (ICD-9-CM codes in any
position) in the five years (2007-2011) before the period of interest will be excluded. Subsequently,
for each of the above groups of ICD-9-CM codes, a random sample of cases will be selected from
each administrative database. Table 1 describes the ICD-9-CM codes for each of the cancers of
interest.
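As an illustration only, the selection rule described above could be sketched as follows. The record layout (`Discharge`), the dotted code format and the function names are assumptions made for this example; they do not describe the actual structure of the three databases.

```python
import random
from dataclasses import dataclass

# Code groups (a)-(e) from the protocol; a trailing dot means "any fourth digit".
CODE_GROUPS = {
    "breast_in_situ": ("233.0",),
    "breast_invasive": ("174.",),
    "lung": ("162.",),
    "colon": ("153.",),
    "rectum": ("154.0", "154.1", "154.8"),
}


@dataclass(frozen=True)
class Discharge:
    """Hypothetical discharge record; diagnoses[0] is the principal diagnosis."""
    patient_id: str
    year: int
    diagnoses: tuple[str, ...]


def in_group(code: str, group: str) -> bool:
    """True if a dotted ICD-9-CM code belongs to the given code group."""
    return code.startswith(CODE_GROUPS[group])


def incident_cases(records: list[Discharge], group: str) -> list[str]:
    """Patients coded in primary position in 2012-2014, excluding those with
    the same diagnosis in any position during 2007-2011."""
    prior = {
        r.patient_id
        for r in records
        if 2007 <= r.year <= 2011 and any(in_group(c, group) for c in r.diagnoses)
    }
    current = {
        r.patient_id
        for r in records
        if 2012 <= r.year <= 2014 and r.diagnoses and in_group(r.diagnoses[0], group)
    }
    return sorted(current - prior)


def sample_cases(patient_ids: list[str], size: int, seed: int) -> list[str]:
    """Simple random sample without replacement; the seed makes it reproducible."""
    rng = random.Random(seed)
    return rng.sample(patient_ids, min(size, len(patient_ids)))
```

In this sketch a patient whose code of interest appears only in a secondary position during 2012-2014 is not selected, while a code in any position during 2007-2011 is enough to exclude the patient as a prevalent case.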
Chart abstraction and case ascertainment
The corresponding medical charts of the randomly selected sample cases will be obtained from
hospitals for validation purposes. From each medical chart the following information will be
retrieved: initials of the patient, date of birth, sex, dates of hospital admission and discharge, any
diagnostic procedure that contributed to the diagnosis of the cancer, and any pharmacological or
surgical intervention that was provided for the treatment of the cancer.
Within each unit, two reviewers will receive training on data abstraction. An initial consensus chart
review will be performed, with each reviewer independently examining the same number of medical
charts (n=20). The inter-rater agreement between the pair of reviewers within each unit regarding the
presence or absence of breast, lung or colorectal cancer will be calculated using the κ statistic.
This process will be repeated until the agreement between the reviewers is near perfect
(κ between 0.81 and 1.00). Any discrepancies will be discussed and resolved through the
involvement of a third party (Rita Chiari).
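The agreement check can be illustrated with a minimal sketch of Cohen's κ for two reviewers. The protocol does not name the software that will be used, so the function names and the example ratings below are purely hypothetical.

```python
def cohen_kappa(rater_a: list, rater_b: list) -> float:
    """Cohen's kappa for two raters who classified the same charts."""
    if not rater_a or len(rater_a) != len(rater_b):
        raise ValueError("both raters must classify the same non-empty set of charts")
    n = len(rater_a)
    # Observed proportion of charts on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal proportions.
    labels = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(l) / n) * (rater_b.count(l) / n) for l in labels)
    if expected == 1.0:
        return 1.0  # both raters gave one identical label to every chart
    return (observed - expected) / (1 - expected)


def near_perfect(kappa: float) -> bool:
    """Threshold stated in the protocol: kappa between 0.81 and 1.00."""
    return 0.81 <= kappa <= 1.0


# Hypothetical ratings for 20 charts (True = cancer present); one disagreement.
reviewer_1 = [True] * 12 + [False] * 8
reviewer_2 = [True] * 11 + [False] * 9
kappa = cohen_kappa(reviewer_1, reviewer_2)
```

With these invented ratings the observed agreement is 19/20 and the chance agreement is 0.51, so κ is about 0.90 and the consensus round would not need to be repeated.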
Case ascertainment of cancer within the medical chart will be based on (a) the presence of a primary
nodular lesion in the breast, lung or colon-rectum, documented with imaging or endoscopy and (b)
the cytological or histological documentation of cancer from a primary or metastatic site.
Following the consensus review, data abstraction will be completed independently. To ensure
consistency among all the reviewers, uncertain cases will be discussed and resolved through
the involvement of a third party (Rita Chiari).
Validation criteria
For non-invasive breast cancer, we will consider the ICD-9-CM code 233.0 valid, when there is
evidence of a breast nodule documented with imaging (e.g., mammography) and a histological
diagnosis of ductal or lobular breast carcinoma in situ (pTis).
For invasive breast cancer, we will consider the ICD-9-CM codes 174.x valid, when there is
evidence of a breast nodule documented with imaging (e.g., mammography) and a cytological or
histological diagnosis from a primary or metastatic site positive for ductal or lobular
adenocarcinoma.
For lung cancer, we will consider the ICD-9-CM codes 162.x valid when there is evidence of a
pulmonary nodule documented with imaging (e.g., CT scan) and a cytological or histological
diagnosis from a primary or metastatic site positive for either small cell lung cancer (microcytoma)
or non-small cell lung cancer (NSCLC).
For colon cancer, we will consider the ICD-9-CM codes 153.x valid when there is evidence of a
neoplastic lesion within the colon documented with endoscopy (e.g., colonoscopy) or imaging (e.g.,
barium enema), and a histological diagnosis from a primary or metastatic site positive for
adenocarcinoma, squamous cell carcinoma or neuroendocrine carcinoma.
For rectal cancer, we will consider the ICD-9-CM codes 154.0-154.1 and 154.8 valid, when there is
evidence of a neoplastic lesion, in the rectosigmoid junction or the rectum, documented with
endoscopy (e.g., colonoscopy) or imaging (e.g., barium enema), and a histological diagnosis from a
primary or metastatic site positive for adenocarcinoma or squamous cell carcinoma.
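The five validation criteria share one structure: a documented lesion plus a pathological diagnosis of an accepted tumour type. A minimal sketch of that structure is given below. The `ChartEvidence` fields, the group labels and the tumour-type strings are assumptions introduced for illustration; they are not the abstraction form used by the reviewers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChartEvidence:
    """Hypothetical summary of what a reviewer found in one medical chart."""
    lesion_documented_by: frozenset  # e.g. frozenset({"imaging"})
    pathology: str                   # "histology", "cytology" or "none"
    tumour_type: str                 # label taken from the pathology report


# One rule per code group: accepted lesion documentation, accepted pathology
# method and accepted tumour types, as stated in the validation criteria.
CRITERIA = {
    "233.0": {
        "lesion": {"imaging"},
        "pathology": {"histology"},
        "types": {"ductal carcinoma in situ", "lobular carcinoma in situ"},
    },
    "174.x": {
        "lesion": {"imaging"},
        "pathology": {"cytology", "histology"},
        "types": {"ductal adenocarcinoma", "lobular adenocarcinoma"},
    },
    "162.x": {
        "lesion": {"imaging"},
        "pathology": {"cytology", "histology"},
        "types": {"small cell lung cancer", "non-small cell lung cancer"},
    },
    "153.x": {
        "lesion": {"endoscopy", "imaging"},
        "pathology": {"histology"},
        "types": {"adenocarcinoma", "squamous cell carcinoma",
                  "neuroendocrine carcinoma"},
    },
    "154.0-154.1, 154.8": {
        "lesion": {"endoscopy", "imaging"},
        "pathology": {"histology"},
        "types": {"adenocarcinoma", "squamous cell carcinoma"},
    },
}


def code_is_valid(group: str, evidence: ChartEvidence) -> bool:
    """True when the chart evidence satisfies every part of the group's rule."""
    rule = CRITERIA[group]
    return (
        bool(evidence.lesion_documented_by & rule["lesion"])
        and evidence.pathology in rule["pathology"]
        and evidence.tumour_type in rule["types"]
    )
```

For example, a colon lesion seen at endoscopy with histology positive for adenocarcinoma satisfies the 153.x rule, whereas the same lesion with cytology only does not.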
Table 1. Description of the ICD-9-CM codes related to breast, lung and colorectal cancers.

Condition     ICD-9-CM diagnosis code
Breast        Carcinoma in situ: 233.0
              Malignant neoplasm of female breast
                174.0 nipple and areola
                174.1 central portion
                174.2 upper-inner quadrant
                174.3 lower-inner quadrant
                174.4 upper-outer quadrant
                174.5 lower-outer quadrant
                174.6 axillary tail
                174.8 other specified sites of female breast
                174.9 breast (female), unspecified
Lung          Malignant neoplasm of trachea, bronchus, and lung
                162.0 trachea
                162.2 main bronchus
                162.3 upper lobe, bronchus or lung
                162.4 middle lobe, bronchus or lung
                162.5 lower lobe, bronchus or lung
                162.8 other parts of bronchus or lung
                162.9 bronchus and lung, unspecified
Colorectal    153 Malignant neoplasm of colon
                153.0 hepatic flexure
                153.1 transverse colon
                153.2 descending colon
                153.3 sigmoid colon
                153.4 cecum
                153.5 appendix
                153.6 ascending colon
                153.7 splenic flexure
                153.8 other specified sites of large intestine
                153.9 colon, unspecified
              154 Malignant neoplasm of rectum and rectosigmoid junction
                154.0 rectosigmoid junction
                154.1 rectum
                154.8 other
Statistical analysis
We calculated that a sample of 130 charts of cases will be necessary to obtain an expected
sensitivity of 80% with a precision of 10% and a power of 80%. For the specificity calculation, we
will randomly select non-cases, that is, records without the ICD-9-CM codes of interest, from each
administrative database. The corresponding medical charts will be retrieved and evaluated. We
calculated that a sample of 94 charts of non-cases will be necessary to obtain an expected
specificity of 90% with a precision of 10% and a power of 80%. Overall, each unit will evaluate
1,120 charts (130 cases and 94 non-cases for each of the five groups of ICD-9-CM codes).
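The total of 1,120 charts is consistent with 130 cases plus 94 non-cases for each of the five code groups (a) to (e). The sketch below checks that arithmetic and, as a rough plausibility check, computes the half-width of a 95% confidence interval for the planned samples. The protocol does not state which sample-size formula was used, so the normal (Wald) approximation here is an assumption and is not expected to reproduce the figures of 130 and 94.

```python
from math import sqrt
from statistics import NormalDist

CODE_GROUPS = 5            # groups (a)-(e) of ICD-9-CM codes
CASES_PER_GROUP = 130      # charts of cases per group
NON_CASES_PER_GROUP = 94   # charts of non-cases per group


def charts_per_unit() -> int:
    """Total charts each unit reviews, assuming equal samples per code group."""
    return CODE_GROUPS * (CASES_PER_GROUP + NON_CASES_PER_GROUP)


def wald_half_width(p: float, n: int, confidence: float = 0.95) -> float:
    """Half-width of a normal-approximation confidence interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)


total = charts_per_unit()
sens_half_width = wald_half_width(0.80, CASES_PER_GROUP)
spec_half_width = wald_half_width(0.90, NON_CASES_PER_GROUP)
```

Under this approximation both half-widths fall below the stated precision of 10%, which suggests that the planned samples are at least as large as a precision-only calculation would require.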
Sensitivity and specificity will be analysed separately for each ICD-9-CM code by constructing 2 x
2 tables. Sensitivity is the proportion of ‘true positives’ (TP; i.e., cancer cases classified as
positive by both the administrative database and medical chart review) among all cases deemed
positive by medical chart review. Specificity is the proportion of ‘true negatives’ (TN; i.e., cases
without cancer according to both the administrative database and medical chart review) among
all cases deemed negative by medical chart review. For both sensitivity and specificity, 95%
confidence intervals will be calculated.
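The two measures and their confidence intervals can be sketched from the cells of a 2 x 2 table as follows. The protocol does not specify the interval method; the Wilson score interval is used here as an assumption, and the counts in the example are invented.

```python
from math import sqrt
from statistics import NormalDist


def sensitivity(tp: int, fn: int) -> float:
    """True positives divided by all chart-confirmed cancer cases."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negatives divided by all chart-confirmed non-cases."""
    return tn / (tn + fp)


def wilson_interval(successes: int, total: int, confidence: float = 0.95):
    """Wilson score interval for a binomial proportion."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Hypothetical counts for one code group: 130 chart-confirmed cases,
# of which 104 carry the ICD-9-CM code in the administrative database.
tp, fn = 104, 26
sens = sensitivity(tp, fn)
sens_low, sens_high = wilson_interval(tp, tp + fn)
```

The Wilson interval stays within 0 and 1 even when a proportion is close to either bound, which is one reason to prefer it over the simple normal approximation for proportions such as an expected specificity of 90%.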
Reporting
Complete, transparent and accurate reporting is essential in diagnostic accuracy studies because it
allows readers to assess internal validity as well as to evaluate the generalisability and applicability
of results [32]. To ensure quality of reporting, any publication of the results of the present study
will follow the recommended guidelines based on the criteria published by the Standards for
Reporting of Diagnostic Accuracy (STARD) initiative for the accurate reporting of diagnostic
studies [32-34].
Discussion
In this protocol, we present the approach we will use to analyse the validity of ICD-9-CM codes for
breast, lung and colorectal cancers, in administrative databases representing northern, central and
southern Italy.
Administrative databases constitute a valid alternative in situations in which randomised trials are
not able to provide the required evidence for practical or economic reasons. In addition, although
epidemiological studies on cancer are frequently based on cancer registries [35-37], administrative
databases can add further value, especially in pharmacoepidemiology [3, 12, 38] and health
services research [39, 40].
Accurate identification of cancer cases using the ICD-9-CM codes may contribute to monitoring
cancer trends and to proposing interventions to improve cancer care. In 2008, an Italian study
developed and validated an algorithm using a regional administrative database to determine incident
cases of breast, lung and colorectal cancers and found sensitivities of 76.7%, 80.8% and 72.4%,
respectively, for the three cancers [22]. The present study will add to the knowledge of these three
cancers given that it covers different areas of Italy.
Ethics and dissemination Ethical approval has been obtained from the Regional Committee of Umbria. Study results will be
disseminated widely through peer-reviewed publications and presentations at national and
international conferences.
Footnotes
Contributors AM, IA, MF, and DS conceived the study and all authors were responsible for
designing the protocol. IA and AM drafted the protocol manuscript. IA, DS, GG, FS, PC, GA, EB,
RC, RC, MD, DF, MFV, MF and AM critically revised the successive versions of the manuscript
and approved the final version.
Funding This study protocol was developed within the D.I.V.O. project (Realizzazione di un
Database Interregionale Validato per l’Oncologia quale strumento di valutazione di impatto e di
appropriatezza delle attività di prevenzione primaria e secondaria in ambito oncologico), supported
by funding from the National Centre for Disease Prevention and Control (CCM 2014), Ministry of
Health, Italy. The study funder was not involved in the study design or the writing of the protocol.
Competing interests None.
References
1. World Health Organization. International statistical classification of diseases and health related
problems, 10th revision. Geneva: WHO 1992.
2. Jutte DP, Roos LL, Brownell MD. Administrative record linkage as a tool for public health research. Annual
review of public health 2011;32:91-108.
3. Jick SS. Fresh evidence confirms links between newer contraceptive pills and higher risk of venous
thromboembolism. BMJ 2015;350:h2422.
4. García Rodríguez LA, Pérez-Gutthann S, Jick S. The UK General Practice Research Database.
Pharmacoepidemiology: John Wiley & Sons, Ltd, 2002:375-85.
5. Rawson NSB, Shatin D. Assessing the validity of diagnostic data in large administrative healthcare
utilization databases. In: Hartzema A, Tilson H, Chan K, eds. Pharmacoepidemiology and
Therapeutic Risk Management: Harvey Whitney Books, 2008.
6. West SL, Ritchey ME, Poole C. Validity of Pharmacoepidemiologic Drug and Diagnosis Data.
Pharmacoepidemiology: Wiley-Blackwell, 2012:757-94.
7. Campbell SE, Campbell MK, Grimshaw JM, et al. A systematic review of discharge coding accuracy. J
Public Health Med 2001;23(3):205-11.
8. Traversa G, Bianchi C, Da Cas R, et al. Cohort study of hepatotoxicity associated with nimesulide and
other non-steroidal anti-inflammatory drugs. BMJ 2003;327(7405):18-22.
9. Trifiro G, Patadia V, Schuemie MJ, et al. EU-ADR healthcare database network vs. spontaneous reporting
system database: preliminary comparison of signal detection. Studies in health technology and
informatics 2011;166:25-30.
10. Gini R, Francesconi P, Mazzaglia G, et al. Chronic disease prevalence from Italian administrative
databases in the VALORE project: a validation through comparison of population estimates with
general practice databases and national survey. BMC public health 2013;13:15.
11. Colais P, Pinnarelli L, Fusco D, et al. The impact of a pay-for-performance system on timing to hip
fracture surgery: experience from the Lazio Region (Italy). BMC health services research
2013;13(1):393.
12. Abraha I, Montedori A, Eusebi P, et al. The Current State of Validation of Administrative Healthcare
Databases in Italy: A Systematic Review. Pharmacoepidemiol Drug Saf 2012;21:400-00.
13. Leone MA, Capponi A, Varrasi C, et al. Accuracy of the ICD-9 codes for identifying TIA and stroke in an
Italian automated database. Neurological Sciences 2004;25(5):281-8.
14. Rinaldi R, Vignatelli L, Galeotti M, et al. Accuracy of ICD-9 codes in identifying ischemic stroke in the
General Hospital of Lugo di Romagna (Italy). Neurological Sciences 2003;24(2):65-69.
15. Cattaruzzi C, Troncon MG, Agostinis L, et al. Positive predictive value of ICD-9th codes for upper
gastrointestinal bleeding and perforation in the Sistema Informativo Sanitario Regionale database.
Journal of clinical epidemiology 1999;52(6):499-502.
16. Galdarossa M, Vianello F, Tezza F, et al. Epidemiology of primary and secondary thrombocytopenia: first
analysis of an administrative database in a major Italian institution. Blood Coagul
Fibrinolysis;23(4):271-7.
17. Franchi C, Giussani G, Messina P, et al. Validation of healthcare administrative data for the diagnosis of
epilepsy. Journal of epidemiology and community health 2013;67(12):1019-24.
18. Tinelli M, Mannino S, Lucchi S, et al. Healthcare-acquired infections in rehabilitation units of the
Lombardy Region, Italy. Infection 2002;39(4):353-8.
19. Fano V, D'Ovidio M, del Zio K, et al. [The role of the quality of hospital discharge records on the
comparative evaluation of outcomes: the example of chronic obstructive pulmonary disease
(COPD)]. Epidemiologia e prevenzione 2011;36(3-4):172-79.
20. Faustini A, Canova C, Cascini S, et al. The reliability of hospital and pharmaceutical data to assess
prevalent cases of chronic obstructive pulmonary disease. COPD;9(2):184-96.
21. Bogliun G, Beghi E. Validity of hospital discharge diagnoses for public health surveillance of the Guillain-
Barré syndrome. Neurological Sciences 2002;23(3):113-17.
22. Baldi I, Vicari P, Di Cuonzo D, et al. A high positive predictive value algorithm using hospital
administrative data identified incident cancer cases. Journal of clinical epidemiology
2008;61(4):373-9.
23. Schifano P, Papini P, Agabiti N, et al. Indicators of breast cancer severity and appropriateness of surgery
based on hospital administrative data in the Lazio Region, Italy. BMC public health 2006;6:25.
24. Ferlay J, Soerjomataram I, Dikshit R, et al. Cancer incidence and mortality worldwide: sources, methods
and major patterns in GLOBOCAN 2012. International journal of cancer Journal international du
cancer 2015;136(5):E359-86.
25. Yuan Y, Li M, Yang J, et al. Using administrative data to estimate time to breast cancer diagnosis and
percent of screen-detected breast cancers - a validation study in Alberta, Canada. European journal
of cancer care 2015;24(3):367-75.
26. Deshpande AD, Schootman M, Mayer A. Development of a claims-based algorithm to identify colorectal
cancer recurrence. Annals of epidemiology 2015;25(4):297-300.
27. Quantin C, Benzenine E, Hagi M, et al. Estimation of national colorectal-cancer incidence using claims
databases. Journal of cancer epidemiology 2012;2012:298369.
28. Abdulmalak C, Cottenet J, Beltramo G, et al. Haemoptysis in adults: a 5-year study using the French
nationwide hospital administrative database. The European respiratory journal 2015;46(2):503-11.
29. Dehal A, Abbas A, Johna S. Comorbidity and outcomes after surgery among women with breast cancer:
analysis of nationwide in-patient sample database. Breast cancer research and treatment
2013;139(2):469-76.
30. Konski A. Clinical and economic outcomes analyses of women developing breast cancer in a managed
care organization. American journal of clinical oncology 2005;28(1):51-7.
31. Mittmann N, Liu N, Porter J, et al. Utilization and costs of home care for patients with colorectal cancer:
a population-based study. CMAJ open 2014;2(1):E11-7.
32. Bossuyt PM, Reitsma JB, Bruns DE, et al. Towards complete and accurate reporting of studies of
diagnostic accuracy: the STARD initiative. BMJ 2003;326(7379):41-44.
33. Benchimol EI, Manuel DG, To T, et al. Development and use of reporting guidelines for assessing the
quality of validation studies of health administrative data. Journal of clinical epidemiology
2011;64(8):821-9.
34. De Coster C, Quan H, Finlayson A, et al. Identifying priorities in methodological research using ICD-9-CM
and ICD-10 administrative data: report from an international consortium. BMC health services
research 2006;6:77.
35. Guzzinati S, Buzzoni C, De Angelis R, et al. Cancer prevalence in Italy: an analysis of geographic
variability. Cancer causes & control : CCC 2012;23(9):1497-510.
36. Nicita C, Buzzoni C, Chellini E, et al. [A comparative analysis between regional mesothelioma registries
and cancer registries: results of the ReNaM-AIRTUM project]. Epidemiol Prev 2014;38(3-4):191-9.
37. Zorzi M, Mangone L, Sassatelli R, et al. Screening for colorectal cancer in Italy: 2011-2012 survey.
Epidemiol Prev 2015;39(3 Suppl 1):115-25.
38. Nordstrom BL, Whyte JL, Stolar M, et al. Identification of metastatic cancer in claims data.
Pharmacoepidemiol Drug Saf 2012;21 Suppl 2:21-8.
39. Franchi C, Lucca U, Tettamanti M, et al. Cholinesterase inhibitor use in Alzheimer's disease: the
EPIFARM-Elderly Project. Pharmacoepidemiol Drug Saf 2011;20(5):497-505.
40. Rosato R, Sacerdote C, Pagano E, et al. Appropriateness of early breast cancer management in relation
to patient and hospital characteristics: a population based study in Northern Italy. Breast cancer
research and treatment 2009;117(2):349-56.