  • [CANCER RESEARCH 51, 1904-1909, April 1, 1991]

    Relative Operating Characteristic Analysis and Group Modeling for Tumor Markers: Comparison of CA 15.3, Carcinoembryonic Antigen, and Mucin-like Carcinoma-associated Antigen in Breast Carcinoma

    Hulbert K. B. Silver,1 Betty-Lou Archibald, Joseph Ragaz, and Andrew J. Goldman

    Department of Advanced Therapeutics [H. K. B. S., B-L. A.], Division of Medical Oncology [J. R.], and Division of Epidemiology, Biometry and Occupational Oncology [A. J. G.], British Columbia Cancer Agency, Vancouver, British Columbia, Canada

    ABSTRACT

    Relative operating characteristic (ROC) analysis was used to examine the clinical applicability of 3 breast carcinoma tumor markers, CA 15.3, carcinoembryonic antigen, and mucin-like carcinoma-associated antigen. Each tumor marker was quantitated in single serum samples collected from 100 normal blood donors, 60 patients with nonmalignant diseases, 33 women at high risk for breast carcinoma, 30 patients with malignancies other than breast carcinoma, and 158 breast carcinoma patients including 67 with no evidence of disease following surgery, 46 with a tumor burden <5 g, and 45 with a tumor burden >5 g. These were used to construct models for early diagnosis among those at high risk for breast carcinoma, the influence of nonmalignant disease on early diagnosis, discrimination of breast carcinoma from other adenocarcinomas, detection of early recurrence, and assessment of change in tumor burden. For each model ROC data permitted the unbiased selection of the most appropriate critical values based on the interaction of sensitivity and specificity. ROC analysis indicated that in practice the assays were remarkably similar. While CA 15.3 generally performed best, there was significant variation among models. Optimal marker selection can thus depend on specific clinical application. In some cases ROC identified a combination of markers as superior to any single assay, but this was not statistically significant.

    INTRODUCTION

    Tumor markers have become an important management tool in clinical oncology. While they have proven most useful in monitoring established disease, additional potential applications include diagnosis in the subgroup of patients suspected of harboring a malignancy, identification of a probable primary site for patients with an unknown primary, assessment of prognosis, and detection of early recurrence. Until recently the only well-accepted tumor marker for breast cancer was CEA.2 Its utility has been well reviewed (1). Now, with the advent of monoclonal antibody and recombinant DNA technology, there is an increasing number of potentially useful tumor markers. At least a dozen have been proposed for breast carcinoma in the past decade (2). Yet, it is often difficult or impossible to assess the relative clinical merit of a given marker from published reports. Part of the problem is that, while a number of statistical evaluation methods are available, results from one method may not be strictly comparable to another (3). In some cases the selected patient groups within a study do not reflect the intended clinical application or results for a tumor marker in one group of patients are used as a comparison for another marker used to evaluate different patients. Often unintended bias is introduced in the selection of decision criteria (normal cutoffs or upper limits of normal) required for most comparative analyses.

    Received 8/13/90; accepted 1/22/91.

    The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

    1 To whom requests for reprints should be addressed, at the British Columbia Cancer Agency, 600 West 10th Avenue, Vancouver, British Columbia, Canada, V5Z 4E6.

    2 The abbreviations used are: CEA, carcinoembryonic antigen; ROC, relative operating characteristic; N, normal; NMD, nonmalignant disease; NBA, non-breast adenocarcinoma; HR, high risk; AUC, area under the curve; TPF, true-positive fraction; FPF, false-positive fraction; MCA, mucin-like carcinoma-associated antigen.

    ROC analysis has evolved from signal detection theory as a general method of analyzing diagnostic systems (4-6). A major advantage of this analysis is that decision criteria need not be identified to compare accuracy of tests. Where the establishment of a decision criterion is indicated, ROC analysis can be used to identify a cutoff providing the best accuracy for a given test or aid in the unbiased selection of a cutoff for comparison among tests. In addition, the method provides for a relatively simple graphic representation of test accuracy.

    The objective of this study was to evaluate ROC analysis using selected patient groups as models for three potential breast tumor markers: CEA, CA 15.3, and MCA.

    MATERIALS AND METHODS

    Serum Sample Collection

    Blood was collected in glass tubes and centrifuged at 1400 x g for 8 min; then the serum was separated and stored in 0.5-ml aliquots at -70° until assayed. A total of 382 patient samples were collected and divided into groups as described below.

    N Group. The Canadian Red Cross kindly provided samples from 100 female donors, 16-91 years of age (mean, 38 years). At the time of sampling patients specifically denied any history of neoplastic, inflammatory, infectious, central nervous system, cardiovascular, or hepatic disease.

    NMD Group. The underlying diagnoses for the 60 samples in this category included active rheumatoid arthritis (35 patients), colitis or diverticulitis (6 patients), hypertension (4 patients), peptic ulcer (4 patients), chronic obstructive pulmonary disease (3 patients), pericarditis (3 patients), and one each of renal failure, cirrhosis, idiopathic anemia, lumbar disc disease, and sarcoidosis.

    HR Group. This group of 33 women was defined as high risk by the presence of a suspicious breast lesion, as determined by physical examination and mammography, and the identification of cellular atypia after fine needle aspiration. The latter is a recognized risk factor for breast carcinoma (7).

    NBA Group. The histologically confirmed diagnoses in this group of 30 adenocarcinoma patients included ovary (15 patients), lung (8 patients), and pancreas (7 patients). Each of these had advanced regional (15 patients) and/or advanced metastatic disease (15 patients).

    Breast Carcinoma Group. Samples from these 158 patients with histologically confirmed breast carcinoma were divided into three groups. Group 1 included 67 premenopausal patients with a history of lymph node involvement. All known disease had been resected, and the patients had completed adjuvant chemotherapy and were on a follow-up protocol with no evidence of recurrence at the time of serum sampling. Group 2 included 46 patients sampled at the time of first recurrence of directly assessable local or regional disease estimated at <5 g. Evaluation to exclude distant metastatic disease included serum liver function studies (aspartate aminotransferase, lactic dehydrogenase, bilirubin), chest radiographs, and 99mTc-diphosphonate bone scintigraphy. Group 3 consisted of 45 patients with advanced disease characterized by local, regional, or distant metastases >5 g. This classification is in keeping with our own previous work (8-11) and that of others (12). Although the usual clinical staging is an indirect correlate of tumor burden (13), individuals with advanced disease may have very little tumor burden at some times during their clinical course (for example stage IV patients with no evidence of disease). For the purpose of tumor marker evaluation our system is more refined in that it is more directly quantifiable and clearly denotes tumor burden at the time of serum sampling. Blood samples from group 2 and 3 patients were obtained before institution of treatment.

    Tumor Marker Assays

    CEA, as initially described by Gold and Freedman (14), is a complex large molecular weight glycoprotein associated with the cellular glycocalyx. Although best known as a tumor marker for colorectal cancer, the utility of CEA for breast carcinoma has been well documented (1). The method used in this study was an enzyme immunoassay using heat extraction and polyclonal anti-CEA antibody as described by the manufacturer, Abbott Laboratories, Chicago, IL.

    The CA 15.3 test is a double determinant radioimmunoassay utilizing two different monoclonal antibodies. The first, 115D8, was raised against milk fat globule membranes and reacts with a high molecular weight antigen, MAM-6. The second monoclonal antibody, DF3, was raised against a membrane-enriched extract of breast carcinoma cells and reacts with a heterogeneous Mr 300,000-450,000 circulating antigen (15, 16). In this assay, 115D8 is immobilized on polystyrene beads to complete the double antibody sandwich with 125I-DF3. The detailed procedure was as described by the manufacturer (Centocor, Malvern, PA) and supplied by Amersham Canada Ltd. (Oakville, Ontario, Canada).

    MCA is an Mr 350,000 glycoprotein produced by mammary carcinomas and some normal tissues. The monoclonal antibody used in this assay, b-12, recognizes an epitope thought to be closely associated with breast carcinoma (17). The two-step solid phase enzyme immunoassay method was performed as directed by the manufacturer, Hoffmann-La Roche Ltd., Etobicoke, Ontario, Canada.

    Statistical Methods

    The statistical basis for ROC methods has been well described (4-6). Briefly, when a test is used to detect patients having a disease (for example, breast carcinoma) among those free of the disease, a critical test value is usually selected that it is hoped will best distinguish between the two groups. For tumor markers, results greater than the critical value (test positive) generally denote increased probability of disease. This system defines four groups, those who: test positive with the disease (true positive), test positive without the disease (false positive), test negative with the disease (false negative), and test negative without the disease (true negative). ROC analysis takes advantage of a simplification of these familiar categories. The entire population can be described by just two functions: true-positive fraction (the proportion of test positives among those with the disease) and false-positive fraction (the proportion of test positives among those without the disease). These fractions are linked by any given critical value. For tumor markers such as those discussed in this paper, selection of a higher critical value must result in a smaller false-positive fraction as well as a smaller true-positive fraction. Clearly, selection of a critical value can have a profound influence when tumor markers are compared. The simple expedient of plotting true-positive fraction against false-positive fraction for a range of assay values will overcome many of the difficulties inherent in analyses based on critical value, as described in the text. The computer program used by us (June 1989 revision kindly provided by the developer, Charles E. Metz, Department of Radiology, University of Chicago Medical Center, Chicago, IL) analyzes data by a modification of a program by Dorfman and Alf (18). Related statistical tests are based on the bivariate normal model as previously described by Metz et al. (6).
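    The plotting step described above can be sketched in a few lines. This is a minimal illustration only: it assumes an empirical (nonparametric) curve and a trapezoidal AUC, not the maximum-likelihood binormal fit of the program used in this study. The function names and the marker values are hypothetical and are not data from the paper.

```python
def roc_points(diseased, controls):
    """Return sorted (FPF, TPF) pairs, one per candidate critical value.

    A result is test-positive when it is strictly greater than the
    critical value, following the convention in the text.
    """
    # Candidate critical values: every observed value plus one below the minimum,
    # so the curve runs from (1, 1) down to (0, 0).
    values = sorted(set(diseased) | set(controls))
    cutoffs = [values[0] - 1.0] + values
    points = []
    for c in cutoffs:
        tpf = sum(v > c for v in diseased) / len(diseased)
        fpf = sum(v > c for v in controls) / len(controls)
        points.append((fpf, tpf))
    return sorted(points)


def auc_trapezoid(points):
    """Area under the empirical ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical assay values, for illustration only.
diseased = [30, 45, 12, 80, 26]
controls = [10, 14, 22, 8, 31]

points = roc_points(diseased, controls)
auc = auc_trapezoid(points)
print(points)
print(round(auc, 3))
```

    Raising the critical value moves along the curve toward the origin, which is the joint fall in TPF and FPF noted above.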

    Fig. 1. ROC analysis of 100 normal controls (group N) versus patients with advanced breast carcinoma (group 3, n = 45) for the three individual assays (CA 15.3, CEA, and MCA) and the combination defined by logistic regression analysis (ALL). FPF and TPF are expressed as percentages.

    Fig. 2. Scattergram of assay values for CA 15.3, CEA, and MCA in patients who did not have breast carcinoma. Groups included NMD, HR, and NBA. Values are expressed on a log scale as multiples of the cutoff (MOC) defined by a 3% false-positive fraction among normal controls. Numbers below the double-ruled line, numbers of patients with assay values below the cutoff. Numbers below the line for CA 15.3, CEA, and MCA: HR, 25, 25, 27; NBA, 13, 18, 20; NMD, 44, 44 (third value missing in the source).

    Disease group comparisons were also made using conventional logistic regression analysis (19) in which all three tumor markers were used as variables. Not only did this permit an appraisal of the relative merits of ROC and logistic regression methods but the model developed by logistic regression could then be used in ROC analysis to examine the predictive value of using multiple markers as opposed to any single marker.
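    The idea of feeding a logistic regression model back into ROC analysis can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the fit uses plain gradient ascent on standardized values (the fitting method behind reference 19 is not specified here), the AUC is the simple rank-based estimate, and all names and marker values are hypothetical.

```python
import math


def standardize(rows):
    """Center each column to mean 0 and scale it to unit standard deviation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
           for j in range(d)]
    return [[(r[j] - means[j]) / sds[j] for j in range(d)] for r in rows]


def combined_score(x, b, w):
    """Predicted probability of disease, used as a single combined marker."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.5, steps=2000):
    """Fit intercept b and weights w by gradient ascent on the mean log-likelihood."""
    n, d = len(rows), len(rows[0])
    b, w = 0.0, [0.0] * d
    for _ in range(steps):
        grad_b, grad_w = 0.0, [0.0] * d
        for x, y in zip(rows, labels):
            err = y - combined_score(x, b, w)
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * x[j] / n
        b += lr * grad_b
        w = [wj + lr * gj for wj, gj in zip(w, grad_w)]
    return b, w


def auc_rank(pos, neg):
    """AUC as the fraction of (diseased, control) pairs ranked correctly; ties count 1/2."""
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical (CA 15.3, CEA, MCA) triples, for illustration only.
controls = [(12, 1.1, 6), (18, 2.0, 9), (25, 1.4, 14), (15, 3.1, 8), (30, 1.8, 11)]
diseased = [(28, 2.4, 10), (55, 1.6, 22), (20, 4.5, 15), (90, 6.0, 30), (35, 2.9, 12)]
labels = [0] * len(controls) + [1] * len(diseased)

rows = standardize(controls + diseased)
b, w = fit_logistic(rows, labels)
scores = [combined_score(x, b, w) for x in rows]

n_c = len(controls)
for j, name in enumerate(["CA 15.3", "CEA", "MCA"]):
    single = auc_rank([r[j] for r in rows[n_c:]], [r[j] for r in rows[:n_c]])
    print(name, round(single, 3))
print("ALL", round(auc_rank(scores[n_c:], scores[:n_c]), 3))
```

    Because the combined score is fitted and evaluated on the same samples, its AUC in a sketch like this is optimistic; a formal comparison needs a significance test such as those reported in the Results.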

    RESULTS

    The ROC analysis of each tumor marker as a discriminator between normal females and group 3 (advanced disease) breast carcinoma patients is shown in Fig. 1. Each curve delineates the relationship between true- and false-positive fractions for a range of critical marker values. Since overall discrimination is a function of AUC, it is immediately apparent that in this case CA 15.3 is the best of the three markers. The AUC for CA 15.3 is significantly greater than that for MCA (P = 0.03) but not CEA. Critical values providing a 3% false-positive fraction, as interpolated from ROC data, are 25 μg/liter for CA 15.3, 2.5 μg/liter for CEA, and 13 μg/liter for MCA (Fig. 1). These values can, in turn, be used to construct scattergrams (Figs. 2 and 3) in which tumor marker values are rendered comparable by normalizing them in terms of the critical value. This permits a direct visual comparison of assays having different critical values. For example, Fig. 2 shows that a variety of people not harboring breast carcinoma may have marker values greater than perfectly normal controls.
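    The normalization used for the scattergrams can be sketched as below. The sketch assumes the cutoff is taken as an interpolated quantile of the control values, which is a simplification of interpolating from the fitted ROC data; the function names and the control values are hypothetical.

```python
def cutoff_at_fpf(controls, fpf):
    """Critical value that leaves a fraction `fpf` of controls above it.

    Linear interpolation between order statistics, i.e. the (1 - fpf) quantile.
    """
    xs = sorted(controls)
    pos = (1.0 - fpf) * (len(xs) - 1)   # fractional index into the sorted values
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def multiples_of_cutoff(values, cutoff):
    """Express assay values as multiples of the cutoff (MOC)."""
    return [v / cutoff for v in values]


# Hypothetical control values 1..101, for illustration only.
controls = [float(v) for v in range(1, 102)]
cutoff = cutoff_at_fpf(controls, 0.03)
moc = multiples_of_cutoff([196.0, 49.0], cutoff)
print(cutoff, moc)
```

    Dividing each assay by its own cutoff places the three markers on a common scale, so a value of 1 marks the cutoff for every assay.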

    © 1991 American Association for Cancer Research. Downloaded from cancerres.aacrjournals.org on June 18, 2021.

    Fig. 3. Scattergram of assay values for CA 15.3, CEA, and MCA in breast carcinoma patients. Groups included those being followed up after resection of all known disease (group 1) and patients with limited tumor burden (group 2) and advanced disease (group 3). Values are expressed on a log scale as multiples of the cutoff (MOC) defined by a 3% false-positive fraction among normal controls. Numbers below the double-ruled line, numbers of patients with assay values below the cutoff.

    Fig. 4. ROC analysis of high-risk patients (group HR, n = 33) versus those with limited disease (group 2, n = 46) for the three individual assays (CA 15.3, CEA, and MCA) and the combination defined by logistic regression analysis (ALL). FPF and TPF are expressed as percentages.

    Fig. 5. ROC analysis of patients with NMD (n = 60) versus breast carcinoma patients with limited disease (group 2, n = 46) for the three individual assays (CA 15.3, CEA, and MCA) and the combination defined by logistic regression analysis (ALL). FPF and TPF are expressed as percentages.

    Fig. 6. ROC analysis of patients with advanced carcinoma other than breast (group NBA, n = 30) versus advanced breast carcinoma patients (group 3, n = 45) for the three individual assays (CA 15.3, CEA, and MCA) and the combination defined by logistic regression analysis (ALL). FPF and TPF are expressed as percentages.

    Tumor markers are rarely used as in Fig. 1 to discriminate between perfectly normal individuals and cancer patients. More appropriate comparison groups would serve as better models for decision making. One potential application for breast tumor markers is to identify carcinoma patients among women suspected of having underlying breast carcinoma. The model we have chosen to examine this is one that tests how well each marker discriminates between the HR and limited disease (group 2) groups, as defined above. ROC analysis is displayed in Fig. 4. The best single discriminator, as determined by AUC, is CA 15.3, and this is significantly better than MCA (P = 0.01). The dip in the CEA curve suggests unreliability for very low CEA values. Logistic regression analysis identified CA 15.3 as having the most independent predictive power (P = 0.014), with MCA also a significant independent predictor (P = 0.037). ROC analysis of the three assays in combination, as defined by logistic regression analysis, showed that the combined function was better than any single marker. However, this approached statistical significance only for the comparison with CEA (P = 0.092).

    For any diagnostic application it is important to evaluate how the identification of small tumor burden might be confounded by other diseases, particularly inflammatory conditions. A possible model for ROC analysis would be one that compares the NMD group with group 2 (Fig. 5). In keeping with Fig. 2, in which marker values tended to be higher in the NMD group than the HR group, the areas under the curves in Fig. 5 tend to be less than in Fig. 4.

    Another diagnostic problem is the presentation of a patient with advanced metastatic adenocarcinoma but no obvious primary site. An ROC model for this is one that determines how well a given marker can distinguish between group 3 and patients with advanced adenocarcinomas from sites other than breast (NBA). Fig. 6 shows that in this case CA 15.3 and MCA are better than CEA (for AUC, P = 0.005 and P = 0.01, respectively). A combination of assays was not significantly better than either CA 15.3 or MCA alone.

    Our model for the early detection of recurrence compared group 1 (those who had no evidence of disease after surgery) with group 2. As shown in Fig. 7, all three markers were remarkably similar. No single marker AUC was significantly better than another, nor was the AUC significantly better for the combined function defined by logistic regression analysis. The shape of the curves in Fig. 7 is markedly different from those in Fig. 1, in which an FPF of 3% corresponded to a relatively high TPF. However, the curves in this region of Fig. 7 are relatively steep and a modest increase in FPF results in a disproportionate increase in TPF. A critical value dictated by a 10% FPF would yield a TPF of 42% for the best tumor marker and 48% for the combined function based on logistic regression analysis. The corresponding critical values, as interpolated from the ROC data, would be 27 μg/liter for CA 15.3, 2.8 μg/liter for CEA, and 11 μg/liter for MCA (Table 1).

    Fig. 7. ROC analysis of breast carcinoma patients being followed up after resection of all known disease (group 1, n = 67) versus patients with limited disease (group 2, n = 46) for the three individual assays (CA 15.3, CEA, and MCA) and the combination defined by logistic regression analysis (ALL). FPF and TPF are expressed as percentages.

    Table 1 Critical values for each assay related to comparison groups and FPF

                             Critical values (μg/liter)
                         FPF 3%                   FPF 10%
    Groups compared(a)   CA 15.3   CEA   MCA      CA 15.3   CEA   MCA
    N vs. 3              25        2.5   13       22        1.8   11
    1 vs. 2              46        6.5   16       27        2.8   11

    (a) Groups are defined in "Materials and Methods."

    Fig. 8. ROC analysis of breast carcinoma patients with limited disease (group 2, n = 46) versus patients with advanced disease (group 3, n = 45) for the three individual assays (CA 15.3, CEA, and MCA) and the combination defined by logistic regression analysis (ALL). FPF and TPF are expressed as percentages.

    Tumor markers can be most effectively used to monitor the progress of established disease. The model chosen in this case was one that used ROC to evaluate group 2 versus group 3 (Fig. 8). The AUC for CEA was less than that for either CA 15.3 or MCA, but this was not statistically significant. By contrast, logistic regression analysis identified CEA as contributing the most independent predictive information with significance values of 0.15, 0.33, and 0.88 for CEA, CA 15.3, and MCA, respectively.

    DISCUSSION

    The use of ROC analysis has been suggested as an aid to clinical decision making (20, 21), yet it has not been widely accepted or fully exploited, especially in the field of tumor markers. Some authors (22) have included ROC analysis as an element in their evaluation but have not included crucial elements in the analysis such as the use of clinically relevant group models or the importance of ROC for selecting critical values defined by a specific clinical problem.

    The ROC analysis of group N versus group 3 (Fig. 1) has some practical use as the basis for normalizing simple scattergrams. However, the primary reason for including this analysis is to illustrate the inappropriateness of validating tumor marker assays with these groups, as has been frequently the case in the past. Certainly, critical values are most commonly derived from a population of perfectly normal individuals and a cutoff corresponding to the 3-5% FPF. The critical values we obtained are in keeping with others for CA 15.3 and MCA (17, 23-26). For CEA our critical value of 2.5 is one half the value often selected by others from general experience only (1). Inadvertent selection of an inappropriately high critical value for CEA would make CEA appear less sensitive in comparative studies. One advantage of ROC analysis is that it lends itself to the unbiased selection of critical values.

    Having determined critical values, it is possible to construct scattergrams normalized on this basis (Figs. 2 and 3). This permits an immediate assessment of the relative merit of the various assays. In this case the most striking feature is that all three assays may be confounded by conditions other than breast carcinoma.

    The configuration of the curves in Fig. 1 indicates that all 3 assays have an excellent ability to discriminate between perfectly normal individuals and patients with advanced disease. Unfortunately, these groups do not reflect clinical reality in which the usual diagnostic problem is to identify relatively small tumor burden. We know of no data supporting the use of these assays alone for diagnostic screening. In that application one would expect the false-positive rate to be excessive, especially in view of the confounding influence of inflammatory disease (Figs. 2 and 5). However, tumor markers do have diagnostic potential in the subgroup of people in whom malignancy is already suspected. In the ROC model for this (Fig. 4) it is immediately apparent that assay discrimination has decreased compared with the clinically improbable conditions represented by Fig. 1. While the markers do have predictive power in the early diagnosis model, especially for the combined function, it is unlikely that these tumor markers would replace mammography. It is possible that a combination of mammography and a tumor marker would be significantly better than either alone. This has been examined in a preliminary study of another tumor marker, MSA (27).

    As indicated in Figs. 2 and 6, none of the tumor markers is truly diagnostic when discriminating between advanced breast cancer and other adenocarcinomas. Our findings are generally in keeping with others who found individual markers elevated in patients harboring a variety of advanced epithelial malignancies (2, 28). Given the above, the discriminating power displayed in Fig. 6 is surprisingly good for CA 15.3 and MCA. This suggests that these markers could prove helpful in the clinical "primary unknown" situation in which one may have to depend on the best information available to select treatment for trial. By contrast, in a study using a fixed criterion based on normal sera, Colomer et al. (28) concluded that CA 15.3 would probably not be useful in this application. It is possible that ROC analysis, which is not constrained by a fixed critical value, would have revealed some discriminating power. This question would best be settled in a larger study than ours, including a variety of epithelial malignancies individually matched for tumor burden. It is perhaps not surprising that our study showed significantly less specificity for CEA, since that antigen was derived from adenocarcinoma of bowel, while the other markers were developed from breast carcinoma.

    The ROC model for detection of early recurrence is more encouraging (Fig. 7). One advantage of ROC analysis is that it permits a ready evaluation of the dynamic interaction of the TPF and FPF over a range of potential critical values. In this case one could take advantage of the relatively steep curves to propose that critical values based on an FPF of 10% would be much more relevant than the more standard 3% used in the discussion of Fig. 1.

    Not only is ROC helpful in selecting a critical value, it demonstrates that a critical value selected for one clinical problem may be quite misleading in another application. This is emphasized in Table 1, in which critical values defined by a 3% FPF for group N versus group 3 (25, 2.5, and 13 µg/liter for CA 15.3, CEA, and MCA, respectively) are quite different from those for a 3% FPF applied to the group 1 versus group 2 early recurrence model (46, 6.5, and 16 µg/liter for CA 15.3, CEA, and MCA, respectively). In fact, the critical values dictated for a 3% FPF in group N versus group 3 are much closer to the 10% FPF in the early recurrence model. Investigators using decision criteria based on the 3-5% FPF for a normal population may be unwittingly dealing with quite different FPF values when the same decision criteria are applied to a different clinical setting. Further examination of Table 1 shows that the shift in critical value of one tumor marker cannot be used to predict the behavior of another. The use of ROC analysis clarifies this problem and encourages the selection of decision criteria tailored to a specific clinical problem.
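
    The dependence of a critical value on the reference group can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis programs used in the study: the marker values are invented for the example, the names `critical_value_at_fpf`, `observed_fpf`, `group_n`, and `group_1` are hypothetical, and a sample is called positive when its value is strictly above the cutoff. A 5% FPF is used because each invented group has only 20 samples.

```python
import math

def critical_value_at_fpf(negatives, target_fpf):
    """Smallest observed cutoff whose FPF in `negatives` is <= target_fpf.

    A sample is called positive when its value is strictly greater
    than the cutoff. Requires 0 <= target_fpf < 1.
    """
    if not 0.0 <= target_fpf < 1.0:
        raise ValueError("target_fpf must be in [0, 1)")
    ordered = sorted(negatives)
    n = len(ordered)
    # At most this many disease-negative samples may exceed the cutoff.
    allowed = math.floor(target_fpf * n + 1e-9)
    return ordered[n - allowed - 1]

def observed_fpf(negatives, cutoff):
    """Fraction of disease-negative samples falsely called positive."""
    return sum(v > cutoff for v in negatives) / len(negatives)

# Invented marker values (NOT data from the study).
# group_n: stands in for normal blood donors.
# group_1: stands in for a second disease-negative group, such as
#          patients with no evidence of disease after surgery.
group_n = [5, 7, 8, 9, 10, 11, 12, 13, 14, 15,
           16, 17, 18, 19, 20, 21, 22, 23, 24, 30]
group_1 = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28,
           30, 32, 34, 36, 38, 40, 42, 44, 46, 60]

cutoff_n = critical_value_at_fpf(group_n, 0.05)
cutoff_1 = critical_value_at_fpf(group_1, 0.05)

print("cutoff for 5% FPF in group_n:", cutoff_n)
print("cutoff for 5% FPF in group_1:", cutoff_1)
print("FPF in group_1 when the group_n cutoff is reused:",
      observed_fpf(group_1, cutoff_n))
```

    With these invented values, the cutoff that holds the FPF at 5% in the first group produces a far higher FPF when reused in the second group, which is the kind of shift Table 1 illustrates.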

    Identification of early recurrence is a potential clinical application for tumor markers that deserves close scrutiny. Studies of CEA have been inconsistent, perhaps partly because optimum critical values had not been selected for this purpose (1). Our single sample ROC model for early recurrence (Fig. 7) suggests that tumor markers might be useful. Specific recommendations would require a prospective serial sample study aided by ROC and including an analysis of diagnostic lead time. We have such a study in progress. Even so, application of tumor markers for this clinical problem would only be warranted if further intervention were contemplated for asymptomatic early recurrence.

    The greatest clinical application for tumor markers is in the serial monitoring of patients with established disease. An increasing or decreasing CEA correlates with disease progression or regression, respectively, in about 85% of cases (1). From Fig. 8 we might anticipate that MCA and CA 15.3 would perform at least as well. Our finding that CEA as a single marker did less well than CA 15.3 is in keeping with others (29), although in our study this was not statistically significant.

    ROC and logistic regression analyses are powerful tools for the assessment of tumor markers. When one uses the AUC method for comparing curves, ROC is similar to logistic regression analysis in providing a global assessment of assay performance independent of any fixed critical value. Logistic regression analysis is modeled on the optimal mathematical discrimination, which is, in turn, influenced by the distribution of the 2 samples. This may not accurately reflect decisions based on fixed decision criteria. A similar potential shortcoming is perhaps more evident for ROC, in which it is graphically apparent that all areas of the curve are not equally important. However, with ROC the clear visual correlate of performance can be used either to calculate the best discriminating critical value or to select a critical value based on the most clinically acceptable compromise between FPF and TPF. For example, in Fig. 7 the optimal discriminating value for CA 15.3, as determined by the point of the curve closest to the upper left-hand corner (100% TPF, 0% FPF) (30), would be at an FPF of 28%. Yet the decision maker might decide that a 10% FPF value would be more clinically useful. Having selected the best decision criterion for a given clinical problem, one can then compare the discriminating power at this point rather than relying on AUC comparisons (31). In our study the resulting values were not remarkably different from AUC comparisons and have not been reported.
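
    The two ways of choosing a critical value described above can be sketched as follows. This is an illustrative example only: the marker values are invented, the function names `roc_points`, `closest_to_corner`, and `best_within_fpf` are hypothetical, and the curve is the raw empirical one (every observed value tried as a cutoff) rather than a fitted curve.

```python
import math

def roc_points(negatives, positives):
    """Return (cutoff, FPF, TPF) for every observed value used as a cutoff.

    A sample is called positive when its value is strictly greater
    than the cutoff.
    """
    points = []
    for cutoff in sorted(set(negatives) | set(positives)):
        fpf = sum(v > cutoff for v in negatives) / len(negatives)
        tpf = sum(v > cutoff for v in positives) / len(positives)
        points.append((cutoff, fpf, tpf))
    return points

def closest_to_corner(points):
    """Operating point nearest to the upper left corner (FPF 0, TPF 1)."""
    return min(points, key=lambda p: math.hypot(p[1], 1.0 - p[2]))

def best_within_fpf(points, max_fpf):
    """Highest-TPF operating point whose FPF does not exceed max_fpf."""
    eligible = [p for p in points if p[1] <= max_fpf + 1e-12]
    return max(eligible, key=lambda p: p[2])

# Invented marker values (NOT data from the study).
negatives = [10, 12, 15, 18, 20, 22, 25, 30, 35, 45]
positives = [14, 21, 24, 28, 33, 38, 42, 50, 60, 80]

points = roc_points(negatives, positives)
corner_choice = closest_to_corner(points)
capped_choice = best_within_fpf(points, 0.10)

print("closest to corner (cutoff, FPF, TPF):", corner_choice)
print("best with FPF <= 10% (cutoff, FPF, TPF):", capped_choice)
```

    The two rules generally pick different cutoffs: the corner rule trades some specificity for sensitivity, while the capped rule fixes the tolerable FPF first and accepts whatever TPF results.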

    While ROC analysis can identify a marker with significantly improved discriminating power in a pairwise analysis, the great power of logistic regression analysis is in identification of significant independent predictors that may be useful in combination. For example, in the early diagnosis model (Fig. 4) ROC analysis identified CA 15.3 as significantly better than MCA as a single predictor. Yet logistic regression analysis identified MCA as a significant independent predictor.

    In an effort to combine the merits of both methods of analysis we have used the logistic regression function derived from all three assays to define an additional ROC curve. In each case the combined function was best but not significantly better than the best single marker.
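
    The idea of scoring a combined logistic function with ROC can be sketched as below. This is a hedged illustration, not the study's procedure: the patient values are invented, the intercept and coefficients are placeholders rather than fitted estimates (the fitting step is omitted), and the names `auc` and `logistic_score` are hypothetical. The AUC is computed in its Mann-Whitney form, as the fraction of negative-positive pairs ranked correctly.

```python
import math

def auc(negatives, positives):
    """Empirical AUC: fraction of (negative, positive) pairs in which the
    positive sample scores higher; ties count one half."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def logistic_score(markers, intercept, coefs):
    """Predicted probability from a logistic function of the marker values."""
    z = intercept + sum(b * x for b, x in zip(coefs, markers))
    return 1.0 / (1.0 + math.exp(-z))

# Invented (CA 15.3, CEA, MCA) triples, NOT data from the study.
neg_patients = [(20, 2.0, 8.0), (25, 3.0, 10.0), (30, 1.5, 12.0),
                (35, 4.0, 9.0), (42, 3.6, 11.0)]
pos_patients = [(28, 5.0, 9.5), (38, 2.2, 15.0), (45, 6.0, 10.5),
                (60, 3.5, 18.0), (80, 8.0, 20.0)]

# Placeholder coefficients; in practice these would come from a
# logistic regression fitted to the two groups.
INTERCEPT = -6.0
COEFS = (0.05, 0.4, 0.15)

marker_names = ("CA 15.3", "CEA", "MCA")
single_auc = {
    name: auc([row[i] for row in neg_patients],
              [row[i] for row in pos_patients])
    for i, name in enumerate(marker_names)
}
combined_auc = auc(
    [logistic_score(row, INTERCEPT, COEFS) for row in neg_patients],
    [logistic_score(row, INTERCEPT, COEFS) for row in pos_patients],
)

for name, value in single_auc.items():
    print(name, "AUC:", value)
print("combined function AUC:", combined_auc)
```

    Because the AUC depends only on the ranking of scores, the linear predictor would give the same area as the logistic probability; the logistic form is kept here only to mirror the description in the text.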

    In our analysis CA 15.3 has been consistently the best overall discriminator. However, it is perhaps more interesting that when placed on an equal footing by ROC analysis, the assays are remarkably similar. A larger study would be required to reveal more subtle differences. Our studies illustrate the need for clinical modeling when comparing assays and the importance of selecting critical values based on such models. ROC analysis with these relatively simple models would be most valuable in assessing the spate of competing tumor markers now becoming widely available. In fact, the general methods could be applied to a great variety of clinical and research decision-making problems.

    ACKNOWLEDGMENTS

    The authors wish to thank Dr. J. Hanley of the Department of Epidemiology and Biostatistics, McGill University, and Dr. C. Metz of the University of Chicago Medical Center for their generosity in providing current ROC analysis programs and discussion of appropriate application. We also thank the technical staff in the Tumor Marker Laboratory, Kim Kieler for her help in running computer programs, Dr. P. Rebbeck for the provision of clinical material, and Linda Wood for secretarial assistance.

    REFERENCES

    1. Beard, D. B., and Haskell, C. M. Carcinoembryonic antigen in breast cancer. Am. J. Med., 80: 241-245, 1986.

    2. Tondini, C., Hayes, D. F., and Kufe, D. W. Circulating tumor markers in breast cancer. Hematol. Oncol. Clin. North Am., 5: 653-674, 1989.

    3. Statland, B. E., Winkel, P., Burke, M. D., and Galen, R. S. Quantitative approaches used in evaluating laboratory measurements and other clinical data. In: J. B. Henry (ed.), Clinical Diagnosis and Management by Laboratory Methods, Vol. 1, pp. 525-555. Philadelphia: W. B. Saunders Co., 1979.

    4. Swets, J. A. Measuring the accuracy of diagnostic systems. Science (Washington DC), 240: 1285-1292, 1988.

    5. McNeil, B. J., and Hanley, J. Statistical approaches to the analysis of receiver operating characteristic (ROC) curves. Med. Decis. Making, 4: 137-150, 1984.

    6. Metz, C. E., Wang, P-L., and Kronman, H. B. A new approach for testing the significance of differences between ROC curves from correlated data. In: F. Deconinck (ed.), Information Processing in Medical Imaging, pp. 432-445. The Hague, The Netherlands: Nijhoff, 1984.

    7. Dupont, W. D., and Page, D. L. Risk factors for breast cancer in women with proliferative breast disease. N. Engl. J. Med., 312: 146-151, 1985.

    8. Silver, H. K. B., Rangel, D. M., and Morton, D. L. Serum sialic acid elevations in malignant melanoma patients. Cancer (Phila.), 41: 1497-1499, 1978.

    9. Silver, H. K. B., Karim, K. A., Archibald, E. L., and Salinas, F. A. Serum sialic acid and sialyltransferase as monitors of tumor burden in malignant melanoma patients. Cancer Res., 39: 5036-5042, 1979.

    10. Silver, H. K. B., Karim, K. A., and Salinas, F. A. Relationship of total serum sialic acid to sialylglycoprotein acute-phase reactants in malignant melanoma. Br. J. Cancer, 41: 745-750, 1980.

    11. Silver, H. K. B., Karim, K. A., Salinas, F. A., and Swenerton, K. D. Significance of sialic acid and carcinoembryonic antigen as monitors of tumor burden among patients with carcinoma of the ovary. Surg. Gynecol. Obstet., 153: 209-213, 1981.

    12. Salinas, F. A., Wee, K. H., and Ceriani, R. L. Significance of breast carcinoma-associated antigens as monitor of tumor burden: characterization by monoclonal antibodies. Cancer Res., 47: 907-913, 1987.

    13. Swenerton, K. D., Legha, S. S., Smith, T., Hortobagyi, G. N., Gehan, E. A., Yap, H-Y., Gutterman, J. U., and Blumenschein, G. R. Prognostic factors in metastatic breast cancer treated with combination chemotherapy. Cancer Res., 39: 1552-1562, 1979.

    14. Gold, P., and Freedman, S. O. Specific carcinoembryonic antigens of the human digestive system. J. Exp. Med., 122: 467-481, 1965.

    15. Hilkens, J., Buijs, F., Hilgers, J., Hageman, P. H., Calafat, J., Sonnenberg, A., and Van der Valk, M. Monoclonal antibodies against human milk fat globule membranes detecting differentiation antigens of the mammary gland and its tumors. Int. J. Cancer, 34: 197-206, 1984.

    16. Kufe, D., Inghirami, G., Abe, M., Hayes, D., Justi-Wheeler, H., and Schlom, J. Differential reactivity of a novel monoclonal antibody (DF3) with human malignant versus benign breast tumors. Hybridoma, 3: 223-232, 1984.

    17. Bombardieri, E., Gion, M., Riccardo, M., Ruggero, D., Bruscagnin, G., and Buraggi, G. A mucinous-like carcinoma-associated antigen (MCA) in the tissue and blood of patients with primary breast cancer. Cancer (Phila.), 63: 490-495, 1989.

    18. Dorfman, D. D., and Alf, E. Maximum likelihood estimation of parameters of signal-detection theory and determination of confidence intervals: rating method data. J. Math. Psychol., 6: 487-496, 1969.

    19. Cox, D. R. Regression models and life table analysis. J. R. Stat. Soc., 34: 187-220, 1972.

    20. Hanley, J. A. The place of statistical methods in radiology (and in the bigger picture). Invest. Radiol., 24: 10-16, 1989.

    21. Beck, J. R., and Shultz, E. K. The use of relative operating characteristic (ROC) curves in test performance evaluation. Arch. Pathol. Lab. Med., 110: 13-20, 1986.

    22. Carson, J. L., Eisenberg, J. M., Shaw, L. M., Kundel, H. L., and Soper, K. A. Diagnostic accuracy of four assays of prostatic acid phosphatase. JAMA, 253: 665-669, 1985.

    23. Fujino, N., Haga, Y., Sakamoto, K., Egami, H., Kimura, M., Nishimura, R., and Akagi, M. Clinical evaluation of an immunoradiometric assay for CA15-3 antigen associated with human mammary carcinoma: comparison with carcinoembryonic antigen. Jpn. J. Clin. Oncol., 16: 335-346, 1986.

    24. Gion, M., Mione, R., Dittadi, R., Fasan, S., Pallini, A., and Bruscagnin, G. Evaluation of CA15-3 serum levels in breast cancer patients. J. Nucl. Med. All. Sci., 30: 29-36, 1986.

    25. Pons-Anicet, D. M. F., Krebs, B. P., Mira, R., and Namer, M. Value of CA15-3 in the follow-up of breast cancer patients. Br. J. Cancer, 55: 567-569, 1987.

    26. Stahli, C., Caravatti, M., Aeschbacher, M., Kocyba, C., Takacs, B., and Carmann, H. Mucin-like carcinoma-associated antigen defined by three monoclonal antibodies against different epitopes. Cancer Res., 48: 6799-6802, 1988.

    27. Hare, W. S. C., Tjandra, J. J., Russell, I. S., Collins, J. P., and McKenzie, I. F. C. Comparison of mammary serum antigen assay with mammography in patients with breast cancer. Med. J. Aust., 149: 402-406, 1988.

    28. Colomer, R., Ruibal, A., Genolla, J., and Salvador, L. Circulating CA 15-3 antigen levels in non-mammary malignancies. Br. J. Cancer, 59: 283-286, 1989.

    29. Tondini, C., Hayes, D. F., Gelman, R., Henderson, I. C., and Kufe, D. W. Comparison of CA15-3 and carcinoembryonic antigen in monitoring the clinical course of patients with metastatic breast cancer. Cancer Res., 48: 4107-4112, 1988.

    30. Swets, J. A., and Pickett, R. M. Fundamentals of accuracy analysis. In: Evaluation of Diagnostic Systems: Methods from Signal Detection Theory, pp. 15-45. New York: Academic Press, 1982.

    31. Wieand, S., Gail, M. H., James, B. R., and James, K. L. A family of nonparametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika, 76: 585-592, 1989.

  • Hulbert K. B. Silver, Betty-Lou Archibald, Joseph Ragaz, et al. Relative Operating Characteristic Analysis and Group Modeling for Tumor Markers: Comparison of CA 15.3, Carcinoembryonic Antigen, and Mucin-like Carcinoma-associated Antigen in Breast Carcinoma. Cancer Res 1991;51:1904-1909.

    Updated version: access the most recent version of this article at http://cancerres.aacrjournals.org/content/51/7/1904
