This article has been accepted for publication and undergone full peer review but has not
been through the copyediting, typesetting, pagination and proofreading process, which may
lead to differences between this version and the Version of Record. Please cite this article as
doi: 10.1002/mp.12331
This article is protected by copyright. All rights reserved.
Article Type: Research Article
Computerized Detection of Lung Nodules through Radiomics
Jingchen Ma
School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Zien Zhou
Department of Radiology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University,
Shanghai 200127, China
Yacheng Ren, Junfeng Xiong, Ling Fu, Qian Wang, and Jun Zhao a)
School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Purpose: Lung cancer is a major cause of cancer deaths, and the 5-year survival rate of stage IV lung
cancer patients is only 2%. However, the 5-year survival rate of stage I lung cancer patients
significantly increases to 50%. As such, spiral computed tomography (CT) scans are necessary to
diagnose high-risk lung cancer patients in early stages. In this study, a computer-aided detection
(CAD) system with radiomics was proposed. This system could automatically detect pulmonary
nodules and reduce radiologists’ workload and human errors.
Methods: In the proposed scheme, a nodular enhancement filter was used to segment nodule
candidates and extract radiomic features. A synthetic minority over-sampling technique was also
applied to balance the samples, and a random forest method was utilized to distinguish between real
nodules and false positive detections. The radiomics approach quantified intratumor heterogeneity
and multifrequency information, which are highly correlated with lung nodules.
Results: The proposed method was used to evaluate 1,004 CT cases from the well-known Lung
Image Database Consortium, and 88.9% sensitivity with four false positive detections per CT scan
was obtained by randomly selecting 502 cases for training and 502 other cases for testing.
Conclusions: The proposed scheme yielded a high performance on the LIDC database. Therefore,
the proposed scheme is possibly effective for various CT configurations used in routine diagnosis and
lung cancer screening.
Keywords: CAD, lung nodule detection, radiomics, synthetic minority over-sampling, random forest
1. INTRODUCTION
Lung cancer is the leading cause of cancer-related death in both sexes worldwide, and the number of deaths
continues to increase in America and China1, 2. Effective treatments that reduce mortality in terminal stages have
yet to be developed, whereas the survival rate is markedly higher when the disease is found in early stages3, 4.
Computed tomography (CT) screening for lung cancer is necessary to prolong lives, particularly for high-risk
patients5. Screening helps improve the curative rates of lung cancer, but the large number of patients, each with
hundreds of CT slices, poses a substantial workload to radiologists, and incorrect diagnoses are likely under
these conditions. Therefore, computer-aided diagnosis (CAD) systems for lung nodule detection have been developed
to assist radiologists in diagnosing lung cancer with high efficiency and low misdiagnosis rates.
CAD systems mainly consist of two stages: detection of hundreds of lung nodule candidates from a CT scan
and reduction of false positive detections generated in the first stage. In the first stage, some techniques,
including double thresholds, morphological operations, and multiple selective enhancement filters, are often
applied to generate a large set of nodule candidates to obtain high sensitivity. However, a nodule missed in the
first stage remains undetected when the second stage begins. Thus, all possible nodule candidates must be
included in the first stage to ensure high accuracy in the final assessment. Li et al.6, 7 achieved good performance
in the first stage by developing selective enhancement filters that enhance dot-like nodules and suppress line-like
blood vessels and airway walls. Tan et al.8 proposed the divergence of a normalized gradient as a nodule
candidate detector similar to the Laplacian of Gaussian. The former approach normalizes the gradient of the
image before taking the divergence; by contrast, the latter does not normalize the gradient. Messay et al.9 used
multiple thresholds and morphological opening structure sizes to produce nodule candidate masks. Duggan et
al.10 combined global segmentation methods with mean curvature minimization and a rule-based filter to
detect lung nodule candidates. The subsequent false positive reduction stage determines most of the
performance of the CAD system and relies on discriminative features and a supervised classification method. In
this stage, intensity, shape11, and texture features12 are often used to represent the characteristics of nodule
candidates. To optimize features and reduce computation time, Messay and colleagues9 performed a sequential
forward feature selection. Rule-based classifiers, support vector machines, Fisher linear discriminant classifiers,
and fixed-topology ANN classifiers can be used to analyze the extracted features and to distinguish real nodules
from false positive detections6, 8, 9, 13. To improve classification performance, Li et al.6 utilized an automated
rule-based classifier that iteratively determines the optimal thresholds and is designed to minimize the
overtraining effect. Tan et al.8 employed feature-deselective neuroevolution of augmenting topologies (FD-NEAT)
to reduce false positive detections. However, the performance of the fixed-topology ANN classification
method was slightly higher than that of FD-NEAT. Lu et al.14 categorized lung nodules into 13 different cases
based on size, shape, and position and organized decision trees to adopt features extracted from the 13 categories.
Some researchers evaluated their methods on their own datasets, whereas others used the freely accessible
Lung Image Database Consortium (LIDC) database to evaluate performance and compare their
findings with those described in other studies. Setio et al.15 tested on 888 scans of the LIDC dataset with 5-fold
cross validation and reached detection sensitivities of 78.2% and 87.9% at 1 and 4 FP/scan, respectively. In
addition, they achieved 90% sensitivity at 4 FP/scan if nodules accepted by a minority of radiologists were not
counted as false positives. Lu et al.14 proposed a hybrid method trained on 196 CT scans and tested on 98 CT scans
with 223 nodules and achieved 85.2% sensitivity at 3.1 FP/scan. Brown et al.16 developed a CAD system that was
evaluated on 108 CT scans from the LIDC database and achieved 75% sensitivity at 3.1 FP/scan. Tan et al.13
achieved a detection sensitivity of 83% at 4 FP/scan by using 10-fold cross validation on 360 CT scans. Guo
and Li17 achieved 85% sensitivity at 2.6 FP/scan with leave-one-out cross validation on 85 LIDC CT scans
containing 111 nodules. Tan et al.8 evaluated their scheme on 125 LIDC database cases (80 real nodules with
agreement of all four radiologists) and achieved 87.5% sensitivity at 4 FP/scan. Messay et al.9 proposed a CAD
system that achieved a sensitivity of 82.7% at 3 FP/scan on 84 CT scans. Golosio et al.18 tested on 84 CT scans
from the LIDC database with 77 nodules on which four radiologists agreed and achieved 79% sensitivity at 4 FP/scan.
Radiomics has been commonly used to classify lung cancer stages. The emergence of radiomics approaches
has provided a new means to maximize the use of CT images19-21: images are treated not only as pictures but also
as high-dimensional minable data. Radiomics can be applied to obtain a large number of quantitative
features from CT images and perform a comprehensive characterization of lung nodules. This approach can
extract information on intratumor heterogeneity to reveal the unique characteristics of tumors for clinical
diagnosis and prognosis. The ability of radiomics to stage particular nodules detected and outlined by
radiologists has been widely described, but its ability to distinguish nodules from pulmonary parenchyma and
blood vessels requires further investigation. To this end, we also propose enhancement features that quantify the
adjacent tissues. Another problem is that the datasets used to evaluate existing systems are small or are subsets
of the LIDC database. Consequently, it remains unclear whether a system can be applied to the whole dataset with
the same efficiency and accuracy. In this study, radiomics was applied to the whole LIDC-IDRI dataset to evaluate
lung nodule detection. This work added wavelet and enhancement features to the classical
features and achieved 88.9% sensitivity at four false positive results per scan.
Fig. 1. Distribution of nodule sizes and types in the LIDC database.
Fig. 2. Pipeline of the overall scheme.
2. MATERIALS
The adopted CT datasets were obtained from the whole LIDC22-25. LIDC is a public lung nodule database from
the National Biomedical Imaging Archive. The National Cancer Institute collected 1010 patient CT scans from
several hospitals to support the development of CAD systems for lung cancer diagnosis and screening. All the CT
scans were read in a two-phase reading procedure by a panel of four experienced radiologists participating in the
LIDC project. The first reading phase was blinded: each radiologist interpreted nodule locations and radiological
characteristics independently. The second reading phase was unblinded: each radiologist reviewed the first
reading results along with the results of the other three radiologists and could change their previous
results to determine the final nodule annotation.
The LIDC CT scans were collected from seven institutions and were obtained with various types of CT
scanners. The CT scans were acquired under different protocols and hence presented various slice
thicknesses that ranged from 0.6 mm to 5 mm and spatial resolutions from 0.5 mm to 1 mm. Some CT scans were
also contrast enhanced. The size and type distribution of the 1361 nodules annotated by at least three
radiologists are presented in Fig. 1. We computed the diameters by using the following equation .
Volume was determined as the pixel size multiplied by the number of pixels inside the contours of each nodule.
The contours were defined as the boundaries of overlapped area marked by at least three radiologists. This
whole dataset mimics actual scenarios.
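The diameter equation itself did not survive in this copy of the text. As a minimal sketch, assuming the diameter is that of a sphere with the same volume as the nodule (a common convention, not confirmed by the source), the computation could look as follows. The function names and the spacing argument are illustrative only.

```python
import math

def nodule_volume_mm3(n_voxels, spacing_mm):
    """Volume as the voxel count times the volume of one voxel (mm^3)."""
    sx, sy, sz = spacing_mm
    return n_voxels * sx * sy * sz

def equivalent_diameter_mm(volume_mm3):
    """Diameter of the sphere whose volume equals volume_mm3.

    Assumption: the paper's missing equation is the volume-equivalent
    sphere diameter d = 2 * (3V / (4*pi))^(1/3).
    """
    return 2.0 * (3.0 * volume_mm3 / (4.0 * math.pi)) ** (1.0 / 3.0)
```

For example, a contour enclosing 100 voxels of 1 mm isotropic spacing has a volume of 100 mm^3 under this definition.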
3. METHODS
The proposed CAD scheme was performed in two stages: lung nodule candidate generation and false
positive reduction. Figure 2 displays the framework of our scheme; the first stage (candidate generation)
and the second stage (false positive reduction) are presented in the upper panel and the lower panel, respectively.
Example outputs of each step are illustrated in Fig. 3.
A. Preprocessing
The detection scheme relied on the shape features of nodule candidates, and the original voxel size varied
among different CT configurations. Therefore, all scans had to be converted to isotropic voxels, i.e., voxels with
the same sagittal, coronal, and transverse spacing. Tri-linear interpolation was applied to resample
the original data to isotropic voxels with 1 mm resolution.
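The resampling step can be illustrated with a small, dependency-free sketch of tri-linear interpolation. It assumes the volume is a nested list indexed as volume[z][y][x] with at least two samples per axis; the function names are hypothetical, and a production pipeline would more likely use an array library.

```python
def _cell(pos, size):
    """Return the lower grid index and fractional offset along one axis."""
    pos = min(max(pos, 0.0), size - 1.0)   # clamp to the grid
    lo = min(int(pos), size - 2)           # keep the 2-sample cell inside the grid
    return lo, pos - lo

def trilinear_sample(volume, z, y, x):
    """Interpolate volume[z][y][x] at a fractional index position."""
    z0, fz = _cell(z, len(volume))
    y0, fy = _cell(y, len(volume[0]))
    x0, fx = _cell(x, len(volume[0][0]))
    value = 0.0
    # Weighted sum over the eight corners of the enclosing cell.
    for dz, wz in ((0, 1.0 - fz), (1, fz)):
        for dy, wy in ((0, 1.0 - fy), (1, fy)):
            for dx, wx in ((0, 1.0 - fx), (1, fx)):
                value += wz * wy * wx * volume[z0 + dz][y0 + dy][x0 + dx]
    return value

def resample_isotropic(volume, spacing_mm, new_spacing_mm=1.0):
    """Resample to isotropic voxels; spacing_mm is (z, y, x) in mm."""
    sizes = (len(volume), len(volume[0]), len(volume[0][0]))
    # Number of output samples covering the same physical extent.
    out = [int(round((n - 1) * s / new_spacing_mm)) + 1
           for n, s in zip(sizes, spacing_mm)]
    sz, sy, sx = spacing_mm
    return [[[trilinear_sample(volume,
                               k * new_spacing_mm / sz,
                               j * new_spacing_mm / sy,
                               i * new_spacing_mm / sx)
              for i in range(out[2])]
             for j in range(out[1])]
            for k in range(out[0])]
```

With a 2 mm slice spacing and 1 mm in-plane spacing, each pair of adjacent slices gains one interpolated slice halfway between them.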
The isotropic image data were partitioned into the target lung area and the surrounding structures. Including
structures such as the heart and ribs may introduce errors and slow computation. Lung
segmentation is therefore usually employed as the first step in CAD methods to reduce the computational time and
false positive readings outside the lung mask. In this work, the lung segmentation algorithm developed by Tan et al.8
was utilized to generate the pulmonary field. A check confirmed that none of the nodules were excluded in this
stage.
Fig. 3. The upper left image is an original CT slice. The upper right image is a result of lung segmentation. The
bottom left image shows the nodule candidates contoured in red lines. The bottom right image presents the final
detections outlined in red lines and gold standards annotated in green lines.
B. Nodule candidate detection
Our method for nodule candidate detection involved the multiscale nodule and vessel enhancement filters
developed by Li et al.6, 7, 26, which enhance nodules and simultaneously suppress normal anatomic structures,
such as vessels. A threshold of 40 in dot-enhanced images (zdot ≥ 40) was set to generate lung nodule candidates,
which included false positive detections located mainly at blood vessel junctions and branches. This
threshold was determined by Li et al. by using their dataset. The nodule candidates were slightly smaller than
the contours of the radiologist annotations. Therefore, a 3D region-growing technique was applied with a
maximum growth of 5 mm. To account for various nodule sizes, we employed multiple scales of
enhancement filters: the isotropic image was smoothed by using a Gaussian filter with five scales, namely 1, 1.6,
2.4, 3.8, and 6 mm.
Three eigenvalues, namely, $\lambda_1$, $\lambda_2$, and $\lambda_3$, were calculated from the 3 × 3 Hessian matrix
at each voxel, and the dot-, line-, and plane-enhanced responses (zdot, zline, and zplane) were derived from them.
The nodules were brighter than the surrounding pulmonary parenchyma and hence
triggered a high response in the dot-enhanced image. The vessels were similar to cylinders and induced a high
response in the line-enhanced image (Fig. 4).
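The expressions for the enhanced responses are not legible in this copy. The sketch below assumes the commonly cited form of the selective enhancement filters of Li et al.7, with eigenvalues ordered so that |λ1| ≥ |λ2| ≥ |λ3|; the exact expressions, and any scale normalization, should be checked against that reference. The function name is illustrative.

```python
def enhancement_responses(eigenvalues):
    """Return (z_dot, z_line, z_plane) for one voxel's Hessian eigenvalues.

    Assumed filter form (bright structures on a dark background):
      z_dot   = |l3|^2 / |l1|              if l1, l2, l3 < 0
      z_line  = |l2| * (|l2| - |l3|) / |l1| if l1, l2 < 0
      z_plane = |l1| - |l2|                if l1 < 0
    and 0 otherwise.
    """
    # Order so that |l1| >= |l2| >= |l3|.
    l1, l2, l3 = sorted(eigenvalues, key=abs, reverse=True)
    if l1 >= 0:
        return 0.0, 0.0, 0.0
    z_plane = float(abs(l1) - abs(l2))
    z_line = abs(l2) * (abs(l2) - abs(l3)) / abs(l1) if l2 < 0 else 0.0
    z_dot = abs(l3) ** 2 / abs(l1) if (l2 < 0 and l3 < 0) else 0.0
    return z_dot, z_line, z_plane
```

Under this form, an ideal blob with three equal negative eigenvalues responds only in the dot filter, an ideal tube only in the line filter, and an ideal sheet only in the plane filter.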
Fig. 4. Original image (upper left), dot-enhanced (upper right), line-enhanced (bottom left), and plane-enhanced
images (bottom right) are shown. The nodule has a high response in the dot-enhanced image.
C. Feature Determination
To eliminate the false positive detections during the last stage, we extracted 979 features for each nodule
candidate; the features are shown as a radiomics heat map in Fig. 5. The features were categorized into five
groups: intensity, shape, texture, wavelet, and selective enhancement features.
Intensity features comprise energy, entropy, kurtosis, maximum, mean, mean absolute deviation, median,
minimum, range, root mean square, skewness, standard deviation, uniformity, and variance.
Shape features21 involve two definitions of compactness, spherical disproportion, sphericity, surface area,
surface-to-volume ratio, and volume.
Texture features consist of 22 gray-level co-occurrence matrix-based features27 and 11 gray-level run-length
matrix-based features28.
The lung nodule images displayed multiscale characteristics. Wavelet transform methods decompose the
isotropic image into low- and high-frequency components and are effective in extracting multiscale information
from the lung nodule images. In this study, the Coiflet 1 wavelet was applied to each CT scan and decomposed the
isotropic image into eight components: LLL, LLH, LHL, LHH, HLL, HLH, HHL, and HHH,
where each letter denotes either the scaling function L (low-pass filter) or the wavelet function H (high-pass
filter) applied along one of the three axes. For each decomposition, the intensity and texture features were computed.
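The eight components follow from choosing one of two filters along each of the three axes. The short sketch below only enumerates the component labels, assuming the usual L/H naming; it does not perform the wavelet transform, and the function name is illustrative.

```python
from itertools import product

def wavelet_subbands(n_axes=3):
    """List the sub-band labels of a single-level separable decomposition.

    Each label has one letter per axis: L for the low-pass (scaling)
    filter and H for the high-pass (wavelet) filter.
    """
    return ["".join(filters) for filters in product("LH", repeat=n_axes)]
```

For a 3D volume this yields 2^3 = 8 labels, matching the eight components described in the text.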
Enhancement features comprised the intensity and texture features computed on zdot, zline, and zplane, which
characterize the shape of the structures surrounding each nodule candidate.
Fig. 5. Radiomics feature expression as Z-scores. Features on the y axis are sorted according to the feature
importance of the random forest; 500 randomly selected real nodule candidates and 500 randomly selected
non-nodule candidates are listed on the x axis. The heat map shows the difference between most real nodule
candidates and the others. Enhancement features ranked higher in importance than the other features: a total of
41 enhancement features and 9 wavelet features ranked in the top 50.
D. Balancing Samples
In the lung nodule candidate generation stage, the number of false positive candidates was 50 times higher than
that of the true lung nodules. Imbalanced candidates can bias the cost function of classifiers.
Data are normally balanced by undersampling the predominant false positive candidates, but then only a small
part of the training set is utilized. As such, undersampling the majority class can be combined with oversampling
the minority class to obtain better results. The synthetic minority over-sampling technique (SMOTE)29 was
applied to randomly synthesize instances of the minority class along the line between a minority sample and its
closest neighbors. In this study, 5 nearest neighbors were used to synthesize new samples, which were 10 times
the number of original true nodule instances. The detail is shown in Algorithm 1.
E. Classifier
The decision-tree learning method has been widely used in data mining30. However, this approach tends to
overfit, which leads to high performance on the training set but poor performance on the testing set.
Developed by Leo Breiman and Adele Cutler, random forest reduces overfitting by randomly
selecting features without replacement and randomly choosing samples with replacement. The approach
constructs several different decision trees, and each tree predicts the value of a target instance. The final result
is obtained by majority vote over all the individual trees. The detail is shown in Algorithm
2. In this study, 100 trees were constructed, and each tree utilized 50 features.
F. Scheme Evaluation
The overall performance in detecting lung nodules was evaluated by comparing the candidate contours of the
segmented nodule with the annotated nodules whose contours were defined as the boundaries of the overlapped
areas annotated by at least three radiologists. If a detected nodule candidate overlapped with the annotated
nodule, the detected candidate was considered a true positive detection (any overlap with the true nodule was
counted as a true-positive). Otherwise, the candidate was regarded as a false positive. The details for a nodule
definition are shown as follows:
1) Read every contour annotated by up to four radiologists for each lung nodule.
2) Mark all pixels inside of each contour.
3) For each pixel, count the number K of different contours that contain it (K = 1, 2, 3, 4).
4) Label every pixel with K ≥ 3 as the region of interest of the nodule.
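The four steps above, together with the any-overlap criterion for true positives, can be sketched as follows. This is a minimal illustration that assumes each radiologist's filled contour is already available as a set of pixel coordinates; the function names and data layout are hypothetical.

```python
from collections import Counter

def consensus_region(reader_masks, min_readers=3):
    """Pixels that lie inside the contours of at least min_readers readers.

    reader_masks: one iterable of pixel coordinates per radiologist.
    """
    counts = Counter()
    for mask in reader_masks:
        counts.update(set(mask))          # each reader counts once per pixel
    return {pixel for pixel, k in counts.items() if k >= min_readers}

def is_true_positive(candidate_pixels, nodule_region):
    """A candidate is a true positive if it overlaps the nodule at all."""
    return not set(candidate_pixels).isdisjoint(nodule_region)
```

A candidate that shares even one pixel with the consensus region counts as a true positive under this rule; all other candidates count as false positives.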
4. RESULTS
Our proposed method was evaluated on the well-known LIDC datasets. The datasets involved 1010 patients’
chest CT scans collected from different hospitals with CT scanners of various manufacturer models. We
attempted to use all 1010 scans, but six cases contained some errors. Thus, we evaluated 1004 chest CT scans
from the LIDC datasets, excluding LIDC-IDRI 0158, 0398, 0566, 0706, 0741, and 0979.
The datasets were randomly and equally divided into two parts for training and testing. The first 502 cases
randomly selected from the datasets were used to train the random forest model, whereas the 502 remaining
cases were tested by using the model. There were 1361 nodules in total, 663 nodules in the training set and 698
nodules in the testing set. The criterion to distinguish between true positive and false positive results was based
on whether a detected nodule overlapped with the nearest nodule contours. In the first stage, the nodule candidates
did not contain all of the true nodules: the detection rate in the testing set was 93.98%, with an average of 101.7
false positive detections. After the false positive detections were reduced by the random forest classifier, our
proposed scheme achieved 88.9% sensitivity with four false positive results per scan in the testing set. The
number of votes from the random forest classifier for each detection was obtained as a decision variable for the
analysis of free-response receiver operating characteristic (FROC) curves shown in Fig. 6. To compare the
performance of competing systems using different features, jackknife free-response receiver operating
characteristic (JAFROC) analysis was performed using the JAFROC analysis software, which is available at
http://www.devchakraborty.com (Ref. 31). Table I presents the figure of merit for lung nodule detection with
different features.
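As an illustration of how FROC operating points follow from a per-detection decision variable such as the number of tree votes, the sketch below sweeps the threshold from high to low. It is a simplified stand-in, not the JAFROC software, and the input layout (score, nodule id or None for a false positive) is an assumption made for the example.

```python
def froc_points(detections, n_nodules, n_scans):
    """Return (false positives per scan, sensitivity) for each threshold.

    detections: list of (score, nodule_id); nodule_id is None for a
    false positive. A nodule hit several times is counted once.
    """
    points = []
    found, false_positives = set(), 0
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)
    for idx, (score, nodule_id) in enumerate(ordered):
        if nodule_id is None:
            false_positives += 1
        else:
            found.add(nodule_id)
        # Emit a point only after the last detection sharing this score.
        if idx + 1 == len(ordered) or ordered[idx + 1][0] != score:
            points.append((false_positives / n_scans, len(found) / n_nodules))
    return points

def sensitivity_at(points, max_fp_per_scan):
    """Highest sensitivity reached without exceeding max_fp_per_scan."""
    eligible = [sens for fp, sens in points if fp <= max_fp_per_scan]
    return max(eligible, default=0.0)
```

Reading the curve at 4 FP/scan with `sensitivity_at(points, 4.0)` corresponds to the operating point quoted in the text.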
We also trained the RF classifier on a combination of classical intensity, shape, and texture features
(hereafter, classical features) and achieved 73.5% sensitivity at 4 FP/scan in the testing set. Our
proposed enhancement features outperformed both the wavelet features and the classical features (Fig. 6).
Figure 7 presents examples of detected nodules. However, false positive results were also detected, and some
real nodules were missed (Fig. 8). The false positive detections were mainly
distributed around the vessels. Some nodules were missed because of their relatively low contrast with the
pulmonary parenchyma, whereas others closely resembled blood vessels. Fig. 9 shows the
distributions of nodule sizes and types of detected and missed nodules in the testing set.
Table I: Figure of merit for lung nodule detection with different features.
Features         Radiomics  Enhancement  Wavelet  Classical
Figure of merit  0.6822     0.5556       0.5191   0.4991
Algorithm 1: SMOTE
Input: X - original minority class samples (number of instances × number of features)
       N - number of new samples to synthesize (N is assumed to be an integer multiple M of the number of instances in X)
       k - number of nearest neighbors
Output: S - new synthetic samples
for i from 1 to number of instances in X
    Compute the k nearest neighbors of instance i and save their indices in List.
    for m from 1 to M
        Randomly choose an index from List with replacement and save it in Id.
        for j from 1 to number of features in X
            Step = random number between 0 and 1
            S[m+(i-1)*M][j] = X[i][j] + Step*(X[Id][j] - X[i][j])
        end for
    end for
end for
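Algorithm 1 can be turned into a short runnable sketch. The version below uses only the standard library and Euclidean distance to find neighbors, which is an assumption since the distance metric is not stated; the function and parameter names are illustrative. It needs at least two minority samples.

```python
import math
import random

def smote(samples, n_per_sample, k=5, rng=None):
    """Synthesize n_per_sample new points for every minority sample.

    samples: list of equal-length feature lists (minority class only).
    Returns a list with len(samples) * n_per_sample synthetic samples.
    """
    rng = rng or random.Random()
    synthetic = []
    for i, base in enumerate(samples):
        # Indices of the k nearest other minority samples (Euclidean distance).
        others = [j for j in range(len(samples)) if j != i]
        others.sort(key=lambda j: math.dist(base, samples[j]))
        neighbours = others[:k]
        for _ in range(n_per_sample):
            neighbour = samples[rng.choice(neighbours)]
            # One random step per feature, as written in Algorithm 1.
            synthetic.append([b + rng.random() * (n - b)
                              for b, n in zip(base, neighbour)])
    return synthetic
```

With `n_per_sample=10` and `k=5`, the call mirrors the setting reported in the text (5 nearest neighbors, ten synthetic samples per true nodule).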
Fig. 6. FROC curves comparing the radiomics approach with the individual feature types.
At four false positive results per scan, the sensitivity of the radiomics approach
was 88.9%, whereas the sensitivity with the classical features was only 73.5%. Our proposed enhancement
features performed better than either wavelet features or classical features. Each thin dotted curve represents
the 95% confidence interval (CI) of the FROC curve in the same color, obtained by bootstrapping with
1000 bootstraps.
Algorithm 2: Random Forest
Training Step:
Input: the training set
       the dimension of a feature vector
       the number of trees
       the number of features per tree
Output: random forests
for each tree
    Randomly sample with replacement from the training set to form the sample set of the root node, randomly
    sample the given number of features without replacement from all features, and start training from the root node.
    while (a node $v$ isn't trained or marked as leaf node)
        A feature $f$ and a best threshold $\theta$ are obtained by maximizing the following:
        $$(f^*, \theta^*) = \arg\max_{(f, \theta)} \mathrm{Gain}(f, \theta, v)$$
        $$\mathrm{Gain}(f, \theta, v) = \mathrm{Gini}(f, \theta, v) - \sum_{S \in \{L, R\}} w_S \, \mathrm{Gini}(f, \theta, v_S)$$
        $$\mathrm{Gini}(f, \theta, v) = \sum_{i=1}^{K} p_i^v \, (1 - p_i^v)$$
        where $\mathrm{Gini}(f, \theta, v)$ and $\mathrm{Gain}(f, \theta, v)$ are the Gini index and Gini information
        gain for the feature $f$ and the threshold $\theta$ at the node $v$, $p_i^v$ is the proportion of the training
        samples belonging to the i-th class at the node $v$, $v_L$ and $v_R$ are the left child node and right child
        node of $v$, and $w_L$ and $w_R$ are the proportions of the training samples assigned to $v_L$ and $v_R$.
        Next, train another node.
    end while
end for
Return random forests.
Testing Step:
Input: an instance to be tested
       RF - random forests
Output: the likelihood
for each tree
    Start from root node.
    while (not leaf node)
        if the test of feature $f^*$ of the instance against the threshold $\theta^*$ holds, go to left node, else go to right node.
    end while
    The label of the leaf node reached is the prediction of the i-th tree.
end for
The likelihood is obtained from the predictions of all trees.
Return the likelihood
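The node-splitting criterion of Algorithm 2 can be made concrete with a small sketch. It assumes that samples with a feature value less than or equal to the threshold go to the left child, and that the likelihood is the fraction of trees voting for the nodule class; both are assumptions, since the corresponding symbols are not legible in this copy. All names are illustrative.

```python
from collections import Counter

def gini(labels):
    """Gini index: sum over classes of p_i * (1 - p_i)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

def gini_gain(values, labels, threshold):
    """Parent Gini minus the weighted Gini of the two children."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    return (gini(labels)
            - len(left) / n * gini(left)
            - len(right) / n * gini(right))

def best_threshold(values, labels):
    """Threshold among the observed values that maximizes the Gini gain."""
    return max(sorted(set(values)), key=lambda t: gini_gain(values, labels, t))

def forest_likelihood(tree_votes):
    """Fraction of trees that voted for the positive (nodule) class."""
    return sum(tree_votes) / len(tree_votes)
```

In a full forest, `best_threshold` would be evaluated for every sampled feature at every node, and the feature with the largest gain would define the split.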
5. DISCUSSION
This paper presented a novel method with additional features in a CAD system to detect lung nodules from
CT scans with high sensitivity in the entire LIDC-IDRI dataset. We used selective enhancement filters to
generate lung nodule candidates and then extracted radiomic features to distinguish real lung nodules from false
positive detections. This system was validated on the whole LIDC-IDRI database with 502 cases for training
and 502 other cases for testing, and yielded a good performance. Compared with classical features alone, the
979 features provided a more comprehensive characterization of lung nodules and a more accurate scheme.
These features likely capture information regarding intratumor heterogeneity and adjacent tissues,
which may reveal the unique characteristics of tumors and blood vessels. The AUC32 of the resubstitution in
the random forest classifier was 0.9987, and the AUC of the independent testing set was 0.9862. Table III
lists the AUCs of the random forest classifier with different parameters in the second stage. The AUCs were
relatively stable if enough trees were present in the classifier.
In this method, enhancement features were added to quantify the shape information of adjacent tissues, and
the enhancement feature was used for the first time to detect lung nodules. We obtained enhancement features
by computing intensity features and texture features on zdot, zline, and zplane. Using these features, we
generated more accurate results compared with those of other reported methods. The performance of the
proposed approach can be further evaluated by comparing the results of the radiomics approach with those of
other existing CAD methods for lung nodule detection. However, difficulties exist in direct comparisons as
follows:
1. Different CAD methods might be tested on different datasets and various numbers of CT scans.
2. Different CAD methods might differ in terms of evaluation methods, such as independent test, k-fold cross
validation, and leave-one-out cross validation.
3. Different CAD methods might use different ways to generate nodule candidates.
4. Some CAD methods focus on nodule dimension and thus analyze performance on the basis of different
nodule sizes and types.
5. Different systems consider different agreement levels. The approaches used to determine whether a detected
nodule candidate is a real nodule may also vary among methods. Some techniques measure the distance of each
candidate to the nearest true nodule, whereas other methods assess the overlap rates between each candidate and
any true nodule. Thresholds also vary across different individuals.
Although these problems cannot be solved, a relative comparison can be helpful. Some relevant CAD
schemes on lung nodule detection with the same LIDC database are listed in Table II. In this table, each row
represents a published paper and its publication year, testing dataset, agreement level, validation method, and
performance. The performance of our method is higher than that of other methods. Our proposed method is also
tested on a larger dataset.
To further evaluate our proposed scheme, we performed 5-fold cross validation on the 888 cases used by Setio
et al.15 and achieved 89.5% sensitivity with four false positive results per scan. The five cross-validation sets
are listed in the online appendix, as are the independent training and testing sets.
In future studies, a feature selection method should be applied to reduce feature redundancy. The sensitivity
of the first stage is relatively low and thus requires further improvement. In the region-growing stage,
distinguishing between real and false positive detections was difficult. For an actual lung nodule, region growth
should terminate at the boundary between a nodule and a vessel. For a false positive candidate, which could be a
vessel, region-growing steps should be as accurate as possible.
Fig. 7. Several CT slices demonstrate the detected nodules. Green lines show the contours of the real nodules, and red
lines include the boundaries of nodule detection results from our scheme.
Fig. 8. Sample false positive detections marked in red lines. Some missing nodules annotated in green lines.
Fig. 9. Distributions of nodule sizes and types of detected and missed nodules in the testing set.
6. CONCLUSIONS
In this study, a radiomics-based scheme was proposed to detect lung nodules in CT scans. The performance of
the proposed method is comparable with that of other methods in the literature. Our study provides a basis for
the development of appropriate radiomics approaches to help radiologists detect lung nodules during routine
diagnosis and lung cancer screening.
Table II: Comparison among different CAD schemes for lung nodule detection tested on the LIDC database
Author     Year  No. of scans  Agreement level  Validation                      Sensitivity  FP/scan
Proposed   -     502           3                Independent test                88.9%        4
Setio15    2016  888           3                5-fold cross validation         87.9%*       4
Lu14       2015  98            2                Independent test                85.2%        3.1
Brown16    2014  108           3                Independent test                75%          2
Tan33      2013  360           4                10-fold cross validation        83%          4
Li17       2012  85            2                Leave-one-out cross validation  85.0%        2.7
Tan8       2011  125           4                Independent test                83.7%        4
Messay9    2010  84            1                7-fold cross validation         82.7%        3
Golosio18  2009  84            3                Independent test                71%          4
*Setio achieved 90% sensitivity with 4 FP/scan if nodules accepted by a minority of radiologists were not counted as
false positives. However, this criterion was not used in other studies because a minority of the radiologists may
overcall nodules.
Table III: AUC32 values of the random forest classifier in the second stage with different parameters in the
testing set. Each row corresponds to the number of features randomly sampled per tree from all 979 features,
and each column corresponds to the number of trees constructed. The detail is shown in Algorithm 2.
Features \ Trees  10     25     50     75     100    125    150    200    300
5                 0.974  0.977  0.980  0.982  0.982  0.984  0.985  0.982  0.984
10                0.968  0.981  0.982  0.980  0.984  0.985  0.983  0.983  0.984
25                0.961  0.980  0.981  0.979  0.982  0.983  0.984  0.985  0.984
50                0.973  0.974  0.983  0.981  0.986  0.981  0.984  0.984  0.983
75                0.966  0.977  0.981  0.984  0.984  0.984  0.984  0.983  0.984
100               0.967  0.978  0.982  0.983  0.982  0.982  0.984  0.982  0.983
150               0.968  0.978  0.980  0.981  0.982  0.982  0.984  0.984  0.984
7. ACKNOWLEDGMENTS
This work was supported by National Natural Science Foundation of China (No. 813716234), National Key
Research and Development Program (2016YFC0104608), National Basic Research Program of China
(2010CB834302), and Shanghai Jiao Tong University Medical Engineering Cross Research Funds
(YG2013MS30 and YG2014ZD05).
8. DISCLOSURE OF CONFLICTS OF INTEREST
The authors have no relevant conflicts of interest to disclose.
a) Author to whom correspondence should be addressed. Electronic mail: [email protected]
1 R. Siegel, D. Naishadham, A. Jemal, "Cancer statistics, 2013," CA: A Cancer Journal for Clinicians 63, 11-30 (2013).
2 W.H. Organization, "World cancer report, 2014," WHO Report (Geneva: WHO, 2014).
3 J.E. Roos, D. Paik, D. Olsen, E.G. Liu, L.C. Chow, A.N. Leung, R. Mindelzun, K.R. Choudhury, D.P. Naidich, S. Napel,
"Computer-aided detection (CAD) of lung nodules in CT scans: radiologist performance and reading time with
incremental CAD assistance," European Radiology 20, 549-557 (2010).
4 F. Beyer, L. Zierott, E. Fallenberg, K. Juergens, J. Stoeckel, W. Heindel, D. Wormanns, "Comparison of sensitivity
and reading time for the use of computer-aided detection (CAD) of pulmonary nodules at MDCT as concurrent or second
reader," European Radiology 17, 2941-2947 (2007).
5 National Lung Screening Trial Research Team, "Reduced lung-cancer mortality with low-dose computed tomographic screening," The New England Journal of Medicine 365, 395 (2011).
6 Q. Li, F. Li, K. Doi, "Computerized detection of lung nodules in thin-section CT images by use of selective enhancement filters and an automated rule-based classifier," Academic Radiology 15, 165-175 (2008).
7 Q. Li, S. Sone, K. Doi, "Selective enhancement filters for nodules, vessels, and airway walls in two- and three-dimensional CT scans," Medical Physics 30, 2040-2051 (2003).
8 M. Tan, R. Deklerck, B. Jansen, M. Bister, J. Cornelis, "A novel computer-aided lung nodule detection system for CT images," Medical Physics 38, 5630-5645 (2011).
9 T. Messay, R.C. Hardie, S.K. Rogers, "A new computationally efficient CAD system for pulmonary nodule detection in CT imagery," Medical Image Analysis 14, 390-406 (2010).
10 N. Duggan, E. Bae, S. Shen, W. Hsu, A. Bui, E. Jones, M. Glavin, L. Vese, "A Technique for Lung Nodule Candidate Detection in CT Using Global Minimization Methods," Energy Minimization Methods in Computer Vision and Pattern Recognition, 478-491 (2015).
11 B. Wang, X. Tian, Q. Wang, Y. Yang, H. Xie, S. Zhang, L. Gu, "Pulmonary nodule detection in CT images based on shape constraint CV model," Medical Physics 42, 1241-1254 (2015).
12 F. Han, H. Wang, G. Zhang, H. Han, B. Song, L. Li, W. Moore, H. Lu, H. Zhao, Z. Liang, "Texture Feature Analysis for Computer-Aided Diagnosis on Pulmonary Nodules," Journal of Digital Imaging 28, 99-115 (2015).
13 M. Alilou, V. Kovalev, E. Snezhko, V. Taimouri, "A Comprehensive Framework for Automatic Detection of Pulmonary Nodules in Lung CT Images," Image Anal Stereol 33, 13-27 (2014).
14 L. Lu, Y.Q. Tan, L.H. Schwartz, B.S. Zhao, "Hybrid detection of lung nodules on CT scan images," Medical Physics 42, 5042-5054 (2015).
15 A.A. Setio, F. Ciompi, G. Litjens, P. Gerke, C. Jacobs, R.S. Van, W.M. Winkler, M. Naqibullah, C. Sanchez, G.B. Van, "Pulmonary nodule detection in CT images: false positive reduction using multi-view convolutional networks," IEEE Transactions on Medical Imaging 35, 1160-1169 (2016).
16 M.S. Brown, P. Lo, J.G. Goldin, E. Barnoy, G.H.J. Kim, M.F. McNitt-Gray, D.R. Aberle, "Toward clinically usable CAD for lung cancer screening with computed tomography," European Radiology 24, 2719-2728 (2014).
17 W. Guo, Q. Li, "High performance lung nodule detection schemes in CT using local and global information," Medical Physics 39, 5157-5168 (2012).
18 B. Golosio, G.L. Masala, A. Piccioli, P. Oliva, M. Carpinelli, R. Cataldo, P. Cerello, F.D. Carlo, F. Falaschi, M.E. Fantacci, "A novel multithreshold method for nodule detection in lung CT," Medical Physics 36, 3607-3618 (2009).
19 C. Parmar, R.T. Leijenaar, P. Grossmann, E.R. Velazquez, J. Bussink, D. Rietveld, M.M. Rietbergen, B. Haibe-Kains, P. Lambin, H.J. Aerts, "Radiomic feature clusters and Prognostic Signatures specific for Lung and Head & Neck cancer," Scientific Reports 5 (2015).
20 F. Khalvati, A. Wong, M.A. Haider, "Automated prostate cancer detection via comprehensive multi-parametric magnetic resonance imaging texture feature models," BMC Medical Imaging 15, 27 (2015).
21 H.J.W.L. Aerts, E.R. Velazquez, R.T.H. Leijenaar, C. Parmar, P. Grossmann, S. Carvalho, J. Bussink, R. Monshouwer, B. Haibe-Kains, D. Rietveld, "Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach," Nature Communications 5, 4006 (2014).
22 A.P. Reeves, A.M. Biancardi, T.V. Apanasovich, C.R. Meyer, H. MacMahon, E.J. van Beek, E.A. Kazerooni, D. Yankelevitz, M.F. McNitt-Gray, G. McLennan, "The Lung Image Database Consortium (LIDC): a comparison of different size metrics for pulmonary nodule measurements," Academic Radiology 14, 1475-1485 (2007).
23 M.F. McNitt-Gray, S.G. Armato, C.R. Meyer, A.P. Reeves, G. McLennan, R.C. Pais, J. Freymann, M.S. Brown, R.M. Engelmann, P.H. Bland, "The Lung Image Database Consortium (LIDC) data collection process for nodule detection and annotation," Academic Radiology 14, 1464-1474 (2007).
24 S.G. Armato, M.F. McNitt-Gray, A.P. Reeves, C.R. Meyer, G. McLennan, D.R. Aberle, E.A. Kazerooni, H. MacMahon, E.J. van Beek, D. Yankelevitz, "The Lung Image Database Consortium (LIDC): an evaluation of radiologist variability in the identification of lung nodules on CT scans," Academic Radiology 14, 1409-1421 (2007).
25 S.G. Armato III, G. McLennan, L. Bidaut, M.F. McNitt-Gray, C.R. Meyer, A.P. Reeves, B. Zhao, D.R. Aberle, C.I. Henschke, E.A. Hoffman, "The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans," Medical Physics 38, 915-931 (2011).
26 Q. Li, K. Doi, "New selective nodule enhancement filter and its application for significant improvement of nodule detection on computed tomography," Medical Imaging 2004: Image Processing (2004).
27 R.M. Haralick, K. Shanmugam, I.H. Dinstein, "Textural Features for Image Classification," IEEE Transactions on Systems, Man, and Cybernetics SMC-3, 610-621 (1973).
28 M.M. Galloway, "Texture analysis using gray level run lengths," Computer Graphics & Image Processing 4, 172-179 (1975).
29 N.V. Chawla, K.W. Bowyer, L.O. Hall, W.P. Kegelmeyer, "SMOTE: synthetic minority over-sampling technique," Journal of Artificial Intelligence Research, 321-357 (2002).
30 L. Rokach, O.Z. Maimon, "Data mining with decision trees: theory and applications," World Scientific Pub Co Inc Isbn 13, 169-198 (2008).
31 D.P. Chakraborty, K.S. Berbaum, "Observer studies involving detection and localization: Modeling, analysis, and validation," Medical Physics 31, 2313-2330 (2004).
32 T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters 27, 861-874 (2006).
33 M. Tan, R. Deklerck, J. Cornelis, B. Jansen, "Phased searching with NEAT in a time-scaled framework: experiments on a computer-aided detection system for lung nodules," Artificial Intelligence in Medicine 59, 157-167 (2013).