
RETRAIN INCEPTION'S FINAL LAYER FOR DETECTING CANCER METASTASIS IN SENTINEL LYMPH NODE

Wei-Jen Tu

ABSTRACT

This article presents an experience of applying transfer learning with the Inception V3 model. The task is to leverage the existing weights of a convolutional neural network (CNN) to detect and localize tumor lesions as small as 598 x 598 pixels within limited time and hardware resources.

Index Terms— neural network, transfer learning, Inception model, cancer metastasis

1. INTRODUCTION

Training a convolutional neural network (CNN) from scratch is a computationally intensive task. Depending on the machine configuration, the whole training procedure may take several days or even weeks to finish. This article presents how applicable transfer learning can be as a shortcut, training from the existing weights of the Inception V3 model [1] for new classes, and evaluates its performance for detecting cancer metastasis in sentinel lymph nodes.

2. DATASET

The dataset used for developing the cancer lesion detector is Camelyon16 [2]. Each slide contains lymph node tissue, digitized at multiple resolution levels and accessed through OpenSlide [3]. To keep maximum flexibility, I always use images at the highest resolution level and segment them into small patches, compiling a training asset at two different sizes: 598 x 598 and 1196 x 1196.
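As a rough sketch of this step (assuming OpenSlide's Python bindings; the slide filename and output paths are only illustrative), level 0 is the highest resolution level and can be walked patch by patch:

    import openslide

    PATCH = 598  # patch edge in pixels; the 1196 x 1196 asset is built the same way

    slide = openslide.OpenSlide("Tumor_016.tif")
    width, height = slide.level_dimensions[0]  # level 0 = highest resolution

    for y in range(0, height - PATCH + 1, PATCH):
        for x in range(0, width - PATCH + 1, PATCH):
            # read_region returns an RGBA PIL image; coordinates are in level-0 space
            patch = slide.read_region((x, y), 0, (PATCH, PATCH)).convert("RGB")
            patch.save("patches/%d_%d.png" % (x, y))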

3. TARGET PATCH EXTRACTION

Compared to cropping patches from annotated tumor slides, it is relatively straightforward to clip normal slides down to 598 x 598 and 1196 x 1196 patches. Along the way, the RGB mean value of each patch is calculated and recorded in the filename of the cropped patch. This value is used later as a hint for categorization and for detection thresholds.
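A minimal sketch of how such a per-patch mean could be computed and recorded (NumPy assumed; the exact filename format here is my own illustration, not necessarily the author's):

    import numpy as np

    def save_with_rgb_mean(patch, x, y, out_dir):
        # mean over all pixels and all three channels, rounded down to an integer
        rgb_mean = int(np.asarray(patch.convert("RGB")).mean())
        patch.save("%s/%d_%d_mean%03d.png" % (out_dir, x, y, rgb_mean))
        return rgb_mean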

Annotated tumor slides, by contrast, need a little further image processing. The Ground_Truth folder of the Camelyon16 dataset contains 110 pre-generated masks, which are used as guidance to translate black and white RGB values into alpha values of 0 or 255. A photo composition with the tumor slide then yields images only from the areas covered by the tumor annotation (Fig. 1).

Fig. 1. An example from Tumor_016.tif showing the result of the photo composition over the annotated tumor area.
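One way to realize such a composition with Pillow (a sketch under the assumption that mask and slide patch are aligned and equally sized):

    from PIL import Image

    slide_img = Image.open("tumor_patch.png").convert("RGB")
    mask = Image.open("mask_patch.png").convert("L")  # black = 0, white = 255

    # reuse the mask as the alpha channel: annotated (white) areas stay opaque,
    # everything outside the annotation becomes fully transparent
    composed = slide_img.copy()
    composed.putalpha(mask)
    composed.save("tumor_area_only.png")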

4. CATEGORIES

Images extracted from the dataset can be classified into three major groups: tumor, normal, and glass. To avoid picking up images manually and to make each category more visually distinct, the normal group is further split into six smaller categories according to the RGB mean value of each image. This matters especially once the RGB mean value exceeds 190, where more and more non-lymph cells, such as adipose cells, appear in the images.
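A sketch of the binning rule implied by the training set details in Section 7 (the function and category names are mine, for illustration):

    def category_for(group, rgb_mean):
        # the normal bins follow the RGB mean ranges listed in Section 7
        if group in ("tumor", "glass"):
            return group
        bins = [(60, 100), (100, 160), (160, 190), (190, 210), (210, 230), (230, 256)]
        for i, (low, high) in enumerate(bins):
            if low <= rgb_mean < high:
                return "normal_%d" % (i + 1)
        return "normal_other"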

For the tumor group, some further data augmentation is applied. Because the number of extracted tumor images is relatively small compared to the other groups, each image in the tumor group is additionally rotated by 90 degrees clockwise and counter-clockwise and flipped along the vertical and horizontal orientations to increase the number of images (Fig. 2).

Fig. 2. An example of images rotated by 90 degrees clockwise and counter-clockwise and flipped along the vertical and horizontal orientations.
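With Pillow, the four extra variants per tumor image could be produced as follows (a sketch):

    from PIL import Image

    def augment(img):
        # four additional variants per tumor image, as described above
        return [
            img.transpose(Image.ROTATE_90),        # 90 degrees counter-clockwise
            img.transpose(Image.ROTATE_270),       # 90 degrees clockwise
            img.transpose(Image.FLIP_LEFT_RIGHT),  # horizontal flip
            img.transpose(Image.FLIP_TOP_BOTTOM),  # vertical flip
        ]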

5. LEARNING SYSTEM

The learning system used for this paper is TensorFlow [4]. Because only Inception's final layer needs to be retrained for the new categories, the training task can be run within hours, even without GPU acceleration. The 'bottleneck' feature is also heavily utilized for each training run. Bottleneck values are generated for all images in the training asset during a first analysis phase, and they can always be reused as long as the images are identical to the ones from which the bottleneck values were calculated. Since this first analysis phase of the retraining procedure can take days once the number of images exceeds a few hundred thousand, reusing bottleneck values saves an enormous amount of time.
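For reference, TensorFlow's stock retrain.py script caches these bottleneck values on disk (its --bottleneck_dir flag) and writes a frozen graph; classifying a patch against that graph could look like the following sketch (TensorFlow 1.x API; the tensor names are retrain.py's defaults, which I assume were kept):

    import tensorflow as tf

    # load the frozen graph produced by the retraining step
    with tf.gfile.FastGFile("output_graph.pb", "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name="")

    with tf.Session() as sess:
        image_data = tf.gfile.FastGFile("patch.jpg", "rb").read()
        # 'final_result' is the retrained final layer; 'DecodeJpeg/contents'
        # is the stock JPEG input node of the Inception graph
        scores = sess.run("final_result:0",
                          {"DecodeJpeg/contents:0": image_data})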

6. DETECTION

6.1. Method A: Two-phase detection

Unlike the training task, the detection procedure takes much more time to finish because it iterates over the whole slide. Therefore, a two-phase detection strategy is adopted. In the first phase, a bigger inspecting window of size 1196 x 1196 iterates over the whole slide. Once an area is labeled as tumor and its similarity score is over 0.8 at the same time, it is cropped and saved as an image for another round of detection in the second phase. There, each image generated by the first phase is split into 4 smaller patches at a resolution of 598 x 598 and detected again. After the process ends, a thumbnail is generated in which detected tumor lesions are highlighted as brighter white space (Fig. 3).

A strategy like this reduces time consumption enormously, since the second inspection only runs on the areas screened by the first-phase detection. For that reason, most cases in the test set take this approach to detect tumor lesions.

Fig. 3. The result of patient_132_node_0.tif after method A (two-phase detection). Note that brighter white space in the images represents sites of tumor lesions.
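A condensed sketch of the two-phase loop (classify() stands in for a call into the retrained graph, as in the snippet in Section 5; the 0.8 cut is the similarity threshold described above):

    import openslide

    BIG, SMALL = 1196, 598

    def detect_two_phase(slide_path, classify):
        slide = openslide.OpenSlide(slide_path)
        width, height = slide.level_dimensions[0]
        hits = []
        for y in range(0, height - BIG + 1, BIG):
            for x in range(0, width - BIG + 1, BIG):
                window = slide.read_region((x, y), 0, (BIG, BIG)).convert("RGB")
                label, score = classify(window)
                if label != "tumor" or score <= 0.8:
                    continue
                # phase two: re-inspect the flagged window as 4 smaller patches
                for dy in (0, SMALL):
                    for dx in (0, SMALL):
                        patch = window.crop((dx, dy, dx + SMALL, dy + SMALL))
                        sub_label, sub_score = classify(patch)
                        if sub_label == "tumor":
                            hits.append((x + dx, y + dy, sub_score))
        return hits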

6.2. Method B: One-phase detection

In general, the idea of method B is very similar to that of method A. The major difference is that a smaller inspecting window of size 598 x 598 iterates over the whole slide, so only one detection phase is applied. Because the smaller inspecting window makes the detection procedure spend more time in iteration, a pair of thresholds, a maximum RGB mean value of 215 and a minimum of 150, is also set to skip redundant detection on patches such as glass areas or artificial black areas. The reason to adopt method B is to achieve better sensitivity for tumor cells that occupy only a tiny space (Fig. 4 and Fig. 5); however, method B was not applied to all test cases because of limited time before the deadline for submitting results (Tab. 1).
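That skip rule can be expressed as a simple predicate on the patch's RGB mean (a sketch; the 150/215 pair comes from the text above):

    import numpy as np

    RGB_MEAN_MIN, RGB_MEAN_MAX = 150, 215

    def should_skip(patch):
        # skip near-white glass areas and artificial black borders
        rgb_mean = np.asarray(patch.convert("RGB")).mean()
        return not (RGB_MEAN_MIN <= rgb_mean <= RGB_MEAN_MAX)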

Fig. 4. The result of patient_100_node_2.tif after method B (one-phase detection). Note that brighter white space in the images represents sites of tumor lesions.


Fig. 5. A closer view of tumor lesions within the brighter white space in the result of patient_100_node_2.tif (Fig. 4).

Tab. 1. Test set cases to which method B (one-phase detection) was applied.

patient_102    patient_158
patient_107    patient_175
patient_135    patient_177
patient_136    patient_179
patient_143    patient_195
patient_146    patient_198
patient_152    patient_199

7. TRAINING SET DETAILS

Category (RGB mean value)      Number of images
Glass (>= 0 and < 240)                  347,448
Glass (>= 240 and <= 255)               187,258
Normal (>= 60 and < 100)                  4,300
Normal (>= 100 and < 160)               265,065
Normal (>= 160 and < 190)                94,225
Normal (>= 190 and < 210)                54,595
Normal (>= 210 and < 230)                20,751
Normal (>= 230 and <= 255)              129,033
Tumor (>= 0 and <= 255)                  77,220
Total                                 1,179,895

8. REFERENCES

[1] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the Inception Architecture for Computer Vision", arXiv:1512.00567, https://arxiv.org/abs/1512.00567, 11 Dec 2015.

[2] B. Ehteshami Bejnordi and J. van der Laak, Camelyon16, https://camelyon16.grand-challenge.org, accessed 2017-04-05.

[3] OpenSlide Python (version 1.1.1), http://openslide.org

[4] TensorFlow (version 1.0), https://tensorflow.org/


