
Research Article
Towards Fine Whole-Slide Skeletal Muscle Image Segmentation through Deep Hierarchically Connected Networks

Lei Cui,1 Jun Feng,1 and Lin Yang2

1Department of Information Science and Technology, Northwest University, Xi'an, China
2The College of Life Sciences, Northwest University, Xi'an, China

Correspondence should be addressed to Jun Feng (fengjun@nwu.edu.cn) and Lin Yang (linyang@nwu.edu.cn)

Received 19 November 2018; Accepted 14 March 2019; Published 27 June 2019

Academic Editor: Norio Iriguchi

Copyright © 2019 Lei Cui et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Automatic skeletal muscle image segmentation (MIS) is crucial in the diagnosis of muscle-related diseases. However, accurate methods often suffer from expensive computations that do not scale to large, whole-slide muscle images. In this paper, we present a fast and accurate method to enable the more clinically meaningful whole-slide MIS. Leveraging the recently popular convolutional neural network (CNN), we train our network in an end-to-end manner to directly perform pixelwise classification. Our deep network comprises encoder and decoder modules. The encoder module captures rich, hierarchical representations through a series of convolutional and max-pooling layers. The multiple decoders then use these multilevel representations to perform multiscale predictions, which are combined to generate a more robust dense segmentation as the network output. Each decoder module has an independent loss function, and the decoders are jointly trained with a weighted loss function to address fine-grained pixelwise prediction. We also propose a two-stage transfer learning strategy to effectively train such a deep network. Extensive experiments on a challenging muscle image dataset demonstrate the significantly improved efficiency and accuracy of our method compared with recent state-of-the-art approaches.

1. Introduction

Skeletal muscle accounts for approximately 40% of body mass. As the largest body tissue, skeletal muscle has been widely recognized as a biomedical health biomarker related to many diseases such as cancer cachexia, heart failure, and chronic obstructive pulmonary disease (COPD) [1–3]. In recent years, growing attention in the muscle biology community has been paid to the analysis of histological images of skeletal muscle to assist the diagnosis of relevant diseases [1].

The quantification of morphological characteristics of muscle fibres plays an important role in disease diagnosis and clinical studies. Critical morphological characteristics, including cross-section area, fiber type and shape, and the minimum Feret diameter, are closely related to the functionality and health of muscle [4, 5]. To accurately quantify these morphological characteristics of muscle fibres, an accurate skeletal muscle image segmentation (MIS) system is the prerequisite.
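As a concrete illustration of one such measurement, the minimum Feret diameter is the smallest caliper width of a fiber cross-section over all orientations. A brute-force sketch (our own illustrative helper, not part of the paper's pipeline):

```python
import numpy as np

def min_feret_diameter(points, num_angles=180):
    """Approximate minimum Feret (caliper) diameter of a cross-section
    given its boundary points: the smallest width of the shape's
    projection over a sampled set of rotation angles."""
    pts = np.asarray(points, dtype=float)
    best = np.inf
    for theta in np.linspace(0.0, np.pi, num_angles, endpoint=False):
        direction = np.array([np.cos(theta), np.sin(theta)])
        proj = pts @ direction                  # 1-D projection of all points
        best = min(best, proj.max() - proj.min())
    return best

# A 2x1 axis-aligned rectangle: the minimum caliper width is 1.
rect = [(0, 0), (2, 0), (2, 1), (0, 1)]
print(round(min_feret_diameter(rect), 3))  # 1.0
```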

Currently, the segmentation of muscle fibres in routine practice still relies heavily on experts' manual labor or semiautomatic processes [6], which are not only expensive but also subject to large interobserver variations. The growing demand for fast and accurate automatic MIS has attracted much attention recently, and various approaches have been proposed to address this task [4, 7–9].

The difference between MIS and standard histological cell segmentation is attributed to the specific morphology of skeletal muscle. Skeletal muscle is composed of long, multinucleated cells (fibres) tightly grouped into fascicles, interspersed with other mononucleated cell types and surrounded by connective tissue and fat. This tightly grouped anatomical structure, coupled with artifacts and staining variances introduced during sample preparation, generates confusing and overlapping cell boundaries.

Hindawi, Journal of Healthcare Engineering, Volume 2019, Article ID 5191630, 10 pages, https://doi.org/10.1155/2019/5191630

Although histological cell segmentation research has a rich history, few methods have been successfully applied to MIS. Several challenges remain to be solved to achieve a robust and automatic MIS system. First, all existing MIS methods can only handle small image patches, smaller than 1000 × 1000 pixels, cropped from whole-slide muscle images. The main reason is that supervised methods usually rely on handcrafted features and pretrained classifiers to distinguish cell or noncell regions or pixels. However, computing well-designed handcrafted features and performing regionwise classification is usually expensive, and the time cost is proportional to the image scale. Therefore, this limitation makes existing methods hard to apply to large-scale, whole-slide muscle images.

Second, the special muscle cell shapes and sizes mentioned above make segmentation methods hard to generalize. For example, unsupervised methods such as deformable models [10, 11] and shape prior-based methods [12, 13] have been widely used in histological and microscopy cell segmentation. However, the arbitrarily transformed cell shapes and sizes increase the difficulty of using shape information for MIS.

Third, the densely touching fibres and staining artifacts make fiber boundaries unclear and broken, which makes it harder for methods to separate multiple touching fibres using boundary information. Fourth, current methods usually contain multiple mutually dependent steps; the failure of any step will largely affect the other steps and the final results. Moreover, the complex pipeline largely decreases the speed of MIS.

This paper addresses these challenges to achieve an MIS method that is both efficient and effective, based on the recently popular convolutional neural network (CNN). CNN-based methods have achieved unprecedented performance in various medical image applications. Different from conventional computer vision methods, a CNN has a strong capability to learn comprehensive representations via a deep architecture for effective classification. When used for pixelwise classification, however, conventional CNNs show efficiency shortcomings [14]. The end-to-end CNN training strategy has recently attracted much research interest [15, 16], but a common problem is that the dense output is relatively coarse and it is difficult to accurately classify each pixel [16, 17]. To generate more accurate and fine outputs, a refinement procedure needs to be considered.

In this paper, we propose a novel MIS method based on a CNN trained in an end-to-end manner [16], which enables the CNN to better utilize the rich representations and directly predict a fine-grained segmentation given an arbitrarily sized input image. Figure 1 shows segmentation results at different image scales. Specifically, the main contributions of this paper are summarized as follows:

(i) We propose a network whose architecture mainly contains two modules: the encoder and the decoder. The encoder captures rich representations through a very deep CNN architecture. The decoder leverages the hierarchical characteristic of the encoder to enable independent multiscale predictions. A refinement procedure of the decoder automatically addresses the fine-grained dense outputs. Figure 2 illustrates our network.

(ii) We propose a novel spatially weighted loss function to handle the unbalanced class issue and the unavoidable errors present in the ground truth, which encourages the convergence of the network.

(iii) We propose a two-stage training approach for the proposed very deep network, which helps the network make better use of a pretrained CNN for better convergence and preserve the weak boundary information of muscle cells.

(iv) We conduct extensive experiments on an expert-annotated skeletal muscle image dataset, demonstrating significantly improved efficiency and accuracy compared with other state-of-the-art methods.

2. Related Works

The growing interest in computer-aided histological image diagnosis has produced a rich research literature. As one member of the histological image analysis family, skeletal muscle image analysis is a new yet recently popular application, which has built successful cooperations with clinics to accelerate their research and clinical trials [1–3, 18, 19].

As the prerequisite of skeletal muscle image analysis, various methods have been proposed for MIS. Klemencic et al. [10] proposed a semiautomatic muscle image segmentation approach based on the active contour model. Janssens et al. [8] proposed a top-down cell segmentation framework using supervised learning and clump splitting, which requires a long pipeline built on several low-level image processing techniques. However, the performance of these image processing techniques can easily be influenced by imaging artifacts and cell clumps. Smith and Barton [6] proposed SMASH, a semiautomatic muscle image analysis software. Other software applications, such as CellProfiler [9], have obtained high exposure in the histological image analysis community. However, these applications show unsatisfactory results for challenging muscle images, and time-consuming manual adjustment is still needed in practice. Liu et al. [4] proposed a deformable model-based segmentation algorithm. Its success relies largely on the initial centers of the muscle cells; it cannot handle cells with arbitrarily transformed fiber shapes, and it requires complex postprocessing to refine the results, which is not robust in practice. Recently, Liu et al. [7] proposed a hierarchical tree-based region selection method to segment muscle fibres, which relies on elaborately designed features and high-level machine learning techniques. This method first detects fiber boundaries using a structured random forest [20], then builds a hierarchical region tree based on the detected edge map. Finally, dynamic programming is performed to select candidate regions from the tree structure [21]. This method shows obvious improvement over previous MIS approaches, but it still suffers from relatively expensive computation, so it cannot be applied to whole-slide images.

In fact, whole-slide MIS is still an unsolved problem. Although some literature discusses the use of distributed computing [22–24] to accelerate the processing of large-scale histological images, distributed computing is usually difficult to deploy in clinical practice.

The convolutional neural network (CNN) [25] is a major branch of the deep learning family. Its applications in the pathology and histological image analysis domain have become increasingly popular very recently [14, 26–28]. CNNs have shown a strong ability to handle complex classification problems [29]. Recently, the end-to-end CNN training concept was introduced for semantic image segmentation, termed the fully convolutional network (FCN) [16]. Instead of performing patch-to-pixel prediction, it enables the network to perform spatially dense classification (i.e., output a segmentation mask) given a test image. Taking advantage of this strength, several methods have been proposed to handle various pixelwise classification tasks [15, 30–34]. Our paper shares some similarity with these previous works in how to enable a CNN to be trained in an end-to-end manner. Different from previous works, we have made several specific designs to handle fine-grained prediction, unbalanced classes, multiscale features, and transfer learning from a pretrained model for MIS. More details are discussed in the rest of the paper.

3. Methodology

In this section, we begin by introducing the proposed network architecture and then present the proposed loss function for training the network. Finally, we introduce the two-stage learning used to train the overall network.

3.1. Network Architecture. We first briefly introduce the convolutional neural network (CNN). A CNN [25] is a variant of the multilayer perceptron (MLP), mainly composed of multiple stacked computation layers from bottom to top, including convolutional, max-pooling, fully connected, and activation layers. The convolutional layer uses learnable filters to extract representations from locally connected image regions (receptive fields). The max-pooling layer reduces the dimensionality of the representations obtained from the convolutional layers while keeping the features translation invariant. The fully connected layer uses all features for high-level classification. From bottom to top layers, a CNN gradually captures rich representations of the input image, from pixel level to content level, so as to make accurate classifications. A conventional CNN performs high-level classification, i.e., assigns a category label to an input image patch. When it is applied to the pixelwise-prediction MIS task, extensive patch-to-pixel prediction (one CNN feedforward per pixel) is required, which severely limits segmentation efficiency [14, 35].

Figure 1: Illustration of the segmentation results of different-scale (1x, 1000 × 1000 pixels, to 8x, 8000 × 8000 pixels) whole-slide muscle images (best viewed in electronic form). For each image, the right half represents the segmentation results overlaid by the colored masks. The runtime is the result tested on a single GPU (panel labels: 1X 11s, 2X 18s, 4X 88s, 6X 209s, 8X 364s).

Figure 2: The illustration of the network architecture. The input image has a ground truth segmentation mask and a boundary map. Black boxes indicate the encoder module, while colored boxes indicate the decoder module. One decoder takes the feature maps of one encoder layer as input and outputs one segmentation result. The multiscale outputs of all decoders are concatenated to generate the final segmentation result.
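A minimal NumPy sketch of the max-pooling operation used throughout the encoder (an illustrative helper, not code from the paper): each 2 × 2 window keeps only its strongest response, halving the spatial resolution.

```python
import numpy as np

def max_pool2d(x, k=2):
    """k x k max-pooling with stride k: keeps the strongest response
    in each window, reducing each spatial dimension by a factor of k."""
    h, w = x.shape
    h2, w2 = h // k, w // k
    return x[:h2 * k, :w2 * k].reshape(h2, k, w2, k).max(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool2d(x)
print(pooled.shape)  # (2, 2)
print(pooled)        # [[ 5.  7.] [13. 15.]]
```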

To solve this problem, we train our network in an end-to-end manner, which enables it to directly output the dense image segmentation given an input image [15, 16, 32]. In this way, we no longer need patchwise classification to assign labels to all pixels via millions of CNN feedforward passes; only a single feedforward pass is needed to obtain the final segmentation.

However, modifying a conventional CNN for end-to-end training brings a major side effect: substantial pixel-level information loss at the top layers makes the pixelwise prediction inaccurate [17, 30]. This is because multiple max-pooling layers dramatically decrease the spatial size of the output, so the predicted segmentation is very coarse. Most proposed end-to-end CNN methods use upsampling [16, 17] or deconvolution operations [15] to resize the output back to the spatial size of the input image. Nevertheless, the max-pooling layer is essential to abstract content-level representations for high-level category classification [25, 29, 36] and to decrease the computation space of the CNN.
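The resize-back arithmetic can be sanity-checked with the standard transposed-convolution output-size formula, out = (in − 1)·stride − 2·pad + kernel. A sketch (the layer sizes below are read from the tables in Figure 3):

```python
def deconv_out(n, kernel, stride, pad=0):
    """Output size of a deconvolution (transposed convolution):
    the input is enlarged roughly by a factor of the stride."""
    return (n - 1) * stride - 2 * pad + kernel

# Decoder-3 in Figure 3: a 250x250 map through a 4x4 deconvolution,
# stride 2, no padding -> 502x502, later cropped to the 500x500 scale.
print(deconv_out(250, kernel=4, stride=2))   # 502
# Decoder-5: a 63x63 map through an 8x8 deconvolution, stride 4 -> 256x256.
print(deconv_out(63, kernel=8, stride=4))    # 256
```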

In fact, when we generalize end-to-end CNNs to MIS, content-level information becomes less important, because the label of a single pixel does not rely on knowledge of the whole muscle image. Different from semantic segmentation [16, 32], which needs content-level information to predict a category label per pixel, we are more interested in fine-grained pixelwise prediction that takes advantage of the hierarchical representations of the encoder to improve prediction accuracy. The hierarchical characteristic is achieved by gradually enlarging the receptive field size after each max-pooling layer. To this end, we propose a novel network architecture composed of one encoder module and multiple decoder modules. Generally, the decoders use the rich, hierarchical representations obtained from the encoder for pixelwise classification.

3.1.1. Encoder Module. The encoder architecture is mostly identical to a conventional neural network. Instead of building our own layer combinations, we borrow the well-known VGG net [29], with its fully connected layers truncated, to capture the rich and hierarchical representations, from pixel level at the bottom layers to content level (i.e., category-specific knowledge) at the top layers. VGG net is composed of a series of convolutional sets, each having multiple convolutional layers followed by a max-pooling layer. VGG has two variants (one with 16 layers and the other with 19 layers); we use the 16-layer VGG for efficiency. We choose VGG for two reasons: (1) we can transfer the pretrained VGG model to help train our very deep network, as described in the next section; (2) VGG net is very deep and extracts five different-scale feature maps, providing very rich multiscale representations for the decoders.
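The five feature scales for a 1000 × 1000 input can be checked in a few lines (the sizes match the tables in Figure 3; the helper itself is ours):

```python
import math

def encoder_scales(size, num_pools=4):
    """Spatial sizes of the five feature scales the truncated VGG-16
    encoder produces: each 2x2 max-pool halves the size, rounding up."""
    scales = [size]
    for _ in range(num_pools):
        size = math.ceil(size / 2)
        scales.append(size)
    return scales

print(encoder_scales(1000))  # [1000, 500, 250, 125, 63]
```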

3.1.2. Decoder Module. The decoder has two main purposes. (1) It uses the rich representations obtained from the encoder for pixelwise classification, so the output of one decoder is a dense segmentation mask in which each spatial position assigns a label to the corresponding pixel of the input image (cell or noncell in our case). (2) It refines the low-scale, coarse segmentation mask to efficiently generate a fine-grained, high-scale segmentation mask. The refinement procedure is achieved by multistep deconvolution and successive use of same-scale feature maps obtained from other decoders.

We propose to connect a decoder prior to every max-pooling layer of the encoder, so the decoders can easily use the multiscale representations as input features, as inspired by [15, 16, 31]. Each decoder can be viewed as a small pixelwise classification network with an independent loss to update its parameters during training. Hence, the overall architecture is a multitask CNN.

Our decoder design includes convolutional layers with intermediate deconvolution layers [15]. Specifically, the deconvolution is the backward convolution operation, which performs an elementwise product with its filters. (Note that some controversy surrounds the naming of "deconvolution" in recent literature, as the deconvolution layer used here differs from the earlier definition of deconvolution [37]; we keep the definition used by most of the end-to-end CNN literature.) The output size of a deconvolution is enlarged by a factor of the stride. The filters of the deconvolution layers are learnable and updated by the loss of the decoder.

In this way, rather than enlarging the image with a large stride through a skip connection [16, 31, 38], our approach enlarges the feature map in multiple steps and progressively refines the feature maps at different scales via convolutional kernels, with the purpose of reducing the effects of pixel-level information loss. We use a 3 × 3 filter size, as this small size has been widely proven effective. In the end, we concatenate the multiscale predictions of all decoders, which generates a 5-dimensional feature map, and apply a 1 × 1 convolutional layer to merge it into the final output. Compared with how recent architectures [35, 39] use multiscale information (resizing input patches, feeding them into multiple CNNs, and merging all predictions), our approach performs multiscaling inside the network, requiring a single arbitrarily sized input and outputting the final segmentation result. Figure 3 specifies the parameters of each layer.
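Since each decoder output is a single channel, the fusion step is easy to picture: concatenating the five maps and applying a 1 × 1 convolution amounts to a learned per-pixel weighted sum. A NumPy sketch with made-up filter weights:

```python
import numpy as np

rng = np.random.default_rng(0)
h, w = 8, 8
# Five single-channel multiscale predictions, already resized to the
# input resolution (stand-ins for the decoder outputs).
preds = [rng.random((h, w)) for _ in range(5)]

# Concatenating gives a 5-channel map; a 1x1 convolution over it is a
# learned per-pixel weighted sum of the five predictions plus a bias.
stacked = np.stack(preds, axis=0)                # (5, h, w)
weights = np.array([0.1, 0.15, 0.2, 0.25, 0.3])  # illustrative 1x1 filter
bias = 0.0
fused = np.tensordot(weights, stacked, axes=1) + bias  # (h, w)
print(fused.shape)  # (8, 8)
```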


3.2. Spatially Weighted Loss for Backpropagation. This section describes the loss function used to train the network through backpropagation. Our proposed spatially weighted loss plays an important role in network training.

Denote the training data as D = {(X, Y) ∈ 𝒳 × 𝒴}, where X, Y ∈ R^N and N is the total number of pixels in the training image X. Y is the corresponding ground truth segmentation mask, with each pixel Y_i ∈ {0, 1} (i.e., 1 for pixels inside and on the boundary of a muscle cell, and 0 for background). For an input image X, the main objective of our network is to obtain the pixelwise prediction Y*:

Y^\star = \arg\max_{\hat{Y}} P(\hat{Y} \mid X, \theta), \quad (1)

where P(\hat{Y}_i \mid X, \theta) is the prediction probability of pixel X_i, i.e., the sigmoid output of the network (denoted as P_i afterwards for brevity), and θ represents all parameters of our network.

Our network has multiple decoders, each with an independent loss to update its parameters (see Section 3.1.2 for details). Denote the loss function of the i-th decoder as J^{de}_i. The extra 1 × 1 convolutional layer after the concat layer is updated by another loss (see Figure 3), denoted as J^c. Learning θ is achieved by minimizing the loss function J, defined as

J(\theta) = \sum_{i=1}^{M} J^{de}_i(\theta) + J^{c}(\theta), \quad (2)

where M is the number of decoders. Note that since J^{de}_i and J^c are both computed spatially over the pixels of the dense output, both have the same formulation. The overall loss J can be jointly minimized via backpropagation. (Specifically, when a layer has more than one successive path, such as the conv1-2 layer in Figure 3, which has two successive layers (decoder-1 and pool1), the gradients are accumulated from the multiple successive paths during backpropagation [40].)
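The accumulation rule in the parenthetical above can be checked numerically: for a parameter feeding several loss branches, the derivative of the summed loss J is the sum of the per-branch derivatives. A toy sketch with linear branch losses (the slopes are made up):

```python
def J(theta, branch_slopes=(2.0, -1.0, 0.5), fusion_slope=3.0):
    # J(theta) = sum_i J_de_i(theta) + J_c(theta), with toy linear branches.
    return sum(s * theta for s in branch_slopes) + fusion_slope * theta

eps = 1e-6
theta = 0.7
numeric_grad = (J(theta + eps) - J(theta - eps)) / (2 * eps)
analytic_grad = 2.0 + (-1.0) + 0.5 + 3.0   # gradient accumulated over paths
print(abs(numeric_grad - analytic_grad) < 1e-6)  # True
```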

In skeletal muscle images, several common problems affect network training: (1) the large proportion of pixels inside cells causes an unbalanced class distribution, such that error effects at the margins are diminished during backpropagation; (2) cells are usually densely distributed, and the boundaries between touching cells are thin and often unclear or broken due to muscle's unique anatomy; based on our observations, the network often misclassifies the pixels at the margins between fiber boundaries; (3) due to staining issues, boundary pixels are not smooth and continuous, so it is very difficult to ensure that annotations accurately label each pixel, and it is necessary to reduce this ambiguity for network training.

Figure 3: The detailed network configuration. The convolutional, max-pooling, deconvolutional, and concat layers are denoted by conv, pool, deconv, and concat, respectively. Each convolutional layer of the encoder is followed by a ReLU layer, which is hidden in the tables. There are 5 decoders connected inside the architecture of the encoder. The (black solid and gray dotted) arrows point to the layer where the output of the corresponding layer goes. The last column of each table shows the feature map size (height × width × dimension) of each layer. In the decoder tables, "†" indicates that a crop layer is connected afterwards to force the output size to match the input image size (i.e., 1000 × 1000 in the tables).

We propose a loss function that ameliorates these problems by assigning a different weight to the loss of each pixel. The loss function for a training image X, based on the cross-entropy loss, is defined as

J^{de}(\theta) = -\sum_{i=1}^{N} f(X_i) \left( \mathbb{1}[Y_i = 1] \log P_i + \mathbb{1}[Y_i = 0] \log\left(1 - P_i\right) \right). \quad (3)

The pixelwise weights are defined by the weight-assigning function f:

f(X_i) = C(Y_i)^{-1} \times \exp\!\left( \frac{\Omega(X_i)}{\eta_1} \right) \times \mathbb{1}\left[ \left| Y_i - P_i \right| < \eta_2 \right]. \quad (4)

The pixelwise weight-assigning function f has three terms, which play different roles in addressing the three problems above. These specific considerations distinguish our proposed loss from [16, 30].

In the first term, C(Y_i) is the label frequency, a global term taking the same value for all same-class pixels. In the second term, Ω is the Euclidean distance of pixel X_i to the boundary of the closest cell. Similar to [30], the intention of f is to assign relatively high weights to pixels adjacent to boundaries, amplifying the error penalty at the margins and at pixels close to fiber boundaries, and 1 otherwise (Ω is set to 0 if Ω(X_i) > ε). We set η = 0.6 and ε = 10 empirically. Compared with the "hard" error-balancing strategy in [16, 31], f produces a soft error penalty, encouraging better optimization convergence and enhancing fine-grained prediction. The third term reduces the reliance on the ground truth when the network predicts the opposite label with high probability. This term acts as a switch, forcing the weight of the corresponding pixel to zero when the condition is not satisfied. In practice, we keep this value during network feedforward, while the loss of the corresponding pixels is excluded during backpropagation.
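Reading equation (4) as the product of an inverse class-frequency term, a boundary-distance term, and a ground-truth switch, a NumPy sketch might look as follows. The constants and the exact form of the exponential term are our interpretation of the (partly garbled) source, so treat this as an assumption-laden illustration rather than the paper's implementation:

```python
import numpy as np

def pixel_weight(y, p, omega, freq_pos, eta1=0.6, eta2=0.6, eps_dist=10.0):
    """Sketch of the pixelwise weight f(X_i): inverse class frequency,
    a distance-to-boundary term, and a switch zeroing pixels where the
    network confidently contradicts the ground-truth label.
    y: labels in {0, 1}; p: predicted probabilities; omega: distance to
    the nearest fiber boundary; freq_pos: fraction of cell pixels."""
    omega = np.where(omega > eps_dist, 0.0, omega)     # neutralize far pixels
    freq = np.where(y == 1, freq_pos, 1.0 - freq_pos)  # C(Y_i)
    balance = 1.0 / freq                               # C(Y_i)^(-1)
    boundary = np.exp(omega / eta1)                    # margin amplification
    keep = (np.abs(y - p) < eta2).astype(float)        # ground-truth switch
    return balance * boundary * keep

y = np.array([1.0, 1.0, 0.0])
p = np.array([0.9, 0.2, 0.1])       # middle pixel: confident opposite label
omega = np.array([1.0, 2.0, 50.0])  # last pixel lies far from any boundary
w = pixel_weight(y, p, omega, freq_pos=0.7)
print(np.round(w, 3))
```

The middle pixel's weight is switched off because the network contradicts its label with high confidence, while the far-away background pixel keeps only its class-balancing weight.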

3.3. Two-Stage Training. Training our deep network involves some common difficulties:

(i) The large number of parameters in both the convolutional and deconvolutional layers makes the training difficult to converge properly [15, 41].

(ii) Successful training from scratch requires extensive labeled data, which is extremely difficult to obtain in the medical image domain.

One typical solution is transfer learning [41, 42], which reduces the difficulty of tricky parameter initialization and tuning [25, 29] and of the heavy data acquisition procedure. The core idea is to use a pretrained model as the initialization and fine-tune the CNN to adapt it to the target task with new training data. The encoder of our network partially inherits the architecture of VGG [29], which is, however, trained on a large set of natural images for image classification. Transferring its knowledge to benefit a totally unrelated biological image analysis problem (i.e., MIS) seems impracticable. However, a recent study [41] coincides with our experiments: it demonstrates the advantage of transferring from AlexNet [25], a relatively shallow CNN for natural image classification, across various biological imaging modalities. In our MIS segmentation case, the network architecture is much deeper, with many new parameterized layers in the decoders, so more specific treatment needs to be considered.

It is well known that the bottom layers of a CNN can be understood as various feature extractors attempting to capture low-level image features such as edges and corners [25, 37, 41]. These low-level features are common to natural images and muscle images, the most common being image gradients (i.e., boundaries). In practice, we find that training the network to detect boundaries is considerably easier than directly training the network to segment muscle fibres.

We propose a two-stage training strategy to progressively train our network, so as to utilize the powerful feature extractors of VGG and overcome the abovementioned problems. In the first stage, we apply transfer learning: the pretrained VGG initializes the parameters of the encoder, while the parameters of the decoders are randomly initialized. We then train the network to detect fiber boundaries, which is achieved by feeding the network with training muscle images associated with the ground truth boundary map (see Figure 2). This strategy facilitates swift convergence. After the network has adapted to the new muscle images, in the second stage, we fine-tune the model using the original training data D (i.e., Y is the segmentation mask) to train the network to automatically segment muscle fibres, assigning in-cell pixels to 1 and other pixels to 0. More implementation details are described in the experimental section.
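The two-stage schedule can be sketched as follows. This is a PyTorch stand-in for the paper's Caffe implementation: the toy encoder/decoder, targets, and learning rates are placeholders, and in practice the encoder would be initialized from the pretrained VGG rather than randomly:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy encoder/decoder; the real encoder is VGG-16 initialized from ImageNet.
encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
decoder = nn.Sequential(nn.ConvTranspose2d(16, 1, 2, stride=2))
model = nn.Sequential(encoder, decoder)
loss_fn = nn.BCEWithLogitsLoss()

x = torch.rand(2, 3, 32, 32)
boundary = (torch.rand(2, 1, 32, 32) > 0.9).float()  # stage-1 target: boundary map
mask = (torch.rand(2, 1, 32, 32) > 0.5).float()      # stage-2 target: segmentation mask

def run_stage(target, lr, iters=5):
    # SGD with momentum, as in the paper; the learning rate drops in stage 2.
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(iters):
        opt.zero_grad()
        loss = loss_fn(model(x), target)
        loss.backward()
        opt.step()
    return loss.item()

stage1_loss = run_stage(boundary, lr=1e-2)  # first learn to detect boundaries
stage2_loss = run_stage(mask, lr=1e-3)      # then fine-tune for segmentation
```

The key point is that both stages share the same parameters; only the target (boundary map vs. segmentation mask) and the learning rate change between stages.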

Another advantage of our proposed training strategy is that, besides the pixel weight-assigning function f, it further helps reduce the touching-objects problem (due to thin boundaries) [30, 34] that commonly occurs in end-to-end CNN segmentation. The strategy of [34] is to predict both a segmentation map and a boundary map and to merge the two maps to resolve touching glands. In our method, the first-stage training makes the network detect the cell boundaries, and the second-stage training is able to preserve this boundary information.

4. Experimental Results

Journal of Healthcare Engineering

4.1. Dataset. Our expert-annotated skeletal muscle image dataset with H&E staining contains 500 annotated images, captured with a whole-slide digital scanner from the cooperative institution Muscle Miner. The images exhibit large appearance variances in color, fiber size, and shape. The image sizes roughly range from 500 × 500 to 1500 × 1500 pixels. We split the dataset into 100 testing images and 400 training images.

In order to evaluate the proposed method on large-scale images, we measure the runtime on a whole-slide image. Note that we use small image patches for the segmentation accuracy evaluation because some comparative methods in the literature cannot handle whole-slide images. However, our proposed network is flexible with respect to the input size during the testing stage, because the decoder is able to adaptively adjust the output size to be consistent with the input size.

4.2. Implementation Details. Our implementation is based on the Caffe [40] framework, with modifications for our network design. All experiments are conducted on a standard desktop with an Intel i7 processor and a single Tesla K40c GPU. The optimization is driven by stochastic gradient descent with momentum. For the first-stage training, the network parameters are set to learning rate = 1e−6 (divided by 10 every 1e4 iterations), momentum = 0.9, and minibatch size = 2. In the second stage, we use learning rate = 1e−7 and keep the others the same.

Augmenting the dataset is a normal step for training a CNN. We apply a simple approach, randomly cropping 30 patches of 300 × 300 pixels from each training image, generating 1.2e4 training samples in total. We choose this patch size to take the memory capacity of the GPU into account. Based on our observations, the segmentation accuracy is not affected by increasing the input size of the test images. To simplify the computation of the weighting function f during training, we feed a precomputed weighting map associated with each training sample (X, Y) as an additional network input.
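The cropping step can be sketched as follows; `random_crops` is a hypothetical helper, not from the paper's code. With 400 training images and n = 30, this yields 400 × 30 = 12,000 samples:

```python
import numpy as np

def random_crops(image, label, n=30, size=300, rng=None):
    """Yield n random (image, label) patches of size x size, cropped
    at the same location so pixels and labels stay aligned."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = image.shape[:2]
    for _ in range(n):
        top = rng.integers(0, h - size + 1)
        left = rng.integers(0, w - size + 1)
        yield (image[top:top + size, left:left + size],
               label[top:top + size, left:left + size])
```

The same crop window would also be applied to the precomputed weighting map so that each patch carries its own weights.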

4.3. Segmentation Accuracy Evaluation. For quantitative evaluation, we report Precision (|S∩G|/|S|), Recall (|S∩G|/|G|), and F1-score (2 · Prec · Rec/(Prec + Rec)), where |S| is the segmented cell region area and |G| is the corresponding ground truth region area. For each test image, Precision and Recall are computed by averaging the results of all fibres inside. We report the three values with a fixed threshold (FT), i.e., a common threshold that produces the best F1-score over the test set, and with dynamic thresholds (DT), which produce the best F1-score per image.
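The three metrics can be sketched directly from their definitions. The helper below scores a whole binary mask; the paper additionally averages Precision and Recall over all fibres inside each test image, which is omitted here:

```python
import numpy as np

def prf(seg, gt):
    """Precision = |S∩G|/|S|, Recall = |S∩G|/|G|, F1 = 2PR/(P+R)
    on binary masks (whole-mask version; per-fiber averaging omitted)."""
    s, g = seg.astype(bool), gt.astype(bool)
    inter = np.logical_and(s, g).sum()
    prec = inter / max(s.sum(), 1)   # guard against an empty segmentation
    rec = inter / max(g.sum(), 1)    # guard against an empty ground truth
    f1 = 2 * prec * rec / max(prec + rec, 1e-12)
    return prec, rec, f1
```

For the FT numbers, one threshold on the network's probability output is shared across the whole test set; for DT, the threshold maximizing F1 is chosen per image.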

In Table 1, we compare the segmentation performance of our approach to several state-of-the-art methods. DC [43] and multiscale combinatorial grouping (MCG) [44] are recently proposed learning-based image segmentation methods. U-Net [30] is an end-to-end CNN for biomedical image segmentation. We use their public code and carefully train the models on our training data with the same amount. DNN-SNM [14] is a well-known CNN-based image segmentation method; we regard it as a generic CNN for comparison with our end-to-end CNN approach. For our method, we directly use the network output as the segmentation result for evaluation, without any extra postprocessing effort.

As shown in Table 1, our method achieves much better results than the comparative methods. Although [7] has better Recall (FT), our method has an improvement of around 10% on Precision (FT). DC and MCG are not robust to image artifacts, which decreases their segmentation performance. Our method largely outperforms DNN-SNM and U-Net because (1) our network is deeper than DNN-SNM, capturing richer representations; (2) our decoder utilizes the multiscale representations better than U-Net and is able to reduce the effects of pixelwise information loss; and (3) two-stage training takes advantage of VGG for better training effectiveness, rather than training from scratch as U-Net does. The outstanding Precision result demonstrates that our method produces more fine-grained segmentation than the others. This superiority is better demonstrated by the qualitative evaluation shown in Figure 4.

4.4. Whole-Slide Segmentation Runtime. In Table 2, we compare the runtime of our method to the comparative methods on images of different sizes cropped from a whole-slide image (see Figure 1). The runtime of the non-deep-learning-based methods (1st block) depends on both pixel and fiber quantities, so they cannot handle large-scale images. In contrast, the deep-learning-based methods (2nd and 3rd blocks) depend only on the pixel quantity, so they have close-to-linear time complexity with respect to the image scale. We also implement a fast scanning version [45] of DNN-SNM on GPU. Although its speed improves considerably, it is still much slower than ours. U-Net has a more complicated layer connection configuration, so it is slower than our method, especially in large-scale cases. The significant speed improvement demonstrates the scalability of our proposed method to whole-slide MIS at even larger scales.

5. Conclusion

This paper presents a fast and accurate whole-slide MIS method based on a CNN trained in an end-to-end manner. Our proposed network captures hierarchical and comprehensive representations to support multiscale pixelwise predictions inside the network. A two-stage transfer learning strategy is proposed to train such a deep network. Superior accuracy and efficiency are experimentally demonstrated on a challenging skeletal muscle image dataset. In general, our approach enables multiscaling inside the network while requiring just a single, arbitrarily sized input and producing fine outputs. However, during the downsampling process of the encoder, many important features, such as the edge features of cells, are still lost because of the limited resolution of the feature maps after downsampling. To further improve decoding efficiency, in future work we can design a module that complements these important features to further improve network performance.


Figure 4: Segmentation results of four sample skeletal muscle images (columns: test image, ground truth, DNN-SNM, Liu et al. [7], ours). We show some very challenging cases with large appearance variances in color, fiber shape, etc. Each segmented fiber is overlaid with a distinctively colored mask, while false positives and false negatives are highlighted by red and blue contours, respectively. Compared with the other two methods, our method obtains more fine-grained segmentation results with obviously fewer false predictions.

Table 2: The runtime (in seconds) comparison on images of different sizes, from 1x (1000 × 1000) to 9x (9000 × 9000).

Method         | 1x  | 2x   | 3x   | 4x   | 5x   | 6x   | 7x    | 8x    | 9x
DC [43]        | 20  | 79   | —    | —    | —    | —    | —     | —     | —
MCG [44]       | 7   | 27   | —    | —    | —    | —    | —     | —     | —
Liu et al. [7] | 10  | 59   | —    | —    | —    | —    | —     | —     | —
DNN-SNM [14]   | 264 | 1056 | 2376 | 4224 | 6600 | 9504 | 12936 | 16896 | 21384
DNN-SNM⋆ [45]  | 31  | 115  | 242  | 431  | 675  | 974  | 1325  | 1738  | 2160
U-Net [30]     | 12  | 39   | 90   | 161  | 246  | 368  | 482   | 633   | 792
Our approach   | 11  | 18   | 53   | 88   | 139  | 209  | 278   | 364   | 468

The first three methods cannot handle images of 3x and larger sizes on our machine (marked "—" in the table). ⋆DNN-SNM is a fast scanning implementation for prediction speed acceleration.

Table 1: The segmentation results compared with state-of-the-art methods.

Method         | F1-score FT | F1-score DT | Precision FT | Precision DT | Recall FT  | Recall DT
DC [43]        | 48 ± 0.093  | 60 ± 0.138  | 41 ± 0.066   | 54 ± 0.164   | 67 ± 0.194 | 73 ± 0.148
MCG [44]       | 63 ± 0.201  | 71 ± 0.105  | 53 ± 0.136   | 64 ± 0.138   | 80 ± 0.303 | 82 ± 0.091
DNN-SNM [14]   | 76 ± 0.033  | 78 ± 0.080  | 83 ± 0.042   | 85 ± 0.089   | 70 ± 0.058 | 73 ± 0.087
U-Net [30]     | 80 ± 0.143  | 81 ± 0.054  | 87 ± 0.155   | 86 ± 0.076   | 74 ± 0.126 | 77 ± 0.055
Liu et al. [7] | 82 ± 0.172  | 84 ± 0.061  | 81 ± 0.043   | 84 ± 0.071   | 85 ± 0.202 | 85 ± 0.068
Our approach   | 86 ± 0.184  | 89 ± 0.048  | 91 ± 0.174   | 93 ± 0.050   | 82 ± 0.176 | 86 ± 0.058

Values are percentages (% ± σ), where σ is the standard deviation. FT: fixed threshold; DT: dynamic thresholds.


Data Availability

The data that support the findings of this study are available from the cooperative institution Muscle Miner, but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. Data are, however, available from the authors upon reasonable request and with the permission of the cooperative institution Muscle Miner.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank all study participants. This work was supported by the National Key R&D Program of China (grant no. 2017YFB1002504) and the National Natural Science Foundation of China (nos. 81727802 and 61701404).

References

[1] C. S. Fry, J. D. Lee, J. Mula et al., "Inducible depletion of satellite cells in adult sedentary mice impairs muscle regenerative capacity without affecting sarcopenia," Nature Medicine, vol. 21, no. 1, pp. 76–80, 2015.
[2] M. W. Lee, M. G. Viola, H. Meng et al., "Differential muscle hypertrophy is associated with satellite cell numbers and Akt pathway activation following activin type IIB receptor inhibition in Mtm1 p.R69C mice," The American Journal of Pathology, vol. 184, no. 6, pp. 1831–1842, 2014.
[3] H. Viola, P. M. Janssen, R. W. Grange et al., "Tissue triage and freezing for models of skeletal muscle disease," Journal of Visualized Experiments (JoVE), vol. e51586, no. 89, 2014.
[4] F. Liu, A. L. Mackey, R. Srikuea, K. A. Esser, and L. Yang, "Automated image segmentation of haematoxylin and eosin stained skeletal muscle cross-sections," Journal of Microscopy, vol. 252, no. 3, pp. 275–285, 2013.
[5] H. Su, F. Xing, J. D. Lee et al., "Learning based automatic detection of myonuclei in isolated single skeletal muscle fibers using multi-focus image fusion," in Proceedings of the IEEE International Symposium on Biomedical Imaging, pp. 432–435, San Francisco, CA, USA, April 2013.
[6] L. R. Smith and E. R. Barton, "SMASH: semi-automatic muscle analysis using segmentation of histology, a MATLAB application," Skeletal Muscle, vol. 4, no. 1, pp. 1–16, 2014.
[7] F. Liu, F. Xing, Z. Zhang, M. McGough, and L. Yang, "Robust muscle cell quantification using structured edge detection and hierarchical segmentation," in Proceedings of MICCAI, Lecture Notes in Computer Science, pp. 324–331, 2015.
[8] T. Janssens, L. Antanas, S. Derde, I. Vanhorebeek, G. Van den Berghe, and F. Guiza Grandas, "Charisma: an integrated approach to automatic H&E-stained skeletal muscle cell segmentation using supervised learning and novel robust clump splitting," Medical Image Analysis, vol. 17, no. 8, pp. 1206–1219, 2013.
[9] A. E. Carpenter, T. R. Jones, M. R. Lamprecht et al., "CellProfiler: image analysis software for identifying and quantifying cell phenotypes," Genome Biology, vol. 7, no. 10, pp. 1–11, 2006.
[10] A. Klemencic, S. Kovacic, and F. Pernus, "Automated segmentation of muscle fiber images using active contour models," Cytometry, vol. 32, no. 4, pp. 317–326, 1998.
[11] N. Bova, V. Gal, O. Ibáñez, and O. Cordón, "Deformable models direct supervised guidance: a novel paradigm for automatic image segmentation," Neurocomputing, vol. 177, pp. 317–333, 2016.
[12] T. F. Cootes, C. J. Taylor, D. H. Cooper et al., "Active shape models: their training and application," Computer Vision and Image Understanding, vol. 61, no. 1, pp. 38–59, 1995.
[13] S. Zhang, Y. Zhan, M. Dewan, J. Huang, D. N. Metaxas, and X. S. Zhou, "Sparse shape composition: a new framework for shape prior modeling," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1025–1032, Colorado Springs, CO, USA, June 2011.
[14] D. Ciresan, A. Giusti, L. M. Gambardella, and J. Schmidhuber, "Deep neural networks segment neuronal membranes in electron microscopy images," in Proceedings of NIPS, pp. 2843–2851, Lake Tahoe, NV, USA, December 2012.
[15] H. Noh, S. Hong, and B. Han, "Learning deconvolution network for semantic segmentation," in Proceedings of the ICCV, Las Condes, Chile, December 2015.
[16] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the CVPR, pp. 3431–3440, Boston, MA, USA, June 2015.
[17] P. O. Pinheiro, R. Collobert, and P. Dollár, "Learning to segment object candidates," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1981–1989, Montreal, Canada, December 2015.
[18] J. Mula, J. D. Lee, F. Liu, L. Yang, and C. A. Peterson, "Automated image analysis of skeletal muscle fiber cross-sectional area," Journal of Applied Physiology, vol. 114, no. 1, pp. 148–155, 2013.
[19] P.-Y. Baudin, N. Azzabou, P. G. Carlier, and N. Paragios, "Prior knowledge, random walks and human skeletal muscle segmentation," in Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI 2012), pp. 569–576, Nice, France, October 2012.
[20] P. Dollár and C. L. Zitnick, "Structured forests for fast edge detection," in Proceedings of the ICCV, pp. 1841–1848, Sydney, Australia, December 2013.
[21] F. Liu, F. Xing, and L. Yang, "Robust muscle cell segmentation using region selection with dynamic programming," in Proceedings of the ISBI, pp. 521–524, Beijing, China, April 2014.
[22] E. Van Aart, N. Sepasian, A. Jalba, and A. Vilanova, "CUDA-accelerated geodesic ray-tracing for fiber tracking," International Journal of Biomedical Imaging, vol. 2011, Article ID 698908, 12 pages, 2011.
[23] G. C. Kagadis, C. Kloukinas, K. Moore et al., "Cloud computing in medical imaging," Medical Physics, vol. 40, no. 7, article 070901.
[24] L. Yang, X. Qi, F. Xing, T. Kurc, J. Saltz, and D. J. Foran, "Parallel content-based sub-image retrieval using hierarchical searching," Bioinformatics, vol. 30, no. 7, pp. 996–1002, 2014.
[25] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.
[26] H.-C. Shin, H. R. Roth, M. Gao et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.
[27] J. Xu, X. Luo, G. Wang, H. Gilmore, and A. Madabhushi, "A deep convolutional neural network for segmenting and classifying epithelial and stromal regions in histopathological images," Neurocomputing, vol. 191, pp. 214–223, 2016.
[28] X. Pan, L. Li, H. Yang et al., "Accurate segmentation of nuclei in pathological images via sparse reconstruction and deep convolutional networks," Neurocomputing, vol. 229, pp. 88–99, 2017.
[29] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," https://arxiv.org/abs/1409.1556.
[30] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: convolutional networks for biomedical image segmentation," in Proceedings of MICCAI, Lecture Notes in Computer Science, pp. 234–241, 2015.
[31] S. Xie and Z. Tu, "Holistically-nested edge detection," in Proceedings of the ICCV, pp. 1395–1403, Las Condes, Chile, December 2015.
[32] D. Eigen and R. Fergus, "Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture," in Proceedings of the IEEE International Conference on Computer Vision, pp. 2650–2658, Las Condes, Chile, December 2015.
[33] Q. Dou, H. Chen, L. Yu et al., "Automatic detection of cerebral microbleeds from MR images via 3D convolutional neural networks," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1182–1195, 2016.
[34] H. Chen, X. Qi, L. Yu, and P.-A. Heng, "DCAN: deep contour-aware networks for accurate gland segmentation," https://arxiv.org/abs/1604.02677.
[35] P. Moeskops, M. A. Viergever, A. M. Mendrik, L. S. de Vries, M. J. N. L. Benders, and I. Išgum, "Automatic segmentation of MR brain images with a convolutional neural network," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1252–1261, 2016.
[36] S. Hong, H. Noh, and B. Han, "Decoupled deep neural network for semi-supervised semantic segmentation," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1495–1503, Montreal, Canada, December 2015.
[37] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in Computer Vision–ECCV 2014, pp. 818–833, Springer, Zurich, Switzerland, September 2014.
[38] C. Szegedy, W. Liu, Y. Jia et al., "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9, Boston, MA, USA, June 2015.
[39] G. Bertasius, J. Shi, and L. Torresani, "DeepEdge: a multi-scale bifurcated deep network for top-down contour detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4380–4389, Boston, MA, USA, June 2015.
[40] Y. Jia, E. Shelhamer, J. Donahue et al., "Caffe: convolutional architecture for fast feature embedding," in Proceedings of the International Conference on Multimedia, pp. 675–678, Orlando, FL, USA, November 2014.
[41] N. Tajbakhsh, J. Y. Shin, S. R. Gurudu et al., "Convolutional neural networks for medical image analysis: full training or fine tuning?," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1299–1312, 2016.
[42] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, "How transferable are features in deep neural networks?," in Proceedings of NIPS, pp. 3320–3328, Montreal, Canada, December 2014.
[43] M. Donoser and D. Schmalstieg, "Discrete-continuous gradient orientation estimation for faster image segmentation," in Proceedings of the CVPR, pp. 3158–3165, Columbus, OH, USA, June 2014.
[44] P. Arbeláez, J. Pont-Tuset, J. Barron, F. Marques, and J. Malik, "Multiscale combinatorial grouping," in Proceedings of the CVPR, pp. 328–335, Columbus, OH, USA, June 2014.
[45] A. Giusti, D. C. Ciresan, J. Masci, L. M. Gambardella, and J. Schmidhuber, "Fast image scanning with deep max-pooling convolutional neural networks," 2013, https://arxiv.org/abs/1302.1700.



Although histological cell segmentation research has a rich history, few methods have been successfully applied to MIS. Several challenges remain to be solved to achieve a robust and automatic MIS system. First, all existing MIS methods can only handle small image patches (smaller than 1000 × 1000 pixels) cropped from whole-slide muscle images. The main reason is that supervised methods usually rely on handcrafted features and pretrained classifiers to distinguish cell and noncell regions or pixels. However, the computation of well-designed handcrafted features and regionwise classification is usually expensive, and the time cost is proportional to the image scale. This limitation makes existing methods hardly applicable to large-scale, whole-slide muscle images.

Second, the special muscle cell shapes and sizes mentioned above make segmentation methods hard to generalize. For example, unsupervised methods such as deformable models [10, 11] and shape prior-based methods [12, 13] have been widely used in histological and microscopy cell segmentation. However, the arbitrarily transformed cell shapes and sizes increase the difficulty of using shape information for MIS.

Third, the densely touching fibres and staining artifacts make the fiber boundaries unclear and broken, which increases the difficulty of separating multiple touching fibres using boundary information. Fourth, current methods usually contain multiple mutually dependent steps; the failure of any step will largely affect the other steps and the final results. Moreover, the complex pipeline largely decreases the speed of MIS.

This paper addresses these challenges to achieve an MIS method that is both efficient and effective, based on the recently popular convolutional neural network (CNN). CNN-based methods have achieved unprecedented performance in various medical image applications. Different from conventional computer vision methods, a CNN has a strong capability to learn comprehensive representations via a deep architecture for effective classification. When used for pixelwise classification, however, conventional CNNs show efficiency shortcomings [14]. The end-to-end CNN training strategy has recently attracted a lot of research interest [15, 16], but a common problem is that the dense output is relatively coarse, and it is difficult to accurately classify each pixel [16, 17]. To generate more accurate and fine outputs, a refinement procedure needs to be considered.

In this paper, we propose a novel MIS method based on a CNN trained in an end-to-end manner [16], which enables the CNN to better utilize the rich representations and directly predict fine-grained segmentation given an arbitrarily sized input image. Figure 1 shows some segmentation results at different image scales. Specifically, the main contributions of this paper are summarized as follows:

(i) We propose a network whose architecture mainly contains two modules: the encoder and the decoder. The encoder captures rich representations through a very deep CNN architecture. The decoder leverages the hierarchy characteristic of the encoder to enable multiscale prediction independently. A refinement procedure of the decoder automatically addresses the fine-grained dense outputs. Figure 2 illustrates our network.

(ii) We propose a novel spatially weighted loss function to handle the unbalanced class issue and the unavoidable errors in the ground truth, which encourages the convergence of the network.

(iii) We propose a two-stage training approach to train the proposed very deep network, which facilitates the network's use of a pretrained CNN for better convergence and preserves the weak boundary information of muscle cells.

(iv) We conduct extensive experiments on an expert-annotated skeletal muscle image dataset, demonstrating significantly improved efficiency and accuracy compared with other state-of-the-art methods.

2. Related Works

The growing interest in computer-aided histological image diagnosis entails a rich research literature. As one member of the histological image analysis family, skeletal muscle image analysis is a new yet recently popular application, which has built successful cooperations with clinics to accelerate their research and clinical trials [1–3, 18, 19].

As the prerequisite of skeletal muscle image analysis, various methods have been proposed for MIS. Klemencic et al. [10] proposed a semiautomatic muscle image segmentation approach based on the active contour model. Janssens et al. [8] proposed a top-down cell segmentation framework using supervised learning and clump splitting, which requires a long pipeline built on several low-level image processing techniques; however, the performance of these techniques is easily influenced by imaging artifacts and cell clumps. Smith and Barton [6] proposed SMASH, a semiautomatic muscle image analysis application. Some other software applications, such as CellProfiler [9], have obtained high exposure in the histological image analysis community, but they show nonsatisfactory results for challenging muscle images, so time-consuming manual adjustment is still needed in practice. Liu et al. [4] proposed a deformable model-based segmentation algorithm. Its success relies largely on the initial centers of the muscle cells; it cannot handle cells with arbitrarily transformed fiber shapes, and it requires complex postprocessing to refine the results, which is not robust in practice. Recently, Liu et al. [7] proposed a hierarchical tree-based region selection method to segment muscle fibres, which relies on elaborately designed features and high-level machine learning techniques. This method first detects fiber boundaries using a structured random forest [20]; then it builds a hierarchical region tree based on the detected edge map; finally, dynamic programming is performed to select candidate regions from the tree structure [21]. This method shows obvious improvement over previous MIS approaches. However, it still suffers from relatively expensive computation, so it cannot be applied to whole-slide images.

As a matter of fact, whole-slide MIS is still an unsolved problem. Although some literature discusses the usage of distributed computing [22–24] to accelerate the processing of large-scale histological images, distributed computing is usually difficult to deploy for practical usage in clinical practice.

The convolutional neural network (CNN) [25] is one major branch of the deep learning family. Its applications in the pathology and histological image analysis domain have become increasingly popular very recently [14, 26–28]. CNNs have shown a strong ability to handle complex classification problems [29]. Recently, the end-to-end CNN training concept was introduced for semantic image segmentation, termed the fully convolutional network (FCN) [16]. Instead of performing patch-to-pixel prediction, it enables the network to perform spatially dense classification (i.e., to produce a segmentation mask) given a test image. Taking advantage of this strength, several methods have been proposed to handle various pixelwise classification tasks [15, 30–34]. Our paper shares some similarity with these previous works in how to enable a CNN to be trained in an end-to-end manner. Different from previous works, we have made several specific designs to handle fine-grained prediction, unbalanced classes, multiscale features, and transfer learning from a pretrained model for MIS. More details are discussed in the rest of the paper.

3. Methodology

In this section, we begin by introducing the proposed network architecture and then present the proposed loss function for training the network. Finally, we introduce the two-stage learning used to train the overall network.

3.1. Network Architecture. We briefly introduce the convolutional neural network (CNN) first. A CNN [25] is a variant of the multilayer perceptron (MLP), mainly composed of multiple stacked computation layers from bottom to top, including convolutional, max-pooling, fully connected, and activation layers. The convolutional layer uses learnable convolutional filters to extract representations from locally connected image regions (receptive fields). The max-pooling layer reduces the dimensionality of the representations obtained from the convolutional layers while keeping the features translation invariant. The fully connected layer uses all features for

Figure 2: Illustration of the network architecture. The input image has a ground truth segmentation mask and a boundary map. Black boxes indicate the encoder module, while colored boxes indicate the decoder modules. One decoder takes the feature maps of one encoder layer as input and outputs one segmentation result. The multiscale outputs of all decoders are concatenated to generate the final segmentation result.

Figure 1: Illustration of the segmentation results on different-scale whole-slide muscle images, from 1x (1000 × 1000 pixels, 11 s) through 2x (18 s), 4x (88 s), and 6x (209 s) to 8x (8000 × 8000 pixels, 364 s); best viewed in electronic form. For each image, the right half shows the segmentation result overlaid with colored masks. The runtimes are measured on a single GPU.


high-level classification. From bottom layers to top layers, a CNN gradually captures rich representations of the input image, from pixel level to content level, so as to make accurate classifications. A conventional CNN performs high-level classification, i.e., it assigns a category label to an input image patch. When applied to the pixelwise-prediction MIS task, extensive patch-to-pixel prediction (CNN feedforward) is required, which severely limits the segmentation efficiency [14, 35].

To solve this problem, we train our network in an end-to-end manner, which enables the network to directly output the dense image segmentation given an input image [15, 16, 32]. In this way, we no longer need patchwise classification to assign labels to all pixels via millions of CNN feedforward passes; only a single feedforward is needed to obtain the final segmentation.

However, modifying a conventional CNN to perform end-to-end training brings a major side effect: substantial pixel-level information loss at the top layers makes the pixelwise prediction inaccurate [17, 30]. This is because multiple max-pooling layers dramatically decrease the spatial size of the output, so the predicted segmentation is very coarse. Most proposed end-to-end CNN methods use upsampling [16, 17] or deconvolution operations [15] to resize the output back to the spatial size of the input image. Nevertheless, the max-pooling layer is essential to abstract the content-level representations for high-level category classification [25, 29, 36] and to decrease the computation space of the CNN.

In fact, when we generalize end-to-end CNN to MIS, content-level information becomes less important, because the label of a single pixel does not rely on knowledge of the whole muscle image. Different from semantic segmentation [16, 32], which needs content-level information to predict the category label per pixel, we are more interested in fine-grained pixelwise prediction that takes advantage of the hierarchical representations of the encoder to improve prediction accuracy. The hierarchy is achieved by gradually enlarging the receptive field size after each max-pooling layer. To this end, we propose a novel network architecture composed of one encoder module and multiple decoder modules. Generally, the decoders use the rich and hierarchical representations obtained from the encoder for pixelwise classification.
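The gradual receptive-field growth after each max-pooling layer can be made concrete with a short calculation over the 16-layer VGG configuration described below (this helper is our own sketch, not code from the paper):

```python
def receptive_fields(layers):
    """Track the receptive field (rf) through a sequence of
    (kernel, stride) layers; `jump` is the cumulative stride."""
    rf, jump, out = 1, 1, []
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
        out.append(rf)
    return out

# VGG-16 convolutional body: 3x3 convs with 2x2 max-pooling between sets.
vgg16 = ([(3, 1)] * 2 + [(2, 2)]      # conv1-1..2, pool1
         + [(3, 1)] * 2 + [(2, 2)]    # conv2-1..2, pool2
         + [(3, 1)] * 3 + [(2, 2)]    # conv3-1..3, pool3
         + [(3, 1)] * 3 + [(2, 2)]    # conv4-1..3, pool4
         + [(3, 1)] * 3)              # conv5-1..3

rfs = receptive_fields(vgg16)
# Receptive field at the five decoder tap points (just before each pool
# and at conv5-3) roughly doubles per scale:
print([rfs[i] for i in (1, 4, 8, 12, 16)])  # [5, 14, 40, 92, 196]
```

Each max-pooling layer doubles the step at which subsequent 3 × 3 kernels sample the input, which is what produces the roughly twofold receptive-field growth per scale.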

3.1.1. Encoder Module. The encoder architecture is mostly identical to a conventional neural network. Instead of building our own layer combinations, we borrow the well-known VGG net [29], with the fully connected layers truncated, to capture rich and hierarchical representations, from pixel level at the bottom layers to content level (i.e., category-specific knowledge) at the top layers. VGG net is composed of a series of convolutional sets, each set having multiple convolutional layers followed by a max-pooling layer. VGG has two variants (one with 16 layers and the other with 19 layers); we use the 16-layer VGG for efficiency. We choose VGG for two reasons: (1) we can transfer the pretrained VGG model to help train our very deep network, as described in the next section; (2) VGG net is very deep and extracts five different-scale feature maps containing very rich multiscale representations for the decoders to use.
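The five feature-map scales the decoders tap can be checked with a short calculation: for a 1000 × 1000 input, Caffe-style ceil-mode pooling yields the 63 × 63 bottom scale listed in Figure 3 (the helper name is ours):

```python
import math

def scale_sizes(side, n_pools):
    """Spatial side length of the five VGG feature-map scales.
    Caffe-style pooling rounds up, so 125 -> 63 rather than 62."""
    sizes = [side]
    for _ in range(n_pools):
        side = math.ceil(side / 2)
        sizes.append(side)
    return sizes

print(scale_sizes(1000, 4))  # [1000, 500, 250, 125, 63]
```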

3.1.2. Decoder Module. The decoder has two main purposes: (1) it utilizes the rich representations obtained from the encoder for pixelwise classification, so the output of one decoder is a dense segmentation mask in which each spatial position assigns a label to the corresponding pixel of the input image (cell or noncell in our case); (2) it refines the low-scale coarse segmentation mask to efficiently generate a fine-grained high-scale segmentation mask. The refinement is achieved by multistep deconvolution and the successive use of same-scale feature maps obtained from other decoders.

We propose to connect multiple decoders prior to every max-pooling layer of the encoder; thus, the decoders can easily utilize the multiscale representations as input features, as inspired by [15, 16, 31]. Each decoder can be viewed as a small pixelwise classification network with an independent loss to update its parameters during training. Hence, the overall architecture is a multitask CNN.

Our design of the decoder includes convolutional layers with intermediate deconvolution layers [15]. Specifically, the deconvolution is a backward convolution operation that performs an elementwise product with its filters (please note that some controversy surrounds the name "deconvolution" in recent literature, as the deconvolution layer used here differs from the earlier definition of deconvolution [37]; we keep the definition used by most of the end-to-end CNN literature). The output size of a deconvolution layer is enlarged by a factor of the stride. The filters of the deconvolution layers are learnable and are updated by the loss of the decoder.
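The stride-driven enlargement follows the standard transposed-convolution size formula, out = (in − 1) · stride + kernel − 2 · pad. A quick check with kernel/stride values from the decoder tables in Figure 3 also shows why each decoder needs a final crop layer, since the output slightly overshoots the target size:

```python
def deconv_out(size, kernel, stride, pad=0):
    """Output side length of a transposed-convolution (deconv) layer."""
    return (size - 1) * stride + kernel - 2 * pad

# Numbers consistent with the decoder tables in Figure 3:
print(deconv_out(250, 4, 2))   # 502, cropped back toward the input size
print(deconv_out(125, 8, 4))   # 504
print(deconv_out(63, 8, 4))    # 256
```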

In this way, rather than enlarging the image in one large-stride step through a skip connection [16, 31, 38], our approach enlarges the feature map in multiple steps and progressively refines the feature maps at different scales via convolutional kernels, with the aim of reducing the effects of pixel-level information loss. We use a 3 × 3 filter size, as this small size has proven widely effective. In the end, we concatenate the multiscale predictions of all decoders, which generates a 5-dimensional feature map, and apply a 1 × 1 convolutional layer to merge it into the final output. Compared with how recent architectures [35, 39] use multiscale information (resizing the input patch, feeding it into multiple CNNs, and merging all predictions [35, 39]), our approach enables multiscaling inside the network, requiring a single arbitrarily sized input and outputting the final segmentation result. Figure 3 specifies the parameters of each layer.
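The concat-plus-1 × 1-convolution fusion at the end can be pictured with a toy numpy sketch; the random weights below stand in for the learned 1 × 1 filters, and the 10 × 10 maps for the full-resolution decoder outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 10 x 10 stand-ins for the five same-size decoder score maps.
preds = [rng.random((10, 10, 1)) for _ in range(5)]

stacked = np.concatenate(preds, axis=-1)   # "concat" layer: H x W x 5
w = rng.random(5)                          # 1 x 1 conv weights (random here;
b = 0.1                                    # learned in the real network)
merged = stacked @ w + b                   # 1 x 1 conv == per-pixel weighted sum
print(merged.shape)                        # (10, 10)
```

A 1 × 1 convolution over a 5-channel map is exactly a per-pixel weighted sum of the five decoder predictions, which is how the network learns how much each scale should contribute.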


3.2. Spatially Weighted Loss for Backpropagation. This section describes the loss function for training the network through backpropagation. Our proposed spatially weighted loss plays an important role in network training.

Denote the training data as D = {(X, Y) ∈ 𝒳 × 𝒴}, where X, Y ∈ R^N and N is the total number of pixels in the training image X. Y is the corresponding ground-truth segmentation mask, with each pixel Y_i ∈ {0, 1} (i.e., 1 for pixels inside and on the boundary of a muscle cell and 0 for background). For an input image X, the main objective of our network is to obtain the pixelwise prediction Y⋆:

Y⋆ = argmax_{Ŷ} P(Ŷ | X; θ),  (1)

where P(Ŷ_i | X; θ) is the prediction probability of pixel X_i, i.e., the sigmoid output of the network (denoted as P_i afterwards for brevity), and θ represents all parameters of our network.

Our network has multiple decoders, each with an independent loss to update its parameters (see Section 3.1.2 for details). Denote the loss function of the i-th decoder as J_i^{de}. The extra 1 × 1 convolutional layer after the concat layer is updated by another loss (see Figure 3), denoted as J^c. Learning θ is achieved by minimizing the loss function J, which is defined as

J(θ) = Σ_{i=1}^{M} J_i^{de}(θ) + J^c(θ),  (2)

where M is the number of decoders. Note that since J_i^{de} and J^c are both computed spatially over the pixels of the dense output, both have the same formulation. The overall loss J can be jointly minimized via backpropagation (specifically, when a layer has more than one path, such as the conv1-2 layer in Figure 3, which has two successive layers (decoder-1 and pool1), the gradients are accumulated from the multiple successive paths during backpropagation [40]).

In skeletal muscle images, several common problems affect network training: (1) the large proportion of in-cell pixels causes a class imbalance, such that the error effects occurring at the margins are diminished during backpropagation; (2) cells are usually densely distributed, and the boundaries between touching cells are thin and often unclear or broken owing to muscle's unique anatomy; based on our observations, the network often misclassifies the pixels at

[Figure 3 layer tables omitted here: they list, for each encoder layer (conv1-1 through conv5-3 with pool1 to pool4) and for each of the five decoders (a 1 × 1 conv, one or two deconv layers, and a final conv), the kernel size, stride, padding, and output feature-map size, ranging from 1000 × 1000 × 64 at conv1-1 down to 63 × 63 × 512 at conv5-3, followed by the concat and final 1 × 1 conv layers.]

Figure 3: The detailed network configuration. The convolutional, max-pooling, deconvolutional, and concat layers are denoted by conv, pool, deconv, and concat, respectively. Each convolutional layer of the encoder is followed by a ReLU layer, which is hidden in the tables. There are 5 decoders connected inside the architecture of the encoder. The (black solid and gray dotted) arrows point to the layer where the output of the corresponding layer goes. The last column of each table shows the feature-map size (height × width × dimension) of each layer. In the decoder tables, "†" indicates that a crop layer is connected afterwards to force the output size to be the same as the input image size (i.e., 1000 × 1000 in the tables).


margins between fiber boundaries; (3) owing to staining issues, the boundary pixels are not smooth and continuous, so it is very difficult to ensure that the annotations accurately label each pixel, and it is necessary to reduce this ambiguity for network training.

We propose a loss function that ameliorates these problems by assigning different weights to the loss of each pixel. The loss function for a training sample X, based on the cross-entropy loss, is defined as

J^{de}(θ) = −Σ_{i=1}^{N} f(X_i) (1[Y_i = 1] log P_i + 1[Y_i = 0] log(1 − P_i)).  (3)

The pixelwise weights are defined by the weight-assigning function f, which is defined as

f(X_i) = C(Y_i)^{−1} × exp(Ω(X_i)/η₁) × 1[|Y_i − P_i| < η₂].  (4)

The pixelwise weight-assigning function f has three terms, which play different roles in addressing the three problems mentioned above. These specific considerations make our proposed loss different from [16, 30].

In the first term, C(Y_i) is the label frequency, a global term having the same value for all same-class pixels. In the second term, Ω is the Euclidean distance of pixel X_i to the boundary of the closest cell. Similar to [30], the intention of f is to assign relatively high weights to pixels adjacent to boundaries, amplifying the error penalty occurring at the margins and at pixels close to fiber boundaries, and 1 otherwise (Ω = 0 if Ω(X_i) > ε). We set η = 0.6 and ε = 10 empirically. Compared with the "hard" error-balancing strategy in [16, 31], f produces a soft error penalty, which encourages better optimization convergence and enhances fine-grained prediction. The third term aims to reduce the reliance on the ground truth when the network predicts the opposite label with high probability. This term is a switch: it forces the weight of the corresponding pixel to zero when the condition is not satisfied. In practice, we preserve this value during the network feedforward, while the loss of the corresponding pixels is excluded during backpropagation.
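Once the weight map is known, equation (3) reduces to a weighted binary cross-entropy. A minimal numpy sketch (the weight map `W` stands in for the precomputed f map mentioned in Section 4.2; function and variable names are ours):

```python
import numpy as np

def weighted_bce(P, Y, W):
    """Spatially weighted cross-entropy loss of Eq. (3).

    P: sigmoid outputs of the network, Y: binary ground truth,
    W: per-pixel weight map f(X_i) (precomputed, as in Section 4.2;
    a pixel whose switch term is 0 simply drops out of the sum)."""
    P = np.clip(P, 1e-7, 1.0 - 1e-7)  # numerical safety for log
    return -np.sum(W * (Y * np.log(P) + (1.0 - Y) * np.log(1.0 - P)))

Y = np.array([[1.0, 0.0], [1.0, 0.0]])
P = np.array([[0.9, 0.1], [0.6, 0.4]])
W = np.ones_like(Y)                          # uniform: plain cross-entropy
W_boundary = W.copy(); W_boundary[1, :] = 5  # upweight a "boundary" row
print(weighted_bce(P, Y, W), weighted_bce(P, Y, W_boundary))
```

Upweighting the second row amplifies the penalty of its (less confident) predictions, which is exactly the intended effect of f at the cell margins.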

3.3. Two-Stage Training. Training our deep network has some common difficulties:

(i) The large number of parameters in both the convolutional layers and the deconvolutional layers makes it difficult for training to achieve proper convergence [15, 41].

(ii) Successful training from scratch requires extensive labeled data, which are extremely difficult to obtain in the medical image domain.

One typical solution is to apply transfer learning [41, 42], which reduces the difficulties of tricky parameter initialization and tuning [25, 29] and of the heavy data-acquisition procedure. The core idea is to use a pretrained model as the initialization and fine-tune the CNN to adapt it to the target task with new training data. The encoder of our network partially inherits the architecture of VGG [29], which, however, is trained on a large set of natural images for image classification. Transferring its knowledge to benefit a seemingly unrelated biological image analysis problem (i.e., MIS) appears impracticable. However, recent literature coincides with our experiments: it demonstrates the advantage [41], across various biological imaging modalities, of transferring from AlexNet [25], a relatively shallow CNN for natural image classification. In our MIS case, the network architecture is much deeper, with many new parameterized layers in the decoders, so more specific treatment is needed.

It is well known that the bottom layers of a CNN can be understood as feature extractors attempting to capture low-level image features such as edges and corners [25, 37, 41]. These low-level features are common to natural images and muscle images, the most common being image gradients (i.e., boundaries). In practice, we find that training the network to detect boundaries is considerably easier than directly training it to segment muscle fibres.

We propose a two-stage training strategy to progressively train our network, so as to utilize the powerful feature extractors of VGG and overcome the problems mentioned above. In the first stage, we apply transfer learning, using the pretrained VGG to initialize the parameters of the encoder and randomly initializing the parameters of the decoders. We then train the network to detect fiber boundaries by feeding it training muscle images associated with ground-truth boundary maps (see Figure 2). This strategy helps the network converge swiftly. After the network has adapted to the new muscle images, in the second stage we fine-tune the model using the original training data D (i.e., Y is the segmentation mask) to train the network to automatically segment muscle fibres, assigning in-cell pixels to 1 and all other pixels to 0. More implementation details are given in the experimental section.

Another advantage of our proposed training strategy is that, in addition to the pixel weight-assigning function f, it further helps reduce the touching-object problem (caused by thin boundaries) [30, 34] that commonly occurs in end-to-end CNN segmentation. The strategy of [34] is to predict both a segmentation map and a boundary map and merge the two maps to resolve touching glands. In our method, the first-stage training makes the network detect cell boundaries, and the second-stage training preserves this boundary information.

4. Experimental Results

4.1. Dataset. Our expert-annotated skeletal muscle image dataset with H&E staining contains 500 annotated images, which are captured by a whole-slide digital scanner from


the cooperative institution Muscle Miner. The images exhibit large appearance variances in color, fiber size, and shape. The image size roughly ranges from 500 × 500 to 1500 × 1500 pixels. We split the dataset into 100 testing images and 400 training images.

To evaluate the proposed method on large-scale images, we measure the runtime on a whole-slide image. Note that we use small image patches for the segmentation accuracy evaluation because some comparative methods in the literature cannot handle whole-slide images. Our proposed network, however, is flexible with respect to the input size during the testing stage, because the decoder adaptively adjusts the output size to be consistent with the input size.

4.2. Implementation Details. Our implementation is based on the Caffe [40] framework, with modifications for our network design. All experiments are conducted on a standard desktop with an Intel i7 processor and a single Tesla K40c GPU. The optimization is driven by stochastic gradient descent with momentum. For the first-stage training, the network parameters are set to a learning rate of 1e−6 (divided by 10 every 1e4 iterations), momentum 0.9, and minibatch size 2. In the second stage, we use a learning rate of 1e−7 and keep the other settings the same.
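For concreteness, the two-stage schedule can be written down as data; a small Python sketch (the dict layout and the `lr_at` helper are our own illustration, not Caffe solver syntax):

```python
# Two-stage training schedule (values from the text; layout illustrative).
stages = [
    {"name": "stage1-boundary",            # train on boundary maps first
     "init": "pretrained VGG encoder + random decoders",
     "base_lr": 1e-6, "decay": 0.1, "decay_every": 10_000,
     "momentum": 0.9, "batch_size": 2, "target": "boundary map"},
    {"name": "stage2-segmentation",        # then fine-tune on masks
     "init": "stage-1 weights",
     "base_lr": 1e-7, "decay": 0.1, "decay_every": 10_000,
     "momentum": 0.9, "batch_size": 2, "target": "segmentation mask"},
]

def lr_at(stage, iteration):
    """Step policy: divide the learning rate by 10 every decay_every iters."""
    return stage["base_lr"] * stage["decay"] ** (iteration // stage["decay_every"])

print(lr_at(stages[0], 25_000))   # two decays applied, i.e. about 1e-8
```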

Dataset augmentation is a standard step in training CNNs. We apply a simple approach, randomly cropping thirty 300 × 300 image patches from each training image to generate 1.2 × 10^4 training samples in total. We choose this patch size to account for the memory capacity of the GPU; based on our observations, the segmentation accuracy is not affected by increasing the input size of the test images. To simplify the computation of the weighting function f during training, we feed a precomputed weighting map associated with each training sample (X, Y) as an additional network input.
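The cropping step above can be sketched as follows (the function name and rng handling are ours):

```python
import numpy as np

def random_crops(image, mask, n=30, size=300, rng=None):
    """Cut n aligned size x size patches from an image/mask pair,
    mirroring the augmentation described in the text."""
    rng = rng if rng is not None else np.random.default_rng()
    h, w = image.shape[:2]
    patches = []
    for _ in range(n):
        y = int(rng.integers(0, h - size + 1))
        x = int(rng.integers(0, w - size + 1))
        patches.append((image[y:y + size, x:x + size],
                        mask[y:y + size, x:x + size]))
    return patches

img, msk = np.zeros((500, 500, 3)), np.zeros((500, 500))
patches = random_crops(img, msk, rng=np.random.default_rng(0))
print(len(patches), patches[0][0].shape)   # 30 (300, 300, 3)
```

Cropping the image and mask with the same offsets keeps the pixelwise labels aligned, which is essential for dense supervision.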

4.3. Segmentation Accuracy Evaluation. For quantitative evaluation, we report Precision (|S ∩ G|/|S|), Recall (|S ∩ G|/|G|), and F1-score (2 · Prec · Rec/(Prec + Rec)), where |S| is the segmented cell region area and |G| is the corresponding ground-truth region area. For each test image, Precision and Recall are computed by averaging the results over all fibres inside. We report the three values with a fixed threshold (FT), i.e., a common threshold producing the best F1-score over the test set, and with dynamic thresholds (DT), producing the best F1-score per image.
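These metrics are straightforward to compute from binary masks; a minimal numpy sketch for a single region:

```python
import numpy as np

def prf(S, G):
    """Precision |S∩G|/|S|, Recall |S∩G|/|G|, and F1 for one fiber,
    with S and G given as boolean masks."""
    inter = np.logical_and(S, G).sum()
    prec, rec = inter / S.sum(), inter / G.sum()
    return prec, rec, 2 * prec * rec / (prec + rec)

S = np.zeros((4, 4), bool); S[:2, :] = True               # 8 predicted pixels
G = np.zeros((4, 4), bool); G[:2, :2] = True; G[2, 0] = True  # 5 true pixels
print(prf(S, G))   # precision 0.5, recall 0.8
```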

In Table 1, we compare the segmentation performance of our approach with several state-of-the-art methods. DC [43] and multiscale combinatorial grouping (MCG) [44] are recently proposed learning-based image segmentation methods. U-Net [30] is an end-to-end CNN for biomedical image segmentation. We use their public code and carefully train the models on our training data with the same amount. DNN-SNM [14] is a well-known CNN-based image segmentation method, which we regard as a generic CNN for comparison with our end-to-end CNN approach. For our

method, we directly use the network output as the segmentation result for evaluation, without any extra postprocessing.

As shown in Table 1, our method achieves much better results than the comparative methods. Although [7] has better Recall (FT), our method shows around a 10% improvement in Precision (FT). DC and MCG are not robust to image artifacts, which degrades their segmentation performance. Our method largely outperforms DNN-SNM and U-Net because (1) our network is deeper than DNN-SNM and captures richer representations; (2) the decoder utilizes the multiscale representations better than U-Net and reduces the effects of pixelwise information loss; and (3) the two-stage training takes advantage of VGG for better training effectiveness, rather than training from scratch as U-Net does. The outstanding Precision results demonstrate that our method produces more fine-grained segmentation than the others, a superiority better demonstrated by the qualitative evaluation shown in Figure 4.

4.4. Whole-Slide Segmentation Runtime. In Table 2, we compare the runtime of our method with that of the comparative methods on images of different sizes cropped from a whole-slide image (see Figure 1). The runtime of the non-deep-learning methods (first block) depends on both pixel and fiber quantities, so they cannot handle large-scale images. In contrast, the deep-learning methods (second and third blocks) depend only on the pixel quantity, so they have close-to-linear time complexity with respect to the image scale. We also implement a fast-scanning version [45] of DNN-SNM on GPU; although its speed improves considerably, it is still much slower than ours. U-Net has a more complicated layer-connection configuration, so it is slower than our method, especially at large scales. The significant speed improvement demonstrates the scalability of our proposed method to whole-slide MIS at even larger scales.

5. Conclusion

This paper presents a fast and accurate whole-slide MIS method based on a CNN trained in an end-to-end manner. Our proposed network captures hierarchical and comprehensive representations to support multiscale pixelwise predictions inside the network. A two-stage transfer learning strategy is proposed to train such a deep network. Superior accuracy and efficiency are experimentally demonstrated on a challenging skeletal muscle image dataset. In general, our approach enables multiscaling inside the network while requiring just a single arbitrarily sized input and producing fine-grained outputs. However, during the downsampling process of the encoder, owing to the limited resolution of the feature maps after downsampling, many important features, such as the edge features of cells, are still lost. In future work, we plan to design a module that complements these important features to further improve decoding efficiency and network performance.


(Columns, left to right: test image, ground truth, DNN-SNM, Liu et al. [7], ours.)

Figure 4: Segmentation results on four sample skeletal muscle images. We show some very challenging cases with large appearance variances in color, fiber shape, etc. Each segmented fiber is overlaid with a distinctively colored mask, while false positives and false negatives are highlighted by red and blue contours, respectively. Compared with the other two methods, our method obtains more fine-grained segmentation results with obviously fewer false predictions.

Table 2: The runtime (in seconds) comparison on images of different sizes, from 1x (1000 × 1000) to 9x (9000 × 9000).

Method            1x    2x    3x    4x    5x    6x     7x     8x     9x
DC [43]           20    79    —     —     —     —      —      —      —
MCG [44]           7    27    —     —     —     —      —      —      —
Liu et al. [7]    10    59    —     —     —     —      —      —      —
DNN-SNM [14]     264  1056  2376  4224  6600  9504  12936  16896  21384
DNN-SNM⋆ [45]     31   115   242   431   675   974   1325   1738   2160
U-Net [30]        12    39    90   161   246   368    482    633    792
Our approach      11    18    53    88   139   209    278    364    468

The first three methods cannot handle images of 3x and larger sizes on our machine (denoted "—" in the table). ⋆DNN-SNM⋆ is a fast-scanning implementation for prediction speed acceleration.

Table 1: The segmentation results compared with state-of-the-art methods.

                F1-score (% ± σ)          Precision (% ± σ)         Recall (% ± σ)
Method          FT          DT            FT          DT            FT          DT
DC [43]         48 ± 0.093  60 ± 0.138    41 ± 0.066  54 ± 0.164    67 ± 0.194  73 ± 0.148
MCG [44]        63 ± 0.201  71 ± 0.105    53 ± 0.136  64 ± 0.138    80 ± 0.303  82 ± 0.091
DNN-SNM [14]    76 ± 0.033  78 ± 0.080    83 ± 0.042  85 ± 0.089    70 ± 0.058  73 ± 0.087
U-Net [30]      80 ± 0.143  81 ± 0.054    87 ± 0.155  86 ± 0.076    74 ± 0.126  77 ± 0.055
Liu et al. [7]  82 ± 0.172  84 ± 0.061    81 ± 0.043  84 ± 0.071    85 ± 0.202  85 ± 0.068
Our approach    86 ± 0.184  89 ± 0.048    91 ± 0.174  93 ± 0.050    82 ± 0.176  86 ± 0.058

σ is the standard deviation.


Data Availability

The data that support the findings of this study are available from the cooperative institution Muscle Miner, but restrictions apply to their availability; the data were used under license for the current study and so are not publicly available. The data are, however, available from the authors upon reasonable request and with the permission of the cooperative institution Muscle Miner.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank all study participants. This work was supported by the National Key R&D Program of China (grant no. 2017YFB1002504) and the National Natural Science Foundation of China (nos. 81727802 and 61701404).

References

[1] C. S. Fry, J. D. Lee, J. Mula et al., "Inducible depletion of satellite cells in adult sedentary mice impairs muscle regenerative capacity without affecting sarcopenia," Nature Medicine, vol. 21, no. 1, pp. 76–80, 2015.

[2] M. W. Lee, M. G. Viola, H. Meng et al., "Differential muscle hypertrophy is associated with satellite cell numbers and Akt pathway activation following activin type IIB receptor inhibition in Mtm1 p.R69C mice," The American Journal of Pathology, vol. 184, no. 6, pp. 1831–1842, 2014.

[3] H. Viola, P. M. Janssen, R. W. Grange et al., "Tissue triage and freezing for models of skeletal muscle disease," Journal of Visualized Experiments (JoVE), no. 89, article e51586, 2014.

[4] F. Liu, A. L. Mackey, R. Srikuea, K. A. Esser, and L. Yang, "Automated image segmentation of haematoxylin and eosin stained skeletal muscle cross-sections," Journal of Microscopy, vol. 252, no. 3, pp. 275–285, 2013.

[5] H. Su, F. Xing, J. D. Lee et al., "Learning based automatic detection of myonuclei in isolated single skeletal muscle fibers using multi-focus image fusion," in Proceedings of the IEEE International Symposium on Biomedical Imaging, pp. 432–435, San Francisco, CA, USA, April 2013.

[6] L. R. Smith and E. R. Barton, "SMASH: semi-automatic muscle analysis using segmentation of histology, a MATLAB application," Skeletal Muscle, vol. 4, no. 1, pp. 1–16, 2014.

[7] F. Liu, F. Xing, Z. Zhang, M. Mcgough, and L. Yang, "Robust muscle cell quantification using structured edge detection and hierarchical segmentation," in Lecture Notes in Computer Science, pp. 324–331, MICCAI, Munich, Germany, 2015.

[8] T. Janssens, L. Antanas, S. Derde, I. Vanhorebeek, G. Van den Berghe, and F. Guiza Grandas, "CHARISMA: an integrated approach to automatic H&E-stained skeletal muscle cell segmentation using supervised learning and novel robust clump splitting," Medical Image Analysis, vol. 17, no. 8, pp. 1206–1219, 2013.

[9] A. E. Carpenter, T. R. Jones, M. R. Lamprecht et al., "CellProfiler: image analysis software for identifying and quantifying cell phenotypes," Genome Biology, vol. 7, no. 10, pp. 1–11, 2006.

[10] A. Klemencic, S. Kovacic, and F. Pernus, "Automated segmentation of muscle fiber images using active contour models," Cytometry, vol. 32, no. 4, pp. 317–326, 1998.

[11] N. Bova, V. Gal, O. Ibáñez, and O. Cordon, "Deformable models direct supervised guidance: a novel paradigm for automatic image segmentation," Neurocomputing, vol. 177, pp. 317–333, 2016.

[12] T. F. Cootes, C. J. Taylor, D. H. Cooper et al., "Active shape models: their training and application," Computer Vision and Image Understanding, vol. 61, no. 1, pp. 38–59, 1995.

[13] S. Zhang, Y. Zhan, M. Dewan, J. Huang, D. N. Metaxas, and X. S. Zhou, "Sparse shape composition: a new framework for shape prior modeling," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1025–1032, IEEE, Colorado Springs, CO, USA, June 2011.

[14] D. Ciresan, A. Giusti, L. M. Gambardella, and J. Schmidhuber, "Deep neural networks segment neuronal membranes in electron microscopy images," in Proceedings of NIPS, pp. 2843–2851, Lake Tahoe, NV, USA, December 2012.

[15] H. Noh, S. Hong, and B. Han, "Learning deconvolution network for semantic segmentation," in Proceedings of the ICCV, Las Condes, Chile, December 2015.

[16] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the CVPR, pp. 3431–3440, Boston, MA, USA, June 2015.

[17] P. O. Pinheiro, R. Collobert, and P. Dollar, "Learning to segment object candidates," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1981–1989, Montreal, Canada, December 2015.

[18] J. Mula, J. D. Lee, F. Liu, L. Yang, and C. A. Peterson, "Automated image analysis of skeletal muscle fiber cross-sectional area," Journal of Applied Physiology, vol. 114, no. 1, pp. 148–155, 2013.

[19] P.-Y. Baudin, N. Azzabou, P. G. Carlier, and N. Paragios, "Prior knowledge, random walks and human skeletal muscle segmentation," in Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI 2012), pp. 569–576, Nice, France, October 2012.

[20] P. Dollar and C. L. Zitnick, "Structured forests for fast edge detection," in Proceedings of the ICCV, pp. 1841–1848, Sydney, Australia, December 2013.

[21] F. Liu, F. Xing, and L. Yang, "Robust muscle cell segmentation using region selection with dynamic programming," in Proceedings of the ISBI, pp. 521–524, Beijing, China, April 2014.

[22] E. Van Aart, N. Sepasian, A. Jalba, and A. Vilanova, "CUDA-accelerated geodesic ray-tracing for fiber tracking," Journal of Biomedical Imaging, vol. 2011, Article ID 698908, 12 pages, 2011.

[23] G. C. Kagadis, C. Kloukinas, K. Moore et al., "Cloud computing in medical imaging," Medical Physics, vol. 40, no. 7, article 070901.

[24] L. Yang, X. Qi, F. Xing, T. Kurc, J. Saltz, and D. J. Foran, "Parallel content-based sub-image retrieval using hierarchical searching," Bioinformatics, vol. 30, no. 7, pp. 996–1002, 2014.

[25] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.

[26] H.-C. Shin, H. R. Roth, M. Gao et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.

[27] J. Xu, X. Luo, G. Wang, H. Gilmore, and A. Madabhushi, "A deep convolutional neural network for segmenting and classifying epithelial and stromal regions in histopathological images," Neurocomputing, vol. 191, pp. 214–223, 2016.

[28] X. Pan, L. Li, H. Yang et al., "Accurate segmentation of nuclei in pathological images via sparse reconstruction and deep convolutional networks," Neurocomputing, vol. 229, pp. 88–99, 2017.

[29] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," https://arxiv.org/abs/1409.1556.

[30] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: convolutional networks for biomedical image segmentation," in Lecture Notes in Computer Science, pp. 234–241, MICCAI, Munich, Germany, 2015.

[31] S. Xie and Z. Tu, "Holistically-nested edge detection," in Proceedings of the ICCV, pp. 1395–1403, Las Condes, Chile, December 2015.

[32] D. Eigen and R. Fergus, "Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture," in Proceedings of the IEEE International Conference on Computer Vision, pp. 2650–2658, Las Condes, Chile, December 2015.

[33] Q. Dou, H. Chen, L. Yu et al., "Automatic detection of cerebral microbleeds from MR images via 3D convolutional neural networks," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1182–1195, 2016.

[34] H. Chen, X. Qi, L. Yu, and P.-A. Heng, "DCAN: deep contour-aware networks for accurate gland segmentation," https://arxiv.org/abs/1604.02677.

[35] P. Moeskops, M. A. Viergever, A. M. Mendrik, L. S. de Vries, M. J. N. L. Benders, and I. Isgum, "Automatic segmentation of MR brain images with a convolutional neural network," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1252–1261, 2016.

[36] S. Hong, H. Noh, and B. Han, "Decoupled deep neural network for semi-supervised semantic segmentation," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1495–1503, Montreal, Canada, December 2015.

[37] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in Computer Vision, ECCV 2014, pp. 818–833, Springer, Zurich, Switzerland, September 2014.

[38] C. Szegedy, W. Liu, Y. Jia et al., "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9, Boston, MA, USA, June 2015.

[39] G. Bertasius, J. Shi, and L. Torresani, "DeepEdge: a multi-scale bifurcated deep network for top-down contour detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4380–4389, Boston, MA, USA, June 2015.

[40] Y. Jia, E. Shelhamer, J. Donahue et al., "Caffe: convolutional architecture for fast feature embedding," in Proceedings of the International Conference on Multimedia, pp. 675–678, Orlando, FL, USA, November 2014.

[41] N. Tajbakhsh, J. Y. Shin, S. R. Gurudu et al., "Convolutional neural networks for medical image analysis: full training or fine tuning?" IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1299–1312, 2016.

[42] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, "How transferable are features in deep neural networks?" in Proceedings of NIPS, pp. 3320–3328, Montreal, Canada, December 2014.

[43] M. Donoser and D. Schmalstieg, "Discrete-continuous gradient orientation estimation for faster image segmentation," in Proceedings of the CVPR, pp. 3158–3165, Columbus, OH, USA, June 2014.

[44] P. Arbelaez, J. Pont-Tuset, J. Barron, F. Marques, and J. Malik, "Multiscale combinatorial grouping," in Proceedings of the CVPR, pp. 328–335, Columbus, OH, USA, June 2014.

[45] A. Giusti, D. C. Ciresan, J. Masci, L. M. Gambardella, and J. Schmidhuber, "Fast image scanning with deep max-pooling convolutional neural networks," 2013, https://arxiv.org/abs/1302.1700.

10 Journal of Healthcare Engineering


expensive computation, so it is unable to be applied onto whole-slide images.

As a matter of fact, whole-slide MIS is still an unsolved problem. Although some literature discusses the use of distributed computing [22–24] to accelerate processing of large-scale histological images, distributed computing is usually difficult to deploy for practical usage in clinical practice.

Convolutional neural network (CNN) [25] is one major branch of the deep learning family. Its applications in the pathology and histological image analysis domain have become increasingly popular very recently [14, 26–28]. CNN has shown a strong ability to handle complex classification problems [29]. Recently, the end-to-end CNN training concept was introduced for semantic image segmentation, termed the fully convolutional network (FCN) [16]. Instead of performing patch-to-pixel prediction, it enables the network to perform spatially dense classification (i.e., a segmentation mask) given a test image. By taking advantage of this strength, several methods have been proposed to handle various pixelwise classification tasks [15, 30–34]. Our paper shares some similarity with previous works in how to enable a CNN to be trained in an end-to-end manner. Different from previous works, we have made several specific designs to handle fine-grained prediction, unbalanced classes, multiscale features, and transfer learning from a pretrained model for MIS. More details are discussed in the rest of the paper.

3. Methodology

In this section, we begin by introducing the proposed network architecture and then present the proposed loss function for training the network. Finally, we introduce the two-stage learning used to train the overall network.

3.1. Network Architecture. We briefly introduce the convolutional neural network (CNN) first. CNN [25] is a variant of the multilayer perceptron (MLP), mainly composed of multiple stacked computation layers from bottom to top, including convolutional, max-pooling, fully connected, and activation layers. The convolutional layer uses learnable convolutional filters to extract representations from locally connected image regions (receptive fields). The max-pooling layer reduces the dimensionality of the representations obtained from the convolutional layers while keeping the features translation invariant. The fully connected layer uses all features for

Figure 2: The illustration of the network architecture. The input image has a ground truth segmentation mask and a boundary map. Black boxes indicate the encoder module, while colored boxes indicate the decoder modules. One decoder takes the feature maps of one encoder layer as input and outputs one segmentation result. The multiscale outputs of all decoders are concatenated to generate the final segmentation result.

Figure 1: Illustration of the segmentation results of different-scale (1x: 1000 × 1000 pixels to 8x: 8000 × 8000 pixels) whole-slide muscle images (best viewed in electronic form). For each image, the right half shows the segmentation result overlaid with colored masks. The runtime is measured on a single GPU (1x: 11 s, 2x: 18 s, 4x: 88 s, 6x: 209 s, 8x: 364 s).

high-level classification. From bottom layers to top layers, a CNN gradually captures rich representations of the input image, from pixel level to content level, so as to make accurate classifications. A conventional CNN is used to perform high-level classification, i.e., assigning a category label to an input image patch. When it is applied to the pixelwise-prediction MIS task, extensive patch-to-pixel level prediction (CNN feedforward) is required, which severely limits the segmentation efficiency [14, 35].

To solve this problem, we train our network in an end-to-end manner, which enables the network to directly output the image segmentation given an input image. In this way, we no longer need to use patchwise classification to assign labels to all pixels via millions of CNN feedforwards; only a one-time feedforward is needed to obtain the final segmentation. End-to-end training of a CNN enables the network to directly output a dense image segmentation for a given input image [15, 16, 32].
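The efficiency argument can be sanity-checked with a toy convolutional layer: applying one convolution densely to the whole image yields exactly the per-pixel outputs that patchwise classification would need a separate feedforward for. The numpy sketch below is illustrative only and stands in for the paper's Caffe implementation:

```python
import numpy as np

def conv2d_valid(img, k):
    """Naive 'valid' 2-D correlation; stands in for one convolutional layer."""
    kh, kw = k.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
k = rng.standard_normal((3, 3))

# Patch-to-pixel: one feedforward per output pixel (one 3x3 patch each) -> 36 passes.
patchwise = np.array([[conv2d_valid(img[i:i + 3, j:j + 3], k)[0, 0]
                       for j in range(6)] for i in range(6)])

# End-to-end: a single dense feedforward over the whole image -> 1 pass.
dense = conv2d_valid(img, k)

assert np.allclose(patchwise, dense)  # identical per-pixel predictions
```

The same per-pixel values fall out either way; the dense pass simply reuses the overlapping computation that patchwise prediction repeats.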

However, modifying a conventional CNN to perform end-to-end training brings a major side effect: substantial pixel-level information loss at the top layers makes the pixelwise prediction inaccurate [17, 30]. This is because multiple max-pooling layers dramatically decrease the spatial size of the output, so the predicted segmentation output is very coarse. Most proposed end-to-end CNN methods use upsampling [16, 17] or deconvolution operations [15] to resize the output back to the spatial size of the input image. Nevertheless, the max-pooling layer is essential to abstract the content-level representations for high-level category classification [25, 29, 36] and to decrease the computation space of the CNN.

As a matter of fact, when we generalize end-to-end CNN to MIS, content-level information becomes less important because the label of a single pixel does not rely on knowledge of the whole muscle image. Different from semantic segmentation [16, 32], which needs content-level information to predict the category label per pixel, we are more interested in fine-grained pixelwise prediction, taking advantage of the hierarchical representations of the encoder to improve the prediction accuracy. The hierarchy characteristic can be achieved by gradually enlarging the receptive field size after each max-pooling layer. To this end, we propose a novel network architecture composed of one encoder module and multiple decoder modules. Generally, the decoders aim to use the rich and hierarchical representations obtained from the encoder for pixelwise classification.

3.1.1. Encoder Module. The encoder architecture is mostly identical to a conventional neural network. Instead of building our own layer combination, we borrow the well-known VGG net [29], with the fully connected layers truncated, to capture the rich and hierarchical representations from the pixel level at the bottom layers to the content level (i.e., category-specific knowledge) at the top layers. VGG net is composed of a series of convolutional sets, with each set having multiple convolutional layers followed by a max-pooling layer. VGG has two variants (one with 16 layers and the other with 19 layers); we use the 16-layer VGG for efficiency considerations. We choose VGG for two reasons: (1) we can transfer the pretrained VGG model to help train our very deep network, as described in the next section; (2) VGG net is very deep and extracts five different-scale feature maps containing very rich multiscale representations for use by the decoders.

3.1.2. Decoder Module. The decoder has two main purposes: (1) it utilizes the rich representations obtained from the encoder for pixelwise classification, so the output of one decoder is a dense segmentation mask, with each spatial position assigning a label to the corresponding pixel of the input image (cell or noncell in our case); (2) it refines the low-scale coarse segmentation mask to efficiently generate a fine-grained high-scale segmentation mask. The refinement procedure is achieved by multistep deconvolution and successive usage of same-scale feature maps obtained from other decoders.

We propose to connect multiple decoders prior to every max-pooling layer of the encoder; thus, the decoders can easily utilize the multiscale representations as input features, as inspired by [15, 16, 31]. Each decoder can be viewed as a small pixelwise classification network, which has an independent loss to update its parameters during training. Hence, the overall architecture is a multitask CNN.

Our design of the decoder includes convolutional layers with intermediate deconvolution layers [15]. Specifically, the deconvolution is the backward convolution operation, which performs an elementwise product with its filters. (Please note that some controversy has arisen around the naming of "deconvolution" in recent literature; as the deconvolution layer used here is different from the previous definition of deconvolution [37], we maintain the same definition as most of the literature on end-to-end CNN.) The output size of a deconvolution is enlarged by a factor of the stride. The filters of the deconvolution layers are learnable and updated by the loss of the decoder.

In this way, rather than enlarging the image with a large stride through a skip connection [16, 31, 38], our approach enlarges the feature map in multiple steps and progressively refines the feature maps at different scales via convolutional kernels, with the purpose of reducing the effects of pixel-level information loss. We use a 3 × 3 filter size, as this small size has been widely proven effective. In the end, we concatenate the multiscale predictions of all decoders, which generates a 5-dimensional feature map, and apply a 1 × 1 convolutional layer to merge the feature map and generate the final output. Compared with how recent architectures [35, 39] use multiscale information (resizing the input patch, feeding it into multiple CNNs, and merging all predictions [35, 39]), our approach enables multiscaling inside the network, requiring a single arbitrarily sized input and outputting the final segmentation result. Figure 3 specifies the parameters of each layer.
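The concat-and-merge step can be sketched as follows. The nearest-neighbour upsampling, the three-scale toy setting, and the fixed 1 × 1 weights below are illustrative stand-ins for the learned deconvolutions and filters of the real five-decoder network:

```python
import numpy as np

def upsample_nn(x, factor):
    """Nearest-neighbour upsampling; stands in for the learned deconvolutions."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def merge_decoder_outputs(preds, weights, bias=0.0):
    """Concatenate the single-channel decoder maps along the channel axis and
    merge them with a 1 x 1 convolution (a per-pixel weighted sum)."""
    stacked = np.stack(preds, axis=-1)   # H x W x M "concat" layer
    return stacked @ weights + bias      # 1 x 1 conv == per-pixel dot product

# Toy setting: a 16 x 16 input and decoder outputs at scales 1x, 1/2x, 1/4x.
rng = np.random.default_rng(1)
preds = [rng.random((16, 16)),
         upsample_nn(rng.random((8, 8)), 2),
         upsample_nn(rng.random((4, 4)), 4)]
w = np.array([0.5, 0.3, 0.2])            # stand-in for learnable 1 x 1 conv weights
final = merge_decoder_outputs(preds, w)
print(final.shape)                       # (16, 16): same spatial size as the input
```

Each decoder first restores its map to the input resolution; the 1 × 1 convolution then learns how to weight the scales per pixel.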


3.2. Spatially Weighted Loss for Backpropagation. This section describes the loss function used to train the network through backpropagation. Our proposed spatially weighted loss plays an important role in network training.

Denote the training data as $D = \{(X, Y) \in \mathcal{X} \times \mathcal{Y}\}$, where $X, Y \in \mathbb{R}^{N}$ and $N$ is the total number of pixels in the training image $X$. $Y$ is the corresponding ground truth segmentation mask, with each pixel $Y_i \in \{0, 1\}$ (i.e., 1 for pixels inside and on the boundary of a muscle cell, and 0 for the background). For an input image $X$, the main objective of our network is to obtain the pixelwise prediction $Y^{\star}$:

$$Y^{\star} = \arg\max_{\hat{Y}} P(\hat{Y} \mid X; \theta), \qquad (1)$$

where $P(\hat{Y}_i \mid X; \theta)$ is the prediction probability of pixel $X_i$, i.e., the sigmoid output of the network (denoted as $P_i$ afterwards for brevity), and $\theta$ represents all parameters of our network.
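For binary labels, the per-pixel argmax in Eq. (1) reduces to thresholding the sigmoid probability at 0.5; a minimal sketch (the logits are made-up values):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pixelwise_prediction(logits):
    """Eq. (1) for binary labels: argmax over {0, 1} per pixel is equivalent
    to thresholding the sigmoid probability P_i at 0.5."""
    p = sigmoid(logits)
    return (p > 0.5).astype(np.uint8), p

logits = np.array([[2.0, -1.5], [0.3, -0.1]])
y_star, p = pixelwise_prediction(logits)
print(y_star)  # 1 wherever P_i > 0.5
```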

Our network has multiple decoders, each having an independent loss to update its parameters (see Section 3.1.2 for details). Denote the loss function of the $i$-th decoder as $J^{de}_i$. The extra 1 × 1 convolutional layer after the concat layer is updated by another loss (see Figure 3), denoted as $J^{c}$. Learning $\theta$ is achieved by minimizing the loss function $J$, which is defined as

$$J(\theta) = \sum_{i=1}^{M} J^{de}_i(\theta) + J^{c}(\theta), \qquad (2)$$

where $M$ is the number of decoders. Note that since $J^{de}_i$ and $J^{c}$ are both spatially computed on the pixels of the dense output, both have the same formulation. The overall loss $J$ can be jointly minimized via backpropagation. (Specifically, when a layer has more than one path, such as the conv1-2 layer in Figure 3, which has two successive layers (decoder-1 and pool1), the gradients are accumulated from the multiple successive paths during backpropagation [40].)

In skeletal muscle images, several common problems affect network training: (1) the large proportion of inside-cell pixels causes an unbalanced class distribution, such that the error effects occurring at the margins are diminished during backpropagation; (2) cells are usually densely distributed, and the boundaries between touching cells are thin and often unclear or broken due to the muscle's unique anatomy; based on our observations, the network often misclassifies the pixels at

Figure 3: The detailed network configuration. The convolutional, max-pooling, deconvolutional, and concat layers are denoted by conv, pool, deconv, and concat, respectively. Each convolutional layer of the encoder is followed by a ReLU layer, which is hidden in the tables. There are 5 decoders connected inside the architecture of the encoder. The (black solid and gray dotted) arrows point to the layer where the output of the corresponding layer goes. The last column of each table shows the feature map size (height × width × dimension) of each layer. In the tables of the decoders, "†" indicates that a crop layer is connected after that layer to force the output size to be the same as the input image size (i.e., 1000 × 1000 in the table). [The figure's per-layer kernel/stride/pad/output tables for the encoder and the five decoders are not reproduced here.]

margins between fiber boundaries; (3) due to staining issues, the boundary pixels are not smooth and continuous, so it is very difficult to ensure that annotations accurately label each pixel, and it is necessary to reduce this ambiguity for network training.

We propose a loss function that ameliorates these problems by assigning different weights to the loss of each pixel. The loss function for a training image $X$, based on the cross-entropy loss, is defined as

$$J^{de}(\theta) = -\sum_{i=1}^{N} f(X_i)\bigl(\mathbb{1}[Y_i = 1]\log P_i + \mathbb{1}[Y_i = 0]\log(1 - P_i)\bigr). \qquad (3)$$

The pixelwise weights are defined by the weight-assigning function $f$, which is defined as

$$f(X_i) = C(Y_i)^{-1} \times \exp\!\left(\frac{\Omega(X_i)}{\eta_1}\right) \times \mathbb{1}\bigl[\,\lvert Y_i - P_i \rvert < \eta_2\,\bigr]. \qquad (4)$$

The pixelwise weight-assigning function f has three terms, which play different roles to address the three problems mentioned above. These specific considerations make our proposed loss different from [16, 30].

In the first term, C(Y_i) is the label frequency, a global term having the same value for all same-class pixels. In the second term, Ω is the Euclidean distance from pixel X_i to the boundary of the closest cell. Similar to [30], the intention of f is to assign relatively high weights to pixels adjacent to boundaries, amplifying the error penalty occurring at the margins and at pixels close to fiber boundaries, and 1 otherwise (Ω = 0 if Ω(X_i) > ε). We set η = 0.6 and ε = 10 empirically. Compared with the "hard" error-balancing strategy in [16, 31], f produces a soft error penalty so as to encourage better optimization convergence and enhance fine-grained prediction. The third term aims to reduce the reliability of the ground truth when the network predicts the opposite label with high probability. This term is a switch: it forces the weight of the corresponding pixel to zero when the condition is not satisfied. In practice, we preserve this value during network feedforward, while the loss of the corresponding pixels does not get involved during network backpropagation.
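A minimal numpy sketch of Eqs. (3)-(4) follows. The paper reports η = 0.6 and ε = 10 without disambiguating η₁ from η₂, so η₂ = 0.6 and η₁ = 10 below are assumptions chosen for illustration, as are all the toy inputs:

```python
import numpy as np

def spatially_weighted_loss(p, y, class_freq, dist, eta1=10.0, eta2=0.6, eps=10.0):
    """Sketch of Eqs. (3)-(4): spatially weighted binary cross-entropy.

    p          : predicted probabilities P_i (H x W)
    y          : ground truth Y_i in {0, 1}
    class_freq : per-pixel label frequency C(Y_i)
    dist       : Omega(X_i), Euclidean distance to the closest cell boundary
    eta1, eta2 : assumed split of the paper's eta = 0.6 (see lead-in)
    """
    omega = np.where(dist > eps, 0.0, dist)        # Omega = 0 beyond epsilon
    switch = (np.abs(y - p) < eta2).astype(float)  # third term: drop pixels the
                                                   # network confidently contradicts
    f = (1.0 / class_freq) * np.exp(omega / eta1) * switch
    ce = y * np.log(p) + (1 - y) * np.log(1 - p)   # per-pixel log-likelihood
    return -np.sum(f * ce)

# Toy 2 x 2 example with unbalanced classes (3 cell pixels, 1 background pixel).
p = np.array([[0.9, 0.8], [0.7, 0.2]])
y = np.array([[1.0, 1.0], [1.0, 0.0]])
class_freq = np.where(y == 1, 0.75, 0.25)          # C(Y_i)
dist = np.array([[5.0, 0.0], [12.0, 1.0]])         # 12 > eps, so clipped to 0
loss = spatially_weighted_loss(p, y, class_freq, dist)
print(round(loss, 4))
```

In a real training loop the weight map would be precomputed per image, as noted in Section 4.2.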

3.3. Two-Stage Training. Training our deep network has some common difficulties:

(i) The large number of parameters in both the convolutional and deconvolutional layers makes it difficult for training to achieve proper convergence [15, 41].

(ii) Successful training from scratch requires extensive labeled data, which are extremely difficult to obtain in the medical image domain.

One typical solution is to apply transfer learning to reduce the training difficulty [41, 42], which alleviates both the tricky parameter initialization and tuning [25, 29] and the heavy data acquisition procedure. The core idea is to use a pretrained model as the initialization and fine-tune the CNN to adapt it to the target task with new training data. The encoder of our network partially inherits the architecture of VGG [29], which is, however, trained on a large set of natural images for image classification. Transferring its knowledge to benefit a totally unrelated biological image analysis problem (i.e., MIS) seems impracticable. However, a recent study coincides with our experiments: it demonstrates the advantage [41], across various biological imaging modalities, of transferring from AlexNet [25], a relatively shallow CNN for natural image classification. In our MIS case, segmentation, the network architecture is much deeper, with many new parameterized layers in the decoders, so more specific treatment needs to be considered.

It is well known that the bottom layers of a CNN can be understood as various feature extractors attempting to capture low-level image features such as edges and corners [25, 37, 41]. Those low-level features are common between natural images and muscle images, of which the most common feature is image gradients (i.e., boundaries). In practice, we find that training the network to detect boundaries is relatively easier than directly training the network to segment muscle fibres.

We propose a two-stage training strategy to progressively train our network, so as to utilize the powerful feature extractors of VGG and overcome the problems mentioned above. In the first stage, we apply transfer learning: we use the pretrained VGG to initialize the parameters of the encoder and randomly initialize the parameters of the decoders. We then train the network to detect fiber boundaries, which is achieved by feeding the network training muscle images associated with the ground truth boundary maps (see Figure 2). This strategy facilitates swift convergence. After the network has adapted to the new muscle images, in the second stage we fine-tune the model using the original training data D (i.e., Y is the segmentation mask) to train the network to automatically segment muscle fibres, assigning in-cell pixels to 1 and other pixels to 0. More implementation details are described in the experimental section.
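The schedule itself can be summarized in a few lines. The learning rates follow the implementation details in Section 4.2, but `train`, the parameter names, and the target labels below are hypothetical placeholders, not the paper's code:

```python
def two_stage_training(params, images, boundary_maps, seg_masks, train):
    """Sketch of the two-stage schedule; `train` is a placeholder trainer."""
    # Stage 1: learn the easier boundary-detection task first.
    params = train(params, images, boundary_maps, lr=1e-6)
    # Stage 2: fine-tune on the original segmentation masks with a lower lr.
    params = train(params, images, seg_masks, lr=1e-7)
    return params

# Demonstration with a stub trainer that just records what it was asked to do.
calls = []
def fake_train(params, images, targets, lr):
    calls.append((targets, lr))
    return params

two_stage_training({"encoder": "vgg16-pretrained", "decoders": "random"},
                   "images", "boundaries", "masks", fake_train)
print(calls)  # [('boundaries', 1e-06), ('masks', 1e-07)]
```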

Another advantage of our proposed training strategy is that it further helps reduce the touching-objects problem (due to thin boundaries) [30, 34] that commonly occurs in end-to-end CNN segmentation (besides the pixel weight-assigning function f). The strategy of [34] is to predict both a segmentation map and a boundary map and merge the two maps to resolve touching glands. In our method, the first-stage training makes the network detect the cell boundaries, and the second-stage training is able to preserve this boundary information.

4. Experimental Results

4.1. Dataset. Our expert-annotated skeletal muscle image dataset with H&E staining contains 500 annotated images, which are captured by a whole-slide digital scanner from the cooperative institution Muscle Miner. The images exhibit large appearance variances in color, fiber size, and shape. The image size roughly ranges from 500 × 500 to 1500 × 1500 pixels. We split the dataset into 100 testing images and 400 training images.

In order to evaluate the proposed method on large-scale images, we evaluate the runtime on a whole-slide image. Note that we use small image patches for the segmentation accuracy evaluation because some comparative methods in the literature cannot handle whole-slide images. However, our proposed network is flexible with respect to the input size during the testing stage, because the decoder is able to adaptively adjust the output size to be consistent with the input size.

4.2. Implementation Details. Our implementation is based on the Caffe [40] framework with modifications for our network design. All experiments are conducted on a standard desktop with an Intel i7 processor and a single Tesla K40c GPU. The optimization is driven by stochastic gradient descent with momentum. For the first-stage training, the network parameters are set to learning rate 1e−6 (divided by 10 every 1e4 iterations), momentum 0.9, and minibatch size 2. In the second stage, we use a learning rate of 1e−7 and keep the others the same.

Augmenting the dataset is a normal step for training CNNs. We apply a simple approach, randomly cropping 30 300 × 300 image patches from each of the training images to generate 1.2e4 training samples in total. We choose this patch size to take the memory capacity of the GPU into account; based on our observations, the segmentation accuracy is not affected by increasing the input size of test images. To simplify the computation of the weighting function f during training, we take another precomputed weighting map associated with each training sample (X, Y) as a network input.
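The cropping step might look like the following sketch; the function name and RNG handling are our own, and only the patch count and size come from the paper:

```python
import numpy as np

def random_crops(image, mask, n=30, size=300, rng=None):
    """Randomly crop n aligned (image, mask) patches of size x size pixels."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    patches = []
    for _ in range(n):
        top = rng.integers(0, h - size + 1)     # high bound is exclusive
        left = rng.integers(0, w - size + 1)
        patches.append((image[top:top + size, left:left + size],
                        mask[top:top + size, left:left + size]))
    return patches

# 400 training images x 30 crops each = 1.2e4 training samples.
img = np.zeros((500, 500, 3), dtype=np.uint8)
msk = np.zeros((500, 500), dtype=np.uint8)
patches = random_crops(img, msk, n=30, size=300, rng=np.random.default_rng(0))
print(len(patches), patches[0][0].shape)  # 30 (300, 300, 3)
```

Cropping the image and mask with the same offsets keeps the pixelwise labels aligned, which is essential for dense supervision.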

4.3. Segmentation Accuracy Evaluation. For quantitative evaluation, we report Precision (|S ∩ G|/|S|), Recall (|S ∩ G|/|G|), and F1-score (2 · Prec · Rec/(Prec + Rec)), where |S| is the segmented cell region area and |G| is the corresponding ground truth region area. For each test image, Precision and Recall are computed by averaging the results over all fibres inside. We report the three values with a fixed threshold (FT), i.e., a common threshold producing the best F1-score over the test set, and with dynamic thresholds (DT) producing the best F1-score per image.
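These region-level metrics are straightforward to compute on binary masks. A sketch for a single fiber region (the per-image averaging over fibres and the FT/DT threshold selection are omitted):

```python
import numpy as np

def region_scores(seg, gt):
    """Precision = |S n G| / |S|, Recall = |S n G| / |G|, and their
    harmonic mean (F1), computed on binary masks of one fiber region."""
    s, g = seg.astype(bool), gt.astype(bool)
    inter = np.logical_and(s, g).sum()
    prec = float(inter / s.sum())
    rec = float(inter / g.sum())
    f1 = 2 * prec * rec / (prec + rec)
    return prec, rec, f1

# Toy masks: the segmentation covers 4 of the 6 ground-truth pixels exactly.
seg = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 0]])
gt  = np.array([[1, 1, 1], [1, 1, 1], [0, 0, 0]])
prec, rec, f1 = region_scores(seg, gt)
print(prec, rec, f1)
```

Here every segmented pixel is correct (Precision 1.0) but a third of the fiber is missed, which drags Recall and hence F1 down.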

In Table 1, we compare the segmentation performance of our approach to several state-of-the-art methods. DC [43] and multiscale combinatorial grouping (MCG) [44] are recently proposed learning-based image segmentation methods. U-Net [30] is an end-to-end CNN for biomedical image segmentation. We use their public code and carefully train the models on our training data with the same amount. DNN-SNM [14] is a well-known CNN-based image segmentation method; we regard it as a generic CNN for comparison with our end-to-end CNN approach. For our method, we directly use the network output as the segmentation result for evaluation, without any extra postprocessing efforts.

As shown in Table 1, our method achieves much better results than the comparative methods. Although [7] has better Recall (FT), our method has around 10% improvement in Precision (FT). DC and MCG are not robust to the image artifacts, which decreases their segmentation performance. Our method largely outperforms DNN-SNM and U-Net because (1) our network is deeper than DNN-SNM and captures richer representations, (2) the decoder utilizes the multiscale representations better than U-Net and is able to reduce the effects of pixelwise information loss, and (3) the two-stage training takes advantage of VGG for better training effectiveness, rather than training from scratch as U-Net does. The outstanding Precision result demonstrates that our method produces more fine-grained segmentation than the others. This superiority is better demonstrated by the qualitative evaluation shown in Figure 4.

4.4. Whole-Slide Segmentation Runtime. In Table 2, we compare the runtime of our method to the comparative methods on images of different sizes cropped from a whole-slide image (see Figure 1). The runtime of the non-deep-learning-based methods (1st block) depends on both pixel and fiber quantities, so they cannot handle large-scale images. In contrast, the deep learning-based methods (2nd and 3rd blocks) depend only on the pixel quantity, so they have close-to-linear time complexity with respect to the image scale. We also implement a fast scanning version [45] of DNN-SNM on GPU. Although its speed is largely improved, it is still much slower than ours. U-Net has a more complicated layer connection configuration, so it is slower than ours, especially in large-scale cases. The significant speed improvement demonstrates the scalability of our proposed method to whole-slide MIS applications at even larger scales.

5. Conclusion

This paper presents a fast and accurate whole-slide MIS method based on a CNN trained in an end-to-end manner. Our proposed network captures hierarchical and comprehensive representations to support multiscale pixelwise predictions inside the network. A two-stage transfer learning strategy is proposed to train such a deep network. Superior accuracy and efficiency are experimentally demonstrated on a challenging skeletal muscle image dataset. In general, our approach enables multiscaling inside the network while requiring just a single arbitrarily sized input and producing fine outputs. However, during the downsampling process of the encoder, due to the limited resolution of the feature layers after downsampling, many important features, such as edge features of cells, are still lost. To further improve decoding efficiency, in future work we can design a module that complements these important features to better improve network performance.


Figure 4: Segmentation results of four sample skeletal muscle images (columns: test image, ground truth, DNN-SNM, Liu et al. [7], ours). We show some very challenging cases with large appearance variances in color, fiber shape, etc. Each segmented fiber is overlaid with a distinctively colored mask, while false positives and false negatives are highlighted by red and blue contours, respectively. Compared with the other two methods, our method obtains more fine-grained segmentation results with obviously less false prediction.

Table 2: The runtime (in seconds) comparison on images of different sizes, from 1x (1000 × 1000) to 9x (9000 × 9000).

Method           1x    2x    3x    4x    5x    6x    7x     8x     9x
DC [43]          20    79    —     —     —     —     —      —      —
MCG [44]         7     27    —     —     —     —     —      —      —
Liu et al. [7]   10    59    —     —     —     —     —      —      —
DNN-SNM [14]     264   1056  2376  4224  6600  9504  12936  16896  21384
DNN-SNM⋆ [45]    31    115   242   431   675   974   1325   1738   2160
U-Net [30]       12    39    90    161   246   368   482    633    792
Our approach     11    18    53    88    139   209   278    364    468

The first three methods cannot handle images of 3x and larger sizes on our machine (represented by "—" in the table). ⋆DNN-SNM⋆ is a fast scanning implementation for prediction speed acceleration.

Table 1: The segmentation results compared with state-of-the-art methods.

Method           F1-score (% ± σ)           Precision (% ± σ)          Recall (% ± σ)
                 FT          DT             FT          DT             FT          DT
DC [43]          48 ± 0.093  60 ± 0.138     41 ± 0.066  54 ± 0.164     67 ± 0.194  73 ± 0.148
MCG [44]         63 ± 0.201  71 ± 0.105     53 ± 0.136  64 ± 0.138     80 ± 0.303  82 ± 0.091
DNN-SNM [14]     76 ± 0.033  78 ± 0.080     83 ± 0.042  85 ± 0.089     70 ± 0.058  73 ± 0.087
U-Net [30]       80 ± 0.143  81 ± 0.054     87 ± 0.155  86 ± 0.076     74 ± 0.126  77 ± 0.055
Liu et al. [7]   82 ± 0.172  84 ± 0.061     81 ± 0.043  84 ± 0.071     85 ± 0.202  85 ± 0.068
Our approach     86 ± 0.184  89 ± 0.048     91 ± 0.174  93 ± 0.050     82 ± 0.176  86 ± 0.058

σ is the standard deviation.


Data Availability

The data that support the findings of this study are available from the cooperative institution Muscle Miner, but restrictions apply to the availability of these data, which were used under license for the current study, and so they are not publicly available. Data are, however, available from the authors upon reasonable request and with the permission of the cooperative institution Muscle Miner.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank all study participants. This work was supported by the National Key R&D Program of China (grant no. 2017YFB1002504) and the National Natural Science Foundation of China (nos. 81727802 and 61701404).

References

[1] C. S. Fry, J. D. Lee, J. Mula et al., "Inducible depletion of satellite cells in adult sedentary mice impairs muscle regenerative capacity without affecting sarcopenia," Nature Medicine, vol. 21, no. 1, pp. 76–80, 2015.

[2] M. W. Lee, M. G. Viola, H. Meng et al., "Differential muscle hypertrophy is associated with satellite cell numbers and Akt pathway activation following activin type IIB receptor inhibition in Mtm1 p.R69C mice," The American Journal of Pathology, vol. 184, no. 6, pp. 1831–1842, 2014.

[3] H. Viola, P. M. Janssen, R. W. Grange et al., "Tissue triage and freezing for models of skeletal muscle disease," Journal of Visualized Experiments (JoVE), no. 89, article e51586, 2014.

[4] F. Liu, A. L. Mackey, R. Srikuea, K. A. Esser, and L. Yang, "Automated image segmentation of haematoxylin and eosin stained skeletal muscle cross-sections," Journal of Microscopy, vol. 252, no. 3, pp. 275–285, 2013.

[5] H. Su, F. Xing, J. D. Lee et al., "Learning based automatic detection of myonuclei in isolated single skeletal muscle fibers using multi-focus image fusion," in Proceedings of the IEEE International Symposium on Biomedical Imaging, pp. 432–435, San Francisco, CA, USA, April 2013.

[6] L. R. Smith and E. R. Barton, "SMASH: semi-automatic muscle analysis using segmentation of histology, a MATLAB application," Skeletal Muscle, vol. 4, no. 1, pp. 1–16, 2014.

[7] F. Liu, F. Xing, Z. Zhang, M. Mcgough, and L. Yang, "Robust muscle cell quantification using structured edge detection and hierarchical segmentation," in Lecture Notes in Computer Science, pp. 324–331, MICCAI, Munich, Germany, 2015.

[8] T. Janssens, L. Antanas, S. Derde, I. Vanhorebeek, G. Van den Berghe, and F. Guiza Grandas, "CHARISMA: an integrated approach to automatic H&E-stained skeletal muscle cell segmentation using supervised learning and novel robust clump splitting," Medical Image Analysis, vol. 17, no. 8, pp. 1206–1219, 2013.

[9] A. E. Carpenter, T. R. Jones, M. R. Lamprecht et al., "CellProfiler: image analysis software for identifying and quantifying cell phenotypes," Genome Biology, vol. 7, no. 10, pp. 1–11, 2006.

[10] A. Klemencic, S. Kovacic, and F. Pernus, "Automated segmentation of muscle fiber images using active contour models," Cytometry, vol. 32, no. 4, pp. 317–326, 1998.

[11] N. Bova, V. Gal, O. Ibáñez, and O. Cordon, "Deformable models direct supervised guidance: a novel paradigm for automatic image segmentation," Neurocomputing, vol. 177, pp. 317–333, 2016.

[12] T. F. Cootes, C. J. Taylor, D. H. Cooper et al., "Active shape models: their training and application," Computer Vision and Image Understanding, vol. 61, no. 1, pp. 38–59, 1995.

[13] S. Zhang, Y. Zhan, M. Dewan, J. Huang, D. N. Metaxas, and X. S. Zhou, "Sparse shape composition: a new framework for shape prior modeling," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1025–1032, IEEE, Colorado Springs, CO, USA, June 2011.

[14] D. Ciresan, A. Giusti, L. M. Gambardella, and J. Schmidhuber, "Deep neural networks segment neuronal membranes in electron microscopy images," in Proceedings of the NIPS, pp. 2843–2851, Lake Tahoe, NV, USA, December 2012.

[15] H. Noh, S. Hong, and B. Han, "Learning deconvolution network for semantic segmentation," in Proceedings of the ICCV, Las Condes, Chile, December 2015.

[16] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the CVPR, pp. 3431–3440, Boston, MA, USA, June 2015.

[17] P. O. Pinheiro, R. Collobert, and P. Dollar, "Learning to segment object candidates," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1981–1989, Montreal, Canada, December 2015.

[18] J. Mula, J. D. Lee, F. Liu, L. Yang, and C. A. Peterson, "Automated image analysis of skeletal muscle fiber cross-sectional area," Journal of Applied Physiology, vol. 114, no. 1, pp. 148–155, 2013.

[19] P.-Y. Baudin, N. Azzabou, P. G. Carlier, and N. Paragios, "Prior knowledge, random walks and human skeletal muscle segmentation," in Proceedings of the Medical Image Computing and Computer-Assisted Intervention (MICCAI 2012), pp. 569–576, Nice, France, October 2012.

[20] P. Dollar and C. L. Zitnick, "Structured forests for fast edge detection," in Proceedings of the ICCV, pp. 1841–1848, Sydney, Australia, December 2013.

[21] F. Liu, F. Xing, and L. Yang, "Robust muscle cell segmentation using region selection with dynamic programming," in Proceedings of the ISBI, pp. 521–524, Beijing, China, April 2014.

[22] E. Van Aart, N. Sepasian, A. Jalba, and A. Vilanova, "CUDA-accelerated geodesic ray-tracing for fiber tracking," International Journal of Biomedical Imaging, vol. 2011, Article ID 698908, 12 pages, 2011.

[23] G. C. Kagadis, C. Kloukinas, K. Moore et al., "Cloud computing in medical imaging," Medical Physics, vol. 40, no. 7, article 070901, 2013.

[24] L. Yang, X. Qi, F. Xing, T. Kurc, J. Saltz, and D. J. Foran, "Parallel content-based sub-image retrieval using hierarchical searching," Bioinformatics, vol. 30, no. 7, pp. 996–1002, 2014.

[25] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.

[26] H.-C. Shin, H. R. Roth, M. Gao et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.

[27] J. Xu, X. Luo, G. Wang, H. Gilmore, and A. Madabhushi, "A deep convolutional neural network for segmenting and classifying epithelial and stromal regions in histopathological images," Neurocomputing, vol. 191, pp. 214–223, 2016.

[28] X. Pan, L. Li, H. Yang et al., "Accurate segmentation of nuclei in pathological images via sparse reconstruction and deep convolutional networks," Neurocomputing, vol. 229, pp. 88–99, 2017.

[29] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, https://arxiv.org/abs/1409.1556.

[30] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: convolutional networks for biomedical image segmentation," in Lecture Notes in Computer Science, pp. 234–241, MICCAI, Munich, Germany, 2015.

[31] S. Xie and Z. Tu, "Holistically-nested edge detection," in Proceedings of the ICCV, pp. 1395–1403, Las Condes, Chile, December 2015.

[32] D. Eigen and R. Fergus, "Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture," in Proceedings of the IEEE International Conference on Computer Vision, pp. 2650–2658, Las Condes, Chile, December 2015.

[33] Q. Dou, H. Chen, L. Yu et al., "Automatic detection of cerebral microbleeds from MR images via 3D convolutional neural networks," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1182–1195, 2016.

[34] H. Chen, X. Qi, L. Yu, and P.-A. Heng, "DCAN: deep contour-aware networks for accurate gland segmentation," 2016, https://arxiv.org/abs/1604.02677.

[35] P. Moeskops, M. A. Viergever, A. M. Mendrik, L. S. de Vries, M. J. N. L. Benders, and I. Isgum, "Automatic segmentation of MR brain images with a convolutional neural network," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1252–1261, 2016.

[36] S. Hong, H. Noh, and B. Han, "Decoupled deep neural network for semi-supervised semantic segmentation," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1495–1503, Montreal, Canada, December 2015.

[37] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in Proceedings of the Computer Vision–ECCV 2014, pp. 818–833, Springer, Zurich, Switzerland, September 2014.

[38] C. Szegedy, W. Liu, Y. Jia et al., "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9, Boston, MA, USA, June 2015.

[39] G. Bertasius, J. Shi, and L. Torresani, "DeepEdge: a multi-scale bifurcated deep network for top-down contour detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4380–4389, Boston, MA, USA, June 2015.

[40] Y. Jia, E. Shelhamer, J. Donahue et al., "Caffe: convolutional architecture for fast feature embedding," in Proceedings of the International Conference on Multimedia, pp. 675–678, Orlando, FL, USA, November 2014.

[41] N. Tajbakhsh, J. Y. Shin, S. R. Gurudu et al., "Convolutional neural networks for medical image analysis: full training or fine tuning?," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1299–1312, 2016.

[42] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, "How transferable are features in deep neural networks?," in Proceedings of the NIPS, pp. 3320–3328, Montreal, Canada, December 2014.

[43] M. Donoser and D. Schmalstieg, "Discrete-continuous gradient orientation estimation for faster image segmentation," in Proceedings of the CVPR, pp. 3158–3165, Columbus, OH, USA, June 2014.

[44] P. Arbelaez, J. Pont-Tuset, J. Barron, F. Marques, and J. Malik, "Multiscale combinatorial grouping," in Proceedings of the CVPR, pp. 328–335, Columbus, OH, USA, June 2014.

[45] A. Giusti, D. C. Ciresan, J. Masci, L. M. Gambardella, and J. Schmidhuber, "Fast image scanning with deep max-pooling convolutional neural networks," 2013, https://arxiv.org/abs/1302.1700.


high-level classification. From bottom layers to top layers, a CNN gradually captures rich representations of the input image, from the pixel level up to the content level, so as to make accurate classifications. A conventional CNN performs high-level classification, i.e., assigning a category label to an input image patch. When it is applied to the pixelwise prediction MIS task, extensive patch-to-pixel level prediction (one CNN feedforward per pixel) is required, which severely limits the segmentation efficiency [14, 35].

To solve this problem, we train our network in an end-to-end manner [15, 16, 32], which enables the network to directly output a dense image segmentation for a given input image. In this way, we no longer need patchwise classification to assign labels to all pixels via millions of CNN feedforward passes; a single feedforward pass suffices to obtain the final segmentation.
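To make the contrast concrete, a back-of-the-envelope count (illustrative, not from the paper) of the feedforward passes needed to label every pixel of a 1000 × 1000 image:

```python
# Patchwise classification runs one CNN feedforward per pixel;
# an end-to-end network runs a single dense feedforward per image.
h = w = 1000
patchwise_passes = h * w      # one pass per pixel
end_to_end_passes = 1         # one pass for the whole image
print(patchwise_passes, end_to_end_passes)  # 1000000 1
```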

However, modifying a conventional CNN to perform end-to-end training brings a major side effect: substantial pixel-level information loss at the top layers makes the pixelwise prediction inaccurate [17, 30]. This is because multiple max-pooling layers dramatically decrease the spatial size of the output, so the predicted segmentation is very coarse. Most proposed end-to-end CNN methods use upsampling [16, 17] or deconvolution operations [15] to resize the output back to the spatial size of the input image. Nevertheless, the max-pooling layer is essential both to abstract the content-level representations needed for high-level category classification [25, 29, 36] and to reduce the computation cost of the CNN.

As a matter of fact, when we generalize end-to-end CNN to MIS, content-level information becomes less important because the label of a single pixel does not rely on knowledge of the whole muscle image. Different from semantic segmentation [16, 32], which needs content-level information to predict the category label of each pixel, we are more interested in fine-grained pixelwise prediction that takes advantage of the hierarchical representations of the encoder to improve prediction accuracy. The hierarchy characteristic is achieved by gradually enlarging the receptive field size after each max-pooling layer. To this end, we propose a novel network architecture composed of one encoder module and multiple decoder modules. Generally, the decoders use the rich and hierarchical representations obtained from the encoder for pixelwise classification.

3.1.1. Encoder Module. The encoder architecture is mostly identical to a conventional neural network. Instead of building our own layer combinations, we borrow the well-known VGG net [29], with the fully connected layers truncated, to capture rich and hierarchical representations from the pixel level at the bottom layers to the content level (i.e., category-specific knowledge) at the top layers. VGG net is composed of a series of convolutional sets, each having multiple convolutional layers followed by a max-pooling layer. VGG has two variants (one with 16 layers and the other with 19 layers); we use the 16-layer VGG for efficiency. We choose VGG for two reasons: (1) we can transfer the pretrained VGG model to help train our very deep network, as described in the next section; (2) VGG net is very deep and extracts five different-scale feature maps containing very rich multiscale representations for the decoders to use.
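As a quick check of the five scales listed in Figure 3, the per-decoder feature-map sizes for a 1000 × 1000 input can be derived by repeatedly applying the 2 × 2, stride-2 max-pooling (assuming Caffe-style pooling, which rounds the spatial size up):

```python
import math

# Channel widths of the five VGG-16 convolutional sets.
channels = [64, 128, 256, 512, 512]
size = 1000                          # input height/width
scales = []
for c in channels:
    scales.append((size, size, c))   # feature map tapped by one decoder
    size = math.ceil(size / 2)       # 2x2 max-pooling, stride 2 (ceil mode)
print(scales)
# [(1000, 1000, 64), (500, 500, 128), (250, 250, 256), (125, 125, 512), (63, 63, 512)]
```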

3.1.2. Decoder Module. The decoder has two main purposes. (1) It utilizes the rich representations obtained from the encoder for pixelwise classification; the output of one decoder is thus a dense segmentation mask in which each spatial position assigns a label to the corresponding pixel of the input image (cell or noncell in our case). (2) It refines the low-scale, coarse segmentation mask to efficiently generate a fine-grained, high-scale segmentation mask. The refinement procedure is achieved by multistep deconvolution and successive use of same-scale feature maps obtained from other decoders.

We propose to connect multiple decoders prior to every max-pooling layer of the encoder; thus, the decoders can easily utilize the multiscale representations as input features, as inspired by [15, 16, 31]. Each decoder can be viewed as a small pixelwise classification network with an independent loss to update its parameters during training. Hence, the overall architecture is a multitask CNN.

Our design of the decoder includes convolutional layers with intermediate deconvolution layers [15]. Specifically, deconvolution is the backward convolution operation, which performs an elementwise product with its filters. (Please note that some controversy surrounds the naming of "deconvolution" in recent literature, as the deconvolution layer used here differs from the earlier definition of deconvolution [37]; we follow the definition used in most of the literature on end-to-end CNN.) The output size of a deconvolution layer is enlarged by a factor of the stride. The filters of the deconvolution layers are learnable and are updated by the loss of the decoder.
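The output enlargement follows the standard transposed-convolution size relation, out = (in − 1) · stride − 2 · pad + kernel; a small sketch (the 4 × 4, stride-2 case reproduces the 250 → 502 step in the decoder tables of Figure 3, after which a crop layer restores 500 × 500):

```python
def deconv_out(size, kernel, stride, pad=0):
    """Spatial output size of a deconvolution (transposed convolution) layer."""
    return (size - 1) * stride - 2 * pad + kernel

# A 4x4 deconvolution with stride 2 roughly doubles a 250x250 map.
print(deconv_out(250, kernel=4, stride=2))  # 502, later cropped to 500
```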

In this way, rather than enlarging the image in one large stride through a skip connection [16, 31, 38], our approach enlarges the feature map in multiple steps and progressively refines the feature maps at different scales via convolutional kernels, with the purpose of reducing the effects of pixel-level information loss. We use a 3 × 3 filter size, as this small size has been widely proven effective. In the end, we concatenate the multiscale predictions of all decoders, which generates a 5-dimensional feature map, and apply a 1 × 1 convolutional layer to merge the feature map and generate the final output. Compared with how recent architectures [35, 39] use multiscale information (resizing the input patch, feeding it into multiple CNNs, and merging all predictions), our approach enables multiscaling inside the network, requiring only a single, arbitrarily sized input and outputting the final segmentation result. Figure 3 specifies the parameters of each layer.
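The fusion step above can be sketched as follows: the five single-channel decoder predictions are stacked into a 5-dimensional map, and the 1 × 1 convolution reduces to a per-pixel weighted sum across that axis (the weights here are illustrative stand-ins for the learned kernel):

```python
import numpy as np

def fuse(pred_maps, weights, bias=0.0):
    """1x1 convolution over the concatenated maps: per-pixel weighted sum."""
    stacked = np.stack(pred_maps, axis=-1)   # H x W x 5 "concat" feature map
    return stacked @ weights + bias          # H x W fused prediction

H = W = 4                                    # toy size instead of 1000 x 1000
preds = [np.full((H, W), v) for v in (0.2, 0.4, 0.6, 0.8, 1.0)]
w = np.full(5, 0.2)                          # stand-in for the learned 1x1 kernel
out = fuse(preds, w)
print(out.shape)  # (4, 4)
```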


3.2. Spatially Weighted Loss for Backpropagation. This section describes the loss function used to train the network through backpropagation. Our proposed spatially weighted loss plays an important role in network training.

Denote the training data as $D = \{(X, Y) \in \mathcal{X} \times \mathcal{Y}\}$, where $X, Y \in \mathbb{R}^N$ and $N$ is the total number of pixels in the training image $X$. $Y$ is the corresponding ground truth segmentation mask, with each pixel $Y_i \in \{0, 1\}$ (i.e., $Y_i = 1$ for pixels inside and on the boundary of a muscle cell, and $Y_i = 0$ for background). For an input image $X$, the main objective of our network is to obtain the pixelwise prediction $Y^\star$:

$$Y^\star = \arg\max_{\hat{Y}} P(\hat{Y} \mid X, \theta), \qquad (1)$$

where $P(\hat{Y}_i \mid X, \theta)$ is the prediction probability of pixel $X_i$, i.e., the sigmoid output of the network (denoted as $P_i$ hereafter for brevity), and $\theta$ represents all parameters of our network.
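Since each pixel's probability comes from an independent sigmoid, the maximization in Eq. (1) decomposes per pixel into a simple 0.5 threshold, e.g.:

```python
import numpy as np

P = np.array([[0.91, 0.30],
              [0.48, 0.72]])        # sigmoid outputs P_i for a toy 2x2 image
Y_star = (P > 0.5).astype(int)      # per-pixel maximizer of Eq. (1)
print(Y_star.tolist())  # [[1, 0], [0, 1]]
```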

Our network has multiple decoders, each with an independent loss to update its parameters (see Section 3.1.2 for details). Denote the loss function of the $i$-th decoder as $J^{de}_i$. The extra $1 \times 1$ convolutional layer after the concat layer is updated by another loss (see Figure 3), denoted as $J^c$. Learning $\theta$ is achieved by minimizing the loss function $J$, defined as

$$J(\theta) = \sum_{i=1}^{M} J^{de}_i(\theta) + J^c(\theta), \qquad (2)$$

where $M$ is the number of decoders. Note that since $J^{de}_i$ and $J^c$ are both spatially computed on pixels of the dense output, they have the same formulation. The overall loss $J$ can be jointly minimized via backpropagation (specifically, when a layer has more than one path, such as the conv1-2 layer in Figure 3, which feeds two successive layers (decoder-1 and pool1), the gradients are accumulated over the multiple successive paths during backpropagation [40]).

In skeletal muscle images, several common problems affect the network training: (1) the large proportion of inside-cell pixels causes class imbalance, such that the error effects occurring at the margins are diminished during backpropagation; (2) cells are usually densely distributed, and the boundaries between touching cells are thin and often unclear or broken due to the muscle's unique anatomy; based on our observations, the network often misclassifies the pixels at the

Figure 3: The detailed network configuration. The convolutional, max-pooling, deconvolutional, and concat layers are denoted by conv, pool, deconv, and concat, respectively. Each convolutional layer of the encoder is followed by a ReLU layer, which is hidden in the tables. There are 5 decoders connected inside the architecture of the encoder. The (black solid and gray dotted) arrows point to the layer where the output of the corresponding layer goes. The last column of each table shows the feature map size (height × width × dimension) of each layer. In the tables of the decoders, "†" indicates that a crop layer is connected afterwards to force the output size to be the same as the input image size (i.e., 1000 × 1000 in the tables).

Journal of Healthcare Engineering 5

margins between fiber boundaries; (3) due to staining issues, the boundary pixels are not smooth and continuous, so it is very difficult to ensure that the annotations accurately label each pixel, and it is necessary to reduce this ambiguity for network training.

We propose a loss function that ameliorates these problems by assigning different weights to the loss of each pixel. The loss function for a training image $X$, based on the cross-entropy loss, is defined as

$$J^{de}(\theta) = -\sum_{i=1}^{N} f(X_i)\left(\mathbb{1}[Y_i = 1]\log P_i + \mathbb{1}[Y_i = 0]\log\left(1 - P_i\right)\right). \qquad (3)$$

The pixelwise weights are defined by the weight-assigning function $f$, which is defined as

$$f(X_i) = C(Y_i)^{-1} \times \exp\left(\frac{\Omega(X_i)}{\eta_1}\right) \times \mathbb{1}\left[\left|Y_i - P_i\right| < \eta_2\right]. \qquad (4)$$

The pixelwise weight-assigning function $f$ has three terms, which play different roles in addressing the three problems mentioned above. These specific considerations make our proposed loss different from [16, 30].

In the first term, $C(Y_i)$ is the label frequency, a global term that takes the same value for same-class pixels. In the second term, $\Omega$ is the Euclidean distance from pixel $X_i$ to the boundary of the closest cell. Similar to [30], the intention of $f$ is to assign relatively high weights to pixels adjacent to boundaries, so as to amplify the error penalty occurring at the margins and at pixels close to fiber boundaries, and 1 otherwise ($\Omega = 0$ if $\Omega(X_i) > \epsilon$). We set $\eta = 0.6$ and $\epsilon = 10$ empirically. Compared with the "hard" error-balancing strategy in [16, 31], $f$ produces a soft error penalty, so as to encourage better optimization convergence and enhance fine-grained prediction. The third term aims to reduce the reliability of the ground truth when the network predicts the opposite label with high probability. This term is a switch: it forces the weight of the corresponding pixel to zero when the condition is not satisfied. In practice, we preserve this value during network feedforward, while the loss of the corresponding pixels is excluded during network backpropagation.
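A minimal numpy sketch of the weighted cross-entropy of Eq. (3), assuming the weight map f has already been precomputed (as is done in Section 4.2); the toy values below are ours:

```python
import numpy as np

def weighted_cross_entropy(p, y, f):
    """Eq. (3): pixelwise cross-entropy, scaled by the per-pixel weight map f."""
    p = np.clip(p, 1e-7, 1 - 1e-7)                  # numerical safety
    ce = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    return float(np.sum(f * ce))

# Toy 2x2 example: confident, correct pixels incur little loss, and a higher
# weight (e.g. near a fiber boundary) amplifies the penalty of a mistake.
p = np.array([[0.9, 0.2], [0.8, 0.6]])   # network probabilities P_i
y = np.array([[1,   0  ], [1,   0  ]])   # ground truth Y_i
f = np.array([[1.0, 1.0], [1.0, 4.0]])   # boundary-adjacent pixel weighted 4x
loss = weighted_cross_entropy(p, y, f)
```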

3.3. Two-Stage Training. Training our deep network has some common difficulties:

(i) The large number of parameters in both the convolutional and deconvolutional layers makes it difficult for training to achieve proper convergence [15, 41].

(ii) Successful training from scratch requires extensive labeled data, which are extremely difficult to obtain in the medical image domain.

One typical solution is to apply transfer learning to reduce the training difficulty [41, 42], which eases both the tricky parameter initialization and tuning [25, 29] and the heavy data acquisition procedure. The core idea is to use a pretrained model as the initialization and fine-tune the CNN with new training data to adapt it to the target task. The encoder of our network partially inherits the architecture of VGG [29], which is, however, trained on a large set of natural images for image classification. Transferring its knowledge to benefit a totally unrelated biological image analysis problem (i.e., MIS) seems impracticable. However, a recent study coincides with our experiments: it demonstrates the advantage of such transfer [41] on various biological imaging modalities, transferring from AlexNet [25], a relatively shallow CNN for natural image classification. In our MIS segmentation case, the network architecture is much deeper, with many new parameterized layers in the decoders, so more specific treatment needs to be considered.

It is well known that the bottom layers of a CNN can be understood as various feature extractors attempting to capture low-level image features such as edges and corners [25, 37, 41]. Those low-level features are in fact common to natural images and muscle images, the most common being image gradients (i.e., boundaries). In practice, we find that training the network to detect boundaries is relatively easier than directly training it to segment muscle fibres.

We propose a two-stage training strategy to progressively train our network, so as to utilize the powerful feature extractors of VGG and overcome the abovementioned problems. In the first stage, we apply transfer learning, using the pretrained VGG to initialize the parameters of the encoder and randomly initializing the parameters of the decoders. We then train the network to detect fiber boundaries, by feeding the network training muscle images associated with ground truth boundary maps (see Figure 2). This strategy facilitates swift convergence. After the network has adapted to the new muscle images, in the second stage we fine-tune the model using the original training data D (i.e., Y is the segmentation mask) to train the network to automatically segment muscle fibres, assigning in-cell pixels to 1 and all other pixels to 0. More implementation details are described in the experimental section.

Another advantage of our proposed training strategy is that, besides the pixel weight-assigning function f, it further helps reduce the touching-object problem (due to thin boundaries) [30, 34] that commonly occurs in end-to-end CNN segmentation. The strategy in [34] is to predict both a segmentation map and a boundary map and merge the two maps to resolve touching glands. In our method, the first-stage training makes the network detect the cell boundaries, and the second-stage training is able to preserve this boundary information.

4. Experimental Results

4.1. Dataset. Our expert-annotated skeletal muscle image dataset with H&E staining contains 500 annotated images, which were captured by a whole-slide digital scanner from

6 Journal of Healthcare Engineering

the cooperative institution Muscle Miner. The images exhibit large appearance variances in color, fiber size, and shape. The image size roughly ranges from 500 × 500 to 1500 × 1500 pixels. We split the dataset into 100 testing images and 400 training images.

To evaluate how the proposed method handles large-scale images, we measure the runtime on a whole-slide image. Note that we use small image patches for the segmentation accuracy evaluation because some comparative methods from the literature cannot handle whole-slide images. Our proposed network, however, is flexible with respect to the input size during the testing stage, because the decoder adaptively adjusts the output size to be consistent with the input size.

4.2. Implementation Details. Our implementation is based on the Caffe [40] framework, with modifications for our network design. All experiments are conducted on a standard desktop with an Intel i7 processor and a single Tesla K40c GPU. The optimization is driven by stochastic gradient descent with momentum. For the first-stage training, the network parameters are set to learning rate = 1e−6 (divided by 10 every 1e4 iterations), momentum = 0.9, and minibatch size = 2. In the second stage, we use a learning rate of 1e−7 and keep the other settings the same.
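The reported solver settings amount to a step learning-rate policy; a hypothetical sketch (the function and parameter names are ours, not actual Caffe solver fields):

```python
def lr_at(iteration, base_lr, gamma=0.1, step=10_000):
    """Step policy: multiply the learning rate by gamma every `step` iterations."""
    return base_lr * gamma ** (iteration // step)

# Stage 1 starts from 1e-6 and is divided by 10 every 1e4 iterations;
# stage 2 keeps the same policy but starts from 1e-7.
print(lr_at(0, 1e-6), lr_at(10_000, 1e-6))  # approx. 1e-6 and 1e-7
```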

Augmenting the dataset is a normal step in training CNNs. We apply a simple approach, randomly cropping 30 image patches of 300 × 300 pixels from each training image to generate 1.2e4 training samples in total. We choose this patch size to account for the memory capacity of the GPU; based on our observations, the segmentation accuracy is not affected by increasing the input size of test images. To simplify the computation of the weighting function f during training, we feed a precomputed weighting map associated with each training sample (X, Y) as an additional network input.
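The cropping scheme can be sketched as follows (a toy illustration; the sampling details are our assumption):

```python
import random

def random_crops(img_h, img_w, crop=300, n=30, rng=random):
    """Top-left corners of n random crop windows inside an img_h x img_w image."""
    return [(rng.randrange(img_h - crop + 1), rng.randrange(img_w - crop + 1))
            for _ in range(n)]

corners = random_crops(1000, 1000)        # 30 crops from one training image
assert len(corners) == 30
# 400 training images x 30 crops each = 12,000 training patches in total
```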

4.3. Segmentation Accuracy Evaluation. For quantitative evaluation, we report Precision ($|S \cap G|/|S|$), Recall ($|S \cap G|/|G|$), and F1-score ($2 \cdot \mathrm{Prec} \cdot \mathrm{Rec}/(\mathrm{Prec} + \mathrm{Rec})$), where $|S|$ is the segmented cell region area and $|G|$ is the corresponding ground truth region area. For each test image, Precision and Recall are computed by averaging the results over all fibres inside. We report the three values with a fixed threshold (FT), i.e., a common threshold producing the best F1-score over the test set, and with dynamic thresholds (DT) producing the best F1-score per image.
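The three metrics can be sketched with pixel-index sets (toy values are ours):

```python
def region_metrics(seg, gt):
    """Precision |S∩G|/|S|, Recall |S∩G|/|G|, and their harmonic mean (F1)."""
    inter = len(seg & gt)
    prec, rec = inter / len(seg), inter / len(gt)
    return prec, rec, 2 * prec * rec / (prec + rec)

# Toy example: pixel index sets of one segmented fiber and its ground truth.
S = set(range(0, 80))       # 80 segmented pixels
G = set(range(20, 120))     # 100 ground-truth pixels; overlap = 60
p, r, f1 = region_metrics(S, G)
print(p, r)  # 0.75 0.6
```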

In Table 1, we compare the segmentation performance of our approach to several state-of-the-art methods. DC [43] and multiscale combinatorial grouping (MCG) [44] are recently proposed learning-based image segmentation methods; U-Net [30] is an end-to-end CNN for biomedical image segmentation. We use their public code and carefully train the models on our training data with the same amount of data. DNN-SNM [14] is a well-known CNN-based image segmentation method, which we regard as a generic CNN for comparison with our end-to-end approach. For our method, we directly use the network output as the segmentation result for evaluation, without any extra postprocessing.

As shown in Table 1, our method achieves much better results than the comparative methods. Although [7] has a better Recall (FT), our method has around a 10% improvement in Precision (FT). DC and MCG are not robust to image artifacts, which decreases their segmentation performance. Our method largely outperforms DNN-SNM and U-Net because (1) our network is deeper than DNN-SNM and captures richer representations; (2) the decoders utilize the multiscale representations better than U-Net and reduce the effects of pixelwise information loss; and (3) two-stage training takes advantage of VGG for better training effectiveness, rather than training from scratch as U-Net does. The outstanding Precision result demonstrates that our method produces more fine-grained segmentation than the others. This superiority is further demonstrated by the qualitative evaluation shown in Figure 4.

4.4. Whole-Slide Segmentation Runtime. In Table 2, we compare the runtime of our method with that of the comparative methods on images of different sizes cropped from a whole-slide image (see Figure 1). The runtime of the non-deep-learning methods (1st block) depends on both the pixel and fiber quantities, so they cannot handle large-scale images. In contrast, the deep-learning-based methods (2nd and 3rd blocks) depend only on the pixel quantity, so they have close-to-linear time complexity with respect to the image scale. We also implement a fast scanning version [45] of DNN-SNM on GPU; although its speed improves greatly, it is still much slower than ours. U-Net has a more complicated layer connection configuration, so it is slower than ours, especially in large-scale cases. The significant speed improvement demonstrates the scalability of our proposed method to whole-slide MIS applications at even larger scales.

5. Conclusion

This paper presents a fast and accurate whole-slide MIS method based on a CNN trained in an end-to-end manner. Our proposed network captures hierarchical and comprehensive representations to support multiscale pixelwise predictions inside the network. A two-stage transfer learning strategy is proposed to train such a deep network. Superior accuracy and efficiency are experimentally demonstrated on a challenging skeletal muscle image dataset. In general, our approach enables multiscaling inside the network while requiring only a single, arbitrarily sized input and outputting fine-grained results. However, during the downsampling process of the encoder, owing to the limited resolution of the downsampled feature layers, many important features, such as cell edge features, are still lost. To further improve decoding efficiency, in future work we can design a module that complements these important features to better improve network performance.

Journal of Healthcare Engineering 7

Figure 4: Segmentation results of four sample skeletal muscle images (columns: test image, ground truth, DNN-SNM, Liu et al. [7], ours). We show some very challenging cases with large appearance variances in color, fiber shape, etc. Each segmented fiber is overlaid with a distinctive colored mask, while false positives and false negatives are highlighted by red and blue contours, respectively. Compared with the other two methods, our method obtains more fine-grained segmentation results with obviously less false prediction.

Table 2: The runtime (in seconds) comparison on images of different sizes, from 1x (1000 × 1000) to 9x (9000 × 9000).

Method            1x    2x    3x    4x    5x    6x    7x     8x     9x
DC [43]           20    79    —     —     —     —     —      —      —
MCG [44]          7     27    —     —     —     —     —      —      —
Liu et al. [7]    10    59    —     —     —     —     —      —      —
DNN-SNM [14]      264   1056  2376  4224  6600  9504  12936  16896  21384
DNN-SNM⋆ [45]     31    115   242   431   675   974   1325   1738   2160
U-Net [30]        12    39    90    161   246   368   482    633    792
Our approach      11    18    53    88    139   209   278    364    468

The first three methods cannot handle images with 3x and larger sizes on our machine (represented with "—" in the table). ⋆DNN-SNM is a fast scanning implementation for prediction speed acceleration.
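The close-to-linear scaling can be checked directly from Table 2: dividing each runtime of our approach by the image's pixel count shows that the per-megapixel cost stays roughly constant as the image grows. A small illustrative check (using only the numbers printed in Table 2):

```python
# Runtimes (s) of "Our approach" from Table 2, at scales 1x..9x.
runtimes = [11, 18, 53, 88, 139, 209, 278, 364, 468]

# A kx image is k*1000 x k*1000 pixels, i.e. k^2 megapixels.
megapixels = [k * k for k in range(1, 10)]

# Seconds per megapixel; near-constant values indicate linear scaling.
per_mp = [t / mp for t, mp in zip(runtimes, megapixels)]
```

After the 1x case (where fixed overhead dominates), the cost hovers around 5 to 6 seconds per megapixel, which is what "close-to-linear time complexity" means in practice.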

Table 1: The segmentation results compared with state-of-the-art methods (FT = fixed threshold, DT = dynamic threshold; values are % ± σ).

Method           F1 (FT)     F1 (DT)     Prec. (FT)  Prec. (DT)  Recall (FT)  Recall (DT)
DC [43]          48 ± 0.093  60 ± 0.138  41 ± 0.066  54 ± 0.164  67 ± 0.194   73 ± 0.148
MCG [44]         63 ± 0.201  71 ± 0.105  53 ± 0.136  64 ± 0.138  80 ± 0.303   82 ± 0.091
DNN-SNM [14]     76 ± 0.033  78 ± 0.080  83 ± 0.042  85 ± 0.089  70 ± 0.058   73 ± 0.087
U-Net [30]       80 ± 0.143  81 ± 0.054  87 ± 0.155  86 ± 0.076  74 ± 0.126   77 ± 0.055
Liu et al. [7]   82 ± 0.172  84 ± 0.061  81 ± 0.043  84 ± 0.071  85 ± 0.202   85 ± 0.068
Our approach     86 ± 0.184  89 ± 0.048  91 ± 0.174  93 ± 0.050  82 ± 0.176   86 ± 0.058

σ is the standard deviation.


Data Availability

The data that support the findings of this study are available from the cooperative institution Muscle Miner, but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of the cooperative institution Muscle Miner.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank all study participants. This work was supported by the National Key R&D Program of China (grant no. 2017YFB1002504) and the National Natural Science Foundation of China (nos. 81727802 and 61701404).

References

[1] C. S. Fry, J. D. Lee, J. Mula et al., "Inducible depletion of satellite cells in adult sedentary mice impairs muscle regenerative capacity without affecting sarcopenia," Nature Medicine, vol. 21, no. 1, pp. 76–80, 2015.

[2] M. W. Lee, M. G. Viola, H. Meng et al., "Differential muscle hypertrophy is associated with satellite cell numbers and Akt pathway activation following activin type IIB receptor inhibition in Mtm1 p.R69C mice," The American Journal of Pathology, vol. 184, no. 6, pp. 1831–1842, 2014.

[3] H. Viola, P. M. Janssen, R. W. Grange et al., "Tissue triage and freezing for models of skeletal muscle disease," Journal of Visualized Experiments (JoVE), vol. e51586, no. 89, 2014.

[4] F. Liu, A. L. Mackey, R. Srikuea, K. A. Esser, and L. Yang, "Automated image segmentation of haematoxylin and eosin stained skeletal muscle cross-sections," Journal of Microscopy, vol. 252, no. 3, pp. 275–285, 2013.

[5] H. Su, F. Xing, J. D. Lee et al., "Learning based automatic detection of myonuclei in isolated single skeletal muscle fibers using multi-focus image fusion," in Proceedings of the IEEE International Symposium on Biomedical Imaging, pp. 432–435, San Francisco, CA, USA, April 2013.

[6] L. R. Smith and E. R. Barton, "SMASH – semi-automatic muscle analysis using segmentation of histology: a MATLAB application," Skeletal Muscle, vol. 4, no. 1, pp. 1–16, 2014.

[7] F. Liu, F. Xing, Z. Zhang, M. McGough, and L. Yang, "Robust muscle cell quantification using structured edge detection and hierarchical segmentation," in Lecture Notes in Computer Science, pp. 324–331, MICCAI, Shenzhen, China, 2015.

[8] T. Janssens, L. Antanas, S. Derde, I. Vanhorebeek, G. Van den Berghe, and F. Guiza Grandas, "Charisma: an integrated approach to automatic H&E-stained skeletal muscle cell segmentation using supervised learning and novel robust clump splitting," Medical Image Analysis, vol. 17, no. 8, pp. 1206–1219, 2013.

[9] A. E. Carpenter, T. R. Jones, M. R. Lamprecht et al., "CellProfiler: image analysis software for identifying and quantifying cell phenotypes," Genome Biology, vol. 7, no. 10, pp. 1–11, 2006.

[10] A. Klemencic, S. Kovacic, and F. Pernus, "Automated segmentation of muscle fiber images using active contour models," Cytometry, vol. 32, no. 4, pp. 317–326, 1998.

[11] N. Bova, V. Gal, O. Ibáñez, and O. Cordon, "Deformable models direct supervised guidance: a novel paradigm for automatic image segmentation," Neurocomputing, vol. 177, pp. 317–333, 2016.

[12] T. F. Cootes, C. J. Taylor, D. H. Cooper et al., "Active shape models – their training and application," Computer Vision and Image Understanding, vol. 61, no. 1, pp. 38–59, 1995.

[13] S. Zhang, Y. Zhan, M. Dewan, J. Huang, D. N. Metaxas, and X. S. Zhou, "Sparse shape composition: a new framework for shape prior modeling," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1025–1032, IEEE, Colorado Springs, CO, USA, June 2011.

[14] D. Ciresan, A. Giusti, L. M. Gambardella, and J. Schmidhuber, "Deep neural networks segment neuronal membranes in electron microscopy images," in Proceedings of the NIPS, pp. 2843–2851, Lake Tahoe, NV, USA, December 2012.

[15] H. Noh, S. Hong, and B. Han, "Learning deconvolution network for semantic segmentation," in Proceedings of the ICCV, Las Condes, Chile, December 2015.

[16] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the CVPR, pp. 3431–3440, Santiago, Chile, December 2015.

[17] P. O. Pinheiro, R. Collobert, and P. Dollar, "Learning to segment object candidates," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1981–1989, Montreal, Canada, December 2015.

[18] J. Mula, J. D. Lee, F. Liu, L. Yang, and C. A. Peterson, "Automated image analysis of skeletal muscle fiber cross-sectional area," Journal of Applied Physiology, vol. 114, no. 1, pp. 148–155, 2013.

[19] P.-Y. Baudin, N. Azzabou, P. G. Carlier, and N. Paragios, "Prior knowledge, random walks and human skeletal muscle segmentation," in Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2012, pp. 569–576, Nice, France, October 2012.

[20] P. Dollar and C. L. Zitnick, "Structured forests for fast edge detection," in Proceedings of the ICCV, pp. 1841–1848, Sydney, Australia, December 2013.

[21] F. Liu, F. Xing, and L. Yang, "Robust muscle cell segmentation using region selection with dynamic programming," in Proceedings of the ISBI, pp. 521–524, Beijing, China, April 2014.

[22] E. Van Aart, N. Sepasian, A. Jalba, and A. Vilanova, "CUDA-accelerated geodesic ray-tracing for fiber tracking," Journal of Biomedical Imaging, vol. 2011, Article ID 698908, 12 pages, 2011.

[23] G. C. Kagadis, C. Kloukinas, K. Moore et al., "Cloud computing in medical imaging," Medical Physics, vol. 40, no. 7, article 070901.

[24] L. Yang, X. Qi, F. Xing, T. Kurc, J. Saltz, and D. J. Foran, "Parallel content-based sub-image retrieval using hierarchical searching," Bioinformatics, vol. 30, no. 7, pp. 996–1002, 2014.

[25] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.

[26] H.-C. Shin, H. R. Roth, M. Gao et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.

[27] J. Xu, X. Luo, G. Wang, H. Gilmore, and A. Madabhushi, "A deep convolutional neural network for segmenting and classifying epithelial and stromal regions in histopathological images," Neurocomputing, vol. 191, pp. 214–223, 2016.

[28] X. Pan, L. Li, H. Yang et al., "Accurate segmentation of nuclei in pathological images via sparse reconstruction and deep convolutional networks," Neurocomputing, vol. 229, pp. 88–99, 2017.

[29] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," https://arxiv.org/abs/1409.1556.

[30] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: convolutional networks for biomedical image segmentation," in Lecture Notes in Computer Science, pp. 234–241, MICCAI, Shenzhen, China, 2015.

[31] S. Xie and Z. Tu, "Holistically-nested edge detection," in Proceedings of the ICCV, pp. 1395–1403, Las Condes, Chile, December 2015.

[32] D. Eigen and R. Fergus, "Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture," in Proceedings of the IEEE International Conference on Computer Vision, pp. 2650–2658, Las Condes, Chile, December 2015.

[33] Q. Dou, H. Chen, L. Yu et al., "Automatic detection of cerebral microbleeds from MR images via 3D convolutional neural networks," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1182–1195, 2016.

[34] H. Chen, X. Qi, L. Yu, and P.-A. Heng, "DCAN: deep contour-aware networks for accurate gland segmentation," https://arxiv.org/abs/1604.02677.

[35] P. Moeskops, M. A. Viergever, A. M. Mendrik, L. S. de Vries, M. J. N. L. Benders, and I. Isgum, "Automatic segmentation of MR brain images with a convolutional neural network," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1252–1261, 2016.

[36] S. Hong, H. Noh, and B. Han, "Decoupled deep neural network for semi-supervised semantic segmentation," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1495–1503, Montreal, Canada, December 2015.

[37] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in Proceedings of the Computer Vision–ECCV 2014, pp. 818–833, Springer, Zurich, Switzerland, September 2014.

[38] C. Szegedy, W. Liu, Y. Jia et al., "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9, Boston, MA, USA, June 2015.

[39] G. Bertasius, J. Shi, and L. Torresani, "DeepEdge: a multi-scale bifurcated deep network for top-down contour detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4380–4389, Boston, MA, USA, June 2015.

[40] Y. Jia, E. Shelhamer, J. Donahue et al., "Caffe: convolutional architecture for fast feature embedding," in Proceedings of the International Conference on Multimedia, pp. 675–678, Orlando, FL, USA, November 2014.

[41] N. Tajbakhsh, J. Y. Shin, S. R. Gurudu et al., "Convolutional neural networks for medical image analysis: full training or fine tuning?," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1299–1312, 2016.

[42] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, "How transferable are features in deep neural networks?," in Proceedings of the NIPS, pp. 3320–3328, Montreal, Canada, December 2014.

[43] M. Donoser and D. Schmalstieg, "Discrete-continuous gradient orientation estimation for faster image segmentation," in Proceedings of the CVPR, pp. 3158–3165, Columbus, OH, USA, June 2014.

[44] P. Arbelaez, J. Pont-Tuset, J. Barron, F. Marques, and J. Malik, "Multiscale combinatorial grouping," in Proceedings of the CVPR, pp. 328–335, Columbus, OH, USA, June 2014.

[45] A. Giusti, D. C. Ciresan, J. Masci, L. M. Gambardella, and J. Schmidhuber, "Fast image scanning with deep max-pooling convolutional neural networks," 2013, https://arxiv.org/abs/1302.1700.



3.2. Spatially Weighted Loss for Backpropagation. This section describes the loss function training the network through backpropagation. Our proposed spatially weighted loss plays an important role in network training.

Denote the training data as D = {(X, Y) ∈ 𝒳 × 𝒴}, where X, Y ∈ R^N and N is the total number of pixels in the training image X. Y is the corresponding ground-truth segmentation mask, with each pixel Y_i ∈ {0, 1} (i.e., 1 for pixels inside and on the boundary of the muscle cell, and 0 for background otherwise). For an input image X, the main objective of our network is to obtain the pixelwise prediction Y*:

$$Y^{\star} = \arg\max_{\hat{Y}} P(\hat{Y} \mid X, \theta), \qquad (1)$$

where P(Ŷ_i | X, θ) is the prediction probability of pixel X_i, i.e., the sigmoid function output of the network (denoted as P_i afterwards for brevity), and θ represents all parameters of our network.
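Because each pixel is scored independently through a sigmoid, the argmax in equation (1) reduces to thresholding the per-pixel probability at 0.5. A minimal NumPy sketch of this decision rule (illustrative only; the paper's implementation is in Caffe, and the function name here is ours):

```python
import numpy as np

def pixelwise_prediction(logits):
    """Map per-pixel logits to the binary mask Y* of equation (1).

    With a sigmoid output, argmax over {0, 1} is equivalent to
    thresholding P(Y_i = 1 | X, theta) at 0.5.
    """
    probs = 1.0 / (1.0 + np.exp(-logits))  # P_i, the sigmoid output
    return (probs > 0.5).astype(np.uint8)  # Y*_i = argmax_y P(y | X)

# Example: a 2x2 "image" of logits
logits = np.array([[2.0, -1.5], [0.1, -0.1]])
mask = pixelwise_prediction(logits)  # [[1, 0], [1, 0]]
```

This is why no extra postprocessing is needed at test time: the dense probability map itself is the segmentation.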

Our network has multiple decoders, each having an independent loss to update its parameters (see Section 3.1.2 for details). Denote the loss function of the i-th decoder as J^de_i. The extra 1 × 1 convolutional layer after the concat layer is updated by another loss (see Figure 3), denoted as J^c. Learning θ is achieved by minimizing the loss function J, which is defined as

$$J(\theta) = \sum_{i=1}^{M} J^{de}_{i}(\theta) + J^{c}(\theta), \qquad (2)$$

where M is the number of decoders. Note that since J^de_i and J^c are both spatially computed on the pixels of the dense output, both have the same formulation. The overall loss J can be jointly minimized via backpropagation (specifically, when a layer has more than one path, such as the conv1-2 layer in Figure 3, which has two successive layers (decoder-1 and pool1), the gradients are accumulated from the multiple successive paths during backpropagation [40]).
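Equation (2) is a plain sum: every decoder output and the fused 1 × 1-conv output are scored against the same ground-truth mask, and the per-branch losses are added. A hedged NumPy sketch (function names are ours, not from the paper's Caffe code; the per-pixel weights of equation (3) are omitted here for brevity):

```python
import numpy as np

def bce(p, y, eps=1e-7):
    """Mean binary cross-entropy between a dense probability map p
    and a binary ground-truth mask y."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def joint_loss(decoder_probs, fused_probs, y):
    """J(theta) = sum_i J_i^de(theta) + J^c(theta), equation (2):
    each decoder's dense output and the fused output contribute an
    independent loss term against the same mask."""
    return sum(bce(p, y) for p in decoder_probs) + bce(fused_probs, y)

# Toy 1x2 "image": two decoders plus the fused output
y = np.array([[1.0, 0.0]])
p = np.array([[0.9, 0.1]])
loss = joint_loss([p, p], p, y)  # equals 3 * bce(p, y)
```

In an autodiff framework, a single backward pass through this sum reproduces the gradient accumulation across branches described above.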

In skeletal muscle images, there are several common problems which affect the network training: (1) the large proportion of pixels inside cells causes an unbalanced class distribution, such that the error effects occurring at the margins are diminished during backpropagation; (2) usually cells are densely distributed, and the boundaries between touching cells are thin and often unclear or broken due to muscle's unique anatomy; based on our observations, the network often misclassifies the pixels at

Figure 3: The detailed network configuration. The convolutional, max-pooling, deconvolutional, and concat layers are denoted by conv, pool, deconv, and concat, respectively. Each convolutional layer of the encoder is followed by a ReLU layer, which is hidden in the tables. There are 5 decoders connected inside the architecture of the encoder. The (black solid and gray dotted) arrows point to the layer where the output of the corresponding layer goes. The last column of each table shows the feature map size (height × width × dimension) of each layer. In the tables of decoders, "†" indicates that a crop layer is connected after that to force the output size to be the same as the input image size (i.e., 1000 × 1000 in the table).

[The per-layer configuration tables of the encoder and decoders 1-5 (kernel, stride, pad, and output size) are figure content and are not recoverable from this extraction.]


margins between fiber boundaries; and (3) due to staining issues, the boundary pixels are not smooth and continuous, so it is very difficult to ensure that annotations accurately label each pixel. It is necessary to reduce this ambiguity for network training.

We propose a loss function to ameliorate these problems by assigning different weights to the loss of each pixel. The loss function for a training datum X, which is based on the cross-entropy loss, is defined as

$$J^{de}(\theta) = -\sum_{i=1}^{N} f(X_i)\left(\mathbb{1}[Y_i = 1]\log P_i + \mathbb{1}[Y_i = 0]\log\left(1 - P_i\right)\right). \qquad (3)$$

The pixelwise weights are defined by the weight-assigning function f, which is defined as

$$f(X_i) = C(Y_i)^{-1} \times \exp\left(\frac{\Omega(X_i)}{\eta_1}\right) \times \mathbb{1}\left[\left|Y_i - P_i\right| < \eta_2\right]. \qquad (4)$$

The pixelwise weight-assigning function f has three terms, which play different roles to address the three above-mentioned problems. These specific considerations make our proposed loss different from [16, 30].

In the first term, C(Y_i) is the label frequency, a global term having the same value for all same-class pixels. In the second term, Ω is the Euclidean distance of pixel X_i to the boundary of the closest cell. Similar to [30], the intention of f is to assign relatively high weights to pixels adjacent to boundaries, amplifying the error penalty occurring at the margins and at pixels close to fiber boundaries, and 1 otherwise (Ω = 0 if Ω(X_i) > ε). We set η = 0.6 and ε = 10 empirically. Compared with the "hard" error-balancing strategy in [16, 31], f produces a soft error penalty so as to encourage better optimization convergence and enhance fine-grained prediction. The third term aims to reduce the reliability of the ground truth when the network predicts the opposite label with high probability. This term is a switch: it forces the weight of the corresponding pixel to zero when the condition is not satisfied. In practice, we preserve this value during network feedforward, while the loss of the corresponding pixels does not get involved during network backpropagation.
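Equation (4) maps directly onto array operations. The sketch below follows the formula as printed; note that the exact values of η1 and η2 are not fully recoverable from the text (only "η = 0.6, ε = 10" is stated), so the defaults here are placeholders, and the function/argument names are ours:

```python
import numpy as np

def pixel_weights(y, p, dist, freq, eta1=0.6, eta2=0.6, eps=10.0):
    """Per-pixel weights f(X_i) of equation (4).

    y    : ground-truth labels in {0, 1}
    p    : network probabilities P_i
    dist : Omega(X_i), distance of each pixel to the nearest fiber boundary
    freq : dict mapping class label -> global label frequency C(y)
    """
    # Term 1: inverse label frequency, balancing cell vs. background pixels.
    balance = np.where(y == 1, 1.0 / freq[1], 1.0 / freq[0])
    # Term 2: boundary emphasis; Omega is zeroed beyond radius eps, so
    # far-from-boundary pixels fall back to weight exp(0) = 1.
    omega = np.where(dist > eps, 0.0, dist)
    emphasis = np.exp(omega / eta1)
    # Term 3: switch that zeroes the weight where the network confidently
    # contradicts a (possibly noisy) annotation.
    keep = (np.abs(y - p) < eta2).astype(float)
    return balance * emphasis * keep

y = np.array([1.0, 0.0])
p = np.array([0.9, 0.95])        # second pixel: confident disagreement
dist = np.array([20.0, 1.0])     # first pixel: far from any boundary
w = pixel_weights(y, p, dist, freq={0: 0.5, 1: 0.5}, eta2=0.5)
```

With balanced classes, the far-from-boundary, correctly-predicted pixel gets weight 2 × exp(0) = 2, while the confidently contradicted pixel is switched off entirely.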

3.3. Two-Stage Training. Training our deep network has some common difficulties:

(i) The large number of parameters in both the convolutional layers and the deconvolutional layers makes it difficult for training to achieve proper convergence [15, 41].

(ii) Successful training from scratch requires extensive labeled data, which are extremely difficult to obtain in the medical image domain.

One typical solution is to apply transfer learning to reduce the training difficulty [41, 42], which avoids the tricky parameter initialization and tuning [25, 29] and the heavy data acquisition procedure. The core idea is to use a pretrained model as the initialization and fine-tune the CNN to adapt it to the target task with new training data. The encoder of our network partially inherits the architecture of VGG [29], which is, however, trained on a large set of natural images for image classification. Transferring its knowledge to benefit a totally unrelated biological image analysis problem (i.e., MIS) seems impracticable. However, a recent study coincides with our experiments: it demonstrates this advantage [41] on various biological imaging modalities, transferring from AlexNet [25], a relatively shallow CNN for natural image classification. In our MIS case, segmentation, the network architecture is much deeper, with many new parameterized layers in the decoders, so more specific treatment needs to be considered.

It is well known that the bottom layers of a CNN can be understood as various feature extractors attempting to capture low-level image features such as edges and corners [25, 37, 41]. Actually, those low-level features are common between natural images and muscle images, of which the most common feature is image gradients (i.e., boundaries). In practice, we find that training the network to detect boundaries is relatively easier than directly training the network to segment muscle fibres.

We propose a two-stage training strategy to progressively train our network, so as to utilize the powerful feature extractors of VGG and overcome the above-mentioned problems. In the first stage, we apply transfer learning to use pretrained VGG to initialize the parameters of the encoder, and we randomly initialize the parameters of the decoders. We then train the network to detect fiber boundaries, which is achieved by feeding the network with training muscle images associated with the ground-truth boundary map (see Figure 2). This strategy facilitates swift network convergence. After the network becomes adapted to the new muscle images, in the second stage we fine-tune the model using the original training data D (i.e., Y is the segmentation mask) to train the network to automatically segment muscle fibres, assigning in-cell pixels to 1 and other pixels to 0. More implementation details are described in the experimental section.
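The schedule can be summarized in a few lines of Python. Everything here is schematic: `net` and `train_step` stand in for the actual Caffe model and SGD-with-momentum update, and the two learning rates are the ones reported in Section 4.2:

```python
def two_stage_training(net, images, boundary_maps, masks, train_step):
    """Schematic version of the two-stage schedule of Section 3.3."""
    # Stage 1: encoder starts from pretrained VGG weights, decoders from
    # random init; train on the easier boundary-detection task first.
    for x, b in zip(images, boundary_maps):
        train_step(net, x, b, lr=1e-6)
    # Stage 2: fine-tune on the original segmentation masks with a
    # smaller learning rate, preserving the learned boundary cues.
    for x, y in zip(images, masks):
        train_step(net, x, y, lr=1e-7)

# Record calls with a stub train_step to show the ordering of the stages
calls = []
two_stage_training(
    net=None,
    images=["img1", "img2"],
    boundary_maps=["b1", "b2"],
    masks=["m1", "m2"],
    train_step=lambda net, x, t, lr: calls.append((x, t, lr)),
)
```

The point of the ordering is that the network first converges on boundary cues that transfer well from natural images, then shifts target to full segmentation without discarding them.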

Another advantage of our proposed training strategy is that it further helps reduce the touching-objects problem (due to thin boundaries) [30, 34] that commonly occurs in end-to-end CNN segmentation (besides the pixel weight-assigning function f). The strategy of [34] is to predict both a segmentation map and a boundary map and merge the two maps to resolve touching glands. In our method, the first-stage training makes the network detect the cell boundaries, and the second-stage training is able to preserve this boundary information.

4. Experimental Results

4.1. Dataset. Our expert-annotated skeletal muscle image dataset with H&E staining contains 500 annotated images, which are captured by a whole-slide digital scanner from the cooperative institution Muscle Miner. The images exhibit large appearance variances in color, fiber size, and shape. The image size roughly ranges from 500 × 500 to 1500 × 1500 pixels. We split the dataset into 100 testing images and 400 training images.

In order to evaluate the proposed method's ability to handle large-scale images, we evaluate the runtime on a whole-slide image. Note that we use small image patches for the segmentation accuracy evaluation because some comparative methods in the literature cannot handle whole-slide images. However, our proposed network is flexible with respect to the input size during the testing stage, because the decoder is able to adaptively adjust the output size to be consistent with the input size.

4.2. Implementation Details. Our implementation is based on the Caffe [40] framework, with modifications for our network design. All experiments are conducted on a standard desktop with an Intel i7 processor and a single Tesla K40c GPU. The optimization is driven by stochastic gradient descent with momentum. For the first-stage training, the network parameters are set to learning rate 1e-6 (divided by 10 every 1e4 iterations), momentum 0.9, and minibatch size 2. In the second stage, we use a learning rate of 1e-7 and keep the other settings the same.

Augmenting the dataset is a normal step for training a CNN. We apply a simple approach by randomly cropping 30 image patches of 300 × 300 pixels from each of the training images, generating 1.2e4 training data in total. We choose this patch size to take the memory capacity of the GPU into account. Based on our observations, the segmentation accuracy is not affected by increasing the input size of test images. To simplify the computation of the weighting function f during training, we take a precomputed weighting map associated with each training datum (X, Y) as an additional network input.
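The cropping step is straightforward; a hedged NumPy sketch (the paper states only the patch count and size, so the uniform sampling below is an assumption, and the function name is ours):

```python
import numpy as np

def random_crops(image, mask, n=30, size=300, rng=None):
    """Yield n aligned (image, mask) patches of size x size pixels,
    mirroring the augmentation described in Section 4.2."""
    rng = rng if rng is not None else np.random.default_rng()
    h, w = image.shape[:2]
    for _ in range(n):
        top = int(rng.integers(0, h - size + 1))
        left = int(rng.integers(0, w - size + 1))
        # Crop image and mask with the same window so labels stay aligned.
        yield (image[top:top + size, left:left + size],
               mask[top:top + size, left:left + size])

image = np.zeros((500, 600))
mask = np.zeros((500, 600))
patches = list(random_crops(image, mask, n=5, rng=np.random.default_rng(0)))
```

With 400 training images and 30 crops each, this yields the 1.2e4 patches mentioned above.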

4.3. Segmentation Accuracy Evaluation. For quantitative evaluation, we report Precision (|S∩G|/|S|), Recall (|S∩G|/|G|), and F1-score (2 · Prec · Rec/(Prec + Rec)), where |S| is the segmented cell region area and |G| is the corresponding ground-truth region area. For each test image, Precision and Recall are computed by averaging the results over all fibres inside. We report the three values with a fixed threshold (FT), i.e., a common threshold that produces the best F1-score over the test set, and with dynamic thresholds (DT) that produce the best F1-score per image.

In Table 1, we compare the segmentation performance of our approach to several state-of-the-art methods. DC [43] and multiscale combinatorial grouping (MCG) [44] are recently proposed learning-based image segmentation methods. U-Net [30] is an end-to-end CNN for biomedical image segmentation. We use their public code and carefully train the models over our training data with the same amount. DNN-SNM [14] is a well-known CNN-based image segmentation method; we regard it as a generic CNN for comparison with our end-to-end CNN approach. For our method, we directly use the network output as the segmentation result for evaluation, without any extra postprocessing efforts.

As shown in Table 1 our method achieves much betterresults than comparative methods Although [7] has betterRecall (FT) our method has around 10 improvement onPrecision (FT) DC and MCG are not robust to the imageartifacts which decreases their segmentation performanceOur method largely outperforms DNN-SNM and U-Netbecause (1) our network is deeper than DNN-SNM tocapture richer representations (2) the decoder betterutilizes the multiscale representations than U-Net and isable to reduce the effects of the pixelwise information lossand (3) two-stage training takes advantage of VGG forbetter training effectiveness rather than training fromscratch as U-Net does -e outstanding Precision resultdemonstrates that our method produces more fine-grainedsegmentation than others -is superiority is betterdemonstrated by the qualitative evaluation as shown inFigure 4

44 Whole-Slide Segmentation Runtime In Table 2 wecompare the runtime of our method to the comparativemethods on images of different sizes cropped from a whole-slide image (see Figure 1) -e runtime of non-deeplearning-based methods (1st block) depends on both pixeland fiber quantities so they cannot handle large-scale im-ages In contrast deep learning-based methods (2nd and 3rdblocks) depend on the pixel quantity so they have close-to-linear time complexity with respect to the image scale Wealso implement a fast scanning version [45] of DNN-SNMon GPU Although the speed has a large improvement it isstill much slower than ours U-Net has more complicatedlayer connection configuration so it is slower than oursespecially in large-scale cases -e significant speed im-provement demonstrates the scalability of our proposedmethod to the application of whole-slide MIS with evenlarger scales

5 Conclusion

-is paper presents a fast and accurate whole-slide MISmethod based on CNN trained in the end-to-end mannerOur proposed network captures hierarchical and compre-hensive representations to support multiscale pixelwisepredictions inside the network A two-stage transfer learningstrategy is proposed to train such a deep network Superioraccuracy and efficiency are experimentally demonstrated ona challenging skeletal muscle image dataset In general ourapproach enables multiscaling inside the network while justrequiring a single arbitrarily sized input and outputting fineoutputs However during the downsampling process of theencoding due to the limitation of resolution of feature layerafter downsampling many important features such as edgefeatures of cells are still lost To further improve decodingefficiency in the future work we can design a module thatcomplements important features to better improve networkperformance

Journal of Healthcare Engineering 7

Test image Ground truth DNN-SNM Liu et al [7] Ours

Figure 4 Segmentation results of four sample skeletal muscle imagesWe show some very challenging cases with large appearance variancesin color fiber shape etc Each segmented fiber is overlaid with a distinctive colored mask while false positives and false negatives arehighlighted by red and blue contours respectively Compared with the other two methods our method obtains more fine-grainedsegmentation results with obviously less false prediction

Table 2 -e runtime (in seconds) comparison on images of different sizes from 1x 1000 times 1000 to 9x 9000 times 9000

Method 1x 2x 3x 4x 5x 6x 7x 8x 9xDC [43] 20 79 mdash mdash mdash mdash mdash mdash mdashMCG [44] 7 27 mdash mdash mdash mdash mdash mdash mdashLiu et al [7] 10 59 mdash mdash mdash mdash mdash mdash mdashDNN-SNM [14] 264 1056 2376 4224 6600 9504 12936 16896 21384DNN-SNM⋆[45] 31 115 242 431 675 974 1325 1738 2160U-net [30] 12 39 90 161 246 368 482 633 792Our approach 11 18 53 88 139 209 278 364 468-e first three methods cannot handle images with 3x and larger sizes on our machine (represented with ldquomdashrdquo in the table) ⋆DNN-SNM is a fast scanningimplementation for prediction speed acceleration

Table 1 -e segmentation results compared with state-of-the-art methods

MethodF1-score ( plusmn σ) Precision ( plusmn σ) Recall ( plusmn σ)

FT DT FT DT FT DT

DC [43] 48plusmn 0093 60plusmn 0138 41plusmn 0066 54plusmn 0164 67plusmn 0194 73plusmn 0148MCG [44] 63plusmn 0201 71plusmn 0105 53plusmn 0136 64plusmn 0138 80plusmn 0303 82plusmn 0091DNN-SNM [14] 76plusmn 0033 78plusmn 0080 83plusmn 0042 85plusmn 0089 70plusmn 0058 73plusmn 0087U-Net [30] 80plusmn 0143 81plusmn 0054 87plusmn 0155 86plusmn 0076 74plusmn 0126 77plusmn 0055Liu et al [7] 82plusmn 0172 84plusmn 0061 81plusmn 0043 84plusmn 0071 85plusmn 0202 85plusmn 0068Our approach 86plusmn 0184 89plusmn 0048 91plusmn 0174 93plusmn 0050 82plusmn 0176 86plusmn 0058σ is the standard deviation


Data Availability

The data that support the findings of this study are available from the cooperative institution Muscle Miner, but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of the cooperative institution Muscle Miner.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank all study participants. This work was supported by the National Key R&D Program of China (grant no. 2017YFB1002504) and the National Natural Science Foundation of China (nos. 81727802 and 61701404).

References

[1] C. S. Fry, J. D. Lee, J. Mula et al., "Inducible depletion of satellite cells in adult sedentary mice impairs muscle regenerative capacity without affecting sarcopenia," Nature Medicine, vol. 21, no. 1, pp. 76–80, 2015.

[2] M. W. Lee, M. G. Viola, H. Meng et al., "Differential muscle hypertrophy is associated with satellite cell numbers and Akt pathway activation following activin type IIB receptor inhibition in Mtm1 p.R69C mice," The American Journal of Pathology, vol. 184, no. 6, pp. 1831–1842, 2014.

[3] H. Viola, P. M. Janssen, R. W. Grange et al., "Tissue triage and freezing for models of skeletal muscle disease," Journal of Visualized Experiments (JoVE), no. 89, article e51586, 2014.

[4] F. Liu, A. L. Mackey, R. Srikuea, K. A. Esser, and L. Yang, "Automated image segmentation of haematoxylin and eosin stained skeletal muscle cross-sections," Journal of Microscopy, vol. 252, no. 3, pp. 275–285, 2013.

[5] H. Su, F. Xing, J. D. Lee et al., "Learning based automatic detection of myonuclei in isolated single skeletal muscle fibers using multi-focus image fusion," in Proceedings of the IEEE International Symposium on Biomedical Imaging, pp. 432–435, San Francisco, CA, USA, April 2013.

[6] L. R. Smith and E. R. Barton, "SMASH: semi-automatic muscle analysis using segmentation of histology, a MATLAB application," Skeletal Muscle, vol. 4, no. 1, pp. 1–16, 2014.

[7] F. Liu, F. Xing, Z. Zhang, M. McGough, and L. Yang, "Robust muscle cell quantification using structured edge detection and hierarchical segmentation," in Proceedings of MICCAI 2015, Lecture Notes in Computer Science, pp. 324–331, Munich, Germany, October 2015.

[8] T. Janssens, L. Antanas, S. Derde, I. Vanhorebeek, G. Van den Berghe, and F. Guiza Grandas, "CHARISMA: an integrated approach to automatic H&E-stained skeletal muscle cell segmentation using supervised learning and novel robust clump splitting," Medical Image Analysis, vol. 17, no. 8, pp. 1206–1219, 2013.

[9] A. E. Carpenter, T. R. Jones, M. R. Lamprecht et al., "CellProfiler: image analysis software for identifying and quantifying cell phenotypes," Genome Biology, vol. 7, no. 10, pp. 1–11, 2006.

[10] A. Klemencic, S. Kovacic, and F. Pernus, "Automated segmentation of muscle fiber images using active contour models," Cytometry, vol. 32, no. 4, pp. 317–326, 1998.

[11] N. Bova, V. Gal, O. Ibáñez, and O. Cordon, "Deformable models direct supervised guidance: a novel paradigm for automatic image segmentation," Neurocomputing, vol. 177, pp. 317–333, 2016.

[12] T. F. Cootes, C. J. Taylor, D. H. Cooper et al., "Active shape models: their training and application," Computer Vision and Image Understanding, vol. 61, no. 1, pp. 38–59, 1995.

[13] S. Zhang, Y. Zhan, M. Dewan, J. Huang, D. N. Metaxas, and X. S. Zhou, "Sparse shape composition: a new framework for shape prior modeling," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1025–1032, Colorado Springs, CO, USA, June 2011.

[14] D. Ciresan, A. Giusti, L. M. Gambardella, and J. Schmidhuber, "Deep neural networks segment neuronal membranes in electron microscopy images," in Proceedings of NIPS, pp. 2843–2851, Lake Tahoe, NV, USA, December 2012.

[15] H. Noh, S. Hong, and B. Han, "Learning deconvolution network for semantic segmentation," in Proceedings of the ICCV, Las Condes, Chile, December 2015.

[16] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the CVPR, pp. 3431–3440, Boston, MA, USA, June 2015.

[17] P. O. Pinheiro, R. Collobert, and P. Dollar, "Learning to segment object candidates," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1981–1989, Montreal, Canada, December 2015.

[18] J. Mula, J. D. Lee, F. Liu, L. Yang, and C. A. Peterson, "Automated image analysis of skeletal muscle fiber cross-sectional area," Journal of Applied Physiology, vol. 114, no. 1, pp. 148–155, 2013.

[19] P.-Y. Baudin, N. Azzabou, P. G. Carlier, and N. Paragios, "Prior knowledge, random walks and human skeletal muscle segmentation," in Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI 2012), pp. 569–576, Nice, France, October 2012.

[20] P. Dollar and C. L. Zitnick, "Structured forests for fast edge detection," in Proceedings of the ICCV, pp. 1841–1848, Sydney, Australia, December 2013.

[21] F. Liu, F. Xing, and L. Yang, "Robust muscle cell segmentation using region selection with dynamic programming," in Proceedings of the ISBI, pp. 521–524, Beijing, China, April 2014.

[22] E. Van Aart, N. Sepasian, A. Jalba, and A. Vilanova, "CUDA-accelerated geodesic ray-tracing for fiber tracking," International Journal of Biomedical Imaging, vol. 2011, Article ID 698908, 12 pages, 2011.

[23] G. C. Kagadis, C. Kloukinas, K. Moore et al., "Cloud computing in medical imaging," Medical Physics, vol. 40, no. 7, article 070901, 2013.

[24] L. Yang, X. Qi, F. Xing, T. Kurc, J. Saltz, and D. J. Foran, "Parallel content-based sub-image retrieval using hierarchical searching," Bioinformatics, vol. 30, no. 7, pp. 996–1002, 2014.

[25] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.

[26] H.-C. Shin, H. R. Roth, M. Gao et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.

[27] J. Xu, X. Luo, G. Wang, H. Gilmore, and A. Madabhushi, "A deep convolutional neural network for segmenting and classifying epithelial and stromal regions in histopathological images," Neurocomputing, vol. 191, pp. 214–223, 2016.

[28] X. Pan, L. Li, H. Yang et al., "Accurate segmentation of nuclei in pathological images via sparse reconstruction and deep convolutional networks," Neurocomputing, vol. 229, pp. 88–99, 2017.

[29] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, https://arxiv.org/abs/1409.1556.

[30] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: convolutional networks for biomedical image segmentation," in Proceedings of MICCAI 2015, Lecture Notes in Computer Science, pp. 234–241, Munich, Germany, October 2015.

[31] S. Xie and Z. Tu, "Holistically-nested edge detection," in Proceedings of the ICCV, pp. 1395–1403, Las Condes, Chile, December 2015.

[32] D. Eigen and R. Fergus, "Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture," in Proceedings of the IEEE International Conference on Computer Vision, pp. 2650–2658, Las Condes, Chile, December 2015.

[33] Q. Dou, H. Chen, L. Yu et al., "Automatic detection of cerebral microbleeds from MR images via 3D convolutional neural networks," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1182–1195, 2016.

[34] H. Chen, X. Qi, L. Yu, and P.-A. Heng, "DCAN: deep contour-aware networks for accurate gland segmentation," 2016, https://arxiv.org/abs/1604.02677.

[35] P. Moeskops, M. A. Viergever, A. M. Mendrik, L. S. de Vries, M. J. N. L. Benders, and I. Isgum, "Automatic segmentation of MR brain images with a convolutional neural network," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1252–1261, 2016.

[36] S. Hong, H. Noh, and B. Han, "Decoupled deep neural network for semi-supervised semantic segmentation," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1495–1503, Montreal, Canada, December 2015.

[37] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in Proceedings of Computer Vision–ECCV 2014, pp. 818–833, Springer, Zurich, Switzerland, September 2014.

[38] C. Szegedy, W. Liu, Y. Jia et al., "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9, Boston, MA, USA, June 2015.

[39] G. Bertasius, J. Shi, and L. Torresani, "DeepEdge: a multi-scale bifurcated deep network for top-down contour detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4380–4389, Boston, MA, USA, June 2015.

[40] Y. Jia, E. Shelhamer, J. Donahue et al., "Caffe: convolutional architecture for fast feature embedding," in Proceedings of the International Conference on Multimedia, pp. 675–678, Orlando, FL, USA, November 2014.

[41] N. Tajbakhsh, J. Y. Shin, S. R. Gurudu et al., "Convolutional neural networks for medical image analysis: full training or fine tuning?," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1299–1312, 2016.

[42] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, "How transferable are features in deep neural networks?," in Proceedings of NIPS, pp. 3320–3328, Montreal, Canada, December 2014.

[43] M. Donoser and D. Schmalstieg, "Discrete-continuous gradient orientation estimation for faster image segmentation," in Proceedings of the CVPR, pp. 3158–3165, Columbus, OH, USA, June 2014.

[44] P. Arbelaez, J. Pont-Tuset, J. Barron, F. Marques, and J. Malik, "Multiscale combinatorial grouping," in Proceedings of the CVPR, pp. 328–335, Columbus, OH, USA, June 2014.

[45] A. Giusti, D. C. Ciresan, J. Masci, L. M. Gambardella, and J. Schmidhuber, "Fast image scanning with deep max-pooling convolutional neural networks," 2013, https://arxiv.org/abs/1302.1700.


margins between fiber boundaries; and (3) due to the staining issue, the boundary pixels are not smooth and continuous, so it is very difficult to ensure that annotations accurately label each pixel. It is necessary to reduce this ambiguity for network training.

We propose a loss function to ameliorate these problems by assigning different weights to the loss of each pixel. The loss function for a training sample X, based on the cross-entropy loss, is defined as

J_{de}(\theta) = -\sum_{i=1}^{N} f(X_i) \left( \mathbb{1}[Y_i = 1] \log P_i + \mathbb{1}[Y_i = 0] \log(1 - P_i) \right)    (3)

The pixelwise weights are defined by the weight-assigning function f, which is defined as

f(X_i) = C(Y_i)^{-1} \times \exp\left( \Omega(X_i) / \eta_1 \right) \times \mathbb{1}\left[ |Y_i - P_i| < \eta_2 \right]    (4)

The pixelwise weight-assigning function f has three terms, which play different roles to address the three problems mentioned above. These specific considerations make our proposed loss different from [16, 30].

In the first term, C(Y_i) is the label frequency, a global term having the same value for all same-class pixels. In the second term, Ω is the Euclidean distance of pixel X_i to the boundary of the closest cell. Similar to [30], the intention of f is to assign relatively high weights to pixels adjacent to boundaries, amplifying the error penalty incurred at the margins and at pixels close to fiber boundaries, and 1 otherwise (Ω = 0 if Ω(X_i) > ε). We set η = 0.6 and ε = 10 empirically. Compared with the "hard" error-balancing strategy in [16, 31], f produces a soft error penalty, encouraging better optimization convergence and enhancing fine-grained prediction. The third term aims to reduce the reliability of the ground truth when the network predicts the opposite label with high probability. This term is a switch: it forces the weight of the corresponding pixel to zero when the condition is not satisfied. In practice, we preserve this value during network feedforward, while the loss of the corresponding pixels is excluded from network backpropagation.
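To make the weighting concrete, the following is a minimal NumPy/SciPy sketch of the weight map of Eq. (4) and the weighted cross-entropy of Eq. (3). This is not the paper's Caffe implementation: the function names, the choice of η₁ = 5.0, and the one-pixel morphological boundary extraction are our own illustrative assumptions (the paper only states η = 0.6 and ε = 10).

```python
import numpy as np
from scipy import ndimage

def pixel_weights(Y, P, eta1=5.0, eta2=0.6, eps=10.0):
    """Sketch of the weight-assigning function f in Eq. (4).

    Y : (H, W) ground-truth labels in {0, 1}; P : (H, W) predicted
    foreground probabilities. The paper states eta = 0.6 and eps = 10;
    we read eta as eta2 and pick eta1 = 5.0 purely for illustration.
    """
    # Term 1: inverse label frequency C(Y_i)^-1 (same value for same-class pixels).
    freq = np.array([np.mean(Y == 0), np.mean(Y == 1)])
    c_inv = 1.0 / freq[Y]

    # Term 2: exp(Omega(X_i) / eta1), where Omega is the Euclidean distance
    # to the nearest fiber boundary, reset to 0 beyond eps so that far-away
    # pixels get a neutral factor of exp(0) = 1.
    Yb = Y.astype(bool)
    boundary = Yb ^ ndimage.binary_erosion(Yb)       # one-pixel boundary ring
    omega = ndimage.distance_transform_edt(~boundary)
    omega[omega > eps] = 0.0
    dist_term = np.exp(omega / eta1)

    # Term 3: switch 1[|Y_i - P_i| < eta2] that zeroes pixels where the
    # network confidently predicts the opposite label.
    switch = (np.abs(Y - P) < eta2).astype(float)

    return c_inv * dist_term * switch

def weighted_cross_entropy(P, Y, W):
    """Eq. (3): weighted pixelwise cross-entropy (the indicators collapse to Y and 1-Y)."""
    return -np.sum(W * (Y * np.log(P) + (1 - Y) * np.log(1 - P)))
```

A confidently wrong pixel (e.g., background predicted as foreground with probability above η₂) receives weight zero and thus contributes nothing to backpropagation, matching the "switch" behavior described above.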

3.3. Two-Stage Training. Training our deep network faces some common difficulties:

(i) The large number of parameters in both the convolutional and deconvolutional layers makes it difficult for training to achieve proper convergence [15, 41].

(ii) Successful training from scratch requires extensive labeled data, which are extremely difficult to obtain in the medical image domain.

One typical solution is to apply transfer learning to reduce the training difficulty [41, 42], easing both the tricky parameter initialization and tuning [25, 29] and the heavy data acquisition procedure. The core idea is to use a pretrained model as the initialization and fine-tune the CNN to adapt it to the target task with new training data. The encoder of our network partially inherits the architecture of VGG [29], which is, however, trained on a large set of natural images for image classification. Transferring its knowledge to benefit a seemingly unrelated biological image analysis problem (i.e., MIS) may appear impracticable. However, recent literature coincides with our experiments: it demonstrates the advantage [41], across various biological imaging modalities, of transferring from AlexNet [25], a relatively shallow CNN for natural image classification. In our MIS segmentation case, the network architecture is much deeper, with many new parameterized layers in the decoders, so more specific treatment needs to be considered.

It is well known that the bottom layers of a CNN can be understood as various feature extractors attempting to capture low-level image features such as edges and corners [25, 37, 41]. These low-level features are common to natural images and muscle images, the most common being image gradients (i.e., boundaries). In practice, we find that training the network to detect boundaries is considerably easier than directly training it to segment muscle fibres.

We propose a two-stage training strategy to progressively train our network, so as to utilize the powerful feature extractors of VGG and overcome the above-mentioned problems. In the first stage, we apply transfer learning: the pretrained VGG initializes the parameters of the encoder, while the parameters of the decoders are randomly initialized. We then train the network to detect fiber boundaries, by feeding it training muscle images associated with ground truth boundary maps (see Figure 2). This strategy facilitates swift convergence. After the network has adapted to the new muscle images, in the second stage we fine-tune the model using the original training data D (i.e., Y is the segmentation mask) to train the network to segment muscle fibres automatically, assigning in-cell pixels to 1 and all other pixels to 0. More implementation details are described in the experimental section.
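The paper does not detail how the stage-1 ground truth boundary maps are derived from the segmentation masks. A plausible minimal sketch uses a morphological gradient; the function name and the band thickness of 2 pixels are our assumptions:

```python
import numpy as np
from scipy import ndimage

def boundary_map(mask, thickness=2):
    """Turn a binary fiber mask into a boundary map for stage-1 training.

    `thickness` controls the width of the boundary band in pixels and is an
    illustrative choice; the paper does not specify it.
    """
    mask = mask.astype(bool)
    dilated = ndimage.binary_dilation(mask, iterations=thickness)
    eroded = ndimage.binary_erosion(mask, iterations=thickness)
    # Morphological gradient: pixels inside the dilation but outside the erosion.
    return (dilated ^ eroded).astype(np.uint8)
```

Applied to each training mask Y, this yields the boundary targets for the first stage, while the second stage reuses Y itself.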

Another advantage of our proposed training strategy is that, besides the pixel weight-assigning function f, it further helps reduce the touching-objects problem (due to thin boundaries) [30, 34] that commonly occurs in end-to-end CNN segmentation. The strategy of [34] is to predict both a segmentation map and a boundary map and merge the two maps to resolve touching glands. In our method, the first training stage makes the network detect cell boundaries, and the second stage is able to preserve this boundary information.

4. Experimental Results

4.1. Dataset. Our expert-annotated skeletal muscle image dataset with H&E staining contains 500 annotated images, captured by a whole-slide digital scanner from the cooperative institution Muscle Miner. The images exhibit large appearance variances in color, fiber size, and shape. The image size roughly ranges from 500 × 500 to 1500 × 1500 pixels. We split the dataset into 100 testing images and 400 training images.

In order to evaluate the ability of the proposed method to handle large-scale images, we evaluate the runtime on a whole-slide image. Note that we use small image patches for the segmentation accuracy evaluation because some comparative methods in the literature cannot handle whole-slide images. Our proposed network, however, is flexible with respect to the input size during the testing stage, because the decoder adaptively adjusts the output size to be consistent with the input size.

4.2. Implementation Details. Our implementation is based on the Caffe [40] framework, with modifications for our network design. All experiments are conducted on a standard desktop with an Intel i7 processor and a single Tesla K40c GPU. The optimization is driven by stochastic gradient descent with momentum. For the first training stage, the network parameters are set to learning rate = 1e-6 (divided by 10 every 1e4 iterations), momentum = 0.9, and minibatch size = 2. In the second stage, we use learning rate = 1e-7 and keep the others the same.
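For reference, a single SGD-with-momentum update under the stage-1 hyperparameters can be written as a plain NumPy sketch (Caffe maintains the velocity buffer internally; here the caller carries it):

```python
import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=1e-6, momentum=0.9):
    """One SGD-with-momentum update with the stage-1 settings
    (lr = 1e-6, momentum = 0.9). The velocity buffer persists across steps."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity
```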

Dataset augmentation is a standard step when training a CNN. We apply a simple approach: randomly cropping 30 patches of 300 × 300 pixels from each training image, generating 1.2e4 training samples in total. We choose this patch size to account for the memory capacity of the GPU. Based on our observations, segmentation accuracy is not affected by increasing the input size of test images. To simplify the computation of the weighting function f during training, we feed a precomputed weighting map associated with each training pair (X, Y) as an additional network input.
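The cropping step above can be sketched as follows; the function name and the use of a uniform random generator are our own illustrative choices:

```python
import numpy as np

def random_crops(image, label, n=30, size=300, rng=None):
    """Crop n aligned (image, label) patches of size x size pixels,
    as used to build the training set (400 images x 30 crops = 1.2e4)."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = image.shape[:2]
    crops = []
    for _ in range(n):
        # Sample a top-left corner so the patch stays inside the image.
        y = int(rng.integers(0, h - size + 1))
        x = int(rng.integers(0, w - size + 1))
        crops.append((image[y:y + size, x:x + size],
                      label[y:y + size, x:x + size]))
    return crops
```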

4.3. Segmentation Accuracy Evaluation. For quantitative evaluation, we report Precision (|S ∩ G|/|S|), Recall (|S ∩ G|/|G|), and F1-score (2 · Prec · Rec/(Prec + Rec)), where |S| is the segmented cell region area and |G| is the corresponding ground truth region area. For each test image, Precision and Recall are computed by averaging the results over all fibres inside. We report the three values with a fixed threshold (FT), i.e., a common threshold producing the best F1-score over the test set, and with dynamic thresholds (DT), producing the best F1-score per image.
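These per-fiber scores can be computed directly from the region masks; a small sketch (names are ours):

```python
import numpy as np

def fiber_scores(S, G):
    """Precision |S∩G|/|S|, Recall |S∩G|/|G|, and their harmonic mean (F1)
    for one segmented fiber region S against its ground truth region G."""
    S, G = S.astype(bool), G.astype(bool)
    inter = np.logical_and(S, G).sum()
    prec = inter / S.sum() if S.sum() else 0.0
    rec = inter / G.sum() if G.sum() else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1
```

Averaging these triples over all fibres in an image, and then over the test set, gives the entries reported in Table 1.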

In Table 1, we compare the segmentation performance of our approach to several state-of-the-art methods. DC [43] and multiscale combinatorial grouping (MCG) [44] are recently proposed learning-based image segmentation methods. U-Net [30] is an end-to-end CNN for biomedical image segmentation. We use their public code and carefully train the models on our training data with the same amount of data. DNN-SNM [14] is a well-known CNN-based image segmentation method; we regard it as a generic CNN for comparison with our end-to-end CNN approach. For our method, we directly use the network output as the segmentation result for evaluation, without any extra postprocessing.

As shown in Table 1, our method achieves much better results than the comparative methods. Although [7] has a better Recall (FT), our method has around 10% improvement on Precision (FT). DC and MCG are not robust to image artifacts, which degrades their segmentation performance. Our method largely outperforms DNN-SNM and U-Net because (1) our network is deeper than DNN-SNM and captures richer representations, (2) the decoder utilizes the multiscale representations better than U-Net and reduces the effects of pixelwise information loss, and (3) two-stage training takes advantage of VGG for better training effectiveness rather than training from scratch as U-Net does. The outstanding Precision result demonstrates that our method produces more fine-grained segmentation than the others. This superiority is better demonstrated by the qualitative evaluation shown in Figure 4.

4.4. Whole-Slide Segmentation Runtime. In Table 2, we compare the runtime of our method to the comparative methods on images of different sizes cropped from a whole-slide image (see Figure 1). The runtime of the non-deep-learning-based methods (first block) depends on both pixel and fiber quantities, so they cannot handle large-scale images. In contrast, the deep learning-based methods (second and third blocks) depend only on the pixel quantity, so they have close-to-linear time complexity with respect to the image scale. We also implement a fast scanning version [45] of DNN-SNM on GPU. Although its speed improves substantially, it is still much slower than ours. U-Net has a more complicated layer connection configuration, so it is slower than ours, especially in the large-scale cases. The significant speed improvement demonstrates the scalability of our proposed method to whole-slide MIS applications at even larger scales.

5. Conclusion

This paper presents a fast and accurate whole-slide MIS method based on a CNN trained in an end-to-end manner. Our proposed network captures hierarchical and comprehensive representations to support multiscale pixelwise predictions inside the network. A two-stage transfer learning strategy is proposed to train such a deep network. Superior accuracy and efficiency are experimentally demonstrated on a challenging skeletal muscle image dataset. In general, our approach enables multiscaling inside the network while requiring just a single, arbitrarily sized input and producing fine outputs. However, during the downsampling process of the encoder, many important features, such as cell edge features, are still lost owing to the limited resolution of the downsampled feature layers. To further improve decoding efficiency, in future work we can design a module that complements these important features to better improve network performance.


[43] M Donoser and D Schmalstieg ldquoDiscrete-continuous gra-dient orientation estimation for faster image segmentationrdquoin Proceedings of the CVPR pp 3158ndash3165 Columbus OHUSA June 2014

[44] P Arbelaez J Pont-Tuset J Barron F Marques and J MalikldquoMultiscale combinatorial groupingrdquo in Proceedings of theCVPR pp 328ndash335 Columbus OH USA June 2014

[45] A Giusti D C Ciresan J Masci L M Gambardella andJ Schmidhuber ldquoFast image scanning with deep max-poolingconvolutional neural networksrdquo 2013 httpsarxivorgabs13021700

10 Journal of Healthcare Engineering


the cooperative institution Muscle Miner. The images exhibit large appearance variances in color, fiber size, and shape. The image size roughly ranges from 500 × 500 to 1500 × 1500 pixels. We split the dataset into 100 testing images and 400 training images.

In order to evaluate the ability of the proposed method to handle large-scale images, we evaluate the runtime on a whole-slide image. Note that we use small image patches for the segmentation accuracy evaluation because some comparative methods in the literature cannot handle whole-slide images. However, our proposed network is flexible with respect to the input size during the testing stage, because the decoder is able to adaptively adjust the output size to be consistent with the input size.
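This flexibility follows from the fully convolutional encoder–decoder design: every stride-2 max-pooling in the encoder is mirrored by a stride-2 upsampling in the decoder, so the output resolution tracks the input. A minimal sketch of this size bookkeeping (the depth of four pooling stages is our illustrative assumption, not a figure from the paper):

```python
def output_size(input_size, n_pool=4):
    """Spatial size after an encoder with n_pool stride-2 max-poolings
    followed by a decoder with n_pool stride-2 upsamplings, assuming
    'same' padding so the convolutions themselves preserve resolution."""
    size = input_size
    for _ in range(n_pool):
        size //= 2  # each max-pooling halves the feature map
    for _ in range(n_pool):
        size *= 2   # each decoder upsampling doubles it back
    return size
```

For any input side length divisible by 2**n_pool, the output matches the input exactly, which is why the same network can ingest image patches or whole-slide crops of arbitrary scale at test time.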

4.2. Implementation Details. Our implementation is based on the Caffe [40] framework, with modifications for our network design. All experiments are conducted on a standard desktop with an Intel i7 processor and a single Tesla K40c GPU. The optimization is driven by stochastic gradient descent with momentum. For the first training stage, the network parameters are set to learning rate 1e−6 (divided by 10 every 1e4 iterations), momentum 0.9, and minibatch size 2. In the second stage, we use a learning rate of 1e−7 and keep the other settings the same.
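The first-stage schedule (learning rate 1e−6, divided by 10 every 1e4 iterations) is a plain step decay. A minimal sketch, with names of our own choosing rather than the authors' Caffe solver settings:

```python
def step_decay_lr(base_lr, iteration, step=10_000, gamma=0.1):
    """Return the learning rate after `iteration` steps: the base rate
    is multiplied by `gamma` (i.e., divided by 10) every `step` iterations."""
    return base_lr * gamma ** (iteration // step)
```

The second stage would call the same schedule with base_lr = 1e-7, keeping the momentum (0.9) and minibatch size (2) unchanged.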

Dataset augmentation is a standard step when training a CNN. We apply a simple approach: randomly cropping 30 patches of 300 × 300 pixels from each of the training images, generating 1.2e4 training samples in total. We choose this patch size to take the memory capacity of the GPU into account. Based on our observations, the segmentation accuracy is not affected by increasing the input size of test images. To simplify the computation of the weighting function f during training, we take a precomputed weighting map associated with each training sample (X, Y) as an additional network input.
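The cropping step can be sketched as follows; plain Python lists stand in for image arrays, and the function name is ours:

```python
import random

def random_crops(image, n_crops=30, size=300):
    """Randomly crop n_crops square patches of size x size pixels from
    `image`, given as a 2-D list of pixel rows (height x width)."""
    h, w = len(image), len(image[0])
    patches = []
    for _ in range(n_crops):
        top = random.randint(0, h - size)
        left = random.randint(0, w - size)
        patches.append([row[left:left + size]
                        for row in image[top:top + size]])
    return patches

# 400 training images x 30 crops each = 1.2e4 training patches in total.
```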

4.3. Segmentation Accuracy Evaluation. For quantitative evaluation, we report Precision (|S ∩ G|/|S|), Recall (|S ∩ G|/|G|), and F1-score (2 · Prec · Rec/(Prec + Rec)), where |S| is the segmented cell region area and |G| is the corresponding ground truth region area. For each test image, Precision and Recall are computed by averaging the results of all fibers inside. We report the three values with a fixed threshold (FT), i.e., a common threshold that produces the best F1-score over the test set, and with dynamic thresholds (DT), which produce the best F1-score per image.
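These overlap metrics are straightforward to compute from pixel coordinate sets. A minimal sketch (names are ours; the empty-region edge case is ignored for brevity):

```python
def precision_recall_f1(segmented, ground_truth):
    """Precision = |S n G| / |S|, Recall = |S n G| / |G|, and
    F1 = 2 * Prec * Rec / (Prec + Rec), where S and G are sets of
    (row, col) pixel coordinates of a segmented region and its
    corresponding ground-truth region."""
    s, g = set(segmented), set(ground_truth)
    overlap = len(s & g)
    prec = overlap / len(s)
    rec = overlap / len(g)
    return prec, rec, 2 * prec * rec / (prec + rec)
```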

In Table 1, we compare the segmentation performance of our approach to several state-of-the-art methods. DC [43] and multiscale combinatorial grouping (MCG) [44] are recently proposed learning-based image segmentation methods. U-Net [30] is an end-to-end CNN for biomedical image segmentation. We use their public codes and carefully train the models on our training data with the same amount. DNN-SNM [14] is a well-known CNN-based image segmentation method; we regard it as a generic CNN for comparison with our end-to-end CNN approach. For our method, we directly use the network output as the segmentation result for evaluation, without any extra postprocessing efforts.

As shown in Table 1, our method achieves much better results than the comparative methods. Although [7] has better Recall (FT), our method has around 10% improvement in Precision (FT). DC and MCG are not robust to image artifacts, which decreases their segmentation performance. Our method largely outperforms DNN-SNM and U-Net because (1) our network is deeper than DNN-SNM and captures richer representations, (2) the decoder utilizes the multiscale representations better than U-Net and is able to reduce the effects of pixelwise information loss, and (3) the two-stage training takes advantage of VGG for better training effectiveness rather than training from scratch as U-Net does. The outstanding Precision result demonstrates that our method produces more fine-grained segmentation than the others. This superiority is better demonstrated by the qualitative evaluation shown in Figure 4.

4.4. Whole-Slide Segmentation Runtime. In Table 2, we compare the runtime of our method to the comparative methods on images of different sizes cropped from a whole-slide image (see Figure 1). The runtime of non-deep-learning-based methods (1st block) depends on both pixel and fiber quantities, so they cannot handle large-scale images. In contrast, the runtime of deep-learning-based methods (2nd and 3rd blocks) depends on the pixel quantity, so they have close-to-linear time complexity with respect to the image scale. We also implement a fast scanning version [45] of DNN-SNM on GPU. Although its speed is greatly improved, it is still much slower than ours. U-Net has a more complicated layer connection configuration, so it is slower than ours, especially in large-scale cases. The significant speed improvement demonstrates the scalability of our proposed method to whole-slide MIS applications at even larger scales.
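The close-to-linear behavior can be sanity-checked directly from the Table 2 numbers for our approach: dividing each runtime by the image area gives a nearly constant cost per megapixel beyond the smallest image. This is our own quick check, not an analysis from the paper:

```python
# Runtimes (in seconds) of our approach from Table 2, scales 1x..9x,
# where scale k corresponds to a (k * 1000) x (k * 1000) image.
times = [11, 18, 53, 88, 139, 209, 278, 364, 468]
megapixels = [(k * 1000) ** 2 / 1e6 for k in range(1, 10)]
seconds_per_mpx = [t / m for t, m in zip(times, megapixels)]
# Beyond 1x, the cost stays in a narrow band (~4.5-5.9 s/megapixel),
# consistent with close-to-linear scaling in pixel count.
```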

5. Conclusion

This paper presents a fast and accurate whole-slide MIS method based on a CNN trained in an end-to-end manner. Our proposed network captures hierarchical and comprehensive representations to support multiscale pixelwise predictions inside the network. A two-stage transfer learning strategy is proposed to train such a deep network. Superior accuracy and efficiency are experimentally demonstrated on a challenging skeletal muscle image dataset. In general, our approach enables multiscaling inside the network while requiring just a single, arbitrarily sized input and producing fine outputs. However, during encoder downsampling, the limited resolution of the downsampled feature maps still causes the loss of many important features, such as cell edge features. In future work, we can design a module that complements these important features to further improve decoding efficiency and network performance.

Journal of Healthcare Engineering 7

Figure 4: Segmentation results of four sample skeletal muscle images (columns, left to right: test image, ground truth, DNN-SNM, Liu et al. [7], ours). We show some very challenging cases with large appearance variances in color, fiber shape, etc. Each segmented fiber is overlaid with a distinctively colored mask, while false positives and false negatives are highlighted by red and blue contours, respectively. Compared with the other two methods, our method obtains more fine-grained segmentation results with obviously fewer false predictions.

Table 2: The runtime (in seconds) comparison on images of different sizes, from 1x (1000 × 1000) to 9x (9000 × 9000).

Method            1x    2x    3x    4x    5x    6x     7x     8x     9x
DC [43]           20    79    —     —     —     —      —      —      —
MCG [44]          7     27    —     —     —     —      —      —      —
Liu et al. [7]    10    59    —     —     —     —      —      —      —
DNN-SNM [14]      264   1056  2376  4224  6600  9504   12936  16896  21384
DNN-SNM⋆ [45]     31    115   242   431   675   974    1325   1738   2160
U-Net [30]        12    39    90    161   246   368    482    633    792
Our approach      11    18    53    88    139   209    278    364    468

The first three methods cannot handle images of 3x and larger sizes on our machine (represented by "—" in the table). ⋆A fast scanning implementation of DNN-SNM for prediction speed acceleration.

Table 1: The segmentation results compared with state-of-the-art methods.

                  F1-score (% ± σ)          Precision (% ± σ)         Recall (% ± σ)
Method            FT          DT            FT          DT            FT          DT
DC [43]           48 ± 0.093  60 ± 0.138    41 ± 0.066  54 ± 0.164    67 ± 0.194  73 ± 0.148
MCG [44]          63 ± 0.201  71 ± 0.105    53 ± 0.136  64 ± 0.138    80 ± 0.303  82 ± 0.091
DNN-SNM [14]      76 ± 0.033  78 ± 0.080    83 ± 0.042  85 ± 0.089    70 ± 0.058  73 ± 0.087
U-Net [30]        80 ± 0.143  81 ± 0.054    87 ± 0.155  86 ± 0.076    74 ± 0.126  77 ± 0.055
Liu et al. [7]    82 ± 0.172  84 ± 0.061    81 ± 0.043  84 ± 0.071    85 ± 0.202  85 ± 0.068
Our approach      86 ± 0.184  89 ± 0.048    91 ± 0.174  93 ± 0.050    82 ± 0.176  86 ± 0.058

σ is the standard deviation.

8 Journal of Healthcare Engineering

Data Availability

The data that support the findings of this study are available from the cooperative institution Muscle Miner, but restrictions apply to the availability of these data, which were used under license for the current study, and so they are not publicly available. Data are, however, available from the authors upon reasonable request and with the permission of the cooperative institution Muscle Miner.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank all study participants. This work was supported by the National Key R&D Program of China (grant no. 2017YFB1002504) and the National Natural Science Foundation of China (nos. 81727802 and 61701404).

References

[1] C. S. Fry, J. D. Lee, J. Mula et al., "Inducible depletion of satellite cells in adult sedentary mice impairs muscle regenerative capacity without affecting sarcopenia," Nature Medicine, vol. 21, no. 1, pp. 76–80, 2015.

[2] M. W. Lee, M. G. Viola, H. Meng et al., "Differential muscle hypertrophy is associated with satellite cell numbers and Akt pathway activation following activin type IIB receptor inhibition in Mtm1 p.R69C mice," The American Journal of Pathology, vol. 184, no. 6, pp. 1831–1842, 2014.

[3] H. Viola, P. M. Janssen, R. W. Grange et al., "Tissue triage and freezing for models of skeletal muscle disease," Journal of Visualized Experiments: JoVE, vol. e51586, no. 89, 2014.

[4] F. Liu, A. L. Mackey, R. Srikuea, K. A. Esser, and L. Yang, "Automated image segmentation of haematoxylin and eosin stained skeletal muscle cross-sections," Journal of Microscopy, vol. 252, no. 3, pp. 275–285, 2013.

[5] H. Su, F. Xing, J. D. Lee et al., "Learning based automatic detection of myonuclei in isolated single skeletal muscle fibers using multi-focus image fusion," in Proceedings of the IEEE International Symposium on Biomedical Imaging, pp. 432–435, San Francisco, CA, USA, April 2013.

[6] L. R. Smith and E. R. Barton, "SMASH – semi-automatic muscle analysis using segmentation of histology: a MATLAB application," Skeletal Muscle, vol. 4, no. 1, pp. 1–16, 2014.

[7] F. Liu, F. Xing, Z. Zhang, M. McGough, and L. Yang, "Robust muscle cell quantification using structured edge detection and hierarchical segmentation," in Lecture Notes in Computer Science, pp. 324–331, MICCAI, Shenzhen, China, 2015.

[8] T. Janssens, L. Antanas, S. Derde, I. Vanhorebeek, G. Van den Berghe, and F. Guiza Grandas, "Charisma: an integrated approach to automatic H&E-stained skeletal muscle cell segmentation using supervised learning and novel robust clump splitting," Medical Image Analysis, vol. 17, no. 8, pp. 1206–1219, 2013.

[9] A. E. Carpenter, T. R. Jones, M. R. Lamprecht et al., "CellProfiler: image analysis software for identifying and quantifying cell phenotypes," Genome Biology, vol. 7, no. 10, pp. 1–11, 2006.

[10] A. Klemencic, S. Kovacic, and F. Pernus, "Automated segmentation of muscle fiber images using active contour models," Cytometry, vol. 32, no. 4, pp. 317–326, 1998.

[11] N. Bova, V. Gal, O. Ibáñez, and O. Cordon, "Deformable models direct supervised guidance: a novel paradigm for automatic image segmentation," Neurocomputing, vol. 177, pp. 317–333, 2016.

[12] T. F. Cootes, C. J. Taylor, D. H. Cooper et al., "Active shape models – their training and application," Computer Vision and Image Understanding, vol. 61, no. 1, pp. 38–59, 1995.

[13] S. Zhang, Y. Zhan, M. Dewan, J. Huang, D. N. Metaxas, and X. S. Zhou, "Sparse shape composition: a new framework for shape prior modeling," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1025–1032, IEEE, Colorado Springs, CO, USA, June 2011.

[14] D. Ciresan, A. Giusti, L. M. Gambardella, and J. Schmidhuber, "Deep neural networks segment neuronal membranes in electron microscopy images," in Proceedings of the NIPS, pp. 2843–2851, Lake Tahoe, NV, USA, December 2012.

[15] H. Noh, S. Hong, and B. Han, "Learning deconvolution network for semantic segmentation," in Proceedings of the ICCV, Las Condes, Chile, December 2015.

[16] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proceedings of the CVPR, pp. 3431–3440, Santiago, Chile, December 2015.

[17] P. O. Pinheiro, R. Collobert, and P. Dollar, "Learning to segment object candidates," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1981–1989, Montreal, Canada, December 2015.

[18] J. Mula, J. D. Lee, F. Liu, L. Yang, and C. A. Peterson, "Automated image analysis of skeletal muscle fiber cross-sectional area," Journal of Applied Physiology, vol. 114, no. 1, pp. 148–155, 2013.

[19] P.-Y. Baudin, N. Azzabou, P. G. Carlier, and N. Paragios, "Prior knowledge, random walks and human skeletal muscle segmentation," in Proceedings of the Medical Image Computing and Computer-Assisted Intervention – MICCAI 2012, pp. 569–576, Nice, France, October 2012.

[20] P. Dollar and C. L. Zitnick, "Structured forests for fast edge detection," in Proceedings of the ICCV, pp. 1841–1848, Sydney, Australia, December 2013.

[21] F. Liu, F. Xing, and L. Yang, "Robust muscle cell segmentation using region selection with dynamic programming," in Proceedings of the ISBI, pp. 521–524, Beijing, China, April 2014.

[22] E. Van Aart, N. Sepasian, A. Jalba, and A. Vilanova, "CUDA-accelerated geodesic ray-tracing for fiber tracking," Journal of Biomedical Imaging, vol. 2011, Article ID 698908, 12 pages, 2011.

[23] G. C. Kagadis, C. Kloukinas, K. Moore et al., "Cloud computing in medical imaging," Medical Physics, vol. 40, no. 7, article 070901, 2013.

[24] L. Yang, X. Qi, F. Xing, T. Kurc, J. Saltz, and D. J. Foran, "Parallel content-based sub-image retrieval using hierarchical searching," Bioinformatics, vol. 30, no. 7, pp. 996–1002, 2014.

[25] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1097–1105, Lake Tahoe, NV, USA, December 2012.

[26] H.-C. Shin, H. R. Roth, M. Gao et al., "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.

[27] J. Xu, X. Luo, G. Wang, H. Gilmore, and A. Madabhushi, "A deep convolutional neural network for segmenting and classifying epithelial and stromal regions in histopathological images," Neurocomputing, vol. 191, pp. 214–223, 2016.

[28] X. Pan, L. Li, H. Yang et al., "Accurate segmentation of nuclei in pathological images via sparse reconstruction and deep convolutional networks," Neurocomputing, vol. 229, pp. 88–99, 2017.

[29] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," https://arxiv.org/abs/1409.1556.

[30] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: convolutional networks for biomedical image segmentation," in Lecture Notes in Computer Science, pp. 234–241, MICCAI, Shenzhen, China, 2015.

[31] S. Xie and Z. Tu, "Holistically-nested edge detection," in Proceedings of the ICCV, pp. 1395–1403, Las Condes, Chile, December 2015.

[32] D. Eigen and R. Fergus, "Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture," in Proceedings of the IEEE International Conference on Computer Vision, pp. 2650–2658, Las Condes, Chile, December 2015.

[33] Q. Dou, H. Chen, L. Yu et al., "Automatic detection of cerebral microbleeds from MR images via 3D convolutional neural networks," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1182–1195, 2016.

[34] H. Chen, X. Qi, L. Yu, and P.-A. Heng, "DCAN: deep contour-aware networks for accurate gland segmentation," https://arxiv.org/abs/1604.02677.

[35] P. Moeskops, M. A. Viergever, A. M. Mendrik, L. S. de Vries, M. J. N. L. Benders, and I. Isgum, "Automatic segmentation of MR brain images with a convolutional neural network," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1252–1261, 2016.

[36] S. Hong, H. Noh, and B. Han, "Decoupled deep neural network for semi-supervised semantic segmentation," in Proceedings of the Advances in Neural Information Processing Systems, pp. 1495–1503, Montreal, Canada, December 2015.

[37] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in Proceedings of Computer Vision – ECCV 2014, pp. 818–833, Springer, Zurich, Switzerland, September 2014.

[38] C. Szegedy, W. Liu, Y. Jia et al., "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9, Boston, MA, USA, June 2015.

[39] G. Bertasius, J. Shi, and L. Torresani, "DeepEdge: a multi-scale bifurcated deep network for top-down contour detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4380–4389, Boston, MA, USA, June 2015.

[40] Y. Jia, E. Shelhamer, J. Donahue et al., "Caffe: convolutional architecture for fast feature embedding," in Proceedings of the International Conference on Multimedia, pp. 675–678, Orlando, FL, USA, November 2014.

[41] N. Tajbakhsh, J. Y. Shin, S. R. Gurudu et al., "Convolutional neural networks for medical image analysis: full training or fine tuning?," IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1299–1312, 2016.

[42] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, "How transferable are features in deep neural networks?," in Proceedings of the NIPS, pp. 3320–3328, Montreal, Canada, December 2014.

[43] M. Donoser and D. Schmalstieg, "Discrete-continuous gradient orientation estimation for faster image segmentation," in Proceedings of the CVPR, pp. 3158–3165, Columbus, OH, USA, June 2014.

[44] P. Arbelaez, J. Pont-Tuset, J. Barron, F. Marques, and J. Malik, "Multiscale combinatorial grouping," in Proceedings of the CVPR, pp. 328–335, Columbus, OH, USA, June 2014.

[45] A. Giusti, D. C. Ciresan, J. Masci, L. M. Gambardella, and J. Schmidhuber, "Fast image scanning with deep max-pooling convolutional neural networks," 2013, https://arxiv.org/abs/1302.1700.

10 Journal of Healthcare Engineering

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Test image Ground truth DNN-SNM Liu et al [7] Ours

Figure 4 Segmentation results of four sample skeletal muscle imagesWe show some very challenging cases with large appearance variancesin color fiber shape etc Each segmented fiber is overlaid with a distinctive colored mask while false positives and false negatives arehighlighted by red and blue contours respectively Compared with the other two methods our method obtains more fine-grainedsegmentation results with obviously less false prediction

Table 2 -e runtime (in seconds) comparison on images of different sizes from 1x 1000 times 1000 to 9x 9000 times 9000

Method 1x 2x 3x 4x 5x 6x 7x 8x 9xDC [43] 20 79 mdash mdash mdash mdash mdash mdash mdashMCG [44] 7 27 mdash mdash mdash mdash mdash mdash mdashLiu et al [7] 10 59 mdash mdash mdash mdash mdash mdash mdashDNN-SNM [14] 264 1056 2376 4224 6600 9504 12936 16896 21384DNN-SNM⋆[45] 31 115 242 431 675 974 1325 1738 2160U-net [30] 12 39 90 161 246 368 482 633 792Our approach 11 18 53 88 139 209 278 364 468-e first three methods cannot handle images with 3x and larger sizes on our machine (represented with ldquomdashrdquo in the table) ⋆DNN-SNM is a fast scanningimplementation for prediction speed acceleration

Table 1 -e segmentation results compared with state-of-the-art methods

MethodF1-score ( plusmn σ) Precision ( plusmn σ) Recall ( plusmn σ)

FT DT FT DT FT DT

DC [43] 48plusmn 0093 60plusmn 0138 41plusmn 0066 54plusmn 0164 67plusmn 0194 73plusmn 0148MCG [44] 63plusmn 0201 71plusmn 0105 53plusmn 0136 64plusmn 0138 80plusmn 0303 82plusmn 0091DNN-SNM [14] 76plusmn 0033 78plusmn 0080 83plusmn 0042 85plusmn 0089 70plusmn 0058 73plusmn 0087U-Net [30] 80plusmn 0143 81plusmn 0054 87plusmn 0155 86plusmn 0076 74plusmn 0126 77plusmn 0055Liu et al [7] 82plusmn 0172 84plusmn 0061 81plusmn 0043 84plusmn 0071 85plusmn 0202 85plusmn 0068Our approach 86plusmn 0184 89plusmn 0048 91plusmn 0174 93plusmn 0050 82plusmn 0176 86plusmn 0058σ is the standard deviation

8 Journal of Healthcare Engineering

Data Availability

-e data that support the findings of this study areavailable from the cooperative institution Muscle Minerbut restrictions apply to the availability of these datawhich were used under license for the current study andso are not publicly available Data are however availablefrom the authors upon reasonable request and withpermission of the cooperative institution Muscle Miner

Conflicts of Interest

-e authors declare that they have no conflicts of interest

Acknowledgments

-e authors would like to thank all study participants -iswork was supported by the National Key RampD Program ofChina (grant no 2017YFB1002504) and National NaturalScience Foundation of China (nos 81727802 and 61701404)

References

[1] C S Fry J D Lee J Mula et al ldquoInducible depletion ofsatellite cells in adult sedentary mice impairs muscle re-generative capacity without affecting sarcopeniardquo NatureMedicine vol 21 no 1 pp 76ndash80 2015

[2] M W Lee M G Viola H Meng et al ldquoDifferential musclehypertrophy is associated with satellite cell numbers and aktpathway activation following activin type IIB receptor in-hibition in Mtm1 pR69C micerdquo e American Journal ofPathology vol 184 no 6 pp 1831ndash1842 2014

[3] H Viola P M Janssen R W Grange et al ldquoTissue triage andfreezing for models of skeletal muscle diseaserdquo Journal ofVisualized Experiments JoVE vol e51586 no 89 2014

[4] F Liu A L Mackey R Srikuea K A Esser and L YangldquoAutomated image segmentation of haematoxylin and eosinstained skeletal muscle cross-sectionsrdquo Journal of Microscopyvol 252 no 3 pp 275ndash285 2013

[5] H Su F Xing J D Lee et al ldquoLearning based automaticdetection of myonuclei in isolated single skeletal muscle fibersusing multi-focus image fusionrdquo in Proceedings of the IEEEInternational Symposium on Biomedical Imaging pp 432ndash435 San Francisco CA USA April 2013

[6] L R Smith and E R Barton ldquoSmashndashsemi-automatic muscleanalysis using segmentation of histology a matlab applica-tionrdquo Skeletal Muscle vol 4 no 1 pp 1ndash16 2014

[7] F Liu F Xing Z Zhang M Mcgough and L Yang ldquoRobustmuscle cell quantification using structured edge detection andhierarchical segmentationrdquo in Lecture Notes in ComputerScience pp 324ndash331 Shenzhen MICCAI Shenzhen China2015

[8] T Janssens L Antanas S Derde I VanhorebeekG Van den Berghe and F Guiza Grandas ldquoCharisma anintegrated approach to automatic HampE-stained skeletalmuscle cell segmentation using supervised learning and novelrobust clump splittingrdquoMedical Image Analysis vol 17 no 8pp 1206ndash1219 2013

[9] A E Carpenter T R Jones M R Lamprecht et al ldquoCell-profiler image analysis software for identifying and quanti-fying cell phenotypesrdquo Genome Biology vol 7 no 10pp 1ndash11 2006

[10] A Klemencic S Kovacic and F Pernus ldquoAutomated seg-mentation of muscle fiber images using active contourmodelsrdquo Cytometry vol 32 no 4 pp 317ndash326 1998

[11] N Bova V Gal O Ibantildeez and O Cordon ldquoDeformablemodels direct supervised guidance a novel paradigm forautomatic image segmentationrdquo Neurocomputing vol 177pp 317ndash333 2016

[12] T F Cootes C J Taylor D H Cooper et al ldquoActive shapemodels-their training and applicationrdquo Computer Vision andImage Understanding vol 61 no 1 pp 38ndash59 1995

[13] S Zhang Y Zhan M Dewan J Huang D N Metaxas andX S Zhou ldquoSparse shape composition a new frameworkfor shape prior modelingrdquo in Proceedings of the IEEEConference on Computer Vision and Pattern Recognition(CVPR) pp 1025ndash1032 IEEE Colorado Springs CO USAJune 2011

[14] D Ciresan A Giusti L M Gambardella and J SchmidhuberldquoDeep neural networks segment neuronal membranes inelectron microscopy imagesrdquo in Proceedings of the NIPSpp 2843ndash2851 Lake Tahoe NV USA December 2012

[15] H Noh S Hong and B Han ldquoLearning deconvolutionnetwork for semantic segmentationrdquo in Proceedings of theICCV Las Condes Chile December 2015

[16] J Long E Shelhamer and T Darrell ldquoFully convolutionalnetworks for semantic segmentationrdquo in Proceedings of theCVPR pp 3431ndash3440 Santiago Chile December 2015

[17] P O Pinheiro R Collobert and P Dollar ldquoLearning tosegment object candidatesrdquo in Proceedings of the Advances inNeural Information Processing Systems pp 1981ndash1989Montreal Canada December 2015

[18] J Mula J D Lee F Liu L Yang and C A PetersonldquoAutomated image analysis of skeletal muscle fiber cross-sectional areardquo Journal of Applied Physiology vol 114 no 1pp 148ndash155 2013

[19] P-Y Baudin N Azzabou P G Carlier and N ParagiosldquoPrior knowledge random walks and human skeletal musclesegmentationrdquo in Proceedings of the Medical Image Com-puting and Computer-Assisted InterventionndashMICCAI 2012pp 569ndash576 Nice France October 2012

[20] P Dollar and C L Zitnick ldquoStructured forests for fast edgedetectionrdquo in Proceedings of the ICCV pp 1841ndash1848 SydneyAustralia December 2013

[21] F Liu F Xing and L Yang ldquoRobust muscle cell segmentationusing region selection with dynamic programmingrdquo inProceedings of the ISBI pp 521ndash524 Beijing China April2014

[22] E Van Aart N Sepasian A Jalba and A Vilanova ldquoCuda-accelerated geodesic ray-tracing for fiber trackingrdquo Journal ofBiomedical Imaging vol 2011 Article ID 698908 12 pages2011

[23] G C Kagadis C Kloukinas K Moore et al ldquoCloud com-puting in medical imagingrdquo Medical Physics vol 40 no 7article 070901

[24] L Yang X Qi F Xing T Kurc J Saltz and D J ForanldquoParallel content-based sub-image retrieval using hierarchicalsearchingrdquo Bioinformatics vol 30 no 7 pp 996ndash1002 2014

[25] A Krizhevsky I Sutskever and G E Hinton ldquoImagenetclassification with deep convolutional neural networksrdquo inProceedings of the Advances in Neural Information ProcessingSystems pp 1097ndash1105 Lake Tahoe NV USA December2012

[26] H-C Shin H R Roth M Gao et al ldquoDeep convolutionalneural networks for computer-aided detection CNN archi-tectures dataset characteristics and transfer learningrdquo IEEE

Journal of Healthcare Engineering 9


Data Availability

The data that support the findings of this study are available from the cooperative institution Muscle Miner, but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. Data are, however, available from the authors upon reasonable request and with the permission of the cooperative institution Muscle Miner.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank all study participants. This work was supported by the National Key R&D Program of China (grant no. 2017YFB1002504) and the National Natural Science Foundation of China (nos. 81727802 and 61701404).



