
Cardiac Disease Detection from Echocardiogram using Edge Filtered Scale-invariant Motion Features

Ritwik Kumar1, Fei Wang2, David Beymer2, Tanveer Syeda-Mahmood2

1 School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
2 IBM Almaden Research Center, San Jose, CA, USA

[email protected], {wangfe, beymer}@us.ibm.com, [email protected]

Abstract

Echocardiography provides important morphological and functional details of the heart which can be used for the diagnosis of various cardiac diseases. Most of the existing automatic cardiac disease recognition systems that use echocardiograms are either based on unreliable anatomical region detection (e.g. left ventricle) or require extensive manual labeling of training data, which renders such systems unscalable. In this paper we present a novel system for automatic cardiac disease detection from echocardiogram videos which overcomes these limitations and exploits cues from both cardiac structure and motion. In our framework, diseases are modeled using a configuration of novel salient features which are located at the scale-invariant points in the edge filtered motion magnitude images and are encoded using local spatial, textural and motion information. To demonstrate the effectiveness of this technique, we present experimental results for automatic cardiac Hypokinesia detection and show that our method outperforms the existing state-of-the-art method for this task.

1. Introduction

Echocardiography is one of the most widely used diagnostic tests for heart disease. It can provide a wealth of helpful information, including the size and shape of the heart, its pumping capacity and the location and extent of any damage to its tissues. It is especially useful for assessing diseases of the heart valves. However, current clinical practice requires manual intervention in both imaging and interpretation. The sonographer manually delineates major anatomical structures like the Left Ventricle (LV) and computes numerical quantities like ejection fraction from the images. This data is examined further by a cardiologist who makes the diagnosis based on the interpretation of the echocardiogram. Despite this extensive manual intervention, identifying some motion abnormalities such as Myocardial Infarction, where the myocardium contracts significantly less than the rest of the tissue, is difficult due to the characteristics of the ultrasound images. Therefore, tools that automate the disease discrimination process are desired.

Analyzing the spatio-temporal regional motion patterns of the heart is important for cardiac disease discrimination. Fig. 1 shows the echocardiographic frames of a normal patient and a Hypokinetic (a pathology) patient in the Apical 4 Chamber (A4C) view. The motion observed at a few locations on the left ventricle wall and the mitral valve leaflets for the two cases is also depicted. It is clear from the figure that the intensity information alone may not be sufficient to distinguish the normal from the diseased case unless the motion information is also incorporated in the diagnosis.

However, the disease recognition problem is complicated by the heart's non-rigid motion (see bottom row of Fig. 1). Furthermore, the poor imaging quality of 2D echo videos due to low contrast, speckle noise, and signal dropouts also causes problems in image interpretation. Most of the existing automatic cardiac disease recognition systems that use echocardiograms are either based on unreliable anatomical region detection (e.g. left ventricle) or require extensive manual labeling of the training data [2], which renders such systems unscalable. The task is further complicated by the fact that it is not clear how the motion information should be taken into account to accomplish the goal of disease detection. This is a critical problem since it is well-known in the pattern recognition community that the choice of feature representation can have a greater impact on performance than the selection of the top level classifier architecture.

To address this issue, we present a novel system for automatic cardiac disease detection from echocardiogram videos which overcomes these limitations by exploiting cues from both cardiac structure and motion. In our framework, diseases are modeled using a configuration of novel salient features which are located at the scale-invariant points in the edge filtered motion magnitude images and are encoded using local spatial, textural and kinetic information. Here we demonstrate the effectiveness of the feature-based approach to disease discrimination on patients with Hypokinesia, and we show that our method can produce better results than various state-of-the-art techniques.

Figure 1. (Top row) Echocardiographic frames from normal and Hypokinesia patients respectively (apical 4 chamber view); (bottom row) motions of the heart for normal and Hypokinesia patients. "o" marks the feature points extracted from the left ventricle wall and mitral valve leaflets. Note that although the intensity profiles of the two cases are similar, the motion patterns differ significantly between the two.

The rest of the paper is organized as follows. Section 2 provides a survey of existing techniques for cardiac disease detection. In Section 3 we describe our framework for feature detection and description. The training and testing algorithms for disease recognition are presented in Section 4. Section 5 presents experimental results and a comparison with state-of-the-art techniques. Finally, we conclude in Section 6.

2. Previous Methods

Here we briefly survey automatic methods for cardiac disease detection using echocardiogram videos. Since most such methods, including ours, require an estimation of cardiac motion, we begin with a survey of cardiac motion estimation methods.

The estimation of cardiac motion and deformation from echocardiograms has been a difficult problem due to the inherent characteristics of echo video. Existing methods can be roughly classified into two categories: intensity-based (direct) methods and feature-based (indirect) methods. The intensity-based methods include Optical Flow [1], Demons' algorithm [15] and spline-based [11] approaches. The estimates resulting from these approaches are often inconsistent with the actual observed motion in the echo videos for cardiac regions due to the low quality of the echo videos and non-smooth heart motion. Due to these issues, the majority of the existing approaches use feature-based methods, where the myocardial regions are first segmented [8, 14] and the motion is then recovered by aligning the segmented shapes. For example, Jacob et al. [8] propose a method to segment the myocardial region using deformable models, with motion recovered by aligning the segmented shapes. In [14], a Bayesian approach combined with a biomechanical model was used to recover left ventricular deformation. Their method has the advantage of accounting for the fiber directions in the left ventricle when estimating the cardiac motion. These methods, though dependent on an accurate segmentation of the myocardium walls, have the advantage of being imaging modality independent. But the dependence on obtaining an accurate segmentation remains a significant issue, as there still are no fully automated, robust and efficient echo region segmentation methods.

Figure 2. (Left to right) Echocardiogram intensity frame, corresponding motion magnitude image, and intensity and motion magnitude overlaid. Note that the bright spots in the motion magnitude image correspond to fast-moving anatomical features, such as valves, in the intensity image.

Recently, new techniques have been proposed to discriminate diseases by examining the motion features of the heart region. In [16], the average velocity curve is introduced for validating disease diagnosis through video similarity; it uses features from the entire heart region, restricting its use in characterizing region-specific diseases. Later work combined motion estimation using the Demons' algorithm with a graph-based region segmentation approach to improve disease discrimination [15]. Even so, the inaccuracies of region segmentation using a graph-theoretic approach and rough motion estimation using average velocities often lead to inaccuracies in disease discrimination.

More recently, Wang et al. [17] introduced a new motion estimation technique in which the registration of the entire sequence of echocardiographic frames is achieved in one shot. The derived average velocity curve features are then fed into an SVM for disease discrimination. This method assumes that all the extracted motion features are of the same scale, which may not be true in real applications. Beymer et al. [2] have proposed a learning based approach where they build statistical spatio-temporal disease models with the help of Active Shape Models. Here the disease label of a new cardiac echo video is recovered by fitting the video with each of the disease models. The main drawback of this method is that it depends on an accurate manual segmentation of the region of interest during the training stage, where bias can be introduced in the model building process.

Our method, on the other hand, does not require any human intervention during the training stage other than the knowledge of the disease labels. Automatic detection of scale invariant features on edge-filtered motion magnitude maps allows our method to be independent of the presence and/or segmentation of specific anatomical structures. Our framework facilitates a novel and seamless fusion of salient motion and structural information, which can be critical in disease discrimination, a capability which the above mentioned methods lack.

Compared to previous work, we demonstrate that our method achieves a higher disease detection rate and is more extensible. In particular, compared to the methods outlined in [17] and [2], our Hypokinesia recognition accuracy is higher. Built on a scalable framework, our system does not require an initial LV detection stage as in [14] or expensive manual labeling during training as in [2].

More generally, our paper makes an important contribution in its fusion of motion and intensity to form a discriminating "spatiotemporal" feature for the task of disease detection. As detailed in the following section, our features are unique both in their location and description. Feature locations are scale-invariant interest points in motion magnitude that are also close to intensity edges. Feature descriptions include position (x, y) and histograms of local motion and intensities. The utility of these features is borne out in combination with the Pyramid Matching Kernel (PMK) based SVM classifier proposed in [6].

Perhaps the method closest in spirit to our approach is the recent work in echo view recognition by Kumar et al. [9], which exploits similar cues from both cardiac structure and motion in echocardiogram videos, but for the different task of viewpoint recognition. In their work, each image from the echocardiogram video is represented by a set of novel salient features. These features are located at scale-invariant points in the edge-filtered motion magnitude images and are encoded using local spatial, textural and kinetic information. Our system, in contrast, tackles the completely different problem of disease recognition from echocardiogram videos. For the echo view recognition problem, it is quite possible for a trained eye to differentiate among views based on individual intensity images alone, which is not the case for disease recognition. For instance, in Fig. 1, physicians have to look at the whole video sequence to discern structural and motion abnormalities. For this reason, unlike [9], we use a non-uniform weighting scheme (weighting motion and intensity features differently) for the feature vector during the dictionary creation stage (in PMK), which is another key difference between [9] and the proposed method.

Figure 3. (Top-left) Echocardiogram intensity image. (Top-right) Motion magnitude image in red with detected features in green. (Bottom-left) Intensity edge map in red. (Bottom-right) Edge filtered motion features in green. The green dots correspond to feature locations which are significant in terms of both the cardiac motion and the cardiac structure.

3. Edge-filtered Motion Features

Since cardiac diseases can be characterized by both structural and motion abnormalities, we seek an automatic disease discrimination system that accounts for both of these components. This information can be obtained from the structural, textural and motion details embedded in echocardiogram videos. Further, we do not want our system to be contingent upon the detection and segmentation of specific anatomical structures, as such systems are often unreliable. Any requirement of manual intervention in the training phase is also undesirable, as this hampers the scalability of the system. To fulfill these requirements, in the following subsections we present a scalable system which models diseased and normal hearts using a configuration of salient features.

3.1. Preprocessing

Though the classifier we intend to use for the final categorization is immune to slight misalignment in the input images, we can improve the results by correcting gross misalignment in the echocardiogram image sectors as a preprocessing step. We accomplish this using a two-step process: we first extract the image sectors which contain the actual region of interest, either manually or by using automated methods [13], and then use the three extreme points of the extracted sectors to recover the affine transformation needed to align all the sectors to a reference sector. Note that three points are sufficient to recover the affine transformation matrix.
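As an illustration, a minimal sketch of this alignment step using OpenCV is given below. It assumes the three sector extreme points have already been located (the extraction step itself, manual or via [13], is not shown); the function and variable names are ours, not the paper's.

```python
import cv2
import numpy as np

def align_sector(image, sector_pts, ref_pts, out_size):
    """Warp one echocardiogram frame so that the three extreme points of its
    image sector coincide with those of a chosen reference sector.

    sector_pts, ref_pts: 3x2 arrays of (x, y) points; out_size: (width, height).
    Three point correspondences determine a 2D affine transform exactly."""
    M = cv2.getAffineTransform(np.float32(sector_pts), np.float32(ref_pts))
    return cv2.warpAffine(image, M, out_size)
```

All frames of all videos would be warped to the same reference sector in this way before feature extraction.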

3.2. Feature Localization

We seek features that are salient from the point of view of both motion and structure. To get a handle on the motion features, we analyze the optical flow information in the echocardiogram videos. We obtain the optical flow using the Demons-like algorithm of [15], which provides a flow vector at each frame pixel. This flow can be further broken down into scalar motion magnitude and motion phase images. Here we make the important observation that, in practice, the motion phase is more sensitive to image transformations (rotation, translation, etc.) than the motion magnitude. Hence, we have chosen to work with motion magnitude images as they provide a more stable domain for salient feature hunting.
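The sketch below illustrates this step with OpenCV's Farneback dense flow used purely as a stand-in for the Demons-like algorithm of [15], which we do not reproduce here; only the scalar magnitude is retained, as discussed above.

```python
import cv2
import numpy as np

def motion_magnitude(prev_gray, next_gray):
    """Dense motion magnitude between two consecutive grayscale echo frames.
    Farneback flow is an illustrative substitute for the Demons-like flow of
    [15]; the parameter values below are generic defaults, not the paper's."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2,
                                        flags=0)
    # Discard the phase (angle) and keep the more stable per-pixel magnitude.
    return np.hypot(flow[..., 0], flow[..., 1])
```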

An example of a motion magnitude image is shown in Fig. 2, where it can be noted that the scalar image is effective in capturing the regions with large motions, e.g. heart valves, which can also be instrumental in disease detection. Since echocardiogram images are characterized by high noise and low contrast, these artifacts also make their way through to the motion magnitude images. This can cause spurious features to appear when some automatic feature detection scheme is used.

To get around this problem, we invoke the structural information present in the echocardiogram images. The motion in the magnitude image is meaningful only when it corresponds to some underlying anatomical structure, and a good estimate of that structure can be obtained from an edge map of the intensity image. Thus we incorporate the structure and motion information together by retaining only those segments of the motion magnitude image which correspond to (lie within a certain distance of) some intensity edge.

From these edge-filtered motion maps, salient feature points can be chosen in various ways. The object recognition literature provides numerous options such as space-time features [10], scale-invariant features [12], rectified and blurred optical flow [5], etc. For our implementation we have chosen to use the scale-invariant features due to their simplicity. Note that a direct application of object recognition methods to echocardiogram images has been shown to be ineffectual [3], primarily due to the inherent low contrast and high noise. The novelty of our features lies in the fact that they seamlessly combine the motion and the structure information present in the echocardiogram videos to locate the features. To the best of our knowledge, we are the first to exploit edge filtered motion magnitude images for obtaining disease discriminating features. These features have also been previously used for solving the echocardiogram view recognition problem [9].

In order to avoid spurious features due to artificial edges, we first detect the scale invariant features on the motion magnitude images and then retain only those which lie within a certain distance of some intensity edge. As an example, the locations of features detected using our scheme are shown on one of the echocardiogram frames in Fig. 3.
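A minimal sketch of this filtering step is shown below; it uses SIFT keypoint detection and Canny edges from OpenCV, with the edge-distance and Canny thresholds as illustrative assumptions rather than the paper's settings.

```python
import cv2
import numpy as np

def edge_filtered_keypoints(intensity, magnitude, max_dist=15):
    """Detect scale-invariant keypoints on the motion magnitude image and keep
    only those lying within max_dist pixels of an intensity edge."""
    mag8 = cv2.normalize(magnitude, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    keypoints = cv2.SIFT_create().detect(mag8, None)
    edges = cv2.Canny(intensity, 50, 150)            # intensity: 8-bit grayscale
    # Distance from every pixel to the nearest edge pixel.
    dist_to_edge = cv2.distanceTransform(255 - edges, cv2.DIST_L2, 3)
    return [kp for kp in keypoints
            if dist_to_edge[int(round(kp.pt[1])), int(round(kp.pt[0]))] <= max_dist]
```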

3.3. Feature Description

Once the features have been located, we encode each of them using the following three pieces of information:

• The location of the feature in image coordinates (x, y). This is useful since these locations correspond to both significant motion and structure.

• The histogram of the motion magnitude values in a window around the feature location. This encodes the motion signature of the feature, e.g. the points corresponding to cardiac valves would have a histogram biased towards higher values.

• The histogram of the intensity values in a window around the feature location. This encodes the structural information at the feature location.

We compute the above three quantities and string them all into a single vector at each feature location. The set of all such feature vectors from a frame forms its signature, which is used as input to our classifier.

In the scale-invariant feature transform (SIFT) method [12], a feature description based on oriented gradient histograms is proposed. However, since local gradients are very noisy in echocardiogram images, we use the above feature description instead.

Since the main focus of this paper is to discriminate between motion-related diseases and controls, we have explicitly defined the importance of motion features relative to intensity features (location features are always assigned unit weight) using a weighting scheme. Specifically, we treat the ratio of the weights for motion histograms and intensity histograms as another unknown parameter that is explicitly estimated during the training process.
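The sketch below assembles one such descriptor. The 15 x 15 window and 15-bin histograms follow the settings reported in Section 5, while the histogram value ranges, the lack of border handling, and applying the motion/intensity weight ratio alpha directly to the histogram block are our simplifying assumptions.

```python
import numpy as np

def describe_feature(intensity, magnitude, kp, nh=15, bins=15, alpha=1.0):
    """One feature vector: (x, y) location (unit weight), motion-magnitude
    histogram weighted by alpha, and intensity histogram, over an nh x nh window."""
    x, y = int(round(kp.pt[0])), int(round(kp.pt[1]))
    h = nh // 2
    win_int = intensity[y - h:y + h + 1, x - h:x + h + 1]
    win_mag = magnitude[y - h:y + h + 1, x - h:x + h + 1]
    hist_int, _ = np.histogram(win_int, bins=bins, range=(0, 256), density=True)
    hist_mag, _ = np.histogram(win_mag, bins=bins,
                               range=(0, magnitude.max() + 1e-6), density=True)
    # Location keeps unit weight; alpha sets the motion-vs-intensity importance.
    return np.concatenate([[x, y], alpha * hist_mag, hist_int])
```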

4. Disease Discrimination

Now that we have defined a way to represent each video frame by a set of feature vectors, we turn our attention to modeling diseases and discriminating them from the control cases. Foremost, unlike existing methods, our disease model does not depend on any key frame selection process. Instead, given a labeled set of echocardiogram video sequences, we automatically detect a heart cycle synchronized at the R-wave peak and extract n equally spaced frames from the heart cycle. We then detect and encode the features of each frame using the novel scheme described in the previous section.
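A small sketch of the frame selection step is given below, assuming the R-wave peak frame indices have already been located from the ECG trace (peak detection itself is not shown); the names are illustrative.

```python
import numpy as np

def sample_cycle_frames(r_peaks, n=21):
    """Return n equally spaced frame indices spanning one heart cycle,
    given the frame indices of two consecutive R-wave peaks."""
    start, end = r_peaks[0], r_peaks[1]
    return np.linspace(start, end, n).round().astype(int)
```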

In a preliminary training stage, the labeled set of all frames is provided to a Pyramid Matching Kernel (PMK) [6] based Support Vector Machine (SVM). The feature extraction and training process is summarized in Algorithm 1. In the testing stage, we again detect a heart cycle and extract n frames from it as before. Features are detected and encoded as described in the previous section, and using the PMK based SVM, each frame is classified as diseased or normal. Once all the frames have been labeled, we count the number of frames labeled diseased versus those labeled normal, and the category with the higher number of votes is chosen as the label of the input video sequence. This testing process is summarized in Algorithm 2.

The voting based scheme has the advantage of being robust to outlier frames which may corrupt the classification if the whole video sequence is used as a unit. Note that our features do account for the fact that the frames come from a video sequence, since they encode motion information derived from multiple frames.
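The per-video decision thus reduces to a majority vote over per-frame predictions (Algorithm 2). A minimal sketch follows, with the PMK-based SVM abstracted behind a generic frame classifier callable, which is a placeholder rather than the paper's implementation.

```python
from collections import Counter

def classify_video(frame_feature_sets, classify_frame):
    """Label a whole echo video by majority vote over its sampled frames.
    classify_frame maps one frame's feature set to a class label
    (here it stands in for the PMK-based SVM of Section 4)."""
    votes = Counter(classify_frame(fs) for fs in frame_feature_sets)
    return votes.most_common(1)[0][0]
```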


Algorithm 1: Disease Discrimination Training
Input: Labeled training echocardiogram videos: trainSet; Weighting factor: α; Neighborhood size: nh; Number of frames: n
Output: SVM parameters: M; Feature dictionary: D

1   Pick a reference frame and detect its 3 anchor points
2   foreach α ∈ [0.1, 10] do
3       foreach video V ∈ trainSet do
4           F_V = {n equidistant frames ∈ V}
5           foreach frame f ∈ F_V do
6               Feature set for f: FS_V^f = {}
7               Extract region of interest (ROI)
8               Detect the 3 anchor points
9               Compute affine matrix w.r.t. the reference frame
10              Apply the affine matrix and align the ROI
11              M_V^f = optical flow w.r.t. the next frame in the sequence [15]
12              E_V^f = edge map of f
13              Mag = ||M_V^f||_2  (motion magnitude)
14              FT = {scale invariant features in Mag [12]}
15              foreach feature i ∈ FT do
16                  Loc = (x, y) coordinate of i
17                  if Loc ∈ nh × nh neighborhood of some edge ∈ E_V^f then
18                      THist = histogram of the nh × nh intensity neighborhood of i
19                      MHist = histogram of the nh × nh motion magnitude neighborhood of i
20                      FTVec_i = concat(Loc, THist, MHist)
21                      FS_V^f = FS_V^f ∪ {FTVec_i}
22      Learn dictionary D from ∪_V ∪_f FS_V^f [7] using the weight α
23      Learn SVM parameters M using FS_V^f, D, and PMK [7], and store them indexed by α
24  Pick the SVM parameters corresponding to the α which achieves the best training result

Algorithm 2: Disease Discrimination Testing
Input: Learnt SVM model: M; Test echocardiogram video: testVideo; Dictionary: D; Neighborhood size: nh; Number of frames: n

1   F_testVideo = {n equidistant frames ∈ testVideo}
2   vote = zeros(numberOfClasses)
3   foreach frame f ∈ F_testVideo do
4       Compute FS_testVideo^f as described in Alg. 1
5       Classify f using FS_testVideo^f, SVM M and D
6       class(f) = class obtained by classification
7       vote(class(f)) = vote(class(f)) + 1
8   Classify testVideo as argmax_class {vote(class)}

5. Experiment: Hypokinesia Detection

In order to demonstrate the usefulness of the disease detection scheme outlined in this paper, we conducted experiments for automatic detection of the condition known as Hypokinesia, a reduced motion of the heart.

Our data was composed of complete echo exams recorded as continuous video. From these videos the diagnostically relevant viewpoint, Apical Four Chamber (A4C), was manually extracted for both the normal and the Hypokinesia patients. The videos were captured at 320 × 240 pixel size at 25 Hz, and the included ECG waveform was used for heart cycle detection. Due to the scarcity of labeled echocardiogram data in general, and in order to make a fair comparison with existing state-of-the-art results, we used the same setup as in [2] for our experiments, i.e. 16 video sequences for Hypokinesia patients and 5 videos for normal patients. We picked 21 frames per patient, and thus the experimental setup involved 336 frames for the diseased and 126 frames for the normal case.

As described earlier in Section 4, there are a few parameters that need to be set in both the disease discrimination training and testing steps of our system. Foremost is the number of frames per video to be used for classification. We have noticed that as the number of frames increases so does the recognition rate, but at the expense of computation time, so this parameter should be set based on the accuracy-efficiency trade-off. Next is the neighborhood size selection for edge filtering and for motion and texture histogramming. Here we have noticed that a neighborhood size of around 10% of the ROI (rectangle containing the image sector/fan) size provides the best result. This number is also used to set the number of bins in the histograms. Minor changes in this size do not have any significant impact on recognition rates. In our setup, the neighborhood size was set to 15 × 15 pixels and 15-bin histograms were used in feature encoding. Parameters of the scale invariant feature detector are set to give around 200 features per frame. The next parameter is the dictionary size used during the learning phase. We set it such that 5% of the total features are retained in the dictionary with random initialization. The PMK dictionary was set to have approximately 14000 features (see [7] for more details on PMK parameters). As in [2], we performed leave-one-out testing and repeated our classification 20 times with different initializations of the PMK dictionary. Note that we removed from the training set all the frames of the subject used as the test case.

We compared the performance of our algorithm to the existing state-of-the-art methods for disease recognition mentioned above. Since our classification method utilizes both the textural and motion information, it is expected to perform better than the same methods with textural features alone. Therefore, the first algorithm that we compare against applies PMK to the intensity features alone. This experiment also demonstrates how much of the final performance is contributed by the motion features. To evaluate the performance against other motion-based disease classification algorithms, we compared our approach to two other machine learning approaches, namely, an SVM based algorithm [17] and an Active Shape Model (ASM) based approach [2].

The obtained results are presented in Table 1. The columns show the correct classification rates for the cases with Hypokinesia, cases with a normal heart, all the cases, and the average classification rate for the two classes. Along the rows we present results from four different methods.

Method                                 Hypokinesia   Normal   All    Average
PMK [6] with only intensity features   31.2          40.0     33.3   35.6
EMBC 2008, ASM + Motion [2]            93.7          60.0     85.7   76.8
CinC 2008, SVM + Motion [17]           75.0          80.0     76.2   77.5
Our Method                             100.0         60.0     90.5   80.0

Table 1. Comparison of Hypokinesia detection rates (%) for different classification algorithms.

The following observations are made from this table:

1. Our method outperforms all other methods in terms of correct classification rates. As noted in [2], the low classification rate for the normal cases can be attributed to the limited training data. This is peculiar, as one would expect normal heart echocardiogram videos to be readily available. The reason is that we wanted to avoid mixing videos obtained from different echocardiogram machines and had only limited normal data from the machine that was used to obtain the diseased cases.

2. It can be noted that the PMK method, when used with just intensity features, performs significantly worse (by 57.2% in overall accuracy) than our method, where PMK is applied to our novel features. This demonstrates the importance of using motion histograms obtained at salient points in the echocardiogram frames for discriminating between the diseased and the normal cases.

6. Discussion and Conclusion

The method outlined in this paper, by virtue of being based on the feature-based classification paradigm, does not suffer from the drawbacks of other methods that are based on anatomical region segmentation [14, 8]. Since our features simultaneously capture the structural and the motion information in the echo videos, they can also be easily applied to cardiac pathologies other than Hypokinesia, e.g. mitral stenosis, aortic valve stenosis, etc., without requiring any change in our setup. In the future we would like to quantitatively evaluate the method's performance on other diseases via experiments along the lines described in this paper.

Another major advantage of the method presented here is that it is not dependent on the viewpoint from which data is obtained, as long as all the data is consistently obtained from the same viewpoint. This advantage does not carry over to techniques based on anatomical structures, since some anatomical structures are not visible from all the viewpoints. In fact, during a given patient exam, data is generally obtained from various viewpoints, and our method provides the possibility of simultaneously using the information in more than one viewpoint to obtain better results. This is also something we would like to explore in the future.

Compared to the appearance model based methods [2], our method does not require extensive labeling of the data, which can consume considerable time and effort. Appearance based methods, unlike our method, are also viewpoint specific. Our method also sidesteps the problem of key frame selection which is encountered in methods that use a single frame for analysis [4]. Our method aggregates relevant classification information from various frames across a video and hence is also robust against the presence of outlier frames. The scalable nature of our method can allow discrimination among multiple diseases simultaneously, which is something we wish to explore in the future.

Along with the advantages of our method mentioned above, it also has some limitations that we would like to address in future work. The quality of the selected features is critically dependent on the quality of the estimated motion. Since echo videos are known to be difficult for motion estimation, this dependence could be a possible limitation of our method. We would also like to evaluate our method on a larger data set without bias in the number of samples for any class. In conclusion, we have demonstrated the usefulness of feature based methods for cardiac disease detection using echocardiogram videos, which moves us closer to a true decision support system for cardiac disease diagnosis.

References

[1] S. Behbahani and K. Magholi. Evaluation optical-flow based methods for estimation of wall motions. In Second International Multi-Symposiums on Computer and Computational Sciences, pages 164-169. IEEE Computer Society, 2007.

[2] D. Beymer and T. Syeda-Mahmood. Cardiac disease recognition in echocardiograms using spatio-temporal statistical models. In IEEE Computer Society Workshop on Mathematical Methods in Biomedical Image Analysis (MMBIA), pages 1-8, 2008.

[3] D. Beymer, T. Syeda-Mahmood, and F. Wang. Exploiting spatio-temporal information for view recognition in cardiac echo videos. In IEEE Engineering in Medicine and Biology Society Conference (EMBC), pages 1-5, 2008.

[4] S. Ebadollahi, S. Chang, and H. Wu. Modeling the activity pattern of the constellation of cardiac chambers in echocardiogram videos. In Computer Vision Approaches to Medical Image Analysis (CVAMIA06), pages 202-213, 2006.

[5] A. A. Efros, A. C. Berg, G. Mori, and J. Malik. Recognizing action at a distance. In International Conference on Computer Vision (ICCV), 2003.

[6] K. Grauman and T. Darrell. The pyramid matching kernel: Discriminative classification with sets of image features. In International Conference on Computer Vision (ICCV), 2005.

[7] K. Grauman and T. Darrell. Approximate correspondences in high dimensions. In Neural Information Processing Systems (NIPS), 2006.

[8] G. Jacob, J. A. Noble, C. P. Behrenbruch, A. D. Kelion, and A. P. Banning. A shape-space based approach to tracking myocardial borders and quantifying regional left ventricular function applied in echocardiography. IEEE Trans. Med. Imaging, 21(3):226-238, 2002.

[9] R. Kumar, F. Wang, D. Beymer, and T. Syeda-Mahmood. Echocardiogram view classification using edge filtered scale-invariant motion features. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pages 121-124, Miami, US, 2009.

[10] I. Laptev. On space-time interest points. International Journal on Computer Vision (IJCV), 64(2-3):107-123, 2005.

[11] M. Ledesma-Carbayo, J. Kybic, M. Desco, A. Santos, M. Suhling, P. Hunziker, and M. Unser. Spatio-temporal nonrigid registration for ultrasound cardiac motion estimation. IEEE Transactions on Medical Imaging, 24(9):1113-1126, 2005.

[12] D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal on Computer Vision (IJCV), 60(2):91-110, 2004.

[13] M. Otey, J. Bi, S. Krishna, B. Rao, J. Stoeckel, A. S. Katz, J. Han, and S. Parthasarathy. Automatic view recognition for cardiac ultrasound images. In MICCAI: International Workshop on Computer Vision for Intravascular and Intracardiac Imaging, pages 187-194, 2006.

[14] X. Papademetris, A. J. Sinusas, D. P. Dione, and J. S. Duncan. Estimation of 3D left ventricular deformation from echocardiography. Medical Image Analysis, pages 17-28, 2001.

[15] T. Syeda-Mahmood, F. Wang, D. Beymer, M. London, and R. Reddy. Characterizing spatio-temporal patterns for disease discrimination in cardiac echo videos. In Medical Image Computing and Computer-Assisted Intervention (MICCAI), pages 261-269, 2007.

[16] T. F. Syeda-Mahmood and J. Yang. Characterizing normal and abnormal cardiac echo motion patterns. In Computers in Cardiology, volume 33, pages 725-728, 2006.

[17] F. Wang, T. Syeda-Mahmood, and D. Beymer. Spatio-temporal motion estimation for disease discrimination in cardiac echo videos. In Computers in Cardiology, pages 121-124, Bologna, Italy, 2008.
