Mapping forest changes using multi-temporal remote sensing images: BITE for accurate

trajectory extraction and CBEST for efficient clustering

By

Yanlei Chen

A dissertation submitted in partial satisfaction of the

requirements for the degree of

Doctor of Philosophy

in

Environmental Science, Policy and Management

in the

Graduate Division

of the

University of California, Berkeley

Committee in charge:

Professor Peng Gong, Chair

Professor Gregory Biging

Professor John Radke

Fall 2014

Abstract

Mapping forest changes using multi-temporal remote sensing images: BITE for accurate

trajectory extraction and CBEST for efficient clustering

By

Yanlei Chen

Doctor of Philosophy in Environmental Science, Policy and Management

University of California, Berkeley

Professor Peng Gong, Chair

We developed a semi-automatic algorithm named Berkeley Indices Trajectory Extractor

(BITE) to detect forest disturbances, especially slow-onset disturbances such as insect mortality,

from time series of Landsat 5 Thematic Mapper (TM) images. BITE is a streamlined process that

features trajectory extraction and interpretation of multiple spectral indices followed by an

integration of all indices. The algorithm was tested over Grand County in Colorado, located in

the Southern Rocky Mountains Ecoregion, where forests dominated by lodgepole pine have been

under mountain pine beetle attack since 2000. We produced a disturbance map using BITE with

an identification accuracy of 94.7% assessed from 602 validation sample pixels. The algorithm

shows its robustness in deriving forest disturbance type and timing in the presence of varying

atmospheric conditions, noise, pixel misregistration and residual cloud/snow cover in

the imagery. Outputs of the BITE algorithm could be used in studies designed to increase

understanding of the mechanisms of mountain pine beetle dispersal and tree mortality, as well as

other types of forest disturbances.

Large remote sensing datasets, which either cover large areas or have high spatial resolution, are

often a burden for information mining in scientific studies. Here, we present an approach that

conducts clustering after gray-level vector reduction. In this manner, the speed of clustering can

be considerably improved. The approach features applying eigenspace transformation to the

dataset followed by compressing the data in the eigenspace and storing them in coded matrices

and vectors. The clustering process takes advantage of the reduced size of the compressed data

and thus reduces computational complexity. We name this approach Clustering Based on Eigen

Space Transformation (CBEST). In our experiment with a subscene of Landsat Thematic

Mapper (TM) imagery, CBEST was found to be able to improve speed considerably over

conventional K-means as the volume of data to be clustered increases. We assessed information

loss and several other factors. In addition, we evaluated the effectiveness of CBEST in mapping

land cover/use with the same image, which was acquired over Guangzhou City, South China, and an

AVIRIS hyperspectral image over Tippecanoe County, Indiana. Using reference data, we

assessed the accuracies of both CBEST and conventional K-means and found that

CBEST was not negatively affected by information loss during compression in practice. We then

applied CBEST to map forest change from 1986 to 2011 for the entire state of California,

USA, with over 400 Landsat TM images. We discussed potential applications of the fast

clustering algorithm in dealing with large datasets in remote sensing studies.

We present an efficient approach to large-area mapping of forest changes from remote sensing data, based on the Clustering Based on Eigen Space Transformation (CBEST) algorithm. By analyzing 450 Landsat Thematic Mapper (TM) satellite images acquired from 1986 to 2011 at five-year intervals and covering the entire state of California, USA, we derived a forest change type map, a forest loss map and a forest gain map. Although California has a total land area of 99.6 million acres and the spatial resolution of Landsat TM is 30 m, the task took only 10 hours on a computer with a 2.8 GHz Intel i5 CPU and 8 GB of RAM. The overall accuracy of the forest cover map for 2011 was 92.9% ± 1.6%. We found that the estimated forest area changed from 28.20 ± 1.98 million acres in 1986 to 28.05 ± 1.98 million acres in 2011. In particular, our rough estimate indicates that each year California’s forest experienced a loss of 92 thousand acres and a recovery of 85 thousand acres, resulting in a net forest loss of seven thousand acres per year. In addition, during 1986-2011, around 12% of the forestland experienced changes, with about 4% each attributed to deforestation, afforestation, and deforestation followed by recovery. We concluded that the forestland in California had been managed in a sustainable manner over the 25 years, since no significant directional change was observed. Our approach yielded a tighter estimate of the true canopy coverage, with 29% of the land in California classified as forestland, compared with the figures of 33% and 40% reported by previous studies that had lower spatial resolution and shorter temporal coverage.
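As a quick cross-check of the net-change figures quoted above, the short sketch below recomputes the annual net loss in two ways: from the yearly loss and recovery rates, and from the difference between the 1986 and 2011 area estimates. The variable names are illustrative only; the numbers are taken from the paragraph above.

```python
# Annual rates quoted in the abstract (acres per year).
loss_per_year = 92_000
gain_per_year = 85_000
net_loss_from_rates = loss_per_year - gain_per_year  # 7000 acres per year

# Forest area estimates quoted in the abstract (acres).
area_1986 = 28.20e6
area_2011 = 28.05e6
uncertainty = 1.98e6
years = 2011 - 1986

# Net loss implied by the two area estimates, spread evenly over 25 years.
net_loss_from_areas = (area_1986 - area_2011) / years  # 6000.0 acres per year

# The total change is far smaller than the stated uncertainty of each estimate.
change_within_uncertainty = abs(area_1986 - area_2011) < uncertainty

print(net_loss_from_rates, net_loss_from_areas, change_within_uncertainty)
```

Both routes give a net loss of roughly six to seven thousand acres per year, and the 25-year difference of 0.15 million acres is well inside the ± 1.98 million acre uncertainty, which is in line with the statement that no significant directional change was observed.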

Table of Contents

LIST OF TABLE CAPTIONS
LIST OF FIGURE CAPTIONS
INTRODUCTION
ACKNOWLEDGEMENT

CHAPTER 1 BITE: AN ALGORITHM FOR MAPPING SLOW-ONSET FOREST DISTURBANCES CAUSED BY MOUNTAIN PINE BEETLES WITH LANDSAT IMAGE STACKS
  ABSTRACT
  1 INTRODUCTION
  2 METHODOLOGY
    2.1 STUDY AREA
    2.2 DISTURBANCE MAPPING PROCEDURE
      2.2.1 Data and Preprocessing
      2.2.2 Spectral Indices
      2.2.3 Trajectory Extraction
      2.2.4 Trajectory Interpretation
      2.2.5 Post-classification Process
  3 RESULTS AND DISCUSSION
    3.1 EVALUATION OF THE CLASSIFIERS AND THE INDICES
    3.2 ACCURACY ASSESSMENT
    3.3 THE DISTURBANCE MAP PRODUCT
  4 CONCLUSION AND PERSPECTIVES

CHAPTER 2 CLUSTERING BASED ON EIGEN SPACE TRANSFORMATION – CBEST FOR EFFICIENT CLASSIFICATION
  ABSTRACT
  1 INTRODUCTION
  2 BACKGROUND
    2.1 K-MEANS
    2.2 EIGEN-BASED GRAY-LEVEL VECTOR REDUCTION
  3 CLUSTERING BASED ON EIGEN SPACE TRANSFORMATION
    3.1 COMPRESSION
    3.2 CLUSTERING
    3.3 FURTHER IMPROVEMENT
      3.3.1 Mean vectors
      3.3.2 Vacant Eigenspace Partitions
      3.3.3 Boundary Optimization
  4 EXPERIMENTAL DESIGN
    4.1 EXPERIMENT DATA
    4.2 PREPROCESSING
    4.3 METHODS
  5 RESULTS AND ANALYSIS
    5.1 EFFICIENCY & PERFORMANCE TEST
    5.2 APPLICATION EXPERIMENTS
      5.2.1 Landsat TM Image
      5.2.2 AVIRIS Hyperspectral Image
  6 DISCUSSIONS

CHAPTER 3 APPLICATIONS OF CBEST IN EFFICIENTLY MAPPING FOREST CHANGES IN THE STATE OF CALIFORNIA FROM 1986-2011
  ABSTRACT
  1 INTRODUCTION
  2 METHODOLOGY
    2.1 STUDY AREA
    2.2 DATA
    2.3 PROCEDURE
      2.3.1 Data Preparation
      2.3.2 Initial Clustering
      2.3.3 Integrating Cluster Centers
      2.3.4 Probability Assigning
      2.3.5 Probability Trajectory Interpretation
      2.3.6 Post-processing
  3 RESULTS AND ANALYSIS
    3.1 INTERMEDIATE RESULTS
    3.2 FOREST CHANGE MAP AND ACCURACY ASSESSMENT
  4 DISCUSSIONS
  5 CONCLUSIONS

CHAPTER 4 CONCLUSIONS AND PERSPECTIVES
  1 SUMMARY OF THE RESULTS
  2 FUTURE PERSPECTIVES

REFERENCES

List of Table Captions

Table 1 Data acquisition dates and land percentage.
Table 2 The list of the spectral indices.
Table 3 Overall accuracies of the classification test results. ‘CV’ represents the cross-validation test on the training dataset. ‘Test’ represents the evaluation on the test dataset.
Table 4 Overall accuracies of the classification test of integration of multiple indices. The evaluation was done on the test dataset.
Table 5 Confusion matrix of the forest change type classification result. The evaluation was done on the test dataset.
Table 6 Conventional K-means Algorithm
Table 7 Eigen-Based Gray Level Vector Reduction
Table 8 CBEST Algorithm
Table 9 Description of the Indicators
Table 10 Classification System for Guangzhou
Table 11 Test Results w/respect to Data Size
Table 12 Test Results w/respect to k
Table 13 Test Results w/respect to N
Table 14 Assignment of Eigenspace Partitions for Eigen Axes
Table 15 Performance Test w/respect to the Max number of Iterations
Table 16 Confusion Matrices for Validation (Landsat)
Table 17 Summary of Classification Results (Landsat)
Table 18 Summary of Class Results (AVIRIS)
Table 19 Verification classes and corresponding probability weights
Table 20 Elapsed time for the clustering process. Each result was selected as the lowest within-cluster sum of squares from 5 runs. 1986N10 means the mosaicked image in 1986 with projection of UTM Zone 10 North.
Table 21 Means and standard deviations of forest probabilities calculated for the clusters
Table 22 The error matrix of samples with four classes from validation labels and two classes from the forest cover in 2011
Table 23 Error matrix of accuracies for the forest cover in 2011

List of Figure Captions

Figure 1 Study Area: Grand County, CO, USA.
Figure 2 Flowchart of processing steps used in the BITE algorithm.
Figure 3 Example of an NDVI time series for 1 disturbed pixel, and intermediate results of processing steps in the time series. These processing steps include 1) Inter-year value selection (a)-(b); 2) Noise removal (b)-(c); 3) Segmentation (c)-(d).
Figure 4 Example of the intermediate processing steps producing segments of the entire NDVI trajectory for 1 pixel (Segmentation Process).
Figure 5 Outputs of the BITE algorithm, including starting year of (left) slow-onset disturbances and (right) rapid-onset disturbances.
Figure 6 Area affected by different disturbance types for 2001-2009.
Figure 7 Enlarged view of the BITE output showing the starting year of slow-onset disturbances.
Figure 8 Illustrative Comparisons between CBEST and K-means
Figure 9 Test Area: Guangzhou, China
Figure 10 Experiment Flow Chart
Figure 11 Speed Comparison w/respect to Data Size. (a) Elapsed Time Comparison; (b) Elapsed Time ratio (how many times faster); (c) ETI Comparison; (d) ETI ratio
Figure 12 Efficiency w/respect to k. (a) Elapsed Time Comparison; (b) Elapsed Time ratio (how many times faster); (c) ETI Comparison; (d) ETI ratio; (e) Rescaled Within-Cluster Sum of Square average; (f) Rescaled Within-Cluster Sum of Square Best/Worst Case.
Figure 13 Efficiency w/respect to N. (a) Elapsed Time Comparison; (b) Elapsed Time ratio (how many times faster); (c) ETI Comparison; (d) ETI ratio; (e) Within-Cluster Sum of Squares Comparison; (f) Within-Cluster Sum of Squares Limited by various max numbers of Iterations.
Figure 14 Scatterplot of Ground Truth
Figure 15 Land Cover/Use Map derived by K-means and CBEST in Guangzhou
Figure 16 Validation Samples as Ground Reference in Guangzhou
Figure 17 Mapping Results in Tippecanoe County
Figure 18 California: Study Area and Landsat TM scenes. Since the study area is in the northern hemisphere, the UTM is of North Zone.
Figure 19 Flowchart of the Procedure to map forest changes in California
Figure 20 CBEST software interface. The initial clustering was implemented under the configuration in this figure.
Figure 21 Graphic demonstration of probability trajectory interpretation. (a) A typical forest loss pixel with elaborations on the rules for automatic determination of forest loss; (b) Non-forest, all points fall within the bounds; (c) Forest; (d) Forest Gain detected in 2006.
Figure 22 Post-clustering result in year 2011 and stratified samples
Figure 23 California Forest Change Maps 1986-2011. Left: Change Type Map; Upper right: Forest loss characterized by years; Lower right: Forest gain/recovery characterized by years.
Figure 24 Estimated Forest Area by Mapping Years
Figure 25 Proportions of forest change type in California for 1986-2011
Figure 26 Local views of selected places in the forest change map. The four images at the bottom of the figure show the detected changes using historical aerial photographs from 1988 and 1993 in comparison with a recently acquired high-resolution image. Orange circles indicate a regenerated forest patch after early removal, while red circles encompass a clearcutting area.
Figure 27 An example of how scale affects the classified area. Suppose each smallest cell unit is 30 m by 30 m in size; there are then 20 forest cells, or 18,000 m² of forest area. Using 120 m by 120 m cells, there are 3 forest cells, or 43,200 m². Using the entire 240 m by 240 m scene as one cell, the area is classified as a single forest patch with an area of 57,600 m².

Introduction

Forestland is commonly defined as land that is at least one acre in area and has at least 10% of its area stocked with trees of any size, or that previously had such tree cover and is not currently developed for non-forest use (Helms, 1998). The Resources Planning Act (RPA) Assessment (USDA Forest Service, 2012) additionally requires a minimum width of 120 feet (37 meters). It also includes transition zones with 10% tree cover and excludes lands predominantly under agricultural and urban land use. Forests, when properly managed, are known to be a major carbon sink that can mitigate climate change. In the United States, forest growth and afforestation offset approximately 13 percent of the Nation’s fossil fuel CO2 production in 2012 (Vose et al., 2012). Forests have long been recognized for their economic, social and ecological values. Commercial forest (timberland) provides valuable wood products, while reserved forest is preserved for recreation, aesthetics, wildlife, biodiversity, etc. The importance of sustainable forest management, which aims to conserve forests for the benefit of future generations, is increasingly acknowledged by the public. Therefore, it is crucial to monitor forest changes and to estimate deforestation in order to track carbon stocks and fluxes (Running, 2008), as well as to support decision making for better forest management for the benefit of society. Moreover, monitoring deforestation and regeneration events over time is also important because the natural and human-induced disturbances that cause deforestation are becoming more frequent under climate change (Overpeck et al., 1990; Westerling et al., 2006). Natural disturbances include hurricanes, earthquakes, wildfires, increased temperature, drought, pathogens and insect attacks (Soja et al., 2007; Kurz et al., 2008; Westerling and Bryant, 2008). Human-induced disturbances include logging, clear-cutting and prescribed fire. The detection of these disturbances and land use changes provides evidence for scientists and policy makers to study the implications of such changes and to project future trends. In particular, slow-onset forest disturbances, which are commonly caused by insects and pathogens, constitute a significant source of long-term carbon dioxide emissions to the atmosphere through decomposition of dead organic matter, contributing to climate warming (Metz, 2001; Kurz et al., 2008; Maness et al., 2013). Currently, there are two major challenges for forest change mapping. First, there is a lack of reliable approaches for detecting slow-onset disturbances in space and time and for distinguishing them from rapid-onset disturbances. Second, there is a lack of efficient algorithms for detecting forest changes over many years across a large area, such as a forest-rich state like California or even the entire United States. To address the first challenge, we focused on accurately tracking slow-onset disturbances with satellite images acquired over multiple years. For the second challenge, we developed an efficient automatic algorithm based on K-means, a widely used data mining algorithm, and applied it to large-area mapping over many years.

This dissertation consists of four chapters. In the first chapter, a reliable semi-automatic algorithm for distinguishing slow-onset from rapid-onset disturbances was developed, based on Landsat image stacks from 2001 to 2011 for Grand County, Colorado. The algorithm was named Berkeley Indices Trajectory Extractor (BITE). Temporal trajectories of multiple spectral indices were extracted and processed, then interpreted and integrated. An overall accuracy of 94.7% was achieved for the classification of disturbance types. The BITE product effectively maps the spatial and temporal spread of the mountain pine beetle outbreak that occurred in the study area during this period, supporting a better understanding of the mechanisms behind insect attack patterns. Furthermore, the algorithm should be suitable for detecting other disturbances that result in canopy loss, regardless of the speed of deforestation.
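To make the idea of interpreting an index trajectory concrete, the following is a minimal sketch and not the BITE procedure itself, which is described in Chapter 1. It computes NDVI from red and near-infrared reflectance and labels a yearly trajectory with a toy rule: a decline concentrated in one year is treated as rapid-onset, a decline spread over several years as slow-onset. The function names and both thresholds (`total_drop`, `abrupt_share`) are hypothetical values chosen for illustration.

```python
import numpy as np

def ndvi(red, nir):
    """Normalized Difference Vegetation Index from red and near-infrared reflectance."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    return (nir - red) / (nir + red)

def label_trajectory(values, total_drop=0.15, abrupt_share=0.6):
    """Toy rule (an assumption, not BITE): label one pixel's yearly index trajectory."""
    values = np.asarray(values, dtype=float)
    # Overall decline relative to the first year of the series.
    decline = values[0] - values.min()
    if decline < total_drop:
        return "undisturbed"
    # Largest single year-to-year decrease.
    largest_step = -np.diff(values).min()
    # If one year accounts for most of the decline, treat it as abrupt.
    if largest_step >= abrupt_share * decline:
        return "rapid-onset"
    return "slow-onset"

stable = [0.70, 0.71, 0.69, 0.70, 0.70, 0.69]
clearcut = [0.70, 0.71, 0.30, 0.32, 0.35, 0.38]
beetle = [0.70, 0.66, 0.61, 0.55, 0.50, 0.46]

for name, series in [("stable", stable), ("clearcut", clearcut), ("beetle", beetle)]:
    print(name, label_trajectory(series))
```

A single-index rule like this is sensitive to noise and residual cloud, which is one reason an approach such as BITE cleans the trajectories and integrates several indices before labeling a pixel.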

However, BITE had a high computational cost and was time-consuming when executed on an ordinary lab computer. In the second chapter, an efficient unsupervised algorithm was proposed that greatly lowers the computational cost of the conventional K-means algorithm. The algorithm was named Clustering Based on Eigen Space Transformation (CBEST). It compresses the data before the iterative calculations of the clustering process, so that the computational cost depends on a fixed, chosen number of compressed eigenspace partitions rather than on the original data size. Although information is lost during compression, the analysis and experiments on test images suggest that the loss can be ignored in practice, while the gain in computing time is substantial.
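The compress-then-cluster idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the CBEST implementation from Chapter 2: pixels are rotated onto the eigen axes of the band covariance matrix, each axis is cut into equal-width partitions (an assumption), every occupied cell is replaced by its center and a pixel count, and a count-weighted K-means is run on the cells only. The function names, `bins_per_axis`, and the synthetic three-band data are all hypothetical.

```python
import numpy as np

def compress_in_eigenspace(pixels, bins_per_axis=8):
    """Rotate pixels onto eigen axes and quantize them into grid cells."""
    centered = pixels - pixels.mean(axis=0)
    # Eigenvectors of the band covariance matrix define the eigenspace.
    _, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    scores = centered @ eigvecs
    # Equal-width partitions along every eigen axis (an assumption of this sketch).
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    width = np.where(hi > lo, (hi - lo) / bins_per_axis, 1.0)
    codes = np.minimum(((scores - lo) / width).astype(int), bins_per_axis - 1)
    # One entry per occupied cell, with the number of pixels it holds.
    cells, inverse, counts = np.unique(
        codes, axis=0, return_inverse=True, return_counts=True)
    centers = lo + (cells + 0.5) * width
    return centers, counts, inverse.reshape(-1)

def weighted_kmeans(points, weights, k, n_iter=20, seed=0):
    """Lloyd's K-means in which each point counts `weights` times."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = labels == j
            if members.any():
                centroids[j] = np.average(
                    points[members], axis=0, weights=weights[members])
    # Final assignment against the updated centroids.
    dist = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1), centroids

# Synthetic data: two well-separated spectral groups in three bands.
rng = np.random.default_rng(42)
pixels = np.vstack([rng.normal(0.0, 1.0, (5000, 3)),
                    rng.normal(6.0, 1.0, (5000, 3))])

centers, counts, inverse = compress_in_eigenspace(pixels, bins_per_axis=8)
cell_labels, _ = weighted_kmeans(centers, counts, k=2)
pixel_labels = cell_labels[inverse]  # map cell labels back to pixels

print(len(pixels), "pixels compressed to", len(centers), "cells")
```

Each K-means iteration here touches at most `bins_per_axis ** n_bands` cells regardless of how many pixels the image contains, which is the source of the speed-up; the price is that all pixels in a cell receive the same label.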

In the third chapter, the CBEST algorithm was applied to produce a forest change map for the entire state of California from 1986 to 2011 at five-year intervals. With a total of 450 Landsat Thematic Mapper images, the entire computation took approximately 10 hours on an ordinary lab computer. The overall accuracy of the forest cover in 2011 derived from the map was assessed as 92.9% ± 1.6%. This efficient approach allowed us to produce the first California forest change map with a spatial resolution of 30 meters and a temporal coverage of 25 years. Key statistics on California’s forestland were derived from the map. No significant directional change was observed. The differences between the map and previous forest inventories were discussed.

In the fourth chapter, the achievements of the first three chapters were summarized. The links between these chapters were explored, and a further integration of BITE and CBEST was envisioned to take advantage of both algorithms on a larger scale. The ultimate goal was to map forest changes efficiently and reliably for a relatively large administrative area or ecoregion. The potential and benefits of the study were also discussed.

Acknowledgement

I am grateful to Congcong Li for sharing the processed TM image and validation data used in

this article. We also thank David Landgrebe from Laboratory for Applications of Remote

Sensing, Purdue University for sharing the data online. This research has been partially

supported by USGS (grant number G12AC20085) and a national high technology program grant

from China (grant number 2009AA12200101).

Chapter 1 BITE: an algorithm for mapping slow-onset forest disturbances caused

by mountain pine beetles with Landsat image stacks

Abstract

We developed a semi-automatic algorithm named Berkeley Indices Trajectory Extractor

(BITE) to detect forest disturbances, especially slow-onset disturbances such as insect mortality,

from time series of Landsat 5 Thematic Mapper (TM) images. BITE is a streamlined process that

features trajectory extraction and interpretation of multiple spectral indices followed by an

integration of all indices. The algorithm was tested over Grand County in Colorado, located in

the Southern Rocky Mountains Ecoregion, where forests dominated by lodgepole pine have been

under mountain pine beetle attack since 2000. We produced a disturbance map using BITE with

an identification accuracy of 94.7% assessed from 602 validation sample pixels. The algorithm

shows its robustness in deriving forest disturbance type and timing with the presence of different

levels of atmospheric conditions, noises, pixel misregistration and residual cloud/snow cover in

the imagery. Outputs of the BITE algorithm could be used in studies designed to increase

understanding of the mechanisms of mountain pine beetle dispersal and tree mortality, as well as

other types of forest disturbances.

Keywords: Mountain Pine Beetle, Forest Disturbance, Slow-onset Disturbance, Landsat TM,

Remote Sensing

1 Introduction

Forest land is a major carbon sink. In the United States, forest growth and afforestation offset

approximately 13 percent of the Nation’s fossil fuel CO2 production in 2012 (Vose et al., 2012).

Monitoring forests and estimating deforestation as a result of disturbances is crucial for tracking

carbon stocks and fluxes in ecosystems (Running, 2008). Slow-onset forest disturbances, which

are commonly caused by insects and pathogens, comprise a significant source of long-term

carbon dioxide emissions to the atmosphere through decomposition of dead organic matter

leading to climate warming (Metz, 2001; Kurz et al. 2008; Maness et al., 2013). Currently, there

is a lack of reliable approaches for detecting slow-onset disturbances spatially and temporally as

well as distinguishing them from rapid-onset disturbances. Therefore, we were particularly

interested in accurately tracking slow-onset disturbances with satellite images acquired in

multiple years.

Mountain pine beetle (MPB, Dendroctonus ponderosae) is a native species to North America

and is known to cause large-scale mortality in coniferous forests. For the past decade, pine

forests in western North America have experienced extensive and severe mortality from MPB

outbreaks (Kurz et al., 2008; Honey-Marie et al., 2011). Besides concerns about the large amount of

carbon emissions from extensive tree mortality, the recurring MPB outbreaks in western North

America also have other socioeconomic impacts. These include: the wood industry is affected by

increased cost of timber due to damages from MPB; dead trees increase ground fuel loading and

thus wildfire vulnerability; recreational values of the landscape are negatively affected as dead

trees are not visually appealing; dead tree falls could cause injuries as a safety concern; and

wildlife composition is altered due to drastic habitat change as a result of extensive tree mortality

(Safranyik and Wilson, 2007). Therefore, challenges were raised for management of forest

sustainability in implementing effective measures to prevent and confine MPB dispersal.

Potential counter measures were investigated, including thinning (Mitchell et al., 1983), fire

suppression (Parker et al., 2006), and removing infested trees (Trzcinski and Reid, 2008). These

counter measures were limited by costs and scope, and therefore were not effective in dealing

with large-scale outbreaks.

MPB normally exists at endemic levels that cause limited mortality; however, under certain

circumstances these attacks can reach epidemic levels. Past studies about factors that could

trigger epidemic behavior of MPB focused on physiological interactions between pines, beetles

and climate (Cole and Amman, 1980; Raffa and Berryman, 1983; Safranyik and Whitney, 1985;

Bentz et al., 1991; Bentz et al., 1996). Recent technological advancements in Geospatial

Science, Geographic Information System (GIS) and Remote Sensing (RS) have facilitated the

understanding of patterns of MPB attacks throughout space and time (Logan et al., 1998;

Aukema et al., 2006; Chapman et al., 2012). Moreover, with climate variables (Iverson and

Prasad, 1998; Aukema et al., 2008), future spread trend of MPB attacks could be predicted under

different climate change scenarios leading to improved estimation of the long-term impacts of

MPB mortality on carbon stocks and fluxes. These studies provided significant support for

scientists and decision makers to develop strategies to predict and control dispersal of MPB in

forested landscapes. However, these studies were limited in scope because the spatial resolution

and temporal coverage were relatively low. A better understanding of the driving factors of MPB

dispersal relies on accurate spatial and temporal detection of MPB outbreaks, which can be

derived and interpreted from remotely-sensed data acquired at regular intervals over long time

periods.

Satellite imagery has been widely used to provide information about forest coverage. Nationwide

forest land cover mapping in the United States can be traced back to the 1990s (Loveland et al.,

1991; Zhu and Evans, 1994). Worldwide forest cover products such as global tropical forest

cover map (Mayaux et al., 1998), global land cover maps (Hansen et al., 2000; Gong et al., 2013)

along with global forest percentage map (Defries et al., 2000) were produced to reflect forest

conditions in relatively broad categories. With multi-temporal approaches which stack remotely

sensed images acquired from multiple dates, one is able to track land use and land cover (LULC)

changes over time. There is a variety of change detection techniques for remotely sensed data

(Singh, 1989; Coppin and Bauer, 1996; Mas, 1999; Hayes and Sader, 2001; Lu et al., 2004). For

forest change detection related to insects, some studies independently interpret single-date

satellite imagery to map disturbances (Keane et al., 1994; Bentz and Endreson, 2004).

These studies were limited to the detection of presence of disturbances but not their timing.

Multi-temporal approaches like time-series analysis of trajectories of vegetation indices were

developed (Goodwin et al., 2008; Goodwin et al., 2010). Similarly, Landsat-based detection of

Trends in Disturbance and Recovery (LandTrendr) was introduced using yearly Landsat time-

series stacks to extract spectral trajectories (Kennedy et al., 2010; Cohen et al., 2010).

Particularly, it was applied to detect MPB outbreaks (Meigs et al., 2011). There is another

algorithm featuring an automatic streamlined procedure for detecting forest changes using

Landsat time-series stacks, which is called Vegetation Change Tracker algorithm (VCT, Huang

et al., 2009; Huang et al., 2010). A recent global map depicting forest changes from 2000 to

2012 was produced using efficient cloud computing (Hansen et al., 2013). Generally, a reliable

forest change map is a crucial source of reference in gaining insights into mechanisms of spatial

patterns of forest changes.

Although aerial images have high ground resolution, they often lack temporal consistency in data

acquisition, and imagery from private sources is costly. As for satellite imagery, Landsat TM had a long

life of nearly 30 years of operation, with 16-day revisit intervals, and a medium spatial resolution

of 30 meters, and thus is suitable for mapping the MPB outbreaks over a long period at

appropriate spatial scale. For disturbance detection approaches that use Landsat imagery, there

are limitations when applied to mapping the outbreaks of MPB in the Southern Rocky Mountain

region. Goodwin et al. (2008; 2010) only explored one spectral index to detect disturbances with

cloud-free subsets and thresholds. LandTrendr also used a single spectral index at one time to

process the time-series and there are too many parameters for users to easily test for optimal

values. The VCT algorithm relied on thresholds to classify its forest index to low and high values.

A disturbance was determined based on consecutive high values, thus it is inaccurate in detecting

slow-onset disturbances. The global forest change map by Hansen et al. (2013) classified time-

series images for each continent which might not adapt to certain local regions. It used

supervised classification to determine forest percentage cover, forest loss and forest gain,

potentially leading to inaccuracies when determining the start year of slow-onset disturbances.

Furthermore, these studies are neither particularly adapted to areas with frequent cloud cover,

nor are they able to separate rapid-onset from slow-onset disturbances.

Generally, the major difficulties in disturbance mapping are: 1) inconsistent quality of data due

to various atmospheric conditions, cloud, shadows and snow; 2) separating slow and abrupt

changes; 3) adaptation to local environment. To overcome these difficulties, we developed a new

semi-automatic algorithm named Berkeley Indices Trajectory Extractor (BITE) to detect

forest disturbances, especially slow-onset disturbances, by integrating multiple spectral indices

from medium resolution remote sensing images. Landsat 5 TM time-series stacks with a spatial

resolution of 30 m were chosen for their temporal coverage over the last decade. Our algorithm

addresses the difficulties in forest change studies described above, namely frequent cloud

cover and the separation of slow-onset from rapid-onset disturbances. Moreover, our algorithm does not rely on

parameters/thresholds and thus reduces complications and potential subjective errors at the users’

end. BITE enables mapping forest disturbances in the regions where MPB outbreaks occur,

which supports the studies of analyzing and modelling the dispersal pattern of the MPB

outbreaks. Furthermore, we expect our algorithm to detect slow-onset and rapid-onset

disturbances without being limited to particular research contexts. In this paper, we present the

BITE algorithm in detail and initial validation results for one Landsat path/row.

2 Methodology

2.1 Study Area

Our study area was Grand County, Colorado (Figure 1), located in the Southern Rocky Mountain

region, which has been the epicenter of a widespread mountain pine beetle outbreak that started

around 2000 (Chapman et al., 2012). Grand County is one of the largest counties in the state of

Colorado, with a land area of 4,843 km² and a total population of 14,608 as of 2012 according to

U.S. Census Explorer data (American Community Survey Office, 2013). Towns such as Granby,

Grand Lake, and Winter Park attract a great number of tourists, particularly in the winter. The

majority of the county is covered by forests, dominated by coniferous species, primarily

lodgepole pine (Pinus contorta), Engelmann spruce (Picea engelmannii) and subalpine fir

(Abies lasiocarpa). Quaking aspen (Populus tremuloides) is the primary species in deciduous

forests that cover approximately 15% of the forested area in Grand County according to the

United States Forest Service (USFS) Forest Inventory and Analysis (FIA) (Ruefenacht et al.,

2008). The study area is partially covered by snow above a certain altitude throughout the spring,

fall, and winter, and there is a high frequency of precipitation during the summer. The local

climate leaves a narrow temporal window for acquiring satellite images with little cloud and

snow cover.

Figure 1 Study Area: Grand County, CO, USA.

2.2 Disturbance Mapping Procedure

BITE is a semi-automatic streamlined process that comprises the following steps: 1) Image

Preprocessing; 2) Calculating spectral indices and stacking indices along the timeline; 3)

Trajectory Extraction; 4) Trajectory Interpretation; and 5) Post-classification Processing. The

flow of the procedure is shown in Figure 2. The steps are explained in the following subsections.

Figure 2 Flowchart of processing steps used in the BITE algorithm.

2.2.1 Data and Preprocessing

The spatial frame of a single Landsat satellite scene at Path/Row of 34/32 includes the study

area. Landsat 5 TM images were acquired from 2001 to 2011 during leaf-on seasons (June to

September) to obtain information about vegetation during the peak of the growing season and to

avoid snow cover in the images. The acquired images were preprocessed at the L1T level, which

means systematic radiometric and geometric accuracy were ensured by using ground control

points and a digital elevation model for topographic correction. Images were processed into

surface reflectance using the Land Ecosystem Disturbance Adaptive Processing System

(LEDAPS, Masek et al., 2012). LEDAPS carries out radiance calibration, top-of-atmosphere

reflectance conversion and atmospheric correction. Subsequently, the Fmask algorithm (Zhu and

Woodcock, 2012) that takes advantage of object-based image analysis in cloud masking was

used to mask clouds and their shadows on the ground as well as snow cover and water bodies. A

total of 23 images were collected, each with less than 33% cloud and snow cover combined over

the study area. Therefore, the percentage of cloud- and snow-free land area was at least 66% of

the entire study area. Visual inspections were also carried out to remove unmasked snow cover

and bright scan lines caused by overexposure over large areas of clouds. The acquisition dates and

cloud-/snow-free area percentages of the images are listed in Table 1.

Table 1 Data acquisition dates and land percentage.

Acquisition Date    Cloud/Snow/Shadow/Water    Land
2001 June 28        20.7%                      79.3%
2001 July 30*       17.3%                      82.6%
2002 July 1         1.7%                       98.4%
2002 July 17*       11.3%                      88.7%
2003 July 4         6.7%                       93.2%
2003 Sep 22         3.5%                       96.5%
2004 July 6         10.6%                      89.4%
2005 Aug 26*        25.9%                      74.2%
2005 Sep 11         2.2%                       97.8%
2006 June 26*       11.2%                      88.9%
2006 July 28        2.4%                       97.6%
2006 Aug 29         3.8%                       96.2%
2007 June 29*       22.5%                      77.5%
2007 July 15        13.8%                      86.2%
2007 July 31*       21.3%                      78.7%
2007 Aug 16*        15.2%                      84.8%
2008 Aug 18         31.7%                      68.2%
2009 Aug 5          8.7%                       91.3%
2009 Aug 21*        1.8%                       98.2%
2010 June 21        6.2%                       93.8%
2010 Sep 25         3.1%                       96.9%
2011 June 8         22.6%                      77.4%
2011 Aug 27*        17.3%                      82.6%

(Starred images have thin bright scanlines)

In addition to the Landsat images, three aerial image mosaics from 2005, 2009 and 2011

collected by the National Agriculture Imagery Program (NAIP) covering the study area were

obtained for visual disturbance detections for validation purposes. NAIP imagery has a 1 meter

spatial resolution, providing sufficient details for visual interpretations at the level of individual

trees. The NAIP imagery was used to identify forest disturbances.

We also visited the study area in 2010, 2012 and 2013 and collected field samples, using handheld

Global Positioning System (GPS) units to record geographic coordinates. In 2010, 118 field

plots with GPS coordinates were visited, each plot covering a circular buffer 8

meters in radius, in which healthy trees, unhealthy trees, sick trees and dead trees were counted

along with many individual tree and stand measurements (Caldwell et al., 2013). In the two

subsequent field trips, a few plots sampled in 2010 were revisited and 76 new locations were

added, where surrounding tree status and environment elements were documented.

For supervised classification and accuracy assessment, a total of 301 pixels consisting of 111

persistent forest pixels, 98 slow-onset disturbance pixels and 92 rapid-onset disturbance pixels

were selected as the training set, while a total of 602 pixels consisting of 222 persistent forest

pixels, 195 slow-onset disturbance pixels and 185 rapid-onset disturbance pixels were selected as

the test set.

Field plots were not evenly distributed throughout the entire study area as some forested areas

are remote and difficult to access due to the limitation of cost and time. Thus most samples were

selected in NAIP aerial images. Both the training and test pixels were randomly selected within

the forest cover throughout the study area. The interpretation of the sample pixels incorporates

the field sample co-registered with GPS in 2010, 2012 and 2013, and visual identification of

NAIP image mosaics in 2005, 2009 and 2011. Particularly, locations of the field plots were

inspected in NAIP images to help identify infested trees from NAIP images based on the

differences between healthy and infested trees in color and texture. Generally, the crown canopy

of the infested trees is red or brown and is easily distinguished from that of healthy trees. Dead

trees lose all needles, leaving only grey branches and trunks. The criterion for determining a

slow-onset disturbance sample is either an increased number of infested/dead trees observed from

2005 to 2011, or at least 30% of trees attacked within roughly a 45 meter radius (3 × 3 pixels)

over the same time period.

2.2.2 Spectral Indices

A total of 9 spectral indices calculated from multiple band combinations for unmasked pixels

were used to test the response performance in the time-series trajectory with regard to forest

disturbances. They include NDVI (Tucker, 1979), NDSI (Shimamura et al., 2006), NBR (Key

and Benson, 2005), NDWI (McFeeters, 1996), NDMI (Wilson and Sader, 2002), EVI (Huete et

al., 1997) and Tasseled-cap indices (Crist and Cicone, 1984). The complete names and equations

for each index are listed in Table 2. Radiometric transformation takes advantage of combining

information from multiple spectral bands to a single spectral index. The spectral indices reflect

dynamic spectral features of certain cover types, forming time-series trajectories from which

either slow or rapid changes could be interpreted. Furthermore, misregistration in land cover

change studies can cause significant errors in accuracy (Dai and Khorram, 1998). To reduce such

errors, a 3 by 3 low pass filter was applied to the unmasked pixels of the spectral indices

(Gong et al., 1992). For each index, a time-series image was created by

stacking the sequence of 23 images from 2001 to 2011.

Table 2 The list of the spectral indices.

Spectral Index                            Equation
Normalized Difference Vegetation Index    NDVI = (B4 - B3)/(B4 + B3)
Normalized Difference Snow Index          NDSI = (B2 - B5)/(B2 + B5)
Normalized Burn Ratio                     NBR = (B4 - B7)/(B4 + B7)
Normalized Difference Water Index         NDWI = (B2 - B4)/(B2 + B4)
Normalized Difference Moisture Index      NDMI = (B4 - B5)/(B4 + B5)
Enhanced Vegetation Index                 EVI = 2.5(B4 - B3)/(B4 + 6 × B3 - 7.5 × B1 + 1)
Brightness (Tasseled Cap)                 Brightness = 0.3037B1 + 0.2793B2 + 0.4343B3 + 0.5585B4 + 0.5082B5 + 0.1863B7
Greenness (Tasseled Cap)                  Greenness = -0.2848B1 - 0.2435B2 - 0.5436B3 + 0.7243B4 + 0.084B5 - 0.18B7
Wetness (Tasseled Cap)                    Wetness = 0.1509B1 + 0.1793B2 + 0.3299B3 + 0.3406B4 - 0.7112B5 - 0.4572B7
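
The equations in Table 2 and the 3 by 3 low pass filter can be illustrated with a short Python sketch. It is not the code used in this study. The function names, the use of `None` for masked pixels, and the choice of a plain mean over the unmasked values in each 3 by 3 window are assumptions made for illustration; the tasseled-cap coefficients are copied from Table 2 as printed.

```python
# Tasseled-cap coefficients for bands B1, B2, B3, B4, B5, B7, copied from Table 2.
TASSELED_CAP = {
    "Brightness": (0.3037, 0.2793, 0.4343, 0.5585, 0.5082, 0.1863),
    "Greenness": (-0.2848, -0.2435, -0.5436, 0.7243, 0.084, -0.18),
    "Wetness": (0.1509, 0.1793, 0.3299, 0.3406, -0.7112, -0.4572),
}


def ratio(numerator, denominator):
    """Divide safely; return None when the denominator is zero."""
    return numerator / denominator if denominator != 0 else None


def spectral_indices(b1, b2, b3, b4, b5, b7):
    """Return the nine Table 2 indices for one pixel of surface reflectance."""
    bands = (b1, b2, b3, b4, b5, b7)
    indices = {
        "NDVI": ratio(b4 - b3, b4 + b3),
        "NDSI": ratio(b2 - b5, b2 + b5),
        "NBR": ratio(b4 - b7, b4 + b7),
        "NDWI": ratio(b2 - b4, b2 + b4),
        "NDMI": ratio(b4 - b5, b4 + b5),
        "EVI": ratio(2.5 * (b4 - b3), b4 + 6 * b3 - 7.5 * b1 + 1),
    }
    for name, weights in TASSELED_CAP.items():
        indices[name] = sum(w * b for w, b in zip(weights, bands))
    return indices


def low_pass_3x3(grid):
    """Average the unmasked values in each 3x3 window (assumed filter form).

    Masked pixels are None on input and stay None on output.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is None:
                continue
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if grid[rr][cc] is not None
            ]
            out[r][c] = sum(window) / len(window)
    return out
```

Calling `spectral_indices` once per unmasked pixel and per acquisition date, and then `low_pass_3x3` on each index image, would give the filtered index layers that are stacked along the timeline.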

2.2.3 Trajectory Extraction

Trajectories in this paper are defined as the individual line segments that represent unique trends

of spectral indices within the complete time series. The trajectory extraction process can be

further divided into three steps: Inter-year value selection, Noise removal, and Segmentation. A

diagram demonstrating the trajectory extraction process is presented in Figure 3 using NDVI as

an example. The figure presents the temporal dynamics of an MPB-attacked pixel, whose 23 index

values from different dates were converted into 3 segments indicating changes within

the trajectory.

Figure 3 Example of an NDVI time series for 1 disturbed pixel, and intermediate results of

processing steps in the time series. These processing steps include 1) Inter-year value selection

(a)-(b); 2) Noise removal (b)-(c); 3) Segmentation (c)-(d).

Inter-year Value Selection

Due to the various weather conditions in the southern Rocky Mountain region, at any given date

of satellite overpass it is unpredictable whether the image acquired over a location in the study

area is covered by major cloud/shadow/snow or not. For instance, there are 4 images selected in

2007 as opposed to only one selected in 2004 (Table 1), and the pixel location used in Figure

3(a) was masked for cloud coverage in 2004. The strategy in our algorithm is to select the most

suitable value in a year when multiple values exist, and to interpolate gaps between years when

no cloud/shadow/snow-free data are available.

The most suitable value for a year is approximated from its neighboring years, which takes a few

iterations until the values no longer change. In particular, the algorithm initializes each year with

its first value (the earliest value available in that year), then fills the gaps in the data using linear

interpolation. Head and tail gaps are filled with the first/last available value. The result is

a list of 11 values with only one value in each year. The list is then examined: for each

multiple-valued year, the value closest to the mean of the values from

the previous year and the subsequent year is selected. In the special case of heads and tails, only the

one available direction is examined (e.g. the value of the subsequent year for the head). The

gaps are then re-interpolated, and the examination and selection are repeated until no changes

occur. With the value selection and gap filling, the

result is a time-series value sequence with an identical temporal interval of 1 year (Figure 3(b)).
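
The selection and gap-filling loop described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the BITE implementation: the function names `fill_gaps` and `select_yearly_values`, the use of `None` for years without valid data, the iteration cap `max_iter`, and the choice to update all years from the series of the previous iteration are assumptions.

```python
def fill_gaps(values):
    """Linearly interpolate None gaps; extend the first/last valid value to head/tail."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no valid observation in the series")
    filled = list(values)
    for i in range(len(values)):
        if filled[i] is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:          # head gap: copy the first available value
            filled[i] = values[right]
        elif right is None:       # tail gap: copy the last available value
            filled[i] = values[left]
        else:                     # interior gap: linear interpolation
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def select_yearly_values(candidates, max_iter=50):
    """Pick one value per year.

    candidates holds one list per year with that year's valid index values
    in date order; an empty list marks a year without valid data.
    """
    # Initialise each year with its earliest valid value, then fill the gaps.
    chosen = [c[0] if c else None for c in candidates]
    series = fill_gaps(chosen)
    for _ in range(max_iter):     # max_iter is a safety cap (assumption)
        new_chosen = list(chosen)
        for year, values in enumerate(candidates):
            if len(values) < 2:
                continue
            # Heads and tails have a single neighbour; other years have two.
            neighbours = [series[k] for k in (year - 1, year + 1)
                          if 0 <= k < len(series)]
            if not neighbours:
                continue
            target = sum(neighbours) / len(neighbours)
            new_chosen[year] = min(values, key=lambda v: abs(v - target))
        if new_chosen == chosen:  # converged: nothing changed in this pass
            break
        chosen = new_chosen
        series = fill_gaps(chosen)
    return series
```

For example, with five years of candidates `[[0.80], [0.78, 0.60], [], [0.76], [0.50, 0.74]]`, the sketch keeps 0.78 for the second year, interpolates the empty third year, and replaces the initial 0.50 of the last year with 0.74 because it is closer to the preceding year.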

Noise Removal

Noise may remain in the images even after low pass filtering, for instance, residual

cloud cover/shadow/snow cover (Kennedy et al., 2010) and device errors including undetected

bright scan lines and misregistration. In addition, no relative calibration was done among the

images to reduce uncertainties in Landsat 5 TM calibration (Thorne et al., 1997). Therefore, the

noise removal technique proposed in BITE aimed to remove abnormal data fluctuations. Noise

rarely occurs within a pixel for two temporally consecutive scenes, which suggests that the noisy

pixels form a spike deviating from the distribution of values in the series (Figure 3(b)). We used

standard deviation as the threshold to determine whether a sample value differed

substantially from those of the previous year and the subsequent year, since standard deviation

reflects the variation of the values in the sequence such that relatively flat sequences are less

tolerant to abnormal spikes while disturbed sequences are more tolerant. For each iteration, only

the greatest value is removed as noise, followed by the recalculation of the standard deviation for

noise detection in the next iteration. We decided to replace the noise with the closer value of the

year before and year after but not the mean to avoid over-smoothing when the noise happens to

occur at a rapid-onset disturbance such that one neighboring value is quite close to the noise

value while the other is far from the noise value. Normally, the previous value and subsequent

value of the detected noise are similar and the noise filtering does not affect the trend.

Furthermore, it was sometimes difficult to determine whether the first or the second value was

noise when there was a significant difference between the two. The same circumstance could also

apply to the tail of the series (Figure 3(b)). We simplified this problem by inserting the mean of

the second and third samples ahead of the first sample. We applied a similar operation to the tail.

The appended samples were removed after the noise removal was complete. Two scenarios

motivate this solution. In the first scenario, the second and the third samples are similar and

both are distant from the first sample; the first

sample is then treated as noise, since a sample of similar value to the second and third samples

is inserted before the first sample, which turns the first sample into a spike. In the second scenario, the

second and the third samples differ largely in value, so it is difficult to tell which

sample is noise; our approach makes a ‘mild’ guess by averaging the second and third

samples, inserting the mean before the first sample, and then proceeding with the noise removal

iteration, which removes outstanding spikes and constrains the series to some extent.
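
The iterative noise removal can be sketched as below. The text does not state the exact spike test, so the rule used here is an assumption: a sample is flagged when it lies above or below both of its neighbours and the smaller of the two gaps exceeds the population standard deviation of the padded series. The function name `remove_noise` and the iteration cap are also assumptions. The padding of head and tail, the removal of one spike per iteration, and the replacement by the closer neighbour follow the description above.

```python
from statistics import pstdev


def remove_noise(series):
    """Iteratively replace the largest spike with its closer neighbour."""
    if len(series) < 3:
        return list(series)
    # Pad both ends so that the first and last samples can be tested as spikes.
    head = (series[1] + series[2]) / 2
    tail = (series[-2] + series[-3]) / 2
    work = [head] + list(series) + [tail]
    for _ in range(len(work) ** 2):    # generous cap; the loop normally exits early
        threshold = pstdev(work)       # recomputed after every replacement
        worst, worst_size = None, 0.0
        for i in range(1, len(work) - 1):
            d_prev = work[i] - work[i - 1]
            d_next = work[i] - work[i + 1]
            if d_prev * d_next <= 0:
                continue               # not a local peak or trough
            size = min(abs(d_prev), abs(d_next))
            if size > threshold and size > worst_size:
                worst, worst_size = i, size
        if worst is None:
            break
        prev_value, next_value = work[worst - 1], work[worst + 1]
        # Use the closer neighbour rather than the mean to avoid over-smoothing.
        if abs(work[worst] - prev_value) <= abs(work[worst] - next_value):
            work[worst] = prev_value
        else:
            work[worst] = next_value
    return work[1:-1]                  # drop the padded samples
```

Under this rule an abrupt, persistent drop (for example five values of 0.8 followed by six values of 0.4) is left untouched, because no sample in a step is a local peak or trough, whereas an isolated dip between two similar values is replaced.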

Segmentation

Segmentation is the process of simplifying the time-series into a sequence of one or more parts

(or segments) which represent different forest conditions (e.g. undisturbed, disturbed,

recovering). As the study period is 11 years, complicated change patterns are unlikely to occur

(e.g. multiple disturbance and recovery cycles). Therefore, the maximum number of segments

was set to 4. For multiple segments, an exhaustive search for all possible breakpoints between

segments was performed since there were a limited number of possible combinations for a

sequence of 11 points. Breakpoints were defined as the endpoints between segments and were

selected from the points of the noise-removed time-series. Segments were formed by linearly

connecting a set of breakpoints including the first and the last sample points. For a

time-series sequence a with n sample points and a set of breakpoints $1 < p_1 < p_2 < \dots < p_m < n$, where

m is the number of breakpoints (the number of segments minus one), the

trajectory function is

$$
\hat{a}(i) =
\begin{cases}
\dfrac{a(p_1) - a(1)}{p_1 - 1}\,(i - 1) + a(1), & i \in [1..p_1] \\[2ex]
\dfrac{a(p_j) - a(p_{j-1})}{p_j - p_{j-1}}\,(i - p_{j-1}) + a(p_{j-1}), & i \in [p_{j-1}+1..p_j],\ j = 2, \dots, m \\[2ex]
\dfrac{a(n) - a(p_m)}{n - p_m}\,(i - p_m) + a(p_m), & i \in [p_m+1..n]
\end{cases}
\tag{1}
$$

The best breakpoint combination is determined to be the one with the minimum sum of squares

$\sum_{i} \left(\hat{a}(i) - a(i)\right)^2$.

The coefficient of determination (R²) is a statistical measure of model performance, which is used

to evaluate the goodness of fit of the trajectory with regard to the time-series sequence after

noise removal. R² is non-decreasing with the number of segments because one can always

break any segment into two segments to decrease or maintain the sum of squares.

However, overfitting with more than the intended number of segments in the

trajectory introduces redundant information and even noise to the next stage of trajectory

interpretation. Therefore, in practice, the tradeoff between the number of segments and model fit is

evaluated using three thresholds as criteria to determine the optimal number of segments: an R²

threshold (RThres), a minimum R² (MinR) and an incremental R² threshold (IThres). If the R² of the

trajectory is greater than RThres, or if it is greater than MinR and the increment in R² compared

to the trajectory in the previous iteration with one less segment is smaller than IThres, then the

current trajectory is selected as the final trajectory. Empirically, after a few tests, we set RThres

to 0.95, MinR to 0.9 and IThres to 0.04. RThres can be regarded as a satisfactory level of model

fit, beyond which more segments could lead to overfitting. MinR defines a minimum goodness of

fit: once it is exceeded, an additional segment that increases R² by less than IThres

is considered a trivial contribution. In the example

demonstrated in Figure 4, the trajectory with 3 segments reached an R² above 0.95 and therefore

was the optimal selection, while the trajectory with 4 segments was redundant in its

number of segments.

Figure 4 Example of the intermediate processing steps producing segments of the entire

NDVI trajectory for 1 pixel (Segmentation Process).
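
The exhaustive breakpoint search and the R² stopping rule can be sketched as follows. For an 11-point series with at most 4 segments, the breakpoints are drawn from the 9 interior points, so there are at most C(9,0) + C(9,1) + C(9,2) + C(9,3) = 130 candidate trajectories, which makes a brute-force search practical. The function names and the 0-based indexing are assumptions of this sketch. The thresholds are the values reported in the text, and the fit connects the selected points with straight lines as in Equation (1).

```python
from itertools import combinations

R_THRES, MIN_R, I_THRES, MAX_SEGMENTS = 0.95, 0.90, 0.04, 4


def fit_trajectory(a, breaks):
    """Connect a[0], the breakpoints and a[-1] with straight lines (0-based indices)."""
    knots = [0, *breaks, len(a) - 1]
    fitted = [0.0] * len(a)
    for left, right in zip(knots, knots[1:]):
        slope = (a[right] - a[left]) / (right - left)
        for i in range(left, right + 1):
            fitted[i] = a[left] + slope * (i - left)
    return fitted


def sum_of_squares(a, fitted):
    return sum((v - f) ** 2 for v, f in zip(a, fitted))


def r_squared(a, fitted):
    mean = sum(a) / len(a)
    ss_tot = sum((v - mean) ** 2 for v in a)
    return 1.0 if ss_tot == 0 else 1.0 - sum_of_squares(a, fitted) / ss_tot


def best_breaks(a, n_segments):
    """Exhaustively search the interior points for the best n_segments - 1 breakpoints."""
    interior = range(1, len(a) - 1)
    best = min(combinations(interior, n_segments - 1),
               key=lambda b: sum_of_squares(a, fit_trajectory(a, b)))
    return list(best)


def segment(a):
    """Return (breakpoints, R2) of the first trajectory that meets the stopping rule."""
    if len(a) < 2:
        raise ValueError("need at least two samples")
    max_segments = min(MAX_SEGMENTS, len(a) - 1)
    previous_r2 = None
    for k in range(1, max_segments + 1):
        breaks = best_breaks(a, k)
        r2 = r_squared(a, fit_trajectory(a, breaks))
        small_gain = previous_r2 is not None and r2 - previous_r2 < I_THRES
        if r2 > R_THRES or (r2 > MIN_R and small_gain) or k == max_segments:
            return breaks, r2
        previous_r2 = r2
```

For a series that is flat, then declines steadily, then is flat again, the sketch stops at three segments with the two breakpoints at the start and the end of the decline.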

2.2.4 Trajectory Interpretation

In BITE, trajectory alone does not contribute to differentiating forest from non-forest. Therefore,

a prior forest cover map is required and National Land Cover Database 2001 (NLCD 2001,

Homer et al., 2007) was chosen for this purpose as it represents land-cover at the start of our

study time period with the identical spatial resolution of 30 meters. For each of the nine spectral

indices for which we generate trajectories, six features describing the trajectories were extracted:

minimum slope, maximum slope, minimum range change, maximum range change, minimum

value and maximum value. Minimum slope is a negative slope with the greatest absolute value of

the segments while maximum slope is the largest positive slope value. Minimum range change is

the lowest negative change of index value of the segments while maximum range change is the

largest positive change value. Minimum value and maximum value are the trajectory minimum

and maximum index values, respectively. With the six features for each of the spectral indices,

supervised classification algorithms are able to classify a trajectory into three disturbance classes:

Persistent Forest, Slow-onset Disturbance and Rapid-onset Disturbance. In this study,

Classification and Regression Tree (CART) and Support Vector Machine (SVM) were tested for

forest disturbance type classification (Hastie et al., 2009).
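
The six features can be computed from the endpoints of the fitted segments, as in the sketch below. The function name and the convention of returning 0 when a trajectory has no negative (or no positive) segment are assumptions, since the text does not specify that case. With nine indices, this gives 54 features per pixel for the classifiers.

```python
def trajectory_features(knot_years, knot_values):
    """Six features of one trajectory, given the years and values at its segment endpoints."""
    changes = [v1 - v0 for v0, v1 in zip(knot_values, knot_values[1:])]
    slopes = [change / (y1 - y0)
              for change, y0, y1 in zip(changes, knot_years, knot_years[1:])]
    return {
        # Appending 0.0 makes a missing negative/positive segment map to 0 (assumption).
        "min_slope": min(slopes + [0.0]),
        "max_slope": max(slopes + [0.0]),
        "min_range_change": min(changes + [0.0]),
        "max_range_change": max(changes + [0.0]),
        "min_value": min(knot_values),
        "max_value": max(knot_values),
    }
```

For a trajectory with endpoints in 2001, 2004, 2008 and 2011 and values 0.80, 0.80, 0.40 and 0.45, the minimum slope is about -0.1 per year and the minimum range change is about -0.4, both from the 2004 to 2008 segment.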

CART and SVM classification algorithms were implemented in Matlab 2013a to test the training

set using stratified 5-fold cross validation (Kohavi, 1995) and to validate with the test set. CART

is a built-in toolset in Matlab, while LIBSVM is an external tool package to carry out multi-class

classification with grid search of optimal parameters (Chang and Lin, 2011). CART divides the

feature space with thresholds into rectangular partitions with such splitting determined by

training data. The interpretability of CART classes is a major advantage of this algorithm. In the

trajectory interpretation process, CART generates classification rules for trajectory features for

slopes, range changes and boundary values. SVM with the Radial Basis Function (RBF) kernel

classifies data by nonlinearly mapping data into higher dimensional space, which is suitable for

splitting datasets in low dimensional space with complex decision boundaries (Hsu et al., 2003).

We used ‘accuracy’ as the indicator of how well the combinations of classifiers and indices

perform. Overall accuracy is defined as the number of correctly classified sample pixels divided

by the number of total sample pixels.

Two accuracies were assessed for each test: 1) 5-fold cross validation was evaluated on the

training set containing 301 pixels. In a 5-fold cross validation, we randomly divided the data into 5

equal-sized partitions, using 4 partitions to train the classifier and 1 partition for validation. We

repeated this so that each of the 5 partitions was used once for validation, and combined the results into one estimate of accuracy. 2) We

trained the classifier with the entire training set, and then used it to classify the test set which

consisted of 602 pixels independent of the training set. Particularly for SVM, the two parameters

γ and c of the RBF kernel were grid searched (Hsu et al., 2003) with 5-fold cross validations on

the training set. Both cross-validations on the training set and validation on the test set were

conducted to evaluate the performance of the classification algorithms and the combination of

spectral indices. Several of the best combinations of indices and classifiers were chosen to make

forest change type maps. When a pixel is detected as disturbance, the start year of the

disturbance is determined as the left endpoint of the trajectory segment with the minimum or the

maximum slope depending on whether the index decreases or increases with a disturbance. Nine

disturbance maps were created from the time-series images of 9 spectral indices independently.

Subsequently, an integrated disturbance map was created by a simple plurality vote among a

selection of these disturbance maps, in which each pixel is labeled with the category receiving

the most votes. If a pixel is labeled as disturbance, the corresponding start year of the disturbance

is determined as the median from the start years of the winning voters.
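The integration rule for a single pixel can be sketched as follows. This is a minimal illustration rather than the original Matlab implementation: the function name `integrate_votes` and the label strings are hypothetical, the tie-breaking order is the priority stated in Section 3.2, and the use of `median_low` for an even number of winning voters is an assumption, since the text only says "median".

```python
from collections import Counter
from statistics import median_low

# Tie-breaking order, taken from the priority stated in Section 3.2.
PRIORITY = ("rapid", "slow", "persistent")


def integrate_votes(labels, start_years):
    """Combine the per-index labels of one pixel by plurality vote.

    labels      -- one label per index map ("persistent", "slow" or "rapid")
    start_years -- start year per index map (None where no disturbance was found)
    Returns (winning_label, start_year or None).
    """
    counts = Counter(labels)
    top = max(counts.values())
    # Among the labels that share the top count, the most severe class wins.
    winner = next(lab for lab in PRIORITY if counts.get(lab, 0) == top)
    if winner == "persistent":
        return winner, None
    # Median of the start years reported by the voters that chose the winner.
    years = [y for lab, y in zip(labels, start_years) if lab == winner]
    # Assumption: median_low keeps the result an observed year for even counts.
    return winner, median_low(years)


label, year = integrate_votes(
    ["persistent", "rapid", "rapid", "slow", "slow"],
    [None, 2005, 2006, 2004, 2004],
)
print(label, year)
```

The example call reproduces the tie between 'rapid' and 'slow' discussed in Section 3.2; the start years in it are made up for illustration.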

2.2.5 Post-classification Process

To remove speckle noise and increase spatial connectivity of homogeneous patches, a minimum

mapping unit (MMU) filter is favored in a number of studies (Homer et al., 2007; Thomas et al.,

2010). The filter replaces connected patches containing fewer pixels than a minimum number with

the majority label of the surrounding pixels. The connectivity can be defined as 4 neighboring pixels or 8

neighboring pixels. Connecting 8 neighboring pixels in Landsat TM imagery preserves narrow

line features such as roads and rivers, but meanwhile retains residual speckles along patch

boundaries. When MPB attacks reach outbreak levels, it is suitable to use Landsat TM for its

medium spatial resolution of 30 m pixel size (Bentz and Endreson, 2004). Rapid-onset

disturbances such as clearcuts and fires are also identifiable as patches. To better depict the

spatial pattern of disturbances, we set the MMU to 0.5 ha (6 pixels) with an 8-neighbor

connectivity. Filtered pixels were refilled by repeatedly applying a 3 × 3 majority filter until no

further change can be made. Residual unfilled pixels were relabeled with their original classes.


This allows for a proportional expansion of multiple surrounding classes into the filtered patch.

Combining disturbance type and start year yields a total of 19 classes: one persistent forest

class, plus slow-onset and rapid-onset disturbances each combined with a start year from 2001 to

2009 (2 × 9 classes). Deriving a start year of 2010 or 2011 is not possible because the noise removal process

eliminates sudden value changes at the tail of the sequence, where noise cannot be reliably

distinguished from valid values.
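The MMU filtering and refilling steps described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the implementation used in the study: the function name `mmu_filter` is hypothetical, the map is a small list of lists, and the 3 × 3 majority filter is approximated by taking the most common label among the already-labeled neighbors of each removed pixel (ties resolved by first occurrence).

```python
from collections import Counter, deque

NEIGHBORS_8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def mmu_filter(grid, mmu=6):
    """Replace 8-connected patches smaller than `mmu` pixels by neighboring labels."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    seen = [[False] * cols for _ in range(rows)]
    unfilled = set()

    # 1) Find 8-connected patches of equal label; mark those below the MMU.
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            seen[r][c] = True
            patch, queue = [(r, c)], deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in NEIGHBORS_8:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols and not seen[ny][nx]
                            and grid[ny][nx] == grid[r][c]):
                        seen[ny][nx] = True
                        patch.append((ny, nx))
                        queue.append((ny, nx))
            if len(patch) < mmu:
                unfilled.update(patch)

    # 2) Refill: each removed pixel takes the majority label of its labeled
    #    3 x 3 neighbors; repeat until no further change can be made.
    changed = True
    while unfilled and changed:
        changed = False
        updates = {}
        for y, x in unfilled:
            votes = Counter(
                out[y + dy][x + dx]
                for dy, dx in NEIGHBORS_8
                if 0 <= y + dy < rows and 0 <= x + dx < cols
                and (y + dy, x + dx) not in unfilled
            )
            if votes:
                updates[(y, x)] = votes.most_common(1)[0][0]
        for (y, x), label in updates.items():
            out[y][x] = label
            unfilled.discard((y, x))
            changed = True

    # 3) Pixels that were never refilled still hold their original class.
    return out


demo = [[0] * 5 for _ in range(5)]
demo[2][2] = 1  # a single-pixel speckle inside a larger patch
print(mmu_filter(demo, mmu=6))
```

With the default `mmu=6`, a patch of six or more connected pixels is kept unchanged, which corresponds to the 0.5 ha threshold at 30 m resolution.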

3 Results and Discussion

3.1 Evaluation of the Classifiers and the Indices

Table 3 Overall accuracies of the classification test results. ‘CV’ represents the cross-

validation test on the training dataset. ‘Test’ represents the evaluation on the test dataset.

Spectral index   CART 5-CV   CART Test   SVM 5-CV   SVM Test
NDVI*            95.3%       88.2%       94.0%      90.9%
NDSI             89.7%       73.9%       98.0%      73.8%
NDWI             94.0%       85.4%       94.0%      90.4%
NDMI*            95.7%       87.5%       95.0%      92.4%
NBR*             95.7%       88.5%       95.3%      92.2%
EVI              90.7%       70.4%       84.1%      78.7%
Brightness*      95.0%       88.9%       93.4%      92.4%
Greenness        93.0%       83.6%       86.0%      84.2%
Wetness*         96.7%       89.4%       96.3%      91.5%

(* Indices selected for integration; these were shaded in the original table)

The parameters that achieved the highest cross validation accuracy were selected as the optimal

configuration and used to classify the test set. Neither CART nor SVM achieved accuracies higher

than 80% for the test set with NDSI or EVI (Table 3). Compared to other indices, these two

indices were less separable, as reflected in their lower accuracies in most tests with the training and test samples.

Although the SVM cross validation accuracy of 98.0% with NDSI surpassed others, only 73.8%

of the test samples were correctly classified. This could result from overfitting (Ng, 1997) when a small,

possibly biased training set is used to classify a larger test sample,

considering that the training set and test set were selected independently throughout the study area.

CART yielded a better average cross-validation accuracy than SVM (94.0% vs.

92.9%, respectively). In contrast, the average accuracy of CART was 84.0% compared to

87.4% from SVM in validating the test set. The relatively even accuracies between the cross

validation and the validation with the test set for SVM suggest that it is more robust to

selection bias of the samples. Therefore, although both cross validation and validation results were

taken into account when selecting the best spectral indices, the validation results were given

greater weight. After a thorough investigation of the performances, five

spectral indices were selected to test for integrated accuracies: Brightness,

NDMI, NBR, Wetness and NDVI.


Table 4 Overall accuracies of the classification test of integration of multiple indices. The

evaluation was done on the test dataset.

Classifier   Intg. of Brightness, NDMI and NBR   Intg. of Brightness, NDMI, NBR, Wetness and NDVI
CART         93.2%                               93.9%
SVM          94.4%                               94.7%
Both         93.7%                               94.4%

We integrated predicted labels of the test samples via a simple plurality vote, in which the

winner is the one with the most votes. The integrated prediction labels were validated with the

test set to derive accuracies. We tested the integration of 3 best indices and 5 best indices for

CART and SVM independently, and integrations of both CART and SVM (Table 4). With the

integration of all 5 indices using SVM, the accuracy (94.7%) surpassed that of any individual index on

the test set, as well as that of the combined SVM and CART vote with 10 voters (94.4%). Integrating CART

results into those of SVM actually decreased accuracy. Given that the training and test samples

were selected independently, the accuracy achieved on the test samples by the

integration of the 5 indices with SVM was less biased. It was thus considered the optimal

configuration for producing the final map.

3.2 Accuracy Assessment

The 602 test sample pixels were used as reference for accuracy assessment to generate a

confusion matrix from which the producer’s accuracies, user’s accuracies and an overall

accuracy for change type were calculated (Table 5). The overall accuracy was 94.7% with a

Kappa coefficient of 0.92, and the producer’s and user’s accuracies for persistent forest and

slow-onset disturbance were all above 90%. For the rapid-onset disturbance, the difference

between the user’s accuracy of 99.4% and the relatively low producer’s accuracy of 89.2%

implies an under-classification of this disturbance type. Similarly, most misclassified samples

occurred in the lower left triangle of the confusion matrix, indicating an underestimate of

disturbance severity, in agreement with the individual SVM runs. This underestimate was slightly mitigated by

the voting protocol: 3 of the 602 samples resulted in a draw, and in a draw the label is assigned in the priority order of rapid-

onset disturbance, slow-onset disturbance and persistent forest. For instance, suppose a pixel

is labeled ‘persistent’, ‘rapid’, ‘rapid’, ‘slow’ and ‘slow’ in the 5 maps, so that there

is a tie between ‘rapid’ and ‘slow’. In this case, the labeling follows the priority of ‘rapid’, ‘slow’

and ‘persistent’, and the pixel is labeled as ‘rapid’. In the disturbance map consisting of

2973340 forest pixels, the number of pixels labeled in draw situations is 55807, which covers an

area of 5023 ha. Consequently, there was an increased chance that forest changes were classified

into the more intensive type but a decreased chance of the reverse. Therefore, the integration


process allowed the automatic classification to be more sensitive to disturbance detections.

Although this trend of underestimation was observed, it was difficult to quantify such areas

accurately.

Table 5 Confusion matrix of the forest change type classification result. The evaluation was

done on the test dataset.

Reference \ Classified   Persistent   Slow-onset   Rapid-onset   Total   Prod. Acc.
Persistent               218          3            1             222     98.2%
Slow-onset               8            187          0             195     95.9%
Rapid-onset              6            14           165           185     89.2%
Total                    232          204          166           602
User Acc.                94.0%        91.7%        99.4%

OA = 94.7%, Kappa = 0.92
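The accuracy measures reported with Table 5 can be recomputed from the confusion matrix. The sketch below is only an illustration of the standard definitions (producer's accuracy per reference row, user's accuracy per classified column, overall accuracy and Cohen's Kappa); the variable names are chosen here and do not come from the original analysis.

```python
# Rows are reference classes, columns are classified classes (Table 5).
classes = ["Persistent", "Slow-onset", "Rapid-onset"]
matrix = [
    [218, 3, 1],
    [8, 187, 0],
    [6, 14, 165],
]

n = sum(sum(row) for row in matrix)
row_totals = [sum(row) for row in matrix]          # reference totals
col_totals = [sum(col) for col in zip(*matrix)]    # classified totals
diagonal = [matrix[i][i] for i in range(len(classes))]

producers = [d / t for d, t in zip(diagonal, row_totals)]
users = [d / t for d, t in zip(diagonal, col_totals)]
overall = sum(diagonal) / n

# Chance agreement expected from the marginal totals, then Cohen's Kappa.
expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
kappa = (overall - expected) / (1 - expected)

for name, p, u in zip(classes, producers, users):
    print(f"{name:12s} producer's {p:.1%}  user's {u:.1%}")
print(f"overall {overall:.1%}  kappa {kappa:.2f}")
```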

As for the sampling scheme, the selection of the samples was a haphazard process, preventing

the calculation of the inclusion probability of samples due to unintended selection bias. As a

result, the generalization of samples to estimate the accuracy of the entire map is limited

(Stehman and Czaplewski, 1998). However, with a relatively large sample size and dispersed

spatial coverage, we are confident that the impact of selection bias in this study was minimized.

Another factor that impacts the product accuracy is the accuracy of NLCD 2001 forest mask as

the input for the image. It was reported that the overall accuracy of the level 1 classification

(including forest) was 91% for the region (Wickham et al., 2010). Therefore we can conclude

that the mislabeling error of the forest area was less than 10%. Other sources of error include

sub-pixel image misregistration, which was reduced by a 3×3 average filter during time-series image

processing. The MMU filter also reduced such error, but it introduced the chance of altering

small correctly labeled patches. The error of slope-determined start year of disturbance was also

not assessed due to lack of consistent aerial image coverage in the corresponding year or other

reliable references.


3.3 The Disturbance Map Product

The disturbance map (Figure 5) was presented in two schemes highlighting slow-onset and

rapid-onset disturbance, respectively. Since the values selected for image analysis were from the

summer season of the year, the ‘start year’ of a MPB disturbance should instead be defined as a

period from the summer season of the year to the summer season of the next year. Needles on

MPB attacked trees turn red within one year (Safranyik and Wilson, 2007) and are therefore

observable visually and spectrally. The spatial spread pattern could also be roughly assessed

from the disturbance map, implying annual incrementally disturbed areas over thousands of

hectares, which is in agreement with MPB’s spatial synchrony with a lag distance at 100 km

level (Peltonen et al., 2002). Therefore, the time required for MPB attacks to spread at a pixel

level is much shorter than the time required for the trees to turn red following the attack. It is

thus reasonable, for MPB mortality, to shift the disturbance start period to the year before first

detection. However, such a temporal shift does not apply to other disturbances which are

detected instantly between the summer of the labeled start year and the next summer.

Consequently, we should interpret the start year differently according to the disturbance type.

Figure 5 Outputs of the BITE algorithm, including starting year of (left) slow-onset

disturbances and (right) rapid-onset disturbances.

From 2001 to 2003, the dispersal of MPB was slow, covering a total of 6117 ha. The major

outbreak took place from 2004 to 2006 as was observed (Chapman et al., 2012), affecting 18876

ha, 47261 ha and 27350 ha forested land, respectively (Figure 6). The synchronous outbreaks of

MPB were also observed in previous studies for the northern Rocky Mountain area (Goodwin et

al., 2008; Honey-Marie et al., 2011). In the 9 year period, the total disturbed area caused by

MPB attacks is 111443 ha, which is dominant compared to the 11494 ha of rapid-onset

disturbance. Combined, slow-onset and rapid-onset disturbances affected 46% of the forested

area in Grand County over the 9 year period.
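The 46% figure can be checked with simple arithmetic. The sketch below assumes that the forested area of Grand County corresponds to the 2973340 forest pixels reported in Section 3.2 and that each Landsat TM pixel covers 30 m × 30 m; both assumptions are made here for illustration and are not stated as the calculation used in the study.

```python
PIXEL_HA = 30 * 30 / 10_000        # one 30 m pixel covers 0.09 ha

forest_pixels = 2_973_340          # forest pixels in the disturbance map (Section 3.2)
slow_onset_ha = 111_443            # area disturbed by MPB attacks, 2001-2009
rapid_onset_ha = 11_494            # area of rapid-onset disturbance, 2001-2009

forest_ha = forest_pixels * PIXEL_HA
disturbed_share = (slow_onset_ha + rapid_onset_ha) / forest_ha

print(f"forest area: {forest_ha:.0f} ha")
print(f"disturbed share: {disturbed_share:.0%}")
```

Under these assumptions the share rounds to the 46% given in the text.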


Figure 6 Area affected by different disturbance types for 2001-2009.

The spatiotemporal pattern of the dispersal of MPB attacks can be easily recognized by visual

inspection. The attack spread outwards from the initial locations in all directions without

constraining physical connections between forest patches. The consistent spread of the outbreak

to the neighboring regions annually adds confidence in the maps of disturbance types and start

years. Three locations were identified from the disturbance map as the earliest forests attacked by MPB:

Northeast Lake Granby, Arapaho National Forests and East Fork

Troublesome Creek Valley. All three locations have been dominated by MPB-susceptible

lodgepole pines. In particular, the outbreak that originated in East Fork Troublesome Creek and its

surrounding mountains was examined in an enlarged view of the disturbance map (Figure 7).

Before the outbreak in 2004, the infested region was limited to hillsides with a distance from 0.5

km to 4 km surrounding the East Fork Troublesome Creek. During the outbreak from 2004 to

2006, MPB rapidly expanded outwards to neighboring areas regardless of terrain, while the

inward dispersal proceeded at a very slow pace. After the major outbreak period, the persistent forests

near the center of the region by the creek were attacked. By visually inspecting NAIP imagery from

2005 and 2011, we found MPB presence in the center region in 2005 during the outbreak,

though at very low levels (less than 10% infested trees per pixel). These levels were below the

detection capability of our algorithm, because either infested trees did not

comprise a majority of a pixel or the disturbance scale was under the MMU (6 pixels). We

assumed that the center area contained a lower density of host tree species or that biophysical

conditions made it more resistant to MPB attacks, but was sought out by MPB after locations

with more favorable conditions had been depleted (Raffa and Berryman, 1983). This pattern

could also be attributed to other factors such as synchronous adult emergence implied by tree

maturity (Cole and Amman, 1980), seasonal pattern of temperatures (Logan et al., 1998), or

certain terrain features (Honey-Marie et al., 2011).


Figure 7 Enlarged view of the BITE output showing the starting year of slow-onset

disturbances.

For the rapid-onset disturbances, there are also three locations, around which forests were

abruptly removed by natural successions or human activities. Since disturbances such as wildfire

were not observed during the considered time frame, all rapid-onset disturbances were assumed

to be human caused. In developed areas such as Lake Granby and the Fraser Valley, the

surrounding forests were cleared for construction of roads, snow tracks, residential and resort

development. In the Arapaho National Forests, the rapid-onset disturbances were the treatments

on federal and private lands where infested trees were removed to increase aesthetic value of the

land, to prevent risks of treefall, to reduce wildfire hazards, and to utilize infested trees for wood

products.

4 Conclusion and Perspectives

Compared with existing disturbance detection algorithms using Landsat TM/ETM+ stacks such

as NDMI trajectory (Goodwin et al., 2008), the VCT algorithm (Huang et al., 2010) and

LandTrendr (Kennedy et al., 2010), BITE can select the optimal value from multiple scenes with

tolerance for cloud and snow coverage in each year and perform trajectory extraction using

multiple spectral indices. Although its segmentation process is similar to that of LandTrendr, BITE takes

advantage of multiple spectral indices and supervised learning algorithms, so that it can

accurately separate persistent forest, slow-onset disturbances and rapid-onset disturbances and

determine the start year of a disturbance. BITE also showed its robustness to various atmospheric

conditions, noise, pixel misregistration and residual cloud/snow cover. Furthermore, BITE does


not require any parameters but only a training sample to proceed with an automated mapping

process. During the test, we found that Brightness, NDMI, NBR, Wetness and NDVI yielded the

best accuracies in detecting disturbances and separating slow- and rapid- onset disturbances. An

integration of the five indices via a plurality vote can further improve the accuracy. However,

there are some limitations with BITE including: 1) the prerequisite of an accurate forest cover

map synchronous with the start date of the time-series images; 2) the need to select representative

training data; 3) the lack of seasonal vegetation dynamics; and 4) insufficient characterization of

forest restoration or multi-staged disturbance/restoration/persistence behaviors. To overcome

these limitations, further efforts will be made to revise current modules and add new ones, such

as the automation of prior forest detection (Huang et al., 2008) instead of using an independently

derived forest cover map (e.g. NLCD). Furthermore, comprehensive sampling and response

designs were lacking for accuracy assessment of the disturbance map, though our tentative visual

validation agrees with our accuracy report. In the future, we plan to compare our algorithm with

existing algorithms and products with a better sampling design.

Currently, our disturbance map is provided for a county, enabling studies of local dispersal

patterns at the landscape level. At larger scales, some studies found that the dispersal

pattern of MPB populations follows a Moran effect, particularly assuming temperature as the

common environmental factor that triggers the MPB epidemic (Peltonen et al., 2002; Aukema et

al., 2008). In addition, a study found that climate change causes the outbreak of MPB attacks to

expand to previously unsusceptible forests in North America (Carroll et al., 2003), which

increases the CO2 released to the atmosphere via fire and forest decomposition and forms a

positive feedback to climate change (Kurz et al., 2008). Therefore, while improving

the algorithm to overcome various limitations, we also plan to generalize this mapping procedure

to larger areas, e.g. from the southern Rocky Mountain ecoregion to the entire MPB-susceptible

Rocky Mountain area. Temporally, lodgepole pine forests are periodically attacked by MPB

(Logan and Powell, 2001). It would be more convenient to explore such patterns by making a

fuller use of all available images inter-annually over a period of nearly 30 years (Zhu et al.,

2012). With larger spatial and longer temporal coverage, we will have more confidence in

analyzing the driving factors of such disturbances with less influence from local and/or spurious

conditions. In addition, beyond merely detecting slow-onset forest disturbances caused by MPB

attacks, BITE has further potential in monitoring other types of ecosystem disturbances such as

wildfires, flooding and hurricanes, restored ecosystems and other LULC changes in the world.

These changes are theoretically detectable from trajectories of spectral indices, and are

therefore consistent with the framework proposed by the BITE algorithm. By implementing seasonal

vegetation dynamics, BITE will be even more capable and adaptive to ephemeral disturbances.

To conclude, we plan to test and adapt BITE to larger areas with greater variety of vegetation

types and climate conditions, to continuously monitor land cover changes.


Chapter 2 Clustering based on eigen space transformation – CBEST for efficient

classification

This chapter has been published in International Society of Photogrammetry and Remote

Sensing

Yanlei Chen and Peng Gong*

Department of Environmental Science, Policy and Management

University of California at Berkeley

137 Mulford Hall, Berkeley, CA 94720-3114


Abstract

Large remote sensing datasets that either cover large areas or have high spatial resolution are

often a burden for information mining for scientific studies. Here, we present an approach that

conducts clustering after gray-level vector reduction. In this manner, the speed of clustering can

be considerably improved. The approach features applying eigenspace transformation to the

dataset followed by compressing the data in the eigenspace and storing them in coded matrices

and vectors. The clustering process takes advantage of the reduced size of the compressed data

and thus reduces computational complexity. We name this approach Clustering Based on

EigenSpace Transformation (CBEST). In our experiment with a subscene of Landsat Thematic

Mapper (TM) imagery, CBEST was found to be able to improve speed considerably over the

conventional K-means as the volume of data to be clustered increases. We assessed information

loss and several other factors. In addition, we evaluated the effectiveness of CBEST in mapping

land cover/use with the same image that was acquired over Guangzhou City, South China and an

AVIRIS hyperspectral image over Tippecanoe County, Indiana. Using reference data, we

assessed the accuracies for both CBEST and conventional K-means and we found that the

CBEST was not negatively affected by information loss during compression in practice. We

discussed potential applications of the fast clustering algorithm in dealing with large datasets in

remote sensing studies.

Keywords: Land cover/use Mapping, Large Dataset, Landsat Thematic Mapper image, K-means,

Remote Sensing, Unsupervised Classification


1 Introduction

Computer-based image classification is a common practice allowing fast and automatic

identification and classification of data. Two types of image classification are found in standard

texts, supervised and unsupervised (Jensen, 2004; Lillesand and Kiefer, 1987; Richards and Jia,

2005). Unsupervised classification is often preferred when a priori knowledge over a study area

is lacking. This is particularly the case for mapping large areas where field data acquisition used

for training in supervised classification becomes prohibitively expensive. Unsupervised

classification involves the use of a clustering algorithm that runs across the entire image in many

iterations until an optimal set of clusters converges. The two most widely used algorithms are K-means

and the iterative self-organizing data analysis technique (ISODATA) (Richards and Jia,

2005). However, these traditional iterative algorithms are time consuming when the data

dimension becomes high or the data volume becomes large. To classify today's continuously

expanding datasets efficiently, we present an approach that

applies the Eigen-Based Gray-Level Vector Reduction proposed in Gong and Howarth (1992)

followed by clustering. We call this approach Clustering Based on EigenSpace

Transformation (CBEST).

Computer assisted classification in large area mapping is more and more popular as both data

availability and computational power are increasing. The demand for variables derived from the

classification of remotely sensed data is increasing in developing global land and ocean

databases for global change studies (Jensen, 2004; Gong, 2012). Clustering is engaged in

mapping from regional (Woodcock et al., 1994; Homer et al., 1997; Franklin et al., 2001), to

continental (Loveland et al., 1991; Stone et al., 1994; Zhu and Evans, 1994; Homer et al., 2007),

even to global scales (Loveland et al., 2000; Bartholomé et al., 2005; Arino et al., 2007; Gong et

al., 2013). Among these mapping efforts, K-means clustering was adopted by some early studies

because of its simplicity for implementation.

K-means clustering is one of the earliest unsupervised classification approaches (Lloyd, 1982)

and has been used in remote sensing studies for various purposes. The applications include land

cover classification (Muller et al., 1999; Han et al., 2004; Zharikov et al., 2005), land cover

change (Brumby et al., 2002; Reger et al., 2007; Celik, 2009), time-series analysis and mapping

(Viovy, 2000; Wulder et al., 2004), cloud mapping (Gordon et al., 2005; Sano et al., 2007; Eitzen

et al., 2008), cotton yield estimation (Zarco-Tejada et al., 2005), chlorophyll concentration

mapping (Roelfsema et al., 2002), plumes and CO2 mapping (Zhang and Small, 2002), and

hydrological analysis (Belluco et al., 2006). In these studies, K-means clustering may function as

the primary classification approach, as a decisive branch of the entire classification method

(Franklin et al., 2001), or as a supplemental analysis tool which provides insight into

understanding of the data.

The K-means clustering is also used as a reference for comparison with other approaches

(Belluco et al., 2006), particularly newly proposed approaches (Viovy, 2000; Remund et al.,

2000; Shah et al., 2004; Zhong et al., 2006; Shah et al., 2007). Although the K-means algorithm

is outperformed in some cases (Shah et al., 2004; Shah et al., 2007), studies in which K-means

achieved fair accuracies in remote sensing (Ouma et al., 2006; Jiao et al., 2010) support a continuing role for

K-means clustering in practice.


Since the limitations of K-means clustering are widely known, some studies integrate it into

advanced classifiers to utilize its advantages. For instance, Rollet et al. (1998) used it to initialize

the RBF (radial basis function) neural network for image classification. Some choose to revise or

add additional steps to minimize some of the shortcomings in the conventional K-means. Such

studies include contiguity-enhanced K-means (Theiler and Gisler, 1997) and unsupervised

spectral-contextual classification in the context of K-means (Zhou and Robson, 2001). K-means

has also been combined with Principal Component Analysis (PCA), such as extreme

centroid initialization after PCA transformation (Funk et al., 2001) and hyperspectral data

preprocessed with Segmented PCA, then clustered with K-means (Tsagaris et al., 2005).

Outside the field of remote sensing, a number of clustering algorithms designed for large

databases have been proposed in the last few decades, such as DBMS (Ester et al., 1995),

DBSCAN (Ester et al., 1996), BIRCH (Zhang et al., 1997), STING (Wang et al., 1997), CURE

(Guha et al., 1998), WaveCluster (Sheikholeslami et al., 1998), MAFIA (Nagesh et al., 1999),

etc. These approaches have seldom been used in the context of remote sensing classification for

large area mapping. Particularly for K-means, a variety of studies also focused on improving the

efficiency of K-means for large datasets. For instance, the k-d tree and filtering algorithm

(Alsabti et al., 1998; Kanungo et al., 2002) organizes data into a k-d tree structure and prunes or

filters the branches to speed up the K-means. Scaling clustering algorithm (Bradley et al., 1998)

established a scalable framework in which compressible and discardable regions in large datasets

are identified and processed for K-means. Coresets K-means (Frahling and Sohler, 2006) uses

small weighted sets of samples that approximate the original dataset to reduce the computational

complexity. There are also studies exploring the potential of K-means in performance: fuzzy c-

means is a fuzzy implementation of K-means (Bezdek et al., 1984) and kernel K-means projects

data into higher dimensional space using kernel functions to detect complex patterns in feature

space (Girolami, 2002). The efficiency of kernel K-means was later improved by Zhang and

Rudnicky (2002) by shifting the clustering order from sample sequence to kernel sequence.

CBEST introduced in this paper is a distinct approach and it aims to greatly reduce the space and

time complexity of K-means with only one additional user specified parameter on desired

memory usage as the trade-off for little accuracy loss.

2 Background

2.1 K-means

Given a set of data in p dimensional space (x1, x2, …, xn), the objective of K-means is to find k

cluster centers (µ1, µ2, …, µk) for partition sets V={V1, V2, …, Vk}, such that within-cluster sum

of squares J is minimized (MacQueen, 1967).

$$J = \sum_{i=1}^{k} \sum_{x_j \in V_i} \lVert x_j - \mu_i \rVert^2 \qquad (1)$$

where µi is the mean of all xj ∈ Vi.


The iterative algorithm of conventional implementation of K-means clustering (in Euclidean

Distance) can be described in Table 6.

Table 6 Conventional K-means Algorithm

Step   Implementation
I      Initialize k centers ($\mu_1^{(0)}, \mu_2^{(0)}, \ldots, \mu_k^{(0)}$); iteration count t = 0.
II     $D_{ij}^{(t)} = \lVert x_j - \mu_i^{(t)} \rVert$, i = 1, 2, …, k; j = 1, 2, …, n.
III    $V_i^{(t)} = \{ x_j : D_{ij}^{(t)} = \min(D_{1j}^{(t)}, D_{2j}^{(t)}, \ldots, D_{kj}^{(t)}) \}$, i = 1, 2, …, k; j = 1, 2, …, n.
IV     $\mu_i^{(t+1)} = \frac{1}{|V_i^{(t)}|} \sum_{x_j \in V_i^{(t)}} x_j$, i = 1, 2, …, k; j = 1, 2, …, n.
V      t = t + 1
VI     Repeat steps II to V until convergence.

$D_{ij}^{(t)}$ is the distance matrix that records distances between data and cluster centers so that each data

point can be assigned to its closest cluster center. In practice, it is not necessary to take the square

root of the elements in the distance matrix $D_{ij}^{(t)}$ since only the comparison of elements is of concern. At

each iteration, the partition sets $V^{(t)}$ are updated by the minimum distance criterion, followed by

the cluster centers $\mu^{(t+1)}$ being recalculated from the members x in their respective cluster partition

sets. The termination conditions could be determined by rules such as converging, membership

changing rate < threshold, change of within-cluster sum of squares < threshold and others.
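The steps of Table 6 can be sketched in a few lines of plain Python. This is an illustrative implementation, not the code used in the experiments: the function name `kmeans`, the random initialization, the `max_iter` cap and the rule of keeping an old center when a cluster becomes empty are assumptions made here. Squared distances are compared directly, since the square root is not needed for finding the nearest center, and iteration stops when memberships no longer change.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd-style K-means on a list of equal-length tuples (steps I-VI of Table 6)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # Step I: pick k initial centers
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Steps II-III: squared Euclidean distances, assign to nearest center.
        new_labels = []
        for x in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(x, mu)) for mu in centers]
            new_labels.append(d2.index(min(d2)))
        if new_labels == labels:             # Step VI: memberships stopped changing
            break
        labels = new_labels
        # Step IV: move each center to the mean of its members.
        for i in range(k):
            members = [x for x, lab in zip(points, labels) if lab == i]
            if members:                      # assumption: keep old center if empty
                centers[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(data, k=2)
print(centers, labels)
```

Because the result depends on the initial centers, running the function with several `seed` values and keeping the solution with the smallest within-cluster sum of squares is a common precaution.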

There are a few issues to be considered carefully when applying K-means in practice. Firstly, the

number of clusters k must be defined prior to the classification. Since in clustering

analysis we often assume no prior knowledge of the data, the predefined number of clusters k is

therefore a guess. The strategy towards this would be either to test multiple k values to find the

most appropriate one (Zharikov et al., 2005), or to start with a large k and then merge down to a

lower number according to expertise (Wulder et al., 2004). Secondly, the algorithm described in

Table 6 does not guarantee a global minimum of the within-cluster sum of squares, and the

result may converge to a local minimum. This problem could be mitigated by

multiple runs with different random initial points to derive optimal or near optimal results. Some

studies have proposed initialization strategies that increase the probability to converge on a

global minimum. Zha et al. (2001) examined the relationship between PCA and K-means, and

they developed an approach to approximate global solutions for K-means by relaxing special

constraints of a trace maximization problem, which later evolved into PCA guided K-means

(Ding and He, 2004). Thirdly, K-means assumes identical within-cluster variances of Gaussian

distributions for all clusters. For data which features different within-cluster variances or those

which do not follow a Gaussian distribution, the performance could be rather poor. There are

quite a number of advanced algorithms to implement on specific types of data. For instance,

Expectation-Maximization for Gaussian Mixture Models applies a maximum likelihood

optimization which could well handle data following a Gaussian distribution. Kernel K-means

projects data into a higher dimensional space (Girolami, 2002) so that it can achieve good results


with ‘ring data’. Fourthly, since ‘distance’ is used, K-means suffers from the ‘curse of

dimensionality’: when clustering high dimensional data, irrelevant dimensions introduce

noise that reduces the significance of ‘distance’. Eigenspace transformation is a common data

processing approach to reduce irrelevant dimensions.

2.2 Eigen-Based Gray-Level Vector Reduction

The compression approach proposed by Gong and Howarth (1992) is based on eigenspace (ES)

transformation. Like Principal Component Analysis (PCA) (Jolliffe, 2002), which finds the

eigen structure of the feature space, ES transformation applies a linear transformation to the data

such that the covariance matrix is made diagonal. Elements of the diagonal covariance matrix are

called eigenvalues and represent the variance of the data in the corresponding dimension. The feature

space is transformed into eigenspace with correlated features being transformed into uncorrelated

eigen axes. Eigenvalues of the eigen axes are ranked from high to low. Since usually distinctive

features are associated with larger variances as opposed to noises which are usually smaller, low

ranked eigen axes are regarded as noise and then removed to reduce dimension in some studies

as a preprocessing step (Han et al., 2004). It is important to note that ES transformation alone

does not lose any information during the process.
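The transformation described above can be sketched in a few lines of Python. This is a minimal illustration that assumes NumPy is available; the function name `es_transform` and the synthetic three-band data are hypothetical and are not taken from the original implementation.

```python
import numpy as np

def es_transform(x):
    """Center the data and rotate it onto the eigenvectors of its covariance.

    x: array of shape (n, p). Returns (scores, variances, axes), with the
    eigen axes sorted by decreasing variance.
    """
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    cov = np.cov(centered, rowvar=False)      # p x p covariance matrix
    variances, axes = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(variances)[::-1]       # rank from high to low
    variances, axes = variances[order], axes[:, order]
    scores = centered @ axes                  # uncorrelated coordinates
    return scores, variances, axes

# Synthetic example: two correlated features plus one independent feature.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 1))
x = np.hstack([base,
               0.8 * base + 0.1 * rng.normal(size=(500, 1)),
               rng.normal(size=(500, 1))])
scores, variances, axes = es_transform(x)
```

Because the eigenvector matrix is orthogonal, `scores @ axes.T + x.mean(axis=0)` recovers the input, which is one way to see that the transformation alone loses no information.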

The objective of the compression is to project original data space (n × p) into a compressed space

(n × 1) represented by N gray levels.

The compression is implemented following an ES transformation of the input p-dimensional data (x1, x2, …, xn) with ranked eigenvalues (λ1², λ2², …, λp²) as well as corresponding eigenvectors (v1, v2, …, vn).

As λi² represents the variance of the data projected onto the ith eigenvector, and λi is the corresponding standard deviation, the compression rule is that the standard deviation λi along the ith eigenvector determines how many partitions are assigned along this direction, so that the final partition of the eigenspace is equilateral. It can be written as follows:

$$\frac{N_1}{\lambda_1} = \frac{N_2}{\lambda_2} = \cdots = \frac{N_p}{\lambda_p} \qquad (2)$$

where Ni denotes the number of partitions assigned in the direction of the ith eigenvector.

The total partition N could be written as,

$$N = N_1 N_2 \cdots N_p \qquad (3)$$

Along the ith eigenvector, 100(1−α)% of the data should fall within a confidence interval of (−zα(2)λi, zα(2)λi) if a normal distribution is assumed, given that the mean is zero since the ES transformation centers the data. Here zα(2) denotes the z-value at a significance level of α in a two-tailed test. Gong and Howarth (1992) chose 2.1 as the z-value such that roughly 97% of the data fall within this range. Outliers are captured by the first and the last partitions, while the majority is assigned to (Ni – 2) cells uniformly spaced in the confidence interval. Compressed gray levels in the

eigenspace are subsequently coded into N gray levels, with which each data entry could be

represented by a single integer. The process is summarized by steps in Table 7.

Table 7 Eigen-Based Gray Level Vector Reduction

Step Implement

I ES Transformation

II Derive eigenvalues and eigenvectors

III Partition ith subspace based on its eigenvalue

IV Coding the gray level
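As a quick numerical check of the interval quoted above, the share of a normal distribution lying within ±z standard deviations can be computed from the error function. The helper name `two_sided_coverage` is an illustrative assumption; only the z-value of 2.1 comes from the text.

```python
import math

def two_sided_coverage(z):
    """Fraction of a normal distribution lying within +/- z standard deviations."""
    return math.erf(z / math.sqrt(2.0))

# z-value chosen by Gong and Howarth (1992); evaluates to about 0.964,
# in line with the "roughly 97%" stated in the text.
coverage_21 = two_sided_coverage(2.1)
```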

3 Clustering Based on Eigen Space Transformation

In general, CBEST is based on the integration of the Eigen-Based Gray Level Vector Reduction

approach (Gong and Howarth, 1992) and the K-means algorithm to speed up the conventional

space and time consuming process for large datasets. For remote sensing images, most spectral

data are stored in bytes, which range from 0 to 255 digital numbers. Although data are sometimes calibrated, the information they contain is still one byte per value. If the dataset is large, many data entries will intuitively fall into the same cells in feature space. ES transformation eliminates correlation and identifies the dominant subspace over which the data spread. Intuitively, the compression approach introduced above can therefore store more of the variation in the data for a specified total number of partitions. Since identical or close eigenvectors in eigenspace are very likely, especially when the partition number N is extremely small compared to the number of entries in the data n, it would be more efficient for a clustering algorithm such as K-means to scan over weighted partitions instead of individual data entries, which eliminates redundancy. Here the weight of a partition is the number of data instances falling into that partition in eigenspace.

CBEST will be introduced in two sections: Compression and Clustering.

3.1 Compression

In particular, it starts with an ES transformation applied to the original dataset (x1, x2, …, xn) to derive eigenvalues (λ1², λ2², …, λp²) as well as eigenvectors (v1, v2, …, vn).

Then an expected total number of partitions N̂ is specified.

Hence combine (2) and (3) to get the following equations pair:

$$\frac{N_1}{\lambda_1} = \frac{N_2}{\lambda_2} = \cdots = \frac{N_p}{\lambda_p}, \qquad N_1 N_2 \cdots N_p = \hat{N}$$

Solve the above equations pair to derive:

$$N_i = \lambda_i \sqrt[p]{\frac{\hat{N}}{\lambda_1 \lambda_2 \cdots \lambda_p}} \qquad (4)$$

Ni is then rounded since partition numbers are integers. Notice that if rounding up, the actual partition number N could be larger than N̂. Rounding to the nearest integer was used in this paper. Since the Ni are rounded to integers and at least two partitions are required to provide information for separation, it is possible that the partition numbers Nq+1, Nq+2, …, Np ≤ 1. In that case, the eigen subspaces from dimension q + 1 to p are discarded because a single partition in a one-dimensional subspace does not provide any information for clustering. The remaining subspace thus has a dimension of q (q ≤ p).
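The allocation and truncation steps can be illustrated with a short Python sketch. The function name `allocate_partitions`, the `min_partitions` parameter and the standard deviations in the example are hypothetical; note also that Python's built-in `round` sends exact halves to the nearest even integer, which may differ from the rounding used in the original Matlab code.

```python
import math

def allocate_partitions(std_devs, expected_total, min_partitions=2):
    """Split an expected partition count across eigen axes in proportion to
    each axis's standard deviation, following equations (2)-(4).

    std_devs: standard deviations (lambda_i), ranked from high to low.
    Returns the rounded per-axis counts N_i for the q axes that are kept.
    """
    p = len(std_devs)
    scale = (expected_total / math.prod(std_devs)) ** (1.0 / p)
    counts = [round(s * scale) for s in std_devs]
    kept = []
    for n_i in counts:
        if n_i < min_partitions:
            break  # this axis and all lower-ranked axes are discarded
        kept.append(n_i)
    return kept

# Made-up standard deviations for four eigen axes, expected total of 1000.
kept = allocate_partitions([40.0, 10.0, 2.5, 0.5], 1000)   # [48, 12, 3]
actual_total = math.prod(kept)                              # 1728
```

In this example the fourth axis is dropped and the realized total (1728) differs from the expected total (1000), which illustrates why the actual number of eigenspace partitions need not match N̂.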

A partition set G={G1, G2, …, GN} is used to denote the eigenspace partitions. For a partition set

Gr, its location in eigenspace can be defined by a coordinate (j1, j2, …, jq), ji=1, 2, …, Ni,

meaning the partition set Gr consists of the j1th unit in the first subspace, j2th unit in the second

subspace and so on. The coordinate is then projected into the single-axis indices of G for

convenience of indexing for computation. The index r to represent the coordinate (j1, j2, …, jq)

for Gr is coded as following:

$$r = j_1 + (j_2 - 1) N_1 + (j_3 - 1) N_1 N_2 + \cdots + (j_q - 1) \prod_{i=1}^{q-1} N_i \qquad (5)$$
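The index coding in (5) is a mixed-radix encoding, and it can be inverted to recover the coordinate. The following sketch assumes 1-based coordinates and indices as in the text; the function names are illustrative only.

```python
def encode_index(coords, counts):
    """Map 1-based cell coordinates (j_1, ..., j_q) to a single 1-based
    index r, as in equation (5). counts holds (N_1, ..., N_q)."""
    r = coords[0]
    stride = 1
    for j, n_prev in zip(coords[1:], counts[:-1]):
        stride *= n_prev              # N_1, then N_1*N_2, and so on
        r += (j - 1) * stride
    return r

def decode_index(r, counts):
    """Inverse of encode_index: recover (j_1, ..., j_q) from r."""
    rest = r - 1
    coords = []
    for n_i in counts:
        coords.append(rest % n_i + 1)
        rest //= n_i
    return tuple(coords)

r = encode_index((5, 2, 3), (48, 12, 3))   # 5 + 1*48 + 2*48*12 = 1205
```

The decoding direction is what makes it cheap to find the neighbors of a partition from its index alone.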

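To determine which partition cell in the ith one-dimensional subspace an eigenvector vi with value ai along that axis should be assigned to, the following rules are applied:

$$j_i = \begin{cases} 1, & a_i < -z_{\alpha(2)}\lambda_i \\ 2 + \mathrm{round}\left[(a_i + z_{\alpha(2)}\lambda_i)(N_i - 2) / 2 z_{\alpha(2)}\lambda_i\right], & \text{otherwise} \\ N_i, & a_i > z_{\alpha(2)}\lambda_i \end{cases} \qquad (6)$$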

Here we use an α value at 0.1, z0.1(2)=1.64.

In the experiment later presented in this paper, we set the minimum number of Ni to 3 so that at

least two outlier partitions and one major partition could be established based on (6).
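A minimal sketch of the per-axis cell assignment is given below. It is a variant rather than a transcription of (6): it assumes `floor` in place of rounding so that the interior cells are exactly 2 to Ni − 1, and the function name `assign_cell` is hypothetical.

```python
import math

def assign_cell(a, std_dev, n_cells, z=1.64):
    """Return the 1-based cell index along one eigen axis.

    Cell 1 and cell n_cells catch values outside (-z*std_dev, z*std_dev);
    the n_cells - 2 interior cells split that interval evenly.
    Assumption: floor is used in place of the rounding in equation (6).
    """
    half_width = z * std_dev
    if a < -half_width:
        return 1
    if a >= half_width:
        return n_cells
    fraction = (a + half_width) / (2.0 * half_width)   # in [0, 1)
    return 2 + math.floor(fraction * (n_cells - 2))
```

With `n_cells=3`, every value inside the interval lands in cell 2, which matches the two outlier partitions and one major partition mentioned above.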

For the partition set G, a set of corresponding mean vectors (m1, m2, …, mN) and weight vectors (w1, w2, …, wN) are calculated as follows:

$$m_i = \frac{1}{|G_i|} \sum_{v_j \in G_i} v_j \qquad (7)$$

and

$$w_i = |G_i|$$

Rather than the original dataset (x1, x2, …, xn), the ES transformed eigenvectors (v1, v2, …, vn) are used to calculate (m1, m2, …, mN), as K-means performed on eigenvectors could yield results closer to the global minimum (Zha et al., 2001).
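The mean vectors and weights can be accumulated in a single pass over the data once each vector has a partition index. The sketch below uses plain dictionaries keyed by the index r, so only occupied partitions are stored; the function name and the toy input are assumptions for illustration.

```python
from collections import defaultdict

def summarize_partitions(vectors, cell_ids):
    """Group vectors by partition index and return, per occupied partition,
    the mean vector (equation 7) and the weight (member count)."""
    sums = {}
    weights = defaultdict(int)
    for v, r in zip(vectors, cell_ids):
        if r not in sums:
            sums[r] = [0.0] * len(v)
        for d, value in enumerate(v):
            sums[r][d] += value
        weights[r] += 1
    means = {r: [s / weights[r] for s in total] for r, total in sums.items()}
    return means, dict(weights)

# Three toy vectors: two fall in partition 1, one in partition 7.
means, weights = summarize_partitions([(0, 0), (2, 2), (10, 0)], [1, 1, 7])
```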

3.2 Clustering

Given a set of q dimensional mean vectors (m1, m2, …, mN) and a set of weight vectors (w1, w2,

…, wN) representing eigenvectors (v1, v2, …, vn), CBEST aims to find cluster centers (µ1, µ2, …,

µk) for partition sets V={V1, V2, …, Vk}, such that within-cluster sum of squares J can be

minimized:

$$J = \sum_{i=1}^{k} \sum_{v_j \in V_i} \lVert v_j - \mu_i \rVert^2 \qquad (8)$$

To differentiate the compressed eigenspace partition sets G from the cluster partitions V, we refer to G as eigenspace partitions. The differences between CBEST clustering and conventional K-means are (1) compressed eigenspace partitions (size N × p) are scanned in each iteration instead of the original dataset (size n × p); (2) the count of eigenvectors in each eigenspace partition is introduced as a weight to update cluster centers in each new iteration. The algorithm is shown in

Table 8. The process of CBEST as compared with K-means is illustrated in Figure 8.

Table 8 CBEST Algorithm

Step    Implementation

I       Perform compression to derive (m1, m2, …, mN) and (w1, w2, …, wN).

II      Initialize k centers $(\mu_1^{(0)}, \mu_2^{(0)}, \ldots, \mu_k^{(0)})$; iteration count t = 0.

III     $D_{ij}^{(t)} = \lVert m_j - \mu_i^{(t)} \rVert$; i = 1, 2, …, k; j = 1, 2, …, N.

IV      $V_i^{(t)} = \{ m_j, w_j : D_{ij}^{(t)} = \min(D_{1j}^{(t)}, D_{2j}^{(t)}, \ldots, D_{kj}^{(t)}) \}$; i = 1, 2, …, k; j = 1, 2, …, N.

V       $\mu_i^{(t+1)} = \frac{1}{\sum_{w_j \in V_i^{(t)}} w_j} \sum_{m_j, w_j \in V_i^{(t)}} w_j m_j$; i = 1, 2, …, k; j = 1, 2, …, N.

VI      t = t + 1

VII     Repeat steps III to VI until convergence.

Finalization

        $D'_{ij} = \lVert v_j - \mu_i^{(t)} \rVert$; i = 1, 2, …, k; j = 1, 2, …, n.

        $V'_i = \{ v_j : D'_{ij} = \min(D'_{1j}, D'_{2j}, \ldots, D'_{kj}) \}$; i = 1, 2, …, k; j = 1, 2, …, n.

Figure 8 Illustrative Comparisons between CBEST and K-means

Instead of scanning over the entire dataset once per iteration as in conventional K-means, CBEST scans the entire dataset only twice: once for compression and once for finalization. For each iteration, CBEST reduces the time complexity from O(knp) to O(kNq), where q ≤ p. For large datasets, one can predefine a much smaller N to significantly decrease the computation time. However, as cluster centers are calculated from grouped gray-level reduced vectors in eigenspace partitions, it is difficult for CBEST to reach a local optimum if clusters are intermixed, because a partition set on a cluster boundary cannot be divided between different clusters. The trade-off between the user-defined number of eigenspace partitions N and overall performance is tested and discussed in later sections of the paper.
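The iterative part of Table 8 (steps III to V) amounts to Lloyd-style K-means in which each partition mean counts with its weight. A pure-Python sketch follows; the function name, the stopping rule (stop when memberships no longer change) and the handling of empty clusters are assumptions, since the text only states that iteration continues until convergence.

```python
def weighted_kmeans(means, weights, centers, max_iter=100):
    """Lloyd-style iterations over partition means, each counted with its weight.

    means: list of q-dimensional partition mean vectors.
    weights: number of data entries in each partition.
    centers: initial cluster centers (list of q-dimensional vectors).
    Returns (centers, labels), where labels[j] is the cluster of partition j.
    """
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Steps III-IV: assign every partition to its nearest center.
        new_labels = [
            min(range(len(centers)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(m, centers[i])))
            for m in means
        ]
        if new_labels == labels:
            break  # memberships stopped changing
        labels = new_labels
        # Step V: weighted average of the member partitions.
        for i in range(len(centers)):
            members = [j for j, lab in enumerate(labels) if lab == i]
            total = sum(weights[j] for j in members)
            if total == 0:
                continue  # keep an empty cluster's center unchanged
            centers[i] = [
                sum(weights[j] * means[j][d] for j in members) / total
                for d in range(len(centers[i]))
            ]
    return centers, labels

centers, labels = weighted_kmeans(
    means=[[0, 0], [1, 0], [10, 0], [11, 0]],
    weights=[3, 1, 1, 3],
    centers=[[0, 0], [11, 0]],
)
```

Each pass touches only the partitions, not the individual data entries, which is where the O(kNq) per-iteration cost comes from.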

3.3 Further Improvement

3.3.1 Mean vectors

Notice that mean vectors are obtained by dividing by the weights and are later used only after being multiplied by the same weights. Additional efficiency could be gained by eliminating this redundancy. We thus replace the mean vectors defined in (7) by total vectors (T1, T2, …, TN). Total vectors are defined as follows:

$$T_i = \sum_{v_j \in G_i} v_j$$

The corresponding change in CBEST algorithm is the update of step V in Table 8:

$$\mu_i^{(t+1)} = \frac{1}{\sum_{w_j \in V_i^{(t)}} w_j} \sum_{T_j, w_j \in V_i^{(t)}} T_j; \qquad i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, N.$$

This revision could reduce N multiplication operations in one iteration and avoid N divisions in

the compression.
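The equivalence of the two update rules follows directly from the definitions of the mean and total vectors, since each mean vector is its total vector divided by its weight. Written out with the symbols used above:

```latex
\mu_i^{(t+1)}
  = \frac{\sum_{j \in V_i^{(t)}} w_j\, m_j}{\sum_{j \in V_i^{(t)}} w_j}
  = \frac{\sum_{j \in V_i^{(t)}} w_j \,\dfrac{T_j}{w_j}}{\sum_{j \in V_i^{(t)}} w_j}
  = \frac{\sum_{j \in V_i^{(t)}} T_j}{\sum_{j \in V_i^{(t)}} w_j},
\qquad \text{because } m_j = \frac{T_j}{w_j}.
```

Here the shorthand $j \in V_i^{(t)}$ stands for the partitions assigned to cluster i at iteration t.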

3.3.2 Vacant Eigenspace Partitions

In practice, most of the eigenspace partition sets in G are vacant, i.e., unoccupied by any data. Suppose the number of occupied eigenspace partition sets is N', with N' ≤ N. The vacant partition sets can be cleared out of G to free the space occupied by (N – N')(q × size of datatype(m or T) + size of datatype(w)). Time complexity also decreases to O(kN'q) for each iteration.
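To give a sense of scale, the expression above can be evaluated for one configuration reported later in Table 13 (505,050 eigenspace partitions, of which 71,434 are occupied, with q = 5). The data type sizes below, 8 bytes per vector element and 4 bytes per weight, are assumptions made for illustration and are not stated in the text.

```python
def freed_bytes(total_partitions, occupied_partitions, q,
                vector_item_bytes=8, weight_bytes=4):
    """Space released by dropping vacant partitions:
    (N - N') * (q * size(m or T) + size(w))."""
    vacant = total_partitions - occupied_partitions
    return vacant * (q * vector_item_bytes + weight_bytes)

# 433,616 vacant partitions * 44 bytes each = 19,079,104 bytes (about 18 MiB).
freed = freed_bytes(505050, 71434, q=5)
```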

3.3.3 Boundary Optimization

When CBEST converges, eigenspace partition sets that are not located on the inter-cluster boundaries are less likely to change their cluster membership if further refinement is implemented to reach a local optimum. As it is simple to locate the neighbors of an eigenspace partition set from the coordinates decoded from its index in G, the eigenspace partition sets on the inter-cluster boundaries can be extracted by determining whether any of their neighboring partition sets belongs to a different cluster. Conventional K-means is then implemented on the eigenvectors belonging to these boundary partition sets, while the other eigenvectors remain static and require no calculation.

However, by doing so the time complexity increases from O(kN'qt) to O(kN'qt + kn'qt'), where t is the iteration number of CBEST, n' is the number of eigenvectors in boundary partition sets, and t' is the iteration number of the subsequent K-means. Moreover, as the dimension increases, the number of boundary partition sets greatly increases as a result of fewer partitions in the lower ranked eigen subspaces. The entire process then amounts to using CBEST to initialize K-means.
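One possible way to extract boundary partitions is sketched below. It assumes that a neighbor is an occupied partition one step away along a single eigen axis and that vacant neighbors are ignored; the text does not specify the neighborhood definition, so both choices, like the function name, are assumptions.

```python
def boundary_partitions(labels_by_coord):
    """Find occupied partitions that touch a different cluster.

    labels_by_coord maps cell coordinates (tuples) to cluster labels.
    A cell is on a boundary if any occupied neighbor that differs by one
    step along a single axis carries a different label.
    """
    boundary = set()
    for coord, label in labels_by_coord.items():
        for axis in range(len(coord)):
            for step in (-1, 1):
                neighbor = coord[:axis] + (coord[axis] + step,) + coord[axis + 1:]
                other = labels_by_coord.get(neighbor)
                if other is not None and other != label:
                    boundary.add(coord)
    return boundary

# Four cells in a row: the cluster changes between the second and third cell.
cells = {(1, 1): 0, (2, 1): 0, (3, 1): 1, (4, 1): 1}
edge = boundary_partitions(cells)
```

Only the data entries falling in the returned cells would then be passed to conventional K-means.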

4 Experimental Design

The experiment was carried out by using CBEST on a subscene of Landsat image to identify

specific land covers for each ground mapping unit (image pixel) based on spectral information

only.


4.1 Experiment Data

Experiments were done on two datasets. For the first dataset, calibrated radiance of 6 spectral bands over the central district of Guangzhou, Guangdong Province, China was extracted from a Landsat 5 TM image (Path/Row: 122/44) acquired on January 2, 2009 (Figure 9). Guangzhou is the capital and the largest city of Guangdong province in the People’s Republic of China. It is located in the northern part of the Pearl River Delta, which consists of a number of municipalities along the south coast of Mainland China. Guangzhou has an area of 7,434.4 km2,

in which urban area takes up 3,843.43 km2. The population was about 12.78 million as of 2010.

As a result of humid subtropical climate influenced by Asian monsoon, Guangzhou has a hot and

humid summer when cloud covers degrade the quality of satellite images. Therefore, a cloud free

Landsat 5 TM scene acquired during the winter time on January 2, 2009 was chosen for the test.

As Guangzhou is located at the estuary connecting the Pearl River and South China Sea,

agriculture lands were along the Pearl River in the northwest part. Urban settlement and ports

were built near the estuary in the south as a result of sea trading for a long history. Mountainous

area in the northeast features forests and grasslands.

Figure 9 Test Area: Guangzhou, China

The image was clipped by the political boundary of Guangzhou city to yield a partially masked image of 1484 columns and 1449 rows, in which the number of unmasked pixels is 1,508,553. The TM sensor onboard Landsat 5 has 7 spectral bands, of which bands 1, 2, and 3 lie in the visible spectrum and bands 4, 5, and 7 in the near and middle infrared. These six bands have a 30 meter spatial resolution, which means each pixel nominally represents a 30 × 30 m2 land area. Band 6 is a thermal band with a 120 m resolution and is usually excluded from land cover classification due to the resolution difference.

The second dataset is from an Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) image

taken over Northwest Tippecanoe County, Indiana on June 12, 1992 (AVIRIS Image, 2013).

Tippecanoe County has a humid continental climate with warm summers and cold winters and a

significant proportion of the land is used for agriculture. The image was clipped to a subscene of 1467 × 614 pixels, which covers approximately 14 × 6 square miles on the ground. A river passes through the subscene in an east-west direction, with some forests near its north side. Only a few buildings for residential use are present on the image, while

most of the area is covered by corn and soybeans fields. There are 220 spectral bands of which

the wavelength ranges from 400 nm to 2500 nm. The data is calibrated into digital numbers

proportional to radiance. Although no georeference is found with the hyperspectral image,

another spatially pixel-to-pixel matched ground reference image was available for validation.

4.2 Preprocessing

Since classification was performed for an area covered by one scene, atmospheric correction was not necessary for either image (Gong and Howarth, 1990; Song et al., 2001). For a data exploration clustering algorithm, the linear radiometric calibration was not necessary either. ES

transformation was performed directly on Digital Number (DN) for spectral bands 1, 2, 3, 4, 5

and 7 to replace original spectral feature vectors with eigenvectors for the TM imagery. All

clustering experiments on the TM imagery were implemented on the gray-level reduced vectors

since it is known that K-means performed on such vectors could yield results closer to the global

optimum (Zha et al., 2001).

4.3 Methods

To determine whether CBEST could significantly improve efficiency while mildly

compromising accuracy for clustering large remote sensing dataset, the experiment was designed

in two parts (Figure 10).


Figure 10 Experiment Flow Chart

The first part is an efficiency and performance test, in which CBEST and conventional K-means were tested on the preprocessed TM image to compare the following indicators: total elapsed time, elapsed time per iteration (ETI) and number of total iterations t for efficiency; within-cluster sum of squares for performance. For the efficiency test, as the indicators are relative, time consumption is the major concern. The total elapsed time was used to estimate the overall time consumption of both algorithms when fully converged (capped at 1000 iterations). For the performance test, as the cluster centers derived by CBEST were not at local optima, the total within-cluster sum of squares was calculated to compare with K-means, which converges at a local minimum. Our empirical tests indicated that neither distance between cluster centers nor

pixel agreement could directly reflect the quality of the algorithm, so we used the total within-

cluster sum of squares, which is what K-means clustering intends to minimize. Comparisons

were implemented by plotting the indicators between the two algorithms against changing

variables including samples of the images, number of cluster centers k, number of eigenspace

partitions N and maximum number of iterations allowed tmax. We first test the relationship

between size of the data and running speed by systematic sampling the TM image at various

sizes of intervals. Then by varying the number of cluster centers, we examined how much speed

CBEST could gain and how much precision it could lose as number of cluster centers increases.

Number of eigenspace partitions determines the compression intensity directly and affects both

speed and accuracy. Lastly, we compared how efficiently both algorithms converge after a given

number of iterations because in practice, a limit on the number of iterations is always predefined

to avoid slow convergence leading to long running time. As CBEST and K-means both use randomized initial cluster centers, multiple runs were implemented to mitigate biased assessment due to randomness. In each run, CBEST and K-means were initialized with an identical number of cluster centers.

A total of 10 runs were implemented for each specific set of variables. In the test, we first varied

sampling interval from 10,000 to 1 while k was fixed at 16, and N was fixed at 1,000,000. Then k

was tested from 2 to 64 cluster centers, and number of eigenspace partitions was fixed at

1,000,000. When N was varied from 1000 to 10,000,000, k was fixed at 16. For most test runs,

the maximum number of iterations was set to 1000 in order to assess completely converged

clustering. Lastly, the maximum number of iterations was varied from 10 to 80 while N was fixed at

1,000,000, and k was fixed at 16. Mean, minimum and maximum values of the 10 runs were

calculated for sample means. This gave us an estimator of overall average efficiency and

performance while the minimum and maximum were intended to represent the range. In the

performance test, the minimum value of Within-Cluster Sum of Square (WCSS) was important

as the randomly initialized K-means is sometimes run for multiple times to ensure that WCSS is

closer to global minimum. The description of all indicators used in the test is listed in Table 9.

Table 9 Description of the Indicators

Indicator                        Description

Total Elapsed Time               Start counting right after ES Transformation and stop at converging or reaching the iteration limit.

Number of Iterations             The number of iterations that an algorithm goes through

Elapsed Time per Iteration       Total Elapsed Time / Number of Iterations

Within-Cluster Sum of Square     $J = \sum_{i=1}^{k} \sum_{v_j \in V_i} \lVert v_j - \mu_i \rVert^2$
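The performance indicator and its rescaling can be sketched as follows. The function names are illustrative; the rescaling example uses the k = 64 column of the WCSS rows reported later in Table 12 (values in units of 10^7), divided by the K-means minimum.

```python
def wcss(vectors, centers):
    """Sum of squared distances from each vector to its nearest center."""
    total = 0.0
    for v in vectors:
        total += min(sum((a - b) ** 2 for a, b in zip(v, c)) for c in centers)
    return total

def rescaled(values, baseline):
    """Express WCSS values as multiples of a baseline (the K-means minimum)."""
    return [value / baseline for value in values]

toy = wcss([(0, 0), (1, 0), (10, 0), (11, 0)], [(0.5, 0), (10.5, 0)])   # 1.0

# CBEST mean, CBEST min, K-means mean, K-means min at k = 64.
ratios = [round(r, 2) for r in rescaled([8.46, 7.54, 7.72, 6.70], 6.70)]
```

The resulting ratios reproduce the rescaled entries 1.26, 1.13, 1.15 and 1.00 in that column.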

In the second part, CBEST and K-means were compared in a practical land cover/use

classification scenario.

For the Landsat TM dataset, a classification system for land cover/use types in Guangzhou was

proposed in Table 10. Both land cover and land use were taken into account in the classification

system considering the complicated nature of urban, agriculture and forest mixture in the area,

though spectral based classification is more appropriate for identifying and mapping land cover

(Gong and Howarth, 1990; Ouma et al., 2006) than land use. Since unsupervised classification is

always used to explore a natural classification system matching the spectral patterns of the region

of study, we also discussed the potential of K-means in refining our land cover/use classification.

Table 10 Classification System for Guangzhou

Land Cover Description

Settlement/Residential Urban residential areas with tree, lawn and driveways

Industrial/Commercial Urban Industrial/Commercial built-up areas

Clearland Bright land cover that has been cleared for construction

Idle land Idle land with little growing vegetation

Orchard Sparse fruit trees and grass

Cropland/grassland Grass and crops

Urban Forest Dense tree canopy

Water body Including river, reservoirs and ponds

500 reference plots were collected for validation purposes by stratified sampling (using

stratification classes in Table 10), 138 of which were collected in the field in April and

December, 2009 and June, 2010. The difference between the sampling time and image

acquisition time was not long enough for major land use change to occur. Others were

interpreted from high resolution aerial photos and local knowledge of the area. To ensure the

identical confidence level of sampling for the strata, each stratum was allocated for more than 40

sampling units and no more than 80. The range of the strata sampling sizes is subject to the

different size of areas of corresponding strata. The sampling units were distributed evenly in the

entire area to avoid redundancy from similar neighboring objects as well as to increase the

overall sampling quality by outstretched coverage.

CBEST and K-means both were implemented for multiple runs with k=30. The best solutions

were used to cluster the preprocessed TM imagery. Multiple runs were implemented to ensure

the global optimum was approximated. On the other hand, overestimated number of cluster

centers was later reduced in post processing such that these 30 clusters were merged or split

based on automatic image interpretation criteria. Still the natural boundaries between clusters are

maintained. A confusion matrix as well as Kappa coefficient was then calculated for accuracy

assessment.

For the AVIRIS dataset, CBEST and K-means were also applied in a similar way. In this

experiment, we predefined k=100 since the ground reference consists of 58 subdivided classes.

However, some of the 58 classes are defined for better management and they exhibit great

spectral similarity among each other. For instance, corn is divided into subclasses based on

tillage approach such as no-till, min-till and clean-till. It is very difficult to distinguish one from

another with only spectral information during their growing season since they both grow the

same corn type in the fields. Therefore, these 58 classes were simplified into 10 classes in which

the 100 clusters were assigned in the post-classification process. The final maps were then

assessed for accuracies by comparing to the reference dataset, and their accuracies were

compared.

All programs used in our experiments were coded using Matlab 2011b and run on a computer featuring an Intel i5 760 2.8 GHz quad-core CPU and 8 gigabytes of physical memory.

5 Results and Analysis

5.1 Efficiency & Performance Test

The plot of the first test on speed gain with respect to data size is shown in Figure 11. Mean

values of elapsed time and iterations could be found in Table 11. As the data size increases, the time cost of CBEST increases much more slowly than that of K-means since CBEST only scans over

the entire dataset twice. At around 5000 samples, CBEST started to gain an edge over the K-

means and after that the gap significantly builds up between them. When using the entire dataset

of about 1.5 million data instances, K-means took approximately 100 times longer than CBEST

to converge, and CBEST is 15 times faster than K-means for each iteration on average. The slow growth of the time cost of CBEST is attributed to the fixed number of eigenspace partitions. As data size increases, more cells are occupied and more data are scanned in preprocessing and finalization for CBEST. This increase is minimal compared to that of K-means, for which the additional cost of the larger dataset is incurred in every iteration. As a result, CBEST gains a significant efficiency boost on large datasets, as confirmed by this experiment.

Table 11 Test Results w/respect to Data Size

Data Size (n)                               151     503    1509    5029   15086   30172   75428  150856  301711  754277  1508553

Total Elapsed Time (s)          CBEST      0.04    0.05    0.06    0.15    0.23    0.41    0.86    1.15    1.34    1.79     2.45

                                K-means    0.01    0.01    0.03    0.14    0.78    1.15    3.83    8.06   25.58   71.28   135.95

Number of Iterations            CBEST       8.9    17.6    30.3      61    67.6    78.8    95.2    83.9      82      72     66.2

                                K-means     8.8    19.5      34    63.9   123.7    89.6     151   154.4   234.5   233.2    225.7

Elapsed Time per Iteration (s)  CBEST     0.005   0.003   0.002   0.003   0.003   0.005   0.010   0.014   0.017   0.026    0.041

                                K-means   0.001   0.000   0.001   0.002   0.006   0.013   0.025   0.052   0.109   0.305    0.603

Figure 11 Speed Comparison w/respect to Data Size. (a) Elapsed Time Comparison; (b)

Elapsed Time ratio (how many times faster); (c) ETI Comparison; (d) ETI ratio

The summary of the efficiency test with respect to k is given in Table 12 (values are means of 10 runs unless otherwise specified) and illustrated in Figure 12. As expected, the speed advantage of CBEST over K-means grew as the number of cluster centers increased. The ratios calculated in (b) and (d) directly represent the number of times that CBEST is faster than K-means, which rose to about 100 in this configuration as k increased to 48 and beyond. K-means required roughly 4-5 times as many total iterations as CBEST, which also contributes to CBEST’s shorter running time. It should also be noted that the total elapsed time showed greater variation as a result of the large variation in iteration numbers. For instance, when k=64 for K-means, the iteration number ranges from 410 to 907 with an average of 629, and the maximum is over twice the minimum.

[Figure 11 plots. x-axis in all panels: number of samples (log scale); y-axes: (a) elapsed time (s), (b) time ratio (K-means/CBEST), (c) elapsed time per iteration (ETI), (d) ETI ratio (K-means/CBEST). Legend: CBEST, K-means, max/min ranges, and a reference line at ratio = 1.]

Table 12 Test Results w/respect to k

Number of Cluster Centers k                           2       4       8      16      24      32      48       64

Total Elapsed Time (s)          CBEST              0.66    0.89    1.50    2.55    4.25    5.25    6.22     9.06

                                K-means             8.9    16.9    46.7   127.8   306.3   432.0   700.5  1,055.8

Number of Iterations            CBEST                11      27      55      77     114     124     106      136

                                K-means              49      66     122     226     411     444     533      629

Elapsed Time per Iteration (s)  CBEST              0.07    0.03    0.03    0.04    0.04    0.04    0.06     0.07

                                K-means            0.18    0.26    0.38    0.57    0.74    0.97    1.31     1.68

WCSS (×10^7)                    CBEST mean        93.74   57.48   38.35   19.64   13.60   11.67    9.73     8.46

                                CBEST min         93.66   57.46   31.84   17.10   13.30   11.45    9.54     7.54

                                K-means mean      93.65   57.45   38.76   17.66   13.19   11.33    9.26     7.72

                                K-means min       93.65   57.45   38.76   16.84   13.15   11.27    9.23     6.70

WCSS (rescaled to K-means min)  CBEST mean         1.00    1.00    0.99    1.17    1.03    1.04    1.05     1.26

                                CBEST min          1.00    1.00    0.82    1.02    1.01    1.02    1.03     1.13

                                K-means mean       1.00    1.00    1.00    1.05    1.00    1.01    1.00     1.15

                                K-means min        1.00    1.00    1.00    1.00    1.00    1.00    1.00     1.00


Figure 12 Efficiency w/respect to k. (a) Elapsed Time Comparison; (b) Elapsed Time ratio

(how many times faster); (c) ETI Comparison; (d) ETI ratio; (e) Rescaled Within-Cluster

Sum of Square average; (f) Rescaled Within-Cluster Sum of Square Best/Worst Case.

[Figure 12 plots. x-axis in all panels: number of cluster centers (0 to 60); y-axes: (a) elapsed time (s, log scale), (b) time ratio (K-means/CBEST), (c) elapsed time per iteration (ETI), (d) ETI ratio (K-means/CBEST), (e) and (f) within-cluster sum of squares rescaled to the K-means minimum. Legends: CBEST and K-means with max/min ranges in (a); CBEST average, K-means average and K-means best (baseline) in (e); CBEST worst, CBEST best and K-means best (baseline)/worst in (f).]

The performances of both CBEST and K-means are illustrated in Figure 12 (e, f), using the lowest within-cluster sum of squares calculated by K-means as a baseline against which other values were rescaled. Figure 12e characterizes the average K-means and CBEST performance: CBEST is less effective across the varying numbers of cluster centers, with the exception of k=8, which suggests that CBEST could reach a better solution on average for certain numbers of initial centers. The minimum values presented in Figure 12f suggest the same finding: at k=8, the best result of CBEST reached a WCSS of approximately 80% of that derived by K-means. In other cases, the best solution CBEST yielded was slightly higher than that achieved using K-means. CBEST did not find a close solution at k=64. The average WCSS of K-means in this case is also quite high: K-means was able to converge close to the minimal WCSS of 67 million in 3 of 10 runs, while CBEST was able to converge to 75 million only once. The other solutions converged above 81 million for both algorithms. We reason that as k increases, the number of eigenspace partitions N should also increase to achieve finer resolution in the eigenspace, in order to compensate for the greater information loss due to increased between-cluster boundary surfaces. Otherwise the average number of instances in an eigenspace partition is too large for the partitions to be divided among more clusters. This conclusion is strengthened by the following experiments.

The comparison of the means of the indicators for varying N is shown in Table 13. The minimum WCSS is also shown in the table as a reference for the optimal solutions. The range and mean are plotted in Figure 13. The figure is plotted with the number of occupied partitions on the x-axes because it is this number that determines the time and space complexity of CBEST.

Table 13 Test Results w/respect to N

                                            CBEST                                                                                   K-means

Expected Eigenspace Partitions N             1000    5000   10000   50000  100000  500000  1000000  5000000  10000000      n/a

Eigenspace Partitions N'                     1008    2700   15300   41184   88200  338580   505050  2079168  11450160      n/a

Occupied Partitions                           554    1370    4423    9904   19091   53640    71434   185901    295316      n/a

Total Elapsed Time (s)                       0.74    0.74    0.84    0.93    1.11    2.03     2.26     6.75     12.70   143.33

Number of Iterations                            9      16      25      32      43      69       63      101       111      256

Elapsed Time per Iteration (s)               0.09    0.05    0.04    0.03    0.03    0.03     0.04     0.07      0.11     0.56

WCSS (×10^8)                       mean      2.67    2.26    2.31    2.24    2.15    2.09     1.96     1.84      1.83     1.77

                                   min       1.90    1.81    1.80    1.74    1.74    1.72     1.72     1.72      1.71     1.68

WCSS (rescaled to K-means min)     mean      1.59    1.34    1.37    1.33    1.27    1.24     1.17     1.09      1.09     1.05

                                   min       1.13    1.07    1.07    1.04    1.03    1.02     1.02     1.02      1.02     1.00

Figure 13 Efficiency w/respect to N. (a) Elapsed Time Comparison; (b) Elapsed Time ratio

(how many times faster); (c) ETI Comparison; (d) ETI ratio. (e) Within-Cluster Sum of

Squares Comparison; (f) Within-Cluster Sum of Squares Limited by various max numbers

of Iterations.

[Figure 13 plots. x-axis in panels (a) to (e): number of eigenspace partitions (log scale); x-axis in panel (f): maximum number of iterations (10 to 80); y-axes: (a) elapsed time (s), (b) time ratio (K-means/CBEST), (c) elapsed time per iteration (ETI), (d) ETI ratio (K-means/CBEST), (e) and (f) within-cluster sum of squares rescaled to the minimum. Legends: CBEST with max/min range in (a); CBEST average, K-means average, CBEST best/worst and K-means best/worst in (e) and (f).]

The advantage of CBEST over K-means gradually decreases as the number of partitions increases. This is expected, since the number of partitions in eigenspace determines the complexity of the CBEST algorithm. In contrast to the total elapsed time, the elapsed time per iteration (ETI, Figure 13c) follows a quadratic-like curve, because preprocessing (scanning the entire dataset into eigenspace partitions) and post-processing (assigning memberships) carry a large share of the run time. In this case, ETI does not indicate the average time of each iterative update but the overall time consumption divided evenly over the iterations. Initially, most of the time is spent in preprocessing and post-processing, so a small number of iterations makes each iterative interval longer. As the number of partitions increases, more of the time is consumed in the iterative calculation and update steps, and thus the ETI slowly drops to its lowest level, until the cost of the growing number of eigenspace partitions outweighs the effect of the number of iterations and of pre- and post-processing.

The performance of CBEST, measured by WCSS (Figure 13e), on average approaches that of K-means as the number of eigenspace partitions N increases. The lowest WCSS for CBEST starts at 1.13 times that of K-means and decreases to 1.02 times as the number of partitions increases to 10 million in the test. It can be inferred that in the limiting case, where N is large enough that every occupied partition contains only a single data instance, the clustering process would be identical to that of K-means.
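The rescaled WCSS and the ETI quoted above can be recomputed from the raw entries of Table 13. The short sketch below does so for a few columns; the function names are illustrative only, and the inputs are the rounded values printed in the table, so the last digit may differ slightly from values computed with unrounded timings.

```python
# Values copied from Table 13 (WCSS in units of 10^8, times in seconds).
KMEANS_MIN_WCSS = 1.68
cbest_min_wcss = {1_000: 1.90, 1_000_000: 1.72, 10_000_000: 1.71}


def rescale(wcss, baseline=KMEANS_MIN_WCSS):
    """Express a WCSS value as a multiple of the best K-means WCSS."""
    return wcss / baseline


def eti(total_seconds, iterations):
    """Elapsed time per iteration: total run time spread over all iterations."""
    return total_seconds / iterations


for n, w in cbest_min_wcss.items():
    print(f"N={n:>10,}: rescaled min WCSS = {rescale(w):.2f}")
print(f"K-means ETI = {eti(143.33, 256):.2f} s")
```

Because ETI divides the whole run time, including pre- and post-processing, by the iteration count, a run with few iterations shows a high ETI even when each update step is cheap.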

Furthermore, we can explore how much the information lost by dropping the eigen axes with lower eigenvalues affects the accuracy of CBEST. As mentioned above, we empirically set the minimum number of partitions for a 1-dimensional subspace to 3, since this is the minimum number of partitions that allows a confidence interval to be applied. In the experiment in which N gradually increases, the number of dimensions starts at 4, increases to 5 at N=10,000, and then to 6 at N=10,000,000 (Table 14). However, no abrupt boost in performance can be observed in Table 13 and Figure 13 at the two break points. In particular, for N from 500,000 to 10,000,000, the best performance almost stabilizes at 1.02 times that of K-means. It seems that the additional 3 partitions assigned to the 6th eigen axis have little impact on the clustering performance.

Table 14 Assignment of Eigenspace Partitions for Eigen Axes

Expected N          1000      5000     10000     50000    100000    500000   1000000   5000000  10000000   Eigen Values
Eigen Axis 1          12        15        17        22        25        33        37        48        54          626.2
Eigen Axis 2           7         9        10        13        14        19        21        28        31          204.7
Eigen Axis 3           4         5         6         8         9        12        13        17        19           78.5
Eigen Axis 4           3         4         5         6         7         9        10        13        15           48.0
Eigen Axis 5           -         -         3         3         4         5         5         7         8           13.3
Eigen Axis 6           -         -         -         -         -         -         -         -         3            1.5

(A dash marks an eigen axis that is not used at that N.)
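As a consistency check between Table 14 and Table 13, the actual number of eigenspace partitions N' should equal the product of the per-axis partition counts. The sketch below assumes that the shorter rows of Table 14 (eigen axes 5 and 6) belong to the rightmost columns, which matches the statement that the fifth axis enters at N=10,000 and the sixth at N=10,000,000.

```python
from math import prod

expected_n = [1_000, 5_000, 10_000, 50_000, 100_000,
              500_000, 1_000_000, 5_000_000, 10_000_000]

# Partitions per eigen axis, from Table 14; None means the axis is not used.
axes = [
    [12, 15, 17, 22, 25, 33, 37, 48, 54],
    [7, 9, 10, 13, 14, 19, 21, 28, 31],
    [4, 5, 6, 8, 9, 12, 13, 17, 19],
    [3, 4, 5, 6, 7, 9, 10, 13, 15],
    [None, None, 3, 3, 4, 5, 5, 7, 8],
    [None, None, None, None, None, None, None, None, 3],
]


def actual_partitions(col):
    """Multiply the partition counts of all axes used in column `col`."""
    return prod(row[col] for row in axes if row[col] is not None)


n_prime = [actual_partitions(c) for c in range(len(expected_n))]
for n, npr in zip(expected_n, n_prime):
    print(f"expected N = {n:>10,}   N' = {npr:>10,}")
```

The products reproduce the "Eigenspace Partitions N'" row of Table 13, which supports the column alignment assumed here.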

There is an issue of how well the means, as opposed to the minimums, can be used to justify the experiment. The minimums indicate the best solutions that the algorithms achieved in 10 runs, while the means indirectly reflect the possibility that the algorithms are trapped in distant local optima. As an example of a distant local optimum, when K-means was run with 64 cluster centers, 7 runs converged to a local optimum with a WCSS of 81 million, as opposed to the other 3 runs at 67 million. Although K-means is more likely than CBEST to get past a distant local optimum and reach a solution close to the global optimum, 10 runs are not sufficient to generalize the frequency of reaching a distant local optimum. Consequently, the WCSS averaged over only 10 runs is subject to bias from possibly multiple distant local optima. As for the minimum values, 10 runs are likely, though not guaranteed, to generate at least one solution that is close to the global optimum. When k is not too large, however, the comparison can be made with more confidence.

The performance of CBEST as opposed to K-means was further examined by limiting the

maximum number of iterations allowed in a single clustering implementation (Table 15, Figure

13f).

Table 15 Performance Test with respect to the Maximum Number of Iterations

Maximum Iteration                            10     20     30     40     50     60     70     80
WCSS (×10^8)            CBEST     Mean     2.52   2.42   2.09   1.92   2.32   2.41   2.06   2.26
                        CBEST     Min      1.78   1.74   1.73   1.71   1.72   1.74   1.72   1.72
                        K-means   Mean     2.57   2.26   2.01   1.78   2.16   2.01   1.70   1.70
                        K-means   Min      2.15   1.71   1.70   1.68   1.69   1.69   1.68   1.68
Rescaled WCSS           CBEST     Mean     1.17   1.42   1.23   1.14   1.37   1.43   1.22   1.34
                        CBEST     Min      0.83   1.02   1.02   1.02   1.02   1.03   1.02   1.02
                        K-means   Mean     1.20   1.32   1.18   1.05   1.28   1.19   1.01   1.01
                        K-means   Min      1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00

The best solutions (min WCSS) derived by K-means were again used as the baseline for rescaling. Within 80 iterations, the best solution generated by CBEST is steadily around 2% higher than that of K-means, with an exception at a limit of 10 iterations, where CBEST converged to a much lower WCSS. However, this could also be caused by the insufficient number of runs discussed above. Since the two algorithms converge at similar speed, the larger number of iterations of K-means observed in the above experiments may be explained as follows: CBEST usually reaches the separation limits bounded by the eigenspace partitions much earlier, while K-means continues to minimize WCSS by separating without limitation.

In general, the efficiency and performance tests support the following points:

• The time cost of CBEST increases much more slowly than that of K-means as data size increases;
• As k increases, CBEST is faster than K-means, but the rate of increase slows down;
• Both CBEST and K-means could converge to an optimum far from the global optimal solution, but CBEST has a higher possibility of failure given that an identical random set of initial centers is used;
• CBEST slows down as N increases, but approaches a local optimum more closely;
• As N increases, CBEST has a higher possibility of approaching a local optimum close to the global solution;
• CBEST and K-means converge at almost identical efficiency.

Consequently, to carry out CBEST clustering in practice with both efficiency and accuracy in mind, an appropriate strategy is to choose a balanced N that allows multiple fast runs to approximate the global optimum. However, the balanced N may not be properly estimated without experimenting on the data first. For large datasets, one could first draw a sample from the dataset and experiment with various N to empirically find an appropriate value.
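The sampling strategy above can be tried with a simplified stand-in for CBEST. The sketch below is not the dissertation's implementation: it assumes equal-width bins along each leading eigen axis (the text sets partitions with confidence intervals), runs plain weighted Lloyd iterations on the occupied cells, and all function names are hypothetical. It also assumes that the number of binned axes does not exceed the number of bands and that k does not exceed the number of occupied cells.

```python
import numpy as np


def compress(X, bins_per_axis):
    """Project X onto its leading eigen axes and bin it on a regular grid.

    Returns the centroid and pixel count of every occupied cell, plus the
    index of the occupied cell that each row of X falls into.
    """
    Xc = X - X.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    lead = np.argsort(evals)[::-1][: len(bins_per_axis)]
    Z = Xc @ evecs[:, lead]
    cell_id = np.zeros(len(Z), dtype=np.int64)
    for j, b in enumerate(bins_per_axis):
        lo, hi = Z[:, j].min(), Z[:, j].max()
        pos = ((Z[:, j] - lo) / (hi - lo + 1e-12) * b).astype(np.int64)
        cell_id = cell_id * b + np.minimum(pos, b - 1)
    _, inverse, counts = np.unique(cell_id, return_inverse=True, return_counts=True)
    inverse = inverse.ravel()
    sums = np.zeros((len(counts), Z.shape[1]))
    np.add.at(sums, inverse, Z)  # accumulate per-cell coordinate sums
    return sums / counts[:, None], counts, inverse


def weighted_kmeans(points, weights, k, max_iter=100, seed=0):
    """Lloyd iterations on cell centroids, each weighted by its pixel count."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(max_iter):
        labels = ((points[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        new = centers.copy()
        for c in range(k):
            m = labels == c
            if m.any():
                new[c] = np.average(points[m], axis=0, weights=weights[m])
        if np.allclose(new, centers):
            break
        centers = new
    return ((points[:, None] - centers[None]) ** 2).sum(-1).argmin(1)


def wcss(X, labels):
    """Within-cluster sum of squares of X under the given labels."""
    return sum(((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum()
               for c in np.unique(labels))


def cbest_like(X, bins_per_axis, k, seed=0):
    """Cluster the occupied cells, then hand each pixel its cell's label."""
    cells, counts, inverse = compress(X, bins_per_axis)
    cell_labels = weighted_kmeans(cells, counts, k, seed=seed)
    return cell_labels[inverse], len(counts)


# Try several grid sizes on a random sample of a synthetic 4-band dataset.
rng = np.random.default_rng(42)
data = np.vstack([rng.normal(m, 1.0, size=(2000, 4)) for m in (0.0, 6.0, 12.0)])
sample = data[rng.choice(len(data), size=1500, replace=False)]
for bins in [(4, 3), (12, 7, 4, 3), (25, 14, 9, 7)]:
    labels, occupied = cbest_like(sample, bins, k=3)
    print(bins, "occupied cells:", occupied, "WCSS:", round(wcss(sample, labels), 1))
```

Comparing the WCSS and the number of occupied cells across grid sizes on the sample gives a rough indication of where a finer grid stops paying off before the full dataset is processed.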

5.2 Application Experiments

5.2.1 Landsat TM Image

Both CBEST and K-means were run 10 times, and the run with the lowest WCSS was chosen to proceed to post-processing. The number of clusters k was set to 30. For CBEST, N was set to 1,000,000 (CBEST is approximately 85x faster than K-means in this configuration on average). The 30 clusters were then split and merged down to 8 land cover/use classes in post-processing. For splitting and merging, we developed an approach guided by the spectral information of ground truth samples projected in eigenspace. If a cluster was considered to be a mixture of more than one class, it was split and its members were reassigned to its neighboring clusters in eigenspace. Multiple clusters that were considered to correspond to one class were merged into one cluster. Specifically, we first assumed a Gaussian distribution for each of the 8 classes and calculated the log likelihoods of the 30 cluster centers with respect to the 8 Gaussian-distributed classes. Each cluster was assigned to the class with the outstandingly highest log likelihood. To determine whether the highest log likelihood was sufficiently outstanding, we set a threshold to test whether the two highest log likelihood values were close, in which case the cluster was treated as a mixture of at least two classes and was therefore split, with its members reassigned to its neighboring clusters in eigenspace. In this experiment, one cluster was split because its two highest log likelihood values were close, and the remaining 29 clusters were merged into their corresponding 8 classes for both algorithms.
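The labeling rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the class means and covariances are taken as given (in practice they would be estimated from the ground truth samples in eigenspace), "close" is interpreted as a difference between the two highest log likelihoods below a user-chosen `min_gap`, and the function names are hypothetical.

```python
import numpy as np


def gaussian_loglik(x, mean, cov):
    """Log density of a multivariate normal N(mean, cov) evaluated at x."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)  # squared Mahalanobis distance
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet + maha)


def label_clusters(centers, class_means, class_covs, min_gap):
    """Return one class index per cluster center, or -1 for a mixed cluster.

    A cluster is flagged as mixed (a candidate for splitting) when its two
    highest log likelihoods differ by less than `min_gap`.
    """
    labels = []
    for c in centers:
        ll = np.array([gaussian_loglik(c, m, s)
                       for m, s in zip(class_means, class_covs)])
        best, second = np.sort(ll)[::-1][:2]
        labels.append(int(ll.argmax()) if best - second >= min_gap else -1)
    return labels


# Toy example with two classes in a 2-D eigenspace.
means = [np.array([0.0, 0.0]), np.array([10.0, 0.0])]
covs = [np.eye(2), np.eye(2)]
centers = np.array([[0.1, 0.0], [9.8, 0.2], [5.0, 0.0]])
print(label_clusters(centers, means, covs, min_gap=1.0))
```

In the toy example the third center lies midway between the two classes, so it is flagged as mixed, which corresponds to the single cluster that was split in the experiment.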

We compared the classification results with the ground truth data and generated confusion matrices (Table 16). Their overall accuracies are summarized in Table 17.

Table 16 Confusion Matrices for Validation (Landsat)

K-means (rows: Reference; columns: Classified)

Reference     Clearland   Cropland     Forest       Idle   Industry    Orchard Settlement      Water      P Acc
Clearland            13          0          0          7         23          0          1          0     29.55%
Cropland              0         61          7          4          0          3          2          0     79.22%
Forest                0          2         73          0          0          6          2          0     87.95%
Idle                  0         20          0         24          0          0          1          0     53.33%
Industry              9          0          0          1         52          0          9          0     73.24%
Orchard               0          0         15          0          0         33          0          0     68.75%
Settlement            0          0          0          5          0          0         84          2     92.31%
Water                 0          0          0          0          0          0          4         37     90.24%
U Acc            59.09%     73.49%     76.84%     58.54%     69.33%     78.57%     81.55%     94.87%

CBEST (rows: Reference; columns: Classified)

Reference     Clearland   Cropland     Forest       Idle   Industry    Orchard Settlement      Water      P Acc
Clearland            37          0          0          5          1          0          1          0     84.09%
Cropland              0         65          7          3          0          0          2          0     84.42%
Forest                0          3         68          0          0         12          0          0     81.93%
Idle                  0         14          1         30          0          0          0          0     66.67%
Industry             18          0          0          1         41          0         11          0     57.75%
Orchard               0          0          9          0          0         39          0          0     81.25%
Settlement            0          1          0          1          0          0         87          2     95.60%
Water                 0          0          0          0          0          0          5         36     87.80%
U Acc            67.27%     78.31%     80.00%     75.00%     97.62%     76.47%     82.08%     94.74%

Table 17 Summary of Classification Results (Landsat)

                               CBEST     K-means
WCSS (×10^8)                   1.179       1.166
Average Producer’s Acc        79.94%      71.82%
Average User’s Acc            81.44%      74.04%
Overall Accuracy              80.60%      75.40%
Kappa Coefficient              0.775       0.713
Agreement (between maps)           82.05%
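The summary statistics in Table 17 follow from the confusion matrices in Table 16 by the standard definitions of producer's accuracy, user's accuracy, overall accuracy and Cohen's kappa. The sketch below recomputes them for the K-means matrix, reading rows as reference classes and columns as mapped classes; the function name is illustrative.

```python
import numpy as np

# Rows: reference class; columns: class assigned by K-means (Table 16).
kmeans_cm = np.array([
    [13,  0,  0,  7, 23,  0,  1,  0],
    [ 0, 61,  7,  4,  0,  3,  2,  0],
    [ 0,  2, 73,  0,  0,  6,  2,  0],
    [ 0, 20,  0, 24,  0,  0,  1,  0],
    [ 9,  0,  0,  1, 52,  0,  9,  0],
    [ 0,  0, 15,  0,  0, 33,  0,  0],
    [ 0,  0,  0,  5,  0,  0, 84,  2],
    [ 0,  0,  0,  0,  0,  0,  4, 37],
])


def accuracy_summary(cm):
    """Return producer's accuracies, user's accuracies, overall accuracy, kappa."""
    total = cm.sum()
    diag = np.diag(cm)
    producers = diag / cm.sum(axis=1)  # correct / reference total, per class
    users = diag / cm.sum(axis=0)      # correct / mapped total, per class
    overall = diag.sum() / total
    # Agreement expected by chance, from the row and column marginals.
    expected = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / total**2
    kappa = (overall - expected) / (1 - expected)
    return producers, users, overall, kappa


producers, users, overall, kappa = accuracy_summary(kmeans_cm)
print(f"average producer's accuracy: {producers.mean():.2%}")
print(f"average user's accuracy:     {users.mean():.2%}")
print(f"overall accuracy:            {overall:.2%}")
print(f"kappa:                       {kappa:.3f}")
```

The same function applied to the CBEST matrix gives the CBEST column of Table 17.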

To assess the spectral information, the ground reference data points are plotted pairwise on the first three eigen axes (Figure 14). Several class pairs are mixed, such as clearland and industrial land, orchard and forest, and cropland and idle land. Spectrally, clearland can be grouped into two widely separated subclasses, one of which is closely mixed with industrial use. Clearland is sometimes cleared for construction and thus covered by paved ground with little vegetation, which resembles commercial/industrial cover. As a result, both algorithms yielded poor classification accuracies for these two classes, particularly K-means. Orchard and forest are both tree covers and are thus easily mixed in spectral space. As idle land is sometimes covered by weeds, its spectral signature can shift towards that of less vegetated cropland and grassland.


Figure 14 Scatterplot of Ground Truth

Land cover/use maps were created using both algorithms (Figure 15). In general, the classification results agree with the layout of the area when compared with the validation samples (Figure 16). Residential areas are located near the Pearl River in the south and west. Agricultural lands and villages characterized by sparse residential areas are in the northwest of Guangzhou. Orchards and sparse tree canopies are spread out in the northern part and in forested areas, and there are also some along the Pearl River in the south. Although the two algorithms are biased toward different classes for intermixed pixels, overall the results are similar, as 82.05% of pixels are identically classified.


Figure 15 Land Cover/Use Map derived by K-means and CBEST in Guangzhou

Figure 16 Validation Samples as Ground Reference in Guangzhou

5.2.2 AVIRIS Hyperspectral Image

The number of clusters k was set to 100 because of the large number of classes (58) provided in the reference data. For CBEST, N was set to 10,000,000 because of the large number of bands in the hyperspectral image. CBEST and K-means were both used to cluster the hyperspectral image. The results were then compared with the reference in order to combine and simplify the 58 classes, since some classes were either too detailed or too similar to another class in spectral space. The final classification system consists of 10 classes: Urban, Corn, Soybeans, Wheat/Oats, Grass/Pasture, Hay, Woods, Swampy Area, Water and Other (Bare soils, Not Cropped and Orchard). Each of the 100 clusters was assigned to one of the 10 classes, namely the class with the highest pixel count within the cluster. Lastly, the final land cover maps were produced by applying a majority filter to the reassigned cluster maps to reduce the salt-and-pepper effect. The majority filter was applied because the image primarily consists of agricultural lands, and thus pixels of the same land cover are very likely connected in large patches. The resulting maps are shown in Figure 17.
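The cluster-to-class assignment by highest pixel count can be sketched as a cross-tabulation followed by a row-wise maximum. The function and argument names below are hypothetical, reference pixels without a label are assumed to be coded as -1, and the majority filter is not included.

```python
import numpy as np


def clusters_to_classes(cluster_map, reference_map, n_clusters, n_classes,
                        unlabeled=-1):
    """Map each cluster to the reference class it overlaps most often.

    Returns the relabeled map and the cluster-to-class lookup table.
    Clusters without any labeled pixel fall back to class 0.
    """
    valid = reference_map != unlabeled
    table = np.zeros((n_clusters, n_classes), dtype=np.int64)
    # Count co-occurrences of (cluster, class) over the labeled pixels.
    np.add.at(table, (cluster_map[valid], reference_map[valid]), 1)
    lookup = table.argmax(axis=1)
    return lookup[cluster_map], lookup


# Tiny example: 3 clusters, 2 classes, one unlabeled reference pixel.
cluster_map = np.array([[0, 0, 1],
                        [1, 2, 2]])
reference_map = np.array([[0, 0, 1],
                          [1, 1, -1]])
class_map, lookup = clusters_to_classes(cluster_map, reference_map, 3, 2)
print(lookup)
print(class_map)
```

Indexing the lookup table with the cluster map relabels the whole image in one step, which matters when the image holds millions of pixels.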


Figure 17 Mapping Results in Tippecanoe County

The accuracies of the two maps and the percentage of agreement between them were calculated (Table 18). Despite the large gap in computing time, with CBEST almost 40 times faster than K-means, the two algorithms generated products at the same level of accuracy with respect to the given reference. Neither K-means nor CBEST was capable of separating corn from soybeans effectively. If the two crop types are combined into one class, the accuracy goes up to 88%. The increase in accuracy is partly attributed to the dominance of these two classes, which comprise approximately 66% of all pixels in the reference image. Another reason is that the two classes have relatively close spectral curves by visual inspection and are thus difficult to separate. The limitation of K-means when dealing with high-dimensional data is another possible source of accuracy loss. Furthermore, the reference polygons may not be as uniform as they were labelled. On close inspection, the textures in some polygons of the reference are complicated. For instance, the urban reference polygons cover entire neighbourhoods, which contain only a few sparsely distributed buildings while most of the area is covered by planned lawns, bushes and trees. All of the issues stated above could contribute to the misclassification in this experiment. To sum up, it is reasonable to interpret the results as an indication that K-means and CBEST can achieve a similar level of accuracy in clustering hyperspectral data.


Table 18 Summary of Class Results (AVIRIS)

                                      CBEST                             K-means
Elapsed Time (Seconds)                107 (ES Transformation 10 s)      4071
Accuracy                              65.8%                             65.7%
Accuracy (Corn/Soybeans Combined)     88.4%                             88.1%
Agreement (between maps)              81.0%
Agreement (Corn/Soybeans Combined)    93.4%

6 Discussions

There is a concern that the eigenspace transformation could cost an amount of time comparable to K-means under certain circumstances. Theoretically, the computing complexity of the eigen analysis is O(np^2), while that of K-means is O(npkt). However, the process of K-means is more complicated and redundant than the eigen analysis, since it involves calculating distance pairs, searching for minimum distances, recalculating cluster centers, and calculating and checking the termination flag. Each of these steps requires scanning over the entire dataset. To give an estimate, an empirical parameter b can be specified such that when np^2 grows large enough to approach b·npkt, the eigen analysis would take as long as K-means and thus CBEST would not be a good choice. In the Efficiency & Performance Test, we did not include the time of the eigen analysis in our time comparison tables, because the eigen analysis performed on the TM image took only 0.1 seconds on average over 10 runs under the same hardware configuration. Compared with a run time of over one hundred seconds for K-means, this time is relatively small. In the later experiment with the hyperspectral image, although the eigen analysis time went up to 10 seconds, it is still not significant compared with the 4071 seconds of processing time for K-means. From the AVIRIS image experiment with p=220, k=100 and t=100, this empirical parameter b can be approximated as around 9, which is large enough to ensure that the eigen analysis is not time consuming in most cases for remotely sensed data, even for hyperspectral data with over 200 bands.
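The text does not spell out how the value of b is obtained. One reading that reproduces a value close to 9 is to assume that each run time is proportional to its operation count (np^2 for the eigen analysis, npkt for K-means) and to take b as the ratio of the two per-operation costs, so that the number of pixels n cancels. The sketch below follows that assumption; the function name is hypothetical.

```python
def break_even_factor(t_eigen, t_kmeans, p, k, t):
    """Empirical b at which eigen analysis would cost as much as K-means.

    Assumes time_eigen = c1 * n * p**2 and time_kmeans = c2 * n * p * k * t.
    The two times are equal when n * p**2 = (c2 / c1) * n * p * k * t,
    so b = c2 / c1, and the pixel count n cancels out.
    """
    cost_per_op_eigen = t_eigen / p**2
    cost_per_op_kmeans = t_kmeans / (p * k * t)
    return cost_per_op_kmeans / cost_per_op_eigen


# AVIRIS experiment: 10 s eigen analysis, 4071 s K-means, p=220, k=100, t=100.
b = break_even_factor(t_eigen=10, t_kmeans=4071, p=220, k=100, t=100)
print(f"b = {b:.2f}")
```

Under this reading, the eigen analysis would only match the K-means run time once p^2 grows to about nine times pkt, which for k=100 and t=100 corresponds to tens of thousands of bands.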

In the experiment on the Landsat TM satellite image, the post-processed CBEST map was more accurate than the post-processed K-means map on this validation sample set. However, since the two algorithms are of the same nature, one could argue that the difference simply reflects two results from two respective local optima. Although both CBEST and K-means aim to find optimal clustering patterns for the data, they start with different random initial cluster centers and may be trapped in different local minima, so their clustering partition boundaries can differ. In practice, some local minima can yield better classification results than the global minimum, depending on the configuration of the original dataset, the classification system and the validation dataset. In the experiment with the AVIRIS data, however, CBEST and K-means yielded nearly identical results: they performed at the same level of accuracy and showed the same weakness in separating certain classes. The land cover/use mapping application demonstrated here is merely intended to show that CBEST is capable of handling classification practice in remote sensing studies.


The experiments conducted above demonstrate the potential of CBEST in land cover and land use classification. The pros and cons of CBEST compared to K-means in remote sensing applications are summarized as follows:

Pros:

• CBEST is fast when dealing with large datasets;
• CBEST converges earlier;
• By choosing a proper N, CBEST could maintain a performance level similar to K-means;

Cons:

• CBEST does not guarantee a local optimum;
• A proper N needs to be estimated prior to application;
• CBEST is more sensitive to initial centers.

Consequently, we can roughly conclude that CBEST enhances the advantages of the conventional K-means while suffering almost identical drawbacks. However, the speed gain is so significant that more agile strategies can be applied with CBEST. For instance, for an extremely large dataset, one can only afford to run K-means a few times with a low number of iterations, whereas CBEST can be run many times and may converge to a solution that is better than that of K-means in a shorter period of time.

In practice, CBEST has already been applied to study the distribution and temporal changes of surface cover colors over the entire country of China with 8-day composite MODIS satellite images from 2001 to 2010 (Fu et al., In press). That study showed that the method saved considerable computation time in comparison to the conventional K-means.

In general, CBEST is competitive in tasks where a certain loss of accuracy can be tolerated in exchange for greatly improved efficiency in processing large datasets. For instance, one may want to scan a large database to find clusters that are small in number and whose centers are difficult to detect by training. In the context of classification, CBEST can be used to explore large datasets to search for distinct cluster centers that are unlikely to belong to any existing class defined in a supervised classification. One can also use CBEST to search for clusters of small populations within existing classes in order to split existing classes into finer subclasses. For instance, if supervised classification is applied to the image used in our experiment, there are plenty of subclasses that could be further explored, such as the two subclusters within the clearland class, which are separated by a large eigenspace distance and correspond to different cleared ground covers. In remote sensing, the image we experimented with is hardly considered a large dataset, as it is only a subset of one TM scene. For clustering the global land areas, for instance, the total land area on Earth is around 150 million km^2, comprising roughly 150 million pixels at a coarse spatial resolution of 1 km, which could take days for K-means to process but maybe only a few minutes for CBEST. With this gained efficiency, we could deal with higher spatial resolutions much faster for large areas. Consider a scenario in which we increase the spatial resolution to 30 meters for clustering tests over all global land areas, at the same scale as the global land cover product produced with TM and ETM+ data (Gong et al., 2013). The data size increases by more than 1000 times to over 150 billion pixels in total. It could be an impossible task for K-means to complete within a reasonable amount of time. However, the nature of CBEST makes this type of task feasible.
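The pixel counts quoted above follow from simple area arithmetic, sketched here with the rounded land area of 150 million km^2 used in the text; the function name is illustrative.

```python
LAND_AREA_KM2 = 150e6  # approximate global land area quoted in the text


def pixel_count(area_km2, resolution_m):
    """Number of square pixels of the given side length needed to tile an area."""
    pixels_per_km2 = (1000.0 / resolution_m) ** 2
    return area_km2 * pixels_per_km2


coarse = pixel_count(LAND_AREA_KM2, 1000)
fine = pixel_count(LAND_AREA_KM2, 30)
print(f"1 km: {coarse:.3g} pixels; 30 m: {fine:.3g} pixels; "
      f"ratio {fine / coarse:.0f}x")
```

Going from 1 km to 30 m multiplies the pixel count by (1000/30)^2, a factor of about 1100, which is the "more than 1000 times" increase stated above.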

As with K-means, CBEST is not limited to Euclidean distance; generalizing to other types of distances is also viable. In addition, CBEST could serve as an initialization tool for quickly approaching the global optimum, followed by other improved K-means approaches. Beyond the framework of K-means, CBEST as a compression approach could be generalized to many other iterative clustering algorithms by computing additional statistics, such as variances, in the eigenspace partition sets. More improvements to CBEST are possible under the framework established in this paper, and we look forward to applying it in various remote sensing practices in the future.


Chapter 3 Applications of CBEST in efficiently mapping forest changes in the

state of California from 1986-2011


Abstract

We present an efficient approach for large-area mapping of forest changes with remote sensing, based on the Clustering Based on Eigen Space Transformation (CBEST) algorithm. By analyzing 450 Landsat Thematic Mapper (TM) satellite images from 1986 to 2011 at a five-year interval covering the entire state of California, USA, we derived a forest change type map, a forest loss map and a forest gain map. Although California has 99.6 million acres of land area in total and the spatial resolution of Landsat TM is 30 m, the task took only 10 hours of computing time on a computer with a 2.8 GHz Intel i5 CPU and 8 gigabytes of RAM. The overall accuracy of the forest cover in year 2011 was 92.9% ± 1.6%. We found that the

estimated forest area changed from 28.20 ± 1.98 million acres to 28.05 ± 1.98 million acres from

1986-2011. In particular, our rough estimate indicates that each year California’s forest


thousand acres forest loss per year. In addition, during 1986-2011, around 12% of the forestland

experienced changes: about 4% each for deforestation, afforestation, and deforestation followed by recovery. We conclude that forestland in California was managed in a sustainable manner over the 25 years, since no significant directional changes were observed. Our approach gives a tighter estimate of the true canopy coverage, namely that 29% of land in California is forestland, as opposed to the figures of 33% and 40% from previous studies that had lower spatial resolution and shorter temporal coverage.

Keywords: Forest Change, Deforestation, Forest Disturbance, Landsat, Large Areas Mapping,

Large Dataset, Unsupervised Classification, Multi-temporal Image Analysis, Remote Sensing


1 Introduction

Forestland is commonly defined as land that is at least one acre in area and has at least 10% of its area stocked with trees of any size, or that previously had such tree cover and is not currently developed for non-forest use (Helms, 1998). The Resources Planning Act Assessment (USDA Forest Service, 2012) additionally requires a width of at least 120 feet (37 meters). It also includes transition zones with 10% tree cover and excludes lands predominantly under agricultural and urban land use. Forest, when properly managed, is known to be a major carbon sink that can mitigate climate change. Traditionally, the importance of forest is assessed by its economic, social and ecological values. Commercial forest (timberland) provides valuable wood products, while reserved forest is preserved for recreation, aesthetics, wildlife, watersheds, biodiversity, etc. The importance of sustainable forest management, which aims to conserve the forest for the benefit of future generations, is increasingly acknowledged by the public. Therefore, it is crucial to monitor forest changes and to estimate deforestation for tracking carbon stocks and fluxes (Running, 2008), as well as to support decision making for better forest management for the benefit of society. It is therefore necessary to monitor how much land area is actually dominated by mature trees, rather than being regenerated as bare soil, grasses, shrubs or seedlings, regardless of administrative definitions. Moreover, monitoring deforestation and afforestation over time is also important, since the natural and human-induced disturbances that cause deforestation are becoming more and more frequent under climate change (Overpeck et al., 1990; Westerling et al., 2006). Natural disturbances include hurricanes, earthquakes, wildfires, increased temperature, drought, pathogens and insect attacks (Soja et al., 2007; Kurtz et al., 2008; Westerling and Bryant, 2008). Human-induced disturbances include logging, clear-cutting and prescribed fire. The detection of these disturbances and land use changes provides evidence for scientists and policy makers to study the implications of such changes and to project future trends.

Remote sensing has been widely used in forest mapping. Large-area forest land cover mapping began in the 1990s (Loveland et al., 1991; Zhu and Evans, 1994; Vogelmann et al., 2001). At the global scale, the global tropical forest cover map (Mayaux et al., 1998), global land cover maps (Hansen et al., 2000; Gong et al., 2013) and the global forest percentage map (Defries et al., 2000) mapped land cover, including forest, at a single time. Among the many sensors that have been used for forest mapping, such as AVHRR (Zhu and Evans, 1994; Hansen et al., 2000), MODIS (FIA, 2014; Parmentier and Eastman, 2014) and MERIS (Arino et al., 2007), the Landsat satellite sensors provide both finer spatial resolution (30 meters) and long temporal coverage, with a 16-day cycle since 1984. Landsat has therefore been a preferred choice for mapping forest changes. For instance, the National Land Cover Dataset (NLCD) was produced for the years 1992, 2001, 2006 and 2011 (Vogelmann et al., 2001; Homer et al., 2007; Fry et al., 2011; Jin et al., 2013), in which forest was mapped at 30 meter spatial resolution, the finest resolution available for the country and states so far. Because of the large volume of data resulting from its relatively fine resolution and the complexity of land cover classification approaches, only in recent years has this sensor been more frequently utilized. Landsat images acquired from 1988-2006 over a study site in New Mexico were used to map forest changes (Vogelmann et al., 2009). Biomass loss as a result of disturbances was mapped using Landsat-based detection of Trends in Disturbance and Recovery (LandTrendr, Kennedy et al., 2010; Cohen et al., 2010) for the conterminous US from 1986-2004 (Powell et al., 2014). Souza et al. (2013) used ten years of Landsat images to map the deforestation of the entire Amazon forest. Hansen et al. (2013) were able to produce a global map depicting forest changes at 30 meter resolution from 2000 to 2012 using efficient cloud computing.

A variety of automatic classification algorithms have been used to group pixels of satellite images into land cover classes, including separating forest from non-forest. Generally, there are two types of classification algorithms: supervised and unsupervised (Jensen, 2004). A number of maps were made using supervised classification, which requires a training dataset to train classifier models and then uses the models to classify the image. For instance, decision trees were used to produce the NLCD national land cover maps (Vogelmann et al., 2001), the MODIS global land cover product (Friedl et al., 2010) and the 2000-2012 forest change map (Hansen et al., 2013). Support Vector Machines (SVM) were used to produce the first 30 meter global land cover map (Gong et al., 2013). Unsupervised clustering algorithms partition data based on its own properties rather than on a prior selection of training data. They have been applied to map land cover in a variety of studies (Woodcock et al., 1994; Zhu and Evans, 1994; Loveland et al., 2000; Bartholomé et al., 2005; Homer et al., 2007; Arino et al., 2007). In addition, some studies used neither approach but thresholding rules to interpret time-series image stacks (Goodwin et al., 2008; Kennedy et al., 2010; Cohen et al., 2010; Huang et al., 2010).

Forests cover 31 percent of the total land area in the world (FAO, 2010). In the US, approximately 750 million acres are forestland, occupying 33 percent of the land area (USDA Forest Service, 2001; Smith et al., 2002; FIA, 2014). As one of the most influential states in the US, with a top-ranked economy and diverse demographics, California has a large proportion of forest coverage, and almost half of the forest is timberland (Laaksonen-Craig et al., 2003), which is commercial forestland suitable for producing wood products. An earlier assessment by Laaksonen-Craig et al. (2003) suggests that 40% of the California land area is forestland. However, the national annualized inventory since 2001 indicates that 33% of the land area in California is forest (FIA, 2014). The inventory used MODIS images with a spatial resolution of 250 meters, at which the area of each pixel is about 15 times the minimum mapping unit, implying a very rough estimate of forestland. With a finer resolution of 30 meters, Franklin et al. (2000) mapped the tree cover types in the national forests in California. The CALVEG mapping project, which aims to map all vegetation covers, had not yet been finished, and only limited regions had been mapped, each at one time. These maps were produced for a single time and are thus not adequate for change detection. Although there are nationwide and worldwide map products that include California, no existing studies can track forest change in California in a spatially and temporally consistent manner (same methodology, same data sources and small temporal gaps). It is both expensive and time consuming for experts to derive forest maps over an area as large as California. The lack of temporal consistency results in gaps that can potentially impede the observation and study of important phenomena and events. With the availability of Landsat satellite images that date back to the 1980s with a relatively fine spatial resolution of 30 meters, we propose an approach based on an efficient clustering algorithm, Clustering Based on Eigen Space Transformation (CBEST, Chen and Gong, 2013), to map forest changes in the entire state of California from 1986-2011. CBEST is an efficient implementation of conventional K-means, a widely used unsupervised algorithm. Unsupervised classification relies on post-classification interpretation of the unlabeled clusters into information classes, which involves expertise and intensive human work. Therefore, the procedure we designed features efficient CBEST clustering, probability-based forest cluster labeling and a unique automatic multi-temporal forest change interpretation with probability trajectories. In this way, we can fill the gap in detailed, long-term forest monitoring in California and establish a reliable and consistent map with efficient computing.

2 Methodology

2.1 Study Area

California locates on the west coast of the United States of America. It has 99,699 thousand acres

of total land area (U.S. Department of Commerce, 2010) with an estimated population of

38,332,521 as of 2013 (America Community Survey Office, 2013). The gross domestic product

(GDP) was about $2.003 trillion in 2012, the largest in the United States (U.S. Bureau of

Economic Analysis, 2013), and would rank 8th among all countries in the world. Cool offshore ocean

currents and cold upwelling subsurface water lead to a Mediterranean climate in the coastal and

southern parts of the state, with a rainy winter and dry summer, and a moderate oceanic climate in

the north of the state. California has diverse ecosystems including deserts, forested mountains,

coastal forests, chaparral and woodlands. The plain in the Central Valley is one of the world’s

most productive agricultural areas, supplying 8 percent of national agricultural output by value

(Reilly et al., 2008). California has a large area of wildland-urban interface (WUI), making more

than 5 million homes vulnerable to wildfires (Stewart et al., 2006).


2.2 Data

Figure 18 California: Study Area and Landsat TM scenes. Since the study area is in the

northern hemisphere, all UTM zones are North zones.

Thirty-one Landsat scenes per year were required to cover the entire state of California, spanning

three Universal Transverse Mercator (UTM) zones because Landsat images are projected in the UTM

coordinate system (Figure 18). A total of 450 Landsat Thematic Mapper Surface Reflectance

Climate Data Record (CDR) images were downloaded for the years 1986, 1991, 1996, 2001, 2006

and 2011. The CDR product was processed using the Land Ecosystem Disturbance Adaptive

Processing System (LEDAPS, Masek et al., 2012). LEDAPS carries out radiance calibration,

top-of-atmosphere reflectance conversion and atmospheric correction. A cloud mask layer was

also included in the CDR product, which was generated from the Fmask algorithm (Zhu and


Woodcock, 2012), an object-based cloud masking approach. To ensure the quality of data as well

as to fill potential gaps as a result of cloud cover, all images from July to September with cloud

coverage less than 10 percent were downloaded. Coastal scenes with clear land area were

manually selected because the ocean part of these images is usually covered by cloud, so their

overall cloud coverage almost always exceeds 10 percent.

2.3 Procedure

The general procedure can be summarized in the following steps: 1) Data preparation; 2)

Initial Clustering; 3) Integrating Cluster Centers; 4) Probability Assignment; 5) Probability

Trajectory Interpretation; and 6) Post-processing. Figure 19 shows the flowchart of our approach.

The classification at the pixel level for each year was a semi-supervised classification that

incorporated both supervised and unsupervised classification. Firstly, the unsupervised

classification was carried out using CBEST to partition the image into a number of spectral

classes each with similar spectral properties. A stratified sampling was then implemented and

samples of all spectral classes were visually identified as a forest percentage class (<10%,

10%-20%, 20%-50% and >50%) from high resolution images in Google Earth. An arbitrary value

indicating tree cover fraction was assigned to each forest percentage class. By averaging the

cover fractions within each spectral class, the mean and variance of the forest probability were

derived. We then assigned the probability to all pixels for all acquisition years, so that each

pixel had a probability trajectory that indicates the forest change from 1986 to 2011.


Figure 19 Flowchart of the Procedure to map forest changes in California

2.3.1 Data Preparation


All Landsat scenes with the same coordinate system and acquisition year were mosaicked into

one larger scene. Where several images overlapped, the image acquired earlier in the

year was given higher priority. Because all

images were obtained from July to September, the image acquired closest to July

1st was placed on top of the image stack. Later images were used to fill the gaps in the scene

masked as cloud, shadow, snow or water, in ascending order of the difference

between acquisition date and July 1st. We mosaicked images from the same coordinate system

to reduce computing time and distortion errors that are caused by re-projecting and re-sampling

the images (Lunetta et al., 1991). In our approach, all images from the same coordinate system

were processed independently until spectral bands were converted into clusters and temporally

stacked, so that only one pass of mosaicking and re-projecting operation was required.
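The priority rule described above can be sketched as follows. This is a minimal illustration rather than the processing code used in this study: the function name `composite`, the use of nested lists for image grids and the use of `None` for masked pixels (cloud, shadow, snow or water) are all assumptions made for the example.

```python
from datetime import date


def composite(scenes, year):
    """Mosaic overlapping scenes, giving priority to dates closest to July 1st.

    scenes: list of (acquisition_date, grid) pairs, where grid is a list of rows
    and a masked pixel is stored as None (hypothetical representation).
    """
    reference = date(year, 7, 1)
    # Ascending order of the difference between acquisition date and July 1st.
    ordered = sorted(scenes, key=lambda s: abs((s[0] - reference).days))
    rows, cols = len(ordered[0][1]), len(ordered[0][1][0])
    out = [[None] * cols for _ in range(rows)]
    for _, grid in ordered:
        for r in range(rows):
            for c in range(cols):
                # Later images only fill pixels that are still empty.
                if out[r][c] is None and grid[r][c] is not None:
                    out[r][c] = grid[r][c]
    return out


scenes = [
    (date(2011, 8, 20), [[5, 6], [7, 8]]),
    (date(2011, 7, 3), [[1, None], [None, 4]]),
]
mosaic = composite(scenes, 2011)
```

In this toy case the July 3rd scene supplies the two clear pixels and the August 20th scene fills the two masked ones.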

2.3.2 Initial Clustering

CBEST (Chen and Gong, 2013) was applied to cluster each mosaicked image into 100

clusters. The desired number of eigenspace partitions N was set to 10 million, which means that

CBEST compresses the eigenspace by segmenting eigen axes based on the corresponding

eigenvalues into approximately 10 million eigenspace partition cells. This eigenspace

compression was originally conceptualized and developed by Gong and Howarth (1992) based

on Principal Component Analysis (Jolliffe, 2002). Instead of using pixels as

members in each iteration, CBEST uses eigenspace partition cells, for which means

and counts are calculated, in a revised K-means clustering. The K-means algorithm aims to

partition the data into a specified number of clusters with minimized within-cluster sum of

squares (MacQueen, 1967). Since N is user-defined and is relatively small compared with the

number of pixels for a large image, the computational cost and memory usage of the algorithm

can be greatly reduced. The eigenspace transformation (or PCA transformation) does not strictly

require all pixels to calculate the optimal eigenspace compression. Firstly, using a representative

subsample of the original image does not change the eigenvalues much. Secondly, even if the

eigenspace compression is slightly distorted, the subsequent K-means clustering is still

the major step and has more impact on the result, given that the total number of eigenspace partitions

does not change. Therefore, a systematic sampling with a ten-pixel interval was done solely for

the purpose of compression. The projection from the original feature space to the compressed

eigenspace still scanned over the entire dataset. The CBEST software we used in this paper is

available to download at http://data.ess.tsinghua.edu.cn/. The configuration of parameters and the

interface of CBEST software are demonstrated in Figure 20.
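The idea of clustering partition cells instead of pixels can be illustrated with a short sketch. It is not the CBEST implementation: the PCA rotation is omitted (the points are assumed to be already expressed in eigenspace), a uniform grid of width `cell_size` stands in for the eigenvalue-based segmentation of the eigen axes, and the function names `compress` and `weighted_kmeans_step` are hypothetical.

```python
def compress(points, cell_size):
    """Group points into grid cells and return a list of (cell_mean, count)."""
    sums = {}
    for p in points:
        key = tuple(int(v // cell_size) for v in p)
        total, n = sums.get(key, ([0.0] * len(p), 0))
        sums[key] = ([a + b for a, b in zip(total, p)], n + 1)
    return [([a / n for a in total], n) for total, n in sums.values()]


def weighted_kmeans_step(cells, centers):
    """One K-means update in which each cell mean is weighted by its pixel count."""
    dims = len(centers[0])
    acc = [[0.0] * dims for _ in centers]
    weight = [0] * len(centers)
    for mean, n in cells:
        # Assign the whole cell to its nearest center.
        k = min(range(len(centers)),
                key=lambda c: sum((m - x) ** 2 for m, x in zip(mean, centers[c])))
        weight[k] += n
        acc[k] = [a + n * m for a, m in zip(acc[k], mean)]
    # Centers that received no cells are left unchanged.
    return [[a / weight[k] for a in acc[k]] if weight[k] else list(centers[k])
            for k in range(len(centers))]


points = [(0.1, 0.2), (0.3, 0.4), (5.1, 5.2), (5.3, 5.0), (5.2, 5.4)]
cells = compress(points, cell_size=1.0)
new_centers = weighted_kmeans_step(cells, [[0.0, 0.0], [6.0, 6.0]])
```

Because every iteration loops over the occupied cells rather than the pixels, the cost of an iteration depends on the number of occupied cells, which is the property the paragraph above relies on.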


Figure 20 CBEST software interface. The initial clustering was implemented under the

configuration in this figure.

CBEST software was coded and compiled in Matlab 2013a. All analyses and processing were

implemented on a computer with an Intel i5 760 2.8 GHz quad-core CPU and 8 gigabytes of

physical memory.

2.3.3 Integrating Cluster Centers

There were three mosaicked images for each year, comprising a total of 18 mosaicked images for

clustering. Each mosaic yielded 100 clusters, so a total of 1800 cluster centers were

calculated. To integrate these centers and make cluster labels consistent over all images, we

further applied K-means clustering to group these 1800 cluster centers into 30 clusters with a

universal labeling scheme. By doing this, all images were relabeled into 30 clusters, within each

of which the samples had similar spectral properties and were distinguished from those of other

clusters.
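A minimal sketch of this integration step is given below, assuming a plain Lloyd-style K-means with a deterministic initialization (the first k centers), which is a simplification and not necessarily the configuration used in this study. The names `kmeans`, `nearest` and `build_lookup`, and the dictionary keyed by (mosaic name, local label), are hypothetical.

```python
def nearest(p, centers):
    """Index of the center closest to point p (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))


def kmeans(points, k, iters=50):
    """Plain Lloyd iterations; initial centers are the first k points (assumption)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        updated = [[sum(col) / len(g) for col in zip(*g)] if g else centers[i]
                   for i, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return centers


def build_lookup(local_centers, universal_centers):
    """Map every (mosaic, local label) pair to a universal cluster label."""
    return {key: nearest(c, universal_centers) for key, c in local_centers.items()}


# Toy example: two mosaics with two local clusters each, merged into 2 universal clusters.
local_centers = {
    ("1986N10", 0): (0.0, 0.0),
    ("1986N10", 1): (10.0, 10.0),
    ("1991N10", 0): (9.5, 10.5),
    ("1991N10", 1): (0.5, -0.5),
}
universal = kmeans(list(local_centers.values()), k=2)
lookup = build_lookup(local_centers, universal)
```

With the lookup table in hand, relabeling a mosaic only requires replacing each pixel's local label with `lookup[(mosaic_name, local_label)]`, so the pixels themselves never need to be re-clustered.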

2.3.4 Probability Assignment

The clustered mosaicked images for 2011 were combined into one larger image for the

purpose of sampling integrity. A stratified sampling was carried out with each cluster being a

stratum sampled in proportion to its share of the total number of land pixels in California. The

desired number of sampling units was set to 1000, with a minimum of 10 sampling units

for each stratum. All the sampling units were categorized based on visual verification in

Google Earth. Specifically, a 30 m by 30 m square region centered at the location of

the sampling unit was roughly divided into 3 by 3 cells. By counting the cells in which forest

was dominant, the forest percentage label was determined. The classes are defined


in Table 19. A secondary label was also assigned if a sample consisted of several mixed classes

or a transition (Edwards et al., 1998; Stehman and Czaplewski, 1998; Olofsson et al., 2014).

Since the minimum mapping unit for forestland is one acre and there is some uncertainty

caused by mis-registration of either Google Earth or the Landsat imagery, it is

reasonable to take the surrounding pixels into account and form a secondary label in addition to

the primary label at the pixel when the sampling location disagreed strongly with the

surrounding neighbor pixels.

Table 19 Verification classes and corresponding probability weights

Class                 Description           Probability Weight

Non-forest            Tree cover < 10%      0

Woodland              Tree cover 10%-20%    0.2

Low-density Forest    Tree cover 20%-50%    0.5

High-density Forest   Tree cover >50%       1

The probability weights were arbitrarily determined to convert nominal classes into numerical

values. Because many sampling units had an additional secondary label, the probability

weights of these samples were averaged between the primary and secondary labels. The mean and standard

deviation of each stratum were calculated excluding pixels that were identified as agricultural

use, since orchards and forests have similar spectral properties. In this manner, each cluster

(stratum) had a mean and standard deviation that indicate the probability of its pixels being covered by

trees.
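The computation of cluster statistics from labeled samples can be sketched as follows. The weights are those of Table 19; the sample record layout, the function names and the use of the population standard deviation (`statistics.pstdev`) are assumptions, since the text does not state which estimator was used.

```python
from statistics import mean, pstdev

# Probability weights from Table 19.
WEIGHTS = {
    "non-forest": 0.0,
    "woodland": 0.2,
    "low-density forest": 0.5,
    "high-density forest": 1.0,
}


def sample_weight(primary, secondary=None):
    """Average the primary and secondary weights when a secondary label exists."""
    if secondary is None:
        return WEIGHTS[primary]
    return (WEIGHTS[primary] + WEIGHTS[secondary]) / 2


def cluster_statistics(samples):
    """Return {cluster: (mean, standard deviation)}, skipping agricultural samples."""
    by_cluster = {}
    for s in samples:
        if s.get("agricultural"):
            continue
        weight = sample_weight(s["primary"], s.get("secondary"))
        by_cluster.setdefault(s["cluster"], []).append(weight)
    return {c: (mean(w), pstdev(w)) for c, w in by_cluster.items()}


# Hypothetical samples for two clusters.
samples = [
    {"cluster": 3, "primary": "high-density forest"},
    {"cluster": 3, "primary": "high-density forest", "secondary": "low-density forest"},
    {"cluster": 3, "primary": "high-density forest", "agricultural": True},
    {"cluster": 5, "primary": "non-forest"},
    {"cluster": 5, "primary": "non-forest"},
]
stats = cluster_statistics(samples)
```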

2.3.5 Probability Trajectory Interpretation

The probability weight is an indicator of tree cover fractions. By stacking the means and standard

deviation of probability of the images temporally from 1986 to 2011, one is able to derive a

trajectory with a range of dispersion of forest probabilities for each pixel. The probability

plus/minus standard deviation provides a rough range of estimating how much of the pixel is

occupied by trees. We interpreted the probability and probability plus/minus standard deviation

starting from the second available year, which was 1991 in most cases. However, values

for some years could be missing for some pixels due to cloud masks. Pixels

without values for at least 4 years were disregarded and remained unclassified. We determined the

rules for detecting a forest loss or gain by utilizing the probability range overlaps and two

thresholds.

For each year, we first tested for overlap with the previous years: only if any mean value

of the previous years was beyond the bounds of this year and the mean value of this year was beyond

the bounds of all previous years did we continue with further tests. In order to avoid detecting

insignificant changes (e.g. removing a small proportion of trees from a pixel that is still mostly covered by trees),

we defined two thresholds, a mean-threshold and an upper-bound-threshold: a change towards

afforestation must have a mean greater than the mean-threshold and an upper bound greater than the

upper-bound-threshold, whereas the reverse change towards deforestation only requires either

of them to be less than its threshold. In addition, if no changes were detected, the average of all

six mean probabilities and the highest upper bound were tested against the same thresholds as described

above. A graphical demonstration of the trajectory interpretation is shown in Figure 21. The

two thresholds were determined by averaging over the mixed clusters and then searching

around that mean to best equalize the user’s and producer’s accuracies, such that

misclassification between ‘high-density forest’ (>50%) and ‘non-forest’ (<10%) was equally likely in both directions,

rather than one class being over-classified and the other under-classified. To simplify the storage of the

results, only the latest forest loss and forest gain were retained, so that the possible classes were

forest, non-forest, six forest loss years, six forest gain years and 30 combinations of loss

and gain (ordered pairs of two distinct years out of six, 6 × 5 = 30), for a total of 44 classes. These 44 classes were coded and stored

in an output image with an unsigned 8-bit (one byte) pixel depth, which can store

integer values from 0 to 255.
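The class count can be checked by enumerating the classes explicitly. In the sketch below the mapping years are those of this study, but the tuple layout and the order in which integer codes are assigned are hypothetical, because the actual coding scheme of the output image is not specified here.

```python
from itertools import permutations

YEARS = [1986, 1991, 1996, 2001, 2006, 2011]


def build_class_codes(years=YEARS):
    """Enumerate (type, loss_year, gain_year) classes and assign integer codes."""
    classes = [("forest", None, None), ("non-forest", None, None)]
    classes += [("loss", y, None) for y in years]
    classes += [("gain", None, y) for y in years]
    # Ordered pairs of two distinct years: 6 * 5 = 30 loss/gain combinations.
    classes += [("loss and gain", l, g) for l, g in permutations(years, 2)]
    return {cls: code for code, cls in enumerate(classes)}


codes = build_class_codes()
```

The largest code is 43, so all classes fit comfortably in one unsigned byte.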

Figure 21 Graphic demonstration of probability trajectory interpretation. (a) A typical

forest loss pixel with elaborations on the rules for automatic determination of forest loss;


(b) Non-forest, all points fall within the bounds; (c) Forest; (d) Forest Gain detected in

2006.
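The decision rules of Section 2.3.5, as illustrated in Figure 21, can be sketched in code under several stated assumptions. The thresholds are the values reported in Section 3.1. Three points are interpretations and not taken from the text: "previous years" is restricted to the years since the most recent detected change, the direction of a change is judged against the preceding available year, and years with missing values (`None`) are simply skipped. The function name `interpret` is hypothetical.

```python
MEAN_THRESHOLD = 0.382   # value reported in Section 3.1
UPPER_THRESHOLD = 0.5    # value reported in Section 3.1


def interpret(years, means, stds):
    """Return (events, stable_label); events is a list of (year, 'loss' or 'gain')."""
    obs = [(y, m, s) for y, m, s in zip(years, means, stds) if m is not None]
    if not obs:
        return [], "unclassified"
    events, start = [], 0
    for i in range(1, len(obs)):
        year, m, s = obs[i]
        prev = obs[start:i]  # assumption: only years since the last detected change
        lower, upper = m - s, m + s
        prev_outside = any(not (lower <= pm <= upper) for _, pm, _ in prev)
        cur_outside = all(not (pm - ps <= m <= pm + ps) for _, pm, ps in prev)
        if not (prev_outside and cur_outside):
            continue  # probability ranges overlap: no change candidate
        rising = m > prev[-1][1]
        if rising and m > MEAN_THRESHOLD and upper > UPPER_THRESHOLD:
            events.append((year, "gain"))
            start = i
        elif not rising and (m < MEAN_THRESHOLD or upper < UPPER_THRESHOLD):
            events.append((year, "loss"))
            start = i
    if events:
        return events, None
    # No change detected: test the average mean and the highest upper bound.
    average = sum(m for _, m, _ in obs) / len(obs)
    highest = max(m + s for _, m, s in obs)
    stable = average > MEAN_THRESHOLD and highest > UPPER_THRESHOLD
    return events, "forest" if stable else "non-forest"


years = [1986, 1991, 1996, 2001, 2006, 2011]
loss_case = interpret(years,
                      [0.86, 0.86, 0.79, 0.03, 0.05, 0.03],
                      [0.33, 0.33, 0.35, 0.12, 0.19, 0.12])
```

The example trajectory uses cluster statistics from Table 21 (clusters 3, 19, 1 and 2) and yields a single forest loss in 2001.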

2.3.6 Post-processing

To remove speckle noise and eliminate patches that are smaller than the minimum mapping unit

(MMU) for better administration, an MMU filter or majority filter is usually applied in mapping

practice (Homer et al., 2007; Thomas et al., 2010). The filter replaces groups of connected neighboring

pixels with the same value that contain fewer than a minimum number of pixels with the surrounding majority

label. To determine neighboring connectivity, one can use 4 neighboring directions (up, down,

left and right) or all 8 neighboring directions. Connecting 8 neighboring pixels preserves narrow

line features such as roads and rivers, but since the USDA Forest Service (2012) requires

a forest patch to be at least 120 ft (37 meters) wide, we decided to use 4-neighbor connectivity. The

MMU was set to one acre (4,045 m², approximated by 5 pixels = 4,500 m²). Filtered pixels were refilled by

repeatedly applying a 3 × 3 majority filter until no further change could be made. Residual

unfilled pixels were relabeled with their original classes. The 2013 Cultivated Layer, which

identifies agricultural land and is derived from the 2009-2013 US Department of Agriculture, National

Agricultural Statistics Service, Cropland Data Layers (CDL), was acquired (Boyan et al., 2012).

Speckle noise was filtered, followed by a dilation operation that was carried out twice for the

agricultural areas with a 3 by 3 kernel to expand them and eliminate trivial patches in the center and

at the edge of croplands. Since there are many orchards in California, which have very similar spectral

properties to forests, it is necessary to mask out all agricultural zones. The map was then

separated into three maps: a change type map, a forest loss year map and a forest gain year map.

The type map reflects categories of forest change with ‘forest’, ‘non-forest’, ‘forest loss’, ‘forest

gain’ and ‘forest loss and recovered’, while the forest loss year and gain year maps are temporal

records of the latest forest loss and forest gain. In addition, forest gain before loss was a very rare

occurrence, found especially at the head or tail of the trajectory, and typically resulted from erratic detection of change

in what was supposed to be stable non-forest. Hence, we relabeled these changes as

non-forest.
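A simplified version of the MMU filter with 4-neighbor connectivity is sketched below. It departs from the procedure described above in one respect, which is an assumption made to keep the example short: a small patch is filled in a single step with the most frequent label along its 4-connected border, instead of by repeated 3 × 3 majority filtering. The function name `mmu_filter` is hypothetical.

```python
from collections import Counter, deque

NEIGHBORS_4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def mmu_filter(grid, min_pixels=5):
    """Replace 4-connected patches smaller than min_pixels with the border majority."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    seen = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0]:
                continue
            label, patch, border = grid[r0][c0], [], Counter()
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:  # breadth-first search over the patch
                r, c = queue.popleft()
                patch.append((r, c))
                for dr, dc in NEIGHBORS_4:
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < rows and 0 <= cc < cols):
                        continue
                    if grid[rr][cc] == label:
                        if not seen[rr][cc]:
                            seen[rr][cc] = True
                            queue.append((rr, cc))
                    else:
                        border[grid[rr][cc]] += 1  # label of a bordering pixel
            if len(patch) < min_pixels and border:
                fill = border.most_common(1)[0][0]
                for r, c in patch:
                    out[r][c] = fill
    return out


small = [list("NNNNN"), list("NNNNN"), list("NNFFN"), list("NNNNN"), list("NNNNN")]
large = [list("FFFNN"), list("FFFNN"), list("NNNNN")]
filtered_small = mmu_filter(small)
filtered_large = mmu_filter(large)
```

Patches are identified on the original grid, so the removal of one small patch does not change the size of its neighbors within the same pass.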

An accuracy assessment to evaluate the overall quality of the California forest change map was

carried out. Due to the lack of reliable past records of ground surveys or high resolution images

throughout the state, it is very difficult to assess the accuracy of the changes. However, the

samples identified for year 2011 were properly sampled and labeled and thus can be used to

validate the forest cover in year 2011 derived from the change map. Considering the limitation of

remote sensing for mixed pixels, especially woodland and low-density forest, in which more than

half of the pixel is not covered by trees, we determined that either ‘non-forest’ or ‘forest’ is an

acceptable label for ‘woodland’ and ‘low-density forest’ in the reference.


3 Results and Analysis

3.1 Intermediate Results

Table 20 Elapsed time for the clustering process. Each result was selected as the lowest

within-cluster sum of squares from 5 runs. 1986N10 means the mosaicked image in 1986

with projection of UTM Zone 10 North.

Name      Elapsed Time   Within-Cluster Sum       Name      Elapsed Time   Within-Cluster Sum
          (Seconds)      of Squares (Trillion)              (Seconds)      of Squares (Trillion)

1986N10   10126          17.65                    2001N10   14206          18.00

1986N11   7974           38.20                    2001N11   9939           41.17

1986N12   7504           1.67                     2001N12   7028           1.93

1991N10   10205          17.32                    2006N10   8538           20.78

1991N11   10659          46.85                    2006N11   9498           44.64

1991N12   7227           1.78                     2006N12   6536           1.83

1996N10   12652          17.67                    2007N10   7822           18.90

1996N11   9591           38.76                    2007N11   8816           39.06

1996N12   8393           1.50                     2007N12   6375           2.07

It took approximately 45 hours in total to batch process these images with approximately 9.5

billion pixels in total. Because of the efficiency of the CBEST algorithm, we had the luxury of

time to run the clustering multiple times with different randomized initial states, which can

greatly improve the ability of K-means to approximate the globally optimal solution. The

performance of the clustering is documented in Table 20. The elapsed times for the mosaicked

images of the three zones were similar. Although the mosaicked images in zone 10 and zone 11 were

roughly 10 times larger than that in zone 12, which contained only one scene, the

compressive nature of the CBEST algorithm means that the time cost depended more on the

occupied eigenspace partitions, in other words the dispersion of the data in eigenspace, than on

the size of the data. The within-cluster sum of squares (WCSS) is a relative measure of how close a result is to

the optimum and is comparable only among multiple runs on the same image. A smaller WCSS implies tighter

positioning of the members in the eigenspace for the clusters. We could observe that mosaicked

images in zone 11 had the highest WCSS, suggesting a more diverse landscape for the region.


Figure 22 Post-clustering result in year 2011 and stratified sampling units

After the integration of cluster centers into a universal 30 clusters for all images, the 2011

post-clustering result was used for stratified sampling. The cluster map and sampling plots are

shown in Figure 22. A total of 1057 samples were generated and verified in Google Earth. The

mean and standard deviation of the probability weight of each cluster/stratum are listed in

Table 21. From these numbers one can tentatively conclude that clusters 3 and 19 were most

likely to be dense forest, while clusters 1, 2, 5, 6, 12, 13, 15, 18, 20, 21, 23, 24, 26, 28, 29 and 30 were

primarily without tree cover. Other clusters except for 8 and 25 were somewhat mixed between

the two, implying a transitioning or mixed woodland or low-density forest. Clusters 8 and 25 were

labeled as ‘n/a’: cluster 8 contained bright scan lines caused by sensor error, while

cluster 25 was all agricultural land. Therefore, these two clusters were treated as a continuation or


transition between the previous time and the subsequent time. In addition, for clusters with

means less than 0.5 and a range not reaching zero, the ‘standard deviations’ were manually

increased to just cover the zero value. In this way, the mixed clusters were more flexible to

remain non-forest when the previous and/or the subsequent clusters were completely non-forest

clusters with both mean and standard deviation being zero. The average of the means of the

mixed clusters was 0.35, around which a search was done to determine the mean-threshold that best

evens the chance of misclassification. The best mean-threshold was found to be

0.382. The upper-bound-threshold was set to an arbitrary value of 0.5, implying that in order for

a pixel to be forest at a given time, the upper bound of its forest probability at that time must be

at least 50%.

Table 21 Means and standard deviations of forest probabilities calculated for the clusters

Cluster #   Mean   Standard      Cluster #   Mean   Standard      Cluster #   Mean   Standard
                   Deviation                        Deviation                        Deviation

1           0.03   0.12          11          0.38   0.40          21          0.00   0.00

2           0.05   0.19          12          0.01   0.07          22          0.45   0.45

3           0.86   0.33          13          0.01   0.04          23          0.04   0.09

4           0.33   0.46          14          0.17   0.33          24          0.00   0.00

5           0.00   0.00          15          0.02   0.07          25          n/a    n/a

6           0.00   0.00          16          0.26   0.37          26          0.00   0.00

7           0.27   0.41          17          0.50   0.50          27          0.36   0.41

8           n/a    n/a           18          0.04   0.11          28          0.09   0.18

9           0.55   0.47          19          0.79   0.35          29          0.06   0.18

10          0.37   0.42          20          0.00   0.00          30          0.10   0.32


3.2 Forest Change Map and Accuracy Assessment

Figure 23 California Forest Change Maps 1986-2011. Left: Change Type Map; Upper

right: Forest loss characterized by years; Lower right: Forest gain/recovery characterized

by years.

The interpretation of probability trajectories to generate the initial coded forest

change map took approximately 7 hours on the same machine. The initial map was then masked

for agricultural zones and filtered for MMU, which took around 1 hour of processing time

combined. Subsequently, the final products of forest change in California from 1986 to 2011

with a five-year interval were generated by separating the coded map into three maps: a forest

change type map, a forest loss year map and a forest gain year map (Figure 23).

We determined four classes to validate and characterize the forest and non-forest based on the

percentage of tree cover. However, there are only forest and non-forest in our forest change map.

As described in section 2.3.5, the rules for separating ‘forest’ vs ‘non-forest’ for our forest

change map were based on the even chance of misclassification between ‘high-density forest’

and ‘non-forest’. We explained in section 2.3.6 that ‘low-density forest’ and ‘woodland’ were

mixed classes with tree coverage less than 50%, which can therefore be automatically classified as

either ‘forest’ or ‘non-forest’ due to the mixed spectral information at the subpixel level.

Therefore, an error matrix was generated to reflect the match between the four-class reference

and two-class map systems (Table 22).

Table 22 The error matrix of samples with four classes from validation labels and two

classes from the forest cover in 2011

                   Reference

                   High-Density    Low-Density        Woodland    Non-forest   Total
                   Forest (>50%)   Forest (20%-50%)   (10%-20%)   (<10%)

Map   Forest       193             36                 14          34           277

      Non-forest   43              57                 55          625          780

Total              236             93                 69          659

It can be observed that the number of samples of ‘high-density forest’ misclassified as ‘non-

forest’ (43) and that of ‘non-forest’ misclassified as ‘forest’ (34) were similar, which was

achieved by searching the appropriate mean-threshold in probability trajectory interpretation. To

match the classes between the map and the reference, ‘low-density forest’ and ‘woodland’ in the reference were

accepted as matching either ‘forest’ or ‘non-forest’ in the map, as described in section 2.3.6. Other than

matching with their names (low-density forest and woodland), these mixed classes were

probably transition zones between forest and non-forest, marginal areas and urban forests. From

the perspective of change detection, these classes were normally in the intermediate state of

changes or in stable forms, but seldom were the results of afforestation or deforestation in

forestland. The error matrix of accuracies was then calculated (Table 23).

Table 23 Error matrix of accuracies for the forest cover in 2011

               Reference

Map            Forest   Non-forest   Total   User's Acc   95% CI (±)

Forest         245      32           277     0.884        0.038

Non-forest     43       737          780     0.945        0.016

Total          288      769

Prod's Acc     0.851    0.958

95% CI (±)     0.035    0.016

Overall Accuracy = 0.929 ± 0.016
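The point estimates in Table 23 can be reproduced directly from the sample counts, as the following sketch shows. The function name `accuracies` is hypothetical, and the sketch computes only simple sample proportions; the area-weighted estimators and confidence intervals of Olofsson et al. (2014) are not reproduced here.

```python
def accuracies(matrix):
    """matrix[i][j]: number of samples mapped as class i with reference class j."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    users = [matrix[i][i] / sum(matrix[i]) for i in range(n)]
    producers = [matrix[j][j] / sum(matrix[i][j] for i in range(n)) for j in range(n)]
    overall = sum(matrix[i][i] for i in range(n)) / total
    return users, producers, overall


# Rows: map forest, map non-forest; columns: reference forest, reference non-forest.
users, producers, overall = accuracies([[245, 32], [43, 737]])
```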

Estimated areas and 95% confidence intervals for both accuracies and estimated areas were

calculated based on the approaches introduced in Olofsson et al. (2014). The estimated area that

was classified as forest in 2011 was 28.1 million acres with a 95% confidence interval of 28.05 ±

1.54 million acres, while the non-forest area was 71.22 ± 1.54 million acres. The land area that

remained unclassified was 0.43 million acres, which adds to the uncertain land area.

We assumed the error matrices for the forest cover maps for all years derived from the forest

change map were consistent, implying a constant chance of misclassification for both classes

throughout the mapping years. Therefore, the forest area for all mapping years could be

estimated with a 95% confidence interval that incorporates unclassified uncertainties (Figure 24).


Figure 24 Estimated Forest Area by Mapping Years

4 Discussions

The forestland area in California has been quite stable during the 25-year period studied and no

significant net forest loss or gain was detected. The slight fluctuation of the forest area

plot in Figure 24 implies that in some periods disturbances removed forest faster than it

regenerated, while in other periods new disturbances were less frequent and

regeneration of the resilient forest ecosystems prevailed. Since no accuracy assessment of the

forest change was done, we could only estimate yearly loss and gain based solely on the number

of pixels of each change type in our map product. On average over the 25 years,

California’s forest experienced a loss of 92 thousand acres and a recovery of 85 thousand acres each year,

resulting in a net forest loss of seven thousand acres per year. Compared with the area of forestland in the

entire state, the average annual net loss was only 0.02%, which is small even compared with the

uncertainties in the map (a confidence interval of ±1.54 million acres around the mean and 0.43 million acres

unclassified). Over the 25 years, around 12% of the forestland experienced change, with

4% each for ‘forest loss’, ‘forest gain’ and ‘forest loss and recovered’ (Figure 25).


Figure 25 Proportions of forest change type in California for 1986-2011

In particular, 4% of deforestation should be divided into two categories: 1) Permanent changes

such that land was converted to other land uses or succeeded by a different ecosystem without

dense tree covers. Such changes occur in the short or near-term without the detection of

regeneration. 2) Temporary changes that occur from disturbance such as wildfire, clearcut, etc.

These changes occurred recently and trees have not yet regenerated. Apparently, the 4% of

‘forest loss and recovered’ class falls into the latter temporary category. The 4% of ‘forest gain’

could also be categorized into permanent changes and temporary changes. Based on visual

comparisons between our maps and high-resolution images, we found that natural regeneration

usually took more than 10 years for forest to reach the level detectable by our algorithm.

Therefore, a natural process of forest loss followed by recovery must have been initially detected in the first

15 years. We can roughly calculate the proportion of such temporary changes for the last 10

years to be 10/(10+15)×4%=1.6%. By assuming a constant ratio between permanent changes and

temporary changes over the years, we can estimate that the permanent loss was 4% - 1.6% = 2.4%

during the 25 years. The same calculation can be applied to the forest gain, giving a permanent

gain of 2.4% that offsets the permanent loss. However, the above analysis was limited

by the temporal interval of 5 years as well as the complexity of forest changes due to

natural disturbances, climate, changes of land ownership, economic fluctuations affecting the

wood products market and changes of management policy. Further explorations could be done to

study the potential factors that affect these changes. In general, we did not observe that the

forestlands in California were either declining or increasing in a significant way. All these

observations suggest that the replacement of California’s forest was relatively stable and healthy,

which should be attributed to the Forest Practices Act and efforts to sustainably manage

forestland in California.
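The split between temporary and permanent loss used above amounts to the following arithmetic. The sketch only restates the calculation in the text; the function and argument names are hypothetical.

```python
def split_loss(loss_share=0.04, recent_years=10, earlier_years=15):
    """Split the 'forest loss' share into temporary and permanent parts.

    recent_years: final years in which a loss cannot yet show detectable recovery.
    earlier_years: years in which a loss followed by recovery could be detected.
    """
    temporary = recent_years / (recent_years + earlier_years) * loss_share
    permanent = loss_share - temporary
    return temporary, permanent


temporary, permanent = split_loss()
```

With the default values this gives a temporary share of 1.6% and a permanent share of 2.4%, matching the figures quoted in the text.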


Figure 26 Local views of some chosen places of the forest change map. The four images at

the bottom of the figure demonstrate the changes detected, with historical aerial

photographs from 1988 and 1993 in comparison with a high resolution image acquired

recently. Orange circles indicate a regenerated forest patch after early removal while red

circles encompass a clearcut area.

In Figure 26, we demonstrated a closer view of detected disturbances at the landscape scale. The

scene in the north had many small patches that are similar in size and shape, in which the forests

were harvested for wood products. The harvesting activities in the period of 1996-2001, 2001-

2006 and 2006-2011 were almost identical, implying a well-managed forest harvest strategy. The

scene in the south suggests a big irregular disturbance, probably as a result of a wildfire during

2006-2011. The scene in the San Francisco Bay Area with zoomed view compared with

historical aerial photos lies in the hills of the University of California at Berkeley campus,

northwest of the botanical garden. The small scene includes two land use change scenarios. The

north part of the forest was clearcut in the late 1980s and later recovered. The forest in the south

part was removed for paved road and building construction during the same time and

thus experienced a permanent land use change.

In comparison with the 40% forestland area by Laaksonen-Craig et al. (2003) and 33%

forestland area by FIA (2014), our map indicated forest occupies 29% of land. If we counted the

proportion of forest samples (>10%) in our stratified sampling, the forest samples were 38% of

the total. Considering the stratified sampling was not strictly carried out based on equal

probability (stratum of at least 10 in sample size) and only a very small proportion of the entire

state was sampled, this estimate could be both biased and large in variance. The differences in

the estimated forestland area between our approach and the previous studies can be attributed to

the following factors. Forestland is officially defined as the land at least 120 feet wide and 1 acre

in size with tree cover greater than 10%, excluding areas for urban and agricultural use. All

transition zones that met the above standard should also be defined as forest. It also includes any

disturbed areas that were previously forested without changing the land use to urban or

agricultural land. This definition leaves room for ambiguity and subjective interpretation. For instance, a

large grassland patch resulting from natural succession after a forest fire and surrounded by forestland

could be defined as either forestland or non-forest. At a larger extent that includes the surrounding

forest, the entire patch that encompasses the grassland has tree cover over 10% and thus the

patch is defined as forestland. However, if the disturbance occurred a long time ago and can

no longer be treated as forest regeneration, the patch alone does not have tree cover over 10% and thus

should be classified as non-forest. From the perspective of data sources, without additional

administrative information in spatial details, it is not possible to find all forestland even if we

assumed that remote sensing images could derive perfect information about tree coverage. For

instance under a similar scenario as above, areas that were disturbed and under regeneration were

somehow succeeded by non-forest ecosystems due to possible factors such as nutrition, water,

climate and invasive species. If the disturbances were larger than the one acre MMU and the

spatial resolution of the data, they should be easily classified as non-forest. However, since the

patches were in the middle of a large forestland, they should remain as forest use and classified

as forestland administratively. Here the scales of observation play an important role. In Figure 27,

we demonstrated that delineating forest patches with different scales of measurement could result

in great differences in mapped areas. A finer resolution draws the classification boundary

closer to the actual tree cover, leading to a more accurate estimate of actual tree cover than

an administrative boundary that encompasses a greater area including a lot of non-forest cover.

This is especially so in forest and non-forest transition areas and wildland-urban interfaces where

tree coverage is relatively sparse.

Figure 27 An example of how scale affects the classified area. Suppose the smallest cell unit

is 30 m by 30 m in size; there are then 20 forest cells, or 18,000 m² of forest area. Using a 120 m

by 120 m cell, there are 3 forest cells, or 43,200 m². Using the entire 240 m by 240 m scene, the

area is classified as one forest patch, with an area of 57,600 m².
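The arithmetic in the Figure 27 caption can be reproduced with a short sketch. The 8 by 8 pixel layout below is hypothetical (the text does not give the figure's actual layout); it is arranged only so that it holds 20 forest pixels that fall within 3 of the four 120 m blocks, matching the counts in the caption. Counting a coarse cell as forest whenever it contains any forest pixel is likewise an assumed delineation rule.

```python
def mapped_forest_area(grid, block, pixel_m=30):
    """Area (m^2) of all coarse cells, block x block pixels each,
    that contain at least one forest pixel."""
    n = len(grid)
    side_m = block * pixel_m
    forest_cells = 0
    for r in range(0, n, block):
        for c in range(0, n, block):
            if any(grid[i][j]
                   for i in range(r, r + block)
                   for j in range(c, c + block)):
                forest_cells += 1
    return forest_cells * side_m ** 2


# Hypothetical 240 m x 240 m scene of 30 m pixels; "1" marks a forest pixel.
ROWS = [
    "11110000",
    "11110000",
    "11100000",
    "11000000",
    "10000110",
    "00000110",
    "00000110",
    "00000000",
]
GRID = [[ch == "1" for ch in row] for row in ROWS]

for block in (1, 4, 8):  # 30 m, 120 m and 240 m cells
    print(block * 30, "m cells:", mapped_forest_area(GRID, block), "m^2")
```

With this layout the mapped forest area grows from 18000 m^2 at 30 m to 43200 m^2 at 120 m and 57600 m^2 at 240 m, although the tree-covered pixels themselves are unchanged.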

Therefore, since our California map was made at 30 meters, the finest resolution among the maps

compared, it is reasonable that our map yielded the lowest estimate of forest area.

In addition, we note that FIA data have a sampling intensity of only 1 plot per 6,000 acres, so that

estimates of total forest area derived using these data have fairly large confidence intervals.

Furthermore, consider that the classification for ‘low-density forest’ (20%-50%) and ‘woodland’

(10%-20%) in the error matrix of Table 22 was more inclined towards non-forest. A large

number of small patches of mixed pixels with such tree cover percentages were not classified as

forestland. Other error sources may include residual cloud after cloud masking,

misregistration of images, the re-projection and resampling process, and inconsistent surface

reflectance due to complex atmospheric conditions at acquisition (Lunetta et al., 1991).

5 Conclusions

In this study, we presented an approach to efficiently produce forest change maps for California

covering 25 years at a 5-year interval. Our achievements and findings are as follows:

† The total computer processing time was approximately 10 hours, of which 2 hours were for

preprocessing, 7 hours for clustering and 1 hour for post-processing;

† The overall accuracy of the map was 92.9% ± 1.6%;

† The estimated forest area changed from 28.20 ± 1.98 million acres in 1986 to 28.05 ± 1.98

million acres in 2011;

† During 1986-2011, California’s forests experienced a loss of 92 thousand acres and a

recovery of 85 thousand acres per year, resulting in a net loss of seven thousand acres

per year (these numbers were not estimated using the error matrix and thus do not

perfectly match the above calculation);

† During 1986-2011, around 12% of the forestland (‘stable forest’, ‘forest loss’, ‘forest

gain’ and ‘forest loss and recovered’ combined) experienced changes, with 4% each in

‘forest loss’, ‘forest gain’ and ‘forest loss and recovered’;

† Our estimate of forestland was approximately 29% of the land area in California, as

opposed to the 40% by Laaksonen-Craig et al. (2003) and 33% by FIA (2014). We

attribute the disagreement to our solely data-oriented methodology, which disregards

subjective administrative considerations, to the finer resolution, to the ambiguity of our

approach in dealing with mixed pixels (tree cover < 50%) and to other errors.
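As a rough consistency check on the figures listed above, the sketch below compares the annual net change implied by the two area estimates with the one implied by the loss and recovery rates. All values are taken from the list; the variable names are illustrative only.

```python
# Forest area estimates (acres), from the error-matrix-based figures above
area_1986 = 28_200_000
area_2011 = 28_050_000
years = 2011 - 1986  # 25-year mapping period

# Net annual loss implied by the two area estimates
net_loss_from_areas = (area_1986 - area_2011) / years

# Net annual loss implied by the mapped loss and recovery rates (acres/year)
annual_loss = 92_000
annual_recovery = 85_000
net_loss_from_rates = annual_loss - annual_recovery

print(f"from area estimates: {net_loss_from_areas:,.0f} acres/year")
print(f"from change rates:   {net_loss_from_rates:,.0f} acres/year")
```

The two routes give about six and seven thousand acres per year respectively, which is the small mismatch noted in the list. The total difference of 0.15 million acres is also much smaller than the ± 1.98 million acre interval attached to either area estimate.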

In conclusion, our map made a tighter estimate of forest cover and change during the 25

years, meaning that the mapped forest boundaries were closer to the real boundary of trees.

Meanwhile, we did not mask out wildland-urban interfaces and urban forests as we did agricultural

land, because our goal was to document actual tree-covered areas, which may also allow a better

estimate of carbon sequestration in urban areas. Furthermore, by overlaying urban areas on our

forest change map, one could easily identify wildland-urban interfaces where wildfire is a threat to

property and human lives. Our forest change map can contribute to monitoring forestland in

California at relatively low cost and without field visits, while also showing when and where

deforestation and afforestation occurred in the past. By tracking forest changes, policy makers

could regularly determine whether there has been a major deforestation event (such as a

devastating and long-lasting wildfire) that natural regeneration cannot offset.

However, our approach could be further improved. To reduce the computational cost, we only

used images acquired at five-year intervals. Without stable consecutive annual values over

multiple years, the probability trajectory interpretation is prone to false detection of

changes (Huang et al., 2010). Neither the number of clusters nor the minimum number of

samples in a stratum was large enough to reduce the variance of the probabilities estimated for

the strata, which made trajectory interpretation error-prone and sensitive to thresholds.

In addition, an accuracy assessment of the change component of our map was lacking because

the record of aerial photos and ground surveys is incomplete. However, by using annual Landsat

images we would be able to assess whether changes took place in a recent 10-year period, since

most of the aerial photos in Google Earth can be traced back to the 1990s for California, and

thus to generalize the estimated errors

to the earlier detected changes as well. Given that the processing time for this exercise was around

10 hours, including all Landsat records from 1984 to 2011 should increase the processing time to

circa 70 hours. Processing on a cluster of high-performance computers should reduce this time

further. It is also possible to extend our approach to a larger extent, such as the entire

United States and even the world. It should then be possible to explore the past record of all

countries’ forests back to 1984 and hence allow decision makers in these countries to maintain or

develop better strategies for forest management.

Chapter 4 Conclusions and Perspectives

1 Summary of the Results

In the first chapter, a reliable semi-automatic algorithm for detecting mountain pine beetle

outbreaks based on Landsat image stacks from 2001 to 2011 for Grand County in Colorado was

developed. The algorithm was named Berkeley Indices Trajectory Extractor (BITE). Temporal

trajectories of multiple spectral indices were processed with unique techniques followed by

interpretation and integration. An overall accuracy of 94.7% for the classification of disturbance

types was achieved. BITE proved effective at distinguishing slow-onset disturbances from

rapid-onset disturbances. The spatial and temporal dispersal of the mountain pine beetle

outbreak that occurred in the study area during this time frame was accurately mapped. We

expect that the BITE algorithm should be suitable for detecting other disturbances

as well, as long as there is forest canopy loss.

In the second chapter, our experiment with a subscene of Landsat Thematic Mapper (TM)

imagery suggests that CBEST was able to improve speed considerably over conventional

K-means as the volume of data to be clustered increased. We assessed information loss and several

other factors. In addition, we evaluated the effectiveness of CBEST in mapping land cover/use

with the same image, which was acquired over Guangzhou City, South China, and an AVIRIS

hyperspectral image over Tippecanoe County, Indiana. Using reference data, we assessed the

accuracies of both CBEST and conventional K-means and found that CBEST was not

negatively affected by information loss during compression in practice. We discussed potential

applications of the fast clustering algorithm in dealing with large datasets in remote sensing

studies.

In the third chapter, we efficiently produced a forest change map for the entire state of California

with a spatial resolution of 30 meters over 25 years. The mapping process

took only 10 hours on a computer with a 2.8 GHz Intel i5 CPU and 8 gigabytes of RAM. The

overall accuracy of the forest cover in year 2011 was reported as 92.9% ± 1.6%. We found that

the estimated forest area changed from 28.20 ± 1.98 million acres to 28.05 ± 1.98 million acres

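from 1986 to 2011. In particular, our rough estimate indicates that each year California’s forest

experienced a loss of 92 thousand acres and a recovery of 85 thousand acres, resulting in seven

thousand acres of net forest loss per year. In addition, during the twenty-five-year period from 1986 to 2011,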

around 12% of the forestland experienced changes (~0.5%/year), of which 4%

(0.16%/year) each was deforestation, afforestation and deforestation followed by recovery.

We concluded that the forestland in California had been managed in a sustainable manner over

the 25 years, since no significant directional changes were observed. This is an expected result

since forest management in California is regulated by the California Forest Practice Act. Our

approach made a tighter estimate of the true canopy coverage, such that 29% of land in California

is forestland, as opposed to the estimates of 33% and 40% made by previous studies that had

lower spatial resolution and shorter temporal coverage.

2 Future Perspectives

BITE and CBEST are distinct algorithms. BITE is more noise resistant and capable of

differentiating between slow-onset and rapid-onset disturbances. CBEST is less

accurate, and the CBEST-based approach used for the California forest change mapping was not

able to detect slow-onset disturbances. However, CBEST was more computationally efficient than

BITE. Within 10 hours, CBEST processed the entire state of California, while BITE took over 25

hours to process Grand County, which is only 1% of the size of California. Because the data sources for

the two algorithms are identical, it is thus viable to integrate the two algorithms for mapping

tasks with consistency. To take advantage of both algorithms for large-area mapping of forest

changes, a strategy for integrating the two algorithms should be devised. The general

concept of the integration is that CBEST is used for an initial mapping of forest changes over

the entire mapping area and temporal coverage. BITE is then applied to the areas where

disturbances are detected, for the onset period of the disturbances. The initial change map from

CBEST can also be used as the input for BITE, so that an additional forest cover map is

no longer required and the analysis is not limited by past maps such as the NLCD and FIA forest

maps. The processing time for mapping a large area can thus be significantly reduced by ignoring

the unchanged areas, while the detection of deforestation and afforestation is enhanced by the

more accurate BITE algorithm.
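The potential saving from this two-stage strategy can be illustrated with a back-of-envelope sketch. It assumes, purely for illustration, that processing time scales linearly with area for both algorithms. The throughput figures come from the text above, the function names are hypothetical, and the 5% changed fraction is a made-up input rather than a result of this work. The comparison is also rough because the two runs covered different time spans and image densities.

```python
CBEST_HOURS_STATEWIDE = 10.0   # CBEST: all of California (from the text)
BITE_HOURS_COUNTY = 25.0       # BITE: Grand County ("over 25 hours" in the text)
COUNTY_AREA_FRACTION = 0.01    # Grand County is about 1% of California's size


def bite_hours(area_fraction):
    """Estimated BITE time for a given fraction of California's area,
    assuming time is proportional to area (an assumption, not a measurement)."""
    return BITE_HOURS_COUNTY / COUNTY_AREA_FRACTION * area_fraction


def hybrid_hours(changed_fraction):
    """CBEST over the whole state, then BITE only where CBEST flags change."""
    return CBEST_HOURS_STATEWIDE + bite_hours(changed_fraction)


# Hypothetical input: CBEST flags 5% of the state as changed
print(f"BITE everywhere:    {bite_hours(1.0):.0f} h")
print(f"CBEST then BITE 5%: {hybrid_hours(0.05):.0f} h")
```

Under these assumptions the hybrid run takes on the order of a hundred hours rather than thousands, and the estimate is dominated by how much area CBEST passes on to BITE.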

Nevertheless, BITE lacks afforestation detection and requires a pre-mapped forest cover mask.

It also requires training samples from the local region. It is currently unknown how well BITE

would perform if applied to other areas without locally trained models, or over how large an area

local models can be applied. The California forest change map produced with CBEST was not

evaluated for changes and is thus not yet appropriate for follow-up BITE refinement. More sampling

units and clusters should be added to lower the variance in the estimates. Annual or biannual Landsat

records are preferable for removing noise from the data and for increasing the reliability of

change detection. We expect that by refining and combining the two algorithms developed in this

dissertation, we will be able to derive the finest and longest forest change records for any

place in the world with Landsat satellite coverage. At least for the entire United States of

America, with its well-georegistered image records, the task should be viable and significant.

Decision making in the future could be greatly influenced by a fine-resolution forest change map

with longer temporal coverage. Greater detail leads to a more intelligent way of managing the

sustainability of forestland that is beneficial to both human society and the natural environment

worldwide.

References

Alsabti, K., Ranka, S., Singh, V., 1998. An Efficient K-means Clustering Algorithm. Proc. 1st

Workshop on High Performance Data Mining.

American Community Survey Office, 2013. American Community Survey Multiyear Accuracy of

the Data (3-year 2010-2012 and 5-year 2008-2012), URL:

http://www.census.gov/acs/www/Downloads/data_documentation/Accuracy/MultiyearACSAccuracyofData2012.pdf,

U.S. Census Bureau, Washington, DC (last date accessed: 2 Feb 2014).

Arino, O., Gross, D., Ranera, F., Bourg, L., Leroy, M., Bicheron, P., Latham, J., Di Gregorio, A.,

Brockman, C., Witt, R., Defourny, P., Vancutsem, C., Herold, M., Sambale, J., Achard, F.,

Durieux, L., Plummer, S., Weber, J.-L., 2007. Globcover. ESA Service for global land cover

from MERIS. IEEE International Geoscience and Remote Sensing Symposium, Barcelona,

Spain, 23-27 July, pp. 2412–2415.

AVIRIS image, 2013. AVIRIS image North-South flightline over Northwest Tippecanoe

County, Indiana, https://engineering.purdue.edu/~biehl/MultiSpec/hyperspectral.html. (Accessed

3 Feb, 2013)

Aukema, B. H., Carroll, A. L., Zhu, J., Raffa, K. F., Sickley, T. A., and Taylor, S. W., 2006.

Landscape level analysis of mountain pine beetle in British Columbia, Canada: spatiotemporal

development and spatial synchrony within the present outbreak, Ecography, 29(3): 427-441.

Aukema, B.H., Carroll, A.L., Zheng, Y., Zhu, J., Raffa, K.F., Moore, R.D., Stahl, K. and Taylor,

S.W., 2008. Movement of outbreak populations of mountain pine beetle: influences of

spatiotemporal patterns and climate. Ecography, 31(3): 348-358.

Bartholomé, E., Belward, A.S., 2005. GLC2000: a new approach to global land cover mapping

from Earth observation data. International Journal of Remote Sensing 26(9), 1959–1977.

Belluco, E., Camuffo, M., Ferrari, S., Modenese, L., Silvestri, S., Marani, A., Marani, M., 2006.

Mapping salt-marsh vegetation by multispectral and hyperspectral remote sensing. Remote

Sensing of Environment 105(1), 54–67.

Bentz, B. J., Logan, J. A., and Amman, G. D., 1991. Temperature-dependent development of the

mountain pine beetle (Coleoptera: Scolytidae) and simulation of its phenology, Canadian

Entomologist, 123(5): 1083-1094.

Bentz, B. J., Powell, J. A., and Logan, J. A., 1996. Localized spatial and temporal attack

dynamics of the mountain pine beetle in lodgepole pine, US Department of Agriculture, Forest

Service, Intermountain Research Station, (INT-RP-494).

Bentz, B. J., and Endreson, D., 2004. Evaluating satellite imagery for estimating mountain pine

beetle-caused lodgepole pine mortality: current status, Information Report, Pacific Forestry

Centre, Canadian Forest Service, (BC-X-399), pp. 154-163.

Bezdek, J.C., Ehrlich, R., Full, W., 1984. FCM: the Fuzzy c-Means clustering algorithm.

Computers and Geosciences 10(2-3), 191–203.

Boryan, C., Yang, Z., & Di, L. (2012, July). Deriving 2011 cultivated land cover data sets using

USDA National Agricultural Statistics Service historic cropland data layers. In Geoscience and

Remote Sensing Symposium (IGARSS), 2012 IEEE International (pp. 6297-6300). IEEE.

Bradley, P.S., Fayyad, U., Reina, C., 1998. Scaling Clustering Algorithms to Large Databases.

Proc. 4th Int'l Conf. Knowledge Discovery and Data Mining, pp. 9–15.

Brumby, S.P., Theiler, J.P., Bloch J.J., Harvey N.R., Perkins S.J., Szymanski J.J., Young, A.C.,

2002. Evolving land cover classification algorithms for multispectral and multitemporal imagery.

Proc. SPIE 4480, pp. 120–129.

Caldwell, M. K., Hawbaker, T. J., Briggs, J. S., Cigan, P. W., & Stitt, S., 2013. Simulated

impacts of mountain pine beetle and wildfire disturbances on forest vegetation composition and

carbon stocks in the Southern Rocky Mountains. Biogeosciences, 10(12): 8203-8222.

Carroll, A. L., Taylor, S. W., Régnière, J., and Safranyik, L., 2003. Effect of climate change on

range expansion by the mountain pine beetle in British Columbia, Mountain Pine Beetle

Symposium: Challenges and Solutions, 30-31 Oct 2003, Kelowna, British Columbia, (Natural

Resources Canada, Information Report BC-X-399, Victoria), pp. 223-232.

Celik, T., 2009. Unsupervised Change Detection in Satellite Images Using Principal Component

Analysis and K-means Clustering. IEEE Geoscience and Remote Sensing Letters 6(4), 772–776.

Chang, C. C., and Lin, C. J., 2011. LIBSVM: a library for support vector machines, ACM

Transactions on Intelligent Systems and Technology, 2(3): 1-27.

Chapman, T. B., Veblen, T. T., and Schoennagel, T., 2012. Spatiotemporal patterns of mountain

pine beetle activity in the southern Rocky Mountains, Ecology, 93(10): 2175-2185.

Chen, Y., & Gong, P. (2013). Clustering based on eigenspace transformation–CBEST for

efficient classification. ISPRS Journal of Photogrammetry and Remote Sensing, 83, 64-80.

Cohen, W. B., Yang, Z., and Kennedy, R., 2010. Detecting trends in forest disturbance and

recovery using yearly Landsat time series: 2. TimeSync—Tools for calibration and validation,

Remote Sensing of Environment, 114(12): 2911-2924.

Cole, W. E., and Amman, G. D., 1980. Mountain pine beetle dynamics in lodgepole pine forests,

Part I: Course of an infestation, General Technical Report, Intermountain Forest and Range

Experiment Station, USDA Forest Service, (INT-89).

Coppin, P. R., and Bauer, M. E., 1996. Digital change detection in forest ecosystems with remote

sensing imagery, Remote Sensing Reviews, 13(3-4): 207-234.

Crist, E. P., and Cicone, R. C., 1984. A physically-based transformation of Thematic Mapper

data---The TM Tasseled Cap, Geoscience and Remote Sensing, IEEE Transactions on, GE-22(3):

256-263.

Dai, X., and Khorram, S., 1998. The effects of image misregistration on the accuracy of remotely

sensed change detection, Geoscience and Remote Sensing, IEEE Transactions on, 36(5): 1566-

1577.

DeFries, R. S., Hansen, M. C., Townshend, J. R. G., Janetos, A. C., and Loveland, T. R., 2000. A

new global 1‐km dataset of percentage tree cover derived from remote sensing, Global Change

Biology, 6(2): 247-254.

Ding, C., He, X., 2004. K-means clustering via principal component analysis. Proc. of Int’l Conf.

Machine Learning (ICML 2004), pp. 225-232.

Edwards Jr, T. C., Moisen, G. G., & Cutler, D. R. (1998). Assessing map accuracy in a remotely

sensed, ecoregion-scale cover map. Remote Sensing of Environment, 63(1), 73-83.

Eitzen, Z.A., Xu, K.-M., Wong, T., 2008. Statistical Analyses of Satellite Cloud Object Data

from CERES Part V: Relationships between Physical Properties of Marine Boundary Layer

Clouds. Journal of Climate 21(24), 6668–6688.

Ester, M., Kriegel, H.-P., Sander, J., Xu, X., 1996. A Density-Based Algorithm for Discovering

Clusters in Large Spatial Databases with Noise. Proc. 2nd Int’l Conf. Knowledge Discovery and

Data Mining (KDD-96), pp. 226–231.

Ester, M., Kriegel, H.-P., Xu, X., 1995. A Database Interface for Clustering in Large Spatial

Databases. Proc. First Int'l Conf. Knowledge Discovery and Data Mining (KDD-95), pp. 94–99.

FAO, 2010. Global Forest Resources Assessment 2010 Main Report. FAO.

Forest Inventory and Analysis, 2014. Forest Inventory and Analysis Fiscal Year 2013 Business

Report. FIA.

Frahling, G., Sohler, C., 2006. A fast K-means implementation using coresets. Proc. 22nd

symposium on Computational geometry (SoCG).

Franklin, S.E., Stenhouse, G.B., Hansen, M.J., Popplewell, C.C., Dechka, J.A., Peddle, D.R.,

2001. An integrated decision tree approach (IDTA) to mapping landcover using satellite remote

sensing in support of grizzly bear habitat analysis in the Alberta Yellowhead Ecosystem.

Canadian Journal of Remote Sensing 27(6), 579–592.

Franklin, J., Woodcock, C. E., & Warbington, R. (2000). Multi-attribute vegetation maps of

forest service lands in California supporting resource management decisions. Photogrammetric

Engineering and Remote Sensing, 66(10), 1209-1218.

Friedl, M. A., Sulla-Menashe, D., Tan, B., Schneider, A., Ramankutty, N., Sibley, A., & Huang,

X. (2010). MODIS Collection 5 global land cover: Algorithm refinements and characterization

of new datasets. Remote Sensing of Environment, 114(1), 168-182.

Fry, J., Xian, G., Jin, S., Dewitz, J., Homer, C., Yang, L., Barnes, C., Herold, N., and Wickham,

J., 2011. Completion of the 2006 National Land Cover Database for the Conterminous United

States, PE&RS, Vol. 77(9):858-864.

Fu, W., Chen, Y., Shi, M., Zhang, X., Xiao, X., Gong, P., 2013. The distribution and temporal

changes of surface cover color in China revealed by satellite based dynamic observation. Journal

of Remote Sensing. In press.

Funk, C.C., Theiler, J., Roberts, D.A., Borel, C.C., 2001. Clustering to improve matched filter

detection of weak gas plumes in hyperspectral thermal imagery. IEEE Transactions on

Geoscience and Remote Sensing 39(7), 1410–1420.

Girolami, M., 2002. Mercer Kernel Based Clustering in Feature Space. IEEE Transactions on

Neural Networks 13(3), 780–784.

Gong, P., 2012. Remote sensing of environmental change over China, a review. Chinese

Science Bulletin 57(22), 2793-2801.

Gong, P., Howarth, P.J., 1990. An assessment of some factors influencing multispectral land-

cover classification. Photogrammetric Engineering and Remote Sensing 56(5), 597–603.

Gong, P., Howarth, P.J., 1992. Frequency‐based contextual classification and gray‐level vector

reduction for land‐use identification. Photogrammetric Engineering and Remote Sensing 58(4),

423–437.

Gong, P., LeDrew, E. F., and Miller, J. R., 1992. Registration-noise reduction in difference

images for change detection, International Journal of Remote Sensing, 13(4): 773-779.

Gong, P., Wang, J., Yu, L., et al., 2013. Finer resolution observation and monitoring of global

land cover: first mapping results with Landsat TM and ETM+ data, International Journal of

Remote Sensing, 34(7):2607-2654.

Goodwin, N. R., Coops, N. C., Wulder, M. A., Gillanders, S., Schroeder, T. A., and Nelson, T.,

2008. Estimation of insect infestation dynamics using a temporal sequence of Landsat data,


Goodwin, N. R., Magnussen, S., Coops, N. C., and Wulder, M. A., 2010. Curve fitting of time-

series Landsat imagery for characterizing a mountain pine beetle infestation, International

Journal of Remote Sensing, 31(12): 3263-3271.

Gordon, N.D., Norris, J.R., Weaver, C.P., Klein, S.A., 2005. Cluster analysis of cloud regimes

and characteristic dynamics of midlatitude synoptic systems in observations and a model. Journal

of Geophysical Research 110(D15), D15S17.

Guha, S., Rastogi, R., Shim, K., 1998. CURE: An efficient clustering algorithm for large

databases. Proc. ACM SIGMOD Int. Conf. Management of Data, pp. 73–84.

Han, K.-S., Champeaux, J.-L., Roujean, J.-L., 2004. A land cover classification product over

France at 1 km resolution using SPOT4/VEGETATION data. Remote Sensing of Environment

92(1), 52–66.

Hansen, M. C., DeFries, R. S., Townshend, J. R., and Sohlberg, R., 2000. Global land cover

classification at 1 km spatial resolution using a classification tree approach. International

Journal of Remote Sensing, 21(6-7): 1331-1364.

Hansen, M. C., Potapov, P. V., Moore, R., Hancher, M., Turubanova, S. A., Tyukavina, A., Thau,

D., Stehman, S. V., Goetz, S. J., Loveland, T. R., Kommareddy, A., Egorov, A., Chini, L.,

Justice, C. O., and Townshend, J. R. G., 2013. High-resolution global maps of 21st-century

forest cover change. Science, 342(6160): 850-853.

Hastie, T., Tibshirani, R., and Friedman, J., 2009. The Elements of Statistical Learning: Data

Mining, Inference, and Prediction, Springer Science+Business Media, New York, NY, 745 p.

Helms, J. A., 1998. The dictionary of forestry. Bethesda: SAF and CABI publishing.

Homer, C.G., Ramsey, R.D., Edwards, T.C. Jr., Falconer, A., 1997. Landscape cover-type

modeling using a multi-scene Thematic Mapper mosaic. Photogrammetric Engineering &

Remote Sensing 63(1), 59–67.

Homer, C., Dewitz, J., Fry, J., Coan, M., Hossain, N., Larson, C., Herold, N., McKerrow, A.,

VanDriel, J. N., and Wickham, J., 2007. Completion of the 2001 National Land Cover Database

for the Conterminous United States, Photogrammetric Engineering & Remote Sensing, 73(4),

337-341.

Honey-Marie, C., Carroll, A. L., Lindgren, B. S., and Aukema, B. H., 2011. Incoming!

Association of landscape features with dispersing mountain pine beetle populations during a

range expansion event in western Canada, Landscape Ecology, 26(8): 1097-1110.

Hsu, C. W., Chang, C. C., and Lin, C. J., 2003. A practical guide to support vector classification.

URL: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf, National Taiwan University,

Taipei, Taiwan (last date accessed: 2 Feb 2014).

Huang, C., Song, K., Kim, S., Townshend, J. R., Davis, P., Masek, J. G., and Goward, S. N.,

2008. Use of a dark object concept and support vector machines to automate forest cover change

analysis, Remote Sensing of Environment, 112(3): 970-985.

Huang, C., Goward, S. N., Masek, J. G., Gao, F., Vermote, E. F., Thomas, N., Schleeweis, K.,

Kennedy, R. E., Zhu, Z., Eidenshink, J. C., and Townshend, J. R., 2009. Development of time

series stacks of Landsat images for reconstructing forest disturbance history, International

Journal of Digital Earth, 2(3): 195-218.

Huang, C., Goward, S. N., Masek, J. G., Thomas, N., Zhu, Z., and Vogelmann, J. E., 2010. An

automated approach for reconstructing recent forest disturbance history using dense Landsat time

series stacks, Remote Sensing of Environment, 114(1): 183-198.

Huete, A. R., Liu, H. Q., Batchily, K., and Van Leeuwen, W. J. D. A., 1997. A comparison of

vegetation indices over a global set of TM images for EOS-MODIS, Remote Sensing of

Environment, 59(3): 440-451.

Iverson, L. R., and Prasad, A. M., 1998. Predicting abundance of 80 tree species following

climate change in the eastern United States, Ecological Monographs, 68(4): 465-485.

Jensen, J.R., 2004. Introductory Digital Image Processing: A Remote Sensing Perspective, third

ed. Prentice Hall, New Jersey.

Jiao, L., Gong, M., Wang, S., Hou, B., Zheng, Z., Wu, Q., 2010. Natural and Remote Sensing

Image Segmentation Using Memetic Computing. IEEE Computational Intelligence Magazine

5(2), 78–91.

Jin, S., Yang, L., Danielson, P., Homer, C., Fry, J., and Xian, G. 2013. A comprehensive change

detection method for updating the National Land Cover Database to circa 2011. Remote Sensing

of Environment, 132: 159-175.

Jolliffe, I., 2002. Principal component analysis, second ed. Springer-Verlag, New York.

Kanungo, T., Mount, D., Netanyahu, N., Piatko, C., Silverman, R., Wu, A., 2002. An Efficient

K-means Clustering Algorithm: Analysis and Implementation. IEEE Trans. Pattern Anal. Mach.

Intell. 24(7), 881–892.

Keane, R. E., Morgan, P., and Menakis, J. P., 1994. Landscape Assessment of the Decline of

Whitebark pine (Pinus albicaulis) in the Bob Marshall Wilderness Complex, Montana, USA,

Northwest Science, 68(3): 213-229.

Kennedy, R. E., Yang, Z., and Cohen, W. B., 2010. Detecting trends in forest disturbance and

recovery using yearly Landsat time series: 1. LandTrendr—Temporal segmentation algorithms,


Key, C. H., and Benson, N. C., 2005. Landscape assessment: Sampling and analysis methods,

FIREMON: Fire Effects Monitoring and Inventory System (D. C. Lutes, R. E. Keane, and J. F.

Caratti, Editors), USDA Forest Service, Rocky Mountain Research Station, Ogden, Utah,

General Technical Report, (RMRS-GTR-164).

Kohavi, R., 1995. A Study of Cross-Validation and Bootstrap for Accuracy Estimation and

Model Selection, International Joint Conference on Artificial Intelligence, 20-25 August 1995,

Montreal, Quebec, Canada, pp. 1137-1145.

Kurz, W. A., Dymond, C. C., Stinson, G., Rampley, G. J., Neilson, E. T., Carroll, A. L., Ebata,

T., and Safranyik, L., 2008. Mountain pine beetle and forest carbon feedback to climate change,

Nature, 452(7190): 987-990.

Laaksonen-Craig, S., Goldman, G. E., & McKillop, W., 2003. Forestry, forest industry, and

forest products consumption in California. Oakland, CA: University of California, Division of

Agriculture and Natural Resources.

Lillesand, T.M., Kiefer, R.W., 1987. Remote Sensing and Image Interpretation, second ed.

Wiley, New Jersey.

Lloyd, S., 1982. Least Squares Quantization in PCM. IEEE Transactions on Information Theory

28(2), 129–137.

Logan, J. A., White, P., Bentz, B. J., and Powell, J. A., 1998. Model analysis of spatial patterns

in mountain pine beetle outbreaks, Theoretical Population Biology, 53(3): 236-255.

Logan, J. A., and Powell, J. A., 2001. Ghost forests, global warming, and the mountain pine

beetle (Coleoptera: Scolytidae), American Entomologist, 47(3): 160-173.

Loveland, T., Merchant, J., Brown, J., and Ohlen, D., 1991. Development of a land-cover

characteristics database for the conterminous U. S., Photogrammetric Engineering & Remote

Sensing, 57(11): 1453-1463.

Loveland, T.R., Reed, B.C., Brown, J.F., Ohlen, D.O., Zhu, Z., Yang, L., Merchant, J.W., 2000.

Development of a global land cover characteristics database and IGBP DISCover from 1 km

AVHRR data. International Journal of Remote Sensing 21(6), 1303–1330.

Lu, D., Mausel, P., Brondizio, E., and Moran, E., 2004. Change detection techniques,

International Journal of Remote Sensing, 25(12): 2365-2401.

Lunetta, R., Congalton, R., Fenstermaker, L., Jensen, J., Mcgwire, K., & Tinney, L. (1991).

Remote sensing and Geographic Information System data integration: error sources and research

issues. Photogrammetric Engineering and Remote Sensing, 57(6), 677-687.

MacQueen, J.B., 1967. Some Methods for classification and Analysis of Multivariate

Observations. Proc. 5th Berkeley Symposium on Mathematical Statistics and Probability,

University of California Press, pp. 281–297.

Maness, H., Kushner, P. J., and Fung, I., 2013. Summertime climate response to mountain pine

beetle disturbance in British Columbia. Nature Geoscience, 6(1): 65-70.

Mas, J. F., 1999. Monitoring land-cover changes: a comparison of change detection techniques,


Masek, J. G., Vermote, E. F., Saleous, N., Wolfe, R., Hall, F. G., Huemmrich, F., Gao, F.,

Kutler, J., and Lim, T. K., 2012. LEDAPS Landsat Calibration, Reflectance, Atmospheric

Correction Preprocessing Code, Model product, URL: http://daac.ornl.gov, Oak Ridge National

Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee (last date accessed: 2 Feb

2014).

Matgen, P., El Idrissi, A., Henry, J.B., Tholey, N., Hoffmann, L., de Fraipont, P., Pfister, L.,

2006. Patterns of remotely sensed floodplain saturation and its use in runoff predictions.

Hydrological Processes 20(8), 1805–1825.

Mayaux, P., Achard, F., and Malingreau, J. P., 1998. Global tropical forest area measurements

derived from coarse resolution satellite imagery: a comparison with other approaches,

Environmental Conservation, 25(1): 37-52.

McFeeters, S. K., 1996. The use of the Normalized Difference Water Index (NDWI) in the

delineation of open water features, International Journal of Remote Sensing, 17(7): 1425-1432.

Meigs, G. W., Kennedy, R. E., and Cohen, W. B., 2011. A Landsat time series approach to

characterize bark beetle and defoliator impacts on tree mortality and surface fuels in conifer

forests, Remote Sensing of Environment, 115(12): 3707-3718.

Metz, B., 2001. Climate change 2001: mitigation: contribution of Working Group III to the third

assessment report of the Intergovernmental Panel on Climate Change (Vol. 3), Cambridge

University Press, Cambridge, United Kingdom, 758 p.

Mitchell, R. G., Waring, R. H., & Pitman, G. B., 1983. Thinning lodgepole pine increases tree

vigor and resistance to mountain pine beetle. Forest Science, 29(1): 204-211.

Muller, S.V., Racoviteanu, A.E., Walker, D.A., 1999. Landsat MSS-derived land-cover map of

northern Alaska: Extrapolation methods and a comparison with photo-interpreted and AVHRR-

derived maps. International Journal of Remote Sensing 20(15–16), 2921–2946.

Nagesh, H., Goil, S., Choudhary, A., 1999. MAFIA: Efficient and scalable subspace clustering

for very large data sets. Center for Parallel and Distributed Computing, NWU, Tech. Rep. 9906-

010.

Ng, A. Y., 1997. Preventing Overfitting of Cross-Validation Data, Proceedings of the Fourteenth

International Conference on Machine Learning, 8-12 July 1997, Nashville, Tennessee, pp. 245-

253.

Olofsson, P., Foody, G. M., Herold, M., Stehman, S. V., Woodcock, C. E., & Wulder, M. A.

(2014). Good practices for estimating area and assessing accuracy of land change. Remote

Sensing of Environment, 148, 42-57.

Ouma, Y., Ngigi, T.G., Tateishi, R.R., 2006. On the optimization and selection of wavelet

texture for feature extraction from high‐resolution satellite imagery with application towards

urban‐tree delineation. International Journal of Remote Sensing 27(1–2), 73–104.

Overpeck, J. T., Rind, D., & Goldberg, R. (1990). Climate-induced changes in forest disturbance

and vegetation. Nature (London), 343(6253), 51-53.

Parker, T. J., Clancy, K. M., and Mathiasen, R. L., 2006. Interactions among fire, insects and

pathogens in coniferous forests of the interior western United States and Canada, Agricultural

and Forest Entomology, 8(3): 167-189.

Peltonen, M., Liebhold, A. M., Bjørnstad, O. N., and

Williams, D. W., 2002. Spatial synchrony in forest insect outbreaks: roles of regional

stochasticity and dispersal, Ecology, 83(11): 3120-3129.

Parmentier, B., & Eastman, J. R. (2014). Land transitions from multivariate time series: using

seasonal trend analysis and segmentation to detect land-cover changes. International Journal of

Remote Sensing, 35(2), 671-692.

Powell, S. L., Cohen, W. B., Kennedy, R. E., Healey, S. P., & Huang, C. (2014). Observation of

Trends in Biomass Loss as a Result of Disturbance in the Conterminous US: 1986–2004.

Ecosystems, 17(1), 142-157.

Raffa, K. F., and Berryman, A. A., 1983. The role of host plant resistance in the colonization

behavior and ecology of bark beetles (Coleoptera: Scolytidae), Ecological Monographs, 53(1):

27-49.

Reger, B., Otte, A., Waldhardt, R., 2007. Identifying patterns of land-cover change and their

physical attributes in a marginal European landscape. Landscape and Urban Planning 81(1–2),

104–113.

Reilly, T.E., Dennehy, K.F., Alley, W.M., and Cunningham, W.L., 2008, Ground-Water

Availability in the United States: U.S. Geological Survey Circular 1323, 70 p., also available

online at http://pubs.usgs.gov/circ/1323/

Remund, Q.P., Long, D.G., Drinkwater, M.R., 2000. An iterative approach to multisensor sea ice

classification. IEEE Transactions on Geoscience and Remote Sensing 38(4), 1843–1856.

Richards, J.A., Jia, X., 2005. Remote Sensing Digital Image Analysis: An Introduction, fourth ed.

Springer-Verlag, Berlin Heidelberg.

Roelfsema, C.M., Phinn, S.R., Dennison, W.C., 2002. Spatial distribution of benthic microalgae

on coral reefs determined by remote sensing. Coral Reefs 21(3), 264–274.

Rollet, R., Benie, G.B., Li, W., Wang, S., Boucher, J.-M., 1998. Image classification algorithm

based on the RBF neural network and K-means. International Journal of Remote Sensing 19(15),

3003–3009.

Ruefenacht, B., Finco, M. V., Nelson, M. D., Czaplewski, R., Helmer, E. H., Blackard, J. A.,

Holden, G. R., Lister, A. J., Salajanu, D., Weyermann, D., and Winterberger, K., 2008.

Conterminous US and Alaska forest type mapping using forest inventory and analysis data,

Photogrammetric Engineering & Remote Sensing, 74(11): 1379-1388.

Running, S. W., 2008. Ecosystem disturbance, carbon, and climate. Science, 321(5889): 652-

653.

Safranyik, L., and Whitney, H. S., 1985. Development and survival of axenically reared

mountain pine beetles, Dendroctonus ponderosae (Coleoptera: Scolytidae), at constant

temperatures, The Canadian Entomologist, 117(02): 185-192.

Safranyik, L., and Wilson, B., 2007. The mountain pine beetle: a synthesis of biology,

management and impacts on lodgepole pine, Natural Resources Canada, Canadian Forest Service,

Victoria, British Columbia, 317 p.

Sano, E.E., Ferreira, L.G., Asner, G.P., Steinke, E.T., 2007. Spatial and temporal probabilities of

obtaining cloud‐free Landsat images over the Brazilian tropical savannah. International Journal

of Remote Sensing 28(12), 2739–2752.

Shah, C.A., Arora, M.K., Varshney, P.K., 2004. Unsupervised classification of hyperspectral

data: an ICA mixture model based approach. International Journal of Remote Sensing 25(2),

481–487.

Shah, C.A., Varshney, P.K., Arora, M.K., 2007. ICA mixture model algorithm for unsupervised

classification of remote sensing imagery. International Journal of Remote Sensing 28(8), 1711–

1731.

Sheikholeslami, G., Chatterjee, S., Zhang, A., 1998. WaveCluster: A multiresolution clustering

approach for very large spatial databases. Proc. 24th VLDB Conf., pp. 428–439.

Shimamura, Y., Izumi, T., and Matsuyama, H., 2006. Evaluation of a useful method to identify

snow‐covered areas under vegetation–comparisons among a newly proposed snow index,

normalized difference snow index, and visible reflectance, International Journal of Remote

Sensing, 27(21): 4867-4884.

Singh, A., 1989. Review Article Digital change detection techniques using remotely-sensed data,


Smith, W.B., Miles, P.D., Vissage, J.S., Pugh, S.A., 2002. Forest Resources of the United States,

General Tech Rep NC-241 (US Department of Agriculture, Forest Service, North Central

Research Station, St. Paul, MN).

Soja, A. J., Tchebakova, N. M., French, N. H., Flannigan, M. D., Shugart, H. H., Stocks, B. J., ...

& Stackhouse Jr, P. W. (2007). Climate-induced boreal forest change: predictions versus current

observations. Global and Planetary Change, 56(3), 274-296.

Song, C., Woodcock, C.E., Seto, K.C., Lenney, M.P., Macomber, S.A., 2001. Classification and

change detection using Landsat TM data: When and how to correct atmospheric effects. Remote

Sensing of Environment 75(2), 230–244.

Souza Jr, C. M., Siqueira, J. V., Sales, M. H., Fonseca, A. V., Ribeiro, J. G., Numata, I., ... &

Barlow, J. (2013). Ten-Year Landsat Classification of Deforestation and Forest Degradation in

the Brazilian Amazon. Remote Sensing, 5(11), 5493-5513.

Stehman, S. V., and Czaplewski, R. L., 1998. Design and analysis for thematic map accuracy

assessment: fundamental principles, Remote Sensing of Environment, 64(3): 331-344.

Stewart, S. I., Radeloff, V. C., & Hammer, R. B. (2006). The wildland-urban interface in the

United States. The public and wildland fire management: Social science findings for managers,

197-202.

Stone, T.A., Schlesinger, P., Houghton, R.A., Woodwell, G.M., 1994. A map of the vegetation of

South America based on satellite imagery. Photogrammetric Engineering & Remote Sensing

60(5), 541–551.

Theiler, J.P., Gisler, G., 1997. Contiguity-enhanced K-means clustering algorithm for

unsupervised multispectral image segmentation. Proc. SPIE 3159, 108–118.

Thome, K., Markham, B., Barker, P. S., and Biggar, S., 1997. Radiometric calibration of

Landsat, Photogrammetric Engineering & Remote Sensing, 63(7): 853-858.

Trzcinski, M. K., and Reid, M. L., 2008. Effect of management on the spatial spread of mountain

pine beetle (Dendroctonus ponderosae) in Banff National Park, Forest Ecology and

Management, 256(6): 1418-1426.

Tsagaris, V., Anastassopoulos, V., Lampropoulos, G.A., 2005. Fusion of hyperspectral data

using segmented PCT for color representation and classification. IEEE Transactions on

Geoscience and Remote Sensing 43(10), 2365–2375.

Tucker, C. J., 1979. Red and photographic infrared linear combinations for monitoring

vegetation. Remote Sensing of Environment, 8(2): 127-150.

USDA Forest Service, 2001. U.S. Forest Facts and Historical Trends. FS-696. Washington, DC:

USDA Forest Service.

USDA Forest Service, 2012. Future of America’s Forest and Rangelands: Forest Service 2010

Resources Planning Act Assessment. Gen. Tech. Rep. WO-87. Washington, DC. 198 p.

U.S. Department of Commerce, 2010. Census 2010. U.S. Gazetteer Files at

http://www.census.gov/geo/maps-data/data/gazetteer2010.html.

U.S. Bureau of Economic Analysis, 2013. Widespread Economic Growth in 2012, news release

(June 6, 2013), http://www.bea.gov/newsreleases/regional/gdp_state/2013/pdf/gsp0613.pdf.

Viovy, N., 2000. Automatic Classification of Time Series (ACTS): A new clustering method for

remote sensing time series. International Journal of Remote Sensing 21(6–7), 1537–1560.

Vogelmann, J.E., S.M. Howard, L. Yang, C. R. Larson, B. K. Wylie, and J. N. Van Driel, 2001,

Completion of the 1990’s National Land Cover Data Set for the conterminous United States,

Photogrammetric Engineering and Remote Sensing 67:650-662.

Vogelmann, J. E., Tolk, B., and Zhu, Z. (2009). Monitoring forest changes in the southwestern

United States using multitemporal Landsat data. Remote Sensing of Environment, 113(8), 1739-

1748.

Vose, J. M., Peterson, D. L., Patel-Weynand, T., 2012. Effects of climatic variability and change

on forest ecosystems: a comprehensive science synthesis for the U.S. forest sector. Gen. Tech.

Rep. PNW-GTR-870. Portland, OR: U.S. Department of Agriculture, Forest Service, Pacific

Northwest Research Station. 265 p.

Wang, W., Yang, J., Muntz, R.R., 1997. STING: A Statistical Information Grid Approach to

Spatial Data Mining. Proc. 23rd VLDB Conf., pp. 186–195.

Westerling, A. L., Hidalgo, H. G., Cayan, D. R., & Swetnam, T. W. (2006). Warming and earlier

spring increase western US forest wildfire activity. Science, 313(5789), 940-943.

Westerling, A. L., & Bryant, B. P. (2008). Climate change and wildfire in California. Climatic

Change, 87(1), 231-249.

Wickham, J. D., Stehman, S. V., Fry, J. A., Smith, J. H., and Homer, C. G., 2010. Thematic

accuracy of the NLCD 2001 land cover for the conterminous United States, Remote Sensing of

Environment, 114(6): 1286-1296.

Wilson, E. H., and Sader, S. A., 2002. Detection of forest harvest type using multiple dates of

Landsat TM imagery, Remote Sensing of Environment, 80(3): 385-396.

Woodcock, C.E., Collins, J., Gopal, S., Jakabhazy, V.D., Li, X., Macomber, S., Ryherd, S.,

Harward, V.J., Levitan, J., Wu, Y., Warbington, R., 1994. Mapping forest vegetation using

Landsat TM imagery and a canopy reflectance model. Remote Sensing of Environment 50(3),

240–254.

Wulder, M.A., Franklin, S.E., White, J.C., 2004. Sensitivity of hyperclustering and labelling land

cover classes to Landsat image acquisition date. International Journal of Remote Sensing 25(23),

5337–5344.

Zarco-Tejada, P.J., Ustin, S.L., Whiting, M.L., 2005. Temporal and Spatial Relationships

between Within-Field Yield Variability in Cotton and High-Spatial Hyperspectral Remote

Sensing Imagery, Agronomy Journal 97(3), 641–653.

Zha, H., Ding, C., Gu, M., He, X., Simon, H.D., 2001. Spectral Relaxation for K-means

Clustering. Neural Information Processing Systems 14, Vancouver, Canada, 3-8 Dec, pp. 1057–

1064.

Zhang, L., Small, G.W., 2002. Automated detection of chemical vapors by pattern recognition

analysis of passive multispectral infrared remote sensing imaging data. Applied Spectroscopy

56(8), 1082–1093.

Zhang, R., Rudnicky, A., 2002. A large scale clustering scheme for kernel K-means. Int’l Conf.

Pattern Recognition (ICPR02), pp. 289–292.

Zhang, T., Ramakrishnan, R., Livny, M., 1997. BIRCH: A New Data Clustering Algorithm and

Its Applications. Data Mining and Knowledge Discovery 1(2), 141–182.

Zharikov, Y., Skilleter, G.A., Loneragan, N.R., Taranto, T., Cameron, B.E., 2005. Mapping and

characterising subtropical estuarine landscapes using aerial photography and GIS for potential

application in wildlife conservation and management. Biological Conservation, 125(1), 87–100.

Zhong, Y., Zhang, L., Huang, B., Li, P., 2006. An unsupervised artificial immune classifier for

multi/hyperspectral remote sensing imagery. IEEE Transactions on Geoscience and Remote

Sensing 44(2), 420–431.

Zhou, Q., Robson, M., 2001. Automated rangeland vegetation cover and density estimation using

ground digital images and a spectral-contextual classifier. International Journal of Remote

Sensing, 22(17), 3457–3470.

Zhu, Z., and Evans, D. L., 1994. US forest types and predicted percent forest cover from

AVHRR data, Photogrammetric Engineering and Remote Sensing, 60(5): 525-531.

Zhu, Z., and Woodcock, C. E., 2012. Object-based cloud and cloud shadow detection in Landsat

imagery, Remote Sensing of Environment, 118: 83-94.

Zhu, Z., Woodcock, C. E., and Olofsson, P., 2012. Continuous monitoring of forest disturbance

using all available Landsat imagery, Remote Sensing of Environment, 122: 75-91.
