THREE-TIER DETECTION AND MULTI-LEVEL SYNERGY FOR COASTAL MIXED-LAND ZONE CLASSIFICATION

MUHAMAD ASYRAF BIN MOHD POUZI

UNIVERSITI TEKNOLOGI MALAYSIA

A thesis submitted in partial fulfillment of the

requirements for the award of the degree of

Master of Science (Computer Science)

Faculty of Computing

Universiti Teknologi Malaysia

MAY 2013

All praises to Allah the Almighty for

the strengths and His blessing in completing this thesis.

Specially dedicated to;

my beloved parents Mohd Pouzi bin Hamzah and Murni binti Ghani

my precious siblings Muhamad Afiq and Adlina Najihah

my most helpful friend Ismaliza binti Ismail

ACKNOWLEDGEMENT

I would like to express my gratitude to my supervisor, Dr. Muhamad Razib bin Othman, my co-supervisor, Dr. Hishammuddin bin Asmuni, as well as Dr. Rohayanti binti Hassan for their patience, guidance, encouragement, invaluable comments and advice, which made it possible to complete this research. I would also like to thank all members of the Laboratory of Computational Intelligence and Biotechnology (LCIB) for their continuous support in many aspects of this research.

My deepest appreciation also goes to my parents, as they were the ones who encouraged me to pursue my MSc. My father, who is a lecturer, shared his valuable experience in research, including its ethics and skills, while my mother has continuously given me her greatest support. The strength to withstand the hardships encountered along this road came from their aspirations that live in me, and these will inspire me to achieve greater things in the life ahead, God willing.

The datasets used in this study are courtesy of a Geographic Information System (GIS) solution provider and the Malaysian Remote Sensing Agency (ARSM). Last but not least, I would like to acknowledge the funding from GATES BIOTECH Solution Sdn. Bhd. (GBIT) under the GATES Scholars Foundation (GSF) scheme (LTR/GSF/2011-06) and the MyMaster Scholarship of the Ministry of Higher Education Malaysia, as well as the research opportunity provided by the Faculty of Computing, Universiti Teknologi Malaysia.

ABSTRACT

Vegetation, urban terrain and water are considered problematic segments in land use and land cover classification because of confusion factors, which make these segments vulnerable to high misclassification levels. In addressing these problems, several fundamental issues must be emphasized: ineffective stand-alone data classification, the high investment required for data fusion and the need for a high frequency of data collection. Thus, after evaluating LiDAR point cloud, aerial photography, Quickbird and Landsat 7 ETM+ images, this research proposes a classification method consisting of two important components: Three-tier Detection (TTD) and Multi-level Synergy (MLS). TTD, a hierarchical and priority-based data fusion method, is used for vegetation and urban terrain classification, while MLS, a synergy strategy that combines a single data source with robust learning algorithms, is used for water classification. TTD outperformed stand-alone data classification, which makes it a worthwhile investment, while the use of a single data source allows MLS to meet the demand for frequent data collection. Both methods start with data processing such as image filtering, followed by a comparison of several existing techniques for each data (rank) to identify their potentials and limitations. Next, multi-level data fusions and multi-level synergy are conducted for TTD and MLS, respectively. The dataset employed is Bukit Kanada, Sarawak, which exemplifies a coastal mixed-land zone. The performance is then measured using statistical indices including overall accuracy and the Kappa Index of Agreement. Both TTD and MLS outperformed recent works such as the Normalized Digital Surface Model, the Edge Detection technique and Support Vector Machine. Based on the success rates, TTD is suitable for the planning and development sectors and for the management and detection of land use changes, while MLS is suitable for creating maps and charts and for monitoring the national coastline.

ABSTRAK

The main challenge in land use and land cover classification is the confusion that occurs in problematic segments such as vegetation, urban areas and water. These segments are exposed to a high level of misclassification. To address these problems, several fundamental issues need to be emphasized: stand-alone data classification is ineffective, while data fusion involves high investment and depends on a high frequency of data collection. Therefore, this study introduces a classification method consisting of two important components, Three-tier Detection (TTD) and Multi-level Synergy (MLS), after evaluating several data. TTD is a hierarchical and priority-based data fusion method used for vegetation and urban area classification, while MLS, a synergy strategy based on single data and learning algorithms, is used for water classification. The achievement of TTD in outperforming stand-alone data classification makes it a worthwhile investment, while MLS, which operates on single data, is seen as capable of meeting the dependence on a high frequency of data collection. Both methods begin with data processing such as image filtering, followed by a comparison of several existing techniques for each data to identify their potentials and weaknesses. Next, multiple levels of data fusion and synergy are tested for TTD and MLS. The study location is Bukit Kanada, Sarawak, which is a mixed-land zone in a coastal area. Based on statistical indices including overall accuracy and the Kappa Index of Agreement, TTD and MLS outperformed recent works such as the Normalized Digital Surface Model, the Edge Detection technique and Support Vector Machine. With this achievement, TTD is suitable for application in the planning and development sectors and in the management and detection of land use changes, while MLS is suitable for producing maps and charts and for monitoring national waters.

TABLE OF CONTENTS

CHAPTER TITLE

DECLARATION
DEDICATION
ACKNOWLEDGEMENTS
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

1 INTRODUCTION
1.1 Background
1.2 Challenges of Land Use and Land Cover Classification
1.3 Current Methods in Land Use and Land Cover Classification
1.4 Problem Statement
1.5 Objectives of the Study
1.6 Scope and Significance of the Study
1.7 Organization of the Thesis

2 LITERATURE REVIEW
2.1 Introduction
2.2 Coastal Mixed-land Zone

2.3 Land Use and Land Cover Classification
2.4 Remote Sensing Data
2.5 Data Fusions
2.6 Remote Sensing Data and Learning Algorithm(s)
2.7 Information Fusion Schemes
2.8 Trends and Directions
2.9 Summary

3 RESEARCH METHODOLOGY
3.1 Introduction
3.2 Research Framework
3.3 Data Sources and Preparation
3.3.1 LiDAR Point Cloud
3.3.2 Aerial Photography
3.3.3 Quickbird Image
3.3.4 Landsat 7 ETM+ Image
3.3.5 Area-of-Interest
3.4 Instrumentation and Results Analysis
3.4.1 Hardware and Software Requirements
3.4.2 Testing and Analysis
3.4.3 Evaluation Metrics
3.5 Summary

4 A COMPARATIVE STUDY OF STAND-ALONE DATA APPLICATION IN COASTAL MIXED-LAND ZONE
4.1 Introduction
4.2 Related Works
4.3 Materials and Method
4.3.1 Test Scenes
4.3.2 Generation of DSM and DTM
4.3.3 Orthorectification
4.3.4 Image Filtering
4.3.5 Signature Analysis

4.3.6 Classification
4.4 Results and Discussion
4.4.1 Analysis on Classification Results
4.4.2 Analysis on Supervised Classifiers
4.4.3 Analysis on Non-classifiers
4.4.4 Analysis on False Negative and False Positive
4.5 Summary

5 THREE-TIER DETECTION AS EFFECTIVE DATA FUSIONS FOR VEGETATION AND URBAN TERRAIN CLASSIFICATION IN COASTAL MIXED-LAND ZONE
5.1 Introduction

5.3.5.1 Stand-alone Data
5.3.5.2 Multi-level Data Fusions

5.4.1 Analysis on Classification Results
5.4.1.1 Analysis on Tier-1 (Inter-data Comparisons)

(Fusion of Two Data)

(Fusion of Three Data)

Positive
5.4.3 Comparison to Other Related Works
5.5 Summary

6 MULTI-LEVEL SYNERGY (MLS) OF AERIAL PHOTOGRAPHY WITH THE FUSION OF LEARNING ALGORITHMS FOR WATER CLASSIFICATION IN COASTAL MIXED-LAND ZONE
6.1 Introduction

6.3.5.1 Synergy of Single Remote Sensing Data with Different Computational Techniques
6.3.5.2 Multi-level Synergy of Remote Sensing Data with Computational Techniques

6.4.1 Analysis on Stage 1 (Ranking Determination)
6.4.2 Analysis on Stage 2 (Synergy-based Classification)
6.4.3 Analysis on McNemar’s Test for End Result Selection

Positive
6.4.5 Comparison with Other Related Works
6.5 Summary

7 CONCLUSION
7.1 Concluding Remarks
7.2 Contributions of the Research
7.3 Future Works
7.4 Closing Remarks

REFERENCES

LIST OF TABLES

TABLE NO. TITLE

2.1 The advantages and disadvantages of LULC classification patterns, including the number of classification steps taken, the information provided and the time consumption

2.2 Examples of works for each LULC classification pattern

2.3 Descriptions of remote sensing data. Note that RGB denotes red, green and blue, respectively

2.4 Related works that utilize data capabilities, as well as the advantages and disadvantages reported for each method

2.5 Results of the method by Awrangjeb et al. (2010). Cmp and Crp denote completeness and correctness, respectively, while both Ql and Qlp denote quality. Low completeness and correctness levels occur in Scene 2 and a low quality level in Scene 3

2.6 Results of the method proposed by Hermosilla et al. (2011). µ denotes the mean value while σ denotes the standard deviation. The high µ value indicates that the thresholding-based approach performed better, while the low σ value suggests better robustness for this approach

2.7 LULC classification based on data fusions that consist of various types of data and methods

2.8 Results of the OMISII image classification by Du et al. (2012). The application of MCS outperformed single classifier-based classification and, among the MCS, the concatenation combination outperformed the others

2.9 The application of learning algorithms to remote sensing data, including the advantages and disadvantages

3.1 Testing analysis conducted in this study for each SOI using several data types, as well as the respective chapter for reference

3.2 KIA strength of agreement by Landis and Koch (1977)

4.1 Filtering formula and weighting function of ELF. Csi is the standard speckle index, Ci is the varied standard speckle index, L is the number of looks, Cmax is the upper threshold and Kd is the damping factor

4.2 RGB attributes for the imageries based on the signature analysis of both SOIs. 10 samples for each SOI. This spectral information is used as a reference for supervised classifiers in this study

4.3 Results for stand-alone data classification (average of three test scenes) for vegetation and urban terrain. The highlighted rows indicate the best method for each data for vegetation and urban terrain classification

4.4 FN and FP for vegetation and urban terrain classification. The ‘*’ indicates that the process is conducted for vegetation classification only, while the highlighted rows indicate the best method for each data to classify the SOIs


analysis of both SOIs. 10 samples for each SOI. This spectral information is used as a reference for supervised classifiers in this study

5.2 Results for TTD (average of five test scenes) for each SOI. The highlighted rows indicate the best method in each tier

5.3 Decrement of FNs and FPs for the classification using TTD. The highlighted rows signify no change to the FN or FP values. This occurs either because the value is already 0 or because no mutual agreement is achieved with the additional supportive layer


analysis of water (20 samples). This spectral information is used as a reference for supervised classifiers in this study

6.2 The results of water classification in Stage 1 (average of ten test scenes) for LPC, AP, QB and LDST

6.3 The results of water classification in Stage 2 (average of ten test scenes). The highlighted row indicates the selected synergy level for the end results

6.4 Statistical significance of differences in overall classification accuracy expressed by means of McNemar’s test for the datasets. Statistically significant differences at the 5% level of significance (|Zij| ≥ 1.96) are highlighted in bold face. The 4th and 5th levels of synergy do not satisfy McNemar’s rule. Thus, the 3rd level of synergy is chosen as the final result

6.5 Decrement of FNs and FPs for the classification using MLS

6.6 Superiority of the proposed synergy approach over data fusions (average of ten test scenes). The ‘’ indicates synergy

LIST OF FIGURES

FIGURE NO. TITLE

1.1 Comparisons between a coastal mixed-land zone and an urban area: (a) a coastal mixed-land zone and (b) an urban area

1.2 Several characteristics of a coastal mixed-land zone: (a) seawater, (b) industrial areas, (c) port facilities, and (d) roads

1.3 Scope of the study. Note that “*” indicates involvement in water classification only

2.1 The characteristics of vegetation classes

2.2 The possible confusion factors

2.3 Rapid changes of LULC segments in year 2003

2.4 Patterns in LULC classification

2.5 Examples of remote sensing data: (a) LPC, (b) AP, (c) QB and (d) LDST. Note that for (d), the segment corresponding to (a) to (c) is highlighted in a black box

2.6 Swimming pool and flowering plants in pots (highlighted in red box)

2.7 Results of the method proposed by Pérez-Hoyos et al. (2012). Note that OA denotes Oac

2.8 Results of the comparative study by Du et al. (2012) for Decision Level Fusion. Majority Voting (highlighted in red box) was ranked first over Weighted Majority Vote, Fuzzy Integral and Dempster-Shafer evidence theory (denoted by DS) in all diversity cases

2.9 Results of the comparative study by Du et al. (2013) that include Data Level Fusion and Decision Level Fusion. Again, Majority Voting (denoted by MV and highlighted in red box) showed outstanding performance, followed by Fuzzy Integral (denoted by FI), Dempster-Shafer evidence theory (denoted by D-S), Generalized Intensity-Hue-Saturation (denoted by GIHS), Principal Component Analysis (denoted by PCA), Gram-Schmidt (denoted by GS), High Pass Filter (denoted by HPF) and Wavelet Transform (denoted by WT). In short, Majority Voting works by collecting the agreement between a particular number of parties in making the final decision

2.10 Directions of this study in improving the SOIs classification

3.1 Research framework

3.2 Research structure

3.3 The LPC data

3.4 The AP data

3.5 The QB data

3.6 The LDST data

3.7 The details of Sarawak and overview of Bukit Kanada

4.1 Flowchart of the classification procedures

4.2 The three test scenes for each data

4.3 The Digital Surface Model (DSM) and Digital Terrain Model (DTM)

4.4 The results of orthorectification process

4.5 The results of enhanced Lee filtering (ELF)

4.6 The results for all methods in each test scene

4.7 The KIA order of learning algorithms. (a) AP and (b) QB

4.8 Examples of the confusion factors (highlighted in red box). The first rows are the original images while the second rows are the classified images: (a) mossy roofs of old buildings, (b) dark coloured buildings, (c) shadow and (d) ground class. The green colour indicates the vegetation classification result

4.9 The limitations of the classification. (a) LPC (TP for vegetation is highlighted in green while low vegetation is undetected), (b) LPC (TP for the building is highlighted in blue while the uneven building structure is not fully detected), (c) LPC (particular buildings are incorrectly classified as medium vegetation, highlighted in red) and (d) QB (shaded areas)

5.1 Framework of TTD. “k” denotes the selected method in Chapter 4

5.2 Algorithm of TTD for the fusions of LPC, AP and QB

5.3 The test scenes for this study. (a) TS1, (b) TS2, (c) TS3, (d) TS4 and (e) TS5

5.4 The results of orthorectification process. (a) to (e) are TS1 to TS5 (AP), respectively, while (f) to (j) are TS1 to TS5 (QB), respectively

5.5 The results of enhanced Lee filtering (ELF)

5.6 Illustration of multi-level data fusions by TTD

5.7 The results of TTD in each test scene from Tier-1 to Tier-3

5.8 The differences between Majority Voting and the proposed TTD

6.1 Framework of MLS. Note that “” is synergy and “*” indicates the process is conducted for LDST only

6.2 Algorithm of MLS. The synergy approach () between Rsd and Ct

6.3 The test scenes for this study. TS1 to TS5 show the characteristics of water areas for (a) AP, (b) QB and (c) LDST, while (d) TS6 to TS10 show different ratios of LULC segments

6.4 The results of orthorectification process

6.5 The results of enhanced Lee filtering. (a) TS1 to TS5 and (b) TS6 to TS10

6.6 The concept of the proposed synergy approach. Note that in (c), the outputs of R1 synergized with R2 are highlighted in red. These regions indicate the overlapped regions between the FP or FN of R1 and R2

LIST OF ABBREVIATIONS

AB - AdaBoost

AP - Aerial photography

AOI - Area-of-interest

ANN - Artificial Neural Networks

BB - Bagging and Boosting

Bf - Branching factor

Ce - Commission error

Cm - Completeness

Ct - Computational technique

Cr - Correctness

DS - Dempster-Shafer

DEM - Digital Elevation Model

DSM - Digital Surface Model

DTM - Digital Terrain Model

ED - Edge Detection

ELF - Enhanced Lee Filtering

FN - False Negative

FP - False Positive

FS - Fuzzy Sets

GIS - Geographic Information System

Gr - Green

GCP - Ground control point

htnDSM - Height threshold of nDSM

ICP - Image control points

J48 DT - J48 Decision Tree

KIA - Kappa Index of Agreement

kNN - k-Nearest Neighbor

LC - Land Cover

LU - Land Use

LDST - Landsat 7 ETM+

LPC - LiDAR point cloud

ARSM - Malaysian Remote Sensing Agency

ML - Maximum Likelihood

M - Methods

MIR - Middle infrared band

MD - Minimum Distance

Mf - Miss factor

MODIS - Moderate-resolution Imaging Spectroradiometer

MNDWI - Modified Normalized Difference Water Index

MLP - Multi-layer Perceptron

MLPNN - Multi-layer Perceptron Neural Network

MLS - Multi-level Synergy

Multi - Multiple

MCS - Multiple classifier system

NB - Naïve Bayes

NIR - Near-infrared

NN - Neural Network

NDVI - Normalized Difference Vegetation Index

nDSM - Normalized Digital Surface Model

Oe - Omission error

Oac - Overall accuracy

PP - Parallelepiped

PC - Personal Computer

Q - Quality

QB - Quickbird

RBF - Radial Basis Function

RAM - Random Access Memory

RF - Random Forest

R - Rank

RGB - Red, green and blue

Rsd - Remote sensing data

S - Sample

SOI - Segment-of-interest

SOM - Self-Organizing Map

SRTM - Shuttle Radar Topography Mission

Si - Single

SAM - Spectral Angle Mapper

SVM - Support Vector Machine

SAR - Synthetic Aperture Radar

TS - Test scenes

TTD - Three-tier Detection

TIN - Triangulated Irregular Network

TN - True Negative

TP - True Positive

UTM - Universal Transverse Mercator

CHAPTER 1

INTRODUCTION

1.1 Background

From a Geographic Information System (GIS) perspective, the Earth’s surface can be described in two ways: (i) Land Use (LU), which describes the human use of land and involves the management or modification of the natural environment or wilderness into a built environment such as residential and industrial areas; and (ii) Land Cover (LC), which describes the physical materials at the surface of the Earth such as vegetation, urban terrain and water segments. The main difference between LU and LC is that LU concerns the changes made to the Earth’s surface, while LC includes both the changed and the natural areas. Among the important aspects related to LU and LC is classification, which also depends on the efficiency of data procurement. Traditional data collection involves physical contact with the objects, is costly and time-consuming, and some areas such as volcanic, landslide and remote areas are not easy to reach. These limitations have been overcome by enhanced remote sensing technology using planes, unmanned aerial vehicles and satellites, which requires no physical contact and offers low cost, speed and high reach. The capabilities of the data also increase over time, and each data type carries particular advantages, some of which are unique. LiDAR point cloud (LPC), aerial photography (AP), Quickbird (QB) and Landsat 7 ETM+ (LDST) images are examples of the remote sensing data types used for classification.

The classification task takes place in two types of area-of-interest (AOI; see Figure 1.1): (i) mixed-land zones (Garcia-Gutierrez et al., 2011); and (ii) urban areas (Awrangjeb et al., 2010). A mixed-land zone is an area that combines a variety of land uses such as industrial areas, port facilities, roads and natural areas in the same space (Garcia-Gutierrez et al., 2011; see Figure 1.2). It usually consists of low and medium-sized buildings and large vegetation segments. These characteristics exist in almost all countries in the world. A coastal mixed-land zone, however, includes seawater in addition to freshwater. Meanwhile, an urban area consists mainly of tall buildings and smaller vegetation areas, and may contain more modern and sophisticated structures. Obvious examples are the Petronas Twin Towers and KL Tower in Kuala Lumpur as well as the Taming Sari Tower in Melaka. The basic criteria used to differentiate these two AOIs are building density and total vegetation area: a coastal mixed-land zone commonly has a lower building density and a larger total vegetation area than an urban area.

Figure 1.1 Comparisons between a coastal mixed-land zone and an urban area: (a) a coastal mixed-land zone and (b) an urban area.

Figure 1.2 Several characteristics of a coastal mixed-land zone: (a) seawater, (b) industrial areas, (c) port facilities, and (d) roads.

1.2 Challenges of Land Use and Land Cover Classification

The existence of many types of data with various capabilities, accompanied by advances in computing, provides wider options for improving the accuracy of land use and land cover (LULC) classification. These options include the Normalized Digital Surface Model (nDSM) application presented by Demir et al. (2008), the height threshold of nDSM (htnDSM) application by Hermosilla et al. (2011) and the Edge Detection (ED) technique by Babykalpana and Thanushkodi (2011). The nDSM, htnDSM and ED techniques were reported to produce good classification accuracy, together with certain limitations. This is where the first challenge lies. The experiments demonstrated by previous researchers are not comparable with each other because many parameters differ, such as: (i) the dataset; (ii) the method; and (iii) the segment-of-interest (SOI). These inconsistencies make comparison difficult, although it is crucial to recognize the advantages and disadvantages of each remote sensing data type, technique and proposed method in order to provide better solutions for particular problems.

Several works, such as those by Sohn and Dowman (2007) and Awrangjeb et al. (2010), have suggested data fusion as a new option for LULC classification. Data fusion enables the collection of useful information from different sensors (Campos et al., 2010) and is reported to outperform stand-alone data classification. A number of researchers, such as Rottensteiner et al. (2005) and Hyde et al. (2006), have demonstrated the potential of data fusion as an alternative in LULC classification. Before data fusion can be considered, the second challenge must be tackled. This challenge concerns the data fusion methods used to solve the classification problem, since current results are not at a satisfactory level. For example, the method by Campos et al. (2010), which fuses LPC data with QB data, achieved a Kappa Index of Agreement (KIA) value of 0.78, which falls in the “Good” range of the Kappa strength of agreement and below the “Very good” range. Awrangjeb et al. (2010) fused various products of LPC with QB and reported low completeness and correctness levels in particular test scenes as well as a low quality level. More recently, Pérez-Hoyos et al. (2012) created a synergetic land-cover map using four types of data. However, issues arose regarding the reproducibility of the method and the availability and price of the data. In short, the key to tackling the second challenge lies in how the first challenge is handled.
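To make the KIA figure quoted above more concrete, the following is a minimal sketch of how overall accuracy and Cohen's kappa are commonly computed from a confusion matrix. The function names and the 2-class counts are hypothetical and chosen only for illustration; they are not taken from Campos et al. (2010) or from the datasets of this study.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa_index(matrix):
    """Cohen's kappa: agreement beyond what chance alone would produce."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(matrix[i]) * sum(matrix[r][i] for r in range(n))
        for i in range(n)
    ) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical 2-class counts (rows: reference, columns: classified).
example = [[45, 5],
           [6, 44]]
oac = overall_accuracy(example)   # 89 of 100 samples agree
kia = kappa_index(example)        # chance agreement here is 0.5
```

With these illustrative counts the overall accuracy is 0.89 while kappa is about 0.78, which shows why kappa is reported alongside overall accuracy: it discounts the agreement expected by chance.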

In contrast to the vegetation and urban terrain segments, the water segment is subject to many factors that rapidly affect its condition, such as daily human activity, natural phenomena and pollution. Hence, frequent classification is needed to fulfil the purposes of water classification, such as monitoring the national coastline. Since this task requires a high frequency of data collection, the investment in preparing the data must be minimized. Such circumstances require obtaining good results from a single data source, which leads to the consideration of learning algorithms and water indices, and this is where the third challenge stems from. The multiple classifier system (MCS) is an advanced learning-algorithm approach that Du et al. (2012) showed to outperform a single classifier, while the Modified Normalized Difference Water Index (MNDWI) is a well-known water index that was explored by Ho et al. (2010). MNDWI is capable of classifying water bodies owing to the sensitivity of LDST data. However, several issues arise, such as the choice of classifiers, which affects the effectiveness and time consumption of the MCS, and the identification of the superior method for water classification.
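As an illustration of the water index mentioned above, MNDWI is commonly computed per pixel as (Green - MIR) / (Green + MIR). The sketch below assumes reflectance grids given as nested lists, a zero threshold for separating water from non-water, and made-up sample values; for Landsat 7 ETM+ the green and middle infrared bands are usually taken to be bands 2 and 5. None of these choices is claimed to be the configuration used by Ho et al. (2010) or in this study.

```python
def mndwi(green, mir):
    """Per-pixel (Green - MIR) / (Green + MIR); 0.0 where both bands are zero."""
    result = []
    for g_row, m_row in zip(green, mir):
        row = []
        for g, m in zip(g_row, m_row):
            denom = g + m
            row.append((g - m) / denom if denom != 0 else 0.0)
        result.append(row)
    return result


def water_mask(index, threshold=0.0):
    """Flag pixels whose index exceeds the (assumed) threshold as water."""
    return [[value > threshold for value in row] for row in index]


# Hypothetical 2 x 2 reflectance values: the left column behaves like water
# (green reflectance higher than MIR), the right column like land.
green = [[0.10, 0.08],
         [0.12, 0.09]]
mir = [[0.02, 0.20],
       [0.03, 0.25]]
mask = water_mask(mndwi(green, mir))
```

In practice the threshold is often tuned per scene, which is one reason a single index may still need support from learning algorithms.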

1.3 Current Methods in Land Use and Land Cover Classification

Generally, current methods for LULC classification can be categorized into two groups, stand-alone data and data fusions:

(i) Stand-alone data - utilizes a single dataset for the classification task by exploiting the benefits of each data type. For LPC, the approaches include nDSM (Brennan and Webster, 2006; Demir et al., 2008), ED (Babykalpana and Thanushkodi, 2011) and LPC with intelligent techniques (Garcia-Gutierrez et al., 2011). For imagery data, a single learning algorithm has been applied (Foody et al., 2007; Perumal and Bhaskaran, 2010; Szuster et al., 2011), while the fusion of learning algorithms was implemented by Du et al. (2012). Lee and Yeh (2009) utilized the near-infrared band of QB images through the Normalized Difference Vegetation Index (NDVI), a well-known vegetation index.

(ii) Data fusions - utilize more than one dataset for the classification task, for which many schemes have been observed. Amarsaikhan et al. (2010) fused the nDSM of LPC data with a Support Vector Machine (SVM) applied to imagery data. Khoshelham et al. (2010) fused the htnDSM with an MCS consisting of SVM and Maximum Likelihood (ML). Guan et al. (2012) fused a Triangulated Irregular Network (TIN) and nDSM with SVM. Elghazali (2011) and Campos et al. (2010) fused LPC with QB data.
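The two product types named in the list above can be sketched briefly: an nDSM is commonly obtained by subtracting the terrain model (DTM) from the surface model (DSM), and NDVI is commonly computed as (NIR - Red) / (NIR + Red). The example below combines them in a simple rule; the thresholds (2.5 m and 0.3), the class labels and the sample grids are hypothetical, and the rule is only an illustration of why height and spectral information complement each other, not the fusion scheme of any cited work or of this study.

```python
def ndsm(dsm, dtm):
    """Height above ground per cell: surface model minus terrain model."""
    return [[s - t for s, t in zip(s_row, t_row)]
            for s_row, t_row in zip(dsm, dtm)]


def ndvi(nir, red):
    """Per-pixel (NIR - Red) / (NIR + Red); 0.0 where both bands are zero."""
    return [[(n - r) / (n + r) if (n + r) != 0 else 0.0
             for n, r in zip(n_row, r_row)]
            for n_row, r_row in zip(nir, red)]


def label_cells(height, index, min_height=2.5, min_ndvi=0.3):
    """Assumed rule: elevated and green -> tall vegetation; elevated only -> building."""
    labels = []
    for h_row, i_row in zip(height, index):
        row = []
        for h, v in zip(h_row, i_row):
            if h < min_height:
                row.append("low")
            elif v >= min_ndvi:
                row.append("tall vegetation")
            else:
                row.append("building")
        labels.append(row)
    return labels


# Hypothetical 2 x 2 grids (elevations in metres, reflectance unitless).
dsm = [[12.0, 5.5], [5.0, 20.0]]
dtm = [[5.0, 5.0], [5.0, 5.0]]
nir = [[0.5, 0.3], [0.2, 0.4]]
red = [[0.1, 0.3], [0.2, 0.35]]
labels = label_cells(ndsm(dsm, dtm), ndvi(nir, red))
```

Height alone cannot separate the tree at the top left from the building at the bottom right, while NDVI alone cannot tell tall vegetation from low vegetation, which is the basic motivation for fusing the two.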

1.4 Problem Statement

The accuracy of LULC classification is a general problem because LULC does not consist only of vegetation, urban terrain and water segments. In such a confusing environment, methods for reducing the misclassification levels by using a large amount of data have not been adequately investigated. To solve this problem, various classification schemes have been observed, with single data application, data fusions and learning algorithms among the options. However, for some purposes such as water classification, classification results are in high demand over time (the details are presented in Chapter 6). Hence, the investment in preparing the data and the dependency on frequent data collection need to be addressed. Since the results of different researchers are not comparable owing to their various experimental configurations, the classification problem to be solved in this study can be described as follows:

“Given a number of remote sensing data with particular capabilities and exploitation methods whose advantages and disadvantages are barely known, the main problem is to produce accurate classification by reducing the misclassification levels, namely False Negative (FN) and False Positive (FP), caused by the presence of confusion factors in the study area, in order to achieve higher overall accuracy (Oac) and Kappa Index of Agreement (KIA). While current data fusions seem to be insufficiently effective, particular purposes call for good results from a single data source. The investment to prepare the data and the dependency on frequent data collection are taken into account.”

Based on the above challenges, several factors need to be addressed by any possible solution. The first factor is the insufficient knowledge of the advantages and disadvantages of particular methods in classifying a coastal mixed-land zone, caused by differing experimental settings. As a result, the results are not comparable and the actual potential of the methods remains questionable. Thus, this study aims to investigate a number of selected methods to recognize their capabilities in terms of accuracy level and to identify the contributors to the misclassifications.

The second factor is the unsatisfactory performance of current data fusion techniques, which may lead to wasted investment. This condition is mainly caused by the inability of the data fusions to fully utilize the potential of the data in managing the misclassification factors that exist in a coastal mixed-land zone. Therefore, once the advantages and disadvantages of the data and methods have been identified, this study aims to utilize them properly to reduce the FN and FP levels by considering several data fusion options. The third factor is the need to reduce the investment in preparing the data, given the demand for a high frequency of data collection. Obtaining good results from a single data source leads to the use of learning algorithms, which is among the options worth considering. While current MCS performance is unsatisfactory and involves too many classifiers, this study aims to produce an MCS-based method with fewer classifiers, which reduces time consumption while still producing good results.
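For readers unfamiliar with MCS, one common way of combining classifiers is majority voting, in which each classifier labels every pixel and the most frequent label wins. The sketch below is a generic illustration with three hypothetical classifiers and made-up labels; it is not the MLS method proposed in this study, and the function name is an assumption.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse per-classifier label lists (all the same length) pixel by pixel."""
    fused = []
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest vote count.
        fused.append(Counter(labels).most_common(1)[0][0])
    return fused


# Hypothetical outputs of three classifiers for four pixels.
clf_a = ["water", "land", "water", "land"]
clf_b = ["water", "water", "land", "land"]
clf_c = ["land", "water", "water", "land"]
fused = majority_vote([clf_a, clf_b, clf_c])
```

Using an odd number of classifiers on a two-class problem avoids ties, and every additional classifier adds one full classification pass, which is why reducing the number of classifiers also reduces time consumption.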

1.5 Objectives of the Study

The goal of this study is to develop a method that can produce high classification levels for vegetation, urban terrain and water segments in the presence of various confusion factors. This goal is divided into the following objectives:

(i) To perform a comparative study that investigates the potential of each selected data and method in classifying vegetation, urban terrain and water segments in a coastal mixed-land zone.

(ii) To develop a method that utilizes the capabilities of the data for vegetation and urban terrain classification and is able to manage the misclassification factors that exist in a coastal mixed-land zone.

(iii) To establish an MCS-based method that is able to separate various water types in a coastal mixed-land zone from other segments while reducing the number of classifier combinations.

1.6 Scope and Significance of the Study

In this study, the remote sensing data used are LiDAR point cloud (LPC), aerial
photography (AP), Quickbird (QB) and Landsat 7 ETM+ (LDST) images obtained from a
GIS solution provider and the Malaysian Remote Sensing Agency
(ARSM: http://www.remotesensing.gov.my/). The proposed method consists of two
components: Three-tier Detection (TTD) and Multi-level Synergy (MLS). TTD is
used to solve the issues of vegetation and urban terrain classification, while MLS
is used for water classification. The proposed method is applied to a coastal
mixed-land zone exemplified by Bukit Kanada in Sarawak. The performance is then
measured using the following statistical indices: completeness (Cm), correctness
(Cr), omission error (Oe), commission error (Ce), branching factor (Bf), miss
factor (Mf), quality (Q), overall accuracy (Oac), Kappa Index of Agreement (KIA),
and McNemar’s test. The scope of this study is summarized in Figure 1.3.
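The indices listed above are commonly computed from per-class counts of true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN). The Python sketch below shows one conventional set of definitions. The exact formulas used in this study are not given in this section, so the definitions, the `Counts` container and the function names are assumptions made for illustration. The McNemar statistic is shown without continuity correction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Per-class pixel counts against reference data (hypothetical container)."""
    tp: int  # class pixels correctly detected
    fp: int  # pixels wrongly assigned to the class
    fn: int  # class pixels that were missed
    tn: int  # non-class pixels correctly rejected


def classification_indices(c: Counts) -> dict:
    """Compute the indices under commonly used definitions (assumed here).

    Assumes tp > 0 and that both classes occur, so no denominator is zero.
    """
    n = c.tp + c.fp + c.fn + c.tn
    cm = c.tp / (c.tp + c.fn)   # completeness
    cr = c.tp / (c.tp + c.fp)   # correctness
    oac = (c.tp + c.tn) / n     # overall accuracy
    # Agreement expected by chance, from the row and column totals.
    pe = ((c.tp + c.fp) * (c.tp + c.fn) + (c.fn + c.tn) * (c.fp + c.tn)) / n**2
    return {
        "Cm": cm,
        "Cr": cr,
        "Oe": 1 - cm,                       # omission error
        "Ce": 1 - cr,                       # commission error
        "Bf": c.fp / c.tp,                  # branching factor
        "Mf": c.fn / c.tp,                  # miss factor
        "Q": c.tp / (c.tp + c.fp + c.fn),   # quality
        "Oac": oac,
        "KIA": (oac - pe) / (1 - pe),       # Kappa Index of Agreement
    }


def mcnemar_chi2(b: int, c: int) -> float:
    """McNemar chi-square statistic without continuity correction.

    b: pixels classified correctly by method A only.
    c: pixels classified correctly by method B only.
    Assumes b + c > 0.
    """
    return (b - c) ** 2 / (b + c)


# Hypothetical counts for one class.
result = classification_indices(Counts(tp=80, fp=10, fn=20, tn=90))
statistic = mcnemar_chi2(15, 5)
```

With these hypothetical counts, Cm is 0.80, Cr is about 0.89, Q is about 0.73, Oac is 0.85 and KIA is 0.70, and the McNemar statistic for 15 against 5 discordant pixels is 5.0. Completeness and correctness describe one class at a time, whereas Oac and KIA summarize the whole confusion matrix, which is why the two groups are reported together.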

The significance of this study can be divided according to the SOI: (i)
vegetation and urban terrain; and (ii) water. For vegetation and urban terrain, the
proposed method can serve the management, planning and development sectors, such
as estates, oil palm plantations, cities and housing. It can also be used to detect
land use changes such as soil erosion and deforestation, and to support natural
disaster management, for example during floods. The end results of water
classification are also substantial. Besides mapping rivers, lakes and reservoirs,
the mapping of coastal areas is very important for many countries to define and
monitor long national coastlines, create maps and charts, and monitor
environmental change. Nautical charts, which are among the coastal mapping
products, are fundamental tools for mariners in planning voyages and navigating
ships along the shortest, safest and most economical routes. In addition, coastal
mapping is performed for coastal change assessment, which determines the rate of
change of the coast and can help with future planning. The assessment is conducted
by measuring the differences between past and present shoreline locations; this
‘before’ and ‘after’ comparison is one way in which scientists determine shoreline
change.

[Figure 1.3: diagram of the scope of the study. Data: LiDAR point cloud, aerial
photography, Quickbird images, Landsat 7 ETM+ images*. Area-of-interest: Bukit
Kanada in Sarawak. Proposed method: Three-tier Detection (TTD), Multi-level
Synergy (MLS). Evaluation methods: completeness, correctness, omission error,
commission error, branching factor, miss factor, quality, overall accuracy, Kappa
Index of Agreement, McNemar’s test*.]

Figure 1.3 Scope of the study. Note that “*” indicates the involvement in water
classification only.

1.7 Organization of the Thesis

This thesis is organized into seven chapters. A brief description of the
content of each chapter is given below:

(i) Chapter 1 defines the challenges, problems, current methods, objectives,
scope and significance of the study.

(ii) Chapter 2 reviews the main subjects of interest, which are coastal
mixed-land zone classification, remote sensing data, data fusion and the
application of remote sensing data with learning algorithms.

(iii) Chapter 3 provides the design of the computational method that supports
the objectives of the study. This includes the research framework, data
sources, instrumentation and analysis of results.

(iv) Chapter 4 presents the comparative study of stand-alone data application,
in which several existing techniques were evaluated using several statistical
indices. The objectives are to produce comparable results between the
techniques and to identify the contributors to FN and FP.

(v) Chapter 5 describes the proposed TTD, a data fusion approach used to reduce
the FN and FP levels for vegetation and urban terrain classification in a
coastal mixed-land zone.

(vi) Chapter 6 describes the other component of the proposed method, namely
MLS. MLS synergizes single remote sensing data with fusions of learning
algorithms to solve the issues of water classification, and fulfils the
requirement of high-frequency data collection.

(vii) Chapter 7 draws general conclusions from the results, presents the
contributions of the study and suggests several ideas for related future work.

