Feature Selection for Tree Species Identification in Very High Resolution Satellite Images

Matthieu Molinier and Heikki Astola

VTT Technical Research Centre of Finland

[email protected], [email protected]

IGARSS 2011 Vancouver, 28.7.2011


Introduction

NewForest – Renewal of Forest Resource Mapping

• A 1.5-year study (2009-2010) funded by the Finnish Funding Agency for Technology and Innovation (TEKES), with Finnish forest companies and research organizations (VTT and the University of Eastern Finland, UEF)

Study motivation

• Improve methods for operative forest inventory from remote sensing data

• Species-wise estimates (e.g. stem volume) not accurate enough (accuracy vs. cost)


NewForest approach in forest variable estimation

Field plot data (ToV = total stem volume, PinV = stem volume - pine, SprV = stem volume - spruce, DecV = stem volume - deciduous)

Plot ID  Coord-X    Coord-Y    ToV      PinV     SprV     DecV
No       Eastings   Northings
#        [m]        [m]        [m3/ha]  [m3/ha]  [m3/ha]  [m3/ha]
49       508469.1   6973060    149.49   149.49   0        0
117      510723.7   6972375    68.36    0        8.37     59.99
118      510732.8   6972976    150.78   0        0        150.78
121      511123     6973176    97.08    78.12    18.01    0.94
122      511324     6973177    89.63    20.31    65.63    3.69
132      516717.5   6969571    337.18   39.81    168.01   129.37
133      516716.5   6969773    370.18   71.73    282.28   16.17
134      516720.5   6969978    229.53   0        229.53   0
135      516717.6   6970173    159.47   0.31     131.7    27.46
158      510024.1   6973974    69.23    53.58    15.32    0.33
159      510223.8   6973976    103.16   0        0        103.16
160      510438.2   6973988    108.96   31.92    8.19     68.85
168      510732.5   6974084    218.62   0        215.11   3.5
169      510935.5   6974078    228.76   0        218.32   10.43
171      513829.3   6972287    97.35    0        90.18    7.18
172      513817     6972480    162.25   109.52   11.79    40.94
174      514021.7   6973078    247.16   156.36   7.38     83.42
175      514227.1   6973084    316.99   135.16   181.1    0.73
176      514409.4   6973059    288.53   177.77   110.45   0.31
196      515921.4   6971387    133.66   0        1.26     132.4
197      515922.5   6971571    242.78   0        21.72    221.06
200      516121.7   6971978    86.56    0        42.66    43.91
209      513921.5   6971078    56.27    0        0        56.27
210      514123.4   6971079    103.61   59       39.52    5.09
212      514527.7   6971098    164.14   90.45    72.96    0.73
213      514714.9   6971086    101.19   8.71     3.94     88.54
222      513720.3   6969993    282.51   0        254.33   28.17
223      513719.9   6970169    220.55   34.85    185.7    0
226      514112.5   6970168    219.36   0        176      43.36
227      514321.7   6970179    164.12   0        124.97   39.15

• Modelling based on satellite image pixel reflectances and contextual features

• Individual tree crown (ITC) detection and crown width estimation

• Combining data to predict total amount and size variation by species

• Segmentation estimates

• Refined, more accurate species-wise estimates


Study site

Karttula / Kuopio, Central Finland

62.9007° N, 27.2392° E

GeoEye image, 26.6.2009, RGB and NIR, 10.5 km x 11.5 km, 3% clouds

Mixed forest, spruce dominated: 25% pine, 45% spruce, 30% deciduous (mainly birch)


Optical image data pre-processing

• Rectification to geographic coordinate system (WGS84, UTM zone 35N)

• Geo-coding corrected using a Digital Elevation Model (Airborne Laser Scanning DEM): mean correction 2.65 m, maximum 20 m

• Calibration to Top Of Atmosphere (TOA) reflectances using the band-specific calibration coefficients

• Atmospheric correction to surface reflectances by applying the SMAC4 radiative transfer code
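The TOA calibration step can be illustrated with the standard radiance-to-reflectance conversion. This is a minimal sketch, not the authors' processing chain: the function name, its parameters and the numeric values are assumptions for illustration, and the real band-specific coefficients come from the image metadata.

```python
import math

def toa_reflectance(radiance, esun, sun_elevation_deg, earth_sun_dist_au=1.0):
    """Convert at-sensor spectral radiance to top-of-atmosphere reflectance.

    radiance: band radiance in W / (m^2 sr um), i.e. the calibrated digital number
    esun: band-averaged solar exoatmospheric irradiance in W / (m^2 um)
    sun_elevation_deg: sun elevation angle at acquisition time, in degrees
    earth_sun_dist_au: Earth-Sun distance in astronomical units
    """
    solar_zenith = math.radians(90.0 - sun_elevation_deg)
    return (math.pi * radiance * earth_sun_dist_au ** 2
            / (esun * math.cos(solar_zenith)))

# Hypothetical values for illustration only (not GeoEye coefficients).
rho = toa_reflectance(radiance=80.0, esun=1500.0, sun_elevation_deg=50.0)
```

The atmospheric correction to surface reflectance (SMAC4 in the slides) is a separate step and is not covered by this sketch.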


Ground reference data

Training data – from 222 field plots

• 212 field plots within GeoEye image area (2009)
• 10 additional 0-stem volume plots extracted visually
• Tree species classification: training data from 20 pure species field plots

Testing data – from 178 field plots (mixed species)

• 178 field plots acquired in 2009, limited spatial distribution (several plots per forest stand)

In total : 1164 ground objects mapped (276 pines, 277 spruces, 347 deciduous, 264 non-trees)

Training set    ToV     PinV    SprV    DecV
Mean [m3/ha]    200.8   48.4    99.2    53.2
Stdev [m3/ha]   116.3   88.8    108.5   63.8

Test set        ToV     PinV    SprV    DecV
Mean [m3/ha]    203.3   78.7    79.0    45.6
Stdev [m3/ha]   107.3   91.5    96.2    49.4

GeoEye image : 10.5 km x 11.5 km


Input for feature selection – 35 + 4 features

SPECTRAL (5) – set A

R, G, B, NIR, PAN: mean intensity within 1.5 m radius around tree candidates (TC)

CONTEXTUAL (9) – set B

From PAN, 7.5 m radius around TC:

• mean
• mean / median
• skewness
• kurtosis
• contrast
• pm1: mean of brightest pixels
• ps1: std of brightest pixels
• pm2: mean of darkest pixels
• ps2: std of darkest pixels

SEGMENT-WISE (21) – set C

From PAN, 3 segment sizes: 50 m2, 85 m2, 125 m2

• mean
• mean / median
• skewness
• kurtosis
• std: standard deviation
• pmean: partial mean
• pstd: partial standard deviation

Probe variables

Random vectors or random permutations of a feature vector:

• probe_gauss1, probe_gauss2
• probe_shuffle1, probe_shuffle2
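As an illustration of how such probe variables might be generated, here is a minimal sketch. The probe names follow the slide; the function name, the unit-variance Gaussian and the random choice of which real feature to shuffle are assumptions, since the slides do not give those details.

```python
import random

def make_probes(features, seed=0):
    """Build four probe variables from a dict {feature name: list of values}.

    Probes carry no class information by construction, so a real feature that
    ranks below them is a candidate for removal.
    """
    rng = random.Random(seed)
    n_samples = len(next(iter(features.values())))
    probes = {}
    # Gaussian probes: pure noise, one value per sample (assumed N(0, 1)).
    for i in (1, 2):
        probes[f"probe_gauss{i}"] = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    # Shuffle probes: a random permutation of a real feature vector, which keeps
    # the feature's distribution but breaks its link to the class labels.
    names = sorted(features)
    for i in (1, 2):
        shuffled = list(features[rng.choice(names)])
        rng.shuffle(shuffled)
        probes[f"probe_shuffle{i}"] = shuffled
    return probes

# Toy feature values (made up).
toy_features = {"nir": [0.1, 0.5, 0.9, 0.3], "pan": [0.2, 0.4, 0.6, 0.8]}
probes = make_probes(toy_features)
```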


Class definitions and training scheme

Class #   Class name
1         pine
2         spruce
3         deciduous
4         shadow
5         open area / sunlit
6         bare ground
7         green vegetation

Tree classes: 1-3. Non-tree classes: 4-7.

WHOLE DATASET (1164 samples): 900 trees, 264 non-trees

• MODEL DESIGN (773): 2/3 of the whole dataset
– TRAINING (512): 2/3 of the model design set, used for model building
– VAL (261): 1/3 of the model design set, used for ranking
• TESTING (391): 1/3 of the whole dataset

Stratified sampling to preserve class proportions
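The nested 2/3 - 1/3 stratified split can be sketched as follows. This is an illustrative implementation with assumed names and toy labels, not the authors' code: it splits each class separately so that class proportions are preserved in both parts.

```python
import random
from collections import defaultdict

def stratified_split(labels, first_fraction, seed=0):
    """Return two sorted index lists; every class sends about
    first_fraction of its samples to the first list."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    first, second = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        cut = round(first_fraction * len(idxs))
        first.extend(idxs[:cut])
        second.extend(idxs[cut:])
    return sorted(first), sorted(second)

# Toy labels (made up): three classes of 30 samples each.
labels = [1] * 30 + [2] * 30 + [3] * 30

# First level: model design (2/3) vs. testing (1/3).
design, test = stratified_split(labels, 2 / 3)

# Second level: training (2/3) vs. validation (1/3) within the design set.
design_labels = [labels[i] for i in design]
train_rel, val_rel = stratified_split(design_labels, 2 / 3, seed=1)
train = [design[i] for i in train_rel]
val = [design[i] for i in val_rel]
```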


Feature selection preparation (Guyon et al., 2003)

• Feature normalization to the range [0, 1]

• Visual screening of scatter plots on the 35 real features: no obvious correlations, very few outlier samples

• Variable ranking – assessing features one by one with the simplest classifier (single threshold), one (+) vs. all (-). 4 scores:

– Fisher criterion F, scaled to [0, 1]
– R2: squared Pearson correlation coefficient for a single feature vs. +/- labels
– AUC: area under the ROC (Receiver Operating Characteristic) curve
– sum of the previous scores (FR2AUC)

• All scores computed for every class, then averaged to rank the variables for all 7 classes and for tree classes only (1, 2, 3).

• No single feature significantly and consistently outperformed the others
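The three single-feature scores and their sum can be sketched as below for one class in a one-vs-all setting. This is a minimal illustration under assumptions: the exact form of the Fisher criterion and the way F is scaled to [0, 1] (here, division by the largest F over all features) are not given in the slides, and the function names and toy data are hypothetical.

```python
from statistics import mean, pvariance

def fisher(pos, neg):
    """Fisher criterion: squared gap between class means over summed variances."""
    return (mean(pos) - mean(neg)) ** 2 / (pvariance(pos) + pvariance(neg))

def r2(values, labels):
    """Squared Pearson correlation between a feature and its +1/-1 labels."""
    mx, my = mean(values), mean(labels)
    cov = sum((x - mx) * (y - my) for x, y in zip(values, labels))
    var_x = sum((x - mx) ** 2 for x in values)
    var_y = sum((y - my) ** 2 for y in labels)
    return cov * cov / (var_x * var_y)

def auc(pos, neg):
    """Area under the ROC curve of a single-threshold classifier (ties count 0.5)."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    area = wins / (len(pos) * len(neg))
    return max(area, 1.0 - area)  # the threshold may point in either direction

def rank_features(features, classes, target):
    """Rank feature names by F (scaled) + R2 + AUC for target-vs-all."""
    labels = [1 if c == target else -1 for c in classes]
    raw = {}
    for name, values in features.items():
        pos = [v for v, c in zip(values, classes) if c == target]
        neg = [v for v, c in zip(values, classes) if c != target]
        raw[name] = (fisher(pos, neg), r2(values, labels), auc(pos, neg))
    f_max = max(f for f, _, _ in raw.values())  # assumed scaling of F to [0, 1]
    total = {name: f / f_max + r + a for name, (f, r, a) in raw.items()}
    return sorted(total, key=total.get, reverse=True)

# Toy data (made up): "nir" separates class 1 from class 2, "probe" does not.
classes = [1, 1, 1, 2, 2, 2]
features = {
    "nir": [0.1, 0.2, 0.3, 0.7, 0.8, 0.9],
    "probe": [0.5, 0.1, 0.9, 0.4, 0.2, 0.8],
}
ranking = rank_features(features, classes, target=1)
```

In the study the scores were computed for every class and then averaged; the sketch handles a single target class, so the averaging would be a loop over `target`.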


Feature selection and image classification

• Classification accuracy on the validation set VAL (261) as the score

• Sequential Forward Selection (SFS) with three classification methods:

– Linear Discriminant Analysis (LDA)
– Quadratic LDA
– k-nearest neighbor (kNN) classifier, k ∈ [2, 9]. Feature selection and choice of k at the same time.

• Find the best minimal feature subset by a brute-force approach

– 10 best features from the SFS
– retrain the best model using the whole modelling dataset (TRAIN + VAL) and test with the independent TEST set
– brute-force approach tractable in this case with simple classifiers
– overcomes the sub-optimality of SFS
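The two-stage search (SFS, then brute force over the SFS shortlist) can be sketched with a small kNN classifier. This is an illustrative sketch, not the authors' implementation: the function names, the squared Euclidean distance, the tie handling and the toy data are all assumptions.

```python
from collections import Counter
from itertools import combinations

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training samples closest to x."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def val_accuracy(data, cols, k):
    """Validation accuracy using only the feature columns listed in cols."""
    train_X, train_y, val_X, val_y = data
    pick = lambda row: [row[c] for c in cols]
    train_sub = [pick(row) for row in train_X]
    hits = sum(knn_predict(train_sub, train_y, pick(row), k) == y
               for row, y in zip(val_X, val_y))
    return hits / len(val_y)

def sfs(data, k, max_features):
    """Greedy forward selection: repeatedly add the feature that gives the
    highest validation accuracy together with those already selected."""
    selected, history = [], []
    remaining = list(range(len(data[0][0])))
    while remaining and len(selected) < max_features:
        best = max(remaining, key=lambda c: val_accuracy(data, selected + [c], k))
        selected.append(best)
        remaining.remove(best)
        history.append((list(selected), val_accuracy(data, selected, k)))
    return history

def brute_force(data, candidates, k):
    """Exhaustive search over all non-empty subsets of the candidate features."""
    best_cols, best_acc = [], -1.0
    for size in range(1, len(candidates) + 1):
        for cols in combinations(candidates, size):
            acc = val_accuracy(data, list(cols), k)
            if acc > best_acc:  # strict '>' keeps the smallest subset on ties
                best_cols, best_acc = list(cols), acc
    return best_cols, best_acc

# Toy data (made up): column 0 separates the classes, column 1 is noise.
train_X = [[0.1, 0.9], [0.2, 0.1], [0.15, 0.5], [0.8, 0.2], [0.9, 0.8], [0.85, 0.4]]
train_y = [1, 1, 1, 2, 2, 2]
val_X = [[0.12, 0.7], [0.88, 0.3], [0.2, 0.2], [0.8, 0.9]]
val_y = [1, 2, 1, 2]
data = (train_X, train_y, val_X, val_y)

history = sfs(data, k=3, max_features=2)
subset, acc = brute_force(data, candidates=history[-1][0], k=3)
```

With 10 shortlisted features the exhaustive search covers 2^10 - 1 = 1023 subsets per value of k, which is why it stays tractable with simple classifiers. Choosing k at the same time, as on the slide, would amount to repeating the search for each k from 2 to 9 and keeping the best combination.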


6-10 features are enough

Spectral features performed best

Segment-wise features not suited to a mixed-species study

Overall classification accuracy on tree classes over 80%

Probe variables selected more often in the first places with LDA than with kNN : linear classifier too simple. Quadratic LDA was overfitting.

kNN, k=5 best overall performance, and lowest difference from training to validation error => lower risk of overfitting


Example of tree species classification map

pine : 76 %

spruce : 76 %

deciduous : 88 %

non-forest

• Pan-sharpened GeoEye image extract of 1 km x 1 km

• Individual tree crown classification with 5-NN classifier trained with pure species training data

• Non-forest mask generated with k-means clustering + cluster labeling


Predicted species-wise stem numbers vs. field plot data

[Figure panel titles: Nspruce [stems/ha], Npine [stems/ha], Ndecid [stems/ha]; vertical axis: Predicted [stems/ha]]

• Predicted stem number per species plotted against test data (178 test plots)

• Systematic under-estimation of the stem number for the spruce and deciduous classes

• Noise partly due to the small collection radius (r = 8 m) of the test data, and to geolocation differences between satellite and ground data

[Figure: three scatter plots of predicted vs. true stem numbers per field plot, with fit labels as extracted from the slide]

• Spruce: true number of spruces/field plot vs. predicted number of spruces/field plot; labels y=0.98*x + 137.1, R2 = 0.24 and y=0.33*x + 239.8
• Broadleaved: true number of broadleaved/field plot vs. predicted number of broadleaved/field plot; y=0.56*x + 21.0, R2 = 0.54
• Pine: true number of pines/field plot vs. predicted number of pines/field plot; y=0.85*x + 45.0, R2 = 0.34
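Fit lines and R2 values of the kind shown in these plots can be computed with an ordinary least-squares fit. The slides do not state the fitting method, so this is a sketch under that assumption, with a hypothetical function name and made-up data.

```python
from statistics import mean

def fit_line(true_counts, predicted_counts):
    """Least-squares line predicted = slope * true + intercept, plus R2."""
    mx, my = mean(true_counts), mean(predicted_counts)
    sxx = sum((x - mx) ** 2 for x in true_counts)
    sxy = sum((x - mx) * (y - my) for x, y in zip(true_counts, predicted_counts))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(true_counts, predicted_counts))
    ss_tot = sum((y - my) ** 2 for y in predicted_counts)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Made-up stem numbers lying exactly on a line, for illustration only.
true_n = [0.0, 500.0, 1000.0, 1500.0]
pred_n = [45.0 + 0.85 * x for x in true_n]
slope, intercept, r_squared = fit_line(true_n, pred_n)
```

A slope below 1 combined with a small intercept means that predictions fall increasingly short of the field counts as stem density grows.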


Conclusions

• The methodology could detect individual treetops, identify their species and determine species proportions in mixed forest.

• Feature ranking and feature selection were performed on a set of 35 features for tree species classification.

• Several classifiers (model including a feature subset and a classification method) were built. The best turned out to be 5-NN with a subset of 6 features, mostly spectral. Segment-wise features could be discarded.

• The tree species proportion accuracy was good (1.4% to 3.5%), but the correlation of stem numbers per species was not as good as expected.

Future work

• Model selection with more elaborate classifiers (e.g. SVMs)
• Embedding feature selection into a cross-validation scheme
• Improve stem number estimation with adaptive filtering
• Tree crown width estimation validation with ground data


[email protected]@vtt.fi

Thank you

