
PLANT LEAF RECOGNITION (cs229.stanford.edu/proj2016/poster/LiuHuang-PlantLeaf... · 2017-09-23)


PLANT LEAF RECOGNITION
{Albert Liu and Yangming Huang}@stanford.edu

PROBLEM
Fine-grained leaf recognition has important applications in weed identification, species discovery, plant taxonomy, etc. However, the subtle differences between species and the sheer number of categories make it hard to solve.

Example species: Prunus laurocerasus, Laurus nobilis, Magnolia grandiflora

METHOD

1. Preprocessing
Apply CLAHE to reduce lighting-condition variation and resize to fit the next layer. Selectively use K-means to remove the background heuristically. For the challenging dataset, find the convex hull containing the largest N contours and then use GrabCut to segment the leaf out.
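The K-means background-removal step can be sketched as follows. This is a minimal NumPy-only illustration with a synthetic image; the actual pipeline would use real photos and OpenCV's CLAHE and GrabCut, which are not reproduced here, and the "greenest cluster is the leaf" rule is one possible heuristic, not necessarily the one used in the poster.

```python
import numpy as np

def kmeans_background_mask(img, iters=10, seed=0):
    """Cluster pixel colors into 2 groups and return a boolean mask for the
    cluster heuristically assumed to be the leaf (the greener one)."""
    h, w, c = img.shape
    pixels = img.reshape(-1, c).astype(float)
    rng = np.random.default_rng(seed)
    # Initialize two centroids from randomly chosen pixels.
    centers = pixels[rng.choice(len(pixels), size=2, replace=False)].copy()
    for _ in range(iters):
        # Assign each pixel to its nearest centroid.
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centroids from the assigned pixels.
        for k in range(2):
            if np.any(labels == k):
                centers[k] = pixels[labels == k].mean(axis=0)
    # Heuristic: the cluster whose color is "greenest" (green channel high
    # relative to red and blue) is taken to be the leaf.
    leaf = int(np.argmax(centers[:, 1] - 0.5 * (centers[:, 0] + centers[:, 2])))
    return (labels == leaf).reshape(h, w)

# Synthetic test image: a green square ("leaf") on a white background.
img = np.full((32, 32, 3), 255, dtype=np.uint8)
img[8:24, 8:24] = (30, 160, 40)
mask = kmeans_background_mask(img)
```

Thresholding `mask` then feeds directly into contour finding for the convex-hull/GrabCut step on harder images.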

2. Feature extraction

• ConvNets
Transfer learning approach. Specifically, we take a couple of ConvNets pretrained on ImageNet for the ILSVRC object classification task, remove the top layers, and use them as generic feature extractors.

• Traditional SIFT + BoF
Key points are densely sampled. The size of the codebook (K) is fixed at 1000/3000.

3. Classification
SVM (linear/RBF), Softmax, MLP, etc.
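The SIFT + BoF branch above maps each image's variable-size set of local descriptors to a fixed-length histogram over a learned codebook, which the classifier then consumes. A minimal NumPy sketch, with random 128-D vectors standing in for dense SIFT descriptors and a tiny codebook (K = 8) instead of the poster's 1000/3000:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for dense SIFT: each image yields many 128-D descriptors.
# (Hypothetical data; a real pipeline would compute SIFT, e.g. with OpenCV.)
train_descriptors = rng.normal(size=(500, 128))

# Build the codebook: K visual words via a few K-means iterations.
K = 8
codebook = train_descriptors[rng.choice(len(train_descriptors), K, replace=False)].copy()
for _ in range(10):
    d = np.linalg.norm(train_descriptors[:, None] - codebook[None], axis=2)
    assign = d.argmin(axis=1)
    for k in range(K):
        if np.any(assign == k):
            codebook[k] = train_descriptors[assign == k].mean(axis=0)

def bof_histogram(descriptors, codebook):
    """Quantize each descriptor to its nearest visual word and return an
    L1-normalized word-count histogram (the fixed-length image feature)."""
    d = np.linalg.norm(descriptors[:, None] - codebook[None], axis=2)
    words = d.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# One image's feature vector: always length K, regardless of descriptor count.
feat = bof_histogram(rng.normal(size=(300, 128)), codebook)
```

These histograms (or the CNN codes from the other branch) are what the SVM/Softmax/MLP classifiers are trained on.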

EXPERIMENTAL RESULTS

TRANSFER LEARNING - CONVNET
The pre-trained weights of VGG16, VGG19 and ResNet50 are available from the open-source Keras framework. After comparing preliminary results, we chose ResNet50, since it gives better results and less overfitting. We believe this can be attributed to the fact that ResNet50 is deeper and generates a lower-dimensional feature vector, which is likely due to its more aggressive average pooling with a pool size of 7x7.

ResNet is famous for its depth: in our case 50 layers, with 49 conv layers and one FC layer on top. Except for the first conv layer, the remaining 48 compose 16 “residual” blocks in 4 stages. The blocks within each stage have a similar architecture, i.e. the same input and output shape.

The output of stage 5 gives 2048-D features. Every cell in the grid shown above is an 8x8 filter map visualized as a heatmap, which becomes a scalar after average pooling.
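The "scalar after average pooling" point can be made concrete: global average pooling collapses each spatial activation map to its mean, so one grid of activations per channel becomes one number per channel. A toy NumPy illustration (random values standing in for real ResNet50 stage-5 activations; the 8x8 grid and 2048 channels mirror the figures quoted above):

```python
import numpy as np

# Toy final-stage activation volume: 8x8 spatial grid, 2048 channels.
rng = np.random.default_rng(0)
fmap = rng.normal(size=(8, 8, 2048))

# Global average pooling: mean over the two spatial axes only,
# leaving one scalar per channel -> a 2048-D image feature vector.
features = fmap.mean(axis=(0, 1))
```

This is why the pooled ResNet50 feature is much lower-dimensional than a flattened VGG feature map, which the poster credits for the reduced overfitting.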

DATASET
1. Swedish/Flavia Leaf Dataset: clean images taken in controlled conditions. Use 20 samples per species for training.

2. ImageCLEF Plant: crowd-sourced and noisy. Considerable variations in lighting conditions, viewpoints, background clutter and occlusions. Choose species with at least 20 training samples.

DISCUSSIONS
1. As expected, CNN codes off the shelf yield similar or better accuracy compared to SIFT+BoF. In particular, the traditional method suffers on noisy datasets.

2. Error analysis shows that reducing noise/variation in the data helps greatly.

3. Looking at the confusion matrix, we believe the main causes of misclassification are:

– Very fine differences between species, which are hard even for human experts;

– Noisy and possibly non-representative training data, which lead to overfitting;

– The ConvNet models being pre-trained for a different task; we speculate that these features may not always generalize well.

FUTURE WORK
1. Acquire more data and fine-tune the ConvNet to address the overfitting problem.

2. Apply advanced image augmentation techniques.

3. Explore state-of-the-art methods to detect and localize the leaf in the ImageCLEF natural-leaf dataset.
