
APPLICATION OF DEEP LEARNING MACHINE VISION FOR DIAGNOSIS OF PLANT

DISORDERS AND PREDICTION OF SOIL PHYSICAL AND CHEMICAL PROPERTIES

By

PERSEVERANÇA DA DELFINA KHOSSA MUNGOFA

A THESIS PRESENTED TO THE GRADUATE SCHOOL

OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT

OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF SCIENCE

UNIVERSITY OF FLORIDA

2020

© 2020 Perseverança da Delfina Khossa Mungofa

To my grandmother Mazarare Catandica, the foundation of education in my family, who gave my father the opportunity for a formal education that she never had.

To my parents, Domingas Alberto Bequel Khossa and Alfredo Chaurombo Mungofa, for their dedication and unconditional support of my education and career goals.

To all my academic and life mentors for their influential guidance in the development of my career and life goals.


ACKNOWLEDGMENTS

I would like to extend a special thanks to my advisor and committee co-chair Dr. Arnold Schumann, as well as to committee members Dr. Rao Mylavarapu (co-chair) and Dr. Lauren Diepenbrock, for their mentorship during this study. Funding for this research and my graduate assistantship was provided by the United States Department of Agriculture (USDA) HLB Multi-Agency Coordination (MAC) System and the USDA National Institute of Food and Agriculture (NIFA)/Citrus Disease Research and Education Program. I would like to thank the UF-IFAS Extension Soil Testing Laboratory (ESTL) for providing the soil samples used in this research. I extend my gratitude to the experts and novices who participated in the citrus leaf disorders survey. I thank the support staff from the Soil and Water Sciences Department as well as the Citrus Research and Education Center. I would also like to thank Laura Waldo, Napoleon "Junior" Mariner, Timothy Ebert, Danny Holmes, Gary Test, Jamin Bergeron, Greg Means, and Rosemary Collins for their help in conducting my research. A special thanks to Domingas Mungofa, Samuel Kwakye, Eva Mulandesa, and Elizabeth Nderitu for their emotional support. Finally, I would like to express my infinite gratitude to my parents and family members for providing their unconditional support, motivation, and strength as I seek out the path to achieving my career goals and aspirations.


TABLE OF CONTENTS

page

ACKNOWLEDGMENTS ...............................................................................................................4

LIST OF TABLES ...........................................................................................................................8

LIST OF FIGURES .......................................................................................................................10

LIST OF OBJECTS .......................................................................................................................12

LIST OF ABBREVIATIONS ........................................................................................................13

ABSTRACT ...................................................................................................................................15

CHAPTER

1 INTRODUCTION AND LITERATURE REVIEW ..............................................................17

Introduction .............................................................................................................................17
Hypothesis and Research Objectives ......................................................................................19
    Hypotheses ......................................................................................................................19
    Research Objective ..........................................................................................................19
Literature Review ...................................................................................................................20
    Citrus Production .............................................................................................................20
        Citrus greening or Huanglongbing (HLB) disease ...................................................20
        HLB effects on citrus nutrition .................................................................................22
        Greasy spot ...............................................................................................................23
        Citrus canker ............................................................................................................23
        Phytophthora disease ................................................................................................24
        Citrus scab ................................................................................................................25
        Spider mite damage ..................................................................................................25
    Importance of Diagnosis of Soil Properties ....................................................................26
        Soil texture and bulk density ....................................................................................26
        Soil color ..................................................................................................................28
        Soil water potential and permanent wilting point ....................................................29
        Soil organic matter and soil organic carbon .............................................................30
    Deep Learning and Convolutional Neural Network (CNN) ............................................31
        Machine vision and deep convolutional neural networks ........................................32
        Scaling convolutional neural networks ....................................................................34
        The VGG-16 architecture .........................................................................................34
        The EfficientNet-B4 architecture .............................................................................35
        Optimizers ................................................................................................................36
        Transfer learning and fine-tuning .............................................................................36
    Machine Vision in Agriculture ........................................................................................37
        Machine vision for prediction of soil properties ......................................................37
        Machine vision for identification of plant disorders ................................................38


2 DETECTING NUTRIENT DEFICIENCIES, PEST AND DISEASE DISORDERS ON

CITRUS LEAVES USING DEEP LEARNING MACHINE VISION ..................................42

Introduction .............................................................................................................................42
Hypothesis ..............................................................................................................................45
Objectives ...............................................................................................................................45
Materials and Methods ...........................................................................................................46
    Experimental Design .......................................................................................................46
    Data Collection ................................................................................................................47
    Data Processing ...............................................................................................................48
        Data annotation and image cropping ........................................................................48
        Dataset for calibration - training and validation .......................................................49
        Dataset for testing - independent validation .............................................................49
    Data Analysis ...................................................................................................................50
        Training and validation for citrus leaf disorders classification models with
            pretrained networks ..............................................................................................50
        Training methodology ..............................................................................................51
        Evaluating model performance ................................................................................53
        Evaluating model performance on an external dataset .............................................54
        Developing and training image classification models for citrus leaf diagnosis .......55
        Developing and training new image classification models for citrus leaf
            diagnosis with an improved dataset ......................................................................56
    Statistical Analysis ..........................................................................................................59
Results and Discussion ...........................................................................................................59
    Training and Validation Results ......................................................................................60
        CLD-Model-1 ...........................................................................................................60
        CLD-Model-2 ...........................................................................................................61
        CLD-Model-3 ...........................................................................................................61
        CLD-Model-4 ...........................................................................................................62
        CLD-Model-5 ...........................................................................................................63
    Model Performance During Training and Validation .....................................................64
    Model Performance on the Validation Dataset ...............................................................66
    Model Performance on the Independent Validation ........................................................70
    Chemical Nutrient Analysis Results ................................................................................71
    Statistical Analysis Results Comparing Model Performance to Human Performance ...71
    Model Performance Compared to Human Expertise .......................................................73

3 EVALUATING THE POTENTIAL OF MACHINE VISION TO PREDICT SOIL

PHYSICAL AND CHEMICAL PROPERTIES FROM DIGITAL IMAGES .......................92

Introduction .............................................................................................................................92
Hypothesis ..............................................................................................................................95
Objective .................................................................................................................................95
Materials and Methods ...........................................................................................................96
    Data Collection ................................................................................................................96
        Soil photography and scanning ................................................................................96
        Permanent wilting point (PWP), the dew point ........................................................97
        Loss on ignition (LOI) to determine soil organic matter content .............................98
        Soil bulk density .......................................................................................................98
        Soil color with the Munsell soil color charts ............................................................99
        Soil spectra for CIE-L*a*b* color ...........................................................................99
        Sieving method for sand fractionation .....................................................................99
    Data Processing .............................................................................................................100
        Training dataset ......................................................................................................100
        Test dataset for independent validation ..................................................................100
    Data Analysis .................................................................................................................101
        Data management for linear regression ..................................................................102
        Data management for training and validation ........................................................103
        Training methodology ............................................................................................103
    Training CNN-based Linear Regression Models to Predict SOM, BD, PWP,
        L*a*b* Color ..........................................................................................................105
    Training the EfficientNet-B4 Model for Munsell Color Classification ........................106
    Training a Multiclass Image Classification Model for Sand Texture ...........................107
    Training a Binary Image Classification Model for Sand Texture .................................107
    Statistical Analysis to Evaluate Model Performance ....................................................108
    Evaluating Model Performance on the Independent Soil Dataset ................................110
Results and Discussion .........................................................................................................110
    Training and Validation of the CNN Linear Regression Models ..................................111
    Performance of the CNN Linear Regression Models on the Validation Dataset ..........111
    Training and Validation of the Multiclass Munsell Soil Color Classification ..............114
    Performance of Munsell Soil Color Classification Models on the Validation Dataset .115
    Performance of Munsell Soil Color Classification Models on the Independent
        Validation Dataset ..................................................................................................116
    Training and Validation of the Multiclass and Binary Classification Models for
        Textural Classes of Sandy Soils ..............................................................................117
    Performance of the Multiclass and Binary Image Classification Models for Textural
        Classes of Sandy Soils on the Validation Dataset ...................................................117
    Performance of the Multiclass and Binary Models for Textural Classes on the
        Independent Validation Dataset ..............................................................................118

4 SUMMARY OF RESULTS .................................................................................................135

LIST OF REFERENCES .............................................................................................................140

BIOGRAPHICAL SKETCH .......................................................................................................157


LIST OF TABLES

Table page

1-1 USDA soil separates for sandy soils ..................................................................................40

1-2 Network parameters of the VGG-16 and the EfficientNet-B4 models. .............................40

1-3 Coefficients for scaling network dimension ......................................................................40

2-1 Identified classes of leaf disorders and healthy leaves. .....................................................75

2-2 Sampling locations of the leaf disorders and respective cultivars .....................................76

2-3 Guidelines for interpretation of leaf analysis based on 4 to 6-month-old spring flush

leaves from non-fruiting twigs ...........................................................................................77

2-4 Hyperparameters used in training and validation of the five models. ...............................77

2-5 Summary of data used during calibration and independent validation .............................78

2-6 Classes with outliers removed after testing the training dataset with CLD-Model-1 ........78

2-7 Comparison of model performance on the validation dataset. ..........................................78

2-8 Comparison of model performance based on Precision (%) values obtained from the

validation dataset ...............................................................................................................78

2-9 Comparison of model performance based on Recall (%) values obtained from the

validation dataset. ..............................................................................................................79

2-10 Comparison of model performance based on F1 score (%) obtained from the

validation dataset ...............................................................................................................80

2-11 Summary of results based on the confusion matrix values ................................................81

2-12 Results of DRIS analysis on the independent validation dataset. ......................................81

2-13 Summary of model performance on the independent validation dataset. ..........................81

2-14 Summary of model performance on selected 20 leaves per class of the independent

validation dataset. ..............................................................................................................82

2-15 Summary of classification results from the three groups used for Chi-square test ...........82

2-16 Chi-square test results, with 95% confidence level ...........................................................82

3-1 Sample size for training and validation of each variable and the method .......................120


3-2 List of classes and respective sample size used to train the Munsell color image

classification model. ........................................................................................................120

3-3 List and number of classes used to train the multiclass and binary classification

models for sand texture classes. .......................................................................................120

3-4 Data transformation methods applied to train the linear regression model .....................120

3-5 Hyperparameters used in training and validation of the five models. .............................121

3-6 Summary of descriptive statistics of the continuous variables ........................................121

3-7 Munsell color notation and names of the training and validation dataset ........................121

3-8 Munsell color notation and names of the independent validation dataset .......................122

3-9 Number of samples used for training/validation (317 samples) and independent

validation (100 samples) of sand texture classes with binary and multiclass methods ...122

3-10 Training and validation results of the linear regression models ......................................122

3-11 Classification performance of Munsell soil color Model1. .............................................123

3-12 Classification performance of Munsell soil color Model2. .............................................123

3-13 Classification performance of Munsell soil color Model3 ..............................................123

3-14 Classification performance of Munsell soil color Model1 on the independent

validation dataset. ............................................................................................................123

3-15 Classification performance of Munsell soil color Model2 on the independent

validation dataset. ............................................................................................................123

3-16 Classification performance of Munsell soil color Model3 on the independent

validation dataset. ............................................................................................................124

3-17 Classification performance of the multiclass sand texture model in prediction of

coarse sand, sand, and fine sand textured soils of the validation dataset.........................124

3-18 Classification performance of the binary sand texture model in prediction of sand,

and fine sand textured soils of the validation dataset. .....................................................124

3-19 Classification performance of soil texture multiclass model on the independent

validation dataset. ............................................................................................................124

3-20 Classification performance of the binary classification model in classifying the soil

texture of the independent validation dataset. ..................................................................124


LIST OF FIGURES

Figure page

1-1 Deep neural network architecture ......................................................................................41

1-2 The procedure of data analysis of a CNN ..........................................................................41

1-3 Learning process of Transfer Learning ..............................................................................41

2-1 Citrus leaf disorders proposed for this study .....................................................................83

2-2 Sequence of training methodology implemented to develop the model using transfer

learning and fine-tuning .....................................................................................................83

2-3 Flow diagram of model development ................................................................................84

2-4 Model performance during training: transfer learning and fine tuning .............................85

2-5 CLD-Model-1 confusion matrix ........................................................................................86

2-6 CLD-Model-2 confusion matrix ........................................................................................86

2-7 CLD-Model-3 confusion matrix ........................................................................................87

2-8 CLD-Model-4 confusion matrix ........................................................................................87

2-9 CLD-Model-5 confusion matrix ........................................................................................88

2-10 Model performance on the independent validation dataset ...............................................88

2-11 Confusion matrix with classification results from the group of novice scouts ................90

2-12 Confusion matrix with classification results from the group of experienced

professionals. .....................................................................................................................91

3-1 Flow diagram of model development ..............................................................................125

3-2 Sequence of training methodology implemented to develop the model using transfer

learning and fine-tuning ...................................................................................................125

3-3 Sequence of training methodology implemented to train the linear regression models

with transfer learning and fine-tuning ..............................................................................126

3-4 Example of classes before and after removal of samples with different notations. .........126

3-5 Histograms with original distribution of soil variables....................................................127

3-6 Histogram of data distribution after data transformation .................................................127


3-7 Results of linear regression analysis performed on the validation subset .......................128

3-8 Training and validation of the soil color models .............................................................130

3-9 Confusion matrix of model performance in classifying soil color on the validation

dataset. .............................................................................................................................131

3-10 Confusion matrices of model performance on the independent validation. ....................132

3-11 Model progress in training and validation process of multiclass and binary

classification of sand classes ............................................................................................133

3-12 Confusion matrix showing model performance at predicting sand texture on the

validation dataset .............................................................................................................133

3-13 Model performance on the independent validation dataset .............................................134


LIST OF OBJECTS

Object page

2-1 DRIS analysis results of all leaf samples of nutrient deficiency used to train the citrus

leaf disorders identification models. ..................................................................................91


LIST OF ABBREVIATIONS

ACP Asian Citrus Psyllid

AI Artificial Intelligence

ANN Artificial Neural Networks

CEC Cation Exchange Capacity

CIE-L*a*b* Commission Internationale de l'Eclairage system of color classification: L*

lightness or darkness, a* hue on the green-red axis, and b* hue on the blue-yellow axis.

CLD Citrus Leaf Disorder Models

CNN Convolutional Neural Networks.

ConvNets Convolutional Neural Networks.

CS Coarse Sand

CUPS Citrus Under Protected Screen

DCNN Deep convolutional Neural Networks

DNNR Deep neural network regression

FS Fine Sand

HLB Huanglongbing disease. A Chinese word meaning yellow dragon disease.

Synonymous with Citrus Greening Disease.

ILSVRC ImageNet Large Scale Visual Recognition Competition

IPM Integrated Pest Management

LOI Weight Loss on Ignition

MS Medium Sand

OM Organic Matter

PD complex Phytophthora-Diaprepes complex

PSD Particle Size Distribution

PWP Soil water content at Permanent Wilting Point

R-CNN Region-based Convolutional Neural Networks


R-FCN Region-based Fully Convolutional Network

RMSE Root Mean Squared Error

SAR Systemic Acquired Resistance

SOC Soil Organic Carbon

SOM Soil Organic Matter

USDA United States Department of Agriculture

VCS Very Coarse Sand

VFS Very Fine Sand

VGGNet Visual Geometry Group Network.

WRC Soil Water Retention Curve


Abstract of Thesis Presented to the Graduate School

of the University of Florida in Partial Fulfillment of the

Requirements for the Degree of Master of Science

APPLICATION OF DEEP LEARNING MACHINE VISION FOR DIAGNOSIS OF PLANT

DISORDERS AND PREDICTION OF SOIL PHYSICAL AND CHEMICAL PROPERTIES

By

Perseverança da Delfina Khossa Mungofa

December 2020

Chair: Arnold Walter Schumann

Cochair: Rao Mylavarapu

Major: Soil and Water Sciences

Alternative methods are needed to supplement the laborious conventional analytical

methods employed for analysis of plant tissue and soil samples. In this study, deep convolutional

neural networks (CNN) were applied to develop models for rapid, accurate and non-destructive

analysis of plant tissue and soil samples from digital images. The pretrained models EfficientNet-B4 and VGG-16 were trained using 14,400 digital images of eleven citrus leaf nutrient deficiency, pest, and disease disorders frequently encountered in HLB-endemic Florida groves.

Results show excellent validation accuracy: 98% for the VGG-16 and 99% for the EfficientNet-

B4 models. Chi-square tests compared the models to experts and novices familiar with citrus on

an unknown dataset, with the models outperforming both groups (p<0.001). The EfficientNet-B4

was also trained to estimate soil physical and chemical properties, through linear regression,

multiclass classification, and binary classification. A total of 321 soil samples were analyzed for

six variables: SOM, PWP, BD, L*, a*, b* color, with CNN regression; and Munsell color and

soil texture with multiclass and binary classification. Five replicates of each sample were

photographed (1,605 images). The CNN regression models achieved R2 values ranging from 0.56 to 0.86, the Munsell color models had validation accuracies ranging from 82% to 100%, and the binary and multiclass sand texture models achieved 94% and 92% validation accuracy,

respectively. The results demonstrated that machine vision can be an effective approach to

predict physical and chemical properties of sandy soils and diagnose citrus leaf disorders and

could be especially useful when deployed with smartphone apps.


CHAPTER 1

INTRODUCTION AND LITERATURE REVIEW

Introduction

The technology advances in agriculture have been very noticeable in recent years. Most

of the advances in modern agriculture such as precision agriculture (PA) have benefitted from

the continuing development of applied technology to food production systems (Priya & Ramesh,

2020; Toriyama, 2020). The conventional analytical laboratory methods for soil and plant tissue

diagnosis are well known in producing quantitative accurate results to make decisions about soil

and nutrient management (Motsara & Roy, 2008). Although the method is reliable, the

procedures are time consuming, laborious, and sometimes costly, reducing the cost-effectiveness

in agricultural business. In past years, many methods have been proposed for rapid large-scale

and accurate assessment of soil and plant conditions. Different methods were successfully

applied for both soil and plant sciences. Multispectral and hyperspectral spectroscopy are applied

for soil studies and crop monitoring (Garza et al., 2020; Nocita et al., 2015; Xu et al., 2020),

laser induced breakdown and laser induced fluorescence spectroscopy are used to detect plant

disorders (Ranulfi et al., 2017; Saleem, Atta, Ali, & Bilal, 2020) while laser diffraction was

largely applied to define soil texture classes (Eshel, Levy, Mingelgrin, & Singer, 2004; Yang et

al., 2019), and artificial neural networks were used to predict soil physical and chemical

variables (Minasny et al., 2004; Moreira De Melo & Pedrollo, 2015; Saffari, Yasrebi, Sarikhani,

& Gazni, 2009). The methods mentioned above have a substantial advantage over the

conventional methods in terms of time-effectiveness, however, they still present high cost and

require expertise to operate the equipment and develop the predictive models (Pinheiro, Ceddia,

Clingensmith, Grunwald, & Vasques, 2017; Swetha et al., 2020). The recent advances in

machine vision have made it possible to develop accurate and inexpensive diagnostic tools to

18

predict soil and plant properties from digital images. Additionally, it can increase sampling

capacity and in-situ sample analysis with major reduction in time at nearly no cost. Deep

Convolutional Neural Networks (CNN or ConvNets) have shown exceptional performance in the

image classification and object detection tasks, making efficient use of computer resources

(Chunjing, Yueyao, Yaxuan, & Liu, 2017; Garcia-Garcia, Orts-Escolano, Oprea, Villena-

Martinez, & Garcia-Rodriguez, 2017; Lecun, Bengio, & Hinton, 2015). Several methods have

been implemented for image classification and object detection, using CNNs (Lecun et al., 2015;

Russakovsky et al., 2015). Fortunately, modern smartphone and computer technology are now in

the hands of most growers. With machine vision, it is possible to analyze a photograph of a test

leaf in the grove and provide an on-screen instant diagnosis of the nutrient deficiency, disease

symptom, or pest damage (AppAdvice LLC, 2020; Ramcharan et al., 2019). Machine vision can

provide an alternative method to predict soil properties from digital images, in real time, at low

cost (Swetha et al., 2020). Deep CNN was applied to predict soil texture from digital images

(Swetha et al., 2020). Other models combining soil spectroscopy with deep CNNs were

developed to predict soil properties, demonstrating the potential of AI in soil analysis (Padarian,

Minasny, & McBratney, 2019a, 2019b; Padarian, Minasny, & McBratney, 2019).

This research aimed to use deep learning machine vision as a tool for the diagnosis of

leaf nutrient deficiency and other biotic stresses, such as disease symptoms and pest damage. The

same approach was used to estimate soil physicochemical properties. Digital images were used to

train two pretrained deep CNN models for image classification, the VGG-16 and the

EfficientNet-B4. A study conducted by Mungofa, Schumann, and Waldo (2018) on the

application of deep learning machine vision for the identification of chemical crystals, showed

excellent performance of CNN models, with probability accuracies of 93.34% (GoogLeNet) and


99.41% (VGG-16). A similar approach was implemented in this study to predict soil properties

and identify leaf disorders with some modifications to adapt the method to the specific datasets.

The study was divided into two experiments: identification of leaf nutrient disorders using a

multiclass image classification method, and estimation of soil physical and chemical properties

using deep learning machine vision for simple linear regression, binary image classification, and

multiclass image classification. The CNN approach was compared to standard laboratory

methods for soil sample analysis and to conventional scouting for the identification of leaf disorders.

Hypothesis and Research Objectives

Hypotheses

• Deep learning machine vision-powered technologies can perform as well as expert scouts

and conventional field and analytical laboratory methods in the diagnosis of plant disorders

(nutrient deficiency symptoms, disease symptoms and pest damage) and estimation of

soil physical and chemical properties.

Research Objective

• To develop AI-based deep learning machine vision CNN models for the identification of

leaf disorders frequently found on tree canopies that are affected by HLB disease, as well

as to predict soil physical and chemical properties.

Specific objectives

• To develop fast and accurate diagnostic artificial intelligence models, using image

classification models VGG-16 and EfficientNet-B4, to identify key nutrient deficiencies of

citrus, disease symptoms and pest damage encountered when trees are impacted by HLB

disease

• To train deep CNN-based EfficientNet-B4 image classification network to predict physical

and chemical properties of Florida soils, using digital images of soil samples.

• To compare the deep CNN approach to analytical laboratory methods for soil sample

analysis and conventional scouting for the identification of plant disorders.


Literature Review

Citrus Production

Citrus production is among the most important agricultural activities in Florida and in the

United States of America. In the 2018-2019 season, Florida citrus provided 44 percent of the

total national utilized production of 7.94 million tons, up 31% from the 2017-2018

season. California was the leading producer with 51 percent, while Arizona and Texas together

accounted for the lowest share, 5 percent combined (Fried, 2020). Despite the devastating effect of

Huanglongbing disease (HLB), Florida citrus production increased from the previous season

2017-2018 by 8% (Fried, 2019). However, citrus production in Florida and in the country has

decreased for the past 10 years (Fried, 2019). For example, Florida citrus production has

decreased by about 50% in 2017-2018, compared to the 2015-2016 season (Fried, 2019). The National

Research Council (2010) indicated that the main challenges faced by the Florida citrus industry

include unfavorable weather and climate conditions, hurricanes, diseases, urbanization,

international competition, and shortage of water. The above-mentioned factors have resulted in

the reduction of area dedicated for production, leading to a decrease of citrus production and

reduction of fruit and juice quality.

Citrus greening or Huanglongbing (HLB) disease

Since HLB was discovered in Florida in 2005, the disease has become the main challenge

faced by the citrus industry in the state (National Research Council, 2010; Hall, Richardson,

Ammar, & Halbert, 2013). The disease was first found in China in the late 19th century and has

since been a major challenge for the citrus industry worldwide (Hall et al., 2013). In Florida, the

HLB vector ACP was first found in 1998 (Halbert & Núñez, 2004; Tsai, 2006) and HLB

disease was later found in 2005 (Halbert, Manjunath, Roka, & Brodie, 2008). Since then, many

citrus groves were devastated and abandoned. At present, no cure for the disease has been found


and no resistant citrus cultivars or species were identified (National Research Council, 2010;

Halbert et al., 2008; Hall et al., 2013).

Alternative solutions to mitigate the effect of the disease are being developed and

implemented by researchers and farmers. The most common mitigation methods include

prevention and control. Integrated Pest Management (IPM) is the primary strategy to reduce

vector incidence, combining chemical control with insecticide sprays along with biological

control using predators and parasites of the vector (Grafton-Cardwell et al., 2013; Stansly et al.,

2019; Grafton-Cardwell & Daugherty, 2018). IPM for control of ACP was successful in

controlling both the vector and the disease, using natural enemies combined with destruction of

HLB-infected trees (Aubert, 1978; Grafton-Cardwell et al., 2013; Rakhshani & Saeedifar, 2013;

Tsai, 2006). Vector exclusion from the crop system by producing citrus under protected

environment or citrus under protected screen (CUPS), is also a viable alternative to establish new

groves for fresh fruit production (Rolshausen, 2019).

Disease control with direct injection, foliar spray and root drench of antibiotics such as

tetracycline, ampicillin (Amp), penicillin (Pen), and sulfonamides has shown some positive results

in eliminating CLas (Shin et al., 2016; Zhang, Yang, & Powell, 2015). However, the approach is

not viable due to potential antibiotic residues in plants and adverse effects on human health and the

environment (Shin et al., 2016; Zhang et al., 2015). Important studies are being carried out in

plant breeding to develop citrus cultivars and rootstocks resistant to CLas (Grosser, Gmitter Jr, &

Gmitter, 2013). The thermotherapy approach has also been shown to yield positive results in

reducing the bacterial content in plants (Fan et al., 2016; Ghatrehsamani et al., 2019). However,

under field conditions heat distribution is not efficient, leaving some parts of the plant such as


roots untreated, remaining as a reservoir of bacteria for reinfection; it is also not a long-term

option because it does not prevent reinfection through feeding by the vector (Yang et al., 2016).

HLB effects on citrus nutrition

As a phloem-limited pathogen, CLas triggers disruption of the vascular system

obstructing the translocation stream (Bové, 2006). Plant nutrition is negatively affected because

the vascular system is blocked by massive accumulation of starch in the plastids as well as

necrotic phloem. Therefore, the transport of photosynthesis products to other plant tissue is

obstructed and plant growth is limited (Bové, 2006; Nwugo, Lin, Duan, & Civerolo, 2013). The

interaction between HLB and nutrient uptake by trees is inconsistent, resulting in different

nutrient concentrations in plant tissue, depending on nutrient mobility (Morgan, Rouse, & Ebel,

2016). Nutrient deficiency is more likely to occur in infected plants, due to a reduction in

nutrient and water uptake as plants experience decline in fibrous root density, reducing plant

growth and yield (Hamido, Morgan, & Kadyampakeni, 2017; Johnson & Graham, 2015;

Kadyampakeni, Morgan, Schumann, & Nkedi-Kizza, 2014). Positive results have been found

when implementing customized fertilization combined with vector control (Pustika et al., 2008;

Rouse, Irey, Gast, Boyd, & Willis, 2012; Shen et al., 2013; Stansly et al., 2014; Vashisth &

Grosser, 2018). Some studies have shown that HLB-affected trees can be responsive to foliar and

soil applied macro and micronutrients, such as magnesium (Mg), manganese (Mn), zinc (Zn),

and boron (B), which can reduce HLB visual symptoms (Morgan et al., 2016; Shen et al., 2013;

Zambon, Kadyampakeni, & Grosser, 2019). A citrus nutrient management guide is available for

Florida growers, which is a helpful tool in maintaining productivity in HLB-affected areas

(Morgan et al., 2016).


Greasy spot

Greasy spot is a fungal disease caused by Zasmidium citri-griseum (also called Mycosphaerella

citri Whiteside) that causes damage to leaves and fruits. Severe symptoms lead to premature leaf

drop, which decreases the tree’s photosynthetic capabilities, resulting in low yield (Dewdney,

2019; Timmer, Roberts, Chung, & Bhatia, 2008). Visual leaf symptoms start on the underside of

the leaf surface as a chlorotic mottle. After ascospores penetrate the leaf tissue, hyphal

growth generates yellow to brown spots visible on the underside of the leaf surface (Timmer et

al., 2008). In the later stages of the disease, brown to black spots are dominant and the symptoms

can be visible on the upper side of the leaf surface, with yellow, brown, and black spots. Leaf

drop is the last stage of infection and the litter is usually the source of inoculum. Warm and

humid weather conditions are favorable for infection and disease development (Dewdney, 2019).

Foliar application of fungicide and petroleum oils, with cultural control to reduce inoculum are

the main methods applied for disease control (Dewdney, 2019).

Citrus canker

Citrus canker is a bacterial disease caused by Xanthomonas citri subsp. citri that causes

lesions on fruits, leaves, and stems of most citrus cultivars. It causes significant economic losses,

especially under Florida weather conditions, which favor disease spread (Dewdney,

Johnson, & Graham, 2020). In early stages of disease infection, visual symptoms include leaf

spot with raised lesions that appear on both sides of leaf surfaces. In advanced stages of

infection, the symptoms are corky and raised lesions with hollow centers, surrounded by a

chlorotic halo; defoliation, twig dieback, and blemishes with corky appearance on the fruit

(Dewdney et al., 2020). New shoots and fruits in early stages of development are more

susceptible to infection during heavy rain storms and warm weather (Dewdney et al., 2020). The

presence of leafminer larvae feeding on leaves favors inoculum penetration and disease


development (Dewdney et al., 2020). Disease incidence can be reduced through IPM including

cultural control: using seedlings from canker-free nurseries, pruning and defoliation followed by

burning infected twigs; chemical control: applying copper-based bactericides; leafminer

management; development of resistant cultivars; and activation of systemic acquired resistance

(SAR) (Dewdney et al., 2020).

Phytophthora disease

Phytophthora is a group of diseases caused by soilborne oomycetes, Phytophthora

nicotianae or Phytophthora palmivora. Four citrus diseases are known to be caused by

Phytophthora spp.: foot rot (also known as trunk gummosis), root rot, crown rot, and brown rot of

fruits (Khanchouch, Pane, Chriki & Cacciola, 2017). Foot rot infection generates bark lesions

that start above the soil surface and can extend to the bud union. Root rot is the result of

fibrous root infection: the root cortex becomes soft and separates from the root, leaving only

the inner tissue of the fibrous root (Dewdney & Johnson, 2020). Visual symptoms of

phytophthora root rot include stunted canopy growth, branch dieback, and chlorotic leaf

veins; severe infections cause general leaf chlorosis and defoliation (Khanchouch et al.,

Disease infection is favored under high soil moisture conditions and warm temperatures.

The presence of Diaprepes abbreviatus (Diaprepes root weevil) and HLB infection also

contribute to high phytophthora infection, due to root damage. Management of Phytophthora-

Diaprepes complex (PD complex) and Phytophthora-HLB interaction are implemented to

prevent major crop losses (Dewdney & Johnson, 2020). IPM strategies include chemical control

using fungicides; cultural control, controlling moisture conditions and applying the right

irrigation methods (e.g., drip irrigation) and timing, use of pathogen-free seedlings and tolerant

rootstock; and use of natural enemies for biological control of Diaprepes root weevil (Dewdney

& Johnson, 2020).


Citrus scab

Citrus scab is a fungal disease caused by Elsinoë fawcettii that damages leaves and

fruits, where the most important damage is seen. It does not cause major economic

losses; however, severe early infections on ‘Temple’ orange reduce fruit size. The visual

symptoms are localized scab pustules on leaves and fruits where the spores are produced

(Dewdney, 2020). Spores are usually transported by water splash and infect healthy tissues,

mostly young leaves and fruits. In groves and trees, the disease remains localized and does not spread

throughout the area; its spread is limited to the reach of water splashes. Disease control methods

include cultural practices (using disease-free seedlings from nurseries, avoiding overhead irrigation,

and pruning heavily infected branches) and chemical control with fungicides (Dewdney, 2020).

Spider mite damage

There are four important species of spider mite affecting citrus in Florida: the Texas citrus

mite Eutetranychus banksi (McGregor) (Childers, 2006), the citrus red mite Panonychus citri

(McGregor) (McMurtry, 1989), the six-spotted mite Eotetranychus sexmaculatus (Riley) (Childers

& Fasulo, 2005), and the two-spotted spider mite Tetranychus urticae Koch (Fasulo & Denmark,

2012). The most abundant species in Florida is the Texas citrus mite followed by the citrus red

mite. The two species colonize mature flush on the adaxial side of the leaf surface, along

the midvein, and migrate to the margins of the leaf and to fruits as the colony population increases

(Qureshi, Stelinski, Martini, & Diepenbrock, 2020). The six-spotted and two-spotted spider mite

feed on the abaxial side of the leaf surface, primarily along the petiole, the midvein and the

larger veins. The colonies generate yellow blistered areas and bright yellow spots on the adaxial side

of the leaf (Childers & Fasulo, 2005; Qureshi et al., 2020). Leaf damage includes graying and

yellowing of the leaves, resulting from the collapse of the mesophyll tissue. Advanced levels of

leaf damage cause necrosis and defoliation (Fasulo & Denmark, 2012). At higher population


densities, chemical control of adults is done using miticides, and petroleum oil is used against

spider mite eggs (Qureshi et al., 2020).

Importance of Diagnosis of Soil Properties

The diagnosis of soil properties provides a baseline to develop guidelines used to support

decision-making processes (McLaughlin, Reuter, & Rayment, 1999). Soil physical and chemical

properties, such as soil texture, soil hydraulic properties, and organic matter (OM) content

play important roles in nutrient and water retention and availability, as well as in soil biological

properties (Hillel, 1998; McLaughlin et al., 1999; Binkley & Fisher, 2012). Therefore,

determining soil physicochemical properties is important to understand the processes and

reactions occurring in the soil, involving the chemical and biological components.

Soil texture and bulk density

Soil texture is the proportion of sand, silt, and clay particles, which comprise particles of

less than 2 mm in diameter. Based on the U.S. Department of Agriculture (USDA), sand

particles are soil particles with diameter size between 2 mm and 50 μm, silt particle size ranges

from 50 μm to 2 μm, and the clay fraction, also defined as the colloidal fraction of the soil, consists of

particles less than 2 μm in size (Gee & Bauder, 1986). Soil texture is directly related to soil

porosity, water holding capacity, water potential, soil structure (aggregate stability and size),

organic matter (OM) content, and nutrient content and availability, represented by the soil cation

exchange capacity (CEC) and thermal regime (Hillel, 1998; Binkley & Fisher, 2012). Soil bulk

density, defined as the mass of dry soil per bulk volume of soil, is also highly influenced by soil texture

(Blake & Hartge, 1986). These properties significantly affect plant growth and yield, as they

impact the rhizosphere and the roots’ ability to take up water and nutrients (Arvidsson, 1998). The

particle size distribution (PSD) can be estimated using field and laboratory methods. The

laboratory methods include sedimentation (pipet and hydrometer) and sieving methods (Gee &


Bauder, 1986; Hillel, 1998). The sedimentation method is based on the relationship between

particle diameter, settling velocity, and gravity in a fluid of specific viscosity and density (Hillel, 1998;

Gee & Bauder, 1986). The sieving method consists of quantifying the content of particles of

specific sizes in the range of 2000 μm to 2 μm that pass through a sieve of specific mesh size

(Gee & Bauder, 1986; Soil Survey Staff, 2014). Common methods used to measure soil bulk

density include the core method (the most widely used), the clod method, and the excavation method (Blake & Hartge,

1986).
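The sedimentation principle can be illustrated with Stokes' law, which relates a particle's terminal settling velocity to its diameter. The sketch below is illustrative only; the particle density, water density, and viscosity values are typical textbook assumptions, not measurements from this study.

```python
# Illustrative sketch of the sedimentation principle (Stokes' law), assuming
# typical values: particle density 2650 kg/m^3 and water at ~20 C. This is not
# the laboratory pipet/hydrometer procedure itself.

G = 9.81                # gravitational acceleration, m/s^2
RHO_PARTICLE = 2650.0   # typical soil particle density, kg/m^3
RHO_FLUID = 998.0       # density of water at ~20 C, kg/m^3
VISCOSITY = 0.001       # dynamic viscosity of water, Pa*s

def settling_velocity(diameter_m):
    """Terminal settling velocity (m/s) of a spherical particle per Stokes' law."""
    return (RHO_PARTICLE - RHO_FLUID) * G * diameter_m ** 2 / (18 * VISCOSITY)

def settling_time(diameter_m, depth_m=0.1):
    """Time (s) for a particle to settle through depth_m of still water."""
    return depth_m / settling_velocity(diameter_m)

# Velocity scales with the square of diameter, so particles at the silt/clay
# boundary (2 um) settle about 625 times more slowly than particles at the
# sand/silt boundary (50 um).
v_sand_silt = settling_velocity(50e-6)
v_silt_clay = settling_velocity(2e-6)
```

Because velocity grows with the square of diameter, timed sampling at a fixed depth separates the size fractions, which is the basis of the pipet and hydrometer methods cited above.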

In Florida, most of the area is occupied by coarse-textured soils, with seven of the soil

orders represented: Spodosols, Entisols, Ultisols, Alfisols, Histosols, Mollisols, and Inceptisols

(Mylavarapu, Harris, & Hochmuth, 2016). For proper nutrient and water management,

fractionation of sand classes is necessary. Sands are defined as soil material that contain more

than 85% sand, where the percentage of silt plus 1.5 times the percent of clay is less than 15%

(Soil Science Division Staff, 2017). There are five sand separates: very coarse sand

(VCS), coarse sand (CS), medium sand (MS), fine sand (FS) and very fine sand (VFS). The

ranges of values corresponding to each sand class are listed in Table 1-1. Based on the

values presented in Table 1-1, four subclasses of sand are defined (Soil Science Division Staff,

2017).

• Coarse sand – soil material with 25% or more very coarse sand and coarse sand and less than

50% of any other single grade of sand.

• Sand – soil material containing 25% or more of very coarse, coarse, and medium sand, less than 25%

very coarse and coarse sand, and less than 50% fine sand and less than 50% very fine sand;

Or soil material with 25% or more very coarse and coarse sand and 50% or more of medium

sand.

• Fine sand – material containing 50% or more of fine sand, and the content of fine sand is

more than the content of very fine sand; Or soil material with less than 25% very coarse,

coarse, and medium sand and less than 50% very fine sand.

• Very fine sand – soil material with 50% or more very fine sand.
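The four subclass rules above amount to a small decision procedure. The function below is one illustrative reading of the Soil Science Division Staff (2017) criteria, taking the percentage of each sand separate as input; it is a sketch, not an official implementation.

```python
def sand_subclass(vcs, cs, ms, fs, vfs):
    """Classify sandy soil material into a subclass from the percentages of
    very coarse (vcs), coarse (cs), medium (ms), fine (fs), and very fine (vfs)
    sand. An illustrative reading of the Soil Science Division Staff (2017) rules."""
    other_grades = {"medium": ms, "fine": fs, "very fine": vfs}
    # Coarse sand: >= 25% very coarse plus coarse sand, and less than 50% of
    # any other single grade of sand.
    if vcs + cs >= 25 and all(p < 50 for p in other_grades.values()):
        return "coarse sand"
    # Sand: >= 25% very coarse + coarse + medium (but < 25% very coarse + coarse)
    # with < 50% fine and < 50% very fine sand; or >= 25% very coarse + coarse
    # with >= 50% medium sand.
    if (vcs + cs + ms >= 25 and vcs + cs < 25 and fs < 50 and vfs < 50) or \
       (vcs + cs >= 25 and ms >= 50):
        return "sand"
    # Fine sand: >= 50% fine sand with more fine than very fine sand; or < 25%
    # very coarse + coarse + medium and < 50% very fine sand.
    if (fs >= 50 and fs > vfs) or (vcs + cs + ms < 25 and vfs < 50):
        return "fine sand"
    # Very fine sand: >= 50% very fine sand.
    if vfs >= 50:
        return "very fine sand"
    return "unclassified"
```

For example, a material with 30% very coarse, 10% coarse, 20% medium, 25% fine, and 15% very fine sand would fall under coarse sand by the first rule.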


Soil color

Soil color is the primary visual physical property used for soil characterization in-situ or

in the laboratory. It indicates specific soil chemical properties and processes, such as oxidation

status (mostly driven by Fe2+ and Fe3+), organic matter content, soil aeration, and moisture

content (Soil Survey Staff, 2014). Soil color is an important property to understand pedogenic

processes in soils (Owens & Rutledge, 2005). The Munsell Soil Color Chart is a convenient

method to measure soil color by visually matching the soil color with the color charts (Munsell

Soil Color Charts, 1994). The color chips combine three dimensions: Hue, Value, and

Chroma. The Hue denotes the relation of color with Red, Yellow, Green, Blue, and Purple. The

Value is related to lightness or darkness and the Chroma indicates the strength (intensity) of the

hue (Munsell Soil Color Charts, 1994). Another system used in soil colorimetric measurement is

the Commission Internationale d'Eclairage, in English the “International Commission on

Illumination” (CIE-L*a*b*) system (Blum, 1997). The CIE-L*a*b* system uses three

coordinates: L* for value (lightness or darkness), a* for hue on the red-green axis, and b* for hue on the

yellow-blue axis (Blum, 1997). The drawback of colorimetric methods in soil classification is the

subjective perception of color from the individuals performing the measurement.
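CIE-L*a*b* coordinates can also be computed from ordinary digital-camera RGB values through the intermediate CIE XYZ color space. The sketch below applies the standard sRGB (D65) conversion and is offered only to make the L*, a*, and b* coordinates concrete; it is not the measurement protocol of any study cited here.

```python
def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB values to CIE-L*a*b* (D65 white point) using the
    standard sRGB -> XYZ -> L*a*b* pipeline. Illustrative only."""
    # 1) Undo the sRGB gamma to get linear RGB in [0, 1].
    def linearize(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # 2) Linear RGB -> CIE XYZ (sRGB matrix, D65 illuminant).
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl
    # 3) XYZ -> L*a*b*, normalized by the D65 white point.
    xn, yn, zn = 0.95047, 1.0, 1.08883
    def f(t):
        return t ** (1 / 3) if t > (6 / 29) ** 3 else t / (3 * (6 / 29) ** 2) + 4 / 29
    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    L = 116 * fy - 16         # lightness (0 = black, 100 = white)
    a = 500 * (fx - fy)       # red-green axis
    b_star = 200 * (fy - fz)  # yellow-blue axis
    return L, a, b_star
```

Pure white (255, 255, 255) maps to approximately L* = 100 with a* and b* near zero, and pure black maps to L* = 0, matching the axis definitions given above.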

The spectrophotometric method consists of the use of a spectrophotometer, which

collects soil spectral data from the visible range (400-700 nm). The equipment is coupled with a

standard light source that eliminates the limitation of color differences influenced by variation of

light intensity and angle of measurement (Barrett, 2002; Blum, 1997). Shields, Paul, Arnaud, and

Head (1968) reported the use of the spectrophotometric method to analyze the relationship of

soil color with moisture and organic matter content. Comparisons of the spectrophotometric

method with the Munsell and CIE-L*a*b* systems show good agreement, making it easy to

convert the spectrophotometer measurements to both methods (Barrett, 2002; Islam, Mcbratney,


& Singh, 2006). Kirillova and Sileva (2017) proposed the use of digital cameras in colorimetric

analysis of soil samples, finding high correlation with the spectrophotometric and CIE-L*a*b*

colorimetric system. Fan et al. (2017) obtained similar results using digital images compared to

Munsell color charts.

Soil water potential and permanent wilting point

Soil water potential is defined as the sum of different potential energies per unit of mass,

volume or weight of water, representing the water content in relation to the soil water energy

status (Campbell, 1988; Cassel & Klute, 1986). There are four potential energy components that

govern the movement and retention of water in soils: matric potential, osmotic potential, pressure

potential, and gravitational potential (Campbell, 1988). The water potential explains the soil’s

ability to retain water, defined as water-retention capacity (Klute & Dirksen, 1986). The soil

water retention curve (SWRC) is a very important parameter to study available water at different

soil water potentials as well as understand soil hydrological properties, such as infiltration,

evaporation, and available water for root uptake (Kirste, Iden, & Durner, 2019). An essential

component in irrigated systems is plant-available water, which is the difference between soil

water content at field capacity, assumed to be -33 kPa, and soil water content at the permanent

wilting point (PWP), at -1500 kPa soil matric potential (Cassel & Nielsen, 1986; Hillel, 1998).
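Plant-available water as defined above reduces to a simple difference between two measured volumetric water contents; the values in the example below are hypothetical.

```python
def plant_available_water(theta_fc, theta_pwp):
    """Plant-available water (m^3/m^3) as the difference between volumetric
    water content at field capacity (-33 kPa) and at the permanent wilting
    point (-1500 kPa)."""
    if theta_pwp > theta_fc:
        raise ValueError("water content at PWP cannot exceed field capacity")
    return theta_fc - theta_pwp

# Hypothetical sandy soil: 12% water at field capacity, 4% at PWP.
paw = plant_available_water(0.12, 0.04)   # 0.08 m^3/m^3
# Expressed as a depth of water per meter of soil profile:
paw_mm_per_m = paw * 1000                 # 80 mm per m of soil
```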

Tensiometers and the dew-point methods are widely used to measure water potential at

field capacity and PWP, respectively (Campbell et al., 2007; Cassel & Klute, 1986; Kirste et al.,

2019; Rawlings & Campbell, 1986). The dew point method with the WP4 instrument (Decagon

Devices, Inc., Pullman WA 99163) is a fast and precise method to measure soil water potential at

PWP, which ranges from 0 to -300 MPa, applying the chilled-mirror dew point technique

(Decagon Devices, 2007; Campbell et al., 2007). The instrument measures the dew point

temperature of the vapor pressure of air in equilibrium with a soil sample in a sealed chamber to


determine its total suction or water potential (Campbell et al., 2007). The WP4T equipment

includes user-selectable temperature control and internal thermoelectric components to avoid

measurement error caused by variation in room temperature (Decagon Devices, 2007). To obtain

a full range of SWRC, the chilled-mirror method can be used in addition to the HYPROP

evaporation method to measure SWRC and the PWP (Kirste et al., 2019; Maček, Smolar, &

Petkovšek, 2013). This approach applies the chilled-mirror dew point technique after the HYPROP

evaporation method, coupled with tensiometers to measure the water potential at the wet end of a

soil (Kirste et al., 2019; Maček et al., 2013).

Soil organic matter and soil organic carbon

Soil carbon is composed of organic and inorganic fractions. The inorganic fraction is

found in carbonate minerals whereas the organic fraction is found in soil organic matter (Nelson

& Sommers, 1996). Soil organic matter (SOM) is the organic fraction of the soil, which includes

fresh and all stages of decomposition of plant, animal and microbial residues, and the resistant

soil humus (Nelson & Sommers, 1996). Soil organic carbon (SOC) is the main component of

SOM and is highly correlated with soil health, quality, and fertility, influencing nutrient cycling

and availability (Kimble et al., 2001; Kutsch et al., 2009; FAO, 2017). Direct measurement of

SOM is conducted by oxidizing or volatilizing the OM content in a soil sample. The oxidation

method using hydrogen peroxide (H2O2) quantifies SOM through the weight loss after oxidation.

The volatilization through ignition of soil at high temperatures, between 350 and 950°C, is used to

quantify weight loss on ignition (WLOI or LOI) (Nelson & Sommers, 1996). The H2O2 method

is not satisfactory in estimating total OM, because of the incomplete oxidation, but it can

accurately estimate readily oxidized materials (Nelson & Sommers, 1996). The LOI method is

reported to overestimate the OM content, due to losses of structural water of phyllosilicates

(dehydroxylation) and of hydroxide minerals such as gibbsite, and the decomposition of hydrated salts and


carbonate minerals (Konare et al., 2010; Jensen, Christensen, Schjønning, Watts & Munkholm,

2018; Roper, Robarge, Osmond, & Heitman, 2019; Sun, Nelson, Chen, & Husch, 2009).

Temperatures between 400 and 450°C maximize removal of OM with minimal dehydroxylation

of clay minerals (Nelson & Sommers, 1996). An alternative is to remove hygroscopic water prior

to ignition, using temperatures between 105°C and 120°C (Konare et al., 2010; Sun et al., 2009).

SOM content can also be assessed using Munsell color charts and colorimetric methods with

color sensors (Abbott, 2012; Roper et al., 2019; Stiglitz et al., 2018). Spectroscopy methods are

also widely used to estimate OM content (Mohamed, Saleh, Belal, & Gad, 2018; Zhang, Lu,

Zhang, Nie, & Li, 2019).
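The loss-on-ignition calculation described above is a mass difference expressed as a percentage of the oven-dry sample mass; the masses in the example below are hypothetical.

```python
def som_loi_percent(mass_dry_g, mass_ignited_g):
    """Soil organic matter (%) by loss on ignition: the mass lost during
    ignition as a percentage of the oven-dry sample mass. Assumes hygroscopic
    water was removed beforehand (e.g., drying at 105 C), as discussed above."""
    if not 0 < mass_ignited_g <= mass_dry_g:
        raise ValueError("ignited mass must be positive and not exceed dry mass")
    return (mass_dry_g - mass_ignited_g) / mass_dry_g * 100.0

# Hypothetical sample: 20.00 g oven-dry, 19.52 g after ignition at 450 C.
som = som_loi_percent(20.00, 19.52)   # 2.4% SOM by LOI
```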

Deep Learning and Convolutional Neural Network (CNN)

A deep-learning architecture is a composition of stacked layers, where most of the

hidden layers are subject to learning, computing non-linear input–output maps (Lecun et al.,

2015). The deep architectures are able to identify similarities in objects of the same class,

ignoring irrelevant variations like background and lighting (Lecun et al., 2015). Figure 1-1

illustrates the general architecture and sequence of data analysis that is performed by the deep

learning artificial neural networks. A deep neural network architecture consists of an input layer,

followed by a sequence of hidden layers, with a non-linear activation function (ReLU) where the

learning process occurs. The classifier is the last layer inside the network, also called

classification head. The last layer generates the predictions from the model, using activation

functions, such as SoftMax, Sigmoid and Linear activation (Lecun et al., 2015; Li, Yang, Peng,

& Liu, 2020). The results are presented as predicted classes for an image classification model or

as predicted classes and object localizations for an object detection model, with respective

probability percentages, or as single and multiple predicted values in the linear and multiple regression

approaches, respectively. Convolutional Neural Networks (CNNs) or ConvNets are


feedforward neural networks designed to analyze 1D, 2D, and 3D data for signal, image, and video

processing, respectively. Equation 1-1 defines a ConvNet as:

N = ⨀_{i=1…s} Fi^Li ( X_{⟨Hi, Wi, Ci⟩} )                                  (1-1)

where N is the ConvNet, Fi^Li indicates that layer Fi is repeated Li times in stage i, s is the

number of stages, X is the input tensor, and ⟨Hi, Wi, Ci⟩ is the shape of the input tensor of layer i.

natural signals with local connectivity, where each neuron is connected to only a few neurons; shared

weights to reduce the number of parameters and improve computation; and pooling (down-

sampling), using local image correlation for dimensionality reduction (Lecun et al., 2015; Li et

al., 2020). Four components make up a CNN model (Figure 1-2): 1) convolution, for feature

extraction and generation of feature maps; 2) padding, to enlarge and adjust the input size; 3) stride, to

control the density of the convolution; and 4) pooling, including average pooling and max pooling, to

avoid overfitting (Li et al., 2020).
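The four components can be made concrete with a toy NumPy sketch: a single-channel convolution with zero padding and stride, a max-pooling step, and a softmax classification head. This is an illustration of the terms only, not a model used in this study.

```python
import numpy as np

def conv2d(image, kernel, stride=1, padding=1):
    """Single-channel 2D convolution (cross-correlation) with zero padding."""
    img = np.pad(image, padding)                  # 2) padding adjusts input size
    kh, kw = kernel.shape
    oh = (img.shape[0] - kh) // stride + 1        # 3) stride sets output density
    ow = (img.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):                           # 1) convolution -> feature map
        for j in range(ow):
            patch = img[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

def max_pool(feature_map, size=2):
    """4) Max pooling down-samples the feature map."""
    h, w = feature_map.shape
    fm = feature_map[:h - h % size, :w - w % size]
    return fm.reshape(h // size, size, w // size, size).max(axis=(1, 3))

def softmax(logits):
    """Classification-head activation: converts scores to class probabilities."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

image = np.random.rand(8, 8)                      # toy single-channel "image"
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])      # toy 2x2 filter
fmap = conv2d(image, kernel, stride=1, padding=1) # 9x9 feature map
pooled = max_pool(fmap)                           # 4x4 after 2x2 max pooling
probs = softmax(pooled.flatten()[:3])             # toy 3-class "head"
```

The softmax output always sums to one, which is why the classification head can be read directly as per-class probability percentages.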

Machine vision and deep convolutional neural networks

Since 2010, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) runs an

annual competition for large-scale image classification and object recognition using deep

learning algorithms (Russakovsky et al., 2015). In 2012, a new generation of machine vision

models was introduced with the development of the deep convolutional neural network (DCNN)

(Figure 1-1), the AlexNet model (Krizhevsky, Sutskever, & Hinton, 2012). The method was

introduced to improve performance in computer vision tasks (Krizhevsky et al., 2012; Lecun et

al., 2015; LeCun et al., 1989). The deep CNN models are able to train on large-scale datasets (e.g., the

ImageNet dataset, with more than one million images and 1000 classes) and learn complex

features, such as multiple objects in an image (Lecun et al., 2015; Russakovsky et al., 2015).


Several strategies are applied to improve deep CNN model’s accuracy and computation,

including scaling network depth, width, image resolutions, channel boosting, multi-path, feature-

map exploitation, and attention (Khan, Sohail, Zahoora, & Qureshi, 2020; Li et al., 2020). The

VGG-16 and VGG-19 were developed by scaling up network depth to improve model accuracy,

achieving state-of-the-art performance in object detection and image classification tasks,

with a 7.3% top-5 test error on the ImageNet dataset (Simonyan & Zisserman, 2015). The GoogLeNet

model was introduced to improve computation efficiency by including dimensional reduction,

the Inception layer (Szegedy et al., 2015). The Inception V2 and V3 by Szegedy, Vincent, and

Ioffe (2014), Inception V4 and Inception-ResNet by Szegedy, Ioffe, Vanhoucke, and Alemi

(2017) were proposed to reduce computation cost while maintaining high accuracy. The

ResNet18-152, with deeper networks, was proposed to improve performance in image

classification and object detection tasks (He, Zhang, Ren, & Sun, 2016). The DenseNet is a

network that improves computation by enabling direct connections between layers to improve

accuracy (Huang, Liu, Van Der Maaten, & Weinberger, 2017). NASNet, the Neural Architecture

Search (NAS) network, was developed to enable transferability of models adapted to variable

datasets, using NAS search tool to identify data-specific networks (Zoph, Vasudevan, Shlens, &

Le, 2018). Most recently, the EfficientNet series of models (B0-B7) broke records in

computer vision tasks, where the EfficientNet-B7 achieved 84.4% top-1 / 97.1% top-5 accuracy

on ImageNet (Tan & Le, 2019). Great progress was also observed in the field of object

recognition, with steadily improving accuracy in object detection tasks. MobileNet was developed by Howard et al. (2017) to improve efficiency for object detection in mobile applications. Other object detection models include the Single Shot MultiBox Detector (SSD) developed by Liu et al. (2016), YoloV3 developed by Redmon and Farhadi (2018), and the recently released YoloV4 by Bochkovskiy, Wang, and Liao (2020) and the EfficientDet series by Tan, Pang, and Le (2020), each improving on the performance of earlier models.

Scaling convolutional neural networks

Increasing network dimensions (depth, width, and image resolution) improves model performance, but each method has its limitations. Increasing network depth is the most common

method used to scale CNN models. With deeper networks, models can learn complex details in

images and can generalize well when trained for new tasks (He et al., 2016; Simonyan &

Zisserman, 2015). Increasing network width enables networks to learn fine-grained image characteristics (Lu, Pu, Wang, Hu, & Wang, 2017; Zagoruyko & Komodakis, 2016). Wide networks are easier to train than deeper networks; however, they tend to lose accuracy when training on complex datasets. Training models with high-resolution images (e.g., 224x224,

299x299, 331x331 pixels or higher) tends to improve accuracy by detecting fine-grained features

in images (He et al., 2016; Simonyan & Zisserman, 2015; Zoph et al., 2018). To better take

advantage of high-resolution images, scaling network depth and width is required to capture

complex features in images.

The VGG-16 architecture

The VGGNet models were developed by scaling up network depth to improve accuracy

in image classification and object detection tasks (Simonyan & Zisserman, 2015). The network

architecture was designed by increasing the depth of the network while keeping other parameters fixed. The number of convolutional layers was increased by applying small convolution filters

(3x3) to all layers. The VGG-16 model (Table 1-2) is composed of 13 convolutional layers, and

3 fully connected (FC) layers, for a total of 16 weight layers. The number of channels in the

network starts at 64 and increases by a factor of 2 after every max-pooling layer, up to 512 channels. Dropout is applied to the first two FC layers, and the last FC layer


corresponds to the number of classes. A softmax activation is applied to the last layer. The image input size for the VGG-16 is 224x224 pixels. The network contains a total of 138 million trainable parameters, which translates into a high computational cost (Simonyan & Zisserman, 2015).
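The 138-million figure can be checked from the layer configuration alone. The following sketch (an illustrative calculation added here, not part of the original study) tallies the trainable parameters of the VGG-16 layers described above:

```python
# Tally VGG-16 trainable parameters from its layer configuration.
# Conv params = (kh*kw*C_in + 1) * C_out; FC params = (n_in + 1) * n_out.

# (C_out, number of 3x3 conv layers) per block; channels double after pooling.
conv_blocks = [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]

params, c_in = 0, 3  # input is a 3-channel RGB image
for c_out, n_layers in conv_blocks:
    for _ in range(n_layers):
        params += (3 * 3 * c_in + 1) * c_out
        c_in = c_out

# After five 2x2 max-pools, the 224x224 input becomes 7x7 with 512 channels.
fc_sizes = [7 * 7 * 512, 4096, 4096, 1000]
for n_in, n_out in zip(fc_sizes, fc_sizes[1:]):
    params += (n_in + 1) * n_out

print(params)  # 138,357,544, i.e., about 138 million
```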

The EfficientNet-B4 architecture

The EfficientNet-B0-B7 series of models were designed to improve accuracy and

computation efficiency in image classification by applying a compound coefficient (Equation 1-

2) to balance the network’s dimensions depth (d), width (w), and image resolution (r), Figure 1-3

(Tan & Le, 2019). To develop the EfficientNet series of models, a multi-objective neural architecture search was used to generate an efficient baseline network (EfficientNet-B0) that jointly optimizes accuracy and FLOPS (floating-point operations per second), improving computational efficiency (Tan & Le, 2019).

depth: d = α^φ

width: w = β^φ

resolution: r = γ^φ

s.t. α · β² · γ² ≈ 2, with α ≥ 1, β ≥ 1, γ ≥ 1

(1-2)

where α, β, and γ are constants determined by a grid search, and φ is a user-defined coefficient that controls the resources available for model scaling. The compound coefficient was applied to the baseline

model to generate the series of EfficientNet-B1 to B7 networks, shown in Table 1-3. From

EfficientNet-B0 to B7, accuracy increased as different coefficients were applied to the network,

thus increasing network depth and width and using a greater image resolution. The EfficientNet models perform better than other models of similar specifications, using fewer parameters and requiring less computation (Tan & Le, 2019). The scaling coefficients for the EfficientNet-B4 are 1.4 for width and 1.8 for depth, with an input image resolution of 380x380 pixels (a scale coefficient of about 1.7, i.e., 380/224). The EfficientNet-B4 and the VGG-16 parameters are shown in Table 1-2.
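The compound scaling rule can be illustrated numerically. The sketch below uses the grid-search constants reported by Tan and Le (2019), α = 1.2, β = 1.1, γ = 1.15; note that the released B1-B7 coefficients listed in Table 1-3 are rounded, hand-tuned values, so they do not match these exact powers:

```python
# Compound scaling (Equation 1-2): for a chosen phi, scale depth, width,
# and resolution together. The constants are the grid-search values
# reported by Tan and Le (2019): alpha=1.2, beta=1.1, gamma=1.15.
alpha, beta, gamma = 1.2, 1.1, 1.15

# The constraint alpha * beta^2 * gamma^2 ~ 2 means that increasing phi
# by one roughly doubles the FLOPS of the network.
constraint = alpha * beta**2 * gamma**2
print(round(constraint, 3))  # 1.92, close to the target of 2

def scale(phi):
    """Return (depth, width, resolution) multipliers for coefficient phi."""
    return alpha**phi, beta**phi, gamma**phi

d, w, r = scale(2)  # e.g., phi = 2 scales all three dimensions at once
```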


Optimizers

For a successful implementation of supervised learning, it is necessary to find a function that maps inputs to predictions that closely approximate the observed target values or classes.

Optimization algorithms are applied during training to minimize the error (loss function)

between the target prediction and the predicted output (Sun, 2019). The optimizers should have

good convergence in training, have a fast convergence speed, generalize to other tasks, and

achieve good test accuracy (Sun, 2019). Common optimizers used for machine vision tasks

include stochastic gradient descent (SGD) with momentum or Nesterov accelerated gradient. The

adaptive gradient methods include Adagrad, RMSProp, and Adam. Adam (Adaptive Moment Estimation) adapts the learning rate for each parameter by computing exponentially decaying averages of past gradients and squared gradients, combining the RMSProp and momentum methods. AdaMax and Nadam are other momentum- and adaptive-learning-rate-based optimizers (Ruder, 2016; Sun,

2019).
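The Adam update rule described above can be written in a few lines. The sketch below is an illustrative toy example (the quadratic loss and the hyperparameter values are hypothetical), showing the exponentially decayed gradient averages and the bias correction:

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponentially decayed averages of the gradient (m)
    and the squared gradient (v), bias-corrected, then a scaled step."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)   # bias correction for the first moment
    v_hat = v / (1 - b2 ** t)   # bias correction for the second moment
    theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize the toy loss (theta - 3)^2, whose gradient is 2 * (theta - 3).
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 201):
    theta, m, v = adam_step(theta, 2 * (theta - 3), m, v, t)
print(round(theta, 2))  # approaches the minimum at 3
```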

Transfer learning and fine-tuning

Large labeled datasets are required to train deep networks, making data dependence one

of the major problems in deep learning (Tan et al., 2018). There is a linear relationship between

the size of the model and sample size required for training (Tan et al., 2018). The ImageNet

dataset (Russakovsky et al., 2015) is frequently used to train and test new deep learning

architectures for image classification, regression, and clustering. When limited high-quality

labeled datasets are available, training these models for new tasks is done through transfer

learning (Pan & Yang, 2010). Transfer learning (Figure 1-3) consists of using the knowledge

(weights) from models pretrained on a different task to train a new domain or a new task with a limited labeled dataset, or none at all (Pan & Yang, 2010). In transfer learning, the source and target data can differ and have different distributions, and the models are still able to perform well on


the new tasks (Pan & Yang, 2010; Tan et al., 2018). Using transfer learning reduces the time

required to generate and annotate datasets for every specific task. It also makes the models learn

new features faster and more efficiently, reducing training time, as the models do not have to

learn from scratch (Pan & Yang, 2010; Tan et al., 2018). Fine-tuning is implemented to adapt the knowledge learned on the source domain to the new task, and is used to develop task-specific models from pre-trained models. Depending on the target task and sample size, part of the network can be

frozen to avoid overfitting, fine-tuning only part of the network and the top fully connected (FC)

layers (Li & Hoiem, 2018).
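The transfer learning and fine-tuning procedure can be sketched abstractly as a two-phase schedule. The example below is a simplified illustration (the layer names are hypothetical; the learning rates mirror the 10x reduction used for the EfficientNet-B4 in Chapter 2), not actual training code:

```python
# A pretrained network represented as an ordered list of layer names
# (hypothetical names, for illustration only).
layers = ["conv1", "conv2", "conv3", "conv4", "conv5", "fc_head"]

def set_trainable(layers, n_top):
    """Freeze all but the top n_top layers; return a {layer: trainable} map."""
    cutoff = len(layers) - n_top
    return {name: i >= cutoff for i, name in enumerate(layers)}

# Phase 1 (transfer learning): only the new classification head is trained.
phase1 = set_trainable(layers, n_top=1)

# Phase 2 (fine-tuning): unfreeze the top layers and lower the learning
# rate by 10x, mirroring the schedule used in this study.
phase2 = set_trainable(layers, n_top=3)
lr_phase1, lr_phase2 = 0.005, 0.0005

print(sum(phase1.values()), sum(phase2.values()))  # 1 3
```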

Machine Vision in Agriculture

In agricultural sciences, the applications of machine vision have been on a variety of

topics, including identification of soil properties, soil and nutrient management, crop monitoring,

pest, disease and weed detection and control, weather and climate forecast, yield prediction, crop

quality assessment, species recognition, genetics and phenotyping, livestock production, animal

welfare, and agriculture robotics (Duckett et al., 2018; Liakos, Busato, Moshou, Pearson, &

Bochtis, 2018; Mochida et al., 2018). Notable applications of AI in agriculture are on automated

weed control (Dyrmann, Skovsen, Laursen, & Jørgensen, 2018; Kantipudi, Lai, Min, & Chiang,

2018) and yield prediction for automated harvesting of commercial crops (Schumann et al.,

2019; Solberg, 2017). Site specific applications of agrochemicals (fertilizer and fungicides) with

machine vision have positive effects on plant health and efficient use of agrochemical (Esau et

al., 2018).

Machine vision for prediction of soil properties

Machine vision is a relatively new field in soil sciences. Liu, Ji, and Buchroithner (2018) applied transfer learning to soil spectroscopy to predict soil clay content. Transfer

learning was used to calibrate the hyperspectral data collected in the laboratory for later


application on hyperspectral imagery. The model achieved good accuracy, with an R2 of 0.756 and a root-mean-square error (RMSE) of 7.07%. Padarian et al. (2019) developed a multi-task CNN model

for digital soil mapping using 3-D images of covariates and spatial information. Multi-task

learning and data augmentation were applied to train the model to simultaneously predict SOC at

different soil depths (Krizhevsky et al., 2012; Padarian et al., 2019; Ruder, 2017). The results

showed that the multi-task CNN model reduced the error by about 30% compared to Cubist

regression tree (Padarian et al., 2019). Deep CNNs have also been used to predict six soil

variables including SOM (g kg−1), CEC (cmol(+) kg−1), sand, and clay content, pH in water and

total nitrogen using vis-NIR spectroscopy (Padarian et al., 2019a). Multi-task and single-task

CNNs with three 3x3 convolutional layers were implemented. The results showed high

performance of both CNN models, with prediction error decreased by 62% and 87% (Padarian et

al., 2019a). Considering the high spatial variability of landscapes and its influences in soil

properties, Padarian et al. (2019b) investigated the use of transfer learning with models trained

on global data to predict soil properties at a local level. Soil properties included SOM (g kg−1),

CEC (cmol(+) kg−1), pH in water, and the fraction of clay. The results showed that with transfer

learning the model can generalize well on local data (Padarian et al., 2019b). Deep neural

network regression (DNNR) was implemented to predict soil moisture from meteorological data (Cai, Zheng, Zhang, Zhangzhong, & Xue, 2019). Seven variables were used for training: six meteorological features and one soil water content feature. The model accurately predicted soil

moisture, with R2 ranging from 0.96 to 0.98 and RMSE from 0.78 to 1.61% (Cai et al., 2019).
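The accuracy metrics quoted in this section, R2 and RMSE, can be computed as follows (a generic sketch with hypothetical values, not data from the cited studies):

```python
import math

def rmse(obs, pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def r_squared(obs, pred):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - sse / sst

# Hypothetical observed and predicted soil-moisture values (% VWC).
obs = [10.0, 12.5, 15.0, 20.0, 25.0]
pred = [10.5, 12.0, 15.5, 19.0, 25.5]
print(round(r_squared(obs, pred), 3), round(rmse(obs, pred), 3))  # 0.986 0.632
```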

Machine vision for identification of plant disorders

Sladojevic, Arsenovic, Anderla, Culibrk, and Stefanovic (2016) developed a system for

plant disease recognition, trained to identify thirteen classes of plant diseases using deep CNNs,

with precision values ranging from 91% to 98%. Ghazi, Yanikoglu, and Aptoula (2017)


compared the performance of three CNNs, GoogLeNet, AlexNet, and VGGNet in plant

identification, via optimization of transfer learning parameters and data augmentation. The

GoogLeNet model achieved validation accuracy of 80% after transfer learning and data

augmentation. Fuentes, Yoon, Kim, and Park (2017) developed a system for real time

identification of nine pests and diseases of the tomato crop, using the Deep Learning-Based

Detector (deep meta-architectures and feature extractors). Results from this study show that R-

CNN with VGG-16 and R-FCN with ResNet-50 obtained the best results with 83% and 85%

average precision, respectively. Ferentinos (2018) also developed a system for plant disease

detection and diagnosis, for twenty-five different plant species and 58 distinct classes of plant

diseases and healthy leaves, from a database comprised of 87,848 images. Five pretrained CNNs

were trained, AlexNet, AlexNetOWTBn, GoogLeNet, Overfeat, and VGG. All models achieved

high performance in the testing dataset, with accuracy values above 97% (Ferentinos, 2018). An

example of an AI-based smartphone application is the Pocket Agronomist app developed by

Agricultural Intelligence, LLC for iOS smartphones. The application was built on a CNN model

to identify plant diseases, nutrient deficiencies, weeds, and insect damage (AppAdvice LLC,

2020). Agrio is another example of a smartphone application for plant disease diagnosis based on

deep CNN, available for iOS and Android smartphones (NVIDIA Corporation, 2019). In the

citrus industry, Schumann, Waldo, Holmes, Test, and Ebert (2018) conducted a preliminary

study to determine the possibility of using deep CNN for nutrient diagnosis from visual foliage

symptoms of citrus, and the results were promising with potential application in mobile

smartphone apps.


Table 1-1. USDA soil separates for sandy soils (Soil Science Division Staff, 2017).

Name of the soil separate | Diameter limits (mm)
Very coarse sand | <2 to >1
Coarse sand | 1 to >0.5
Medium sand | 0.5 to >0.25
Fine sand | 0.25 to >0.10
Very fine sand | 0.10 to >0.05

Table 1-2. Network parameters of the VGG-16 and the EfficientNet-B4 models. Based on Equation 1-1, the rows represent stages i, L represents the number of layers, C the number of channels, and H x W the image resolution. The EfficientNet-B4 network architecture is deeper than the VGG-16, and it has a greater number of channels. The input image resolution is 224x224 pixels for the VGG-16 network and 380x380 pixels (RGB images) for the EfficientNet-B4.

Stage i | VGG-16: Operator F | L | C | H x W | EfficientNet-B4: Operator F | L | C | H x W
1 | k3x3 | 2 | 64 | 224x224 | Conv3x3 | 1 | 45 | 380x380
2 | k3x3 | 2 | 128 | 112x112 | MBConv1, k3x3 | 2 | 22 | 190x190
3 | k3x3 | 3 | 256 | 56x56 | MBConv6, k3x3 | 4 | 34 | 190x190
4 | k3x3 | 3 | 512 | 28x28 | MBConv6, k5x5 | 4 | 56 | 95x95
5 | k3x3 | 3 | 512 | 14x14 | MBConv6, k3x3 | 5 | 112 | 95x95
6 | Maxpool | - | 512 | 7x7 | MBConv6, k5x5 | 5 | 157 | 48x48
7 | FC-4096 | - | - | - | MBConv6, k5x5 | 7 | 269 | 48x48
8 | FC-4096 | - | - | - | MBConv6, k3x3 | 2 | 448 | 24x24
9 | FC-1000, soft-max | - | - | - | Conv1x1 & Pooling & FC | 1 | 1792 | 12x12

Total: 138 million trainable parameters (VGG-16); 19 million trainable parameters (EfficientNet-B4).

Table 1-3. Coefficients for scaling network dimensions. Equation 1-2 was applied to width, depth, and image resolution.

Networks | Width coefficient | Depth coefficient | Scale coefficient
EfficientNet-B0 | 1 | 1 | 1
EfficientNet-B1 | 1 | 1.1 | 240/224
EfficientNet-B2 | 1.1 | 1.2 | 260/224
EfficientNet-B3 | 1.2 | 1.4 | 300/224
EfficientNet-B4 | 1.4 | 1.8 | 380/224
EfficientNet-B5 | 1.6 | 2.2 | 456/224
EfficientNet-B6 | 1.8 | 2.6 | 528/224
EfficientNet-B7 | 2.0 | 3.1 | 600/224


Figure 1-1. Deep neural network architecture. The input layer receives a new image to be analyzed by the network; the hidden layers, where the learning process occurs, carry out feature extraction; and the resulting feature maps are analyzed by the classifier, which converts them into categorical classes presented in the output layer with probability values (adapted from Fuentes et al., 2017; Ruder, 2017; image source nicepng.com).

Figure 1-2. The procedure of data analysis of a CNN (Li et al., 2020).

Figure 1-3. Learning process of Transfer Learning. The source tasks and the target task can be

different. The process uses the knowledge from source task to improve the learning

process in the new target task (Pan & Yang, 2010).


CHAPTER 2

DETECTING NUTRIENT DEFICIENCIES, PEST AND DISEASE DISORDERS ON CITRUS

LEAVES USING DEEP LEARNING MACHINE VISION

Introduction

The adequate diagnosis of crop condition is an essential component of crop management,

as it is the first step in the decision-making process. Early diagnosis of crop disorders

enables proper scheduling of disease and pest control, as well as correction of nutrient

imbalances (Baramidze, Khetereli, & Kushad, 2015). Several leaf disorders are found in citrus

groves, including nutrient deficiencies, disease symptoms, pest damage, phytotoxicity, and the

effects of environmental conditions such as sunburn (National Research Council, 2010; Hill &

Station, 1967). Biotic and abiotic stress in crops cause significant decrease in productivity and

subsequent economic losses resulting from late and imprecise diagnosis as well as delays in

implementing corrective actions (National Research Council (US) Committee on Biosciences,

1985). In Florida, Huanglongbing (HLB) disease is a major threat to citrus production. It is

caused by a phloem-limited bacterium, Candidatus Liberibacter asiaticus (CLas), and transmitted

by an insect vector called Diaphorina citri Kuwayama, the Asian citrus psyllid (ACP) (National

Research Council, 2010; Grafton-Cardwell et al., 2013; Halbert & Núñez, 2004; Hall et al.,

2013; Manjunath, Halbert, Ramadugu, Webb, & Lee, 2008).

Visual diagnosis of citrus leaf symptoms is challenging in the presence of confounding

factors causing changes in plant phenotype, resulting from plant-pathogen-environment

interactions, as well as similarities with nutrient deficiency symptoms (Grafton-Cardwell,

Godfrey, Rogers, Childers, & Stansly, 2006). The common nutrient deficiencies found in HLB-

affected trees include manganese (Mn), zinc (Zn), phosphorus (P), calcium (Ca), magnesium

(Mg), iron (Fe), boron (B), and copper (Cu) (Graham, Gottwald, & Irey, 2012; Nwugo et al.,

2013; Spann & Schumann, 2009). Asymmetrical foliar chlorosis and “blotchy mottle”


appearance are the main visual characteristics of HLB symptomatology. Some HLB symptoms,

such as chlorosis of young leaves, are similar to nutrient deficiency symptoms of Zn and Mn

(Bove, 2006; Grafton-Cardwell et al., 2006). Additionally, other commonly occurring leaf

disease and pest symptoms like the fungal diseases of greasy spot (Zasmidium citri-griseum also

called Mycosphaerella citri Whiteside) and citrus scab (Elsinoë fawcettii), the bacterial disease

citrus canker (Xanthomonas citri subsp. citri), the oomycete disease phytophthora root and foot rot

(Phytophthora nicotianae), and pest damages, such as spider mite damage (Tetranychus urticae

Koch), can confound the interpretation of nutritional deficiencies by inexperienced observers,

occasionally leading to inaccurate diagnosis and decision making (Dewdney, 2019; Dewdney,

2020; Dewdney & Johnson, 2020; Dewdney et al., 2020; Qureshi et al., 2020). Plant disorder

assessment is done through scouting and further confirmation with analytical methods under

laboratory conditions (Sankaran, Mishra, Ehsani, & Davis, 2010). Visual identification of leaf

symptoms is in most cases the first step in assessing plant conditions during scouting. Usually it

requires training and expertise for the proper diagnosis, and further investigation using standard

analytical methods, which are time consuming and often costly (Sankaran et al., 2010). Accurate

methods are required to reduce the complexity of visual diagnosis and improve efficiency and

precision in the identification of leaf disorders.

The recent advances in artificial intelligence (AI) and machine vision have introduced state-of-the-art models with improved accuracy in image classification, object detection, and image

segmentation (Dhillon & Verma, 2020; Lecun et al., 2015). In agriculture, the field of robotics

and autonomous systems (RAS) for automation of weed control with smart sprayers and

automated harvesting is among the most positively impacted areas (Duckett et al., 2018; Duong,

Nguyen, Di Sipio, & Di Ruscio, 2020; Kamilaris & Prenafeta-Boldú, 2018; Liakos et al., 2018).


AI has largely contributed to cost reduction, as well as improved efficiency and sustainability in

precision agriculture by reducing labor requirement and targeted application of agrochemicals

(Duckett et al., 2018; Esau et al., 2018; Liakos et al., 2018). Another field of important use of AI

and machine vision is the development of mobile smartphone applications for agriculture (Alreshidi, 2019; Hernández-Hernández et al., 2017). In the field of

citrus production, machine vision was implemented to detect fruit infected with citrus canker and

citrus scab, two diseases that affect post-harvest fruit quality (Duong et al., 2020). Sharif et al. (2018) implemented an optimized weighted segmentation method to develop a system to classify and detect citrus diseases using a Multi-Class Support Vector Machine (M-SVM); the proposed approach achieved 97% classification accuracy. Bollis, Pedrini, and Avila (2020) created a new dataset of citrus pests with six classes of mite species for the automation of IPM. The

EfficientNet-B0 was used to train the new pest benchmark, achieving 91.8% accuracy on

automatically-generated images of 400x400 pixels (Bollis et al., 2020). Also, Xing, Lee, and Lee

(2019) developed a Weakly Dense-16 CNN model for object recognition to specifically train an

integrated citrus pests and diseases database. The model achieved an accuracy of 93.42%,

performing better than the other models, including the VGG-16 (93%) and Network In Network

(NIN) with 91.84% (Xing et al., 2019).

This research was primarily centered on the development of a computer-vision-powered

system to diagnose citrus disorders using a convolutional neural network. The results obtained

from this study are essential to provide greater efficiency in the diagnosis of citrus leaf disorders

such as pests, diseases, and nutrient deficiencies, as well as contributing towards the modernization

and digitalization of farm management activities. Two pretrained CNN networks, the VGG-16

and the EfficientNet-B4 were re-trained to identify eleven classes of citrus disorders commonly


found in HLB-endemic citrus production regions of Florida. The leaf disorder classes included in

this study were fungal diseases (greasy spot and citrus scab), bacterial diseases (citrus canker and

HLB) and oomycetes diseases (phytophthora foot and root rot), nutrient deficiencies (nitrogen,

magnesium, iron, manganese and zinc), pest damage (spider mite), and a class of asymptomatic

(healthy) leaves. Previous studies on this topic only focused on detection of citrus leaves and

fruits affected by pests and diseases. This is the first time that CNN models have been applied to

recognize nutrient deficiencies of citrus. Additionally, a completely new database of citrus leaf

disorders was developed in this study, with a total of 15,800 images, used for calibration

(training and validation) and external validation. Transfer learning and fine-tuning approaches

were used to train the data using the two pretrained models. The models were evaluated on their

ability to converge on and generalize across the citrus leaf disorders dataset, followed by their

performance on an external database of unknown images. A comparison with human experts and

novices in the field of citrus disorders identification was made, using accuracy and time to

validate the usefulness of the developed technology.

Hypothesis

Machine-vision powered models for image classification can perform as well as expert

scouts and better than a novice scout.

Objectives

To develop fast and accurate diagnostic artificial intelligence models, using two

pretrained image classification models, the VGG-16 and the EfficientNet-B4, for key

nutrient deficiencies of citrus, disease symptoms and pest damage that are commonly

encountered in Florida citrus trees impacted by HLB disease.


Materials and Methods

This research was carried out at the Soil and Precision Agriculture Laboratory, Citrus

Research and Education Center (CREC), University of Florida (https://crec.ifas.ufl.edu/). The

study was conducted in four phases: 1) field and laboratory data collection; 2) training/validation (model development); 3) comparison of model performance to the performance of human experts and novices in Florida citrus; and 4) statistical analysis. The data collection phase

included leaf sampling, photography, scanning, and leaf nutrient analysis, followed by nutrient

data processing, labelling, and cropping of leaf images. The leaf symptoms diagnosis models

were developed by re-training DCNN networks for leaf image classification, using the EfficientNet-B4 (Tan & Le, 2019) and VGG-16 (Simonyan & Zisserman, 2015) models.

Both models were previously trained on the ImageNet dataset, which contains more than one million images in 1000 classes

(Deng et al., 2009; Russakovsky et al., 2015). The re-trained models were used to predict

symptoms of an independent set of test leaves, and the results were analyzed using classical

statistics methods to compare model performance, select the best model, and validate the

research hypothesis. Finally, the performance of both models on independent sets of leaf data

was compared with that of humans classified as experts and novices in diagnosing citrus leaf disorders.

Experimental Design

To develop and train the DCNN models, a database of leaf images was created, which

included leaf disorders such as common nutrient deficiencies encountered in HLB-impacted

groves, disease symptoms, pest damage, as well as asymptomatic leaves. The leaf database

contained twenty-four classes of symptoms, twelve for adaxial (top) and the other twelve for the

abaxial (bottom) surface of the leaf. Each class contained representative data of different nutrient

deficiency, pest or disease progression stages and degree of symptoms: initial/early,

moderate/intermediate, and severe/late symptoms, including representative samples of citrus


cultivars (Figure 2-1). The selected leaf disorders were comprised of five nutrient deficiencies

(N, Mn, Mg, Fe, and Zn), five diseases (citrus canker, greasy spot, HLB, phytophthora and

scab), spider mite damage, and healthy leaves (asymptomatic leaves). Table 2-1 shows the

distribution of all classes and the four broad categories.

Data Collection

The leaf samples were collected from selected locations, at the CREC and surrounding

groves (Table 2-2). The purposive sampling method was employed to subjectively sample the

classes of leaf disorders selected for this research. More than 600 leaves were sampled for each

class, from which abaxial and adaxial sides of individual leaves were photographed, using a

white letter-size paper as a background, and fluorescent lighting in the laboratory. For the

nutrient deficiency classes, the leaves were sampled and later, in the lab, grouped into thirty twenty-leaf samples per class, based on a similar level of visual symptoms (initial, moderate, and

severe deficiency) for nutrient analysis. The leaves were cleaned by wiping with a paper towel to

remove surface impurities, placed in Ziploc™ bags, and then stored in a 4°C refrigerator to

preserve their original properties, such as color and turgidity. The disease symptoms, pest

damage classes and asymptomatic leaves were also grouped in batches of twenty leaves, but no

laboratory analysis was performed.

The data used for model training consisted of digital photographs, true 24-bit RGB

images, in Joint Photographic Experts Group (JPEG) file format. The leaves were photographed

in a batch size of twenty leaves for calibration (training and validation) and test (independent

validation). A Samsung Galaxy S8 smartphone Android camera with 12MP Dual Pixel Sensor,

was used to take the photographs of leaves with a resolution of 4032x3024 pixels, automatic

white balance, focus, and exposure. After photographing, the leaf samples were scanned, using a

flatbed scanner (EPSON Scan V550 Photo) for a permanent record. The leaf samples to be


analyzed for nutrient deficiencies were washed using Liquinox soap (Alconox, Inc., White

Plains, NY) and 5% (v/v) hydrochloric acid, then oven-dried at 70°C for 48 hours. The dry

weights were recorded, and the samples were ground using a Mini Thomas Wiley Grinding Mill

(40 mesh screen) (Thomas Scientific, Swedesboro, NJ). Samples were sent for chemical nutrient

analysis to Waters Agricultural Laboratories in Camilla, Georgia. The nutrient analysis results

were interpreted using the values presented in Table 2-3. The results from nutrient analysis were

further analyzed using DRIS (Diagnosis and Recommendation Integrated System) to identify the

most limiting essential nutrient in the sample. The DRIS web tool computes and identifies

nutrient imbalances in a sample analysis, which can be nutrients in excess or in deficiency

(Schumann, 2020).

Data Processing

Data annotation and image cropping

After photographing, the samples and respective classes were renamed; this process is called data annotation. All photographs were manually named, following the order of sample

number (1-30 samples) and the number of leaves per sample (20). Each scanned file was

similarly identified, to indicate class name and sample number to match lab results with the leaf

nutrient deficiency symptoms. Prior to training, all images were automatically cropped using a

Yolo-v3 object detection model that was trained to identify leaves, in order to remove excessive

background around leaf objects (Redmon & Farhadi, 2018). Image background can interfere with

the learning process by reducing model accuracy in image classification and object detection.

The feature learning (training process) in deep learning models is a pixel-based process, where

the models use all image features to recognize the different properties of an image under

analysis. As a result, cropping is an important process to improve model performance, by

removing unnecessary pixels. For image classification, the recommended image resolution is 380x380 pixels for the EfficientNet-B4 model and 224x224 pixels for the VGG-16 model, which were the minimum image resolutions after cropping and resizing, respectively (Simonyan & Zisserman,

2015; Tan & Le, 2019).
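The cropping step can be illustrated with a simple bounding-box calculation. The sketch below is illustrative only (the actual cropping used the trained Yolo-v3 detector); it expands a hypothetical detected leaf box by a small margin and clamps it to the image bounds before resizing to the model input resolution:

```python
def crop_box(bbox, img_w, img_h, margin=0.05):
    """Expand a detected (x1, y1, x2, y2) leaf box by a relative margin
    and clamp it to the image bounds, removing excess background."""
    x1, y1, x2, y2 = bbox
    pad_x = (x2 - x1) * margin
    pad_y = (y2 - y1) * margin
    return (max(0, int(x1 - pad_x)), max(0, int(y1 - pad_y)),
            min(img_w, int(x2 + pad_x)), min(img_h, int(y2 + pad_y)))

# A hypothetical leaf detected in a 4032x3024 photograph.
box = crop_box((1500, 1000, 2500, 2200), img_w=4032, img_h=3024)
print(box)  # (1450, 940, 2550, 2260)
```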

Dataset for calibration - training and validation

The training dataset consisted of a database of 14,400 images, from 24 classes, containing

600 images each. The images were saved in individual class folders and named accordingly to

match the class ID. Typically, a sample size of 500 to 1,000 images (leaves) of each class of leaf

disorder is sufficient for deep learning model training. According to Russakovsky et al. (2015),

for the ILSVRC12-14 each class had between 732 and 1300 images for training, 50 images for

validation and 100 images for testing. Two DCNN models were successfully trained for image

classification of chemical crystals with a sample size of 600 images (Mungofa et al., 2018).

Dataset for testing - independent validation

The model performance was tested using an external dataset with three replicates of 20

images (60 images per class). A total of 1,400 images (72 separate test sets) were used for the 24

classes. The sampling method for the test dataset was purposive sampling, where experts in the

identification of citrus leaf disorders objectively selected characteristic samples of each of the

classes included in the dataset. For nutrient deficiency validation classes, the leaf symptoms were

confirmed by chemical nutrient analysis. Image properties (resolution and data format) were the

same as those images used for calibration. A subset of 20 leaves from the test dataset was used to

compare model performance to human expertise in identifying leaf disorders.


Data Analysis

Training and validation for citrus leaf disorders classification models with pretrained

networks

A deep learning machine vision approach was used to re-train two pre-trained DCNN

models to recognize citrus leaf disorders. The pre-trained image classification models

EfficientNet-B4 by Tan and Le (2019) and the VGG-16 by Simonyan and Zisserman (2015)

were used to develop the citrus leaf symptoms diagnosis models. The models were trained in a

Jupyter Notebook (Pérez & Granger, 2018), using the Keras API developed by François Chollet in 2015, written in Python 3 and running on the TensorFlow framework version 2.4, an open-source platform developed by the Google Brain team (Abadi et al., 2016). A Linux

server, running the Ubuntu 18.04 operating system on a 64-bit Intel® Core™ i3-7100 CPU @

3.90GHz computer with 16 GB of RAM and an NVIDIA (NVIDIA Corporation, Santa Clara, CA, USA) GeForce GTX 1080 Ti graphics card (GPU) was used to train the models. For

calibration, a proportion of 80%:20% was set for the training and validation datasets, respectively.

The images were normalized to pixel values ranging from 0 to 1, by dividing the pixel values by

255, the maximum pixel value per color in a 24-bit RGB image. Data augmentation was applied

to the training dataset, including geometric distortions (image rotation by 90°, horizontal flip, and vertical flip, with the fill mode set to constant) and a photometric distortion (brightness, 0.2 to 1.5). By applying data augmentation to the training subset, the sample size was increased fourfold, resulting from image rotation, horizontal and vertical flips, and brightness adjustment. The fill mode was only used to maintain the true shape of the images after geometric distortions, such as

image rotation. Data augmentation is a procedure carried out to artificially generate data to increase the variability and sample size of the training dataset; it improves the model's capability to recognize and correctly classify images under variable ranges of image properties, which vary from user to user and device to device. Data augmentation was not applied to the validation subset or the independent validation dataset.
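The normalization and augmentation steps above can be sketched with NumPy (a minimal illustration of the transforms only; the thesis used the Keras ImageDataGenerator, and the `normalize` and `augment` helpers below are hypothetical names):

```python
import numpy as np

def normalize(image):
    """Scale 24-bit RGB pixel values from 0-255 down to 0-1."""
    return image.astype(np.float32) / 255.0

def augment(image, brightness=1.0):
    """Yield the four augmented variants described in the text:
    90-degree rotation, horizontal flip, vertical flip, and a
    brightness-scaled copy (factor drawn from the 0.2-1.5 range)."""
    yield np.rot90(image)                         # geometric: rotation by 90 degrees
    yield np.fliplr(image)                        # geometric: horizontal flip
    yield np.flipud(image)                        # geometric: vertical flip
    yield np.clip(image * brightness, 0.0, 1.0)   # photometric: brightness

img = normalize(np.full((4, 4, 3), 255, dtype=np.uint8))
variants = list(augment(img, brightness=0.5))
print(len(variants), img.max(), variants[3].max())
```

Each training image therefore contributes four additional variants, matching the fourfold increase in sample size described above.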

The Adam optimizer (an algorithm for stochastic optimization), one of the most used

algorithms in deep learning machine vision, was used for training. It adapts the learning rate and momentum for each parameter, effectively reducing the learning rate when dealing with complex datasets (Kingma & Ba, 2015). Reducing the learning rate enables the network to learn complex

features gradually and accurately, leading to improved performance. For the EfficientNet-B4

model the initial learning rate (LR) was set to 0.005 during transfer learning and reduced by 10x,

to 0.0005, when fine-tuning. The VGG-16 model was trained with a lower LR, starting with

0.0005 during transfer learning and reduced to 0.00005 for fine-tuning. A loss function is used to

monitor model performance during training. The categorical cross entropy was used to compute

the loss values between the true class labels and predictions from the model (Zhang & Sabuncu,

2018). The loss is computed per sample and averaged over the batch size. Training and validation accuracy were the metrics used to evaluate model

performance. Accuracy calculates the frequency of agreement between the predictions from the

model and the true class labels. Automatic early stopping was activated to halt training when no

more improvement in validation accuracy occurred for five consecutive epochs. Automatic LR

reduction was set to reduce LR by a factor of 5 (0.2) when validation accuracy did not improve

for two epochs (Table 2-4).
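The callback behavior described above, automatic LR reduction after two stagnant epochs and early stopping after five, can be sketched in plain Python (in the thesis this was handled by Keras callbacks; `simulate_callbacks` is a simplified, hypothetical re-implementation):

```python
def simulate_callbacks(val_accuracies, lr=0.005, patience_lr=2,
                       patience_stop=5, factor=0.2):
    """Track the best validation accuracy seen so far; multiply the
    learning rate by `factor` after every `patience_lr` epochs without
    improvement, and stop after `patience_stop` stagnant epochs."""
    best, stagnant = float("-inf"), 0
    history = []
    for epoch, acc in enumerate(val_accuracies):
        if acc > best:
            best, stagnant = acc, 0
        else:
            stagnant += 1
            if stagnant % patience_lr == 0:
                lr *= factor          # e.g., 0.005 -> 0.001
        history.append((epoch, lr))
        if stagnant >= patience_stop:
            break                     # early stopping
    return history

# Accuracy improves twice, then plateaus for five epochs.
run = simulate_callbacks([0.90, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95])
print(run[-1])
```

With these hypothetical accuracies, the LR drops twice (epochs 3 and 5) before training halts at epoch 6.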

Training methodology

Transfer learning using a pretrained model was used as the first step in training. During

transfer learning, a copy of the EfficientNet-B4 and the VGG-16 models was downloaded, which

was previously trained on the ImageNet dataset (Russakovsky et al., 2015). Only the base model

was used, and the classification head for the 1,000 ImageNet classes was removed. The base

52

model architecture for the EfficientNet-B4 model comprises 467 trainable layers, and the

VGG-16 contains 19 trainable layers with 26 trainable weights. However, during transfer

learning, only three new selected layers (the new classification head for leaf classes) were

attached to the upper part of the base network, and were trained while the pretrained layers of the

base model remained frozen (represented in the upper part of the base model, Figure 2-2). The

classification head included the following layers: Global average pooling 2D layer (for two-

dimensional images), Dropout layer (set to 0.5), and the Dense layer (dense class or multiclass

layer for 24 classes). These layers were the same types as those used in the classification head

used to train the ImageNet dataset with 1,000 classes. Global average pooling acts as a structural regularizer in the classification layer, making predictions based on the feature maps. It is applied directly over the feature maps to avoid overfitting and regularize the network structure, converting feature maps into confidence maps of categories or classes (Lin,

Chen, & Yan, 2014). The dropout layer, which is also used to prevent overfitting of the training

dataset, regularizes the training by randomly selecting and setting half of the activations in the

fully connected layers to zero (Srivastava, Hinton, Krizhevsky, Sutskever, & Salakhutdinov,

2014). The dense layer applies a linear transformation followed by a non-linear activation function to make the final predictions (Huang et al., 2017). Dense layers are

particularly important in deeper networks to enable shorter connections between layers (Huang et

al., 2017). The number of dense units is computed and set to the number of output classes, where

SoftMax is the non-linear activation function. The selection of these layers was based on

computational efficiency and improved model performance (high accuracy and low loss values).
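The two computational stages of the classification head, global average pooling followed by a softmax dense layer, can be illustrated with a NumPy sketch (random placeholder weights and a toy feature map, not the trained model; the dropout layer is omitted because it is active only during training):

```python
import numpy as np

rng = np.random.default_rng(0)

def global_average_pooling(feature_maps):
    """Average each feature map over its spatial dimensions:
    (H, W, C) -> (C,)."""
    return feature_maps.mean(axis=(0, 1))

def dense_softmax(features, weights, bias):
    """Linear transformation followed by the softmax activation,
    producing one probability per output class."""
    logits = features @ weights + bias
    exp = np.exp(logits - logits.max())   # numerically stable softmax
    return exp / exp.sum()

# A toy 4x4x8 tensor stands in for the base model's final feature
# maps; the dense layer maps the 8 pooled features to 24 classes.
maps = rng.random((4, 4, 8))
w, b = rng.standard_normal((8, 24)), np.zeros(24)
probs = dense_softmax(global_average_pooling(maps), w, b)
print(probs.shape, probs.sum())
```

The output is a 24-element probability vector summing to one, one entry per leaf class.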

After transfer learning, fine-tuning was done to train the model on the leaf dataset and

improve model performance. The process was carried out by gradually unfreezing part of the

53

network that was previously frozen during transfer learning. The principle is that increasing the

number of trainable layers will increase model performance for the new set of classes. For

EfficientNet-B4, fine-tuning was carried out in two steps: first, the lower 66% of the base model was frozen to train the upper 33% of its layers; second, the upper 66% of the base model was trained, with the lowest 33% frozen. For the VGG-16 model, a smaller network than the EfficientNet-B4, fine-tuning was carried out by first unfreezing 33% of the layers and then unfreezing the rest of the network (100%). The sequence of training is shown in Figure 2-2.
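The staged unfreezing can be expressed as a simple index computation over the ordered list of base-model layers (a schematic using the layer count from the text; `trainable_mask` is a hypothetical helper, and real frameworks set a `trainable` flag on each layer object instead):

```python
def trainable_mask(n_layers, unfreeze_fraction):
    """Return one boolean flag per layer: the lowest layers stay
    frozen (False) and the upper `unfreeze_fraction` of the stack
    becomes trainable (True)."""
    n_frozen = int(n_layers * (1.0 - unfreeze_fraction))
    return [i >= n_frozen for i in range(n_layers)]

# EfficientNet-B4 base model: 467 trainable layers (from the text).
step1 = trainable_mask(467, 0.33)   # fine-tune the upper 33%
step2 = trainable_mask(467, 0.66)   # fine-tune the upper 66%
print(sum(step1), sum(step2))
```

Each fine-tuning step simply moves the freeze boundary lower in the network, so progressively more of the pretrained weights are updated.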

Evaluating model performance

Model performance was evaluated as training progressed, using training accuracy and

loss values. Validation accuracy and validation loss were assessed on the validation dataset

(20%). The best fit model had high accuracy and low loss values, for both training and validation

subsets. An equilibrium between the accuracy and loss during training and validation is required

to exclude the possibility of overfitting or underfitting. Usually, unbalanced training parameters, such as unequal sample sizes between classes, or improper solvers (algorithms) and classification heads, are the main causes of imbalances in model performance. The validation

dataset was also used to assess model performance in individual classes. The variables used to

assess performance for trained classes were obtained with scikit-learn's classification_report function (Pedregosa et al., 2011), calculating accuracy (probability results), precision, recall, and F1 score (Equation 2-1 to Equation 2-4).

Accuracy. The ratio of the total correct predictions (true positives and true negatives) to the total number of observations. It is computed to evaluate model performance, using the averaged class probability results. It is important to note that generally, the accuracy value does

not alone represent model performance, which is better evaluated using precision, recall, and F1

score.

54

Accuracy = Total Number of Correct Predictions / Total Number of Observations (2-1)

Precision. The ratio of total true positives to the total number of samples predicted as positive (true positives and false positives). It indicates the model's capacity to correctly classify objects based on their true labels, without confusing a true positive with a false positive.

Precision = True Positives / (True Positives + False Positives) (2-2)

Recall. The ratio of true positives to the actual positives (true positive and false negative predictions). It shows the model's ability to correctly identify the true positives in a class; recall is also called sensitivity.

Recall = True Positives / (True Positives + False Negatives) (2-3)

F1 score. A function of precision and recall. It indicates the balance between precision and recall, showing the impact of false positives and false negatives on model performance. When comparing the performance of different models trained under the same circumstances, the F1 score is more suitable for assessing performance.

F1 Score = 2 × (Precision × Recall) / (Precision + Recall) (2-4)
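Equations 2-1 to 2-4 can be checked with a small worked example (the counts below are hypothetical, not thesis data):

```python
def accuracy(tp, tn, total):
    return (tp + tn) / total            # Equation 2-1

def precision(tp, fp):
    return tp / (tp + fp)               # Equation 2-2

def recall(tp, fn):
    return tp / (tp + fn)               # Equation 2-3

def f1_score(p, r):
    return 2 * p * r / (p + r)          # Equation 2-4

# Hypothetical counts for one class out of 200 samples:
# 90 true positives, 10 false positives, 5 false negatives,
# 95 true negatives.
p, r = precision(90, 10), recall(90, 5)
print(round(p, 3), round(r, 3), round(f1_score(p, r), 3),
      accuracy(90, 95, 200))
```

Here precision is 0.9, recall is about 0.947, and the F1 score (about 0.923) falls between the two, as expected of their harmonic mean.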

The results from the model predictions were used to develop the confusion matrix, to

visualize where confusion occurs among classes that share similar features. The confusion matrix

contrasts the true labels with the predicted values, showing the percentages of true positives and false positives.

Evaluating model performance on an external dataset

Testing was conducted using an external database of unknown images, containing

representative images of all classes. These images were not used in the calibration of the model.

Folders containing the unknown images were analyzed by the model, which provided probability

percentages of single leaves. Model predictions were set to provide the three best classes (called

top-3), ranked from high to low probability percentages. Correctly classified samples with a probability percentage equal to or greater than 50% received a score of one (1), and incorrect classifications with a probability percentage of less than 50% received a score of zero (0). However, for samples where all probability percentages were less than 50%, the class with the highest probability percentage received a score of one (1) when correctly classified and a score of zero (0) when incorrectly classified. The probability results were averaged to compute final

model performance on unknown test images. Three metrics were computed, the probability

percentage per class, averaged correct probability, and averaged error, all in percentages. For

nutrient deficiency samples, the results from model testing were also compared to the results

from chemical nutrient analysis of each batch of 20 leaves, using DRIS analysis and published

thresholds (Morgan et al., 2020).
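The top-3 ranking and scoring rule described above can be sketched as follows (a simplified re-implementation with made-up probabilities, not the thesis code; note that the rule reduces to scoring the top-ranked class against the true label):

```python
import numpy as np

def top3(probabilities):
    """Return the three best class indices, highest probability first."""
    return list(np.argsort(probabilities)[::-1][:3])

def score(probabilities, true_class):
    """Score 1 when the top-ranked class matches the true label
    (whether or not its probability reaches 50%), else 0 -- the
    stated rule collapses to this, since the >=50% and all-below-50%
    cases both reward a correct top-1 prediction."""
    return int(top3(probabilities)[0] == true_class)

probs = np.array([0.05, 0.62, 0.20, 0.13])  # hypothetical 4-class output
print(top3(probs), score(probs, 1), score(probs, 2))
```

Averaging these 0/1 scores over a test folder gives the per-class probability-percentage metric reported for the independent validation.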

Developing and training image classification models for citrus leaf diagnosis

Five models were trained to convergence (best-fit models) for citrus leaf diagnosis: CLD-Model-1, CLD-Model-2, CLD-Model-3, and CLD-Model-4 (using the EfficientNet-B4 pretrained model), and CLD-Model-5 (using the VGG-16 pretrained model). An initial model, CLD-Model-1, was trained using a database of twenty-three classes of citrus leaf symptoms, where the HLB class contained only images of the adaxial (HLB_d) side of the leaf surface. The other 22 classes

included leaf symptoms of the adaxial and abaxial sides of the leaf (Table 2-1). A total of 13,800

images was used to train the model, 80% (11,040 images) for training and 20% (2,760 images)

for validation. All classes contained the same sample size of 600 images, and data augmentation

was applied to the training subset (80%). Image resolution was 380×380 pixels, the batch size was set to 24, and the

number of training epochs was initially set to 50. The number of steps per epoch in training was

460 and in validation 115. Equation 2-5 and Equation 2-6 were used to compute number of steps

per epoch for training and validation, respectively.

Steps per epoch = Number of training samples / Training batch size (2-5)

Validation steps = Number of validation samples / Validation batch size (2-6)
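Applying Equations 2-5 and 2-6 to the CLD-Model-1 split quoted above (11,040 training and 2,760 validation images, batch size 24) reproduces the reported step counts:

```python
def steps_per_epoch(n_samples, batch_size):
    """Equations 2-5 and 2-6: the number of batches needed to pass
    over every sample once per epoch (integer division)."""
    return n_samples // batch_size

train_steps = steps_per_epoch(11_040, 24)  # 80% of the 13,800 images
val_steps = steps_per_epoch(2_760, 24)     # 20% of the 13,800 images
print(train_steps, val_steps)  # 460 and 115
```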

The model was trained for 56 epochs, distributed between transfer learning, fine-tuning

33%, and fine-tuning 66%. After completion of each training step, model progress and best

weights were saved to proceed to the next step in training (e.g., fine-tuning). Once training was

completed, model performance was evaluated on the validation dataset (2,760 images). Figure 2-

3 shows the framework employed for model development. The variables used to analyze model

performance in training were precision, recall, F1 score, and accuracy. A confusion matrix was

generated with scikit-learn to visualize conflictive classes, those with similar features that the model was not able to distinguish (false positives and false negatives) (Pedregosa et al., 2011). Finally, model performance was tested on an external dataset with unknown leaf images.

Developing and training new image classification models for citrus leaf diagnosis with an

improved dataset

An improved dataset was created to increase model performance. The results from the

confusion matrix and external validation were used to assess the integrity of data used for

training. The training data were classified by prediction with the CLD-Model-1, to identify

which images were outliers, causing confusion in the dataset. After the assessment, the outliers

were manually removed from the database. The new dataset consisted of classes with unbalanced sample sizes. One implication of training on a dataset containing classes of different sample sizes is the possibility of overfitting and underfitting. This leads to a biased classification

towards the class with the greater sample size. The new training dataset included an additional HLB class (HLB_b, the abaxial side of the leaf) and a complete replacement of the previous HLB_d class. The total number of images in the new training dataset was 14,312, with 80% (11,456 images) for training and 20% (2,856 images) for validation.

A second model (CLD-Model-2) was trained, using a database of twenty-four classes of

citrus leaf symptoms. The training strategy was the same as implemented for CLD-Model-1. The

number of steps per epoch in training was 477 and 119 in validation. The new model was trained

for 78 epochs, divided into transfer learning and fine-tuning. Model performance was evaluated

on a validation set of 2,856 images, using the same variables as in the previous model. The

performance was also tested on an external dataset, which included images of the abaxial

side of HLB leaves. A confusion matrix was built for visualization of model accuracy and

conflictive classes on the improved dataset.

A third model (CLD-Model-3) was developed, with a balanced sample size for training

classes. For this model, new leaf images were photographed to replace the outliers that were

previously removed. The leaves for nutrient deficiency classes were sent for chemical nutrient

analysis for confirmation. The image database used for training contained 14,400 images, from

twenty-four classes of 600 images each. Training strategy was the same as with previous models,

80% (11,520 images) were used for training and 20% (2,880 images) for validation. The number

of steps per epoch in training was 480 and 120 in validation. The model was trained for 49

epochs. The number of training epochs was low for CLD-Model-3 compared to the other two models, because the weights of CLD-Model-2 were used to reduce training time and improve

model performance during transfer learning. Model performance was evaluated on an external

validation set of 1,400 images.

To assess whether model performance would improve, the scab_d class was removed from the database. This decision was based on the confusion matrix from CLD-Model-3, where the scab_d class had the lowest accuracy value. A new model, CLD-Model-4, was trained

with 23 classes of 600 images each. The same training methodology was used, 13,800 images,

80% (11,040 images) were used for training and 20% (2,760 images) for validation. The training

was carried out for 59 epochs.

The fifth model, CLD-Model-5, was developed using the VGG-16 pretrained network with the balanced dataset, the same one used for CLD-Model-3. The training strategy for the model

was the same as previously described. The image database used for training contained 14,400

images, from twenty-four classes of 600 images each. The training strategy was the same as

previous models, 80% (11,520 images) were used for training and 20% (2,880 images) for

validation. The number of steps per epoch in training was 480 and 120 in validation. The model

was trained for 61 epochs. After training, model performance was evaluated on an independent

validation set of 1,400 images representative of the 24 classes.

Independent validation dataset to compare model performance to human expertise

Model results were compared with human classification results. Two groups of people

were asked to identify symptoms in a custom web survey, using a subset of 20 leaves per class,

for a total of 240 leaves. The web survey was conducted with surveyplanet.com (Survey Planet,

LLC), available at the link https://s.surveyplanet.com/YC17pXmhH. The group of experienced

professionals (Experts) included three individuals who together have approximately 100 years of experience in citrus production in Florida. The second group (Novices) consisted of three

individuals with nearly 20 years of experience altogether, ranging from 3 to 10 years. Three of the models with comparable performance were used: CLD-Model-2, CLD-Model-3, and CLD-Model-4. The survey containing 240 images (each showing the adaxial and abaxial sides of the

leaves) was created to evaluate the two groups of individuals on classification of citrus leaf

disorders. The same set of images was used to test the models in image classification. The

individuals had one chance to classify the samples, with no time limit and no additional self-

training. The survey included 12 answer options: the eleven classes of leaf disorders and one class of asymptomatic leaves.

Statistical Analysis

A Pearson’s Chi-square test for categorical variables was used to compare model

performance to the performance of Experts and Novices in classification of 240 selected images

of all leaf disorders on the test set. A comparison between the groups of humans was also

performed to assess differences between the two groups. A significance level of 0.05 (95% confidence) was used. The analysis was performed in the RStudio statistical software (RStudio

Team, 2016) using the R programming language (R Core Team, 2015).

Results and Discussion

A database of 15,800 digital images of twenty-two classes of citrus leaf disorders and two

classes of asymptomatic leaves was created, from which 14,400 images were used for calibration

and 1,400 images for independent validation. The database of images was used to train five

DCNN models from two pretrained networks (EfficientNet-B4 and VGG-16), through transfer

learning and fine-tuning. Data augmentation was applied to the training subset, which constitutes

80% of the data used for calibration (Table 2-5). Five data augmentation methods were applied

to the training dataset. Table 2-5 summarizes the data generated and how the data were used to train the five CLD models.

Training and Validation Results

CLD-Model-1

The first model trained achieved a validation loss of 0.059 and a validation accuracy of

98.19%. Model training was carried out for 56 epochs. Transfer learning was completed in 23

epochs, reaching a validation loss of 0.39. The remaining training was carried out for 33 epochs,

14 epochs with 33% of the network, reaching a validation loss of 0.081. The highest training

performance was achieved in the second step of fine-tuning, when 66% of the network was used

(Figure 2-4A).

On the validation dataset, the model achieved 98% accuracy, precision of 98%, recall of

98%, and an F1 score of 98% (Table 2-7). The confusion matrix for CLD-Model-1 is shown in Figure 2-5; most classes achieved excellent performance, including spider mite, greasy spot,

phytophthora, citrus canker, and magnesium with 100% accuracy, manganese, nitrogen, and

magnesium (abaxial), with 99% accuracy. Among the 23 classes, the zinc_d class showed the

highest number of false predictions (9.2%). The symptoms were confused with the manganese_d

class (8.3%) and zinc_b (0.8%). Another class with low performance was iron_d, with 94.2% of true predictions, being confused with manganese_d (4.2%), scab_d (0.8%), and magnesium_d

(0.8%). In the confusion matrix, other classes with low performance included scab_b (96.7%), scab_d (95.8%), nitrogen_d (95.8%), and zinc_b (95.8%) positive predictions. As the results

obtained from CLD-Model-1 showed a substantial number of false predictions (error rate of 1.8%, Table 2-11), the training dataset was tested to identify and remove outliers; the results are

shown in Table 2-6. The model performance on independent validation was 98.26% of true

predictions (accuracy) with an error rate of 1.74% and a prediction confidence of 97.96%. The

results show good overall performance in most classes, except zinc and spider mite on both sides of the leaf surface (Figure 2-10A).

CLD-Model-2

CLD-Model-2 (Figure 2-4B) was trained on the improved dataset, after removing outliers

that were causing confusion and affecting model performance. Twenty-four classes were trained

for 78 epochs, subdivided into 27 epochs for transfer learning, 26 epochs trained with 33% of the network, and 24 epochs during the second step of fine-tuning. A validation loss of 0.34 and an accuracy of

88.9% was achieved during transfer learning. After fine-tuning, validation loss decreased to

0.024 and validation accuracy increased to 99.52%.

Model performance on the validation set achieved an accuracy of 99%, the same

averaged percentage was achieved for model precision, recall and F1 score (Table 2-7). The

confusion matrix (Figure 2-6) improved considerably for nearly all classes that contained a high

number of removed outliers, with percentages of positive predictions above 99%. The scab_d class, whose performance decreased to 90.7% of true predictions, was the only class with low

performance. The scab_d class symptoms were confused with symptoms of healthy_d (6.8%),

manganese_d (1.7%) and spider mite (0.8%). The overall model performance reached 99.4% of

true positive predictions with a low error rate of 0.6% (Table 2-11). When tested on the

independent validation set, the averaged percentage of true predictions was 97.99% with an

averaged error rate of 2.01%, and the prediction confidence was 97.78% (Table 2-13). Classes

such as manganese_d and zinc_d with 80.2% and 86.95% of true predictions, respectively,

represented the lowest values (Figure 2-10B).

CLD-Model-3

The CLD-Model-3 (Figure 2-4C) model performance on the improved dataset (when the

outliers were replaced with better images of leaf symptoms) reached 99.17% validation accuracy and a loss of 0.044. During transfer learning, the validation accuracy reached 99.14% with a validation loss of 0.046. The high validation accuracy during transfer learning was achieved through the use of the weights from CLD-Model-2.

Model performance on the validation set achieved an accuracy of 99%, and the same

percentage was obtained for precision, recall and F1 score (Table 2-7). From the confusion

matrix (Figure 2-7), a high performance was achieved for most of the classes, with percentages

of positive predictions greater than 98%, comparable to the performance of CLD-Model-2.

However, a few classes presented low performance, scab_d (92.5%), nitrogen_d and nitrogen_b

both with 96.7% of true predictions. The scab_d symptoms were confused with healthy_d (6.7%)

and manganese_d (0.8). Nitrogen_d symptoms were confused with spidermite_d (1.7%) and

HLB_d (1.7%) and nitrogen_b symptoms were confused with manganese_b (1.7%), zinc_b

(0.8%) and healthy_b (0.8%). Confusion between manganese and zinc classes were still found,

2.5% of false predicted manganese_b, with respect to zinc_b with 97.5% of true predictions, and

zinc_d with 97.5% was confused to manganese_d (1.7%) and zinc-b (0.8%). The remaining

classes achieved performance greater than 98%, with excellent performance observed on the

classes of disease symptoms. The overall rate of true predictions was 99% with an error rate of 1

% (Table 2-11). The results on the independent validation set improved to 98.26%, with an error

of 1.74% and a prediction confidence of 98% (Table 2-13). The two classes showing low

performance were zinc_b with 88.33% and manganese_d also with 88.33% of true predictions

(Figure 2-10C).

CLD-Model-4

As the scab_d class showed the lowest performance, CLD-Model-4 (Figure 2-4D) was

trained to evaluate the impact of removing the entire class from the dataset. This model

performed better in training than CLD-Model-1 and CLD-Model-3, achieving a validation

accuracy of 99.24%, and a validation loss of 0.037. During transfer learning, the model was

trained for 24 epochs, where the validation accuracy reached 88.26% with a validation loss of

0.38. Fine-tuning greatly improved the validation accuracy as shown in Figure 2-4D. However,

model performance on the validation dataset did not improve compared to the two previous

models; an average of 99% validation accuracy was obtained, and model precision, recall, and F1

score reached the same percentage as model accuracy. The accuracy values observed in the

confusion matrix (Figure 2-8) showed good performance, with most classes achieving greater

than 99% positive predictions. The iron_d class, with 95% true predictions, had the lowest performance; it was confused with the manganese_d class (5% false predictions). In the zinc_d class, with 96.7%, the symptoms were still confused with manganese_d and zinc_b, with

2.5% and 0.8% false predictions, respectively. The overall performance of positive predictions

was 99.2% with an error rate of 0.8% (Table 2-11). The classification performance on the

independent validation (Table 2-13) was comparable to the other models: 97.9% of true predictions, a 2.01% error rate, and a prediction confidence of 97.64%. The classes

with low performance were manganese_d (86.67%), spidermite_b (88.33%), zinc_b (91.67%)

and HLB_b (96.67%). The remaining classes achieved classification performance greater than

98% (Figure 2-10D).

CLD-Model-5

The VGG-16 neural network was used to train the CLD-Model-5 (Figure 2-4E) and

compare its performance to the EfficientNet-B4 model. The improved dataset with 24 classes

was used for training the model, which reached a validation accuracy of 98.33% and a

validation loss of 0.054 after fine-tuning. During transfer learning, which was carried out for 10

epochs, the validation accuracy (41.60%) was considerably lower, and the validation loss (2.50) considerably higher, than for all the EfficientNet models. Performance improved gradually, but at a much slower rate than for the EfficientNet-B4 models, during the two steps of fine-tuning, with 33% and 100% of the network used for training. When training the model with 33%

of the network, 24 epochs were used and validation accuracy increased to 91.67% with a loss of

0.24, reaching the highest accuracy in the second step of fine-tuning.

Model performance on the validation dataset achieved an accuracy of 98%. The same

percentage was observed for the model’s precision, recall, and F1 score (Table 2-7). The

confusion matrix (Figure 2-9) shows a good prediction performance in most classes with positive

prediction rates ranging from 98% to 100%. The model results showed a generalized decrease in

performance compared to models CLD-Model-2 to CLD-Model-4 for most of the classes except

greasyspot_d, spidermite_d, greasyspot_b, canker_d, magnesium_d, healthy_b and

phytophthora_b, all with 100% of true predictions. The model had lower performance at predicting phytophthora_d, with 97.5% of true predictions, compared to all the EfficientNet-B4 models. Classes such as scab_d (90.8%), zinc_b (92.5%), zinc_d (95.8%), and iron_d (96.7%) still

maintained low percentages of true predictions compared to those observed in the previous

models. The overall percentage of true predictions was 98.3% with an error rate of 1.7% (Table

2-11). The classifier’s performance on the independent validation was 95.9% of true predictions,

with an error rate of 4.1% and a prediction confidence of 95.34% (Table 2-13). The classes with

low classification performance included iron_d (96.67%), HLB_b (96.67%), manganese_d

(76.67%), phytophthora_b (96.67%), phytophthora_d (93.33%), spidermite_b (75%),

spidermite_d (91.67%), zinc_b (88.33%) and zinc_d (93.33%). The rest of the classes had

similarly high performance as it was observed in the previous models (Figure 2-10E).

Model Performance During Training and Validation

A similar trend was observed in all models trained using the pretrained EfficientNet-B4

network, with training and validation accuracy increasing from transfer learning to fine-tuning.

In Figure 2-4A, CLD-Model-1 had the lowest performance among the EfficientNet-B4 models,

with maximum validation accuracy of 98.19% and loss of 0.059. In this model, training accuracy

outperformed the validation accuracy; the highest training accuracy was 99.51% with a loss of 0.0147, indicating potential overfitting to the training subset. In this

case, the imbalance might have resulted from the presence of samples that did not show classic

symptoms of the assigned classes. As the samples are randomly subdivided into 80%:20%,

training and validation, the chances of having more erroneous images on the validation subset

than in the training subset exist, which results in pronounced differences in accuracy. After

removing outliers (Table 2-6), model performance of CLD-Model-2 improved by 1.33%, from

98.19% to 99.52% validation accuracy, with a loss of 0.22 (Figure 2-4B). The highest training

accuracy was 99.71% and loss of 0.0088 (a difference of 0.19% between training and validation

accuracies). This performance suggests a better agreement between training and validation

subsets, as a benefit of the removal of outliers which improved the quality of the training dataset.

One of the aspects considered when developing image classification models was the balanced

number of samples in the training dataset. The CLD-Model-2 (Figure 2-4B) performance

increased when compared to CLD-Model-1; however, the training dataset was not balanced for

all classes (Table 2-6). The CLD-Model-3 (Figure 2-4C) was trained on a balanced dataset. The

model performance was better than the CLD-Model-1, with 99.17% validation accuracy and a

loss of 0.044; nevertheless, the model did not outperform the CLD-Model-2. Comparing model

performance on the training subset after fine-tuning, training accuracy reached 99.9% with a

loss of 0.004. The improved dataset did not benefit the validation accuracy. The difference

between training accuracy and validation accuracy was pronounced (0.73%), with training

accuracy outperforming the validation accuracy. When analyzing the confusion matrix for all the

previous models, the scab_d class showed low performance. The CLD-Model-4 (Figure 2-4D)

without the scab_d class, achieved a validation accuracy of 99.24% and a loss of 0.037. Model

performance on the training subset reached an accuracy of 99.65% and loss of 0.0107, a

difference of 0.41%, regarding the validation accuracy. The difference in performance on the two

subsets was less pronounced than the previous model. Removing the scab_d class slightly

benefitted model performance; however, it was not better than CLD-Model-2. The four

EfficientNet-B4 models were compared to CLD-Model-5 (Figure 2-4E), trained using the VGG-

16 network. The final validation accuracy of CLD-Model-5 was 98.33% and loss of 0.054. The

accuracy on the training subset was 99.54% and loss of 0.0135, a difference of 1.21%, compared

to the validation accuracy. The pronounced differences between training and validation suggest

potential overfitting of the training subset.

Model Performance on the Validation Dataset

Performance on the validation dataset (Table 2-7) improved from CLD-Model-1 to CLD-Model-4. CLD-Model-5 did not perform as well as the other models, even though it was trained on the improved dataset. The improvement in performance on the validation dataset can be attributed to the EfficientNet-B4 network. As observed during training, EfficientNet-B4 generalized better to the citrus leaf symptoms dataset, and the same pattern held on the validation dataset. Based on the F1 scores, and excluding CLD-Model-1, the EfficientNet-B4 models performed better on the leaf dataset, with an F1 score of 99%, whereas the VGG-16 F1 score was 98%. Evaluated by the rate of true predictions (Table 2-11), CLD-Model-2 had the highest rate of true predictions (99.4%) and the lowest rate of false predictions (0.6%). All models achieved excellent performance during calibration, which can be attributed to four main factors: the quality of the training data, good selection of hyperparameters, transfer learning, and fine-tuning. In general, the training data were carefully selected and grouped into their respective classes. While this was true for most of the disease symptom classes, scab introduced confusion into the models because its symptoms are difficult to detect on the adaxial surfaces of the leaves. Also, in the initial database (used to train CLD-Model-1), the nutrient deficiency classes contained a considerable number of outliers (Table 2-6), particularly manganese (the number of outliers removed is not shown in Table 2-6), zinc, and iron. Improving the dataset by removing and replacing the outliers greatly improved model performance.
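The precision, recall, and F1 values reported here (Tables 2-7 to 2-10) follow the standard per-class definitions. The sketch below illustrates those definitions with hypothetical counts, not the thesis data:

```python
def precision_recall_f1(tp, fp, fn):
    """Per-class metrics from true-positive, false-positive, and
    false-negative counts, as reported in a classification report."""
    precision = tp / (tp + fp)   # of the predicted positives, how many were correct
    recall = tp / (tp + fn)      # of the actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return precision, recall, f1

# Hypothetical class with support 117: 116 found, 1 missed, 2 false alarms
p, r, f = precision_recall_f1(tp=116, fp=2, fn=1)
```

Because F1 is the harmonic mean, a class can only score near 99-100% when both precision and recall are simultaneously high, which is why the scab_d recall of 91-93% pulls its F1 down in Table 2-10.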

Regarding the data, the leaf symptoms dataset contained some classes that are easily confused with the symptoms of other classes. Citrus scab disease symptoms are more pronounced on the abaxial side of the leaf; in some cases, infected leaves show no symptoms on the adaxial side and therefore appear healthy. Also, on leaves with mild symptoms, the adaxial side presents a slight chlorosis that looks similar to the damage caused by spider mites and to the chlorosis seen in manganese deficiency, both on the adaxial side of the leaf. Similar features are also shared between the manganese and zinc classes. In early stages of deficiency, zinc-deficient leaves tend to resemble the manganese classes because interveinal chlorosis is well pronounced in zinc-deficient leaves, and in late stages of deficiency, manganese-deficient plants also show zinc-deficiency-like symptoms, with pronounced chlorosis. One way to distinguish zinc deficiency from manganese deficiency is leaf size: zinc-deficient leaves are small and narrow with pronounced interveinal chlorosis, whereas manganese-deficient leaves remain normal sized with mild interveinal chlorosis. However, in most plants, micronutrient deficiencies occur at the same time due to imbalances in soil chemistry, such as soil pH (Havlin, Beaton, Tisdale, & Nelson, 2005), which might be why most samples were deficient in more than one essential nutrient (Object 2-1). There was also confusion between manganese and iron deficiency, observed in samples with mild symptoms of iron deficiency, which tend to look like manganese deficiency based on leaf size and mild interveinal chlorosis. Some samples with iron deficiency also presented manganese deficiency, resulting from nutrient imbalances in the plant. Since HLB-affected trees in the field tend to be deficient in micronutrients, this could explain why multiple deficiencies were encountered. Leaf chlorosis was the main feature identified in misclassified leaves. Nitrogen deficiency is characterized by general leaf chlorosis, and some nitrogen-deficient leaves were misclassified as HLB, possibly because leaf chlorosis is also pronounced in severely HLB-symptomatic leaves. Because image classification is a pixel-based process, the pixel values of these leaves may have been similar to those of nitrogen deficiency. On the other hand, the disease symptom classes phytophthora, citrus canker, and greasy spot had excellent performance, with unique features that were not confused with symptoms of other classes.

The proper selection of hyperparameters (Table 2-4) allowed the training process to proceed smoothly and reach good model performance. The balanced batch size to class ratio (24:24) was important to maintain an equilibrium of representative samples from each class. A smaller batch size on large datasets for multiclass tasks usually causes overfitting or underfitting, because the samples fed to the network at each training and validation step are not the same, due to random sample selection; larger batch sizes (32 and 64 images) may also risk over- and underfitting and depend heavily on computational capacity (GPU memory). The learning rate, with the Adam optimizer, was set to start higher during transfer learning, when the model was training on familiar features, and was decreased during fine-tuning to allow the model to learn the features of the new dataset. This greatly increased performance during fine-tuning, while also slowing down the training process. The learning rate for the VGG-16 model was lower than for the EfficientNet-B4 model: because VGG-16 is a shallower model with a greater number of trainable parameters, it requires a lower learning rate or an optimizer with a slow learning rate such as SGD (Simonyan & Zisserman, 2015). Transfer learning and fine-tuning were essential to the high performance achieved by all models (Figures 2-4A to 2-4E). It was clear that transfer learning is a fast method to train CNN models and that fine-tuning is essential to improve model performance on a new dataset. The two pretrained models achieved great performance when trained on large datasets like ImageNet; they were built to train on large and complex datasets. Therefore, training large models on relatively small datasets like the citrus leaf disorders dataset requires the gradual unfreezing of layers until an optimum accuracy is reached. For deeper networks like EfficientNet-B4, fine-tuning was successfully implemented using 66% of the network layers, while for VGG-16, all network layers were used to fine-tune the model on the citrus leaf disorders dataset.
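The learning-rate schedule described above follows a reduce-on-plateau rule (Table 2-4: factor 0.2, two epochs of patience, minimum delta of 1e-4, floor of 1e-7). The following plain-Python sketch reimplements that rule for illustration only; it is not the actual Keras callback used in the study:

```python
def plateau_lr(val_accuracies, lr=0.005, factor=0.2, patience=2,
               min_delta=1e-4, min_lr=1e-7):
    """Return the final learning rate after applying a reduce-on-plateau
    rule over a sequence of per-epoch validation accuracies."""
    best = float("-inf")
    wait = 0
    for acc in val_accuracies:
        if acc > best + min_delta:       # improvement larger than min_delta
            best, wait = acc, 0
        else:
            wait += 1
            if wait >= patience:         # plateau: shrink lr, but not below floor
                lr = max(lr * factor, min_lr)
                wait = 0
    return lr

# Two flat epochs trigger one reduction: 0.005 -> 0.001
final_lr = plateau_lr([0.90, 0.95, 0.95, 0.95])
```

This mirrors how the initial transfer-learning rate (0.005 for EfficientNet-B4) is automatically driven down as validation accuracy plateaus, complementing the manual drop to 0.0005 at the start of fine-tuning.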

Comparing the performance of the two pretrained models during transfer learning and fine-tuning, considering CLD-Model-3 (Figure 2-4C) and CLD-Model-5 (Figure 2-4E), the results obtained in this study indicate that EfficientNet-B4 performs better than the VGG-16 model. The improved performance of EfficientNet-B4 can be attributed to its architecture's depth, width, and image resolution, which are all greater than those of VGG-16. It is well known that deeper networks perform better than shallow models. In addition to its greater depth, EfficientNet-B4 has wider layers, with a greater number of channels (up to 1792 channels at a resolution of 12x12) to train on finer details, and uses a greater image resolution (380x380 input size), which improves the model's ability to correctly classify complex images (Tan & Le, 2019). By contrast, the relatively shallow VGG-16 network does not have the same capacity to train on fine details: its image input size is smaller (224x224) and its number of channels is also smaller (up to 512 at a resolution of 7x7), which reduces the number of details the model can detect (Simonyan & Zisserman, 2015).
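The joint advantage in depth, width, and resolution comes from EfficientNet's compound scaling, in which Tan & Le (2019) scale all three dimensions together as d = α^φ, w = β^φ, r = γ^φ, with base coefficients chosen so that α·β²·γ² ≈ 2 (FLOPS roughly double per unit increase of the exponent φ). A quick numerical check using the coefficients published in that paper (α = 1.2, β = 1.1, γ = 1.15):

```python
# Base compound-scaling coefficients from Tan & Le (2019)
alpha, beta, gamma = 1.2, 1.1, 1.15   # depth, width, resolution

def scale(phi):
    """Depth, width, and resolution multipliers for scaling exponent phi."""
    return alpha ** phi, beta ** phi, gamma ** phi

# Constraint: alpha * beta^2 * gamma^2 ~= 2, so FLOPS grow roughly as 2^phi
flops_factor = alpha * beta**2 * gamma**2
```

Larger variants such as B4 correspond to larger values of φ, which is why depth, width, and input resolution all grow together rather than independently.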

Model Performance on the Independent Validation

Model performance on the test set showed the same pattern observed on the validation dataset; however, the rate of true predictions was lower. The models showed excellent performance when predicting the disease symptoms on both sides of the leaf surface (Figures 2-10A to 2-10E). For the spider mite damage classes, CLD-Model-1 to CLD-Model-3 (Figures 2-10A to 2-10C) performed better when predicting the abaxial side of the leaf and showed more confusion on the adaxial side. In contrast, the last two models (Figures 2-10D and 2-10E), which also showed lower rates of true predictions, were better at predicting the spider mite damage visible on the adaxial side. For the nutrient deficiency classes, the models were able to correctly predict the symptoms of magnesium deficiency on both sides of the leaf, and similar performance was observed for nitrogen deficiency. For iron deficiency, the first three models (Figures 2-10A to 2-10C) performed better when predicting the symptoms on the adaxial side of the leaves, while CLD-Model-4 had the same rate of true predictions for both sides (Figure 2-10D). The last model (Figure 2-10E) had the lowest performance and was better at predicting the symptoms on the abaxial side of the leaves for most classes. The classes with the lowest rates of true predictions across all models were manganese and zinc, on the adaxial and abaxial sides, respectively; for both classes, the models were better at predicting symptoms on the opposite leaf side. For the manganese class (Figure 2-10A), the first model performed well on the external manganese dataset (98.3% on both sides of the leaves), but performance decreased drastically for the remaining models. This could result from the scrutiny and removal of outliers from the training set, which was not done for the testing set, reducing the prediction probabilities (Figures 2-10A to 2-10E). Considering overall model performance on the test dataset (Table 2-13), CLD-Model-3 had the best performance, with 98.26% true predictions, and CLD-Model-5 had the lowest, with 95.9% true predictions; the good performance of the models can be attributed to the improved capabilities of EfficientNet-B4. The low performance of all models on the manganese and zinc classes can be attributed to the ambiguity of the samples selected to test the models, caused by the presence of multiple symptoms.
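The true- and false-prediction rates summarized in Tables 2-11 and 2-13 derive directly from the confusion matrix: the diagonal holds the correctly classified samples, so the true-prediction rate is the trace divided by the total sample count. A minimal sketch with a hypothetical 3-class matrix (not the thesis data):

```python
import numpy as np

# Hypothetical confusion matrix: rows = true labels, columns = predicted labels
cm = np.array([[58, 1, 1],
               [2, 57, 1],
               [0, 2, 58]])

true_rate = np.trace(cm) / cm.sum() * 100   # percent of true predictions
false_rate = 100.0 - true_rate              # percent of false predictions
```

The same computation, applied per row instead of to the whole matrix, yields the per-class true-prediction rates plotted in Figure 2-10.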

Chemical Nutrient Analysis Results

The results from chemical nutrient analysis confirmed true deficiency in all samples showing nutrient deficiency symptoms. However, in some samples, more than one nutrient deficiency was observed; the table of DRIS results can be found in Object 2-1. The manganese and zinc classes had the highest number of samples with multiple micronutrient deficiencies. In samples of the nitrogen and iron classes, multiple nutrient deficiencies were also observed, but in smaller proportions. Table 2-12 contains the results of DRIS analysis on the independent validation set; the results show the same trend of multiple nutrient deficiencies for the classes indicated above.

Statistical Analysis Results Comparing Model Performance to Human Performance

A total of 240 image samples were used to compare human and model classification performance. The list of samples and the prediction results from the models are shown in Table 2-14. The models performed excellently in almost all classes, the exceptions being spider mite damage (for all models) and magnesium deficiency (for CLD-Model-4).

The confusion matrices shown in Figures 2-11A to 2-11C correspond to the classification results of individuals in the novice group. General agreement is observed in the classification of the disease symptoms citrus canker and greasy spot. Differences were found in the classification of citrus scab, phytophthora, and HLB, and substantial variation was observed in the identification of spider mite damage symptoms. The same trend was observed in the classification of nutrient deficiency symptoms, where zinc and nitrogen were the classes with the highest classification accuracy for two of the novices. Generally low performance was observed for magnesium, manganese, and iron. The overall performance of the novice group, shown in Figure 2-11D, indicates good classification accuracy for the HLB, citrus canker, citrus scab, healthy leaf, and zinc classes, and low performance for all remaining classes.

The classification results from the group of experts are shown in Figures 2-12A to 2-12C. The experts had better classification accuracy for most classes of disease symptoms (Figure 2-12C). Regarding the nutrient deficiency symptoms, nitrogen, iron, and zinc had greater classification accuracy than manganese and magnesium; an exception is shown in Figure 2-12B, with 90% classification accuracy for manganese symptoms. All experts performed well in classifying spider mite symptoms and asymptomatic healthy leaves. The overall classification results (Figure 2-12D) indicated great classification performance for disease symptoms, except phytophthora, with 65% classification accuracy. For nutrient deficiency, good performance was observed for iron, zinc, and nitrogen deficiency; manganese and magnesium deficiency had the lowest classification accuracies, 66.7% and 51.7%, respectively.

Table 2-15 shows the contingency table for Pearson's Chi-square analysis, with the numbers of correct and incorrect answers from the three models, three experts, and three novices. Chi-square test results are shown in Table 2-16. There were statistically significant differences (p < 0.001) between all groups, with the models outperforming both the experts and the novices. However, the differences were less pronounced when comparing the experts with the models, as shown in Table 2-15 and in the confusion matrices in Figures 2-11D and 2-12D.

Model Performance Compared to Human Expertise

Statistical analysis showed significant differences between all groups (p < 0.001), with the models performing better than both groups of humans. The X² statistic under the null hypothesis was greater when comparing model performance to the group of novices (X² = 291.61) than to the group of experts (X² = 104.88), implying that the novices performed worse than the experts relative to the models. Between the experts and the novices, X² under the null hypothesis was 74.51, with the experts performing better than the novices. Based on these results, the first null hypothesis is rejected (p < 0.001): the models performed better than the expert professionals. Likewise, the models performed better than the novices (p < 0.001), as expected under the second hypothesis.
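The X² values in Table 2-16 can be reproduced from the counts in Table 2-15 with a standard Pearson chi-square test on a 2x2 contingency table. The sketch below assumes SciPy is available; disabling the Yates continuity correction matches the reported statistic:

```python
from scipy.stats import chi2_contingency

# Counts from Table 2-15: rows = (incorrect, correct), columns = (models, experts)
models_vs_experts = [[6, 113],
                     [714, 607]]

# correction=False disables the Yates continuity correction for 2x2 tables
chi2, p, dof, expected = chi2_contingency(models_vs_experts, correction=False)
```

Swapping the second column for the novice counts (256 incorrect, 464 correct) yields the models-versus-novices statistic in the same way.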

The differences in performance between the models and the two groups of individuals support the need for a computer tool to diagnose leaf disorders. As expected, the group of experts performed better than the group of novices. Nevertheless, the models outperformed the group of experts, especially in the classification of nutrient deficiency symptoms. A leaf diagnosis tool can therefore supplement humans in field or laboratory identification of citrus leaf nutrient deficiency symptoms. An important aspect to point out is the difference between the conditions used to develop the models and the conventional methods used to assess leaf disorders in the field. The models were trained on single leaves with single symptoms, displayed in a 2D digital image, whereas humans are trained to identify disorders under field conditions, where an individual has a holistic view of the site. In the field, confusion between a leaf with phytophthora chlorosis and the generalized chlorosis caused by nitrogen deficiency could be eliminated. Therefore, the low performance of the experts might have resulted from the limited diagnostic information available (digital images of single leaf disorders), as humans tend to analyze disorders and draw conclusions from an overview of field conditions. On the other hand, a diagnosis tool can be used to help novices learn to identify citrus leaf disorders. As the results show, the group of novices had the lowest performance of all groups, with confusion among symptoms of both biotic and abiotic stresses. A leaf diagnosis tool might therefore contribute to improving the accuracy of field assessments of leaf disorders. Reduced prediction time is another great advantage of a leaf disorder diagnosis tool. On average, a model takes 20 seconds to generate predictions for 20 leaves (a total of about 240 seconds for 240 samples), versus the humans, who took from 34 minutes (experts) to 50 minutes (novices) to classify 240 images. This might not be a fair comparison; however, it is a clear advantage of an automated and accurate system to identify citrus leaf disorders.


Table 2-1. Identified classes of leaf disorders and healthy leaves. The table shows the leaf disorders with respective class names, where "b" indicates the abaxial side of the leaf surface and "d" indicates the adaxial side of the leaf surface.

| Category | Foliage disorder | Training classes |
|---|---|---|
| Nutrient deficiency | Nitrogen (N) | Nitrogen_b, Nitrogen_d |
| | Magnesium (Mg) | Magnesium_b, Magnesium_d |
| | Manganese (Mn) | Manganese_b, Manganese_d |
| | Zinc (Zn) | Zinc_b, Zinc_d |
| | Iron (Fe) | Iron_b, Iron_d |
| Pest damage | Spider mite damage (Tetranychus urticae Koch) | Spidermite_b, Spidermite_d |
| Disease symptoms | Blotchy-mottle HLB (Candidatus Liberibacter asiaticus, CLas) | HLB_b, HLB_d |
| | Phytophthora chlorosis (Phytophthora nicotianae) | Phytophthora_b, Phytophthora_d |
| | Citrus canker (Xanthomonas citri subsp. citri) | Canker_b, Canker_d |
| | Greasy spot (Zasmidium citri-griseum) | Greasyspot_b, Greasyspot_d |
| | Scab (Elsinoë fawcettii) | Scab_b, Scab_d |
| Healthy leaves | Healthy and asymptomatic leaves (Citrus spp.) | Healthy_b, Healthy_d |


Table 2-2. Sampling locations of the leaf disorders and respective cultivars. The GPS coordinates are approximated to the center of the sampling locations.

| Location | Facility | Geographic coordinates | Leaf symptoms | Cultivars |
|---|---|---|---|---|
| CREC | CUPS | 28.102373, -81.710839 | Greasy spot, phytophthora chlorosis, scab, spider mite damage, healthy leaves, and Fe, Zn, Mg, N, and Mn deficiencies | Murcott, W. Murcott, Sugar Belle, and Kinnow tangerines; Ray Ruby, Ruby Red, and Flame grapefruit; Meyer and Eureka lemon; Persian/Tahiti lime; Hamlin orange |
| CREC | City Block | 28.116103, -81.711879 | HLB, citrus canker, and Zn, Mn, and Mg deficiencies | Hamlin and Valencia orange |
| CREC | Teaching Block | 28.102544, -81.709487 | Citrus canker, phytophthora chlorosis, HLB, and Zn, Mn, and N deficiencies | Hamlin and Valencia orange |
| CREC | Block 22 | 28.107407, -81.685169 | HLB and Mn deficiency | Valencia orange |
| CREC | Block 8 | 28.104913, -81.713890 | HLB and Mn deficiency | Hamlin orange |
| CREC | Trellis Block | 28.102468, -81.709927 | HLB, Mn and N deficiencies, and citrus canker | Murcott and Ray Ruby grapefruit |
| CREC | Greenhouse | 28.101789, -81.712115 | N and Mn deficiencies | Murcott tangerine |
| CREC | Greenhouse | 28.104576, -81.713211 | Spider mite damage and Mn deficiencies | Murcott tangerine |
| Bolender Road | The Gapway groves | 28.089211, -81.769884 | Fe, Mn, N, and Mg deficiencies | Hamlin orange |


Table 2-2. Continued.

| Location | Facility | Geographic coordinates | Leaf symptoms | Cultivars |
|---|---|---|---|---|
| Adams Road | The Gapway groves | 28.097628, -81.783477 | N deficiency | Hamlin orange |

Table 2-3. Guidelines for interpretation of leaf analysis based on 4- to 6-month-old spring flush leaves from non-fruiting twigs (modified from Morgan et al., 2021).

Element Unit of measure Deficient Low Optimum High Excess

N % <2.20 2.20-2.40 2.50-2.70 2.80-3.00 >3.00

Mg % <0.20 0.20-0.29 0.30-0.49 0.50-0.70 >0.70

Mn mg/kg or ppm1 <18 18-24 25-100 101-300 >300

Zn mg/kg or ppm1 <18 18-24 25-100 101-300 >300

Fe mg/kg or ppm1 <35 35-59 60-120 121-200 >200

1ppm = parts per million

Table 2-4. Hyperparameters used in training and validation of the five models.

| Parameter | Value | Description |
|---|---|---|
| Target size | 380x380x3 | Image input size for the EfficientNet-B4 network |
| Target size | 224x224x3 | Image input size for the VGG-16 network |
| Batch size | 24 | Number of images in a batch for the training and validation subsets; selected considering server computation capability |
| Patience | 5 | Number of training epochs without improvement in validation accuracy, after which training is stopped |
| Alpha, transfer learning | 0.005 | Learning rate for the EfficientNet-B4 |
| Alpha, fine-tuning | 0.0005 | Learning rate for the EfficientNet-B4 |
| Alpha, transfer learning | 0.0005 | Learning rate for the VGG-16 |
| Alpha, fine-tuning | 0.00005 | Learning rate for the VGG-16 |
| Automatic learning rate reduction | 0.2 | Multiply the learning rate by 0.2 when the validation accuracy does not improve for two epochs |
| Minimum learning rate | 0.0000001 | The lowest learning rate allowed when the validation accuracy plateaus |
| Minimum delta | 0.0001 | Minimum change in the validation accuracy to qualify as an improvement |


Table 2-5. Summary of data used during calibration and independent validation. The table shows the sample sizes with and without data augmentation. The pretrained (ImageNet weights) model used to train each dataset is also indicated.

| Model | Training data (80%) | Augmented training data | Validation dataset (20%) | Independent validation set | Pretrained model |
|---|---|---|---|---|---|
| CLD-Model-1 | 11,040 | 45,600 | 2,760 | 1,380 | EfficientNet-B4 |
| CLD-Model-2 | 11,456 | 45,824 | 2,856 | 1,400 | EfficientNet-B4 |
| CLD-Model-3 | 11,520 | 46,080 | 2,880 | 1,400 | EfficientNet-B4 |
| CLD-Model-4 | 11,040 | 45,600 | 2,760 | 1,380 | EfficientNet-B4 |
| CLD-Model-5 | 11,520 | 46,080 | 2,880 | 1,400 | VGG-16 |

Table 2-6. Classes with outliers removed after testing the training dataset with CLD-Model-1. Classes whose training data did not contain outliers are not shown.

| Class | Number of images used for training | Number of outliers |
|---|---|---|
| iron_b | 589 | 11 |
| healthy_d | 598 | 2 |
| zinc_d | 585 | 15 |
| scab_b | 592 | 8 |
| spidermite_d | 598 | 2 |
| nitrogen_d | 594 | 6 |
| iron_d | 589 | 11 |
| zinc_b | 585 | 15 |
| spidermite_b | 598 | 2 |
| scab_d | 592 | 8 |
| nitrogen_b | 594 | 6 |
| healthy_b | 598 | 2 |

Table 2-7. Comparison of model performance on the validation dataset.

| Model | Precision (%) | Recall (%) | F1 score (%) | Accuracy (%) | n |
|---|---|---|---|---|---|
| CLD-Model-1 | 98 | 98 | 98 | 98 | 2760 |
| CLD-Model-2 | 99 | 99 | 99 | 99 | 2856 |
| CLD-Model-3 | 99 | 99 | 99 | 99 | 2880 |
| CLD-Model-4 | 99 | 99 | 99 | 99 | 2760 |
| CLD-Model-5 | 98 | 98 | 98 | 98 | 2880 |

Table 2-8. Comparison of model performance based on precision (%) values obtained from the validation dataset. The values were computed for each class on the validation subset (support) of the training dataset.

Classes  CLD-Model-2  n  CLD-Model-3  n  CLD-Model-5  n

iron_b 100 117 100 120 99 120

healthy_d 93 119 93 120 93 120

manganese_d 98 120 98 120 95 120


Table 2-8. Continued.

Classes  CLD-Model-2  n  CLD-Model-3  n  CLD-Model-5  n

zinc_d 100 117 100 120 100 120

magnesium_b 100 120 100 120 97 120

scab_b 100 118 100 120 98 120

greasyspot_d 100 120 100 120 100 120

spidermite_d 99 119 98 120 99 120

HLB_d 98 120 98 120 98 120

nitrogen_d 100 118 100 120 100 120

greasyspot_b 100 120 100 120 100 120

phytophthora_d 100 120 100 120 100 120

iron_d 99 117 99 120 99 120

zinc_b 99 117 98 120 100 120

HLB_b 100 120 100 120 98 120

spidermite_b 100 119 100 120 100 120

canker_b 99 120 100 120 100 120

manganese_b 100 120 96 120 92 120

scab_d 100 118 100 120 97 120

magnesium_d 100 120 100 120 98 120

nitrogen_b 100 118 100 120 99 120

healthy_b 99 119 97 120 99 120

phytophthora_b 100 120 99 120 98 120

canker_d 100 120 100 120 100 120

Table 2-9. Comparison of model performance based on recall (%) values obtained from the validation dataset. The values were computed for each class on the validation subset (support) of the training dataset.

Classes  CLD-Model-2  n  CLD-Model-3  n  CLD-Model-5  n

iron_b 100 117 99 120 98 120

healthy_d 100 119 99 120 98 120

manganese_d 99 120 99 120 99 120

zinc_d 99 117 97 120 96 120

magnesium_b 100 120 100 120 98 120

scab_b 100 118 98 120 99 120

greasyspot_d 100 120 100 120 100 120

spidermite_d 100 119 100 120 100 120

HLB_d 100 120 100 120 99 120

nitrogen_d 99 118 97 120 99 120

greasyspot_b 100 120 100 120 100 120

phytophthora_d 100 120 100 120 97 120

iron_d 100 117 100 120 97 120

zinc_b 100 117 97 120 93 120


Table 2-9. Continued.

Classes  CLD-Model-2  n  CLD-Model-3  n  CLD-Model-5  n

HLB_b 100 120 100 120 99 120

spidermite_b 99 119 100 120 99 120

canker_b 100 120 99 120 100 120

manganese_b 100 120 100 120 99 120

scab_d 91 118 93 120 91 120

magnesium_d 99 120 100 120 100 120

nitrogen_b 99 118 97 120 98 120

healthy_b 100 119 100 120 100 120

phytophthora_b 100 120 100 120 100 120

canker_d 99 120 99 120 99 120

Table 2-10. Comparison of model performance based on F1 score (%) obtained from the validation dataset. The values were computed for each class on the validation subset (support) of the training dataset.

Classes  CLD-Model-2  n  CLD-Model-3  n  CLD-Model-5  n

iron_b 100 117 100 120 99 120

healthy_d 96 119 96 120 96 120

manganese_d 99 120 98 120 97 120

zinc_d 100 117 99 120 98 120

magnesium_b 100 120 100 120 98 120

scab_b 100 118 99 120 99 120

greasyspot_d 100 120 100 120 100 120

spidermite_d 100 119 99 120 100 120

HLB_d 99 120 99 120 99 120

nitrogen_d 100 118 98 120 100 120

greasyspot_b 100 120 100 120 100 120

phytophthora_d 100 120 100 120 99 120

iron_d 100 117 100 120 98 120

zinc_b 100 117 98 120 96 120

HLB_b 100 120 100 120 98 120

spidermite_b 100 119 100 120 100 120

canker_b 100 120 100 120 100 120

manganese_b 100 120 98 120 96 120

scab_d 95 118 96 120 94 120

magnesium_d 100 120 100 120 99 120

nitrogen_b 100 118 98 120 99 120

healthy_b 100 119 98 120 100 120

phytophthora_b 100 120 100 120 99 120

canker_d 100 120 100 120 100 120


Table 2-11. Summary of results based on the confusion matrix values. The table shows the percentages of true and false predictions and the respective numbers of samples.

| Model | True predictions (%) | False predictions (%) | Standard deviation | No. of true predictions | No. of false predictions | Total samples |
|---|---|---|---|---|---|---|
| CLD-Model-1 | 98.19 | 1.81 | 2.37 | 2710 | 50 | 2760 |
| CLD-Model-2 | 99.38 | 0.62 | 1.89 | 2839 | 17 | 2856 |
| CLD-Model-3 | 98.97 | 1.03 | 1.76 | 2852 | 28 | 2880 |
| CLD-Model-4 | 99.25 | 0.75 | 2.34 | 2738 | 22 | 2760 |
| CLD-Model-5 | 98.34 | 1.66 | 1.35 | 2831 | 49 | 2880 |

Table 2-12. Results of DRIS analysis on the independent validation dataset. All samples were deficient in the targeted nutrient deficiency classes.

| Sample | Class | Diagnosis |
|---|---|---|
| Mn1_val | Manganese deficiency | DEFICIENT: Cu<Mn LOW: Zn HIGH: S EXCESS: N |
| Mn2_val | Manganese deficiency | DEFICIENT: Cu<Mn<Zn HIGH: P>S>K EXCESS: N |
| Mn3_val | Manganese deficiency | DEFICIENT: Mg<Mn<Zn LOW: S HIGH: P>K |
| Mg1_val | Magnesium deficiency | DEFICIENT: Mg LOW: Ca<P<Fe HIGH: B |
| Mg2_val | Magnesium deficiency | DEFICIENT: Mg LOW: Ca<P<Fe HIGH: B |
| Mg3_val | Magnesium deficiency | DEFICIENT: Mg LOW: Ca<P<Fe HIGH: B |
| Fe1_val | Iron deficiency | DEFICIENT: Fe<Ca LOW: Mg HIGH: P>K EXCESS: N |
| Fe2_val | Iron deficiency | DEFICIENT: Fe LOW: Ca HIGH: P>K EXCESS: N |
| Fe3_val | Iron deficiency | DEFICIENT: Mn<Fe<Ca LOW: Mg HIGH: P EXCESS: K |
| N1_val | Nitrogen deficiency | DEFICIENT: N<Zn LOW: Mn HIGH: P>B |
| N2_val | Nitrogen deficiency | DEFICIENT: N<Fe LOW: Zn HIGH: Mg>B EXCESS: P>S |
| N3_val | Nitrogen deficiency | DEFICIENT: N<Zn LOW: K HIGH: Mg>B>P |
| Zn1_val | Zinc deficiency | DEFICIENT: Mn<Zn<Fe LOW: Mg<Ca HIGH: P>K EXCESS: Cu |
| Zn2_val | Zinc deficiency | DEFICIENT: Mg<Zn<Mn LOW: Ca<S<N HIGH: P>K EXCESS: Cu |
| Zn3_val | Zinc deficiency | DEFICIENT: Mg<Mn<Zn LOW: S<N HIGH: P |

Table 2-13. Summary of model performance on the independent validation dataset. Confidence refers to the averaged Top-1 predictions of 20 leaves of all classes.

| Model | True predictions (%) | Prediction error (%) | Standard deviation | Confidence (%) |
|---|---|---|---|---|
| CLD-Model-1 | 98.26 | 1.74 | 3.31 | 97.96 |
| CLD-Model-2 | 97.99 | 2.01 | 4.40 | 97.78 |
| CLD-Model-3 | 98.26 | 1.74 | 4.01 | 98.00 |
| CLD-Model-4 | 97.90 | 2.10 | 3.77 | 97.64 |
| CLD-Model-5 | 95.90 | 4.10 | 6.93 | 95.34 |


Table 2-14. Summary of model performance on 20 selected leaves per class of the independent validation dataset. These results were used to compare model classification performance with human classification performance.

| Class | Validation sample | CLD-Model-2 | CLD-Model-3 | CLD-Model-4 |
|---|---|---|---|---|
| Citrus canker | CK_val3 | 100 | 100 | 100 |
| Citrus scab | Sc_val3 | 100 | 100 | 100 |
| Greasy spot | Gs_val1 | 100 | 100 | 100 |
| Healthy | HL_val3 | 100 | 100 | 100 |
| HLB | HLB_val1 | 100 | 100 | 100 |
| Iron | Fe2_val | 100 | 100 | 100 |
| Magnesium | Mg2_val | 100 | 100 | 95 |
| Manganese | Mn1_val | 100 | 100 | 100 |
| Nitrogen | N2_val | 100 | 100 | 100 |
| Phytophthora | PH_val1 | 100 | 100 | 100 |
| Spider mite damage | SM_val1 | 95 | 95 | 85 |
| Zinc | Zn2_val | 100 | 100 | 100 |

Table 2-15. Summary of classification results from the three groups used for the Chi-square test. Values correspond to the number of observations across the three replicates.

| Classification | CLD-Model | Experts | Novices |
|---|---|---|---|
| Incorrect | 6 | 113 | 256 |
| Correct | 714 | 607 | 464 |
| n | 740 | 740 | 740 |

Table 2-16. Chi-square test results, at the 95% confidence level.

| Comparison | X² | df | p-value | n |
|---|---|---|---|---|
| CLD-Model vs Experts | 104.88 | 1 | < 0.001 | 1440 |
| CLD-Model vs Novices | 291.61 | 1 | < 0.001 | 1440 |
| Experts vs Novices | 74.51 | 1 | < 0.001 | 1440 |


Figure 2-1. Citrus leaf disorders proposed for this study. The figure shows classic visual

symptoms of nutrient deficiencies, diseases, and pest damages on citrus leaves.

Figure 2-2. Sequence of training methodology implemented to develop the model using transfer

learning and fine-tuning. The same training methodology was performed for the

VGG-16 model, where out of 19 trainable layers from the base model, 33% were

unfrozen in the first step of fine-tuning and the rest of the network (100%) in the

second step of fine-tuning.


Figure 2-3. Flow diagram of model development. The figure shows the steps implemented to

develop and select the best models to diagnose citrus leaf disorders with two

pretrained models, the EfficientNet-B4 and the VGG-16.


Figure 2-4. Model performance during training: transfer learning and fine-tuning. The transition from transfer learning to fine-tuning is observed as a slight decrease in accuracy and a slight increase in loss. Model accuracy increased during fine-tuning, reaching its highest performance in the second stage of fine-tuning, with 66% and 100% of the network unfrozen for the EfficientNet-B4 models and the VGG-16 model, respectively. A) CLD-Model-1, B) CLD-Model-2, C) CLD-Model-3, D) CLD-Model-4, and E) CLD-Model-5.


Figure 2-5. CLD-Model-1 confusion matrix. The values are percentage of true labels (vertical

axis) allocated to predicted labels (horizontal axis).

Figure 2-6. CLD-Model-2 confusion matrix. The values are percentage of true labels (vertical

axis) allocated to predicted labels (horizontal axis).


Figure 2-7. CLD-Model-3 confusion matrix. The values are percentage of true labels (vertical

axis) allocated to predicted labels (horizontal axis).

Figure 2-8. CLD-Model-4 confusion matrix. The values are percentage of true labels (vertical

axis) allocated to predicted labels (horizontal axis).


Figure 2-9. CLD-Model-5 confusion matrix. The values are percentage of true labels (vertical

axis) allocated to predicted labels (horizontal axis).


Figure 2-10. Model performance on the independent validation dataset. The figure shows the rates of true predictions per class (percentage), computed from the number of images that each model predicted correctly. Models A and D were tested on 1320 images from 23 classes; the HLB_b and scab_d classes, respectively, were not included. Models B, C, and E were tested on a leaf disorders database of 1380 images. A) CLD-Model-1, B) CLD-Model-2, C) CLD-Model-3, D) CLD-Model-4, and E) CLD-Model-5. CLD-Model-5 was the lowest performing model, with 6 classes showing true prediction rates under 95%.


Figure 2-10. Continued.



Figure 2-11. Confusion matrices with classification results from the group of novice scouts. A) Novice 1, B) Novice 2, C) Novice 3, and D) overall results of the three individuals.



Figure 2-12. Confusion matrix with classification results from the group of experienced

professionals. A) Expert 1, B) Expert 2, C) Expert 3 and D) overall results of the

three individuals.

Object 2-1. DRIS analysis results of all leaf samples of nutrient deficiency used to train the citrus leaf disorders identification models. The data file is an Excel dataset (14.1 kB) containing 185 data points.


CHAPTER 3

EVALUATING THE POTENTIAL OF MACHINE VISION TO PREDICT SOIL PHYSICAL

AND CHEMICAL PROPERTIES FROM DIGITAL IMAGES

Introduction

Soil and water quality are important components of sustainable agriculture and necessary

for food production. Maintaining long term soil productivity is essential to ensure crop

production and meet the food demand of a growing global population while preserving the

environment (Lal, 2009). Moreover, climate change threatens food production through increasing temperatures, low precipitation, and soil degradation (Garfin et al., 2014).

Accurate diagnosis is required to understand soil physical, chemical, and biological properties to

optimize farm production potential. Precision Agriculture (PA) principles are based on the

implementation of techniques and technological tools that aid in accurate diagnosis of soil

properties, considering their spatial and temporal variability (Pedersen & Lind, 2017; Shannon et

al., 2018). These tools are used to generate information for decision making to ensure

profitability while reducing the negative impacts of agriculture on the environment (Pedersen &

Lind, 2017; Shannon et al., 2018). On-farm diagnosis of soil and crop conditions generates site-

and time-specific information used to fine tune recommendations (Shannon et al., 2018). Grid

sampling improves sampling density per unit area for more accurate information used to map soil

properties for site specific application of agricultural inputs (Pedersen & Lind, 2017; Shannon et

al., 2018). However, there is an increase in cost for sample testing as well as the time required to

generate recommendations. Alternative methods are used to estimate soil properties and monitor

crop production, such as visible (Vis), infrared (IR) and near-infrared (NIR) (VNIR, 400-1200 nm) and shortwave infrared (SWIR, 1200-2500 nm) spectroscopy along with regression models (Curcio,

Ciraolo, D’Asaro, & Minacapilli, 2013; Nocita et al., 2015). Nevertheless, these methods require


a wide range of samples for calibration, and high investment in equipment as well as

knowledgeable personnel to manage the equipment (Pedersen & Lind, 2017).

Indirect methods of assessing soil properties have been extensively studied, such as the

use of pedotransfer functions (PTFs), along with the use of artificial neural networks (ANNs)

and regression models (Marashi, Mohammadi Torkashvand, Ahmadi, & Esfandyari, 2017;

Minasny et al., 2004; Moreira De Melo & Pedrollo, 2015). Other methods focused on the use of

soil sensors, commonly used to assess soil pH and electrical conductivity (Grisso, Alley,

Holshouser, & Thomason, 2009; Motsara & Roy, 2008). The use of soil spectroscopy and

advances in remote sensing introduce an efficient method of assessing soil properties (Chabrillat

et al., 2019; Nocita et al., 2015). Soil variables, such as organic matter (OM) and soil organic

carbon (SOC) content, nutrient content, soil particle size, pH, cation exchange capacity (CEC),

and soil moisture, have been accurately predicted and calibrated for different regions using these

techniques, enabling site-specific management of water and nutrients (Curcio et al., 2013;

Gomez & Lagacherie, 2016; Nocita et al., 2015; Pinheiro et al., 2017).

With the recent advances in Artificial Intelligence (AI) and machine vision, it is possible

to develop affordable and accurate methods of estimating soil properties. Machine vision, with the continuous improvement of convolutional neural networks (CNN) and the development of more powerful computers, has been shown to benefit many scientific and technological advances in image processing for object recognition (Lecun et al., 2015; Li et al., 2020). In PA, machine

vision has been implemented to improve a variety of farm activities including robotic automated

harvesting, weed control, and in-crop monitoring (Duckett et al., 2018; Liakos et al., 2018).

Machine vision is an emerging field in soil sciences. Deep learning techniques such as

transfer learning and fine-tuning are implemented to model soil variables using pretrained CNN


models (Liu et al., 2018; Tan et al., 2018). Liu et al. (2018) applied transfer learning for soil

spectroscopy to predict soil clay content, with the model achieving R2 of 0.756 and root mean

square error (RMSE) of 7.07. Padarian et al. (2019) developed a multi-task CNN model for

digital soil mapping using 3-D images of covariates and spatial information, where the multi-task

CNN had 30% less error compared to other regression methods (Krizhevsky et al., 2012;

Padarian et al., 2019; Ruder, 2017). Soil spectroscopy and deep CNN were used to predict SOM,

CEC, sand, and clay content, pH in water and total nitrogen using NIR spectroscopy, showing an

improved prediction performance, and decreased error of 62% and 87% (Padarian et al., 2019a).

To account for the high spatial variability of landscapes and its influence on soil properties,

Padarian et al. (2019b) investigated the use of transfer learning with models trained on global

data to predict soil properties at a local level. The results proved that transfer learning was

important to improve prediction performance on local data (Padarian et al., 2019b). Deep neural

network regression (DNNR) was implemented to predict soil moisture from meteorological data

(Cai et al., 2019). High accuracy results were obtained in this study, with R2 ranging from 0.96 to

0.98, and RMSE from 0.78 to 1.61 (Cai et al., 2019). The breakthroughs of computer vision

present an option for implementation of visual analysis of soil properties with the use of digital

images. Deep CNNs are quite efficient and accurate in image analysis. Swetha et al. (2020)

developed a CNN-based model to predict soil texture classes from digital images from a

smartphone camera. The method showed good performance in prediction of sand, silt, and clay,

with R2 of 0.97, 0.98 and 0.70, respectively.

Knowing soil properties is indispensable for decision making in terms of variable rate

application and selecting the right management strategies in accordance with soil conditions.

New tools for soil testing are necessary to increase sampling density, on-farm analysis, and


generation of site-time-specific recommendations for farm input such as fertilizers and irrigation

management. This study presents a novel methodology to predict soil physical and chemical

properties from digital images. The purpose of this study was to develop a simple, fast, accurate

and affordable method for soil testing. The method is intended to provide an on-farm assessment

of soil properties including soil texture, bulk density, color, water content at permanent wilting

point and soil organic matter content. These soil properties are important for understanding soil nutrient and water holding capacity, soil health, and soil processes, and for contributing to decision making. Three deep learning machine vision methods were used: multiclass image

classification, binary image classification and linear regression. The state-of-the-art pretrained

EfficientNet-B4 model developed by Tan and Le (2019) was used to develop the predictive

models for soil properties. Transfer learning and fine-tuning were used in model development. A

database of 421 soil samples was created, from which 321 were used to train the predictive

models and 100 samples were used to test model performance on an unknown dataset. The

results obtained in this study showed great potential application of the proposed methods to

predict properties of sandy soils, which make up a large percentage of the soils of the peninsula of the State of Florida.

Hypothesis

Machine-vision powered models can accurately predict soil properties from digital

images, and therefore can be used to supplement on-farm diagnosis of soil physical and chemical

properties.

Objective

To evaluate the potential use of machine vision models in the prediction of soil physical and

chemical properties from digital images, through fine-tuning of the pretrained EfficientNet-B4

model using image classification and linear regression.


Materials and Methods

This research was carried out at the Soil and Precision Agriculture Laboratory, Citrus

Research and Education Center (CREC), University of Florida (https://crec.ifas.ufl.edu/). A total

of 421 soil samples collected from various locations in the State of Florida were used to model

five soil physical and chemical properties: soil organic matter (SOM), permanent wilting point (PWP), soil bulk density (BD), and soil color, expressed with both the CIE L*a*b* and the Munsell color notation. The samples were routine samples provided by the UF-IFAS Extension Soil Testing

Laboratory (ESTL) and the Soil and Precision Agriculture Laboratory at the CREC. From the

total number of samples, 321 samples (from the ESTL) were used for calibration of the models

and 100 samples from the CREC were used for independent validation, to test the model with an

unknown dataset. The samples were photographed, scanned, and analyzed in the laboratory for

each property. Digital images of soil samples were used to retrain a pretrained model, the

EfficientNet-B4 developed by Tan and Le (2019), using simple linear regression, multiclass, and

binary image classification approaches.

Data Collection

All samples were collected from the topsoil 0-6 inches (15 cm depth). These were

disturbed samples, previously prepared for chemical analysis of nutrient content. Laboratory

sample processing included grinding and sieving through a 2 mm sieve. The samples did not

undergo any known chemical or physical change, aside from the soil structure and soil aggregate

destruction during sample preparation for chemical analysis.

Soil photography and scanning

The soil samples were photographed using a NIKON COOLPIX L830, 16 Megapixel

camera. A Petri dish was used to contain the soil while photographing the top view of the soil

sample. Light adjustment was done when necessary, to remove excessive glare from the light in


the room and to have a clear display of soil characteristics, such as color and particle size. Each

sample had five replicates of images of 3456x3456 pixels, where each replicate was a separate

pouring of soil from the same sample bag. When taking the photographs, the Petri dish

containing soil sample was centralized to facilitate image cropping during data processing. All

images were photographed at a fixed vertical distance of 71 cm, using a tripod. After

photographing, the samples were scanned through the transparent Petri dish, using an EPSON

Scan V550 Photo flatbed scanner. The image resolution was 2345x2423 pixels. Prior to

photographing and scanning, samples were mixed by rotation and flipping to obtain different

views of the sample. Special attention was given to avoid fine particles from sinking to the

bottom of the Petri dish and to have even distribution of different particle sizes and organic

matter in the samples. Figure 3-1 shows the flow diagram of the methodology implemented to

develop the models.

Permanent wilting point (PWP) by the dew point method

To conduct PWP measurements, the samples were brought to field capacity using the

centrifuge method for disturbed samples (Cassel & Nielsen, 1986). The method was modified for

coarse soils, using 30 grams of soil, and centrifugation for 30 minutes at 700 rpm. After

centrifuging, 5 grams of the moistened sample were air dried for 24 hours at room temperature, about 21 °C. The WP4T instrument (Dewpoint PotentiaMeter, Decagon Devices)

was used, which measures the sum of the osmotic and matric potential in a sample. The

methodology for measurement was the same as described by the equipment manufacturer (Decagon

Devices, 2007). The WP4T provided values of water potential (ψ) in megapascals (MPa)

that was used to calculate water content at permanent wilting point. The wet weight was recorded

after sample reading and the oven dry weight was recorded after 48 hours at 105 °C. The data were


used to compute the gravimetric water content (mass of water per unit mass of dry soil, θm),

using Equation 3-1. The PWP using the dew point was calculated using Equation 3-2.

θm = Mass of water / Mass of oven dry soil = (Mass of wet soil − Mass of oven dry soil) / Mass of oven dry soil (3-1)

W−1.5 = Wm × ln(−1000/−1.5) / ln(−1000/ψm) (3-2)

where W-1.5 is the water content at PWP, Wm is the measured water content corresponding to the

water potential, ψm is the measured water potential in MPa and -1.5 is the water potential at PWP

in MPa.
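As an illustrative sketch, Equations 3-1 and 3-2 can be computed in Python; the sample weights and the WP4T water potential reading below are hypothetical values, not data from this study:

```python
import math

def gravimetric_water_content(wet_g, oven_dry_g):
    # Equation 3-1: mass of water per unit mass of oven-dry soil
    return (wet_g - oven_dry_g) / oven_dry_g

def pwp_water_content(w_m, psi_m):
    # Equation 3-2: water content at -1.5 MPa (PWP), from the measured
    # water content w_m at the measured water potential psi_m (MPa, negative)
    return w_m * math.log(-1000.0 / -1.5) / math.log(-1000.0 / psi_m)

# Hypothetical 5 g subsample: 4.20 g wet, 4.00 g after oven drying
w_m = gravimetric_water_content(wet_g=4.20, oven_dry_g=4.00)
w_pwp = pwp_water_content(w_m, psi_m=-3.0)  # WP4T reading of -3.0 MPa
```

For a sample measured exactly at −1.5 MPa the correction factor equals 1, so the predicted PWP water content equals the measured water content.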

Loss on ignition (LOI) to determine soil organic matter content

The soil organic matter content analysis was done using the LOI method. Sample weights of 10 to 20 grams, taken using an analytical balance, were used. The samples were oven dried at 105 °C for 24 hours. The dry weight was recorded, and the samples were placed in a muffle furnace at 500 °C for 5 hours. The final weight was recorded, and the SOM content was

calculated in percentage using Equation 3-3.

LOI (%) = (Weight(105) − Weight(500)) / Weight(105) × 100 (3-3)
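Equation 3-3 reduces to a one-line calculation; the weights below are hypothetical:

```python
def loi_percent(weight_105_g, weight_500_g):
    # Equation 3-3: loss on ignition as a percent of the 105 °C dry weight
    return (weight_105_g - weight_500_g) / weight_105_g * 100.0

# Hypothetical sandy sample: 20.00 g at 105 °C, 19.50 g after 500 °C ignition
som_percent = loi_percent(20.00, 19.50)  # 2.5% organic matter
```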

Soil bulk density

An approximation of soil bulk density was done, using the core method described by

Blake and Hartge (1986) for disturbed samples. A cup of 5 mL volume was used to measure oven-dry soil (105 °C) and an analytical balance was used to take the sample weight. Equation 3-

4 was used to compute bulk density in g/mL.

BD (g/cm³) = Mass / Volume (3-4)


Soil color with the Munsell soil color charts

The Munsell Color Charts were used to classify soil color. The samples were placed in

small dishes and superficially wetted with DI water from a handheld spray bottle. The Hue,

Value and Chroma and respective Munsell color name were recorded. The Munsell color names

were used to define the soil color and to train the Munsell soil color model.

Soil spectra for CIE-L*a*b* color

Soil spectra of dry samples were collected using the visible light range, 400-700 nm, of a multispectral sensor (EPP2000-VIS-100 Spectrometer, StellarNet, Inc.). One spectral reflectance measurement, along with the spectral color code, was taken per sample. The L*a*b* color code was taken on the RGB channel. The StellarNet SpectraWiz software was used to process colorimetry as L*a*b*

values.

Sieving method for sand fractionation

The soil sieving method was modified from Gee and Bauder (1986) and Kroetsch &

Wang (2008). Soil sieving was done without sample pretreatment for removal of OM and iron

oxides. A set of sieves (USA STANDARD TEST SIEVE ASTME11 SPECIFICATION)

corresponding to the soil separates (2mm; 1mm; 0.5mm; 0.25mm; 0.125mm; 0.05mm + base)

was used to fractionate the classes of sand. In this method, a sample size ranging from 5 to 20

grams of soil was used depending on the availability of soil sample material. The samples were

shaken for five minutes at 430 rpm on an orbital shaker (NEW BRUNSWICK SCIENTIFIC,

Edison, N.J., USA). Equation 3-5 was used to calculate the percent of each soil separate (SS) and

determine the texture classes and subclasses for sandy soils (≥ 85% sand content), based on the

USDA classification (Soil Science Division Staff, 2017).

SS (%) = (Total mass in the sieve (g) − Sieve weight (g)) / Total sample weight (g) × 100 (3-5)


The five classes of sand were defined based on the percent of separates in each sample, as

established by the Soil Science Division Staff (2017), previously described in Chapter 1.
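A sketch of Equation 3-5 applied across the sieve stack; the sieve tare weights, loaded weights, and sample weight below are hypothetical:

```python
# Hypothetical weights (g); keys are sieve openings in mm
tare = {2.0: 350.0, 1.0: 330.0, 0.5: 310.0, 0.25: 300.0, 0.125: 290.0, 0.05: 280.0}
loaded = {2.0: 350.1, 1.0: 331.0, 0.5: 316.0, 0.25: 305.5, 0.125: 291.2, 0.05: 280.2}
sample_weight_g = 15.0

def separate_percent(loaded_g, tare_g, total_g):
    # Equation 3-5: percent of the sample retained on one sieve
    return (loaded_g - tare_g) / total_g * 100.0

ss = {mm: separate_percent(loaded[mm], tare[mm], sample_weight_g) for mm in tare}
# Material passing the 0.05 mm sieve (collected in the base) is the remainder
fines = 100.0 - sum(ss.values())
```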

Data Processing

A database of 2,105 images of soil was created, from which 1,605 images were used for

training and 500 images were used for testing. Sample images were cropped to a fixed resolution

of 2521x2521 pixels. For calibration, 80% of the samples were used for training and 20% of the

samples were used for validation. Another dataset of soil images was created for the Munsell

color analysis. The cropped 2521x2521 images were further cropped to a resolution of 380x380

pixels, resulting in 57,781 images. A database of soil physical and chemical variables was

created, which included all the data from laboratory analytical methods.

Training dataset

The distribution and number of soil images used to train each of the variables was

dependent on the variable and the method used to train the model, i.e. linear regression,

multiclass, or binary. All 1,605 images (2521x2521 pixels) were used to model

continuous variables, BD, SOM, PWP, L*, a*, and b* color. Multiclass image classification

method was used to model soil color based on the Munsell Color System using 13,000 (380x380

pixels) images (Table 3-2). For classification of sand classes, 1,584 images were used for the

multiclass method, divided in three classes: Coarse sand, Sand and Fine Sand. For binary image

classification of two sand classes, Sand and Fine Sand, 1,519 images were used. Table 3-3 shows

the distribution of samples per class for the binary and multiclass methods.

Test dataset for independent validation

An external database of soil images was used to test model performance. Each sample (100

samples) had 5 replicates of images, which were associated with one value on the database of

analytical data. These data were used to test model performance by comparing its predictions to the


analytical data. The samples for independent validation were collected from three primary citrus

production regions, Central Ridge, Indian River and South Florida. Samples were collected from

the first 30 cm of topsoil and the subsoil, 30 to 45 cm. The samples were collected between 2015 and 2017. One homogenized composite sample was formed from two primary samples collected using a three-inch bucket auger in the field. Samples were air dried, ground, and sieved through a

2 mm sieve. Analytical methods were also applied to analyze the test dataset.

Data Analysis

The pre-trained image classification model, EfficientNet-B4, was used to develop the

models for predicting the soil variables (Tan & Le, 2019). Training was conducted in a Jupyter

Notebook developed by Pérez and Granger (2018), using the Keras API, developed in 2015 by

François Chollet, written in Python 3, running on the TensorFlow framework version 2.4, an

open source platform developed by the Google Brain team (Abadi et al., 2016). A Linux server,

running the Ubuntu 18.04 operating system on a 64-bit Intel® Core™ i3-7100 CPU @ 3.90GHz

computer with 16 GB of RAM and an NVIDIA (NVIDIA Corporation, Santa Clara, CA, USA)

GeForce GTX 1080 Ti Graphics Card (GPU) was used to train the models.

The Adam optimizer (an algorithm for stochastic optimization), one of the most used

algorithms in deep learning machine vision, was utilized for training. It provides an adaptive learning rate and momentum, reducing the effective learning rate when dealing with complex

datasets (Kingma & Ba, 2015). Reducing the learning rate enables the network to learn complex

features, leading to improved performance. The initial learning rate (LR) was set to 0.005 during

transfer learning and reduced by 10x, to 0.0005, when fine-tuning. A loss function is used to

monitor model performance during training. The categorical cross entropy was used to compute

the loss values between the true class labels and predictions from the model (Zhang & Sabuncu,

2018). For the regression models, the loss function computed the mean squared error per sample, averaged over the batch. Training and validation accuracy and loss were the metrics used to evaluate

model performance. Accuracy calculates the frequency of agreement between the predictions

from the model and the true class labels. Automatic early stopping was activated to halt training

when no more improvement in validation accuracy occurred for five consecutive epochs.

Automatic LR reduction was set to reduce LR by a factor of 5 (0.2) when validation accuracy did

not improve for two epochs (Table 3-5).

Data management for linear regression

Simple linear regression in the final output layer of an EfficientNet-B4 model was used to

predict soil properties using digital images. Six EfficientNet-B4 models were trained using the

linear regression method as shown in Table 3-1. Before training, the dataset was examined to

evaluate the data distribution for each variable. Based on distribution of values, data

transformation was applied to all variables showing skewed distributions, as shown in Table 3-4, including log transformation (Equation 3-6), rescaling the values after log transformation (Equation 3-7) and normalizing the data (Equation 3-8).

• Log transformation

yi = log(xi) (3-6)

where y is the transformed variable, x corresponds to the untransformed values and log is the

natural log transformation function in Python.

• Rescaling the log transformed data

yi = xi + |xmin| (3-7)

where xi is the value of the input variable and yi is the rescaled value. This step was performed to

rescale all negative values resulting from log transformation, shifting them to values ≥ 0 by adding the absolute value of the minimum to all the data.


• Normalize the data

yi = (xi − xmin) / (xmax − xmin) (3-8)

where yi is the normalized value (0-1), xi is the value being normalized, xmin and xmax are the

range of values.
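The three transformations are applied in sequence; a minimal NumPy sketch with hypothetical skewed SOM values:

```python
import numpy as np

def transform(x):
    y = np.log(x)                    # Equation 3-6: natural log transform
    y = y + abs(y.min())             # Equation 3-7: shift negatives to >= 0
    return (y - y.min()) / (y.max() - y.min())  # Equation 3-8: normalize 0-1

som = np.array([0.2, 0.5, 1.0, 2.5, 8.0])  # hypothetical skewed values (%)
scaled = transform(som)
```

The inverse operations must be applied to model predictions to recover values on the original scale.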

Data management for training and validation

For calibration, a proportion of 80%:20% images were set for the training and validation

dataset, respectively. The images were normalized to pixel values ranging from 0 to 1, by

dividing the pixel values by 255, the maximum pixel value in a 24-bit RGB image. Data

augmentation was applied to the training dataset, including two geometric distortions, horizontal flip and vertical flip, with fill mode set to nearest. By applying data augmentation to the

training subset, the sample size was augmented two times, resulting from horizontal and vertical

flip. Fill mode was only used to maintain the true shape of the images after geometric distortions.

The nearest fill mode has no effect on image characteristics, as the two geometric distortions do

not leave empty spaces in the image. Data augmentation is a procedure carried out to artificially

generate a set of data to increase variability and sample size of the training dataset. Data

augmentation was not applied to the validation subset and the independent validation dataset.

Applying data augmentation improves model capability to recognize and correctly classify

images under variable ranges of image properties.
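A NumPy stand-in for the augmentation step (the study used the Keras image generator; this sketch only mirrors the pixel rescaling and the two flips):

```python
import numpy as np

def normalize_and_augment(image):
    # Rescale 24-bit RGB pixels to the 0-1 range
    x = image.astype(np.float32) / 255.0
    # Return the original plus horizontal- and vertical-flip copies,
    # mirroring the two geometric distortions applied during training
    return [x, np.fliplr(x), np.flipud(x)]

img = np.random.randint(0, 256, size=(380, 380, 3), dtype=np.uint8)
augmented = normalize_and_augment(img)
```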

Training methodology

Transfer learning using a pretrained model was used as the first step in training. During

transfer learning, a copy of the EfficientNet-B4 model was downloaded, which was previously

trained with 1,000 classes on the ImageNet dataset (Russakovsky et al., 2015). Only the base

model was used, and the classification head for the 1,000 ImageNet classes was removed. The


base model architecture for the EfficientNet-B4 model is comprised of 467 trainable layers. For

image classification models, during transfer learning, only three new selected layers attached to

the upper part of the base network were trained. The classification head included the following

layers: Global average pooling 2D layer (for two-dimensional images), Dropout layer (set to

0.5), and the Dense layer (classification layer), where the number of outputs corresponds to the

number of soil property classes. For linear regression, two newly added layers were used in

transfer learning. The prediction head of linear regression included Global average pooling 2D

layer and a Dense layer (prediction layer, with one linear output). The pretrained layers of the

base model remained frozen for transfer learning (represented in the upper part of the base

model, Figure 3-2, and Figure 3-3).

Global average pooling is applied directly over the feature maps, ahead of the classification layer, to avoid overfitting and regularize the network structure by converting feature maps into confidence maps of categories or classes (Lin et al., 2014). The Dropout layer, which is also

used to prevent overfitting of the training dataset, regularizes the training by randomly selecting

and setting half of the activations in the fully connected layers to zero (Srivastava et al., 2014).

The Dense layer applies a linear transformation followed by a non-linear activation function to make the final predictions (Huang et al., 2017). Dense layers are particularly important in

deeper networks to enable shorter connections between layers (Huang et al., 2017). The number

of dense units is set to the number of output classes. SoftMax is the non-linear activation function used in the Dense layer for multiclass models, and Sigmoid is the activation function used in the Dense layer for binary classification models. The linear activation function is used


to compute the output for linear regression model, with one output. The selection of these layers

was based on computational efficiency and improved model performance.
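A minimal Keras sketch of this transfer-learning head, assuming a three-class output for illustration; the study loaded ImageNet weights, while `weights=None` here only avoids the download:

```python
import tensorflow as tf

# EfficientNet-B4 base without the 1,000-class ImageNet classification head
base = tf.keras.applications.EfficientNetB4(
    include_top=False, weights=None, input_shape=(380, 380, 3))
base.trainable = False  # the base stays frozen during transfer learning

# New head: global average pooling, dropout (0.5), dense softmax output
x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
x = tf.keras.layers.Dropout(0.5)(x)
outputs = tf.keras.layers.Dense(3, activation="softmax")(x)

model = tf.keras.Model(base.input, outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.005),
              loss="categorical_crossentropy", metrics=["accuracy"])
```

For binary models the Dense layer would instead use a sigmoid activation, and for the regression models a single linear output with no Dropout layer.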

After transfer learning, fine tuning was done to train the models on the soil variables and

improve model performance. The process was carried out by unfreezing part of the network that

was frozen during transfer learning. The principle is that increasing the number of trainable

layers will increase model performance for the new set of classes. For linear regression models,

the model was fine-tuned by training the upper 33% of the network while the rest of the network

(66% of the model) remained frozen. To fine-tune the image classification models, both binary

and multiclass, the process was carried out in two steps. First, the lower 66% of the base model was frozen to train the upper 33% of its layers. Second, the upper 66% of the base model was trained by freezing its lowest 33%. The sequence of training is shown in Figure 3-2 and Figure 3-3.
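Framework aside, the partial unfreezing is an index calculation over the layer stack; `layers` below is only a stand-in for the 467 trainable layers of the base model:

```python
def unfreeze_top_fraction(layers, top_fraction):
    # Freeze everything below the cut; unfreeze the top `top_fraction`
    cut = int(len(layers) * (1.0 - top_fraction))
    for i, layer in enumerate(layers):
        layer["trainable"] = i >= cut
    return cut

layers = [{"trainable": False} for _ in range(467)]  # stand-in base model
unfreeze_top_fraction(layers, 0.33)                  # first fine-tuning step
n_trainable = sum(layer["trainable"] for layer in layers)
```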

Training CNN-based Linear Regression Models to Predict SOM, BD, PWP, L*a*b* Color

Total sample size used to train models for SOM, BD, PWP, L*, a* and b* is shown in

Table 3-1. The training dataset was subdivided into 80% for training and 20% for validation. The

image input size to the model was 380x380, batch size 32 and the number of training epochs was

set to 50. The number of iterations (steps) per training epoch was 41 and for validation was 11.

The number of iterations was computed using Equation 3-9 and Equation 3-10.

Steps per epoch = Number of training samples / Training batch size (3-9)

Validation steps = Number of validation samples / Validation batch size (3-10)
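As a worked check of Equations 3-9 and 3-10, the 41 training and 11 validation steps quoted above follow from the 80%:20% split of 1,605 images with batch size 32, when partial final batches are rounded up:

```python
import math

def steps(n_samples, batch_size):
    # Equations 3-9 and 3-10, rounding up so the final partial batch runs
    return math.ceil(n_samples / batch_size)

n_train = int(1605 * 0.8)         # 1,284 training images
n_val = 1605 - n_train            # 321 validation images
train_steps = steps(n_train, 32)  # 41
val_steps = steps(n_val, 32)      # 11
```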

All models were trained in two steps, transfer learning and fine-tuning with 33% of the

upper layers. After completion of each training step, model progress and best weights were saved


to proceed to the next step in training (e.g., fine-tuning). After training, the models were

evaluated on the validation dataset and on independent validation samples.

Training the EfficientNet-B4 Model for Munsell Color Classification

Three sets of Munsell soil color classes were created. The classes were defined based on

the Munsell color names, resulting from the Munsell color notations. The decision to use color

names over the notations was intended to reduce the number of classes by grouping the Munsell

notations with the same name into classes. However, this approach caused confusion among the

classes, because of the differences in hue, value and chroma. The classes were edited to remove

samples with noticeable differences in value and chroma (Figure 3-4).

Three multiclass models of three classes each were trained to test the model’s ability to

recognize and differentiate soil color. Table 3-2 shows the number of samples per class and the

sample size used to train the models. From the total training dataset, 80% was used for training

and 20% for validation. The image input size to the model was 380x380, batch size was set to 24

and the number of training epochs was initially set to 50. For Model1, the number of steps per

epoch in training was 143 and 35 in validation. For Model2, the number of training steps per

epoch was 140 and the number of steps per epoch for validation was 38. For Model3, the number of training steps per epoch was 149 and the number of steps for validation was 38, computed

using Equations 3-9 and 3-10, respectively.

Training was conducted in two steps, transfer learning and fine-tuning with the upper

33% of the base model unfrozen (Figure 3-2). Model progress and best weights were saved before proceeding to the next step of training or model testing. Training performance

was tested on the validation dataset (Table 3-2). The variables used to assess model performance

were precision, recall, F1 score and accuracy. A confusion matrix was generated with SciKit-

Learn to visualize conflicting classes, those with similar features that the model was not able to


distinguish (false predictions). Finally, model performance was tested on an external dataset.

Confusion matrices were also used to visualize the results on the test dataset.

Training a Multiclass Image Classification Model for Sand Texture

Three sand texture classes were used to develop the sand texture image classification

model: Coarse sand, Sand and Fine sand (Table 3-3). The number of images used for training

was 1,584 (2521x2521), from which a proportion of 80%:20% was used for training and

validation, respectively. Image input size to the model was 380x380, the batch size was 24 and

number of epochs was set to 50. The number of steps per epoch in training was 52 and 13 in

validation, computed using Equations 3-9 and 3-10, respectively. The model was trained in three

steps: transfer learning, fine-tuning 33% and fine tuning 66%. The procedures employed after

training, including model performance evaluation, were the same as those described for the Munsell soil

color models. The performance was tested on the validation dataset (316 images) and the

independent validation dataset (499 images).

Training a Binary Image Classification Model for Sand Texture

The sample size of the Coarse sand class in the previous model was only 65 images,

compared to 720 and 800 images, of Sand and Fine sand, respectively. The Coarse Sand class

was removed to train a binary classification of only Sand and Fine sand classes. A total of 1,519

images were used to train the model, using a proportion of 80%:20%, for training and validation,

respectively. Training parameters were the same as for the multiclass sand texture classification.

The number of steps per epoch in training was 50 and in validation 12, computed using

Equations 3-9 and 3-10, respectively. The model was trained for 47 epochs, divided between

transfer learning, fine-tuning 33% and fine tuning 66%. Model progress and best weights were

saved before proceeding to the next step of training or model testing (Figure 3-2). After


training, the subsequent procedures were the same as those described for the previous image

classification models.

Statistical Analysis to Evaluate Model Performance

The statistical analysis conducted for hypothesis testing included root mean square error

(RMSE), and coefficient of determination (R2) for linear regression models. F1 score, precision

and recall were calculated for the image classification models. Analyses were conducted using

Python 3 on the Jupyter notebook. Model performance was evaluated as training progressed,

using training accuracy and loss values. Validation accuracy and loss were assessed on the

validation dataset (20%). For linear regression, model progress was monitored on validation loss,

computed as the mean squared error. The best fit model had high accuracy and low loss values,

for both training and validation subsets. An equilibrium between the accuracy and loss during

training and validation is required to exclude the possibility of overfitting or underfitting.
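The balance described above was enforced with a patience rule (Table 3-5); a minimal plain-Python sketch, assuming "no improvement" means the best validation loss has not decreased (the thesis monitored both validation accuracy and loss):

```python
def should_stop(val_losses, patience=5):
    """Stop training once the best (lowest) validation loss is more than
    `patience` epochs old, a simple early-stopping rule."""
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    return (len(val_losses) - 1 - best_epoch) >= patience

# Loss last improved at epoch 1 and then stalled for five epochs
history = [0.50, 0.40, 0.41, 0.42, 0.43, 0.44, 0.45]
print(should_stop(history))   # stalled long enough to stop
```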

Usually, unbalanced training parameters, such as unequal sample sizes between classes, or an inappropriate solver (algorithm) or classification/prediction head, are the main causes of imbalances in model performance. The variables used to assess validation performance for the trained classes were obtained with SciKit-Learn's classification_report function (Pedregosa et al., 2011), which calculates accuracy, precision, recall, and F1 score (Equations 3-11 to 3-14).

Accuracy. Accuracy is the ratio of the total number of correct predictions to the total number of observations. It is computed to evaluate model performance using the averaged class probability results. It is important to note that, in general, the accuracy value alone does not represent model performance, which is better evaluated using precision, recall, and F1 score.

Accuracy = Total Number of Correct Predictions / Total Number of Observations (3-11)


Precision. Precision is the ratio of true positives to the total number of samples predicted as positive (true positives plus false positives). It indicates the model's capacity to correctly classify objects based on their true labels, without confusing true positives with false positives.

Precision = True Positives / (True Positives + False Positives) (3-12)

Recall. Recall, also called sensitivity, is defined as the ratio of true positives to the actual positives (true positives plus false negatives). It shows the model's ability to correctly identify the true positives in a class.

Recall = True Positives / (True Positives + False Negatives) (3-13)

F1 score. The F1 score is the harmonic mean of precision and recall, showing the combined impact of false positives and false negatives on model performance. When comparing the performance of different models trained under the same circumstances, the F1 score is more suitable for assessing performance.

F1 Score = 2 × (Precision × Recall) / (Precision + Recall) (3-14)
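The four metrics were obtained with SciKit-Learn's classification_report; an equivalent plain-Python sketch of Equations 3-11 to 3-14 (the labels below are illustrative, not thesis data):

```python
def per_class_metrics(y_true, y_pred, positive):
    """Precision, recall and F1 for one class (Equations 3-12 to 3-14)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def accuracy(y_true, y_pred):
    """Equation 3-11: correct predictions over all observations."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Illustrative labels only (not the thesis data)
y_true = ["Sand", "Sand", "Fine sand", "Fine sand", "Fine sand"]
y_pred = ["Sand", "Fine sand", "Fine sand", "Fine sand", "Sand"]
```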

Confusion matrix. The results from the model predictions were used to build confusion matrices, generated with SciKit-Learn to visualize where confusion occurs (Pedregosa et al., 2011). The confusion matrix contrasts the true labels with the predicted labels, showing the percentages of true and false predictions for each class.
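A minimal plain-Python equivalent of the confusion matrix used here, with rows as true labels and columns as predictions (raw counts rather than percentages; labels are illustrative):

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows = true labels, columns = predicted labels (raw counts)."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

# Illustrative labels only
labels = ["Black", "Brown"]
y_true = ["Black", "Brown", "Brown"]
y_pred = ["Black", "Black", "Brown"]
```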

Root Mean Square Error (RMSE). RMSE was used to evaluate the performance of the linear regression models on the validation dataset (Equation 3-15). The RMSE measures the average deviation of the predicted values from the true values.

RMSE = √( (1/n) Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)² ) (3-15)

where ŷᵢ are the values predicted by the model and yᵢ are the true values (analytical data). Before prediction, the variables were reverted to their original range, as they had previously been transformed for training.
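The rescaling listed in Table 3-4 and its inversion can be sketched as follows; the function names are illustrative, and the bulk density range is taken from Table 3-6:

```python
def rescale(x, lo, hi):
    """Min-max rescale a value to the 0-1 range used for training."""
    return (x - lo) / (hi - lo)

def invert_rescale(z, lo, hi):
    """Map a 0-1 prediction back to the variable's original range."""
    return z * (hi - lo) + lo

# Bulk density range from Table 3-6 (g/cm3)
lo, hi = 0.901, 1.547
z = rescale(1.339, lo, hi)
bd = invert_rescale(z, lo, hi)   # recovers 1.339
```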

Coefficient of determination (R2). R2 was used to evaluate model fit, comparing the measured and predicted values for each variable (Equation 3-16), where μ is the mean of the measured values.

R² = 1 − Σᵢ(yᵢ − ŷᵢ)² / Σᵢ(yᵢ − μ)² (3-16)
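Equations 3-15 and 3-16 in plain Python (a sketch; the sample values are illustrative):

```python
import math

def rmse(y_true, y_pred):
    # Equation 3-15: root of the mean squared deviation
    n = len(y_true)
    return math.sqrt(sum((y - yhat) ** 2 for y, yhat in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    # Equation 3-16: 1 minus residual over total sum of squares
    mu = sum(y_true) / len(y_true)
    ss_res = sum((y - yhat) ** 2 for y, yhat in zip(y_true, y_pred))
    ss_tot = sum((y - mu) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

# Illustrative values only
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
```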

Evaluating Model Performance on the Independent Soil Dataset

The Binary and Multiclass image classification models were tested on external datasets.

The models were evaluated on precision, recall, F1 score and accuracy. A confusion matrix was

also used to evaluate performance on the unknown dataset.

Results and Discussion

In this study, 321 soil samples were used to model six continuous soil variables (SOM, PWP, BD, and the L*, a*, and b* soil color values), the texture of sandy soils, and soil color in the Munsell system. Table 3-6 contains a summary of the descriptive statistics of the six continuous variables. All variables had a wide range of values within their limits. Figure 3-5 shows the distribution of values for each of the continuous variables, with SOM and PWP showing very skewed distributions. Figure 3-6 shows an improved distribution of values after data transformation. A total of 51 Munsell color notations and 22 Munsell color designations resulted from the 321 samples (Table 3-7). Due to the high complexity of the color data, not all samples were used to develop the soil color classification models. The test dataset contained 15 Munsell color designations and 18 Munsell color notations (Table 3-8). A total of 317 samples were used to train the multiclass and the

binary image classification of sand texture classes (Table 3-9). From the 321 soil samples, three

samples did not meet the requirement of 85% sand texture to be classified as sandy soil, and one

sample did not have enough material remaining for classification using the sieving method. All

samples in the test dataset were used to test model performance, also shown in Table 3-9.

The following sections present the results of model development for all variables. First,

the results of the calibration process are presented, including the linear regression models, the multiclass and binary classification of sand textural classes, and the multiclass classification of color classes based on the Munsell color name. Second, the results of model performance on the validation dataset are presented, followed by the results of model testing on the independent validation dataset for the multiclass and binary classification models.

Training and Validation of the CNN Linear Regression Models

The linear regression approach was applied to train six soil variables: SOM (%), PWP

(%), BD (g/cm3), lightness of soil color (L*), green-red values (a*) of soil color, and blue-yellow

(b*) values of soil color. All models were trained under the same conditions. Training was carried out until the models reached a mean squared error (loss) of MSE < 0.009. Most models reached validation losses below 0.009, except blue-yellow (b*), whose validation loss was 0.0103. The results of training, presenting the lowest training and validation losses, are shown in Table 3-10.

Performance of the CNN Linear Regression Models on the Validation Dataset

The results of the linear regression models are shown in Figures 3-7A through 3-7F. Among the three CIE color measurement variables, the predicted values on the green-red a* axis showed the most agreement with the actual values measured with the spectrophotometer. Figure 3-7A shows a relatively low prediction error (RMSE = 0.858) for the predicted values compared to the values measured by the equipment. The model explained 77% of the variation of the predicted values (R2 = 0.77). The model tends to overestimate values at the margins of the green-red axis, more pronounced beyond positive values of 6. Less variation was observed for values between 0 and 5, which was also where most samples were centered (Figure 3-5). A different trend was seen with

the prediction of soil color on the blue-yellow axis (b*). Prediction accuracy varied with the intensity of yellow. Most of the variation was observed for values between 5 and 20, with an RMSE of 3.274 and 60% of the variation explained by the model (R2 = 0.60). The non-uniform distribution of values seemed to have the most influence on model performance. Most of the

samples had b* values ranging from 5 to 20 and very few samples had values above 20 (Figure

3-4). A similar trend was observed with the L* (black to white axis) color values (Figure 3-7C),

with the model showing good performance on samples with light tonalities (L* > 70). The model explained 59% of the variation (R2 = 0.59). High deviation was observed (RMSE = 5.715), especially on samples with L* < 80, with a tendency toward overestimation. Fewer samples had L* > 70

compared to samples with darker tonalities (Figure 3-4). The uniqueness of these Florida soil

samples (e.g., low SOM) might have contributed to model performance. On the other hand, in

samples with darker tonalities, the subtle variation in tonalities might have caused confusion,

increasing prediction error. The same is true for the other color variables (a* and b*), where small differences in tonality might lead to different predicted results. Sample size is another factor affecting model performance, with increased error in value ranges containing few samples.

Model prediction of soil bulk density (g/cm3) showed a relatively low performance

(Figure 3-7D). Prediction of BD values >1 g/cm3 showed better agreement with the measured

values. Most of the variability was observed between 1.2 g/cm3 and 1.5 g/cm3, which coincides

with the range that contains the greatest sample size (Figure 3-4). The model explained 56% of

variation in prediction, with an RMSE of 0.072. There was a tendency to overestimate BD when measured values were below 1.2 g/cm3 and to underestimate values greater than 1.3

g/cm3. Soil color might have influenced model performance (increased error), especially in

samples with high organic matter content, which tend to have low bulk density, with the model

overestimating the BD values. Sample size was found to be a determinant of prediction performance, as shown by the pronounced error for low bulk density values, which had very few samples.

The predictive model for SOM (%) using the LOI method (Figure 3-7E) produced the best results among all the models, with an RMSE of 0.857 and 86% of the variation (R2 = 0.86) explained by the model. Good agreement was observed when LOI values were below 5%.

Samples with SOM content above 5% contributed the most to prediction error, with the model

underestimating SOM content. This performance can be attributed to the fact that OM is a soil property that can be visually observed. The black pigmentation from the presence of humic compounds is the main feature used to differentiate levels of OM in soils. Some samples contained non-decomposed OM, which was counted in the total SOM content when using the LOI (complete disintegration of OM) method to measure SOM. Non-decomposed OM has different visual features from the humic compounds, which might be the source of the increased prediction error from the underestimation of SOM content. The model was able to generalize well on

samples with low OM content but was less confident on samples with high OM content. Most of

the samples in the dataset had OM content less than 10% (Figure 3-4), with the highest number

of samples containing less than 5% OM. As observed with the previous models, the low sample

size had an influence on the observed results in the upper range of OM content.

The predictive model of the permanent wilting point (PWP) showed good performance

on samples with PWP < 5% (Figure 3-7F). As PWP values increase from 5% to approximately 14%, the model tends to underestimate the water content at PWP. Some deviation is observed in the range of approximately 2% to 4%. The RMSE of this model was 1.052, with the model

explaining 65% of the variation (R2 = 0.65). The PWP regression plot shows an inverse

relationship with the BD (Figure 3-7D), and a direct relationship with OM content (Figure 3-7E),

also shown in Figure 3-4, where most of the values are below 5% of water content at PWP. Soil

BD and SOM are directly related to particle size, soil mineralogy, and soil matric potential, which influence soil water retention capacity (Brady & Weil, 2008; Hillel, 1998). Most of the

samples included in this study were classified as sandy soils, with varying fractions of sand, and

different content of silt and clay, which have different water potential at PWP (Campbell et al.,

The water content at PWP might also be influenced by the OM and the finer fractions of soil separates (silt and clay), as shown by the few samples with high water content.

Based on the distribution of values shown in Figure 3-4, these samples might also have low BD

and high OM content. Soil organic matter might be the visual feature used by the model to

predict soil water content at PWP. Particle size might be another feature used by the model in the

prediction process, shown by the inverse relationship between the PWP and BD.

Training and Validation of the Multiclass Munsell Soil Color Classification

Due to the complexity of the soil color data, the soil color models were trained using a small subset of color classes. When training with the entire dataset, the model was not able to discern between closely related colors, such as those with the same Hue and Value but different Chroma (Table 3-7). Also, some classes had too few samples to train on. Including all classes in training resulted in overfitting, and training was stopped. Therefore, three models

were developed to predict soil color based on the Munsell soil color names (Table 3-2). Model1

(Figure 3-8A) was trained for three soil colors: black, brown, and gray. Model training and validation were carried out for 27 epochs, reaching training and validation accuracies of 99.8% and 99.52%, respectively. The loss values reached 0.0054 and 0.011 for the training and validation subsets, respectively. Model2 was trained on three other classes of soil color: very dark

gray, dark brown and light olive brown (Figure 3-8B). Training was carried out for 21 epochs,

reaching the maximum validation accuracy of 100%, with a training accuracy of 99.64%.

Validation loss was 0.001 and training loss was 0.0083. Model3 was trained for the very dark gray,

dark yellowish brown and dark grayish brown color classes (Figure 3-8C). The model was

trained for 28 epochs. The best training accuracy was 98.68%, while the validation accuracy was

82.32%. The loss values were 0.038 and 0.7, for training and validation, respectively.

Performance of Munsell Soil Color Classification Models on the Validation Dataset

The three models trained for classification of soil color achieved good performance on

the validation dataset. Model1 had an excellent classification performance (Figure 3-9A), with

99.3%, 100%, and 99.3% correct predictions for the black, brown, and gray colors, respectively.

Minor confusion was observed in the prediction of the black color, with 0.7% of the samples classified as brown. Precision, recall, and F1 score for the black color were each 99% (Table 3-11). All soil samples of brown color were correctly predicted, with 100% recall; however, the model's precision in predicting brown was 99%, due to confusion with a black soil sample. Finally, the model's prediction of the gray color had precision, recall, and F1

scores of 100% as no other class was wrongly predicted as gray. The performance of Model2 on

the validation dataset (Figure 3-9B) was the best among all models, with 100% correct predictions for very dark gray, dark brown, and light olive brown. Table 3-12 shows the

precision, recall, F1 score, and overall model accuracy of 100%. Model3 did not perform as well as the first two models (Figure 3-9C), with 94.1%, 58.9%, and 93.1% true predictions for the soil color classes very dark gray, dark grayish brown, and dark yellowish brown, respectively. Table 3-13 shows an overall accuracy of 82%; the same percentage was obtained for recall and F1 score, while overall precision was 85%. Considerable confusion was observed between dark

grayish brown and dark yellowish brown, with F1 scores of 71% and 79%, respectively.

Performance of Munsell Soil Color Classification Models on the Independent Validation

Dataset

The three classifier models were tested on external soil color data shown in Table 3-8.

Model1 (Figure 3-10A) had moderate classification performance, with an overall accuracy of 67% (Table 3-14). The best classification results were for the black color class, with 90% true predictions. There was considerable confusion among all classes: the brown color class had 60% true predictions, with 40% of its samples misclassified as black. The lowest classification performance was for the gray soil color class, with 55.3% true predictions and 44% of the samples classified as brown. The best performing model, Model2 (Figure 3-

10B), correctly classified all light olive brown samples. The classification performance for very dark gray soil color was 83.3% true predictions and 16.7% false predictions. The overall model accuracy was 85%, with low precision in the classification of light olive brown samples (44%) and 83% recall for the very dark gray samples (Table 3-15). The dark brown soil color

class was not included in the confusion matrix of Model2 (Figure 3-10B) because none of the

samples in the test dataset were classified as dark brown (Table 3-8). Model3 had the lowest

performance in calibration, which was reflected in the model’s ability to classify unknown soil

samples. The model was able to correctly classify 83.8% of very dark gray samples, with 10.8%

of samples classified as dark grayish brown and 5.4% as dark yellowish brown. Classification of dark grayish brown yielded 50.9% true predictions, with 21% of samples predicted as very dark gray and 27.3% as dark yellowish brown. The model was not able to distinguish dark yellowish

brown from dark grayish brown. All samples in the dark yellowish brown class were classified as dark grayish brown; as a result, dark grayish brown had a precision of 60% (Table 3-16).


Training and Validation of the Multiclass and Binary Classification Models for Textural

Classes of Sandy Soils

Three classes of sand were defined from soil sieving: Coarse sand, Sand, and Fine sand, based on the USDA classification. A multiclass model was trained to identify these three classes of sand. The model was trained for 44 epochs, including transfer learning and fine-tuning.

The highest training and validation accuracy values were 98.79% and 92.31%, respectively. The

loss values at the best training and validation epochs were 0.038 and 0.314, respectively (Figure

3-11A). Figure 3-11B shows the training progress of the binary image classification model for the Sand and Fine sand textural classes. The model was trained for 47 epochs, reaching a validation accuracy of 94.1% with a loss of 0.20, and a training accuracy of 98.99% with a loss of 0.05.

Performance of the Multiclass and Binary Image Classification Models for Textural

Classes of Sandy Soils on the Validation Dataset

Figure 3-12A presents the confusion matrix of the multiclass model. The model had a

relatively good performance at predicting Sand and Fine sand texture classes with 90.5% and

96.2% of true predictions, respectively. Low performance was shown in classification of Coarse

sand, with 53.8% of true predictions. Table 3-17 shows 92% overall accuracy in classification.

Most of the false predictions for the Coarse sand class were confused with the Sand class (46.2%). The recall (54%) and F1 score (67%) for Coarse sand were low despite a precision of 88%, showing the model's failure to correctly differentiate the Coarse sand and Sand texture classes.

Minor confusion was observed between Fine sand and Sand classes, with 8.8% of samples in the

Sand class being classified as Fine sand. The Sand class's precision (92%), recall (90%), and F1 score (91%) were better than the values observed for Coarse sand. A similar trend was noted with the

Fine sand class, with the best classification results: precision of 92%, recall 96% and F1 score of

94%. Of the 316 validation images (Table 3-17), only 13 were Coarse sand samples; similarly, only 52 Coarse sand samples were used to calibrate the model in training. On the other hand, 156 and 147 images of Fine sand and Sand, respectively, were used in the validation

dataset. The low number of Coarse sand samples in training limited the model’s ability to

adequately learn the features of Coarse sand and, therefore, to differentiate Coarse

from Sand and Fine sand textured sandy soils. Poor model performance during validation is often

attributed to imbalanced data in the classes (Buda, Maki, & Mazurowski, 2018).
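One common mitigation for such imbalance (not reported as used in this study) is inverse-frequency class weighting during training; a sketch using the multiclass training counts from Table 3-3:

```python
def class_weights(counts):
    """Inverse-frequency weights, normalized so they average to 1:
    rarer classes contribute proportionally more to the loss."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {c: total / (n_classes * k) for c, k in counts.items()}

# Training counts for the multiclass sand model (Table 3-3)
weights = class_weights({"Coarse sand": 52, "Sand": 576, "Fine sand": 640})
```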

The binary classification of Sand and Fine sand textures performed better than the

multiclass model, with an overall validation accuracy of 94% (2% greater than that of the multiclass model), shown in Table 3-18. There was less confusion between the two classes, as well as a

similar level of prediction error. Figure 3-12B shows 94.5% of true prediction of Fine sand

textured soils, with 5.5% of false predictions. The model achieved 94.3% of true prediction of

Sand textured soils, with 5.7% of false predictions. The values of precision, recall, and F1 score shown in Table 3-18 indicate an improved performance compared to the multiclass model. The elimination of the confusing class (Coarse sand, with a low sample count) benefited the model's ability to differentiate soil texture, although the model still had difficulty distinguishing the two texture classes. The potential of using this method to estimate soil texture is clearly demonstrated by the results of this study, despite the complexity of soil properties.

Performance of the Multiclass and Binary Models for Textural Classes on the Independent

Validation Dataset

The multiclass classifier (Figure 3-13A) did not perform well in classifying unknown samples. For the Fine sand texture, the model achieved 56.7% true positives, and 43.3% of the samples were misclassified as Sand. For the Sand texture class, 60.5% of samples were correctly classified, with 39.5% misclassified as Fine sand. The model was unable to correctly classify the Coarse sand texture: 85.9% of samples were identified as Sand and 14.1% as Fine sand. The binary classification model (Figure 3-13B) performed


better on the external dataset, with 60% true positives for the Fine sand class and 72.1% true positives

for the Sand texture class. Overall accuracy increased from 50% for the multiclass model (Table 3-19) to 71% for the binary classification model (Table 3-20), with both models showing better performance

in classification of Sand texture class.

Soil color, particle size, and shape might be the image features contributing the most to

the model’s predictive capacity of soil variables. In general, color is a major factor used by deep

learning models to learn object features, since the CNN learning method uses RGB pixel-based

input data. In this study, soil pigmentation was important in the prediction of soil color variables

CIE-L*a*b* and Munsell systems, as well as SOM, with dark pigmentation from humic substances and the colors of soil minerals. Other features learned by deep and wide CNNs such as EfficientNet-B4 are the shape and size of objects, such as soil particle size.

Particle size and shape might have played an important role in the prediction of BD, water content at PWP, and soil texture, with interference from OM. Soil color might have been a determinant of the prediction error of BD, especially in samples with high OM content, which tend to have low

BD. Deeper and wider models require high image resolution to maximize their potential in

prediction of image properties. The resolution of images used to train the linear regression,

multiclass, and binary classification models might have contributed to the good predictive and classification performance. Sample size was a determinant of lower prediction accuracy in value ranges with few samples. Deep learning models perform well when trained with abundant data and balanced sample sizes, which was a challenge in this study. Transfer learning and fine-tuning contributed the most to improved performance when using pretrained networks such as EfficientNet-B4. Nevertheless, the complexity of soil properties mostly

resulting from spatial variability was a challenge to achieve high accuracies.


Table 3-1. Sample size for training and validation of each variable and the method used. The image size for BD, SOM, PWP, CIE-L*a*b* and the sand classes was 2251x2251, and for Munsell color notation it was 380x380.

Soil variable                            Training (80%)  Validation (20%)  Testing  Method
Bulk Density (BD), Soil Organic
Matter (SOM), Permanent Wilting
Point (PWP), L* (black - white),
a* (green - red), b* (blue - yellow)     1,284           321               500      Simple linear regression
Munsell Color Notation                   13,000          2,600             500      Multiclass
Sand classes                             1,216           303               414      Binary
Sand classes                             1,268           316               499      Multiclass

Table 3-2. List of classes and respective sample sizes used to train the Munsell color image classification models.

Model    Class                 Training  Validation  Total
Model 1  Black                 1,148     287
         Brown                 1,148     287
         Gray                  1,148     287         4,305
Model 2  Very dark gray        1,148     287
         Dark brown            1,220     304
         Light olive brown     1,008     252         4,219
Model 3  Very dark gray        1,148     287
         Dark grayish brown    1,148     287
         Dark yellowish brown  1,285     321         4,476

Table 3-3. List and number of classes used to train the multiclass and binary classification models for sand texture classes.

            Coarse sand           Sand                  Fine sand
Method      Training  Validation  Training  Validation  Training  Validation  Total
Multiclass  52        13          576       147         640       156         1,584
Binary      -         -           576       140         640       163         1,519

Table 3-4. Data transformation methods applied to train the linear regression models.

Variable              Data transformation method                    Description
LOI                   Log transformation                            To meet normal distribution
                      Rescaling the log-transformed data by         To eliminate the negative values generated
                      adding the absolute minimum                   after log transformation
PWP                   Rescale                                       To rescale (normalize) values from 0-1
BD                    Rescale                                       To rescale (normalize) values from 0-1
L* (black - white),
a* (green - red),
b* (blue - yellow)    Rescale                                       To rescale (normalize) values from 0-1


Table 3-5. Hyperparameters used in training and validation of the five models.

Parameter                                  Value      Description
Target size                                380x380x3  Image input size for the EfficientNet-B4 network
Batch size (image classification models)   24         Number of images in a batch for the training and validation subsets; selected considering server computation capability
Batch size (linear regression models)      32         Number of images in a batch for the training and validation subsets; selected considering server computation capability
Patience                                   5          Number of training epochs without improvement in validation accuracy, after which training is stopped
Alpha (transfer learning)                  0.005      Learning rate for the EfficientNet-B4
Alpha (fine-tuning)                        0.0005     Learning rate for the EfficientNet-B4
Learning rate reduction factor             0.2        Reduce the learning rate by a factor of five when the validation accuracy did not improve for two epochs
Minimum learning rate                      0.0000001  The lowest learning rate reached when the validation accuracy plateaus
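The learning-rate schedule in Table 3-5 can be sketched in plain Python (the function name is illustrative; "no improvement" is taken to mean validation accuracy stalling for two epochs):

```python
def next_learning_rate(lr, epochs_without_improvement,
                       factor=0.2, min_lr=1e-7):
    """Multiply the learning rate by `factor` (i.e., reduce it five-fold)
    once validation accuracy has stalled for two epochs, never dropping
    below the minimum learning rate (Table 3-5)."""
    if epochs_without_improvement >= 2:
        return max(lr * factor, min_lr)
    return lr

# Transfer-learning rate 0.005 stalls for two epochs -> reduced toward 0.001
reduced = next_learning_rate(0.005, 2)
```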

Table 3-6. Summary of descriptive statistics of the continuous variables.

Soil variable       Minimum  Maximum  Mean    Median  Standard deviation  CV (%)
LOI (% w/w)         0.279    18.000   2.720   2.134   2.127               78.241
PWP (% w/w)         0.052    14.295   1.806   1.348   1.643               90.971
BD (g/cm3)          0.901    1.547    1.339   1.350   0.103               7.727
L* (black - white)  32.110   86.620   52.410  51.280  8.380               16.004
a* (green - red)    -3.347   11.830   2.521   2.247   1.862               73.858
b* (blue - yellow)  1.839    28.260   11.980  12.430  5.254               43.870

Table 3-7. Munsell color notations and names of the training and validation dataset. Total number of samples = 321, with 22 Munsell color classes from 51 Munsell color notations.

Munsell color name       Munsell color notation  #Samples
Black                    2.5Y2.5/1               13
                         10YR2/1                 13
                         5Y2.5/1                 1
                         5YR2.5/1                1
                         7.5YR2.5/1              3
Very dark brown          10YR2/2                 5
                         7.5YR2.5/3              2
Very dark gray           10YR3/1                 11
                         2.5Y3/1                 28
Dark grayish brown       10YR4/2                 9
                         2.5Y3/2                 1
                         2.5Y4/2                 34
Very dark grayish brown  10YR3/2                 7
                         2.5Y3/2                 28
Dark brown               10YR3/3                 10
                         7.5YR3/4                3
Dark yellowish brown     10YR3/4                 2
                         10YR3/6                 1
                         10YR4/4                 5
Dark gray                10YR4/1                 9
                         2.5Y4/1                 30
Brown                    10YR4/3                 11
                         10YR5/3                 2
                         7.5YR4/2                1
Strong brown             7.5YR4/6                3
Gray                     7.5YR4/4                1
                         10YR5/1                 6
                         10YR6/1                 1
                         2.5Y5/1                 11
                         2.5Y6/1                 1
                         5Y5/1                   1
Grayish brown            10YR5/2                 1
                         2.5Y5/2                 12
                         2.5Y6/2                 1
Light olive brown        10YR5/3                 1
                         2.5Y5/3                 4
                         2.5Y5/4                 4
Yellowish brown          10YR5/4                 3
                         10YR5/6                 3
                         10YR5/8                 1
Dark olive brown         2.5Y3/3                 11
Olive brown              2.5Y4/3                 12
                         2.5Y4/4                 7
Light brownish gray      2.5Y6/2                 1
Light gray               2.5Y7/1                 2
Dark red                 2.5YR3/6                1
Olive                    5Y4/3                   1
Dark reddish brown       5YR3/2                  1
Yellowish red            5YR4/6                  1

Table 3-8. Munsell color notations and names of the independent validation dataset. A total of 15 color classes from 18 Munsell color notations of 100 soil samples.

Munsell color name       Munsell color notation  #Samples
Black                    2.5Y2.5/1               7
Brown                    10YR5/3                 1
Dark gray                2.5Y4/1                 11
Dark grayish brown       2.5Y4/2                 11
Dark olive brown         2.5Y3/3                 1
Dark yellowish brown     10YR4/4                 1
Gray                     2.5Y6/1                 3
                         2.5Y5/1                 6
Grayish brown            2.5Y5/2                 4
Light gray               2.5Y7/1                 1
Light olive brown        2.5Y5/3                 6
                         2.5Y5/4                 1
Light yellowish brown    2.5Y6/3                 2
Olive brown              2.5Y4/3                 6
                         2.5Y4/4                 3
Very dark gray           2.5Y3/1                 26
Very dark grayish brown  2.5Y3/2                 8
White                    2.5Y8/1                 2
Total # samples                                  100

Table 3-9. Number of samples used for training/validation (317 samples) and independent validation (100 samples) of sand texture classes with the binary and multiclass methods.

Texture class  Training and validation  Independent validation
Coarse sand    13                       17
Fine sand      160                      12
Sand           144                      71

Table 3-10. Training and validation results of the linear regression models: soil color (CIE-L*a*b*), bulk density (BD), permanent wilting point (PWP), and soil organic matter content through loss on ignition (LOI).

Model                    Training loss (MSE)  Validation loss (MSE)  Training epochs
a* (green - red)         0.0042               0.0053                 49
b* (blue - yellow)       0.0048               0.0103                 62
L* (black - white)       0.0037               0.0044                 45
Bulk density             0.0020               0.0022                 50
Permanent Wilting Point  0.0058               0.0067                 31
Loss on Ignition         0.0020               0.0040                 50


Table 3-11. Classification performance of Munsell soil color Model1 in prediction of three soil colors: Black, Brown, and Gray.

Classes       Precision  Recall  F1-score  n
Black         99         99      99        287
Brown         99         100     100       287
Gray          100        100     100       287
Accuracy                         100       861
Weighted avg  100        100     100       861

Table 3-12. Classification performance of Munsell soil color Model2 in prediction of three soil colors: Very dark gray, Dark brown, and Light olive brown.

Classes            Precision  Recall  F1-score  n
Very dark gray     100        100     100       287
Dark brown         100        100     100       304
Light olive brown  100        100     100       252
Accuracy                              100       843
Weighted avg       100        100     100       843

Table 3-13. Classification performance of Munsell soil color Model3 in prediction of three soil colors: Very dark gray, Dark grayish brown, and Dark yellowish brown.

Classes               Precision  Recall  F1-score  n
Very dark gray        100        94      97        287
Dark grayish brown    88         59      71        287
Dark yellowish brown  69         93      79        321
Accuracy                                 82        895
Weighted avg          85         82      82        895

Table 3-14. Classification performance of Munsell soil color Model1 on the independent validation dataset.

Classes       Precision  Recall  F1-score  n
Black         79         90      84        30
Brown         12         60      19        5
Gray          100        56      71        45
Accuracy                         67        80
Weighted avg  87         69      73        80

Table 3-15. Classification performance of Munsell soil color Model2 on the independent validation dataset.

Classes            Precision  Recall  F1-score  n
Very dark gray     100        83      91        84
Light olive brown  44         100     61        11
Accuracy                              85        95
Weighted avg       94         85      87        95


Table 3-16. Classification performance of Munsell soil color Model3 on the independent validation dataset.

Classes               Precision  Recall  F1-score  n
Very dark gray        90         84      87        130
Dark grayish brown    60         51      55        55
Dark yellowish brown  0          0       0         5
Weighted avg          79         72      75        190

Table 3-17. Classification performance of the multiclass sand texture model in prediction of

coarse sand, sand, and fine sand textured soils of the validation dataset.
Classes Precision Recall F1-score n

Coarse Sand 88 54 67 13

Fine Sand 92 96 94 156

Sand 92 90 91 147

Accuracy 92 316

Weighted avg 92 92 92 316

Table 3-18. Classification performance of the binary sand texture model in prediction of sand,

and fine sand textured soils of the validation dataset.
Classes Precision Recall F1-score n

Fine Sand 95 94 95 163

Sand 94 94 94 140

Accuracy 94 303

Weighted avg 94 94 94 303

Table 3-19. Classification performance of soil texture multiclass model on the independent

validation dataset.

Classes Precision Recall F1-score n

Coarse sand 0 0 0 85

Fine sand 18 57 28 60

Sand 68 60 64 354

Accuracy 50 499

Weighted avg 51 50 49 499

Table 3-20. Classification performance of the binary classification model in classifying the soil

texture of the independent validation dataset.

Class Precision Recall F1-score n

Fine sand 27 60 37 60

Sand 91 72 81 354

Accuracy 71 414

Weighted avg 82 71 74 414
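The precision, recall, F1-score, and "Weighted avg" rows in the tables above follow the standard per-class definitions. The sketch below (function names are illustrative, not from the thesis) shows how these values are computed from true and predicted labels:

```python
def per_class_metrics(y_true, y_pred, classes):
    """Per-class precision, recall, F1-score, and support (n),
    as reported in Tables 3-11 through 3-20."""
    out = {}
    for c in classes:
        # True positives, false positives, false negatives for class c
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        out[c] = {"precision": precision, "recall": recall, "f1": f1,
                  "n": sum(1 for t in y_true if t == c)}
    return out

def weighted_avg(metrics, key):
    """Support-weighted average of a metric (the 'Weighted avg' rows)."""
    total = sum(m["n"] for m in metrics.values())
    return sum(m[key] * m["n"] for m in metrics.values()) / total
```

The weighted average weights each class's score by its sample count n, which is why classes with few samples (e.g., Dark yellowish brown in Table 3-16) barely move the overall figure.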


Figure 3-1. Flow diagram of model development. The figure shows the steps implemented to
train models for the soil variables using three approaches: linear regression, and
multiclass and binary image classification with the pretrained EfficientNet-B4 model.

Figure 3-2. Sequence of training methodology implemented to develop the model using transfer

learning and fine-tuning. The procedure was used to train the multiclass and binary

models: Munsell color classification and sand texture classes. The Munsell color

classification fine-tuning was done with 33% of the top layers.
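The fine-tuning step described above (unfreezing 33% of the top layers of the pretrained base) can be sketched generically. The helper below is an illustrative sketch, not code from the thesis: the mock `Layer` class stands in for a framework layer object, and in practice the function would be applied to `model.layers` of a Keras EfficientNet-B4 base before recompiling with a low learning rate.

```python
class Layer:
    """Stand-in for a framework layer object with a `trainable` flag."""
    def __init__(self):
        self.trainable = True

def unfreeze_top_layers(layers, top_fraction=0.33):
    """Freeze every layer except the top `top_fraction` (33% here,
    matching the fine-tuning step used for the Munsell color models).
    Returns the number of layers left trainable."""
    n_top = max(1, int(round(len(layers) * top_fraction)))
    for layer in layers[:-n_top]:
        layer.trainable = False   # frozen: keeps pretrained ImageNet weights
    for layer in layers[-n_top:]:
        layer.trainable = True    # fine-tuned on the soil image dataset
    return n_top
```

Freezing the lower layers preserves the generic features learned from ImageNet, while the unfrozen top layers adapt to the soil imagery.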


Figure 3-3. Sequence of training methodology implemented to train the linear regression models
with transfer learning and fine-tuning. The procedure was used to train the regression
models: SOM, BD, PWP, and L*a*b* color.

Figure 3-4. Example of classes before and after removal of samples with different notations.


Figure 3-5. Histograms with the original distribution of soil variables. LOI and PWP show a very
skewed distribution (a log transform was applied to approximate a normal
distribution). The range of values in each variable was uneven, therefore the values
were rescaled from 0 to 1.

Figure 3-6. Histogram of data distribution after data transformation. LOI and PWP values show a

more normalized distribution after log transformation.
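The preprocessing described for Figures 3-5 and 3-6 (log transform of the skewed variables, then rescaling all targets to 0-1) can be sketched as below. This is an illustrative sketch: `log1p` (log(1 + x)) is my assumption so that zero values remain defined; the exact transform used in the study may differ.

```python
import numpy as np

def transform_target(values):
    """Log-transform a right-skewed soil variable (e.g., LOI or PWP)
    and rescale to the 0-1 range. Returns the scaled array and the
    (min, max) bounds needed to invert the transform."""
    logged = np.log1p(np.asarray(values, dtype=float))
    lo, hi = logged.min(), logged.max()
    return (logged - lo) / (hi - lo), (lo, hi)

def inverse_transform(scaled, bounds):
    """Map rescaled model predictions back to original units."""
    lo, hi = bounds
    return np.expm1(np.asarray(scaled) * (hi - lo) + lo)
```

Keeping the (min, max) bounds from the training set is what allows model predictions on the 0-1 scale to be mapped back to physical units such as % LOI.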


Figure 3-7. Results of linear regression analysis performed on the validation subset. A) a*
(green-red): the green-to-red axis ranges from negative (green) to positive (red)
values; B) b* (blue-yellow): the blue-to-yellow axis ranges from negative (blue) to
positive (yellow) values; C) L* (black-white): the axis ranges from 0 (black) to 100
(white); D) Bulk density, measured in g/cm3; E) Loss on Ignition (LOI), measured in
percentage (mass of OM over mass of soil); F) Permanent Wilting Point (PWP),
measured in percentage (gravimetric soil water content at PWP). The regression line
(model fit) is shown in red. Panel statistics (R2, RMSE): 0.59, 5.715; 0.56, 0.072;
0.77, 0.858; 0.60, 3.274.


Figure 3-7. Continued. Panel statistics (R2, RMSE): 0.86, 0.857; 0.65, 1.052.
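The R2 and RMSE statistics annotated on the Figure 3-7 panels follow their standard definitions; a minimal numpy sketch (function name is mine, not from the thesis):

```python
import numpy as np

def r2_rmse(y_true, y_pred):
    """Coefficient of determination (R2) and root mean square error
    (RMSE), the two goodness-of-fit statistics reported per panel."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)                    # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    return r2, rmse
```

Note that RMSE is in the units of the predicted variable (e.g., g/cm3 for bulk density), which is why the panel values differ in scale.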


Figure 3-8. Training and validation of the soil color models. A) Model1 (Black, Brown, and
Gray soil color classes), B) Model2 (Very dark gray, Dark brown, and Light olive
brown soil color classes), C) Model3 (Very dark gray, Dark grayish brown, and Dark
yellowish brown soil color classes). Training and validation accuracy increased, and
losses decreased, during training. Model1 and Model3 showed the same trend, with
smooth, increasing performance. Model2 trained with difficulty, showing overfitting
and low training and validation performance. All models were fine-tuned using 33%
of the top layers.



Figure 3-9. Confusion matrix of model performance in classifying soil color on the validation

dataset. A) multiclass confusion matrix of Model1. B) multiclass confusion matrix of

Model2. C) multiclass confusion matrix of Model3. The values are percentage of true

labels (vertical axis) allocated to predicted labels (horizontal axis).


Figure 3-10. Confusion matrices of model performance on the independent validation. A)
Model1, B) Model2, and C) Model3. Model2 shows only two classes because the test
dataset did not contain samples of the Dark brown class. The values are percentage of
true labels (vertical axis) allocated to predicted labels (horizontal axis).


Figure 3-11. Model progress during training and validation of the multiclass and binary
classification of sand classes. A) training and validation of the multiclass model. B)
training and validation of the binary classification model. Training and validation
accuracy increased while training and validation loss decreased. Model accuracy
increased during fine-tuning, reaching its highest performance in the second
fine-tuning step (33% of top layers). The binary classification model had higher
validation accuracy and loss than the multiclass model.


Figure 3-12. Confusion matrix showing model performance at predicting sand texture on the

validation dataset. A) confusion matrix of the multiclass model. B) confusion matrix

of the binary model. The values are percentage of true labels (vertical axis) allocated

to predicted labels (horizontal axis).



Figure 3-13. Model performance on the independent validation dataset. A) confusion matrix of

multiclass classification of sand fractions. B) confusion matrix of binary classification

of sand fractions. The values are percentage of true labels (vertical axis) allocated to

predicted labels (horizontal axis).


CHAPTER 4

SUMMARY OF RESULTS

Deep learning and machine vision have shown excellent performance in image classification and object recognition tasks. The need for efficiency and accuracy in the diagnosis of plant disorders and in soil testing motivated this study. Conventional field scouting and analytical laboratory methods are used to analyze plant tissue for nutrient content and to diagnose disease and pest damage symptoms. Similar approaches are used to analyze soil samples. However, these methods are time consuming and cost-prohibitive for many farmers across the globe. They are also labor intensive, which limits sampling density and makes it difficult to develop site-specific recommendations for farm inputs such as water, fertilizer, lime, and pesticides.

The models developed to identify citrus leaf disorders achieved high classification accuracy for almost all leaf disorders. The most difficult classes to predict were citrus scab, spider mite damage, and zinc and manganese deficiencies. Some samples of citrus scab disease did not show clear symptoms on the adaxial side of the leaf, which caused confusion with other classes, such as manganese deficiency and spider mite damage, as well as asymptomatic leaves. In this study, the use of transfer learning with a CNN pretrained on the ImageNet dataset, followed by fine-tuning on the citrus leaf disorder dataset, appeared to contribute to the excellent model performance. All EfficientNet-B4 models (CLD-Model-1 to CLD-Model-4) performed better than the VGG-16 model (CLD-Model-5), showing the positive effect of increased network depth, width, and image resolution on accurate classification of leaf disorders. The comparison of three models (CLD-Model-2, CLD-Model-3, and CLD-Model-4) against a group of expert professionals in Florida citrus production and a group of novices familiar with citrus showed significant differences in performance: the models performed better than both groups (p < 0.001). These results suggest that the citrus leaf disorder diagnosis models are a reliable tool to supplement field and laboratory assessment of biotic and abiotic stress. Based on these results, the citrus leaf diagnosis models are being tested in a smartphone application (Schumann, Waldo, Mungofa & Oswalt, 2020), and other improvements are being implemented to increase the models' ability to correctly classify leaf samples under different field, laboratory, and other conditions.

All three methods applied to develop the predictive models for soil properties performed differently, for several reasons. These include sample size and the complexity of the data, which arises from the interaction of two visual soil features: color and particle size. Sample size had a considerable impact on the performance of the image classification models for soil texture and soil color. Training of the multiclass sand texture model was negatively affected by the small sample size of the Coarse sand class (65 images from 13 soil samples), compared to 720 images for the Sand class and 800 images for the Fine sand class. Deep convolutional neural networks perform better when trained on balanced datasets; unbalanced datasets lead to overfitting, which decreases the ability to recognize objects in classes with fewer samples. Multiclass model performance on the sand textural classes was low for both the validation and independent validation datasets. The binary classification model performed better than the multiclass model, an improvement attributable to its more balanced sample sizes. Even so, overall performance was limited by the complexity of the soil samples, such as the high spatial variability of soil texture.
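One standard way to quantify such imbalance is inverse-frequency class weighting, a common mitigation in deep learning training loops. The sketch below is illustrative only: the study itself addressed imbalance through dataset curation, and the function name is mine.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Class weights inversely proportional to class frequency.
    With the sand texture counts from this study (65 Coarse sand,
    720 Sand, 800 Fine sand images), Coarse sand would receive about
    11x the weight of Sand during loss computation."""
    counts = Counter(labels)
    total, k = len(labels), len(counts)
    # Weight = total / (n_classes * class_count); a perfectly balanced
    # dataset yields weight 1.0 for every class.
    return {c: total / (k * n) for c, n in counts.items()}
```

Such weights can be passed to a training API (e.g., the `class_weight` argument of Keras `Model.fit`) so that errors on rare classes contribute proportionally more to the loss.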

The complexity of the color feature of the soil samples was a determining factor in training the soil color models. The Munsell color names were used to group different notations into classes. This approach increased within-class complexity and produced unbalanced datasets per class. Grouping soil color data by color name resulted in major overfitting during training, and the models eventually failed to progress. The three Munsell color classification models were therefore trained on a small subset of the total dataset. The results showed that concise and balanced color classes improved model performance during training. Important considerations for developing a Munsell color classification model include using a large, balanced sample size and grouping by notation designation rather than by color name. However, some notations are so closely related that other steps must be taken, such as inspecting the database to identify potential similarities between classes. Alternatively, as observed in this experiment, training models on subclasses of the database is an option to avoid confusion between Munsell color notations.

The linear regression models for color analysis were simpler to train and included all samples. The results suggest different potential applications for each of the developed soil color prediction models. The a* (green-red) regression results show potential to predict the color of samples with tonalities associated with red, such as brown and reddish soil colors. Few samples fell in the green range; therefore, this model may be limited when predicting samples outside the red range (e.g., wetland soils). The b* (blue-yellow) model was more associated with samples showing a higher degree of yellowness than blueness. In both cases (model a* and model b*), the results reflect the characteristics of the samples used to calibrate the models. Most soils show ranges of red, brown, and yellow, which agrees with the results of these two models. The L* (black-white) model is more associated with the OM content of the soil (black color). It is suitable for predicting samples with low or very low OM content, but has high prediction error for samples with high OM content.

Particle size and OM content were likely the features driving the results of the bulk density linear regression model. Bulk density is not a clearly visible soil feature; rather, particle size and soil texture directly affect it. Most Florida soils, including those in this study, were sandy, except for a few fine-textured samples. A few other samples had high OM content compared to the majority. The model is less accurate at predicting ranges of values with few representative samples, such as fine-textured soils and soils with high OM content (and consequently low BD). Therefore, this model is suitable for predicting the bulk density of coarse-textured soils, from 1.2 g/cm3 to 1.6 g/cm3, characteristic of cultivated sandy loams and sands. The same properties were important in predicting soil water content at PWP. Because most samples had coarse texture, PWP values were low, between 0.1% and 4% water content. A few samples had high PWP values, likely those with high OM content and fine texture (silts and clays). As with bulk density, this model shows considerable accuracy in predicting water content at PWP for coarse-textured soils, while for fine-textured soils and soils with high OM content the water content was underestimated.

Soil organic matter content was accurately predicted by the model, with higher performance for samples with very low and very high OM content. Soil color likely contributed the most to the high predictive capacity of the linear model; however, the model's ability to predict samples outside these color ranges might be lower. Non-decomposed OM also contributed to the total OM measured by the LOI method. Additionally, hygroscopic water loss from soils with high clay content can contribute to LOI values, which adds to the prediction error.

In general, all models developed in this study show great potential for using deep convolutional neural networks and digital images of soil samples to predict soil variables through image classification and linear regression. A major limitation was the sample size needed to capture the high variability of soil properties. The prediction accuracy of the linear regression models was strongly influenced by the small number of samples at the extremes of the predicted soil variables; as noted previously, deep learning models generalize well when trained on balanced sample sizes. The two training strategies applied to develop the predictive models were transfer learning and fine-tuning, and the training and validation results indicate that both contributed to model performance. However, there were limitations due to the data complexity caused by the high variability of soil properties. In addition, the study of soil properties with machine vision is still uncommon, which might explain the lower performance during transfer learning and fine-tuning compared to other machine vision tasks.


LIST OF REFERENCES

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., … Zheng, X. (2016).

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems.

Retrieved from http://arxiv.org/abs/1603.04467.

Abbott, L. (2012). Soil Health - Organic matter. Soil Health, (May 2014). Retrieved from

http://www.soilhealth.com/soil-health/organic/#one.

Alreshidi, E. (2019). Smart Sustainable Agriculture (SSA) solution underpinned by Internet of

Things (IoT) and Artificial Intelligence (AI). International Journal of Advanced Computer

Science and Applications, 10(5), 93–102. https://doi.org/10.14569/ijacsa.2019.0100513.

AppAdvice LLC. (2020). Pocket Agronomist. AppAdvice https://appadvice.com/app/pocket-

agronomist/1262053489.

Arvidsson, J. (1998). Influence of soil texture and organic matter content on bulk density, air

content, compression index and crop yield in field and laboratory compression experiments.

Soil and Tillage Research, Vol. 49, pp. 159–170. https://doi.org/10.1016/S0167-

1987(98)00164-0.

Aubert, B. (1978). Trioza erytheae del Guercio and Diaphorina citri Kuwayama (Homoptera:

Psylloidea), the two vectors of citrus greening disease: biological aspects and possible

control strategies. Fruits, Vol. 42, pp. 149–162.

Baramidze, V., Khetereli, A., & Kushad, M. (2015). Identification and Control of Major

Diseases and Insect Pests of Vegetables and Melons in Georgia.

Barrett, L. R. (2002). Spectrophotometric color measurement in situ in well drained sandy soils.

Geoderma, Vol. 108, pp. 49–77. https://doi.org/10.1016/S0016-7061(02)00121-0.

Binkley, D., & Fisher, R.F. (2012). Ecology and Management of Forest Soils. New York: John

Wiley & Sons.

Blake, G.R., & Hartge, K.H. (1986). Particle Density. In A. Klute (Ed.), Methods of soil analysis.

Part 1. Physical and mineralogical methods (2nd. Ed., Agronomy Monograph 9, pp. 377-

381). Madison, WI: ASA and SSSA.

Blum, P. (1997). Reflectance spectrophotometry and colorimetry. PP Handbook, 10(9), 1–11.

Bochkovskiy, A., Wang, C.-Y., & Liao, H.-Y. M. (2020). YOLOv4: Optimal Speed and

Accuracy of Object Detection. Retrieved from http://arxiv.org/abs/2004.10934.

Bollis, E., Pedrini, H., & Avila, S. (2020). Weakly Supervised Learning Guided by Activation

Mapping Applied to a Novel Citrus Pest Benchmark. (LIV), 310–319.

https://doi.org/10.1109/cvprw50498.2020.00043.



Bové, J. M. (2006). Huanglongbing: a Destructive, Newly-Emerging, Century-Old Disease of Citrus.

Journal of Plant Pathology, 88(1), 7–37.

https://pdfs.semanticscholar.org/2562/a5320216acc36b1a826308eaf0e50064e438.pdf.

Brady, N.C., & Weil, R.R. (2008). The Nature and Properties of Soils (14th. Ed.), Columbus,

OH: Pearson Education.

Buda, M., Maki, A., & Mazurowski, M. A. (2018). A systematic study of the class imbalance

problem in convolutional neural networks. Neural Networks, Vol. 106, pp. 249–259.

https://doi.org/10.1016/j.neunet.2018.07.011.

Cai, Y., Zheng, W., Zhang, X., Zhangzhong, L., & Xue, X. (2019). Research on soil moisture

prediction model based on deep learning. PLoS ONE, 14(4), 1–19.

https://doi.org/10.1371/journal.pone.0214508.

Campbell, G. S. (1988). Soil water potential measurement: An overview. Irrigation Science,

9(4), 265–273. https://doi.org/10.1007/BF00296702.

Campbell, G.S., Smith, D.M., & Teare, B.L. (2007). Application of a Dew Point Method to

Obtain Soil Water Characteristics. In T. Schanz (Ed.), Experimental Unsaturated Soil
Mechanics (pp. 71-77). Berlin, Heidelberg: Springer. doi:10.1007/3-540-69873-6_7.

Cassel, D.K., & Klute, A. (1986). Water Potential: Tensiometry. In A. Klute (Ed.), Methods of

soil analysis. Part 1. Physical and mineralogical methods (2nd. Ed., Agronomy Monograph

9), (pp. 563-596). Madison, WI: ASA and SSSA.

Cassel, D.K., & Nielsen, D.R. (1986). Field Capacity and Available Water Capacity. In A. Klute

(Ed.), Methods of soil analysis. Part 1. Physical and mineralogical methods (2nd. Ed.,

Agronomy Monograph 9), (pp. 901-926). Madison, WI: ASA and SSSA.

Chabrillat, S., Ben-Dor, E., Cierniewski, J., Gomez, C., Schmid, T., & van Wesemael, B. (2019).

Imaging Spectroscopy for Soil Mapping and Monitoring. In Surveys in Geophysics (Vol.

40). Springer Netherlands. https://doi.org/10.1007/s10712-019-09524-0.

Childers, C. C. (2006). Texas Citrus Mite. Encyclopedia of Entomology, 2222–2222.

https://doi.org/10.1007/0-306-48380-7_4281.

Childers, C. C., & Fasulo, T. R. (2005). Six-Spotted Mite 1. 1–4.

Chunjing, Y., Yueyao, Z., Yaxuan, Z., & Liu, H. (2017). Application of convolutional neural

network in classification of high resolution agricultural remote sensing images.

International Archives of the Photogrammetry, Remote Sensing and Spatial Information

Sciences - ISPRS Archives, 42(2W7), 989–992. https://doi.org/10.5194/isprs-archives-XLII-

2-W7-989-2017.

Chollet, F. (2015). Keras. https://keras.io/.


Curcio, D., Ciraolo, G., D’Asaro, F., & Minacapilli, M. (2013). Prediction of Soil Texture

Distributions Using VNIR-SWIR Reflectance Spectroscopy. Procedia Environmental

Sciences, 19, 494–503. https://doi.org/10.1016/j.proenv.2013.06.056.

Decagon Devices, Inc. (2007). Dewpoint PotentiaMeter: WP4C PotentiaMeter Operator's
Manual, Version 2, 66. Retrieved from www.decagon.com.

Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale

hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern

Recognition, 248–255. IEEE. https://doi.org/10.1109/CVPRW.2009.5206848.

Dewdney, M. M. (2019). Greasy Spot 1. 2018–2019.

Dewdney, M. M. (2020). 2019 – 2020 Florida Citrus Pest Management Guide : Citrus Scab 1.

2019–2020.

Dewdney, M. M, & Johnson, E. G. (2020). 2020 – 2021 Florida Citrus Production Guide :

Phytophthora Foot Rot , Crown Rot , and Root Rot 1 Cultural Practices to Manage. 1–7.

Dewdney, M. M, Johnson, E. G., & Graham, J. H. (2020). 2019 – 2020 Florida Citrus

Production Guide : Citrus Protecting Canker-Free Areas. 1–6.

Dhillon, A., & Verma, G. K. (2020). Convolutional neural network: a review of models,

methodologies and applications to object detection. Progress in Artificial Intelligence, Vol.

9. https://doi.org/10.1007/s13748-019-00203-0.

Duckett, T., Pearson, S., Blackmore, S., Grieve, B., Chen, W.-H., Cielniak, G., … Yang, G.-Z.

(2018). Agricultural Robotics: The Future of Robotic Agriculture. Retrieved from

http://arxiv.org/abs/1806.06762.

Duong, L. T., Nguyen, P. T., Di Sipio, C., & Di Ruscio, D. (2020). Automated fruit recognition

using EfficientNet and MixNet. Computers and Electronics in Agriculture, 171(March),

105326. https://doi.org/10.1016/j.compag.2020.105326.

Dyrmann, M., Skovsen, S., Laursen, M. S., & Jørgensen, R. N. (2018). Using a fully

convolutional neural network for detecting locations of weeds in images from cereal fields.

The 14th International Conference on Precision Agriculture, 1–7. Retrieved from

https://pdfs.semanticscholar.org/9476/2a8f63bbda7cd5a5260b0afb6ed0e0e40d05.pdf%0Aht

tp://www.forskningsdatabasen.dk/en/catalog/2397441396.

Esau, T., Zaman, Q., Groulx, D., Farooque, A., Schumann, A., & Chang, Y. (2018). Machine

vision smart sprayer for spot-application of agrochemical in wild blueberry fields. Precision

Agriculture, 19(4). https://doi.org/10.1007/s11119-017-9557-y.

Eshel, G., Levy, G. J., Mingelgrin, U., & Singer, M. J. (2004). Critical Evaluation of the Use of

Laser Diffraction for Particle-Size Distribution Analysis. Soil Science Society of America

Journal, 68(3), 736–743. https://doi.org/10.2136/sssaj2004.7360.


Fan, G.-C., Xia, Y., Lin, X., Hu, H., Wang, X., Ruan, C., … Liu, B. (2016). Evaluation of

thermotherapy against Huanglongbing (citrus greening) in the greenhouse. Journal of

Integrative Agriculture, 15(1), 111–119. https://doi.org/10.1016/S2095-3119(15)61085-1.

Fan, Z., Herrick, J. E., Saltzman, R., Matteis, C., Yudina, A., Nocella, N., … Van Zee, J. (2017).

Measurement of soil color: A comparison between smartphone camera and the munsell

color charts. Soil Science Society of America Journal, 81(5), 1139–1146.

https://doi.org/10.2136/sssaj2017.01.0009.

Fasulo, T. R., & Denmark, H. A. (2012). Twospotted Spider Mite, Tetranychus urticae Koch

(Acari: Tetranychidae). SpringerReference, 1–5.

https://doi.org/10.1007/springerreference_87762.

Ferentinos, K. P. (2018). Deep learning models for plant disease detection and diagnosis.

Computers and Electronics in Agriculture, 145(February), 311–318.

https://doi.org/10.1016/j.compag.2018.01.009.

Food and Agriculture Organization of the United Nation. (2017). Soil Organic Carbon: the

hidden potential. Food and Agriculture Organization of the United Nations. Rome, Italy.

Fried, N. (2019). Florida Citrus Statistics 2017-2018. 117. Retrieved from

www.nass.usda.gov/fl.

Fried, N. (2020). Florida Citrus Statistics 2018-2019. 117. Retrieved from

www.nass.usda.gov/fl.

Fuentes, A., Yoon, S., Kim, S. C., & Park, D. S. (2017). A robust deep-learning-based detector

for real-time tomato plant diseases and pests recognition. Sensors (Switzerland), 17(9).

https://doi.org/10.3390/s17092022.

Garcia-Garcia, A., Orts-Escolano, S., Oprea, S., Villena-Martinez, V., & Garcia-Rodriguez, J.

(2017). A Review on Deep Learning Techniques Applied to Semantic Segmentation. 1–23.

Retrieved from http://arxiv.org/abs/1704.06857.

Garfin, G., Franco, G., Blanco, H., Comrie, A., Gonzalez, P., Piechota, T., … Waskom, R.

(2014). Southwest. Climate Change Impacts in the United States: In The Third National

Climate Assessment. https://doi.org/10.7930/J08G8HMN.

Garza, B. N., Ancona, V., Enciso, J., Perotto-Baldivieso, H. L., Kunta, M., & Simpson, C.

(2020). Quantifying citrus tree health using true color UAV images. Remote Sensing, 12(1).

https://doi.org/10.3390/rs12010170.

Gee, G.W., & Bauder, J.W. (1986). Particle-size Analysis. In A. Klute (Ed.), Methods of soil

analysis. Part 1. Physical and mineralogical methods (2nd. Ed., Agronomy Monograph 9,

pp. 383-411). Madison, WI: ASA and SSSA.


Ghatrehsamani, S., Czarnecka, E., Lance Verner, F., Gurley, W. B., Ehsani, R., & Ampatzidis,

Y. (2019). Evaluation of mobile heat treatment system for treating in-field HLB-affected

trees by analyzing survival rate of surrogate bacteria. Agronomy, 9(9).

https://doi.org/10.3390/agronomy9090540.

Ghazi, M. M., Yanikoglu, B., & Aptoula, E. (2017). Plant identification using deep neural

networks via optimization of transfer learning parameters. Neurocomputing, Vol. 235, pp.

228–235. https://doi.org/10.1016/j.neucom.2017.01.018.

Gomez, C., & Lagacherie, P. (2016). Mapping of Primary Soil Properties Using Optical Visible

and Near Infrared (Vis-NIR) Remote Sensing. In Land Surface Remote Sensing in

Agriculture and Forest. https://doi.org/10.1016/B978-1-78548-103-1.50001-7.

Gottwald, T. R., Graham, J. H., Irey, M. S., McCollum, T. G., & Wood, B. W. (2012).

Inconsequential effect of nutritional treatments on huanglongbing control, fruit quality,

bacterial titer and disease progress. Crop Protection, 36, 73–82.

https://doi.org/10.1016/j.cropro.2012.01.004.

Grafton-Cardwell, E.E., & Daugherty, P.M. (2018). Asian Citrus Psyllid and Huanglongbing

Disease. Pest Notes 74155, (June), 1–5.

Grafton-Cardwell, E. E., Godfrey, K. E., Rogers, M. E., Childers, C. C., & Stansly, P. A. (2006).

Asian Citrus Psyllid. Asian Citrus Psyllid. https://doi.org/10.3733/ucanr.8205.

Grafton-Cardwell, E. E., Stelinski, L. L., & Stansly, P. A. (2013). Biology and Management of

Asian Citrus Psyllid, Vector of the Huanglongbing Pathogens. Annual Review of

Entomology, 58(1), 413–432. https://doi.org/10.1146/annurev-ento-120811-153542.

Graham, J., Gottwald, T., & Irey, M. (2012). Balancing resources for management of root
health in HLB-affected groves. Citrus Industry, 93(7), 6–11.

Graham, J. H., & Timmer, L. W. (2008). 2008 Florida Citrus Pest Management Guide :

Phytophthora Foot Rot and Root Rot 1. 1–6.

Grisso, R., Alley, M., Holshouser, D., & Thomason, W. (2009). Precision farming tools: soil

electrical conductivity. Virginia Cooperative Extension, 442(508), 1–6. Retrieved from

http://scholar.google.com/scholar?hl=en&btnG=Search&q=intitle:Precision+farming+tools:

+soil+electrical+conductivity#7.

Grosser, J. W., & Gmitter, F. G., Jr. (2013). Breeding disease-resistant citrus for

Florida: Adjusting to the canker/HLB world - Part 2: rootstocks. Citrus Industry,

94(March), 10–16.

Halbert, S. E., & Núñez, C. A. (2004). Distribution of the Asian Citrus Psyllid, Diaphorina Citri

Kuwayama (Rhynchota: Psyllidae) in the Caribbean Basin. Florida Entomologist, 87(3),

401–402. https://doi.org/10.1653/0015-4040(2004)087[0401:dotacp]2.0.co;2.


Halbert, S., Manjunath, K., Roka, F., & Brodie, M. (2008). Huanglongbing (citrus greening) in

florida, 2008. 1–8.

Hall, D. G., Richardson, M. L., Ammar, E. D., & Halbert, S. E. (2013). Asian citrus psyllid,

Diaphorina citri, vector of citrus huanglongbing disease. Entomologia Experimentalis et

Applicata, 146(2), 207–223. https://doi.org/10.1111/eea.12025.

Hamido, S. A., Morgan, K. T., & Kadyampakeni, D. M. (2017). The effect of huanglongbing on

young citrus tree water use. HortTechnology, 27(5), 659–665.

https://doi.org/10.21273/HORTTECH03830-17.

Havlin, J.L., Beaton, J.D., Tisdale, S.L., & Nelson, W.L. (2005). Soil Fertility and Fertilizers an

Introduction to Nutrient Management (7th Ed.). Upper Saddle River, NJ: Pearson Education.

He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition.

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern

Recognition, 2016-Decem, 770–778. https://doi.org/10.1109/CVPR.2016.90.

Hernández-Hernández, J. L., Ruiz-Hernández, J., García-Mateos, G., González-Esquiva, J. M.,

Ruiz-Canales, A., & Molina-Martínez, J. M. (2017). A new portable application for

automatic segmentation of plants in agriculture. Agricultural Water Management, 183.

https://doi.org/10.1016/j.agwat.2016.08.013.

Hill, E. C., & Station, C. E. (1967). Florida Citrus. 15(5), 1091–1094.

Hillel, D. (1998). Environmental Soil Physics. San Diego, CA: Academic Press.

Hoffman, M. T., Doud, M. S., Williams, L., Zhang, M. Q., Ding, F., Stover, E., … Duan, Y. P.

(2013). Heat treatment eliminates “Candidatus Liberibacter asiaticus” from infected citrus

trees under controlled conditions. Phytopathology, 103(1), 15–22.

https://doi.org/10.1094/PHYTO-06-12-0138-R.

Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., … Adam, H.

(2017). MobileNets: Efficient Convolutional Neural Networks for Mobile Vision

Applications. Retrieved from http://arxiv.org/abs/1704.04861.

Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected

convolutional networks. Proceedings - 30th IEEE Conference on Computer Vision and

Pattern Recognition, CVPR 2017, 2017-Janua, 2261–2269.

https://doi.org/10.1109/CVPR.2017.243.

Islam, K., Mcbratney, A. B., & Singh, B. (2006). Estimation of soil colour from visible

reflectance spectra. (December 2004), 5–9.

Jensen, J. L., Christensen, B. T., Schjønning, P., Watts, C. W., & Munkholm, L. J. (2018).

Converting loss-on-ignition to organic carbon content in arable topsoil: pitfalls and

proposed procedure. European Journal of Soil Science, Vol. 69, pp. 604–612.

https://doi.org/10.1111/ejss.12558.


Johnson, E., & Graham, J. (2015). Root health in the age of HLB. Citrus Industry, (August

2015), 14–18.

Kadyampakeni, D. M., Morgan, K. T., Schumann, A. W., & Nkedi-Kizza, P. (2014). Effect of

irrigation pattern and timing on root density of young citrus trees infected with

Huanglongbing disease. HortTechnology, 24(2), 209–221.

https://doi.org/10.21273/horttech.24.2.209.

Kamilaris, A., & Prenafeta-Boldú, F. X. (2018). Deep learning in agriculture: A survey.

Computers and Electronics in Agriculture, Vol. 147.

https://doi.org/10.1016/j.compag.2018.02.016.

Kantipudi, K., Lai, C., Min, C.-H., & Chiang, R. C. (2018). Weed detection among crops by

convolutional neural networks with sliding windows. 14th International Conference on

Precision Agriculture, 8. Retrieved from

https://www.ispag.org/proceedings/?action=abstract&id=4975&search=topics.

Khan, A., Sohail, A., Zahoora, U., & Qureshi, A. S. (2020). A survey of the recent architectures

of deep convolutional neural networks. Artificial Intelligence Review.

https://doi.org/10.1007/s10462-020-09825-6.

Khanchouch, K., Pane, A., Chriki, A. & Cacciola, S.O. (2017). Major and Emerging Fungal

Diseases of Citrus in the Mediterranean Region. In G. Harsimran & G. Harsh (Eds.), Citrus

Pathology (pp. 3-30). Rijeka, Croatia: InTech. http://dx.doi.org/10.5772/66943.

Kimble, J.M., Lal, R., & Follet, R.F. (2001). Methods for Assessing Soil C Pools. In J.M.

Kimble, R.F. Follet, & B.A. Stewart (Eds.), Assessing Methods for Soil Carbon (pp. 3-12).

Boca Raton, FL: CRC Press LLC.

Kingma, D. P., & Ba, J. L. (2015). Adam: A method for stochastic optimization. 3rd

International Conference on Learning Representations, ICLR 2015 - Conference Track

Proceedings, 1–15.

Kirillova, N. P., & Sileva, T. M. (2017). Colorimetric analysis of soils using digital cameras.

Moscow University Soil Science Bulletin, 72(1), 13–20.

https://doi.org/10.3103/s0147687417010045.

Kirste, B., Iden, S. C., & Durner, W. (2019). Determination of the soil water retention curve

around the wilting point: Optimized protocol for the DeWpoint method. Soil Science

Society of America Journal, 83(2), 288–299. https://doi.org/10.2136/sssaj2018.08.0286.

Klute, A., & Dirksen, C. (1986). Hydraulic Conductivity and Diffusivity: Laboratory Methods. In

A. Klute (Ed.), Methods of soil analysis. Part 1. Physical and mineralogical methods (2nd.

Ed., Agronomy Monograph 9, pp. 687-734). Madison, WI: ASA and SSSA.

Konare, H., Yost, R. S., Doumbia, M., Mccarty, G. W., Jarju, A., & Kablan, R. (2010). Loss on

ignition: Measuring soil organic carbon in soils of the Sahel, West Africa. African Journal of

Agricultural Research, 5(22), 3088–3095.


Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep

convolutional neural networks. Advances in Neural Information Processing Systems, 2,

1097–1105.

Kroetsch, D., & Wang, C. (2008). Particle size distribution. In M.R. Carter & E.G. Gregorich (Eds.), Soil Sampling and Methods of Analysis (2nd ed., pp. 713-725). Boca Raton, FL: Taylor & Francis Group.

Kutsch, W.L., Bahn, M., & Heinemeyer, A. (2009). Soil carbon relations: an overview. In W.L.

Kutsch, M. Bahn, & A. Heinemeyer (Eds.), Soil Carbon Dynamics: An Integrated

Methodology (pp. 1-16). Cambridge: Cambridge University Press.

Lal, R. (2009). Soils and Sustainable Agriculture: A Review. In E. Lichtfouse, M. Navarrete, P.

Debaeke, S. Véronique & C. Alberola (Eds.), Sustainable Agriculture (Vol.1, pp. 15-23).

Heidelberg London, NY: Springer. DOI 10.1007/978-90-481-2666-8.

Lecun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444.

https://doi.org/10.1038/nature14539.

LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., & Jackel, L. D. (1989). Backpropagation applied to handwritten zip code recognition. Neural Computation, 1(4), 541–551. Retrieved from https://www.ics.uci.edu/~welling/teaching/273ASpring09/lecun-89e.pdf.

Li, Zewen, Yang, W., Peng, S., & Liu, F. (2020). A Survey of Convolutional Neural Networks:

Analysis, Applications, and Prospects. Retrieved from http://arxiv.org/abs/2004.02806.

Li, Zhizhong, & Hoiem, D. (2018). Learning without Forgetting. IEEE Transactions on Pattern

Analysis and Machine Intelligence, 40(12), 2935–2947.

https://doi.org/10.1109/TPAMI.2017.2773081.

Liakos, K. G., Busato, P., Moshou, D., Pearson, S., & Bochtis, D. (2018). Machine learning in

agriculture: A review. Sensors (Switzerland), 18(8), 1–29.

https://doi.org/10.3390/s18082674.

Lin, M., Chen, Q., & Yan, S. (2014). Network in network. 2nd International Conference on

Learning Representations, ICLR 2014 - Conference Track Proceedings, 1–10.

Liu, L., Ji, M., & Buchroithner, M. (2018). Transfer learning for soil spectroscopy based on

convolutional neural networks and its application in soil clay content mapping using

hyperspectral imagery. Sensors (Switzerland), 18(9). https://doi.org/10.3390/s18093169.

Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016). SSD:

Single shot multibox detector. Lecture Notes in Computer Science (Including Subseries

Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9905 LNCS,

21–37. https://doi.org/10.1007/978-3-319-46448-0_2.


Lu, Z., Pu, H., Wang, F., Hu, Z., & Wang, L. (2017). The expressive power of neural networks:

A view from the width. Advances in Neural Information Processing Systems, 2017-

December.

Maček, M., Smolar, J., & Petkovšek, A. (2013). Extension of measurement range of dew-point

potentiometer and evaporation method. 18th International Conference on Soil Mechanics

and Geotechnical Engineering: Challenges and Innovations in Geotechnics, ICSMGE 2013,

2, 1137–1142.

Manjunath, K. L., Halbert, S. E., Ramadugu, C., Webb, S., & Lee, R. F. (2008). Detection of

“Candidatus Liberibacter asiaticus” in Diaphorina citri and its importance in the

management of citrus huanglongbing in Florida. Phytopathology, 98(4), 387–396.

https://doi.org/10.1094/PHYTO-98-4-0387.

Marashi, M., Mohammadi Torkashvand, A., Ahmadi, A., & Esfandyari, M. (2017). Estimation of soil aggregate stability indices using artificial neural networks and multiple linear regression models. Spanish Journal of Soil Science, 7(2), 122–132.

https://doi.org/10.3232/SJSS.2017.V7.N2.04.

McLaughlin, M.J., Reuter, D.J., & Rayment, G.E. (1999). Soil Testing – Principles and

Concepts. In K.I. Peverill, L.A. Sparrow, & D.J. Reuter (Eds.), Soil Analysis: An

Interpretation Manual (pp. 1-21). Collingwood, Australia: CSIRO Publishing.

McMurtry, J. A. (1989). Citrus Red Mite. Biological Control in the Western United States:

Accomplishments and Benefits of Regional Research Project W-84, 1964-1989, 61–62.

Minasny, B., Hopmans, J. W., Harter, T., Eching, S. O., Tuli, A., & Denton, M. A. (2004).

Neural networks prediction of soil hydraulic functions for alluvial soils using multistep

outflow data. Soil Science Society of America Journal, 68(2), 417–429.

https://doi.org/10.2136/sssaj2004.4170.

Mochida, K., Koda, S., Inoue, K., Hirayama, T., Tanaka, S., Nishii, R., & Melgani, F. (2018).

Computer vision-based phenotyping for improvement of plant productivity: A machine

learning perspective. GigaScience, 8(1), 1–12. https://doi.org/10.1093/gigascience/giy153.

Mohamed, E. S., Saleh, A. M., Belal, A. B., & Gad, A. A. (2018). Application of near-infrared

reflectance for quantitative assessment of soil properties. Egyptian Journal of Remote

Sensing and Space Science, 21(1), 1–14. https://doi.org/10.1016/j.ejrs.2017.02.001.

Moreira De Melo, T., & Pedrollo, O. C. (2015). Artificial neural networks for estimating soil

water retention curve using fitted and measured data. Applied and Environmental Soil

Science, 2015. https://doi.org/10.1155/2015/535216.

Morgan, K. T., Kadyampakeni, D. M., Zekri, M., Schumann, A. W., Vashisth, T., & Obreza, T. A. (2021). 2020–2021 Florida Citrus Production Guide: Nutrition Management for Citrus Trees. 1–9.


Morgan, K. T., Rouse, R. E., & Ebel, R. C. (2016). Foliar applications of essential nutrients on

growth and yield of ‘Valencia’ sweet orange infected with huanglongbing. HortScience,

51(12), 1482–1493. https://doi.org/10.21273/HORTSCI11026-16.

Motsara, M. R., & Roy, R. N. (2008). Guide to laboratory establishment for plant nutrient analysis (FAO Fertilizer and Plant Nutrition Bulletin 19). Rome: Food and Agriculture Organization of the United Nations.

Mungofa, P., Schumann, A., & Waldo, L. (2018). Chemical crystal identification with deep

learning machine vision. BMC Research Notes, 11(1), 1–6. https://doi.org/10.1186/s13104-

018-3813-8.

Munsell Color (Firm). (1994). Munsell Soil Color Charts. Revised Edition. Macbeth Division of

Kollmorgen Instruments Corporation, New Windsor, NY.

Mylavarapu, R., Harris, W., & Hochmuth, G. (2016). Agricultural Soils of Florida. EDIS,

University of Florida IFAS Extension.

https://edis.ifas.ufl.edu/ss655.

National Research Council (US) Committee on Biosciences. (1985). New Directions for Biosciences Research in Agriculture: High-Reward Opportunities. Washington, DC: National Academies Press. PMID: 25032394.

National Research Council. (2010). Strategic Planning for the Florida Citrus Industry:

Addressing Citrus Greening. In Strategic Planning for the Florida Citrus Industry:

Addressing Citrus Greening. https://doi.org/10.17226/12880.

Nelson, D.W., & Sommers, L.E. (1996). Total Carbon, Organic Carbon and Organic Matter. In

J.M. Bigham (Ed.). Methods of Soil Analysis. Part 3. Chemical Methods. (pp. 961-1110).

Madison, WI: SSSA.

Nocita, M., Stevens, A., van Wesemael, B., Aitkenhead, M., Bachmann, M., Barthès, B., …

Wetterlind, J. (2015). Soil Spectroscopy: An Alternative to Wet Chemistry for Soil

Monitoring. Advances in Agronomy, 132(March), 139–159.

https://doi.org/10.1016/bs.agron.2015.02.002.

NVIDIA Corporation. (2018, April 4). Startup Uses AI to Identify Crop Diseases With Superb

Accuracy. NVIDIA Corporation. https://news.developer.nvidia.com/startup-uses-ai-to-

identify-crop-diseases-with-superb-accuracy/.

Nwugo, C. C., Lin, H., Duan, Y., & Civerolo, E. L. (2013). The effect of “Candidatus

Liberibacter asiaticus” infection on the proteomic profiles and nutritional status of pre-

symptomatic and symptomatic grapefruit (Citrus paradisi) plants. BMC Plant Biology,

13(1), 1–24. https://doi.org/10.1186/1471-2229-13-59.

Owens, P., & Rutledge, E. (2005). Minimum Tillage. Morphology, 511–520.


Pérez, F., & Granger, B. E. (2018). Jupyter Notebook Documentation. Computing in Science

and Engineering, 1–155.

Padarian, J., Minasny, B., & McBratney, A. B. (2019a). Transfer learning to localise a

continental soil vis-NIR calibration model. Geoderma, 340(December 2018), 279–288.

https://doi.org/10.1016/j.geoderma.2019.01.009.

Padarian, J., Minasny, B., & McBratney, A. B. (2019b). Using deep learning to predict soil

properties from regional spectral data. Geoderma Regional, Vol. 16.

https://doi.org/10.1016/j.geodrs.2018.e00198.

Padarian, J., Minasny, B., & McBratney, A. B. (2019). Using deep learning for digital soil

mapping. Soil, 5(1), 79–89. https://doi.org/10.5194/soil-5-79-2019.

Pan, S. J., & Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge

and Data Engineering, 22(10), 1345–1359. https://doi.org/10.1109/TKDE.2009.191.

Pedersen, S.M., & Lind, K.M. (2017). Precision Agriculture – From Mapping to Site-Specific

Application. In S.M. Pedersen & K.M. Lind (Eds.), Precision Agriculture: Technology and

Economic Perspectives (pp. 1-20). Cham: Springer International Publishing AG.

https://doi.org/10.1007/978-3-319-68715-5.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M.,

Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D.,

Brucher, M., Perrot, M. & Duchesnay, E. (2011). Scikit-learn: Machine Learning in Python.

Journal of Machine Learning Research, 12 (2011) 2825-2830.

Pinheiro, É. F. M., Ceddia, M. B., Clingensmith, C. M., Grunwald, S., & Vasques, G. M. (2017).

Prediction of soil physical and chemical properties by visible and near-infrared diffuse

reflectance spectroscopy in the Central Amazon. Remote Sensing, 9(4), 1–22.

https://doi.org/10.3390/rs9040293.

Priya, R., & Ramesh, D. (2020). ML based sustainable precision agriculture: A future generation

perspective. Sustainable Computing: Informatics and Systems, 28(August), 100439.

https://doi.org/10.1016/j.suscom.2020.100439.

Pustika, A. B., Subandiyah, S., Holford, P., Beattie, G. A. C., Iwanami, T., & Masaoka, Y.

(2008). Interactions between plant nutrition and symptom expression in mandarin trees

infected with the disease huanglongbing. Australasian Plant Disease Notes, 3(1), 112.

https://doi.org/10.1071/dn08045.

Qureshi, J., Stelinski, L., Martini, X., & Diepenbrock, L. M. (2021). 2020–2021 Florida Citrus Production Guide: Rust Mites, Spider Mites, and Other Phytophagous Mites.

R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.


Rakhshani, E., & Saeedifar, A. (2013). Seasonal fluctuations, spatial distribution and natural

enemies of Asian citrus psyllid Diaphorina citri Kuwayama (Hemiptera: Psyllidae) in Iran.

Entomological Science, Vol. 16, pp. 17–25. https://doi.org/10.1111/j.1479-

8298.2012.00531.x.

Ramcharan, A., McCloskey, P., Baranowski, K., Mbilinyi, N., Mrisho, L., Ndalahwa, M., …

Hughes, D. P. (2019). A mobile-based deep learning model for cassava disease diagnosis.

Frontiers in Plant Science, 10(March), 1–8. https://doi.org/10.3389/fpls.2019.00272.

Ranulfi, A. C., Romano, R. A., Bebeachibuli Magalhães, A., Ferreira, E. J., Ribeiro Villas-Boas,

P., & Marcondes Bastos Pereira Milori, D. (2017). Evaluation of the Nutritional Changes

Caused by Huanglongbing (HLB) to Citrus Plants Using Laser-Induced Breakdown

Spectroscopy. Applied Spectroscopy, 71(7), 1471–1480.

https://doi.org/10.1177/0003702817701751.

RavenProtocol. (2017, December 4). Everything you need to know about Neural Networks.

Medium https://medium.com/ravenprotocol/everything-you-need-to-know-about-neural-

networks-6fcc7a15cb4.

Rawlins, S.L., & Campbell, G.S. (1986). Water Potential: Thermocouple Psychrometry. In A.

Klute (Ed.), Methods of soil analysis. Part 1. Physical and mineralogical methods (2nd. Ed.,

Agronomy Monograph 9) (pp. 597-618). Madison, WI: ASA and SSSA.

Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-

time object detection. Proceedings of the IEEE Computer Society Conference on Computer

Vision and Pattern Recognition, Vol. 2016-December, pp. 779–788.

https://doi.org/10.1109/CVPR.2016.91.

Redmon, J., & Farhadi, A. (2018). YOLOv3: An Incremental Improvement. Retrieved from

http://arxiv.org/abs/1804.02767.

Rolshausen, P. E. (2019). Citrus Undercover Production Systems. (February).

Roper, W. R., Robarge, W. P., Osmond, D. L., & Heitman, J. L. (2019). Comparing four

methods of measuring soil organic matter in North Carolina soils. Soil Science Society of

America Journal, 83(2), 466–474. https://doi.org/10.2136/sssaj2018.03.0105.

Rouse, B., Irey, M., Gast, T., Boyd, M., & Willis, T. (2012). Fruit Production in a Southwest

Florida Citrus Grove Using the Boyd Nutrient/SAR Foliar Spray. Proc. Fla. State Hort. Soc., 125, 61–64.

RStudio Team (2016). RStudio: Integrated Development for R. RStudio, Inc., Boston, MA URL

http://www.rstudio.com/.

Ruder, S. (2016). An overview of gradient descent optimization algorithms. 1–14. Retrieved from

http://arxiv.org/abs/1609.04747.


Ruder, S. (2017). An Overview of Multi-Task Learning in Deep Neural Networks. (May).

Retrieved from http://arxiv.org/abs/1706.05098.

Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., … Fei-Fei, L. (2015).

ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer

Vision, 115(3), 211–252. https://doi.org/10.1007/s11263-015-0816-y.

Saffari, M., Yasrebi, J., Sarikhani, F., & Gazni, R. (2009). Evaluation of ANN models for

prediction of Spatial Variability of Some Soil Chemical Properties. Research Journal of

Biological Sciences, 4(7), 815–820.

Saleem, M., Atta, B. M., Ali, Z., & Bilal, M. (2020). Laser-induced fluorescence spectroscopy

for early disease detection in grapefruit plants. Photochemical and Photobiological

Sciences, 19(5), 713–721. https://doi.org/10.1039/c9pp00368a.

Sankaran, S., Mishra, A., Ehsani, R., & Davis, C. (2010). A review of advanced techniques for

detecting plant diseases. Computers and Electronics in Agriculture, Vol. 72, pp. 1–13.

https://doi.org/10.1016/j.compag.2010.02.007.

Schumann, A. (2020). Computer Tools for Diagnosing Citrus Leaf Symptoms (Part 1):

Diagnosis and Recommendation Integrated System (DRIS). EDIS, 2020(4), 1–2.

https://doi.org/10.32473/edis-ss683-2020.

Schumann, A., Waldo, L., Mungofa, P., & Oswalt, C. (2020). Computer Tools for Diagnosing

Citrus Leaf Symptoms (Part 2): Diagnosis and Recommendation Integrated System (DRIS).

EDIS, 2020(4), 1–2. https://doi.org/10.32473/edis-ss683-2020.

Schumann, A., Waldo, L., Holmes, W., Test, G., & Ebert, T. (2018, July 1). Artificial intelligence

for detecting citrus pests, diseases and disorders. Citrus Industry.

https://citrusindustry.net/2018/07/02/artificial-intelligence-detecting-citrus-pests-diseases-

disorders/.

Schumann, A. W., Mood, N. S., Mungofa, P. D. K., MacEachern, C., Zaman, Q. U., & Esau, T.

(2019). Detection of three fruit maturity stages in wild blueberry fields using deep learning

artificial neural networks. 2019 ASABE Annual International Meeting. American Society of

Agricultural and Biological Engineers. https://doi.org/10.13031/aim.201900533.

Shannon, D.K., Clay, D.E., & Sudduth, K.A. (2018). An Introduction to Precision Agriculture.

In. D.K. Shannon, D.E. Clay & N.R. Kitchen (Eds.), Precision Agriculture Basics (pp 1-

12). Madison, WI: ASA, CSSA and SSSA.

Sharif, M., Khan, M. A., Iqbal, Z., Azam, M. F., Lali, M. I. U., & Javed, M. Y. (2018). Detection

and classification of citrus diseases in agriculture based on optimized weighted

segmentation and feature selection. Computers and Electronics in Agriculture, 150(April),

220–234. https://doi.org/10.1016/j.compag.2018.04.023.


Shen, W., Cevallos-Cevallos, J. M., Nunes da Rocha, U., Arevalo, H. A., Stansly, P. A., Roberts,

P. D., & van Bruggen, A. H. C. (2013). Relation between plant nutrition, hormones,

insecticide applications, bacterial endophytes, and Candidatus Liberibacter Ct values in

citrus trees infected with Huanglongbing. European Journal of Plant Pathology, 137(4),

727–742. https://doi.org/10.1007/s10658-013-0283-7.

Shields, J. A., Paul, E. A., Arnaud, R. J. St., & Head, W. K. (1968). Spectrometric Measurement of Soil Color and its Relationship to Moisture and Organic Matter. Can. J. Soil Sci., 48(17), 271–280.

Shin, K., Ascunce, M. S., Narouei-Khandan, H. A., Sun, X., Jones, D., Kolawole, O. O., … van

Bruggen, A. H. C. (2016). Effects and side effects of penicillin injection in huanglongbing

affected grapefruit trees. Crop Protection, 90, 106–116.

https://doi.org/10.1016/j.cropro.2016.08.025.

Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image

recognition. 3rd International Conference on Learning Representations, ICLR 2015 -

Conference Track Proceedings, 1–14.

Sladojevic, S., Arsenovic, M., Anderla, A., Culibrk, D., & Stefanovic, D. (2016). Deep Neural

Networks Based Recognition of Plant Diseases by Leaf Image Classification.

Computational Intelligence and Neuroscience, 2016. https://doi.org/10.1155/2016/3289801.

Soil Science Division Staff (2017). Soil Survey Manual. United States Department of

Agriculture, (Handbook No. 18). 587.

Soil Survey Staff. (2014). Soil Survey Field and Laboratory Methods Manual. United States

Department of Agriculture, Natural Resources Conservation Service, (Soil Survey

Investigations Report No. 51, Version 2.0), 487.

https://doi.org/10.13140/RG.2.1.3803.8889.

Solberg, E. (2017). Deep neural networks for object detection in agricultural robotics. Retrieved

from https://brage.bibsys.no/xmlui/handle/11250/2463891.

Spann, T. M., & Schumann, A. W. (2009). The Role of Plant Nutrients in Disease Development

with Emphasis on Citrus and Huanglongbing. Proceedings of the Florida State Horticultural Society, 122, 169–171.

Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout:

A simple way to prevent neural networks from overfitting. Journal of Machine Learning

Research, 15, 1929–1958.

Stansly, P. A., Qureshi, J. A., Stelinski, L. L., & Rogers, M. E. (2019). 2018-2019 Florida Citrus

Production Guide: Asian Citrus Psyllid and Citrus Leafminer. 1–9. Retrieved from

https://edis.ifas.ufl.edu.


Stansly, P. A., Arevalo, H. A., Qureshi, J. A., Jones, M. M., Hendricks, K., Roberts, P. D., &

Roka, F. M. (2014). Vector control and foliar nutrition to maintain economic sustainability

of bearing citrus in Florida groves affected by huanglongbing. Pest Management Science,

70(3), 415–426. https://doi.org/10.1002/ps.3577.

Stiglitz, R., Mikhailova, E., Sharp, J., Post, C., Schlautman, M., Gerard, P., & Cope, M. (2018).

Predicting Soil Organic Carbon and Total Nitrogen at the Farm Scale Using Quantitative

Color Sensor Measurements. Agronomy, 8(10), 212.

https://doi.org/10.3390/agronomy8100212.

Sun, H., Nelson, M., Chen, F., & Husch, J. (2009). Soil mineral structural water loss during loss

on ignition analyses. Canadian Journal of Soil Science, 89(5), 603–610.

https://doi.org/10.4141/CJSS09007.

Sun, R. (2019). Optimization for deep learning: theory and algorithms. 1–60. Retrieved from

http://arxiv.org/abs/1912.08957.

Swetha, R. K., Bende, P., Singh, K., Gorthi, S., Biswas, A., Li, B., … Chakraborty, S. (2020).

Predicting soil texture from smartphone-captured digital images and an application.

Geoderma, 376(June), 114562. https://doi.org/10.1016/j.geoderma.2020.114562.

Szegedy, C., Ioffe, S., Vanhoucke, V., & Alemi, A. A. (2017). Inception-v4, inception-ResNet

and the impact of residual connections on learning. 31st AAAI Conference on Artificial

Intelligence, AAAI 2017.

Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., … Rabinovich, A. (2015). Going Deeper with Convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, 1–9. https://doi.org/10.1109/CVPR.2015.7298594.

Szegedy, C., Vincent, V., & Ioffe, S. (2014). Inception-v3: Rethinking the Inception Architecture for Computer Vision. Retrieved from http://arxiv.org/abs/1512.00567.

Tan, C., Sun, F., Kong, T., Zhang, W., Yang, C., & Liu, C. (2018). A survey on deep transfer

learning. Lecture Notes in Computer Science (Including Subseries Lecture Notes in

Artificial Intelligence and Lecture Notes in Bioinformatics), 11141 LNCS, 270–279.

https://doi.org/10.1007/978-3-030-01424-7_27.

Tan, M., & Le, Q. V. (2019). EfficientNet: Rethinking model scaling for convolutional neural

networks. 36th International Conference on Machine Learning, ICML 2019, 2019-June,

10691–10700.

Tan, M., Pang, R., & Le, Q. V. (2020). EfficientDet: Scalable and Efficient Object Detection.

10778–10787. https://doi.org/10.1109/cvpr42600.2020.01079.

Timmer, L. W., Roberts, P. D., Chung, K. R., & Bhatia, A. (2008). Greasy Spot. 1–4.


Toriyama, K. (2020). Development of precision agriculture and ICT application thereof to

manage spatial variability of crop growth. Soil Science and Plant Nutrition, 00(00), 1–9.

https://doi.org/10.1080/00380768.2020.1791675.

Tsai, J. H. (2006). Asian Citrus Psyllid, Diaphorina Citri Kuwayama (Hemiptera: Psyllidae).

Encyclopedia of Entomology, 205–207. https://doi.org/10.1007/0-306-48380-7_324.

Vashisth, T., & Grosser, J. (2018). Comparison of Controlled Release Fertilizer (CRF) for Newly

Planted Sweet Orange Trees under Huanglongbing Prevalent Conditions. Journal of

Horticulture, 05(03), 3–7. https://doi.org/10.4172/2376-0354.1000244.

Vincent, D. R., Deepa, N., Elavarasan, D., Srinivasan, K., Chauhdary, S. H., & Iwendi, C.

(2019). Sensors driven ai-based agriculture recommendation model for assessing land

suitability. Sensors (Switzerland), 19(17). https://doi.org/10.3390/s19173667.

Xing, S., Lee, M., & Lee, K. K. (2019). Citrus pests and diseases recognition model using

weakly dense connected convolution network. Sensors (Switzerland), 19(14).

https://doi.org/10.3390/s19143195.

Xu, F., Hao, Z., Huang, L., Liu, M., Chen, T., Chen, J., … Yao, M. (2020). Comparative

identification of citrus huanglongbing by analyzing leaves using laser-induced breakdown

spectroscopy and near-infrared spectroscopy. Applied Physics B: Lasers and Optics, 126(3),

2–8. https://doi.org/10.1007/s00340-020-7392-8.

Yang, C., Powell, C. A., Duan, Y., Shatters, R. G., Lin, Y., & Zhang, M. (2016). Mitigating

citrus huanglongbing via effective application of antimicrobial compounds and

thermotherapy. Crop Protection, 84, 150–158. https://doi.org/10.1016/j.cropro.2016.03.013.

Yang, Y., Wang, L., Wendroth, O., Liu, B., Cheng, C., Huang, T., & Shi, Y. (2019). Is the Laser

Diffraction Method Reliable for Soil Particle Size Distribution Analysis? Soil Science

Society of America Journal, 83(2), 276–287. https://doi.org/10.2136/sssaj2018.07.0252.

Zagoruyko, S., & Komodakis, N. (2016). Wide Residual Networks. British Machine Vision

Conference 2016, BMVC 2016, 2016-September, 87.1-87.12. https://doi.org/10.5244/C.30.87.

Zambon, F. T., Kadyampakeni, D. M., & Grosser, J. W. (2019). Ground application of overdoses

of manganese have a therapeutic effect on sweet orange trees infected with Candidatus

Liberibacter asiaticus. HortScience, 54(6), 1077–1086.

https://doi.org/10.21273/HORTSCI13635-18.

Zhang, M., Yang, C., & Powell, C. A. (2015). Application of antibiotics for control of citrus

huanglongbing. Advances in Antibiotics & Antibodies, 1(1), e101.

Zhang, S., Lu, X., Zhang, Y., Nie, G., & Li, Y. (2019). Estimation of soil organic matter, total

nitrogen and total carbon in sustainable coastal wetlands. Sustainability (Switzerland), 11(3).

https://doi.org/10.3390/su11030667.


Zhang, Z., & Sabuncu, M. R. (2018). Generalized cross entropy loss for training deep neural

networks with noisy labels. Advances in Neural Information Processing Systems, 2018-December (NeurIPS), 8778–8788.

Zoph, B., Vasudevan, V., Shlens, J., & Le, Q. V. (2018). Learning Transferable Architectures for

Scalable Image Recognition. Proceedings of the IEEE Computer Society Conference on

Computer Vision and Pattern Recognition. https://doi.org/10.1109/CVPR.2018.00907.


BIOGRAPHICAL SKETCH

Perseverança da Delfina Khossa Mungofa was born in Nhamatanda District, Sofala

Province in Mozambique. She started her career in agronomy at Chimoio Agriculture Institute.

After completing her Technical degree, in 2013 she was awarded a full scholarship by the MasterCard Foundation to study Agricultural Sciences and Natural Resources Management at EARTH University in Costa Rica. In 2016 she worked with NASA-SERVIR-Eastern and Southern Africa on the impacts of climate variability on water and food security in Kenya. In 2017, while at EARTH University, she worked with the Centro de Agricultura de Precision (CAP) as a lab assistant. She earned her bachelor’s degree in December 2017.

In January of 2018, she joined the Soil and Precision Agriculture Laboratory at Citrus Research

and Education Center, in Lake Alfred, Florida, as a visiting scholar. In Spring of 2019, she began

her Master of Science degree at the University of Florida-Gainesville, FL in the Department of

Soil and Water Sciences. During her program she also was a Graduate Assistant at the Soil and

Precision Agriculture Laboratory in Lake Alfred and Soil Physics Laboratory in Gainesville, FL.

She completed her Master of Science degree in soil and water sciences in 2020.

