
Calculating the Weight of a Pig

through Facial Geometry using

2-Dimensional Image Processing

by

Alexander W. Clark, B.S.

A Thesis

In

Electrical Engineering

Submitted to the Graduate Faculty

of Texas Tech University in

Partial Fulfillment of

the Requirements for

the Degree of

MASTER OF SCIENCES

IN

ELECTRICAL ENGINEERING

Approved

Dr. Brian Nutter

Chair of Committee

Dr. Sunanda Mitra

Committee Member

Mark Sheridan

Dean of the Graduate School

August, 2015


Copyright 2015, Alexander W. Clark


ACKNOWLEDGMENTS

I want to first thank Dr. Nutter, not only for setting me up with this project, but for

looking out for me throughout my entire education at Texas Tech. When I was taking

the challenging Electronics II with him, my parents had to remind me that Dr. Nutter

“was my friend” by pushing his students so hard. Now that I am graduating though, I

truly can say that he is my friend. Dr. Nutter has helped me countless times and directly

enabled me to achieve my dreams. I hope he knows that the little favors and many

hours he pours into his students are remembered and appreciated forever.

I would like to acknowledge the folks at Animal Biotech, including Garrett

Thompson and Dr. John McGlone, for dreaming up this crazy idea and giving me the

thousands of pig pictures I needed to make it happen.

I also graciously thank Dr. Mitra for being on my committee. Your willingness

to help a student you don’t know well is as impressive as it is appreciated.

I especially need to thank my family as well. My parents have always

encouraged me to aspire for impossible dreams and are the first to believe I can

accomplish anything. Thank you for helping me become an accomplished engineer by

first teaching me to be a man of character. Your love has always been known and felt,

even out here in the plains of West Texas. Don’t worry – I’m coming home.

And Rachel, thanks for putting up with the many hours of “piggie piggie crop

crop” and supporting the crazy timeline I was shooting for. It helps that you always

knew I would figure this project out – even when I didn’t think I would!

Lastly, I cannot conclude these acknowledgements without expressing my

firm belief that nothing I achieve could be possible or would have any value aside

from my faith in Christ. The education I have received and the many hours of work I

have put into this degree cannot take away from the true victor that has led me here.

“For the horse is made ready for the day of battle,

but the victory rests on the Lord.”

Proverbs 21:31


TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
I. INTRODUCTION
II. FEATURE DETECTION USING THE VIOLA-JONES FRAMEWORK
    Grayscale Conversion and Coordinate Plane
    Integral Images
    AdaBoost Technique
        Introduction to Algorithm
        AdaBoost Algorithm Description
    Cascade Classifiers
    Summary of Training Parameters on Classifiers Created
    Feature Detection Conclusion
III. FALSE POSITIVE REDUCTION AND VALID FEATURE SELECTION
    Valid Face Selection
    Valid Nose Selection
    Valid Eye Selection
        Properly Cropping the Eye Photo
        Limiting Eye Classification to a Dynamic Region of Interest
        Selecting Valid Set of Eyes Based on Probability
    Classification Testing Results
IV. FACIAL RECOGNITION
    Uniform Transformation of Pig Facial Features
        The Need for Common Feature Positions in Facial Recognition
        Perspective Quadrilateral Mapping
        Perspective Transformation of Pixel Position
        Bicubic Pixel Interpolation
    Local Binary Patterns


        Features of Local Binary Patterns
        Local Binary Images
        Histogram Comparison
    Unsupervised Data Clustering
    Facial Recognition Conclusion
V. REGRESSION
    Examination of Features
        The Feature Vector Sets
        Methods Attempted
        Undesirable Results
    Least Squares Method with Interdependent Predictors
        Desirable Results
        Averaging Pig Clusters
        Predictor Creation
        Primary Features Used
        Least Squares Methodology
    Regression Conclusion
VI. CLUSTER ADJUSTMENTS
    Outlier Detection
        Cluster Minimization
        Nose Angle Limitation
        Grubbs' T-test
    Cluster Regrouping
        Algorithm for Regrouping Fractured Clusters
        Final Cluster Regrouping Results
    Cluster Adjustments Conclusion
VII. CONCLUSION
    Accomplishments
    Future Work
    Closing Remarks
BIBLIOGRAPHY
APPENDICES
A. CONSOLE LOG DURING FEATURE DETECTION


B. EXAMPLES OF PIGS CLASSIFIED USING PROGRAM
C. CONSOLE LOG DURING TRAINING MODE
D. CONSOLE LOG AT END OF TRAINING MODE
E. CONSOLE LOG DURING FACE RECOGNIZER MODE
F. CONSOLE LOG AFTER FACE RECOGNIZER MODE
G. HOW-TO GUIDE ON RUNNING PIG ESTIMATION PROGRAMS
H. MATLAB CODE FOR LEAST SQUARES REGRESSION
I. OPENCV C++ SOURCE CODE


ABSTRACT

This thesis will outline the groundwork of facial detection and recognition

software to be used with pigs in order to estimate their weight from a digital image.

The facial detection of the pig is achieved through identification of the features using

the Viola-Jones method for cascade classifiers and basic likelihood functions. The

document will cover both the general theory behind these concepts and the actual

implementation as used in the software. Next, the need to transform the newly detected pig face for facial recognition is covered through perspective transformation and bicubic pixel interpolation of the facial geometries. After this, the

thesis will discuss the use of local binary patterns to sort the photos of the pigs with an

unsupervised clustering technique. Next, the implementation of least squares

regression is covered to predict the weight of a pig from the facial features. Finally,

the thesis will conclude with a discussion on the multiple error-checking and outlier

correction techniques used to make the software more robust.


LIST OF TABLES

2.1 Summary of Training Parameters for Final Cascade Classifiers
6.1 Tcrit Values for Grubbs’ T-test for Outliers [“Outlier”]


LIST OF FIGURES

1.1 Program Flow Chart
2.1 Transformation of Source into Equalized Grayscale Image
2.2 Project Coordinate System
2.3 Integral Image Example
2.4 Example Rectangular Features Enclosed in Detection Window
2.5 Examples of Haar Features in a Pig Photo
2.6 Cascade Classifier Flow Chart [Viola]
2.7 Cascade Classifier Optimization Algorithm [Viola]
2.8 Transformations Performed on Positive Image Set for Robustness
2.9 Examples of Positive Images Used for Pig Features
3.1 Selection of Most Valid Face
3.2 Selection of Most Valid Nose
3.3 Early Eye Classifiers’ Prolific False Positives
3.4 Cropped Images for Eye Classifiers
3.5 Classifying Entire Image for Eyes
3.6 Classifying Only the Face Region of Interest for Eyes
3.7 Eye False Positives Correctly Rejected
3.8 Examples of Rejected Images Due to Lack of Adequate Features
4.1 Four Different Pictures of the Same Pig
4.2 Perspective Transformation of the Pig Face
4.3 Perspective Transformation of a Warped Quadrilateral
4.4 The Basic Pixel Interpolation Model
4.5 Sixteen Neighbors Used for Bicubic Interpolation
4.6 A 3x3 Pixel LBP Example
4.7 LBP Feature Examples [Wagner]
4.8 Local Binary Pattern Image and Histogram Concatenation
4.9 Supervised LBP Facial Recognition Flowchart
4.10 Unsupervised LBP Facial Recognition Flowchart
5.1 Pig Face Feature Vectors


5.2 Example of Unsuccessful Regression with High Bias and Low Variance
5.3 Example of Overfitting the Training Set
5.4 High Variance of a Testing Set after Overfitting the Training Set
5.5 Regression Results of Features for Training Set
5.6 Regression Results of Features for Testing Set
5.7 Averaging Results of Training Set
5.8 Averaging Results of Testing Set
6.1 Example of Misclassification Error in the Pig Photo
6.2 Example of Cluster 6 Being Discarded Due to Insufficient Size
6.3 Picture of Pig Face Exhibiting Desirable Eye and Nose Angles
6.4 Picture of Pig Exhibiting Undesirable Eye and Nose Angles
6.5 Example of Pig Misidentification
6.6 Example of Cluster Fracturing With Unintentionally Split Clusters Highlighted
6.7 Cluster Regrouping Flowchart
6.8 Console Log of Clusters Being Combined Using the Regrouping Method of Supervised Facial Recognition
6.9 Console Log of Cluster Being Rejected for Combination Using the Regrouping Method of Supervised Facial Recognition
6.10 Cluster Regrouping Final Results
7.1 Necessary Standard Metrics for Camera Set-Up
7.2 Devices Used in Setting Up the Standard Metrics of the Camera System


CHAPTER I

INTRODUCTION

Knowing the weight of a pig is vital in the agricultural market of meat

processing. There is a target weight that an ideal pig should have when the farmer sells

it to make maximum profit. If a pig is below this target weight, then profits are lost

because the pig was not the standard required size. If the pig is above the target

weight, then that is excess, unused weight. This extra weight is ultimately a waste of

resources used to grow the pig to a mass larger than needed for the standardized size.

Weighing pigs is not a trivial task though. Getting an individual pig to

cooperate and stand still on a scale long enough to be measured can be very difficult,

especially if using a balance scale rather than a digital one.

The pig’s weight does not necessarily have to be measured directly though.

Pigs have a unique characteristic: the distance between their eyes grows linearly with their weight. If a person knew the exact distance between a pig’s eyes, then they could easily predict its weight. Using this fact, one could presumably create software that uses a digital image of a pig to calculate the distance between its eyes and ultimately predict its weight.

Application in the agriculture industry would include day-by-day tracking of

pig weights in order to calculate the ideal time to sell a pig to the market. Anything

underneath an ideal weight isn’t worth the full market price. Anything above that ideal

weight is wasted resources. Farmers can increase profit by optimizing their selling

procedures to match the growth rate of the pigs in their care. Measuring the weight of

a pig automatically from an image also requires significantly less manpower than

weighing each pig individually on a mechanical scale.

The software designed in the course of this project seeks to meet this need by

providing the means of determining the weight of a pig using nothing more than a

semi-low resolution camera in the pig pen.


The goals of the project were to develop a low-cost solution to capturing

pictures of pigs and using image processing to estimate the weight of the pig. The low

cost solution, a miniature computer called the Raspberry Pi with a camera, was placed

in several pig pens with the help of Animal Biotech. Water flow from the spigot where

the pigs drink was monitored by a sensor connected to the computer. Whenever a pig

drank from the spigot, a digital image of the pig was taken and stored on the device.

The involvement of the project covered in this document does not go into great depth

on the hardware of the project. Rather, it will cover how the design goals of the

software were met and the processes that were used in the image processing.

The software being designed must be self-contained within the box that houses

the computer and camera. It would not be possible on most farms to take pictures at

the box and transmit those wirelessly. The bandwidth and connection simply do not

exist at typical farms. Instead, the picture must be processed and the data stored within

the pig pen. This requires that the software be fast so that digital images can be

efficiently processed and discarded to keep the limited memory space on the tiny

computer module free. A design goal on speed is that an image can be captured and processed in under a second.

The image processing software that manages the images, which throughout this thesis will simply be referred to as the program, must also run on a tiny

computer module, such as the Raspberry Pi. For this reason, the computer language

C++ was chosen in conjunction with the open source library OpenCV. With this

library and language, an executable can be compiled and placed on the device without

the installation of any other advanced programs with image processing capabilities.

As for the processing itself, the software has two main goals: predict the

weight of the pig and identify the pig that the face and weight belong to. The first step

is an object detection problem, while the second is facial recognition. This document

will fully cover how the features of the pigs are detected, how they correlate to the

estimated weight of the pig, and finally how unsupervised facial recognition

technology is used on pigs to sort the data.


A full program flow chart is shown below in Figure 1.1. The thesis will follow

the outline and processes of this flow chart. All operations will be described in the

same sequential order as the program flow.

Figure 1.1: Program Flow Chart


CHAPTER II

FEATURE DETECTION USING THE VIOLA-JONES

FRAMEWORK

The first step in calculating the facial geometries begins with the creation of

classifiers that can identify the three main features of a pig head: the face, the eyes,

and the nose. To service the needs of the project, the Viola-Jones Framework is

followed to create three cascade classifiers. This section presents a summary of the

Viola-Jones framework for facial detection, as presented in their paper [Viola]. It is a

robust and rapid method that utilizes three important tools: integral images for quick

feature evaluation, AdaBoost to construct classifiers, and cascade classifiers to further

reduce the operating time.

Grayscale Conversion and Coordinate Plane

Before covering the Viola-Jones Framework, it is worth mentioning that all

image processing in this project was on monochromatic images. Although many of the

images shown in this document and output of the program are represented in color, the

actual calculations were performed on grayscale images transformed by Equation 2.1.

RGB\ to\ Gray:\quad Y \leftarrow 0.299 \cdot R + 0.587 \cdot G + 0.114 \cdot B   (2.1)

After the grayscale conversion, we also equalize the histogram of every

normalized image used. This means we calculate the histogram 𝐻, normalize it so that

there are 256 bins (the number of possible pixel values in the grayscale image), and

then calculate the cumulative distribution function of the histogram using Equation 2.2

[“Histogram”].

H'_i = \sum_{0 \le j < i} H(j)   (2.2)

The image can then be transformed to increase the contrast and normalize the

brightness by using H' as a look-up table (Equation 2.3) [“Histogram”].

dst(x, y) = H'(src(x, y))   (2.3)

The transformation of the source image is shown below in Figure 2.1.


Figure 2.1: Transformation of Source into Equalized Grayscale Image
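As a point of reference, the conversion and equalization described by Equations 2.1 through 2.3 map directly onto two OpenCV calls. The following is a minimal sketch with illustrative names, not the exact program source:

    #include <opencv2/opencv.hpp>

    // Convert a color frame to the equalized grayscale image used for all
    // further processing: Equation 2.1 (grayscale conversion) followed by
    // histogram equalization (Equations 2.2 and 2.3).
    cv::Mat preprocess(const cv::Mat& src)
    {
        cv::Mat gray, equalized;
        cv::cvtColor(src, gray, cv::COLOR_BGR2GRAY);   // Y = 0.299R + 0.587G + 0.114B
        cv::equalizeHist(gray, equalized);             // CDF used as a look-up table
        return equalized;
    }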

Finally, it is also worth noting that we use the coordinate directions depicted in

Figure 2.2 throughout the project.

Figure 2.2: Project Coordinate System

Integral Images

The first key concept in the Viola-Jones Framework is forming an integral

image. An integral image is a representation of an image that allows the sum of all the

pixels in a rectangular region of the image to be computed in a constant time,

independent of the size of the rectangle. Each element in the integral image is the

inclusive sum of all the pixels of the original image that are above and to the left of the

pixel in the original image. To demonstrate the use of the integral image, consider

Figure 2.3.


Figure 2.3: Integral Image Example

The sum of the pixels in region D in the original image is equal to the value of

element 1 minus the values of elements 2 and 3, plus the value in element 4. This

integral image greatly simplifies the calculation of the Haar-like features that are used

for facial detection. Three types of features are used in Viola-Jones: two rectangle

features, which require six array references; three rectangle features, which require

eight; and four rectangle features, which require nine (Figure 2.4).

Figure 2.4: Example Rectangular Features Enclosed in Detection Window

The features shown in squares A and B of the figure are two-rectangle features, C is a

three-rectangle feature, and D is a four-rectangle feature.
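As a concrete illustration of the constant-time rectangle sums described above, the sketch below uses OpenCV's cv::integral(), which produces a sum image one pixel larger in each dimension with a zero first row and column. Names are illustrative only:

    #include <opencv2/opencv.hpp>

    // Sum of the pixels in the rectangle [x, x+w) x [y, y+h) of the original
    // image, computed from its integral image with four array references.
    int rectSum(const cv::Mat& integralImg, int x, int y, int w, int h)
    {
        return integralImg.at<int>(y + h, x + w) - integralImg.at<int>(y, x + w)
             - integralImg.at<int>(y + h, x)     + integralImg.at<int>(y, x);
    }

    // Usage sketch (a two-rectangle feature: white half minus grey half):
    //   cv::Mat sum;
    //   cv::integral(grayImage, sum, CV_32S);
    //   int feature = rectSum(sum, 0, 0, 12, 24) - rectSum(sum, 12, 0, 12, 24);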


These features are calculated as the sum of the pixels in the white rectangles

minus the sum of the pixels in the grey rectangles. Examples of how these features

correlate to the actual images of the pigs can be seen below in Figure 2.5.

Figure 2.5: Examples of Haar Features in a Pig Picture

The two pictures on the left focus on features associated with the eyes. The

upper-left one correlates with the face as a whole and the dark horizontal area

associated with the eyes while the lower-left one looks at the light space associated

with the bridge of the nose.

The two pictures on the right focus on features associated with the sides of the

pig’s head. The upper-right picture exhibits a feature that frames the face of the pig,

while the lower-right picture shows the feature associated with the angle along the

edge of the pig head and the background.

All of the features shown in the above image could individually be considered weak classifiers, each roughly defining an aspect of the desired object.


AdaBoost Technique

Introduction to Algorithm

For the 24 x 24 pixel windows used in the Viola-Jones paper and for the eye

and face classifiers, there are approximately 180,000 possible features. Rather than

creating a classifier using all of the features, it would be helpful to only use a select

subset of feature vectors that have the greatest effect on detecting the desired object in

the window. This is where the AdaBoost technique is useful. “Adaptive Boosting” is

used to create a strong classifier by selecting and combining several weak classifiers.

Essentially, the technique iterates and selects the classifier with the lowest

classification error and an associated weight. After this classifier is combined with the

others, the algorithm continues until the desired total number of weak classifiers is

reached. The AdaBoost algorithm as used by Viola-Jones is described in the section

below.

AdaBoost Algorithm Description

Start with example images (x1, y1), …, (xn, yn), where yi = 0 for negative examples (lacking the desired object) and yi = 1 for positive examples (containing the desired object).

The initial weight for each training example, w1,i, is determined by Equation 2.4 [Viola] below, where m is equal to the number of negative images and l is the number of positive images.

w_{1,i} = \begin{cases} \dfrac{1}{2m}, & \text{for } y_i = 0 \\[4pt] \dfrac{1}{2l}, & \text{for } y_i = 1 \end{cases}   (2.4)

Next, AdaBoost iterates the variable t from 1, …, T, where T is the desired number of weak classifiers to be selected. At the beginning of each iteration, the weights of the training examples are normalized into a probability distribution using Equation 2.5 [Viola].

w_{t,i} \leftarrow \dfrac{w_{t,i}}{\sum_{j=1}^{n} w_{t,j}}   (2.5)


Now, for every feature, j, the algorithm trains a weak classifier hj which is

limited to just this one feature. The error of that classifier is calculated with respect to

wt (Equation 2.6) [Viola].

\epsilon_j = \sum_i w_i \, |h_j(x_i) - y_i|   (2.6)

Next, the classifier, ht, with the smallest error ϵt is chosen and the weights are

updated (Equation 2.7).

w_{t+1,i} = \begin{cases} w_{t,i} \cdot \dfrac{\epsilon_t}{1 - \epsilon_t}, & \text{if } x_i \text{ is classified correctly} \\[4pt] w_{t,i}, & \text{otherwise} \end{cases}   (2.7)

After all of the weights have been fully updated, the final strong classifier is expressed by Equation 2.8.

h(x) = \begin{cases} 1, & \sum_{t=1}^{T} h_t(x) \log\left(\dfrac{1-\epsilon_t}{\epsilon_t}\right) \ge \dfrac{1}{2} \sum_{t=1}^{T} \log\left(\dfrac{1-\epsilon_t}{\epsilon_t}\right) \\[4pt] 0, & \text{otherwise} \end{cases}   (2.8)
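Putting Equations 2.4 through 2.8 together, the boosting loop can be sketched as follows. This is an illustrative, deliberately unoptimized C++ example that uses simple threshold stumps as the weak classifiers; it is not the OpenCV training code that was actually used to build the thesis classifiers:

    #include <vector>
    #include <cmath>
    #include <cstdlib>

    struct Stump { int feature; double threshold; int polarity; double alpha; };

    // featureValues[j][i] = value of Haar feature j on training window i;
    // labels[i] is 1 for positive windows and 0 for negative windows.
    std::vector<Stump> adaboostTrain(const std::vector<std::vector<double> >& featureValues,
                                     const std::vector<int>& labels, int T)
    {
        const size_t n = labels.size();
        double l = 0;  for (int y : labels) l += y;     // number of positives
        double m = n - l;                               // number of negatives

        // Equation 2.4: initial weights, split evenly over positives and negatives.
        std::vector<double> w(n);
        for (size_t i = 0; i < n; ++i) w[i] = labels[i] ? 1.0 / (2.0 * l) : 1.0 / (2.0 * m);

        std::vector<Stump> strong;
        for (int t = 0; t < T; ++t) {
            // Equation 2.5: normalize the weights into a distribution.
            double sum = 0;  for (double wi : w) sum += wi;
            for (double& wi : w) wi /= sum;

            // Equation 2.6: pick the single-feature stump with the lowest weighted error.
            Stump best = {0, 0.0, 1, 0.0};
            double bestErr = 1.0;
            for (size_t j = 0; j < featureValues.size(); ++j)
                for (int p = -1; p <= 1; p += 2)
                    for (double thr : featureValues[j]) {        // candidate thresholds
                        double err = 0;
                        for (size_t i = 0; i < n; ++i) {
                            int h = (p * featureValues[j][i] < p * thr) ? 1 : 0;
                            err += w[i] * std::abs(h - labels[i]);
                        }
                        if (err < bestErr) { bestErr = err; best.feature = (int)j;
                                             best.threshold = thr; best.polarity = p; }
                    }

            // Equation 2.7: reduce the weight of correctly classified examples.
            double beta = bestErr / (1.0 - bestErr);
            for (size_t i = 0; i < n; ++i) {
                int h = (best.polarity * featureValues[best.feature][i] <
                         best.polarity * best.threshold) ? 1 : 0;
                if (h == labels[i]) w[i] *= beta;
            }
            best.alpha = std::log(1.0 / beta);                   // weak-classifier vote weight
            strong.push_back(best);
        }
        // Equation 2.8: the strong classifier outputs 1 when the alpha-weighted
        // votes of the T stumps reach half of the total alpha.
        return strong;
    }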

Cascade Classifiers

The simplest way to improve the performance of the AdaBoost classifier is to

increase the number of features used, but this directly increases the computation time

required. Hundreds of windows, or sections of the digital image at different scales and

positions, must be tested for the object as well. To improve the performance of the

detection system while keeping the computation time low, a cascade of classifiers was

used, as seen in Figure 2.6, where failing one stage immediately discards that window,

and passing the stage allows the next classifier to be applied.

Figure 2.6: Cascade Classifier Flow Chart [Viola]


Basically, it is logically assumed that most windows will not have a face, eye,

or nose present in them, so the first classifiers in the cascade reject the windows that

obviously lack the particular object. This eliminates most of the windows in the image

using a computationally inexpensive classifier. The later stages increase in complexity

to ensure that a face, eye, or nose is truly present, but since most windows do not

reach these stages, they do not significantly affect the computation time for the image

as a whole. The algorithm used to optimize the Viola-Jones cascade classifiers is given

in Figure 2.7.

Figure 2.7: Cascade Classifier Optimization Algorithm [Viola]

Given the desired overall detection rate, false positive rate, and the number of

stages, the necessary performance of the individual stages can be found. Then, using

the AdaBoost technique, a classifier can be trained for each stage that meets those

specifications. The cascade classifier, when combined with the other techniques

mentioned, allows faces in an image to be detected accurately and efficiently.
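For reference, applying a trained cascade at run time reduces to a few OpenCV calls. The classifier file name below is hypothetical; the same pattern is repeated for the eye and nose classifiers:

    #include <opencv2/opencv.hpp>
    #include <vector>

    std::vector<cv::Rect> detectPigFaces(const cv::Mat& grayEqualized)
    {
        cv::CascadeClassifier faceCascade;
        faceCascade.load("pig_face_cascade.xml");           // hypothetical file name
        std::vector<cv::Rect> faces;
        faceCascade.detectMultiScale(grayEqualized, faces);  // windows surviving all stages
        return faces;
    }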


Summary of Training Parameters on Classifiers Created

Now that the algorithms and theory behind the classifiers have been discussed,

this section will cover a few specifics on the classifiers created for this project. In

total, three classifiers were created: a pig face, a pig eye, and a pig nose. All were

created using AdaBoost and Haar features and trained using OpenCV’s cascade

classifier training programs.

Below in Table 2.1, the parameters for each classifier are displayed.

Table 2.1: Summary of Training Parameters for Final Cascade Classifiers

Parameter              Face Classifier   Eye Classifier   Nose Classifier
Positive Set           2906              3986             4040
Negative Set           2542              2542             2542
Dimensions (pixels)    24x24             24x24            32x16
Stages                 20                32               33
Minimum Hit Rate       99.9%             99.9%            99.9%
Max False Hit Rate     50%               50%              50%

The positive set parameter designates the size of the set of images that have

been marked as positive examples of the object being trained. For instance, in the case

of the nose classifier, a set of 4040 cropped images of actual pig noses was loaded into the training program. This set of positive images was carefully chosen to include all kinds, shapes, and positions. In order to make sure that all pig noses can be classified, it is

important to make sure that images of all different types of noses with varying

markings are included. In order to simulate drastic changes in environmental lighting,

a good portion of the positive samples were repeated but with differing contrast levels

and exposure adjustments completed in a basic image editing program. This should

enable the classifier to work in a variety of lighting conditions. Lastly, in order to

increase the size of the positive sample set and ensure a symmetrical classifier,

every positive image that was cropped and doctored was also duplicated one more

time but flipped horizontally. This way, every positive image in the set also has a

matching mirrored image to accompany it. This process was repeated for both the face


and the eye classifiers too. An example of the transformations made on a positive

image of a pig eye can be seen below in Figure 2.8.

Figure 2.8: Transformations Performed on Positive Image Set for Robustness

The negative image set is a large set of any background images not containing

the desired objects. For this application, the focus was mainly on using thousands

of stock images of dirt, rocks, mud, iron bars, wood – background typical of a pig pen.

It is also worth noting that some cropped images of pigs without the desired feature

(for example, a pig nose would not contain a pig eye) were included to ensure a more

robust classifier. Not as many negative images were needed as positive images since

the training program takes a random cropped window from the negative image and

will reuse negative images by selecting different portions of the picture.

The dimensions designate the size of the positive images trained. For both the

face and the eye classifier, 24 by 24 pixel images were used. The nose classifier used a

32 by 16 pixel image to accommodate the long horizontal nature of a pig nose.

Positive images were all cropped to include as much of the desired feature (face, eye,

and nose) as possible without including too much of the background. Examples of

actual cropped images can be seen below in Figure 2.9.


Figure 2.9: Examples of Positive Images Used for Pig Features

A positive image of a pig face is shown on the left side of the figure, a positive image

for an eye is in the middle, and a positive image for a nose is on the right.

The stages designate the number of levels a window must go through on the

cascade classifier to be marked as valid. The face classifier uses 20 stages. The eye and nose classifiers are stricter with 32 and 33 stages, respectively. Pig faces as a

whole image have significantly less variation than the images of the eyes or nose,

meaning it is comparatively easier to distinguish if a window contains a pig face than

if it contains a nose or eye.

The minimum hit rate is the percentage of positive images that must be

correctly classified as a valid feature for a single stage. The larger the hit rate, the

better quality the classifier is, but the longer it takes to train the data. Here, 99.9% is

used to create high quality classifiers, but all three classifiers took well over 24 hours

to train.

Finally, the maximum false hit rate is the likelihood that negative images will

be incorrectly classified as positive features by a single stage. This high number of false positives is not an issue after all of the stages are combined: the per-stage rates compound, so a 20-stage cascade with a 50% maximum per stage passes at most 0.5^20 of the negative windows overall.

Feature Detection Conclusion

This project begins with the detection of facial features. Without the ability to

detect the location and size of those features on an image, the estimation of the pig’s

weight would be impossible. As covered in this chapter, we used the Viola-Jones


framework to create three cascade classifiers: pig faces, pig eyes, and pig noses. These

classifiers, composed of many weak classifiers combined together, can determine with

reasonable certainty whether or not a digital image has all three of those features. The

remaining chapters will cover what to do with the detected features in order to reach a

method for estimating the pig’s weight.


CHAPTER III

FALSE POSITIVE REDUCTION AND VALID FEATURE

SELECTION

One of the distinct advantages of cascade classifiers is the dynamic number of

objects classified. Therefore, the classifier can determine when there are no objects

detected in a picture, avoiding the problem of attempting to assign a location and size

of a pig face that does not exist in the digital image. While the ability to not force a

classification is desired, this also introduces the difficulty of false positives. In order to

select a valid feature from the detected objects, likelihood functions much be created

to ensure that the object we select is actually one that exists. In the end, only a picture

with a valid face, nose, and set of eyes will have its weight calculated.

Valid Face Selection

The valid face is determined by the classified object with the shortest

Euclidean distance to the center of the picture, like the valid frame for the pig face

selected in the example below in Figure 3.1.

Figure 3.1: Selection of Most Valid Face


Valid Nose Selection

The selection of the nose is similar to the selection of the face, except that for

all images taken of the pigs, the noses are located near the bottom of the picture.

Given this, the most valid nose is selected to be the one with the shortest Euclidean

distance to the bottom-center of the picture, like the nose selected in Figure 3.2 below.

Figure 3.2: Selection of Most Valid Nose
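Both selection rules reduce to keeping the detection whose center lies closest to a fixed target point: the image center for the face and the bottom-center for the nose. A brief sketch with illustrative names, not the exact program code:

    #include <opencv2/opencv.hpp>
    #include <vector>
    #include <cmath>
    #include <limits>

    // Return the detection whose center is closest to a target point.
    cv::Rect closestTo(const std::vector<cv::Rect>& detections, cv::Point2f target)
    {
        cv::Rect best;
        double bestDist = std::numeric_limits<double>::max();
        for (const cv::Rect& r : detections) {
            double cx = r.x + r.width / 2.0, cy = r.y + r.height / 2.0;
            double d = std::hypot(cx - target.x, cy - target.y);   // Euclidean distance
            if (d < bestDist) { bestDist = d; best = r; }
        }
        return best;
    }

    // e.g. face: closestTo(faces, cv::Point2f(img.cols / 2.0f, img.rows / 2.0f));
    //      nose: closestTo(noses, cv::Point2f(img.cols / 2.0f, (float)img.rows));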

Valid Eye Selection

Correct selection of the eyes is more complicated than of the face and the nose.

Due to the wide variation in pig eyes throughout their growth cycle and the disadvantage of a semi-low resolution camera, the eye classifier had to be created so that it is more accepting, or generic, than the nose and face classifiers. As a result,

there are many more false positives of eyes than any other feature.

Properly Cropping the Eye Image

Early classifiers for eyes had very poor results. Consider the output of the early

eye classifier shown in Figure 3.3; the image is riddled with false positives.


Figure 3.3: Early Eye Classifiers' Prolific False Positives

A large part of this is due to the cropping of the source images used to train the

classifier. Initial images were cropped very closely to the eyeball of the pig,

accidentally leaving out the precious features of the eyelid and folds around the eye.

Later images were cropped like the eyes shown below in Figure 3.4, which finally

yielded at most one false positive per pig face.

Figure 3.4: Cropped Images for Eye Classifiers

Limiting Eye Classification to a Dynamic Region of Interest

The next improvement that can be made to tighten the eye classification is to limit the region of interest to the face. Take, for instance, the image below in Figure

3.5, which does not have a limited region of interest.


Figure 3.5: Classifying Entire Image for Eyes

The only way to correctly classify the eyes when searching the whole picture is to

make a more general classifier with fewer features. This not only increases the number of false positives but also lengthens the processing time.

However, if we take the same image and limit it to a region of interest such as

just the face, we can require more features in the cascade classifier and reduce the

false positives classified in the picture to zero, such as shown below in Figure 3.6.

Figure 3.6: Classifying Only the Face Region of Interest for Eyes
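In OpenCV terms, this restriction is a small change: the eye classifier is run on the sub-image of the detected face, and the resulting rectangles are shifted back into whole-image coordinates. A short sketch with illustrative names:

    #include <opencv2/opencv.hpp>
    #include <vector>

    std::vector<cv::Rect> detectEyesInFace(cv::CascadeClassifier& eyeCascade,
                                           const cv::Mat& gray, const cv::Rect& face)
    {
        std::vector<cv::Rect> eyes;
        eyeCascade.detectMultiScale(gray(face), eyes);       // search only the face ROI
        for (cv::Rect& e : eyes) { e.x += face.x; e.y += face.y; }
        return eyes;
    }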


Selecting Valid Set of Eyes Based on Probability

Even though the number of false positives is greatly reduced, there is still a

need to select the valid set of eyes out of all of those classified, in case there are

misclassified objects.

The first step in weeding out the false positives is to ensure that there is at least

one eye on both the left and right side of the face. The pig face is divided into two

regions: the left and the right half. Then, the positions of all of the classified eyes are

checked against these regions to ensure that at least one eye falls into each region. If

this check fails, then it is considered that the current face has an invalid set of eyes.

After that first check, the valid set is selected by comparing all combinations of left and right eyes to determine which pair is most likely to be the valid set.

The most valid set of eyes is one that:

1. Has smallest least square relative error between 50x50 pixel scaled images

of the eyes (with right side flipped).

2. Has greatest similarity in size.

3. Is closest to the horizon line of the face.

4. Is farthest apart.

5. Has the smallest angle between the two.

We then mathematically determine the likelihood per pair of eyes so that the

pair with the highest likelihood is the valid pair of eyes.

The first criterion is measured using Equation 3.1, where 𝑖 represents an eye on

the left side of the face and 𝑗 represents an eye on the right.

P_1(i, j) = \dfrac{L2_{ij} - L2_{min}}{L2_{max} - L2_{min}},   (3.1)

where 𝐿2𝑖𝑗 represents the least squares relative error between two pairs of eyes. To

calculate the error, the image of the right eye is compared to the flipped image on the

left after both have been resized to the same dimensions. Then, pixel by pixel, the two


images are compared in value. The smaller the error value, the more similar the

images are.

Next, the sizes of the two eyes are compared using Equation 3.2, where 𝑤𝑖 and

𝑤𝑗 are the widths of the left and right eye being compared.

P_2(i, j) = 1 - \dfrac{|w_i - w_j|}{w_{max}}   (3.2)

The smaller the difference in the eye size is, the greater the probability that the

two eyes are a valid pair.

Next, the pair of eyes is measured for proximity to the middle of the face

using Equation 3.3.

P_3(i, j) = \dfrac{\frac{h_{face}}{2} - \left|\frac{h_{face}}{2} - y_i\right|}{\frac{h_{face}}{2}} + \dfrac{\frac{h_{face}}{2} - \left|\frac{h_{face}}{2} - y_j\right|}{\frac{h_{face}}{2}},   (3.3)

where ℎ𝑓𝑎𝑐𝑒 is the height of the detected face in pixels, and 𝑦𝑖 and 𝑦𝑗 are the y-

coordinates of the left and right eyes being compared. The closer to the middle of the

face the pair is, the higher the chances of it being a valid pair of eyes.

After that, Equation 3.4 looks at the distance between the eyes and favors the set

that is furthest apart. 𝑥𝑖 and 𝑥𝑗 are the x-coordinates of the eyes, and 𝑤𝑓𝑎𝑐𝑒 is the width

of the face.

P_4(i, j) = \dfrac{|x_i - x_j|}{w_{face}}   (3.4)

The final condition in Equation 3.5 favors the pair of eyes that are closer

together vertically.

P_5(i, j) = \dfrac{h_{face} - |y_i - y_j|}{h_{face}}   (3.5)

After all of these parameters are calculated, they can be summed together using

the likelihood function listed in Equation 3.6 and weighting factors 𝜑𝑛 to express the

features that are most important.


P_{tot}(i, j) = \sum_{n=1}^{5} \varphi_n P_n(i, j)   (3.6)

Finally, using this likelihood function, the maximum indices can be found for

the pair of eyes that would yield the highest likelihood, as shown in Equation 3.7. The

variable 𝑚 is the total number of left eyes, and 𝑛 is the total number of right eyes.

x_{ML} = \max_{0 \le i \le m,\; 0 < j < n} P_{tot}(i, j)   (3.7)
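To illustrate how the terms combine, the sketch below scores one left/right pair using P2 through P5 with all weights phi_n set to 1; P1, the pixel-comparison term, is omitted for brevity. The eye rectangles are assumed to be in face-relative coordinates, and the names are illustrative:

    #include <opencv2/opencv.hpp>
    #include <cmath>
    #include <cstdlib>

    double pairLikelihood(const cv::Rect& left, const cv::Rect& right,
                          const cv::Rect& face, double wMax)
    {
        double p2 = 1.0 - std::abs(left.width - right.width) / wMax;           // Eq. 3.2
        double half = face.height / 2.0;
        double p3 = (half - std::abs(half - left.y)) / half
                  + (half - std::abs(half - right.y)) / half;                  // Eq. 3.3
        double p4 = std::abs(left.x - right.x) / (double)face.width;           // Eq. 3.4
        double p5 = (face.height - std::abs(left.y - right.y))
                  / (double)face.height;                                       // Eq. 3.5
        return p2 + p3 + p4 + p5;   // Eq. 3.6 with phi_n = 1; maximized over all pairs
    }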

Now the program has the capability to reject eyes that were classified

incorrectly. Shown below in Figure 3.7 are two examples of images where a false

positive (shown in red) is successfully rejected and the correct pair of eyes validated

(shown in blue).

Figure 3.7: Eye False Positives Correctly Rejected

Classification Testing Results

In order to test that the classifiers and the valid feature selection code work,

the program was run on 705 images taken over a single day, every time a pig drank

from the water spigot. A screenshot of the console log during this test run can be seen

in Appendix A.

The classifiers performed well in rejecting the myriad of pictures inadequate

for facial geometry. Images such as the ones shown below in Figure 3.8 are rejected

for reasons such as the pig’s head being turned, eyelids being closed, an eye being

covered by an ear, or general blurriness.


Figure 3.8: Examples of Rejected Images Due to Lack of Adequate Features

In the test of 705 images, 283 images were processed as valid, yielding a

rejection rate of 59.86%. While this number may seem high, it is important to

remember that many images can be taken and processed every second while the pig

drinks. Rejecting a high number of invalid images, like the ones shown above, is not a

problem when there are a multitude of images to choose from.

Furthermore, for this same set, there were only 5 misclassifications, meaning

that 5 features were incorrectly classified. In all cases, it was a shadow or tear

stain misclassified as an eye. Even with these 5 misclassifications, that still yields a

misclassification rate of 0.71%, and the data obtained erroneously can easily be

discarded as an outlier, as discussed in Chapter 6.

Overall, the classification is very robust and surprisingly accurate. Captured

screenshots of the program positively classifying pig images can be found in

Appendix B.


CHAPTER IV

FACIAL RECOGNITION

Detecting the features of the pig to calculate the weight is only useful if one

can assign the weight to a specific pig. There must be a methodology for sorting the

pig faces and assigning weights to the appropriate owner.

Uniform Transformation of Pig Facial Features

The Need for Common Feature Positions in Facial Recognition

The first step in facial recognition is to transform the face of the pig using its

facial geometries. Take for instance the four pictures below in Figure 4.1.

Figure 4.1: Four Different Pictures of the Same Pig

While these pictures may look similar to the human eye because they all show the same pig, the computer has a harder time knowing that they are all the same subject. Subtle changes in the head’s rotation and angle create small variations in

feature locations, making it difficult for a computer to analyze. Therefore, it is

important that we map out all of the main features in a uniform fashion.

The chosen method for flattening the image out to a normalized coordinate

system, making it less susceptible to pig movement, is to map out a quadrilateral based


on the eyes’ location and size difference as well as the nose’s location and size. An

example of the transformation we want to perform is shown below (Figure 4.2).

Figure 4.2: Perspective Transformation of the Pig Face

Perspective Quadrilateral Mapping

Before the corners of the quadrilateral can be positioned, there are a few

metrics to be computed first. The first ones that are calculated are the angle between

the eyes, as shown below in Equation 4.1 and the Euclidean distance between them, as

shown in Equation 4.2. After that, a metric, ∆eye, is designated to mark the difference

in widths of one eye versus the other (Equation 4.3).

\theta_{eye} = \tan^{-1}\left(\dfrac{y_{right\_eye} - y_{left\_eye}}{x_{right\_eye} - x_{left\_eye}}\right)   (4.1)

d_{eye} = \sqrt{(x_{right\_eye} - x_{left\_eye})^2 + (y_{right\_eye} - y_{left\_eye})^2}   (4.2)

\Delta_{eye} = |w_{left\_eye} - w_{right\_eye}|   (4.3)


After that, we can then calculate the bisecting point between the eyes. This

point represents the location on the picture that is directly between the two eyes, as

shown in Equation 4.4.

bisector = \left(\dfrac{x_{right\_eye} + x_{left\_eye}}{2}, \dfrac{y_{right\_eye} + y_{left\_eye}}{2}\right)   (4.4)

This bisector point is useful in most of the nose calculations. Note the “T”

mark between the eyes and the nose in Figure 4.2 above. The bisector point is the

center of this intersection.

The angle of the nose can now be calculated by using the bisector as an anchor

for the angle, as shown in Equation 4.5, as well as the distance to the

nose (Equation 4.6).

\theta_{nose} = \tan^{-1}\left(\dfrac{y_{nose} - y_{bisector}}{x_{nose} - x_{bisector}}\right)   (4.5)

d_{nose} = \sqrt{(x_{bisector} - x_{nose})^2 + (y_{bisector} - y_{nose})^2}   (4.6)

The last metric that is useful in determining the transformation quadrilateral is

a point in the middle of the forehead determined by the angle of the nose as shown in

Equation 4.7.

forehead = \left(x_{bisector} - \sin(\theta_{nose}) \cdot d_{nose} \cdot \dfrac{2}{5},\; y_{bisector} - \cos(\theta_{nose}) \cdot d_{nose} \cdot \dfrac{2}{5}\right)   (4.7)

Now that all of those metrics are calculated, it’s easy to mark the quadrilateral

to be transformed. The top-right and top-left points of the quadrilateral, marked by

Equations 4.8 and 4.9, respectively, are a function of the forehead position as well as the distance, angle, and size difference between the eyes.

TR = \left(x_{forehead} + \cos(\theta_{eye}) \cdot \left[d_{eye} \cdot \dfrac{3}{4} + \dfrac{\Delta_{eye}}{2}\right],\; y_{forehead} + \sin(\theta_{eye}) \cdot \left[d_{eye} \cdot \dfrac{3}{4} + \dfrac{\Delta_{eye}}{2}\right]\right)   (4.8)

TL = \left(x_{forehead} - \cos(\theta_{eye}) \cdot \left[d_{eye} \cdot \dfrac{3}{4} - \dfrac{\Delta_{eye}}{2}\right],\; y_{forehead} - \sin(\theta_{eye}) \cdot \left[d_{eye} \cdot \dfrac{3}{4} - \dfrac{\Delta_{eye}}{2}\right]\right)   (4.9)


The bottom-left and bottom-right points of the quadrilateral, marked by

Equations 4.10 and 4.11, respectively, are functions of the size, angle, and position of

the nose.

BL = \left(x_{nose} - \cos(\theta_{nose}) \cdot w_{nose} \cdot \dfrac{3}{8},\; y_{nose} + \sin(\theta_{nose}) \cdot \dfrac{w_{nose}}{2}\right)   (4.10)

BR = \left(x_{nose} + \cos(\theta_{nose}) \cdot w_{nose} \cdot \dfrac{3}{8},\; y_{nose} - \sin(\theta_{nose}) \cdot \dfrac{w_{nose}}{2}\right)   (4.11)

Finally, we can use these four points in a perspective transformation to flatten

the image into a rectangle of any size using the perspective transformation technique

outlined in the next two sections.
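A condensed sketch of Equations 4.1 through 4.11 with illustrative names is shown below; the eye and nose centers and widths come from the validated detection rectangles, and atan2 is used for the arctangents in Equations 4.1 and 4.5:

    #include <opencv2/opencv.hpp>
    #include <cmath>

    // Output order: quad[0] = TL, quad[1] = TR, quad[2] = BR, quad[3] = BL.
    void faceQuadrilateral(cv::Point2f leftEye, cv::Point2f rightEye,
                           float wLeftEye, float wRightEye,
                           cv::Point2f nose, float wNose, cv::Point2f quad[4])
    {
        float thetaEye = std::atan2(rightEye.y - leftEye.y, rightEye.x - leftEye.x);   // (4.1)
        float dEye     = std::hypot(rightEye.x - leftEye.x, rightEye.y - leftEye.y);   // (4.2)
        float deltaEye = std::fabs(wLeftEye - wRightEye);                              // (4.3)
        cv::Point2f bisector = (leftEye + rightEye) * 0.5f;                            // (4.4)
        float thetaNose = std::atan2(nose.y - bisector.y, nose.x - bisector.x);        // (4.5)
        float dNose     = std::hypot(nose.x - bisector.x, nose.y - bisector.y);        // (4.6)
        cv::Point2f forehead(bisector.x - std::sin(thetaNose) * dNose * 2.0f / 5.0f,   // (4.7)
                             bisector.y - std::cos(thetaNose) * dNose * 2.0f / 5.0f);

        float plus  = dEye * 3.0f / 4.0f + deltaEye / 2.0f;
        float minus = dEye * 3.0f / 4.0f - deltaEye / 2.0f;
        quad[1] = cv::Point2f(forehead.x + std::cos(thetaEye) * plus,                  // TR (4.8)
                              forehead.y + std::sin(thetaEye) * plus);
        quad[0] = cv::Point2f(forehead.x - std::cos(thetaEye) * minus,                 // TL (4.9)
                              forehead.y - std::sin(thetaEye) * minus);
        quad[3] = cv::Point2f(nose.x - std::cos(thetaNose) * wNose * 3.0f / 8.0f,      // BL (4.10)
                              nose.y + std::sin(thetaNose) * wNose / 2.0f);
        quad[2] = cv::Point2f(nose.x + std::cos(thetaNose) * wNose * 3.0f / 8.0f,      // BR (4.11)
                              nose.y - std::sin(thetaNose) * wNose / 2.0f);
    }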

Perspective Transformation of Pixel Position

In the last section, four points of the perspective quadrilateral were found

based on the eye and nose locations and size. If these four points are known as well as

the size of the desired rectangle, it is easy to find a transformation matrix representing

the relationship between the two, as modeled in Figure 4.3 below.

Figure 4.3: Perspective Transformation of a Warped Quadrilateral

In this image, the coordinates (𝑥𝑖, 𝑦𝑖) represent the four corners of the warped

quadrilateral. In this application, these would be the four points found in the previous


section. The coordinates (x_i', y_i') represent the four corners of the spatially normalized image.

Finally, the relationship between the two is shown in Equation 4.12, an

equation obtained from OpenCV’s documentation on geometric transformations

[“Geometric Transformations”].

\begin{bmatrix} t_i x_i' \\ t_i y_i' \\ t_i \end{bmatrix} = M \cdot \begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix}   (4.12)

M is a 3x3 transformation matrix, and t_i is the scaling factor of the new rectangle. The new rectangle can be any size, but for this application, it was

chosen to be 300 pixels wide and 600 pixels high.

Now that the transformation matrix has been found, it is easy to find the

coordinate points on the new rectangle that correspond to coordinates on the warped quadrilateral. Any point on the rectangle, found by iterating across the width and height of the rectangle, is mapped with Equation 4.13.

dst(x, y) = src\left(\dfrac{M_{11}x + M_{12}y + M_{13}}{M_{31}x + M_{32}y + M_{33}},\; \dfrac{M_{21}x + M_{22}y + M_{23}}{M_{31}x + M_{32}y + M_{33}}\right)   (4.13)

In this equation, 𝑑𝑠𝑡 represents the new rectangle, and 𝑠𝑟𝑐 represents the

warped quadrilateral.
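The same mapping can be expressed compactly with OpenCV's built-in calls, as in the following sketch (illustrative names; the 300x600 size matches the rectangle chosen above, and cv::warpPerspective also performs the pixel interpolation described in the next section):

    #include <opencv2/opencv.hpp>

    cv::Mat flattenFace(const cv::Mat& gray, cv::Point2f TL, cv::Point2f TR,
                        cv::Point2f BR, cv::Point2f BL)
    {
        cv::Point2f srcQuad[4] = { TL, TR, BR, BL };
        cv::Point2f dstQuad[4] = { cv::Point2f(0, 0),     cv::Point2f(300, 0),
                                   cv::Point2f(300, 600), cv::Point2f(0, 600) };
        cv::Mat M = cv::getPerspectiveTransform(srcQuad, dstQuad);  // the 3x3 matrix of Eq. 4.12
        cv::Mat flattened;
        cv::warpPerspective(gray, flattened, M, cv::Size(300, 600), cv::INTER_CUBIC);
        return flattened;
    }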

Bicubic Pixel Interpolation

The section above covers calculation of the coordinates on the warped

quadrilateral but does not specify how to determine the pixel value. The actual value

of the pixel in the new rectangle is determined by bicubic pixel interpolation. For

instance, in Figure 4.4, points 𝑝(0,0), 𝑝(0,1), 𝑝(1,0), and 𝑝(1,1) on the image all

have known pixel values. The value at the new coordinate, the 𝑛𝑒𝑤 𝑝(𝑥, 𝑦), must be

determined using these four neighbors.


Figure 4.4: The Basic Pixel Interpolation Model

One of the easier and more commonly used interpolation algorithms is the

bilinear transformation which proportions the resulting value of the new pixel with the

relative distance between the four neighboring pixel points to the coordinate. Bicubic

transformation takes this a step further by attempting to recreate the surface between the four points. Bilinear interpolation just needs to know the positions of the four pixels and their values. Bicubic, however, needs to know:

The values of the pixels.

The partial derivatives of those values with respect to x.

The partial derivatives of those values with respect to y.

The x-y cross derivatives of those values.

It is also worth mentioning that while the four neighboring pixels are the most

important for the interior of the surface, where the new point will be, a full 16 points

surrounding the new coordinate will be necessary to calculate all of the information

needed (Figure 4.5).


Figure 4.5: Sixteen Neighbors Used for a Bicubic Interpolation

With that information, one can form a bicubic equation that outputs the pixel

value at the given coordinates, such as the one shown below in Equation 4.14

[Lancaster].

p(x, y) = \sum_{i=0}^{3} \sum_{j=0}^{3} a_{ij}\, x^i y^j   (4.14)

In order for this equation to work though, the coefficients a00 through a33 must be determined. To begin doing this, all four different pieces of information required

for each of the four points must be calculated in terms of the bicubic equation.

First, the values of the pixels are determined in Equations 4.15 – 4.18

[Lancaster].

w_0 = p(0,0) = a_{00}   (4.15)

w_1 = p(1,0) = a_{00} + a_{10} + a_{20} + a_{30}   (4.16)

w_2 = p(0,1) = a_{00} + a_{01} + a_{02} + a_{03}   (4.17)

w_3 = p(1,1) = a_{00} + a_{10} + a_{20} + a_{30} + a_{01} + a_{11} + a_{21} + a_{31} + a_{02} + a_{12} + a_{22} + a_{32} + a_{03} + a_{13} + a_{23} + a_{33}   (4.18)


Then the partial derivatives of the values with respect to x are determined in Equations 4.19 – 4.22 [Lancaster].

x_0 = \frac{\partial}{\partial x} p(0,0) = a_{10}   (4.19)

x_1 = \frac{\partial}{\partial x} p(1,0) = a_{10} + 2a_{20} + 3a_{30}   (4.20)

x_2 = \frac{\partial}{\partial x} p(0,1) = a_{10} + a_{11} + a_{12} + a_{13}   (4.21)

x_3 = \frac{\partial}{\partial x} p(1,1) = 1(a_{10} + a_{11} + a_{12} + a_{13}) + 2(a_{20} + a_{21} + a_{22} + a_{23}) + 3(a_{30} + a_{31} + a_{32} + a_{33})   (4.22)

Following that, the partial derivatives of the values with respect to y are shown in Equations 4.23 – 4.26 [Lancaster].

y_0 = \frac{\partial}{\partial y} p(0,0) = a_{01}   (4.23)

y_1 = \frac{\partial}{\partial y} p(1,0) = a_{01} + a_{11} + a_{21} + a_{31}   (4.24)

y_2 = \frac{\partial}{\partial y} p(0,1) = a_{01} + 2a_{02} + 3a_{03}   (4.25)

y_3 = \frac{\partial}{\partial y} p(1,1) = 1(a_{01} + a_{11} + a_{21} + a_{31}) + 2(a_{02} + a_{12} + a_{22} + a_{32}) + 3(a_{03} + a_{13} + a_{23} + a_{33})   (4.26)

Next, the x-y cross derivatives of the values at all four points are calculated as shown in Equations 4.27 – 4.30 [Lancaster].

z_0 = \frac{\partial^2}{\partial x \partial y} p(0,0) = a_{11}   (4.27)

z_1 = \frac{\partial^2}{\partial x \partial y} p(1,0) = a_{11} + 2a_{21} + 3a_{31}   (4.28)

z_2 = \frac{\partial^2}{\partial x \partial y} p(0,1) = a_{11} + 2a_{12} + 3a_{13}   (4.29)

z_3 = \frac{\partial^2}{\partial x \partial y} p(1,1) = 1a_{11} + 2a_{12} + 3a_{13} + 2a_{21} + 4a_{22} + 6a_{23} + 3a_{31} + 6a_{32} + 9a_{33}   (4.30)

Finally, now that all of those equations can be evaluated, linear algebra is used

to solve for all of the coefficients 𝑎00 through 𝑎33, as shown through Equations 4.31 –

4.46 [Lancaster].

a00 = w0    (4.31)

a01 = y0    (4.32)

a02 = −3w0 + 3w2 − 2y0 − y2    (4.33)

a03 = 2w0 − 2w2 + y0 + y2    (4.34)

a10 = x0    (4.35)

a11 = z0    (4.36)

a12 = −3x0 + 3x2 − 2z0 − z2    (4.37)

a13 = 2x0 − 2x2 + z0 + z2    (4.38)

a20 = −3w0 + 3w1 − 2x0 − x1    (4.39)

a21 = −3y0 + 3y1 − 2z0 − z1    (4.40)

a22 = 9w0 − 9w1 − 9w2 + 9w3 + 6x0 + 3x1 − 6x2 − 3x3 + 6y0 − 6y1 + 3y2 − 3y3 + 4z0 + 2z1 + 2z2 + z3    (4.41)

a23 = −6w0 + 6w1 + 6w2 − 6w3 − 4x0 − 2x1 + 4x2 + 2x3 − 3y0 + 3y1 − 3y2 + 3y3 − 2z0 − z1 − 2z2 − z3    (4.42)

a30 = 2w0 − 2w1 + x0 + x1    (4.43)

a31 = 2y0 − 2y1 + z0 + z1    (4.44)

a32 = −6w0 + 6w1 + 6w2 − 6w3 − 3x0 − 3x1 + 3x2 + 3x3 − 4y0 + 4y1 − 2y2 + 2y3 − 2z0 − 2z1 − z2 − z3    (4.45)

a33 = 4w0 − 4w1 − 4w2 + 4w3 + 2x0 + 2x1 − 2x2 − 2x3 + 2y0 − 2y1 + 2y2 − 2y3 + z0 + z1 + z2 + z3    (4.46)

Now that all of the coefficients have values, it is possible to use the bicubic

equation described in Equation 4.14 and determine the pixel values of the flattened

image using bicubic pixel interpolation.
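
As a concrete illustration of these steps, the short sketch below (plain C++, written for this discussion rather than taken from the thesis program; the function names are illustrative) computes the sixteen coefficients from the corner values w, the x-slopes x, the y-slopes y, and the cross derivatives z exactly as written in Equations 4.31 – 4.46, and then evaluates Equation 4.14 at a fractional coordinate.

#include <array>
#include <cmath>

// Coefficients of Equation 4.14, solved per Equations 4.31 - 4.46.
// w, x, y, z hold the pixel values, x-slopes, y-slopes, and cross derivatives
// at the four corners p(0,0), p(1,0), p(0,1), p(1,1).
std::array<double, 16> bicubicCoefficients(const double w[4], const double x[4],
                                           const double y[4], const double z[4])
{
    std::array<double, 16> a;                    // a[4*i + j] stores a_ij
    a[0]  = w[0];                                // a00
    a[1]  = y[0];                                // a01
    a[2]  = -3*w[0] + 3*w[2] - 2*y[0] - y[2];    // a02
    a[3]  =  2*w[0] - 2*w[2] +   y[0] + y[2];    // a03
    a[4]  = x[0];                                // a10
    a[5]  = z[0];                                // a11
    a[6]  = -3*x[0] + 3*x[2] - 2*z[0] - z[2];    // a12
    a[7]  =  2*x[0] - 2*x[2] +   z[0] + z[2];    // a13
    a[8]  = -3*w[0] + 3*w[1] - 2*x[0] - x[1];    // a20
    a[9]  = -3*y[0] + 3*y[1] - 2*z[0] - z[1];    // a21
    a[10] =  9*w[0] - 9*w[1] - 9*w[2] + 9*w[3] + 6*x[0] + 3*x[1] - 6*x[2] - 3*x[3]
           + 6*y[0] - 6*y[1] + 3*y[2] - 3*y[3] + 4*z[0] + 2*z[1] + 2*z[2] + z[3];    // a22
    a[11] = -6*w[0] + 6*w[1] + 6*w[2] - 6*w[3] - 4*x[0] - 2*x[1] + 4*x[2] + 2*x[3]
           - 3*y[0] + 3*y[1] - 3*y[2] + 3*y[3] - 2*z[0] - z[1] - 2*z[2] - z[3];      // a23
    a[12] =  2*w[0] - 2*w[1] + x[0] + x[1];      // a30
    a[13] =  2*y[0] - 2*y[1] + z[0] + z[1];      // a31
    a[14] = -6*w[0] + 6*w[1] + 6*w[2] - 6*w[3] - 3*x[0] - 3*x[1] + 3*x[2] + 3*x[3]
           - 4*y[0] + 4*y[1] - 2*y[2] + 2*y[3] - 2*z[0] - 2*z[1] - z[2] - z[3];      // a32
    a[15] =  4*w[0] - 4*w[1] - 4*w[2] + 4*w[3] + 2*x[0] + 2*x[1] - 2*x[2] - 2*x[3]
           + 2*y[0] - 2*y[1] + 2*y[2] - 2*y[3] + z[0] + z[1] + z[2] + z[3];          // a33
    return a;
}

// Evaluate Equation 4.14 at (xf, yf) with 0 <= xf, yf <= 1.
double bicubicValue(const std::array<double, 16>& a, double xf, double yf)
{
    double value = 0.0;
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            value += a[4*i + j] * std::pow(xf, i) * std::pow(yf, j);
    return value;
}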

Local Binary Patterns

Features of Local Binary Patterns

Following the perspective transformation and bicubic interpolation, the next

step is to implement the facial recognition. This software uses local binary patterns

(LBP). The local binary patterns are useful in this design because they do not

necessarily require a supervised sample training set of images in order to classify.

Eigenfaces and Fisherfaces, two other popular facial recognition methods, typically

require 9-10 valid samples of a face before they can recognize it in unknown images.

Unfortunately, due to the environment of a pig pen, our application does not have the

luxury of being able to take supervised images of the pigs in the pen every time the

software needs to be used. Instead, an unsupervised facial recognition technique is

used to circumvent this issue.

The image below in Figure 4.6 demonstrates how a local binary pattern is

formed.


Figure 4.6: A 3x3 Pixel LBP Example

The center pixel of this 3x3 pixel section of an image is tested against its

neighbors by value. The value of the center pixel becomes the threshold, and the

neighboring pixel is labeled with a 1 if its value is greater than or equal to the center pixel's
value. It is labeled as a 0 otherwise. The binary value formed by these 1s and 0s
represents the unique pattern, or feature, that the pixel creates with its neighbors.
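
A minimal sketch of this thresholding step, in plain C++ and not taken from the thesis source, is shown below for one interior pixel of a grayscale image stored row-major.

#include <cstdint>
#include <vector>

// Compute the basic 3x3 LBP code for the pixel at (row, col) of a grayscale
// image stored row-major in 'img' with the given width. Each of the eight
// neighbors contributes one bit: 1 if its value is >= the center value.
uint8_t lbpCode(const std::vector<uint8_t>& img, int width, int row, int col)
{
    const uint8_t center = img[row * width + col];
    // Neighbor offsets, walked clockwise starting at the top-left pixel.
    const int dr[8] = {-1, -1, -1,  0, 1, 1,  1,  0};
    const int dc[8] = {-1,  0,  1,  1, 1, 0, -1, -1};
    uint8_t code = 0;
    for (int k = 0; k < 8; k++) {
        const uint8_t neighbor = img[(row + dr[k]) * width + (col + dc[k])];
        if (neighbor >= center)
            code |= static_cast<uint8_t>(1u << k);
    }
    return code;
}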

This same concept can be applied in a more scalable form with Extended (or

Circular) LBP so that the neighborhood is variable by a radius around the center pixel,

rather than just the immediate neighboring pixels. The result can then designate a

variety of features, like those shown in Figure 4.7.

Figure 4.7: LBP Feature Examples [Wagner]

Local Binary Images

After each pixel has been assigned a binary value, a local binary image is

formed from all of these components. Then, if the local binary image is divided up
into the windows of an equally spaced grid, like the picture below in Figure 4.8, a
histogram of the binary values found in each window can be computed.


Figure 4.8: Local Binary Pattern Image and Histogram Concatenation

Finally, the histograms are combined by concatenating them all together, rather

than merging, to maintain all of the spatial information of the features. This histogram

is unique to the face, yet will be similar to other histograms of images taken of the

same pig. In order to compare any picture against a sample to determine similarity,

one calculates the local binary pattern histogram of the new image and compares it to

the sample one.
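
A sketch of that windowing and concatenation is given below (plain C++, illustrative only; the actual program relies on OpenCV's LBPH implementation to build these histograms).

#include <cstdint>
#include <vector>

// Build the concatenated LBP histogram of a local binary image that has been
// divided into gridX x gridY windows. Each window contributes its own 256-bin
// histogram; appending them preserves the spatial layout of the features.
std::vector<int> spatialHistogram(const std::vector<uint8_t>& lbpImage,
                                  int width, int height, int gridX, int gridY)
{
    const int cellW = width / gridX;
    const int cellH = height / gridY;
    std::vector<int> histogram;
    histogram.reserve(256 * gridX * gridY);

    for (int gy = 0; gy < gridY; gy++) {
        for (int gx = 0; gx < gridX; gx++) {
            std::vector<int> cellHist(256, 0);
            for (int r = gy * cellH; r < (gy + 1) * cellH; r++)
                for (int c = gx * cellW; c < (gx + 1) * cellW; c++)
                    cellHist[lbpImage[r * width + c]]++;
            // Concatenate rather than merge so spatial information survives.
            histogram.insert(histogram.end(), cellHist.begin(), cellHist.end());
        }
    }
    return histogram;
}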

Histogram Comparison

The comparison performed by OpenCV is implemented through a correlation

equation shown below, where 𝑁 is the total number of histogram bins (Equation 4.47).

d(H1, H2) = Σ_{I=0}^{N} (H1(I) − H̄1)(H2(I) − H̄2) / √[ Σ_{I=0}^{N} (H1(I) − H̄1)² · Σ_{I=0}^{N} (H2(I) − H̄2)² ]    (4.47)

The term H̄k in the equation, the mean value of histogram k, is described in Equation 4.48.

H̄k = (1/N) · Σ_{J=0}^{N} Hk(J)    (4.48)


The metric 𝑑(𝐻1, 𝐻2) measures the similarity between the two histograms.
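
In OpenCV 2.4, the version cited in the bibliography, this comparison is available directly through cv::compareHist with the correlation method, as in the brief sketch below (illustrative, not the thesis source).

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

// Compare two concatenated LBP histograms with the correlation metric of
// Equation 4.47. Both histograms must be float matrices of identical size.
// A result near 1.0 indicates very similar faces.
double histogramSimilarity(const cv::Mat& h1, const cv::Mat& h2)
{
    return cv::compareHist(h1, h2, CV_COMP_CORREL);
}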

Unsupervised Data Clustering

Finally, we use these LBP facial recognition techniques to cluster the images

by similarity. Before the unsupervised technique is explained, the methodology of

supervised data clustering must first be covered. If we had sample images of all of the

pigs in a pen, then we could use those images to cluster any unknown images by

assigning labels appropriately, given similarity of an unknown image’s local binary

image to the local binary images of the known samples. The process would follow the

outline below in Figure 4.9.

Figure 4.9: Supervised LBP Facial Recognition Flowchart

For any given image, one could compute the local binary image and assign it a

label from whichever sample image it most closely resembles in terms of the

histogram. With that label, one would also have a confidence value that is directly

correlated to the similarity of the images. If a confidence value is high, the recognizer

is more certain the two images represent the same pig. If the confidence value is low,

the chances are less likely.

The unsupervised mode follows a similar flow chart but does not have the initial
sample data at the beginning that the supervised technique does. Instead, it
dynamically adds to the sample training set whenever it finds an image that


falls below the confidence value of any image in the currently trained sample set, as

shown below in Figure 4.10.

Figure 4.10: Unsupervised LBP Facial Recognition Flowchart

The process begins with the very first image of a pig face. We already know

this is a pig face and know that it does not match any other faces yet, because it is in

fact the first face. Thus begins our sample set. Subsequent pictures that are put through

the recognizer follow one of two actions. They either match an image in the sample set

with a confidence value over the specified threshold, or they do not match any existing

image in the sample set because their associated confidence value is too low. This

threshold value then becomes the basis of the clustering technique. If a face is

matched, then we assign to the image the label of that cluster. If the face does not have

a match, we add it to the sample set with a new label and retrain the recognizer to

include that picture as a new face to be matched against, starting a new cluster. The

process repeats for any pig face to be labeled. An example of the program processing

this information can be found in Appendix E. In the end, all of the faces have been

gathered into clusters of similarity to be further processed.
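
A compressed sketch of this loop is shown below. It uses the OpenCV 2.4 LBPH recognizer interface (createLBPHFaceRecognizer, predict, update) and is illustrative pseudostructure rather than the thesis source; the distance threshold is a placeholder value. Note that the value returned by predict() in OpenCV's LBPH implementation is a distance, so smaller values indicate closer matches.

#include <opencv2/core/core.hpp>
#include <opencv2/contrib/contrib.hpp>   // FaceRecognizer (OpenCV 2.4.x)
#include <vector>

// Assign each normalized face image to a cluster without any prior training
// data. DIST_THRESHOLD is a placeholder tuning parameter.
std::vector<int> clusterFaces(const std::vector<cv::Mat>& faces, double DIST_THRESHOLD)
{
    cv::Ptr<cv::FaceRecognizer> model = cv::createLBPHFaceRecognizer();
    std::vector<int> assignedLabels(faces.size(), -1);
    int nextLabel = 0;

    for (size_t i = 0; i < faces.size(); i++) {
        int label = -1;
        double distance = 0.0;
        if (nextLabel > 0)
            model->predict(faces[i], label, distance);

        if (nextLabel > 0 && distance < DIST_THRESHOLD) {
            // Close enough to an existing sample: join that cluster.
            assignedLabels[i] = label;
        } else {
            // No acceptable match: start a new cluster and retrain on it.
            assignedLabels[i] = nextLabel;
            std::vector<cv::Mat> sample(1, faces[i]);
            std::vector<int> sampleLabel(1, nextLabel);
            if (nextLabel == 0)
                model->train(sample, sampleLabel);
            else
                model->update(sample, sampleLabel);
            nextLabel++;
        }
    }
    return assignedLabels;
}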


Facial Recognition Conclusion

The advantage of finding all of the facial features using the cascade classifiers

is that we can flatten the image out to a normalized coordinate system for facial

recognition. The program successfully uses perspective transformation and bicubic

pixel interpolation to create a spatially normalized image of the pig’s face. After this

normalization, the local binary patterns of the image are compared to other images and

grouped into clusters based on similarity. These clusters will be vital in determining

the weight of the pig represented by each cluster.


CHAPTER V

REGRESSION

Examination of Features

The Feature Vector Sets

In order to find a mathematical relationship between the features detected in

the picture and the overall weight of the pig, all of the features shown below in Figure

5.1 were analyzed and passed through multiple kinds of regression.

Figure 5.1: Pig Face Feature Vectors

In total, the program outputs 16 different features associated with the pig face

in the picture, all measured in pixels. The program has a special mode for training the


regressive data that allows the user to check each image to confirm that it was

classified correctly before outputting the data on the feature. Examples of this working

are shown in console screenshots in Appendices C and D. After that, all of the features

are stored in a file for regressive analysis.

The goal of the regression work is to find a mathematical relationship between

the 16 features and the known weight of the pig in the digital image (measured and

recorded the same day as the pictures being used). A weight vector representing the

coefficients of an equation formed by the features creates a mathematical formula for

calculating the weight of a pig.

Many different methods, not all of which will be discussed in full detail, were

run on the feature vectors output from a test run of approximately 200 different valid

images. It is unwise to use all of the feature vectors to calculate the weight vector,

since that leaves no means to evaluate the accuracy of the regression. Therefore, of

these 200 images and associated feature vectors, the data was divided into two sets:

70% for training and 30% for testing. Whenever a regression technique was run, the

best iteration of it was selected using the results of the testing set.

Methods Attempted

A variety of regression techniques were examined on the data including, but

not limited to, ridge, lasso, and elastic net regression. All three generally make the

assumption that the predictors are independent variables. While elastic net was a

related technique designed for several highly correlated variables, it still performed

poorly (discussed in the next section). Even lasso, generally a strong regularization

technique, failed to identify the most important predictors in the data set. All three of

these techniques were implemented successfully in MATLAB with synthetically

generated data but still failed to give desirable results with the real data from the pig

features.


Undesirable Results

All unsuccessful regression techniques tested exhibited one of two main results.
The first common trend in an unsuccessful fitting was a very flat line close to the x-

axis, such as the example shown below in Figure 5.2.

Figure 5.2: Example of Unsuccessful Regression with High Bias and Low Variance

When the data produces this type of result, it can be assumed that the

predictors chosen do not actually correlate with the final function value. More

specifically, a trend like this implies that the features used do not actually form a

linear combination to output the weight of the pig. To compensate for this, the script

finds the minimal error in the set by flattening out the line with low variance but a

very high bias.

The next unsuccessful case often seen is the overfitting of the training data. In

these cases, it appears that the training data has yielded a successful regression from


the marginal error between the known weight of the pig and the calculated weight,

such as the graph shown below in Figure 5.3.

Figure 5.3: Example of Overfitting the Training Set

While this sort of result may initially look good, there is danger in it.
The very nature of overfitting is that the training set is matched too closely, producing an
erroneous final equation because the fit compensates for noise in the training
data. If such a case occurs, then the training data will appear exceptionally

well-fitted, but the test data will suffer. For instance, the graph below in Figure 5.4 is

the output of the testing samples used by the equation formed with the training set in

Figure 5.3.


Figure 5.4: High Variance of a Testing Set after Overfitting the Training Set

As the graph indicates, the testing results then have extremely high variance,
producing unrealistic weight calculations.

Least Squares Method with Interdependent Predictors

Desirable Results

The most successful methodology found was a form of the least squares

regression method combined with making a few of the features strongly dependent on
each other. Before the specifics of this technique are covered, the results are shown
first so that it is clear what makes the technique desirable.

The goal of the training regression results is to form a calculated weight line

that trends with the actual weight line, without being influenced too much by the

noise of the data. A good example of this would be the result shown below in Figure

5.5.


Figure 5.5: Regression Results of Features for Training Set

It can clearly be seen that the data trends properly with the training set. More

importantly, when the same equation formed by the training set above is applied to the

testing data, the results there also trend (Figure 5.6).


Figure 5.6: Regression Results of Features for Testing Set

Averaging Pig Clusters

Obviously, the regression does not produce exact results; the training set error

suggests that there is still some variance. However, if the different weights calculated

for each pig are averaged together, the resulting averaged weights are reasonably close

to the actual weights. It is practical to use the average weight since the pictures have

already been clustered together during the facial recognition phase.

The accuracy of the averaged weights is shown in Figures 5.7 and 5.8 for the

training and testing sets, respectively.


Figure 5.7: Averaging Results of Training Set

Figure 5.8: Averaging Results of Testing Set


Predictor Creation

A typical application of least squares regression would use each of the
features, or predictors, only once at each polynomial degree. For instance,

one could develop an equation with 16 variables (features/predictors), 16

corresponding coefficients, and a constant. Another choice is one with 32 variables, 32

coefficients, and a constant, where the additional 16 variables and coefficients just

come from the previous 16 features squared. This trend could continue for equations

of multiple orders. Notice that in this methodology, features are completely

independent of each other, an assumption made by most regression techniques.

In the case of the final equation used, a similar concept is applied, but we will

use a set of new feature vectors formed by concatenating three vectors. The first vector

is simply the original feature vector, all predictors to the first power. The second

vector is the first set multiplied by all combinations of the first set. This means the

second set will contain single predictors to the second degree and products of all

combinations of the individual, original predictors. The third set repeats this operation

but on all combinations of multiplying the first set and the second set. That means

some new predictors will be original predictors to the third degree, some will be an

original predictor times another original predictor to the second degree, or some will

be three different predictors multiplied together.

More specifically, if we were going to look at the final equation for the original
feature set x̂ = {x1, x2, …, xn} and the weight vector, or coefficients,
ŵ = {w0, w1, w2, …, w(1+2n+2n²+n³)}, we use Equation 5.1.

f(x̂) = w0 + Σ_{1≤i≤n} wi·xi
          + Σ_{1≤i≤n, 1≤j≤n} w(1+n+n·i+j)·xi·xj
          + Σ_{1≤i≤n, 1≤j≤n, 1≤k≤n} w(1+n+n²+n²·i+n·j+k)·xi·xj·xk    (5.1)

Note that the weight estimate is simply the dot product of the coefficient vector with the
newly formed feature vector, plus the constant w0.
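
A short sketch of this expansion is shown below (plain C++, illustrative; it mirrors the structure of Equation 5.1 rather than reproducing the thesis code).

#include <vector>

// Expand an original feature vector x = {x1, ..., xn} into the interdependent
// predictor set of Equation 5.1: all first-order terms, all pairwise products,
// and all three-way products (which include squares and cubes).
std::vector<double> expandFeatures(const std::vector<double>& x)
{
    const size_t n = x.size();
    std::vector<double> expanded;
    expanded.reserve(n + n * n + n * n * n);

    for (size_t i = 0; i < n; i++)                      // x_i
        expanded.push_back(x[i]);
    for (size_t i = 0; i < n; i++)                      // x_i * x_j
        for (size_t j = 0; j < n; j++)
            expanded.push_back(x[i] * x[j]);
    for (size_t i = 0; i < n; i++)                      // x_i * x_j * x_k
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                expanded.push_back(x[i] * x[j] * x[k]);
    return expanded;
}

// The weight estimate is then w0 plus the dot product of the remaining
// coefficients with this expanded vector.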


Primary Features Used

Finally, now that the creation of the large, new feature vector is explained, it is

important to state which features were actually used in the original feature vector

before it is expanded.

As it turns out, the regression returned the least error with a reduced feature

set. The only features on the pig geometry being used are the difference between the

eye sizes, the average eye size, the Euclidean distance between the centers of the eyes,

and the coordinate position of the nose.

Least Squares Methodology

After showing the successful results and explaining which features were

actually used, the specifics on the regression form chosen will be explained.

Henceforth, x̂ will be reassigned to express the new set of predictors so that

x̂ = {x1, x2, …, xN},

where N represents the length of the new vector, determined by

N = 1 + 2n + 2n² + n³,

with n being the number of original features (difference in eye sizes, average eye size,
Euclidean distance, etc.).

The weight vector, or coefficients of the final equation, is still referred to as
ŵ = {w0, w1, w2, …, wN}. The least squares method is used to minimize the
cost function shown in Equation 5.2.

min_ŵ J(ŵ) = Σ_{i=1}^{M_train} (yi − ŵ·x̂iᵀ)²    (5.2)

The variable ŵ is the weight vector, x̂i is the feature vector of the ith image out
of M_train images in the training set, ŵ·x̂iᵀ is the estimate of the weight, and yi is the
actual weight of the pig in the ith image. This equation can be rewritten with the
variables

X = [x̂1; x̂2; …; x̂_{M_train}]

and with

Y = [y1; y2; …; y_{M_train}]

as the minimization function shown below in Equation 5.3.

min_ŵ J(ŵ) = ‖X·ŵ − Y‖²    (5.3)

To find the minimum of the cost function, the gradient is set equal to zero,
which gives the following function in Equation 5.4.

∇J(ŵ) = 2ŵᵀXᵀX − 2YᵀX = 0    (5.4)

Solving for ŵ, one finds the following solution in Equation 5.5:

ŵ = (XᵀX)⁻¹XᵀY    (5.5)

The term (XᵀX)⁻¹Xᵀ is known as the pseudoinverse of X.
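
A compact sketch of this computation with OpenCV is shown below (illustrative; the thesis performs the equivalent step in the MATLAB script of Appendix H using pinv). cv::solve with the DECOMP_SVD flag returns the same least squares solution without forming the pseudoinverse explicitly.

#include <opencv2/core/core.hpp>

// Solve the least squares problem of Equation 5.3 for the coefficient vector.
// X is an M_train x N matrix of expanded feature vectors (one row per image,
// first column all ones), and Y is an M_train x 1 vector of known weights.
cv::Mat leastSquaresWeights(const cv::Mat& X, const cv::Mat& Y)
{
    cv::Mat w;
    // Equivalent to w = pinv(X) * Y in MATLAB (Equation 5.5).
    cv::solve(X, Y, w, cv::DECOMP_SVD);
    return w;
}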

Then, weight estimates for the test set can be calculated and compared to the

actual weights by computing the test set error. Ideally, this error is representative of

the out-of-sample error, which demonstrates the general accuracy of the weight

equation in practice.

It was observed that variations in the accuracy of the calculated weights

existed, based on the various assignments of feature vectors to the training and testing

sets. Thus, the regression was performed multiple times, with the feature vectors

randomly reassigned to each set upon every iteration. The final weight vector used in

the program was the one associated with the sets of smallest test error, found using

Equation 5.6.

[ Σ_{i=1}^{M_test} (yi − ŵ·x̂i)² ] / M_test    (5.6)

M_test is simply the number of feature vectors used in the test set. As explained in the section

on overfitting, it is necessary to measure the success of the regression by the results of

the testing set, not the training set.


Regression Conclusion

As long as the face, eyes, and nose of the pig in a picture can be classified

correctly, then it is feasible to use those features to formulate a mathematical

prediction of that pig’s weight. As discussed in this chapter, the least squares

regression with the expanded interdependent feature vector will return the coefficients

to the necessary equation, as long as training data is provided. The training data for

every picture classified needs to include the average eye size, the difference in eye

size, the Euclidean two-dimensional distance between the eyes, the coordinate position

of the nose, and the pig’s known weight. With that data and the regression MATLAB

code (Appendix H), any fixed camera position can be trained to return an estimation of

a pig’s weight from a captured image.


CHAPTER VI

CLUSTER ADJUSTMENTS

As with any system, we may need to compensate for error. The system is built

to handle four cases of error. To begin, two forms of outlier detection are

implemented. One is used to discard pictures with misclassified features, while the

other is used to discard numerical data that could skew the distribution of estimated

weights in a cluster. There is also a limitation placed on the orientation of a pig face

for it to be considered for facial recognition. Finally, there is regrouping of clusters

that represent the same pig but have been split apart by the unsupervised clustering

method.

Outlier Detection

Cluster Minimization

The first step in getting rid of misclassified images is to require a minimum

number of pictures in a cluster for it to be valid. For instance, in the example below

shown in Figure 6.1, it can be seen that the eye on the left side of the picture has been

misclassified.


Figure 6.1: Example of Misclassification Error in the Pig Image

A shadow in the pig’s face has been identified as the pig’s eye. While normally

a misclassification of this kind could be detrimental to the overall predicted weight of

the pig, erroneous results are minimized by discarding any clusters that do not reach a
certain size. After running hundreds of iterations of the program with different

parameters, the final minimal cluster count chosen was five. This means that for this

image to be marked as part of a valid cluster and its weight estimation to be recorded,

the unsupervised facial clustering has to find four other images with similar local

binary patterns. The chance of generating a misclassification that produces the same

incorrect transformation of an image five times, to the extent the program identifies all

five images as the same face, is extremely slim. Indeed, in all of the tests run during

the course of this project, such a case was never seen.

As can be seen in the console log of the program below in Figure 6.2, smaller

clusters have just been omitted from the output of the program.


Figure 6.2: Example of Cluster 6 Being Discarded Due to Insufficient Size

Nose Angle Limitation

There are many cases where pictures of the same pig, normally grouped

together in one cluster, will fracture into separate clusters. While a special regrouping

method is implemented and discussed later in the chapter, there is one limitation

placed on valid images to prevent this from happening.

After all of the facial features are detected, the difference in the angle of the

eyes and the angle of the nose to the bisector between the eyes is checked. Normally, a

perpendicular angle, such as the one in Figure 6.3 below, is desired to ensure proper

facial recognition.


Figure 6.3: Picture of Pig Face Exhibiting Desirable Eye and Nose Angles

A face at such an angle is facing the camera and squared away enough that the

transformed image is easily computed by the unsupervised local binary patterns and

grouped with images from the same pig.

However, if the angle of the eyes and the nose are too far away from this

perpendicular relationship, then some features of the pig face will be lost because of

its orientation and rotation. Figure 6.4 is an example of such an image.


Figure 6.4: Picture of Pig Exhibiting Undesirable Eye and Nose Angles

The dark space next to the pig on the left that is the background can

erroneously be interpreted as part of the pig’s face, leading to this picture being

classified as an entirely different pig than the other pictures it should be clustered with.

The solution to this problem is as simple as requiring that only images with
angle differences below a certain threshold be analyzed and included in a pig's weight calculation.

The threshold was chosen to filter images as indicated by Equation 6.1.

photo is "valid" if |θ_eye − θ_nose| < 0.04 radians, and "invalid" otherwise    (6.1)

Grubb’s T-test

The second threat to the accuracy of the weight estimation is the possibility of

statistical outliers. The greatest chance of causing this would be the misidentification

of one pig as another. For instance, a case of this occurring is shown in a very early

run of the program, as seen below in Figure 6.5.


Figure 6.5: Example of Pig Misidentification

In this example, the known IDs of the pigs (e.g., "14", "15", "17") are being
displayed underneath the cluster name to show whether or not a cluster was filled with
images of the same pig. Here, Cluster 8 has correctly grouped four
pictures of Pig 17 but failed by also including Pig 15. If the weights of these
two pigs differ greatly, then this one outlier could consequently also skew the total

calculated weight for Pig 17.

In order to detect and discard the outliers, Grubbs' T-test is implemented in
the program. This outlier test was chosen because it is designed for cases of multiple
possible outliers, rather than just one, since there is no way of knowing the number of

misidentified data points that may exist.

To begin, the mean of the clustered data set is calculated using Equation 6.2.

x̄ = (1/N) · Σ_{i=1}^{N} xi    (6.2)

Next, the standard deviation of the set is calculated (Equation 6.3).

s = √[ Σ_{i=1}^{N} (xi − x̄)² / (N − 1) ]    (6.3)

After those two metrics are computed, we must be able to select the data point

that is farthest from the mean and is most likely to be an outlier. This data point is

going to be the one that satisfies the following criterion, having the maximum distance

from the mean (Equation 6.4).


max_i |xi − x̄|    (6.4)

Next, the T metric is computed using Equation 6.5.

T = |xi − x̄| / s    (6.5)

This metric is used to determine if the data point is an outlier or not. The value

obtained is compared to a table of Tcrit values, like the one shown below in Table 6.1.

Table 6.1: Tcrit Values for Grubbs’ T-test for Outliers [“Outlier”]

Data Points Risk of false rejection (%)

N 0.1 0.5 1 5 10

3 1.155 1.155 1.155 1.153 1.148

4 1.496 1.496 1.492 1.463 1.425

5 1.780 1.764 1.749 1.672 1.602

6 2.011 1.973 1.944 1.822 1.729

7 2.201 2.139 2.097 1.938 1.828

8 2.358 2.274 2.221 2.032 1.909

9 2.492 2.387 2.323 2.110 1.977

10 2.606 2.482 2.410 2.176 2.036

15 2.997 2.806 2.705 2.409 2.247

20 3.230 3.001 2.884 2.557 2.385

25 3.389 3.135 3.009 2.663 2.486

50 3.789 3.483 3.336 2.956 2.768

100 4.084 3.754 3.600 3.207 3.017

If the T value is greater than or equal to Tcrit, it is an outlier and can be

discarded from the data set. If the T value is less than the Tcrit value, then it is not an

outlier and can be kept as part of the data set.
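
A small sketch of one pass of this test is given below (plain C++, illustrative; the Tcrit value must be looked up in Table 6.1 for the cluster size and the chosen risk level).

#include <cmath>
#include <vector>

// Perform one pass of Grubbs' T-test on a cluster of estimated weights.
// Returns the index of the point farthest from the mean if its T statistic
// reaches tCrit (an outlier to discard), or -1 if no outlier is found.
// tCrit comes from Table 6.1 for the given N and risk of false rejection.
int grubbsOutlierIndex(const std::vector<double>& weights, double tCrit)
{
    const int n = static_cast<int>(weights.size());
    if (n < 3)
        return -1;                      // the test needs at least three points

    double mean = 0.0;
    for (double w : weights) mean += w;
    mean /= n;                          // Equation 6.2

    double variance = 0.0;
    for (double w : weights) variance += (w - mean) * (w - mean);
    const double s = std::sqrt(variance / (n - 1));     // Equation 6.3
    if (s == 0.0)
        return -1;                      // identical values, nothing to reject

    int farthest = 0;
    for (int i = 1; i < n; i++)         // Equation 6.4
        if (std::fabs(weights[i] - mean) > std::fabs(weights[farthest] - mean))
            farthest = i;

    const double T = std::fabs(weights[farthest] - mean) / s;   // Equation 6.5
    return (T >= tCrit) ? farthest : -1;
}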

Finally, it is worth noting that this case of an outlier is rather rare. The

parameters of the facial recognition were tuned by running hundreds of


iterations of the program and set to favor smaller clusters of correctly identified pigs

rather than large clusters with misidentified pigs. Due to this bias, it is generally

unlikely that a data point will be misidentified and thus discarded as an outlier.

Nevertheless, it is still good practice to have the assurance of reliability built in for

unforeseen anomalies.

Cluster Regrouping

Algorithm for Regrouping Fractured Clusters

Smaller groups of more accurately identified pigs are favored over large

clusters with misidentified pigs. It is thus possible that during the process of the

clusters being formed, some of the clusters fractured and split pictures of the same pig

into separate groups. An example of this can be seen in Figure 6.6 below.

Figure 6.6: Example of Cluster Fracturing With Unintentionally Split Clusters

Highlighted

It can be seen above by the spray-painted markings on the pigs that there are

seven pigs output, when there are only five in the pen. Two of the pigs have been

reported twice. Here we come to a trade-off. If the standards are lower for the facial

recognition, then there will be many misidentified pig faces. If the standards are too


strict, then the clusters can fracture into smaller groups of the same pigs. The

compromise is to do both, but in separate steps.

The full flow chart is depicted below in Figure 6.7 and will follow a process

that is described through the rest of this section.

Figure 6.7: Cluster Regrouping Flowchart

To begin, we operate the unsupervised clustering based on local binary

patterns with the strictest of parameters. This yields many different clusters but with

very low probability of any cluster containing a misidentified pig. As discussed in the

section on unsupervised clustering, any time the program comes across a face that is

not previously trained or recognized, the program creates a new label entirely. At this

point, we may have as many as three to four times the number of clusters as we

actually have pigs.

The very next step is the same process as discussed on thinning the data of

outliers. The smallest clusters, of four or fewer images, are deleted and discarded as


insufficient data. This leaves a small number of clusters but still most likely more

clusters than there are pigs.

Now, given that the supervised facial recognition stage begins with N clusters, the
supervised facial recognition iterates N times. For every cluster that exists, the label
and associated sample image are removed from the training set. The images that were
originally labeled as that cluster are retested. However, this time, a label other than the
original one will be forced on each image by the next closest fit. If 75% or more of the
images in the cluster can be relabeled with the label of another single cluster, it is
considered safe to assume that these two different clusters were actually the same pig.

All of the images in the cluster being retested are labeled as the new cluster, while the

previously existing cluster label and sample image are deleted. As this process repeats

for every cluster that exists, the number of total clusters decreases, but we keep the

same number of total pictures. Different pictures are just regrouped together into

larger clusters of the same face.
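
The sketch below gives the shape of the merge decision for one retested cluster (plain C++, illustrative; the second-choice labels are assumed to come from re-running the supervised recognizer with the cluster's own sample removed, and the function is not part of the thesis source).

#include <algorithm>
#include <cmath>
#include <map>
#include <vector>

// Decide whether a fractured cluster should be merged into another cluster.
// 'secondChoiceLabels' holds, for every image in the cluster being retested,
// the label the recognizer assigns once the cluster's own sample is removed.
// The merge is accepted only if at least 75% of the images agree on a single
// other cluster and the two clusters' average weights differ by under 10%.
// Returns the target label, or -1 to keep the cluster as it is.
int mergeTarget(const std::vector<int>& secondChoiceLabels,
                double clusterWeight, const std::map<int, double>& clusterWeights)
{
    std::map<int, int> votes;
    for (int label : secondChoiceLabels)
        votes[label]++;

    int bestLabel = -1, bestCount = 0;
    for (const auto& v : votes)
        if (v.second > bestCount) { bestLabel = v.first; bestCount = v.second; }

    const double agreement = secondChoiceLabels.empty()
        ? 0.0
        : static_cast<double>(bestCount) / secondChoiceLabels.size();
    if (agreement < 0.75)
        return -1;                                   // not enough consensus

    const auto it = clusterWeights.find(bestLabel);
    if (it == clusterWeights.end())
        return -1;
    const double diff = std::fabs(clusterWeight - it->second) /
                        std::max(clusterWeight, it->second);
    return (diff <= 0.10) ? bestLabel : -1;          // weights must agree within 10%
}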

To show some of this technique in action, take a look at the console log below

in Figure 6.8.

Figure 6.8: Console Log of Clusters Being Combined Using the Regrouping Method

of Supervised Facial Recognition


The first label applied to Cluster 1 was, of course, that it belonged to Cluster 1.

When the Cluster 1 label and sample image were omitted from the training set, though,

notice that 90% of the data set could be relabeled as Cluster 9 (as designated by

“…2nd: 9…”). Since the mode of the cluster being retested is 9 by 90% of the data,

Cluster 1 meets the above 75% minimum and can be combined with Cluster 9 as it is

now considered safe to assume these are the same face.

In order to show both sides, take a look at a case where the cluster is not

moved, such as shown below in the console log of Figure 6.9.

Figure 6.9: Console Log of Cluster Being Rejected for Combination Using the

Regrouping Method of Supervised Facial Recognition

In this case, Cluster 4 was not combined into any new cluster but was instead

kept as a separate entity. The reason for this is that while the mode of the newly

labeled pictures was for Cluster 13, this only applied to 58.333% of the images, not

meeting the 75% requirement. Because of this, Cluster 4 is kept as Cluster 4 was

originally.


The last thing worth mentioning on the regrouping is a final countermeasure

built in to prevent unintended grouping of data. In addition to the new labels having to be

over 75% in agreement, the weights of the two clusters being combined must also be

within 10% of each other to be a valid combination. While there were no examples of

this being vital in the test runs, it was implemented anyway just to ensure another level

of protection against unintentional combining.

Final Cluster Regrouping Results

After this algorithm was implemented following the facial recognition, the

output of the program with the testing set looks like the window below in Figure 6.10.

Figure 6.10: Cluster Regrouping Final Results

It is relatively easy to see that the program differentiated between five different

pigs (proven by the spray paint markings on their heads). The error for the weight

calculation is remarkably low. The largest difference in the calculated weight of a pig

and the recorded weight is on the fourth pig (bottom-left). With a calculated weight of

207 pounds and a recorded weight of 212 pounds, the percentage error is just 2.358%,

as indicated by Equation 6.6.


ρ4 = |212 − 207| / 212 × 100% = 2.358%    (6.6)

A screenshot of the resulting console output can be found in Appendix F.

Cluster Adjustments Conclusion

No system is perfect, but the measures in this chapter have been implemented

to ensure that the system can at least handle some degree of expected error. The

program can now account for errant data values through Grubbs’ T-test, irregular pig

orientation through angle limitation, misclassifications by discarding small clusters,

and clusters of the same pig fracturing. All of this error protection is vital in delivering

consistent, valid estimations of the pigs’ weights.


CHAPTER VII

CONCLUSION

Accomplishments

The purpose of this project was to construct software that can analyze the faces

of pigs in digital images taken from a pig pen, ultimately to predict the weight. The

work done on this project met all design goals.

The image processing classification is robust enough to detect in a matter of

milliseconds whether a picture contains a pig's face and whether all features on the pig,
such as the nose and both eyes, can be detected. Pig movement does not affect the outcome of

the program’s calculation. If only one eye is visible or if one is closed, the picture is

rejected, assuming that a valid one can be taken eventually.

The program represents the first time unsupervised facial recognition

technology has been implemented on pigs. While other facial recognition research has

been done on the animal, this is the first program to operate on untrained data.

The program even implements multiple algorithms to handle its own errors.

Statistical outliers are discarded via Grubbs' T-test, and misclassified pictures are

quickly weeded out of the valid images. The program is even designed to double

check its facial recognition, combining clusters that might have fractured during the

process and would otherwise have falsely marked one pig as two clusters.

Future Work

The project as it stands now is designed to run at the end of a 24-hour day.

While the weight of the pigs that drank that day is accurately recorded, there will still

need to be further software development to track the clusters and weights of the pigs

over time for the information to be useful.

In addition, if this prototype were to be commercialized, it is highly advised that

the data for the regression be retrained, this time being very careful to measure both


the angle of the camera and its x-y distance from the water spigot end. The desired

metrics are shown below in Figure 7.1.

Figure 7.1 Necessary Standard Metrics for Camera Set-Up

Originally, multi-angle compensation was thought to be needed in
this project. Early project designs were prepared to account for the camera being at
different angles and distances from the spigot. However, due to the lack of granularity
in the pixel data and the features ultimately used in the weight formula, this is
no longer considered necessary.

For instance, the only metrics being used in calculating the weight of the pigs

are the average eye size, the difference in eye size, the 2D Euclidean distance between

the eyes, and the coordinate position of the nose. Also, the calculated weight of the pig

before averaging the cluster can be off by as much as 10 pounds. A single picture of a

pig and the calculation performed on that picture will hardly ever get the pig’s weight

exactly. It will merely get close. In fact, it is the averaged weight across a cluster of

images of the pigs at multiple angles and positions that can get the more accurate

weight.


In the end, because there will have to be many pictures taken and each

individual image only places the weight in a loose figure, the problem of

compensating for multiple camera angles and positions is not as important as

originally supposed.

The one item that is important is keeping the camera roughly in the same

position as the data it was trained with. In order for the nose coordinate position to

have any meaning, the position and angle of the camera need to be closely similar in

every set-up of every pen it is in. While this may seem like a strict and hard to achieve

design goal, the set-up can be optimized simply with a bubble level and some string.

The camera box, if commercialized, would be supplied with a triangle of three strings

and three washers, with the sides corresponding to the distance of the camera to the

water spigot (Figure 7.2). The camera, set at its trained angle in the box, could then be

outfitted with a water level, to guarantee it is roughly set at the same angle every

time.

Figure 7.2: Devices Used in Setting Up the Standard Metrics of the Camera System


Closing Remarks

As with any engineering product, the system will definitely need to go through

the usual process of refining, testing, and debugging. The software should be

expanded to include cluster tracking so that the pigs can be monitored over time. A lot

more data on the pigs will need to be collected, over longer periods and in greater
variety.

Nevertheless, the project as a whole serves as a valid proof of concept in that a

system can indeed be successfully made to calculate the weight of a pig from a mere

two-dimensional image.


BIBLIOGRAPHY

"Geometric Transformations." OpenCV 2.4.11.0 Documentation. OpenCV Dev Team, 25 Feb. 2015. Web. 29 May 2015. <http://docs.opencv.org/modules/imgproc/doc/geometric_transformations.html>.

"Histogram Comparison." OpenCV 2.4.11.0 Documentation. OpenCV Dev Team, 25 Feb. 2015. Web. 29 May 2015. <http://docs.opencv.org/doc/tutorials/imgproc/histograms/histogram_comparison/histogram_comparison.html>.

"Histogram Equalization." OpenCV 2.4.11.0 Documentation. OpenCV Dev Team, 25 Feb. 2015. Web. 29 May 2015. <http://docs.opencv.org/doc/tutorials/imgproc/histograms/histogram_equalization/histogram_equalization.html>.

Lancaster, Don. "A Review of Some Pixel Image Interpolation Algorithms." GuruGram, 2007. Web. <http://www.tinaja.com/glib/pixintpl.pdf>.

"Outlier Handout." Statistical Treatment of Analytical Data (2004). Web. <http://education.mrsec.wisc.edu/research/topic_guides/outlier_handout.pdf>.

Viola, P., and M. Jones. "Rapid Object Detection Using a Boosted Cascade of Simple Features." Computer Vision and Pattern Recognition, 2001.

Freund, Y., and R. E. Schapire. "A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting." J. Comput. Syst. Sci. 55 (1) (1997): 119–139.

Wagner, Philipp. "Local Binary Patterns." bytefish.de, 8 Nov. 2011. Web. 30 Apr. 2015. <http://www.bytefish.de/blog/local_binary_patterns/>.

Watkins, Christopher, Alberto Sadun, and Stephen Marenka. Modern Image Processing: Warping, Morphing, and Classical Techniques. Boston: Academic, 1993. Print.


APPENDIX A

CONSOLE LOG DURING FEATURE DETECTION


APPENDIX B

EXAMPLES OF PIGS CLASSIFIED USING PROGRAM





APPENDIX C

CONSOLE LOG DURING TRAINING MODE


APPENDIX D

CONSOLE LOG AT END OF TRAINING MODE


APPENDIX E

CONSOLE LOG DURING FACE RECOGNIZER MODE


APPENDIX F

CONSOLE LOG AFTER FACE RECOGNIZER MODE



APPENDIX G

HOW-TO GUIDE ON RUNNING PIG ESTIMATION PROGRAMS


INTRODUCTION

The intent of this document is to loosely guide the user on the four different modes of

the program I made, as well as how to run the MATLAB script for training the data. It

does not include information on the source code itself, as the source is well-

documented and extensively commented to hopefully answer any questions one might

have.

Note that all of the programs require the provided DLL files from OpenCV, the three

XML classifier files, a folder of images, CSV files designating those images (covered

later), and the CSV file of the regression coefficients (generation explained later).


TRAINING MODE

The Training Mode is used to generate the feature vectors that train the regression

data. It can be run by either running (double-clicking) training_mode.exe from within

the folder of files or setting the value of PROGRAM_MODE in Visual Studio to 1:

In order for this program to run properly, make sure that the image files to be trained

are specified in a comma-separated values (CSV) file named supervised_pigs.csv.

The first value of each row is the filename and path (needs to be in same folder as

executable), the second is the known ID of the pig (an integer), and the third is the
known weight of the pig (also an integer). An example of this file is shown in Notepad:

and in Excel.
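
For reference, a file following that layout would contain lines like the ones below; the filenames, IDs, and weights here are made-up placeholders rather than data from the project.

pig_images/pen1_0001.jpg,14,212
pig_images/pen1_0002.jpg,14,212
pig_images/pen1_0003.jpg,15,187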

As each valid pig shows up, you will be given the option to press “N” on the keyboard

to mark misclassified pigs:

or hit any other key to mark correctly classified pigs:


After every pig has been sorted through, a CSV file of the feature vectors for only the

correctly classified pigs is created. It is titled feat_vect.csv and is shown below:

This data will be used to train the regression coefficients.


GENERATING COEFFICIENTS

In order to predict the weight of a pig in a pen, data from pigs in that pen must be

trained via the regression code provided in regression.m.

Before running the script in MATLAB, make sure that the file feat_vect.csv is in the

same folder as the script. If it is, and you have a correctly generated feature vector file,

then you can run the MATLAB script.

Running this script will not only output the associated graphs generated in MATLAB:

but most importantly the coefficients file coefficients.csv:


This file can be placed in the same folder as cluster_mode_supervised.exe,

cluster_mode_unsupervised.exe, sample_mode.exe, and training_mode.exe to

modify the regression coefficients the programs run on. Better data can be trained this

way.


CLUSTER MODE (SUPERVISED)

The Cluster Mode (Supervised) is used to cluster and predict the weights of multiple

pigs in a pen and output this estimate alongside known data on the pigs. It can be run

by either running (double-clicking) cluster_mode_supervised.exe from within the

folder of files or setting the value of PROGRAM_MODE in Visual Studio to 2:

In order for this program to run properly, make sure that the image files to be trained

are specified in a comma-separated values (CSV) file named supervised_pigs.csv.

The first value of each row is the filename and path (needs to be in same folder as

executable), the second is the known ID of the pig (an integer), and the third is the
known weight of the pig (also an integer). An example is shown in Notepad:

and in Excel.

The results of the run will be output onto a dynamic screen showing each pig and its

weight.


CLUSTER MODE (UNSUPERVISED)

The Cluster Mode (Unsupervised) is used to cluster and predict the weights of

multiple pigs in a pen without known data on the pigs. It can be run by either running

(double-clicking) cluster_mode_unsupervised.exe from within the folder of files or

setting the value of PROGRAM_MODE in Visual Studio to 3:

In order for this program to run properly, make sure the images being tested are

specified in a comma-separated values (CSV) file named unsupervised_pigs.csv.

Every line is just the filename and path of the image (needs to be in same folder as

executable). An example of this file is shown in Notepad:

and in Excel.

The results of the run will be output onto a dynamic screen showing each pig and its

weight.


SAMPLE MODE (PREDICTING FROM A SINGLE IMAGE)

This mode serves as an example for operation of the program when you want to see

the weight of just a single pig. This mode only works given that the Clustered Mode

has already been run on the batch of pigs, preferably for a full day in order to cluster a

batch of pigs’ faces properly.

If the program has already been run in Cluster Mode, then there should be two

generated files: cluster_data.csv and face_rec_model.xml. Make sure both of these

are available in the folder.

Then, provide a text file called single_pig.txt. The only line that should be listed in

this file is the name and path of the image being tested, such as the one shown in Notepad:

After that, the program can be executed by either running (double-clicking)

sample_mode.exe from within the folder of files or setting the value of

PROGRAM_MODE in Visual Studio to 0:

The output of the program will show the picture of the pig and designate its estimated

weight.


APPENDIX H

MATLAB CODE FOR LEAST SQUARES REGRESSION

close all; clear all; clc;

%% -- Import Data and Create Predictor Vector -------------------------------
% Read the data from the .csv file created by the Face Training program
data = csvread('feat_vect.csv');
% All columns except the last are features, the last is its weight
X = data(:,1:5);
Y = data(:,6);

% Expands the predictors by multiplying the original predictors times
% themselves twice
m = size(X,1);
X2 = reshape(bsxfun(@times,reshape(X,m,1,[]),X),m,[]);
X = [X,X2,reshape(bsxfun(@times,reshape(X,m,1,[]),X2),m,[])];
[N,L] = size(X);
X = [ones(N,1),X];
L = L + 1;

%% -- Pre-Loop Definitions ---------------------------------------------------
% Defines how many iterations to run varying random sets of training and
% testing data
k_max = 3000;
% Define a vector that will store the testing error for each random seed
E_test = zeros(1,k_max);
% Define the weight vector
W = zeros(L,k_max);

% loop through values of k
for k = 1:k_max
    %% -- Creation of Random Training/Testing Sets ---------------------------
    % Declare the sizes of the training and testing sets
    p_test = 0.3;
    n_test = round(N*p_test);
    n_train = N-n_test;

    % Create the vectors for the training and testing sets based on size
    X_train = X;
    X_test = zeros(n_test,L);
    Y_train = Y;
    Y_test = zeros(n_test,1);

    % Set the random seed based on the value of k
    rng(k-1);

    % Divy up the training/samples sets
    for i = 1:n_test
        j = round(rand()*(N-i))+1;
        X_test(i,:) = X_train(j,:);
        X_train = [X_train(1:j-1,:); X_train(j+1:N-i+1,:)];
        Y_test(i) = Y_train(j);
        Y_train = [Y_train(1:j-1); Y_train(j+1:N-i+1)];
    end

    %% -- Solving for Weight Vector -------------------------------------------
    % Solve for the weight vector based on pseudo-inverse and training set
    W(:,k) = (pinv(X_train)*Y_train);

    % Report the corresponding error value of the weight vector created
    for i = 1:n_test
        E_test(k) = E_test(k) + (Y_test(i)-dot(W(1:L,k),X_test(i,:)))^2;
    end
    E_test(k) = E_test(k)/n_test;
end

%% -- Evaluation of Random Samples --------------------------------------------
% Find the index of the set with the lowest error
ind_j = find(E_test == min(E_test(:)));
% Select the weight vector of that set
W_final = W(:,ind_j);

% Recreate the training and testing sets based on the random seed used
rng(ind_j-1);
X_train = X;
X_test = zeros(n_test,L);
Y_train = Y;
Y_test = zeros(n_test,1);
for i = 1:n_test
    j = round(rand()*(N-i))+1;
    X_test(i,:) = X_train(j,:);
    X_train = [X_train(1:j-1,:); X_train(j+1:N-i+1,:)];
    Y_test(i) = Y_train(j);
    Y_train = [Y_train(1:j-1); Y_train(j+1:N-i+1)];
end

%% -- Find Training Sample Error ------------------------------------------------
Y_hat_trn = zeros(n_train,1);
e_train = zeros(n_train,1);

for i = 1:n_train
    Y_hat_trn(i) = dot(W_final,X_train(i,:));
    e_train(i) = (Y_train(i)-Y_hat_trn(i))^2;
end

e_train = sum(e_train)/n_train;

%% -- Find the Testing Sample Error ----------------------------------------------
Y_hat_tst = zeros(n_test,1);
e_test = zeros(n_test,1);

for i = 1:n_test
    Y_hat_tst(i) = dot(W_final,X_test(i,:));
    e_test(i) = (Y_test(i)-Y_hat_tst(i))^2;
end

e_test = sum(e_test)/n_test;

%% -- Sort the Data in Ascending Order -------------------------------------------
flag = 1;

while flag
    flag = 0;
    for i = 1:n_test-1
        if Y_test(i+1) < Y_test(i)
            temp = Y_test(i+1);
            Y_test(i+1) = Y_test(i);
            Y_test(i) = temp;
            temp = Y_hat_tst(i+1);
            Y_hat_tst(i+1) = Y_hat_tst(i);
            Y_hat_tst(i) = temp;
            flag = 1;
        end
    end
end

flag = 1;
temp = 0;

while flag
    flag = 0;
    for i = 1:n_train-1
        if Y_train(i+1) < Y_train(i)
            temp = Y_train(i+1);
            Y_train(i+1) = Y_train(i);
            Y_train(i) = temp;
            temp = Y_hat_trn(i+1);
            Y_hat_trn(i+1) = Y_hat_trn(i);
            Y_hat_trn(i) = temp;
            flag = 1;
        end
    end
end

%% -- Plot Results of All Samples ------------------------------------------------
% Plot the training sample of predicted value versus actual
figure
plot(1:n_train,Y_train,1:n_train,Y_hat_trn)
legend('Actual Weight','Calculated Weight','Location','southeast')
title('Training Sample Regression Results')
xlabel('Sample Number')
ylabel('Weight')
grid on

% Plot the testing sample of predicted value versus actual
figure
plot(1:n_test,Y_test,1:n_test,Y_hat_tst)
legend('Actual Weight','Calculated Weight','Location','southeast')
title('Test Sample Regression Results')
xlabel('Sample Number')
ylabel('Weight')
grid on

%% -- Average Calculated Weight for Given Weight ----------------------------------
% Calculate the average values of calculated weight for given weight on training
i = 1;
avg_trn = [];
while i < n_train
    curr_w = Y_train(i);
    ind = find(Y_train == curr_w);
    avg_trn = [avg_trn; mean(Y_hat_trn(ind)), curr_w];
    i = max(ind)+1;
end

% Calculate the average values of calculated weight for given weight on testing
i = 1;
avg_tst = [];
while i < n_test
    curr_w = Y_test(i);
    ind = find(Y_test == curr_w);
    avg_tst = [avg_tst; mean(Y_hat_tst(ind)), curr_w];
    i = max(ind)+1;
end

% Plot the results for the training set
figure
plot(1:length(avg_trn),avg_trn(:,1),'b*',1:length(avg_trn),avg_trn(:,2),'r*')
legend('Average Calculated Weight','Actual Weight','Location','southeast')
title('Training Set')
xlabel('Sample Number')
ylabel('Weight')
set(gca,'XTick', 1:1:length(avg_trn));
grid on

% Plot the results for the testing set
figure
plot(1:length(avg_tst),avg_tst(:,1),'b*',1:length(avg_tst),avg_tst(:,2),'r*')
legend('Average Calculated Weight','Actual Weight','Location','southeast')
title('Testing Set')
xlabel('Sample Number')
ylabel('Weight')
set(gca,'XTick', 1:1:length(avg_trn));
grid on

%% -- Write the Coefficients CSV File ----------------------------------------------
csvwrite('coefficients.csv',W_final');


APPENDIX I

OPENCV C++ SOURCE CODE

PigWeight.cpp

#include <opencv2/core/core.hpp> #include <opencv2/objdetect/objdetect.hpp> #include <opencv2/highgui/highgui.hpp> #include <opencv2/imgproc/imgproc.hpp> #include <iostream> #include <stdio.h> #include <stdlib.h> #include <string> #include <fstream> #include "PigClassifier.h" #include "PigFace.h" using namespace std; using namespace cv; /* PROGRAM MODES: 0 -> Sample Mode (predicts a weight for a single picture) UNIMPLEMENTED 1 -> Training Mode (outputs feature vectors) 2 -> Cluster Mode - Supervised (clusters faces into groups with predicted weights and shows known data alongside it) 3 -> Cluster Mode - Unsupervised (clusters faces into groups with predicted weights)*/ #define PROGRAM_MODE 2 /* RECOGNIZER MODES: 0 -> Execution Mode (the values I've found to yield the smallest error) 1 -> Training Mode (outputs mega CSV file for results of all different values 2 -> Prompt Mode (asks for values manually) */ #define RECOGNIZER_MODE 0 // Global Variables for Window Names char* image_window = "Source Image"; char* trans_window = "Transformed Image"; char* final_faces = "Final detected faces"; /** @function main */ int main( int argc, char** argv ) { // Set the random seed srand(0); // Declare input file and associated vectors ifstream in_file; vector<string> files; vector<int> weights;


vector<int> piggies; // Open the appropriate file depending on the program mode string line; if(PROGRAM_MODE == 1 || PROGRAM_MODE == 2) { in_file.open("supervised_pigs.csv"); } else if (PROGRAM_MODE == 3) { in_file.open("unsupervised_pigs.csv"); } else { in_file.open("single_pig.txt"); } if(in_file.is_open()) { if(PROGRAM_MODE == 1 || PROGRAM_MODE == 2) { while(in_file.good() ) { // Get the file name getline(in_file, line, ','); if(line == "") { break; } files.push_back(line); // Get the next item getline(in_file, line, ','); if(line == "") { files.pop_back(); break; } // Store the ID and weight if applicable if(PROGRAM_MODE == 1 || PROGRAM_MODE == 2) { piggies.push_back(stoi(line)); getline(in_file, line); weights.push_back(stoi(line)); } } } else { while(in_file.good() ) { // Get the file name getline(in_file, line); if(line == "") { break; } files.push_back(line); piggies.push_back(-1); weights.push_back(-1); } } } else { cout << "Failed to open input file." << endl; cout << "Press any key to terminate program." << endl; waitKey(0); return 0; } // Allow window to be resized namedWindow( image_window, CV_WINDOW_NORMAL ); namedWindow( trans_window, CV_WINDOW_AUTOSIZE );


// Create file to write feature vectors to ofstream feat_vect_file; feat_vect_file.open ("feat_vect.csv"); // Instantiate pig class PigClassifier pig; vector<PigFace> pig_faces; // Create vector for the file names of misclassified images vector<string> mis_class; // Iterate through all of the image files listed for(size_t j = 0; j < files.size(); j++) { // Load images Mat img = imread(files[j]); if(PROGRAM_MODE == 1 || PROGRAM_MODE == 2) { cout << j << ":" << files[j] << " ("; cout << "Piggie #" << piggies[j]; cout << " weighs " << weights[j] << " lbs"; cout << ") : " << endl; } else { cout << "Piggie # " << j << ":" << endl; } // Set the known weight for supervised data if(weights.size() > j) { pig.weight_known = weights[j]; } // Set the known piggie ID for supervised data if(piggies.size() > j) { pig.piggie_known = piggies[j]; } // Check for invalid input if(!img.data) { cout << "Could not open or find the image" << endl << endl; continue; } // Classify the facial geometry of the pig bool valid_pig = pig.classify(img); // Border color Scalar value; // Skips the invalid pigs and checks valid ones with user if(valid_pig) { // Green border for accepted images value = Scalar(0,200,0); // Calculate the facial geometries pig.calcMetrics(); // Mark the image with all geometries (minus transformation) pig.markImage(); // Calculate the transformation of a valid face pig.calcTransformation(); // Calculate weight of pig pig.calcWeight();


// Display picture imshow(image_window, pig.getImg()); // Display the cropped transformation imshow(trans_window, pig.getCroppedFace()); // Prompt user to mark misclassified if training facial geometry if(PROGRAM_MODE == 1) { char key; cout << "Validate with any button, reject with 'n'" << endl; key = waitKey(0)%255; // 0 = Wait indefinitely for keypress, k*1000 = Wait k seconds or for keypress // If 'n' is pressed, classification was not valid if(key == 27) { // Escape key pressed, end training program break; } else if(key == 'n') { cout << "Classification marked as INVALID." << endl << endl; mis_class.push_back(files[j]+" ("+to_string((long long)j)+")"); pig.misclassified++; // Red border for rejected images value = Scalar(0,0,200); continue; } else if(key == ' ') { cout << "Classification marked as VALID." << endl; } else { cout << "Classification assumed to be VALID." << endl; } // Write feature vector to file to input to regression script vector<double> feat_vect = pig.getFeatVect(); for(size_t i = 0; i < feat_vect.size(); i++) { feat_vect_file << feat_vect[i] << ","; } feat_vect_file << endl; } // Push valid faces onto vector if(PROGRAM_MODE != 1) { // Only run facial recognition on pictures of proper orientation (or short-circuit if in sample mode) if(abs(pig.eye_angle + pig.nose_angle) < 0.04 || PROGRAM_MODE == 0) { pig_faces.push_back(PigFace(pig.getCroppedFace(), pig.piggie_known, files[j], pig.weight_known, pig.weight_est)); } } } else { // Red border for rejected images value = Scalar(0,0,200); }


// Output image with border on it float BORD = 0.03; img = pig.getImg(); img = img(Rect(img.cols*BORD/2, img.rows*BORD/2,img.cols*(1-BORD),img.rows*(1-BORD))); copyMakeBorder( img, img, img.rows*BORD, img.rows*BORD, img.cols*BORD, img.cols*BORD, BORDER_CONSTANT, value ); imshow(image_window, img); waitKey(1); cout << endl; } // Close output file feat_vect_file.close(); // If in training mode, print out the number of misclassified images manually marked if(PROGRAM_MODE == 1) { if(!mis_class.empty()) { cout << endl; cout << "*************************************************************" << endl <<endl; cout << "Misclassified images include: " << endl; for(size_t i = 0; i < mis_class.size(); i++) { cout << " - " << mis_class[i] << endl; } cout << endl; } } // Print out all of the data from the classification pig.getResults(); if(PROGRAM_MODE == 2 || PROGRAM_MODE == 3) { // Destroy the windows for the last pig image and the transformed image destroyWindow(image_window); destroyWindow(trans_window); } // delete any previous Recognizer log files ofstream temp_file; temp_file.open ("recognizer_output.txt"); temp_file.close(); // If in Cluster Mode, cluster all of the valid faces if(PROGRAM_MODE != 1) { cout << "RECOGNIZER PROGRAM_MODE!!" << endl; cout << " Let's look at " << pig_faces.size() << " faces." << endl; // If we aren't in Execution Mode, no need to output title for parameters if(RECOGNIZER_MODE != 0) { feat_vect_file.open ("recognizer_training.csv",ios::app);


feat_vect_file << "threshold, min_cluster_count, radius, neighbors, grid_x, grid_y, " << endl; feat_vect_file.close(); } // Declare all the different parameters used in the facial LBP facial recognition int threshold, min_cluster_count, radius, neighbors, grid_x, grid_y, iterations; bool training = false; if(PROGRAM_MODE != 0) { // Use the facial recognition to create clusters if(RECOGNIZER_MODE == 0) { // Runtime mode threshold = 80; min_cluster_count = 5; radius = 4; neighbors = 12; grid_x = 4; grid_y = 8; iterations = 1; pig.createClusters(pig_faces, threshold, min_cluster_count, radius, neighbors, grid_x, grid_y, iterations,false); } else if (RECOGNIZER_MODE == 1) { // Recognizer training mode (runs the recognition multiple times in all kinds of configurations to give a good idea where to start with the values) vector<int> training_data; min_cluster_count = 0; iterations = 1; for(threshold = 20; threshold <= 60; threshold+=10) { for(radius = 1; radius <= 4; radius++) { for(neighbors = 8; neighbors <= 12; neighbors+=2) { for(grid_x = 4; grid_x <= 12; grid_x+=4) { for(grid_y = 4; grid_y <= 16; grid_y+=4) { training_data = pig.createClusters(pig_faces, threshold, min_cluster_count, radius, neighbors, grid_x, grid_y, iterations, true); feat_vect_file.open ("recognizer_training.csv",ios::app); cout << "File output: "; for(size_t i = 0; i < training_data.size(); i++) { feat_vect_file << training_data[i] << ","; cout << training_data[i] << ","; }


feat_vect_file << endl; cout << endl; feat_vect_file.close(); } } } } } } else { // Prompt mode (ask for the parameters of the values to manually be typed in). Note that some pictures will be deleted every run do { cout << "Select a threshold for recognition confidence (default = 80): " << endl; cin >> threshold; cout << "Select a minimum cluster count (default = 5): " << endl; cin >> min_cluster_count; cout << "Select a radius (default = 4): " << endl; cin >> radius; cout << "Select how many neighbors (default = 12): " << endl; cin >> neighbors; cout << "Select grid_x (default = 4): " << endl; cin >> grid_x; cout << "Select grid_y (default = 8): " << endl; cin >> grid_y; cout << "Select the number of iterations (default = 1): " << endl; cin >> iterations; pig.createClusters(pig_faces, threshold, min_cluster_count, radius, neighbors, grid_x, grid_y, iterations, false); } while(threshold != 0 && RECOGNIZER_MODE != 0); } } else { // Run recognizer on just a single image pig.samplePig(pig_faces[0]); } } // Wait for user to view results waitKey(0); return 0; }


PigClassifier.h

#ifndef PIGCLASSIFIER_DEF
#define PIGCLASSIFIER_DEF

#include <opencv2/core/core.hpp>
#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/contrib/contrib.hpp>
#include <iostream>
#include <stdio.h>
#include <string>
#include <math.h>
#include <fstream>
#include <algorithm>
#include "utilities.h"
#include "PigFace.h"

using namespace std;
using namespace cv;

class PigClassifier {
public:
    static int images_classified;
    static int misclassified;
    static int failed_from_face;
    static int failed_from_eyes;
    static int failed_from_nose;

    // Known weight for supervised data
    double weight_known;
    double weight_est;
    int piggie_known;
    double eye_angle;
    double nose_angle;

    // No-arg constructor has all empty variables
    PigClassifier();

    // Weight function, calculates weight of pig given the other feature vectors
    double calcWeight();

    // Classification function, classifies a face and features for an image
    bool classify(Mat img);

    // Return the Mat stored with the pig
    Mat getImg();

    // Choose one of the faces as valid
    Rect chooseFace(vector<Rect> faces);

    // Choose two eyes as valid, false if no valid eyes
    bool chooseEyes(vector<Rect> eyes);

    // Choose one of the noses as valid
    Rect chooseNose(vector<Rect> noses);

    // Calculate the distance between the eyes, angle, distance to nose, and angle
    void calcMetrics();

    // Mark image with facial geometries
    void markImage();

    // Calculate the perspective transformation of the face
    void calcTransformation();

    // Return the appropriate feature vector for the facial geometry
    vector<double> getFeatVect();

    // Return static variables summary
    void getResults();

    // Return cropped face
    Mat getCroppedFace();

    // Prints the cluster data on a vector of pig faces
    void getClusters(vector<PigFace> & pig_faces);

    // Creates clusters out of the pig faces using LBP facial recognition
    vector<int> createClusters(vector<PigFace> & pig_faces, int confidence_threshold,
        int min_cluster_count, int radius, int neighbors, int grid_x, int grid_y,
        int iterations, bool training);

    // Prints one window with all of the final pig images
    void displayClusters(vector<String> & file_names, vector<int> weight_known,
        vector<int> weight_est);

    // Takes a single pig face and returns its weight based on the facial recognition model
    void samplePig(PigFace piggie);

private:
    // Timer variables
    double t1, t2;

    Rect face;
    Rect lt_eye;
    Rect rt_eye;
    Rect nose;
    Point bisector;
    double eye_avg;
    double eye_del;
    double eye_dist;
    double nose_dist;
    Mat img;
    Mat mark_img;
    Mat face_img;
};

#endif //if not defined


PigClassifier.cpp

#include "PigClassifier.h" #define SHOW_MISCLASS 0 #define PRINT_TIMES 0 // declare and instatiate the global variables int PigClassifier::images_classified = 0; int PigClassifier::misclassified = 0; int PigClassifier::failed_from_face = 0; int PigClassifier::failed_from_eyes = 0; int PigClassifier::failed_from_nose = 0; // Define the constructor PigClassifier::PigClassifier() { this->rt_eye = Rect(0,0,0,0); this->lt_eye = Rect(0,0,0,0); this->face = Rect(0,0,0,0); this->weight_known = -1; this->weight_est = 0; } // Weight function, calculates weight of pig given the other feature vectors double PigClassifier::calcWeight() { ifstream coefficients_file("coefficients.csv"); string line; vector<float> coefficient; vector<float> predictor; // Get all of the coefficients from file if(coefficients_file.is_open()) { while(coefficients_file.good()) { getline(coefficients_file, line, ','); coefficient.push_back(stof(line)); } } else { cout << "Failed to open coefficients file." << endl; return 0; } // Create all of the predictors (everything multiplied times everything, plus a 1 for constant double first_set[] = {this->eye_avg, this->eye_del, this->eye_dist, nose.x, nose.y}; double set_size = 5; for(int i = 0; i < set_size; i++) { predictor.push_back(first_set[i]); } for(int i = 0; i < set_size; i++) { for(int j = 0; j < set_size; j++) { predictor.push_back(first_set[i]*first_set[j]); } }


int mid_set_size = predictor.size(); for(int i = 0; i < set_size; i++) { for( int j = set_size; j < mid_set_size; j++) { predictor.push_back(first_set[i]*predictor[j]); } } predictor.insert(predictor.begin(),1); // Calculate the weight this->weight_est = 0; for(int i = 0; i < predictor.size(); i++) { this->weight_est += coefficient[i]*predictor[i]; } cout << "Estimated weight is " << this->weight_est << endl; return this->weight_est; } // Classification function, classifies a face and features for an image bool PigClassifier::classify(Mat img) { // Increment global number of images classified images_classified++; // Create copies of the image specified in the parameters this->img = img.clone(); this->mark_img = img.clone(); // Define and load all of the cascade classifiers CascadeClassifier eyes_cascade; eyes_cascade.load("cascade_eyes_32stage_999.xml"); CascadeClassifier pigface_cascade; pigface_cascade.load("cascade_face_20stage_999.xml"); CascadeClassifier nose_cascade; nose_cascade.load("cascade_nose_33stage_999.xml"); // Convert to grayscale Mat gray; cvtColor(this->img, gray, CV_BGR2GRAY); equalizeHist( gray, gray); // Detect all possible faces cout << " Classifying faces..." << endl; // Optionally start timers t1 = get_wall_time(); t2 = get_cpu_time(); // Create vector of rectangles and pass to classifier std::vector<Rect> faces; pigface_cascade.detectMultiScale( gray, faces, 1.15, 3, 0|CV_HAAR_SCALE_IMAGE, Size((int)img.rows*0.31,(int)img.rows*0.31)); // Print classification time if desired if(PRINT_TIMES) { cout << "\tWall Clock Time: \t" << get_wall_time()-t1 << " sec " << endl; cout << "\tCPU Clock Time: \t" << get_cpu_time()-t2 << " sec " << endl;


} // Deal with rectangles depending on how many objects were detected. if(faces.size() == 1) { // If there's only one face, must be the valid one. face = faces[0]; cout << "\tOnly one valid face detected." << endl; } else if(faces.empty()) { // If there are no faces, picture cannot be valid. cout << "\tCould not detect any faces in picture." << endl; failed_from_face++; return false; } else if(faces.size() > 1) { // If there are multiple faces, then the valid one needs to be selected. cout << "\tMore than one face detected, selecting most probable object." << endl; chooseFace(faces); cout << "\tOne single face selected out of " << faces.size() << "." << endl; } // Mark all detected faces if(SHOW_MISCLASS) { for(size_t i = 0; i < faces.size(); i++) { // Mark detected faces onto image rectangle( this->mark_img, Point(faces[i].x,faces[i].y), Point(faces[i].x+faces[i].width,faces[i].y+faces[i].height), Scalar( 0,0,150 ), 5, 8, 0 ); } // Mark detected face onto image early if debugging rectangle( this->mark_img, Point(face.x,face.y), Point(face.x+face.width,face.y+face.height), Scalar(150,0,0), 5, 8, 0 ); } // Resize the face with a buffer for eye detection double buffer = 0.05; face.x = ((int) (face.x - face.width*buffer)) < 0 ? 0 : ((int) (face.x - face.width*buffer)); face.y = ((int) (face.y - face.height*buffer)) < 0 ? 0 : ((int) (face.y - face.height*buffer)); face.width = ((int) (face.width + 2*face.width*buffer)) + face.x > gray.cols ? gray.cols - face.x : ((int) (face.width + 2*face.width*buffer)); face.height = ((int) (face.height + 2*face.height*buffer)) + face.y > gray.rows ? gray.rows - face.y : ((int) (face.height + 2*face.height*buffer)); Mat faceROI = gray( face ); // Detect all possible eyes in ROI cout << " Classifying eyes:" << endl; t1 = get_wall_time(); t2 = get_cpu_time(); // Create vector of rectangles and pass to eye classifier vector<Rect> eyes; eyes_cascade.detectMultiScale( faceROI, eyes, 1.35, 30, 0|CV_HAAR_SCALE_IMAGE, Size((int)img.rows*0.039,(int)img.rows*0.039), Size((int)img.rows*0.154,(int)img.rows*0.154)); // Print classification time if desired


if(PRINT_TIMES) { cout << "\tWall Clock Time: \t" << get_wall_time()-t1 << " sec " << endl; cout << "\tCPU Clock Time: \t" << get_cpu_time()-t2 << " sec " << endl; } // Check if any eyes are detected if(eyes.empty()) { cout << "\tNo eyes detected on face." << endl; failed_from_eyes++; return false; } // Show misclassified eyes if debugging if(SHOW_MISCLASS) { for( size_t j = 0; j < eyes.size(); j++ ) { rectangle( this->mark_img, Point(face.x + eyes[j].x,face.y + eyes[j].y), Point(face.x + eyes[j].x + eyes[j].width,face.y + eyes[j].y + eyes[j].height), Scalar( 0,0,150 ), 5, 8, 0 ); } } // Choose the eyes that we will actually use if(!chooseEyes(eyes)) { cout << "\tInvalid set of eyes." << endl; failed_from_eyes++; return false; } // Mark valid detected eyes onto image early if debugging if(SHOW_MISCLASS) { rectangle( this->mark_img, Point(face.x + lt_eye.x,face.y + lt_eye.y), Point(face.x + lt_eye.x + lt_eye.width,face.y + lt_eye.y + lt_eye.height), Scalar( 200,0,0 ), 5, 8, 0 ); rectangle( this->mark_img, Point(face.x + rt_eye.x,face.y + rt_eye.y), Point(face.x + rt_eye.x + rt_eye.width,face.y + rt_eye.y + rt_eye.height), Scalar( 200,0,0 ), 5, 8, 0 ); } // Detect all possible noses cout << " Classifying noses..." << endl; // Pad the bottom with a border of 10% to make sure full nose is classified int top=0, bottom=0, left=0, right = 0; bottom = (int) (0.3*gray.rows); int borderType = BORDER_CONSTANT; copyMakeBorder( gray, gray, top, bottom, left, right, borderType, (0,0,0) ); // Start timers t1 = get_wall_time(); t2 = get_cpu_time(); // Define vector of rectangles to pass to nose classifier std::vector<Rect> noses; // Create a nose ROI that's below the face Mat nose_ROI = gray(Rect(1,face.y+face.height-1,gray.cols-1,gray.rows-(face.y+face.height))); // best parameters ...


nose_cascade.detectMultiScale( nose_ROI, noses, 1.05, 3, 0|CV_HAAR_SCALE_IMAGE, Size((int)img.rows*0.154,(int)img.rows*0.154)); // Print classification times if desired if(PRINT_TIMES) { cout << "\tWall Clock Time: \t" << get_wall_time()-t1 << " sec " << endl; cout << "\tCPU Clock Time: \t" << get_cpu_time()-t2 << " sec " << endl; } //adjust noses for ROI for(size_t i = 0; i < noses.size(); i++) { noses[i].y = noses[i].y + face.y+face.height; } // Deal with nose rectangles depending on how many objects are detected if(noses.size() == 1) { // If only one nose is found, then it must be the valid nose. nose = noses[0]; cout << "\tOnly one valid nose detected." << endl; } else if(noses.empty()) { // If no noses are found, the picture cannot be a valid image of a pig. cout << "\tCould not detect any noses in picture." << endl; failed_from_nose++; return false; } else if(noses.size() > 1) { // If more than one nose are found, then we must select the valid one. cout << "\tMore than one nose detected, selecting most probable object." << endl; chooseNose(noses); cout << "\tOne valid nose selected out of " << noses.size() << "." << endl; } // Mark all detected noses if(SHOW_MISCLASS) { for(size_t i = 0; i < noses.size(); i++) { // Mark detected noses onto image rectangle( this->mark_img, Point((int) noses[i].x,(int) noses[i].y), Point((int)noses[i].x+noses[i].width,noses[i].y+noses[i].height), Scalar( 0,0,150 ), 5, 8, 0 ); } // Mark valid detected nose early if debugging rectangle( this->mark_img, Point((int) nose.x,(int) nose.y), Point((int)nose.x+nose.width,nose.y+nose.height), (0,0,200), 5, 8, 0 ); } return true; } // Choose a valid face based on shortest Euclidean distance to the center of the image Rect PigClassifier::chooseFace(vector<Rect> faces) { double shortest_distance=999999999; Point img_center = ((double) (this->img.cols/2),(double) (this->img.rows/2)); for(size_t i = 0; i < faces.size(); i++) {


Point face_center = Point(faces[i].x + faces[i].width/2,faces[i].y + faces[i].height/2); double distance = sqrt(pow((double) img_center.x - face_center.x,2.0) + pow( (double) img_center.y - face_center.y,2.0)); if(distance < shortest_distance) { this->face = faces[i]; shortest_distance = distance; } } return face; } // Choose eyes, false if no valid eyes bool PigClassifier::chooseEyes(vector<Rect> eyes) { // Divide eyes into right and left vector<Rect> rt_eyes; vector<Rect> lt_eyes; for(size_t i = 0; i < eyes.size(); i++) { if(eyes[i].x < (int) (face.width/2)) { lt_eyes.push_back(eyes[i]); } else { rt_eyes.push_back(eyes[i]); } } if(lt_eyes.empty()) { cout << "\tFace does not contain any eyes on the left side." << endl; return false; } if(rt_eyes.empty()) { cout << "\tFace does not contain any eyes on the right side." << endl; return false; } cout << "\tThere are " << lt_eyes.size() << " eyes on the left." << endl; cout << "\tThere are " << rt_eyes.size() << " eyes on the right." << endl; //Break eye selector if already have valid pair of eyes if(lt_eyes.size() == 1 && rt_eyes.size() == 1) { this->lt_eye = lt_eyes[0]; this->rt_eye = rt_eyes[0]; return true; } // allocate dynamic 2-dimensional array to predict probability of any two given eyes double** E = new double*[lt_eyes.size()]; for(size_t i = 0; i < lt_eyes.size(); i++) { E[i] = new double[rt_eyes.size()]; } // fill array of expected values all initially with 1 for(size_t i = 0; i < lt_eyes.size(); i++) { for( size_t j = 0; j < rt_eyes.size(); j++) { E[i][j] = 1; }


} // determine max and min width of eyes int max_width = 0; int min_width = 999999999; for(size_t i = 0; i < lt_eyes.size(); i++) { if(eyes[i].width > max_width) { max_width = eyes[i].width; } if(eyes[i].width < min_width) { min_width = eyes[i].width; } } // create temporary images of eyes, all resized vector<Mat> lt_eye_images; for(size_t i = 0; i < lt_eyes.size(); i++) { lt_eye_images.push_back(this->img(lt_eyes[i])); resize(lt_eye_images[i],lt_eye_images[i],Size(50,50)); } vector<Mat> rt_eye_images; for(size_t i = 0; i < rt_eyes.size(); i++) { rt_eye_images.push_back(this->img(rt_eyes[i])); resize(rt_eye_images[i],rt_eye_images[i],Size(50,50)); flip(rt_eye_images[i],rt_eye_images[i],1); } double max_similarity = 0; double min_similarity = 99999; for(size_t i = 0; i < lt_eyes.size(); i++) { for( size_t j = 0; j < rt_eyes.size(); j++) { // Calculate the L2 relative error between images. double errorL2 = norm( lt_eye_images[i], rt_eye_images[j], CV_L2 ); // Convert to a reasonable scale, since L2 error is summed across all pixels of the image. double similarity = errorL2 / (double)( lt_eye_images[i].rows * lt_eye_images[i].cols ); E[i][j] = similarity; cout << "\tsimilarity result = " << E[i][j] << endl; if(E[i][j] < min_similarity) { min_similarity = E[i][j];\ } if(E[i][j] > max_similarity) { max_similarity = E[i][j]; } } } // Check for potential division by zero if(max_width == min_width) { max_width++; } // compute probablilities cout << "\tExamining probabilities.." << endl; for(size_t i = 0; i < lt_eyes.size(); i++) { for( size_t j = 0; j < rt_eyes.size(); j++) {


// probability increases with similarity in template matching E[i][j] = (double) 0.5*(E[i][j]-min_similarity)/(max_similarity-min_similarity); // probability increases with similarity in eye size between the two E[i][j] += (double) 0.5*(1 - abs(lt_eyes[i].width-rt_eyes[j].width)/max_width); // probability increases with closeness to horizon E[i][j] += (double) 2*(face.height/2 - abs(face.height/2 - (lt_eyes[i].y+lt_eyes[i].height/2)))/(face.height/2); E[i][j] += (double) 2*(face.height/2 - abs(face.height/2 - (rt_eyes[j].y+rt_eyes[j].height/2)))/(face.height/2); // probability increases the further apart an eye is E[i][j] += (double) abs((lt_eyes[i].x + lt_eyes[i].width/2) - (rt_eyes[j].x + rt_eyes[j].width/2))/(face.width-max_width); // probability increases the less angle there is between the eyes E[i][j] += (double) 4*(face.height - abs(lt_eyes[i].y - rt_eyes[j].y))/(face.height); cout << "\tprobability = " << E[i][j] << " at (" << lt_eyes[i].x << "," << lt_eyes[i].y << ") and (" << rt_eyes[j].x << "," << rt_eyes[j].y << ")" << endl; } } // If a best pairwise probability is found, set that as valid set of eyes int best_i = 0; int best_j = 0; double best_E = 0; for(size_t i = 0; i < lt_eyes.size(); i++) { for( size_t j = 0; j < rt_eyes.size(); j++) { if(E[i][j] > best_E) { best_i = i; best_j = j; best_E = E[i][j]; } } } cout << "\tSelected valid left and right eyes." << endl; this->lt_eye = lt_eyes[best_i]; //cout << "\tThe left eye is at " << lt_eye.x << " and " << lt_eye.y << endl; this->rt_eye = rt_eyes[best_j]; //cout << "\tThe right eye is at " << rt_eye.x << " and " << rt_eye.y << endl; delete[] E; return true; } // Choose a valid nose based on shortest Euclidean distance to the bottom-center of the picture Rect PigClassifier::chooseNose(vector<Rect> noses) { double shortest_distance=999999999; Point bottom_center = ((double) (this->img.cols/2),(double) this->img.rows);


for(size_t i = 0; i < noses.size(); i++) { Point nose_center = Point(noses[i].x + noses[i].width/2,noses[i].y + noses[i].height/2); double distance = sqrt(pow((double) bottom_center.x - nose_center.x,2.0) + pow( (double) bottom_center.y - nose_center.y,2.0)); if(distance < shortest_distance) { this->nose = noses[i]; shortest_distance = distance; } } return nose; } // Return the Mat stored with the pig Mat PigClassifier::getImg() { return this->mark_img; } // Calculate the various feature metrics void PigClassifier::calcMetrics() { cout << " Calculating metrics..." << endl; // Make temporary points for eyes Point rightEye = Point((int)face.x+rt_eye.x+rt_eye.width/2,(int)face.y+rt_eye.y+rt_eye.height/2); Point leftEye = Point((int)face.x+lt_eye.x+lt_eye.width/2,(int)face.y+lt_eye.y+lt_eye.height/2); Point nosePt = Point((int)nose.x+nose.width/2,(int)nose.y+nose.height*3/4); this->bisector = Point((int)(rightEye.x+leftEye.x)/2,(int)(rightEye.y+leftEye.y)/2); // Calculate eye and nose angle from bisector this->eye_angle = (double) atan((double)(rightEye.y-this->bisector.y)/(rightEye.x-this->bisector.x)); this->nose_angle = (double) atan((double)(nosePt.x-this->bisector.x)/(nosePt.y-this->bisector.y)); // Calculate distance from eye to eye and bisector to nose this->eye_dist = (double) sqrt(pow((double)rightEye.y-leftEye.y,2) + pow((double)rightEye.x-leftEye.x,2)); this->nose_dist = (double) sqrt(pow((double)nosePt.x-this->bisector.x,2) + pow((double)nosePt.y-this->bisector.y,2)); // Calculate eye size average this->eye_avg = (lt_eye.width + rt_eye.width)/2; // Calculate eye size delta this->eye_del = abs(lt_eye.width - rt_eye.width); } // Mark image with facial geometries void PigClassifier::markImage() { // Mark valid detected face


rectangle( this->mark_img, Point(face.x,face.y), Point(face.x+face.width,face.y+face.height), Scalar(200,0,0), 5, 8, 0 ); // Mark valid detected eyes rectangle( this->mark_img, Point(face.x+lt_eye.x,face.y+lt_eye.y), Point(face.x+lt_eye.x + lt_eye.width,face.y+lt_eye.y + lt_eye.height), Scalar( 200,0,0 ), 5, 8, 0 ); rectangle( this->mark_img, Point(face.x+rt_eye.x,face.y+rt_eye.y), Point(face.x+rt_eye.x + rt_eye.width,face.y+rt_eye.y + rt_eye.height), Scalar( 200,0,0 ), 5, 8, 0 ); // Mark valid detected nose rectangle( this->mark_img, Point(nose.x,nose.y), Point(nose.x+nose.width,nose.y+nose.height), Scalar(200,0,0), 5, 8, 0 ); // Make temporary points for eyes Point rightEye = Point((int)face.x+rt_eye.x+rt_eye.width/2,(int)face.y+rt_eye.y+rt_eye.height/2); Point leftEye = Point((int)face.x+lt_eye.x+lt_eye.width/2,(int)face.y+lt_eye.y+lt_eye.height/2); Point nosePt = Point((int)nose.x+nose.width/2,(int)nose.y+nose.height/2); // Draw circles pinpointing the eyes circle( this->mark_img, leftEye, 4, Scalar(100,0,0), 25, 8); circle( this->mark_img, rightEye, 4, Scalar(100,0,0), 25, 8); // Draw line between eyes line( this->mark_img, leftEye, rightEye, Scalar(100,0,0), 8, 8); // Draw circle pinpointing the nose circle( this->mark_img, nosePt, 4, Scalar(100,0,0), 25, 8); //Draw line between bisector and nose line( this->mark_img, nosePt, this->bisector, Scalar(100,0,0), 8, 8); } // calculate the perspective transformation of the face void PigClassifier::calcTransformation() { Point rightEye = Point((int)face.x+rt_eye.x+rt_eye.width/2,(int)face.y+rt_eye.y+rt_eye.height/2); Point leftEye = Point((int)face.x+lt_eye.x+lt_eye.width/2,(int)face.y+lt_eye.y+lt_eye.height/2); Point forehead = Point(this->bisector.x-sin(this->nose_angle)*this->nose_dist*0.4,this->bisector.y-cos(this->nose_angle)*nose_dist*0.4); Point ltFace, rtFace, rtNose, ltNose; //rtFace = Point((int) face.x + rt_eye.x + rt_eye.width*3/2, (int) face.y + rt_eye.y - rt_eye.height*2); //ltFace = Point((int) face.x + lt_eye.x - lt_eye.width/2, (int) face.y + lt_eye.y - lt_eye.height*2); double eye_delta = (rt_eye.width - lt_eye.width)/2; rtFace = Point((int) forehead.x + cos(this->eye_angle)*(this->eye_dist*3/4+eye_delta), (int) forehead.y + sin(this->eye_angle)*(this->eye_dist*3/4+eye_delta)); ltFace = Point((int) forehead.x - cos(this->eye_angle)*(this->eye_dist*3/4-eye_delta), (int) forehead.y - sin(this->eye_angle)*(this->eye_dist*3/4-eye_delta));


ltNose = Point((int) nose.x+nose.width/2-cos(this->eye_angle)*(nose.width*3/8),(int) nose.y + nose.height/2 - sin(this->eye_angle)*(nose.width/2)); rtNose = Point((int) nose.x+nose.width/2+cos(this->eye_angle)*(nose.width*3/8),(int) nose.y + nose.height/2 + sin(this->eye_angle)*(nose.width/2)); // Draw quadrilateral around plane to be transformed circle( this->mark_img, rtFace, 4, Scalar(100,0,100), 25, 8); circle( this->mark_img, ltFace, 4, Scalar(100,0,100), 25, 8); circle( this->mark_img, ltNose, 4, Scalar(100,0,100), 25, 8); circle( this->mark_img, rtNose, 4, Scalar(100,0,100), 25, 8); line( this->mark_img, rtFace, ltFace, Scalar(100,0,100), 8, 8); line( this->mark_img, ltFace, ltNose, Scalar(100,0,100), 8, 8); line( this->mark_img, ltNose, rtNose, Scalar(100,0,100), 8, 8); line( this->mark_img, rtNose, rtFace, Scalar(100,0,100), 8, 8); Point2f src[4], dst[4]; src[0] = Point2f(rtFace); src[1] = Point2f(ltFace); src[2] = Point2f(ltNose); src[3] = Point2f(rtNose); dst[0] = Point2f(Point(300,0)); dst[1] = Point2f(Point(0,0)); dst[2] = Point2f(Point(0,600)); dst[3] = Point2f(Point(300,600)); // Take perspective transformation with bicubic interpolation Mat M(2, 4, CV_32FC1); M = getPerspectiveTransform( src, dst); warpPerspective( this->img, this->face_img, M, Size(300,600),INTER_CUBIC); cvtColor(this->face_img, this->face_img, CV_BGR2GRAY); equalizeHist(this->face_img,this->face_img); } // Return the appropriate feature vector for the facial geometry vector<double> PigClassifier::getFeatVect() { // Note: all features are in relation to center of face as origin and normalized to face.width vector<double> feat_vect; // Store the average eye size feat_vect.push_back(this->eye_avg); // Store the difference in eye size feat_vect.push_back(this->eye_del); // Store the distance between the eyes feat_vect.push_back(this->eye_dist); // Position of nose feat_vect.push_back(this->nose.x); feat_vect.push_back(this->nose.y); // Store the known weight of the pig for supervised data feat_vect.push_back(weight_known);


return feat_vect; } // Return static variables summary void PigClassifier::getResults() { cout << endl; cout << "*************************************************************" << endl; cout << " Results" << endl; cout << "*************************************************************" << endl; cout << "Misclassifications = " << misclassified << " / " << images_classified << endl; cout << " Misclassification error = " << (double) misclassified/images_classified << endl; cout << "Total rejected images = " << failed_from_face + failed_from_eyes + failed_from_nose << " / " << images_classified << endl; cout << " Failures from faces = " << failed_from_face << " / " << images_classified << endl; cout << " Failures from eyes = " << failed_from_eyes << " / " << images_classified << endl; cout << " Failures from noses = " << failed_from_nose << " / " << images_classified << endl; cout << "*************************************************************" << endl; } // Return cropped face Mat PigClassifier::getCroppedFace() { return this->face_img; } // Prints the cluster data on a vector of pig faces void PigClassifier::getClusters(vector<PigFace> & pig_faces) { // Check if the vector has elements if(pig_faces.empty()) { return; } cout << endl << "Pig facial cluster information:" << endl; // loop through all pig faces for(size_t i = 0; i < pig_faces.size(); i++) { cout << "Image " << i << endl; cout << " ID known: " << pig_faces[i].ID_known << endl; cout << " ID prediction: " << pig_faces[i].ID_prediction << endl; cout << " Prediction confidence: " << pig_faces[i].cluster_confidence << endl; cout << endl; } } // Creates clusters out of the pig faces


vector<int> PigClassifier::createClusters(vector<PigFace> & pig_faces, int confidence_threshold, int min_cluster_count, int radius, int neighbors, int grid_x, int grid_y, int iterations, bool training) { ofstream out_file; out_file.open ("recognizer_output.txt",ios::app); out_file << "**************************************************" << endl; out_file << "Recognizer Run" << endl; out_file << "**************************************************" << endl; out_file << "threshold = " << confidence_threshold << endl; out_file << "min_cluster_count = " << min_cluster_count << endl; out_file << "radius = " << radius << endl; out_file << "neighbors = " << neighbors << endl; out_file << "grid_x = " << grid_x << endl; out_file << "grid_y = " << grid_y << endl; out_file << "iterations = " << iterations << endl; out_file << "training = " << training << endl; namedWindow( "Transformed Image", CV_WINDOW_AUTOSIZE ); // Check if the vector has elements vector<int> training_data; if(pig_faces.empty()) { return training_data; } double threshold = DBL_MAX; Ptr<FaceRecognizer> model = createLBPHFaceRecognizer(radius, neighbors, grid_x, grid_y, threshold); vector<Mat> images; vector<int> labels; cout << "Initializing FaceRecognizer..." << endl; int predicted_label = -1; double predicted_confidence = 0.0; for(int k = 0; k < iterations; k++) { cout << "Iteration #" << k << endl; out_file << "Iteration #" << k << endl; images.clear(); labels.clear(); //random_shuffle(pig_faces.begin(),pig_faces.end()); // Initialize model with first face image images.push_back(pig_faces[0].getFaceImg()); labels.push_back(0); model->train(images, labels); for(size_t i = 0; i < pig_faces.size(); i++) { if (!training) { cout << "Updating pig # " << i << endl; } model->predict(pig_faces[i].getFaceImg(), predicted_label, predicted_confidence); imshow("Transformed Image", pig_faces[i].getFaceImg()); waitKey(1); predicted_confidence = 100 - predicted_confidence;


pig_faces[i].cluster_confidence = predicted_confidence; if(predicted_confidence > confidence_threshold) { // Confidence is good enough for facial match pig_faces[i].ID_prediction = predicted_label; pig_faces[i].cluster_confidence = predicted_confidence; pig_faces[i].membership.push_back(predicted_label); if(!training) { cout << "We'll call it a match." << endl; } } else { // Create a new label images.push_back(pig_faces[i].getFaceImg()); labels.push_back(labels.size()); model->update(images,labels); if(!training) { cout << "Let's make a new label." << endl; } i--; if(labels.size() > 20 && training) { training_data.push_back(confidence_threshold); training_data.push_back(min_cluster_count); training_data.push_back(radius); training_data.push_back(neighbors); training_data.push_back(grid_x); training_data.push_back(grid_y); training_data.push_back(-1); return training_data; } } } // Sort all of the faces based on pig ID and weight sort(pig_faces.begin(), pig_faces.end()); cout << endl; out_file << endl; for(size_t j = 0; j < pig_faces.size(); j++) { if(pig_faces[j].ID_prediction != -1) { cout << pig_faces[j].ID_prediction << ": " << "ID_known = " << pig_faces[j].ID_known << endl; out_file << pig_faces[j].ID_prediction << ": " << "ID_known = " << pig_faces[j].ID_known << endl; } else { cout << j << ": " << pig_faces[j].ID_prediction << ": " << endl; out_file << j << ": " << pig_faces[j].ID_prediction << ": " << endl; } } cout << endl; out_file << endl; } //create dynamic array of cluster counts


int *cluster_count = new int[labels.size()]; for(size_t i = 0; i < labels.size(); i++) { cluster_count[i] = 0; } // Count the number of faces in each cluster for(size_t i = 0; i < labels.size(); i++) { for(size_t j = 0; j < pig_faces.size(); j++) { if(pig_faces[j].ID_prediction == i) { cluster_count[i]++; } } } // Delete all clusters that are smaller than the given threshold //int offset = 0; for(size_t i = 0; i < labels.size(); i++) { for(size_t j = 0; j < pig_faces.size(); j++) { if(pig_faces[j].ID_prediction == i) { if(cluster_count[i] < min_cluster_count) { pig_faces.erase(pig_faces.begin() + j); j--; cluster_count[i] = 0; } } } } // Store only clusters that are big enough vector<Mat> temp_images; vector<int> temp_labels; for(size_t i = 0; i < labels.size(); i++) { if(cluster_count[i] >= min_cluster_count) { temp_images.push_back(images[i]); temp_labels.push_back(labels[i]); } } images.clear(); labels.clear(); for(size_t i = 0; i < temp_labels.size(); i++) { images.push_back(temp_images[i]); labels.push_back(temp_labels[i]); } // Recount the number of faces in each cluster cluster_count = new int[labels.size()]; for(size_t i = 0; i < labels.size(); i++) { cluster_count[i] = 0; } for(size_t i = 0; i < labels.size(); i++) { for(size_t j = 0; j < pig_faces.size(); j++) { if(pig_faces[j].ID_prediction == i) { cluster_count[i]++; } } }


// Cycle through, taking away a cluster at a time Mat temp_image; int temp_label; for(size_t i = 0; i < labels.size(); i++) { bool moving_flag = false; vector<int> mode_list; double average; temp_image = images[0]; images.erase(images.begin()); temp_label = labels[0]; labels.erase(labels.begin()); model->train(images, labels); cout << "Omitting Cluster " << temp_label << "..." << endl; out_file << "Omitting Cluster " << temp_label << "..." << endl; int last_ID=-1; for(size_t j = 0; j < pig_faces.size(); j++) { if(pig_faces[j].ID_prediction != last_ID) { imshow("Transformed Image", pig_faces[j].getFaceImg()); waitKey(1); if(last_ID != -1) { average = average/mode_list.size(); cout << "Average = " << average << endl; out_file << "Average = " << average << endl; sort(mode_list.begin(),mode_list.end()); int mode = 0; int mode_count = 0; int largest_mode_count = 0; int last_ID_value = -1; for(size_t k = 0; k < mode_list.size(); k++) { if(mode_list[k] == last_ID_value) { mode_count++; } else { mode_count = 1; } if(mode_count > largest_mode_count) { mode = mode_list[k]; largest_mode_count = mode_count; } last_ID_value = mode_list[k]; } cout << "Mode = " << mode << " with " << largest_mode_count << " instances." << endl; out_file << "Mode = " << mode << " with " << largest_mode_count << " instances." << endl; if(last_ID == temp_label) { // Determine if we should move ommitted cluster to target cluster (75%) if((double) largest_mode_count/mode_list.size() > .75) { // Calculate target cluster average


double target_average = 0; int target_size = 0; for(size_t t = 0; t < pig_faces.size(); t++) { if(pig_faces[t].ID_prediction == mode) { target_average += pig_faces[t].weight_est; target_size++; } } target_average = target_average/target_size; // Move cluster to target cluster if within 10% of estimated weight if(abs(target_average - average)/target_average < 0.1 && abs(target_average - average)/average < 0.1) { moving_flag = true; for(size_t t = 0; t < pig_faces.size(); t++) { if(pig_faces[t].ID_prediction == temp_label) { pig_faces[t].ID_prediction = mode; } } sort(pig_faces.begin(), pig_faces.end()); cout << "Moved Cluster " << temp_label << " to be joined with Cluster " << mode << endl; out_file << "Moved Cluster " << temp_label << " to be joined with Cluster " << mode << endl; } else { cout << "Not moving Cluster " << temp_label << " because it's est weight is " << average << " but the target cluster is " << target_average << endl; out_file << "Not moving Cluster " << temp_label << " because it's est weight is " << average << " but the target cluster is " << target_average << endl; } } else { cout << "Not moving Cluster " << temp_label << " because only " << ((double) largest_mode_count/mode_list.size()*100) << "% of cluster is the mode." << endl; out_file << "Not moving Cluster " << temp_label << " because only " << ((double) largest_mode_count/mode_list.size()*100) << "% of cluster is the mode." << endl; } } } // Start looking at a new cluster mode_list.clear(); average = 0;


cout << "Cluster " << pig_faces[j].ID_prediction << endl; out_file << "Cluster " << pig_faces[j].ID_prediction << endl; } if(pig_faces[j].ID_prediction == temp_label) { model->predict(pig_faces[j].getFaceImg(), predicted_label, predicted_confidence); // FIXME I think I could choose to only run on the cluster being omitted (AKA ID_prediction = temp_label) predicted_confidence = 100 - predicted_confidence; if(pig_faces[j].ID_known != -1) { cout << "1st: " << pig_faces[j].ID_prediction << ",2nd: " << predicted_label << "(" << predicted_confidence << "%)" << ", Actual = " << pig_faces[j].ID_known << ", Weight = " << pig_faces[j].weight_est << endl; out_file << "1st: " << pig_faces[j].ID_prediction << ",2nd: " << predicted_label << "(" << predicted_confidence << "%)" << ", Actual = " << pig_faces[j].ID_known << ", Weight = " << pig_faces[j].weight_est << endl; } else { cout << "1st: " << pig_faces[j].ID_prediction << ",2nd: " << predicted_label << "(" << predicted_confidence << "%)" << ", Weight = " << pig_faces[j].weight_est << endl; out_file << "1st: " << pig_faces[j].ID_prediction << ",2nd: " << predicted_label << "(" << predicted_confidence << "%)" << ", Weight = " << pig_faces[j].weight_est << endl; } mode_list.push_back(predicted_label); average += pig_faces[j].weight_est; } last_ID = pig_faces[j].ID_prediction; } cout << endl; out_file << endl; if(moving_flag == false) { images.push_back(temp_image); labels.push_back(temp_label); } cout << endl; out_file << endl; } // for every cluster vector<double> temp_cluster; for(size_t i = 0; i < labels.size(); i++) { cout << "Cluster " << i << endl; // dump data into vector temp_cluster.clear(); for(size_t j = 0; j < pig_faces.size(); j++) { if(pig_faces[j].ID_prediction == labels[i]) { temp_cluster.push_back(pig_faces[j].weight_est); } } while(cluster_count[i] >= 3) {


// calculate mean and standard deviation double mean = 0, s = 0; for(size_t j = 0; j < temp_cluster.size(); j++) { mean += temp_cluster[j]; } mean = mean/temp_cluster.size(); for(size_t j = 0; j < temp_cluster.size(); j++) { s += (temp_cluster[j] - mean)*(temp_cluster[j] - mean); } s = sqrt(s/(temp_cluster.size()-1)); // iterate through and find index of farthest outlier int farthest_index=0; double farthest_value=0; for(size_t j = 0; j < temp_cluster.size(); j++) { if(temp_cluster[j] > farthest_value) { farthest_index = j; farthest_value = temp_cluster[j]; } } // using that index,check if outlier double t = abs(temp_cluster[farthest_index] - mean)/s; cout << "mean = " << mean << "s = " << s << "t = " << t << endl; // if it is outlier, delete from cluster and repeat if((temp_cluster.size() == 3 && t > 1.153) || (temp_cluster.size() == 4 && t > 1.463) || (temp_cluster.size() == 5 && t > 1.672) || (temp_cluster.size() == 6 && t > 1.822) || (temp_cluster.size() == 7 && t > 1.938) || (temp_cluster.size() == 8 && t > 2.032) || (temp_cluster.size() == 9 && t > 2.110) || (temp_cluster.size() >= 10 && temp_cluster.size() < 15 && t > 2.176) || (temp_cluster.size() >= 15 && temp_cluster.size() < 20 && t > 2.409) || (temp_cluster.size() >= 20 && temp_cluster.size() < 25 && t > 2.557) || (temp_cluster.size() >= 25 && temp_cluster.size() < 50 && t > 2.663) || (temp_cluster.size() >= 50 && temp_cluster.size() < 100 && t > 2.956) || (temp_cluster.size() >= 100 && t > 3.207)) { cout << "Erasing outlier" << endl; temp_cluster.erase(temp_cluster.begin() + farthest_index); } else { // if not an outlier, can move on to the next cluster cout << (cluster_count[i] - temp_cluster.size()) << " outliers deleted." << endl; cout << "Cluster average without outliers = " << mean << endl << endl; break; }


} } // Create text file of cluster info for sampling program ofstream cluster_file; cluster_file.open ("cluster_info.csv"); // Print clusters and calculate average weight vector<String> file_names; vector<int> weight_known; vector<int> weight_est; for(size_t i = 0; i < labels.size(); i++) { cluster_count[i] = 0; // need to recalculate now that outliers are gone double average_weight = 0; double average_confidence = 0; int ID_average = 0; cout << "Cluster " << i << " contains:" << endl; out_file << "Cluster " << i << "contains:" << endl; for(size_t j = 0; j < pig_faces.size(); j++) { if(pig_faces[j].ID_prediction == labels[i]) { if(cluster_count[i] == 0 ) { // If we're looking at the first image, push onto the stack file_names.push_back(pig_faces[j].file_name); weight_known.push_back(pig_faces[j].weight_known); } if(!training) { if(pig_faces[j].weight_known != -1 && pig_faces[j].ID_known != -1) { cout << " " << pig_faces[j].ID_known << ": known=" << pig_faces[j].weight_known << "lbs, est=" << pig_faces[j].weight_est << " lbs, confidence = " << pig_faces[j].cluster_confidence << endl; out_file << " " << pig_faces[j].ID_known << ": known=" << pig_faces[j].weight_known << "lbs, est=" << pig_faces[j].weight_est << " lbs, confidence = " << pig_faces[j].cluster_confidence << endl; } else { cout << j << ": est=" << pig_faces[j].weight_est << " lbs, confidence = " << pig_faces[j].cluster_confidence << endl; out_file << j << ": est=" << pig_faces[j].weight_est << " lbs, confidence = " << pig_faces[j].cluster_confidence << endl; } } average_weight += pig_faces[j].weight_est; average_confidence += pig_faces[j].cluster_confidence; ID_average += pig_faces[j].ID_known; cluster_count[i]++; } }


cout << "Average weight of cluster = " << average_weight/cluster_count[i] << " lbs" << endl; average_confidence = (average_confidence-100)/(cluster_count[i]-1); cout << "Average cluster confidence = " << average_confidence << "%" << endl; weight_est.push_back((int) (average_weight/cluster_count[i]+0.5)); cout << endl; cluster_file << labels[i] << "," << (int) (average_weight/cluster_count[i]+0.5) << endl; } delete[] cluster_count; // Pushing the various parameters if we're training the recognizer if(training) { training_data.push_back(confidence_threshold); training_data.push_back(min_cluster_count); training_data.push_back(radius); training_data.push_back(neighbors); training_data.push_back(grid_x); training_data.push_back(grid_y); training_data.push_back(labels.size()); } out_file.close(); cluster_file.close(); model->save("face_rec_model.xml"); // Display all of the clusters in the dynamically sizing window displayClusters(file_names, weight_known, weight_est); return training_data; } void PigClassifier::displayClusters(vector<String> & file_names, vector<int> weight_known, vector<int> weight_est) { vector<Mat> images; if(file_names.size() != weight_known.size() && weight_known.size() != weight_est.size()) { cout << "Error in displayCluster: Vectors not all the same size." << endl << endl; return; } for(size_t i = 0; i < file_names.size(); i++) { cout << "Reading from file: " << file_names[i] << endl; Mat img = imread(file_names[i]); // Check for invalid input if(!img.data) { cout << "Error in displayCluster: Could not open or find the image of pig face." << endl << endl;


return; } else { images.push_back(img); } } // Set up the multi-image window size_t x_max = 3; int x_count, y_count; if(images.size() < x_max) { x_count = images.size(); } else { x_count = x_max; } y_count = (images.size()/x_max) + 1; int dstWidth = images[0].cols * x_count; int dstHeight = images[0].rows * y_count; Mat dst = Mat(dstHeight, dstWidth, CV_8UC3, cv::Scalar(0,0,0)); // Draw text on each image and output to large window for(int i = 0; i < (int) images.size(); i++) { // Only output known weight if it was specified String text; if(weight_known[i] != -1) { text = "Est: " + to_string((long long) weight_est[i]) + " lbs, Known: " + to_string((long long) weight_known[i]) + " lbs"; } else { text = "Est: " + to_string((long long) weight_est[i]) + " lbs"; } putText(images[i],text,Point(images[0].cols/15,images[0].rows*7/8),FONT_HERSHEY_SIMPLEX,5,Scalar(0,200,200), 10); Rect roi(Rect((i%x_count)*images[0].cols,(i/x_count)*images[0].rows,images[0].cols, images[0].rows)); Mat targetROI = dst(roi); images[i].copyTo(targetROI); } // Create window and show image on it namedWindow( "Clustering Final Results", CV_WINDOW_NORMAL ); imshow("Clustering Final Results", dst); waitKey(0); } void PigClassifier::samplePig(PigFace piggie) { // Read data from CSV file on previously clustered pigs ifstream cluster_file; cluster_file.open("cluster_info.csv"); vector<int> labels; vector<int> weights; string line;


if(cluster_file.is_open()) { while(cluster_file.good() ) { // Get the file name getline(cluster_file, line, ','); if(line == "") { break; } labels.push_back(stoi(line)); getline(cluster_file, line); weights.push_back(stoi(line)); } } else { cout << "Failed to open cluster_info.csv" << endl; cout << "Press any key to terminate program." << endl; waitKey(0); return; } // Load the Recognizer model from previously clustered data Ptr<FaceRecognizer> model = createLBPHFaceRecognizer(); model->load("face_rec_model.xml"); // Predict the label of the photograph int predicted_label = -1; double predicted_confidence = -1; model->predict(piggie.getFaceImg(), predicted_label, predicted_confidence); // Open the image being tested Mat img = imread(piggie.file_name); // Output the results for(int i = 0; i < labels.size(); i++) { if(labels[i] == predicted_label) { String text = "Est: " + to_string((long long) weights[i]) + " lbs"; putText(img,text,Point(img.cols/15,img.rows*7/8),FONT_HERSHEY_SIMPLEX,5,Scalar(0,200,200), 10); cout << endl << "The pig most likely weighs " << weights[i] << " lbs" << endl; // Create window and show image on it namedWindow( "Clustering Final Results", CV_WINDOW_NORMAL ); imshow("Clustering Final Results", img); waitKey(0); } } }


PigFace.h

#ifndef PIGFACE_DEF
#define PIGFACE_DEF

#include <opencv2/core/core.hpp>
#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <iostream>
#include <stdio.h>
#include <string>
#include <math.h>
#include "utilities.h"

using namespace std;
using namespace cv;

class PigFace {
public:
    double weight_known;
    double weight_est;
    int ID_known;
    int ID_prediction;
    double cluster_confidence;
    String file_name;
    vector<int> membership;

    PigFace();
    PigFace(Mat face_img);
    PigFace(Mat face_img, int ID_known, String file_name, double weight_known, double weight_est);

    int setIDPrediction(int ID_prediction);
    Mat getFaceImg();

private:
    Mat face_img;
};

bool operator<(const PigFace &face1, const PigFace &face2);

#endif //if not defined


PigFace.cpp

#include "PigFace.h"

// Define all constructors
PigFace::PigFace() {
    this->ID_known = -1;
    this->weight_known = 0;
    this->weight_est = 0;
    this->ID_prediction = -1;
    cluster_confidence = 0;
    file_name = "";
}

PigFace::PigFace(Mat face_img) {
    this->face_img = face_img;
    this->ID_known = -1;
    this->weight_known = 0;
    this->weight_est = 0;
    this->ID_prediction = -1;
    cluster_confidence = 0;
    file_name = "";
}

PigFace::PigFace(Mat face_img, int ID_known, String file_name, double weight_known, double weight_est) {
    this->face_img = face_img;
    this->ID_known = ID_known;
    this->weight_known = weight_known;
    this->weight_est = weight_est;
    this->ID_prediction = -1;
    cluster_confidence = 0;
    this->file_name = file_name;
}

Mat PigFace::getFaceImg() {
    return this->face_img;
}

// Need to define this operator so that the faces can be sorted
bool operator<(const PigFace &face1, const PigFace &face2) {
    if(face1.ID_prediction == face2.ID_prediction) {
        if(face1.weight_est < face2.weight_est) {
            return true;
        } else {
            return false;
        }
    } else {
        if(face1.ID_prediction < face2.ID_prediction) {
            return true;
        } else {
            return false;
        }
    }
}


utilities.h

#ifndef UTILITIES_DEF
#define UTILITIES_DEF

#include <string>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <stdio.h>
#include <stdarg.h>

using namespace cv;

// ################################################################################
// # Timer Functions
// ################################################################################

// Windows
#ifdef _WIN32
#include <Windows.h>

inline double get_wall_time(){
    LARGE_INTEGER time,freq;
    if (!QueryPerformanceFrequency(&freq)){
        // Handle error
        return 0;
    }
    if (!QueryPerformanceCounter(&time)){
        // Handle error
        return 0;
    }
    return (double)time.QuadPart / freq.QuadPart;
}

inline double get_cpu_time(){
    FILETIME a,b,c,d;
    if (GetProcessTimes(GetCurrentProcess(),&a,&b,&c,&d) != 0){
        // Returns total user time.
        // Can be tweaked to include kernel times as well.
        return (double)(d.dwLowDateTime |
            ((unsigned long long)d.dwHighDateTime << 32)) * 0.0000001;
    }else{
        // Handle error
        return 0;
    }
}

// Posix/Linux
#else
#include <sys/time.h>

inline double get_wall_time(){
    struct timeval time;
    if (gettimeofday(&time,NULL)){
        // Handle error
        return 0;
    }
    return (double)time.tv_sec + (double)time.tv_usec * .000001;
}

inline double get_cpu_time(){
    return (double)clock() / CLOCKS_PER_SEC;
}

#endif

#endif //if not defined