Localization of fractures in drill cores using Deep Learning

DEGREE PROJECT IN COMPUTER SCIENCE AND ENGINEERING, SECOND CYCLE, 30 CREDITS

STOCKHOLM, SWEDEN 2021


FELIX MAGNELL

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE


Master’s Programme, Computer Science, 120 credits

Date: July 12, 2021

Supervisors: Haibo Li, Torbjörn Svensson

Examiner: Jonas Beskow

School of Electrical Engineering and Computer Science

Host company: Minalyze

Swedish title: Lokalisering av sprickor i borrkärnor med hjälp av djupinlärning

© 2021 Felix Magnell

Abstract

Investigating and assessing rock structures requires extensive manual work, where geologists measure the fractures in drill cores to get a better understanding of the condition of the underlying rock.

This study investigates the possibility of using Deep Learning to localize fractures in images of drill cores, automating and facilitating the work of geologists. The method described is a combination of object detection and segmentation, used to localize fractures and orientation lines. The study focuses on two main aspects: whether Deep Learning can be used to locate fractures and orientation lines, and whether the data generated by the method can be used to calculate the two angles required to analyze the drill cores, the so-called Alpha and Beta angles.

The results show that fractures can be localized with high precision from a relatively small data set. Angle determination yields poorer results, and further work is required to deal with the discrepancies that arise in the segmentation.

Sammanfattning (Swedish summary)

Investigating and assessing rock structures currently requires extensive manual work, in which geologists examine fractures in drill cores to map the condition of the underlying rock.

This study investigates the possibility of using deep learning to localize and mark fractures in images of drill cores, thereby automating and facilitating the geologists' work. The method presented here is a combination of object detection and image segmentation, which are used to localize fractures and orientation lines. The study focuses on two main aspects: whether deep learning can be used to localize fractures and orientation lines, and whether the data generated by the method can be used to calculate the two angles required to analyze the drill core, the so-called Alpha and Beta angles.

The results show that the position of fractures can be determined from a relatively small data set with comparatively high precision. Angle determination gives poorer results, and further work is required to handle the deviations that arise in the image segmentation.



Contents

1 Introduction
  1.1 Background
  1.2 Problem
  1.3 Purpose
  1.4 Goals
  1.5 Delimitations
  1.6 Structure of the thesis

2 Background
  2.1 Artificial Neural Networks
  2.2 Learning
    2.2.1 Supervised learning
    2.2.2 Unsupervised learning
    2.2.3 Weakly-supervised learning
  2.3 Convolutional neural network
    2.3.1 Convolutional layer
    2.3.2 Dilated Convolutions
    2.3.3 Pooling layer
    2.3.4 Fully-connected layer
  2.4 Computer vision
  2.5 RANSAC
  2.6 Drill cores
    2.6.1 Assessment of drill cores
  2.7 Object detection
    2.7.1 R-CNN
    2.7.2 Fast R-CNN
    2.7.3 Faster R-CNN
  2.8 Image Segmentation
    2.8.1 Fully Convolutional network for segmentation
    2.8.2 Convolutional Models with Graphical models
    2.8.3 Mask R-CNN
  2.9 DeepLab
    2.9.1 DeepLabv1
    2.9.2 DeepLabv2
    2.9.3 DeepLabv3
    2.9.4 DeepLabv3+
  2.10 Data pre-processing
  2.11 Transfer learning
    2.11.1 Feature extraction
    2.11.2 Fine-tuning
  2.12 Ensemble learning
  2.13 Evaluation
  2.14 Related work
    2.14.1 Computer vision applied on drill cores
    2.14.2 Masonry crack detection
    2.14.3 Pavement crack detection

3 Method
  3.1 Research Process
  3.2 Data Collection
    3.2.1 Transformation
    3.2.2 Generation of bounding box data set
    3.2.3 Generation of segmentation data set
    3.2.4 Manual segmentation
  3.3 Model selection
  3.4 Training
    3.4.1 Object detection
    3.4.2 Segmentation
  3.5 Inference/post-processing
    3.5.1 Object detection
    3.5.2 Fracture segmentation model
    3.5.3 Orientation line segmentation model
    3.5.4 Point cloud
    3.5.5 Alpha/Beta angles
  3.6 Hardware/Software to be used
  3.7 Assessing reliability and validity of the data collected
    3.7.1 Validity of method
    3.7.2 Reliability of method

4 Experiments and Results
  4.1 Object detection
    4.1.1 Segmentation
    4.1.2 Alpha/Beta angles

5 Discussion
    5.0.1 Object detection
    5.0.2 Segmentation
    5.0.3 Alpha/Beta angles

6 Conclusions and Future work
  6.1 Conclusions
    6.1.1 Future work
  6.2 Ethical, Sustainable and Social aspects

References



List of Figures

2.1 Illustration of the Alpha/Beta angles
2.2 R-CNN architecture ©[2014] IEEE
2.3 Fast R-CNN architecture ©[2015] IEEE
2.4 Faster R-CNN architecture ©[2017] IEEE

3.1 Research Process
3.2 Drill core box
3.3 Drill core box with orientation lines
3.4 Drill core with centered bounding box around one fracture
3.5 (a) Cropped fracture (b) Segmentation map
3.6 Result of object detection - drill core
3.7 Leftmost pixels of fracture segmentation
3.8 Separated left orientation line
3.9 Leftmost pixels of fracture segmentation and orientation line

4.1 Ground truth boxes
4.2 Prediction boxes
4.3 A subset of predictions on validation set, left to right: ground truth, binary fracture, binary orientation line, multi-class
4.4 Ground truth and final binary segmentation predictions
4.5 Error Alpha/Beta angles - on test set in degrees
4.6 Example of outlier point in fracture edge
4.7 Example of shifting plane with one outlier in the point cloud



List of Tables

3.1 Table of different AP results on Faster R-CNN

4.1 Final network configuration object detection
4.2 Average Precision - on test set
4.3 Averaged IoU over the validation set for the binary models and multi-class model
4.4 Averaged IoU over the validation set for the binary models with augmentation
4.5 Final network configuration segmentation
4.6 Averaged IoU over the test set for the binary models with augmentation
4.7 Alpha/Beta angles computed on one box in the test set with points from our method vs points created by geologists. Measurements in degrees.



Chapter 1

Introduction

1.1 Background

In the mining industry and for large infrastructure projects, information about the properties of the underground rock is essential when planning further exploitation.

The host company, Minalyze, develops hardware and software for the analysis of drill cores. Parts of the assessment of drill cores are still done manually by geologists. This process is time-consuming and varies in quality, since it depends largely on the experience, skill and knowledge of the person performing it.

The purpose of the project is to find a method to automate parts of the analysis process, namely to localize fractures and orientation lines in drill cores, using Deep Learning.

The data set provided by Minalyze consists of high-resolution images of drill cores placed in wooden boxes, divided into portions. Each box holds roughly six meters of cylindrical drill core, which has been cut into lengths of approximately one meter as dictated by the dimensions of the box.

The drill cores are sometimes in poor condition. When they are broken, the pieces are put together as close to the original condition as practically possible. However, parts end up skewed or offset in relation to each other, and shards are often missing in fracture zones. It is also difficult to distinguish a subtle fracture from other shifts in color and structure of the drill core, which is another challenge in achieving a correct and reliable fracture analysis.

Given the aforementioned state of the samples, the ambition of this project has been to develop a method that can support the manual analysis and make the whole process faster, more reliable and more efficient.

1.2 Problem

Minalyze uses two representations of drill core boxes: 3D point clouds and bitmap images. This thesis examines the possibility of using Deep Learning to localize and extract fracture edges and orientation lines in bitmap images of drill cores, and to approximate an intersecting plane in the point cloud representation. The thesis is based on the following two research questions:

1. Is it possible to use Deep Learning to localize drill core fractures and orientation lines in bitmap images?

2. Can the output of the resulting Deep Learning method be used to approximate the plane intersection of fractures in point clouds of drill cores?

1.3 Purpose

The purpose of this study is to explore Deep Learning as a tool for automating parts of the process of assessing drill cores. This is of great interest to the mining industry, as it could contribute to streamlining their analysis tools. Analyzing and locating fractures in drill cores is also valuable in larger infrastructure projects and underground work, where it is important to calculate what type of reinforcement is needed.

1.4 Goals

The goal of this project is to find an adequate method for localizing fractures in drill cores with the help of Deep Learning. This goal has been divided into the following two sub-goals:

1. To improve the analysis tools in the mining industry with automated fracture localization in drill cores.

2. To explore how well the chosen segmentation methods work with irregular data, such as drill core fractures, with few data points.

1.5 Delimitations

The method for generating the ellipse intersecting the drill core is not included in the thesis. We use Minalyze's way of generating an ellipse given points in the 3D domain. It is used strictly to evaluate the fracture edge points generated by the method against points generated by hand.

1.6 Structure of the thesis

The theoretical background and related work are presented in Chapter 2. Chapter 3 goes through the method used to find the edges of fractures and how it is evaluated. Chapter 4 presents the experiments and results of the method from Chapter 3. Chapter 5 discusses the results. Chapter 6 concludes the work and outlines what can be improved in the future.



Chapter 2

Background

2.1 Artificial Neural Networks

An artificial neural network (ANN) is a processing system inspired by biological processes. ANNs try to model human cognition with a network of simple processing units, called neurons, which pass signals between each other to produce a response [1].

An ANN consists of several layers: an input layer, one or more hidden layers and finally an output layer. The architecture is a network of neurons where each neuron is connected to neurons in the next layer, from left to right. There are no loops or connections within a layer in an ANN. Each connection has an associated weight that can be adjusted. During training, the weights are adjusted based on the input and the output the network produces, as measured by a loss function. A commonly used training algorithm is stochastic gradient descent (SGD). The error is propagated back from the output layer, and the gradient of the error with respect to each weight is computed. The weights are then adjusted in the opposite direction of the gradient to reduce the error [2].

Architecture of ANN

• Input layer

• Hidden layer

• Output layer

An ANN is a learning machine that tries to approximate a function $f$ and is defined as a mapping $y = f(x; \theta)$, where $\theta$ denotes the learnable parameters. It has a network structure in which multiple neurons are chained together, collectively learn from the input and adjust themselves to produce a final output. The input, often a vector, is fed to the input layer, which distributes it to a hidden layer. The weights associated with the neurons are updated with respect to the improvement of the output. A network structure with multiple hidden layers is referred to as a Deep Neural Network [3].
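As a minimal sketch of the training procedure described in this section, the following example fits a single linear neuron $y = wx + b$ with stochastic gradient descent on a squared-error loss. The toy data, the learning rate and the function name `train_neuron` are assumptions chosen for illustration and are not taken from the thesis.

```python
import random


def train_neuron(samples, lr=0.05, epochs=200, seed=0):
    """Fit y = w * x + b with SGD on the squared error (illustrative only)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)          # "stochastic": visit the samples in random order
        for x, target in data:
            y = w * x + b          # forward pass
            error = y - target     # derivative of 0.5 * error**2 with respect to y
            w -= lr * error * x    # step against the gradient with respect to w
            b -= lr * error        # step against the gradient with respect to b
    return w, b


# Hypothetical toy data generated from y = 2x + 1
samples = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
w, b = train_neuron(samples)
print(round(w, 2), round(b, 2))
```

In a multi-layer network the same update is applied to every weight, with the gradients obtained by propagating the error backwards through the layers.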

2.2 Learning

2.2.1 Supervised learning

In supervised learning, each input has an associated output target. The goal is to decrease the error, defined as the difference between the output and the target value [3].

2.2.2 Unsupervised learning

Unsupervised learning refers to learning without ground truth targets. It is often used to find patterns in data, for example through clustering [3].

2.2.3 Weakly-supervised learning

Weakly-supervised learning refers to learning a problem for which ground truth labels exist but are not well suited to the problem at hand. One example is having class labels but no localization data for a localization problem. Semantic segmentation tasks, for instance, often rely on expensive annotations by domain experts [4].

2.3 Convolutional neural network

A convolutional neural network (CNN) is a type of deep neural network used for data with a grid-like structure, such as images. The architecture is designed to learn spatial hierarchies of features. CNNs are made of three main building blocks: convolutional layers, pooling layers and fully-connected layers. The first two are used as feature extractors, and the fully-connected layers are used at the end of the model to make predictions [5].

2.3.1 Convolutional layer

The convolutional layer makes use of learnable filters, often referred to as kernels. Each kernel is a matrix with spatial dimensions smaller than the input but with the same depth as the input. A linear operation called convolution is applied to the kernel and the input: the kernel slides across the input (convolves) to generate a response map. Multiple filters are convolved with the input image to generate a volume of response maps, whose depth equals the number of filters. The stacked response maps are then forwarded to the next layer [3].
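The sliding-kernel operation can be sketched in a few lines of plain Python. The example below is a simplified illustration that assumes a single-channel image, stride 1 and no padding. As in most Deep Learning frameworks, the kernel is applied without flipping (cross-correlation). The image and the two edge kernels are made up for the example.

```python
def conv2d(image, kernel):
    """Slide one kernel over a single-channel image (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def conv_layer(image, kernels):
    """Apply several kernels; the output depth equals the number of kernels."""
    return [conv2d(image, kernel) for kernel in kernels]


# Hypothetical 4x4 image with a vertical edge between columns 1 and 2
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
vertical_edge = [[-1, 1], [-1, 1]]
horizontal_edge = [[-1, -1], [1, 1]]
maps = conv_layer(image, [vertical_edge, horizontal_edge])
for response_map in maps:
    print(response_map)
```

The first response map is non-zero only where the window straddles the vertical edge, while the second stays at zero because the image contains no horizontal edge.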

2.3.2 Dilated Convolutions

Dilated convolutions, also known as atrous convolutions, introduce an additional parameter, the dilation rate, to increase the field of view without increasing the computational cost. They insert gaps between the values of the kernel, enabling the receptive field to grow exponentially across layers without losing resolution or coverage [6].
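To make the effect of the dilation rate concrete, the sketch below implements a one-dimensional dilated convolution and computes the receptive field of a stack of such layers. It assumes stride 1 and no padding. The function names and the doubling schedule of dilation rates (1, 2, 4, 8) are illustrative assumptions.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """1D convolution (no kernel flip) with gaps of `dilation` between kernel taps."""
    span = (len(kernel) - 1) * dilation + 1   # input samples covered by one output
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span + 1)
    ]


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 layers with the given dilation rates."""
    field = 1
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


signal = list(range(10))
dense = dilated_conv1d(signal, [1, 1, 1], dilation=1)
dilated = dilated_conv1d(signal, [1, 1, 1], dilation=2)
print(dense)
print(dilated)
print(receptive_field(3, [1, 1, 1, 1]), receptive_field(3, [1, 2, 4, 8]))
```

Both calls use the same three kernel weights, so the number of multiplications per output is unchanged, but with doubling dilation rates the receptive field of four layers grows from 9 to 31 input samples.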

2.3.3 Pooling layer

The response maps represent the input well, capturing low-level features in the early layers and higher-level features in the deeper layers. One limitation of the response maps, however, is that they are sensitive to location, meaning that an object that is rotated or shifted will get a different response map even though it is the same object.

To address this sensitivity to location, a pooling layer is applied. Pooling makes the representation approximately translation invariant, meaning that if the input is shifted slightly the output stays largely the same. This is especially valuable when assessing whether a feature is present in the image, without the need to know its exact location.

Another aspect of the pooling layer is the reduction of parameters. Pooling layers reduce the computational time at the cost of discarding information [3].
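The invariance to small shifts can be illustrated with a minimal max pooling sketch. It assumes non-overlapping 2x2 windows, and the two feature maps are made-up examples in which the activations move by one pixel but stay inside the same pooling windows.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a square window (stride equals size)."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + dr][c * size + dc]
                for dr in range(size) for dc in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


original = [
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 5],
    [0, 0, 0, 0],
]
# Same activations, each shifted by one pixel within its 2x2 window
shifted = [
    [0, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 5],
]
print(max_pool2d(original))
print(max_pool2d(shifted))
```

Both inputs give the same 2x2 output, which is a quarter of the original size. A shift that moves an activation into a neighbouring window would change the output, which is why the invariance is only approximate.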

2.3.4 Fully-connected layer

The CNN ends with one or more fully-connected layers that make the final classification. The feature maps from the previous layers are flattened to a 1D vector and fed into the fully-connected layer, which is essentially a standard ANN [5].

2.4 Computer vision

Computer vision is the construction of meaningful descriptions of objects from images or video [7]. It can be viewed as a sub-area of artificial intelligence where the goal is to model the human visual system in order to interpret, process and describe objects automatically. With the rapid development of Deep Learning, computer vision methods have evolved to solve complex problems by extracting features automatically, although this requires extensive computing power and data [8].

2.5 RANSAC

A common technique for fitting a line to data is the least squares method, which fits the data poorly if there are outliers (noise). The RANSAC (Random Sample Consensus) algorithm is a technique for estimating the parameters of a model from data that contains outliers. RANSAC is an iterative algorithm that takes a random subset of the input data and estimates model parameters from it. The parameters are then evaluated on the full data set by counting the number of data points consistent with the model (the support). The model with the highest support is chosen as output. A disadvantage of the method is that it is computationally demanding [9].
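A minimal sketch of RANSAC for fitting a 2D line is given below. It is an illustration of the general idea only, not the implementation used in this thesis. The number of iterations, the inlier threshold, the use of the vertical residual and the synthetic data are all assumptions.

```python
import random


def fit_line(p, q):
    """Return (slope, intercept) of the line through two points, or None if vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def ransac_line(points, iterations=200, threshold=0.5, seed=0):
    """Keep the two-point line hypothesis that is supported by the most points."""
    rng = random.Random(seed)
    best_model, best_support = None, -1
    for _ in range(iterations):
        model = fit_line(*rng.sample(points, 2))   # minimal random subset
        if model is None:
            continue
        slope, intercept = model
        # Support: points whose vertical residual is within the threshold
        support = sum(abs(y - (slope * x + intercept)) <= threshold
                      for x, y in points)
        if support > best_support:
            best_model, best_support = model, support
    return best_model, best_support


# Hypothetical data: 20 points on y = 2x + 1 plus three gross outliers
points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
points += [(3.0, 40.0), (10.0, -25.0), (15.0, 90.0)]
model, support = ransac_line(points)
print(model, support)
```

A least squares fit on the same points would be pulled towards the three outliers, whereas the hypothesis with the highest support ignores them. The cost is that every hypothesis has to be scored against the full data set.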

2.6 Drill cores

A core sample is a cylindrical section of sub-surface material, extracted for observation and logging. Various properties are examined, such as chemical assays, veins, beddings and fractures. Minalyze has developed a drill core scanner for chemistry, structural data, density and photography. Together with the scanner, they provide a software tool for examining the scanned drill cores, which can be viewed both in 3D as a point cloud and in 2D as bitmap images.

Fractures in drill cores are an important property for the company to analyze, as they can show how the rock structure is arranged, which can provide information about how to design reinforcement before potential mining. By inspecting the angle of a fracture with respect to the direction of the drill core sample, geologists learn how fractures in the actual rock are aligned. Today, geologists use either mechanical instruments, such as a goniometer, or digital tools to plot points along the fractures by hand.

Figure 2.1 – Illustration of the Alpha/Beta angles

The intersection of a plane and a cylindrical drill core forms an ellipse, where the points of maximum curvature define what is called the long axis. Two angles, Alpha and Beta, are measured so that the strike and dip can be calculated. The Alpha angle is the angle between the core axis and the long axis. The Beta angle is the angle around the circumference of the core between the lowest point on the long axis and the orientation line [10] [11].
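As a hedged illustration of these definitions, the sketch below computes Alpha and Beta from the normal vector of a fracture plane. Several conventions are assumed here and are not taken from the thesis or from Minalyze's software: the core axis is the z-axis with +z pointing downhole, the "lowest point" is the most downhole point of the ellipse, the plane is not parallel to the core axis, and Beta increases counter-clockwise in the x-y plane starting from the orientation line.

```python
import math


def alpha_beta(normal, orientation_deg=0.0):
    """Alpha and Beta in degrees for a fracture plane cutting a core along +z.

    Assumed conventions (illustrative only): +z is downhole, `normal` is the
    plane normal, the orientation line sits at `orientation_deg` around the
    core, and Beta grows counter-clockwise in the x-y plane.
    """
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    # The long axis is the projection of the core axis onto the plane, so Alpha
    # equals the angle between the core axis and the plane itself.
    alpha = math.degrees(math.asin(abs(nz) / length))
    if nx == 0 and ny == 0:
        return alpha, None           # perpendicular cut: Beta is undefined
    # On the plane, z = -(nx * x + ny * y) / nz, so the most downhole point of
    # the ellipse lies in the direction -(nx, ny) * sign(nz).
    sign = 1.0 if nz >= 0 else -1.0
    lowest = math.degrees(math.atan2(-ny * sign, -nx * sign))
    beta = (lowest - orientation_deg) % 360.0
    return alpha, beta


print(alpha_beta((0.0, 0.0, 1.0)))   # fracture perpendicular to the core axis
print(alpha_beta((1.0, 0.0, 1.0)))   # fracture tilted 45 degrees towards +x
```

Other logging conventions measure Beta clockwise when looking downhole, so the sign convention should be checked against the tool in use before comparing values.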

2.6.1 Assessment of drill cores

For large infrastructure projects that include underground work such as tunnels, or buildings on top of bedrock, it is important to establish a forecast of the rock conditions. The forecast of the rock quality is based on geophysical measurements such as core drilling. Straahle [12] proposes that a mapping of discontinuities should be performed to obtain as much quantitative data as possible according to the following parameters:

• Location of fractures along the core

• Angle of the fracture with respect to the core

• Roughness of the fracture

• The distance between intersecting fractures caused by foliation or other regularly intersecting fractures

• Fracture opening ("Aperture"): how well the two core pieces on each side of a fracture can be fitted together

[12][13]

2.7 Object detection

Object detection is the problem of enclosing objects in bounding boxes, often defined by $b = (b_x, b_y, b_w, b_h)$, where $b_x$ and $b_y$ give the center point of the bounding box and $b_w$ and $b_h$ are its width and height. Tasks associated with object detection include face detection, lane detection for self-driving cars, and tumour detection in MRA images. Historically, object detection has mainly been solved with two methods: manually engineered feature extractors or a sliding window approach [14] [15].

A large improvement in solving the object detection problem came when deep neural networks, particularly CNNs, could be utilized efficiently for the task. The gain of using CNNs over traditional methods can be attributed to the deep architectures, which are able to capture complex features without the need for manually engineered feature descriptors. In 2014, Regions with CNN features (R-CNN) was proposed, which beat all other methods at the time by a large margin. Since R-CNN, new and improved object detection networks have been proposed, such as Fast R-CNN, Faster R-CNN and YOLO [16][15].
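The center-based parameterization $b = (b_x, b_y, b_w, b_h)$ used in this section is often converted to corner coordinates, for example when cropping a detected object from an image. The helper functions below are a small illustrative sketch. Their names and the example box are assumptions.

```python
def center_to_corners(box):
    """Convert (bx, by, bw, bh) to (x_min, y_min, x_max, y_max)."""
    bx, by, bw, bh = box
    return (bx - bw / 2, by - bh / 2, bx + bw / 2, by + bh / 2)


def corners_to_center(box):
    """Convert (x_min, y_min, x_max, y_max) back to (bx, by, bw, bh)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2, x_max - x_min, y_max - y_min)


box = (50.0, 40.0, 20.0, 10.0)       # hypothetical box centered at (50, 40)
corners = center_to_corners(box)
print(corners)
print(corners_to_center(corners))
```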

2.7.1 R-CNN

Figure 2.2 – R-CNN architecture ©[2014] IEEE

R-CNN is a three-step object detection method. The three steps are region proposal with selective search, feature extraction using a CNN, and finally classification using an SVM. The different parts are independent, meaning they are trained separately. As mentioned in Section 2.7, object detection was previously often performed with a brute-force sliding window approach, in which a window was moved iteratively over the image and each position was classified. As objects can appear at multiple scales, this strategy can be very computationally expensive. R-CNN's first step tries to solve this by doing a coarse search for regions of interest (RoIs) to generate meaningful bounding box proposals, as shown in Figure 2.2. The region proposal algorithm used in R-CNN is called selective search, and aims to generate bounding boxes close to the ground truth with an emphasis on high recall. Selective search works by generating an initial segmentation map of the input image based on pixel intensity and then generating bounding box proposals in an iterative process of three steps:

1. Generate bounding boxes for the segments in the segmentation map.

2. Store the bounding boxes as proposals.

3. Combine segments based on similarity.

This process generates bounding boxes of different sizes: it starts with many small segments for different objects, which are then combined into bigger segments.
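The three steps above can be sketched as a toy grouping loop. This is a strongly simplified illustration and not the selective search algorithm itself: it assumes that the initial segments are already given, it measures similarity only by mean intensity, and it ignores whether two segments are adjacent. All names and the example segments are hypothetical.

```python
def bounding_box(pixels):
    """Smallest box (row_min, col_min, row_max, col_max) containing all pixels."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return (min(rows), min(cols), max(rows), max(cols))


def propose_boxes(segments):
    """Toy grouping; `segments` is a list of (pixel_set, mean_intensity) pairs."""
    regions = [(set(pixels), float(mean)) for pixels, mean in segments]
    proposals = []
    while True:
        # Steps 1 and 2: compute a box for every current segment and store it
        for pixels, _ in regions:
            box = bounding_box(pixels)
            if box not in proposals:
                proposals.append(box)
        if len(regions) == 1:
            break
        # Step 3: merge the two segments with the most similar mean intensity
        pairs = [(a, b) for a in range(len(regions))
                 for b in range(a + 1, len(regions))]
        a, b = min(pairs, key=lambda p: abs(regions[p[0]][1] - regions[p[1]][1]))
        (pa, ma), (pb, mb) = regions[a], regions[b]
        merged_mean = (ma * len(pa) + mb * len(pb)) / (len(pa) + len(pb))
        regions = [r for k, r in enumerate(regions) if k not in (a, b)]
        regions.append((pa | pb, merged_mean))
    return proposals


# Hypothetical initial segmentation: two similar dark segments and one bright one
segments = [
    ({(0, 0), (0, 1)}, 10),
    ({(0, 2), (0, 3)}, 12),
    ({(3, 3)}, 200),
]
boxes = propose_boxes(segments)
print(boxes)
```

The list of proposals starts with the small boxes of the initial segments and ends with a box covering everything, which mirrors how the process produces candidates at several scales.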

The feature extractor is a CNN, pretrained on a classification task in a supervised manner. The R-CNN paper used AlexNet, a state-of-the-art network at the time. The final softmax layer is cut off so that the network outputs feature vectors.

The final step in R-CNN is the classification of the feature vectors, which is done by multiple SVMs, one for each class.

R-CNN achieved a mAP score of 53% on the VOC2012 data set, a relative improvement of more than 30% over the previous best result [16].

2.7.2 Fast R-CNN

Figure 2.3 – Fast R-CNN architecture ©[2015] IEEE

Fast R-CNN builds upon R-CNN with several improvements to both speed and accuracy. Instead of a separate feature extractor and classifier, Fast R-CNN has an end-to-end learning architecture, taking advantage of feature sharing and removing the need to store feature maps on disk. Instead of passing each region of interest (RoI) as a separate image as in R-CNN, Fast R-CNN takes the full image as input together with the RoIs, each given as a four-tuple $(r, c, h, w)$, where $(r, c)$ is the top-left corner and $(h, w)$ is the height and width of the RoI. As the RoIs have different sizes and the subsequent layers require fixed-size feature maps, RoI pooling is used. RoI pooling divides each RoI into a fixed number of equally sized sections and then applies max pooling to each section to get the final feature maps [17].
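The pooling step can be sketched as follows for a single-channel feature map, with each RoI given as $(r, c, h, w)$ as above. The integer rounding of the bin edges and the 2x2 output size are assumptions made for this illustration and may differ from the original implementation.

```python
def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool the RoI (r, c, h, w) of a 2D feature map to an out_h x out_w grid."""
    r, c, h, w = roi
    pooled = []
    for i in range(out_h):
        # Integer bin edges along the rows; every bin covers at least one row
        r0 = r + (i * h) // out_h
        r1 = max(r + ((i + 1) * h) // out_h, r0 + 1)
        row = []
        for j in range(out_w):
            c0 = c + (j * w) // out_w
            c1 = max(c + ((j + 1) * w) // out_w, c0 + 1)
            row.append(max(feature_map[y][x]
                           for y in range(r0, r1) for x in range(c0, c1)))
        pooled.append(row)
    return pooled


# Hypothetical 4x6 feature map with values 1..24
feature_map = [[6 * row + col + 1 for col in range(6)] for row in range(4)]
print(roi_max_pool(feature_map, (0, 0, 4, 6)))   # RoI covering the whole map
print(roi_max_pool(feature_map, (1, 1, 3, 2)))   # smaller RoI, same output size
```

Both RoIs are reduced to a 2x2 grid even though they differ in size, which is what allows the following fully-connected layers to work with a fixed input size.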

2.7.3 Faster R-CNN

Figure 2.4 – Faster R-CNN architecture © [2017] IEEE

Fast R-CNN showed a great increase in speed for object detection. However, it also exposed the region proposal step as a bottleneck for increasing the speed further. Faster R-CNN solves this by introducing a Region Proposal Network (RPN). Unlike in Fast R-CNN, the region proposal step is now integrated with the rest of the network, which enables feature sharing between all stages during training in an end-to-end architecture [18]. The RPN uses so-called anchor boxes, which are predefined bounding boxes of different sizes. It generates RoIs by sliding a window over the feature map and, at each position, outputting k bounding boxes, each with a score indicating how likely it is to contain an object [19].
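The anchor boxes at one sliding-window position can be generated as in the sketch below. The three scales and three aspect ratios, giving k = 9 anchors, follow the configuration commonly cited for Faster R-CNN, but they should be read as assumptions here, as should the function name and the example position.

```python
import math


def make_anchors(center_x, center_y, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return k = len(scales) * len(ratios) anchors as (x_min, y_min, x_max, y_max)."""
    anchors = []
    for scale in scales:
        for ratio in ratios:                      # ratio = height / width
            width = scale / math.sqrt(ratio)
            height = scale * math.sqrt(ratio)     # the area stays scale ** 2
            anchors.append((center_x - width / 2, center_y - height / 2,
                            center_x + width / 2, center_y + height / 2))
    return anchors


anchors = make_anchors(300.0, 200.0)
print(len(anchors))
for anchor in anchors[:3]:
    print(tuple(round(value, 1) for value in anchor))
```

The RPN then predicts, for each of these k anchors, an objectness score and offsets that adjust the anchor towards the object.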

2.8 Image Segmentation

Image segmentation is a method for partitioning images into specific regions based on objects. It can be formulated as a classification problem where each pixel in an image is classified. Historically, various algorithms have been used, such as k-means, region growing, graph cuts and Markov random fields. The development has been rapid in recent years with the introduction of Deep Learning. Some popular network architectures include CNN, RNN, LSTM and GAN [19].

2.8.1 Fully Convolutional network for segmentation

One of the first Deep Learning models for image segmentation was proposed by Long et al. [20]. They used a fully convolutional network (FCN), which only contains convolutional layers, essentially replacing the fully-connected layers of a CNN with convolutional layers to be able to output a segmentation map. Some limitations associated with the FCN are that it is slow for real-time applications and that it does not take information about the global context into account [19].

2.8.2 Convolutional Models with Graphical models

To try to overcome the limitations of FCNs, various attempts to incorporate more context have been made. Chen et al. [21] combined a CNN with a Conditional Random Field (CRF) to achieve higher accuracy than previous methods, and also showed which properties make CNNs insufficient for accurate segmentation [19].

2.8.3 Mask R-CNN

Mask R-CNN extends Faster R-CNN by adding a fully convolutional network branch that generates a mask output. This modification means there are three output branches: bounding boxes, classes and segmentation masks, which are trained in parallel with a shared loss function [19].

2.9 DeepLab

DeepLab is a family of open-source semantic segmentation architectures developed by Google and considered state of the art. DeepLab has been improved continuously from its initial version, DeepLabv1, through its fourth version, DeepLabv3+. Challenges associated with deep convolutional neural networks (DCNNs) for image segmentation are their spatial insensitivity and downsampling. These two properties are a problem when classifying at pixel level, because the localization needs to be more precise than for image classification. The development of the DeepLab models has tried to address these problems through multiple versions [21][22] [23] [24].

2.9.1 DeepLabv1

The authors of DeepLabv1 identified two problems when using DCNNs for segmentation tasks. The first is the reduction of resolution caused by max pooling and by the downsampling effect of striding in the convolutional layers. The second is the built-in spatial invariance, which has a negative effect on the spatial accuracy that is important in a segmentation task. To overcome these problems, the authors proposed atrous convolutions and Conditional Random Fields. These improvements yielded a new record accuracy on the PASCAL VOC 2012 data set at the time, reaching 71.6% IoU on the test set [21].

2.9.2 DeepLabv2

In order to identify objects at different scales, Atrous Spatial Pyramid Pooling (ASPP) was introduced in DeepLabv2. ASPP consists of several atrous convolutions with different sampling rates, applied in parallel and combined into a feature map. DeepLabv2 reached a mIoU of 79.7% on the PASCAL VOC 2012 test set [22].

2.9.3 DeepLabv3

To further improve the ability to segment objects at multiple scales, the authors included global average pooling on the last feature map of the model, together with batch normalization. The updated model reached 85.7% mIoU on the PASCAL VOC 2012 test set, without using the CRF module from previous versions [23].

2.9.4 DeepLabv3+

DeepLabv3+ has an encoder-decoder structure where DeepLabv3 is used as the encoder module. A decoder is added to refine the segmentation results, especially around object boundaries. The authors also incorporated dilated depthwise separable convolutions instead of max pooling. The resulting model achieves 89.0% mIoU on the PASCAL VOC 2012 test set [24].

2.10 Data pre-processingData preprocessing is an important step to increase the performance ofautomatic learning. It reduces the e�ects such as noise and outliers that maydestroy the learning process. One popular approach in data pre processingfor CNNs is data augmentation, which increases the number of trainingexamples by transforming the data points. Data augmentation in imagesoften includes rotation, cropping and zooming. With a sparse data set, dataaugmentation can decrease the risk of overfitting and lessen the sensitivityto noise. Augmentation also makes the learning algorithm more invariant totranslations.

There are several methods for data augmentation of images, some of which are listed below:

• Translation: Moving the image in some direction. This improves invariance, as most objects can be located anywhere in the image.

• Stretch: Stretching the image horizontally or vertically by some factor.

• Random crop: Cut the image by a predefined number of pixels.

• Rotation: The image is rotated by some angle between 0° and 360° around the center of the image.

• Gaussian blur: Smears out the pixels by reducing the difference between neighboring pixel values.

• Gaussian noise: Add random noise to the image.

• Salt and pepper: Randomly color pixels black (0) or white (255). The noise can also cover areas of multiple pixels.

• Histogram equalization: Adjusting the contrast in the image.

• Elastic transformation: The image is moved in random directions whilst preserving the topology. Can be seen as an imitation of a human rewriting a digit on paper.

[25] [26]
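A few of the augmentations listed above can be sketched in plain Python on images stored as nested lists. This is an illustration only: the helper names, the 50% flip probability and the tiny example image are assumptions, and a real pipeline would normally rely on an image library. For segmentation, the same geometric transform has to be applied to the image and to its mask, which is what `augment` does.

```python
import random


def horizontal_flip(image):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Reverse the order of the rows (top-bottom flip)."""
    return image[::-1]


def salt_and_pepper(image, amount, rng):
    """Overwrite up to a fraction `amount` of the pixels with 0 or 255."""
    noisy = [row[:] for row in image]  # copy so the input is left untouched
    height, width = len(noisy), len(noisy[0])
    for _ in range(int(amount * height * width)):
        r, c = rng.randrange(height), rng.randrange(width)
        noisy[r][c] = rng.choice((0, 255))
    return noisy


def augment(image, mask, rng):
    """Apply the same random flips to an image and its segmentation mask."""
    if rng.random() < 0.5:
        image, mask = horizontal_flip(image), horizontal_flip(mask)
    if rng.random() < 0.5:
        image, mask = vertical_flip(image), vertical_flip(mask)
    return image, mask


rng = random.Random(0)  # fixed seed so the example is repeatable
image = [[10, 20, 30], [40, 50, 60]]
mask = [[0, 1, 1], [0, 0, 1]]
aug_image, aug_mask = augment(image, mask, rng)
noisy = salt_and_pepper(image, amount=0.5, rng=rng)
```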

2.11 Transfer learning

A problem with Deep Learning is data dependency. Deep Learning algorithms often depend on large amounts of data to be able to automatically learn the latent representations of the data. Furthermore, data acquisition is hard and expensive: gathering the data and then labeling it, in some cases by a domain expert, can be very costly [27].

Transfer learning is a method for transferring knowledge from one domain to another, utilizing an already trained network in a new setting to overcome the problem of insufficient training data. It also makes it possible to use very large networks without having to train the whole network, which can be very computationally intensive [27].

There are mainly two methods of applying transfer learning: feature extraction and fine-tuning.

2.11.1 Feature extraction

This method is used for extracting features from a previously trained network. This is done by pruning the network, often just before the fully connected layer, to end up with activation maps that are handled as feature vectors for a new problem by attaching the activation maps to a new fully connected layer [27] [28].

2.11.2 Fine-tuning

A model that has been pre-trained on a large data set, often a state-of-the-art architecture such as VGG, ResNet, or Inception, is used as a base for the new problem. The last layer of the pre-trained model is pruned and replaced with a new fully connected layer that fits the new problem. The weights of the network can then act as a starting point for the new problem and be fine-tuned by training on the new data. Another approach is to freeze the early layers and only fine-tune the last layers to make them more relevant for the new problem [27] [28].

2.12 Ensemble learning

The idea of ensemble learning is to have multiple learners contributing to the final classification(s). For the ensemble to be meaningful, the individual learners should learn different things and therefore yield different decision boundaries. Ensemble learning can be used to increase performance, or it can be used for model selection [29].

Pairwise stacking is an ensemble technique where a multi-class problem is converted into multiple binary classifiers, reducing the complexity by making the decision boundary simpler and increasing the accuracy [30]. A paper on image segmentation of brain tumors [31] utilizes a variant of this idea on a multi-class segmentation task, by using one classifier for each class, removing the competition between classes.

2.13 Evaluation

There are various ways to evaluate segmentation and object detection models. Some of the most widely used methods are presented here.

Pixel accuracy

Pixel accuracy is a metric for comparing the predicted segmentation masks with ground truth labels per pixel. It is a simple technique but can be misleading, as it is often biased towards the majority class [32].

Intersection over union

Intersection over union (IoU), also called the Jaccard index, is often used to evaluate segmentation models. It is the ratio between the intersection and the union of two sets. IoU ranges between 0.0 and 1.0, where 1.0 indicates perfect overlap between prediction and ground truth and 0.0 means no overlap [32].

$$\mathrm{IoU}(Y_{true}, Y_{prediction}) = \frac{|Y_{true} \cap Y_{prediction}|}{|Y_{true} \cup Y_{prediction}|}$$

It can equivalently be formulated as follows, where TP = true positives, FP = false positives and FN = false negatives:

$$\mathrm{IoU}(Y_{true}, Y_{prediction}) = \frac{TP}{TP + FN + FP}$$

A threshold is used to determine at what certainty a prediction should be considered valid [32].
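The pixel-wise form of the IoU can be computed directly from two binary masks. The sketch below works on flattened masks given as lists of 0/1 values. The function names are illustrative, and returning 1.0 when both masks are empty is a convention assumed here, not something stated in the sources.

```python
def confusion_counts(truth, prediction):
    """Count TP, FP and FN over two flat binary masks of equal length."""
    tp = sum(1 for t, p in zip(truth, prediction) if t and p)
    fp = sum(1 for t, p in zip(truth, prediction) if not t and p)
    fn = sum(1 for t, p in zip(truth, prediction) if t and not p)
    return tp, fp, fn


def iou(truth, prediction):
    """IoU = TP / (TP + FP + FN); assumed to be 1.0 when both masks are empty."""
    tp, fp, fn = confusion_counts(truth, prediction)
    union = tp + fp + fn
    return tp / union if union else 1.0


truth = [1, 1, 1, 0, 0, 0]
prediction = [0, 1, 1, 1, 0, 0]
score = iou(truth, prediction)  # TP = 2, FP = 1, FN = 1, so 2 / 4 = 0.5
```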

Precision and Recall

Precision and recall are two useful measurements for evaluating a classifier; recall is also known as sensitivity. Their values range between 0.0 and 1.0 [29].

Precision is the ratio of correct positive predictions to all predictions classified as positive [29].

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

The precision formula punishes a classifier for false positives: when it makes positive predictions that are incorrect, precision gets lower.

Recall is the ratio of correct positive predictions to all examples that are actually positive [29].

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

The recall formula punishes a classifier with many false negatives. In simple terms, to get high recall we want to make a positive prediction even at the risk of being wrong, so that as many positive examples as possible are found.

Precision and recall are typically inversely related: when one increases, the other tends to decrease. Given these characteristics, which of the two to prioritize depends on the type of problem [29].
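The trade-off can be illustrated by evaluating the same set of scored predictions at two confidence thresholds. The scores and labels below are made up for illustration, and the function names are assumptions. Lowering the threshold finds more of the positives (higher recall) at the cost of more false positives (lower precision).

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def evaluate(scored, threshold):
    """Treat every prediction with score >= threshold as positive."""
    tp = sum(1 for s, y in scored if s >= threshold and y == 1)
    fp = sum(1 for s, y in scored if s >= threshold and y == 0)
    fn = sum(1 for s, y in scored if s < threshold and y == 1)
    return precision_recall(tp, fp, fn)


# Hypothetical (confidence score, true label) pairs; 1 means "fracture".
scored = [(0.9, 1), (0.8, 1), (0.6, 0), (0.4, 1), (0.2, 0)]

strict = evaluate(scored, threshold=0.7)   # precision 1.0, recall 2/3
lenient = evaluate(scored, threshold=0.3)  # precision 0.75, recall 1.0
```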

Average precision

Average precision is a standard detection evaluation metric used in the COCO challenge. It reports the precision averaged across all recall values, for IoU thresholds that are either fixed or taken from a predetermined interval. AP50 means that the IoU threshold is set to 50%. AP, or AP@[.5:.95], means that the result is averaged over IoU thresholds ranging from 50% to 95% with a step size of 5% [32].
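As a small illustration of the AP@[.5:.95] notation, the sketch below lists the ten IoU thresholds from 0.50 to 0.95 in steps of 0.05 used by COCO and averages one AP value per threshold. The per-threshold AP values are placeholders invented for the example, and the function names are assumptions.

```python
def coco_iou_thresholds():
    """IoU thresholds 0.50, 0.55, ..., 0.95 used for AP@[.5:.95]."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def average_over_thresholds(ap_values):
    """AP@[.5:.95] is the mean of the AP values at the individual thresholds."""
    return sum(ap_values) / len(ap_values)


thresholds = coco_iou_thresholds()
# Placeholder AP values that simply decrease as the IoU threshold gets stricter.
ap_values = [1.0 - t for t in thresholds]
coco_ap = average_over_thresholds(ap_values)
```

AP50 and AP75 correspond to the single entries of such a list at thresholds 0.50 and 0.75.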

2.14 Related work

2.14.1 Computer vision applied on drill cores

The mineral extraction industry is increasingly utilizing Deep Learning for automatic feature extraction. A recent paper [33] suggests using a CNN for classifying three different lithologies: sandstone, limestone, and shale. The architecture is based on ResNeXt-50 and yields an accuracy of 93.12% [33].

2.14.2 Masonry crack detection

Inspection of masonry structures for cracks and other signs of decay is often done visually, a costly method with subjective elements. Chaiyasarn et al. [34] propose a crack detection system using a CNN for feature extraction and a Support Vector Machine (SVM) for classification. The authors gathered a data set consisting of 6002 patches of cracks/non-cracks. To localize cracks, the authors divided their sample images into a grid-like structure and classified each grid cell as crack/non-crack. The accuracy on their test set was measured at 74.9%. The authors concluded that a CNN can successfully be used for automatic crack detection in masonry structures. However, they also concluded that their data set is small, and that the results would likely improve by utilizing Transfer Learning and augmenting the data. Furthermore, they state that crack detection in masonry structures is a difficult problem, as the cracks can easily be mistaken for grout lines [34].

2.14.3 Pavement crack detection

Pavement crack detection is researched to increase road safety. Scalona et al. [35] highlight some of the challenges associated with detecting cracks in asphalt. These challenges include the inhomogeneous nature of cracks, the complexity of the background, and low contrast in comparison with the surrounding pavement [35]. The paper [36] proposes using an SSD network for object detection and then applying a U-net for segmenting the pavement cracks [36].


Chapter 3

Method

3.1 Research Process

This report deals with two research questions: a localization problem and a problem dealing with fracture angles. An overview of the research process is shown in Figure 3.1.

Figure 3.1 – Research Process

3.2 Data Collection

The data set provided by Minalyze includes images and point clouds of the drill cores and a text file with points outlining the fractures in the drill cores. These points will be referred to as fracture points. The total number of fractures in the data set is 1200, divided amongst 10 boreholes. Each borehole has a varying number of drill core boxes, between three and eight. The dimensions of the drill core box images are 10138x4540 pixels. The data set is confidential, apart from a few images of other boreholes that are used as examples. Two different drill core boxes can be seen in Figures 3.2 and 3.3.

Figure 3.2 – Drill core box

Figure 3.3 – Drill core box with orientation lines

3.2.1 Transformation

The fracture points that outline the fractures were created in the point cloud, which does not map directly to the bitmap images. Therefore, an affine transformation matrix was estimated from three arbitrary points in the point cloud and their approximate equivalents in the bitmap images. This matrix was used to map points from the point cloud to the bitmap images, and its inverse was used to map in the opposite direction.
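Three non-collinear point pairs determine a 2D affine transformation exactly, since the six unknowns in x' = a·x + b·y + c and y' = d·x + e·y + f are matched by six equations. The sketch below solves these two 3x3 systems with Cramer's rule in plain Python. It is a minimal illustration under assumed names and made-up coordinates, not the code used in the thesis. The inverse mapping is obtained here by simply fitting the transformation in the opposite direction.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("the three source points are collinear")
    solution = []
    for col in range(3):
        replaced = [row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(m)]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(src, dst):
    """Return the 2x3 matrix [[a, b, c], [d, e, f]] that maps src onto dst."""
    m = [[x, y, 1.0] for x, y in src]
    row_x = solve3(m, [x for x, _ in dst])
    row_y = solve3(m, [y for _, y in dst])
    return [row_x, row_y]


def apply_affine(matrix, point):
    """Map a single (x, y) point with a 2x3 affine matrix."""
    x, y = point
    return tuple(a * x + b * y + c for a, b, c in matrix)


# Hypothetical correspondences: point cloud (x, y) -> image pixel (x, y).
cloud_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
image_pts = [(10.0, 20.0), (12.0, 20.0), (10.0, 23.0)]

cloud_to_image = fit_affine(cloud_pts, image_pts)
image_to_cloud = fit_affine(image_pts, cloud_pts)  # inverse direction

pixel = apply_affine(cloud_to_image, (2.0, 2.0))
back = apply_affine(image_to_cloud, pixel)
```

Because only three correspondences are used, any inaccuracy in picking them is carried directly into the mapping; a least-squares fit over more pairs would be one way to reduce that sensitivity.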

3.2.2 Generation of bounding box data set

The points outlining each fracture were enclosed by a bounding box to get a data set of localized fractures. The annotated points did not always extend from the top to the bottom of the fracture; therefore the bounding boxes were corrected in LabelImg under the supervision of representatives at Minalyze.

3.2.3 Generation of segmentation data set

As the bounding boxes have varying dimensions, an image of each outlined fracture was generated around the center point of its bounding box, with a fixed height and width of 600x600 pixels, as can be seen in Figure 3.4. The dimensions were chosen based on the trade-off between GPU resources and information loss. The images of the drill cores were cropped by these new bounding boxes of fixed size.
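The two steps, enclosing the annotated points in a bounding box and then placing a fixed 600x600 window on its center, can be sketched as follows. The function names and example coordinates are illustrative. Shifting the window so that it stays inside the image is an assumption made for this sketch, as the text does not describe how crops near the image border were handled.

```python
def bounding_box(points):
    """Axis-aligned box (x_min, y_min, x_max, y_max) around (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def centered_crop_box(box, size, image_width, image_height):
    """Square window of side `size` centered on `box`.

    Assumption: the window is shifted, not shrunk, if it would leave the image.
    """
    x_min, y_min, x_max, y_max = box
    cx, cy = (x_min + x_max) // 2, (y_min + y_max) // 2
    left = min(max(cx - size // 2, 0), image_width - size)
    top = min(max(cy - size // 2, 0), image_height - size)
    return left, top, left + size, top + size


# Hypothetical fracture points in pixel coordinates of a 10138x4540 image.
fracture_points = [(1000, 2000), (1100, 2300), (1050, 2150)]
box = bounding_box(fracture_points)
crop = centered_crop_box(box, size=600, image_width=10138, image_height=4540)
```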

Figure 3.4 – Drill core with centered bounding box around one fracture

3.2.4 Manual segmentation

Segmentation maps were generated by hand, with directives from Minalyze on how a fracture should be defined. A fracture was defined as all the pixels in between the two edges of a broken-off part. The orientation line was also outlined per pixel, in a different color. Three classes were outlined: fracture as green (0,255,0), orientation line as red (255,0,0) and background as blue (0,0,255). The segmentation maps were examined and approved as accurate representations by representatives at Minalyze. A total of 230 images were annotated. Figure 3.5 shows a cropped fracture image with its annotated segmentation map.


Figure 3.5 – (a) Cropped fracture (b) Segmentation map

3.3 Model selection

The data set consists of 54 images of drill core boxes with 1200 bounding box instances and 230 segmentation masks. Given the uneven distribution of data between bounding boxes and segmentation masks, two independent models were implemented for the object detection and segmentation tasks, instead of using Mask R-CNN or a similar end-to-end solution. To see if the complexity of the segmentation problem could be further decreased, I tried both a multi-class model and two binary models, obtained by splitting the classes into two separate binary problems, as this has proven beneficial for segmentation problems with few data points [31]. When choosing a model, there is often a trade-off between speed and accuracy. As accuracy is preferred over training and inference speed, I chose state-of-the-art models for the respective problems regardless of speed: Faster R-CNN with pre-trained weights for the object detection task and DeepLabV3+ for the segmentation task.

Architecture   Backbone   box AP   # Parameters
Faster R-CNN   R50-C4     38.4     25M
Faster R-CNN   R50-DC5    39.0     25M
Faster R-CNN   R50-FPN    40.2     25M
Faster R-CNN   R101-C4    41.1     45M
Faster R-CNN   R101-DC5   40.6     45M
Faster R-CNN   R101-FPN   42.0     45M
Faster R-CNN   X101-FPN   43.0     88M

Table 3.1 – AP results for different Faster R-CNN backbones

3.4 Training

3.4.1 Object detection

The object detection is implemented using the Detectron2 library [37] developed by Facebook. Detectron2 is built on top of PyTorch and has a model zoo with different pre-trained backbones as well as many object detection architectures.

The drill core images were downsampled to a height of 500 pixels with preserved aspect ratio, which was chosen as a trade-off between information loss and the capacity of our GPU. A batch size of 2 was chosen as it was the maximum that could fit in the GPU. The data set was divided into a train/validation/test split of 80/10/10. Detectron2 uses iterations instead of epochs to track progress. The relation between the two is determined by the batch size and the number of GPUs, in our case one GPU. The number of iterations in one epoch and the resulting number of epochs are:

$$\mathrm{Epoch} = \frac{\mathrm{number\_of\_images}}{\mathrm{batch\_size}}$$

$$\mathrm{Epochs} = \frac{\mathrm{Iterations}}{\mathrm{Epoch}}$$

Two different backbones were chosen based on AP results and complexity, measured as number of parameters: R50-FPN and ResNeXt-101-32x8d (X101-FPN), as seen in Table 3.1. The learning rate and the number of warm-up iterations were determined experimentally; the other parameters were left at their defaults. Each experiment was run for 3000 iterations, with 1000 warm-up iterations during which the learning rate increased linearly from 0 up to the base learning rate. The backbone with the lowest loss (R50-FPN) was used in the final model. The weights of the final model were stored when the validation loss converged at 1700 iterations. Average precision was reported for the final model with the stored weights, as can be seen in Table 4.2.
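The relation between iterations and epochs can be made concrete with a short calculation. The batch size of 2 and the 3000 iterations are taken from the text. The number of training images is an assumption: roughly 80% of the 54 drill core box images, taken as 43 here, since the exact split size is not stated.

```python
def iterations_per_epoch(num_images, batch_size):
    """Number of iterations needed to see every training image once."""
    return num_images / batch_size


def iterations_to_epochs(iterations, num_images, batch_size):
    """Convert an iteration count into the equivalent number of epochs."""
    return iterations / iterations_per_epoch(num_images, batch_size)


train_images = 43  # assumption: about 80% of the 54 images
per_epoch = iterations_per_epoch(train_images, batch_size=2)      # 21.5 iterations
epochs = iterations_to_epochs(3000, train_images, batch_size=2)   # about 139.5 epochs
```

Under these assumptions, each experiment of 3000 iterations corresponds to roughly 140 passes over the training images.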

3.4.2 Segmentation

The segmentation part was developed in PyTorch with a Deeplabv3+ model from Segmentation Models (SMP) [38]. The following hyper-parameters were selected experimentally: learning rate, optimizer, momentum and weight decay. Two different backbones with pre-trained weights were tested: Resnet50 and Resnet101. Resnet101 yielded the higher IoU. Early stopping was used to store the weights with the lowest loss. Data augmentation was selected experimentally, where vertical and horizontal flips proved to give better results. The ensemble learning technique pairwise stacking was compared with a multi-class model to see if the IoU score could be increased by using the results of two binary models. The data set was divided into an 80/10/10 split for training, validation and testing. The final network configuration can be seen in Table 4.5.

3.5 Inference/post-processing

In this section I describe the inference and post-processing steps used to generate edge points of the fractures and points describing the orientation line.

3.5.1 Object detection

A prediction is made on an image of a drill core box with its original dimensions. The output is a list of bounding boxes, each defined by its top-left and bottom-right points. Due to downsampling in the segmentation model, the width and height of the input images need to be divisible by 32. Therefore the width and height of the bounding boxes were extended to the closest number divisible by 32. The image of the drill core box is cropped according to the expanded bounding boxes, and the crops are used for the segmentation task.
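Extending a predicted box so that both sides are divisible by 32 can be sketched as below. The function names are illustrative. Splitting the extra pixels evenly between the two sides is an assumption made here, and clamping to the image border is left out for brevity.

```python
def round_up(value, multiple=32):
    """Smallest multiple of `multiple` that is >= value (integers only)."""
    return ((value + multiple - 1) // multiple) * multiple


def expand_box(box, multiple=32):
    """Grow an (x1, y1, x2, y2) box so its width and height divide by `multiple`."""
    x1, y1, x2, y2 = box
    width, height = x2 - x1, y2 - y1
    new_width, new_height = round_up(width, multiple), round_up(height, multiple)
    # Assumption: the padding is split evenly between the two sides.
    x1 -= (new_width - width) // 2
    y1 -= (new_height - height) // 2
    return x1, y1, x1 + new_width, y1 + new_height


predicted = (100, 200, 250, 420)   # hypothetical box, 150 x 220 pixels
expanded = expand_box(predicted)   # 160 x 224 pixels
```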

Figure 3.6 – Result of object detection - drill core

3.5.2 Fracture segmentation model

The edges of the segmented fracture are extracted by taking the right-most and left-most pixels of the prediction. An example of a left-most edge can be seen in Figure 3.7. The edge points are then reduced to 10 points with equal step size, as seen in Figure 3.9.
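One possible reading of this step is sketched below: for every image row that contains fracture pixels, the left-most (or right-most) pixel is kept, and the resulting edge is then subsampled to 10 points at equal index steps. Working row by row, the function names and the synthetic mask are assumptions for the illustration.

```python
def edge_pixels(mask, side="left"):
    """Per row with fracture pixels, the (x, y) of its left- or right-most one."""
    points = []
    for y, row in enumerate(mask):
        xs = [x for x, value in enumerate(row) if value]
        if xs:
            points.append((min(xs) if side == "left" else max(xs), y))
    return points


def subsample(points, count=10):
    """Pick `count` points at approximately equal index steps, ends included."""
    if len(points) <= count:
        return list(points)
    step = (len(points) - 1) / (count - 1)
    return [points[round(i * step)] for i in range(count)]


# Synthetic 20 x 15 binary mask with a slanted band three pixels wide.
mask = [[1 if y // 2 <= x <= y // 2 + 2 else 0 for x in range(15)] for y in range(20)]

left_edge = subsample(edge_pixels(mask, side="left"))
right_edge = subsample(edge_pixels(mask, side="right"))
```

A single mislabeled pixel far from the fracture would become an edge point in such a scheme, which matches the kind of outlier discussed later in the thesis.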

Figure 3.7 – Leftmost pixels of fracture segmentation

3.5.3 Orientation line segmentation model

The segmented orientation line is separated by the fracture into a left and a right part. An example of the extracted left side can be seen in Figure 3.8.

Figure 3.8 – Separated left orientation line

RANSAC is used on both the left and the right segmentation group to get a robust line through the segmentation, as can be seen in Figure 3.9. The sample size is set experimentally as a percentage of the pixels, assuming that most pixels are inliers.
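The principle of RANSAC line fitting can be illustrated with a minimal sketch. Note that it uses the textbook variant with a minimal sample of two points per iteration, not the percentage-based sample size described above. The iteration count, the inlier tolerance and the synthetic points are assumptions chosen for the example.

```python
import random


def fit_line(p, q):
    """Line through p and q as (a, b, c) with a*x + b*y + c = 0, a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    return a / norm, b / norm, -(a * x1 + b * y1) / norm


def ransac_line(points, iterations=200, tolerance=1.5, seed=0):
    """Return the line supported by the most inliers, and those inliers."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)
        if p == q:  # duplicate coordinates cannot define a line
            continue
        a, b, c = fit_line(p, q)
        # Point-to-line distance is |a*x + b*y + c| because (a, b) is unit length.
        inliers = [(x, y) for x, y in points if abs(a * x + b * y + c) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers


# Synthetic orientation line pixels on y = 50 plus three stray pixels.
line_pixels = [(x, 50) for x in range(40)]
stray_pixels = [(5, 10), (20, 90), (30, 5)]
line, inliers = ransac_line(line_pixels + stray_pixels)
```

The stray pixels do not influence the returned line, because they never belong to the largest consensus set.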

3.5.4 Point cloud

An affine transformation was used to create a mapping between the drill core image and the point cloud, in order to convert the x-y pairs generated by our model to the point cloud domain. The point cloud was used as a lookup table for the x-y pairs to extract the z-position of the edge and orientation line points.

3.5.5 Alpha/Beta angles

The Alpha and Beta angles were calculated in the point cloud domain by Minalyze, in the same way as for the reference points created by geologists. The difference between the respective angles generated from the points created by our method and by the geologists was then calculated.

3.6 Hardware/Software to be used

The experiments in this study were done on the following system:

• Hardware: desktop computer with a 1080 Ti graphics card

Figure 3.9 – Left-most pixels of fracture segmentation and orientation line

• Software: Python, PyTorch, Detectron2, NumPy, OpenCV and PIL

3.7 Assessing reliability and validity of the data collected

The data set is developed using criteria for how a fracture should be defined. The result of the per-pixel annotation was analyzed and approved as a good representation of fractures in drill cores by representatives at Minalyze.

3.7.1 Validity of method

The segmentation model was evaluated by mIoU and the object detection model was evaluated using mAP, both standard evaluation metrics for object localization.

Minalyze has developed a method for producing a cross-section of a drill core given a number of points on the edge of the fracture. The points generated in this thesis will be fed into their system to generate Alpha and Beta angles. These angles will be compared numerically to the Alpha/Beta angles generated by hand by a geologist.

The reason for dividing the problem into bounding box detection and segmentation is that the distribution of data is very uneven: there are many more bounding boxes than segmentation maps. Moreover, the images of drill cores are very large, which means that they have to be heavily downsampled in order to fit in the GPU. This would lead to low resolution of the fractures if an end-to-end solution was used.

As for the models, those that perform at state-of-the-art level on other data sets were chosen. Pre-trained models were fine-tuned because of the limited amount of data.

3.7.2 Reliability of method

There are numerous factors that make drill cores from different boreholes vary, such as texture, chemical properties and the structural properties of the rocks. The data set consists of different boreholes from the same area, which makes the method biased towards the data set, so it might generalize poorly. Since the algorithm that calculates the Alpha and Beta angles can be seen as a black box, it is difficult to evaluate the internal mechanisms that affect the result.


Chapter 4

Experiments and Results

4.1 Object detection

Detectron2 was used to implement the object detection model with a pre-trained ResNet-50 backbone. The network settings for the final model can be viewed in Table 4.1 and the Average Precision scores in Table 4.2. Figures 4.1 and 4.2 show the ground truth and the predictions of the trained model on a test image.

Network settings - object detection

Backbone                      R50-FPN
Optimizer                     SGD
Base learning rate            3.5 · 10^-2
Batch size                    2
Threshold                     0.5
Batch size region proposals   512
Warm-up iterations            400
Total iterations              2000

Table 4.1 – Final network configuration object detection

AP       AP50     AP75
47.650   80.315   52.222

Table 4.2 – Average Precision - on test set

Figure 4.1 – Ground truth boxes

Figure 4.2 – Prediction boxes

4.1.1 Segmentation

PyTorch was used to implement the segmentation model. The Deeplabv3+ model from SMP was used with a resnet101 backbone pre-trained on ImageNet. Experiments were made by training both a multi-class model and two binary models. The multi-class model was fine-tuned with Adam with a low learning rate of 10^-4 so as not to destroy the pre-trained weights. The batch size was set to 4 with image and mask dimensions of 576x576, to fit in the GPU. The binary models had similar settings, but used Binary Cross Entropy as loss criterion instead of Cross Entropy Loss. IoU was used as the evaluation metric to monitor performance. The resulting predictions on the validation set can be viewed in Figure 4.3.

Model                     IoU (%)
                          Fracture   Orientation line
Multi-class               64.13      77.42
Binary Fractures          70.51      -
Binary Orientation line   -          85.46

Table 4.3 – Averaged IoU over the validation set for the binary models and multi-class model

Figure 4.3 – A subset of predictions on the validation set. Left to right: ground truth, binary fracture, binary orientation line, multi-class

Experiment Augmentation

Augmentation by vertical and horizontal flips was applied randomly to the binary models. These augmentations were chosen based on the notion that they would not yield an unnatural representation of the fractures. As the drill core is always aligned horizontally, augmentations with large shifts were avoided.



Model                     IoU (%)
                          Fracture   Orientation line
Binary Fractures          74.26      -
Binary Orientation line   -          85.23

Table 4.4 – Averaged IoU over the validation set for the binary models with augmentation

Final segmentation model

The binary models with augmentation were evaluated on the test set.

Network settings - segmentation

Backbone             resnet101
Optimizer            Adam
Base learning rate   10^-4
Batch size           4
Threshold            0.5
Epochs               30

Table 4.5 – Final network configuration segmentation



Model                     IoU (%)
                          Fracture   Orientation line
Binary Fractures          75.43      -
Binary Orientation line   -          86.02

Table 4.6 – Averaged IoU over the test set for the binary models with augmentation

Figure 4.4 – Ground truth and final binary segmentation predictions

4.1.2 Alpha/Beta angles

Minalyze's algorithm for calculating the Alpha and Beta angles was applied to the fracture and orientation line points generated by the method presented in this thesis.

The points generated by the proposed method were used to calculate the Alpha and Beta angles with the same algorithm that is used for the handcrafted points created by geologists. A comparison for one drill core box between the Alpha and Beta angles generated by the method and the angles generated by hand by geologists can be viewed in Table 4.7, where each row represents a fracture with its measured error in degrees. Figure 4.5 shows the error of the fractures' Alpha and Beta measures. Figure 4.6 shows an example of an outlier produced by the method.

α Method   β Method   α Geologist   β Geologist   α Error   β Error
9.74       75.86      67.14         336.53        57.4      99.33
77.87      315.59     79.74         271.96        1.87      43.63
72.78      14.36      70.62         14.4          2.20      0.04
58.31      124.72     59.56         114.92        1.25      9.80
76.21      16.44      60.85         0.56          15.36     15.88
71.88      19.04      55.09         357.81        16.79     21.23
51.18      161.3      47.52         2.52          3.66      158.78
87.11      282.61     66.72         19.1          20.39     06.49
38.48      338.68     80.65         335.45        42.17     3.23
72.8       301.81     78.09         271.01        5.29      30.8
50.00      249.21     59.18         55.84         9.19      166.63
50.73      56.71      38.17         29.44         12.56     27.27

Table 4.7 – Alpha/Beta angles computed on one box in the test set with points from our method vs points created by geologists. Measurements in degrees.
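Most Beta errors in Table 4.7 appear consistent with a difference that wraps around at 360 degrees, while the Alpha errors correspond to a plain absolute difference. The sketch below reproduces the first row of the table under that reading. The function name is illustrative, and the wrap-around interpretation is an inference from the table values, not a documented part of Minalyze's algorithm.

```python
def angular_error(a_deg, b_deg, period=360.0):
    """Smallest absolute difference between two angles on a circle."""
    diff = abs(a_deg - b_deg) % period
    return min(diff, period - diff)


# First row of Table 4.7: method vs geologist.
alpha_error = abs(9.74 - 67.14)            # plain difference
beta_error = angular_error(75.86, 336.53)  # wraps past 360 degrees
```

With this measure, a Beta angle of 359 degrees and one of 1 degree differ by 2 degrees, not by 358.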

Figure 4.5 – Error Alpha/Beta angles - on test set in degrees


Figure 4.6 – Example of outlier point in fracture edge

Figure 4.7 – Example of shifting plane with one outlier in the point cloud



Chapter 5

Discussion

5.0.1 Object detection

The object detection model found fractures with high Average Precision when using an IoU threshold of 50%. A lower score was observed for higher IoU thresholds, which may indicate that the predicted boxes are not perfectly aligned with the ground truth, but that the model is able to locate the fractures.

5.0.2 Segmentation

The results show that with relatively little data, the segmentation model can find a representation of fractures in drill cores with an IoU score of 75.43% on the test data. The experiments comparing binary and multi-class models showed that there was a gain in IoU from training two separate binary classifiers instead of a multi-class model. The IoU score for fractures increased further (from 70.51% to 74.26% on the validation set) when light augmentation was applied. Visual inspection shows that the edge of the fracture is most often captured. Even when a prediction misses many of the fracture pixels, the model is able to capture the right-most edge most of the time, as seen in the first row of Figure 4.4. The segmentation model was able to find the orientation line with a high IoU score. This may indicate that it is a fairly simple problem, as the orientation line does not vary much in either position or color between different drill cores.

5.0.3 Alpha/Beta angles

It is difficult to draw any definite conclusions about the calculation of the Alpha and Beta angles given the limited amount of test data. Points generated with the method presented in this thesis work well for finding Alpha and Beta angles for certain fractures. Figure 4.5 shows that the majority of the fractures have an error below 20 degrees for the Alpha angles and below 35 degrees for the Beta angles. The algorithm that Minalyze uses to generate planes is sensitive to outliers. It was observed that one outlier completely shifted the generated plane even if a majority of the points were aligned with the fracture edge. An example of an outlier point in the generated edge points can be seen in Figure 4.6, and in the 3D domain in Figure 4.7. The appearance and complexity of the fractures vary greatly in the data set, which may be another explanation for why it is difficult to generate accurate edge points in some cases.


Chapter 6

Conclusions and Future work

6.1 Conclusions

The purpose of this thesis was to investigate if Deep Learning could be utilized to localize fracture edges in drill cores. The results of the object detection and segmentation show that a Deep Learning based method is able to successfully localize fractures and orientation lines. The resulting fracture edge points, as well as the orientation line points that our method generated, could be used to generate Alpha/Beta angles. However, compared to angles from handcrafted points they have a relatively large error. I think the results show the potential of using this method for localizing fracture edges in drill cores. Given the sparse data set used in this thesis, there is in my view also a lot of potential for future work and for improving the method.

Different types of fractures

The shape of the fractures in the data set varies a lot. The most common type of fracture is narrow and curved, but there are also other types, such as fractures with wide aperture and edge pieces with a steep intersection. Therefore it would be valuable to classify the fractures into different types, to gain a better understanding of which fractures the model has a hard time handling correctly. This could also provide insight into how to balance the data set between different types of fractures when annotating more data.

Sources of error

In the mining industry, there is a problem where different geologists classify the same fractures differently. An automated system, like the one proposed in this thesis, could work as a basis for an impartial assessment, given that the data is created by several geologists. Another advantage of this method is that both sides of a fracture can be used to generate Alpha and Beta angles. This makes it possible to get a more reliable measure if they are averaged. In the industry this is rarely done because it is so time consuming. The method presented in this thesis is compared to only one geologist manually plotting points. It would be valuable to investigate the error across different geologists plotting the fractures.

One major flaw of this method, if combined with Minalyze's algorithm for generating intersecting planes in the drill cores, is the sensitivity to outliers. The algorithm Minalyze uses for fitting the points seems to create an optimal fit for all points, outliers included.

Another shortcoming of the method is that it requires manual input to map the images to the point cloud. This could be solved via hardware if the laser scanner and the camera are set up in such a way that their relative positions never differ between drill cores. The manual extraction of points for the transformation could also be a source of error.

Other use cases

Although the method may have difficulty finding certain types of fractures and defining them well, there are other aspects of this work that could be used as a tool for geologists. Since the model finds orientation lines with high precision, parts of the method could be used to automate that process. It is also possible to use the object detection model to guide geologists in finding fractures that can be difficult to see with the naked eye.

6.1.1 Future work

Given the large images of the drill core boxes, an end-to-end solution might be hard to utilize without extensive computational power or loss of information in the segmentation step. Tiling could be used to circumvent this, by dividing the drill core images into quarters, or even smaller images, to avoid heavy downsampling.

When it comes to robust plane fitting for the Alpha/Beta angles, one solution could be to use outlier/abnormality detection. Another solution might be to review the algorithm that fits the plane to the points, in order to suppress the influence of outliers.

Another thing to investigate could be using a localization model directly on the point clouds, as it would make it possible to avoid the transformation step from images to point clouds. The additional dimension, the z-value, would then also give valuable information during training.

6.2 Ethical, Sustainable and Social aspects

Mining extracts non-renewable resources, and the process can damage the environment in several stages, from the mining itself to the consumption of what is extracted. The exploitation of minerals is often very demanding for the environment. When a mine is dug, the top layer of the soil is often removed, leading to the disappearance of vegetation and, consequently, the migration of animals. Furthermore, nearby watercourses can be polluted, which in turn negatively affects the local community and wildlife [39].

Automation of work previously performed by humans is an important social aspect to consider when developing AI algorithms. However, automating processes that are monotonous can leave geologists with more time for other important work that is more fulfilling.



References

[1] K. O’Shea and R. Nash, “An introduction to convolutional neural networks,” 2015.

[2] B. Mehlig, “Machine learning with neural networks,” 2021.

[3] I. Goodfellow, Deep learning, ser. Adaptive computation and machine learning, 2016. ISBN 9780262035613

[4] R. A. McEver and B. S. Manjunath, “Pcams: Weakly supervised semantic segmentation using point supervision,” 2020.

[5] R. Yamashita, M. Nishio, R. K. G. Do, and K. Togashi, “Convolutional neural networks: an overview and application in radiology,” Insights into imaging, vol. 9, no. 4, pp. 611–629, 2018.

[6] F. Yu and V. Koltun, “Multi-scale context aggregation by dilated convolutions,” 2016.

[7] “Computer vision: Dana H. Ballard and Christopher M. Brown. Prentice-Hall, Englewood Cliffs, N.J. 1982. xx + 523 pages. $39.95,” Computer Graphics and Image Processing, vol. 20, no. 1, p. 96, 1982. doi: https://doi.org/10.1016/0146-664X(82)90077-6. [Online]. Available: https://www.sciencedirect.com/science/article/pii/0146664X82900776

[8] U. Shah and A. Harpale, “A review of deep learning models for computer vision,” in 2018 IEEE Punecon, 2018. doi: 10.1109/PUNECON.2018.8745417 pp. 1–6.

[9] R. Raguram, J.-M. Frahm, and M. Pollefeys, “A comparative analysis of ransac techniques leading to adaptive real-time random sample consensus,” in Computer Vision – ECCV 2008, D. Forsyth, P. Torr, and A. Zisserman, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. ISBN 978-3-540-88688-4 pp. 500–513.


[10] S. Bright, G. Conner, A. Turner, and J. Vearncombe, “Drill core, structure and digital technologies,” Applied Earth Science, vol. 123, no. 1, pp. 47–68, 2014. [Online]. Available: https://doi.org/10.1179/1743275814Y.0000000051

[11] N. Shigematsu, M. Otsubo, K. Fujimoto, and N. Tanaka, “Orienting drill core using borehole-wall image correlation analysis,” Journal of Structural Geology, vol. 67, pp. 293–299, 2014. doi: https://doi.org/10.1016/j.jsg.2014.01.016 Structural Geology and Resources. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0191814114000297

[12] A. Straahle, Definition and description of parameters for geologic, geophysical and rock mechanical mapping of rock. Definition och beskrivning av parametrar foer geologisk, geofysisk och bergmekanisk kartering av berg, 2001.

[13] The ISRM Suggested Methods for Rock Characterization, Testing and Monitoring: 2007-2014, 1st ed., 2015. ISBN 3-319-07713-9

[14] Deep Learning in Object Detection and Recognition, 1st ed., 2019. ISBN 981-10-5152-6

[15] Z.-Q. Zhao, P. Zheng, S. tao Xu, and X. Wu, “Object detection with deep learning: A review,” 2019.

[16] R. B. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” CoRR, vol. abs/1311.2524, 2013. [Online]. Available: http://arxiv.org/abs/1311.2524

[17] R. Girshick, “Fast r-cnn,” 2015.

[18] S. Ren, K. He, R. Girshick, and J. Sun, “Faster r-cnn: Towards real-time object detection with region proposal networks,” 2016.

[19] S. P. Mary, Ankayarkanni, U. Nandini, Sathyabama, and S. Aravindhan, “A Survey on Image Segmentation Using Deep Learning,” Journal of Physics: Conference Series, vol. 1712, no. 1, pp. 1–22, 2020. doi: 10.1088/1742-6596/1712/1/012016

[20] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” 2015.

[21] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, “Semantic image segmentation with deep convolutional nets and fully connected crfs,” 2016.

[22] ——, “Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs,” 2017.

[23] L.-C. Chen, G. Papandreou, F. Schroff, and H. Adam, “Rethinking atrous convolution for semantic image segmentation,” 2017.

[24] L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff, and H. Adam, “Encoder-decoder with atrous separable convolution for semantic image segmentation,” 2018.

[25] S. Tabik, D. Peralta, A. Herrera-Poyatos, and F. Herrera, “A snapshot of image Pre-Processing for convolutional neural networks: Case study of MNIST,” International Journal of Computational Intelligence Systems, vol. 10, no. 1, pp. 555–568, 2017. doi: 10.2991/ijcis.2017.10.1.38

[26] Z. Wang, J. Yang, H. Jiang, and X. Fan, “CNN training with twenty samples for crack detection via data augmentation,” Sensors (Switzerland), vol. 20, no. 17, pp. 1–17, 2020. doi: 10.3390/s20174849

[27] C. Tan, F. Sun, T. Kong, W. Zhang, C. Yang, and C. Liu, “A survey on deep transfer learning,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11141 LNCS, pp. 270–279, 2018. doi: 10.1007/978-3-030-01424-7_27

[28] A. Rosebrock, “Practitioner Bundle,” Deep Learning for Computer Vision with Python, 2017.

[29] S. Marsland, Machine Learning: An Algorithmic Perspective, Second Edition, 2nd ed. Chapman & Hall/CRC, 2014. ISBN 1466583282

[30] P. Savicky and J. Fürnkranz, “Combining pairwise classifiers with stacking,” in Advances in Intelligent Data Analysis V, M. R. Berthold, H.-J. Lenz, E. Bradley, R. Kruse, and C. Borgelt, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2003. ISBN 978-3-540-45231-7 pp. 219–229.

[31] X. Chen, J. H. Liew, W. Xiong, C.-K. Chui, and S.-H. Ong, “Focus, segment and erase: An efficient network for multi-label brain tumor segmentation,” in Computer Vision – ECCV 2018, ser. Lecture Notes in Computer Science. Cham: Springer International Publishing, 2018. ISBN 9783030012601. ISSN 0302-9743 pp. 674–689.

[32] Y. Zhang, S. Mehta, and A. Caspi, “Rethinking semantic segmentation evaluation for explainability and model selection,” 2021.

[33] F. Alzubaidi, P. Mostaghimi, P. Swietojanski, S. R. Clark, and R. T. Armstrong, “Automated lithology classification from drill core images using convolutional neural networks,” Journal of Petroleum Science and Engineering, vol. 197, p. 107933, 2021.

[34] K. Chaiyasarn, W. Khan, L. Ali, M. Sharma, D. Brackenbury, and M. DeJong, “Crack detection in masonry structures using convolutional neural networks and support vector machines,” ISARC 2018 - 35th International Symposium on Automation and Robotics in Construction and International AEC/FM Hackathon: The Future of Building Things, no. November, 2018. doi: 10.22260/isarc2018/0016

[35] U. Escalona, F. Arce, E. Zamora, and J. Azuela, “Fully convolutional networks for automatic pavement crack segmentation,” Computación y Sistemas, vol. 23, 06 2019. doi: 10.13053/cys-23-2-3047

[36] X. Feng, L. Xiao, W. Li, L. Pei, Z. Sun, Z. Ma, H. Shen, and H. Ju, “Pavement Crack Detection and Segmentation Method Based on Improved Deep Learning Fusion Model,” Mathematical Problems in Engineering, vol. 2020, p. 8515213, 2020. doi: 10.1155/2020/8515213. [Online]. Available: https://doi.org/10.1155/2020/8515213

[37] “Detectron2,” https://detectron2.readthedocs.io/, accessed: 2021-05-15.

[38] “Segmentation models,” https://smp.readthedocs.io/, accessed: 2021-05-15.

[39] A. Widana, “Environmental impacts of the mining industry: A literature review,” 10 2019. doi: 10.13140/RG.2.2.11702.86083

www.kth.se

TRITA-EECS-EX-2021:648
