Methods for Truck Dispatching in Open-Pit Mining

Thesis presented to the Faculty of the Department of Graduate Studies of the Aeronautics Institute of Technology, in partial fulfillment of the requirements for the Degree of Doctor in Science in the Program of Electronic Engineering and Computer Science, Field of Computer Science.

Guilherme Sousa Bastos

Thesis approved in its final version by the signatories below:

Prof. Dr. Carlos Henrique Costa Ribeiro

Advisor

Prof. Dr. Luiz Edival de Souza

Co-advisor

Prof. Celso Massaki Hirata

Head of the Faculty of the Department of Graduate Studies

Campo Montenegro

Sao Jose dos Campos, SP - Brazil

2010

Cataloging-in-Publication Data

Documentation and Information Division

Sousa Bastos, Guilherme

Methods for Truck Dispatching in Open-Pit Mining / Guilherme Sousa Bastos.

Sao Jose dos Campos, 2010.

140f.

Thesis of Doctor in Science – Course of Electronic Engineering and Computer Science. Area of Computer Science – Aeronautics Institute of Technology, 2010. Advisor: Prof. Dr. Carlos Henrique Costa Ribeiro. Co-advisor: Prof. Dr. Luiz Edival de Souza.

1. Mathematical programming. 2. Goods distribution. 3. Genetic algorithms. 4. Applied mathematics. 5. Routes. 6. Trucks. 7. Mathematics. I. Aeronautics Institute of Technology. II. Title.

BIBLIOGRAPHIC REFERENCE

SOUSA BASTOS, Guilherme. Methods for Truck Dispatching in Open-Pit

Mining. 2010. 140f. Thesis of Doctor in Science – Aeronautics Institute of Technology,

Sao Jose dos Campos.

CESSION OF RIGHTS

AUTHOR NAME: Guilherme Sousa Bastos

PUBLICATION TITLE: Methods for Truck Dispatching in Open-Pit Mining.

PUBLICATION KIND/YEAR: Thesis / 2010

Permission is granted to the Aeronautics Institute of Technology to reproduce copies of this thesis and to lend or sell such copies only for academic and scientific purposes. The author reserves all other publication rights, and no part of this thesis may be reproduced without the authorization of the author.


Rua Oscar Renno, 309. Costa II

CEP 37500-433 – Itajuba–MG


Thesis Committee Composition:

Prof. Cairo Lucio Nascimento Junior - Chairperson - ITA

Prof. Dr. Carlos Henrique Costa Ribeiro - Advisor - ITA

Prof. Dr. Luiz Edival de Souza - Co-advisor - UNIFEI

Prof. Rodrigo Arnaldo Scarpel - Member - ITA

Dra. Leliane Nunes de Barros - External Member - IME-USP

Dr. Marcone Jamilson Freitas Souza - External Member - UFOP

ITA

To Karina, for her love and patience.

Acknowledgments

Thank you, God, for writing straight on crooked lines... My entire life has been guided by this wise saying, and now, after a really hard road, I am here finishing my most important work so far.

I would like to express my gratitude to my advisor, Prof. Carlos Henrique Costa Ribeiro. Your supervision style was essential in pointing out the research path: you never handed me the right way, but always kept me away from the wrong ones. Because of this, I can affirm that I am now a researcher. Thank you very much!

Thanks to my co-advisor and colleague, Prof. Luiz Edival de Souza. Having you next door to my office was fundamental to my progress, as you were always available to answer my endless questions and to help me. This marks the end of your supervision of my research, which began when I was an engineering undergraduate; however, it is also the starting point of our future research projects. Thanks a lot!

Another special thanks goes to my supervisor at the Australian Centre for Field Robotics (ACFR), Dr. Fabio Ramos. Thank you for receiving me at ACFR and supervising my work during my six-month stay in Sydney. This period made the difference in my work, and it will certainly drive my future research to a higher level of quality. Cheers mate!

Thanks to CAPES for granting the scholarship, which was essential for my studies at ACFR.

Continuing in the Land of Oz, I must thank the people who helped me in my work and, above all, with life abroad. Thanks to the ACFR staff, and especially to Vitor, Sildomar, Guilherme, Tim, Adrian, Paco, Simon, Gabriel, Surya, and Pablo Chilean. A special thanks goes to Pablo Peruvian, you were my first friend in Sydney! Thanks for guiding me (a newbie) through the great pubs of the city! Another special thanks to my other friends in Sydney, whom I count as brothers: Alex Cowboy, Du, Elton, Leandro, and Pablo Chilean. My stay in Sydney can be divided into before and after meeting you all! Another thanks to my great friend Andy and his wife Joanna; thanks a lot for buying "Possante", I am sure that it will bring you happiness!

Many thanks to Karina Valdivia for teaching me the "crazy" Factored MDPs. I am sure that we can form a partnership in the near future to study and develop new trends in the decision-making area.

So many times along this long road I counted on the understanding of two special people at UNIFEI, who allowed my research work at ITA and adjusted my schedule whenever I needed; thank you, Prof. Carlos Augusto Ayres and Prof. Carlos Alberto Pinheiro.

Thanks to my mom and dad for their constant encouragement of my studies since I was a kid. I really could not have reached this point without your help. I love you two.

A really special thanks to my wife Karina. Only you know the difficulties that we have gone through together during these years of study... That is the past; from now on we will harvest the fruits of what we started planting five years ago! Thanks for everything, my love!

Eu te amo!!!

“Logic takes you from a to b. Imagination takes you everywhere.”

— Albert Einstein

Abstract

Material transportation is one of the most important aspects of open-pit mine operations. The problem usually involves a truck dispatching system in which decisions on truck assignments and destinations are made in real time. Given its significance, several decision systems for this problem have been developed in recent years, improving productivity and reducing operating costs. As in many other real-world applications, assessing and correctly modeling uncertainty is a crucial requirement, since the unpredictability arising from equipment faults, weather conditions, and human error can often result in truck queues or idle shovels. However, uncertainty is not considered in most commercial dispatching systems. In this thesis, we introduce novel truck dispatching systems as a starting point for replacing current practice with a statistically principled decision-making methodology. First, we present a stochastic method that applies a Time-Dependent Markov Decision Process (TiMDP) to the truck dispatching problem. In the TiMDP model, travel times are represented as probability density functions (pdfs), time windows can be inserted to represent path availability, and time-dependent utility can be used as a priority parameter. To mitigate the well-known curse of dimensionality, to which multi-agent problems are subject under discrete state models, the system is modeled using the single-dependent-agent concept introduced here. Building on the same concept, we introduce the Genetic TiMDP (G-TiMDP) method for the truck dispatching problem. This method is a hybridization of the TiMDP model and a Genetic Algorithm (GA); the GA on its own is also used to solve the truck dispatching problem. Finally, in order to evaluate and compare the introduced methods, we run Monte Carlo simulations of an example heterogeneous mine composed of 15 trucks, 3 shovels, and 1 crusher. The uncertain aspect of the problem is represented by the selection of the path between the crusher and the shovels, which is made by the truck driver independently of the dispatching system. The results are compared with classical dispatching approaches (Greedy Heuristic and Minimization of Truck Cycle Times, MTCT) using Student's t-test, demonstrating the efficiency of the introduced truck dispatching methods.

List of Figures

FIGURE 2.1 – Example of a MDP with 3 states.
FIGURE 2.2 – Value Iteration for (a) γ = 0.9 and (b) γ = 0.3.
FIGURE 2.3 – TiMDP example solved step-by-step by value iteration.
FIGURE 2.4 – Sequential decision making problem using time-dependent utility.
FIGURE 2.5 – Value function - V(1, t).

FIGURE 2.8 – Policies over time.
FIGURE 2.9 – Value function V(1, t), P1 = N(10, 3).
FIGURE 2.10 – Policies over time - P1 = N(10, 3).
FIGURE 3.1 – 1-truck-for-n-shovels strategy.
FIGURE 3.2 – m-trucks-for-1-shovel strategy.
FIGURE 3.3 – m-trucks-for-n-shovels strategy.
FIGURE 4.1 – Abstract graph of a medium-scale mine.
FIGURE 4.2 – Truck cycle time.
FIGURE 4.3 – Path selection outcomes. (a) Crusher-Shovel 1-Crusher; (b) Crusher-Shovel 2-Crusher; (c) Crusher-Shovel 3-Crusher.
FIGURE 4.4 – Outcome likelihood functions.
FIGURE 4.5 – Truck dispatching state transitions.
FIGURE 4.6 – TiMDP truck dispatching states.
FIGURE 4.7 – Expected tonnage production at crusher C (Truck 1 - Shovel 1 - Queue 0).
FIGURE 4.8 – Expected tonnage production at crusher C (Truck 1 - Queue 0).
FIGURE 4.9 – Expected tonnage production at crusher C (Truck 2 - Shovel 3).
FIGURE 4.10 – Expected tonnage production at crusher C (Truck 2).
FIGURE 4.11 – Comparative of expected tonnage production at crusher C (Truck 3 - Queue 0) for standard and Gauss representations.
FIGURE 4.12 – Comparative of expected tonnage production at crusher C (Truck 1 - Queue 0) for standard and Gauss representations.
FIGURE 4.13 – Truck dispatching GA chromosome.
FIGURE 4.14 – Truck dispatching GA crossover.
FIGURE 4.15 – Truck dispatching GA mutation.
FIGURE 4.16 – Truck dispatching GA elitist behavior.
FIGURE 4.17 – Truck dispatching GA reproduction result.
FIGURE 4.18 – Auxiliary chromosome array.
FIGURE 5.1 – General mine simulation environment.
FIGURE 5.2 – Shovel 1 block simulation environment detail.
FIGURE 5.3 – Queue 1 block simulation environment detail.
FIGURE 5.4 – Paths 1 block simulation environment detail.
FIGURE 5.5 – Path µ1 block simulation environment detail.
FIGURE 5.6 – Quantity of trucks in shovels for the Greedy Heuristic simulation.
FIGURE 5.7 – Quantity of trucks in paths going to Shovel 1 for the Greedy Heuristic simulation.
FIGURE 5.8 – Quantity of trucks in shovels for the MTCT Heuristic simulation.
FIGURE 5.9 – Trucks on parking lot for the MTCT heuristic.
FIGURE 5.10 – Quantity of trucks in shovels for TiMDP model simulation.
FIGURE 5.11 – Trucks on parking lot for TiMDP model.
FIGURE 5.12 – Quantity of trucks in shovels for the GA model simulation.
FIGURE 5.13 – Trucks on parking lot for the GA model.
FIGURE 5.14 – Quantity of trucks in shovels for the G-TiMDP simulation.
FIGURE 5.15 – Trucks on parking lot for the G-TiMDP model.
FIGURE 5.16 – Mean time in the queues for TiMDP model.
FIGURE B.1 – Gamma distribution.
FIGURE B.2 – Gaussian distribution.

List of Tables

TABLE 2.1 – Transition Probabilities
TABLE 2.2 – Reward
TABLE 2.3 – MDP Solution (γ = 0.9)
TABLE 2.4 – Q(s, a) (γ = 0.9)
TABLE 2.5 – MDP Solution (γ = 0.3)
TABLE 2.6 – Q(s, a) (γ = 0.3)
TABLE 4.1 – Truck specifications.
TABLE 4.2 – Shovel specifications.
TABLE 4.3 – Mining data.
TABLE 5.1 – Monte Carlo simulations of truck dispatching methods using standard representation (standard deviation equals zero for all considered times).
TABLE 5.2 – Monte Carlo simulations of truck dispatching methods using Gaussian representation.
TABLE 5.3 – Comparatives between truck dispatching methods using T-test.
TABLE 5.4 – Comparatives between truck dispatching methods with Gaussian representations using T-test.

List of Abbreviations and Acronyms

GA Genetic Algorithm

G-TiMDP Genetic Time-dependent Markov Decision Process

PWC Piecewise Constant

PWP Piecewise Polynomial

PWL Piecewise Linear

MDP Markov Decision Process

MSWT Minimizing Shovel Waiting Time

MSC Minimizing Shovel Saturation or Coverage

MTCT Minimizing Truck Cycle Time

MTWT Minimizing Truck Waiting Time

pdf probability density function

ROM Run Of Mine

SA Simulated Annealing

SMDP Semi-Markov Decision Process

TiMDP Time-dependent Markov Decision Process

Contents

1 Introduction
  1.1 Motivation
  1.2 Objectives
  1.3 Work Contributions
  1.4 Thesis Outline

2 Time Dependence in Decision Processes
  2.1 Markov Decision Processes
    2.1.1 MDP formulation
    2.1.2 MDP solution
    2.1.3 A MDP example
  2.2 Time-dependent Markov Decision Processes
    2.2.1 Discrete solution for relative time distributions by backwards convolution
    2.2.2 A TiMDP example
  2.3 Time-dependent utilities
    2.3.1 Decreasing time-dependent utility function
    2.3.2 Increasing time-dependent utility function
    2.3.3 A time-dependent utility example

3 Truck Dispatching in Open Pit Mines
  3.1 Vehicle dispatching problems
  3.2 Truck dispatching problem
    3.2.1 The 1-truck-for-n-shovels strategy
    3.2.2 The m-trucks-for-1-shovel strategy
    3.2.3 The m-trucks-for-n-shovels strategy

4 Truck Dispatching Modeling
  4.1 A model for a medium-scale mine example
    4.1.1 Mine environment
    4.1.2 Specifying trucks and shovels
    4.1.3 The truck cycle
    4.1.4 Mine uncertainties
      4.1.4.1 Stochastic path selection
      4.1.4.2 Gaussian-based truck traveling times
  4.2 Truck dispatching methods
    4.2.1 Greedy heuristic
    4.2.2 MTCT heuristic
    4.2.3 TiMDP Model
      4.2.3.1 Single-dependent-agent TiMDP modeling
      4.2.3.2 TiMDP results and analysis
    4.2.4 Genetic Algorithm (GA)
    4.2.5 G-TiMDP

5 Simulations and Analysis
  5.1 Simulation Framework
  5.2 Dispatching Methods Behavior
  5.3 Comparative Results and Analysis

6 Final Remarks
  6.1 Conclusions
  6.2 Future work

Bibliography

Appendix A – Genetic Algorithm
  A.1 Methodology
    A.1.1 Population generation
    A.1.2 Selection
    A.1.3 Reproduction
      A.1.3.1 Crossover
      A.1.3.2 Mutation
    A.1.4 Termination

Appendix B – Statistical Distributions
  B.1 Gamma Distribution
  B.2 Gaussian Distribution

1 Introduction

1.1 Motivation

Truck dispatching is an important issue in open-pit mining because of the cost of material transportation, which can represent up to 60% of operating expenditure in realistic settings (ALARIE; GAMACHE, 2002). Basically, truck dispatching is a combinatorial problem that consists of assigning trucks to shovels in order to optimize a specific objective while taking several constraints into account. The objective can be the maximization of the tonnage of material transported during a shift (productivity policy), the minimization of equipment inactivity, or meeting Run of Mine (ROM) targets (quality policy). These mining objectives can be pursued independently (one objective at a time: single-objective) or in combination (two or more objectives combined to obtain the best result for each one or for the combination: multi-objective). Meeting the objective is generally subject to common mine constraints, such as truck hauling and shovel loading capacities, empty and loaded truck speeds, refueling, and preventive maintenance schedules.

Recently, several dispatching systems have been developed for open-pit mining using a variety of operations research and evolutionary techniques (BRAHMA, 2007; KRAUSE; MUSINGWINI, 2007; JAOUA; GAMACHE; RIOPEL, 2009). However, none of them presents a stochastic representation of actions and environmental changes. Stochasticity stems from inherent uncertainties, which are present in most real-world problems; classical deterministic approaches do not consider this uncertain behavior and therefore often lead to suboptimal results. Truck dispatching problems in open-pit mines are often subject to uncertain behavior, such as fuel consumption variations, unexpected equipment stoppages (faults, flat tires, emergencies, etc.), and variations in the duration of durative actions. Therefore, modeling truck dispatching with a stochastic approach becomes crucial in order to meet and optimize its specific objectives.

A stochastic truck dispatching system can be represented by a Markov Decision Process (MDP) (PUTERMAN, 1994), which is the classical approach to decision-theoretic problems (LITTMAN; DEAN; KAELBLING, 1995; BOUTILIER; DEAN; HANKS, 1999). Uncertain parameters such as equipment faults, weather conditions, and route availability can be modeled from a reliable historical database. Another set of uncertain parameters based on time, such as truck travel and loading durations, cannot be represented by an MDP. To address this, the truck dispatching model can be based on a Time-dependent MDP (TiMDP) (BOYAN; LITTMAN, 2000), which is used to model and solve sequential decision problems with stochastic state transitions and stochastic, time-dependent action durations.


1.2 Objectives

The solution of a dispatching problem modeled as an MDP is a policy that specifies the action the agent (truck) must select in each state. For a single agent, the optimal solution can generally be found quickly by dynamic programming techniques. However, dispatching problems with many agents (multi-agent) cause the state space to grow exponentially, with a correspondingly drastic increase in the time needed to find the optimal solution. This issue is known as the curse of dimensionality (BELLMAN, 1966); it can be very serious in combinatorial problems such as this one and is critical for TiMDP models, in which policies also depend on the current time. Therefore, the main objective of this thesis is to develop and study an approach that mitigates this unwanted behavior by approximating the problem as a single-dependent-agent problem. In this approximation, the problem is modeled for each truck type (the mine may have trucks that differ in speed and capacity), with dependent states that represent queues of different sizes at the shovels. Thus, the decision on which shovel the truck must travel to depends on the states representing the current queue sizes. In this case, the policy for a specific truck dispatch can change "on the fly" (real-time operating system) because of its dependence on the current queue sizes: if a truck goes to a shovel and has to wait in a queue before loading, it increases the size of that queue, thereby affecting the next truck dispatching decision.
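The exponential growth mentioned above can be made concrete with a back-of-the-envelope calculation. The sketch below is only illustrative: the function names, the number of local states per truck, the number of truck types, the queue cap, and the way the per-type state set is laid out are all assumptions made for this example and are not taken from the model developed in this thesis. Only the counts of 15 trucks and 3 shovels echo the example mine.

```python
def joint_state_count(local_states: int, trucks: int) -> int:
    """States needed when the local states of all trucks are tracked jointly."""
    return local_states ** trucks


def per_type_state_count(local_states: int, shovels: int, max_queue: int) -> int:
    """States for ONE truck-type model under a hypothetical layout:
    the truck's own local states plus one state per (shovel, queue size) pair."""
    return local_states + shovels * (max_queue + 1)


if __name__ == "__main__":
    # Assumed values: 4 local states per truck, 2 truck types, queue cap of 5.
    joint = joint_state_count(local_states=4, trucks=15)
    per_type = per_type_state_count(local_states=4, shovels=3, max_queue=5)
    print("joint multi-agent states:", joint)
    print("states over 2 truck-type models:", 2 * per_type)
```

Under these assumed numbers the joint model needs on the order of a billion states, while the per-type models stay in the tens, which is the kind of gap the single-dependent-agent approximation is meant to exploit.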

Another point that must be considered is the real-time nature of truck dispatching in an open-pit mine. The values used to model the problem behavior are not fixed and change all the time. For example, the truck travel time from one specific point to another will certainly not be exactly the same on different trips. The TiMDP deals very well with these characteristics, since action durations can be modeled by probability density functions (pdfs) based on a reliable historical database.

Given the truck dispatching characteristics presented above, we investigate a dispatching system for an example mine based on a TiMDP model, and verify the validity of the method by comparing its simulated results with those of other methods, namely: (1) Greedy heuristic, (2) MTCT heuristic, and (3) Genetic Algorithm (GA). We then present a novel hybrid method named Genetic TiMDP (G-TiMDP), which uses the value functions given by the TiMDP model as the GA fitness function. The G-TiMDP results are also compared with the results of the previous methods. For the empirical analysis, we apply the same objective function, maximization of tonnage production over the whole mining shift, to all modeled and simulated methods.
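As a rough illustration of how a precomputed value function could serve as a GA fitness function, consider the sketch below. It is a simplified assumption, not the G-TiMDP formulation itself: the chromosome encoding (one shovel index per pending dispatch), the `value_table` indexed by shovel and queue size, and the omission of the time dimension are all hypothetical choices made to keep the example short.

```python
def fitness(chromosome, value_table, initial_queues):
    """Score a sequence of dispatch decisions by summing looked-up values.

    chromosome     -- list of shovel indices, one per pending dispatch (assumed encoding)
    value_table    -- value_table[shovel][queue_size], a stand-in for values that a
                      solved model would provide (time dependence omitted here)
    initial_queues -- current queue size at each shovel
    """
    queues = list(initial_queues)  # copy so the caller's list is not modified
    total = 0.0
    for shovel in chromosome:
        # Clamp to the last tabulated queue size if the queue grows beyond the table.
        q = min(queues[shovel], len(value_table[shovel]) - 1)
        total += value_table[shovel][q]
        queues[shovel] += 1  # the dispatched truck lengthens that shovel's queue
    return total


if __name__ == "__main__":
    # Made-up values for two shovels and queue sizes 0..2.
    table = [[10.0, 6.0, 2.0], [8.0, 7.0, 3.0]]
    print(fitness([0, 0, 1], table, [0, 0]))
    print(fitness([0, 1, 1], table, [0, 0]))
```

With these made-up numbers, spreading trucks across shovels scores higher than sending two trucks in a row to the same shovel, because each dispatch lowers the value of the next dispatch to the same queue.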

1.3 Work Contributions

The contributions of this thesis are presented below, in the order in which they appear in the text.

TiMDP solution by backwards convolution

The TiMDP solution method is presented by Boyan and Littman (2000) and in subsequent works such as Li and Littman (2005) and Rachelson, Fabiani and Garcia (2009a); however, these works are strictly mathematical and do not present a basic step-by-step solution example. We developed a method to solve a TiMDP by complete discretization and backwards convolution. The term backwards convolution denotes the step needed to solve the TiMDP model, which is performed in the reverse direction of a standard convolution. Finally, we solve a TiMDP model step by step using our proposed method.
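One plausible reading of this backward step on a fully discretized time grid is sketched below. It is an assumption-laden illustration rather than the method detailed in Chapter 2: the function name, the representation of the duration distribution as a probability mass function over integer time steps, and the choice to treat values beyond the horizon as zero are all made up for this example.

```python
def backward_convolve(v_next, duration_pmf):
    """Expected value of reaching the next state after a random, relative duration.

    v_next       -- v_next[t] is the value of the successor state at time step t
    duration_pmf -- duration_pmf[d] is the probability that the action takes d steps
    Returns a list u where u[t] = sum_d duration_pmf[d] * v_next[t + d].
    Arrivals past the end of the time grid contribute nothing (assumed zero value).
    """
    horizon = len(v_next)
    expected = [0.0] * horizon
    for t in range(horizon):
        for d, p in enumerate(duration_pmf):
            if t + d < horizon:
                expected[t] += p * v_next[t + d]
    return expected


if __name__ == "__main__":
    # Successor value decays over time; the action lasts 1 or 2 steps with equal probability.
    print(backward_convolve([5.0, 4.0, 3.0, 2.0, 1.0], [0.0, 0.5, 0.5]))
```

Each entry looks forward in time from the departure step, which is why the index runs as `t + d` rather than the `t - d` of a standard convolution.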


Time-dependent utility decision making using the TiMDP model

Many common situations can be modeled with time-dependent utilities, in which specific parameters represent the gain or cost that a decision problem returns over time to the decision maker (agent). We relate time-dependent utility to the TiMDP through definitions and examples, which can be useful for solving, and better approximating, decision problems that occur in practical real-world settings.

Single-dependent-agent truck dispatching modeling

We developed an approximation for the multi-agent problem (truck dispatching) occurring in an example mine (JAOUA; GAMACHE; RIOPEL, 2009), in which the state models are built for each truck and are mutually dependent through a specific common state (the queueing state). We named this approximation single-dependent-agent; it reduces the size of the state space, making possible a fast approximate solution for truck dispatching using a TiMDP model.

Real-time truck dispatching using TiMDP policies

The single-dependent-agent state representation that we developed is used by the TiMDP model in the real-time truck dispatching simulation. The truck assignments (to which shovel the truck must travel) are made in real time using the corrected value functions given by the TiMDP solution (policies). The TiMDP is solved before the simulation, which makes the assignment decisions extremely fast.

Real-time truck dispatching using a GA

We used SimEvents™ (a package of Matlab™) to simulate truck dispatching in the example mine during a 10-hour shift, and developed a novel technique that uses a GA in real time for the shovel-truck assignments. This optimization algorithm is very fast and seems to be suitable for real-time applications. Because of the uncertain parameters present in the truck dispatching model, the sequence of trucks requesting dispatch until the end of the shift cannot be predicted. Therefore, the algorithm is executed many times during the shift, seeking to improve the result (maximization of tonnage production).

Real-time truck dispatching using G-TiMDP

The truck decisions given by the TiMDP policies are in general adequate, but are degraded by the approximation made by the single-dependent-agent model. Therefore, we developed a novel technique, named G-TiMDP, that uses the corrected value functions given by the TiMDP to feed the fitness function of the previously developed GA technique. We performed Monte Carlo simulations using this technique and assessed its superiority by comparing the results by means of Student's t-test.
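For readers unfamiliar with the comparison, the following minimal sketch computes the classical two-sample Student's t statistic with pooled variance. It assumes two independent samples with equal variances; the function name and the sample values are invented for illustration and are not results from this thesis.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample_a, sample_b):
    """Two-sample Student's t statistic (equal-variance assumption).

    Returns the t statistic and the degrees of freedom (n_a + n_b - 2).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    dof = n_a + n_b - 2
    # Pooled estimate of the common variance, weighted by each sample's degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / dof
    std_err = sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, dof


if __name__ == "__main__":
    # Hypothetical per-run production figures for two dispatching methods.
    method_a = [12.0, 14.0, 16.0]
    method_b = [9.0, 11.0, 13.0]
    print(pooled_t_statistic(method_a, method_b))
```

The resulting statistic would then be compared against the critical value of the t distribution for the returned degrees of freedom at the chosen significance level.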

1.4 Thesis Outline

Chapter 2 presents an introduction to the TiMDP, which is the main model used in the development of our truck dispatching algorithms. We present the state of the art and current research on TiMDPs, and propose a solution method that can be used in various discrete applications. We also solve an example using this method and propose the use of TiMDP models for problems with time-dependent utilities.

The truck dispatching problem in open-pit mining is presented in Chapter 3. In order to position the complexity and details of the truck dispatching problem, we first review the general vehicle dispatching problem with some variants and applied solution methods. We then present the specificities of the truck dispatching problem, such as the equipment involved, specific goals, and the dispatching strategies used in real-world truck dispatching problems. These dispatching strategies are the basis for the solution methods developed in the next chapter.

In Chapter 4 we present real-time truck dispatching methods for open-pit mines operating under a production policy (maximizing the tonnage production over the shift). We model this problem by using the concept of single-dependent-agents for TiMDPs and G-TiMDPs. We also present additional techniques for truck dispatching that are used in further analysis, namely: the greedy heuristic, the MTCT (Minimizing Truck Cycle Time) heuristic, and GAs.

Chapter 5 presents simulation results and analysis of the developed dispatching methods. This includes details of the Monte Carlo simulations and a comparison of results using the Student's t-test. Some analysis on how to improve the methods is also presented there.

The conclusions of the work are presented in Chapter 6. We also present some recent

trends in MDP modeling and make propositions for future work.

For reference, we also present overviews of Genetic Algorithms in Appendix A and

relevant statistical distributions in Appendix B.

2 Time Dependence in Decision Processes

Consider the following problem: an accident has occurred and three people are injured. There is only one doctor (agent), who can give medical care to only one person at a time, and the lives of the injured depend on that care. What should the doctor do? Consider also that the injury level can be different for each person and that there are uncertainties about survival after medical care. Analyzing these parameters, it is clear that the right decision on the attendance sequence can maximize the probability of saving lives. Decision theory is often claimed to be the right framework for producing the most rational choice (PARSONS; WOOLDRIDGE, 2002), and it can be the basic theory for solving this practical and common sequential decision problem.

In fact, sequential decision problems have been tackled very intensively in the last few

years, and it is well known that the theoretical framework based on MDPs is the best way

to model and solve them, giving optimal results in many cases (BOUTILIER; DEAN;

HANKS, 1999). However, real-world problems have an additional and specific parameter,

which is time dependency. MDP theory only considers fixed time steps between epochs

that can be easily understood and modeled as iteration steps. To avoid this limitation,

Semi-MDPs (SMDPs) (SUTTON; PRECUP; SINGH, 2000) and, more recently, Time-dependent MDPs (TiMDPs) have been proposed¹. In those models, the transition between states is not instantaneous, but instead takes a specific time t (durative action). In a TiMDP, time is observable, so the agent can wait for the best moment to make the decision (or execute the action in the current state). In an SMDP, the problem can be modeled over an infinite time horizon and the durative action has a probability distribution over its duration; the agent cannot decide to wait for the best moment to execute the action. A TiMDP also has time-dependent likelihood functions that give the likelihood of each action outcome at the current time, and it always models finite-horizon problems (the decisions are made between a starting and an ending clock mark).

In the TiMDP model, the rewards related to the action outcomes can also be represented as time-dependent functions. In the accident scenario, the remaining lifetime of a person, defined as a utility for decision problems (RUSSELL; NORVIG, 2009), decreases over time and can be formally understood as a time-dependent utility (HORVITZ; RUTLEDGE, 1991). This problem can be modeled as a TiMDP, in which time-dependent utilities are directly represented by time-dependent rewards. This is only one application that can be modeled as a TiMDP. Other instances, such as vehicle routing and scheduling problems with time window constraints (SOLOMON, 1987; ICHOUA; GENDREAU; POTVIN, 2003; JI, 2005), can also be modeled as TiMDPs.

The following sections present an introduction to MDPs and TiMDPs as the technical basis for modeling the problem considered in this thesis, namely truck dispatching in an open-pit mine.

¹ We use the TiMDP representation introduced by Rachelson, Fabiani and Garcia (2009b) instead of the original one, TMDP, introduced by Boyan and Littman (2000), to avoid confusion with other representations such as tree-structured MDPs (LENGYEL; DAYAN, 2007).


2.1 Markov Decision Processes

2.1.1 MDP formulation

The Markov Decision Process (MDP) (PUTERMAN, 1994; BERTSEKAS, 1987; PELLEGRINI; WAINER, 2008) is a technique for modeling stochastic systems in which the transitions between states are probabilistic, the states are observable, and it is possible to interfere with the system dynamics through actions that produce state changes and rewards. A process is Markovian if it follows the Markov Property: the effect of an action depends only on the action itself and on the current state of the system. The decision aspect lies in the fact that the agent can periodically make decisions on the system, using actions.

Formally, a MDP is a tuple (S, A, T, R) as follows:

• S is the set of possible states of the system;

• A is a set of actions that can be executed in different decision epochs;

• T : S × A × S → [0, 1] is a probability function giving the probability of the system changing to state s′ ∈ S from state s ∈ S under agent action a ∈ A, denoted by T(s′|s, a); and

• R : S × A → ℝ is the reward for taking decision a ∈ A when the system is in state s ∈ S.

Considering that the system is at some state s in a given decision epoch k, it is necessary to select which action a must be executed. The action is selected following a decision rule, and the mapping from states to actions given by the decision rules is the policy (π). Given a policy, we can calculate the expected utility (or the expected total reward) of the resulting action sequence. The expected total reward, considering the immediate rewards r_k and a finite horizon z, is

\[ E\left[\sum_{k=0}^{z-1} r_k\right]. \tag{2.1} \]

We can also define the discounted expected reward for finite horizon z,

\[ E\left[\sum_{k=0}^{z-1} \gamma^{k} r_k\right], \tag{2.2} \]

which uses a discount factor γ ∈ ]0, 1[ to ensure a bounded value for the expected total reward in the case of infinite horizon:

\[ E\left[\lim_{z\to\infty} \sum_{k=0}^{z-1} \gamma^{k} r_k\right]. \tag{2.3} \]

The importance of decisions taken in future epochs is governed by the discount factor

γ; a value zero gives no importance to future rewards (greedy behavior), whereas a value

one gives no discounts in the cumulative expected reward.
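To make the role of γ concrete, the sum inside the expectation of Eq. 2.2 can be evaluated for a single sampled reward sequence. The following Python sketch is illustrative only: the function name `discounted_return` and the reward values are assumptions, and the expectation over trajectories is not computed.

```python
def discounted_return(rewards, gamma):
    """Return sum_k gamma**k * rewards[k] for one finite reward sequence."""
    total = 0.0
    for k, r in enumerate(rewards):
        total += (gamma ** k) * r
    return total


# Hypothetical sample of immediate rewards r_0 .. r_3 (not taken from the thesis).
rewards = [10.0, 5.0, 5.0, 7.0]

greedy = discounted_return(rewards, 0.0)  # limiting case: only r_0 counts
plain = discounted_return(rewards, 1.0)   # limiting case: undiscounted sum
half = discounted_return([1.0, 1.0, 1.0], 0.5)
```

With γ = 0 only the first reward survives, and with γ = 1 the result is the plain sum of Eq. 2.1; intermediate values weight later rewards geometrically less.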

A policy is optimal (π∗) when the expected total reward for any state is maximized.

The value function V ∗(s) gives the optimal expected total reward value for the optimal

policy π∗:

\[ V^{*}(s) = \max_{a\in A}\left[ R(s,a) + \gamma \sum_{s'\in S} T(s'|s,a)\,V^{*}(s') \right]. \tag{2.4} \]

The action-value function Qπ(s, a), for a given policy π, gives the value of action a in

state s, considering the immediate reward from the execution of a in s and the expected


total reward thereafter:

\[ Q^{\pi}(s,a) = R(s,a) + \gamma \sum_{s'\in S} T(s'|s,a)\,V^{\pi}(s'). \tag{2.5} \]

For an optimal policy π∗, we can define Q∗(s, a):

\[ Q^{*}(s,a) = R(s,a) + \gamma \sum_{s'\in S} T(s'|s,a)\,V^{*}(s'). \tag{2.6} \]

The optimal policy π∗ produces the optimal actions that return the maximum Q values for each state s:

\[ \pi^{*}(s) = \arg\max_{a\in A} Q^{*}(s,a). \tag{2.7} \]

Notice that V∗(s) can also be represented based on the maximum Q value in the state s:

\[ V^{*}(s) = \max_{a\in A} Q^{*}(s,a). \tag{2.8} \]

2.1.2 MDP solution

The solution of a MDP is an optimal policy π∗ that produces the value function V ∗(s)

for all states. A successive approximation algorithm to solve a MDP, called Value Iteration

(ALG. 1), was presented by Bellman (1966).

The stopping criterion of ALG. 1 for an error ε is defined by the so-called Bellman Error:

\[ \forall s \in S,\quad |V(s) - V'(s)| \le \frac{\varepsilon(1-\gamma)}{2\gamma}, \tag{2.9} \]

where V and V′ denote the value functions of two successive iterations.

Algorithm 1: Value iteration

Input: MDP(S, A, T, R)
Output: V∗
foreach s ∈ S do
    V0(s) ← max_{a∈A} R(s, a);
end
i ← 1;
while stop criterion not satisfied do
    foreach s ∈ S do
        Vi(s) = max_{a∈A} [R(s, a) + γ Σ_{s′∈S} T(s′|s, a) Vi−1(s′)];
    end
    i ← i + 1;
end
return V;
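As an illustration of ALG. 1 and the stopping rule of Eq. 2.9, the following Python sketch runs value iteration on the 3-state MDP of Section 2.1.3 (data from Tables 2.1 and 2.2). The dictionary layout and the function names are assumptions made for this sketch, not the implementation used in the thesis.

```python
# Example MDP of Section 2.1.3.
# T[s][a] lists (next_state, probability); R[s][a] is the immediate reward.
T = {
    "s1": {"a1": [("s2", 1.0)]},
    "s2": {"a2": [("s1", 0.9), ("s2", 0.1)], "a3": [("s3", 1.0)]},
    "s3": {"a4": [("s1", 0.4), ("s2", 0.6)]},
}
R = {
    "s1": {"a1": 10.0},
    "s2": {"a2": 5.0, "a3": 7.0},
    "s3": {"a4": 4.0},
}


def q_value(s, a, V, gamma):
    """Q(s, a) = R(s, a) + gamma * sum_s' T(s'|s, a) V(s')."""
    return R[s][a] + gamma * sum(p * V[s2] for s2, p in T[s][a])


def value_iteration(gamma, eps=1e-3):
    # V_0(s) = max_a R(s, a), as in ALG. 1.
    V = {s: max(R[s].values()) for s in T}
    threshold = eps * (1.0 - gamma) / (2.0 * gamma)  # right-hand side of Eq. 2.9
    while True:
        V_new = {s: max(q_value(s, a, V, gamma) for a in T[s]) for s in T}
        done = all(abs(V_new[s] - V[s]) <= threshold for s in T)
        V = V_new
        if done:
            break
    # Greedy policy with respect to the final value function (Eq. 2.7).
    policy = {s: max(T[s], key=lambda a: q_value(s, a, V, gamma)) for s in T}
    return V, policy


V09, pi09 = value_iteration(0.9)
V03, pi03 = value_iteration(0.3)
```

Rounded to one decimal place, `V09` and `V03` agree with the values reported in Tables 2.3 and 2.5, and the two policies differ only in state s2 (a2 for γ = 0.9, a3 for γ = 0.3).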

The policy iteration algorithm, which is more efficient than value iteration in the sense that it converges in fewer iterations, was proposed by Howard (1960). This algorithm (ALG. 2) alternates between a value determination step (evaluation of the current policy) and a policy improvement step (improvement of the current policy).

Algorithm 2: Policy iteration

Input: MDP(S, A, T, R)
Output: π∗
Initialize π randomly;
repeat
    π′ ← π;
    ∀s ∈ S, V(s) = R(s, π′(s)) + γ Σ_{s′∈S} T(s′|s, π′(s)) V(s′);
    foreach s ∈ S do
        ∀a ∈ A, Qπ′(s, a) ← R(s, a) + γ Σ_{s′∈S} T(s′|s, a) V(s′);
    end
    foreach s ∈ S do
        π(s) ← arg max_{a∈A} Qπ′(s, a);
    end
until π = π′;
return π;
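The two alternating steps of ALG. 2 can be sketched in Python as follows, again on the example MDP of Section 2.1.3. Two choices here are assumptions of the sketch rather than part of ALG. 2: the initial policy is the first available action in each state (not a random one), and the value determination step is solved by repeated sweeps instead of a direct linear solve.

```python
# Example MDP of Section 2.1.3 (Tables 2.1 and 2.2).
T = {
    "s1": {"a1": [("s2", 1.0)]},
    "s2": {"a2": [("s1", 0.9), ("s2", 0.1)], "a3": [("s3", 1.0)]},
    "s3": {"a4": [("s1", 0.4), ("s2", 0.6)]},
}
R = {
    "s1": {"a1": 10.0},
    "s2": {"a2": 5.0, "a3": 7.0},
    "s3": {"a4": 4.0},
}


def q_value(s, a, V, gamma):
    return R[s][a] + gamma * sum(p * V[s2] for s2, p in T[s][a])


def evaluate(policy, gamma, tol=1e-10):
    """Value determination: iterate V(s) = Q(s, policy(s)) until it stops changing."""
    V = {s: 0.0 for s in T}
    while True:
        delta = 0.0
        for s in T:
            v = q_value(s, policy[s], V, gamma)
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < tol:
            return V


def policy_iteration(gamma):
    policy = {s: sorted(T[s])[0] for s in T}  # deterministic start (assumption)
    while True:
        old = dict(policy)
        V = evaluate(old, gamma)
        # Policy improvement: act greedily with respect to Q under the old policy.
        for s in T:
            policy[s] = max(sorted(T[s]), key=lambda a: q_value(s, a, V, gamma))
        if policy == old:
            return policy, V


pi09, V09 = policy_iteration(0.9)
pi03, V03 = policy_iteration(0.3)
```

On this small example the procedure stops after very few improvement rounds and returns the same policies as value iteration.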


FIGURE 2.1 – Example of a MDP with 3 states.

TABLE 2.1 – Transition Probabilities

      a1        a2                      a3        a4
s1    (s2, 1)   -                       -         -
s2    -         (s1, 0.9); (s2, 0.1)    (s3, 1)   -
s3    -         -                       -         (s1, 0.4); (s2, 0.6)

2.1.3 A MDP example

A 3-state MDP is presented in FIG. 2.1. Tables 2.1 and 2.2 present the transition

probabilities and rewards, respectively, for this example.

TABLE 2.2 – Reward

      a1    a2    a3    a4
s1    10    -     -     -
s2    -     5     7     -
s3    -     -     -     4

We solve this MDP example using Value Iteration (ALG. 1) and, to demonstrate the method, we present the first iterations for γ = 0.9:

\[
\begin{aligned}
V_0(s_1) &= 10\\
V_0(s_2) &= 7\\
V_0(s_3) &= 4
\end{aligned}
\tag{2.10}
\]

\[
\begin{aligned}
Q_1(s_1,a_1) &= R(s_1,a_1) + 0.9\,[T(s_2|s_1,a_1)V_0(s_2)] = 16.3\\
V_1(s_1) &= Q_1(s_1,a_1) = 16.3\\
Q_1(s_2,a_2) &= R(s_2,a_2) + 0.9\,[T(s_1|s_2,a_2)V_0(s_1) + T(s_2|s_2,a_2)V_0(s_2)] = 13.7\\
Q_1(s_2,a_3) &= R(s_2,a_3) + 0.9\,[T(s_3|s_2,a_3)V_0(s_3)] = 10.6\\
V_1(s_2) &= \max[Q_1(s_2,a_2),\,Q_1(s_2,a_3)] = 13.7\\
Q_1(s_3,a_4) &= R(s_3,a_4) + 0.9\,[T(s_1|s_3,a_4)V_0(s_1) + T(s_2|s_3,a_4)V_0(s_2)] = 11.4\\
V_1(s_3) &= Q_1(s_3,a_4) = 11.4
\end{aligned}
\tag{2.11}
\]

\[
\begin{aligned}
Q_2(s_1,a_1) &= R(s_1,a_1) + 0.9\,[T(s_2|s_1,a_1)V_1(s_2)] = 22.3\\
V_2(s_1) &= Q_2(s_1,a_1) = 22.3\\
Q_2(s_2,a_2) &= R(s_2,a_2) + 0.9\,[T(s_1|s_2,a_2)V_1(s_1) + T(s_2|s_2,a_2)V_1(s_2)] = 19.4\\
Q_2(s_2,a_3) &= R(s_2,a_3) + 0.9\,[T(s_3|s_2,a_3)V_1(s_3)] = 17.3\\
V_2(s_2) &= \max[Q_2(s_2,a_2),\,Q_2(s_2,a_3)] = 19.4\\
Q_2(s_3,a_4) &= R(s_3,a_4) + 0.9\,[T(s_1|s_3,a_4)V_1(s_1) + T(s_2|s_3,a_4)V_1(s_2)] = 17.3\\
V_2(s_3) &= Q_2(s_3,a_4) = 17.3
\end{aligned}
\tag{2.12}
\]

The convergence of the iterations (ε = 0.001) is shown in FIGS. 2.2a and 2.2b for γ = 0.9 and γ = 0.3, respectively.

FIGURE 2.2 – Value Iteration for (a) γ = 0.9 and (b) γ = 0.3. Both panels plot V(s) against the iteration number for s1, s2 and s3.

The final results for ε = 0.001 are presented in Tables 2.3 and 2.4 for γ = 0.9, and in Tables 2.5 and 2.6 for γ = 0.3.

TABLE 2.3 – MDP Solution (γ = 0.9)

State   V(s)    Policy (action)
1       75.1    a1
2       72.4    a2
3       70.1    a4

TABLE 2.4 – Q(s, a) (γ = 0.9)

      a1      a2      a3      a4
s1    75.1    -       -       -
s2    -       72.4    70.1    -
s3    -       -       -       70.1

TABLE 2.5 – MDP Solution (γ = 0.3)

State   V(s)    Policy (action)
1       12.7    a1
2       9.2     a3
3       7.2     a4

TABLE 2.6 – Q(s, a) (γ = 0.3)

      a1      a2      a3      a4
s1    12.7    -       -       -
s2    -       8.7     9.2     -
s3    -       -       -       7.2

We can note the difference between the policies for state 2 in the two solutions. In the case of γ = 0.3, the agent gives less importance to future states and tends to execute the action that returns the highest immediate reward. In the other case (γ = 0.9), the agent gives more weight to the future rewards given the transition probabilities.
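As a cross-check of the γ = 0.9 solution, and assuming the policy (a1, a2, a4) of Table 2.3 is held fixed, Eq. 2.4 reduces to a small linear system that can be solved by hand with the data of Tables 2.1 and 2.2. The derivation below is a worked sketch added for illustration; the values are rounded to one decimal place.

```latex
\begin{aligned}
V(s_1) &= 10 + 0.9\,V(s_2),\\
V(s_2) &= 5 + 0.9\,[\,0.9\,V(s_1) + 0.1\,V(s_2)\,],\\
V(s_3) &= 4 + 0.9\,[\,0.4\,V(s_1) + 0.6\,V(s_2)\,].
\end{aligned}
% Substituting the first equation into the second:
%   V(s_2) = 13.1 + 0.819\,V(s_2)  =>  0.181\,V(s_2) = 13.1
\begin{aligned}
V(s_2) &= 13.1 / 0.181 \approx 72.4,\\
V(s_1) &= 10 + 0.9 \times 72.4 \approx 75.1,\\
V(s_3) &= 4 + 0.9\,[\,0.4 \times 75.1 + 0.6 \times 72.4\,] \approx 70.1,\\
Q(s_2,a_3) &= 7 + 0.9\,V(s_3) \approx 70.1 < V(s_2).
\end{aligned}
```

The last line confirms that, under this policy, switching to a3 in state s2 would not improve the value, so a2 remains the greedy choice for γ = 0.9.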

2.2 Time-dependent Markov Decision Processes

Time-dependent MDPs (TiMDPs) were first proposed by Boyan and Littman (2000)

to model and solve sequential decision problems with the following attributes:

• Stochastic state transitions; and

• Stochastic time-dependent action durations.

Formally, a TiMDP consists of the following components:


• S: discrete state space;

• A: discrete action space;

• M: discrete set of outcomes, each of the form µ = ⟨s′µ, Tµ, Pµ⟩, where:

    s′µ ∈ S: the resulting state
    Tµ ∈ {ABS, REL}: specifies the type of the resulting time distribution (absolute or relative)
    Pµ(t′) (if Tµ = ABS): pdf over absolute arrival times of µ
    Pµ(δ) (if Tµ = REL): pdf over durations of µ

• L: L(µ|s, t, a) is the likelihood of outcome µ given state s, time t and action a; and

• R: R(µ, t, δ) is the reward for the outcome µ at time t with duration δ.

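The TiMDP model is represented by the following Bellman equations²:

\[
\begin{aligned}
V(s,t) &= \max_{a\in A} Q(s,t,a)\\
Q(s,t,a) &= \sum_{\mu\in M} L(\mu|s,a,t)\cdot U(\mu,t)\\
U(\mu,t) &=
\begin{cases}
\displaystyle\int_{-\infty}^{\infty} P_\mu(t')\,[R(\mu,t,t'-t) + V(s'_\mu,t')]\,dt' & (\text{if } T_\mu = \mathrm{ABS})\\[2ex]
\displaystyle\int_{-\infty}^{\infty} P_\mu(t'-t)\,[R(\mu,t,t'-t) + V(s'_\mu,t')]\,dt' & (\text{if } T_\mu = \mathrm{REL})
\end{cases}
\end{aligned}
\tag{2.13}
\]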

where U(µ, t) is the utility of outcome µ in time t, V (s, t) is the time-value function for

the immediate action, and Q(s, t, a) is the expected Q time-value over outcomes.

We can note that the calculations of U(µ, t) are convolutions of the result-time pdf Pµ with the lookahead value R + V. The likelihood function L represents the probability of an outcome occurring for action a at time t, and can be used to model problems with time windows (BRESINA et al., 2002).

This model is used to solve time-dependent problems with finite time horizon and represents an undiscounted continuous-time MDP.

² Equation 2.13 differs from the original one defined in Boyan and Littman (2000) in not having dawdling, that is, the agent does not receive a reward for waiting in a state. Several works, such as Li and Littman (2005) and Marecki, Topol and Tambe (2006), use the same formulation proposed herein.

2.2.1 Discrete solution for relative time distributions by backwards convolution

In the general TiMDP model (BOYAN; LITTMAN, 2000), the time-value functions for each state can be arbitrarily complex and therefore impossible to represent exactly. The TiMDP problem is solved by representing R and V as piecewise linear (PWL) functions, L as a piecewise constant (PWC) function, and Pµ as a discrete distribution. This representation ensures closure under the convolutions and avoids an increased number of iterations. This solution is fast and exact (for the approximated functions), but it has the following drawbacks: loss of information caused by the initial approximations, insertion of new breakpoints in the piecewise functions over the iterations, and the need for an analytic solution of the convolution integral.

Li and Littman (2005) explored the practical solution of value iteration when Pµ is itself a PWC function. In that case, the degree of the convolved functions would grow over the iterations, making a solution in reasonable time impossible. To prevent this behavior, Li and Littman (2005) introduced the Lazy Approximation Algorithm, in which the PWL function resulting from the convolution is approximated by a PWC function on each iteration. Hence, the imprecision and state-space augmentation introduced by the discretization of Pµ are avoided in this solution method.

In a recent work by Rachelson, Fabiani and Garcia (2009a), the functions of the TiMDP model are represented by piecewise polynomial (PWP) functions. In order to limit the growth in degree of the iteration results, the algorithm executes, when needed, a degree-reduction step, which lowers the degree of the results in the current iteration by PWP interpolation.

In order to simplify the solution algorithm and focus on the proposed dispatching problem, we propose discretizing all functions involved in the model and solving the convolutions by a discrete numerical method. This approximation does not provide a solution as fast as the original one, but it is an easier and more direct way to solve problems with few states. The only difficulty is that the convolution present in the TiMDP model is not a conventional convolution integral. A conventional convolution integral can be represented by:

\[ h(t) = \int_{-\infty}^{\infty} g(t')\,k(t - t')\,dt'. \tag{2.14} \]

The discrete formulation of a convolution is

\[ h(j) = k(j) * g(j) = \sum_{i} g(i)\,k(j - i). \tag{2.15} \]

This convolution involves a delay represented by the k function over the g function.

However, in the TiMDP there is a negative delay, and the convolution integral is now,

\[ h(t) = \int_{-\infty}^{\infty} g(t')\,k(t' - t)\,dt'. \tag{2.16} \]

We characterize it as a backwards convolution, and its discrete solution is

\[ h(j) = k(j) \bullet g(j) = \sum_{i} g(i)\,k(i - j) = \sum_{i} k(i)\,g(j + i). \tag{2.17} \]
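A minimal Python sketch of the discrete backwards convolution follows. Discretizing Eq. 2.16 with t → j and t′ − t → i gives h(j) = Σ_i k(i) g(j + i), which is the form implemented here. The function name `backward_convolve`, the unit time grid, and the convention that g is zero beyond its last grid point are assumptions of this sketch.

```python
def backward_convolve(k, g):
    """h[j] = sum_i k[i] * g[j + i], with g taken as 0 beyond its last index.

    k: list where k[i] is the probability that the action lasts i time steps.
    g: list where g[j] is the lookahead value (R + V) at time step j.
    """
    n = len(g)
    h = [0.0] * n
    for j in range(n):
        for i, p in enumerate(k):
            if j + i < n:
                h[j] += p * g[j + i]
    return h


g = [5.0, 4.0, 3.0, 2.0, 1.0, 0.0]
h_shift = backward_convolve([0.0, 1.0], g)       # action always lasts 1 step
h_mixed = backward_convolve([0.0, 0.5, 0.5], g)  # lasts 1 or 2 steps, equally likely
```

A deterministic one-step duration simply reads g one step ahead, which is the "negative delay" mentioned above; a spread-out duration averages the lookahead values over the possible arrival times.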

So, using our solution method, the time-value function V for relative Pµ is

\[ V(s,t) = \max_{a\in A} \sum_{\mu\in M} L(\mu|s,a,t)\cdot P_\mu(t) \bullet [R(\mu,t) + V(s'_\mu,t)]. \tag{2.18} \]

For discretized problems with absolute time distributions, the integral of Eq. 2.13 can

be solved by numerical methods such as the Newton-Cotes Rule (THISTED, 1988).

2.2.2 A TiMDP example

The example presented in FIG. 2.3 is a good starting point for understanding value iteration in TiMDPs. The problem is composed of two states, one action per state, constant rewards (R) over time t, and a unitary likelihood function (L) over the whole time horizon. In this case, at State 1 the agent receives reward R1 = 1 after one time period (the action is durative and takes exactly one time period), going to State 2. In State 2, the agent receives a reward R2 = 2 after two time periods. The rewards can be accumulated until the end of the time horizon.

The system starts with the time-value function V equal to zero for both states. Then, the problem is solved by value iteration using the Bellman equations (Eq. 2.18) with our approximation presented in Section 2.2.1.

FIGURE 2.3 – TiMDP example solved step-by-step by value iteration. The panels show the duration pdfs p1 and p2, the rewards R1 and R2, and the time-value functions V1 and V2 over t for Iterations 1 to 6.

The value iteration process converges at the sixth iteration, and the solution of V gives important information for agent decision making. For example, when the agent is at State 2 at time 2, it knows that it can receive an accumulated reward of 6 units by following the policy. In this case, it can wait until time 3 and receive the same accumulated reward. So, for TiMDPs, policies depend on both the state and the current time.
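The two-state example can be reproduced with a short Python sketch of Eq. 2.18. Several details are assumptions, because they cannot be read from the text alone: time runs over the integer steps 0 to H with H = 9, an action is rewarded only if it finishes by H, the action of State 2 leads back to State 1, and the names `OUTCOMES`, `backward_convolve` and `sweep` are hypothetical.

```python
H = 9  # assumed last time step of the horizon

# One action per state: state -> (next_state, reward, duration pmf).
OUTCOMES = {
    1: (2, 1.0, [0.0, 1.0]),       # R1 = 1 after exactly 1 period, go to State 2
    2: (1, 2.0, [0.0, 0.0, 1.0]),  # R2 = 2 after exactly 2 periods, back to State 1
}


def backward_convolve(k, g):
    """h[j] = sum_i k[i] * g[j + i]; outcomes ending after the horizon earn nothing."""
    n = len(g)
    return [
        sum(p * g[j + i] for i, p in enumerate(k) if j + i < n)
        for j in range(n)
    ]


def sweep(V):
    """One synchronous sweep of Eq. 2.18 with L = 1 and a single action per state."""
    new = {}
    for s, (s_next, reward, pmf) in OUTCOMES.items():
        lookahead = [reward + v for v in V[s_next]]  # R + V(s', t')
        new[s] = backward_convolve(pmf, lookahead)
    return new


V = {s: [0.0] * (H + 1) for s in OUTCOMES}  # V starts at zero for both states
iterations = 0
while True:
    V_next = sweep(V)
    if V_next == V:  # no change: the previous sweep was the last useful one
        break
    V, iterations = V_next, iterations + 1
```

Under these assumptions the time-value function stops changing after six sweeps, and `V[2][2]` and `V[2][3]` are both 6, in line with the observation that the agent in State 2 loses nothing by waiting from time 2 to time 3.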

2.3 Time-dependent utilities

An agent needs a measure with which to select the best option (that is, to make a decision) among the alternatives. This measure is the value of the utility function (LI; SOH, 2004). In decision-theoretic planning, this value is also called the value function (accumulated rewards in sequential decision making) (BOUTILIER; DEAN; HANKS, 1999). The expected utility (EU) can be calculated for problems with nondeterministic actions (RUSSELL; NORVIG, 2009):

\[ EU(A|E) = \sum_{i} P(Result_i \,|\, E, Do(A))\; U(Result_i(A)), \tag{2.19} \]

where Resulti(A) are the possible outcome states for a nondeterministic action A, E

summarizes the agent’s available evidence about the world, and Do(A) is a proposition

informing that action A is executed in the current state.
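Eq. 2.19 is a probability-weighted sum, which the following Python sketch evaluates for two candidate actions. The action names, probabilities and utilities are hypothetical values chosen only to illustrate the calculation.

```python
def expected_utility(outcomes):
    """EU = sum_i P(Result_i | E, Do(A)) * U(Result_i(A)).

    outcomes: list of (probability, utility) pairs for one action A.
    """
    total_p = sum(p for p, _ in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * u for p, u in outcomes)


# Hypothetical actions with uncertain results (not taken from the thesis).
actions = {
    "A1": [(0.8, 10.0), (0.2, 0.0)],
    "A2": [(0.5, 12.0), (0.5, 6.0)],
}
eu = {name: expected_utility(o) for name, o in actions.items()}
best = max(eu, key=eu.get)  # action with the highest expected utility
```

Here A2 is preferred even though A1 has the more likely good outcome, because its weighted sum is larger.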

This common utility representation may not be adequate in complex real problems, in which the actions to be executed are durative and have priorities. Often, it is necessary to solve more urgent tasks first and to leave others waiting (BASTOS; RIBEIRO; SOUZA, 2008). To address this issue, time-dependent utility theory (HORVITZ; RUTLEDGE, 1991) can be used. In this theory, the utility is a function of time, greater than zero, and can be increasing or decreasing.

2.3.1 Decreasing time-dependent utility function

Decreasing functions can be used to represent a task lifespan and give the decision maker some idea of priorities. For example, suppose there are two injured people who must receive medical care from the only doctor present at the scene. They have different injury levels and will die if they do not receive medical care as soon as possible. So, the doctor needs to make the right decision in the attempt to save both lives, choosing which person to attend first. For this simple example, the decision could be made easily if the doctor had a time-dependent utility function representing the importance of a person's life (that is, the death risk) at the current time. This function must map important information such as age, life decreasing rate, injury level and so forth, to a utility value (this mapping is not the focus of this work, and it is assumed to be known by the decision maker). Therefore, the right decision for the doctor is the one that executes the right attendance sequence, considering durative actions, without any utility function reaching a zero value (death).

The decreasing time-dependent utility function can be represented by any decreasing

function, but for functionality and simplicity we use exponential or linear functions for

its representation:

\[
\begin{aligned}
U(A,t) &= U(A,t_o)\cdot e^{-k_1 \cdot t}\\
U(A,t) &= U(A,t_o) - k_2 \cdot t, \qquad U(A,t) \ge 0
\end{aligned}
\tag{2.20}
\]

where U(A,t) is the utility for choosing action A at time t, to is the initial time, and k1

and k2 are parameters for adjusting the exponential and linear functions, respectively, for

the problem requirements.
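The two forms of Eq. 2.20 can be sketched in Python as below, with t measured from the initial time. The patient data and the rule of ranking tasks by the time at which the linear utility reaches zero are hypothetical and serve only to show how a decreasing utility conveys priority.

```python
import math


def exp_decreasing_utility(u0, k1, t):
    """Exponential form of Eq. 2.20: U(A, t) = U(A, t_o) * exp(-k1 * t)."""
    return u0 * math.exp(-k1 * t)


def lin_decreasing_utility(u0, k2, t):
    """Linear form of Eq. 2.20, clipped so that U(A, t) >= 0."""
    return max(0.0, u0 - k2 * t)


# Hypothetical patients: (initial utility, linear decay rate per minute).
patients = {"p1": (100.0, 2.0), "p2": (60.0, 3.0)}

# Time at which each linear utility reaches zero (the task "expires").
deadline = {name: u0 / k2 for name, (u0, k2) in patients.items()}
most_urgent = min(deadline, key=deadline.get)
```

In this made-up case the second patient starts with a lower utility but loses it faster, so the utility reaches zero sooner and that task is the more urgent one.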

2.3.2 Increasing time-dependent utility function

Increasing functions can be used, for instance, to represent profits over time. For example, it is sometimes interesting to choose the task execution sequence based on greater rewards, as is the case for the vehicle refueling problem, in which the utility of the refueling state increases over time. Thus, as the fuel level decreases, the utility of refueling increases, and after a certain time, depending on the current position of the vehicle (distance from the fueling station), the refueling decision will be taken. Unlike the decreasing utility function, which has a minimum value (zero in most cases), in this case it is reasonable to assume a maximum value. For vehicle refueling in particular, it is important to agree upon a maximum utility value that corresponds to an empty tank. The utility model is


\[
\begin{aligned}
U(A,t) &= U_{max}\cdot(1 - e^{-k_3 \cdot t})\\
U(A,t) &= U(A,t_o) + k_4 \cdot t, \qquad U(A,t) \le U_{max}
\end{aligned}
\tag{2.21}
\]

where U(A, t) is the utility for choosing action A at time t, to is the initial time, Umax is

the maximum utility, and k3 and k4 are parameter constants for adjusting the exponential

and linear functions, respectively, for the problem requirements.
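A Python sketch of the increasing utility model follows, with the linear form capped at the maximum utility as the text suggests. The refueling threshold of 80% of Umax, the rate values and the helper `first_time_above` are assumptions introduced for illustration.

```python
import math


def exp_increasing_utility(u_max, k3, t):
    """Saturating form: U(A, t) = U_max * (1 - exp(-k3 * t))."""
    return u_max * (1.0 - math.exp(-k3 * t))


def lin_increasing_utility(u0, k4, t, u_max):
    """Linear form, capped at the maximum utility U_max."""
    return min(u_max, u0 + k4 * t)


def first_time_above(utility, threshold, horizon):
    """First integer time in [0, horizon] with utility(t) >= threshold, else None."""
    for t in range(horizon + 1):
        if utility(t) >= threshold:
            return t
    return None


# Hypothetical refueling rule: refuel once the utility reaches 80% of U_max = 100.
t_refuel = first_time_above(
    lambda t: lin_increasing_utility(0.0, 2.0, t, 100.0), 80.0, 100
)
```

With these numbers the refueling utility grows by 2 units per time step, so the threshold is reached at time 40; in the thesis setting the decision would also depend on the distance to the fueling station.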

2.3.3 A time-dependent utility example

In this section, we present a more complex sequential decision making example using

time-dependent utilities (or rewards varying over time) modeled and solved by a TiMDP.

The example is presented in FIG. 2.4. It has three states, two selectable actions per state,

a finite horizon with limit of 100 time periods, unitary likelihood function over all time

horizon, and deterministic action durations.

The problem was solved by value iteration using the Bellman equations with our approximations (Eq. 2.18). The results for the time-value functions V and Q are presented in FIGS. 2.5, 2.6 and 2.7.

In the plots, the time-value function V(State, t) is the maximum over the Q(State, Action, t) time functions.

This solution follows the same idea presented in section 2.2.2, with the difference that

the agent cannot wait in the state for the best decision making time. The solution is

hard to analyze (even for just three states and six actions), and it shows the need and

importance of TiMDP models for solving large time-dependent problems.

FIG. 2.8 shows the policies as a function of time. Such policies define the actions that the agent must choose based on the maximum Q value for a state at time t. For example, if the agent is at State 3 and the current time is 77, it must choose Action 6, therefore moving to State 2.

FIGURE 2.4 – Sequential decision making problem using time-dependent utility.

FIGURE 2.5 – Value function - V (1, t).

FIGURE 2.8 – Policies over time.

In the TiMDP model, actions can be durative and uncertain (represented by pdfs). We used a Normal distribution to represent P1 in our example, with mean 10 and variance 3. Normal distributions are very convenient for this kind of problem, in which the action is durative and its duration varies between executions. For real situations with a reliable database of past action durations, a Normal distribution is a good approximation for the action duration pdf, because the durations tend to cluster around a single mean value with the proper variance. The solution for State 1 is shown in FIG. 2.9.
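For a discretized solution, the Normal duration pdf has to be turned into a probability mass function on the time grid. One possible way is sketched below in Python; the unit grid, the binning over [d − 0.5, d + 0.5], the truncation to durations 0 to 20 and the renormalization are all assumptions of this sketch, since the text does not state how P1 was discretized.

```python
import math


def normal_cdf(x, mean, var):
    """Cumulative distribution function of N(mean, var)."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def discretized_normal(mean, var, max_duration):
    """pmf[d] approximates P(d - 0.5 < duration <= d + 0.5), renormalized to sum to 1."""
    pmf = [
        normal_cdf(d + 0.5, mean, var) - normal_cdf(d - 0.5, mean, var)
        for d in range(max_duration + 1)
    ]
    total = sum(pmf)
    return [p / total for p in pmf]


P1 = discretized_normal(mean=10.0, var=3.0, max_duration=20)
mode = max(range(len(P1)), key=P1.__getitem__)       # most likely duration
expected_duration = sum(d * p for d, p in enumerate(P1))
```

The resulting list can be used directly as the duration pmf in a discrete backwards convolution; most of the mass lies within a few time steps of the mean.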

Comparing this result with the original problem (FIG. 2.4), it is clear that the function Q(1, 1, t) becomes smoother. There is also a change in the shape of function Q(1, 2, t). In fact, these changes may alter the overall policies, because of the uncertainty in action durations introduced by their variances.

FIGURE 2.9 – Value function V (1, t), P1 = N(10, 3).

The policies are shown in FIG. 2.10.

Comparing with FIG. 2.8, we note a difference between the policies that is caused by the uncertainty added to the duration of Action 1. For example, the policy in State 1 at time 25 is now Action 2, against Action 1 in the original problem. The uncertainty added to the duration of the action that belongs to State 1 has also caused a change in the policy for State 3. Therefore, it is very important to model the pdfs correctly in order to avoid wrong decisions.


FIGURE 2.10 – Policies over time - P1 = N(10, 3).

3 Truck Dispatching in Open Pit Mines

Truck dispatching in open-pit mining consists of the transportation of material (mineral matter) during a shift by haul trucks from pickup stations (shovels) to delivery stations or dump points (crushers, waste dumps or stock piles). The mineral matter is composed of ore (the most valuable mineral product), leach (of marginal, but positive value), and waste (of no value) (KOLONJA; KALASKY; MUTMANSKY, 1993). A mine often operates different models of trucks and shovels (heterogeneous fleet), which have different truck speeds and capacities and different shovel digging rates.

At a truck driver's request, the dispatcher (or fleet manager) must decide in real time which shovel the truck must travel to (truck assignment), based on the current mine state and on a decision support system or on the dispatcher's own experience. These decisions are of crucial importance in the mining operation, given that material transportation is one of the most important aspects of open-pit mine operations, representing up to 60% of operating costs (ALARIE; GAMACHE, 2002). Due to its significance, several decision systems for this problem have been developed in the last few years, improving productivity and reducing operational costs.

In the following sections, similarities between truck dispatching and other vehicle dispatching systems are presented, and truck dispatching in open-pit mining is addressed in detail.

3.1 Vehicle dispatching problems

The truck dispatching problem does not occur only in mining; it can be found in any area that involves the management of a vehicle fleet. Some examples of vehicle dispatching problems are:

• Dynamic vehicle assignment problem (POWELL, 1988)

This is a common problem in the shipping industry. Given a request, the fleet manager must decide which truck will be sent to the ship for loading and subsequent delivery. After the delivery, if there are no more loads, the truck must be repositioned according to future loading demands.

• Dial-a-ride (GENDREAU; POTVIN, 1998)

This is a generalization of the dynamic vehicle assignment problem. During a day, a vehicle must pick up and deliver material (or people) at different locations. This problem can have capacity restrictions and soft time-window constraints. The objective is to perform all transportation at minimum cost.

• Automated Guided Vehicles (AGVs) in the manufacturing industry (CO; TANCHOCO, 1990)

AGVs, or mobile robots, transport material (raw material or finished product) on the shop floor of an automated plant. The transportation occurs between nearby locations and there are predefined robot waiting places to avoid queues in the processes.

Alarie and Gamache (2002) report that truck dispatching in open-pit mining seems to be a simplification of the other vehicle dispatching problems; however, it presents some characteristics that are not commonly reported in the literature:

• Mines are closed systems, that is, the pickup and delivery points remain the same and stay at the same position during a long period of time (generally, a shift of 8 to 12 hours);

• The trips are short compared with the length of the shift (10 to 25 min);

• The frequency of demands at each pickup point is high (every 3 to 5 minutes); and

• If the size of the fleet is too large, truck queues may appear.

Additionally, we cite the highly combinatorial nature of the problem, due to the several trucks typically working in a mine (the dispatching system must consider the position of all trucks when assigning one of them to a shovel, which is exemplified by values in the next chapter for our example mine model). In the simulated mines presented by Jaoua, Gamache and Riopel (2009), there are 15 trucks in a medium-scale mine (3 shovels and 2 dump points), and 60 trucks in a large-scale mine (10 shovels and 3 dump points). From a Computer Science perspective, each truck is an agent and the problem is modeled as a multi-agent system.


The number of trucks (fleet size) working in a mine is defined in a previous decision epoch by a specific optimization technique, which is not the focus of this work. Situations with more trucks than the optimal quantity (over-trucked) increase the length of queues at shovels, while fewer trucks (under-trucked) cause shovel underutilization. So, the results of our algorithm are strongly influenced by the number of trucks operating in a shift, which must be close enough to the optimal quantity.

3.2 Truck dispatching problem

Solving a truck dispatching problem in open-pit mining can mean maximizing tonnage production (productivity policy), minimizing equipment inactivity (truck waiting time and shovel idle time), or meeting Run of Mine (ROM) requirements (quality policy). In a mine, the ROM is the quality level of the ore, which can be a combination (balanced mean) of many mining fronts. Pinto (2007) developed a Fuzzy Algorithm to find a balanced result using both production and quality policies simultaneously. To obtain the best results, the problem is divided into two upper stages (KRAUSE; MUSINGWINI, 2007): (1) truck resource allocation or fleet size estimation, and (2) real-time truck dispatching.

The fleet size estimation, which is not the focus of this thesis, is a very important issue in the truck dispatching problem; over-trucked situations increase the length of queues at shovels, whereas under-trucking causes shovel underutilization (ALARIE; GAMACHE, 2002). The costs in an over-trucked mine are higher because higher truck utilization causes more maintenance stops and higher fuel consumption, whereas the production objectives will not be attained in an under-trucked mine. Due to its importance, this issue is tackled by many recent works in the mining literature. Brahma (2007) used Queueing Theory (GROSS, 2008) and Petri Nets (MURATA, 2002) to find the optimal number of trucks in the context of a shovel-dumper (haul truck) combination system; Krause and Musingwini (2007) used a modified Machine Repair Model for estimating the truck fleet size; Ta et al. (2005) used a chance-constrained stochastic optimization approach in heterogeneous truck fleet resource allocation, accommodating uncertain parameters such as truck load and cycle time; Huang et al. (2010) used a Genetic Algorithm to optimize the number of trucks in an open-pit mine, minimizing the cost of truck transportation and maintenance; and Souza et al. (2010) developed a hybrid metaheuristic algorithm (Greedy Randomized Adaptive Search Procedure and General Variable Neighborhood Search) to minimize the number of mining trucks used to meet production goals and quality requirements.

The real-time truck dispatching stage can be modeled by three strategies (ALARIE;

GAMACHE, 2002): (1) 1-truck-for-n-shovels, (2) m-trucks-for-1-shovel, and (3) m-trucks-for-n-shovels.

3.2.1 The 1-truck-for-n-shovels strategy

This is the most used strategy in the mining industry. Trucks are assigned one by one

to shovels (FIG. 3.1).

FIGURE 3.1 – 1-truck-for-n-shovels strategy.

The fleet manager assigns the truck to the shovel that is most suitable under the current

dispatching criterion, following a heuristic method (ALARIE; GAMACHE, 2002), or rule

(TA et al., 2005). Heuristics are procedures that are not mathematically proven but

are based upon practical or logical operating procedures (RUSSELL; NORVIG,

2009). The heuristic methods most used in truck dispatching are (KOLONJA;

KALASKY; MUTMANSKY, 1993; CETIN, 2004):

• Minimizing Shovel Waiting Time (MSWT): an empty truck at the dispatching point

is assigned to the shovel that has been idle the longest, or to the shovel that is expected to be idle

first. The objective of this criterion is to maximize the utilization of both trucks and

shovels.

• Minimizing Truck Cycle Time (MTCT): the goal of this strategy is to assign an

empty truck to the shovel that allows the shortest truck cycle time, maximizing

the total tonnage productivity. The objective of this criterion is to maximize the

number of truck cycles during the shift.

• Minimizing Truck Waiting Time (MTWT): in this criterion, an empty truck at

the dispatching point is assigned to the shovel at which the loading operation starts

first. The objective of this criterion is to maximize the utilization of a shovel by

minimizing its waiting time.


• Minimizing Shovel Saturation or Coverage (MSC): empty trucks are assigned to the

shovels at equal time intervals, so that each shovel keeps operating without

waiting for trucks.
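The first three criteria can be illustrated with a minimal Python sketch. It assumes that the dispatcher already holds three estimates for each shovel; the field names (idle_at, load_start, cycle_time) and all numeric values are hypothetical and are not taken from the cited works. The MSC criterion is left out because it depends on the dispatch history rather than on a per-shovel score.

```python
from dataclasses import dataclass


@dataclass
class ShovelEstimate:
    name: str
    idle_at: float     # minute at which the shovel is (or expects to become) idle
    load_start: float  # minute at which loading of the new truck could start
    cycle_time: float  # estimated full truck cycle time via this shovel (min)


# Each rule maps a shovel estimate to the score that the rule minimizes.
RULES = {
    "MSWT": lambda s: s.idle_at,     # shovel idle the longest / idle first
    "MTCT": lambda s: s.cycle_time,  # shortest truck cycle time
    "MTWT": lambda s: s.load_start,  # loading operation starts first
}


def dispatch(rule, shovels):
    """Return the name of the shovel selected by the given rule."""
    return min(shovels, key=RULES[rule]).name


# Hypothetical snapshot of a three-shovel mine at one dispatching instant.
shovels = [
    ShovelEstimate("S1", idle_at=12.0, load_start=15.0, cycle_time=22.0),
    ShovelEstimate("S2", idle_at=4.0, load_start=9.0, cycle_time=30.0),
    ShovelEstimate("S3", idle_at=8.0, load_start=8.5, cycle_time=26.0),
]
choices = {rule: dispatch(rule, shovels) for rule in RULES}
print(choices)
```

In this snapshot each rule selects a different shovel, which shows that the choice of criterion alone can change the assignment of the same truck.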

This strategy is myopic (or greedy) because the system is not completely observed

when a truck is being dispatched. For example, in a two shovel and two truck mine,

the first truck positioned at the dispatching point is assigned to shovel number one,

because of its higher production, and the second one must then be assigned to

shovel number two (this example system does not allow queues in the mine). In this

situation, the total production, following the production policy, will not be the maximum

one. Thus, the global result (sum of individual truck productions) is affected because of

the greedy behavior of this strategy. Nevertheless, Lizotte and Bonates (1987) and Tu

and Hucka (1985) used this strategy in their works.

3.2.2 The m-trucks-for-1-shovel strategy

In this strategy (FIG. 3.2), the shovels are first sorted following a priority scheme

(e.g., by how much they are behind schedule on their production), and then each one "selects",

from a list of m trucks, the one that best serves it (e.g., the truck with the highest load

capacity and the nearest one). Alarie and Gamache (2002) report that there is only one

implemented system that uses this strategy, namely the DISPATCH™ commercial package

for truck dispatching, which is developed by Modular Mining Systems. As DISPATCH™ is

a commercial package, no substantial information about its algorithms and heuristic methods

is found in the scientific literature.


FIGURE 3.2 – m-trucks-for-1-shovel strategy. [Figure label: Higher Priority.]

3.2.3 The m-trucks-for-n-shovels strategy

This strategy (FIG. 3.3) considers simultaneously the m available trucks for dispatch-

ing and the n shovels present in the mine. This is a combinatorial problem that can be

modeled as an assignment problem or as a transportation problem.

Elbrond and Soumis (1987) solve truck dispatching as an assignment problem.

Here, the assignment optimization considers the truck that asks for dispatching

and the next 10 to 15 trucks that will ask for dispatching in the near future (e.g.,

trucks traveling on the paths, finishing dumping, or finishing material loading). Only the assignment of

the truck currently asking is applied; the other assignments are discarded. The system

repeats the same steps at each subsequent dispatching request. The solution covers only the

trucks to be dispatched in the near future because of the combinatorial explosion of this problem,

which is NP-hard (PAPADIMITRIOU; STEIGLITZ, 1998). In fact, a solution considering

the whole shift would be extremely time-consuming, and impracticable for a real-time

system.
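The rolling-horizon idea described above can be sketched as follows. This is only an illustration under strong assumptions, not the formulation of Elbrond and Soumis (1987): the cost matrix is hypothetical, there are as many trucks as shovels, each shovel receives exactly one truck, and the assignment is solved by brute-force enumeration, which is viable only for toy sizes.

```python
from itertools import permutations


def best_assignment(cost):
    """Return (assignment, total_cost); assignment[i] is the shovel of truck i.

    cost[i][j] is a hypothetical cost (e.g., expected waiting time in minutes)
    of sending truck i to shovel j. The matrix is assumed to be square.
    """
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda p: sum(cost[i][p[i]] for i in range(n)),
    )
    return list(best), sum(cost[i][best[i]] for i in range(n))


def dispatch_current_truck(cost):
    """Solve the joint problem, but keep only the decision for truck 0.

    Truck 0 is the truck asking for dispatching now; the remaining rows are
    the trucks expected to ask soon. Their assignments are discarded and
    recomputed at the next dispatching request.
    """
    assignment, _ = best_assignment(cost)
    return assignment[0]


# Hypothetical costs: row 0 is the current truck, rows 1 and 2 the next ones.
COST = [
    [1, 2, 9],
    [1, 8, 9],
    [5, 6, 3],
]
greedy_choice = min(range(3), key=lambda j: COST[0][j])  # myopic choice
joint_choice = dispatch_current_truck(COST)              # look-ahead choice
print(greedy_choice, joint_choice)
```

With these numbers the myopic choice and the look-ahead choice differ for the current truck, because taking the cheapest shovel now would force the next truck onto a much more expensive one.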


FIGURE 3.3 – m-trucks-for-n-shovels strategy. [Figure label: Next k Dispatched Trucks.]

The system proposed by Temeng, Otuonye and Frendewey (1997) is modeled and

solved as a transportation problem. In this problem, each supply center is associated with a

truck that will be dispatched in the near future, and each receiver center is a shovel present

in the mine. The receiver center demand is expressed as the number of trucks needed to

reach the production goals. The cost of sending a truck to a shovel is given by the truck

waiting time (truck queues at the shovels).

Another current trend in solving this kind of problem is the Evolutionary Algorithm

(EA), which uses some mechanisms inspired by biological evolution: reproduction, mutation,

recombination, and selection. This is a near-optimal algorithm, that is, the global

optimal solution is not guaranteed to be found and the algorithm often converges to local

optimal solutions (EAs have specific search mechanisms to avoid premature

convergence to the first local optima found). A near-optimal solution is generally found

much faster by EAs than by exact search methods (e.g., breadth-first search), and can

be considered acceptable given the convergence criteria of the algorithm. Some related

techniques are: Genetic Algorithm (GA) (MITCHELL, 1998), Particle Swarm Optimization

(PSO) (SHI; EBERHART, 2002), and Simulated Annealing (SA) (KIRKPATRICK,

1984). Jaoua, Gamache and Riopel (2009) used SA as an optimization algorithm applied

to truck dispatching in a simulation-based real-time control.

4 Truck Dispatching Modeling

The truck dispatching in open-pit mines is a problem in which decisions on truck

assignments and destinations are taken in real-time. As in many other real-world ap-

plications, the assessment and correct modeling of uncertainty is a crucial requirement,

as the unpredictability originated from equipment faults, weather conditions and human

mistakes can often result in truck queues or idle shovels. There are also uncertainties

in travel and loading times related to the problem; the travel time of a truck between

the same specific loading and dumping points certainly will not be the same over the

whole shift, and can be represented by a probability density function (pdf). Therefore,

this problem can be classified as a stochastic problem, in which the uncertainties must be

part of the problem model and be considered in the problem solving process. However,

uncertainty is not considered in most current dispatching systems, possibly providing

worse solutions than the average optimal one.

Consider the following example: two identical trucks are parked in the same area,

just waiting to be assigned to two identical shovels. Considering that queues are not

allowed, which shovel must each truck travel to? The answer is quite obvious because

of truck homogeneity: each truck must travel to a different shovel (there will be no

difference in total production). This simple example shows the ease of solution in

simple environments; even if the shovels were different, the solution remains the same.


FIGURE 4.1 – Abstract graph of a medium-scale mine. [Nodes: S1, S2, S3, C; edge labels: 3, 7, 2, 2, 4.]

However, in most real situations the mines operate with heterogeneous trucks and shovels,

queues are allowed, and the dispatching requisitions do not occur simultaneously.

Given the stochasticity, mining objectives, heterogeneous fleet and queueing characteristics

present in a mine, what would be the best technique to solve this real-time

dispatching problem? It is known that this hard problem is not fully addressed and

solved by current systems. We present in the following sections a realistic example of a

medium-scale mine, which is the testbed for some models that deal with the real problem

characteristics. These models will be the basis for the simulations and analysis presented

in the next chapter.

4.1 A model for a medium-scale mine example

4.1.1 Mine environment

In order to have a testbed for the simulations of the proposed truck dispatching

algorithms, we present a modified medium-scale mine example (FIG. 4.1), which was first

introduced by Jaoua, Gamache and Riopel (2009).


TABLE 4.1 – Truck specifications.

Truck Type   Quantity   Empty Aver. Speed   Loaded Aver. Speed   Payload Capacity
1            10         50 km/h             40 km/h              200 t
2            3          48 km/h             37 km/h              300 t
3            2          40 km/h             35 km/h              400 t

The mine has three pickup stations (shovels S1, S2, S3), and one delivery/departure

station (crusher C). It differs from the original one considered in Jaoua, Gamache and

Riopel (2009) by the absence of one waste dump (another delivery station) and one

departure station (truck parking area and starting point of truck dispatching). For the

sake of simplicity and because of our main objective (i.e., to introduce a novel real-time

stochastic truck dispatching system), we reduced the number of elements in the original

mine. Certainly, our proposed truck dispatching systems (section 4.2) can be used with

few modifications in larger and more constrained mines.

4.1.2 Specifying trucks and shovels

In the same manner as in the original mine, we use 15 trucks for the material trans-

portation from shovels to crusher. Unfortunately, nothing is reported in Jaoua, Gamache

and Riopel (2009) about truck and shovel specifications. In order to overcome this deficiency

of the previously introduced model, we propose heterogeneous types of trucks

(Table 4.1) and shovels (Table 4.2) operating in the mine environment. Therefore, per-

formance comparisons can be made against dispatching algorithms already developed or

proposed in the future.


TABLE 4.2 – Shovel specifications.

Shovel   Average Loading Rate
1        40 t/min
2        20 t/min
3        100 t/min

4.1.3 The truck cycle

Truck dispatching is executed following a Truck Cycle: an action sequence with its

related timespan. Basically, the sequence is: (1) the truck receives a dispatching order at

the departure station (crusher in our model), (2) it then travels through a path to the

assigned shovel, (3) loads the material, (4) returns to the crusher through a path (that

can be different from the first one), (5) unloads the material, and (6) waits for another

dispatching order. This sequence is repeated until the end of the shift.

The truck cycle must be adapted to a state-based representation, which is the basis for

the methods presented in section 4.2. In order to complete the representation of actions,

timespan, and queue at the shovels, we represent shovels and crushers by sub-states (FIG.

4.2). The truck cycle in a state-based representation follows the sequence:

1. The truck starts its cycle at Crusher (state C) being assigned to a Shovel (state

S’), and then executing the action move_shovel that takes the timespan t_shovel

(which depends on the distance from crusher to shovel and on the empty truck

average speed);

2. At state S’, the truck moves (action move_queue) to the FIFO (first in, first out)

queue state, that takes the timespan t_queue (depends on the size of the queue);

3. When the truck is the first one in the queue, it is loaded (action load_truck) by

the Shovel (state S) in timespan t_load (depends on shovel loading rate and truck

capacity);

4. Then, the truck must move to the Crusher (state C’) (action move_crusher) in

timespan t_crusher (depends on distance from shovel to crusher and loaded truck

average speed);

5. Finishing the cycle, the truck unloads (action unload_truck) the material in the

Crusher (state C) in timespan t_unload (based on truck capacity).
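The five timespans of the cycle can be computed with a short Python sketch. The truck and shovel data come from Tables 4.1 and 4.2; everything else is an assumption made only for illustration: the travel distances, the relation t_load = payload / loading rate, and the constant t_unload value (the text states only that it is based on truck capacity).

```python
# Data from Table 4.1 (speeds in km/h, payload in tonnes).
TRUCKS = {
    1: {"empty_kmh": 50.0, "loaded_kmh": 40.0, "payload_t": 200.0},
    2: {"empty_kmh": 48.0, "loaded_kmh": 37.0, "payload_t": 300.0},
    3: {"empty_kmh": 40.0, "loaded_kmh": 35.0, "payload_t": 400.0},
}
# Data from Table 4.2 (average loading rate in t/min).
SHOVEL_RATE_T_PER_MIN = {1: 40.0, 2: 20.0, 3: 100.0}


def cycle_timespans(truck_type, shovel, dist_out_km, dist_back_km,
                    t_queue=0.0, t_unload=1.0):
    """Return the five timespans of one truck cycle, in minutes.

    dist_out_km / dist_back_km are hypothetical path lengths; t_queue and
    t_unload are passed in directly because their formulas are not given here.
    """
    truck = TRUCKS[truck_type]
    return {
        "t_shovel": 60.0 * dist_out_km / truck["empty_kmh"],     # C -> S'
        "t_queue": t_queue,                                      # S' -> Q
        "t_load": truck["payload_t"] / SHOVEL_RATE_T_PER_MIN[shovel],
        "t_crusher": 60.0 * dist_back_km / truck["loaded_kmh"],  # S -> C'
        "t_unload": t_unload,                                    # C' -> C
    }


# Type 1 truck sent to Shovel 1 over an assumed 3 km path in each direction.
spans = cycle_timespans(truck_type=1, shovel=1, dist_out_km=3.0, dist_back_km=3.0)
total = sum(spans.values())
print(spans, total)
```

The return path length is a separate argument because the return path may differ from the forward one.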

For the sake of simplicity, we consider that there is no queue at the crusher; the trucks

unload the material collected from the shovels in a concurrent manner. Moreover, in our

model the queues at the shovels are limited to 9 trucks (the dispatching system

considers this size limitation on assignments, and we assume that the truck driver strictly follows

the shovel assignment).

In order to make the presented system suitable to a TiMDP modeling, times are

related to the actions, not to the states; e.g. the time that the truck waits in the queue

(t_queue) (which is related to the action move_queue) depends on the current size of the

queue.

The estimated truck cycle time can also be exceeded because trucks are not allowed to

overtake each other. Thus, if a truck is behind a slower truck, it may suffer a travel delay that changes

the estimated travel time. This drawback is one of many issues that occur in a real-world

mine, and indeed causes a decrease in the quality of dispatching heuristics.

4.1.4 Mine uncertainties

We introduce two kinds of uncertainty to the mine model, bringing its behavior

closer to that of a real-world mine: (1) stochastic path selection, and (2) Gaussian-based truck

traveling times.

FIGURE 4.2 – Truck cycle time. [Diagram: C to S’ (move_shovel, t_shovel); S’ to Q (move_queue, t_queue); Q to S (load_truck, t_load); S to C’ (move_crusher, t_crusher); C’ to C (unload_truck, t_unload).]

4.1.4.1 Stochastic path selection

Path selection is related to action (shovel assignment) outcomes (µ). First, at the

departure station (dispatching point), the truck driver receives from the dispatcher the

information about the shovel to travel to. As we do not consider the routing problem,

the truck driver must select the best path to the shovel based on his or her own experience and/or

on the current traffic/weather conditions. The same stochastic characteristic

occurs in the return travel (from shovel to delivery/departure station). In order to be

applied in the TiMDP model (section 4.2.3), the outcomes are classified depending on the

shovel assignment. The truck driver can select among 3 paths for each travel; by default, we

use µ1 for the shortest path, µ2 for the medium path, and µ3 for the longest path. FIG.

4.3 shows the outcome classification for each travel between crusher and shovels (forward

and return travels). In order to represent a real-world mine operation behavior, we define

that the selected path for the forward travel (empty truck) is not necessarily the same as

the return travel (full truck).

Given the truck assignment, the probability of an outcome occurrence (which path

the truck will follow) is based on a likelihood function over the whole shift (FIG. 4.4).


FIGURE 4.3 – Path selection outcomes. (a) Crusher-Shovel 1-Crusher; (b) Crusher-Shovel 2-Crusher; (c) Crusher-Shovel 3-Crusher.


Hence, path selection occurs based on a probability value that can vary over time, but

of course the sum of outcome probabilities always equals one. The likelihood may be

obtained based on historical data; herein, for the sake of simplicity we used arbitrary

values and likelihood functions that are valid for all truck types.

As an illustrative example, in FIG. 4.4a, when the truck is assigned to Shovel 1,

the probability of the driver taking path µ1 is 85%, µ2 is 10%, and µ3 is 5%, from

time 0 to 300 minutes and also from time 360 minutes to the end of the shift. These

probabilities only change between 300 and 360 minutes, when µ1 is zero, µ2

is 60%, and µ3 is 40%. This abrupt change in probability values occurs because of

scheduled maintenance and the resulting blocking of the path between C and S1 in the

aforementioned period. Therefore, we introduce a novel constraint in the modeling of

truck dispatching in open-pit mining problems, namely time windows, used before in

vehicle dispatching problems (SOLOMON, 1987). The introduction of this constraint in

the model approximates the problem to real-world mining, in which path blockages often

occur.
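The time-dependent outcome probabilities of the Shovel 1 example can be sketched in Python as follows. The probability values are those quoted above; treating the time window as the half-open interval [300, 360) minutes, the function names, and the use of a seeded pseudo-random generator are assumptions made only for illustration.

```python
import random


def outcome_probabilities(t_min):
    """Path-selection probabilities for a truck assigned to Shovel 1.

    t_min is the time since the start of the shift, in minutes. During the
    time window the shortest path (mu1) is blocked, so its probability is 0.
    """
    if 300 <= t_min < 360:
        return {"mu1": 0.0, "mu2": 0.60, "mu3": 0.40}
    return {"mu1": 0.85, "mu2": 0.10, "mu3": 0.05}


def sample_path(t_min, rng):
    """Draw one path outcome according to the probabilities valid at t_min."""
    probs = outcome_probabilities(t_min)
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]


rng = random.Random(0)  # fixed seed so that the illustration is repeatable
inside = [sample_path(330.0, rng) for _ in range(1000)]
outside = [sample_path(100.0, rng) for _ in range(1000)]
print(inside.count("mu1"), outside.count("mu1"))
```

Inside the window no sampled trip uses µ1, which is how the blocked path appears to a simulation that draws outcomes from this function.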

4.1.4.2 Gaussian-based truck traveling times

Truck assignment in the mine is a cyclic operation, in which trucks are constantly

transporting material between shovels and crusher until the end of the shift.

Each truck movement or operation, represented in FIG. 4.2, takes a timespan depending

on distance, truck speed, truck and shovel capacities, and queue sizes. These times can

be estimated from a historical mine database. Their representation is the basis for

the presented real-time dispatching methods (Section 4.2), hence their importance in our

model.


FIGURE 4.4 – Outcome likelihood functions.


Certainly, the truck travel timespan between the same two points will not be

the same for every trip. Minor variations can be explained by different drivers, who

drive trucks with similar, but not equal, speeds and throttle settings, and by small differences

in shovel and crusher positions. Major variations are caused by large reductions of truck

speed because of weather conditions, and by changes in mine configuration. In this thesis,

our solution methods consider only the minor timespan variations, which are represented

by a probability density function (pdf). We use a Normal (or Gaussian) distribution

for timespan representation, which is a convenient model to represent processing times.

However, since time can only be positive, this distribution may not be a

good choice in some cases because of its theoretical range (−∞ to +∞). In such cases,

we can use the Gamma Distribution, which has a range from zero to +∞ and is often

used to represent the time required to complete some task. The graphical representation

of the Gamma Distribution is similar to the Gaussian in situations in which the Gaussian density

tends to zero on the negative "time" axis. Gibson and Bruck (2000) and Ludwig (1996)

also modeled the times involved in their problems as Gamma Distributions. The

formulations of both distributions are found in Appendix B.
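Whether the Gaussian is acceptable for a given timespan can be checked by computing how much probability mass it places on negative times. The sketch below does this with the standard Normal cumulative distribution function; the mean and standard deviation values are hypothetical and chosen only for illustration.

```python
import math


def prob_negative(mean, std):
    """P(X < 0) for X ~ Normal(mean, std), computed via the complementary
    error function: Phi(-mean/std) = 0.5 * erfc(mean / (std * sqrt(2)))."""
    return 0.5 * math.erfc(mean / (std * math.sqrt(2.0)))


# Hypothetical travel time of 5 min with a standard deviation of 0.5 min:
# the mean is ten standard deviations away from zero.
well_separated = prob_negative(5.0, 0.5)

# Hypothetical short timespan whose spread is comparable to its mean.
poorly_separated = prob_negative(1.0, 1.0)

print(well_separated, poorly_separated)
```

In the first case the negative mass is negligible, so the Gaussian is a harmless approximation; in the second case a noticeable share of the mass lies below zero, and a distribution with positive support such as the Gamma would be the safer choice.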

For the sake of simplicity, and because the Gaussian represents the times considered

in the presented example sufficiently well within the positive range, we use the Gaussian

Distribution for travel time representation. We present in Table 4.3, for the example mine

(FIG. 4.1), the required times for the truck travels. The mean travel times are given

by the distances and the loaded and empty truck speeds; the standard deviations are arbitrarily

defined. Since our objective concerns the comparison among truck assignment methods,

we considered standard deviations different from zero on only a few occasions. We consider

these data necessary and sufficient to compare the dispatching

methods.

TABLE 4.3 – Mining data.

Our work only considers pdfs for truck traveling times; the other timespans present in

the mine operation (t_queue, t_load, and t_unload) are considered deterministic.


4.2 Truck dispatching methods

Using the introduced mine environment (FIG. 4.1), we propose five methods to solve

the truck dispatching problem: Greedy Heuristic, MTCT Heuristic, TiMDP, Genetic Al-

gorithm (GA), and Genetic TiMDP (G-TiMDP). The methods Greedy Heuristic, MTCT

(Minimizing Truck Cycle Time) Heuristic and TiMDP follow the 1-truck-for-n-shovels

strategy, whereas GA and Genetic TiMDP follow the m-trucks-for-n-shovels strategy. All

presented dispatching methods are implemented in order to maximize tonnage production.

4.2.1 Greedy heuristic

We have shown in section 3.2 that the 1-truck-for-n-shovels strategy is greedy. Indeed,

this strategy can be considered as such because each truck assignment is made observing

only the truck's own state; this selfish behavior leads to suboptimal global results.

However, most of the methods applied according to this strategy are fast and have some

knowledge about the mine environment, leading to acceptable results considering the real-time

and uncertain aspects of the problem. Thus, since these heuristic methods (such as the

MTCT heuristic) already give results of acceptable quality, we also propose

an extremely greedy heuristic that will certainly return poor results and that is used as a baseline

for comparisons with the other methods.

In this method, the dispatcher does not have much information about the mine environment.

Crucial information for good dispatching, such as distances and truck/shovel

capacities, is completely unknown and not considered by the dispatching algorithm. The

only observation that is allowed is the size of the queues at the shovels. Thus, in this

method the truck must be assigned to the shovel that has the smallest queue. Because

of this balanced dispatching characteristic, the sizes of the queues tend to be nearly

equal during the whole shift; the problem of shovel underutilization is not present in this

method.

Likewise, the time-window in the shift, which indicates the blocking period of the

nearest path to the crusher, is not considered in this method. Since the only information

for the heuristic is the size of the queues, the knowledge about the time-window does not

affect the performance of the method.

Another important issue that occurs in the dispatching is the instant of decision; the

first decision differs from the others, because in the beginning all trucks are available and

waiting for their shovel assignments. Considering that trucks cannot overtake each other on

the paths, we organize a decision queue, in which the fastest trucks are placed first in

order to prevent slow traffic.

As a special case for further decisions in which trucks ask for dispatching at the same

time, the fastest trucks always have preference. This decision policy for cases

with conflicting trucks is also used for our other methods based on the 1-truck-for-n-shovels

strategy.
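A minimal Python sketch of the greedy heuristic and of the fastest-first decision queue is given below. Several details are assumptions made only for illustration: ties between equally short queues are broken by shovel name, "fastest" is taken to mean the highest empty average speed, and a dispatched truck is counted in the queue immediately instead of on arrival.

```python
def decision_order(trucks):
    """Order conflicting trucks so that the fastest ones are dispatched first."""
    return sorted(trucks, key=lambda t: t["empty_kmh"], reverse=True)


def greedy_dispatch(queue_sizes, max_queue=9):
    """Return the shovel with the smallest queue, or None if all are full.

    Only queue sizes are observed; distances and capacities are ignored.
    Ties are broken by shovel name (an arbitrary choice for this sketch).
    """
    open_shovels = [s for s in sorted(queue_sizes) if queue_sizes[s] < max_queue]
    if not open_shovels:
        return None
    return min(open_shovels, key=lambda s: queue_sizes[s])


# One truck of each type from Table 4.1, all waiting at the start of the shift.
trucks = [
    {"id": "T3-1", "empty_kmh": 40.0},
    {"id": "T1-1", "empty_kmh": 50.0},
    {"id": "T2-1", "empty_kmh": 48.0},
]
queues = {"S1": 0, "S2": 0, "S3": 0}
plan = []
for truck in decision_order(trucks):
    shovel = greedy_dispatch(queues)
    plan.append((truck["id"], shovel))
    queues[shovel] += 1  # simplification: the truck joins the queue at once
print(plan)
```

Because the rule never looks at loading rates, the slowest shovel receives as many trucks as the fastest one, which is why this method is expected to perform poorly.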

4.2.2 MTCT heuristic

The main objective in our proposed mine is the transportation of the maximum quan-

tity of material by the trucks during the shift. Thus, a good dispatching heuristic would

be the minimization of the truck cycle times, in order to maximize the number of truck

travels. We apply this heuristic (MTCT heuristic) to the mine allowing full observation,

which means that the dispatcher knows how to calculate the cycle times, and has enough

information to do so. However, we assume determinism even though dispatching occurs

in a stochastic environment. The dispatcher considers that the trucks always travel to the

shovels using the shortest path (outcome µ1), and does not consider the Gaussian

aspect of the travel times (the mean time is considered for all dispatches). This

deterministic assumption in a stochastic environment may not lead to dispatching with

sufficiently near-optimal results.

In order to improve the performance of this method, we considered knowledge about

the time-window to estimate the truck cycle time. During the time-window, the heuristic

considers that the truck takes the medium path to travel to and return from Shovel 1 (taking

a longer time).

Since the trucks have different payload capacities, the waiting time in the queues must

be estimated. We use the mean loading time, which is the time that the shovel takes to

load a 300 t truck.
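The deterministic cycle-time estimate used by the MTCT heuristic can be sketched in Python as follows. Truck and shovel data come from Tables 4.1 and 4.2 and the 300 t mean payload from the text; the path lengths, the constant unloading time, the half-open time window [300, 360) and the tie-breaking by shovel name are hypothetical values and choices made only for illustration.

```python
# (empty km/h, loaded km/h, payload t) per truck type, from Table 4.1.
TRUCKS = {1: (50.0, 40.0, 200.0), 2: (48.0, 37.0, 300.0), 3: (40.0, 35.0, 400.0)}
RATE_T_PER_MIN = {"S1": 40.0, "S2": 20.0, "S3": 100.0}  # from Table 4.2
SHORT_KM = {"S1": 3.0, "S2": 2.0, "S3": 4.0}  # hypothetical shortest paths
MEDIUM_KM_S1 = 5.0                            # hypothetical medium path to S1
MEAN_PAYLOAD_T = 300.0                        # payload used for queued trucks


def estimated_cycle_min(truck_type, shovel, queue_len, t_min, t_unload=1.0):
    """Deterministic cycle-time estimate (minutes) for one candidate shovel."""
    empty_kmh, loaded_kmh, payload_t = TRUCKS[truck_type]
    dist = SHORT_KM[shovel]
    if shovel == "S1" and 300 <= t_min < 360:
        dist = MEDIUM_KM_S1  # shortest path blocked during the time window
    travel = 60.0 * dist / empty_kmh + 60.0 * dist / loaded_kmh
    wait = queue_len * MEAN_PAYLOAD_T / RATE_T_PER_MIN[shovel]
    load = payload_t / RATE_T_PER_MIN[shovel]
    return travel + wait + load + t_unload


def mtct_dispatch(truck_type, queues, t_min):
    """Assign the truck to the shovel with the smallest estimated cycle time."""
    return min(sorted(queues),
               key=lambda s: estimated_cycle_min(truck_type, s, queues[s], t_min))


empty_mine = mtct_dispatch(1, {"S1": 0, "S2": 0, "S3": 0}, t_min=0.0)
one_in_s3 = mtct_dispatch(1, {"S1": 0, "S2": 0, "S3": 1}, t_min=0.0)
in_window = mtct_dispatch(1, {"S1": 0, "S2": 0, "S3": 1}, t_min=330.0)
print(empty_mine, one_in_s3, in_window)
```

With these assumed distances, the same queue state leads to different assignments inside and outside the time window, which is the effect of giving the heuristic knowledge of the blocked path.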

4.2.3 TiMDP Model

We propose modeling the truck dispatching problem as a TiMDP, in which characteristics

such as uncertainties, related times, and quantity of transported material can be

addressed. The solution of this model returns policies that define the best action

to be executed by the agent given the current time and its current state, that is, the

dispatcher must consult the policies given by the TiMDP to decide on the truck assignment.

However, dispatching problems with many agents modeled as variations of Markov

decision processes (MDPs) can generate an exponential state space augmentation that

causes a correspondingly drastic increase of the necessary time (and memory) to find the


optimal solution. This issue is known as the curse of dimensionality (PUTERMAN, 1994)

and can be very serious in problems like these and critical for TiMDP models, in which

policies also depend on current time. Therefore, some approximations must be performed

to minimize this problem, in an attempt to get a feasible solution to the dispatching

problem.

We present in this section an approximation for the TiMDP model by single-dependent

agents that reduces significantly the size of the problem. Since the dispatching decisions

(presented in Chapter 5) are taken in real-time based on TiMDP policies solved previously,

we also present some results and analysis of our proposed mine TiMDP model.

4.2.3.1 Single-dependent-agent TiMDP modeling

The state representation of the mine involves the places where trucks can be located,

that is, crusher, shovels and queues, and paths. We consider that paths are not states,

but transitions between states. Thus, the number of states is greatly reduced because

a traveling truck is treated as moving from one location to another, and not as occupying a position on a

discretized path. However, for a complete state representation, the states

considering all trucks should be considered interdependent, which results in a huge state

space.

In a mine with only one location and two trucks (A and B), the location can be

associated with 4 states: no trucks, only truck A, only truck B, and both trucks. In our

example, not considering time and queue sizes, the complete state space has order 10^11. To

solve a problem of this size we have to use approximate solvers such as APRICODD (ST-

AUBIN; HOEY; BOUTILIER, 2001), which is based on SPUDD (HOEY et al., 1999), an

exact factored MDP (BOUTILIER; DEARDEN; GOLDSZMIDT, 2000) solver. However,


in our formulation there are two other issues that increase the state space: time and

queue size. We consider a 10-hour shift, and a discretization step of 0.1 minute for the

Gaussian representation of timespans; the queues have a maximum size of 9 trucks, and

– due to the heterogeneous fleet – the truck order in the queue must be considered. Such

considerations enlarge the state space to order 10^21, making the problem impossible to

solve in a reasonable time.

In order to solve a problem with such a huge number of states, we propose approximating

the multi-agent problem by a single-dependent-agent problem, introduced here, in

which a solution is generated for each agent (on a small state space) with some states that

add dependencies on other agents. In this model, the actions are executed concurrently

(MAUSAM; WELD, 2004) by the agents; some actions can be executed at the same time

(each agent executes one action at a time); however, because of the dependency model, the

execution of other actions depends on the current position of the other agents in the

environment. These dependencies are important to the quality of the solution because

they add information to the decision taken in a specific state. Now, the policy depends

on the agent's own current state, the current time, and other states that depend on the

other agents present in the environment. Naturally, this approximation does not return

optimal results; however, the results provide evidence of good performance for solutions

that are returned in a short time (as shown in the next chapter).

In our example mine we have the state transitions representation illustrated in FIG.

4.5 for each truck present in the environment, in which FIFO queues (Q1, Q2, and Q3) are

represented by slot buffers for the shovels (S1, S2, and S3, respectively). The dependencies

of this representation are addressed by the queue; the current size of a queue is set based

on observations exchanged among the agents. Considering that a specific truck is at S1’


ready to execute the action move_queue, its position in the queue will be governed by the

current positions of the other trucks present in the same queue. Therefore, considering

that the only decision state is C, the observation of the current position of the other trucks

(mainly in the queues) is essential for good results for truck dispatching.

The single-dependent-agent representation seems to be appropriate for TiMDP mod-

eling; an agent decision can be made observing its own current state, current time, and the

position of the other agents in the environment. However, the actual queue representation

(as a slot buffer) is not appropriate for the TiMDP model, and must be mapped to a state

representation. As the queues are limited to 9 trucks, we propose their representation by

a set of states (for each queue), in which each state represents the quantity of trucks in

the queue (varying from zero to 9 trucks). Our example mine is thus approximated to

a single-dependent-agent model (FIG. 4.6), which is an expansion of the presented truck

cycle (FIG. 4.2) considering queue sizes, and outcomes over path selection by the truck

driver.

The only action that gives a reward is unload_truck. Its value is the

tonnage transported by the truck. The policies of the TiMDP model aim at maximizing

the expected tonnage that can be transported by the truck, considering the whole shift.

The action that makes the transition from state S’ to its queue Q is move_queue. In

the TiMDP model, the queue is represented by a set of states in which the transitions are

produced by independent actions (e.g., move_queue_Q1_2). However, at the dispatching

instant, the truck moves to the queue regardless of its size. To solve this problem

of action representation, we propose a two-phased TiMDP model applied to real-time

queueing problems that are represented by single-dependent-agents:


FIGURE 4.5 – Truck dispatching state transitions. [Diagram labels: states C, C’, S1’, S1, S2’, S2, S3’, S3 and queue Q1; actions move_shovel_1, move_shovel_2, move_shovel_3, move_queue, load_truck, move_crusher, unload_truck; outcomes µ1, µ2, µ3.]


FIGURE 4.6 – TiMDP truck dispatching states. [Diagram labels: states C, C’, S1’, S1, S3’, S3 and queue states Q1_0, Q1_1, Q1_9, Q3_0, Q3_1, Q3_9; actions move_shovel, move_queue, load_truck, move_crusher, unload_truck; outcomes µ1, µ2, µ3.]


1. Solve the complete TiMDP model (off-line phase).

2. Find the optimal dispatching policy (dispatching phase), which is subdivided into:

(a) Execute a Value Iteration algorithm step (on-line sub-phase).

(b) Assign the truck to a shovel (assignment sub-phase).

The off-line phase is solved in a moment before the mine shift, whereas the dispatching

phase occurs in real-time during the shift.

In the off-line phase, the system is modeled following the representation presented in FIG. 4.6, that is, the action move_queue is represented for each queue state. For each new action that produces the transition from a state S’ to a state Q (representing the number of trucks in the queue), there is a specific duration $t_{queue}$. The duration of the action move_queue depends on the size of the queue ($|Q|$) and on the mean time ($\bar{t}$) that a truck waits in the queue:

$$t_{queue} = \bar{t} \cdot |Q| . \qquad (4.1)$$

An initial approximation for the mean time $\bar{t}$ is

$$\bar{t} = \frac{t_{load\,T1} + t_{load\,T2} + t_{load\,T3}}{3}, \qquad (4.2)$$

where T1, T2, and T3 are the truck types, as presented in Table 4.1.
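As a minimal sketch of EQS. 4.1 and 4.2, the following Python fragment computes the duration of move_queue for each queue state. The loading times are hypothetical placeholders (they are not the values of Table 4.1), and the function names are introduced here only for illustration.

```python
from statistics import mean

# Hypothetical loading times in minutes for truck types T1, T2, T3.
# Placeholder numbers only; they are NOT the values of Table 4.1.
LOAD_TIME_MIN = {"T1": 5.0, "T2": 7.0, "T3": 9.0}


def mean_wait(load_times):
    """Eq. (4.2): mean of the loading times over the truck types."""
    return mean(load_times.values())


def queue_duration(queue_size, load_times):
    """Eq. (4.1): duration of move_queue for a queue of `queue_size` trucks."""
    if queue_size < 0:
        raise ValueError("queue size must be non-negative")
    return mean_wait(load_times) * queue_size


# One duration per queue state Q_0 .. Q_9 (queues hold zero to 9 trucks).
DURATIONS = [queue_duration(q, LOAD_TIME_MIN) for q in range(10)]
```

With these placeholder values the empty queue has a null duration, which is the property the off-line phase relies on for state Q0.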

The importance of a state representing zero trucks in the queue lies in the interaction between the phases of our proposed TiMDP model. In the off-line phase, the TiMDP is solved using the Value Iteration algorithm, which takes around 5 minutes¹ to be solved for each truck type, considering a shift of 10 hours and a discretization step of 0.1 min. The value function of state S’ is the maximum, over the queue states, of the convolution of the action duration with the value function of the queue state; since the action move_queue_Q0 has a null duration, the value function of S’ will be equal to the value function of Q0. Thus, the policy representing the dispatching decision (which is always executed at state C) will be found without considering the current size of the queues at the decision epoch. This incorrect behavior is corrected in the on-line sub-phase.

The dispatching phase is subdivided into two sub-phases: on-line and assignment. The dispatching decisions start in the on-line sub-phase, which considers estimates of the future sizes of the queues. These estimates are based on the expected traveling time of the truck, the number of trucks traveling to the queue, the current size of the queue, and the mean time that a truck waits in the queue. In an example situation, the number of trucks in the queues and the other cited aspects are observed, and the dispatching decision is taken in the assignment sub-phase by considering the maximum value given by the TiMDP for the expected queue size of all shovels. However, the decision is always taken at dispatching state C, whereas the values (expected tonnage) used for the decision are valid for states Q. Therefore, in the on-line sub-phase, these values must be referred to the dispatching state by executing one step of the Value Iteration algorithm. In this sub-phase, the value functions of the Q states that differ from the expected size of the queue are not considered. Thus, the expected tonnage considered at state C now refers to the expected size of the queue. In order to save time on dispatching decisions, we execute the on-line sub-phase for all queue sizes right after the convergence of the TiMDP model in the off-line phase. Thus, the on-line sub-phase can also be solved before the shift, but it is an essential step that must be executed to get a correct truck dispatching.

¹ Pentium Quad Core [email protected], 4 GB RAM

Considering that a truck asks for dispatching at C at current time $t_d$ and the current size of Q1 is zero, the Q value of Q1 referenced to C is calculated in the on-line sub-phase:

$$Q(C, t, \text{move\_shovel\_1}) = L(\mu_1 \mid C, \text{move\_shovel\_1}) \cdot U(\mu_1, t) + L(\mu_2 \mid C, \text{move\_shovel\_1}) \cdot U(\mu_2, t) + L(\mu_3 \mid C, \text{move\_shovel\_1}) \cdot U(\mu_3, t) . \qquad (4.3)$$

The utility U is

$$U(\mu_1, t) = t_{\text{shovel\_1}\,\mu_1} \bullet Q(S_1', t, \text{move\_queue}) . \qquad (4.4)$$

During the shift, in the assignment sub-phase, the selected shovel that the truck must travel to is given by the action defined by the policy $\pi^*$, which compares the Q values of the state C (move_shovel actions):

$$\pi^*(C, t_d) = \arg\max_{\text{move\_shovel}} \big( Q(C, t_d, \text{move\_shovel\_1}),\; Q(C, t_d, \text{move\_shovel\_2}),\; Q(C, t_d, \text{move\_shovel\_3}) \big) . \qquad (4.5)$$
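The on-line and assignment sub-phases can be illustrated with a small Python sketch of EQS. 4.3 to 4.5. It is not the implementation used in this work: the likelihoods, travel times, and queue-state Q functions are hypothetical toy data, the likelihoods are taken as fixed at the dispatching instant, and the operator of EQ. 4.4 is assumed to reduce to a time shift because the travel durations are treated as exact.

```python
def value_at(values, t, step):
    """Look up a discretized function at time t; zero beyond the shift end."""
    idx = round(t / step)
    return values[idx] if 0 <= idx < len(values) else 0.0


def q_at_crusher(t, likelihood, travel_min, q_queue, step):
    """Eq. (4.3) with U(mu, t) from Eq. (4.4).

    Assumption: the travel duration under outcome mu is exact, so the
    operator in Eq. (4.4) reduces to a shift of the queue-state Q function.
    """
    return sum(
        likelihood[mu] * value_at(q_queue, t + travel_min[mu], step)
        for mu in likelihood
    )


def dispatch(t, shovels, step):
    """Eq. (4.5): return the move_shovel action with the largest Q(C, t, .)."""
    scores = {
        action: q_at_crusher(
            t, m["likelihood"], m["travel_min"], m["q_queue"], step
        )
        for action, m in shovels.items()
    }
    return max(scores, key=scores.get), scores


# Toy data (hypothetical): 60-minute horizon, 1-minute step, two shovels.
STEP = 1.0
SHOVELS = {
    "move_shovel_1": {
        "likelihood": {"mu1": 0.5, "mu2": 0.5},
        "travel_min": {"mu1": 5.0, "mu2": 10.0},
        "q_queue": [10.0 * (60 - k) for k in range(61)],
    },
    "move_shovel_2": {
        "likelihood": {"mu1": 1.0},
        "travel_min": {"mu1": 2.0},
        "q_queue": [8.0 * (60 - k) for k in range(61)],
    },
}

best_action, q_values = dispatch(0.0, SHOVELS, STEP)
```

In this toy case the expected tonnage referred to state C is 525 for move_shovel_1 and 464 for move_shovel_2, so the truck is assigned to shovel 1.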

4.2.3.2 TiMDP results and analysis

The simulations presented in this section concern the off-line and on-line phases (executed before the shift) of the TiMDP model, which uses all mine data presented so far. The assignment sub-phase, which returns the final results of this method (that is, the total tonnage production), is presented in the next chapter, because it requires a simulation that considers all trucks and runs over the whole shift. All simulation results are displayed graphically as expected tonnage production (tons) versus time (minutes). In order to understand the main characteristics of the method, we present a variety of simulations combining different shovels, queue sizes, and phases (off-line and on-line).

The differences between the off-line (Normal TiMDP) and on-line (Dislocated TiMDP) phases are shown in FIG. 4.7 for the dispatching decision of T1 with the queue size at shovel 1 equal to zero². FIG. 4.7b shows the detail for the time-window (blocking of the path between C and S1), in which we can observe more carefully the differences between tonnage productions at the same instant. As commented in the previous section, the value indicated in the off-line phase represents the expected tonnage for the current size of the queue, represented by states Q; however, this value must be referred to state C, which results in the presented differences between phases. The time-window is delimited by the first and last discontinuities in the function, in the interval 295-355 minutes for the Normal TiMDP. The difference from the original time-window, which represents the path blockage between 300-360 minutes, can be explained by TiMDP theory, in which decisions depend on subsequent action durations. Therefore, if T1 moves from S1’ to Q1 at instant 355, the truck driver will have the choice (considering that there is only one truck in the mine) to take the shortest path in the return travel, because the size of the queue is equal to zero and its loading takes 5 minutes. However, as the decisions are taken in state C, we must consider the Dislocated TiMDP function to analyze the time-window behavior. Now, we can observe a difference in the time mark of the first function discontinuity when comparing Normal and Dislocated TiMDPs. This difference is explained by the dislocation of the function caused by the expected time that the truck might take to travel from the crusher to the shovel. The last discontinuity moves exactly to instant 360 minutes, which is the unblocking instant of the shortest path between the crusher and shovel 1. The other discontinuities present in the Dislocated TiMDP can be explained by the outcome likelihood functions (FIG. 4.4) applied to EQS. 4.3 and 4.4, which are used to refer the decision to state C.

² We define the term Dislocated TiMDP based on the function dislocation that the on-line sub-phase causes on the original TiMDP (calculated in the off-line phase), defined here as Normal TiMDP.

We can observe another effect of the on-line sub-phase in FIG. 4.7c, which is the dislocation in time of the last nonzero value of tonnage production. A zero value in the function indicates that the truck should go to a parking lot because of the length of the shift: the crusher ends its work exactly at time 600 minutes, and a truck that travels to the shovel may (in expected values) find the crusher out of operation on its return. Thus, in the simulations presented in the next chapter, the trucks are always sent to a parking lot if the expected values of tonnage production for all shovels are equal to zero. In this example, we can clearly observe the dislocation of the TiMDP function caused by the on-line sub-phase.

The next figures presented in this section refer to the on-line sub-phase. FIG. 4.8 presents the differences between the values of expected tonnage production considering all shovels and queues of size zero. FIGs. 4.8b and 4.8c show how these values differ along the shift. For example, at around 293 minutes, the policy (defined in the assignment sub-phase and found based on the highest tonnage production value) changes from Shovel 1 to Shovel 3. This change in the policy can be explained by the time-window. Shovel 1 becomes the best dispatching decision again at 360 minutes. At around 572 minutes we observe a change in the policies, which depends on the approaching end of the shift and on the timespans in the system, such as $t_{shovel}$ and $t_{load}$. These decisions, based on expected tonnage production, current time, and queue size, are all executed in the assignment sub-phase.

FIGURE 4.7 – Expected tonnage production at crusher C (Truck 1 - Shovel 1 - Queue 0).

The differences between the expected tonnage production for the same shovel and truck with different queue sizes are presented in FIG. 4.9 for T2 and Shovel 3. We can observe that the differences remain almost the same during most of the shift (FIG. 4.9b), except at the end (FIG. 4.9c), where they are highly dependent on the current time. Clearly, the truck should go to the parking lot earlier if the queue is larger.

FIG. 4.10 compares results under a more realistic behavior of the mine environment, in which the queue sizes differ from one another during the shift. We observe in the zoomed figures (FIGs. 4.10b and 4.10c) that the policies change depending on the current time; before time 300 the action move_shovel_1 is better than the action move_shovel_2, but it is the worst action during the period 300-333. We can note that move_shovel_3, even though it leads to the longest queue, is a good action to select during most of the time. This occurs because of the average loading rate of Shovel 3, which is 2.5 times higher than that of Shovel 1 and 5 times higher than that of Shovel 2. We must also note that the sizes of the queues change throughout the shift, which indeed modifies the policies; however, in this section we show comparisons among fixed-size queues only for a better understanding of the TiMDP model.

Up to this point, we have shown results of TiMDP models considering standard time representations (exact durations of the actions), whereas time in a real-world problem tends to be inexact. Let us then consider, as presented in Table 4.3, the action durations represented by Gaussian distributions. In order to show the differences between standard and Gaussian time representations, we present in FIG. 4.11 two graphs for the same condition, namely the expected tonnage production at C for T3 with queue sizes equal to zero at Shovels 1, 2, and 3. We can observe a smoother function for the Gaussian representation (FIG. 4.11b) compared to the standard representation (FIG. 4.11a), which can be explained by the convolution operations of the Q functions with the discretized Gaussian representations that are used in the TiMDP solution. In order to guarantee a good solution (convergence of the TiMDP solution to near-optimal values) and to limit the use of memory in simulations, we used a discretization step of 0.2 minutes³.

FIGURE 4.8 – Expected tonnage production at crusher C (Truck 1 - Queue 0).

FIGURE 4.9 – Expected tonnage production at crusher C (Truck 2 - Shovel 3).

FIGURE 4.10 – Expected tonnage production at crusher C (Truck 2).
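The smoothing effect described above can be reproduced with a small Python sketch. It assumes a toy value function with a single step discontinuity and a hypothetical Gaussian duration (mean 2.0 min, standard deviation 0.4 min); the expectation over the discretized duration plays the role of the convolution used in the TiMDP solution. None of these numbers come from Table 4.3.

```python
import math


def gaussian_pmf(mean, std, step, width=4.0):
    """Discretize N(mean, std) on a grid of `step` minutes.

    The support is truncated at mean +/- width*std and the weights are
    renormalized to sum to 1. Returns (delay in grid steps, probability).
    """
    lo = max(0, round((mean - width * std) / step))
    hi = round((mean + width * std) / step)
    ks = range(lo, hi + 1)
    weights = [math.exp(-0.5 * ((k * step - mean) / std) ** 2) for k in ks]
    total = sum(weights)
    return [(k, w / total) for k, w in zip(ks, weights)]


def expected_value_after(values, pmf):
    """U[i] = sum_k p(k) * values[i + k]; beyond the horizon counts as 0."""
    n = len(values)
    return [
        sum(p * values[i + k] for k, p in pmf if i + k < n) for i in range(n)
    ]


def largest_jump(f):
    """Largest absolute difference between neighboring grid points."""
    return max(abs(a - b) for a, b in zip(f, f[1:]))


STEP = 0.2  # discretization step in minutes, as in the text
VALUES = [100.0] * 50 + [0.0] * 50  # toy value function with one step

PMF = gaussian_pmf(2.0, 0.4, STEP)  # hypothetical Gaussian duration
EXACT = expected_value_after(VALUES, [(10, 1.0)])  # exact 2.0 min duration
GAUSS = expected_value_after(VALUES, PMF)
```

With the exact duration the discontinuity is only shifted in time, whereas with the Gaussian duration it is spread over several grid points, which is the smoothness visible in FIG. 4.11b.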

The effects of the Gaussian representations can be better observed in FIG. 4.12. Comparing FIGs. 4.12b2 and 4.12b1, we observe the increase of the expected tonnage production introduced by the Gaussian representations. Policies can also change due to the behavior of the Gaussian distribution; originally the selected action between times 353 and 354 was move_shovel_2 (FIG. 4.12c1), which changes to action move_shovel_1 in the Gaussian representation (FIG. 4.12c2). The smoothness of the Gaussian representation can also be observed by comparing FIGs. 4.12d1 and 4.12d2, and 4.12e1 and 4.12e2. We note that all those modifications result from the combination of all Gaussian distributions present in the model, as shown in Table 4.3, and that it can be difficult to predict the behavior of this type of representation due to the high number of combinations of values that are computed in a TiMDP solution. Certainly, these modifications are more significant in a system with a complete Gaussian representation of all involved times.

³ Discretization steps smaller than 0.2 minutes caused memory overflow because of the use of a 32-bit operating system. Steps larger than 1 minute returned results very different from those of the standard TiMDP. We progressively reduced the discretization step down to 0.2 minutes, observing the convergence tendency of the results and avoiding memory overflow.


FIGURE 4.11 – Comparative of expected tonnage production at crusher C (Truck 3 - Queue 0) for standard and Gauss representations. Panels: a) Expected Tonnage Production at Crusher C (Truck 3 - Queue 0); b) Expected Tonnage Production at Crusher C (Truck 3 - Queue 0 - Gauss). Axes: Time (min) versus Tonnage Production (t); curves: Shovel 1, Shovel 2, Shovel 3.


FIGURE 4.12 – Comparative of expected tonnage production at crusher C (Truck 1 - Queue 0) for standard and Gauss representations. Panels: full-shift curves titled "Expected Tonnage Production at Crusher C (Truck 1 - Queue 0)" and "Expected Tonnage Production at Crusher C (Truck 1 - Queue 0 - Gauss)", each with zoomed views over 0-5, 290-370, 353-354, and 520-600 minutes. Axes: Time (min) versus Tonnage Production (t); curves: Shovel 1, Shovel 2, Shovel 3.


4.2.4 Genetic Algorithm (GA)

All proposed methods up to this point are based on the 1-truck-for-n-shovels strategy, which is egoistic because the dispatching decision does not take other incoming trucks into account. In order to reduce this problem and try to improve the results, we propose truck dispatching using a Genetic Algorithm (GA), which is based on the m-trucks-for-n-shovels strategy. As GA theory is well known in the combinatorial optimization community, we present its general theory in Appendix A.

The GA technique is applied to the problem at two distinct decision instants: the first dispatching decision at the start of the shift, when all trucks are available, and the subsequent decisions, in which usually a single truck asks for dispatching. For the first decision instant, we organize a queue of trucks, in which the decisions are taken from the first to the last truck in the queue. The goal is then to minimize the summed truck cycle time. This method uses the MTCT heuristic as the fitness function in the selection phase, applied to a set of trucks in order to minimize their total cycle time. The fitness function also considers the delays caused by traffic (faster trucks can be stuck behind slower trucks) and by queues at the shovels.

The GA chromosome is composed of two arrays (FIG. 4.13): the first array represents the shovel to which each truck is assigned, and the second array represents the position of each truck in the decision queue. We assume that, after the execution of the GA, the decision is executed instantly, so that the truck waiting time in the decision queue is null.

Given the chromosome configuration, we initialize a population composed of 2500 individuals. This number may seem small given the high combinatorial characteristic of the chromosome (on the order of $10^{19}$ in our example); however, the time necessary to converge to a good solution is directly related to the population size. In our mine problem, the dispatches occur in real time, so a long decision time is not acceptable (because of the delays inserted in the production). In our tests, using the proposed population size, we have obtained results that are close enough to those of the same GA formulation with bigger initial populations. Using our GA formulation, the final results can be obtained in around one minute, which is an acceptable time in a real-time truck dispatching environment.

FIGURE 4.13 – Truck dispatching GA chromosome. For each truck T#1 to T#15, the first array holds the assigned shovel (ST#1 to ST#15) and the second array holds the queue position (PT#1 to PT#15).
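A minimal Python sketch of the chromosome and of the initial population follows. It assumes that shovels are encoded as the integers 1 to 3 and that the initial individuals are drawn uniformly at random; both choices, as well as all names, are illustrative assumptions rather than details given in the text. The sketch also checks the order of magnitude of the search space, taken here as the number of shovel assignments times the number of queue orderings.

```python
import math
import random

N_TRUCKS = 15
SHOVELS = (1, 2, 3)
POPULATION_SIZE = 2500


def random_chromosome(rng):
    """Return (shovel array, queue-position array) for the N_TRUCKS trucks."""
    shovels = [rng.choice(SHOVELS) for _ in range(N_TRUCKS)]
    positions = list(range(1, N_TRUCKS + 1))
    rng.shuffle(positions)  # a permutation: queue positions never repeat
    return shovels, positions


def is_valid(chromosome):
    """Check array lengths, shovel codes, and the permutation property."""
    shovels, positions = chromosome
    return (
        len(shovels) == N_TRUCKS
        and all(s in SHOVELS for s in shovels)
        and sorted(positions) == list(range(1, N_TRUCKS + 1))
    )


# Search space: 3^15 shovel assignments times 15! queue orderings.
SEARCH_SPACE = len(SHOVELS) ** N_TRUCKS * math.factorial(N_TRUCKS)

rng = random.Random(0)  # fixed seed only to make the sketch reproducible
population = [random_chromosome(rng) for _ in range(POPULATION_SIZE)]
```

Under these assumptions the search space has about 1.9 x 10^19 elements, which agrees with the order of magnitude stated above.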

The initial selection is executed two by two individuals (binary tournament selection), selecting the best one based on the fitness function. After this phase, the population is reduced to half its original size, that is, 1250 individuals. In the next GA step, the reproduction phase, we proceed with pairwise crossover of individuals (FIG. 4.14), only for the first array of the chromosome, with a defined probability of 0.9; in order to recover the original population size, each crossover generates 4 sons. In the crossover example, the sons are first generated as exact copies of their fathers; we adopt Sons #1 and #3 as copies of Father #1, and Sons #2 and #4 as copies of Father #2. If the crossover is accepted (according to the defined probability), the start and end genes are randomly selected and the genes in that range are replaced by the genes in the same range of the other father; e.g. Son #1 is initially an exact copy of Father #1 and the crossover indicates that its genes from T#4 to T#9 must be exchanged for the same gene range of Father #2.

FIGURE 4.14 – Truck dispatching GA crossover. Crossover 1 (x1 = T4, x2 = T9) generates Sons #1 and #2; Crossover 2 (x1 = T13, x2 = T15) generates Sons #3 and #4.

FIGURE 4.15 – Truck dispatching GA mutation. Son #1 swaps T3 and T13; Son #4 swaps T10 and T12.
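The crossover example of FIG. 4.14 can be reproduced with the following Python sketch. The shovel arrays of the two fathers are transcribed from the figure (1, 2, 3 standing for S1, S2, S3); the function name and the integer encoding are assumptions made here. The random acceptance with probability 0.9 and the random choice of the range are left out, so that only the deterministic gene exchange is shown.

```python
# Shovel arrays of FIG. 4.14, trucks T#1 .. T#15 (1 = S1, 2 = S2, 3 = S3).
FATHER_1 = [1, 2, 1, 1, 2, 3, 3, 3, 1, 2, 2, 3, 1, 1, 1]
FATHER_2 = [2, 2, 1, 3, 2, 3, 1, 3, 2, 1, 2, 3, 1, 2, 3]


def crossover_pair(father_a, father_b, first, last):
    """Copy both fathers, then exchange the genes of trucks first..last.

    `first` and `last` are 1-based truck numbers and the range is inclusive.
    """
    son_a, son_b = father_a[:], father_b[:]
    son_a[first - 1:last] = father_b[first - 1:last]
    son_b[first - 1:last] = father_a[first - 1:last]
    return son_a, son_b


# Crossover 1: x1 = T4, x2 = T9; Crossover 2: x1 = T13, x2 = T15.
son_1, son_2 = crossover_pair(FATHER_1, FATHER_2, 4, 9)
son_3, son_4 = crossover_pair(FATHER_1, FATHER_2, 13, 15)
```

The four sons obtained this way match the shovel arrays shown in the figure; the queue-position arrays are simply inherited from the corresponding father.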

The genes in the second array of the chromosome represent the truck position in the initial decision queue; their values cannot repeat along the array, which prevents the use of the crossover operation on this array. The next step of our GA dispatching model is therefore the mutation of the second array of the chromosome (queue position), which occurs with a probability of 0.01 and consists of a random swap between the values of two genes. In our example (FIG. 4.15), the mutation operation is executed in Son #1, swapping its genes T#3 and T#13, and in Son #4, swapping its genes T#10 and T#12.
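A short Python sketch of the swap mutation is given below. The queue-position array of Son #1 is transcribed from FIG. 4.15; applying the probability of 0.01 once per individual is an assumption of this sketch, since the text does not state whether it applies per individual or per gene.

```python
import random

# Queue positions of Son #1 before mutation (FIG. 4.15), trucks T#1 .. T#15.
SON_1_POSITIONS = [1, 3, 5, 4, 12, 11, 15, 6, 7, 9, 8, 10, 13, 2, 14]


def swap_genes(positions, truck_a, truck_b):
    """Swap the queue positions of two trucks (1-based truck numbers)."""
    mutated = positions[:]
    i, j = truck_a - 1, truck_b - 1
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


def mutate(positions, rng, p_mut=0.01):
    """With probability p_mut, swap the positions of two random trucks.

    Assumption: the probability is applied once per individual.
    """
    if rng.random() >= p_mut:
        return positions[:]
    truck_a, truck_b = rng.sample(range(1, len(positions) + 1), 2)
    return swap_genes(positions, truck_a, truck_b)


# Mutation of Son #1 in FIG. 4.15: T#3 and T#13 exchange positions.
son_1_mutated = swap_genes(SON_1_POSITIONS, 3, 13)
```

Because the operation is a swap, the array remains a permutation of the queue positions, which is the reason mutation can be applied where crossover cannot.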

We assume that the reproduction policy is elitist, that is, the best father takes the place of the worst son if and only if the father's fitness value is better than that of the son. In our example (FIG. 4.16), the best father is #2 (smallest truck cycle between the fathers), which takes the place of Son #4 because its total truck cycle is smaller than that of Son #4. This elitist policy ensures that the best individual of each generation is maintained, therefore allowing the convergence of the algorithm.

FIGURE 4.16 – Truck dispatching GA elitist behavior (annotation in the figure: Total Truck Cycle = 116 [min]).
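The elitist replacement can be sketched in Python as follows. The total cycle times assigned to the individuals are hypothetical values chosen only to reproduce the outcome of the example (Father #2 replacing Son #4); they are not read from the figure.

```python
def apply_elitism(fathers, sons, cycle_time):
    """Replace the worst son by the best father when the father is better.

    `cycle_time` maps an individual to its total truck cycle time (smaller
    is better). The sons list is copied, not modified in place.
    """
    best_father = min(fathers, key=cycle_time)
    worst = max(range(len(sons)), key=lambda i: cycle_time(sons[i]))
    survivors = list(sons)
    if cycle_time(best_father) < cycle_time(sons[worst]):
        survivors[worst] = best_father
    return survivors


# Hypothetical total cycle times in minutes, for illustration only.
CYCLE_MIN = {
    "Father #1": 120, "Father #2": 116,
    "Son #1": 110, "Son #2": 112, "Son #3": 114, "Son #4": 125,
}

survivors = apply_elitism(
    ["Father #1", "Father #2"],
    ["Son #1", "Son #2", "Son #3", "Son #4"],
    CYCLE_MIN.get,
)
```

If the worst son were already better than the best father, the four sons would be kept unchanged.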

Finally, the individuals resulting from the reproduction operations are shown in FIG. 4.17. These reproduction steps are applied two by two, following a sequential order, to all individuals in the population. Hence, the population will double its size after the reproduction of all individuals, returning to its original size. After that, the selection phase is executed again, in order to select the best individuals and reduce the population size for another reproduction phase, starting a new generation. The convergence was obtained in around 50 generations, and took less than one minute. The solution, which is the first truck assignment, is the best individual after convergence.

FIGURE 4.17 – Truck dispatching GA reproduction result (Son #1, Son #2, Son #3, and Father #2).
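The population bookkeeping of one generation (2500 individuals halved by selection, then restored by reproduction) can be checked with a small Python sketch of the binary tournament. Pairing consecutive individuals is an assumption of this sketch, and the toy individuals are plain numbers standing in for total cycle times.

```python
def binary_tournament(population, cycle_time):
    """Keep, from each consecutive pair, the individual with the smaller
    total cycle time. Consecutive pairing is an assumption of this sketch."""
    if len(population) % 2:
        raise ValueError("population size must be even")
    return [
        min(a, b, key=cycle_time)
        for a, b in zip(population[0::2], population[1::2])
    ]


# Toy population: each individual is represented by its own cycle time.
population = list(range(2500))
selected = binary_tournament(population, cycle_time=lambda ind: ind)

# Each pair of selected fathers generates 4 sons in the reproduction phase.
offspring_count = (len(selected) // 2) * 4
```

The selection leaves 1250 individuals and the reproduction brings the population back to 2500, as described above.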

For the next decision instants we consider that only one truck asks for dispatching at a time. Now, the GA dispatching method considers for shovel assignment the asking truck and the next m trucks estimated to arrive at state C within the next tGA time period. The shovel assignments for the trucks expected to arrive at state C during the considered tGA are placed in a so-called dispatching list. Another dispatching list is generated only when the first truck arrives at state C after the considered tGA. As this method performs the truck dispatch considering more than one truck, it is considered an m-trucks-for-n-shovels strategy.

Now, the chromosome (represented as the first array of the previous chromosome

presented in FIG. 4.13) is defined considering observations on trucks being loaded at

shovels and unloaded at the crusher, waiting in the queues, and traveling through the

paths. Some genes may indicate zero, representing that the truck was not observed (its

arrival time on state C cannot be estimated) and it will not be considered in the GA

algorithm for that dispatching decision.

In order to estimate the truck cycle, we add an auxiliary array to the chromosome (FIG. 4.18), which only indicates the estimated arrival time of the truck (ta) at state C and is not used in crossovers or mutations.

FIGURE 4.18 – Auxiliary chromosome array. For each truck T#1 to T#15, the array holds the estimated arrival time (taT#1 to taT#15).

Therefore, the GA algorithm is executed considering the current dispatching truck and the next trucks that are expected to ask for dispatching in the next tGA minutes. The estimated arrival times of trucks at state C depend on observations of their current states. However, some specific characteristics of the dispatching simulator hinder the estimation of arrival times, such as the impossibility of observing a truck while it travels through the paths and of observing the position of a specific truck in any queue⁴. These issues and the stochastic behavior of the problem (uncertainty in the selection of the traveling paths) add some imprecision to the truck arrival times, which may lead to results that differ from those predicted by the GA algorithm. In this case, when a truck arrives at state C before tGA and it is not in the dispatching list, a dispatch heuristic method (such as MTCT) must be executed in order to perform the shovel assignment. Certainly, these situations degrade the quality of the general GA method. In order to minimize this problem, we limit the maximum time considered for chromosome construction (tGA) based on current truck observations and the estimation of arrival times at state C. This time limitation is presented in ALG. 3, in which tGA is found based on the estimated truck arrival times at state C. In the algorithm, T_S1, T_S2, and T_S3 are the sets of estimated arrival times of the observed trucks that are being loaded or are at the first and second positions in the queues at shovel 1, shovel 2, and shovel 3, respectively; tcurrent is the current shift time taken at the GA dispatching decision. In the t_max_S calculation, a constant is added, namely the loading time of the smallest truck at each shovel. The tGA is basically the smallest maximum arrival time at the crusher of the considered trucks. Therefore, the trucks composing the chromosome must have their estimated arrival time at state C between tcurrent and tGA. This approximation added more GA dispatching executions; however, the quality of the results was considerably improved due to the drastic reduction of heuristic dispatches.

⁴ We added the possibility of observing the first and the second trucks in the queue in order to improve our results.

Algorithm 3: Calculation of maximum truck arrival time in state C

Input: CALC_TGA(T_S1, T_S2, T_S3, tcurrent)
Output: tGA
t_max_S1 ← max(T_S1) + 5 ;
t_max_S2 ← max(T_S2) + 10 ;
t_max_S3 ← max(T_S3) + 2 ;
t_max_a ← min(t_max_S1, t_max_S2, t_max_S3) ;
t_max ← tcurrent ;
foreach t_s1 ∈ T_S1 do
    if t_s1 ≤ t_max_a AND t_s1 ≥ t_max then
        t_max ← t_s1 ;
    end
end
foreach t_s2 ∈ T_S2 do
    if t_s2 ≤ t_max_a AND t_s2 ≥ t_max then
        t_max ← t_s2 ;
    end
end
foreach t_s3 ∈ T_S3 do
    if t_s3 ≤ t_max_a AND t_s3 ≥ t_max then
        t_max ← t_s3 ;
    end
end
tGA ← t_max ;
return tGA ;
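ALG. 3 can be written compactly in Python, as in the sketch below. The loading-time constants (5, 10, and 2 minutes) are those of the algorithm; the function name, the dictionary layout, and the requirement that each set contains at least one observed arrival time are assumptions of this sketch.

```python
# Loading time (minutes) of the smallest truck at shovels 1, 2, and 3,
# i.e. the constants added in ALG. 3.
LOAD_MIN = {1: 5, 2: 10, 3: 2}


def calc_tga(t_s1, t_s2, t_s3, t_current):
    """Maximum truck arrival time at state C considered by the GA (ALG. 3).

    Assumption: each list holds at least one estimated arrival time.
    """
    arrivals = {1: t_s1, 2: t_s2, 3: t_s3}
    # Smallest, over the shovels, of (latest arrival + loading constant).
    t_max_a = min(max(times) + LOAD_MIN[s] for s, times in arrivals.items())
    t_max = t_current
    for times in arrivals.values():
        for t in times:
            # Keep the latest arrival that does not exceed t_max_a.
            if t_max <= t <= t_max_a:
                t_max = t
    return t_max


# Hypothetical arrival estimates (minutes of shift time).
t_ga = calc_tga([12, 20], [8, 15], [10, 30], t_current=5)
```

In this hypothetical case the bound t_max_a is 25 minutes, so the arrival at 30 minutes is excluded and tGA is 20 minutes.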

The GA is always started when the first truck arrives at state C after the previously calculated tGA, following the steps shown previously (first dispatching decision), with small modifications because of the differences in the current chromosome construction. As the chromosome is formed by only one array indicating the truck positions, the mutation phase, which was executed on the decision queue position, is now executed on the truck positions, following the previously defined procedures. The fitness function follows the previous one, which is used for minimizing the total cycle time, but now considers the estimated truck arrival times (ta) to find the cycle time for each truck represented in the chromosome. As the number of shovels attended (indicated by the chromosome construction) depends on the observed trucks, the size of the population used in the GA algorithm depends on the chromosome configuration. We have adjusted the population size in order to converge to good results in a short time (less than one minute), as required by real-time dispatching.

4.2.5 G-TiMDP

The introduced TiMDP model for truck dispatching seems to be a good representation of this problem because of its specific characteristics: stochastic behavior (real-world problems are often uncertain), sequential decision making (the accumulated reward, or value function, considers the expected results of all sequential actions during the whole shift, hence the reward can be assigned to just one action, which in our problem is unload_truck), and time-dependent decisions (time-windows and variations of outcomes over time can be easily considered). However, due to the single-dependent-agent approximation, the model follows the 1-truck-for-n-shovels strategy, that is, the dispatching decisions are egoistic, which leads to suboptimal results. In order to improve the results, we introduce the Genetic TiMDP (G-TiMDP), a hybrid algorithm that combines the sequential decisions in uncertain environments of the TiMDP with the combinatorial characteristic of the GA, leading to a new m-trucks-for-n-shovels method.

The G-TiMDP differs from the GA model (presented in the last subsection) only in the selection phase, in which the fitness function is evaluated based on the maximization of the Expected Tonnage Production given by the proposed TiMDP model. Following the TiMDP phases, in this hybrid dispatching method the off-line and on-line phases are calculated as before and provide their results to a new assignment phase, which is now performed by the GA dispatching method. As the TiMDP results of these phases are found before the mine shift and the shovel assignments result from the GA method (as in the previous subsection), the dispatching time of G-TiMDP remains the same as that of the pure GA method, that is, less than one minute.
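One possible shape of such a fitness function is sketched below in Python. It is only an illustration under strong assumptions: the precomputed TiMDP results are represented by a hypothetical table `q_table[shovel][queue_size][time_index]`, the truck type is ignored, and the queue size met by a truck is crudely estimated as the number of earlier trucks in the chromosome sent to the same shovel. The text does not specify these details.

```python
def expected_tonnage_fitness(assignment, arrival_min, q_table, step=1.0):
    """Sum of expected tonnage values for the trucks of one chromosome.

    assignment[i]  : shovel of truck i (0 = truck not observed, skipped)
    arrival_min[i] : estimated arrival time of truck i at state C (minutes)
    q_table        : hypothetical precomputed table indexed as
                     q_table[shovel][queue_size][time_index]
    Larger is better (maximization).
    """
    observed = sorted(
        (ta, shovel)
        for shovel, ta in zip(assignment, arrival_min)
        if shovel != 0
    )
    queue_len = {}
    total = 0.0
    for ta, shovel in observed:
        by_queue = q_table[shovel]
        # Crude queue estimate: earlier trucks sent to the same shovel.
        size = min(queue_len.get(shovel, 0), len(by_queue) - 1)
        idx = min(round(ta / step), len(by_queue[size]) - 1)
        total += by_queue[size][idx]
        queue_len[shovel] = queue_len.get(shovel, 0) + 1
    return total


# Hypothetical table: 2 shovels, queue sizes 0..1, 3 time steps.
Q_TABLE = {
    1: [[100.0, 90.0, 80.0], [60.0, 50.0, 40.0]],
    2: [[70.0, 70.0, 70.0], [30.0, 30.0, 30.0]],
}
ARRIVALS = [0.0, 1.0, 2.0, None]  # fourth truck not observed

fitness_a = expected_tonnage_fitness([1, 1, 2, 0], ARRIVALS, Q_TABLE)
fitness_b = expected_tonnage_fitness([1, 2, 2, 0], ARRIVALS, Q_TABLE)
```

With this toy table the first chromosome scores 220 and the second 200, so the selection phase would prefer the first one.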

5 Simulations and Analysis

We defined in the last chapter some truck dispatching methods that are applied to our example mine: (1) Greedy heuristic, (2) MTCT heuristic, (3) TiMDP model, (4) GA model, and (5) G-TiMDP model. In order to test the performance of these methods, we developed a simulation framework based on the example mine data, such as shovel characteristics and positions, truck characteristics, present uncertainties, shift length, and queue size limitations. The dispatching methods were evaluated by Monte Carlo simulation, and their results were compared using Student’s t-test, providing enough data for further quality analysis.
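As an illustration of the comparison step, the Python sketch below computes a two-sample Student's t statistic with pooled variance from two sets of Monte Carlo results. The samples are hypothetical numbers, not simulation results, and the choice of the pooled (equal-variance) form is an assumption, since the variant of the test is not detailed here.

```python
import math
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees of freedom). Equal variances are assumed.
    """
    na, nb = len(sample_a), len(sample_b)
    pooled = (
        (na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)
    ) / (na + nb - 2)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(
        pooled * (1 / na + 1 / nb)
    )
    return t, na + nb - 2


# Hypothetical total tonnage per simulated shift for two methods.
METHOD_A = [101.0, 102.0, 103.0, 104.0, 105.0]
METHOD_B = [102.0, 103.0, 104.0, 105.0, 106.0]

t_stat, dof = student_t(METHOD_A, METHOD_B)
```

The resulting statistic is then compared with the critical value of the t distribution for the given degrees of freedom and the chosen significance level.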

5.1 Simulation Framework

The proposed dispatching methods were developed and simulated using the software SimEvents™ (a MATLAB™ package). All simulations follow the characteristics of the proposed mine environment example and are executed over a 10-hour shift. The objective of the simulations is to compute the total tonnage production at the end of the shift, considering a fleet composed of 15 heterogeneous trucks, as already proposed. A general mine simulation environment is presented in FIG. 5.1. The dispatching methods use the same simulation framework, except for specific functionalities, shown in the next subsections.


Referring to FIG. 5.1, the trucks are treated as entities, which are generated in the Truck Generator block with their specific characteristics (based on truck type); these characteristics define the traveling times along the paths and the quantity of transported material. After that, at time zero (started by the Start Timer block), the trucks are positioned in the Priority Queue block following a priority scheme (faster trucks are positioned first; the GA and G-TiMDP methods follow the priority based on the first dispatching decision defined in Sections 4.2.4 and 4.2.5). The TiMDP block is specific to the TiMDP and G-TiMDP methods and is responsible for getting the results of the real-time phase from the workspace at the current time t. Entity attributes that carry important information for dispatching decisions, such as the size of the queues at the shovels, are set in the Set Attribute block with current data from the environment. The dispatching decision of each method, considering its specific characteristics, is taken in the Shovel Decision Function block. After the shovel assignment, the truck travels to the Shovel and then (after material loading) goes to the Crusher. The time the truck spends unloading at the crusher is calculated in the Crusher Time Function block. The quantity of material transported in a cycle is added to the total tonnage production by the Simout - Tonnage Transported block.
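The cycle described above (dispatch, travel, queue, load, return, unload, add tonnage) can be sketched in a few lines of Python. This is only an illustrative simplification, not the SimEvents model: it assumes identical trucks, deterministic times, one FIFO queue per shovel, and no queue at the crusher. The function name `simulate_shift`, its parameters, and the `choose_shovel` callback are hypothetical names introduced for this sketch.

```python
import heapq


def simulate_shift(n_trucks, capacity, travel_out, load_time, travel_back,
                   unload_time, choose_shovel, shift_end=600.0):
    """Tonnage delivered by the end of the shift (all times in minutes).

    Simplifying assumptions (not from the thesis): identical trucks,
    deterministic times, one FIFO queue per shovel, no crusher queue.
    """
    shovel_free_at = [0.0] * len(travel_out)      # when each shovel is idle again
    ready = [(0.0, truck) for truck in range(n_trucks)]  # trucks awaiting dispatch
    heapq.heapify(ready)
    tonnage = 0.0
    while ready and ready[0][0] < shift_end:
        now, truck = heapq.heappop(ready)
        s = choose_shovel(now, shovel_free_at)    # dispatching decision
        arrival = now + travel_out[s]
        start_loading = max(arrival, shovel_free_at[s])  # wait if the shovel is busy
        shovel_free_at[s] = start_loading + load_time[s]
        unloaded = shovel_free_at[s] + travel_back[s] + unload_time
        if unloaded <= shift_end:                 # only completed cycles count
            tonnage += capacity
            heapq.heappush(ready, (unloaded, truck))
    return tonnage


# One truck, one shovel, a 30-minute cycle, and 100 t per cycle.
total = simulate_shift(1, 100.0, [10.0], [5.0], [10.0], 5.0,
                       choose_shovel=lambda now, free_at: 0)
```

Passing the dispatching rule as a callback mirrors the role of the Shovel Decision Function block: the rest of the loop stays the same for every method.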


FIGURE 5.1 – General mine simulation environment.


FIGURE 5.2 – Shovel 1 block simulation environment detail.

The Shovel block is actually a group of blocks (FIG. 5.2). First, the truck must read the current time t (Read Timer block), which is needed for the time-dependent outcome (path selection). The path (Paths 1 block) is selected randomly in the Likelihood Function 1 block. After traveling along the path, the truck arrives at the shovel and first enters the Queue 1 block. Then, when the truck is at the first position of the queue and the shovel is idle, the truck loads material at the Shovel 1 block during the timespan given by the Shovel 1 Time Function block. The truck must then return to the crusher through the Path 2 block, in which the outcome path is defined by the Likelihood Function 2 block considering the current time given by the Read Timer 2 block.
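A time-dependent random path selection of this kind can be sketched as follows. The likelihood values below are hypothetical placeholders (the thesis values are not restated here), and renormalizing the likelihoods over the paths that remain open is an assumption of this sketch; only the µ1 blockage between 300 and 360 minutes comes from the example mine.

```python
import random

# Hypothetical likelihoods for the outcomes mu1, mu2, mu3 (placeholders only).
LIKELIHOOD = {"mu1": 0.6, "mu2": 0.3, "mu3": 0.1}
# Time-window in minutes during which an outcome is blocked.
BLOCKED = {"mu1": (300.0, 360.0)}


def select_path(t, rng):
    """Draw a path outcome at time t, skipping outcomes blocked at that time."""
    open_paths = [p for p in LIKELIHOOD
                  if not (p in BLOCKED and BLOCKED[p][0] <= t < BLOCKED[p][1])]
    weights = [LIKELIHOOD[p] for p in open_paths]
    # random.Random.choices normalizes the weights of the remaining outcomes.
    return rng.choices(open_paths, weights=weights, k=1)[0]


rng = random.Random(0)
during_window = [select_path(330.0, rng) for _ in range(200)]
outside_window = select_path(100.0, rng)
```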

In order to get important data (first and second trucks in the queue) for the Calculation

of maximum truck arrival time in state C (ALG. 3) as required for both GA and G-TiMDP

methods, the Queue needs to be segmented (FIG. 5.3). The first and second positions in

the queue are represented by Single Server 1 and Single Server 2 blocks, respectively.

FIGURE 5.3 – Queue 1 block simulation environment detail.

As the trucks have the possibility of randomly choosing the paths (outcomes) from the crusher to the shovels (or vice versa), the Path 1 block is represented by all possible paths (FIG. 5.4). Blocks Path µ1, Path µ2, and Path µ3 represent the µ1, µ2, and µ3 outcomes, respectively.

Our example mine environment has an important safety constraint: trucks are not allowed to overtake one another. This constraint is added to the simulation framework as shown in FIG. 5.5, in which the first truck on a path releases the truck behind it to continue through the simulation only after it has itself arrived at the shovel.
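The effect of the no-overtaking constraint on arrival times can be illustrated with a short sketch: a truck can never arrive before the truck ahead of it on the same path. The function name and the sample numbers are hypothetical and serve only as an illustration.

```python
def arrival_times_without_overtaking(departures, travel_times):
    """Arrival times on one path when no truck may pass the truck ahead.

    Trucks are listed in departure order; all times are in minutes.
    """
    arrivals = []
    previous_arrival = float("-inf")
    for depart, travel in zip(departures, travel_times):
        # A faster truck is held back until the truck ahead has arrived.
        arrival = max(depart + travel, previous_arrival)
        arrivals.append(arrival)
        previous_arrival = arrival
    return arrivals


# The second truck is faster (5 min) but is held behind the first one.
arrivals = arrival_times_without_overtaking([0, 1, 2], [10, 5, 12])
```

In this example the second truck would arrive at minute 6 on an empty path, but it reaches the shovel at minute 10, together with the slower truck in front of it.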

5.2 Dispatching Methods Behavior

The proposed dispatching methods were simulated in the developed framework and presented different behaviors with respect to tonnage production and queue formation. To show an initial result, we simulated all methods using the same seed for the random selection of the traveling paths; that is, the sequence of path outcomes is the same for all simulations, independently of methods and trucks.
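Reusing one seed so that every method sees the same sequence of path outcomes can be sketched as below. The weights and the seed value are hypothetical; the point is only that a seeded generator replays an identical sequence.

```python
import random


def path_outcome_sequence(seed, n_draws, weights=(0.6, 0.3, 0.1)):
    """Reproducible sequence of path outcomes (1, 2 or 3) for a given seed.

    The weights are hypothetical placeholders, not the thesis values.
    """
    rng = random.Random(seed)
    return rng.choices([1, 2, 3], weights=weights, k=n_draws)


# Every dispatching method replays the same sequence of path outcomes.
outcomes_for_greedy = path_outcome_sequence(seed=42, n_draws=1000)
outcomes_for_mtct = path_outcome_sequence(seed=42, n_draws=1000)
```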

FIGURE 5.4 – Paths 1 block simulation environment detail.

The first simulated method is the Greedy heuristic, for which we show the quantity of trucks at the shovels along the shift (FIG. 5.6). FIGs. 5.6a, 5.6b, and 5.6c show the quantity of trucks in the queues and traveling to Shovel 1, Shovel 2, and Shovel 3, respectively. Because overtakes are not allowed, all presented dispatching methods consider that the queue at a shovel is formed both by stopped trucks waiting for material loading and by trucks traveling to it. We note that the mean quantity of trucks in all queues tends to be the same over time. Indeed, because of the greedy behavior, trucks travel to the shovel with the smallest queue, which produces this balancing. In this simulation the system does not know about the existence of a time-window, and the trucks do not go to the parking lot at the end of the shift. This specific simulation returned a total tonnage production of 77 000 tons.
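The balancing produced by the greedy rule can be seen in a minimal sketch. The function name is hypothetical, and breaking ties in favor of the lowest shovel index is an assumption of this sketch.

```python
def greedy_dispatch(trucks_per_shovel):
    """Index of the shovel with the fewest trucks queued or already on the way."""
    return min(range(len(trucks_per_shovel)), key=lambda s: trucks_per_shovel[s])


# Dispatching nine trucks in a row (with no truck leaving) spreads them evenly.
counts = [0, 0, 0]
for _ in range(9):
    counts[greedy_dispatch(counts)] += 1
```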

FIGURE 5.5 – Path µ1 block simulation environment detail.

[Plot omitted: panels a) Shovel 1, b) Shovel 2, c) Shovel 3; panel title "Quantity of Trucks - Shovel 1 - Greedy"; x-axis: Time (min), 0 to 600; y-axis: Quantity of Trucks.]

FIGURE 5.6 – Quantity of trucks in shovels for the Greedy Heuristic simulation.

The quantity of trucks on the paths going to Shovel 1 is shown in FIG. 5.7. The outcomes µ1, µ2, and µ3 are shown in FIGs. 5.7a, 5.7b, and 5.7c, respectively. The plots indicate only the quantity of trucks traveling on the paths going to Shovel 1; trucks returning to the Crusher, or using the same path to go to another shovel, are not counted here (we consider that the paths to the shovels, despite being the same in the real problem, are different and are treated independently, each with its own outcomes). The time-window (the µ1 blockage between times 300 and 360 minutes) is clearly visible in FIG. 5.7a: during it, no truck is allowed to travel through path 1. The likelihood function is also reflected in the plots by the decreasing usage of the outcomes µ1, µ2, and µ3, in that order.

In the MTCT heuristic the trucks are dispatched according to the minimum truck cycle time, which is directly related to the shovel loading rates and, consequently, to the size of the shovel queues. This behavior makes the mean queue sizes differ from each other, which can be explained by the shovel loading rates. Indeed, the fastest shovels can serve more trucks in the same timespan; that is, more trucks can be sent to those shovels, which leads to a larger queue. FIG. 5.8 shows this behavior: the highest mean queue size is at Shovel 3 (FIG. 5.8c), followed by Shovel 1 (FIG. 5.8a), and then Shovel 2 (FIG. 5.8b). This dispatching heuristic knows about the path blockage (time-window) but not about the likelihood function; it assumes that the trucks always travel through the shortest available path, that is, outcome µ1 during the whole shift, except during the time-window, in which the outcome is µ2. The total tonnage production for this simulation was 88 900 tons.
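A minimum-cycle-time rule along these lines can be sketched as follows. The cycle-time formula used here (round-trip travel on the shortest available outcome, plus waiting for the trucks ahead, plus loading) is an assumption of this sketch and not necessarily the exact expression of the thesis; the dictionary keys, the sample times, and the use of the same time-window for every shovel are likewise hypothetical.

```python
def expected_cycle_time(t, shovel, blocked_window=(300.0, 360.0)):
    """Expected cycle time (min) for a truck dispatched at time t to `shovel`.

    Assumed formula: round trip on the shortest open outcome, plus the
    loading of the trucks ahead, plus the truck's own loading.
    """
    start, end = blocked_window
    outcome = "mu2" if start <= t < end else "mu1"   # shortest available path
    travel = shovel["travel"][outcome]
    waiting = shovel["trucks_ahead"] * shovel["load_time"]
    return 2 * travel + waiting + shovel["load_time"]


def mtct_dispatch(t, shovels):
    """Index of the shovel with the minimum expected truck cycle time."""
    return min(range(len(shovels)),
               key=lambda s: expected_cycle_time(t, shovels[s]))


# Hypothetical data: the second shovel is farther away but loads twice as fast.
shovels = [
    {"travel": {"mu1": 10.0, "mu2": 14.0}, "load_time": 4.0, "trucks_ahead": 3},
    {"travel": {"mu1": 12.0, "mu2": 15.0}, "load_time": 2.0, "trucks_ahead": 3},
]
choice = mtct_dispatch(0.0, shovels)
```

With equal queues, the faster shovel wins despite the longer trip, which is the mechanism behind the larger queues at fast shovels.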

In the MTCT heuristic simulation, trucks must go to the parking lot (FIG. 5.9) when it is no longer possible to complete a cycle before the end of the shift. In fact, due to the uncertainties present in the problem, such as time in the queues and path outcomes, some trucks are dispatched and do not return to the decision point (Crusher) before the end of the shift. This problem can be mitigated by adding a constant to the calculated cycle times, but this could worsen the results.
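The parking decision, including the optional constant added to the calculated cycle, can be written as a one-line test. The function and parameter names are hypothetical; the 600-minute shift end follows from the 10-hour shift.

```python
def should_park(t, expected_cycle, shift_end=600.0, safety_margin=0.0):
    """True when another cycle is not expected to finish before the shift ends.

    `safety_margin` plays the role of the constant added to the cycle time.
    """
    return t + expected_cycle + safety_margin > shift_end


park_late = should_park(570.0, 32.0)                           # cycle would end at 602
park_early = should_park(560.0, 32.0)                          # cycle would end at 592
park_with_margin = should_park(560.0, 32.0, safety_margin=10.0)
```

The last call shows the trade-off mentioned above: a margin avoids stranded trucks, but it also parks trucks that could still have completed a cycle.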


[Plot omitted: panels a) µ1, b) µ2, c) µ3; panel title "Quantity of Trucks - u1 - Shovel 1 - Greedy"; x-axis: Time (min), 0 to 600; y-axis: Quantity of Trucks.]

FIGURE 5.7 – Quantity of trucks in paths going to Shovel 1 for the Greedy Heuristic simulation.


[Plot omitted: panels a) "Quantity of Trucks - Shovel 1 - MTCT", b) "Quantity of Trucks - Shovel 2 - MTCT", c) "Quantity of Trucks - Shovel 3 - MTCT"; x-axis: Time (min), 0 to 600; y-axis: Quantity of Trucks.]

FIGURE 5.8 – Quantity of trucks in shovels for the MTCT Heuristic simulation.


[Plot omitted: "Trucks on Parking Lot - MTCT"; x-axis: Time (min), 575 to 600; y-axis: Truck Number, 1 to 15.]

FIGURE 5.9 – Trucks on parking lot for the MTCT heuristic.

The quantity of trucks at the shovels for the TiMDP model is shown in FIG. 5.10. It is hard to notice any difference in behavior from the previous dispatching method. However, due to knowledge of the likelihood function and consideration of sequential decisions, the tonnage production is slightly better: 90 200 tons.

The parking lot occupation for the TiMDP model is shown in FIG. 5.11. Due to uncertainties in the problem (e.g., time in the queues and outcomes of the move_shovel action), not all trucks reach the parking lot by the end of the shift.

The GA model for truck dispatching is based on the MTCT heuristic, and likewise it does not assume knowledge of the likelihood function for path selection. Moreover, its assumptions regarding the time-window are the same. However, dispatching is made considering the sequence of trucks arriving at the Crusher during the next time span t_GA, which leads to better results. The quantity of trucks at the shovels for this model is shown in FIG. 5.12. An interesting result is that the mean quantity of trucks at Shovel 1 (FIG. 5.12a) does not decrease during the time-window. Independently of this behavior, the results of this method are better than those for MTCT; its total tonnage production is 90 300 tons (see footnote 1). The parking lot occupation for the GA model is shown in FIG. 5.13.

[Plot omitted: panels a) "Quantity of Trucks - Shovel 1 - TiMDP", b) "Quantity of Trucks - Shovel 2 - TiMDP", c) "Quantity of Trucks - Shovel 3 - TiMDP"; x-axis: Time (min), 0 to 600; y-axis: Quantity of Trucks.]

FIGURE 5.10 – Quantity of trucks in shovels for TiMDP model simulation.

[Plot omitted: "Trucks on Parking Lot - TiMDP"; x-axis: Time (min), 582 to 600; y-axis: Truck Number.]

FIGURE 5.11 – Trucks on parking lot for TiMDP model.

The quantity of trucks at the shovels for the G-TiMDP model is presented in FIG. 5.14. Again, it is hard to identify substantial changes compared with the other presented methods. An interesting aspect is that Shovel 2 is used more often during the shift, with a peak of 4 trucks at it. The results of this m-trucks-for-n-shovels method were the best among all we tested: a production of 90 600 tons.

The parking lot occupation for the G-TiMDP model is shown in FIG. 5.15.

1 We note that this result is even better than the one presented for the TiMDP model; however, it is a specific result based on the considered path outcomes. Based only on this result, we cannot claim that this method is superior to TiMDP. A complete and statistically sound comparison is presented in the next section.


[Plot omitted: panels a), b), and c); panel title "Quantity of Trucks - Shovel 1 - GA"; x-axis: Time (min), 0 to 600; y-axis: Quantity of Trucks.]

FIGURE 5.12 – Quantity of trucks in shovels for the GA model simulation.


[Plot omitted: "Trucks on Parking Lot - GA"; x-axis: Time (min), 570 to 600; y-axis: Truck Number, 1 to 15.]

FIGURE 5.13 – Trucks on parking lot for the GA model.

5.3 Comparative Results and Analysis

Given the simulation framework and due to the stochastic behavior of the system

(path outcomes), we compare the presented truck dispatching methods using Monte Carlo

simulation (MOONEY, 1997).
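A Monte Carlo comparison of this kind reduces to running many independent replications and summarizing the tonnage per shift. In the sketch below, `toy_shift` is a hypothetical stand-in for one simulated shift (it is not the mine model), and the names `monte_carlo`, `run_shift`, and `base_seed` are introduced only for illustration.

```python
import random
import statistics


def monte_carlo(run_shift, n_runs, base_seed=0):
    """Run independent replications and summarize the tonnage per shift."""
    results = [run_shift(random.Random(base_seed + i)) for i in range(n_runs)]
    return {
        "sims": n_runs,
        "min": min(results),
        "max": max(results),
        "mean": statistics.mean(results),
        "std": statistics.stdev(results),   # sample standard deviation
    }


def toy_shift(rng):
    """Hypothetical stand-in for one simulated shift (not the mine model)."""
    return 100 * round(rng.gauss(900, 5))   # tonnage in multiples of 100 t


summary = monte_carlo(toy_shift, n_runs=200)
```

Giving each replication its own seeded generator keeps the runs independent yet reproducible.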

In the simulations, we used two different values of t_queue (the multiplying factor due to the queue size) for the TiMDP and G-TiMDP models. In the first simulation, we used as t_queue the time needed to load (t_load) the truck type with the mean capacity (truck T2). However, even though this value is a good initial approximation, t_queue will certainly not be the mean value of t_load (the numbers of trucks of each type are unbalanced, and the queues are parallel with different servers, or shovels). As a better approach, we ran some preliminary simulations and obtained t_queue as Average queue length/Average wait, whose values are given by the statistical information from the Queue block. An example is found in FIG. 5.16, in which the mean times in the queues of Shovel 1 and Shovel 3 are given in FIGs. 5.16a and 5.16b, respectively. The mean time of the Shovel 2 queue is not represented because that shovel is not constantly used, so the simulator cannot report the variables Average queue length and Average wait (these values are always shown as zero by the simulator). The value of t_queue must be taken after it has converged. Based on many observations, we adopted the values 6.2, 13 (estimated), and 2.5 minutes for t_queue of shovels 1, 2, and 3, respectively.

[Plot omitted: panels a), b), and c); panel title "Quantity of Trucks - Shovel 1 - G-TiMDP"; x-axis: Time (min), 0 to 600; y-axis: Quantity of Trucks.]

FIGURE 5.14 – Quantity of trucks in shovels for the G-TiMDP simulation.

[Plot omitted: "Trucks on Parking Lot - G-TiMDP"; x-axis: Time (min), 570 to 600; y-axis: Truck Number, 1 to 15.]

FIGURE 5.15 – Trucks on parking lot for the G-TiMDP model.
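One simple way to decide that a running estimate has converged is to accept it once it has stopped moving over a window of recent observations. The sketch below illustrates that idea only; the window length, the tolerance, and the function name are assumptions, and this is not a description of how SimEvents computes its statistics.

```python
def converged_running_mean(samples, window=50, tol=0.05):
    """Running mean of `samples` at the first point where it has settled.

    The estimate is accepted once the running mean has moved by less than
    `tol` over the last `window` observations; otherwise None is returned.
    """
    total = 0.0
    means = []
    for i, x in enumerate(samples, start=1):
        total += x
        means.append(total / i)
        if i > window and abs(means[-1] - means[-1 - window]) < tol:
            return means[-1]
    return None


steady = converged_running_mean([6.2] * 100)     # constant observations
too_short = converged_running_mean([1.0] * 10)   # fewer samples than the window
```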

The simulation results (Table 5.1) are presented for all methods, considering around 4500 simulations per method and a standard representation of the involved times, that is, the times are always exact. The TiMDP (1) and G-TiMDP (1) methods used the original t_queue, whereas the TiMDP (2) and G-TiMDP (2) methods used the estimated t_queue. Considering only the averages, the methods rank from best to worst as G-TiMDP, TiMDP, GA, MTCT heuristic, and Greedy heuristic, as anticipated in the previous sections. The TiMDP (2) results contradicted our predictions by being worse than the TiMDP (1) results, which can be explained by the single-dependent-agent approximations. However, the G-TiMDP (2) results, which are for an m-trucks-for-n-shovels strategy based on solution combinations, are better than those for G-TiMDP (1). Thus, we conclude that the t_queue adjustment is fundamental for achieving good results.

[Plot omitted: panels a) "Mean Time in Queue - Shovel 1 - TiMDP", b) "Mean Time in Queue - Shovel 3 - TiMDP"; x-axis: Time (min), 0 to 600; y-axis: Mean time (min).]

FIGURE 5.16 – Mean time in the queues for TiMDP model.

TABLE 5.1 – Monte Carlo simulations of truck dispatching methods using standard representation (standard deviation equals zero for all considered times).

Method      | Sims | Min (tons) | Max (tons) | Mean (tons) | Std Dev (tons)
Greedy      | 4722 | 74 300     | 80 600     | 77 427      | 870.3
MTCT        | 4570 | 86 900     | 91 400     | 89 292      | 629.2
GA          | 4545 | 88 700     | 92 000     | 90 467      | 482.5
TiMDP (1)   | 4719 | 89 100     | 92 600     | 90 936      | 534.9
TiMDP (2)   | 4561 | 88 600     | 92 600     | 90 923      | 534.6
G-TiMDP (1) | 4603 | 89 100     | 93 100     | 91 201      | 539.4
G-TiMDP (2) | 4548 | 89 100     | 92 800     | 91 236      | 528.3

The simulation results considering the involved times as originally proposed (Table 4.3), that is, using Gaussian representations, are presented in Table 5.2. As the use of the estimated t_queue produced good results and is a rational choice, we used it in both the TiMDP and G-TiMDP models. For comparative purposes, and to show the applicability of pdfs in the TiMDP model, we ran the simulations using the previous TiMDP and G-TiMDP models and using these models with Gaussian pdfs. The overall results are worse than those of the previous simulation round (Table 5.1), which can be explained by the fact that the exact times needed to execute an action are unknown (decisions are taken based on expected values). Therefore, knowledge of the inherent imprecisions of the system by the decision model is of paramount importance, as shown by the better results of TiMDP Gauss when compared to the TiMDP method. However, the results for G-TiMDP Gauss were slightly worse than for G-TiMDP, which can be explained by the estimated t_queue value and, mainly, by the imprecisions added by the single-dependent-agent modeling.

The analyses presented up to this point are valid; however, they do not take into account the number of simulations and the standard deviation of the results, which can strongly affect the quality of the results. To evaluate the simulation results considering the number of simulations, means, and standard deviations, we use Student's t-test (CRAMER, 1999) as a statistical comparator of results from two different groups.

TABLE 5.2 – Monte Carlo simulations of truck dispatching methods using Gaussian representation.

Method        | Sims | Min (tons) | Max (tons) | Mean (tons) | Std Dev (tons)
Greedy        | 2410 | 74 300     | 80 000     | 77 335      | 855.9
MTCT          | 2410 | 87 100     | 91 400     | 89 241      | 654.8
GA            | 2191 | 88 400     | 92 100     | 90 435      | 494.8
TiMDP         | 2410 | 87 900     | 92 400     | 90 836      | 535.0
TiMDP Gauss   | 2409 | 88 800     | 92 400     | 90 888      | 511.9
G-TiMDP       | 2316 | 89 300     | 92 800     | 91 121      | 552.4
G-TiMDP Gauss | 2318 | 92 600     | 92 800     | 91 092      | 537.5

TABLE 5.3 – Comparison of truck dispatching methods using the t-test.

Compared methods         | t value | Confidence level
MTCT - Greedy            | 754.9   | >99.9%
GA - MTCT                | 100.1   | >99.9%
TiMDP(1) - GA            | 44.34   | >99.9%
TiMDP(2) - TiMDP(1)      | -1.17   | between 70 and 80%
G-TiMDP(1) - TiMDP(1)    | 23.8    | >99.9%
G-TiMDP(2) - TiMDP(2)    | 28.1    | >99.9%
G-TiMDP(2) - G-TiMDP(1)  | 3.14    | >99.8%

For the comparisons between the developed dispatching methods, we consider that one method is better than another if the difference is significant at the 0.05 level, that is, if the confidence level is greater than 95%.

Table 5.3 shows the significance comparisons among the methods, considering a standard representation of the involved times (standard deviation always equal to zero). We can observe that only the difference between TiMDP (1) and TiMDP (2) is not significant (confidence level between 70 and 80%); this is precisely the result that was unexpected given the adjustment of the t_queue time parameter. These results confirm the superiority of the G-TiMDP method over all other methods.
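The t values can be recomputed from the summary statistics alone. The sketch below uses the unpooled (Welch) form of the two-sample t statistic; that the thesis used exactly this form is an assumption, although applying it to the Greedy and MTCT rows of Table 5.1 gives a value of about 754.9, in agreement with Table 5.3. The function name `t_statistic` is hypothetical.

```python
import math


def t_statistic(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """Unpooled (Welch) two-sample t statistic for mean_a - mean_b."""
    standard_error = math.sqrt(std_a ** 2 / n_a + std_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Summary statistics from Table 5.1: (mean, standard deviation, simulations).
greedy = (77427, 870.3, 4722)
mtct = (89292, 629.2, 4570)
t_mtct_vs_greedy = t_statistic(*mtct, *greedy)
```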


TABLE 5.4 – Comparison of truck dispatching methods with Gaussian representations using the t-test.

Compared methods             | t value | Confidence level
MTCT - Greedy                | 542.4   | >99.9%
GA - MTCT                    | 70.16   | >99.9%
TiMDP Gauss - GA             | 30.51   | >99.9%
TiMDP Gauss - TiMDP          | 3.45    | >99.9%
G-TiMDP Gauss - TiMDP Gauss  | 13.42   | >99.9%
G-TiMDP Gauss - G-TiMDP      | -1.82   | between 90 and 95%

The comparisons of the proposed truck dispatching methods are shown in Table 5.4 for the system with Gaussian representations of the involved times. The proposed G-TiMDP Gauss method is superior to all other methods except G-TiMDP (with no consideration of Gaussian distributions), to which it is inferior. However, the confidence level of this last comparison is smaller than 95%, so we cannot categorically state that G-TiMDP is better than G-TiMDP with the Gauss model. In fact, in our view G-TiMDP Gauss should be selected for use in truck dispatching environments because of its more precise representation of uncertain times. Certainly, better results can be attained in environments with a complete Gaussian representation of the involved times, as commonly found in real-world applications.
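With thousands of simulations per method, the t distribution is practically indistinguishable from the standard normal, so a confidence level can be approximated from a t value with the error function. The two-sided reading used below is an assumption of this sketch; it places the t values -1.82 and -1.17 in the same bands reported in Tables 5.4 and 5.3. The function name is hypothetical.

```python
import math


def confidence_level(t_value):
    """Two-sided confidence level of a t value (normal approximation)."""
    return math.erf(abs(t_value) / math.sqrt(2.0))


gauss_vs_plain = confidence_level(-1.82)      # G-TiMDP Gauss - G-TiMDP
timdp2_vs_timdp1 = confidence_level(-1.17)    # TiMDP(2) - TiMDP(1)
meets_threshold = gauss_vs_plain > 0.95       # the 95% criterion is not met
```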

6 Final Remarks

We present in this chapter the final conclusions of this thesis, based on all contributions

made and results achieved along the work. We also suggest future work that can be

useful to improve the representation of the real-world truck dispatching problem and the

proposed dispatching methods, leading to consideration of contingencies by the model and

probably to higher tonnage production.

6.1 Conclusions

We presented the development of several truck dispatching methods to optimize the tonnage production in an example stochastic time-window mine. The developed methods were: (1) Greedy heuristic, (2) MTCT heuristic, (3) TiMDP model, (4) GA model, and (5) G-TiMDP model. Methods (1) and (2) are classical in the open-pit mining industry and are classified as 1-truck-to-n-shovels strategies. They suffer from many problems, such as selfish behavior and determinism, and their results were used as baselines for the other developed methods. Method (3) is also classified as a 1-truck-to-n-shovels strategy, whereas methods (4) and (5) are classified as m-trucks-to-n-shovels strategies, whose combinatorial behavior may lead to better results. Our contributions are methods (3), (4), and (5); methods (1) and (2) are classical ones used in truck dispatching for open-pit mining, and were used in the thesis only to ground the problem and to compare results over a simulated example mine environment.

The example mine environment comprised a time-window and uncertain variables, such as path choices by the truck driver and involved times modeled as Gaussian distributions. The time-window was used to indicate a path blockage during a period of the shift, and was assumed to be available information for all methods except (1).

We developed a novel application of TiMDP models to the real-time truck dispatching problem, which is a real-world problem with inherent uncertainties. The TiMDP model was solved by introducing backwards convolution, which is a solution method for discretized states. To mitigate the curse of dimensionality (a result of agent combination and state discretization), we modeled the problem using the introduced single-dependent-agent representation, in which agents are modeled in a concurrent single-agent environment and their action choices depend on the current general state of the environment (which is changed by all agents' actions). In our development, the dependence was modeled based on the size of the queues at the shovels. Hence, the dispatching decisions depended on the characteristics of the truck itself and on the current state of the mine environment.
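The idea of a backward sweep over discretized time, in which each action's duration distribution is convolved with the value of the state reached afterwards, can be illustrated with a toy example. This is not the TiMDP solver of the thesis: it has a single location, rewards earned only when an action finishes within the horizon, and hypothetical names (`backward_values`, `actions`).

```python
def backward_values(horizon, actions):
    """Value V[t] of acting optimally from time step t until the horizon.

    `actions` is a list of (duration_pmf, reward) pairs, where duration_pmf
    maps a duration in time steps (>= 1) to its probability. A reward is
    earned only if the action finishes by the horizon.
    """
    values = [0.0] * (horizon + 1)
    for t in range(horizon - 1, -1, -1):           # sweep backwards in time
        values[t] = max(
            # Convolve the duration pmf with the value at the finishing time.
            sum(p * (reward + values[t + d])
                for d, p in pmf.items() if t + d <= horizon)
            for pmf, reward in actions
        )
    return values


# One action that always takes 3 steps and pays 1: three cycles fit in 10 steps.
single = backward_values(10, [({3: 1.0}, 1.0)])
# Adding a 5-step action that pays 2 makes two long cycles the better plan.
mixed = backward_values(10, [({3: 1.0}, 1.0), ({5: 1.0}, 2.0)])
```

The sweep touches every time step once, which is why the cost of such methods grows with the discretization resolution.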

Since all previously developed methods belonged to the 1-truck-to-n-shovels strategy class (selfish behavior), we introduced GA truck dispatching, which used the MTCT heuristic as fitness function. This m-trucks-to-n-shovels strategy considers the following trucks in the dispatching decision; however, the inherent uncertainties of the environment are not considered, which leads to worse results than the TiMDP model. Finally, we developed our main contribution, a novel hybrid method called G-TiMDP, which is essentially the GA model using the results of the TiMDP model as fitness function. This approach adapted the TiMDP model to an m-trucks-to-n-shovels strategy.
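The structure shared by the GA and G-TiMDP methods, a genetic search over joint truck-to-shovel assignments with an interchangeable fitness function, can be sketched generically. The operators below (tournament selection, one-point crossover, per-gene mutation, elitism), all parameter values, and `toy_fitness` are assumptions made for illustration; they are not the operators or fitness functions of the thesis.

```python
import random


def genetic_dispatch(n_trucks, n_shovels, fitness, generations=50,
                     population_size=20, mutation_rate=0.1, seed=0):
    """Search for an assignment (one shovel index per truck) with high fitness."""
    rng = random.Random(seed)
    population = [[rng.randrange(n_shovels) for _ in range(n_trucks)]
                  for _ in range(population_size)]
    history = []                                   # best fitness per generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        next_population = [population[0][:]]       # elitism: keep the best
        while len(next_population) < population_size:
            # Tournament selection of two parents.
            mother = max(rng.sample(population, 2), key=fitness)
            father = max(rng.sample(population, 2), key=fitness)
            cut = rng.randrange(1, n_trucks) if n_trucks > 1 else 0
            child = mother[:cut] + father[cut:]    # one-point crossover
            child = [rng.randrange(n_shovels) if rng.random() < mutation_rate
                     else gene for gene in child]  # per-gene mutation
            next_population.append(child)
        population = next_population
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(assignment, load_rates=(3.0, 1.0, 4.0)):
    """Hypothetical fitness: penalize the longest expected wait at any shovel."""
    waits = [assignment.count(s) / rate for s, rate in enumerate(load_rates)]
    return -max(waits)


best, history = genetic_dispatch(n_trucks=15, n_shovels=3, fitness=toy_fitness)
```

Swapping `toy_fitness` for a cycle-time heuristic or for values taken from a decision model changes the method without touching the search loop, which is the sense in which the two m-trucks-to-n-shovels methods differ only in their fitness function.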

All presented methods proved to be good choices for the considered real-time problem, given that dispatching decisions must be made quickly and the methods returned their decisions in less than one minute.

Monte Carlo simulations for the example mine were performed for all methods using the SimEvents™ environment. The results were compared using Student's t-test, in which the G-TiMDP model was ranked as the best one.

The presented methods can also be used for other mine configurations, simply by

adjusting the models to the new conditions. Certainly, considering a tonnage production

goal, G-TiMDP will be, for any mine configuration, the best method among the presented

ones.

6.2 Future work

We address some future work in order to improve the methods and to deal with

common contingencies present in mine environments.

Factored TiMDP representation

MDPs suffer from the curse of dimensionality, in which state space explosion can lead to extremely time-consuming solutions. TiMDPs are more affected because of their discretized time representation, which is in effect a segmentation of time into states (the number of states grows as the discretization resolution increases). Factored MDPs (BOUTILIER; DEARDEN; GOLDSZMIDT, 2000) deal very well with large state spaces by representing states in a Dynamic Bayesian Network. We propose a factored TiMDP, which can be a good approach for time-dependent stochastic decision problems with large state spaces. With it, our TiMDP model for truck dispatching could be represented as a multi-agent problem (as an m-trucks-to-n-shovels strategy), with a likely improvement of results.

Time-dependent Reinforcement Learning

Reinforcement Learning (RL) (SUTTON; BARTO, 1998) is a method for learning in uncertain environments that can be represented according to an MDP formalism. In RL, the agent learns characteristics of the environment from its actions, which can return positive or negative reinforcements (rewards or punishments in the MDP jargon). We propose the study of Time-Dependent Reinforcement Learning (TiRL), in which the reinforcements are also related to the current time and to action durations. The associated theory may be applied to all time-dependent problems that can already be represented by TiMDPs. Therefore, by following an RL representation, TiRL can be based on TiMDP theory. In our truck dispatching problem, TiRL could be applied to adjust all involved times (such as t_queue), leading to better results along the shifts.

TiMDP sensitivity analysis

Real-world problems are subject to unforeseen changes during the decision period. In our problem, paths can be blocked and shovels may break down or become unavailable during the shift. To account for such issues, the TiMDP can be remodeled and its off-line and on-line phases executed again. However, all this rework might take a long time, which is unacceptable for real-time dispatching problems. Another solution could be a policy selection that considers some states to be unavailable. For example, suppose that in a given state an agent can select among three different actions and the policy indicates the action that leads to an unavailable state; in this case, the agent must select the second action in the policy list. In some cases, depending on the weight of the unavailable state in the model, the action selected by the agent can be the best one; however, the best action could instead be the third one, which changes the quality of the final result. Therefore, we propose a sensitivity analysis of the TiMDP, from which we could know in advance the maximum error in solution quality introduced by the modification made to the original TiMDP model.
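The fallback selection described above, taking the next action in the policy list when the preferred one leads to an unavailable state, can be sketched as follows. The action and state names are hypothetical, and the sketch deliberately ignores the question raised in the text of whether the fallback is still the best action.

```python
def select_available_action(ranked_actions, next_state, unavailable_states):
    """First action, in policy order, whose resulting state is still available.

    Returns None when every listed action leads to an unavailable state.
    """
    for action in ranked_actions:
        if next_state[action] not in unavailable_states:
            return action
    return None


# Hypothetical policy list for one state, best action first.
ranked = ["to_shovel_1", "to_shovel_3", "to_shovel_2"]
leads_to = {"to_shovel_1": "shovel_1", "to_shovel_3": "shovel_3",
            "to_shovel_2": "shovel_2"}
fallback = select_available_action(ranked, leads_to, {"shovel_1"})
```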

GA method improvement

The introduced GA method used to solve the truck dispatching problem can be improved in order to provide better solutions in a shorter time. Revisions of the problem representation and of the reproduction phase will certainly improve the final results. Based on this improvement and on the previously cited future work, a revised version of G-TiMDP will probably provide better results than those presented in this thesis.

Consideration of production and blending goals

Applying a TiMDP model to a real-time dispatching problem allowed us to solve the truck dispatching problem considering the simplified tonnage production goal. Generally, in real-world mines, the goals are based on plans, which consider daily production and blending requirements. Hence, we propose the introduction of these goals in future models using all the developed stochastic methods for truck dispatching, in order to have a better representation of the problem, which might improve the quality of the results.

Truck dispatching based on time-dependent utilities

The truck dispatching problem involves many other parameters that were not considered in our models and that can also be considered time-dependent. Parameters like fuel consumption and tire wear depend on truck displacement; however, they can be reasonably approximated as dependent on time. This way, we can use the introduced time-dependent utilities, applied as rewards in TiMDP models, to consider other important parameters in the truck dispatching problem. For example, when a truck asks for dispatch, the system must decide whether it is better to send it to a shovel or to the fuel station. When these decisions are taken incorrectly, they may send a truck to premature refueling, thus reducing the total tonnage production, or, in the worst case, cause a truck to halt in the mine environment because of an empty fuel tank.

Bibliography

ALARIE, S.; GAMACHE, M. Overview of solution strategies used in truck dispatching systems for open pit mines. International Journal of Surface Mining, Reclamation and Environment, Taylor and Francis Ltd, v. 16, n. 1, p. 59–76, 2002.

BASTOS, G. S.; RIBEIRO, C. H. C.; SOUZA, L. E. de. Variable utility in multi-robot task allocation systems. Robotic Symposium, IEEE Latin American, IEEE Computer Society, p. 179–183, 2008.

BELLMAN, R. Dynamic programming. Science, American Association for the Advancement of Science, v. 153, n. 3731, p. 34–37, 1966.

BERTSEKAS, D. Dynamic programming: deterministic and stochastic models. [S.l.]: Prentice-Hall, Inc. Upper Saddle River, NJ, USA, 1987. ISBN 0132215810.

BOUTILIER, C.; DEAN, T.; HANKS, S. Decision-theoretic planning: Structural assumptions and computational leverage. Journal of Artificial Intelligence Research, Citeseer, v. 11, n. 1, p. 94, 1999.

BOUTILIER, C.; DEARDEN, R.; GOLDSZMIDT, M. Stochastic dynamic programming with factored representations. Artificial Intelligence, Elsevier, v. 121, n. 1-2, p. 49–107, 2000.

BOYAN, J.; LITTMAN, M. Exact solutions to time-dependent MDPs. Advances in Neural Information Processing Systems, v. 13, p. 1–7, 2000.

BRAHMA, K. C. A Study on Application of Strategic Planning And Operations Research Techniques in Open Cast Mining. 2007. Thesis (Doctorate) – Department of Mining Engineering, National Institute of Technology, 2007.

BRESINA, J.; DEARDEN, R.; MEULEAU, N.; RAMAKRISHNAN, S.; SMITH, D.; WASHINGTON, R. Planning under continuous time and resource uncertainty: A challenge for AI. In: CITESEER. AIPS Workshop on Planning for Temporal Domains. [S.l.], 2002. p. 91–97.

CETIN, N. Open-pit truck/shovel haulage system simulation. Thesis (Doctorate) – The Graduate School of Natural and Applied Sciences, Middle East Technical University, 2004.

CO, C.; TANCHOCO, J. A Review of Research and AGVS Vehicle Management. [S.l.]: School of Industrial Engineering, Purdue University, 1990.

CRAMER, H. Mathematical methods of statistics. [S.l.]: Princeton Univ Pr, 1999.


ELBROND, J.; SOUMIS, F. Towards integrated production planning and truck dispatching in open pit mines. International Journal of Mining, Reclamation and Environment, Taylor & Francis, v. 1, n. 1, p. 1–6, 1987.

GENDREAU, M.; POTVIN, J. Dynamic Vehicle Routing and Dispatching. Fleet management and logistics, Kluwer Academic Publishers, p. 115–126, 1998.

GIBSON, M.; BRUCK, J. Efficient exact stochastic simulation of chemical systems with many species and many channels. J. Phys. Chem. A, ACS Publications, v. 104, n. 9, p. 1876–1889, 2000.

GROSS, D. Fundamentals of queueing theory. [S.l.]: Wiley-India, 2008. ISBN 8126517778.

HOEY, J.; ST-AUBIN, R.; HU, A.; BOUTILIER, C. SPUDD: Stochastic planning using decision diagrams. In: CITESEER. Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence. [S.l.], 1999. p. 279–288.

HORVITZ, E.; RUTLEDGE, G. Time-dependent utility and action under uncertainty. In: CITESEER. Proceedings of Seventh Conference on Uncertainty in Artificial Intelligence, Los Angeles, CA. [S.l.], 1991. p. 151–158.

HOWARD, R. Dynamic programming and Markov process. [S.l.]: MIT press, 1960.

HUANG, B.; WEI, J.; HE, M.; LU, X. The Genetic Algorithm for Truck Dispatching Problems in Surface Mine. Information Technology Journal, v. 9, n. 4, p. 710–714, 2010.

ICHOUA, S.; GENDREAU, M.; POTVIN, J. Vehicle dispatching with time-dependent travel times. European journal of operational research, Elsevier, v. 144, n. 2, p. 379–396, 2003.

JAOUA, A.; GAMACHE, M.; RIOPEL, D. Specification of an Intelligent Simulation–Based Real Time Control Architecture: Application to Truck Control System. Simulation a Evenements Discrets pour la Commande Temps Reel de Systemes Dynamiques Complexes, p. 24, 2009.

JI, X. Models and algorithm for stochastic shortest path problem. Applied Mathematics and Computation, Elsevier, v. 170, n. 1, p. 503–514, 2005.

KIRKPATRICK, S. Optimization by simulated annealing: Quantitative studies. Journal of Statistical Physics, Springer, v. 34, n. 5, p. 975–986, 1984. ISSN 0022-4715.

KOLONJA, B.; KALASKY, D.; MUTMANSKY, J. Optimization of dispatching criteria for open-pit truck haulage system design using multiple comparisons with the best and common random numbers. In: ACM NEW YORK, NY, USA. Proceedings of the 25th conference on Winter simulation. [S.l.], 1993. p. 393–401.

KRAUSE, A.; MUSINGWINI, C. Modelling open pit shovel-truck systems using the Machine Repair Model. Journal of the South African Institute of Mining and Metallurgy, Marshalltown, South Africa., v. 107, n. 8, p. 469–476, 2007.

LENGYEL, M.; DAYAN, P. Hippocampal contributions to control: The third way. Adv. Neural Inf. Process. Syst, Citeseer, v. 20, p. 889–896, 2007.

LI, L.; LITTMAN, M. Lazy approximation for solving continuous finite-horizon MDPs. In: MENLO PARK, CA; CAMBRIDGE, MA; LONDON; AAAI PRESS; MIT PRESS; 1999. Proceedings of the National Conference on Artificial Intelligence. [S.l.], 2005. v. 20, n. 3, p. 1175.

LI, X.; SOH, L. Applications of Decision and Utility Theory in Multi-Agent Systems. CSE Technical reports, p. 56, 2004.

LITTMAN, M.; DEAN, T.; KAELBLING, L. On the complexity of solving Markov decision problems. In: CITESEER. Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence. [S.l.], 1995. p. 394–402.

LIZOTTE, Y.; BONATES, E. Truck and shovel dispatching rules assessment using simulation. Mining Science and Technology, v. 5, p. 45–58, 1987.

LUDWIG, D. The distribution of population survival times. American Naturalist, JSTOR, v. 147, n. 4, p. 506–526, 1996.

MARECKI, J.; TOPOL, Z.; TAMBE, M. A fast analytical algorithm for MDPs with continuous state spaces. In: AAMAS-06 Proceedings of 8th Workshop on Game Theoretic and Decision Theoretic Agents. [S.l.: s.n.], 2006.

MAUSAM, M.; WELD, D. Solving concurrent Markov decision processes. In: AAAI PRESS. Proceedings of the 19th national conference on Artifical intelligence. [S.l.], 2004. p. 716–722.

MITCHELL, M. An introduction to genetic algorithms. [S.l.]: The MIT press, 1998.

MOONEY, C. Monte Carlo Simulation. [S.l.]: Sage Publications, Inc, 1997.

MURATA, T. Petri nets: Properties, analysis and applications. Proceedings of the IEEE, IEEE, v. 77, n. 4, p. 541–580, 2002. ISSN 0018-9219.

PAPADIMITRIOU, C.; STEIGLITZ, K. Combinatorial optimization: algorithms and complexity. [S.l.]: Dover Publications, 1998.

PARSONS, S.; WOOLDRIDGE, M. An introduction to game theory and decision theory. Game theory and decision theory in agent-based systems, Kluwer Academic Publishers, p. 1–28, 2002.

PELLEGRINI, J.; WAINER, J. Processos de Decisao de Markov: um tutorial. Revista de Informatica Teorica e Aplicada, v. 14, n. 2, p. 133–179, 2008.

PINTO, E. B. Despacho de caminhoes em mineracao usando logica nebulosa, visando ao atendimento simultaneo de politicas excludentes. 2007. 120 p. Dissertation (Masters in Production Engineering), Engineering School, Federal University of Minas Gerais, 2007.

POWELL, W. A comparative review of alternative algorithms for the dynamic vehicle allocation problem. Vehicle Routing: Methods and Studies, p. 249–291, 1988.

PUTERMAN, M. Markov decision processes: discrete stochastic dynamic programming. [S.l.]: John Wiley & Sons, Inc. New York, NY, USA, 1994.

RACHELSON, E.; FABIANI, P.; GARCIA, F. TiMDPpoly: An improved method for solving time-dependent MDPs. In: Proceedings of the 21st IEEE International Conference on Tools with Artificial Intelligence (ICTAI). [S.l.: s.n.], 2009. p. 796–799.

RACHELSON, E.; FABIANI, P.; GARCIA, F. TiMDPpoly: An Improved Method for Solving Time-Dependent MDPs. In: IEEE. 2009 21st IEEE International Conference on Tools with Artificial Intelligence. [S.l.], 2009. p. 796–799.

RUSSELL, S.; NORVIG, P. Artificial intelligence: a modern approach. [S.l.]: Prentice hall, 2009.

SHI, Y.; EBERHART, R. Empirical study of particle swarm optimization. In: IEEE. Evolutionary Computation, 1999. CEC 99. Proceedings of the 1999 Congress on. [S.l.], 2002. v. 3. ISBN 0780355369.

SOLOMON, M. Algorithms for the vehicle routing and scheduling problems with time window constraints. Operations research, JSTOR, p. 254–265, 1987.

SOUZA, M.; COELHO, I.; RIBAS, S.; SANTOS, H.; MERSCHMANN, L. A hybrid heuristic algorithm for the open-pit-mining operational planning problem. European Journal of Operational Research, Elsevier, v. 207, p. 1041–1051, 2010.

ST-AUBIN, R.; HOEY, J.; BOUTILIER, C. APRICODD: Approximate policy construction using decision diagrams. Advances in Neural Information Processing Systems, Citeseer, p. 1089–1096, 2001.

SUTTON, R.; BARTO, A. Reinforcement learning: An introduction. [S.l.]: The MIT press, 1998.

SUTTON, R.; PRECUP, D.; SINGH, S. Between MDPs and Semi-MDPs: Learning, planning, and representing knowledge at multiple temporal scales. Artificial Intelligence, Citeseer, v. 112, p. 181–211, 2000.

TA, C.; KRESTA, J.; FORBES, J.; MARQUEZ, H. A stochastic optimization approach to mine truck allocation. International Journal of Mining, Reclamation and Environment, Taylor & Francis, v. 19, n. 3, p. 162–175, 2005.

TEMENG, V.; OTUONYE, F.; FRENDEWEY, J. Real-time truck dispatching using a transportation algorithm. International Journal of Mining, Reclamation and Environment, Taylor & Francis, v. 11, n. 4, p. 203–207, 1997.

THISTED, R. Elements of statistical computing: numerical computation. [S.l.]: Chapman & Hall/CRC, 1988.

TU, J.; HUCKA, V. Analysis of open-pit truck haulage system by use of a computer model. CIM Bulletin, v. 78, n. 879, p. 53–59, 1985.

Appendix A - Genetic Algorithm

A Genetic Algorithm (GA) (MITCHELL, 1998) is a search procedure (or heuristic) based on the process of natural evolution. GAs originated in 1975 from studies of cellular automata conducted by John Holland and his students at the University of Michigan. Their applications span different areas, such as scheduling and dispatching problems, neural network training, image feature extraction and recognition, and other optimization and search problems.

GA is part of a larger class, called evolutionary algorithms (EA), which also includes the following algorithms: Ant Colony Optimization (ACO), Cultural Algorithm (CA), Particle Swarm Optimization (PSO), Memetic Algorithm (MA), Simulated Annealing (SA), and Tabu Search (TS). These algorithms generate solutions to optimization and search problems using techniques inspired by natural evolution, such as selection, crossover, and mutation.

A.1 Methodology

To find the solution of a search or optimization problem, GA simulates the process of natural evolution (Alg. 4). First, the algorithm randomly generates the initial population, of total size size_pop, composed of candidate solutions (individuals). Each individual is encoded as an array (chromosome) in which each element (gene) can be represented by a binary value. After generation, the selection phase reduces the population to its best individuals, as evaluated by the fitness function (fitness_func). Subsequent generations are obtained by repeating the reproduction and selection phases until the end condition (end_condition) is met.

Algorithm 4: Genetic Algorithm

Input: GA(size_pop, fitness_func, end_condition)
Output: solution
t ← 0;
Generate initial population, G(0), based on size_pop;
Select G(0) using fitness_func;
repeat
    t ← t + 1;
    Generate G(t) by reproduction using G(t − 1);
    Select G(t) using fitness_func;
until end_condition;
return best_individual(G(t))
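The loop of Alg. 4 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation used in this thesis: the name `genetic_algorithm`, the bit-string encoding, the truncation selection that keeps the better half, the one-point crossover, the 1% gene-flip probability, and the toy fitness `one_max` are all hypothetical choices made for brevity. The end condition is assumed to be a predicate over the generation counter and the best fitness found so far.

```python
import random

def genetic_algorithm(size_pop, fitness_func, end_condition, n_genes=20, seed=0):
    """Sketch of Alg. 4 for bit-string chromosomes (illustrative assumptions only)."""
    rng = random.Random(seed)

    def select(population):
        # Selection: keep the better half, sorted from best to worst.
        ranked = sorted(population, key=fitness_func, reverse=True)
        return ranked[: max(2, size_pop // 2)]

    def reproduce(parents):
        # Reproduction: refill the population with one-point-crossover children,
        # flipping each gene with a small mutation probability.
        population = list(parents)
        while len(population) < size_pop:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < 0.01 else g for g in child]
            population.append(child)
        return population

    t = 0
    # G(0): random initial population, followed by selection.
    generation = select([[rng.randint(0, 1) for _ in range(n_genes)]
                         for _ in range(size_pop)])
    while True:  # repeat ... until end_condition
        t += 1
        generation = select(reproduce(generation))  # G(t) from G(t - 1)
        best = generation[0]  # select() returns the best individual first
        if end_condition(t, fitness_func(best)):
            return best


def one_max(chromosome):
    # Hypothetical maximization fitness: number of genes set to 1.
    return sum(chromosome)

best = genetic_algorithm(size_pop=40, fitness_func=one_max,
                         end_condition=lambda t, best_fitness: t >= 50)
print(one_max(best), best)
```

Because the selected parents are carried over into the refilled population, the best fitness in this sketch never decreases from one generation to the next.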

A.1.1 Population generation

The population is generated randomly in order to cover the entire range of solutions (search space). Its size depends on the nature of the problem and corresponds to a percentage of the total number of possible solutions; it typically contains hundreds or thousands of individuals. The population size is directly related to the quality of the solution: small populations can lead to locally optimal solutions, whereas very large populations slow down convergence.
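To make the relation between population size and search space concrete, the following sketch generates a random population of binary chromosomes and reports which fraction of the search space it can cover at most. The helper `generate_population` and the values of `size_pop` and `n_genes` are hypothetical and chosen only for illustration.

```python
import random

def generate_population(size_pop, n_genes, rng):
    """Return size_pop random binary chromosomes of length n_genes."""
    return [tuple(rng.randint(0, 1) for _ in range(n_genes))
            for _ in range(size_pop)]

rng = random.Random(1)
n_genes = 16
population = generate_population(size_pop=500, n_genes=n_genes, rng=rng)

search_space = 2 ** n_genes                       # number of possible chromosomes
coverage = len(set(population)) / search_space    # distinct individuals / search space
print(f"{len(population)} individuals cover {coverage:.2%} of {search_space} solutions")
```

Even for this small chromosome, a population of hundreds of individuals samples under one percent of the search space, which is why the later phases matter more than the initial draw.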

A.1.2 Selection

The selection phase chooses the best individuals of a generation according to a fitness function, that is, the individuals whose solutions score best under it. Popular selection methods include roulette wheel selection and tournament selection. In roulette wheel selection, the selection probability of each individual is based on its fitness value; that is, individuals with larger fitness values (for a maximization fitness function) have a higher selection probability. In tournament selection, individuals are compared in pairs and the fitter individual of each pair is selected.
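Both selection methods can be sketched in a few lines of Python. The function names, the bit-string individuals, and the use of `sum` as a maximization fitness are assumptions made for illustration; the roulette wheel variant additionally assumes non-negative fitness values with a positive total.

```python
import random

def roulette_wheel_select(population, fitness_func, n_selected, rng):
    """Draw individuals with probability proportional to their fitness.
    Assumes non-negative fitness values that do not all equal zero."""
    weights = [fitness_func(ind) for ind in population]
    return rng.choices(population, weights=weights, k=n_selected)

def tournament_select(population, fitness_func, n_selected, rng):
    """Repeatedly draw a random pair and keep the fitter individual of the pair."""
    winners = []
    for _ in range(n_selected):
        a, b = rng.sample(population, 2)
        winners.append(a if fitness_func(a) >= fitness_func(b) else b)
    return winners

rng = random.Random(42)
population = [[rng.randint(0, 1) for _ in range(8)] for _ in range(10)]
fitness = sum  # hypothetical maximization fitness: number of ones
print(roulette_wheel_select(population, fitness, 5, rng))
print(tournament_select(population, fitness, 5, rng))
```

Roulette wheel selection can still pick weak individuals occasionally, which preserves diversity, whereas a pairwise tournament only ever lets the weaker individual of a pair through on a tie.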

A.1.3 Reproduction

After the best individuals are selected, the population is reduced to a percentage of its original size. The reproduction phase is then executed in order to restore the original size and, mainly, to improve the quality of the individuals. This phase is divided into two steps: crossover and mutation.

A.1.3.1 Crossover

The crossover (or recombination) phase generates new individuals (children) from a random combination of the genes of individuals of the previous generation (parents). This phase is repeated until a population of the appropriate size has been generated.

In order to maintain the best individuals of the previous generation (that is, the best solutions), the crossover phase can be elitist. In this case, the parents proceed to the next generation if their fitness is better than that of their children. Therefore, newly generated individuals that are worse than their parents are not considered in the current generation.
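One possible reading of elitist crossover is sketched below. It assumes uniform crossover (each gene taken at random from one of the two parents) and the rule that a child replaces its parents only if it is at least as fit as the better of them; both the rule and the function names are illustrative assumptions rather than the exact procedure used in this thesis.

```python
import random

def uniform_crossover(parent_a, parent_b, rng):
    """Build a child by taking each gene at random from one of the two parents."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]

def elitist_crossover(parents, fitness_func, size_pop, rng):
    """Build a population of size_pop individuals. A child enters only if it is
    at least as fit as its better parent; otherwise that parent is kept."""
    population = []
    while len(population) < size_pop:
        a, b = rng.sample(parents, 2)
        child = uniform_crossover(a, b, rng)
        best_parent = max(a, b, key=fitness_func)
        if fitness_func(child) >= fitness_func(best_parent):
            population.append(child)
        else:
            population.append(best_parent)
    return population

rng = random.Random(7)
parents = [[rng.randint(0, 1) for _ in range(10)] for _ in range(6)]
offspring = elitist_crossover(parents, sum, size_pop=12, rng=rng)
print(len(offspring), min(map(sum, offspring)), min(map(sum, parents)))
```

With this rule the worst fitness in the new population can never fall below the worst fitness among the parents.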


A.1.3.2 Mutation

After the crossover phase, some genes of the generated individuals can be randomly swapped or have their values changed, according to a mutation ratio (generally less than 1%). The mutation phase is used to avoid convergence to locally optimal solutions: a mutated individual can lead the next generations to better regions of the solution space, increasing the chance of finding the optimal solution.
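The two mutation variants mentioned above, changing a gene value and swapping genes, can be sketched as follows. The function names and the example chromosomes are hypothetical; the bit-flip variant assumes binary genes, and the swap variant is the one usually preferred when the chromosome encodes a permutation.

```python
import random

def mutate(chromosome, mutation_rate, rng):
    """Flip each binary gene independently with probability mutation_rate."""
    return [1 - gene if rng.random() < mutation_rate else gene
            for gene in chromosome]

def swap_mutate(chromosome, rng):
    """Swap the genes at two distinct, randomly chosen positions."""
    i, j = rng.sample(range(len(chromosome)), 2)
    mutated = list(chromosome)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

rng = random.Random(0)
print(mutate([0, 1, 0, 1, 1, 0, 0, 1], mutation_rate=0.01, rng=rng))
print(swap_mutate([1, 2, 3, 4, 5], rng))
```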

A.1.4 Termination

The GA ends when a termination condition is reached. Common termination conditions are:

• A fixed number of generations is reached;

• A maximum time of computation is reached;

• Solution convergence given an allowed maximum error;

• A solution is found that satisfies minimum criteria; or

• Combination of the above.
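The criteria listed above can be combined into a single predicate, as in the following sketch. The factory `make_end_condition` and its parameter names are hypothetical, and convergence is interpreted here, as an assumption, as a change in best fitness between two consecutive generations that does not exceed the allowed error.

```python
import time

def make_end_condition(max_generations=None, max_seconds=None,
                       tolerance=None, target_fitness=None):
    """Return a predicate end_condition(generation, best_fitness) that combines
    the termination criteria; criteria left as None are ignored."""
    start = time.monotonic()
    history = []  # best fitness reported at each call

    def end_condition(generation, best_fitness):
        history.append(best_fitness)
        if max_generations is not None and generation >= max_generations:
            return True   # fixed number of generations reached
        if max_seconds is not None and time.monotonic() - start >= max_seconds:
            return True   # maximum computation time reached
        if (tolerance is not None and len(history) >= 2
                and abs(history[-1] - history[-2]) <= tolerance):
            return True   # convergence within the allowed error
        if target_fitness is not None and best_fitness >= target_fitness:
            return True   # a solution satisfying the minimum criteria was found
        return False

    return end_condition

stop = make_end_condition(max_generations=100, target_fitness=20)
print(stop(1, 12), stop(2, 20))
```

In practice a convergence test is often applied over a window of several generations rather than two, since a GA can stall briefly before improving again.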

Appendix B - Statistical Distributions

B.1 Gamma Distribution

Parameters: Shape parameter (α) and scale parameter (β) specified as positive real

values.

Range: [0,+∞)

Mean: αβ

Variance: αβ²

Applications: The gamma distribution is often used to represent the time required

to complete some task (e.g., a machining time or machine repair time).

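The gamma pdf is

$$f(x) = \begin{cases} \dfrac{\beta^{-\alpha} x^{\alpha-1} e^{-x/\beta}}{\Gamma(\alpha)} & \text{for } x > 0 \\ 0 & \text{otherwise} \end{cases}, \tag{B.1}$$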

where Γ is the complete gamma function, given by:

$$\Gamma(\alpha) = \int_0^{\infty} t^{\alpha-1} e^{-t}\,dt . \tag{B.2}$$

The gamma distribution is illustrated in the following figure.

FIGURE B.1 – Gamma distribution.
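As a quick numerical check of the mean and variance stated above, the following sketch evaluates the gamma pdf and compares sample moments against αβ and αβ². The parameter values are hypothetical; sampling relies on `random.gammavariate`, whose two arguments are the shape and the scale, matching the parameterization used here.

```python
import math
import random

def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and scale beta; zero for x <= 0."""
    if x <= 0:
        return 0.0
    return (beta ** (-alpha) * x ** (alpha - 1) * math.exp(-x / beta)
            / math.gamma(alpha))

alpha, beta = 2.0, 3.0  # hypothetical shape and scale
rng = random.Random(0)
samples = [rng.gammavariate(alpha, beta) for _ in range(100_000)]
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)

print(mean, alpha * beta)            # sample mean vs. alpha * beta
print(variance, alpha * beta ** 2)   # sample variance vs. alpha * beta^2
```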

B.2 Gaussian Distribution

Parameters: The mean (µ) is specified as a real number and standard deviation (σ)

is specified as a positive real number.

Range: (−∞,+∞)

Mean: µ

Variance: σ²

Applications: The Gaussian (or Normal) distribution is used empirically for many processes that appear to have a symmetric distribution. Because the theoretical range is from −∞ to +∞, the distribution should be used with care for strictly positive quantities such as processing times, since it assigns a nonzero probability to negative values.

The normal pdf is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)} , \tag{B.3}$$

for all real x.

The normal distribution is illustrated in the following figure.

FIGURE B.2 – Gaussian distribution.
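Since the range of the Gaussian includes negative values, it can be useful to check how much probability mass falls below zero before using it for a time-like quantity. The sketch below evaluates the normal pdf and the corresponding cumulative distribution through the error function; the mean and standard deviation are hypothetical values for a travel time in minutes.

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    return (math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
            / (sigma * math.sqrt(2 * math.pi)))

def normal_cdf(x, mu, sigma):
    """P(X <= x), written in terms of the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

mu, sigma = 10.0, 2.0  # hypothetical travel time: mean and standard deviation
print(normal_pdf(mu, mu, sigma))    # peak density, 1 / (sigma * sqrt(2 * pi))
print(normal_cdf(0.0, mu, sigma))   # probability mass on negative values
```

When the mean lies several standard deviations above zero, as in this example, the negative tail is negligible; when it does not, a truncated Gaussian or a distribution with positive support such as the gamma may be a safer choice.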

DOCUMENT REGISTRATION SHEET

1. CLASSIFICATION/TYPE: TD
2. DATE: 20 December 2010
3. DOCUMENT No.: DCTA/ITA/TD - 018/2010
4. No. OF PAGES: 140

5. TITLE AND SUBTITLE:

Methods for Truck Dispatching in Open-Pit Mining

6. AUTHOR(S):


7. INSTITUTION(S)/INTERNAL BODY(IES)/DIVISION(S):

Instituto Tecnologico de Aeronautica - ITA

8. KEYWORDS SUGGESTED BY THE AUTHOR:

Truck Dispatching; Open-pit Mining; TiMDP; Genetic Algorithm

9. KEYWORDS RESULTING FROM INDEXING:

Mathematical programming; Distribution of goods; Genetic algorithms; Applied mathematics; Routes; Trucks; Mining; Mathematics

10. PRESENTATION: (X) National ( ) International

ITA, Sao Jose dos Campos. Doctoral course. Graduate Program in Electronic Engineering and Computing. Field of Informatics. Advisor: Carlos Henrique Costa Ribeiro; co-advisor: Luiz Edival de Souza. Defended on 9 December 2010. Published in 2010.

11. ABSTRACT:

Material transportation is one of the most important aspects of open-pit mine operations. The problem usually involves a truck dispatching system in which decisions on truck assignments and destinations are taken in real time. Due to its significance, several decision systems for this problem have been developed in the last few years, improving productivity and reducing operating costs. As in many other real-world applications, the assessment and correct modeling of uncertainty is a crucial requirement, as the unpredictability originating from equipment faults, weather conditions, and human mistakes can often result in truck queues or idle shovels. However, uncertainty is not considered in most commercial dispatching systems. In this thesis, we introduce novel truck dispatching systems as a starting point to modify the current practices with a statistically principled decision-making methodology. First, we present a stochastic method using a Time-Dependent Markov Decision Process (TiMDP) applied to the truck dispatching problem. In the TiMDP model, travel times are represented as probability density functions (pdfs), time windows can be inserted for path availability, and time-dependent utility can be used as a priority parameter. In order to mitigate the well-known curse of dimensionality, to which multi-agent problems are subject when considering discrete state models, the system is modeled based on the introduced single-dependent-agents. Based also on the single-dependent-agents concept, we introduce the Genetic TiMDP (G-TiMDP) method applied to the truck dispatching problem. This method is a hybridization of the TiMDP model and a Genetic Algorithm (GA), which is also used to solve the truck dispatching problem. Finally, in order to evaluate and compare the results of the introduced methods, we execute Monte Carlo simulations in an example heterogeneous mine composed of 15 trucks, 3 shovels, and 1 crusher. The uncertain aspect of the problem is represented by the path selection through crusher and shovels, which is executed by the truck driver, being independent of the dispatching system. The results are compared to classical dispatching approaches (Greedy Heuristic and Minimization of Truck Cycle Times – MTCT) using Student's t-test, demonstrating the efficiency of the introduced truck dispatching methods.

12. SECURITY CLASSIFICATION:

(X) UNCLASSIFIED ( ) RESTRICTED ( ) CONFIDENTIAL ( ) SECRET
