
PREDICTING FRAUD

IN MOBILE MONEY TRANSFER

ADEYINKA ADEDOYIN

A thesis submitted in partial fulfilment of the

requirements of the University of Brighton

for the degree of Doctor of Philosophy

June 2018


Supervisory team:

Dr. Stelios Kapatenakis, School of Computing, Engineering and Mathematics, University of Brighton

Prof. Miltos Petridis, Department of Computer Science, Middlesex University, London, UK

Dr. Emmanouil Panaousis, School of Computing, Engineering and Mathematics, University of Brighton

Dr. Georgios Samakovitis, Department of Computing and Information Systems, University of Greenwich

Declaration

I declare that the research contained in this thesis, unless otherwise formally

indicated within the text, is the original work of the author. The thesis has

not been previously submitted to this or any other university for a degree, and

does not incorporate any material already submitted for a degree.

Adeyinka Adedoyin


Abstract

Mobile Money Transfer (MMT) is a fast-growing medium for making financial transactions via a mobile device. It is increasingly being adopted in growing markets, especially in developing countries. The ability of MMT services to handle large numbers of small-value payments and worldwide funds exchange in digital currencies, together with a lack of oversight, makes them an attractive target for attackers and fraudsters. Although the risks inherent in all payment channels also exist in the mobile money payment environment, the use of mobile money transfer technologies introduces additional risks caused by the large number of non-bank participants, the higher speed of transactions and level of anonymity compared to mobile banking and mobile commerce systems. This provides the motivation for detecting and preventing fraudulent mobile money transactions in mobile payment systems.

The main objective of this thesis is to investigate and propose a pattern recognition model to predict fraud in mobile money transfer transactions. To this end, a novel pattern recognition model is proposed based on the findings of this thesis. In addition, a synthetic mobile money transfer transaction dataset was simulated, covering a range of possible fraud scenarios. The applicability of the proposed pattern recognition model was evaluated using the simulated dataset, and the experiments achieved promising recognition performance. The results also provide a ranking of clusters of transaction neighbours for new cases, which may serve as an effective tool for experts to develop preliminary insight into suspicious transactions that can then be investigated in more detail.

Acknowledgements

I owe my sincere and endless appreciation to my supervisors, Doctor Stelios Kapatenakis, Professor Miltos Petridis, Doctor Georgios Samakovitis and Doctor Emmanouil Panaousis, for their impeccable supervision. Their continuous encouragement, guidance and immense support from inception made the completion of this thesis possible. I extend my thanks to all the PhD students who helped and motivated me to keep going in my research. In particular, thanks to Mohammed AL-Obeidallah and Jose L. Jorro-Aragoneses; it was a valuable experience sharing an office with you.

Most importantly, I would like to thank my family. None of this would have been possible without their love and encouragement. I thank my wife, Kehinde, for her patience and support in difficult times, and my daughter, Joan. I would like to give an extra mention to Doctor Georgios Samakovitis for his sincere enthusiasm for my PhD work; his relentless effort and support saw me through the completion of this thesis. My innermost gratitude goes to my parents, who are always there for me. I extend my thanks to my brothers and other relatives for their continuous support towards the success of this PhD thesis.


Contents

Declaration
Abstract
Acknowledgements

1 Introduction
  1.1 Mobile Money Overview
  1.2 Motivation
    1.2.1 Operational Challenges
    1.2.2 Technological Challenges
  1.3 Research Questions
  1.4 Research Methodology
  1.5 Contribution to Knowledge
  1.6 Thesis Structure
  1.7 Publications and Research Activities

2 Background on Mobile Money Transfer Operation and Dataset
  2.1 Mobile Money Transfer
  2.2 Mobile Money Business Model
    2.2.1 Mobile Money Transfer Eco-system
    2.2.2 Mobile Money in Emerging Economies
    2.2.3 Regulatory Issues
  2.3 Mobile Money Transfer Synthetic Data
    2.3.1 Why Synthetic Data?
    2.3.2 Synthetic Data Generation Methodology
    2.3.3 Creation of Synthetic Log Data
    2.3.4 Synthetic Data Simulation Using MABS
    2.3.5 Evaluation of Simulation Data
  2.4 Chapter Summary

3 A Review of Fraud Detection Techniques
  3.1 ML Algorithms for Fraud Detection
    3.1.1 Supervised Approaches
    3.1.2 Unsupervised Approaches
  3.2 Learning from Imbalanced Data
    3.2.1 Data Level Methods
    3.2.2 Algorithm Level Methods
    3.2.3 Cost-sensitive Learning Methods
  3.3 Classification Performance Measures
  3.4 Case-Based Reasoning
    3.4.1 Case-Based Reasoning Cycle
  3.5 CBR and Machine Learning
  3.6 Case-Based Reasoning in Fraud Detection
  3.7 Chapter Summary

4 Mobile Money Transfer Data Simulation
  4.1 Mobile Money Model
  4.2 Users' Behaviour Model
    4.2.1 Legitimate Actors
    4.2.2 Bad Actors
  4.3 Implementation
    4.3.1 Simulated Scenarios
    4.3.2 Input Parameters
    4.3.3 Output Parameters
    4.3.4 Simulation Walkthrough
  4.4 Evaluation of the Log Data
    4.4.1 Quality of Data
    4.4.2 Data Analysis
  4.5 Chapter Summary

5 Design of Mobile Money Transfer Fraud Detection System
  5.1 Proposed Detection Method
  5.2 Standard CBR Model
  5.3 Weighted CBR Model
    5.3.1 Problem Representation
    5.3.2 Case Similarity
    5.3.3 CBR Model Feature Weighting
  5.4 Data Pre-processing
  5.5 Preliminary Experiment
  5.6 Evaluating the Efficiency of Prediction
    5.6.1 Cross-Validation
  5.7 Chapter Summary

6 Experiments and Validation
  6.1 First Set of Experiments
  6.2 Second Set of Experiments
    6.2.1 Data Sampling
    6.2.2 Experiments with the Weighted CBR
    6.2.3 Weighted CBR with Clustering
    6.2.4 Weighted CBR with Clustering Experiment
  6.3 Summary

7 Conclusion and Further Work
  7.1 Thesis Summary
  7.2 Contributions and Findings
  7.3 Future Work

Appendix

A Preliminary Experiments
  A.1 Experimental Results from the Selected Machine Learning Algorithms

B Data Simulation Configuration

References

List of Figures

1.1 Mobile money services: the figure is adapted from [AHG14], to highlight the target financial transaction service covered in the study.
1.2 Number of subscribers and transactions (2011-2016)
1.3 Illustration of the research methodology
2.1 P2P funds transfer using MMT service
2.2 Mobile money transfer service ecosystem [Rie+13]
2.3 Regulatory issues
2.4 Synthetic log data generation method [BKJ03]
2.5 The synthetic log data generation process [LKJ02]
3.1 Algorithmic solutions for fraud detection systems
3.2 CBR classical paradigm [Cor08]
3.3 The CBR cycle [AP94]
4.1 The synthetic log data generation process [LKJ02]
4.2 Screen-shot of MMT simulation window
4.3 Simulation window for input parameters
4.4 A flowchart representing the simulation walk-through
4.5 Number of non-fraud and fraud transactions
4.6 Different transaction services performed in the simulation
4.7 Categories of users based on their frequency of transactions
4.8 Fraction of fraud types to category of transaction in the MMT dataset.
4.9 Total number of different fraud types
4.10 Line plot of amount fraction in Kenya shillings (Sept. 2013 - Feb. 2014)
4.11 A relationship graph for 100 most active MMT users in the simulation. The blue and red nodes represent legitimate and bad actors respectively and the edges represent relationships between the actors.
5.1 Proposed fraud detection framework
5.2 Schematic representation of k-fold cross-validation
5.3 Schematic representation of 5-fold cross-validation
6.1 Transaction neighbours summary
6.2 An illustration of the SMOTE + Tomek
6.3 Results for all types of fraud class detection
6.4 Results for non-fraud transaction detection
6.5 Structure of case library retrieval using clustering algorithm [TD13]
6.6 CBR with clustering results
A.1 Classifiers recall results
A.2 Classifiers F-measure results
A.3 Classifiers Matthews Correlation Coefficient results
A.4 Classifiers area under ROC curve results

List of Tables

1.1 The growth in number and value of transactions. (Central Bank of Kenya, 2016 [Ken16])
2.1 Mobile payments business model
2.2 List of mobile money applications in emerging economies
3.1 List of some related works on fraud detection
3.2 Discussion of some of the related works performance
3.3 Confusion matrix
4.1 Simulation input parameters
4.3 Simulation output parameters
4.5 Groups of users according to possible days of the week the mWallet account is used.
4.6 Results of chi-square test for each set of users
4.7 MMT dataset statistics
6.1 Evaluation of StdCBR classifier on small MMT dataset.
6.2 Performance of classifiers: row 1 represents StdCBR, row 2 StdCBR + new features, row 3 Weighted CBR, and row 4 Weighted CBR + new features.
B.1 Simulation input parameters

Abbreviations

AML      Anti-Money Laundering
AUC      Area Under the ROC Curve
CV       Cross-Validation
CBR      Case-Based Reasoning
DT       Decision Tree
ENN      Edited Nearest Neighbor
FD       Fraud Detection
FDS      Fraud Detection System
FN       False Negative
FNR      False Negative Rate
FPR      False Positive Rate
GA       Genetic Algorithm
LG       Logistic Regression
ML       Machine Learning
MCC      Matthews Correlation Coefficient
MMT      Mobile Money Transfer
MNO      Mobile Network Operator
mMoney   Mobile Money
mWallet  Mobile Account
MABS     Multi-agent Based Simulator
NB       Naive Bayes
NN       Neural Network
kNN      k-Nearest Neighbor
P2P      Person-to-Person
RF       Random Forest
ROC      Receiver Operating Characteristic
SMS      Short Message Service
SVM      Support Vector Machine
SNA      Social Network Analysis
TNR      True Negative Rate
TPR      True Positive Rate

Chapter 1

Introduction

This chapter presents an overview of the term mobile money in Section 1.1. Section 1.2 discusses the motivation for detecting fraud in mobile money transfer. The research questions and the thesis's contribution to knowledge are presented in Sections 1.3 and 1.5 respectively. Finally, the thesis structure is presented in Section 1.6, while the publications and research activities tied to this study are presented in Section 1.7.

1.1 Mobile Money Overview

Mobile money is an umbrella term for an ecosystem encompassing various types of financial activities or services transacted via a mobile device. Three main types of mobile financial services exist: Mobile Banking, Mobile Payments and Mobile Commerce, as shown in Figure 1.1. They can be defined as follows [AHG14]: (i) Mobile Banking: the remote management of one's bank account in a mobile environment, such as accessing account information, setting up standing orders, paying bills, direct debits, etc. (ii) Mobile Payments: commonly referred to as mobile money transfer (MMT), this refers to peer-to-peer payment services operated under financial regulation and performed from or via a mobile device. It covers remittance-type activities, both domestic and international, cash in/out, etc. (iii) Mobile Commerce: the buying and selling of goods and services through a mobile device, either remotely or on-site.

Figure 1.1: Mobile money services: the figure is adapted from [AHG14], to highlight the target financial transaction service covered in the study.

Figure 1.1 is adapted from [AHG14] to illustrate the financial activity covered in this thesis. The highlighted area covers mobile money transfer transactions within the mobile payment financial services. Although the risks inherent in all payment channels also exist in the mobile money payment environment, the use of mobile money transfer technologies introduces additional risks caused by an increasing number of non-bank participants, the higher speed of transactions and level of anonymity compared to mobile banking and mobile commerce systems [Mer11; NK14].

This research is motivated by the recognition that Mobile Money Transfer (MMT) is a fast-growing medium for financial transactions made via a mobile device [Muy15]. It is seen as a platform with significant societal value and is considered critical in supporting financial inclusion for unbanked and under-banked populations in developing countries [LD12; Lak13]. In developed countries, where most people have bank accounts and easy access to banks, mobile money transfer is seen as just another evolving channel for existing financial products and services. In developing countries, financial infrastructure is not well developed, and physical transportation infrastructure is often inadequate, unreliable and dilapidated, making access to financial services very costly. This results in a larger unbanked population [Int13a], with a large percentage of the population operating on a cash-only basis outside the formal banking system. In some cases, informal methods are used to transfer money. For example, in rural areas people have to travel long distances from their homes to collect remittances. This presents several risks and a significant cost in addition to the already high transfer fees [Int13b].

Mobile money transfer addresses these difficulties, which has helped to make it more appealing [Int13b]. In addition, it has had significant implications for economic activity across the board. First, it offers a simple, low-cost service with reduced risk; second, it facilitates the flow of money from one party to another using a communications infrastructure that already connects billions of people around the world. This has enabled MMT providers to incorporate a wide range of financial transaction services into their systems, such as bill payment, payroll deposit, loan receipt and repayment, purchases of goods and services, prepaid airtime, groceries, bus tickets and micro-insurance [Jen08].

In 2007, Vodafone and Safaricom launched M-PESA in Kenya, a short message service (SMS)-based money transfer system that allows individuals to deposit, send and withdraw funds from a virtual account on their cell phones, separate from the banking system [WST10]. Three years after mobile money services were first launched in Kenya, there were about 16 million mobile money subscribers [She13]. This number grew by 142% to about 39 million in 2016, higher than the 2015 adult population of about 25.6 million. Between 2011 and 2016, active mobile money subscribers grew at a compound annual growth rate (CAGR) of about 16%, from 19 million to 39 million (see Table 1.1). Usage metrics, measured by the number and value of transactions as well as the balance on customer accounts, have grown at a CAGR of over 26% [Pae17]. The mobile financial services sector in Kenya has thus experienced significant growth and is widely viewed as a success story worth emulating across the developing world. As a result, similar products have recently been launched in other countries across Africa, Asia and Latin America, with the intent of expanding financial services to low-income and rural populations [WST10].

Table 1.1: The growth in number and value of transactions. (Central Bank of Kenya, 2016 [Ken16])

Measurement                       2011     2012     2013      2014      2015      2016      CAGR
Mobile Money Ac/s (mns)           19       21       25        25        32        39        16%
No. of Transactions (mns)         433      575      733       911       1,114     1,362     26%
Transactions Value (KSh bn)       1,169    1,538    1,902     2,372     2,816     3,343     23%
Average Transaction Value (KSh)   2,700    2,672    2,594     2,604     2,528     2,454     -2%
No. of Agents                     76,912   50,471   113,130   123,703   143,946   167,501   17%
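
The CAGR column in Table 1.1 can be approximately reproduced from the first and last value of each row. The following is a minimal sketch, assuming the rate is computed as (end / start) ** (1 / 5) - 1 over the five yearly intervals between 2011 and 2016; the function name `cagr` is illustrative and the inputs are the rounded values printed in the table.

```python
def cagr(start: float, end: float, periods: int) -> float:
    """Compound annual growth rate over `periods` yearly intervals."""
    return (end / start) ** (1 / periods) - 1


# Rounded 2011 and 2016 values as printed in Table 1.1.
rows = {
    "Mobile money accounts (mns)": (19, 39),
    "No. of transactions (mns)": (433, 1362),
    "Transaction value (KSh bn)": (1169, 3343),
    "Average transaction value (KSh)": (2700, 2454),
    "No. of agents": (76912, 167501),
}

for name, (first, last) in rows.items():
    # 2011 -> 2016 spans five yearly intervals.
    print(f"{name}: {cagr(first, last, 5):.1%}")
```

With these rounded inputs, the transaction, value and agent rows round to the 26%, 23% and 17% shown in the table. The accounts row comes out near 15.5%, slightly under the reported 16%, which was presumably computed from unrounded subscriber counts.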

Studies have shown [Pae17] that the successful penetration of mobile money services is typically accompanied by an increase in financial inclusion. For example, in Uganda non-bank formal financial inclusion grew from 7% in 2009, when mobile money was first introduced, to 49% in 2016 [Pae17; Uga16]. Similarly, in Zimbabwe (Figure 1.2), financial inclusion grew from 60% to 77% between 2011 and 2014, largely due to mobile money services [Tru15].

Source: Central Banks of Kenya, Tanzania and Uganda

Figure 1.2: Number of subscribers and transactions (2011-2016)

However, the level of evolution and uptake has varied by country. Kenya has taken the lead in terms of uptake and more competitive pricing, while Uganda lags behind in terms of available financial services. Tanzania, for its part, is the first of the East African countries to implement interoperability between mobile money operators [Pae17]. In 2014, owing to mobile money services, sub-Saharan Africa accounted for 17% of the world's unbanked population, compared to 31% in South Asia and 24% in East Asia and the Pacific [DK+15]. The key component driving the penetration of mobile money and electronic payments is their ability to disaggregate, or unbundle, the services traditionally offered by banks into less expensive and more accessible platforms [GV16].


1.2 Motivation

Predicting fraud in mobile money transfer services comes with a number of challenges, which can be categorised as operational and technological. The operational challenges relate to processes such as regulatory controls, controls on account cash in and cash out, and the customer registration process, and to how these evolve over time. While it is critical to take some of these challenges into account, they are not a direct output of this research work. The technological challenges, on the other hand, concern problems associated with the techniques and tools used in the literature to address these challenges. Some of these challenges are discussed below.

1.2.1 Operational Challenges

Mobile Money Transfer (MMT) services are financial services provided by a Mobile Network Operator (MNO) that enable the transfer of funds between service subscribers through mobile channels, using a digital equivalent of cash (electronic money) [Zhd+14]. In developed countries, MMT is merely seen as an extension to existing banking services. In contrast, in developing countries, where access to banking is often challenging for individuals and businesses, mobile money transfer technologies are viewed as platforms with significant strategic and societal value in supporting financial inclusion for both unbanked and under-banked populations. More than 2.5 billion adults globally lack a formal bank account, the majority of them in developing countries, yet approximately 68 percent of that population have access to a mobile phone [Int13b]. In a 2013 Gartner report [She13], the worldwide market for MMT was estimated to reach over 450 million subscribers in 2017, with a mobile transaction value of more than $721 billion. The main drivers behind the success of mobile money are the explosive growth in the number of mobile devices and the drop in the cost of computing power, which has made mobile phones more accessible [Int13a].

The ability of MMT to handle large numbers of small-value payments, its suitability for transferring funds worldwide in digital currencies, and the current absence of robust regulatory oversight make it both an attractive target for attackers and fraudsters and an equally attractive vehicle for money laundering [BD13]. While in most countries Anti-Money Laundering (AML) and transaction fraud reporting is compulsory for service providers and financial institutions [Zhd+14], in many of them existing money laundering legislation is not presently fit to fully accommodate the relatively young m-money markets. The absence of suitable oversight intensifies the exposure of MMT to risks, including fraud, money laundering and other financial misuse. For example, where proper controls are not deployed, fraudsters can gain access to MMT services without disclosing their identity to the MNO by taking advantage of prepaid phones, "pooling" and delegation of mobile devices [Zhd+14; Cha+11b]. "Since the success of any payment system is based on ubiquity, convenience, and trust, it is necessary to address emerging risks in order to maintain public confidence in mobile money" [Lui+15].

A crucial distinction is made at this point between capabilities for investigating transaction fraud and those addressing the identification of money laundering. While transaction fraud is typically recognised as most commonly associated with money laundering [Zhd+14], money laundering activity may technically exist in the absence of transaction fraud, e.g. through the use of mule accounts [Zhd+14]. Even more crucially, money laundering is process-driven, whereas transaction fraud is event-driven. As a consequence, AML predictive modelling is far more complex and computationally demanding than fraud monitoring, and the selection of suitable Artificial Intelligence approaches is significantly more challenging for AML. Therefore, this thesis considers the development of monitoring and predictive models for transaction fraud, with a view to supporting AML only indirectly.

One of the commonest approaches used today for mitigating illicit financial activity is to impose transaction thresholds on different risk profiles [BH02]. Transactions that exceed these thresholds require extra scrutiny, whereby the client needs to declare the origin of the funds [LRA14]. The thresholds are usually set by law, without distinction between different economic sectors or actors. They are easily circumvented by fraudsters, who adapt their spending behaviour to smaller-value transactions (a.k.a. smurfing) [Zhd+14]. Consequently, this and other similar methods used to date have proven insufficient [LRA14; Mag09].
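
The weakness of a fixed threshold can be illustrated with a minimal sketch. The threshold value, the transaction amounts and the function name `flag_over_threshold` are hypothetical and chosen only for illustration; they are not taken from any regulation or from the thesis dataset.

```python
from typing import List

THRESHOLD = 10_000  # hypothetical reporting threshold, in arbitrary currency units


def flag_over_threshold(amounts: List[float], threshold: float = THRESHOLD) -> List[float]:
    """Return the amounts that exceed the threshold and so need extra scrutiny."""
    return [a for a in amounts if a > threshold]


# A single large transfer is flagged by the rule.
single_transfer = [45_000]
print(flag_over_threshold(single_transfer))

# The same total split into smaller transfers ("smurfing") passes unflagged.
smurfed = [9_000] * 5
print(sum(smurfed), flag_over_threshold(smurfed))
```

The rule inspects each transaction in isolation, so it has no view of the total moved by one actor across several transfers.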

Promising recent research has applied data mining methods to fraud detection [Phu+10]. This research shows that machine learning algorithms can identify novel methods of fraud by detecting transactions that are different (anomalous) in comparison to benign transactions. Hence, several machine learning techniques have been used for the detection of fraud [LRA14], and the application of machine learning to this problem is advantageous in many situations [Yue+07; ZSY03]. Most of these machine learning techniques are data-driven, typically requiring a significant amount of historical financial transaction data [Sha+16].

1.2. Motivation 9

The challenges in obtaining real-life financial transaction datasets for research purposes are well known [BH02]. They include data protection, confidentiality, and purpose and storage limitation, as well as the requirements set out in the GDPR principles [Com18]. Even where real-life datasets are available, they may be small in size and lack information on confirmed fraud cases and their possible taxonomies [Gor15]. The overall scarcity of real data use cases in the academic literature clearly attests to this [Gab+13]. Where a historical dataset is unavailable for the reasons described above, Phua et al. [Phu+10] suggest simulating synthetic transaction data that closely matches real data. This can be achieved by using real data as a property seed for the simulation, i.e. statistical properties from small amounts of authentic data are used to generate large amounts of synthetic data. This motivates part of our aim of simulating mobile money transfer transaction data for the purpose of evaluating the proposed prediction model.
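
The idea of seeding a simulation with statistical properties of real data can be sketched as follows. This is only an illustration under strong assumptions: the seed amounts are invented, the amounts are assumed to be log-normally distributed, and the function name `synthesise_amounts` is hypothetical. It is not the simulation approach used later in this thesis.

```python
import math
import random
import statistics
from typing import List


def synthesise_amounts(seed_amounts: List[float], n: int, rng: random.Random) -> List[float]:
    """Draw n synthetic amounts from a log-normal fitted to the seed sample."""
    logs = [math.log(a) for a in seed_amounts]
    mu = statistics.mean(logs)       # mean of log-amounts in the seed data
    sigma = statistics.stdev(logs)   # sample standard deviation of log-amounts
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


# Invented seed sample standing in for a small set of authentic amounts.
seed = [120.0, 300.0, 450.0, 800.0, 1500.0, 2500.0, 5000.0]
rng = random.Random(42)  # fixed seed so the sketch is reproducible
synthetic = synthesise_amounts(seed, 10_000, rng)

print(len(synthetic), round(statistics.median(synthetic), 2))
```

Only two summary statistics of the small seed sample are carried over, yet they are enough to generate an arbitrarily large synthetic sample with a similar distribution of amounts.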

1.2.2 Technological Challenges

In general, a fraud detection system relies on the analysis of recorded transaction data for the purpose of identifying unusual transactions. Such data can be enormous and is mainly composed of attributes such as account identifier, transaction date, recipient, transaction amount and more. As a result, the use of automatic systems is essential, since it is not always possible or easy for human analysts to detect fraudulent patterns in transaction datasets by manually checking all transactions [Poz15].

A classic approach to such a system is the expert-based approach, which uses the experience, intuition and domain knowledge of fraud analysts to define rules that predict the probability that a new transaction is fraudulent [BVV15]. For example [BVV15], assume a rule-based fraud detection system for an insurance company. The expert rule can be "IF: Amount of claim is above threshold OR Severe accident, but no police report OR Multiple receipts submitted, THEN: Flag claim as suspicious AND Alert fraud investigation officer". Such an expert system relies on human expert input, evaluation and monitoring, and therefore suffers from a number of disadvantages. These systems are expensive to build, since they require advanced manual input by fraud experts. They are also difficult to maintain and manage, because they become obsolete when fraud evolves (i.e. when fraudsters change their modus operandi and become undetectable by the current rules) or when the behavioural patterns of customers change.
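
The expert rule quoted above translates directly into code. The sketch below is illustrative only: the `Claim` fields, the threshold value and the function name `is_suspicious` are assumptions, not part of [BVV15] or of this thesis.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    amount: float
    severe_accident: bool
    has_police_report: bool
    receipts_submitted: int


AMOUNT_THRESHOLD = 5_000.0  # hypothetical value an analyst would set by hand


def is_suspicious(claim: Claim, threshold: float = AMOUNT_THRESHOLD) -> bool:
    """Apply the three OR-ed conditions of the expert rule to one claim."""
    return (
        claim.amount > threshold
        or (claim.severe_accident and not claim.has_police_report)
        or claim.receipts_submitted > 1
    )


claims = [
    Claim(amount=1_200.0, severe_accident=False, has_police_report=False, receipts_submitted=1),
    Claim(amount=900.0, severe_accident=True, has_police_report=False, receipts_submitted=1),
    Claim(amount=7_500.0, severe_accident=True, has_police_report=True, receipts_submitted=1),
]

for claim in claims:
    if is_suspicious(claim):
        # In a real system this is where the investigation officer would be alerted.
        print("Flag claim as suspicious:", claim)
```

Every threshold and condition here is hard-coded by an analyst, which is why such rules need manual revision whenever fraud patterns change.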

As an alternative to expert systems, automated approaches (such as statistical and machine learning techniques) are used, which leverage recorded transactions to monitor and analyse data more efficiently. These automated approaches are data-driven and are able to learn from the data in a supervised or unsupervised manner in order to identify patterns that are most probably related to fraudulent behaviour [Poz15]. Data-driven approaches are able to learn complex fraudulent configurations, ingest large volumes of data and adapt to changing distributions in the case of fraud evolution. However, they come with some drawbacks [Poz15; Sha+16]: (i) in the absence of a significant amount of historical data, they tend not to perform well; (ii) some models are black boxes, i.e. they are not easily interpretable by investigators and thus do not provide an understanding of why an alert is generated.

A case-based reasoning (CBR) method, as an alternative to the aforementioned methods, comes with a number of advantages when applied to financial transaction fraud. For example [Sha+16; PDM15], case-based reasoning is able to (i) learn in the absence of historical consumption data while continuously improving as more data becomes available over time; (ii) realise knowledge transfer as spending habits evolve, as is the case where information on one transaction is exploited to improve predictions for different yet similar transactions; and (iii) provide precedent-based justification instead of justifying a solution by showing a trace of the rules that led to the decision [CK91; Wat99]. In addition, a CBR system is faster to construct than an expert system, is easier to maintain and can cope with complex structures. This is an advantage in comparison with artificial neural networks (ANNs), which use numeric input or symbolic patterns to deal with the complexity of structures [PH07; Kap12]. Additionally, a CBR system is more transparent than black-box models such as neural networks [Sha+16], which makes it easier to apply in practice.

Leveraging on the advantages above, a case-based reasoning methodology

is considered suitable to provide an effective way to analyse complex structures

(such as evolving genuine and fraudulent behaviours) in mobile money transfer

payment services. However, to design a fraud detection system using a case-based

reasoning methodology, a number of requirements need to be considered.

These include: (i) whether to use a supervised or an unsupervised approach;

(ii) the mechanism for feature selection/dimensionality reduction on sample sets;

and (iii) the unbalanced dataset problem associated with financial transaction data.

All these requirements are investigated in detail and considered in this

thesis, as further discussed in Chapter 3.


1.3 Research Questions

Based on the motivations above, this research investigates whether the

CBR methodology can be used for the effective prediction of mobile

payment fraud. Thus, the main research question of this thesis can be stated as

follows:

How can the Case-based Reasoning (CBR) methodology be used

for effective analysis and prediction of transaction fraud in mobile

money transfer (MMT) networks?

This main research question is further divided into three sub-questions, as

addressed in this thesis.

1. How can a model developed through the CBR approach be used for ef-

fective analysis of MMT transaction fraud? Achieving this will support

feature engineering of systems that will deliver improved predictive ac-

curacy.

2. To what extent can such a model deliver measures/metrics for prediction

of MMT fraud? This will involve the similarity measures used and the

performance from these predictions.

3. What are the limitations of such a model and the performance to expect

from it?

Finding answers to these research questions will help to address the im-

plementation, validation and evaluation of the proposed CBR approach. One

major concern in building and evaluating the performance of the proposed


model under realistic conditions is the difficulty of obtaining a mobile money

payment dataset, owing to privacy issues. This leads to an additional question: How

can background transaction data be obtained as training and test cases for pattern

analysis and the evaluation of learning algorithms? Addressing this challenge

will support the evaluation of the proposed prediction model.

1.4 Research Methodology

The research methodology adopted in this thesis can be divided into three

major steps, as shown in Figure 1.3:

Figure 1.3: Illustration of the research methodology

As seen in Figure 1.3, the first step in this thesis was to carry

out an investigation of the literature so as to provide answers to the stated

research questions. To investigate the state of the art, a narrative review was carried

out with a view to providing a comprehensive overview of topics such as the mobile

money transfer services environment. This was done by providing basic

knowledge on the fundamentals of mobile money such as the business models, ecosystem


and regulatory issues. The existing approaches

used in the literature for dealing with the lack

of publicly available mobile money transfer datasets were also investigated. In addition,

common algorithmic solutions used for financial service fraud detection in the literature were

studied. Different applications of supervised machine learning approaches exist

in fraud detection problems, and the common approaches used for handling

the unbalanced data problem in supervised learning were investigated. The

evaluation techniques for measuring a fraud detection system effectively were also

investigated. Lastly, other significant areas of this research were investigated

and highlighted.

The second step is the generation of a synthetic mobile money transfer (MMT)

dataset. Due to the absence of a real MMT transaction dataset, the need for

a simulated dataset was identified and the methodology for simulating a synthetic

mobile transfer dataset in [LKJ02] was adopted. This choice is based on the fact that

the method has a well-defined interface, which makes it easy to use, i.e. the

whole process is divided into steps, which provides the possibility of using

the whole or part of the system for data simulation (as discussed in Section

2.3.2). To run the simulation, the multi-agent based simulator (MASON) used

in [LrA12b] was adapted. The rationale for using MASON is that it is fast

and supports discrete event interaction between many agents in a swarm, i.e. it

facilitates the implementation of social networks, as discussed in Section 2.3.4.

In order to evaluate the simulated dataset, verification and quantitative

(chi-square test) methods were applied, since no real data were available as input to

the simulator, as discussed in Subsection 4.4. After the data simulation, the

proposed CBR system was designed and developed using the jCOLIBRI

framework [RGGCDA14], a Java framework that allows rapid prototyping of a CBR

system as well as the development and deployment of the CBR system in real scenarios.

As an improvement to the CBR system, it classifies features in the

MMT transaction dataset into five contexts and then recombines them into a single

dimension to capture user behaviour effectively.

The third step is the simulation of the proposed CBR prediction algorithms.

At this stage, different sets of experiments were designed and conducted. The

process followed an evolutionary approach in order to evaluate the research

methodology. The first set of experiments started with the

application of the basic CBR technique, as discussed in Section 5.2, on a minimalistic

simulated dataset. The experiments then progressed gradually to the application

of weighted CBR techniques using machine learning capabilities (a genetic

algorithm) for assigning parameter weights and automating the random selection

of the k-value in the CBR k-NN algorithm. In addition, for the purpose of

capturing user behaviour effectively and improving the CBR prediction accuracy,

feature engineering was carried out, as discussed in Section 5.3.
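The weighted CBR retrieval described above can be illustrated with a minimal sketch. It assumes numeric features already normalised to [0, 1], a local similarity of 1 minus the absolute difference, and a majority vote over the k nearest cases. The feature names, weights and cases are hypothetical; in the thesis the weights are assigned by a genetic algorithm and the system is built on jCOLIBRI, neither of which is reproduced here.

```python
from collections import Counter


def weighted_similarity(query, case_features, weights):
    """Global similarity: weighted mean of local similarities (1 - |difference|)."""
    total_weight = sum(weights.values())
    score = sum(
        w * (1.0 - abs(query[name] - case_features[name]))
        for name, w in weights.items()
    )
    return score / total_weight


def retrieve_and_classify(query, case_base, weights, k=3):
    """Retrieve the k most similar cases and return (majority label, retrieved cases)."""
    ranked = sorted(
        case_base,
        key=lambda case: weighted_similarity(query, case["features"], weights),
        reverse=True,
    )
    top_k = ranked[:k]
    votes = Counter(case["label"] for case in top_k)
    return votes.most_common(1)[0][0], top_k


# Hypothetical case base: features are assumed to be normalised to [0, 1].
CASE_BASE = [
    {"features": {"amount": 0.10, "hour": 0.50, "frequency": 0.20}, "label": "genuine"},
    {"features": {"amount": 0.15, "hour": 0.55, "frequency": 0.25}, "label": "genuine"},
    {"features": {"amount": 0.90, "hour": 0.05, "frequency": 0.80}, "label": "fraud"},
    {"features": {"amount": 0.85, "hour": 0.10, "frequency": 0.90}, "label": "fraud"},
    {"features": {"amount": 0.20, "hour": 0.45, "frequency": 0.15}, "label": "genuine"},
]

# Hypothetical feature weights (stand-ins for weights a genetic algorithm might find).
WEIGHTS = {"amount": 0.5, "hour": 0.2, "frequency": 0.3}

query = {"amount": 0.88, "hour": 0.08, "frequency": 0.85}
label, neighbours = retrieve_and_classify(query, CASE_BASE, WEIGHTS, k=3)
print(label)
```

Returning the retrieved cases alongside the label reflects the precedent-based justification that CBR offers: an investigator can inspect the neighbours that led to the prediction.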

Next, in order to demonstrate the applicability of the proposed model,

more sophisticated simulation datasets of both high complexity and

significant size were used. The rationale was to ensure that the proposed

approach achieves a promising classification precision. To ensure that the

CBR model is not biased by the unbalanced dataset or by the use of the

same transaction data in both the training and testing phases of the experiment,

the following were carried out: (i) Data balancing using a hybrid approach

that combines oversampling and under-sampling algorithms (SMOTE+Tomek-Link)

was adopted. According to Chawla et al. [Cha+02; GS17], this hybrid

approach works better than either technique alone. (ii) The MMT dataset was split into

training and testing sets using a ratio of 70:30 respectively. In addition, to avoid


an overoptimistic estimate of the CBR model performance after the dataset

split, transactions from known compromised MMT accounts were removed from

the subsequent split, as proposed in [Fab+17]. For example, when an MMT

account is already associated with a fraudulent transaction in the training set,

its transactions are removed from the test set. Furthermore, a clustering

technique was applied to the CBR retrieval process to reduce the computation cost

associated with the use of genetic algorithms for feature weighting.

Clustering has been widely used in various fields in the literature to improve

the classification accuracy and computation cost of learning algorithms [TD13]

as the case library grows.
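The 70:30 split with removal of known compromised accounts from the test set can be sketched as follows. This is a minimal illustration under stated assumptions: each transaction is a dictionary with hypothetical `account` and `is_fraud` fields, the split is a simple seeded shuffle, and the balancing step (SMOTE+Tomek-Link) is omitted.

```python
import random


def split_without_compromised_accounts(transactions, test_ratio=0.3, seed=0):
    """Shuffle and split the data, then drop test transactions whose account
    already has a fraudulent transaction in the training set."""
    rng = random.Random(seed)
    shuffled = list(transactions)
    rng.shuffle(shuffled)

    n_test = int(round(len(shuffled) * test_ratio))
    test, train = shuffled[:n_test], shuffled[n_test:]

    # Accounts known to be compromised from the training data.
    compromised = {t["account"] for t in train if t["is_fraud"]}
    filtered_test = [t for t in test if t["account"] not in compromised]
    return train, filtered_test


# Hypothetical toy data: ten transactions over four accounts, one fraud.
transactions = [
    {"account": f"A{i % 4}", "amount": 10 * (i + 1), "is_fraud": i == 0}
    for i in range(10)
]

train_set, test_set = split_without_compromised_accounts(transactions, seed=1)
print(len(train_set), len(test_set))
```

Filtering after the split means the test set may end up smaller than 30% of the data; that is the price of a less optimistic performance estimate.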

Four performance measures were used as evaluation metrics, namely

recall, F-measure, Matthews Correlation Coefficient (MCC) and Area Under the

Curve (AUC), as discussed in Section 5.6. These were chosen because they

handle imbalanced data well without becoming

biased towards the majority class, and because they are well suited

to the fraud detection domain. Finally, the differences in the results are

analysed both qualitatively and quantitatively.
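Three of the measures named above can be computed directly from the confusion matrix, as the following minimal sketch shows. It uses the standard definitions of recall, F-measure (F1) and MCC, with fraud as the positive class (label 1). AUC is left out because it needs ranked prediction scores rather than hard labels. The label vectors are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) with fraud (1) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def recall(tp, fn):
    """Fraction of actual frauds that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (F1)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; 0.0 when a marginal count is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical imbalanced example: 2 frauds among 10 transactions.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(recall(tp, fn), f_measure(tp, fp, fn), mcc(tp, tn, fp, fn))
```

In this toy example plain accuracy would be 0.8 even though only half of the frauds are caught, which is why accuracy alone is a poor guide on imbalanced data.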

1.5 Contribution to Knowledge

The work that will be discussed in this thesis presents the following contribu-

tions to knowledge.

First, this thesis proposes a novel approach to detecting fraud in mobile

money payment networks using a case-based reasoning methodology. This work

not only presents the use of the CBR approach in the financial transaction fraud

detection domain but also introduces the novelty of feature engineering by


classifying features into the context of information. This approach allows users’

behavioural pattern to be captured effectively and also helps to improve the

prediction accuracy of the proposed CBR system.

An additional value offered in this work is the injection of variations of

known frauds into the CBR system simulation. The general binary annotation

for the fraud class (i.e. 1) in the literature was further annotated and introduced

into the mobile money transfer dataset using the three fraud case scenarios

presented in the data simulation. This approach allows the investigation and

analysis of how this variation affects the performance parameters of the proposed

system, such as its detection rate for each fraud class.

The final contribution in this thesis is the extension of the synthetic fraud

data generation methodology in [LKJ02] to accommodate the use case of

mobile money transfer services. The methodology was used to expand and

simulate additional use-case scenarios which were hitherto not considered in the

literature for mobile money transfer datasets. The use of these simulated datasets

in the absence of real data will allow the development and evaluation of fraud

detection techniques or tools.

1.6 Thesis Structure

Chapter 2 provides a general overview on Mobile money transfer services envi-

ronment, by providing basic knowledge on fundamentals of Mobile money such

as the business models, ecosystem and regulatory issues. It also discusses the

existing approaches used in the literature for dealing with the

lack of publicly available mobile money transfer datasets.

Chapter 3 discusses common algorithmic solutions used for financial service


fraud detection in the literature. It further discusses the application

of supervised approaches to fraud detection problems and the common

approaches used for handling the unbalanced data problem in supervised learning.

The evaluation techniques for measuring a fraud detection system effectively

are also discussed.

Chapter 4 contains discussion on the generation of synthetic mobile money

transfer dataset. It provides discussion on the data simulation model as well

as the different misuse scenarios used. This chapter also discusses the imple-

mentation, simulated scenario and evaluation of the generated dataset. The

results from the simulated data were analysed and presented and then used

for evaluating the proposed models.

Chapter 5 provides an extensive description for the methodology adopted for

the needs of predicting money transfer fraud in mobile money services. It also

contains the rationale behind the conducted experiments as well as the case

representation used for the similarity measures. The chapter further discusses

the enhancement of the basic CBR model using machine learning capabilities to

improve its effectiveness in predicting money transfer fraud. Finally, the data

pre-processing and performance evaluation metrics used in this thesis are

discussed.

Chapter 6 discusses the results from the model evaluations using different

volumes of data. This involves the implementation of a set of experiments

in an evolutionary approach, progressing from simplified to more complex ones.

It also discusses the rationale for the use of clustering to further enhance the

performance of the proposed CBR model.

Chapter 7 concludes this thesis by summarising the findings and highlighting


the main contributions with respect to the thesis objectives. Plans for

further work in the field of mobile money transfer fraud are

also discussed.

1.7 Publications and Research Activities

The list of works published during this thesis is summarized as follows.

Peer Reviewed International Conference Papers

1. Adeyinka Adedoyin, Stelios Kapetanakis, Georgios Samakovitis, and

Miltos Petridis (2017). Predicting Fraud in Mobile Money Transfer Using

Case-Based Reasoning. In: 37th (SGAI) International Conference on

Artificial Intelligence, AI-2017, Cambridge, UK (2017). Won the Best

Student Paper.

2. Adeyinka Adedoyin, Stelios Kapetanakis, Georgios Samakovitis, and

Miltos Petridis (2017). Fraud Detection in Mobile Payment Transfer. In:

22nd UK Symposium on Case-Based Reasoning (UKCBR2017),

Cambridge, UK (2017).

3. Adeyinka Adedoyin, Stelios Kapetanakis, Miltos Petridis, and Emmanouil

Panaousis (2016). Evaluating Case-Based Reasoning Knowledge Discov-

ery in Fraud Detection. In: 24th International Conference on Case-Based

Reasoning (ICCBR2016) Workshop proceedings, Atlanta Georgia, USA,

October, 2016.

Chapter 2

Background on Mobile Money

Transfer Operation and Dataset

In the previous chapter, types of mobile money financial services were identified

and the rationale for selecting mobile payment transfer as the area of study in

this thesis was discussed. An overview of mobile money transfer was given,

together with the motivation for this thesis. This chapter provides basic

knowledge on the fundamentals of mobile money: mobile money transfer

in Section 2.1, and business models, the ecosystem, regulatory issues and mobile

money in emerging countries in Section 2.2. The approach that has

been adopted for dealing with the lack of publicly available mobile

money transfer datasets is then presented in detail in Section 2.3, together with a few relevant works

in the literature where financial transaction datasets were simulated. Section

2.4 concludes the chapter by summarising the topics related to the research

questions in this thesis and the research gaps to be addressed.


2.1 Mobile Money Transfer

According to Zhdanova et al. [Zhd+14], Mobile Money Transfer (MMT)

services can be defined as a financial service provided by a Mobile Network

Operator (MNO) that enables the transfer of funds (mMoney) between service

subscribers through the use of mobile channels. In an MMT service, a mobile

subscriber can add electronic money, called mMoney, to his or her virtual mobile

account (mWallet) and store it for later use, transfer it to other mobile subscribers

or purchase goods via mobile phone. The receiver can inexpensively convert

this credit back into cash through retailers, such as local corner shops, that act

as bank branches.

Mobile money transfer service allows users to send cash using SMS technol-

ogy thereby avoiding inconvenient and costly transfer methods such as physical

travel, the mail, or traditional wire transfer services like Western Union and

Postapay, which are often carried out in banks. For example, paying for services

like electricity and water may otherwise require people to travel long distances and

face long queues at the bank. To deposit funds into a mobile money

account, consumers go to a participating local shop (retailer) and hand over

physical money. There is no charge to a customer for depositing funds into

his/her account, but a sliding tariff is levied on withdrawals from the account.

A subscriber who sends mMoney is charged a flat fee if sending to another

registered user and a sliding fee if sending to a mobile subscriber that is not

registered with the same MMT service provider [ML10; WST10]. Figure 2.1

shows a person to person fund transfer using MMT service [Zhd+14].


Figure 2.1: P2P funds transfer using MMT service

As shown in Figure 2.1, in order to access MMT services, e.g. to perform a P2P transfer,

Jane must register at an authorized MMT retail agent outlet R1 (e.g. M-PESA).

She then gets an individual electronic money account (mWallet) that is managed by the

MNO (e.g. Safaricom), which in turn deposits the full value its customers store

in M-PESA accounts in a pooled account at a regulated bank. Thus, the issuer

of M-PESA accounts is Safaricom, but the value in the accounts is entirely

backed by highly liquid deposits at a commercial bank. Jane converts

her cash into mMoney and deposits the amount into her mWallet account with

the help of retailer R1. Then, Jane can use her mobile device to transfer

mMoney to Frank if he is subscribed to the same MMT service. On completion of

the transfer, they both get an SMS receipt as confirmation. Frank can then

withdraw cash from his mWallet at retailer R2. Since the sender and

recipient receive an SMS receipt as proof of transfer after each transaction,

trust in the system is strengthened even if a customer does not trust

the local agents themselves [Zhd+14]. The next section discusses the different

business models that can be adopted to build a mobile money transfer service.
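The P2P flow described above (registration, cash-in at a retailer, transfer with SMS receipts, cash-out) can be traced with a small toy model. This is only a sketch under stated assumptions: the class and method names are hypothetical, amounts are integers in the smallest currency unit, and fees are ignored because the source gives no tariff values.

```python
class MMTService:
    """Toy model of an MNO-managed mWallet ledger backed by a pooled bank account."""

    def __init__(self):
        self.wallets = {}             # subscriber -> mMoney balance
        self.pooled_bank_account = 0  # cash held at the partner bank
        self.sms_log = []             # (recipient, message) receipts

    def register(self, subscriber):
        self.wallets.setdefault(subscriber, 0)

    def _require_registered(self, subscriber):
        if subscriber not in self.wallets:
            raise ValueError(f"{subscriber} is not registered")

    def cash_in(self, subscriber, amount):
        """Retailer converts cash to mMoney; the cash backs the pooled account."""
        self._require_registered(subscriber)
        self.wallets[subscriber] += amount
        self.pooled_bank_account += amount

    def transfer(self, sender, receiver, amount):
        """Move mMoney between two registered wallets and send SMS receipts."""
        self._require_registered(sender)
        self._require_registered(receiver)
        if self.wallets[sender] < amount:
            raise ValueError("insufficient mMoney balance")
        self.wallets[sender] -= amount
        self.wallets[receiver] += amount
        self.sms_log.append((sender, f"Sent {amount} to {receiver}"))
        self.sms_log.append((receiver, f"Received {amount} from {sender}"))

    def cash_out(self, subscriber, amount):
        """Retailer converts mMoney back to cash."""
        self._require_registered(subscriber)
        if self.wallets[subscriber] < amount:
            raise ValueError("insufficient mMoney balance")
        self.wallets[subscriber] -= amount
        self.pooled_bank_account -= amount


service = MMTService()
service.register("Jane")
service.register("Frank")
service.cash_in("Jane", 100)           # Jane deposits cash at retailer R1
service.transfer("Jane", "Frank", 40)  # both parties receive an SMS receipt
service.cash_out("Frank", 40)          # Frank withdraws cash at retailer R2
print(service.wallets, service.pooled_bank_account)
```

The invariant worth noticing is that the pooled bank account always equals the sum of all mWallet balances, which mirrors the statement that the value in the accounts is entirely backed by bank deposits.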


2.2 Mobile Money Business Model

In order to build a mobile money transfer service as discussed above, a

business model is required. Lurie [Lur11] defined a business model as a way of

designing a business to create, deliver and capture value. According to

[AHG14], mobile money is growing rapidly around the world and

requires a supportive business model as the different key players try to

leverage their interests in the service. These players vary from financial institutions

trying to leverage new technologies and channels, to mobile operators looking

to augment their revenue opportunities, and third-party vendors attempting to

take advantage of business opportunities presented by new technologies and

changes in consumer behaviour. There are three core business models in mobile

money payment:

• MNO-Centric: The Mobile network operator (MNO) takes the lead

and provides various financial services initially outside the banking sys-

tem. In extreme cases the MNO could acquire banking licenses that

would allow them to store the deposits made into the system [AHG14].

• Bank-Centric: A bank takes the lead and finds an MNO with which

to partner. They use mobile phone platforms to leverage their credi-

bility and expertise in the extension of their existing and new channels

[AHG14].

• Collaborative (including third-party players): In the collaborative model,

an MNO and a bank join forces to create an m-money service. In the

third-party-led case, vendors create solutions that allow cross-operator

and cross-bank solutions to be launched in the market [Lak13].


Examples of how the various actors in the value chain have implemented these

business models are shown in Table 2.1 [AHG14].

Table 2.1: Mobile payments business model

| Business Model | Technologies Used | Purchase Relationship | Charged to | Examples |
|---|---|---|---|---|
| Financial Institution Led | NFC, Internet, External Device | Consumer to Business | Bank Account, Debit Card, Credit Card, Prepaid Card | Barclay NFC Tag |
| Mobile Operator Led | SMS, USSD, Internet, NFC | Customer to Business, Business to Business, Peer to Peer | Network Bill, Debit Card, Credit Card | M-Pesa, Felica |
| Third Party Led | Internet, NFC, External Device | Customer to Business, Business to Business, Peer to Peer | Debit Card, Credit Card, Prepaid Card | Paypal, Square, Google Wallet |

Over time as mobile money technology keeps evolving, the business model

will experience some variations. For instance an MNO-centric venture could

evolve over time by increasing its partnership with banks and possibly develop

into a collaborative model. A good example of this is M-PESA in Kenya. This

shows that the models are dynamic and can be linked to certain stages of

financial development in each country [Ste11]. However, the implementation

of the aforementioned business models involves partnership between different

stakeholders. The major stakeholders are discussed in the next

section.

2.2.1 Mobile Money Transfer Eco-system

Jenkins [Jen08] defined the mobile ecosystem as the networks of organizations,

individuals, processes and systems that link and facilitate or control the

delivery of a payments system. Nazareno, the president of Smart Communications,

highlights that the mobile money ecosystem works on three rules: partnership,

partnership and partnership. This encourages the creation of a mesh of

partnerships covering various networks of relationships [Jen08]. The key players in the

mobile money transfer ecosystem are the consumers (end-users and service

providers), distribution channels (retailers and wholesalers), the network operator

(MNO), commercial banks and the central bank [Rie+13]. The activities of

these key players within the ecosystem were used to simulate the mobile

money transfer dataset, as discussed in Chapter 3.

Figure 2.2: Mobile money transfer service ecosystem [Rie+13]

Figure 2.2 which is adapted from [WST10], shows the roles of various key

players in the mobile money transfer ecosystem. This outlines the major use

cases in the MMT ecosystem for the purpose of data simulation as further

discussed in Chapter 3. The discussion of each of the key players is as follows:


A. Mobile Network Operator

The Mobile Network Operator (MNO) issues mMoney (m) in partnership with a

private bank and regularly produces compliance reports for the Central

Bank, which is responsible for the country’s monetary policy [Rie+13]. The role

of the MNO in the mobile money ecosystem is critical, as it plays the leadership

role by drawing the different stakeholders in the ecosystem together. The MNO

provides infrastructure such as wireless communication, back-end servers and

the mobile application for the operation of the ecosystem. In addition, it

brings its large existing distribution channels and subscriber base into the

ecosystem. Wherever there is mobile coverage, there is an agent of a distributor that

sells prepaid credits. The geographical distribution of the agents gives the MNO

the ability to reach customers across all income segments. This, coupled with

the ownership of the infrastructure, enables the MNO to be the key player

in the mobile money ecosystem. MNOs also play key roles in further ecosystem

expansion and in training agents to deal with consumers. However, they

lack experience in financial services, payment risk, and the regulatory and legal

governance of the payment system [Jen08; Tob11].

B. Financial Institutions (Partner Banks)

The financial institution provides the banking license and helps to store the mobile

money that customers keep in their mWallets. It also brings its vast

experience and customer trust in dealing with eMoney while acting as an intermediary

between the MNOs and agents in acquiring the eValue. The branch offices of

the banks act as aggregation points for the merchants, distribution channels

and their agents in facilitating the flow of money in the mobile money ecosystem.

The bank provides financial regulatory advice to the MNOs and also online

banking integration with the m-commerce system of the MNOs to facilitate their

operations [Tob11].

C. Distribution Channels (Agents)

The distribution channels (agents) are primarily the consumer-facing

touchpoint and can often be seen as the “face” of the mobile money offering [SCP16].

The distribution channels are non-bank entities such as MNO retail shops,

village corner stores or a mix of both that handle customer registration,

cash-in/cash-out services and other transactions on behalf of the MNO. Through

their knowledge and understanding of the consumers, they help to educate them,

maintain liquidity, handle account opening procedures and report suspicious

transactions in line with regulatory requirements. The agents earn

commissions based on the amount of mobile money trading they undertake, which is

usually a very small amount per transaction. However, it is expected that the

volume of transactions will add up to an amount sufficient to sustain their retail

business [Muy15; Tob11].

D. Service Providers (Merchant and Utilities)

The adoption of mobile money platforms as a means of receiving payment by

service providers enables convenient and timely payments for both the mer-

chant and customers. In Kenya and Ghana for example, subscribers of popular

pay-per-view TV service use mobile money (M-PESA and ZAP) for the pay-

ment of their subscription fees rather than queuing up to make such payment.

Also, the adoption of mobile money platform can lead to increased customer

base of the mobile money ecosystem thereby acting as a catalyst in promoting

the service [Tob11].

E. Regulators

The function of regulators in mobile money is to provide an enabling

environment, protect the stability of the financial system, ensure implementation

of regulations and facilitate innovation. The development of mobile money

cuts across two regulatory bodies in most countries, telecommunication and

banking. This has brought about competition and unclear functions between

the two major operators. As a result, many countries have not yet developed

mobile money regulations and policies. There is therefore, the need to clarify

and understand the relationships between the actors within the mobile money

ecosystem so as to ensure improved efficiency and clear regulatory policies.

This also gives rise to a need for a converged regulation for both technology

standards and policy which is slowly coming to the attention of regulators

globally. This proposed collaboration requires careful balancing with national

interest [LD12; Tob11].

F. Customers

In the mobile money ecosystem, customers are the final recipients and it is

therefore important that effective and efficient services are made available by all

participating mobile service providers. The use of mobile money payment

reduces the risk of carrying cash and increases access to payment, remittance

and other financial services for customers, particularly in developing markets

[Tob11].

2.2.2 Mobile Money in Emerging Economies

Globally, the main drive behind the success of mobile money is the explosive

growth in the number of mobile devices and the drop in the cost of computing

power. This has enabled millions of people to own mobile devices [Int13a].

Mobile money in emerging countries is more than just a technology because

it brings financial inclusion to the world’s unbanked and under-banked

population. In developing countries [Int13b], more than 2.5 billion people lack

a formal bank account, but about 68 percent of this population have a mobile

phone. The reasons for the high percentage of adults not having a bank account

range from lack of money to use one, to distance from banking facilities, especially

in rural areas, and the high cost of banking services. Africa has the highest growth

rate in mobile phone usage, with increasing mobile coverage. This has allowed

millions of people in remote areas to adopt mobile money as an

alternative to traditional banking, especially in Sub-Saharan Africa [Ken16].

In Kenya, as in most low-income African countries, a great number of households

depend on domestic remittances, and their use of M-PESA makes it the biggest

success story in terms of overall usage. Since the launch of M-PESA in Kenya in

2007, the mobile money industry has continued to grow in many African

countries and today it has reached a level of sophistication not seen anywhere else

in the world [Int13b; Zha12]. Examples of a few mobile money applications in

emerging economies are shown in Table 2.2 [Int13b].


Table 2.2: List of mobile money applications in emerging economies

| M-money Application | Countries Implemented | Main Features |
|---|---|---|
| M-PESA | Kenya, Tanzania, South Africa and Afghanistan | P2P transfer, Pay school fees, Pay electricity, Pay for goods and services. |
| Easypaisa | Pakistan | Pay utility bills, Make P2P transfer, Increase airtime credits, Save money, Pay for goods and services. |
| T-Cash | Haiti | Receive salary, Make P2P transfers, Pay bills. |
| Globe GCash | Philippines | Pay utility bills, Make P2P transfers, Use as a mobile wallet, Increase airtime credits, Pay for goods and services. |
| Airtel Money | India and 14 African countries including Uganda, Tanzania and Kenya | Make P2P transfers, Pay for goods and services, Bill payments. |
| MTN Mobile Money | Africa, including Uganda, Ghana, Cameroon, Ivory Coast, Rwanda and Benin | P2P transfers, Buy airtime, Check balances, Pay utility bills. |
| EKO | India | Make P2P transfers, Bill and loan payments. |
| WIZZIT | South Africa | P2P transfers, Buy airtime, Check balances and statements, Pay electricity. |

The level of MMT evolution and uptake has varied by country bringing

about various regulatory issues. The next Section discusses some of the rele-

vant regulatory issues.

2.2.3 Regulatory Issues

As discussed in Section 2.2.2, the use of MMT in emerging economies is

growing rapidly, and this growth brings with it some regulatory issues. In

general, the regulatory space for mobile money services encompasses both

telecommunications regulators and central banks. The role of regulators in the mobile


money ecosystem is very critical for the long term survival of its ecosystem.

This underscores the need for partnership and collaboration between both

sectors in order to mitigate the risks for the consumer. The primary goal of

regulation is to enforce compliance to the various regulations so as to safeguard

the interests of the consumer, enhance trust in the payment system and ensure

that participants have effective means for identifying, measuring and managing

business risk [Int13b; Tob11]. For example, the Philippines regulates MNOs

that provide mobile money services; subscribers are required to register in

person with the service providers using valid photographic identification

before they can put cash into their mobile accounts or withdraw cash. The Philippines

also regulates how much money a subscriber can transfer at any one time, or during

a day or month. In some developing countries there are no clear regulatory

frameworks for mobile financial services. Kenya and Cambodia have

not issued specific regulations but nevertheless allow MNO-centric models

on an ad hoc basis through “no objection” letters, conditional approvals

or other means [Int13b].

According to the report in [Int13b], the regulation of mobile money services can

differ from country to country depending on the business model adopted.

For example, in a country where only banking institutions are allowed to han-

dle cash in the context of mobile money transfer, it will be difficult to outsource

that cash handling function to a service provider outside the financial insti-

tution. Also, the issue of whether or not a company offering mobile money

services should be regulated as a bank is another challenge. This is the case

in Pakistan unlike in Kenya. The emergence of mobile money remittances

has raised considerable concern about regulatory issues which are outside

the traditional financial institution regulations. Figure 2.3 shows a graphical

representation of some of the observed concerns.

Figure 2.3: Regulatory issues

2.3 Mobile Money Transfer Synthetic Data

In the absence of a real transaction dataset, for the reasons mentioned in

Section 1.2.1, the need arises to simulate synthetic transaction data,

as proposed by Phua et al. in [Phu+10]. Lundin et al. [LKJ02] defined

synthetic data as data that is generated by humans using simulated users

in a simulated system to perform simulated actions. Simulated actions can be

carried out by an agent or program that acts according to a specification created

by the experiment organizers, reflecting the desired behaviour of the system.

A synthetic data simulation can be implemented either by simulating user

behaviour using software automation or by hiring people to generate background

data and attacks. In addition to carrying out synthetic simulation, a choice

needs to be made between using a fully simulated system, a real system or a

mix of real and simulated system components [LKJ02].

In recent years, research areas such as data mining [Jes+05], artificial

intelligence [RN03] and process mining [Rie+13] have paid increasing attention

to developing data generation systems that systematically generate synthetic

data for numerous applications. This can be attributed to the difficulties in

obtaining or modifying properties of a real life data for evaluating various

algorithms [PZ06]. The availability of representative data such as synthetic

data in analysing the use of machine learning for fraud detection gives sev-

eral advantages compared to using authentic data [LKJ02]. Data properties of

synthetic data can be tailored to cover various patterns and attacks not

available in authentic data sets. However, a purely synthetic dataset has the

drawback that it may be oversimplified and consequently fail to reflect properly the

properties or patterns of a real-life data set [BKJ03].

In order to generate synthetic data that properly reflects the properties

of an authentic dataset, a methodology such as those in [LKJ02;

WHV08; Jes+05] can be used, taking the authentic data as the "seed" for simulating

the synthetic data. This is achieved by using statistical properties from small

amounts of authentic data to generate large amounts of synthetic data, thereby

preserving some important parameters in the authentic data such as users and

service behaviour [LKJ02]. Despite the increasing interest, the research on

synthetic data generation in the area of fraud detection is still in its early

stages. Based on an exhaustive survey of approaches, only three

synthetic log generators were found to exist in the field of fraud detection.
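The seeding idea described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that transaction amounts are roughly log-normal, so that a small authentic sample can be summarised by the mean and standard deviation of its log-amounts. The sample values and the function names `fit_seed` and `generate_synthetic` are hypothetical and are not taken from [LKJ02].

```python
import math
import random
from statistics import mean, stdev

def fit_seed(amounts):
    """Summarise a small authentic sample by the mean and standard
    deviation of its log-amounts (log-normal assumption)."""
    logs = [math.log(a) for a in amounts]
    return mean(logs), stdev(logs)

def generate_synthetic(params, n, rng):
    """Draw n synthetic amounts that preserve the seed statistics."""
    mu, sigma = params
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]

# Hypothetical "authentic" seed sample (illustrative values only).
seed_amounts = [12.0, 15.5, 9.8, 20.0, 14.2, 11.1, 17.3, 13.6]
params = fit_seed(seed_amounts)

# A small seed is expanded into a much larger synthetic sample.
synthetic = generate_synthetic(params, 10_000, random.Random(42))
```

Only the two seed statistics are preserved in this sketch; a realistic generator would also need to preserve user and service behaviour, as noted in [LKJ02].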

The authors in [BKJ03] applied their proposed data generation method in


[LKJ02] to generate synthetic test data for an IP-based Video-on-Demand service.

They aimed at testing the feasibility of generating and using synthetic data

for training and testing a fraud detection system. In the study, a small amount

of authentic log data was used to generate a large amount of synthetic log files

containing both normal and fraudulent user behaviour. In the simulation, they

were able to preserve the important statistical properties of the authentic data

by using authentic normal data and fraud as seed for generating the synthetic

data. Thus, they were able to create a realistic behaviour profile for both

normal users and attackers.

In [LrA12b], the authors modelled money-laundering cases in mobile money

systems. In their work, a Multi-agent based simulator (MABS) was developed

to simulate the behaviour of several clients interacting in a mobile money

environment. Their aim was to simulate several transactions corresponding to the

concurrent use of an account by a legitimate user and a fraudulent one, which results

in a shift of behaviour. The implementation of their model was limited to

random actions of users at random times.

Gaber et al. in [Gab+13] modelled mobile money transfer using the con-

cept of habit to create a meaningful pattern of normal user behaviour. The

implementation of the model was based on the assumption that legitimate

users' transactions are mostly related to their habits, i.e. legitimate users tend

to carry out a specific set of transactions frequently and repeatedly.
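The notion of habit can be sketched as follows. This is a simplified illustration and not the model of [Gab+13]: it assumes that a habit is a repeated transfer to one recipient with a typical amount and a typical period, both perturbed by Gaussian noise. The `Habit` fields, the two example habits and the function `simulate_habit` are hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass
class Habit:
    recipient: str
    mean_amount: float
    amount_sd: float
    period_days: float   # typical gap between two repetitions
    period_sd: float

def simulate_habit(habit, start_day, end_day, rng):
    """Generate time-stamped (day, recipient, amount) tuples for one habit."""
    t = start_day
    transactions = []
    while True:
        # The next repetition happens roughly one period later.
        t += max(0.1, rng.gauss(habit.period_days, habit.period_sd))
        if t > end_day:
            break
        amount = max(0.01, rng.gauss(habit.mean_amount, habit.amount_sd))
        transactions.append((round(t, 2), habit.recipient, round(amount, 2)))
    return transactions

# Hypothetical habits of one legitimate user.
rent = Habit("landlord", 300.0, 5.0, 30.0, 1.0)
airtime = Habit("airtime_vendor", 5.0, 1.0, 7.0, 1.5)

rng = random.Random(1)
habit_log = sorted(simulate_habit(rent, 0, 365, rng)
                   + simulate_habit(airtime, 0, 365, rng))
```

Each habit yields a regular, repeated pattern, which is the kind of structure a detection algorithm can later learn as normal behaviour.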

Other methods are used in generating synthetic data but do not generate

data related to frauds and attacks. Some focus on testing a specific detection

prototype and comparing the performance of various detection systems. Some

studies used manipulated authentic data sets while in others there is no detailed

description of how these datasets were generated.


In this work, to address the lack of a mobile money transaction

dataset, the methodology in [Gab+13] was adopted and implemented using

the Multi-agent based simulator in [LrA12b]. To trace the sequence of events

in the simulated dataset, a time-stamp parameter was included in the simulator

rather than the steps or categories of users over a period of time that were applied in

[Gab+13] and [LrA12b] respectively, as further discussed in Chapter 4. The next

Section provides a justification for the use of simulated dataset in evaluating

the applicability of predictive models.

2.3.1 Why Synthetic Data?

In information discovery and analysis, there is a need to identify events that

could occur in the future. Developing test cases for information discovery,

analysis and prediction tools requires background data into which hypothet-

ical future scenarios can be overlaid. Obtaining real life data sets on mobile

money can be difficult due to privacy issues, time and cost associated with

collecting multiple instances of a diverse set of data sources [Jes+05]. Addi-

tionally, when real life data are available, they may be of poor quality, small

in quantity, or lacking in fraud cases and in different scenarios to explore.

Although using real data in supervised learning algorithms is often preferred, the

use of simulated data becomes a necessity when real data are inaccessible.

Simulated sets are designed to reflect real data behaviour as closely as pos-

sible [LKJ02]. Furthermore, it is highly important to be able to model how

the emerging threat landscape will affect existing countermeasures. For example,

information from the simulation of potential future scenarios can make it possible

to plan better for what might happen next by initiating further precautions,

if deemed necessary. Under all these circumstances, synthetic data becomes an


alternative solution in making the data set available [Gor15; Phu+10]. When

using synthetic data, several benefits can be identified including [Gab+13]:

• the possibility to generate as much data and as many different scenarios

as needed, which could have taken months or years to collect from a real

system,

• the control by researcher over parameters of the generated data which

can help to address the issues of class imbalance,

• the possibility to create data with specific properties to test characteristic

features of the algorithm,

• the absence of privacy or disclosure issues which hinder research in fraud

detection,

• the possibility to test and choose detection algorithms for systems which

are yet to be deployed.

There is no guarantee that synthetic data are fully realistic or representative of

real situations that an anomaly detection system can effectively use to identify

both existing attacks and newly evolving ones. However, to offset

some of these disadvantages, it is important to build a simulation with

characteristics that are as close as possible to real-world situations [Gab+13].

The next Section discusses the methodology identified from the literature that

has the level of depth required to generate synthetic data for this research work.

2.3.2 Synthetic Data Generation Methodology

Generating synthetic data can be a complex process that

involves the use of autonomous and interactive agents. The main components


needed in the automation process are specifications of desired user behaviour

in the system, a user, an attacker and a system simulator. The data generation

process begins with collection of information about the anticipated behaviour

in the target system such as background data from similar systems and possible

attacks. This data serves as basis for modelling the user and system behaviour.

The various components involved in the data generation are shown in

Figure 2.4, and the aim of this methodology is to guide the production of these

components [LKJ02].

Figure 2.4: Synthetic log data generation method [BKJ03]

As illustrated in Figure 2.4, the steps required to generate synthetic

data are as follows [BKJ03; LKJ02]:


In Step 1, the process begins with the collection of data that should be

representative of the anticipated behaviour. The data may consist of authentic

background data from the target system, background data from similar

systems, authentic attacks and possible attacks. In most situations, known

attacks are not available; to circumvent this difficulty, there is a

need to invent possible attack scenarios or adapt known frauds from other types

of services to the situation. Step 2 involves the analysis of the data collected

and the identification of important properties such as user classes, attack

characteristics, statistics of usage and system behaviour. The aim of this step is to

get a picture of how the system is used and how user actions and attacks show

up in the log data. In Step 3, the information from the previous steps is

used to identify important parameters that can be used to detect the anticipated

attacks. One way to identify these parameters is to study the features needed

to detect the expected fraud. The output from this step is a set of files for different

user classes containing values for all parameters that are required for the user

simulation. In Step 4, user and attack models are created. These models

must be sophisticated enough to preserve the selected profile parameters. The

output from this step is user classes containing values for all parameters that

are required for the user simulations. Finally, the system is modelled in Step

5. This model must be accurate enough to produce log data similar to that

of the target system, where the similarity can be restricted to those features

meant for fraud detection. The system simulator is implemented according to

this model and the output from this step is a list of user actions in a format

that is suitable to use as input to the system simulator.
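As an illustration of Steps 2 and 3, the sketch below derives one profile per user class from a small log. It is a simplified example under stated assumptions: the log format `(user_id, user_class, amount)`, the class names and the chosen profile parameters are hypothetical, and the methodology in [LKJ02] does not prescribe them.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical log records: (user_id, user_class, amount).
sample_log = [
    ("u1", "merchant", 120.0), ("u1", "merchant", 135.0),
    ("u2", "merchant", 110.0), ("u3", "individual", 12.0),
    ("u3", "individual", 8.0), ("u4", "individual", 10.0),
]

def build_profiles(records):
    """Compute usage statistics for each user class found in the log."""
    amounts = defaultdict(list)
    users = defaultdict(set)
    for user_id, user_class, amount in records:
        amounts[user_class].append(amount)
        users[user_class].add(user_id)
    return {
        cls: {
            "n_users": len(users[cls]),
            "tx_per_user": len(vals) / len(users[cls]),
            "mean_amount": mean(vals),
            "sd_amount": pstdev(vals),
        }
        for cls, vals in amounts.items()
    }

profiles = build_profiles(sample_log)
```

The resulting dictionary plays the role of the per-class parameter files: it holds the values a user simulator would need in order to reproduce each class.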

The essence of dividing the whole process into steps with well defined in-

terfaces is to reduce the complexity of the tasks in such a way that different


groups of people can work on different tasks. This makes it possible to

use people instead of a user simulator to create user actions. In addition, one

can use the whole, or parts of the real system instead of a system simulator.

This may be preferable in some situations especially when the system or user

behaviour is very complex and needs to be modelled in great detail [BKJ03;

LKJ02]. The next Section discusses how synthetic log data are created using

the methodology above.

2.3.3 Creation of Synthetic Log Data

During the data generation process, the interactions between different compo-

nents are as shown in Figure 2.5 :

Figure 2.5: The synthetic log data generation process [LKJ02]

The user profiles are used as input to the user and attacker simulator. The

configuration data for the user and attacker simulator contains information

that controls the generation of normal user and attack actions. For example,

it may be the start time and stop time for simulation, the number of normal

users from different user classes, the number of attacks of each type, and the

start and stop time for different attacks. The user and attacker simulator


generates user actions that are fed to the system simulator. The configuration

data for the system simulator contains for example information about the

number of clients that should be simulated, statistics about reply times and

traffic amounts for different events as well as behaviour in the presence of

certain attack events. The configuration data for the system simulator also

contains the parameters that will be tuned to vary the different simulation runs.

Lastly, in live data injection processes, aggregated data from a real financial

transaction dataset can be injected as seeds into the simulation. This

allows the simulation of more realistic transaction data (due to the absence of a

real transaction dataset, this process was not implemented in this study). With

these batches, synthetic data can be generated for different purposes without

reprogramming the simulators [LKJ02]. The next Section refers to the relevant

and adopted tools used for simulating synthetic data in this research.
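The role of the configuration data for the user and attacker simulator can be sketched as follows. This is an assumed, simplified layout: the field names, the time unit (minutes), the user classes and the attack type are all hypothetical and do not reproduce the configuration format of [LKJ02].

```python
import random
from dataclasses import dataclass, field

@dataclass
class UserSimConfig:
    start_time: int = 0        # start of the simulation (minutes)
    stop_time: int = 1440      # end of the simulation (minutes)
    users_per_class: dict = field(
        default_factory=lambda: {"individual": 5, "merchant": 2})
    attacks_per_type: dict = field(
        default_factory=lambda: {"account_takeover": 1})
    attack_window: tuple = (600, 900)   # start and stop time for attacks
    seed: int = 7

def generate_actions(cfg):
    """Produce a time-ordered list of normal and attack actions that could
    be fed to a system simulator."""
    rng = random.Random(cfg.seed)
    actions = []
    for cls, n in cfg.users_per_class.items():
        for i in range(n):
            t = rng.randint(cfg.start_time, cfg.stop_time)
            actions.append({"time": t, "actor": f"{cls}_{i}", "kind": "normal"})
    for attack, n in cfg.attacks_per_type.items():
        for i in range(n):
            t = rng.randint(*cfg.attack_window)
            actions.append({"time": t, "actor": f"attacker_{i}", "kind": attack})
    return sorted(actions, key=lambda a: a["time"])

actions = generate_actions(UserSimConfig())
```

A different scenario is obtained by changing only the configuration, for example `UserSimConfig(users_per_class={"individual": 50}, attacks_per_type={})`, which reflects the point that data can be generated for different purposes without reprogramming the simulators.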

2.3.4 Synthetic Data Simulation Using MABS

Simulation of synthetic data using Multi-agent based simulation (MABS) may

be implemented by programming with computer languages such as C, C++,

Java, etc., depending on their computational efficiency for the proposed model

[HB12]. There are also user-friendly event simulation libraries that can

be used to develop an agent-based model, such as Repast [Rep], Swarm [Swa],

NetLogo [Net] and MASON [MAS]. A detailed comparison of their features is

provided by [All09] and [RLJ06].

For the needs of synthetic data creation, MASON is adopted as the sim-

ulation engine. MASON is a fast and easily extendable multi-agent based

simulation software designed in Java. It supports discrete event interaction

between many agents in swarm [Luk+04]. According to Luke et al. [Luk+04],


MASON has several advantages: (i) it carefully delineates between

model and visualisation, which allows models to be dynamically detached from

or attached to visualisers and enables cross-platform migration; (ii) models

can be visualised and manipulated in both 2D and 3D (using Java3D)

to produce screenshots and movies; (iii) the model layer comes with a small

collection of classes consisting of a discrete-event schedule, a high-quality

random number generator, and a variety of fields which hold objects and associate

them with locations. These factors influenced the choice to use MASON as a

Java library for the data simulation in this research. The next Section refers

to relevant approaches for evaluating simulation dataset.
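To make the idea of a discrete-event schedule concrete, the following sketch steps agents in time order using a priority queue. It is written in Python for brevity and does not use or reproduce the MASON API; the classes `Schedule` and `Client`, and the exponential waiting time between transfers, are assumptions made for illustration.

```python
import heapq
import itertools
import random

class Schedule:
    """Minimal discrete-event schedule: agents are stepped in time order."""
    def __init__(self):
        self.time = 0.0
        self._queue = []
        self._counter = itertools.count()   # tie-breaker for equal times

    def schedule(self, when, agent):
        heapq.heappush(self._queue, (when, next(self._counter), agent))

    def run(self, until):
        # Process events until the next one lies beyond the time limit.
        while self._queue and self._queue[0][0] <= until:
            self.time, _, agent = heapq.heappop(self._queue)
            agent.step(self)

class Client:
    """Agent that records a transfer and reschedules itself."""
    def __init__(self, name, rng, log):
        self.name, self.rng, self.log = name, rng, log

    def step(self, schedule):
        self.log.append((schedule.time, self.name, "transfer"))
        schedule.schedule(schedule.time + self.rng.expovariate(1.0), self)

rng = random.Random(3)
event_log = []
sim = Schedule()
for name in ("alice", "bob"):
    sim.schedule(0.0, Client(name, rng, event_log))
sim.run(until=10.0)
```

The model (agents and schedule) is independent of any visualisation here, which mirrors, in a very reduced form, the separation of model and visualisation described above.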

2.3.5 Evaluation of Simulation Data

Simulation models are abstract representations of real state systems and they

can help to increase the ability to control, forecast or understand the behaviour

of the actual system [DA14]. Simulation models can also be applied to decision

making and solving complex problems, but there is a concern over whether

the outputs of a model are correct, and it can be very difficult

to ascertain whether the behaviour being observed truly

represents that system [HD92]. As a result, researchers have come up with

numerous approaches and techniques for validating and verifying the accuracy

of a model simulation.

The verification process is generally defined as the process of testing whether

or not the logic of the model is acceptable [CCB08; DA14], while the validation

process assesses the extent to which the model adequately represents the system

being modelled [DA14]. According to [DA14], the validation approach can be

classified into two general categories: (1) quantitative methods, also called sta-

tistical methods that use statistical approaches to evaluate the credibility of the

simulation model, such as the t-test, Johnson’s modified t statistic, distribution-free

statistics and sensitivity analysis; (2) subjective methods that are based

on the judgement of experts, e.g. black-box testing, internal validity, the Turing test

and face validation.

A chi-square test was used in [Gab+13]. The authors used a chi-square test

with a significance level of 5% to check whether the amounts and periods of

the simulated data are normally distributed. It was shown that all the selected

regular users had normally distributed amounts and periods. In [LrA12b], the

authors verified their simulated data by checking constraints such as positive

balance numbers, account age, consistency between the transfers, deposits and

withdrawals with the changes in account balances. The authors also validated

the simulated data; however, since there was no real-world data as input to

the simulator, the validation relied on the description of the desired scenario

and opinions of experts in the field to show that the basic statistics and overall

process of the simulation design correspond to a real world scenario. The next

Section concludes the chapter by providing a summary.
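A chi-square normality check of the kind described above for [Gab+13] can be sketched as follows. This is a minimal illustration, not the procedure of [Gab+13]: it assumes ten equiprobable bins under a normal distribution fitted to the sample, uses the common choice of (bins - 1 - 2) degrees of freedom because two parameters are estimated, and relies on tabulated 5% critical values. The function name and the two test samples are hypothetical.

```python
from statistics import NormalDist, mean, stdev

# Upper 5% critical values of the chi-square distribution,
# indexed by degrees of freedom (standard table values).
CHI2_CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
                      6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919, 10: 18.307}

def chi_square_normality(samples, bins=10):
    """Chi-square goodness-of-fit test against a fitted normal distribution.
    Returns (statistic, degrees of freedom, True if normality is not rejected)."""
    fitted = NormalDist(mean(samples), stdev(samples))
    # Bin edges chosen so that each bin has the same expected count.
    edges = [fitted.inv_cdf(i / bins) for i in range(1, bins)]
    observed = [0] * bins
    for x in samples:
        observed[sum(1 for e in edges if x > e)] += 1
    expected = len(samples) / bins
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    dof = bins - 1 - 2   # two parameters (mean, sd) estimated from the data
    return statistic, dof, statistic <= CHI2_CRITICAL_5PCT[dof]

n = 1000
# Deterministic sample that follows a normal distribution closely.
normal_like = [NormalDist(100, 15).inv_cdf((i + 0.5) / n) for i in range(n)]
# Deterministic, strongly right-skewed sample (squared normal quantiles).
skewed = [NormalDist().inv_cdf((i + 0.5) / n) ** 2 for i in range(n)]

result_normal = chi_square_normality(normal_like)
result_skewed = chi_square_normality(skewed)
```

In this sketch the third element of each result indicates whether the hypothesis of normality is retained at the 5% level; the same check could be applied separately to the amounts and to the periods of each simulated user.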

2.4 Chapter Summary

In this chapter, a number of issues concerning fraud detection in mobile money

transfer have been discussed. Firstly, the background to mobile money transfer, its

ecosystem and the regulatory issues fraudsters could exploit was discussed.

This was followed by a discussion of the issues surrounding the

availability of financial data sets and of the methodology for generating a mobile

money payment transaction dataset. The investigation found that only

three previous works in the fraud detection domain provided the level of

depth that is needed to carry out this research work. From their work, the pro-

posed methodology in [LKJ02] and Multi-agent based simulator in [LrA12b]

were adapted in this thesis to simulate mobile transfer transaction data as

discussed later in Section 4.3.

The methodology in [LKJ02] was used because it has a

well-defined interface, which makes it easy to use, i.e. the whole process is divided

into steps. This provides the possibility of using the whole or part of the system

for data simulation. The choice of Multi-agent based simulator (MASON) was

based on the fact that MASON is fast and it supports discrete event interaction

between many agents in swarm (i.e. it facilitates the implementation of social

networks).

Furthermore, the existing approaches in the literature for

evaluating simulated datasets were also discussed. From this discussion, statistical

approaches were identified as the most recognised approach for evaluating the

credibility of a simulated dataset. Therefore, to evaluate the simulated dataset

in this thesis, a chi-square test is employed. The chi-square test was

chosen because it is easy to compute and robust with respect to the

distribution of the data [Mch13]. Finally, background on the topics related to

the research questions in this thesis and the research gaps that this thesis has

addressed was discussed.

Chapter 3

A Review of Fraud Detection

Techniques

In the previous chapter, the mobile money transfer business model and

ecosystem were discussed. The challenges faced in obtaining a real mobile money

transaction dataset were also reviewed. In the light of the absence of a real MMT

transaction dataset, the need for a simulated dataset was identified, and the

methodology for simulating a synthetic mobile transfer dataset and for

evaluating such datasets was analysed. The decision to use a multi-agent based

simulation (MASON) was taken. This was done because of its speed and ease

of use for simulating the mobile money transfer dataset for the purpose of

building a predictive model that is able to learn hidden patterns in the trans-

action data. This chapter presents a review of common approaches used in the

literature for dealing with financial transaction service fraud detection.

Section 3.1 discusses the algorithmic solutions for financial service fraud

detection. The discussion is further divided into supervised and

unsupervised approaches. These two categories may be further grouped into four

groups based on how they are evaluated, as shown in Figure 3.1. Section

3.2 reviews the existing approaches for handling unbalanced data for predictive

algorithms. The rationale for the approach adopted in this thesis is also

discussed. Section 3.3 elaborates on evaluation techniques, i.e. performance metrics

for measuring the effectiveness of a fraud detection system. Section 3.7 concludes the

chapter with a summary.

Figure 3.1: Algorithmic solutions for fraud detection systems

Building on the generic categorisation in the literature of algorithmic solutions, i.e. machine

learning techniques, into supervised and unsupervised approaches,

Figure 3.1 above presents a flow chart, slightly modified, showing a

grouping based on their application in the financial fraud detection domain. The

discussion of these categories and groups follows.

3.1 ML Algorithms for Fraud Detection

Machine learning for fraud detection is an area of research where

ML algorithms are used to recognise patterns in data in order to discern fraud-

sters from legitimate clients. This is done based on thousands of pieces of

information that sometimes may seem completely unrelated to human beings


[Ale17]. Machine learning approaches have shown promising results in pre-

dicting financial transaction fraud [Zhd+14]. In the literature, these promis-

ing results were achieved using both supervised [Aze+14; KSM07; LrA12a]

and unsupervised [BH01; FM06; Wes+08] ML algorithms. The supervised

method is a system that attempts to learn by example using a teacher. Here,

a predictive model is trained under the supervision of labelled data, i.e.

transactions labelled as genuine or fraudulent, to discover patterns associated with either class,

while unsupervised learning methods work with unlabelled data samples

[AMZ16]. The unsupervised approach is commonly used in outlier or anomaly detection

techniques, where fraudulent behaviour is associated with any transaction that does

not conform to the majority class [BH01].

3.1.1 Supervised Approaches

In supervised learning, labelled samples are used to train a

learner in order to predict the class of a new observation, i.e. the output

variable that defines the class of an observation is assumed to depend on the input

variables [AMZ16]. Based on their evaluation approach, supervised learning

can be categorised into [Poz15]: i) supervised profiling, ii) classification, iii)

cost-sensitive and iv) network methods. Supervised learning techniques have

been widely used in the literature for detecting financial transaction fraud.

This is particularly the case when the dataset available for evaluating these

algorithmic solutions has already been annotated by fraud analysts. However, they

come with the challenge of overfitting due to the unbalanced nature of financial transaction

datasets, i.e. the skewed class distribution of fraud to non-fraud data.

Several methods have been applied in the literature to deal with this

problem, such as data sampling and cost-sensitive learning, as further discussed in Section

3.2. The discussion of the different categories of supervised learning techniques

follows.

Supervised Profiling

Supervised profiling works by observing the distribution of relevant variables

for both genuine and fraudulent accounts in an already labelled transaction

data sample [Sud+10]. As a result, different profiles are created for each class

such that a new incoming transaction is compared with the profiles to see

which one it resembles more closely [Sud+10]. For example, Xu [XSL07] proposes

an adaptive user profiling method that uses an association rule set to mine users'

behaviour for credit card fraud detection.

Rule-based profiles are a popular approach in supervised profiling. These

rules can be defined by human experts or learned from data with a rule

discovery algorithm, e.g. "any individual exceeding one million dollars worth of

transactions in a single day shall be considered suspicious". The rule-based

approach is easy to understand and implement [Sud+10]. However, as

criminal activities and legitimate user behaviour evolve, fraudulent profiles

have to be updated as well. As an alternative, Wang et al. [Wan+03]

propose a weighted ensemble approach that can be used to include new rules

while maintaining old rules. Profiles must therefore be updated to reflect

the dynamic patterns of criminal activity as well as changes in legitimate user

behaviour. This presents a challenge for static rule-based methods

that are learned off-line, as they must be frequently validated and retrained

[Poz15; Sud+10].
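The example rule quoted above can be written as a short sketch. The threshold comes from the rule in the text; the transaction format `(user, day, amount)`, the sample transactions and the function name `flag_suspicious` are hypothetical.

```python
from collections import defaultdict

DAILY_LIMIT = 1_000_000   # threshold from the example rule in the text

def flag_suspicious(transactions, limit=DAILY_LIMIT):
    """Return the (user, day) pairs whose total daily amount exceeds the limit."""
    totals = defaultdict(float)
    for user, day, amount in transactions:
        totals[(user, day)] += amount
    return {key for key, total in totals.items() if total > limit}

# Hypothetical transactions: (user, day, amount in dollars).
transactions = [
    ("alice", "2018-01-01", 600_000),
    ("alice", "2018-01-01", 450_000),
    ("bob", "2018-01-01", 999_999),
    ("alice", "2018-01-02", 700_000),
]
suspicious = flag_suspicious(transactions)
```

Because the threshold is fixed, such a rule is easy to read but static: it has to be revised by hand whenever fraudulent or legitimate behaviour changes, which is the limitation discussed above.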

Supervised profiling is commonly used in the fraud detection domain when

labelled transaction data are available. This makes it possible to profile or

construct distributions of relevant variables for both genuine and fraudulent

transactions [Sud+10]. For instance, in supervised profiling one profile of

expected genuine behaviour is maintained per customer and one profile of

fraudulent behaviour is maintained per type of fraud. Incoming new transactions are

then compared with the customer’s profile of genuine behaviour and with the

different profiles of fraud. Any deviation from expected behaviour or similarity

to known patterns of fraud may be a sign of criminal activity. This approach

is commonly used in telecommunications fraud detection research [Hol00].
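The comparison of an incoming transaction with a genuine profile and with fraud profiles can be sketched as follows. This is a deliberately simple illustration: each profile is assumed to consist only of the mean and standard deviation of transaction amounts, and similarity is measured by the absolute z-score. The profile contents, the fraud type name and the functions are hypothetical.

```python
from statistics import mean, stdev

def build_profile(amounts):
    """Profile of one behaviour: mean and standard deviation of amounts."""
    return {"mean": mean(amounts), "sd": stdev(amounts)}

def distance(amount, profile):
    """Absolute z-score of the amount with respect to a profile."""
    return abs(amount - profile["mean"]) / profile["sd"]

def classify(amount, genuine_profile, fraud_profiles):
    """Label a transaction by the closest profile."""
    d_genuine = distance(amount, genuine_profile)
    fraud_type, d_fraud = min(
        ((name, distance(amount, p)) for name, p in fraud_profiles.items()),
        key=lambda item: item[1],
    )
    return "fraud:" + fraud_type if d_fraud < d_genuine else "genuine"

# One genuine profile for a customer, one profile per type of fraud.
genuine = build_profile([20, 25, 22, 18, 24, 21])
fraud = {"cash_out_burst": build_profile([480, 500, 520, 510, 495])}

label_small = classify(23, genuine, fraud)
label_large = classify(505, genuine, fraud)
```

A practical system would profile several variables at once (amount, frequency, recipient and so on); a single variable is used here only to keep the comparison visible.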

Classification methods

Similar to the supervised profiling approach discussed above, classification

methods are commonly used when labelled transaction data are available.

Traditionally, they are implemented as linear and non-linear models and are

often seen as the standard way of solving the financial transaction fraud problem

[Nga+11]. Over the years, several classification algorithms have been used

in the literature for fraud detection, e.g. neural networks [CLL05; Moh+09;

Rav+11; ZS06], support vector machines [DD13], probabilistic graphical models

[Lui+15], and Decision Trees [LrA12a].

In recent years, decision trees have gained prominence in fraud detection

and the credit risk scoring field [Zhd+14]. The authors in [Zhd+14] applied part

decision table, C4.5 and random forest algorithms to detect fraud chains in

mobile money transfer. The results from the experiment show that the C4.5 decision

tree had better precision, but the recall for all three algorithms

was quite low. Fabian et al. in [Fab+17] incorporated pattern information in

the form of new attributes into random forest and logistic regression models.

The new attributes led to a significant performance improvement for both

models compared to state-of-the-art aggregated transaction features. However,

the random forest had slightly better performance.
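The difference between precision and recall mentioned above can be made concrete with a small worked example. The labels below are hypothetical (1 = fraud, 0 = genuine) and are not results from [Zhd+14]; they only show how a classifier can have good precision while its recall remains low.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for the positive (fraud) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical data: 10 frauds among 100 transactions.
y_true = [1] * 10 + [0] * 90
# The classifier catches 3 frauds, misses 7 and raises 1 false alarm.
y_pred = [1] * 3 + [0] * 7 + [1] * 1 + [0] * 89

precision, recall = precision_recall(y_true, y_pred)
```

Here three of the four alerts are correct (precision 0.75), yet only three of the ten frauds are found (recall 0.3).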

Artificial neural networks consist of highly interconnected mathematical

processing elements, called neurons, that work together to solve problems in a

way similar to the human brain [Hay99]. However, they are black-box

models, i.e. difficult to understand, and typically require a significant amount of

historical data [Sha+16]. A decision tree offers an alternative that addresses

the neural network limitations described above. It uses an approach that

is based on the extraction of conjunctions and disjunctions of rules that

represent the classification choices, and these rules are easy to understand

[Poz15].

In this thesis, four baseline classifiers were selected for the experiments because

they are commonly used in the literature for fraud detection problems. They

are logistic regression (LR), random forest (RF), support vector machine

(SVM) and artificial neural networks (ANN).

LR is a generalized linear model that is easy to use and one of the most commonly

used techniques for data mining in practice, but it is vulnerable to overconfidence

[Bha+11]. The RF classifier can capture non-linear relationships in data and shows

high scalability with a better visual representation of results, but it is liable

to over-fit. SVM has a regularization parameter that is used to prevent

over-fitting [Sud+10]. ANNs can show higher accuracy in prediction, but they are

much more computationally expensive and their predicted results are hard to

interpret. Table 3.1¹ shows some approaches used in detecting

financial transaction fraud.

¹ The descriptions of most of these machine learning approaches are standard and can be found in a textbook on data mining, for example [HTF09].
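As a reminder of how the simplest of these baselines works, the sketch below fits a one-feature logistic regression by batch gradient descent on the log-loss. It is a toy illustration and not the experimental setup of this thesis: the data, the learning rate and the number of epochs are assumptions chosen only to keep the example small.

```python
import math

def sigmoid(z):
    """Logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(fraud | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical feature (e.g. a scaled transaction amount) and labels
# (1 = fraud, 0 = genuine); the classes overlap slightly.
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = train_logistic(xs, ys)
p_low = sigmoid(w * 0.5 + b)    # estimated fraud probability for a small value
p_high = sigmoid(w * 4.5 + b)   # estimated fraud probability for a large value
```

The model outputs a probability for the fraud class, which is then thresholded (commonly at 0.5) to obtain a class label.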


Table 3.1: List of some related works on fraud detection

Abbr. | Full Name | Description | References

LR | Logistic Regression | Models the probability of occurrence of one (success) of the two classes of dichotomous. | [Bhattacharyya2011, Lin2014a, Ngai2011, Ravisankar2011, Pallavi2015]

DT | Decision Tree | A tree structure, where each node represents a test on an attribute and each branch represents an outcome of the test. | [Bahnsen2015, Geng2015, Sahin2013, Soltaniziba2015, Tsang2014]

RF | Random Forest | Operates as a decision tree operator but creates an ensemble of random trees. | [Albashrawi2016a, Liu2015, Nolan2017, Qi-Feng2015, Xuan2018, Yaram2016]

SVM | Support Vector Machine | It uses a linear model to implement nonlinear class boundaries by mapping input vectors nonlinearly into a high-dimensional feature space. | [Akbani2004, Chyan-long2018, Francis2011, Moepya2014, Subudhi2015]

NB | Naive Bayes | A probabilistic classifier based on applying Bayes Theorem. | [Bhowmik2011, Gupta2017, Panigrahi2009, Saravanan2014]

ANN | Artificial Neural Network | An adaptive algorithm that works in a way in which the human brain performs. | [Azeem2014, Bekirev2015, Dong2014, Olszewski2014, Ogwueleka2011, Roselina2015, Ravisankar2011, Yuan2017]

k-NN | Nearest Neighbour | An instance based learning algorithm that operates by choosing k nearest instances. | [Arianto2017, Chang2012, Fahmi2016, Ganji2012, Malini2017]

The discussion of some of these related works based on their performance

evaluation using different performance metrics follows in Table 3.2.


Table 3.2: Discussion of some of the related works' performance

Abbr. | References & Summary

LR | Bhattacharyya et al. [Bha+11] examined the performance of logistic regression (LR), random forests (RF) and support vector machines (SVM) for credit card fraud detection using different levels of under-sampling. The RF classifier demonstrated overall better performance across the performance measures used. A factor contributing to the performance of LR is possibly the carefully derived attributes used. Lin et al. [Lin+14] applied LR with novel feature selection methods to financial distress prediction. The empirical results indicate that the LR model based on the novel feature selection outperformed the model with traditional feature selection in prediction accuracy. Ravisankar et al. [Rav+11] used data mining techniques such as LR, SVM, Genetic Programming (GP) and Probabilistic Neural Network (PNN) for financial statement fraud, with and without feature selection. The PNN outperformed all techniques without feature selection, and GP and PNN outperformed the others with feature selection, with marginally equal accuracies.

DT | Bahnsen et al. [BAO15] proposed an example-dependent cost-sensitive decision tree algorithm, by incorporating the different example-dependent costs into a new cost-based impurity measure and a new cost-based pruning criterion. This was evaluated using credit card fraud detection, credit scoring and direct marketing datasets. The results show that the proposed algorithm is the best performing method for all databases. Geng et al. [GBC15] predicted financial distress in the Shanghai Stock Exchange. The decision tree classifier performed worse than neural networks, support vector machines, and an ensemble of multiple classifiers combined using majority voting. As a contribution, it was discovered that financial indicators, such as net profit margin of total assets, return on total assets, earnings per share, and cash flow per share, play an important role in the prediction of deterioration in profitability. Sahin et al. [SBD13] proposed a new cost-sensitive decision tree approach that minimizes the sum of misclassification costs while selecting the splitting attribute at each non-terminal node, and the performance of this approach was compared with well-known traditional classification models on a real-world credit card dataset. The results show that the cost-sensitive decision tree algorithm outperforms the existing well-known methods on the given problem set with respect to well-known performance metrics such as accuracy and true positive rate.

RF | Nolan [Nol17] proposed a combination of unsupervised and supervised learning using both logistic regression and random forest classifiers to generate a fraud score. The fraud score was used on the platform to flag transactions and mark them for manual verification. A promising result was achieved. Yaram [Yar16] used a set of classification algorithms (Decision Tree, Random Forest and Naive Bayes) for document clustering and fraud detection. The result analysis reveals that the Decision Tree and Random Forest algorithms perform better than the Naive Bayes algorithm. Qi-Feng et al. [QF+15] proposed a framework that uses ensemble learning to detect novelty based on Random Forest (RF). The proposed approach was compared against two common approaches, support vector domain description (SVDD) and Gaussian Mixed Model (GMM), on one artificial dataset and five benchmark datasets. The experimental results show that the proposed method achieved better performance in terms of accuracy and recall.

SVM | Subudhi and Panigrahi [SP15] used a Support Vector Machine (SVM) classifier for fraud detection in mobile telecommunication networks. The experiment shows promising results in terms of detecting fraudulent calls without raising too many false alarms. Moepya et al. [MNV14] developed a support vector machine (SVM) model to detect financial statement fraud, using published South African financial data for the evaluation. The SVM model was compared to the k-Nearest Neighbour (kNN) method and logistic regression (LR). The SVM model provided an increased classification accuracy. Chyan-long [Cl18] applied SVM to detect enterprises' financial statement fraud for the Taiwan Stock Exchange. A promising prediction accuracy was achieved.

NB | Gupta et al. [GKB17] used naive Bayes for detecting credit card fraud using time-stamp and IP address features. The empirical result from the experiment shows that the proposed approach works with more efficiency than existing models. Saravanan et al. [Sar+14] applied naive Bayesian classification to calculate the probability, and an adapted version of KL-divergence to identify fraudulent customers on the basis of subscription in the telecommunication sector. The results from the experiment show a reduced false positive rate. Panigrahi et al. [Pan+09] proposed a novel approach for credit card fraud detection using a fusion of Dempster-Shafer theory and Bayesian learning. The empirical result shows that the fusion of different evidences has a very high positive impact on the performance of a credit card fraud detection system as compared to other methods.

ANN | Yuan et al. [Yua+17] proposed a novel framework that combines deep neural networks and spectral graph analysis for fraud detection. In particular, they use the node projection (called the spectral coordinate) in the low-dimensional spectral space of the graph's adjacency matrix as input to deep neural networks. The experimental results show that the spectrum-based deep neural networks are effective in fraud detection. Olszewski et al. [Ols14] proposed a fraud detection method based on user account visualization and threshold-type detection. They proposed a visualisation and threshold-type detection method using the SOM U-Matrix. The experimental result confirmed the effectiveness of the proposed approach. Azeem et al. [Aze+14] used an evolutionary simulated annealing algorithm to train neural networks for credit card fraud detection in a real-time scenario. The experimental result shows that a better result is achieved with ANN when trained with the simulated annealing algorithm.

k-NN | Malini and Pushpa [MP17] compared k-NN and outlier detection with other common machine learning techniques for credit card fraud detection. They came to the conclusion that both techniques minimise the false positive rate and increase the fraud detection rate when used in monitoring credit card transactions. Arianto et al. [AAN17] used k-NN for detecting opinion anomalies in public sector financial statements. They propose the use of original features from the public sector rather than modified features from the private sector for the detection. The result shows that the original features from the public sector outperformed those of the private sector. Chang [CC12] proposed a new early online auction fraud detection method that considers accuracy and timeliness simultaneously using k-NN. To determine the most appropriate attributes that distinguish between normal traders and fraudsters, a modified wrapper procedure is developed to select a subset of attributes from a large candidate attribute pool. Using these attributes, the experimental result shows an increase in the accuracy of the fraud detection system.


Cost-sensitive methods

Unlike the aforementioned methods, cost-sensitive learning in fraud detection systems helps in making cost-benefit-wise optimal decisions; that is, it helps to estimate costs such as misclassification cost and test cost. A cost-sensitive learning method is therefore a type of learning in data mining that takes into consideration the cost of misclassifying a transaction as fraudulent or legitimate, and possibly other types of cost [KBC16]. Cost-sensitive learning methods such as the MetaCost procedure deal with class-imbalance problems by incurring different costs for different classes [KBC16; LS10b]. In the literature, diverse learning algorithms have been used to minimise the total of misclassification costs, test costs and other types of costs, such as Decision Tree [SBD13], AdaCost [FSC99], AdaBoost [SS99] and SMOTEBoost [Cha+03].

Cost-sensitive learning methods such as AdaCost assume that costs are fixed and class-dependent [FSC99]. In a fraud detection system, the amount lost or saved is proportional to the transaction amount: the larger the amount, the greater the potential loss. Similarly, the cost of missing a fraud (i.e. a false negative) is not fixed but proportional to the transaction amount [FSC99]. For example, in [Bah+13; BAO15; KBC16; SBD13] a cost-sensitive classifier that depends on the transaction cost was applied to fraud detection. Bahnsen et al. [Bah+13] used a Bayes minimum risk classifier for cost-sensitive credit card fraud detection. Their experimental results show that the proposed cost-sensitive method significantly decreased the cost due to fraud compared with other techniques used in the literature. Similarly, Kim et al. [KBC16] saved cost on financial misstatements using a multi-class cost-sensitive classifier, and a promising result was achieved.
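The effect of an amount-dependent cost can be illustrated with a minimal sketch of a minimum-expected-cost decision. The cost matrix used here is an assumption made for illustration (a fixed, hypothetical `admin_cost` for investigating any flagged transaction, and a loss equal to the transaction amount for a missed fraud); it is not the model of [Bah+13], and `p_fraud` stands for the output of any probabilistic classifier.

```python
def expected_costs(p_fraud, amount, admin_cost=5.0):
    """Return (risk of flagging, risk of allowing) for one transaction.

    Assumed cost matrix (illustrative only):
      flag  + fraud      -> admin_cost  (investigation)
      flag  + legitimate -> admin_cost  (false alert)
      allow + fraud      -> amount      (money lost)
      allow + legitimate -> 0
    """
    risk_flag = admin_cost          # same cost whether or not it is fraud
    risk_allow = p_fraud * amount   # expected loss if the fraud is missed
    return risk_flag, risk_allow


def should_flag(p_fraud, amount, admin_cost=5.0):
    """Flag the transaction when flagging has the lower expected cost."""
    risk_flag, risk_allow = expected_costs(p_fraud, amount, admin_cost)
    return risk_flag < risk_allow


if __name__ == "__main__":
    # Same estimated fraud probability, different transaction amounts.
    print(should_flag(0.02, 100.0))   # expected loss 2.0, below admin cost
    print(should_flag(0.02, 1000.0))  # expected loss 20.0, above admin cost
```

With a fixed, class-dependent cost both transactions would receive the same decision; under the amount-dependent cost only the larger one is flagged.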


However, according to Pozzolo [Poz15], when the cost of misclassifying a fraud is higher than that of misclassifying a genuine transaction, cost-based algorithms could in principle generate false alerts rather than take the risk of predicting a transaction as legitimate when it is not. As a result, these algorithms can generate many false positive alerts, which may be of no practical use to investigators who require precise alerts.

Social Network methods

Social network analysis (SNA) is a link analysis technique that aims at understanding relationships between network participants by means of mapping and measuring [LN15]. Links between data in SNA can be detected by applying various graph mining algorithms (e.g. peer group analysis [Van+15] and entity link analysis [Sud+10]) to the data source [Maj15]. This approach is commonly used in financial institutions for identifying crime rings or groups of people that work together to commit money laundering or fraud [Sud+10]. In fraud detection systems, analysing links to identify suspicious individuals, groups, relationships, and unusual changes over time or geography can complement the traditional data-mining techniques [Maj15] discussed above. For example, when groups of individuals create fake identities for loan applications, SNA can be used to flag suspicious behaviour by showing the connections between items such as addresses, phone numbers and email addresses [Div15]. A literature review on the research and application of knowledge mapping and SNA can be found in [CL06].
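The loan application example can be sketched as a simple link analysis: applications that share an address, phone number or email address are connected, and connected groups are reported. This is only a minimal illustration using a union-find structure; the function name `linked_groups` and the record layout are hypothetical and do not come from the cited works.

```python
from collections import defaultdict


def linked_groups(applications):
    """Group application ids that share any attribute value.

    `applications` maps an id to a set of (field, value) pairs.
    Union-find is used so that chains of shared values end up in one group.
    """
    parent = {app_id: app_id for app_id in applications}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # attribute value -> first application that used it
    for app_id, attrs in applications.items():
        for attr in attrs:
            if attr in first_seen:
                union(first_seen[attr], app_id)
            else:
                first_seen[attr] = app_id

    groups = defaultdict(set)
    for app_id in applications:
        groups[find(app_id)].add(app_id)
    # Only groups with more than one member are of interest.
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    apps = {
        "A1": {("phone", "555-0101"), ("email", "a@example.org")},
        "A2": {("phone", "555-0101"), ("address", "1 High St")},
        "A3": {("address", "1 High St")},
        "A4": {("phone", "555-0199")},
    }
    print(linked_groups(apps))
```

In this toy data A1 and A3 share no attribute directly, yet they fall into the same group through A2, which is the kind of indirect connection SNA is meant to surface.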

Recently, Veronique et al. [Ver+17] proposed a new approach (GOTCHA) to define and extract features from a time-weighted network. GOTCHA was used to exploit and integrate network-based and intrinsic features in a fraud detection system. They observed that domain-driven network variables have a significant impact on detecting past and future frauds, and improve the baseline by detecting up to 55% additional fraudsters over time. However, linking social network data spread across different heterogeneous data repositories calls for addressing several challenging problems, such as algorithm optimization and parallelization, new knowledge representation paradigms for heterogeneous, redundant and false information, and graph analysis for clustering and partitioning [SMM13].

3.1.2 Unsupervised Approaches

Unsupervised learning methods work with no prior knowledge of any particular class of observations present in a data sample [HTF09]. In the case of financial transaction fraud, there are no prior sets of legitimate and fraudulent observations as in the case of supervised learning techniques. As discussed in Section 3.1.1, assigning class labels to an available dataset is carried out by fraud analysts, which is error-prone and time consuming. Unsupervised learning approaches are commonly used in outlier or anomaly detection, where the behaviour associated with a fraudulent transaction does not conform to that of the majority class [BH01]. As a result, they are not affected by the problems of mislabelled data and class imbalance, because there is no human class labelling, which is subject to errors, and no class label is used for tuning and evaluating algorithms during simulation [Wue+16]. Despite this advantage, the application of unsupervised learning methods in financial fraud detection has not received a lot of attention in the literature [BH01]. This could be because it is more novel and requires background knowledge to interpret the structure discovered in the dataset [GCG17].

In [KS12] an improved peer group analysis (an unsupervised technique) was proposed to detect suspicious patterns of stock price manipulation. The authors incorporated the weights of peer group members into the summary of their behaviour, and considered parameter updates over time, in order to detect a target whose behaviour deviates from that of its peers. Sherly and Nedunchezhian [SN10] used clustering, one of the most popular unsupervised methods, to detect credit card fraud, and a promising result was achieved. Despite the benefits of this technique, clustering analysis can suffer from a bad choice of metric, that is, the way variables are scaled, transformed and combined to measure the 'distance' between observations. A typical instance is the difficulty of combining categorical and continuous variables in a good clustering metric. Moreover, observations may cluster differently on some subsets of variables than they do on others, so that a data sample can have more than one valid clustering [BH01].
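One common way of combining categorical and continuous variables in a single metric is a Gower-style distance, sketched below. It is offered only as an illustration of the metric-design problem described above and is not a technique taken from the cited works; the function name `mixed_distance` and the field names are hypothetical.

```python
def mixed_distance(a, b, numeric_ranges):
    """Gower-style distance between two records with mixed field types.

    `a` and `b` are dicts with the same keys. Fields listed in
    `numeric_ranges` (field -> max - min over the data) are compared by
    range-normalised absolute difference; all other fields are treated as
    categorical and contribute 0 when equal and 1 otherwise.
    """
    total = 0.0
    for field in a:
        if field in numeric_ranges:
            span = numeric_ranges[field]
            total += abs(a[field] - b[field]) / span if span else 0.0
        else:
            total += 0.0 if a[field] == b[field] else 1.0
    return total / len(a)


if __name__ == "__main__":
    t1 = {"amount": 100.0, "channel": "agent"}
    t2 = {"amount": 600.0, "channel": "app"}
    # The amount contributes 500/1000 and the channel mismatch contributes 1.
    print(mixed_distance(t1, t2, {"amount": 1000.0}))
```

The choice of normalisation and the equal weighting of fields are exactly the kind of design decisions that can change which observations cluster together.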

In general, the idea of unsupervised learning is to observe a data sample that represents normal behaviour, then attempt to identify groups that show the greatest departure from this norm and flag them as outliers [Kou+04]. However, this approach is very difficult to manage because [Kou+04]: i) configuring such rules requires precise, laborious and time-consuming programming for each imaginable fraud possibility; ii) the dynamic appearance of multiple new fraud types demands frequent adaptation of the rules to accommodate these emerging fraud types; iii) as more data becomes available for the system to process, the scalability of the system is affected.
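The idea of learning a norm from a sample of normal behaviour and flagging departures from it can be sketched with a simple standard-deviation rule. This is a deliberately minimal, one-variable illustration, assuming a hypothetical threshold `k`; it is not the method of [Kou+04].

```python
import statistics


def flag_outliers(baseline, new_values, k=3.0):
    """Flag values more than `k` standard deviations from the baseline mean.

    `baseline` is a sample assumed to represent normal behaviour.
    """
    mean = statistics.fmean(baseline)
    spread = statistics.stdev(baseline)
    return [abs(value - mean) > k * spread for value in new_values]


if __name__ == "__main__":
    usual_amounts = [10, 12, 11, 9, 10, 11, 12, 10]
    print(flag_outliers(usual_amounts, [11, 40]))
```

Even in this small case the threshold `k` has to be chosen and maintained by hand, which hints at the management burden listed above.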


3.2 Learning from Imbalanced Data

As discussed in Section 3.1.1, the application of supervised learning techniques in the fraud detection domain comes with the challenge of overfitting due to the unbalanced nature of transaction datasets. Most supervised learning algorithms are not designed to cope with a large difference between the number of cases belonging to different classes [GAM00]. Learning from an unbalanced dataset can therefore lead to several problems with respect to the output of a classification model. For example: (1) the classifier may assume that the samples are uniformly distributed, which is false in this case; (2) the classification model is biased towards the dominant class [GS17]. Several methods have been proposed in the literature to deal with this problem. To conform with the available literature, these methods are categorised into three major approaches: the data level, the algorithm level and the cost-sensitive learning framework. The justification for this categorisation is that in data level methods the data is tuned to handle the dataset skewness, in algorithm level methods the learning algorithm is tuned to handle the dataset skewness, while the cost-sensitive approach is a compromise between data and algorithm level methods. These methods are discussed below.

3.2.1 Data Level Methods

In data level methods, the strategies for handling unbalanced data are used as pre-processing steps. This is done to rebalance the dataset or remove the noise between the two classes before any algorithm is applied, in order to reduce the effect of the skewed class distribution on the learning process [Gal+12]. Data sampling methods do not take into consideration any class information when removing or adding observations, yet they are easy to implement and understand [Poz15]. They can be grouped into three main categories: under-sampling, over-sampling and hybrid.

Under-sampling creates a subset of the original dataset by eliminating instances, usually majority class instances [Elh+16]. Under-sampling assumes that many observations of the majority class are redundant and that removing some of them at random should not change the resulting distribution much. However, this comes with the risk of removing relevant observations that might contribute to the learning process, since the removal is done in an unsupervised manner [DH03]. In practice, under-sampling reduces the size of the data and therefore decreases the run-time cost, making it easy to adopt for unbalanced data learning problems [Elh+16]. Examples of under-sampling methods include, but are not limited to, under-sampling based on clustering [YL06; ZM03], Condensed Nearest Neighbour (CNN) [Har68], Edited Nearest Neighbour (ENN) [Wil72] and Tomek Link Removal (T-Link) [Tom76].
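Random under-sampling, the simplest member of this family, can be sketched as follows. The function name `random_undersample` and the choice to balance the classes exactly are assumptions made for illustration; the cited methods (CNN, ENN, T-Link) select the instances to remove more carefully.

```python
import random


def random_undersample(records, labels, majority_label, seed=0):
    """Randomly drop majority-class records until both classes are equal in size."""
    rng = random.Random(seed)
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    minority = [i for i, y in enumerate(labels) if y != majority_label]
    # Keep every minority record and an equally sized random majority subset.
    kept = rng.sample(majority, min(len(minority), len(majority))) + minority
    kept.sort()  # preserve the original ordering of the retained records
    return [records[i] for i in kept], [labels[i] for i in kept]


if __name__ == "__main__":
    records = list(range(10))
    labels = [0] * 8 + [1] * 2  # 8 legitimate, 2 fraudulent
    sampled_records, sampled_labels = random_undersample(records, labels, 0)
    print(sampled_records, sampled_labels)
```

Because the retained majority records are chosen without reference to their content, informative observations may be discarded, which is the risk noted above.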

Over-sampling creates a superset of the original dataset by replicating some instances or creating new instances from existing ones. It replicates the minority class until the two classes have equal frequency [Elh+16]. As a result, it increases the risk of overfitting [DH03] by biasing the model towards the minority class. In addition, it increases the training time, particularly when the original dataset is large, thus making it inefficient. Examples of over-sampling methods include, but are not limited to, Adaptive Synthetic Sampling (ADASYN) [Hai+16], Random Over-Sampling (ROS) [Fer+08] and SMOTE [Cha+02].
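Creating new instances from existing ones can be illustrated with a simplified, SMOTE-style interpolation: a synthetic point is placed on the line segment between a minority instance and one of its nearest minority neighbours. This is a sketch of the general idea only, not the reference SMOTE algorithm of [Cha+02]; the function name `smote_like` and its parameters are hypothetical, and at least two minority points are assumed.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create `n_new` synthetic points by interpolating between a minority
    point and one of its `k` nearest minority neighbours (Euclidean)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


if __name__ == "__main__":
    fraud_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for point in smote_like(fraud_points, n_new=3):
        print(point)
```

Unlike plain replication, the synthetic points are not exact copies, although they still lie within the region already covered by the minority class.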

Hybrid methods consist of a combination of over-sampling and under-sampling methods. In the hybrid approach, after the over-sampling strategy has been applied, the data is cleaned by removing selected instances, either original ones or ones introduced by the over-sampling method, from the new dataset. For example, Ganguly and Samira [GS17] applied a hybrid approach to the classification of imbalanced auction fraud data. The results of their study show improved classification efficiency for the different classifiers used. The hybrid strategy of over-sampling and under-sampling arguably works better than either one alone [Cha+02]. Examples of hybrid sampling algorithms include, but are not limited to, SMOTE+Tomek [BBM03], SMOTE+ENN [BPM04] and SMOTE+SpreadSubsample [GS17].

3.2.2 Algorithm Level Methods

As an alternative to the data level methods discussed above, algorithm level methods adjust the learning algorithm to deal with the minority class [Gu+08]. As a result, they are not exposed to the risk of removing relevant observations from the dataset which might contribute to the learning process. They are, however, exposed to overfitting. In order to develop an algorithmic solution for data balancing, one needs knowledge of both the corresponding classifier learning algorithm and the application domain. In particular, a thorough understanding of why the learning algorithm fails when the class distribution of the available data is uneven is critical [Sun+07]. The learning algorithm can then be adjusted by modifying the cost per class [PF01], adjusting the probability estimation in the leaves of a decision tree (establishing a bias towards the positive class) [WP03], or learning from just one class [RK04] ("recognition-based learning") instead of learning from two classes ("discrimination-based learning") [Fer+08]. The idea behind algorithm level methods is to modify the original classifier so that it learns better patterns from the minority class. Examples of algorithmic solutions are cost-sensitive learning [LS10b] and ensemble learning [ZY12].


Cost-sensitive learning² is a learning approach used to improve the performance of classifiers by applying a cost to the minority class distribution [San+17]. It biases the classifier towards the minority class by assigning a higher misclassification cost to this class, while at the same time seeking to minimize the total error cost of both classes [Gal+12], as discussed in Section 3.2.3. The major drawback of this approach is the need to define misclassification costs, which are not usually available in the datasets [Gal+12]. Research that uses a cost-sensitive learning approach for financial fraud detection includes Sahin et al. [SBD13], Viaene et al. [Via+04] and Pinquet [PAG07].

² The cost-sensitive approach is also presented in the next section, as it is thought to be a compromise between data and algorithm level methods in the literature.

Ensemble learning is the combination of multiple classifiers to improve the prediction accuracy of weak classification algorithms on unbalanced data. Bagging [BK99] and Boosting [FS96] are the most widely used methods, and their application to several classification problems has led to significant improvements [OT08]. In supervised ensemble learning [LWZ09], the majority class is under-sampled by iteratively removing the majority class instances that are correctly classified by a boosting algorithm. The idea is that observations of the majority class that are easy to classify are redundant and that by removing them the algorithm can concentrate on the hard cases. In practice, this implies that the classification algorithm has to be applied several times to reduce the majority class, thus leading to an increased computational cost. Research that uses an ensemble learning approach for fraud detection includes:

Hassan and Ajith [IA16] designed an innovative ensemble Insurance Fraud Detection (IFD) model using decision tree, support vector machine and artificial neural network base-classifiers. The base-classifiers were evaluated using an automobile insurance dataset, and the empirical results illustrate that the proposed models gave better results.

Perera et al. [Per+13] applied a novel ensemble approach for click fraud detection, based on a set of new features derived from existing attributes. The ensemble model, which is based on six different learning algorithms, showed improved results on training, validation and test datasets, thus demonstrating its generalizability to different datasets.

Zareapoor and Shamsolmoali [ZS15] applied bagging classifiers based on the decision tree algorithm to detect fraud in credit card transactions. The performance of the classifier was found to remain stable as the fraud rate was gradually increased during evaluation, and it required less computational time.

Sohony et al. [IRU18] applied an ensemble machine learning approach to detect fraudulent credit card transactions. In their ensemble learning model, a combination of random forest and neural network was used with the aim of addressing the skewed nature of financial transaction datasets. The results of their experiment show a high prediction accuracy.
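The combination step common to many ensembles can be sketched as a majority vote over base classifiers. The three threshold rules below are hypothetical stand-ins for trained models and are not taken from any of the works cited above; an odd number of binary voters is assumed so that ties cannot occur.

```python
from collections import Counter


def majority_vote(classifiers, record):
    """Return the label predicted by most base classifiers for `record`."""
    votes = Counter(classifier(record) for classifier in classifiers)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical base classifiers: each maps a transaction to True (fraud)
    # or False (legitimate).
    base_classifiers = [
        lambda t: t["amount"] > 900,
        lambda t: t["hour"] < 5,
        lambda t: t["new_recipient"],
    ]
    transaction = {"amount": 1200, "hour": 3, "new_recipient": False}
    print(majority_vote(base_classifiers, transaction))
```

Bagging and boosting differ in how the base classifiers are trained and weighted, but both end with an aggregation of this general kind.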

3.2.3 Cost-sensitive Learning Methods

The cost-sensitive learning method falls between the data level and algorithm level approaches. It incorporates both data level transformations, which involve adding costs to instances, and algorithm level modifications, which modify the learning process to accept costs [Gal+12]. In classification tasks with unbalanced data, it is usually more important to correctly predict positive (minority) instances than negative (majority) instances [WMZ07].


According to Ling [LS10b], cost-sensitive learning can be classified into two categories. The first category is the design of classifiers that are cost-sensitive in themselves, e.g. decision trees [DH00]. The second category is the design of a wrapper that converts any existing cost-insensitive classifier into a cost-sensitive classifier. The goal is to minimize the misclassification cost using standard classification algorithms [Mal+97]. In the literature, many machine learning algorithms have been proposed for cost-sensitive learning, including but not limited to decision trees [WMZ07], ANN [CB13] and SVM [CGC05].
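The second category, a wrapper around a cost-insensitive classifier, can be sketched by moving the decision threshold. Assuming that correct decisions cost nothing, predicting the positive class has the lower expected cost whenever the estimated probability p satisfies p >= C_FP / (C_FP + C_FN). The function name `cost_sensitive_wrapper` and the scorer interface are hypothetical, and this is one simple wrapper among several possible designs.

```python
def cost_sensitive_wrapper(predict_proba, cost_fp, cost_fn):
    """Wrap a cost-insensitive scorer so that it decides by minimum expected cost.

    `predict_proba` maps a record to the estimated probability of the
    positive (fraud) class. Correct decisions are assumed to cost nothing,
    so predicting positive is cheaper when
    p >= cost_fp / (cost_fp + cost_fn).
    """
    threshold = cost_fp / (cost_fp + cost_fn)

    def predict(record):
        return 1 if predict_proba(record) >= threshold else 0

    return predict


if __name__ == "__main__":
    # Hypothetical scorer that simply reads a precomputed score.
    scorer = lambda record: record["score"]
    # A missed fraud is assumed to cost nine times as much as a false alert.
    classifier = cost_sensitive_wrapper(scorer, cost_fp=1.0, cost_fn=9.0)
    print(classifier({"score": 0.2}), classifier({"score": 0.05}))
```

The underlying classifier is left untouched; only the costs, which must still be supplied by the analyst, change the decision.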

In [SBD13], Sahin et al. applied a cost-sensitive decision tree induction algorithm to identify credit card fraud. As a contribution to knowledge, they used varying misclassification costs in their cost-sensitive decision tree induction algorithm. Similarly, in [CZZ13], an optimized cost-sensitive SVM was proposed to improve classification performance by simultaneously optimizing the feature subset and the misclassification cost parameters. The results show that the proposed approach is effective in comparison with commonly used sampling techniques. However, estimating the costs in a cost-sensitive approach is considered highly challenging: costs are not explicitly available or easy to estimate. This makes it difficult to adopt as a solution for unbalanced learning [WMZ07].

For the purpose of data balancing in this thesis, a data sampling approach was chosen over a cost-sensitive approach because costs are not explicitly available or easy to estimate [Mal+97]. More discussion of the advantages of the data sampling approach over the cost-sensitive approach can be found in [EA13; WMZ07; VKN07]. Additionally, to ensure that the best data balancing approach is adopted, a hybrid technique that combines an over-sampling and an under-sampling algorithm is employed, since such a combination works better than either one alone [Cha+02; GS17].

3.3 Classification Performance Measures

As discussed in the sections above, the applicability of the identified machine learning techniques to detecting financial transaction fraud can be evaluated using different performance metrics. Measuring the success of computational intelligence algorithms is an important step in determining their suitability for solving their respective problems. This is especially true for the financial transaction fraud problem, where minor improvements in performance can lead to large economic benefits [WB16]. A variety of standards can be used to measure the computational performance of an algorithm, such as absolute ability, visual mediums, probability of success, and more [WB15; WB16]. These standards are formulated from the four possible outcomes of a classification, as shown in Table 3.3.

Table 3.3: Confusion matrix

Predicted \ Actual | Positive class | Negative class

Positive class | True positives (TP): number of examples correctly predicted as pertaining to the positive class. | False positives (FP): number of examples predicted as positive, which are from the negative class.

Negative class | False negatives (FN): number of examples predicted as negative, whose true class is positive. | True negatives (TN): number of examples correctly predicted as belonging to the negative class.


It should be noted that none of these measures is adequate by itself, since the aim of the classification task is to achieve good quality results for both classes. One way to combine these measures and produce an evaluation criterion is to use the area under the ROC curve (AUC) [Gal+12]. AUC provides a single measure of a classifier's performance for evaluating which model is better on average. Thus, AUC can be applied to evaluate a learner on an imbalanced dataset [Bra97]. Traditionally, standard classification measures such as Accuracy, Recall and F-measure are commonly used for evaluating a classifier [Cha+11a]. However, for classification with the class imbalance problem, accuracy is no longer a proper measure, since the rare class has very little impact on accuracy compared with the prevalent class [Gu+08]. Recall focuses on the significant class (e.g. fraud) and is not sensitive to the data distribution, while the F-measure is a good metric due to its non-linear nature. Another common evaluation metric is the Matthews Correlation Coefficient (MCC), which measures the quality of a two-class classification. It takes into account the true and false positives and negatives, and is a balanced measure even when the classes are of different sizes. An MCC of +1 indicates a perfect prediction, and -1 a total disagreement [Ran+17].
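The measures named above can be computed directly from the four confusion-matrix counts, as in the following sketch. The formulas are the standard definitions of accuracy, recall, precision, F-measure (F1) and MCC; returning 0.0 when a denominator is zero is a convention assumed here, and the `auc` helper uses the pairwise (rank-based) definition of the area under the ROC curve, with ties counted as one half.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Compute common measures from the four confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1, "mcc": mcc}


def auc(scores, labels):
    """Probability that a random positive is scored above a random negative."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # A classifier that labels every one of 1000 transactions as legitimate,
    # when 10 of them are in fact fraudulent.
    print(classification_metrics(tp=0, fp=0, fn=10, tn=990))
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

In the first example accuracy is 0.99 although no fraud is detected, while recall and MCC are both 0, which illustrates why accuracy alone is misleading on imbalanced data.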

3.4 Case-Based Reasoning

As already mentioned in Section 3.1, most machine learning techniques rely on a statistically relevant dataset for prediction. Unfortunately, in the absence of a significant amount of historical data, they tend not to perform well [Sha+16]. Case-based reasoning (CBR), on the other hand, is a modern computational method that solves new problems using solutions (specific knowledge) from past, similar problems that were successfully solved [Lee08]. A simple illustration of this is given in [AP94]: a financial consultant working on a difficult credit decision task uses the knowledge from a similar previous case, which involved a company in similar trouble to the current one, to recommend that a loan application should be refused. In essence, a case-based reasoner solves problems by using or adapting previously successful solutions to old problems [BP89]. The basic principle of CBR is summarised in Figure 3.2.

Figure 3.2: CBR classical paradigm [Cor08]

As Figure 3.2 shows, the classical principle of CBR is to solve a target problem by retrieving a source case and adapting its solution to fit the requirements of the target problem. The roots of the CBR methodology in artificial intelligence lie in Roger Schank's cognitive science research on human reasoning and memory organization [Rog83]. Schank suggests that human knowledge about the world is mainly organized as memory packets holding together particular episodes from our lives that were significant enough to remember. These memory organization packets (MOPs) and their elements are not isolated but interconnected by our expectations as to the normal progress of events, called scripts. For example, an MOP may contain a situation in which some problem was successfully solved; when the person later finds him/herself in a similar situation, the previous experience is recollected and the person can try to apply the same steps in order to reach a solution [Rog83]. The reasoning of the classical CBR paradigm is often organised as a cycle which specifies the sequence of the various steps. This cycle now serves as a reference for most studies in this field [Cor08]. The next section discusses the CBR process cycle.

3.4.1 Case-Based Reasoning Cycle

In general, CBR is a cyclic and integrated process made up of four steps,
commonly referred to as the four R's: Retrieve, Reuse, Revise and Retain. In
a CBR process, a case denotes a unit of knowledge memory. It usually contains
details about a problem with a known solution, as well as relevant
information that led to that solution [Kap12]. This cycle is illustrated in
Figure 3.3.

Figure 3.3: The CBR cycle [AP94]


The CBR process illustrated in Figure 3.3 is discussed below:

1. Retrieve: In a CBR system, once a new transaction (a query) is observed,
the retrieve task examines the transaction description, searches its
knowledge repository or case-base for cases similar to the investigated one,
and ends when a best-matching previous case has been found. The most common
approach to matching is a similarity measure, which can vary from a
simplistic one to an advanced one depending on the case structure, the
knowledge intensity and the field of application [Kap12].

2. Reuse: This step is also known as the adaptation stage. The retrieved

case transaction found at the retrieve step is examined by a domain

expert. If suitable, it is proposed as the similar transaction case to the

new case. Otherwise, it is adapted to meet the specific requirements

or features of the new case [Cor08; Kap12]. This step is very domain

dependent and is optional depending on the application.

3. Revise: This step examines and evaluates the transaction class proposed
for the new case or query, to verify its suitability. If successful, the
proposed class is used as the class of the new transaction (case retention).
Otherwise, the case is repaired using domain-specific knowledge [AP94;
Kap12]. This stage is domain dependent and may change among applications
[RGDAGC08]. Therefore, the input of a fraud analyst becomes critical at this
stage to correct the system when the detected class of transaction is wrong.

4. Retain: Once the previous steps have been conducted successfully and a
verified transaction class (solution) has been confirmed and validated, a
new case to be retained is formulated. This means that the result of the
problem-solving process is added to the system's knowledge repository for
possible future use. The retained case consists of the imported problem,
its proven solution and any available surrounding information from
the problem-solving process [Kap12].
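The four steps above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the system proposed in this thesis: the class names `Case` and `CaseBase`, the two numeric features, the squared Euclidean distance used for retrieval and the optional `revise` callback standing in for the fraud analyst are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Case:
    features: List[float]  # numeric description of a transaction (assumed)
    label: str             # known solution, e.g. "fraud" or "legit"


@dataclass
class CaseBase:
    cases: List[Case] = field(default_factory=list)

    def retrieve(self, query: List[float]) -> Case:
        # Retrieve: the stored case with the smallest squared Euclidean distance.
        return min(
            self.cases,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(c.features, query)),
        )

    def solve(
        self,
        query: List[float],
        revise: Optional[Callable[[List[float], str], str]] = None,
    ) -> str:
        best = self.retrieve(query)
        proposed = best.label  # Reuse: copy the retrieved solution unchanged
        # Revise: an analyst callback may confirm or correct the proposal.
        final = revise(query, proposed) if revise else proposed
        # Retain: store the solved problem for future retrieval.
        self.cases.append(Case(list(query), final))
        return final


# Hypothetical features: (amount, number of transactions in the last hour).
cb = CaseBase([Case([100.0, 1.0], "legit"), Case([30000.0, 9.0], "fraud")])
label = cb.solve([28000.0, 8.0])
corrected = cb.solve([120.0, 1.0], revise=lambda q, proposed: "fraud")
```

The case base grows with every solved query, which is how a CBR system improves as more data becomes available; in practice the features would be normalised before distances are compared.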

In general the CBR cycle has been used over a wide range of applications

in the literature due to its generic characteristics and its adaptation flexibility

across systems [Kap12]. The next Section refers to the use of CBR with other

machine learning algorithms.

3.5 CBR and Machine Learning

In the literature, the coupling of a CBR method with other methods is used

to solve specific problems in various domains [YMR14]. These added methods

could be embedded in any of the stages that compose the CBR cycle. Gener-

ally, CBR systems are flexible systems capable of using the beneficial properties

of other technologies to their advantage. For example, Silva et al. [SVF15]
combined CBR with artificial neural networks (ANNs) for credit risk analysis.
Also, according to Corchado [CL01], “ANNs deal easily (and normally) with
numeric data sets whereas CBR systems deal normally with symbolic knowledge.
Even when symbolic knowledge can be transformed into numeric knowledge and
numeric into symbolic, there is always the risk of losing accuracy and
resolution in the data and hence obtaining misleading results”. Therefore, a
combination of CBR systems and ANNs may avoid transforming data and therefore
gain precision [CL01].

Other works in which machine learning is used with CBR include clustering
[KPJ15; MA16; SE13], optimization algorithms [KGA12; Kj04; Yu+16], genetic
algorithms [Isl+02; SH99] and ontologies [AL13; Mar+13; Qin+18]. In addition,
the combination of CBR with machine learning has been widely applied to
various domains. For example, CBR and rule-based reasoning were applied to a
medical diagnosis system [SEDMK14] in order to improve the accuracy of the
retrieval process of CBR systems. The system integrates case-based reasoning
and rule-based reasoning and applies the adaptation process automatically by
exploiting adaptation rules. Both adaptation rules and reasoning rules are
generated from the case-base. The results show that the proposed approach
increases the diagnostic accuracy of the retrieval process and provides
reliable accuracy compared to current breast cancer and thyroid diagnosis
systems. Other applications of CBR and machine learning in the medical field
include [ABL13; CG+13; YMR14].

Ahn et al. [AKH06] applied CBR and a genetic algorithm (GA) to customer
classification, i.e. classifying customers into purchasing and non-purchasing
groups. The genetic algorithm was used to optimize the feature weights and
the selection of instances for the CBR method. The experimental results show
that simultaneously optimized CBR may improve classification accuracy and
outperform various optimized CBR models as well as other classification
models, including logistic regression, multiple discriminant analysis,
artificial neural networks and support vector machines. Other examples of
CBR and machine learning for customer classification include [LS10a; SK11;
ZLD14].

Chuang et al. [Chu13] developed CBR-based hybrid models for predicting
bankruptcy, i.e. business failure. The hybrid models developed in their study
include RST-CBR (combining Rough Set Theory with CBR), RST-GRA-CBR
(integrating RST, Grey Relational Analysis, and CBR), and CART-CBR (combining
Classification and Regression Tree with CBR). The RST-GRA-CBR hybrid model is
reported to be a viable alternative method. It appears to outperform the
other four algorithms in terms of accuracy, as it helps users to identify
similar cases as references for making decisions. Other examples of CBR and
machine learning for business failure include [LX12; PH02; SVF15].

CBR and machine learning have been used in the literature for both credit
scoring and fraud detection. For example, Wheeler in [WA00] applied multiple
algorithmic CBR to fraud detection. The results of their experiment suggest
that multi-algorithmic CBR is capable of high accuracy rates. Other
applications of CBR and machine learning in credit scoring and fraud
detection include [CH11; Lee08].

From the examples above, it is evident that the CBR method is flexible and
viable for application to various forms of knowledge and domains. The
combination of a genetic algorithm (GA) and CBR has been used extensively in
the literature. This motivates the choice of GA for feature weighting in the
CBR systems proposed in this research work. The next section reviews the
relevant work in the literature where CBR was applied to the problem of
financial transaction fraud detection.
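To make the idea of GA-based feature weighting concrete, the sketch below evolves a weight vector for a weighted nearest-neighbour retriever. It is an illustrative assumption rather than the configuration used later in this thesis: the fitness function (leave-one-out accuracy), the truncation selection, uniform crossover, Gaussian mutation, population size and the toy data are all hypothetical choices.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (features, class label)


def loo_accuracy(weights: Sequence[float], data: List[Sample]) -> float:
    """Leave-one-out accuracy of a weighted 1-nearest-neighbour retriever."""
    hits = 0
    for i, (x, y) in enumerate(data):
        others = [s for j, s in enumerate(data) if j != i]
        nearest = min(
            others,
            key=lambda s: sum(w * (a - b) ** 2 for w, a, b in zip(weights, s[0], x)),
        )
        hits += nearest[1] == y
    return hits / len(data)


def evolve_weights(data: List[Sample], n_features: int, pop_size: int = 12,
                   generations: int = 15, seed: int = 0) -> List[float]:
    rng = random.Random(seed)
    # Initial population: uniform weights plus random candidates in [0, 1].
    pop = [[1.0] * n_features] + [
        [rng.random() for _ in range(n_features)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        pop.sort(key=lambda w: loo_accuracy(w, data), reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; parents survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            k = rng.randrange(n_features)
            # Gaussian mutation of one gene, clipped to [0, 1].
            child[k] = min(1.0, max(0.0, child[k] + rng.gauss(0, 0.2)))
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: loo_accuracy(w, data))


# Toy data: the first feature separates the classes, the second is noise.
toy = [((0.1, 0.9), 0), ((0.2, 0.1), 0), ((0.15, 0.5), 0),
       ((0.8, 0.85), 1), ((0.9, 0.15), 1), ((0.85, 0.45), 1)]
best = evolve_weights(toy, n_features=2)
```

Because the best half of each generation is carried over unchanged, the fitness of the returned weights is never worse than that of the uniform weights included in the initial population.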

3.6 Case-Based Reasoning in Fraud Detection

Case-based reasoning (CBR), as already mentioned in Section 3.4, is an
artificial intelligence paradigm for solving new problems by reusing previous
similar problem-solving experiences [AP94]. As an alternative to standard
machine learning methods, CBR comes with a number of advantages when
applied to the field of financial transaction fraud [Wat99]. For example,
case-based reasoning has the ability to (i) learn in the absence of
historical consumption data while continuously improving as more data becomes
available over time; (ii) realize knowledge transfer as spending habits
evolve, as is the case where information on one transaction is exploited to
improve predictions for different yet similar transactions; and (iii) provide
precedent-based justification instead of justifying a solution by showing a
trace of the rules that led to a decision [PH02; Wat99].

The case-based reasoning methodology has proven to be valuable and
successful in a wide range of applications. This is due to its generic
characteristics and its adaptation flexibility across systems [Kap12].
Furthermore, it is more transparent than black-box models such as neural
networks and has the ability to operate with limited experience, learning and
improving its predictive accuracy as more data becomes available [Sha+16;
PDM15]. There is little research on applying this technique in the context of
detecting financial fraud patterns [Ade+16; Ade+17; Kap+12].

In [WA00], a multi-agent case-based reasoning approach was applied to the
problem of reducing the number of final-line fraud investigations in the
credit approval process. Of the algorithms tested, the adaptive CBR algorithm
was found to have the best performance. The results indicate that an adaptive
solution can provide fraud filtering and case ordering functions that reduce
the number of final-line fraud investigations necessary. The model, however,
needs to be tested with similarly complex data sets from other real-world
domains.

Park and Han [PH02] used a multi-agent case-based reasoning approach to
reduce the number of final-line fraud investigations in the credit approval
process, achieving precise results. In [Ade+16] and [Kap+12], promising
results were produced with a basic CBR model (i.e. standard CBR) for
monitoring and predicting financial transaction fraud. However, the
predictive accuracy of that model was lower than that of a neural network of
similar complexity, and it featured a relatively high false positive rate. As
discussed in [SK13], this identified weakness is considered damaging for
customer trust due to high false negative rates. This reflects why precision
requirements for operational fraud detection systems are high, and partly
explains the current reluctance to adopt unified industry-wide approaches. In
this thesis, an improved CBR system that uses machine learning capabilities
to predict mobile money payment fraud is proposed.

3.7 Chapter Summary

In this chapter, algorithmic solutions for financial service fraud detection
were discussed. The discussion was divided into supervised and unsupervised
approaches. These two categories were further grouped into four groups
based on how they are evaluated, as shown in Figure 3.1. Related works
on the use of machine learning algorithms for financial fraud detection were
reviewed. As a result, logistic regression, random forest, support vector
machine and artificial neural network were selected to run some preliminary
experiments, as further discussed in Appendix A.1. These choices were based
on the fact that logistic regression is easy to use and is one of the most
commonly used techniques for data mining in practice. The random forest
classifier has the ability to capture non-linear data and shows high
scalability with better visual representation of the resulting data.
Artificial neural networks can show higher accuracy in prediction.

Section 3.2 reviewed the existing approaches for handling unbalanced data in
predictive algorithms. As a result, a conclusion was reached to use a hybrid
technique that combines an oversampling and an undersampling algorithm to
balance the MMT dataset. According to [Cha+02], this works better than
either one alone. Section 3.3 discussed evaluation techniques (performance
metrics) for measuring a fraud detection system effectively. Finally,
Section 3.4 concluded the chapter by providing an introduction to the
case-based reasoning methodology and its application in the fraud detection
domain.

Chapter 4

Mobile Money Transfer Data Simulation

Chapter 2 has demonstrated that there is no concrete evidence of research
using real transaction data for the learning stage of a fraud detection
system. This is due to data protection, confidentiality, ethical issues,
time, and the cost associated with collecting multiple instances of a diverse
set of data sources [Jes+05]. In order to circumvent the lack of publicly
available data sets, representative mobile money transfer data is simulated,
as suggested by Lundin et al. [LKJ02].

This chapter presents the simulation of transaction data that resembles a
mobile money transfer payment platform. This is achieved by generating
representative data on mobile money transactions that is as close as possible
to a real-world situation. As a case study for the data simulation, the
Mobile Money Transfer Service scenario requirements developed by EU FP7
MASSIF ("MAnagement of Security information and events in Service
Infrastructures") [Lla+11] are used. The rationale for using the EU FP7
MASSIF mobile money transfer service scenarios as the case study is that, in
the literature, only this work provides the level of depth needed to carry
out the investigation.

Section 4.1 describes the mobile money model simulation platform. Section 4.2
expands on this by discussing the users' behaviour model within the MMT
platform. It describes the different agents within the MMT platform as well
as the events that indicate their behavioural models. Section 4.3 discusses
the implementation of the simulator using the adapted multi-agent based tool.
It also describes the simulated scenarios, the input and output parameters of
the simulation, and the simulation walk-through. Section 4.4 discusses the
data evaluation and presents the overall statistical analysis of the
simulated data. Finally, Section 4.5 concludes this chapter by summarising
the simulation approach adopted in this thesis for the needs of mobile money
transfer data generation.

4.1 Mobile Money Model

In order to model the mobile money transfer platform, the mobile money
transfer (MMT) service described in Section 2.2.1 is considered. This service
uses the schema of a real mobile money service that enables end-users to
transfer money to other end-users or to buy goods and services from
merchants. It is assumed that the transactions are made with mMoney, which
corresponds to electronic money issued by the operator that manages the
service. In addition, end-users can exchange cash for mMoney and vice versa
at mMoney vendors by depositing cash into or withdrawing cash from their MMT
accounts, as discussed in [Gab+13].

4.2 Users’ Behaviour Model

The user behaviour model approach corresponds to multi-agent models that
use the schema of a real mobile money service to generate synthetic data
with known ground truth. Here, scenarios for legitimate and fraudulent agents
were created from the literature. In addition, elements of their activities
in the scenarios were mapped to the data [WHV08]. To align with the
literature [Gab+13], there are two major types of agent in the simulation of
our MMT platform: (i) the legitimate users, who subscribe to the mobile-based
money transfer service and thus own an account; and (ii) the fraudsters, also
known as bad actors, who attack the system and take the place of a legitimate
end-user.

4.2.1 Legitimate Actors

There are three major agents involved in legitimate MMT system transactions:
end-users (customers), service providers (merchants and utilities) and
distribution channels (retailers). Each category is made up of several roles
which are associated with specific actions in the platform. End-users are
individuals who use their mobile devices to access the MMT platform and
carry out transactions through the network provided by their operator.
Service providers sell services or goods to end-users. Retailers are in
charge of the distribution of electronic money to end-users [Gab+13].

In the legitimate agent model simulation, it is assumed that a legitimate
user tends to carry out a specific set of transactions frequently and
repeatedly, i.e. their transactions are mostly related to their habits.
According to Gaber et al. [Gab+13], a habit is a repetition of a sequence of
legitimate transactions which are characterized by (i) a type of transaction,
(ii) a normally distributed transaction amount, (iii) a normally distributed
period of time between two transactions of the considered habit, (iv) an
initial date and (v) a final date. From this, it can also be deduced that the
set of actors in the mobile money ecosystem with whom a user interacts on a
regular basis can be referred to as that user's Community of Interest (COI).
Any shift or deviation from this may be viewed as an anomaly or as
suspicious. This concept was also used in [CPV01] and [Gab+13].
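The habit definition above can be illustrated with a short generator. This is a minimal sketch, assuming hypothetical field names and parameter values; the clamping of amounts and gaps to a minimum value is an added assumption that keeps the output valid but slightly truncates the normal distributions.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass
class Habit:
    tx_type: str           # (i) type of transaction, e.g. "P2P"
    amount_mean: float     # (ii) mean of the transaction amount
    amount_sd: float       #      and its standard deviation
    gap_mean_hours: float  # (iii) mean time between two transactions
    gap_sd_hours: float    #       and its standard deviation
    start: datetime        # (iv) initial date
    end: datetime          # (v) final date


def generate(habit: Habit, rng: random.Random) -> List[Tuple[datetime, str, float]]:
    """Return (time, type, amount) tuples for one habit."""
    events = []
    t = habit.start
    while t <= habit.end:
        # Assumption: amounts are clamped to at least 1.0 to stay positive.
        amount = max(1.0, rng.gauss(habit.amount_mean, habit.amount_sd))
        events.append((t, habit.tx_type, round(amount, 2)))
        # Assumption: consecutive transactions are at least one hour apart.
        gap = max(1.0, rng.gauss(habit.gap_mean_hours, habit.gap_sd_hours))
        t += timedelta(hours=gap)
    return events


# Hypothetical weekly transfer habit over one month.
weekly = Habit("P2P", 1500.0, 300.0, 168.0, 12.0,
               datetime(2018, 1, 1), datetime(2018, 1, 31))
events = generate(weekly, random.Random(1))
```

A legitimate agent would hold several such habits at once; a fraud scenario can then be expressed as transactions that fall outside the amounts, timing or partners implied by them.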

4.2.2 Bad Actors

Bad actors correspond to misuse cases of the system by "fraudsters", who can
follow a pattern in which many parameters can evolve. In order to simulate
these agents, the three misuse scenarios identified in the EU FP7 MASSIF
project [Lla+11] were explored. They are described in detail below:

1. Account takeover as a result of SIM splitting or loss of device. In this
case, an attacker gains access to the device of an mWallet holder and
knowledge of the authentication credentials, and then carries out several
transactions, purchases or money withdrawals. Events that indicate the
scenario: abnormal behaviour or a shift in the behaviour of end users
compared to a specific profile or to their usual behaviour [Lla+11].

2. Retailer who is complicit with Money Laundering activities and then

facilitates opening of an account despite knowing such an account will

be loaded with funds coming from criminal activities. In this case, the

actor does not require the subscriber to provide true identity or any

specific identity. The fraudulent subscriber uses mWallet to manage

money coming from criminal activities. We assume that the fraudulent

subscriber will withdraw money or make a purchase corresponding to

the amount of money stolen at a certain time after he receives it. Events

that indicate the scenario: an atypical use of mWallets or an abnormal

volume and frequency of cash transactions compared to a specific profile

e.g. mWallet used only for withdrawal or P2P transfer, multiple funding

and loading sources of the mWallet, followed by withdrawals shortly

afterwards [Gab+13; Lla+11].

3. Account management system compromise by an attacker. Here, the at-

tacker takes control of the account management system and uses the

system to change accounts data (identity, balance, credit/debit accounts

etc.). Events that indicate such a misuse case are a change in the global
balance of mMoney in the system, an intrusion into the account management
system, or a large number of transfers from several mWallets used to fund
one specific account [Lla+11].

4.3 Implementation

The multi-agent based simulation was implemented by simulating the
combination of behaviour and habits of several users in a mobile money
environment. The simulator is built according to the methodology proposed in
[LKJ02], as shown in Figure 4.1 (aggregated data from real transactions was
not injected into the simulation, as mentioned earlier in Section 2.3.3), and
implemented using the adapted multi-agent based simulator (MABS) developed by
Lopez et al. [LrA12b]. The main reason was that the proposed methodology has
a well-defined interface which makes it easy to use, while MASON facilitates
the implementation of social networks. As a contribution, misuse scenarios
involving SIM swap and a retailer facilitating the opening of an end-user
account were modelled, which were not considered in [Gab+13; LrA12b]. Also,
to trace the sequence of events in the generated data, a time-stamp parameter
was included in the simulation, rather than the steps or the category of
users over a period of time used in [Gab+13] and [LrA12b] respectively.

Figure 4.1: The synthetic log data generation process [LKJ02]

From Figure 4.1, the user profile configuration (for both the user and the
attacker simulator) involves agents with two types of account profile:
Personal (P1) and Business (P2). In order to simplify the model while
reflecting a realistic scenario (using MPESA as the case study), it is
assumed that all values are given in Kenyan Shilling (Ksh). Profile P1 has a
maximum daily limit of 35,000 Ksh (approx. 285 Pounds) for all transactions
and a maximum account balance of 70,000 Ksh. For profile P2, both thresholds
are increased to 140,000 Ksh.

In the user/attacker simulator, agents are switched from one profile to
another using a Markov matrix of transition probabilities (a Markov matrix
was chosen because it can move an agent from one state to another and is
commonly used in the literature for sequence evolution). This tells the
system when to change from Active to Inactive and from Profile P1 to
Profile P2, which allows higher limits for transactions. The output acts as
simulated user actions which are then fed into the system simulator. As part
of the configuration of the system simulator, each client (i.e. user) has
five possible actions in each step of the simulation, i.e. the category of
transaction they can perform, which is considered a "habit" in the
simulation. They can make a money deposit (MD), money withdrawal (MW),
merchant payment (MP), person-to-person transfer (P2P) or airtime recharge
(AR). The autonomy of the agent is implemented by a probabilistic transition
function that computes the type of operation and the action that an agent
will perform in each step. This transition function depends on client
attributes such as the category of user, and on the amount, which is
calculated according to the balance and the limits of each client's profile
[LrA12b; Zhd+14]. Other configuration parameters and data are provided in
Sections 4.3.2 and 4.3.3 and in Appendix A.1.
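The state switching described above can be sketched as two small Markov chains, one for activity and one for the account profile. The probabilities below are illustrative assumptions only, except that the P1-to-P2 value mirrors the default UpgradeAccountRate of 0.01 listed in Table 4.1; the assumption that P2 never reverts to P1 is also hypothetical.

```python
import random
from typing import Dict

Matrix = Dict[str, Dict[str, float]]

# Row-stochastic transition matrices (each row sums to 1).
ACTIVITY: Matrix = {
    "Active":   {"Active": 0.8, "Inactive": 0.2},   # assumed values
    "Inactive": {"Active": 0.5, "Inactive": 0.5},   # assumed values
}
PROFILE: Matrix = {
    "P1": {"P1": 0.99, "P2": 0.01},  # upgrade rate as in Table 4.1
    "P2": {"P1": 0.0, "P2": 1.0},    # assumption: no downgrade
}


def step(state: str, matrix: Matrix, rng: random.Random) -> str:
    """Sample the next state from the row of `matrix` belonging to `state`."""
    targets = list(matrix[state])
    probs = [matrix[state][t] for t in targets]
    return rng.choices(targets, weights=probs, k=1)[0]


rng = random.Random(42)
activity, profile = "Active", "P1"
trace = []
for _ in range(30):
    activity = step(activity, ACTIVITY, rng)
    profile = step(profile, PROFILE, rng)
    trace.append((activity, profile))
```

Keeping the two chains separate is a simplification; a single matrix over the four combined states would allow the upgrade probability to depend on whether the agent is active.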

For each simulation, the parameters and probabilities of occurrence for the
transitions can be modified to improve the quality of the simulation towards
a more realistic scenario. Additionally, the type of action each client can
perform is chosen by pseudo-random transitions. These pseudo-random
transitions are based on three different configurations, determined by the
account balance as a percentage of the maximum limit allowed by the client
profile: low (lower than 15%), high (higher than 80%) and medium (between low
and high) [LrA12a]. The agent has a higher probability to make a deposit when
the balance is low. When the balance is high, the agent has a higher
probability to make a withdrawal, transfer, merchant payment or airtime
recharge rather than a deposit [LrA12b].
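A minimal sketch of this balance-dependent choice follows. The 15% and 80% thresholds come from the text above; the probability values in `ACTION_PROBS` and the function names are hypothetical and chosen only so that deposits dominate at low balances and are unlikely at high balances.

```python
import random

ACTIONS = ["MD", "MW", "MP", "P2P", "AR"]

# Hypothetical action probabilities per balance band, in the order of ACTIONS.
ACTION_PROBS = {
    "low":    [0.60, 0.10, 0.10, 0.10, 0.10],
    "medium": [0.20, 0.20, 0.20, 0.20, 0.20],
    "high":   [0.04, 0.24, 0.24, 0.24, 0.24],
}


def balance_band(balance: float, max_balance: float) -> str:
    """Classify the balance relative to the profile's maximum balance."""
    ratio = balance / max_balance
    if ratio < 0.15:
        return "low"
    if ratio > 0.80:
        return "high"
    return "medium"


def choose_action(balance: float, max_balance: float, rng: random.Random) -> str:
    """Draw one of the five actions with band-dependent probabilities."""
    band = balance_band(balance, max_balance)
    return rng.choices(ACTIONS, weights=ACTION_PROBS[band], k=1)[0]


rng = random.Random(7)
# A P1 account (maximum balance 70,000 Ksh) that is nearly empty.
draws = [choose_action(5000.0, 70000.0, rng) for _ in range(1000)]
deposit_share = draws.count("MD") / len(draws)
```

With these assumed probabilities, roughly six in ten draws for a nearly empty account are deposits, which reproduces the qualitative behaviour described in the text.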


4.3.1 Simulated Scenarios

The system employs the concept of social networks to model agents' behaviour
and interconnection within a mobile money transfer service. In a social
network, nodes connected to other nodes by edges constitute a social graph.
The nodes represent mobile money users, while the edges describe the social
relationship between two mobile money users. Figure 4.2 illustrates the
social graph of mobile money users in the simulation of an MMT system. The
graph shown in Figure 4.2b represents a small simulation of 35 agents across
7 cities (the nodes represent the 7 cities). This is used to explain what the
edges and nodes of the desired simulation scenario look like. The red nodes
represent fraudsters, while the purple nodes represent legitimate users. Both
kinds of node represent end users, and their edges represent the interaction
between pairs of end users (i.e. money sent or received).

Figure 4.2: Screen-shot of MMT simulation window

In total, 2000 end users were created from 7 different cities, performing
several transactions with partners either inside or outside of their city, as
shown in Figure 4.2a. As part of the simulation configuration, around 10% of
the end users behave as malicious agents (fraudsters), although in a
real-life scenario it is more common to find a lower percentage of
fraudsters. The rationale for using a high percentage of fraudsters is to
prevent the class imbalance problem during the training of the proposed
prediction model [LrA12b]. The transaction sequence is expected to follow
this pattern [Lla+11; Zhd+14]: (i) authentication, (ii) transmission of the
sender's payment instructions and transaction details to the MMT platform,
(iii) authorization by the MMT platform, (iv) credit and debit on the
receiver's and sender's accounts respectively. When a simulated user carries
out a transaction, all the transaction details are stored in a log file. Each
entry contains the transaction type, transaction amount, sender and receiver
pre- and post-transaction balance, sender and receiver category, and
transaction time-stamp [Zhd+14]. The generated database contains all
simulated transactions for six months.
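The authorization, debit/credit and logging steps described above can be sketched as follows. This is a simplified illustration: the field names, the `transfer` function and the example accounts are hypothetical, authentication is assumed to have happened already, and the daily limits of the account profiles are not checked.

```python
from dataclasses import asdict, dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class LogEntry:
    timestamp: str
    tx_type: str
    amount: float
    sender_pre: float
    sender_post: float
    receiver_pre: float
    receiver_post: float
    sender_category: str
    receiver_category: str


def transfer(balances: Dict[str, float], categories: Dict[str, str],
             sender: str, receiver: str, amount: float,
             when: datetime, log: List[dict], tx_type: str = "P2P") -> bool:
    # Steps (i) and (ii), authentication and instruction transmission,
    # are assumed to have succeeded before this call.
    # Step (iii), authorization: reject non-positive amounts and overdrafts.
    if amount <= 0 or balances[sender] < amount:
        return False
    sender_pre, receiver_pre = balances[sender], balances[receiver]
    # Step (iv): debit the sender and credit the receiver.
    balances[sender] = sender_pre - amount
    balances[receiver] = receiver_pre + amount
    log.append(asdict(LogEntry(
        when.isoformat(), tx_type, amount,
        sender_pre, balances[sender], receiver_pre, balances[receiver],
        categories[sender], categories[receiver],
    )))
    return True


balances = {"A": 2000.0, "B": 100.0}
categories = {"A": "customer", "B": "merchant"}
log: List[dict] = []
ok = transfer(balances, categories, "A", "B", 500.0,
              datetime(2018, 1, 5, 9, 30), log)
rejected = transfer(balances, categories, "B", "A", 5000.0,
                    datetime(2018, 1, 5, 9, 31), log)
```

Recording both pre- and post-transaction balances in each entry makes it possible to check later that every logged amount is consistent with the change in the balances.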

4.3.2 Input Parameters

The simulation starts with some basic initial input parameters that can be
provided from aggregated statistics or supplied randomly based on the user's
experience (some of the parameters used in the simulation were obtained from
the work of Lopez-Rojas in [LrA12a]). These values are automatically set as
the default values when the simulation is loaded, as shown in Figure 4.3.


Figure 4.3: Simulation window for input parameters

The input parameters are the initial parameters needed to run the
simulation. The values for these parameters can be entered via the simulation
console or input files, as shown in Figure 4.3. A detailed description of the
input parameters is given in Table 4.1.


Table 4.1: Simulation input parameters

Parameter            Default Value  Description
RandomMultiplier     0.1            Multiplier for execution
MaxNeighbour         10             Maximum number of friends a mobile money transfer user associated to a device can have
ClientsBalance       -              Client's balance, generated randomly during the execution of the simulation using a probabilistic function
MaxOtherNeighbour    2              Maximum number of friends a mobile money transfer user can delegate to when in an inactive state
UpgradeAccountRate   0.01           The rate at which accounts are upgraded from personal to business during the simulation
TransactionRate      0.5            The rate at which transactions are initiated for each mobile money transfer user
Trans                -              Transaction category
NumClients           2000           Number of mobile money transfer user devices
Types                -              Type of operation a user performs in the simulation
NumCities            7              Number of cities

The data simulation was run five times, and at each simulation the
parameters were varied (except for NumClients, NumCities, TransactionRate
and UpgradeAccountRate). This allows changes in the behavioural pattern of
each client (i.e. user) during each simulation. The different values used for
the parameters during the simulation are provided in Appendix B.


4.3.3 Output Parameters

The output parameters are based on the results of simulating the behaviour
of several clients interacting in a mobile money environment. A transaction
log is produced as output from the simulation (MABS). This log was built
to generate the attributes described in Table 4.3.

Table 4.3: Simulation output parameters

Parameter     Description
Time-stamp    Date and time of transaction
Client ID     Sender's account number
Usercategory  User's grouping according to their possible spending behaviour (as discussed in Section 4.4)
Profile       Sender's account type (P1-personal or P2-business)
Location      Sender's location
TypeTrans     Type of transaction operation (i.e. MD, MW, P2P, MP and AR)
Amount        Transaction amount
Balance       Sender's account balance after each transaction
ClientB ID    Receiver's account number
ProfileB      Receiver's account type (P1-personal or P2-business)
LocationB     Receiver's location
BalanceB      Receiver's account balance after each transaction
Suspicious    Transaction label (fraud or non-fraud)
FraudType     Class of misuse scenario described in Section 4.2.2

These parameters were selected because they are typical descriptors of any
mobile money transaction instance. The simulation process is discussed in
the next section.


4.3.4 Simulation Walkthrough

At the beginning of the simulation, the first step is to set up the agents
and their locations. The different clients that will be present in the
simulation are then randomly generated and each client is assigned an ID. A
client's state at each time depends on a Markov transition matrix that
determines when to change from Active to Inactive and from a personal to a
business account with higher limits on daily transactions. The clients in
this simulation have basic operations: they can make a deposit, withdrawal,
person-to-person transfer, pay a merchant, buy airtime or decide not to
perform any transaction. If a client needs to perform an action, it conducts
a local search within its network to see which of its neighbours are in an
active state. If the search is successful, it places a request for a type of
operation using a probabilistic transition function. The request placed
depends on the client's account balance, the daily limits on the client's
account type, and the user's spending habits category. When the balance is
high, the agent has a higher probability to make a withdrawal, transfer,
merchant payment or airtime recharge rather than a deposit. Figure 4.4
outlines these activities.


Figure 4.4: A flowchart representing the simulation walk-through

If the search is unsuccessful, the client can delegate a request to an
inactive client to conduct a local search within its own neighbourhood for a
mediator. Once this is achieved, a routing record is created with information
about the originator of the request. At each pass, the routing record is
updated with information about the intermediate requestor. If the search is
eventually successful, the delegated client places a request to perform an
operation on behalf of the initial requesting client. The delegation of
requests stops after a search is conducted in the neighbourhood one level
above the requesting client's level. For each simulation, the input parameter
values were modified in order to improve the quality of the simulation. A
total of 2000 end users were created from different cities, performing
several transactions with partners either inside or outside of their network.
The data simulation was run five times for a simulated period of six months.
At each simulation, the parameters were varied, except for the number of
clients and cities, to allow changes in the behavioural pattern of each
client (i.e. user) to be captured. The simulator stores transaction details
in a log file, and each entry contains information such as the transaction
type, amount, sender and receiver profile (ID, account type) and time-stamp.
The files generated were merged and cleaned to be used as input data for the
proposed prediction approach.

At the end of the data preparation phase, a total of 497,565 transactions

were generated with 5,900 transactions (1.2%) labelled as suspicious. That ra-

tio is a realistic representation of common class imbalance problems in financial

transactions dataset where one of the classes is very large in comparison to the

other one. In order to evaluate whether the dataset itself offers a realistic

representation, the log data was evaluated. This is discussed in Section 4.4

below.

4.4 Evaluation of the Log Data

The simulated MMT platform, format of the generated logs and Multi-agent

based simulator have been validated in [Gab+13], [Lla+11] and [LrA12b] re-

spectively. To evaluate the generated dataset in the absence of real world data

as input to the simulator, the simulated data was verified by checking con-

straints such as positive balance numbers, account age, consistency between

the transfers, deposits and withdrawals with the changes in account balances

(as earlier discussed in Section 2.3.5 & 2.4). In addition, to verify whether the

amounts and periods of the simulated data were normally distributed, as in the

case of Gaber et al. [Gab+13], a chi-square test was employed. For the purpose

of evaluating the simulated data, the end users to be used were selected


randomly and manually. However, to avoid too much complexity, the number

of end users used in the evaluation was limited to 20. Only the 905 users who

had made more than 60 transactions (the average number of transactions generated

by the end users) were considered. The end users were separated into four groups

according to the days of the week on which the mWallet account is used, as adapted

from [Gab+13]. This highlights the days of the week on which the mWallet

account is used most often and therefore provides a suitable representation of

spending behaviour frequency in the MMT platform, as described in Table 4.5.

Table 4.5: Groups of users according to possible days of the week the mWallet account is used.

Class Days of the Week Possible spending representation

S M T W T F S

A x x Low (up to 2 days per week)

B x x x Medium (up to 3 days per week)

C x x x x x High (up to 5 days per week)

D x x x x x x x Very high (up to 7 days per week)

It is worth pointing out, however, that habitual behaviours can be derived from

legitimate users' spending activity, i.e. the transactions of each mWallet

account holder include recurring patterns regarding shopping areas, the group

of persons with or to whom transfers are made, the amount spent, etc. [Kok97]. According to

Zhdanova et al. [Zhd+14], such a habitual behaviour is expected to have (i)

a type of transaction, (ii) a normally distributed transaction amount, (iii) a

normally distributed period of time between two transactions of the considered

habit, (iv) an initial date and (v) a final date. Based on these observations, the

verification of whether both the amounts and the periods are normally distributed

was carried out using a chi-square test. The rationale for using the chi-square test

is that it is easy to compute and robust with respect to the

distribution of the data [Mch13]. For the chi-square analysis, a significance

level of 5% was used to check whether the amounts and periods were normally

distributed. The results are summarized in Table 4.6.


Table 4.6: Results of chi-square test for each set of users

Class   MD                      MW                      P2P                     MP                      AR
        amount  period  user    amount  period  user    amount  period  user    amount  period  user    amount  period  user
A       100%    100%    5       100%    100%    5       100%    100%    5       100%    100%    5       100%    100%    5
B       100%    100%    5       100%    100%    5       100%    100%    5       100%    100%    5       100%    100%    5
C       100%    100%    4       100%    100%    5       100%    100%    5       100%    100%    5       100%    100%    5
D       100%    100%    5       100%    100%    5       100%    100%    5       100%    100%    5       100%    100%    5
Total   100%    100%    20      100%    100%    20      100%    100%    20      100%    100%    20      100%    100%    20


As shown in Table 4.6, the chi-square results are organized according to the

users' spending behaviour defined in Table 4.5 and the category of

transaction performed, each of which is considered a habit. The results in Table 4.6

show that, for all of the selected users, both the amounts

and the periods are normally distributed. On this basis, the dataset

is considered realistic. The uniformity of the results may, however, be due

to the fact that the average percentage of transactions used in the selection of

participants was 60%. The accuracy of the generated log could not be calculated

due to the lack of authentic sample data in the simulation.
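
The normality check described above can be sketched as a chi-square goodness-of-fit test. The following Python example is only an illustration under stated assumptions: the use of ten equal-probability bins, the fitted mean and standard deviation, and the function names are choices made here and are not documented in the thesis. With ten bins and two estimated parameters the test has 7 degrees of freedom, for which the 5% critical value is 14.067.

```python
import bisect
import statistics

BINS = 10               # assumption: 10 equal-probability bins
CHI2_CRIT_DF7 = 14.067  # 5% critical value, 10 - 1 - 2 = 7 degrees of freedom

def chi_square_statistic(values, bins=BINS):
    """Chi-square statistic of `values` against a fitted normal distribution."""
    fitted = statistics.NormalDist(statistics.mean(values),
                                   statistics.stdev(values))
    # Inner bin edges chosen so that every bin has probability 1/bins.
    edges = [fitted.inv_cdf(j / bins) for j in range(1, bins)]
    observed = [0] * bins
    for v in values:
        observed[bisect.bisect_right(edges, v)] += 1
    expected = len(values) / bins
    return sum((o - expected) ** 2 / expected for o in observed)

def is_plausibly_normal(values):
    """True if normality is not rejected at the 5% significance level."""
    return chi_square_statistic(values) <= CHI2_CRIT_DF7
```

The same function would be applied once to the amounts and once to the periods between transactions of each habit of a selected user.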

4.4.1 Quality of Data

The quality of data in information discovery, analysis and prediction is highly

important. Poor-quality data can produce unrealistic answers in analysis

and prediction problems [HSW07]. For this purpose, several queries were

executed to check for missing values, errors and outliers in the generated

dataset.

• Missing values: transaction data with missing values (unverified client

profile) were used as part of the analysis, since there were few

missing values compared to the total volume of the data. These missing

values were left for further analysis because they may give some insight

into the scenario of a retailer who is complicit in money laundering

activities and facilitates the opening of an account despite knowing

that the account will be loaded with funds coming from criminal

activities.

• Errors: in the generated dataset, constraints such as positive balance

numbers and consistency between the transfers, deposits and withdrawals

and the changes in account balances were checked. There were a few

entries with negative values for the amount transacted, which is impossible

since the MMT platform was not designed to accommodate overdrafts.

These entries were excluded from the analysis.

• Outliers: transaction data that are inconsistent with the remainder of

the dataset, or that deviate greatly from other observations, were identified

and examined (≈ 0.01% of the total transactions). In the context of

fraud analysis or detection, outliers have a higher probability of being

fraudulent [BK17]. For this reason, these entries were left in the data for

further analysis.
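
The three checks above can be sketched in Python as follows. The record layout (dictionaries with `profile` and `amount` keys), the z-score rule with a threshold of 3 for outliers, and the function names are assumptions made for this illustration; the thesis does not specify how the queries were written.

```python
import statistics

def quality_report(transactions, z_threshold=3.0):
    """Split transaction dicts into missing-profile, error and outlier lists."""
    missing = [t for t in transactions if t.get("profile") is None]
    errors = [t for t in transactions if t["amount"] < 0]
    valid_amounts = [t["amount"] for t in transactions if t["amount"] >= 0]
    mean = statistics.mean(valid_amounts)
    spread = statistics.pstdev(valid_amounts)
    outliers = [
        t for t in transactions
        if t["amount"] >= 0 and spread > 0
        and abs(t["amount"] - mean) / spread > z_threshold
    ]
    return {"missing": missing, "errors": errors, "outliers": outliers}

def clean(transactions):
    """Drop only the erroneous entries; missing values and outliers are kept."""
    return [t for t in transactions if t["amount"] >= 0]
```

Only the entries with negative amounts are dropped, mirroring the decision to retain missing values and outliers for later fraud analysis.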

The descriptive representation of the dataset after pre-processing follows

in the next Subsection.

4.4.2 Data Analysis

The generated dataset consists of 497,565 logged transactions

between 1 September 2013 and 28 February 2014. This section

presents the descriptive representation of the simulated mobile money dataset.

Table 4.7 summarises the properties of the simulation dataset, i.e. the minimum

and maximum values, mean and standard deviation. Figure 4.5 shows the numbers

of non-fraud and fraud transactions. The number of fraud and non-fraud transactions

for each of the five transaction operations performed is shown in

Figure 4.6. Figure 4.7 (see footnote 2) shows the categories of users based on their frequency

of transactions. Figure 4.8 shows the fraction of fraud types per category

of transaction in the MMT dataset. Figure 4.9 shows the total number of

different fraud types. Figure 4.10 shows a line plot of the amount fraction in

Kenya Shillings over the period of simulation. Finally, Figure 4.11 shows

a relationship graph for the 100 most active MMT users in the simulation.

2 The simulated data output parameters are described in Section 4.3.2.

Table 4.7: MMT dataset statistics

Attribute Minimum Maximum Mean StdDeviation

usercategory 1 4 2.497 1.121

profile 1 2 1.149 0.356

location 1 7 3.421 2.196

typeTrans 1 5 3.408 1.296

amount (KES) 0.18 139998.46 25124.354 25736.816

balance (KES) 0.03 139998.06 25301.94 26036.484

profileB 1 2 0.837 0.599

locationB 1 7 2.354 2.367

balanceB (KES) 0.05 139993.95 18301.084 24755.301

FraudType 1 4 1.024 0.235

suspicious 0 1 0.012 0.108


Figure 4.5: Number of non-fraud and fraud transactions

As can be seen from Figure 4.5 above, the MMT dataset has a ratio

of 98.8:1.2 for non-fraud to fraud transactions. This represents a

realistic instance of the common class imbalance problem in financial transaction

datasets, where one of the classes is very large in comparison to the other.

Figure 4.6: Different transaction services performed in the simulation

Figure 4.6 shows the distribution of the fraud and non-fraud classes for the different

types of MMT operations carried out (i.e. Deposit, Withdrawal, Transfer,

Pay Merchant and Buy Airtime).


Figure 4.7: Categories of users based on their frequency of transactions

Figure 4.7 above shows the distribution of the different classes

of transactions based on user spending behaviour frequency, as described in

Section 4.4.

Figure 4.8: Fraction of fraud types to category of transaction in the MMT dataset.

Figure 4.8 above shows the fraction of fraud types (as mentioned in Section

4.2.2) per category of transaction performed in the simulated dataset.

Figure 4.8 indicates that a high proportion of fraudulent transactions

were carried out in the Pay Merchant, Transfer and Withdrawal categories

of transaction.

Figure 4.9: Total number of different fraud types

Figure 4.9 shows the distribution of the three misuse scenarios introduced

into the data simulation as mentioned in Section 4.2.2 after the data pre-

processing phase.

Figure 4.10: Line plot of amount fraction in Kenya shillings (Sept. 2013 - Feb. 2014)


Figure 4.10 shows the daily fraction of the transaction amount over the

period of simulation. The relationships between the most active

users in the simulation are presented in Figure 4.11 below, produced with a social network

analysis tool (Gephi).

Figure 4.11: A relationship graph for 100 most active MMT users in the simulation. The blue and red nodes represent legitimate and bad actors respectively, and the edges represent relationships between the actors.


4.5 Chapter Summary

This chapter addressed the lack of publicly available datasets on

mobile money transfer by simulating transaction data using the synthetic data

generation method discussed in Section 2.3.2. To simulate the MMT platform,

the multi-agent based simulator (MABS) in [LrA12b] was used to generate mobile

money transfer transaction data using the misuse scenarios of the EU FP7 MASSIF

project [Lla+11]. As a distinct contribution, misuse scenarios involving SIM

swap and a retailer facilitating the opening of an end user account were modelled, which

were not considered in [Gab+13] and [LrA12b]. Also, to trace the sequence

of events in the generated data, a time-stamp parameter was included in the

simulation, rather than steps or category of users over a period of time as used

in [Gab+13] and [LrA12b] respectively. To ensure that the modelled data

characteristics are as close as possible to a real-world situation, combinations

of user habits as well as their behaviour were incorporated in the simulation,

as proposed by Gaber et al. [Gab+13]. From the results of the evaluation,

the generated dataset appears to be close to an actual transaction

dataset. This dataset will be used to train and test the proposed predictive

model to further verify its applicability to fraud detection tools.

Chapter 5

Design of Mobile Money Transfer Fraud Detection System

Throughout the literature review in Chapter 2, it was consistently shown that

pattern recognition models using machine learning algorithms can be used efficiently

for the identification of financial transaction fraud. However, there

is a need for a more efficient detection algorithm to address issues that can

affect their performance, such as changes in the spending habits of legitimate users

and the evolving sophistication of fraudsters.

This chapter presents the research approach towards the identification of

money transfer fraud in Mobile Money Transfer (MMT) environments. In order

to address the stated research questions, an integrated fraud detection

system in a mobile money transfer environment is used as a case study for the

ongoing research. Section 5.1 shows and describes the framework proposed

in this thesis to predict fraud in a financial transaction service environment.


This is used to illustrate the different components and algorithms used by

the approach, as well as the different experiments performed with the aim of

investigating their performance. Section 5.2 describes the simplified technique,

referred to in this thesis as the standard case-based reasoning model, and the

representation used for the similarity measurement of different mobile payment

transaction instances.

However, with the aim of improving the performance of the standard CBR

model, the approach was enhanced. Section 5.3 describes the problem

representation and how the standard CBR model was complemented with machine

learning capabilities for assigning parameter weights and automating the

random selection of the k-value so as to detect financial transaction fraud. Section

5.4 addresses the skewness of the sample data using a data sampling approach.

The applicability of the simulated dataset was tested by conducting

preliminary experiments using some machine learning algorithms. The rationale

is that these algorithms have been used successfully in previous studies, and

it is of interest to evaluate the simulated dataset with these algorithms before

using it to test the performance of the proposed CBR system. For that reason,

Section 5.5 describes the machine learning algorithms used in the preliminary

experiments in this research for detecting mobile money transfer fraud. Section

5.6 discusses the different performance evaluation metrics and the cross-validation

approach adopted. Finally, Section 5.7 concludes this chapter

by summarising the research approach adopted in this thesis.

5.1 Proposed Detection Method

The aim of a fraud detection system (FDS) in a financial transaction service

is to monitor and identify malicious transactions as early as possible,

while at the same time limiting the possibility of raising too many false alarms.

In a financial transaction service environment, a fraudster attempts to perform, or

performs, a series of transactions without being detected, while the FDS tries to

recognise any malicious behaviour. As a step towards performing these functions,

a detection framework is proposed using an augmented case-based reasoning

method with machine learning capabilities, as illustrated in Figure 5.1.

Figure 5.1: Proposed fraud detection framework

Figure 5.1 illustrates the architecture of the proposed framework. To detect

mobile money transfer fraud, a case-based reasoning (CBR) system is

proposed as the classifier. The reason for this choice is that, according to the

literature [PDM15; Sha+16], CBR has proven to be effective in the absence

of historical consumption data. It also shows continuous improvement as

more data becomes available over time. In addition, CBR has the ability to

realize knowledge transfer as spending habits evolve, as is the case where

information on one transaction is exploited to improve predictions for different yet

similar transactions [Ade+17]. As a result, CBR seems to be an appropriate

methodology as justified by several studies. The proposed framework consists

of three main components: Input, Process and Output. The discussion of these

components is as follows.

1. Input: Under the framework, the simulated MMT data were first

pre-processed by running several queries to verify their quality, as discussed

in Section 4.4.1. The dataset was highly skewed, with a ratio of 98.8:1.2

negative to positive instances, which is a major characteristic of financial

transaction datasets. Neglecting data skewness in this approach

would lead to less accurate predictions. Therefore, the ability to deal

with the imbalanced data in the learning problem, by selecting the best

data balancing approach, provides an effective foundation for the sampling

methods. After the data preparation phase, a clustering algorithm is

applied. The clustering algorithm helps to guide and reduce the search

space by collecting the most similar clusters. This allows the identification

of cases collected under similar circumstances and limits the retrieval

to these cases. The rationale behind this is to provide a structured case

base that will guide and speed up the retrieval process for the case-based

classification. This will also help to address the computational complexity

associated with the use of GA for feature selection as more data becomes

available over time.

2. Process: In this component, the CBR classifier is used to classify new

instances of MMT transactions as either fraud or non-fraud cases. In order

to exploit the flexibility of weighting all the input vectors in the process

component, a Genetic Algorithm (GA) was used to calculate their weights

so as to reflect the significance of each vector as determined by the GA

procedure. The GA is also used to automate the random selection of the

k-value for the CBR classification. To perform the similarity measurement

and retrieval task in the classification using CBR, a k-Nearest

Neighbour algorithm is executed using the Euclidean distance metric.

3. Output: As indicated in Figure 5.1, the output section provides a sum-

mary window for the prediction. It will provide ranking of clusters of

transaction neighbours for new cases which may operate as an effective

tool for experts to develop preliminary insight into suspicious transac-

tions which can then be investigated in more detail.

In order to evaluate the proposed detection framework, the experiments

were carried out in an evolutionary manner, as discussed

in Chapter 6. The initial set of experiments started with the application

of basic techniques (Standard CBR) to minimalistic datasets. The

remaining experiments then progressed to the application of advanced

techniques (Weighted CBR) to more sophisticated datasets of both high

complexity and significant size. The discussion of the different stages of the

experimental design follows.

5.2 Standard CBR Model

Standard case-based reasoning methodology (henceforth referred to as Std-

CBR) can be used as a classification technique. It classifies an unlabelled case

by retrieving closely matching labelled cases and reusing their labels. For the

purpose of case classification, k−nearest neighbour (kNN) was used to predict

mobile money transfer fraud. The kNN requires defining the case represen-

tation and the similarity function, which may employ algorithms for feature

selection or weighting [MGA04]. In order to compute the similarity between

a new case and previously experienced case instances, the weighted Euclidean

distance metric was used owing to its simplicity:

$$ d = |Z - X| = \sqrt{\sum_{i=1}^{m} w_i \, |Z_i - X_i|^2} \tag{5.2.1} $$

where w_i is the weight of vector (attribute) i, Z is the query (new case),

X is the source (retrieved case), m is the number of vectors in each case, and

i indexes the individual vectors from 1 to m. The idea of this algorithm is to select

the k stored cases that are most similar to the new

case and to assign the class of these nearest neighbour(s) to the new case.

Each neighbouring case has a classification (categorical) variable of interest

(e.g. fraudulent or non-fraudulent) associated with it. Therefore, to choose the

class of an incoming transaction Z, the incoming transaction

is compared with all previously stored cases using the distance measure d.
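
Equation (5.2.1) and the nearest-neighbour classification it supports can be sketched in a few lines of Python. This is a minimal illustration, assuming that cases are stored as (feature tuple, label) pairs and that the class is decided by a simple majority among the k nearest cases; the function names are hypothetical.

```python
import math
from collections import Counter

def weighted_distance(z, x, weights):
    """Equation (5.2.1): weighted Euclidean distance between two cases."""
    return math.sqrt(sum(w * (zi - xi) ** 2
                         for w, zi, xi in zip(weights, z, x)))

def classify(query, case_base, weights, k=3):
    """Label `query` with the majority label of its k nearest stored cases."""
    nearest = sorted(
        case_base,
        key=lambda case: weighted_distance(query, case[0], weights),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

With all weights set to the same value this reduces to the standard (unweighted) kNN used by StdCBR.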

5.3 Weighted CBR Model

This chapter proposes an improved CBR approach (referred to in this thesis as

Weighted CBR) for the identification of money transfer fraud in Mobile Money

Transfer (MMT) environments. Here, StdCBR capability is augmented by ma-

chine learning techniques to assign parameter weights in the sample dataset

and automate k-value random selection in k-NN classification to improve CBR

performance. The CBR system observes users’ transaction behaviour within

the MMT service and tries to detect abnormal patterns in the transaction

flows. To capture user behaviour effectively, the CBR system classifies the


log information into five contexts and then recombines them into a single di-

mension. This is done instead of using the conventional approach where the

transaction amount, time dimensions or features dimension are used individ-

ually. The applicability of the proposed Weighted CBR system is evaluated

using simulation data.

5.3.1 Problem Representation

There are several factors that affect consumer spending behaviour such as

lifestyle, age, and income group which may evolve over time [SM70]. This

indicates that consumer transaction behaviour is temporal in nature. In addi-

tion, as consumer spending behaviour is temporal and most individuals exhibit

consistent spending habits, an event-driven chain of transactions [KPK10] can

be a robust representation of patterns [JTPL03]. To represent such behavioural

pattern of users, it is necessary to define events that model the MMT process.

According to [Rie+13], it is challenging to extract a workflow pattern of trans-

actions from event flow of mobile money systems since any user is free to use

the system as they wish (for instance, a user can have unlimited transactions

of different amounts, various frequencies, pay for activities of their choice,

etc.). For this reason our events representation was generated from the users’

behaviour in the mobile money system.

In mobile money transfer transaction processing, the spending behaviour

contains information about the transaction amount, time gap since last trans-

action, day of the week, etc. The transaction amount, frequency, and time are

closely related to spending behaviour of a person which are actually influenced

by income, resource availability, and lifestyle of the person. In most conven-

tional fraud detection systems (FDS), the transaction amount is considered


as the most important parameter for fraud detection. Also, previous research

work [KSM06], has shown that the efficiency of any FDS is associated with the

dimensions of amount and time which were used separately. In [Kun+09], the

authors combined these two dimensions into a single one and achieved a significant

improvement in performance accuracy. In this work, the approach

in [Kun+09] was extended by classifying the features in our MMT transaction

dataset into five contexts and recombining them into a single dimension to capture

user behaviour. For example, each client's transaction on the mWallet account

is denoted by a quintuple,

transaction_instance = [Transtype, Client, Interval, Location, Amount]

where:

1. Transtype: Transaction type entities

2. Client: Features of entities (client ID, Profile e.g savings or current

account, account balance, spending habit category)

3. Interval: Features of entities (Month of the year, Day of the Week).

4. Location: Location entities

5. Amount: quantization of amount entities into finite levels up to maxi-

mum daily spending limit.

For the CBR system problem formulation, consider the total set of

transaction instances E in the log file, where each instance is a quintuple

vector (v_i) representing the different contexts of information in a transaction:

$$ v_i = [\, c_1, c_2, \ldots, c_5 \,] \tag{5.3.2} $$

where c is the context of information type.

Each query instance (Z) is composed of 5 vectors (v_i) representing the

different contexts of information:

$$ Z = [\, v^z_1, \ldots, v^z_5 \,] \tag{5.3.3} $$

For the case base representation, each case (X) contains a description (D)

with the corresponding solution vector (S), i.e. an outcome tag (y) associated

with each transaction instance. That is,

$$ X = [\, D_x, S_x \,] \tag{5.3.4} $$

where

$$ D_x = [\, v^x_1, \ldots, v^x_5 \,] \tag{5.3.5} $$

$$ S_x = y_1, \ldots, y_n \tag{5.3.6} $$

In the experiment, the value of n is set to n = 2, as there are only two

possible transaction outcomes (non-fraudulent and fraudulent) in the MMT

dataset.
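
As a rough illustration of this representation, the sketch below models a case X = [D_x, S_x] with Python dataclasses. The field names, the contents of the `client` and `interval` tuples and the equal-width quantisation into 10 levels are assumptions introduced here for illustration; the thesis does not state the number of amount levels.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Description:
    """The five contexts of a transaction instance (D_x, or a query Z)."""
    trans_type: str
    client: tuple      # e.g. (client ID, profile, balance, habit category)
    interval: tuple    # e.g. (month of the year, day of the week)
    location: int
    amount_level: int  # quantised amount

@dataclass(frozen=True)
class Case:
    """A stored case X = [D_x, S_x]."""
    description: Description
    solution: str      # outcome tag y: "fraud" or "non-fraud" (n = 2)

def quantise_amount(amount, daily_limit, levels=10):
    """Map an amount to one of `levels` equal-width bins up to the daily limit."""
    if amount >= daily_limit:
        return levels - 1
    return int(amount / daily_limit * levels)
```

A query Z would then simply be a `Description` without an associated solution.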

5.3.2 Case Similarity

For the purpose of similarity measurement, the similarity between two cases is

defined as a weighted average of the vector similarities. In [Ade+16], where the

StdCBR approach was used, the flexibility of weighting was not exploited, i.e.

all weights were simply set to the same value. However, since different vectors

are clearly of different importance, it was decided to take

advantage of the capability of genetic algorithms to assign weights that reflect such

differences more accurately. The CBR system was used in the following way:

Retrieval: During the retrieval process, an ordered list of the k (see footnote 1) most similar

cases to the query was retrieved and returned. This was implemented using

a k-Nearest Neighbour algorithm (see footnote 2). The overall similarity value was computed

by weighting the local similarity of each vector (v_i). That is, the local similarity of each pair of

vectors [v^z_i, v^x_i] is computed and the resulting value is weighted with a value

(w_i) that represents the relevance of the corresponding vector in the global

similarity computation:

$$ \mathrm{Sim}(Z, D_x) = \sum_{i=1}^{5} w_i \cdot \mathrm{Sim}_i(v^z_i, v^x_i) \tag{5.3.7} $$

where

$$ \sum_{i=1}^{5} w_i = 1 \tag{5.3.8} $$

Reuse: To obtain the solution for the query (Z), the proposed CBR system

applies a weighted voting scheme according to the similarity of the retrieved

cases, using the scoring function:

$$ \mathrm{score}(p_i) = \sum_{x \,:\, S_x = p_i} \mathrm{Sim}(Z, D_x) \tag{5.3.9} $$

Therefore, the solution assigned to the query is:

$$ p_i = \arg\max \{\, \mathrm{score}(p_i),\ i = 1, \ldots, k \,\} \tag{5.3.10} $$

where k is the number of nearest neighbours assigned during retrieval by the

kNN algorithm.

1 For the purpose of this experiment, as explained in Section 6.2, the values k = 3, 5, 7 and 9 will be used.

2 Weighted Euclidean distance was used for the k-NN algorithm because it is easy to understand and interpret [GG16].
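
The retrieval and reuse steps above can be sketched as follows. This Python example is illustrative only: the local similarity measure (1 for identical context values, 0 otherwise) is an assumption, since the local measures are not specified here, and the function names are hypothetical.

```python
from collections import defaultdict

def local_similarity(a, b):
    """Assumed local measure: 1 for identical context values, 0 otherwise."""
    return 1.0 if a == b else 0.0

def global_similarity(query, description, weights):
    """Equation (5.3.7): weighted sum of the local similarities."""
    return sum(w * local_similarity(q, d)
               for w, q, d in zip(weights, query, description))

def retrieve(query, case_base, weights, k):
    """Return the k most similar (similarity, solution) pairs, best first."""
    scored = [(global_similarity(query, d, weights), s) for d, s in case_base]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]

def reuse(retrieved):
    """Equations (5.3.9) and (5.3.10): similarity-weighted vote over solutions."""
    score = defaultdict(float)
    for similarity, solution in retrieved:
        score[solution] += similarity
    return max(score, key=score.get)
```

A query is then classified with `reuse(retrieve(query, case_base, weights, k))`, where the weights sum to 1 as required by Equation (5.3.8).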

The two final steps of the CBR cycle (revise and retain) fall

outside the scope of this research work. While they are an important part

of the CBR cycle, they will be carried out by the fraud analyst as part of the

fraud identification process, i.e. the

judgement of a fraud analyst becomes critical at this stage, and as a result these

steps are not automated in this research.

5.3.3 CBR Model Feature Weighting

In order to exploit the flexibility of weighting all the input vectors, a Genetic

Algorithm (henceforth GA) was used to calculate their weights so as to reflect

the significance of each vector. Weighting

of variables using GA has been used extensively in the literature, for instance

in [AKH06; Eka03; JADaRg14; Man+12] (see footnote 3). The rationale (see footnote 4) for using GA is

that it allows a better exploration of the search space and thus tends to produce

a better result [BGK05]. For the purpose of the present work, the method

in [JADaRg14] was adapted for the configuration of the GA in obtaining the

weights for each of the vectors. In the experiment, the GA uses a population

of individuals representing the different weights, and the

population evolves over generations until the individual with the best performance is

returned.

3 This has been used in other domains as discussed in Section 3.5.

4 Other methods, such as gradient descent, were not chosen because they can get stuck in a local minimum and their performance depends on the initial values of the design variables [DCM93].

Each individual contains both the vector weights and the k parameter

of the k-Nearest Neighbour algorithm, in order to estimate the best number

of cases that must be retrieved to classify new transactions. For

configuration, the genetic algorithm was run with an initial population of 1000

individuals. Each individual contains the weight of each vector and the value

of k (see footnote 5; a random value of 3, 5, 7 or 9) that the CBR model uses in the retrieval

stage. At the initial stage, a random value was assigned to each weight,

and the weights were then normalised to sum to 1. The following cycle is repeated

until there is no further improvement in the performance of the best individual

in the population:

1. Evaluation: At this stage, the genetic algorithm executes a cross-validation

of the CBR system configured with the weights and k value of each

individual in the population. The resulting performance is the fitness of

the individual.

2. Remove: After the evaluation of all individuals in the population, the 25% of

the population with the worst fitness is removed.

3. Cross-over: To replace the individuals that were removed (i.e. the 25%

removed after the fitness evaluation), the genetic algorithm

combines the individuals with the best performance. During

the cross-over process, the parent individuals are taken in pairs and

combined to form a new individual called a child. The weights of

each child are the average of the parents' weights

(normalised), and the value of k is computed analogously.

5 The range 3 to 9 was used for k so as to keep the computation cost of the augmented CBR system scalable.

4. Mutation: During the mutation step, individuals,

along with their weights, are chosen randomly for modification,

using 5% of the population. This modification helps the search to avoid local

maxima.
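
The four-step cycle above can be sketched in Python as follows. This is a simplified illustration, not the configuration used in the experiments: the fitness function is supplied by the caller (in the thesis it is the validated performance of the CBR system), the population size defaults to 40 rather than 1000, the `patience` stopping rule and the snapping of the averaged k to the nearest allowed value are assumptions, and all names are hypothetical.

```python
import random

ALLOWED_K = (3, 5, 7, 9)

def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]

def random_individual(n_weights, rng):
    """An individual is a pair (normalised weights, k)."""
    return normalise([rng.random() for _ in range(n_weights)]), rng.choice(ALLOWED_K)

def crossover(a, b):
    """Child weights are the parents' average; k is averaged, then snapped."""
    weights = normalise([(wa + wb) / 2 for wa, wb in zip(a[0], b[0])])
    mean_k = (a[1] + b[1]) / 2
    return weights, min(ALLOWED_K, key=lambda k: abs(k - mean_k))

def mutate(individual, rng):
    weights = list(individual[0])
    weights[rng.randrange(len(weights))] = rng.random()
    return normalise(weights), rng.choice(ALLOWED_K)

def evolve(fitness, n_weights=5, pop_size=40, max_generations=50,
           patience=5, seed=0):
    rng = random.Random(seed)
    population = [random_individual(n_weights, rng) for _ in range(pop_size)]
    best, best_fit, stale, history = None, float("-inf"), 0, []
    for _ in range(max_generations):
        # 1. Evaluation: rank the individuals by fitness.
        ranked = sorted(population, key=fitness, reverse=True)
        top_fit = fitness(ranked[0])
        if top_fit > best_fit:
            best, best_fit, stale = ranked[0], top_fit, 0
        else:
            stale += 1
        history.append(best_fit)
        if stale >= patience:
            break  # no further improvement of the best individual
        # 2. Remove the worst 25% of the population.
        survivors = ranked[: pop_size - pop_size // 4]
        # 3. Cross-over: refill from pairs of the best individuals.
        children = [crossover(survivors[i], survivors[i + 1])
                    for i in range(pop_size // 4)]
        population = survivors + children
        # 4. Mutation: modify a random 5% of the population.
        for idx in rng.sample(range(pop_size), max(1, pop_size // 20)):
            population[idx] = mutate(population[idx], rng)
    return best, history
```

Calling `evolve(fitness)` returns the best (weights, k) pair found together with the best fitness recorded at each generation.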

In the evaluation stage, the genetic algorithm executes the cross-validation

of the CBR system using the hold-out function (see footnote 6). The rationale for using the

hold-out function is to minimise the computation cost of running the experiment.

The next Section discusses the pre-processing of the dataset.

5.4 Data Pre-processing

In a real-world fraud detection scenario, the number of negative instances,

which are legitimate records, far exceeds the number of positive

instances, which are illegitimate records [GS17]. According to [PF01], the

ratio of negative to positive instances may be as high as 99:1 in the fraud domain.

This is because fraud is rare. In the case of mobile transfer fraud in this study,

the number of normal instances (usual transfer behaviour) exceeds the

number of suspicious instances (unusual transfer behaviour). Because the

dataset is highly skewed, with a very small number of positive instances, the positive

class is hard to classify correctly, yet it is the class that is important to detect [AKJ04; GC03].

To deal with the imbalanced learning problem, the best strategy needs to

be selected. Therefore, for the mobile money fraud classification task, the data

sampling approach was chosen for the reasons below [GS17; WMZ07]:

6 The Hold-Out function is an evaluator function in the jColibri2 framework that splits the case base into test cases (queries) and training cases (normal case base) [RGDAGC08].

• Sampling methods perform with similar, if not better, effectiveness

than cost-sensitive and algorithmic learning.

• In both cost-sensitive and algorithmic learning techniques, determining

the misclassification cost depends on the problem domain and is considered

a challenging task.

In the literature, over-sampling and under-sampling techniques have both been widely used, but neither appears to be a clear winner. Which kind of sampling yields better results depends largely on the training dataset being balanced [GS17; WMZ07]. Furthermore, the authors of [Cha+02] combined over-sampling and under-sampling techniques. As a result, the initial bias of the learner towards the negative (majority) class was reversed in favour of the positive (minority) class. Thus, the hybrid strategy of over-sampling and under-sampling works better than either one alone [Cha+02; GS17].

For the purpose of balancing the training dataset in this research, a hybrid strategy was adopted using two of the most widely used over-sampling and under-sampling schemes, namely the Synthetic Minority Over-sampling Technique (SMOTE) and Tomek-Link respectively. Both methods have been widely used to deal with class imbalance problems [GS17]. The next Section discusses the preliminary experiments conducted after data pre-processing.


5.5 Preliminary Experiment

In this thesis, as part of the background study⁷, preliminary experiments were conducted using some robust machine learning algorithms. In the literature, several machine learning algorithms have been used to predict financial transaction fraud, and they have all shown promising results. However, one of the challenges of deploying machine learning tools for fraud detection is selecting the right machine learning methodology. A successful selection of a prediction algorithm [Bur17] for the identification of money transfer fraud depends on three important aspects: (1) There should be a mechanism for feature selection or dimensionality reduction on sample sets, so as to improve the estimators' accuracy scores or to boost their performance on very high-dimensional datasets, since in certain types of data there is a large number of features compared to the number of data points. (2) The algorithm should compute predictions fast enough to support decision making. (3) The algorithm should have a continuous learning ability, by which it keeps learning over time as more data becomes available. That is to say, the algorithm has to constantly use historical data for learning and adapt its predictions to the dynamics of users' behaviour. These requirements can be addressed with supervised machine learning algorithms.

In supervised learning, in order to understand the performance of the learning algorithms and to gain insight into the problem, it is helpful to formulate the learning problem [Bur17]. In the preliminary experiment of this thesis, the association between a categorical dependent variable and independent variables with either continuous or discrete values is defined. The problem is therefore formulated as a classification task. For the purpose of representing the learning problem, let Y be an unknown outcome to be predicted for a new test sample. A predictive model M (i.e. a function with adjustable parameters) is then built and used to discover this unknown outcome. Let X be the training examples, which are used to select an optimum set of parameters for the classifier.

⁷ The rationale for the preliminary experiments was to evaluate the sample data in this experiment with some existing machine learning algorithms (having them as a baseline) before using it to test the performance of the proposed CBR system.

Definition 1 Let $X$ be a set of input variables $x_i$ and $Y$ be a label with classes $c_i$. $D_{tr}$ is the training set of instances $(x_i, c_i)$, where $D_{tr} = \{(x_i, c_i)\}$, $x_i \in X$, $c_i \in Y$ and $1 \le i \le n$, where $n$ is the total number of instances. The classification problem is to determine a model $M(X, Y)$ such that it maps $x_i$ to target classes $c_i$.

The process of training and testing the model is preceded by dividing the data $D$ into two sets: a training dataset $D_{tr}$ and a test dataset $D_{ts}$. The training dataset $D_{tr}$ is used for training the classifier, whereas the test dataset $D_{ts}$ is used for the actual prediction. The data $D$ is split into training and test datasets in different proportions to evaluate the model (this process is known as cross-validation). The variable for prediction $Y$ is considered known in the training dataset $D_{tr}$ but unknown in the test dataset $D_{ts}$; thus, $Y$ is predicted by the trained model $M(X, Y)$ on the test dataset $D_{ts}$.

The choice of machine learning algorithms is influenced by results from the literature reviewed in Chapter 3. As a result, four well-known baseline classifiers, Logistic Regression, Random Forest, Support Vector Machine and Artificial Neural Network, were chosen as algorithms for the experiments in this thesis. Logistic regression is a generalized linear model that is easy to use and is one of the most commonly used techniques for data mining in practice, but it is vulnerable to overconfidence [Bha+11]. The random forest classifier has the ability to capture non-linear relationships in data and shows high scalability with a better visual representation of results, but it is liable to over-fit. The support vector machine has a regularization parameter that is used to prevent over-fitting [Sud+10]. Artificial neural networks can show higher prediction accuracy, but they are much more computationally expensive and their predictions are hard to interpret.

5.6 Evaluating the Efficiency of Prediction

In a supervised learning problem an algorithm is assessed based on its overall

accuracy to predict the correct classes of new and unseen observations. In

order to measure the capability of a trained model M(X, Y ), a test set Dts is

introduced:

Definition 2 Let a test set be defined as $D_{ts} = \{(x_j, y_j)\}$, where $x_j \in X$, $y_j \in Y$, $1 \le j \le m$, $X$ is a variable, $Y$ is a variable for prediction, and $m$ is the number of data entries. There are no common elements in the sets $D_{tr}$ and $D_{ts}$: $D_{tr} \cap D_{ts} = \emptyset$, where $D_{tr}$ is a training set. The set $D_{ts}$ is used to evaluate the performance of the model $M(X, Y)$. Such evaluation can be performed by comparing the true values $y_j \in Y$ against the predicted values $\hat{y}_j$ for $x_j$, for all $(x_j, y_j) \in D_{ts}$.

In classification tasks, standard measures such as TPR, TNR and Accuracy are misleading assessment measures for unbalanced class problems [Pro00], as mentioned earlier in Section 3.3. A well-accepted measure for unbalanced classification is the Area Under the ROC Curve (AUC) [Cha05]. This metric measures how close the ROC curve is to the point of perfect classification. Therefore, for the purpose of this thesis, four metrics have been employed: Recall, F-measure, Matthews Correlation Coefficient (MCC) and Area Under the ROC Curve (AUC). According to [GS17], these four metrics handle imbalanced data efficiently without becoming biased towards the majority class, and they are also highly suitable for the fraud detection domain:

1. Recall:

$$\text{Recall} = \frac{TP}{FN + TP} \qquad (5.6.11)$$

Recall focuses on the significant class (fraud) and is not sensitive to data distribution.

2. F-measure: It is also known as F-score ($F_1$).

$$F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (5.6.12)$$

Precision is computed as $\frac{TP}{TP + FP}$. F-score is a good metric due to its non-linear nature and has been used for fraud detection.

3. Matthews Correlation Coefficient (MCC):

$$\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \qquad (5.6.13)$$

MCC is considered one of the best single assessment measures and is less influenced by imbalanced data [Cha+11a; GS17].

4. Area Under the Curve (AUC):

$$\text{AUC} = 1 - \frac{FP_{rate} + FN_{rate}}{2} \qquad (5.6.14)$$

where $FP_{rate} = \frac{FP}{FP + TN}$ and $FN_{rate} = \frac{FN}{TP + FN}$. AUC evaluates the overall classifier performance and is very appropriate for class imbalance [Pow07].
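The four measures above can be computed directly from the confusion-matrix counts. The following is a minimal sketch, not the code used in the thesis: the function name `fraud_metrics` is hypothetical, the false positive rate is taken as FP / (FP + TN) (the usual definition), and zero denominators are mapped to 0.0 as a convenience assumption.

```python
import math


def fraud_metrics(tp, fp, tn, fn):
    """Recall, F1, MCC and single-point AUC (Eqs. 5.6.11 to 5.6.14) from counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    fp_rate = fp / (fp + tn) if (fp + tn) else 0.0  # assumed definition
    fn_rate = fn / (tp + fn) if (tp + fn) else 0.0
    auc = 1 - (fp_rate + fn_rate) / 2
    return {"recall": recall, "precision": precision,
            "f1": f1, "mcc": mcc, "auc": auc}


# Toy counts chosen only for illustration (not thesis data).
example = fraud_metrics(tp=8, fp=2, tn=88, fn=2)
```

With these definitions the AUC of Eq. 5.6.14 equals the mean of the true positive rate and the true negative rate, so it is not dominated by the majority class.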

5.6.1 Cross-Validation

Cross-validation, sometimes called rotation estimation, is a computationally intensive technique for evaluating predictive models by partitioning the sample dataset into a training set to train the model and a test set to evaluate it [Koh95]. In k-fold cross-validation, the dataset $D$ is randomly split into k mutually exclusive subsets (the folds) $D_1, D_2, \ldots, D_k$ of approximately equal size. As illustrated in Figure 5.2, the data is first divided into k parts, and then k models are built in turn, each on k − 1 parts of the data.

Figure 5.2: Schematic representation of k-fold cross-validation

Each model is then tested on the remaining, held-out part of the data. For example [Dan17], in 5-fold cross-validation the sample data is divided into 5 pieces, each being 20% of the full dataset, as shown in Figure 5.3 below:


Figure 5.3: Schematic representation of 5-fold cross-validation

The experiment starts by running the first model, which uses the first fold

as holdout set and everything else as training data. This gives a measure of

model quality based on a 20% holdout set. Then in the second model, data

from the second fold are held out (using everything except the 2nd fold for

training the model). This gives a second estimate of the model quality. The

process is repeated using every fold once as the holdout. Putting this together,

100% of the data is used as a holdout at some point. This approach is employed

in this thesis for evaluating the prediction accuracy for the different classifiers

used.
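The fold rotation described above can be sketched as follows. This is an illustrative outline only: `k_fold_indices` and `cross_validate` are hypothetical helper names, and `score_fn` stands in for whatever routine trains a classifier on the training indices and scores it on the held-out indices.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k mutually exclusive folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k folds of near-equal size
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def cross_validate(n, k, score_fn, seed=0):
    """Average score_fn(train_idx, test_idx) over the k folds."""
    scores = [score_fn(tr, te) for tr, te in k_fold_indices(n, k, seed)]
    return sum(scores) / len(scores)
```

With n = 100 and k = 5, each model is trained on 80 instances and tested on the remaining 20, and every instance is held out exactly once.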

5.7 Chapter Summary

Chapter 5 introduced the research approach for the identification of money

transfer fraud in Mobile Money Transfer (MMT) environments. To achieve

this, the research questions stated in Chapter 1 were investigated. In order

to apply similarity measures among transaction event sequences, the log in-

formation from the simulation data was classified into five contexts and then


recombined into a single dimension rather than the conventional approach

where transaction amount, time dimensions or features dimension are used

individually.

Two similarity algorithms are presented in this chapter: StdCBR and a Weighted CBR model. The weighted system uses a combination of CBR and GA as a tool to assign the significance levels (weights) of the features, with the aim of improving the performance of the proposed CBR approach. Finally, some machine learning algorithms were selected and discussed. These were used to carry out the preliminary experiment, as further discussed in Appendix A.1.

Chapter 6

Experiments and Validation

The previous chapter described the approach used in this thesis towards the identification of money transfer fraud in Mobile Money Transfer (MMT) environments. In order to implement and apply the CBR methodology, a custom CBR model was designed and developed using the jCOLIBRI framework [RGGCDA14], a Java framework that supports rapid prototyping of a CBR system as well as its development and deployment in real scenarios.

For the purpose of evaluating the research methodology, different sets of experiments were designed and conducted following the evolutionary approach described in Section 5.1. The initial set of experiments started with the application of the basic CBR technique (StdCBR) on a minimalistic dataset. After that, the remaining experiments progressed to the application of advanced techniques (Weighted CBR) on further datasets that exhibit both high complexity and significant sample size. The rationale for this is to assess any improvements achieved and verify the overall efficiency of the proposed CBR system.


This chapter presents the evaluation process of the research, together with the results produced and the outcomes of the series of experiments undertaken. The outline of this chapter is as follows. Section 6.1 discusses the initial experiments conducted using the StdCBR approach with the preliminary dataset. Section 6.2 addresses the research motivation for the identification of mobile money transfer fraud using the enhanced approach and the experiments undertaken towards its evaluation. Furthermore, this section sets out the reasons that led to the adoption of clustering and gives an overview of the experiments conducted using the CURE clustering algorithm. The proposed CBR approach was evaluated using Recall, F-measure, Matthews Correlation Coefficient and Area Under the Curve (AUC)¹. Finally, Section 6.3 concludes with discussions on the results of the research experiments.

6.1 First Set of Experiments

In the first experiment, sets of preliminary experiments were designed to evaluate the proposed approach discussed in Chapter 5. For evaluating the CBR model, a stepwise approach was adopted for the dataset used. An initial familiarisation with a preliminary dataset was conducted to evaluate whether a StdCBR model (as previously discussed in Section 5.2) is able to efficiently identify simple patterns in an already known, pre-classified case. A more complex case with another class of different fraud types was then used in an attempt to promote its prediction aptitude. The reason for this breakdown of experiments is to allow better approximation and handling of the results, as well as gradual evaluation of and reflection on its capabilities.

¹ Recall, F-measure and AUC take values from 0 to 1; MCC takes values from −1 to 1.


The dataset used for this experiment was selected from the evaluated synthetic dataset discussed in Section 4.4, where only users who had made more than 60 transactions were randomly selected. The records were labelled as fraudulent or non-fraudulent, the labels serving as key indicators of whether the transactions are problematic or not. This label classification can be used when the cases are not complex. With this in mind, the experiments were designed to investigate whether the system could perform successful identification with an elementary dataset and also to check the overall feasibility of the system.

To proceed further, a simulation was designed as the most suitable approach, since one of the prior targets was to evaluate the feasibility of the system while classifying the elementary dataset. The simulation was carried out using 2000 cases from the event logs. These were split randomly into a case base of 1800 cases and a test sample (target sample) of 200 cases. The classification of the target sample was based on simple voting using the kNN algorithm with k = 3. The 3 nearest neighbours were retrieved for each case as in equation 4.2. Each target case was classified as fraudulent or not fraudulent based on the votes of its three nearest neighbours. In order to obtain more reliable results, the evaluation run was repeated 10 times and the classification results were averaged over the 10 runs. This approach was applied to all experiments conducted in this thesis. Results for the StdCBR model are shown in Table 6.1.


Table 6.1: Evaluation of StdCBR classifier on small MMT dataset.

Performance measure                StdCBR
Recall                             0.778
F-measure                          0.579
Matthews Correlation Coefficient   0.573
Area under the ROC Curve           0.865

From Table 6.1 above, it can be observed that the StdCBR classifier was able to correctly classify 78% of fraudulent transactions. Although the results show a fairly positive monitoring performance for the StdCBR system on a simplistic dataset, the F-measure score, which is considerably lower than the recall, indicates a low precision score for the StdCBR classifier. Such a result could lead to valid customer transactions being declined (i.e. a high number of false positives).
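Table 6.1 does not report precision directly, but it can be recovered from the recall and F-measure by rearranging Eq. 5.6.12. The short sketch below does this; the helper name `precision_from_f1` is hypothetical, and the result is only as accurate as the three-decimal rounding of the table values.

```python
def precision_from_f1(f1, recall):
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    return f1 * recall / (2 * recall - f1)


# Values taken from Table 6.1 (StdCBR on the small MMT dataset).
precision = precision_from_f1(f1=0.579, recall=0.778)
```

The implied precision is approximately 0.46, which suggests that roughly half of the transactions flagged by StdCBR on this dataset would be legitimate.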

The application of the StdCBR system was further used to provide explana-

tions on the similarity measures for the retrieved cases as shown in Figure 6.1.


Figure 6.1: Transaction neighbours summary

From the interface in Figure 6.1, the 3 nearest neighbours can be seen for each new case, including their classification (fraud as 1 and non-fraud as 0) as well as their similarity scores. This can provide good insight for experts carrying out final-line case investigations after the existing detection system has been utilised.

The initial experiment and evaluation of the StdCBR system has shown it to be effective for monitoring a simple dataset with the default StdCBR configuration. As a further step, more datasets (i.e. datasets that exhibit both high complexity and significant sample size) with an improved CBR configuration could be used to evaluate the classification precision of this research approach. That motivates the design and implementation of the second experiment, as discussed in the next section.

6.2 Second Set of Experiments

This set of experiments is motivated by the encouraging results produced by the preliminary evaluation of the CBR approach. In order to verify the classification precision of the approach, there is a need for testing with more datasets that exhibit both high complexity and significant sample size. This was clearly indicated by the evaluation and analysis of the previous experiment. Therefore, the following experiments aim to evaluate the prediction performance of the approaches previously presented in Section 5.3 on a high volume of transaction instances.

The dataset used for this experiment was significantly larger and more complex than the dataset used in the previous experiments. Here, the general annotation for the fraud class (i.e. 1) in the previous experiment was refined into three distinct classes: A, B and C. These represent the three classes of fraud that were introduced in the data simulation experiment, as mentioned in Section 4.2.2. These classes are described in more detail below:

• Class A: Indicates account takeover fraud as a result of SIM splitting or loss of device. For example, an abnormality or shift in the behaviour of end users compared to a specific profile or to their usual behaviour [Lla+11].

• Class B: Indicates a retailer who is complicit in money laundering activities and facilitates the opening of an account despite knowing that the account will be loaded with funds coming from criminal activities. For example, an atypical use of mWallets or an abnormal volume and frequency of cash transactions compared to a specific profile, i.e. an mWallet used only for withdrawal or p2p transfer, or multiple funding and loading sources of the mWallet followed by withdrawals shortly afterwards [Gab+13; Lla+11].

• Class C: Indicates an account management system compromise. For example, a change of the global balance of mMoney in the system or a large number of transfers from several mWallets used to fund one specific account [Lla+11].

For this experiment, a slightly different approach was adopted for the classification evaluation of the CBR model. The reason was to investigate which configuration of the predictor would be most effective in enhancing its prediction capabilities. For this investigation, changes were made to the case representation, the nearest neighbour configuration and the parameter weight assignment.

Following the above rationale, the CBR model used the kNN algorithm with k = 3, 5, 7 and 9² for each instance of the payment transfer transactions. In order to capture users' behaviour effectively, the CBR model classifies the log information into five contexts and then combines them into a single dimension (as presented in Section 5.3). This is done instead of using the conventional approach, where the transaction amount, time dimensions or feature dimensions are used individually. The advantage is that it gives a significant improvement in accuracy over a system that considers each feature dimension individually [Kun+09].

The number of parameters taken into consideration for this experiment was significantly larger than in the previous experiment. This allows an efficient evaluation and, at the same time, an understanding of the effectiveness of the current approach. As a result, the experiment factored in all the available cases in the dataset, taking into account their k nearest neighbours (for k = 3, 5, 7, 9). The key difficulty encountered with the use of the large dataset was class imbalance. Therefore, a data sampling technique was introduced to reduce the skewness of the training set. The discussion of the data sampling follows.

² The value of k is usually an odd number, to avoid the case where the kNN algorithm returns an equal number of votes for different classes [HAA14].

6.2.1 Data Sampling

In a fraud detection problem, the class imbalance issue is often present; as a result, the number of fraud records is typically much smaller than that of non-fraud records. This has a negative effect on the algorithms because classifiers are mostly biased towards the majority class and thus return low performance results [GS17]. The MMT dataset used in this thesis was highly skewed, with a ratio of 99.8:1.2 negative to positive instances, which is a major characteristic of financial transaction datasets [PF01]. Neglecting data skewness in the detection model may lead to diminished prediction accuracy. According to [GS17], selecting the best data balancing approach for the imbalanced learning problem provides an effective foundation for sampling methods. Thus, a data sampling approach was chosen over cost-sensitive approaches, because determining the cost in a cost-sensitive approach is considered highly challenging [WMZ07].

For the purpose of balancing the MMT dataset using the data sampling approach, a hybrid strategy that combines an over-sampling and an under-sampling algorithm is adopted, as previously discussed in Section 3.2, since it works better than either one alone [Cha+02; GS17]. SMOTE and Tomek-Link were selected as the over-sampling and under-sampling methods respectively because they are well-known techniques in the literature [Pad+07; Gu+08]. In order to re-balance the MMT data, a sampling ratio of 2:1 was adopted. According to [CC11; GS17], this ratio is preferable in order to achieve an efficient distribution between negative and positive instances in a training set. Figure 6.2 below shows the result of performing SMOTE with k = 5 and ratio = 0.5, followed by Tomek link removal.

Figure 6.2: An illustration of the SMOTE + Tomek
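The hybrid re-sampling step can be outlined in a few lines of plain Python. This is a simplified sketch under stated assumptions rather than the implementation used in the thesis: instances are numeric tuples, label 1 marks fraud and 0 marks legitimate records, at least two minority instances exist, `ratio` is the target minority-to-majority ratio (0.5 gives the 2:1 distribution), and only the majority member of each Tomek link is removed. All function names are hypothetical.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


def remove_tomek_links(points, labels):
    """Drop the majority member of every Tomek link, i.e. every pair of
    mutual nearest neighbours that carry different labels."""
    n = len(points)
    nn = [min((j for j in range(n) if j != i),
              key=lambda j: math.dist(points[i], points[j]))
          for i in range(n)]
    drop = set()
    for i, j in enumerate(nn):
        if nn[j] == i and labels[i] != labels[j]:
            drop.add(i if labels[i] == 0 else j)
    keep = [i for i in range(n) if i not in drop]
    return [points[i] for i in keep], [labels[i] for i in keep]


def smote_tomek(points, labels, ratio=0.5, k=5, seed=0):
    """Over-sample the minority class to the target ratio, then clean."""
    minority = [p for p, y in zip(points, labels) if y == 1]
    n_major = len(points) - len(minority)
    n_new = max(0, round(ratio * n_major) - len(minority))
    synthetic = smote(minority, n_new, k, seed)
    return remove_tomek_links(points + synthetic,
                              labels + [1] * len(synthetic))
```

The nearest-neighbour searches here are quadratic in the number of instances, so a dataset of the size used in this chapter would need an indexed or library-based implementation.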

6.2.2 Experiments with the Weighted CBR

The StdCBR model proved to be efficient in the previous experiment, discussed in Section 6.1, by providing good predictions on the investigated cases. Nevertheless, further experiments had to be conducted in order to verify its overall efficiency. To this end, an advanced algorithm was developed that uses the capabilities of GA for assigning parameter weights and automating the random selection of the k-value in order to detect mobile money transfer fraud. At this stage of the experiment, events and cases in the transaction streams were classified into the five different types of mobile-enabled financial operations available in the sample data for the purpose of computing the case similarity. The main difference between the StdCBR model and its advanced variant lies in the augmentation with GA capabilities and in the event/case representation.

In order to evaluate the Weighted CBR model, a set of experiments was designed and conducted. The aim was to evaluate whether the Weighted CBR model benefits the predictive accuracy of the CBR process. Additionally, the computational overhead of both the StdCBR model and the Weighted CBR model was considered. The pseudo-code to calculate the similarity between transaction sequences for the purpose of identifying fraudulent transactions is presented below:

Using k-Nearest Neighbour

Classify(X, S, Z)   // X: case base, S: class labels of X, Z: query sample
    for j = 1 to m do
        compute distance d(Xj, Z)   // similarity Sim
    end for
    compute set I containing the indices of the k smallest distances d(Xj, Z)
    return majority label for {Sj where j ∈ I}

This algorithm calculates the similarity between transaction sequences, i.e. the case and the query. The similarity function was computed by optimally combining the individual local similarities (vi) into a global similarity, as discussed in Section 5.3.2. For the experimental evaluation, the Weighted CBR classifier is compared with the basic CBR model (StdCBR), using the individual feature similarity dimension as the baseline.
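The pseudo-code above, combined with a weighted global similarity, can be sketched in Python as follows. The exact local and global similarity functions of Section 5.3.2 are not reproduced here, so this sketch assumes numeric features, a range-normalised local similarity and a weighted average as the global similarity; the names `local_similarity`, `global_similarity` and `classify`, and the weight values used to call them, are hypothetical.

```python
from collections import Counter


def local_similarity(a, b, value_range):
    """Similarity in [0, 1] for one numeric feature (assumed form)."""
    return 1.0 - abs(a - b) / value_range


def global_similarity(case, query, weights, ranges):
    """Weighted average of the per-feature local similarities."""
    total = sum(weights)
    return sum(w * local_similarity(c, q, r)
               for w, c, q, r in zip(weights, case, query, ranges)) / total


def classify(case_base, labels, query, weights, ranges, k=3):
    """Return the majority label among the k most similar cases."""
    ranked = sorted(range(len(case_base)),
                    key=lambda j: global_similarity(case_base[j], query,
                                                    weights, ranges),
                    reverse=True)
    votes = Counter(labels[j] for j in ranked[:k])
    return votes.most_common(1)[0][0]
```

Setting all weights equal gives an unweighted baseline, while in a weighted variant the `weights` vector would be supplied by an external search procedure such as the genetic algorithm.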

In the experiment evaluation, a total of 736,030 transactions were used, with a ratio of 2:1 after the data sampling. To ensure that the CBR model is not biased as a result of using the same transaction data for both the training and testing phases of the experiment, the MMT dataset was split into training and testing sets using a ratio of 70:30 respectively. However, to avoid an overoptimistic estimate of the performance, as proposed in [Fab+17], transactions from known compromised MMT accounts were removed from the test set. For example, when an MMT account is already associated with a fraudulent transaction in the training set, its transactions are removed from the test set.
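The split with removal of known compromised accounts can be illustrated as below. This is a sketch under assumptions: each transaction is represented as a dictionary with the hypothetical keys `account` and `is_fraud`, and the split is a simple random 70:30 partition.

```python
import random


def split_without_leakage(transactions, train_fraction=0.7, seed=0):
    """Random split; test transactions from accounts that already show
    fraud in the training set are dropped."""
    shuffled = transactions[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    train, test = shuffled[:cut], shuffled[cut:]
    # Accounts already known to be compromised from the training data.
    compromised = {t["account"] for t in train if t["is_fraud"]}
    test = [t for t in test if t["account"] not in compromised]
    return train, test
```

Filtering in this way means the reported test performance reflects accounts whose compromise was not already visible during training.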

The Weighted CBR model uses GA to automate the random selection of the k-value in kNN classification from the values 3, 5, 7 and 9. Table 6.2 shows the results for k = 3; the use of larger values of k in the experiment did not produce significantly different results. The experiments were run for 10 iterations and the average was taken as the final classification result. Figures 6.3 and 6.4 below show the results of the experiments conducted, as well as the percentage of correct classifications for each distinct fraud class category (A, B and C) and for non-fraud transactions respectively.

Figure 6.3: Results for all types of fraud class detection


Figure 6.4: Results for Non-fraud transaction detection

The overall performance of the classifiers, as shown in Figures 6.3 and 6.4 above, can be characterised as good, especially for the Weighted CBR + context classifier. For cases belonging to class C, which indicates an account management system compromise, the StdCBR system is less accurate. However, it can be observed that as the configuration of the StdCBR system was enhanced using context features and genetic algorithm capabilities for feature weights, the CBR system (Weighted CBR + context) was able to classify all the fraud classes more accurately. This leads to an increase in detection accuracy of 0.032% for class C. The same performance trend was also recorded for the remaining classifiers, as shown in Figure 6.4 for the classification of non-fraudulent transactions.

To further represent the effectiveness of the CBR classifiers, the following metrics were adopted for comparison and evaluation: Recall, F-measure, Matthews Correlation Coefficient (MCC) and Area Under the ROC Curve (AUC). These four metrics handle imbalanced data efficiently without becoming biased towards the majority class, and they are also highly suitable for fraud detection [GS17]. Recall focuses on the significant class (fraud) and is not sensitive to data distribution. F-score is a good metric due to its non-linear nature, and MCC is considered one of the best single assessment measures; both are less influenced by imbalanced data [GS17; Cha+11a]. AUC evaluates the overall classifier performance and is very appropriate for class imbalance [Gu+08]. The performance evaluation of the classifiers using the above metrics is shown in Table 6.2.

Table 6.2: Performance of classifiers: row 1 represents StdCBR, row 2 StdCBR + new features (context), row 3 Weighted CBR, and row 4 Weighted CBR + new features (context).

Model, attributes         Recall   F-measure   MCC     AUC
StdCBR                    0.981    0.977       0.966   0.984
StdCBR + context          0.987    0.977       0.966   0.985
Weighted CBR              0.994    0.995       0.993   0.996
Weighted CBR + context    0.998    0.998       0.998   0.999

As observed in Table 6.2, the results from this experiment differ from previous work³, although the dataset contains the same features, extended with simulation data of higher volume. The recall is higher, i.e. the model captured a larger share of the fraudulent events. Looking at the Weighted CBR + context classifier, it can be observed that the new feature (context of information) led to an improvement of 0.021 in F-measure relative to StdCBR (0.998 against 0.977). This indicates that the new feature brought a significant improvement to the StdCBR model. It can be observed from Table 6.2 that the Weighted CBR + context classifier had the best Matthews Correlation Coefficient when compared to the other classifiers. This indicates that the Weighted CBR + context classifier has the highest probability of being the perfect model for the sample data in this experiment. Another observation is that the Weighted CBR + context classifier outperformed the other CBR classifiers in the prediction of both fraudulent and non-fraudulent cases, demonstrating a high AUC value. To test whether the detected differences are statistically significant, a t-test was applied to the performance results. All differences were found to be significant at the 0.01 level; the test shows that the newly proposed pattern features, in addition to the feature weighting, significantly improve performance on all the performance measures used. The experiment demonstrates that the addition of the new features improves the StdCBR approach compared to [Ade+17]. The addition of weighting and context, as demonstrated in Table 6.2, further improves the performance of the CBR model.

³ Tables 3 and 4 of [Ade+17] report, for a smaller unbalanced dataset, a recall of 0.46 for Std-CBR and 0.78 for FW-CBR.
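The text does not state which form of t-test was applied. Assuming a paired t-test over the per-run scores of two classifiers (an assumption made here for illustration), the test statistic can be computed as in the sketch below. The function name `paired_t_statistic` is hypothetical and the scores used to call it must come from the actual runs; none are reproduced here.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees of freedom) for paired per-run scores.
    The per-run differences must not all be identical, otherwise the
    standard deviation is zero and t is undefined."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1
```

For 10 paired runs there are 9 degrees of freedom, and a two-sided test at the 0.01 level requires |t| to exceed approximately 3.250.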

The computational complexity associated with the use of genetic algorithms is seen as one of the major challenges in this experiment. The average execution time for each of the experiments was 43,581 seconds on an Intel Core i5 2.16 GHz machine. Reducing the computation cost in order to improve the scalability of the proposed system is therefore the focus of the next Subsection. Although further experiments in different application domains may change the experimental outcome, such investigation is beyond the scope of this thesis.

6.2.3 Weighted CBR with Clustering

The combination of case-based reasoning (CBR) with genetic algorithms has been successfully applied to a wide range of applications in the literature, such as classification, diagnosis, configuration and decision support [Man+12]. As in other learning systems, the combination of CBR with GA can suffer from a computational cost problem, which occurs when knowledge learned in an attempt to improve a system's performance degrades it instead [AJ04; FR93]. This issue can be addressed by combining CBR with clustering to provide a strategy for the retrieval task that permits choosing the best solution from a set of solutions found by clustering the case base [MA16]. The benefits of clustering were discussed in Section 5.1.

Clustering techniques have been widely used in various fields to improve the

classification accuracy and computation cost of learning algorithms [TD13] as

the case library grows. The clustering approach divides data units or variables

into clusters such that elements within a cluster have a high degree of natural

association among themselves while the clusters are relatively distinct from

one another [KH01]. It is often used both in information retrieval [MA16]

and large or irregular case library problems [TD13]. For the selection of the

appropriate clustering algorithm, Kapetanakis et al. in [Kap12] suggested two

major factors that need to be taken into consideration:

• The complexity of the case structure.

• The similarity algorithms that provided similarity among cases instead

of providing a static set of values for each case.

Based on the above specification, a CURE data clustering algorithm could

be applied in order to cluster the data into meaningful groups. Thus, the CURE4

clustering algorithm was chosen for the retrieval process of the

Weighted CBR system. The CURE algorithm employs a novel hierarchical clustering

approach that adopts a middle ground between the centroid-based and the

all-points extremes [GRS01]. According to Tong and Wu in [TD13], the CURE

algorithm operates by decomposing the entire case library and selecting a

small part of the cases in the clustering space using a highly efficient

random sampling algorithm (RSA). Then, by

CURE clustering, the algorithm divides the subset of cases into a group of

local clusters. Each division is based on the minimum average distance, which is used to

find the center of each cluster and set up an index. Then, the k-NN algorithm is used

to cluster the entire case set based on the index of these cluster centers and

the threshold value T created by the CURE algorithm. After a certain number of

iterations, the largest average similarity is identified as the final cluster result

[GRS01; TD13]. The CURE clustering algorithm was chosen because it is more

robust to outliers, has lower computational time requirements and identifies

clusters having non-spherical shapes and wide variances in size [GRS01].

4 The scattered points approach employed by CURE alleviates the shortcomings of both the all-points and the centroid-based approaches in other clustering algorithms. Thus, it enables CURE to correctly identify clusters in a dataset [GRS01].
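The scattered points approach of [GRS01] can be illustrated with a minimal Python sketch. It assumes Euclidean distance on numeric feature vectors. The function names and the parameters `num_reps` and `shrink` are illustrative choices and are not taken from the thesis implementation.

```python
import math

def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))

def scattered_representatives(points, num_reps=4, shrink=0.5):
    """Pick well-scattered points of one cluster and shrink them towards its centroid.

    num_reps and shrink are hypothetical settings chosen for illustration.
    """
    unique = list(dict.fromkeys(points))  # drop duplicates, keep order
    centre = centroid(points)
    # Start with the point farthest from the centroid.
    reps = [max(unique, key=lambda p: math.dist(p, centre))]
    while len(reps) < min(num_reps, len(unique)):
        # Next representative: the point farthest from those already chosen.
        candidates = [p for p in unique if p not in reps]
        reps.append(max(candidates, key=lambda p: min(math.dist(p, r) for r in reps)))
    # Shrinking towards the centroid dampens the effect of outliers.
    return [tuple(r[d] + shrink * (centre[d] - r[d]) for d in range(len(centre)))
            for r in reps]

cluster = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0), (2.0, 2.0)]
reps = scattered_representatives(cluster, num_reps=4, shrink=0.5)
print(sorted(reps))
```

In this toy cluster the four corner points are selected and each is pulled halfway towards the centroid (2, 2). Representing a cluster by several shrunken points, rather than by one centroid or by all points, is what allows non-spherical clusters to be described.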

In summary, to reduce the computational cost and at the same

time improve the prediction accuracy of the proposed CBR model, a combined

approach using CBR and clustering is implemented. The objective is

to reduce the search space in the retrieval step and to consider only the

most suitable cases and solutions to support decisions, providing an intelligent

strategy that enables the predictor to reach the best solution.

6.2.4 Weighted CBR with Clustering Experiment

The performance of the Weighted CBR model is augmented by using clustering

techniques to partition the case base into sample groups. Clustering does

not aim at labelling the cases in a group with a specific tag, as happens in

classification, where the tag represents a piece of generalized domain

knowledge extracted from the subsumed cases. Rather, it collects the most

similar cases into clusters, i.e. it identifies the cases under similar circumstances and

limits the retrieval task to them. It also structures the CBR case base and

guides and speeds up its retrieval process [MA16]. Thus, to improve the retrieval

process of the Weighted CBR system in this thesis, the CURE algorithm in [TD13]

was adopted.

The clustering analysis [GRS01; TD13] of the case library starts by creating

an index for each cluster center point using the CURE algorithm, as shown in

Figure 6.5 below. To retrieve the target instance (DCase) from the case

library, the cluster to which the target case belongs is first determined using

a combination of the index and the nearest neighbour searching algorithm. Thereafter,

the nearest neighbour algorithm is used to retrieve the case that is most

similar to the target case. This is based on the fact that the nearest neighbour

algorithm is efficient for a well-organised and indexed library.

Figure 6.5: Structure of case library retrieval using clustering algorithm [TD13]


The implementation of the process includes the following stages [TD13]:

Step 1: Determine the cluster that the target case belongs to. Calculate the

similarity between the target case and the center point case of each cluster. Identify the

cluster whose center point case has the maximum similarity.

Step 2: Compare the maximum similarity with a threshold T. If the maximum

similarity is smaller than the threshold T, put the target case into a separate

class, C0.

Step 3: Find the source case with the maximum similarity using the nearest

neighbour searching algorithm. Compute the similarity between the target

case and all the cases in this subset. Identify the source case that has the

maximum similarity with the target case as the candidate case to solve the

target case.

The algorithm description for the case retrieval process is given as follows:

SUB RetrieveInCB (CB)
    ' Find cluster I that has the maximum similarity with target case DCase
    I = MaxSimilarityOfCore (DCase, SubsetCaseCore[ ]);
    IF the maximum similarity < threshold (T) THEN I = C0;    ' C0 is a separate class
    ' Extract cases from cluster I by the nearest neighbour algorithm
    FOR C = StartInCB(I) TO C <> ""
        SameDegree(C, DCase);    ' DCase is the target case
    NEXT C
    RETURN the source case that has the maximum similarity with DCase;
END SUB
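The retrieval steps above can also be sketched in Python. This is a minimal illustration under stated assumptions: the similarity function (the inverse of one plus the Euclidean distance), the dictionary layout of `centres` and `clusters`, and the choice to return no source case for the separate class C0 are all hypothetical and are not taken from [TD13] or from the thesis implementation.

```python
def similarity(a, b):
    """Hypothetical similarity in (0, 1]: 1 / (1 + Euclidean distance)."""
    return 1.0 / (1.0 + sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5)

def retrieve(target, centres, clusters, threshold):
    """Return (cluster_id, best_case) for a target case.

    centres:  {cluster_id: centre feature vector}
    clusters: {cluster_id: list of (features, label) source cases}
    """
    # Step 1: find the cluster whose centre is most similar to the target.
    best_id = max(centres, key=lambda cid: similarity(target, centres[cid]))
    # Step 2: below the threshold, the target is treated as a separate class C0.
    # Assumption: no source case is returned in that situation.
    if similarity(target, centres[best_id]) < threshold:
        return "C0", None
    # Step 3: nearest-neighbour search restricted to the selected cluster.
    best_case = max(clusters[best_id], key=lambda case: similarity(target, case[0]))
    return best_id, best_case

centres = {"A": (1.0, 1.0), "B": (9.0, 9.0)}
clusters = {
    "A": [((0.5, 1.0), "genuine"), ((1.5, 1.2), "genuine")],
    "B": [((9.0, 8.5), "fraud"), ((8.0, 9.5), "fraud")],
}
near_b = retrieve((8.8, 8.6), centres, clusters, threshold=0.2)
far_from_all = retrieve((5.0, 5.0), centres, clusters, threshold=0.2)
print(near_b, far_from_all)
```

Only the cases of one cluster are compared with the target in Step 3, which is the source of the reduction in search space.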

For the purpose of running the CURE algorithm to cluster the CBR case

library, five (5) clusters were chosen as the clustering configuration.

The rationale is that the value of k=5 has been widely used in

the literature with promising prediction accuracy [Mou+06; SPA10;

Pat73]. The extended simulated balanced dataset with significant sample size

discussed in Subsection 6.2.1 was used as the case library for this experiment.

In order to comprehensively evaluate the experiment on the provided

dataset, the application of the CURE algorithm with the set of CBR classifiers

(StdCBR, StdCBR + context, Weighted CBR and Weighted CBR + context)

was examined. To run the CBR classifiers using the kNN algorithm, a neighbourhood

configuration of k = 3, 5, 7 and 9 was adopted. Varying the value

of k beyond 3 for the kNN retrieval process did not produce any

significantly different results. Figure 6.6 shows the results produced after the

application of the CURE algorithm.

Figure 6.6: CBR with clustering Results

The Weighted CBR + context classifier had the highest results across the different

performance metrics used in the evaluation. An additional observation

from the results of this experiment was that as the CBR system configuration

was augmented with novel features and GA capabilities, the performance

of the CBR system increased gradually across the board. The results for the classifiers

in the previous experiment (Section 6.2.2) show that the Weighted CBR

systems outperformed the CBR systems with the clustering algorithm. One of the

main concerns was whether the efficiency of the CURE clustering algorithm

was maximised or not. Although a fixed number of clusters was used in the

CURE algorithm configuration, the use of a dynamic number of clusters could

have provided different or better results. This could not be verified and investigated

in this thesis due to time constraints. However, a positive observation

was made from the experimental results, which indicate that the computation

cost associated with the combination of the CBR system with the clustering algorithm

was reduced from 43,581 seconds to an average of 8,695 seconds.
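To put the two reported execution times in perspective, the short Python calculation below derives the speed-up factor and the relative reduction. Only the two timings come from the text; the variable names are illustrative.

```python
baseline_seconds = 43_581   # average execution time reported without clustering
clustered_seconds = 8_695   # average execution time reported with CURE clustering

speedup = baseline_seconds / clustered_seconds
reduction = 1 - clustered_seconds / baseline_seconds
print(f"speed-up: {speedup:.2f}x, reduction: {reduction:.1%}")
# prints "speed-up: 5.01x, reduction: 80.0%"
```

In other words, the clustered retrieval ran roughly five times faster, at the price of the lower prediction accuracy noted above.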

6.3 Summary

In this chapter, a Weighted CBR model is proposed with the aim of improving

the performance of a StdCBR system for fraud identification in mobile money

transfer (MMT). Results from the experiments have shown that the CBR system

can detect mobile money payment fraud efficiently by applying novel features

as well as similarity measures on them.

In the performed experiments, the Weighted CBR system uses a combina-

tion of CBR and GA as a tool to optimize the significance level (weights) of

the features. For the experiment, instead of using the conventional approach

where the transaction amount, time dimensions or feature dimension are used

individually, the log information from the simulation data was classified into

five contexts and then recombined into a single dimension. The reason for this

was to improve the accuracy of the CBR system. Results demonstrate

that the classification of log information into five contexts improves the

performance of the proposed Weighted CBR + context classifier, with an area

under the curve (AUC) of 0.98 to 0.99 for the two feature dimension perspectives.

The computational cost was the main concern of the experiment

and was investigated further. A conclusion was reached to use the CURE

clustering algorithm to reduce the computation cost of the retrieval process of

the Weighted CBR system. The results from the CBR system with the clustering

algorithm show positive predictions, indicating the success of the approach.

However, the Weighted CBR without the clustering algorithm outperformed it. In

conclusion, it can be stated that the CBR classifiers have demonstrated good

prediction accuracy for the examined case study.

The current chapter finalises the research approach towards the detection of

mobile money payment fraud by using augmented CBR systems. The relevant

research work was presented as well as the experiments conducted for the

evaluation of the adopted research approach. The next chapter concludes this

thesis by summarising the findings, conclusions and future directions.

Chapter 7

Conclusion and Further Work

Mobile money transfer is a fast growing medium for making financial transactions

via a mobile device and it is increasingly being adopted in growing

markets, especially in developing countries. It has the ability to handle a large

number of small value payments and worldwide funds exchange in digital currencies.

Consequently, the usage of mobile money transfer introduces additional

risks caused by a large number of non-bank participants, higher speed

of transactions and level of anonymity compared to other existing banking

systems. This thesis investigated how CBR can be used to address some of

these issues. In particular, the thesis focused on how the proposed detection

approach can be effectively used to analyse and predict transaction fraud in

mobile money transfer (MMT) networks. This chapter summarizes the different

areas investigated to carry out this research work and the results from the

experiments, and presents future research directions.


7.1 Thesis Summary

This thesis started by providing an overview of the mobile money services ecosystem

and the rationale for selecting the mobile money transfer service as the investigated

domain. Then an investigation was carried out on how to circumvent the

challenge of obtaining a real-life financial transaction dataset, in this case for mobile

money. From the result of the investigation, only three previous works in the fraud

detection domain provided the level of depth needed to carry

out this research work. Therefore, the methodology proposed in [LKJ02] and the

multi-agent based simulator in [LrA12b] were adapted in this thesis to simulate

mobile transfer transaction data as discussed in Section 4.3. Furthermore,

in Chapter 2, the existing approaches in the literature for evaluating simulated

datasets were also discussed. From the evaluation, the chi-square test was selected

to evaluate the simulated dataset. The rationale for choosing the chi-square test

was that it is easy to compute and robust with respect to the distribution of

the data. The results from the evaluation using the chi-square test show that the

proportions of the selected users for both the amount and period are normally

distributed.
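As an illustration of why the chi-square test is easy to compute, the Python sketch below evaluates the Pearson chi-square statistic for binned counts. The observed and expected counts are invented for the example and are not data from this thesis; the function name is likewise illustrative.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all bins."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts of transactions per amount band (not thesis data).
observed = [18, 32, 30, 20]
expected = [20, 30, 30, 20]

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1
print(round(statistic, 4), degrees_of_freedom)
```

With three degrees of freedom the critical value at the 0.05 significance level is 7.815, so a statistic of about 0.33 would not reject the hypothesis that the observed counts follow the expected distribution.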

Throughout this research, several other algorithms for detecting financial

transaction fraud were researched, including Logistic regression (LR), Random forest (RF), Artificial

neural network (ANN), Support vector machine (SVM) and more. As a result,

LR, RF, SVM and ANN were selected to run some preliminary experiments,

as further discussed in Appendix A.1. These choices were based on

the following: logistic regression is easy to use and one of the most commonly

used techniques for data mining in practice. The random forest classifier,

which has the ability to capture non-linear data, shows high scalability with

better visual representation of results data. Support vector machine has a

regularization parameter that is used to prevent over-fitting. Artificial neural

network has shown higher accuracy in prediction. Also in Chapter 3, a

review of the existing approaches for handling unbalanced data for predictive

algorithms was presented. As a result, a conclusion was reached to use a hybrid

technique that combines an oversampling and an undersampling algorithm

to balance the MMT dataset. According to [Cha+02], this works better than

either one alone. Furthermore, evaluation techniques (performance

metrics) for measuring a fraud detection system effectively were discussed.

The following performance metrics were selected for use in

this thesis: Recall, F-measure, Matthews correlation coefficient

and Area under the ROC curve. This is based on the fact that they handle

imbalanced data efficiently without becoming biased

towards the majority class and are highly suitable for

fraud detection [GS17].
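The four selected metrics can be computed directly from a confusion matrix and from classifier scores. The Python sketch below uses the standard definitions; the counts and scores are invented for illustration and are not results from this thesis.

```python
import math

def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0

def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0

def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in the range [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def auc(scores_pos, scores_neg):
    """Probability that a random fraud case scores above a random genuine case.

    Ties count as half a win; this equals the area under the ROC curve.
    """
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical confusion matrix for an imbalanced test set (fraud = positive).
tp, tn, fp, fn = 80, 890, 10, 20
print(recall(tp, fn), f_measure(tp, fp, fn), mcc(tp, tn, fp, fn))
print(auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2]))
```

Note that plain accuracy on these counts would be 97% even though one fifth of the fraud cases are missed, which is why recall, F-measure and MCC are more informative on imbalanced data.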

Chapter 4 addressed the lack of a publicly available dataset on

mobile money transfer by simulating transaction data using the synthetic data

generation method in [LKJ02]. To simulate the MMT platform, a multi-agent

based simulator (MABS) in [LrA12b] was used to generate mobile money transfer

transaction data using the misuse scenarios in the EU FP7 MASSIF project

[Lla+11]. As a contribution, misuse scenarios involving SIM swap and a retailer

facilitating the opening of end user accounts were modelled. These were not considered

in [Gab+13; LrA12b]. Also, to trace the sequence of events in the

generated data, a time-stamp parameter was included in the simulation rather

than the steps or category of users over a period of time used in [Gab+13; LrA12b]

respectively. To ensure that the modelled data characteristics are as close as

possible to real-world situations, a combination of users’ habits and their

behaviour was incorporated in the simulation as proposed by Gaber et al.

[Gab+13]. From the results of the evaluation, the generated dataset seems to

be close to an actual transaction dataset. This dataset was used to

train and test the proposed predictive model to further verify its applicability

to fraud detection tools.

Chapter 5 introduced the proposed CBR approaches for the identification

of money transfer fraud in Mobile Money Transfer (MMT) environments. In

order to apply similarity measures among transaction event sequences in the

CBR system, the log information from the simulation data was classified into

five contexts and then recombined into a single dimension rather than the

conventional approach where transaction amount, time dimensions or feature

dimensions are used individually. Two similarity algorithms were presented

in this chapter, StdCBR and a Weighted CBR model. The Weighted system

uses a combination of CBR and GA as a tool to optimize the significance level

(weights) of the features with the aim of improving the performance of the

proposed CBR approach. The results from the experiments show that the

incorporation of a new attribute (context of information) into the CBR system

led to a significant performance improvement. In addition, some machine

learning algorithms were selected and discussed. These were used to carry out

preliminary experiments, as further discussed in Appendix A.1.
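The idea of using a GA to optimize feature weights can be sketched in Python as follows. This is a toy illustration, not the configuration used in the thesis: the fitness function (leave-one-out accuracy of a weighted 1-NN retrieval), the elitist selection, one-point crossover, Gaussian mutation, population size and the six sample cases are all assumptions made for the example.

```python
import random

def weighted_distance(a, b, weights):
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) ** 0.5

def fitness(weights, cases):
    """Leave-one-out accuracy of a weighted 1-NN retrieval over the case base."""
    correct = 0
    for i, (features, label) in enumerate(cases):
        others = cases[:i] + cases[i + 1:]
        nearest = min(others, key=lambda c: weighted_distance(features, c[0], weights))
        correct += nearest[1] == label
    return correct / len(cases)

def evolve_weights(cases, n_features, generations=20, pop_size=12, seed=0):
    """Hypothetical GA; requires n_features >= 2 for the one-point crossover."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda w: fitness(w, cases), reverse=True)
        history.append(fitness(population[0], cases))
        parents = population[: pop_size // 2]        # elitist selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)       # one-point crossover
            child = a[:cut] + b[cut:]
            j = rng.randrange(n_features)            # mutate one weight
            child[j] = min(1.0, max(0.0, child[j] + rng.gauss(0, 0.1)))
            children.append(child)
        population = parents + children
    best = max(population, key=lambda w: fitness(w, cases))
    return best, history

# Toy case base: feature 0 separates the classes, feature 1 is noise.
cases = [((0.1, 0.9), 0), ((0.2, 0.1), 0), ((0.15, 0.5), 0),
         ((0.9, 0.85), 1), ((0.8, 0.15), 1), ((0.85, 0.45), 1)]
best, history = evolve_weights(cases, n_features=2)
print(best, history[-1])
```

Because the fitness function runs a full retrieval for every candidate weight vector in every generation, the cost grows quickly with the size of the case base, which is consistent with the computational cost concern raised in Chapter 6.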

Finally, in Chapter 6, the experimental implementation of the CBR systems

formulated in Chapter 5 was presented. In the performed experiments,

the Weighted CBR system uses a combination of CBR and GA as a tool to

optimize the significance level (weights) of the features. For the experiment, instead

of using the conventional approach where the transaction amount, time

dimensions or feature dimensions are used individually, the log information

from the simulation data was classified into five contexts and then recombined

into a single dimension. A promising result was achieved in the experiment.

To further reduce the computational cost of the CBR

systems, a CURE clustering algorithm was applied to the retrieval process.

This led to a significant improvement in the computation cost. Although a

fixed cluster value was used for the clustering algorithm in the CBR retrieval

process, a dynamic approach could produce different or better results.

This could not be verified and investigated due to time constraints. In

conclusion, it can be stated that the CBR classifiers have demonstrated good

prediction accuracy for the examined case study.

7.2 Contributions and Findings

This section presents the contribution to knowledge by revisiting the research

questions introduced in Chapter 1 (Section 1.3), and provides answers

to these research questions from the findings of this thesis. The first question

addressed is as follows:

1. How can a model developed through the CBR approach be used for

effective analysis of MMT transaction fraud?

A novel model based on CBR methodology for predicting transaction fraud

in mobile money transfer networks was proposed in Chapter 5. As a contribution,

the proposed model was used to augment basic CBR (i.e. StdCBR)

capability using machine learning techniques (Genetic algorithm) to assign

parameter weights in the sample dataset and to automate k-value random selection

in k-NN classification to improve CBR performance in detecting financial

transaction fraud. The results from the experiment using the Weighted CBR

model show good prediction accuracy. Another contribution is

a novel approach that classifies features into contexts of information

in order to capture user behaviour effectively

and improve the prediction accuracy; this was introduced into the CBR classifier.

The incorporation of the new attribute (context of information) into the

CBR system led to a significant performance improvement (see Table 6.2).

During the CBR experiment simulation, to ensure that the CBR model was

not biased as a result of an unbalanced dataset or of using the same transaction

data for both the training and testing phases of the experiment, the following steps were

carried out: (i) Data balancing using a hybrid approach that combines over-sampling

and under-sampling algorithms (SMOTE+Tomek-Link) was adopted

using a ratio of 2:1 for negative and positive instances respectively. According

to Chawla et al. in [Cha+02; GS17], this hybrid approach works better

than either one alone. (ii) The MMT dataset was split into training and testing sets

using a ratio of 70:30 respectively. In addition, to avoid an overoptimistic

estimate of the CBR model performance after the dataset split, transactions

from known compromised MMT accounts were removed from the subsequent

split as proposed in [Fab+17]. For example, when an MMT account is already

associated with a fraudulent transaction in the training set, its transactions

are removed from the test set. The experimental results show a promising prediction

performance, although the use of larger values of k in the experiment

did not produce any significantly different results.
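The 70:30 split with removal of already compromised accounts from the test set can be sketched in Python as follows. The record fields (`account`, `amount`, `is_fraud`), the random shuffle and the synthetic records are hypothetical; the thesis does not specify these details, and the balancing step (SMOTE+Tomek-Link) is not shown.

```python
import random

def split_without_leakage(transactions, train_ratio=0.7, seed=42):
    """Split 70:30, then drop test transactions from accounts that already
    have a fraudulent transaction in the training set."""
    rng = random.Random(seed)
    shuffled = transactions[:]          # leave the caller's list untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    train, test = shuffled[:cut], shuffled[cut:]
    compromised = {t["account"] for t in train if t["is_fraud"]}
    test = [t for t in test if t["account"] not in compromised]
    return train, test

# Hypothetical transaction records (not thesis data).
transactions = [
    {"account": f"acc{i % 10}", "amount": 10.0 * i, "is_fraud": i % 7 == 0}
    for i in range(100)
]
train, test = split_without_leakage(transactions)
print(len(train), len(test))
```

Filtering in this direction means the test set can shrink below 30% of the data, but it prevents the classifier from being credited for recognising an account it has already seen committing fraud.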

2. To what extent can such a model deliver measures/metrics for prediction

of MMT fraud?

This involved the similarity measures used and the performance output

from these predictions. The CBR similarity function was computed by optimally

combining the weighted individual local similarities (v_i) into a global

similarity as discussed in Section 5.3.2. For the purpose of evaluating the proposed

CBR methodology thoroughly, the experiments were performed following the

evolutionary approach described in Section 5.1. The initial set of experiments

started with the application of basic techniques (StdCBR) on a minimalistic

dataset. The experimental results show a fairly positive monitoring attempt

for the StdCBR system on the simplistic dataset (see Table 6.1).

However, the result from the F-measure score indicates a high precision score

for the StdCBR classifier. Such a result could possibly lead to the decline of valid

customer transactions, i.e. a high false positive score.
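The combination of weighted local similarities into a global similarity, mentioned at the start of this answer, can be sketched in Python. The sketch assumes a normalised weighted average and a range-scaled local similarity for numeric features; the exact formulation is the one given in Section 5.3.2, and the feature names, weights and ranges below are hypothetical.

```python
def local_similarity(a, b, value_range):
    """Hypothetical local similarity for a numeric feature, scaled to [0, 1]."""
    return 1.0 - min(abs(a - b) / value_range, 1.0)

def global_similarity(case_a, case_b, weights, ranges):
    """Weighted average of the local similarities of all features."""
    total = sum(weights)
    return sum(
        w * local_similarity(a, b, r)
        for w, a, b, r in zip(weights, case_a, case_b, ranges)
    ) / total

query = (120.0, 14.0)    # (amount, hour of day), hypothetical features
stored = (100.0, 20.0)
score = global_similarity(query, stored, weights=(0.75, 0.25), ranges=(200.0, 24.0))
print(round(score, 4))   # 0.75 * 0.9 + 0.25 * 0.75 = 0.8625
```

Raising the weight of a feature makes differences in that feature count more heavily in retrieval, which is the quantity the GA tunes in the Weighted CBR model.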

As a further step to improve the performance of the proposed CBR model, datasets

that exhibit both high complexity and significant sample size (i.e. in

addition to the general annotation of fraud and non-fraud, three distinct classes

of fraud were introduced to represent the different misuse scenarios used in the

simulation) were used with an improved CBR configuration. These datasets and

configurations were used to evaluate the classification precision of this research

approach. Thus, the use of a dataset annotated with three distinct classes of

fraud, as discussed in Section 6.2, provided an insight into the

detection rate of the CBR model on each of these classes. The results from the

experiment show that the Weighted CBR + context classifier provided a good

detection rate across the three classes of fraud cases in the sample data (see

Figure 6.3).

3. What limitations does such a model come with and what performances

can we expect from it?

A major challenge observed during the experiments with the proposed CBR model

was the computational complexity associated with the augmentation

of the CBR model with genetic algorithms. To reduce the computational cost

and at the same time improve the prediction accuracy of the proposed CBR

model, a combined approach using CBR and clustering (CURE clustering

algorithm) was implemented as discussed in Section 6.2.4. The aim was to

reduce the search space in the CBR retrieval process and to consider

only the most suitable cases and solutions to support decisions, providing an

intelligent strategy that enables the predictor to reach the best solution. The

rationale for using the CURE algorithm was that it is more robust to outliers,

has lower computational time requirements and identifies clusters having non-spherical

shapes and wide variances in size [GRS01]. The experimental results

show an improved computational cost but a lower prediction accuracy

compared to the proposed CBR model without the CURE clustering algorithm

(see Figure 6.6). One of the main concerns was whether the efficiency of the

CURE clustering algorithm was maximised or not. Although a fixed number

of clusters was used in the CURE algorithm configuration, the use of a dynamic

number of clusters could have provided different or better results. This could

not be verified and investigated in this thesis due to time constraints.

An additional contribution to knowledge in this thesis is how the challenge

of obtaining a mobile money payment dataset for the purpose of evaluating

the proposed predictive model was addressed. In order to answer

the question “How can background transaction data as training and test

cases for pattern analysis and learning algorithms evaluation be obtained?”,

the methodology for simulating financial transaction datasets was reviewed as

discussed in Section 2.3. A conclusion was reached to use the methodology in

[LKJ02] and the multi-agent based simulator (MABS) developed by Lopez et

al. [LrA12b] to simulate a mobile money transfer dataset containing information

about mobile money transactions with examples of fraudulent samples. This is

based on the fact that the proposed methodology has a well-defined interface,

which makes it easy to use, while MASON facilitates the implementation of

social networks.

As a contribution, misuse scenarios involving SIM swap and a retailer facilitating

the opening of end user accounts were modelled, which were not considered in

the literature. Also, to trace the sequence of events in the generated data,

a time-stamp parameter was included in the simulation rather than the steps or

category of users over a period of time used in [Gab+13; LrA12b] respectively.

To ensure that the modelled data characteristics are as close as possible to real

world situations, a combination of user’s habits as well as their behaviour was

incorporated in the simulation as proposed by Gaber et al. [Gab+13]. From

the results of the evaluation, the generated dataset seems to be close

to an actual transaction dataset. This dataset was used as input data for

training and testing of fraud detection techniques. In addition, the simulated

dataset was used to test the properties of the proposed Fraud Detection Sys-

tem by injecting variations of known frauds or emerging frauds into synthetic

data to study how this affects performance parameters such as the detection

rate.

7.3 Future Work

This section discusses future research directions that build on the results

achieved in this thesis. Possible future improvements and further

work are summarised below.

• In future work, the simulation of a synthetic mobile money transfer dataset

can be explored further, for example by building an improved model

or a more realistic dataset using a combination of synthetic and real

data. This will make it even more valuable as a realistic dataset for

fraud detection experiments. Also, real-world geographical locations can

be implemented in the simulation with the extension of MASON called

GEOMASON [Col13].

• The Weighted CBR with CURE clustering approach (see Section 6.2.4)

showed better time performance but significantly lower accuracy

in comparison to the CBR system without the CURE algorithm. One

of the concerns was whether the efficiency of the CURE clustering

algorithm was maximised or not. Since a fixed number of clusters

was used in the CURE algorithm configuration, future work should

use a dynamic number of clusters to investigate whether it

provides different or better results.

• As an improvement in future work, the computational performance of

the CBR approach can be further enhanced by parallelising the imple-

mentation of the CBR model. For example, a better computational

performance can be achieved by splitting the data and computational

tasks over multiple computers, CPUs, GPUs and threads [Sri+99].

• The properties of the FDS (e.g. false alarm rate) can be further tested

by varying the background data, where background data is defined as

normal usage with no attacks. Possible directions for improvement include:

(i) testing the FDS using benchmarks, (ii) combining it with ensemble

methods or hybrid approaches, (iii) evaluating it using a real financial

transaction dataset.


• Finally, in future work it is important to explore the performance of

non-CBR methods such as Artificial neural network, Deep learning and

Discrete modelling approaches. These may improve the

performance of the prediction further.

In conclusion, the work presented in this thesis proposed a novel CBR

approach for predicting mobile money transfer fraud. The experimental results

show good prediction accuracy (see Table 6.2). Despite

these successes, the aforementioned improvements can be implemented

in future studies.

Appendix A

Preliminary Experiments

A.1 Experimental Results from the Selected

Machine Learning Algorithms

In Section 3.1.1 of Chapter 3, the application of some machine learning algorithms

was investigated. As a result, four algorithmic solutions (Logistic regression,

Random forest, Support vector machine and Artificial neural network)

were selected based on their performance in the literature. In Section 5.5

of Chapter 5, the training and evaluation phases of

these algorithms were presented. In order to run and evaluate the performance

of the selected machine learning algorithms in the prediction of mobile money

transfer fraud, Python libraries such as pandas and Keras, among others, were

used. This section presents and discusses the performance evaluation

results from the selected machine learning algorithms using the simulated

dataset.

The experiment was carried out using 5-fold cross validation and evaluated

using the following metrics: Recall, F-measure, Matthews Correlation Coefficient

(MCC) and Area Under the ROC Curve (AUC). These four metrics handle

imbalanced data efficiently. They avoid becoming biased

towards the majority class and are also highly suitable for

fraud detection [GS17]. Recall focuses on the significant class (fraud) and is not

sensitive to data distribution. F-score is a good metric due to its non-linear

nature, and MCC is considered one of the best single assessment measures

and is less influenced by imbalanced data [GS17; Cha+11a]. AUC evaluates

the overall classifier performance and is very appropriate for class imbalance

[Gu+08]. The performance evaluation of the classifiers using the above metrics

is shown below.

Figure A.1: Classifiers recall results

Figure A.1 above shows the recall value for the four classifiers used in

this thesis. The results clearly show that random forest outperformed the

other classifiers in the identification of both fraud and non-fraudulent cases by

demonstrating a high recall value.


Figure A.2: Classifiers F-measure results

The above F-measure results from the classifiers can be characterised as

good performance. The results further show that the random forest classifier

achieved the best score. Another observation from Figure A.2 above is that

the application of cross validation for LR, RF and SVM did not show any significant

effect on their performance. However, in the case of ANN, the value

increased between 10 and 30 percent and gradually decreased between 70 and

90 percent during the testing phase.

Figure A.3: Classifiers Matthews Correlation Coefficient results

Figure A.3 shows the Matthews correlation coefficient results for the classifiers.

It can be observed that the RF classifier shows an excellent result when compared

to the other classifiers. This indicates that the RF classifier has the highest probability

of being the best model for the sample data in this experiment.

Figure A.4: Classifiers area under ROC curve results

The results produced by the classifiers using the AUC performance measure

are shown in Figure A.4. The results show that RF outperformed the

remaining classifiers with an excellent performance value of 0.92. It can also be

observed that the application of cross validation had more influence on ANN

than the remaining classifiers.

In conclusion the overall results from the experiments with the selected

machine learning algorithms can be characterised as good, especially with the

random forest classifier. However, part of the motivation for this thesis was

to investigate the performance of case-based reasoning (CBR) methodology in

detecting mobile payment fraud. This prompted the application of CBR to a

mobile payment case study in this research.

Appendix B

Data Simulation Configuration

During the simulation, about 10% of the clients were configured to behave as

malicious agents (fraudsters). In a real-life scenario, it is more common to

find a lower percentage of fraudsters. The idea behind a higher proportion

of fraudsters is to prevent the class imbalance problem during the training of

the detector. The social network between the clients was built by varying the

network for clients within the same city and across different cities. The fraudsters

can also interact with normal clients in the system. The data simulation

was run five times. At each simulation the parameters were varied, except for

the number of clients and cities. This allows changes in the behavioural pattern

of each client (i.e. user) during each simulation. The different values used for

the parameters during the simulation are presented in Table B.1. The files

generated were merged and ultimately used as input for the proposed CBR

system presented in Section 5.3.1.

Table B.1: Simulation input parameters

Parameters          Exp. 1  Exp. 2  Exp. 3  Exp. 4  Exp. 5
RandomMultiplier    0.1     0.3     0.5     0.7     0.9
MaxNeighbor         10      8       6       4       2
ClientsBalance      -       -       -       -       -
MaxOtherNeighbor    2       3       4       5       6
UpgradeAccountRate  0.01    0.01    0.01    0.01    0.01
TransactionRate     0.5     0.5     0.5     0.5     0.5
Trans               -       -       -       -       -
NumClients          2000    2000    2000    2000    2000
Types               -       -       -       -       -
NumCities           7       7       7       7       7
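The settings in Table B.1 can also be written down as one configuration per simulation run. The following is a minimal sketch, assuming hypothetical names (`FIXED`, `VARIED`, `run_configs`). Parameters shown as "-" in the table (ClientsBalance, Trans, Types) are omitted because no numeric values are given for them.

```python
# Parameters that keep the same value in all five runs (from Table B.1).
FIXED = {
    "UpgradeAccountRate": 0.01,
    "TransactionRate": 0.5,
    "NumClients": 2000,
    "NumCities": 7,
}

# Parameters that change from run to run, listed in order Exp. 1 to Exp. 5.
VARIED = {
    "RandomMultiplier": [0.1, 0.3, 0.5, 0.7, 0.9],
    "MaxNeighbor": [10, 8, 6, 4, 2],
    "MaxOtherNeighbor": [2, 3, 4, 5, 6],
}


def run_configs():
    """Return one parameter dictionary per simulation run."""
    n_runs = len(next(iter(VARIED.values())))
    configs = []
    for i in range(n_runs):
        config = dict(FIXED)
        for name, values in VARIED.items():
            config[name] = values[i]
        configs.append(config)
    return configs


for number, config in enumerate(run_configs(), start=1):
    print(f"Exp. {number}: {config}")
```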

In Table B.1, “Types” represents the type of account profile, as discussed in Section 4.3. Agents are switched from one profile to another using a Markov matrix of transition probabilities (a Markov matrix was chosen because it can move an agent from one state to another and is commonly used in the literature for sequence evolution). It tells the system when to change an agent from Active to Inactive and from Profile P1 to Profile P2, which allows higher transaction limits. “Trans” represents the categories of transactions that clients can perform: money deposit (MD), money withdrawal (MW), merchant payment (MP), person-to-person transfer (P2P) or airtime recharge (AR). The autonomy of each agent is implemented by a probabilistic transition function that computes the type of operation and the action that the agent will perform in each step. This transition function depends on client attributes such as the category of user, and on the amount, which is calculated according to the balance and the limits of each client’s profile [LrA12b; Zhd+14].
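The profile switching and action selection described here can be illustrated with a small sketch. The appendix does not list the transition probabilities, the transaction weights or the rule for drawing amounts, so every number and name below (`PROFILE_TRANSITIONS`, `TRANSACTION_WEIGHTS`, `sample`, `step`) is a hypothetical stand-in. The 0.01 upgrade probability simply mirrors UpgradeAccountRate for illustration.

```python
import random

# Hypothetical Markov matrix over account profiles (each row sums to 1).
PROFILE_TRANSITIONS = {
    "P1": {"P1": 0.99, "P2": 0.01},
    "P2": {"P1": 0.00, "P2": 1.00},
}

# Hypothetical weights for the five transaction categories.
TRANSACTION_WEIGHTS = {"MD": 0.25, "MW": 0.25, "MP": 0.20, "P2P": 0.20, "AR": 0.10}


def sample(distribution, rng):
    """Draw one outcome from a {outcome: probability} mapping."""
    outcomes = list(distribution)
    weights = [distribution[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def step(profile, balance, limit, rng):
    """Advance one agent by one step: update its profile, then pick an action."""
    next_profile = sample(PROFILE_TRANSITIONS[profile], rng)
    kind = sample(TRANSACTION_WEIGHTS, rng)
    # Assumption: deposits are capped by the profile limit only,
    # all other operations by both the balance and the limit.
    ceiling = limit if kind == "MD" else min(balance, limit)
    amount = round(rng.uniform(0, ceiling), 2)
    return next_profile, kind, amount


rng = random.Random(42)  # fixed seed so a run can be repeated
print(step("P1", balance=100.0, limit=50.0, rng=rng))
```

Keeping the random generator as an explicit argument makes each simulated run reproducible, which helps when the outputs of several runs are merged afterwards.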

References

[AAN17] Ahmad Dwi Arianto, Achmad Affandi, and Supeno Mardi Susiki Nugroho. “Opinion detection of public sector financial statements using K-nearest neighbors”. In: 4th International Conference on Electrical Engineering, Computer Science and Informatics (EECSI). Yogyakarta, Indonesia: IEEE, 2017.

[ABL13] Mobyen Uddin Ahmed, Hadi Banaee, and Amy Loutfi. “Health Monitoring for Elderly: An Application Using Case-Based Reasoning and Cluster Analysis”. In: ISRN Artificial Intelligence (2013), pp. 1–11.

[Ade+16] Adeyinka Adedoyin, Stelios Kapetanakis, Miltos Petridis, and Emmanouil Panaousis. “Evaluating Case-Based Reasoning Knowledge Discovery in Fraud Detection”. In: 24th Workshop on Case Based Reasoning (ICCBR2016): Synergies between CBR and Knowledge Discovery. Atlanta, USA, 2016, pp. 182–191.

[Ade+17] Adeyinka Adedoyin, Stelios Kapetanakis, Georgios Samakovitis, and Miltos Petridis. “Predicting Fraud in Mobile Money Transfer Using Case-Based Reasoning”. In: Bramer M., Petridis M. (eds) Artificial Intelligence XXXIV. SGAI 2017. Lecture Notes in Computer Science, vol. 10630. Cambridge, UK: Springer, Cham, 2017, pp. 325–337.

[AHG14] Debray Atanu, Kwon Hyejung, and Richard Gill. “Mobile Money Opportunities for Mobile Operators”. In: Business & Network Consulting, Huawei Technologies White Paper (2014), pp. 1–12.

[AJ04] Niloofar Arshadi and Igor Jurisica. “Maintaining Case-Based Reasoning Systems: A Machine Learning Approach”. In: Funk P., Gonzalez Calero P.A. (eds) Advances in Case-Based Reasoning. ECCBR 2004. Lecture Notes in Computer Science. Vol. 3155. Springer, Berlin, Heidelberg, 2004, pp. 17–31.

[AKH06] Hyunchul Ahn, Kyoung-jae Kim, and Ingoo Han. “Hybrid genetic algorithms and case-based reasoning systems for customer classification”. In: Expert Systems 23.3 (2006), pp. 127–144.

[AKJ04] Rehan Akbani, Stephen Kwek, and Nathalie Japkowicz. “Applying Support Vector Machines to Imbalanced Datasets”. In: Boulicaut JF., Esposito F., Giannotti F., Pedreschi D. (eds) Machine Learning: ECML 2004. Lecture Notes in Computer Science. Vol. 3201. Springer, Berlin, Heidelberg, 2004, pp. 39–50.

[AL13] Khaled Amailef and Jie Lu. “Ontology-supported case-based reasoning approach for intelligent m-Government emergency response services”. In: Decision Support Systems 55.1 (2013), pp. 79–97.

[Ale17] Kijek Aleksander. A Beginner’s Guide to Machine Learning in Payment Fraud Detection & Prevention. 2017.

[All09] Rob J Allan. Survey of Agent Based Modelling and Simulation Tools. Tech. rep. Daresbury, Warrington: STFC Daresbury Laboratory, 2009, pp. 57–72.

[AMZ16] Aisha Abdallah, Mohd Aizaini Maarof, and Anazida Zainal. “Fraud detection system: A survey”. In: Journal of Network and Computer Applications, ScienceDirect 68 (2016), pp. 90–113.

[AP94] Agnar Aamodt and Enric Plaza. “Case-based reasoning: Foundational issues, methodological variations, and system approaches”. In: AI communications 7.1 (1994), pp. 39–59.

[Aze+14] Ush Azeem, Khan Shan, Akhtar Nadeem, and Naved Qureshi Mohammad. “Real-Time Credit-Card Fraud Detection using Artificial Neural Network Tuned by Simulated Annealing Algorithm”. In: International Conference on Recent Trends in Information, Telecommunication and Computing, ITC. Chandigarh, India: Association of Computer Electronics and Electrical Engineers, 2014, pp. 113–121.

[Bah+13] Alejandro Correa Bahnsen, Aleksandar Stojanovic, Djamila Aouada, and Bjorn Ottersten. “Cost sensitive credit card fraud detection using bayes minimum risk”. In: 12th International Conference on Machine Learning and Applications, ICMLA. Vol. 1. Washington, DC, USA: IEEE Computer Society, 2013, pp. 333–338.

[BAO15] Alejandro Correa Bahnsen, Djamila Aouada, and Bjorn Ottersten. “Example-dependent cost-sensitive decision trees”. In: Expert Systems with Applications 42.19 (2015), pp. 6609–6619.

[BBM03] Gustavo E A P A Batista, Ana L C Bazzan, and Maria Carolina Monard. “Balancing training data for automated annotation of keywords: a case study”. In: Revista Tecnologia da Informacao 3.2 (2003), pp. 15–20.

[BD13] N Bennett and S Dilloway. “Investigating the Convergence of Money Laundering and Terrorist Financing”. In: ACAMS AML and Financial Crime Conference. Amsterdam, 2013.

[BGK05] A. Blansche, P. Gancarski, and J.J. Korczak. “Genetic Algorithms for Feature Weighting: Evolution vs. Coevolution and Darwin vs Lamarck”. In: Gelbukh A., de Albornoz A., Terashima-Marín H. (eds) MICAI 2005: Advances in Artificial Intelligence. MICAI 2005. LNCS, vol. 3789. Springer Berlin Heidelberg, 2005, pp. 682–691.

[BH01] Richard J. Bolton and David J. Hand. “Unsupervised Profiling Methods for Fraud Detection”. In: Proceedings of credit scoring and credit control. Edinburgh, UK, 2001, pp. 235–255.

[BH02] Richard J. Bolton and David J. Hand. “Statistical Fraud Detection: A Review”. In: Statistical Science 17.3 (2002), pp. 235–249.

[Bha+11] Siddhartha Bhattacharyya, Sanjeev Jha, Kurian Tharakunnel, and J. Christopher Westland. “Data mining for credit card fraud: A comparative study”. In: Decision Support Systems, ScienceDirect 50.3 (2011), pp. 602–613.

[BK17] Richard A. Bauder and Taghi M. Khoshgoftaar. “A probabilistic programming approach for outlier detection in healthcare claims”. In: International Conference on Machine Learning and Applications, ICMLA 2016. IEEE, 2017, pp. 347–354.

[BK99] Eric Bauer and Ron Kohavi. “An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants”. In: Machine Learning 36.1-2 (1999), pp. 105–139.

[BKJ03] Lundin Emilie Barse, Hakan Kvarnstrom, and Erland Jonsson. “Synthesizing test data for fraud detection systems”. In: 19th Annual Computer Security Applications Conference (ACSAC 2003). 03. Sweden: IEEE Computer Society, 2003, p. 11.

[BP89] Mike Brown and George Paliouras. Inside Case-based Reasoning. Ed. by Schank C. Roger, Alex Kass, and K. Christopher. Lawrence Erlbaum Associates, 1989, p. 21.

[BPM04] Gustavo E. A. P. A. Batista, Ronaldo C. Prati, and Maria Carolina Monard. “A Study of the Behavior of Several Methods for Balancing Machine Learning Training Data”. In: SIGKDD Explor. Newsl. Special issue on learning from imbalanced datasets 6.1 (2004), pp. 20–29.

[Bra97] Andrew P. Bradley. “The use of the area under the ROC curve in the evaluation of machine learning algorithms”. In: Pattern Recognition 30.7 (1997), pp. 1145–1159.

[Bur17] Nikolay Burlutskiy. “Machine Learning for Predicting User Behaviour”. PhD thesis. University of Brighton, 2017, pp. 34–45.

[BVV15] Bart Baesens, Veronique Van Vlasselaer, and Wouter Verbeke. Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection. Cary, North Carolina, USA: SAS Institute Inc., 2015, pp. 1–36.

[CB13] Cristiano L. Castro and Antonio P. Braga. “Novel cost-sensitive approach to improve the multilayer perceptron performance on imbalanced data”. In: Transactions on Neural Networks and Learning Systems 24.6 (2013), pp. 888–899.

[CC11] Wen Hsi Chang and Jau Shien Chang. “A novel two-stage phased modeling framework for early fraud detection in online auctions”. In: Expert Systems with Applications 38.9 (2011), pp. 11244–11260.

[CC12] Wen Hsi Chang and Jau Shien Chang. “An effective early fraud detection method for online auctions”. In: Electronic Commerce Research and Applications 11.4 (2012), pp. 346–360.

[CCB08] Andrew Crooks, Christian Castle, and Michael Batty. “Key Challenges in Agent-Based Modelling for Geo-Spatial Simulation”. In: Computers, Environment and Urban Systems 32.6 (2008), pp. 417–430.

[CGC05] Xue-wen Chen, Byron Gerlach, and David Casasent. “Pruning Support Vectors for Imbalanced Data Classification”. In: Proceedings of International Joint Conference on Neural Networks. Montreal, Que., Canada: IEEE, 2005, pp. 1883–1888.

[CG+13] Boris Campillo-Gimenez, Wassim Jouini, Sahar Bayat, and Marc Cuggia. “Improving Case-Based Reasoning Systems by Combining K-Nearest Neighbour Algorithm with Logistic Regression in the Prediction of Patients’ Registration on the Renal Transplant Waiting List”. In: PLoS ONE 8.9 (2013).

[CH11] Chun Ling Chuang and Szu Teng Huang. “A hybrid neural network approach for credit scoring”. In: Expert Systems 28.2 (2011), pp. 185–196.

[Cha05] Nitesh V. Chawla. “Data Mining for Imbalanced Datasets: An Overview”. In: Maimon O., Rokach L. (eds) Data Mining and Knowledge Discovery Handbook. Springer, Boston, MA, 2005, pp. 853–867.

[Cha+02] Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall, and W. Philip Kegelmeyer. “SMOTE: Synthetic minority over-sampling technique”. In: Journal of Artificial Intelligence Research 16 (2002), pp. 321–357.

[Cha+03] Nitesh V Chawla, Aleksandar Lazarevic, Lawrence O Hall, and Kevin W Bowyer. “SMOTEBoost: Improving Prediction”. In: Lavrac N., Gamberger D., Todorovski L., Blockeel H. (eds) Knowledge Discovery in Databases: PKDD 2003. LNCS. Vol. 2838. Springer Berlin Heidelberg, 2003, pp. 107–119.

[Cha+11a] Kevin Chai, Chen Wu, Vidyasagar Potdar, and Pedram Hayati. “Automatically Measuring the Quality of User Generated Content in Forums”. In: Wang D., Reynolds M. (eds) AI 2011: Advances in Artificial Intelligence, LNCS. Vol. 7106. Springer Berlin Heidelberg, 2011, pp. 51–60.

[Cha+11b] Pierre-Laurent Chatain, Andrew Zerzan, Wameek Noor, Najah Dannaoui, and Louis de Koker. “Protecting Mobile Money against Financial Crimes: Global Policy Challenges and Solutions”. In: Directions in Development Finance (2011), pp. 1–234.

[Chu13] Chun-Ling Chuang. “Application of hybrid case-based reasoning for enhanced performance in bankruptcy prediction”. In: Information Sciences 236 (2013), pp. 174–185.

[CK91] Robert T.H. Chi and Melody Y. Kiang. “An integrated approach of rule-based and case-based reasoning for decision support”. In: Proceedings of the 19th annual conference on Computer Science (CSC ’91). Vol. 103. New York, NY, USA: ACM, 1991, pp. 255–267.

[CL01] J. M. Corchado and B. Lees. “Adaptation of cases for case based forecasting with neural network support”. In: Soft computing in case based reasoning (2001), pp. 293–319.

[CL06] Kelvin Chan and Jay Liebowitz. “The synergy of social network analysis and knowledge mapping: a case study”. In: International Journal of Management and Decision Making 7.1 (2006), p. 19.

[Cl18] Jan Chyan-long. “An Effective Financial Statements Fraud Detection Model for the Sustainable Development of Financial Markets: Evidence from Taiwan”. In: MDPI Sustainability Journal 10.2 (2018), p. 513.

[CLL05] Rong-chang Chen, Shu-ting Luol, and Vincent C.S. Lee. “Personalized Approach Based on SVM and ANN for Detecting Credit Card Fraud”. In: International Conference on Neural Network and Brain, ICNN&B ’05. Beijing, China: IEEE, 2005, pp. 810–815.

[Col13] Mark Coletti. The GeoMason Cookbook. Tech. rep. George Mason University, 2013, pp. 1–41.

[Cor08] Amelie Cordier. “Interactive and Opportunistic Knowledge Acquisition in Case-Based Reasoning”. PhD thesis. Universite Claude Bernard-Lyon I, 2008.

[CPV01] Corinna Cortes, Daryl Pregibon, and Chris Volinsky. “Communities of Interest”. In: Proceedings of the 14th International Conference on Advances in Intelligent Data Analysis. London, UK: Springer-Verlag, 2001, pp. 105–114.

[CZZ13] Peng Cao, Dazhe Zhao, and Osmar Zaiane. “An optimized cost-sensitive SVM for imbalanced data learning”. In: Pei J., Tseng V.S., Cao L., Motoda H., Xu G. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2013. LNCS 7819 (2013), pp. 280–292.

[DA14] Mehdi Darvishi and Gholamreza Ahmadi. “Validation techniques of agent based modelling for geospatial simulations”. In: International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives. Vol. XL-2/W3. Tehran, Iran, 2014, pp. 91–95.

[Dan17] B Dan. Cross-Validation. 2017.

[DCM93] Christian Darken, Joseph Change, and John Moody. “Learning rate schedules for stochastic gradient algorithms”. In: Neural Network for Signal Processing II. IEEE, 1993, p. 133.

[DD13] V. Dheepa and R. Dhanapal. “Hybrid Approach for Improvising Credit Card Fraud Detection Based on Collective Animal Behaviour and SVM”. In: Thampi S.M., Atrey P.K., Fan CI., Perez G.M. (eds) Security in Computing and Communications. SSCC 2013. Communications in Computer and Information Science. Springer Berlin Heidelberg, 2013, pp. 293–302.

[DH00] Chris Drummond and Robert C. Holte. “Explicitly representing expected cost”. In: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD ’00. 2000, pp. 198–207.

[DH03] Chris Drummond and R.C. Holte. “C4.5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling”. In: Workshop on Learning from Imbalanced Datasets II, ICML. Washington, DC, USA: AAAI Press, 2003, pp. 1–8.

[Div15] David Divitt. Social network analysis for fraud detection in payments. 2015.

[DK+15] Asli Demirguc-Kunt, Leora Klapper, Dorothe Singer, and Van Peter Oudheusden. The Global Findex Database 2014: Measuring Financial Inclusion around the World. Tech. rep. Washington, DC, USA: The World Bank Group, 2015, pp. 1–8.

[EA13] Shaza M Abd Elrahman and Ajith Abraham. “A Review of Class Imbalance Problem”. In: Network and Innovative Computing 1 (2013), pp. 332–340.

[Eka03] Aniko Ekart. “Using genetic algorithms for improved discrete sequence prediction”. In: International Conference on Artificial Intelligence. Las Vegas, Nevada, USA, 2003, pp. 1–10.

[Elh+16] T Elhassan, M Aljurf, F Mohanna, and M Shoukri. “Classification of Imbalance Data using Tomek Link (T-Link) Combined with Random Under-sampling (RUS) as a Data Reduction Method”. In: Journal of Informatics and Data Mining 1.2 (2016), pp. 1–12.

[Fab+17] Braun Fabian, Olivier Caelen, Evgueni N Smirnov, Steven Kelk, and Bertrand Lebichot. “Improving Card Fraud Detection Through Suspicious Pattern Discovery”. In: Benferhat S., Tabia K., Ali M. (eds) Advances in Artificial Intelligence: From Theory to Practice. IEA/AIE 2017. Vol. 10351. Springer, Cham, 2017, pp. 181–190.

[Fer+08] Alberto Fernandez, Salvador García, María Jose del Jesus, and Francisco Herrera. “A study of the behaviour of linguistic fuzzy rule based classification systems in the framework of imbalanced data-sets”. In: Fuzzy Sets and Systems 159.18 (2008), pp. 2378–2398.

[FM06] Zakia Ferdousi and Akira Maeda. “Anomaly Detection Using Unsupervised Profiling Method in Time Series Data”. In: ADBIS Research Communications (2006).

[FR93] Anthony G. Francis and Ashwin Ram. The utility problem in case-based reasoning. Tech. rep. 1993.

[FS96] Yoav Freund and Robert E. Schapire. “Experiments with a new boosting algorithm”. In: Proceedings of the Thirteenth International Conference of Machine Learning. Bari, Italy: Morgan Kaufmann, 1996, pp. 148–156.

[FSC99] Wei Fan, Salvatore J Stolfo, and Philip K Chan. “AdaCost: Misclassification Cost-sensitive Boosting”. In: Proceedings of the Sixteenth International Conference on Machine Learning (ICML 99). Morgan Kaufmann, 1999, pp. 97–105.

[Gab+13] Chrystel Gaber, Baptiste Hemery, Mohammed Achemlal, Marc Pasquet, and Pascal Urien. “Synthetic logs generator for fraud detection in mobile transfer services”. In: Sadeghi AR. (eds) Financial Cryptography and Data Security. FC 2013. LNCS. Vol. 7859. Springer, Heidelberg, 2013, pp. 397–398.

[Gal+12] Mikel Galar, Alberto Fern, Edurne Barrenechea, and Humberto Bustince. “A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches”. In: Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 42.4 (2012), pp. 463–484.

[GAM00] Batista Gustavo, Carvalho Andre, and Monard Maria. “Applying One-Sided Selection to Unbalanced Datasets”. In: Cairo O., Sucar L.E., Cantu F.J. (eds) MICAI 2000: Advances in Artificial Intelligence. MICAI 2000. Vol. 1793. Springer Berlin Heidelberg, 2000, pp. 315–325.

[GBC15] Ruibin Geng, Indranil Bose, and Xi Chen. “Prediction of financial distress: An empirical study of listed Chinese companies using data mining”. In: European Journal of Operational Research 241.1 (2015), pp. 236–247.

[GC03] Wu Gang and Edward Y. Chang. “Class-boundary alignment for imbalanced dataset learning”. In: The Twentieth International Conference on Machine Learning (ICML), Workshop on Imbalanced Data Sets. 1. Washington, DC, USA, 2003, pp. 49–56.

[GCG17] Katsiaryna V. Gris, Jean-Philippe Coutu, and Denis Gris. “Supervised and Unsupervised Learning Technology in the Study of Rodent Behavior”. In: Frontiers in Behavioral Neuroscience 11 (2017), pp. 1–6.

[GG16] Michael J. Greenacre and Patrick J. F. Groenen. “Weighted Euclidean Biplots”. In: Journal of Classification 33.3 (2016), pp. 442–459.

[GKB17] Aayushi Gupta, Dhananjay Kumar, and Atul Barve. “Hidden Markov Model based Credit Card Fraud Detection System with Time Stamp and IP Address”. In: International Journal of Computer Applications 166.5 (2017), pp. 33–37.

[Gor15] Dan Gorton. “IncidentResponseSim: An Agent-Based Simulation Tool for Risk Management of Online Fraud”. In: Buchegger S., Dam M. (eds) Secure IT Systems, LNCS. Vol. 9417. Springer, Cham, 2015, pp. 172–187.

[GRS01] Sudipto Guha, Rajeev Rastogi, and Kyuseok Shim. “CURE: An efficient clustering algorithm for large databases”. In: Information Systems 26.1 (2001), pp. 35–58.

[GS17] Swati Ganguly and Samira Sadaoui. “Classification of Imbalanced Auction Fraud Data”. In: Mouhoub M., Langlais P. (eds) Advances in Artificial Intelligence. Vol. 10233. Canada: Springer, 2017, pp. 84–89.

[Gu+08] Qiong Gu, Zhihua Cai, Li Zhu, and Bo Huang. “Data Mining on Imbalanced Data Sets”. In: International Conference on Advanced Computer Theory and Engineering, ICACTE ’08. Phuket, Thailand: IEEE Computer Society, 2008, pp. 1020–1024.

[GV16] Robb Genna and Thando Vilakazi. “Mobile Payments Markets in Kenya, Tanzania and Zimbabwe: A Comparative Study of Competitive Dynamics and Outcomes”. In: The African Journal of Information and Communication 17 (2016), pp. 9–37.

[HAA14] Ahmad Basheer Hassanat, Mohammad Ali Abbadi, and Ahmad Ali Alhasanat. “Solving the Problem of the K Parameter in the KNN Classifier Using an Ensemble Learning Approach”. In: International Journal of Computer Science and Information Security (IJCSIS) 12.8 (2014), pp. 33–39.

[Hai+16] He Haibo, Bai Yang, Garcia A. Edwardo, and Li Shutao. “Adaptive Synthetic Sampling Approach for Imbalanced Learning”. In: IEEE International Joint Conference on Neural Networks, IJCNN ’08. 3. Hong Kong, China: IEEE, 2016, pp. 1322–1328.

[Har68] P. Hart. “The condensed nearest neighbor rule (Corresp.)”. In: IEEE Transactions on Information Theory 14.3 (1968), pp. 515–516.

[Hay99] Simon Haykin. Neural Networks: A Comprehensive Foundation. Second. Pearson Education, 1999, pp. 23–56.

[HB12] Dirk Helbing and Stefano Balietti. “How to Do Agent-Based Simulations in the Future: From Modeling Social Mechanisms to Emergent Phenomena and Interactive Systems Design”. In: Social Self-Organisation. Ed. by Dirk Helbing. Springer Berlin Heidelberg, 2012, pp. 25–70.

[HD92] James S. Hodges and James A. Dewar. Is It You or Your Model Talking? A Framework for Model Validation. Tech. rep. Santa Monica, California: National Defense Research Institute, 1992, pp. 1–43.

[Hol00] Jaakko Hollmen. “User profiling and classification for fraud detection in mobile communications networks”. PhD thesis. Helsinki University of Technology, 2000.

[HSW07] Thomas N. Herzog, Fritz J. Scheuren, and William E. Winkler. “What is Data Quality and Why Should We Care”. In: Data Quality and Record Linkage techniques. 1st ed. Springer-Verlag New York, 2007. Chap. 2, pp. 7–15.

[HTF09] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning. second. Vol. 27. 2. Springer, 2009, pp. 83–85.

[IA16] Amira Kamil Ibrahim Hassan and Ajith Abraham. “Modeling Insurance Fraud Detection Using Ensemble Combining Classification”. In: International Journal of Computer Information Systems and Industrial Management Applications 8 (2016), pp. 257–265.

[IRU18] Sohony Ishan, Pratap Rameshwar, and Nambiar Ullas. “Ensemble learning for credit card fraud detection”. In: Proceedings of Joint International Conference on Data Science and Management of Data. India: ACM, 2018, pp. 289–294.

[JADaRg14] Jose L. Jorro-Aragoneses, Belen Díaz-agudo, and Juan A. Recio-garcía. “CBR tagging of emotions from facial expressions”. In: Lamontagne L., Plaza E. (eds) Case-Based Reasoning Research and Development (ICCBR), LNCS. Vol. 8765. Springer, Heidelberg, 2014, pp. 245–259.

[Jen08] Beth Jenkins. Developing Mobile Money Ecosystems. Tech. rep. Harvard Kennedy School, 2008, p. 36.

[Jes+05] Daniel R. Jeske, Behrokh Samadi, Pengyue J. Lin, Lan Ye, Sean Cox, Rui Xiao, Ted Younglove, Minh Ly, Douglas Holt, and Ryan Rich. “Generation of Synthetic Data Sets for Evaluating the Accuracy of Knowledge Discovery Systems”. In: Proceeding of the eleventh SIGKDD international conference on knowledge discovery in data mining. Chicago, USA: ACM, 2005, pp. 756–762.

[JTPL03] Chang Hyeon Joh, Harry J.P. Timmermans, and Peter T.L. Popkowski-Leszczyc. “Identifying purchase-history sensitive shopper segments using scanner panel data and sequence alignment methods”. In: Journal of Retailing and Consumer Services 10.3 (2003), pp. 135–144.

[Kap12] Stelios Kapetanakis. “Intelligent monitoring of business processes using case-based reasoning”. PhD thesis. University of Greenwich, 2012.

[Kap+12] Stelios Kapetanakis, Georgios Samakovitis, P V G Buddhika Gunasekera, and Miltos Petridis. “Monitoring Financial Transaction Fraud with the use of Case-based Reasoning”. In: Seventeenth UK Workshop on Case-Based Reasoning. Cambridge, UK, 2012.

[KBC16] Yeonkook J. Kim, Bok Baik, and Sungzoon Cho. “Detecting financial misstatements with fraud intention using multi-class cost-sensitive learning”. In: Expert Systems with Applications 62 (2016), pp. 32–43.

[Ken16] Central Bank of Kenya. The Kenya Financial Sector Stability Report, 2015. Tech. rep. 7. Nairobi: Central Bank of Kenya, 2016.

[KGA12] Marjan Kaedi and Nasser Ghasem-Aghaee. “Improving case-based reasoning in solving optimization problems using Bayesian optimization algorithm”. In: Intelligent Data Analysis 16.2 (2012), pp. 199–210.

[KH01] Kyung Sup Kim and Ingoo Han. “The cluster-indexing method for case-based reasoning using self-organizing maps and learning vector quantization for bond rating cases”. In: Expert Systems with Applications 21.3 (2001), pp. 147–156.

[Kj04] Kim Kyoung-jae. “Toward Global Optimization of Case-Based Reasoning Systems for Financial Forecasting”. In: Applied Intelligence 21.2 (2004), pp. 239–249.

[Koh95] Ron Kohavi. “A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection”. In: Proceedings of the 14th international joint conference on Artificial intelligence. Vol. 2. Montreal, Quebec, Canada: Morgan Kaufmann, 1995, pp. 1137–1143.

[Kok97] A. I. Kokkinaki. “On atypical database transactions: Identification of probable frauds using machine learning for user profiling”. In: Proceedings of Knowledge & Data Engineering Exchange Workshop, KDEX. Newport Beach, USA: IEEE, 1997, pp. 107–113.

[Kou+04] Yufeng Kou, Chang-Tien Lu, Sirirat Sirwongwattana, and Yo-Ping Huang. “Survey of fraud detection techniques”. In: Proceedings of International Conference on Networking, Sensing & Control, IEEE 2 (2004), pp. 749–754.

[KPJ15] Gulmira Khussainova, Sanja Petrovic, and Rupa Jagannathan. “Retrieval with Clustering in a Case-Based Reasoning System for Radiotherapy Treatment Planning”. In: Journal of Physics: Conference Series 616 (2015), pp. 1–11.

[KPK10] S Kapetanakis, M Petridis, and B Knight. “A case based reasoning approach for the monitoring of business workflows”. In: Bichindaritz, I., Montani, S. (eds.) (ICCBR 2010) LNCS. Vol. 6176. sp, 2010, pp. 390–405.

[KS12] Yoonseong Kim and So Young Sohn. “Stock fraud detection using peer group analysis”. In: Expert Systems with Applications 39.10 (2012), pp. 8986–8992.

[KSM06] Amlan Kundu, Shamik Sural, and A.K. Majumdar. “Two-Stage Credit Card Fraud Detection Using Sequence Alignment”. In: Bagchi A., Atluri V. (eds) Information Systems Security, (ICISS2006) LNCS. Vol. 4332. Springer Berlin Heidelberg, 2006, pp. 260–275.

[KSM07] Efstathios Kirkos, Charalambos Spathis, and Yannis Manolopoulos. “Data Mining techniques for the detection of fraudulent financial statements”. In: Expert Systems with Applications 32.4 (2007), pp. 995–1003.

[Kun+09] Amlan Kundu, Suvasini Panigrahi, Shamik Sural, and Arun K. Majumdar. “BLAST-SSAHA Hybridization for Credit Card Fraud Detection”. In: IEEE Transactions on Dependable and Secure Computing. Vol. 6. 4. IEEE Computer Society, 2009, pp. 309–315.

[Lak13] Andrew James Lake. Risk management in Mobile Money: Observed Risks and Proposed Mitigants for Mobile Money Operators. Tech. rep. Swiss: World Bank Group, 2013, pp. 1–21.

[LD12] Norman Lonergan and Jonathan Dharmapalan. “Mobile money: An overview for global telecommunications operators”. In: Mobile Money (2012), pp. 1–40.

[Lee08] Gun Ho Lee. “Rule-based and case-based reasoning approach for internal audit of bank”. In: Knowledge-Based Systems, ScienceDirect 21.2 (2008), pp. 140–147.

[Lin+14] Fengyi Lin, Deron Liang, Ching Chiang Yeh, and Jui Chieh Huang. “Novel feature selection methods to financial distress prediction”. In: Expert Systems with Applications 41.5 (2014), pp. 2472–2483.

[LKJ02] Emilie Lundin, Hakan Kvarnstrom, and Erland Jonsson. “A Synthetic Fraud Data Generation Methodology”. In: Deng R., Bao F., Zhou J., Qing S. (eds) Information and Communications Security. ICICS 2002. LNCS. Vol. 2513. Springer, Heidelberg, 2002, pp. 265–277.

[Lla+11] Marc Llanes, Elsa Prieto, Rodrigo Diaz, et al. D2.1.1 Scenario Requirements (Public version). Tech. rep. 2011, pp. 1–75.

[LN15] Sanni Lookman and Selmin Nurcan. “A framework for occupational fraud detection by social network analysis”. In: CEUR Workshop Proceedings. Vol. 1367. Stockholm, Sweden: CEUR, 2015, pp. 221–228.

[LrA12a] Edgar Alonso Lopez-rojas and Stefan Axelsson. “Money Laundering Detection using Synthetic Data”. In: The 27th annual workshop of the Swedish Artificial Intelligence Society (SAIS), Orebro, Sweden. 2012, p. 49.

[LrA12b] Edgar Alonso Lopez-rojas and Stefan Axelsson. “Multi Agent Based Simulation (MABS) of Financial Transactions for Anti Money Laundering (AML)”. In: 17th Nordic Conference on Secure IT. Karlskrona, Sweden, 2012.

[LRA14] Edgar Alonso Lopez-Rojas and Stefan Axelsson. “Social Simulation of Commercial and Financial Behaviour for Fraud Detection Research”. In: Miguel, Amblard, Barcelo & Madella (eds.) Advances in Computational Social Science and Social Simulation. Bellaterra, Cerdanyola del Valles, 2014, pp. 1–12.

[LS10a] Hui Li and Jie Sun. “Business failure prediction using hybrid2 case-based reasoning (H2CBR)”. In: Computers & Operations Research 37.1 (2010), pp. 137–151.

[LS10b] Charles X. Ling and Victor S. Sheng. “Cost-Sensitive Classification”. In: Sammut C., Webb G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, 2010, pp. 231–235.

[Lui+15] Coppolino Luigi, D’Antonio Salvatore, Formicola Valerio, Massei Carmine, and Romano Luigi. “Use of the Dempster-Shafer Theory for Fraud Detection: The Mobile Money Transfer Case Study”. In: Camacho D., Braubach L., Venticinque S., Badica C. (eds) Intelligent Distributed Computing VIII. Studies in Computational Intelligence. Vol. 570. Springer, Cham, 2015, pp. 465–474.

[Luk+04] Sean Luke, Claudio Cioffi-Revilla, Liviu Panait, and Keith Sullivan. “Mason: A new multi-agent simulation toolkit”. In: Proceedings of the 2004 SwarmFest Workshop. Vol. 8. 2. 2004, pp. 316–327.

[Lur11] Michael Lurie. What is a business model? A new approach. Tech. rep. Blue Mine Group, 2011, pp. 1–7.

[LWZ09] Xu Ying Liu, Jianxin Wu, and Zhi Hua Zhou. “Exploratory under-sampling for class-imbalance learning”. In: IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 39.2 (2009), pp. 539–550.

[LX12] H Li and Tao Xiong. “Predicting business risk using combined case-based reasoning in Euclidean space”. In: World Automation Congress (WAC), 2012 (2012), pp. 1–6.

[MA16] Abdelhak Mansoul and Baghdad Atmani. “Clustering to Enhance Case-Based Reasoning”. In: Chikhi S., Amine A., Chaoui A., Kholladi M., Saidouni D. (eds) Modelling and Implementation of Complex Systems. LNNS. Vol. 1. Springer, 2016, pp. 137–151.

[Mag09] Dan Magnusson. “The costs of implementing the anti-money laundering regulations in Sweden”. In: Journal of money laundering control 12.2 (2009), pp. 101–112.

[Maj15] Archisman Majumdar. “Social Network Analysis Approaches for Fraud Analytics”. In: Mphasis NEXTlabs (2015), pp. 1–5.

[Mal+97] Marcus A. Maloof, Pat Langley, Stephanie Sage, and Thomas O. Binford. “Learning to detect rooftops in aerial images”. In: Proceedings of the 1997 Image Understanding Workshop (DARPA97). San Francisco, CA: Morgan Kaufmann, 1997, pp. 835–845.

[Man+12] Jaweria Manzoor, Saara Asif, Maryum Masud, and Malik Jahan Khan. “Automatic Case Generation for Case-Based Reasoning Systems Using Genetic Algorithms”. In: Third Global Congress on Intelligent Systems (GCIS). China: IEEE, 2012, pp. 311–314.

[Mar+13] Kowalski Martin, Klupfel Hubert, Zelewski Stephan, Bergenrodt Daniel, and Saur Alexandra. “Integration of Case-Based and Ontology-Based Reasoning for the Intelligent Reuse of Project-Related Knowledge”. In: Clausen U., ten Hompel M., Klumpp M. (eds) Efficiency and Logistics. Lecture Notes in Logistics. Springer, Berlin, Heidelberg, 2013, pp. 289–299.

[MAS] MASON. Multi-Agent Simulator Of Neighborhoods.

[Mch13] Mary L Mchugh. “The Chi-square test of independence: Lessons in biostatistics”. In: Biochemia Medica 23.2 (2013), pp. 143–149.

[Mer11] Cynthia Merritt. Mobile money transfer services: The next phase in the evolution of person-to-person payments. Tech. rep. 2011, pp. 1–32.

[MGA04] Luke K Mcdowell, Kalyan Moy Gupta, and David W Aha. “Case-Based Collective Classification”. In: American Association for Artificial Intelligence. 2004, pp. 399–404.

[ML10] Britni Must and Kathleen Ludewig. “Mobile money: Cell phone banking in Developing countries”. In: Policy Matters 2.7 (2010), pp. 1–35.

[MNV14] Stephen O. Moepya, Fulufhelo V. Nelwamondo, and Christiaan Van Der Walt. “A Support Vector Machine Approach to Detect Financial Statement Fraud in South Africa: A First Look”. In: Asian Conference on Intelligent Information and Database Systems (ACIIDS). LNCS. Vol. 8398. Springer, Cham, 2014, pp. 42–51.

[Moh+09] Azlinah Mohamed, Ahmad Fuad Mohamed Bandi, Abdul Razif Tamrin, Md Daud Jaafar, Suriah Hasan, and Faeizah Jusof. “Telecommunication fraud prediction using backpropagation neural network (SoCPaR)”. In: International Conference of Soft Computing and Pattern Recognition. Malaysia: IEEE Computer Society, 2009, pp. 259–265.

[Mou+06] Kim Mouridsen, Søren Christensen, Louise Gyldensted, and Leif Østergaard. “Automatic selection of arterial input function using cluster analysis”. In: Magnetic Resonance in Medicine 55.3 (2006), pp. 524–531.

[MP17] N. Malini and M. Pushpa. “Analysis on credit card fraud identification techniques based on KNN and outlier detection”. In: Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB). Ed. by IEEE. Chennai, India, 2017.

[Muy15] Catherine Muya. “Mobile money in Africa”. In: Barclays Bank PLC (2015), pp. 1–5.

[Net] Netlogo. Agent-Based Modelling Toolkit.

[Nga+11] E. W. T. Ngai, Yong Hu, Y. H. Wong, Yijun Chen, and Xin Sun. “The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature”. In: Decision Support Systems 50.3 (2011), pp. 559–569.

[NK14] Evgenia Novikova and Igor Kotenko. “Visual analytics for detecting anomalous activity in mobile money transfer services”. In: Teufel S., Min T.A., You I., Weippl E. (eds) Availability, Reliability, and Security in Information Systems. CD-ARES 2014. LNCS 8708 (2014), pp. 63–78.

[Nol17] Ian Nolan. “Transaction Fraud Detection using Random Forest Classifier and Logistic Regression”. In: Neural Networks & Machine Learning 1.1 (2017).

[Ols14] Dominik Olszewski. “Fraud detection using self-organizing map visualizing the user profiles”. In: Knowledge-Based Systems 70 (2014), pp. 324–334.

[OT08] Nikunj C. Oza and Kagan Tumer. “Classifier ensembles: Select real-world applications”. In: Information Fusion 9.1 (2008), pp. 4–20.

[Pad+07] T. Maruthi Padmaja, Narendra Dhulipalla, P Radha Krishna, Raju S Bapi, and A Laha. “An Unbalanced Data Classification Model Using Hybrid Sampling Technique for Fraud Detection”. In: Ghosh A., De R.K., Pal S.K. (eds) Pattern Recognition and Machine Intelligence (PReMI 2007) LNCS. Vol. 4815. Springer, Berlin, Heidelberg, 2007, pp. 341–348.

[Pae17] Anthea Paelo. “A Comparison of the Mobile Financial Services Sector in Kenya, Tanzania and Uganda”. In: The 3rd Annual Competition and Economic Regulation (ACER) Conference. Dar es Salaam, Tanzania, 2017, pp. 1–19.

[PAG07] Jean Pinquet, Mercedes Ayuso, and Montserrat Guillen. “Selection bias and auditing policies for insurance claims”. In: Journal of Risk and Insurance 74.2 (2007), pp. 425–440.

[Pan+09] Suvasini Panigrahi, Amlan Kundu, Shamik Sural, and A. K. Majumdar. “Credit card fraud detection: A fusion approach using Dempster-Shafer theory and Bayesian learning”. In: Information Fusion 10.4 (2009), pp. 354–363.

[Pat73] Edward A. Patrick. “Clustering Using a Similarity Measure Based on Shared Near Neighbors”. In: IEEE Transactions on Computers C-22.11 (1973), pp. 1025–1034.

[PDM15] Radu Platon, Vahid Raissi Dehkordi, and Jacques Martel. “Hourly prediction of a building’s electricity consumption using case-based reasoning, artificial neural networks and principal component analysis”. In: Energy and Buildings 92 (2015), pp. 10–18.

[Per+13] Kasun S. Perera, Bijay Neupane, Mustafa Amir Faisal, Zeyar Aung, and Wei Lee Woon. “A novel ensemble learning-based approach for click fraud detection in mobile advertising”. In: Prasath R., Kathirvalavakumar T. (eds) Mining Intelligence and Knowledge Exploration. LNCS. Vol. 8284. Springer, Cham, 2013, pp. 370–382.

[PF01] Foster Provost and Tom Fawcett. “Robust classification for imprecise environments”. In: Machine Learning 42.3 (2001), pp. 203–231.

[PH02] Cheol-Soo Park and Ingoo Han. “A case-based reasoning with the feature weights derived by analytic hierarchy process for bankruptcy prediction”. In: Expert Systems with Applications 23.3 (2002), pp. 255–264.

[PH07] Jim Prentzas and Ioannis Hatzilygeroudis. “Categorizing approaches combining rule-based and case-based reasoning”. In: Expert Systems 24.2 (2007), pp. 97–122.

[Phu+10] Clifton Phua, Vincent Lee, Kate Smith, and Ross Gayler. “A Comprehensive Survey of Data Mining-based Fraud Detection Research”. In: International Conference on Intelligent Computation Technology and Automation (ICICTA). Vol. 3. Changsha, China: IEEE, 2010, p. 14.

[Pow07] David M W Powers. Evaluation: From Precision, Recall and F-measure to ROC, Informedness, Markedness & Correlation. Tech. rep. Adelaide, Australia: Technical Report (SIE) School of Informatics and Engineering, Flinders University, 2007, pp. 1–24.

[Poz15] Andrea Dal Pozzolo. “Adaptive Machine Learning for Credit Card Fraud Detection”. PhD thesis. Université Libre de Bruxelles, 2015, pp. 1–55.

[Pro00] Foster Provost. “Machine learning from imbalanced data sets 101”. In: Proceedings of the AAAI’2000 Workshop on Imbalanced Data Sets. 2000, pp. 1–3.

[PZ06] Yaling Pei and Osmar Zaïane. A synthetic data generator for clustering and outlier analysis. Tech. rep. Alberta: University of Alberta, 2006, pp. 1–33.

[QF+15] Zhou Qi-Feng, Zhou Hao, Ning Yong-Peng, Yang Fan, and Li Tao. “Two approaches for novelty detection using random forest”. In: Expert Systems with Applications 42.10 (2015), pp. 4840–4850.

[Qin+18] Yuchu Qin, Wenlong Lu, Qunfen Qi, Xiaojun Liu, Meifa Huang, Paul J. Scott, and Xiangqian Jiang. “Towards an ontology-supported case-based reasoning approach for computer-aided tolerance specification”. In: Knowledge-Based Systems 141 (2018), pp. 129–147.

[Ran+17] Kuldeep Randhawa, Chu Kiong Loo, Manjeevan Seera, Chee Peng Lim, and Asoke K. Nandi. “Credit card fraud detection using AdaBoost and majority voting”. In: IEEE Access XX (2017), pp. 1–8.

[Rav+11] P. Ravisankar, V. Ravi, G. Raghava Rao, and I. Bose. “Detection of financial statement fraud and feature selection using data mining techniques”. In: Decision Support Systems 50.2 (2011), pp. 491–500.

[Rep] Repast. Recursive porous agent simulation toolkit.

[RGDAGC08] Juan A. Recio-García, Belén Díaz-Agudo, and Pedro González-Calero. jCOLIBRI2 Tutorial. Tech. rep. University Complutense of Madrid, 2008, pp. 1–110.

[RGGCDA14] Juan A. Recio-Garcia, Pedro A. Gonzalez-Calero, and Belen Diaz-Agudo. “Jcolibri2: A framework for building Case-based reasoning systems”. In: Science of Computer Programming 79 (2014), pp. 126–145.

[Rie+13] Roland Rieke, Maria Zhdanova, Jurgen Repp, Romain Giot, and Chrystel Gaber. “Fraud Detection in Mobile Payments Utilizing Process Behavior Analysis”. In: International Conference on Availability, Reliability and Security (ARES). Germany: IEEE, 2013, pp. 662–669.

[RK04] Bhavani Raskutti and Adam Kowalczyk. “Extreme Re-balancing for SVMs: a case study”. In: ACM SIGKDD Explorations 6.1 (2004), pp. 60–69.

[RLJ06] Steven F. Railsback, Steven L. Lytinen, and Stephen K. Jackson. “Agent-based Simulation Platforms: Review and Development Recommendations”. In: Simulation 82.9 (2006), pp. 609–623.

[RN03] Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. 2nd. Alan Apt, 2003, pp. 215–218.

[Rog83] Schank C. Roger. Dynamic memory: A theory of reminding and learning in computers and people. Cambridge University Press, New York, NY, USA, 1983, pp. 1–20.

[San+17] B Santoso, H Wijayanto, K.A Notodiputro, and B Sartono. “Class Imbalanced Problems: A Review”. In: Conference Series: Earth and Environmental Science. Vol. 58. 1. IOP Publishing Ltd, 2017, pp. 427–436.

[Sar+14] P. Saravanan, V. Subramaniyaswamy, N. Sivaramakrishnan, M. Arun Prakash, and T. Arunkumar. “Data mining approach for subscription-fraud detection in telecommunication sector”. In: Contemporary Engineering Sciences 7.11 (2014), pp. 515–522.

[SBD13] Yusuf Sahin, Serol Bulkan, and Ekrem Duman. “A cost-sensitive decision tree approach for fraud detection”. In: Expert Systems with Applications 40 (2013), pp. 5916–5923.

[SCP16] David Shrier, German Canale, and Alex Pentland. Mobile Money & Payments: Technology Trends. Tech. rep. Massachusetts Institute of Technology, 2016.

[SE13] Abir Smiti and Zied Elouedi. “Using clustering for maintaining case based reasoning systems”. In: 5th International Conference on Modeling, Simulation and Applied Optimization (ICMSAO). 2013, pp. 1–6.

[SEDMK14] Dina A. Sharaf-El-Deen, Ibrahim F. Moawad, and M.E Khalifa. “A New Hybrid Case-Based Reasoning Approach for Medical Diagnosis Systems”. In: Medical Systems 38.2 (2014), pp. 1–11.

[SH99] Kyung-shik Shin and Ingoo Han. “Case-based reasoning supported by genetic algorithms for corporate bond rating”. In: Expert Systems with Applications 16.2 (1999), pp. 85–95.

[Sha+16] Aulon Shabani, Adil Paul, Radu Platon, and Eyke Hüllermeier. “Predicting the Electricity Consumption of Buildings: An Improved CBR Approach”. In: Goel A., Díaz-Agudo M., Roth-Berghofer T. (eds) Case-Based Reasoning Research and Development. ICCBR 2016. LNCS. Atlanta, USA: Springer Berlin Heidelberg, 2016, pp. 356–369.

[She13] Sheng Shen. Forecast: Mobile Payment, Worldwide, 2013 Update. 2013.

[SK11] Sanjay Sood and Parijat Kat. “Business Listing Classification Using Case Based Reasoning and Joint Probability”. In: AAAI2011 Symposium. 2011, pp. 23–28.

[SK13] Georgios Samakovitis and Stelios Kapetanakis. “Computer-aided Financial Fraud Detection: Promise and Applicability in Monitoring Financial Transaction Fraud”. In: Proceedings of International Conference on Business Management and IS, Dubai, United Arab Emirates. 2013.

[SM70] John W Slocum and H Lee Mathews. “Social Class and Income as Indicators of Consumer Credit Behavior”. In: Journal of Marketing 34.2 (1970), pp. 69–74.

[SMM13] Yaya Sylla and Pierre Morizet-Mahoudeaux. “Fraud Detection on Large Scale Social Networks”. In: International Congress on Big Data. 1. Santa Clara, CA, USA: IEEE, 2013, pp. 413–414.

[SN10] K. K. Sherly and R Nedunchezhian. “BOAT adaptive credit card fraud detection system”. In: International Conference on Computational Intelligence and Computing Research. Coimbatore, India: IEEE, 2010, pp. 1–7.

[SP15] Sharmila Subudhi and Suvasini Panigrahi. “Quarter-Sphere Support Vector Machine for Fraud Detection in Mobile Telecommunication Networks”. In: International Conference on Computer, Communication and Convergence (ICCC 2015). Vol. 48. Elsevier, 2015, pp. 353–359.

[SPA10] Ikan Sidat, D I Perairan, and Segara Anakan. “Sebaran Ukuran Hasil Tangkapan Dan Aspek Reproduksi”. In: Pattern Recognition Letters, Science Direct 20.10 (2010).

[Sri+99] Anurag Srivastava, Eui-Hong Han, Vipin Kumar, and Vineet Singh. “Parallel Formulations of Decision-Tree Classification Algorithms”. In: Data Mining and Knowledge Discovery 3.3 (1999), pp. 237–261.

[SS99] Robert E. Schapire and Yoram Singer. “Improved boosting algorithms using confidence-rated predictions”. In: Machine Learning 37.3 (1999), pp. 297–336.

[Ste11] Peer Stein. IFC Mobile Money Study 2011. Tech. rep. Washington, DC, USA: International Finance Corporation, World Bank Group, 2011.

[Sud+10] Agus Sudjianto, Sheela Nair, Ming Yuan, Aijun Zhang, Daniel Kern, and Fernando Cela-Díaz. “Statistical methods for fighting financial crimes”. In: Technometrics 52.1 (2010), pp. 5–19.

[Sun+07] Yanmin Sun, Mohamed S. Kamel, Andrew K.C. Wong, and Yang Wang. “Cost-sensitive boosting for classification of imbalanced data”. In: Pattern Recognition 40.12 (2007), pp. 3358–3378.

[SVF15] Cesar Silva, Germano Vasconcelos, and Gabriel Frana. “Case-based Reasoning Combined with Neural Networks for Credit Risk Analysis”. In: International Joint Conference on Neural Networks (IJCNN). Killarney, Ireland: IEEE, 2015.

[Swa] Swarm. Agent-Based Modelling Toolkit.

[TD13] Lin Tong and Wu Di. “Research on Optimization of Case-Based Reasoning System”. In: Third International Conference on Control, Automation and Systems Engineering (CASE 2013). Atlantis Press, 2013, pp. 34–37.

[Tob11] Peter Tobbin. “Understanding the mobile money ecosystem: Roles, structure and strategies”. In: Proceedings of 10th International Conference on Mobile Business, ICMB 2011. Como, Italy: IEEE Computer Society, 2011, pp. 185–194.

[Tom76] Ivan Tomek. “Two Modifications of CNN”. In: Transactions on Systems, Man, and Cybernetics SMC-6.11 (1976), pp. 769–772.

[Tru15] FinMark Trust. FinScope Consumer Survey Zimbabwe 2014. Tech. rep. 2015, pp. 1–11.

[Uga16] Bank of Uganda. Annual Supervision Report. Tech. rep. December. Kampala, 2016, pp. 24–25.

[Via+04] S. Viaene, D. Van Gheel, M. Ayuso, and M. Guillen. “Cost-Sensitive Design of Claim Fraud Screens”. In: Perner P. (eds) Advances in Data Mining. ICDM 2004. LNCS. Vol. 3275. Springer, Berlin, Heidelberg, 2004.

[VKN07] Jason Van Hulse, Taghi M. Khoshgoftaar, and Amri Napolitano. “Experimental perspectives on learning from imbalanced data”. In: Proceedings of the 24th international conference on Machine learning - ICML ’07. ACM, 2007, pp. 935–942.

[Ver+17] Van Vlasselaer Veronique, Eliassi-Rad Tina, Akoglu Leman, Snoeck Monique, and Baesens Bart. “GOTCHA! Network-based fraud detection for social security fraud”. In: Management Science 63.9 (2017), pp. 3090–3110.

[WA00] Richard Wheeler and Stuart Aitken. “Multiple algorithms for fraud detection”. In: Knowledge-Based Systems 13.2 (2000), pp. 93–99.

[Wan+03] Haixun Wang, Wei Fan, Philip S. Yu, and Jiawei Han. “Mining concept-drifting data streams using ensemble classifiers”. In: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2003, pp. 226–235.

[Wat99] Ian Watson. “Case-based reasoning is a methodology not a technology”. In: Knowledge-Based Systems 12 (1999), pp. 303–308.

[WB15] Jarrod West and Maumita Bhattacharya. “Mining Financial Statement Fraud: An Analysis of Some Experimental Issues”. In: Proceedings of The 10th IEEE Conference on Industrial Electronics and Applications (ICIEA 2015) (2015).

[WB16] Jarrod West and Maumita Bhattacharya. “Some Experimental Issues in Financial Fraud Detection: An Investigation”. In: The International Conference on Computational Science (ICCS2016). Procedia Computer Science, Elsevier 80 (2016), pp. 7–10.

[Wes+08] David J. Weston, David J. Hand, Niall M. Adams, Christopher Whitrow, and Piotr Juszczak. “Plastic card fraud detection using peer group analysis”. In: Advances in Data Analysis and Classification 2.1 (2008), pp. 45–62.

[WHV08] Mark A. Whiting, Jereme Haack, and Carrie Varley. “Creating realistic, scenario-based synthetic data for test and evaluation of information analytics software”. In: Proceedings of the 2008 Workshop on BEyond Time and Errors: Novel EvaLuation Methods for Information Visualization (BELIV08). 8. Florence, Italy: ACM, 2008, pp. 1–6.

[Wil72] Dennis L. Wilson. “Asymptotic Properties of Nearest Neighbor Rules Using Edited Data”. In: IEEE Transactions on Systems, Man and Cybernetics SMC-2.3 (1972), pp. 408–421.

[WMZ07] Gary M. Weiss, Kate McCarthy, and Bibi Zabar. “Cost-sensitive learning vs. sampling: Which is best for handling unbalanced classes with unequal error costs?” In: IEEE ICDM. 2007, pp. 35–41.

[WP03] Gary M Weiss and Foster J Provost. “Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction.” In: J. Artif. Intell. Res. (JAIR) 19 (2003), pp. 315–354.

[WST10] Jack William, Tavneet Suri, and Robert Townsend. “Monetary Theory and Electronic Money: Reflections on the Kenyan Experience.” In: Economic Quarterly, Massachusetts Institute of Technology 96.1 (2010), pp. 83–122.

[Wue+16] Thorsten Wuest, Daniel Weimer, Christopher Irgens, and Klaus-Dieter Thoben. “Machine learning in manufacturing: advantages, challenges, and applications”. In: Production & Manufacturing Research 4.1 (2016), pp. 23–45.

[XSL07] Jianyun Xu, Andrew H. Sung, and Qingzhong Liu. “Behaviour mining for fraud detection”. In: Journal of Research and Practice in Information Technology 39.1 (2007), pp. 3–18.

[Yar16] Suresh Yaram. “Machine learning algorithms for document clustering and fraud detection”. In: Data Science and Engineering (ICDSE). Cochin, India: IEEE, 2016.

[YL06] Show Jane Yen and Yue Shi Lee. “Under-sampling approaches for improving prediction of the minority class in an imbalanced dataset”. In: Huang DS., Li K., Irwin G.W. (eds) Intelligent Control and Automation. Lecture Notes in Control and Information Sciences 344 (2006), pp. 731–740.

[YMR14] Wei Liang Yeow, Rohana Mahmud, and Ram Gopal Raj. “An application of case-based reasoning with machine learning for forensic autopsy”. In: Expert Systems with Applications 41.7 (2014), pp. 3497–3505.

[Yua+17] Shuhan Yuan, Xintao Wu, Jun Li, and Aidong Lu. “Spectrum-based deep neural networks for fraud detection”. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. Singapore, Singapore: ACM, 2017, pp. 2419–2422.

[Yue+07] Dianmin Yue, Xiaodan Wu, Yunfeng Wang, and Yue Li. “A Review of Data Mining-based Financial Fraud Detection Research”. In: International Conference on Wireless Communications, Networking and Mobile Computing (WiCom 2007). Shanghai, China: IEEE, 2007, pp. 5519–5522.

[Yu+16] Dequan Yu, Chuan Lv, Quan Lei Wu, and Xu Peng. “Study on case-based reasoning expert system about the matching optimization of particle swarm optimization algorithm”. In: Prognostics and System Health Management Conference, PHM 2015. 2016.

[Zha12] Jian Zhang. “Financial inclusion and integration through mobile payments and transfer”. In: Proceedings of Workshops on Enhancing Financial Integration through Sound Regulation of Cross-Border Mobile Payments: Opportunities and Challenges. Mumbai, India: Africa Development Bank Group, 2012.

[Zhd+14] Maria Zhdanova, Jurgen Repp, Roland Rieke, Chrystel Gaber, and Baptiste Hemery. “No smurfs: Revealing fraud chains in mobile money transfers”. In: 9th International Conference on Availability, Reliability and Security, ARES 2014. IEEE, 2014, pp. 11–20.

[ZLD14] Yu Jie Zhao, Xin Xing Luo, and Li Deng. “A CBR-Based and MAHP-based customer value prediction model for new product development”. In: Scientific World Journal 2014.1 (2014).

[ZM03] Jianping Zhang and Inderjeet Mani. “kNN Approach to Unbalanced Data Distributions: A Case Study involving Information Extraction”. In: Workshop on Learning from Imbalanced Datasets II ICML Washington DC 2003. Washington DC: Scientific Research Publishing Inc., 2003, pp. 42–48.

[ZS06] V Zaslavsky and A Strizhak. “Credit Card Fraud Detection Using Self-Organizing Maps”. In: Information and Security 18 (2006), pp. 48–63.

[ZS15] Masoumeh Zareapoor and Pourya Shamsolmoali. “Application of credit card fraud detection: Based on bagging ensemble classifier”. In: Procedia Computer Science. Vol. 48. Elsevier Ltd, 2015, pp. 679–686.

[ZSY03] Zhongfei Zhang, John J. Salerno, and Philip S. Yu. “Applying data mining in investigating money laundering crimes”. In: Proceedings of the SIGKDD International Conference on Knowledge Discovery and Data Mining. Washington, DC: ACM, 2003, pp. 747–752.

[ZY12] Cha Zhang and Ma Yunqian. Ensemble Machine Learning. Springer Berlin Heidelberg, 2012, pp. 1–35.

[Com18] Information Commissioner’s Office UK. Guide to the General Data Protection Regulation (GDPR). 2018.

[Int13a] International Telecommunication Union. The Mobile Money Revolution. Part 1: NFC Mobile Payments. Tech. rep. ITU-T Technology, 2013, p. 22.

[Int13b] International Telecommunication Union. The Mobile Money Revolution. Part 2: Financial Inclusion Enabler. Tech. rep. ITU-T Technology, 2013, p. 30.

[Isl+02] E. Islas Perez, C.A. Coello Coello, A. Hernandez-Aguirre, and A. Villavicencio Ramírez. “Genetic Algorithms and Case-Based Reasoning as a Discovery and Learning Machine in the Optimization of Combinational Logic Circuits”. In: Coello Coello C.A., de Albornoz A., Sucar L.E., Battistutti O.C. (eds) MICAI 2002: Advances in Artificial Intelligence. MICAI 2002. 2002, pp. 128–137.

[Van+15] Veronique Van Vlasselaer, Cristian Bravo, Olivier Caelen, Tina Eliassi-Rad, Leman Akoglu, Monique Snoeck, and Bart Baesens. “APATE: A novel approach for automated credit card transaction fraud detection using network-based extensions”. In: Decision Support Systems 75 (2015), pp. 38–48.
