A Data Mining Approach to Build AML Indices: A Case Study
Claudio Antonini, Ph.D.
Deloitte Financial Advisory Services LLP
New York
Copyright © 2013 Deloitte Development LLC. All rights reserved. 1 Data mining approach to build indices Deloitte.
In 2012, the FSA fined a bank for “failure to take reasonable care to establish and
maintain adequate anti-money laundering (AML) systems and controls [and to]
assess the level of money laundering risk posed by its customers.”
In particular, 46 of 68 accounts reviewed by the FSA “had been inappropriately
classified as normal risk.” [*]
[*] http://www.fsa.gov.uk/library/communication/pr/2012/055.shtml
Conclusion: The firm has to have a defensible way of assessing risk.
Motivation
• Basel AML Index
  – Origin
    • Basel Institute on Governance
    • Non-profit
    • Corruption prevention, public/company governance
  – Composition
    • Various sources: indices, reports
  – Methodology
    • Relies on experts who determine the weights
  – Limitations indicated by reviews of the 2012 release
    • Sources: infrequent, some might be biased
    • Methodology: no confidence intervals (CI), no uncertainty/sensitivity analysis
  – Limitations not indicated by reviews
    • Missing data
    • Non-reproducible, some data difficult to locate
    • Limited number of countries and regions covered
Expanding an Existing Index
“The Basel AML Index 2013,” at http://index.baselgovernance.org/index/Project_Description.pdf
Expert Weightings: Risks
Risk category | Weight
Money Laundering / Terrorist Financing | 65%
Financial Transparency and Standards | 15%
Corruption | 10%
Public Transparency and Accountability | 5%
Political and Legal Risk | 5%
“The Basel AML Index 2013,” at http://index.baselgovernance.org/index/Project_Description.pdf
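A minimal sketch of how expert weights of this kind turn category sub-scores into one composite value. The pairing of percentages with categories follows the order in which they are listed and is an assumption; the dictionary keys, the function name `composite_score`, and the sub-scores are hypothetical and not taken from the Basel methodology.

```python
# Expert weights per risk category. The pairing of each percentage with a
# category follows the listing order on the slide and is an assumption.
WEIGHTS = {
    "money_laundering_terrorist_financing": 0.65,
    "financial_transparency_and_standards": 0.15,
    "corruption": 0.10,
    "public_transparency_and_accountability": 0.05,
    "political_and_legal_risk": 0.05,
}


def composite_score(sub_scores, weights=WEIGHTS):
    """Weighted average of category sub-scores (all on the same scale)."""
    missing = set(weights) - set(sub_scores)
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[k] * sub_scores[k] for k in weights) / total_weight


# Hypothetical sub-scores for one country on a 0-10 scale (illustrative only).
example = {
    "money_laundering_terrorist_financing": 6.0,
    "financial_transparency_and_standards": 4.0,
    "corruption": 5.0,
    "public_transparency_and_accountability": 2.0,
    "political_and_legal_risk": 8.0,
}
score = composite_score(example)
print(round(score, 2))
```

Because the weights are fixed by experts, the composite is a point value: changing any weight changes the ranking, and nothing in the calculation says how uncertain the result is.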
Areas Covered — Sources
“The Basel AML Index 2013,” at http://index.baselgovernance.org/index/Project_Description.pdf
Availability of Sources — Coverage
Organization Indicator 2005 2006 2007 2008 2009 2010 2011 2012 2013
Basel Institute on Governance
Basel_AML_Index x Jun-10
1.) Bertelsmann Stiftung Transformation Index
Rule of Law scores x
2.) Euromoney
Political Risk scores x
3.) Financial Action Task Force (FATF)
Member countries Mutual Evaluation Reports sp. sp. sp. sp. sp. sp. sp. sp. sp.
4.) Freedom House
Freedom in the World & Press Freedom Index x x
5.) International Institute for Democracy and Electoral Assistance (IDEA)
Political Finance Database
6.) International Budget Partnership
Open Budget Index some x x x
7.) Tax Justice Network
Financial Secrecy Index x
8.) Transparency International
Corruption Perception Index x
9.) US State Dept. - Int. Narcotics Control Strategy Report
Money Laundering and Financial Crimes x
10.) World Bank - Doing Business Ranking
Business Extent of Disclosure Index x x x x x x x x
11.) World Bank
IDA Resource Allocation Index x
12.) World Economic Forum
Global Competitiveness x x x
International Monetary Fund
Compliance_w_AML_CFT x x x x x x
Sources 2 3 2 3 2 3 6 6 3
Country Name Indicator Name 1990 1991 1992 1993 1994 2007 2008 2009 2010 2011
Afghanistan Patent applications, nonresidents
Albania Patent applications, nonresidents 361
Algeria Patent applications, nonresidents 229 170 164 138 118 765 730
American Samoa Patent applications, nonresidents
Andorra Patent applications, nonresidents
Angola Patent applications, nonresidents 2
Antigua and Barbuda Patent applications, nonresidents
Arab World Patent applications, nonresidents 1499 1815 1780 4322
Argentina Patent applications, nonresidents 1955 1851 1919 2261 2820 4806 4781
Armenia Patent applications, nonresidents 30 79 5 4 11 6
Aruba Patent applications, nonresidents
Australia Patent applications, nonresidents 24122 23525 21187 22478
Austria Patent applications, nonresidents 670 544 517 513 516 287 329 292 249
Azerbaijan Patent applications, nonresidents 5
Bahamas, The Patent applications, nonresidents 25 27 29
Bahrain Patent applications, nonresidents 31
Bangladesh Patent applications, nonresidents 76 77 89 71 89 270 278 275 276
Missing Data in Most Circumstances
• In most regression schemes, rows with any missing value are dropped, so only
data from a few countries or regions would remain
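A small sketch of why complete-case (listwise) deletion is so costly. The toy data and the helper name `complete_cases` are hypothetical; the point is only that a modest share of missing cells can remove most rows.

```python
# Hypothetical toy data: None marks a missing value (not taken from the slides).
rows = {
    "A": {"x1": 1.0, "x2": 2.0, "x3": 3.0},
    "B": {"x1": 1.5, "x2": None, "x3": 2.5},
    "C": {"x1": None, "x2": 2.2, "x3": 3.1},
    "D": {"x1": 0.9, "x2": 1.8, "x3": None},
    "E": {"x1": 1.1, "x2": 2.1, "x3": 2.9},
}


def complete_cases(data):
    """Keep only the rows in which every variable is observed."""
    return {
        name: values
        for name, values in data.items()
        if all(v is not None for v in values.values())
    }


kept = complete_cases(rows)
total_cells = sum(len(values) for values in rows.values())
missing_cells = sum(
    v is None for values in rows.values() for v in values.values()
)

print(f"missing cells: {missing_cells}/{total_cells}")
print(f"rows kept:     {len(kept)}/{len(rows)}")
```

Here one cell in five is missing, yet three of the five rows are lost to the model.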
• The goal is to build an index that is:
– Maintainable in-house
– Reproducible
– Based on available sources of information
– Updated as sources are updated (not once a year)
– Informative (not only generating point values)
– Valid for new cases (not just previous ones)
Defensible Process Desired to Value Risk
• Included more time-series
– Basel AML Index, IMF AML+CFT Index, WDI, WGI
• Treatment of missing data
– Missingness
– Imputation
– Create complete cases
• Modeling
– Decision Trees
– Random Forest
– Linear Models
Process Followed
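The missing-data steps above can be sketched as follows. The slides do not say which imputation method was used, so median imputation stands in here as an assumption; the indicator names and values are hypothetical.

```python
from statistics import median

# Hypothetical indicator columns; None marks a missing observation.
table = {
    "indicator_a": [1.0, None, 3.0, 5.0],
    "indicator_b": [None, None, 2.0, 4.0],
    "indicator_c": [7.0, 8.0, 9.0, 10.0],
}


def missingness(column):
    """Share of missing values in one column (0.0 = fully observed)."""
    return sum(v is None for v in column) / len(column)


def impute_median(column):
    """Replace missing values with the median of the observed ones.

    Raises statistics.StatisticsError if the column has no observed value.
    """
    fill = median(v for v in column if v is not None)
    return [fill if v is None else v for v in column]


# Step 1: measure missingness per indicator.
rates = {name: missingness(col) for name, col in table.items()}
# Steps 2 and 3: impute, which turns every row into a complete case.
completed = {name: impute_median(col) for name, col in table.items()}

print(rates)
print(completed)
```

Once every column is completed, all rows can enter the decision tree, random forest, and linear models instead of only the fully observed ones.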
Most of the Data is Gaussian
Correlation
Most relevant variables
Correlations with Basel AML Index
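One way to screen candidate variables is to rank them by the absolute value of their Pearson correlation with the reference index. The sketch below assumes that approach; the variable names and numbers are hypothetical and are not the series used in the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical reference index and candidate indicators (illustrative only).
reference_index = [2.0, 4.0, 6.0, 8.0]
candidates = {
    "var_a": [1.0, 3.0, 2.0, 4.0],
    "var_b": [8.0, 6.0, 4.0, 2.0],
    "var_c": [3.0, 1.0, 4.0, 2.0],
}

correlations = {
    name: pearson(values, reference_index)
    for name, values in candidates.items()
}
# Strongly negative correlations are as informative as strongly positive ones,
# so rank by absolute value.
ranking = sorted(correlations, key=lambda name: abs(correlations[name]),
                 reverse=True)
print(ranking)
```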
Indicators: Missing Data
[Figure: Missingness vs. Indicators (WDI = 1 to 31, AML = 32 to 45). Vertical axis: 0% to 100%. Series: 1982-2012, 2002-2012, 2012. Annotations: "Less data", "More data".]
Missingness Map (246 countries, 2011/2)
Few Changes in the Index from 2012 to 2013
[Figure: Basel AML Index, 2013 minus 2012; a positive difference means getting better (more compliant). Horizontal axis: country rank 2013, ordered by compliance score (lowest = Afghanistan, highest = Norway), 1 to 146. Vertical axis: -1.5 to 1.5. Labeled countries: Laos, Angola, Algeria, Guyana, Norway, Kazakhstan, Moldova, Georgia, Slovak Rep., Ecuador.]
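• Slow process (the index changes little from one year to the next)
– y(t) = a(t) + β(t) * x(t)
– y(t+1) = a(t+1) + β(t+1) * x(t+1)
Assume that a(t) ≈ a(t+1) and β(t) ≈ β(t+1); subtracting the first equation from the second gives
– y(t+1) ≈ y(t) + β(t) * Δx(t->t+1), where Δx(t->t+1) = x(t+1) - x(t)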
• Can also forecast the individual time-series.
• No new series until Jun-10
Forecasting
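A minimal sketch of the one-step update y(t+1) = y(t) + β(t) * Δx(t->t+1). Treating x as a vector of indicators with one coefficient each is an assumption made for illustration; the function name `forecast_next` and all numbers are hypothetical.

```python
def forecast_next(y_t, beta_t, x_t, x_next):
    """One-step index forecast: y(t+1) = y(t) + sum_i beta_i(t) * delta_x_i.

    Assumes the intercept and the coefficients stay roughly constant
    between t and t+1, so only the change in the inputs matters.
    """
    if not (len(beta_t) == len(x_t) == len(x_next)):
        raise ValueError("beta_t, x_t and x_next must have the same length")
    delta_x = [new - old for old, new in zip(x_t, x_next)]
    return y_t + sum(b * d for b, d in zip(beta_t, delta_x))


# Hypothetical values: last published index, fitted coefficients, and the
# indicator values before and after an update of the sources.
y_2012 = 5.0
beta = [0.5, -0.2]
x_2012 = [1.0, 3.0]
x_2013 = [1.4, 2.0]

y_2013 = forecast_next(y_2012, beta, x_2012, x_2013)
print(round(y_2013, 2))
```

The update needs only the previous index value and the change in the inputs, so the estimate can be refreshed whenever a source is updated rather than once a year.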
Data collection is usually limited in less developed countries, which biases the model
Decision Tree Model (143 countries)
43% of the rows were deleted due to
missing data. The model was built
with data from only 82 countries.
Decision Tree Model (246 countries)
After imputation, all 246 rows
are used to create the model. A
more detailed tree is obtained.
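For intuition, a regression tree is built from splits like the one sketched below: a single threshold on one variable, chosen to minimize the squared error of the two resulting group means. This is a one-split "stump" written from scratch as an illustration, not the tree produced in the study; the data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the group mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) minimizing total squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold, mean(left), mean(right))
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1:]


def predict(x, split):
    threshold, left_mean, right_mean = split
    return left_mean if x <= threshold else right_mean


# Hypothetical indicator values (xs) and index values (ys).
xs = [1, 2, 3, 10, 11, 12]
ys = [2.0, 2.2, 1.8, 6.0, 6.2, 5.8]
split = best_split(xs, ys)
print(split, predict(4, split), predict(9, split))
```

With more rows available after imputation, more candidate thresholds are supported by enough observations, which is one reason a more detailed tree can be grown.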
Decision Tree Model (246 countries)
[Figure: Predicted vs. Observed, Decision Tree Model, dataset complete_imp246. Horizontal axis: Basel_AML_Index. Vertical axis: Predicted. Lines: Linear Fit to Points; Predicted=Observed. Pseudo R-square=0.8357. Rattle 2013-Jun-13 22:38:32 Patricia]
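A sketch of one common definition of the pseudo R-square shown on predicted-vs-observed plots: the squared Pearson correlation between observed and predicted values. Whether the figure above uses exactly this definition is an assumption; the numbers below are hypothetical.

```python
def pseudo_r_square(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


# Hypothetical observed index values and model predictions.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [1.0, 3.0, 2.0, 4.0]
r2 = pseudo_r_square(observed, predicted)
print(round(r2, 4))
```

Under this definition the measure rewards predictions that move with the observed values even when they are offset or scaled, which is why the plot also shows the Predicted=Observed line next to the linear fit.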
Random Forest Model (246 countries)
Linear Model (246 countries)
[Figure: Predicted vs. Observed, Linear Model, dataset complete_imp246. Horizontal axis: Basel_AML_Index. Vertical axis: Predicted. Lines: Linear Fit to Points; Predicted=Observed. Pseudo R-square=0.9276. Rattle 2013-Jun-14 09:58:44 Patricia]
Linear Model (246 countries)
[Figure: Initial (143) and imputed (103) index values (in red), and their (246) estimates. Horizontal axis: Country. Vertical axis: Index Value. Legend: [1] Original 143 countries (red); [2] Estimates of 143 countries; [3] Imputed values (red); [4] Estimates on imputed values. Annotations: "Initial 143 countries"; "Index extended to additional 103 countries and regions".]
Different approaches
Item | Other Indices | Our Approach | Options
Experts | Use of non-reproducible, ‘arbitrary’ weights | Regression, decision trees, Random Forest | Various models, supervised learning
Index | N/A (The index is generated, not used as a reference.) | Still need a reference for modeling | We can select from a growing number of indices
Sources | Potentially biased | Public data | We can select from a growing number of data sources
Data | Sporadic, difficult to obtain, categorical | Select sources; imputation | We can select from a growing number of imputation methods
Countries | Limited number | Unlimited, and extended to regions |
Estimates | Only point values | t-stats, CI |
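To illustrate what "t-stats, CI" adds over a point value, the sketch below fits a one-variable least-squares line and reports the slope with its standard error, t-statistic, and confidence interval. It is a textbook simple regression, not the multi-variable model from the study; the data are hypothetical, and the critical value 3.182 (two-sided 95%, 3 degrees of freedom) is taken from a standard t table and passed in by hand.

```python
from math import sqrt


def simple_ols(xs, ys, t_crit):
    """Least-squares fit y = intercept + slope * x with inference on the slope.

    t_crit is the critical t value for n - 2 degrees of freedom, supplied by
    the caller (the standard library has no t distribution).
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(xs, ys))
    se_slope = sqrt(residual_ss / (n - 2) / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": se_slope,
        "t_stat": slope / se_slope,
        "ci": (slope - t_crit * se_slope, slope + t_crit * se_slope),
    }


# Hypothetical indicator (xs) and index (ys) values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_ols(xs, ys, t_crit=3.182)
print(fit)
```

The interval and the t-statistic make it possible to say how firmly a variable's contribution is established, which a set of fixed expert weights cannot do.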
• Limitations of the Current Procedure to Create an AML Index
– Lack of reproducibility
– Reliance on experts
– Estimates restricted to point values
– Limited modeling options due to missing data
– Many data points are deleted, resulting in biased estimates
– New points (in our case, countries) cannot be scored
• Given the amount of public data available
– The index can be easily replicated and extended
– Other related indices can be used
– No need for expert weights
– No need to rely on sporadic or potentially biased sources
• The process can be applied to other indices/scores
– Use one index as a reference to determine variables
– Identify relevant variables with various models
– Impute data (do not delete variables to create the model)
– Create various models for different purposes
Conclusions
About Deloitte
As used in this document, "Deloitte" means Deloitte Financial Advisory Services LLP, a subsidiary of Deloitte LLP. Please see
www.deloitte.com/us/about for a detailed description of the legal structure of Deloitte LLP and its subsidiaries. Certain services may
not be available to attest clients under the rules and regulations of public accounting.
This presentation contains general information only and is based on the experiences and research of Deloitte Financial Advisory
Services LLP practitioners. Deloitte Financial Advisory Services LLP is not, by means of this presentation, rendering accounting,
auditing, business, financial, investment, legal or other professional advice or services. This presentation is not a substitute for such
professional advice or services, nor should it be used as a basis for any decision or action that may affect your business. Before
making any decision or taking any action that may affect your business, you should consult a qualified professional advisor.
Deloitte Financial Advisory Services LLP, its affiliates, and related entities shall not be responsible for any loss sustained by any
person who relies on this publication.
Copyright © 2013 Deloitte Development LLC. All rights reserved. Member of Deloitte Touche Tohmatsu Limited