LEVERAGING BIG DATA TECHNIQUES TO ENHANCE
ANTI-MONEY LAUNDERING PRACTICES
Margo Vakharia
TABLE OF CONTENTS
1 Introduction
2 Big Data - What Is It?
3 Big Data Possibilities For Audit Policies And Procedures
4 Internal Controls
4.1 KYC/CDD
4.1.1 Entity Analysis
4.1.2 Additional Benefits
4.2 OFAC Name Screening
4.3 Transaction Monitoring
4.4 Case Management
5 Conclusion
References
1 INTRODUCTION
The evolution of technology in recent decades has brought about unprecedented change for today’s
global workforce. The new reality is one of information overload, compounded by the constant onslaught
of new technologies and a continuous stream of new regulations. This spiraling cycle means that we have
a bigger workload than ever before and less time in which to complete it. Couple this with the change
management challenges involved, not to mention cost considerations, and most professionals would agree
that data management is one of the most significant issues of recent times.
When it comes to the use of innovative approaches to data management, it is global players such as
Facebook, Google, Uber and Airbnb who spring to mind. Such companies are at the forefront of designing
and managing how data is collected, used and stored, and so they should be. Unfortunately, the same
cannot be said for the audit, compliance and anti-money laundering (AML) industry, where instead we
see a muddled picture of policies, procedures, controls and regulations. The sheer volume of red tape,
investigations, monitoring, reporting and suspicious activity reports (SARs) brings us right back to a
situation where we are overloaded. AML and audit need to rise to the challenges presented by the digital
age, and we can do this by innovating. We must modernize our organizations in order to stay ahead
of the game. Firefighting just to keep up with regulation is no longer good enough. We need to
revolutionize the way we work, and we can do so by using the latest technologies and techniques. We
have the capability to make a huge impact on our current approach to tackling financial crime; we just
need the will.
Finding a starting point for an overhaul might seem like a daunting task as we already have a lot on our
plate—customer due diligence (CDD), information systems, transaction monitoring, case management
systems and suspicious activity reporting. However, there is a simple starting point, and that starting point
is big data. A recently coined phrase, big data is getting a lot of exposure within large
organizations, particularly financial institutions. In this era of information overload, big data
technology is key, not just to managing data, but to harnessing and exploiting the power within it.
With an incomprehensible amount of data now available all around us, both internally and externally to
our organizations, we need to have the tools to explore all possible sources to help us in the fight against
financial crime. It seems that we have become so caught up in complying with regulation that somewhere
along the way we have lost our focus. We spend so much time ensuring our own compliance (i.e., we are
ticking all the necessary boxes [of which there are many]) that we are left with less time to consider the
big picture. With so many different data sources, and both new and legacy systems to consider, the task can seem
insurmountable. This is why we need big data.
What exactly is big data and why has it received so much attention? Companies are increasingly finding
themselves with such large volumes of information on their information technology (IT) servers and in their
databases that they do not know what to do with it or how to manage it. Although these
masses of information are difficult to manage, many companies have realized they can use them in multiple
ways to their benefit, often gaining a competitive advantage. A simple example is the
ability of large supermarkets to gather information on a customer’s purchases, analyze the data and then
use it to offer sales incentives tailored to that customer’s specific needs.
Consumers are often oblivious to the extent to which they are sharing their personal information, even when
doing a simple grocery shop. Even those who try to restrict the amount of information they share about
themselves (e.g., by limiting their online presence) might not realize that something as simple as
subscribing to a supermarket loyalty card can provide organizations with a massive amount of detailed
information about them. First, there is the standard information (date of birth and address)
provided on the form that needs to be completed in order to obtain a loyalty card. Second, the process
of accruing those store points allows supermarkets to gather enough information to build an entire profile
on their customers, including data on the products they prefer, how much money they spend every week
and their shopping patterns and habits.
Big data has a huge impact on consumers’ daily lives, with some effects that they might not even realize.
Amazon makes suggestions as to what we might like to purchase, Netflix suggests what we might like to
watch, and Pandora suggests music we might like to listen to. Big data is not
limited to tailoring consumer preferences. For example, the music industry can now use big data to help turn
a budding artist into an instant success. Big data can be used to pinpoint where that artist’s
potential fan base is located, which enables the record label to start promoting the artist in a
particular region. Success is not necessarily based on talent or chance, but on clever positioning and
marketing. Record companies are willing to invest in such targeted marketing because big data arms
them with the knowledge that the odds are in their favor and that the investment is likely to produce a healthy
return. Such predictions are still somewhat risky, but the gamble is a calculated one.
Large media companies and big businesses are not the only ones
using big data. Many government agencies are using it too. For example, in law enforcement,
visualization software is being used to help predict where and when crimes are likely to occur. Such
technology enables the police force to proactively monitor a particular area at certain times. This is known
as predictive policing.[i]
What about big data in the business of AML? What value can it add? There is very large potential for
harnessing big data for AML purposes. It is simply a matter of adopting the clever techniques that other
industries are already applying and tailoring them to our needs. For example, predictive policing is one
technique that could be easily adapted to this industry. By utilizing information that is readily available
(e.g., internal data such as customer transactions and external data such as social media or geolocation
data), it could help in predicting where, when and how financial crime is likely to occur. At first it might
appear to be information overload (more data to research and investigate), but this is information
technology at its best. If it is utilized in the right way, it has the potential for huge paybacks, including a
reduction in financial crime rates, a reduction in administrative
workload and a decrease in direct financial losses due to fraud. This paper will examine how
big data has the potential to change the anti-financial crime landscape and the AML audit process, and it
will outline the associated benefits, such as a complete picture of the customer. Consideration will also
be given to the implications for the industry if it chooses not to embrace big data, how the compliance
world might lag behind in terms of technology and the adverse effects on workload. The financial
implications will also be discussed with regard to enforcement actions. Enforcement actions are final
orders or conditions imposed in writing for violations of law, rules or regulations.[ii]
The paper will also look at know your customer (KYC) and CDD procedures and will discuss the concept of a 360-degree view of
the customer, a way of seeing the entire customer picture that is made possible by big data. The discussion
will progress to predictive analysis and data analytics in relation to sanctions screening, transaction
monitoring and case management for auditors. The challenges of big data and possible solutions will also
be debated, but to begin with there will be a non-technical overview of the concept of big data, how it
works, and some interesting examples of the technology and techniques already being used.
2 BIG DATA–WHAT IS IT?
Big data can provide businesses with huge benefits, such as sales and marketing opportunities, improved
customer service and better operational efficiency. Data scientists analyze huge volumes of data and turn
it into useful business information by uncovering hidden patterns, relationships and trends. They analyze
many types of datasets including structured, unstructured, semi-structured, internal and external data.
The concept might seem quite daunting, especially for those outside the technology sector, but for the
purpose of this paper it can easily be broken down into simpler terms starting with a definition of big data
itself.
There are various definitions of big data, and its interpretation can depend on a person’s
job role. A good explanation comes from author and big data guru Bernard Marr, who articulates it succinctly:
“The basic idea behind the phrase 'Big Data' is that everything we do is increasingly leaving a digital trace
(or data), which we (and others) can use and analyze. Big Data therefore refers to that data being collected
and our ability to make use of it.”[iii] Marr’s observation that “everything we do is increasingly leaving a
digital trace” is very apt for today’s world, where we leave digital footprints,
usually unwittingly, everywhere we go.
The concept of information gathering is not new, but when we hear the term big data we can generally
take it to mean the amount of data an organization gathers, stores and manipulates, using technological
advances in capacity, processing, and general software development. In recent years we have progressed
very quickly in terms of computer storage capacity (gigabytes, terabytes, petabytes and beyond),
but big data also covers every single piece of data a company has stored: basically anything contained on a
company’s servers.
In the compliance/AML sector, not all available data is being fully utilized, particularly external data. There
are, of course, some privacy concerns when accessing external data sources, which will be expanded on
later, but for now it is enough to stress the importance of utilizing all types of available data:
structured, unstructured, semi-structured, internal and external.
Key Terms and Concepts:
1. Data Science
Data science is the process of extracting knowledge from data, or changing raw data into information. It is a
simple concept. In reality there is a lot more to it, but such a detailed explanation is outside the remit of
this paper. However, there are some important concepts worth understanding. Terms such
as data analytics and data mining can appear to have multiple meanings, but specific definitions are not
required here; it is enough to know that they are all part of the same process of turning data into business
insights.[iv]
More relevant to this discussion is understanding the different types of data and the insights that can be
achieved by combining and cross-referencing different data types and sources.
2. Structured Data
Structured data is a type of data that is contained within fixed fields, such as in a database or a spreadsheet.
It is organized into a pre-defined format. Traditionally, data was modeled this
way because databases offered the best available method for processing and analyzing it. Databases
are queried and updated using a query language such as SQL (Structured Query Language). Essentially, SQL is how an
administrator interacts with the database to update, retrieve and delete information. Relational databases, which
first came about in the 1970s, are still typically used within the banking sector (e.g., Oracle, Sybase, DB2,
MS SQL Server).
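As a minimal sketch of how SQL is used to update, retrieve and delete structured records, the following Python example uses the standard-library sqlite3 module with an in-memory database. The table name customers and its columns are hypothetical and chosen only for illustration; a bank would typically run one of the products named above rather than SQLite.

```python
import sqlite3

# In-memory database; the table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
)

# Insert: add two structured records into fixed fields.
conn.executemany(
    "INSERT INTO customers (id, name, country) VALUES (?, ?, ?)",
    [(1, "A. Example", "AU"), (2, "B. Sample", "CA")],
)

# Update: change one field of one record.
conn.execute("UPDATE customers SET country = ? WHERE id = ?", ("US", 2))

# Delete: remove a record.
conn.execute("DELETE FROM customers WHERE id = ?", (1,))

# Retrieve: query what remains.
rows = conn.execute(
    "SELECT id, name, country FROM customers ORDER BY id"
).fetchall()
print(rows)
conn.close()
```

Because every record has the same fixed fields, each statement can address the data by column name, which is what makes structured data straightforward to process.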
3. Unstructured Data
Unstructured data is that which is not contained in a database. The term unstructured data is closely
related to big data. Examples of unstructured data include text files such as emails, books, blogs, or any
document that is text heavy. Photographs, videos and other sorts of images such as x-rays are also
considered to be unstructured data. Even PowerPoint presentations and social media posts are types of
unstructured data.
4. Semi-Structured Data
Semi-structured data is a cross between structured and unstructured data. It is the type of data that does
not reside in a database but it might have some structure which makes it easier to analyze.
An example of this would be a document or photo that contains metadata. Metadata is the descriptive
data about the file (e.g., author, date created, location).
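To make the idea of semi-structured data concrete, the sketch below represents file metadata as JSON and reads it with Python's standard json module. The file names and field names are hypothetical. The point is only that the records do not share a fixed schema, yet their key/value structure still makes them easy to analyze.

```python
import json

# Two hypothetical metadata records; field names are assumptions for illustration.
raw = """
[
  {"file": "statement.pdf", "author": "J. Doe", "created": "2016-11-02"},
  {"file": "yacht.jpg", "created": "2016-12-24", "location": "Caribbean"}
]
"""

records = json.loads(raw)

# No fixed schema: each record carries its own set of fields,
# but the key/value structure still allows selective analysis.
with_location = [r["file"] for r in records if "location" in r]
all_fields = sorted({key for r in records for key in r})

print(with_location)
print(all_fields)
```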
5. Data Lake
“A data lake is a storage repository that holds a vast amount of raw data in its native format, including
structured, semi-structured, and unstructured data. The data structure and requirements are not defined
until the data is needed.”[v]
6. Internal data is everything stored or accessible in a business; for example, sales data,
transactional data, customer records and HR data.
7. External data is the myriad of information available outside an organization; for example,
websites, blogs, images and social media (Twitter, Facebook, LinkedIn, etc.).
8. Social Media and Big Data
The amount of information publicly available on the internet at present is astounding. Each year since
2012, Domo has published an infographic on the amount of data that is created on the
internet every minute. This is fascinating in itself, but even more interesting is the rate of increase. Domo
founder Josh James said “data is…constantly pouring out of our smartphones, smartwatches, smart TVs,
and countless other devices that are all connected—and it continues to proliferate at an astounding rate.”
[Figure: Domo infographic of the data created on the internet every minute] [vi][vii]
The above infographic presents the following interesting facts:
In 2012, YouTube users were uploading 48 hours of new video every minute. By 2016, that number had increased more than eightfold to 400 hours.
The global internet population grew from 2.1 billion in 2012 to 3.4 billion in 2016, an increase of 62 percent.
Facebook users share nearly a quarter of a million photos every minute.
Instagram users 'like' over 10 times that amount every minute: an astounding 2.5 million posts per minute are liked.
Users of the file sharing website Dropbox upload 833,333 new files per minute.
Given the amount of external data publicly available on the internet at any point in time, there is no
doubt that we should be harnessing at least some of it to assist us in the fight against financial crime and
in combating fraud. There are many examples of the technological capabilities in our world today:
smart phones that really are smart; the internet of things, in which objects are manufactured
with the ability to communicate; smart carpets that can track the movement of the elderly in their own homes
and detect falls; and, perhaps soon, giant fleeting advertisements projected into the sky (Echo
Technology).[viii] The important point is that with technology advancing at such a rate, we really need to be
looking at how we are implementing the same technology in the world of compliance.
This ever-connected world we live in is all very interesting, but it also raises the issue of
privacy. Even though there is currently a lot of information publicly available, there are still huge privacy
concerns for individuals and organizations. Data privacy is a hot topic at the moment, particularly in
Europe, as the General Data Protection Regulation (GDPR) (Regulation (EU) 2016/679) will apply
from May 25, 2018.[ix] Recently, the Financial Transactions and Reports Analysis Centre (FINTRAC) in Canada
has been in the media spotlight for scrutinizing the social media posts of citizens whose financial
transactions came onto its radar. Even though such posts are openly available, concern was raised by
privacy advocates who highlighted the possibility that social media users might not even realize that their
online content is being viewed. FINTRAC defended the practice, stating that government rules allow it
to collect certain information. FINTRAC spokesperson Renée Bercier made the valid point that “the
perpetrators of these crimes oftentimes have an online presence and actively use the web, including social
media, to connect with associates, to facilitate their activities, and, in the case of terrorism financing, to
even raise funds.”[x]
Another interesting point highlighted in the article was that the Canadian Anti-Fraud Centre had filed a
Privacy Impact Assessment with the Canadian Privacy Commissioner and listed social media posts as one
of the things it checks when looking into possible cases of fraud or scams. In Canada, Privacy Impact
Assessments (PIAs) are "used to identify the potential privacy risks of new or redesigned federal
government programs or services. They also help eliminate or reduce those risks to an acceptable level.
Virtually all government institutions, as defined in Section 3 of the Privacy Act, including parent Crown
corporations and any wholly owned subsidiary of these corporations, must conduct PIAs for new or
redesigned programs and services that raise privacy issues. Government institutions must provide
completed PIAs to the Treasury Board of Canada Secretariat (TBS) and the Office of the Privacy
Commissioner of Canada (OPC)."[xi]
The same news agency also published another article about the Canada Revenue Agency (CRA) using big
data in a similar way, scrutinizing social media in order to catch tax cheats. It described how
the CRA planned to use big data, predictive analytics and external data to help officials decide whether
someone has paid their taxes. This is a good example of the direction in which
things are heading, with big data technology as an enabler.[xii]
3 BIG DATA POSSIBILITIES FOR AUDIT POLICIES AND PROCEDURES
The concept of big data is exciting; however, it also has the potential to provoke anxiety in many.
Ultimately, what does big data mean for AML professionals? Petabytes of data from additional external
sources will add to the already staggering volume of data that compliance needs to evaluate. However,
even though data volume is on the increase, it is now much easier to manage that data. We are getting
better at analyzing it and extracting useful information. We are going through an age of immense
technological change, and a similar data revolution is possible within the anti-financial crime industry.
Detailing exactly how these techniques will improve each aspect of AML is outside the
scope of this paper, but I will attempt to address some key areas with examples where such benefits can
be realized.
There has never been a better time to act, but we need to act quickly. We cannot afford to ignore it, nor
can we ward off the ever-increasing barrage of standards and regulation. Governance is a necessity, there
is no doubt about that, but if we can work in a faster and more intelligent way, it makes sense to
do so. There are also financial implications for non-compliance. In December 2016, the Financial
Industry Regulatory Authority (FINRA) issued a $16.5 million fine to Credit Suisse in the U.S. for “failing to
properly implement an automated surveillance system to monitor money movements.” FINRA found that
Credit Suisse’s monitoring program for detecting suspicious activity was significantly deficient in two
ways.[xiii]
The first issue arose because Credit Suisse had relied on its brokers to identify and escalate potentially
suspicious trading activity. Unfortunately, suspicious activity was not always reported as required. The
second was that Credit Suisse had failed to properly implement its automated monitoring system to detect
suspicious activity. FINRA discovered that “a significant portion of the data feeds into the system was
missing information or had other issues that compromised the system’s effectiveness.” In addition, Credit
Suisse had failed to use certain scenarios designed by the system to identify common suspicious patterns
and activities, and failed to adequately investigate certain activity. What is interesting about this
particular case is that fines can result not only from violations themselves, but also from deficiencies in the
monitoring systems, which can be investigated in their own right.[xiv]
How much money is invested in compliance areas? It is obvious that it is a growing industry. HSBC grew
its compliance department from 2,000 to 5,000 personnel in 2013, and to over 7,000 in 2015.[xv] But that
is growth, not investment. How much investment goes into these back office functions? The picture
contrasts, of course, with the front office (the money-making area), where
it is logical to assume a better return on investment. But the tide may be turning. There
have been some very large fines related to enforcement action, which not only hurt the pocket but can
lead to a loss of reputation, so there should be sufficient motivation to invest in these types of functions.
It is not just inadequate monitoring and reporting that should be of concern. Data analytics is lacking in many
other areas, such as KYC. Compliance would appear to be antediluvian in this
respect. In the article “Why is internal audit not addressing the data analytics capability gap?,” author
John Verver, CPA CA, CISA, CMC, draws attention to the fact that while major firms are stating the
importance of data analytics within audit, there still appears to be a gap in putting the capabilities to work.
Verver highlights findings in reports published in 2016 by the Big Four auditing firms, which
found very little use of analytics in audit. However, within the next three to five
years, almost two-thirds of companies “expect to be using analytics in at least 50 percent of their
audits.”[xvi] That sounds positive.
4 INTERNAL CONTROLS
4.1 KYC/CDD
The Basel Committee on Banking Supervision provides a forum for regular cooperation on banking
supervisory matters,[xvii] and in 2001 it developed good practice guidance on account opening and customer
identification requirements. Since then many different laws have been implemented in various countries
globally to ensure anyone handling financial transactions undertakes adequate CDD.
Generally, these guidelines and laws have served us well, providing a tick list of sorts to enable us to mark
off each due diligence check along the way, and at the same time providing tidy KYC/CDD documentary
evidence for the regulator. While there is no doubt that these measures have been helpful in catching
some criminals, we have so much information to hand in today’s world that we need to be doing things
more effectively. We have the capability to work smarter and to be one step ahead of the
offenders. There is so much data available online, as people leave such a huge footprint, particularly
with social media and smart phones. Therefore, it makes sense for us to embrace big data and all that it has
to offer, particularly in relation to external data. The explosion in technology is rapidly changing our
lifestyle and we have to keep up.
Many sources note that big data offers opportunities for marketers to upsell and cross-sell their offerings
and the marketing potential unleashed by big data is creating quite a storm. However, it is more difficult
to find the same level of enthusiasm for leveraging the same techniques and technologies in back office
functions and using them to enhance the KYC and CDD processes. Big data and predictive analysis have the
potential to make an immense impact in these areas, and yet we have heard very little from the industry
in this regard. There are some advocates, though.
Vamsi Chemitiganti of Hortonworks provides an excellent overview of how KYC/CDD can be transformed.
He refers to the concept of Customer 360, which is the holistic view of the customer according to
all available sources. In his blog post “How Data Science and Predictive Analytics Transform AML
Compliance in Banking & Payments,”[xviii] he discusses how big data and predictive analytics can help with
things like data collection and risk scoring, social graph analysis, behavioral modeling and customer
segmentation. He also comments on how transaction monitoring systems should be used in conjunction
with historical datasets, particularly for customers under suspicion.
Within banking there appears to be quite a fragmented view of the customer. This is simply because of
how business was done traditionally. Records were kept separately, often across products, lines of
business and geographies. Databases were detached and limited storage capacity did not allow for
centralized systems. We had, and still have, complex legacy IT platforms that prevent integration.
However, with the dawn of big data software frameworks such as Apache Hadoop, as well as major
advances in capacity management (data storage), this no longer needs to be the case. The following view illustrates
how IT architecture might presently look in a financial organization. The cylinder shape represents a
database (DB).
[Figure: Collection and analysis of basic identity information - view of customer across IT platforms, lines of business and geographies] [xix]
The advent of big data is not about getting rid of databases completely. The idea is to integrate them with
the new technology and to have a new single unified approach.
Data from new business channels such as mobile banking applications and social media can be interlinked
with legacy data systems. Big data software has the capability of providing enterprise-wide solutions for
capturing customer information. It is flexible enough to handle a wide range of data types
across different jurisdictions. Having a 360 Customer view across products, regions and lines of business gives a
clearer view of the client’s activity profile, as well as risk exposure. The customer should be an organization’s
number one priority, whether it is selling, marketing, enhanced due diligence or detecting fraud.
The possibilities with big data are endless and the capabilities are substantial. If we can master the 360
Customer concept, then it opens up all sorts of possibilities. Financial institutions can enhance their KYC
through data mining and can cross-reference data with other sources.
To illustrate the effectiveness of cross-referencing internal data with external data, Bernard Marr uses the
following example.[xx] The U.S. supermarket chain Walmart cross-referenced its sales data with weather
data and discovered some interesting facts. During extremely bad weather conditions, not only did
flashlights and other such emergency supplies sell well, but so did Pop Tarts. People were stocking up on
such convenience snacks as they were battening down the hatches in preparation for a big storm. In 2012,
just before Hurricane Sandy hit, Walmart stocked its stores with extra supplies of key items, a successful
strategy born from big data. Upselling Pop Tarts in bad weather might not sound like a big technological
advance, and in fact the science behind it does not appear complicated. Yet we cannot ignore the fact
that it is a clever use of data, proving the point that we need to start being more clever with KYC. It does
not have to be a case of “out with the old IT, in with the new”; it can be simple yet effective. 3D face
recognition software is another innovative way of utilizing technology in this industry: the capability to
cross-reference face recognition with a central database of filed SARs. There are some areas of law
enforcement where such systems are already in place. A database of offenders can now utilize photo
recognition software. What was considered futuristic and far-fetched when seen in detective dramas and
films not too long ago is now fast becoming everyday reality.
We now have the means to utilize sophisticated methods to cross-reference our internal data files
with external data. For example, a customer might have made a large cash withdrawal in Melbourne,
Australia that was just below the reporting limits, but at about the same time his partner ‘checked
in’ with him on Facebook by posting a selfie of them both sipping champagne on a yacht in the Caribbean.
If such data were cross-referenced in real time, the discrepancy in
geographical location could trigger an alert. Of course there is always the possibility that the information on Facebook is not
quite the truth, but it illustrates the possibilities for red flag alerts that call for further
exploration.
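The location check in this example can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a production rule: the Event structure, the six-hour window and the 500 km threshold are all hypothetical, the coordinates are approximate, and both timestamps are assumed to be in the same time zone. Distance is computed with the standard haversine formula.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass
class Event:
    source: str      # e.g. "bank" or "social" (hypothetical labels)
    when: datetime   # timestamps assumed to share one time zone (e.g. UTC)
    lat: float
    lon: float


def distance_km(a: Event, b: Event) -> float:
    """Great-circle distance between two events (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))


def location_mismatch(a: Event, b: Event,
                      window: timedelta = timedelta(hours=6),
                      max_km: float = 500.0) -> bool:
    """Flag two events that are close in time but implausibly far apart."""
    close_in_time = abs(a.when - b.when) <= window
    return close_in_time and distance_km(a, b) > max_km


# Illustrative events: a withdrawal in Melbourne and two social media check-ins.
withdrawal = Event("bank", datetime(2017, 1, 10, 3, 0), -37.81, 144.96)
caribbean = Event("social", datetime(2017, 1, 10, 4, 30), 18.34, -64.93)
melbourne = Event("social", datetime(2017, 1, 10, 4, 30), -37.80, 144.95)

print(location_mismatch(withdrawal, caribbean))  # True: flag for review
print(location_mismatch(withdrawal, melbourne))  # False: locations agree
```

A flag raised this way is only a prompt for further exploration, since the social media post may itself be inaccurate or posted after the fact.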
Even more intelligent is predictive analysis: not just the capability of catching the criminal once the crime
has been committed, but actually predicting what a customer is likely to do next. The prediction might
not be fully reliable, but it could raise a red flag before an event to warrant extra screening or the
implementation of prevention measures. The opportunity to stop criminal activity before it happens
should be given the serious consideration it deserves. The technology is
available and should be utilized as a matter of urgency.
Facebook is able to build a personal profile based on people’s ‘likes’ by analyzing which posts a person
likes or dislikes. From this data, Facebook is able to tell a great deal about a person’s attributes and
preferences. Even private information such as a person's sexual orientation, political affinity, religion,
intelligence and emotional stability can be inferred by simple analytics.[xxi] Again, the results may not
be accurate every time, but they get close enough.
Big data is not just about analyzing information. There has been huge progress made in how we report
and display the results of the analysis. Data visualization is one way in which we might illustrate customer
behavior. Graphs and heat maps can highlight connections, and these are the kinds of tools that are
particularly useful when it comes to fraud. The technology is available and we should be utilizing it to
detect patterns and trends, and to uncover relationships across business lines and products, and between people,
locations and events.
For example, the advanced analytics company Ayasdi noticed that customers who filled out online
claim forms fraudulently took longer than average to complete them. The form-fill time was significant
for two reasons. First, it showed that a person, not a robot, was committing the fraud. Second, a longer
time spent on a particular page of an online form is an indicator of ‘possible’ fraud.xxii This could
similarly be leveraged as a KYC technique: the time a customer takes to fill out an online application
form can be compared against the average time taken by customers overall.
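The comparison against the average can be sketched with a simple standard score. This is a minimal illustration, not the method used by any vendor named above; the session identifiers, the durations and the threshold of two standard deviations are assumptions chosen for the example.

```python
from statistics import mean, stdev


def slow_form_fills(seconds_by_session, z_threshold=2.0):
    """Return session ids whose completion time is more than `z_threshold`
    standard deviations above the average completion time."""
    values = list(seconds_by_session.values())
    avg, spread = mean(values), stdev(values)
    if spread == 0:
        return []  # every session took the same time, nothing stands out
    return [
        session_id
        for session_id, seconds in seconds_by_session.items()
        if (seconds - avg) / spread > z_threshold
    ]


# Illustrative completion times in seconds for ten application sessions.
durations = {
    "s1": 60, "s2": 62, "s3": 58, "s4": 61, "s5": 59,
    "s6": 63, "s7": 57, "s8": 60, "s9": 61, "s10": 240,
}
flagged = slow_form_fills(durations)
```

With very few sessions a single outlier inflates the standard deviation, so in practice the baseline would be built from a large history of completed forms.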
4.1.1 Entity Analysis
Entity matching is the process whereby records from different systems are mapped to individuals or legal
entities. All business entities first need to be identified, and then the relationships between them defined.
This gives a very clear view of the customer’s profile and can show, for example, the relationships between
customers, customer types, products and events. Previously it was impractical to link all the separate
types of data due to limits on volume, processing power and storage capacity, but with big data tools
such as Hadoop, MapReduce, and Bigtable, uncovering these connections has become far easier and
quicker. Novetta is an example of a company that has harnessed the capabilities of Hadoop in conjunction
with its own analytics to help companies gain such customer insights. Below is their 360-degree view of
a person, organization, location, events and products.
xxiii
From an anti-fraud perspective, seeing a customer from this viewpoint presents us with an opportunity to
see the entire picture. We have a clearer view of relationships across multiple entities.
The orange section identifies relationships between customers. For example, it shows who they are married
to, whether they have children, their family connections, whether they share an employer, and so on.
The dark blue section shows what products the customer buys or uses. In the case of banking, these products
might be current accounts, savings accounts, foreign currency accounts, private banking and so on.
Closer analysis could uncover connections across the business lines, between people, locations, events
and products. This view presents us with a clearer understanding of the customer and who they are
connected to and might highlight someone on a watch list or what organizations they are affiliated with.
Another section might show what medium they use to connect with us. Do they bank through an
app on their smartphone, or do they use telephone banking or online banking through a browser?
How often does this customer make transactions? If a red flag is raised, we can also look at their
connections and at whether there is any unusual activity at or around the same time by people they know.
This might identify a transaction or a pattern that could otherwise have flown under the radar. With time,
effort and financial investment, imagine what the possibilities could be. Collaboration between
experienced AML professionals and developers would allow for the exploration of some of these
possibilities.
xxiv
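To make the idea of entity matching more concrete, the following is a minimal sketch of how records from separate systems might be grouped into one entity and then linked to related entities. It is not Novetta's method: the matching key (normalized name plus date of birth), the use of a shared address as the only relationship type, and the field names and sample records are all assumptions for illustration.

```python
from collections import defaultdict


def match_key(record):
    """Normalize name and date of birth into a simple matching key."""
    name = " ".join(record["name"].lower().replace(".", " ").split())
    return (name, record["dob"])


def resolve_entities(records):
    """Group records from different systems that share a matching key."""
    entities = defaultdict(list)
    for record in records:
        entities[match_key(record)].append(record)
    return dict(entities)


def related_entities(entities):
    """Link entities that share an address (one hypothetical relationship)."""
    by_address = defaultdict(set)
    for key, recs in entities.items():
        for rec in recs:
            by_address[rec["address"]].add(key)
    links = defaultdict(set)
    for keys in by_address.values():
        for key in keys:
            links[key] |= keys - {key}
    return dict(links)


# Illustrative records held in three separate systems.
records = [
    {"system": "retail", "name": "Jane A. Doe", "dob": "1980-02-14", "address": "1 High St"},
    {"system": "cards", "name": "jane a doe", "dob": "1980-02-14", "address": "1 High St"},
    {"system": "mortgage", "name": "John Doe", "dob": "1978-07-01", "address": "1 High St"},
    {"system": "retail", "name": "Ann Lee", "dob": "1990-01-01", "address": "9 Low Rd"},
]
entities = resolve_entities(records)
links = related_entities(entities)
```

Once a red flag is raised on one entity, `links` gives the related entities whose recent activity could be reviewed alongside it. Production entity resolution would use many more attributes and tolerate spelling variations rather than rely on an exact key.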
4.1.2 Additional Benefits
Cost is always a major issue for organizations, particularly when the investment required is not in a front
office/money making capacity. But in the area of big data, the benefits to both the customer and
organization can be realized in many ways. For the customer, examples include an improved onboarding
experience across product lines (e.g., no extra form filling or repeat due diligence, a more efficient
account opening process, and a better level of customer service based on a better understanding of the
customer). For the organization, sales opportunities can be identified based on demographic, customer
type and customer relationships.
With the ability to tailor popular products to market segments, as in my previous example of the music
industry, there is huge potential here for sales and marketing data to deliver a better return on
investment.
Of course, there are potentially massive savings to be had from the prevention of fraudulent transactions.
The Financial Times recently reported that in the U.K. alone the cost of fraud could be as high as
193 billion pounds a year.xxv
4.2 SANCTIONS NAME SCREENING
In relation to sanctions compliance, watchlist screening is another important area, particularly for financial
institutions.
There are already a few watchlist vendors in today’s market with sophisticated turn-key solutions. They
use complex data analytics and data mining techniques to assist compliance departments in screening
against global sanctions lists such as those of the Office of Foreign Assets Control, HM Treasury or the U.N.
These software tools can provide access to regularly updated lists that can include the names and
addresses of bank offices in sanctioned countries. Such products can support customized internal lists,
and can provide information on previously investigated and cleared entities, red flagging instances that
might warrant further investigation or audit.
To reduce the number of false “hits,” these systems use “fuzzy logic” matching techniques, which allow
the system to match names either exactly or closely. The system can allow a margin of error for
misspelled or incomplete names, and for nicknames or abbreviations.
The “hit” is then reported with a matching score from 0 to 100, which shows how close the match is to the
watchlist data. The system can be tailored to ignore matches below a certain score. It can also use
logic to avoid flagging a false positive. For example, when other important data does not match, such as
date of birth, address, phone number or country of origin, the software can be set to create result
records for false positives but not report them as alerts, so that these records remain available for review
or auditing.
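The scoring and suppression logic described above can be sketched as follows. This is a simplified illustration that uses the similarity ratio from Python's standard `difflib` module in place of a vendor's proprietary fuzzy matching; the threshold of 85, the use of date of birth as the only secondary check, and the sample names are assumptions.

```python
from difflib import SequenceMatcher


def match_score(name, listed_name):
    """Return a 0 to 100 similarity score between two names."""
    a = " ".join(name.lower().split())
    b = " ".join(listed_name.lower().split())
    return round(SequenceMatcher(None, a, b).ratio() * 100)


def screen(customer, watchlist, threshold=85):
    """Score a customer against each watchlist entry.

    Matches below `threshold` are ignored. Matches at or above it are
    recorded, but are only raised as alerts when the dates of birth do
    not conflict, so suppressed hits stay available for review or audit.
    """
    results = []
    for entry in watchlist:
        score = match_score(customer["name"], entry["name"])
        if score < threshold:
            continue
        dob_conflict = bool(
            customer.get("dob") and entry.get("dob")
            and customer["dob"] != entry["dob"]
        )
        results.append({
            "listed_name": entry["name"],
            "listed_dob": entry.get("dob"),
            "score": score,
            "alert": not dob_conflict,
        })
    return results


# Illustrative data only.
watchlist = [
    {"name": "John Smith", "dob": "1975-05-05"},
    {"name": "John Smith", "dob": "1960-01-01"},
    {"name": "Maria Garcia", "dob": "1982-03-03"},
]
results = screen({"name": "Jon Smith", "dob": "1975-05-05"}, watchlist)
```

Here the misspelled "Jon Smith" still matches both "John Smith" entries, but only the entry with the same date of birth is raised as an alert; the other is kept as a result record.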
Algorithms can combine data types, allowing for screening of both structured and unstructured data. Watchlist screening can be used to assess a customer's risk rating, and it can be linked to the Customer 360 by using the entity analysis and cross-referencing approach. For example, if many false positives are generated for a particular ‘related’ group of customers, the pattern might actually turn out to be a genuine red flag.
This type of analysis could also be reverse engineered. For example, when we analyze our false positives, can we identify common customer traits, and is there any relationship or pattern in the types of products? Viewing information in this way would also be useful for determining risk: if a particular product or region is generating a lot of false positives for our customers, we can analyze why this is occurring. The institution could then ask whether it should weight the risk score differently. Again, compliance professionals could conceive many suitable use case scenarios for these systems.
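A first step in analyzing false positives in this way could be a simple count by product or region. The sketch below assumes a hypothetical list of closed alerts with `product`, `region` and `outcome` fields; the field names, outcome labels and minimum count are illustrative only.

```python
from collections import Counter


def false_positive_hotspots(closed_alerts, field, min_count=3):
    """Count false positives by `field` and return the values that
    reach `min_count`, most frequent first."""
    counts = Counter(
        alert[field]
        for alert in closed_alerts
        if alert["outcome"] == "false_positive"
    )
    return [(value, n) for value, n in counts.most_common() if n >= min_count]


# Illustrative closed alerts.
closed_alerts = [
    {"product": "fx_account", "region": "EU", "outcome": "false_positive"},
    {"product": "fx_account", "region": "EU", "outcome": "false_positive"},
    {"product": "fx_account", "region": "EU", "outcome": "false_positive"},
    {"product": "savings", "region": "EU", "outcome": "false_positive"},
    {"product": "fx_account", "region": "APAC", "outcome": "true_match"},
]
by_product = false_positive_hotspots(closed_alerts, "product")
by_region = false_positive_hotspots(closed_alerts, "region")
```

A product or region that stands out in these counts is a candidate for the follow-up questions raised above, such as whether its risk weighting should change.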
4.3 TRANSACTION MONITORING
The relationships between entities provide valuable insight into how one customer might relate to
another or to an organization, location, event or product. By linking transactions to Customer 360
profiles and watchlists, financial institutions will have even better capabilities to identify and track
suspicious transactions.
Big data has enabled systems to search for new patterns across large numbers of transactions. Machine
learning technology means that systems can learn the transactional behavior of clients and discover
transactional activity with similar traits or relationships. Again, entity analysis and the Customer 360
model come into play here. Analysis can also be performed on false positive transaction alerts and may
help predict potentially fraudulent transactions before they even happen. Reducing the number of false
positives is a massive benefit in terms of reducing unnecessary investigative work.
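The idea of learning a client's normal behavior and spotting departures from it can be illustrated with a very simple statistical baseline. This sketch is not machine learning in the full sense: it substitutes a per-customer mean and standard deviation of past transaction amounts, and the minimum history of five transactions, the threshold of three standard deviations and the sample data are assumptions.

```python
from statistics import mean, stdev


def unusual_transactions(history, new_transactions, z_threshold=3.0):
    """Flag new transactions whose amount lies more than `z_threshold`
    standard deviations from the customer's own historical average.

    `history` maps a customer id to a list of past transaction amounts.
    """
    flagged = []
    for txn in new_transactions:
        past = history.get(txn["customer_id"], [])
        if len(past) < 5:
            continue  # too little history to form a baseline
        avg, spread = mean(past), stdev(past)
        if spread == 0:
            continue  # identical past amounts give no usable spread
        z = (txn["amount"] - avg) / spread
        if abs(z) > z_threshold:
            flagged.append({**txn, "z_score": round(z, 1)})
    return flagged


# Illustrative data only.
history = {"C-1": [100, 120, 90, 110, 105, 95]}
incoming = [
    {"customer_id": "C-1", "amount": 115},
    {"customer_id": "C-1", "amount": 9500},
    {"customer_id": "C-9", "amount": 50000},  # no history, so not scored here
]
flagged = unusual_transactions(history, incoming)
```

Customers without enough history are skipped rather than flagged in this sketch; a real monitoring system would handle them with separate rules, for example peer-group comparisons.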
The ability to combine transactional data with historical data can help make decisions on whether or not
to file a SAR for a particular customer under suspicion.
Recent years have seen an increase in the volume of transactions, particularly with the dawn of contactless technology and mobile wallets.
xxvi
The 2016 World Payments Report shows that global non-cash transaction volumes grew by 8.9 percent to 387.3 billion transactions during 2014. The core theme of the report is the challenges and opportunities in transaction banking, and it makes a very interesting read. The graph below shows the increase in non-cash transactions worldwide.
xxvii
4.4 CASE MANAGEMENT
Case management software provides institutions with end-to-end solutions for generating and managing
suspicious activity alerts through the entire investigative process right up to reporting and filing of
suspicious activity.
As discussed, leveraging big data techniques allows complete case analysis both horizontally and
vertically. Horizontally, the relationships across geographical regions or lines of business can be fully
investigated. Vertically, the customer relationships are analyzed, including watchlist screening.
Undertaking such an extensive investigation in this way could reveal behaviors, patterns or relationships
that might otherwise be missed. In turn, this would lead to more accurate filing of SARs and would also
allow for risk-based decision making on whether to retain or to terminate an institution’s relationship with
a customer.
5 CONCLUSION
Technology is changing the way we live at a rapid pace. It is having a huge impact on our personal
and business lives, and most of us are not even fully aware of the extent of it. From a business viewpoint,
implementing big data technologies within the anti-financial crime and compliance sectors is essential if
we are to keep abreast of these changes. Innovation is required not only in front-office scenarios, but it
is also much needed in the back of the house. We need to keep up with the rapid pace of change, and we
also need to implement technology as a regulatory requirement. If we fail to do so, we might pay a hefty
price in the form of enforcement penalties and the loss of consumer confidence. Big technology firms
such as Google and online retailers like Amazon have been pioneers of big data, and now that the way has
been paved, we need to follow hastily in the same direction. The pace of change in the payments industry
and the dawn of mobile payments means that volume has significantly increased and as this trend
continues, there is no time to rest on our laurels. The capabilities of big data have been set out in this
paper from a compliance perspective. By using them as proposed, AML and audit functions in
financial institutions will not only gain a smarter, faster way of working, but will also fulfil their
regulatory obligations efficiently and effectively.
REFERENCES
i Stannard, G. 2015. 9 amazing ways big data influences our everyday lives. [ONLINE] Available
at: http://digitalcontact.co.uk/blog/9-ways-big-data-influences-our-lives/. [Accessed 23 April 2017].
ii Office of the Comptroller of the Currency - U.S. Department of Treasury. Enforcement Actions.
[ONLINE] Available at: https://www.occ.treas.gov/topics/laws-regulations/enforcement-actions/index-enforcement-actions.html. [Accessed 24 April 2017].
iii Marr, B. 2015. Big Data Explained in Less Than 2 Minutes - To Absolutely Anyone. [ONLINE] Available at: https://www.linkedin.com/pulse/big-data-explained-less-than-2-minutes-absolutely-anyone-bernard-marr. [Accessed 23 April 2017].
iv Atmiya Institute of Technology & Science. 2015. Differences among Data Science, Data Analysis, Big Data, Data Analytics, Data Mining and Machine Learning?. [ONLINE] Available at: https://aitsdmclub.wordpress.com/2015/08/11/differences-among-data-science-data-analysis-big-data-data-analytics-data-mining-and-machine-learning/. [Accessed 23 April 2017].
v Dull, T. 2015. Data Lake vs Data Warehouse Key Differences. [ONLINE] Available at: http://www.kdnuggets.com/2015/09/data-lake-vs-data-warehouse-key-differences.html.
[Accessed 23 April 2017].
vi Domo. 2016. Data Never Sleeps 4.0. [ONLINE] Available at:
https://www.domo.com/blog/data-never-sleeps-4-0/. [Accessed 23 April 2017].
vii 2012 Data never sleeps infographic used for comparison - Domo. 2012. Big Data Never
Sleeps 2.0. [ONLINE] Available at: https://www.domo.com/blog/how-much-data-is-created-every-minute/. [Accessed 23 April 2017].
viii VRr00m. (2017). Lightvert Takes Augmented Reality To The Skies. [Online Video]. 25 February 2017. Available from: https://www.youtube.com/watch?v=5zy6QfEMF-A. [Accessed:
24 April 2017].
ix European Commission. 2016. Reform of EU data protection rules. [ONLINE] Available at: http://ec.europa.eu/justice/data-protection/reform/index_en.htm. [Accessed 24 April
2017].
x Thompson, E. 2017. Money laundering watchdog scrutinizes Facebook, social media. [ONLINE] Available at: http://www.cbc.ca/beta/news/politics/facebook-twitter-privacy-moneylaundering-1.4020638. [Accessed 24 April 2017].
xi Office of the Privacy Commissioner of Canada. 2016. Privacy Impact Assessments. [ONLINE]
Available at: https://www.priv.gc.ca/en/privacy-topics/privacy-impact-assessments/. [Accessed 24 April 2017].
xii Thompson, E. 2017. Canada Revenue Agency monitoring Facebook, Twitter posts of some Canadians. [ONLINE] Available at: http://www.cbc.ca/beta/news/politics/taxes-cra-facebook-big-data-1.3941416. [Accessed 24 April 2017].
xiii Mott, G. 2016. Credit Suisse Fined by Finra Over Money-Laundering Controls. [ONLINE]
Available at: https://www.bloomberg.com/news/articles/2016-12-05/credit-suisse-fined-by-finra-over-money-laundering-safeguards. [Accessed 23 April 2017].
xiv Finra. 2016. FINRA Fines Credit Suisse Securities (USA) LLC $16.5 Million for Significant
Deficiencies in its Anti-Money Laundering Program. [ONLINE] Available at:
https://www.finra.org/newsroom/2016/finra-fines-credit-suisse-165-million-significant-deficiencies-its-aml-program. [Accessed 23 April 2017].
xv Trulioo. 2015. Are Compliance Costs Breaking Banks?. [ONLINE] Available at:
https://www.trulioo.com/blog/are-compliance-costs-hurting-banks-bottom-lines/. [Accessed 23 April 2017].
xvi Verver, J. 2016. Why is internal audit not addressing the data analytics capability gap?.
[ONLINE] Available at: https://www.acl.com/2016/08/why-is-internal-audit-not-addressing-the-data-analytics-capability-gap/. [Accessed 23 April 2017].
xvii The Bank for International Settlements. 2017. Basel Committee on Banking Supervision - overview. [ONLINE] Available at: https://www.bis.org/bcbs/. [Accessed 23 April 2017].
xviii Chemitiganti, V. 2016. HOW DATA SCIENCE AND PREDICTIVE ANALYTICS TRANSFORM AML
COMPLIANCE IN BANKING & PAYMENTS. [ONLINE] Available at: https://hortonworks.com/blog/data-science-predictive-analytics-transform-aml-compliance-banking-payments-22/. [Accessed 23 April 2017].
xix Kwan, P. 2012. Mission Impossible? Creating a Single Enterprise View of the Customer.
[ONLINE] Available at: https://www.alvarezandmarsal.com/insights/mission-impossible-creating-single-enterprise-view-customer. [Accessed 23 April 2017].
xx Marr, B. 2016. Big Data in Practice. UK. Wiley.
xxi Kosinski, M., Stillwell, D., Graepel, T. 2013. Private traits and attributes are predictable
from digital records of human behavior. [ONLINE] Available at: http://www.pnas.org/content/110/15/5802. [Accessed 23 April 2017].
xxii Marr, B. 2015, Big Data, 1st edition, UK. Wiley.
xxiii Clements R. 2015. Novetta Entity Analytics. [ONLINE] Available at: https://www.slideshare.net/chavonneodom/novetta-entity-analytics. [Accessed 23 April 2017].
xxiv Novetta. (2015). Connect Your Hadoop Data and Gain Customer Insights: Introducing
Novetta Entity Analytics. [Online Video]. 19 January 2015. Available from:
https://www.youtube.com/watch?v=e5U4OL8y-78. [Accessed: 23 April 2017].
xxv Croft, J. 2016. Fraud costs the UK up to £193bn per year, report says. Financial Times, 25 May 2016.
xxvi Wikipedia. 2017. File:Universal Contactless Card Symbol.svg. [ONLINE] Available at: https://en.wikipedia.org/wiki/File:Universal_Contactless_Card_Symbol.svg. [Accessed 24 April 2017].
xxvii Capgemini & BNP Paribas. 2016. World Payments Report 2016. [ONLINE] Available at: https://www.worldpaymentsreport.com/. [Accessed 24 April 2017].