  • 8/6/2019 Bolton and Hand

    1/16

    Statistical Fraud Detection: A Review

Author(s): Richard J. Bolton and David J. Hand
Source: Statistical Science, Vol. 17, No. 3 (Aug., 2002), pp. 235-249
Published by: Institute of Mathematical Statistics
Stable URL: http://www.jstor.org/stable/3182781

    Accessed: 25/09/2009 00:21

Statistical Science 2002, Vol. 17, No. 3, 235-255

Statistical Fraud Detection: A Review

Richard J. Bolton and David J. Hand

Abstract. Fraud is increasing dramatically with the expansion of modern technology and the global superhighways of communication, resulting in the loss of billions of dollars worldwide each year. Although prevention technologies are the best way to reduce fraud, fraudsters are adaptive and, given time, will usually find ways to circumvent such measures. Methodologies for the detection of fraud are essential if we are to catch fraudsters once fraud prevention has failed. Statistics and machine learning provide effective technologies for fraud detection and have been applied successfully to detect activities such as money laundering, e-commerce credit card fraud, telecommunications fraud and computer intrusion, to name but a few. We describe the tools available for statistical fraud detection and the areas in which fraud detection technologies are most used.

Key words and phrases: Fraud detection, fraud prevention, statistics, machine learning, money laundering, computer intrusion, e-commerce, credit cards, telecommunications.

1. INTRODUCTION

The Concise Oxford Dictionary defines fraud as "criminal deception; the use of false representations to gain an unjust advantage." Fraud is as old as humanity itself and can take an unlimited variety of different forms. However, in recent years, the development of new technologies (which have made it easier for us to communicate and helped increase our spending power) has also provided yet further ways in which criminals may commit fraud. Traditional forms of fraudulent behavior such as money laundering have become easier to perpetrate and have been joined by new kinds of fraud such as mobile telecommunications fraud and

computer intrusion.

We begin by distinguishing between fraud prevention and fraud detection. Fraud prevention describes measures to stop fraud from occurring in the first place. These include elaborate designs, fluorescent fibers, multitone drawings, watermarks, laminated metal strips and holograms on banknotes, personal identification numbers for bank cards, Internet security systems for credit card transactions, Subscriber Identity Module (SIM) cards for mobile phones, and passwords on computer systems and telephone bank accounts. Of course, none of these methods is perfect and, in general, a compromise has to be struck between expense and inconvenience (e.g., to a customer) on the one hand, and effectiveness on the other.

In contrast, fraud detection involves identifying fraud as quickly as possible once it has been perpetrated. Fraud detection comes into play once fraud prevention has failed. In practice, of course, fraud detection must be used continuously, as one will typically be unaware that fraud prevention has failed. We can try to prevent credit card fraud by guarding our cards assiduously, but if nevertheless the card's details are stolen, then we need to be able to detect, as soon as possible, that fraud is being perpetrated.

Fraud detection is a continuously evolving discipline. Whenever it becomes known that one detection method is in place, criminals will adapt their strategies and try others. Of course, new criminals are also constantly entering the field. Many of them will not be aware of the fraud detection methods which have been successful in the past and will adopt strategies which lead to identifiable frauds. This means that the earlier detection tools need to be applied as well as the latest developments.

Richard J. Bolton is Research Associate in the Statistics Section of the Department of Mathematics at Imperial College. David J. Hand is Professor of Statistics in the Department of Mathematics at Imperial College, London SW7 2BZ, United Kingdom (e-mail: r.bolton, [email protected]).


The development of new fraud detection methods is made more difficult by the fact that the exchange of ideas in fraud detection is severely limited. It does not make sense to describe fraud detection techniques in great detail in the public domain, as this gives criminals the information that they require to evade detection. Data sets are not made available and results are often censored, making them difficult to assess (e.g., Leonard, 1993).

Many fraud detection problems involve huge data sets that are constantly evolving. For example, the credit card company Barclaycard carries approximately 350 million transactions a year in the United Kingdom alone (Hand, Blunt, Kelly and Adams, 2000). The Royal Bank of Scotland, which has the largest credit card merchant acquiring business in Europe, carries over a billion transactions a year, and AT&T carries around 275 million calls each weekday (Cortes and Pregibon, 1998). Processing these data sets in a search for fraudulent transactions or calls requires more than mere novelty of statistical model, and also needs fast and efficient algorithms: data mining techniques are relevant. These numbers also indicate the potential value of fraud detection: if 0.1% of 100 million transactions are fraudulent, each losing the company just £10, then overall the company loses £1 million.

Statistical tools for fraud detection are many and varied, since data from different applications can be diverse in both size and type, but there are common themes. Such tools are essentially based on comparing the observed data with expected values, but expected values can be derived in various ways, depending on the context. They may be single numerical summaries of some aspect of behavior and they are often simple graphical summaries in which an anomaly is readily apparent, but they are also often more complex (multivariate) behavior profiles. Such behavior profiles may be based on past behavior of the system being studied (e.g., the way a bank account has been previously used) or be extrapolated from other similar systems. Things are often further complicated by the fact that, in some domains (e.g., trading on the stock market), a given actor may behave in a fraudulent manner some of the time and not at other times.

Statistical fraud detection methods may be supervised or unsupervised. In supervised methods, samples of both fraudulent and nonfraudulent records are used to construct models which allow one to assign new observations into one of the two classes. Of course, this requires one to be confident about the true classes of the original data used to build the models. It also requires that one has examples of both classes. Furthermore, it can only be used to detect frauds of a type which have previously occurred.

In contrast, unsupervised methods simply seek those accounts, customers and so forth which are most dissimilar from the norm. These can then be examined more closely. Outliers are a basic form of nonstandard observation. Tools used for checking data quality can be used, but the detection of accidental errors is a rather different problem from the detection of deliberately falsified data or data which accurately describe a fraudulent pattern.

This leads us to note the fundamental point that we can seldom be certain, by statistical analysis alone, that a fraud has been perpetrated. Rather, the analysis should be regarded as alerting us to the fact that an observation is anomalous, or more likely to be fraudulent than others, so that it can then be investigated in more detail. One can think of the objective of the statistical analysis as being to return a suspicion score (where we will regard a higher score as more suspicious than a lower one). The higher the score is, then the more unusual is the observation or the more like previously fraudulent values it is. The fact that there are many different ways in which fraud can be perpetrated and many different scenarios in which it can occur means that there are many different ways to compute suspicion scores.
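One of the simplest of these many possibilities, sketched here purely for illustration (the function name and thresholds are ours, not the paper's), is a z-score comparing a new observation with an account's own history:

```python
from statistics import mean, stdev

def suspicion_score(history, amount):
    """Score a new transaction amount against an account's past amounts.

    A simple z-score: how many standard deviations the new amount lies
    from the account's historical mean. Higher means more suspicious.
    """
    mu = mean(history)
    sigma = stdev(history)
    return abs(amount - mu) / sigma

# An account that usually spends roughly 20-60 per transaction:
history = [25.0, 40.0, 31.0, 55.0, 28.0, 46.0]
print(suspicion_score(history, 900.0) > suspicion_score(history, 38.0))  # True
```

In practice the comparison would involve a richer (multivariate) behavior profile, as the text notes, but the principle of scoring departure from expected values is the same.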

Suspicion scores can be computed for each record in the database (for each customer with a bank account or credit card, for each owner of a mobile phone, for each desktop computer and so on), and these can be updated as time progresses. These scores can then be rank ordered and investigative attention can be focussed on those with the highest scores or on those which exhibit a sudden increase. Here issues of cost enter: given that it is too expensive to undertake detailed investigation of all records, one concentrates investigation on those thought most likely to be fraudulent.

One of the difficulties with fraud detection is that typically there are many legitimate records for each fraudulent one. A detection method which correctly identifies 99% of the legitimate records as legitimate and 99% of the fraudulent records as fraudulent might be regarded as a highly effective system. However, if only 1 in 1000 records is fraudulent, then, on average, in every 100 that the system flags as fraudulent, only about 9 will in fact be so. In particular, this means that to identify those 9 requires detailed examination of all 100, at possibly considerable cost. This leads us to
a more general point: fraud can be reduced to as low a level as one likes, but only by virtue of a corresponding level of effort and cost. In practice, some compromise has to be reached, often a commercial compromise, between the cost of detecting a fraud and the savings to be made by detecting it. Sometimes the issues are complicated by, for example, the adverse publicity accompanying fraud detection. At a business level, revealing that a bank is a significant target for fraud, even if much has been detected, does little to inspire confidence, and at a personal level, taking action which implies to an innocent customer that they may be suspected of fraud is obviously detrimental to good customer relations.

The body of this paper is structured according to different areas of fraud detection. Clearly we cannot hope to cover all areas in which statistical methods can be applied. Instead, we have selected a few areas where such methods are used and where there is a body of expertise and of literature describing them. However, before looking at the details of different application areas, Section 2 provides a brief overview of some tools for fraud detection.
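The base-rate arithmetic in the 99%/99% example above can be verified directly with Bayes' rule; the function below is our own illustrative helper, not part of the paper:

```python
def flagged_precision(sensitivity, specificity, prevalence):
    """Fraction of flagged records that are actually fraudulent,
    given per-class accuracies and the fraud base rate."""
    true_flags = sensitivity * prevalence                 # frauds correctly flagged
    false_flags = (1 - specificity) * (1 - prevalence)    # legitimate records wrongly flagged
    return true_flags / (true_flags + false_flags)

# 99% accuracy on each class, 1 fraudulent record per 1000:
p = flagged_precision(0.99, 0.99, 0.001)
print(round(100 * p))  # 9, i.e. about 9 of every 100 flagged records are fraudulent
```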

2. FRAUD DETECTION TOOLS

As we mentioned above, fraud detection can be supervised or unsupervised. Supervised methods use a database of known fraudulent/legitimate cases from which to construct a model which yields a suspicion score for new cases. Traditional statistical classification methods (Hand, 1981; McLachlan, 1992), such as linear discriminant analysis and logistic discrimination, have proved to be effective tools for many applications, but more powerful tools (Ripley, 1996; Hand, 1997; Webb, 1999), especially neural networks, have also been extensively applied. Rule-based methods are supervised learning algorithms that produce classifiers using rules of the form If {certain conditions}, Then {a consequent}. Examples of such algorithms include BAYES (Clark and Niblett, 1989), FOIL (Quinlan, 1990) and RIPPER (Cohen, 1995). Tree-based algorithms such as CART (Breiman, Friedman, Olshen and Stone, 1984) and C4.5 (Quinlan, 1993) produce classifiers of a similar form. Combinations of some or all of these algorithms can be created using meta-learning algorithms to improve prediction in fraud detection (e.g., Chan, Fan, Prodromidis and Stolfo, 1999).

Major considerations when building a supervised tool for fraud detection include those of uneven class sizes and different costs of different types of misclassification. We must also take into consideration the costs of investigating observations and the benefits of identifying fraud. Moreover, often class membership is uncertain. For example, credit transactions may be labelled incorrectly: a fraudulent transaction may remain unobserved and thus be labeled legitimate (and the extent of this may remain unknown) or a legitimate transaction may be misreported as fraudulent. Some work has addressed misclassification of training samples (e.g., Lachenbruch, 1966, 1974; Chhikara and McKeon, 1984), but not in the context of fraud detection as far as we are aware. Issues such as these were discussed by Chan and Stolfo (1998) and Provost and Fawcett (2001).

Link analysis relates known fraudsters to other individuals using record linkage and social network methods (Wasserman and Faust, 1994). For example, in telecommunications networks, security investigators have found that fraudsters seldom work in isolation from each other. Also, after an account has been disconnected for fraud, the fraudster will often call the same numbers from another account (Cortes, Pregibon and Volinsky, 2001). Telephone calls from an account can thus be linked to fraudulent accounts to indicate intrusion. A similar approach has been taken in money laundering (Goldberg and Senator, 1995, 1998; Senator et al., 1995).

Unsupervised methods are used when there are no prior sets of legitimate and fraudulent observations. Techniques employed here are usually a combination of profiling and outlier detection methods. We model a baseline distribution that represents normal behavior and then attempt to detect observations that show the greatest departure from this norm. There are similarities to author identification in text analysis. Digit analysis using Benford's law is an example of such a method. Benford's law (Hill, 1995) says that the distribution of the first significant digits of numbers drawn from a wide variety of random distributions will have (asymptotically) a certain form.
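That form assigns first digit d the probability log10(1 + 1/d), so 1 appears about 30% of the time and 9 under 5%. A minimal digit-analysis sketch (our own illustrative functions, with total absolute deviation standing in for a proper goodness-of-fit test):

```python
import math
from collections import Counter

def first_digit(x):
    """First significant digit of a nonzero number."""
    x = abs(x)
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def benford_deviation(amounts):
    """Total absolute deviation of observed first-digit frequencies from
    Benford's law, P(d) = log10(1 + 1/d). Larger values suggest the data
    depart more from the Benford form."""
    digits = [first_digit(a) for a in amounts if a]
    counts = Counter(digits)
    n = len(digits)
    return sum(abs(counts.get(d, 0) / n - math.log10(1 + 1 / d))
               for d in range(1, 10))

# Data fabricated with a strong preference for leading 9s deviates sharply:
print(benford_deviation([9.0, 91.0, 95.0, 900.0]) > 1.0)  # True
```

A real application would use a formal test (e.g., chi-squared) and far more data; this only illustrates the premise that fabricated figures tend not to conform.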
Until recently, this law was regarded as merely a mathematical curiosity with no apparent useful application. However, Nigrini and Mittermaier (1997) and Nigrini (1999) showed that Benford's law can be used to detect fraud in accounting data. The premise behind fraud detection using tools such as Benford's law is that fabricating data which conform to Benford's law is difficult.

Fraudsters adapt to new prevention and detection measures, so fraud detection needs to be adaptive and evolve over time. However, legitimate account users may gradually change their behavior over a longer period of time and it is important to avoid spurious
alarms. Models can be updated at fixed time points or continuously over time; see, for example, Burge and Shawe-Taylor (1997), Fawcett and Provost (1997a), Cortes, Pregibon and Volinsky (2001) and Senator (2000).
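One common device for continuous updating, shown here only as an assumed illustration (it is not the mechanism of the papers just cited), is an exponentially weighted average that lets a behavior profile drift slowly with legitimate change while abrupt shifts still stand out:

```python
def ewma_update(profile, observation, alpha=0.05):
    """Exponentially weighted update of a numeric behavior profile.

    A small alpha lets legitimate behavior drift gradually into the
    profile; a sudden large observation barely moves it, so the
    observation remains anomalous relative to the profile.
    """
    return (1 - alpha) * profile + alpha * observation

profile = 40.0                      # long-run average transaction amount
for amount in [42.0, 38.0, 41.0]:   # ordinary activity nudges the profile
    profile = ewma_update(profile, amount)
print(39.0 < profile < 41.0)  # True
```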

Although the basic statistical models for fraud detection can be categorized as supervised or unsupervised, the application areas of fraud detection cannot be described so conveniently. Their diversity is reflected in their particular operational characteristics and the variety and quantity of data available, both features that drive the choice of a suitable fraud detection tool.

3. CREDIT CARD FRAUD

The extent of credit card fraud is difficult to quantify, partly because companies are often loath to release fraud figures in case they frighten the spending public and partly because the figures change (probably grow) over time. Various estimates have been given. For example, Leonard (1993) suggested the cost of Visa/Mastercard fraud in Canada in 1989, 1990 and 1991 was $19, 29 and 46 million (Canadian), respectively. Ghosh and Reilly (1994) suggested a figure of $850 million (U.S.) per year for all types of credit card fraud in the United States, and Aleskerov, Freisleben and Rao (1997) cited estimates of $700 million in the United States each year for Visa/Mastercard and $10 billion worldwide in 1996. Microsoft's Expedia set aside $6 million for credit card fraud in 1999 (Patient, 2000). Total losses through credit card fraud in the United Kingdom have been growing rapidly over the last 4 years [1997, £122 million; 1998, £135 million; 1999, £188 million; 2000, £293 million. Source: Association for Payment Clearing Services, London (APACS)] and recently APACS reported £373.7 million losses in the 12 months ending August 2001. Jenkins (2000) says "for every £100 you spend on a card in the UK, 13p is lost to fraudsters." Matters are complicated by issues of exactly what one includes in the fraud figures. For example, bankruptcy fraud arises when the cardholder makes purchases for which he/she has no intention of paying and then files for personal bankruptcy, leaving the bank to cover the losses. Since these are generally regarded as charge-off losses, they often are not included in fraud figures. However, they can be substantial: Ghosh and Reilly (1994) cited one estimate of $2.65 billion for bankruptcy fraud in 1992.

It is in a company and card issuer's interests to prevent fraud or, failing this, to detect fraud as soon as possible. Otherwise consumer trust in both the card and the company decreases and revenue is lost, in addition to the direct losses made through fraudulent sales. Because of the potential for loss of sales due to loss of confidence, in general, the merchants assume responsibility for fraud losses, even when the vendor has obtained authorization from the card issuer.

Credit card fraud may be perpetrated in various ways (a description of the credit card industry and how it functions is given in Blunt and Hand, 2000), including simple theft, application fraud and counterfeit cards. In all of these, the fraudster uses a physical card, but physical possession is not essential to perpetrate credit card fraud: one of the major fraud areas is "cardholder-not-present" fraud, where only the card's details are given (e.g., over the phone).

Use of a stolen card is perhaps the most straightforward type of credit card fraud. In this case, the fraudster typically spends as much as possible in as short a space of time as possible, before the theft is detected and the card is stopped; hence, detecting the theft early can prevent large losses.

Application fraud arises when individuals obtain new credit cards from issuing companies using false personal information. Traditional credit scorecards (Hand and Henley, 1997) are used to detect customers who are likely to default, and the reasons for this may include fraud. Such scorecards are based on the details given on the application forms and perhaps also on other details such as bureau information. Statistical models which monitor behavior over time can be used to detect cards which have been obtained from a fraudulent application (e.g., a first time cardholder who runs out and rapidly makes many purchases should arouse suspicion). With application fraud, however, urgency is not as important to the fraudster and it might not be until accounts are sent out or repayment dates begin to pass that fraud is suspected.

Cardholder-not-present fraud occurs when the transaction is made remotely, so that only the card's details are needed, and a manual signature and card imprint are not required at the time of purchase. Such transactions include telephone sales and on-line transactions, and this type of fraud accounts for a high proportion of losses. To undertake such fraud it is necessary to obtain the details of the card without the cardholder's knowledge. This is done in various ways, including "skimming," where employees illegally copy the magnetic strip on a credit card by swiping it through a small handheld card reader, "shoulder surfers," who enter card details into a mobile phone while standing behind a purchaser in a queue, and people posing as credit
card company employees taking details of credit card transactions from companies over the phone. Counterfeit cards, currently the largest source of credit card fraud in the United Kingdom (source: APACS), can also be created using this information. Transactions made by fraudsters using counterfeit cards and making cardholder-not-present purchases can be detected through methods which seek changes in transaction patterns, as well as checking for particular patterns which are known to be indicative of counterfeiting.

Credit card databases contain information on each transaction. This information includes such things as merchant code, account number, type of credit card, type of purchase, client name, size of transaction and date of transaction. Some of these data are numerical (e.g., transaction size) and others are nominal categorical (e.g., merchant code, which can have hundreds of thousands of categories) or symbolic. The mixed data types have led to the application of a wide variety of statistical, machine learning and data mining tools.

Suspicion scores to detect whether an account has been compromised can be based on models of individual customers' previous usage patterns, standard expected usage patterns, particular patterns which are known to be often associated with fraud, and on supervised models. A simple example of the patterns exhibited by individual customers is given in Figure 16 of Hand and Blunt (2001), which shows how the slopes of cumulative credit card spending over time are remarkably linear.
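Such near-linear cumulative-spend curves suggest a simple monitoring rule; the sketch below, with an entirely illustrative window and threshold of our own choosing, alarms when the recent spending rate far exceeds the established slope:

```python
def rate_alarm(cumulative, window=3, factor=4.0):
    """Flag a sudden change of slope in a cumulative-spending curve.

    cumulative: running total of spend at equally spaced time points
    (needs more than `window` + 1 points). Alarms when the mean increment
    over the last `window` steps exceeds `factor` times the mean
    increment before that. Both parameters are illustrative.
    """
    increments = [b - a for a, b in zip(cumulative, cumulative[1:])]
    base, recent = increments[:-window], increments[-window:]
    base_rate = sum(base) / len(base)
    recent_rate = sum(recent) / len(recent)
    return recent_rate > factor * base_rate

steady = [0, 10, 21, 30, 41, 50, 61, 70]     # roughly constant slope
spree = [0, 10, 21, 30, 41, 150, 400, 700]   # sudden change of slope
print(rate_alarm(steady), rate_alarm(spree))  # False True
```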
Sudden jumps in these curves or sudden changes of slope (transaction or expenditure rate suddenly exceeding some threshold) merit investigation. Likewise, some customers practice "jam jarring," restricting particular cards to particular types of purchases (e.g., using a given card for petrol purchases only and a different one for supermarket purchases), so that usage of a card to make an unusual type of purchase can trigger an alarm for such customers. At a more general level, suspicion scores can also be based on expected overall usage profiles. For example, first time credit card users are typically initially fairly tentative in their usage, whereas those transferring loans from another card are generally not so reticent. Finally, examples of overall transaction patterns known to be intrinsically suspicious are the sudden purchase of many small electrical items or jewelry (goods which permit easy black market resale) and the immediate use of a new card in a wide range of different locations.

We commented above that, for obvious reasons, there is a dearth of published literature on fraud detection. Much of that which has been published appears in the methodological data analytic literature, where the aim is to illustrate new data analytic tools by applying them to the detection of fraud, rather than to describe methods of fraud detection per se. Furthermore, since anomaly detection methods are very context dependent, much of the published literature in the area concentrates on supervised classification methods. In particular, rule-based systems and neural networks have attracted interest. Researchers who have used neural networks for credit card fraud detection include Ghosh and Reilly (1994), Aleskerov et al. (1997), Dorronsoro, Ginel, Sanchez and Cruz (1997) and Brause, Langsdorf and Hepp (1999), mainly in the context of supervised classification. HNC Software has developed Falcon, a software package that relies heavily on neural network technology to detect credit card fraud.

Supervised methods, using samples from the fraudulent/nonfraudulent classes as the basis to construct classification rules to detect future cases of fraud, suffer from the problem of unbalanced class sizes mentioned above: the legitimate transactions generally far outnumber the fraudulent ones. Brause, Langsdorf and Hepp (1999) said that, in their database of credit card transactions, "the probability of fraud is very low (0.2%) and has been lowered in a preprocessing step by a conventional fraud detecting system down to 0.1%." Hassibi (2000) remarked that "out of some 12 billion transactions made annually, approximately 10 million (or one out of every 1200 transactions) turn out to be fraudulent. Also, 0.04% (4 out of every 10,000) of all monthly active accounts are fraudulent." It follows from this sort of figure that simple misclassification rate cannot be used as a performance measure: with a bad rate of 0.1%, simply classifying every transaction as legitimate will yield an error rate of only 0.001. Instead, one must either minimize an appropriate cost-weighted loss or fix some parameter (such as the number of cases one can afford to investigate in detail) and then try to maximize the number of fraudulent cases detected subject to the constraints.

Stolfo et al. (1997a, b) outlined a meta-classifier system for detecting credit card fraud that is based on the idea of using different local fraud detection tools within each different corporate environment and merging the results to yield a more accurate global tool. This work was elaborated in Chan and Stolfo (1998), Chan, Fan, Prodromidis and Stolfo (1999) and Stolfo et al. (1999), who described a more realistic cost model to accompany the different classification outcomes. Wheeler and Aitken (2000) also explored the combination of multiple classification rules.
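The fixed-budget alternative to raw misclassification rate can be sketched directly (an illustrative evaluation helper of our own, with toy scores and labels): rank by suspicion score and count the frauds among the k records one can afford to investigate.

```python
def frauds_caught(scores, labels, budget):
    """Rank records by suspicion score, descending, and count the true
    frauds among the `budget` highest-scoring ones.
    labels: 1 = fraudulent, 0 = legitimate."""
    ranked = sorted(zip(scores, labels), reverse=True)
    return sum(label for _, label in ranked[:budget])

scores = [0.9, 0.1, 0.8, 0.3, 0.7, 0.2]  # hypothetical suspicion scores
labels = [1,   0,   0,   0,   1,   0]    # hypothetical true classes
print(frauds_caught(scores, labels, budget=2))  # 1
```

Under this view, the evaluation criterion matches the operational constraint: a method is better if, for the same investigation budget, it places more of the fraudulent records near the top of the ranking.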
4. MONEY LAUNDERING

Money laundering is the process of obscuring the source, ownership or use of funds, usually cash, that are the profits of illicit activity. The size of the problem is indicated in a 1995 U.S. Office of Technology Assessment (OTA) report (U.S. Congress, 1995): "Federal agencies estimate that as much as $300 billion is laundered annually, worldwide. From $40 billion to $80 billion of this may be drug profits made in the United States." Prevention is attempted by means of legal constraints and requirements, the burden of which is gradually increasing, and there has been much debate recently about the use of encryption. However, no prevention strategy is foolproof and detection is essential. In particular, the September 11th terrorist attacks on New York City and the Pentagon have focused attention on the detection of money laundering in an attempt to starve terrorist networks of funds.

Wire transfers provide a natural domain for laundering: according to the OTA report, each day in 1995 about half a million wire transfers, valued at more than $2 trillion (U.S.), were carried out using the Fedwire and CHIPS systems, along with almost a quarter of a million transfers using the SWIFT system. It is estimated that around 0.05-0.1% of these transactions involved laundering. Sophisticated statistical and other on-line data analytic procedures are needed to detect such laundering activity. Since it is now becoming a legal requirement to show that all reasonable means have been used to detect fraud, we may expect to see even greater application of such tools.

Wire transfers contain items such as date of transfer, identity of sender, routing number of originating bank, identity of recipient, routing number of recipient bank and amount transferred. Sometimes those fields not needed for transfer are left blank, free text fields may be completed in different ways and, worse still, but inevitable, sometimes the data have errors. Automatic error detection (and correction) software has been developed, based on semantic and syntactic constraints on possible content, but, of course, this can never be a complete solution. Matters are also complicated by the fact that banks do not share their data. Of course, banks are not the only bodies that transfer money electronically, and other businesses have been established precisely for this purpose [the OTA report (U.S. Congress, 1995) estimates the number of such businesses as 200,000].

The detection of money laundering presents difficulties not encountered in areas such as, for example, the credit card industry. Whereas credit card fraud comes to light fairly early on, in money laundering it may be years before individual transfers or accounts are definitively and legally identified as part of a laundering process. While, in principle (assuming records have been kept), one could go back and trace the relevant transactions, in practice not all of them would be identified, so detracting from their use in supervised detection methods. Furthermore, there is typically less extensive information available for the account holders in investment banks than there is in retail banking operations. Developing more detailed customer record systems might be a good way forward.

As with other areas of fraud, money laundering detection works hand in hand with prevention. In 1970, for example, in the United States the Bank Secrecy Act required that banks report all currency transactions of over $10,000 to the authorities. However, also as in other areas of fraud, the perpetrators adapt their modus operandi to match the changing tactics of the authorities. So, following the requirement of banks to report currency transactions of over $10,000, the obvious strategy was developed to divide larger sums into multiple amounts of less than $10,000 and deposit them in different banks (a practice termed smurfing or structuring). In the United States, this is now illegal, but the way the money launderers adapt to the prevailing detection methods can lead one to the pessimistic perspective that only the incompetent money launderers are detected. This, clearly, also limits the value of supervised detection methods: the patterns detected will be those patterns which were characteristic of fraud in the past, but which may no longer be so.

Other strategies used by money launderers which limit the value of supervised methods include switching between wire and physical cash movements, the creation of shell businesses, false invoicing and, of course, the fact that a single transfer, in itself, is unlikely to appear to be a laundering transaction. Furthermore, because of the large sums involved, money launderers are highly professional and often have contacts in the banks who can feed back details of the detection strategies being applied.

The number of currency transactions over $10,000 in value increased dramatically after the mid-1980s, to the extent that the number of reports filed is huge (over 10 million in 1994, with total worth of around $500 billion), and this in itself can cause difficulties. In an attempt to cope with this, the Financial Crimes Enforcement Network (FinCEN) of the U.S. Department of the Treasury processes all such reports using
the FinCEN artificial intelligence system (FAIS) described below. More generally, banks are also required to report any suspicious transactions, and about 0.5% of currency transaction reports are so flagged.

Money laundering involves three steps:

1. Placement: the introduction of the cash into the banking system or legitimate business (e.g., transferring the banknotes obtained from retail drugs transactions into a cashier's cheque). One way to do this is to pay vastly inflated amounts for goods imported across international frontiers. Pak and Zdanowicz (1994) described statistical analysis of trade databases to detect anomalies in government trade data such as charging $1694 a gram for imports of the drug erythromycin compared with $0.08 a gram for exports.

2. Layering: carrying out multiple transactions through multiple accounts with different owners at different financial institutions in the legitimate financial system.

3. Integration: merging the funds with money obtained from legitimate activities.
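A trade-price screen in the spirit of the Pak and Zdanowicz example can be sketched as flagging unit prices far from a commodity's typical price; this is a simplified illustration with a hypothetical threshold, not their actual procedure:

```python
def price_outliers(unit_prices, factor=10.0):
    """Flag unit prices more than `factor` times the median price for a
    commodity, in either direction. The factor is illustrative only; a
    real screen would use a calibrated statistical criterion."""
    ordered = sorted(unit_prices)
    median = ordered[len(ordered) // 2]
    return [p for p in unit_prices
            if p > factor * median or p < median / factor]

# Hypothetical declared prices per gram for one commodity:
prices = [0.95, 1.10, 1.02, 0.08, 1694.0, 0.99]
print(price_outliers(prices))  # [0.08, 1694.0]
```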

Detection strategies can be targeted at various levels. In general (and in common with some other areas in which fraud is perpetrated), it is very difficult or impossible to characterize an individual transaction as fraudulent. Rather, transaction patterns must be identified as fraudulent or suspicious. A single deposit of just under $10,000 is not suspicious, but multiple such deposits are; a large sum being deposited is not suspicious, but a large sum being deposited and instantly withdrawn is. In fact, one can distinguish several levels of (potential) analysis: the individual transaction level, the account level, the business level (and, indeed, individuals may have multiple accounts) and the "ring" of businesses level. Analyses can be targeted at particular levels, but more complex approaches can examine several levels simultaneously. (There is an analogy here with speech recognition systems: simple systems focused at the individual phoneme and word levels are not as effective as those which try to recognize these elements in a higher level context of the way words are put together when used.) In general, link analysis, which identifies groups of participants involved in transactions, plays a key role in most money laundering detection strategies. Senator et al. (1995) said "Money laundering typically involves a multitude of transactions, perhaps by distinct individuals, into multiple accounts with different owners at different banks and other financial institutions. Detection of large-scale

money laundering schemes requires the ability to reconstruct these patterns of transactions by linking potentially related transactions and then to distinguish the legitimate sets of transactions from the illegitimate ones. This technique of finding relationships between elements of information, called link analysis, is the primary analytic technique used in law enforcement intelligence (Andrews and Peterson, 1990)." An obvious and simplistic illustration is the fact that a transaction with a known criminal may rouse suspicion. More subtle methods are based on recognition of the sort of businesses with which money laundering operations transact. Of course, these are all supervised methods and are subject to the weakness that those responsible may evolve their strategies. Similar tools are used to detect telecom fraud, as outlined in the following section.

Rule-based systems have been developed, often with the rules based on experience ("flag transactions from countries X and Y"; "flag accounts showing a large deposit followed immediately by a similar sized withdrawal"). Structuring can be detected by computing the cumulative sum of amounts entering an account over a short window, such as a day. Other methods have been developed based on straightforward descriptive statistics, such as rate of transactions and proportion of transactions which are suspicious. The use of the Benford distribution is an extension of this idea. Although one may not usually be interested in detecting changes in an account's behavior, methods such as peer group analysis (Bolton and Hand, 2001) and break detection (Goldberg and Senator, 1997) can be applied to detect money laundering.

One of the most elaborate money laundering detection systems is the U.S. Financial Crimes Enforcement Network AI system (FAIS) described in Senator et al. (1995) and Goldberg and Senator (1998). This system allows users to follow trails of linked transactions.
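The cumulative-sum screen for structuring mentioned above can be sketched as flagging any account-day whose deposits individually stay below the reporting limit but together exceed it. The record format is a hypothetical illustration:

```python
from collections import defaultdict

def flag_structuring(deposits, limit=10_000):
    """Flag (account, day) pairs whose deposits each stay under the
    reporting limit but whose cumulative sum reaches or exceeds it:
    the smurfing/structuring pattern described in the text."""
    daily = defaultdict(list)
    for account, day, amount in deposits:
        daily[(account, day)].append(amount)
    flagged = []
    for key, amounts in daily.items():
        if all(a < limit for a in amounts) and sum(amounts) >= limit:
            flagged.append(key)
    return flagged
```

A single $12,000 deposit is reportable anyway and so is not flagged here; two same-day deposits of $6,000 and $5,000 are.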
It is built around a "blackboard" architecture, in which program modules can read and write to a central database that contains details of transactions, subjects and accounts. A key component of the system is its suspicion score. This is a rule-based system based on an earlier system developed by the U.S. Customs Service in the mid-1980s. The system computes suspicion scores for various different types of transaction and activity. Simple Bayesian updating is used to combine evidence that suggests that a transaction or activity is illicit to yield an overall suspicion score. Senator et al. (1995) included a brief but interesting discussion of an investigation of whether case-based reasoning (cf. nearest


neighbor methods) and classification tree techniques could usefully be added to the system.

The American National Association of Securities Dealers, Inc., uses an advanced detection system (ADS; Kirkland et al., 1998; Senator, 2000) to flag "patterns or practices of regulatory concern." ADS uses a rule pattern matcher and a time-sequence pattern matcher, and (like FAIS) places great emphasis on visualization tools. Also as with FAIS, data mining techniques are used to identify new patterns of potential interest.

A different approach to detecting similar fraudulent behavior is taken by SearchSpace Ltd. (www.searchspace.com), which has developed a system for the London Stock Exchange called MonITARS (monitoring insider trading and regulatory surveillance) that combines genetic algorithms, fuzzy logic and neural network technology to detect insider dealing and market manipulation. Chartier and Spillane (2000) also described an application of neural networks to detect money laundering.
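Bayesian updating of a suspicion score, of the general kind reported for FAIS, can be illustrated by multiplying the odds of illicit activity by a likelihood ratio for each piece of evidence. The evidence labels, ratios and prior below are entirely hypothetical, not taken from any cited system:

```python
import math

def suspicion_score(evidence, likelihood_ratios, prior_odds=1e-3):
    """Combine pieces of evidence into a posterior suspicion probability
    by naive Bayesian updating of the odds: each piece multiplies the
    odds by its ratio P(evidence | illicit) / P(evidence | licit)."""
    log_odds = math.log(prior_odds)
    for e in evidence:
        log_odds += math.log(likelihood_ratios[e])
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)
```

With independent evidence this is just the naive Bayes combination rule; real systems must also cope with dependent and partially observed evidence.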

5. TELECOMMUNICATIONS FRAUD

The telecommunications industry has expanded dramatically in the last few years with the development of affordable mobile phone technology. With the increasing number of mobile phone users, global mobile phone fraud is also set to rise. Various estimates have been presented for the cost of this fraud. For example, Cox, Eick, Wills and Brachman (1997) gave a figure of $1 billion a year. Telecom and Network Security Review [4(5) April 1997] gave a figure of between 4 and 6% of U.S. telecom revenue lost due to fraud. Cahill, Lambert, Pinheiro and Sun (2002) suggested that international figures are worse, with "several new service providers reporting losses over 20%." Moreau et al. (1996) gave a value of "several million ECUs per year." Presumably this refers to within the European Union and, given the size of the other estimates, we wonder if this should be billions. According to a recent report (Neural Technologies, 2000), "the industry already reports a loss of £13 billion each year due to fraud." Mobile Europe (2000) gave a figure of $13 billion (U.S.). The latter article also claimed that it is estimated that fraudsters can steal up to 5% of some operators' revenues, and that some expect telecom fraud as a whole to reach $28 billion per year within 3 years.

Despite the variety in these figures, it is clear that they are all very large. Apart from the fact that they are simply estimates, and hence subject to expected inaccuracies and variability based on the information

used to derive them, there are other reasons for the differences. One is the distinction between hard and soft currency. Hard currency is real money, paid by someone other than the perpetrator for the service the perpetrator has stolen. Hynninen (2000) gave the example of the sum one mobile phone operator will pay another for the use of their network. Soft currency is the value of the service the perpetrator has stolen. At least part of this is only a loss if one assumes that the thief would have used the same service even if he or she had had to pay for it. Another reason for the differences derives from the fact that such estimates may be used for different purposes. Hynninen (2000) gave the examples of operators giving estimates on the high side, hoping for more stringent antifraud legislation, and operators giving estimates on the low side to encourage customer confidence.

We need to distinguish between fraud aimed at the service provider and fraud enabled by the service provider. An example of the former is the resale of stolen call time and an example of the latter is interfering with telephone banking instructions. (It is the possibility of the latter sort of fraud which makes the public wary of using their credit cards over the Internet.) We can also distinguish between revenue fraud and nonrevenue fraud. The aim of the former is to make money for the perpetrator, while the aim of the latter is simply to obtain a service free of charge (or, as with computer hackers, e.g., the simple challenge represented by the system).

There are many different types of telecom fraud (see, e.g., Shawe-Taylor et al., 2000) and these can occur at various levels. The two most prevalent types are subscription fraud and superimposed or "surfing" fraud. Subscription fraud occurs when the fraudster obtains a subscription to a service, often with false identity details, with no intention of paying. This is thus at the level of a phone number: all transactions from this number will be fraudulent. Superimposed fraud is the use of a service without having the necessary authority and is usually detected by the appearance of phantom calls on a bill. There are several ways to carry out superimposed fraud, including mobile phone cloning and obtaining calling card authorization details. Superimposed fraud will generally occur at the level of individual calls; the fraudulent calls will be mixed in with the legitimate ones. Subscription fraud will generally be detected at some point through the billing process, although the aim is to detect it well before that, since large costs can quickly be run up. Superimposed fraud can remain undetected for a long time. The distinction


between these two types of fraud follows a similar distinction in credit card fraud.

Other types of telecom fraud include "ghosting" (technology that tricks the network so as to obtain free calls) and insider fraud, where telecom company employees sell information to criminals that can be exploited for fraudulent gain. This, of course, is a universal cause of fraud, whatever the domain. "Tumbling" is a type of superimposed fraud in which rolling fake serial numbers are used on cloned handsets, so that successive calls are attributed to different legitimate phones. The chance of detection by spotting unusual patterns is small and the illicit phone will operate until all of the assumed identities have been spotted. The term "spoofing" is sometimes used to describe users pretending to be someone else.

Telecommunications networks generate vast quantities of data, sometimes on the order of several gigabytes per day, so that data mining techniques are of particular importance. The 1998 database of AT&T, for example, contained 350 million profiles and processed 275 million call records per day (Cortes and Pregibon, 1998).

As with other fraud domains, apart from some domain specific tools, methods for detection hinge around outlier detection and supervised classification, either using rule-based methods or based on comparing statistically derived suspicion scores with some threshold. At a low level, simple rule-based detection systems use rules such as the apparent use of the same phone in two very distant geographical locations in quick succession, calls which appear to overlap in time, and very high value and very long calls. At a higher level, statistical summaries of call distributions (often called profiles or signatures at the user level) are compared with thresholds determined either by experts or by application of supervised learning methods to known fraud/nonfraud cases. Murad and Pinkas (1999) and Rosset et al.
(1999) distinguished between profiling at the levels of individual calls, daily call patterns and overall call patterns, and described what are effectively outlier detection methods for detecting anomalous behavior. A particularly interesting description of profiling methods was given by Cortes and Pregibon (1998). Cortes, Fisher, Pregibon and Rogers (2000) described the Hancock language for writing programs for processing profiles, basing the signatures on such quantities as average call duration, longest call duration, number of calls to particular regions in the last day and so on. Profiling and classification techniques

also were described by Fawcett and Provost (1997a, b, 1999) and Moreau, Verrelst and Vandewalle (1997). Some work (see, e.g., Fawcett and Provost, 1997a) has focused on detecting changes in behavior.

A general complication is that signatures and thresholds may need to depend on time of day, type of account and so on, and that they will probably need to be updated over time. Cahill et al. (2002) suggested excluding the very suspicious scores in this updating process, although more work is needed in this area.

Once again, neural networks have been widely used. The main fraud detection software of the Fraud Solutions Unit of Nortel Networks (Nortel, 2000) uses a combination of profiling and neural networks. Likewise, ASPeCT (Moreau et al., 1996; Shawe-Taylor et al., 2000), a project of the European Commission, Vodaphone, other European telecom companies and academics, developed a combined rule-based profiling and neural network approach. Taniguchi, Haft, Hollmen and Tresp (1998) described neural networks, mixture models and Bayesian networks in telecom fraud detection based on call records stored for billing.

Link analysis, with links updated over time, establishes the "communities of interest" (Cortes, Pregibon and Volinsky, 2001) that can indicate networks of fraudsters. These methods are based on the observation that fraudsters seldom change their calling habits, but are often closely linked to other fraudsters. Using similar patterns of transactions to infer the presence of a particular fraudster is in the spirit of phenomenal data mining (McCarthy, 2000).

Visualization methods (Cox et al., 1997), developed for mining very large data sets, have also been developed for use in telecom fraud detection. Here human pattern recognition skills interact with graphical computer display of quantities of calls between different subscribers in various geographical locations. A possible future scenario would be to code into software the patterns which humans detect.

The telecom market will become even more complicated over time, with more opportunity for fraud.
At present the extent of fraud is measured by considering factors such as call lengths and tariffs. The third generation of mobile phone technology will also need to take into account such things as the content of the calls (because of the packet switching technology used, equally long data transmissions may contain very different numbers of data packets) and the priority of the call.
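Signature-based detection of the kind described in this section can be sketched, at its simplest, as a per-user profile plus a threshold. Real systems track many statistics and condition on time of day and account type; this toy version uses only call duration and an arbitrary three-standard-deviation rule:

```python
def build_profile(durations):
    """Summarize a user's historical call durations by their mean and
    (population) standard deviation: one very simple signature."""
    n = len(durations)
    mean = sum(durations) / n
    var = sum((d - mean) ** 2 for d in durations) / n
    return mean, var ** 0.5

def is_anomalous(profile, new_duration, k=3.0):
    """Flag a call whose duration exceeds the user's historical mean
    by more than k standard deviations (threshold k is illustrative)."""
    mean, sd = profile
    return new_duration > mean + k * sd
```

A deployed system would also have to update the profile over time, ideally excluding the most suspicious observations from the update, as Cahill et al. (2002) suggested.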


6. COMPUTER INTRUSION

On Thursday, September 21, 2000, a 16-year-old boy was jailed for hacking into both the Pentagon and NASA computer systems. Between the 14th and 25th of October 2000 Microsoft security tracked the illegal activity of a hacker on the Microsoft Corporate Network. These examples illustrate that even exceptionally well protected domains can have their computer security compromised.

Computer intrusion fraud is big business and computer intrusion detection is a hugely intensive area of research. Hackers can find passwords, read and change files, alter source code, read e-mails and so on. Denning (1997) listed eight kinds of computer intrusion. If the hackers can be prevented from penetrating the computer system or can be detected early enough, then such crime can be virtually eliminated. However, as with all fraud, when the prizes are high, the attacks are adaptive and once one kind of intrusion has been recognized the hacker will try a different route. Because of its importance, a great deal of effort has been put into developing intrusion detection methods, and there are several commercial products available, including Cisco secure intrusion detection system (CSIDS, 1999) and next-generation intrusion detection expert system (NIDES; Anderson, Frivold and Valdes, 1995).

Since the only record of a hacker's activities is the sequence of commands that is used when compromising the system, analysts of computer intrusion data predominantly use sequence analysis techniques. As with other fraud situations, both supervised and unsupervised methods are used. In the context of intrusion detection, supervised methods are sometimes called misuse detection, while the unsupervised methods used are generally methods of anomaly detection, based on profiles of usage patterns for each legitimate user. Supervised methods have the problem described in other contexts, that they can, of course, only work on intrusion patterns which have already occurred (or partial matches to these).
Lee and Stolfo (1998) applied classification techniques to data from a user or program that has been identified as either normal or abnormal. Lippmann et al. (2000) concluded that emphasis should be placed on developing methods for detecting new patterns of intrusion rather than old patterns, but Kumar and Spafford (1994) remarked that "a majority of break-ins ... are the result of a small number of known attacks, as evidenced by reports from response teams (e.g., CERT). Automating detection of these attacks should therefore result in the detection of a significant

number of break-in attempts." Shieh and Gligor (1991, 1997) described a pattern-matching method and argued that it is more effective than statistical methods at detecting known types of intrusion, but is unable to detect novel kinds of intrusion patterns, which could be detected by statistical methods.

Since intrusion represents behavior and the aim is to distinguish between intrusion behavior and usual behavior in sequences, Markov models have naturally been applied (e.g., Ju and Vardi, 2001). Qu et al. (1998) also used probabilities of events to define the profile. Forrest, Hofmeyr, Somayaji and Longstaff (1996) described a method based on how natural immune systems distinguish between self and alien patterns. As with telecom data, both individual user patterns and overall network behavior change over time, so that a detection system must be able to adapt to changes, but not adapt so rapidly that it also accepts intrusions as legitimate changes. Lane and Brodley (1998) and Kosoresow and Hofmeyr (1997) also used similarity of sequences that can be interpreted in a probabilistic framework.

Inevitably, neural networks have been used: Ryan, Lin and Miikkulainen (1997) performed profiling by training a neural network on the process data and also referenced other neural approaches. In one of the more careful studies in the area, Schonlau et al. (2001) described a comparative study of six statistical approaches for detecting impersonation of other users (masquerading), where they took real usage data from 50 users and planted contaminating data from other users to serve as the masquerade targets to be detected. A nice overview of statistical issues in computer intrusion detection was given by Marchette (2001), and the October 2000 edition of Computer Networks [34(4)] is a special issue on (relatively) recent advances in intrusion detection systems, including several examples of new approaches to computer intrusion detection.
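A minimal version of the Markov-model idea for command sequences might estimate first-order transition probabilities from normal sessions and flag new sequences with unusually low average log-likelihood. The smoothing scheme and scoring rule here are illustrative, not those of any system cited above:

```python
import math
from collections import defaultdict

def train_markov(sequences, smoothing=1.0):
    """Estimate first-order transition probabilities over the commands
    seen in normal sessions, with add-one smoothing so that unseen
    transitions receive small but nonzero probability."""
    counts = defaultdict(lambda: defaultdict(float))
    vocab = set()
    for seq in sequences:
        vocab.update(seq)
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1

    def prob(a, b):
        total = sum(counts[a].values()) + smoothing * len(vocab)
        return (counts[a][b] + smoothing) / total

    return prob

def sequence_score(prob, seq):
    """Average log-likelihood of a command sequence under the trained
    model; unusually low values suggest behavior unlike the normal
    training sessions."""
    lls = [math.log(prob(a, b)) for a, b in zip(seq, seq[1:])]
    return sum(lls) / len(lls)
```

A detection rule would then compare the score against a threshold chosen per user, with the same adaptation caveats discussed in the text.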

7. MEDICAL AND SCIENTIFIC FRAUD

Medical fraud can occur at various levels. It can occur in clinical trials (see, e.g., Buyse et al., 1999). It can also occur in a more commercial context: for example, prescription fraud, submitting claims for patients who are dead or who do not exist, and upcoding, where a doctor performs a medical procedure, but charges the insurer for one that is more expensive, or perhaps does not even perform one at all. Allen (2000) gave an example of bills submitted for more than 24 hours in a working day. He, Wang, Graco and Hawkins (1997)


and He, Graco and Yao (1999) described the use of neural networks, genetic algorithms and nearest neighbor methods to classify the practice profiles of general practitioners in Australia into classes from normal to abnormal.

Medical fraud is often linked to insurance fraud: Terry Allen, a statistician with the Utah Bureau of Medicaid Fraud, estimated that up to 10% of the $800 million annual claims may be stolen (Allen, 2000). Major and Riedinger (1992) created a knowledge/statistical-based system to detect healthcare fraud by comparing observations with those with which they should be most similar (e.g., having similar geodemographics). Brockett, Xia and Derrig (1998) used neural networks to classify fraudulent and nonfraudulent claims for automobile bodily injury in healthcare insurance claims. Glasgow (1997) gave a short discussion of risk and fraud in the insurance industry. A glossary of several of the different types of medical fraud is available at http://www.motherjones.com/mother_jones/MA95/davis2.html.

Of course, medicine is not the only scientific area where data have sometimes been fabricated, falsified or carefully selected to support a pet theory. Problems of fraud in science are attracting increased attention, but they have always been with us: errant scientists have been known to massage figures from experiments to push through development of a product or reach a magical significance level for a publication. Dmitriy Yuryev described such a case on his webpages at http://www.orc.ru/~yur77/statfr.htm. Moreover, there are many classical cases in which the data have been suspected of being massaged (including the work of Galileo, Newton, Babbage, Kepler, Mendel, Millikan and Burt). Press and Tanur (2001) presented a fascinating discussion of the role of subjectivity in the scientific process, illustrating with many examples. The borderline between subconscious selection of data and out-and-out distortion is a fine one.

8. CONCLUSIONS

The areas we have outlined are perhaps those in which statistical and other data analytic tools have made the most impact on fraud detection. This is typically because there are large quantities of information, and this information is numerical or can easily be converted into the numerical in the form of counts and proportions. However, other areas, not mentioned above, have also used statistical tools for fraud detection. Irregularities in financial statements can be used to detect

accounting and management fraud in contexts broader than those of money laundering. Digit analysis tools have found favor in accountancy (e.g., Nigrini and Mittermaier, 1997; Nigrini, 1999). Statistical sampling methods are important in financial audit, and screening tools are applied to decide which tax returns merit detailed investigation. We mentioned insurance fraud in the context of medicine, but it clearly occurs more widely. Artis, Ayuso and Guillén (1999) described an approach to modelling fraud behavior in car insurance, and Fanning, Cogger and Srivastava (1995) and Green and Choi (1997) examined neural network classification methods for detecting management fraud. Statistical tools for fraud detection have also been applied to sporting events. For example, Robinson and Tawn (1995), Smith (1997) and Barao and Tawn (1999) examined the results of running events to see if some exceptional times were out of line with what might be expected.

Plagiarism is also a type of fraud. We briefly referred to the use of statistical tools for author verification and such methods can be applied here. However, statistical tools can also be applied more widely. For example, with the evolution of the Internet it is extremely easy for students to plagiarize articles and pass them off as their own in school or university coursework. The website http://www.plagiarism.org describes a system that can take a manuscript and compare it against their "substantial database" of articles from the Web. A statistical measure of the originality of the manuscript is returned.

As we commented in the Introduction, fraud detection is a post hoc strategy, being applied after fraud prevention has failed.
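Digit analysis of the Benford type mentioned above can be sketched as a comparison of observed leading-digit frequencies with the Benford distribution, under which a first digit d occurs with probability log10(1 + 1/d). The chi-squared-style statistic below is one common choice; the procedures in the cited accountancy work differ in detail:

```python
import math
from collections import Counter

def first_digit_deviation(amounts):
    """Compare the observed first-digit distribution of a set of
    amounts with the Benford distribution, returning a chi-squared
    style statistic; large values suggest figures worth a closer look."""
    digits = [int(str(abs(a)).lstrip("0.")[0]) for a in amounts if a]
    n = len(digits)
    observed = Counter(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        stat += (observed[d] - expected) ** 2 / expected
    return stat
```

Genuine accounting amounts spanning several orders of magnitude tend to score low on such a statistic, while fabricated round figures tend to score high, which is what makes the screen useful as a first pass rather than as proof of fraud.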
Statistical tools are also applied in some fraud prevention methods. For example, so-called biometric methods of fraud detection are slowly becoming more widespread. These include computerized fingerprint and retinal identification, and also face recognition (although this has received most publicity in the context of recognizing football hooligans).

In many of the applications we have discussed, speed of processing is of the essence. This is particularly the case in transaction processing, especially with telecom and intrusion data, where vast numbers of records are processed every day, but also applies in credit card, banking and retail sectors.

A key issue in all of this work is how effective the statistical tools are in detecting fraud and a fundamental problem is that one typically does not know how many fraudulent cases slip through the net. In applications such as banking fraud and telecom fraud, where



speed of detection matters, measures such as average time to detection after fraud starts (in minutes, numbers of transactions, etc.) should also be reported. Measures of this aspect interact with measures of final detection rate: in many situations an account, telephone and so forth, will have to be used for several fraudulent transactions before it is detected as fraudulent, so that several false negative classifications will necessarily be made.

An appropriate overall strategy is to use a graded

    system of investigation. Accounts with very highsuspicion scores merit immediate and intensive (andexpensive) investigation, while those with large butless dramatic scores merit closer (but not expensive)observation. Once again, it is a matter of choosing asuitable compromise.

Finally, it is worth repeating the conclusions reached by Schonlau et al. (2001), in the context of statistical tools for computer intrusion detection: "statistical methods can detect intrusions, even in difficult circumstances," but also "many challenges and opportunities for statistics and statisticians remain." We believe this positive conclusion holds more generally. Fraud detection is an important area, one in many ways ideal for the application of statistical and data analytic tools, and one where statisticians can make a very substantial and important contribution.

ACKNOWLEDGMENT

The work of Richard Bolton was supported by a ROPA award from the Engineering and Physical Sciences Research Council of the United Kingdom.

REFERENCES

ALESKEROV, E., FREISLEBEN, B. and RAO, B. (1997). CARDWATCH: A neural network based database mining system for credit card fraud detection. In Computational Intelligence for Financial Engineering. Proceedings of the IEEE/IAFE 220-226. IEEE, Piscataway, NJ.
ALLEN, T. (2000). A day in the life of a Medicaid fraud statistician. Stats 29 20-22.
ANDERSON, D., FRIVOLD, T. and VALDES, A. (1995). Next-generation intrusion detection expert system (NIDES): A summary. Technical Report SRI-CSL-95-07, Computer Science Laboratory, SRI International, Menlo Park, CA.

ANDREWS, P. P. and PETERSON, M. B., eds. (1990). Criminal Intelligence Analysis. Palmer Enterprises, Loomis, CA.

ARTIS, M., AYUSO, M. and GUILLÉN, M. (1999). Modelling different types of automobile insurance fraud behaviour in the Spanish market. Insurance Mathematics and Economics 24 67-81.

BARAO, M. I. and TAWN, J. A. (1999). Extremal analysis of short series with outliers: Sea-levels and athletics records. Appl. Statist. 48 469-487.
BLUNT, G. and HAND, D. J. (2000). The UK credit card market. Technical report, Dept. Mathematics, Imperial College, London.
BOLTON, R. J. and HAND, D. J. (2001). Unsupervised profiling methods for fraud detection. In Conference on Credit Scoring and Credit Control 7, Edinburgh, UK, 5-7 Sept.
BRAUSE, R., LANGSDORF, T. and HEPP, M. (1999). Neural data mining for credit card fraud detection. In Proceedings of the 11th IEEE International Conference on Tools with Artificial Intelligence 103-106. IEEE Computer Society Press, Silver Spring, MD.

BREIMAN, L., FRIEDMAN, J. H., OLSHEN, R. A. and STONE, C. J. (1984). Classification and Regression Trees. Wadsworth, Belmont, CA.

BROCKETT, P. L., XIA, X. and DERRIG, R. A. (1998). Using Kohonen's self-organising feature map to uncover automobile bodily injury claims fraud. The Journal of Risk and Insurance 65 245-274.
BURGE, P. and SHAWE-TAYLOR, J. (1997). Detecting cellular fraud using adaptive prototypes. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management 9-13. AAAI Press, Menlo Park, CA.

BUYSE, M., GEORGE, S. L., EVANS, S., GELLER, N. L., RANSTAM, J., SCHERRER, B., LESAFFRE, E., MURRAY, G., EDLER, L., HUTTON, J., COLTON, T., LACHENBRUCH, P. and VERMA, B. L. (1999). The role of biostatistics in the prevention, detection and treatment of fraud in clinical trials. Statistics in Medicine 18 3435-3451.

CAHILL, M. H., LAMBERT, D., PINHEIRO, J. C. and SUN, D. X. (2002). Detecting fraud in the real world. In Handbook of Massive Datasets (J. Abello, P. M. Pardalos and M. G. C. Resende, eds.). Kluwer, Dordrecht.
CHAN, P. K., FAN, W., PRODROMIDIS, A. L. and STOLFO, S. J. (1999). Distributed data mining in credit card fraud detection. IEEE Intelligent Systems 14(6) 67-74.

CHAN, P. and STOLFO, S. (1998). Toward scalable learning with non-uniform class and cost distributions: A case study in credit card fraud detection. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining 164-168. AAAI Press, Menlo Park, CA.

CHARTIER, B. and SPILLANE, T. (2000). Money laundering detection with a neural network. In Business Applications of Neural Networks (P. J. G. Lisboa, A. Vellido and B. Edisbury, eds.) 159-172. World Scientific, Singapore.

CHHIKARA, R. S. and MCKEON, J. (1984). Linear discriminant analysis with misallocation in training samples. J. Amer. Statist. Assoc. 79 899-906.

CLARK, P. and NIBLETT, T. (1989). The CN2 induction algorithm. Machine Learning 3 261-285.
COHEN, W. (1995). Fast effective rule induction. In Proceedings of the 12th International Conference on Machine Learning 115-123. Morgan Kaufmann, Palo Alto, CA.
CORTES, C., FISHER, K., PREGIBON, D. and ROGERS, A.

(2000). Hancock: A language for extracting signatures from data streams. In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 9-17. ACM Press, New York.


CORTES, C. and PREGIBON, D. (1998). Giga-mining. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining 174-178. AAAI Press, Menlo Park, CA.

CORTES, C., PREGIBON, D. and VOLINSKY, C. (2001). Communities of interest. Lecture Notes in Comput. Sci. 2189 105-114.
COX, K. C., EICK, S. G. and WILLS, G. J. (1997). Visual data mining: Recognizing telephone calling fraud. Data Mining and Knowledge Discovery 1 225-231.

CSIDS (1999). Cisco secure intrusion detection system technical overview. Available at http://www.wheelgroup.com/warp/public/cc/cisco/mkt/security/nranger/tech/ntran_tc.htm.

DENNING, D. E. (1997). Cyberspace attacks and countermeasures. In Internet Besieged (D. E. Denning and P. J. Denning, eds.) 29-55. ACM Press, New York.
DORRONSORO, J. R., GINEL, F., SANCHEZ, C. and CRUZ, C. S. (1997). Neural fraud detection in credit card operations. IEEE Transactions on Neural Networks 8 827-834.
FANNING, K., COGGER, K. O. and SRIVASTAVA, R. (1995). Detection of management fraud: A neural network approach. International Journal of Intelligent Systems in Accounting, Finance and Management 4 113-126.
FAWCETT, T. and PROVOST, F. (1997a). Adaptive fraud detection. Data Mining and Knowledge Discovery 1 291-316.
FAWCETT, T. and PROVOST, F. (1997b). Combining data mining and machine learning for effective fraud detection. In AAAI

Workshop on AI Approaches to Fraud Detection and Risk Management 14-19. AAAI Press, Menlo Park, CA.

FAWCETT, T. and PROVOST, F. (1999). Activity monitoring: Noticing interesting changes in behavior. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 53-62. ACM Press, New York.

FORREST, S., HOFMEYR, S., SOMAYAJI, A. and LONGSTAFF, T. (1996). A sense of self for UNIX processes. In Proceedings of the 1996 IEEE Symposium on Security and Privacy 120-128. IEEE Computer Society Press, Silver Spring, MD.

GHOSH, S. and REILLY, D. L. (1994). Credit card fraud detection with a neural network. In Proceedings of the 27th Hawaii International Conference on System Sciences (J. F. Nunamaker and R. H. Sprague, eds.) 3 621-630. IEEE Computer Society Press, Los Alamitos, CA.
GLASGOW, B. (1997). Risk and fraud in the insurance industry. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management 20-21. AAAI Press, Menlo Park, CA.
GOLDBERG, H. and SENATOR, T. E. (1995). Restructuring databases for knowledge discovery by consolidation and link formation. In Proceedings of the First International Conference on Knowledge Discovery and Data Mining 136-141. AAAI Press, Menlo Park, CA.
GOLDBERG, H. and SENATOR, T. E. (1997). Break detection

systems. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management 22-28. AAAI Press, Menlo Park, CA.

GOLDBERG, H. and SENATOR, T. E. (1998). The FinCEN AI system: Finding financial crimes in a large database of cash transactions. In Agent Technology: Foundations, Applications, and Markets (N. Jennings and M. Wooldridge, eds.) 283-302. Springer, Berlin.
GREEN, B. P. and CHOI, J. H. (1997). Assessing the risk of management fraud through neural network technology. Auditing 16 14-28.
HAND, D. J. (1981). Discrimination and Classification. Wiley, Chichester.

    HAND,D. J. (1997). Constructionand Assessmentof Classifica-tion Rules.Wiley,Chichester.HAND,D. J. andBLUNT,G. (2001). Prospecting orgems increditcard data. IMAJournalof ManagementMathematics12 173-200.HAND, D. J., BLUNT, G., KELLY,M. G. and ADAMS, N. M.(2000). Data mining for fun and profit (with discussion).Statist. Sci. 15 111-131.HAND, D. J. and HENLEY,W. E. (1997). Statistical classificationmethods n consumercredit scoring:A review.J. Roy.Statist.

    Soc. Ser A 160 523-541.HASSIBI,K. (2000). Detecting payment card fraud with neuralnetworks. In Business Applications of Neural Networks(P.J. G. Lisboa,A. Vellido andB. Edisbury, ds.). WorldSci-entific,Singapore.HE, H., GRACO,W. and YAO,X. (1999). Applicationof geneticalgorithmand k-nearestneighbourmethodin medical frauddetection.LectureNotes in Comput.Sci. 1585 74-81. Springer,Berlin.HE, H. X., WANG, J. C., GRACO, W. and HAWKINS, S. (1997).Applicationof neuralnetworksto detectionof medicalfraud.ExpertSystemswithApplications13 329-336.HILL,T. P. (1995). A statisticalderivationof the significant-digitlaw.Statist. Sci. 10 354-363.HYNNINEN,. (2000). Experiences n mobile phone fraud.Semi-naron NetworkSecurity.ReportTik-110.501, Helsinki Univ.Technology.JENKINS,P. (2000). Getting smart with fraudsters.FinancialTimes,September23.JENSEN,D. (1997). Prospectiveassessment of AI technologiesfor fraud detection: a case study. In AAAIWorkshopon AIApproaches o Fraud Detectionand RiskManagement34-38.AAAI Press,MenloPark,CA.Ju, W.-H. and VARDI,Y. (2001). A hybrid high-orderMarkovchain model for computer intrusion detection. J. Comput.Graph.Statist. 10 277-295.KIRKLAND,J. D., SENATOR,T. E., HAYDEN,J. J., DYBALA, T.,

    GOLDBERG,H. G. and SHYR, P. (1998). The NASD regula-tion advanceddetectionsystem (ADS). In Proceedingsof the15thNational Conferenceon ArtificialIntelligence(AAAI-98)and of the 10th Conferenceon InnovativeApplicationsof Ar-tificial Intelligence(IAAI-98)1055-1062. AAAI Press,MenloPark,CA.KOSORESOW,A. P. and HOFMEYR, S. A. (1997). Intrusiondetection via systemcall traces. IEEESoftware14 35-42.KUMAR, S. and SPAFFORD,E. (1994). A pattern matching modelfor misuse intrusion detection. In Proceedings of the 17thNational ComputerSecurityConference11-21.LACHENBRUCH,P. A. (1966). Discriminant analysis when theinitialsamplesaremisclassified.Technometrics 657-662.


LACHENBRUCH, P. A. (1974). Discriminant analysis when the initial samples are misclassified. II: Non-random misclassification models. Technometrics 16 419-424.
LANE, T. and BRODLEY, C. E. (1998). Temporal sequence learning and data reduction for anomaly detection. In Proceedings of the 5th ACM Conference on Computer and Communications Security (CCS-98) 150-158. ACM Press, New York.
LEE, W. and STOLFO, S. (1998). Data mining approaches for intrusion detection. In Proceedings of the 7th USENIX Security Symposium, San Antonio, TX 79-93. USENIX Association, Berkeley, CA.
LEONARD, K. J. (1993). Detecting credit card fraud using expert systems. Computers and Industrial Engineering 25 103-106.
LIPPMANN, R., FRIED, D., GRAF, I., HAINES, J., KENDALL, K., MCCLUNG, D., WEBER, D., WEBSTER, S., WYSCHOGROD, D., CUNNINGHAM, R. and ZISSMAN, M. (2000). Evaluating intrusion detection systems: The 1998 DARPA off-line intrusion-detection evaluation. Unpublished manuscript, MIT Lincoln Laboratory.
MAJOR, J. A. and RIEDINGER, D. R. (1992). EFD: A hybrid knowledge/statistical-based system for the detection of fraud. International Journal of Intelligent Systems 7 687-703.
MARCHETTE, D. J. (2001). Computer Intrusion Detection and Network Monitoring: A Statistical Viewpoint. Springer, New York.
MCCARTHY, J. (2000). Phenomenal data mining. Comm. ACM 43 75-79.
MCLACHLAN, G. J. (1992). Discriminant Analysis and Statistical Pattern Recognition. Wiley, New York.
MOBILE EUROPE (2000). New IP world, new dangers. Mobile Europe, March.
MOREAU, Y., PRENEEL, B., BURGE, P., SHAWE-TAYLOR, J., STOERMANN, C. and COOKE, C. (1996). Novel techniques for fraud detection in mobile communications. In ACTS Mobile Summit, Grenada.
MOREAU, Y., VERRELST, H. and VANDEWALLE, J. (1997). Detection of mobile phone fraud using supervised neural networks: A first prototype. In Proceedings of the 7th International Conference on Artificial Neural Networks (ICANN'97) 1065-1070. Springer, Berlin.
MURAD, U. and PINKAS, G. (1999). Unsupervised profiling for identifying superimposed fraud. In Principles of Data Mining and Knowledge Discovery. Lecture Notes in Artificial Intelligence 1704 251-261. Springer, Berlin.
NEURAL TECHNOLOGIES (2000). Reducing telecoms fraud and churn. Report, Neural Technologies, Ltd., Petersfield, U.K.
NIGRINI, M. J. (1999). I've got your number. Journal of Accountancy May 79-83.
NIGRINI, M. J. and MITTERMAIER, L. J. (1997). The use of Benford's law as an aid in analytical procedures. Auditing: A Journal of Practice and Theory 16 52-67.
NORTEL (2000). Nortel Networks fraud solutions. Fraud Primer, Issue 2.0. Nortel Networks Corporation.
PAK, S. J. and ZDANOWICZ, J. S. (1994). A statistical analysis of the U.S. Merchandise Trade Database and its uses in transfer pricing compliance and enforcement. Tax Management, May 11.
PATIENT, S. (2000). Reducing online credit card fraud. Web Developer's Journal. Available at http://www.webdevelopersjournal.com/articles/card_fraud.html
PRESS, S. J. and TANUR, J. M. (2001). The Subjectivity of Scientists and the Bayesian Approach. Wiley, New York.
PROVOST, F. and FAWCETT, T. (2001). Robust classification for imprecise environments. Machine Learning 42 203-210.
QU, D., VETTER, B. M., WANG, F., NARAYAN, R., WU, S. F., HOU, Y. F., GONG, F. and SARGOR, C. (1998). Statistical anomaly detection for link-state routing protocols. In Proceedings of the Sixth International Conference on Network Protocols 62-70. IEEE Computer Society Press, Los Alamitos, CA.
QUINLAN, J. R. (1990). Learning logical definitions from relations. Machine Learning 5 239-266.
QUINLAN, J. R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA.
RIPLEY, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge Univ. Press.
ROBINSON, M. E. and TAWN, J. A. (1995). Statistics for exceptional athletics records. Appl. Statist. 44 499-511.
ROSSET, S., MURAD, U., NEUMANN, E., IDAN, Y. and PINKAS, G. (1999). Discovery of fraud rules for telecommunications: Challenges and solutions. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 409-413. ACM Press, New York.
RYAN, J., LIN, M. and MIIKKULAINEN, R. (1997). Intrusion detection with neural networks. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management 72-79. AAAI Press, Menlo Park, CA.
SCHONLAU, M., DUMOUCHEL, W., JU, W.-H., KARR, A. F., THEUS, M. and VARDI, Y. (2001). Computer intrusion: Detecting masquerades. Statist. Sci. 16 58-74.
SENATOR, T. E. (2000). Ongoing management and application of discovered knowledge in a large regulatory organization: A case study of the use and impact of NASD Regulation's advanced detection system (ADS). In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 44-53. ACM Press, New York.
SENATOR, T. E., GOLDBERG, H. G., WOOTON, J., COTTINI, M. A., UMAR KHAN, A. F., KLINGER, C. D., LLAMAS, W. M., MARRONE, M. P. and WONG, R. W. H. (1995). The financial crimes enforcement network AI system (FAIS): Identifying potential money laundering from reports of large cash transactions. AI Magazine 16 21-39.
SHAWE-TAYLOR, J., HOWKER, K., GOSSET, P., HYLAND, M., VERRELST, H., MOREAU, Y., STOERMANN, C. and BURGE, P. (2000). Novel techniques for profiling and fraud detection in mobile telecommunications. In Business Applications of Neural Networks (P. J. G. Lisboa, A. Vellido and B. Edisbury, eds.) 113-139. World Scientific, Singapore.
SHIEH, S.-P. W. and GLIGOR, V. D. (1991). A pattern-oriented intrusion-detection model and its applications. In Proceedings of the 1991 IEEE Computer Society Symposium on Research in Security and Privacy 327-342. IEEE Computer Society Press, Silver Spring, MD.
SHIEH, S.-P. W. and GLIGOR, V. D. (1997). On a pattern-oriented model for intrusion detection. IEEE Transactions on Knowledge and Data Engineering 9 661-667.


SMITH, R. L. (1997). Comment on "Statistics for exceptional athletics records," by M. E. Robinson and J. A. Tawn. Appl. Statist. 46 123-128.
STOLFO, S. J., FAN, D. W., LEE, W., PRODROMIDIS, A. L. and CHAN, P. K. (1997a). Credit card fraud detection using meta-learning: Issues and initial results. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management 83-90. AAAI Press, Menlo Park, CA.
STOLFO, S., FAN, W., LEE, W., PRODROMIDIS, A. L. and CHAN, P. (1999). Cost-based modeling for fraud and intrusion detection: Results from the JAM Project. In Proceedings of the DARPA Information Survivability Conference and Exposition 2 130-144. IEEE Computer Press, New York.
STOLFO, S. J., PRODROMIDIS, A. L., TSELEPIS, S., LEE, W., FAN, D. W. and CHAN, P. K. (1997b). JAM: Java agents for meta-learning over distributed databases. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management 91-98. AAAI Press, Menlo Park, CA.


TANIGUCHI, M., HAFT, M., HOLLMEN, J. and TRESP, V. (1998). Fraud detection in communication networks using neural and probabilistic methods. In Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'98) 2 1241-1244. IEEE Computer Society Press, Silver Spring, MD.
U.S. CONGRESS (1995). Information technologies for the control of money laundering. Office of Technology Assessment, Report OTA-ITC-630, U.S. Government Printing Office, Washington, DC.
WASSERMAN, S. and FAUST, K. (1994). Social Network Analysis: Methods and Applications. Cambridge Univ. Press.
WEBB, A. R. (1999). Statistical Pattern Recognition. Arnold, London.
WHEELER, R. and AITKEN, S. (2000). Multiple algorithms for fraud detection. Knowledge-Based Systems 13(2/3) 93-99.


Comment

Foster Provost

    The state of research on fraud detection recalls JohnGodfrey Saxe's 19th-century poem "The Blind Menand the Elephant" (Felleman, 1936, page 521). Basedon a Hindu fable, each blind man experiences only apart of the elephant, which shapes his opinion of thenature of the elephant: the leg makes it seem like atree, the tail a rope, the trunk a snake and so on. In fact,"... though each was partly in the right... all were inthe wrong." Saxe's poem was a criticism of theologicaldebates, and I do not intend such a harsh criticismof research on fraud detection. However, because theproblem is so complex, each research project takesa particular angle of attack, which often obscuresthe view of other parts of the problem. So, someresearchers see the problem as one of classification,others of temporal pattern discovery; to some it isa problem perfect for a hidden Markov model andso on.

So why is fraud detection not simply classification, or a member of some other already well-understood problem class? Bolton and Hand outline several characteristics of fraud detection problems that differentiate them [as did Tom Fawcett and I in our review of the problems and techniques of fraud detection (Fawcett and Provost, 2002)]. Consider fraud detection as a classification problem. Fraud detection certainly must be "cost-sensitive": rather than minimizing error rate, some other loss function must be minimized. In addition, usually the marginal class distribution is skewed strongly toward one class (legitimate behavior). Therefore, modeling for fraud detection at least is a difficult problem of estimating class membership probability, rather than simple classification. However, this still is an unsatisfying attempt to transform the true problem into one for which we have existing tools (practical and conceptual). The objective function for fraud detection systems actually is much more complicated. For example, the value of detection is a function of time. Immediate detection is much more valuable than delayed detection. Unfortunately, evidence builds up over time, so detection is easier the longer it is delayed. In cases of self-revealing fraud, detection eventually is trivial (e.g., a defrauded customer calls to complain about fraudulent transactions on his or her bill).

In most research on modeling for fraud detection, a subproblem is extracted (e.g., classifying transactions or accounts as being fraudulent) and techniques are compared for solving this subproblem, without moving on to compare the techniques for the greater problem of detecting fraud. Each particular subproblem naturally will abstract away those parts that are

Foster Provost is Associate Professor, Leonard N. Stern School of Business, New York University, New York, New York 10012 (e-mail: [email protected]).

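As a minimal sketch of the cost-sensitive point, suppose class-membership probabilities are available, a false alarm costs c_FP and a missed fraud costs c_FN (both figures below are hypothetical, chosen only for illustration). Minimizing expected cost then means flagging a case whenever its fraud probability exceeds c_FP / (c_FP + c_FN), a threshold far below the 0.5 of plain classification when misses are much costlier than false alarms:

```python
# Sketch: cost-sensitive thresholding of estimated fraud probabilities.
# All cost values and scores are hypothetical illustrations.

C_FP = 10.0   # cost of investigating a legitimate case (false alarm)
C_FN = 500.0  # cost of an undetected fraud (missed detection)

# Flag when p * C_FN > (1 - p) * C_FP, i.e., p > C_FP / (C_FP + C_FN).
threshold = C_FP / (C_FP + C_FN)

def flag(p_fraud):
    """Flag a case for investigation if its estimated fraud
    probability exceeds the cost-minimizing threshold."""
    return p_fraud > threshold

# With these costs the threshold is about 0.0196: even cases that are
# probably legitimate are worth a look, reflecting the skewed class
# distribution and the asymmetric losses.
scores = [0.001, 0.01, 0.05, 0.6]
print(threshold, [flag(p) for p in scores])
```

The point of the sketch is only that the decision rule is driven by the cost ratio, not the error rate; it says nothing about the harder parts of the problem (the time value of detection, or how the probabilities are estimated in the first place).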


