Questionnaire-Responded Transaction Approach with SVM for Credit Card Fraud Detection

Post on 08-Jan-2016


description

Questionnaire-Responded Transaction Approach with SVM for Credit Card Fraud Detection. Adviser: Tung-Shou Chen (陳同孝), Rong-Chang Chen (陳榮昌), Keh-Chien Ma (馬克艱). Graduate: Bo-Yang Chen (陳柏仰). Student: Chun-Wei Chen (陳峻偉), Yu-Ru Yang (楊玉汝)

transcript

Questionnaire-Responded Transaction Approach with SVM for Credit Card Fraud Detection

Adviser: Tung-Shou Chen (陳同孝), Rong-Chang Chen (陳榮昌), Keh-Chien Ma (馬克艱)

Graduate: Bo-Yang Chen (陳柏仰)

Student: Chun-Wei Chen (陳峻偉), Yu-Ru Yang (楊玉汝), Chih-Ru Lin (林志儒), Chun-Bo Tsai (蔡俊伯), Biing-Hsiu Li (李秉修)

Outline

Introduction
Flow chart
Questionnaire-Responded Transaction (QRT)
Support Vector Machines (SVMs)
Results and discussion

Introduction

Preventing credit card fraud has long been one of the most important issues.

A good solution is to collect personal transaction data through an online questionnaire system.

Flow chart

[Figure: flow chart. Boxes: QRT data; Data analysis; Training data; Train data and build up a personalized QRT model by SVM; Predict a new transaction to be abnormal or not (Yes / No); Abnormal transaction; Normal transaction.]

Questionnaire-responded transaction

mySVM can be used for pattern recognition, regression and classification.

Support Vector Machines (1/5)

[Figure: non-linear classification, plotted on x and y axes.]

Support Vector Machines (2/5)

Support Vector Machines (3/5)

[Figure: scatter of positive (+) and negative (_) samples, with regions labeled "True positive" and "True negative".]

QRT Format (4/5)

Group  Gender  Item  Cost  Time  Y/N
0      1       5     1     3     1
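As a rough illustration of how one questionnaire row could be turned into a training sample, the sketch below reads the six fields in the order given on the slide and splits them into a feature vector and a label. The names `QRTRecord`, `parse_qrt_row` and `to_sample`, and the assumption that every field is an integer code, are hypothetical; the slides do not give the coding scheme, the meaning of each Y/N value, or the exact mySVM input layout.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class QRTRecord:
    """One questionnaire-responded transaction (field names follow the slide)."""
    group: int
    gender: int
    item: int
    cost: int
    time: int
    answer: int  # the Y/N column; its polarity is not stated in the slides


def parse_qrt_row(row: str) -> QRTRecord:
    """Parse a whitespace-separated row: Group Gender Item Cost Time Y/N."""
    fields = [int(tok) for tok in row.split()]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    return QRTRecord(*fields)


def to_sample(rec: QRTRecord) -> Tuple[List[int], int]:
    """Split a record into (features, label) for a classifier."""
    return [rec.group, rec.gender, rec.item, rec.cost, rec.time], rec.answer


# The example row from the slide.
features, label = to_sample(parse_qrt_row("0 1 5 1 3 1"))
```

Keeping the Y/N answer separate from the five descriptive fields mirrors the usual split between inputs and target when training a classifier.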

mySVM Format (5/5)

Results and discussion (1/7)

This trend is favorable, since we can obtain a high TN rate by increasing Rn.

On the other hand, the TP rate decreases as Rn increases.
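To make the two rates concrete, here is a minimal sketch of how a TP rate and a TN rate can be computed from true and predicted labels. It assumes labels are coded as +1 (positive) and -1 (negative); the function name `tp_tn_rates` and the sample labels are hypothetical and are not taken from the slides.

```python
from typing import Sequence, Tuple


def tp_tn_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (TP rate, TN rate) for labels coded +1 / -1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = sum(1 for t in y_true if t == -1)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    return tp / n_pos, tn / n_neg


# Made-up labels: 4 positives and 2 negatives.
y_true = [1, 1, 1, 1, -1, -1]
y_pred = [1, 1, 1, -1, -1, 1]
tp_rate, tn_rate = tp_tn_rates(y_true, y_pred)
```

With a skewed class ratio, overall accuracy can look good while one of these rates is poor, which is why it helps to report them separately.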

Results and discussion (2/7)

[Figure: accuracy (50% to 100%) versus Rn (0.3 to 0.7). Series: TN, TP and AVG for N = 140, N = 200 and N = 260.]

Results and discussion (3/7)

[Figure: TN rate (20% to 100%) versus number of data, N (100 to 260). Series: Rn = 0.3, 0.4, 0.5, 0.6 and 0.7.]

Results and discussion (4/7)

When Rn is low, it is difficult to obtain a high TN rate.

Over-sampling replicates the negative data, which form the minority class.

Alternatively, we can improve the TN rate by adding new negative data.
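The replication idea can be sketched in a few lines. This is only an illustration under stated assumptions: samples are `(features, label)` pairs, negatives are coded as -1, and the helper name `oversample_negatives`, its `times` parameter and the toy data are hypothetical rather than taken from the slides.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[List[int], int]  # (features, label)


def oversample_negatives(data: Sequence[Sample], times: int, seed: int = 0) -> List[Sample]:
    """Return the data plus `times` extra copies of every negative (-1) sample, shuffled."""
    negatives = [s for s in data if s[1] == -1]
    augmented = list(data) + negatives * times
    random.Random(seed).shuffle(augmented)  # avoid long runs of identical samples
    return augmented


# Toy data: two positives and one negative.
data = [([0, 1, 5, 1, 3], 1), ([1, 0, 2, 4, 1], 1), ([0, 0, 3, 5, 2], -1)]
balanced = oversample_negatives(data, times=1)
```

Replication adds no new information, only weight for the minority class, whereas adding genuinely new negative samples also widens what the model sees.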

Results and discussion (5/7)

[Figure: TN rate (20% to 80%) versus Times (0 to 10). Series: adding_1:1:1:1:1:1, adding_random, adding_5:4:3:2:2:2 and replicating_5:4:3:2:2:2.]

Hierarchical SVMs (6/7)

[Figure: SVM 1, SVM 2, ..., SVM k feed an Aggregation step, which outputs Positive or Negative.]

Results and discussion (7/7)

Method                                          TN rate  Average F(X)  Fraud Cost  Cost Saved
Base case                                       0.48     -             100 %       0
Majority Voting (more than 4 negatives)         0.74     -3.32         28.9 %      71.09 %
More than 3 negatives                           0.82     -2.35         21.5 %      78.5 %
Weighting by TN rate & Voting                   0.74     -2.35         28.9 %      71.09 %
Weighting by 6-fold & 10-fold cross validation  0.74     -1.93         28.9 %      71.09 %
Weighting by leave-one-out                      0.74     -3.07         28.9 %      71.09 %
Hierarchical SVMs                               0.74     -2.89         28.9 %      71.09 %
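The two voting rules in the table ("more than 4 negatives" and "more than 3 negatives") can be sketched as a simple threshold on the number of classifiers that predict negative. The function name `vote_negative`, the +1 / -1 label coding, and the choice of eight classifiers are assumptions for illustration; the slides do not state the value of k or how the individual SVM outputs are encoded.

```python
from typing import Sequence


def vote_negative(predictions: Sequence[int], threshold: int) -> int:
    """Return -1 if more than `threshold` classifiers predict -1, otherwise +1."""
    n_negative = sum(1 for p in predictions if p == -1)
    return -1 if n_negative > threshold else 1


# Hypothetical outputs of eight SVMs for one transaction (four vote negative).
predictions = [-1, -1, 1, -1, 1, -1, 1, 1]
strict = vote_negative(predictions, threshold=4)   # rule: more than 4 negatives
lenient = vote_negative(predictions, threshold=3)  # rule: more than 3 negatives
```

Here the stricter rule labels the transaction positive and the more lenient rule labels it negative. Lowering the threshold flags more transactions as negative, which is consistent with the higher TN rate the table reports for the "more than 3 negatives" row.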