
Statistical Analysis of Identity Risk of Exposure and Cost ... · to better understand the...

Chia-Ju Chen, Razieh Nokhbeh Zaeem, K. Suzanne Barber. April 2019. UTCID Report #19-04. Copyright © 2019 The University of Texas. Confidential and Proprietary, All Rights Reserved. Statistical Analysis of Identity Risk of Exposure and Cost Using the Ecosystem of Identity Attributes



Statistical Analysis of Identity Risk of Exposure and Cost Using the Ecosystem of Identity Attributes

Abstract: Personally Identifiable Information (PII) is often called the “currency of Internet” as identity assets are collected, shared, sold, and used for almost every transaction on the Internet. PII is used for all types of applications, from access control to credit score calculations to targeted advertising. Every market sector relies on PII to know and authenticate its customers and employees. With so many businesses and government agencies relying on PII to make important decisions and so many people being asked to share personal data, it is critical to better understand the fundamentals of identity in order to protect it and use it responsibly. The previously developed UT CID Identity Ecosystem uses graphs to model PII assets and their relationships, and it is populated with empirical data from almost 6,000 real-world identity theft and fraud news reports. We obtained the UT CID Identity Ecosystem from its authors and analyzed it using graph theory. We report numerous novel statistics on identity asset content, structure, value, accessibility, and impact. Our work sheds light on how identity is used and paves the way for improving identity.

Index Terms: security and privacy, identity theft, graph theory, social network measures, visualization

I. INTRODUCTION

Personally identifiable information (PII) is any data that could potentially be used to recognize a particular person, and it is commonly used in both physical and cyber spaces to perform personal authentication. Identity theft is the fraudulent acquisition and use of a person’s PII without permission. A modern authentication process usually requires the collection of PII, which increases the risk of exposure to identity theft and fraud.

In 2017, the number of identity fraud victims increased by 8%, rising to 16.7 million U.S. consumers. Fraudsters stole from 1.3 million more victims in 2017, taking a total of $16.8 billion from U.S. consumers [1]. A more intelligent and comprehensive approach is needed to thwart the crime of identity theft.

To model the identity ecosystem, an intuitive approach is to analyze its components from both the cyber and the physical side. Modern society seamlessly merges online and offline PII attributes. Examples of online attributes are one’s social media accounts, online shopping patterns, passwords, and email accounts. Offline attributes are those related to the physical world, such as bank accounts, credit and debit cards, Social Security Number, and one’s physical characteristics.

The UT CID Identity Ecosystem, developed at the Center for Identity (CID) at the University of Texas (UT) at Austin, is a graph-based model of people, devices, and organizations [2]. It models the relationships among PII attributes as a Bayesian network and performs inference to identify possible sources of a breach and the cost incurred if a source is compromised. It provides a framework for understanding the value, risk, and mutual relationships of pairs of PII attributes. Each vertex represents an attribute, and each edge represents a relationship between two attributes.

The data source for the Ecosystem is the Identity Threat Assessment and Prediction (ITAP) project [3]. ITAP gathers identity theft information from news stories, structures this information, analyzes it, and discovers trends and characteristics.

We obtained the UT CID Identity Ecosystem and the ITAP data source from their authors. Based on this graph-based network of identity, we designed and implemented a visualization framework that facilitates understanding of the whole risk network, as opposed to reviewing unstructured raw news feed data from ITAP. We introduce three main statistical evaluation criteria: (1) Traditional pie, bar, and scatter plots of vertex-specific or edge-specific values show their distribution. (2) Centrality measures such as degree, closeness, and betweenness centrality characterize each PII by its structural features. (3) Strongly Connected Components (SCC) distinguish groups of PII that are interconnected. With these criteria, our visualization framework can profile the identity system in detail, for example by highlighting the PII that spread information most efficiently if breached, or the PII that form a group that is easily traversed once one member is compromised. The main contributions of this paper are the visual representation of the Ecosystem and the three criteria for modeling the identity ecosystem.

The remainder of this article is structured as follows. Section II elaborates on the importance of statistical analysis of the Ecosystem tool and the set of measurements to be included. Section III presents a comprehensive evaluation and takeaways from the results. Section IV covers related work on the identity ecosystem, identity theft, and related government reports. Section V concludes the research and gives insights for future work.

II. STATISTICS BASED ON ECOSYSTEM

The Identity Ecosystem is a valuable, previously implemented tool that models the relationships among identity attributes, analyzes identity fraud and breaches, and answers several questions about identity risk management [2]. It maps identity attributes into a probabilistic model and performs Bayesian network-based inference to determine the posterior effects on each attribute. The Identity Ecosystem models individual identity attributes as nodes, whereas the edges between them indicate various types of connections.

Each vertex has several properties, such as type of node, risk of exposure, and intrinsic monetary value. The Ecosystem Graphical User Interface (GUI) can color and size nodes based on their properties independently. Figure 1 shows an example snapshot in which the nodes are colored based on their risk of exposure and sized based on their liability value.

Fig. 1: Background: A snapshot showing the previously developed UT CID Ecosystem attribute graph.

The research questions we seek to answer in this paper are: “In a graph-based network of identity, what are the underlying characteristics? Do PII form groups or clusters? If so, which PII nodes are isolated and which are connected?” We also want to answer: “Inside the network, which PII is most often on the critical path of obtaining others, and which PII can influence the acquisition of other PII?” We further consider questions such as: “In the PII graph, where is a given PII located? Is it connected to many dangerous neighboring PII? Or is it placed on the boundary of a cluster?”

We focus on three statistical indices on the given data set: (1) bar, pie, and distribution charts based on node or edge values; (2) centrality measurements, including node-specific in-degree and out-degree centrality, betweenness centrality, and closeness centrality; and (3) strongly connected components of nodes for identifying clusters. Based on the results, we discuss which breaches involve the more important attributes and how personal information flows inside the network, which models real-world information movement.

III. STATISTICAL CHARTS

In this section, we present the mathematical formulas and statistical chart visualizations. The data source is the ITAP data in the Ecosystem, which contains 627 PII attributes in total. We divide the analyses into edge-specific and node-specific properties. We represent the Identity Ecosystem as a graph $G(V, E)$ consisting of $N$ attributes $A_1, \ldots, A_N$ and a set of directed edges given as tuples $e_{ij} = \langle i, j \rangle$, where $A_i$ is the originating node and $A_j$ is the target node, such that $1 \le i, j \le N$. Each node $A_j$ is labeled with a Boolean random variable, denoted $D(A_j)$, which is true if the attribute has been exposed and false otherwise. Each edge $e_{ij}$ represents a possible path by which $A_j$ can be breached given that $A_i$ is breached. For simplicity, we consider all edges to be independent. Therefore, we can assign a conditional probability to each edge with $\mathrm{Prob}(e_{ij}) = \mathrm{Prob}(D(A_j) \mid D(A_i))$.

A. Statistical charts based on edges

We use pie charts to show the percentage of PII with and without outgoing edges and with and without incoming edges, as shown in Figure 2. We observe that 211 PII attributes (33%) have incoming edges, while 45 (7%) have outgoing edges.
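The counts behind such pie charts can be sketched in a few lines. The following is a minimal illustration and not the Ecosystem's own code: the attribute names, the edge probabilities, and the helper `edge_coverage` are all made up for the example, and the graph is assumed to be stored as a dictionary keyed by (source, target) pairs.

```python
# Toy directed graph: edges[(source, target)] = Prob(D(target) | D(source)).
# All names and probabilities below are hypothetical, chosen only for illustration.
edges = {
    ("Password", "EmailAddress"): 0.4,
    ("EmailAddress", "Name"): 0.6,
    ("CustomerDatabase", "Name"): 0.9,
    ("CustomerDatabase", "Password"): 0.5,
}
nodes = {"Password", "EmailAddress", "Name", "CustomerDatabase", "Signature", "Age"}


def edge_coverage(nodes, edges):
    """Return the share of nodes with outgoing edges, with incoming edges, and with no edges."""
    has_out = {src for src, _ in edges}
    has_in = {dst for _, dst in edges}
    isolated = set(nodes) - has_out - has_in
    n = len(nodes)
    return {
        "with_outgoing": len(has_out) / n,
        "with_incoming": len(has_in) / n,
        "isolated": len(isolated) / n,
    }


shares = edge_coverage(nodes, edges)
print(shares)
```

In this toy graph, half of the nodes have outgoing edges, half have incoming edges, and one third are isolated; the same function applied to the real 627-attribute graph would yield the shares plotted in the pie charts.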

Insight: Only 7% of PII have any effect on the risk of exposure of others, and a total of 33% could possibly be affected by others. The PII with outgoing edges should be carefully protected. The most important PII in this list are discussed shortly.

Fig. 2: A snapshot showing pie charts for the percentage of nodes with/without in/out degree.

Furthermore, taking the probability on the edges into account, we extend degree centrality from a discrete edge count to an accumulated risk. Degree centrality equals the number of links that a vertex has with other vertices. The equations for this measure are as follows:

$$C_D^{out}(v_i) = \mathrm{outdegree}(v_i) = |\{e_{ij}\}| \tag{1}$$

$$C_D^{in}(v_i) = \mathrm{indegree}(v_i) = |\{e_{ji}\}| \tag{2}$$

If we consider the weight (i.e., probability) on the edges, this yields the equations:

$$C_W^{out}(v_i) = \sum_{j} \mathrm{Prob}(e_{ij}) \tag{3}$$

$$C_W^{in}(v_i) = \sum_{j} \mathrm{Prob}(e_{ji}) \tag{4}$$
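Equations (1) to (4) can be computed in a single pass over the edge list. The sketch below assumes the same hypothetical dictionary representation of edges; the attribute names, probabilities, and the function name `degree_centralities` are illustrative assumptions and are not taken from the Ecosystem tool.

```python
from collections import defaultdict

# Hypothetical toy edges: (source, target) -> Prob(D(target) | D(source)).
edges = {
    ("Password", "EmailAddress"): 0.4,
    ("EmailAddress", "Name"): 0.6,
    ("CustomerDatabase", "Name"): 0.9,
    ("CustomerDatabase", "Password"): 0.5,
}


def degree_centralities(edges):
    """Compute Eq. (1)-(4): edge counts and probability sums per node."""
    out_count = defaultdict(int)     # Eq. (1): number of outgoing edges
    in_count = defaultdict(int)      # Eq. (2): number of incoming edges
    out_weight = defaultdict(float)  # Eq. (3): sum of outgoing edge probabilities
    in_weight = defaultdict(float)   # Eq. (4): sum of incoming edge probabilities
    for (src, dst), prob in edges.items():
        out_count[src] += 1
        in_count[dst] += 1
        out_weight[src] += prob
        in_weight[dst] += prob
    return out_count, in_count, out_weight, in_weight


out_count, in_count, out_weight, in_weight = degree_centralities(edges)

# Rank nodes by weighted out-degree, highest first (ties keep insertion order).
ranking = sorted(out_weight, key=out_weight.get, reverse=True)
print(ranking)
```

Sorting by `out_count` or `in_weight` instead gives the other rankings; nodes without any edge of the relevant direction simply do not appear in the corresponding dictionary.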


Figure 3 presents the top 10 PII in descending order of the number of incoming and outgoing edges. The three attributes with the highest number of incoming edges, i.e., those most easily discoverable from other PII, are Name, Credit Card Information, and Date of Birth. The three attributes with the highest number of outgoing edges, i.e., those most likely to reach a wide variety of PII, are Customer Database, Password, and Email Address. Figure 4 shows the same statistics for the top 10 PII with the most incoming and outgoing edges, with the difference that it considers the sum of the weights on the edges instead of merely the edge count.

Insight: Name has the highest rank among PII discoverable from others through incoming edges, and Customer Database sits at the top of the nodes with the highest outgoing degree, whether the edge count or the edge weight is considered.

Fig. 3: A snapshot showing the top 10 PII with the highest in-degree and out-degree counts.

B. Statistical charts based on nodes

• Distribution Chart based on node risk and value

We examine the distribution of the risk and value of each attribute to better understand the underlying trend for all properties. The chart is computed by fixing a linear interval size on the x-axis and counting the number of PII lying in each interval. Figure 5 shows a snapshot of the distribution chart for node value with an interval size of $100,000 (US dollars). According to the ITAP project [3], the loss value of a PII is determined by averaging over the identity theft cases in which that PII was breached as a source of entry. Since ITAP usually lacks the number of victims involved in a case, the loss value is not per victim. Figure 6 shows the result for node risk with an interval size of 0.001.
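The fixed-interval counting described above amounts to a simple histogram. The following sketch uses invented loss values and a hypothetical helper `fixed_interval_counts`; it assumes half-open intervals, so that a value exactly on a boundary falls into the higher interval.

```python
from collections import Counter


def fixed_interval_counts(values, interval):
    """Count how many values fall in each half-open interval [k*interval, (k+1)*interval)."""
    counts = Counter(int(v // interval) for v in values)
    # Key each bucket by its lower bound so the result reads like an x-axis.
    return {k * interval: counts[k] for k in sorted(counts)}


# Hypothetical loss values in US dollars; not taken from ITAP.
loss_values = [12_000, 85_000, 99_999, 100_000, 250_000, 1_300_000]
histogram = fixed_interval_counts(loss_values, 100_000)
print(histogram)
```

Empty intervals are omitted from the result. For the risk chart with an interval size of 0.001, the same function applies, although floating-point rounding may move a value that sits exactly on a boundary into the neighboring interval.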

Insight: The vast majority of PII are valued at less than $100,000 and also have a risk of exposure of less than 0.001.

Fig. 4: A snapshot showing the top 10 PII with the highest in and out probability sums on edges.

Fig. 5: Distribution chart based on node value with interval size of $100,000.

• Scatter plot of Closeness vs. Betweenness Centrality

Freeman [4] developed a set of measures for centrality based on betweenness. Later on, he proposed four core criteria, which developed into degree, closeness, betweenness, and eigenvector centrality [5]. We further leverage the concepts of closeness and betweenness centrality to investigate the Ecosystem graph.

Fig. 6: Distribution chart based on node risk with interval size of 0.001.

Closeness Centrality emphasizes how close a vertex is to all other vertices in the topology, that is, the distance of a vertex to all others in the network, by focusing on the geodesic measurement from each vertex to all others [5]. More specifically, it calculates the shortest paths between all nodes and assigns each node a score based on the lengths of its shortest paths to the other nodes. According to Yin et al. [6], closeness is an evaluation of “how long it will take information to spread from a given vertex to others in the network” (p. 1603), which helps find the PII attributes that are best placed to reach others once breached, and thus influence the entire network most efficiently. Consequently, closeness centrality in the identity ecosystem is a measure of Information Acquisition Power. The higher it is for a PII attribute, the more power that PII attribute has in exploiting the entire identity ecosystem. Such a PII attribute would need only a few others to discover the whole network. Also, according to Friedkin [7], closeness centrality represents independence, in the sense that PII attributes with higher closeness centrality do not need to seek information from other, more peripheral PII attributes. This yields the following equation, where $C_c(v_i)$ stands for the closeness centrality of vertex $v_i$ and $\alpha(i, j)$ is the length of the shortest path between the two vertices $v_i$ and $v_j$ (counted in number of edges, not edge weight):

$$C_c(v_i) = \sum_{j=1}^{n} \frac{1}{\alpha(i, j)} \tag{5}$$
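A small sketch may help make Equation (5) concrete. It reads $\alpha(i, j)$ as the hop distance of the shortest directed path and computes it with breadth-first search. Two conventions here are assumptions on our part, since the text does not state them: the term for $j = i$ is skipped, and unreachable vertices contribute zero. The graph and the function names are hypothetical.

```python
from collections import deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(adjacency, source):
    """Eq. (5): sum of reciprocal shortest-path lengths from source.

    Assumptions: the source itself is skipped, and unreachable nodes add nothing.
    """
    dist = shortest_path_lengths(adjacency, source)
    return sum(1 / d for node, d in dist.items() if node != source)


# Hypothetical toy graph: A reaches B and C in one hop and D in two hops.
adjacency = {"A": ["B", "C"], "B": ["C"], "C": ["D"], "D": []}
scores = {node: closeness(adjacency, node) for node in adjacency}
print(scores)
```

In this toy graph, A scores 1 + 1 + 1/2 = 2.5, while D, which has no outgoing edges, scores 0.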

Betweenness Centrality [5] serves as an alternative concept of centrality that focuses on control over the connections between other pairs of vertices. It identifies all the shortest paths and then counts how many times a node lies on one. Using $\alpha(i, j)$ here as the number of different shortest $\langle i, j \rangle$ paths, and $\alpha(i, u, j)$ as the number of those shortest paths that pass through $u$ ($u \neq i, j$), the equation is as follows:

$$C_B(u) = \sum_{i \neq j \neq u} \frac{\alpha(i, u, j)}{\alpha(i, j)} \tag{6}$$
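Equation (6) can be evaluated directly on a small graph. The sketch below is a brute-force illustration under stated assumptions and not an optimized algorithm: it counts shortest paths with breadth-first search and uses the fact that the number of shortest paths from i to j through u equals the number from i to u times the number from u to j whenever u lies on a shortest path. The graph, node names, and function names are hypothetical.

```python
from collections import deque


def bfs_counts(adjacency, source):
    """Return (dist, sigma): hop distance and number of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                # u is a predecessor of v on some shortest path.
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adjacency):
    """Eq. (6): for each u, sum over ordered pairs (i, j) of the shortest-path share through u."""
    nodes = set(adjacency) | {v for nbrs in adjacency.values() for v in nbrs}
    info = {s: bfs_counts(adjacency, s) for s in nodes}
    score = {u: 0.0 for u in nodes}
    for i in nodes:
        dist_i, sigma_i = info[i]
        for j in nodes:
            if j == i or j not in dist_i:
                continue  # skip identical or unreachable pairs
            for u in nodes:
                if u in (i, j) or u not in dist_i:
                    continue
                dist_u, sigma_u = info[u]
                # u is on a shortest i-j path iff the two legs add up to the full distance.
                if j in dist_u and dist_i[u] + dist_u[j] == dist_i[j]:
                    score[u] += sigma_i[u] * sigma_u[j] / sigma_i[j]
    return score


# Hypothetical toy graph: two parallel routes from A to D, then a single edge to E.
adjacency = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
scores = betweenness(adjacency)
print(scores)
```

Here D lies on every shortest path into E and scores 3, while B and C each carry half of the two A-to-D and A-to-E shortest paths and score 1. For graphs of realistic size, a linear-memory method such as Brandes' algorithm would be the usual choice.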

Betweenness centrality recognizes nodes that act as ‘bridges’ across the whole network and identifies the PII attributes that determine the flow around the system. Betweenness is a powerful characteristic of communication dynamics: a high betweenness index can imply that a node regulates collaboration between others, holds authority over them, or sits on the periphery of several clusters. In our Ecosystem context, it measures how often a PII attribute is on the critical path of acquiring or discovering other PII, and hence measures Criticality.

We draw a scatter plot of Information Acquisition Power (y-axis) vs. Criticality (x-axis). This plot can be further divided into four quadrants based on the combination of high and low values on the x and y axes. Denoting Criticality (i.e., betweenness) by C and Information Acquisition Power (i.e., closeness) by I, Figure 7 shows the quadrant with high C and high I (blue dots), Figure 8 low C and high I (green dots), Figure 9 high C and low I (red dots), and Figure 10 low C and low I (orange dots).¹
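The quadrant split can be sketched as follows, taking "high" to mean above the average and "low" to mean otherwise, as the footnote states. Treating a value exactly equal to the average as low is our assumption. The scores below are invented for illustration and are not the values plotted in the figures.

```python
from statistics import mean


def quadrants(criticality, acquisition_power):
    """Label each PII as high/low C and high/low I relative to the respective average."""
    c_avg = mean(criticality.values())
    i_avg = mean(acquisition_power.values())
    labels = {}
    for name in criticality:
        c_label = "high C" if criticality[name] > c_avg else "low C"
        i_label = "high I" if acquisition_power[name] > i_avg else "low I"
        labels[name] = (c_label, i_label)
    return labels


# Hypothetical scores; both averages are 4.0.
criticality = {"EmailAddress": 9.0, "Signature": 5.0, "Name": 1.0, "Age": 1.0}
acquisition_power = {"EmailAddress": 8.0, "Signature": 1.0, "Name": 6.0, "Age": 1.0}
labels = quadrants(criticality, acquisition_power)
print(labels)
```

Each label pair corresponds to one of the four scatter plots, so filtering `labels` by a pair yields the points shown in that quadrant.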

Insight: Most of the data points have low information acquisition power and low criticality (Figure 10). Only a few sparsely distributed data points, discussed in more detail shortly, have both high criticality and high information acquisition power (Figure 7). Such PII attributes are powerful in acquiring other PII and act as critical bottlenecks in the network of PII attributes. Evaluated using the Ecosystem model, these are the attributes that, once exposed, will rapidly jeopardize the remaining sub-network and boost the flow of exposure inside the system. Interestingly, there is only one data point with high criticality but low information acquisition power (Figure 9), and that is Signature.

Fig. 7: Scatter plot with high betweenness (criticality) and high closeness (information acquisition power).

Figure 11 displays the top 10 PII in descending order of information acquisition power and criticality.

Insight: The top three attributes with the highest value of information acquisition power are Email Address, Name, and Address. The top three attributes with the highest value of criticality are Customer Database, Password, and Email Address.

¹ Low and high indicate below and above average, respectively.

Fig. 8: Scatter plot with low betweenness (criticality) and high closeness (information acquisition power).

Fig. 9: Scatter plot with high betweenness (criticality) and low closeness (information acquisition power).

C. Strongly Connected Components

Fig. 10: Scatter plot with low betweenness (criticality) and low closeness (information acquisition power).

Fig. 11: A snapshot showing the top 10 PII with the highest information acquisition power and criticality values.

In the current Identity Ecosystem, a large portion (65%) of the nodes is completely isolated from the rest of the Ecosystem. Among the PII attributes with connections, we further identify attributes that are mutually coupled among themselves, which we define as ‘clusters’. Clusters are subsets that are dangerous sources of breaches: they can quickly jeopardize the other members of the group and confine the flow inside the sub-network. We define a cluster as a Strongly Connected Component (SCC) in the graph-theoretic sense. An SCC of a directed graph $G = (V, E)$ is a maximal set of vertices $U \subseteq V$ such that for every pair of vertices $u$ and $v$ in $U$, both $u \mapsto v$ and $v \mapsto u$ hold, where $u \mapsto v$ means there is a directed path from $u$ to $v$. Consequently, in a cluster, there is a probability that every PII attribute can reveal every other PII and be revealed by every other PII. Tarjan’s classic serial algorithm for the detection of SCCs uses depth-first search and runs in time linear in the number of vertices and edges. We apply Tarjan’s algorithm [8] to compute the clusters. We found one cluster of 36 nodes, which we display in Table I.
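For readers unfamiliar with the algorithm, the following is a compact recursive sketch of Tarjan's SCC detection on a toy graph. It is an illustration only: the attribute names and edges are made up, the function name is hypothetical, and a recursive version like this one may hit Python's recursion limit on very deep graphs, where an iterative variant would be preferable.

```python
def strongly_connected_components(adjacency):
    """Tarjan's algorithm: return the SCCs of a directed graph as a list of sets."""
    index_of = {}      # discovery index of each visited node
    lowlink = {}       # smallest discovery index reachable from the node's DFS subtree
    stack = []         # nodes of the components still under construction
    on_stack = set()
    components = []
    counter = 0

    def visit(v):
        nonlocal counter
        index_of[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in adjacency.get(v, ()):
            if w not in index_of:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index_of[w])
        if lowlink[v] == index_of[v]:
            # v is the root of a component: pop the stack down to v.
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            components.append(component)

    nodes = set(adjacency) | {w for nbrs in adjacency.values() for w in nbrs}
    for v in sorted(nodes):
        if v not in index_of:
            visit(v)
    return components


# Hypothetical toy graph with one three-node cycle.
adjacency = {
    "Name": ["EmailAddress"],
    "EmailAddress": ["Password"],
    "Password": ["Name"],
    "CustomerDatabase": ["Name"],
    "Signature": [],
}
components = strongly_connected_components(adjacency)
clusters = [c for c in components if len(c) > 1]
print(clusters)
```

Every node forms at least a trivial component of size one, so only components with more than one member are kept as clusters in the sense used here.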

Insight: Every PII in Table I has a probability of exposing every other PII in that table.

IV. DISCUSSION OF RESULTS

In this section, we analyze the statistical results and give example takeaways from the above charts. Overall, the Ecosystem contains 627 vertices. We observe that a large portion of the nodes is not connected to any other node. In fact, 65% of the nodes are fully isolated, without any inbound or outbound connections. Only a small portion matters when breached or compromised, and one should make an effort to protect those nodes. Among the nodes with connections, we further rank them by degree centrality to identify the candidates with the highest in-degree and the highest out-degree, which can be interpreted as the attributes most likely to be compromised and the attributes that tend to spread information, respectively.

We further explore the layout and structural features of the Identity Ecosystem graph by computing the SCCs of the network. We extracted clusters in which every node is reachable from every other node inside the sub-graph. Among the 33% of PII nodes with incoming edges and the 7% with outgoing edges, an overlap of 36 vertices (about 5%) constitutes one big component.

We can therefore describe our ecosystem model as a sparse graph in which most attributes are unreachable. Only 5% congregate together and serve as a central concern for identity management.

We used closeness and betweenness centrality to better understand influence in the topology. Closeness, or information acquisition power in this context, measures the ability of a PII attribute to retrieve information from and send information to others. PII attributes with high closeness can be viewed as a ‘broadcaster’ or ‘gossiper’, which, if breached, can put others in danger. Betweenness, or criticality in this context, is based on the assumption that a PII attribute may expose others if it presides over a path bottleneck. It also identifies boundary spanners, which separate different communities and features. PII attributes with high betweenness can be viewed as a ‘bridge’ or ‘broker’: if one connected component is breached, they can function as essential endpoints that protect the identity by not allowing information to flow through.

Generally, previous studies indicate that centrality metrics are positively correlated [9] [10]. Overall, degree and closeness were strongly inter-correlated, while betweenness remained relatively uncorrelated with the other measures [11]. Combinations of centrality values represent certain topological and positional patterns ([12], p. 51). For attributes with a high degree, low closeness centrality (information acquisition power) indicates that the PII is embedded in a cluster that is far away from others, whereas low betweenness (criticality) indicates that the PII holds redundant links that information simply bypasses. For attributes with a low degree, high closeness centrality (information acquisition power) indicates that the PII is tied to important or active others, whereas high betweenness (criticality) indicates that the PII spans few links, but ones with crucial influence on network flow. The combination of low closeness (information acquisition power) and high betweenness (criticality) describes a PII that monopolizes the ties from a small number of PII attributes to many others. We found a prime example of such a situation with Signature. Low betweenness (criticality) and high closeness (information acquisition power) indicate that the PII is located in a dense, active cluster at the center of events with many others. We summarize the different combinations and their corresponding characteristics in Table II.

V. RELATED WORK

In this section, we cover previous research that studies and surveys the statistics of identity theft. We categorize previous work by its three main sources: Federal and State agencies, private organizations, and academic institutions.

Among government sources, the U.S. Department of Justice (Harrell [13]) releases reports on the distribution of identity theft victims. The United States General Accounting Office (USGAO [14]), the Federal Trade Commission (FTC [15]), the Office of the Inspector General, the Federal Bureau of Investigation (FBI), the Postal Inspectors Office, and the United States Secret Service (USSS) also present studies on identity theft from different domains.

Among private organizations, Javelin [1] publishes comprehensive analysis and case studies about fraud detection and identity threat.

In academia, Copes et al. [16] analyzed reports from the National Public Survey on White Collar Crime and summarized financial fraudster behavior such as credit card fraud and bank account fraud. Allison et al. [17] gathered data from agencies, performed statistical analysis on victims, and extracted demographic patterns of victims among the general U.S. population. Using Routine Activity Theory, Reyns [18] reported an empirical study of identity theft in the United Kingdom. Pratt et al. [19] and Choo [20] also conducted studies using Routine Activity Theory in different jurisdictions.

These studies present statistics. However, their data sets were not fully built into a structured mathematical model and were not analyzed with graph-theoretic and social network analysis measures. We use data sets from ITAP and model the risk of exposure using a Bayesian network [2]. We are also among the first to treat the identity ecosystem as a graph network and to apply three types of centrality as well as strongly connected components to it.

VI. CONCLUSION

In this paper, we designed and implemented a visualization framework that helps data providers and collectors comprehend and analyze the probabilistic graphical model of identity attributes. The visualization tool facilitates understanding of the whole risk model. Based on the Bayesian network representation of identity attributes, we developed traditional statistical charts such as histograms, scatter plots, and pie charts based on the values of each PII to inspect the underlying distribution. Even though hundreds of PII constitute the whole system, a large share of them is isolated. Only a small portion of the PII is vulnerable to identity theft, and one should make an effort to protect it.

TABLE I: List of attributes in SCC. Cluster of attributes, containing 36 vertices sorted in alphabetical order.

| 1. Address | 2. Account Number | 3. Account Information |
| 4. Age | 5. Bank Account Information | 6. Bank Account Number |
| 7. Biographic Data | 8. Birth Certificate Information | 9. Credit Card Information |
| 10. Credit Card Number | 11. CVV Code | 12. Check Information |
| 13. Date of Birth | 14. Debit Card Information | 15. Driver’s License Number |
| 16. Driver’s License Information | 17. Date | 18. Employee Login Credentials |
| 19. Email Address | 20. Employee Record | 21. Expiration Date |
| 22. ID Card Information | 23. Login Credentials | 24. Name |
| 25. Password | 26. Personally Identifiable Information | 27. Phone Number |
| 28. Personal Identification Number (PIN) | 29. Physical Address | 30. Passport Information |
| 31. Photograph-Person | 32. Patient Medical Record | 33. Routing Number |
| 34. Social Security Number | 35. Username | 36. W-2 Form Information |

TABLE II: Combinations of centrality metrics.

| | Low Degree | Low Closeness (Information Acquisition Power) | Low Betweenness (Criticality) |
| High Degree | - | Embedded in a cluster which is far away from others | PII with redundant connections that the flow bypasses |
| High Closeness (Information Acquisition Power) | Key PII connected to important and active others | - | Center PII located in a dense, active cluster at the center of events with many others |
| High Betweenness (Criticality) | PII’s few ties are crucial to network flow | PII monopolizes the ties from a small number of PII to many others | - |

To investigate the structural topology and the correlation between PII, we further applied centrality measures such as degree, closeness, and betweenness centrality. Moreover, we discussed the combinations of high and low values of the three centrality measures. With these measures, we can estimate the hidden characteristics of the network.

Lastly, we calculated Strongly Connected Components (SCC) to recognize clusters of PII that are mutually reachable among themselves. SCCs are subsets that are dangerous origins of breaches: they can quickly jeopardize the other PII in the group and constrain the flow inside the sub-network. In the current Identity Ecosystem, there is only one big cluster, with 36 interconnected PII (5% of the entire ecosystem). This again confirms that, as complex as the Identity Ecosystem is, only a small portion of it is most threatening and risky.

As the ITAP project continues to collect data, the theories and technologies developed in this research can be customized along the way to minimize our identities’ risk of exposure and maximize privacy.

REFERENCES

[1] K. M. Pascual, Al and S. Miller. (2018, Mar.) 2018 identity fraud: Fraud enters a new era of complexity. [Online]. Available: https://www.javelinstrategy.com/coverage-area/2018-identity-fraud-fraud-enters-new-era-complexity

[2] R. N. Zaeem, S. Budalakoti, K. S. Barber, M. Rasheed, and C. Bajaj, “Predicting and explaining identity risk, exposure and cost using the ecosystem of identity attributes,” in 2016 IEEE International Carnahan Conference on Security Technology (ICCST), Oct 2016, pp. 1–8.

[3] J. Zaiss, R. Nokhbeh Zaeem, and K. S. Barber, “Identity threat assessment and prediction,” Journal of Consumer Affairs, vol. 53, no. 1, pp. 58–70, 2019. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1111/joca.12191

[4] L. C. Freeman, “Centrality in social networks conceptual clarification,” Social Networks, p. 215, 1978.

[5] ——, “A set of measures of centrality based on betweenness,” Sociometry, vol. 40, no. 1, pp. 35–41, 1977. [Online]. Available: http://www.jstor.org/stable/3033543

[6] L. chun Yin, H. Kretschmer, R. A. Hanneman, and Z. yuan Liu, “Connection and stratification in research collaboration: An analysis of the collnet network,” Information Processing and Management, vol. 42, no. 6, pp. 1599–1613, 2006, special Issue on Informetrics. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0306457306000422

[7] N. E. Friedkin, “Theoretical foundations for centrality measures,” American Journal of Sociology, vol. 96, no. 6, pp. 1478–1504, 1991. [Online]. Available: http://www.jstor.org/stable/2781908

[8] R. Tarjan, “Depth-first search and linear graph algorithms,” in 12th Annual Symposium on Switching and Automata Theory (swat 1971), Oct 1971, pp. 114–121.

[9] T. Valente, K. Coronges, C. Lakon, and E. Costenbader, “How correlated are network centrality measures?” Connections (Toronto, Ont.), vol. 28, pp. 16–26, 01 2008.

[10] N. Meghanathan, “Correlation coefficient analysis of centrality metrics for complex network graphs,” in Computer Science On-line Conference, 01 2015, pp. 11–20.

[11] J. M. Bolland, “Sorting out centrality: An analysis of the performance of four centrality models in real and simulated networks,” Social Networks, vol. 10, no. 3, pp. 233–253, 1988. [Online]. Available: http://www.sciencedirect.com/science/article/pii/0378873388900147

[12] D. Du. (2019, Mar.) Social network analysis: Centrality measures. [Online]. Available: http://www2.unb.ca/∼ddu/6634/Lecturenotes/Lecture 4 centrality measure.pdf

[13] E. Harrell. (2016, Mar.) Victims of identity theft, 2016. [Online]. Available: https://www.bjs.gov/content/pub/pdf/vit16.pdf

[14] USGAO. (2017, Mar.) 2017. services offer some benefits but are limited in preventing fraud. [Online]. Available: https://www.gao.gov/assets/690/683842.pdf

[15] F. T. C. (FTC). (2016, Mar.) 2016. consumer sentinel network data book for january to december 2016. [Online]. Available: https://www.ftc.gov/system/files/documents/reports/consumer-sentinel-network-data-book-january-december-2016/csncy-2016 data book.pdf


[16] H. Copes and L. M. Vieraitis, “Understanding identity theft: Offenders’ accounts of their lives and crimes,” Criminal Justice Review, vol. 34, no. 3, pp. 329–349, 2009. [Online]. Available: https://doi.org/10.1177/0734016808330589

[17] S. F. Allison, A. M. Schuck, and K. M. Lersch, “Exploring the crime of identity theft: Prevalence, clearance rates, and victim/offender characteristics,” Journal of Criminal Justice, vol. 33, no. 1, pp. 19–29, 2005. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0047235204001175

[18] B. W. Reyns, “Online routines and identity theft victimization: Further expanding routine activity theory beyond direct-contact offenses,” Journal of Research in Crime and Delinquency, vol. 50, no. 2, pp. 216–238, 2013. [Online]. Available: https://doi.org/10.1177/0022427811425539

[19] T. C. Pratt, K. Holtfreter, and M. D. Reisig, “Routine online activity and internet fraud targeting: Extending the generality of routine activity theory,” Journal of Research in Crime and Delinquency, vol. 47, no. 3, pp. 267–296, 2010. [Online]. Available: https://doi.org/10.1177/0022427810365903

[20] K.-K. R. Choo, “The cyber threat landscape: Challenges and future research directions,” Computers and Security, vol. 30, no. 8, pp. 719–731, 2011. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0167404811001040

APPENDIX


WWW.IDENTITY.UTEXAS.EDU
