Analysis of Latent Relationships in Semantic Graphs using DEDICOM

Brett Bader*, Richard Harshman** & Tamara Kolda*
*Sandia National Laboratories
**University of Western Ontario

Workshop for Algorithms on Modern Massive Data Sets, June 24, 2006

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

Common Graph Analysis Technique

Truncated SVD of the adjacency matrix:

$$A_k = U_k \Sigma_k V_k^T = \sum_{i=1}^{k} \sigma_i u_i v_i^T$$

Best rank-k matrix filters out noise and captures “latent” information, which improves certain data mining tasks.

For example: Web search - HITS (Kleinberg, 1998)

But we may have ignored critical information by not considering edge metadata!
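The truncated SVD on this slide can be sketched in a few lines of NumPy. This is only an illustration: the 4-node graph and the helper name `truncated_svd` are made up for the example and do not come from the talk.

```python
import numpy as np

def truncated_svd(adjacency, k):
    """Return U_k, the k largest singular values, and V_k of `adjacency`."""
    U, s, Vt = np.linalg.svd(adjacency, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :].T

# Hypothetical 4-node directed graph (rows send, columns receive).
X = np.array([[0., 1., 1., 0.],
              [0., 0., 1., 0.],
              [1., 0., 0., 1.],
              [0., 0., 1., 0.]])

U_k, s_k, V_k = truncated_svd(X, k=2)
X_k = U_k @ np.diag(s_k) @ V_k.T   # rank-2 approximation A_k = U_k Sigma_k V_k^T
```

The Frobenius-norm error of `X_k` equals the root of the sum of the squared discarded singular values, which is one way to read "best rank-k" here.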

Semantic Graphs

• Different types of edges
• Examples
- WWW (anchor text)
- Subway map [thanks Orly!]
- Email communications (time stamp, to/cc)

New Paradigm: “Multidimensional Data Mining”

Build an “adjacency tensor” such that there is an adjacency matrix for each edge type.

Third dimension offers more explanatory power: uncovers new latent information and reveals subtle relationships.

[Figure: multilinear algebra takes the adjacency matrix to an adjacency tensor; models shown: Tucker, PARAFAC, DEDICOM]
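As a rough sketch of the "adjacency tensor" idea, the snippet below stacks one adjacency matrix per edge type into a dense 3-way array. The edge list, the function name `adjacency_tensor`, and the dense storage are assumptions made for illustration; the talk itself works with sparse tensors.

```python
import numpy as np

def adjacency_tensor(edges, n_nodes, n_types):
    """Stack one n x n adjacency matrix per edge type into an n x n x m array."""
    X = np.zeros((n_nodes, n_nodes, n_types))
    for src, dst, edge_type in edges:
        X[src, dst, edge_type] += 1.0   # count repeated edges of the same type
    return X

# Hypothetical email edges: (sender, recipient, month index).
edges = [(0, 1, 0), (0, 1, 0), (1, 2, 0), (2, 0, 1), (0, 2, 1)]
X = adjacency_tensor(edges, n_nodes=3, n_types=2)

# Summing over the third mode gives back the ordinary adjacency matrix.
aggregate = X.sum(axis=2)
```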

Objective

Use DEDICOM to analyze a semantic graph of email communications changing over time

[Figure: example email graph among Alice, Bob, Carl, David, Ellen, Frank, Gary, Henk, and Ingrid, decomposed by 3-way DEDICOM into roles and time patterns]

DEDICOM

• DEcomposition into DIrectional COMponents
• Introduced in 1978 by Harshman
• Past applications
- Study asymmetries in telephone calls among cities
- Marketing research
  • car switching: car owners and what they buy next
  • free associations of words
    - words to describe hair in advertising shampoo: “body” evokes “fullness” more often than “fullness” evokes “body”
- Asymmetric measures of world trade (import/export)
• Variations
- Three-way DEDICOM
- Constrained DEDICOM

DEDICOM Models & Algorithms

Two-way DEDICOM: $X \approx A R A^T$ (Takane, 1985; Kiers et al., 1990)
• Generalized Takane method
• New algorithm

Three-way DEDICOM: $X_i \approx A D_i R D_i A^T$ (Kiers, 1993)
• Kiers’ method
• New algorithm

All are “alternating” algorithms

Mathematical Notation

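• Scalars: $a$
• Vectors: $\mathbf{a}$
• Matrices: $\mathbf{A}$
• Tensors (3-way array): $\mathcal{D}$, $\mathcal{X}$
- frontal slices of $\mathcal{X}$: $X_i$ (e.g., $X_1$)
• Special symbols
- Kronecker product

$$A \otimes B = \begin{bmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & \ddots & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{bmatrix}$$

- Hadamard product (elementwise)

$$A \ast B = \begin{bmatrix} a_{11}b_{11} & \cdots & a_{1n}b_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1}b_{m1} & \cdots & a_{mn}b_{mn} \end{bmatrix}$$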
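For readers who want to see the two special products concretely, here is a small NumPy illustration. The matrices are arbitrary examples, not data from the talk.

```python
import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])
B = np.array([[0., 1.],
              [1., 0.]])

kron = np.kron(A, B)   # Kronecker product: 4 x 4, block (i, j) equals a_ij * B
hadamard = A * B       # Hadamard product: 2 x 2, entry (i, j) equals a_ij * b_ij
```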

Two-way DEDICOM

$$X = A R A^T + E$$

$$\min_{A,R} \left\| X - A R A^T \right\|_F^2 \quad \text{s.t. } A \text{ orthogonal}$$

• A (n x p) is an orthogonal matrix of loadings or weights
• R (p x p) is a dense matrix that captures asymmetric relationships
• Decomposition is not unique
- A can be transformed with no loss of fit to the data
- Nonsingular transformation Q: $A R A^T = (AQ)(Q^{-1} R Q^{-T})(AQ)^T$
- Usually “fix” A with some standard rotation (e.g., VARIMAX)

Single domain model

[Figure: X (n x n) ≈ A (n x p) times R (p x p) times A^T]
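The non-uniqueness of the two-way model can be checked numerically. In the sketch below the random `A` and `R` and the particular `Q` are arbitrary choices for illustration, and the orthogonality constraint on A is ignored so that a general nonsingular Q can be used.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 2
A = rng.standard_normal((n, p))
R = rng.standard_normal((p, p))
Q = np.array([[2., 1.],
              [0., 1.]])   # arbitrary nonsingular p x p matrix (hypothetical)

Q_inv = np.linalg.inv(Q)
A_rot = A @ Q                  # transformed loadings A Q
R_rot = Q_inv @ R @ Q_inv.T    # compensating core Q^{-1} R Q^{-T}

fit = A @ R @ A.T
fit_rot = A_rot @ R_rot @ A_rot.T   # same model, different A and R
```

The two fitted matrices agree to rounding error even though the loadings differ, which is why a standard rotation is needed before interpreting A.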

New Algorithm

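Solving for A:

Stack data and model “side by side” in a single equation

$$\begin{bmatrix} X & X^T \end{bmatrix} = \begin{bmatrix} A R A^T & A R^T A^T \end{bmatrix} = A \begin{bmatrix} R & R^T \end{bmatrix} \begin{bmatrix} A^T & 0 \\ 0 & A^T \end{bmatrix}$$

that is, $Y = A Z^T$, and solve the least-squares problem

$$\min_A \left\| Y - A Z^T \right\|_F^2$$

$$A_{new} \leftarrow \begin{bmatrix} X & X^T \end{bmatrix} \left( \begin{bmatrix} R & R^T \end{bmatrix} \begin{bmatrix} A^T & 0 \\ 0 & A^T \end{bmatrix} \right)^{\dagger}$$

or

$$A_{new} = \left( X A R^T + X^T A R \right) \left( R (A^T A) R^T + R^T (A^T A) R \right)^{-1}.$$

Solving for R:

$$R_{new} = A^{\dagger} X (A^T)^{\dagger}$$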
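The two updates above translate fairly directly into NumPy. The following is a minimal sketch under some assumptions: the function names are invented for illustration, the data are dense, and the orthogonality constraint on A mentioned on the previous slide is not enforced. It is not the authors' MATLAB implementation.

```python
import numpy as np

def update_A(X, A, R):
    """A_new = (X A R^T + X^T A R) (R (A^T A) R^T + R^T (A^T A) R)^{-1}."""
    G = A.T @ A
    numer = X @ A @ R.T + X.T @ A @ R
    denom = R @ G @ R.T + R.T @ G @ R          # symmetric p x p matrix
    # numer @ inv(denom), computed with a solve; valid because denom is symmetric.
    return np.linalg.solve(denom, numer.T).T

def update_R(X, A):
    """R_new = pinv(A) X pinv(A^T)."""
    A_pinv = np.linalg.pinv(A)
    return A_pinv @ X @ A_pinv.T

def dedicom_2way(X, p, n_iter=50, seed=0):
    """Alternate the two updates from a random start (no stopping test)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((X.shape[0], p))
    R = update_R(X, A)
    for _ in range(n_iter):
        A = update_A(X, A, R)
        R = update_R(X, A)
    return A, R
```

A call such as `A, R = dedicom_2way(X, p=3)` returns an n x 3 loading matrix and a 3 x 3 core. A practical version would add a convergence check and a final rotation of A.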

Three-way DEDICOM

$$X_i = A D_i R D_i A^T + E_i \quad \text{for } i = 1, \ldots, m$$

$$\min_{A,R,D} \sum_{i=1}^{m} \left\| X_i - A D_i R D_i A^T \right\|_F^2$$

• A (n x p) is a matrix of loadings or weights (not necessarily orthogonal)
• R (p x p) is a dense matrix that captures asymmetric relationships
• D (p x p x m) is a tensor with diagonal frontal slices giving the weights of the columns of A for each slice in third mode
• *Unique* solution with enough slices of X with sufficient variation
- i.e., no rotation of A possible
- greater confidence in interpretation of results

[Figure: X (n x n x m) ≈ A D R D A^T, with A of size n x p]

New Algorithm - Updating A

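Solving for A in

$$\min_{A,R,D} \sum_{i=1}^{m} \left\| X_i - A D_i R D_i A^T \right\|_F^2$$

Stack the slices side by side:

$$\begin{bmatrix} X_1 & X_1^T & \cdots & X_m & X_m^T \end{bmatrix} = A \begin{bmatrix} D_1 R D_1 & D_1 R^T D_1 & \cdots & D_m R D_m & D_m R^T D_m \end{bmatrix} \left( I_{2m} \otimes A^T \right)$$

that is, $Y = A Z^T$, so that

$$A = Y Z (Z^T Z)^{-1}$$

$$A = \left[ \sum_{i=1}^{m} \left( X_i A D_i R^T D_i + X_i^T A D_i R D_i \right) \right] \left[ \sum_{i=1}^{m} (B_i + C_i) \right]^{-1}$$

where

$$B_i \equiv D_i R D_i (A^T A) D_i R^T D_i, \qquad C_i \equiv D_i R^T D_i (A^T A) D_i R D_i.$$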
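A sketch of this A update in NumPy follows. The array layout (slices stored along the first axis, diagonals of the D_i stored as rows of a matrix), the dense storage, and the function name are assumptions made for illustration only.

```python
import numpy as np

def update_A_3way(X, A, R, D):
    """One update of A for three-way DEDICOM.

    X: array of shape (m, n, n) holding the slices X_i (assumed layout).
    D: array of shape (m, p) holding the diagonal of each D_i (assumed layout).
    """
    G = A.T @ A
    p = A.shape[1]
    numer = np.zeros_like(A)
    denom = np.zeros((p, p))
    for X_i, d_i in zip(X, D):
        DRD = d_i[:, None] * R * d_i[None, :]          # D_i R D_i
        numer += X_i @ A @ DRD.T + X_i.T @ A @ DRD     # X_i A D_i R^T D_i + X_i^T A D_i R D_i
        denom += DRD @ G @ DRD.T + DRD.T @ G @ DRD     # B_i + C_i
    # numer @ inv(denom), computed with a solve; valid because denom is symmetric.
    return np.linalg.solve(denom, numer.T).T
```

With sparse slices only the products with `X_i` touch the data, which is where the cost of this step concentrates.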

New Algorithm - Updating D

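Solving for D:

$$\min_{D_i} \left\| X_i - A D_i R D_i A^T \right\|_F^2$$

Use compression. QR factorization:

$$A = Q \tilde{A},$$

which gives a smaller problem (p x p):

$$\min_{D_i} \left\| Q^T X_i Q - \tilde{A} D_i R D_i \tilde{A}^T \right\|_F^2$$

Use Newton’s method to solve the optimization problem for $d = \mathrm{diag}(D_i)$:

$$d_{new} = d - H^{-1} g$$

Gradient:

$$g_k = -\sum_{i,j} \left[ 2 \left( X - A D R D A^T \right) \ast \left( A D r_k a_k^T + a_k r_{k,:} D A^T \right) \right]_{i,j}$$

Hessian:

$$h_{st} = -2 \sum_{i,j} \left[ \left( X - A D R D A^T \right) \ast \left( a_s r_{st} a_t^T + a_t r_{ts} a_s^T \right) - \left( A D r_s a_s^T + a_s r_{s,:} D A^T \right) \ast \left( A D r_t a_t^T + a_t r_{t,:} D A^T \right) \right]_{i,j}$$

Here $a_k$ is column k of A, $r_k$ is column k of R, $r_{k,:}$ is row k of R, and $\ast$ is the Hadamard product.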
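The gradient and Hessian for one slice can be written out as below. This is a sketch with invented function names, dense arrays, and no QR compression step; it follows the formulas term by term so that they can be compared against finite differences.

```python
import numpy as np

def model(A, R, d):
    """A D R D A^T with D = diag(d)."""
    D = np.diag(d)
    return A @ D @ R @ D @ A.T

def dmodel(A, R, d, k):
    """Derivative of A D R D A^T with respect to d_k."""
    D = np.diag(d)
    a_k = A[:, [k]]                                   # column k of A, shape (n, 1)
    return A @ D @ R[:, [k]] @ a_k.T + a_k @ R[[k], :] @ D @ A.T

def gradient(X, A, R, d):
    E = X - model(A, R, d)
    return np.array([-2.0 * np.sum(E * dmodel(A, R, d, k)) for k in range(len(d))])

def hessian(X, A, R, d):
    p = len(d)
    E = X - model(A, R, d)
    H = np.empty((p, p))
    for s in range(p):
        for t in range(p):
            a_s, a_t = A[:, [s]], A[:, [t]]
            second = R[s, t] * (a_s @ a_t.T) + R[t, s] * (a_t @ a_s.T)
            H[s, t] = -2.0 * np.sum(E * second - dmodel(A, R, d, s) * dmodel(A, R, d, t))
    return H

def newton_step(X, A, R, d):
    """d_new = d - H^{-1} g, without any safeguard such as a line search."""
    return d - np.linalg.solve(hessian(X, A, R, d), gradient(X, A, R, d))
```

Because the objective is not convex in d, a plain Newton step like this one can move uphill far from a solution; a careful implementation would guard against that.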

Our Algorithm - Updating R

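Solving for R:

$$\min_R \sum_{i=1}^{m} \left\| X_i - A D_i R D_i A^T \right\|_F^2$$

Use the approach in (Kiers, 1993): minimize

$$f(R) = \left\| \begin{bmatrix} \mathrm{Vec}(X_1) \\ \vdots \\ \mathrm{Vec}(X_m) \end{bmatrix} - \begin{bmatrix} A D_1 \otimes A D_1 \\ \vdots \\ A D_m \otimes A D_m \end{bmatrix} \mathrm{Vec}(R) \right\|$$

$$\mathrm{Vec}(R) = \left[ \sum_{i=1}^{m} (D_i A^T A D_i) \otimes (D_i A^T A D_i) \right]^{-1} \sum_{i=1}^{m} \mathrm{Vec}(D_i A^T X_i A D_i)$$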
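The closed-form solve for Vec(R) can be sketched as follows. The array layout and the function name are assumptions for illustration, and Vec is taken to stack columns, which is the convention under which the Kronecker identity used on the slide holds.

```python
import numpy as np

def update_R_3way(X, A, D):
    """Solve the p^2 x p^2 normal equations for Vec(R).

    X: array of shape (m, n, n) holding the slices X_i (assumed layout).
    D: array of shape (m, p) holding the diagonal of each D_i (assumed layout).
    """
    p = A.shape[1]
    G = A.T @ A
    lhs = np.zeros((p * p, p * p))
    rhs = np.zeros(p * p)
    for X_i, d_i in zip(X, D):
        G_i = d_i[:, None] * G * d_i[None, :]                  # D_i A^T A D_i
        lhs += np.kron(G_i, G_i)
        M_i = d_i[:, None] * (A.T @ X_i @ A) * d_i[None, :]    # D_i A^T X_i A D_i
        rhs += M_i.flatten(order="F")                          # Vec stacks columns
    return np.linalg.solve(lhs, rhs).reshape((p, p), order="F")
```

The linear system has only p^2 unknowns, so this step is cheap whenever the number of roles p is small.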

Algorithm Costs

Dominant costs:
• $A^T A$
• $X_i A R^T$, $X_i^T A R$, and $Q^T X_i Q$: linear in nnz of $X_i$
• QR factorization of A: $O(p^2 n)$

Updating A is most expensive part

Application: Enron Email Analysis

• Links consist of email communications

• What can we learn about this network strictly from their communication patterns? (Social network analysis)

[Figure: example email graph among Alice, Bob, Carl, David, Ellen, Frank, Gary, Henk, and Ingrid]

[Figure: organization chart of Enron Corp., shown with a report excerpt describing EnronOnline (EOL)]

Enron Corp.

• U.S. corporation involved with creating energy markets
- 7th largest by revenue
• EnronOnline: e-trading business
- natural gas
- electric power
• Investigations
- U.S. Federal Energy Regulatory Commission (FERC)
  • energy market manipulation
  • involved energy traders
- U.S. Securities and Exchange Commission (SEC)
  • accounting fraud
  • insider trading

Enron Email Data

• FERC collected email of ~150 employees as evidence
- Included emails saved in inbox, sent items, deleted items, and all other folders
• Released to the public in 2002 by FERC as part of their investigation
- To/from, date, subject, body
- Attachments and some names/emails removed
- Approx. 500,000 email messages

Smaller Enron Data Set

Figure 1: Number of emails per month in the Enron email graph. [Plot: x-axis Month, November 1998 through June 2002; y-axis Messages, 0 to 3500]

biasing from prolific emailers. Other weightings are possible as well.

An obvious difficulty in dealing with the Enron corpus is the lack of information regarding the former employees. Without access to a corporate directory or organizational chart at Enron at the time of these emails, it is difficult to ascertain the validity of our results and assess the performance of the DEDICOM model. Other researchers using the Enron corpus have had this same problem, and information on the participants has been collected and slowly made available.

The Priebe data set [32] provided partial information on the 184 employees of the small Enron network, which appears to be based largely on information collected by Shetty and Adibi [36]. It provides most employees’ position and business unit. To facilitate a better analysis of the DEDICOM results, we collected extra information on the participants from the email messages themselves. We searched for corroborating information of the preexisting data or for new identification information, such as title, business unit, or manager to help analyze our results. We also collected some relevant information posted on the FERC website [9].

5. EXPERIMENTAL RESULTS

In this section we summarize our findings of applying two-way and three-way DEDICOM on the Enron email network. Our algorithms were written in MATLAB, using sparse extensions of the Tensor Toolbox [2].

Table 1 shows the A and R matrices for a single decomposition (p = 3) of the two-way DEDICOM model. The large adjacency matrix X, showing nonsymmetric relations among employees at Enron, related by flows of email, is condensed into a smaller matrix R giving the same kind of asymmetric relations but among “types” or abstract idealized individuals. In this case, the relations among elements in R are exchanges of email. The latent components are patterns of the same kind of flow as among the surface objects, just abstracted into a “higher level” summary of patterns.

DEDICOM does not actually identify clusters, except in special circumstances when such clusters happen to exist in the data as we are partially seeing in the Enron data. The components or patterns of asymmetric relationships that it identifies have loadings in A that are continuously-valued, like factor loadings, rather than discrete cluster membership assignments.

Here, DEDICOM describes the employees by the different latent dimensions. The first factor ($a_1$) describes an executive role that fits many of the top executives. The second factor ($a_2$) describes a legal role, and the third factor ($a_3$) describes a pipeline employee.

The R matrices show that most of the communication is among employees that share the same role, as evidenced by the large diagonal values in R. We do see some asymmetric communication. The entries in the lower triangular portion are typically larger than the corresponding transpose entry in the upper triangular. This suggests that slightly more communication “flows up” the management chain than “down.”

As a point of reference, we compute the singular value decomposition $X = U \Sigma V^T$. Table 1 shows the first three columns of the left singular vectors (U matrix) and right singular vectors (V matrix). Because X is nearly symmetric, the left and right singular vectors are nearly the same. Any differences between U and V indicate whether the person is more likely to send mail (U) or receive mail (V).

The SVD solution is somewhat similar to the DEDICOM model. Many of the same people are identified and weighted similarly by DEDICOM and SVD. However, there are many more negative entries in SVD than in DEDICOM. The DEDICOM model also provides directional information between the latent groups in the R matrix that the SVD does not show.

Table 2 shows the A and R matrices for three instances (p = 2, 3, 4) of the three-way DEDICOM model. The 2-dimensional solution groups the employees largely from the legal department and those executives dealing with government and regulatory affairs. The 3-dimensional solution adds another role of top executives, and the 4-dimensional solution includes those from the pipeline business in a fourth role.

The aggregate communication patterns over the 44 months among these 2-4 groups are summarized in the R matrix. In the 2-dimensional solution we see that most of the communication is within each group as evidenced by the large diagonal elements and small off-diagonal elements. The 3-dimensional solution shows some communication between the government/regulatory affairs people and other senior VP’s (dimensions 2 and 3, respectively). However, the communication is substantially asymmetric in that the $r_{2,3}$ element is larger than $r_{3,2}$. This indicates that the VP’s were mostly recipients of messages while the government/regulatory affairs employees were senders. With the addition of the pipeline employees in the 4-dimensional solution, we see that they interact almost exclusively with themselves due to the

Email communications at Enron (1998-2002)

34,427 emails among 184 employees over 44 months

• Limited information on the 184 employees

• No org chart

We used a smaller data set prepared by Priebe et al.

DEDICOM Experiment

• Aggregate communications
- Sparse matrix of size 184 x 184 (3007 nonzeros)
• Time series of communication graphs
- Sparse tensor of size 184 x 184 x 44 (9838 nonzeros)
• Weighted adjacency matrix
- scaling: an entry counting x messages is scaled to log(x) + 1
- other common choices give similar results
• Models:
- SVD
- 2-way DEDICOM
- 3-way DEDICOM
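The log(x) + 1 weighting can be sketched as below. The counts are made up, the function name is hypothetical, and the natural logarithm is an assumption, since the slide does not state the base.

```python
import numpy as np

def scale_counts(counts):
    """Map each nonzero message count x to log(x) + 1; zero entries stay zero."""
    scaled = np.zeros(counts.shape, dtype=float)
    nonzero = counts > 0
    scaled[nonzero] = np.log(counts[nonzero]) + 1.0   # natural log (assumed base)
    return scaled

# Hypothetical message counts: rows send, columns receive.
counts = np.array([[0, 1, 20],
                   [3, 0, 0],
                   [0, 0, 0]])
weights = scale_counts(counts)
```

A single message keeps weight 1, while a pair exchanging 20 messages gets a weight of about 4, so prolific emailers dominate the fit less than raw counts would allow.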

Social Network Analysis

Communication graph among employees over all times

• Description of employees by their roles
• Aggregate communication patterns among roles

[Figure: adjacency matrix decomposed by DEDICOM into roles and patterns]

DEDICOM Results

Some employees have dual roles
Pattern of communications in R matrix
Identify shared characteristics to label group: Executives, Legal, Pipeline employees

Employee | DEDICOM 1 | DEDICOM 2 | DEDICOM 3 | SVD (left) 1 | SVD (left) 2 | SVD (left) 3 | SVD (right) 1 | SVD (right) 2 | SVD (right) 3
J. Lavorato - CEO, Enron America | 0.41 | 0.07 | 0.04 | 0.30 | -0.07 | -0.21 | 0.31 | -0.09 | -0.07
L. Kitchen - President, Enron Online | 0.26 | 0.21 | 0.04 | 0.31 | 0.07 | -0.05 | 0.29 | 0.02 | 0.04
M. Grigsby - Director, West Desk Gas Trading | 0.22 | -0.01 | -0.01 | 0.16 | -0.09 | -0.33 | 0.14 | -0.06 | -0.20
D. Delainey - CEO, ENA and Enron Energy Services | 0.20 | 0.06 | 0.06 | 0.20 | -0.05 | -0.00 | 0.20 | -0.05 | 0.03
G. Whalley - President, | 0.17 | 0.05 | 0.04 | 0.08 | -0.02 | -0.02 | 0.24 | -0.07 | 0.02
L. Taylor - Executive Assistant to Greg Whalley, | 0.17 | 0.06 | 0.03 | 0.24 | -0.05 | -0.08 | 0.09 | -0.01 | -0.02
T. Jones - Employee, Financial Trading Group (ENA Legal) | -0.12 | 0.38 | -0.02 | 0.17 | 0.36 | 0.13 | 0.10 | 0.24 | 0.10
M. Taylor - Manager, Financial Trading Group ENA Legal | -0.10 | 0.35 | -0.01 | 0.13 | 0.27 | 0.13 | 0.13 | 0.26 | 0.12
S. Shackleton - Employee, ENA Legal | -0.13 | 0.31 | -0.02 | 0.08 | 0.26 | 0.10 | 0.08 | 0.26 | 0.10
S. Panus - Senior Legal Specialist, ENA Legal | -0.11 | 0.26 | -0.02 | 0.09 | 0.27 | 0.10 | 0.05 | 0.20 | 0.08
M. Heard - Senior Legal Specialist, ENA Legal | -0.10 | 0.24 | -0.02 | 0.06 | 0.20 | 0.09 | 0.08 | 0.22 | 0.09
E. Sager - VP and Asst Legal Counsel, ENA Legal | -0.01 | 0.24 | 0.02 | 0.12 | 0.13 | 0.10 | 0.15 | 0.21 | 0.12
S. Corman - VP, Regulatory Affairs | -0.04 | -0.01 | 0.33 | 0.08 | -0.18 | 0.22 | 0.07 | -0.18 | 0.21
K. Watson - Employee, Transwestern Pipeline Company (ETS) | -0.08 | -0.03 | 0.32 | 0.03 | -0.16 | 0.19 | 0.04 | -0.18 | 0.22
L. Donoho - Employee, Transwestern Pipeline Company (ETS) | -0.08 | -0.03 | 0.30 | 0.03 | -0.16 | 0.18 | 0.03 | -0.17 | 0.20
D. Fossum - VP, Transwestern Pipeline Company (ETS)? | -0.06 | -0.00 | 0.30 | 0.07 | -0.18 | 0.23 | 0.05 | -0.13 | 0.16
M. Lokay - Admin. Asst., Transwestern Pipeline Company (ETS) | -0.07 | -0.02 | 0.28 | 0.03 | -0.14 | 0.17 | 0.04 | -0.17 | 0.20
K. Hyatt - Director, Asset Development TW Pipeline Co. (ETS) | -0.06 | -0.02 | 0.25 | 0.03 | -0.13 | 0.17 | 0.04 | -0.14 | 0.17
R. Hayslett - VP, Also CFO and Treasurer | -0.04 | -0.01 | 0.23 | 0.04 | -0.13 | 0.16 | 0.05 | -0.14 | 0.16

R matrix (DEDICOM):
70.3 | 11.6 | 6.7
15.4 | 68.2 | 5.0
9.9 | 6.7 | 59.5

Singular values (SVD, left and right): 86.3, 54.1, 52.6

[Figure: directed graph of communication among the Execs, Legal, and Pipeline roles, with edge weights taken from the R matrix]

Social Network Analysis

Communication graph among employees over all times

• “Hubs” and “authorities” for different roles

[Figure: SVD of the adjacency matrix, $U_k \Sigma_k V_k^T$, with hubs in $U_k$ and authorities in $V_k$]

DEDICOM & SVD Results

SVD: Hubs and Authorities in U (hubs) and V (authorities)
Roles more difficult to identify in singular vectors
No patterns of communication

[Figure: the table of DEDICOM, SVD (left), and SVD (right) loadings from the DEDICOM Results slide, with the Legal, Executives, and Pipeline groups marked]

Temporal Social Network Analysis

Time series of communication graphs among employees

• Unique description of employees by their roles
• Aggregate communication patterns among roles
• Behavior over time

[Figure: monthly communication graphs (January, February, March, April) stacked into an adjacency tensor, decomposed by 3-way DEDICOM into roles, patterns, and time]

Roles of Employees

Employee | 2-D: 1 | 2-D: 2 | 3-D: 1 | 3-D: 2 | 3-D: 3 | 4-D: 1 | 4-D: 2 | 4-D: 3 | 4-D: 4
T. Jones - Employee, Financial Trading Group (ENA Legal) | 0.64 | -0.02 | 0.64 | -0.02 | 0.01 | 0.64 | -0.01 | 0.02 | -0.00
S. Shackleton - Employee, ENA Legal | 0.45 | -0.02 | 0.45 | -0.01 | -0.02 | 0.45 | -0.00 | -0.01 | -0.00
M. Taylor - Manager, Financial Trading Group ENA Legal | 0.38 | 0.00 | 0.37 | -0.01 | 0.01 | 0.37 | 0.01 | 0.02 | -0.00
S. Bailey - Legal Assistant, ENA Legal | 0.26 | -0.01 | 0.26 | -0.01 | -0.01 | 0.26 | -0.00 | -0.01 | -0.00
S. Panus - Senior Legal Specialist, ENA Legal | 0.26 | -0.01 | 0.26 | -0.01 | -0.01 | 0.26 | -0.00 | -0.00 | -0.00
M. Heard - Senior Legal Specialist, ENA Legal | 0.23 | -0.01 | 0.23 | -0.01 | 0.00 | 0.23 | -0.00 | 0.00 | -0.00
J. Hodge - Asst General Counsel, ENA Legal | 0.13 | 0.03 | 0.13 | 0.03 | 0.00 | 0.13 | 0.03 | 0.01 | -0.00
L. Kitchen - President, Enron Online | 0.10 | 0.08 | 0.11 | -0.13 | 0.53 | 0.11 | -0.09 | 0.53 | 0.00
S. Dickson - Employee, ENA Legal | 0.09 | -0.00 | 0.09 | -0.00 | 0.00 | 0.09 | -0.00 | 0.00 | -0.00
E. Sager - VP and Asst Legal Counsel, ENA Legal | 0.08 | 0.04 | 0.08 | 0.01 | 0.06 | 0.08 | 0.02 | 0.07 | -0.00
J. Dasovich - Employee, Government Relationship Executive | -0.01 | 0.58 | -0.02 | 0.57 | 0.04 | -0.01 | 0.58 | 0.06 | 0.01
J. Steffes - VP, Government Affairs | -0.00 | 0.49 | -0.01 | 0.52 | -0.08 | 0.00 | 0.53 | -0.06 | -0.01
R. Shapiro - VP, Regulatory Affairs | -0.01 | 0.43 | -0.01 | 0.39 | 0.09 | -0.00 | 0.40 | 0.10 | -0.00
S. Kean - VP, Chief of Staff | -0.01 | 0.35 | -0.01 | 0.37 | -0.05 | -0.00 | 0.37 | -0.04 | -0.00
R. Sanders - VP, Enron Wholesale Services | 0.03 | 0.16 | 0.03 | 0.16 | -0.01 | 0.03 | 0.16 | -0.01 | -0.00
D. Delainey - CEO, ENA and Enron Energy Services | 0.01 | 0.12 | 0.01 | 0.08 | 0.08 | 0.01 | 0.09 | 0.09 | -0.00
S. Corman - VP, Regulatory Affairs | -0.00 | 0.08 | -0.00 | 0.08 | -0.01 | -0.00 | 0.08 | -0.00 | 0.20
M. Carson - Employee, Corporate and Environmental Policy | -0.00 | 0.07 | -0.00 | 0.09 | -0.02 | -0.00 | 0.08 | -0.02 | -0.00
S. Scott - Employee, Transwestern Pipeline Company (ETS) | -0.00 | 0.08 | -0.00 | 0.08 | -0.00 | -0.00 | 0.08 | -0.00 | 0.04
J. Lavorato - CEO, Enron America | 0.02 | 0.12 | 0.02 | -0.08 | 0.49 | 0.02 | -0.04 | 0.49 | 0.00
M. Grigsby - Director, West Desk Gas Trading | 0.00 | 0.04 | 0.00 | -0.04 | 0.20 | 0.00 | -0.03 | 0.20 | -0.00
G. Whalley - President, | 0.01 | 0.06 | 0.01 | -0.03 | 0.19 | 0.01 | -0.01 | 0.19 | 0.00
J. Steffes - VP, Government Affairs | 0.00 | 0.04 | 0.00 | -0.04 | 0.19 | 0.00 | -0.02 | 0.18 | 0.00
K. Presto - VP, East Power Trading | 0.01 | 0.01 | 0.01 | -0.06 | 0.19 | 0.01 | -0.05 | 0.18 | 0.00
S. Beck - COO, | 0.01 | 0.02 | 0.01 | -0.05 | 0.17 | 0.01 | -0.03 | 0.17 | 0.00
B. Tycholiz - VP, Marketing | 0.01 | 0.04 | 0.01 | -0.03 | 0.17 | 0.01 | -0.02 | 0.16 | 0.00
J. Arnold - VP, Financial Enron Online | 0.03 | 0.02 | 0.03 | -0.05 | 0.16 | 0.03 | -0.04 | 0.16 | -0.00
J. Williamson - Executive Assistant, | 0.00 | 0.02 | 0.00 | -0.03 | 0.14 | 0.00 | -0.02 | 0.14 | 0.01
K. Watson - Employee, Transwestern Pipeline Company (ETS) | -0.00 | 0.01 | -0.00 | 0.00 | 0.01 | -0.00 | -0.00 | 0.01 | 0.59
M. Lokay - Admin. Asst., Transwestern Pipeline Company (ETS) | -0.00 | 0.01 | -0.00 | 0.01 | 0.01 | -0.00 | 0.01 | 0.01 | 0.42
L. Donoho - Employee, Transwestern Pipeline Company (ETS) | -0.00 | 0.01 | -0.00 | 0.01 | 0.01 | -0.00 | 0.01 | 0.01 | 0.35
M. McConnell - Employee, Transwestern Pipeline Company (ETS) | 0.00 | 0.00 | 0.00 | -0.00 | 0.01 | 0.00 | -0.00 | 0.01 | 0.26
L. Blair - Employee, Northern Natural Gas Pipeline (ETS) | -0.00 | 0.01 | -0.00 | 0.01 | 0.00 | -0.00 | 0.00 | 0.00 | 0.22
K. Hyatt - Director, Asset Development TW Pipeline Business (ETS) | -0.00 | 0.02 | -0.00 | 0.02 | 0.00 | -0.00 | 0.01 | 0.00 | 0.20
D. Schoolcraft - Employee, Gas Control (ETS) | -0.00 | 0.00 | -0.00 | 0.00 | 0.00 | -0.00 | 0.00 | 0.00 | 0.18
T. Geaccone - Manager, (ETS) | 0.00 | 0.00 | 0.00 | -0.00 | 0.01 | 0.00 | -0.00 | 0.01 | 0.17
R. Hayslett - VP, Also CFO and Treasurer | 0.00 | 0.01 | 0.00 | -0.00 | 0.02 | 0.00 | -0.00 | 0.02 | 0.16

R matrix, 2-dimensional solution:
438.3 | 12.1
15.3 | 291.9

R matrix, 3-dimensional solution:
440.3 | 18.6 | -0.9
19.7 | 292.5 | 168.4
-17.0 | 104.1 | 216.4

R matrix, 4-dimensional solution:
440.2 | 1.6 | -15.0 | 0.4
1.6 | 278.3 | 135.4 | 1.6
-29.3 | 70.7 | 201.6 | -6.2
1.4 | -4.6 | -7.5 | 172.3

Table 2: Three-way DEDICOM results on the Enron email graph for three different decompositions, p = 2, 3, 4. The top 10 entries from all reported columns of A are listed in the table. Entries exceeding a threshold of 0.06 are highlighted.

$D_t R D_t$, October 2000:
22.2 | 0.1 | -0.5 | 0.0
0.1 | 19.0 | 4.7 | 0.1
-0.9 | 2.5 | 3.6 | -0.1
0.0 | -0.2 | -0.1 | 3.5

$D_t R D_t$, October 2001:
14.5 | 0.0 | -0.9 | 0.0
0.0 | 4.1 | 5.5 | 0.1
-1.8 | 2.9 | 22.5 | -0.7
0.1 | -0.2 | -0.8 | 19.1

Table 3: $D_t R D_t$ matrices showing communication patterns for October, 2000 and October, 2001.

that it identifies some people who were pretty much purely of a certain type and other people who had mixed characteristics. For example, a given person might “load” on both an executive and a lawyer component or aspect, and thus show email exchanges resembling each of these two roles to some extent.

The entries in matrix R describe the communication patterns between groups of the same and different type. They show how a particular person’s combination of roles or attributes influences the pattern of messages he/she exchanges with particular other employees given the other employee’s roles or attributes. The R matrix is asymmetric and offers an idealized version of a directed graph involving the components identified in A.

In addition, three-way DEDICOM shows the associated communication patterns over time in the tensor D. The scales in each $D_t$ show the strength of participation of a particular group for time period t.

In the present study, we investigated a semantic graph with edges labeled by time. As an alternative to time, we point out that our semantic graph could have incorporated different types of communication media (e.g., email, phone, and mail communications) instead of time in the third mode. Then an analysis with three-way DEDICOM would represent information about the vertices across all forms of communication (appropriately scaled by slices of D) in the A and R matrices.

Furthermore, DEDICOM is not limited to the analysis of sociometric and intercommunication data; DEDICOM may


Identify shared characteristics to label group:
• Legal
• Gov’t affairs
• Execs - trading
• Pipeline employees

Communication Patterns

[Figure: communication patterns among the Legal, Gov’t affairs, Trade execs, and Pipeline roles, with roles and time patterns from 3-way DEDICOM]

2-Dimensional 3-Dimensional 4-DimensionalSolution Solution Solution

Employee | p = 2: 1 2 | p = 3: 1 2 3 | p = 4: 1 2 3 4

T. Jones - Employee, Financial Trading Group (ENA Legal) | 0.64 -0.02 | 0.64 -0.02 0.01 | 0.64 -0.01 0.02 -0.00
S. Shackleton - Employee, ENA Legal | 0.45 -0.02 | 0.45 -0.01 -0.02 | 0.45 -0.00 -0.01 -0.00
M. Taylor - Manager, Financial Trading Group ENA Legal | 0.38 0.00 | 0.37 -0.01 0.01 | 0.37 0.01 0.02 -0.00
S. Bailey - Legal Assistant, ENA Legal | 0.26 -0.01 | 0.26 -0.01 -0.01 | 0.26 -0.00 -0.01 -0.00
S. Panus - Senior Legal Specialist, ENA Legal | 0.26 -0.01 | 0.26 -0.01 -0.01 | 0.26 -0.00 -0.00 -0.00
M. Heard - Senior Legal Specialist, ENA Legal | 0.23 -0.01 | 0.23 -0.01 0.00 | 0.23 -0.00 0.00 -0.00
J. Hodge - Asst General Counsel, ENA Legal | 0.13 0.03 | 0.13 0.03 0.00 | 0.13 0.03 0.01 -0.00
L. Kitchen - President, Enron Online | 0.10 0.08 | 0.11 -0.13 0.53 | 0.11 -0.09 0.53 0.00
S. Dickson - Employee, ENA Legal | 0.09 -0.00 | 0.09 -0.00 0.00 | 0.09 -0.00 0.00 -0.00
E. Sager - VP and Asst Legal Counsel, ENA Legal | 0.08 0.04 | 0.08 0.01 0.06 | 0.08 0.02 0.07 -0.00
J. Dasovich - Employee, Government Relationship Executive | -0.01 0.58 | -0.02 0.57 0.04 | -0.01 0.58 0.06 0.01
J. Steffes - VP, Government Affairs | -0.00 0.49 | -0.01 0.52 -0.08 | 0.00 0.53 -0.06 -0.01
R. Shapiro - VP, Regulatory Affairs | -0.01 0.43 | -0.01 0.39 0.09 | -0.00 0.40 0.10 -0.00
S. Kean - VP, Chief of Staff | -0.01 0.35 | -0.01 0.37 -0.05 | -0.00 0.37 -0.04 -0.00
R. Sanders - VP, Enron Wholesale Services | 0.03 0.16 | 0.03 0.16 -0.01 | 0.03 0.16 -0.01 -0.00
D. Delainey - CEO, ENA and Enron Energy Services | 0.01 0.12 | 0.01 0.08 0.08 | 0.01 0.09 0.09 -0.00
S. Corman - VP, Regulatory Affairs | -0.00 0.08 | -0.00 0.08 -0.01 | -0.00 0.08 -0.00 0.20
M. Carson - Employee, Corporate and Environmental Policy | -0.00 0.07 | -0.00 0.09 -0.02 | -0.00 0.08 -0.02 -0.00
S. Scott - Employee, Transwestern Pipeline Company (ETS) | -0.00 0.08 | -0.00 0.08 -0.00 | -0.00 0.08 -0.00 0.04
J. Lavorato - CEO, Enron America | 0.02 0.12 | 0.02 -0.08 0.49 | 0.02 -0.04 0.49 0.00
M. Grigsby - Director, West Desk Gas Trading | 0.00 0.04 | 0.00 -0.04 0.20 | 0.00 -0.03 0.20 -0.00
G. Whalley - President, | 0.01 0.06 | 0.01 -0.03 0.19 | 0.01 -0.01 0.19 0.00
J. Steffes - VP, Government Affairs | 0.00 0.04 | 0.00 -0.04 0.19 | 0.00 -0.02 0.18 0.00
K. Presto - VP, East Power Trading | 0.01 0.01 | 0.01 -0.06 0.19 | 0.01 -0.05 0.18 0.00
S. Beck - COO, | 0.01 0.02 | 0.01 -0.05 0.17 | 0.01 -0.03 0.17 0.00
B. Tycholiz - VP, Marketing | 0.01 0.04 | 0.01 -0.03 0.17 | 0.01 -0.02 0.16 0.00
J. Arnold - VP, Financial Enron Online | 0.03 0.02 | 0.03 -0.05 0.16 | 0.03 -0.04 0.16 -0.00
J. Williamson - Executive Assistant, | 0.00 0.02 | 0.00 -0.03 0.14 | 0.00 -0.02 0.14 0.01
K. Watson - Employee, Transwestern Pipeline Company (ETS) | -0.00 0.01 | -0.00 0.00 0.01 | -0.00 -0.00 0.01 0.59
M. Lokay - Admin. Asst., Transwestern Pipeline Company (ETS) | -0.00 0.01 | -0.00 0.01 0.01 | -0.00 0.01 0.01 0.42
L. Donoho - Employee, Transwestern Pipeline Company (ETS) | -0.00 0.01 | -0.00 0.01 0.01 | -0.00 0.01 0.01 0.35
M. McConnell - Employee, Transwestern Pipeline Company (ETS) | 0.00 0.00 | 0.00 -0.00 0.01 | 0.00 -0.00 0.01 0.26
L. Blair - Employee, Northern Natural Gas Pipeline (ETS) | -0.00 0.01 | -0.00 0.01 0.00 | -0.00 0.00 0.00 0.22
K. Hyatt - Director, Asset Development TW Pipeline Business (ETS) | -0.00 0.02 | -0.00 0.02 0.00 | -0.00 0.01 0.00 0.20
D. Schoolcraft - Employee, Gas Control (ETS) | -0.00 0.00 | -0.00 0.00 0.00 | -0.00 0.00 0.00 0.18
T. Geaccone - Manager, (ETS) | 0.00 0.00 | 0.00 -0.00 0.01 | 0.00 -0.00 0.01 0.17
R. Hayslett - VP, Also CFO and Treasurer | 0.00 0.01 | 0.00 -0.00 0.02 | 0.00 -0.00 0.02 0.16

R matrix, p = 2:
438.3 12.1
15.3 291.9

R matrix, p = 3:
440.3 18.6 -0.9
19.7 292.5 168.4
-17.0 104.1 216.4

R matrix, p = 4:
440.2 1.6 -15.0 0.4
1.6 278.3 135.4 1.6
-29.3 70.7 201.6 -6.2
1.4 -4.6 -7.5 172.3

Table 2: Three-way DEDICOM results on the Enron email graph for three different decompositions, p = 2, 3, 4. The top 10 entries from all reported columns of A are listed in the table. Entries exceeding a threshold of 0.06 are highlighted.

D_t R D_t

October 2000:
22.2 0.1 -0.5 0.0
0.1 19.0 4.7 0.1
-0.9 2.5 3.6 -0.1
0.0 -0.2 -0.1 3.5

October 2001:
14.5 0.0 -0.9 0.0
0.0 4.1 5.5 0.1
-1.8 2.9 22.5 -0.7
0.1 -0.2 -0.8 19.1

Table 3: D_t R D_t matrices showing communication patterns for October 2000 and October 2001.

that it identifies some people who were pretty much purely of a certain type and other people who had mixed characteristics. For example, a given person might “load” on both an executive and a lawyer component or aspect, and thus show email exchanges resembling each of these two roles to some extent.

The entries in matrix R describe the communication patterns between groups of the same and different type. They show how a particular person’s combination of roles or attributes influences the pattern of messages he/she exchanges with particular other employees given the other employee’s roles or attributes. The R matrix is asymmetric and offers an idealized version of a directed graph involving the components identified in A.
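To make the role of R concrete, the following minimal sketch scores a directed pair of employees by the bilinear form a_i^T R a_j, using the p = 4 loadings and R matrix from Table 2. It assumes that rows index senders and columns index recipients, and it ignores the time scales in D (each D_t is treated as the identity), so the scores are relative indicators only, not message counts. The helper name `pair_score` is hypothetical.

```python
# Rows of A (p = 4 columns) for two employees, copied from Table 2.
A_ROWS = {
    "J. Dasovich": [-0.01, 0.58, 0.06, 0.01],
    "L. Kitchen": [0.11, -0.09, 0.53, 0.00],
}

# R matrix for p = 4, copied from Table 2.
R = [
    [440.2, 1.6, -15.0, 0.4],
    [1.6, 278.3, 135.4, 1.6],
    [-29.3, 70.7, 201.6, -6.2],
    [1.4, -4.6, -7.5, 172.3],
]


def pair_score(sender, recipient):
    """Return a_i^T R a_j for a sender i and recipient j (hypothetical helper)."""
    a_i = A_ROWS[sender]
    a_j = A_ROWS[recipient]
    # Assumption: time scales are ignored, i.e. D_t is taken as the identity.
    return sum(
        a_i[r] * R[r][s] * a_j[s]
        for r in range(len(R))
        for s in range(len(R))
    )


forward = pair_score("J. Dasovich", "L. Kitchen")
backward = pair_score("L. Kitchen", "J. Dasovich")
print(f"Dasovich -> Kitchen: {forward:.1f}")
print(f"Kitchen -> Dasovich: {backward:.1f}")
```

Because R is asymmetric, the two directions receive different scores; with a symmetric R they would coincide.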

In addition, three-way DEDICOM shows the associated communication patterns over time in the tensor D. The scales in each D_t show the strength of participation of a particular group for time period t.
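As a consistency check on how the scales act, the sketch below rebuilds the October 2000 matrix of Table 3 from the p = 4 R matrix of Table 2. It assumes that D_t is diagonal, so that entry (r, s) of D_t R D_t equals d_r R_rs d_s, and that Table 3 was computed from the p = 4 solution. The fitted scales are not tabulated here, so each d_r is recovered from the diagonal of Table 3 and is only approximate, since the table is rounded to one decimal.

```python
import math

# R matrix for p = 4, copied from Table 2.
R = [
    [440.2, 1.6, -15.0, 0.4],
    [1.6, 278.3, 135.4, 1.6],
    [-29.3, 70.7, 201.6, -6.2],
    [1.4, -4.6, -7.5, 172.3],
]

# Diagonal of the October 2000 matrix in Table 3.
DIAG_OCT_2000 = [22.2, 19.0, 3.6, 3.5]

# With a diagonal D_t, entry (r, r) is d_r**2 * R[r][r], so d_r follows directly.
# Assumption: the scales are nonnegative.
scales = [math.sqrt(value / R[r][r]) for r, value in enumerate(DIAG_OCT_2000)]

# Entry (r, s) of D_t R D_t is d_r * R[r][s] * d_s.
scaled = [
    [scales[r] * R[r][s] * scales[s] for s in range(len(R))]
    for r in range(len(R))
]

for row in scaled:
    print("  ".join(f"{value:6.2f}" for value in row))
```

Under these assumptions the off-diagonal entries agree with Table 3 to within rounding, which suggests that the month-to-month differences in Table 3 come entirely from the group scales rather than from a change in R.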

In the present study, we investigated a semantic graph with edges labeled by time. The third mode need not be time: our semantic graph could instead have distinguished different types of communication media (e.g., email, phone, and mail communications). An analysis with three-way DEDICOM would then represent information about the vertices across all forms of communication (appropriately scaled by slices of D) in the A and R matrices.
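A minimal sketch of how such a labeled graph can be arranged as a three-way array, with one adjacency matrix (slice) per edge label. The edge list, the vertex names, and the function `build_slices` are hypothetical; any label (a month, a communication medium) can play the role of the third mode.

```python
from collections import defaultdict


def build_slices(edges, vertices):
    """Return {label: n-by-n count matrix}, one adjacency slice per edge label."""
    index = {name: i for i, name in enumerate(vertices)}
    n = len(vertices)
    slices = defaultdict(lambda: [[0] * n for _ in range(n)])
    for sender, recipient, label in edges:
        # Row = sender, column = recipient, so each slice is a directed adjacency matrix.
        slices[label][index[sender]][index[recipient]] += 1
    return dict(slices)


# Hypothetical toy data: (sender, recipient, medium).
vertices = ["ann", "bob", "cat"]
edges = [
    ("ann", "bob", "email"),
    ("ann", "bob", "email"),
    ("bob", "cat", "phone"),
    ("cat", "ann", "email"),
]

tensor = build_slices(edges, vertices)
for label, matrix in sorted(tensor.items()):
    print(label, matrix)
```

Keeping the slices keyed by label makes it easy to swap the third mode (time versus medium) without touching the vertex indexing shared by all slices.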

Furthermore, DEDICOM is not limited to the analysis of sociometric and intercommunication data; DEDICOM may

[Figure: graph of communication among the four groups (Legal; Government & regulatory affairs; Trade executives; Pipeline employees), with labeled edge weights 135.4 and 70.7]

• Mostly communication within roles
• Some large exchanges
• Negative values complicate interpretation
- Non-negative factorization being investigated

Temporal Patterns

[Plot: x-axis "Month" (monthly ticks starting at N, D, with January ticks labeled 99, 00, 01, 02, and ending at J); y-axis "Normalized scale", 0 to 0.35; legend: Group 1, Group 2, Group 3, Group 4]

Figure 2: Scales in D indicate the strength of participation of each group’s communication over time.

derive useful information from any directed graph. New possibilities include analyzing a network of web traffic between servers over time or perhaps a web/citation graph, where edges convey authority among vertices. A third mode enters when the 2-way data are categorized by time, demographic, click number, or some other feature of the data.

Finally, we suggest a few extensions to the DEDICOM model and its application in data mining that we intend to pursue. First, constrained DEDICOM [23] is an extension of DEDICOM that was suggested in the 1990s and has been pursued more recently. The idea is to put constraints on the A factors themselves so that the columns of A lie in a prescribed column space. For example, in the email graph, one might want to impose a constraint on the first column of A so that it contains only the top executives. Many other variations are possible. This procedure allows for including domain knowledge or incorporating human understanding into the problem. Kiers and Takane [23] offered an algorithm for handling different subspace constraints on A. More recently, Rocci [33] proposed a new algorithm for fitting any constrained DEDICOM model.

Second, a nonnegative factorization of DEDICOM, where A and/or R are nonnegative, would preserve the nonnegativity of the data, which could be desirable in some domains and applications.

Third, DEDICOM has been applied to skew-symmetric data [17] and has yielded some benefits. There might be ways to apply this technique to semantic graphs as well.
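For context, any square matrix splits uniquely into a symmetric part and a skew-symmetric part, X = (X + X^T)/2 + (X - X^T)/2, and the skew-symmetric part isolates the directional imbalance. The sketch below applies this standard identity to the p = 4 R matrix of Table 2, purely as an illustration; it is not the procedure of [17].

```python
# R matrix for p = 4, copied from Table 2.
R = [
    [440.2, 1.6, -15.0, 0.4],
    [1.6, 278.3, 135.4, 1.6],
    [-29.3, 70.7, 201.6, -6.2],
    [1.4, -4.6, -7.5, 172.3],
]

n = len(R)

# Symmetric part: the average flow between two components, regardless of direction.
sym = [[(R[i][j] + R[j][i]) / 2 for j in range(n)] for i in range(n)]

# Skew-symmetric part: half the difference between the two directions.
skew = [[(R[i][j] - R[j][i]) / 2 for j in range(n)] for i in range(n)]

for row in skew:
    print("  ".join(f"{value:7.2f}" for value in row))
```

For this matrix the largest imbalance is between components 2 and 3, whose two directions carry 135.4 and 70.7 respectively.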

7. REFERENCES

[1] E. Acar, S. A. Camtepe, M. S. Krishnamoorthy, and B. Yener. Modeling and multiway analysis of chatroom tensors. In Intelligence and Security Informatics: IEEE Intl. Conf. on Intelligence and Security Informatics, ISI 2005, volume 3495 of Lecture Notes in Computer Science, pages 256–268. Springer Verlag, 2005.

[2] B. W. Bader and T. G. Kolda. MATLAB tensor classes for fast algorithm prototyping. Technical Report SAND2004-5187, Sandia National Laboratories, Albuquerque, NM 87185 and Livermore, CA 94550, Oct. 2004. Submitted to ACM Trans. Math. Software.

[3] M. W. Berry and M. Browne. Email surveillance using nonnegative matrix factorization. In Workshop on Link Analysis, Counterterrorism and Security, SIAM Conf. on Data Mining, Newport Beach, CA, 2005.

[4] J. D. Carroll and J. J. Chang. Analysis of individual differences in multidimensional scaling via an N-way generalization of ‘Eckart-Young’ decomposition. Psychometrika, 35:283–319, 1970.

[5] A. Chapanond, M. S. Krishnamoorthy, and B. Yener. Graph theoretic and spectral analysis of Enron email data. In Workshop on Link Analysis, Counterterrorism and Security, SIAM Conf. on Data Mining, Newport Beach, CA, 2005.

[6] W. W. Cohen. Enron email dataset. Webpage. http://www.cs.cmu.edu/~enron/.

[7] J. E. Dennis, Jr. and R. B. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Prentice-Hall, Englewood Cliffs, NJ, 1983.

[8] J. Diesner and K. M. Carley. Exploration of communication networks from the Enron email corpus. In Workshop on Link Analysis, Counterterrorism and Security, SIAM Conf. on Data Mining, Newport Beach, CA, 2005.

[9] Federal Energy Regulatory Commission. FERC: Information released in Enron investigation. http://www.ferc.gov/industries/electric/indus-act/wec/enron/info-release.asp.

[10] C. W. Harris and H. F. Kaiser. Oblique factor analytic solutions by orthogonal transformations. Psychometrika, 29(4):347–362, 1964.

[11] R. A. Harshman. Foundations of the PARAFAC procedure: models and conditions for an “explanatory” multi-modal factor analysis. UCLA working papers in phonetics, 16:1–84, 1970.

[12] R. A. Harshman. Models for analysis of asymmetrical relationships among n objects or stimuli. In First Joint Meeting of the Psychometric Society and the Society for Mathematical Psychology, McMaster University, Hamilton, Ontario, August 1978. http://publish.uwo.ca/~harshman/asym1978.pdf.

[13] R. A. Harshman. Alternating least squares estimation for the single domain DEDICOM model, 1981. Unpublished technical memorandum, Bell Laboratories, Murray Hill, NJ. http://publish.uwo.ca/~harshman/asym1981.pdf.

[14] R. A. Harshman. DEDICOM: A family of models

Communication patterns over time

[Figure: roles versus time patterns for Legal, Government & regulatory affairs, Trade executives, and Pipeline employees; annotations: "Enron crisis breaks; investigation begins" and "Filed for bankruptcy"]

Summary

• Improvements to DEDICOM
- New procedure for finding A
- Newton step for finding D

• Modifications to handle large data arrays
- Compression

• Novel approach to social network analysis using DEDICOM
- Roles of employees
- Communication patterns among roles and over time

• Future research
- Nonnegative DEDICOM
- Constrained DEDICOM
- PARAFAC

More Information

• DEDICOM paper on Social Network Analysis:
- Tech report SAND2006-2161 available

• MATLAB Tensor Toolbox:
- http://csmr.ca.sandia.gov/~tgkolda/TensorToolbox
- Tech report SAND2004-5189 available on website
- Paper to appear in ACM Trans. Math. Softw.
- sparse_tensor class to be released soon

http://www.cs.sandia.gov/~bwbader/
[email protected]

