Testing for Strategic Interaction in
Social and Economic Network Formation
CDAR Risk Seminar, October 15th, 2019
Andrin Pelican
University of St. Gallen
Bryan S. Graham
University of California - Berkeley
Strategic Network Formation
Economic theory literature on network formation emphasizes strategic aspects (e.g., Jackson and Wolinsky, 1995).
Statistics literature focuses on simple probability models for exchangeable random graphs (e.g., stochastic block models, β-model).
Econometricians build off both approaches (e.g., Graham, 2017; Jochmans, 2018; Dzemski, 2018; Sheng, 2013; de Paula et al., 2018).
Strategic Network Formation (continued)
Few econometric models with both rich agent-level heterogeneity and strategic interaction (cf. Graham, 2016).
Today: Study testing for strategic interaction in a null model with unobserved heterogeneity and homophily.
Two key challenges: (i) finding the form of the locally best test (model is incomplete under the alternative; high dimensional nuisance parameters) and (ii) simulating its exact distribution under the null model.
This work is preliminary and comments are very welcome.
Basic Terms & Notation
• A directed graph G(𝒩, 𝒜) consists of a set of nodes 𝒩 = {1, . . . , N} and a list of ordered pairs of nodes called arcs/edges 𝒜 = {(i, j), (k, l), . . .} for i, j, k, l ∈ 𝒩.
• A graph is conveniently represented by its adjacency matrix D = [D_ij], where
\[
D_{ij} = \begin{cases} 1 & \text{if } (i,j) \in \mathcal{A} \\ 0 & \text{otherwise.} \end{cases} \tag{1}
\]
• No self-ties ⇒ D is a binary matrix with a diagonal of so-called structural zeros.
Utility
Let d ∈ 𝔻 be a feasible network. The utility agent i gets from some feasible network wiring d is
\[
\nu_i(\mathbf{d}_i, \mathbf{d}_{-i}; \mathbf{U}) = \sum_{j} d_{ij}\left[A_i + B_j + W_{ij}'\lambda_0 + \gamma_0 s_{ij}(\mathbf{d}) - U_{ij}\right],
\]
where:
1. Ai is a “sender effect” (out-degree heterogeneity);
2. Bj a “receiver” effect (in-degree heterogeneity);
Utility (continued)
3. W_ij′λ_0 = X_i′ Λ X_j, with X_i a vector of K community membership dummies (dim(λ_0) = K² × 1; parameterizes homophily);
4. s_ij(d) = s_ij(d − ij) = s_ij(d + ij) is a network/strategic effect; can be used to model:
(a) reciprocity: s_ij(d) = d_ji;
(b) transitivity: s_ij(d) = Σ_k d_ik d_kj;
5. {U_ij}, i ≠ j, is an idiosyncratic utility shifter (i.i.d. logistic).
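The two strategic terms can be computed for every dyad at once from the adjacency matrix. The sketch below is illustrative only: the function names and the example network are assumptions, not part of the slides. Because the diagonal of d is zero, neither term for dyad ij depends on d_ij itself, in line with s_ij(d − ij) = s_ij(d + ij).

```python
import numpy as np

def reciprocity_term(d):
    """s_ij(d) = d_ji: entry (i, j) equals 1 when j already sends a link to i."""
    return d.T.copy()

def transitivity_term(d):
    """s_ij(d) = sum_k d_ik d_kj: number of directed two-paths i -> k -> j."""
    return d @ d

# Hypothetical 3-node network with arcs 0->1, 0->2, 1->2, 2->0.
d = np.array([[0, 1, 1],
              [0, 0, 1],
              [1, 0, 0]])
s_recip = reciprocity_term(d)
s_trans = transitivity_term(d)
```

Only the off-diagonal entries of the returned matrices are meaningful, since self-ties are ruled out.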
Notation Redux
Out- and in-degree sequences equal
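\[
\mathbf{S} = \begin{pmatrix} \mathbf{S}_{\text{out}} \\ \mathbf{S}_{\text{in}} \end{pmatrix}' = \left(D_{1+}, \ldots, D_{N+}, D_{+1}, \ldots, D_{+N}\right).
\]
Here D_{+i} = Σ_j D_ji and D_{i+} = Σ_j D_ij equal the in- and out-degree of agents i = 1, . . . , N.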
The K ×K cross-link matrix equals
\[
\mathbf{M} = \sum_{i}\sum_{j} D_{ij} X_i X_j'
\]
This matrix summarizes the inter-group link structure in the network.
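As a small illustration of these definitions, the degree sequences are row and column sums of the adjacency matrix, and the cross-link matrix can be written as X′DX when the dummies X_i are stacked into an N × K matrix. The function names and the example data below are assumptions made for this sketch.

```python
import numpy as np

def degree_sequences(d):
    """Return (s_out, s_in): row sums D_{i+} and column sums D_{+i}."""
    return d.sum(axis=1), d.sum(axis=0)

def cross_link_matrix(d, x):
    """M = sum_i sum_j D_ij X_i X_j' = X' D X, with x the N x K dummy matrix."""
    return x.T @ d @ x

# Hypothetical network: arcs 0->1, 0->2, 1->2, 2->0; nodes 0 and 1 in group 0, node 2 in group 1.
d = np.array([[0, 1, 1],
              [0, 0, 1],
              [1, 0, 0]])
x = np.array([[1, 0],
              [1, 0],
              [0, 1]])
s_out, s_in = degree_sequences(d)
m = cross_link_matrix(d, x)
```

Entry (k, l) of `m` counts arcs sent from group k to group l, so the entries sum to the total number of arcs.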
Notation Redux (continued)
Let S,M be a degree sequence and cross-link matrix.
We say S, M is graphical if there exists at least one arc set 𝒜 such that G(𝒩, 𝒜) is a simple directed graph with degree sequence S and cross-link matrix M.
We call any such network a realization of S, M.
The set of all possible realizations of S, M is denoted by 𝔾_{S,M} (with 𝔻_{S,M} the corresponding set of adjacency matrices).
Network Game
d ∈ 𝔻, a candidate network wiring, is a pure strategy combination (each agent decides which, out of N − 1 choices, links to send).
A (pure strategy) Nash equilibrium (NE) is a pure strategy combination d* where, for U = u and all i = 1, . . . , N,
\[
\nu_i(\mathbf{d}_i^*, \mathbf{d}_{-i}^*; \mathbf{u}) \geq \nu_i(\mathbf{d}_i, \mathbf{d}_{-i}^*; \mathbf{u}) \tag{2}
\]
for all possible (other) linking strategies d_i.
We assume that D, the observed network, satisfies (2) at the realized U.
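To make condition (2) concrete, consider the reciprocity case s_ij(d) = d_ji. There agent i's utility is a sum over j of terms that do not depend on i's other links, so (2) can be checked one dyad at a time. The sketch below covers only this special case (transitivity would require comparing whole link vectors); the function name and the two-agent example are assumptions.

```python
import numpy as np

def is_nash_reciprocity(d, v, gamma, u):
    """Check condition (2) when s_ij(d) = d_ji.

    v[i, j] stands for A_i + B_j + W_ij' lambda; u[i, j] is the utility shifter.
    """
    n = d.shape[0]
    surplus = v + gamma * d.T - u          # marginal utility of the link i -> j
    off = ~np.eye(n, dtype=bool)           # ignore the structural zeros
    sends_ok = surplus[off & (d == 1)] >= 0
    withholds_ok = surplus[off & (d == 0)] <= 0
    return bool(sends_ok.all() and withholds_ok.all())

# Two agents, v = 0, gamma = 1, shifters of 0.5: links pay off only if reciprocated.
v = np.zeros((2, 2))
u = np.full((2, 2), 0.5)
empty = np.array([[0, 0], [0, 0]])
full = np.array([[0, 1], [1, 0]])
one_way = np.array([[0, 1], [0, 0]])
```

In this example both the empty and the full network satisfy (2), while the one-way network does not, which is the kind of multiplicity an equilibrium selection rule has to resolve.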
Equilibrium Selection
Let N_d(u; θ) be a function which assigns, for U = u, a probability weight to network or, equivalently, pure strategy combination d.
If d is the only network which satisfies (2), then N_d(u; θ) = 1.
If d is not a NE, then N_d(u; θ) = 0.
If there are multiple pure strategy NE, then N_d(u; θ) ≥ 0 for any d which is a NE and zero otherwise, subject to the constraint that
\[
\sum_{\mathbf{d} \in \mathbb{D}} \mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta) = 1.
\]
Equilibrium Selection (continued)
Nd (u; θ) corresponds to an equilibrium selection rule.
We do not impose any assumptions on the form of N_d(u; θ) (beyond those already outlined).
A feature of what follows is that the researcher can be very agnostic about equilibrium selection.
Likelihood
We can write the probability of observing network D = d as
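\[
P(\mathbf{d}; \theta) = \int_{\mathbf{u} \in \mathbb{R}^n} \mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta)\, f_{\mathbf{U}}(\mathbf{u})\, d\mathbf{u}
\]
where n = N(N − 1) is the number of directed dyads.
Here
\[
f_{\mathbf{U}}(\mathbf{u}) = \prod_{i \neq j} f_U(u_{ij}) \quad \text{with} \quad f_U(u) = \frac{e^u}{\left[1 + e^u\right]^2}
\]
the logistic density.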
Model Parameters
θ = (γ, δ′, π′)′ with
γ - parameter of interest (strategic interaction)
δ = (λ′, A′, B′)′ - homophily/heterogeneity
π - equilibrium selection parameter (abstract)
δ and π are (high dimensional) nuisance parameters
Testing for Strategic Interaction
Let ∆ denote a subset of the K² + 2N dimensional Euclidean space in which δ_0 is, a priori, known to lie, and
\[
\Theta_0 = \left\{ (\gamma, \delta', \pi') : \gamma = 0,\ \delta \in \Delta,\ \pi = 0 \right\}.
\]
Our null hypothesis is the composite one
\[
H_0 : \theta \in \Theta_0 \tag{3}
\]
since δ may range freely over ∆ ⊂ ℝ^{K²+2N} under the null.
Null Model
Null model is a variant of that studied by Graham (2017) and Jochmans (2018); also related to so-called degree corrected stochastic block model (e.g., Karrer and Newman, 2011).
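Under the null links are conditionally independent, with P_0(d; δ), defined as P(d; (0, δ′, 0′)′), equal to
\[
P_0(\mathbf{d}; \delta) = \prod_{i=1}^{N} \prod_{j \neq i}
\left[ \frac{\exp\left(W_{ij}'\lambda + R_i'\mathbf{A} + R_j'\mathbf{B}\right)}{1 + \exp\left(W_{ij}'\lambda + R_i'\mathbf{A} + R_j'\mathbf{B}\right)} \right]^{d_{ij}}
\times
\left[ \frac{1}{1 + \exp\left(W_{ij}'\lambda + R_i'\mathbf{A} + R_j'\mathbf{B}\right)} \right]^{1 - d_{ij}}
\]
with R_i an N × 1 vector with 1 as its ith element and zeros elsewhere.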
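A minimal numerical sketch of the null model follows. Since R_i′A = A_i and R_j′B = B_j, the index of dyad ij is A_i + B_j + W_ij′λ and each link is an independent logistic draw. The function names, and passing the N × N matrix of W_ij′λ values as `w_lambda`, are assumptions for illustration.

```python
import numpy as np

def null_link_probabilities(a, b, w_lambda):
    """Link probabilities under the null: logistic CDF of A_i + B_j + W_ij' lambda."""
    v = a[:, None] + b[None, :] + w_lambda
    p = 1.0 / (1.0 + np.exp(-v))
    np.fill_diagonal(p, 0.0)               # structural zeros: no self-ties
    return p

def null_log_likelihood(d, p):
    """log P_0(d; delta), summing over the N(N - 1) directed dyads only."""
    off = ~np.eye(d.shape[0], dtype=bool)
    return float(np.sum(d[off] * np.log(p[off])
                        + (1 - d[off]) * np.log1p(-p[off])))

# Hypothetical example: no heterogeneity and no homophily, so every link has probability 1/2.
n_agents = 3
a = np.zeros(n_agents)
b = np.zeros(n_agents)
w_lambda = np.zeros((n_agents, n_agents))
p = null_link_probabilities(a, b, w_lambda)
d = np.array([[0, 1, 1], [0, 0, 1], [1, 0, 0]])
loglik = null_log_likelihood(d, p)
```

With community dummies stacked in an N × K matrix `x` and a K × K matrix `lam`, `w_lambda` could be formed as `x @ lam @ x.T`.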
Null Model (continued)
Note that P0 (d; δ) equals
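\[
P_0(\mathbf{d}; \delta) = \int_{\mathbf{u} \in \mathbb{R}^n} \mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta)\, f_{\mathbf{U}}(\mathbf{u})\, d\mathbf{u}
\]
with
\[
\mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta) = \prod_{i} \prod_{j \neq i}
\mathbf{1}\left(A_i + B_j + W_{ij}'\lambda \geq u_{ij}\right)^{d_{ij}}
\times \mathbf{1}\left(A_i + B_j + W_{ij}'\lambda < u_{ij}\right)^{1 - d_{ij}}.
\]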
Things are more involved under the alternative where γ > 0!
Null Model: Exponential Family
The null model belongs to the exponential family:
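\[
P_0(\mathbf{d}; \delta) = c(\delta) \exp\left(\mathbf{t}'\delta\right)
\]
with a (minimally) sufficient statistic for δ of
\[
\mathbf{t} = \left( \operatorname{vec}(\mathbf{m}')',\ \mathbf{s}_{\text{out}}',\ \mathbf{s}_{\text{in}}' \right)'.
\]
In words, the K² + N + N sufficient statistics are (i) the cross-link matrix, (ii) the out-degree sequence and (iii) the in-degree sequence.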
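The exponential family form can be verified by taking logs of the null likelihood. The worked step below assumes that λ stacks the elements of Λ in the same order as vec(m′) stacks those of m; that ordering convention is an assumption of this sketch.

\[
\begin{aligned}
\log P_0(\mathbf{d}; \delta)
&= \sum_{i}\sum_{j \neq i} d_{ij}\left(X_i'\Lambda X_j + A_i + B_j\right)
 - \sum_{i}\sum_{j \neq i} \log\left[1 + \exp\left(X_i'\Lambda X_j + A_i + B_j\right)\right] \\
&= \sum_{k}\sum_{l} \Lambda_{kl}\, m_{kl} + \sum_{i} A_i\, d_{i+} + \sum_{j} B_j\, d_{+j} + \log c(\delta) \\
&= \mathbf{t}'\delta + \log c(\delta),
\qquad
c(\delta) = \prod_{i}\prod_{j \neq i} \left[1 + \exp\left(X_i'\Lambda X_j + A_i + B_j\right)\right]^{-1}.
\end{aligned}
\]

The second line uses Σ_i Σ_j d_ij X_i′ΛX_j = tr(Λ m′), so the data enter only through m, s_out and s_in.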
Null Model: Conditional Likelihood
Under H0 the conditional likelihood of D = d is
\[
P_0(\mathbf{d} \mid \mathbf{T} = \mathbf{t}) = \frac{1}{\left|\mathbb{D}_{s,m}\right|}.
\]
To simulate the distribution of a statistic under H0 we need to be able to draw adjacency matrices (i.e., networks) uniformly at random from the set 𝔻_{s,m}.
This is a non-trivial problem. See Blitzstein & Diaconis (2010)and Tao (2016).
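For very small networks the set of realizations can simply be enumerated, which makes the uniform conditional distribution tangible and shows why enumeration is hopeless in practice (there are 2^{N(N−1)} candidate networks). The brute-force sketch below is illustrative only; the function name and example are assumptions, and it is not the sampling method of the cited papers.

```python
import itertools
import numpy as np

def enumerate_realizations(s_out, s_in, m, x):
    """List every simple directed graph with the given degree sequences and cross-link matrix."""
    n = len(s_out)
    dyads = [(i, j) for i in range(n) for j in range(n) if i != j]
    found = []
    for bits in itertools.product((0, 1), repeat=len(dyads)):
        d = np.zeros((n, n), dtype=int)
        for (i, j), bit in zip(dyads, bits):
            d[i, j] = bit
        if (np.array_equal(d.sum(axis=1), s_out)
                and np.array_equal(d.sum(axis=0), s_in)
                and np.array_equal(x.T @ d @ x, m)):
            found.append(d)
    return found

# Hypothetical example: three agents in one group, each with in- and out-degree one.
s_out = np.array([1, 1, 1])
s_in = np.array([1, 1, 1])
x = np.ones((3, 1), dtype=int)
m = np.array([[3]])
realizations = enumerate_realizations(s_out, s_in, m, x)
conditional_probability = 1.0 / len(realizations)
```

Here the only realizations are the two orientations of the directed three-cycle, so each has conditional probability one half under the null.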
Test Formulation
In our setting, a test ϕ(D) will have size α if its null rejection probability (NRP) is less than or equal to α for all values of the nuisance parameter:
\[
\sup_{\theta \in \Theta_0} E_\theta\left[\phi(\mathbf{D})\right] = \sup_{\delta \in \Delta} E_\theta\left[\phi(\mathbf{D})\right] = \alpha.
\]
Since δ is high dimensional, size control is non-trivial (e.g., Moreira, 2009).
This motivates proceeding conditionally on T.
Let 𝕋 = {(s, m) : (s, m) is graphical} be the set of possible values of T.
Test Formulation (continued)
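For each t ∈ 𝕋 we form a test with the property that, for all θ ∈ Θ0,
\[
E_\theta\left[\phi(\mathbf{D}) \mid \mathbf{T} = \mathbf{t}\right] = \alpha.
\]
Such an approach ensures similarity of our test since, by iterated expectations,
\[
E_\theta\left[\phi(\mathbf{D})\right] = E_\theta\left[E_\theta\left[\phi(\mathbf{D}) \mid \mathbf{T}\right]\right] = \alpha
\]
for any θ ∈ Θ0 (cf. Ferguson, 1967).
By proceeding conditionally we ensure the NRP is unaffected by the value of δ.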
Test Formulation (continued)
By Ferguson (1967, Lemma 1, Section 3.6) T is a boundedly complete sufficient statistic for θ under the null.
By Ferguson (1967, Theorem 2, Section 5.4) every similar test will therefore take the form
\[
E_\theta\left[\phi(\mathbf{D}) \mid \mathbf{T} = \mathbf{t}\right] = \alpha
\]
for t ∈ 𝕋.
If we desire similarity we can/must take the conditional approach.
Alternative Model: Conditional Likelihood
Under the alternative of strategic interaction the conditional likelihood is
\[
P(\mathbf{d} \mid \mathbf{T} = \mathbf{t}; \theta) = \frac{P(\mathbf{d}; \theta)}{\sum_{\mathbf{v} \in \mathbb{D}_{s,m}} P(\mathbf{v}; \theta)}.
\]
This likelihood is complicated and (logically) cannot be evaluated without specifying an explicit equilibrium selection mechanism.
Locally Best Test
For each t ∈ 𝕋, we choose the critical function ϕ(D) to maximize the derivative of the (conditional) power function
β (γ, t) = E [ϕ (D)|T = t]
evaluated at γ = 0 subject to the (conditional) size constraint
Eθ [ϕ (D)|T = t] = α. (4)
Such a ϕ (D) is locally best (Ferguson, 1967, Section 5.5).
Locally Best Test (continued)
Differentiating the power function we get
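\[
\left. \frac{\partial \beta(\gamma, \mathbf{t})}{\partial \gamma} \right|_{\gamma = 0}
= E\left[ \phi(\mathbf{D})\, S_\gamma(\mathbf{D} \mid \mathbf{T}; \theta) \mid \mathbf{T} = \mathbf{t} \right] \tag{5}
\]
with S_γ(d | t; θ) the conditional score function
\[
\begin{aligned}
S_\gamma(\mathbf{d} \mid \mathbf{t}; \theta)
&= \frac{1}{P_0(\mathbf{d}; \delta)} \left. \frac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma = 0}
 - \sum_{\mathbf{v} \in \mathbb{D}_{s,m}} \left. \frac{\partial P(\mathbf{v}; \theta)}{\partial \gamma} \right|_{\gamma = 0} \\
&= \frac{1}{P_0(\mathbf{d}; \delta)} \left. \frac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma = 0} + k(\mathbf{t})
\end{aligned}
\]
and k(t) depending on the data only through T = t.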
Locally Best Test (continued)
By the Neyman-Pearson lemma the test with critical function
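\[
\phi(\mathbf{d}) =
\begin{cases}
1 & \text{if } \dfrac{1}{P_0(\mathbf{d}; \delta)} \left. \dfrac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma = 0} > c_\alpha(\mathbf{t}) \\[2ex]
g_\alpha(\mathbf{t}) & \text{if } \dfrac{1}{P_0(\mathbf{d}; \delta)} \left. \dfrac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma = 0} = c_\alpha(\mathbf{t}) \\[2ex]
0 & \text{if } \dfrac{1}{P_0(\mathbf{d}; \delta)} \left. \dfrac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma = 0} < c_\alpha(\mathbf{t})
\end{cases}
\]
where the values of c_α(t) and g_α(t) ∈ [0, 1] are chosen to satisfy (4), will be locally best.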
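Because the conditional null distribution is discrete, c_α(t) and the randomization probability g_α(t) can be read off once the statistic has been evaluated on each (equally likely) realization or simulated draw. The sketch below shows one way to do this; the function name and the toy inputs are assumptions, not the authors' code.

```python
import numpy as np

def critical_values(null_stats, alpha):
    """Return (c, g) with P(stat > c) + g * P(stat = c) = alpha under equal weights."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    stats = np.asarray(null_stats, dtype=float)
    # Scan distinct values from the largest down until the upper tail reaches alpha.
    for c in np.unique(stats)[::-1]:
        p_greater = np.mean(stats > c)
        p_at_least = np.mean(stats >= c)
        if p_at_least >= alpha:
            g = (alpha - p_greater) / (p_at_least - p_greater)
            return float(c), float(g)

# Toy null distribution: half the realizations give 0, half give 1.
c_alpha, g_alpha = critical_values([0, 0, 1, 1], alpha=0.25)
```

In the toy example the test never rejects outright; it rejects with probability g_α whenever the statistic equals c_α, which yields exact conditional size.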
Locally Best Test (continued)
Several (serious) implementation challenges:
1. Form of the likelihood gradient ∂P(d; θ)/∂γ evaluated at γ = 0 (incompleteness is an issue)?
2. Locally best test statistic may depend on nuisance parameters δ and π?
3. To find c_α(t) and g_α(t) we need to be able to simulate the (null) distribution of [1/P_0(D; δ)] ∂P(D; θ)/∂γ, evaluated at γ = 0, conditional on T = t.
Derivative Calculation: Buckets
Given the network d − ij agent i will direct a link to j if
\[
v_{ij} + \gamma s_{ij}(\mathbf{d}) \geq U_{ij}
\]
for v_ij = A_i + B_j + W_ij′λ.
In a given network the strategic interaction term s_ij(d) partitions the support of U_ij into two intervals
\[
\mathbb{R} = \left(-\infty,\ v_{ij} + \gamma s_{ij}(\mathbf{d})\right] \cup \left(v_{ij} + \gamma s_{ij}(\mathbf{d}),\ \infty\right).
\]
Similarly the set of all networks, 𝔻, partitions ℝ into a set of intervals 𝔹.
Derivative Calculation: Buckets (continued)
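Let
\[
\mathbb{S} = \left\{ \underline{s}, s_1, \ldots, s_M, \bar{s} \right\}
\]
be the set of possible values for the strategic interaction term s_ij(d), ordered from smallest to largest.
We call each element b ∈ 𝔹 a bucket. Buckets are naturally ordered:
\[
\mathbb{R} = \left(-\infty,\ v_{ij} + \gamma \underline{s}\right] \cup \left(v_{ij} + \gamma \underline{s},\ v_{ij} + \gamma s_1\right] \cup \cdots
\cup \left(v_{ij} + \gamma s_M,\ v_{ij} + \gamma \bar{s}\right] \cup \left(v_{ij} + \gamma \bar{s},\ \infty\right).
\]
All buckets, with the exception of the first and the last, we call inner buckets.
For any draw of the utility shifter we have U_ij ∈ b for some b ∈ 𝔹.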
Derivative Calculation: Buckets (continued)
If a realization of U_ij is in bucket b, we say U_ij falls in (or is in) b.
We suppress the dependence of the partition on ij in the notation.
Observe that for γ ≈ 0, the probability that U_ij falls into an inner bucket is close to zero.
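A first-order expansion makes this observation precise. Writing F_U for the logistic CDF and f_U for its density, and taking any inner bucket with endpoints v_ij + γs_m and v_ij + γs_{m+1} (the indexing of inner buckets by m is only illustrative here):

\[
\Pr\left( U_{ij} \in \left(v_{ij} + \gamma s_m,\ v_{ij} + \gamma s_{m+1}\right] \right)
= F_U\left(v_{ij} + \gamma s_{m+1}\right) - F_U\left(v_{ij} + \gamma s_m\right)
= \gamma \left(s_{m+1} - s_m\right) f_U\left(v_{ij}\right) + O\left(\gamma^2\right).
\]

Each inner bucket therefore has probability of order γ. Since the U_ij are independent across dyads, the probability that two given dyads both fall into inner buckets is of order γ².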
Derivative Calculation: Buckets (continued)
Let the boldface subscripts i = 1, 2, . . . index the n = N(N − 1) directed dyads in arbitrary order (e.g., i maps to some ij and vice versa).
Let b ∈ 𝔹ⁿ = 𝔹 × · · · × 𝔹 and U = (U_1, . . . , U_n)′.
We have that U ∈ b for some b ∈ 𝔹ⁿ, so that each element of the n-vector of utility shifters U falls into a bucket.
Derivative Calculation: Buckets (continued)
To understand these buckets consider U_ij ∈ (v_ij + γs_m, v_ij + γs_{m+1}].
At such a realization of U_ij (with γ > 0 and a link sent whenever v_ij + γs_ij(d) ≥ U_ij, as the utility function implies) it will be optimal for i to send a link to j in any network such that s_ij(d) ≥ s_{m+1}, and optimal to not send this link when s_ij(d) ≤ s_m.
Rewirings of the network which induce a shift of s_ij(d) from s_m to s_{m+1} change the incentives for i to send a link to j.
Hence each bucket defines a region in which the incentives to form a particular ij link may be sensitive to small re-wirings of the network.
Derivative Calculation: Likelihood (continued)
Using our bucket notation we can re-write the likelihood as:
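\[
P(\mathbf{d}; \theta) = \sum_{\mathbf{b} \in \mathbb{B}^n} \int_{\mathbf{u} \in \mathbf{b}} \mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta)\, f_{\mathbf{U}}(\mathbf{u})\, d\mathbf{u} \tag{6}
\]
For a given bucket combination b ∈ 𝔹ⁿ, the integral of N_d(u; θ) f_U(u) over u ∈ b gives the associated contribution to the likelihood of observing D = d.
Summation over all possible bucket combinations gives the overall likelihood of observing D = d.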
Derivative Calculation: Likelihood (continued)
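Let 𝔹̃ⁿ be the set of bucket configurations with two or more inner buckets. Define
\[
\tilde{P}(\mathbf{d}; \theta) = \sum_{\mathbf{b} \in \mathbb{B}^n \setminus \tilde{\mathbb{B}}^n} \int_{\mathbf{u} \in \mathbf{b}} \mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta)\, f_{\mathbf{U}}(\mathbf{u})\, d\mathbf{u}
\]
\[
Q(\mathbf{d}; \theta) = \sum_{\mathbf{b} \in \tilde{\mathbb{B}}^n} \int_{\mathbf{u} \in \mathbf{b}} \mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta)\, f_{\mathbf{U}}(\mathbf{u})\, d\mathbf{u}.
\]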
Trivially we have the decomposition
P (d; θ) = P̃ (d; θ) +Q (d; θ) .
Derivative Calculation
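To calculate ∂P(d; θ)/∂γ we show that for γ → 0
\[
P(\mathbf{d}; \theta) = \tilde{P}(\mathbf{d}; \theta) + O\left(\gamma^2\right).
\]
Furthermore we show that
\[
\left. \frac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma = 0}
= \left. \frac{\partial \tilde{P}(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma = 0}. \tag{7}
\]
Hence to derive the form of ∂P(d; θ)/∂γ at γ = 0 we need only calculate ∂P̃(d; θ)/∂γ at γ = 0.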
This calculation is non-trivial, but doable (i.e., it is tedious).
Derivative Calculation
Only need to worry about cases where (i) no draws of U_ij are in inner buckets or (ii) just one draw (out of n) is.
In the first case every player has a strictly dominant strategy.
Strong preferences: regardless of other players' actions it is either optimal, or not, to form specific links.
Network is uniquely defined: Nd (u; θ) is either zero or one.
Derivative Calculation
Second case: if all but one component of U falls into the first or last bucket, then the resulting network is uniquely defined except for the presence or absence of one edge, say, ij.
For any such draw of U, since all other links are formed according to a strictly dominant strategy, player i will either benefit from forming the link ij or not.
Hence N_d(u; θ) is either zero or one in this case as well.
Derivative Calculation
For small values of γ the derivative is driven by summands where the precise details of the (unspecified) equilibrium selection mechanism are not relevant.
Those summands where the form of N_d(u; θ) is germane contribute very little to the derivative when γ is small.
We are able to differentiate the likelihood with respect to the strategic interaction parameter and evaluate that derivative for small γ (specifically for γ = 0).
Derivative Calculation: Likelihood (continued)
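Lemma: P(d; θ) is twice differentiable with respect to γ at γ = 0. Its first derivative at γ = 0 is
\[
\left. \frac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma = 0}
= P_0(\mathbf{d}; \delta) \times \sum_{i \neq j} s_{ij}(\mathbf{d})
\left[ d_{ij} \frac{f_U(v_{ij})}{\int_{-\infty}^{v_{ij}} f_U(u)\, du}
 - \left(1 - d_{ij}\right) \frac{f_U(v_{ij})}{\int_{v_{ij}}^{\infty} f_U(u)\, du} \right].
\]
With a little manipulation we can simplify:
\[
\frac{1}{P_0(\mathbf{d}; \delta)} \left. \frac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma = 0}
= \sum_{i \neq j} \left[ d_{ij} - F_U(v_{ij}) \right] s_{ij}(\mathbf{d})
\]
where F_U(u) = e^u / [1 + e^u] is the logistic CDF.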
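The manipulation rests on a standard property of the logistic distribution, f_U(v) = F_U(v)[1 − F_U(v)]. Spelled out for a single dyad, with the density evaluated at v_ij:

\[
d_{ij} \frac{f_U(v_{ij})}{F_U(v_{ij})} - \left(1 - d_{ij}\right) \frac{f_U(v_{ij})}{1 - F_U(v_{ij})}
= d_{ij}\left[1 - F_U(v_{ij})\right] - \left(1 - d_{ij}\right) F_U(v_{ij})
= d_{ij} - F_U(v_{ij}).
\]

Multiplying by s_ij(d) and summing over dyads gives the simplified statistic.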
Operational Details
Locally best test statistic is large when links which have low probability under the null tend to form precisely where their “strategic utility” is high.
Controlling for heterogeneity appears to be important for power.
Lots of triangles vs. “surprising” triangles.
Operational Details
Although the form of the locally optimal statistic does not depend on π (equilibrium selection) it does depend on δ (heterogeneity).
Plugging in any δ ∈ ∆ results in an admissible test.
We take a “best guess” approach, replacing v_ij = A_i + B_j + W_ij′λ with its joint maximum likelihood estimate (JMLE) v̂_ij.
This is ad hoc, but appears to work well in practice.
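Given estimated indices v̂_ij and a chosen strategic term, the plug-in statistic is a sum of residuals d_ij − F_U(v̂_ij) weighted by s_ij(d). The sketch below assumes the estimates are already available as an N × N array; the function name, the example network and the use of zeros for `v_hat` are illustrative assumptions, and no estimation step is shown.

```python
import numpy as np

def locally_best_statistic(d, v_hat, s):
    """Sum over dyads i != j of [d_ij - F_U(v_hat_ij)] * s_ij(d)."""
    off = ~np.eye(d.shape[0], dtype=bool)
    p_hat = 1.0 / (1.0 + np.exp(-v_hat))   # logistic CDF evaluated at v_hat
    return float(np.sum((d[off] - p_hat[off]) * s[off]))

# Hypothetical transitive triad 0->1, 1->2, 0->2 with all fitted probabilities equal to 1/2.
d = np.array([[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]])
v_hat = np.zeros((3, 3))
s = d @ d                                   # transitivity: s_ij = sum_k d_ik d_kj
stat = locally_best_statistic(d, v_hat, s)
```

Only dyad (0, 2) has a nonzero transitivity term here, and its link is present, so the statistic is positive: a link formed where its strategic payoff was high.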
Operational Details (continued)
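For s = 1, . . . , S we draw (uniformly at random) V_s ∈ 𝔻_{s,m} and calculate
\[
\frac{1}{P_0(\mathbf{V}_s; \hat{\delta})} \left. \frac{\partial P\left(\mathbf{V}_s; (\gamma, \hat{\delta}', \pi')'\right)}{\partial \gamma} \right|_{\gamma = 0}.
\]
If
\[
\frac{1}{P_0(\mathbf{D}; \hat{\delta})} \left. \frac{\partial P\left(\mathbf{D}; (\gamma, \hat{\delta}', \pi')'\right)}{\partial \gamma} \right|_{\gamma = 0},
\]
observed in the network in hand, is greater than 95 percent of our simulated statistics, we reject the null of no strategic interaction.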
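The decision rule amounts to comparing the observed statistic with the empirical distribution of the simulated ones. A minimal sketch, with a hypothetical function name and made-up inputs, and without the randomization at the critical value:

```python
import numpy as np

def reject_null(observed, simulated, alpha=0.05):
    """Reject when the observed statistic exceeds at least a share 1 - alpha of the simulated ones."""
    simulated = np.asarray(simulated, dtype=float)
    share_below = np.mean(simulated < observed)
    return bool(share_below >= 1.0 - alpha)

# Made-up simulated statistics 0, 1, ..., 99.
simulated = np.arange(100)
decision_large = reject_null(99.5, simulated)
decision_typical = reject_null(50.0, simulated)
```

The share of simulated statistics at or above the observed value can likewise be reported as a simulated p-value.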
Simulation Algorithm
We begin with D and randomly rewire it, preserving the cross-link structure and degree sequence at each step.
Our MCMC converges to the null distribution, generating a uniform random draw from 𝔻_{S,M}.
Key references: Rao et al. (1996) and Tao (2015).
Our contribution is to also account for the cross-link group structure.
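To illustrate what a rewiring that preserves both the degree sequence and the cross-link matrix looks like, the sketch below implements a simple two-arc switch. It is a deliberately simplified stand-in, not the authors' algorithm: two-arc switches alone cannot reach every realization in the directed case (the two orientations of a directed three-cycle are not connected by any valid switch), which is why richer moves are needed. The function name and example data are assumptions.

```python
import numpy as np

def switch_step(d, groups, rng):
    """Propose replacing arcs i->j and k->l with i->l and k->j; keep d if the move is invalid."""
    arcs = np.argwhere(d == 1)
    (i, j), (k, l) = arcs[rng.choice(len(arcs), size=2, replace=False)]
    valid = (
        i != l and k != j                          # no self-ties
        and d[i, l] == 0 and d[k, j] == 0          # no duplicate arcs
        and (groups[j] == groups[l]                # same receiver group, or
             or groups[i] == groups[k])            # same sender group: M is unchanged
    )
    if valid:
        d = d.copy()
        d[i, j] = d[k, l] = 0
        d[i, l] = d[k, j] = 1
    return d

# Hypothetical starting network on four nodes in two groups.
rng = np.random.default_rng(0)
groups = np.array([0, 0, 1, 1])
d0 = np.zeros((4, 4), dtype=int)
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]:
    d0[i, j] = 1
d = d0
for _ in range(200):
    d = switch_step(d, groups, rng)
```

Each accepted switch leaves every in-degree, every out-degree and every cell of the cross-link matrix unchanged, so the chain never leaves the set of realizations it started in.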
Alternating Walks
Alternating Cycles
Schlaufen Sequences
Null: γ = 0, s_ij(d) = Σ_k d_ik d_kj
Alternative: γ = 0.3, s_ij(d) = Σ_k d_ik d_kj
Wrapping-Up
The presence of strategic interaction is central to many theories of network formation (and policy-relevant).
Estimation of such models is non-trivial.
This motivates the need for a method of testing for strategic interaction.
We propose one such method.
Much remains to be done.