Testing for Strategic Interaction in Social and Economic Network Formation

CDAR Risk Seminar, October 15th, 2019

Andrin Pelican

University of St. Gallen

Bryan S. Graham

University of California - Berkeley

Strategic Network Formation

Economic theory literature on network formation emphasizes strategic aspects (e.g., Jackson and Wolinsky, 1995).

Statistics literature focuses on simple probability models for exchangeable random graphs (e.g., stochastic block models, β-model).

Econometricians build off both approaches (e.g., Graham, 2017; Jochmans, 2018; Dzemski, 2018; Sheng, 2013; de Paula et al., 2018).

Strategic Network Formation (continued)

Few econometric models feature both rich agent-level heterogeneity and strategic interaction (cf. Graham, 2016).

Today: Study testing for strategic interaction in a null model with unobserved heterogeneity and homophily.

Two key challenges: (i) finding the form of the locally best test (model is incomplete under the alternative; high dimensional nuisance parameters) and (ii) simulating its exact distribution under the null model.

This work is preliminary and comments are very welcome.


Basic Terms & Notation

• A directed graph $G(\mathcal{N}, \mathcal{A})$ consists of a set of nodes $\mathcal{N} = \{1, \ldots, N\}$ and a list of ordered pairs of nodes called arcs/edges $\mathcal{A} = \{(i,j), (k,l), \ldots\}$ for $i, j, k, l \in \mathcal{N}$.

• A graph is conveniently represented by its adjacency matrix $\mathbf{D} = [D_{ij}]$ where

\[
D_{ij} = \begin{cases} 1 & \text{if } (i,j) \in \mathcal{A} \\ 0 & \text{otherwise.} \end{cases} \tag{1}
\]

• No self-ties ⇒ $\mathbf{D}$ is a binary matrix with a diagonal of so-called structural zeros.
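As a small illustration of definition (1), the sketch below builds an adjacency matrix from a list of arcs. The function name `adjacency_matrix` and the 0-based node labels are assumptions made for the example; the slides label nodes 1, ..., N.

```python
def adjacency_matrix(n_nodes, arcs):
    """Build the N x N adjacency matrix of a simple directed graph.

    `arcs` is an iterable of ordered pairs (i, j) with 0-based node labels
    (an assumption of this sketch; the slides use labels 1, ..., N).
    """
    d = [[0] * n_nodes for _ in range(n_nodes)]
    for i, j in arcs:
        if i == j:
            # The diagonal holds structural zeros: self-ties are ruled out.
            raise ValueError("self-ties are not allowed")
        d[i][j] = 1
    return d


arcs = [(0, 1), (1, 0), (1, 2), (2, 0)]
D = adjacency_matrix(3, arcs)
for row in D:
    print(row)
```

Because the pairs are ordered, the arc (1, 2) sets D[1][2] but leaves D[2][1] at zero.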


Utility

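Let $\mathbf{d} \in \mathbb{D}$ be a feasible network. The utility agent $i$ gets from some feasible network wiring $\mathbf{d}$ is

\[
\nu_i(\mathbf{d}_i, \mathbf{d}_{-i}; \mathbf{U}) = \sum_j d_{ij}\left[A_i + B_j + W_{ij}'\lambda_0 + \gamma_0 s_{ij}(\mathbf{d}) - U_{ij}\right],
\]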

where:

1. Ai is a “sender effect” (out-degree heterogeneity);

2. Bj is a “receiver effect” (in-degree heterogeneity);

Utility (continued)

3. $W_{ij}'\lambda_0 = X_i'\Lambda X_j$, with $X_i$ a vector of $K$ community membership dummies ($\dim(\lambda_0) = K^2 \times 1$ parameterizes homophily);

4. $s_{ij}(\mathbf{d}) = s_{ij}(\mathbf{d} - ij) = s_{ij}(\mathbf{d} + ij)$ is a network/strategic effect; it can be used to model:

(a) reciprocity: $s_{ij}(\mathbf{d}) = d_{ji}$;

(b) transitivity: $s_{ij}(\mathbf{d}) = \sum_k d_{ik} d_{kj}$;

5. $\{U_{ij}\}_{i \neq j}$ is an idiosyncratic utility shifter (i.i.d. logistic).
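The two strategic terms listed above can be computed directly from an adjacency matrix. The sketch below is illustrative only: the function names `reciprocity` and `transitivity` and the example network are assumptions, and nodes are 0-based.

```python
def reciprocity(d, i, j):
    """s_ij(d) = d_ji: does j already send a link to i?"""
    return d[j][i]


def transitivity(d, i, j):
    """s_ij(d) = sum_k d_ik * d_kj: number of directed two-paths i -> k -> j."""
    return sum(d[i][k] * d[k][j] for k in range(len(d)))


# Hypothetical network with arcs 0->1, 0->2, 1->2, 2->0.
D = [[0, 1, 1],
     [0, 0, 1],
     [1, 0, 0]]

print(reciprocity(D, 0, 2))   # uses d_20
print(transitivity(D, 0, 2))  # counts the two-path 0 -> 1 -> 2
```

Because the diagonal of the adjacency matrix is zero, neither term depends on whether the ij link itself is present, which is the property $s_{ij}(\mathbf{d} - ij) = s_{ij}(\mathbf{d} + ij)$ stated on the slide.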


Notation Redux

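Out- and in-degree sequences equal

\[
\mathbf{S} = \begin{pmatrix} \mathbf{S}_{\text{out}} & \mathbf{S}_{\text{in}} \end{pmatrix}' = \begin{pmatrix} D_{1+}, \ldots, D_{N+} \\ D_{+1}, \ldots, D_{+N} \end{pmatrix}.
\]

Here $D_{+i} = \sum_j D_{ji}$ and $D_{i+} = \sum_j D_{ij}$ equal the in- and out-degree of agent $i = 1, \ldots, N$.

The $K \times K$ cross-link matrix equals

\[
\mathbf{M} = \sum_i \sum_j D_{ij} X_i X_j'
\]

This matrix summarizes the inter-group link structure in the network.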
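A minimal sketch of these two summaries follows. It assumes each agent belongs to exactly one community, so that `group[i]` (a hypothetical name) gives the position of the single 1 in the dummy vector $X_i$; under that assumption entry (g, h) of M counts the arcs sent from community g to community h.

```python
def degree_sequences(d):
    """Return (out-degrees, in-degrees): row sums and column sums of d."""
    n = len(d)
    s_out = [sum(d[i]) for i in range(n)]
    s_in = [sum(d[i][j] for i in range(n)) for j in range(n)]
    return s_out, s_in


def cross_link_matrix(d, group, n_groups):
    """M = sum_ij D_ij X_i X_j' when X_i is the dummy vector for group[i]."""
    m = [[0] * n_groups for _ in range(n_groups)]
    for i, row in enumerate(d):
        for j, d_ij in enumerate(row):
            m[group[i]][group[j]] += d_ij
    return m


# Hypothetical 4-agent network with two communities.
D = [[0, 1, 1, 0],
     [1, 0, 0, 1],
     [0, 0, 0, 1],
     [1, 0, 0, 0]]
group = [0, 0, 1, 1]

s_out, s_in = degree_sequences(D)
M = cross_link_matrix(D, group, 2)
print(s_out, s_in, M)
```

The entries of M sum to the total number of arcs, as do the out-degrees and the in-degrees.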


Notation Redux (continued)

Let S,M be a degree sequence and cross-link matrix.

We say S,M is graphical if there exists at least one arc set $\mathcal{A}$ such that $G(\mathcal{N}, \mathcal{A})$ is a simple directed graph with degree sequence S and cross-link matrix M.

We call any such network a realization of S,M.

The set of all possible realizations of S,M is denoted by $\mathbb{G}_{\mathbf{S},\mathbf{M}}$ ($\mathbb{D}_{\mathbf{S},\mathbf{M}}$).

Network Game

$\mathbf{d} \in \mathbb{D}$, a candidate network wiring, is a pure strategy combination (each agent decides which, out of $N - 1$ choices, links to send).

A (pure strategy) Nash equilibrium (NE) is a pure strategy combination $\mathbf{d}^*$ where, for $\mathbf{U} = \mathbf{u}$ and all $i = 1, \ldots, N$,

\[
\nu_i(\mathbf{d}_i^*, \mathbf{d}_{-i}^*; \mathbf{u}) \geq \nu_i(\mathbf{d}_i, \mathbf{d}_{-i}^*; \mathbf{u}) \tag{2}
\]

for all possible (other) linking strategies $\mathbf{d}_i$.

We assume that $\mathbf{D}$, the observed network, satisfies (2) at the realized $\mathbf{U}$.

Equilibrium Selection

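Let $\mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta)$ be a function which assigns, for $\mathbf{U} = \mathbf{u}$, a probability weight to network or, equivalently, pure strategy combination $\mathbf{d}$.

If $\mathbf{d}$ is the only network which satisfies (2), then $\mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta) = 1$.

If $\mathbf{d}$ is not a NE, then $\mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta) = 0$.

If there are multiple pure strategy NE, then $\mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta) \geq 0$ for any $\mathbf{d}$ which is a NE and zero otherwise, subject to the constraint that

\[
\sum_{\mathbf{d} \in \mathbb{D}} \mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta) = 1.
\]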

Equilibrium Selection (continued)

$\mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta)$ corresponds to an equilibrium selection rule.

We do not impose any assumptions on the form of $\mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta)$ (beyond those already outlined).

A feature of what follows is that the researcher can be very agnostic about equilibrium selection.

Likelihood

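We can write the probability of observing network $\mathbf{D} = \mathbf{d}$ as

\[
P(\mathbf{d}; \theta) = \int_{\mathbf{u} \in \mathbb{R}^n} \mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta) f_{\mathbf{U}}(\mathbf{u})\, d\mathbf{u}
\]

where $n = N(N - 1)$ is the number of directed dyads.

Here $f_{\mathbf{U}}(\mathbf{u}) = \prod_{i \neq j} f_U(u_{ij})$ with

\[
f_U(u) = e^u / [1 + e^u]^2
\]

the logistic density.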

Model Parameters

θ =(γ, δ′, π′)′ with

γ - parameter of interest (strategic interaction)

δ =(λ′,A′,B′)′ – homophily/heterogeneity

π - equilibrium selection parameter (abstract)

δ and π are (high dimensional) nuisance parameters


Testing for Strategic Interaction

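Let $\Delta$ denote a subset of the $K^2 + 2N$ dimensional Euclidean space in which $\delta_0$ is, a priori, known to lie, and

\[
\Theta_0 = \left\{ (\gamma, \delta', \pi')' : \gamma = 0,\ \delta \in \Delta,\ \pi = 0 \right\}.
\]

Our null hypothesis is the composite one

\[
H_0 : \theta \in \Theta_0 \tag{3}
\]

since $\delta$ may range freely over $\Delta \subset \mathbb{R}^{K^2 + 2N}$ under the null.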

Null Model

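Null model is a variant of that studied by Graham (2017) and Jochmans (2018); also related to so-called degree corrected stochastic block model (e.g., Karrer and Newman, 2011).

Under the null links are conditionally independent with $P_0(\mathbf{d}; \delta) \overset{def}{\equiv} P(\mathbf{d}; (0, \delta', 0')')$ equal to

\[
P_0(\mathbf{d}; \delta) = \prod_{i=1}^{N} \prod_{j \neq i} \left[ \frac{\exp(W_{ij}'\lambda + R_i'\mathbf{A} + R_j'\mathbf{B})}{1 + \exp(W_{ij}'\lambda + R_i'\mathbf{A} + R_j'\mathbf{B})} \right]^{d_{ij}} \times \left[ \frac{1}{1 + \exp(W_{ij}'\lambda + R_i'\mathbf{A} + R_j'\mathbf{B})} \right]^{1 - d_{ij}}
\]

with $R_i$ an $N \times 1$ vector with 1 as its $i$th element and zeros elsewhere.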
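To make the null likelihood concrete, here is a minimal sketch that evaluates its logarithm. All names (`link_index`, `null_log_likelihood`, `group`, `lam`) are hypothetical; the sketch assumes one community per agent, so that $W_{ij}'\lambda$ reduces to the entry of $\Lambda$ for the pair of groups of i and j.

```python
import math


def logistic_cdf(x):
    """F_U(x) = e^x / (1 + e^x)."""
    return 1.0 / (1.0 + math.exp(-x))


def link_index(a, b, lam, group):
    """v_ij = A_i + B_j + X_i' Lambda X_j for every ordered pair (i, j)."""
    n = len(a)
    return [[a[i] + b[j] + lam[group[i]][group[j]] for j in range(n)]
            for i in range(n)]


def null_log_likelihood(d, v):
    """log P_0(d; delta): independent logistic links over the i != j dyads."""
    n = len(d)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # structural zero, not part of the likelihood
            p = logistic_cdf(v[i][j])
            total += math.log(p) if d[i][j] else math.log(1.0 - p)
    return total


# Hypothetical example: no heterogeneity, so every link has probability 1/2.
D = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
v = link_index([0.0] * 3, [0.0] * 3, [[0.0]], [0, 0, 0])
print(null_log_likelihood(D, v))
```

With all parameters at zero, each of the six directed dyads contributes log(1/2), whichever network is observed.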

Null Model (continued)

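Note that $P_0(\mathbf{d}; \delta)$ equals

\[
P_0(\mathbf{d}; \delta) = \int_{\mathbf{u} \in \mathbb{R}^n} \mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta) f_{\mathbf{U}}(\mathbf{u})\, d\mathbf{u}
\]

with

\[
\mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta) = \prod_i \prod_{j \neq i} \mathbf{1}(A_i + B_j + W_{ij}'\lambda \geq u_{ij})^{d_{ij}} \times \mathbf{1}(A_i + B_j + W_{ij}'\lambda < u_{ij})^{1 - d_{ij}}.
\]

Things are more involved under the alternative where $\gamma > 0$!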

Null Model: Exponential Family

The null model belongs to the exponential family:

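\[
P_0(\mathbf{d}; \delta) = c(\delta) \exp(\mathbf{t}'\delta)
\]

with a (minimally) sufficient statistic for $\delta$ of

\[
\mathbf{t} = \left( \operatorname{vec}(\mathbf{m}')', \mathbf{s}_{\text{out}}', \mathbf{s}_{\text{in}}' \right)'.
\]

In words, the $K^2 + N + N$ sufficient statistics are (i) the cross-link matrix, (ii) the out-degree sequence and (iii) the in-degree sequence.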
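One way to see where these sufficient statistics come from is to take logs of the null likelihood and collect terms. The steps below are a sketch under the assumption (not stated on the slides) that $\lambda = \operatorname{vec}(\Lambda')$, so that $W_{ij}'\lambda = X_i'\Lambda X_j$:

\[
\begin{aligned}
\ln P_0(\mathbf{d}; \delta) &= \sum_{i \neq j} d_{ij}\left( X_i'\Lambda X_j + A_i + B_j \right) - \sum_{i \neq j} \ln\left[ 1 + \exp\left( X_i'\Lambda X_j + A_i + B_j \right) \right], \\
\sum_{i \neq j} d_{ij} X_i'\Lambda X_j &= \operatorname{tr}\Big( \Lambda \sum_{i \neq j} d_{ij} X_j X_i' \Big) = \operatorname{tr}(\Lambda \mathbf{m}') = \operatorname{vec}(\mathbf{m}')'\lambda, \\
\sum_{i \neq j} d_{ij} A_i &= \mathbf{s}_{\text{out}}'\mathbf{A}, \qquad \sum_{i \neq j} d_{ij} B_j = \mathbf{s}_{\text{in}}'\mathbf{B}.
\end{aligned}
\]

The first sum is therefore $\mathbf{t}'\delta$, while the second does not involve $\mathbf{d}$ and gives $c(\delta) = \prod_{i \neq j} \left[ 1 + \exp(X_i'\Lambda X_j + A_i + B_j) \right]^{-1}$.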


Null Model: Conditional Likelihood

Under $H_0$ the conditional likelihood of $\mathbf{D} = \mathbf{d}$ is

\[
P_0(\mathbf{d} \mid \mathbf{T} = \mathbf{t}) = \frac{1}{|\mathbb{D}_{\mathbf{s},\mathbf{m}}|}.
\]

To simulate the distribution of a statistic under $H_0$ we need to be able to draw adjacency matrices (i.e., networks) uniformly at random from the set $\mathbb{D}_{\mathbf{s},\mathbf{m}}$.

This is a non-trivial problem. See Blitzstein & Diaconis (2010) and Tao (2016).
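For very small networks the reference set can simply be enumerated, which makes the uniform conditional distribution tangible. The brute-force sketch below is an illustration only (the function name `realizations` and the `group` encoding are assumptions); it visits all $2^{N(N-1)}$ candidate networks, so it is hopeless beyond a handful of agents, which is why a sampling algorithm is needed.

```python
from itertools import product


def realizations(s_out, s_in, m, group):
    """Enumerate every simple digraph with the given degree sequences and
    cross-link matrix by checking all 2^(N(N-1)) candidate networks."""
    n, k = len(s_out), len(m)
    dyads = [(i, j) for i in range(n) for j in range(n) if i != j]
    found = []
    for bits in product((0, 1), repeat=len(dyads)):
        d = [[0] * n for _ in range(n)]
        for (i, j), bit in zip(dyads, bits):
            d[i][j] = bit
        out_deg = [sum(row) for row in d]
        in_deg = [sum(d[i][j] for i in range(n)) for j in range(n)]
        cross = [[0] * k for _ in range(k)]
        for i, j in dyads:
            cross[group[i]][group[j]] += d[i][j]
        if out_deg == s_out and in_deg == s_in and cross == m:
            found.append(d)
    return found


# Hypothetical example: 3 agents in one community, all degrees equal to 1.
found = realizations([1, 1, 1], [1, 1, 1], [[3]], [0, 0, 0])
print(len(found), 1 / len(found))
```

In this example the only realizations are the two directed 3-cycles, so each has conditional probability 1/2 under the null.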


Test Formulation

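In our setting, a test $\phi(\mathbf{D})$ will have size $\alpha$ if its null rejection probability (NRP) is less than or equal to $\alpha$ for all values of the nuisance parameter:

\[
\sup_{\theta \in \Theta_0} E_\theta[\phi(\mathbf{D})] = \sup_{\delta \in \Delta} E_\theta[\phi(\mathbf{D})] = \alpha.
\]

Since $\delta$ is high dimensional, size control is non-trivial (e.g., Moreira, 2009).

This motivates proceeding conditionally on $\mathbf{T}$.

Let $\mathbb{T} = \{(\mathbf{s}, \mathbf{m}) : \mathbf{s}, \mathbf{m} \text{ is graphical}\}$ be the set of possible values of $\mathbf{T}$.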

Test Formulation (continued)

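For each $\mathbf{t} \in \mathbb{T}$ we form a test with the property that, for all $\theta \in \Theta_0$,

\[
E_\theta[\phi(\mathbf{D}) \mid \mathbf{T} = \mathbf{t}] = \alpha.
\]

Such an approach ensures similarity of our test since, by iterated expectations,

\[
E_\theta[\phi(\mathbf{D})] = E_\theta\left[ E_\theta[\phi(\mathbf{D}) \mid \mathbf{T}] \right] = \alpha
\]

for any $\theta \in \Theta_0$ (cf. Ferguson, 1967).

By proceeding conditionally we ensure the NRP is unaffected by the value of $\delta$.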

Test Formulation (continued)

By Ferguson (1967, Lemma 1, Section 3.6) $\mathbf{T}$ is a boundedly complete sufficient statistic for $\theta$ under the null.

By Ferguson (1967, Theorem 2, Section 5.4) every similar test will therefore take the form

\[
E_\theta[\phi(\mathbf{D}) \mid \mathbf{T} = \mathbf{t}] = \alpha
\]

for $\mathbf{t} \in \mathbb{T}$.

If we desire similarity we can/must take the conditional approach.

Alternative Model: Conditional Likelihood

Under the alternative of strategic interaction the conditional likelihood is

\[
P(\mathbf{d} \mid \mathbf{T} = \mathbf{t}; \theta) = \frac{P(\mathbf{d}; \theta)}{\sum_{\mathbf{v} \in \mathbb{D}_{\mathbf{s},\mathbf{m}}} P(\mathbf{v}; \theta)}.
\]

This likelihood is complicated and (logically) cannot be evaluated without specifying an explicit equilibrium selection mechanism.

Locally Best Test

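For each $\mathbf{t} \in \mathbb{T}$, we choose the critical function $\phi(\mathbf{D})$ to maximize the derivative of the (conditional) power function

\[
\beta(\gamma, \mathbf{t}) = E[\phi(\mathbf{D}) \mid \mathbf{T} = \mathbf{t}]
\]

evaluated at $\gamma = 0$ subject to the (conditional) size constraint

\[
E_\theta[\phi(\mathbf{D}) \mid \mathbf{T} = \mathbf{t}] = \alpha. \tag{4}
\]

Such a $\phi(\mathbf{D})$ is locally best (Ferguson, 1967, Section 5.5).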

Locally Best Test (continued)

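Differentiating the power function we get

\[
\left. \frac{\partial \beta(\gamma, \mathbf{t})}{\partial \gamma} \right|_{\gamma=0} = E\left[ \phi(\mathbf{D})\, S_\gamma(\mathbf{D} \mid \mathbf{T}; \theta) \mid \mathbf{T} = \mathbf{t} \right] \tag{5}
\]

with $S_\gamma(\mathbf{d} \mid \mathbf{t}; \theta)$ the conditional score function

\[
\begin{aligned}
S_\gamma(\mathbf{d} \mid \mathbf{t}; \theta) &= \frac{1}{P_0(\mathbf{d}; \delta)} \left. \frac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma=0} - \sum_{\mathbf{v} \in \mathbb{D}_{\mathbf{s},\mathbf{m}}} \left. \frac{\partial P(\mathbf{v}; \theta)}{\partial \gamma} \right|_{\gamma=0} \\
&= \frac{1}{P_0(\mathbf{d}; \delta)} \left. \frac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma=0} + k(\mathbf{t})
\end{aligned}
\]

and $k(\mathbf{t})$ only depending on the data through $\mathbf{T} = \mathbf{t}$.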


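By the Neyman-Pearson lemma the test with critical function

\[
\phi(\mathbf{d}) = \begin{cases}
1 & \dfrac{1}{P_0(\mathbf{d}; \delta)} \left. \dfrac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma=0} > c_\alpha(\mathbf{t}) \\[2ex]
g_\alpha(\mathbf{t}) & \dfrac{1}{P_0(\mathbf{d}; \delta)} \left. \dfrac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma=0} = c_\alpha(\mathbf{t}) \\[2ex]
0 & \dfrac{1}{P_0(\mathbf{d}; \delta)} \left. \dfrac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma=0} < c_\alpha(\mathbf{t})
\end{cases}
\]

where the values of $c_\alpha(\mathbf{t})$ and $g_\alpha(\mathbf{t}) \in [0, 1]$ are chosen to satisfy (4), will be locally best.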


Several (serious) implementation challenges:

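1. Form of the likelihood gradient $\left. \frac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma=0}$ (incompleteness is an issue)?

2. Locally best test statistic may depend on nuisance parameters $\delta$ and $\pi$?

3. To find $c_\alpha(\mathbf{t})$ and $g_\alpha(\mathbf{t})$ we need to be able to simulate the (null) distribution of $\frac{1}{P_0(\mathbf{D}; \delta)} \left. \frac{\partial P(\mathbf{D}; \theta)}{\partial \gamma} \right|_{\gamma=0}$ conditional on $\mathbf{T} = \mathbf{t}$.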

Derivative Calculation: Buckets

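Given the network $\mathbf{d} - ij$ agent $i$ will direct a link to $j$ if

\[
v_{ij} + \gamma s_{ij}(\mathbf{d}) \geq U_{ij}
\]

for $v_{ij} = A_i + B_j + W_{ij}'\lambda$.

In a given network the strategic interaction term, $s_{ij}(\mathbf{d})$, partitions the image space of $U_{ij}$ into two intervals

\[
\mathbb{R} = \left( -\infty, v_{ij} + \gamma s_{ij}(\mathbf{d}) \right] \cup \left( v_{ij} + \gamma s_{ij}(\mathbf{d}), \infty \right).
\]

Similarly the set of all networks, $\mathbb{D}$, partitions $\mathbb{R}$ into a set of intervals $\mathbb{B}$.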

Derivative Calculation: Buckets (continued)

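Let $\mathbb{S} = \{\underline{s}, s_1, \ldots, s_M, \bar{s}\}$ be the set of possible values for the strategic interaction term $s_{ij}(\mathbf{d})$, ordered from smallest to largest.

We call each element $b \in \mathbb{B}$ a bucket; buckets are naturally ordered

\[
\mathbb{R} = \left( -\infty, v_{ij} + \gamma \underline{s} \right] \cup \left( v_{ij} + \gamma \underline{s}, v_{ij} + \gamma s_1 \right] \cup \cdots \cup \left( v_{ij} + \gamma s_M, v_{ij} + \gamma \bar{s} \right] \cup \left( v_{ij} + \gamma \bar{s}, \infty \right).
\]

All buckets, with the exception of the first and the last, we call inner buckets.

For any draw of the utility shifter we have $U_{ij} \in b$ for some $b \in \mathbb{B}$.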


If a realization of $U_{ij}$ is in bucket $b$, we say $U_{ij}$ falls in (or is in) $b$.

We suppress the dependence of the partition on $ij$ in the notation.

Observe that for $\gamma \approx 0$, the probability that $U_{ij}$ falls into an inner bucket is close to zero.
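The last observation can be made explicit with a first-order expansion. The sketch below writes $F_U$ for the logistic CDF and $f_U$ for its density, and takes $\gamma > 0$ small so the bucket endpoints are ordered as above:

\[
\Pr\left( U_{ij} \in (v_{ij} + \gamma s_m,\ v_{ij} + \gamma s_{m+1}] \right) = F_U(v_{ij} + \gamma s_{m+1}) - F_U(v_{ij} + \gamma s_m) = \gamma (s_{m+1} - s_m) f_U(v_{ij}) + O(\gamma^2).
\]

Each inner bucket therefore has probability of order $\gamma$. Since the $U_{ij}$ are independent across dyads, the probability that two or more of them land in inner buckets at the same time is of order $\gamma^2$.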



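Let the boldface subscripts $\mathbf{i} = \mathbf{1}, \mathbf{2}, \ldots$ index the $n = N(N - 1)$ directed dyads in arbitrary order (e.g., $\mathbf{i}$ maps to some $ij$ and vice-versa).

Let $\mathbf{b} \in \mathbb{B}^n = \mathbb{B} \times \cdots \times \mathbb{B}$ and $\mathbf{U} = (U_{\mathbf{1}}, \ldots, U_{\mathbf{n}})'$.

We have that $\mathbf{U} \in \mathbf{b}$ for $\mathbf{b} \in \mathbb{B}^n$ so that each element of the $n$-vector of utility shifters $\mathbf{U}$ falls into a bucket.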


To understand these buckets consider $U_{ij} \in \left( v_{ij} + \gamma s_m, v_{ij} + \gamma s_{m+1} \right]$.

At such a realization of $U_{ij}$ (and for $\gamma > 0$) it will be optimal for $i$ to send a link to $j$ in any network such that $s_{ij}(\mathbf{d}) \geq s_{m+1}$, and optimal to not send this link when $s_{ij}(\mathbf{d}) \leq s_m$.

Rewirings of the network which induce a shift of $s_{ij}(\mathbf{d})$ from $s_m$ to $s_{m+1}$ change the incentives for $i$ to send a link to $j$.

Hence each bucket defines a region in which the incentives to form a particular $ij$ link may be sensitive to small re-wirings of the network.

Derivative Calculation: Likelihood (continued)

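Using our bucket notation we can re-write the likelihood as:

\[
P(\mathbf{d}; \theta) = \sum_{\mathbf{b} \in \mathbb{B}^n} \int_{\mathbf{u} \in \mathbf{b}} \mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta) f_{\mathbf{U}}(\mathbf{u})\, d\mathbf{u} \tag{6}
\]

For a given bucket combination $\mathbf{b} \in \mathbb{B}^n$, $\int_{\mathbf{u} \in \mathbf{b}} \mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta) f_{\mathbf{U}}(\mathbf{u})\, d\mathbf{u}$ gives the associated contribution to the likelihood of observing $\mathbf{D} = \mathbf{d}$.

Summation over all possible bucket combinations gives the overall likelihood of observing $\mathbf{D} = \mathbf{d}$.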


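Let $\tilde{\mathbb{B}}^n$ be the set of bucket configurations with two or more inner buckets. Define

\[
\begin{aligned}
\tilde{P}(\mathbf{d}; \theta) &= \sum_{\mathbf{b} \in \mathbb{B}^n \setminus \tilde{\mathbb{B}}^n} \int_{\mathbf{u} \in \mathbf{b}} \mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta) f_{\mathbf{U}}(\mathbf{u})\, d\mathbf{u} \\
Q(\mathbf{d}; \theta) &= \sum_{\mathbf{b} \in \tilde{\mathbb{B}}^n} \int_{\mathbf{u} \in \mathbf{b}} \mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta) f_{\mathbf{U}}(\mathbf{u})\, d\mathbf{u}.
\end{aligned}
\]

Trivially we have the decomposition

\[
P(\mathbf{d}; \theta) = \tilde{P}(\mathbf{d}; \theta) + Q(\mathbf{d}; \theta).
\]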

Derivative Calculation

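To calculate $\partial P(\mathbf{d}; \theta) / \partial \gamma$ we show that for $\gamma \to 0$

\[
P(\mathbf{d}; \theta) = \tilde{P}(\mathbf{d}; \theta) + O(\gamma^2).
\]

Furthermore we show that

\[
\left. \frac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma=0} = \left. \frac{\partial \tilde{P}(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma=0}. \tag{7}
\]

Hence to derive the form of $\left. \frac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma=0}$ we need only calculate $\left. \frac{\partial \tilde{P}(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma=0}$.

This calculation is non-trivial, but doable (i.e., it is tedious).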


Only need to worry about cases where (i) no draws of $U_{ij}$ are in inner buckets or (ii) just one draw (out of $n$) is.

In the first case every player has a strictly dominating strategy profile.

Strong preferences: regardless of other players’ actions it is either optimal, or not, to form specific links.

Network is uniquely defined: $\mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta)$ is either zero or one.


Second case: if all but one component of $\mathbf{U}$ falls into the first or last bucket, then the resulting network is uniquely defined except for the presence or absence of one edge, say, $ij$.

For any such draw of $\mathbf{U}$, since all other links are formed according to a strictly dominating strategy, player $i$ will either benefit from forming the link $ij$ or not.

Hence $\mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta)$ is either zero or one in this case as well.


For small values of $\gamma$ the derivative is driven by summands where the precise details of the (unspecified) equilibrium selection mechanism are not relevant.

Those summands where the form of $\mathcal{N}_{\mathbf{d}}(\mathbf{u}; \theta)$ is germane contribute very little to the derivative when $\gamma$ is small.

We are able to differentiate the likelihood with respect to the strategic interaction parameter and evaluate that derivative for small $\gamma$ (specifically for $\gamma = 0$).


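Lemma: $P(\mathbf{d}; \theta)$ is twice differentiable with respect to $\gamma$ at $\gamma = 0$. Its first derivative at $\gamma = 0$ is

\[
\left. \frac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma=0} = P_0(\mathbf{d}; \delta) \times \sum_{i \neq j} s_{ij}(\mathbf{d}) \left[ d_{ij} \frac{f_U(v_{ij})}{\int_{-\infty}^{v_{ij}} f_U(u)\, du} - (1 - d_{ij}) \frac{f_U(v_{ij})}{\int_{v_{ij}}^{\infty} f_U(u)\, du} \right].
\]

With a little manipulation we can simplify:

\[
\frac{1}{P_0(\mathbf{d}; \delta)} \left. \frac{\partial P(\mathbf{d}; \theta)}{\partial \gamma} \right|_{\gamma=0} = \sum_{i \neq j} \left[ d_{ij} - F_U(v_{ij}) \right] s_{ij}(\mathbf{d})
\]

where $F_U(u) = e^u / [1 + e^u]$ is the logistic CDF.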
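The simplified statistic is straightforward to evaluate once the indices $v_{ij}$ are given. The following sketch uses transitivity as the strategic term; the function names and the tiny example are assumptions made for illustration, not the authors' code.

```python
import math


def logistic_cdf(x):
    """F_U(x) = e^x / (1 + e^x)."""
    return 1.0 / (1.0 + math.exp(-x))


def transitivity(d, i, j):
    """s_ij(d) = sum_k d_ik * d_kj."""
    return sum(d[i][k] * d[k][j] for k in range(len(d)))


def locally_best_statistic(d, v, s=transitivity):
    """sum over i != j of [d_ij - F_U(v_ij)] * s_ij(d)."""
    n = len(d)
    return sum((d[i][j] - logistic_cdf(v[i][j])) * s(d, i, j)
               for i in range(n) for j in range(n) if i != j)


# Hypothetical example: v_ij = 0 everywhere, so F_U(v_ij) = 1/2.
v = [[0.0] * 3 for _ in range(3)]
closed = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]   # 0->1, 1->2 and the closing arc 0->2
open_ = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]    # same two-path, arc 0->2 absent

stat_closed = locally_best_statistic(closed, v)
stat_open = locally_best_statistic(open_, v)
print(stat_closed, stat_open)
```

In both networks only the dyad (0, 2) has a non-zero strategic term, so the statistic is positive when that link is present and negative when it is absent.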


Operational Details

Locally best test statistic is large when links which have low probability under the null tend to form precisely where their “strategic utility” is high.

Controlling for heterogeneity appears to be important for power.

Lots of triangles vs. “surprising” triangles.


Operational Details (continued)

Although the form of the locally optimal statistic does not depend on $\pi$ (equilibrium selection) it does depend on $\delta$ (heterogeneity).

Plugging in any $\delta \in \Delta$ results in an admissible test.

We take a “best guess” approach, replacing $v_{ij} = A_i + B_j + W_{ij}'\lambda$ with its joint maximum likelihood estimate (JMLE) $\hat{v}_{ij}$.

This is ad hoc, but appears to work well in practice.

Operational Details (continued)

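For $s = 1, \ldots, S$ we draw (uniformly at random) $\mathbf{V}_s \in \mathbb{D}_{\mathbf{s},\mathbf{m}}$ and calculate

\[
\frac{1}{P_0(\mathbf{V}_s; \hat{\delta})} \left. \frac{\partial P(\mathbf{V}_s; (\gamma, \hat{\delta}', \pi')')}{\partial \gamma} \right|_{\gamma=0}.
\]

If

\[
\frac{1}{P_0(\mathbf{D}; \hat{\delta})} \left. \frac{\partial P(\mathbf{D}; (\gamma, \hat{\delta}', \pi')')}{\partial \gamma} \right|_{\gamma=0},
\]

observed in the network in hand, is greater than 95 percent of our simulated statistics we reject the null of no strategic interaction.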
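The decision rule can be written as a Monte Carlo p-value comparison. The sketch below is a simplified illustration: the function name `monte_carlo_test` is hypothetical, the simulated statistics are assumed to have been computed already, and the randomization on ties ($g_\alpha(\mathbf{t})$) is ignored.

```python
def monte_carlo_test(observed, simulated, alpha=0.05):
    """Compare the observed statistic with statistics from simulated networks.

    Returns (reject, p_value), where p_value is the share of simulated
    statistics that are at least as large as the observed one.
    """
    if not simulated:
        raise ValueError("need at least one simulated statistic")
    p_value = sum(1 for x in simulated if x >= observed) / len(simulated)
    return p_value <= alpha, p_value


# Hypothetical simulated statistics 0, 1, ..., 99.
simulated = list(range(100))
reject_high, p_high = monte_carlo_test(99.5, simulated)
reject_mid, p_mid = monte_carlo_test(50, simulated)
print(reject_high, p_high, reject_mid, p_mid)
```

With alpha = 0.05 the rule rejects when at most 5 percent of the simulated statistics reach the observed value, which matches the 95 percent rule on the slide up to the treatment of ties.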


Simulation Algorithm

We begin with $\mathbf{D}$ and randomly rewire it, preserving the cross-link structure and degree sequence at each step.

Our MCMC converges to the null distribution, generating a uniform random draw from $\mathbb{D}_{\mathbf{S},\mathbf{M}}$.

Key references: Rao et al. (1996) and Tao (2015).

Our contribution is to also account for the cross-link group structure.


Alternating Walks

Alternating Cycles

Schlaufen Sequences

Null: $\gamma = 0$, $s_{ij}(\mathbf{d}) = \sum_k d_{ik} d_{kj}$

Alternative: $\gamma = 0.3$, $s_{ij}(\mathbf{d}) = \sum_k d_{ik} d_{kj}$

Wrapping-Up

The presence of strategic interaction is central to many theories of network formation (and policy-relevant).

Estimation of such models is non-trivial.

This motivates the need for a method of testing for strategic interaction.

We propose one such method.

Much remains to be done.
