Advances in biased net theory: definitions, derivations, and estimations

Social Networks 26 (2004) 113–139


John Skvoretz a,∗, Thomas J. Fararo b, Filip Agneessens c

a University of South Carolina, Columbia, SC 29208, USA
b University of Pittsburgh, Pittsburgh, PA, USA
c Ghent University, Ghent, Belgium

Abstract

Random and biased net theory, introduced by Rapoport and others in the 1950s, is one of the earliest approaches to the formal modeling of social networks. In this theory, intended as a theory of large-scale networks, ties between nodes derive both from random and non-random events of connection. The non-random connections are postulated to arise through “bias” events that incorporate known or suspected systematic tendencies in tie formation, such as mutuality or reciprocity, transitivity or closure in triads, and homophily, the overrepresentation of ties between persons who share important socio-demographic attributes like race/ethnicity or level of educational attainment. A key problem for biased net theory has been analytical intractability of the models. Formal derivations require approximation assumptions and model parameters have been difficult to estimate. The accuracy of the derived formulas and the estimated parameters has been difficult to assess. In this paper, we attempt to address long-standing issues in biased net models stemming from their analytical intractability. We first reformulate and clarify the definitions of basic biases. Second, we derive from first principles the triad distribution in a biased net, using two different analytical strategies to check our derivations. Third, we set out a pseudo-likelihood method for parameter estimation of key bias parameters and then check the accuracy of this relatively simple but approximate scheme against the results obtained from the triad distribution derivation.

© 2004 Elsevier B.V. All rights reserved.

Keywords: Biased net theory; Pseudo-likelihood method; Large-scale networks

1. Introduction

Random and biased net theory is the earliest attempt to formally model social (and other) networks. The approach originated in the early 1950s through a series of papers by Anatol Rapoport and others in the Bulletin of Mathematical Biophysics (Rapoport, 1951a,b; Rapoport and Solomonoff, 1951; Landau, 1952; Solomonoff, 1952), followed by a cascade of further mathematical contributions by Rapoport (1953a,b,c, 1957, 1958, 1963) and then by studies with colleagues that used the framework to model friendship networks in two junior high schools (Rapoport and Horvath, 1961; Foster et al., 1963). Fararo and Sunshine (1964) introduced some theoretical extensions in their study of a large friendship network, also among junior high school students. In biased net theory, a network is the outcome of a stochastic process that has random and biased elements. The two basic types of parameters are the density of the network and the bias parameters. A fundamental methodological principle of the theory is that when the bias parameters in any derived formula vanish, the formula reduces to that holding for a random net with the same density. In this approach, aggregate patterns in network structure emerge from local events of connection, that is, complexity at the aggregate level arises from the compounding of relatively simple and local events of connection. However, the stochastic nature of the biased net models makes analytical derivations almost impossible, so that exploration of such a model’s consequences has usually relied upon approximation assumptions.

∗ Corresponding author.

0378-8733/$ – see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.socnet.2004.01.005

Bias parameters are of two types. One type may be termed “structural” and pertains to relations among nodes. The reciprocity or mutuality bias is a simple example of a structural type of bias: the parameter captures the idea that a tie from x to y is more likely than chance if there already is a tie from y to x. The second type of bias may be termed “compositional” and pertains to attributes of the nodes. An example is the “inbreeding bias,” relating to homophily, introduced by Fararo and Sunshine (1964). In their study, for instance, delinquent boys were more likely than chance to name other delinquent boys, rather than nondelinquent boys, as friends. In an extended program of formalization of Blau’s influential macrosociological theory of social structure (Blau, 1977), Fararo and Skvoretz employ this compositional bias parameter (Fararo, 1981; Skvoretz, 1983; Fararo and Skvoretz, 1984, 1989; Skvoretz and Fararo, 1986). These articles introduce an additional compositional bias, an outbreeding bias, necessary to model ties such as marriage in relation to the compositional dimension of gender. The articles also provide formal models for situations in which multiple dimensions are in play simultaneously and for situations in which the compositional dimensions are ranked dimensions, like education and age (Blau’s graduated parameters).

Additional theoretical research based on these articles has used these and other biased net concepts. Granovetter’s (1973) strength of weak ties was represented in a biased net model (Fararo, 1983).¹ Then this model was unified with the biased net model that had formalized Blau’s macrosociological theory (Fararo and Skvoretz, 1987) and the unified theory was then applied to the small world problem (Skvoretz and Fararo, 1989). The major role of biased net theory in these developments has been as a formal framework within which otherwise separate and disconnected theoretical ideas in sociology can be synthesized (Fararo and Skvoretz, 1989; Chapter 4). In addition, the biased net approach is relevant to the recent upsurge of mathematical model-building dealing with small worlds and related complex network phenomena (Watts, 1999; Newman, 2000; Strogatz, 2001).

¹ It should be noted that some of Rapoport’s work had been cited by Granovetter as part of the basis for his thesis.


Despite these advances in and potential for the use of biased net constructs, we have been concerned about technical problems that beset the approach: concerns relating to definitions of structural bias parameters, to approximations in the derivations of formulas, and to methods for the estimation of bias parameters. Such concerns have led to research into the foundations of biased net theory. Skvoretz (1985, 1990) proposes Monte Carlo simulation methods to study the issues involved. In this type of study, one generates networks of specific size subject to specified levels of various bias factors and then studies parameter estimation methods in that context. However, these efforts were not entirely successful and in fact they cast further doubt on the validity of certain approximation arguments traditionally used in biased net theory to derive important network properties of interest, such as connectivity.

These remarks set the stage for the current efforts. The focus is on the technical apparatus of biased net theory rather than its use to formalize and synthesize sociological ideas or to model particular processes or structures. After clarifying the definitions of key structural bias parameters, we take a fresh look at biased net models from two directions. The first direction is the derivation of subgraph distributions, in particular, the triad distribution as conceptualized in terms of the well-known MAN classification. The second direction involves estimation of parameters by pseudolikelihood methods. Pseudolikelihood methods for model estimation form the foundation for recent advances in methodological models for networks, the exponential random graph (p∗) approach. Yet there is recognition that such methods are far from ideal. In our work, the first advance, the derivation of the triad distribution, is employed to provide a check on the pseudolikelihood estimation method.

In undertaking these tasks, we consider basic models that incorporate only structural biases. In the next section, dealing with definitions, we formally define the key structural bias parameters, showing how the current formulation relates to earlier ones. The following section deals with derivations. First we derive dyad distributions implied by the definitions. We then derive the triad distribution from first principles. Because this latter task is complex, involving numerous probabilistic calculations with nonindependent events, we present two different analytical methods for deriving the distribution; showing that they lead to the same results adds confidence in the validity of those results. At that point, we employ these two types of distributions, dyadic and triadic, to formulate and assess a method of estimation. First, we show how the derived dyad distribution can be used to define a pseudo-likelihood function for bias parameter estimation. Second, we check the accuracy of this relatively simple but approximate scheme of estimation against the results obtained from the derived triad distribution. Finally, in our conclusion, we note that despite the advances made here, additional types of investigation are required in the continuing effort to firm up the foundations of biased net theory.

2. Definitions of biases

In some of his early papers on biased net theory that dealt with information diffusion in social networks, Rapoport (1953a,b,c) proposed an approach to biased net models that focused on how biases and the density of random connection affect the reachability of the network, that is, the proportion of nodes in a population that can be reached, on average, from a randomly selected starter node. Whether there was a real-time diffusion process under analysis or a kind of pseudo-diffusion “process” of “tracing” out connections in a given network, the idea was to define the bias parameters in the context of a succession of generations of nodes as links were traced out from an arbitrary starting set. The successive average proportions of newly reached nodes in each generation were termed the “structure statistics” of a biased net (Fararo and Sunshine, 1964). Biases were defined in the framework of this tracing procedure, in which “parents” referred to the nodes newly reached at remove t from the starter set of nodes: parents of generation t. The nodes nominated by a parent, the parent’s “children,” were called “siblings” because they had a parent in common.

Three types of structural biases commonly postulated were: (1) the parent reciprocity bias or mutuality, the tendency for a child to return a parent’s nomination; (2) sibling bias or closure, the tendency for one sibling to nominate another; and (3) sibling reciprocity bias, the tendency for a sibling nominated by another sibling to return that nomination. Rapoport and his colleagues (Rapoport, 1957; Foster et al., 1963) introduced versions of the first two in the context of a study of empirical sociograms, and then Fararo and Sunshine (1964) introduced the third. Both research teams explored other biases (a distance bias by Rapoport and a grandparent bias by Fararo and Sunshine), but these did not receive much attention, either because exact definition and derivation of consequences were too difficult or because they had no impact on reachability.

The three biases have been variously defined. The parent reciprocity bias, denoted π, refers to the idea that, in the context of the tracing procedure, the probability of a tie from x to y is elevated above chance levels if y is a parent of x. Rapoport (1958) originally had called this bias “reciprocity” but in the context of tracings of an empirical sociogram, he and his colleagues adopted the terminology of “parent bias.” Fararo and Sunshine (1964) follow Foster et al. (1963) in defining this bias by the equation:

$$\pi = \Pr(x \to y \mid y \to x)$$

That is, the parent reciprocity bias is the probability x targets on or chooses y, given that y targets on or chooses x and, implicitly, y is a parent of x. The sibling bias, denoted σ, refers to the idea that the probability of an x to y tie is elevated above chance levels if there is a node z that is a parent to both x and y. Again Fararo and Sunshine (1964) follow Foster et al. (1963) in defining this bias by the equation:

$$\sigma = \Pr(x \to y \mid xSy)$$

where xSy means that x and y are siblings. The sibling reciprocity bias, denoted ρ, captures the idea that the probability of an x to y tie is elevated if x and y have a parent z and one sibling, y, has a tie to the other, x. Fararo and Sunshine (1964) call this bias the “double role” bias, following a remark by Foster et al. (1963) in which they note that y in this circumstance is both a “parent” to x and a “sibling” of x. Although Foster et al. (1963) do not offer a formal definition of this bias, Fararo and Sunshine do in the following equation:

$$\rho = \Pr(x \to y \mid y \to x \;\&\; xSy)$$

These definitions of biases suppress the random chance of connection because in any large network the random chance of connection, denoted d, is assumed to be very small compared to the bias factors. Fararo’s (1981) redefinition of the biases makes clear how the random chance of connection fits in:

$$\Pr(x \to y \mid y \to x) = \pi + (1-\pi)\Pr(x \to y) = \pi + (1-\pi)d$$

$$\Pr(x \to y \mid xSy) = \sigma + (1-\sigma)\Pr(x \to y) = \sigma + (1-\sigma)d$$

In these equations, the bias factor is thought of as the probability that a hypothetical bias event of the indicated type occurs. If it occurs, x chooses y with probability 1. If it fails to occur, x chooses y with probability equal to the random chance of connection. Thus, for example, the conditional probability that x chooses y, given that y is a parent of x, is not equivalent to the parent reciprocity bias directly but to a weighted average of x choosing y with probability 1 and x choosing y with only the random chance of connection.

Fararo (1981) does not provide a similar formulation for the sibling reciprocity bias. Skvoretz (1985) notes that if the sibling reciprocity event fails to occur, it still is the case that ySx holds with respect to the (y, x) pair. Therefore, if the sibling reciprocity event fails to occur, y and x are still siblings and thus at risk of a sibling bias event and the resulting creation of a tie. Then only if the sibling bias event fails to occur does the random chance of connection come into play. The formula is:

$$\Pr(x \to y \mid y \to x \;\&\; xSy) = \rho + (1-\rho)(\sigma + (1-\sigma)d)$$

While apparently straightforward, these definitions are problematic: by defining the biases in the context of the tracing procedure, ambiguity is introduced in how the formation of a particular tie may be attributable to a bias event. Consider the simple three person example in which b → a, b → c, a → c, and c → a. If we start tracing out from a, then a is a parent at generation 0, and c is that parent’s only child. The nomination of a by c could therefore potentially be due to a parent reciprocity event. Now c is a parent at generation 1 but c has no children and so the tracing stops. If, however, the tracing procedure starts from b, then b is a parent at generation 0 who has two children, a and c, and now the choice of a by c, according to the above definitions, would be attributable to either sibling bias or sibling reciprocity bias.
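The dependence on the starting node can be made concrete with a small sketch. The code below is one possible formalization of the tracing procedure, not the authors' own: it assumes that y is a "tracing parent" of x when y nominates x and was reached exactly one generation earlier, and that two nodes are siblings when a traced node one generation earlier nominated both. The function names `trace_generations` and `possible_biases` are hypothetical.

```python
from collections import deque

def trace_generations(ties, start):
    """Breadth-first tracing: generation at which each node is first reached."""
    gen = {start: 0}
    queue = deque([start])
    while queue:
        parent = queue.popleft()
        for src, dst in ties:
            if src == parent and dst not in gen:
                gen[dst] = gen[parent] + 1
                queue.append(dst)
    return gen

def possible_biases(ties, start, arc):
    """Bias events that could account for arc = (x, y) in a tracing from start.

    ASSUMPTION: parenthood and siblinghood are defined relative to the tracing,
    as described in the lead-in; this is an illustrative reading of the text.
    """
    x, y = arc
    gen = trace_generations(ties, start)
    labels = set()
    if x not in gen or y not in gen:
        return labels
    # y is a tracing parent of x: y nominated x and was reached one step earlier.
    if (y, x) in ties and gen[y] == gen[x] - 1:
        labels.add("parent reciprocity")
    # x and y are siblings: a traced node z one generation earlier nominated both.
    siblings = any(
        (z, x) in ties and (z, y) in ties and gen[z] == gen[x] - 1 == gen[y] - 1
        for z in gen if z not in (x, y)
    )
    if siblings:
        labels.add("sibling bias")
        if (y, x) in ties:
            labels.add("sibling reciprocity")
    return labels

# The three person example from the text: b -> a, b -> c, a -> c, c -> a.
ties = {("b", "a"), ("b", "c"), ("a", "c"), ("c", "a")}
print(sorted(possible_biases(ties, "a", ("c", "a"))))
print(sorted(possible_biases(ties, "b", ("c", "a"))))
```

Under these assumptions the same arc, c → a, receives a different set of candidate explanations depending on whether the tracing starts from a or from b, which is the ambiguity described above.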

To avoid such problems, we offer a reformulation whose aim is to express how the probability of a tie from x to y is contingent on various events and on various structural conditions. The relevant structural conditions are (1) whether there is a tie from y to x, and (2) the number of common parents shared by x and y. If there is a tie from y to x, we denote this by y → x. If x and y have k common parents, we denote this by $xS_ky$. If k = 0, then x and y are termed an “orphan” dyad. We denote the three bias events by $B_{pr}$, $B_s$, and $B_{sr}$. The first bias event may occur for the x to y choice only if y → x holds. Since each of the k common parents instantiates a condition in which sibling bias could occur and, if y → x holds, each instantiates a condition in which sibling reciprocity could occur, we assume that k common parents provide k exposures to sibling bias and, if y → x, k exposures to sibling reciprocity bias. If any of the hypothetical bias events occurs, then the x to y tie forms with probability equal to 1.

For a dyad with k common parents and for which y → x holds, there are 2k + 1 bias events that could result in an x to y tie: one instance of parent reciprocity and k instances each of sibling bias and sibling reciprocity bias. We assume the bias events are independent. Therefore, an x to y tie fails to occur as a result of bias if and only if all 2k + 1 events fail to occur. If at least one occurs, then the x to y tie forms. If none of the events occur, the tie still may form by chance. Thus we have the following equation for the probability of an x to y tie in these dyads:

$$\Pr(x \to y \mid y \to x \;\&\; xS_ky) = [1 - (1-\pi)(1-\sigma)^k(1-\rho)^k] + (1-\pi)(1-\sigma)^k(1-\rho)^k d$$

It is important to note that for orphan dyads, the equation reduces to the familiar equation for parent reciprocity. For a dyad with k common parents and for which y → x does not hold, there are k bias events that could result in an x to y tie: k instances of sibling bias. An x to y tie fails to occur as a result of bias if and only if all k events fail to occur. If at least one occurs, then the x to y tie forms. If none of the events occur, the tie still may form by chance. Thus we have the following equation for the probability of an x to y tie in these dyads:

$$\Pr(x \to y \mid {\sim}\, y \to x \;\&\; xS_ky) = [1 - (1-\sigma)^k] + (1-\sigma)^k d$$

Again for orphan dyads, the equation reduces to just d, the random chance of connection.

This completes the reformulation of the basic biases. As we have noted, the case of orphan dyads reproduces the basic logic and equations of parent reciprocity. The basic equation for sibling bias in one parent dyads where y → x does not hold is also recovered. However, for one parent dyads where y → x does hold, the reformulation proposes a new expression, namely,

$$\Pr(x \to y \mid y \to x \;\&\; xS_1y) = [1 - (1-\pi)(1-\sigma)(1-\rho)] + (1-\pi)(1-\sigma)(1-\rho)d$$

in which the “double role” of y vis-à-vis x is explicit.

There are other biases that may be defined. For instance, the sibling bias captures stochastically the forbidden triad principle of Granovetter (1973), the idea of closure in co-nominated contacts. In fact, more recently, we have referred to it as “the closure bias” and used it to define a “SWT” measure. Namely, let π = ρ = 1 so that we are dealing with a symmetric relation of acquaintanceship, interpreted as a weak tie. Then SWT is the probability that the closure bias event does not occur given two nodes are acquainted with a third node (Fararo and Skvoretz, 1987; Fararo and Skvoretz, 1989: Section 4.4). This closure principle is related to, but not identical with, the commonly observed tendency towards transitivity, that is, x having a tie to y and y a tie to z tending to induce a tie from x to z. Thus it would be possible to define a transitivity bias, and considerations relating to this and numerous explorations of other bias parameter ideas have been part of the tradition of biased net theory from its earliest days. In addition, one major extension would be to incorporate actor attributes in the definition of all these structural biases to model tendencies such as a tendency for a tie to be more likely to be reciprocated if actors share an attribute. We save these extensions for future work.
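The reformulated tie probabilities of this section can be sketched as a single function. This is a minimal illustration, assuming the independence of bias events stated above; the function name `tie_probability` and the numerical parameter values are hypothetical and chosen only for demonstration.

```python
from math import isclose

def tie_probability(pi, sigma, rho, d, k, reverse_tie):
    """Pr(x -> y) for a dyad with k common parents.

    reverse_tie: True if y -> x holds, False otherwise.
    """
    if reverse_tie:
        # One parent reciprocity event plus k sibling and k sibling reciprocity
        # events: all 2k + 1 must fail before the tie is left to chance.
        all_fail = (1 - pi) * (1 - sigma) ** k * (1 - rho) ** k
    else:
        # Only the k sibling bias events are in play.
        all_fail = (1 - sigma) ** k
    return (1 - all_fail) + all_fail * d

# Illustrative (made-up) parameter values.
pi, sigma, rho, d = 0.3, 0.2, 0.1, 0.05
for k in range(3):
    print(k,
          round(tie_probability(pi, sigma, rho, d, k, True), 4),
          round(tie_probability(pi, sigma, rho, d, k, False), 4))
```

Setting k = 0 recovers the parent reciprocity expression π + (1 − π)d when y → x holds and the bare chance d otherwise. With π = 0 and k = 1, the function agrees algebraically with the earlier nested expression ρ + (1 − ρ)(σ + (1 − σ)d).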


3. Derivations of subgraph distributions

3.1. Dyadic distributions

The definitions of the bias events imply a number of consequences. First, both reciprocity biases are purely “redistributive.” That is, the effect of either bias is to redirect ties in dyads that are not reciprocated to dyads in which they are reciprocated or are absent. Neither bias creates new ties. This consequence can be demonstrated by deriving the expected number of arcs in a pair, conditional on the number of its parents. Second, it is possible to derive a relatively simple formula for the expected number of arcs in pairs having k common parents. Finally, this formula can be used to recalibrate the random chance of connection so that the entire set of biases, including the sibling bias, is redistributive as originally envisioned by Rapoport.

This first property can be easily shown following Skvoretz’s (1985) derivation of the mutual, asymmetric and null distribution for parent–child dyads in which parent reciprocity is relevant and (with modification) for sibling dyads in which sibling reciprocity is relevant. Reproduced here in slightly rewritten form, these distributions are, for parent–child dyads:

$$P(M) = d(\pi + (1-\pi)d)$$

$$P(A) = 2d(1-d)(1-\pi)$$

$$P(N) = (1-d)(1 - d(1-\pi))$$

for sibling dyads (of just one parent):

$$P(M) = (\sigma + (1-\sigma)d)\,(1 - (1-\pi)(1-\sigma)(1-\rho) + (1-\pi)(1-\sigma)(1-\rho)d)$$

$$P(A) = 2(\sigma + (1-\sigma)d)(1-\pi)(1-\sigma)(1-\rho)(1-d)$$

$$P(N) = 1 - (\sigma + (1-\sigma)d)\,(1 - (1-\pi)(1-\sigma)(1-\rho)(1-d) + 2(1-\pi)(1-\sigma)(1-\rho)(1-d))$$

We will label the first distribution $D_0$ and the second $D_1$. That the biases are purely redistributive follows from a simple calculation of the expected number of arcs, D, in a dyad. For parent–child dyads:

$$\begin{aligned}
E_0(D) &= 2[d(\pi + (1-\pi)d)] + 1[2d(1-d)(1-\pi)] + 0[(1-d)^2 + d(1-d)\pi] \\
&= 2d\pi + 2(1-\pi)d^2 + 2d - 2d^2 - 2d(1-d)\pi = 2d
\end{aligned}$$

for one parent sibling dyads:

$$\begin{aligned}
E_1(D) &= 2[(\sigma + (1-\sigma)d)(1 - (1-\pi)(1-\sigma)(1-\rho) + (1-\pi)(1-\sigma)(1-\rho)d)] \\
&\quad + 1[2(\sigma + (1-\sigma)d)(1-\pi)(1-\sigma)(1-\rho)(1-d)] \\
&= 2(\sigma + (1-\sigma)d)[1 - (1-\pi)(1-\sigma)(1-\rho)(1-d) + (1-\pi)(1-\sigma)(1-\rho)(1-d)] \\
&= 2(\sigma + (1-\sigma)d)
\end{aligned}$$

In both cases the expected number of arcs does not depend on the value of the reciprocity biases. Note that in the last expression, the expected value is greater than 2d when sibling bias is nonzero. The consequences of this analytical result will be explored below.


The derivation of the expected number of arcs in a dyad with k parents is straightforward following the logic of Skvoretz (1990) and using the reformulated bias definitions for dyads with k common parents. The equations are:

$$P_k(M) = ((1 - (1-\sigma)^k) + (1-\sigma)^k d)\,(1 - (1-\pi)(1-\sigma)^k(1-\rho)^k + (1-\pi)(1-\sigma)^k(1-\rho)^k d)$$

$$P_k(A) = 2((1 - (1-\sigma)^k) + (1-\sigma)^k d)(1-\pi)(1-\rho)^k(1-\sigma)^k(1-d)$$

$$P_k(N) = 1 - ((1 - (1-\sigma)^k) + (1-\sigma)^k d) \times (1 - (1-\pi)(1-\sigma)^k(1-\rho)^k(1-d) + 2(1-\pi)(1-\rho)^k(1-\sigma)^k(1-d))$$

This may be referred to as the $D_k$ distribution. It is easy to show that the expected number of arcs in a k parent sibling dyad is solely a function of the sibling bias:

$$E_k(D) = 2((1 - (1-\sigma)^k) + (1-\sigma)^k d)$$

Quite nicely, when σ = 0, the expected number of arcs is just 2d, the chance expectation in a Bernoulli graph with density d.

In parent–child dyads, the expected number of arcs is exactly the number expected in a (homogeneous) Bernoulli digraph with density d, that is, where the unconditional probability that x targets y is d. This is true whatever the value of the reciprocity bias parameter. By contrast, the sibling bias is productive of ties over and beyond those created by the random chance of connection. As we noted earlier, in sibling dyads, when the sibling bias is nonzero, the expected number of arcs is greater than the number expected in a homogeneous Bernoulli graph with density d. Only if the sibling bias is zero will the expected number of arcs reduce to 2d, the Bernoulli digraph expectation.
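The claims about expected arcs can be checked numerically. The sketch below evaluates the mutual, asymmetric and null probabilities for a dyad with k common parents and compares the implied expected number of arcs with the closed form 2((1 − (1 − σ)^k) + (1 − σ)^k d). The function names and the parameter values are hypothetical, chosen only for illustration.

```python
from math import isclose

def dyad_distribution(pi, sigma, rho, d, k):
    """Return (P_k(M), P_k(A), P_k(N)) for a dyad with k common parents."""
    # Probability of a tie in one direction when no reverse tie is given.
    first = (1 - (1 - sigma) ** k) + (1 - sigma) ** k * d
    # Probability that all 2k + 1 bias events fail for the returning tie.
    all_fail = (1 - pi) * (1 - sigma) ** k * (1 - rho) ** k
    p_mutual = first * (1 - all_fail * (1 - d))
    p_asym = 2 * first * all_fail * (1 - d)
    p_null = 1 - p_mutual - p_asym
    return p_mutual, p_asym, p_null

def expected_arcs(pi, sigma, rho, d, k):
    """Expected number of arcs: 2 * P(M) + 1 * P(A) + 0 * P(N)."""
    p_mutual, p_asym, _ = dyad_distribution(pi, sigma, rho, d, k)
    return 2 * p_mutual + p_asym

# Illustrative (made-up) parameter values.
pi, sigma, rho, d = 0.3, 0.2, 0.1, 0.05
for k in range(4):
    closed_form = 2 * ((1 - (1 - sigma) ** k) + (1 - sigma) ** k * d)
    print(k, round(expected_arcs(pi, sigma, rho, d, k), 6), round(closed_form, 6))
```

For k = 0 both columns equal 2d regardless of π, and for k ≥ 1 the expectation grows with k whenever σ > 0, which is the sense in which the sibling bias creates ties rather than merely redirecting them.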

The original aim of biased net theorists was to make the sibling bias redistributive as well. This intent is clear from the problem context in which the biases were introduced, namely, the problem of tracing contacts out from a small, randomly selected subset of nodes. The aim is to derive a formula for the structure statistics of the network, defined as the cumulative proportion of actors reachable in 1, 2, . . . , n steps from the starter set. The derivations made a simplifying assumption that each actor had the same number of contacts, denoted a. In a random net with no biases, the following recursion formula for the proportion newly contacted at remove t + 1 applies:

$$P(t+1) = (1 - X(t))(1 - e^{-aP(t)})$$

In a network with biases, Fararo and Sunshine derive the following recursion formula:

$$P(t+1) = (1 - X(t))(1 - e^{-\alpha P(t)})$$

where at t = 0, α = a and at t > 0,

$$\alpha = a - \pi - \sigma(a - 1).$$

The logic here clearly reveals that both biases redirect ties and do not create new ones. The overall number of contacts per person remains fixed at a and, therefore, the density of the network, defined as the ratio of actual to potential contacts, also remains constant. The idea is that of the a contacts person x has, on average, π of them will be redirected back to the nominating parent (parent reciprocity) and σ(a − 1) redirected to the a − 1 other siblings nominated by the parent. The remaining contacts are then “free” to be randomly allocated to other nodes, some of whom may have already been reached in the tracing process (including perhaps the parent or one of the siblings) and others who have not yet been reached.
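The recursion can be iterated directly. The sketch below assumes that X(t) is the cumulative proportion reached through remove t (so X(t) = P(0) + ... + P(t)) and that P(0) is the proportion of nodes in the starter set; both readings, the function name `structure_statistics`, and the numerical values are assumptions made for illustration.

```python
import math

def structure_statistics(a, pi, sigma, p0, steps):
    """Iterate P(t+1) = (1 - X(t)) * (1 - exp(-alpha * P(t))).

    Returns the cumulative proportions X(0), X(1), ..., X(steps).
    ASSUMPTION: X(t) is the running sum of P(0..t) and p0 is the starter fraction.
    """
    alpha_biased = a - pi - sigma * (a - 1)  # contacts left "free" when t > 0
    p_new, reached = p0, p0                  # P(0) and X(0)
    cumulative = [reached]
    for t in range(steps):
        alpha = a if t == 0 else alpha_biased
        p_new = (1 - reached) * (1 - math.exp(-alpha * p_new))
        reached += p_new
        cumulative.append(reached)
    return cumulative

# Illustrative (made-up) values: 3 contacts per actor, 1% starter set.
random_net = structure_statistics(a=3, pi=0.0, sigma=0.0, p0=0.01, steps=15)
biased_net = structure_statistics(a=3, pi=0.3, sigma=0.2, p0=0.01, steps=15)
for t in (1, 5, 10, 15):
    print(t, round(random_net[t], 4), round(biased_net[t], 4))
```

Because the biases lower the effective number of free contacts from a to α, the biased net never reaches a larger cumulative proportion than the random net at any remove in this recursion.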

To preserve the idea of biases as purely redistributive, we must adjust terms in the definition of bias events. As the previous paragraph indicates, once biases are introduced, the number of ties that are “free” to be randomly assigned to other nodes must be less than a. In a random net of g nodes, density is defined by d = a/(g − 1), but in the biased net, the a contacts of a node are not “free” to be randomly assigned. Thus, in a net with biases, the probability that a tie is randomly allocated must be less than d. We denote this probability by d′ and substitute it for d in the defining formulas for the biases:

$$\Pr(x \to y \mid y \to x \;\&\; xS_ky) = (1 - (1-\pi)(1-\sigma)^k(1-\rho)^k) + (1-\pi)(1-\sigma)^k(1-\rho)^k d'$$

$$\Pr(x \to y \mid {\sim}\, y \to x \;\&\; xS_ky) = 1 - (1-\sigma)^k + (1-\sigma)^k d'$$

If the biases are purely redistributive, relative to a baseline random net with density d, then an important identity must hold, namely, that in both the baseline random net and a corresponding biased net, the expected number of arcs in a dyad must be the constant 2d. The expected number of arcs in a dyad is a weighted sum of the expected number of arcs in a dyad as the number of shared parents varies from k = 0 to g − 2, weighted by the probability that a dyad has 0, 1, . . . , g − 2 parents. Let $E_k$ denote the expected number of arcs in a dyad with k parents, and let $w_k$ denote the probability that a dyad has k parents. Then these remarks imply the following identity:

$$2d = \sum_{k=0}^{g-2} E_k w_k = \sum_{k=0}^{g-2} 2\left((1 - (1-\sigma)^k) + (1-\sigma)^k d'\right) w_k$$

Therefore, the random chance of connection in a biased net must be less than the random chance of connection in a purely random net whenever the sibling bias is not zero. If we know the probabilities that a dyad has 0, 1, . . . , g − 2 shared parents and we know the values of sibling bias and the random chance of connection in a purely random net, we can compute the appropriate value of the random chance of connection in the biased net.
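Since the identity is linear in d′, it can be solved in closed form once the weights are known. The sketch below does this for an arbitrary list of weights. The weights used in the example come from a binomial distribution in which each of the g − 2 other nodes is a common parent independently with probability d², which is purely an illustrative assumption and not a result from the text; the function names are likewise hypothetical.

```python
from math import comb, isclose

def adjusted_chance(d, sigma, weights):
    """Solve 2d = sum_k 2[(1 - (1-sigma)^k) + (1-sigma)^k d'] w_k for d'.

    weights[k] is the probability that a dyad has k common parents.
    """
    bias_part = sum(w * (1 - (1 - sigma) ** k) for k, w in enumerate(weights))
    chance_part = sum(w * (1 - sigma) ** k for k, w in enumerate(weights))
    return (d - bias_part) / chance_part

def binomial_parent_weights(g, d):
    """ASSUMPTION (illustration only): each of the g - 2 other nodes is a
    common parent of the dyad independently with probability d ** 2."""
    p, n = d ** 2, g - 2
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

# Illustrative (made-up) values.
g, d, sigma = 50, 0.05, 0.3
weights = binomial_parent_weights(g, d)
d_prime = adjusted_chance(d, sigma, weights)
print(round(d_prime, 5))
```

A negative return value would indicate that the chosen d, σ and weights cannot be reconciled with purely redistributive biases, since no nonnegative d′ then satisfies the identity.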

3.2. Triad distribution

Table 1 diagrams the 16 triad types and the probability of each type in a random Bernoulli digraph with density d. To derive the triad distribution for biased nets from first principles, we use two strategies. In the first strategy, triad analysis, we first inspect a triad type for asymmetries in the risk patterns that dyads face depending on just which dyad outcome occurs first. If there are no such asymmetries, then we need not consider alternative sequences of dyadic outcomes. If asymmetries exist, then all possible alternative sequences must be considered: the ab dyad then the bc dyad then the ac dyad, or first the ab, then the ac, then the bc and so on. There are six possible sequences, each of which we assume is a priori equally likely.


Table 1
Triad types

Triad type    Probability in Bernoulli digraph
003           $(1-d)^6$
012           $6d(1-d)^5$
102           $3d^2(1-d)^4$
021D          $3d^2(1-d)^4$
021C          $6d^2(1-d)^4$
021U          $3d^2(1-d)^4$
111U          $6d^3(1-d)^3$
030T          $6d^3(1-d)^3$
030C          $2d^3(1-d)^3$
111D          $6d^3(1-d)^3$
201           $3d^4(1-d)^2$
120U          $3d^4(1-d)^2$
120C          $6d^4(1-d)^2$
120D          $3d^4(1-d)^2$
210           $6d^5(1-d)$
300           $d^6$
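As a consistency check on Table 1, the sixteen Bernoulli probabilities should sum to one, and the multiplicities of the types with m arcs should add up to the number of ways of placing m arcs among the six ordered pairs of a triad. The following sketch verifies both; the data structure and function name are hypothetical, while the multiplicities and arc counts are read directly from the table.

```python
from math import comb, isclose

# (triad type, number of labeled configurations, number of arcs), from Table 1.
TRIAD_TYPES = [
    ("003", 1, 0), ("012", 6, 1), ("102", 3, 2), ("021D", 3, 2),
    ("021C", 6, 2), ("021U", 3, 2), ("111U", 6, 3), ("030T", 6, 3),
    ("030C", 2, 3), ("111D", 6, 3), ("201", 3, 4), ("120U", 3, 4),
    ("120C", 6, 4), ("120D", 3, 4), ("210", 6, 5), ("300", 1, 6),
]

def bernoulli_triad_probabilities(d):
    """Probability of each triad type in a Bernoulli digraph with density d."""
    return {name: count * d ** arcs * (1 - d) ** (6 - arcs)
            for name, count, arcs in TRIAD_TYPES}

probs = bernoulli_triad_probabilities(0.1)
print(round(sum(probs.values()), 12))

# Multiplicities grouped by number of arcs should match C(6, m).
for m in range(7):
    total = sum(count for _, count, arcs in TRIAD_TYPES if arcs == m)
    print(m, total, comb(6, m))
```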

In the second strategy, the sequence enumeration strategy, we begin with a particular dyadic sequence, say, ab, ac, and bc. Each dyad has four possible outcomes: mutual, asymmetric from x to y, asymmetric from y to x, or null. Therefore each dyadic sequence has $4^3 = 64$ possible outcomes. Events that occur in the first two outcomes are governed by the probabilities for parent–child dyads. Events that occur in the third outcome, however, may be determined by the probabilities for parent–child dyads or by those for sibling dyads. Furthermore, after the third outcome occurs, the first or second dyads may face additional risk if members of the dyad become siblings as a result of the third outcome. In such cases, additional branching possibilities are introduced and must be followed up. Eventually, however, all possibilities are enumerated and probabilities can be assigned to each branch. Each branch results in a particular triad type and so the final step is to sum all the probabilities of the branches leading to each type.

In both strategies we simplify notation: $M_k$ will denote the probability that a dyad is mutual, $a_k$ will denote the probability that a dyad is asymmetric in one particular direction (with letters appended in parentheses if necessary to indicate direction), and $N_k$ will denote the probability that a dyad is null. In all three expressions, k = 0 if the members of the dyad are not siblings and k = 1 if they are. If k = 0, then the relevant probabilities are those for parent–child dyads; if k = 1, then the relevant probabilities are for one parent sibling dyads. Some additional probabilities come into play if and when a dyad is subjected to additional risk contingent on the outcomes of the other two dyads. This occurs when a dyad is first exposed to risk because $xS_0y$ holds but then outcomes in the other two dyads create additional exposure by creating the condition in which $xS_1y$ holds. In these cases, the first exposure takes into account potential parent reciprocity and potential random chance of connection. Hence, the relevant probabilities for the second exposure are variants of the one parent probabilities, removing both the parent reciprocity factor and the random chance of connection. We will use $M_1'$, $a_1'$ and $N_1'$ to denote these probabilities, which are:

$$M_1' = \sigma(1 - (1-\sigma)(1-\rho))$$

$$a_1' = \sigma(1-\sigma)(1-\rho)$$

$$N_1' = 1 - \sigma(1 - (1-\sigma)(1-\rho) + 2(1-\sigma)(1-\rho))$$

We will call this the $D_1'$ distribution. We begin with the null 003 triad.
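Before turning to the individual triads, the second-exposure probabilities can be checked numerically. The short sketch below evaluates them and confirms that they form a proper distribution over the four dyad outcomes (mutual, two asymmetric directions, null). The function name `d1_prime` and the parameter values are hypothetical.

```python
from math import isclose

def d1_prime(sigma, rho):
    """Return (M1', a1', N1'): second-exposure probabilities for a dyad that
    becomes a one parent sibling dyad after its first exposure."""
    both_fail = (1 - sigma) * (1 - rho)   # sibling and sibling reciprocity both fail
    m = sigma * (1 - both_fail)
    a = sigma * both_fail                 # one particular direction
    n = 1 - sigma * (1 - both_fail + 2 * both_fail)
    return m, a, n

# Illustrative (made-up) parameter values.
m1, a1, n1 = d1_prime(sigma=0.2, rho=0.1)
print(round(m1, 4), round(a1, 4), round(n1, 4), round(m1 + 2 * a1 + n1, 12))
```

These values coincide with the one parent sibling probabilities evaluated at π = 0 and d = 0, which is what removing the parent reciprocity factor and the random chance of connection amounts to. When σ = 0 the dyad stays null with probability 1 on the second exposure.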


3.3. The null 003 triad

Consider the bc dyad. It satisfies “locally” the condition that there is no third party z that has ties to both b and c, that is, $bS_0c$. Therefore the $D_0$ distribution specifies the probability that the bc dyad is null. The value of this probability is $N_0$. The same specification holds for the other two dyads and there are no asymmetries. Therefore,

$$\Pr(003) = N_0^3$$

3.4. The 012 triad

Suppose the single arc is from b to c or from c to b, that is, suppose that the bc dyad is asymmetric. Again, the bc dyad satisfies locally the condition $bS_0c$. Therefore the $D_0$ distribution specifies the probability that the bc dyad is asymmetric: $2a_0$. Both the ab and ac dyads satisfy “locally” the conditions $aS_0b$ and $aS_0c$, respectively, so the $D_0$ distribution applies. But this configuration is but one of three equivalent ones that could result in a 012 triad. Therefore, the probability of this triad is:

$$\Pr(012) = 6a_0N_0^2$$

3.5. The 102 triad

Suppose the bc dyad is mutual. The bc dyad satisfies “locally” the condition $bS_0c$ and so the $D_0$ distribution specifies the probability that the bc dyad is mutual: $M_0$. Both the ab and ac dyads satisfy “locally” the conditions $aS_0b$ and $aS_0c$, respectively, so the $D_0$ distribution applies. Again this configuration is one of three equivalent ones that could result in a 102 triad. Therefore, the probability of this triad is:

$$\Pr(102) = 3M_0N_0^2$$

As a check on the derivation, we can fix, say, the ab and ac dyad outcomes at the null state and sum the probabilities of the various outcomes that may occur in the bc dyad, namely, mutual, asymmetric or null:

$$\{d'(\pi + (1-\pi)d')\} + \{2d'(1-d')(1-\pi)\} + \{(1-d')(1 - d'(1-\pi))\} = 1.0$$

In this case it is obvious that the identity is satisfied.
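For readers who want the intermediate algebra, the identity can be spelled out by expanding the three bracketed terms (mutual, asymmetric, null) and collecting the common factor d′(1 − π). This expansion is supplied here only as a worked check.

```latex
\begin{aligned}
M_0 + 2a_0 + N_0
  &= d'\pi + d'^2(1-\pi) + 2d'(1-d')(1-\pi) + (1-d') - d'(1-d')(1-\pi) \\
  &= d'\pi + d'(1-\pi)\bigl[d' + (1-d')\bigr] + (1-d') \\
  &= d'\pi + d'(1-\pi) + 1 - d' \\
  &= 1 .
\end{aligned}
```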

3.6. The 021D triad

There are three specific realizations of this pattern. Consider the one in which a → b and a → c. If the outcomes of the ab and ac dyads occur first, the bc dyad satisfies “locally” the condition bS1c. But if the ab outcome occurs first followed by the bc outcome, the calculation is different because at the time the bc outcome occurs, the bc dyad satisfies the condition bS0c. But when the ac outcome occurs, it creates additional risk for the bc dyad because b and c are now siblings. Therefore, the risk pattern is sequence dependent; in particular, the bc dyad becomes null either by passing just one hurdle, events related to the condition bS1c, or by passing two hurdles, events related first to the condition bS0c and then second to the condition bS1c. The sequences must be considered separately.

Two of the sequences exhibit one risk pattern and the other four another risk pattern. The (ab, ac, bc) and the (ac, ab, bc) sequences have the same risk pattern, namely, at the point that the bc outcome is to be determined, the bc dyad satisfies the condition bS1c. In either sequence the ab and ac dyads satisfy the condition xS0y. Therefore for the bc dyad, the D1 distribution specifies the probability that the bc dyad is null (N1) and for the ab and ac dyads, the D0 distribution specifies the probability that either dyad is asymmetric (a0). For the other four sequences, the first risk to which the dyad bc is exposed occurs when bS0c holds; consequently the D0 distribution applies for the null outcome: N0. The second risk occurs when the condition bS1c becomes satisfied and so apparently the D1 distribution applies. However, if the probability N1 is used directly, the bc dyad is, inappropriately, subject to a second chance of parent reciprocity and a second chance of random connection (see footnote 2). Therefore, it is the D′1 distribution that is relevant and the correct probability is N′1. For the ab and ac dyads, the D0 distribution applies as in the previous two sequences. Therefore, and considering that the overall configuration is one of three equivalent ones, the probability of this triad is:

$$\Pr(\text{021D}) = 3a_0^2\left(\frac{2N_1}{6} + \frac{4N_0N'_1}{6}\right) = a_0^2\,(N_1 + 2N_0N'_1)$$
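The sequence argument can be mirrored in a few lines of code. The sketch below enumerates the six orders in which the ab, ac and bc dyads may be resolved for the realization a → b, a → c, assigns the one-hurdle probability N1 when bc comes last and the two-hurdle probability N0·N′1 otherwise, averages over the orders and multiplies by the three realizations. The function name and the numerical inputs (computed from d′ = 0.10, π = 0.50, σ = 0.25, ρ = 0.50) are illustrative assumptions.

```python
from itertools import permutations


def pr_021d_by_sequences(a0, n0, n1, n1_prime):
    """Average over resolution orders of the dyads ab, ac, bc for the
    realization a -> b, a -> c, then multiply by the 3 realizations."""
    orders = list(permutations(("ab", "ac", "bc")))
    total = 0.0
    for order in orders:
        if order[-1] == "bc":
            # ab and ac are already settled, so b and c are siblings: one hurdle.
            p_bc_null = n1
        else:
            # bc is first resolved under bS0c, then re-exposed under D1'.
            p_bc_null = n0 * n1_prime
        total += a0 * a0 * p_bc_null
    return 3 * total / len(orders)


# Illustrative values for d' = 0.10, pi = 0.50, sigma = 0.25, rho = 0.50.
a0, n0, n1, n1_prime = 0.045, 0.855, 0.62015625, 0.65625
print(pr_021d_by_sequences(a0, n0, n1, n1_prime))
print(a0**2 * (n1 + 2 * n0 * n1_prime))  # closed form; the two should agree
```

Treating the six orders as equally likely is what produces the 2/6 and 4/6 weights in the displayed formula.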

At this point we must not forget that in the four sequences in which the dyad bc faces “double jeopardy,” the second risk event could turn out differently. In particular, we could get a 030T triad or a 120D triad. These probabilities must be added to the probabilities of other ways that either of these triads could materialize. These probabilities are:

$$\Pr(\text{030T} - \text{021D}) = 3a_0^2\left(\frac{4N_0\,2a'_1}{6}\right) = 4a_0^2N_0a'_1$$

$$\Pr(\text{120D} - \text{021D}) = 3a_0^2\left(\frac{4N_0M'_1}{6}\right) = 2a_0^2N_0M'_1$$

Again, a′1 and M′1 are the probabilities a1 and M1 without the terms involving π and d′.

3.7. The 021C triad

There are six realizations of this pattern. Consider the one in which a → b and b → c. Unlike the previous case, it does not matter which dyadic outcomes occur first. In any sequence, each of the three dyads locally satisfies the condition xS0y. Therefore, the D0 distribution applies and the probability of this triad is:

$$\Pr(\text{021C}) = 6a_0^2N_0$$

Footnote 2: Subjecting a dyad to two chances of parent reciprocity and two chances of random connection produces a violation of the general principle that when biases vanish, the net has the properties of a random net with density d′. Permitting two chances of parent reciprocity and two random chances of connection changes the dyad distribution in those dyads that are so exposed to a distribution inconsistent with this principle.


3.8. The 021U triad

There are three realizations of this pattern. Consider the one in which b → a and c → a. Like the previous case, it does not matter which dyadic outcomes occur first. In any sequence, each of the three dyads locally satisfies the condition xS0y. Therefore, the D0 distribution applies and the probability of this triad is:

$$\Pr(\text{021U}) = 3a_0^2N_0$$

3.9. The 111U triad

There are six realizations of this pattern. Consider the one in which b → c, c → b and c → a. In this case sequence matters: if the bc and ac outcomes occur before the ab outcome, the ab dyad locally satisfies aS1b. In the other four sequences it faces “double jeopardy.” The first risk is covered by the D0 distribution, but then when the c → b tie forms (or the mutual tie between b and c forms), the ab dyad faces added risk because a and b are now siblings. In all sequences, outcomes in both the bc and the ac dyads are governed by the D0 distribution. Therefore, the probability of this triad is:

$$\Pr(\text{111U}) = 6M_0a_0\left(\frac{2N_1}{6} + \frac{4N_0N'_1}{6}\right) = 2M_0a_0\,(N_1 + 2N_0N'_1)$$

Again we must not forget that in the four sequences in which the dyad ab faces double jeopardy, the second risk event could turn out differently. In particular, we could get a 120C, a 120U (and from this outcome possibly a 210 triad), or a 210 triad (and from this outcome possibly a 300 triad). If 120U materializes, the ac dyad now faces additional risk because a and c are now siblings with one arc present. If either or both a sibling bias and a sibling reciprocity bias occur, a 210 triad materializes. It stays a 120U triad only if both events fail to occur, which happens with probability (1 − ρ)(1 − σ), denoted in equations by 1 − Sr. Similarly, if 210 occurs in the first step, the ac dyad also faces additional risk because a and c are now siblings with an arc present. Thus four different triad types could occur from the double jeopardy process. These probabilities must be added to the probabilities of other ways that any of these triads could materialize. These probabilities are:

$$\Pr(\text{120C} - \text{111U}) = 6M_0a_0\left(\frac{4N_0a'_1}{6}\right) = 4M_0a_0N_0a'_1$$

$$\Pr(\text{120U} - \text{111U}) = 6M_0a_0\left(\frac{4N_0a'_1(1-S_r)}{6}\right) = 4M_0a_0N_0a'_1(1-S_r)$$

$$\Pr(\text{210} - \text{111U}) = 6M_0a_0\left(\frac{4N_0a'_1S_r}{6}\right) = 4M_0a_0N_0a'_1S_r$$

$$\Pr(\text{210} - \text{111U}) = 6M_0a_0\left(\frac{4N_0M'_1(1-S_r)}{6}\right) = 4M_0a_0N_0M'_1(1-S_r)$$

$$\Pr(\text{300} - \text{111U}) = 6M_0a_0\left(\frac{4N_0M'_1S_r}{6}\right) = 4M_0a_0N_0M'_1S_r$$


3.10. The 030T triad

There are six realizations of this pattern, the transitive triple. Consider the one in which a → b, b → c and a → c. Again sequence matters: if the ab and ac outcomes occur before the bc outcome, outcomes in the bc dyad are governed by the D1 distribution. In the other four sequences it faces double jeopardy. The first risk comes from parent reciprocity, but then when the a → c tie forms (or the a → b tie), the bc dyad faces additional risk because b and c are now siblings with one arc present. The triad stays 030T only if the events of sibling bias and sibling reciprocity bias fail to occur, with probability 1 − Sr. In all sequences, ab and ac outcomes are governed by the D0 distribution. We must also add in the 030T triads created via double jeopardy in the 021D triad. Therefore, the total probability of 030T is:

$$\Pr(\text{030T}) = 6a_0^2\left(\frac{2a_1}{6} + \frac{4a_0(1-S_r)}{6}\right) + 3a_0^2\left(\frac{4N_0\,2a'_1}{6}\right) = 2a_0^2\left[a_1 + 2a_0(1-S_r) + 2N_0a'_1\right]$$

Again we must recall that the second risk event could turn out differently and thus create a 120D triad. The probability is:

$$\Pr(\text{120D} - \text{030T}) = 6a_0^2\left(\frac{4a_0S_r}{6}\right) = 4a_0^3S_r$$

3.11. The 030C triad

There are just two realizations of this pattern, the cyclical triple. Consider the one in which a → b, b → c and c → a. In this case sequence does not matter: for all dyads in all sequences, the D0 distribution applies. Therefore, the probability of this triad is:

$$\Pr(\text{030C}) = 2a_0^3$$

3.12. The 111D triad

There are six realizations of this pattern. Consider the one in which a → c, b → c and c → b. Sequence does not matter: for all dyads in all sequences, the D0 distribution applies. Therefore, the probability of this triad is:

$$\Pr(\text{111D}) = 6M_0a_0N_0$$

3.13. The 201 triad

There are three realizations of this pattern. Consider the one in which a → b, b → a, a → c, and c → a. Sequence matters in this case: if the ab and ac outcomes occur before the bc outcome, then b and c are siblings. In the other four sequences the bc dyad faces double jeopardy. The first risk comes from parent reciprocity, but then when the second mutual tie forms, it faces a second risk because b and c are now siblings. Both risks result in a null dyad. In all sequences, outcomes in both the ab and the ac dyads are governed by the D0 distribution. Therefore, the probability of this triad is:

$$\Pr(\text{201}) = 3M_0^2\left(\frac{2N_1}{6} + \frac{4N_0N'_1}{6}\right) = M_0^2\,(N_1 + 2N_0N'_1)$$

The second jeopardy event could turn out differently, creating either a 210 or a 300 triad. The relevant probabilities are:

$$\Pr(\text{210} - \text{201}) = 3M_0^2\left(\frac{4N_0\,2a'_1}{6}\right) = 4M_0^2N_0a'_1$$

$$\Pr(\text{300} - \text{201}) = 3M_0^2\left(\frac{4N_0M'_1}{6}\right) = 2M_0^2N_0M'_1$$

3.14. The 120U triad

There are three realizations of this pattern. Consider the one in which a → b, c → b, a → c, and c → a. Sequence matters here in a very complex way. Suppose the ab and ac outcomes occur before the bc outcome. Then outcomes in the bc dyad are governed by the D1 distribution, but when the outcome c → b occurs, it places the ab dyad at risk a second time. The ab dyad now satisfies the condition (a → b and bS1a) and so faces two risks, from sibling bias and sibling reciprocity bias. The triad stays 120U only if this event fails to occur, with probability 1 − Sr. The same logic holds if the bc and ac outcomes occur before the ab outcome except now it is the bc dyad that faces additional risk. In these four sequences, therefore, one dyad risks just parent reciprocity, one risks sibling and sibling reciprocity and one risks parent reciprocity and then sibling and sibling reciprocity. In the remaining two sequences, when ab and bc occur before ac, once ac occurs both ab and bc face additional risk from sibling and sibling reciprocity. The triad remains 120U only if both events fail to occur. In these two sequences, one dyad faces just parent reciprocity and the other two both parent and sibling and sibling reciprocity. Moreover, we must add in the 120U triads created via double jeopardy in the 111U triad. Therefore, the probability of this triad is:

$$\Pr(\text{120U}) = 3M_0a_0(1-S_r)\left(\frac{4a_1}{6} + \frac{2a_0(1-S_r)}{6}\right) + 6M_0a_0\left(\frac{4N_0a'_1}{6}\right)(1-S_r) = M_0a_0(1-S_r)\left[2a_1 + a_0(1-S_r) + 4N_0a'_1\right]$$

Again we must recall that the second risk events could turn out differently. In the first four sequences, a tie may be added to create a 210 triad. In the second two sequences, a 210 triad could occur in two different ways and a 300 triad could occur if for both dyads, either or both the sibling and sibling reciprocity bias events occur. The probabilities are:

$$\Pr(\text{210} - \text{120U}) = 3M_0a_0\left(\frac{4a_1S_r}{6} + \frac{2a_0\,2S_r(1-S_r)}{6}\right) = 2M_0a_0S_r\,(a_1 + a_0(1-S_r))$$

$$\Pr(\text{300} - \text{120U}) = 3M_0a_0\left(\frac{2a_0S_r^2}{6}\right) = M_0a_0^2S_r^2$$


3.15. The 120C triad

There are six realizations of this pattern and sequence matters. Consider the realization in which a → b, b → c, a → c, and c → a. If the ab and ac outcomes occur before the bc outcome, then b and c are siblings. In the other four sequences, the D0 distribution applies to two of the three dyads and one, the bc dyad, faces double jeopardy. In addition, we must add in the 120C triads created via double jeopardy in the 111U triad. Therefore, the probability of this triad is:

$$\Pr(\text{120C}) = 6M_0a_0\left(\frac{2a_1}{6} + \frac{4a_0(1-S_r)}{6}\right) + 6M_0a_0\left(\frac{4N_0a'_1}{6}\right) = 2M_0a_0\,(a_1 + 2a_0(1-S_r) + 2N_0a'_1)$$

Again we must recall that the second risk event could turn out differently. In four sequences, the c → b tie may be added to create a 210 triad. But there is an additional complication: if that tie is added, a and b are now siblings with one arc present and so exposed to sibling and sibling reciprocity events. If either or both events occur, a 300 triad results. The relevant probabilities are:

$$\Pr(\text{210} - \text{120C}) = 6M_0a_0\left(\frac{4a_0S_r(1-S_r)}{6}\right) = 4M_0a_0^2S_r(1-S_r)$$

$$\Pr(\text{300} - \text{120C}) = 6M_0a_0\left(\frac{4a_0S_r^2}{6}\right) = 4M_0a_0^2S_r^2$$

3.16. The 120D triad

There are three realizations of this pattern. Consider the one in which b → a, b → c, a → c, and c → a. Sequence matters: if the ab and bc outcomes occur before the ac outcome, a and c are siblings and D1 applies. The occurrence of the mutual ac tie does not, however, subject the other dyads to additional risk. In the remaining four sequences, the D0 distribution applies to all three dyads. Moreover, we must add in the 120D triads that occur via double jeopardy in 021D and 030T. Therefore, the probability of this triad is:

$$\Pr(\text{120D}) = 3a_0^2\left(\frac{2M_1}{6} + \frac{4M_0}{6}\right) + 3a_0^2\left(\frac{4N_0M'_1}{6}\right) + 6a_0^2\left(\frac{4a_0S_r}{6}\right) = a_0^2\,(M_1 + 2M_0 + 2N_0M'_1 + 4a_0S_r)$$

3.17. The 210 triad

There are six realizations of this pattern. Consider the one in which a → b, b → a, a → c, c → a, and b → c. Sequence matters in this case: if the ab and ac outcomes occur before the bc outcome, b and c are siblings and D1 applies. If ab and bc occur before ac, the D1 distribution applies to the ac dyad, but when the mutual ac tie occurs, b and c become siblings with one arc present and so are put at additional risk. If ac and bc occur before ab, then the D0 distribution applies to all three, but the occurrence of the ab mutual tie makes b and c siblings with one arc present and thus at additional risk. Furthermore, there are other 210 triads created by double jeopardy situations in other triad types. The overall probability of this triad is quite complicated:

$$\begin{aligned}
\Pr(\text{210}) &= 6M_0\left(\frac{2M_0a_1}{6} + \frac{2a_0M_1(1-S_r)}{6} + \frac{2M_0a_0(1-S_r)}{6}\right) \\
&\quad + 6M_0a_0\left(\frac{4N_0a'_1S_r}{6}\right) + 6M_0a_0\left(\frac{4N_0M'_1(1-S_r)}{6}\right) \\
&\quad + 3M_0^2\left(\frac{4N_0\,2a'_1}{6}\right) + 3M_0a_0\left(\frac{4a_1S_r}{6} + \frac{2a_0\,2S_r(1-S_r)}{6}\right) \\
&\quad + 6M_0a_0\left(\frac{4a_0S_r(1-S_r)}{6}\right) \\
&= M_0\bigl[2M_0a_1 + 2a_0M_1(1-S_r) + 2M_0a_0(1-S_r) + 4a_0N_0a'_1S_r \\
&\qquad\quad + 4a_0N_0M'_1(1-S_r) + 4M_0N_0a'_1 + 2a_0a_1S_r + 6a_0^2S_r(1-S_r)\bigr]
\end{aligned}$$

Again we must recall that the second risk events could turn out differently, in every case producing a 300 triad. The relevant probability is:

$$\Pr(\text{300} - \text{210}) = 6M_0\left(\frac{2a_0M_1S_r}{6} + \frac{2M_0a_0S_r}{6}\right) = 2M_0a_0S_r\,(M_1 + M_0)$$

3.18. The 300 triad

There is just one realization of this pattern and sequence does not matter: in all sequences the first two dyads are governed by the D0 distribution while the third dyad is governed by the D1 distribution. However, there are 300 triads created by double jeopardy events in other triads. Therefore, the total probability of this triad is:

$$\begin{aligned}
\Pr(\text{300}) &= M_0^2M_1 + 6M_0a_0\left(\frac{4N_0M'_1S_r}{6}\right) + 3M_0^2\left(\frac{4N_0M'_1}{6}\right) + 3M_0a_0\left(\frac{2a_0S_r^2}{6}\right) \\
&\quad + 6M_0a_0\left(\frac{4a_0S_r^2}{6}\right) + 6M_0\left(\frac{2a_0M_1S_r}{6} + \frac{2M_0a_0S_r}{6}\right) \\
&= M_0\left[M_0M_1 + 4a_0N_0M'_1S_r + 2M_0N_0M'_1 + 5a_0^2S_r^2 + 2a_0M_1S_r + 2M_0a_0S_r\right]
\end{aligned}$$

This step completes the derivation of the triad distribution using the first strategy.

The second strategy begins initially with 64 outcome branches that may occur because each of three dyads may experience one of four different outcomes. Some of these branches, however, themselves branch out further because the outcome in the third dyad may place either or both of the first two dyads under additional risk. The full results of this analysis are depicted in Table 2. There are 117 distinct branches in the final analysis. In Table 2, the outcomes are enumerated in the form x.y.z, where x refers to one of the original 64 branches, and then y and z refer to further branching that is contingent on the third dyad’s outcome and whether it places either or both the first two dyads under additional risk.

Table 2
Enumeration of all possible branches

| Branch | Probability | Outcome |
|---|---|---|
| (1) | M0 M0 M1 | 300 |
| (2) | M0 M0 a1(bc) | 210 |
| (3) | M0 M0 a1(cb) | 210 |
| (4) | M0 M0 N1 | 201 |
| (5.1) | M0 a0(ac) M1 [1 − Sr(ca)] | 210 |
| (5.2) | M0 a0(ac) M1 Sr(ca) | 300 |
| (6.1) | M0 a0(ac) a1(bc) [1 − Sr(ca)] | 120U |
| (6.2) | M0 a0(ac) a1(bc) Sr(ca) | 210 |
| (7) | M0 a0(ac) a1(cb) | 120C |
| (8) | M0 a0(ac) N1 | 111U |
| (9.1) | M0 a0(ca) M0 [1 − Sr(ac)] | 210 |
| (9.2) | M0 a0(ca) M0 Sr(ac) | 300 |
| (10.1) | M0 a0(ca) a0(bc) [1 − Sr(ac)] | 120C |
| (10.2.1) | M0 a0(ca) a0(bc) [Sr(ac)] [1 − Sr(cb)] | 210 |
| (10.2.2) | M0 a0(ca) a0(bc) Sr(ac) Sr(cb) | 300 |
| (11) | M0 a0(ca) a0(cb) | 120D |
| (12) | M0 a0(ca) N0 | 111D |
| (13.1) | M0 N0 M0 a′1(ac) | 210 |
| (13.2) | M0 N0 M0 a′1(ca) | 210 |
| (13.3) | M0 N0 M0 M′1 | 300 |
| (13.4) | M0 N0 M0 N′1 | 201 |
| (14.1.1) | M0 N0 a0(bc) a′1(ac) [1 − Sr(cb)] | 120U |
| (14.1.2) | M0 N0 a0(bc) a′1(ac) Sr(cb) | 210 |
| (14.2) | M0 N0 a0(bc) a′1(ca) | 120C |
| (14.3.1) | M0 N0 a0(bc) M′1 [1 − Sr(cb)] | 210 |
| (14.3.2) | M0 N0 a0(bc) M′1 [Sr(cb)] | 300 |
| (14.4) | M0 N0 a0(bc) N′1 | 111U |
| (15) | M0 N0 a0(cb) | 111D |
| (16) | M0 N0 N0 | 102 |
| (17.1) | a0(ab) M0 M1 [1 − Sr(ba)] | 210 |
| (17.2) | a0(ab) M0 M1 Sr(ba) | 300 |
| (18) | a0(ab) M0 a1(bc) | 120C |
| (19.1) | a0(ab) M0 a1(cb) [1 − Sr(ba)] | 120U |
| (19.2) | a0(ab) M0 a1(cb) Sr(ba) | 210 |
| (20) | a0(ab) M0 N1 | 111U |
| (21) | a0(ab) a0(ac) M1 | 120D |
| (22) | a0(ab) a0(ac) a1(bc) | 030T |
| (23) | a0(ab) a0(ac) a1(cb) | 030T |
| (24) | a0(ab) a0(ac) N1 | 021D |
| (25.1) | a0(ab) a0(ca) M0 [1 − Sr(ba)] | 120C |
| (25.2.1) | a0(ab) a0(ca) M0 Sr(ba) [1 − Sr(ac)] | 210 |
| (25.2.2) | a0(ab) a0(ca) M0 Sr(ba) Sr(ac) | 300 |
| (26) | a0(ab) a0(ca) a0(bc) | 030C |
| (27.1) | a0(ab) a0(ca) a0(cb) [1 − Sr(ba)] | 030T |
| (27.2) | a0(ab) a0(ca) a0(cb) Sr(ba) | 120D |
| (28) | a0(ab) a0(ca) N0 | 021C |
| (29) | a0(ab) N0 M0 | 111D |
| (30) | a0(ab) N0 a0(bc) | 021C |
| (31) | a0(ab) N0 a0(cb) | 021U |
| (32) | a0(ab) N0 N0 | 012 |
| (33.1) | a0(ba) M0 M0 [1 − Sr(ab)] | 210 |
| (33.2) | a0(ba) M0 M0 Sr(ab) | 300 |
| (34) | a0(ba) M0 a0(bc) | 120D |
| (35.1) | a0(ba) M0 a0(cb) [1 − Sr(ab)] | 120C |
| (35.2.1) | a0(ba) M0 a0(cb) Sr(ab) [1 − Sr(bc)] | 210 |
| (35.2.2) | a0(ba) M0 a0(cb) Sr(ab) Sr(bc) | 300 |
| (36) | a0(ba) M0 N0 | 111D |
| (37.1) | a0(ba) a0(ac) M0 [1 − Sr(ca)] | 120C |
| (37.2.1) | a0(ba) a0(ac) M0 Sr(ca) [1 − Sr(ab)] | 210 |
| (37.2.2) | a0(ba) a0(ac) M0 Sr(ca) Sr(ab) | 300 |
| (38.1) | a0(ba) a0(ac) a0(bc) [1 − Sr(ca)] | 030T |
| (38.2) | a0(ba) a0(ac) a0(bc) Sr(ca) | 120D |
| (39) | a0(ba) a0(ac) a0(cb) | 030C |
| (40) | a0(ba) a0(ac) N0 | 021C |
| (41.1) | a0(ba) a0(ca) M0 Sr(ab) Sr(ac) | 300 |
| (41.2) | a0(ba) a0(ca) M0 Sr(ab) [1 − Sr(ac)] | 210 |
| (41.3) | a0(ba) a0(ca) M0 [1 − Sr(ab)] Sr(ac) | 210 |
| (41.4) | a0(ba) a0(ca) M0 [1 − Sr(ab)] [1 − Sr(ac)] | 120U |
| (42.1) | a0(ba) a0(ca) a0(bc) [1 − Sr(ac)] | 030T |
| (42.2) | a0(ba) a0(ca) a0(bc) Sr(ac) | 120D |
| (43.1) | a0(ba) a0(ca) a0(cb) [1 − Sr(ab)] | 030T |
| (43.2) | a0(ba) a0(ca) a0(cb) Sr(ab) | 120D |
| (44) | a0(ba) a0(ca) N0 | 021U |
| (45.1) | a0(ba) N0 M0 a′1(ac) | 120C |
| (45.2.1) | a0(ba) N0 M0 a′1(ca) [1 − Sr(ab)] | 120U |
| (45.2.2) | a0(ba) N0 M0 a′1(ca) Sr(ab) | 210 |
| (45.3.1) | a0(ba) N0 M0 M′1 [1 − Sr(ab)] | 210 |
| (45.3.2) | a0(ba) N0 M0 M′1 [Sr(ab)] | 300 |
| (45.4) | a0(ba) N0 M0 N′1 | 111U |
| (46.1) | a0(ba) N0 a0(bc) a′1(ac) | 030T |
| (46.2) | a0(ba) N0 a0(bc) a′1(ca) | 030T |
| (46.3) | a0(ba) N0 a0(bc) M′1 | 120D |
| (46.4) | a0(ba) N0 a0(bc) N′1 | 021D |
| (47) | a0(ba) N0 a0(cb) | 021C |
| (48) | a0(ba) N0 N0 | 012 |
| (49.1) | N0 M0 M0 a′1(ab) | 210 |
| (49.2) | N0 M0 M0 a′1(ba) | 210 |
| (49.3) | N0 M0 M0 M′1 | 300 |
| (49.4) | N0 M0 M0 N′1 | 201 |
| (50) | N0 M0 a0(bc) | 111D |
| (51.1.1) | N0 M0 a0(cb) a′1(ab) [1 − Sr(bc)] | 120U |
| (51.1.2) | N0 M0 a0(cb) a′1(ab) Sr(bc) | 210 |
| (51.2) | N0 M0 a0(cb) a′1(ba) | 120C |
| (51.3.1) | N0 M0 a0(cb) M′1 [1 − Sr(bc)] | 210 |
| (51.3.2) | N0 M0 a0(cb) M′1 [Sr(bc)] | 300 |
| (51.4) | N0 M0 a0(cb) N′1 | 111U |
| (52) | N0 M0 N0 | 102 |
| (53) | N0 a0(ac) M0 | 111D |
| (54) | N0 a0(ac) a0(bc) | 021U |
| (55) | N0 a0(ac) a0(cb) | 021C |
| (56) | N0 a0(ac) N0 | 012 |
| (57.1) | N0 a0(ca) M0 a′1(ab) | 120C |
| (57.2.1) | N0 a0(ca) M0 a′1(ba) [1 − Sr(ac)] | 120U |
| (57.2.2) | N0 a0(ca) M0 a′1(ba) Sr(ac) | 210 |
| (57.3.1) | N0 a0(ca) M0 M′1 [1 − Sr(ac)] | 210 |
| (57.3.2) | N0 a0(ca) M0 M′1 [Sr(ac)] | 300 |
| (57.4) | N0 a0(ca) M0 N′1 | 111U |
| (58) | N0 a0(ca) a0(bc) | 021C |
| (59.1) | N0 a0(ca) a0(cb) a′1(ab) | 030T |
| (59.2) | N0 a0(ca) a0(cb) a′1(ba) | 030T |
| (59.3) | N0 a0(ca) a0(cb) M′1 | 120D |
| (59.4) | N0 a0(ca) a0(cb) N′1 | 021D |
| (60) | N0 a0(ca) N0 | 012 |
| (61) | N0 N0 M0 | 102 |
| (62) | N0 N0 a0(bc) | 012 |
| (63) | N0 N0 a0(cb) | 012 |
| (64) | N0 N0 N0 | 003 |

Consider one of the more complex branches, 14.1.1. The sequence of events that leads to this branch is as follows. First, the ab dyad experiences a mutual event and then the ac dyad experiences a null event. Probabilities for both events are given by the D0 distribution. Then a tie forms from b to c in the last dyad with probability determined by the D0 distribution. This occurrence now makes the members of the ac dyad siblings and so the ac dyad is now subject to additional risk. A tie forms from a to c with probability determined by the D′1 distribution. But now b and c are siblings with one arc present and so the bc dyad faces additional risk. In the 14.1.1 branch the tie from c to b does not form, with probability 1 − Sr, and in the 14.1.2 branch it does, with probability Sr. In either case, none of the dyads are put at additional risk and so the branches terminate in a definite outcome.

If we now sum the terms that lead to the same triad type, we get the triad distribution displayed in Table 3, which completely agrees with the results obtained by the alternative method of derivation, giving us some confidence in their validity, given the complexity of derivations involving stochastically nonindependent relational events. In the table, each of the shorthand expressions for the various probabilities can be replaced by the expressions from the appropriate dyad distribution. If we do so and then set all bias parameters to zero, we recover the Bernoulli distribution in Table 1. We will return to an exploration of the biased net triad distribution after we address the problem of estimation.
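The sixteen closed-form probabilities collected in Table 3 can also be checked numerically. The sketch below transcribes them into a single function, then confirms that they sum to one and that the zero-bias case collapses to a random net with density d′ (for example, the 300 triad then has probability d′ to the sixth power). The function name and the dictionary layout are illustrative assumptions; the formulas are those of Table 3 and its note.

```python
def triad_distribution(d, pi, sigma, rho):
    """Biased net triad probabilities, keyed by triad type (Table 3)."""
    q = (1 - sigma) * (1 - rho)
    Sr = 1 - q
    s = sigma + (1 - sigma) * d
    t = (1 - pi) * q * (1 - d)
    M0, a0, N0 = d * (pi + (1 - pi) * d), d * (1 - d) * (1 - pi), (1 - d) * (1 - d * (1 - pi))
    M1, a1, N1 = s * (1 - t), s * t, 1 - s * (1 + t)
    M1p, a1p, N1p = sigma * (1 - q), sigma * q, 1 - sigma * (1 + q)
    null2 = N1 + 2 * N0 * N1p                         # shared by 021D, 111U, 201
    asym2 = a1 + 2 * a0 * (1 - Sr) + 2 * N0 * a1p     # shared by 030T, 120C
    return {
        "003": N0**3,
        "012": 6 * a0 * N0**2,
        "102": 3 * M0 * N0**2,
        "021D": a0**2 * null2,
        "021C": 6 * a0**2 * N0,
        "021U": 3 * a0**2 * N0,
        "111U": 2 * M0 * a0 * null2,
        "030T": 2 * a0**2 * asym2,
        "030C": 2 * a0**3,
        "111D": 6 * M0 * a0 * N0,
        "201": M0**2 * null2,
        "120U": M0 * a0 * (1 - Sr) * (2 * a1 + a0 * (1 - Sr) + 4 * N0 * a1p),
        "120C": 2 * M0 * a0 * asym2,
        "120D": a0**2 * (M1 + 2 * M0 + 2 * N0 * M1p + 4 * a0 * Sr),
        "210": M0 * (2 * M0 * a1 + 2 * a0 * M1 * (1 - Sr) + 2 * M0 * a0 * (1 - Sr)
                     + 4 * a0 * N0 * a1p * Sr + 4 * a0 * N0 * M1p * (1 - Sr)
                     + 4 * M0 * N0 * a1p + 2 * a0 * a1 * Sr + 6 * a0**2 * Sr * (1 - Sr)),
        "300": M0 * (M0 * M1 + 4 * a0 * N0 * M1p * Sr + 2 * M0 * N0 * M1p
                     + 5 * a0**2 * Sr**2 + 2 * a0 * M1 * Sr + 2 * M0 * a0 * Sr),
    }


if __name__ == "__main__":
    probs = triad_distribution(0.10, 0.50, 0.25, 0.50)   # illustrative parameters
    for triad, p in probs.items():
        print(f"{triad:>4}  {p:.4f}")
    print("total", round(sum(probs.values()), 10))
```

Summing to one is a necessary consistency check on the transcription, not a proof that each individual expression is correct.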

Table 3
Biased net triad distribution

| Triad type | Probability |
|---|---|
| 003 | $N_0^3$ |
| 012 | $6a_0N_0^2$ |
| 102 | $3M_0N_0^2$ |
| 021D | $a_0^2(N_1 + 2N_0N'_1)$ |
| 021C | $6a_0^2N_0$ |
| 021U | $3a_0^2N_0$ |
| 111U | $2M_0a_0(N_1 + 2N_0N'_1)$ |
| 030T | $2a_0^2[a_1 + 2a_0(1-S_r) + 2N_0a'_1]$ |
| 030C | $2a_0^3$ |
| 111D | $6M_0a_0N_0$ |
| 201 | $M_0^2(N_1 + 2N_0N'_1)$ |
| 120U | $M_0a_0(1-S_r)[2a_1 + a_0(1-S_r) + 4N_0a'_1]$ |
| 120C | $2M_0a_0[a_1 + 2a_0(1-S_r) + 2N_0a'_1]$ |
| 120D | $a_0^2(M_1 + 2M_0 + 2N_0M'_1 + 4a_0S_r)$ |
| 210 | $M_0[2M_0a_1 + 2a_0M_1(1-S_r) + 2M_0a_0(1-S_r) + 4a_0N_0a'_1S_r + 4a_0N_0M'_1(1-S_r) + 4M_0N_0a'_1 + 2a_0a_1S_r + 6a_0^2S_r(1-S_r)]$ |
| 300 | $M_0[M_0M_1 + 4a_0N_0M'_1S_r + 2M_0N_0M'_1 + 5a_0^2S_r^2 + 2a_0M_1S_r + 2M_0a_0S_r]$ |

Note: $M_0 = d'(\pi + (1-\pi)d')$, $a_0 = d'(1-d')(1-\pi)$, $N_0 = (1-d')(1-d'(1-\pi))$, $M_1 = (\sigma + (1-\sigma)d')(1 - (1-\pi)(1-\sigma)(1-\rho)(1-d'))$, $a_1 = (\sigma + (1-\sigma)d')(1-\pi)(1-\sigma)(1-\rho)(1-d')$, $N_1 = 1 - (\sigma + (1-\sigma)d')(1 + (1-\pi)(1-\sigma)(1-\rho)(1-d'))$, $M'_1 = \sigma(1 - (1-\sigma)(1-\rho))$, $a'_1 = \sigma(1-\sigma)(1-\rho)$, $N'_1 = 1 - \sigma(1 + (1-\sigma)(1-\rho))$, $S_r = 1 - (1-\sigma)(1-\rho)$.

4. Estimation methods

The dyad distributions for dyads with k = 0, . . . , g − 2 parents can allow direct expression of the pseudo likelihood of a given set of observations. Let mk, ak, and nk denote the number of mutual, asymmetric, and null dyads with k parents. The pseudo likelihood expression for the observed digraph as a function of the four parameters π, ρ, σ, and d′ is:

$$L(\pi, \rho, \sigma, d') = \prod_{k=0}^{g-2} [P_k(M)]^{m_k}\,[P_k(A)]^{a_k}\,[P_k(N)]^{n_k}$$

where the Pk probabilities are as previously specified. The logic leading to this expression is as follows. First, a standard homogeneity assumption is made that all dyads with k parents are isomorphic and so subjected to the same probabilities of dyadic outcomes. This assumption can, of course, be relaxed in various ways. Second, it is assumed that once the dyadic probabilities are conditioned on the dyad’s number of parents, different dyads are independent. However, it is clear that outcomes in the ij dyad depend on what has happened in other dyads. Therefore, the above expression is not a true likelihood expression but rather a pseudo likelihood expression. There is ample precedent for the use of pseudo likelihood estimation in the social network literature, most recently with respect to exponential random graph (p*) models (Wasserman and Pattison, 1996; Pattison and Wasserman, 1999; Robins et al., 1999; Anderson et al., 1999; Robins et al., 2001). While these models have their drawbacks, there is general agreement on their usefulness if results are interpreted cautiously.
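The log of the pseudo likelihood is simply a count-weighted sum of log dyad probabilities, which the following sketch makes concrete. Several pieces are assumptions made purely for illustration: the `dyad_probs` callable stands in for the Pk probabilities, the demonstration covers only k = 0 and k = 1 (using D0 and D1 from the note to Table 3), P_k(A) is taken as the probability of an asymmetric dyad in either direction (2a_k), and the counts are made up. The one-dimensional search over σ is a crude stand-in for a grid search, not the procedure of Skvoretz (1990).

```python
import math


def log_pseudo_likelihood(counts, dyad_probs):
    """counts maps k -> (m_k, a_k, n_k); dyad_probs(k) returns
    (P_k(M), P_k(A), P_k(N)). Returns the log of the product above."""
    total = 0.0
    for k, observed in counts.items():
        for count, prob in zip(observed, dyad_probs(k)):
            if count:  # empty cells contribute nothing, so skip them
                total += count * math.log(prob)
    return total


def demo_dyad_probs(d, pi, sigma, rho):
    """Hypothetical stand-in for P_k that covers only k = 0 and k = 1."""
    q = (1 - sigma) * (1 - rho)
    s = sigma + (1 - sigma) * d
    t = (1 - pi) * q * (1 - d)
    table = {
        0: (d * (pi + (1 - pi) * d), 2 * d * (1 - d) * (1 - pi), (1 - d) * (1 - d * (1 - pi))),
        1: (s * (1 - t), 2 * s * t, 1 - s * (1 + t)),
    }
    return lambda k: table[k]


# Made-up counts (m_k, a_k, n_k) for k = 0 and k = 1, for illustration only.
counts = {0: (30, 60, 2300), 1: (25, 20, 193)}

# Crude grid over sigma with the other three parameters held fixed.
grid = [i / 100 for i in range(1, 100)]
best_sigma = max(
    grid,
    key=lambda s: log_pseudo_likelihood(counts, demo_dyad_probs(0.02, 0.3, s, 0.1)),
)
print(best_sigma)
```

Whether P_k(A) is defined per direction or for either direction only changes the likelihood by a constant factor, so it does not move the maximizing parameter values.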

Implementation of the above expression (or its log) for estimation can be done in several ways. One available procedure is the grid search algorithm proposed by Skvoretz (1990). He used such an algorithm to estimate parameters from a table cross classifying dyads by the number of their parents and the observed dyadic outcome, collapsing into one category dyads whose number of parents was equal to or greater than some cutoff value (15). In fact, if these categories were not collapsed, Skvoretz’s estimation technique would have exactly implemented the above pseudo likelihood estimation strategy. Existing programs for such estimation require just modest modification, namely, truncation of the parent count needs to be removed from the procedure and the probability expressions revised.

Table 4
Triad distributions

| Triad type | Pseudo-probability | Probability |
|---|---|---|
| 003 | $N_0^3$ | $N_0^3$ |
| 012 | $6a_0N_0^2$ | $6a_0N_0^2$ |
| 102 | $3M_0N_0^2$ | $3M_0N_0^2$ |
| 021D | $3a_0^2N_1$ | $a_0^2(N_1 + 2N_0N'_1)$ |
| 021C | $6a_0^2N_0$ | $6a_0^2N_0$ |
| 021U | $3a_0^2N_0$ | $3a_0^2N_0$ |
| 111U | $6M_0a_0N_1$ | $2M_0a_0(N_1 + 2N_0N'_1)$ |
| 030T | $6a_0^2a_1$ | $2a_0^2[a_1 + 2a_0(1-S_r) + 2N_0a'_1]$ |
| 030C | $2a_0^3$ | $2a_0^3$ |
| 111D | $6M_0a_0N_0$ | $6M_0a_0N_0$ |
| 201 | $3M_0^2N_1$ | $M_0^2(N_1 + 2N_0N'_1)$ |
| 120U | $3M_0a_1^2$ | $M_0a_0(1-S_r)[2a_1 + a_0(1-S_r) + 4N_0a'_1]$ |
| 120C | $6M_0a_1^2$ | $2M_0a_0[a_1 + 2a_0(1-S_r) + 2N_0a'_1]$ |
| 120D | $3a_0^2M_1$ | $a_0^2(M_1 + 2M_0 + 2N_0M'_1 + 4a_0S_r)$ |
| 210 | $6M_0M_1a_1$ | $M_0[2M_0a_1 + 2a_0M_1(1-S_r) + 2M_0a_0(1-S_r) + 4a_0N_0a'_1S_r + 4a_0N_0M'_1(1-S_r) + 4M_0N_0a'_1 + 2a_0a_1S_r + 6a_0^2S_r(1-S_r)]$ |
| 300 | $M_1^3$ | $M_0[M_0M_1 + 4a_0N_0M'_1S_r + 2M_0N_0M'_1 + 5a_0^2S_r^2 + 2a_0M_1S_r + 2M_0a_0S_r]$ |

Note: $M_0 = d'(\pi + (1-\pi)d')$, $a_0 = d'(1-d')(1-\pi)$, $N_0 = (1-d')(1-d'(1-\pi))$, $M_1 = (\sigma + (1-\sigma)d')(1 - (1-\pi)(1-\sigma)(1-\rho)(1-d'))$, $a_1 = (\sigma + (1-\sigma)d')(1-\pi)(1-\sigma)(1-\rho)(1-d')$, $N_1 = 1 - (\sigma + (1-\sigma)d')(1 + (1-\pi)(1-\sigma)(1-\rho)(1-d'))$, $M'_1 = \sigma(1 - (1-\sigma)(1-\rho))$, $a'_1 = \sigma(1-\sigma)(1-\rho)$, $N'_1 = 1 - \sigma(1 + (1-\sigma)(1-\rho))$, $S_r = 1 - (1-\sigma)(1-\rho)$.

One way to assess the pseudo likelihood estimation procedure is to compare the triad distribution it implies with the triad distribution worked out from first principles. Under the pseudo likelihood assumption that dyads are (conditionally) independent, the triad distribution can be easily derived since the operative assumption is that the outcomes in the three dyads are independent. One simply takes each dyad and inspects the configuration to see if the third node is a parent. If not, the set of dyadic outcome probabilities for parent–child dyads is used and if it is, the set of outcome probabilities for sibling dyads is used. Then the three appropriate probabilities are multiplied together. The total probability for a triad must also take into account the number of ways such a configuration could occur.

Table 4 compares the formal expressions for each of the triad probabilities under the two derivations. To illustrate how the expressions in the first column of Table 4 are derived, consider the 111U triad in which the bc dyad is mutual and there is an arc from c to a. Node c is a parent of the ab dyad and so therefore the probability that the ab dyad is null is, in the notation of Table 2, N1. Node b is not a parent of the ac dyad and so the probability that the ac dyad is asymmetric from c to a is a0. Finally, node a is not a parent of the bc dyad and so the probability that it is mutual is M0. There are six different ways the 111U structural pattern could be realized among three nodes a, b, and c. Hence the (pseudo) probability of the 111U triad type is 6M0a0N1. Note that this differs from the probability we derived from first principles, as shown in the table. The other expressions are derived in a similar fashion.

Table 5
Comparison of distributions (d′ = 0.10, π = 0.50, ρ = 0.50)

| Triad type | σ = 0.25: Pseudo-probability | σ = 0.25: Probability | σ = 0.50: Pseudo-probability | σ = 0.50: Probability | σ = 0.75: Pseudo-probability | σ = 0.75: Probability |
|---|---|---|---|---|---|---|
| 003 | 0.6250 | 0.6250 | 0.6250 | 0.6250 | 0.6250 | 0.6250 |
| 012 | 0.1974 | 0.1974 | 0.1974 | 0.1974 | 0.1974 | 0.1974 |
| 102 | 0.1206 | 0.1206 | 0.1206 | 0.1206 | 0.1206 | 0.1206 |
| 021D | 0.0038 | 0.0035 | 0.0024 | 0.0021 | 0.0011 | 0.0009 |
| 021C | 0.0104 | 0.0104 | 0.0104 | 0.0104 | 0.0104 | 0.0104 |
| 021U | 0.0052 | 0.0052 | 0.0052 | 0.0052 | 0.0052 | 0.0052 |
| 111U | 0.0092 | 0.0086 | 0.0058 | 0.0051 | 0.0027 | 0.0022 |
| 030T | 0.0007 | 0.0010 | 0.0008 | 0.0012 | 0.0005 | 0.0009 |
| 030C | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.0002 |
| 111D | 0.0127 | 0.0127 | 0.0127 | 0.0127 | 0.0127 | 0.0127 |
| 201 | 0.0056 | 0.0053 | 0.0035 | 0.0031 | 0.0016 | 0.0014 |
| 120U | 0.0005 | 0.0004 | 0.0006 | 0.0003 | 0.0003 | 0.0001 |
| 120C | 0.0010 | 0.0012 | 0.0013 | 0.0015 | 0.0006 | 0.0011 |
| 120D | 0.0016 | 0.0015 | 0.0030 | 0.0028 | 0.0044 | 0.0043 |
| 210 | 0.0049 | 0.0032 | 0.0100 | 0.0043 | 0.0105 | 0.0034 |
| 300 | 0.0197 | 0.0037 | 0.1163 | 0.0081 | 0.3913 | 0.0143 |
| Total | 1.0185 | 1.0000 | 1.1150 | 1.0000 | 1.3846 | 1.0000 |

It is clear that both derivations yield the same expressions for seven triads, namely, types 003, 012, 102, 021C, 021U, 030C, and 111D. The feature that distinguishes these triads is that they do not contain any embedded subgraphs in which one node directs arcs to both other nodes. These are precisely the triads in which all potential bias effects (defined in this particular biased net model) occur at the dyadic rather than triadic level. It is difficult to tell, however, from the expressions how the pseudo-probability and probability differ for the other nine triads. Table 5 provides some insight into this question by calculating probability values from various combinations of parameters.

First, the term pseudo-probability is used in Table 5 because the pseudo-probability expressions for the sixteen triad types do not, in fact, sum to unity when the sibling bias is non-zero. This aberration, of course, is due entirely to the conditional independence assumption made in expressing these probabilities. Second, at low levels of sibling bias, the pseudo-probabilities are less distorted in absolute difference relative to the calculated probabilities. Certain triads tend to be overrepresented in the pseudo-probability distribution at all levels of sibling bias: 021D, 111U, 201, 120U, 120D, 210 and 300. This overrepresentation, particularly for the 300 triad, becomes severe when there are high levels of sibling bias. Third, even if we “re-normalize” the pseudo-probability distribution so that it sums to unity, this will not bring the distribution into line with the one calculated from first principles.


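Table 6
Parameter estimates

| Parameter | Fall: Pseudo | Fall: Triad | Spring: Pseudo | Spring: Triad |
|---|---|---|---|---|
| d′ | 0.0188 | 0.0457 | 0.0188 | 0.0494 |
| π | 0.2633 | 0.4826 | 0.2312 | 0.4324 |
| σ | 0.2313 | 0.3581 | 0.2211 | 0.3218 |
| ρ | 0.0875 | 0.3896 | 0.0383 | 0.2705 |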

Essentially, renormalization will assign lower weight to triads in which only dyadic biases may occur and sometimes higher and sometimes lower weight to triads subject to triadic effects.
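The failure of the pseudo-probabilities to sum to unity is easy to reproduce. The sketch below evaluates the first-column expressions of Table 4 at the Table 5 parameter settings and reports the column total for each level of sibling bias, together with the share the 300 triad would receive after renormalization. The function name is an illustrative assumption; the expressions are transcribed as printed in Table 4.

```python
def pseudo_triad_distribution(d, pi, sigma, rho):
    """Pseudo-probabilities from the first column of Table 4."""
    q = (1 - sigma) * (1 - rho)
    s = sigma + (1 - sigma) * d
    t = (1 - pi) * q * (1 - d)
    M0, a0, N0 = d * (pi + (1 - pi) * d), d * (1 - d) * (1 - pi), (1 - d) * (1 - d * (1 - pi))
    M1, a1, N1 = s * (1 - t), s * t, 1 - s * (1 + t)
    return {
        "003": N0**3,
        "012": 6 * a0 * N0**2,
        "102": 3 * M0 * N0**2,
        "021D": 3 * a0**2 * N1,
        "021C": 6 * a0**2 * N0,
        "021U": 3 * a0**2 * N0,
        "111U": 6 * M0 * a0 * N1,
        "030T": 6 * a0**2 * a1,
        "030C": 2 * a0**3,
        "111D": 6 * M0 * a0 * N0,
        "201": 3 * M0**2 * N1,
        "120U": 3 * M0 * a1**2,
        "120C": 6 * M0 * a1**2,
        "120D": 3 * a0**2 * M1,
        "210": 6 * M0 * M1 * a1,
        "300": M1**3,
    }


if __name__ == "__main__":
    for sigma in (0.25, 0.50, 0.75):
        pseudo = pseudo_triad_distribution(0.10, 0.50, sigma, 0.50)
        total = sum(pseudo.values())
        print(f"sigma={sigma:.2f}  total={total:.4f}  "
              f"renormalized 300={pseudo['300'] / total:.4f}")
```

Comparing the renormalized 300 share with the Probability column of Table 5 shows why rescaling alone does not repair the distribution.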

For a final point of comparison, we can use the triad distribution as a basis for parameter estimation and compare its parameter estimates to those from pseudo-likelihood estimation. The data here come from Coleman (1964). In the Fall of 1957 and the Spring of 1958, 73 boys in a small high school in the Midwest were asked “What fellows here in school do you go around with most often?” Density of both networks is just under 0.05 and nominations may or may not have been reciprocated. The estimation uses a grid search algorithm to maximize the log likelihood of the triad classification. Results are presented in Table 6.

There are some consistencies in how estimates change from one data point to another; for instance, all three bias parameters decline from Fall to Spring in both estimations. But, clearly the two procedures give quite different estimates with the triad procedure having the larger values for all four parameters. That it would give larger values for σ and ρ is understandable. The estimation effectively assumes each dyad with common parents has only one. Consequently, the biases must be much larger to produce the same degree of closure that smaller biases could produce when multiple parents independently contribute to closure. One final point of comparison relates to how dense a purely random net must be to match each of the biased nets. In both of these networks, the total number of dyads is 2628 and we know how they are distributed from k = 0, . . . , g − 2 common parents. Hence, we can empirically determine wk, the proportion of dyads that have k parents. Not surprisingly, given the difference in parameter estimates, the purely random nets have smaller densities when the pseudo-likelihood estimates are used as compared to the triad distribution estimates. For Fall and Spring, the random densities are 0.0471 and 0.0503 using the pseudo-likelihood estimates and 0.0853 and 0.0916 using the triad distribution estimates.

5. Conclusion

We have attempted to make headway in shoring up the technical foundations of biased net theory in three directions: first, resolving problems in the formal representation of various social structural biases; second, deriving the triad distribution; and third, using pseudo-likelihood estimation to evaluate parameters and assessing it by reference to the triad distribution. The latter two directions provide insight on the problems of applying biased net theory but we must add that further work is called for. Pseudo likelihood assumptions quite obviously give poor predictions for the distribution of triads and, of course, any higher order subgraphs. The “local impact” assumption of the triad distribution produces parameter estimates that may exaggerate the effect of certain biases. Further progress may require a combination of the two approaches. Despite these difficulties, it may be worth forging ahead with more complicated biased net models, specifically, ones that allow the biases to be dependent on attributes of the nodes.

Simulation studies may advance understanding of the properties of biased net models and the estimation schemes we have explored. While simulations have been used previously, as we noted earlier, they were not without some flaws (Skvoretz, 1990). Different types of simulations, based on the Metropolis algorithm, have been used to explore properties of exponential random graph models. The simulation begins with a randomly selected start. Then an ordered pair of nodes is selected at random and, with a certain probability based on the particular model being evaluated, the state of that ordered pair is changed from absent to present or vice versa. The process is repeated many, many times (the “burn-in” period). It is assumed that given enough time the process reaches equilibrium and representative states of the network can be sampled and studied for their properties. In the context of biased net models, this approach begins with a random start, but then selects an unordered dyad at random, finds its number of parents, and applies the appropriate dyad probability distribution to change (or not) the state of the dyad. Sufficient repetition, it is assumed, will lead to convergence, at which time states of the entire network can be sampled for further analysis. We expect to outline results from this approach in a later report in which we continue the program set out here, namely to work on multiple and related fronts to improve the state of the foundations of biased net theory.
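The dyad-resampling scheme just described can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the simulation program referred to above: it treats the simplest case of a symmetric relation with density d and sibling bias σ, and it assumes that a dyad with k common parents is tied with probability 1 − (1 − d)(1 − σ)^k. The function names, the update rule, and the fixed number of steps are all hypothetical choices made for illustration.

```python
import random


def tie_probability(k, d, sigma):
    """Assumed dyad distribution: a tie is absent only if the random
    event fails (1 - d) and none of the k parents closes it (1 - sigma)^k."""
    return 1.0 - (1.0 - d) * (1.0 - sigma) ** k


def simulate_biased_net(g, d, sigma, steps, seed=0):
    """Resample randomly chosen unordered dyads of a symmetric net.

    g: number of nodes; d: density; sigma: sibling bias;
    steps: number of dyad updates (burn-in length is the caller's choice).
    Returns a symmetric 0/1 adjacency matrix as a list of lists.
    """
    rng = random.Random(seed)
    adj = [[0] * g for _ in range(g)]

    # Random start: each dyad is tied independently with probability d.
    for i in range(g):
        for j in range(i + 1, g):
            if rng.random() < d:
                adj[i][j] = adj[j][i] = 1

    for _ in range(steps):
        # Select an unordered dyad at random.
        i, j = rng.sample(range(g), 2)
        # Count its parents: third nodes tied to both i and j.
        k = sum(1 for h in range(g)
                if h != i and h != j and adj[h][i] and adj[h][j])
        # Redraw the dyad's state from the assumed dyad distribution.
        state = 1 if rng.random() < tie_probability(k, d, sigma) else 0
        adj[i][j] = adj[j][i] = state
    return adj


def density(adj):
    """Proportion of unordered dyads that are tied."""
    g = len(adj)
    ties = sum(adj[i][j] for i in range(g) for j in range(i + 1, g))
    return ties / (g * (g - 1) / 2)
```

For example, `density(simulate_biased_net(g=73, d=0.05, sigma=0.3, steps=200000))` would give the density of one sampled state for a net with 2628 dyads. Whether a given number of steps suffices for convergence is not something this sketch checks.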

Biased net models constitute an alternative to the increasingly visible exponential random graph models. The latter models arise from an application of a general methodology that is just as applicable to modeling crop yields in adjacent fields or to the spin of electrons in a plasma. They offer quite general parameterizations of local neighborhood effects on the presence or absence of a tie, even if these effects are obscure or have no obvious theoretical foundation. To illustrate, consider the basic biased net model for a symmetric relation. In this case, we have only two parameters, density and sibling bias. The latter is the only substantive parameter and it varies over the unit interval. In the corresponding exponential random graph model introduced by Frank and Strauss (1986), there are three parameters: a density effect, a triangle or closed triad effect, and a two-star effect. Interpretively, the density effects directly correspond and the sibling bias is related to the closed triad effect. However, nothing corresponds to the two-star effect. The two-star effect is present, not because of a specific theoretical reason, but because the mathematics that underlie the model require it. While the triangle effect and the sibling bias are related, the generality of the exponential random graph framework allows the triangle effect to vary from negative infinity to positive infinity. Negative values would mean that edges that complete triangles have lower probability than edges that do not, in effect, an anti-sibling bias. While such a bias could be defined in the context of biased net models, it is clearly a theoretically different quantity than the ordinary sibling bias. Viewed in a positive light, the work in exponential random graph models suggests ways of expanding the universe of biased net models to take account of biases in tie formation or location that previously had not been recognized or considered.
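For reference, the three-parameter exponential random graph model mentioned above can be written out as follows. The notation is an assumption introduced here for illustration and is not taken from the text: θ is the density effect, α the two-star effect, τ the triangle effect, L(x), S_2(x) and T(x) count the edges, two-stars and triangles of the graph x, and κ is the normalizing constant.

```latex
\Pr(X = x) = \frac{1}{\kappa(\theta,\alpha,\tau)}
  \exp\bigl\{\theta\,L(x) + \alpha\,S_2(x) + \tau\,T(x)\bigr\}

% Conditional log-odds of a tie on dyad (i,j), given the rest of the graph:
\log\frac{\Pr(x_{ij}=1 \mid x_{-ij})}{\Pr(x_{ij}=0 \mid x_{-ij})}
  = \theta + \alpha\,(n_i + n_j) + \tau\,k_{ij}
```

Here n_i and n_j are the numbers of ties of i and j to third nodes, and k_ij is the number of nodes tied to both i and j, that is, the number of parents of the dyad. Because τ is an unrestricted real number, a negative τ lowers the odds of a tie as k_ij grows, which is the anti-sibling bias noted above.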

References

Anderson, C.J., Wasserman, S., Crouch, B., 1999. A p∗ primer: logit models for social networks. Social Networks 21, 37–66.

Blau, P.M., 1977. Inequality and Heterogeneity: A Primitive Theory of Social Structure. Free Press, New York.

Coleman, J.S., 1964. Introduction to Mathematical Sociology. Free Press, New York.

Fararo, T.J., 1981. Biased networks and social structure theorems: part I. Social Networks 3, 137–159.

Fararo, T.J., 1983. Biased networks and strength of weak ties. Social Networks 5, 1–11.

Fararo, T.J., Skvoretz, J., 1984. Biased networks and social structure theorems: part II. Social Networks 6, 223–258.

Fararo, T.J., Skvoretz, J., 1987. Unification research programs: integrating two structural theories. American Journal of Sociology 92, 1183–1209.

Fararo, T.J., Skvoretz, J., 1989. The biased net theory of social structures and the problem of integration. In: Berger, J., Zelditch, M., Anderson, B. (Eds.), Sociological Theories in Progress: New Formulations. Sage, Newbury Park, CA, pp. 212–255.

Fararo, T.J., Sunshine, M., 1964. A Study of a Biased Friendship Net. Syracuse University Youth Development Center and Syracuse University Press, Syracuse, NY.

Foster, C.C., Rapoport, A., Orwant, C.J., 1963. A study of a large sociogram II. Elimination of free parameters. Behavioral Science 8, 56–65.

Frank, O., Strauss, D., 1986. Markov graphs. Journal of the American Statistical Association 81, 832–842.

Granovetter, M., 1973. The strength of weak ties. American Journal of Sociology 78, 1360–1380.

Landau, H.G., 1952. On some problems of random nets. Bulletin of Mathematical Biophysics 14, 203–212.

Newman, M.E.J., 2000. Models of the small world: a review. Journal of Statistical Physics 101, 819–841.

Pattison, P.E., Wasserman, S., 1999. Logit models and logistic regression for social networks, II. Multivariate relations. British Journal of Mathematical and Statistical Psychology 52, 169–194.

Rapoport, A., 1951a. Nets with distance bias. Bulletin of Mathematical Biophysics 13, 85–91.

Rapoport, A., 1951b. The probability distribution of distinct hits on closely packed targets. Bulletin of Mathematical Biophysics 13, 133–137.

Rapoport, A., 1953a. Spread of information through a population with socio-structural bias: I. Assumption of transitivity. Bulletin of Mathematical Biophysics 15, 523–533.

Rapoport, A., 1953b. Spread of information through a population with socio-structural bias: II. Various models with partial transitivity. Bulletin of Mathematical Biophysics 15, 535–546.

Rapoport, A., 1953c. Spread of information through a population with socio-structural bias: III. Suggested experimental procedures. Bulletin of Mathematical Biophysics 16, 75–81.

Rapoport, A., 1957. A contribution to the theory of random and biased nets. Bulletin of Mathematical Biophysics 19, 257–271 (Reprinted in Leinhardt, S. (Ed.), 1977. Social Networks: A Developing Paradigm. Academic Press, New York, pp. 389–409).

Rapoport, A., 1958. Nets with reciprocity bias. Bulletin of Mathematical Biophysics 20, 191–201.

Rapoport, A., 1963. Mathematical models of social interaction. In: Luce, R.D., Bush, R.R., Galanter, E. (Eds.), Handbook of Mathematical Psychology, vol. 2. Wiley, New York, pp. 493–579.

Rapoport, A., Horvath, W.J., 1961. A study of a large sociogram. Behavioral Science 6, 279–291.

Rapoport, A., Solomonoff, R., 1951. Connectivity of random nets. Bulletin of Mathematical Biophysics 13, 107–117.

Robins, G., Elliott, P., Pattison, P., 2001. Network models for social selection processes. Social Networks 23, 1–30.

Robins, G., Pattison, P., Wasserman, S., 1999. Logit models and logistic regression for social networks, III. Valued relations. Psychometrika 64, 371–394.

Skvoretz, J., 1983. Salience, heterogeneity, and consolidation of parameters: civilizing Blau’s primitive theory. American Sociological Review 48, 360–375.

Skvoretz, J., 1985. Random and biased networks: simulations and approximations. Social Networks 7, 225–261.

Skvoretz, J., 1990. Biased net theory: approximations, simulations, and observations. Social Networks 12, 217–238.

Skvoretz, J., Fararo, T.J., 1986. Inequality and association: a biased net theory. Current Perspectives in Social Theory 7, 29–50.

Skvoretz, J., Fararo, T.J., 1989. Connectivity and the small world problem. In: Kochen, M. (Ed.), The Small World. Ablex, Norwood, NJ.

Solomonoff, R., 1952. An exact method for the computation of the connectivity of random nets. Bulletin of Mathematical Biophysics 14, 153–157.

Strogatz, S.H., 2001. Exploring complex networks. Nature 410, 268–276.

Wasserman, S., Pattison, P.E., 1996. Logit models and logistic regression for social networks, I. An introduction to Markov random graphs and p∗. Psychometrika 60, 401–425.

Watts, D.J., 1999. Networks, dynamics, and the small-world phenomenon. American Journal of Sociology 105, 493–527.
