Statistical Learning

Improving and Extending STERGM Approximations Based on Cross-Sectional Data and Tie Durations

Pages 166-180 | Received 15 Mar 2022, Accepted 21 Jun 2023, Published online: 29 Aug 2023
 

Abstract

Temporal exponential-family random graph models (TERGMs) are a flexible class of models for network ties that change over time. Separable TERGMs (STERGMs) are a subclass of TERGMs in which the dynamics of tie formation and dissolution can be separated within each discrete time step and may depend on different factors. The Carnegie et al. approximation improves estimation efficiency for a subclass of STERGMs, allowing them to be reliably estimated from inexpensive cross-sectional study designs. This approximation adapts to cross-sectional data by attempting to construct a STERGM with two specific properties: a cross-sectional equilibrium distribution defined by an exponential-family random graph model (ERGM) for the network structure, and geometric tie duration distributions defined by constant hazards for tie dissolution. In this article we focus on approaches for improving the behavior of the Carnegie et al. approximation and increasing its scope of application. We begin with Carnegie et al.’s observation that the exact result is tractable when the ERGM is dyad-independent, and then show that taking the sparse limit of the exact result leads to a different approximation than the one they presented. We show that the new approximation outperforms theirs for sparse, dyad-independent models, and observe that the errors tend to increase with the strength of dependence for dyad-dependent models. We then develop theoretical results in the dyad-dependent case, showing that when the ERGM is allowed to have arbitrary dyad-dependent terms and some dyad-dependent constraints, both the old and new approximations are asymptotically exact as the size of the STERGM time step goes to zero. We note that the continuous-time limit of the discrete-time approximations has the desired cross-sectional equilibrium distribution and exponential tie duration distributions with the desired means. 
We show that our results extend to hypergraphs, and we propose an extension of the Carnegie et al. framework to dissolution hazards that depend on tie age. Supplementary materials for this article are available online.

Supplementary Materials

R code for producing and plotting the simulation results is provided in the supplementary materials.

Acknowledgments

We acknowledge Dave Hunter and Alina Kuvelkar for their review of the manuscript, Carter Butts for his review of the manuscript and discussions about continuous-time processes with ERGM equilibria, and the statnet development team for general support.

Disclosure Statement

The authors report there are no competing interests to declare.

Notes

1 We will use the term age for an active tie to refer to the time elapsed since the tie formed, and the term duration for a completed tie to refer to its age at the time it dissolved. For processes with a discrete time step, the tie age is equal to 1 (in units of the time step) at the first discrete time point when the tie is active in the network, and the tie duration is equal to the tie age at the final discrete time point when the tie is active in the network.

The “durational data” for the EDA are typically the ages of active ties in the sample. Under a constant hazard model for dissolution, the mean age of an active tie and the mean duration of a completed tie coincide.
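The coincidence of the two means under a constant hazard can be checked numerically. The sketch below is illustrative only and is not taken from the article's supplementary code. It assumes a stationary process in which new ties enter at a constant rate, so that the age distribution of active ties is proportional to the survival function P(duration ≥ a). The function names and the truncation point `max_age` are hypothetical.

```python
def survival(hazard, max_age):
    """Return [P(duration >= a) for a = 1..max_age] for a discrete-time hazard.

    hazard(a) is the probability that a tie of age a dissolves before the
    next time point. Ages start at 1, following the convention in Note 1.
    """
    surv = []
    s = 1.0
    for age in range(1, max_age + 1):
        surv.append(s)
        s *= 1.0 - hazard(age)
    return surv


def mean_duration(hazard, max_age=5000):
    """Mean duration of a completed tie (sum truncated at max_age)."""
    surv = survival(hazard, max_age)
    return sum(a * s * hazard(a) for a, s in enumerate(surv, start=1))


def mean_active_age(hazard, max_age=5000):
    """Mean age of an active tie in a stationary cross-section.

    Assumption: constant inflow of new ties, so P(age = a) is proportional
    to P(duration >= a).
    """
    surv = survival(hazard, max_age)
    return sum(a * s for a, s in enumerate(surv, start=1)) / sum(surv)


def constant_hazard(age):
    # Geometric durations: the hazard does not depend on tie age.
    return 0.1


def age_dependent_hazard(age):
    # Hypothetical hazard that is higher in the first time step.
    return 0.5 if age == 1 else 0.1
```

Calling `mean_duration(constant_hazard)` and `mean_active_age(constant_hazard)` returns the same value up to truncation error, whereas the two functions disagree for `age_dependent_hazard`. This is why the distinction between age and duration matters once dissolution hazards are allowed to depend on tie age.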

2 The new EDA introduced in Section 3.1.2 is implemented in the same way, except that the formation coefficients are $(\theta, -\log(D_1), \ldots, -\log(D_L))$.

3 In practice, dissolution models summarize the systematic patterns in edge dissolution using common dyad-independent terms, possibly depending on nodal or dyadic attributes. The adjustment principle is the same: the EDA STERGM formation model coefficients are obtained by subtracting the coefficients of the dissolution model (with or without the durational adjustment of +1, for the new and old EDA, respectively) from the coefficients of the ERGM. When a term appears in both the dissolution model and the ERGM, we subtract one coefficient from the other. When a term only appears in the dissolution model, the dissolution model coefficient is subtracted from zero to calculate the corresponding EDA STERGM formation coefficient.

To give a simple example of this approach, using the syntax from the ergm package, consider an ERGM specified with ~ edges, and durational targets that vary according to whether or not nodes match on “sex”, so the dissolution model can be taken to be ~ edges + nodematch("sex"). By implication, the formation model for the EDA STERGM is then ~ edges + nodematch("sex"). Letting $\theta$ denote the edges coefficient in the ERGM, $D_0$ the durational target for edges not matching on “sex”, and $D_1$ the durational target for edges matching on “sex”, the dissolution coefficients are $\log(D_0 - 1)$ for the edges term and $\log(D_1 - 1) - \log(D_0 - 1)$ for the nodematch term. The formation coefficients are then approximated by $\theta - \log(D_0)$ for the edges term and $\log(D_0) - \log(D_1)$ for the nodematch term, using the new EDA.
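The subtraction rule described in this note can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the ergm or tergm implementation. The function names, the dictionary representation of coefficients, and the term label `nodematch.sex` are assumptions made for the example.

```python
import math


def nodematch_dissolution_coefs(d0, d1, plus_one=False):
    """Dissolution coefficients for ~ edges + nodematch("sex").

    d0, d1: durational targets for non-matching and matching edges.
    plus_one=False gives log(D - 1), the dissolution coefficients themselves;
    plus_one=True applies the durational adjustment of +1, giving log(D),
    which is the quantity subtracted under the new EDA.
    """
    shift = 0.0 if plus_one else 1.0
    edges = math.log(d0 - shift)
    return {
        "edges": edges,
        "nodematch.sex": math.log(d1 - shift) - edges,
    }


def eda_formation_coefs(ergm_coefs, dissolution_coefs):
    """Subtract dissolution coefficients from ERGM coefficients term by term.

    A term that appears only in the dissolution model is subtracted from zero.
    """
    terms = list(ergm_coefs) + [t for t in dissolution_coefs if t not in ergm_coefs]
    return {
        t: ergm_coefs.get(t, 0.0) - dissolution_coefs.get(t, 0.0) for t in terms
    }


# Hypothetical inputs: an edges-only ERGM and two durational targets.
ergm = {"edges": -3.0}
old_eda = eda_formation_coefs(ergm, nodematch_dissolution_coefs(10.0, 20.0))
new_eda = eda_formation_coefs(
    ergm, nodematch_dissolution_coefs(10.0, 20.0, plus_one=True)
)
```

With these inputs, `new_eda` holds θ − log(D_0) for the edges term and log(D_0) − log(D_1) for the nodematch term, matching the expressions above, while `old_eda` uses log(D − 1) in place of log(D).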

4 We found that substantially increasing the number of proposals per time step for these models resulted in different trends for the old approximation than those shown in Carnegie et al. (2015), suggesting that the higher number of proposals is needed to allow for equilibration of the Metropolis-Hastings Markov chain within each time step. A further 10-fold increase in proposals produced largely similar results, suggesting that the number used was sufficient to capture the main trends.

5 The asymptotic cross-sectional exactness result can be generalized as follows. Suppose $F$ is a map from nonnegative numbers $t$ to transition probability matrices on some finite state space, such that $F(0)$ is the identity, $F(t)$ is one-sided differentiable at $t = 0$, and the one-sided derivative $F'(0)$ has a one-dimensional left kernel. Then the left kernel of $F'(0)$ is spanned by a (unique) probability vector $\pi$, and given any $\epsilon > 0$ there exists a $\delta > 0$ such that $0 < t < \delta$ implies that any stationary distribution $\sigma$ of $F(t)$ satisfies $\|\sigma - \pi\| < \epsilon$, where $\|\cdot\|$ denotes the Euclidean norm. The proof of this more general result is analogous to the one presented here. Related convergence results (e.g., for finite-dimensional distributions) have appeared in the literature (Möhle 1998; Möhle and Notohara 2016).
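A toy numerical check of this statement is sketched below. It is an illustration under stated assumptions rather than the article's construction. The state space has two states (tie absent, tie present), and F(t) is taken to be the product (I + tA)(I + tB) of a hypothetical formation step and a hypothetical dissolution step, so that F(0) is the identity and the derivative at zero is A + B. The rates and function names are made up for the example.

```python
import math


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transition_matrix(t, form_rate=0.2, diss_rate=0.05):
    """Toy F(t) = (I + tA)(I + tB) on states (0 = no tie, 1 = tie).

    Valid as a transition matrix when t * form_rate and t * diss_rate
    both lie in [0, 1].
    """
    form_step = [[1.0 - t * form_rate, t * form_rate], [0.0, 1.0]]
    diss_step = [[1.0, 0.0], [t * diss_rate, 1.0 - t * diss_rate]]
    return matmul(form_step, diss_step)


def stationary_two_state(p):
    """Stationary distribution of a two-state chain with p[0][1] + p[1][0] > 0."""
    up, down = p[0][1], p[1][0]
    return [down / (up + down), up / (up + down)]


def limit_distribution(form_rate=0.2, diss_rate=0.05):
    """Probability vector spanning the left kernel of F'(0) = A + B."""
    total = form_rate + diss_rate
    return [diss_rate / total, form_rate / total]


def euclidean_error(t):
    """Euclidean distance between the stationary distribution of F(t) and pi."""
    sigma = stationary_two_state(transition_matrix(t))
    pi = limit_distribution()
    return math.sqrt(sum((s - p) ** 2 for s, p in zip(sigma, pi)))


# Errors for a decreasing sequence of time step sizes.
errors = [euclidean_error(t) for t in (1.0, 0.1, 0.01)]
```

In this toy case the stationary distribution of F(t) differs from π for every t > 0, but the entries of `errors` shrink as t decreases, consistent with the convergence described in the note.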

6 How these functions are defined for ordinary graphs not corresponding to hypergraphs is arbitrary and does not affect the results in any way; those states are prohibited by the ordinary graph model constraints, and will not arise even as union or intersection networks in the STERGM transition probabilities, because the relevant constraints are dyad-independent.

7 The above discussion allows $X$ and/or $Y$ to be the empty set $\emptyset$, but this can be prohibited without further modification if that is the desired convention for hypergraphs.

Additional information

Funding

This work was supported by the National Institutes of Health under Grant R01-AI138783. Partial support for this research came from a Eunice Kennedy Shriver National Institute of Child Health and Human Development research infrastructure Grant, P2C HD042828, to the Center for Studies in Demography and Ecology at the University of Washington.
