All that Glitters is not Gold: Modeling Relational Events with Measurement Errors

by Cornelius Fritz, et al.
Universität München

As relational event models are an increasingly popular model for studying relational structures, the reliability of large-scale event data collection becomes increasingly important. Automated or human-coded events often suffer from relatively low sensitivity in event identification. At the same time, most sensor data are primarily based on actors' spatial proximity for predefined time windows; hence, the observed events could relate either to a social relationship or to random co-location. Both examples lead to false positives in the observed events that may bias estimates and inference. We propose an Error-corrected Relational Event Model (EcREM) as an extension to existing approaches for interaction data. The model provides a flexible solution for modeling data while controlling for false positives. Estimation of our model is carried out via an empirical Bayes approach using data augmentation. In a simulation study, we investigate the properties of the estimation procedure. Subsequently, we apply this model to combat events from the Syrian civil war and to student co-location data. Results from both the simulation and the applications identify the EcREM as a suitable approach to modeling relational event data in the presence of measurement error.



1 Introduction

In recent years, event data have become ubiquitous in the social sciences. For instance, interpersonal structures are examined using face-to-face interactions (Elmer and Stadtfeld, 2020). At the same time, political event data are employed to study and predict the occurrence and intensity of armed conflict (Fjelde and Hultman, 2014; Blair and Sambanis, 2020; Dorff et al., 2020). Butts (2008a) introduced the Relational Event Model (REM) to study such relational event data. In comparison to standard network data of durable relations observed at specific time points, relational events describe instantaneous actions or, put differently, interactions at a fine-grained temporal resolution (Borgatti et al., 2009).

Concurrent with their increasing prominence, event data also face increasing challenges regarding their validity. Data collection on face-to-face interactions relies on different types of sociometric badges (Eagle and Pentland, 2006), for which a recent study reports a sensitivity of the event identification between 50% and 60% when compared to video-coded data (Elmer et al., 2019). Political event data on armed conflict, in contrast, are generally collected via automated or human coding of news and social media reporting (Kauffmann, 2020). In this context, sensitivity issues are especially prevalent in machine-coded data, where both false-positive rates and the share of false positives among all observations have been reported to reach over 60% (King and Lowe, 2003; Jäger, 2018). However, even human-coded data suffer from this problem as they over-report conflict in urban locations (Dawkins, 2020; Weidmann, 2015).

Against this background, the analysis of automated relational events with any existing model, e.g., the REM (Butts, 2008a) or the Dynamic Actor-Oriented Model (Stadtfeld, 2012), should be done cautiously because the observed events are not necessarily error-free. Therefore, we propose a method that can correct for spurious, that is, false-positive events. In particular, we take a counting process point of view where some increments of the dyadic counting processes are true events, while others may be attributed to false events, i.e., exist due to measurement error. This decomposition of observed events into true-positives and false-positives results in two different intensities governing the two respective types of events. The false events are described by a false-positive intensity that we specify independently of the true-positive intensity of true events. We present the model under the assumption that the false-positive events are purely random; therefore, we can model the respective intensity solely as a constant term. However, more complex scenarios involving the specification of exogenous and endogenous covariates for the false-positive intensity are also possible. In general, however, we are primarily interested in studying what factors drive the intensity of true-positives. We model this intensity following Butts (2008a), but the methodology is extendable to other model types, such as those of Stadtfeld (2012), Vu et al. (2015), DuBois et al. (2013), Perry and Wolfe (2013), or Lerner et al. (2021).

Measurement errors are a general problem in empirical social research. Carroll et al. (2006) provide a book-length review of the background and recent advances in the field. For networks, inaccurate reports that lead to measurement errors are frequent, and several methods to correct inference accordingly have been proposed (see Newman, 2018, Chapter 5 for a general survey). One strand of research tries to assess the accuracy and robustness of global network measures, such as degree centrality or clustering coefficient, under a specific type of measurement error, network structure, and network size (Wang et al., 2012; Martin and Niemeyer, 2019; Almquist, 2012). Multiple ways to correct inference under a specific model with imperfect network data were also proposed. A general assessment of the accuracy of informant self-reports combined with simultaneous inference on static networks based on the family of exponential random graph models is provided by Butts (2003). For longitudinal observations, Lospinoso (2012) suggests a longitudinal model to control for false-positive and false-negative dyadic reports. In line with actor-oriented models developed in Snijders (2001), the network evolution is characterized by a stochastic process observed at specific time points. The probabilistic relationship between the observed network of imperfect measurements and the real latent network, on the other hand, is defined by exponential random graph models (Robins et al., 2007). However, no method is available for the analysis of event data containing spurious interactions. Even in the closely related field of Time-to-Event Analysis (for an introduction see Kalbfleisch and Prentice, 2002), solutions to measurement errors primarily focus on error-affected covariates (Wulfsohn and Tsiatis, 1997), wrongly scaled duration time until an event occurs (Oh et al., 2018), or imperfect self-reports of outcomes (Gu et al., 2015). 
Against this background, we introduce the Error-corrected Relational Event Model (EcREM) as a tool to explicitly control for measurement errors stemming from false positives.

This article is structured as follows: We begin in Section 2 by introducing our methodology. In particular, we lay out the general framework to study relational event data proposed by Butts (2008a) in Section 2.1 and introduce an extension to this framework, the EcREM, to correct for the presence of false-positive events in the remainder of Section 2. Through a simulation study in Section 3, we investigate the performance of our proposed estimator when measurement error is correctly specified and when it is nonexistent. We then apply the proposed model in Section 4 to analyze fighting incidents in the Syrian civil war as well as social interaction data from a college campus. A discussion of possible implications and other types of measurement error models for the analysis of events follows in Section 5. Finally, Section 6 concludes the article.

2 Error-corrected relational event model

2.1 Modeling framework for relational events

We denote the observed events as an event stream $E = \{e_1, \ldots, e_M\}$ of $M$ elements. Each element consists of a tuple encoding the information of an event. In particular, we denote the two actors of an event $e_m$ by $a(e_m)$ and $b(e_m)$ and the time of the event by $t(e_m)$. For simplicity of notation, we omit the argument $e_m$ when no ambiguity exists and write $a$ for $a(e_m)$, $b$ for $b(e_m)$, and $t$ for $t(e_m)$. Stemming from our application cases, we mainly focus on undirected events in this article; hence the events $(a, b, t)$ and $(b, a, t)$ are equivalent in our framework. We denote the set of actor tuples between which events can possibly occur by $\mathcal{R}$, where, for simplicity, we assume that $\mathcal{R}$ is time-constant.

Following Perry and Wolfe (2013) and Vu et al. (2011a), we assume that the events in $E$ are generated by an inhomogeneous matrix-valued counting process

$$N(t) = \left(N_{ab}(t)\right)_{(a,b) \in \mathcal{R}}, \qquad (1)$$

which, in our case, is assumed to be a matrix-valued Poisson process (see Daley and Vere-Jones, 2008, for an introduction to stochastic processes). Without loss of generality, we assume that $N(t)$ is observed during the temporal interval $[0, T]$, starting at $t = 0$. The cells of (1) count how often all possible dyadic events have occurred between time $0$ and $t$; hence $N(t)$ can be conceived as a dynamic version of a standard social network adjacency matrix (Butts, 2008b). For instance, $N_{ab}(t)$ indicates how often actors $a$ and $b$ have interacted in the time interval $[0, t]$. Therefore, observing event $e = (a, b, t)$ constitutes a unit increase in $N_{ab}$ at time point $t$. We denote with $\lambda(t) = (\lambda_{ab}(t))_{(a,b) \in \mathcal{R}}$ the matrix-valued intensity of process $N(t)$. Based on this intensity function, we can characterize the instantaneous probability of a unit increase in a specific dimension of $N(t)$ at time point $t$ (Daley and Vere-Jones, 2008). We parametrize the intensity conditional on the history of the processes, $\mathcal{H}(t)$, which may also include additional exogenous covariates. Hence, we write $\lambda(t \mid \mathcal{H}(t))$, where $\mathcal{H}(t)$ also carries a covariate process $x(t)$ to be specified later. We define the intensity function at the tie level:

$$\lambda_{ab}(t \mid \mathcal{H}(t)) = \lambda_0(t) \exp\left(\theta^{\top} s_{ab}(t)\right), \qquad (2)$$

where the vector of sufficient statistics $s_{ab}(t)$ is defined with the help of a dyadic operator that stacks the statistics of the two actors, $\lambda_0(t)$ is the baseline intensity characterized by coefficients $\beta$, and the parameters $\theta$ weight the statistics computed by $s_{ab}(t)$. Based on $s_{ab}(t)$, we can formulate endogenous effects, which are calculated from $\mathcal{H}(t)$, exogenous variables calculated from $x(t)$, or a combination of the two, which results in complex dependencies between the observed events. Examples of endogenous effects for undirected events include degree-related statistics, such as the absolute difference in the numbers of previous distinct events of actors $a$ and $b$, or hyperdyadic effects, e.g., investigating how triadic closure influences the observed events. In our first application case, exogenous factors include a dummy variable indicating whether groups $a$ and $b$ share an ethno-religious identity. Alternatively, one may incorporate continuous covariates, e.g., the absolute distance between groups $a$ and $b$.

Figure 1: Graphical illustrations of endogenous and exogenous covariates. Solid lines represent past interactions, while dotted lines are possible but unrealized events. Node coloring indicates the node’s value on a categorical covariate. The relative risk of the events in the second row compared to the events in the first row is $\exp(\theta)$ if all other covariates are fixed, where $\theta$ is the coefficient of the respective statistic of each row. If $\theta > 0$, the event shown in the second row is more likely than the event from the first row, and for $\theta < 0$ the reverse holds.

We give graphical representations of possible endogenous effects in Figure 1 and provide their mathematical formulations together with a general summary in Annex A. When comparing the structures in the first row with the ones in the second row, the event indicated by the dotted line differs by one unit in the respective statistic. Its intensity thus changes by the multiplicative factor $\exp(\theta)$, where $\theta$ is the respective parameter of the statistic, if all other covariates are fixed. The interpretation of the coefficients is, therefore, closely related to the interpretation of relative risk models (Kalbfleisch and Prentice, 2002).

Previous studies propose multiple options to model the baseline intensity $\lambda_0(t)$. Vu et al. (2011a, b) follow a semiparametric approach akin to the proportional hazard model of Cox (1972), while Butts (2008a) assumes a constant baseline intensity. We follow Etezadi-Amoli and Ciampi (1987) by setting $\lambda_0(t) = \exp(f(t))$, with $f(t)$ being a smooth function in time parametrized by B-splines (de Boor, 2001):

$$f(t) = \sum_{j=1}^{J} \beta_j B_j(t), \qquad (3)$$

where $B_j(t)$ denotes the $j$th B-spline basis function weighted by coefficient $\beta_j$. To ensure a smooth fit of $f(t)$, we impose a penalty (or regularization) on $\beta = (\beta_1, \ldots, \beta_J)^{\top}$, which is formulated through the prior structure

$$p(\beta \mid \tau^2) \propto \exp\left(-\frac{1}{2\tau^2}\, \beta^{\top} K \beta\right), \qquad (4)$$

where $\tau^2$ is a hyperparameter controlling the level of smoothing and $K$ is a penalty matrix that penalizes the differences of coefficients corresponding to adjacent basis functions, as proposed by Eilers and Marx (1996). We refer to Ruppert et al. (2003) and Wood (2017) for further details on penalized spline smoothing. Given this notation, we can simplify (2):

$$\lambda_{ab}(t \mid \mathcal{H}(t)) = \exp\left(\gamma^{\top} z_{ab}(t)\right), \qquad (5)$$

with $\gamma = (\beta^{\top}, \theta^{\top})^{\top}$ and $z_{ab}(t) = \left(B_1(t), \ldots, B_J(t), s_{ab}(t)^{\top}\right)^{\top}$.
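The penalized spline construction behind (3) and (4) can be made concrete with a few lines of code. The sketch below is illustrative and not the authors' implementation: it builds B-spline basis functions via the Cox-de Boor recursion and the difference penalty matrix of Eilers and Marx (1996) in pure Python.

```python
def bspline_basis(knots, degree, j, t):
    """Cox-de Boor recursion for the j-th B-spline basis function B_j(t)."""
    if degree == 0:
        return 1.0 if knots[j] <= t < knots[j + 1] else 0.0
    left = 0.0
    if knots[j + degree] > knots[j]:
        left = ((t - knots[j]) / (knots[j + degree] - knots[j])
                * bspline_basis(knots, degree - 1, j, t))
    right = 0.0
    if knots[j + degree + 1] > knots[j + 1]:
        right = ((knots[j + degree + 1] - t) / (knots[j + degree + 1] - knots[j + 1])
                 * bspline_basis(knots, degree - 1, j + 1, t))
    return left + right

def difference_penalty(n_basis, order=2):
    """Penalty matrix K = D'D from order-th differences of adjacent coefficients."""
    # start from the identity and difference the rows `order` times
    D = [[1.0 if i == j else 0.0 for j in range(n_basis)] for i in range(n_basis)]
    for _ in range(order):
        D = [[D[i + 1][j] - D[i][j] for j in range(n_basis)] for i in range(len(D) - 1)]
    # K = D^T D
    return [[sum(D[r][i] * D[r][j] for r in range(len(D))) for j in range(n_basis)]
            for i in range(n_basis)]
```

The penalty in (4) is then the quadratic form $\beta^{\top} K \beta / (2\tau^2)$; in the interior of the knot range, the basis functions sum to one at every time point.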

2.2 Accounting for false-positives in relational events

Given the discussion in the introduction, we may conclude that some increments of the observed counting process $N(t)$ are true-positive events, while others stem from false-positive events. Spurious events can occur because of coding errors during machine- or human-based data collection. To account for such erroneous data points, we now introduce the Error-corrected Relational Event Model (EcREM).

Figure 2: Graphical illustration of a possible path of the counting process of observed events ($N_{ab}$) between actors $a$ and $b$ that encompasses some false ($N^{FP}_{ab}$) and true events ($N^{TP}_{ab}$).

First, we decompose the observed Poisson process into two separate Poisson processes, i.e., $N(t) = N^{TP}(t) + N^{FP}(t)$. On the dyadic level, $N^{TP}_{ab}(t)$ denotes the number of true-positive events between actors $a$ and $b$ until $t$, and $N^{FP}_{ab}(t)$ the number of false-positive events that are spurious. Assuming that $N(t)$ is a Poisson process, we can apply the so-called thinning property, stating that two separate processes that sum up to a Poisson process are also Poisson processes (Daley and Vere-Jones, 2008). A graphical illustration of the three introduced counting processes, $N_{ab}$, $N^{TP}_{ab}$, and $N^{FP}_{ab}$, is given in Figure 2. In this illustrative example, we observe four events at times $t_1, \ldots, t_4$, although only the first and third are true-positives, while the second and fourth events constitute false-positives. Therefore, the counting process $N_{ab}$ jumps at all times of an event, yet $N^{TP}_{ab}$ does so only at $t_1$ and $t_3$. Conversely, $N^{FP}_{ab}$ increases at $t_2$ and $t_4$.

The counting processes $N^{TP}(t)$ and $N^{FP}(t)$ are characterized by the dyadic intensities $\lambda^{TP}_{ab}(t)$ and $\lambda^{FP}_{ab}(t)$, where we respectively denote the history of all true-positive and false-positive processes by $\mathcal{H}^{TP}(t)$ and $\mathcal{H}^{FP}(t)$. This can also be perceived as a competing risks setting, where events can be caused either by the true-positive or the false-positive intensity (Gelfand et al., 2000). To make the estimation of $\lambda^{TP}_{ab}(t)$ and $\lambda^{FP}_{ab}(t)$ feasible and identifiable (Heckman and Honoré, 1989), we assume that both intensities are independent of one another, which means that their correlation is fully accounted for by the covariates. Building on the superpositioning property of Poisson processes, the specification of these two intensity functions also defines the intensity of the observed counting process $N(t)$. In particular, $\lambda_{ab}(t) = \lambda^{TP}_{ab}(t) + \lambda^{FP}_{ab}(t)$ holds (Daley and Vere-Jones, 2008).
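The superpositioning and thinning properties invoked here are easy to verify numerically. The following self-contained sketch (illustrative only, not part of the paper) simulates two independent homogeneous Poisson processes and merges them; the observed rate is then close to the sum of the two rates, and each observed event is a true-positive with probability $\lambda^{TP}/(\lambda^{TP} + \lambda^{FP})$.

```python
import random

def simulate_poisson(rate, horizon, rng):
    """Simulate event times of a homogeneous Poisson process on [0, horizon]."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)  # exponential waiting times
        if t > horizon:
            return times
        times.append(t)

rng = random.Random(1)
lam_tp, lam_fp, horizon = 3.0, 1.0, 2000.0
true_events = simulate_poisson(lam_tp, horizon, rng)
false_events = simulate_poisson(lam_fp, horizon, rng)
observed = sorted(true_events + false_events)  # superposition of the two processes

# Empirically, the observed rate is close to lam_tp + lam_fp, and the share of
# true-positives among observed events is close to lam_tp / (lam_tp + lam_fp).
obs_rate = len(observed) / horizon
tp_share = len(true_events) / len(observed)
```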

The true-positive intensity drives the counting process of true events and only depends on the history of true-positive events. This assumption is reasonable: if erroneous events were mixed together with true events, the covariates computed for actors $a$ and $b$ at time $t$ from the history would be confounded and could no longer be interpreted in a consistent manner. We specify $\lambda^{TP}_{ab}(t)$ in line with (2) at the dyadic level by:

$$\lambda^{TP}_{ab}\left(t \mid \mathcal{H}^{TP}(t)\right) = \exp\left(\gamma^{TP\top} z_{ab}(t)\right). \qquad (6)$$

At the same time, the false-positive intensity determines the type of measurement error generating false-positives. One may consider the false-positive process as an overall noise level with a constant intensity. This leads to the following setting:

$$\lambda^{FP}_{ab}(t) = \exp\left(\gamma^{FP}_0\right). \qquad (7)$$

The error structure, that is, the intensity of the false-positive process, can be made more complex, but to ensure identifiability, it cannot depend on the same covariates as $\lambda^{TP}_{ab}(t)$. We return to the discussion of this point below and focus on model (7) for the moment.

2.3 Posterior inference via data augmentation

To draw inference on $\gamma = \left(\gamma^{TP}, \gamma^{FP}\right)$, we employ an empirical Bayes approach. Specifically, we will sample from the posterior of $\gamma$ given the observed data. Our approach is thereby comparable to the estimation of standard mixture models (Diebolt and Robert, 1994) and latent competing risk models (Gelfand et al., 2000).

For our proposed method, the observed data is the event stream $E$ of all events, regardless of being false-positive or true-positive. To adequately estimate the model formulated in Section 2, we lack information on whether a given event is false-positive or true-positive. We denote this formally as a latent indicator variable for event $e_m$:

$$z_m = \begin{cases} 1, & \text{if } e_m \text{ is a true-positive event}, \\ 0, & \text{if } e_m \text{ is a false-positive event}. \end{cases}$$

We write $z = (z_1, \ldots, z_M)$ to refer to the latent indicators of all events and use $\gamma$ to shorten $\left(\gamma^{TP}, \gamma^{FP}\right)$. Given this notation, we can apply the data augmentation algorithm developed in Tanner and Wong (1987) to sample from the joint posterior distribution of $(z, \gamma)$ by iterating between the I Step (Imputation) and P Step (Posterior) defined as:

I Step: Draw $z^{(r+1)}$ from the posterior $P\left(z \mid \gamma^{(r)}, E\right)$;
P Step: Draw $\gamma^{(r+1)}$ from the augmented posterior $P\left(\gamma \mid z^{(r+1)}, E\right)$.

This iterative scheme generates a sequence $\left(z^{(r)}, \gamma^{(r)}\right)_{r = 1, 2, \ldots}$ that (under mild conditions) converges to draws from the joint posterior of $(z, \gamma)$ and is a particular case of a Gibbs sampler. For $r = 0$, an initial value $\gamma^{(0)}$ needs to be supplied.³ Each iteration consists of an Imputation and a Posterior step, resembling the Expectation and Maximization steps of the EM algorithm (Dempster et al., 1977). Note, however, that Tanner and Wong (1987) proposed this method with multiple imputations in each I Step and a mixture of all imputed complete-data posteriors in the P Step. We follow Little and Rubin (2002) and Diebolt and Robert (1994) by performing one draw of $z$ and $\gamma$ in every iteration, which is a specific case of data augmentation. As Noghrehchi et al. (2021) argue, this approach is closely related to the stochastic EM algorithm (Celeux et al., 1996). The main difference between the two approaches is that in our P Step, the current parameters are sampled from the complete-data posterior, and not fixed at its mean as in Celeux et al. (1996). Consequently, the data augmentation algorithm is a proper multiple imputation procedure (MI, Rubin, 1987), while the stochastic EM algorithm is improper MI.⁴ We choose the data augmentation algorithm over the stochastic EM algorithm because Rubin's combination rule for approximate standard errors can only be applied to proper MI procedures (Noghrehchi et al., 2021).

³ For our implementation, we randomly sample a share of the events to be true-positives, while the rest are false-positives. Subsequently, we set $\gamma^{(0)}$ to equal the posterior mean of the implied complete-data posterior.
⁴ See B1 of Noghrehchi et al. (2021) and Rubin (1987) for a discussion of proper and improper MI procedures.

In what follows, we give details and derivations on the I and P Steps and then exploit MI to combine a relatively small number of draws from the posterior to obtain point and interval estimates.


To acquire samples from $z$ conditional on $\gamma$ and $E$, we begin by decomposing the joint density into:

$$P(z \mid \gamma, E) = \prod_{m=1}^{M} P\left(z_m \mid z_1, \ldots, z_{m-1}, \gamma, E\right). \qquad (8)$$

The distribution of $z_m$ conditional on $z_1, \ldots, z_{m-1}$, $\gamma$, and $E$ is:

$$P\left(z_m = 1 \mid z_1, \ldots, z_{m-1}, \gamma, E\right) = \frac{\lambda^{TP}_{a_m b_m}(t_m)}{\lambda^{TP}_{a_m b_m}(t_m) + \lambda^{FP}_{a_m b_m}(t_m)}. \qquad (9)$$

Note that the information of $z_1, \ldots, z_{m-1}$ and $E$ allows us to calculate $\mathcal{H}^{TP}(t_m)$ as well as $\mathcal{H}^{FP}(t_m)$. By iteratively applying (9) and plugging in the current draws for $z_1, \ldots, z_{m-1}$, we can draw samples in the I Step through a sequential design that sweeps once from $m = 1$ to $M$. The mathematical derivation of (9) is provided in Annex B.
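For given intensity values, the I Step reduces, event by event, to a Bernoulli draw with success probability (9). A minimal sketch follows; the intensity callables `lam_tp` and `lam_fp` are hypothetical placeholders, not the paper's implementation.

```python
import random

def tp_probability(lam_tp_val, lam_fp_val):
    """Posterior probability that an observed event is a true-positive, cf. (9)."""
    return lam_tp_val / (lam_tp_val + lam_fp_val)

def impute_indicators(events, lam_tp, lam_fp, rng):
    """One sweep of the I Step: draw a latent indicator for every observed event.

    events: list of (a, b, t) tuples; lam_tp, lam_fp: callables returning the
    current true- and false-positive intensity for a dyad at a time point.
    """
    return [1 if rng.random() < tp_probability(lam_tp(a, b, t), lam_fp(a, b, t)) else 0
            for (a, b, t) in events]
```

With constant intensities $\lambda^{TP} = 3$ and $\lambda^{FP} = 1$, for example, every event is classified as true-positive with probability $3/4$.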


As already stated, we assume that the true-positive and false-positive intensities are independent. Hence, sampling from the complete-data posteriors of $\gamma^{TP}$ and $\gamma^{FP}$ can be carried out independently. In the ensuing section, we therefore only show how to sample from $P\left(\gamma^{TP} \mid z, E\right)$; sampling from $P\left(\gamma^{FP} \mid z, E\right)$ is possible in the same manner. To derive this posterior, we begin by showing that the likelihood of $z$ and $E$ with parameter $\gamma^{TP}$ is the likelihood of the counting process $N^{TP}(t)$, which resembles a Poisson regression. Subsequently, we state all priors to derive the desired complete-data posterior.

Given a general $z$ sampled in the previous I Step and the observed events $E$, we reconstruct a unique complete path of $N^{TP}(t)$ by setting

$$N^{TP}_{ab}(t) = \sum_{m:\, t_m \le t} z_m \, \mathbb{1}\left(\{a_m, b_m\} = \{a, b\}\right), \qquad (10)$$

where $\mathbb{1}(\cdot)$ is an indicator function. The corresponding likelihood of $N^{TP}(t)$ results from the property that the element-wise increments of the counting process between any times $s$ and $t$ with $s < t$ and arbitrary actors $a$ and $b$ with $(a, b) \in \mathcal{R}$ are Poisson distributed:

$$N^{TP}_{ab}(t) - N^{TP}_{ab}(s) \sim \text{Poisson}\left(\int_{s}^{t} \lambda^{TP}_{ab}(u)\, \mathrm{d}u\right). \qquad (11)$$

The integral in (11) is approximated through a simple rectangular approximation between the observed event times to keep the numerical effort feasible, so that the distributional assumption simplifies to:

$$N^{TP}_{ab}(t_m) - N^{TP}_{ab}(t_{m-1}) \sim \text{Poisson}\left(\lambda^{TP}_{ab}(t_m)\,(t_m - t_{m-1})\right). \qquad (12)$$
We specify the priors for $\beta$ and $\theta$ separately and independently of one another. The prior for $\beta$ was already stated in (4). Through a restricted maximum likelihood approach, we estimate the corresponding hyperparameter $\tau^2$ such that it maximizes the marginal likelihood of $\beta$ and $\theta$ given the data (for additional information on this estimation procedure and general empirical Bayes theory for penalized splines, see Wood, 2011, 2020). Regarding the linear coefficients $\theta$, we assume flat priors, i.e., $p(\theta) \propto \text{const}$, indicating no prior knowledge.

In the last step, we apply Wood's (2006) result that, for large samples, the posterior distribution of $\gamma^{TP}$ under likelihoods resulting from distributions belonging to the exponential family, such as the Poisson distribution in (12), can be approximated through:

$$\gamma^{TP} \mid z, E \;\overset{a}{\sim}\; \mathcal{N}\left(\hat{\gamma}^{TP}, \hat{V}\right). \qquad (13)$$

Here, $\hat{\gamma}^{TP}$ denotes the penalized maximum likelihood estimator resulting from (12) with the extended penalty matrix $\tilde{K}$ defined by

$$\tilde{K} = \begin{pmatrix} K & 0 \\ 0 & 0 \end{pmatrix},$$

where $K$ is defined in accordance with (4) and the zero blocks are matrices filled with zeroes; if $p$ is the length of $\beta$ and $q$ the length of $\theta$, then $K$ is $p \times p$ and the lower-right block is $q \times q$. The penalized likelihood is then given by:

$$\ell_p\left(\gamma^{TP}\right) = \ell\left(\gamma^{TP}\right) - \frac{1}{2\tau^2}\, \gamma^{TP\top} \tilde{K}\, \gamma^{TP}, \qquad (14)$$

which is equivalent to a generalized additive model; hence we refer to Wood (2017) for a thorough treatment of the computational methods needed to find $\hat{\gamma}^{TP}$. The variance matrix in (13) has the following structure:

$$\hat{V} = \left(Z^{\top} W Z + \frac{1}{\tau^2} \tilde{K}\right)^{-1}.$$

Values for $Z$ and $W$ can be extracted from the estimation procedure that maximizes (14) with respect to $\gamma^{TP}$: $Z$ is a matrix whose rows are given by $z_{ab}(t)$ as defined in (6), evaluated at the observed dyads and event times. Similarly, $W$ is a diagonal weight matrix.

Set: $\gamma^{(0)}$ according to footnote 3
for $r = 0, \ldots, R$ do
       Imputation Step: Sample $z^{(r+1)}$
       for $m = 1, \ldots, M$ do
  • Sample $z_m^{(r+1)}$ according to (9)

  • If $z_m^{(r+1)} = 1$, update $\mathcal{H}^{TP}(t_m)$

       end for
      Posterior Step: Sample $\gamma^{(r+1)}$
  • Reconstruct $N^{TP}$ and $N^{FP}$ from $z^{(r+1)}$ and $E$ according to (10)

  • Obtain $\hat{\gamma}^{FP}$ and $\hat{V}^{FP}$ by maximizing the penalized Poisson likelihood
    given in (12) (only for $N^{FP}$ instead of $N^{TP}$)

  • Sample $\gamma^{FP,(r+1)} \sim \mathcal{N}\left(\hat{\gamma}^{FP}, \hat{V}^{FP}\right)$

  • Obtain $\hat{\gamma}^{TP}$ and $\hat{V}^{TP}$ by maximizing the penalized Poisson likelihood
    given in (12)

  • Sample $\gamma^{TP,(r+1)} \sim \mathcal{N}\left(\hat{\gamma}^{TP}, \hat{V}^{TP}\right)$

end for
Algorithm 1: Pseudo-code to obtain samples from the data augmentation algorithm.

For the P Step, we now plug $z^{(r+1)}$ in for $z$ in (13) to obtain $\hat{\gamma}$ and $\hat{V}$ by carrying out the corresponding complete-case analysis. In the case where no false-positives exist, the complete estimation can be carried out in a single P Step. In Algorithm 1, we summarize how to generate a sequence of random variables according to the data augmentation algorithm.
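To illustrate the I/P iteration in its simplest possible form, the following toy sampler (not the paper's implementation) assumes a single dyad with constant true- and false-positive intensities, so that the P Step has a closed-form Gamma complete-data posterior under a flat prior instead of the penalized Poisson regression used above. Note that without covariates, the two labels are only weakly identified; in the full model, identifiability comes from the covariates entering only the true-positive intensity.

```python
import random

def data_augmentation(event_times, horizon, n_iter, rng):
    """Toy data augmentation sampler for one dyad with constant intensities.

    I Step: classify each observed event as true- or false-positive via (9).
    P Step: with a flat prior, the complete-data posterior of a constant Poisson
    rate is Gamma(count + 1, rate=exposure); draw new intensities from it.
    """
    lam_tp, lam_fp = 1.0, 1.0  # arbitrary initial values
    draws = []
    for _ in range(n_iter):
        # I Step
        p = lam_tp / (lam_tp + lam_fp)
        z = [1 if rng.random() < p else 0 for _ in event_times]
        n_tp = sum(z)
        n_fp = len(z) - n_tp
        # P Step (gammavariate takes shape and *scale* = 1 / exposure)
        lam_tp = rng.gammavariate(n_tp + 1, 1.0 / horizon)
        lam_fp = rng.gammavariate(n_fp + 1, 1.0 / horizon)
        draws.append((lam_tp, lam_fp))
    return draws
```

Although the individual labels switch, the total intensity $\lambda^{TP} + \lambda^{FP}$ is well identified and its posterior draws concentrate around the observed event rate.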

Multiple imputation:

One could use the data augmentation algorithm to draw a large number of samples from the joint posterior of $(z, \gamma)$ and calculate empirical percentiles to obtain any type of interval estimate. In our case, however, this endeavor would be very time-consuming, if not infeasible. To circumvent this, Rubin (1976) proposed multiple imputation as a method to approximate the posterior mean and variance. The method is especially successful when the complete-data posterior is multivariate normal, as is the case in (13), so that only a small number of draws is needed to obtain good approximations (Little and Rubin, 2002). To be specific, we apply the laws of iterated expectation and variance:

$$\mathbb{E}(\gamma \mid E) = \mathbb{E}\left(\mathbb{E}(\gamma \mid z, E) \mid E\right) \qquad (15)$$

$$\mathbb{V}(\gamma \mid E) = \mathbb{E}\left(\mathbb{V}(\gamma \mid z, E) \mid E\right) + \mathbb{V}\left(\mathbb{E}(\gamma \mid z, E) \mid E\right) \qquad (16)$$

Next, we approximate (15) and (16) using a Monte Carlo quadrature with $D$ samples from the posterior obtained via the data augmentation scheme summarized in Algorithm 1 after a burn-in period:

$$\hat{\gamma} = \frac{1}{D} \sum_{d=1}^{D} \hat{\gamma}^{(d)} \qquad (17)$$

$$\hat{\mathbb{V}}(\gamma \mid E) = \frac{1}{D} \sum_{d=1}^{D} \hat{V}^{(d)} + \left(1 + \frac{1}{D}\right) \frac{1}{D - 1} \sum_{d=1}^{D} \left(\hat{\gamma}^{(d)} - \hat{\gamma}\right)\left(\hat{\gamma}^{(d)} - \hat{\gamma}\right)^{\top} \qquad (18)$$

where $\hat{\gamma}^{(d)}$ encompasses the complete-data posterior means from the $d$th sample and $\hat{V}^{(d)}$ is composed of the corresponding variances defined in (13). We can thus construct point and interval estimates from relatively few draws of the posterior based on a multivariate normal reference distribution (Little and Rubin, 2002).
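The combination rules behind this approximation are Rubin's rules: pool the complete-data means, and add the between-imputation spread (inflated by $1 + 1/D$) to the average within-imputation variance. For a scalar coefficient, this is a few lines (illustrative sketch):

```python
def combine_draws(means, variances):
    """Rubin's rules for a scalar parameter: pooled mean and total variance
    = within-imputation variance + (1 + 1/D) * between-imputation variance."""
    d = len(means)
    pooled_mean = sum(means) / d
    within = sum(variances) / d
    between = sum((m - pooled_mean) ** 2 for m in means) / (d - 1)
    total_var = within + (1 + 1 / d) * between
    return pooled_mean, total_var
```

For example, three draws with complete-data means 1.0, 2.0, 3.0 and a common complete-data variance of 0.5 pool to a mean of 2.0 with total variance 0.5 + (4/3) * 1.0.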

3 Simulation Study

We conduct a simulation study to explore the performance of the EcREM compared to a REM, which assumes no measurement error, in two different scenarios. These are a regime where measurement error is correctly specified in the EcREM and one where measurement error is instead non-existent.

Simulation design:

In each of the simulation runs, we simulate event data between actors under known true- and false-positive intensity functions. For exogenous covariates, we generate categorical and continuous actor-specific covariates, transformed to the dyad level by checking for equivalence in the categorical case and computing the absolute difference for the continuous information. Generally, we simulate both counting processes $N^{TP}$ and $N^{FP}$ separately and stop at the end of the observational period.

The data generating processes for true-positive events is identical in each case and given by:

(DG 1-2)

where we draw the continuous exogenous covariate (cont.) from a standard Gaussian distribution and the categorical exogenous covariate (cat.) from a categorical random variable with seven equally probable outcomes. Mathematical definitions of the endogenous and exogenous statistics are given in Annex A. In contrast, the false-positive intensity differs across regimes, resulting in correctly specified (DG 1) and nonexistent (DG 2) measurement errors:

(DG 1)
(DG 2)

Given these intensities, we follow DuBois et al. (2013) to sample the events.
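The paper follows DuBois et al. (2013) for event sampling; a generic alternative for simulating from a bounded, possibly history-dependent intensity is the thinning algorithm in the spirit of Lewis and Shedler (1979), sketched here under the assumption of a known upper bound `lam_max` (names illustrative):

```python
import random

def thinning_sampler(intensity, lam_max, horizon, rng):
    """Simulate event times from an intensity bounded above by lam_max.

    Candidate events come from a dominating homogeneous Poisson process with
    rate lam_max; each candidate at time t is accepted with probability
    intensity(t, history) / lam_max.
    """
    times, t = [], 0.0
    while True:
        t += rng.expovariate(lam_max)  # next candidate time
        if t > horizon:
            return times
        if rng.random() < intensity(t, times) / lam_max:
            times.append(t)  # accepted event

rng = random.Random(7)
# sanity check with a constant intensity of 2.0 on [0, 10]
events = thinning_sampler(lambda t, history: 2.0, 4.0, 10.0, rng)
```

Because the acceptance step may inspect the running history, the same scheme also covers intensities with endogenous statistics, as long as a valid upper bound is available.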

 DG 1 (PFE = 4.8)
  Degree abs
  Cov. cont.
  Cov. cat.
 DG 2 (PFE = 0)
  Degree abs
  Cov. cont.
  Cov. cat.
Table 1: Results of the simulation study for the EcREM and REM under the two data-generating processes (DG 1, DG 2). For each DG and covariate, we report the average estimate (AVE), root-mean-squared error (RMSE), and coverage probability (CP). Further, for each DG, the average percentage of false events (PFE) is given.

Although the method is estimated in a Bayesian framework, we can still assess the frequentist properties of the estimates of the EcREM and REM. In particular, the average point estimate (AVE), the root-mean-squared error (RMSE), and the coverage probability (CP) are presented in Table 1. The AVE of a specific coefficient is the average over the posterior means of the runs:

$$\text{AVE} = \frac{1}{S} \sum_{s=1}^{S} \hat{\gamma}_s,$$

where $\hat{\gamma}_s$ is the posterior mean (17) of the $s$th simulation run. To quantify the average estimation error across runs, we further report the RMSE of the coefficients:

$$\text{RMSE} = \sqrt{\frac{1}{S} \sum_{s=1}^{S} \left(\hat{\gamma}_s - \gamma\right)^2}.$$

Finally, we assess the adequacy of the uncertainty quantification by computing the percentage of runs in which the true parameter lies within the confidence intervals based on a multivariate normal approximation to the posterior distribution with mean and variance given in (17) and (18). According to standard statistical theory for interval estimates, this coverage probability should be around the nominal level (Casella and Berger, 2001).


DG 1 shows how the estimators behave if the true- and false-positive intensities are correctly specified. The results in Table 1 suggest that the EcREM recovers the coefficients from the simulation. The REM, in contrast, yields strongly biased estimates: not only are the average estimates biased, but we also observe high RMSEs and coverage probabilities far below the nominal level.

In the second simulation, we assess the performance of the measurement error model when it is misspecified. In particular, we investigate what happens when there is no measurement error in the data, i.e., all events are true-positives and the intensity of the false-positive process is zero in DG 2. Unsurprisingly, the REM allows for valid and unbiased inference under this regime. But our stochastic estimation algorithm proves to be robust as well: in most runs, the imputed classifications eventually identified all simulated events as true events. In other words, the EcREM detects the absence of measurement error correctly and is unbiased if no measurement error occurs in the observed data. The EcREM and REM are thus equivalent when measurement error is absent and the true-positive intensity is correctly specified.

In sum, the simulation study thus offers evidence that the EcREM increases our ability to model relational event data in the presence of measurement error while being equivalent to a standard REM when false positives do not exist in the data.

4 Application

                       Conflict Event Data (4.1)        Co-location Events (4.2)
Source                 ACLED (Raleigh et al., 2010)     MIT Human Dynamics Lab (Madan et al., 2012)
Observational Period   2017-01-01 to 2019-01-01         2008-11-01 to 2008-11-04
Number of Actors       68⁵                              58
Number of Events       4,362                            2,489⁶
Table 2: Descriptive information on the two analyzed data sets.

⁵ We include all actors that, within the two-year period, participated in at least five events. To verify their existence and obtain relevant covariates, we compared them first to data collected by Gade et al. (2019) and then to various sources including news media reporting. We omitted two actors on which we could not find any information as well as actor aggregations such as “rioters” or “syrian rebels”.
⁶ To capture new events instead of repeated observations of the same event, we omit events where the most recent previous interaction between a and b occurred less than 20 minutes before.

Next we apply the EcREM on two real-world data sets motivated by the types of event data discussed in the introduction, namely human-coded conflict events in the Syrian civil war and co-location event data generated from the Bluetooth devices of students in a university dorm. Information on the data sources, observational periods and numbers of actors and events is summarized in Table 2. Following the above presentation, we focus on modeling the true-positive intensity of the EcREM and thus limit the false-positive intensity, that is the measurement error model, to the constant term. Covariates are thus only specified for the true-positive intensity. In our applications, the samples drawn according to Algorithm 1 converged to a stationary distribution within the first 30 iterations. To obtain the reported point and interval estimates via MI, we sampled 30 additional draws. Due to space restrictions, we keep our discussions of the substantive background and results of both applications comparatively short.

4.1 Conflict events in the Syrian civil war

In the first example, we model conflict events between different belligerents as driven by both exogenous covariates and endogenous network mechanisms. The exogenous covariates are selected based on the literature on inter-rebel conflict. We thus include dummy variables indicating whether two actors share a common ethno-religious identity or receive material support from the same external sponsor, as these factors have previously been found to reduce the risk of conflict (Popovic, 2018; Gade et al., 2019). Additionally, we include binary indicators of two actors being both state forces or both rebel groups, as conflict may be less likely in the former but more likely in the latter case (Dorff et al., 2020).

Furthermore, we model endogenous processes in the formation of the conflict event network and consider four statistics for this purpose. First, we account for repeated fighting between two actors by including both the count of their previous interactions and a binary indicator of repetition, which takes the value 1 if that count is at least 1. We use this additional endogenous covariate because a conflict onset arguably carries much more information than subsequent fighting. Second, we include the absolute difference in a and b's degree to capture whether actors with a high extent of previous activity are prone to engage each other or, instead, tend to fight less established groups to pre-empt their rise to power. Finally, we model hyper-dyadic dependencies by including a triangle statistic that captures the combat network's tendency towards triadic closure.
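These four endogenous statistics can be computed from the event history as, for instance, in the following sketch. It assumes undirected dyadic events; the triangle statistic here counts shared past interaction partners, which is one plausible operationalization, and the names are illustrative rather than the paper's code.

```python
from collections import defaultdict

def endogenous_stats(history, a, b):
    """Compute the four endogenous statistics for dyad (a, b) given the
    event history, a list of undirected dyads observed so far."""
    count = defaultdict(int)      # events per unordered dyad
    degree = defaultdict(int)     # events each actor was involved in
    partners = defaultdict(set)   # distinct past interaction partners
    for u, v in history:
        count[frozenset((u, v))] += 1
        degree[u] += 1
        degree[v] += 1
        partners[u].add(v)
        partners[v].add(u)
    rep_count = count[frozenset((a, b))]
    return {
        "repetition_count": rep_count,
        "first_repetition": int(rep_count >= 1),
        "degree_abs": abs(degree[a] - degree[b]),
        "triangle": len(partners[a] & partners[b]),  # shared partners
    }

history = [("u1", "u2"), ("u1", "u2"), ("u1", "u3"), ("u2", "u3")]
stats = endogenous_stats(history, "u1", "u2")
# repetition_count == 2, first_repetition == 1, degree_abs == 0, triangle == 1
```

In a model fit, these statistics would be re-evaluated at each event time from the history up to (but excluding) that time.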

                             EcREM                                 REM
                             Coef. [CI]                 Z Val.     Coef. [CI]                 Z Val.
(Intercept)                  -10.053 [-10.244, -9.863]  -103.374   -9.944 [-10.112, -9.775]  -115.723
Degree Abs                     0.030 [0.026, 0.035]       13.883     0.030 [0.027, 0.034]      17.295
Repetition Count               0.009 [0.009, 0.010]       45.620     0.009 [0.009, 0.010]      56.342
First Repetition               5.059 [4.873, 5.244]       53.563     4.911 [4.763, 5.059]      64.946
Triangle                       0.075 [0.061, 0.089]       10.485     0.073 [0.065, 0.080]      18.989
Match Ethno-religious Id.     -0.381 [-0.528, -0.234]     -5.076    -0.393 [-0.525, -0.262]    -5.852
Match Rebel                    0.158 [0.062, 0.254]        3.239     0.171 [0.094, 0.247]       4.381
Match State Force             -0.086 [-0.332, 0.160]      -0.686    -0.077 [-0.287, 0.132]     -0.723
Common Sponsor                -0.212 [-0.388, -0.035]     -2.350    -0.227 [-0.378, -0.077]    -2.957
Table 3: Combat events in the Syrian civil war: estimated coefficients with confidence intervals noted in brackets, followed by the Z values. The results of the EcREM are given in the first two columns, the coefficients of the REM in the last two columns.

Table 3 presents the results of the EcREM and, for comparison, an REM. Beginning with the exogenous covariates, our estimates are in line with previous studies in that belligerents are less likely to fight each other when they share an ethno-religious identity or receive resources from the same external sponsor. In contrast, there is no support for the idea that state forces exhibit less fighting among each other than against rebels in this type of internationalized civil war, whereas different rebel groups are more likely to engage in combat against one another. Furthermore, we find evidence that endogenous processes affect conflict event incidence. Along these lines, the binary repetition indicator exhibits the strongest effect across all covariates, implying that two actors are more likely to fight each other if they have done so in the past. As indicated by the positive coefficient of the repetition count, the dyadic intensity further increases the more they have previously fought with one another. The absolute degree difference also exhibits a positive effect, meaning that fighting is more likely between groups with different levels of previous activity. And finally, the triangle statistic’s positive coefficient suggests that even in a fighting network, triadic closure exists. This may suggest that belligerents engage in multilateral conflict, attacking the enemy of their enemy, in order to preserve the existing balance of capabilities or change it in their favor (Pischedda, 2018).

This discussion of results holds both for the EcREM and the REM, implying that for this application, their point estimates are generally quite similar. However, we note two key differences between the two models. First, the coefficient estimate for the strongest predictor of a conflict event, the binary indicator of belligerents having fought before, differs to a substantively meaningful degree between the two models. In the REM, it implies a multiplicative change of exp(4.911) ≈ 136, while for the EcREM, it is estimated at exp(5.059) ≈ 157. In other words, the effect found in the EcREM is more than 10% stronger than that retrieved in a standard REM, which does not account for false positives. Other coefficients, namely those concerning the types of actors and ethno-religious identity, also exhibit noticeable, albeit smaller, differences. Second, while their coefficient estimates are quite similar, the REM and EcREM differ noticeably in how precise they deem these estimates to be. This difference is clearest in their respective Z values, which, as visible in Table 3, are always farther away from zero for the REM than for the EcREM. Finally, the samples from the latent indicators show that approximately 1.114% of the observations, about 50 events, are on average classified as false positives. These events should not necessarily be understood as not having occurred but instead as being spurious: they may, e.g., have occurred on another date or, a real possibility given the very similar names of groups fighting in Syria, between different actors than those recorded. At the same time, this finding attests good data quality to the ACLED event data, especially as compared to alternative, machine-coded datasets (Jäger, 2018; Hammond and Weidmann, 2014), and offers reassurance for the increasing use of human-coded event data to study armed conflict.
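The comparison of the two First Repetition estimates from Table 3 on the intensity scale is a direct calculation: a coefficient in a log-linear intensity acts multiplicatively through its exponential.

```python
import math

# First Repetition coefficients from Table 3
beta_ecrem, beta_rem = 5.059, 4.911

# on the intensity scale a coefficient acts multiplicatively via exp(beta)
effect_ecrem = math.exp(beta_ecrem)
effect_rem = math.exp(beta_rem)

# relative difference between the two multiplicative effects (> 10%)
relative = effect_ecrem / effect_rem - 1
```

Because exp(5.059 − 4.911) = exp(0.148) ≈ 1.16, the EcREM effect is roughly 16% larger than the REM effect.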

4.2 Co-location events in university housing

In our second application, we use a subset of the co-location data collected by Madan et al. (2012) to model when students within an American university dorm interact with each other. These interactions are deduced from continuous (every 6 minutes) scans of proximity via the Bluetooth signals of students' mobile phones. Madan et al. (2012) used questionnaires to collect a host of information from the participating students. This information allows us to account for both structural and more personal exogenous predictors of social interaction. We thus include binary indicators of whether two students are in the same year of college or live on the same floor of the dorm to account for the expected homophily of social interactions (McPherson et al., 2001). In addition, we incorporate whether two actors consider each other close friends (we symmetrized the friendship network, i.e., if student a nominated student b as a close friend, we assume that the relationship is reciprocated). Given that the data were collected around a highly salient political event, the 2008 US presidential election, we also incorporate a dummy variable measuring whether two students share the same presidential preference and a variable measuring their similarity in terms of interest in politics (Butters and Hare, 2020). In addition, we include the same endogenous network statistics as in Section 4.1. These covariates capture the intuitions that individuals tend to socialize with people with whom they have interacted before, whose popularity differs from their own, and with whom they share more common friends (Rivera et al., 2010).

                             EcREM                                 REM
                             Coef. [CI]                 Z Val.     Coef. [CI]                 Z Val.
Intercept                    -10.077 [-10.235, -9.919]  -124.905   -10.012 [-10.153, -9.871] -139.269
Degree Abs                     0.025 [0.016, 0.035]        5.361     0.025 [0.017, 0.032]       6.369
Repetition Count               0.066 [0.061, 0.070]       27.263     0.065 [0.061, 0.069]      29.988
First Repetition               2.714 [2.587, 2.840]       42.024     2.615 [2.501, 2.730]      44.704
Triangle                       0.049 [0.035, 0.064]        6.597     0.049 [0.037, 0.061]       8.109
Match Floor                    0.117 [0.013, 0.221]        2.197     0.123 [0.024, 0.222]       2.439
Match Presidential Pref        0.195 [0.108, 0.282]        4.374     0.188 [0.106, 0.270]       4.499
Match Year                    -0.003 [-0.109, 0.104]      -0.051    -0.012 [-0.112, 0.088]     -0.236
Dyad. Friendship               0.157 [0.059, 0.254]        3.145     0.150 [0.057, 0.243]       3.150
Sim. Interested In Politics   -0.018 [-0.064, 0.029]      -0.740    -0.021 [-0.065, 0.024]     -0.917
Table 4: Co-location events in university housing: estimated coefficients with confidence intervals noted in brackets, followed by the Z values. The results of the EcREM are given in the first two columns, the coefficients of the REM in the last two columns.

We present the EcREM estimates in Table 4. Beginning with the exogenous covariates, we find that the observed interactions tend to be homophilous in that students have social encounters with people they live together with, consider their friends, and share a political opinion with. In contrast, neither a common year of college nor a similar level of political interest is found to have a statistically significant effect on student interactions. At the same time, these results indicate that the social encounters are affected by endogenous processes. Having already had a previous true-positive event is found to be the main driver of the corresponding intensity, exhibiting a very strong and positive effect. Individuals who have socialized before are thus more likely to socialize again, an effect that, as indicated by the repetition count, increases with the number of previous interactions. Turning to the other endogenous covariates, the result for the absolute degree difference suggests that students a and b are more likely to engage with each other if they have different levels of previous activity, suggesting that, e.g., popular individuals attract attention from less popular ones. As is usual for most social networks (Newman and Park, 2003), a positive triangle statistic augments the intensity of a true-positive event, i.e., students socialize with the friends of their friends.

As in the first application case, the REM and EcREM results presented in Table 4 show some marked differences. Again, the effect estimate for binary repetition, at exp(2.714) ≈ 15.1, is substantively higher in the EcREM than in the REM (exp(2.615) ≈ 13.7), while smaller differences also exist for other covariates. The confidence intervals obtained in the REM are also substantially narrower, and its Z values larger in magnitude, than those of the EcREM. Taken together with the finding from the simulation that the REM has incorrect coverage probabilities in the presence of false positives, this implies that the REM may wrongly reject null hypotheses due to overly narrow confidence intervals. In contrast, the EcREM exhibits correct coverage probabilities and thus no tendency towards type I error when false positives exist. In this application, the average percentage of false positives is comparatively high at 6%. The fact that leaving out the corresponding 150 events yielded similar estimates is a strong indicator that the false-positive events were mainly observed on the periphery of the interaction network and hardly affected the behavior in the network's core.
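The classification step behind this percentage can be sketched as follows: averaging the posterior draws of the latent true/false-positive indicators gives the share of events classified as false positives. The indicator draws below are simulated for illustration only, not actual model output.

```python
import numpy as np

# posterior draws of the latent indicator z (1 = true positive) for each
# observed event: shape (n_draws, n_events); simulated here with ~6% FPs
rng = np.random.default_rng(1)
n_draws, n_events = 30, 2489
z = rng.random((n_draws, n_events)) > 0.06

# average share of events classified as false positives across draws
fp_share = 1.0 - z.mean()
fp_events = fp_share * n_events   # roughly 150 events at a 6% share
```

Averaging over draws rather than using a single classification reflects the posterior uncertainty about which individual events are spurious.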

5 Discussion

Our proposed method controls for one explicit measurement error, namely that induced by false-positive events. The simulation study showed that our approach can detect such spurious events and even yield correct results if they are not present. Still, we want to stress that numerous other measurement errors might be present when one analyzes relational events, which we disregard in this article. While an all-embracing taxonomy of measurement errors in that context is beyond the scope of this article, we would like to point towards some related pressing issues. First, sometimes the measurement of exogenous covariates is subject to error. One way to alleviate this measurement error is to reformulate the likelihood of the REM as a Poisson GLM. For this model class, frameworks to tackle such problems in the covariates are readily available (see Aitkin, 1996). As an alternative, the approach of Wulfsohn and Tsiatis (1997) could be adapted to repeated events. Second, measurement errors might be more common for some actors than for others. In this context, one could extend the false-positive intensity to include dummy covariates for each actor that indicate if a specific actor is involved in the respective event. Analyzing the corresponding coefficients could be instructive in detecting especially error-prone actors. Third, the measurement instrument is often not capable of recording unique times for each event. Therefore, the ordering of events is not precisely known. In application cases where few simultaneous dependencies are expected, one can assume that events observed concurrently are independent of one another conditional on the past. This approach, however, might be flawed in situations where actors quickly respond to one another at a speed the measurement instrument cannot capture. 
A possible solution could be to adopt methods known from maximum likelihood estimation for stochastic actor-oriented (Snijders et al., 2010) or longitudinal exponential random graph models (Koskinen and Lomi, 2013). Alternatively, one can perceive time-clustered event data as interval-censored survival data and exploit the respective methods (Satten, 1996; Goggins et al., 1998). In summary, we see numerous open issues concerning measurement errors in the context of relational events and hope that further research on the topic will follow.
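The Poisson GLM reformulation of the REM likelihood mentioned above can be sketched as follows. With piecewise-constant intensities, each row corresponds to a dyad at risk within an inter-event interval, the event indicator is the response, and the interval length enters as an offset. All names and data here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def poisson_glm_loglik(beta, X, y, exposure):
    """Log-likelihood of the REM written as a Poisson GLM: each row is a
    dyad at risk in an inter-event interval, y is the event count (0/1),
    and the interval length enters as an offset log(exposure)."""
    eta = X @ beta + np.log(exposure)
    # Poisson log-likelihood, dropping the log(y!) constant
    return np.sum(y * eta - np.exp(eta))

# toy data: four dyad-intervals, intercept plus one covariate
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 1, 0, 1])
dt = np.array([0.5, 0.5, 1.0, 1.0])
ll = poisson_glm_loglik(np.array([-1.0, 2.0]), X, y, dt)
```

Maximizing this Poisson log-likelihood over beta recovers the same estimates as the partial-likelihood formulation under the piecewise-constant assumption, which is what makes GLM-based measurement error corrections applicable.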

We would also like to raise other possible extensions of our latent variable methodology beyond the approach presented here. A straightforward refinement along the lines of Stadtfeld and Block (2017) would be to include windowed effects, i.e., endogenous statistics that only use the event history within a specific window into the past, or exogenous covariates calculated from networks in addition to the one modeled. The first modification could also be extended to separable models as proposed in Fritz et al. (2021). A relatively simplistic version of the latter type of covariate was incorporated in Section 4.2 to account for common friendships, but more complex covariates are possible. This might be helpful, for instance, when we observe proximity and e-mail events between the same group of actors. In this context, one could be interested in using the e-mail events to infer which proximity events we can trace back to a shared underlying social structure. Moreover, with minor adaptations, the proposed estimation methodology could handle some of the exogenous or endogenous covariates having nonlinear effects on the intensities.

Furthermore, one can modify the framing of the two counting processes beyond counting only true and false positives and even extend the number of simultaneous counting processes. To better understand the opportunities our model framework entails, it is instructive to perceive the proposed model as an extension of the latent competing risk model of Gelfand et al. (2000) with two competing risks. The cause-specific intensities, in our case the true-positive and false-positive intensities, not only differ in the form of the baseline intensity, in contrast to Gelfand et al. (2000), but also allow for cause-specific endogenous covariates. The consequences of this connection are manifold: for time-to-event data, one could employ an egocentric version of our model (see Vu et al., 2011b, for further information on egocentric models) for model-based clustering of general duration times. This interpretation could prove to be a valuable tool for medical applications when we, e.g., observe when a cancer remission ends without knowing its cause (Gelfand et al., 2000). One can also conceive the proposed methodology as a general tool to correct for additive measurement errors in count data. A possible extension would be to spatial settings, where we can only attribute the counts to a specific group of spatial units but no particular one. For example, in conflict research, the exact location of events is often only known with varying precision; hence for some events, the position is known up to 0.5 × 0.5 decimal degrees, while for others, we only know the general region or country (Raleigh et al., 2010). Besides, one could specify a more complex intensity for the false-positive process than done in (7) and understand the two intensities as governing two independent interaction settings, e.g., writing an e-mail to a friend or to a work colleague.
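The competing-risks view rests on the superposition and thinning properties of Poisson processes: when two independent processes are pooled, each event of the pooled process belongs to the first one with probability proportional to its intensity. A toy simulation with constant, made-up intensities illustrates this:

```python
import numpy as np

rng = np.random.default_rng(2)
lam_tp, lam_fp = 3.0, 1.0   # constant true/false-positive intensities

# simulate the superposed process on [0, T]: draw the pooled event count,
# then attribute each event by thinning with probability lam_tp / (lam_tp + lam_fp)
T = 10_000
n = rng.poisson((lam_tp + lam_fp) * T)
is_tp = rng.random(n) < lam_tp / (lam_tp + lam_fp)

share_tp = is_tp.mean()   # close to 0.75 by the superposition property
```

The same attribution probability, with the constant intensities replaced by the time-varying cause-specific ones, drives the data-augmentation step of the EcREM.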

Finally, we would promote the use of the EcREM as a method to check the general goodness of fit of a proposed model. If, under a specific parametrization, the EcREM finds a non-negligible percentage of false-positive events in the observations although no spurious events are expected, one should revise the studied model. In those cases, one could also artificially add noisy events drawn randomly from all possible events and assess whether the estimates of the EcREM are still similar to those of the REM and whether the EcREM was able to correctly classify the false positives as such. In other words, we can use the proposed methodology to test whether the results of a REM are robust to noisy observations.
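The noise-injection step of this robustness check can be sketched as follows; the function name and the uniform dyad sampling are illustrative assumptions, not a prescription from the paper.

```python
import random

def inject_noise(events, actors, share, seed=0):
    """Augment an observed event list with a given share of artificial
    events drawn uniformly from all possible dyads (robustness check)."""
    rng = random.Random(seed)
    n_noise = int(round(share * len(events)))
    noisy = []
    for _ in range(n_noise):
        a, b = rng.sample(actors, 2)   # a random dyad of distinct actors
        noisy.append((a, b))
    return events + noisy

events = [("u1", "u2")] * 100
augmented = inject_noise(events, ["u1", "u2", "u3", "u4"], share=0.05)
# refit the EcREM on `augmented` and compare its estimates and the
# classification of the injected events against the original fit
```

If the EcREM flags roughly the injected share as false positives and its coefficients barely move, the original results can be considered robust to this type of noise.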

6 Conclusion

In the introduction, we started by illustrating some possible problems arising from collecting large-scale event data. Motivated by the observation that these data often include spurious observations, we introduced the Error-corrected Relational Event Model (EcREM) to be used when one expects false positives in the data. We demonstrated the performance of this model as compared to a standard REM in a simulation study, showing that the EcREM handles false positives well when they exist and is equivalent to the REM when they do not. Applications to interaction data from two very different contexts, the Syrian civil war and an American university dorm, further illustrate the use of the EcREM. In summary, this paper extends the relational event framework to handle false positives. In doing so, it offers applied researchers analyzing instantaneous interaction data a useful tool to explicitly account for measurement errors induced by spurious events or to investigate the robustness of their results against this type of error.

Appendix A Definition of undirected network statistics

As REMs for undirected events are thus far rare in the literature, there are no standard statistics that are commonly used (one exception being Bauer et al., 2021). Thus we define all statistics based on prior substantive research (Rivera et al., 2010; Wasserman and Faust, 1994) and undirected statistics used for modeling static networks (Robins et al., 2007). Generally, nondirected statistics have to be invariant to swapping the positions of actors a and b. For the following mathematical definitions, we denote the set of all actors by 𝒜.

For degree-related statistics, we include the absolute difference of the degrees of actors a and b:

  degree(a, b, t) = | Σ_{c ∈ 𝒜\{a}} N_ac(t−) − Σ_{c ∈ 𝒜\{b}} N_bc(t−) |,

where t− is the point in time just before t and N_ab(t) counts the events between a and b observed up to time t. Alternatively, one might also employ other bivariate functions of the degrees as long as they are invariant to swapping a and b, such as the sum of degrees. When simultaneously using different forms of degree-related statistics, collinearities between the respective covariates might severely impede the interpretation.

To capture past dyadic behavior, one can include the count N_ab(t−) directly as a covariate. Since the first event often constitutes a more meaningful action than any further observed events between the actors a and b, we additionally include a binary covariate to indicate whether the respective actors ever interacted before, leading to the following endogenous statistics:

  repetition(a, b, t) = N_ab(t−)   and   first repetition(a, b, t) = 1(N_ab(t−) ≥ 1).

Hyperdyadic statistics in the undirected regime are defined as any type of triadic closure, where actor a is connected to an entity that is also connected to actor b:

  triangle(a, b, t) = Σ_{c ∈ 𝒜\{a,b}} 1(N_ac(t−) ≥ 1) · 1(N_bc(t−) ≥ 1).

Finally, actor-specific exogenous statistics can also be used to model the intensities introduced in this article. We denote an arbitrary continuous covariate by x, with actor-specific values x_a. On the one hand, we may include a measure of the similarity or dissimilarity of the covariate through:

  similarity(a, b) = |x_a − x_b|.

For multivariate covariates, such as location, we only need to substitute the absolute value with any given metric, e.g., the Euclidean distance. In other cases, it might be expected that high levels of a continuous covariate result in higher or lower intensities of an event:

  sum(a, b) = x_a + x_b.

Which type of statistic should be used depends on the application case and the hypotheses to be tested. Categorical covariates, which we denote by c with actor-specific values c_a, can also be used to parametrize the intensity by checking for equivalence of two actor-specific observations of the variable:

  match(a, b) = 1(c_a = c_b).

Besides actor-specific covariates, exogenous networks or matrices, such as the friendship network y from Section 4.2, can also be incorporated as dyadic covariates in our framework:

  dyadic(a, b) = y_ab,

where y_ab is the entry in the ath row and bth column of the matrix y. Extensions to time-varying networks are straightforward when perceiving changes to them as exogenous to the modeled events (Stadtfeld and Block, 2017).
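The exogenous statistics defined above can be computed as, for instance, in the following small sketch; the function and variable names are illustrative, not from the paper.

```python
def exogenous_stats(x, c, y, a, b):
    """Exogenous statistics for dyad (a, b): (dis)similarity and sum of a
    continuous covariate x, a match indicator for a categorical covariate
    c, and a dyadic covariate read off an exogenous network y."""
    return {
        "abs_diff": abs(x[a] - x[b]),   # (dis)similarity
        "sum": x[a] + x[b],             # overall level of the covariate
        "match": int(c[a] == c[b]),     # categorical match
        "dyadic": y[(a, b)],            # e.g. friendship network entry
    }

x = {"u1": 0.2, "u2": 0.9}              # continuous covariate
c = {"u1": "2008", "u2": "2008"}        # categorical covariate
y = {("u1", "u2"): 1}                   # exogenous dyadic network
stats = exogenous_stats(x, c, y, "u1", "u2")
```

Each of these quantities is invariant to swapping a and b as required for undirected statistics, provided the exogenous network y is symmetric.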

Appendix B Mathematical derivation of (9)

For i = 1, …, n, let ΔN_i^TP = N^TP(t_i) − N^TP(t_{i−1}) be the increment of the latent counting process of true-positive events between the time points t_{i−1} and t_i. We observe ΔN_i = N(t_i) − N(t_{i−1}); hence we can reconstruct the respective increment of the false-positive counting process as ΔN_i^FP = ΔN_i − ΔN_i^TP. The second equality holds since, by design, the sum of the increments of the processes counting the true positives and false positives is the increment of the observed counting process, i.e., ΔN_i = ΔN_i^TP + ΔN_i^FP. To sample from ΔN_i^TP given ΔN_i = 1, note that

  P(ΔN_i^TP = 1 | ΔN_i = 1) = λ_TP(t_i) / (λ_TP(t_i) + λ_FP(t_i))

holds. Heuristically, this means that if we know that one of the two thinned counting processes jumps at time t_i, the probability of the jump being attributed to N^TP is the probability of the event being a true positive. For the increments of the involved counting processes, we can then use the properties of Poisson processes and the fact that the intensities are piecewise constant between event times, which yields the following distributional assumptions:

  ΔN_i^TP ~ Poisson(λ_TP(t_i) Δt_i),   ΔN_i^FP ~ Poisson(λ_FP(t_i) Δt_i),

where we set Δt_i = t_i − t_{i−1}. We can now directly compute the probability of ΔN_i^TP = 1 given ΔN_i = 1: