The ability to establish and leverage communication networks to share information and collaboratively accomplish sophisticated tasks is a distinguishing feature of humans. While the ability to form social ties with others was originally developed when individuals were in close proximity, technological improvements have allowed increasingly remote forms of communication and collaboration, which now include tele- and video-conferencing, email, and chat. However, the way in which physical co-location affects the formation of relationships is not well understood [9, 10, 11, 12].
Addressing this question is all the more relevant today. First, in planning the transition towards the post-COVID-19-pandemic “new normal”, institutions and policy-makers world-wide are wondering about the best way to reshape work environments [13, 14, 15, 16]. Second, the massive shift to remote work during the past two years has produced a trove of Big Data that promises to elucidate what happens when we remove physical presence as a main conduit of communication. While recent work has started to highlight changes in the communication networks of information workers due to mandatory remote work, the data was collected across several campuses in the continental US, making it impossible to link these network changes to a lack of physical proximity. Hence, the following question remains open: what is the effect of co-location on human communication networks?
In this study, we explore the mechanism via which the complete removal and subsequent partial re-introduction of physical co-location at a large North American university – the MIT campus – affects the structure of its digital communication network. We find that, despite the robustness of many network measures to the shift to remote work, physical co-location plays a crucial role in the formation of new weak ties.
Since Mark Granovetter’s seminal work in 1973, weak ties have been identified as fundamental microscopic structures that enable the spread of ideas and opportunities in social networks. Our hypothesis is that weak ties form due to chance encounters in and around the office, so that removing the possibility for chance encounters should affect the formation of weak ties. By hindering new weak tie formation, the removal of physical co-location leads to increased redundancy in email networks – more information is spread between fewer people. Said differently, physical proximity is vital for updating the people one communicates with over time – a process that we have termed nexogenesis. Nexogenesis can be modeled with a modification of the link-centric preferential attachment model that includes a co-location factor, accurately reproducing the dynamics of formation and stability of weak ties caused by long-term removal and partial re-introduction of physical co-location on the MIT campus.
1.1. Forming the daily email network
We build and analyze a large email network of research workers at MIT in Cambridge, Massachusetts. Despite the wide-scale adoption of synchronous video-conferencing technology, email remains a universal mode of digital communication used by researchers to exchange information and organize meetings. In order to study changes in communication behavior due to fully remote work, we study the email habits of 2,834 MIT faculty and postdocs over 18 months starting on December 26, 2019. Each researcher belongs to a “research unit” which describes their campus affiliation (see Supplementary Information for a complete list – the partition of the MIT research community into research units is in general finer than the partition into departments). During March 2020 MIT started implementing COVID-19 contingency plans, which led to a progressive decrease in campus attendance and culminated on Monday, March 23, 2020 with the halting of in-person research activities. For each day, the number of emails sent between each pair of (anonymized) individuals is estimated from randomized, aggregated data and used to form the edge weights of an undirected network (see Methods for the estimation procedure and its accuracy).
1.2. Many network features are robust to the shift to remote work
Our initial analysis did not show many changes in the network: directly comparing connected components and the number of intra/inter-research unit connections in the email network in February 2020 and February 2021 using a paired test (see Methods) highlights few significant differences (Fig 1 panel a). However, paired testing is not sufficient to estimate the short- and cumulative long-term effects of fully remote working on the communication network. When studying these cumulative effects, there are also no differences – represented by the fact that the time series is within the bounds predicted from the covariate – in the number of intra/inter-research unit connections, number of connected components, and the size of the giant component due to remote work (panels b-i). Please note that the sharp downward spikes near January 1st, 2021 correspond to a holiday week, when many researchers take the entire week off.
To provide a statistically robust estimation of long-term effects, we design a methodology based on the Bayesian Structural Time Series (BSTS) approach. The approach is based on the construction of a synthetic counterfactual time series unaffected by the treatment. As the treatment in this case corresponds to the termination of campus access for researchers, we use email data from weekends when most researchers were not entering the office to construct our counterfactual (see Methods).
1.3. Remote work impedes the formation of weak ties
To study the effect of remote work on the sorts of connections which might arise from serendipitous encounters on campus, we investigate the structure of weak ties in the email network. Because local bridges can be identified knowing only the topology of a social network, Granovetter suggested using the notion of a local bridge as an accessible proxy for the notion of a weak tie (see Supplementary Information for detailed definitions). Figure 2 panel c provides an illustration of the way in which chance encounters led to the formation of new local bridges.
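As an illustration of why local bridges can be read off the topology alone, the following minimal sketch flags every edge whose endpoints share no common neighbour. It assumes the standard textbook definition of a local bridge; the exact definition used in this study is given in the Supplementary Information, and the function name and toy network are hypothetical.

```python
from collections import defaultdict


def local_bridges(edges):
    """Return the edges whose two endpoints have no neighbour in common."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    bridges = set()
    for u, v in edges:
        # No shared contact means the edge is not embedded in any triangle.
        if not (neighbours[u] & neighbours[v]):
            bridges.add(frozenset((u, v)))
    return bridges


# Toy network: a triangle a-b-c plus one tie c-d that reaches outside it.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
bridges = local_bridges(toy_edges)
print(bridges)
```

Only the tie between `c` and `d` is flagged here, because every edge of the triangle has a third node adjacent to both endpoints.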
The removal of physical co-location (as a consequence of mandatory fully remote work) caused an immediate and persistent drop in the number of weak ties formed in the MIT email network (Figure 2). Panel a shows a 6.2% drop in the number of weak ties coinciding with the sudden absence of co-location generated by the COVID-19 pandemic. To understand the mechanism behind the sudden loss of weak ties, we plot the number of local bridges not previously seen which enter the daily email network each day (Fig 2b). Because we study a fixed population of users, as we see more ties the number of new weak ties will naturally decrease – thus the downward trend of panel b is expected. However, there is a significant jump discontinuity on March 23, 2020 indicating that the absence of physical co-location is negatively associated with the ability to form new weak ties. The decrease in the addition of new weak ties hints at a stagnation effect – researchers are not updating their pool of weak ties as often as would be expected.
Not only is the drop in weak ties sudden and statistically significant at the onset of the transition to full remote working, but it is also cumulatively significant over the course of more than one year. As reported in panels d and e, the observed number of weak ties is consistently below the 95% posterior predictive interval of the predicted counterfactual, indicating that the transition to fully remote work caused a lasting decrease in weak tie formation. The cumulative effect of remote work on weak ties shows an expected loss of more than 4,800 weak ties from March 23, 2020 until July 15, 2021 due to remote work – approximately 1.7 ties per person in the 2,834 researchers we study. Thus we find that remote work leads to a long-lasting, statistically significant drop in weak ties. Panels f and g show a non-significant cumulative drop in the number of new weak ties formed in the network – we explain this phenomenon in detail in the following section (see Figure 3).
We confirm in panel h that nexogenesis is failing in the absence of physical co-location: the social contacts of researchers become more similar from week to week after remote work. Specifically, panel h shows the intersection over union of the edges in the daily reciprocated email networks on a given day and on the same day one week later. We see a lasting, significant increase in the stability of network edges from week to week after remote work. Finally, we identify a striking difference in the mechanism via which local bridges disappear due to remote work: in the short-term through the end of the Spring 2020 semester, local bridges become embedded in triangles, while in the long-term they are dropped from the network (Fig 2 panel i).
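The edge-stability measure can be made concrete with a short sketch. The code below computes the intersection over union of two undirected edge sets; the function name, the toy edge lists, and the convention of returning 0.0 when both networks are empty are assumptions made for this illustration.

```python
def edge_iou(edges_a, edges_b):
    """Intersection over union of two undirected edge sets."""
    a = {frozenset(edge) for edge in edges_a}
    b = {frozenset(edge) for edge in edges_b}
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty networks
    return len(a & b) / len(union)


# Reciprocated ties on one day and on the same weekday one week later.
this_week = [("a", "b"), ("b", "c"), ("c", "d")]
next_week = [("b", "a"), ("c", "d"), ("d", "e")]
print(edge_iou(this_week, next_week))  # 2 shared edges out of 4 distinct edges
```

A value close to 1 means that researchers are emailing essentially the same contacts as a week earlier, which is the stagnation signature discussed above.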
1.4. New weak ties fall off more significantly between researchers whose offices are near
As people are more likely to meet by chance on campus if their offices are nearby, we expect to observe relatively more consistent changes in the formation of new weak ties for co-located MIT personnel.
There was an immediate and lasting drop in the number of new weak ties between researchers in distinct but nearby research labs (Fig 3 panel a). This is in line with our expectation that co-location contributes to nexogenesis. Given this decrease, it may seem surprising that there is an increase in the number of new weak ties between researchers in the same lab (panel b). However, Yang et al. discovered an increase in the use of asynchronous communication (e.g. email) after the COVID-19 pandemic. A plausible explanation for the increase in new weak ties is that after the shift to remote work, email was used to schedule one-on-one meetings or ask small questions between same-lab researchers formerly scheduled or asked in person. Panels c and d show a non-significant decrease in the number of new weak ties formed between researchers in distinct labs at medium and far distances. This is also compatible with our intuition, as we don’t expect researchers who work far away from one another to have many chance encounters even when working in-person.
For each panel and each week, we use the BSTS approach outlined previously, with treatment on March 23, 2020, to predict the mean value of the dependent variable over business days from its mean value over the weekend. To have a consistent measure of distance across all days in the data, we use the distance between the campus offices of researchers rather than the distance between their active work environments – during the shift to remote work the distance between researchers’ campus offices does not change. The distribution of distances between researcher offices can be found in Supplementary Information.
1.5. The move to hybrid work in Fall 2021 led to a partial recovery in weak tie formation at close distances
MIT re-opened its campus for the Fall 2021 semester starting on September 8, 2021. However, following MIT recommendation many research labs adopted a hybrid mode of work with researchers only physically present for (at most) 3 out of 5 business days each week, implying that the chance of serendipitous encounters is still lower than before the COVID-19 pandemic. Furthermore, limitations on the number of people allowed to eat together at a time and ongoing restrictions to international travel prevented departments from hosting large-scale events where researchers might typically mix.
The percentage of new weak ties at close distances (offices within 150 meters of each other) is higher than expected in Fall 2021 given the percentage at close distances in Fall 2020 (Fig 4 panels a and d), and the number of weak ties rises more sharply than expected on September 8, 2021 (panel c) given the rise at the beginning of the Fall 2020 semester. Despite this, the total number of new weak ties is lower than or similar to the expected value after hybrid work (panels b,e,g,i), and there is no significant difference in the percentage of new weak ties formed between researchers with offices at medium to long distances (panels f,h). Taken together, the results of Figure 4 hint at the partial but incomplete success of the hybrid work model at allowing researchers to once more form new weak ties with other proximal researchers.
To estimate the causal effect of the end of remote work, we calculate a synthetic counterfactual for 2021 weekday email data from 2020 weekday email data (see Methods).
1.6. Modeling the effect of distance on tie formation
Our empirical results are consistent with the existence of a mechanism via which co-location promotes tie formation – or nexogenesis. Here, we propose to model the propensity of two people to form a link on a given day as a function of four factors, the first three of which have previously been identified as relevant to tie formation [18, 19]: focal closure, triadic closure, link-centric preferential attachment, and physical co-location. Critically, we add a factor accounting for physical co-location to explain the dynamics of weak tie formation observed on the MIT campus following the transition to fully remote work (see Methods for details). We simulate the formation of email networks on weekdays by creating an edge memory dictionary from the last two weeks of February 2020, then generating new graphs each day using our model.
To reproduce the qualitative features observed in the data, we set the co-location variable for each pair of individuals to zero, corresponding to no physical co-location, starting from March 23, 2020, then back to 1 on September 8, 2021. Upon the removal of co-location, our model produces a drop in the number of weak ties (Fig 5a) and in the number of new weak ties (Fig 5b) which is qualitatively similar to the one observed in the empirical data. It also reproduces the increase in edge stability (Fig 5c), as well as the robustness of long-distance ties to a sudden absence of co-location (Fig 5d). Our model also predicts that complete reintroduction of co-location results in a complete recovery of weak ties. The signs of the log of the distance interaction coefficients allow us to identify the following potential mechanism for the drop in weak ties: two researchers are more likely to form new weak ties when they are co-located. To further confirm this hypothesis, we have simulated a scenario without the sudden transition to fully remote work modeled through the change in the value of the physical co-location variable on March 23, 2020. The results, reported in the Supplementary Information, show no observable drop in weak ties, providing further evidence in support of our explanatory hypothesis.
Several sociologists have argued that the lack of connections during the COVID-19 pandemic has negatively impacted mental and physical well-being as well as innovation, collaboration, and creativity [20, 21]. However, the mechanism via which such effects have occurred has yet to be explicitly identified. As businesses and universities make crucial decisions about the amount of in-person work after the COVID-19 pandemic, understanding the lasting effects of remote work on research communities is of paramount importance.
Our study shows that the transition to fully remote work on the MIT campus – with consequent complete removal of physical co-location between co-workers – had notable effects on the email communication network: while some common topological features were preserved, the formation of new weak ties (nexogenesis) was hindered, causing weak tie deterioration and network stagnation in the long term. Employees who are not co-located are less likely to form ties, weakening the spread of information in the network [22, 23, 24]. This mechanism can be successfully reproduced using a novel link formation model directly linking weak ties with co-location.
Our findings could have implications for the design of future research campuses and work environments, as well as for the development of new virtual technologies which seek to recreate interactions that happen in physical offices. Today it is of the utmost importance to identify what is the “minimum amount” of in-presence work that enables the formation of weak ties, so that individual and societal benefits related to remote work can be preserved without impacting the generation of new ideas and innovation in general. Our initial findings on the hybrid work model that followed the reintroduction of partial in-person collaboration at MIT show a slight recovery in the number of weak ties – especially between researchers who are once again co-located. This hints at the possibility of establishing a work balance trade-off by combining in-person and remote interactions among colleagues, which could inform the transition to a hybrid, post-COVID-19 ‘new normal’.
2. Author Contributions
D.C., M.M., C.R., and P.S. designed the research. M.M. provided data access. D.C. and T.A.H. processed and analysed the data. D.C., M.M., T.A.H., and P.S. performed the interpretation and writing. S.L., T.A., and R.D. provided theoretical expertise. P.S. and C.R. continuously advised the project. C.R. framed the initial hypothesis.
3.1. Data Preprocessing
We start with a fixed set $U$ of anonymized researchers (research staff, faculty, and postdocs) from 112 different research units. These researchers are grouped via 10 random partitions $\mathcal{P}_k$ of $U$, $k = 1, \dots, 10$. Each group $G \in \mathcal{P}_k$ contains at least 5 researchers, and all researchers in $G$ belong to the same research unit. Let $U_G$ denote the collection of researchers in $G$. For a pair of groups $G, G' \in \mathcal{P}_k$ and a day $t$, our data contains the sum of all emails sent between the two groups:
$$S_t(G, G') = \sum_{u \in U_G} \sum_{v \in U_{G'}} e_t(u, v), \tag{1}$$
where $e_t(u, v)$ denotes the number of emails sent from $u$ to $v$ on day $t$, which is not observed directly.
For each research unit $R$, let $U_R$ denote the collection of researchers in the research unit. Because each group contains only researchers from a single research unit, each equation (1) is a sum over researchers from at most two research units. Grouping the equations (1) by pairs of research units $(R, R')$ on each day $t$, we obtain a collection of constrained linear systems of Diophantine equations
$$A_{R,R'}\, x_t^{R,R'} = b_t^{R,R'}, \qquad x_t^{R,R'} \ge 0, \tag{2}$$
where $x_t^{R,R'}$ is the column vector whose entries are $e_t(u, v)$, $u \in U_R$, $v \in U_{R'}$, the matrix $A_{R,R'}$ records which pairs $(u, v)$ contribute to each group-level sum, and $b_t^{R,R'}$ collects the observed sums $S_t(G, G')$. Each of these linear systems is guaranteed to have at least one solution (the actual number of emails sent), but may be underdetermined. If there is a unique solution, we use the Hermite normal form of $A_{R,R'}$ to find it. If the system is underdetermined, we use non-negative matrix factorization with an $\ell_1$ penalty in order to quickly estimate a sparse non-negative solution, as the algorithms which compute exact sparse integer solutions are slow. This procedure yields an estimate $\hat{e}_t(u, v)$ for each day $t$ and each pair of researchers $u, v \in U$. Figure S.19 in SI shows that our approximate solutions are very close to true solutions.
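To make the aggregation step concrete, the sketch below computes the group-level sums that would be observed for a given set of pairwise email counts. Everything in it is hypothetical: the function name, the four toy researchers, and the two toy partitions, whose groups are smaller than the minimum of 5 researchers used in the real data. It only illustrates that each random partition contributes a different set of linear constraints on the same unknown pairwise counts.

```python
from itertools import permutations


def group_sums(emails, partition):
    """Sum pairwise email counts over every ordered pair of distinct groups.

    emails: dict mapping (sender, recipient) -> number of emails on one day.
    partition: dict mapping a group label -> set of researchers in that group.
    """
    sums = {}
    for g, h in permutations(partition, 2):
        sums[(g, h)] = sum(
            emails.get((u, v), 0) for u in partition[g] for v in partition[h]
        )
    return sums


# Hypothetical pairwise counts; in the real data these are never seen directly.
true_emails = {("u1", "u3"): 2, ("u1", "u4"): 3, ("u2", "u4"): 1}

# Two toy partitions of the same four researchers.
partition_1 = {"A": {"u1", "u2"}, "B": {"u3", "u4"}}
partition_2 = {"A": {"u1", "u3"}, "B": {"u2", "u4"}}

sums_1 = group_sums(true_emails, partition_1)
sums_2 = group_sums(true_emails, partition_2)
print(sums_1[("A", "B")], sums_2[("A", "B")])
```

The two partitions report different totals for the same underlying emails, and it is the combination of such totals across partitions that constrains the individual counts.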
3.2. Network formation
To rule out changes in the data due to departures from the university or new hires, we ensure that each user sent at least one email over the university network before the end of the 2020 spring semester and at least one email after May 20, 2021. Denote this set of active users by $V$.
We are missing data from December 23, 2020 and January 19, 20, and 21 of 2021; because these days are during the winter holiday at MIT this does not significantly affect our analysis. For each of the remaining 562 days from December 26, 2019 through July 15, 2021, we obtain a weighted, undirected network whose nodes represent (anonymized) individuals. Fix a day $t$. For each user $u$ in $V$, let $n_t(u)$ denote the number of people whom $u$ emailed on day $t$. For a pair of users $u, v \in V$, let $\hat{e}_t(u, v)$ denote the number of emails sent from $u$ to $v$ on day $t$ (estimated as in the previous section). To rule out massmails, let $V_t \subseteq V$ be the subset of users for whom $n_t(u)$ does not exceed a fixed mass-mail cutoff.
Define a weighted, undirected network $G_t$ with nodes $V_t$. For two nodes $u, v \in V_t$, there is an edge $\{u, v\}$ if $\hat{e}_t(u, v) > 0$ and $\hat{e}_t(v, u) > 0$. The weight of the edge is computed from $\hat{e}_t(u, v)$ and $\hat{e}_t(v, u)$. Although the email data is partially estimated due to randomization and aggregation, for % of the edges with non-zero weight in the estimated network we were able to recover the true number of emails sent. If we include all edges between users contributing to a non-zero weight edge in at least one random aggregation of the network (but which may have weight zero in the estimated network), % of edges have the ground truth number of emails. When building the undirected network $G_t$, we consider four possible time-windows during which emails can be reciprocated: the same day (daily), within 5 business days (weekly), within 10 business days (bi-weekly), or within 21 business days (monthly). For results on weekly, bi-weekly, and monthly networks see Supplementary Information. Previous studies have found that more than 90% of emails are replied to the same day that they are sent, with more than half being replied to within 47 minutes. Requiring emails to be reciprocated the same day hides interactions between users who typically respond slowly to emails; however, it is useful for filtering out massmails, observing sharp discontinuities in the data, and for increasing the power of hypothesis tests. Allowing longer periods of reciprocation captures weaker ties missed in the daily reciprocated email network, but we are forced to sacrifice some statistical power either to autocorrelation or lower sample size; additionally the networks become more saturated, destroying some topological features of interest. Examples of daily, weekly, bi-weekly, and monthly email networks are reported in the Supplementary Information.
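The construction of the daily reciprocated network can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the cutoff `max_recipients` is a hypothetical stand-in for the mass-mail rule, and the edge weight is assumed here to be the total number of emails exchanged in both directions.

```python
def reciprocated_network(emails, max_recipients):
    """Build a weighted undirected network from one day of directed counts.

    emails: dict mapping (sender, recipient) -> estimated number of emails.
    max_recipients: hypothetical mass-mail cutoff on recipients per sender.
    """
    recipients = {}
    users = set()
    for (u, v), n in emails.items():
        users.update((u, v))
        if n > 0:
            recipients.setdefault(u, set()).add(v)

    # Keep only users who emailed at most max_recipients people that day.
    kept = {u for u in users if len(recipients.get(u, ())) <= max_recipients}

    weights = {}
    for (u, v), n in emails.items():
        back = emails.get((v, u), 0)
        # An edge requires at least one email in each direction.
        if u in kept and v in kept and n > 0 and back > 0:
            weights[frozenset((u, v))] = n + back  # assumed weight: two-way total
    return weights


day = {
    ("a", "b"): 2, ("b", "a"): 1,                   # reciprocated pair
    ("b", "c"): 1,                                  # never answered
    ("m", "a"): 1, ("m", "b"): 1, ("m", "c"): 1,    # mass mailer
    ("a", "m"): 1,
}
network = reciprocated_network(day, max_recipients=2)
print(network)
```

In this toy day only the tie between `a` and `b` survives: the message from `b` to `c` is not reciprocated, and `m` is excluded as a mass mailer even though `a` replied.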
We also form concatenated networks from multiple days of data. Fix a range of dates $T$. We first form a directed network in which the weight on the edge $(u, v)$ aggregates the estimated emails sent from $u$ to $v$ over the days in $T$. As before, we define an associated undirected graph by combining the weights of $(u, v)$ and $(v, u)$, so that pairs without reciprocated emails receive weight zero, and removing edges with weight zero and isolated nodes. In particular, for each weekday we consider the collection of dates given by the next five weekdays to study weekly behavior or the next 21 weekdays to study monthly behavior.
To examine whether this hybrid mode of work returned tie formation to pre-pandemic levels, we first restrict to a collection of 2,206 researchers who sent at least 5 emails after September 2021 and before May 2020, then proceed as above to form networks from December 23, 2020 to October 31, 2021. We choose a stricter requirement for inclusion in the network than previously as we observe many users becoming inactive starting in summer 2021.
When comparing February 2020 to February 2021, we pair the days in the two months as follows: the first Tuesday of the MIT semester in February 2020 is paired with the first Tuesday of the MIT semester in February 2021, etc. For each pair of days $(t_1, t_2)$, we consider the set of users who were active on both days and, as above, form a pair of undirected networks $G_{t_1}$, $G_{t_2}$ whose edges correspond to reciprocated emails on $t_1$ (for $G_{t_1}$) or $t_2$ (for $G_{t_2}$). Directly comparing these networks allows us to completely remove the effects of seasonality or any difference in makeup of active users in February 2020 and February 2021.
3.3. Link-centric preferential attachment
The goal of our model is not to serve as a tool for prediction, but to understand the mechanism via which distance impacts link formation. Through experimentation, we found that using link-centric preferential attachment alone to propagate a dynamic network produced daily networks which had too many local bridges. Thus we use a two-step approach which first produces an intermediate network using link-centric preferential attachment, then adds edges in a way which increases the clustering coefficient of the network. This is analogous to the reverse of the Watts-Strogatz method, where the intermediate network has high clustering coefficient and local bridges are added afterwards.
Our microscopic model has the following parameters:
$\theta_{\mathrm{per}}$, the periodicity of the model. $\theta_{\mathrm{per}}$ controls how much the graph on day $t$ looks like the graphs on the same weekday of previous weeks. In other words, the higher the value of $\theta_{\mathrm{per}}$, the closer the dynamic network is to being 7-periodic (or 5-periodic if weekends are removed).
$\theta_{\mathrm{old}}$, the tendency to connect with old links. The higher the value of $\theta_{\mathrm{old}}$, the more likely it is that a given link will connect with a previous partner rather than someone new. This is one of the standard parameters from a vanilla link-centric preferential attachment model, and this parameter decays exponentially in the number of days $d$ since the link last appeared, at a rate set by a constant $c$.
$\theta_{\mathrm{new}}$, the tendency to reach out to new people. This is typically the complement of $\theta_{\mathrm{old}}$ in vanilla link-centric attachment models (we have more parameters than the standard two).
$\theta_{\mathrm{dept}}$, the tendency to connect with people in the same department.
$\theta_{\mathrm{tri}}$, the tendency to be introduced to a mutual friend.
The parameters $\theta_{\mathrm{per}}$, $\theta_{\mathrm{old}}$ and $\theta_{\mathrm{new}}$ rely on a memory dictionary which stores the days on which a given edge has appeared. For all of the above parameters $\theta_p$, we include interaction terms $\gamma_p$ controlling the extent to which co-location amplifies or dampens the effect. For example, from the empirical data we conclude that co-location should dampen periodicity while amplifying the probability of reaching out to new partners.
Let $(u, v)$ be a pair of nodes. For each parameter $\theta_p$ let $I_p(u, v)$ denote the associated indicator variable. For instance, $I_{\mathrm{dept}}(u, v) = 1$ if $u$ and $v$ belong to the same department, and $I_{\mathrm{dept}}(u, v) = 0$ otherwise.
Consider the set $E$ of all edges which appear on at least one day in the empirical data. Note that the use of $E$ rather than the set of all possible edges makes this model unsuitable for prediction tasks. On each day $t$, we start by adding an edge $(u, v) \in E$ to a random network $G'_t$ with probability proportional to a weight that combines the parameters $\theta_p$, the indicators $I_p(u, v)$, and the co-location interaction terms $\gamma_p$.
If $\gamma_p > 0$, the co-location amplifies the effect of parameter $\theta_p$, while if $\gamma_p < 0$ it dampens the effect. In total, in the first step we add $m_1$ edges to $G'_t$.
In the second step, we add the parameter $\theta_{\mathrm{tri}}$, with indicator
$$I_{\mathrm{tri}}(u, v) = \begin{cases} 1 & \text{if } d_{G'_t}(u, v) = 2, \\ 0 & \text{otherwise,} \end{cases}$$
where $G'_t$ is the random network constructed in step 1, and $d_G$ is the usual (unweighted) shortest path distance in a graph $G$. In words, $I_{\mathrm{tri}}(u, v) = 1$ if adding $(u, v)$ will close a triangle in the network. A new edge is added to the network in step two with a probability that additionally depends on $\theta_{\mathrm{tri}}$ and $I_{\mathrm{tri}}(u, v)$.
In total we add $m_2$ edges to $G'_t$ in the second step. Including the parameter $\theta_{\mathrm{tri}}$ has the effect of raising the expected number of triangles in the random graph, and hence lowering the percentage of edges which are local bridges.
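The triangle-closing condition used in the second step can be checked directly from the adjacency structure: two nodes are at shortest-path distance exactly 2 when they are not adjacent but share at least one neighbour. The sketch below is illustrative only; the function name and the toy step-one network are hypothetical, and it does not reproduce the sampling probabilities of the model.

```python
from itertools import combinations


def closes_triangle(adjacency, u, v):
    """Return 1 if u and v are at distance exactly 2, and 0 otherwise.

    adjacency: dict mapping each node to the set of its neighbours.
    """
    if u == v or v in adjacency.get(u, set()):
        return 0  # the same node, or already connected
    # Distance 2 means a shared neighbour, so adding (u, v) closes a triangle.
    return int(bool(adjacency.get(u, set()) & adjacency.get(v, set())))


# Toy network after step one: a path a - b - c - d.
step_one = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

candidates = [
    (u, v)
    for u, v in combinations(sorted(step_one), 2)
    if closes_triangle(step_one, u, v)
]
print(candidates)
```

On the toy path, the pairs `(a, c)` and `(b, d)` are the only candidates, since `a` and `d` are three steps apart and every other pair is already linked.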
With parameters fixed, we proceed to form networks one day at a time, adding the edges from the current network to a memory dictionary after formation. To model the effect of remote work, for each day after March 23, 2020, we set the distance between researcher offices to be a fixed constant larger than 650 meters (all other parameters remain fixed).
3.4. Regression Discontinuity
Figure 2 (panels a and b) (respectively Figure 4 panel c) shows the drop in weak tie and new weak tie formation (respectively the increase in weak ties) due to the policy change on March 23rd, 2020 (respectively September 8, 2021). We used Regression Discontinuity Designs (RDD) [29, 30, 31] to estimate the causal impact of the policy change. RDDs are a classic, quasi-experimental procedure for estimating treatment effects in observational studies. In an RDD, treatment assignment is determined by an assignment variable rather than through randomization.
For an RDD to be valid, we need only assume that the response is continuous with the assignment variable near the cutoff and that subjects cannot precisely manipulate the assignment variable [30, 31]. Panels a and b in Figure 2 and panel c in Figure 4 show that the responses (weak ties and new weak ties) are continuous with the assignment variable time, albeit observed with noise. The assignment variable, time, is not precisely manipulable by subjects since the announced policy was not known far in advance. Furthermore, there would be little reason to manipulate assignment since subjects are free to send emails at the same rate before and after the policy change.
In Figure 2 panel a and Figure 4 panel c, we model the weekly mean number of weak ties $y_t$ with the discontinuous linear regression
$$y_t = \beta_0 + \beta_1 (t - c) + \tau D_t + \beta_2 D_t (t - c) + \varepsilon_t, \tag{4}$$
where $c$ is the cutoff date (either March 23rd, 2020 or September 8, 2021) and $D_t$ is the binary variable
$$D_t = \begin{cases} 1 & \text{if } t \ge c, \\ 0 & \text{otherwise,} \end{cases}$$
that indicates if the date is before or after the policy change date $c$. The error term $\varepsilon_t$ is assumed to be heteroskedastic white noise. The coefficient $\tau$ is the impact of the policy, and measures the gap between the two sides of the regression. We estimate $\tau$ and the other coefficients with generalized least squares with an AR($n$) structured covariance matrix, with $n$ set separately for daily, weekly, biweekly, and monthly reciprocated email networks, and report the value of $\tau$ and its p-value in Figure 2. We use heteroskedasticity-robust estimators for standard errors, so that in total standard errors are robust to autocorrelation (from the AR($n$) GLS) and heteroskedasticity.
In panel a, we assumed a discontinuous order one polynomial trend line because the data did not display any apparent higher-order non-linear behavior. The data were subset to January 3rd, 2020 to October 1st, 2020, to semi-localize our regression around the discontinuity, which reduces bias in $\tau$, and to avoid influence from the two outlying regions (before January 3rd, 2020 and during December 2020). These outliers correspond to winter break at MIT and represent a natural and expected decrease in weak ties not due to the policy change.
In panel b, we similarly model the rate of new weak tie formation over time. We assumed a second-order discontinuous polynomial trend (Equation 5) due to the observed parabolic behavior before the cutoff point. Using the same notation as in Equation 4, our linear regression is given by
$$y_t = \beta_0 + \beta_1 (t - c) + \beta_2 (t - c)^2 + \tau D_t + \beta_3 D_t (t - c) + \beta_4 D_t (t - c)^2 + \varepsilon_t. \tag{5}$$
The coefficient $\tau$ is, again, the causal impact of the policy change. We report the value of $\tau$ and its p-value in Figure 2.
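The idea behind the discontinuity estimate can be illustrated with a deliberately simplified sketch. It fits a separate straight line on each side of the cutoff by ordinary least squares and reports the difference between the two fitted values at the cutoff. This swaps in plain OLS for the generalized least squares with autoregressive errors and robust standard errors described above, so it yields a point estimate only; the function names and the synthetic series are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rdd_gap(times, values, cutoff):
    """Jump at the cutoff: right-hand fit minus left-hand fit, both at t = cutoff."""
    # Centre time at the cutoff so each intercept is the fitted value there.
    left = [(t - cutoff, y) for t, y in zip(times, values) if t < cutoff]
    right = [(t - cutoff, y) for t, y in zip(times, values) if t >= cutoff]
    left_intercept, _ = fit_line(*zip(*left))
    right_intercept, _ = fit_line(*zip(*right))
    return right_intercept - left_intercept


# Synthetic series: a steady decline with a drop of 6 units at t = 10.
times = list(range(20))
values = [100 - 0.5 * t - (6 if t >= 10 else 0) for t in times]
gap = rdd_gap(times, values, cutoff=10)
print(round(gap, 6))
```

Because the synthetic series is noise-free, the estimated gap recovers the built-in drop of 6 units up to floating-point error. With real data, inference on the gap would still require the error structure discussed above.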
3.5. Bayesian Structural Time Series
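We stress that when using Bayesian methods, reported CIs are credible intervals of the predicted dependent variable, and $p$-values are posterior tail probabilities. Bayesian structural time-series (BSTS) combines a state-space model for time-series data and Bayesian model averaging for parameter selection and estimation.
As a state-space model, BSTS combines three components of state: a local linear trend:
$$\mu_{t+1} = \mu_t + \delta_t + \eta_{\mu,t}, \qquad \delta_{t+1} = \delta_t + \eta_{\delta,t},$$
with $\eta_{\mu,t} \sim \mathcal{N}(0, \sigma_\mu^2)$, $\eta_{\delta,t} \sim \mathcal{N}(0, \sigma_\delta^2)$; a seasonality component:
$$\gamma_{t+1} = -\sum_{s=0}^{S-2} \gamma_{t-s} + \eta_{\gamma,t},$$
with $S$ the number of seasons and $\eta_{\gamma,t}$ again an independent error; and (static) covariates which are predictive of the time series in question before the intervention:
$$\beta^{\mathsf{T}} x_t.$$
For the local linear trend and seasonality components, we use the default priors of the CausalImpact library, which place a Gamma prior on each precision, $1/\sigma^2 \sim \mathcal{G}(a, b)$ with the library's default hyperparameters, where $\mathcal{G}(a, b)$ denotes a Gamma distribution. For the covariates (the weekend data), in general a spike-and-slab prior is used with the spike defined by
$$p(\varrho) = \prod_{j=1}^{J} \pi_j^{\varrho_j} (1 - \pi_j)^{1 - \varrho_j},$$
where $\varrho_j \in \{0, 1\}$ indicates whether covariate $j$ is included, with $\pi_j$ initialized to $M/J$ where $M$ is the expected model size and $J$ is the number of covariates. The slab part of the spike-and-slab prior is
$$\beta_\varrho \mid \sigma_\varepsilon^2 \sim \mathcal{N}\!\left(b_\varrho,\ \sigma_\varepsilon^2 \left(\Sigma_\varrho^{-1}\right)^{-1}\right), \qquad \frac{1}{\sigma_\varepsilon^2} \sim \mathcal{G}\!\left(\frac{\nu_\varepsilon}{2}, \frac{s_\varepsilon}{2}\right),$$
where the prior precision $\Sigma^{-1}$ is constructed from $X^{\mathsf{T}} X$ and $X$ is the covariate data. Because we include only one covariate (the weekly minimum) the spike-and-slab prior collapses to just a normal-inverse Gamma distribution. We use 1000 iterations of Markov Chain Monte Carlo (MCMC) to compute posterior predictive distributions.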
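For intuition about the local linear trend component, the following sketch simulates one: at each step the level advances by the current slope plus Gaussian noise, and the slope itself follows a Gaussian random walk. It is a toy forward simulation with hypothetical parameter names and values, not the posterior sampler used by the CausalImpact library.

```python
import random


def simulate_local_linear_trend(n, level, slope, sigma_level, sigma_slope, seed=0):
    """Simulate n steps of a local linear trend.

    The level moves by the current slope plus noise; the slope is a random walk.
    """
    rng = random.Random(seed)
    path = []
    for _ in range(n):
        path.append(level)
        level = level + slope + rng.gauss(0, sigma_level)
        slope = slope + rng.gauss(0, sigma_slope)
    return path


# With both noise scales set to zero the trend is an exact straight line.
deterministic = simulate_local_linear_trend(4, level=10, slope=2,
                                            sigma_level=0, sigma_slope=0)
print(deterministic)

# With noise, the level and the slope both drift over time.
noisy = simulate_local_linear_trend(50, level=10, slope=2,
                                    sigma_level=1.0, sigma_slope=0.1)
print(len(noisy))
```

Setting the noise scales to zero recovers a straight line, which shows that the component generalizes a fixed linear trend by letting its level and slope evolve.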
Consider the binary variable $c_t(u)$ defined by
$$c_t(u) = \begin{cases} 1 & \text{if researcher } u \text{ is physically present in their campus office on day } t, \\ 0 & \text{otherwise.} \end{cases}$$
For us, “treatment” consists of setting $c_t(u)$ to zero for $t$ after March 23, 2020 by not permitting researchers to enter their campus office. The time series whose counterfactual we want to estimate is the weekday maximum of the network measure while the covariate is the weekend minimum. As most employees are not physically present in their office on weekends, $c_t(u) = 0$ for most researchers $u$ when $t$ is a weekend, so that the treatment has little effect. We verify this assumption by looking at the number of distinct MAC addresses connected to routers in on-campus research labs on the weekday and weekend (see Supplementary Information).
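The pairing of the target series with its covariate can be sketched as below: for each calendar week, the code extracts the weekday maximum and the weekend minimum of a daily network measure. Grouping each weekend with the weekdays that precede it (Monday to Sunday weeks) is an assumption of this illustration, as are the function name and the toy values.

```python
from datetime import date, timedelta


def weekly_target_and_covariate(daily):
    """Map each week's Monday to (weekday maximum, weekend minimum).

    daily: dict mapping a date to the value of a network measure on that day.
    Weeks lacking either weekday or weekend observations are skipped.
    """
    by_week = {}
    for day, value in daily.items():
        monday = day - timedelta(days=day.weekday())
        weekdays, weekend = by_week.setdefault(monday, ([], []))
        # weekday() is 0-4 for Monday-Friday and 5-6 for Saturday-Sunday.
        (weekdays if day.weekday() < 5 else weekend).append(value)
    return {
        monday: (max(weekdays), min(weekend))
        for monday, (weekdays, weekend) in sorted(by_week.items())
        if weekdays and weekend
    }


# One toy week starting on Monday, March 2, 2020.
start = date(2020, 3, 2)
toy_values = [50, 55, 53, 52, 48, 7, 9]
daily_measure = {start + timedelta(days=i): v for i, v in enumerate(toy_values)}

weekly = weekly_target_and_covariate(daily_measure)
print(weekly)
```

The resulting pairs give one target value and one covariate value per week, which is the shape of input a counterfactual time-series model of this kind expects.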
When studying the effect of hybrid work, we construct a counterfactual using weekday email data spanning July 22, 2020 through October 14, 2020 as a covariate for email data spanning July 28, 2021 through October 20, 2021, aligning so that the start of the Fall 2020 and Fall 2021 semesters coincide. We also remove Memorial Day (a university holiday) from both the 2020 and 2021 data.
-  David G. Rand, Samuel Arbesman, and Nicholas A. Christakis. Dynamic social networks promote cooperation in experiments with humans. Proceedings of the National Academy of Sciences, 108(48):19193–19198, 2011.
-  José Luis Iribarren and Esteban Moro. Impact of human activity patterns on the dynamics of information diffusion. Phys. Rev. Lett., 103:038702, Jul 2009.
-  Mark S. Granovetter. The strength of weak ties. American Journal of Sociology, 78(6):1360–1380, 1973.
-  J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, J. Kertész, and A.-L. Barabási. Structure and tie strengths in mobile communication networks. Proceedings of the National Academy of Sciences, 104(18):7332–7336, 2007.
-  Caroline Haythornthwaite. Social networks and internet connectivity effects. Information, Communication and Society, 8:125–147, 06 2005.
-  Ethan Bernstein, Hayley Blunden, Andrew Brodsky, Wonbin Sohn, and Ben Waber. The implications of working without an office. Harvard Business Review: The Big Idea.
-  Robert Walker. Co-residence patterns in hunter-gatherer societies show unique human social structure. Science, 331:1286–1289, 01 2011.
-  Guo-Ming Chen. The impact of new media on intercultural communication in global context. 2012.
-  Gary Alan Fine. The sad demise, mysterious disappearance, and glorious triumph of symbolic interactionism. Annual Review of Sociology, 19:61–87, 1993.
-  L.T. Reynolds. Interactionism: Exposition and Critique. The Reynolds Series in Sociology. General Hall, 1993.
-  Dashun Wang, Dino Pedreschi, Chaoming Song, Fosca Giannotti, and Albert-Laszlo Barabasi. Human mobility, social ties, and link prediction. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’11, page 1100–1108, New York, NY, USA, 2011. Association for Computing Machinery.
-  Filippo Simini, Marta C. González, Amos Maritan, and A. L. Barabasi. A universal model for mobility and migration patterns. Nature, 484:96–100, 2012.
-  A Benveniste. These companies’ workers may never go back to the office. CNN.
-  R McLean. These companies plan to make working from home the new normal. As in forever. CNN.
-  Susan Lund, Wan-Lae Cheng, André Dua, Aaron De Smet, Olivia Robinson, and Saurabh Sanghvi. What 800 executives envision for the postpandemic workforce. McKinsey Global Institute.
-  Erik Brynjolfsson, John J Horton, Adam Ozimek, Daniel Rock, Garima Sharma, and Hong-Yi TuYe. Covid-19 and remote work: An early look at US data. Working Paper 27344, National Bureau of Economic Research, June 2020.
-  Longqi Yang, David Holtz, Sonia Jaffe, Siddharth Suri, Shilpi Sinha, Jeffrey Weston, Connor Joyce, Neha Parikh Shah, Kevin Sherman, Brent Hecht, and Jaime Teevan. The effects of remote work on collaboration among information workers. Nature Human Behaviour, September 2021.
-  Gueorgi Kossinets and Duncan Watts. Empirical analysis of an evolving social network. Science (New York, N.Y.), 311:88–90, 02 2006.
-  Christian L. Vestergaard, Mathieu Génois, and Alain Barrat. How memory generates heterogeneous dynamics in temporal networks. Phys. Rev. E, 90:042805, Oct 2014.
-  Tracy Brower. Why the office simply cannot go away: The compelling case for the workplace. Forbes.
-  Dylan Walsh. How to manage the hidden risks in remote work. MIT Sloan Management School.
-  M. Hansen. The search-transfer problem: The role of weak ties in sharing knowledge across organization subunits. Administrative Science Quarterly, 44:82–111, 1999.
-  L. Argote and P. Ingram. Knowledge transfer: A basis for competitive advantage in firms. Organizational Behavior and Human Decision Processes, 82:150–169, 2000.
-  Ray E. Reagans and Bill McEvily. Network structure and knowledge transfer: The effects of cohesion and range. Administrative Science Quarterly, 48:240–267, 2003.
-  G. H. Bradley. Algorithms for Hermite and Smith normal matrices and linear Diophantine equations. Mathematics of Computation, 25:897–907, 1971.
-  D. Lee and H. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401:788–791, 1999.
-  Farshad Kooti, Luca Maria Aiello, Mihajlo Grbovic, Kristina Lerman, and Amin Mantrach. Evolution of conversations in the age of email overload. In Proceedings of the 24th International Conference on World Wide Web, WWW ’15, page 603–613, Republic and Canton of Geneva, CHE, 2015. International World Wide Web Conferences Steering Committee.
-  D. Watts and S. Strogatz. Collective dynamics of ‘small-world’ networks. Nature, 393:440–442, 1998.
-  Donald L Thistlethwaite and Donald T Campbell. Regression-discontinuity analysis: An alternative to the ex post facto experiment. Journal of Educational Psychology, 51(6):309, 1960.
-  Jinyong Hahn, Petra Todd, and Wilbert Van der Klaauw. Identification and estimation of treatment effects with a regression-discontinuity design. Econometrica, 69(1):201–209, 2001.
-  David S Lee and Thomas Lemieux. Regression discontinuity designs in economics. Journal of Economic Literature, 48(2):281–355, 2010.
-  Whitney K Newey and Kenneth D West. A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Working Paper 55, National Bureau of Economic Research, April 1986.
-  Kay H. Brodersen, Fabian Gallusser, Jim Koehler, Nicolas Remy, and Steven L. Scott. Inferring causal impact using Bayesian structural time-series models. Annals of Applied Statistics, 9:247–274, 2015.
-  Henry Navarro, Giovanna Miritello, Arturo Canales, and Esteban Moro. Temporal patterns behind the strength of persistent ties. EPJ Data Science, 6, 06 2017.
-  Ronald Burt. Structural holes and good ideas. American Journal of Sociology, 110:349–399, 09 2004.
s.6. Research Units
Here we list the research units used in the study. Only a subset of researchers from each unit appear in the email networks. Note that in addition to standard denominations corresponding to departments (e.g. Chemistry), there are also subgroups corresponding to department heads and institute professors (distinguished professors).
Abdul Latif Jameel Poverty Action Lab, Aeronautics and Astronautics, Anthropology Program, Archaeology, Architecture, Architecture and Planning - Depart Heads, Biology, Brain and Cognitive Sciences, Center for Advanced Urbanism, Center for Biomedical Innovation, Center for Collective Intelligence, Center for Environmental Health Sciences, Center for Global Change Science, Center for Information Systems Research, Center for International Studies, Center for Real Estate, Center for Transportation and Logistics, Chancellors Office, Chemical Engineering, Chemistry, Civil and Environmental Engineering, Comp Sci and Artificial Intel Lab HQ, Comp Sci and Artificial Intelligence Lab, Comparative Media Studies/Writing, Cybersecurity at MIT Sloan, D-Lab, DAPER Administration, DAPER Intercollegiate Sports, DLC Heads Science, Dean for Student Life - Dept Heads, Dean of Humanities, Arts, and Social Sci, Dean of Science, Department of Biological Engineering, Dept Administrators and Lab Directors, Dept Heads Vice President for Research, Division of Comparative Medicine, Earth, Atmospheric and Planetary Sciences, Economics, Electrical Engineering-Computer Science, Global Studies and Languages, Haystack Observatory, Health Sciences and Technology Program, History Section, Industrial Performance Center, Inst for Data Systems and Society Profs, Inst for Medical Eng. and Science Prof, Institute Professors, Institute for Data, Systems, and Society, Institute for Medical Eng. and Science, Institute for Soldier Nanotechnologies, J-WAFS, Jameel Clinic for ML in Health, Kavli Inst for Astrophysics and Space Rsrh, Koch Inst - Integrative Cancer Research, Lab for Information and Decision Systems, Laboratory for Nuclear Science, Leaders for Global Operations Program, Libraries, Linguistics and Philosophy, Literature Section, MIT Energy Initiative, MIT Environmental Solutions Initiative, MIT Open Learning, MIT Program in Womens and Gender Studies, MIT Quest for Intelligence, MIT Sea Grant College Program, MIT Sloan Human Resources, MIT-SUTD Collaboration, MIT.nano, MITii, Materials Research Laboratory, Materials Science and Engineering, Mathematics, McGovern Institute for Brain Research, Mechanical Engineering, Media Lab, Microsystems Technology Laboratories, Music and Theater Arts Section, Nuclear Reactor Laboratory, Nuclear Science and Engineering, OSATT-Technology Licensing Office, Office of the President, Office of the Provost, Open Learning, J-WEL, Open Learning, J-WEL Research, Projects, Open Learning, J-WEL, Higher Ed, Open Learning, Playful Journey Lab, Physics, Picower Institute for Learning and Memory, Plasma Science and Fusion Center, Political Science, Prog in Science, Technology, and Society, Program in Art, Culture and Technology, Program in Media Arts and Sciences, ROTC Air Force, ROTC Army, ROTC Navy, Research Laboratory of Electronics, SCC Dept/Lab/Center/ Director Org, SHASS Department Heads, School of Architecture and Planning, School of Engineering, School of Engineering Professors, Schwarzman College of Computing, Sloan School of Management, SoE Dept/Lab/Center/Director Org, Sociotechnical Systems Research Center, Supply Chain Management Program, System Design and Management Program, Urban Studies and Planning, VP for Research, World Wide Web Consortium
s.7. Data acquisition
The email data was obtained by university IT staff in accordance with MIT IRB guidelines and contains only aggregate email counts between randomly chosen groups of individuals in the same research unit. Specifically, we obtained the daily total number and volume (total kB) of emails exchanged between randomly generated groups of five or more researchers. No information about the content of emails was shared with us or used in this study.
s.8. Local bridges
Given an undirected network $G$, an edge $(u,v)$ in $G$ is a local bridge if there is no node $w$ such that there are edges $(u,w)$ and $(v,w)$. In other words, the endpoints of a local bridge have no common neighbors.
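The definition translates directly into a set-intersection test on neighborhoods. The following is a minimal sketch, assuming the network is given as a list of undirected edges; the helper names `build_adjacency` and `local_bridges` are illustrative and not taken from the study's code.

```python
def build_adjacency(edges):
    """Return {node: set of neighbors} for an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_bridges(edges):
    """Edges whose endpoints share no common neighbor."""
    adj = build_adjacency(edges)
    return {frozenset((u, v)) for u, v in edges if not (adj[u] & adj[v])}


# Triangle a-b-c, a pendant edge c-d, and an isolated dyad e-f.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("e", "f")]
bridges = local_bridges(toy_edges)
```

In the toy network only the pendant edge and the isolated dyad qualify, since every edge of the triangle has a common neighbor.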
As the definition of a local bridge given above requires an undirected network, we will restrict to reciprocated communications when studying local bridges. The link between weak ties and local bridges comes via the following definition from Granovetter.
Fix an undirected network $G$ together with a labeling of each edge in $G$ as either “strong” or “weak”. We say that a node $A$ violates the Strong Triadic Closure Property if it has strong ties to two other nodes $B$ and $C$, and there is no edge between $B$ and $C$. We say a node satisfies the Strong Triadic Closure Property if it does not violate it.
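A minimal sketch of this check, assuming the network is stored as a dictionary of neighbor sets and the strong ties as a set of unordered pairs; the function name `violates_stc` is hypothetical.

```python
def violates_stc(node, adj, strong):
    """True if `node` has strong ties to two nodes that are not adjacent.

    adj:    {node: set of neighbors} for an undirected network
    strong: set of frozenset({u, v}) edges labeled "strong"
    """
    strong_nbrs = [n for n in adj[node] if frozenset((node, n)) in strong]
    return any(
        c not in adj[b]
        for i, b in enumerate(strong_nbrs)
        for c in strong_nbrs[i + 1:]
    )


strong = {frozenset(("A", "B")), frozenset(("A", "C"))}
open_triad = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}}
closed_triad = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}

open_result = violates_stc("A", open_triad, strong)      # B and C not linked
closed_result = violates_stc("A", closed_triad, strong)  # B and C linked
```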
From here, a well-known argument by contradiction implies that local bridges and weak ties are intimately related in networks which satisfy Strong Triadic Closure.
Theorem S.3 (Granovetter).
If a node in an undirected network satisfies the Strong Triadic Closure Property and is involved in at least two strong ties, then any local bridge it is involved in must be a weak tie.
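The argument by contradiction can be sketched as follows; the node labels $A$, $B$, $C$ are introduced here for illustration.

```latex
\begin{proof}[Sketch]
Let $A$ satisfy the Strong Triadic Closure Property and be involved in at
least two strong ties. Suppose, for contradiction, that the edge $(A,B)$ is
both a local bridge and a strong tie. Since $A$ has at least two strong ties,
there is a node $C \neq B$ such that $(A,C)$ is strong. Strong Triadic Closure
at $A$ then forces the edge $(B,C)$ to exist. Hence $C$ is a common neighbor
of $A$ and $B$, so $(A,B)$ is not a local bridge, a contradiction. Therefore
every local bridge involving $A$ is a weak tie.
\end{proof}
```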
There are two distinct ways to lose local bridges in a network: first, an edge which is a local bridge on day $t$ can fail to exist in the network on day $t+k$ for some $k>0$; second, an edge which is a local bridge on day $t$ can be embedded in a triangle in the network on day $t+k$. Figure 2, panel f in the main text shows that the transition from local bridges to embedded ties is the primary reason for short-term loss of local bridges, while the failure to exist in the network is the primary reason for loss of local bridges in the long term.
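The two loss mechanisms can be separated by comparing the edge lists of two days. The sketch below is one possible way to do this, under the assumption that each day's network is an undirected edge list; the labels "vanished", "embedded", and "persisting" are our own.

```python
def adjacency(edges):
    """Return {node: set of neighbors} for an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def classify_local_bridges(edges_t, edges_later):
    """Sort the local bridges of day t by their fate on a later day."""
    adj_t, adj_later = adjacency(edges_t), adjacency(edges_later)
    fate = {"persisting": set(), "vanished": set(), "embedded": set()}
    for u, v in edges_t:
        if adj_t[u] & adj_t[v]:
            continue  # not a local bridge on day t
        edge = frozenset((u, v))
        if v not in adj_later.get(u, set()):
            fate["vanished"].add(edge)    # edge no longer exists
        elif adj_later[u] & adj_later[v]:
            fate["embedded"].add(edge)    # edge now sits in a triangle
        else:
            fate["persisting"].add(edge)  # still a local bridge
    return fate


day_t = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "f")]
day_later = [("a", "b"), ("b", "c"), ("a", "c"), ("e", "f")]
fate = classify_local_bridges(day_t, day_later)
```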
s.9. Estimating the number of researchers on campus
To justify the use of weekend email data to generate a synthetic counterfactual, we estimate the number of distinct researchers co-located on weekends and weekdays before and after the pandemic lockdown using WiFi data. Specifically, we count the number of unique MAC addresses connected to a given MIT campus WiFi router each hour from February 3, 2020 until October 10, 2020. Figure S.1 shows that, as expected, far fewer people are on campus during the weekend. To confirm that this effect is not due to a lack of classes on the weekends, Figure S.2 shows the number of unique MAC addresses in the Senseable City Lab on weekends and weekdays – a room in which no classes are taught.
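As an illustration of the counting step, the sketch below tallies distinct MAC addresses per hour and averages the hourly counts separately for weekdays and weekends. The record layout `(timestamp, mac)` and both function names are assumptions; the source does not describe the format of the WiFi logs.

```python
from collections import defaultdict
from datetime import datetime


def hourly_unique_devices(records):
    """Count distinct MAC addresses seen in each clock hour.

    records: iterable of (datetime, mac_address) pairs for one router.
    """
    seen = defaultdict(set)
    for ts, mac in records:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        seen[hour].add(mac)
    return {hour: len(macs) for hour, macs in seen.items()}


def mean_by_day_type(hourly):
    """Average the hourly counts separately for weekdays and weekends."""
    groups = {"weekday": [], "weekend": []}
    for hour, n in hourly.items():
        groups["weekend" if hour.weekday() >= 5 else "weekday"].append(n)
    return {k: sum(v) / len(v) for k, v in groups.items() if v}


toy_records = [
    (datetime(2020, 3, 2, 10, 5), "aa"),   # Monday
    (datetime(2020, 3, 2, 10, 40), "bb"),
    (datetime(2020, 3, 2, 10, 50), "aa"),  # repeat device, same hour
    (datetime(2020, 3, 7, 10, 10), "aa"),  # Saturday
]
means = mean_by_day_type(hourly_unique_devices(toy_records))
```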
s.10. Robustness of the tie formation model
To verify that a lack of co-location is what causes our model to reproduce the drop in weak ties seen in the empirical data, we check in Figure S.3 that, with no changes in co-location, our model produces no changes in weak ties.
s.11. Descriptive statistics for weekly, biweekly, and monthly email networks
Requiring emails to be reciprocated the same day is useful to increase the number of independent pre-lockdown data points for statistical purposes, but we lose email exchanges which are stretched over the course of a week or longer. The mean response time in an email exchange is closely related to the frequency of communication: if the mean response time between two people in the 562 days of data is $\tau$ days, then the total number of reciprocated exchanges (ignoring the number of emails sent per exchange) is at most $562/\tau$. Thus by restricting to daily reciprocated emails, we may miss the contributions of some infrequent ties. To expand the breadth of ties under consideration and study how incorporating lagged communications affects our results, we construct email networks whose edges represent the number of reciprocated emails exchanged within 5 business days, 10 business days, and 21 business days. Figures S.4-S.7 show sample weekly, biweekly, and monthly reciprocated email networks from April 4, 2020. Figures S.8 through S.10 show standard network measures for email networks whose edges represent emails reciprocated within one week, two weeks, or one month. In contrast to the daily email networks, there are seemingly long-lasting changes in the size of the giant component in the networks; we are unable to quantify this statistically, as our construction of a synthetic counterfactual from weekend email data breaks down when we allow emails to be reciprocated over any period longer than 2 days.
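A minimal sketch of the windowed construction and of the bound on the number of exchanges is given below. It simplifies the setting in two labeled ways: days are plain integer indices (calendar days, not business days), and edges are unweighted rather than carrying the number of reciprocated emails. The names `reciprocated_edges` and `max_exchanges` are hypothetical.

```python
def reciprocated_edges(emails, window):
    """Pairs with a message in each direction less than `window` days apart.

    emails: iterable of (day_index, sender, receiver); window=1 means
    the reply must arrive on the same day.
    """
    sent = {}
    for day, sender, receiver in emails:
        if sender != receiver:
            sent.setdefault((sender, receiver), []).append(day)
    edges = set()
    for (sender, receiver), days in sent.items():
        replies = sent.get((receiver, sender), [])
        if any(abs(d1 - d2) < window for d1 in days for d2 in replies):
            edges.add(frozenset((sender, receiver)))
    return edges


def max_exchanges(mean_response_days, n_days=562):
    """Upper bound on reciprocated exchanges given the mean response time."""
    return int(n_days // mean_response_days)


toy_emails = [(0, "a", "b"), (0, "b", "a"), (0, "a", "c"),
              (3, "c", "a"), (1, "b", "c")]
daily = reciprocated_edges(toy_emails, window=1)
weekly = reciprocated_edges(toy_emails, window=5)
bound = max_exchanges(7)
```

In the toy data the tie between `a` and `c` is invisible in the same-day network but appears once the window is widened, which is the effect the longer reciprocation windows are meant to capture.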
Figure S.11 shows the number of local bridges and new local bridges in weekly, biweekly, and monthly reciprocated email networks. The number of new weak ties in each case seems to stabilize between 40 and 60, in agreement with the number of new weak ties in the daily email networks. This can be partially explained by the fact that we use a sliding window approach with a stride of one to construct each network. Curiously, the rate of weak tie formation (number of new ties per day) seems to approach a constant around 45 regardless of the reciprocation window.
s.12. Temporal stability of local bridges
Next, consider the two email networks which consist of all communications reciprocated between February 2, 2020 and March 3, 2020 (resp. March 23, 2020 and April 22, 2020). We leave out the period from March 3 through March 23 in order to account for the gradual implementation of COVID-19 policy, which caused a sharp spike in the number of emails sent. Figure S.12 shows the differences in inter- and intra-research-unit emails sent during the two time periods. We note that more emails are sent between fewer people; in addition, the number of inter-unit connections (edges) drops by 4.4% post-lockdown while the number of intra-unit connections rises by 0.8%. This indicates a loss in the diversity of email neighbors which is not detected when restricting to daily reciprocated emails. The mean number of reciprocated emails sent pre-lockdown between dyads which stopped emailing post-lockdown was 2.17, while for dyads which persisted through lockdown the mean number of pre-lockdown reciprocated emails was 8.53; a Welch’s t-test shows that this difference is significant, supporting the idea that lost ties are weaker on average. Of the local bridges present before lockdown, 63.9% are not present post-lockdown; on the other hand, only 43.0% of the ties embedded in triangles before lockdown are lost post-lockdown. This is in line with previous research on the temporal volatility of local bridges and reinforces the idea that the topological and temporal dimensions of the “weakness” of a tie are related.
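For reference, the Welch statistic compares two sample means without assuming equal variances. The sketch below computes the t statistic and the Welch-Satterthwaite degrees of freedom with the standard library only; the sample values are invented toy numbers, not data from the study. Where SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and also returns a p-value.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Toy pre-lockdown email counts for persisting and lost dyads.
persisting = [8, 9, 7, 10]
lost = [2, 3, 1, 2, 3]
t_stat, dof = welch_t(persisting, lost)
```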
To examine long-term changes in the growth of the email network, we build two email networks by considering all emails reciprocated in February 2020 (resp. February 2021). As in the short term, more emails were sent between fewer people in 2021 (Fig. S.13). Here, however, there is a drop in the number of both intra- and inter-unit connections. The edges which are present in February 2020 but not February 2021 display similar patterns to the edges dropped immediately after lockdown – 71.7% of local bridges are dropped, while 46.7% of triangle-involved edges are dropped; the mean number of reciprocated emails in February 2020 for those dyads which persist to February 2021 is 8.40, while the mean number of reciprocated emails between dyads which are dropped in 2021 is 2.80.
| | Inter-unit edges | Inter-unit emails | Intra-unit edges | Intra-unit emails |
|---|---|---|---|---|
| Feb 2 - March 3, 2020 | 5902 | 29413 | 6739 | 43044 |
| March 23 - April 22, 2020 | 5642 | 39194 | 6795 | 55420 |
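The percentage changes quoted in the text for the pre- and post-lockdown windows can be checked against the tabulated counts. The short sketch below does this arithmetic and also computes emails per edge as a rough indicator of "more emails between fewer people"; the dictionary keys are our own labels.

```python
pre = {"inter_edges": 5902, "inter_emails": 29413,
       "intra_edges": 6739, "intra_emails": 43044}
post = {"inter_edges": 5642, "inter_emails": 39194,
        "intra_edges": 6795, "intra_emails": 55420}


def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


changes = {key: round(pct_change(pre[key], post[key]), 1) for key in pre}

# Average number of inter-unit emails carried by each inter-unit edge.
inter_per_edge_pre = pre["inter_emails"] / pre["inter_edges"]
inter_per_edge_post = post["inter_emails"] / post["inter_edges"]
```

The edge counts reproduce the 4.4% drop in inter-unit connections and the 0.8% rise in intra-unit connections, while the emails carried per inter-unit edge increase.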
s.13. Alternate definitions of weak tie
From the definition of local bridge, one can see that an isolated dyad – an edge whose source and target have degree one – is a local bridge. However, such edges do not correspond to the intuitive notion of a “bridge” as a link between distinct communities. Figure S.14 panel a shows that the observed drop in local bridges persists even when we ignore local bridges which are isolated dyads.
A previous study on the effect of remote work on tech employee collaboration patterns defined bridging connections as those connections in monthly communication networks with a low local constraint, and weak ties as ties with below-median communication time. Our definition of weak ties (local bridges) is more similar to their definition of bridging tie than to their definition of weak tie. Figure S.14 panels b and c show Burt’s measure of structural holes and a measure of structural holes commonly used in topological data analysis, respectively. In both cases, there are no significant long-term changes in the structure of holes in the network. Panel d confirms that there is no significant change in the number of emails sent along local bridges. Though this may seem to contrast with the results found by Microsoft, it is important to note that they primarily studied communication media other than email.
s.14. Preprocessing error
Recall from the Methods section that we estimate the number of emails between each pair of users on each day from randomized, aggregated data. To quantify the quality of our estimates, we re-aggregate our individual estimates and compare them to each randomization of the aggregated data. The error observed after aggregation is due to our use of non-negative matrix factorization (NMF) to obtain approximate sparse solutions for the number of emails sent per user when there is not a unique integer solution. Of those pairs of people who possibly sent a non-zero number of emails on a given day, we solve for the number of emails exactly in approximately 66% of cases; if we include the pairs which send no emails to one another in at least one random aggregation (and hence send zero emails), we solve exactly for the number of emails between 99.9% of users. When a system is underdetermined, NMF is substantially faster than solving the constrained system of linear Diophantine equations necessary to produce a sparse, non-negative integer solution. Furthermore, any exact integer solution to an underdetermined system of linear equations would itself be an approximation to the true number of emails sent between users. This is a source of unmeasurable error which is a consequence of working with aggregated data. Figure S.19, panels b and c, show that of the tens of thousands of emails sent each day, our errors are at most in the hundreds. Panel a shows that, per edge in the aggregated network, the error never exceeds 3%.
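The re-aggregation check can be illustrated as follows. This is a sketch under simplifying assumptions: group-to-group counts are treated as undirected, a single random grouping is used, and the names `aggregate` and `relative_errors` are hypothetical. It does not reproduce the NMF step itself.

```python
def aggregate(pair_counts, group_of):
    """Sum per-pair email estimates up to the group-pair level.

    pair_counts: {(user_u, user_v): estimated emails}
    group_of:    {user: group label} for one random grouping
    """
    totals = {}
    for (u, v), n in pair_counts.items():
        key = frozenset((group_of[u], group_of[v]))
        totals[key] = totals.get(key, 0) + n
    return totals


def relative_errors(estimated_pairs, observed_groups, group_of):
    """Per-edge relative error of re-aggregated estimates vs. observed totals."""
    re_aggregated = aggregate(estimated_pairs, group_of)
    return {
        key: abs(re_aggregated.get(key, 0) - observed) / observed
        for key, observed in observed_groups.items()
        if observed > 0
    }


group_of = {"a": "g1", "b": "g1", "c": "g2", "d": "g2"}
estimated = {("a", "c"): 3, ("b", "d"): 2, ("a", "b"): 1}
observed = {frozenset(("g1", "g2")): 5, frozenset(("g1",)): 2}
errors = relative_errors(estimated, observed, group_of)
```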
s.15. Other network measures
Dual to local bridges, we can compute the average clustering coefficient of the email networks, which measures the percentage of possible triangles each node belongs to. Because local bridges are precisely the edges which are not part of triangles, there is a loose inverse relationship between the number of local bridges and the clustering coefficient. Figure S.20a shows the average clustering coefficient; we note that it changes towards the beginning of March when MIT’s COVID-19 response team was first established.
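For concreteness, a minimal sketch of the average clustering coefficient on an undirected edge list is shown below. It follows the common convention that nodes with fewer than two neighbors contribute zero; whether the study uses the same convention is an assumption.

```python
from itertools import combinations


def average_clustering(edges):
    """Mean over nodes of the fraction of neighbor pairs that are linked."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    coefficients = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coefficients.append(0.0)  # no neighbor pairs to close
            continue
        closed = sum(1 for x, y in combinations(nbrs, 2) if y in adj[x])
        coefficients.append(2.0 * closed / (k * (k - 1)))
    return sum(coefficients) / len(coefficients)


# Triangle a-b-c plus a pendant edge c-d (a local bridge).
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
avg_cc = average_clustering(toy_edges)
```

Adding the pendant local bridge lowers the average from 1 (a bare triangle) to 7/12, which illustrates the loose inverse relationship described above.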
Figure S.20b provides further evidence that new weak links enter the network at a slower rate after the implementation of COVID-19 policy. The edges present in the network are significantly more similar from week to week after the implementation of COVID-19 policy than beforehand.
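One simple way to quantify week-to-week similarity of edge sets is the Jaccard index. The source does not name the similarity measure it uses, so the choice of Jaccard here is an assumption made purely for illustration.

```python
def jaccard(edges_a, edges_b):
    """Jaccard similarity between two undirected edge lists."""
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    union = a | b
    return len(a & b) / len(union) if union else 1.0


week_1 = [("a", "b"), ("b", "c"), ("c", "d")]
week_2 = [("b", "a"), ("b", "c"), ("d", "e")]  # edge direction is ignored
similarity = jaccard(week_1, week_2)
```

A value closer to 1 means that fewer new edges entered and fewer old edges left between the two weeks.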
Figure S.20c shows the percentage of edges in each weekday graph which are infrequent, and panel d shows the number of edges on a given day which do not appear on any other day (the most infrequent ties). Frequency is calculated as the number of appearances of the edge over all 562 days in our data. This gives a dynamic, rather than static, measure of the “weakness” of a tie. We see a sharp drop in the percentage of infrequent edges after the implementation of COVID-19 policy. This is especially surprising, as one would naturally expect edges which first appear late in the data to be infrequent – there are fewer days remaining in the data at which they can reappear.
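The frequency-based measure can be sketched as below, assuming one undirected edge list per day. The cutoff that separates "infrequent" from frequent edges is not given in the text, so `threshold` is a hypothetical parameter.

```python
from collections import Counter


def edge_frequencies(daily_edges):
    """Number of days on which each undirected edge appears."""
    counts = Counter()
    for edges in daily_edges:
        counts.update({frozenset(e) for e in edges})
    return counts


def infrequent_share(day_edges, counts, threshold):
    """Fraction of a day's edges appearing on at most `threshold` days."""
    day = {frozenset(e) for e in day_edges}
    if not day:
        return 0.0
    return sum(1 for e in day if counts[e] <= threshold) / len(day)


def one_off_edges(day_edges, counts):
    """Edges of a given day that appear on no other day."""
    return {frozenset(e) for e in day_edges if counts[frozenset(e)] == 1}


days = [[("a", "b"), ("b", "c")], [("a", "b")], [("a", "b"), ("c", "d")]]
counts = edge_frequencies(days)
share_day0 = infrequent_share(days[0], counts, threshold=1)
singles_day0 = one_off_edges(days[0], counts)
```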
Figure S.21 shows the distribution of distances between researcher offices at MIT. Almost all researchers’ offices are within 2 km of one another.
s.16. Statistics tables
All null hypothesis tests are two-sided.
| Dep. Variable | R-squared |
|---|---|
| Intra-unit connections | -0.000 |
| Inter-unit connections | 0.000 |
| Num connected comps. | 0.000 |
| Giant comp. size | 0.000 |
| Num. weak ties | 0.054 |
| Num. new weak ties | 0.897 |
| Num weak ties | 0.268 |