Socio-economic, built environment, and mobility conditions associated with crime: A study of multiple cities

04/13/2020 ∙ by Marco De Nadai, et al. ∙ 0

Nowadays, 23% of the world population lives in multi-million cities. In these metropolises, criminal activity is much higher and more violent than in either small cities or rural areas. Thus, understanding what factors influence urban crime in big cities is a pressing need. Mainstream studies analyse crime records through historical panel data or analysis of historical patterns combined with ecological factors and exploratory mapping. More recently, machine learning methods have provided informed crime prediction over time. However, previous studies have focused on a single city at a time, considering only a limited number of factors (such as socio-economic characteristics) and often at large spatial units. Hence, our understanding of the factors influencing crime across cultures and cities is very limited. Here we propose a Bayesian model to explore how crime is related not only to socio-economic factors but also to the built environment (e.g. land use) and mobility characteristics of neighbourhoods. To that end, we integrate multiple open data sources with mobile phone traces and compare how the different factors correlate with crime in diverse cities, namely Boston, Bogotá, Los Angeles and Chicago. We find that the combined use of socio-economic conditions, mobility information and physical characteristics of the neighbourhood effectively explains the emergence of crime, and improves the performance of the traditional approaches. However, we show that the socio-ecological factors of neighbourhoods relate to crime very differently from one city to another. Thus there is clearly no "one fits all" model.





In criminology, social cohesion among neighbours has been linked to their willingness to cooperate in order to solve common problems and reduce violence Graif and Sampson (2009); Sampson and Groves (1989); Sampson (1997). Cooperation, as opposed to disorganization of neighbours, is indeed believed to create the mechanisms by which residents themselves achieve guardianship and public order Sampson and Groves (1989). This mechanism also finds its roots in urban planning, where specific aspects of urban architecture Newman (1972) and urban physical characteristics Jacobs (1961) are related to security. However, neighbourhoods are not to be considered islands unto themselves, as they are embedded in a city-wide system of social interactions. On a daily basis, people’s routines expose residents to different conditions and possibilities Wang et al. (2018), and may favour crime Cohen and Felson (1979). Yet, mainstream studies focus on just a subset of static factors at a time, often in a single city (e.g. Chicago or New York), thus neglecting the complex urban interplay between crime, people, places, culture and human mobility.

Criminology widely recognizes the importance of places. Crime occurs in small areas such as street segments, buildings or parks. However, neighbourhoods and their contextual characteristics are also believed to influence offenders’ activities. Studies on small areas and neighbourhoods roughly come from two streams of literature. The first stream focuses on the routine activity and crime pattern theories Cohen and Felson (1979); Felson and Clarke (1998); Brantingham and Brantingham (1993), and on small areas. These studies suggest that crime occurs when an offender, a suitable target, and the absence of any deterrence system, such as police or even ordinary citizens Felson and Boba (2010), converge at a place. The presence of people influences the number of offenders and targets, and the daily routine of residents exposes homes and people to predatory crimes Hindelang et al. (1978). The built environment was also found to affect criminal activities, as physical disorder and specific locations (e.g. bars, taverns) attract offenders and suitable targets O’Brien and Sampson (2015); Murray and Roncek (2008); Salesses et al. (2013). The second stream of literature builds upon the social disorganization theory Sampson and Groves (1989); Sampson (1997), which found high crime concentration in socially and economically disadvantaged neighbourhoods. In these studies, census data is the primary source used to measure social cohesion through socio-economic disadvantage, ethnic diversity and residential instability Sampson and Groves (1989); Sampson (1997, 1985). In some cases, new sources of data were used. For example, scholars exploited synthetic social ties to simulate neighbourhood cohesion Hipp et al. (2013), and mobility flows to indicate crime opportunities and connections between neighbourhoods Song et al. (2019). Others leveraged crowd-sourced Points of Interest (POIs), taxi flows Wang et al. (2016), and dynamic population mapping from satellite imagery Andresen (2006, 2011) and mobile phone activity Bogomolov et al. (2014); Malleson and Andresen (2015) to assess the presence of people. Altogether, these results highlight the tight relation between socio-economic, built environment and mobility conditions, and their impact on criminal activities. Although the two streams of theory are often seen as competing, we argue that they can complement each other. However, very limited work has integrated socio-economic, built environment and mobility conditions together in multiple cities and in small areas. Existing literature focuses on a single city, often describes crimes at the neighbourhood level, and relies on census boundaries. These limitations result in a fragmented and incomplete picture of how the numerous factors influence crime in the urban context and limit the impact of the conclusions.

Here, we seek to shed light on the diverse set of factors at play in urban crime, exploring how crime is related, at the same time, to social disorganisation, built environment characteristics and human mobility. Specifically, we analyse crime at the level of blocks, considering both the local features of the block and its surrounding context, represented by all the blocks within a half-mile. The contribution of this paper is twofold. First, we address the need for a comprehensive study that explores crime patterns at fine-grained resolution across multiple cities of the world, analysing Bogotá, Boston, Los Angeles and Chicago. Second, we show that accounting for the previously neglected complex interplay between crime, people, places, and human mobility can significantly improve the performance of crime inference. We make use of massive and ubiquitous data sources such as mobile phone records and geographical data, implying that the resulting framework can be replicated at scale. Our generated insights can help recommend effective policies and interventions that improve urban security.


We study criminal activity in Bogotá (Colombia), Boston (USA), Chicago (USA) and Los Angeles (USA), four very different cities with respect to cultural, urban and socio-economic conditions. The selected unit of analysis is the census block group, the smallest geographical unit for which the census publishes data, and measuring on average 378 square meters. We account for the contextual characteristics around the block group, here called core, by computing a corehood, defined as the set of all the surrounding block groups within a half mile from the core (see Figure 1). Note that neighbouring cores have overlapping corehoods. We tested different sizes of the corehood, finding the half-mile distance to be the best at describing the neighbourhood effect (see the Supplementary Information (SI) Note 11).
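The corehood construction can be sketched in a few lines. This is only an illustration under stated assumptions: each block group is reduced to a centroid in projected coordinates (metres), "within a half mile" is measured centroid to centroid, and the core itself is excluded from its corehood. The text does not specify these details, and the names `corehoods`, `HALF_MILE_M` and the toy coordinates are hypothetical.

```python
import math

HALF_MILE_M = 804.672  # half a mile expressed in metres


def corehoods(centroids, radius=HALF_MILE_M):
    """Map each core id to the set of other block groups within `radius`.

    `centroids` maps a block-group id to an (x, y) centroid in metres.
    Assumption: distance is centroid-to-centroid and the core is excluded.
    """
    result = {}
    for core, (cx, cy) in centroids.items():
        result[core] = {
            other
            for other, (ox, oy) in centroids.items()
            if other != core and math.hypot(ox - cx, oy - cy) <= radius
        }
    return result


# Toy example: three hypothetical block groups placed on a line.
toy = {"a": (0.0, 0.0), "b": (500.0, 0.0), "c": (1200.0, 0.0)}
hoods = corehoods(toy)
```

In this toy layout, cores `a` and `c` both contain `b` in their corehoods, which mirrors the remark that neighbouring cores have overlapping corehoods.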

Figure 1: For each block group (the core), we consider the block groups within a half mile as its corehood. Blocks that are near each other share most of their corehood. In this example, we show two cores in Bogotá and their corresponding corehoods. We focus on three aspects of the core and the corehood: the Social Disorganization (SD), the Built Environment (BE), and the Mobility (M). The core, where crime is predicted, measures on average 378 square meters.

Data on criminal activity is provided by police agencies, which record through police reports the geographic location, date, time of day and category of each crime event. We analyse crimes belonging to two broad categories: violent and property crimes, which include homicides, sexual and non-sexual aggravated assaults, robbery, motor vehicle thefts and arson. We assign each crime to a corehood through its position.

In order to estimate the number of crimes in a given core, we compute two types of features. First, we consider the characteristics of the core itself. We include features that were previously found to attract potential offenders and targets Wang et al. (2016), such as the residential population and the number of nightlife, shops and food POIs. Then, to account for the fact that environmental (neighbourhood) characteristics influence crime Jacobs (1961); Sohn (2016), we consider corehood features in our model. We group them into Social Disorganization (SD), Built Environment (BE) and Mobility (M) features. The SD characteristics include the disadvantage, instability and ethnic diversity of the corehood. Consistently with the literature Sampson et al. (1999); Sampson and Groves (1989); Sampson (1997); Kubrin and Weitzer (2003), disadvantage and instability are composite variables built from the two largest principal components of: (i) unemployment rate, (ii) poverty rate, defined as people living below the poverty line, and (iii) residential mobility rate, defined as the percentage of people who recently changed residency. Again, in accordance with the literature Sampson (2013); Sampson and Groves (1989); Sampson and Graif (2009), ethnic diversity is computed as the Hirschman-Herfindahl index across six population groups (e.g. Hispanic, black, white people). Additional details are present in the Methods section. Note that we excluded all race-specific variables that are usually employed (e.g. percentage of black people) to build an evidence-based and race-neutral model.
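As a rough illustration of the ethnic diversity measure, the sketch below computes a Herfindahl-based index from group counts. It assumes the common complement form, one minus the sum of squared population shares, so that larger values mean more diversity; the text only names the index, so this exact form, the function name `ethnic_diversity` and the toy counts are assumptions.

```python
def ethnic_diversity(group_counts):
    """Herfindahl-based diversity: 1 - sum of squared population shares.

    Assumption: the complement form is used, so 0 means a single group.
    """
    total = sum(group_counts)
    if total == 0:
        return 0.0  # empty corehood: no diversity can be measured
    return 1.0 - sum((count / total) ** 2 for count in group_counts)


# Hypothetical counts for six population groups.
homogeneous = ethnic_diversity([1000, 0, 0, 0, 0, 0])
balanced = ethnic_diversity([100] * 6)
```

With six groups the index is bounded above by 5/6, which is reached when all groups have equal size.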

The BE features are based on Jane Jacobs’ theory Jacobs (1961), which states that four conditions have to be valid to ensure a virtuous loop between the presence of people and a vibrant neighbourhood life. First, a district should serve two or more functions to have streets continuously used by residents and strangers. Second, street blocks should be small and short to ensure both high walkability and frequent meeting of people at street intersections. Third, diverse buildings make it possible to have low- and high-rent spaces, and thus a mixture of people and enterprises. The fourth condition is about dense concentration, which ensures a sufficient presence of people and enterprises to attract dwellers from different neighbourhoods continuously. This is summarized by the idea that "a well-used city street is apt to be a safe street and a deserted city street is apt to be unsafe" Jacobs (1961). Moreover, walkability promotes social relations Leyden (2003) and is connected to the local cohesion of neighbours. Thus, in accordance with the literature De Nadai et al. (2016), we operationalize the four conditions in: i) land-use mix; ii) block size; iii) building age diversity; iv) population density and walkability, related to the second condition but also to density and reachability of POIs in the area. The details of these metrics are available in the Methods section.

The M features are built upon recent mobility and criminology literature. We account for the average number of people at risk in the core by measuring the core ambient population Andresen (2006) and the attractiveness of the corehood, where the latter is measured as the number of trips to the corehood for reasons other than travelling to work or home. Ambient population and attractiveness are computed by simulating realistic urban traces using TimeGeo Jiang et al. (2016), a state-of-the-art model for human mobility, in combination with mobile phone data. We do not include M features in Chicago, as we do not have mobile phone traces.

We model the relation of crime with core and corehood features through a spatially filtered Bayesian Negative Binomial, which is specifically tailored for discrete data, accounts for the overdispersion of crime events, and models uncertainty. The model accounts for the spatial auto-correlation, thus avoiding the biased parameters of non-spatial models Griffith and Peres-Neto (2006); Tiefelsdorf and Griffith (2007). We identify spatial auto-correlation of crime events using a matrix indicating spatial proximity, and modelling spatial random effects. Specifically, criminal activity is explained by a linear combination of an intercept, fixed effects (i.e. the input features), and random effects, which represent the unexplained variance that emerges from the spatial auto-correlation of neighbouring areas. Although we find high spatial correlation in crime events, we did not find any significant spatial auto-correlation in the residuals with our spatial model (see Note 4 in the SI). The reader can refer to the Methods section for additional details about the model and its formulation.
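To make the structure of such a model concrete, the sketch below evaluates a Negative Binomial log-likelihood for one core, with the expected count given by an intercept, fixed effects and a spatial random effect. It is not the paper's implementation: the log link, the mean/dispersion parameterization, and all numbers are assumptions for illustration, and the priors and the spatial filtering step are omitted.

```python
import math


def nb_log_pmf(y, mu, r):
    """Log-probability of count y under a Negative Binomial.

    Parameterized by mean `mu` and dispersion `r`, so that the variance is
    mu + mu**2 / r (smaller r means stronger overdispersion).
    """
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


def expected_crime(intercept, betas, features, spatial_effect):
    """Assumed log link: exp(intercept + fixed effects + random effect)."""
    eta = intercept + sum(b * x for b, x in zip(betas, features)) + spatial_effect
    return math.exp(eta)


# Hypothetical coefficients and features, for illustration only.
mu = expected_crime(1.0, [0.5, -0.2], [1.2, 0.4], spatial_effect=0.1)
log_lik = nb_log_pmf(7, mu, r=2.0)
```

The extra dispersion parameter is what lets the variance exceed the mean, which a Poisson model could not accommodate for overdispersed crime counts.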

Figure 2: Maps of the estimated number of crimes for each neighbourhood in Bogotá for the A) Social-disorganization, B) Built environment, C) Full model. D) shows the Full model’s prediction. E) shows the ground truth crime count.
Model | Bogotá marginal R² (conditional R²) | Bogotá LOO | Boston marginal R² (conditional R²) | Boston LOO | Los Angeles marginal R² (conditional R²) | Los Angeles LOO | Chicago marginal R² (conditional R²) | Chicago LOO
Core | 0.54 (0.75) | -3897 | 0.21 (0.64) | -2035 | 0.18 (0.68) | -9665 | 0.09 (0.68) | -8415
Social-disorganization (SD) | 0.57 (0.75) | -3891 | 0.55 (0.68) | -2019 | 0.53 (0.72) | -9529 | 0.66 (0.78) | -8019
Built environment (BE) | 0.61 (0.76) | -3881 | 0.36 (0.68) | -2014 | 0.27 (0.69) | -9629 | 0.21 (0.69) | -8371
Mobility (M) | 0.64 (0.80) | -3804 | 0.42 (0.70) | -2001 | 0.25 (0.70) | -9570 | - | -
SD+BE | 0.64 (0.76) | -3881 | 0.65 (0.72) | -1987 | 0.56 (0.72) | -9508 | |
SD+M | 0.66 (0.81) | | 0.67 (0.73) | -1973 | 0.55 (0.73) | -9467 | - | -
BE+M | 0.68 (0.80) | -3819 | 0.50 (0.72) | -1989 | 0.30 (0.70) | -9585 | - | -
SD+BE+M (Full) | | -3808 | | | | | - | -
Table 1: Quantitative results of crime description and predictions in Bogotá, Boston, Los Angeles and Chicago. The model including Social Disorganization, Built Environment and Mobility features achieves the highest descriptive (marginal and conditional R²) and predictive (LOO) performance. Here, we can see that contextual features of the neighbourhood significantly increase our model’s performance against the model considering only the core features. The LOO metric is calculated through the Pareto smoothed importance sampling Leave-One-Out cross-validation.

Description and prediction of crime

For each city, we evaluate our model under various feature combinations to assess the contribution of each group of features. We measure the capability of the model to describe crime through the marginal R² Nakagawa et al. (2017), which measures the proportion of variance explained by the fixed effects (i.e. the input features). As a reference, we also measure the conditional R² Nakagawa et al. (2017), which takes into account the variance explained by both the fixed and random effects (i.e. the spatial autocorrelation). Additionally, we use the Pareto-smoothed importance sampling Leave-One-Out cross-validation (LOO) Vehtari et al. (2017) to assess the point-wise out-of-sample prediction accuracy (the higher, the better).
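The two descriptive measures can be illustrated from variance components. The sketch below follows the general variance-partitioning idea behind the marginal and conditional measures: the fixed-effect variance alone, or together with the random-effect variance, divided by the total. How the residual (observation-level) variance is obtained for a Negative Binomial model is not shown here and is simply taken as an input; the function name and the numbers are hypothetical.

```python
def marginal_conditional_r2(var_fixed, var_random, var_residual):
    """Return (marginal, conditional) shares of explained variance.

    marginal    = fixed / (fixed + random + residual)
    conditional = (fixed + random) / (fixed + random + residual)
    """
    total = var_fixed + var_random + var_residual
    marginal = var_fixed / total
    conditional = (var_fixed + var_random) / total
    return marginal, conditional


# Hypothetical variance components, for illustration only.
r2_m, r2_c = marginal_conditional_r2(var_fixed=2.0, var_random=1.0, var_residual=1.0)
```

The gap between the two values is the share of variance attributed to the spatial random effects rather than to the input features.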

First, we evaluate the baseline model that includes only the core variables. Table 1 shows that the core-only model performs poorly in Chicago, Los Angeles and Boston, while it has a high marginal R² in Bogotá. The difference between the marginal and conditional R² highlights that in all cities there is a significant unexplained variance that is captured by the spatial random effects, but not by the input features.

The SD, BE and M features significantly increase the explanatory power of our model. Particularly, in US cities, the marginal R² increases by up to 161%, 194% and 633% in Boston, Los Angeles and Chicago, respectively. Notably, and not surprisingly, the SD features are very important, especially in Chicago, where the "Chicago school" forged the Social Disorganization theory and further elaborated the role of collective efficacy in dealing with crime. In contrast, the increase in Bogotá is less pronounced, suggesting that the neighbourhood impact on crime is limited. Turning to M and BE features, we find that they describe crime, but they are often not as meaningful as the SD features for crime prediction. However, the importance of mobility confirms the importance of the floating population in describing the microdynamic behaviour of criminal activity Caminha et al. (2017). We observe that in all cities the conditional R² increases when adding the SD, BE and M features, revealing that the included variables also help explain the variance of crime across cores.
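The reported percentages can be checked against Table 1. The sketch below assumes, as an inference from the table values rather than a statement in the text, that the gains compare the first value reported for the Core row with the first value reported for the SD row in each city.

```python
def percent_increase(baseline, value):
    """Relative change from `baseline` to `value`, in percent."""
    return (value - baseline) / baseline * 100.0


# First value of the Core and SD rows of Table 1, per city.
core_r2 = {"Bogotá": 0.54, "Boston": 0.21, "Los Angeles": 0.18, "Chicago": 0.09}
sd_r2 = {"Bogotá": 0.57, "Boston": 0.55, "Los Angeles": 0.53, "Chicago": 0.66}

gains = {city: percent_increase(core_r2[city], sd_r2[city]) for city in core_r2}
```

Under this reading, the truncated gains for Boston, Los Angeles and Chicago match the quoted figures, while the gain for Bogotá stays below 6%, consistent with the less pronounced increase noted for that city.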

Overall, Table 1 shows that considering SD, BE and M variables together results in the highest descriptive (R²) and predictive (LOO) performance. This result means that, in order to model crime, one needs to account for multiple aspects of urban life, including Social Disorganization, the physical characteristics of the neighbourhoods, and mobility. This result also holds against different combinations of the features (i.e. SD+BE, SD+M and BE+M). Nonetheless, some of the SD+BE and SD+M models are very competitive and might be considered when not all data sources are available. Particularly, the ambient population (i.e. the average number of people who stop at the core) is one of the most important variables in the model and allows a better assessment of the number of people at risk, as suggested by previous works on aggregated mobility Caminha et al. (2017), satellite imagery Andresen (2006), Twitter Malleson and Andresen (2015) and census data Mburu and Helbich (2016). However, we found that it might generate large errors due to places that are outliers of mobility in densely populated areas or hotspots of activity (see Figure S7 and Figure S8 in the SI).

Marginal R² improvements indicate that the model relies less on the random effects and is better at explaining crime from the input features. Figure 2 shows the spatial gain in performance from the baseline in Bogotá. First, it reveals that our Full model prediction resembles the ground truth data (Figure 2 D-E), as confirmed by the high value of R². Second, it shows that, while the SD and BE models achieve localized improvements (Figure 2 A-B), the Full model improves the prediction almost everywhere. However, the Full model performs quite poorly in a specific area of Bogotá (see Figure 2 C), part of the Engativá neighbourhood. By inspecting the coefficients of the model, we find that this area is an outlier as it is densely populated, thus resulting in an inflated prediction of crime, due to the high importance of residential and ambient population in the Bogotá model. Note, however, that our prediction is at the block level and the city-wide goodness of fit is .

The difference between the marginal and conditional R² represents the unexplained variance due to spatial auto-correlation, which might suggest missing effects and variables. In Bogotá, our model points out that the touristic and dangerous neighbourhood La Candelaria, and the populous district of Engativá have significant unexplained variance that our input features cannot capture (see Figure S4 in the SI). In Boston, the area near Franklin Park indicates missing local factors (see Figure S3 in SI). In Los Angeles, unexplained variance seems to be tied to places with a large number of people, namely the international airport and the UCLA campus (see Figure S5 in SI). Again, in Chicago, missing variables are suggested near the prison and the southern area (see Figure S6 in SI). Altogether, these signals could help policymakers include the best factors for each city and enact policies that prevent crime.

Previous results suggested that the use of mobility flows between different regions might help describe crime Wang et al. (2016); Wang and Li (2017). Thus, we test our model against this hypothesis by using the Origin-Destination matrix of people’s trips to model the auto-correlation between corehoods. The idea here is that human mobility might better explain the relation between corehoods than geographical closeness. However, we find that mobility flows significantly worsen the performance of our model (see Note 5 of SI).

While the effects of urban environment characteristics, socio-economic conditions, and mobility have been empirically tested separately De Nadai et al. (2016); Graif et al. (2017); Lee et al. (2017a); Sung and Lee (2015); Sampson (1997), to the best of our knowledge, this is the first study to support with large-scale data the association of crime with socio-economic conditions, the built environment, and mobility. However, we find that these aspects do not play the same role across cities, and only some of them contribute to the crime prediction model.

Figure 3: Generalized Linear Model’s coefficients showing that Social Disorganization, Built Environment and Mobility features do not play the same role in all cities. We highlight in blue the minimum and maximum coefficient for each feature. Overall, this figure shows that there is no universal theory of crime.

Neighborhood variables across cities

In this section, we turn our attention to the standardized coefficients that reveal how features correlate with criminal activity.

First, we focus on the coefficients of the Full model, which combines socio-economic features with the characteristics of the built environment and human mobility. Note that here Chicago is excluded for lack of data. Figure 3 shows that the coefficients vary greatly across cities. For example, land-use mix correlates negatively with criminal activity in Bogotá and Los Angeles, but positively in Boston. Similarly, higher building age diversity is present in low-crime areas in Boston and Los Angeles, but in high-crime areas in Bogotá. Social disorganization variables are no less different, as corehood instability is correlated with criminal activity only in Bogotá, unlike what is expected from the theory Shaw and McKay (1942); Sampson and Groves (1989).

The discrepancies between cities could be explained by the different spatial and socio-economic processes at play. When we look at the bivariate correlations across features, we observe interesting patterns. For example, in Los Angeles and Boston, walkability is strongly positively correlated with population density and neighbourhood attractiveness, as expected Shaw and McKay (1942); Sampson and Groves (1989), and slightly correlated with advantaged neighbourhoods. In contrast, walkable areas in Bogotá have low population density and are highly advantaged, while attractiveness is only slightly correlated (see Figure S11 in SI). A possible reason for the disagreement among coefficients lies in the multi-collinearity of the input features. Although we use the QR decomposition and a Ridge penalty to shrink the variables that are not necessary, the difference between the coefficients is also present in simpler models.

The difference between the results across cities also suggests that crime correlates differently with space and people. For example, we observe that in Bogotá high-crime areas relate to advantaged neighbourhoods, while in Boston and Los Angeles higher crime seems to be linked to disadvantaged neighbourhoods, in accordance with the theory Shaw and McKay (1942); Sampson and Groves (1989). A possible explanation might be related to under-reporting and disrespect for the police, which seems to be a problem particularly in Bogotá Godoy et al. (2018). However, literature has shown how neighbourhood cultural codes, informal local control, and problematic policing are also related to violent criminal activities Kubrin and Weitzer (2003).

However, some features behave similarly in all the cities. We find that corehoods with high disadvantage and ethnic diversity but, surprisingly, smaller blocks have higher crime activity. In the core, we find that the presence of shops, food POIs, and population (both residential and ambient) correlates positively with criminal activity. These results resonate with literature showing that the presence of POIs and ambient population increases crime due to a higher number of potential targets and offenders in an area. Additionally, we find that corehood attractiveness has a strong connection with crime, suggesting that the presence of people who neither live nor work in the area might influence crime. This result is in contrast with literature based on Jacobs’ theory Jacobs (1961); Traunmueller et al. (2014), but resonates with Oscar Newman’s, which argues that a high number of visitors results in higher anonymity and, thus, crime Newman (1972). Additionally, a recent empirical study based on survey data Boivin and Felson (2018) agrees with our result, which was obtained instead with large-scale and passively collected information. In the supplementary materials (SI), we compare all the cities in detail.

To test the possibility of having a universal model that predicts crime, we test a model that uses only the features that behave in the same direction in all the cities. This model consistently performs worse than the Full model (see Note 10 in SI), showing that, at this moment, no single model can be readily applied to all cities. We also studied to what extent a model trained in one city can be applied to another city. We found that US cities are, as expected, more similar to each other than to Bogotá, and that Los Angeles behaves similarly to Chicago.


In this paper, we modelled the presence of crime across four cities, widely different with respect to cultural, economic, historical and geographical aspects. We found that the variability of the dynamics and history of each city poses a challenge to the existence of a model that "fits it all", able to learn from one city and to predict on another one. Instead, we presented a model that could describe and disentangle the role of diverse factors in urban crime and draw some theoretical and practical implications.

The goal of this research goes beyond crime prediction in time (i.e. forecasting). Offences are concentrated in a small number of places Lee et al. (2017b), and are tightly coupled with places, stable over time Weisburd et al. (2012). Thus, the easiest way to predict crime is to model those few places with the highest number of crimes, also known as hotspots Bogomolov et al. (2014); Short et al. (2010). On the contrary, we seek to shed light on the diverse set of factors at play in urban crime and make predictions for those areas without crime statistics (i.e. nowcasting).

Our cumulative results show little evidence in support of Jane Jacobs’ theory, which argues that specific urban features and people on the street generate higher security. On the contrary, we often found that Jacobs’ features and urban vibrancy increase people’s vulnerability to crime, suggesting that further work has to be done in this direction.

We found that different theories often seen as competing can complement each other in models that take into account the socio-economic, built environment and mobility conditions together. The importance of mobility and built environment characteristics showed that competitive descriptive and predictive models can be built from data available at large scale without the necessity of costly in-field survey studies. However, we found that aspects related to social disorganisation are important for crime description and prediction. Therefore, it is crucial to consider alternative sources of data to infer social cohesion and interactions and overcome the use of census information, which is costly to collect and rarely updated. There have been multiple attempts at inferring social interactions Eagle et al. (2009), poverty Blumenstock et al. (2015), well-being Pappalardo et al. (2016) and unemployment Toole et al. (2015) but so far very little work has been done at micro spatial levels.

Comparing multiple cities in different countries does not come without limitations. First, our analysis ignores temporal variation such as opening times of POIs or temporal variation in mobility. Second, due to lack of consistent data, we did not account for variables such as political and housing policies, security perception, community participation, and social ties within family and within neighbourhoods that were previously found to be related to crime Faust and Tita (2019); Salesses et al. (2013); Tran et al. (2013). Finally, official crime data do not come without errors, given that not all crimes are reported or recorded Small (2018), and there is no "ground truth" data to gauge any bias in police records.

Our work seeks to make headway on the previous limitation of relying on a single study site. While recent works have started to use street units and blocks to study criminal activity Contreras (2017); Hipp et al. (2019); Kim and Hipp (2020); Rosser et al. (2017), they often relied on a small subset of variables and one city. Analysing multiple cities together exposed criminology theories to discrepancies and differences. Descriptive modelling can help policymakers to understand the use of urban space and deploy future investments and resources thoughtfully. Moreover, from the scientific perspective, descriptive modelling can provide insights for strong predictors, and potentially for explanatory variables, to be further investigated by explanatory modelling and experiments Kenett et al. (2018). Thus, we hope that additional research keeps exploring multi-dimensional aspects related to crime, to clarify potential crime causes and design better cities.


The socio-economical and Jane Jacobs’ urban theories are dependent upon the actions and activities at work in communities. Thus, we identified corehoods as social and geographical units of analysis. Then, we obtained and aggregated the data for each corehood of Bogotá, Boston, Los Angeles and Chicago.

Crime data

Data collection mechanisms and crime categories can vary from country to country. The Uniform Crime Reporting (UCR) Program is a US statistical effort to make crime reports uniform across the country. The UCR divides crime into two main groups: Part 1 and Part 2 offences. The former is composed of violent crimes (aggravated assault, forcible rape, robbery and murder) and property crimes (larceny-theft, motor vehicle theft, burglary and arson), while the latter comprises offences considered less serious, such as simple assaults and nuisance crimes. For each city we thus collect the geo-referenced data of committed crimes and we filter out those crimes not belonging to Part 1 of UCR, similarly to most of the criminology literature. We categorized crimes in Bogotá consistently with UCR categories and released the mapping for future comparisons. We reference crimes to cores and, when a crime event happens in a street segment shared between cores, we evenly assign the event to both cores. Due to the limit in accuracy of GPS positioning, we create a buffer of 30 meters for each crime, which is the distance usually employed for stop location detection algorithms De Nadai et al. (2019). More details are presented in the SI. We summed crime events over one year to minimize seasonal fluctuations.
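The even assignment of shared crime events can be sketched as a fractional count. The example below assumes that an upstream spatial step (for instance, intersecting the 30-meter buffer with core boundaries) has already produced, for each event, the list of cores it touches; that step is not shown, and the generalization from two cores to any number of cores, the function name `crime_counts` and the toy events are assumptions.

```python
from collections import defaultdict


def crime_counts(events):
    """Sum crime weights per core; an event touching k cores adds 1/k to each.

    `events` is a list where each item is the list of core ids touched by
    one crime event (assumed to be computed by an upstream spatial join).
    """
    counts = defaultdict(float)
    for cores in events:
        if not cores:
            continue  # event could not be matched to any core
        share = 1.0 / len(cores)
        for core in cores:
            counts[core] += share
    return dict(counts)


# Toy example: two events fall on a segment shared by cores "a" and "b".
events = [["a"], ["a", "b"], ["b"], ["a", "b"]]
counts = crime_counts(events)
```

Splitting the weight keeps the city-wide total equal to the number of recorded events, which would not hold if shared events were counted once per core.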

Mobile phone data

We computed the ambient population and the OD matrices for Bogotá, Boston and Los Angeles through the TimeGeo modelling framework Jiang et al. (2016). We fitted the model starting from aggregated and anonymized Call Detail Records (CDRs) collected from 12-01-2013 to 05-31-2014, for 6 weeks in 2010, and from 10-15-2012 to 11-24-2012 for Bogotá, Boston and Los Angeles, respectively. The anonymized data for the three cities was collected for billing purposes by two mobile operators, who also kindly provided us with the data for the present research. TimeGeo is an agent-based model that simulates the activity of people from mobile phone data. To be consistent with the travel surveys of each city, it simulates the time, duration, direction and type of travels within the city. The types of travels are classified as Home-Based from/to Work (HBW), Home-Based from/to Other type of locations (HBO) and Non-Home-Based from/to Other type of locations (NHB). To build the ambient population we counted the number of people who stop at a specific location for at least one hour, while we built the corehood attractiveness by counting the number of NHB trips with the corehood as destination.

Spatial and census data

Census blocks, population, employment and poverty data for US cities were drawn from the American Community Survey (ACS). For US cities we also used some city-specific datasets that are described in the SI. The census data of Bogotá were obtained from the Departamento Administrativo Nacional de Estadística (DANE), which organized the 2005 general census for the city. The poverty data of Bogotá were extracted from the 2014 Sisbén (Identification System III). The detailed list of datasets and URLs is provided in the SI.

Built environment features

We operationalize the Jane Jacobs conditions through state-of-the-art metrics defined in the literature De Nadai et al. (2016). The land-use mix (LUM) is computed as the average entropy among land uses: $\mathrm{LUM}_i = -\sum_{j \in T} \frac{P_{i,j} \log P_{i,j}}{\log |T|}$, where $P_{i,j}$ is the percentage of square meters having land use $j$ in unit $i$, and $T$ represents the land uses considered in the metric. The LUM ranges between 0, wherein the unit is composed of only one land use (e.g. residential), and 1, wherein the developed area is equally shared among the land uses.
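As an illustration, the land-use mix can be sketched as a normalized entropy over land-use areas. This is a minimal sketch, not the authors' code: the function name `land_use_mix` and the dictionary input are assumptions made for the example, and the number of considered land uses is taken to be the number of keys supplied.

```python
import math


def land_use_mix(area_by_use):
    """Normalized entropy of land-use shares in one spatial unit.

    area_by_use maps each considered land use to its area (square meters).
    Returns 0.0 for a single-use unit and 1.0 for a perfectly even mix.
    """
    total = sum(area_by_use.values())
    n_uses = len(area_by_use)
    if total <= 0 or n_uses < 2:
        return 0.0
    entropy = 0.0
    for area in area_by_use.values():
        share = area / total
        if share > 0:  # by convention 0 * log(0) contributes nothing
            entropy -= share * math.log(share)
    return entropy / math.log(n_uses)


# Hypothetical unit with three considered land uses
print(land_use_mix({"residential": 600.0, "commercial": 300.0, "park": 100.0}))
```

Dividing by the logarithm of the number of land uses is what keeps the score within the 0 to 1 range stated in the text.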

Then, for each corehood we determine the walkability through the accessibility of the core to the nearest points of interest (POIs; e.g. convenience stores, restaurants, sport facilities). Consistently with the literature (61), we define the weighted walkability score as: $\mathrm{walk}_i = \sum_{c \in C} \sum_{p \in P_c} w_{c,p}\, f(\mathrm{dist}(i, p))$, where $C$ is the set of categories (i.e., Food, Shops, Grocery, Schools, Entertainment, Parks and outside, Coffee, Banks, Books), $f$ is the street-network distance decay function, $P_c$ is the set of POIs of category $c$, and $w_{c,p}$ is the weight of POI $p$ within its category. The distance decay function gives a weight (importance) to each POI reachable from a starting point. Additional information about the walkability score can be found in the SI.

We then compute the average block area among the set of blocks $B_i$ in unit $i$ as $\frac{1}{|B_i|} \sum_{b \in B_i} \mathrm{area}(b)$, and the building age diversity as the standard deviation of building ages in the corehood.

Finally, we operationalize Jacobs’ density condition with the dwelling units density, computed from census data. Additional details are described in the SI.


We create the features disadvantage and instability through the first two principal components of a PCA of: (i) unemployment rate, (ii) poverty rate, defined as the share of people living below the poverty line, and (iii) residential mobility rate, defined as the percentage of people who recently changed residency (one year for US cities and five years for Bogotá). From the loadings of the PCA linear combination we verified that disadvantage is mainly a linear combination of poverty rate and unemployment, while instability is mainly about residential mobility rate.
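The construction of the two features can be sketched with a plain NumPy PCA. This is a minimal sketch under assumptions: the input columns are taken to be unemployment rate, poverty rate and residential mobility rate, the columns are standardized before the decomposition (the text does not say whether the authors did so), and the function name `first_two_components` is hypothetical.

```python
import numpy as np


def first_two_components(features):
    """PCA via SVD on standardized columns.

    features: array of shape (n_units, n_variables).
    Returns (scores, loadings, explained_ratio) for the first two components.
    """
    X = np.asarray(features, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardization is an assumption
    _, singular_values, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:2]                 # rows: components, columns: variables
    scores = Z @ loadings.T           # one (PC1, PC2) pair per spatial unit
    explained = singular_values[:2] ** 2 / np.sum(singular_values ** 2)
    return scores, loadings, explained


# Synthetic data standing in for (unemployment, poverty, residential mobility)
rng = np.random.default_rng(0)
scores, loadings, explained = first_two_components(rng.normal(size=(50, 3)))
print(loadings.round(2), explained.round(2))
```

Inspecting `loadings` is the step the text refers to when it checks which variables dominate each component.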

In the Social-disorganization variables we do not include any ethnic-specific variables (e.g. percentage of black people) other than diversity, because they might be present only in some places and not in others (e.g. native Americans in Bogotá), and to avoid any ethnic-specific bias. Ethnic diversity represents the difficulties of a community to communicate and collaborate for a common goal. In accordance with the literature, it is computed as the Hirschman-Herfindahl diversity index of six population groups, $D = 1 - \sum_{e=1}^{N} p_e^2$, where $p_e$ is the proportion of people belonging to the ethnicity $e$, and $N$ is the number of ethnicities. Consistently with the literature we include for US cities: Hispanics, non-Hispanic Blacks, Whites, Asians, Native Hawaiians - Pacific Islanders and others. For Bogotá we include: Indigenous, Rom, Islanders (San Andrés), Palenquero, Black and others.
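A diversity index of this family can be sketched in a few lines. The form used here, one minus the sum of squared group proportions, is an assumption about the exact variant; the function name `diversity_index` is hypothetical.

```python
def diversity_index(group_counts):
    """One minus the sum of squared group proportions.

    group_counts: number of residents in each population group.
    Returns 0.0 when everyone belongs to one group; approaches 1 - 1/N
    when N groups are equally represented.
    """
    total = sum(group_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in group_counts)


# Hypothetical neighbourhood with six population groups
print(diversity_index([500, 300, 100, 50, 30, 20]))
```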

Bayesian model

Let $y_i$ be the discrete number of crimes for a set of spatial regions $i = 1, \dots, n$. We approximate the relation between crimes and spatial features through a Negative Binomial approach that models the non-negative nature of the crime counts in a city, but also the overdispersion found in the data (Note 4 in the SI). Specifically, $y \sim \mathrm{NB}(\mu, \theta)$ with $\log \mu = X\beta + \psi$, where $X$ is the input data, $\beta$ are the coefficients of the model and $\theta$ is the shape parameter. $\psi$ are the random effects that account for the unexplained variability of crime (i.e. the spatial auto-correlation). In this paper, we account for the spatial auto-correlation with the Bayesian Spatial Filtering (BSF) Hughes (2017), which defines $\psi = E\gamma$, where $\gamma$ are coefficients to be found. $E$ is instead defined as the first principal components of $MWM$, where $W$ is a spatial matrix that describes the graph between spatial locations, while $M = I - \mathbf{1}\mathbf{1}^\top / n$, which is an approximation of the spatial error model Tiefelsdorf and Griffith (2007). We tested for the presence of spatial auto-correlation on the residuals of all the models without finding significant auto-correlation. As the results might change with different definitions of $W$, we tested all the models for three definitions: i) $W$ is a binary adjacency matrix identifying whether a corehood overlaps another corehood, ii) $W$ is an inverse distance matrix between corehoods, iii) $W$ describes the flow of people between corehoods, which is extracted from mobile phone data. We found that the binary matrix consistently outperforms the other definitions. Additional details on the presented models, the definition of $W$, and the other competing models we tested are in the SI.

To account for collinearity, we apply a Ridge penalty to all fixed effects.

Model calibration and evaluation

Model calibration is carried out by means of a Markov Chain Monte Carlo (MCMC) approach. Convergence was verified through the Gelman-Rubin convergence statistic, after discarding the first 15,000 iterations and running the model over 5,000 iterations.
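For readers unfamiliar with the Gelman-Rubin statistic, the classic (non-split) version can be sketched as follows. This is a minimal sketch: the function name `gelman_rubin` is hypothetical, and probabilistic programming libraries typically report a refined variant of this quantity.

```python
import numpy as np


def gelman_rubin(chains):
    """Potential scale reduction factor for one scalar parameter.

    chains: array of shape (n_chains, n_draws) of post-warm-up draws.
    Values close to 1 suggest that the chains have mixed.
    """
    chains = np.asarray(chains, dtype=float)
    n_chains, n_draws = chains.shape
    between = n_draws * chains.mean(axis=1).var(ddof=1)   # between-chain variance
    within = chains.var(axis=1, ddof=1).mean()            # mean within-chain variance
    pooled = (n_draws - 1) / n_draws * within + between / n_draws
    return np.sqrt(pooled / within)


# Four synthetic chains that sample the same distribution
rng = np.random.default_rng(0)
print(gelman_rubin(rng.normal(size=(4, 5000))))
```

Chains stuck in different regions inflate the between-chain term, which pushes the statistic well above 1.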

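We assess how well the models describe crime through the conditional and the marginal $R^2$ Nakagawa et al. (2017), which adapt the popular coefficient of determination to generalized linear mixed-effects models. They are defined as:

$R^2_{\mathrm{marginal}} = \frac{\sigma_f^2}{\sigma_f^2 + \sigma_\alpha^2 + \sigma_\epsilon^2}, \qquad R^2_{\mathrm{conditional}} = \frac{\sigma_f^2 + \sigma_\alpha^2}{\sigma_f^2 + \sigma_\alpha^2 + \sigma_\epsilon^2}$

where $\sigma_f^2$ is the variance explained by the fixed effects, $\sigma_\alpha^2$ is the variance explained by the random effects, and $\sigma_\epsilon^2$ is the variance of the residuals. Specifically, $\sigma_\epsilon^2$ is specific to the Negative Binomial and is defined following Nakagawa et al. (2017) in terms of the shape parameter $\theta$ of the Negative Binomial distribution.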
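Once the three variance components are available, the marginal and conditional coefficients of determination of Nakagawa et al. (2017) reduce to two ratios. The sketch below assumes the components have already been estimated; the function name `glmm_r2` and the input values are hypothetical.

```python
def glmm_r2(var_fixed, var_random, var_residual):
    """Marginal and conditional R^2 from variance components.

    var_fixed: variance explained by the fixed effects.
    var_random: variance explained by the random effects.
    var_residual: distribution-specific residual variance.
    """
    total = var_fixed + var_random + var_residual
    marginal = var_fixed / total
    conditional = (var_fixed + var_random) / total
    return marginal, conditional


# Hypothetical variance components
marginal, conditional = glmm_r2(2.0, 1.0, 1.0)
print(marginal, conditional)
```

The conditional value is never smaller than the marginal one, since it also credits the random effects.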

We assess the out-of-sample predictive accuracy through the Pareto-smoothed importance sampling Leave-One-Out cross-validation (LOO) Vehtari et al. (2017), which is the state of the art for evaluating Bayesian models.

Data Availability

We are pleased to make available the source code and datasets accompanying this research. The project files are available at


We thank Paolo Bosetti and Junpeng Lao for their helpful comments. We especially thank Andrés Clavijo for his support on the data; we all hope that this work can make Bogotá better. This work was supported by the Berkeley DeepDrive and the ITS Berkeley 2018-19 SB1 Research Grant (to M.C.G.); the French Development Agency and the World Bank (to M.D.N., B.L. and E.L.).

Author contributions statement

M.D.N., E.L., M.C.G. and B.L. designed research and experiments; M.D.N., Y.X., M.C.G. and B.L. performed research and experiments; M.D.N., M.C.G. and B.L. contributed new analytic tools; M.D.N. and Y.X. analysed the data; and M.D.N., M.C.G. and B.L. wrote the paper. All authors read, reviewed and approved the final manuscript.

Competing Interests

The authors declare no competing interests.


  • Graif and Sampson (2009) C. Graif and Robert J. Sampson, “Spatial Heterogeneity in the Effects of Immigration and Diversity on Neighborhood Homicide Rates,” Homicide Studies 13, 242–260 (2009).
  • Sampson and Groves (1989) Robert J. Sampson and W. Byron Groves, “Community structure and crime: Testing social-disorganization theory,” American Journal of Sociology 94, 774–802 (1989).
  • Sampson (1997) Robert J. Sampson, “Neighborhoods and Violent Crime: A Multilevel Study of Collective Efficacy,” Science 277, 918–924 (1997).
  • Newman (1972) Oscar Newman, Defensible space (Macmillan New York, 1972).
  • Jacobs (1961) Jane Jacobs, The death and life of great American cities (Vintage, 1961).
  • Wang et al. (2018) Qi Wang, Nolan Edward Phillips, Mario L Small,  and Robert J Sampson, “Urban mobility and neighborhood isolation in america’s 50 largest cities,” PNAS 115, 7735–7740 (2018).
  • Cohen and Felson (1979) Lawrence E Cohen and Marcus Felson, “Social change and crime rate trends: A routine activity approach,” American sociological review , 588–608 (1979).
  • Felson and Clarke (1998) Marcus Felson and Ronald V Clarke, “Opportunity makes the thief,” Police research series, paper 98, 1–36 (1998).
  • Brantingham and Brantingham (1993) Patricia L Brantingham and Paul J Brantingham, “Nodes, paths and edges: Considerations on the complexity of crime and the physical environment,” Journal of environmental psychology 13, 3–28 (1993).
  • Felson and Boba (2010) Marcus Felson and Rachel L Boba, Crime and everyday life (Sage, 2010).
  • Hindelang et al. (1978) Michael J Hindelang, Michael R Gottfredson,  and James Garofalo, Victims of personal crime: An empirical foundation for a theory of personal victimization (Ballinger Cambridge, MA, 1978).
  • O’Brien and Sampson (2015) Daniel Tumminelli O’Brien and Robert J Sampson, “Public and private spheres of neighborhood disorder: Assessing pathways to violence using large-scale digital records,” Journal of research in crime and delinquency 52, 486–510 (2015).
  • Murray and Roncek (2008) Rebecca K Murray and Dennis W Roncek, “Measuring diffusion of assaults around bars through radius and adjacency techniques,” Criminal Justice Review 33, 199–220 (2008).
  • Salesses et al. (2013) Philip Salesses, Katja Schechtner,  and César A Hidalgo, “The collaborative image of the city: mapping the inequality of urban perception,” PloS one 8 (2013).
  • Sampson (1985) Robert J. Sampson, “Neighborhood and crime: The structural determinants of personal victimization,” Journal of Research in Crime and Delinquency 22, 7–40 (1985).
  • Hipp et al. (2013) John R Hipp, Carter T Butts, Ryan Acton, Nicholas N Nagle,  and Adam Boessen, “Extrapolative simulation of neighborhood networks based on population spatial distribution: Do they predict crime?” Social Networks 35, 614–625 (2013).
  • Song et al. (2019) Guangwen Song, Wim Bernasco, Lin Liu, Luzi Xiao, Suhong Zhou,  and Weiwei Liao, “Crime feeds on legal activities: Daily mobility flows help to explain thieves’ target location choices,” Journal of Quantitative Criminology 35, 831–854 (2019).
  • Wang et al. (2016) Hongjian Wang, Daniel Kifer, Corina Graif,  and Zhenhui Li, “Crime rate inference with big data,” in ACM SIGKDD, KDD ’16 (ACM, New York, NY, USA, 2016) pp. 635–644.
  • Andresen (2006) Martin A Andresen, “Crime measures and the spatial analysis of criminal activity,” British Journal of criminology 46, 258–285 (2006).
  • Andresen (2011) Martin A Andresen, “The ambient population and crime analysis,” The Professional Geographer 63, 193–212 (2011).
  • Bogomolov et al. (2014) Andrey Bogomolov, Bruno Lepri, Jacopo Staiano, Nuria Oliver, Fabio Pianesi,  and Alex Pentland, “Once upon a crime: towards crime prediction from demographics and mobile data,” in ICMI (ACM, 2014) pp. 427–434.
  • Malleson and Andresen (2015) Nick Malleson and Martin A Andresen, “Spatio-temporal crime hotspots and the ambient population,” Crime science 4, 10 (2015).
  • Sohn (2016) Dong-Wook Sohn, “Residential crimes and neighbourhood built environment: Assessing the effectiveness of crime prevention through environmental design (cpted),” Cities 52, 86–93 (2016).
  • Sampson et al. (1999) Robert J Sampson, Jeffrey D Morenoff,  and Felton Earls, “Beyond social capital: Spatial dynamics of collective efficacy for children,” American sociological review , 633–660 (1999).
  • Kubrin and Weitzer (2003) Charis E Kubrin and Ronald Weitzer, “Retaliatory homicide: Concentrated disadvantage and neighborhood culture,” Social problems 50, 157–180 (2003).
  • Sampson (2013) Robert J Sampson, “The place of context: a theory and strategy for criminology’s hard problems,” Criminology 51, 1–31 (2013).
  • Sampson and Graif (2009) Robert J Sampson and Corina Graif, “Neighborhood social capital as differential social organization: Resident and leadership dimensions,” American Behavioral Scientist 52, 1579–1605 (2009).
  • Leyden (2003) Kevin M Leyden, “Social capital and the built environment: the importance of walkable neighborhoods,” American journal of public health 93, 1546–1551 (2003).
  • De Nadai et al. (2016) Marco De Nadai, Jacopo Staiano, Roberto Larcher, Nicu Sebe, Daniele Quercia,  and Bruno Lepri, “The Death and Life of Great Italian Cities: A Mobile Phone Data Perspective,” in WWW (International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 2016) pp. 413–423.
  • Jiang et al. (2016) Shan Jiang, Yingxiang Yang, Siddharth Gupta, Daniele Veneziano, Shounak Athavale,  and Marta C. González, “The TimeGeo modeling framework for urban mobility without travel surveys,” PNAS 113, E5370–E5378 (2016).
  • Griffith and Peres-Neto (2006) Daniel A Griffith and Pedro R Peres-Neto, “Spatial modeling in ecology: the flexibility of eigenfunction spatial analyses,” Ecology 87, 2603–2613 (2006).
  • Tiefelsdorf and Griffith (2007) Michael Tiefelsdorf and Daniel A Griffith, “Semiparametric filtering of spatial autocorrelation: the eigenvector approach,” Environment and Planning A 39, 1193–1221 (2007).
  • Nakagawa et al. (2017) Shinichi Nakagawa, Paul CD Johnson, and Holger Schielzeth, “The coefficient of determination R² and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded,” Journal of the Royal Society Interface 14, 20170213 (2017).
  • Vehtari et al. (2017) Aki Vehtari, Andrew Gelman,  and Jonah Gabry, “Practical bayesian model evaluation using leave-one-out cross-validation and waic,” Statistics and computing 27, 1413–1432 (2017).
  • Caminha et al. (2017) Carlos Caminha, Vasco Furtado, Tarcisio HC Pequeno, Caio Ponte, Hygor PM Melo, Erneson A Oliveira,  and José S Andrade Jr, “Human mobility in large cities as a proxy for crime,” PloS one 12, e0171609 (2017).
  • Mburu and Helbich (2016) Lucy W. Mburu and Marco Helbich, “Crime Risk Estimation with a Commuter-Harmonized Ambient Population,” Annals of the American Association of Geographers 106, 804–818 (2016).
  • Wang and Li (2017) Hongjian Wang and Zhenhui Li, “Region representation learning via mobility flow,” in Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (2017) pp. 237–246.
  • Graif et al. (2017) Corina Graif, Alina Lungeanu,  and Alyssa M Yetter, “Neighborhood isolation in chicago: Violent crime effects on structural isolation and homophily in inter-neighborhood commuting networks,” Social Networks  (2017).
  • Lee et al. (2017a) Sugie Lee, Chisun Yoo, Jaehyun Ha,  and Jeemin Seo, “Are perceived neighbourhood built environments associated with social capital? Evidence from the 2012 Seoul survey in South Korea,” International Journal of Urban Sciences , 1–17 (2017a).
  • Sung and Lee (2015) Hyungun Sung and Sugie Lee, “Residential built environment and walking activity: Empirical evidence of jane jacobs’ urban vitality,” Transportation Research Part D: Transport and Environment 41, 318–329 (2015).
  • Shaw and McKay (1942) Clifford Robe Shaw and Henry Donald McKay, Juvenile delinquency and urban areas. (University of Chicago Press, 1942).
  • Godoy et al. (2018) Juan Felipe Godoy, C Rodriguez,  and H Zuleta, “Security and sustainable development in bogota, colombia,” Geneva: DCAF  (2018).
  • Traunmueller et al. (2014) Martin Traunmueller, Giovanni Quattrone,  and Licia Capra, “Mining mobile phone data to investigate urban crime theories at scale,” in International Conference on Social Informatics (Springer, 2014) pp. 396–411.
  • Boivin and Felson (2018) Remi Boivin and Marcus Felson, “Crimes by visitors versus crimes by residents: The influence of visitor inflows,” Journal of Quantitative Criminology 34, 465–480 (2018).
  • Lee et al. (2017b) YongJei Lee, John E. Eck, SooHyun O,  and Natalie N. Martinez, “How concentrated is crime at places? a systematic review from 1970 to 2015,” Crime Science 6, 6 (2017b).
  • Weisburd et al. (2012) David Weisburd, Elizabeth R Groff,  and Sue-Ming Yang, The criminology of place: Street segments and our understanding of the crime problem (Oxford University Press, 2012).
  • Short et al. (2010) Martin B Short, P Jeffrey Brantingham, Andrea L Bertozzi,  and George E Tita, “Dissipation and displacement of hotspots in reaction-diffusion models of crime,” PNAS  (2010).
  • Eagle et al. (2009) Nathan Eagle, Alex Sandy Pentland,  and David Lazer, “Inferring friendship network structure by using mobile phone data,” PNAS 106, 15274–15278 (2009).
  • Blumenstock et al. (2015) Joshua Blumenstock, Gabriel Cadamuro,  and Robert On, “Predicting poverty and wealth from mobile phone metadata,” Science 350, 1073–1076 (2015).
  • Pappalardo et al. (2016) Luca Pappalardo, Maarten Vanhoof, Lorenzo Gabrielli, Zbigniew Smoreda, Dino Pedreschi,  and Fosca Giannotti, “An analytical framework to nowcast well-being using mobile phone data,” International Journal of Data Science and Analytics 2, 75–92 (2016).
  • Toole et al. (2015) Jameson L. Toole, Yu-ru Lin, Erich Muehlegger, Daniel Shoag, Marta C. González, and David Lazer, “Tracking employment shocks using mobile phone data,” Journal of The Royal Society Interface 12, 20150185 (2015), arXiv:1505.06791.
  • Faust and Tita (2019) Katherine Faust and George E Tita, “Social networks and crime: Pitfalls and promises for advancing the field,” Annual Review of Criminology 2, 99–122 (2019).
  • Tran et al. (2013) Van C Tran, Corina Graif, Alison D Jones, Mario L Small,  and Christopher Winship, “Participation in context: Neighborhood diversity and organizational involvement in boston,” City & Community 12, 187–210 (2013).
  • Small (2018) Mario L Small, “Understanding when people will report crimes to the police,” Proceedings of the National Academy of Sciences 115, 8057–8059 (2018).
  • Contreras (2017) Christopher Contreras, “A block-level analysis of medical marijuana dispensaries and crime in the city of los angeles,” Justice Quarterly 34, 1069–1095 (2017).
  • Hipp et al. (2019) John R Hipp, Young-An Kim,  and Kevin Kane, “The effect of the physical environment on crime rates: Capturing housing age and housing type at varying spatial scales,” Crime & Delinquency 65, 1570–1595 (2019).
  • Kim and Hipp (2020) Young-An Kim and John R Hipp, “Street egohood: An alternative perspective of measuring neighborhood and spatial patterns of crime,” Journal of Quantitative Criminology 36, 29–66 (2020).
  • Rosser et al. (2017) Gabriel Rosser, Toby Davies, Kate J Bowers, Shane D Johnson,  and Tao Cheng, “Predictive crime mapping: Arbitrary grids or street networks?” Journal of Quantitative Criminology 33, 569–594 (2017).
  • Kenett et al. (2018) Ron S. Kenett, Danny Pfeffermann,  and David M. Steinberg, “Election polls—a survey, a critique, and proposals,” Annual Review of Statistics and Its Application 5 (2018), 10.1146/annurev-statistics-031017-100204.
  • De Nadai et al. (2019) Marco De Nadai, Angelo Cardoso, Antonio Lima, Bruno Lepri,  and Nuria Oliver, “Strategies and limitations in app usage and human mobility,” Scientific reports 9, 1–9 (2019).
  • (61) Front Seat Walk Score Methodology, Tech. Rep., available online at Last accessed on 3 January 2020.
  • Hughes (2017) John Hughes, “Spatial regression and the bayesian filter,” arXiv preprint arXiv:1706.04651  (2017).
  • Kadar et al. (2017) Cristina Kadar, Raquel Rosés Brüngger,  and Irena Pletikosa, “Measuring ambient population from location-based social networks to describe urban crime,” in International Conference on Social Informatics (Springer, 2017) pp. 521–535.
  • dat (a) “Ideca,” (a).
  • dat (b) “Us census tiger,” (b).
  • dat (c) “Boston maps open data site,” (c).
  • dat (d) “Chicago boundaries,” (d).
  • dat (e) “La boundaries,” (e).
  • dat (f) “Bogota buildings,” (f).
  • dat (g) “Boston buildings,” (g).
  • dat (h) “Chicago buildings,” (h).
  • dat (i) “Los angeles buildings,” (i).
  • dat (j) “Boston crime,” (j).
  • dat (k) “Chicago crime,” (k).
  • dat (l) “Los angeles crime,” (l).
  • dat (m) “Us census factfinder,” (m).
  • dat (n) “Boston landuse,” (n).
  • dat (o) “Chicago landuse,” (o).
  • dat (p) “Los angeles landuse,” (p).
  • Moran (1948) Patrick AP Moran, “The interpretation of statistical maps,” Journal of the Royal Statistical Society. Series B (Methodological) 10, 243–251 (1948).
  • Arnold et al. (1999) N Arnold, ANDREW Thomas, L Waller,  and E Conlon, “Bayesian models for spatially correlated disease and exposure data,” in Bayesian Statistics 6: Proceedings of the Sixth Valencia International Meeting, Vol. 6 (Oxford University Press, 1999) p. 131.
  • Gelman et al. (2006) Andrew Gelman et al., “Prior distributions for variance parameters in hierarchical models (comment on article by browne and draper),” Bayesian analysis 1, 515–534 (2006).
  • Potthoff (2006) Richard F Potthoff, “Homogeneity, potthoff-whittinghill tests of,” Encyclopedia of Statistical Sciences  (2006).
  • Chun et al. (2016) Yongwan Chun, Daniel A Griffith, Monghyeon Lee,  and Parmanand Sinha, “Eigenvector selection with stepwise regression techniques to construct eigenvector spatial filters,” Journal of Geographical Systems 18, 67–85 (2016).
  • Lin and Zhang (2007) Ge Lin and Tonglin Zhang, “Loglinear residual tests of moran’s i autocorrelation and their applications to kentucky breast cancer data,” Geographical Analysis 39, 293–310 (2007).
  • Murakami and Griffith (2019) Daisuke Murakami and Daniel A Griffith, “Eigenvector spatial filtering for large data sets: fixed and random effects approaches,” Geographical Analysis 51, 23–49 (2019).

Appendix A Walkability

We determine the walkability of a neighbourhood through its accessibility to the nearest Points of Interest (e.g., convenience stores, restaurants, sport facilities). Walkability is empirically calculated in many different ways; one of the most widely accepted is Walk Score (61). Here we describe and compute the walkability score for our cities consistently with its methodology (61), as Walk Score is not available for all the cities we consider.

Thus, for each city block $b$ we first collect an ordered list of the closest Points of Interest (POIs) belonging to category $c$:

$P_{b,c} = \left(p^{1}_{b,c}, p^{2}_{b,c}, \dots, p^{n_c}_{b,c}\right)$

where $p^{1}_{b,c}$ is the closest POI of category $c$ to $b$, $p^{2}_{b,c}$ is the second closest, and so forth. We then compute the walkability score as:

$\mathrm{walk}_b = \sum_{c \in C} \sum_{k=1}^{n_c} w_{c,k}\, f\left(\mathrm{dist}(b, p^{k}_{b,c})\right)$

where $C$ is the set of categories (i.e. Food, Shops, Grocery, Schools, Entertainment, Parks and outside, Coffee, Banks, Books), $f$ is the street-network distance decay function (explained later), and $w_{c,k}$ is a weighting factor that depends on both the category $c$ and the $k$-th closest POI.

In categories where depth of choice is important, multiple POIs are considered (i.e. $n_c > 1$). For example, restaurants and bars are combined in a single category due to their overlapping function. They are the most frequent walking destinations, hence we include 10 counts of places to account for the depth of offer in the neighbourhood. The shopping category represents all the retail places where people can buy products such as clothes, gifts, etc. They are common walking destinations and are commonly described as important for the attractiveness of a place. Thus, we considered 5 counts of places for this category. Coffee shops are also important for the neighbourhood, but not as important as restaurants and shopping places. Thus, we considered 2 counts for this category. For the other categories only the distance from the nearest POI is calculated. These parameters are consistent with Walk Score (61). The values of $n_c$ and $w_c$ are summarized in Table 2.

The amenities are extracted from Foursquare, a crowd-sourced platform where people take part in an online game by checking in at the places they visit.

| Category | $n_c$ |
|---|---|
| Grocery | 1 |
| Food | 10 |
| Shops | 5 |
| Schools | 1 |
| Entertainment | 1 |
| Parks and outside | 1 |
| Coffee | 2 |
| Banks | 1 |
| Books | 1 |

Table 2: Additional details to compute the walkability score. $n_c$ is the number of retrieved Points of Interest, while $w_c$ is an ordered sequence of weights applied to the distances to the nearest Points of Interest.

The distance decay function assigns an importance weight to each POI reachable from a starting point. Similarly to Walk Score, we use a polynomial distance decay that assigns the maximum score to amenities within a short walking distance of the starting point; the score then decays quickly until 1500 meters, where the decay first slows down and the score then goes to zero. The distance is computed along the street network rather than as the geometric distance (see Figure 4).
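The scoring procedure of this appendix can be sketched as follows. This is only an illustration under loud assumptions: the per-category counts come from Table 2, but the decay function here is a hypothetical piecewise-linear stand-in (the paper uses a polynomial decay whose exact shape is not reproduced), the cut-offs of 400 and 2400 meters are invented for the example, and the weights are set to 1 because the original weight sequences are not available.

```python
# Number of nearest POIs considered per category (Table 2)
N_POIS = {
    "Grocery": 1, "Food": 10, "Shops": 5, "Schools": 1, "Entertainment": 1,
    "Parks and outside": 1, "Coffee": 2, "Banks": 1, "Books": 1,
}


def decay(distance_m, full_score_m=400.0, zero_score_m=2400.0):
    """Hypothetical linear decay: 1 near the origin, 0 beyond zero_score_m."""
    if distance_m <= full_score_m:
        return 1.0
    if distance_m >= zero_score_m:
        return 0.0
    return (zero_score_m - distance_m) / (zero_score_m - full_score_m)


def walkability(distances_by_category, weights_by_category):
    """Weighted sum of decayed street-network distances to the nearest POIs.

    distances_by_category: category -> list of distances (meters) to POIs.
    weights_by_category: category -> list of weights, one per ranked POI.
    """
    score = 0.0
    for category, n_pois in N_POIS.items():
        nearest = sorted(distances_by_category.get(category, []))[:n_pois]
        weights = weights_by_category[category]
        for rank, distance in enumerate(nearest):
            score += weights[rank] * decay(distance)
    return score


# Placeholder weights of 1.0 for every ranked POI (an assumption)
unit_weights = {category: [1.0] * n for category, n in N_POIS.items()}
print(walkability({"Food": [100.0, 5000.0], "Coffee": [1400.0]}, unit_weights))
```

In practice the distances would come from shortest paths on the street network rather than straight-line distances.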

Figure 4: The geometric distance (green) and shortest path walking distance (red) between two points.

Appendix B Crime measure

In criminology, there are usually two ways to assess crime: crime counts and crime rates. The main difference between the two lies in how the population at risk is modelled. The former assumes that the importance of the population at risk has to be found by the model, while the latter assumes that this importance is equal to one. Specifically, crime rate models normalize the crime counts by the population at risk, which is usually the residential one. Recently, some scholars have discussed the costs and benefits of alternative denominators such as the ambient population Andresen (2006). It can be computed in different ways, including the use of satellite imagery Andresen (2006), census data Mburu and Helbich (2016) and Foursquare check-ins Kadar et al. (2017).

However, it is not clear whether residential or ambient rates should be used, nor what the policy implications of using one or the other are. Moreover, the bias of ambient rates can potentially lead to misleading interpretations of crime. In this research, we thus prefer to describe crime counts while controlling for the residential and the ambient population. Hence, it is also possible to describe their relative role through the coefficients of the regression.

Appendix C Data sources

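| Type | City | Source |
|---|---|---|
| Blocks | Bogotá | |
| Census Blocks | Bogotá | dat (a) |
| Census Blocks | Boston | dat (b) |
| Census Blocks | Chicago | dat (b) |
| Census Blocks | LA | dat (b) |
| Boundaries | Bogotá | |
| Boundaries | Boston | dat (c) |
| Boundaries | Chicago | dat (d) |
| Boundaries | LA | dat (e) |
| Buildings | Bogotá | dat (f) |
| Buildings | Boston | dat (g) |
| Buildings | Chicago | dat (h) |
| Buildings | LA | dat (i) |
| Crime | Bogotá | |
| Crime | Boston | dat (j) |
| Crime | Chicago | dat (k) |
| Crime | LA | dat (l) |
| Employment and Ethnic mix | Bogotá | |
| Employment and Ethnic mix | Boston | dat (m) |
| Employment and Ethnic mix | Chicago | dat (m) |
| Employment and Ethnic mix | LA | dat (m) |
| Land Use | Bogotá | dat (a) |
| Land Use | Boston | dat (n) |
| Land Use | Chicago | dat (o) |
| Land Use | LA | dat (p) |
| Mobile phone data | All but Chicago | |
| POIs | All | |
| Population | Bogotá | |
| Population | Boston | dat (m) |
| Population | Chicago | dat (m) |
| Population | LA | dat (m) |
| Poverty | Bogotá | |
| Poverty | Boston | dat (m) |
| Poverty | Chicago | dat (m) |
| Poverty | LA | dat (m) |
| Residential stability | Bogotá | |
| Residential stability | Boston | dat (m) |
| Residential stability | Chicago | dat (m) |
| Residential stability | LA | dat (m) |
| Street network | All | |

Table 3: Description of the data sources to replicate the paper. Most of the data is shared along with this paper in raw format, while a subset of it (POIs and mobile phone data) could be shared only in an aggregated format for license and privacy reasons.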

Distribution of crime

Crime is not evenly distributed in space and its distribution in the neighbourhoods of the analysed cities can be seen in Figure 5.

Figure 5: Distribution of the number of total committed crimes for each neighborhood.

Appendix D The spatial model

In each city, we tested the presence of spatial auto-correlation of crime through the Moran's I coefficient Moran (1948). When $I > 0$ there is positive auto-correlation; when $I < 0$ there is negative auto-correlation. In the former case, places with high crime tend to be near places with high crime; in the latter, places with high crime are near places with low crime. When $I$ is not near zero, regression models might exhibit spatial correlation in the residuals, thus invalidating the assumption of independence of the errors. In these cases, regression models should account for the spatial auto-correlation between spatial units, as we did.
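The standard global Moran's I can be sketched with NumPy as below. The function name `morans_i` and the toy chain of four neighbouring units are assumptions made for the example; the spatial matrix can be any of the definitions discussed in the paper.

```python
import numpy as np


def morans_i(values, W):
    """Global Moran's I of `values` under the spatial weight matrix W."""
    y = np.asarray(values, dtype=float)
    W = np.asarray(W, dtype=float)
    z = y - y.mean()                       # deviations from the mean
    return (y.size / W.sum()) * (z @ W @ z) / (z @ z)


# Four units on a line, each adjacent to its immediate neighbours
W = np.zeros((4, 4))
for i in range(3):
    W[i, i + 1] = W[i + 1, i] = 1.0

print(morans_i([1, 2, 3, 4], W))    # smooth gradient: positive
print(morans_i([1, -1, 1, -1], W))  # alternating values: negative
```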

In our paper, we model the crime counts in a city with a Negative Binomial (NB) regression, and we account for the spatial auto-correlation with the Bayesian Spatial Filtering (BSF) Hughes (2017) approach. The NB model is defined as:

$y_i \sim \mathrm{NB}(\mu_i, \theta)$

where the mean and variance of $y_i$ are: $\mathbb{E}[y_i] = \mu_i$ and $\mathrm{Var}[y_i] = \mu_i + \mu_i^2 / \theta$. We use the logarithm as link function for the NB.

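The BSF is defined as:

$\log(\mu) = X\beta + E\gamma, \qquad \gamma \sim \mathcal{N}\left(0, (\tau E^\top Q E)^{-1}\right)$

where $Q$ is the Laplacian of $W$, and $\tau$ follows a Gamma prior with a large mean to discourage artifactual spatial structures in the posterior Arnold et al. (1999); Hughes (2017). Since we want to account for the correlation between the features and to make the model generalize well, we apply a Ridge penalty to the coefficients and the QR decomposition to decorrelate the covariates and, thus, the resulting posterior distribution. Thus, we model $\beta$ and $\sigma_\beta$ as:

$\beta \sim \mathcal{N}(0, \sigma_\beta^2), \qquad \sigma_\beta \sim \mathcal{C}^+(0, 1)$

The alternative formulation that does not account for the auto-correlation is a NB model with a Ridge penalty on the beta coefficients, which is defined as:

$y \sim \mathrm{NB}(\mu, \theta), \qquad \log(\mu) = X\beta, \qquad \beta \sim \mathcal{N}(0, \sigma_\beta^2), \qquad \sigma_\beta \sim \mathcal{C}^+(0, 1)$

$\mathcal{C}^+(0, 1)$ is the half-Cauchy distribution with a location of zero and a scale parameter of one. We chose the half-Cauchy as suggested by Gelman et al. (2006).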

In Table 4 we show that the NB model exhibits a strong positive spatial auto-correlation in the residuals, while the BSF model does not, as expected. The models based on BSF are also superior in the LOO and $R^2$.

Test for overdispersion

The NB model is motivated by the extra-Poisson variability of the crime distribution in the city. We can test for overdispersion through the Potthoff-Whittinghill test and the Lagrange multiplier test.

The Potthoff-Whittinghill index of dispersion test Potthoff (2006) rejects the hypothesis of no overdispersion. It is defined as:

$X^2 = \sum_{i=1}^{n} \frac{(y_i - \bar{y})^2}{\bar{y}}$

which approximately follows a chi-square distribution with $n - 1$ degrees of freedom. We also apply the Lagrange multiplier test, defined as:

$LM = \frac{\left(\sum_{i=1}^{n} \hat{\mu}_i^2 - n\bar{y}\right)^2}{2 \sum_{i=1}^{n} \hat{\mu}_i^2}$

where $\hat{\mu}_i$ is the mean predicted by the Poisson model. With one degree of freedom, the test is significant: the hypothesis of no overdispersion is again rejected.
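The index-of-dispersion check can be sketched numerically. This is a minimal sketch assuming the classic form of the statistic (sum of squared deviations divided by the mean, compared with a chi-square distribution with n - 1 degrees of freedom); the function name `index_of_dispersion` and the toy counts are hypothetical.

```python
import numpy as np


def index_of_dispersion(counts):
    """Index-of-dispersion statistic and its degrees of freedom.

    Under a homogeneous Poisson model the statistic is close to the
    degrees of freedom; much larger values point to overdispersion.
    """
    y = np.asarray(counts, dtype=float)
    mean = y.mean()
    statistic = np.sum((y - mean) ** 2) / mean
    dof = y.size - 1
    return statistic, dof


# Toy counts in which one unit concentrates all the events
statistic, dof = index_of_dispersion([0, 0, 0, 12])
print(statistic, dof)
```

A p-value can then be read from the upper tail of the chi-square distribution, for instance with `scipy.stats.chi2.sf(statistic, dof)` if SciPy is available.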

Selection of E eigenvectors

Following the seminal literature on eigen-based spatial modelling and filtering Tiefelsdorf and Griffith (2007), we select the first eigenvectors $E$ from $MWM$, where $W$ is a spatial matrix that describes the graph between spatial locations, while $M = I - \mathbf{1}\mathbf{1}^\top / n$, which is an approximation of the spatial error model.

The associated set of eigenvalues ($\lambda$) from the $MWM$ decomposition assesses the strength of a spatial pattern. A vector $e_i$ with $\lambda_i > 0$ describes positive spatial auto-correlation, while vectors with $\lambda_i < 0$ describe negative spatial auto-correlation. Spatial models are notoriously inefficient at dealing with negative spatial auto-correlation, which is also rather rare. Thus, consistently with the literature Chun et al. (2016), we focus on positive auto-correlation and select those vectors having $\lambda_i / \lambda_{\max} > 0.25$, where $\lambda_{\max}$ is the maximum value among the eigenvalues.
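The eigenvector selection can be sketched with NumPy. This is a sketch under assumptions, not the authors' implementation: the projector is taken to be the centering matrix, the spatial matrix is symmetrized before the decomposition, and the ratio threshold of 0.25 is a value commonly used in the eigenvector-filtering literature, exposed here as a parameter. The function name `spatial_filter_eigenvectors` is hypothetical.

```python
import numpy as np


def spatial_filter_eigenvectors(W, threshold=0.25):
    """Eigenvectors of M W M whose eigenvalue ratio exceeds `threshold`.

    W: (n, n) spatial weight matrix.
    threshold: minimum eigenvalue / max-eigenvalue ratio (an assumed default).
    Returns (eigenvectors as columns, eigenvalues), sorted in decreasing order.
    """
    W = np.asarray(W, dtype=float)
    W = (W + W.T) / 2.0                        # enforce symmetry
    n = W.shape[0]
    M = np.eye(n) - np.ones((n, n)) / n        # centering projector
    eigvals, eigvecs = np.linalg.eigh(M @ W @ M)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    if eigvals[0] <= 0:                        # no positive spatial pattern
        return eigvecs[:, :0], eigvals[:0]
    keep = eigvals / eigvals[0] > threshold
    return eigvecs[:, keep], eigvals[keep]


# Six units on a line, each adjacent to its immediate neighbours
W = np.zeros((6, 6))
for i in range(5):
    W[i, i + 1] = W[i + 1, i] = 1.0

E, lam = spatial_filter_eigenvectors(W)
print(E.shape, lam.round(3))
```

The retained columns can then enter the regression as additional covariates whose coefficients capture the spatial pattern.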

Test for residual auto-correlation

We test for the presence of auto-correlation in the models' residuals through Moran's I. However, since our model is not an Ordinary Least Squares regression, we used a corrected version of Moran's I that is specifically tailored for log-linear relationships Lin and Zhang (2007). The index is defined as:

$I = \frac{n}{\mathbf{1}^\top W \mathbf{1}} \cdot \frac{e^\top W e}{e^\top e}$

where $n$ is the number of spatial units, $W$ is the spatial matrix, and $e$ is the vector of residual errors.

We do not find significant residual spatial auto-correlation in our spatial models.

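| City | Model | BSF LOO | BSF $R^2$ | BSF Moran's I | NB LOO | NB $R^2$ | NB Moran's I |
|---|---|---|---|---|---|---|---|
| Bogotá | Core | -3897 | 0.75 | -0.034 | -4126 | 0.53 | 0.455 |
| Bogotá | Social-disorganization (SD) | -3891 | 0.75 | -0.043 | -4079 | 0.58 | 0.354 |
| Bogotá | Built environment (BE) | -3881 | 0.76 | -0.036 | -4061 | 0.61 | 0.371 |
| Bogotá | Mobility (M) | -3804 | 0.80 | -0.042 | -4034 | 0.64 | 0.460 |
| Bogotá | SD+BE | -3880 | 0.76 | -0.035 | -4013 | 0.65 | 0.287 |
| Bogotá | SD+M | -3795 | 0.81 | -0.050 | -3988 | 0.67 | 0.374 |
| Bogotá | BE+M | -3819 | 0.80 | -0.025 | -3980 | 0.68 | 0.361 |
| Bogotá | SD+BE+M (Full) | -3809 | 0.80 | -0.040 | -3941 | 0.71 | 0.284 |
| Boston | Core | -2035 | 0.64 | -0.005 | -2209 | 0.22 | 0.418 |
| Boston | Social-disorganization (SD) | -2019 | 0.68 | -0.003 | -2088 | 0.55 | 0.236 |
| Boston | Built environment (BE) | -2014 | 0.68 | -0.033 | -2169 | 0.37 | 0.309 |
| Boston | Mobility (M) | -2001 | 0.70 | -0.026 | -2140 | 0.45 | 0.351 |
| Boston | SD+BE | -1987 | 0.72 | -0.043 | -2030 | 0.65 | 0.108 |
| Boston | SD+M | -1973 | 0.73 | -0.030 | -2011 | 0.67 | 0.105 |
| Boston | BE+M | -1989 | 0.72 | -0.033 | -2109 | 0.52 | 0.264 |
| Boston | SD+BE+M (Full) | -1957 | 0.75 | -0.040 | -1993 | 0.70 | 0.084 |
| LA | Core | -9665 | 0.68 | 0.032 | -10757 | 0.17 | 0.647 |
| LA | Social-disorganization (SD) | -9529 | 0.72 | 0.005 | -10042 | 0.55 | 0.416 |
| LA | Built environment (BE) | -9629 | 0.69 | 0.005 | -10618 | 0.27 | 0.615 |
| LA | Mobility (M) | -9570 | 0.70 | 0.018 | -10658 | 0.24 | 0.628 |
| LA | SD+BE | -9508 | 0.72 | -0.010 | -9989 | 0.57 | 0.366 |
| LA | SD+M | -9467 | 0.73 | -0.002 | -10003 | 0.57 | 0.444 |
| LA | BE+M | -9585 | 0.70 | 0.011 | -10571 | 0.30 | 0.618 |
| LA | SD+BE+M (Full) | -9453 | 0.74 | -0.011 | -9967 | 0.58 | 0.388 |
| Chicago | Core | -8415 | 0.68 | 0.117 | -9350 | 0.09 | 0.543 |
| Chicago | Social-disorganization (SD) | -8019 | 0.78 | 0.016 | -8391 | 0.66 | 0.295 |
| Chicago | Built environment (BE) | -8371 | 0.69 | 0.093 | -9237 | 0.21 | 0.519 |
| Chicago | SD+BE | -8003 | 0.79 | 0.003 | -8357 | 0.68 | 0.282 |

Table 4: Comparison between the BSF model and the NB model that does not account for the spatial auto-correlation.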

Appendix E Alternative spatial models

We tested alternative spatial models that could explain the residual spatial auto-correlation. Here, we compare the BSF with two other similar but competing models: the Random Effects Eigenvector Spatial Filtering (RE-ESF) Murakami and Griffith (2019) and the Linear ESF model Tiefelsdorf and Griffith (2007). The ESF model is defined as:

where is the vector of the eigenvalues associated with , and is chosen to constrain the spatial random effects and prevent them from penalising the fixed effects too much. To ensure limited variance, is limited to an upper value of 2.

The RE-ESF instead assumes to be random such that:

where is a multiplier that represents the scale of spatial variance, and is a parameter to be found.

Table 5 shows that no model clearly outperforms the others, suggesting that they are almost equivalent in a fully Bayesian setting.

Bogota Core -3897 -0.034 -3899 -0.045 -3902 -0.041
Social-disorganization (SD) -3891 -0.043 -3895 -0.052 -3896 -0.049
Built environment (BE) -3881 -0.036 -3882 -0.045 -3884 -0.042
Mobility (M) -3804 -0.042 -3803 -0.048 -3807 -0.046
SD+BE -3880 -0.035 -3882 -0.043 -3884 -0.040
SD+M -3795 -0.050 -3796 -0.057 -3798 -0.056
BE+M -3819 -0.025 -3817 -0.033 -3822 -0.032
SD+BE+M (Full) -3809 -0.040 -3810 -0.049 -3810 -0.046
Boston Core -2035 -0.005 -2035 -0.016 -2035 -0.014
Social-disorganization (SD) -2019 -0.003 -2020 -0.017 -2020 -0.016
Built environment (BE) -2014 -0.033 -2013 -0.044 -2014 -0.044
Mobility (M) -2001 -0.026 -1999 -0.035 -1999 -0.035
SD+BE -1987 -0.043 -1987 -0.057 -1986 -0.057
SD+M -1973 -0.030 -1972 -0.046 -1971 -0.045
BE+M -1989 -0.033 -1988 -0.043 -1988 -0.042
SD+BE+M (Full) -1957 -0.040 -1957 -0.054 -1956 -0.053
LA Core -9665 0.032 -9663 0.028 -9671 0.029
Social-disorganization (SD) -9529 0.005 -9530 -0.001 -9535 -0.000
Built environment (BE) -9629 0.005 -9629 0.002 -9638 0.001
Mobility (M) -9570 0.018 -9569 0.014 -9576 0.014
SD+BE -9508 -0.010 -9510 -0.015 -9514 -0.014
SD+M -9467 -0.002 -9468 -0.006 -9472 -0.006
BE+M -9585 0.011 -9585 0.008 -9591 0.007
SD+BE+M (Full) -9453 -0.011 -9455 -0.015 -9458 -0.015
Chicago Core -8415 0.117 -8413 0.115 -8414 0.115
Social-disorganization (SD) -8019 0.016 -8019 0.012 -8019 0.013
Built environment (BE) -8371 0.093 -8369 0.090 -8370 0.090
SD+BE -8002 0.003 -8004 -0.001 -8006 -0.000
Table 5: Results of alternative spatial models applied in each city.

Appendix F Alternative Connectivity Matrices

Incorporating spatial relationships in spatial models requires the definition of a connectivity matrix that describes the relationship (if any) between one spatial unit and all the others. One of the most common connectivity matrices encodes a binary relationship between spatial units, also called topology representation Griffith and Peres-Neto (2006), which usually results in a sparse matrix:

$$C_{ij} = \begin{cases} 1 & \text{if units } i \text{ and } j \text{ are contiguous} \\ 0 & \text{otherwise} \end{cases}$$

An alternative formulation is based on distance. For example, Griffith and Peres-Neto (2006) define:


where the threshold is chosen as the maximal distance that keeps all the spatial units connected, while $d_{ij}$ is the (Euclidean) distance between the centroids of units $i$ and $j$. The threshold is computed as the maximal distance of a Minimum Spanning Tree (MST) computed on the distance matrix.
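The threshold rule can be sketched as follows: the longest edge of the MST is the smallest cut-off distance that still leaves every unit connected. The helper names `mst_max_edge` and `distance_connectivity` are hypothetical, and the binary truncation used to build the matrix is an assumption of this sketch; the weighting function of Griffith and Peres-Neto (2006) is not reproduced here.

```python
import numpy as np

def mst_max_edge(D):
    """Longest edge of the minimum spanning tree of a dense distance matrix.

    Uses Prim's algorithm starting from unit 0.
    """
    n = D.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = D[0].copy()                 # cheapest known link from the tree to each unit
    longest = 0.0
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates)) # closest unit not yet in the tree
        longest = max(longest, float(candidates[j]))
        in_tree[j] = True
        best = np.minimum(best, D[j])
    return longest

def distance_connectivity(centroids):
    """Connect all pairs of units closer than the MST-derived threshold."""
    P = np.asarray(centroids, dtype=float)
    D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
    threshold = mst_max_edge(D)
    C = (D <= threshold).astype(float) # binary truncation (assumed weighting)
    np.fill_diagonal(C, 0.0)
    return C, threshold

centroids = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (3.0, 1.0)]
C, threshold = distance_connectivity(centroids)
print(threshold)
print(C)
```

Any smaller threshold would cut the longest MST edge and split the units into disconnected groups.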

We also test an additional formulation that accounts for the connectivity of spatial units, extracted from mobile phone data. Here, the assumption is that the more connections two sites have, the stronger the similarity between them. Thus, the weights are based on the number of trips made, on average, from unit $i$ to unit $j$ and vice versa. As the resulting matrix is not symmetrical and does not have a zero diagonal, it is made symmetric and its diagonal entries are set to zero for all units $i$.
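A minimal sketch of turning a raw trip-count matrix into a usable connectivity matrix is given below. Averaging the two directions is one plausible way to symmetrize and is an assumption here; the function name `mobility_connectivity` and the trip counts are made up for illustration.

```python
import numpy as np

def mobility_connectivity(trips):
    """Symmetric, zero-diagonal connectivity matrix from directed trip counts."""
    T = np.asarray(trips, dtype=float)
    C = (T + T.T) / 2.0          # average the i->j and j->i flows (assumed choice)
    np.fill_diagonal(C, 0.0)     # drop trips that start and end in the same unit
    return C

# Hypothetical average trips between three units (rows: origin, columns: destination)
trips = [
    [5, 2, 0],
    [4, 1, 6],
    [0, 0, 3],
]
C = mobility_connectivity(trips)
print(C)
```

Symmetry matters because the eigendecomposition used for spatial filtering assumes a symmetric matrix.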

As shown in Table 6, the BSF model with the contiguity matrix achieves the best performance in all the urban settings.

City Model Contiguity Distance Mobility
Bogota Core -3897 -0.034 -3999 0.016 -4104 0.073
Social-disorganization (SD) -3891 -0.043 -3969 0.010 -4051 0.060
Built environment (BE) -3881 -0.036 -3971 0.003 -4051 0.048
Mobility (M) -3804 -0.042 -3906 0.010 -4021 0.037
SD+BE -3880 -0.035 -3949 0.006 -4000 0.042
SD+M -3795 -0.050 -3879 0.002 -3964 0.031
BE+M -3819 -0.025 -3889 0.003 -3975 0.027
SD+BE+M (Full) -3809 -0.040 -3873 0.001 -3929 0.027
Boston Core -2035 -0.005 -2078 0.036 -2208 0.118
Social-disorganization (SD) -2019 -0.003 -2037 0.024 -2081 0.107
Built environment (BE) -2014 -0.033 -2040 0.010 -2168 0.076
Mobility (M) -2001 -0.026 -2014 -0.010 -2094 0.029
SD+BE -1987 -0.043 -2007 -0.002 -2029 0.022
SD+M -1973 -0.030 -1991 -0.011 -2011 0.001
BE+M -1989 -0.033 -2006 -0.006 -2108 0.075
SD+BE+M (Full) -1957 -0.040 -1976 -0.008 -1993 -0.002
LA Core -9665 0.032 -9966 0.061 -10691 0.134
Social-disorganization (SD) -9529 0.005 -9708 0.014 -9977 0.033
Built environment (BE) -9629 0.005 -9767 0.017 -10528 0.113
Mobility (M) -9570 0.018 -9875 0.050 -10596 0.118
SD+BE -9508 -0.010 -9668 0.005 -9922 0.026
SD+M -9467 -0.002 -9647 0.016 -9898 0.029
BE+M -9585 0.011 -9733 0.021 -10514 0.111
SD+BE+M (Full) -9453 -0.011 -9613 0.003 -9891 0.027
Chicago Core -8415 0.117 -8716 0.076 - -
Social-disorganization (SD) -8019 0.016 -8257 0.028 - -
Built environment (BE) -8371 0.093 -8623 0.058 - -
SD+BE -8003 0.003 -8244 0.032 - -
Table 6: Results of alternative connectivity matrices applied in the spatial model of each city. Chicago does not have mobility information, thus it was not possible to use the mobility matrix.

Appendix G Spatial model decomposition

The fit of our model can be decomposed into fixed effects, which are driven by the input variables; random effects, which capture the otherwise unexplained variance through spatial auto-correlation; and residuals, which are the errors of the model.
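The decomposition can be sketched numerically as below, assuming a log link so that the expected count is the exponential of the sum of the fixed and random parts. The log link, the raw (observed minus fitted) residuals, the function name `decompose_fit` and all numbers are assumptions of this illustration, not the paper's fitted quantities.

```python
import numpy as np

def decompose_fit(y, X, beta, E, gamma):
    """Split a log-linear fit into fixed effects, random effects and residuals.

    fixed  = X @ beta   (contribution of the input variables, log scale)
    random = E @ gamma  (contribution of the spatial eigenvectors, log scale)
    residuals = y - exp(fixed + random)
    """
    fixed = X @ beta
    random = E @ gamma
    fitted = np.exp(fixed + random)
    residuals = y - fitted
    return fixed, random, residuals

# Two spatial units, an intercept plus one covariate, and one eigenvector
y = np.array([2.0, 3.0])
X = np.array([[1.0, 0.0], [1.0, 1.0]])
beta = np.array([0.0, np.log(2.0)])
E = np.array([[1.0], [-1.0]])
gamma = np.array([np.log(2.0)])

fixed, random, residuals = decompose_fit(y, X, beta, E, gamma)
print(fixed, random, residuals)
```

Mapping each of the three returned vectors over the spatial units gives the kind of panels that the decomposition figures show.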

From Figure 6D, Figure 7D, Figure 8D, and Figure 9D, we do not observe any clear spatial pattern in the residuals, confirming that the BSF model mitigates the spatial auto-correlation as expected.

Observing the random effects can help locate local spatial effects that are not considered in the fixed effects. In Bogotá, the model suggests that significant unexplained variance is present near the touristic and dangerous neighbourhood La Candelaria, and near the populous district of Engativá (see Figure 7). In Boston, the area near Franklin Park indicates missing local factors (see Figure 6). In Los Angeles, unexplained variance seems to be tied to places with a large number of people, namely the international airport and the UCLA campus (see Figure 8). Finally, in Chicago, missing variables are suggested near the prison and in the southern area (see Figure 9).

Figure 6: Decomposition of the ground truth into fixed effects, random effects and residuals in Boston.
Figure 7: Decomposition of the ground truth into fixed effects, random effects and residuals in Bogotá.
Figure 8: Decomposition of the ground truth into fixed effects, random effects and residuals in Los Angeles.
Figure 9: Decomposition of the ground truth into fixed effects, random effects and residuals in Chicago.

Appendix H Improvement analysis

Figure 10, Figure 11, Figure 12 and Figure 13 show the improvement of each model over the Core model, together with some reference variables for each city.

Figure 10: Model improvements in Boston. The SD model improves the prediction over the core model almost everywhere, but especially in disadvantaged areas, while the BE model seems to improve the prediction mostly near the airport and in peripheral areas. The mobility model also seems to improve the prediction, but it generates a strong outlier that performs poorly near the city centre. Finally, the Full model outperforms the core model and the other models almost everywhere, although it keeps failing in some areas due to mobility information.
Figure 11: Model improvements in Bogotá. The SD model improves the prediction over the core model very slightly, while the BE model seems to improve the prediction mostly near the richer part of the city and in areas with a high number of shops. The mobility model also seems to improve the prediction, but it generates a strong outlier that performs poorly near Engativá, a populous neighbourhood in Bogotá. Finally, the Full model outperforms the core model and the other models almost everywhere, although it keeps failing in some areas due to mobility information.
Figure 12: Model improvements in Los Angeles. The SD model improves the prediction over the core model very consistently, especially in deprived areas, while the BE model seems to improve the prediction only slightly. The mobility model seems to improve the prediction especially in popular areas, such as the airport. Finally, the Full model outperforms the core model and the other models almost everywhere.
Figure 13: Model improvements in Chicago. Here, we do not possess mobility information, so we compare the SD, BE and SD+BE models. The SD model improves the prediction over the core model very consistently, especially in deprived areas, while the BE model seems to improve the prediction in southern Chicago. The SD+BE model outperforms the core model and the other models almost everywhere.

Appendix I Auto-correlation of features

Figure 14 shows that the features do not act with the same strength and direction in all cities.

Figure 14: Correlation of features in the different cities.

Appendix J The minimal model

Table 7 shows the results of the minimal model, which employs only the features that play the same role in all the cities. The results show that the minimal model does not predict crime better than the Full model in any city.

City Model LOO
Bogota Minimal -3872 -0.046
Bogota Full -3809 -0.040
Boston Minimal -1994 -0.028
Boston Full -1957 -0.040
Los Angeles Minimal -9534 -0.005
Los Angeles Full -9453 0.003
Chicago Minimal -8009 0.012
Chicago Full -8003 0.003
Table 7: Results of the Full model and the minimal one, which exploits only those features that play the same role in all the cities.

Appendix K Corehood tests

Table 8 shows the results of the Full model (the SD+BE model in Chicago) for different sizes of Corehood. From the results, we observe that the best size to infer the neighbourhood effect is half a mile.

City Model Corehood size LOO
Bogota Full 0.5 miles -3809 -0.040
Bogota Full 1 mile -3860 -0.005
Boston Full 0.5 miles -1957 -0.040
Boston Full 1 mile -2026 -0.011
Los Angeles Full 0.5 miles -9453 0.003
Los Angeles Full 1 mile -9644 -0.001
Chicago SD+BE 0.5 miles -8003 0.003
Chicago SD+BE 1 mile -8066 -0.016
Table 8: Results of the Full or SD+BE model with different sizes of Corehood.
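The comparison behind the half-mile conclusion can be checked with a few lines of arithmetic on the LOO values of Table 8. The sketch assumes that a higher (less negative) LOO indicates a better model, which is consistent with how the Full model compares to the others in the tables; the names `loo_by_size` and `best_size` are hypothetical.

```python
# LOO scores copied from Table 8: {city: {corehood size: LOO}}
loo_by_size = {
    "Bogota": {"0.5 miles": -3809, "1 mile": -3860},
    "Boston": {"0.5 miles": -1957, "1 mile": -2026},
    "Los Angeles": {"0.5 miles": -9453, "1 mile": -9644},
    "Chicago": {"0.5 miles": -8003, "1 mile": -8066},
}

def best_size(scores):
    """Return the corehood size with the highest LOO (assumed: higher is better)."""
    return max(scores, key=scores.get)

# Preferred size per city, and the LOO gap between the two sizes
best = {city: best_size(scores) for city, scores in loo_by_size.items()}
gap = {city: s["0.5 miles"] - s["1 mile"] for city, s in loo_by_size.items()}

for city in loo_by_size:
    print(city, best[city], gap[city])
```

Under that assumption, the half-mile Corehood is preferred in every city, with the largest gap in Los Angeles.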