Introduction
In criminology, it is a generally accepted fact that crime occurs more often in more populated regions. In one of the first works of modern criminology, Balbi and Guerry examined the crime distribution across France in 1825, revealing that some areas experienced more crime than others (balbi1829statistique; Friendly2007). To compare these areas, they realized the need to adjust for population size and analyzed crime rates instead of raw numbers. This method removes the linear effect of population size from crime numbers, and it has been used to measure crime and compare cities almost everywhere—from academia to news outlets (nyt2016; independent2016; siegel2011criminology). However, this approach neglects potential nonlinear effects of population and, more importantly, exposes our limited understanding of the population–crime relationship.
Though different criminology theories expect a relationship between population size and crime, they tend to disagree on how crime increases with population (Chamlin2004; Rotolo2006). These theories predict divergent population effects, such as linear and superlinear crime growth. Despite these theoretical disputes, crime rates per capita are broadly used under the assumption that crime increases linearly with the number of people in a region. Crucially, crime rates are often deemed to be a standard way to compare crime in cities.
However, the widespread adoption of crime rates is arguably due more to tradition than to their ability to remove the effects of population size (Boivin2013). Many urban indicators, including crime, have already been shown to increase nonlinearly with population size (Bettencourt2007). When the linear assumption is violated and we still use rates, we deal with quantities that retain population effects, which introduces an artifactual bias into rankings and analyses.
Despite this inadequacy, we only have a limited understanding of the impact of nonlinearity on crime rates. The literature has mostly paid attention to estimating the relationship between crime and population size, focusing on either specific countries or crime types. The lack of a comprehensive systematic study has limited our knowledge regarding the impact of the linear assumption on crime analyses and, more critically, has prevented us from better understanding the effect of population on crime.
In this work, we analyze burglaries and thefts in twelve countries and investigate how crime rates per capita can misrepresent cities in rankings. Instead of assuming that the population–crime relationship is linear, we estimate this relationship from data using probabilistic scaling analysis (Leitao2016). We use our estimates to rank cities while adjusting for population size, and then we examine how these rankings differ from rankings based on rates per capita. In our results, we find that the linear assumption is unjustified. We show that using crime rates to rank cities can lead to rankings that considerably differ from rankings adjusted for population size. Finally, our results reveal contrasting growths of burglaries and thefts with population size, implying that different crime dynamics can produce distinct features at the city level. Our work sheds light on the population–crime relationship and suggests caution in using crime rates per capita.
Crime and population size
Different theoretical perspectives predict the emergence of a relationship between population size and crime. Three main criminology theories expect this relationship: structural, social control, and subcultural (Chamlin2004; Rotolo2006). In general, these perspectives agree that variations in the number of people in a region have an impact on the way people interact with each other. These theories, however, differ in the type of changes in social interaction and how they can produce a population–crime relationship.
From a structural perspective, a higher number of people increases the chances of social interaction, which increases the occurrence of crime. Two distinct rationales can explain such an increase. Mayhew1976 posit that crime is a product of human contact: more interaction leads to higher chances of individuals being exploited, offended, or harmed. They claim that a larger population size increases the opportunities for interaction at an increasing rate, which would lead to a superlinear crime growth with population size (Chamlin2004). In contrast, blau1977inequality implies a linear population–crime relationship. He posits that population aggregation reduces spatial distance among individuals which promotes different social associations such as victimization. At the same time that conflictive association increases, other integrative ones also increase, leading to a linear growth of crime (Chamlin2004). Notably, the structural perspective focuses on the quantitative consequences of population growth.
The social control perspective advocates that changes in population size have a qualitative impact on social relations, which weakens informal social control mechanisms that inhibit crime (Groff2015). From this perspective, crime relates to two aspects of population: size and stability. Larger population size leads to higher population density and heterogeneity: not only do individuals have more opportunities for social contacts, but they are also often surrounded by strangers (Wirth1938). This situation makes social integration more difficult and promotes higher anonymity, which encourages criminal impulses and harms the community’s ability to socially constrain misbehavior (Freudenburg1986; Sampson1986). Similarly, from a systemic viewpoint, any change (i.e., increase or decrease) in population size can have an impact on crime numbers (Rotolo2006). This viewpoint understands that regular and sustained social interactions produce community networks with effective mechanisms of social control (Bursik1982). Population instability, however, hinders the construction of such networks. In communities with unstable population size, residents avoid socially investing in their neighborhoods, which hurts community organization and weakens social control, increasing misbehavior and crime (Sampson1988; Miethe1991).
Both social control and structural perspectives solely focus on individuals’ interactions without considering individuals’ private interests. These perspectives pay little attention to how unconventional interests increase with urbanization (Fischer1975) and how these interests relate to misbehavior.
In contrast, the subcultural perspective advocates that population concentration brings together individuals with shared interests, which produces private social networks built around these interests and promotes social support for behavioral choices. Fischer1975 posits that population size has an impact on the creation, diffusion, and intensification of unconventional interests. He proposes that large populations have enough people with specific shared interests to enable social interaction and lead to the emergence of subcultures. The social networks surrounding a subculture bring normative expectations that increase the likelihood of misbehavior and crime (Fischer1995; Fischer1975).
These three perspectives (structural, social control, and subcultural) expect that more people in an area lead to more crime in that area. In the case of cities, we know that population size is indeed a strong predictor of crime (Bettencourt2007). The existence of a population–crime relationship implies that we must adjust for population size to analyze crime in cities properly.
Crime rate per capita
In the literature, the typical solution for removing the effect of population size from crime numbers is to use ratios such as
$$\text{crime rate} = \frac{\text{crime}}{\text{population}}, \tag{1}$$
which are often used together with a multiplier that contextualizes the quantity (e.g., crime per a fixed number of inhabitants) (Boivin2013). Although crime rates are popular, they present at least two inadequacies. First, the way we define population affects crime rates. The common approach is to use resident population (e.g., census data) to estimate rates, but this practice can distort the picture of crime in a place: crime is not limited to residents (Gibbs1976), and cities attract a substantial number of nonresidents (Stults2015). Instead, researchers suggest using ambient population (Andresen2006; Andresen2011) and accounting for the number of targets, which depends on the type of crime (Boggs1965; Cohen1985).
Second, Eq. (1) assumes that the population–crime relationship is linear. The rationale behind this equation is that we have a relationship of the form
$$\text{crime} \sim \text{population}, \tag{2}$$
which means that crime can be linearly approximated via population. Because of the linearity assumption, when we divide crime by population in Eq. (1), we are trying to cancel out the effect of population on crime. This assumption implies that crime increases at the same pace as population. Not all theoretical perspectives, however, agree with such a type of growth, and many urban indicators, including crime, have been shown to increase with population size in a nonlinear fashion (Bettencourt2007).
Cities and scaling laws
Much research has been devoted to understanding urban growth and its impact on indicators such as gross domestic product, total wages, electrical consumption, and crime (Bettencourt2007; Bettencourt2010; Bettencourt2013; GomezLievano2016). Bettencourt2007 have shown that a city’s population size, denoted by $N$, is a strong predictor of its urban indicators, denoted by $Y$, exhibiting a relationship of the form:
$$Y \sim N^{\beta}. \tag{3}$$
This so-called scaling law tells us that, given the size of a city, we expect certain levels of wealth creation, knowledge production, criminality, and other urban aspects. This expectation suggests general processes underlying urban development (Bettencourt2013hyp) and indicates that regularities exist in cities despite their idiosyncrasies (oliveira2019spatial). To understand this scaling and urban processes better, we can examine the exponent $\beta$, which describes how an urban indicator grows with population size.
Bettencourt2007 presented evidence that different categories of urban indicators exhibit distinct growth regimes. They showed that social indicators grow faster than infrastructural ones (see Fig. 1A). Specifically, social indicators, such as number of patents and total wages, increase superlinearly with population size (i.e., $\beta > 1$), meaning that these indicators grow at an increasing rate with population. In the case of infrastructural aspects (e.g., road surface, length of electrical cables), there exists an economy of scale. As cities grow in population size, these urban indicators increase at a slower pace, with $\beta < 1$ (i.e., sublinearly). In both scenarios, because of nonlinearity, we should be careful with per capita analyses.
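As a point of reference for how an exponent can be read off data, the sketch below fits a power law by ordinary least squares in log-log space. This is the simple baseline, not the probabilistic scaling analysis used later in this work; the function name `fit_loglog_ols` and the synthetic cities (generated with made-up values 0.01 for the prefactor and 1.2 for the exponent) are illustrative assumptions.

```python
import math

def fit_loglog_ols(populations, indicators):
    """Least-squares fit of ln Y = ln(alpha) + beta * ln N; returns (alpha, beta)."""
    xs = [math.log(n) for n in populations]
    ys = [math.log(y) for y in indicators]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = math.exp(mean_y - beta * mean_x)
    return alpha, beta

# Synthetic, noise-free cities (hypothetical values, not data from this study).
populations = [10_000, 50_000, 200_000, 1_000_000]
indicators = [0.01 * n ** 1.2 for n in populations]

alpha_hat, beta_hat = fit_loglog_ols(populations, indicators)
print(f"alpha = {alpha_hat:.4f}, beta = {beta_hat:.4f}")
```

An estimated exponent above one would indicate superlinear growth and one below one sublinear growth; with real data, the fluctuations around the fit matter, which is what motivates a probabilistic treatment.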
When we violate the linearity assumption of per capita ratios, we deal with quantities that can misrepresent an urban indicator. To show that, we use Eq. (3) to define the per capita rate $C$ of an urban indicator as the following:
$$C = \frac{Y}{N} \sim N^{\beta - 1}, \tag{4}$$
which implies that rates are independent of population only when $\beta$ equals one; when $\beta \neq 1$, population is not cancelled out of the equation. In these nonlinear cases, per capita rates can inflate or deflate the representation of an urban indicator depending on $\beta$ (see Fig. 1B) (Bettencourt2010; Alves2013b). This misrepresentation occurs because population still has an effect on rates. By definition, we expect per capita rates to be higher in bigger cities when $\beta > 1$, whereas when $\beta < 1$, we expect bigger cities to have lower rates. In nonlinear situations, when we compare cities via rates, we introduce an artifactual bias in analyses and rankings of urban indicators.
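The population effect that remains in per capita rates can be illustrated numerically. The sketch below assumes an indicator that follows a power law exactly, with a prefactor `alpha` set to one for convenience; the function name `expected_rate` and the two city sizes are hypothetical.

```python
def expected_rate(population, beta, alpha=1.0):
    """Per capita rate implied by Y = alpha * N**beta, i.e., alpha * N**(beta - 1)."""
    return alpha * population ** beta / population

small, large = 10_000, 1_000_000  # hypothetical city sizes

for beta in (0.8, 1.0, 1.2):
    ratio = expected_rate(large, beta) / expected_rate(small, beta)
    print(f"beta = {beta}: rate in large city / rate in small city = {ratio:.2f}")
```

Only the linear case gives a ratio of one; in the superlinear case the larger city has the higher expected rate purely because of its size, and in the sublinear case the lower one.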
More crime in cities?
In the case of crime, researchers have found a superlinear growth with population size. Bettencourt2007 showed that serious crime in the United States exhibits a superlinear scaling (i.e., $\beta > 1$), and some evidence has confirmed similar superlinearity for homicides in Brazil, Colombia, and Mexico (GomezLievano2012; Alves2013). Previous works have also shown that different kinds of crime in the U.K. and in the U.S. present nonlinear scaling relationships (Hanley2016; Chang2019). Remarkably, the existence of these scaling laws of crime suggests fundamental urban processes that relate to crime, independent of cities’ particularities.
This regularity manifests itself in the so-called scale-invariance property of scaling laws. It is possible to show that Eq. (3) holds the following property:
$$Y(\kappa N) = g(\kappa)\, Y(N), \tag{5}$$
where $g(\kappa)$ does not depend on $N$ (thurner2018introduction). From a modeling perspective, this relationship reveals two aspects about crime. First, we can predict crime numbers in cities via a populational scale transformation $\kappa$ (Bettencourt2013hyp). This transformation is independent of population size but depends on $\beta$, which tunes the relative increase of crime in such a way that $g(\kappa) = \kappa^{\beta}$. Second, Eq. (5) implies that crime is present in any city, independent of size. This implication arguably relates to the Durkheimian concept of crime normalcy in that crime is seen as a normal and necessary phenomenon in societies, provided that its numbers are not unusually high (durkheim1938rules). The scale-invariance property tells us that crime in cities is associated with population in a somewhat predictable fashion. Crucially, this property might give the impression that such a regularity is independent of crime type.
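The scale-invariance property can be checked in one line. As a sketch, assume the scaling law holds exactly as $Y(N) = \alpha N^{\beta}$, where the prefactor $\alpha$ and the scale factor $\kappa$ are notation introduced here for illustration:

```latex
\begin{align*}
Y(\kappa N) &= \alpha (\kappa N)^{\beta} \\
            &= \kappa^{\beta}\, \alpha N^{\beta} \\
            &= \kappa^{\beta}\, Y(N).
\end{align*}
```

The factor multiplying $Y(N)$ is $\kappa^{\beta}$, which depends on the exponent but not on $N$. For instance, with a hypothetical exponent of $1.2$, doubling the population ($\kappa = 2$) multiplies the expected crime count by $2^{1.2} \approx 2.3$ regardless of the city's starting size.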
However, different types of crime are connected to social mechanisms differently (Hipp2016) and exhibit unique temporal (crimeprofiles; Oliveira2018) and spatial characteristics (Andresen2012; white2014spatial; oliveira2015criminal; Oliveira2017). It is plausible that the scaling laws of crime depend on crime type. Nevertheless, the literature has mostly focused on either specific countries or crime types. Few studies have systematically examined the scaling of different crime types, and the focus on specific countries has prevented us from better understanding the impact of population on crime. Likewise, the lack of a comprehensive systematic study has limited our knowledge about the impact of the linear assumption on crime rates. We still fail to understand how per capita analyses can misrepresent cities in nonlinear scenarios.
In this work, we characterize the scaling laws of burglary and theft in twelve countries and investigate how crime rates per capita can misrepresent cities in rankings. Instead of assuming that the population–crime relationship is linear, as described in Eq. (2), we investigate this relationship under its functional form as the following:
$$\text{crime} \sim f(\text{population}). \tag{6}$$
Specifically, we examine the plausibility of scaling laws to describe the population–crime relationship. To estimate the scaling laws, we use probabilistic scaling analysis, which enables us to characterize the scaling laws of crime. We use our estimates to rank cities while accounting for the effects of population size. Finally, we compare these adjusted rankings with rankings based on per capita rates (i.e., with the linearity assumption).
Results
Table 1. Theft and burglary data by country. The original columns report summary statistics per offense, including the sample standard deviation and the maximum value; the cell values are not preserved here, so only data availability is shown.

Country          Theft    Burglary
Belgium          yes      yes
Canada           yes      yes
Colombia         yes      yes
Denmark          yes      yes
France           yes      yes
Italy            yes      yes
Mexico           yes      no
Portugal         no       yes
South Africa     yes      yes
Spain            yes      no
United Kingdom   yes      yes
United States    yes      yes

We use data from twelve countries to investigate the relationship between population size and crime at the city level. We examine annual data from Belgium, Canada, Colombia, Denmark, France, Italy, Mexico, Portugal, South Africa, Spain, the United Kingdom, and the United States (see Table 1). In our research, we are not interested in comparing countries’ absolute numbers of crime. We understand that international comparisons of crime have several problems because of differences in crime definitions, police and court practices, reporting rates, and others (Takala2008). In this work, however, we want to investigate how crime increases with population size in each country, focusing on burglary and theft (see Supplementary Information for data sources). We analyze data of both types of crime in all considered countries, except Mexico, Portugal, and Spain, where we only have data for one kind of offense.
The scaling laws of crime in cities
To assess the relationship between crime and population size (see Fig. 2), we model $P(Y \mid N)$ using probabilistic scaling analysis (see Methods). In our study, we examine whether this relationship follows the general form of $Y \sim N^{\beta}$. First, we estimate $\beta$ from data, and then we evaluate the plausibility of the model and the evidence for nonlinearity (i.e., $\beta \neq 1$). Our results show that $Y$ and $N$ often exhibit a nonlinear relationship, depending on the type of offense.
In most of the considered countries, theft increases with population size superlinearly, whereas burglary tends to increase linearly (see Fig. 3). Precisely, in nine out of eleven countries, we find that $\beta$ for theft is above one; our results indicate linearity for theft (i.e., absence of nonlinear plausibility) in Canada and South Africa. In the case of burglary, we are unable to reject linearity in seven out of ten countries; in France and the United Kingdom, we find superlinearity, and, in Canada, sublinearity. In almost all the data sets, these estimates are consistent over two consecutive years in the countries for which we have data for different years (see Appendix I).
Our results show that the general form of $Y \sim N^{\beta}$ is plausible in most countries, but that this compatibility depends on the offense. We find that burglary data are compatible with the model in 80% of the considered countries. In the case of theft, the superlinear models are compatible with data in five out of nine countries. We note that, in Canada and South Africa, where we are unable to reject linearity for theft, the linear model also lacks compatibility with data.
We find that the estimates of $\beta$ for each offense often have different values across countries; the superlinear estimates of $\beta$ for theft, for example, vary from country to country. However, when we analyze each country separately, we find that $\beta$ for theft tends to be larger than $\beta$ for burglary in each country.
In summary, we find evidence for a nonlinear relationship between crime and population size in more than half of the considered data sets. Our results indicate that crime often increases with population size at a pace that is different from per capita. This relationship implies that analyses with a linear assumption might create distorted pictures of crime in cities. To understand such distortions, we have to examine how nonlinearity influences comparisons of crime in cities, when linearity is assumed.
The inadequacy of crime rates and per capita rankings
We investigate how crime rates of the form $C = Y/N$ introduce bias in comparisons and rankings of cities. To understand this bias, we use Eq. (3) to rewrite crime rate as $C \sim N^{\beta - 1}$. This relationship implies that crime rate depends on population size when $\beta \neq 1$. For example, in Portugal and Denmark, this dependency is clear when we analyze burglary and theft numbers (see Fig. 4). In the case of burglary in Portugal, linearity makes $C$ independent of population size. In Denmark, since theft increases superlinearly, we expect rates to increase with population size. In this country, based on data, the expected theft rate of a small city is lower than the ones of larger cities. We have to account for this tendency in order to compare crime in cities; otherwise, we introduce bias against larger cities.
To account for the population–crime relationship found in data, we compare cities using the model $P(Y \mid N)$ as the baseline. We compare the number of crimes in a city with the expectation of the model. For each city $i$ with population size $n_i$, we evaluate the score of the city with respect to $P(Y \mid N = n_i)$. The score tells us how much more or less crime a particular city has in comparison to cities with similar population size, as expected by the model. These scores enable us to compare cities in a country and rank them while accounting for population size differences. We denote this kind of analysis as a comparison adjusted for population–crime relationship.
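The adjusted comparison can be sketched as follows. The example assumes the score is a standardized deviation from the model expectation under a Gaussian model with mean $\alpha N^{\beta}$ and variance $\gamma(\alpha N^{\beta})^{\delta}$; the parameter values, city names, and the function `adjusted_score` are hypothetical and not estimates from this study.

```python
import math

def expected_crime(population, alpha, beta):
    """Model expectation of crime for a city of the given size."""
    return alpha * population ** beta

def adjusted_score(crime, population, alpha, beta, gamma, delta):
    """Standardized deviation of observed crime from the model expectation."""
    mean = expected_crime(population, alpha, beta)
    std = math.sqrt(gamma * mean ** delta)
    return (crime - mean) / std

# Hypothetical superlinear model and two hypothetical cities: (population, crimes).
params = {"alpha": 0.003, "beta": 1.2, "gamma": 1.0, "delta": 1.0}
cities = {"large": (200_000, 5_000), "small": (20_000, 500)}

rates = {name: crime / pop for name, (pop, crime) in cities.items()}
scores = {
    name: adjusted_score(crime, pop, **params)
    for name, (pop, crime) in cities.items()
}

for name in cities:
    print(f"{name}: rate = {rates[name]:.3f}, score = {scores[name]:+.1f}")
```

Both cities have the same per capita rate, yet the larger one falls below what the superlinear model expects for its size and the smaller one above it, so the two measures order them differently.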
For example, in Denmark, the theft rate in the municipality of Aalborg is almost the same as in Solrød. However, less crime occurs in Aalborg than expected for cities of similar size, while crime in Solrød is above the model expectation (see Fig. 4B). This disagreement arises because of the different population sizes. Since Aalborg is more than ten times larger than Solrød, we expect rates in Aalborg to be larger than in Solrød. When we account for this tendency and evaluate their scores, we find that Aalborg scores below the model expectation, whereas Solrød scores above it.
Such inconsistencies have an impact on crime rankings of cities. The municipality of Aarhus, in Denmark, for example, is in the top twelve ranking of cities with the highest theft rate in the country. However, when we account for population–crime relationship using scores, we find that Aarhus is only at the end of the top fifty-four ranking.
To understand these variations systematically, we compare rankings based on crime rates with rankings that account for population–crime relationship (i.e., adjusted rankings). Our results show that these two rankings create distinct representations of cities. For each considered data set, we rank cities based on their scores and crime rates and then examine the change in the rank of each city. We find that the positions of the cities can change substantially. For instance, in Italy, half of the cities have theft rate ranks that diverge by at least eleven positions from the adjusted ranking (Fig. 5A). This divergence means that the two rankings disagree about half of the cities in the top ten most dangerous cities.
We evaluate these discrepancies by using the Kendall rank correlation coefficient to measure the similarity between crime rates and adjusted rankings in the considered countries. We find that these rankings can differ considerably but converge when $\beta$ is close to one. The coefficients vary across the data sets, exhibiting a dependency on the type of crime or, more specifically, on the scaling exponent $\beta$ (Fig. 5B). As expected, as $\beta$ approaches one, the rankings are more similar to each other. For example, in Italy, in contrast to theft, the burglary rate rank of half of the cities only differs from the adjusted ranking by a maximum of two positions (Fig. 5A).
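To make the rank comparison concrete, the sketch below computes the Kendall coefficient between two orderings in plain Python. It uses the tau-a variant, which assumes no ties; the rates and scores are made-up numbers for five hypothetical cities, and `kendall_tau` is an illustrative helper rather than the implementation used in this work.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) / number of pairs; assumes no ties."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical per capita rates and population-adjusted scores for five cities.
rates = [0.031, 0.027, 0.025, 0.019, 0.012]
scores = [0.4, 1.3, -0.2, 0.9, -1.1]

tau = kendall_tau(rates, scores)
print(f"Kendall tau between rate-based and adjusted orderings: {tau:.2f}")
```

A coefficient of one means both measures order the cities identically; lower values mean more pairs of cities swap places between the two rankings.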
Discussion
Despite being used virtually everywhere, crime rates per capita have a strong assumption that crime increases at the same pace as the number of people in a region. In this work, we investigated how crime grows with population size and how such a widespread assumption of linear growth influences cities’ rankings.
First, we analyzed crime in cities from twelve countries to characterize the population–crime relationship statistically, examining the plausibility of scaling laws to describe this relationship. Then, we ranked cities using our estimates and compared how these rankings differ from rankings based on rates per capita.
We found that the assumption of linear crime growth is unfounded. In more than half of the considered data sets, we found evidence for nonlinear crime growth—that is, crime often increases with population size at a different pace than per capita. This nonlinearity introduces a population effect into crime rates. Our results showed that using crime rates to rank cities substantially differs from ranking cities while adjusting for population size.
From academia to news outlets, crime rates per capita are arguably used because they provide us with a familiar measure of criminality (Boivin2013). Our work implies, however, that they can create a distorted picture of crime in cities. For example, in superlinear scenarios, we expect bigger cities to have higher crime rates. In this case, when we use rates to rank cities, we build rankings in which big cities are at the top. But these cities might not experience more crime than what we expect from places of the same size. This is an artifactual bias due to population effects still present in crime rates.
Because of this inadequacy, we advise caution when using crime rates per capita to compare cities. We recommend first evaluating the linear plausibility before analyzing crime rates, and avoiding them when possible. Instead, we suggest comparing scores computed via the model estimated using the approach discussed in the manuscript (Leitao2016).
We highlight that crime rates per capita also suffer from the population definition issue; that is, how we define population affects crime rates. In this work, we used the resident population to analyze the population–crime relationship. We understand that crime is not limited to residents (Gibbs1976), and cities attract nonresidents (Stults2015). Much literature suggests using ambient population and accounting for the number of targets (Boggs1965; Andresen2006; Andresen2011). However, such data are difficult to collect when dealing with different countries. Future research should investigate the scaling laws using other definitions of population, particularly using social media data (Malleson2016; PachecoOM17).
In this work, we shed light on the population–crime relationship. The linear assumption is exhausted and expired. We have resounding evidence of nonlinearity in crime, which disallows us from unjustifiably assuming linearity. Yet, in light of our results, we note that the scaling laws are plausible models only for half of the considered data sets. We need better models—in particular, models that account for the fact that different crime types relate to population size differently. More adequate models will help us better understand the relationship between population and crime.
Data and methods
Preprocessing data
We gathered data sets of different types of crime at the city level from twelve countries: Belgium, Canada, Colombia, Denmark, France, Italy, Mexico, Portugal, South Africa, Spain, the United Kingdom, and the United States. To examine different types of crime in these countries, we need to have a way to denote each type of crime in each place using a general description. The way we categorize the different types of crime is summarized in the Supplementary Material.
Probabilistic scaling analysis
We use probabilistic scaling analysis to estimate the scaling laws of crime. Instead of analyzing the linear form of Eq. (3), we use the approach developed by Leitao2016 to estimate the parameters of a distribution $P(Y \mid N)$ that has the following expectation:
$$\mathrm{E}[Y \mid N] = \alpha N^{\beta}, \tag{7}$$
that is, population size $N$ scales the expected value of an urban indicator $Y$ (Bettencourt2013hyp; GomezLievano2012; Leitao2016). Note that this method does not assume that the fluctuations around $\ln Y$ and $\ln N$ are normally distributed (Leitao2016). Instead, we compare models for $P(Y \mid N)$ that satisfy the following conditional variance:
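$$\mathrm{V}[Y \mid N] = \gamma\, \mathrm{E}[Y \mid N]^{\delta}, \tag{8}$$
where $\gamma$ and $\delta$ control the size of the fluctuations. To estimate the scaling laws, we maximize the log-likelihood
$$\ln \mathcal{L} = \ln P(y_1, \dots, y_m \mid n_1, \dots, n_m) = \sum_{i=1}^{m} \ln P(y_i \mid n_i), \tag{9}$$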
since we assume each observation $y_i$, the crime count of city $i$ with population $n_i$, to be an independent realization from $P(Y \mid N = n_i)$. In this work, we use an implementation developed by Leitao2016 that maximizes the log-likelihood with the ‘L-BFGS-B’ algorithm. We model $P(Y \mid N)$ using Gaussian and lognormal distributions, so we can analyze whether accounting for the size-dependent variance influences the estimation. In the case of the Gaussian, the conditions from Eq. (7) and Eq. (8) are satisfied with
$$\mu_{\mathrm{N}}(N) = \alpha N^{\beta}, \qquad \sigma^{2}_{\mathrm{N}}(N) = \gamma \left(\alpha N^{\beta}\right)^{\delta}, \tag{10}$$
whereas in the case of the lognormal distribution,
$$\mu_{\mathrm{LN}}(N) = \ln \alpha + \beta \ln N - \tfrac{1}{2}\sigma^{2}_{\mathrm{LN}}(N), \qquad \sigma^{2}_{\mathrm{LN}}(N) = \ln\left[1 + \gamma \left(\alpha N^{\beta}\right)^{\delta - 2}\right]. \tag{11}$$
In the lognormal case, note that, if $\delta = 2$, the fluctuations are independent of $N$; thus, this would be the same as using the minimum least-squares approach (Leitao2016). With this framework, we compare models in which $\delta$ is fixed against models in which $\delta$ is also included in the optimization process, for both the Gaussian and the lognormal distributions. We compare each of the models individually against the linear alternative (with fixed $\beta = 1$) to test the plausibility of nonlinearity.
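The likelihood maximization can be sketched for the lognormal case. The example assumes the parameterization $\mu = \ln(\alpha N^{\beta}) - \sigma^{2}/2$ and $\sigma^{2} = \ln[1 + \gamma(\alpha N^{\beta})^{\delta-2}]$, and, to stay dependency-free, replaces the L-BFGS-B optimizer with a coarse grid search over $\beta$ only, holding the other parameters at their true values. The synthetic data, the parameter values, and the function `lognormal_loglik` are illustrative assumptions, not the implementation by Leitao2016.

```python
import math

def lognormal_loglik(alpha, beta, gamma, delta, populations, crimes):
    """Sum of lognormal log-densities with size-dependent mean and variance."""
    total = 0.0
    for n, y in zip(populations, crimes):
        mean = alpha * n ** beta
        sigma2 = math.log(1.0 + gamma * mean ** (delta - 2.0))
        mu = math.log(mean) - 0.5 * sigma2
        total += (
            -math.log(y)
            - 0.5 * math.log(2.0 * math.pi * sigma2)
            - (math.log(y) - mu) ** 2 / (2.0 * sigma2)
        )
    return total

# Synthetic data placed at the model median for beta = 1.2 (hypothetical values).
alpha_true, beta_true, gamma, delta = 0.01, 1.2, 0.01, 2.0
sigma2 = math.log(1.0 + gamma)
populations = [10_000, 50_000, 200_000, 1_000_000]
crimes = [alpha_true * n ** beta_true * math.exp(-0.5 * sigma2) for n in populations]

# Coarse grid search over beta as a stand-in for a numerical optimizer.
candidates = [round(0.8 + 0.01 * k, 2) for k in range(81)]
best_beta = max(
    candidates,
    key=lambda b: lognormal_loglik(alpha_true, b, gamma, delta, populations, crimes),
)
print(f"beta maximizing the log-likelihood on the grid: {best_beta}")
```

In practice all parameters would be optimized jointly and the same data would be fitted with the exponent fixed at one, so that the two fits can be compared.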
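Finally, with the fits for all types of crime and countries, we measure the Bayesian Information Criterion ($\mathrm{BIC}$), defined as
$$\mathrm{BIC} = -2 \ln \mathcal{L} + k \ln m, \tag{12}$$
where $\ln \mathcal{L}$ is the maximized log-likelihood, $k$ is the number of free parameters in the model, $m$ is the number of observations, and lower values indicate better data description. The $\mathrm{BIC}$ value of each fit enables us to compare the ability of the models to explain data.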
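The model comparison can be sketched with the standard form of the criterion, $-2 \ln \mathcal{L} + k \ln m$. The log-likelihood values, parameter counts, and number of cities below are hypothetical, and the sign convention for the difference (linear minus free-exponent) is an assumption made for this example.

```python
import math

def bic(log_likelihood, n_free_params, n_observations):
    """Bayesian Information Criterion; lower values indicate a better description."""
    return -2.0 * log_likelihood + n_free_params * math.log(n_observations)

# Hypothetical fits to the same 100 cities.
bic_linear = bic(log_likelihood=-520.0, n_free_params=3, n_observations=100)
bic_free = bic(log_likelihood=-510.0, n_free_params=4, n_observations=100)

# Positive difference: the free-exponent model is preferred despite its extra parameter.
delta_bic = bic_linear - bic_free
print(f"BIC (linear) = {bic_linear:.2f}, BIC (free exponent) = {bic_free:.2f}")
print(f"difference = {delta_bic:.2f}")
```

The penalty term grows with the number of free parameters, so the model with a free exponent is favored only when its gain in likelihood outweighs the cost of the additional parameter.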
References
Appendices
Appendix I: Results from the probabilistic scaling analysis
To test the plausibility of a nonlinear scaling, we compare each model against the linear alternative (i.e., $\beta = 1$) using the difference $\Delta\mathrm{BIC}$ between the $\mathrm{BIC}$ values of the fits for each data set. We follow Leitao2016 and define three outcomes from this comparison. First, if the difference favors the linear alternative, we say that the model is linear, since we can consider that the linear model explains the data better. Second, if the difference favors the nonlinear model only weakly, we consider the analysis of $\beta$ inconclusive because we do not have enough evidence for the nonlinearity. Finally, if the difference favors the nonlinear model strongly, we have evidence in favor of the nonlinear scaling, which can be superlinear ($\beta > 1$) or sublinear ($\beta < 1$). We also use $\mathrm{BIC}$ to determine the model that describes the data better. In Table 2 and Table 3, we summarize the results: a dark gray cell indicates the best model based on $\mathrm{BIC}$, a light gray cell indicates the best model given a model, and a marker indicates that the model is plausible.
Table 2. Results of the probabilistic scaling analysis for theft under the lognormal and Gaussian models; the cell values are not preserved here. Data sets: Belgium (2015, 2016), Canada (2015, 2016), Colombia (2013, 2014), Denmark (2015, 2016), France (2013, 2014), Italy (2014, 2015), Mexico (2015, 2016), South Africa (2016), Spain (2015, 2016), United Kingdom (2015, 2016), United States (2014, 2015).
Table 3. Results of the probabilistic scaling analysis for burglary under the lognormal and Gaussian models; the cell values are not preserved here. Data sets: Belgium (2015, 2016), Canada (2015, 2016), Colombia (2013, 2014), Denmark (2015, 2016), France (2013, 2014), Italy (2014, 2015), Portugal (2015, 2016), South Africa (2016), United Kingdom (2015, 2016), United States (2014, 2015).