Taming Near Repeat Calculation for Crime Analysis via Cohesive Subgraph Computing

05/18/2017
by Zhaoming Yin, et al.

Near repeat (NR) is a well-known phenomenon in crime analysis which assumes that crime events are correlated within a given time and space frame. Traditional NR calculation produces event pairs: two events form a pair if they occurred within given space and time limits. When the number of events is large, however, NR calculation is time consuming, and how these pairs are organized has not yet been explored. In this paper, we design a new approach to calculate clusters of NR events efficiently. To begin with, an R-tree is used to index crime events. Each event is represented by a vertex, edges are constructed by range querying the vertex in the R-tree, and a graph is formed. Cohesive subgraph approaches are then applied to identify the event chains. The k-clique, k-truss, k-core, and DBSCAN algorithms are implemented in sequence, reflecting their varying ability to find cohesive subgraphs. Real-world crime data from Chicago, New York, and Washington DC are used in the experiments. MapReduce-powered Knox tests confirm that near repeat is a solid effect in large real-world crime data. The performance of the 4 algorithms is validated, while their quality is gauged by the distribution of the number of cohesive subgraphs and by their clustering coefficients. The proposed framework is the first to process real crime data at the scale of millions of records, and the first to detect NR events with size of more than 2.
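
The graph construction and one of the cohesive subgraph steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: a uniform grid stands in for the R-tree as the spatial index, only the k-core step is shown, and the function names (`build_nr_graph`, `k_core`), the planar coordinates, and the time unit in days are all assumptions made for illustration.

```python
from collections import defaultdict


def build_nr_graph(events, max_dist, max_days):
    """Link events that are near each other in both space and time.

    events: list of (x, y, t) tuples (assumed planar coordinates, t in days).
    A uniform grid with cell size max_dist is used here as a simple
    stand-in for the R-tree range query mentioned in the abstract.
    Returns an adjacency dict mapping event index -> set of neighbor indices.
    """
    grid = defaultdict(list)
    for i, (x, y, _) in enumerate(events):
        grid[(int(x // max_dist), int(y // max_dist))].append(i)

    adj = {i: set() for i in range(len(events))}
    for i, (x, y, t) in enumerate(events):
        cx, cy = int(x // max_dist), int(y // max_dist)
        # Any event within max_dist must lie in the 3x3 block of cells.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    if j <= i:
                        continue  # handle each unordered pair once
                    xj, yj, tj = events[j]
                    near_space = (x - xj) ** 2 + (y - yj) ** 2 <= max_dist ** 2
                    near_time = abs(t - tj) <= max_days
                    if near_space and near_time:
                        adj[i].add(j)
                        adj[j].add(i)
    return adj


def k_core(adj, k):
    """Return the vertices of the k-core by repeatedly peeling
    vertices whose remaining degree is below k."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    stack = [v for v, nbrs in adj.items() if len(nbrs) < k]
    removed = set()
    while stack:
        v = stack.pop()
        if v in removed:
            continue
        removed.add(v)
        for u in adj[v]:
            adj[u].discard(v)
            if u not in removed and len(adj[u]) < k:
                stack.append(u)
        adj[v] = set()
    return {v for v in adj if v not in removed}


# Hypothetical events: three mutually close events, one event close to
# only one of them, and one isolated event.
events = [(0, 0, 0), (50, 0, 1), (0, 50, 2), (140, 0, 3), (1000, 1000, 0)]
graph = build_nr_graph(events, max_dist=100, max_days=7)
core = k_core(graph, 2)
```

In this sketch, a traditional NR calculation would stop at the edges of `graph` (the event pairs), whereas the k-core step groups events into a cluster whose members each have at least k near-repeat neighbors inside the cluster.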
