Identifying Hidden Visits from Sparse Call Detail Record Data

by Zhan Zhao, et al.

Despite a large body of literature on trip inference using call detail record (CDR) data, a fundamental understanding of the limitations of such data is lacking. In particular, because CDR data are sparse, a user may travel to a location without that visit being revealed in the data, which we refer to as a "hidden visit". The existence of hidden visits hinders our ability to extract reliable information about human mobility and travel behavior from CDR data. In this study, we propose a data fusion approach to obtain labeled data for statistical inference of hidden visits. In the absence of complementary data, this can be accomplished by extracting labeled observations from more granular cellular data access records, and extracting features from voice call and text messaging records. The proposed approach is demonstrated using a real-world CDR dataset of 3 million users from a large Chinese city. Logistic regression, support vector machine, random forest, and gradient boosting are used to infer whether a hidden visit exists during a displacement observed from CDR data. The test results show significant improvement over the naive no-hidden-visit rule, which is an implicit assumption adopted by most existing studies. Based on the proposed model, we estimate that over 10% of the displacements extracted from CDR data involve hidden visits. The proposed data fusion method offers a systematic statistical approach to inferring individual mobility patterns based on telecommunication records.
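To make the labeling idea concrete, here is a minimal Python sketch of how a denser trace could be used to label displacements seen in a sparse one. It is an illustration under stated assumptions, not the authors' implementation: the record format `(timestamp, location)`, the function names `label_displacements` and `naive_accuracy`, the `elapsed` feature, and the rule "a hidden visit is any location in the dense trace that differs from both origin and destination" are all hypothetical simplifications.

```python
from typing import Dict, List, Tuple

# Hypothetical record format: (timestamp in hours, location id).
Record = Tuple[float, str]


def label_displacements(cdr: List[Record], data_access: List[Record]) -> List[Dict]:
    """Label each displacement in the sparse trace using the denser trace.

    Assumption: a displacement is a pair of consecutive sparse records at
    different locations, and it contains a hidden visit if the dense trace
    shows a third location strictly between the two record times.
    """
    cdr = sorted(cdr)
    data_access = sorted(data_access)
    labeled = []
    for (t0, origin), (t1, dest) in zip(cdr, cdr[1:]):
        if origin == dest:
            continue  # same location twice: no displacement to label
        # Locations observed in the dense trace between the two sparse records.
        between = {loc for t, loc in data_access if t0 < t < t1}
        labeled.append({
            "origin": origin,
            "dest": dest,
            "elapsed": t1 - t0,  # example feature available from the sparse trace alone
            "hidden_visit": bool(between - {origin, dest}),
        })
    return labeled


def naive_accuracy(labeled: List[Dict]) -> float:
    """Accuracy of the no-hidden-visit rule, which always predicts False."""
    if not labeled:
        return 0.0
    return sum(not d["hidden_visit"] for d in labeled) / len(labeled)


# Toy example: voice/text records (sparse) and data access records (dense).
cdr = [(8.0, "A"), (12.0, "B"), (13.0, "B"), (18.0, "A")]
data_access = [(9.0, "A"), (10.0, "C"), (11.5, "B"), (15.0, "B"), (17.0, "A")]

labeled = label_displacements(cdr, data_access)
for row in labeled:
    print(row)
print("naive rule accuracy:", naive_accuracy(labeled))
```

In this toy trace the displacement from A to B passes through C in the dense records, so it is labeled as containing a hidden visit, while the return from B to A is not. Labeled rows of this kind could then serve as training data for any of the classifiers named in the abstract, with the naive rule as the baseline to beat.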






