How much telematics information do insurers need for claim classification?

05/28/2021
by Francis Duval, et al.

The literature has repeatedly shown that telematics data collected in motor insurance help to better understand an insured's driving risk. Insurers that use these data reap several benefits, such as a better estimate of the pure premium, more segmented pricing and less adverse selection. The flip side of the coin is that collected telematics information is often sensitive and can therefore compromise policyholders' privacy. Moreover, because of their large volume, such data are costly to store and hard to manipulate. These factors, combined with the fact that insurance regulators tend to issue more and more recommendations regarding the collection and use of telematics data, make it important for an insurer to determine the right amount of telematics information to collect. In addition to traditional contract information such as the age and gender of the insured, we have access to a telematics dataset where information is summarized by trip. We first derive several features of interest from these trip summaries before building a claim classification model using both traditional and telematics features. By comparing a few classification algorithms, we find that logistic regression with lasso penalty is the most suitable for our problem. Using this model, we develop a method to determine how much information about policyholders' driving should be kept by an insurer. Using real data from a North American insurance company, we find that telematics data become redundant after about 3 months or 4,000 kilometers of observation, at least from a claim classification perspective.
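To make the idea of limiting the observation window concrete, here is a minimal sketch of how per-trip summaries might be truncated by elapsed time or cumulative distance before features are derived. It is not the authors' pipeline: the `Trip` fields, the `telematics_features` function, and the particular features computed are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Trip:
    # Hypothetical trip-summary fields; the abstract does not list the real ones.
    day: int             # days elapsed since the start of the contract
    distance_km: float   # distance driven during the trip
    duration_min: float  # trip duration in minutes
    night: bool          # whether the trip took place at night


def telematics_features(
    trips: Iterable[Trip],
    max_km: Optional[float] = None,
    max_days: Optional[int] = None,
) -> dict:
    """Summarize only the trips inside the observation window.

    Trips are read in chronological order and reading stops as soon as a
    trip falls outside the day limit or would push the cumulative distance
    past the kilometer limit.
    """
    kept = []
    total_km = 0.0
    for trip in sorted(trips, key=lambda t: t.day):
        if max_days is not None and trip.day >= max_days:
            break
        if max_km is not None and total_km + trip.distance_km > max_km:
            break
        kept.append(trip)
        total_km += trip.distance_km

    n_trips = len(kept)
    total_min = sum(t.duration_min for t in kept)
    night_km = sum(t.distance_km for t in kept if t.night)
    return {
        "n_trips": n_trips,
        "total_km": total_km,
        "avg_trip_km": total_km / n_trips if n_trips else 0.0,
        "avg_speed_kmh": total_km / (total_min / 60) if total_min else 0.0,
        "night_km_share": night_km / total_km if total_km else 0.0,
    }


# Illustrative (made-up) trips for one policyholder.
example_trips = [
    Trip(day=0, distance_km=10.0, duration_min=15.0, night=False),
    Trip(day=1, distance_km=20.0, duration_min=30.0, night=True),
    Trip(day=100, distance_km=30.0, duration_min=30.0, night=False),
]
features_90_days = telematics_features(example_trips, max_days=90)
features_15_km = telematics_features(example_trips, max_km=15)
features_all = telematics_features(example_trips)
```

Under these assumptions, one could recompute the features for a range of cutoffs, refit the classifier on each version, and look for the point past which classification performance stops improving.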

research
07/06/2020

Cost-sensitive Multi-class AdaBoost for Understanding Driving Behavior with Telematics

Powered with telematics technology, insurers can now capture a wide rang...
research
05/25/2018

Penalized polytomous ordinal logistic regression using cumulative logits. Application to network inference of zero-inflated variables

We consider the problem of variable selection when the response is ordin...
research
09/26/2022

Enhancing Claim Classification with Feature Extraction from Anomaly-Detection-Derived Routine and Peculiarity Profiles

Usage-based insurance is becoming the new standard in vehicle insurance;...
research
12/19/2021

ArcFace Knows the Gender, Too!

The main idea of this paper is that if a model can recognize a person, o...
research
09/06/2022

A Data Science Approach to Risk Assessment for Automobile Insurance Policies

In order to determine a suitable automobile insurance policy premium one...
research
04/12/2022

Prediction of motor insurance claims occurrence as an imbalanced machine learning problem

The insurance industry, with its large datasets, is a natural place to u...
research
09/19/2023

Striking a Balance: An Optimal Mechanism Design for Heterogenous Differentially Private Data Acquisition for Logistic Regression

We investigate the problem of performing logistic regression on data col...
