Detecting organized eCommerce fraud using scalable categorical clustering

10/10/2019
by Samuel Marchal, et al.

Online retail (eCommerce) frequently falls victim to fraud conducted by malicious customers (fraudsters) who obtain goods or services through deception. Fraud coordinated by groups of professional fraudsters who place several fraudulent orders to maximize their gain is referred to as organized fraud. Existing approaches to fraud detection typically analyze orders in isolation, so they are not effective at identifying groups of fraudulent orders linked to organized fraud. They also wrongly identify many legitimate orders as fraud, which hinders their use for automated fraud cancellation. We introduce a novel solution to detect organized fraud by analyzing orders in bulk. Our approach is based on clustering and aims to group together fraudulent orders placed by the same group of fraudsters. It selectively uses two existing techniques, agglomerative clustering and sampling, to recursively group orders into small clusters in a reasonable amount of time. We assess our clustering technique on real-world orders placed on the Zalando website, the largest online apparel retailer in Europe. Our clustering processes 100,000s of orders in a few hours and groups 35-45% of fraudulent orders together. We propose a simple technique built on top of our clustering that detects 26.2% of fraud while raising false alarms for only 0.1% of legitimate orders.
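The abstract's general idea of combining sampling with agglomerative clustering to recursively split categorical orders into small clusters can be illustrated with a minimal sketch. This is not the authors' algorithm: the Hamming distance, the single-linkage merge rule, the nearest-representative assignment, and all names and parameters (`hamming`, `agglomerate`, `recursive_cluster`, `threshold`, `sample_size`, `toy_orders`) are assumptions chosen for illustration only.

```python
import random
from collections import defaultdict


def hamming(a, b):
    """Fraction of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def agglomerate(records, ids, threshold):
    """Single-linkage agglomerative clustering cut at `threshold`.

    Quadratic in len(ids), so it is only applied to small sets.
    """
    parent = {i: i for i in ids}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for pos, i in enumerate(ids):
        for j in ids[pos + 1:]:
            if hamming(records[i], records[j]) <= threshold:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in ids:
        groups[find(i)].append(i)
    return list(groups.values())


def recursive_cluster(records, ids, threshold=0.34, sample_size=50, rng=None):
    """Recursively partition `ids` (a list) using clusters found on a random sample."""
    rng = rng or random.Random(0)
    if len(ids) <= sample_size:
        # Small enough: cluster directly.
        return agglomerate(records, ids, threshold)

    # Cluster a random sample and keep one representative per sample cluster.
    sample = rng.sample(ids, sample_size)
    reps = [group[0] for group in agglomerate(records, sample, threshold)]

    # Assign every record to its nearest representative.
    parts = defaultdict(list)
    for i in ids:
        nearest = min(reps, key=lambda r: hamming(records[i], records[r]))
        parts[nearest].append(i)

    if len(parts) == 1:
        return [ids]  # the sample could not split this set any further

    clusters = []
    for part in parts.values():
        clusters.extend(
            recursive_cluster(records, part, threshold, sample_size, rng)
        )
    return clusters


# Hypothetical toy orders: (country, payment method, item category).
toy_orders = [
    ("DE", "card", "shoes"),
    ("DE", "card", "shoes"),
    ("DE", "card", "boots"),
    ("FI", "invoice", "coat"),
    ("FI", "invoice", "coat"),
    ("FI", "invoice", "coat"),
]
print(recursive_cluster(toy_orders, list(range(len(toy_orders)))))
```

In this sketch the quadratic clustering step only ever runs on at most `sample_size` records at a time, which is what keeps the cost manageable as the number of orders grows. The price is that the result depends on which records are sampled.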
