InfoBehavior: Self-supervised Representation Learning for Ultra-long Behavior Sequence via Hierarchical Grouping

06/13/2021
by Runshi Liu, et al.

E-commerce companies must contend with abnormal sellers who offer potentially risky products. Typically, such risk can be identified by jointly considering product content (e.g., title and image) and seller behavior. This work focuses on behavior feature extraction, since behavior sequences reflect sellers' operating habits and thus provide valuable clues for risk discovery. Traditional feature extraction techniques depend heavily on domain experts and adapt poorly to new tasks. In this paper, we propose InfoBehavior, a self-supervised method that automatically extracts meaningful representations from ultra-long raw behavior sequences, replacing the costly feature selection procedure. InfoBehavior uses a bidirectional Transformer as its feature encoder because of its excellent capability in modeling long-term dependencies. However, such an encoder is intractable on commodity GPUs because the time and memory required by the Transformer grow quadratically with sequence length. We therefore propose a hierarchical grouping strategy that aggregates ultra-long raw behavior sequences into high-level embedding sequences of processable length. Moreover, we introduce two types of pretext tasks. The sequence-related pretext task defines a contrastive training objective: correctly select the masked-out coarse-grained/fine-grained behavior sequence against other "distractor" behavior sequences. The domain-related pretext task defines a classification training objective: correctly predict domain-specific statistics of anomalous behavior. We show that behavior representations from the pre-trained InfoBehavior can be used directly, or combined with features from other side information, to support a wide range of downstream tasks. Experimental results demonstrate that InfoBehavior significantly improves the performance of Product Risk Management and Intellectual Property Protection.
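As a rough illustration of the hierarchical grouping strategy described above, the sketch below (assuming PyTorch; the class name, the mean-pooling aggregation, and all hyperparameters are illustrative assumptions, not the authors' implementation) embeds raw behavior events, pools fixed-size groups into one embedding each, and only then applies a bidirectional Transformer encoder, so the quadratic attention cost is paid over the shortened sequence.

```python
# Minimal sketch of hierarchical grouping for ultra-long behavior
# sequences. All names and sizes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalGroupingEncoder(nn.Module):  # hypothetical name
    def __init__(self, vocab_size=10_000, d_model=128, group_size=64,
                 n_heads=4, n_layers=2):
        super().__init__()
        self.group_size = group_size
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, behavior_ids):
        # behavior_ids: (batch, seq_len) raw behavior-event IDs;
        # seq_len may be tens of thousands of events.
        b, t = behavior_ids.shape
        pad = (-t) % self.group_size
        if pad:  # right-pad so the sequence splits into whole groups
            behavior_ids = F.pad(behavior_ids, (0, pad))
        x = self.embed(behavior_ids)                      # (b, t', d)
        # Mean-pool each fixed-size group of events into one high-level
        # embedding, shrinking the sequence by a factor of group_size
        # before the quadratic-cost self-attention is applied.
        x = x.view(b, -1, self.group_size, x.size(-1)).mean(dim=2)
        return self.encoder(x)                            # (b, t'/g, d)

# A 16,384-event sequence becomes 256 group embeddings.
encoder = HierarchicalGroupingEncoder()
ids = torch.randint(0, 10_000, (2, 16_384))
print(encoder(ids).shape)  # torch.Size([2, 256, 128])
```

The sequence-related pretext task can likewise be sketched as an InfoNCE-style contrastive loss, a common way to realize "select the true masked-out sequence against distractors"; the paper's exact objective may differ.

```python
# Sketch of a contrastive pretext objective over in-batch distractors.
import torch
import torch.nn.functional as F

def contrastive_loss(predicted, target, temperature=0.1):
    # predicted, target: (batch, d). Row i of `target` is the positive
    # (true masked-out span) for row i of `predicted`; all other rows
    # in the batch act as "distractor" behavior sequences.
    predicted = F.normalize(predicted, dim=-1)
    target = F.normalize(target, dim=-1)
    logits = predicted @ target.t() / temperature   # (batch, batch)
    labels = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)
```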


Related research

07/20/2022
Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning
Despite the success of fully-supervised human skeleton sequence modeling...

02/14/2022
UserBERT: Modeling Long- and Short-Term User Preferences via Self-Supervision
E-commerce platforms generate vast amounts of customer behavior data, su...

07/01/2022
e-CLIP: Large-Scale Vision-Language Representation Learning in E-commerce
Understanding vision and language representations of product content is ...

12/05/2020
Self-Supervised Visual Representation Learning from Hierarchical Grouping
We create a framework for bootstrapping visual representation learning f...

03/15/2023
Relax, it doesn't matter how you get there: A new self-supervised approach for multi-timescale behavior analysis
Natural behavior consists of dynamics that are complex and unpredictable...

06/17/2021
Efficient Self-supervised Vision Transformers for Representation Learning
This paper investigates two techniques for developing efficient self-sup...

08/11/2016
Sequence Graph Transform (SGT): A Feature Extraction Function for Sequence Data Mining (Extended Version)
The ubiquitous presence of sequence data across fields such as the web, ...
