Reliable Detection of Compressed and Encrypted Data

03/31/2021
by Fabio De Gaspari, et al.

Several cybersecurity domains, such as ransomware detection, forensics, and data analysis, require methods to reliably identify encrypted data fragments. Current approaches typically rely on statistics derived from the byte-level distribution, such as entropy estimates, to identify encrypted fragments. However, modern content types use compression techniques that alter the byte distribution, pushing it closer to uniform. As a result, current approaches detect encryption unreliably when compressed data appears in the dataset. Furthermore, proposed approaches are typically evaluated on only a few data types and fragment sizes, making it hard to assess their practical applicability. This paper compares existing statistical tests on a large, standardized dataset and shows that current approaches consistently fail to distinguish encrypted from compressed data at both small and large fragment sizes. We address these shortcomings and design EnCoD, a learning-based classifier that can reliably distinguish compressed from encrypted data. We evaluate EnCoD on a dataset of 16 different file types and fragment sizes ranging from 512B to 8KB. Our results show that EnCoD outperforms current approaches by a wide margin, with accuracy ranging from about 82% for 512B fragments up to about 92% for 8KB fragments. Moreover, EnCoD can pinpoint the exact format of a given data fragment, rather than performing only binary classification as previous approaches do.
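
The entropy-based baseline that the abstract refers to can be illustrated with a minimal sketch. This is not EnCoD and not the paper's evaluation code; it only computes Shannon entropy over byte frequencies for a single fragment. The helper names `byte_entropy` and `sample_fragments`, the synthetic word list, and the use of `os.urandom` as a stand-in for ciphertext are all assumptions made for illustration.

```python
import math
import os
import random
import zlib
from collections import Counter


def byte_entropy(fragment: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not fragment:
        return 0.0
    total = len(fragment)
    counts = Counter(fragment)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def sample_fragments(size: int = 8192) -> dict[str, bytes]:
    """Build three hypothetical fragments of equal size for comparison.

    Assumptions: the 'plain' data is synthetic text drawn from a small
    vocabulary, and os.urandom stands in for encrypted data.
    """
    rng = random.Random(0)  # fixed seed so the synthetic text is reproducible
    vocab = [b"alpha", b"bravo", b"charlie", b"delta", b"echo", b"foxtrot",
             b"golf", b"hotel", b"india", b"juliet", b"kilo", b"lima",
             b"mike", b"november", b"oscar", b"papa"]
    plain = b" ".join(rng.choice(vocab) for _ in range(60000))
    return {
        "plain": plain[:size],
        "compressed": zlib.compress(plain, 9)[:size],
        "random": os.urandom(size),
    }


if __name__ == "__main__":
    for name, frag in sample_fragments().items():
        print(f"{name:10s} {byte_entropy(frag):.3f} bits/byte")
```

Running the script prints one entropy value per fragment. On inputs like these, the compressed and random fragments would be expected to land close together near the 8 bits/byte ceiling, while plain text sits far lower, which is the overlap that makes a single entropy threshold a weak discriminator between compressed and encrypted data.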

research
10/15/2020

EnCoD: Distinguishing Compressed and Encrypted File Fragments

Reliable identification of encrypted file fragments is a requirement for...
research
11/22/2018

PE-AONT: Partial Encryption combined with an All-or-Nothing Transform

In this report, we introduce PE-AONT: a novel algorithm for fast and sec...
research
03/30/2023

Differential Area Analysis for Ransomware: Attacks, Countermeasures, and Limitations

Crypto-ransomware attacks have been a growing threat over the last few y...
research
10/24/2022

Comparison of Entropy Calculation Methods for Ransomware Encrypted File Identification

Ransomware is a malicious class of software that utilises encryption to ...
research
05/28/2019

HEDGE: Efficient Traffic Classification of Encrypted and Compressed Packets

As the size and source of network traffic increase, so does the challeng...
research
06/28/2021

Differential Area Analysis for Ransomware Attack Detection within Mixed File Datasets

The threat from ransomware continues to grow both in the number of affec...
