DeepTLS: comprehensive and high-performance feature extraction for encrypted traffic

08/08/2022
by Zhi Liu, et al.

Feature extraction is critical for TLS traffic analysis using machine learning techniques, but it is also difficult and time-consuming, requiring substantial engineering effort. We designed and implemented DeepTLS, a system that extracts a full spectrum of features from pcaps, spanning meta, statistical, SPLT, byte distribution, TLS header, and certificate features. The backend is written in C++ for high performance and can analyze a GB-size pcap in a few minutes. DeepTLS was thoroughly evaluated against two state-of-the-art tools, Joy and Zeek, on four well-known malicious traffic datasets comprising 160 pcaps. The evaluation results show that DeepTLS has an advantage on large pcaps, analyzing them in half the time, and that it identifies more certificates, with an acceptable performance loss compared with Joy. DeepTLS can significantly accelerate a machine learning pipeline by reducing feature extraction time from hours or even days to minutes. The system is online at https://deeptls.com, where test artifacts can be viewed and validated. In addition, two open source tools, Pysharkfeat and Tlsfeatmark, have been released.
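
To make two of the feature families named above concrete, here is a minimal Python sketch of a byte distribution and an SPLT (sequence of packet lengths and times) computed for a single flow. It is an illustration only, not the DeepTLS C++ backend or the Pysharkfeat code. The `Packet` tuple layout, the function names, and the `max_packets` cutoff are all assumptions made for the example, and the input is assumed to be already-parsed packets rather than a raw pcap.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical input format: (capture timestamp in seconds, payload bytes).
Packet = Tuple[float, bytes]


def byte_distribution(packets: Sequence[Packet]) -> List[float]:
    """Return the relative frequency of each byte value (0..255) over all payloads."""
    counts: Counter = Counter()
    for _, payload in packets:
        counts.update(payload)  # iterating over bytes yields ints in 0..255
    total = sum(counts.values())
    if total == 0:
        return [0.0] * 256
    return [counts[value] / total for value in range(256)]


def splt(packets: Sequence[Packet], max_packets: int = 10) -> Tuple[List[int], List[float]]:
    """Return payload lengths and inter-arrival times for the first max_packets packets."""
    lengths: List[int] = []
    gaps: List[float] = []
    previous_ts = None
    for ts, payload in packets[:max_packets]:
        lengths.append(len(payload))
        # The first packet has no predecessor, so its gap is defined as 0.0 here.
        gaps.append(0.0 if previous_ts is None else ts - previous_ts)
        previous_ts = ts
    return lengths, gaps


if __name__ == "__main__":
    flow = [(0.0, b"\x16\x03\x01"), (0.5, b"\x16\x03"), (1.25, b"")]
    dist = byte_distribution(flow)
    lengths, gaps = splt(flow)
    print(dist[0x16], lengths, gaps)
```

Both functions return fixed, simple structures (a 256-element list and two parallel lists), which is one common way to turn a variable-length flow into input for a machine learning model. A real extractor would also need flow reassembly and a policy for packet direction, which this sketch leaves out.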


