MLPdf: An Effective Machine Learning Based Approach for PDF Malware Detection

08/21/2018
by Jason Zhang, et al.

Due to the popularity of the portable document format (PDF) and the increasing number of vulnerabilities in major PDF viewer applications, malware writers continue to use it to deliver malware via web downloads, email attachments and other methods in both targeted and non-targeted attacks. How to block malicious PDF documents effectively has attracted substantial research interest in both the cyber security industry and academia, with no sign of slowing down. In this paper, we propose a novel approach based on a multilayer perceptron (MLP) neural network model, termed MLPdf, for the detection of PDF-based malware. More specifically, the MLPdf model is trained with the backpropagation algorithm, using stochastic gradient descent to update the model. A set of high-quality features is extracted from two real-world datasets which together comprise around 105,000 benign and malicious PDF documents. Evaluation results indicate that the proposed MLPdf approach significantly outperforms all eight well-known commercial anti-virus scanners evaluated, with a much higher true positive rate of 95.12% and a low false positive rate of 0.08%.
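As a rough illustration of the training scheme the abstract names (an MLP updated by backpropagation with stochastic gradient descent, then scored by true and false positive rates), the following is a minimal sketch only. It is not the authors' MLPdf implementation: the class `TinyMLP`, the single hidden layer of four sigmoid units, the learning rate, and the two synthetic features drawn from Gaussians are all assumptions made for this example, and the numbers it prints say nothing about real PDF malware.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """Hypothetical one-hidden-layer perceptron with a single sigmoid output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def sgd_step(self, x, target, lr):
        """One stochastic gradient descent update from a single sample."""
        h, y = self.forward(x)
        # Cross-entropy loss with a sigmoid output: dL/dz_out = y - target.
        delta_out = y - target
        # Backpropagate the error through the (not yet updated) output weights.
        delta_hidden = [delta_out * w * hi * (1.0 - hi)
                        for w, hi in zip(self.w2, h)]
        for j, hj in enumerate(h):
            self.w2[j] -= lr * delta_out * hj
        self.b2 -= lr * delta_out
        for j, dj in enumerate(delta_hidden):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dj * xi
            self.b1[j] -= lr * dj

    def predict(self, x):
        return 1 if self.forward(x)[1] >= 0.5 else 0


def true_and_false_positive_rates(y_true, y_pred):
    """Return (TPR, FPR), where label 1 means malicious and 0 means benign."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    positives = sum(y_true)
    negatives = len(y_true) - positives
    return tp / positives, fp / negatives


def make_toy_data(n_per_class, rng):
    """Synthetic stand-in data: two made-up features per 'document'."""
    data = []
    for _ in range(n_per_class):
        data.append(([rng.gauss(0.2, 0.1), rng.gauss(0.2, 0.1)], 0))  # benign
        data.append(([rng.gauss(0.8, 0.1), rng.gauss(0.8, 0.1)], 1))  # malicious
    return data


rng = random.Random(42)
train = make_toy_data(100, rng)
test = make_toy_data(100, rng)

model = TinyMLP(n_in=2, n_hidden=4, seed=1)
for epoch in range(50):
    rng.shuffle(train)  # visit samples in a fresh random order each epoch
    for x, label in train:
        model.sgd_step(x, label, lr=0.5)

y_true = [label for _, label in test]
y_pred = [model.predict(x) for x, _ in test]
tpr, fpr = true_and_false_positive_rates(y_true, y_pred)
print(f"TPR={tpr:.4f} FPR={fpr:.4f}")
```

The sketch updates the weights after every sample, which is the plain per-sample form of stochastic gradient descent; the abstract does not say whether MLPdf uses per-sample or mini-batch updates.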

research
02/10/2019

Machine Learning With Feature Selection Using Principal Component Analysis for Malware Detection: A Case Study

Cyber security threats have been growing significantly in both volume an...
research
01/17/2019

Easy to Fool? Testing the Anti-evasion Capabilities of PDF Malware Scanners

Malware scanners try to protect users from opening malicious documents b...
research
04/22/2018

MEADE: Towards a Malicious Email Attachment Detection Engine

Malicious email attachments are a growing delivery vector for malware. W...
research
07/27/2021

PDF-Malware: An Overview on Threats, Detection and Evasion Attacks

In the recent years, Portable Document Format, commonly known as PDF, ha...
research
03/25/2019

Capturing the symptoms of malicious code in electronic documents by file's entropy signal combined with Machine learning

Abstract-Email cyber-attacks based on malicious documents have become th...
research
01/04/2021

Echelon: Two-Tier Malware Detection for Raw Executables to Reduce False Alarms

Existing malware detection approaches suffer from a simplistic trade-off...
research
10/29/2021

A Comprehensive Study on Learning-Based PE Malware Family Classification Methods

Driven by the high profit, Portable Executable (PE) malware has been con...
