Look Before You Leap: Detecting Phishing Web Pages by Exploiting Raw URL And HTML Characteristics

11/06/2020
by Chidimma Opara, et al.

Cybercriminals resort to phishing as a simple and cost-effective way to perpetrate cyber-attacks on today's Internet. Recent studies in phishing detection increasingly adopt automated feature selection over traditional, manually engineered features, because existing traditional methods fail to generalize what they have learned to new data. To this end, we propose WebPhish, a deep learning technique that automatically extracts features from the raw URL and HTML of a web page. This approach is the first of its kind to feed the concatenation of URL and HTML embedding feature vectors into a Convolutional Neural Network model to detect phishing web pages. Extensive experiments on a real-world dataset yielded an accuracy of 98 percent, outperforming other state-of-the-art techniques. WebPhish is also a client-side strategy that is completely language-independent: it performs lightweight phishing detection regardless of the web page's textual language.
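The data flow described in the abstract (embed the raw URL and the raw HTML, concatenate the embeddings, pass the result through a convolutional network) can be illustrated with a minimal, untrained forward pass. This is only a sketch and not the authors' architecture: the character-level encoding, the vocabulary size, the input lengths, the single convolution layer, the concatenation along the sequence axis, and every function name below are assumptions chosen to keep the example dependency-free.

```python
import math
import random

# All sizes below are illustrative assumptions, not values from the paper.
VOCAB_SIZE = 128            # ASCII code points; id 0 doubles as padding
EMBED_DIM = 8               # width of each embedding vector
URL_LEN, HTML_LEN = 32, 64  # fixed input lengths after truncation/padding
KERNEL, FILTERS = 3, 4      # one small 1D convolution layer


def encode(text, length):
    """Map characters to integer ids, truncating or zero-padding to `length`."""
    # Non-ASCII characters share the out-of-vocabulary id 1.
    ids = [ord(c) if ord(c) < VOCAB_SIZE else 1 for c in text[:length]]
    return ids + [0] * (length - len(ids))


def make_params(seed=0):
    """Random, untrained weights; a real model would learn these from data."""
    rng = random.Random(seed)

    def rand():
        return rng.uniform(-0.1, 0.1)

    def table():
        return [[rand() for _ in range(EMBED_DIM)] for _ in range(VOCAB_SIZE)]

    return {
        "url_emb": table(),
        "html_emb": table(),
        "conv": [[[rand() for _ in range(EMBED_DIM)] for _ in range(KERNEL)]
                 for _ in range(FILTERS)],
        "out_w": [rand() for _ in range(FILTERS)],
        "out_b": 0.0,
    }


def forward(url, html, params):
    """Embed URL and HTML separately, concatenate, convolve, pool, classify."""
    url_vecs = [params["url_emb"][i] for i in encode(url, URL_LEN)]
    html_vecs = [params["html_emb"][i] for i in encode(html, HTML_LEN)]
    seq = url_vecs + html_vecs  # assumed: concatenation along the sequence axis

    pooled = []
    for filt in params["conv"]:
        best = 0.0  # starting at 0 gives ReLU followed by global max pooling
        for start in range(len(seq) - KERNEL + 1):
            act = sum(filt[k][d] * seq[start + k][d]
                      for k in range(KERNEL) for d in range(EMBED_DIM))
            best = max(best, act)
        pooled.append(best)

    logit = sum(w * p for w, p in zip(params["out_w"], pooled)) + params["out_b"]
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid score in (0, 1)


params = make_params()
score = forward("http://example.test/login", "<html><form></form></html>", params)
print(f"untrained phishing score: {score:.3f}")
```

Because the weights are random, the printed score carries no meaning; the sketch only shows how two raw inputs can share one convolutional classifier without any hand-crafted or language-specific features.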

