PhishOut: Effective Phishing Detection Using Selected Features

04/21/2020
by Suhail Paliath, et al.

Phishing emails are the first step in many of today's attacks. They come with a simple hyperlink, a request for action, or a full replica of an existing service or website. The goal is generally to trick users into voluntarily giving away sensitive information such as login credentials. Many approaches and applications have been proposed and developed to catch and filter phishing emails. However, the problem still lacks a complete and comprehensive solution. In this paper, we apply knowledge discovery principles ranging from data cleansing, integration, selection, aggregation, and data mining to knowledge extraction. We study feature effectiveness based on Information Gain and contribute two new features to the literature. We compare six machine-learning approaches for detecting phishing based on a small number of carefully chosen features. We calculate false positives, false negatives, mean absolute error, recall, precision, and F-measure, and achieve very low false positive and false negative rates. Naïve Bayes has the lowest true positive rate, and overall Neural Networks hold the most promise for accurate phishing detection, with an accuracy of 99.4%.
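As a rough illustration of two quantities the abstract names, the sketch below computes Information Gain for a discrete feature and precision, recall, and F-measure from predicted labels. It is a minimal textbook version, not the authors' pipeline: the function names (`entropy`, `information_gain`, `precision_recall_f1`), the toy feature values, and the convention that label 1 means "phishing" are all assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the entropy remaining after
    splitting the samples by the value of one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F-measure for the positive class.
    Assumption for this example: label 1 marks a phishing email."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy data, invented for illustration only.
labels = [1, 1, 0, 0]
informative = [1, 1, 0, 0]    # hypothetical feature that splits classes cleanly
uninformative = [1, 0, 1, 0]  # hypothetical feature unrelated to the class

gain_informative = information_gain(informative, labels)
gain_uninformative = information_gain(uninformative, labels)
metrics = precision_recall_f1(labels, [1, 0, 1, 0])
```

On this toy data the feature that separates the classes perfectly has the highest possible gain for a balanced binary label, while the unrelated feature has none, which is the kind of ranking a feature-selection step based on Information Gain relies on.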


