A Crawler Architecture for Harvesting the Clear, Social, and Dark Web for IoT-Related Cyber-Threat Intelligence

09/14/2021
by Paris Koloveas, et al.

The clear, social, and dark web have lately been identified as rich sources of valuable cyber-security information that, given the appropriate tools and methods, may be identified, crawled, and subsequently turned into actionable cyber-threat intelligence. In this work, we focus on the information gathering task and present a novel crawling architecture for transparently harvesting data from security websites in the clear web, security forums in the social web, and hacker forums/marketplaces in the dark web. The proposed architecture adopts a two-phase approach to data harvesting. In the first phase, a machine learning-based crawler directs the harvesting towards websites of interest. In the second phase, state-of-the-art statistical language modelling techniques represent the harvested information in a latent low-dimensional feature space and rank it by its potential relevance to the task at hand. The proposed architecture is realised using exclusively open-source tools, and a preliminary evaluation with crowdsourced results demonstrates its effectiveness.
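
The two-phase idea in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the in-memory `TOY_WEB`, the keyword-based `relevance` scorer (standing in for the machine learning-based crawler's classifier), and the hashed bag-of-words `embed` function (standing in for a statistical language model) are all hypothetical simplifications chosen so the example runs with the Python standard library only.

```python
import math
import zlib
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "seed": ("security news index", ["iot-vuln", "recipes"]),
    "iot-vuln": ("iot botnet exploit targets router firmware vulnerability", ["forum"]),
    "recipes": ("pasta recipe with tomato and basil", ["desserts"]),
    "desserts": ("chocolate cake recipe", []),
    "forum": ("forum thread on iot camera exploit and malware", []),
}

# Assumed topic vocabulary; a real system would learn relevance from data.
TOPIC_TERMS = {"iot", "exploit", "vulnerability", "malware",
               "botnet", "firmware", "security"}
DIM = 16  # size of the toy low-dimensional feature space


def relevance(text):
    """Stand-in for a learned page classifier: fraction of on-topic tokens."""
    tokens = text.split()
    return sum(t in TOPIC_TERMS for t in tokens) / len(tokens) if tokens else 0.0


def focused_crawl(seed, threshold=0.2):
    """Phase 1: breadth-first crawl that only expands links of relevant pages."""
    queue, seen, harvested = deque([seed]), {seed}, {}
    while queue:
        url = queue.popleft()
        text, links = TOY_WEB[url]
        if relevance(text) < threshold:
            continue  # off-topic page: neither stored nor expanded
        harvested[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return harvested


def embed(text):
    """Stand-in embedding: hash each token into one of DIM buckets."""
    vec = [0.0] * DIM
    for tok in text.split():
        vec[zlib.crc32(tok.encode("utf-8")) % DIM] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank(harvested, query):
    """Phase 2: order harvested pages by similarity to the topic query."""
    q = embed(query)
    scored = [(cosine(embed(text), q), url) for url, text in harvested.items()]
    return sorted(scored, reverse=True)


harvested = focused_crawl("seed")
ranking = rank(harvested, " ".join(sorted(TOPIC_TERMS)))
for score, url in ranking:
    print(f"{score:.3f}  {url}")
```

In this toy run the crawler never reaches the `desserts` page, because its only inbound link comes from an off-topic page that is not expanded. That pruning is the point of directing the harvest in the first phase, while the second phase only orders what was kept.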

research
06/28/2023

Can Twitter be used to Acquire Reliable Alerts against Novel Cyber Attacks?

Time-relevant and accurate threat information from public domains are es...
research
03/28/2021

Data-Driven Threat Hunting Using Sysmon

Threat actors can be persistent, motivated and agile, and leverage a div...
research
07/28/2016

Darknet and Deepnet Mining for Proactive Cybersecurity Threat Intelligence

In this paper, we present an operational system for cyber threat intelli...
research
08/25/2022

Automatic Mapping of Unstructured Cyber Threat Intelligence: An Experimental Study

Proactive approaches to security, such as adversary emulation, leverage ...
research
09/27/2018

Identification of Wearable Devices with Bluetooth

With wearable devices such as smartwatches on the rise in the consumer e...
research
02/08/2021

Generating Fake Cyber Threat Intelligence Using Transformer-Based Models

Cyber-defense systems are being developed to automatically ingest Cyber ...
research
07/19/2018

Using Deep Neural Networks to Translate Multi-lingual Threat Intelligence

The multilingual nature of the Internet increases complications in the c...
