
Latent Semantic Structure in Malicious Programs

by John Musgrave, et al.
University of Cincinnati
United States Air Force

Latent Semantic Analysis (LSA) is a matrix decomposition method for discovering topics and topic weights in natural language documents. This study applies LSA to the binaries of malicious programs. Decomposing the term frequency vectors yields a set of topics, each topic being a composition of terms. The vectors and topics were evaluated quantitatively using a spatial representation. This semantic analysis provides a more abstract representation of a program than the term frequencies from which it is derived. We use a metric space to represent a program as a collection of vectors, and a distance metric to evaluate their similarity within a topic. Segmenting the vectors in this dataset provides increased resolution into the program structure.
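The pipeline described in the abstract (term frequency vectors, a matrix decomposition into topics, and a distance metric over the resulting vectors) can be illustrated with a small sketch. This is not the authors' implementation. It assumes that terms are opcode mnemonics, that the decomposition is a truncated singular value decomposition, and that the distance metric is cosine distance; the abstract specifies none of these. The sample names and opcode sequences are hypothetical.

```python
from collections import Counter

import numpy as np

# Hypothetical opcode sequences standing in for disassembled binaries.
# These are illustrative only and are not taken from the paper's dataset.
programs = {
    "sample_a": "push mov call pop ret push mov call".split(),
    "sample_b": "push mov call pop ret mov call".split(),
    "sample_c": "xor xor add jmp xor add jmp cmp".split(),
}


def term_frequency_matrix(docs):
    """Build a (documents x terms) matrix of raw term counts."""
    vocab = sorted({term for terms in docs.values() for term in terms})
    rows = []
    for terms in docs.values():
        counts = Counter(terms)
        rows.append([counts[term] for term in vocab])
    return vocab, np.array(rows, dtype=float)


def lsa(tf, k):
    """Truncated SVD of the term frequency matrix (assumed LSA variant).

    Returns document coordinates in the k-dimensional topic space and
    the k topics, each expressed as a vector of weights over terms.
    """
    u, s, vt = np.linalg.svd(tf, full_matrices=False)
    doc_topics = u[:, :k] * s[:k]   # one row per program
    topic_terms = vt[:k]            # one row per topic
    return doc_topics, topic_terms


def cosine_distance(a, b):
    """Cosine distance, an assumed choice of metric for this sketch."""
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


vocab, tf = term_frequency_matrix(programs)
doc_topics, topic_terms = lsa(tf, k=2)

# Programs built from the same terms should land close together in
# topic space; programs with disjoint terms should land far apart.
d_ab = cosine_distance(doc_topics[0], doc_topics[1])
d_ac = cosine_distance(doc_topics[0], doc_topics[2])
```

In this toy data, `sample_a` and `sample_b` share a vocabulary and `sample_c` shares none of it, so `d_ab` comes out much smaller than `d_ac`. Each row of `topic_terms` gives the term weights that make up one topic.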



