Classifying Sequences of Extreme Length with Constant Memory Applied to Malware Detection

12/17/2020
by   Edward Raff, et al.
14

Recent works within machine learning have been tackling inputs of ever-increasing size, with cybersecurity presenting sequence classification problems of particularly extreme lengths. In the case of Windows executable malware detection, inputs may exceed 100 MB, which corresponds to a time series with T=100,000,000 steps. To date, the closest approach to handling such a task is MalConv, a convolutional neural network capable of processing up to T=2,000,000 steps. The 𝒪(T) memory of CNNs has prevented further application of CNNs to malware. In this work, we develop a new approach to temporal max pooling that makes the required memory invariant to the sequence length T. This makes MalConv 116× more memory efficient, and up to 25.8× faster to train on its original dataset, while removing the input length restrictions to MalConv. We re-invest these gains into improving the MalConv architecture by developing a new Global Channel Gating design, giving us an attention mechanism capable of learning feature interactions across 100 million time steps in an efficient manner, a capability lacked by the original MalConv CNN. Our implementation can be found at https://github.com/NeuromorphicComputationResearchProgram/MalConv2

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset