Memory-Efficient Deep Learning Inference in Trusted Execution Environments

04/30/2021
by Jean-Baptiste Truong, et al.

This study identifies and proposes techniques to alleviate two key bottlenecks to executing deep neural networks in trusted execution environments (TEEs): page thrashing during the execution of convolutional layers and the decryption of large weight matrices in fully-connected layers. For the former, we propose a novel partitioning scheme, y-plane partitioning, designed to (i) provide consistent execution time when the layer output is large compared to the TEE secure memory; and (ii) significantly reduce the memory footprint of convolutional layers. For the latter, we leverage quantization and compression. In our evaluation, the proposed optimizations incurred latency overheads ranging from 1.09X to 2X relative to baseline for a wide range of TEE sizes; in contrast, an unmodified implementation incurred latencies of up to 26X when run inside the TEE.
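The paper itself does not include code; the following is a minimal NumPy sketch of the idea behind partitioning a convolution along the y (row) axis of the output. Function names, the `band` parameter, and the single-channel/"valid" setting are our illustrative assumptions, not the paper's implementation. Each band touches only the input rows it needs, so the working set stays bounded regardless of how large the full layer output is.

```python
import numpy as np

def conv2d_full(x, w):
    # Reference: plain "valid" 2D convolution over the whole input.
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def conv2d_y_banded(x, w, band=4):
    # Illustrative y-axis partitioning: compute the output in horizontal
    # bands of at most `band` rows, slicing out only the input rows each
    # band requires. A real TEE implementation would additionally manage
    # which slabs reside in secure memory at any time.
    kh, _ = w.shape
    oh = x.shape[0] - kh + 1
    bands = []
    for y0 in range(0, oh, band):
        y1 = min(y0 + band, oh)
        # Input rows y0 .. y1 + kh - 2 suffice for output rows y0 .. y1 - 1.
        bands.append(conv2d_full(x[y0:y1 + kh - 1], w))
    return np.vstack(bands)
```

Because each band is independent, the per-band memory footprint is roughly `(band + kh - 1)` input rows plus `band` output rows, rather than the full feature map.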

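For the fully-connected-layer bottleneck, the abstract names quantization and compression. As a rough sketch only (the function names, int8 choice, and use of `zlib` are our assumptions, not the paper's pipeline), the point is that a quantized-then-compressed weight blob is much smaller than raw float32 weights, so less ciphertext has to be decrypted inside the enclave:

```python
import zlib
import numpy as np

def pack_weights(w, bits=8):
    # Symmetric linear quantization to signed `bits`-bit integers,
    # followed by lossless compression of the quantized bytes.
    qmax = 2 ** (bits - 1) - 1
    scale = float(np.abs(w).max()) / qmax
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    blob = zlib.compress(q.tobytes())
    return blob, scale, w.shape

def unpack_weights(blob, scale, shape):
    # Inverse: decompress, reinterpret as int8, and dequantize.
    q = np.frombuffer(zlib.decompress(blob), dtype=np.int8).reshape(shape)
    return q.astype(np.float32) * scale
```

The dequantization error is bounded by half a quantization step (scale / 2), which is the usual trade-off this kind of scheme accepts in exchange for a smaller decryption workload.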
Related research

- Memory-Aware Fusing and Tiling of Neural Networks for Accelerated Edge Inference (07/14/2021): A rising research challenge is running costly machine learning (ML) netw...
- Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware (06/08/2018): As Machine Learning (ML) gets applied to security-critical or sensitive ...
- Introducing Fast and Secure Deterministic Stash Free Write Only Oblivious RAMs for Demand Paging in Keystone (06/18/2021): Keystone is a trusted execution environment, based on RISC-V architectur...
- DarkneTZ: Towards Model Privacy at the Edge using Trusted Execution Environments (04/12/2020): We present DarkneTZ, a framework that uses an edge device's Trusted Exec...
- Towards Characterizing and Limiting Information Exposure in DNN Layers (07/13/2019): Pre-trained Deep Neural Network (DNN) models are increasingly used in sm...
- TASO: Time and Space Optimization for Memory-Constrained DNN Inference (05/21/2020): Convolutional neural networks (CNNs) are used in many embedded applicati...
- Memory Planning for Deep Neural Networks (02/23/2022): We study memory allocation patterns in DNNs during inference, in the con...
