Secure Medical Image Analysis with CrypTFlow

12/09/2020 ∙ by Javier Alvarez-Valle, et al.

We present CrypTFlow, a system that converts TensorFlow inference code into Secure Multi-party Computation (MPC) protocols at the push of a button. To do this, we build two components. Our first component is an end-to-end compiler from TensorFlow to a variety of MPC protocols. The second component is an improved semi-honest 3-party protocol that provides significant speedups for inference. We empirically demonstrate the power of our system by showing the secure inference of real-world neural networks such as DenseNet121 for detection of lung diseases from chest X-ray images and 3D-UNet for segmentation in radiotherapy planning using CT images. In particular, this paper provides the first evaluation of secure segmentation of 3D images, a task that requires much more powerful models than classification and is the largest secure inference task run to date.




1 Introduction

Secure multiparty computation (MPC) allows a set of mutually distrusting parties to compute a publicly known function on their secret inputs without revealing their inputs to each other. This is done through the execution of a cryptographic protocol which guarantees that the protocol participants learn only the function output on their secret inputs and nothing else. MPC has made rapid strides, from being a theoretical concept three decades ago yao; gmw, to now being on the threshold of having real-world impact. One of the most compelling use cases for MPC is machine learning (ML), e.g., secure ML inference when the model and the query are private inputs belonging to different parties. There has been a flurry of recent works aimed at running inference securely with MPC, such as SecureML secureml, MinioNN minionn, ABY3 aby3, CHET chet, SecureNN securenn, Gazelle gazelle, Delphi delphi, and so on. Unfortunately, these techniques are not easy for ML developers to use and have only been demonstrated on small deep neural networks (DNNs) on tiny datasets such as MNIST and CIFAR. For MPC to be truly ubiquitous for secure inference tasks, it must be both easy to use by developers with no background in cryptography and capable of scaling to the DNNs used in practice.

In this work, we present CrypTFlow, a system that converts TensorFlow tensorflow inference code into MPC protocols at the push of a button. By converting code in standard TensorFlow, a ubiquitous ML framework that is used in production by various technology companies, to MPC protocols, CrypTFlow significantly lowers the entry barrier for ML practitioners and programmers to use cryptographic MPC protocols in real-world applications. We make the following contributions:

First, for the developer frontend, we provide a compiler, called Athos, from TensorFlow to a variety of secure computation protocols (both 2 and 3 party) while preserving accuracy. The compiler is designed to be modular and it provides facilities for plugging in different MPC protocols. To demonstrate this modularity, we have integrated Athos with the following backends: ABY-based aby 2-party computation (2PC), SCI-based 2PC cryptflow2, Aramis-based malicious secure 3-party computation cryptflow, and Porthos-based semi-honest secure 3-party computation (3PC).

Second, for the cryptographic backend, we provide a semi-honest secure 3-party computation protocol, Porthos, that outperforms all prior protocols for secure inference and enables us to execute, for the first time, cryptographically secure inference of ImageNet scale networks. Prior work in the area of secure inference has been limited to small networks over tiny datasets such as MNIST or CIFAR. We have evaluated CrypTFlow on secure inference over DNNs that are at least an order of magnitude larger than the state-of-the-art delphi; chet; chameleon; securenn; secureml; gazelle; ezpc; minionn; aby3; nhe; xonn; nitin. Even on MNIST/CIFAR, Porthos has lower communication complexity and is more efficient than prior works securenn; aby3; chameleon.

Third, we demonstrate the ease-of-use, efficiency and scalability of CrypTFlow by evaluating on ResNet50 resnet for ImageNet classification, DenseNet121 densenet for detection of lung diseases from chest X-ray images and 3D-UNet unet for segmentation of raw 3D CT images.

Our toolchain is publicly available. This paper briefly reviews the original CrypTFlow paper cryptflow; its new contribution is the secure segmentation evaluation (Section 5.3).

2 Athos

Athos compiles TensorFlow inference code to secure computation protocols. The transformations implemented in Athos are sensitive to the performance of MPC protocols. For performance reasons all efficient secure computation protocols perform computation over fixed-point arithmetic - i.e., arithmetic over integers or arithmetic with fixed precision. Athos automatically converts TensorFlow code over floating-point values into code that computes the same function over fixed-point values. This compilation is done while matching the inference accuracy of floating-point code. Prior works (secureml; minionn; gazelle; aby3; securenn; delphi) in the area of running ML securely have performed this task by hand with significant losses in accuracy over floating-point code.

Athos represents a 32-bit floating-point number $r$ by a 64-bit integer $\lfloor r \cdot 2^{s} \rfloor$ for a precision or scale $s$. Then operations on 32-bit floating-point numbers are simulated by operations on 64-bit integers. For example, $r_1 \times r_2$ is simulated as $\lfloor (\lfloor r_1 \cdot 2^{s} \rfloor \times \lfloor r_2 \cdot 2^{s} \rfloor) / 2^{s} \rfloor$. A large $s$ causes integer overflows and a small $s$ leads to accuracy loss. To obtain a suitable scale $s$ (all variables have the same precision in Athos output), Athos works by “sweeping through” various precision levels to estimate the best precision.
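The trade-off between overflow and accuracy loss can be illustrated with a minimal Python sketch. It is not Athos code: the helper names (`wrap`, `encode`, `decode`, `fixed_mul`, `mul_error`), the candidate scales, and the test values are assumptions chosen for illustration. Athos's sweep estimates the best precision for a whole model, whereas this toy scores a single product.

```python
import math

BITS = 64  # fixed-point values are held in 64-bit integers


def wrap(v):
    """Reduce v to a signed 64-bit integer, mimicking two's-complement overflow."""
    v &= (1 << BITS) - 1
    return v - (1 << BITS) if v >= (1 << (BITS - 1)) else v


def encode(r, s):
    """Represent the real number r as floor(r * 2^s)."""
    return wrap(math.floor(r * (1 << s)))


def decode(v, s):
    """Map a scale-s fixed-point integer back to a float."""
    return v / (1 << s)


def fixed_mul(a, b, s):
    """Multiply two scale-s values; the raw product has scale 2s, so shift right by s."""
    return wrap(wrap(a * b) >> s)


def mul_error(r1, r2, s):
    """Absolute error of the simulated product at scale s."""
    approx = decode(fixed_mul(encode(r1, s), encode(r2, s), s), s)
    return abs(approx - r1 * r2)


# Toy "sweep": hypothetical candidate scales, scored on one product.
candidate_scales = [4, 12, 24, 40]
errors = {s: mul_error(0.1, 0.7, s) for s in candidate_scales}
best_scale = min(errors, key=errors.get)
print(errors, best_scale)
```

In this toy run, the smallest scale loses the result to rounding, while the largest scale overflows the 64-bit intermediate product, so an intermediate scale gives the lowest error.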


3 Porthos

Porthos is an improved semi-honest 3-party secure computation protocol (tolerating one corruption) that builds upon SecureNN securenn. Porthos makes two crucial modifications to SecureNN. First, SecureNN reduces convolutions to matrix multiplications and invokes the matrix multiplication protocol based on Beaver triples beaver. When performing a convolution with filter size $f \times f$ on a matrix of size $m \times m$, the communication is roughly $2q^2f^2 + 2f^2 + q^2$ elements in the ring $\mathbb{Z}_{2^{64}}$, where $q = m - f + 1$. Porthos computes these Beaver triples by appropriately reshaping the $m \times m$ and $f \times f$ matrices. This reduces the communication to roughly $2m^2 + 2f^2 + q^2$ ring elements. Typically the filter size, $f$, is between 1 and 11, and the communication of Porthos can be up to two orders of magnitude less than SecureNN. Second, in SecureNN, the protocols for non-linear layers (such as ReLU and MaxPool) require the third party to send secret shares to the first two parties. In Porthos, we halve this communication by eliminating the communication of one of these shares cryptflow. This reduces the communication in the overall ReLU and MaxPool protocols by 25%.
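For readers unfamiliar with Beaver triples, the following minimal Python sketch shows the textbook idea for a single scalar multiplication over additive shares modulo $2^{64}$. It is not the Porthos or SecureNN protocol: the function names are hypothetical, all parties are simulated in one process, and the helper party that deals the triple is modelled by ordinary local code.

```python
import secrets

MOD = 1 << 64  # ring Z_{2^64}


def share(x):
    """Split x into two additive shares modulo 2^64."""
    x0 = secrets.randbelow(MOD)
    return x0, (x - x0) % MOD


def reconstruct(s0, s1):
    """Recombine two additive shares."""
    return (s0 + s1) % MOD


def beaver_mul(x_sh, y_sh):
    """Return shares of x*y given shares of x and y, using one Beaver triple."""
    # Helper party: sample a random triple (a, b, c = a*b) and share it.
    a, b = secrets.randbelow(MOD), secrets.randbelow(MOD)
    a_sh, b_sh, c_sh = share(a), share(b), share((a * b) % MOD)

    # The two parties open e = x - a and f = y - b; a and b mask x and y.
    e = reconstruct((x_sh[0] - a_sh[0]) % MOD, (x_sh[1] - a_sh[1]) % MOD)
    f = reconstruct((y_sh[0] - b_sh[0]) % MOD, (y_sh[1] - b_sh[1]) % MOD)

    # x*y = e*f + e*b + f*a + c; only one party adds the public term e*f.
    z0 = (e * b_sh[0] + f * a_sh[0] + c_sh[0]) % MOD
    z1 = (e * f + e * b_sh[1] + f * a_sh[1] + c_sh[1]) % MOD
    return z0, z1


product = reconstruct(*beaver_mul(share(6), share(7)))
print(product)
```

The matrix version replaces the scalars with matrices, which is why the size of the shared operands determines the communication cost discussed above.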

4 Motivating Example

In this section, we describe the end-to-end working of CrypTFlow through an example of logistic regression. The toolchain is shown in Figure 1.

Figure 1: CrypTFlow: End-to-end toolchain

# x is a (1,784) MNIST image.
# W and b are model parameters.
print(tf.argmax(tf.matmul(x, W) + b, 1))
Figure 2: Logistic Regression: TensorFlow snippet

CrypTFlow takes as input a pre-trained floating-point TensorFlow model. For example, consider the code snippet for logistic regression over the MNIST dataset in TensorFlow shown in Figure 2. Our compiler first generates the TensorFlow graph dump as well as metadata containing the dimensions of all the tensors. Next, the TensorFlow graph dump is compiled into a high-level intermediate language, HLIL. The code snippet for logistic regression in HLIL is shown in Figure 3(a). Next, Athos’ float-to-fixed converter translates the floating-point HLIL code to fixed-point code in a low-level intermediate language, LLIL. This step requires Athos to compute the right precision to be used for maximum accuracy. Figure 3(b) shows the LLIL code snippet for logistic regression. The ScaleDown operation divides each 64-bit integer entry of the tensor xW by $2^{15}$. The function calls in the LLIL code can be implemented with a variety of secure computation backends, e.g., ABY aby for the case of 2-party secure computation, Porthos for the case of semi-honest 3-party secure computation, and Aramis cryptflow for the malicious secure variant. Different backends provide different security guarantees and hence vary in their performance. For this example, the three backends take 227ms, 6.5ms, and 10.2ms respectively.

(a)
xW = MatMul(x, W);
xWb = MatAdd(xW, b);
output(ArgMax(xWb));

(b)
xW = MatMul(x, W);
ScaleDown(xW, 15); // 15-bit precision
xWb = MatAdd(xW, b);
output(ArgMax(xWb));

Figure 3: Logistic Regression in (a) floating-point: HLIL syntax (b) fixed-point: LLIL syntax
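To make the LLIL steps concrete, here is a minimal cleartext simulation in plain Python, with no MPC involved. The function names mirror the LLIL operations in Figure 3(b) but are hypothetical stand-ins, and the tiny 3-feature, 2-class model and its values are assumptions for illustration. Python integers are unbounded, so this sketch does not model 64-bit overflow.

```python
import math

SCALE = 15  # bits of precision, as in ScaleDown(xW, 15)


def to_fixed(t):
    """Encode a float matrix as integers floor(v * 2^SCALE)."""
    return [[math.floor(v * (1 << SCALE)) for v in row] for row in t]


def mat_mul(a, b):
    """Plain matrix product on lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def scale_down(t, s):
    """Divide every entry by 2^s (floor), restoring the scale after a product."""
    return [[v >> s for v in row] for row in t]


def mat_add(a, b):
    """Element-wise matrix sum."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def arg_max(t):
    """Index of the largest entry in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in t]


# Hypothetical toy model: 3 input features, 2 classes.
x = [[0.5, -1.25, 2.0]]
W = [[0.1, 0.4], [0.3, -0.2], [-0.5, 0.6]]
b = [[0.05, -0.1]]

float_pred = arg_max(mat_add(mat_mul(x, W), b))

xW = scale_down(mat_mul(to_fixed(x), to_fixed(W)), SCALE)
fixed_pred = arg_max(mat_add(xW, to_fixed(b)))
print(float_pred, fixed_pred)
```

The bias is encoded at the same scale as the rescaled product, which is why ScaleDown must run before MatAdd.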

5 Experiments

Overview. First, in Section 5.1, we use CrypTFlow for secure classification on ImageNet using the following pre-trained TensorFlow models: ResNet50 and DenseNet121. We show that the fixed-point MPC protocols generated by Athos match the accuracy of cleartext floating-point ResNet50 and DenseNet121. We also show how the optimizations in Porthos help it outperform prior works in terms of communication complexity and overall execution time. Finally, we discuss two case studies of running CrypTFlow on DNNs for medical image analysis. The compilation time of CrypTFlow is around 5 sec for ResNet50, 35 sec for DenseNet121 and 2 minutes for 3D-UNet.

5.1 Secure Inference on ImageNet

These experiments are in a LAN setting on 3.7GHz machines, each with 4 cores and with 16 GB of RAM. The measured bandwidth between each of the machines was at most 377 MBps and the latency was sub-millisecond.

ResNet50 takes 25.9 seconds and 6.9 GB of communication; DenseNet121 takes 36 seconds and 10.5 GB of communication. We measure communication as total communication between all parties; each party communicates roughly a third of this value. Table 1 shows that Athos-generated fixed-point code matches the accuracy of floating-point code on ResNet50 and DenseNet121.

| Benchmark | Float Top 1 | Fixed Top 1 | Float Top 5 | Fixed Top 5 |
| --- | --- | --- | --- | --- |
| ResNet50 | 76.47 | 76.45 | 93.21 | 93.23 |
| DenseNet121 | 74.25 | 74.33 | 91.88 | 91.90 |
Table 1: Accuracy (%) of fixed- vs floating-point.
| SecureNN (s) | Porthos (s) | SecureNN Comm. (GB) | Porthos Comm. (GB) |
| --- | --- | --- | --- |
Table 2: Porthos vs SecureNN.

Detailed comparisons of CrypTFlow with prior works on secure inference can be found in cryptflow. However, since Porthos builds on SecureNN, we compare them on ImageNet scale benchmarks in Table 2. For this purpose, we add the code of SecureNN available at securenncode as another backend to CrypTFlow. These results show that Porthos improves upon the communication of SecureNN by a factor of roughly 1.2X–1.5X and the runtime by a factor of roughly 1.4X–1.5X.

5.2 Lung diseases from 2D chest X-Ray images

In chestxray2018, the authors train a DenseNet121 to predict lung diseases from chest X-ray images. They use the publicly available NIH dataset of chest X-ray images and achieve an average AUROC score of 0.845 across 14 possible disease labels. These DNNs are available as pre-trained Keras models. We converted them into TensorFlow using kttf and compiled the automatically generated TensorFlow code with CrypTFlow. During secure inference, we observed no loss in classification accuracy, and the latency is similar to the runtime of DenseNet121 for ImageNet.

5.3 Segmenting tumors and organs at risk from 3D CT images

Half a million cancer patients receive radiotherapy each year demand. Personalized radiation treatments require segmenting tumors and organs at risk from 3D volumetric images. Currently, this segmentation is a manual process where an oncologist draws contours along regions of interest slice-by-slice across the whole volume. This process often takes several hours per image, which ML-based automation stan; shuai can reduce to minutes. We consider a 3D-UNet model unet that takes as input a raw 3D image obtained via Computed Tomography (CT) scans of the pelvic region and delineates tumor volumes and organs at risk. This model’s accuracy is within the inter-observer variability seen among clinical experts innereye, and it requires 1.87 Teraflops per inference.

Since this model is implemented in PyTorch, we first export it to ONNX and then use CrypTFlow’s ONNX frontend. For our secure inference setup, each party has 32 cores running at 2.4GHz, no GPUs, and 128GB RAM. The parties are connected on a network with ping latency 0.2s and 625MBps bandwidth. On this setup, secure inference incurs a latency of 1 hour and 57 minutes and 557GB of communication. The most expensive operators in this computation are 3D transposed convolutions (or deconvolutions) and, to the best of our knowledge, CrypTFlow is the only secure inference tool that supports these operations. In our experience, it takes a couple of days for a scan to reach the oncologist for review, and hence this latency overhead can be acceptable.

6 Related work and conclusion

Other related systems for converting PyTorch/TensorFlow to MPC protocols crypten; tfe; pysyft; quantizednn only support 3PC, whereas CrypTFlow additionally supports 2PC backends. CrypTFlow provides the first implementation and evaluation of a system for secure segmentation. With CrypTFlow, data scientists with no background in cryptography can obtain secure inference implementations for their trained models at the push of a button.