Ultrasound has numerous applications in the diagnosis and treatment of different diseases. Ultrasound elastography is a branch of ultrasound imaging that studies the mechanical properties of tissue, such as strain. A detailed review of elastography and its clinical applications can be found in [brian_anthony, app1, app2, app4, app5, application].
Ultrasound elastography can be classified into two types: quasi-static and dynamic [j2011recent]. In the first type, the deformations are very slow, and therefore tissue dynamics can be ignored [parker2010imaging, ophir1999elastography, treece2011real]. Freehand quasi-static imaging does not need any additional hardware and, as such, is very common (Fig. 1). In the second type, dynamic elastography, waves created either by the imaging system or by natural pulsations, caused for example by heartbeats, are tracked. In both types, the response of the tissue to external or internal forces is used to determine its mechanical properties. This is done by obtaining the displacement image, which shows the motion of every sample in the radio frequency (RF) frame during the deformation. We focus on quasi-static freehand strain imaging in this paper, where the strain image is computed by spatially differentiating the displacement field.
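As a toy illustration of this last step, the axial strain can be obtained by differentiating the axial displacement along the axial direction. The displacement field, array sizes, and values below are illustrative and not from the paper; a linear ramp models a uniform compression:

```python
import numpy as np

# Hypothetical axial displacement field: rows are axial samples, columns are
# lateral A-lines. A linear ramp along the axial direction models a uniform
# compression (illustrative values only).
axial_displacement = np.linspace(0.0, 1.0, 100)[:, None] * np.ones((100, 20))

# The strain image is the spatial derivative of the displacement field
# along the axial direction.
axial_strain = np.gradient(axial_displacement, axis=0)
```

A uniform compression yields a constant strain image. In practice, least-squares or similar regularized differentiation is commonly preferred over a plain gradient, since differentiation amplifies displacement-estimation noise.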
In order to estimate the strain image, we need two RF frames collected before and after applying the external force. One of the problems that freehand ultrasound elastography faces is the difficulty of choosing suitable RF frames for strain estimation. If the two RF frames are collected from the same plane and the force is purely axial, they will yield a high-quality strain image. Therefore, the operator needs to be an expert in performing the freehand palpation, rendering this technique very user-dependent. To solve this problem and make the data collection procedure independent of the user's experience, Ranger et al. [3d_brian_anthony] used a 3D camera to track and compensate for any undesired motion that occurs during data collection. Another approach, taken by both Foroughi et al. [foroughi2013freehand] and Rivaz et al. [rivaz2009tracked], relies on external trackers to record the exact location of each RF frame. With this information, they can find the RF frames that lie in the same plane and choose a suitable pair according to a cost function. Aalamifar et al. [robot] used a robot for collecting RF frames. Using an active echo element, they estimate a transformation matrix that maps the RF frames collected at the robot's tooltip to the ultrasound image frame.
Although the previously mentioned methods improve the quality of the strain image, they all require an external device, which complicates the data collection process and makes it more expensive. Herein, we introduce a novel method that uses a convolutional neural network (CNN) to determine whether a specific pair of RF frames is suitable for elastography. Although we focus on quasi-static elastography, the method can also be applied to other types of elastography.
In this section, we discuss the data collected for training and testing, as well as the CNN architecture used. Our model is simply a binary classifier that determines the suitability of a pair of RF frames for strain estimation.
Our proposed technique can also be used to automatically find the best matching RF frame for a specific pre-selected RF frame. The model achieves this by searching a window composed of several RF frames (in this work, 8 before and 8 after the pre-specified RF frame).
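A minimal sketch of this windowed search, with a placeholder scoring function standing in for the trained CNN (the function `best_partner`, the `score_pair` callable, and the frame indices are all hypothetical, not from the paper):

```python
def best_partner(num_frames, ref_idx, score_pair, half_window=8):
    """Return the frame index within +/- half_window of the pre-selected
    frame that scores highest when paired with it, clipped to valid indices."""
    lo = max(0, ref_idx - half_window)
    hi = min(num_frames - 1, ref_idx + half_window)
    candidates = [i for i in range(lo, hi + 1) if i != ref_idx]
    return max(candidates, key=lambda i: score_pair(ref_idx, i))

# Placeholder score that prefers temporally close frames (illustrative only;
# the paper instead scores each candidate pair with the CNN classifier).
closeness = lambda a, b: -abs(a - b)
partner = best_partner(num_frames=20, ref_idx=10, score_pair=closeness)
```

With the placeholder score, an immediately adjacent frame is selected; with the CNN, the candidate with the highest predicted probability of being a good pair would be chosen instead.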
II-A Data Collection
The data used for training and testing the algorithm includes both phantom and in vivo data. For the phantom data used in this paper, 4,116 pairs of RF frames were collected at Concordia University's PERFORM Centre from 3 different CIRS phantoms (Norfolk, VA), namely Models 040GSE, 039 and 059, at different locations. 3,290 of these pairs were used for training and validation with a ratio of 80:20, and the remaining pairs were used for testing. The ultrasound device used was the 12R Alpinion ultrasound machine (Bothell, WA) with an L3-12H high-density linear array probe at a center frequency of 8.5 MHz and a sampling frequency of 40 MHz. For the in vivo data, 688 pairs of RF frames were collected at Johns Hopkins Hospital from different patients who were undergoing liver ablation for primary or secondary liver cancers. Detailed information about this data is available in [rivaz2011real]. 528 of the 688 pairs were used for training and validation with a ratio of 80:20, leaving the rest for testing. The data was labelled as described in Algorithm 1.
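The 80:20 split above can be sketched as follows (the random seed and the use of index shuffling are assumptions for illustration; the paper does not specify how the split was performed):

```python
import random

pair_indices = list(range(3290))        # the 3,290 phantom training/validation pairs
random.Random(0).shuffle(pair_indices)  # shuffle before splitting (assumed)

cut = int(0.8 * len(pair_indices))      # 80:20 train/validation ratio
train_idx, val_idx = pair_indices[:cut], pair_indices[cut:]
```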
It is important to note that steps 2 and 3 in Algorithm 1 are computationally expensive. As such, they cannot be performed in real time to select optimal pairs of RF data. Our proposed method performs these steps only during training, and encodes the results into a computationally efficient CNN.
Suppose we have two RF frames and we would like to determine the suitability of this pair for strain estimation. We simply input the two frames to the CNN classifier on two different channels, and the output is a binary label, 1 or 0. The architecture used is relatively simple, as shown in Fig. 2. Every convolutional layer has a Rectified Linear Unit (ReLU) as the activation function, and is followed by batch normalization. The activation function in the output layer is a softmax, where the values of the two output nodes represent the probabilities of a good and a bad pair, respectively. The applied optimization technique is the Adam optimizer [kingma2014adam] with a learning rate of
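The classifier's input/output contract can be sketched in plain NumPy. The frame size follows the 2-fold axial downsampling of the 2304-by-384 frames described below; the convolutional stack itself is omitted, and the logit values are made up for illustration:

```python
import numpy as np

# Two RF frames stacked on two input channels. After downsampling by 2 in
# the axial direction, each 2304x384 frame becomes 1152x384.
frame_a = np.zeros((1152, 384), dtype=np.float32)
frame_b = np.zeros((1152, 384), dtype=np.float32)
x = np.stack([frame_a, frame_b], axis=0)  # shape: (2, 1152, 384)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# The conv + ReLU + batch-norm stack is omitted here; assume it maps x to
# two logits. These values are hypothetical.
logits = np.array([2.0, -1.0])
p_good, p_bad = softmax(logits)
label = int(p_good > p_bad)  # 1 = suitable pair, 0 = unsuitable
```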
II-C Training and Testing Time
The labelling of the data, which included applying Algorithm 1 to every single pair of RF frames, took 22 hours. Most of this time was spent on displacement estimation (step 2) and interpolating the RF data (step 3). The actual training of the CNN took 7.4 minutes on a 7th generation 3.4 GHz Intel Core i5 desktop with an NVIDIA TITAN V GPU. Inference is very fast, and only takes 5.4 ms to classify two frames of size 2304 by 384. The frames are downsampled by a factor of 2 in the axial direction to generate smaller input images for the CNN. Note that, in comparison, performing steps 2, 3, 5 and 6 in Algorithm 1 for two frames of the same size takes 6.21 seconds, 14.04 seconds, 46.87 ms and 2.45 ms respectively, for a total run-time of 20.3 seconds. In other words, frame selection with the CNN is more than 3,700 times faster. It is important to note that the CNN computations are performed on a GPU, whereas the steps in Algorithm 1 run on a CPU.
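The reported speedup can be sanity-checked from the per-step timings, assuming the last two steps are in milliseconds (consistent with the stated 20.3 s total):

```python
# Per-step run-times of Algorithm 1 (steps 2, 3, 5, 6), converted to seconds.
algorithm1_s = 6.21 + 14.04 + 46.87e-3 + 2.45e-3  # ~20.3 s total
cnn_s = 5.4e-3                                    # one CNN forward pass

speedup = algorithm1_s / cnn_s                    # comfortably above 3,700x
```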
III Experiments and Results
In this section, we compare our CNN frame selection method to baseline methods that pair an RF frame with another by simply skipping a fixed number of frames (one or two).
Fig. 3 shows the output of the different frame selection methods when tested on one of the phantom datasets. Our automatic frame selection substantially outperforms the fixed-skip pairing methods, as it chooses more suitable frames and therefore yields higher-quality strain images. Table I shows the accuracy as well as the F1-measure obtained from our CNN classifier on new datasets that were not used during training. The results demonstrate the ability of the classifier to generalize to unseen data.
| Dataset | Size | Accuracy | F1-measure |
| Phantom dataset 1 | 228 instances | 96.77% | 93.68% |
| Phantom dataset 2 | 297 instances | 91.7% | 89.17% |
| Phantom dataset 3 | 301 instances | 96% | 96% |
| In vivo dataset | 160 instances | 95.24% | 92% |
In this paper, we introduced a new CNN-based method to automatically choose RF frames that are suitable for strain estimation. Our method is fast, practical, and does not need any external hardware. Therefore, it could be used commercially to generate high-quality strain images even when operated by an inexperienced user.
The in vivo data was collected at Johns Hopkins Hospital. The authors would like to thank the principal investigators Drs. E. Boctor, M. Choti and G. Hager who provided us with the data. We would like to thank Morteza Mirzaei for providing us with some of the phantom data used in this paper. The authors also acknowledge NVIDIA for donating the graphics card.