High frame-rate cardiac ultrasound imaging with deep learning

08/23/2018 ∙ by Ortal Senouf, et al. ∙ Technion

Cardiac ultrasound imaging requires a high frame rate in order to capture rapid motion. This can be achieved by multi-line acquisition (MLA), where several narrow-focused received lines are obtained from each wide-focused transmitted line. This shortens the acquisition time at the expense of introducing block artifacts. In this paper, we propose a data-driven learning-based approach to improve the MLA image quality. We train an end-to-end convolutional neural network on pairs of real ultrasound cardiac data, acquired through MLA and the corresponding single-line acquisition (SLA). The network achieves a significant improvement in image quality for both 5- and 7-line MLA resulting in a decorrelation measure similar to that of SLA while having the frame rate of MLA.




1 Introduction

Increasing the frame rate is a major challenge in 2D and 3D echocardiography. Investigating deformations at different stages of the cardiac cycle is crucial for cardiovascular imaging; hence high temporal resolution is highly desired in addition to the spatial resolution. There are several ways to increase the frame rate of ultrasound imaging; one of the most commonly used techniques, which is implemented in many ultrasound scanners, is multi-line acquisition (MLA) [1], often referred to as parallel receive beamforming (PRB) [2].

Single- vs. multi-line acquisition.

In single-line acquisition (SLA), a narrow-focused pulse is transmitted by introducing transmit time delays through a linear phased array of acoustic transducer elements. Upon reception the obtained signal is dynamically focused along the receive (Rx) direction which is identical to the transmit (Tx) direction. The spatial region of interest is raster scanned line-by-line to obtain an ultrasound image.

The need to transmit a large number of pulses sequentially results in a low frame rate and renders SLA inadequate for cardiovascular imaging, where a high frame rate is mandatory, especially for quantitative analysis or during stress tests. For the same reason, SLA is also unsuitable for scanning large fields of view in real-time 3D imaging applications.

In an attempt to overcome the frame rate problem, the MLA method was proposed in [1], [3]. The main idea behind MLA is to transmit a weakly focused beam that provides sufficiently wide coverage for a high number of received lines. On the receiver side, m lines are constructed from the data acquired from each transmit event, thereby increasing the frame rate by a factor of m (this number is usually referred to as the MLA factor). Signal formation in the SLA and MLA modalities is illustrated in Figure 1.
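The frame-rate gain can be illustrated with a short back-of-the-envelope calculation. The sketch below assumes that each transmit event must wait for the round trip to the maximum imaging depth before the next one is fired; the line count (140), depth (0.15 m) and speed of sound (1540 m/s) are illustrative assumptions, not values taken from this paper.

```python
import math


def frame_rate_hz(n_rx_lines, mla_factor, depth_m, speed_of_sound_m_s=1540.0):
    """Frame rate when each Tx event yields `mla_factor` Rx lines."""
    # Number of transmit events needed to cover all Rx lines in one frame.
    n_tx_events = math.ceil(n_rx_lines / mla_factor)
    # Round-trip travel time of one pulse to the maximum depth and back.
    time_per_tx_s = 2.0 * depth_m / speed_of_sound_m_s
    return 1.0 / (n_tx_events * time_per_tx_s)


# Hypothetical scan geometry, chosen so the line count is divisible by 5 and 7.
sla = frame_rate_hz(n_rx_lines=140, mla_factor=1, depth_m=0.15)
mla5 = frame_rate_hz(n_rx_lines=140, mla_factor=5, depth_m=0.15)
mla7 = frame_rate_hz(n_rx_lines=140, mla_factor=7, depth_m=0.15)
print(f"SLA: {sla:.1f} Hz, 5-MLA: {mla5:.1f} Hz, 7-MLA: {mla7:.1f} Hz")
```

Under these assumptions the frame rate scales linearly with the MLA factor, because only the number of transmit events per frame changes.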

MLA Artifacts.

As the Tx and Rx are no longer aligned in the MLA mode, the two-way beam profile is shifted towards the original transmit direction, making the lateral sampling irregular [2]. This beam warping effect causes sharp lateral discontinuities that are manifested as block artifacts in the image domain.

The observed block artifacts in the ultrasound images (see, e.g., Figure 1) tend to be more obvious when the number of transmit events decreases. The MLA artifact can be measured by assessing the correlation coefficient between each pair of adjacent Rx lines in the in-phase and quadrature (I/Q) demodulated beamformed data [4]. In SLA or compensated MLA, the averaged correlation values inside MLA groups and between MLA groups are almost the same. In the uncompensated case, however, the two averages differ.
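The comparison of correlations inside and between MLA groups can be sketched as follows. This is a minimal illustration that assumes the correlation coefficient is the magnitude of the normalized complex cross-correlation between two beamformed I/Q lines; it does not reproduce the exact decorrelation criterion of [4], and the toy data are invented for the example.

```python
import math


def line_correlation(a, b):
    """Magnitude of the normalized complex correlation between two I/Q lines."""
    num = sum(x * y.conjugate() for x, y in zip(a, b))
    den = math.sqrt(sum(abs(x) ** 2 for x in a) * sum(abs(y) ** 2 for y in b))
    return abs(num) / den if den > 0 else 0.0


def group_correlations(lines, mla_factor):
    """Average adjacent-line correlation inside and between MLA groups."""
    inside, between = [], []
    for i in range(len(lines) - 1):
        rho = line_correlation(lines[i], lines[i + 1])
        # Lines i and i+1 straddle a group boundary when i+1 starts a new group.
        if (i + 1) % mla_factor == 0:
            between.append(rho)
        else:
            inside.append(rho)

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(inside), mean(between)


# Toy data (hypothetical): two groups of three identical lines each, which
# mimics a strong block artifact at the group boundary.
group_a = [1 + 1j, 2 - 1j, 0.5j, -1 + 0j]
group_b = [1 - 1j, -2 + 1j, 1 + 0j, 0.5j]
lines = [group_a] * 3 + [group_b] * 3
inside_mean, between_mean = group_correlations(lines, mla_factor=3)
print(inside_mean, between_mean)
```

A large gap between the two averages indicates visible block artifacts; for artifact-free data the two values should be close.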

Apart from beam warping, there are two other effects caused by the transmit-receive misalignment: skewing, where the shape of the two-way beam profile becomes asymmetric, and gain variation, where the outermost lines inside the group have a lower gain than the innermost lines [4].

Related work.

Several methods have been proposed in the literature to decrease MLA artifacts, including transmit sinc apodization [5] and dynamic steering [6], incoherent interpolation [7],[8] (applied after envelope detection), and its coherent (before envelope detection) counterparts [9],[2]. One of the more prominent methods, synthetic transmit beamforming (STB) [2], creates synthetic Tx lines by coherently interpolating information received from each two adjacent Tx events in intermediate directions. This technique creates highly correlated lines, attenuating block artifacts. A common practice for MLA imaging with focused beams is to create a fixed number of Rx lines per Tx event, either without overlap or with overlaps from adjacent transmissions, in order to perform the correction [2],[4],[10]. Because overlapping lines are shared between adjacent transmissions, creating eight lines with overlaps provides an effective frame rate increase by a factor smaller than eight. In this paper, however, we used odd MLA factors (5 and 7) for the purpose of acquiring data from aligned directions for both SLA and MLA.

Recently, data-driven learning techniques based on convolutional neural networks (CNNs) have been extensively used for solving inverse problems in imaging and in medical imaging in particular, for example, in X-ray CT reconstruction and denoising [11] and in real-time ultrasound post-processing [12]. Inspired by their success, we propose a data-driven approach to overcome MLA artifacts.


We propose an end-to-end CNN-based approach for MLA artifact correction. Our fully convolutional network consists of interpolation layers followed by a trainable apodization layer, and is trained on in-vivo cardiac data to approximate an SLA-quality image. We demonstrate the effectiveness of this network both visually and quantitatively using the decorrelation measure [4] and the SSIM [13] quality criteria. To the best of our knowledge, this is the first study to report good artifact corrections in the case of MLA. We show that the trained network generalizes well across patients, as well as to phantom data.

Figure 1: Single-line (left) and multi-line (right) acquisition procedures and their corresponding ultrasound scans. Block artifacts can be seen along the axial direction in MLA. Zooming in is recommended.

2 Methods

2.1 Improving MLA with CNNs

To provide a general and optimal solution for MLA interpolation that achieves SLA quality, we propose to replace the MLA artifact correction and apodization phases of the traditional MLA pipeline (Figure 2) with an end-to-end CNN (Figure 3). Our approach is similar in spirit to [10], which showed that combining MLA interpolation with an optimal apodization method produces superior results compared to the traditional approaches. Our network comprises both the interpolation and the apodization stages, which are trained jointly.

Interpolation stage.

The interpolation stage consists of a CNN containing convolutional layers with symmetric skip connections [14],[15] from each layer in the downsampling track to its corresponding layer in the upsampling track, as visualized in Figure 3. Downsampling is performed using average pooling, and strided convolutions are used for upsampling. The number of bifurcations is kept fixed for all the experiments. The interpolation stage takes as input the time-delayed and phase-rotated element-wise I/Q data from the transducer.

Apodization stage.

Following the interpolation stage, we introduce a convolutional layer to perform apodization. This is performed using point-wise (1×1) convolutions for each element's channel in the network, and the results are then added to the learned weights of the convolution. The weights of the channel are initialized with a Hann window.
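To make the role of the apodization layer concrete, the sketch below shows what a point-wise convolution across element channels amounts to at its Hann initialization: a weighted sum of the per-element samples. It is an illustration only; in the network the weights are trainable, whereas here they stay at their initial values, and the element count and sample value are hypothetical.

```python
import math


def hann_window(n):
    """Symmetric Hann window of length n (zero at both ends for n > 1)."""
    if n == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]


def apodize(element_samples, weights):
    """Weighted sum across element channels for one (depth, line) sample."""
    return sum(w * s for w, s in zip(weights, element_samples))


n_elements = 8  # hypothetical aperture size, not the probe used in the paper
weights = hann_window(n_elements)

# One complex I/Q sample per element (identical values, for illustration).
sample = [complex(1.0, -0.5)] * n_elements
beamformed = apodize(sample, weights)
print(beamformed)
```

Starting from a standard window such as Hann gives the layer a sensible beamformer to begin with, which training can then adjust.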


We use a norm-based training loss to measure the discrepancy between the image predicted by the network and the ground truth SLA image. The loss is minimized using the Adam optimizer [16]. We observed that adding the apodization stage makes the network converge faster during training.

Figure 2: Traditional MLA ultrasound imaging pipeline.
Figure 3: Proposed CNN-based MLA artifact correction pipeline.

2.2 Data acquisition and training

We generated a dataset for training the network using cardiac data from six patients; each patient contributed multiple cine loops. The data were acquired using a GE experimental breadboard ultrasound system. The same transducer was used for both phantom and cardiac acquisition. Sinusoidal excitation pulses were transmitted using the central elements of the probe. To assess the desired aperture for the MLA setup, the Field II simulator [17] was used as in [10], with the transducer impulse response and a tri-state transmission excitation sequence, requiring a minimal insonification level for all MLAs from a single Tx.

On the Rx side, the I/Q demodulated signals were dynamically focused using linear interpolation. The FOV was covered by the full set of Tx/Rx lines in the SLA mode, and by correspondingly fewer Tx lines in the 5-MLA and 7-MLA modes. For both phantom and cardiac cases, the data were acquired in the SLA mode; 5-MLA and 7-MLA data were obtained by appropriately decimating the Rx pre-beamformed data.

In total, we used frames from five patients for training and validation, while keeping the cine loops from the sixth patient for testing. The data set comprised pairs of beamformed I/Q images with Hann window apodization, and the corresponding 5-MLA and 7-MLA pre-apodization samples with dimensions depth × elements × Rx lines. The MLA data were obtained by decimating the Tx lines of the SLA samples by the MLA factor.
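The index bookkeeping behind this decimation can be sketched as follows. The example only shows which SLA transmit directions would be retained and which retained transmit serves each Rx line; it assumes an odd MLA factor, a line count divisible by that factor, and a hypothetical frame of 10 lines. The handling of the pre-beamformed samples themselves is not reproduced here.

```python
def rx_to_tx_map(n_sla_lines, mla_factor):
    """For each Rx line index, the index of the retained Tx line serving it."""
    if mla_factor % 2 == 0:
        raise ValueError("this sketch assumes an odd MLA factor")
    if n_sla_lines % mla_factor != 0:
        raise ValueError("this sketch assumes whole MLA groups")
    half = mla_factor // 2
    # Each group of `mla_factor` Rx lines shares the Tx at the group centre.
    return [(rx // mla_factor) * mla_factor + half for rx in range(n_sla_lines)]


def retained_tx_indices(n_sla_lines, mla_factor):
    """SLA Tx indices kept after decimation (one per MLA group)."""
    return sorted(set(rx_to_tx_map(n_sla_lines, mla_factor)))


# Hypothetical frame with 10 SLA lines decimated to 5-MLA.
mapping = rx_to_tx_map(10, 5)
kept = retained_tx_indices(10, 5)
print(mapping)
print(kept)
```

With an odd factor, the central Rx line of every group coincides with a retained Tx direction, which is why the SLA and MLA data stay aligned along those lines.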

We trained dedicated CNNs for the reconstruction of SLA images from 5-MLA and 7-MLA data. Each CNN was trained for a fixed maximum number of epochs on mini-batches.

3 Experimental evaluation

3.1 Settings

In order to assess the performance of our trained networks, we used cine loops from one patient excluded from the training/validation set. From two cine loops, we generated pairs of 5-MLA and 7-MLA samples and their corresponding SLA images in the same way as described in Section 2.2. For quantitative evaluation of the performance of our method, we measured the decorrelation criterion, which evaluates the artifact strength [4], and the SSIM [13] structural similarity criterion with respect to the SLA image. In addition, we tested the performance of our networks on four frames acquired from the GAMMEX Ultrasound 403GS LE Grey Scale Precision Phantom.
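As a rough illustration of the structural similarity criterion, the sketch below computes a simplified, single-window SSIM over a whole image. The SSIM of [13] is normally evaluated over local windows and averaged, so this global variant is a simplification; the constants 0.01 and 0.03 are the commonly used defaults, and the toy images and unit dynamic range are assumptions made for the example.

```python
def global_ssim(x, y, dynamic_range=1.0):
    """Single-window SSIM between two equally sized images (lists of rows)."""
    xs = [v for row in x for v in row]
    ys = [v for row in y for v in row]
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    # Unbiased sample variances and covariance over all pixels.
    var_x = sum((v - mu_x) ** 2 for v in xs) / (n - 1)
    var_y = sum((v - mu_y) ** 2 for v in ys) / (n - 1)
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / (n - 1)
    # Stabilizing constants, using the commonly used default factors.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Toy 2x2 images (hypothetical): the second swaps two pixels of the first.
reference = [[0.1, 0.5], [0.9, 0.3]]
degraded = [[0.1, 0.5], [0.3, 0.9]]
same_score = global_ssim(reference, reference)
diff_score = global_ssim(reference, degraded)
print(same_score, diff_score)
```

A score of 1 indicates identical images; lower values indicate structural differences with respect to the reference.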

3.2 Results

Quantitative results for the cardiac test set are summarized in Table 1. We show a major improvement in decorrelation and SSIM for both 5-MLA and 7-MLA. The corrected 7-MLA performance approaches that of 5-MLA, suggesting the feasibility of larger MLA factors. Figure 4 shows representative images from each imaging modality. The correlation coefficient profiles of the corrected 5-MLA and 7-MLA images approach that of SLA.

Similarly, quantitative results for the phantom test set are summarized in Table 2, again showing a significant improvement in the image quality for both 5-MLA and 7-MLA. Visual results with the corresponding correlation coefficient profiles are depicted in Figure 1 in the Supplementary Material. These results suggest that the networks trained on real cardiac data generalize well to the phantom data without any further training or fine-tuning. For comparison, [4] reported a decorrelation value for a phantom image acquired in an MLA mode with STB compensation, whereas our values for 5-MLA and 7-MLA, both of which use a greater decimation rate, are closer to zero.

The slight dissimilarities in the recovered data can be explained by the acquisition method being used: since the scanned object was in motion, there is a difference between all but the central line in each MLA group and the matching lines in SLA. We assume that training the network on images of static organs may further improve its performance. Independently, small areas with vertical stripes were observed in several images. In our opinion, the origin of the stripes is a coherent summation of the beamformed lines across the moving object. Since the frame rate of the employed acquisition sequence was slower than that of a genuine MLA acquisition, the magnitude of this artifact is probably exaggerated.

Table 1: Image reconstruction results on cardiac data: comparison of average decorrelation and SSIM measures between the original and corrected 5-MLA and 7-MLA cardiac images. Decorrelation of SLA is reported in the first column; left and right values in the entry indicate the values calculated for 5-MLA and 7-MLA, respectively.
(a) SLA (b) 5-MLA (c) Corrected 5-MLA (d) 7-MLA (e) Corrected 7-MLA
Figure 4: CNN-based MLA artifact correction tested on cardiac data. A test frame from a cardiac sequence demonstrating the performance of the proposed artifact correction algorithm. Each image is depicted along with the plot of the correlation coefficients between adjacent lines.
(a) SLA (b) 5-MLA (c) Corrected 5-MLA (d) 7-MLA (e) Corrected 7-MLA
Figure 5: CNN-based MLA artifact correction tested on phantom data. A test frame from the phantom data demonstrating the performance of the proposed artifact correction algorithm. Each image is depicted along with the plot of the cross-correlation coefficient between adjacent lines.
Table 2: Image reconstruction results on phantom data: comparison of average decorrelation and SSIM measures between the original and corrected 5-MLA and 7-MLA phantom images. Decorrelation of SLA is reported in the first column; left and right values in the entry indicate the values calculated for 5-MLA and 7-MLA, respectively.

4 Conclusion

In this paper, we have shown that conventional ultrasound MLA correction can be substituted with an end-to-end CNN performing both optimal interpolation and apodization in order to approximate SLA image quality. In the future, we aim to extend this approach to even earlier stages of multi-line acquisition, such as beamforming, on the assumption that this will provide a greater improvement in image quality. Moreover, in a concurrent work [18], we demonstrate that a similar method can be applied to other fast US acquisition modalities, such as multi-line transmission (MLT) [19].

5 Acknowledgements

This research was partially supported by ERC StG RAPID.


  • [1] Shattuck, D.P., Weinshenker, M.D., Smith, S.W., von Ramm, O.T.: Explososcan: A parallel processing technique for high speed ultrasound imaging with linear phased arrays. The Journal of the Acoustical Society of America 75(4) (April 1984) 1273–1282
  • [2] Hergum, T., Bjastad, T., Kristoffersen, K., Torp, H.: Parallel beamforming using synthetic transmit beams. IEEE transactions on ultrasonics, ferroelectrics, and frequency control 54(2) (2007) 271–280
  • [3] von Ramm, O.T., Smith, S.W., Pavy, H.G.: High-speed ultrasound volumetric imaging system. II. Parallel processing and image display. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control 38(2) (March 1991) 109–115
  • [4] Bjastad, T., Aase, S.A., Torp, H.: The impact of aberration on high frame rate cardiac B-mode imaging. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control 54(1) (2007) 32
  • [5] Augustine, L.J.: High resolution multiline ultrasonic beamformer (February 24 1987) US Patent 4,644,795.
  • [6] Thiele, K.E., Brauch, A.: Method and apparatus for dynamically steering ultrasonic phased arrays (June 21 1994) US Patent 5,322,068.
  • [7] Holley, G.L., Guracar, I.M.: Ultrasound multi-beam distortion correction system and method (July 14 1998) US Patent 5,779,640.
  • [8] Liu, D.D., Lazenby, J.C., Banjanin, Z., McDermott, B.A.: System and method for reduction of parallel beamforming artifacts (September 10 2002) US Patent 6,447,452.
  • [9] Wright, J.N., Maslak, S.H., Finger, D.J., Gee, A.: Method and apparatus for coherent image formation (April 29 1997) US Patent 5,623,928.
  • [10] Rabinovich, A., Friedman, Z., Feuer, A.: Multi-line acquisition with minimum variance beamforming in medical ultrasound imaging. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control 60(12) (2013) 2521–2531
  • [11] McCann, M.T., Jin, K.H., Unser, M.: Convolutional neural networks for inverse problems in imaging: A review. IEEE Signal Processing Magazine 34(6) (Nov 2017) 85–95
  • [12] Vedula, S., Senouf, O., Bronstein, A., Michailovich, O., Zibulevsky, M.: Towards CT-quality Ultrasound Imaging using Deep Learning. arXiv preprint arXiv:1710.06304 (2017)
  • [13] Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing 13(4) (2004) 600–612
  • [14] Mao, X., Shen, C., Yang, Y.B.: Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. In: Advances in neural information processing systems. (2016) 2802–2810
  • [15] Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention, Springer (2015) 234–241
  • [16] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. Proceedings of the 3rd International Conference on Learning Representations (ICLR) (2015)
  • [17] Jensen, J.A.: Field: A program for simulating ultrasound systems. In: 10th Nordic-Baltic Conference on Biomedical Imaging, Vol. 4, Supplement 1, Part 1, 351–353, Citeseer (1996)
  • [18] Vedula, S., Senouf, O., Zurakhov, G., Bronstein, A., Zibulevsky, M., Adam, D., Michailovich, O., Gaitini, D.: High quality ultrasonic multi-line transmission through deep learning. MLMIR workshop, MICCAI, 2018
  • [19] Mallart, R., Fink, M.: Improved imaging rate through simultaneous transmission of several ultrasound beams. In: New Developments in Ultrasonic Transducers and Transducer Systems. Volume 1733., International Society for Optics and Photonics (1992) 120–131