Agent with Warm Start and Active Termination for Plane Localization in 3D Ultrasound

by   Haoran Dou, et al.
Shenzhen University

Standard plane localization is crucial for ultrasound (US) diagnosis. In prenatal US, dozens of standard planes are manually acquired with a 2D probe, a process that is time-consuming and operator-dependent. In comparison, 3D US, which contains multiple standard planes in one shot, has the inherent advantages of less user dependency and greater efficiency. However, manual plane localization in a US volume is challenging due to the huge search space and large fetal posture variation. In this study, we propose a novel reinforcement learning (RL) framework to automatically localize fetal brain standard planes in 3D US. Our contribution is two-fold. First, we equip the RL framework with a landmark-aware alignment module to provide a warm start and strong spatial bounds for the agent actions, thus ensuring its effectiveness. Second, instead of passively and empirically terminating the agent inference, we propose a recurrent neural network based strategy for active termination of the agent's interaction procedure. This improves both the accuracy and efficiency of the localization system. Extensively validated on our large in-house dataset, our approach achieves an accuracy of 3.4 mm/9.6° and 2.7 mm/9.1° for transcerebellar and transthalamic plane localization, respectively. Our proposed RL framework is general and has the potential to improve the efficiency and standardization of US scanning.



1 Introduction

Acquisition of standard planes containing key anatomical structures is crucial for ultrasound (US) diagnosis. In prenatal US, typically dozens of standard planes, such as the transthalamic (TT) and transcerebellar (TC) views for fetal brain assessment (Fig. 1), are manually acquired with a 2D US probe for subsequent biometric measurements and diagnosis. This process is very time-consuming and highly operator-dependent. In comparison, 3D US can contain multiple standard planes in a single shot and has the inherent advantages of less user dependency and greater efficiency [7]. However, it is very challenging to manually localize standard planes in the volume due to the huge search space, the large fetal posture variability and the low image quality, as shown in Fig. 1. Therefore, automatic localization of standard planes in 3D US is highly desirable to improve diagnostic efficiency and decrease operator dependency.

In recent years, several studies have addressed standard plane localization in 3D US. Ryou et al. proposed a three-step learning method to sequentially localize the fetus and the fetal parts, and then detect biometry planes by classification [9]. This method narrowed the search space to the localized structures and the axial direction. Regression methods were also employed to localize cardiac planes with Random Forests [2] and the fetal abdominal plane with deep networks [10]. However, these methods tend to fail when acoustic shadows and occlusion spread in US during late pregnancy. Lorenz et al. proposed to extract the abdomen plane by detecting anatomical landmarks and aligning them to a fetal organ model [5]. The system achieved accuracy of for plane localization. Although effective in using prior anatomical knowledge, the method's performance is limited by the landmark detection accuracy and by the difference between the testing case and the model. More recently, Li et al. proposed an iterative deep network to localize fetal brain planes in 3D US [4]. Alansary et al. further customized a reinforcement learning (RL) agent for view planning in MR volumes [1]. RL is promising for standard plane localization in 3D US because it can mimic experts' operation and explore inter-plane dependency through the agent-environment interaction. However, RL may suffer from its random initialization and empirical termination when its environment, such as the US volume, has strong noise, artifacts and large appearance variations.

In this paper, we propose a novel RL framework to localize fetal brain standard planes in prenatal US volumes. We believe we are the first to employ RL-based techniques for this problem. Our contribution is two-fold. First, we equip the RL framework with a landmark-aware alignment module for warm start to ensure its effectiveness. We employ deep networks to detect anatomical landmarks in the US volume and register them to a plane-specific atlas. The plane configuration of the atlas therefore provides strong spatial bounds for RL agent actions. Second, instead of passively and empirically terminating the agent inference, we propose a recurrent neural network (RNN) based strategy for active termination of the agent’s interaction procedure. The RNN-based strategy can find the optimal termination point adaptively, so it improves the accuracy and efficiency of the localization system at the same time.

Figure 2: Schematic view of our proposed framework.

2 Methodology

Fig. 2 shows the schematic view of our proposed framework. We propose to localize fetal brain standard planes in US volumes with an RL framework, which progressively interacts with the volumes and modifies the search trajectory towards the final target plane. Specifically, we equip the RL framework with 1) a landmark-aware alignment module for warm start, to ensure its effectiveness, and 2) a recurrent neural network based strategy for active termination of the interaction procedure, to improve its accuracy and efficiency.

2.1 Deep Reinforcement Learning Framework for Plane Localization

The task of plane localization in US volumes can be well modeled under the RL framework, where an agent, in its current state $s$, interacts with the environment $E$ by taking successive actions $a \in A$ that maximize the expected reward. Let a plane in the Cartesian coordinate system be represented as

$$\cos(\alpha)\,x + \cos(\beta)\,y + \cos(\phi)\,z + d = 0,$$

where $\mathbf{n} = (\cos\alpha, \cos\beta, \cos\phi)$ is the normal and $d$ is the distance from the plane to the volume center, which is taken as the origin. The system obtains the optimal plane parameters as the agent interacts with the environment.

Similar to [1], we define the action space as 8 actions, $\{\pm a_{\alpha}, \pm a_{\beta}, \pm a_{\phi}, \pm a_{d}\}$. After an action is taken by the agent, the plane parameters are updated accordingly as $P_t = P_{t-1} + A_t$. Each valid action receives a scalar reward $r$ following the rule $r = \mathrm{sgn}\big(D(P_{t-1}, P_g) - D(P_t, P_g)\big)$, where $D$ calculates the Euclidean distance from the predicted plane $P_t$ to the ground truth $P_g$. The reward $r \in \{+1, 0, -1\}$ indicates whether the agent is moving towards the target.
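The action, update and reward rule described above can be illustrated with a minimal sketch. It assumes the plane is stored as four parameters (three orientation angles of the normal plus the offset to the origin) and that the distance to the ground truth is the Euclidean distance between parameter vectors. The step sizes `ANGLE_STEP_DEG` and `DIST_STEP_MM` and all function names are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical step sizes; the paper does not state them in this section.
ANGLE_STEP_DEG = 1.0
DIST_STEP_MM = 0.5

# Eight actions: +/- one step on each of the three orientation angles and on d.
ACTIONS = [(index, sign) for index in range(4) for sign in (+1, -1)]


def apply_action(plane, action):
    """Return the updated plane (alpha, beta, phi, d) after one action."""
    index, sign = action
    step = DIST_STEP_MM if index == 3 else ANGLE_STEP_DEG
    updated = list(plane)
    updated[index] += sign * step
    return tuple(updated)


def plane_distance(p, q):
    """Euclidean distance between two parameter vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def reward(prev_plane, new_plane, target_plane):
    """Sign reward: +1 if the step moved closer to the target, -1 if farther, 0 if unchanged."""
    diff = plane_distance(prev_plane, target_plane) - plane_distance(new_plane, target_plane)
    return (diff > 0) - (diff < 0)
```

Because the reward only depends on the sign of the change in distance, it stays bounded regardless of how far the current plane is from the target.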

With the reward signal, the agent maximizes both the current and future rewards to obtain the action-selection policy. Following Q-learning, the Deep Q-Network (DQN) [6] learns a state-action value function, $Q(s, a)$, via deep networks to serve as the action-selection policy. To improve the robustness of DQN against the noisy environment in 3D US, we choose the Double DQN (DDQN) [11] as our deep agent for plane localization. The loss function for our DDQN is defined as:

$$L = \mathbb{E}_{s, r, a, s' \sim M}\Big[\big(r + \gamma\, Q(s', \arg\max_{a'} Q(s', a'; \omega); \tilde{\omega}) - Q(s, a; \omega)\big)^2\Big],$$

where $\gamma$ is a discount factor that weights future rewards, and $s'$ and $a'$ are the state and action in the next step. $M$ is the experience replay memory to avoid frequent data sampling. $\omega$ and $\tilde{\omega}$ are the current and target network parameters. Specifically, we select an ImageNet pre-trained VGG-13 as our current and target networks. The three most recently predicted planes serve as the network input.
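To make the role of the two networks concrete, the following sketch computes a Double DQN target and the mean squared temporal-difference error for a small batch, using plain Python lists in place of network outputs. It follows the standard DDQN formulation rather than the authors' code; the discount value `gamma=0.9` and the function names are assumptions.

```python
def ddqn_target(reward, q_online_next, q_target_next, gamma=0.9):
    """Double DQN target.

    The current (online) network selects the best next action;
    the target network supplies the value of that action.
    """
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


def ddqn_loss(batch, gamma=0.9):
    """Mean squared TD error over a mini-batch sampled from replay memory.

    Each item is (q_sa, reward, q_online_next, q_target_next), where q_sa is
    the current network's value for the action actually taken.
    """
    errors = [
        (ddqn_target(r, q_online, q_target, gamma) - q_sa) ** 2
        for q_sa, r, q_online, q_target in batch
    ]
    return sum(errors) / len(errors)
```

Decoupling action selection from action evaluation is what distinguishes DDQN from DQN, which would take the maximum of the target network's values directly and tends to over-estimate them.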

Figure 3: Landmarks of 100 US volumes (left) aligned to a plane-specific atlas space (middle) provide strong spatial bounds for RL agent actions (right). Red, green and blue dots indicate landmarks shown in Fig. 1(a).

2.2 Landmark-aware Plane Alignment for Warm Start

To ensure an effective interaction of the RL agent with the noisy 3D US environment, we propose a landmark-aware plane alignment module to leverage the anatomical prior and provide a warm start for the agent. Specifically, we first detect three landmarks of the fetal brain, i.e., the genu of the corpus callosum, the splenium of the corpus callosum and the cerebellar vermis, as shown in Fig. 1(a), with a customized 3D U-net [8]. These landmarks are then used to align the testing volume with the atlas, which contains both the reference landmarks and the standard plane parameters. Different from [7, 5], which apply a common anatomical model to all kinds of standard planes, we propose to select a specific atlas for each plane to improve the localization accuracy. Finally, the standard planes of the atlas are mapped to the testing volumes and serve as a warm start for our RL agent.

Atlas selection for each type of standard plane is formulated over pairs of volumes. The criterion uses the normal of the plane in each volume, the angle between normals, the distance from the plane to the origin in each volume, and the transformation matrix from one volume to the other, which is determined by rigid registration based on the landmark annotations. Fig. 3 shows the effect of our landmark alignment for 100 volumes. The accurate alignment provides an effective initial point for the RL agent and therefore leads to fast and improved plane localization.
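One way to read the atlas-selection idea is as a medoid-style choice: pick the volume whose standard plane, once mapped into every other volume, disagrees least with the planes annotated there. The sketch below illustrates that reading only. The equal weighting of angle (degrees) and distance (mm), the `mapped` input structure and the function names are all assumptions, and the code is not the authors' exact formulation.

```python
import math


def angle_deg(n1, n2):
    """Angle in degrees between two normal vectors."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos_angle))


def select_atlas(planes, mapped):
    """Return the index of the volume chosen as atlas.

    planes[j]    : (normal, d) annotated in volume j.
    mapped[i][j] : (normal, d) of volume i's plane after the rigid transform
                   from volume i to volume j (assumed to be precomputed).
    """
    def cost(i):
        total = 0.0
        for j, (n_j, d_j) in enumerate(planes):
            if j == i:
                continue
            n_ij, d_ij = mapped[i][j]
            total += angle_deg(n_ij, n_j) + abs(d_ij - d_j)
        return total

    return min(range(len(planes)), key=cost)
```

In practice the rigid transforms would come from the landmark-based registration described above; here they are taken as given so that the selection step can be shown in isolation.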

Figure 4: Mean Q-value of 8 action candidates (yellow) and ADI (blue) on training dataset. Green point denotes the optimal termination step with maximum ADI.

2.3 Recurrent Neural Network based Active Termination

To ensure an efficient interaction of the RL agent, we propose an RNN-based active termination (AT) module to tell the agent when to stop. Usually, there is no well-defined criterion for terminating the iterative inference of an RL agent. Under- and over-estimation of the termination state often degrade the final localization. Existing work uses a predefined maximum step, a lower Q-value [1] or oscillation of the Q-value [3] as an indicator of termination. While the first wastes a lot of computation if the maximum is set to a large number, the latter two do not necessarily lead to the optimal result. As shown in Fig. 4, the optimal termination step with the highest angle and distance improvement (ADI) is neither the maximum step nor the step with the lowest Q-value. This motivates us to propose a novel strategy to actively learn the optimal step. Specifically, considering the sequential characteristics of the iterative inference, as shown in Fig. 2, we formulate the mapping between the Q-value sequence and the optimal step with recurrent neural networks.

The Q-values of the 8 action candidates at each state serve as the input of our RNN, which then learns to output the optimal termination step with the highest ADI, i.e., the most significant angle and distance improvement. We train the RNN model with the inference results of the agent on our training volumes. During testing, our method terminates the iterative actions of the agent according to the RNN output and obtains the final plane parameters. With this active termination mechanism, our agent can make efficient inference without excessive iterations.
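The supervision signal for such a termination module can be sketched as follows. The code assumes that ADI at a step is the sum of the angle improvement and the distance improvement relative to the starting plane; this exact definition, the equal weighting and the function names are assumptions for illustration. It also shows the minimum-Q baseline for comparison.

```python
def adi_curve(angle_errors, dist_errors):
    """Angle and distance improvement at each step, relative to step 0."""
    a0, d0 = angle_errors[0], dist_errors[0]
    return [(a0 - a) + (d0 - d) for a, d in zip(angle_errors, dist_errors)]


def optimal_termination_step(angle_errors, dist_errors):
    """Training label: the step with the highest ADI along the trajectory."""
    adi = adi_curve(angle_errors, dist_errors)
    return max(range(len(adi)), key=lambda t: adi[t])


def min_q_termination_step(mean_q_values):
    """Baseline: stop at the step with the lowest mean Q-value."""
    return min(range(len(mean_q_values)), key=lambda t: mean_q_values[t])
```

For a trajectory with angle errors `[20, 15, 9, 11, 14]` and distance errors `[4, 3.5, 3, 3.2, 3.6]`, the label is step 2, whereas a monotonically decreasing mean Q-value sequence would make the baseline stop at the last step. A recurrent model would take the per-step Q-value vectors as input and be trained to regress labels of this kind.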

3 Experimental Results

3.0.1 Materials and Implementation Details.

We validate our solution on the task of localizing two standard planes (TT and TC) of the fetal brain in US volumes. We built a dataset of 430 prenatal US volumes acquired from 430 healthy pregnant volunteers. With approval from the local Institutional Review Board, all volumes were anonymized and obtained by experts using a Mindray DC-9 US system with an integrated 3D probe. Free fetal poses were allowed during scanning. Gestational age ranges from 19 to 31 weeks, which is much broader than in [4, 7]. The average volume size of our dataset is 270×207×235 voxels, and the voxel size is unified across volumes. A sonographer with 5 years of experience provided manual annotation of landmarks and standard planes for all the volumes. We then randomly split the dataset into 330 and 100 volumes for training and testing, respectively.

We implemented our framework in PyTorch, using a standard PC with an NVIDIA TITAN X (Pascal) GPU. We trained the DDQN with the Adam optimizer (learning rate=5e-5) for 100 epochs (about 4 days). The replay buffer size is set to 15000. The target network copies the parameters of the current network every 2000 iterations. For training the RNN variants (vanilla RNN and LSTM), the optimizer is Adam with an L1 regression loss, batch size=100, hidden size=64 and epoch=200 (about 15). The starting planes for training the DDQN are randomly initialized around the ground truth plane within a bounded angle range and distance range. The range is determined by the average plane localization error of atlas-based registration. For landmark detection (Adam optimizer, batch size=1, learning rate=0.001, moment=0.5, epoch=40), limited by GPU memory, each US volume is resized by a factor of 0.4 for training. Gaussian maps of landmarks are generated as ground truth, and an L2 loss is used for training. The Iterative Closest Point algorithm is used for the rigid registration between the testing case and the atlas.
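As an illustration of the Gaussian ground-truth maps used for landmark detection, the sketch below builds a 3D heatmap centered on a landmark and recovers the landmark as the location of the maximum. The standard deviation `sigma=2.0`, the (z, y, x) index order and the function names are assumptions; a real pipeline would typically use an array library rather than nested lists.

```python
import math


def gaussian_heatmap(shape, center, sigma=2.0):
    """Return a nested-list volume with a Gaussian peak of height 1 at `center`.

    shape and center are (z, y, x) tuples in voxel units.
    """
    depth, height, width = shape
    cz, cy, cx = center
    return [
        [
            [
                math.exp(-((z - cz) ** 2 + (y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
                for x in range(width)
            ]
            for y in range(height)
        ]
        for z in range(depth)
    ]


def argmax3d(volume):
    """Return the (z, y, x) index of the largest value in a nested-list volume."""
    best_value, best_index = float("-inf"), None
    for z, plane in enumerate(volume):
        for y, row in enumerate(plane):
            for x, value in enumerate(row):
                if value > best_value:
                    best_value, best_index = value, (z, y, x)
    return best_index
```

Regressing such a map with an L2 loss gives the network a smooth target, and the predicted landmark is then read off as the peak of the output map.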

| Method | TC Ang (°)↓ | TC Dis (mm)↓ | TC SSIM↑ | TT Ang (°)↓ | TT Dis (mm)↓ | TT SSIM↑ |
|---|---|---|---|---|---|---|
| Regress | 27.04±8.40 | 4.10±3.81 | 0.672±0.087 | 24.27±17.05 | 7.62±6.00 | 0.507±0.100 |
| AtlasRegist | 14.14±7.54 | 3.40±2.28 | 0.681±0.148 | 13.43±4.63 | 2.62±1.54 | 0.682±0.138 |
| RegistRegress | 12.44±7.78 | 2.18±2.12 | 0.684±0.157 | 13.87±11.77 | 2.80±2.16 | 0.660±0.141 |
| DDQN-nA | 31.54±24.24 | 5.12±3.67 | 0.685±0.131 | 30.44±24.43 | 5.03±3.82 | 0.615±0.132 |
| DDQN-maxS | 11.71±14.32 | 3.53±2.55 | 0.684±0.165 | 12.36±8.53 | 2.95±2.94 | 0.694±0.154 |
| DDQN-minQ | 10.68±9.76 | 3.40±2.27 | 0.688±0.165 | 10.78±7.62 | 2.62±1.54 | 0.705±0.163 |
| DDQN-AT(FC) | 10.36±9.60 | 3.40±2.28 | 0.689±0.165 | 9.61±5.79 | 2.66±1.55 | 0.707±0.161 |
| DDQN-AT(RNN) | 9.96±10.19 | 3.41±2.27 | 0.691±0.167 | 9.53±5.74 | 2.64±1.62 | 0.709±0.164 |
| DDQN-AT(LSTM) | 9.61±8.97 | 3.40±2.77 | 0.693±0.168 | 9.11±5.56 | 2.66±2.06 | 0.709±0.163 |

Table 1: Quantitative evaluation of our proposed framework (mean±standard deviation).
Figure 5: TC (left) and TT (right) results. Top row: ground truth (left) and predicted (right) plane. Bottom row: left, active termination step (dotted red line) compared to optimal step in green dot, 3D visualization of ground truth and predicted plane (right).

3.0.2 Quantitative and Qualitative Analysis:

The efficacy of our proposed method was validated on 100 US volumes, and the results are reported in Table 1. We adopt both spatial and content similarity measures to evaluate the performance, including the dihedral angle between two planes (Ang), the difference in Euclidean distance to the origin (Dis), and the Structural Similarity Index (SSIM).
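The two spatial criteria can be computed directly from the plane parameters. The sketch below assumes each plane is given as a normal vector and a distance to the origin, measures Ang as the angle between the normals, and summarizes a test set as mean and sample standard deviation. The function names and the choice of sample (rather than population) standard deviation are assumptions.

```python
import math
import statistics


def plane_errors(pred, gt):
    """Return (Ang in degrees, Dis in mm) between a predicted and a ground-truth plane.

    Each plane is (normal, d); normals need not be unit length.
    """
    (n_pred, d_pred), (n_gt, d_gt) = pred, gt
    dot = sum(a * b for a, b in zip(n_pred, n_gt))
    norm = math.sqrt(sum(a * a for a in n_pred)) * math.sqrt(sum(b * b for b in n_gt))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos_angle)), abs(d_pred - d_gt)


def summarize(values):
    """Mean and sample standard deviation, as reported per column in a results table."""
    return statistics.mean(values), statistics.stdev(values)
```

SSIM is omitted here because it operates on the resampled plane images rather than on the plane parameters.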

  • First, the proposed RL agent (DDQN-AT) performs well on both types of standard planes. It achieves lower angular error and higher SSIM than the regression-based method (Regress), the registration-based method (AtlasRegist) and their combination (RegistRegress), although RegistRegress attains the lowest distance error on TC. This can be attributed to the active interaction procedure of the agent, which searches along the trajectory towards the optimal plane.

  • Second, when the landmark-aware space alignment module is employed on the DDQNs as a warm start, they achieve markedly better performance on standard plane localization than the method without alignment (DDQN-nA), whose angular error exceeds 30° on both planes. In addition, the proposed space alignment module can also be deployed in the regression model and leads to clear improvement (RegistRegress versus Regress).

  • Third, the proposed active termination leads to better localization with far fewer inference iterations. Compared with other termination policies, such as maximum step (DDQN-maxS) and minimum Q-value (DDQN-minQ), our AT-based methods generally give lower angular error and higher SSIM. Among them, DDQN-AT (LSTM) shows the best results, likely because it has a stronger capacity to learn from the Q-value sequence. More importantly, with the AT module equipped, our RL agent requires an average of 13 steps to localize the standard planes, compared with 100 steps when no AT module is employed. Since these iteration steps account for most of the computation, the AT module substantially improves the efficiency of the RL agent.

In Fig. 5, we visualize two testing results of DDQN-AT (LSTM) for TC and TT plane localization. In terms of both image content and spatial relationship, our method accurately captures the plane in both tasks, and the predicted plane is very close to the ground truth. Our active termination strategy also demonstrates the ability to learn from the Q-value sequence and hits the optimal termination step (green dot), where the angle and distance improvement (ADI) is large.

4 Conclusion

We proposed a general framework for standard plane localization in 3D US with an RL agent. We use a landmark-aware alignment module to exploit prior information about the standard planes from the atlas and provide the agent with an effective warm starting point. In addition, we devise an RNN-based active termination strategy that signals the agent to stop once the optimal plane is localized, thereby improving its accuracy and efficiency. Experiments on our large in-house dataset validate the efficacy of our method and indicate its potential for future practical applications.

4.0.1 Acknowledgments:

The work in this paper was supported by the grant from National Natural Science Foundation of China (No. 61571304), Shenzhen Peacock Plan (No. KQTD2016053112051497, KQJSCX20180328095606003), Medical Scientific Research Foundation of Guangdong Province, China (No. B2018031) and National Natural Science Foundation of China (Project No. U1813204).


  • [1] A. Alansary, L. Le Folgoc, et al. (2018) Automatic view planning with multi-scale deep reinforcement learning agents. In MICCAI, pp. 277–285.
  • [2] K. Chykeyuk, M. Yaqub, and J. A. Noble (2013) Class-specific regression random forest for accurate extraction of standard planes from 3D echocardiography. In International MICCAI Workshop on Medical Computer Vision, pp. 53–62.
  • [3] F. Ghesu et al. (2019) Multi-scale deep reinforcement learning for real-time 3D-landmark detection in CT scans. IEEE TPAMI 41 (1), pp. 176–189.
  • [4] Y. Li, B. Khanal, et al. (2018) Standard plane detection in 3D fetal ultrasound using an iterative transformation network. In MICCAI, pp. 392–400.
  • [5] C. Lorenz, T. Brosch, et al. (2018) Automated abdominal plane and circumference estimation in 3D US for fetal screening. In Medical Imaging 2018: Image Processing, Vol. 10574, pp. 105740I.
  • [6] V. Mnih, K. Kavukcuoglu, et al. (2015) Human-level control through deep reinforcement learning. Nature 518 (7540), pp. 529.
  • [7] A. I. Namburete, R. V. Stebbing, and J. A. Noble (2014) Diagnostic plane extraction from 3D parametric surface of the fetal cranium. In MIUA, pp. 27–32.
  • [8] O. Ronneberger, P. Fischer, and T. Brox (2015) U-net: convolutional networks for biomedical image segmentation. In MICCAI, pp. 234–241.
  • [9] H. Ryou, M. Yaqub, et al. (2016) Automated 3D ultrasound biometry planes extraction for first trimester fetal assessment. In MLMI, pp. 196–204.
  • [10] A. Schmidt-Richberg, N. Schadewaldt, et al. (2019) Offset regression networks for view plane estimation in 3D fetal ultrasound. In Medical Imaging 2019: Image Processing, Vol. 10949, pp. 109493K.
  • [11] H. van Hasselt, A. Guez, and D. Silver (2016) Deep reinforcement learning with double Q-learning. In AAAI, pp. 1234–1241.