Localizing dexterous surgical tools in X-ray for image-based navigation

01/20/2019
by Cong Gao, et al.
Johns Hopkins University

X-ray image-based surgical tool navigation is fast and supplies accurate images of deep-seated structures. Typically, recovering the 6 DOF rigid pose and deformation of tools with respect to the X-ray camera can be achieved accurately through intensity-based 2D/3D registration of 3D images or models to 2D X-rays. However, the capture range of image-based 2D/3D registration is inconveniently small, suggesting that automatic and robust initialization strategies are of critical importance. This manuscript describes a first step towards leveraging semantic information of the imaged object to initialize 2D/3D registration within the capture range of image-based registration by performing concurrent segmentation and localization of dexterous surgical tools in X-ray images. We present a learning-based strategy to simultaneously localize and segment dexterous surgical tools in X-ray images and demonstrate promising performance on synthetic and ex vivo data. We are currently investigating methods to use the semantic information extracted by the proposed network to reliably and robustly initialize image-based 2D/3D registration. While image-based 2D/3D registration has been an obvious focus of the CAI community, robust initialization thereof (albeit critical) has largely been neglected. This manuscript discusses learning-based retrieval of semantic information on imaged objects as a stepping stone for such initialization and may therefore be of interest to the IPCAI community. Since results are still preliminary and only focus on localization, we target the Long Abstract category.


1 Purpose

Continuum dexterous manipulators (CDMs), commonly referred to as snake-like robots, have demonstrated great promise for minimally invasive procedures [11, 4]. Recent innovations have made CDMs appropriate for use in orthopedic surgery [5, 8, 1]. One key challenge of using CDMs is performing precise intra-operative control guided by a pre-operative patient-specific plan, conceived from 3D imaging and potentially biomechanical analysis. To this end, the calibration loop from robot base to end-effector to patient anatomy must be closed, and an accurate estimate of the kinematic deformation of the CDM is required.


X-ray image-based surgical tool navigation has received increasing interest since it is fast and supplies accurate images of deep-seated structures. Typically, recovering the 6 degree-of-freedom (DOF) rigid pose and deformation of tools with respect to the X-ray camera can be achieved accurately through intensity-based 2D/3D registration of 3D images or models to 2D X-rays [7]. However, it is well known that the capture range of image-based 2D/3D registration is inconveniently small, suggesting that automatic and robust initialization strategies are of critical importance. Consequently, this manuscript describes a first step towards leveraging semantic information of the imaged object to initialize 2D/3D registration within the capture range of image-based registration by performing concurrent segmentation and localization of the CDM in X-ray images.

2 Methods

We seek to train a convolutional neural network (ConvNet) to localize CDMs in X-ray images. The CDM considered here, developed by our group, is fabricated from Nitinol. Its outer diameter is  mm and it includes an instrument channel with a diameter of  mm. A set of 26 alternating notches on the body of the CDM allows for single-plane bending [5, 8]. Due to the unavailability of annotated X-ray images to train ConvNets, we rely on DeepDRR [10], a framework for physics-based rendering of digitally reconstructed radiographs (DRRs, i.e. synthetic fluoroscopic images) from 3D CT. DeepDRR accurately accounts for the energy- and material-dependence of X-ray attenuation, scattering, and noise. Recent work [10, 3] demonstrated that ConvNets trained on DeepDRRs generalize to clinically acquired X-ray images without re-training, motivating its use for the application proposed here. The simulation of the CDM mainly consists of two parts: 1) the body and base of the CDM plus an extended shaft; and 2) the inserted tool and drill. Following previous work on kinematic modeling of this CDM [9], we assume that the joint angle changes smoothly from one joint to the next. Angles are parameterized as a cubic spline over equally distributed control points along the central axis of the CDM. The rigid pose of the CDM relative to the X-ray camera is represented by translation and rotation along the x, y, and z axes, which together define the total parameter space.
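
As a rough illustration of this parameterization, the sketch below samples control-point bending angles, interpolates a smooth per-notch angle profile with a cubic spline, and draws a random rigid pose. The number of control points and all sampling ranges are placeholder assumptions, not values from the paper.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def sample_cdm_configuration(n_ctrl=4, n_joints=26, rng=None,
                             angle_range_deg=(-20.0, 20.0),
                             trans_range_mm=50.0, rot_range_deg=15.0):
    """Sample one CDM shape and rigid pose (all ranges here are placeholders)."""
    rng = np.random.default_rng() if rng is None else rng

    # Control-point bending angles, equally distributed along the central axis.
    t_ctrl = np.linspace(0.0, 1.0, n_ctrl)
    a_ctrl = rng.uniform(angle_range_deg[0], angle_range_deg[1], size=n_ctrl)

    # A cubic spline through the control points yields joint angles that
    # change smoothly from one notch to the next.
    joint_angles = CubicSpline(t_ctrl, a_ctrl)(np.linspace(0.0, 1.0, n_joints))

    # Rigid pose of the CDM relative to the X-ray camera:
    # translation along and rotation about the x, y, and z axes.
    translation_mm = rng.uniform(-trans_range_mm, trans_range_mm, size=3)
    rotation_deg = rng.uniform(-rot_range_deg, rot_range_deg, size=3)

    return joint_angles, translation_mm, rotation_deg
```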
Given a 3D CT of the lower torso, we manually define a rigid transformation such that the CDM model is enclosed in the femur, simulating applications in core decompression and fracture repair [1, 2]. DeepDRR uses a voxel representation, so the CDM surface model is voxelized at high resolution to preserve the details of the notches. At positions where the CT volume overlaps with the CDM, CT values are omitted to model drilling. From the above volumes and coordinate transforms, we use DeepDRR to generate 1) realistic X-ray images, 2) 2D segmentation masks of the CDM end-effector, and 3) 2D locations of two key landmarks. Our segmentation target region covers the 26 alternating notches, which distinguish the CDM from other surgical tools. The two landmarks are defined as 1) the middle of the two conjunction points between the first notch and the base, and 2) the center of the distal plane of the last notch, i.e. the start and end points of the CDM centerline. The simulated X-rays have an isotropic pixel size of  mm. DRRs are converted to the line-integral domain to decrease the dynamic range and then normalized. Landmark coordinates are transformed to belief maps expressed as Gaussian distributions ( pixels) around the true location.

Data generation was done as follows: a total of 5 lower-limb CTs ( mm/voxel) were included in the experiment and centered around the pelvis. The CDM volume was manually aligned with the left/right femur to mimic our clinical use case. Then, CDM shapes and rigid X-ray source and volume poses were sampled randomly: the source-to-detector distance was fixed to  mm, while the source-to-isocenter distance, the source rotation in LAO/RAO and CRAN/CAUD, and the volume translation along all axes were sampled randomly. CDM shapes were defined by randomly sampling control point angles. We sampled  random configurations per femoral head to render synthetic images. CTs were split into training and testing sets, and the training set was further split into training and validation subsets. We also manually annotated 87 X-ray images of a real CDM drilling into femoral bone specimens for quantitative evaluation on cadaveric data.
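
A minimal sketch of these two pre-processing steps is given below. The choice of the image maximum as the unattenuated reference intensity and the Gaussian width `sigma_px` are assumptions, not values reported in the paper.

```python
import numpy as np


def drr_to_line_integral(drr, eps=1e-6):
    """Map a DRR to the line-integral domain and rescale to [0, 1].

    Using the image maximum as the unattenuated reference is an assumption.
    """
    li = -np.log(np.clip(drr, eps, None) / (drr.max() + eps))
    return (li - li.min()) / (np.ptp(li) + eps)


def landmark_belief_map(shape, xy, sigma_px=6.0):
    """Gaussian belief map around a 2D landmark; sigma_px is a placeholder."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (xs - xy[0]) ** 2 + (ys - xy[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma_px ** 2))
```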
Inspired by the work of [6], we design a ConvNet-based auto-encoder-like architecture with skip connections and split the connection from the last feature layer to perform two tasks concurrently, i.e. segmentation and landmark detection. Fig. 1 illustrates the ConvNet architecture used here. In the encoder, we repeat the combination of a 2D convolutional layer and a max-pooling layer four times to abstract a feature representation. In the decoder, we concatenate the upsampled features with the features from the same level of the encoding stage. The final decoded feature layer is shared across the segmentation path and the localization path. The final output of the segmentation mask is concatenated back with this shared feature to boost the localization task. We chose the Dice loss to train the segmentation task and a standard regression loss for the localization task. The learning rate was initialized with  and decayed by  every  epochs.
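
As a hedged sketch of such a training objective, the snippet below combines a soft Dice term for the segmentation head with an L2 (MSE) term on the landmark belief maps (PyTorch). The use of MSE as the regression loss and the equal task weighting are assumptions, not details confirmed by the text.

```python
import torch
import torch.nn.functional as F


def dice_loss(pred_prob, gt_mask, eps=1e-6):
    """Soft Dice loss on predicted segmentation probabilities."""
    inter = (pred_prob * gt_mask).sum(dim=(1, 2, 3))
    union = pred_prob.sum(dim=(1, 2, 3)) + gt_mask.sum(dim=(1, 2, 3))
    return (1.0 - (2.0 * inter + eps) / (union + eps)).mean()


def multitask_loss(seg_logits, gt_mask, pred_beliefs, gt_beliefs,
                   w_seg=1.0, w_loc=1.0):
    """Dice loss for segmentation plus an L2 loss on landmark belief maps.

    The task weights and the exact regression loss are assumptions.
    """
    seg = dice_loss(torch.sigmoid(seg_logits), gt_mask)
    loc = F.mse_loss(pred_beliefs, gt_beliefs)
    return w_seg * seg + w_loc * loc
```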

Figure 1: Network architecture used for concurrent segmentation and landmark detection.

3 Results

Segmentation accuracy is computed as the Dice score of the mask prediction, and landmark detection accuracy is reported as the distance to the ground-truth landmark in millimeters. We first evaluated the network on the synthetic dataset, where exact ground truth is known; the mean Dice score was  and the mean landmark distance was  mm. On the 87 manually annotated ex vivo X-ray images, the network achieved a mean Dice score of  and a mean landmark distance of  mm. The cadaveric data contained configurations never seen during training (i.e. the tool completely outside bone) that induced poor performance of our network, as reflected in the high standard deviations on the cadaveric dataset. Representative results are shown in Fig. 2.
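
For reference, the two metrics could be computed roughly as below; treating the belief-map peak as the predicted landmark and scaling by the detector pixel size are assumptions about the evaluation procedure, not a description of the authors' exact protocol.

```python
import numpy as np


def dice_score(pred_mask, gt_mask, eps=1e-6):
    """Dice overlap between two binary masks."""
    pred_mask, gt_mask = pred_mask.astype(bool), gt_mask.astype(bool)
    inter = np.logical_and(pred_mask, gt_mask).sum()
    return 2.0 * inter / (pred_mask.sum() + gt_mask.sum() + eps)


def landmark_error_mm(pred_belief, gt_xy, pixel_size_mm):
    """Landmark error in mm: peak of the predicted belief map vs. ground truth."""
    y, x = np.unravel_index(np.argmax(pred_belief), pred_belief.shape)
    return float(np.hypot(x - gt_xy[0], y - gt_xy[1]) * pixel_size_mm)
```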

Figure 2: Representative examples of segmentation and landmark detection performance on synthetic (upper row) and real ex vivo data (lower row). The predicted segmentation and landmarks are shown as green and red overlay, respectively.

4 Conclusions

We presented a learning-based strategy to simultaneously localize and segment dexterous surgical tools in X-ray images. Our results on synthetic and ex vivo data are promising and encourage training of our ConvNet on a more exhaustive dataset. We are currently investigating how these results translate to other real data, as well as methods to use the semantic information extracted by the proposed network to reliably and robustly initialize image-based 2D/3D registration.

References

  • [1] Alambeigi, F., et al.: A curved-drilling approach in core decompression of the femoral head osteonecrosis using a continuum manipulator. IEEE Robot Autom Lett 2(3), 1480–1487 (2017)
  • [2] Alambeigi, F., et al.: Inroads toward robot-assisted internal fixation of bone fractures using a bendable medical screw and the curved drilling technique. In: IEEE Conf on Biorob. pp. 595–600. IEEE (2018)
  • [3] Bier, B., Unberath, M., et al.: X-ray-transform invariant anatomical landmark detection for pelvic trauma surgery. In: Proc MICCAI (2018)
  • [4] Burgner-Kahrs, J., et al.: Continuum robots for medical applications: A survey. IEEE Trans Robot 31(6), 1261–1280 (2015)
  • [5] Kutzer, M.D., et al.: Design of a new cable-driven manipulator with a large open lumen: Preliminary applications in the minimally-invasive removal of osteolysis. In: Proc ICRA. pp. 2913–2920. IEEE (2011)
  • [6] Laina, I., Rieke, N., et al.: Concurrent segmentation and localization for tracking of surgical instruments. In: Proc MICCAI. pp. 664–672 (2017)
  • [7] Markelj, P., et al.: A review of 3D/2D registration methods for image-guided interventions. Med Image Anal 16(3), 642–661 (2012)
  • [8] Murphy, R.J., et al.: Design and kinematic characterization of a surgical manipulator with a focus on treating osteolysis. Robotica 32(6), 835–850 (2014)
  • [9] Otake, Y., et al.: Piecewise-rigid 2D-3D registration for pose estimation of snake-like manipulator using an intraoperative X-ray projection. In: Proc SPIE Med Imag. vol. 9036, p. 90360Q (2014)
  • [10] Unberath, M., Zaech, J.N., et al.: DeepDRR – a catalyst for machine learning in fluoroscopy-guided procedures. In: Proc MICCAI. Springer (2018)

  • [11] Walker, I.D., et al.: Snake-like and continuum robots. In: Springer Handbook of Robotics, pp. 481–498 (2016)