Augmented Reality for Depth Cues in Monocular Minimally Invasive Surgery

03/01/2017 · by Long Chen, et al. · University of Bradford, Bournemouth University, University of Chester

One of the major challenges in Minimally Invasive Surgery (MIS) such as laparoscopy is the lack of depth perception. In recent years, laparoscopic scene tracking and surface reconstruction have been a focus of investigation to provide rich additional information to aid the surgical process and compensate for the depth perception issue. However, robust 3D surface reconstruction and augmented reality with depth perception on the reconstructed scene are yet to be reported. This paper presents our work in this area. First, we adopt a state-of-the-art visual simultaneous localization and mapping (SLAM) framework, ORB-SLAM, and extend the algorithm for use in MIS scenes for reliable endoscopic camera tracking and salient point mapping. We then develop a robust global 3D surface reconstruction framework based on the sparse point clouds extracted from the SLAM framework. Our approach is to combine an outlier removal filter with a Moving Least Squares smoothing algorithm and then employ Poisson surface reconstruction to obtain smooth surfaces from the unstructured sparse point cloud. Our proposed method has been quantitatively evaluated against ground-truth camera trajectories and the organ model surface used to render the synthetic simulation videos. In vivo laparoscopic videos used in the tests have demonstrated the robustness and accuracy of our proposed framework on both camera tracking and surface reconstruction, illustrating the potential of our algorithm for depth augmentation and depth-corrected augmented reality in MIS with monocular endoscopes.


1 Introduction

Minimally Invasive Surgery (MIS) requires a surgeon to perform technically demanding procedures. Typically, the visual interface consists of a monocular display showing the video stream from the endoscope (although some stereoscopic endoscopes exist, as discussed below). The Field of View (FOV) captured by a monocular endoscopic camera is usually limited; for example, only 30% to 40% of the whole liver surface is visible in one frame Plantefeve2016. Further, the loss of depth perception from monocular image sequences can severely impair a surgeon's performance in complex tasks Honeck2012 Wagner2012. As such, providing cues to aid better depth perception of a monocular endoscope scene is highly desirable and requires further innovation. In this paper we propose a novel method of fusing rich 3D anatomical information with a monocular endoscope video stream, based on real-time 3D reconstruction of the scene, to address the limited FOV and the depth perception issue. There are three main technical challenges to overcome: (i) achieving real-time tracking and feature extraction of the scene; (ii) dealing with the sparse and noisy data set extracted by the camera tracking algorithm; and (iii) robustly reconstructing a 3D surface of the scene.

Recent advances in computer hardware and software have facilitated the use of computer vision techniques for MIS scene analysis and understanding. For example, efficient motion tracking of endoscopic cameras over deformable soft tissue has been demonstrated MountneyLoThiemjarusEtAl2007, as has augmented reality with anatomical structures HaouchineDequidtPeterlikEtAl2013 HaouchineCotinPeterlikEtAl2015. There are particular challenges, however, since in MIS the luminance changes dramatically and an endoscope can move rapidly during insertion and extraction. The scale-invariant feature transform (SIFT) Kim2012 and Speeded Up Robust Features (SURF) Kumar2014 algorithms have been used for robust feature-based camera motion tracking, and other approaches are specifically designed for soft tissue, accounting for scale, rotation and brightness changes MountneyYang2008. However, the issue of depth perception remains regardless of which computer vision algorithm is deployed: information about the depth of elements within the scene is not recovered, and the feature points extracted by vision algorithms must lie within the field of view to provide the geometrical information required by augmented reality.

A stereo endoscope alleviates the depth perception problem, and such systems are available, often integrated into robotic systems (e.g. the da Vinci system from Intuitive Surgical, Inc.) or using proprietary stereo cameras. 3D depth can be recovered from the disparity map of rectified stereo images during laparoscopic surgery StoyanovDarziYang2004 Stoyanov2005, so that dense 3D reconstruction of the laparoscopic scene can be achieved by a propagation method StoyanovScarzanellaPrattEtAl2010 and/or a cost-volume algorithm ChangStoyanovDavisonEtAl2013. Stereo vision based reconstruction, however, can only recover the structure of a local frame without a global overview of the scene, and such local reconstruction is therefore prone to tracking errors and noise. In addition, stereo endoscopic surgery remains expensive and is yet to be widely used in practice compared with the monocular endoscope.

Recently, the maturity of the simultaneous localization and mapping (SLAM) methods used by robots for navigation in unknown 3D environments has opened up new opportunities for camera tracking approaches in MIS. A SLAM-enabled system can estimate the 3D structure of an unknown environment from a moving camera while simultaneously tracking the pose of the camera in that environment. The scenario of tracking and scene reconstruction in endoscopic surgery is similar to that of a typical SLAM application. Further, tracking and mapping using SLAM can be very accurate (within 1 mm in ideal conditions) and does not require optical or magnetic trackers that may clutter a surgeon's view. EKF-based SLAM has already been widely used with laparoscopic image sequences Mountney2006 Mountney2009 GrasaBernalCasadoEtAl2014. Although a motion compensation model MountneyYang2010 and stereo semi-dense reconstruction Totz2011 have been integrated into the EKF-SLAM framework, the linearization of the motion and sensor models by first-order Taylor series expansion means that the accuracy of EKF-SLAM cannot be guaranteed, so it is prone to inconsistent estimation and drift. The first keyframe-based SLAM, PTAM (Parallel Tracking and Mapping), was a breakthrough in visual SLAM and has also been used in MIS for simultaneous stereoscope tracking Lin2013. ORB–SLAM is a well-designed SLAM system derived from the ideas of PTAM, which utilizes ORB (Oriented FAST and Rotated BRIEF) binary features for fast and reliable feature point tracking. Mahmoud et al. NaderMahmoud2016 tested ORB–SLAM on endoscopic video and presented a method for densifying map points, but with some loss of accuracy.

In this paper, we adapt ORB–SLAM for endoscopic camera tracking and mapping, and propose a 3D surface reconstruction method based on Moving Least Squares (MLS) smoothing and Poisson surface reconstruction to recover a smooth surface from the unstructured sparse map points extracted by ORB–SLAM. We also provide a comprehensive quantitative assessment using simulated laparoscopic sequences: estimated camera trajectories and reconstructed surfaces are compared with the ground truth camera trajectories and the 3D models used to render the simulated video. The experiments yield root mean square errors (RMSE) of 1.24 mm for camera trajectories and 4.32 mm for surface reconstruction. Our method provides new possibilities for depth augmentation in monocular endoscopic MIS, enabling surgeons to perceive correct depth during a procedure. We also demonstrate an augmented reality (AR) framework for superimposing virtual objects at the correct depth, which is important to prevent virtual objects from appearing to drift when the viewing perspective changes.

2 ORB–SLAM for endoscopic camera tracking and mapping

Figure 1: A moving monocular endoscopic camera captures an image sequence from which a SLAM system builds a sparse 3D point cloud.

ORB–SLAM Mur-ArtalMontielTardos2015 combines many state-of-the-art techniques into one SLAM system: an ORB descriptor for tracking, local keyframes for mapping, graph-based optimization, a Bag of Words algorithm for relocalization, and an essential graph for loop closure. These features enable real-time endoscopic camera tracking and sparse point mapping in an abdominal cavity, as shown in Fig. 1. Real-time performance is crucial in time-critical medical interventions. Since ORB is a binary feature point descriptor, it is an order of magnitude faster to compute than SURF and more than two orders of magnitude faster than SIFT, while still offering good accuracy. In addition, ORB features are invariant to rotation, illumination and scale, so they can cope with the rapid movements of endoscopic cameras, including rotation, zooming and changes in brightness.
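For illustration, the following minimal Python sketch uses OpenCV's generic ORB implementation (not the modified ORB–SLAM code described here); the frame path and parameter values are placeholders:

```python
import cv2

# Load one endoscopic frame (path is a placeholder) and convert to grayscale.
frame = cv2.imread("endoscope_frame.png")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# ORB detector: oriented FAST corners with rotation-steered BRIEF descriptors;
# the scale pyramid (scaleFactor, nlevels) provides scale invariance.
orb = cv2.ORB_create(nfeatures=1000, scaleFactor=1.2, nlevels=8)
keypoints, descriptors = orb.detectAndCompute(gray, None)

# Binary descriptors are matched with the Hamming distance, which is why ORB
# matching is far cheaper than SURF or SIFT; e.g.
#   matches = matcher.match(descriptors_prev, descriptors_curr)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
```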

A common problem for monocular SLAM is initialization: a procedure is required to create an initial map, as depth cannot be recovered from a single image. ORB–SLAM uses an automatic approach that dynamically computes a homography for planar scenes and a fundamental matrix for non-planar scenes. This approach greatly increases the success rate of initialization and reduces the initialization time, which also facilitates its use in a MIS scene: the system can initialize on a near-planar organ surface or compute a fundamental matrix when the endoscopic camera is pointing at a complex structure.
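The following Python sketch illustrates the idea of this dual-model initialization with OpenCV; the inlier-count selection rule is a simplified stand-in for ORB–SLAM's actual symmetric-transfer-error scoring, and the threshold values are assumptions:

```python
import cv2

def select_initialization_model(pts1, pts2, reproj_thresh=3.0):
    """Fit both a homography (planar scene, e.g. an organ surface seen up
    close) and a fundamental matrix (general structure) to matched points
    from two frames, and pick the model with more RANSAC inliers. ORB-SLAM
    uses a more elaborate heuristic score; this is a simplified stand-in."""
    H, h_mask = cv2.findHomography(pts1, pts2, cv2.RANSAC, reproj_thresh)
    F, f_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC,
                                       reproj_thresh, 0.99)
    h_inliers = int(h_mask.sum()) if h_mask is not None else 0
    f_inliers = int(f_mask.sum()) if f_mask is not None else 0
    if h_inliers > f_inliers:
        return "homography", H    # initialize from a planar decomposition
    return "fundamental", F       # initialize by triangulating from F
```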

The Bag of Words (BoW) algorithm in ORB–SLAM also helps relocalization when tracking is lost. The vocabulary is created offline from a large number of ORB descriptors extracted from very large datasets of indoor and outdoor images, covering almost all patch patterns likely to be encountered. The vocabulary serves as a classifier or dictionary that assigns each descriptor an index. When a new image enters the system, each feature descriptor in the image is looked up and a unique vector is built from the descriptor indices. In this way, the rough similarity of two images can be obtained by simply comparing their two vectors, which greatly increases the speed of relocalization. In endoscopic videos, however, the colours of soft tissue, organs and vessels do not differ greatly from person to person. Therefore, to extend ORB–SLAM for use in MIS scenes, we trained our vocabulary specifically for MIS on a database of 7,894 images from the Hamlyn Centre Endoscopic Video Datasets London2016 Ye2016. By using an MIS-specific BoW database, the length of the vector used for similarity measurement is decreased, which increases not only the speed of comparison but also its accuracy.
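As an illustration of the BoW idea (not the DBoW2 implementation used by ORB–SLAM, which builds a hierarchical k-means tree over binary descriptors with the Hamming metric), a flat visual vocabulary could be sketched in Python as follows; the word count and the Euclidean clustering of unpacked descriptor bits are simplifying assumptions:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def train_vocabulary(descriptor_sets, n_words=1000):
    """Cluster ORB descriptors (uint8, 32 bytes each) from all training
    images (e.g. the 7,894 Hamlyn frames) into a flat visual vocabulary."""
    all_desc = np.vstack(descriptor_sets)
    bits = np.unpackbits(all_desc, axis=1).astype(np.float32)  # 256 bits each
    return MiniBatchKMeans(n_clusters=n_words, n_init=3).fit(bits)

def bow_vector(vocab, descriptors, n_words=1000):
    """Quantize each descriptor to its nearest visual word and build a
    normalized histogram; image similarity then reduces to a cheap
    vector comparison, which is what makes relocalization fast."""
    bits = np.unpackbits(descriptors, axis=1).astype(np.float32)
    words = vocab.predict(bits)
    hist = np.bincount(words, minlength=n_words).astype(np.float32)
    return hist / (np.linalg.norm(hist) + 1e-12)
```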

To optimise for MIS scenes, we followed the method of NaderMahmoud2016 to fine-tune some of the default parameters used in ORB–SLAM. We extend the search region by a factor of 1.5 to allow more key points to be included. The parallax threshold for point initialization is increased by a factor of 5 to improve accuracy when triangulating points to 3D positions. The maximum error allowed between keypoints and reprojected map points for triangulation is reduced by a factor of 10 to select only robust 3D points. Finally, the Hamming distance threshold for ORB descriptor comparison is decreased by a factor of 0.8 for a stricter application of the point-pairing rule. A schematic of this retuning is sketched below.
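In the following sketch, the parameter names and base values are hypothetical stand-ins (ORB–SLAM's internal variable names differ); only the scale factors come from the description above:

```python
# Hypothetical illustration of the MIS-specific retuning described above.
BASE = {
    "search_window":         100.0,  # key-point search region (placeholder)
    "parallax_threshold":      1.0,  # min parallax before triangulation
    "reprojection_error":     10.0,  # max keypoint-to-map-point error
    "orb_hamming_threshold":  50.0,  # descriptor match acceptance
}
MIS_FACTORS = {
    "search_window":         1.5,  # wider search: more candidate key points
    "parallax_threshold":    5.0,  # stricter parallax: better triangulation
    "reprojection_error":    0.1,  # 10x tighter: only robust 3D points kept
    "orb_hamming_threshold": 0.8,  # stricter descriptor pairing
}
MIS_PARAMS = {k: BASE[k] * MIS_FACTORS[k] for k in BASE}
```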

3 Intra-operative 3D surface reconstruction

In our system, the SLAM component makes it possible to extract a sparse 3D point cloud from a moving monocular endoscopic camera. An unstructured sparse point cloud, however, describes the 3D structure of the endoscopic scene poorly. We therefore propose a 3D surface reconstruction framework that combines outlier removal filters, the Moving Least Squares (MLS) algorithm to smooth noisy data, and a Poisson surface reconstruction method to generate a dense, smooth surface from the unstructured sparse point cloud. The pipeline is illustrated in Fig. 2.

3.1 Point cloud pre-processing

The point cloud P produced by ORB–SLAM represents the salient points visible from different camera keyframes, giving a sparse representation of the intra-operative scene. MIS scenes are complicated by camera calibration errors, tissue movement and specular reflections, which result in a noisy point cloud mixed with many outliers that affect the final surface reconstruction. Therefore, before feeding the point cloud into the reconstruction pipeline, we apply two filters to remove the noisy outlier points from the raw data.

We first employ a radius filter that processes points based on the number of neighbours each point has: points with very few neighbours are labelled as outliers, since isolated points cannot describe the structure of the 3D scene. Because some texture-abundant areas gain many more points than other areas, a voxel-grid filter is then used to re-sample the point cloud to a more even density. After this filtering process, the point cloud is evenly distributed and ready for MLS smoothing and 3D surface reconstruction.
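A minimal sketch of this pre-processing stage is shown below using Open3D (our implementation is in C/C++, see Sect. 4.1; the neighbour count, radius and voxel size here are illustrative assumptions):

```python
import numpy as np
import open3d as o3d

def preprocess(map_points: np.ndarray) -> o3d.geometry.PointCloud:
    """Radius outlier removal followed by voxel-grid re-sampling of the
    sparse SLAM map points (an N x 3 array, units assumed to be metres)."""
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(map_points)

    # Radius filter: drop isolated points with too few neighbours, since
    # they cannot describe the local 3D structure of the scene.
    pcd, _ = pcd.remove_radius_outlier(nb_points=8, radius=0.01)

    # Voxel-grid filter: re-sample so texture-rich regions no longer
    # dominate, giving an even density for MLS smoothing and Poisson.
    return pcd.voxel_down_sample(voxel_size=0.004)
```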

Figure 2: The proposed intra-operative 3D surface reconstruction framework.

3.2 Moving Least Squares for point smoothing

The Moving Least Squares (MLS) algorithm Levin2004 reconstructs a surface locally by solving an optimization problem that finds a local reference plane and fits a polynomial to the surface. Let the point set $P = \{p_i\}_{i \in I}$ be the point cloud produced by the ORB–SLAM system. The continuous and smooth MLS surface is computed by a two-step procedure: (i) for a point $r$ near the surface, a local reference plane $H = \{x \in \mathbb{R}^3 \mid \langle n, x \rangle - D = 0\}$ is computed by minimizing the weighted sum of squared distances

$$\sum_{i \in I} \left( \langle n, p_i \rangle - D \right)^2 \, \theta\!\left( \lVert p_i - q \rVert \right),$$

where $q$ is the projection of $r$ onto $H$, and $\theta$ is the MLS kernel, usually a Gaussian $\theta(d) = e^{-d^2/h^2}$; (ii) after the points are projected onto the local reference plane, a second least squares optimization finds a bivariate polynomial $g(u, v)$ (where $(u, v)$ are the local coordinates of a projected point in $H$) that approximates the local surface. The projection of $r$ onto the MLS surface is then defined by the polynomial value at the origin, i.e. $q + g(0, 0)\,n$.
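To make the two-step procedure concrete, the following numpy sketch implements a simplified MLS projection; it fixes the reference plane from a single weighted PCA fit rather than Levin's full nonlinear plane search, and the kernel bandwidth is an assumed parameter:

```python
import numpy as np

def mls_project(r, points, h=0.01):
    """Project point r onto the MLS surface of `points` (N x 3)."""
    d = np.linalg.norm(points - r, axis=1)
    w = np.exp(-((d / h) ** 2))                    # Gaussian MLS kernel

    # Step (i): weighted reference plane H through the weighted centroid;
    # its normal is the eigenvector of the weighted covariance matrix
    # with the smallest eigenvalue (eigh sorts eigenvalues ascending).
    c = (w[:, None] * points).sum(0) / w.sum()
    cov = (w[:, None] * (points - c)).T @ (points - c)
    n = np.linalg.eigh(cov)[1][:, 0]

    # Build a local (u, v) frame spanning H.
    u = np.cross(n, [1.0, 0.0, 0.0])
    if np.linalg.norm(u) < 1e-8:                   # n parallel to x-axis
        u = np.cross(n, [0.0, 1.0, 0.0])
    u /= np.linalg.norm(u)
    v = np.cross(n, u)

    # Step (ii): weighted least-squares fit of a bivariate quadratic
    # g(u, v) to the point heights over H.
    loc = points - c
    U, V, Z = loc @ u, loc @ v, loc @ n
    A = np.stack([np.ones_like(U), U, V, U * U, U * V, V * V], axis=1)
    sw = np.sqrt(w)
    coeff = np.linalg.lstsq(A * sw[:, None], Z * sw, rcond=None)[0]

    # Evaluate the polynomial at r's local coordinates and lift back to 3D.
    ru, rv = (r - c) @ u, (r - c) @ v
    g = coeff @ np.array([1.0, ru, rv, ru * ru, ru * rv, rv * rv])
    return c + ru * u + rv * v + g * n
```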

3.3 Poisson surface reconstruction

We represent the oriented points after the MLS filtering stage by a vector field $\vec{V}$. Poisson surface reconstruction Michael2006 approaches the surface reconstruction problem through a framework of implicit functions, computing a 3D indicator function $\chi$ (equal to 1 inside the model and 0 outside). The problem then becomes finding the $\chi$ whose gradient best approximates the vector field $\vec{V}$:

$$\min_{\chi} \left\lVert \nabla \chi - \vec{V} \right\rVert.$$

Applying the divergence operator transforms this into a Poisson problem:

$$\Delta \chi \equiv \nabla \cdot \nabla \chi = \nabla \cdot \vec{V}.$$

After solving the Poisson problem and obtaining the 3D indicator function $\chi$, the 3D surface is obtained directly by extracting an isosurface. Poisson reconstruction is a global solution that treats all data points simultaneously, without relying on heuristic partitioning or blending, so it robustly approximates noisy data and creates very smooth surfaces.
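For illustration, the following Open3D sketch performs normal estimation and Poisson reconstruction on a filtered cloud `pcd` from the previous step (our implementation is C/C++-based; the octree depth and normal-estimation parameters are assumptions):

```python
import open3d as o3d

# Poisson needs oriented normals to define the vector field V, so estimate
# normals from local neighbourhoods and orient them consistently.
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.02, max_nn=30))
pcd.orient_normals_consistent_tangent_plane(k=15)

# Solve the Poisson problem for the indicator function and extract its
# isosurface as a triangle mesh; `depth` controls the octree resolution.
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=8)
```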

4 Results and Discussion

We designed a two-part quantitative and qualitative evaluation: (i) a ground truth study using a realistic simulated MIS video to assess the tracking error of ORB–SLAM and the accuracy of the proposed surface reconstruction framework; and (ii) real in vivo videos acquired from the Hamlyn Centre Laparoscopic/Endoscopic Video Datasets London2016 Mountney2010 to assess the quality of two applications of our framework, i.e. depth augmentation and AR with correct depth.

4.1 System setup

Our system is implemented in an Ubuntu 14.04 environment using C/C++ (without any GPU acceleration). All experiments were conducted on a workstation equipped with an Intel Xeon 2.8 GHz quad-core CPU, 32 GB of memory, and an NVIDIA GeForce GTX 970 graphics card. The simulated image sequences are 1024 × 768 pixels and the in vivo endoscopic video is 840 × 640 pixels. ORB–SLAM with our proposed AR framework runs in real time at up to 40 fps, and the 3D surface reconstruction process takes around 600 ms to traverse the whole pipeline.

4.2 Ground truth study using simulation data

To evaluate tracking accuracy, camera trajectories estimated by ORB–SLAM were aligned with the trajectory of the ground truth camera used to render the MIS scene video. Similarly, the accuracy of our proposed 3D surface reconstruction framework was evaluated by comparing the reconstructed surface with the 3D model used to render the simulation video.

To quantitatively evaluate the performance of ORB–SLAM, we used Blender Blender2016, an open-source 3D creation suite, to render realistic image sequences of a simulated abdominal cavity with a pre-defined endoscopic camera movement. The digestive system model includes appropriate textures to make the scene as realistic as possible. The model was scaled to real-life size according to an average measured liver diameter of 14.0 cm Kratzer2003, as shown in Fig. 3(a), and the material was given a strong specular component to simulate smooth and reflective liver surface tissue. The luminance was intentionally set high by attaching a spot light to the main camera, simulating realistic endoscopic lighting conditions (Fig. 3(a)). We designed a camera trajectory that hovers around the 3D model, as shown in Fig. 3(b), to capture as much area as possible and build a point cloud covering the whole front surface of the models. In total, 900 frames were rendered at a frame rate of 30 fps, equivalent to a 30-second video.
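A minimal Blender Python (bpy) sketch of this setup is given below; the object names and the `trajectory` list of (location, rotation) keyframes are placeholders for whatever the actual .blend scene defines:

```python
import bpy

scene = bpy.context.scene
scene.render.resolution_x, scene.render.resolution_y = 1024, 768
scene.render.fps = 30
scene.frame_start, scene.frame_end = 1, 900        # 30 s at 30 fps

cam = bpy.data.objects["Camera"]                   # placeholder object names
lamp = bpy.data.objects["Spot"]
lamp.parent = cam                                  # light rides on the camera

# Key-frame a hovering trajectory around the model so the rendered sequence
# covers the whole front surface; `trajectory` is an assumed list of
# (location, rotation_euler) tuples, one per frame.
for frame, (loc, rot) in enumerate(trajectory, start=1):
    cam.location, cam.rotation_euler = loc, rot
    cam.keyframe_insert(data_path="location", frame=frame)
    cam.keyframe_insert(data_path="rotation_euler", frame=frame)
```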

Figure 3: Simulated MIS scenes with a realistic human digestive system model. (a) The model is scaled to the real-world size of an adult liver. (b) The only light source is attached to the camera, and the camera trajectory is designed to hover around the 3D model. (c) The frame at which ORB–SLAM successfully initialized.
Figure 4: Comparison of the ground truth camera trajectory (red dots) with the estimated trajectory (blue dots) in four different views: (a) 3D view, (b) view along the X-axis, (c) view along the Y-axis, (d) view along the Z-axis.

Camera trajectory evaluation

Fig. 3(c) shows one of the rendered images from the sequence used as input to ORB–SLAM. The camera trajectory started at a location close to the liver surface. ORB–SLAM successfully initialized at around frame 224, when the camera reached a position where many feature points could be identified. After initialization, the SLAM system ran stably and estimated the camera trajectory, with the origin of its coordinate system at the initialization position. The estimated camera trajectory was then extracted and normalized into the same coordinate system as the simulated ground truth to assess the SLAM tracking performance.

Fig. 4 shows the evaluation results. Fig. 4(a) displays both camera trajectories in 3D space, in which blue dots represent the camera trajectory estimated by ORB–SLAM and red dots the simulated ground truth. Figs. 4(b), (c) and (d) show the two camera trajectories viewed along the X-, Y- and Z-axes, respectively. As can be seen, the SLAM camera trajectory starts at frame 224, as there is no estimate before initialization. Once camera tracking is initialized, the estimated trajectory matches the ground truth trajectory closely. The RMSE between the two camera trajectory data sets is 1.24 mm.
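The trajectory normalization and RMSE computation can be sketched as follows; the Umeyama similarity alignment shown here is one standard way to realize the scale normalization a monocular trajectory needs, not necessarily the exact procedure we used:

```python
import numpy as np

def align_sim3(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Umeyama similarity alignment (scale + rotation + translation) of the
    estimated trajectory `src` onto the ground truth `dst` (both N x 3).
    A similarity, rather than rigid, alignment is needed because a
    monocular SLAM trajectory is only defined up to an unknown scale."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    cov = (dst - mu_d).T @ (src - mu_s) / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:   # avoid reflections
        S[2, 2] = -1.0
    R = U @ S @ Vt
    var_src = ((src - mu_s) ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    return s * (src @ R.T) + (mu_d - s * R @ mu_s)

def trajectory_rmse(estimated: np.ndarray, ground_truth: np.ndarray) -> float:
    """RMSE over per-frame camera-centre distances after alignment, with
    pre-initialization frames (before frame 224 here) already excluded."""
    aligned = align_sim3(estimated, ground_truth)
    err = np.linalg.norm(aligned - ground_truth, axis=1)
    return float(np.sqrt(np.mean(err ** 2)))
```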

3D surface reconstruction evaluation

Figure 5: (a) and (b): the reconstructed surface closely represents the model surface. (c) Surface distance map between the reconstructed surface and the 3D model. (d) Distance distribution.

Once the ORB–SLAM system had gained enough feature points, we built a 3D surface from the sparse point cloud. The whole reconstruction pipeline takes only 600 ms to generate the surface, which was then exported into the 3D model space to be compared with the ground truth surface. A simple iterative closest point (ICP) algorithm was used to align the reconstructed surface with the 3D model used to render the video.
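A sketch of this alignment step using Open3D's point-to-point ICP is shown below; the text specifies only that a simple ICP algorithm was used, so the correspondence distance, identity initialization, and the `reconstructed_pcd` / `ground_truth_pcd` point clouds are illustrative assumptions:

```python
import numpy as np
import open3d as o3d

# Rigidly register the reconstructed surface onto the ground truth model.
reg = o3d.pipelines.registration.registration_icp(
    reconstructed_pcd, ground_truth_pcd,
    0.01,                       # max correspondence distance (metres)
    np.identity(4),             # initial transform guess
    o3d.pipelines.registration.TransformationEstimationPointToPoint())
reconstructed_pcd.transform(reg.transformation)   # apply the fitted pose
```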

Root Mean Squared Distance (RMSD) is used to evaluate the overall distance between the two surfaces. With both surfaces aligned in the world coordinate system, we apply a grid sample over the surface area to obtain a series of (x, y) coordinates and then compare the z values of the two surfaces at each sample point. The RMSD between our reconstructed surface and the ground truth surface is 4.32 mm.
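The grid-sampling RMSD computation can be sketched in Python as follows; the grid step and the use of scipy's `griddata` interpolation are assumptions about details the text leaves open:

```python
import numpy as np
from scipy.interpolate import griddata

def surface_rmsd(recon_pts, gt_pts, grid_step=0.002):
    """RMSD between two aligned surfaces given as N x 3 point arrays:
    sample (x, y) on a regular grid over the overlapping area and compare
    the interpolated z values of both surfaces at each grid cell."""
    lo = np.maximum(recon_pts[:, :2].min(0), gt_pts[:, :2].min(0))
    hi = np.minimum(recon_pts[:, :2].max(0), gt_pts[:, :2].max(0))
    gx, gy = np.meshgrid(np.arange(lo[0], hi[0], grid_step),
                         np.arange(lo[1], hi[1], grid_step))
    grid = np.stack([gx.ravel(), gy.ravel()], axis=1)

    z_r = griddata(recon_pts[:, :2], recon_pts[:, 2], grid)
    z_g = griddata(gt_pts[:, :2], gt_pts[:, 2], grid)
    valid = ~np.isnan(z_r) & ~np.isnan(z_g)   # cells covered by both surfaces
    return float(np.sqrt(np.mean((z_r[valid] - z_g[valid]) ** 2)))
```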

Fig. 5(a) shows that the reconstructed 3D surface aligns closely with the 3D model; Fig. 5(b) shows a top-down view of the alignment. Fig. 5(c) shows the distance map between the reconstructed surface and the 3D ground truth model, where warm colours indicate penetration between the two surfaces, green represents a perfect match, and blue shows the largest distances between them. Fig. 5(d) illustrates the distance distribution, which is approximately normal, with distances for most of the surface area between −1.0 mm and 4.0 mm.

4.3 Real endoscopic video evaluation

To qualitatively evaluate the performance of our proposed surface reconstruction framework, we applied the approach to real in vivo videos acquired from the Hamlyn Centre Laparoscopic/Endoscopic Video Datasets London2016 Mountney2010. Fig. 6(a) shows the result from our 3D reconstruction framework. Fig. 6(b) shows depth augmentation obtained by fusing the camera pose from the SLAM system with the 3D surface reconstructed by our framework. The real-time alignment of the transparent 3D mesh with the video augurs well for providing correct depth information intra-operatively, and could thus help improve surgical performance by displaying 3D mesh structures during monocular endoscopic procedures.

Our new 3D surface reconstruction approach also allows us to develop a depth-corrected AR framework for augmenting 3D models within the intra-operative endoscopic scene in real time. Depth-corrected AR is important when placing 3D models into the scene, since incorrect depth placement causes virtual objects to appear to drift away as the viewing perspective changes. Our depth-corrected AR system ensures that virtual objects are superimposed at the correct positions. Fig. 6(c) shows AR 3D text annotations placed into the video frames, and in Fig. 6(d) we manually rotated the mesh to inspect the depth of the AR objects. More details can be seen in our video Youtube2016.
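To make the depth-corrected overlay concrete, the following numpy sketch projects virtual 3D points with the SLAM camera pose and hides points occluded by the reconstructed surface; the per-pixel `surface_depth` map (assumed to be rendered from the reconstructed mesh) and the exact compositing are assumptions, not our rendering code:

```python
import numpy as np

def project_with_occlusion(pts_world, K, R, t, surface_depth):
    """Project virtual 3D points (N x 3) into the image with intrinsics K
    and the SLAM pose (R, t), and mark a point visible only if it lies in
    front of the reconstructed surface at that pixel."""
    cam = (R @ pts_world.T + t.reshape(3, 1)).T    # world -> camera frame
    z = cam[:, 2]
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                    # perspective division

    u = np.clip(uv[:, 0].astype(int), 0, surface_depth.shape[1] - 1)
    v = np.clip(uv[:, 1].astype(int), 0, surface_depth.shape[0] - 1)
    visible = (z > 0) & (z <= surface_depth[v, u]) # depth test vs. surface
    return uv, visible
```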

Figure 6: (a) The surface reconstruction result applied to an in vivo video sequence. (b) The depth augmentation. (c) An AR element inserted with correct depth. (d) The mesh manually rotated to show the depth.

5 Conclusions

In this paper, we have proposed an efficient and effective 3D surface reconstruction framework for intra-operative monocular laparoscopic scenes based on ORB–SLAM. The new approach has shown promising results when tested on both simulated laparoscopic image sequences and clinical data. The proposed framework also augurs well for depth augmentation and depth-corrected augmented reality in MIS.

In future work, we will continue developing a dense SLAM system for MIS reconstruction and extend the current reconstruction framework to improve accuracy and speed. This will enable us to develop a prototype system that can be tested in the operating theatre with our clinical collaborators, allowing us to further investigate the benefits and efficacy of this approach and to gather evidence for our hypothesis that visual SLAM can enhance the tools available to a surgeon performing a monocular endoscopic procedure.

References

  • (1) Blender: Blender - free and open 3d creation software (2016). URL https://www.blender.org/. [Accessed 6 Nov. 2016]
  • (2) Chang, P.L., Stoyanov, D., Davison, A.J., Edwards, P.E.: Real-time dense stereo reconstruction using convex optimisation with a cost-volume for image-guided robotic surgery. Med Image Comput Comput Assist Interv 16(Pt 1), 42–49 (2013)
  • (3) Chen, L.: Youtube video (2016). URL https://youtu.be/m06dxtFeBOM. [Accessed 6 Nov. 2016]
  • (4) Grasa, O.G., Bernal, E., Casado, S., Gil, I., Montiel, J.: Visual SLAM for handheld monocular endoscope. IEEE Transactions on Medical Imaging 33(1), 135–146 (2014)
  • (5) Haouchine, N., Cotin, S., Peterlik, I., Dequidt, J., Lopez, M.S., Kerrien, E., Berger, M.O.: Impact of soft tissue heterogeneity on augmented reality for liver surgery. IEEE Transactions on Visualization and Computer Graphics 21(5), 584–597 (2015)
  • (6) Haouchine, N., Dequidt, J., Peterlik, I., Kerrien, E., Berger, M.O., Cotin, S.: Image-guided simulation of heterogeneous tissue deformation for augmented reality during hepatic surgery. In: Mixed and Augmented Reality (ISMAR), 2013 IEEE International Symposium on, pp. 199–208. IEEE (2013)
  • (7) Honeck, P., Wendt-Nordahl, G., Rassweiler, J., Knoll, T.: Three-dimensional laparoscopic imaging improves surgical performance on standardized ex-vivo laparoscopic tasks. Journal of Endourology 26(8), 1085–1088 (2012). DOI 10.1089/end.2011.0670. URL http://dx.doi.org/10.1089/end.2011.0670
  • (8) Kim, J.H., Bartoli, A., Collins, T., Hartley, R.: Tracking by detection for interactive image augmentation in laparoscopy. Lecture Notes in Computer Science pp. 246–255 (2012)
  • (9) Kratzer, W., Fritz, V., Mason, R.A., Haenle, M.M., Kaechele, V., Roemerstein Study Group: Factors affecting liver size: a sonographic survey of 2080 subjects. J Ultrasound Med 22(11), 1155–1161 (2003)
  • (10) Kumar, A., Wang, Y.Y., Wu, C.J., Liu, K.C., Wu, H.S.: Stereoscopic visualization of laparoscope image using depth information from 3D model. Computer Methods and Programs in Biomedicine 113(3), 862–868 (2014). DOI 10.1016/j.cmpb.2013.12.013. URL http://dx.doi.org/10.1016/j.cmpb.2013.12.013
  • (11) Levin, D.: Mesh-independent surface interpolation. Mathematics and Visualization pp. 37–49 (2004)
  • (12) Lin, B., Johnson, A., Qian, X., Sanchez, J., Sun, Y.: Simultaneous tracking, 3D reconstruction and deforming point detection for stereoscope guided surgery. Lecture Notes in Computer Science pp. 35–44 (2013)
  • (13) London, I.C.: Hamlyn centre laparoscopic / endoscopic video datasets (2016). URL http://hamlyn.doc.ic.ac.uk/vision/. [Accessed 6 Nov. 2016]
  • (14) Michael, K., Bolitho, M., Hoppe, H.: Poisson surface reconstruction. In: Proceedings of the fourth Eurographics symposium on Geometry processing, vol. 7, p. 2006 (2006)
  • (15) Mountney, P., Lo, B., Thiemjarus, S., Stoyanov, D., Zhong-Yang, G.: A probabilistic framework for tracking deformable soft tissue in minimally invasive surgery. Med Image Comput Comput Assist Interv 10(Pt 2), 34–41 (2007)
  • (16) Mountney, P., Stoyanov, D., Davison, A., Yang, G.Z.: Simultaneous stereoscope localization and soft-tissue mapping for minimal invasive surgery. Med Image Comput Comput Assist Interv 9(Pt 1), 347–354 (2006)
  • (17) Mountney, P., Stoyanov, D., Yang, G.Z.: Three-dimensional tissue deformation recovery and tracking. IEEE Signal Processing Magazine 27(4), 14–24 (2010). DOI 10.1109/msp.2010.936728. URL http://dx.doi.org/10.1109/MSP.2010.936728
  • (18) Mountney, P., Yang, G.Z.: Soft tissue tracking for minimally invasive surgery: learning local deformation online. Med Image Comput Comput Assist Interv 11(Pt 2), 364–372 (2008)
  • (19) Mountney, P., Yang, G.Z.: Dynamic view expansion for minimally invasive surgery using simultaneous localization and mapping. 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society (2009). DOI 10.1109/iembs.2009.5333939. URL http://dx.doi.org/10.1109/IEMBS.2009.5333939
  • (20) Mountney, P., Yang, G.Z.: Motion compensated slam for image guided surgery. Medical Image Computing and Computer-Assisted Intervention MICCAI 2010 (2010)
  • (21) Mur-Artal, R., Montiel, J.M.M., Tardós, J.D.: ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Transactions on Robotics 31(5), 1147–1163 (2015). DOI 10.1109/TRO.2015.2463671
  • (22) Mahmoud, N., Cirauqui, I., Hostettler, A., Doignon, C., Soler, L., Marescaux, J., Montiel, J.M.M.: ORBSLAM-based endoscope tracking and 3D reconstruction. In: MICCAI 2016 Workshop on Computer-Assisted and Robotic Endoscopy (CARE) (2016)
  • (23) Planteféve, R., Peterlik, I., Haouchine, N., Cotin, S.: Patient-specific biomechanical modeling for guidance during minimally-invasive hepatic surgery. Ann Biomed Eng 44(1), 139–153 (2016). DOI 10.1007/s10439-015-1419-z. URL http://dx.doi.org/10.1007/s10439-015-1419-z
  • (24) Stoyanov, D., Darzi, A., Yang, G.Z.: Dense 3d depth recovery for soft tissue deformation during robotically assisted laparoscopic surgery. Medical Image Computing and Computer-Assisted Intervention MICCAI 2004 (2004)
  • (25) Stoyanov, D., Darzi, A., Yang, G.Z.: A practical approach towards accurate dense 3d depth recovery for robotic laparoscopic surgery. Comput Aided Surg 10(4), 199–208 (2005). DOI 10.3109/10929080500230379. URL http://dx.doi.org/10.3109/10929080500230379
  • (26) Stoyanov, D., Scarzanella, M.V., Pratt, P., Yang, G.Z.: Real-time stereo reconstruction in robotically assisted minimally invasive surgery. Medical Image Computing and Computer-Assisted Intervention MICCAI 2010 (2010)
  • (27) Totz, J., Mountney, P., Stoyanov, D., Yang, G.Z.: Dense surface reconstruction for enhanced navigation in mis. Med Image Comput Comput Assist Interv 14(Pt 1), 89–96 (2011)
  • (28) Wagner, O.J., Hagen, M., Kurmann, A., Horgan, S., Candinas, D., Vorburger, S.A.: Three-dimensional vision enhances task performance independently of the surgical method. Surg Endosc 26(10), 2961–2968 (2012). DOI 10.1007/s00464-012-2295-3. URL http://dx.doi.org/10.1007/s00464-012-2295-3
  • (29) Ye, M., Giannarou, S., Meining, A., Yang, G.Z.: Online tracking and retargeting with applications to optical biopsy in gastrointestinal endoscopic examinations. Medical Image Analysis 30, 14–157 (2016). DOI 10.1016/j.media.2015.10.003. URL http://dx.doi.org/10.1016/j.media.2015.10.003