Comparative study of image registration techniques for bladder video-endoscopy

by Achraf Ben-Hamadou, et al.
Université Lorraine

Bladder cancer is widespread in the world, and many adequate diagnosis techniques exist. Video-endoscopy remains the standard clinical procedure for visual exploration of the bladder internal surface. However, video-endoscopy has the limitation that the area imaged in each frame is only about 1 cm², while lesions typically spread over several images. The aim of this contribution is to assess the performance of two mosaicing algorithms leading to the construction of panoramic maps (one single image) of bladder walls. The quantitative comparison study is performed on a set of real endoscopic exam data and on simulated data relative to a bladder phantom.




1 Introduction

The applicative aim of this contribution concerns bladder cancer detection in image sequences recorded during endoscopic examinations. The 2-D cartography of an image sequence, also called image mosaicing, relies on a prior registration of consecutive image pairs of the video sequence, and then on the superposition of all the images onto a single common panoramic image. Lesion detection and evolution assessment may be far easier in such mosaics than in isolated images, each showing only a very small part of the region of interest. Mosaicing of human organ images is a rarely treated problem (see [Jalink:1996, Chou:1997, Can:2002a, Vercauteren:2005] for applications of mosaicing in mammography, angiography, ophthalmology, and microscopy): the existing solutions are not automated, need a priori knowledge such as the sensor position, or are only able to register a few images. In the case of bladder endoscopy, image mosaicing is difficult for several reasons. First, image primitives (e.g., contours) are not easy to extract robustly, and the background is severely textured; moreover, the recorded images have a great inter- and intra-patient variability. Second, the endoscope position is unknown during the image acquisition, since urologists can move the instrument “freely” inside the bladder. Third, a video sequence generally consists of thousands of images. One of the technical questions for the consecutive registration of image pairs is: how can all the images of a sequence be registered robustly, precisely and within an acceptable computation time? The computation time may be the least critical factor, since the mosaic must be available for a further diagnosis which is usually performed some dozens of minutes or hours after the examination itself. In this paper, we focus on the registration of consecutive images, denoted by I_i (target image) and I_{i+1} (source image), where i stands for the image index in the video sequence.
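Once consecutive pairs are registered, the superposition onto a common panoramic image amounts to chaining the pairwise transformations. A minimal sketch in Python/NumPy (an illustration, not the authors' implementation), assuming each pairwise transformation is given as a 3×3 homography matrix H_i mapping points of I_{i+1} into I_i:

```python
import numpy as np

def chain_homographies(pairwise):
    """Chain pairwise 3x3 homographies H_0, H_1, ... (H_i maps points of
    image i+1 into image i) into cumulative maps sending every image of
    the sequence into the frame of the first image."""
    cumulative = [np.eye(3)]  # image 0 is the mosaic reference frame
    for H in pairwise:
        # points of image i+1 -> image i -> ... -> image 0
        cumulative.append(cumulative[-1] @ H)
    return cumulative
```

For instance, chaining two pure 10-pixel translations yields a cumulative 20-pixel translation for the third image of the sequence.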
The registration of I_i and I_{i+1} consists in finding a 2-D/2-D perspective transformation T_Θ which superimposes I_{i+1} on I_i. In the notation T_Θ(p), p = (x, y) represents a 2-D point in the domain of image I_i and Θ is the set of parameters of the perspective transformation (related in eq. (1) to the translations t_x and t_y, the in-plane rotation α, the scale factor s, the shearing parameters sh_x and sh_y, and the perspective parameters p_1 and p_2). The perspective transformation of the 2-D space reads:

\[
T_{\Theta}(x, y) = \left( \frac{a_{11}\,x + a_{12}\,y + t_x}{p_1\,x + p_2\,y + 1},\ \frac{a_{21}\,x + a_{22}\,y + t_y}{p_1\,x + p_2\,y + 1} \right), \qquad (1)
\]

where the coefficients a_{11}, …, a_{22} combine the in-plane rotation α, the scale factor s and the shearing parameters sh_x and sh_y. The transformation involves 8 independent parameters, Θ = (t_x, t_y, α, s, sh_x, sh_y, p_1, p_2). The registration of images I_i and I_{i+1} is stated as the maximization of a similarity criterion S of the form:

\[
\hat{\Theta} = \arg\max_{\Theta}\ S\big(I_i,\ I_{i+1} \circ T_{\Theta}\big). \qquad (2)
\]

The difference between the registration algorithms chosen to solve this problem lies in the choice of the measure of similarity S and in the choice of the numerical optimization algorithm.
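The 8-parameter transformation of eq. (1) can be illustrated as follows (a sketch assuming one particular composition order of rotation, scale and shear, which the text does not fix uniquely):

```python
import numpy as np

def homography(tx, ty, alpha, s, shx, shy, p1, p2):
    """Build a 3x3 perspective matrix from the 8 parameters of eq. (1):
    translations (tx, ty), in-plane rotation alpha, scale s,
    shears (shx, shy) and perspective terms (p1, p2)."""
    c, si = np.cos(alpha), np.sin(alpha)
    rotation = np.array([[c, -si], [si, c]])
    shear = np.array([[1.0, shx], [shy, 1.0]])
    A = s * rotation @ shear  # affine 2x2 block
    return np.array([[A[0, 0], A[0, 1], tx],
                     [A[1, 0], A[1, 1], ty],
                     [p1,      p2,      1.0]])

def transform(H, x, y):
    """Apply the perspective transformation to a 2-D point."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w  # projective division
```

With p_1 = p_2 = 0 the projective division is trivial and the transformation reduces to an affine map; non-zero perspective terms model the out-of-plane rotations of the endoscope.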

2 Image registration algorithms

The bladder images do not systematically include image primitives (e.g., corners or contours) that can be extracted robustly enough [Yahir:2007]. For this reason, the simplest registration methods, which rely on the segmentation of an image primitive, cannot be used, and a great number of image pixels must be considered when choosing the measure of similarity.

2.1 Quadratic distance based algorithm

The first algorithm [Yahir:2006, Yahir:2007] is based on a measure of dissimilarity defined as the quadratic distance between the grey levels of the pixels of I_i and those of the perspective transformation of the pixels of I_{i+1}:

\[
E(\Theta) = \sum_{(x, y)} \big( I_i(x, y) - I_{i+1}(T_{\Theta}(x, y)) \big)^2, \qquad (3)
\]

where (x, y) denotes the coordinates of a pixel common to both the I_i and I_{i+1} images. The minimization of this measure can be done using Baker and Matthews’ inverse compositional algorithm [Baker:2004], whose goal is to estimate the optical flow, i.e., the apparent motion between two given images.
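For a given parameter vector, the quadratic measure of eq. (3) can be evaluated as in the following sketch (nearest-neighbour sampling in Python/NumPy; the inverse compositional algorithm additionally linearizes this cost to update the parameters iteratively):

```python
import numpy as np

def ssd(target, source, H):
    """Mean squared grey-level difference between I_i(x, y) and
    I_{i+1}(T(x, y)), restricted to target pixels whose warped
    position falls inside the source image (eq. (3), normalized)."""
    hs, ws = source.shape
    total, count = 0.0, 0
    for y in range(target.shape[0]):
        for x in range(target.shape[1]):
            u, v, w = H @ np.array([x, y, 1.0])
            xs, ys = int(round(u / w)), int(round(v / w))  # nearest neighbour
            if 0 <= xs < ws and 0 <= ys < hs:
                total += (float(target[y, x]) - float(source[ys, xs])) ** 2
                count += 1
    return total / max(count, 1)
```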

2.2 Mutual information based algorithm

The second algorithm [Miranda:2008, Miranda:2005] is based on Viola and Wells’ EMMA approach [viola:1997] (EMpirical entropy Manipulation and Analysis). It aligns images I_i and I_{i+1} by maximizing the measure of similarity defined as the mutual information between I_i and I_{i+1}. Briefly, the mutual information is a statistical measure computed from the grey level entropies H(I_i) and H(I_{i+1}) of the overlapping parts of I_i and I_{i+1} and from their joint entropy H(I_i, I_{i+1}):

\[
MI(I_i, I_{i+1}) = H(I_i) + H(I_{i+1}) - H(I_i, I_{i+1}). \qquad (4)
\]

This measure is used together with a stochastic gradient descent algorithm in the optimization process of eq. (2). The mutual information is well suited to the registration of textured images [PluimMV:03].
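Equation (4) can be illustrated with a plain joint-histogram estimate of the entropies (a sketch only; EMMA instead estimates the entropies stochastically from pixel samples):

```python
import numpy as np

def entropy(p):
    """Shannon entropy (bits) of a discrete distribution."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(a, b, bins=32):
    """MI(A, B) = H(A) + H(B) - H(A, B), estimated from a joint
    grey-level histogram of two images covering the same overlap."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pab = joint / joint.sum()
    pa, pb = pab.sum(axis=1), pab.sum(axis=0)
    return entropy(pa) + entropy(pb) - entropy(pab.ravel())
```

The mutual information is maximal when the grey levels of one image predict those of the other (good alignment) and drops towards 0 when they are statistically independent.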

3 Comparative study : experiments and results

In this section, we present the registration results obtained with both measures of similarity applied to common data sets: images obtained from real human bladder examinations with simulated endoscope displacements, and simulated data from a realistic phantom constructed using a pig bladder photograph.

3.1 Robustness evaluation

Three images with very different visual aspects (various textures and illumination conditions) were extracted from human endoscopic sequences to assess the robustness of the algorithms (see Figure 1). These three images were all taken as reference images (target images I_i in eq. (2)). Source images I_{i+1} were computed by applying known simulated 2-D transformations to the target images, as if a real 3-D displacement of the endoscope were simulated. The 3-D displacement includes two translations corresponding to t_x and t_y in eq. (1), a displacement along the viewing direction related to the scale factor s, and 3-D rotations (the in-plane rotation α in eq. (1) and two out-of-plane rotations related to p_1 and p_2). In this way, it is possible to compare the computed transformations with the known transformations used to simulate the images.

Figure 1: I, II and III: Three reference images extracted from real endoscopic exams for the robustness evaluation tests. The chosen images present variability in both texture and illumination.

These (I_i, I_{i+1}) image pairs allow for an assessment of the largest endoscope viewpoint change leading to successful registrations. The parameter value intervals for which a successful registration was obtained are detailed in Tab. 1. Although the translation limits are roughly of the same order for both methods (with a slight advantage for the mutual information algorithm), the robustness is clearly better for the mutual information method in terms of scale factor changes and in- and out-of-plane rotations.

Transformation value intervals

Transformation parameters            | Quadratic distance | Mutual information | Real endoscopic exam
Translations (t_x and t_y) [pixels]  |                    |                    |
Scale factor (s)                     |                    |                    |
In-plane rotation (α)                |                    |                    |
Out-of-plane rotations               |                    |                    |

Table 1: Transformation value intervals for which a successful registration was obtained with each algorithm. The last column gives the transformation value intervals observed in real endoscopic exams in most cases.

3.2 Accuracy evaluation

A quite realistic phantom was built using an excised pig bladder in order to test the registration accuracy of both methods. The pig bladder was incised, opened out and photographed with a camera. The pig bladder texture (see Figure 2(a)) is very similar to that of the human bladder. The area covered by the acquired picture is a square of 16 cm side. The first image was taken in the upper left corner of the photograph. The other images of the sequence were obtained by simulating successively 10 pixel horizontal translations (14 upper images), a combination of 10 pixel translations and in-plane rotations (upper 10 vertical images on the right side of the photograph), a combination of 10 pixel translations and 5% scale factor changes (lower 10 vertical images on the right side of the photograph), a combination of 10 pixel horizontal translations and out-of-plane rotations (first 10 lower images from the right side of the photograph), etc.

(a) (b) (c)

Figure 2: (a) Pig bladder photograph: the boxes indicate the simulated image sequence, i.e., the acquisition path. (b) Mosaic (map) image obtained with the mutual information algorithm by registration of successive images. The map is visually coherent, all textures being continuous from one image to another, and it visually matches the pig bladder photograph. (c) Same result for the optical flow method.

All image pairs (I_i, I_{i+1}) were registered with both methods. The registration accuracy criterion ε is defined as the mean distance between homologous pixels of the target image I_i and of the registered source image. This criterion is ideally equal to 0.
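Since the ground-truth transformations are known for the simulated sequence, this criterion can be computed directly from the true and estimated transformations, as in this sketch (both given as 3×3 matrices):

```python
import numpy as np

def mean_registration_error(H_true, H_est, shape):
    """Mean Euclidean distance (pixels) between the positions of homologous
    pixels mapped by the ground-truth and by the estimated transformation;
    equal to 0 for a perfect registration."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    def warp(H):
        q = H @ pts
        return q[:2] / q[2]  # projective division
    return float(np.mean(np.linalg.norm(warp(H_true) - warp(H_est), axis=0)))
```

For example, an estimate off by a pure (3, 4) pixel translation yields an error of exactly 5 pixels for every pixel, hence a mean error of 5.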

In the case of simple translations (ε ≈ 0.2 pixels) and of a combination of out-of-plane rotations (perspective changes) and translations (ε ≈ 0.6 pixels), the registration errors are equal for both methods. These errors are very small and imperceptible (see Figures 2(b) and 2(c)). For the combinations of translations and in-plane rotations, the errors are again equal for both methods (ε ≈ 3.5 pixels, see Figure 3). As observed visually (Figure 2), these errors rather correspond to a small image distortion which does not affect the global mosaic (map) coherence: even in the map regions including image borders, the textures show no discontinuities. As shown in Figure 3, the registration mean errors are equivalent for both algorithms in most sequence parts, except in the part where the scale factor changes (images number 20 to 30) and for which the mutual information algorithm is more efficient (ε ≈ 1.5 pixels compared to 4.5 pixels). Again, these errors do not affect the global visual map coherence. It is noticeable that, due to the image acquisition rate (25 images/second) and to the small endoscope displacements (a few millimetres per second), the real rotation parameters, translation parameters (t_x and t_y below 5 pixels) and scale factor changes are in fact far smaller than those imposed in our experiments. In practice, both methods led systematically to sub-pixel errors for these more limited and realistic displacements.

Figures 4 and 5 show two panoramic images constructed from real cystoscopic examination images using the quadratic distance and the mutual information algorithms, respectively. The panoramic image in Figure 4 is a 1479 × 1049 pixel image constructed from a 450 image sequence. Two polyps are visible, at the top right and at the bottom left of the image, and both polyps can be accurately located in relation to each other. Figure 5 represents a 650 × 182 pixel panoramic image constructed from a 500 image cystoscopic sequence. It shows no visible texture discontinuities, confirming a quite good visual coherence.

Figure 3: Registration mean errors ε for both algorithms. The ε values are equivalent for both algorithms in most sequence parts, except in the part where the scale factor changes (images number 20 to 30) and for which the mutual information algorithm is more efficient.

3.3 Mosaicing speed

Both algorithms were programmed in the C language using the OpenCV vision library. The evaluation of their robustness and accuracy was done on an Intel Dual Core(TM) 2.40 GHz computer with 2 GB RAM. The optimization method of the mutual information algorithm requires, on average, 250 iterations to register consecutive images, and each image pair registration takes between 50 and 60 seconds; the construction of the panoramic image in Figure 5 took nearly 8 hours and 27 minutes. Under the same experimental conditions, the quadratic distance algorithm is about 100 times faster: a mean number of 12 iterations was needed by its optimization algorithm to register a pair of images, and the registration time for an image pair varied between 0.3 and 0.6 second. The panoramic image shown in Figure 4 was constructed in 3.20 minutes. The computation time of the quadratic distance algorithm makes possible the construction of a partial panoramic image of the bladder during the standard cystoscopic examination procedure.

4 Conclusion

In terms of accuracy, both registration methods give comparable results, with a slight advantage for the mutual information method. The mutual information method is also more robust than the quadratic distance method, but the computation time of the quadratic distance algorithm (some tenths of a second to register two images) is about 100 times smaller than that of the mutual information algorithm. Future work will aim at combining both methods, so as to reach the robustness of the mutual information method while tending towards the computation times of the quadratic distance algorithm.

Figure 4: A 1479 × 1049 pixel panoramic image constructed from a 450 image sequence using the quadratic distance algorithm. Two polyps are visible, at the top right and at the bottom left of the image; in this panoramic image, both polyps can be accurately located in relation to each other.
Figure 5: A 650 × 182 pixel panoramic image constructed from a 500 image sequence using the mutual information algorithm. In this panoramic image, there are no visible texture discontinuities, confirming a quite good visual coherence.
The authors express their gratitude to the “Région Lorraine” and the “Ligue Contre le Cancer (CD 52, 54)”, and address their grateful thanks to Pr. F. Guillemin and urologist M.-A. D'Hallewin from the Cancer Institute CAV in Nancy (France) for sharing their clinical experience and for providing video sequences of various cystoscopic examinations. The authors also thank the surgeons of the Experimental Surgery Laboratory (Faculty of Medicine, Nancy) for the excision of the fresh pig bladders.