1 Introduction
The applicative aim of this contribution concerns bladder cancer detection in image sequences recorded during endoscopic examinations. The 2D cartography of an image sequence, also called image mosaicing, relies on a prior registration of consecutive image pairs of the video sequence, followed by the superposition of all the images onto a single common panoramic image. Lesion detection and evolution assessment may be far easier in such mosaics than in isolated images, each showing only a very small part of the region of interest. Mosaicing of human organ images is a seldom-treated problem (see [Jalink:1996, Chou:1997, Can:2002a, Vercauteren:2005] for applications of mosaicing in mammography, angiography, ophthalmology, and microscopy); the existing solutions are either not automated, need a priori knowledge such as the sensor position, or are only able to register a few images. In the case of bladder endoscopy, image mosaicing is difficult for several reasons. First, image primitives (e.g., contours) are not easy to extract robustly, and the background is severely textured. Moreover, the recorded images exhibit great inter- and intra-patient variability. Second, the endoscope position is unknown during the image acquisition, since urologists can move the instrument “freely” inside the bladder. Third, a video sequence generally consists of thousands of images. One of the technical questions for the consecutive registration of image pairs is: how to register all the images of a sequence robustly, precisely, and with an acceptable computation time? The computation time may be the least critical factor, since the mosaic must be available for a further diagnosis which is usually performed some dozens of minutes or hours after the examination itself. In this paper, we focus on the registration of consecutive images, denoted by I_i (target image) and I_{i+1} (source image), where i stands for the image index in the video sequence.
The registration of I_i and I_{i+1} consists in finding a 2D/2D perspective transformation T_θ which superimposes I_{i+1} on I_i. In the notation T_θ(x), x represents a 2D point in the domain of image I_{i+1} and θ is the set of parameters of the perspective transformation (related in eq. (1) to the translations t_x and t_y, the plane rotation φ, the scale factor s, the shearing parameters sh_x and sh_y, and the perspective parameters p_x and p_y). The perspective transformation of the 2D space reads:
\[
T_\theta \begin{pmatrix} x \\ y \end{pmatrix}
= \frac{1}{p_x x + p_y y + 1}
\begin{pmatrix}
s\,(x \cos\varphi - y \sin\varphi) + sh_x\, y + t_x \\
s\,(x \sin\varphi + y \cos\varphi) + sh_y\, x + t_y
\end{pmatrix}
\tag{1}
\]
and involves 8 independent parameters, θ = (t_x, t_y, φ, s, sh_x, sh_y, p_x, p_y). The registration of images I_i and I_{i+1} is stated as the maximization of a similarity criterion S of the form:
\[
\hat{\theta} = \arg\max_{\theta}\; S\big(I_i,\; I_{i+1} \circ T_\theta\big)
\tag{2}
\]
Registration algorithms suited to this problem differ in the choice of the similarity measure S and in the choice of the numerical optimization algorithm.
2 Image registration algorithms
The bladder images do not systematically include image primitives (e.g., corners or contours) that can be extracted robustly enough [Yahir:2007]. For this reason, the simplest registration methods, which rely on the segmentation of an image primitive, cannot be used, and a great number of image pixels must be considered when choosing the similarity measure S.
2.1 Quadratic distance based algorithm
The first algorithm [Yahir:2006, Yahir:2007] is based on a dissimilarity measure defined as the quadratic distance between the grey levels of the pixels of I_i and those of the perspective transformation of the pixels of I_{i+1}:
\[
E(\theta) = \sum_{(x,y)} \big[ I_i(x,y) - I_{i+1}\big(T_\theta(x,y)\big) \big]^2
\tag{3}
\]
where (x, y) denotes the coordinates of a pixel common to both the I_i and I_{i+1} images. The minimization of this measure can be done using Baker and Matthews’ inverse compositional algorithm [Baker:2004], whose goal is to estimate the optical flow, i.e., the apparent motion between two given images.
2.2 Mutual information based algorithm
The second algorithm [Miranda:2008, Miranda:2005] is based on Viola and Wells’ EMMA approach (EMpirical entropy Manipulation and Analysis) [viola:1997]. It aligns images I_i and I_{i+1} by maximizing the similarity measure defined as the mutual information between I_i and I_{i+1}. Shortly speaking, the mutual information is a statistical measure computed with the grey level entropies H(I_i) and H(I_{i+1}) of the overlapping parts of I_i and I_{i+1} and with their joint entropy H(I_i, I_{i+1}):
\[
MI(I_i, I_{i+1}) = H(I_i) + H(I_{i+1}) - H(I_i, I_{i+1})
\tag{4}
\]
This measure is used together with a stochastic gradient descent algorithm in the optimization process of eq. (2). The mutual information is well suited to the registration of textured images [PluimMV:03].
3 Comparative study: experiments and results
In this section, we present the registration results obtained with both similarity measures applied to common data sets: images from real human bladder examinations with simulated endoscope displacements, and simulated data from a realistic phantom built from a pig bladder cartography.
3.1 Robustness evaluation
Three images with very different visual aspects (various textures and illumination conditions) were extracted from human endoscopic sequences to assess the robustness of the algorithms (see Figure 1). These three images were all taken as reference images (the target images I_i in eq. (2)). Source images I_{i+1} were computed by applying known simulated 2D transformations to the target images, as if a real 3D displacement of the endoscope were simulated. The 3D displacement includes two translations corresponding to t_x and t_y in eq. (1), a displacement related to the scale factor s, and 3D rotations (the in-plane rotation φ in eq. (1) and two out-of-plane rotations related to p_x and p_y). In this way, it is possible to compare the computed transformations with the known transformations used to simulate the images.
Figure 1: The three reference images (I, II, III) extracted from human endoscopic sequences.
These (I_i, I_{i+1}) image pairs allow for an assessment of the largest endoscope viewpoint change leading to successful registrations. The parameter value intervals for which a successful registration was obtained are detailed in Tab. 1. These limits are more restricted for the quadratic distance algorithm than for the mutual information algorithm. Even if, for both methods, the translation limits are roughly of the same order (with a slight advantage for the mutual information algorithm), the robustness is clearly better for the mutual information method in terms of scale factor changes and of in- and out-of-plane rotations.
Table 1: Transformation value intervals leading to successful registration, for both algorithms and for a real endoscopic exam: translations t_x and t_y (in pixels), scale factor s, in-plane rotation φ, and out-of-plane rotations (related to p_x and p_y).

3.2 Accuracy evaluation
A quite realistic phantom was built using an excised pig bladder in order to test the registration accuracy of both methods. The pig bladder was incised, opened out, and photographed with a camera. The pig bladder texture (see Figure 2(a)) is very similar to that of the human bladder. The area covered by the acquired picture is a square of 16 cm side. The first image was taken in the upper left corner of the photograph. The other images of the sequence were obtained by simulating successively 10-pixel horizontal translations (14 upper images), a combination of 10-pixel translations and of in-plane rotations (upper 10 vertical images on the right side of the photograph), a combination of 10-pixel translations and of 5% scale factor changes (lower 10 vertical images on the right side of the photograph), a combination of 10-pixel horizontal translations and of out-of-plane rotations (first 10 lower images from the right side of the photograph), etc.
Figure 2: (a) texture of the pig bladder phantom; (b), (c) registered images.

All image pairs (I_i, I_{i+1}) were registered with both methods. The registration accuracy criterion is defined as the mean distance between homologous pixels of the target images and of the registered source images. This criterion is ideally equal to 0.
In the case of simple translations (errors of about 0.2 pixels) and of a combination of out-of-plane rotations (perspective changes) and translations (about 0.6 pixels), the registration errors are equal for both methods. These errors are very small and visually imperceptible (see Figures 2(b) and 2(c)). For the combinations of translations and in-plane rotations, the errors are again equal for both methods (about 3.5 pixels, see Figure 3). As observed visually (Figure 2), these errors correspond to a small image distortion that does not affect the global coherence of the mosaic (map). In particular, in the map regions including image borders, the textures show no discontinuities. As shown in Figure 3, the mean registration errors are equivalent for both algorithms in most parts of the sequence, except in the part where the scale factor changes (images 20 to 30), for which the mutual information algorithm is more efficient (about 1.5 pixels compared to 4.5 pixels). Again, these errors do not affect the global visual coherence of the map. It is noticeable that, due to the image acquisition rate (25 images/second) and to the small endoscope displacements (a few millimetres per second), the real changes of the rotation parameters, of the translation parameters (at most 5 pixels), and of the scale factor are in fact far smaller than those imposed in our experiments. In practice, both methods systematically led to subpixel errors for such limited, more realistic displacements.
Figures 4 and 5 show two panoramic images constructed from real cystoscopic examination images using the quadratic distance and the mutual information algorithms, respectively. The panoramic image in Figure 4 is a 1479 × 1049 pixel image constructed from a 450-image sequence using the quadratic distance algorithm. In this panoramic image, two polyps are visible, at the top right and at the bottom left of the image. Both polyps can be accurately located in relation to each other. Figure 5 represents a 650 × 182 pixel panoramic image constructed from a 500-image cystoscopic sequence using the mutual information algorithm. There are no visible texture discontinuities, confirming a quite good visual coherence.
3.3 Mosaicing speed
Both algorithms were programmed in the C language using the OpenCV vision library. The evaluation of the robustness and accuracy of both algorithms was done on an Intel Dual Core(TM) 2.40 GHz computer with 2 GB RAM. The optimization method of the mutual information algorithm requires, on average, 250 iterations to register consecutive images, and each image pair registration takes between 50 and 60 seconds. The construction of the panoramic image in Figure 5 took nearly 8 hours 27 minutes. However, in the same experimental conditions, the quadratic distance algorithm is about 100 times faster than the mutual information algorithm: a mean number of 12 iterations was needed by its optimization algorithm to register a pair of images, and the registration time for an image pair varied between 0.3 and 0.6 seconds. The panoramic image shown in Figure 4 was constructed in 3.20 minutes. The computation time of the quadratic distance algorithm makes possible the construction of a partial panoramic image of the bladder during the standard cystoscopic examination procedure.
4 Conclusion
In terms of accuracy, both registration methods give comparable results, with a slight advantage for the mutual information algorithm. However, the mutual information method is more robust than the quadratic distance method, while the computation time of the quadratic distance algorithm (some tenths of a second to register two images) is about 100 times smaller than that of the mutual information algorithm. Future work will aim at combining both methods, to reach the robustness of the mutual information method while tending towards the computation times of the quadratic distance algorithm.