I Introduction
IMAGE registration [1, 2, 3] is the process of establishing correspondences between images and a reference space, such that the contents of the images have a high degree of affinity in the reference space. An example of such correspondence is a mapping of an image (often referred to as floating image) of a brain to a reference space of another image (often referred to as reference image) of a brain where their important structures are well colocalized. There are two main categories of approaches for image registration: featurebased methods extract a set of feature points between which a correspondence is found, whereas intensitybased methods use the voxelvalues directly, and evaluate candidate mappings based on a similarity measure (affinity). There are also two main categories of transformation models: linear (which include, as special cases, rigid, similarity, and affine transformations), and nonlinear (deformable). The combination of differentiable transformation models and differentiable similarity measures enables the use of gradientbased local optimization methods.
Medical and biomedical image registration, [4], is an important branch of general image registration and a lot of effort has been invested over the last decades to refine the tools and techniques, [2]. Although a majority of the recent research has been devoted to nonlinear registration techniques, the most prevalent registration method used in the clinic is still linear registration. In a number of situations, the deformations allowed by nonlinear registration can be difficult to evaluate and may affect reliability of diagnosis, [2]; hence, physicians may prefer a more constrained rigid or affine alignment. Considering their wide usage as fundamental tools, improvement of rigid and affine registration in terms of performance and reliability is highly relevant in practice.
Featurebased image registration is dependent on the ability of the feature extraction method to locate distinct points of interest appearing in both (all) images. Featureextractors (e.g. SIFT
[5]) typically presuppose the existence and relevance of specific local characteristics such as edges, corners and other salient features; if no, or too few, such distinct points are found, the registration will fail. This is often the case in medical and biomedical applications, [6, 7], where intensitybased registration, therefore, tends to be the method of choice. Figure 2 shows an illustrative example of a biomedical application where featurebased method fails, whereas intensitybased method can be successful.Intensitybased registration is, in general, formulated as a nonconvex optimization problem. The similarity measures commonly used as optimization criteria typically exhibit a high number of local optima [8, 9]; a count which tends to rapidly increase under noisy conditions. A small region of attraction of a global optimum imposes that the starting position has to be set very close to the optimal solution for it to be found by an optimizer. This leads to reliability challenges for automated solutions.
In this study we develop a registration framework based on a family of symmetric distance measures, proposed in [9], which combine intensity and spatial information in a single measure. These measures have been shown to be characterized by smooth distance surfaces with significantly fewer local minima than the commonly used intensitybased measures, when studied in the context of template matching and object recognition. In this work we demonstrate that these distance measures can, with a slight modification, be successfully used for fast and robust affine image registration. By differentiating the distance measure we are able to use efficient gradientbased optimization. The proposed method outperforms the commonly used similarity measures in both synthetic and real scenarios of medical and biomedical registration tasks, which we confirm by (i) landmarkbased evaluation on transmission electron microscopy (TEM) images of cilia [10]
, with the aim of improving multiimage superresolution reconstruction, as well as (ii) evaluation on the task of atlasbased segmentation of magnetic resonance (MR) images of brain, on the LPBA40dataset
[11].Intensity interpolation is typically a required tool in the context of intensitybased registration performed with commonly used similarity measures since the sought transformation (and intermediate candidates) is likely to map points to regions outside of the regular grid. Treating the reference and floating images differently in terms of the interpolation introduces a significant source of asymmetry [12] and can cause a registration task to succeed or fail depending on which image is selected as reference and which is floating. Our proposed approach requires no offgrid intensity values, and is interpolationfree in terms of intensities; empirical tests confirm that it is highly symmetric in practice.
Noting that intensitybased image registration can be computationally demanding, we also demonstrate that the proposed measure is fast to compute, in comparison with the implementations of the measures existing in the ITKframework [12]. The registration framework, with the proposed measure, is implemented in C++/ITK, and its source code is available^{3}^{3}3Source code available from www.github.com/MIDAgroup.
Ii Preliminaries and Previous Work
Iia Images as Fuzzy Sets
First we recall a few basic concepts related to fuzzy sets [13], a theoretical framework where grayscale images are conveniently represented.
A fuzzy set on a reference set
is a set of ordered pairs,
, where is the membership function of . Where there is no risk for confusion, we equate the set and its membership function and let be equivalent toA grayscale image can directly be interpreted as a spatial fuzzy set by rescaling the valid intensity range to . We assume, w.l.o.g., that the images to be registered have an intensity range and we directly interpret them as fuzzy sets defined on a reference set which is the image domain, and is in most cases a subset of . We use the terms image and fuzzy set interchangeably in this text.
A crisp set
(a binary image) is a special case of a fuzzy set, with its characteristic function as membership function
(1) 
The height of a fuzzy set is . The complement of a fuzzy set is An cut of a fuzzy set is a crisp set defined as i.e., a thresholded image.
Let be an element of the reference set . A fuzzy point (also called a fuzzy singleton) defined at with height , is defined by a membership function
(2) 
IiB IntensityBased Registration and PointWise Distances
Intensitybased registration is a general approach to image registration defined as a minimization process, where a distance measure between the intensities of overlapping points (or regions) is used as optimization criterion. Given a distance measure and a set of valid transformations , intensitybased registration of two images (floating) and (reference) can be formulated as the optimization problem,
(3) 
where denotes a valid transform of image into the reference space of image .
Intensitybased similarity/distance measures which are most commonly used for image registration are Sum of Squared Differences (SSD) [14], Pearson Correlation Coefficient (PCC) and Mutual Information (MI) [15]. These measures are pointbased, i.e. they are functions of the intensities of points belonging to the overlapping regions of the two compared sets. Their evaluation, therefore, typically requires interpolation of image intensities.
For two images and defined on a common reference set of overlapping points, these measures are defined as
(4) 
(5) 
and
(6) 
In (5) , and denote means of the resp. intensity distributions over the evaluated region. In (6) the (joint and marginal) entropies , and of the image intensity distributions and
are defined in terms of the estimated probability
of a randomly selected point having intensities , , as(7) 
and
(8) 
Intensitybased registration, as formulated in (3), is, in general, a nonconvex optimization problem with a large number of local optima, especially for the commonly used pointbased measures (SSD, PCC, and MI). To try to overcome this optimization challenge, a resolutionpyramidscheme is normally used [16, 17], where smoothed low resolution images are first registered, followed by registration of images with increasing resolution and decreasing degree of smoothing, using the transform obtained from the previous stage as starting position (socalled coarsetofine approach).
IiC Distances Combining Intensity and Spatial Information
While the distances of Sec. IIB only rely on intensities of overlapping points, the distances presented in this section incorporate also spatial information of nonoverlapping points. For such spatial relations, we consider distances between two points, between a point and a set, and between two sets. The most commonly used pointtopoint distance is the Euclidean distance, denoted .
Given a pointtopoint distance , the common crisp pointtoset distance between a point and a set is
(9) 
Closely related to the crisp pointtoset distance is the (external) distance transform of a crisp set (with pointtopoint distance ) which is defined as
(10) 
Taking into the consideration the intensity, or equivalently, the height of a fuzzy point, the fuzzy pointtoset inwards distance , based on integration over cuts [9], between a fuzzy point and a fuzzy set , is defined as
(11) 
where is a pointtoset distance defined on crisp sets. The complement distance [18] of a fuzzy pointtoset distance is
(12) 
The fuzzy pointtoset bidirectional distance is
(13) 
For an arbitrary pointtoset distance , Sum of Minimal Distances (SMD) [19] defines a settoset distance as
(14) 
A weighted version can be defined [9], which may be useful if a priori information about relative importance of contributions of different points to the overall distance is available:
(15) 
Inserting distances (11) or (13) in (14) or (15) provides extensions of the SMD family of distances to fuzzy sets [9]. We refer to them as , , and .
It has been observed for fuzzy set distances [20] in general, and for distances based on (11) and (13) in particular, that distances between sets with empty cuts may give infinite or illdefined distances. We follow a previous study and introduce a parameter , [21]
, to limit the underlying crisp pointtoset distance. This has a double benefit of (i) reducing the effect of outliers and (ii) making the distances well defined also for images with empty
cuts.IiD Transformations, Interpolation, and Symmetry
Linear transformations relate points in one space to another via application of a linear function. A transformation is rigid if only rotations and translations are permitted, and affine if shearing and reflections are also permitted. Affine transformation can be expressed as matrix multiplication,
(16) 
Linear transformations can, in general, transform points sampled on an image grid to positions outside of the grid, hence an interpolator is required for obtaining the image intensity at the transformed point’s location. Interpolation is a large source of error, bias, and a significant contributing factor of asymmetry in intensitybased registration, [12]. Commonly, interpolation is only required for one of the two images, where sampling (for optimization) is done from the grid of the other image space; hence, the two images are treated asymmetrically, yielding distinct similarity surfaces (over the transformation parameters) depending on which image is taken as reference. This can cause a registration task to succeed or fail, depending on the registration direction.
IiE Optimization
Registration with a differentiable distance measure as objective function enables the use of gradientbased optimization algorithms, which typically are significantly more efficient than derivativefree algorithms for local iterative optimization. An effective and commonly used subset of gradientbased algorithms are the stochastic gradient descent methods
[22], which consider a random subset of the points in each optimization iteration, incurring a twofold benefit: utilizing randomness to escape shallow local optima in the implicit distance surface, while also decreasing the computational work required per iteration. The size of the random subset is usually given as a fraction of the total number of points, and denoted as the sampling fraction. Approximation of the cost function by random subset sampling (where a new random subset of points is chosen in every iteration) has been, in previous studies, [15, 23], shown to perform well for intensitybased registration.Iii Proposed Image Registration Framework
Iiia Distances
To extend the family of distance measures given by (15), to be suitable for registration, optionally with random subsampling optimization methods [23], we here define a new related family of distance measures.
Definition 1 (Asymmetric average minimal distance).
Given fuzzy set on a reference set , fuzzy set on reference set , and a weight function , the Asymmetric average minimal distance from to , is
(17) 
Building on the asymmetric distance, we formulate a symmetric distance as follows:
Definition 2 (Average minimal distance).
Given fuzzy set on reference set , fuzzy set on reference set , weight functions and , the Average minimal distance between and , is
(18) 
In the context of image registration, we utilize to express a (weighted) distance between transformed fuzzy points , and the image , where the transformation of a fuzzy point is given by the transformation of the reference point :
(19) 
To reflect the bounded image domain, only the transformed points falling on a predefined region are considered. Note that, when and are digital images, and are typically subsets of and the transformed points do not necessarily coincide with the points of the reference set ; an illustrative example is given in Fig. 2.
We, therefore, provide the following definitions suited for the task of image registration:
Definition 3 (Asymmetric average minimal distance for image registration).
Given fuzzy set on reference set , fuzzy set on , a weight function , and a crisp subset (mask) , the Asymmetric average minimal distance for image registration from to , parameterized by a transformation , is
(20) 
Definition 4 (Average minimal distance for image registration).
Given fuzzy set on reference set , fuzzy set on , weight functions and , and crisp subsets (masks) , the Average minimal distance for image registration between and , parameterized by an invertible transformation , with inverse , is defined as
(21) 
The distance is based on full sampling, taking into account all points in the two sets which have nonzero weights, as long as they are transformed to points inside the mask associated with the other set. To reduce the computational cost of the distance and, in addition, to enable random iterative sampling, we propose an approximate version of :
Definition 5 (Subsampled average minimal distance for image registration).
Given fuzzy set on reference set , fuzzy set on , weight functions and , and crisp subsets (masks) , the Subsampled average minimal distance for image registration between and , parameterized by an invertible transformation , with inverse , and crisp sets and , is defined as
(22) 
Inserting (11) or (13) as pointtoset distance in (20), and hence indirectly in (21) and (22), provides extensions of this family of distances to the cutbased distances, which we denote , , , , and .
Normalization of the weights of the sampled points, introduced through Def. 1, renders the magnitude of the distance (and subsequently its derivatives) invariant to the size and aggregated weight of the sets or of the sampled subsets. Since the normalization is done separately from to and from to , both directions are weighted equally even if the total weights of the point subsets from the two sets are different. This normalization can simplify the process of choosing e.g. steplength of various optimization methods, and makes it more likely that default hyperparameter values can be found and reused across different applications.
IiiB Registration
We propose to utilize symmetric distances and as cost functions in (3) to define concrete image registration methods. Inserting into (3) we obtain
(23) 
For the case of subset sampling with sets and , registration is defined as
(24) 
By selection of new random subsets and in each iteration, various stochastic gradient descent optimization methods can be realized.
IiiC Gradients
The derivative of (9), the crisp pointtoset distance measure (in dimensional space), with respect to parameters of the transformation applied to a point , yielding , can be written as
(25) 
The values are the components (partial derivatives) of the gradient of the pointtoset distance in point , and are not dependent on the transformation model.
The gradient of the fuzzy pointtoset distance measure (11) is given by the integral over cuts, of gradients of the (crisp) pointtoset distances:
(26) 
IiiD Algorithms for Digital Images on Rectangular Grids
The distances and gradients can be computed efficiently for the special case of digital images on rectangular grids. For images quantized to nonzero discrete levels the integrals in (11) and (26) are suitably replaced by sums. The number of quantization levels is typically taken to be a small constant; a choice of nonzero equally spaced levels has previously shown to provide a good tradeoff between performance, speed and noisesensitivity [9], and we keep it for all experiments.
We need a discrete operator to approximate the gradient of for a set defined on a rectangular grid with spacing . We propose to use the following difference operator providing a discrete approximation of :
(27) 
where
(28) 
is an indicator function,
(29) 
and
is the unitvector along the
th dimension.The indicator function causes the operator to be zerovalued for points included in (i.e., where the distance transform is zerovalued). This prevents discretization issues along set boundaries, where the standard central difference operator yields nonzero gradients, which can cause the measure to overshoot a potential voxelperfect overlap.
By creating tables for the distance and gradient sums for each image as a preprocessing step, using either of the procedures in Alg. 1 ( for inwards distances and for bidirectional distances), the distance and gradient may then be readily computed with Alg. 2. denotes the number of parameters of the transformation, which is for twodimensional (2D) affine transformations, and for threedimensional (3D) affine transformations.
The procedures in Alg. 1 have linear runtime complexity , achieved by using a lineartime algorithm for computing the distance transform (e.g. [24]) in line 6 of Alg. 1. The space complexity of the algorithm is and the structures must remain in memory to enable fast lookup in Alg. 2. Figure 3 shows an example of the distance and gradient of a sample level. Alg. 2 computes the pointtoset distance and gradient w.r.t. the transformation using the precomputed tables and has runtime complexity thus being independent of and the size of and .
Algorithm 3 performs a complete registration given two images, their binary masks, weight functions, and an initial transformation. Algorithm 3 completes full iterations, however other termination criteria may be beneficial (see Sec. IV). The registration consists of preprocessing, followed by a loop where the symmetric set distance and derivatives are computed and is updated. , in line 10 of Alg. 3 denotes a matrix of partial derivatives of the parameters of the inverse transform w.r.t. the parameters of the forward transform . This matrix relates the computed partial derivatives with the parametrization of the forward transform.
The overall runtime complexity is . Practical choices for tend to be in the range , depending on hyperparameters (e.g. ), and distance in parameterspace between starting position and the global optimum. The evaluation in Sec. V confirms empirically that convergence, according to (31) or (32), tends to be reached after to iterations, using an optimizer with a decaying .
The QUANTIZE procedure in Alg. 2 takes the membership of point , , and gives the index of the minimal level (, …, ) for which . If the membership is below all levels, the index is . For equally spaced levels, the quantization can be expressed as
(30) 
The INTERPOLATE procedure in Alg. 2 computes the value of the discrete functions and in point which may not be on the grid due to application of . There are many interpolation schemes proposed in the literature, but we suggest that either nearest neighbor (for maximal speed) or linear interpolation (for higher accuracy) are used here, since the distance and gradient fields are smooth. By linearity of integration and summation, nearest neighbor and linear interpolation may be performed on the preprocessed and and yield the same result as if each level was interpolated before integration, allowing efficient computation. The (discretized) measure does not require intensity interpolation; the interpolation operates on distances and gradients only.
Iv Implementation
We implemented the proposed distance measure and registration method in the opensource Insight Segmentation and Registration Toolkit (ITK) [12]. We chose this particular software framework because it

enables the use of an existing optimization framework,

allows for a fair comparison against well written, tested, and widely used implementations of reference similarity measures, with support for resolutionpyramids,

supports anisotropic/scaled voxels in relevant algorithms,

facilitates reproducible evaluation,

makes the proposed measure easily accessible for others.
The builtin ITK optimizer we have used for the registration tools and all the evaluation is RegularStepGradientDescentOptimizerv4. This is an optimizer based on gradient descent, with an initial steplength , and a relaxation factor which reduces the used steplength gradually as the direction of the gradient changes, in order to enable convergence with high accuracy. In addition to a maximum number of iterations , two termination criteria are used: (i) a gradient magnitude threshold (GMT),
(31) 
and (ii) a minimum steplength (MSL),
(32) 
where is the current relaxation coefficient. We use default values of for both of these criteria. A relaxation factor of is used for all experiments, since it performed well in preliminary tests; in this study we are willing to trade some (potential) additional iterations for better robustness. To maximize utilization of the limited number of levels, images are normalized before registration to make sure that they are within the valid interval. We use the following robust normalization technique: Let denote the th percentile of image with respect to image intensities,
(33) 
V Performance Analysis
We evaluate performance of the proposed method, both for 2D and 3D images, in two different scenarios; (i) we perform a statistical study on synthetically generated images, where we seek to recover known transformations and measure the registration error by comparing the ground truth locations of known landmarks with the corresponding registered ones; (ii) we apply the proposed framework to real image analysis tasks: landmarkbased evaluation of registration of TEM images in 2D, and atlasbased segmentation evaluation of 3D MR images of brain.
To compare the proposed measure and registration method against the most commonly used alternative methods and similarity measures, we select the widely used ITK implementations of optimization framework and similarity measures (SSD, PCC and MI) as the baseline of intensitybased registration accuracy. Note that the PCC measure is denoted Normalized Cross Correlation (NCC) in the ITK framework.
All experiments are performed on a workstation with a 6core Intel i76800K, 3.4GHz processor with 48GB of RAM and 15MB cache. The operating system used is Ubuntu 16.04 LTS. The compiler used to build the framework is g++ version 5.4.0 (20160609). Version 4.9 of the ITKframework is used for testing and evaluation.
Va Datasets
One biomedical 2D dataset and one medical 3D dataset are used for the evaluation.
VA1 TEM Images of Cilia (2D)
The dataset of 20 images of cilia [10] is acquired with the MiniTEM^{4}^{4}4MiniTEM imaging system is developed by Vironova AB. imaging system. Each image is isotropic of size pixels, with a pixelsize of a few nm. An example is shown in Fig. 2. A particular challenge is the nearrotational symmetry of the object: 9 pairs of rings are located around a central pair of rings, which gives 9 plausible solutions for a registration problem. The alignment of the central pair can be taken into special consideration to reduce the number of solutions to two. The dataset comes with a set of 20 landmarks per image, indicating the position of each of the relevant structures to be detected and analysed – rings ( in the center and in a circle around the center). The landmarks are produced by a domain expert and are only used for evaluation of the registrations.
VA2 Lpba40 (3d)
LPBA40 [11] is a publicly available dataset of 40 3D images of brains of a diverse set of healthy individuals, acquired with MRI. The images are anisotropic, of size voxels with voxelsize . The dataset comes with segmentations of the brains into 56 distinct regions marked by a medical expert, which are used in this study as groundtruth for evaluation. LPBA40 includes two atlases: first 20 out of 40 MR images of brain in the dataset are used to generate one brain atlas by Symmetric Groupwise Normalization (SyGN) [25]; another atlas is created analogously, from the last brains in the dataset. The atlases contain both a synthesized MR image and the fused label category in all the voxels, as well as a whole brain mask which may be used for brain extraction.
VB Evaluation Criteria
We evaluate accuracy and robustness of the registration methods in presence of noise, their robustness w.r.t. change of roles of reference and floating image (symmetry), and their speed. We quantify the performance of the observed frameworks in terms of the following quality measures:
VB1 Average Error measure (AE)
The registration result is quantified as the mean Euclidean distance between the sets of corresponding image corner landmarks and in the reference image space, after transformation of the floating image corner landmarks , where is the number of landmarks (4 in 2D; 8 in 3D). The quality measure is defined as
(34) 
A slight variation of this measure, the Average Minimal Error (AME), is used in the real task of cilia registration:
(35) 
For the central pair, the error is simply
, whereas for the outer rings we utilize the knowledge that an odd (even) landmark should be matched with an odd (even) landmark of the other image. The error function for the outer rings,
[10], is therefore defined as:(36) 
VB2 Success Rate (SR)
A registration is considered successful if its AE is below one voxel(pixel). Success rate (SR) at a given AE value corresponds to the ratio of successful registrations (w.r.t. the set of performed ones).
VB3 Symmetric Success Rate (SymSR)
is defined as the ratio of performed registrations which are successful (i.e., ) in both directions, i.e., when the roles of reference and floating image are exchanged.
VB4 Inverse Consistency Error (ICE),[26]
Given a set of interest , the transformations , and , the ICE of this pair of transformations is
(37) 
We compute ICE considering all the points of the reference image for each of the cases where Symmetric Success is observed ( in both directions).
VB5 Jaccard Index for segmentation evaluation
VB6 Execution Time
We evaluate (i) the execution times required for one iteration in the registration procedure, i.e., times needed to compute the distance (similarity) measure and its derivatives, with full sampling, and in full image resolution, between two distinct images from the same set, as well as (ii) the execution time for complete registrations.
VC Parameter Tuning
The distance measure and optimization method have a number of parameters which must be properly chosen. Synthetic tests indicated that the following values lead to good optimization performance: three pyramid levels with downsampling factors and Gaussian smoothing , max iterations per level and an initial steplength . The number of levels used is . Normalization percentile is normally . This same parameter setting, if not stated differently, is used in all the tests, on both synthetic and real data.
VD Synthetic Tests
A synthetic evaluation framework is used to evaluate the performance of the proposed method, and to compare it with standard tools based on SSD, PCC, and MI, in a controlled environment. For this evaluation, we construct sets of transformed versions of a reference image and add (a new instance of) Gaussian noise to each generated image. The transformations are selected at random from a multivariate uniform distribution of rotations measured in degrees (1 angle for 2D images and 3 Euler angles for 3D images) and translations measured in fractions of the original image size.
VD1 2D TEM Images of Cilia
Three sets of transformed images are built based on image Nr. 1 in the observed dataset, by applying on it the following three groups of transformations: Small, containing compositions of translations of up to of image size (in any direction) and rotations by up to ; Medium, containing compositions of translations and rotations such that at least one of the parameters exceeds the range of Small, and falls within of image size of translation (in at least one direction), or of rotation; and Large, containing compositions of translations and rotations such that at least one of the parameters exceeds the range of Medium, and falls within of image size of translation (in at least one direction), or of rotation. The transformed images are also corrupted by additive Gaussian noise, from (, corresponding to a PSNR20 dB). Each group of transformations is applied 1000 times, and the resulting images are registered to image Nr. 1, each time corrupted by a new instance of Gaussian noise.
To evaluate symmetry, we performed registrations of images transformed by randomly selected translations of up to of image size, and rotations by up to , and corrupted by additive noise from . Each of the registrations were performed twice, with exchanged roles of reference image and floating image.
Intensitybased registration with gradientdescent optimization can be computationally demanding, requiring the distance function and its derivative for each iteration of the optimization procedure. The time to compute the distance and derivatives is directly proportional to the number of sampled points. We, therefore, evaluate influence of the sampling fraction on registration success, observing registrations after Small transformations and added noise (with ), over a range of sampling fractions. For each evaluated sampling fraction, registrations are performed and SR and AE are computed for successful registrations (). No resolution pyramids are used for these tests.
VD2 3D MR Images of Brain
Three sets of transformed images are built based on image Nr. 1 in the observed dataset, by applying to it the following three groups of transformations: Small, containing compositions of translations of up to of image size (in any direction) and rotations by up to (around each of the rotation axes); Medium, containing compositions of translations and rotations such that at least one of the parameters exceeds the range of Small, and falls within of image size of translation (in at least one direction), or of rotation (around at least one rotation axes); and Large, containing compositions of translations and rotations such that at least one of the parameters exceeds the range of Medium, and falls within of image size of translation (in at least one direction), or of rotation (around at least one rotation axes). The transformed images are also corrupted by additive Gaussian noise, from . Each group of transformations is applied 200 times, and the resulting images are registered to image Nr. 1, each time corrupted by a new instance of Gaussian noise.
VE Results of Synthetic Tests
VE1 2D TEM Images of Cilia
Figure 4 shows the distributions of registration errors (AE), for the three transformation classes. Superiority of the proposed measure, and the corresponding registration framework, is particularly clear for Medium and Large transformations; it reaches a success rate, with subpixel accuracy, whereas the competitors not only exhibit considerably lower accuracy, but also much lower success rate, i.e., they completely fail in a large number of cases.


Overall registration performance is summarized in Table I, for complete sampling (a), and for random sampling of 10% of the points (b). The proposed method has success rate and also symmetric success rate. The other observed measures exhibit much lower success rate and poor symmetry scores; the second best, SSD, succeeds in of the cases, and succeeds symmetrically in only of the cases. The registration error for successful registrations is considerably smaller for the proposed method, while the execution time is considerably lower. The reduced sampling fraction in (b) has a small impact on the proposed method while substantially degrading the performance of the other measures.
Figure 5 shows registration performance for varying sampling fractions; Small transformations, in presence of noise () are considered. We observe that the registration performance flattens and stabilizes at approximately sampling fraction ( of the points). We conclude that previous findings of [15, 23], suggesting that random subsampling provides good performance even with very small sampling fractions, apply well for the proposed measure.
VE2 3D MR Images of Brain
Figure 6 shows the observed distributions of registration errors (AE) for the three transformation classes, and clearly confirms that the proposed method is robust and with high performance, even for larger transformations, while the magnitude of the transformation has a substantial negative effect on the performance of the other observed measures.
Figure 7 presents bar plots corresponding to the performed synthetic tests on the LPBA40dataset, consisting of registrations of images after up to (and including) Large transformations (with additive Gaussian noise, ). Successful registrations () are observed. Here as well, the proposed method delivers success rate, whereas the second best, SSD, succeeds in only of the cases. The registration error for successful registrations is the smallest for the proposed method. We observe a relative increase in execution time of the proposed registration framework in 3D case, where it is slightly slower than the other measures.
VE3 Execution Time Analysis
The number of iterations required for convergence of the optimization (registration) typically range from 1000 to 3000. Measures SSD, PCC and MI use cubic spline interpolation. Lookups from the distance maps for are done using linear interpolation. Table II
shows the mean (and standard deviation) execution time of one iteration, which includes computation of the measures and their derivatives, repeated
times for 2D, and times for 3D affine image registrations. We observe that the proposed measure is the fastest per iteration both in 2D and 3D. Note that these execution time measurements exclude preprocessing.Measure  Cilia (2D) [s]  Brain (3D) [s]  

Mean  Std.dev.  Mean  Std.dev.  
0.270  0.010  3.116  0.098  
SSD  0.718  0.036  4.782  0.066 
PCC  1.191  0.026  8.562  0.002 
MI  0.890  0.025  5.699  0.002 
VF Evaluation on Real Applications
VF1 Registration of Cilia
Registration of multiple cilia instances detected in a single TEM sample, for enhancement of diagnostically relevant substructures, requires a pixelaccurate and robust method which is able to overcome the challenges posed by the nearrotational symmetry of a cilium. At most two of the possible solutions properly align the central pair, which is vital for a successful reconstruction.
We compare the performance of the proposed method with reported results of a previous study [10] which uses intensitybased registration with PCC as similarity measure. We follow the general protocol described in [10] and perform, as a first step, a multistart rigid registration (parameterized by angle in radians, and translation ), followed, in a second step, by affine registration initiated by the best (lowest final distance) registration of the 9 rigid ones.
No resolution pyramids are used since they were observed to interfere with the multistart approach (by facilitating large movements). The registrations are performed in full resolution, without stochastic subsampling. For the rigid registration we use a small circular binary mask with radius of pixels, positioned in the center, combined with a squared circular Hann window function. The affine registration is performed using a circular binary mask with radius of pixels; the mask removes the outside background and the outer plasma which is not helpful in guiding the registration. No additional weightmask is used for the affine registration. Step length was used for the rigid and for the affine registration. We use . Normalization percentile is set to for the rigid stage and for the affine stage.
VF2 Atlasbased Segmentation (LPBA40)
In [27], a protocol for evaluation of distance/similarity measures in the context of image registration was proposed. The protocol starts with affine registration, for which results are reported, and then proceeds to deformable registration. Since this study focuses on the development of an affine (linear) registration framework based on the proposed distance measure, we compare with the reported affineonly performance; an improved affine registration is of great significance since a very high correlation between the performance of the affine registration and that of the subsequent deformable registration has been established.
We start from the two atlases created utilizing the Advanced Neuroimaging Tools (ANTs) registration software suite and the opensource evaluation script provided in the reference study [27]. We utilize the atlas created using Mutual Information since that is the one found in [27] to be best performing and is used as the basis for the whole deformable registration study. Twofold cross validation is utilized; the first atlas is registered to the last brain images and the second atlas is registered to the first brains, hence all registrations are done with brains that did not contribute to the creation of the atlas.
The multilabel segmentations defined by the atlas are transformed using the transformation parameters found during the registration and compared to the groundtruth segmentations for each brain. The Jaccard Index [28] is calculated per region, as well as for the entire brain mask.
For the proposed method based on we use