1 Introduction
is a ubiquitous second messenger in organisms [1]. Quantitatively analyzing spatialtemporal patterns of intercellular calcium () signaling in tissues is important for understanding biological functions. Wing disc pouches of fruit flies are a commonly used genetic model system of organ development and have recently been used to study the decoding of signaling in epithelial tissues [1, 2]. However, wing discs can undergo considerable movements and deformations during live imaging experiments. Therefore, an effective automatic image registration approach is needed to align pouches across image sequences.
Registering tissues that undergo deformations in timelapse image sequences is a challenging problem. For example, experimental image data of wing disc pouches at different time points are moving or deforming due to a general feature of tissue growth, morphogenesis, and due to general movement during the live imaging process. Furthermore, a timelapse movie can contain many frames, and a number of movies are needed to obtain reliable measurements of activity due to the noisy and stochastic nature of the signals, which make the processing more complicated and costly. Common approaches suffer considerably due to cumbersome intensity distortions of tissues caused by oscillations. A method for minimizing the error residual between the local phasecoherence representations of two images was proposed to deal with nonhomogeneity in images [3], which relies heavily on structural information. But, in our problem, the intensity inside the pouches may change a lot, thus causing such methods to fail. A MarkovGibbs random field model with pairwise interaction was used to learn prior appearance of a given prototype, which makes it possible to align complex images [4]. Incorporation of spatial and geometric information was proposed to address the limitations of the static local intensity relationship [5]. But, the computational complexity of these methods is high. In our problem, a single timelapse movie may have hundreds of frames, and hundreds of movies are analyzed. Hence, these methods do not work well for our problem. In general, known methods do not address well the kind of complex intensity distortion in our images with a reasonable computation cost.
To address the difficulty of spatial intensity distortion along with various ROI deformations and movements, we propose a new nonrigid registration approach to effectively align pouches at different time points in image sequences. Our approach consists of two main stages: (1) a segmentation stage and (2) a mapping stage (see Fig. 1). The segmentation stage is based on a deep neural network (FCN). Because the accuracy of the ROI boundary is a key factor influencing the registration results, we apply a graph search algorithm to refine the segmented pouch boundaries. The mapping stage uses the segmentation results from the first stage to characterize an optimal transformation. We first apply a rigid transformation to modulate the movement of pouches, and then design an object predetection Bspline based nonrigid algorithm to produce the final mapping.
2 Methodology
In this section, we present a complete pipeline, which takes timelapse image sequences of pouches as input and produces the registered sequences (see Fig. 1). We first discuss the segmentation stage (i.e., FCN model and boundary refinement), and then the mapping stage (i.e., rigid transformation and nonrigid transformation).
2.1 Segmentation Stage
Our registration approach is based on accurate pouch segmentation. Pouches in our images are commonly surrounded and touched by extra tissues with similar intensity and texture, and the separation boundary between pouches and extra tissues is usually of poor visibility. Meanwhile, the noise induced by the live imaging process makes the segmentation task more challenging. Thus, it is important for our segmentation algorithm to leverage the morphological and topological contexts, in order to correctly segment the shape of each actual pouch, especially its boundary, from the noisy background. For this, we employ an FCN model to exploit the semantic context for accurate segmentation, and a graph search algorithm to further refine the boundaries.
FCN module
Recently, deep learning methods have emerged as powerful image segmentation tools. Fully convolutional networks (FCN) are widely used in general semantic segmentation and biomedical image segmentation
[6, 7].It is worth mentioning that in our images, the separation boundary between a pouch and other tissues is usually quite subtle (as thin as 3 to 5 pixels wide) and obscure, while the whole contextual region (including both the pouch and extra tissues) can be of a relatively large scale (more than pixels). Therefore, the FCN model must fully exploit both the fine details and a very large context. For this purpose, we carefully design the FCN architecture following the model in [8] to leverage a large receptive field without sacrificing model simplicity and neglecting fine details. The exact structure of our FCN model is depicted in Fig. 2.
Boundary refinement We observed that the boundaries of the output pouches from our FCN model may be fuzzy or of irregular shape in difficult cases (see Fig. 4(F)). To improve the boundary accuracy of the segmented pouches, we first apply a Gaussian smooth filter to reduce the influence of intensity variations inside the pouches and then employ a graph search algorithm with the node weights being the negatives of the gradients to further refine the shape boundaries. This allows the subsequent mapping stage to be built upon more accurate segmentation and produce better registration results. Details of this process are omitted due to the page limit.
2.2 Mapping Stage
The goal of the registration process is: For every point in the source image, obtain an optimal corresponding point in the reference image. A key observation is that the intensity profile of the same pouch may incur substantial changes in different frames of an image sequence, due to undergoing signal waves. Hence, intensity is not a reliable cue for finding optimal correspondence between points in different frames of a sequence. Here, we utilize the results from the segmentation stage.
Rigid transformation Since there are lots of movements (i.e., rotation and translation) of pouches in image sequences, we first compute an optimal rigid transformation to reduce their influence. This optimization step uses a regularstep gradient descent algorithm. Note that here, local optimum traps could be an issue in practice. Specifically, since the pouches are often of oval shapes, a local optimum may yield incorrect results where the object is aligned with the opposite orientation. To resolve this issue, we always initialize the optimization process using the optimum parameters computed from the preceding frame in the sequence, since the pouch movement in consecutive frames is usually not very big.
Nonrigid transformation The nonrigid registration seeks an optimal transformation : , which maps any points in the source image at time to time (i.e., in the reference image). We use the freeform deformation (FFD) transformation model, based on Bsplines [9], to find an optimal transformation.
For our problem, not too many control points are needed outside a pouch, because we need to focus only on the ROI. Thus, detecting ROIs first can save computation time. Pouches will be near the same position in the frames after the rigid registration, making it possible to do nonrigid transformation only around the ROI area. Based on this observation, we can crop an area in the first frame, and apply a lattice to this area in the following changing frames. We define the lattice as an grid in the domain (see Fig. 3).
We define the registration model as follows. Let be a rectangular domain in the plane, where the and values specify the boundary of the detection area. To approximate the intensity of scattered points, , in a pouch, we formulate a function as a uniform cubic Bspline function: , where , and . In addition, represents the th basis function of the cubic Bspline: , and
Since the resolution of the control points determines the nonrigid degree and has a big impact on the computation time, a higher resolution of control points gives more freedom to do deformation while also increasing the computation time. To optimize this tradeoff, we use a multilevel mesh grid approach [10] to devise a computationally efficient algorithm. Let denote a hierarchy of meshes of control points with increasing resolutions and be the deformation functions with respect to each mesh. We first apply a coarse deformation, and then refine the deformation gradually. The size of the mesh grid is increased by a factor of 2 with the same spacing, so that the raw image is downsampled to the corresponding size at different levels. The final deformation is the composition of these functions:
To obtain an optimal transformation , we minimize the following energy function: , where the first term is the similarity measure and the second term is for regularization. is the curvature penalization and is the displacement of control points. The similarity measure we use is the sum of squared distance (SSD): , where is the reference image intensity function, is the intensity function of the transformed image of a source image under the current transformation and is the number of pixels.
This process iteratively updates the transformation parameters, , using a gradient descent algorithm. When a local optimum of the cost function is smaller than or the number of iterations is bigger than , the algorithm will terminate.
3 Experiments and Evaluations
We evaluate our registration approach from two perspectives. First, we evaluate the accuracy of the segmentation method, because the segmentation accuracy is crucial to our overall approach. Second, we conduct experiments using both synthetic data and real biological data to assess the registration performance of different approaches.
3.1 Segmentation Evaluations
We conduct experiments on 100 images of 512512 pixels selected from 10 randomly chosen control videos. Our method is compared with both traditional method (e.g., level set [11]) and stateoftheart FCN models (i.e., UNet [7] and CUMedNet [12]). The FCN models are trained using the Adam optimizer [13] with a learning rate of 0.0005. Data augmentation (random rotation and flip) is used during training. We use the mean IU (intersection over union) and F1 as the metrics. Table 1 shows the quantitative results of different methods, and Fig. 4 shows some segmentation examples of pouches using different methods. It is evident that our FCN model works better in segmenting difficult ROIs, and our boundary refinement can help obtain accurate ROI boundaries.
Mean IU  F1 score  

Level set  0.8235  0.8294 
CUMedNet  0.9394  0.9454 
UNet  0.9479  0.9542 
Our FCN  0.9586  0.9643 
Our FCN + BR  0.9617  0.9682 
3.2 Registration Evaluations
We compare the registration performance with the Demon algorithm [14] and a Bspline method based on Residual Complexity (RC) [15], which are two stateoftheart nonrigid registration methods for images with spatiallyvarying intensity.
Synthetic data For synthetic data, we choose 8 pouch images as reference images. Specifically, for each reference image , we generate 20 source images by adding geometric distortion , and intensity distortion , to simulate an image sequence with undergoing movements and signaling (i.e., . For geometric distortions, we apply an elastic spline deformation to perturb the points according to the grid deformation and a rigid transformation. For intensity distortion, we first add Gaussian noise in random diskshaped regions within the pouch to simulate the calcium signal waves and then rescale the intensity to [0, 1]. Our target is to find the optimal transformation from every source image to the corresponding reference image (i.e., . To quantify the performance, we compare the intensity root mean square error (RMSE) between the reference image and the clean registered images , which are obtained by applying to the geometrically distorted images without intensity distortion (i.e., . The idea is to evaluate whether the registration algorithm is able to find an optimal geometric transformation without damaging the texture. Fig. 5.I shows some visual results of different methods, and Table 2 gives quantitative results. The results of our approach are considerably more accurate.












Wing disc pouch data We randomly choose 8 movies from 150 control videos. In each movie, we take the first frame as the reference image and all the other frames as source images. To validate the registration results, we apply transformation to the annotation of source images to obtain the registered annotation boundary and compare the results using the Hausdorff distance (HD) error metric. Table 3 gives the quantitative results. Our approach achieves accurate boundary shapes. Also, our method obtains clear texture inside ROIs, as shown by the examples in Fig. 5.II.
4 Discussions and Conclusions
In this paper, we propose a new twostage nonrigid image registration approach and apply it to analyze live imaging data of wing disc pouches for signaling study. Comparing to the stateoftheart nonrigid methods for biomedical image registration, our approach achieves higher accuracy in aligning images with nonnegligible texture distortions. Our approach lays a foundation for quantitative analysis of pouch image sequences in wholeorgan studies of signaling related diseases. Our approach may be extended to solving other biomedical image registration problems, especially when the intensity profiles and texture patterns of the target objects incur significant changes. The mapping stage of our approach is applicationdependent, while our segmentation method is general and can be applied to many problems by modifying only the graph search based boundary refinement procedure.
5 Acknowledgment
This research was supported in part by the Nanoelectronics Research Corporation (NERC), a whollyowned subsidiary of the Semiconductor Research Corporation (SRC), through Extremely Energy Efficient Collective Electronics (EXCEL), an SRCNRI Nanoelectronics Research Initiative under Research Task ID 2698.005, and by NSF grants CCF1640081, CCF1217906, CNS1629914, CCF1617735, CBET1403887, and CBET1553826, NIH R35GM124935, Harper Cancer Research Institute Research like a Champion awards, Walther Cancer Foundation Interdisciplinary Interface Training Project, the Notre Dame Advanced Diagnostics and Therapeutics Berry Fellowship, the Notre Dame Integrated Imaging Facility, the Bloomington Stock Center for fly stocks.
References
 [1] Q. Wu, P. Brodskiy, C. Narciso, M. Levis, J. Chen, P. Liang, N. ArredondoWalsh, D. Z. Chen, and J. J. Zartman, “Intercellular calcium waves are controlled by morphogen signaling during organ development,” bioRxiv, no. 104745, 2017.
 [2] E. L. Ardiel, A. Kumar, J. Marbach, R. Christensen, R. Gupta, W. Duncan, J. S. Daniels, N. Stuurman, D. ColónRamos, and H. Shroff, “Visualizing calcium flux in freely moving nematode embryos,” Biophysical Journal, vol. 112, no. 9, pp. 1975–1983, 2017.
 [3] A. Wong and J. Orchard, “Robust multimodal registration using local phasecoherence representations,” Journal of Signal Processing Systems, vol. 54, no. 13, p. 89, 2009.
 [4] A. ElBaz, A. Farag, G. Gimel’farb, and A. E. AbdelHakim, “Image alignment using learning prior appearance model,” in ICIP, pp. 341–344, 2006.
 [5] J. Woo, M. Stone, and J. L. Prince, “Multimodal registration via mutual information incorporating geometric and spatial context,” IEEE Transactions on Image Processing, vol. 24, no. 2, pp. 757–769, 2015.

[6]
J. Chen, L. Yang, Y. Zhang, M. Alber, and D. Z. Chen, “Combining fully convolutional and recurrent neural networks for 3D biomedical image segmentation,” in
NIPS, pp. 3036–3044, 2016.  [7] O. Ronneberger, P. Fischer, and T. Brox, “UNet: Convolutional networks for biomedical image segmentation,” in MICCAI, pp. 234–241, 2015.

[8]
L. Yang, Y. Zhang, J. Chen, S. Zhang, and D. Z. Chen, “Suggestive annotation: A deep active learning framework for biomedical image segmentation,” in
MICCAI, pp. 399–407, 2017.  [9] S. Lee, G. Wolberg, K.Y. Chwa, and S. Y. Shin, “Image metamorphosis with scattered feature constraints,” IEEE Transactions on Visualization and Computer Graphics, vol. 2, no. 4, pp. 337–354, 1996.
 [10] D. Rueckert, L. Sonoda, E. Denton, S. Rankin, C. Hayes, M. O. Leach, D. Hill, and D. J. Hawkes, “Comparison and evaluation of rigid and nonrigid registration of breast MR images,” in SPIE, vol. 3661, pp. 78–88, 1999.
 [11] C. Li, C.Y. Kao, J. C. Gore, and Z. Ding, “Minimization of regionscalable fitting energy for image segmentation,” IEEE Transactions on Image Processing, vol. 17, no. 10, pp. 1940–1949, 2008.

[12]
H. Chen, X. Qi, J.Z. Cheng, P.A. Heng, et al.
, “Deep contextual networks for neuronal structure segmentation.,” in
AAAI, pp. 1167–1173, 2016.  [13] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
 [14] D.J. Kroon and C. H. Slump, “MRI modality transformation in demon registration,” in ISBI, pp. 963–966, 2009.
 [15] A. Myronenko and X. Song, “Image registration by minimization of residual complexity,” in CVPR, pp. 49–56, 2009.