Graph-based Hypothesis Generation for Parallax-tolerant Image Stitching

04/20/2018 ∙ by Jing Chen, et al. ∙ Nankai University

The seam-driven approach has proven fairly effective for parallax-tolerant image stitching; its strategy is to search for an invisible seam among a finite set of representative hypotheses of local alignment. In this paper, we propose a graph-based hypothesis generation and a seam-guided local alignment to improve the effectiveness and the efficiency of the seam-driven approach. Experiments demonstrate a significant reduction in the number of hypotheses and more natural-looking final stitching results compared with the state-of-the-art method SEAGULL.



I Introduction

Parallax handling is a challenging task for image stitching. Global alignment usually introduces noticeable artifacts or objectionable distortion. The seam-driven approach is a powerful tool for addressing the parallax problem: it searches a finite set of representative hypotheses of local alignment for an invisible seam that yields plausible and natural stitching [1, 2, 3]. Its effectiveness and efficiency depend on the quality and the number of hypotheses of local alignment, which are determined by image warping and correspondence grouping respectively (see Figure 1).

Fig. 1: Overview of the seam-driven approach

Correspondence grouping involves choosing both the warp and the strategy used to align and group feature correspondences. Image warping is evaluated by the quality of the seam, the distortion and the naturalness. Gao et al. [1] use homography (homo) and multiple RANSAC to divide the feature correspondences into hypotheses and generate the corresponding local alignments via homography. Zhang and Liu [2] use homography and a randomized strategy (seed) for correspondence grouping, then combine homography with content-preserving warping (CPW) to generate hypotheses of local alignment. Lin et al. [3] use homography and a new superpixel-based strategy (combinatorial) for correspondence grouping, then use structure-preserving warping (SPW) with adaptive feature weighting to generate hypotheses of local alignment. Because homography is not flexible enough, these correspondence groupings are not representative enough and generate some undesirable hypotheses, which lowers the efficiency. In addition, these image warps suffer from distortion and naturalness issues when the scene geometry is non-planar, so their stitching results look less natural. Table I compares the seam-driven approaches in terms of correspondence grouping and image warping.

Method             Correspondence grouping    Image warping
Gao et al. [1]     homo+multiple RANSAC       homo
Zhang & Liu [2]    homo+seed                  homo+CPW
Lin et al. [3]     homo+combinatorial         SPW
Ours               APAP+graph                 SPMD
TABLE I: Comparison of seam-driven approaches

In this paper, we propose a graph-based hypothesis generation and a seam-guided local alignment for parallax-tolerant image stitching. First, we use the as-projective-as-possible (APAP) warp and a graph-based strategy to group dual-feature correspondences into a few representative hypotheses. Then we use a single-perspective mesh deformation (SPMD) with adaptive feature weighting that depends on the current seam to generate the corresponding local alignment. Finally, we search for a good seam among these hypotheses and create the final stitching result. Experiments demonstrate a significant reduction in the number of hypotheses and more natural-looking final stitching results compared with SEAGULL.

II Correspondence Grouping

In order to increase the alignment accuracy and decrease undesired distortion, we use both line segment and point correspondences as our feature correspondences, which are called dual-feature in [4]. In addition, to increase the flexibility of warping, we use APAP [5] as our grouping warp. In the following, we formulate correspondence grouping as a weighted graph with a source and a sink, and solve it by computing the shortest path between them.

Fig. 2: Overview of graph-based hypothesis generation. (a) Formulating (the edge weights are normalized). (b) Grouping (the generated hypothesis consists of red and green vertices).

For formulating, we first sample each line segment correspondence with three points in the reference image (two endpoints and one midpoint) and insert a source point and a sink point on the image border (top and bottom for horizontal stitching, or left and right for vertical stitching). Then we build a graph via Delaunay triangulation on the dual-feature points together with the source and sink points. The weight of the edge between two feature points is defined by the cross residual distance error (CRDE). The CRDE uses the Euclidean distance for points and the line segment distance of [4] for line segments, each measured on the transformed feature of one correspondence under the location-dependent homography estimated at the other feature. This homography is solved via the moving direct linear transformation (MDLT) approach, in which one matrix is formed from the point correspondences, another is formed from the line segment correspondences, and the weights depend on the distance between the current feature and the other point and line features. Here, we refer to [4, 5, 6] for more details. Figure 2(a) illustrates the weighted graph after formulating, where the edge weight indicates how readily the two feature correspondences can be aligned together via APAP. Therefore, we next need a strategy to group those correspondences into a hypothesis, which forms a subgraph that contains the source and sink vertices and whose edge weights are less than a threshold.
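For orientation, the MDLT estimate can be written in the standard form used by APAP [5], extended with a line block in the spirit of [4, 6]. The symbols below are assumed notation for illustration and are not recovered from the original equation: $\mathbf{A}^{p}$ and $\mathbf{A}^{l}$ stack the DLT constraints of the point and line correspondences, and $\mathbf{W}_{*}^{p}$, $\mathbf{W}_{*}^{l}$ are diagonal weight matrices evaluated at the location $\mathbf{x}_{*}$.

```latex
\hat{\mathbf{h}}_{*} \;=\; \arg\min_{\mathbf{h}}
  \left\| \mathbf{W}_{*}^{p}\,\mathbf{A}^{p}\,\mathbf{h} \right\|^{2}
  + \left\| \mathbf{W}_{*}^{l}\,\mathbf{A}^{l}\,\mathbf{h} \right\|^{2}
  \quad \text{s.t.} \quad \|\mathbf{h}\| = 1,
\qquad
w_{*}^{i} \;=\; \max\!\left( \exp\!\left( -\frac{\|\mathbf{x}_{*} - \mathbf{x}_{i}\|^{2}}{\sigma^{2}} \right),\; \gamma \right)
```

The Gaussian weight with scale $\sigma$ and floor $\gamma$ is the weighting of [5]; features close to $\mathbf{x}_{*}$ dominate the estimate, which is what makes the homography location dependent.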

For grouping, we first calculate the shortest path between the source vertex and the sink vertex in the weighted graph via Dijkstra's algorithm (marked in red in Figure 2(b)). Then we initialize the hypothesis with the features on the path and iteratively insert new features that are connected to the most recently inserted ones by edges whose weights are less than a threshold. The procedure terminates when no more new features can be inserted into the current hypothesis (marked in green in Figure 2(b)). It is worth noting that a plain shortest path may prefer a route with fewer vertices even if one of its edges has a weight larger than the threshold. To avoid this, we use a sigmoid metric to preprocess the edge weights, which is similar to the metric in [7].
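The effect of the sigmoid preprocessing on the shortest path can be illustrated with a small sketch. The exact metric of [7] is not reproduced here: `sigmoid_weight` is a generic logistic function, and `steepness`, `THRESHOLD` and the toy graph are hypothetical values chosen for illustration.

```python
import heapq
import math

def sigmoid_weight(w, threshold, steepness=10.0):
    """Generic logistic squashing centred on the threshold (assumed form)."""
    return 1.0 / (1.0 + math.exp(-steepness * (w - threshold)))

def shortest_path(graph, source, sink):
    """Dijkstra on an undirected graph given as {u: {v: weight}}."""
    dist, prev, done = {source: 0.0}, {}, set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == sink:
            break
        for v, w in graph[u].items():
            if d + w < dist.get(v, math.inf):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if sink not in dist:
        return []  # sink unreachable
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

THRESHOLD = 0.5  # hypothetical edge-weight threshold

# Toy graph: the direct edge s-t is short in vertex count but above threshold.
raw = {
    "s": {"a": 0.35, "t": 0.6},
    "a": {"s": 0.35, "t": 0.35},
    "t": {"s": 0.6, "a": 0.35},
}
squashed = {
    u: {v: sigmoid_weight(w, THRESHOLD) for v, w in nbrs.items()}
    for u, nbrs in raw.items()
}

raw_path = shortest_path(raw, "s", "t")            # takes the over-threshold edge
squashed_path = shortest_path(squashed, "s", "t")  # prefers two sub-threshold edges
```

On the raw weights the direct edge (0.6) beats the two-edge route (0.7) even though it exceeds the threshold. After squashing, weights below the threshold become cheap and weights above it become expensive, so the path through `a` is selected.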

Further hypotheses are generated by setting the edge weights of the last shortest path to a very large positive number and repeating the grouping procedure, until the number of remaining correspondences falls below a preset minimum or the edge weight of the shortest path is larger than the threshold. In the experiment, we use VLFeat to extract and match SIFT features, use LSD to extract line segments and match them by [8], and use RANSAC to remove outliers. The threshold on edge weights is fixed in advance. Because APAP is more flexible than homography, our graph-based correspondence grouping generates fewer but more representative hypotheses. A comparison of the number of hypotheses with SEAGULL [3] is given in Table II.
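The grouping loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is assumed to hold already preprocessed weights, `BIG`, `threshold` and `min_remaining` are hypothetical parameters, "the edge weight of the shortest path is larger than the threshold" is read as "some edge on the path exceeds the threshold", and growth is assumed not to pass through the source or sink.

```python
import heapq
import math

BIG = 1e9  # stand-in for "a very large positive number"

def dijkstra(graph, source, sink):
    """Shortest path as a vertex list on {u: {v: weight}}; [] if unreachable."""
    dist, prev, done = {source: 0.0}, {}, set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == sink:
            break
        for v, w in graph[u].items():
            if d + w < dist.get(v, math.inf):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if sink not in dist:
        return []
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def grow(graph, path, threshold, terminals):
    """Start from the features on the path and absorb neighbours over cheap edges."""
    members = set(path) - terminals
    frontier = list(members)
    while frontier:
        u = frontier.pop()
        for v, w in graph[u].items():
            if v not in members and v not in terminals and w < threshold:
                members.add(v)
                frontier.append(v)
    return members

def generate_hypotheses(graph, source, sink, threshold, min_remaining):
    graph = {u: dict(nbrs) for u, nbrs in graph.items()}  # work on a copy
    terminals = {source, sink}
    features = set(graph) - terminals
    used, hypotheses = set(), []
    while len(features - used) >= min_remaining:
        path = dijkstra(graph, source, sink)
        edges = list(zip(path, path[1:]))
        if not edges or max(graph[u][v] for u, v in edges) > threshold:
            break  # no acceptable path is left
        members = grow(graph, path, threshold, terminals)
        hypotheses.append(members)
        used |= members
        for u, v in edges:  # penalise the path so the next one differs
            graph[u][v] = graph[v][u] = BIG
    return hypotheses

toy = {
    "s": {"a": 0.1, "b": 0.2},
    "a": {"s": 0.1, "t": 0.1, "c": 0.3},
    "b": {"s": 0.2, "t": 0.2, "d": 0.3},
    "c": {"a": 0.3},
    "d": {"b": 0.3},
    "t": {"a": 0.1, "b": 0.2},
}
groups = generate_hypotheses(toy, "s", "t", threshold=0.5, min_remaining=1)
```

In the toy graph the first path runs through `a` and absorbs `c`; after its edges are penalised, the second path runs through `b` and absorbs `d`, after which no correspondences remain.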

III Image Warping

In order to address distortion and naturalness issues, we use SPMD [9] as our image warp to generate the corresponding local alignment for each hypothesis. First, we mesh the target image with a grid of indexed vertices and reshape their coordinates into a single vector; the vertices after deformation are collected in a second vector of the same form. The total energy function (3) is then composed of four terms, which address the alignment, naturalness, distortion and saliency issues respectively. Finally, the local alignment is obtained by minimizing this energy with any sparse linear solver, since (3) is sparse and quadratic. Here, we refer to [9] for more details.
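The reason a sparse linear solver suffices can be seen from a generic quadratic energy. The notation below is assumed for illustration and is not the paper's: $\hat{V}$ is the vector of deformed vertices, each term $k$ is written as a weighted least-squares residual with sparse matrix $\mathbf{A}_{k}$, vector $\mathbf{b}_{k}$ and weight $\lambda_{k}$.

```latex
E(\hat{V}) \;=\; \sum_{k} \lambda_{k} \left\| \mathbf{A}_{k}\hat{V} - \mathbf{b}_{k} \right\|^{2}
\quad\Longrightarrow\quad
\nabla E(\hat{V}) = 0
\;\;\Longleftrightarrow\;\;
\Big( \sum_{k} \lambda_{k}\, \mathbf{A}_{k}^{\top}\mathbf{A}_{k} \Big)\, \hat{V}
\;=\; \sum_{k} \lambda_{k}\, \mathbf{A}_{k}^{\top}\mathbf{b}_{k}
```

Each constraint touches only a few neighbouring mesh vertices, so every $\mathbf{A}_{k}$ and hence the normal-equation matrix is sparse, and the minimizer is the solution of a single sparse linear system.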

In order to further improve the seam quality of the local alignment, we use the adaptive feature weighting strategy in [3], such that the process of optimizing the local alignment is guided by the estimated seam. Accordingly, the alignment and naturalness terms are modified to include adaptive feature weights. In the modified terms, each dual-feature point is expressed as the bilinear interpolation of its four enclosing grid vertices, and the adaptive feature weights depend on the current alignment errors and the distances to the current seam. Here, we refer to [3] for more details.
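The bilinear interpolation of a feature point from its four enclosing grid vertices can be sketched as below. The function names and the axis-aligned rectangular cell are illustrative assumptions; the key point is that the weights are computed once on the undeformed mesh and then reused with the deformed vertices, which keeps the alignment term linear in the unknown vertex positions.

```python
def bilinear_weights(point, cell_origin, cell_size):
    """Weights for the corners (top-left, top-right, bottom-left, bottom-right)."""
    (x, y), (x0, y0), (w, h) = point, cell_origin, cell_size
    u = (x - x0) / w  # horizontal position inside the cell, in [0, 1]
    v = (y - y0) / h  # vertical position inside the cell, in [0, 1]
    return [(1 - u) * (1 - v), u * (1 - v), (1 - u) * v, u * v]

def interpolate(weights, corners):
    """Weighted combination of four corner positions."""
    px = sum(wt * cx for wt, (cx, _) in zip(weights, corners))
    py = sum(wt * cy for wt, (_, cy) in zip(weights, corners))
    return (px, py)

# A feature at the centre of a 10 x 20 cell whose top-left corner is (20, 0).
weights = bilinear_weights((25.0, 10.0), (20.0, 0.0), (10.0, 20.0))

original = [(20.0, 0.0), (30.0, 0.0), (20.0, 20.0), (30.0, 20.0)]
deformed = [(20.0, 0.0), (32.0, 0.0), (20.0, 20.0), (32.0, 20.0)]  # right side moved by 2

same_point = interpolate(weights, original)   # reproduces the feature location
moved_point = interpolate(weights, deformed)  # feature follows the mesh deformation
```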

For the estimation of the stitching seam, we use the perception-based seam-cutting approach in [7], which uses a sigmoid metric to characterize the perception of color discrimination and a saliency weight to model the tendency of the human eye to pay more attention to salient objects.

The iteration terminates when the average change of vertex locations relative to the previous iteration is less than one pixel or the number of iterations exceeds a preset maximum. In the experiment, the parameters are set to the recommended values in [3, 7, 9]. Because local alignment, naturalness, distortion and saliency are optimized simultaneously and iteratively, our seam-guided local alignment creates more natural-looking final stitching results. Comparisons of the seam quality and the naturalness of hypotheses with SEAGULL are given in Table II and Figure 3.
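The stopping rule can be sketched as a simple loop. This is only an illustration: `step` stands for one round of seam estimation plus mesh optimization, `toy_step` is a hypothetical stand-in that moves each vertex halfway to a fixed target, and `max_iters=50` is an assumed value since the original maximum is not given here.

```python
import math

def mean_displacement(old, new):
    """Average Euclidean movement of the vertices between two iterations."""
    total = sum(math.hypot(nx - ox, ny - oy)
                for (ox, oy), (nx, ny) in zip(old, new))
    return total / len(old)

def iterate_until_stable(vertices, step, tol=1.0, max_iters=50):
    """Repeat `step` until vertices move less than `tol` pixels on average."""
    iterations = 0
    while iterations < max_iters:
        updated = step(vertices)
        iterations += 1
        change = mean_displacement(vertices, updated)
        vertices = updated
        if change < tol:
            break
    return vertices, iterations

TARGET = [(8.0, 0.0), (10.0, 8.0)]  # hypothetical converged positions

def toy_step(vertices):
    """Stand-in for one optimization round: move halfway to the target."""
    return [((x + tx) / 2, (y + ty) / 2)
            for (x, y), (tx, ty) in zip(vertices, TARGET)]

final, n_iters = iterate_until_stable([(0.0, 0.0), (10.0, 0.0)], toy_step)
```

In the toy run the average movement halves each round (4, 2, 1, 0.5 pixels), so the loop stops after the fourth iteration, the first one with an average change below one pixel.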

IV Experiments

We compare our proposed method with the state-of-the-art method SEAGULL [3]. We evaluate the two methods on the publicly available SEAGULL dataset of 24 pairs of images with challenging parallax variation.

First, we compare the number of hypotheses of our correspondence grouping with SEAGULL. Table II shows the comparison results: our grouping method generates fewer hypotheses in 18 of the 24 cases, the same number in 4 cases, and more in 2 cases (05 and 14). The comparison results for other seam-driven approaches [1, 2] over the same dataset can be found in [3].

       SEAGULL        Ours                  SEAGULL        Ours
No.    hypo  seam     hypo  seam     No.    hypo  seam     hypo  seam
01     3     0.148    1     0.136    13     3     0.045    2     0.010
02     1     0.061    1     0.043    14     1     0.074    2     0.069
03     2     0.135    2     0.127    15     5     0.205    3     0.265
04     3     0.217    2     0.186    16     3     0.138    1     0.117
05     1     0.387    2     0.342    17     10    0.114    3     0.185
06     6     0.072    2     0.068    18     7     0.336    2     0.287
07     6     0.168    2     0.159    19     3     0.142    2     0.126
08     5     0.072    3     0.065    20     3     0.170    2     0.167
09     5     0.066    2     0.062    21     2     0.179    2     0.171
10     7     0.195    3     0.226    22     9     0.080    2     0.076
11     1     0.256    1     0.239    23     13    0.159    4     0.130
12     6     0.265    3     0.237    24     6     0.148    2     0.136
TABLE II: Quantitative comparison between SEAGULL and our method. Bold values indicate best results

We also compare the seam and the naturalness qualities of our image warping with SEAGULL. Table II lists, for each image pair, the number of hypotheses (hypo) and the seam quality (seam), which is measured by the zero-mean normalized cross correlation (ZNCC) score of local patches along the seam as in [3]. Our warping method obtains the lower seam score in 21 of the 24 cases; the exceptions are cases 10, 15 and 17. Figure 3 illustrates the comparison of final stitching results, where SEAGULL sometimes suffers from projective distortion (indicated by red rectangles) while our final stitching results are more natural-looking. All pairs of comparison results are available in the supplementary material.
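For readers unfamiliar with ZNCC, a minimal sketch of the correlation between two patches follows. Patches are assumed to be flat lists of grey-level intensities of equal length, and a constant patch is given a score of 0 by convention here. Since Table II treats lower scores as better, the reported seam score is presumably a dissimilarity derived from ZNCC as in [3]; that mapping is not reproduced here.

```python
import math

def zncc(patch_a, patch_b):
    """Zero-mean normalized cross correlation of two equally sized patches."""
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    da = [a - mean_a for a in patch_a]  # remove the mean brightness
    db = [b - mean_b for b in patch_b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # constant patch: correlation undefined, 0 by convention here
    return sum(x * y for x, y in zip(da, db)) / denom

patch = [10.0, 20.0, 30.0, 40.0]
brighter = [2.0 * p + 5.0 for p in patch]  # same structure, different exposure
inverted = [50.0 - p for p in patch]       # opposite structure

score_same = zncc(patch, brighter)
score_opposite = zncc(patch, inverted)
```

Because the mean is subtracted and the result is normalized, ZNCC is insensitive to gain and offset changes in brightness, which makes it suitable for comparing patches from two differently exposed images along a seam.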

Fig. 3: Qualitative comparison between SEAGULL and our method.

V Conclusions

In this paper, we proposed a novel graph-based hypothesis generation and a seam-guided local alignment for parallax-tolerant image stitching. Experiments demonstrated a significant reduction in the number of hypotheses and more natural-looking final stitching results compared with the state-of-the-art method SEAGULL.


  • [1] Gao J., Li Y., Chin T.-J., and Brown M. S.: ‘Seam-driven image stitching’, Eurographics, 2013, pp. 45–48
  • [2] Zhang F., and Liu F.: ‘Parallax-tolerant image stitching’, Proc. IEEE Conf. Comput. Vision Pattern Recognit., 2014, pp. 3262–3269
  • [3] Lin K., Jiang N., Cheong L.-F., Do M., and Lu J.: ‘SEAGULL: Seam-guided local alignment for parallax-tolerant image stitching’, Proc. Eur. Conf. Comput. Vis., 2016, pp. 370–385
  • [4] Li S., Yuan L., Sun J., and Quan L.: ‘Dual-feature warping-based motion model estimation’, Proc. IEEE Int. Conf. Comput. Vis., 2015, pp. 4283–4291
  • [5] Zaragoza J., Chin T.-J., Brown M. S., and Suter D.: ‘As-projective-as-possible image stitching with moving DLT’, IEEE Trans. Pattern Anal. Mach. Intell., 2014, 36(7), pp. 1285–1298
  • [6] Joo K., Kim N., Oh T. H., and Kweon I. S.: ‘Line meets as-projective-as-possible image stitching with moving DLT’, Proc. IEEE Int. Conf. Image Process., 2015, pp. 1175–1179
  • [7] Li N., Liao T., and Wang C.: ‘Perception-based seam cutting for image stitching’, SIViP, 2018, doi:10.1007/s11760-018-1241-9
  • [8] Jia Q., Gao X., Fan X., Luo Z., Li H., and Chen Z.: ‘Novel coplanar line-points invariants for robust line matching across views’, Proc. Eur. Conf. Comput. Vis., 2016, pp. 599–611
  • [9] Li N., and Liao T.: ‘Single-perspective warps in natural image stitching’, arXiv:1802.04645, 2018