Non-rigid structure-from-motion (NRSfM) is a classical problem in computer vision, where the task is to recover the 3D shape of a deforming object from multiple images. Although NRSfM for arbitrary deformation remains an open problem, it can be solved efficiently under some mild prior assumptions about the deformation and the shape configuration [14, 28, 23, 41, 4, 24, 25, 27, 26].
Even though the existing solutions to sparse NRSfM have demonstrated outstanding results, they do not scale to dense feature points, and their resilience to noise remains unsatisfactory. Moreover, the state-of-the-art algorithms [18, 11] for solving dense NRSfM are computationally expensive and rely on the assumption of a global low-rank shape, which, unfortunately, fails to capture the inherent local structure of the deforming shape over time. Consequently, representing dense non-rigid structure under such formulations seems rather flimsy and implausible.
For many real-world applications, such as facial expression reconstruction, scalability, runtime, robustness, and reliable modeling are of crucial concern. Yet no template-free approach exists that can reliably address these concerns. In this paper, we propose a template-free dense NRSfM algorithm that overcomes these difficulties. As a first step, we reduce the overall high-dimensional non-linear space by representing the deforming shape as a union of several local low-dimensional linear subspaces. Our approach is based on a simple idea/assumption: any complex deforming surface can be approximated by a locally linear subspace structure. We use this simple intuition in a spatio-temporal framework to solve dense NRSfM. This choice naturally leads to a few legitimate queries:
a) Why a spatio-temporal framework for solving dense NRSfM? The spatio-temporal framework by Kumar et al. has exhibited state-of-the-art results on the recent NRSfM challenge [5, 21]. A recent method that follows the same idea as proposed by Kumar et al. has also observed an improvement in reconstruction accuracy under such formulations. Even though the concept behind such a framework is elementary, no algorithm to our knowledge exists that exploits this intrinsic idea for dense NRSfM.
b) Why are the previously proposed spatio-temporal methods unable to handle dense NRSfM? The formulation proposed by Kumar et al. and its adaptation are inspired by SSC and LRR. As a result, the complexity of their formulations grows exponentially with the number of data points, which makes it difficult to solve dense NRSfM using them. Moreover, these methods [23, 41, 3] assume that the non-rigid shape should lie on a low-dimensional linear or affine subspace globally. In reality, such an assumption does not hold for all kinds of non-linear deformations [39, 34]. Although a recent spatio-temporal method proposed by Dai et al. solves this task, it involves solving a series of least-squares problems, which is computationally demanding.
To overcome all these issues, we propose a spatio-temporal dense NRSfM algorithm that is free from such unwarranted assumptions and limitations. Instead, we adhere to the assumption that the low-dimensional linear subspace spanned by a deforming shape is only locally valid. Such an assumption about shapes has been well studied in topological manifold theory [1, 15]. The Grassmann manifold is a topologically rich non-linear manifold, each point of which represents a linear subspace of the Euclidean space. One property of the Grassmannian that is particularly useful in our setting is that its points can be embedded into the space of symmetric matrices. This property has been used in several computer vision applications that deal with subspace representations of data [20, 8]. Accordingly, modeling a non-linear shape on the Grassmannian allows us to represent the shape as a set of “smooth” low-dimensional surfaces embedded in a higher-dimensional Euclidean space. Such a representation not only reduces the complexity of our task but also makes our formulation robust and scalable, as described below.
c) Why the Grassmann manifold?
It is well known that complex non-rigid deformations are composed of multiple subspaces that quite often fit a higher-order parametric model [33, 36, 41]. Handling such complex models globally can be very challenging, both numerically and computationally. Consequently, for an appropriate representation of such a model, we decompose the overall non-linearity of the shape into a set of locally linear models, each spanning a low-rank subspace of a vector space. As alluded to above, the space of all $p$-dimensional linear subspaces of $\mathbb{R}^n$ ($n > p$) forms the Grassmann manifold $\mathcal{G}(p, n)$ [1, 2]. Modeling the deformation on this manifold allows us to operate on the number of subspaces rather than on the number of vectorial data points (on the shape), which reduces the complexity of the problem significantly. Moreover, since each local surface is a low-rank subspace, it can be faithfully reconstructed using a few eigenvalues and their corresponding eigenvectors, which makes such a representation scalable and robust to noise.
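As a toy illustration of this last point (a sketch under our own assumptions, not the paper's implementation; all sizes and names here are hypothetical), a noisy, locally planar patch can be recovered from only its top few singular vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical local surface patch: 200 points lying near a rank-2
# (locally planar) subspace of R^3, observed with additive noise.
basis = rng.standard_normal((3, 2))
coeffs = rng.standard_normal((2, 200))
patch_clean = basis @ coeffs
patch_noisy = patch_clean + 0.01 * rng.standard_normal(patch_clean.shape)

def truncated_svd_reconstruct(M, k):
    """Keep only the top-k singular values/vectors of M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

patch_rec = truncated_svd_reconstruct(patch_noisy, k=2)

# The low-rank reconstruction discards the noise component orthogonal
# to the local subspace, so it lands closer to the clean patch than
# the raw noisy observation does.
err_noisy = np.linalg.norm(patch_noisy - patch_clean)
err_rec = np.linalg.norm(patch_rec - patch_clean)
assert err_rec < err_noisy
```

This is exactly why a few eigenvalues per local subspace suffice: the truncation acts as a denoiser on each locally linear piece.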
The aforementioned properties of the Grassmannian perfectly fit our strategy to model complex deformations, and therefore, we blend the concept of spatio-temporal representations with local low-rank linear models. This idea results in a two-stage coupled optimization problem, i.e., local reconstruction and global grouping, which is solved efficiently using the standard ADMM algorithm. As the local reconstructions are performed using a low-rank eigendecomposition, our representation is computationally efficient and robust to noise. We demonstrate the benefit of our approach on benchmark real and synthetic sequences (§5). Our results show that our method outperforms previous state-of-the-art approaches by 1-2% on all the benchmark datasets. Before we provide the details of our algorithm, we review some pertinent previous works in the next section.
This section provides a brief background on the recent advancements in NRSfM, focusing mainly on the methods that are relevant to this work.
Preliminaries: We borrow the notation system from Dai et al.’s work for its wide usage. Given $P$ feature points tracked over $F$ frames, we represent $\mathrm{W} \in \mathbb{R}^{2F \times P}$, $\mathrm{S} \in \mathbb{R}^{3F \times P}$, and $\mathrm{R} \in \mathbb{R}^{2F \times 3F}$ as the measurement, the shape, and the rotation matrices, respectively. Here, the matrix $\mathrm{R}$ is composed of block-diagonal $2 \times 3$ matrices $\mathrm{R}_i$, representing the per-frame orthographic camera projection. Also, the notation $\mathrm{S}^{\sharp} \in \mathbb{R}^{F \times 3P}$ stands for the rearranged shape matrix, which is a linear mapping of $\mathrm{S}$. We use $\|\cdot\|_F$ and $\|\cdot\|_*$ to denote the Frobenius norm and the nuclear norm, respectively.
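Under Dai et al.'s convention of stacking one 3 × P shape per frame, the rearranged shape matrix simply regroups each frame's coordinates into one row. A minimal sketch (the helper name and sizes are ours, for illustration only):

```python
import numpy as np

F, P = 4, 6  # hypothetical frame and point counts

# S stacks one 3 x P shape per frame, giving a (3F) x P matrix.
S = np.arange(3 * F * P, dtype=float).reshape(3 * F, P)

def rearrange(S, F, P):
    """Linear map S (3F x P) -> S# (F x 3P): row f of S# holds the
    x-, y-, and z-coordinates of frame f laid out side by side."""
    S_sharp = np.zeros((F, 3 * P))
    for f in range(F):
        frame = S[3 * f:3 * f + 3, :]  # the 3 x P shape of frame f
        S_sharp[f] = np.concatenate([frame[0], frame[1], frame[2]])
    return S_sharp

S_sharp = rearrange(S, F, P)
assert S_sharp.shape == (F, 3 * P)
```

The map is linear and invertible, so rank constraints can be posed on either arrangement.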
2.1 Relevant Previous Work
Dai et al.’s approach: Dai et al. proposed a simple and elegant solution to NRSfM. The work, dubbed “prior-free”, provides a practical solution as well as new theoretical insights into NRSfM. Their formulation involves nuclear-norm minimization on $\mathrm{S}^{\sharp}$ instead of $\mathrm{S}$ (see Table 1(a)). This is enforced due to the fact that the rank bound on $\mathrm{S}$ ($\mathrm{rank}(\mathrm{S}) \le 3K$) is weaker than the rank bound on $\mathrm{S}^{\sharp}$ ($\mathrm{rank}(\mathrm{S}^{\sharp}) \le K$), where $K$ refers to the rank of $\mathrm{S}^{\sharp}$. Although this elegant framework provides robust results for shapes that span a single subspace, it may perform poorly on complex non-rigid motions.
Zhu et al.’s approach: To achieve better 3D reconstructions on complex non-rigid sequences, this work capitalized on the limitations of Dai et al.’s work by exploiting a union of subspaces in the shape space. The proposed formulation is inspired by LRR in conjunction with Dai et al.’s work (see Table 1(b)). In the formulation, $\mathrm{C}$ and $\mathrm{E}$ are the coefficient and error matrices, respectively.
Kumar et al.’s approach: Kumar et al. exploit multiple subspaces in both the trajectory space and the shape space. This work demonstrated empirically that procuring multiple subspaces in the trajectory and shape spaces provides better reconstruction results. They proposed a joint segmentation and reconstruction framework, where segmentation inherently benefits reconstruction and vice versa (see Table 1(c)). In their formulation, $\mathrm{C}_1$ and $\mathrm{C}_2$ are the coefficient matrices, and a fixed linear operator maps $\mathrm{S}$ to $\mathrm{S}^{\sharp}$.
Dense NRSfM approach: Garg et al. developed a variational approach to solve dense NRSfM. Their optimization framework employs a total variation constraint on the deforming shape ($\mathrm{S}$) to allow edge-preserving discontinuities, and a trace-norm constraint to penalize the number of independent shapes (see Table 1(d)). Recently, Dai et al. have also proposed a dense NRSfM algorithm with a spatio-temporal formulation.
This work is intended to overcome the shortcomings of the previous approaches to dense NRSfM. Accordingly, we would like to outline the critical limitations associated with them. Although some of these have been highlighted before, we reiterate them for the sake of completeness.
Solving dense NRSfM using the formulations proposed by Kumar et al. and Zhu et al. is nearly impossible due to the complexity of their formulations (§1). Also, the error measure used by them is a Euclidean norm defined on the original data (see Table 1), which is not appropriate for non-linear data with a manifold structure [1, 38].
To avoid all the aforementioned limitations, we propose a new dense NRSfM algorithm. The primary contributions of this paper are as follows:
A scalable spatio-temporal framework on the Grassmann manifold to solve dense NRSfM which does not need any template prior.
An effective framework that can handle non-linear deformations even with noisy trajectories and provides state-of-the-art results on benchmark datasets.
An efficient solution to the proposed optimization based on the ADMM procedure.
3 Problem Formulation
In this section, we first provide a brief introduction to the Grassmann manifold and a suitable definition for a similarity distance metric on it, before revealing our formulation.
3.1 Grassmann Manifold
The Grassmann manifold, usually denoted as $\mathcal{G}(p, n)$, consists of all $p$-dimensional linear subspaces of $\mathbb{R}^n$, where $0 < p \le n$. A point on the Grassmann manifold is represented by an $n \times p$ matrix (say $\mathrm{X}$) whose columns are composed of an orthonormal basis of the subspace spanned by $\mathrm{X}$, denoted as $\mathrm{span}(\mathrm{X})$ or, in short, $[\mathrm{X}]$. Let’s suppose $[\mathrm{X}_1]$, $[\mathrm{X}_2]$ are two such points on this manifold; then, among the several similarity distances known for this manifold, we will be using the projection metric distance given by $d([\mathrm{X}_1], [\mathrm{X}_2]) = 2^{-1/2}\,\|\mathrm{X}_1\mathrm{X}_1^{\mathsf{T}} - \mathrm{X}_2\mathrm{X}_2^{\mathsf{T}}\|_F$, as it allows directly embedding the Grassmannian points into a Euclidean space (and the use of the Frobenius norm) using the mapping $\Pi([\mathrm{X}]) = \mathrm{X}\mathrm{X}^{\mathsf{T}}$. With this metric, $\mathcal{G}(p, n)$ forms a metric space. Interested readers may refer to the cited works for details.
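To make the projection metric concrete, here is a minimal NumPy sketch (function names are ours) showing the distance and its invariance to the choice of basis within a subspace:

```python
import numpy as np

def grassmann_point(M, p):
    """Orthonormal basis of the p-dimensional column space of M,
    i.e. a representative of a point on the Grassmann manifold."""
    Q, _ = np.linalg.qr(M)
    return Q[:, :p]

def projection_metric(X1, X2):
    """d(X1, X2) = 2^{-1/2} ||X1 X1^T - X2 X2^T||_F: the Frobenius
    distance between the embedded symmetric matrices."""
    return np.linalg.norm(X1 @ X1.T - X2 @ X2.T, 'fro') / np.sqrt(2)

rng = np.random.default_rng(1)
A = grassmann_point(rng.standard_normal((10, 2)), 2)
B = grassmann_point(rng.standard_normal((10, 2)), 2)

# Rotating the basis by any orthogonal 2x2 matrix leaves the
# represented subspace, and hence the distance, unchanged.
R, _ = np.linalg.qr(rng.standard_normal((2, 2)))
d1 = projection_metric(A, B)
d2 = projection_metric(A @ R, B)
assert abs(d1 - d2) < 1e-10
assert projection_metric(A, A) < 1e-10
```

The basis invariance is what makes the embedding $\Pi$ well defined on subspaces rather than on particular matrices.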
With the relevant background as reviewed in the above sections, we are now ready to present our algorithm to solve the dense NRSfM task under orthographic projection. We start our discussion with the classical representation of NRSfM, i.e.,

$\mathrm{W}_p = \mathrm{R}\,\mathrm{S}_p, \qquad (1)$

where $\mathrm{W} \in \mathbb{R}^{2F \times P}$, $\mathrm{R} \in \mathbb{R}^{2F \times 3F}$, $\mathrm{S} \in \mathbb{R}^{3F \times P}$. The motive here is: given the input measurement matrix, solve for the rotation ($\mathrm{R}$) and the 3D shape ($\mathrm{S}$). To serve this objective, Eq. (1) maintains the camera motion and the shape deformation such that they comply with the image measurements. For our method, we solve for the rotations using the Intersection method, by assuming that the multiple non-rigid motions within a single deforming object, over $F$ frames, can be faithfully approximated by a per-frame single relative camera motion of higher rank (see the supplementary material for a detailed discussion on rotation). Accordingly, our goal reduces to developing a systematic approach that can reliably explain the non-rigid shape deformations and provide better 3D reconstruction. We use the subscript ‘$p$’ in Eq. (1) to indicate that column permutations of the $\mathrm{W}$ and $\mathrm{S}$ matrices are allowed. However, permuting the rows of $\mathrm{W}$ is inadmissible, as it results in discontinuous trajectories over frames.
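The classical orthographic measurement model can be sketched as follows (sizes are hypothetical; each per-frame camera is the first two rows of a rotation matrix, giving the block-diagonal structure described above):

```python
import numpy as np

rng = np.random.default_rng(2)
F, P = 5, 50  # hypothetical frame and point counts

def orthographic_rows(rng):
    """Per-frame 2x3 orthographic camera: first two rows of a rotation."""
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    return Q[:2, :]

R_blocks = [orthographic_rows(rng) for _ in range(F)]

# R is the 2F x 3F block-diagonal rotation matrix.
R = np.zeros((2 * F, 3 * F))
for f, Rf in enumerate(R_blocks):
    R[2 * f:2 * f + 2, 3 * f:3 * f + 3] = Rf

S = rng.standard_normal((3 * F, P))  # 3F x P stacked per-frame shapes
W = R @ S                            # 2F x P image measurements

# Column j of W is the 2D trajectory of point j across all F frames,
# which is why reordering the rows of W scrambles the trajectories.
assert W.shape == (2 * F, P)
assert np.allclose(W[0:2], R_blocks[0] @ S[0:3])
```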
Grassmannian Representations in Trajectory Space: Let’s suppose $\mathcal{X} = \{[\mathrm{X}_1], [\mathrm{X}_2], \dots, [\mathrm{X}_s]\}$ is the set of points on the Grassmann manifold generated using the $\mathrm{W}$ matrix; then $\Pi(\mathcal{X})$ represents a tensor constructed by stacking the symmetric matrices $\mathrm{X}_i\mathrm{X}_i^{\mathsf{T}}$ of the Grassmann data points (refer to Figure 2(a)). As discussed before in §1, to explain the complex deformations, we reduce the overall non-linear space to a union of several local low-dimensional linear spaces, which form the sample points on the Grassmann manifold. However, the notion of self-expressiveness is valid only for Euclidean linear or affine subspaces. To apply self-expressiveness on the Grassmann manifold, one has to adopt a notion of linearity on the manifold. Since the Grassmann manifold is isometrically equivalent to the set of symmetric idempotent matrices, we embed the Grassmann manifold into the symmetric matrix manifold, where the self-expression can be defined in the embedding space. This leads to the following optimization:
We denote $\mathrm{C}_1 \in \mathbb{R}^{s \times s}$ as the coefficient matrix, with ‘$s$’ as the total number of spatial groups. Here, the error term measures the trajectory-group reconstruction error as per the manifold geometry. Also, we would like to emphasize that, since the object undergoes deformations in the 3D space, we operate in 3D space rather than in the projected 2D space. A nuclear-norm penalty is enforced on $\mathrm{C}_1$ for a low-rank solution.
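Self-expression in the embedding space can be illustrated with a minimal, unregularized sketch (the actual objective additionally imposes a low-rank penalty on the coefficients; all names and sizes here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)

def basis(M, p):
    """Orthonormal basis of the p-dimensional column space of M."""
    Q, _ = np.linalg.qr(M)
    return Q[:, :p]

# Hypothetical Grassmann samples drawn from two underlying subspaces,
# mimicking two local groups.
n, p = 8, 2
U1 = basis(rng.standard_normal((n, p)), p)
U2 = basis(rng.standard_normal((n, p)), p)
samples = [U1, U1, U2, U2]

# Embed each sample as a symmetric matrix X X^T and flatten it, so
# self-expressiveness can be posed in a Euclidean space.
Pi = np.stack([(X @ X.T).ravel() for X in samples], axis=1)  # n^2 x s

# Unregularized self-expression: write each embedded point as a
# combination of the others (diag(C) = 0, column by column).
s = Pi.shape[1]
C = np.zeros((s, s))
for j in range(s):
    others = [k for k in range(s) if k != j]
    c, *_ = np.linalg.lstsq(Pi[:, others], Pi[:, j], rcond=None)
    C[others, j] = c

# Samples from the same subspace reconstruct each other, so the
# overall self-expression residual is (near) zero.
residual = np.linalg.norm(Pi - Pi @ C)
assert residual < 1e-8
```

In practice the coefficients reveal the grouping: large entries of C connect samples lying on the same local subspace.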
Grassmannian Representations in Shape Space: A deforming object attains different states over time, which adhere to distinct temporal local subspaces. Assuming that the temporal deformation is smooth over time, we express the deforming shapes in terms of local self-expressiveness across frames as:

Similarly, $\Pi(\mathcal{Y})$ is the set of all symmetric matrices constructed using a set of Grassmannian samples $\mathcal{Y} = \{[\mathrm{Y}_1], \dots, [\mathrm{Y}_t]\}$, where $\mathcal{Y}$ contains the samples drawn from $\mathrm{S}^{\sharp}$ (refer to Figure 2(b)). Intuitively, $\mathrm{S}^{\sharp\mathsf{T}}$ is a shape matrix with each column as a deforming shape. The error term and $\mathrm{C}_2 \in \mathbb{R}^{t \times t}$ represent the temporal-group reconstruction error and the coefficient matrix, respectively, with ‘$t$’ as the number of temporal groups. A nuclear-norm penalty is enforced on $\mathrm{C}_2$ for a low-rank solution.
Spatio-Temporal Formulation: Combining the above two objectives and their constraints with the reprojection error term gives us our formulation. Our representation blends the local subspace structure with the global composition of a non-rigid shape. Thus, the overall objective is:
The re-projection error constraint performs the 3D reconstruction using $\mathrm{R}$ and $\mathrm{S}$. Meanwhile, the local subspace grouping naturally enforces the union-of-subspaces structure in $\mathrm{S}$, with corresponding low-rank representations of the coefficient matrices $\mathrm{C}_1$ and $\mathrm{C}_2$. Here, a refinement function draws inference from the coefficient matrices to refine the Grassmannian sample sets, in both the trajectory and the shape spaces. A reconstruction function rebuilds the $\mathrm{W}$ and $\mathrm{S}^{\sharp}$ matrices based on the sets of local subspaces, their singular values, and the number of top eigenvalues ($K$). A further function transforms the $\mathrm{S}^{\sharp}$ matrix to the $\mathrm{S}$ matrix, and another rearranges $\mathrm{W}$ as per the current column ordering of $\mathrm{S}$ (it is important to keep track of the column permutations of $\mathrm{W}$ and $\mathrm{S}$). Parameters such as ‘$s$’, ‘$t$’, and ‘$K$’ provide the flexibility to handle noise and adjust the computation. Note that the elements of the sample sets are obtained using SVD. The above equation, i.e., Eq. (4), is a coupled optimization problem, where the solution of one set of matrices influences the solution of the others and vice versa, and the trajectory-space term connects to the shape-space term.
The formulation in Eq. (4) is a non-convex problem due to its bilinear optimization variables; hence, a globally optimal solution is hard to achieve. However, it can be efficiently solved using Augmented Lagrangian Methods (ALM), which have proven effective for many non-convex problems. Introducing Lagrange multipliers and auxiliary variables into Eq. (4) gives us the complete cost function as follows:
The refinement function first computes the SVD of the coefficient matrices and then selects the top singular directions, with the threshold set empirically based on the noise level (using normalized singular values). Secondly, it uses these directions to form new Grassmann samples from the coefficient matrices. Notice that this function operates on matrices whose dimensions depend on the number of Grassmann samples. This reduces the complexity of the task from exponential in the number of vectorial points to exponential in the number of linear subspaces; the latter is of the order of 10-50, whereas the former can exceed 50,000 for dense NRSfM.
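The SVD-based selection step can be sketched as follows (this energy criterion is our guess at one plausible rule; the paper sets its threshold empirically from the noise level, and the function name is ours):

```python
import numpy as np

def top_k_by_energy(C, gamma=0.9):
    """Return the smallest k such that the top-k normalized singular
    values of C retain a gamma fraction of the total spectral energy."""
    _, s, _ = np.linalg.svd(C)
    energy = np.cumsum(s) / np.sum(s)
    return int(np.searchsorted(energy, gamma)) + 1

# A nearly rank-2 coefficient matrix: two dominant singular values.
rng = np.random.default_rng(4)
s_true = np.array([10.0, 5.0] + [1e-6] * 18)
U, _ = np.linalg.qr(rng.standard_normal((20, 20)))
V, _ = np.linalg.qr(rng.standard_normal((20, 20)))
C = U @ np.diag(s_true) @ V.T

# Two directions carry essentially all of the energy here.
assert top_k_by_energy(C, 0.95) == 2
```

Note the scale advantage: the SVD runs on an s x s coefficient matrix (tens of subspaces), never on the tens of thousands of dense trajectories.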
The reconstruction function reassembles each local low-rank subspace from its top-$K$ eigenvalues and eigenvectors, using the cardinality of each sample set and the horizontal concatenation of the per-subspace matrices. During implementation, this reconstruction is applied to the trajectory and the shape sample sets accordingly. The optimization variables over the iterations are obtained by solving for one variable at a time, treating the others as constants, while keeping the constraints intact. For detailed derivations of each sub-problem and the proofs, kindly refer to the supplementary material. The pseudo-code of our implementation is provided in Algorithm 1.
5 Experiments and Results
We compare the performance of our method against four previously reported state-of-the-art approaches, namely Dense Spatio-Temporal (DS), Dense Variational (DV), Trajectory Basis (PTA), and Metric Projection (MP). To test the performance, we used the dense NRSfM datasets introduced by Garg et al. and Varol et al. under noisy and noise-free conditions. For quantitative evaluation of 3D reconstruction, we align the estimated shape with the ground-truth shape per frame using Procrustes analysis. We compute the average RMS 3D reconstruction error as $e_{3D} = \frac{1}{F}\sum_{f=1}^{F} \|\mathrm{S}^{\mathrm{est}}_f - \mathrm{S}^{\mathrm{GT}}_f\|_F / \|\mathrm{S}^{\mathrm{GT}}_f\|_F$. We used the K-means++ algorithm to initialize the segments without disturbing the temporal continuity.
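The evaluation metric can be sketched as below (rotation-only orthogonal Procrustes; the alignment used in the paper may additionally handle translation and scale, and the function name is ours):

```python
import numpy as np

def rms_error_aligned(S_est, S_gt):
    """Per-frame rigid alignment of the estimated 3 x P shape to the
    ground truth, then relative RMS error averaged over frames."""
    F = S_est.shape[0] // 3
    errs = []
    for f in range(F):
        A = S_est[3 * f:3 * f + 3]
        B = S_gt[3 * f:3 * f + 3]
        # Orthogonal Procrustes: R minimizing ||R A - B||_F is U V^T
        # from the SVD of B A^T.
        U, _, Vt = np.linalg.svd(B @ A.T)
        R = U @ Vt
        errs.append(np.linalg.norm(R @ A - B) / np.linalg.norm(B))
    return float(np.mean(errs))

rng = np.random.default_rng(5)
S_gt = rng.standard_normal((6, 30))  # 2 frames, 30 points
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
# An estimate that differs from the ground truth only by a per-frame
# rotation should score (near) zero error after alignment.
S_est = np.vstack([Q @ S_gt[0:3], Q @ S_gt[3:6]])
assert rms_error_aligned(S_est, S_gt) < 1e-10
```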
Experiments on Synthetic Face Sequences: This dataset consists of 4 different face sequences with 28,880 feature points tracked over multiple frames. Face sequences 1 and 2 are 10-frame videos, whereas face sequences 3 and 4 are 99-frame videos. It is a challenging dataset, mainly due to the different rotation frequencies and deformations in each of the sequences. Figure 3 shows the qualitative reconstruction results obtained using our approach in comparison to the ground truth for face sequence 4. Table 2 lists the performance comparison of our method with the other competing methods. Clearly, our algorithm outperforms the other baseline approaches, which leads us to conclude that holistic approaches to rank minimization, without drawing any inference from the local subspace structure, are less effective at coping with local non-linearities.
Experiments on the face, back and heart sequences: This dataset contains monocular videos of human facial expressions, back deformations, and a beating heart under natural lighting conditions. The face, back, and heart sequences are composed of 28,332, 20,561, and 68,295 feature points tracked over 120, 150, and 80 images, respectively. Unfortunately, due to the lack of ground-truth 3D data, we are unable to quantify the performance on these sequences. Figure 4 shows some qualitative results obtained using our algorithm on this real dataset.
Experiments on the kinect_paper and kinect_tshirt sequences: To evaluate our performance on real deforming surfaces, we used the kinect_paper and kinect_tshirt datasets. These datasets provide sparse SIFT feature tracks along with dense 3D point clouds of the entire scene for each frame. Since dense 2D tracks are not directly available with these datasets, we synthesized them. To obtain dense feature tracks, we considered the region within a window containing the deforming surface. Precisely, we considered the region within $x$ = (253, 253, 508, 508), $y$ = (132, 363, 363, 132) across 193 frames for the paper sequence, and $x$ = (203, 203, 468, 468), $y$ = (112, 403, 403, 112) across 313 frames for the tshirt sequence to obtain the measurement matrix [19, 17]. Figure 5 shows some qualitative results obtained using our method on this dataset. Table 3 lists the numerical comparison of our approach with other competing dense NRSfM approaches on this dataset.
Experiments on noisy data:
To evaluate the robustness of our method to noise, we performed experiments by adding Gaussian noise with different standard deviations to the measurement matrix. Similar to DS, the noise standard deviation is scaled relative to the measurements, with the noise ratio varying from 0.01 to 0.05. We repeated each experiment 10 times. Figure 6(a) and Figure 6(b) show the variation in the performance of our method under different noise ratios on the synthetic face sequences and the Kinect sequences, respectively. It can be inferred from the plots that even with large noise ratios, the average reconstruction error does not fluctuate significantly. This behavior is expected from our framework, as it relies only on the top eigenvalues.
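The noise protocol can be sketched as follows (our guess at the exact scaling convention; the DS paper describes the protocol we follow, and the function name is ours):

```python
import numpy as np

def add_noise(W, ratio, rng):
    """Zero-mean Gaussian noise whose standard deviation is the given
    ratio of the maximum absolute measurement, a common convention in
    NRSfM noise experiments (an assumption, not the paper's exact rule)."""
    sigma = ratio * np.max(np.abs(W))
    return W + rng.standard_normal(W.shape) * sigma

rng = np.random.default_rng(6)
W = rng.standard_normal((10, 100))  # hypothetical measurement matrix

# Sweep the noise ratios used in the experiments.
for ratio in (0.01, 0.03, 0.05):
    Wn = add_noise(W, ratio, rng)
    assert Wn.shape == W.shape
```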
Effects of variable initialization on the overall performance: We performed several other experiments to study the behavior of our algorithm under different variable initializations. For ease of exposition, we conducted these experiments on noise-free sequences. We mainly investigated the effect of the number of top eigenvalues ($K$) and the number of local subspaces on the overall performance of our algorithm. Figure 6(c) and Figure 6(d) show the variations in the reconstruction errors with respect to these two quantities, respectively. A similar trend in the plots is observed for changes in the ‘$s$’ and ‘$t$’ values. These plots clearly illustrate the usefulness of our local low-rank structure, i.e., considering a small number of eigenvalues for every local structure is as good as considering all eigenvalues. Similarly, increasing the number of local subspaces beyond a certain value has a negligible effect on the overall reconstruction error. Furthermore, we examined the form of $\mathrm{C}_1$ and $\mathrm{C}_2$ after convergence, as shown in Figure 7(a) and Figure 7(b). Unfortunately, due to the lack of ground-truth data on local subspaces, we could not quantify $\mathrm{C}_1$ and $\mathrm{C}_2$. For a qualitative analysis of this observation, kindly refer to the supplementary material.
Ablation Analysis: This test is performed to evaluate the importance of the spatial and temporal constraints in our formulation. To do this, we observe the performance of our formulation under four different setups: a) without any spatio-temporal constraint (NC), b) with only the spatial constraint (SP), c) with only the temporal constraint (TP), and d) with the spatio-temporal constraint (Both). Figure 7(c) shows the variations in reconstruction errors under these setups on the four synthetic face sequences. The statistics clearly illustrate the importance of both constraints in our formulation.
Runtime Analysis: This experiment is performed on a computer with an Intel Core i7 processor and 16GB RAM. The script to compute the runtime is written in MATLAB 2016b. Figure 7(d) shows the runtime comparisons of our approach with other dense NRSfM methods. The runtime reported in Figure 7(d) corresponds to the results listed in Tables 2 and 3. The results clearly show the scalability of our method on datasets with more than 50,000 points. Although PTA is faster than our approach, its accuracy suffers by a large margin for dense NRSfM (see Tables 2, 3).
In this paper, we have introduced a scalable dense NRSfM algorithm which efficiently models the complex non-linear deformations. We achieved this by exploiting the non-linearity on the Grassmann manifold via a spatio-temporal formulation. Moreover, we provided an efficient ADMM  based solution for solving our optimization. In the future, we will consider how to extend this work to the projective setting with perspective cameras (e.g. [13, 12]).
Acknowledgement. S. Kumar was supported in part by Australian Research Council (ARC) grant (DE140100180). Y. Dai was supported in part by National 1000 Young Talents Plan of China, Natural Science Foundation of China (61420106007), and ARC grant (DE140100180). H. Li is funded in part by ARC Centre of Excellence for Robotic Vision (CE140100016).
-  P.-A. Absil, R. Mahony, and R. Sepulchre. Riemannian geometry of Grassmann manifolds with a view on algorithmic computation. Acta Applicandae Mathematicae, 80(2):199–220, 2004.
-  P.-A. Absil, R. Mahony, and R. Sepulchre. Optimization algorithms on matrix manifolds. Princeton University Press, 2009.
-  A. Agudo and F. Moreno-Noguer. DUST: Dual union of spatio-temporal subspaces for monocular multiple object 3D reconstruction. In IEEE Conference on Computer Vision and Pattern Recognition, 2017.
-  I. Akhter, Y. Sheikh, S. Khan, and T. Kanade. Nonrigid structure from motion in trajectory space. In Advances in neural information processing systems, pages 41–48, 2009.
-  A. Del Bue, H. Aanæs, S. N. Jensen, and Y. Sheikh. Non-rigid structure from motion challenge. IEEE Conference on Computer Vision and Pattern Recognition, 2017.
-  D. Arthur and S. Vassilvitskii. k-means++: The advantages of careful seeding. In Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, pages 1027–1035. Society for Industrial and Applied Mathematics, 2007.
-  S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1):1–122, 2011.
-  H. E. Cetingul and R. Vidal. Intrinsic mean shift for clustering on Stiefel and Grassmann manifolds. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1896–1902, 2009.
-  Y. Chikuse. Statistics on special manifolds, volume 174. Springer Science & Business Media, 2012.
-  K. Crane. Conformal geometry processing. California Institute of Technology, 2013.
-  Y. Dai, H. Deng, and M. He. Dense non-rigid structure-from-motion made easy-a spatial-temporal smoothness based solution. arXiv preprint arXiv:1706.08629, 2017.
-  Y. Dai, H. Li, and M. He. Element-wise factorization for n-view projective reconstruction. In ECCV, 2010.
-  Y. Dai, H. Li, and M. He. Projective multiview structure and motion from element-wise factorization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(9):2238–2251, 2013.
-  Y. Dai, H. Li, and M. He. A simple prior-free method for non-rigid structure-from-motion factorization. International Journal of Computer Vision, 107(2):101–122, 2014.
-  P. Dollár, V. Rabaud, and S. Belongie. Non-isometric manifold learning: Analysis and an algorithm. In International Conference on Machine Learning, pages 241–248, 2007.
-  E. Elhamifar and R. Vidal. Sparse subspace clustering. In IEEE Conference on Computer Vision and Pattern Recognition, pages 2790–2797. IEEE, 2009.
-  R. Garg, A. Roussos, and L. Agapito. Robust trajectory-space tv-l1 optical flow for non-rigid sequences. In Energy Minimization Methods in Computer Vision and Pattern Recognition, pages 300–314. Springer, 2011.
-  R. Garg, A. Roussos, and L. Agapito. Dense variational reconstruction of non-rigid surfaces from monocular video. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1272–1279, 2013.
-  R. Garg, A. Roussos, and L. Agapito. A variational approach to video registration with subspace constraints. International journal of computer vision, 104(3):286–314, 2013.
-  J. Hamm and D. D. Lee. Grassmann discriminant analysis: a unifying view on subspace-based learning. In International conference on Machine learning, pages 376–383. ACM, 2008.
-  S. H. N. Jensen, A. Del Bue, M. E. B. Doest, and H. Aanæs. A benchmark and evaluation of non-rigid structure from motion. arXiv preprint arXiv:1801.08388, 2018.
-  P. Ji, M. Salzmann, and H. Li. Efficient dense subspace clustering. In IEEE Winter Conference on Applications of Computer Vision, pages 461–468, 2014.
-  S. Kumar, Y. Dai, and H. Li. Spatio-temporal union of subspaces for multi-body non-rigid structure-from-motion. Pattern Recognition, 71:428–443, May 2017.
-  S. Kumar, Y. Dai, and H. Li. Multi-body non-rigid structure-from-motion. In 3D Vision (3DV), 2016 Fourth International Conference on, pages 148–156. IEEE, 2016.
-  S. Kumar, Y. Dai, and H. Li. Monocular dense 3d reconstruction of a complex dynamic scene from two perspective frames. In IEEE International Conference on Computer Vision, pages 4649–4657, Oct 2017.
-  W. Kusakunniran, Q. Wu, H. Li, and J. Zhang. Automatic gait recognition using weighted binary pattern on video. In IEEE AVSS, 2009.
-  W. Kusakunniran, Q. Wu, J. Zhang, H. Li, and L. Wang. Recognizing gaits across views through correlated motion co-clustering. TIP, 23(2):696–709, 2014.
-  M. Lee, J. Cho, C.-H. Choi, and S. Oh. Procrustean normal distribution for non-rigid structure from motion. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1280–1287, 2013.
-  H. Li, B. Adams, L. J. Guibas, and M. Pauly. Robust single-view geometry and motion reconstruction. In ACM Transactions on Graphics, volume 28, page 175, 2009.
-  G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, and Y. Ma. Robust recovery of subspace structures by low-rank representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1):171–184, 2013.
-  D. G. Lowe. Object recognition from local scale-invariant features. In IEEE International Conference on Computer vision, pages 1150–1157, 1999.
-  M. Paladini, A. Del Bue, J. Xavier, L. Agapito, M. Stošić, and M. Dodig. Optimal metric projections for deformable and articulated structure-from-motion. International journal of computer vision, 96(2):252–276, 2012.
-  A. Pasko and V. Adzhiev. Function-based shape modeling: mathematical framework and specialized language. In International Workshop on Automated Deduction in Geometry, pages 132–160. Springer, 2002.
-  V. Rabaud and S. Belongie. Re-thinking non-rigid structure from motion. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1–8. IEEE, 2008.
-  C. Russell, J. Fayad, and L. Agapito. Dense non-rigid structure from motion. In International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission, pages 509–516, 2012.
-  Y. Sheng, P. Willis, G. G. Castro, and H. Ugail. Facial geometry parameterisation based on partial differential equations. Mathematical and Computer Modelling, 54(5):1536–1548, 2011.
-  A. Varol, M. Salzmann, P. Fua, and R. Urtasun. A constrained latent variable model. In IEEE Conference on Computer Vision and Pattern Recognition, pages 2248–2255. IEEE, 2012.
-  B. Wang, Y. Hu, J. Gao, Y. Sun, and B. Yin. Low rank representation on grassmann manifolds: An extrinsic perspective. arXiv preprint arXiv:1504.01807, 2015.
-  R. Wang, S. Shan, X. Chen, and W. Gao. Manifold-manifold distance with application to face recognition based on image set. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1–8, 2008.
-  R. Yu, C. Russell, N. D. Campbell, and L. Agapito. Direct, dense, and deformable: template-based non-rigid 3d reconstruction from rgb video. In IEEE International Conference on Computer Vision, pages 918–926, 2015.
-  Y. Zhu, D. Huang, F. De La Torre, and S. Lucey. Complex non-rigid motion 3d reconstruction by union of subspaces. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1542–1549, 2014.