On the Approximation Theory of Linear Variational Subspace Design

06/28/2015
by   Jianbo Ye, et al.
Penn State University

Solving large-scale optimization on the fly is often a difficult task for real-time computer graphics applications. To tackle this challenge, model reduction is a well-adopted technique. Despite its usefulness, model reduction often requires a handcrafted subspace that spans a domain hypothetically embodying desirable solutions. For many applications, obtaining such subspaces case by case either is impossible or requires extensive human labor, and hence does not scale to a growing number of tasks. We propose linear variational subspace design for large-scale constrained quadratic programming, which can be computed automatically without any human intervention. We provide a meaningful approximation error bound that substantiates the quality of the calculated subspace, and demonstrate its empirical success in interactive deformable modeling for triangular and tetrahedral meshes.




Code Repositories

iMeshDeform: a C++ interactive mesh deformer.

1 Introduction

In the computer graphics realm, solving optimization problems with a substantially large number of variables is often an expensive task. To speed up the computations, model reduction has been introduced as a useful technique, particularly for interactive and real-time applications. In solving a large-scale optimization problem, one typically assumes that a desired solution approximately lies in a manifold of much lower dimension that is independent of the variable count. It is therefore possible to cut calculations down to a computationally practical level by exploring variability (i.e., different solutions subject to different constraints) only in a suitably chosen low-order space, while attempting to produce visually convincing results just in time. In this paper, we re-examine model reduction techniques for quadratic optimization with uncertain linear constraints, which have been widely used in interactively modeling deformable surfaces and solids.

Modeling deformable meshes has been an established topic in computer graphics for years [Sorkine et al. 2004, Yu et al. 2004]. High-quality mesh deformation is accessible by solving off-line a large-scale optimization whose number of variables is on the order of the mesh node count. A studio workflow in deformable mesh modeling often involves trial-and-error loops: an artist tries different sets of constraints and searches for desirable poses. In such processes, an interactive technique saves computation time: approximate solutions are first displayed for guidance before a final solution is calculated and exported. Nevertheless, interactive techniques for real-time mesh modeling have been less successful than their off-line siblings to this day. Existing work based on model reduction often requires a high-quality subspace as input, which typically demands human intervention to construct. Exemplars include cage-based deformations [Huang et al. 2006, Ben-Chen et al. 2009], LBS with additional weights [Jacobson et al. 2012], LBS with skeletons [Shi et al. 2007], and LBS with point/region handles [Au et al. 2007, Sumner et al. 2007]. The time spent constructing such reduced models is as much as, if not more than, that spent on on-site modeling. In industrial deployments, companies have to hire many artists with expert skills to rig a large set of models before those models are used in production. This establishes the necessity of a fully automatic subspace generation method. The problem has received attention in the past. For example, data-driven methods have been developed for deformable meshes, where a learning algorithm tries to capture the characteristics of deformable mesh sequences and applies them to a different task [Sumner et al. 2005, Der et al. 2006].
However, such methods still face two challenges: 1) Scalability: like approaches relying on human input, obtaining a deformable sequence of scanned meshes can also be expensive, all the more so if we want to build a deformable mesh database containing a large number of models with heterogeneous shapes. 2) Applicability: many models of complex geometry or topology are relatively difficult to rig, and there is no easy way to build a set of controllers with skinning weights that produce desirable deformations. Though several workarounds exist for domain-specific mesh sets, such as faces and clothes, an automatically computed subspace for arbitrary meshes, cheaply obtained, can still be beneficial for fast prototyping or exploratory purposes: the set of constraints chosen on-site is exported for computing a full-quality deformation in the off-line stage.

In this paper, we introduce an automatic and principled way to create reduced models, which might also be applied to computationally intensive optimization scenarios beyond mesh deformation. Our main idea is very simple: in solving a constrained quadratic program, we observe that the Karush-Kuhn-Tucker (KKT) condition implicitly defines an effective subspace that can be directly reused for subsequent on-site optimization. We name it the linear variational subspace (for short, variational subspace). Our contribution is to theoretically study the approximation error bound of the variational subspace and to empirically validate its success in interactive mesh modeling. The deformation framework is similar to the one used in [Jacobson et al. 2012].[1] We further examine the deformation properties of our proposed method, and compare with physically based deformation [Ciarlet 2000, Grinspun et al. 2003, Botsch et al. 2006, Sorkine and Alexa 2007, Chao et al. 2010] and conformal deformation [Paries et al. 2007, Crane et al. 2011].

[1] In independent work reported in a recent preprint [Wang et al. 2015], Wang et al. also propose a mesh deformation framework based on a linear variational subspace similar to ours, with the difference that we additionally use the linear variational subspace to model rotation errors in the reduced-ARAP framework. Our deformation can be similar to theirs [Wang et al. 2015] if the regularization coefficient is set to a large value. Therefore, the contribution of our paper, excluding the empirical efforts, is the approximation theory for the linear variational subspace.
[2] Implementations and demos: https://github.com/bobye

Figure 1: Variational subspace provides robust and high quality deformation results regarding arbitrary constraints at runtime. This figure shows tree and fertility as well as their deformed versions.

2 Mathematical Background

Consider minimizing a quadratic function subject to linear constraints , where is a very high-dimensional solution, is a positive semi-definite matrix of size , , is a well-conditioned matrix of size , and . Typically, . Without loss of generality, we can write

(1)

Instead of solving the optimization with a single setup, we consider a set of them with a prescribed fixed , and varying , and under certain conditions. The "demand" of this configuration is defined to be a particular choice of , and . Different choices usually result in different optimal solutions. When is relatively small, efficiently solving for unreduced solutions belongs to the family of so-called multi-parametric quadratic programming, or mp-QP [Bemporad et al. 2002, Tøndel et al. 2003]. We instead tackle the same setting with a large by exploring approximate solutions in a carefully chosen low-order space.

We model the "demand"s by assuming each column of is selected from a low-order linear space , namely for some , and is again selected from another low-order linear space such that for some , where is a matrix of size and is a vector of size . Here and are the dimensions of the reduced subspaces to which and belong, respectively. Instead of pursuing a direct reduction in the domain of the solution , we analyze the reducibility of the "demand" parameters and by constructing reduced spaces and . Specifications of the on-site parameters , and turn out to be realizations of "demand"s. We can rewrite (1) as

(2)

Optimization (2) can be decomposed into an equivalent two-stage formulation, i.e.,

(3)

The Karush-Kuhn-Tucker (KKT) condition yields that the optimal point for should satisfy the linear equations

(4)

where is a Lagrange multiplier. Therefore is affine in terms of on-site parameters and , i.e.,

(5)

where and can be computed before , and are observed in solving the second stage of (3): from Eq. (4), each column of is computed through a preconditioned direct linear solver by setting to the corresponding column of with ; similarly, each column of is solved by setting to the corresponding column of with . We remark that and are required to be full-rank and well-conditioned (as will be specified later).
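As a concrete illustration, the offline subspace construction described above can be sketched as follows. Since the paper's symbols were lost in extraction, the names `H`, `G`, `B`, and `C` below are generic stand-ins for the quadratic term, the constraint matrix, and the two demand bases; this is a minimal dense sketch under those assumptions, not the paper's sparse implementation.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def variational_subspace(H, G, B, C):
    """Precompute the variational subspace (cf. Eq. (5)).

    H : (n, n) positive (semi-)definite quadratic term
    G : (m, n) constraint matrix
    B : (m, p) basis whose columns span the space of right-hand sides
    C : (n, q) basis spanning the space of linear terms
    Returns (U, V) such that x(alpha, beta) = U @ alpha + V @ beta.
    """
    n, m = H.shape[0], G.shape[0]
    # KKT matrix of  min 1/2 x^T H x - c^T x  s.t.  G x = b
    K = np.block([[H, G.T], [G, np.zeros((m, m))]])
    lu = lu_factor(K)  # factorize once, offline
    # Each column of V: response to c = C[:, i] with b = 0.
    V = np.column_stack([
        lu_solve(lu, np.concatenate([C[:, i], np.zeros(m)]))[:n]
        for i in range(C.shape[1])])
    # Each column of U: response to b = B[:, j] with c = 0.
    U = np.column_stack([
        lu_solve(lu, np.concatenate([np.zeros(n), B[:, j]]))[:n]
        for j in range(B.shape[1])])
    return U, V
```

By linearity of the KKT system, any demand lying in the span of `B` and `C` is then answered exactly by the precomputed map.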

Figure 2: Artistic freedom is important in deformable mesh modeling because many artists are interested in authoring creative editing. Rig or cage based deformation lacks the richness of deformable variants.

Once we have a subspace design (5), for an arbitrary "demand" , and , we can immediately solve for an approximate solution by substituting (5) into the original formulation (1). We note that if our assumption holds, i.e., and for some and , the approximate solution is identical to the exact optimum . Furthermore, the approximate solution can work with hard constraints in the online solvers.

Next, we demonstrate its effectiveness by implementing an interactive mesh deformation method based on the proposed model reduction framework. At the end, we return to the theoretical aspects and derive an error bound for the approximate solution with respect to the use of and .

3 Interactive Mesh Deformation

In this section, we start from a point familiar to the graphics audience and proceed to the practice of our reduced model, focusing on deriving the correct formulation for employing the variational subspace. Experimental results are provided at the end. So that practitioners can reproduce our framework, we describe the details of our implementation in Appendix B. We remark that the subspace technique described in this section follows the standard framework of [Jacobson et al. 2012]. The main difference is that we replace the original linear subspace of a skinned mesh with the variational subspace described in our paper. Our variational subspace technique extends the fast deformable framework proposed in [Jacobson et al. 2012] to meshes for which skinning is unavailable or impossible, such as those of complex topologies. There are, however, good reasons to work with linear blend skinning; for example, it is often possible for artists to directly edit the weights painted on a skinned mesh.

Notations. Denote by the rest-pose vertex positions of input mesh , and denote the deformed vertex positions by .

We use bold lowercase letters to denote a single vertex and transformation matrix , and bold uppercase letters and to denote arrays of them. We use uppercase normal-font letters to denote general matrices and vectors (one-column matrices) and lowercase normal letters to denote scalars. We may or may not specify the dimensions of matrices explicitly in the subscripts, hence and are the same. In other cases, subscripts are enumerators or instance indicators. We use superscripts with braces as enumerators for matrices, e.g., .

We use to denote the Frobenius norm of matrices (vectors), to denote the norm of matrices, and to denote the Mahalanobis norm with positive semi-definite matrix .

Denote the dot product of matrices by , and the Kronecker product of matrices by . Let be the identity matrix, the zero matrix, and a matrix of all ones.

3.1 Variational Reduced Deformable Model

ARAP energy. Among recent developments in nonlinear deformation energies, the As-Rigid-As-Possible energy [Sorkine and Alexa 2007, Xu et al. 2007, Chao et al. 2010] is widely adopted in related work; it represents deformations by local frame transformations. The objective energy function under this representation is quadratic in its variables: vertices and transformation matrices with orthogonality constraints. This family of energy functions can be written as

(6)

where are the corresponding sets of edges (see Figure 5 of [Jacobson et al. 2012]), are typically the cotangent weights [Chao et al. 2010], and denotes the local frame rotations. By separating the quadratic and linear terms w.r.t. , and vectorizing and into and respectively, the ARAP energy can be further expressed as

(7)

where (see [Jacobson et al. 2012] for more details).
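For reference, an ARAP-style energy of the form in Eq. (6) can be evaluated as below. The function names, the per-vertex rotation convention, and the weight container are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def arap_energy(V0, V1, R, edges, w):
    """ARAP-style energy (cf. Eq. (6)):
        E = sum_{(i,j)} w_ij * || (v1_i - v1_j) - R_i (v0_i - v0_j) ||^2

    V0, V1 : (n, 3) rest-pose / deformed vertex positions
    R      : (n, 3, 3) per-vertex local frame rotations
    edges  : iterable of (i, j) vertex index pairs
    w      : dict mapping (i, j) to an (e.g. cotangent) weight
    """
    E = 0.0
    for (i, j) in edges:
        e0 = V0[i] - V0[j]       # rest-pose edge vector
        e1 = V1[i] - V1[j]       # deformed edge vector
        r = e1 - R[i] @ e0       # deviation from a locally rigid motion
        E += w[(i, j)] * float(r @ r)
    return E
```

A rigid motion (rotation plus translation, with matching local frames) yields zero energy, which is the defining property of the energy family.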

Rotational proxies. Observe that minimizing the ARAP energy involves solving , whose cost scales with the mesh geometry. We modify the original ARAP energy into a piece-wise linear form, which relieves the strong non-linearity of the optimization but simultaneously increases the complexity by introducing linearization variables.

Figure 3: Results subject to different regularization coefficients. Deformed solid cylinder upon three point constraints. From left to right: .
Figure 4: Results with a varied number of rotational proxies. From left to right: ; for each, the deformed result with 32 rotational proxies is shown with dark contours.

We divide the local rotations into rotational clusters spatially, forming an over-segmentation of the input mesh. Rotations within each patch segment are expected to be similar under deformation. Empirically, we found that a simple k-means clustering on weighted Laplace-Beltrami eigenvectors fits our scheme well, cutting the surface/solid mesh into patches. We revise the original energy by assuming

(8)

where denotes the -th patch-wise frame rotation of the cluster to which the vertex belongs, and

(9)

It should be noted that Laplacian surface editing [Sorkine et al. 2004] utilizes to approximate the rotation matrix, whereas we use to approximate the difference of two rotation matrices . This leads to a different energy, obtained by appending L2 penalties subject to the regularizers :

(10)

where denotes the element-wise area/volume, denotes the penalty coefficient for overall spatial distortion, and denotes the additional penalty coefficient for surface normal distortion (if applicable). and are chosen empirically. (See Fig. 3 for deformations subject to different penalty coefficients .) One potential issue is that this penalty may incur surface folds when shapes are bent at large angles. To counteract such effects, we optionally append to an extra regularization term penalizing the moving frame differentials [Lipman et al. 2007], i.e.,

(11)

where is the set of neighboring local frames and . The two bending cylinder examples in Fig. 8 are produced by penalizing the moving frame differentials.

Let and be the vectorizations of and respectively. is quadratic in and , and its partial gradient w.r.t. and is again linear in . Hence again we can write as

where are the rotational proxies of our model, iff the extra regularization (11) is present.

There is an interesting point of discussion regarding the difference between and : includes near-isotropic scaling, which is arguably preferable to distortion in only one direction for artistic modeling purposes when the desired deformation is far from isometric [Sorkine et al. 2004, Lipman et al. 2008]. (See Fig. 9 for a comparison with the ARAP energy.)
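The rotational clustering described earlier (k-means on weighted Laplace-Beltrami eigenvectors) can be sketched as follows. We substitute a plain graph Laplacian for the weighted Laplace-Beltrami operator and use SciPy's k-means, so this is a simplified stand-in rather than the paper's implementation.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def rotational_clusters(L, k, n_eig=8, seed=0):
    """Over-segment a mesh into k near-rigid patches by k-means on
    low-frequency Laplacian eigenvectors.

    L : (n, n) symmetric graph Laplacian of the mesh connectivity
    Returns an array of n patch labels in {0, ..., k-1}.
    """
    # The smallest-eigenvalue eigenvectors vary smoothly over the mesh,
    # so nearby vertices get similar feature rows.
    _, vecs = np.linalg.eigh(L)
    feats = vecs[:, 1:n_eig + 1]        # drop the constant eigenvector
    _, labels = kmeans2(feats, k, minit='++', seed=seed)
    return labels
```

On a mesh graph, vertices in the same smooth region receive nearby spectral coordinates and thus land in the same cluster, giving the over-segmentation into near-rigid patches.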

Linear proxies. Besides rotational proxies, we add linear proxies via pseudo-spatial linear constraints,

(12)

where are the linear proxies of our model.

Intuitively, spans a finite-dimensional linear space approximating the uncertainty set of on-site constraints provided by users. Its choice reflects how we reduce the dimension of anticipated constraints, as suggested by the use of the variational subspace. A simple choice is a sparse sampling of vertices (shown in Fig. 5), i.e. (under a permutation)

and an alternative one is to utilize vertices groups via clustering, i.e., (under a permutation)

In contrast with our approach, standard model reduction techniques explicitly represent vertices in a low-order basis. To compute a reasonable subspace, different smoothness criteria are imposed, such as heat equilibrium [Baran and Popović 2007], exponentially propagating weights [Jacobson et al. 2012], and biharmonic smoothness [Jacobson et al. 2011]. We instead reduce the dimension of the constraints, and the subspace is then solved for automatically.
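Both choices of linear proxies described above can be encoded as sparse matrices; a minimal sketch with hypothetical helper names:

```python
import numpy as np
from scipy.sparse import csr_matrix

def sampling_proxies(n, sample_idx):
    """Selection matrix for sparsely sampled vertices: row i of the
    result picks vertex sample_idx[i], so constraints act on S @ x."""
    k = len(sample_idx)
    return csr_matrix((np.ones(k), (np.arange(k), np.array(sample_idx))),
                      shape=(k, n))

def group_proxies(n, labels):
    """Averaging matrix for vertex groups (the clustering alternative):
    row g averages the coordinates of the vertices in cluster g."""
    labels = np.asarray(labels)
    k = int(labels.max()) + 1
    counts = np.bincount(labels, minlength=k).astype(float)
    return csr_matrix((1.0 / counts[labels], (labels, np.arange(n))),
                      shape=(k, n))
```

Either matrix plays the role of the pseudo-spatial constraint matrix in Eq. (12); the user-specified handle positions then form its right-hand side.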

Variational subspace. In the context of the approximated energy , we solve the linear variational problem so as to derive a reduced representation of in terms of the proxies and , i.e.,

(13)

Applying the KKT condition and introducing Lagrange multipliers , we obtain a set of linear equations in , which can be expressed in matrix form as

(14)

This then implicitly establishes a linear map

(15)

where each column of the matrices can be pre-computed by a sparse linear solver with a single preconditioning (LU or Cholesky), subject to each single variable in the vectors and . Solving for and needs only a one-time computation in the offline stage.

Sub-manifold integration. Provided the variational subspace, and span a sub-manifold of deformations. We then restrict our scope to determining the reduced variables and . We employ a routine similar to alternating least squares [Sorkine and Alexa 2007], where we alternately update and via two phases.

Phase 1: provided , solve for .

By substituting (15) into approximated ARAP energy (3.1), we derive a reduced ARAP energy as

(16)

where and . With on-site hard constraints specified by the user (where are positional constraints and are their values), we then solve for the linear proxies

(17)

where . Hard constraints are the default setting of our framework.

Alternatively, we can pose on-site soft constraints as

(18)

where is adjusted interactively by user to match the desired effects. We input and solve the integrated reduced model interactively.

Remark that because the optimization problems (equations (17) and (18)) are again linear variational, they can be efficiently solved by a standard dense linear solver: (1) pre-compute the LU factorization of the matrix (not related to ) at the stage where constraint handlers are specified, and (2) perform backward substitution on the fly while handles are dragged/rotated.
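The factor-once/back-substitute-per-frame pattern can be sketched as follows; the reduced matrix `A` is a generic stand-in for the small dense system of Eq. (17) or (18):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

class ReducedSolver:
    """Interactive solve of the reduced problem (cf. Eqs. (17)-(18)):
    LU-factorize the small dense reduced system once when the user
    picks a constraint set, then back-substitute once per frame."""

    def __init__(self, A):
        # Stage 1: pre-factorize (re-run only when handles change).
        self.fac = lu_factor(A)

    def solve(self, rhs):
        # Stage 2: per-frame backward substitution while dragging.
        return lu_solve(self.fac, rhs)
```

Because the factorization cost is paid once per constraint set, the per-frame work is a single triangular solve on a matrix whose size is independent of the mesh resolution.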

Phase 2: provided and , compute . Rather than minimizing the reduced energy functional (shown in Eq. (16)) in terms of , we instead want the rotational clusters to adapt to the existing deformation. Letting , we fit a patch-wise local frame of the rotational clusters to the deformed mesh by dropping the relations and their penalties , and optimizing a simplified energy , which is equivalent to

(19)

where and are pre-computed. It is well known that these rotation fittings can be solved in parallel via a singular value decomposition of each gradient block of . For matrices, we employ the optimized SVD routine of McAdams and colleagues [McAdams et al. 2011], which avoids reflections, i.e., guarantees the orientation.
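The per-cluster rotation fitting with reflection avoidance can be sketched with a plain SVD; we use NumPy's generic routine rather than the optimized 3x3 routine of McAdams et al., so this is a functional stand-in only.

```python
import numpy as np

def fit_rotation(S):
    """Closest rotation (no reflection) to a 3x3 gradient/covariance
    block S: R = U V^T from the SVD of S, with a sign flip on the last
    column of U when det(U V^T) < 0, so that det(R) = +1 always."""
    U, _, Vt = np.linalg.svd(S)
    R = U @ Vt
    if np.linalg.det(R) < 0:      # would be a reflection: flip it
        U[:, -1] *= -1.0
        R = U @ Vt
    return R
```

This is the standard orthogonal Procrustes/polar step of the local phase in ARAP-style solvers, and each block can be fitted independently in parallel.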

3.2 Algorithm overview

Figure 5: Interactive mesh modeling framework of our approach. From left to right: (1) Input red demon model ; (2) Pre-process to set pseudo-constraint points as linear proxies and near-rigid parts as rotational proxies; (3) Compute variational subspace offline, and load into display device. (4) Prepare subspace integration , , and . (5) At run-time, given on-site user demand , solve for reduced variables and upload to display device (bandwidth saving); (6) Display deformed mesh and feedback.

We review our previous mathematical formulations, and summarize our algorithm into three stages (see also Fig. 5):

Pre-compute. The user loads initial mesh model , linear proxies , rotational proxies , and affine controllers (if applicable). Our algorithm constructs a sparse linear system to solve for variational subspace and , and then pre-computes , , and .

Prepare on-site constraints. Once the above pre-computed matrices are present, the user can freely specify the intended constraint handlers on-site, in the form of . Our algorithm then computes and , and pre-factorizes the linear system (see equation (17) or (18)). If the user introduces a brand-new set of constraints on-site, this stage is re-computed within tens of milliseconds. Timings for different settings are reported in Table 1, column "OP".

Deform on the fly. Our algorithm allows the user to deform meshes on the fly, meaning the user views the deformation results instantly while controlling the constraint handlers. For each frame, our model takes in the positional constraints , calls an alternating routine (with global rotation adaption) to solve for the proxy variables and , and reconstructs and displays the deformed mesh. To guarantee real-time performance, we use a fixed number of iterations per frame. By initializing the alternating routine with the previous frame's proxies, we do not observe any disturbing artifacts even when using only 8 iterations.

3.3 Results and Discussion

Model          | Vert. | Type | Linear | Rot. | 1 Iter. (μs) | Df. (ms) | Total (ms) | Subspace (s/GB) | OP (ms) | Fig.
Cylinder       | 5k    | Tri. | 33     | 12   | 50           | 1.4      | 2          | 14 (0.3)        | 14      | 8
Cactus         | 5k    | Tri. | 33     | 27   | 90           | 2        | 3          | 18 (0.4)        | 16      | 8
Bar            | 6k    | Tri. | 33     | 52   | 93           | 3.8      | 4.8        | 36 (0.5)        | 20      | 8
Bumpy Plane    | 40k   | Tri. | 33     | 27   | 85           | 14       | 15         | 200 (2.7)       | 38      | 8
Plate Box      | 4k    | Tet. | 25     | 25   | 62           | 2        | 2.7        | 30 (0.4)        | 11      | 6
Solid Cylinder | 8k    | Tet. | 33     | 52   | 110          | 4        | 5          | 90 (0.8)        | 22      | 3
Tree           | 3.6k  | Tri. | 60     | 60   | 137          | 3.5      | 5          | 34 (0.4)        | 10      | 1
Fertility      | 25k   | Tet. | 29     | 26   | 78           | 5.2      | 6          | 148 (2.4)       | 28      | 1
Dinosaur       | 21k   | Tri. | 46     | 34   | 108          | 12       | 14         | 115 (1.7)       | 24      | 7
Dragon         | 53k   | Tri. | 20     | 20   | 55           | 14       | 15         | 198 (2.7)       | 64      | 10
Red Demon      | 80k   | Tri. | 30     | 28   | 90           | 35       | 36         | 498 (5.8)       | 106     | 5
Table 1: Model statistics and serial performance on an HP laptop with an Intel i7 2.20 GHz processor. From left to right: number of vertices, type of mesh, number of linear proxies, number of rotational proxies, time in microseconds for one reduced iteration, time in milliseconds for mapping the reduced solution into the full space (Intel MKL on CPU), time in milliseconds for the full optimization per frame, time in seconds (and memory in GBytes) for pre-computation of the subspace, time in milliseconds for computation of the on-site preconditioner, and the figure showing the configuration.
Figure 6: Our reduced model preserves the nature of different functions. Left: volumetric deformed mesh; Right: deformed surface (self-occlusion possible) under same constraints.
Figure 7: Our revised energy does not reveal linearization artifacts when the deformation is isometric (middle). It also favors isotropic scaling when the model is stretched or shrunk (right). The original model is shown on the left.

We implement our framework for deformable mesh modeling and demonstrate our results through examples, including the standard deformation suite introduced in [Botsch and Sorkine 2008]. Results of our approach on a set of typical test meshes are shown in Fig. 8. They can be compared with results of high-quality methods without model reduction, including PriMo [Botsch et al. 2006], also shown in [Botsch and Sorkine 2008]. We also demonstrate the strength of our method in the conformal setting, where we configure scaling factors in our modeling framework (see Fig. 10). In our experiments, the modeling framework runs robustly on various models and for different types of transformation, such as small and large rotations, twisting, and bending, as shown in Fig. 2. It works naturally on both surface and solid meshes, where the user's choice of energy controls the desired behavior (see Fig. 6). It also accommodates different hyper-parameter settings, such as the number and type of proxies, producing predictable and reasonable results (see Fig. 4).

Based on our CPU implementation, we report timings of our algorithm on the different models presented in this paper in Table 1. All timing results are generated on an Intel Core i7-2670QM 2.2 GHz processor with 12 GB RAM. The time used in a reduced model iteration is not related to the geometric complexity, and the overall computation per frame is an order of magnitude faster than that reported in [Hildebrandt et al. 2011]. The computational framework used in our paper is almost the same as the one used in [Jacobson et al. 2012], so the performances are comparable. The process of mesh reconstruction, a matrix-vector product (see Eq. (15) and column "Df." of Table 1), is the bottleneck of the overall computation, yet it is embarrassingly parallel.

Figure 8: Approximation quality of our method in all images is demonstrated on the test suite of models introduced in [Botsch and Sorkine 2008]; From left to right: Bumpy Plane, Cylinder, Cactus, Bar. Multiple types of deformations, including bending, shifting and twisting, are tested.

Compared to other cell-based model reduction methods [Sumner et al. 2007, Botsch et al. 2007], our approach utilizes a much smaller number of reduced variables. Typically, we adopt no more than 35 linear proxies and no more than 60 rotational proxies. Moreover, the configuration of proxies gives the user freedom to tailor the reduction to the needs of modeling a particular mesh. Instead of restricting variability to the space of modes and modal derivatives [Hildebrandt et al. 2011], an artist can, based on his or her intentions, cut the shape into near-rigid parts (each a rotational proxy) and specify pseudo-constraint locations as linear proxies. Fig. 5 demonstrates a modeling scenario where the artist intends to adjust the mouth, nose, and eyes of a face model: semantic parts are first annotated, and the variational reduced deformable model is then pre-computed for online editing.

4 Theory of Variational Subspace

The notation used follows the preliminary setup in Section 2.

4.1 Concept of Approximation

Definition 4.1 (Variational Subspace).

Given a quadratic programming problem in the form of Eq. (1), choose and for the problem in the two-stage form of Eq. (3). The solution of the first-stage problem is then given by Eq. (5). The subspace spanned by the columns of and is called the variational subspace.

Proposition 4.1.

is a matrix of size and is a matrix of size . Their columns span a linear subspace where the reduced solution belongs. Those columns are computed by solving a variational formulation provided by Eq. (4).

In Eq. (1), is only guaranteed to be positive semi-definite. We have the Cholesky decomposition , where is upper triangular with non-negative diagonal entries [Golub and Van Loan 2012]. Denote the pseudo-inverse of by ; then for any , we have the two-part orthogonal decomposition , where

(20)
Proposition 4.2.

With the two-part decomposition Eq. (20), we have

(21)

In addition, if is positive definite, .

Moreover, we can rewrite the optimization problem Eq. (1) as

(22)

Since , we simplify above formulation to

(23)

Observe that appears only in the second-order term of the objective function of Eq. (23). Suppose the optimal solution to Eq. (1) is , with two-part decomposition (given by Eq. (20)) ; we then consider the following companion optimization problem

(24)

Remark that the term appearing in the objective function of Eq. (24) is a constant. Eq. (24) can be solved equivalently in two steps. In the first step, we solve the following problem

(25)

In the second step, we project the solution to , where is the solution to Eq. (25).

Theorem 4.3.

Suppose is the unique solution to Eq. (23), is the unique solution to Eq. (24), and is the unique solution to Eq. (25), then we have .

Proof.

We observe that satisfies the constraint of Eq. (24), and objective functions of Eq. (23) and Eq. (24) coincide, i.e., . Therefore, as is the minimizer to Eq. (24), we have

On the other hand, . Given is the minimizer to Eq. (23), we also have

Hence, the equality holds for . Given that the optimum exists and is unique, we have

The proof of is similar: observe that and satisfy the constraint of Eq. (24). ∎

Definition 4.2.

Two solutions subject to the form Eq. (1) (parameterized by , , and ) are called quotient equivalent if they share the same companion problem defined by Eq. (25), i.e., their are the same. This defines an equivalence relation on the space of solutions.

For example, let be the Laplacian operator ; the solution then minimizes the L2 norm of the first-order gradient. In this case, two problems are quotient equivalent if their optimal solutions agree up to an additive constant. We use the distance between two solution groups under the quotient equivalence to measure the approximation error. It is the Mahalanobis distance provided by , i.e.
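A small numerical illustration of quotient equivalence under the Laplacian example above; a path-graph Laplacian stands in for the operator, and the variable names are illustrative:

```python
import numpy as np

# Path-graph Laplacian: x^T L x is the squared L2 norm of first-order
# differences, so the constant vector lies in the kernel of L.
n = 6
L = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1.0   # boundary rows of the path graph

rng = np.random.default_rng(1)
x1 = rng.standard_normal(n)
x2 = x1 + 3.0               # the same solution up to an additive constant

# The Mahalanobis semi-norm induced by L cannot tell them apart:
d = x1 - x2
print(float(d @ L @ d))     # -> 0.0 (the two solutions are quotient equivalent)
```

Any distance measured in this semi-norm is therefore a distance between equivalence classes, which is exactly how the approximation error is defined above.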

Letting , , , and , we rewrite Eq. (25) (not equivalently) as

(26)

Similar to the treatment of Eq. (24), we can first solve

(27)

and then project the solution to , where is the solution to Eq. (27).

Since we are only interested in a distance measure between solutions, the projection step in solving Eq. (24) is not necessary for computing . The distance between two solution groups and under the quotient equivalence is therefore the Euclidean distance between and .

Similar to the treatment of Eq. (1) and Eq. (27), we can derive the two-stage problem from Eq. (2) as

(28)

where and . The KKT condition of its first-stage problem is given similarly as

(29)

where is a Lagrange multiplier. The following propositions justify studying only Eq. (27) and Eq. (28).

Proposition 4.4.

If is the optimal solution to Eq. (1) and the optimal solution to Eq. (27) with and , we have . A similar argument also holds for Eq. (2) and Eq. (28).

Proof.

Because is the optimal solution to Eq. (23), we have , where , as mentioned, is the optimal solution to Eq. (25). On the other hand, we know . Therefore, we have . ∎

Proposition 4.5.

Suppose is the solution of Eq. (4) and is the solution of Eq. (29), we have if and .

Proof.

We have and . Therefore, satisfies Eq. (29). ∎

Definition 4.3 (Variational Subspace Under Quotient Equivalence).

Given to be the solution to Eq. (29), the columns of and span the variational subspace for solutions of optimization problem in the form of Eq. (27).

The main problem is thus revealed: how close are the exact solution of Eq. (27) and the subspace solution restricted to the variational subspace defined by Def. 4.3, where closeness is measured by the Mahalanobis distance provided by .

4.2 Bound of Approximation Error

In this section, we prove that the approximation error of model reduction by the variational subspace can be bounded in .

Proposition 4.6 (Exact Solution).

Assume Eq. (27) has a unique solution with a finite optimum, and that is a full-rank matrix. Then the solution is

(30)

where is the pseudo-inverse of .

Proof.

The KKT condition of Eq. (27) indicates and . This leads to . Since is full-rank, is invertible. Hence we have

(31)

Plugging in yields Eq. (30). ∎
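The closed-form solution of an equality-constrained QP via its KKT system, in the spirit of Prop. 4.6, can be checked numerically. Since the paper's symbols were lost in extraction, `A`, `C`, `b`, and `c` below are generic stand-ins for the quadratic term, the constraint matrix, and the data.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 6, 2
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)              # symmetric positive definite
C = rng.standard_normal((m, n))          # full row rank (a.s.)
b = rng.standard_normal(m)
c = rng.standard_normal(n)

# Direct KKT solve of  min 1/2 x^T A x - c^T x  s.t.  C x = b
K = np.block([[A, C.T], [C, np.zeros((m, m))]])
x_kkt = np.linalg.solve(K, np.concatenate([c, b]))[:n]

# Closed form via the Schur complement (cf. Prop. 4.6, generic symbols):
Ainv = np.linalg.inv(A)
S = C @ Ainv @ C.T                       # invertible by the rank assumption
lam = np.linalg.solve(S, C @ Ainv @ c - b)
x_cf = Ainv @ (c - C.T @ lam)

print(np.allclose(x_kkt, x_cf))          # -> True
```

Eliminating the multiplier from the stationarity condition and substituting into the constraint is exactly the derivation used in the proof above.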

Similarly, we also have

Proposition 4.7.

Suppose is the solution of Eq. (29), then

(32)

where is pseudo-inverse of , and is the orthogonal projector onto the kernel of [Golub and Van Loan 2012].

Here we remark that, in order for any subspace solution to have a unique low-dimensional coordinate , we should require and to be linearly independent. This is equivalent to requiring to be full rank.

Theorem 4.8 (Projection on Variational Subspace).

Assume columns of and are linearly independent. Given any , its closest point (under Euclidean distance) in a variational subspace given by Eq. (32) is

(33)

and the closest point is

(34)

where and .

Proof.

If for some , we have , where . Since and are linearly independent, we have and . Therefore, is full-rank. Furthermore, is invertible. The closest point to is found by minimizing

whose partial gradient against and should be zero, i.e.,

and

Notice that , , . The above equalities can be simplified to

and solved as

Next, we derive the analytic subspace solution.

Proposition 4.9 (Variational Subspace Solution).

Let be the orthogonal projector onto the subspace ([Yanai et al. 2011], page 45), where is the orthogonal projector onto the kernel of . Assuming is full-rank, the columns of and are linearly independent, and exists, the variational subspace solution to Eq. (27) is

(35)

where and . Note that is the projection matrix, restricted to the subspace, that maps onto the kernel of .

Proof.

First, we have that is symmetric and . Plugging the variational subspace into Eq. (