On Radically Expanding the Landscape of Potential Applications for Automated Proof Methods

06/03/2019 ∙ by Jeffrey Uhlmann, et al. ∙ 0

In this paper we examine the potential of computer-assisted proof methods to be applied much more broadly than commonly recognized. More specifically, we contend that there are vast opportunities to derive useful mathematical results and properties that are extremely narrow in scope, and of practical relevance only to highly-specialized engineering applications, that are presently overlooked because they have characteristics atypical of those that are conventionally pursued in the areas of pure and applied mathematics. As a concrete example, we demonstrate use of automated methods for certifying polynomial nonnegativity as a part of a dimension-pinning strategy to prove that the inverse of the relative gain array (RGA) of a d-dimensional positive-definite matrix is doubly-stochastic for d≤ 4.

1 Introduction

This paper is in part inspired by recently-revived discussions about the current and potential prevalence of computer-assisted proof methods spurred by the quarter-century anniversary of the notorious “Death of Proof” article from 1993 [10]. One common observation is that “…computers continue to only rarely have any role in the creation and checking of the proofs of mathematical theorems” [11]. The conventional opinion is that this status quo will not change unless and until the power of automated proof methods improves dramatically to a point where they are able to solve an increasing fraction of outstanding problems of interest to professional mathematicians. In this paper, however, we suggest an alternative in which their power increases at only a moderate rate but there is a dramatic increase in the number of problems to which they are applied. These could include problems of little or no theoretical interest to the mathematical community and which only have narrow practical application (e.g., as typified by the example problem of this paper) but which by sheer number lead to a situation in which automated methods do in fact surpass the productivity and practical impact of traditional proof methods employed by human mathematicians.

2 Dimension-Specific Properties

For a particular class of objects of interest, e.g., some defined subset of the integers or reals, it is commonly of analytic or practical value to identify a set of special properties satisfied by that class. For a class of objects parameterized by an integral measure of size or dimensionality $d$, it may be the case that a particular property only holds for a limited set of values of $d$. For example, non-parallel lines always intersect in two dimensions but not generally in higher dimensions.

In the case of matrices there is great value in being able to establish that the result of a given matrix function or transformation has a special property (e.g., that it is unitary, positive-definite, totally-unimodular, doubly-stochastic, or has a special structural property such as sparseness) which can be exploited for analysis purposes or to obtain solutions more efficiently than would be possible in the general case. This is the motivation for deriving decompositions of general matrices as products or sums of matrices with special properties [15, 16, 31]. For example, the singular-value decomposition (SVD) is one of the most widely-used tools in linear algebra because it permits an arbitrary matrix $A$ to be expressed as $A = U \Sigma V^{*}$, where $U$ and $V$ are unitary and $\Sigma$ is diagonal [12].

When general linear-algebraic tools are applied to analyze a matrix transformation $f(A)$, the identified properties will typically hold independently of the dimensionality $d$. In fact, the power of linear algebra largely derives from its ability to work with objects (matrices) in a manner that allows the dimensionality to be abstracted. However, many theoretical and practical problems of interest, e.g., design and analysis of control systems, are intrinsically defined in a fixed $d$-dimensional space for which many properties may hold for $f(A)$ conditioned on $d$ that do not necessarily hold for general $d$. Once identified, such properties can provide significant insights and permit a much larger set of mathematical tools to be applied. Unfortunately, establishing such properties may demand effort exponential in dimensionality.

In the following sections we consider an example in which the opportunities and obstacles associated with the determination of dimension-specific properties are highlighted. In particular, we prove conjectured properties of a specific matrix function for $d \le 4$ using sophisticated methods for certifying polynomial nonnegativity. We then consider the challenge of proving that these properties hold more generally up to a conjectured upper bound of $d = 6$. We conclude with consideration of dimension pinning as a potentially effective and general strategy for establishing dimension-specific properties as part of a broader vision for increasing the applicability and relevance of computer-assisted proof methods.

3 The RGA and its Inverse

The relative gain array (RGA) is an important tool in the design of process control systems [4]. It is a function of a nonsingular real or complex matrix $A$ (the transpose operator in the RGA applies even in the case of complex $A$, i.e., it should not be replaced with the conjugate-transpose operator) defined as:

$$\mathrm{RGA}(A) \doteq A \circ (A^{-1})^{T} = A \circ A^{-T} \tag{3.1}$$

where $\circ$ represents the elementwise Hadamard matrix product and the rightmost expression exploits simplified notation for the inverse-transpose operator. The RGA has a variety of interesting mathematical properties, one of which is that the sum of the elements in each row and column is unity [13, 30]. For example, given the following matrix (an integer matrix chosen purely because its inverse is also integral [9], which simplifies the values appearing in this and subsequent examples; no inferences should be drawn from the integrality of the results)

(3.2)

then

(3.3)

Another important property of the RGA relates to permutations of the rows and columns of its argument [13]:

$$\mathrm{RGA}(PAQ) = P \cdot \mathrm{RGA}(A) \cdot Q \tag{3.4}$$

In other words, the RGA is permutation-consistent [29, 30] with respect to left and right multiplication by permutation matrices $P$ and $Q$.
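The two RGA properties described above (unit row and column sums, and permutation consistency) can be checked numerically. The following is a minimal sketch using exact rational arithmetic; the helper names `inverse` and `rga`, the 3×3 test matrix, and the chosen permutations are illustrative assumptions and are not taken from the paper.

```python
from fractions import Fraction


def inverse(m):
    """Exact Gauss-Jordan inverse of a nonsingular square matrix."""
    n = len(m)
    # Augment with the identity, then reduce the left half to the identity.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def rga(a):
    """Relative gain array: Hadamard product of A with its inverse-transpose."""
    inv = inverse(a)
    n = len(a)
    return [[a[i][j] * inv[j][i] for j in range(n)] for i in range(n)]


# Hypothetical example matrix (determinant 1, so its inverse is integral).
A = [[1, 2, 3], [0, 1, 4], [5, 6, 0]]
R = rga(A)
row_sums = [sum(row) for row in R]
col_sums = [sum(col) for col in zip(*R)]

# Permutation consistency: permuting the rows and columns of A
# permutes RGA(A) in exactly the same way.
p, q = [2, 0, 1], [1, 2, 0]
permute = lambda m: [[m[i][j] for j in q] for i in p]
consistent = rga(permute(A)) == permute(R)
print(row_sums, col_sums, consistent)
```

Exact fractions are used instead of floating point so that the unit-sum property is verified as an identity rather than up to rounding error.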

A related matrix operator that has not previously been considered in the literature is the inverse of the RGA result, or IRGA, which we define as:

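$$\mathrm{IRGA}(A) \doteq \left(\mathrm{RGA}(A)\right)^{-1} = \left(A \circ A^{-T}\right)^{-1} \tag{3.5}$$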

Applying this to the matrix of Eq. (3.2) gives

(3.6)

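where it can be seen that the row and column sums are unity, as should be expected since $\mathrm{RGA}(A)$ has unit row and column sums. Examining the $2 \times 2$ case

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \tag{3.7}$$

reveals

$$\mathrm{IRGA}(A) = \frac{1}{a_{11}a_{22} + a_{12}a_{21}} \begin{bmatrix} a_{11}a_{22} & a_{12}a_{21} \\ a_{12}a_{21} & a_{11}a_{22} \end{bmatrix} \tag{3.8}$$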

This result suggests several special properties that might be conjectured to hold for some values $d > 2$. The most tantalizing relates to the fact that the denominator of the coefficient is the permanent of $A$. In fact, the overall result represents the joint assignment matrix (JAM) [5] of $A$, which is the solution to an important combinatorial problem arising in multiple-target tracking and related applications [27, 28]. (The inverse-JAM is presently under investigation by the first author as an alternative generalization of the RGA for $d > 2$.) Evaluation of the JAM for a general matrix is believed to be computationally intractable based on the #P-hard complexity of evaluating the permanent of a matrix [5, 3, 17, 6], so it should not be surprising that JAM-equivalence does not hold for $d > 2$.
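The relationship between the $2 \times 2$ IRGA and the matrix permanent can be confirmed by direct computation. The sketch below evaluates the IRGA from its definition and compares it with the permanent-scaled form; the function names and the entry labels `a, b, c, e` are our own illustrative choices, and the sample matrices are hypothetical.

```python
from fractions import Fraction


def irga_2x2(a, b, c, e):
    """IRGA of [[a, b], [c, e]] computed from the definition (RGA, then inverse)."""
    a, b, c, e = (Fraction(v) for v in (a, b, c, e))
    det = a * e - b * c
    # RGA(A) = A o A^{-T}; for a 2x2 matrix, A^{-T} = [[e, -c], [-b, a]] / det.
    r11, r12 = a * e / det, -b * c / det
    r21, r22 = -c * b / det, e * a / det
    # Invert the 2x2 RGA result.
    rdet = r11 * r22 - r12 * r21
    return [[r22 / rdet, -r12 / rdet], [-r21 / rdet, r11 / rdet]]


def permanent_form(a, b, c, e):
    """Entry products a*e and b*c scaled by the permanent a*e + b*c."""
    a, b, c, e = (Fraction(v) for v in (a, b, c, e))
    perm = a * e + b * c
    return [[a * e / perm, b * c / perm], [c * b / perm, e * a / perm]]


# The two forms agree on each (hypothetical) sample with nonzero
# determinant and nonzero permanent.
samples = [(2, 1, 1, 3), (1, 2, 3, 5), (4, -1, 2, 7)]
agree = all(irga_2x2(*s) == permanent_form(*s) for s in samples)
print(agree, irga_2x2(2, 1, 1, 3))
```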

It can also be observed in the $d = 2$ case that nonnegative $A$ implies $\mathrm{IRGA}(A)$ is nonnegative and therefore doubly-stochastic. Unfortunately, this property also does not generalize to $d > 2$. However, in the case of positive-definite $A$ a conjecture of the first author states:

Conjecture 3.1

For a $d \times d$, $d \le 6$, positive-definite matrix $A$, $\mathrm{IRGA}(A)$ is positive-definite, has unit row and column sums, and is nonnegative and hence doubly-stochastic.

In fact, because of the permutation-consistency property of the RGA, hence also of the IRGA, the conjecture only requires $A$ to be positive-definite up to left and right permutations.

A nonnegative symmetric positive-definite matrix is sometimes referred to as being doubly nonnegative [36]. Such matrices arise in a variety of applications ranging from control systems and network analysis to estimation and optimization [2, 20, 21, 32, 37], and characterizations of their inverses have been long-studied [7, 8, 23].

It is straightforward to prove that positive-definiteness of the IRGA is preserved for all $d$ because $A$ and its inverse are both PD (positive-definite), and the Hadamard product of two PD matrices always yields a PD result [12, 26], so $\mathrm{RGA}(A)$ is PD and hence so is its inverse. Therefore the critical property to be verified for values of $d$ in the conjectured range is nonnegativity. This will be examined in the following two sections.

4 Certifying nonnegativity of polynomials

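Let $\mathbb{R}[\mathbf{x}] = \mathbb{R}[x_1, \ldots, x_n]$ be the ring of real $n$-variate polynomials. For a finite set $\mathscr{A} \subset \mathbb{N}^n$, we denote by $\mathrm{conv}(\mathscr{A})$ the convex hull of $\mathscr{A}$. A polynomial $f \in \mathbb{R}[\mathbf{x}]$ can be written as $f(\mathbf{x}) = \sum_{\alpha \in \mathscr{A}} c_{\alpha} \mathbf{x}^{\alpha}$ with $c_{\alpha} \in \mathbb{R}$, $\mathbf{x}^{\alpha} = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$. The support of $f$ is $\mathrm{supp}(f) = \{\alpha \in \mathscr{A} \mid c_{\alpha} \ne 0\}$.

For a nonempty finite set $\mathscr{B} \subseteq \mathbb{N}^n$, $\mathbb{R}[\mathscr{B}]$ denotes the set of polynomials in $\mathbb{R}[\mathbf{x}]$ whose supports are contained in $\mathscr{B}$, i.e., $\mathbb{R}[\mathscr{B}] = \{f \in \mathbb{R}[\mathbf{x}] \mid \mathrm{supp}(f) \subseteq \mathscr{B}\}$, and we use $\Sigma(\mathscr{B})$ to denote the set of polynomials which are sums of squares of polynomials in $\mathbb{R}[\mathscr{B}]$. The set of $r \times r$ symmetric matrices is denoted by $S^{r}$ and the set of $r \times r$ positive-semidefinite matrices is denoted by $S^{r}_{+}$. Let $\mathbf{x}^{\mathscr{B}}$ be the $|\mathscr{B}|$-dimensional column vector consisting of elements $\mathbf{x}^{\beta}$, $\beta \in \mathscr{B}$, then

$$\Sigma(\mathscr{B}) = \{ (\mathbf{x}^{\mathscr{B}})^{T} Q\, \mathbf{x}^{\mathscr{B}} \mid Q \in S^{|\mathscr{B}|}_{+} \}$$

where the matrix $Q$ is called the Gram matrix.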

A classical approach for checking nonnegativity of multivariate polynomials, as introduced by Lasserre [14] and Parrilo [18], is the use of sums of squares as a suitable replacement for nonnegativity. Given a polynomial $f(\mathbf{x}) \in \mathbb{R}[\mathbf{x}]$, if there exist polynomials $f_1(\mathbf{x}), \ldots, f_m(\mathbf{x})$ such that

$$f(\mathbf{x}) = \sum_{i=1}^{m} f_i(\mathbf{x})^2 \tag{4.1}$$

then $f(\mathbf{x})$ is referred to as a sum of squares (SOS). Obviously an SOS decomposition of a given polynomial serves as a certificate for its nonnegativity. For $k \in \mathbb{N}$, let $\mathbb{N}^n_k = \{\alpha \in \mathbb{N}^n \mid \sum_{i=1}^{n} \alpha_i \le k\}$ and assume $f \in \mathbb{R}[\mathbb{N}^n_{2k}]$. Then the SOS condition (4.1) can be converted to the problem of deciding if there exists a positive-semidefinite matrix $Q$ such that

$$f(\mathbf{x}) = (\mathbf{x}^{\mathbb{N}^n_k})^{T} Q\, \mathbf{x}^{\mathbb{N}^n_k} \tag{4.2}$$

which can be effectively solved as a semidefinite programming (SDP) problem. Note that there are $\binom{n+k}{k}$ elements in the monomial basis $\mathbf{x}^{\mathbb{N}^n_k}$. So the size of the corresponding SDP problem is $O\big(\binom{n+k}{k}\big)$, which grows rapidly as the number of variables and the degree of the given polynomial increase. To deal with high-degree polynomials with many variables it is crucial to exploit the sparsity of polynomials to reduce the size of the corresponding SDP problems.
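To give a sense of how quickly the dense monomial basis grows, the sketch below evaluates the standard binomial count of monomials of degree at most half the polynomial degree, using the (#var, deg) pairs that this paper reports in Table 1. The function name `dense_basis_size` is a hypothetical helper, and the count of free Gram-matrix entries is shown only as a rough indicator of SDP size.

```python
from math import comb


def dense_basis_size(num_vars, degree):
    """Number of monomials of degree <= degree/2 in num_vars variables."""
    half = degree // 2
    return comb(num_vars + half, half)


# (#var, deg) pairs as reported in Table 1 of this paper.
for num_vars, degree in [(6, 12), (10, 20), (15, 40)]:
    size = dense_basis_size(num_vars, degree)
    # A symmetric size-by-size Gram matrix has size*(size+1)/2 free entries.
    print(num_vars, degree, size, size * (size + 1) // 2)
```

Even the smallest case already gives a dense basis with hundreds of monomials, which is the motivation for the sparsity techniques of the next section.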

5 Certifying nonnegativity of sparse polynomials

There are two aspects by which sparsity in SOS decompositions can be exploited. The first is to compute a smaller monomial basis. In fact, the monomial basis in (4.2) can be replaced by [22]

$$\mathscr{B} = \mathrm{conv}(\{\alpha/2 \mid \alpha \in \mathscr{A}\}) \cap \mathbb{N}^n, \quad \mathscr{A} = \mathrm{supp}(f) \tag{5.1}$$

Second, block-diagonal structure in the Gram matrix can be exploited through cross sparsity patterns, which were introduced by the second author [33] and will prove critical to achieving our main result.

Definition 5.1

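Let $f \in \mathbb{R}[\mathbf{x}]$ with $\mathrm{supp}(f) = \mathscr{A}$ and let $\mathscr{B}$ be as in (5.1). A $|\mathscr{B}| \times |\mathscr{B}|$ cross sparsity pattern matrix $R_{\mathscr{A}} = (R_{\beta\gamma})$ is defined by

$$R_{\beta\gamma} = \begin{cases} 1, & \beta + \gamma \in \mathscr{A} \cup 2\mathscr{B}, \\ 0, & \text{otherwise}. \end{cases} \tag{5.2}$$

Given a cross sparsity pattern matrix $R_{\mathscr{A}} = (R_{\beta\gamma})$, the graph $G(V_{\mathscr{A}}, E_{\mathscr{A}})$ where $V_{\mathscr{A}} = \mathscr{B}$ and $E_{\mathscr{A}} = \{(\beta, \gamma) \mid \beta, \gamma \in V_{\mathscr{A}},\ \beta \ne \gamma,\ R_{\beta\gamma} = 1\}$ is called the cross sparsity pattern graph.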
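The block structure induced by a cross sparsity pattern graph can be illustrated on a toy polynomial. The sketch below assumes the linking rule that two basis exponents are adjacent when their sum lies in the support of the polynomial or equals twice a basis exponent (our reading of the definition in [33]); the function name, the example polynomial $1 + x^4 + y^4 + x^2y^2$, and its hand-listed monomial basis are all hypothetical.

```python
from itertools import combinations


def cross_sparsity_components(support, basis):
    """Connected components of the graph linking basis exponents beta, gamma
    whenever beta + gamma is in the support or is twice a basis exponent."""
    allowed = set(support) | {tuple(2 * e for e in beta) for beta in basis}
    parent = {beta: beta for beta in basis}

    def find(b):
        # Union-find root lookup with path halving.
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    for beta, gamma in combinations(basis, 2):
        if tuple(x + y for x, y in zip(beta, gamma)) in allowed:
            parent[find(beta)] = find(gamma)

    groups = {}
    for beta in basis:
        groups.setdefault(find(beta), []).append(beta)
    return sorted(groups.values(), key=len, reverse=True)


# Hypothetical example: f = 1 + x^4 + y^4 + x^2 y^2 (exponent tuples).
support = [(0, 0), (4, 0), (0, 4), (2, 2)]
# Monomials of degree <= 2 in two variables, listed by hand.
basis = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]
components = cross_sparsity_components(support, basis)
block_sizes = [len(c) for c in components]
print(block_sizes)
```

In this toy case a single 6×6 Gram matrix is replaced by one 3×3 block and three 1×1 blocks, which is the kind of reduction the block-diagonal formulation is meant to deliver.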

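Given an undirected graph $G(V, E)$ with $V \subseteq \mathbb{N}^n$, we define an extended set of edges $E^{\star} := E \cup \{(\beta, \beta) \mid \beta \in V\}$ that includes all self-loops. Then we define the space of symmetric sparse matrices as

$$S^{|V|}(E, 0) := \{X \in S^{|V|} \mid X_{\beta\gamma} = X_{\gamma\beta} = 0 \text{ if } (\beta, \gamma) \notin E^{\star}\} \tag{5.3}$$

and the cone of sparse positive-semidefinite matrices as

$$S^{|V|}_{+}(E, 0) := \{X \in S^{|V|}(E, 0) \mid X \succeq 0\} \tag{5.4}$$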

Assume and is as in (5.1). Let be the cross sparsity pattern graph. Let be the graph obtained by adding edges to all connected components of such that every connected component becomes a complete subgraph. Letting

we can state the following sparse SOS decomposition theorem:

Theorem 5.2 ([33])

Assume and is as in (5.1). Let denote the connected components of and . Then, if and only if there exist for such that

(5.5)

Theorem 5.2 implies that checking whether $f$ admits such a decomposition involves solving a block-diagonalized SDP problem, which significantly reduces the computation. For proof purposes it is then necessary to convert the obtained numerical SOS decomposition to an SOS decomposition with rational coefficients. Standard rounding-projection procedures [19] can be applied when the given polynomial lies in the interior of the SOS cone, but in our case $f_4$ (and also $f_5$, $f_6$) lies on the boundary of the SOS cone. To overcome this difficulty we use the method of undetermined coefficients by setting nonzero elements in the numerical SOS decomposition of $f_4$ as unknowns and then searching for a rational solution to the system of equations obtained by comparing coefficients. It is the complexity of this latter step that will limit the values of $d$ for which SOS certificates can be practically obtained for our problem of interest.
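Once a rational SOS decomposition has been found, checking it is a matter of exact coefficient comparison. The following is a minimal sketch of such a check for a weighted sum of squares; it is not the authors' implementation, and the dictionary-based polynomial representation, the function names, and the toy polynomial $x^2 - xy + y^2$ are assumptions made for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def poly_mul(p, q):
    """Multiply polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(Fraction)
    for ea, ca in p.items():
        for eb, cb in q.items():
            out[tuple(x + y for x, y in zip(ea, eb))] += ca * cb
    return dict(out)


def is_weighted_sos_certificate(f, terms):
    """Check exactly that f == sum(w * g**2) with every weight w >= 0.

    terms is a list of (weight, polynomial) pairs with rational data."""
    total = defaultdict(Fraction)
    for weight, g in terms:
        if weight < 0:
            return False
        for exponent, coeff in poly_mul(g, g).items():
            total[exponent] += weight * coeff
    keys = set(f) | set(total)
    return all(Fraction(f.get(e, 0)) == total.get(e, 0) for e in keys)


# Hypothetical example: x^2 - x*y + y^2 = (x - y/2)^2 + (3/4) * y^2.
f = {(2, 0): 1, (1, 1): -1, (0, 2): 1}
g1 = {(1, 0): 1, (0, 1): Fraction(-1, 2)}
g2 = {(0, 1): 1}
certificate = [(1, g1), (Fraction(3, 4), g2)]
print(is_weighted_sos_certificate(f, certificate))
```

The hard part described in the text is finding the rational coefficients; verification, as sketched here, is cheap and requires no floating-point arithmetic.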

6 Toward Proving the Conjecture for $d \le 6$

The basis for the upper bound of the conjecture was determined first by a strategy of extensive directed sampling of random PD matrices for successive values of $d$, which revealed that counterexamples can be readily found for $d \ge 7$ but not for $d \le 6$. It then remained only to tailor the search to find a specific example for which the violated nonnegativity condition can be certified with finite arithmetic. The following integer PD matrix provides this certification:

(6.1)

Applying IRGA to the above can be verified to yield a rational result that contains a symmetric pair of negative values. With this it remains to rigorously prove that no such counterexamples can exist for $d = 6$. This is necessary and sufficient because if the conjecture holds for some value of $d$ it must hold for all smaller values, since each smaller case can be expressed as a block replacing a submatrix of the identity.
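The counterexample-search side of this strategy can be sketched as follows. This is a simplified illustration that uses plain uniform sampling of integer PD matrices rather than the directed sampling described above, so it should not be expected to find violations as readily; the function names, the construction $MM^{T} + I$, and the sampling range are assumptions.

```python
import random
from fractions import Fraction


def inverse(m):
    """Exact Gauss-Jordan inverse of a nonsingular square matrix."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def irga(a):
    """Inverse of the relative gain array A o A^{-T}."""
    n = len(a)
    inv = inverse(a)
    return inverse([[a[i][j] * inv[j][i] for j in range(n)] for i in range(n)])


def random_pd(d, rng, spread=3):
    """Integer PD matrix M M^T + I with entries of M in [-spread, spread]."""
    m = [[rng.randint(-spread, spread) for _ in range(d)] for _ in range(d)]
    return [[sum(m[i][k] * m[j][k] for k in range(d)) + int(i == j)
             for j in range(d)] for i in range(d)]


def find_negative_irga(d, trials, seed=0):
    """Return a sampled PD matrix whose IRGA has a negative entry, else None."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = random_pd(d, rng)
        if min(min(row) for row in irga(a)) < 0:
            return a
    return None


for d in (2, 3, 4):
    print(d, find_negative_irga(d, trials=20))
```

Because the arithmetic is exact, any matrix returned by `find_negative_irga` is itself a finite certificate that nonnegativity fails for that dimension, while a result of `None` is only evidence and not a proof.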

To prove that the $d = 6$ case is nonnegative it is sufficient to show that a typical off-diagonal element is nonnegative. As will be discussed in the subsequent section, the computational complexity required to prove the nonnegativity of the polynomial associated with such an element tends to become practically prohibitive even for values of $d$ of this size.

7 The case $d = 6$

Any positive-definite matrix can be scaled on the left and right by a positive diagonal matrix so that its Cholesky decomposition has unit diagonal elements. Thus, without loss of generality, our test of positive-definiteness can assume an appropriately-scaled $A$ having Cholesky factors with

(7.1)

Because $\mathrm{RGA}(A)$ is positive-definite it follows that the determinant of $\mathrm{RGA}(A)$, which is also the common denominator of elements of $\mathrm{IRGA}(A)$, is positive. Therefore, to prove the nonnegativity of elements of $\mathrm{IRGA}(A)$ it is sufficient to establish nonnegativity of the numerators of elements of $\mathrm{IRGA}(A)$. Consider a typical off-diagonal element of $\mathrm{IRGA}(A)$, say the element in position , whose numerator we denote by $f_6$ (similarly for the meaning of $f_4$, $f_5$). The sizes of $f_4$, $f_5$, $f_6$ are listed in the following table from which it can be seen that the support sizes become extremely large:

          #var   deg    #supp
$f_4$        6    12      116
$f_5$       10    20     5157
$f_6$       15    40   676505

Table 1: Sizes of $f_4$, $f_5$, $f_6$.
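The variable counts in Table 1 agree with the number of free entries in a unit-diagonal Cholesky factor, namely $d(d-1)/2$ for $d = 4, 5, 6$. The sketch below illustrates this parameterization under the assumption that the factorization is written as $A = LL^{T}$ with $L$ unit lower-triangular; the function names and the sample entries are hypothetical.

```python
def num_cholesky_variables(d):
    """Free entries strictly below the diagonal of a d x d unit-diagonal factor."""
    return d * (d - 1) // 2


def pd_from_unit_cholesky(lower):
    """Build A = L L^T, where lower[i] lists the i strictly-lower entries of row i of L."""
    d = len(lower)
    # Entries below the diagonal come from `lower`; the diagonal is 1; the rest is 0.
    L = [[lower[i][j] if j < i else int(i == j) for j in range(d)] for i in range(d)]
    return [[sum(L[i][k] * L[j][k] for k in range(d)) for j in range(d)]
            for i in range(d)]


print([num_cholesky_variables(d) for d in (4, 5, 6)])  # [6, 10, 15], the #var column

# Hypothetical 4 x 4 instance with six free integer parameters.
A = pd_from_unit_cholesky([[], [2], [-1, 3], [0, 1, -2]])
print(A)
```

Every choice of the free parameters yields a positive-definite matrix with unit determinant, so the entries of its inverse are polynomials in those parameters.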

Applying the method of the previous section we obtain a numerical SOS decomposition for $f_5$, but its size prevents us from obtaining a rational certificate of nonnegativity. In other words, IRGA nonnegativity for $d = 5$ is virtually but not formally established. However, the much larger size of $f_6$ prevents us from obtaining even a numerical SOS decomposition, assuming one exists.

Given that it is not presently possible to formally prove nonnegativity for the $d = 6$ case, the fallback must be to follow a strategy similar to the finding of counterexamples and consider successive values of $d$. The case $d = 3$ is already at the limit of what can be proven by hand or by using computer-aided symbolic methods (both are doable), but in the following section we will formally prove nonnegativity for the subsuming case of $d = 4$, i.e., that typical element $f_4$ is nonnegative.

8 The case $d = 4$

As in the previous section, assume the positive-definite matrix $A$ has Cholesky decomposition with

(8.1)

From this we can express the polynomial $f_4$ as

Computing the monomial basis of as in (5.1) yields a result with elements. The cross sparsity pattern graph of has connected components with nodes, respectively. From this we obtain as a weighted sum of squares, i.e., , where

We can now present our main technical result relating to the IRGA conjecture:

Theorem 8.1

For a $d \times d$ matrix $A$ that is positive-definite up to permutations of its rows and columns, $\mathrm{IRGA}(A)$ retains the same properties, but with unit row and column sums, and for $d \le 4$ is nonnegative and hence doubly-stochastic, i.e., $\mathrm{IRGA}(A)$ is doubly-nonnegative.

The import of this theorem is that for any application of the IRGA to arbitrarily permuted positive-definite matrices, $d \le 4$, it can be rigorously assumed that the result will retain the same properties while also being doubly-stochastic.

Remark 8.2

Obtaining an SOS decomposition with rational coefficients for $f_4$ required finding a rational solution to a system of polynomial equations of degree and with unknowns. The corresponding system for $f_5$ involves many hundreds of unknowns and is presently beyond the limits of what can be solved using existing generic methods. However, we believe that a specially-tailored approach may be able to obtain a rational SOS decomposition in the near future and formally establish the nonnegativity of $f_5$. Obtaining a solution for $f_6$ seems so formidable as to require completely new theoretical and practical innovations, e.g., possibly exploiting other special properties of PD doubly-stochastic matrices that obviate the need for SOS methods or greatly reduce the complexity of the polynomials to which they must be applied. With or without such innovations, we propose $f_6$ as a challenge problem to the SOS community.

It must be emphasized that the focus of this paper is not on this highly-specific problem/result per se but rather how it serves as an example of the kinds of highly-specialized problems that arise in practical applications that are potentially amenable to automated proof methods. Of course such methods go well beyond the polynomial-nonnegativity tools used in this paper, but virtually all of them suffer severe computational complexity constraints. What has been suggested in this paper is that a closer look may reveal a plethora of narrow application-specific problems that can be practically solved by present state-of-the-art tools.

9 Discussion

In this paper we have examined a problem for which it is possible to identify and prove nontrivial properties relating to a strongly nonlinear matrix function that hold for values (dimensionality) of $d$ below a specific bound but do not hold for general $d$ above a specific bound. Our dimension-pinning approach, which is likely to be applicable to a broad family of problems, relies on using counterexample search to establish an upper-bound limit on $d$, i.e., above which the speculated properties provably do not generally hold, and using automated methods to establish a lower-bound limit sufficient to prove they do hold in general below that bound. Ideally, the two approaches would converge to optimally define/pin the limit at which the properties fail to hold in all cases.

Of course there are myriad examples dating from the earliest history of mathematics in which explicit counterexamples have been used to demonstrate the limit to which the scope of a given result can be generalized. What we advocate here is a more structured approach to facilitate computer-aided proof methods to be applied to identify and establish potentially valuable properties of mathematical formulae that hold only in low dimensions, e.g., sufficient to encompass most or all practical instances of a particular engineering problem. For the specific problem of this paper nonnegativity of the IRGA of a positive-definite matrix was observed for the case of $d = 2$, and this motivated a counterexample search which identified that nonnegativity does not generally hold for $d > 6$. This then motivated use of SOS methods to prove that nonnegativity does hold in general at least up to $d = 4$. SOS methods have also provided strong evidence that nonnegativity holds for $d = 5$, but due to computational complexity constraints inherent to SOS methods the status of $d = 6$ presently remains entirely unresolved, aside from supporting evidence from the fact that extensive search has yielded no counterexamples for $d = 6$ whereas they are readily found when $d > 6$.

We have noted that dimensionality-limited properties are particularly relevant to those engineering applications for which $d$ is practically limited in some way by the physical dimensions of the real world. Beyond this, a proof of the nonnegativity conjecture of this paper would imply the existence of special mathematical structures that are limited to 6-dimensional spaces and thus may be relevant to theoretical physics as unitary conformal field theories (CFTs) and superconformal variants can only be defined in or dimensions [34, 35, 1, 24, 25]. Most importantly, we hope this paper has provided a glimpse of what we believe to be a diverse and bountiful, though heretofore largely unrecognized, set of problems to which computer-assisted proof methods can be productively applied.

Declarations of Interest: None.

Acknowledgements: The first author wishes to thank Charlie Johnson ([12, 13]) for enjoyable conversations relating to the topic of this paper. The second author wishes to gratefully acknowledge support from the China Postdoctoral Science Foundation under grant 2018M641055.

References

  • [1] O. Aharony, O. Bergman, D. L. Jafferis, and J. M. Maldacena, “N=6 superconformal Chern–Simons-matter theories, M2-branes and their gravity duals,” J. High Energy Phys., 2008.
  • [2] Claudio Altafini, “Consensus problems on networks with antagonistic interactions,” IEEE Transactions on Automatic Control, 58, no. 4, 935-946, 2013.
  • [3] N. Atanasov, M. Zhu, K. Daniilidis, and George J. Pappas, “Localization from semantic observations via the matrix permanent,” The International Journal of Robotics Research, Vol. 35(1–3) 73–99, 2016.
  • [4] E. Bristol, “On a new measure of interaction for multivariable process control,” IEEE Transactions on Automatic Control, vol. 11, no. 1, pp. 133-134, 1966.
  • [5] Joseph Collins and Jeffrey Uhlmann, “Efficient Gating in Data Association for Multivariate Gaussian Distributions,” IEEE Transactions on Aerospace and Electronic Systems, Vol. 28, No. 3, 1992.
  • [6] D.F. Crouse and P. Willett, “Computation of Target-Measurement Association Probabilities Using the Matrix Permanent,” IEEE Transactions on Aerospace and Electronic Systems, PP(99):1-1, 2017.
  • [7] M. Fiedler and R. Grone, “Characterizations of sign patterns of inverse-positive matrices,” Linear Algebra and its Applications, 40: 237-245, 1981.
  • [8] Miroslav Fiedler, “Old and new about positive definite matrices”, Linear Algebra and its Applications, 484:496-503, 2015.
  • [9] Robert Hanson, “Matrices Whose Inverse Contain Only Integers,” The Two-Year College of Mathematics Journal, Vol 13, No. 1, 1982.
  • [10] John Horgan, “Death of Proof,” Scientific American, October, 1993.
  • [11] John Horgan, “Okay, Maybe Proofs Aren’t Dying After All,” Scientific American, March, 2019.
  • [12] R.A. Horn and C.R. Johnson, Matrix Analysis, Cambridge University Press, 1999.
  • [13] C.R. Johnson and H.M. Shapiro, “Mathematical Aspects of the Relative Gain Array ”, SIAM. J. on Algebraic and Discrete Methods, 7(4), 627-644, 1985.
  • [14] J.B. Lasserre, “Global optimization with polynomials and the problem of moments,” SIAM Journal on Optimization, 11(3):796-817, 2001.
  • [15] N.L. Lord, “Matrices as Sums of Invertible Matrices,” Mathematics Magazine, February, 1987.
  • [16] Mil Mascaras and J. Uhlmann, “Expression of a Real Matrix as a Difference of a Matrix and its Transpose Inverse,” Journal de Ciencia e Ingenieria, Vol. 11, No. 1, 2019.
  • [17] Mark R. Moreland, “Joint data association using importance sampling,” 12th International Conference on Information Fusion, Seattle, WA, USA, July 6-9, 2009.
  • [18] P. A. Parrilo, “Structured semidefinite programs and semialgebraic geometry methods in robustness and optimization,” Ph.D. Thesis, California Institute of Technology, 2000.
  • [19] H. Peyrl, P. Parrilo, “Computing sum of squares decompositions with rational coefficients,” Theoretical Computer Science, 409(2):269-281, 2008.
  • [20] Janez Povh and Franz Rendl, “Copositive and semidefinite relaxations of the quadratic assignment problem,” Discrete Optimization, 6, no. 3, 231-241, 2009.
  • [21] A. Rahmani, M. Ji, M. Mesbahi, and M. Egerstedt, “Controllability of multi-agent systems from a graph-theoretic perspective,” SIAM Journal on Control and Optimization, 48, no. 1, 162-186, 2009.
  • [22] B. Reznick, “Extremal PSD forms with few terms,” Duke Math. Journal, 45, 363-374, 1978.
  • [23] S. Roy and M. Xue, “Sign Patterns of Inverse Doubly-Nonnegative Matrices,” arXiv:1903.04141v2, 2019.
  • [24] C. Saemann and L. Schmidt, “Towards an M5-brane model I: A 6d superconformal field theory,” Journal of Mathematical Physics. 59, 043502, 2018.
  • [25] H. Samtleben, E. Sezgin, and R. Wimmer, “(1,0) superconformal models in six dimensions,” J. High Energy Phys., 2011(12), 62; e-print arXiv:1108.4060 [hep-th], 2011.
  • [26] Issai Schur, “Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen,” Journal für die reine und angewandte Mathematik, (140): 1–28, 1911.
  • [27] Jeffrey Uhlmann, “Matrix Permanent Inequalities for Approximating Joint Assignment Matrices in Tracking Systems,” Journal of the Franklin Institute, Vol. 341, Issue 7, 2004.
  • [28] Jeffrey Uhlmann, “An Introduction to the Combinatorics of Optimal and Approximate Data Association,” Chapter 11 of Handbook of Multisensor Data Fusion, edited by Martin Liggins, David Hall, and James Llinas, CRC Press, 2008.
  • [29] Jeffrey Uhlmann, “A Generalized Matrix Inverse that is Consistent with Respect to Diagonal Transformations,” SIAM Journal on Matrix Analysis (SIMAX), Vol. 39:3, 2018.
  • [30] Jeffrey Uhlmann, “On the Relative Gain Array (RGA) with Singular and Rectangular Matrices,” Applied Mathematics Letters, Vol. 93, 2019.
  • [31] R.S. Varga, (1960). “Factorization and Normalized Iterative Methods,” in Boundary Problems in Differential Equations (edited by R.E. Langer), University of Wisconsin Press, pp. 121-142, 1960.
  • [32] A. Vosughi, C. Johnson, M. Xue, S. Roy, and S. Warnick, “Target control and source estimation metrics for dynamical networks,” Automatica, 100: 412-416, 2019.
  • [33] J. Wang, H. Li, B. Xia, “A New Sparse SOS Decomposition Algorithm Based on Term Sparsity,” arXiv:1809.10848, 2018.
  • [34] E. Witten, “Conformal field theory in four and six dimensions,” arXiv:0712.0157, 2007.
  • [35] E. Witten, “Geometric Langlands from six dimensions,” arXiv:0905.2720, 2009.
  • [36] Akiko Yoshise and Yasuaki Matsukawa, “On optimization over the doubly nonnegative cone,” in 2010 IEEE International Symposium on Computer-Aided Control System Design, pp. 13-18, 2010.
  • [37] H. Zhu, G. Leus, and G.B. Giannakis, “Sparsity-cognizant total least- squares for perturbed compressive sampling,” IEEE Transactions on Signal Processing, 59, no. 5, 2002-2016, 2011.