Fano schemes of generic intersections and machine learning

We investigate Fano schemes of conditionally generic intersections, i.e. of hypersurfaces in projective space chosen generically up to additional conditions. Via a correspondence between generic properties of algebraic varieties and events in probability spaces that occur with probability one, we use the obtained results on Fano schemes to solve a problem in machine learning.

1. Introduction

Properties of generic points of a variety, or generic members of a family of varieties, are often much easier to study than their non-generic counterparts. An obvious reason why generic properties are simpler to study is that non-empty open subsets of an irreducible variety are dense in the Zariski topology, and so generic properties are topologically very natural ones to consider. One example of this phenomenon is the family of lines on a hypersurface in projective space, i.e. the hypersurface’s Fano scheme of lines: the dimension, smoothness and connectedness of this Fano scheme for a generic hypersurface were determined in [BVdV79], whereas its dimension for arbitrary smooth hypersurfaces is still unknown except in certain cases [Col79, HMP98, Beh06, LR09, LT10]. The conjectured dimension in general is the content of the Debarre–de Jong conjecture [Deb01]. This paper studies Fano schemes of -planes for intersections of hypersurfaces in that are chosen generically up to some additional property, which we call conditional genericity. Our main results characterize the dimension of these Fano schemes, with the chosen conditions coming from an application to machine learning.

If generic properties are sometimes the only ones accessible in algebraic geometry, they are in a sense the only ones of importance in statistics and machine learning. To be more precise, given a probability space , an event occurs almost surely if it occurs with probability one, that is, if . Suppose that is an algebraic variety, and is a continuous measure. If generic events in are taken to be (Zariski) open, measurable subsets, then non-generic events occur with probability zero. Roughly speaking, non-generic events may be ignored in statistics and machine learning because they never occur.

Our application to machine learning involves stationary subspace analysis (SSA) [vBMKM09], a method for multivariate time series analysis that has been applied to brain-computer interface research. We frame the problem and its relation to Fano schemes in Section 3, but present here one simplified application from brain-computer interface research. Suppose our goal is to isolate a brain signal for a particular task from other signals that vary with time and are irrelevant to this task, e.g. we seek to separate one type of brain activity from other activity related to environmental factors or so-called alpha oscillations due to fatigue. The data is divided into epochs, with each epoch modeled as a random variable. In Section 3, we develop a precise identifiability criterion for SSA—i.e. how many epochs are necessary to identify the stationary signal—in terms of the dimension of a certain Fano scheme.

We now describe the structure of the paper. Section 2 develops results on Fano schemes of various conditionally generic intersections. In Section 2.1 we study the Fano scheme of generic intersections of hypersurfaces under the assumption that all hypersurfaces contain some fixed -plane. Section 2.2 restricts to Fano schemes of intersections of quadric hypersurfaces, but conditions the genericity assumption on the quadrics both containing a fixed -plane and having a fixed rank. We conclude with Section 3, where we relate results of the previous sections to machine learning.

Notation and conventions: All work below is carried out over the complex numbers, although the results hold for any algebraically closed field. Unless otherwise specified, dimension will refer to projective dimension; e.g. for a linear subspace of dimension , we write for the corresponding element of the Grassmannian , and when considering a vector subspace instead as a projective subspace, we write .

For the vector space of homogeneous polynomials of degree in variables we write . For a multidegree , the product of the corresponding vector spaces of polynomials will be denoted by .

A brief note for the reader coming from outside of algebraic geometry: the word “scheme” can safely be replaced with “variety” in almost all appearances below. The second author has also prepared a companion set of notes to this paper for those with rudimentary knowledge of algebraic geometry [Lar12].

Acknowledgements:

We would like to thank Bernd Sturmfels for introducing us to one another, for suggesting the connection to Fano schemes, and for his encouragement throughout. We are grateful to Fabian Müller for his invaluable help at numerous junctures during this project. We would also like to thank Luke Oeding for his insights on symmetric tensors. Several of the results of this paper were first tested with Macaulay2 [GS10] and MATLAB(R)/Octave.

2. Fano schemes of intersections of conditionally generic hypersurfaces

The th Fano scheme of a projective variety , denoted , is the subscheme of the Grassmannian parametrizing -planes contained in . If are defining equations of , with , then we write , where , and call the multidegree of .

The Fano scheme has been studied in [DM98] under the assumption that is chosen generically. Let and , with all , and set

Theorem 2.1 ([DM98]).

Let be generic, and suppose that .

  1. If , then the Fano scheme is empty.

  2. If , then the Fano scheme is smooth of dimension .
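The dichotomy in Theorem 2.1 is easy to tabulate. The following Python sketch assumes that the two quantities in the theorem are the standard ones from [DM98], namely the expected dimension $\delta = (k+1)(n-k) - \sum_i \binom{d_i+k}{k}$ and $\delta_- = \min\{\delta,\, n-2k-s\}$, where $s$ is the number of hypersurfaces; the function names are ours and purely illustrative.

```python
from math import comb


def expected_dimension(n, degrees, k):
    """Assumed expected dimension (k+1)(n-k) - sum_i C(d_i + k, k) of the
    Fano scheme of k-planes on a complete intersection of the given
    multidegree in projective n-space."""
    return (k + 1) * (n - k) - sum(comb(d + k, k) for d in degrees)


def delta_minus(n, degrees, k):
    """Assumed auxiliary quantity min(delta, n - 2k - s), s = len(degrees)."""
    return min(expected_dimension(n, degrees, k), n - 2 * k - len(degrees))


def generic_fano_dimension(n, degrees, k):
    """Return None if the generic Fano scheme is predicted to be empty,
    and its predicted dimension otherwise."""
    if delta_minus(n, degrees, k) < 0:
        return None
    return expected_dimension(n, degrees, k)


# Lines on a cubic surface in P^3: finitely many, so dimension 0.
lines_on_cubic_surface = generic_fano_dimension(3, (3,), 1)
# Lines on an intersection of two quadrics in P^4: again dimension 0.
lines_on_two_quadrics = generic_fano_dimension(4, (2, 2), 1)
# Planes on a cubic threefold in P^4: none on a generic one.
planes_on_cubic_threefold = generic_fano_dimension(4, (3,), 2)
```

Under these assumptions the first two cases return 0, matching the classical finite counts of lines, and the third returns `None`.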

The case of a single quadric is also settled in [DM98] by a different argument, but the Fano scheme of a single hypersurface is of little interest for our application to machine learning (see Section 3). Therefore we assume in the remainder that .

We extend these results to conditionally generic in two ways. First, in Section 2.1, we take the defining equations to be generic up to the assumption that each hypersurface contains a fixed -plane . And second, in Section 2.2, we restrict to intersections of quadrics that are generic, conditional upon all quadrics having a fixed rank and containing a fixed -plane. Our application to machine learning involves determining when, under the conditional genericity mentioned above, there is equality , i.e. when identifiability is achieved (see Section 3 for details). Thus determining when the Fano scheme has dimension 0 is especially important.

2.1. Generic intersections containing a common subspace

Suppose first that . The analogue of statement (2) in Theorem 2.1 for generic such that is a straightforward corollary.

Corollary 2.2.

Let be generic, and let be generic such that . If , then .

Proof.

Let be the subspace consisting of all -tuples such that . This is a codimension subspace. Consider the incidence correspondence

The proof of Theorem 2.1 involves showing that, under the conditions of part (2), is dominant, and hence by the theorem of the dimension of the fiber, there exists an open set over which all fibers—which are Fano schemes —have the expected dimension . The corollary will follow after showing that for generic , the intersection is non-empty, and hence dense in .

Set , so that . Note that factors as follows: if , then

But dominance of implies that is dense in , so for generic it follows that is in some fiber of over , and hence , as required. ∎

Next we show that the remaining cases where result in identifiability.

Theorem 2.3.

Let be generic, and let be generic such that . If , then .

Proof.

For the incidence correspondence above, the fibers of are all subspaces of the same dimension in . In the situation of Theorem 2.3, however, the incidence correspondence restricts to those such that and are contained in , where is fixed. The fibers of are no longer projective spaces of the same dimension, but rather their dimension depends on the intersection .

As in the proof of Corollary 2.2, we denote the subspace of consisting of that vanish on by . For such that , , we can choose a basis for such that and . If we replace the original incidence correspondence with , then the non-empty fibers of over are now projective spaces of a constant dimension, and the non-empty fibers of over are the following subvarieties of ,

The natural description of the image of the second projection, , is via Schubert cells. Given a complete flag of projective subspaces , such that , and a non-increasing integral vector with , the Schubert cell (defined here via projective, rather than affine, dimension, which is the less common convention) is

The codimension of in is (see [GH78] for further details).

For example, if , , and , then is the intersection of with the interior of ; if instead , then is obtained by intersecting with the interior of ; and if and , then we intersect with the interior of to obtain . In general, for ,

where we use the common abbreviation of omitting the components of equal to 0.

We now write the incidence correspondence of interest as

To prove Theorem 2.3, we determine the expected dimension of for each , and then show that under the assumption , all expected dimensions are negative except when , i.e. when .

The map is surjective with fibers projective spaces of the same dimension, so is smooth and irreducible of codimension in , and

Subtracting from implies that the expected dimension of is

(2.1)

Here we take for any .

To finish the proof, it suffices to show that all expected dimensions are negative (except when , when the dimension is 0). We denote the first forward difference of with respect to by , and the second forward difference by . Then

Since we assume , is non-negative for , and hence is convex for these . Note further that , and . By assumption, , so all expected dimensions except for are negative.

It follows that the map cannot be dominant for , and so the variety will be empty for generic . Since , this completes the proof.

We conclude by studying the case . Replacing by for in the proof of Theorem 2.1 in [DM98] yields the following (for an expository account of this proof, see [Lar12]).

Corollary 2.4.

Let be generic, and let be generic such that . If , then .

In particular, there exists in this case a -plane distinct from (in fact, disjoint from it) contained in . Combining with Theorem 2.3 will give a precise characterization of identifiability in SSA.

Corollary 2.5.

Let be generic, and let be generic such that . Then if and only if .

2.2. Lower rank quadrics

We specialize now to intersections of quadrics, with all quadrics vanishing on a common -plane , and all with rank (meaning their Gram matrices all have rank ). We begin with a result that makes no assumption about a common linear subspace. Since we only deal with quadrics, we set

Proposition 2.6.

Let , be generic homogeneous quadratic forms of rank . If and , then has dimension .

Proof.

An immediate corollary of Theorem 2.1 of [DM98] (Theorem 2.1 above) is that for a -plane , the Jacobian matrix defining the tangent space is full-rank. Choose coordinates such that , and consider the coordinate patch of the Grassmannian centered at with coordinates , and corresponding tangent space coordinates denoted by . For each , let , where , be coordinates for the vector space of homogeneous quadratic polynomials (i.e. the entries of the quadratic form’s Gram matrix lying on or above the diagonal—see Figure 1 for an example). The defining equations for the tangent space of at are

(2.2)
(2.3)

for all and with . The first main ingredient of the proof is that these equations depend only on the coordinates satisfying .

The second ingredient is a particular choice of coordinates for quadrics of rank at most , given by all coordinate functions except for those appearing in the bottom right corner of the quadric’s Gram matrix (see Figure 1). Note that, by our assumption on , these excluded entries are never among the coordinate functions appearing in the defining equations of the tangent space (2.2)–(2.3). For each , these coordinates are valid on an open subset of , which we denote by . We next show that is non-empty.

To prove this claim, let be the Gram matrix of the quadratic form . By assumption all minors of vanish. The resulting equations enable us to write the omitted coordinate functions as rational expressions in the remaining coordinate functions, and the locus in where the denominators of these rational expressions do not vanish is precisely the open set where these coordinates are valid.

Specifically, fix with . Let be the submatrix obtained by deleting the rows indexed by and the columns indexed by . The corresponding minor is

which gives the promised rational expression for , with denominator equal to the top-left minor of , so that is the complement of the vanishing locus of this minor. Now it is easy to see that is non-empty, since the quadric with Gram matrix having entries if and otherwise lies in this intersection.
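The coordinate choice above can be checked on a small example. The sketch below assumes the block form suggested by Figure 1: a symmetric matrix with invertible top-left $r \times r$ block $A$ and off-diagonal block $B$ has rank $r$ exactly when its bottom-right block equals $B^{T} A^{-1} B$, so the omitted entries are rational functions of the retained ones with denominator $\det A$. The matrices and helper names are hypothetical, chosen only for illustration with $r = 2$.

```python
from fractions import Fraction
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def submatrix(M, rows, cols):
    """Entries of M in the given rows and columns."""
    return [[M[i][j] for j in cols] for i in rows]


def inverse(A):
    """Exact inverse via the adjugate; A must have non-zero determinant."""
    idx = range(len(A))
    d = Fraction(det(A))
    return [[(-1) ** (i + j)
             * det(submatrix(A, [k for k in idx if k != j], [k for k in idx if k != i])) / d
             for j in idx] for i in idx]


def complete_low_rank(A, B):
    """Assemble [[A, B], [B^T, C]] with C = B^T A^{-1} B, which has rank len(A)."""
    A_inv = inverse(A)
    r, m = len(A), len(B[0])
    C = [[sum(B[i][p] * A_inv[i][j] * B[j][q] for i in range(r) for j in range(r))
          for q in range(m)] for p in range(m)]
    top = [list(A[i]) + list(B[i]) for i in range(r)]
    bottom = [[B[i][p] for i in range(r)] + C[p] for p in range(m)]
    return top + bottom


A = [[2, 1], [1, 3]]          # retained top-left block, det(A) = 5
B = [[1, 0], [2, 1]]          # retained off-diagonal block
M = complete_low_rank(A, B)   # bottom-right block is now determined

all_3x3_minors_vanish = all(
    det(submatrix(M, rows, cols)) == 0
    for rows in combinations(range(4), 3)
    for cols in combinations(range(4), 3)
)
```

Exact rational arithmetic is used so that the vanishing of the minors can be tested with equality rather than a numerical tolerance.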

To finish the proof, Theorem 2.1 guarantees that for generic , the Fano scheme is of the expected dimension, equivalently, the equations (2.2)–(2.3) are full-rank. By the observation following these equations, it therefore follows that for generic coordinate choices , with , equations (2.2)–(2.3) are full-rank. Hence a generic intersection of quadrics of rank at most that lies in the coordinate patch described above is also in the open set where the dimension of matches the expected dimension. Since the variety parametrizing symmetric matrices of rank is irreducible, the result follows.

Figure 1. Gram matrix of a quadric vanishing on for , . The excluded coordinate functions for are depicted in increasing shades of green.

The proofs of Corollary 2.2 and Theorem 2.3 require only notational changes to extend the above arguments to conditional genericity, where the additional condition is containment of a fixed -plane .

Corollary 2.7.

Let be generic, and let be generic homogeneous quadratic forms of rank such that and .

  1. If , then .

  2. If , then .

A direct extension of Corollary 2.5 now leads to a precise determination of an identifiability criterion for SSA.

Corollary 2.8.

Under the assumptions of Corollary 2.7, if and only if .

It would be interesting to extend these lower-rank results to other multidegrees. For our application to machine learning, this generalization would eliminate the need to assume that only knowledge of the first two cumulants is available (see Section 3). There are however two significant obstacles to extending the techniques used here. First, it is not known in general if the variety parametrizing lower rank symmetric tensors of degree is irreducible. And second, coordinates like the ones used for quadrics are not known in general.

3. Application to stationary subspace analysis

In this section, we show how the preceding results can be applied in machine learning. A central task in multivariate time series analysis is to separate out data coming from different sources, e.g. filtering out noisy data, or identifying a time-stationary data source from time-varying sources. A recent method for this second task is stationary subspace analysis (SSA) [vBMKM09].

Let be time series data separated into consecutive epochs, with for all . Each epoch is modeled as a random variable . A central assumption of SSA is that these data come from a linear superposition of a dimensional stationary signal, and an dimensional non-stationary signal. The task of identifying the stationary signal is now equivalent to finding a projection matrix such that are identically distributed, i.e.

(3.1)

To make this problem well-defined, we must further assume that the data , and hence the random variables , are general under the assumption that such a projection exists. An important problem in SSA is to determine the minimal number of epochs required to uniquely determine .

The connection to algebraic geometry and Fano schemes arises through cumulants. Let be a random variable taking values in . Then the th cumulant of is a symmetric tensor of degree . For example, the first cumulant is the mean vector of , and the second cumulant is the covariance matrix of . In general, the th cumulant of can be represented as a homogeneous polynomial in variables of degree (see e.g. Appendix 2 of [Eis95]). For , we define the th cumulant polynomial of by the generating function

If the cumulant generating functions of two random variables have finite radii of convergence, then by taking Fourier transforms it can be shown that these two random variables are identically distributed if and only if their cumulants coincide, just as with random variables taking values in . Hence the defining property of from Equation (3.1) translates into an a priori infinite system of homogeneous algebraic equations,

(3.2)

for all and all , where we have abbreviated by , and the action of on is induced from its action on . We refer to Section 2 of [KvBM12] for more details. To avoid an infinite number of equations, in SSA it is assumed that the stationary and non-stationary signals can be separated from one another by considering only the first two cumulants.
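In practice the first two cumulants of an epoch are estimated from samples. The following sketch computes the empirical mean vector and the empirical covariance matrix of a single epoch; the unbiased normalization by $n-1$ and the toy data are assumptions made for illustration, not choices prescribed by SSA.

```python
from fractions import Fraction


def mean_vector(samples):
    """Empirical first cumulant: the coordinate-wise mean of the samples."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(Fraction(x[i]) for x in samples) / n for i in range(dim)]


def covariance_matrix(samples):
    """Empirical second cumulant: the sample covariance matrix (divides by n - 1)."""
    n = len(samples)
    mu = mean_vector(samples)
    dim = len(mu)
    return [[sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in samples) / (n - 1)
             for j in range(dim)] for i in range(dim)]


# A toy epoch of three observations in two dimensions (hypothetical data).
epoch = [[1, 2], [3, 6], [5, 10]]
first_cumulant = mean_vector(epoch)
second_cumulant = covariance_matrix(epoch)
```

For this toy epoch the mean vector is $(3, 6)$ and the covariance matrix has rows $(4, 8)$ and $(8, 16)$.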

To connect Equations (3.2) to Fano schemes, let be the row-span of the matrix . For , set . Then Equations (3.2) are satisfied if and only if all polynomials vanish identically on , which is equivalent to the condition . In case , the resulting equations are linear, so by the genericity assumption on the , these equations only reduce the ambient dimension. Hence Fano schemes of intersections of quadrics are the main case of interest for SSA. Since the number of epochs tends to be large, the case of two epochs (i.e. a single quadric) can be reasonably omitted.
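For second cumulants, the vanishing condition can be made concrete. Writing $\Delta$ for a difference of two covariance matrices and $P$ for the projection, the quadratic form $x \mapsto x^{T} \Delta x$ vanishes on the row span of $P$ exactly when $P \Delta P^{T} = 0$. The sketch below checks this on a hypothetical $\Delta$ in four dimensions, with $P$ taken to be the projection onto the first two coordinates; both are illustrative assumptions.

```python
from itertools import product


def quadric_value(Q, x):
    """Evaluate the quadratic form x^T Q x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def restrict(Q, P):
    """Return P Q P^T, the Gram matrix of the form restricted to the row span of P."""
    n = len(Q)
    return [[sum(P[a][i] * Q[i][j] * P[b][j] for i in range(n) for j in range(n))
             for b in range(len(P))] for a in range(len(P))]


# Hypothetical covariance difference: symmetric, with zero top-left 2 x 2 block.
DELTA = [
    [0, 0, 1, 2],
    [0, 0, 3, -1],
    [1, 3, 5, 0],
    [2, -1, 0, -4],
]
# Projection onto the first two coordinates; its row span plays the role of S.
P = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]

restricted = restrict(DELTA, P)
vanishes_on_S = all(entry == 0 for row in restricted for entry in row)

# Pointwise check on a grid of integer points of S.
points_of_S = [[a, b, 0, 0] for a, b in product(range(-2, 3), repeat=2)]
pointwise_zero = all(quadric_value(DELTA, x) == 0 for x in points_of_S)

# Away from S the form is not identically zero.
off_subspace_value = quadric_value(DELTA, [0, 0, 1, 0])
```

Here the restricted Gram matrix is zero although $\Delta$ itself is not, which is the situation of a quadric containing the linear space $S$.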

The specific application of our results on Fano schemes to SSA is to give a precise characterization of the number of epochs required to uniquely identify the projection . An upper bound for the identifiability of the SSA problem was proven in the Appendix of [KvBM12]:

Theorem 3.1.

Let be generic random variables in conditional upon the existence of a projection such that the first two cumulants of all coincide. Then is uniquely identifiable if .

The results from Section 2 sharpen the above result and completely solve the SSA identifiability problem for the case of the first two cumulants:

Theorem 3.2.

Let be generic random variables in conditional upon the existence of a projection such that the first two cumulants of all coincide. Then is uniquely identifiable if , and this bound is sharp.

In some SSA problems, the differences of covariance matrices of the random variables are rank deficient. As long as the rank is not too low, Corollary 2.8 provides a generalization of Theorem 3.2 for these situations.

References

  • [Beh06] Roya Beheshti, Lines on projective hypersurfaces, J. Reine Angew. Math. 592 (2006), 1–21. MR 2222727 (2007a:14009)
  • [BVdV79] W. Barth and A. Van de Ven, Fano varieties of lines on hypersurfaces, Arch. Math. (Basel) 31 (1978/79), no. 1, 96–104. MR 510081 (80j:14004)
  • [Col79] Alberto Collino, Lines on quartic threefolds, J. London Math. Soc. (2) 19 (1979), no. 2, 257–267. MR 533324 (80j:14005)
  • [Deb01] Olivier Debarre, Higher-dimensional algebraic geometry, Universitext, Springer-Verlag, New York, 2001. MR 1841091 (2002g:14001)
  • [DM98] Olivier Debarre and Laurent Manivel, Sur la variété des espaces linéaires contenus dans une intersection complète, Math. Ann. 312 (1998), no. 3, 549–574. MR 1654757 (99j:14048)
  • [Eis95] David Eisenbud, Commutative algebra, Graduate Texts in Mathematics, vol. 150, Springer-Verlag, New York, 1995, With a view toward algebraic geometry. MR 1322960 (97a:13001)
  • [GH78] Phillip Griffiths and Joseph Harris, Principles of algebraic geometry, Wiley-Interscience [John Wiley & Sons], New York, 1978, Pure and Applied Mathematics. MR 507725 (80b:14001)
  • [GS10] Daniel R. Grayson and Michael E. Stillman, Macaulay2, a software system for research in algebraic geometry, version 1.4, Available at http://www.math.uiuc.edu/Macaulay2, 2010.
  • [HMP98] Joe Harris, Barry Mazur, and Rahul Pandharipande, Hypersurfaces of low degree, Duke Math. J. 95 (1998), no. 1, 125–160. MR 1646558 (99j:14043)
  • [KvBM12] Franz J. Király, Paul von Bünau, Frank C. Meinecke, Duncan A. J. Blythe, and Klaus-Robert Müller, Algebraic geometric comparison of probability distributions, J. Mach. Learn. Res. 13 (2012), 855–903. MR 2913722
  • [Lar12] Paul Larsen, Notes on Fano varieties of complete intersections, arXiv math.AG:1211.6249 (2012).
  • [LR09] J. M. Landsberg and Colleen Robles, Fubini’s theorem in codimension two, J. Reine Angew. Math. 631 (2009), 221–235. MR 2542223 (2010h:14078)
  • [LT10] J. M. Landsberg and Orsola Tommasi, On the Debarre-de Jong and Beheshti-Starr conjectures on hypersurfaces with too many lines, Michigan Math. J. 59 (2010), no. 3, 573–588. MR 2745753 (2012a:14121)
  • [vBMKM09] Paul von Bünau, Frank C. Meinecke, Franz J. Király, and Klaus-Robert Müller, Finding stationary subspaces in multivariate time series, Phys. Rev. Lett. 103 (2009), no. 21, 214101.