In data analysis, the operation of clustering is fundamental. At its base is a problem wedged between geometry and topology: given a set of points and a notion of distance or proximity among them, compute a parsimonious division into sets of points that are mutually close. In practice, this is a subtle question that depends sensitively on the models of proximity for the input and the desired nature of the output [13, 3].
The discretized nature of this problem obscures the more topological (and ancient) problem of minimal decomposition of a space into simple pieces or, suggestively, modal domains. For a geometric domain, one might ask for the minimal number of convex pieces into which it can be decomposed, this minimal number providing an unambiguous descriptor of geometric complexity. For a topological space, the minimal mode is not a convex domain but rather a contractible subset, meaning one that has the homotopy type of a point. The minimal number of such subsets covering the space is called its geometric category and is a homeomorphism invariant. There is a family of related topological categories, the most important being the Lusternik-Schnirelmann category (using subsets nullhomotopic in the ambient space), which branches quickly into homotopy theory [5]. Other related notions of minimal topological decomposition include the classical sectional category (or Schwarz genus) of a fibration and the more modern variant of topological complexity of path planning [7].
This paper concerns a weighted version of these problems adapted to statistics. Given a density, what is the minimal number of modes into which it can be decomposed as a sum? In the geometric version of this problem, one natural notion of a mode is a Gaussian, and the problem of minimal approximation of a density as a sum of Gaussians is well-studied on one-dimensional domains [2, 6, 14]. This choice of a Gaussian as fundamental mode is somewhat geometrically rigid, analogous to the clustering of a space into convex pieces. One can imagine other basis unimodal distributions [11, 12].
1.2. Unimodal category
For a topological space, let denote the set of all compactly supported continuous functions . Such a function is unimodal if the upper excursion sets have the homotopy type of a point for all and are empty for all . Such a has as its maximal value.
We will refer to the nonempty upper excursion sets as being contractible, though it must be clarified that such sets are contractible in themselves as opposed to being merely contractible in . The latter would be more in line with the definitions used in Lusternik-Schnirelmann theory, but is less relevant for most applications (where ).
The unimodal category of is the minimal number of unimodal distributions on such that is the pointwise sum of the collection .
In the data analysis interpretation of unimodal functions, where the mode corresponds to signal and the spread of the density around it to noise, it makes sense to assume some similarity among the noise-generating mechanisms for different modes. In the world of Gaussian distributions, this leads to the assumption of a fixed, or slowly varying, covariance form. A parsimonious, homeomorphism-invariant version would assume a much weaker formulation, which nonetheless significantly strengthens the notion of unimodal category, as follows:
The strong unimodal category of is the minimal number of distributions on that sum up to such that any intersection of the upper excursion sets is either empty or contractible.
Since unimodal functions remain unimodal under a homeomorphic change of coordinates, the unimodal category is a coordinate-free invariant of a density. This makes it of significant potential use in applications where data is collected from noisy or otherwise uncertainly located samples. The initial paper on the subject [1] gave a constructive algorithm for computing the unimodal category in the one-dimensional setting, along with generalizations to unimodal categories based on pointwise norms rather than addition. The thesis of Govc [9] showed just how subtle and difficult the computation of these invariants is in higher dimensions. Recent work of Huntsman [10] applies the unimodal category in 1-d to problems of mixture estimation in statistics.
We believe that the strong unimodal category will prove more amenable to analysis, but postpone this line of research, noting only that for our model, where the underlying topological space is a metric tree, the notion of strong unimodal category coincides with the standard one.
The primary contribution of this note is the presentation and proof of correctness of an efficient greedy algorithm for the computation of a minimal unimodal decomposition of a density over a tree, this yielding the unimodal category. More than simply computing a topological invariant, this method permits identification of “essential” modes which are local maxima for any minimal unimodal decomposition. We believe that this extension will permit novel applications of the unimodal category and point the way to computational methods on suitably restricted higher dimensional domains.
As a perhaps fanciful toy problem for the sake of motivation, consider the following scenario. Suppose one wishes to detect radiological substances by means of crude sensors mounted on vehicles in traffic on city streets. At any given time, the network of vehicles returns a sampling of a distribution on a graph (radioactivity levels restricted to the idealized 1-d cell complex of city streets). It is perhaps known that there are certain expected modes — at, say, hospitals, research facilities, and universities. In a dense urban environment, one might have difficulty distinguishing true modes from interference modes, and the problem of noise in position measurement is a further complicating feature. It is in such circumstances that a minimal unimodal decomposition may assist as an unambiguous warning of a significant unexpected mode.
Let be a finite metric tree, that is, a 1-dimensional compact contractible metric space stratified into 0-dimensional vertices and 1-dimensional (open) edges. All subtrees of are assumed to inherit both the cell structure and the metric of . The subtrees of form a finite lattice.
In order to work with finite unimodal decompositions, we henceforth assume that is restricted to functions with a finite number of critical points. Given such an , we can assume that the restriction of the function to each edge is weakly monotonic, by adding new vertices at local maxima and minima. Using the restriction of to each edge (or an affine linear function of it) as a coordinate, we can also assume that is edge-linear, that is, restricted to each edge is an affine function of that edge with respect to the internal metric structure: see Figure 1.
If is constant on an edge of , then the operation of contraction by taking to the tree sends unimodal decompositions of on to unimodal decompositions of on preserving number of modes. This operation is reversible, taking any unimodal decomposition of on to a mode-preserving decomposition of on .
If is a unimodal decomposition of an edge-linear , then its components can be chosen to be edge-linear as well.
Replacing each unimodal component with its edge-linear interpolation, preserving its values at the vertices, yields a function that is again unimodal, and the resulting components sum to the original function, since their values at the vertices do. ∎
As an immediate corollary we obtain the following.
Any unimodal decomposition can be modified (without changing the number of components) so that all component modes are maximized at the vertices of .
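Since an edge-linear function on a tree is determined by its vertex values, unimodality can be tested combinatorially. The following is a minimal sketch under that assumption: an upper excursion set is connected (and hence, being a connected subset of a tree, contractible) exactly when the vertices at or above the level induce a connected subgraph, and it suffices to test the levels given by the positive vertex values. The name `is_unimodal` and the input format (a dict of vertex values and a list of edges) are illustrative choices, not notation from the paper.

```python
from collections import defaultdict

def is_unimodal(values, edges):
    """Test unimodality of an edge-linear function on a tree.

    values: dict mapping each vertex to a nonnegative number
    edges:  list of (a, b) pairs forming a tree on those vertices
    Returns True vacuously for the identically zero function.
    """
    adj = defaultdict(list)
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    # Between consecutive vertex values the vertex set of the
    # excursion set does not change, so these levels suffice.
    levels = sorted({x for x in values.values() if x > 0})
    for c in levels:
        keep = {v for v, x in values.items() if x >= c}
        start = next(iter(keep))
        seen, stack = {start}, [start]
        while stack:  # depth-first search inside the excursion set
            v = stack.pop()
            for w in adj[v]:
                if w in keep and w not in seen:
                    seen.add(w)
                    stack.append(w)
        if seen != keep:
            return False
    return True
```

For instance, on a path with three vertices the values 0, 2, 1 pass the test, while 1, 0, 1 fail it because the excursion set at level 1 has two components.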
2.2. Free and forced
Given the data , we call a subtree mode-free if there exists a unimodal decomposition of such that is free of modes. Clearly, mode-free subtrees are closed under the operation of taking subtrees. We will call a vertex of mode-forced for if is a mode of every minimal unimodal decomposition of on .
For any satisfying the assumptions of §2, a mode-forced vertex exists.
Choose any vertex and consider it the root of ; then becomes a union of several branches each having as a root. We will call a branch (a connected component of the complement of , and all vertices in it excluding ) insignificant, if the values of are monotonically decreasing away from in that branch.
It is clear that removing all insignificant vertices does not increase the unimodal category of (but can generate more insignificant vertices). Pruning iteratively the tree of its insignificant vertices results either in a single vertex (in which case the original function is unimodal), or in a tree with at least two leaves, each of which is mode-forced. ∎
Specific examples of mode-forced vertices include the global maximum of , as well as all local maxima of such that all but one of the components of their complement are monotonically decreasing paths to leaves.
3.1. Sweeping operation
In this section we define a sweeping operation that will generate a function with a given mode.
Let be a vertex of and satisfying the assumptions of §2.1. Define the function on using the following procedure.
Orient all edges of away from (making it the root of the oriented tree).
If for an oriented edge, , is defined, then apply the sweeping move:
Figure 2 below illustrates the sweeping mechanism.
3.2. Remainders and freedom
Given a vertex , we will call the function the -remainder. This is nonnegative as, clearly, for any . Also noted is that the support of does not contain . The following is the critical result needed for our constructions.
If is a mode-free subtree for , and , then is also mode-free for .
In other words, taking -remainders does not shrink free subtrees not containing . We postpone the proof until §5.2.
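To make the sweeping and remainder constructions concrete, here is a small sketch at the level of vertex values. The rule used below is an assumption modeled on the one-dimensional description of the algorithm, and is not a transcription of the sweeping move (1): the swept function starts at the value of the input at the chosen vertex, decreases by the same amount as the input wherever the input decreases away from that vertex, stays constant where the input increases, and is clipped at zero. The names `sweep` and `remainder` are hypothetical.

```python
from collections import defaultdict, deque

def sweep(values, edges, root):
    """Swept function from `root`, under the ASSUMED rule described above.

    values: dict vertex -> nonnegative number (edge-linear input)
    edges:  list of (a, b) pairs forming a tree
    """
    adj = defaultdict(list)
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    u = {root: values[root]}
    queue = deque([root])
    while queue:  # breadth-first traversal orients edges away from root
        w = queue.popleft()
        for w2 in adj[w]:
            if w2 in u:
                continue
            drop = max(0, values[w] - values[w2])  # decrement of the input
            u[w2] = max(0, u[w] - drop)            # follow it, clip at zero
            queue.append(w2)
    return u

def remainder(values, edges, root):
    """Input minus the swept function; vanishes at `root` by construction."""
    u = sweep(values, edges, root)
    return {v: values[v] - u[v] for v in values}
```

On the path with vertex values 0, 5, 4, 6, 4, 5, 0, sweeping from the vertex with the first value 5 gives 0, 5, 4, 4, 2, 2, 0, and the remainder 0, 0, 0, 2, 2, 3, 0 is itself unimodal.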
4. Greedy Algorithm
The algorithm given in [1] for computing a minimal unimodal decomposition of a density on an interval involved sweeping across the interval, identifying mode-forced vertices, then removing their contribution by computing remainders, until the entire interval was swept (to use the language of this paper). In essence, the same process is here employed for a tree in the following greedy algorithm:
Using Lemma 2.4, find a mode-forced vertex .
Construct the function ; it is a component of the unimodal decomposition.
Compute the remainder and iterate.
The detection of mode-forced vertices is constructive by Lemma 2.4. Theorem 3.1, when proved, will permit iteration of the greedy algorithm by preserving the mode-free subtree structure: as the mode-free subtrees do not shrink, any unimodal decomposition will survive under sweeping, preserving the (strong) unimodal category.
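The three steps of the greedy algorithm can be sketched as follows, working only with vertex values. Two ingredients are assumptions rather than transcriptions of the paper: the vertex selection prunes leaves whose value does not exceed that of their neighbor (modeled on the pruning argument for the existence of mode-forced vertices), and the sweeping rule follows the decrements of the input, stays constant where the input increases, and is clipped at zero. All function names are hypothetical, vertex labels are assumed to be mutually comparable, and exact arithmetic (integers or fractions) is assumed so that remainders vanish exactly.

```python
from collections import defaultdict, deque

def forced_vertex(values, adj):
    """Prune leaves not exceeding their neighbor; return a surviving leaf."""
    alive = set(values)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            nbrs = [w for w in adj[v] if w in alive]
            if len(nbrs) == 1 and values[v] <= values[nbrs[0]]:
                alive.discard(v)
                changed = True
    leaves = [v for v in alive if sum(w in alive for w in adj[v]) <= 1]
    return min(leaves)  # deterministic choice among the candidates

def sweep(values, adj, root):
    """Swept function from `root` under the ASSUMED rule described above."""
    u = {root: values[root]}
    queue = deque([root])
    while queue:
        w = queue.popleft()
        for w2 in adj[w]:
            if w2 not in u:
                u[w2] = max(0, u[w] - max(0, values[w] - values[w2]))
                queue.append(w2)
    return u

def greedy_decomposition(values, edges):
    """Return a list of unimodal components (dicts) summing to `values`."""
    adj = defaultdict(list)
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    rest, parts = dict(values), []
    while any(x > 0 for x in rest.values()):
        v = forced_vertex(rest, adj)                 # step 1: pick a vertex
        u = sweep(rest, adj, v)                      # step 2: sweep from it
        parts.append(u)
        rest = {w: rest[w] - u[w] for w in rest}     # step 3: remainder
    return parts
```

On the path with vertex values 0, 5, 4, 6, 4, 5, 0 this sketch returns two components, with modes at the two vertices of value 5 rather than at the vertex of value 6. On a star whose center has value 1 and whose three leaves have value 2, it returns three components.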
5. Remainders and leaf functions
We need some preliminary results.
5.1. Leaf functions
Let be a mode-free subtree of . Denote by the set of leaves of (not necessarily leaves of ).
The condition of being mode-free for is equivalent to the existence of leaf functions, , such that each is non-increasing away from and
If is mode-free for , then has a unimodal decomposition without modes on . For each , let be the sum of the over all whose modes lie in the connected component of adjacent at leaf . Likewise, given such a decomposition of on as a sum of leaf functions , choose a unimodal decomposition of on each connected component of . Use the restriction of this to each to evenly divide the leaf functions into summands. The resulting decomposition is unimodal (perhaps not minimally so) and mode-free on . ∎
5.2. Remainders preserve mode-freeness
Consider an edge in . The restrictions of the leaf functions to this edge can be modified so that the constraints of Lemma 5.1 are still satisfied. In what follows, let be the leaf adjacent to the component of containing .
We want to prove that restricted to can still be decomposed as described in Lemma 5.1. To achieve this we will modify the decomposition (2) edge by edge, away from in such a way that is always changed according to the sweeping rule (1), while the remaining functions continue to satisfy the condition of being non-increasing away from .
It is immediate that if the restriction of to the edge is linear interpolating , then any modification of to and to extends to an edge-linear, non-increasing from function as long as , see Fig. 3. We will call any such modification of a function on an edge admissible.
The following lemma asserts that we can restrict our attention to admissible modifications of aggregations of leaf functions.
Consider an edge, and a collection of leaf functions , each non-increasing from to . Their sum is also non-increasing in the same direction. Let be an admissible modification of . Then there exist admissible modifications of each so that on .
By admissibility, we have that . The claim is equivalent to existence of a solution to the following systems of (in)equalities on the variables :
Using the fact that and , transform the left hand side of (6) as
6. Stability of mode-free subtrees
In this section we finally prove Theorem 3.1 on the stability of mode-free subtrees.
Proof of Theorem 3.1.
Lemmas 5.1 and 5.2 imply that any admissible modification of on an edge of maintains a leaf function decomposition, and thus, the mode-free nature of . To show that preserves the mode-free subtree for , we apply the sweeping algorithm generating along the edges in , propagating away from (the leaf of adjacent to the component of containing ) and performing admissible modifications.
Specifically, if we modify the leaf functions so that for some constant , then we have a leaf function decomposition of on . If , then raising to and lowering the other leaf functions is admissible. In this case, .
Fix an edge where the leaf functions are to be modified. We assume that the propagation happens from to (that is is in the component of containing ). Consider the values and at the vertices. Decompose these as follows:
where and are values of the leaf function corresponding to ; the , , , and terms are values (at and ) of sums of the remaining leaf functions that are respectively oriented (nonincreasing) from to () and from to (). By Lemma 5.2, it suffices to give admissible modifications of these aggregated leaf functions — that is of values , , , and .
After potentially subdividing based on where hits zero, there are two cases, corresponding to being increasing or decreasing from to . If is decreasing, then , so that its decrement is less than or equal to the decrement of . For these cases, we modify -values of leaf functions at and to -values as per Lemma 5.2. These modifications are given in Table 6 and can be checked as admissible and summing to as per (11).
Sweeping over , this modifies the leaf function at to be of the form . Thus, modifying to the remainder has the effect of maintaining a leaf function decomposition on . ∎
The present work gives a constructive method for the computation of minimal unimodal decompositions of densities on trees. We end with a few remarks.
Unimodal decompositions are far from unique. Minimal unimodal decompositions are likewise not unique, but their non-uniqueness is structured. The mode-forced vertices and mode-free subtrees provide a skeleton on which a given density hangs. By performing the sweeping moves of the greedy algorithm in different “directions” one arrives at many minimal unimodal decompositions. The analogous variations over an interval consist in sweeping from left to right or from right to left [1].
Assume that the values of on the vertices of are given as the input. As written, the greedy algorithm for generating a unimodal decomposition is . One can do better. For example, if the modes have supports of uniformly bounded size , the algorithm runtime drops to .
Of course, the restriction of these results to trees is suboptimal. For applications of unimodal decompositions in disciplines where precise geometric data can be hard to come by (e.g., phylogenetics or neuroscience), graphs with cycles are not only possible but (especially in the case of neuroscience) critical. The generalization from graphs to higher dimensional domains is likewise important but appears formidable, depending on the model of unimodal decomposition employed. We view the construction of minimal unimodal decompositions on graphs as an important prerequisite for this challenge.
This research was done while YB was visiting the Departments of Mathematics and ESE of the University of Pennsylvania - the hospitality of both departments is warmly appreciated.
[1] Y. Baryshnikov and R. Ghrist, “Unimodal Category and Topological Statistics,” Proc. NOLTA, 2011.
[2] J. Behboodian, “On the modes of a mixture of two normal distributions,” Technometrics, 12, 1970, 131–139.
[3] G. Carlsson and F. Mémoli, “Classifying clustering schemes,” Found. Comput. Math., 13 (2), 2013, 221–252.
[4] M. Carreira-Perpinan and C. Williams, “On the number of modes of a Gaussian mixture,” in Scale-Space Methods in Computer Vision, Lecture Notes in Comput. Sci., vol. 2695, 2003, 625–640.
[5] O. Cornea, G. Lupton, J. Oprea, and D. Tanré, Lusternik-Schnirelmann Category, Amer. Math. Soc., 2003.
[6] I. Eisenberger, “Genesis of bimodal distributions,” Technometrics, 6, 1964, 357–363.
[7] M. Farber, “Topological complexity of motion planning,” Discrete Comput. Geom., 29 (2), 2003, 211–221.
[8] R. Ghrist, Elementary Applied Topology, Createspace, 2014.
[9] D. Govc, “Unimodal category and the monotonicity conjecture,” arXiv:1709.06547 [math.AT], 2017.
[10] S. Huntsman, “Topological Mixture Estimation,” preprint, 2018.
[11] I. Kakiuchi, “Unimodality conditions of the distribution of a mixture of two distributions,” Math. Sem. Notes Kobe Univ., 9, 1981, 315–325.
[12] J. Kemperman, “Mixtures with a limited number of modal intervals,” Ann. Statist., 19, 1991, 2120–2144.
[13] J. Kleinberg, “An impossibility theorem for clustering,” in Proc. NIPS, 2002, 446–453.
[14] C. Robertson and J. Fryer, “Some descriptive properties of normal mixtures,” Skand. Aktuarietidskr, 1969, 137–146.