# On Computing Min-Degree Elimination Orderings

We study faster algorithms for producing the minimum degree ordering used to speed up Gaussian elimination. This ordering is based on viewing the non-zero elements of a symmetric positive definite matrix as edges of an undirected graph, and aims at reducing the additional non-zeros (fill) in the matrix by repeatedly removing the vertex of minimum degree. It is one of the most widely used primitives for pre-processing sparse matrices in scientific computing. Our result is in part motivated by the observation that sub-quadratic time algorithms for finding min-degree orderings are unlikely, assuming the strong exponential time hypothesis (SETH). This provides justification for the lack of provably efficient algorithms for generating such orderings, and leads us to study speedups via degree-restricted algorithms as well as approximations. Our two main results are: (1) an algorithm that produces a min-degree ordering whose maximum degree is bounded by Δ in O(mΔ log^3 n) time, and (2) an algorithm that finds a (1 + ϵ)-approximate marginal min-degree ordering in O(m log^5 n ϵ^-2) time. Both of our algorithms rely on a host of randomization tools related to the ℓ_0-estimator by [Cohen '97]. A key technical issue for the final nearly-linear time algorithm is the dependence of the removed vertex on the randomness in the data structures. To address this, we provide a method for generating a pseudo-deterministic access sequence, which then allows the incorporation of data structures that only work under the oblivious adversary model.




## 1 Introduction

Many algorithms in numerical analysis and scientific computing benefit from speedups using combinatorial graph theory [NS12, HP07]. Such connections are due to the correspondence between non-zero entries of matrices and edges of graphs. The minimum degree algorithm is a classic heuristic for minimizing the space and time cost of Gaussian elimination, which solves a system of linear equations by adding and subtracting rows to eliminate variables. As its name suggests, it repeatedly pivots on the variable involved in the fewest number of equations [GL89].[^1] There are many situations where this is suboptimal. Nonetheless, it is still a widely used and effective heuristic in practice [ADD04, DGLN04]. It is integral to the direct methods for solving linear systems exactly in LAPACK [ABD90], which is in turn called by the “\” command for solving linear systems in MATLAB [Mat17]. It is also a critical part of the linear algebra suite in Julia [BKSE12].

[^1]: We will assume the system is symmetric positive definite (SPD) and thus the diagonal will remain strictly positive, allowing for any pivot order.

While the best theoretical running times for solving such systems either rely on fast matrix multiplication [LG14] or iterative methods [ST14, KMP12], direct methods and their speedups are preferred in many cases. For such elimination-based methods, performance better than the general O(n^3) bound for naive Gaussian elimination is known only when the non-zero graph has additional separators [LT79, LRT79, GT87] or hierarchical structure [PCD17]. Nonetheless, these methods are still preferable for a variety of reasons. They only depend on the non-zero structure, and have fewer numerical issues. More importantly, direct methods also benefit more from the inherent sparsity in many real-world input instances. For an input matrix and a given elimination order of the variables, the non-zero structure that arises over the course of the elimination steps has a simple graph-theoretic characterization [Ros73, RTL76, LRT79, GT87].

This characterization of additional non-zero entries, known as fill, is at the core of elimination trees, which allow one to precisely allocate memory for the duration of the algorithm ahead of time [GNP94]. The reliable performance of elimination-based methods has led to the study of elimination-based methods for solving more structured linear systems [KS16]. However, recent hardness results seem to indicate that speedups via additional numerical structure may be limited to families of specific problems instead of all sparse matrices arising in scientific computing and numerical analysis [KZ17].

Although computing an elimination ordering that minimizes the total cost is NP-hard in general [BS90, Yan81], the minimum degree heuristic is exceptionally useful in practice. When the non-zeros of the matrix are viewed as edges of a graph, eliminating a vertex is equivalent to creating a clique on its neighborhood and then deleting this vertex. With this view in mind, the traditional min-degree algorithm can be viewed as: (1) find the vertex u with minimum degree (which we term the fill-degree to avoid confusion with the original graph) in O(n) time; (2) add a clique among all its neighbors in O(deg(u)^2) time; (3) remove it together with all its edges from the graph in O(deg(u)) time.

This leads to a running time of O(n^3) in the worst case, as high as the cost of Gaussian elimination itself. Somewhat surprisingly, despite the wide use of the min-degree heuristic in practice, there have been very few works on provably faster algorithms for producing this ordering. Instead, heuristics such as AMD (approximate minimum degree ordering) [ADD96] aim to produce orderings similar to minimum-degree orderings in provably faster times, such as O(nm), without bounds on the degrees of pivots.
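For reference, the three steps of the traditional algorithm translate into a few lines of code. This is a minimal illustrative sketch, not the paper's approach; the dictionary-of-sets graph representation and the lexicographic tie-breaking are our own choices:

```python
def min_degree_ordering(adj):
    """Naive min-degree ordering on an undirected graph.

    adj: dict mapping vertex -> set of neighbors (modified in place).
    Eliminating a vertex adds a clique on its neighborhood (the fill),
    then deletes the vertex, mirroring steps (1)-(3) above.
    """
    order = []
    while adj:
        # (1) scan for a remaining vertex of minimum fill-degree,
        # breaking ties lexicographically.
        u = min(adj, key=lambda v: (len(adj[v]), v))
        nbrs = adj.pop(u)
        for v in nbrs:
            # (2) add a clique among the neighbors of u (fill edges).
            adj[v] |= nbrs - {v}
            # (3) remove u and its incident edges.
            adj[v].discard(u)
        order.append(u)
    return order
```

Each iteration may add up to deg(u)^2 fill edges, which is exactly the quadratic-per-pivot cost that the sketching techniques in this paper avoid by working with implicit representations.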

Our investigation in this paper revolves around the question of finding provably more efficient algorithms for producing exact and approximate min-degree orderings. We combine sketching with implicit representations of the fill structure to obtain provably efficient algorithms. These algorithms utilize representations of intermediate non-zero structures related to elimination trees in order to implicitly examine the fill, which may be much larger. We also uncover a direct but nonetheless surprising connection between finding min-degree vertices and popular hardness assumptions. In particular, we show that computing the vertex of minimum degree after several specified pivot steps cannot be done in O(m^{2−θ}) time for any θ > 0, assuming the widely-believed strong exponential time hypothesis [Wil05].

Nevertheless, we are able to extend various tools from sketching and sampling to give several improved bounds for computing and approximating minimum degree orderings. We show that our use of sketching can be much more efficient when the maximum degree is not too large. This in turn enables us to use sampling to construct data structures that accurately approximate the fill-degrees of vertices in nearly-linear time, even under pivoting of additional vertices. Leveraging such approximate data structures, we obtain an algorithm for producing an approximate marginal minimum degree ordering, which at each step pivots a vertex whose degree is close to minimum, in nearly-linear time. Our main result is:

###### Theorem 1.1.

Given an n × n matrix whose non-zero graph structure contains m non-zeros, we can produce a (1 + ϵ)-approximate greedy min-degree ordering in O(m log^5 n ϵ^-2) time.

Our algorithms combine classical ideas in streaming algorithms and data structures, such as ℓ_0-samplers [Coh97], wedge sampling [KP17, ELRS17], and exponential start-time clustering [MPX13, MPVX15]. Until now these tools have not been rigorously studied in the context of scientific computing due to their dependency on randomization. However, we believe there are many other algorithms and heuristics in scientific computing that can benefit from the use of these techniques.

Furthermore, our overall algorithm critically relies on dissociating the randomness from the pivot steps, as each update depends on the randomness in the data structures. In Section 3.4 we give an example of how such correlations can “amplify” errors in the data structures. To address this issue, we define a pseudo-deterministic sequence of pivots based on a second degree-estimation scheme, which we discuss in Section 3.5.

Our paper is organized as follows. We will formalize the implicit representation of fill and definitions of exact, capped, and approximate min-degree orderings in Section 2. Then in Section 3 we give an overview of our results and discuss our main decorrelation technique in Subsection 3.5. Our main hardness results are in Section 4, while the use of sketching and sampling to obtain exact and approximate algorithms are in Sections 5 and 6, respectively. Further details on the graph theoretic building blocks are in Sections 7 and 8. They respectively cover the estimation of fill-degree of a single vertex and the maintenance of sketches as vertices are pivoted.

## 2 Preliminaries

We work in the pointer model, where function arguments are pointers to objects instead of the objects themselves. Therefore, we do not assume that passing an object of size O(n) costs O(n) time and space. This is essentially the “pass by reference” construct in high-level programming languages.

### 2.1 Gaussian Elimination and Fill

Gaussian elimination is the process of repeatedly eliminating variables from a system of linear equations, while maintaining an equivalent system on the remaining variables. Algebraically, this involves taking one equation involving some target variable and subtracting (a scaled version of) this equation from all others involving the target variable. Since our systems are SPD, we can also apply these operations to the columns and drop the variable, which gives the Schur complement.

A particularly interesting fact about Gaussian elimination is that the numerical Schur complement is unique irrespective of the ordering of pivoting. Under the now standard assumption that non-zero elements do not cancel each other out [GL89], this commutative property also holds for the combinatorial non-zero structure. Since the non-zero structure of a matrix corresponds to a graph, we can define the combinatorial change to the non-zero structure of the matrix as a graph theoretic operation. We start with the notation from Gilbert, Ng, and Peyton [GNP94]. For a symmetric matrix A, they use G(A) to denote the undirected graph formed by its non-zero structure.

Gilbert, Ng, and Peyton [GNP94] worked with a known elimination ordering and treated the entire fill pattern statically. Because we work with partially eliminated states, we will need to distinguish between the eliminated and remaining vertices in G by implicitly associating vertices with two states:

• Eliminated vertices will be denoted using x and y.

• Remaining vertices will be denoted using u, v, and w.

Then we use the fill graph G+ to denote the graph on the remaining vertices, where we add an edge between any pair of remaining vertices u and v connected via a path of eliminated vertices. We can also iteratively form the fill graph from G by repeatedly removing an eliminated vertex x and its incident edges, and then adding edges between all of the neighbors of x to form a clique. This characterization of fill means that we can readily compute the fill-degree of a single vertex in a partially eliminated state without explicitly constructing the matrix.

###### Lemma 2.1.

For any graph G and vertex u, given an elimination ordering, we can compute in O(m) time the value deg_{G+}(u) at the step when u is eliminated.

###### Proof.

Color the vertices in the sequence before u red, and color all remaining vertices green. Run a depth-first search from u that terminates at green vertices. Let S be the set of green vertices at which the search terminated. It follows from the definition of G+ that deg_{G+}(u) = |S|. ∎
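The search in this proof can be sketched directly. This is an illustrative implementation under our own choice of interface: `adj` is an adjacency-set representation of the original graph G, and `eliminated` is the set of red (already pivoted) vertices:

```python
def fill_degree(adj, eliminated, u):
    """Fill-degree of remaining vertex u, following the proof above.

    The DFS from u walks freely through red (eliminated) vertices but
    stops at green (remaining) ones; the green terminals are exactly
    u's neighbors in the fill graph G+.
    """
    terminals, stack, seen = set(), [u], {u}
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w in seen:
                continue
            seen.add(w)
            if w in eliminated:
                stack.append(w)   # red: keep searching through it
            else:
                terminals.add(w)  # green: record and stop
    return len(terminals)
```

Each call costs O(m) since every edge is examined at most twice, matching the lemma's bound.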

This kind of path finding among eliminated vertices adds an additional layer of complexity to our structures. To overcome this, we contract eliminated vertices into their connected components, leading to the notion of the component graph. We use G∘ to denote the graph obtained by contracting all edges between pairs of eliminated vertices x and y. We will denote the vertices corresponding to such components by c. Note that G∘ is a quasi-bipartite graph, because the contraction rule implies there are no edges between the component vertices. It is also useful to denote the neighborhood of different kinds of vertices in a component graph:

• N_remaining(c) or N_remaining(u): For a component c or a remaining vertex u in the component graph G∘, we use this to denote the neighbors that are remaining vertices.

• N_component(u): For a remaining vertex u, this is the set of component vertices adjacent to u.

• N_reachable(u): For a remaining vertex u, this denotes the neighbors of u in G+ (together with u itself), which is

 ⎛⎝⋃c∈Ncomponent(u)Nremaining(c)⎞⎠∪Nremaining(u)∪{u}.

Note that the fill-degree of a remaining vertex u (its degree in G+) is precisely |N_reachable(u)| − 1. Additionally, we use the restricted degrees:

• deg_remaining(c) or deg_remaining(u) to denote the size of N_remaining(c) or N_remaining(u), respectively.

• deg_component(u) to denote the size of N_component(u) for some remaining vertex u.
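Assuming the component graph is stored as an adjacency map, the definition of N_reachable above translates directly into code. This is a minimal sketch with our own illustrative names, not the paper's data structure:

```python
def reachable_neighborhood(comp_adj, remaining, u):
    """N_reachable(u) in a component graph: u itself, its remaining
    neighbors, and the remaining neighbors of every adjacent component.

    comp_adj: dict vertex -> set of neighbors in the component graph.
    remaining: set of remaining (uneliminated) vertices; component
    vertices are exactly those not in `remaining`.
    """
    reach = {u} | (comp_adj[u] & remaining)   # N_remaining(u) and u itself
    for c in comp_adj[u] - remaining:         # N_component(u)
        reach |= comp_adj[c] & remaining      # N_remaining(c)
    return reach
```

The fill-degree of u is then one less than the size of this set, since the set includes u itself.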

### 2.2 Min-Degree Orderings: Greedy, Capped, and Approximate

For an elimination ordering

 u1,u2,…,un,

we define G_i as the graph with vertices u_1, …, u_i marked as eliminated and u_{i+1}, …, u_n marked as remaining. Furthermore, we say such a permutation is a minimum degree permutation if at each step i, the vertex u_i has the minimum fill-degree in the non-zero structure graph G+_{i−1}. Concretely,

 degG+i−1(ui)=minv∈V(G+i−1){degG+i−1(v)}. (1)

Because the performance of our algorithm degrades over time as the minimum degree increases, we define the notion of a Δ-capped minimum degree ordering, where degrees are truncated to Δ before making a comparison. We first define Δ-capped equality, where Δ is a positive integer.

###### Definition 2.2.

We use the notation x =_Δ y to denote min(x, Δ) = min(y, Δ).

Now we can modify the definition of minimum degree in Equation 1 to specify that the elimination sequence satisfies the Δ-capped minimum degree property at each time step:

 degG+i−1(ui)=Δminv∈V(G+i−1){degG+i−1(v)}. (2)

Our algorithm for finding the minimum (-capped) degrees is randomized, so we need to be careful to not introduce dependencies between different steps when several remaining vertices are of minimum degree. To bypass this problem, we require that the lexicographically least vertex be eliminated at each step in the event of a tie. This simple condition is critical for arguing that our randomized routines do not introduce dependencies as the algorithm progresses.

Lastly, our notion of approximating the min-degree ordering is based on finding the vertex whose fill-degree is approximately minimum in the current fill graph G+. This decision process has no look-ahead, and therefore does not in any way approximate the minimum possible total fill.

###### Definition 2.3.

An ordering of vertices u1, u2, …, un is a (1 + ϵ)-approximate greedy min-degree ordering if for all steps i we have

 degG+i−1(ui)≤(1+ϵ)minv∈V(G+i−1){degG+i−1(v)}. (3)

### 2.3 Randomized Tools

All of our algorithms are randomized, and their analyses involve tools such as the union bound, concentration bounds, and explicit calculations and approximations of expected values. We say an event happens with high probability (w.h.p.) if for any constant c > 0 there is a setting of constants (hidden by big-O notation) so that this event occurs with probability at least 1 − n^{−c}. We also make extensive applications of backward analysis [Sei93], which calculates the probabilities of events locally using the current state of the data structures.

Our final algorithm for producing (1 + ϵ)-approximate marginal min-degree orderings relies heavily on properties of the exponential distribution in order to decorrelate updates to the data structures from the results they produce. Properties of the exponential random variable are formalized in Section 6, and we discuss its role in our algorithm in the overview in Section 3.5.

The analysis of our algorithms critically hinges on viewing all randomness as being generated beforehand, based on the (potential) index in which the procedure gets called. This is opposed to having a single source of randomness that we query sequentially as the procedures are invoked. For procedures such as the fill-degree estimator in Section 7.1, this method leads to a simplified analysis by viewing the output of a randomized subroutine as a fixed distribution. Such a view of randomization is also a core idea in our decorrelation routine, which defines a random distribution on all the elements but queries only a small fraction of them in expectation. This view is helpful for arguing that the randomness we query is independent of the indices that we ignored.

### 2.4 Related Works

#### Fill from Gaussian Elimination and Pivot Orderings

The study of better pivoting orderings is one of the foundational questions in combinatorial scientific computing. Work by George [Geo73] led to the study of nested dissection algorithms, which utilize separators to give provably smaller fill bounds for planar [RTL76, LRT79] and separable graphs [GT87, AY10]. One side effect of such a study is the far better (implicit) characterization of fill entries discussed in Section 2.1. This representation was used to compute the total amount of fill of a specific elimination ordering [GNP94]. It is also used to construct elimination trees, which are widely used in combinatorial scientific computing to both pre-allocate memory and optimize cache behaviors [Liu90].

#### Finding Low Fill-in Orderings

The ability to compute total fill for a given ordering raises the natural question of whether orderings with near-optimal fills can be computed. NP-hardness results for finding the minimum fill-in ordering [Yan81, BS90] were followed by works for approximating the minimum total fill [NSS00], as well as algorithms [KST99, FV13] and hardness results for parameterized variants [WAPL14, BCK16, CS17].

Partially due to the higher overhead of these methods, the minimum degree method remains one of the most widely used methods for producing orderings with small fill [GL89]. Somewhat surprisingly, we were not able to find prior works that compute the exact minimum degree ordering in time faster than the O(n^3) cost of the naive algorithm, or ones that utilize the implicit representation of fill provided by elimination trees.[^2] On the other hand, there are various approximate schemes for producing min-degree like orderings. These include multiple minimum degree (MMD) [Liu85] and the approximate minimum degree algorithm (AMD), the latter of which is used in MATLAB [ADD96]. While both of these methods run extremely well in practice, theoretically they have tight performances of Θ(n^2 m) for MMD and Θ(nm) for AMD [HEKP01]. Furthermore, AMD can be viewed as a different version of the min-degree heuristic, as it is not always guaranteed to produce a vertex of approximate minimum degree.

[^2]: We use speculative language here due to the vastness of the literature on variants of minimum degree algorithms.

#### Estimating and Sketching Sizes of Sets

The core difficulty of our algorithms is in estimating the cardinality of sets (neighborhoods of remaining or component vertices in component graphs G∘) under unions and deletions of elements. Many cardinality estimation algorithms have been proposed in the streaming algorithm literature using similar ideas [FM85, CM05]. These algorithms often trade off accuracy for space, whereas we trade space for accuracy and efficiency in updates and queries.

Also closely related is another size-estimation framework for reachability problems by Cohen [Coh97]. This work utilized ℓ_0-estimators, which propagate random sketch values along neighborhoods to estimate the size of reachable sets. Our sketching method in Section 5 propagates the exact same set of values. However, we need to maintain this propagation under vertex pivots, which is akin to contracting edges in the component graph. This leads to a layer of intricacies that we resolve using amortized analysis in Section 8.

#### Removing Dependencies in Randomized Algorithms

Lastly, our use of size estimators is dynamic—the choice of pivots, which in turn affects the subsequent graph elimination states, is a result of the randomness used to generate the results of previous steps. The independence between the access sequence and the randomness is a common requirement in recent works on data structures that maintain spanning trees and matchings [BGS15, KKM13, Sol16]. There this assumption is known as the oblivious adversarial model, which states that the adversary can choose the graph and the sequence of updates, but it cannot choose updates adaptively in response to the randomly guided choices of the algorithm.

There have been recent works that re-inject randomness to preserve “independence” of randomized dimensionality-reduction procedures [LS15]. The amount of “loss” in randomness has been characterized via mutual information in a recent work [KNP17]. Their bounds require a substantial amount of additional randomness in order to handle adversarially injected information, which as stated is too much for handling pivots adversarially. Our work also has some tenuous connections to recent works that utilize matrix martingales to analyze repeated introductions of randomness in graph algorithms [KS16, KPPS17]. However, our work utilizes more algorithmic tools than the martingale-based ones.

## 3 Overview

The starting point of our investigation uses sketching to design an efficient data structure for maintaining fill-degrees under pivot operations. This corresponds to edge contractions in the component graph and is based on the observation that ℓ_0-estimators propagate well along edges of graphs. For any matrix with m non-zero entries, this algorithm produces an exact min-degree ordering in about n·m time, up to logarithmic factors.

In our attempts to improve the running time of an exact algorithm, we came to the somewhat surprising realization that it is hard to compute the minimum degree in certain partially eliminated graphs in O(m^{2−θ}) time, for any θ > 0, assuming the strong exponential time hypothesis. We extend this observation to give super-linear hardness for computing minimum degree orderings.

This hardness result for exact minimum degree sequences then motivated us to parameterize the performance of min-degree algorithms in a new way. Inspired by the behavior of AMD, we parameterize the performance of our algorithm in terms of intermediate degrees: letting Δ bound the minimum fill-degree at every pivot step, our algorithm runs in O(mΔ log^3 n) time. For many important real-world graphs such as grids and cube meshes, this bound is sub-quadratic. We then proceed to give a nearly-linear time algorithm for computing a (1 + ϵ)-approximate marginal min-degree ordering, where at each step the eliminated vertex has fill-degree close to the current minimum.

### 3.1 Sketching the Fill Graph

We first explain the connection between computing fill-degrees and estimating the size of reachable sets. Assume for simplicity that no edges exist between the remaining vertices. Consider duplicating the remaining vertices so that each remaining vertex u splits into u_in and u_out, and any edge {u, c} in the component graph becomes the two directed edges (u_out, c) and (c, u_in). Then the fill-degree of u is the number of remaining vertices reachable from u_out. Estimating the size of reachable sets is a well-studied problem for which Cohen [Coh97] gave a nearly-linear time algorithm using ℓ_0-estimators. Adapting this framework to our setting for fill graphs (without duplication of vertices) leads to the following ℓ_0-sketch structure.

###### Definition 3.1.

An ℓ_0-sketch structure consists of:

1. Each remaining vertex u generating a random number x_u.

2. Each remaining vertex u then computing the minimum value among its neighbors in G+ (including itself), which is equivalent to

 minv∈Nreachable(u)xv.

In Section 8 we demonstrate that a copy of this structure can be maintained efficiently through any sequence of pivots in nearly-linear time. As the priorities x_u are chosen independently and uniformly at random, we effectively assign each vertex a uniformly random vertex from its reachable set N_reachable(u). Therefore, if we maintain O(n log n) independent copies of this ℓ_0-sketch data structure, by a coupon-collector argument each vertex w.h.p. has a list of all of its distinct fill-neighbors. Adding together the cost of these copies leads to an algorithm that computes a minimum degree sequence in about n·m time (up to logarithmic factors), which to the best of our knowledge is the fastest such algorithm.
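The coupon-collecting argument can be illustrated with a brute-force sketch. All names here are our own, and the real structure of Section 8 maintains the minima incrementally under pivots rather than recomputing them per copy as done below:

```python
import random

def l0_sketch_neighbors(comp_adj, remaining, copies, seed=None):
    """Collect fill-neighbors via repeated l0-sketches.

    Each copy draws fresh uniform values x_v and records, for every
    remaining vertex u, the argmin of x over N_reachable(u). Over
    enough copies (coupon collector), the recorded sets converge to
    the true fill-neighborhoods.
    """
    rng = random.Random(seed)
    collected = {u: set() for u in remaining}
    for _ in range(copies):
        x = {v: rng.random() for v in remaining}
        for u in remaining:
            # N_reachable(u): u, its remaining neighbors, and the
            # remaining neighbors of adjacent component vertices.
            reach = {u} | (comp_adj[u] & remaining)
            for c in comp_adj[u] - remaining:
                reach |= comp_adj[c] & remaining
            collected[u].add(min(reach, key=x.__getitem__))
    return collected
```

Since each copy returns a uniformly random element of N_reachable(u), roughly Δ log n copies suffice to see all Δ distinct neighbors with high probability.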

### 3.2 SETH-Hardness of Computing Min-Degree Elimination Orderings

Our hardness results for computing the minimum fill degree and the min-degree ordering are based on the strong exponential time hypothesis (SETH), which states that for every θ > 0 there exists a k such that solving k-SAT requires Ω(2^{(1−θ)n}) time. Many hardness results based on SETH, including ours, go through the OrthogonalVectors problem and make use of the following result.

###### Theorem 3.2 ([Wil05]).

Assuming SETH, for any θ > 0, there does not exist an O(n^{2−θ}) time algorithm that takes n binary vectors with Θ(log^2 n) bits and decides if there is an orthogonal pair.

We remark that OrthogonalVectors is often stated as deciding if there exists a pair of orthogonal vectors from two different sets [Wil15], but we can reduce the problem to a single set by appending the bits [1, 0] to all vectors in the first set and [0, 1] to all vectors in the second set.

Our hardness observation for computing the minimum degree of a vertex in the fill graph of some partially eliminated state is via a direct reduction from OrthogonalVectors. We give a bipartite graph construction that demonstrates how OrthogonalVectors can be interpreted as deciding whether a union of cliques covers a clique on the remaining vertices of a partially eliminated graph.

###### Lemma 3.3.

Assuming SETH, for any θ > 0, there does not exist an O(m^{2−θ}) time algorithm that takes any partially eliminated graph G and computes the minimum fill degree in G+.

###### Proof.

Consider an OrthogonalVectors instance with vectors a_1, a_2, …, a_n ∈ {0, 1}^d. Construct a bipartite graph G = (U, W, E) such that each vertex u_i ∈ U corresponds to the vector a_i and each vertex w_j ∈ W uniquely corresponds to a dimension j. For the edges, we connect u_i with w_j if and only if a_i(j) = 1.

Consider the graph state with all of W eliminated and all of U remaining. We claim that there exists a pair of orthogonal vectors among a_1, …, a_n if and only if there exists a remaining vertex u_i with deg_{G+}(u_i) < n − 1. Let u_i and u_j be any two different vertices in U, and let a_i and a_j be their corresponding vectors. The vertices u_i and u_j are adjacent in G+ if and only if there exists a dimension k such that a_i(k) = a_j(k) = 1.

Suppose there exists an O(m^{2−θ}) time algorithm for finding the minimum fill degree in a partially eliminated graph, for some θ > 0. Then for d = log^2 n, we can use this algorithm to compute the vertex with minimum fill degree in the graph described above in time

 O(m2−θ)=O((nlog2n)2−θ)=O(n2−θ/2),

which contradicts SETH by Theorem 3.2. ∎
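The reduction can be checked on small instances with a brute-force sketch. This is purely illustrative (the lower bound of course concerns algorithms far faster than this quadratic check); the function name is our own:

```python
def min_fill_degree_detects_orthogonal_pair(vectors):
    """Brute-force check of the equivalence in Lemma 3.3.

    Build the bipartite graph: U = vectors, W = dimensions, with an
    edge (i, j) iff vectors[i][j] == 1. With all of W eliminated, two
    remaining vertices are fill-adjacent iff their vectors share a 1,
    so some vertex has fill-degree < n - 1 exactly when an orthogonal
    pair exists.
    """
    n, d = len(vectors), len(vectors[0])
    fill_deg = []
    for i in range(n):
        # Fill-degree of u_i after eliminating W: count vectors with
        # a common 1-coordinate (i.e., non-orthogonal to a_i).
        deg = sum(
            1 for j in range(n)
            if j != i and any(vectors[i][k] and vectors[j][k] for k in range(d))
        )
        fill_deg.append(deg)
    return min(fill_deg) < n - 1
```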

In Section 4, we extend this observation to show that, assuming SETH, a nearly-linear time algorithm for computing the min-degree elimination ordering does not exist. This is based on constructing a graph where the bipartite graph in the proof of Lemma 3.3 appears in an intermediate step. The main overhead is adding more vertices and edges to force the vertices in W to be eliminated first. To do this, we first split each such vertex into a star. Then we fully connect W to an additional clique to ensure that the (split) vertices in W are the first to be pivoted. The extra edges in this construction weaken the bound to super-linear hardness. However, we believe this is suboptimal and that quadratic hardness is more likely.

### 3.3 Δ-capped and Approximately Marginal Min-Degree Ordering

This lower bound assuming SETH suggests that it is unlikely to obtain a nearly-linear, or even sub-quadratic, time algorithm for computing the min-degree ordering of a graph. As a result, we turn our attention towards approximations and output-sensitive algorithms.

Our first observation is that the size of N_reachable(u) can be bounded by the cap Δ, so O(Δ log n) copies of the sketches discussed in Section 3.1 suffice for “coupon collecting” all distinct values, instead of O(n log n) copies. This leads to bounds that depend on the maximum intermediate fill-degrees, which on large sparse graphs are often significantly less than n. We also show how to maintain multiple copies of the data structure and use order statistics of the sketch values to approximate the number of elements in a set. This leads to procedures that maintain approximate minimum degree vertices for fixed sequences of updates. This type of estimation is the same as using ℓ_0-estimators to approximate the size of reachable sets [Coh97].

This procedure of repeatedly pivoting out the approximate minimum degree vertices given by sketching yields a nearly-linear time algorithm for producing a (1 + ϵ)-approximate greedy min-degree ordering. Initially, however, we were unable to analyze it because the input sequence is not oblivious to the randomness of the data structure. In particular, the choice of pivots is dependent on the randomness of the sketches. Compared to the other recent works that analyze sequential randomness in graph sparsification [KS16, KPPS17], our accumulation of dependencies differs in that it affects the order in which vertices are removed, instead of just the approximation errors in matrices.

### 3.4 Correlation Under Non-Oblivious Adversaries

The general issue of correlations (or dependencies) between the randomness of a data structure and access patterns to it can be remarkably problematic. We consider a simple example where deciding future updates based on the outputs of previous queries results in a continual amplification of errors. This can be understood as adversarially correlating the update sequence with the results of the randomness. Consider the data structure in Figure 1 for maintaining a sequence of sets

 S1,S2,…,Sm⊆{1,2,…,n}

under insertions and deletions, which returns the set of minimum size up to a bounded additive error.

For a non-adaptive sequence fixed ahead of time and a single set S_j, Chernoff bounds imply that sampling a secret set of keys and counting its intersection with S_j gives a good size estimate with high probability. Therefore, we can utilize this to build a data structure that maintains a series of sets under insertions/deletions and returns a set of approximately minimum cardinality. Furthermore, to remove ambiguity, we assume this data structure breaks ties lexicographically when the intersections of two sets with the sampled keys have equal cardinality. With a similar invocation of Chernoff bounds, we can show that this augmented data structure is correct under the oblivious adversary model. As we maintain a sample of elements from each set S_j, the total space usage of this data structure is proportional to the number of sets times the sample size.

On the other hand, an adaptive adversary can use the results of previous queries to infer the set of secret keys. Consider the following sequence of updates:

1. Start with two sets, S_1 and S_2, both initially equal to {1, 2, …, n}.

2. For i = 1, 2, …, n:

1. Delete i from S_1.

2. If S_1 is reported as the set of approximate minimum size (the one with the smallest estimated cardinality), insert i back into S_1.

At the end of this sequence of updates, the only elements in are those in , which is a substantially worse result than what we can guarantee under the oblivious adversary model.
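This attack is easy to simulate. The snippet below is a simplification of the example (our own set names; for clarity, one shared secret sample plays the role of the structure's randomness): deletions of unsampled elements go unnoticed, so only sampled elements are reinserted.

```python
import random

def adaptive_attack(n=200, p=0.5, seed=7):
    """Simulate the adaptive update sequence against a sampling-based
    size estimator (a sketch; shared key set is our simplification)."""
    rng = random.Random(seed)
    K = {x for x in range(n) if rng.random() < p}   # the structure's secret sample
    A, B = set(range(n)), set(range(n))

    for x in range(n):
        A.discard(x)
        est_A = len(A & K) / p     # estimated |A|
        est_B = len(B & K) / p     # estimated |B| (B never changes)
        if est_A < est_B:          # A reported as the unique minimum:
            A.add(x)               # the deletion was "noticed", so reinsert x

    return A, K
```

After the loop, A equals the secret sample K exactly, so the structure estimates |A| as |K|/p ≈ n while the true size is |K| ≈ n/2: an Ω(n) additive error, far outside the oblivious-adversary guarantee.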

Our use of sketching to find a minimum degree vertex clearly does not perform updates that are this adversarial, but it does act on the minimum value generated by the randomized routine, so the final result could in principle be significantly inaccurate. Moreover, any accounting of correlation (in the standard sense) must allow for the worst-case adaptive behavior described above. In the next subsection, we describe an algorithmic approach that fixes this issue.

### 3.5 Decorrelating Sketches and Updates

Our correlation removal method is motivated by a third routine that estimates the fill-degree of a remaining vertex in time close to the degree of that vertex. We then define an approximate, greedy min-degree sequence using this routine. At each step we choose the pivot vertex u to be the minimizer of

 (1 − ϵ · Exp(1) / O(log n)) · EstimateDegree(u, ϵ / O(log n)),

which is the exponentially-decayed minimum over all the estimates returned by the degree estimation routine.

We then utilize an ℓ0-estimation structure to maintain approximate degrees throughout this update procedure. By doing this, the randomness in the ℓ0-estimation data structure is no longer correlated with the updates: the sequence is defined by randomness that is independent of the ℓ0-estimators, and (after excluding the small probability of incorrectness) may as well be considered deterministic. On the other hand, evaluating such a sequence using only calls to EstimateDegree is expensive: it requires one call per remaining vertex at each step, leading to a superlinear total cost. Here we reincorporate the ℓ0-estimation data structure via the following observations about the initial perturbation term involving the exponential random variable Exp(1).

1. For a set of vertices whose degrees are within a (1 + ϵ) factor of each other, it suffices to randomly select and consider a small number of them (by generating the highest order statistics of the exponential random variables in decreasing order).

2. By the memoryless property of the exponential distribution, each time we call EstimateDegree, with constant probability the call is made on the vertex that gets pivoted. Therefore, we can “charge” the cost of these evaluations to the overall edge count and retain the nearly-linear time bounds.
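The first observation relies on a classical trick: the largest few of n i.i.d. Exp(1) draws can be generated directly in decreasing order, without materializing the other draws. A sketch (parameter choices are ours):

```python
import math
import random

def top_exponential_order_statistics(n, k, seed=0):
    """Return the k largest of n i.i.d. Exp(1) variables in decreasing
    order, in O(k) time."""
    rng = random.Random(seed)
    t = 1.0        # t = 1 - (smallest uniform order statistic found so far)
    out = []
    for i in range(k):
        # next-smallest of the remaining n - i uniforms, via inverse CDF
        t *= rng.random() ** (1.0 / (n - i))
        # a small uniform corresponds to a large exponential: E = -ln(U)
        out.append(-math.log(1.0 - t))
    return out
```

The first value has CDF (1 − e^{−x})^n, i.e., exactly the maximum of n Exp(1) draws, and each subsequent value is the conditional next-largest.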

At a high level, we improve a data structure that only works under the oblivious adversary model by providing it with a fixed input, using a second, more local, size-estimation routine. Our generation of this “fixed” update sequence can still benefit from the approximate bucketing created in the data structure. The key idea is that any dependencies on the ℓ0-sketch structure stop after these candidates are generated—their answers only depend on the randomness of the separate size-estimation procedures.

This approach has close connections to pseudo-deterministic algorithms [GG11, Gol12, GGR13], which formalize randomized algorithms whose outputs are fixed. Such pseudo-deterministic update sequences seem particularly useful for expanding the settings in which data structures designed for the oblivious adversary model can be used, and we hope to formalize such connections in the near future. However, the absence of both a counterexample to directly using ℓ0-sketching structures and a proof of their correctness suggests that some ideas are still missing for the min-degree problem.

## 4 SETH-Hardness of Computing Min-Degree Orderings

We showed in Section 3.2 that, assuming the strong exponential time hypothesis (SETH), the minimum fill degree of a partially eliminated graph cannot be computed in O(m^{4/3 − θ}) time, for any θ > 0. In this section, we extend this result to show that an exact nearly-linear time algorithm for computing min-degree elimination orderings is unlikely. In particular, our main hardness result is:

###### Theorem 4.1.

Assuming SETH, for any θ > 0, there does not exist an O(m^{4/3 − θ}) time algorithm for producing a min-degree elimination ordering.

The main idea of our construction is to modify the bipartite graph from Section 3.2 so that any minimum degree ordering necessarily eliminates all of the dimension-vertices before any vector vertex. This allows us to use a minimum degree ordering on the graph to efficiently solve an OrthogonalVectors instance. The main bottleneck in our initial approach is that dimension-vertices can have degree as large as Ω(n), so forcing them to be removed first is difficult. We address this by breaking each such vertex apart into several vertices of degree O(√n), using the following construction, which we call a covering set system.

###### Lemma 4.2.

Given any positive integer n, we can construct in O(n^{3/2}) time a covering set system of the integers {1, 2, …, n}. This system is a collection of subsets S_1, S_2, …, S_k ⊆ {1, 2, …, n} such that:

1. The number of subsets is k = O(n).

2. The cardinality |S_i| = O(√n), for all 1 ≤ i ≤ k.

3. For each pair i, j ∈ {1, 2, …, n}, there exists a subset S_ℓ such that i, j ∈ S_ℓ.

Next we pad each of the vertices in the original construction with additional edges to ensure that they are eliminated after the vertices introduced by the covering set systems. We outline this construction in Figure 2.

###### Lemma 4.3.

Let G = (V, E) be the graph produced by the construction in Figure 2 for an instance of OrthogonalVectors with n vectors of dimension d. We have |V| = O(nd) and |E| = O(n^{3/2} d).

###### Proof.

The number of vertices in G is

 |V| = 20√n + n + d · O(n) = O(nd).

Similarly, an upper bound on the number of edges in G is

 |E| = (20√n choose 2) + 20√n · n + d · 10√n · O(n) = O(n^{3/2} d),

where the terms on the left-hand side of the final equality correspond to the edges contained in S, the edges between S and V_vec, and the edges between V_vec and the covering set vertices, respectively. ∎

###### Lemma 4.4.

Consider a graph G constructed from an OrthogonalVectors instance as described in Figure 2. For any min-degree ordering of G, the first vertices to be eliminated are exactly those in V_dim. The fill degree of the next eliminated vertex then determines the answer to the OrthogonalVectors instance.

###### Proof.

Let the graph be G = (V, E), where V is partitioned into S ∪ V_vec ∪ V_dim as described in Figure 2. Initially, every vertex in S has degree at least 20√n. For every vertex v_vec ∈ V_vec we have

 deg(v_vec) = 20√n + |E(v_vec, V_dim)| ≥ 20√n,

and for every vertex v_dim ∈ V_dim we have

 deg(v_dim) ≤ 10√n.

Pivoting out a vertex in V_dim does not increase the degree of any other vertex in V_dim, because no two vertices in V_dim are adjacent. As these vertices are pivoted, we still maintain

 deg(v) ≥ 20√n

for all v ∈ S ∪ V_vec. Therefore, the first |V_dim| vertices to be pivoted must be exactly the vertices in V_dim. After all vertices in V_dim have been pivoted, the fill degree of the next eliminated vertex determines the answer to the OrthogonalVectors instance: either a vertex witnessing an orthogonal pair is eliminated, or all remaining vertices have strictly larger fill degree. ∎

###### Proof of Theorem 4.1..

Suppose for some θ > 0 there exists an O(m^{4/3 − θ}) time algorithm for MinDegreeOrdering. Construct the graph G with covering sets as described in Figure 2. For d = O(log^2 n), it follows from Lemma 4.3 that |V| = O(n log^2 n) and |E| = O(n^{3/2} log^2 n). Therefore, by the assumption, we can obtain a min-degree ordering of G in time

 O(m^{4/3 − θ}) = O((n^{3/2} log^2 n)^{4/3 − θ}) = O(n^{2 − θ′})

for some constant θ′ > 0. By Lemma 4.4, the state of the elimination process after the first |V_dim| vertices have been pivoted is essentially identical to the partially eliminated state from Lemma 3.3. Then by Lemma 2.1, we can compute the fill degree of the next vertex to be eliminated in O(m) time. Checking this degree against the threshold from Lemma 4.4 allows us to solve OrthogonalVectors in O(n^{2 − θ′}) time, which contradicts SETH. ∎

It remains to efficiently construct the covering set systems defined in Lemma 4.2, which we can interpret as a strategy for covering all the edges of the complete graph K_n with O(n) cliques of size O(√n). We also note that our construction of covering set systems is related to existence results for covering problems with fixed-size subgraphs [CCLW13, CY98].

###### Proof of Lemma 4.2..

Let p be the smallest prime such that p ≥ √n. Bertrand’s postulate asserts that p ≤ 2√n, so we can compute p in O(n) time. Clearly {1, 2, …, n} ⊆ {1, 2, …, p^2}, so it suffices to find a covering set system for {1, 2, …, p^2}. Map the elements of {1, 2, …, p^2} to the coordinates of a p × p array in the canonical way, so that

 1 ↦ (0, 0), 2 ↦ (0, 1), …, p^2 ↦ (p − 1, p − 1).

For all a, b ∈ {0, 1, …, p − 1}, define

 D(a, b) := {(x, y) ∈ {0, 1, …, p − 1}^2 : y ≡ ax + b (mod p)}

to be the diagonal subsets of the array, and for all a ∈ {0, 1, …, p − 1}, define

 R(a) := {(x, y) ∈ {0, 1, …, p − 1}^2 : x = a}

to be the row subsets of the array. Let the collection of subsets in the system be all of the D(a, b) together with all of the R(a). This construction clearly satisfies the first two conditions. To verify the third, consider any two distinct elements i, j ∈ {1, 2, …, p^2} with array coordinates (x_1, y_1) and (x_2, y_2). If x_1 = x_2, then both lie in R(x_1). Otherwise, (x_1, y_1) and (x_2, y_2) are solutions to the line

 y ≡ (y_1 − y_2)(x_1 − x_2)^{−1} · (x − x_1) + y_1 (mod p),

so the third condition is satisfied. ∎
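The construction in this proof can be carried out directly. A sketch (0-indexed elements for convenience; the prime search and element map are our own implementation choices):

```python
import math

def covering_set_system(n):
    """Build the row and diagonal subsets of a p x p array (p prime,
    p^2 >= n) so that every pair of elements of {0, ..., n-1} lies in
    a common subset of size at most p."""
    def is_prime(q):
        return q >= 2 and all(q % r for r in range(2, math.isqrt(q) + 1))

    p = math.isqrt(max(n - 1, 0)) + 1      # smallest p with p^2 >= n
    while not is_prime(p):                 # Bertrand: found before 2*sqrt(n)-ish
        p += 1

    def cell(x, y):                        # canonical map into the p x p array
        return x * p + y

    subsets = [{cell(a, y) for y in range(p)} for a in range(p)]       # rows R(a)
    subsets += [{cell(x, (a * x + b) % p) for x in range(p)}
                for a in range(p) for b in range(p)]                   # lines D(a, b)
    # restrict back to the first n elements
    return [{e for e in s if e < n} for s in subsets], p
```

Two cells in the same column of the array share a row subset; otherwise the unique line through them (slopes are invertible mod p) supplies a common diagonal subset.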

## 5 Sketching Based Algorithms for Computing Degrees

Let us recall a few relevant definitions from Section 2 for convenience. For a given vertex elimination sequence

 u(1), u(2), …, u(n),

let G_t denote the fill graph obtained by pivoting the vertices u(1), …, u(t), and let δ_t denote the minimum degree of a vertex in G_t. An ℓ0-sketch data structure consists of the following:

• Each remaining vertex u generates a random number x_u ∈ [0, 1).

• Each remaining vertex u computes the vertex with the minimum x-value among its neighbors in the fill graph and itself (which we call the minimizer of u).
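A single copy of this structure can be sketched as follows, with a static adjacency list standing in for the dynamic fill graph:

```python
import random

def l0_sketch_copy(adj, seed=0):
    """One copy of the l0-sketch: each vertex u draws x_u ~ U[0, 1),
    and the minimizer of u is the vertex with the smallest x-value in
    N(u) | {u}."""
    rng = random.Random(seed)
    x = {u: rng.random() for u in sorted(adj)}    # one random value per vertex
    minimizer = {u: min(set(adj[u]) | {u}, key=lambda v: x[v]) for u in adj}
    return x, minimizer
```

By symmetry, every member of N(u) ∪ {u} is equally likely to be the minimizer of u, which is what makes independent copies informative about |N(u)|.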

In this section we show that if an ℓ0-sketch data structure can be maintained efficiently for a dynamic graph, then we can use a collection of independent copies of this data structure to find the vertex with minimum fill degree at each step and pivot it out. Combining this with data structures for efficiently propagating sketch values from Section 8 gives a faster algorithm for computing minimum degree orderings. We use this technique in three different settings.

First, we consider the case where the minimum degree at each step is bounded. In this case, we choose a fixed number of copies of the ℓ0-sketch data structure and look at the minimizers over all the copies.

###### Theorem 5.1.

There is an algorithm DeltaCappedMinDegree that, when given a graph whose lexicographically-first min-degree ordering has minimum degree bounded by Δ at every step, outputs this ordering with high probability in expected O(mΔ log^3 n) time and O(mΔ log n) space.

Next, we eliminate the condition on the minimum degrees and allow the time and space bounds of the algorithm to be output sensitive. In this case, we adaptively increase the number of copies of the ℓ0-sketch data structure.

###### Theorem 5.2.

There is an algorithm OutputSensitiveMinDegree that, when given a graph with a lexicographically-first min-degree ordering, outputs this ordering with high probability, with expected time and space bounds that depend on the largest minimum degree encountered during the elimination.

Lastly, we modify the algorithm to compute the approximate minimum degree at each step. In this case, we use O(ϵ^{−2} log n) copies of the data structure and use the reciprocal of a fixed percentile (the ⌈k/e⌉-ranked value) among the values of its minimizers as an effective approximation of the vertex degree.

###### Theorem 5.3.

There is a data structure ApproxDegreeDS that supports the following two operations:

1. ApproxDegreeDS_Pivot(u), which pivots a remaining vertex u.

2. ApproxDegreeDS_Report(), which provides balanced binary search tree (BST) containers

 V_1, V_2, …, V_B

such that all vertices in the bucket V_i have degrees in the range

 [(1 + ϵ)^{i − 2}, (1 + ϵ)^{i + 2}].

The memory usage of this data structure is O(m ϵ^{−2} log n). Moreover, if the pivots are picked independently of the randomness used in this data structure (i.e., we work under the oblivious adversary model), then:

1. The total cost of all the calls to ApproxDegreeDS_Pivot is nearly-linear in m.

2. The cost of each call to ApproxDegreeDS_Report is polylogarithmic.

### 5.1 Computing Exact Min-Degree

We consider the case where the minimum degree in each of the fill graphs is at most Δ. In this case, we maintain k = O(Δ log n) copies of the ℓ0-sketch data structure. By a coupon-collector argument, any vertex with degree at most Δ then sees all of its distinct neighbors among its minimizers with high probability. This implies that for each fill graph, we can obtain the exact min-degree vertex with high probability. Figure 3 gives a brief description of the data structures we maintain for this version of the algorithm.
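The coupon-collector argument can be checked concretely on a static graph. The sketch below (our own, with a deliberately generous copy count) recovers exact degrees by counting distinct minimizers:

```python
import random

def exact_degrees_via_sketches(adj, copies=300, seed=0):
    """Each copy draws fresh x-values; a vertex of degree d sees each
    member of N(u) | {u} as its minimizer with probability 1/(d+1), so
    over many copies the distinct minimizers enumerate N(u) | {u} w.h.p."""
    seen = {u: set() for u in adj}
    for c in range(copies):
        rng = random.Random(seed + c)
        x = {u: rng.random() for u in sorted(adj)}
        for u in adj:
            seen[u].add(min(set(adj[u]) | {u}, key=lambda v: x[v]))
    # |N(u) | {u}| - 1 = deg(u)
    return {u: len(seen[u]) - 1 for u in adj}
```

With k = O(Δ log n) copies, the probability that some neighbor is never a minimizer is at most n^{−O(1)}, mirroring the proof of Lemma 5.5 below.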

Note that if we can efficiently maintain the data structures in Figure 3, then simply finding the minimum element of the BST of minimizer-set sizes gives us the vertex with minimum degree. Theorem 5.4 shows that these data structures can indeed be maintained efficiently.

###### Theorem 5.4.

Given i.i.d. random variables x_u associated with each vertex u, there is a data structure that, for each vertex u, maintains the vertex with minimum x-value among u and its neighbors in the fill graph. This data structure supports the following methods:

• A query method, which returns the minimizer of a remaining vertex u.

• A pivot method, which pivots out a remaining vertex u and returns the list of all remaining vertices whose minimizers have changed just after this pivot.

The memory usage of this data structure is linear in the number of edges. Moreover, for any choice of x-values for the vertices:

1. The total cost of all the pivots is nearly-linear in m.

2. The total size of all the lists returned by the pivot operations is nearly-linear in m.

This theorem relies on data structures described in Section 8, so we defer the proof to the end of that section.

Now consider a vertex w with fill degree d. By symmetry of the x-values, each vertex in N_fill(w) ∪ {w} is the minimizer of w with probability 1/(d + 1). As a result, maintaining O(Δ log n) copies of the ℓ0-sketch data structure ensures that we have an accurate estimate of the minimum fill degree. The pseudocode for this routine is given in Figure 4. The probability guarantees are formalized in Lemma 5.5, which is essentially a restatement of [Coh97, Theorem 2.1].

###### Lemma 5.5.

For a remaining vertex w with fill degree d, with high probability we have

 bst_size_of_minimizers[w]=d.
###### Proof.

The only case where bst_size_of_minimizers[w] ≠ d is when at least one neighbor of w is not chosen as a minimizer in any copy. Let w′ be an arbitrary neighbor of w in the fill graph. The probability of w′ not being chosen in any of the k copies is at most

 (1 − 1/d)^k.

Now, using the assumption that d ≤ Δ and k = O(Δ log n), we have

 Pr_{x_1, x_2, …, x_n ∼ [0,1)}[w′ not selected in any copy] ≤ (1 − 1/Δ)^{O(Δ log n)} ≤ e^{−O(log n)} ≤ n^{−O(1)}.

Using a union bound over all neighbors, we can upper bound the probability that at least one of them is left out by

 |N_fill(w)| · n^{−O(1)} ≤ n^{−O(1)},

which completes the proof. ∎

###### Proof of Theorem 5.1..

We prove the space bound first. By Theorem 5.4, each of the k = O(Δ log n) copies of the minimizer data structure uses O(m) memory. Each copy of the associated BSTs takes at most O(n log k) space, and the BST of minimizer-set sizes uses at most O(n log n) space. Therefore, the total space used is

 O(mk + nk log k + n log n) = O(mk) = O(mΔ log n).

We now analyze the running time. Theorem 5.4 bounds the total cost of all pivots within a single copy, and summing this across all k copies gives the dominant term of the claimed running time. Furthermore, the same theorem implies that the total length of the update lists across all steps of a single copy is nearly-linear in m. Each of these updates may lead to one BST update of cost O(log n), so the total overhead is a lower order term. ∎

### 5.2 Output-Sensitive Running Time

If we do away with the condition that the minimum fill degrees are bounded above by Δ, the number of copies of the ℓ0-sketch data structure needed depends on the actual values of the minimum fill degree at each step. Therefore, to be more efficient, we adaptively maintain the required number of copies of the ℓ0-sketch data structure.

For the fill graph after t pivots, we need roughly Ω(δ_t log n) copies of the ℓ0-sketch data structure, where δ_t is its minimum degree. However, we do not know the values δ_t a priori. Therefore, consider the following scheme that adaptively keeps a sufficient number of copies of the sketch structures:

1. Let k be the current degree cap, initially a constant. We will ensure that we have O(k log n) copies at all times. (Note that this is initially true.)

2. Let d be the “computed” minimum degree obtained from the current copies of the data structure.

3. If d > k, set k ← 2k and repeat.

The core idea of the above routine is that if the “computed” min-degree is at most k, then with high probability the actual min-degree is at most k. Then, because we have O(k log n) copies of the data structure, the correctness of the algorithm follows.
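The doubling scheme reduces to a small control loop. A sketch, where `estimate_min_degree` and `add_copies` are our own placeholder interfaces for the sketch-based estimator and the copy pool:

```python
def adaptive_copy_count(estimate_min_degree, add_copies, k0=1):
    """Double the degree cap k (and hence the number of sketch copies
    kept) until the computed minimum degree is certified by the cap."""
    k = k0
    add_copies(k)                    # ensure the initial pool of copies
    while True:
        d = estimate_min_degree(k)   # min degree from the current copies
        if d <= k:
            return d, k              # cap large enough: reliable w.h.p.
        k *= 2
        add_copies(k)                # grow the pool of sketch copies
```

Because k only doubles, the total work is dominated by the final cap, which is within a constant factor of the largest minimum degree actually encountered.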

###### Proof of Theorem 5.2..

The proof follows analogously to that of Theorem 5.1, except that our upper bound for the minimum degrees is now, up to a constant factor, the largest minimum degree encountered during the elimination. With this, the claimed space and time bounds follow. ∎

### 5.3 Computing Approximate Min-Degree

To avoid artificial conditions, such as bounds on the minimum fill degree, and to make running times independent of the output, we modify the algorithm to obtain an approximate min-degree vertex at each step. To do this, we reduce the number of copies of the sketch structure to O(ϵ^{−2} log n) and use the reciprocal of the ⌈k/e⌉-ranked value of a set of minimizers to approximate its size. (Note that we use e to refer to the base of the natural logarithm; it should not be confused with edges in a graph.)

However, there is a subtle issue with the randomness that this algorithm uses. A necessary condition for the algorithm to succeed as intended is that each step must be independent of its past decisions. Therefore, we must remove any dependencies between previous and current queries. Section 3.4 gives an example of such a correlation between steps of an algorithm. To circumvent this problem, we need to decorrelate the sketches we construct and the updates to the data structure from pivoting vertices. Section 6 tackles this issue. Rather than simply selecting a vertex with approximate min-degree, this algorithm requires access to all vertices whose estimated degree is within a certain range of values. It follows that this version of the algorithm utilizes such a data structure, as opposed to the previous two versions which just output the vertex to be pivoted.

Figure 5 gives a description of the data structures for this version of the algorithm.

To achieve our goal of using fewer copies of the data structure, we use a sampling-based algorithm. In particular, we make use of the following lemma.

###### Lemma 5.6.

Suppose that we have k = O(ϵ^{−2} log n) copies of the ℓ0-sketch data structure, for some 0 < ϵ < 1. Let w be a vertex with fill degree d, and let q(w) denote the ⌈k/e⌉-ranked largest value among the x-values of the minimizers of w. Then, with high probability, we have

 (1 − ϵ)/d ≤ q(w) ≤ (1 + ϵ)/d.

Lemma 5.6 is simply a restatement of [Coh97, Propositions 7.1 and 7.2]. However, [Coh97] assumes that the random variables are drawn from the exponential distribution (and hence so is their minimum), whereas we assume that each x_u is drawn independently from the uniform distribution. When the number of variables is large, though, the minima of samples from the two distributions are almost identically distributed. For the sake of completeness, we provide the proof for uniformly distributed variables in Appendix A.
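A quick simulation illustrates the uniform-variable version of this estimator. The ⌈k/e⌉ rank follows the footnote's use of e; treat the exact constants and interface as our own assumptions:

```python
import math
import random

def estimate_size_from_minima(d, k=2000, seed=3):
    """Each of k sketch copies records the minimum of d i.i.d. U[0,1)
    draws. Since P(min > t) = (1 - t)^d ~ e^(-t*d), the ceil(k/e)-th
    largest minimum concentrates near 1/d, so its reciprocal estimates d."""
    rng = random.Random(seed)
    mins = sorted((min(rng.random() for _ in range(d)) for _ in range(k)),
                  reverse=True)
    q = mins[math.ceil(k / math.e) - 1]   # the ceil(k/e)-th largest minimum
    return 1.0 / q
```

The fraction of copies whose minimum exceeds q is 1/e, so (1 − q)^d ≈ e^{−1}, giving q ≈ 1/d; standard quantile concentration then yields the (1 ± ϵ) bounds of Lemma 5.6.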

This leads to the following result for providing implicit access to all vertices with approximately the same degree, which is crucial for our overall nearly-linear time algorithm in Section 6. We give its pseudocode in Figure 6.

When interacting with ApproxDegreeDS, note that the maximum degree is at most n − 1, so the number of buckets satisfies B = O(ϵ^{−1} log n). Therefore, the data structure can simply return pointers to “samplers” for the partition V_1, V_2, …, V_B.

###### Proof of Theorem 5.3..

By construction, all vertices in V_i have their ⌈k/e⌉-ranked quantile in the range

 [(1 + ϵ)^{−i − 1}, (1 + ϵ)^{−i}].

Subsequently, from Lemma 5.6, the fill-degree of such a vertex is in the range

 [(1 − ϵ)(1 + ϵ)^{i}, (1 + ϵ)(1 + ϵ)^{i + 1}],

which is within the claimed range for V_i.

The proof of the time and space bounds is again analogous to that of Theorem 5.1. Substituting in the new number of copies, O(ϵ^{−2} log n), proves the space complexity.

The main difference in these data structures is that we now need to store information about the ⌈k/e⌉-ranked quantile. Such quantile queries can be supported in O(log n) time by augmenting the balanced binary search trees with information about the sizes of subtrees in standard ways (e.g., [CLRS09, Chapter 14]). An O(log n) time splitting operation is also standard for most binary search tree data structures (e.g., treaps [SA96]). ∎

Note that there may be some overlap between the allowed ranges of the buckets, so the bucket membership of vertices near the boundaries may be ambiguous.

An immediate corollary of Theorem 5.3 is that we can provide access to approximate minimum-degree vertices for a fixed sequence of updates by always returning some entry from the first non-empty bucket.

###### Corollary 5.7.

For a fixed sequence of pivots, we can find (1 + ϵ)-approximate min-degree vertices in each of the intermediate states in nearly-linear total time.

## 6 Generating Decorrelated Sequences

In this section we show our nearly-linear time (1 + ϵ)-approximate min-degree algorithm. The algorithm crucially uses the ApproxDegreeDS data structure constructed in Section 5.3.

###### Theorem 6.1.

There is an algorithm ApproxMinDegreeSequence that produces a (1 + ϵ)-approximate greedy min-degree sequence in expected O(m log^5 n · ϵ^{−2}) time with high probability.

The algorithm is based on the degree approximation routines using sketching, as described in Theorem 5.3, which provides access to vertex buckets where the i-th bucket contains vertices with fill degrees in the range [(1 + ϵ)^{i − 2}, (1 + ϵ)^{i + 2}]. At any point, reporting any member of the first non-empty bucket gives an approximate minimum degree choice. However, such a choice must not depend on the randomness used to generate this step, or, more importantly, subsequent steps.

To address this issue, we use an additional layer of randomization, which decorrelates the ℓ0-sketch data structures from the choice of vertex pivots. Figure 7 contains the pseudocode for the top-level algorithm that computes a (1 + ϵ)-approximate minimum degree sequence in nearly-linear time. The algorithm makes calls to the following routines and data structures:

• ApproxDegreeDS: Access to buckets of vertices with approximately equal degrees (Section 5.3).

• ExpDecayedCandidates: Takes a set of vertices whose degree estimates are within a (1 + ϵ) factor of each other, randomly perturbs its elements with exponential noise, and returns this (Exp-decayed) set.

• EstimateDegree: Gives a (1 ± ϵ)-approximation to the fill-degree of any given vertex (Section 7). The formal statement is given in Theorem 6.2.
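One step of the selection rule, combining the pieces above, can be sketched as follows. The routine names mirror the bullets, but the interfaces, the noise constant, and the bucket representation are our own assumptions:

```python
import math
import random

def pick_decayed_pivot(buckets, estimate_degree, eps, n, seed=0):
    """Take the first non-empty bucket of near-minimal-degree vertices,
    perturb each candidate's degree estimate by a 1 - eps*Exp(1)/C factor
    with C = O(log n), and return the minimizing vertex."""
    rng = random.Random(seed)
    C = math.log(n + 2)
    candidates = next(b for b in buckets if b)      # first non-empty bucket
    best, best_key = None, math.inf
    for u in sorted(candidates):
        decay = 1.0 - eps * rng.expovariate(1.0) / C
        key = decay * estimate_degree(u, eps / C)   # perturbed degree estimate
        if key < best_key:
            best, best_key = u, key
    return best
```

The perturbation randomness lives outside the ℓ0-sketches, which is precisely what makes the resulting pivot sequence look fixed ("pseudo-deterministic") to the sketch data structures.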

###### Theorem 6.2.

There is a data structure that maintains a component graph under (adversarial) vertex pivots in nearly-linear total time, and supports the operation