1 Introduction
The typical approach in modeling a real-world problem as a computational problem has, broadly speaking, two steps: (i) abstracting the problem into a mathematical formulation which captures the crux of the real-world problem, and (ii) asking for a best solution to the mathematical problem.
Consider the following scenario. Dr. organizes a panel discussion and has a shortlist of candidates to invite. From that shortlist, Dr. wants to invite as many candidates as possible, such that each of them will bring an individual contribution to the panel. Given two candidates on the shortlist, it may not be beneficial to invite both of them, for various reasons: their areas of expertise or opinions may be too similar for both to make a distinguishable contribution, or it may be preferable not to invite more than one person from each institution. It may even be the case that the two candidates do not see eye to eye on some issues which could come up at the discussion, and Dr. wishes to avoid a confrontation.
A natural mathematical model to resolve Dr.’s dilemma is as an instance of the Vertex Cover problem: each candidate on the shortlist corresponds to a vertex, and for each pair of candidates, we add an edge between the corresponding vertices if it is not beneficial to invite both of them. Removing a smallest vertex cover in the resulting graph results in a largest possible set of candidates such that each of them may be expected to individually contribute to the appeal of the event.
Formally, a vertex cover of an undirected graph is any subset of its vertex set such that every edge of the graph has at least one endpoint in the subset. The Vertex Cover problem asks for a vertex cover of the smallest size:
Vertex Cover  

Input:  Graph . 
Solution:  A vertex cover of of the smallest size. 
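To make the definition concrete, here is a minimal brute-force sketch (exponential time, for illustration only; the function names are ours):

```python
from itertools import combinations

def is_vertex_cover(cover, edges):
    """A vertex set is a cover if every edge has at least one endpoint in it."""
    return all(u in cover or v in cover for u, v in edges)

def smallest_vertex_cover(vertices, edges):
    """Exhaustive search over subsets in order of increasing size."""
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            if is_vertex_cover(subset, edges):
                return set(subset)

# A 4-cycle a-b-c-d-a: {a, c} and {b, d} are its smallest vertex covers.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
print(sorted(smallest_vertex_cover(["a", "b", "c", "d"], edges)))  # → ['a', 'c']
```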
While the above model does provide Dr. with a set of candidates to invite that is valid in the sense that each invited candidate can be expected to make a unique contribution to the panel, a vast amount of side information about the candidates is lost in the modeling process. This side information could have helped Dr. to get more out of the panel discussion. For instance, Dr. may have preferred to invite more well-known or established people over ‘newcomers’, if they wanted the panel to be highly visible and prestigious; or they may have preferred to have more ‘newcomers’ in the panel, if they wanted the panel to have more outreach. Other preferences that Dr.
may have had include: to have people from many different cultural backgrounds, to have equal representation of genders, or preferential representation for affirmative action; to have a variety in the levels of seniority among the attendees, possibly skewed in one way or the other. Other factors, such as the total carbon footprint caused by the participants’ travels, may also be of interest to Dr.
This list could go on and on. Now, it is possible to plug some of these factors into the mathematical model, for instance by including weights or labels. Thus a vertex weight could indicate ‘how well-established’ a candidate is. However, the complexity of the model grows fast with each additional criterion. The classic field of multicriteria optimization [46, 39, 29, 36] addresses the issue of bundling multiple factors into the objective function, but it is seldom possible to arrive at a balance in the various criteria in a way which captures more than a small fraction of all the relevant side information. Moreover, several side criteria may be conflicting or incomparable (or both); consider in Dr.’s case ‘maximizing the number of different cultural backgrounds’ vs. ‘minimizing total carbon footprint.’
While Dr.’s story is admittedly a made-up one, the Vertex Cover problem is in fact used to model conflict resolution in far more realistic settings. In each case there is a conflict graph whose vertices correspond to entities between which one wishes to avoid a conflict of some kind. There is an edge between two vertices in the conflict graph if and only if the corresponding entities could be in conflict, and finding and deleting a smallest vertex cover yields a largest conflict-free subset of entities. We describe three examples to illustrate the versatility of this model. In each case it is intuitively clear, just as in Dr.’s problem, that formulating the problem as Vertex Cover results in a lot of significant side information being thrown away, and that while finding a smallest vertex cover in the conflict graph will give a valid solution, it may not really help in finding a best solution, or even a reasonably good one. We list some side information that is lost in the modeling process; the reader should find it easy to come up with any number of other pieces of side information that would be of interest, in each case.
 Air traffic control.

Conflict graphs are used in the design of decision support tools for aiding Air Traffic Controllers (ATCs) in preventing untoward incidents involving aircraft [44, 45, 25]. Each node in the conflict graph is an aircraft, and there is an edge between two nodes if the corresponding aircraft are at risk of interfering with each other. A vertex cover of this graph corresponds to a set of aircraft which can be issued resolution commands asking them to change course, such that afterwards there is no risk of interference.
In a situation involving a large number of aircraft it is unlikely that every choice of ten aircraft to redirect is equally desirable. For instance, in general it is likely that (i) it is better to ask smaller aircraft to change course in preference to larger craft, and (ii) it is better to ask aircraft which are cruising to change course, in preference to those which are taking off or landing.
 Wireless spectrum allocation.

Conflict graphs are a standard tool for figuring out how to distribute wireless frequency spectrum among a large set of wireless devices so that no two devices whose usage could potentially interfere with each other are allotted the same frequencies [22, 24]. Each node in the conflict graph is a user, and there is an edge between two nodes if (i) the users request the same frequency, and (ii) their usage of the same frequency has the potential to cause interference. A vertex cover of this graph corresponds to a set of users whose requests can be denied, such that afterwards there is no risk of interference.
When there is a large collection of devices vying for spectrum, it is unlikely that every choice of ten devices to deny the spectrum is equally desirable. For instance, it is likely that denying the spectrum to a remote-controlled toy car on the ground is preferable to denying the spectrum to a drone in flight.
 Managing inconsistencies in database integration.

A database constructed by integrating data from different data sources may end up being inconsistent (that is, violating specified integrity constraints) even if the constituent databases are individually consistent. Handling these inconsistencies is a major challenge in database integration, and conflict graphs are central to various approaches for restoring consistency [37, 26, 12, 3, 13]. Each node in the conflict graph is a database item, and there is an edge between two nodes if the two items together form an inconsistency. A vertex cover of this graph corresponds to a set of database items in whose absence the database achieves consistency.
In a database of large size it is unlikely that all data are created equal; some database items are likely to be of better relevance or usefulness than others, and so it is unlikely that every choice of ten items to delete is equally desirable.
Getting back to our first example, it seems difficult to help Dr. with their decision by employing the ‘traditional’ way of modeling computational problems, where one looks for one best solution. If, on the other hand, Dr. were presented with a small set of good solutions that in some sense are far apart, then they might handpick the list of candidates that they consider the best choice for the panel and make a more informed decision. Moreover, several forms of side information may only become apparent once Dr. is presented with some concrete alternatives, and are more likely to be retrieved from alternatives that look very different. That is, a handful of good-quality, dissimilar solutions may end up capturing a lot of the “lost” side information. And this applies to each of the other three examples as well. In each case, finding one best solution could be of little utility in solving the original problem, whereas finding a small set of solutions, each of good quality, which are not too similar to one another may offer much more help.
To summarize, real-world problems typically have complicated side constraints, and the optimality criterion may not be clear. Therefore, the abstraction to a mathematical formulation is almost always a simplification, omitting important side information. There are at least two obstacles to simply adapting the model by incorporating these secondary criteria into the objective function or taking into account the side constraints: (i) they make the model complicated and unmanageable, and (ii) more importantly, these criteria and constraints are often not precisely formulated, and are potentially even unknown a priori. There may even be no sharp distinction between optimality criteria and constraints (the so-called “soft constraints”).
One way of dealing with this issue is to present a small number of good solutions and let the user choose between them, based on all the experience and additional information that the user has and that is ignored in the mathematical model. Such an approach is useful even when the objective can be formulated precisely, but is difficult to optimize: After generating solutions, each of which is good enough according to some quality criterion, they can be compared and screened in a second phase, evaluating their exact objective function or checking additional side constraints. In this context, it makes little sense to generate solutions that are very similar to each other and differ only in a few features. It is desirable to present a diverse variety of solutions.
It should be clear that the issue is scarcely specific to Vertex Cover. Essentially any computational problem motivated by practical applications likely has the same issue: the modeling process throws out so much relevant side information that an algorithm which finds just one optimal solution to an input instance may not be of much use in solving the original problem in practice. One scenario where the traditional approach to modeling computational problems fails completely is when computation must be combined with a human sense of aesthetics or intuition to solve a task, or even to stimulate inspiration. Some early relevant work is on the problem of designing a tool which helps an architect in creating a floor plan which satisfies a specified set of constraints. In general, the feasible floor plans—those which satisfy constraints imposed by the plot on which the building has to be erected, various regulations which the building should adhere to, and so on—would be too numerous for the architect to look at one by one. Further, many of these plans would be very similar to one another, so that it would be pointless for the architect to look at more than one of them for inspiration. As an alternative to optimization for such problems, Galle proposed a “Branch & Sample” algorithm for generating a “limited, representative sample of solutions, uniformly scattered over the entire solution space” [21].
The Diverse Paradigm.
Mike Fellows has proposed the Diverse Paradigm as a solution for these issues and others [19]. In this paradigm, one takes an optimization problem and studies the complexity—specifically, the fixed-parameter tractability—of the problem of finding a few different good-quality solutions for it. Contrast this with the traditional approach of looking for just one good-quality solution. Consider an optimization problem where one looks for a minimum-size subset of some set; Vertex Cover is an example of such a problem. The generic form of such a problem is then:
Input:  An instance of . 

Solution:  A solution of of the smallest size. 
Here the form that a “solution” takes is dictated by the problem at hand; compare this with the earlier definition of Vertex Cover.
The diverse variant of the problem, as proposed by Fellows, has the form
Diverse  

Input:  An instance of , and positive integers . 
Parameter:  
Solution:  A set of solutions of , each of size at most , such that a diversity measure of is at least . 
Note that one can construct diverse variants of other kinds of problems as well, following this model: the problem doesn’t have to be a minimization problem, nor does the solution have to be a subset of some kind. Indeed, the example about floor plans described above has neither of these properties. What is relevant is that one should have (i) some notion of “good quality” solutions (for a minimization problem such as Vertex Cover, this equates to a small size) and (ii) some notion of a set of solutions being “diverse”.
Diversity measures.
The concept of diversity appears also in other fields, and there are many different ways to measure the diversity of a collection. For example, in ecology, the diversity of a set of species (“biodiversity”) is a topic that has become increasingly important in recent times, see for example Solow and Polasky [41].
Another possible viewpoint, in the context of multicriteria optimization, is to require that the sample of solutions should try to represent the whole solution space. This concept can be quantified for example by the geometric volume of the represented space [28, 10], or by the discrepancy [34]. See [43, Section 3] for an overview of diversity measures in multicriteria optimization.
In this paper, we follow the simple possibility of looking for a collection of good solutions that have large distances from each other, in a sense that will be made precise below in (1)–(2). Direction (2), i.e., taking the sum of all pairwise Hamming distances, has been taken by many practical papers in the area of genetic algorithms, see e.g. [20, 33]. This now-classical approach can be traced as far back as 1992 [32]. In [47], it has been boldly stated that this measure (and its variations) is one of the most broadly used measures of population diversity within genetic algorithms. One of its advantages is that it can be computed easily and efficiently, unlike many other measures, e.g., some geometry- or discrepancy-based measures.
1.1 Our problems and results.
In this work we focus on diverse versions of two minimization problems, Hitting Set and Feedback Vertex Set, whose solutions are subsets of a finite set. Hitting Set is in fact a class of such problems which includes Vertex Cover, as we describe below. We will consider two natural diversity measures for these problems: the minimum Hamming distance between any two solutions, and the sum of pairwise Hamming distances of all the solutions.
The Hamming distance between two sets $A$ and $B$, or the size of their symmetric difference, is $d_H(A, B) = |A \triangle B| = |(A \setminus B) \cup (B \setminus A)|$.
We use
$\min_{1 \le i < j \le k} d_H(S_i, S_j)$ (1)
to denote the minimum Hamming distance between any pair of sets in a collection $S_1, \dots, S_k$ of finite sets, and
$\sum_{1 \le i < j \le k} d_H(S_i, S_j)$ (2)
to denote the sum of all pairwise Hamming distances. (In Section 5, we will discuss some issues with the latter formulation.)
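In a minimal sketch (function names ours), the two measures read:

```python
from itertools import combinations

def hamming(A, B):
    """Hamming distance between two sets: the size of their symmetric difference."""
    return len(A ^ B)

def div_min(sets):
    """Measure (1): minimum Hamming distance over all pairs in the collection."""
    return min(hamming(A, B) for A, B in combinations(sets, 2))

def div_sum(sets):
    """Measure (2): sum of Hamming distances over all pairs in the collection."""
    return sum(hamming(A, B) for A, B in combinations(sets, 2))

S = [{1, 2}, {2, 3}, {1, 4}]
print(div_min(S), div_sum(S))  # → 2 8
```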
A feedback vertex set of a graph is any subset of its vertex set such that the graph obtained by deleting the vertices of the subset is a forest; that is, it contains no cycle.
Feedback Vertex Set  

Input:  A graph . 
Solution:  A feedback vertex set of of the smallest size. 
More generally, a hitting set of a collection of subsets of a universe is any subset of the universe that has a nonempty intersection with every set in the family. For a fixed positive integer d, the Hitting Set problem asks for a smallest hitting set of a family of subsets of a finite universe, each subset of size at most d:
Hitting Set  

Input:  A finite universe and a family of subsets of it, each of size at most d. 
Solution:  A hitting set of of the smallest size. 
Observe that both Vertex Cover and Feedback Vertex Set are special cases of finding a smallest hitting set for a family of subsets. Vertex Cover is also an instance of Hitting Set, with d = 2: the universe is the vertex set of the input graph and the family consists of one two-element set per edge, namely the set of the edge’s endpoints. There is no obvious way to model Feedback Vertex Set as a Hitting Set instance with a fixed bound d, however, because the cycles in the input graph can be arbitrarily long.
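A quick illustration of the Vertex Cover–to–Hitting Set translation (function names ours):

```python
def is_hitting_set(H, family):
    """H hits the family if it intersects every set in it."""
    return all(H & S for S in family)

def edges_as_family(edges):
    """Vertex Cover as Hitting Set with d = 2: one two-element set per edge."""
    return [set(e) for e in edges]

family = edges_as_family([("a", "b"), ("b", "c"), ("c", "d")])
print(is_hitting_set({"b", "c"}, family))  # → True
print(is_hitting_set({"a", "d"}, family))  # → False: the set {b, c} is not hit
```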
In this work, we consider the following problems in the Diverse paradigm. Using the sum of all pairwise Hamming distances (2) as the diversity measure, we consider Diverse Hitting Set and Diverse Feedback Vertex Set, obtained from Hitting Set and Feedback Vertex Set, respectively. Using the minimum pairwise Hamming distance (1) as the diversity measure, we consider MinDiverse Hitting Set and MinDiverse Feedback Vertex Set, obtained likewise.
In each case we show that the problem is fixed-parameter tractable (FPT), with the following running times:
Theorem 1.
Diverse Hitting Set can be solved in time .
Theorem 2.
Diverse Feedback Vertex Set can be solved in time .
Theorem 3.
MinDiverse Hitting Set can be solved in time

if and

otherwise.
Theorem 4.
MinDiverse Feedback Vertex Set can be solved in time .
Defining the diverse versions Diverse Vertex Cover and MinDiverse Vertex Cover of Vertex Cover in a similar manner as above, we get
Corollary 5.
Diverse Vertex Cover can be solved in time . MinDiverse Vertex Cover can be solved in time

if and

otherwise.
Related Work.
The parameterized complexity of finding a diverse collection of good-quality solutions to algorithmic problems seems to be largely unexplored. To the best of our knowledge, the only existing work in this area consists of: (i) a privately circulated manuscript by Fellows [19] which introduces the Diverse Paradigm and makes a forceful case for its relevance, and (ii) a manuscript by Baste et al. [5] which applies the Diverse Paradigm to vertex problems with the treewidth of the input graph as an extra parameter. In this context a vertex problem is any problem in which the input contains a graph and the solution is some subset of the vertex set which satisfies some problem-specific properties. Both Vertex Cover and Feedback Vertex Set are vertex problems in this sense, as are many other graph problems. The treewidth of a graph is, informally put, a measure of how tree-like the graph is. See, e.g., [14, Chapter 7] for an introduction to the use of the treewidth of a graph as a parameter in designing algorithms. The work by Baste et al. [5] shows how to convert essentially any treewidth-based dynamic programming algorithm for solving a vertex problem into an algorithm for computing a diverse set of solutions for the problem, with the diversity measure being the sum of Hamming distances of the solutions. This latter algorithm is FPT in the combined parameter that includes the treewidth of the input graph. As a special case, they obtain a running time of for Diverse Vertex Cover. Further, they show that the Diverse versions (i.e., where the diversity measure is the sum of Hamming distances) of a handful of problems have polynomial kernels. In particular, they show that Diverse Vertex Cover has a kernel with vertices, and that Diverse Hitting Set has a kernel with a universe size of .
Organization of the rest of the paper.
In Section 2 we list some definitions which we use in the rest of the paper. In Section 3 we describe a generic framework which can be used for computing solution families of maximum diversity for a variety of problems whose solutions form subsets of some finite set. We prove Theorem 1 in Section 3.3 and Theorem 2 in Section 4. In Section 5 we discuss some potential pitfalls in using the sum of pairwise Hamming distances as a measure of diversity. In Section 6 we prove Theorem 3 and Theorem 4. We conclude in Section 7.
2 Preliminaries
Given two integers and , we denote by the set of all integers such that holds. Given a graph , we denote by (resp. ) the set of vertices (resp. edges) of . For a subset we use to denote the subgraph of induced by , and for the graph . A set is a vertex cover (resp. a feedback vertex set) if has no edge (resp. no cycle). Given a graph and a vertex such that has exactly two neighbors, say and , contracting consists in removing the edges and , removing and adding the edge . Given a graph and a vertex , we denote by the degree of in . For two vertices in a connected graph we use to denote the distance between and in , which is the length of a shortest path in between and .
A deepest leaf in a tree is a vertex such that there exists a root satisfying . A deepest leaf in a forest is a deepest leaf in some connected component of . A deepest leaf has the property that there is another leaf in the tree at distance at most 2 from unless is an isolated vertex or ’s neighbor has degree 2.
3 A Framework for Maximally Diverse Solutions
In this section we describe a framework for computing solution families of maximum diversity for a variety of hitting set problems. This framework requires that the solutions form a family of subsets of a ground set which is upward closed: Any superset of a solution is also a solution.
The approach is as follows: In a first phase, we enumerate all minimal solutions of size at most the budget. (A larger class is also fine, as long as it is guaranteed to contain all minimal solutions within the budget.) Then we form all tuples of these minimal solutions. For each such tuple, we try to augment its entries with further elements of the ground set, subject to the size budget, in such a way that the diversity is maximized.
For this augmentation problem, we propose a network flow model that computes an optimal augmentation in polynomial time, see Section 3.1. This has to be repeated once for each tuple. The first step, the generation of the candidate solutions, is problem-specific. Section 3.3 shows how to solve it for Hitting Set. In Section 4, we will adapt our approach to deal with Feedback Vertex Set.
3.1 Optimal Augmentation
Given a universe and a set of subsets of it, the problem consists in finding a tuple that maximizes the diversity, over all tuples in which each entry respects the size budget and contains its prescribed base set.
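Before the flow model, the augmentation task can be illustrated by a tiny brute-force sketch (exponential time; the names and the explicit size budget `k_max` are ours):

```python
from itertools import combinations, product

def hamming_sum(tup):
    """Sum of pairwise Hamming distances of a tuple of sets."""
    return sum(len(A ^ B) for A, B in combinations(tup, 2))

def best_augmentation(universe, base_sets, k_max):
    """Try every way of growing each base set (staying within size k_max)
    and keep the tuple with the largest diversity. For illustration only."""
    def supersets(S):
        rest = universe - S
        for r in range(k_max - len(S) + 1):
            for extra in combinations(rest, r):
                yield S | set(extra)
    return max(product(*(list(supersets(S)) for S in base_sets)),
               key=hamming_sum)

# Both sets must keep element 1; the best augmentations extend them disjointly.
best = best_augmentation({1, 2, 3, 4}, [{1}, {1}], 3)
print(hamming_sum(best))  # → 3
```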
Theorem 6.
Let be a finite universe, and be two integers and be a set of subsets of . can be solved in time .
Proof.
The algorithm that proves Theorem 6 starts by enumerating all tuples of elements from the given collection. For each of these tuples, we try to augment each entry, using elements of the universe, in such a way that the diversity of the resulting tuple is maximized and the size constraints are respected. It is clear that this algorithm finds an optimal solution.
We show how to model this problem as a maximum-cost network flow problem with piecewise linear concave costs. This problem can be solved in polynomial time. (See for example [42] for basic notions about network flows.)
We use a variable to decide whether a given element of the universe should belong to a given set. In an optimal flow, these values are integral. Some of these variables are already fixed, because each augmented set must contain its base set:
(4) 
The size of each set must not exceed the budget:
(5) 
Finally, we can express the number of sets in which an element occurs:
(6) 
These variables are the variables in terms of which the objective function (3) is expressed:
(7) 
These constraints can be modeled by a network as shown in Figure 1. There are nodes representing the sets and a node for each element. In addition, there is a source and a sink. The arcs emanating from the source carry the size budgets as capacities. Together with the flow conservation equations at the set nodes, this models the constraints (5). Flow conservation at the element nodes gives rise to the flow variables on the arcs leading to the sink according to (6). The arcs with fixed flow (4) could be eliminated from the network, but for ease of notation, we leave them in the model. The only arcs that carry a cost are the arcs leading to the sink, and the costs are given by the concave function (7).
There is now a one-to-one correspondence between integral flows from source to sink in the network and solutions, and the cost of the flow is equal to the diversity (2) or (3). We are thus looking for a flow of maximum cost. The value of the flow (the total flow out of the source) can be arbitrary. (It is equal to the sum of the sizes of the sets.)
The concave arc costs (7) on the arcs leading to the sink can be modeled in a standard way by multiple arcs. Denote the concave cost function by c(y). Then each arc in the last layer is replaced by parallel arcs of capacity 1 with costs given by the successive marginal values c(1) − c(0), c(2) − c(1), and so on. This sequence of values is decreasing, starting out with positive values and ending with negative values. If the total flow along such a bundle is y, the maximum-cost way to distribute this flow is to fill the first y arcs to capacity, for a total cost of c(y), as desired.
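For the sum-of-pairwise-Hamming-distances measure, the per-element cost has a natural closed form: an element occurring in y of the k sets contributes y·(k − y), one pair for each set containing it paired with each set not containing it. A sketch verifying this identity and the decreasing marginals that justify the parallel-arc construction (function names ours; we take y·(k − y) as one natural instantiation of the concave cost (7)):

```python
from itertools import combinations

def div_sum(sets):
    """Sum of pairwise Hamming distances, measure (2)."""
    return sum(len(A ^ B) for A, B in combinations(sets, 2))

def div_sum_by_element(sets, universe):
    """Per-element view: an element occurring in y of the k sets
    contributes y * (k - y) pairs in which exactly one set has it."""
    k = len(sets)
    total = 0
    for e in universe:
        y = sum(e in S for S in sets)
        total += y * (k - y)
    return total

sets = [{1, 2}, {2, 3}, {3, 4}, {1, 4}]
assert div_sum(sets) == div_sum_by_element(sets, {1, 2, 3, 4}) == 16

# Marginal costs c(y) - c(y - 1) = k + 1 - 2y decrease in y, which is what
# allows the concave cost to be split into parallel unit-capacity arcs.
k = 4
marginals = [y * (k - y) - (y - 1) * (k - y + 1) for y in range(1, k + 1)]
print(marginals)  # → [3, 1, -1, -3]
```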
An easy way to compute a maximum-cost flow is the longest augmenting path method. (Commonly it is presented as the shortest augmenting path method for minimum-cost flow.) This holds for the classical flow model where the cost on each arc is a linear function of the flow. An augmenting path is a path in the residual network with respect to the current flow, and the cost coefficient of an arc in such a path must be taken with opposite sign if it is traversed in the direction opposite to the original graph.
Proposition 1 (The shortest augmenting path algorithm, cf. [42, Theorem 8.12]).
Suppose a maximum-cost flow among all flows of value from to is given. Let be a maximum-cost augmenting path from to . If we augment the flow along this path, this results in a new flow, of some value . Then the new flow is a maximum-cost flow among all flows of value from to .
Let us apply this algorithm to our network. We initialize the constrained flow variables according to (4) to 1 and all other variables to 0. This corresponds to the original solution, and it is clearly the optimal flow of its value, because it is the only feasible flow of this value.
We can now start to find augmenting paths. Our graph is bipartite, and augmenting paths have a very simple structure: They start in the source, alternate back and forth between the set nodes and the element nodes, and finally make a step to the sink. Moreover, in our network, all costs are zero except in the last layer, and an augmenting path contains precisely one arc from this layer. Therefore, the cost of an augmenting path is simply the cost of its final arc.
The flow variables in the final layer are never decreased. The resulting algorithm therefore has a simple greedy-like structure. Starting from the initial flow, we first try to saturate as many of the highest-cost arcs as possible. Next, we try to saturate as many of the arcs of the next-highest cost as possible, and so on. Once the incremental cost becomes negative, we stop.
Trying to find an augmenting path whose last arc is one of the arcs of a fixed cost is a reachability problem in the residual graph, and it can be solved by graph search in time linear in the size of the network. Every augmentation increases the flow value by 1 unit, which bounds the number of augmentations and gives the claimed total runtime. ∎
3.2 Faster Augmentation
We can obtain faster algorithms by using more advanced network algorithms from the literature. We will derive one such algorithm here. The best choice depends on the relation between the parameters. We will apply the following result about b-matchings, which are generalizations of matchings: each node v has a given supply b(v), specifying that v should be incident to at most b(v) edges.
Proposition 2 ([1]).
A maximumweight matching in a bipartite graph with nodes on the two sides of the bipartition and edges that have integer weights between and can be found in time .
We will describe below how the network flow problem from above can be converted into a matching problem with plus nodes and edges of weight at most . Plugging these values into Proposition 2 gives a running time of for finding an optimal augmentation. This improves over the running time from the previous section unless is extremely large (at least ).
From the network of Figure 1, we keep the two layers of nodes and . Each vertex gets a supply of , and each vertex gets a supply of . To mimic the piecewise linear costs on the arcs in the original network, we introduce parallel slack edges from a new source vertex to each vertex . The costs are as follows. Let with denote the costs in the last layer of the original network, and let . Since , this is larger than all costs. Then every edge from the original network gets a weight of , and the new slack edges entering each get positive weights . We set the supply of the extra source node to , which imposes no constraint on the number of incident edges.
Now suppose that we have a solution for the original network in which the total flow into an element vertex is given. In the corresponding matching, we can then use a corresponding number of the slack edges incident to that vertex. The maximum-weight slack edges have weights as defined above. The total weight of the edges incident to the vertex is therefore
using the equation . Thus, up to an addition of the constant , the maximum weight of a matching agrees with the maximum cost of a flow in the original network.
3.3 Diverse Hitting Set
In this section we show how to use the optimal augmentation technique developed in Section 3 to solve Diverse Hitting Set. For this we use the following folklore lemma about minimal hitting sets.
Lemma 7.
Let be an instance of Hitting Set, and let be an integer. There are at most inclusion-minimal hitting sets of size at most , and they can all be enumerated in time .
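The folklore branching argument behind such an enumeration can be sketched as follows (function names ours; for a family of sets of size at most d and budget k, the recursion has depth at most k and branching factor at most d, hence at most d^k leaves; the leaves may include some non-minimal hitting sets, which can be filtered out afterwards):

```python
def enum_hitting_sets(family, k, partial=frozenset()):
    """Branch on an arbitrary un-hit set: some element of it must join the
    hitting set. Every inclusion-minimal hitting set of size at most k
    appears among the leaves of this recursion."""
    unhit = next((S for S in family if not (S & partial)), None)
    if unhit is None:
        return {partial}            # every set in the family is hit
    if len(partial) == k:
        return set()                # budget exhausted on this branch
    out = set()
    for e in unhit:
        out |= enum_hitting_sets(family, k, partial | {e})
    return out

family = [{1, 2}, {2, 3}, {3, 4}]
for H in sorted(enum_hitting_sets(family, 2), key=sorted):
    print(sorted(H))  # → [1, 3] then [2, 3] then [2, 4]
```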
Theorem 1.
Diverse Hitting Set can be solved in time .
4 Diverse Feedback Vertex Set
A feedback vertex set (FVS) (also called a cycle cutset) of a graph is any subset of its vertices such that every cycle in the graph contains at least one vertex from the subset. The graph obtained by deleting an FVS is thus acyclic. Finding an FVS of small size is an NP-hard problem with a number of applications in Artificial Intelligence, many of which stem from the fact that many hard problems become easy to solve on acyclic graphs.
The Propositional Model Counting (or #SAT) problem asks for the number of satisfying assignments of a given CNF formula, and has a number of applications, for instance in planning [35, 17] and in probabilistic inference problems such as Bayesian reasoning [4, 11, 23, 15, 30, 40, 2]. A popular approach to solving #SAT consists of first finding a small FVS of a graph associated with the CNF formula. Assigning values to all the variables in the FVS results in an acyclic instance. The algorithm assigns all possible sets of values to the variables in the FVS, computes the number of satisfying assignments of the resulting acyclic instances, and returns the sum of these counts [16]. Other applications of finding a small FVS include faster sampling for Bayesian networks, solving constraint satisfaction problems, credulous and skeptical acceptance problems in abstract argumentation, and learning and inference in graphical models [8, 6, 9, 7, 18, 31].

In this section, we focus on the Diverse Feedback Vertex Set problem and prove the following theorem.
Theorem 2.
Diverse Feedback Vertex Set can be solved in time .
In order to solve Diverse Feedback Vertex Set, one natural approach would be to generate every feedback vertex set of size at most the budget and then check which collection of solutions provides the required sum of Hamming distances. Unfortunately, the number of such feedback vertex sets is not bounded by any function of the parameter alone. Indeed, a graph consisting of a single long cycle has one minimum feedback vertex set for each of its vertices.
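The cycle example can be checked by brute force (union-find cycle test; function names ours):

```python
from itertools import combinations

def is_fvs(edges, n, S):
    """Check whether deleting S from the graph on vertices 0..n-1 leaves a
    forest, using union-find to detect a cycle among the remaining edges."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for u, v in edges:
        if u in S or v in S:
            continue
        ru, rv = find(u), find(v)
        if ru == rv:
            return False                    # this edge closes a cycle
        parent[ru] = rv
    return True

# Every single vertex of a 6-cycle is a minimum feedback vertex set.
n = 6
cycle = [(i, (i + 1) % n) for i in range(n)]
min_fvss = [S for S in combinations(range(n), 1) if is_fvs(cycle, n, set(S))]
print(len(min_fvss))  # → 6
```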
We avoid this problem by generating all such small feedback vertex sets up to some equivalence of degree-two vertices. We obtain an exact and efficient description of all feedback vertex sets of size at most the budget, which is formally captured by Lemma 8. A class of solutions of a graph is a pair such that and is a function such that for each , , and for each , , . Given a class of solutions , we define . A class of FVS solutions is a class of solutions such that each is a feedback vertex set of . Moreover, if and , we say that is described by . Note that is also a feedback vertex set. In a class of FVS solutions , the meaning of the function is that, for each cycle in , there exists such that each element of hits . This allows us to group related solutions into a single class.
Lemma 8.
Let be a vertex graph. There exists a set of classes of FVS solutions of of size at most such that each feedback vertex set of size at most is described by an element of . Moreover, can be constructed in time .
Proof.
Let be a vertex graph. We start by generating a feedback vertex set of size at most . The currently best deterministic algorithm for this task, by Kociumaka and Pilipczuk [27], finds such a set in time . In the following, we use ideas from the iterative compression approach [38].
For each subset , we initiate a branching process by setting , , and . Observe that initially, as and , the graph has at most components. In the branching process, we will add more vertices to and , and we will remove vertices and edges from , but we will maintain the property that and . The set will always denote the vertex set . Note that is initially a forest; we ensure that it always remains a forest.
We also initialize a function by setting for each . This function will keep information about vertices that are deleted from . While searching for a feedback vertex set, we consider only feedback vertex sets that contain all vertices of but no vertex of . Vertices in are still undecided. The function will maintain the invariant that for each , , and for each , all vertices of intersect exactly the same cycles in . Moreover, for each , the value is fixed and will not be modified anymore in the branching process. During the branching process, we will progressively increase the size of , , and the sets , .
By reducing we mean that we apply the following rules exhaustively.

If there is a such that , we delete from .

If there is an edge such that , we contract in and set .
These are classical preprocessing rules for the Feedback Vertex Set problem; see for instance [14, Section 9.1]. Indeed, vertices of degree one cannot appear in a cycle, and consecutive vertices of degree two hit exactly the same cycles. After this preprocessing, there are no adjacent degree-two vertices and no degree-one vertices in . (Degrees are measured in .)
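A minimal sketch of these two reduction rules on a simple undirected graph follows; the helper name and representation are hypothetical, and for simplicity it skips contractions that would create parallel edges, which the full algorithm handles through the grouping function.

```python
def reduce_graph(adj):
    """Exhaustively apply the two FVS preprocessing rules.
    adj: dict mapping each vertex to the set of its neighbours.
    Returns a dict mapping each surviving vertex to the set of original
    vertices contracted into it (the classes of equivalent degree-two vertices).
    Simplification: contractions that would create parallel edges are skipped."""
    groups = {v: {v} for v in adj}
    changed = True
    while changed:
        changed = False
        # Rule 1: a vertex of degree at most one lies on no cycle; delete it.
        for v in [u for u in adj if len(adj[u]) <= 1]:
            for w in adj[v]:
                adj[w].discard(v)
            del adj[v], groups[v]
            changed = True
        # Rule 2: adjacent degree-two vertices hit exactly the same cycles;
        # contract the edge between them and merge their groups.
        for v in list(adj):
            if v not in adj or len(adj[v]) != 2:
                continue
            w = next((u for u in adj[v]
                      if len(adj[u]) == 2 and not (adj[v] & adj[u])), None)
            if w is None:
                continue
            for x in adj[w] - {v}:  # redirect w's other edges to v
                adj[x].discard(w)
                adj[x].add(v)
            adj[v] = (adj[v] | adj[w]) - {v, w}
            groups[v] |= groups[w]
            del adj[w], groups[w]
            changed = True
    return groups

# A 6-cycle shrinks to a triangle; its six vertices end up in three groups.
cycle = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
groups = reduce_graph(cycle)
print(len(cycle), sum(len(g) for g in groups.values()))
```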
We start to describe the branching procedure. We work on the tuple . After each step, the value will increase, where denotes the number of connected components of .
At each step of the branching we do the following. If or if contains a cycle, we immediately stop this branch, as there is no solution to be found in it. If is a feedback vertex set of size at most , then is a class of FVS solutions; we add it to and stop working on this branch. Otherwise, we reduce . We pick a deepest leaf in and apply one of the two following cases, depending on the vertex .

Case 1: The vertex has at least two neighbors in (in the graph ).
If there is a path in between two neighbors of , then we have to put in , as otherwise this path together with would induce a cycle. If there is no such path, we branch on both possibilities, inserting either into or into .

Case 2: The vertex has at most one neighbor in .
Since is a leaf in , it also has at most one neighbor in . On the other hand, we know that has degree at least 2 in . Thus, has exactly one neighbor in and one neighbor in , for a degree of in . Let be the neighbor in . Again, as we have reduced , the degree of in is at least . So either it has a neighbor in , or, as is a deepest leaf, it has another child, say , that is also a leaf in and therefore has a neighbor in . We branch on the at most possibilities to allocate , , and if considered, between and , taking care not to produce a cycle in .
In both cases, either we put at least one vertex in , and so increases by one, or all considered vertices are added to . In the latter case, the considered vertices are connected, at least two of them have a neighbor in , and no cycles were created; therefore, the number of components in drops by one. Thus increases by at least one. As , there can be at most branching steps.
Since we branch at most times and at each branch we have at most possibilities, the branching tree has at most leaves. So, for each of the at most subsets of , we add at most elements to .
It is clear that we have obtained all solutions of FVS and they are described by the classes of FVS solutions in , which is of size . ∎
Proof of Theorem 2.
We generate all tuples of the classes of solutions given by Lemma 8, with repetition allowed.
We now consider each tuple and try to pick an appropriate solution from each class of solutions , , in such a way that the diversity of the resulting tuple of feedback vertex sets is maximized. The network of Section 3.1 must be adapted to model the constraints resulting from solution classes. Let be a solution class, with . For our construction, we just need to know the family of disjoint nonempty vertex sets. The solutions that are described by this class are all sets that can be obtained by picking at least one vertex from each set . Figure 2 shows the necessary adaptations for one solution . In addition to a single node that is either directly or indirectly connected to all nodes , as in Figure 1, we have additional nodes representing the sets . For each vertex that appears in one of the sets , there is an additional node in an intermediate layer of the network. The flow from to is forced to be equal to 1, and this ensures that at least one element of the set is chosen in the solution. Here it is important that the sets are disjoint.
A similar structure must be built for each set , and all these structures share the vertices and . The rightmost layer of the network is the same as in Figure 1.
The initial flow is not as straightforward as in Section 3.1, but it is still easy to find. We simply saturate the arc from to each of the nodes in turn by a shortest augmenting path. Such a path can be found by a simple reachability search in the residual network, in time. The total running time from Section 3.1 remains unchanged. ∎
5 Modeling Aspects: Discussion of the Objective Function
In Sections 3 and 4, we have used the sum of the Hamming distances, , as the measure of diversity. While this metric is of natural interest, it appears that in some specific cases, it may not be a useful choice. We present a simple example where the most diverse solution according to is not what one might expect.
Let be an even number. We consider the path with vertices, and we are looking for vertex covers of size at most , of maximum diversity.
Figure 3 shows an example with . The smallest size of a vertex cover is indeed , and there are different solutions. One would hope that the “maximally diverse” selection of solutions would pick all of these different solutions. But no: the selection that maximizes consists of copies of just two solutions, the “odd” vertices and the “even” vertices (the first and last solution in Figure 3).
This can be seen as follows. If the selected set contains in total copies of the first solutions in the order of Figure 3, then the objective can be written as
Here, each term accounts for two consecutive vertices of the path in the formulation (3). The unique way of maximizing each term individually is to set for all . This corresponds to the selection of copies of the first solution and copies of the last solution, as claimed.
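This effect can be verified by brute force on the smallest instance. The helper names below are hypothetical, and the vertices of the path are numbered 1 to 4:

```python
from itertools import combinations, combinations_with_replacement

def vertex_covers(n, edges, size):
    """All vertex covers of exactly the given size, by brute force."""
    return [frozenset(c) for c in combinations(range(1, n + 1), size)
            if all(u in c or v in c for u, v in edges)]

def sum_pairwise_hamming(selection):
    return sum(len(a ^ b) for a, b in combinations(selection, 2))

# Path on 4 vertices; minimum vertex covers have size m = 2.
edges = [(1, 2), (2, 3), (3, 4)]
covers = vertex_covers(4, edges, 2)  # {1,3}, {2,3}, {2,4}

# Choose r = 4 covers with repetition, maximizing the sum of Hamming distances.
best = max(combinations_with_replacement(covers, 4), key=sum_pairwise_hamming)
print(sorted(sorted(s) for s in best), sum_pairwise_hamming(best))
# Optimum: two copies each of the "odd" cover {1,3} and the "even" cover {2,4};
# the middle cover {2,3} is never used.
```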
In a different setting, namely the distribution of points inside a square, an analogous phenomenon has been observed [43, Figure 1]: maximizing the sum of pairwise Euclidean distances places all points at the corners of the square. In fact, it is easy to see that, in this geometric setting, any locally optimal solution must place all points on the boundary of the feasible region. By contrast, for our combinatorial problem, we do not know whether this pathological behavior is typical or rare in instances that are not specially constructed. Further research is needed. A notion of diversity which is more robust in this respect is the smallest difference between two solutions, which we consider in Section 6.
6 Maximizing the Smallest Hamming Distance
The undesired behavior highlighted in Section 5 is the fact that the collection that maximizes the sum of the Hamming distances may use several copies of the same set. In this section, we explore how to handle this unexpected behavior by changing the measure to the minimum Hamming distance between two sets of the collection. This modification naturally removes the possibility of selecting the same solution twice. We show how to solve Min-Diverse Hitting Set and Min-Diverse Feedback Vertex Set for this metric.
Theorem 3.
Min-Diverse Hitting Set can be solved in time

if and

otherwise.
Proof.
Let be an instance of Min-Diverse Hitting Set where . If , we solve the problem by complete enumeration: there are trivially at most hitting sets of size at most . We form all tuples of them and select the one that maximizes . The running time is at most .
We now assume that . ∎
Theorem 4.
Min-Diverse Feedback Vertex Set can be solved in time .
Proof.
Let be a vertex graph. If , we again solve the problem by complete enumeration: There are trivially at most feedback vertex sets of size at most . We form all tuples of them and select the one that maximizes . The running time is at most .
We assume now that . As in Section 4, we construct a set of at most classes of FVS solutions of , using Lemma 8. Then we go through all tuples of classes . For each such tuple, we look for the tuple of feedback vertex sets such that each is described by , and the objective value is maximized. So far, the procedure is completely analogous to the algorithm of Theorem 2 in Section 4 for maximizing .
Now, in going from a class to , we have to select a vertex from every set , for , and we may add an arbitrary number of additional vertices, up to size . We make this selection as follows: whenever , we simply try all possibilities of choosing an element of and putting it into . If , we defer the choice until later. In this way, we have created at most “partial” feedback vertex sets.
For each such , we now add the remaining elements. For each list whose choice has been deferred, we greedily pick an element that is distinct from all other chosen elements. This is always possible since the list is large enough. Finally, we fill up the sets to size , again choosing fresh elements each time. Each such choice is optimal, because it increases the Hamming distance between the concerned set and every other set by 1, which is the best one can hope for. As we perform this operation for each , where , and for each such we create at most tuples, we obtain an algorithm running in time . The theorem follows. ∎
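The completion step can be sketched as follows, with a hypothetical helper; it assumes, as guaranteed in the proof, that the deferred lists and the vertex set are large enough to always provide fresh elements.

```python
def complete_sets(partial_sets, deferred_lists, universe, k):
    """Complete partial solutions with 'fresh' vertices: each pick is used in
    no other set, so it raises the Hamming distance to every other set by 1."""
    sets = [set(s) for s in partial_sets]
    for i, s in enumerate(sets):
        used = set().union(*[t for j, t in enumerate(sets) if j != i])
        # Serve the deferred lists: one fresh element from each large list.
        for lst in deferred_lists[i]:
            s.add(next(v for v in lst if v not in used and v not in s))
        # Fill up to size k with fresh vertices from the universe.
        fresh = (v for v in universe if v not in used and v not in s)
        while len(s) < k:
            s.add(next(fresh))
    return sets

# Two partial solutions, no deferred lists, filled up to size k = 3:
out = complete_sets([{1}, {2}], [[], []], range(1, 100), k=3)
print(out, out[0] & out[1])  # the completed sets are disjoint
```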
7 Conclusions and Open Problems
In this work, we have considered the paradigm of finding small diverse collections of reasonably good solutions to combinatorial problems, which has recently been introduced to the field of fixed-parameter tractability theory [5].
We have shown that finding diverse collections of hitting sets and feedback vertex sets can be done in
time. While these problems can be classified as
via the kernels and a treewidth-based meta-theorem proved in [5], the methods proposed here are of independent interest. We introduced a method of generating a maximally diverse set of solutions from a set that either contains all minimal solutions of bounded size (Hitting Set) or from a collection of structures that in some way describes all solutions of bounded size (Feedback Vertex Set). In both cases, the maximally diverse collection of solutions is obtained via a network flow model, which does not rely on any specific properties of the studied problems. It would be interesting to see if this strategy can be applied to give FPT algorithms for diverse problems that are not covered by the meta-theorem or the kernels presented in [5].
While the problems in [5], as well as the ones in Sections 3 and 4, seek to maximize the sum of all pairwise Hamming distances, we also studied the variant that asks to maximize the minimum Hamming distance, taken over each pair of solutions. This was motivated by an example where the former measure does not perform as intended (Section 5). We showed that also under this objective, the diverse variants of Hitting Set and Feedback Vertex Set are . It would be interesting to see whether this objective also allows for a (possibly treewidth-based) meta-theorem.
In [5], the authors ask whether there is a problem that is in parameterized by solution size whose diverse variant becomes hard upon adding as another component of the parameter. We restate this question here.
Question 9 ([5]).
Is there a problem with solution size , such that is parameterized by , while Diverse , asking for solutions, is hard parameterized by ?
To the best of our knowledge, this problem is still wide open. We believe that the measure is more promising for obtaining such a result than the measure. A possible way to tackle both measures at once might be a parameterized (and strengthened) analogue of the following approach, which is well-studied in classical complexity. Yato and Seta propose a framework [48] to prove completeness of finding a second solution to an complete problem. In other words, for some problems, given one solution it is still hard to determine whether the problem has a different solution.
From a different perspective, one might want to identify problems where obtaining one solution is polynomial-time solvable, but finding a diverse collection of solutions becomes hard. The targeted running time should be parameterized by (and maybe , the diversity target) only. We conjecture that this is most probably or hard in general. However, we believe it is interesting to search for well-known problems where this is not the case.
Acknowledgements.
The second, third and fourth authors would like to thank Mike Fellows for introducing them to the notion of diverse algorithms and for sharing the manuscript “The Diverse X Paradigm” [19].
References
 [1] Ravindra K. Ahuja, James B. Orlin, Clifford Stein, and Robert E. Tarjan. Improved algorithms for bipartite network flow. SIAM Journal on Computing, 23:906–933, 1994. doi:10.1137/S0097539791199334.
 [2] Udi Apsel and Ronen I. Brafman. Lifted MEU by weighted model counting. In Proceedings of the TwentySixth AAAI Conference on Artificial Intelligence, pages 1861–1867. AAAI Press, 2012.
 [3] Marcelo Arenas, Leopoldo Bertossi, Jan Chomicki, Xin He, Vijay Raghavan, and Jeremy Spinrad. Scalar aggregation in inconsistent databases. Theoretical Computer Science, 296(3):405–434, 2003. doi:10.1016/S0304-3975(02)00737-5.

 [4] Fahiem Bacchus, Shannon Dalmao, and Toniann Pitassi. Algorithms and complexity results for #SAT and Bayesian inference. In 44th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2003), pages 340–351. IEEE, 2003. doi:10.1109/SFCS.2003.1238208.
 [5] Julien Baste, Michael R. Fellows, Lars Jaffke, Tomáš Masařík, Mateus de Oliveira Oliveira, Geevarghese Philip, and Frances A. Rosamond. Diversity in combinatorial optimization, 2019. arXiv:1903.07410.
 [6] Bozhena Bidyuk and Rina Dechter. An anytime scheme for bounding posterior beliefs. In Proceedings of the 21st National Conference on Artificial Intelligence, Volume 2, pages 1095–1100. AAAI Press, 2006.
 [7] Bozhena Bidyuk and Rina Dechter. Cutset sampling with likelihood weighting. In Proceedings of the TwentySecond Conference on Uncertainty in Artificial Intelligence, pages 39–46. AUAI Press, 2006.
 [8] Bozhena Bidyuk and Rina Dechter. Cutset sampling for Bayesian networks. Journal of Artificial Intelligence Research, 28:1–48, 2007. doi:10.1613/jair.2149.
 [9] Bozhena Petrovna Bidyuk. Exploiting graph cutsets for samplingbased approximations in Bayesian networks. PhD thesis, University of California, Irvine, 2006.
 [10] Karl Bringmann, Sergio Cabello, and Michael T. M. Emmerich. Maximum volume subset selection for anchored boxes. In Boris Aronov and Matthew J. Katz, editors, 33rd International Symposium on Computational Geometry (SoCG 2017), volume 77 of Leibniz International Proceedings in Informatics (LIPIcs), pages 22:1–22:15, Dagstuhl, Germany, 2017. Schloss Dagstuhl–Leibniz-Zentrum für Informatik. doi:10.4230/LIPIcs.SoCG.2017.22.
 [11] Mark Chavira and Adnan Darwiche. On probabilistic inference by weighted model counting. Artificial Intelligence, 172(6–7):772–799, 2008. doi:10.1016/j.artint.2007.11.002.
 [12] Jan Chomicki and Jerzy Marcinkowski. Minimalchange integrity maintenance using tuple deletions. Information and Computation, 197(1–2):90–121, 2005. doi:10.1016/j.ic.2004.04.007.
 [13] Jan Chomicki and Jerzy Marcinkowski. On the computational complexity of minimal-change integrity maintenance in relational databases. In Inconsistency Tolerance, pages 119–150. Springer, 2005. doi:10.1007/978-3-540-30597-2_5.
 [14] Marek Cygan, Fedor V. Fomin, Lukasz Kowalik, Daniel Lokshtanov, Dániel Marx, Marcin Pilipczuk, Michal Pilipczuk, and Saket Saurabh. Parameterized Algorithms. Springer, 2015. doi:10.1007/978-3-319-21275-3.
 [15] Adnan Darwiche. The quest for efficient probabilistic inference. Invited Talk, IJCAI05, 2005.
 [16] Rina Dechter and David Cohen. Constraint Processing. Morgan Kaufmann, 2003.
 [17] Carmel Domshlak and Jörg Hoffmann. Fast probabilistic planning through weighted model counting. In Proceedings of the Sixteenth International Conference on Automated Planning and Scheduling, ICAPS 2006, pages 243–252. AAAI Press, 2006.
 [18] Wolfgang Dvořák, Sebastian Ordyniak, and Stefan Szeider. Augmenting tractable fragments of abstract argumentation. Artificial Intelligence, 186:157–173, 2012. doi:10.1016/j.artint.2012.03.002.
 [19] Michael Ralph Fellows. The diverse X paradigm. Manuscript, November 2018.

 [20] Thomas Gabor, Lenz Belzner, Thomy Phan, and Kyrill Schmid. Preparing for the unexpected: Diversity improves planning resilience in evolutionary algorithms. In 2018 IEEE International Conference on Autonomic Computing, ICAC 2018, Trento, Italy, September 3–7, 2018, pages 131–140, 2018. doi:10.1109/ICAC.2018.00023.
 [21] Per Galle. Branch & sample: A simple strategy for constraint satisfaction. BIT Numerical Mathematics, 29(3):395–408, 1989. doi:10.1007/BF02219227.
 [22] Sorabh Gandhi, Chiranjeeb Buragohain, Lili Cao, Haitao Zheng, and Subhash Suri. A general framework for wireless spectrum auctions. In 2007 2nd IEEE International Symposium on New Frontiers in Dynamic Spectrum Access Networks, pages 22–33. IEEE, 2007.
 [23] Carla P. Gomes, Ashish Sabharwal, and Bart Selman. Model counting: a new strategy for obtaining good bounds. In Proceedings of the 21st National Conference on Artificial Intelligence, Volume 1, pages 54–61. AAAI Press, 2006.
 [24] Martin Hoefer, Thomas Kesselheim, and Berthold Vöcking. Approximation algorithms for secondary spectrum auctions. ACM Transactions on Internet Technology (TOIT), 14(2–3):16:1–16:24, 2014. doi:10.1145/2663496.
 [25] M. Idan, G. Iosilevskii, and L. BenYishay. Efficient air traffic conflict resolution by minimizing the number of affected aircraft. International Journal of Adaptive Control and Signal Processing, 24(10):867–881, 2010.
 [26] Ekaterini Ioannou and Slawek Staworko. Management of inconsistencies in data integration. In Data Exchange, Integration, and Streams, volume 5, pages 217–225. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2013. doi:10.4230/DFU.Vol5.10452.217.
 [27] Tomasz Kociumaka and Marcin Pilipczuk. Faster deterministic feedback vertex set. Information Processing Letters, 114(10):556–560, 2014. doi:10.1016/j.ipl.2014.05.001.
 [28] Tobias Kuhn, Carlos M. Fonseca, Luís Paquete, Stefan Ruzika, Miguel M. Duarte, and José Rui Figueira. Hypervolume subset selection in two dimensions: Formulations and algorithms. Evolutionary Computation, 24(3):411–425, 2016. doi:10.1162/EVCO_a_00157.
 [29] G. Leitmann and A. Marzollo, editors. Handbook of Multicriteria Analysis. Springer, Berlin, 2010.

 [30] Michael L. Littman, Stephen M. Majercik, and Toniann Pitassi. Stochastic Boolean satisfiability. Journal of Automated Reasoning, 27(3):251–296, 2001. doi:10.1023/A:1017584715408.
 [31] Ying Liu and Alan Willsky. Learning Gaussian graphical models with observed or latent FVSs. In Advances in Neural Information Processing Systems, pages 1833–1841, 2013.
 [32] Sushil J. Louis and Gregory J. E. Rawlins. Syntactic analysis of convergence in genetic algorithms. In Proceedings of the Second Workshop on Foundations of Genetic Algorithms, Vail, Colorado, USA, July 26–29, 1992, pages 141–151, 1992. doi:10.1016/b978-0-08-094832-4.50015-5.
 [33] Ronald W. Morrison and Kenneth A. De Jong. Measurement of population diversity. In Artificial Evolution, 5th International Conference, Evolution Artificielle, EA 2001, Le Creusot, France, October 29–31, 2001, Selected Papers, pages 31–41, 2001. doi:10.1007/3-540-46033-0_3.
 [34] Aneta Neumann, Wanru Gao, Carola Doerr, Frank Neumann, and Markus Wagner. Discrepancybased evolutionary diversity optimization. In Proceedings of the Genetic and Evolutionary Computation Conference, GECCO ’18, pages 991–998, New York, NY, USA, 2018. ACM. doi:10.1145/3205455.3205532.
 [35] Héctor Palacios, Blai Bonet, Adnan Darwiche, and Héctor Geffner. Pruning conformant plans by counting models on compiled dDNNF representations. In Proceedings of the Fifteenth International Conference on Automated Planning and Scheduling, ICAPS 2005, pages 141–150. AAAI Press, 2005.
 [36] Jason Papathanasiou and Nikolaos Ploskas. Multiple Criteria Decision Aid: Methods, Examples and Python Implementations. Springer Nature, 2018.
 [37] Enela Pema, Phokion G. Kolaitis, and WangChiew Tan. On the tractability and intractability of consistent conjunctive query answering. In Proceedings of the 2011 Joint EDBT/ICDT Ph. D. Workshop, pages 38–44. ACM, 2011. doi:10.1145/1966874.1966881.
 [38] Bruce Reed, Kaleigh Smith, and Adrian Vetta. Finding odd cycle transversals. Operations Research Letters, 32(4):299–301, 2004. doi:10.1016/j.orl.2003.10.009.
 [39] Bernard Roy. Multicriteria Methodology for Decision Aiding. Kluwer, Dordrecht, 1996.
 [40] Tian Sang, Paul Bearne, and Henry Kautz. Performing Bayesian inference by weighted model counting. In Proceedings of the 20th National Conference on Artificial Intelligence, Volume 1, pages 475–481. AAAI Press, 2005.
 [41] Andrew R. Solow and Stephen Polasky. Measuring biological diversity. Environmental and Ecological Statistics, 1(2):95–103, Jun 1994. doi:10.1007/BF02426650.
 [42] Robert E. Tarjan. Data Structures and Network Algorithms. SIAM, Philadelphia, 1983. doi:10.1137/1.9781611970265.
 [43] Tamara Ulrich, Johannes Bader, and Lothar Thiele. Defining and optimizing indicator-based diversity measures in multiobjective search. In Robert Schaefer, Carlos Cotta, Joanna Kołodziej, and Günter Rudolph, editors, Parallel Problem Solving from Nature, PPSN XI, pages 707–717, Berlin, Heidelberg, 2010. Springer. doi:10.1007/978-3-642-15844-5_71.
 [44] Adan Vela, JohnPaul Clarke, Eric Feron, Nicolas Durand, and William Singhose. Determining the value of information for minimizing controller taskload: a graphbased approach. In ATM Seminar 2011, 9th USA/Europe Seminar on ATM R&D, 2011.
 [45] Adan Ernesto Vela. Understanding conflictresolution taskload: implementing advisory conflictdetection and resolution algorithms in an airspace. PhD thesis, Georgia Institute of Technology, 2011.
 [46] Philippe Vincke. Multicriteria Decisionaid. Wiley, Chichester, 1992.
 [47] Mark Wineberg and Franz Oppacher. The underlying similarity of diversity measures used in evolutionary computation. In Genetic and Evolutionary Computation – GECCO 2003, Genetic and Evolutionary Computation Conference, Chicago, IL, USA, July 12–16, 2003, Proceedings, Part II, pages 1493–1504, 2003. doi:10.1007/3-540-45110-2_21.
 [48] Takayuki Yato and Takahiro Seta. Complexity and completeness of finding another solution and its application to puzzles. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 86(5):1052–1060, 2003.