In this paper, we study Maximum Happy Vertices and Maximum Happy Edges. Both problems were recently introduced by Zhang and Li in 2015 [zhang2015algorithmic], motivated by a study of algorithmic aspects of the homophyly law in large networks. Informally, they paraphrase the law as "birds of a feather flock together". The law states that in social networks people are more likely to connect with people who share similar interests. A social network is represented by a graph, where each vertex corresponds to a person in the network, and an edge between two vertices denotes that the corresponding persons are connected within the network. Furthermore, we let vertices have colors assigned. The color of a vertex indicates the type, character or affiliation of the corresponding person in the network. An edge is called happy if its endpoints are colored with the same color. A vertex is called happy if all its neighbours are colored with the same color as the vertex itself. Equivalently, a vertex is happy if all edges incident to it are happy. Formally, both problems take as input a graph, a partial coloring of its vertices and an integer threshold, and ask whether the partial coloring can be extended to a full coloring under which at least that many vertices (for Maximum Happy Vertices) or edges (for Maximum Happy Edges) are happy.
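To make the notions of happy edges and happy vertices concrete, the following minimal Python sketch counts them for a fully colored graph. The function names and the edge-list representation are illustrative assumptions and not part of the formal definitions.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


def happy_edges(edges: Iterable[Edge], coloring: Dict[Vertex, int]) -> List[Edge]:
    """Return the edges whose two endpoints have the same color."""
    return [(u, v) for u, v in edges if coloring[u] == coloring[v]]


def happy_vertices(vertices: Iterable[Vertex], edges: Iterable[Edge],
                   coloring: Dict[Vertex, int]) -> List[Vertex]:
    """Return the vertices all of whose incident edges are happy."""
    unhappy = set()
    for u, v in edges:
        if coloring[u] != coloring[v]:
            # An unhappy edge makes both of its endpoints unhappy.
            unhappy.update((u, v))
    return [v for v in vertices if v not in unhappy]


# Toy instance (an assumption for illustration): the path a - b - c - d.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
coloring = {"a": 1, "b": 1, "c": 2, "d": 2}

print(happy_edges(edges, coloring))
print(happy_vertices(vertices, edges, coloring))
```

In this toy instance the edge between b and c is the only unhappy edge, so exactly its two endpoints are unhappy.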
Maximum Happy Edges has an immediate connection to Multiway Cut. Precisely, if each color is used in the precoloring exactly once, then Maximum Happy Edges is exactly the Multiway Uncut problem, i.e. the edge complement of Multiway Cut. Thus, Maximum Happy Edges is a generalization of the Multiway Uncut problem, and in this case the connection between clustering vertices by color and cutting edges in order to separate different colors is apparent. However, this is not the case for the vertex version of the problem, which we would like to connect with the vertex version of Multiway Cut, Node Multiway Cut.
Maximum Happy Vertices can be seen as a sort of clusterization problem, in which some vertices already have prescribed color/cluster and the goal is to identify colors/clusters of initially uncolored/unassigned vertices. In some sense, we would like to clusterize the graph in such a way that overall boundary of clusters is minimized. Here, by a boundary of a cluster we understand vertices of the cluster that are connected to vertices outside the cluster. While it is possible to straightforwardly formulate the problem in terms of a special cutting problem, this kind of formalization will sound complicated and unnatural. We show that MHV can be easily transformed into Node Multiway Cut, thereby constructing an additional bridge between clusterization and cutting problems.
Recently, MHV and MHE have attracted a lot of attention and have been studied from the parameterized [Agrawal2018, Aravind2016, Aravind2017, Choudhari2018, Misra2018] and approximation [zhang2015algorithmic, zhang2018improved, zhang2015improved, xu2016submodular] points of view, as well as from an experimental perspective [lewis2019finding]. Moreover, dozens of algorithms have been considered for the classical Multiway Cut problem, which is the complement of a special case of MHE.
In 2015, Zhang and Li established that -MHE and -MHV are NP-hard for , where is the number of colors used. Later, Aravind et al. [Aravind2016] showed that when the input graph is a tree, -MHV and -MHE can be solved in and in time respectively. In [Misra2018], Misra and Reddy proved NP-hardness of both MHV and MHE on split and on bipartite graphs, and showed that MHV is polynomial-time solvable on cographs.
From the approximation perspective, the currently best known results are the following. Zhang et al. [zhang2018improved] showed that MHV can be approximated within , where is the maximum degree of the input graph, and MHE can be approximated within , where . They also claimed that a more careful analysis can improve the approximation ratio for MHV to , where .
The known results in parameterized complexity (not including kernelization) are summarized in Table 1. Results proved in the paper are marked by in the table. Agrawal [Agrawal2018] provides -kernel for MHV, where is the number of used colors and is the number of desired happy vertices. Independently, Gao and Gao [gao2018kernelization] present a -kernel for the general case and a -kernel in the case of planar graphs. We provide a kernel on vertices for MHV parameterized by the distance to clique, partially answering a question in [Misra2018]. Note that the kernel sizes mentioned in this paragraph correspond to the number of vertices in the kernels.
|Distance to threshold graphs||?||?||[Choudhari2018]|
|Distance to clique||[Misra2018]|
|Distance to cluster||-hard C3||T3|
|Distance to cographs||-hard C3||?||-hard C3|
|Treewidth||[Aravind2017, Agrawal2018]||[Aravind2017, Misra2018]|
|Pathwidth||[Aravind2017, Misra2018]||[Aravind2017, Agrawal2018]|
|Feedback Vertex Set Number|
|Vertex Cover Number||[Misra2018]|
|Split Vertex Deletion Number||para-NP-hard [Misra2018]|
|Odd Cycle Transversal Number|
Our results: The main contributions of our work are the following.
We establish a natural connection between Maximum Happy Vertices on a graph and Node Multiway Cut on a second power of a certain subgraph of .
We answer questions in [Agrawal2018, Aravind2016] about the existence of an FPT algorithm for MHV parameterized by the treewidth of the input graph only.
Similarly, we answer one of the questions from Choudhari et al. [Choudhari2018] and Misra et al. [Misra2018] by showing W[1]-hardness of MHE parameterized by the cluster vertex deletion number. We show that MHV, in contrast to MHE, is in FPT when parameterized by the cluster vertex deletion number.
We partially answer a question stated by Misra and Reddy in [Misra2018]. We provide a kernel of size for MHV, where is the distance to cliques.
Among other results, we also present the first algorithm for Node Multiway Cut parameterized by the clique-width of the input graph.
Organization of the paper: Section 3 describes results under some structural and distance-to-triviality parameters. In Section LABEL:section:cut-relation we provide results connecting Node Multiway Cut and Maximum Happy Vertices. In Section LABEL:sec:structural-kernel we provide a polynomial kernel for MHV parameterized by the distance to clique. In Section 4 we show how to strengthen the results of -hardness and obtain the corresponding -hardness results.
Basic notation. We denote the set of positive integer numbers by . For each positive integer , by we denote the set of all positive integers not exceeding , . We use to denote an infinitely large number, for which holds and , where is an arbitrary integer. We use for the disjoint union operator, i.e. equals , with an additional constraint that and are disjoint.
We employ partial functions in our work. To denote a partial function from a set to a set , that is, a function that may leave some element of without an image in , we write . If a partial function maps an element to some element in , we say that is assigned. If is unassigned, we allow to extend by assigning the value of .
We use the traditional -notation for asymptotical upper bounds. We additionally use the -notation that hides polynomial factors. Many of our results concern the parameterized complexity of the problems, including fixed-parameter tractable algorithms, kernelization algorithms, and some hardness results for certain parameters. For a detailed survey in parameterized algorithms we refer to the book of Cygan et al. [cygan2015parameterized]. In their book one may also find definitions of pathwidth and treewidth that are considered as parameters in some of our results.
Throughout the paper, we use standard graph notation and terminology, following the book of Diestel [diestel2018graph]. All graphs in our work are undirected simple graphs. We consider several graph classes in our work. Interval graphs are graphs whose vertices can be represented as intervals on the real line, so that a pair of vertices are connected by an edge if and only if their representative intervals intersect. Cluster graphs are graphs that are a disjoint union of cliques, or, equivalently, graphs that do not contain induced paths on three vertices.
We often refer to the distance to parameter, where is an arbitrary graph class. For a graph , we say that a vertex subset is a modulator of , if becomes a member of after deletion of , i.e. . Then, the distance to parameter of is defined as the size of its smallest modulator.
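As an illustration of the distance-to-class parameter for the class of cluster graphs, the sketch below tests whether a graph is a cluster graph via the induced-path characterization above, checks whether a vertex subset is a modulator, and finds the distance to cluster by exhaustive search. The function names are hypothetical, and the brute-force search is only meant for tiny examples.

```python
from itertools import combinations


def is_cluster_graph(vertices, edges):
    """True iff the graph contains no induced path on three vertices."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for v in vertices:
        # Two non-adjacent neighbours of v form an induced path with v.
        for a, b in combinations(adj[v], 2):
            if b not in adj[a]:
                return False
    return True


def is_modulator(vertices, edges, subset):
    """True iff deleting `subset` leaves a cluster graph."""
    rest = [v for v in vertices if v not in subset]
    rest_edges = [(u, v) for u, v in edges
                  if u not in subset and v not in subset]
    return is_cluster_graph(rest, rest_edges)


def distance_to_cluster(vertices, edges):
    """Size of a smallest modulator to cluster (exponential brute force)."""
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            if is_modulator(vertices, edges, set(subset)):
                return size


# Toy instance (an assumption for illustration): the path a - b - c - d.
path = ["a", "b", "c", "d"]
path_edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(distance_to_cluster(path, path_edges))
```

Deleting b from the path leaves an isolated vertex and a single edge, which is a disjoint union of cliques.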
Graph colorings. When dealing with instances of Maximum Happy Vertices or Maximum Happy Edges, we use a notion of colorings. A coloring of a graph is a function that maps vertices of the graph to a set of colors. If this function is partial, we call such a coloring partial. If not stated otherwise, we use for the number of distinct colors, and assume that colors are integers in . A partial coloring is always given as a part of the input for both problems, along with the graph . We also call a precoloring of the graph , and use to denote the graph along with the precoloring. The goal of both problems is to extend this partial coloring to a specific coloring that maps each vertex to a color. We call a full coloring (or simply, a coloring) of that extends . We may also say that is a coloring of . For convenience, we introduce the notion of potentially happy vertices, both for full and partial colorings.
We call a vertex of potentially happy, if there exists a coloring of such that is happy with respect to . In other words, if and are precolored neighbours of , then (and , if is a precolored vertex). We denote the set of all potentially happy vertices in by .
By we denote the set of all potentially happy vertices in such that they are either precolored with color or have a neighbour precolored with color :
In other words, if a vertex is happy with respect to some coloring of , then necessarily .
Note that if is a full coloring of a graph , then is equal to the number of vertices in that are happy with respect to .
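The definition of potentially happy vertices can be checked mechanically. The sketch below, with hypothetical function names, computes the set of potentially happy vertices of a precolored graph and its restriction to a single color, assuming the graph is given as a dictionary of neighbour sets and the precoloring as a dictionary defined only on precolored vertices.

```python
def precolored_colors_around(adj, precoloring, v):
    """Colors that the precoloring uses on v and on its neighbours."""
    return {precoloring[u] for u in adj[v] | {v} if u in precoloring}


def potentially_happy(adj, precoloring):
    """Vertices whose closed neighbourhood is precolored with at most one color."""
    return {v for v in adj
            if len(precolored_colors_around(adj, precoloring, v)) <= 1}


def potentially_happy_with_color(adj, precoloring, color):
    """Potentially happy vertices that are precolored with `color`
    or have a neighbour precolored with `color`."""
    return {v for v in potentially_happy(adj, precoloring)
            if precolored_colors_around(adj, precoloring, v) == {color}}


# Toy instance (an assumption for illustration): the path a - b - c,
# with a precolored 1 and c precolored 2.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
precoloring = {"a": 1, "c": 2}
print(potentially_happy(adj, precoloring))
```

Here b sees two distinct precolored neighbours and therefore cannot be happy under any extension of the precoloring.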
Clique-width. Among other structural parameters, we consider clique-width in our work. We follow definitions presented by Lackner et al. in their work on Multicut parameterized by clique-width [Lackner2012].
To define clique-width, we need to define -expressions first. For any , a -expression describes a graph , whose vertices are labeled with integers in . -expressions and their corresponding graphs are defined recursively. Depending on its topmost operator, a -expression is of one of the following four types.
Introducing a vertex. , where is a label and is a vertex. is a graph consisting of a single vertex with label , i.e. .
Disjoint union. , where and are smaller subexpressions. is a disjoint union of the graphs and , i.e. and . The labels of the vertices remain the same.
Renaming labels. . The structure of remains the same as the structure of , but each vertex with label receives label .
Introducing edges. . is obtained from by connecting each vertex with label with each vertex with label .
Clique-width of a graph is then defined as the smallest value of needed to describe with a -expression and is denoted as . To avoid confusion with the parameter of MHV and MHE, we may use notation of -expression instead of -expression.
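The four operators can be illustrated by a small evaluator. In the sketch below, expressions are encoded as nested tuples; this encoding and the operator names "vertex", "union", "rename" and "join" are our own assumptions, chosen only for illustration.

```python
def evaluate(expr):
    """Evaluate an expression into (labels, edges).

    labels maps each vertex to its current label,
    edges is a set of two-element frozensets.
    """
    op = expr[0]
    if op == "vertex":                      # introduce a labeled vertex
        _, label, v = expr
        return {v: label}, set()
    if op == "union":                       # disjoint union
        labels1, edges1 = evaluate(expr[1])
        labels2, edges2 = evaluate(expr[2])
        if labels1.keys() & labels2.keys():
            raise ValueError("operands of a disjoint union share a vertex")
        return {**labels1, **labels2}, edges1 | edges2
    if op == "rename":                      # relabel i -> j
        _, i, j, sub = expr
        labels, edges = evaluate(sub)
        return {v: (j if lab == i else lab) for v, lab in labels.items()}, edges
    if op == "join":                        # connect label i with label j
        _, i, j, sub = expr
        if i == j:
            raise ValueError("join requires two distinct labels")
        labels, edges = evaluate(sub)
        new = {frozenset((u, v))
               for u, a in labels.items() if a == i
               for v, b in labels.items() if b == j}
        return labels, edges | new
    raise ValueError(f"unknown operator: {op}")


# The path a - b - c built with three labels.
ab = ("join", 1, 2, ("union", ("vertex", 1, "a"), ("vertex", 2, "b")))
path_expr = ("join", 1, 2, ("union", ("rename", 1, 3, ab), ("vertex", 1, "c")))
labels, edges = evaluate(path_expr)
print(labels, edges)
```

The rename step moves a out of label 1 before c is introduced, so the second join connects c only to b.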
There is still no known -algorithm for finding a -expression of a given graph . However, there is an -algorithm that decides that or outputs -expression of . For more details on clique-width we refer to [Hlineny2007].
3 Structural and distance-to-triviality parameters
In [Agrawal2018], Agrawal proved that Maximum Happy Vertices is W[1]-hard with respect to the standard parameter, the number of happy vertices. In [Aravind2016, Misra2018, Choudhari2018] some structural parameters for MHV and MHE were studied. In [Agrawal2018], Agrawal also asked whether MHV admits an FPT algorithm when parameterized by the treewidth of the input graph alone. In this section, we show that both MHV and MHE are W[1]-hard with respect to certain distance-to-triviality and structural parameters, including treewidth, answering the question of Agrawal and some other questions. We start with the definition of a classical W[1]-complete (with respect to the solution size) problem.
Maximum Happy Vertices is W[1]-hard when parameterized by the distance to graphs that are a disjoint union of paths consisting of three vertices.
We reduce from Regular Multicolored Independent Set, which is W[1]-complete when parameterized by the solution size [Belmonte2013].
Let be an instance of Regular Multicolored Independent Set, and let be the degree of every vertex in , i.e. for any . We assume that for each , since otherwise the instance can be trivially reduced to an instance with a smaller . We construct an instance of Maximum Happy Vertices as follows.
We set , so each color corresponds to a unique vertex of . For convenience, we use vertices of as colors, instead of the numbers in .
For each edge , we introduce a path on three vertices , , in , with being the middle vertex of the path. Endpoint vertices and are precolored in colors and respectively, i.e. and , and the middle vertex is left uncolored.
We then introduce a selection gadget in . That is, we introduce uncolored vertices . For each and each color , we connect with each vertex precolored in color . Thus, a vertex becomes connected to exactly one vertex of the selection gadget , where is such that . The purpose of the selection gadget is that the color of in the optimal coloring corresponds to a vertex that we should take in in the initial instance of Regular Multicolored Independent Set.
We finally set and argue that is a yes-instance of Regular Multicolored Independent Set if and only if is a yes-instance of Maximum Happy Vertices.
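The construction above can be sketched in a few lines of Python. The vertex names ("end", "mid", "sel"), the helper functions and the toy instance are our own assumptions; the target value, the number of parts times the common degree, is our reading of the counting argument used later in the proof, not a formula quoted from the text.

```python
def build_mhv_instance(parts, edges):
    """parts: list of vertex lists of G; edges: list of pairs (u, v) of G.
    Colors of the new instance are the vertices of G."""
    adj, precoloring = {}, {}

    def add_edge(x, y):
        adj.setdefault(x, set()).add(y)
        adj.setdefault(y, set()).add(x)

    part_of = {v: i for i, part in enumerate(parts) for v in part}
    for u, v in edges:
        end_u, mid, end_v = ("end", u, v, u), ("mid", u, v), ("end", u, v, v)
        add_edge(end_u, mid)              # three-vertex path for the edge uv
        add_edge(mid, end_v)
        precoloring[end_u] = u            # endpoints are precolored,
        precoloring[end_v] = v            # the middle vertex stays uncolored
        add_edge(("sel", part_of[u]), end_u)   # selection gadget edges
        add_edge(("sel", part_of[v]), end_v)
    return adj, precoloring


def coloring_from_independent_set(adj, precoloring, parts, chosen):
    """Extend the precoloring as in the forward direction of the proof."""
    coloring, chosen = dict(precoloring), set(chosen)
    for i, part in enumerate(parts):
        (pick,) = chosen & set(part)      # exactly one chosen vertex per part
        coloring[("sel", i)] = pick
    for x in adj:
        if x[0] == "mid":
            _, u, v = x
            coloring[x] = u if u in chosen else v
    return coloring


def count_happy(adj, coloring):
    return sum(all(coloring[y] == coloring[x] for y in adj[x]) for x in adj)


# Toy 2-regular instance: parts {a, b} and {c, d}, each part a clique.
parts = [["a", "b"], ["c", "d"]]
g_edges = [("a", "b"), ("c", "d"), ("a", "c"), ("b", "d")]
adj, pre = build_mhv_instance(parts, g_edges)
full = coloring_from_independent_set(adj, pre, parts, {"a", "d"})
print(count_happy(adj, full))
```

In this toy instance the multicolored independent set {a, d} makes all four endpoint vertices precolored with a or d happy, which matches two parts times degree two.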
Let be a multicolored independent set of , i.e. is an independent set in and for each . Let us construct a coloring of such that it extends and at least vertices of are happy with respect to . For each , set the color of to , i.e. , where . For each edge , set the color of to , if , or to , if , and to an arbitrary color otherwise. Formally, , if , and , if . If , then can be assigned an arbitrary color. Note that either or , since is an independent set.
has no other uncolored vertex, thus the construction of is complete.
For each vertex and each edge incident to , is happy with respect to .
Proof of Claim 0.
Indeed, is adjacent to exactly two vertices: , where , and . Since , and by construction of . is a vertex precolored with color , hence is happy with respect to .
For each , there are exactly edges adjacent to , hence all vertices precolored with color are happy. , hence at least vertices of are happy with respect to .
It is left to prove that if is a yes-instance of Maximum Happy Vertices, then is a yes-instance of Regular Multicolored Independent Set.
Let be an arbitrary coloring of extending . There are at most happy vertices in with respect to . Moreover, all happy vertices are precolored vertices of at most distinct colors.
Proof of Claim 0.
Observe that for each , is unhappy with respect to any coloring extending , since neighbours of are precolored with colors in , and each color is presented exactly times among its neighbours, and we assumed that consists of at least two vertices.
For each , is adjacent to exactly two vertices and , which are precolored with two distinct colors and . Thus, is also unhappy with respect to any coloring extending .
Hence, only precolored vertices of can be happy, i.e. vertices for . Each of them is adjacent to exactly one vertex of the selector gadget, i.e. vertex for some . But for each , only the neighbours that share the same color as can be happy. Thus, each happy vertex shares a color with one of vertices of the selection gadget. Since each color is presented exactly times in the partial coloring , there can be at most such happy vertices.
Let be a coloring of extending such that at least vertices of are happy with respect to . According to Claim 3, exactly vertices of are happy with respect to , and they are precolored with different colors. Moreover, for each color, all precolored vertices of this color are happy. Let be the set of these colors, i.e. . We argue that is an independent set in . Note that is then automatically satisfied, as is a clique in for each .
If there are happy vertices among the vertices of type in with respect to coloring , that extends , then is an independent set in .
Proof of Claim 0.
Indeed, suppose that is not an independent set in , i.e. there are vertices , such that . Then there is a path , , in . is a happy vertex of color , hence . Analogously, is a happy vertex of color , hence . We get that , which contradicts our assumption.
We have shown that is an instance equivalent to ; moreover, it can be constructed in polynomial time.
Note that the deletion of the selector gadget vertices in leads to being a disjoint union of paths consisting of three vertices. Thus, has the distance parameter being at most , and if Maximum Happy Vertices is in FPT when parameterized by the distance to graphs that are a disjoint union of paths consisting of three vertices, then the W[1]-complete Regular Multicolored Independent Set is also in FPT. Hence, MHV is W[1]-hard with respect to the distance parameter.
The following corollary answers an open question posed in [Agrawal2018].
Maximum Happy Vertices is W[1]-hard with respect to each of the following parameters: pathwidth, treewidth, clique-width, distance to cographs, and feedback vertex set number.
W[1]-hardness of MHV with respect to the parameters distance to cographs or feedback vertex set number is an immediate corollary of Theorem 3, since graphs of type (that is, graphs that are a disjoint union of paths consisting of three vertices) are simultaneously cographs and forests.
Pathwidth. Let be a graph and be a modulator of , i.e. is a graph whose connected components are disjoint paths on three vertices. Observe that the pathwidth of is at most . Indeed, let consist of connected components, each of them being a three-vertex path . Then construct a path decomposition of as a sequence
The constructed sequence is a valid path decomposition of . Firstly, each vertex is contained in a contiguous segment of sets in the sequence. Secondly, for each edge in , its endpoints are contained in some set of the sequence simultaneously, as each edge of is either an edge between a vertex in and some vertex , or an edge between and for some and . The size of each set of the sequence is , hence the pathwidth of is at most . Thus, if a graph has the distance-to- graphs parameter equal to , then its pathwidth is at most . By Theorem 3, MHV is W[1]-hard when parameterized by the pathwidth of the input graph.
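The two properties verified above can also be checked programmatically. The sketch below assumes that each set of the sequence consists of the whole modulator together with the three vertices of one path; this is our reading of the argument, and the function names and toy graph are hypothetical.

```python
def path_decomposition(modulator, paths):
    """One bag per three-vertex path: the modulator plus that path."""
    return [set(modulator) | set(path) for path in paths]


def is_path_decomposition(bags, vertices, edges):
    """Check contiguity of vertex occurrences and coverage of edges."""
    for v in vertices:
        idx = [i for i, bag in enumerate(bags) if v in bag]
        # v must occur, and its occurrences must form a contiguous segment.
        if not idx or idx[-1] - idx[0] + 1 != len(idx):
            return False
    return all(any(u in bag and v in bag for bag in bags) for u, v in edges)


# Toy graph: a one-vertex modulator attached to two three-vertex paths.
modulator = {"s"}
paths = [("a1", "b1", "c1"), ("a2", "b2", "c2")]
vertices = ["s", "a1", "b1", "c1", "a2", "b2", "c2"]
edges = [("a1", "b1"), ("b1", "c1"), ("a2", "b2"), ("b2", "c2"),
         ("s", "a1"), ("s", "c2")]

bags = path_decomposition(modulator, paths)
width = max(len(bag) for bag in bags) - 1
print(is_path_decomposition(bags, vertices, edges), width)
```

Each bag contains the modulator and three further vertices, so the width exceeds the modulator size by two.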
Treewidth. W[1]-hardness for the treewidth parameter follows from the fact that a path decomposition of a graph is also a tree decomposition of the graph; if a graph is of pathwidth , it is of treewidth at most .
Clique-width. In [Corneil2005], Corneil and Rotics proved that a graph of treewidth has clique-width at most . This already gives us the hardness result for the clique-width parameter. Though, one can improve the upper bound and show that if a graph has a -modulator of size , then the clique-width of such graph is at most .
Maximum Happy Edges is W[1]-hard when parameterized by the distance to graphs that are a disjoint union of paths consisting of three vertices, and is W[1]-hard when parameterized by the distance to graphs that are a disjoint union of cycles of length three.
We adjust the reduction from Regular Multicolored Independent Set to MHV provided in the proof of Theorem 3.
Given an instance of Regular Multicolored Independent Set, we construct an instance of Maximum Happy Edges as follows.
Let , . is constructed in the same way as in the proof of Theorem 3: for each edge , we introduce a path on three vertices , , , and set , , and is left uncolored; then we introduce the selection gadget vertices , and introduce an edge between and for each , and . For each , is left uncolored.
Additionally, we introduce edges new to this construction: for each and each edge , such that and , we introduce edges between and and between and . In case , we introduce only one edge.
We also need additional precolored vertices in order for this reduction to work. For each , and each , we introduce new paths consisting of three vertices in : for each , we introduce a path , , . We precolor every vertex in these new paths with color , i.e. for each . Then we connect each of them by a newly-introduced edge to the vertex of the selector gadget. These auxiliary vertices will ensure that for each , is colored with one of the colors in . Note that paths between these newly-introduced vertices are needed only to preserve the distance parameter.
In any optimal coloring of extending , for each .
Proof of Claim 0.
Suppose is an optimal coloring of extending , but for some . Then no edge between and is happy for any with respect to . Hence, the only edges incident to that can be happy are edges between and vertices of the paths constructed for edges, i.e. or . There are exactly such vertices, thus is incident to at most edges happy with respect to . But for each , is adjacent to vertices of type and vertices of type precolored with color . Hence, if we change the color of in to one of the colors in , we lose at most happy edges, and win at least happy edges, which contradicts the optimality of .
We finally set and argue that is a yes-instance of Regular Multicolored Independent Set if and only if is a yes-instance of Maximum Happy Edges.
Again, similarly to the proof of Theorem 3, let us construct a coloring of from a multicolored independent set of with . The coloring is constructed almost in the same way as in the proof of Theorem 3: for each , we put , where , and for each we put . The difference is in coloring vertices , where and . Since is now adjacent to one or two vertices of the selector gadget, one may gain one happy edge by coloring with one of the colors of the selector gadget vertices. Thus we put
There are exactly edges that are happy with respect to .
Proof of Claim 0.
Let us consider every type of edge in .
Edges inside of the path for any and .
Each such path gives exactly happy edges, and there are such paths. In total, there are edges of this type.
Edges of type , for any , , , and .
Since , only edges between and are happy for a fixed . There are possible options of choosing and , hence these edges are in total.
Edges between and for any , and .
Again, since is precolored with color , and , only edges between and are happy. There are exactly edges incident to in , hence is adjacent to exactly vertices of type . In total, these sum up to edges.
Edges between and for any .
Since if and only if , for each fixed there are exactly happy edges of such type. Hence, there are such happy edges.
Edges between and , for any , and , where and .
For each such , we constructed so that is colored in the color of one of its neighbours in the selector gadget. is adjacent to one or two selector gadget vertices of distinct colors, hence it is adjacent to exactly one edge of such type. There are exactly edges in with no endpoints in , thus exactly edges of type are happy in with respect to .
Edges between and , for any , and , where .
As , , hence . Also, since , . Thus, each edge of such type in is happy with respect to , and there are edges of such type.
In total, we get that exactly
edges are happy in with respect to .
Claim 3 shows that if is a yes-instance of RMIS, then is a yes-instance of MHE. We now give a proof in the other direction.
Let be a coloring of extending such that at least edges are happy in with respect to . We may assume that is an optimal coloring of , i.e. it yields the maximum possible number of happy edges in . Then, by Claim 3, for every .
Again, we argue that is a multicolored independent set in . We start proving this fact with the following claim.
There are at least happy edges incident to the vertices of type with respect to .
Proof of Claim 0.
We bound the number of happy edges not incident to the vertices of type .
From follows that exactly edges are happy with respect to among edges that are incident to auxiliary vertices . These are exactly the edges of types and in the proof of Claim 3, and happy edges among them are counted in the same way as in the proof.
The only other edges not incident to the vertices of type are edges between and for each , and . Again, analysis of these edges is the same as the analysis of the edges of type in the proof of Claim 3, and their number is at most .
The only happy edges left are edges incident to for some , hence the number of happy edges among them is at least .
The following claim, along with Claim 3, allows us to move from counting happy edges to counting happy vertices in .
For any , cannot be incident to more than two happy edges in with respect to any coloring extending . Moreover, if is incident to exactly two happy edges, then either is happy or is happy with respect to .
Proof of Claim 0.
Take any . The neighbours of are vertices and , and also and , where and . In case , has only three neighbour vertices.
We know that , as . Hence, only one of the edges between and can be happy with respect to the same coloring .
The same holds for and : since , only one of the edges between and or between and can be happy at the same time. In case , there is the only edge that can be either happy or not.
Thus, happy edges incident to can sum up to no more than two edges. Suppose now that is incident to exactly two happy edges. Then these edges are and for some and some . Hence, . By Claim 3, , i.e. it is either and or and . Hence, and are connected by an edge in , and, since has exactly two neighbours and and they share the same color, is happy with respect to .
By Claim 3 and Claim 3, at least vertices of type are incident to exactly two happy edges with respect to . Hence, there are at least vertices of type that are happy in with respect to . Note that these vertices remain happy with respect to even if we remove auxiliary vertices and edges between and , i.e. return to the original construction of in the proof of Theorem 3.
Thus, coloring yields at least happy vertices of type in the original construction of in the proof of Theorem 3, hence we use Claim 3 to finish the proof of the first part of this theorem in the same way.
We have thereby shown that MHE is W[1]-hard when parameterized by the distance to graphs that are a disjoint union of paths on three vertices. To prove the same for the distance to graphs that are a disjoint union of cycles of length three, we note that in our construction of , endpoints of the paths are precolored vertices.
Hence, we can add an edge between endpoints of each path, i.e. between and for each and between and for each and , and just increase the parameter by the number of newly-appeared happy edges. Namely, these are the edges between and , thus we increase by , and the other parts of the construction remain the same.
Maximum Happy Edges is W[1]-hard with respect to each of the following parameters: pathwidth, treewidth, clique-width, distance to cographs, and feedback vertex set number.
The rest of the section focuses on the parameterized complexity of both MHV and MHE parameterized by the distance to cluster parameter. We separate MHE and MHV, showing that the former problem is W[1]-hard with respect to this parameter, but the latter admits an FPT algorithm. This answers an open question posed in works of Choudhari and Reddy [Choudhari2018] and Misra and Reddy [Misra2018].
Maximum Happy Edges is W[1]-hard when parameterized by the cluster vertex deletion number.
Observe that a graph consisting of disjoint cycles of length three is a cluster graph. Then, by Theorem 3, MHE is W[1]-hard when parameterized by the distance to cluster graphs.
Maximum Happy Vertices can be solved in time, where is the distance to cluster parameter of the input graph.
We adapt algorithms of Misra and Reddy presented in [Misra2018] in their proofs of membership result for both MHV and MHE parameterized by the vertex cover number and by the distance to clique parameters.
Let be an instance of MHV, and let be a given minimum modulator to cluster of . We describe an algorithm that works in , where is the distance to cluster parameter of . Note that need not be given explicitly. To find , one can simply consider as an instance of Cluster Vertex Deletion parameterized by the solution size, and employ one of the algorithms working in time , be it a simple running time algorithm [jansen1997disjoint] or a more sophisticated one, working in [Hffner2008] or even in [Boral2015] running time. Note that this would not change the overall running time, since is a constant value.
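For illustration, a modulator to cluster can be found by the textbook branching strategy: while the graph contains an induced path on three vertices, at least one of its three vertices must be deleted. The sketch below implements this bounded search tree; it is an assumption on our side that this is comparable to the simple algorithm mentioned above, and it is not meant to reproduce any of the cited algorithms.

```python
def find_induced_p3(adj):
    """Return an induced path (a, v, b) with middle vertex v, or None."""
    for v in adj:
        nbrs = list(adj[v])
        for i, a in enumerate(nbrs):
            for b in nbrs[i + 1:]:
                if b not in adj[a]:
                    return (a, v, b)
    return None


def remove_vertex(adj, x):
    """Return a copy of the graph with vertex x deleted."""
    return {v: nbrs - {x} for v, nbrs in adj.items() if v != x}


def cluster_vertex_deletion(adj, budget):
    """Return a modulator to cluster of size at most budget, or None."""
    p3 = find_induced_p3(adj)
    if p3 is None:
        return set()                  # already a cluster graph
    if budget == 0:
        return None                   # an obstruction remains, no budget left
    for x in p3:                      # branch on the three vertices of the path
        rest = cluster_vertex_deletion(remove_vertex(adj, x), budget - 1)
        if rest is not None:
            return rest | {x}
    return None


# Toy instance (an assumption for illustration): the path a - b - c - d - e.
p5 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"},
      "d": {"c", "e"}, "e": {"d"}}
print(cluster_vertex_deletion(p5, 1))
```

Each recursive call branches three ways and decreases the budget by one, so the search tree has at most three to the power of the budget leaves.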
To solve the problem, the algorithm finds an optimal coloring of . Let be an arbitrary optimal coloring of . Firstly, the algorithm guesses which vertices of are happy with respect to . Clearly, there are options to choose a subset , and the algorithm considers each one of them. From now on, let be a fixed guess of the algorithm, i.e. it assumes that is the set of vertices of that are happy in with respect to .
On the other hand, partitions vertices of into groups of the same color, in other words, into equivalence classes. Obviously, such partitions can be enumerated in time (if , there are at most such partitions, which is even fewer). The algorithm guesses a partition corresponding to . Let be a fixed guessed partition, where . Formally, a partition corresponding to should satisfy for each pair of vertices . For each , the vertices in are assigned the same color, denote this color by . The actual value of the colors is not known to the algorithm. Thus, are color variables and the algorithm is to determine what actual colors they should correspond to. Importantly, for distinct and , and should correspond to distinct colors in .
For convenience, we introduce a partial function to the algorithm. If specified, a value denotes a color variable that corresponds to the color . Since distinct variables correspond to different colors, can be viewed as a partial coloring of the vertices of , just with the color variables used instead of actual colors. If both and are specified for a pair of vertices , then if and only if . In other words, agrees with . Since is a coloring of , agrees with as well.
The purpose of is to reflect restrictions on a coloring that are implied by the fixed guesses of the algorithm. That is, should agree with the set of happy vertices , and with the partition . Clearly, for each and for each , . Also, since all vertices in are happy with respect to , for each and for each , should equal . The algorithm assigns values of so that these restrictions are satisfied. If the fixed guesses correspond to an actual coloring, the function satisfying these restrictions exists and is found easily by the algorithm. If cannot be found, the algorithm stops working with the currently fixed guesses, since they do not correspond to any coloring of . Note that the restrictions do not ensure that all vertices in are unhappy. We formulate the main property of in the following claim.
Let be a coloring that agrees with constructed by the algorithm. Then all vertices in are happy with respect to in and partitions the vertices of according to .
Now the algorithm starts to find values of the color variables. This can be viewed as constructing an injective function . Since agrees with , some values of can be determined by the algorithm: if for some vertex both and are specified, then . The algorithm constructs so that this property is satisfied. If it is impossible to construct an appropriate injective , the algorithm stops working with the current guesses and continues with other ones.
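The forced part of this injective function can be derived as in the following sketch. The names kappa (vertex to color variable) and precoloring (vertex to color) are hypothetical stand-ins for the two partial functions discussed above; the function returns None exactly when the forced assignments already rule out injectivity or consistency.

```python
def forced_color_map(kappa, precoloring):
    """Return the dict variable -> color forced by vertices on which both
    kappa and the precoloring are specified, or None if no injective,
    consistent assignment exists."""
    forced = {}
    for v, var in kappa.items():
        if v in precoloring:
            color = precoloring[v]
            if forced.setdefault(var, color) != color:
                return None           # one variable would need two colors
    if len(set(forced.values())) != len(forced):
        return None                   # two variables would share one color
    return forced


print(forced_color_map({"a": 0, "b": 0, "c": 1}, {"a": 2, "c": 3}))
```

Variables that do not appear in the returned dictionary are not yet tied to an actual color; a complete implementation would additionally check that enough colors remain to keep the map injective.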
When found, allows us to extend both and . If a correspondence between a color variable and a color was established, i.e. , then we may assume that . According to this, the algorithm extends and . Note that is no longer the initial precoloring of , since it was extended according to .
We now want each vertex of to be assigned either a color (by ) or a color variable (by ). If for a vertex , neither nor is assigned, we call unassigned. Recall that all vertices in are assigned a color variable by . It is left to assign colors (by ) or color variables (by ) to each of the vertices of the cluster, i.e. vertices in . It turns out to be possible since we are looking for an optimal coloring that agrees with and . and already ensure happiness of all vertices in , so the algorithm can focus directly on happiness of the vertices in .
Consider a connected component in the cluster graph , say, a clique . There are a few cases to consider. If contains two vertices that are assigned distinct colors by or distinct colors by , then all vertices in are unhappy with respect to any coloring that agrees with and . Thus, vertices in can be colored arbitrarily and there are no happy vertices among them. The algorithm assigns an arbitrary color, say color , to each unassigned vertex in . Now and ; consider first the easier case . In this case, the algorithm assigns color variables to unassigned vertices in . can yield a happy vertex only if all vertices in receive the same color variable. Since , there is an optimal coloring in which all vertices in are colored with the same color. If , this is simply the color variable in . Otherwise, the algorithm can simply determine how many happy vertices will yield if a color variable is chosen. The only neighbours of vertices in outside of are vertices in , and each vertex in is assigned a color variable by . Thus, it is easy to determine for a vertex whether it is happy if the whole clique is assigned a color variable . The algorithm chooses a color variable that gives the maximum possible number of happy vertices in , and assigns it to each vertex in .
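The choice of a color variable for a clique in which no color and no color variable has been assigned yet can be sketched as below. The names `best_label_for_clique`, `outside_labels` and `candidate_labels` are assumptions made for this illustration only. We assume `outside_labels[v]` lists the color variables of the neighbours of `v` outside the clique; a vertex is then happy exactly when all of these coincide with the color variable given to the whole clique.

```python
from typing import Dict, Hashable, Iterable, List, Optional, Tuple


def best_label_for_clique(
    clique: Iterable[Hashable],
    outside_labels: Dict[Hashable, List[Hashable]],
    candidate_labels: Iterable[Hashable],
) -> Tuple[Optional[Hashable], int]:
    """Pick the color variable that makes the most clique vertices happy.

    All names are hypothetical. outside_labels[v] holds the color variables of
    the neighbours of v outside the clique (an absent key means no such neighbours).
    """
    vertices = list(clique)
    best_label, best_count = None, -1
    for label in candidate_labels:
        # v is happy iff every outside neighbour carries the chosen variable;
        # vertices without outside neighbours are happy for any choice.
        happy = sum(
            1
            for v in vertices
            if all(other == label for other in outside_labels.get(v, []))
        )
        if happy > best_count:
            best_label, best_count = label, happy
    return best_label, best_count
```

Ties are broken in favour of the first candidate, which is an arbitrary choice of this sketch.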
It remains to consider . In this case, can yield a happy vertex only if all vertices in receive color . Hence, there is an optimal coloring where each unassigned vertex in is colored with color . The algorithm assigns color to each unassigned vertex in . It remains to determine how many happy vertices contains. In contrast with the previous two cases, this depends on which color variable corresponds to color . In case , can yield happy vertices only if corresponds to , and it is easy to find the number of happy vertices in . In case , each vertex in is colored with color . Therefore, can contain some vertices that are happy in any case, that is, vertices that have no neighbours outside . Since these vertices are always happy, the algorithm does not count them. Each other vertex in has at least one neighbour in . If it has two neighbours with distinct color labels assigned, it can never be happy. Otherwise, all of its neighbours in are assigned the same color label, say , and the vertex is happy if and only if corresponds to . Thus, for each color variable we get that yields a certain number of happy vertices if corresponds to .
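For a clique whose vertices all receive one fixed color, the per-variable counts described above can be tallied as in the following sketch. The function name `clique_weights` and the input `outside_labels` (color labels of the neighbours of each vertex outside the clique) are hypothetical. Vertices with no outside neighbours are skipped because they are happy regardless, and vertices that see two distinct labels are skipped because they can never be happy.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, List


def clique_weights(
    clique: Iterable[Hashable],
    outside_labels: Dict[Hashable, List[Hashable]],
) -> Counter:
    """Count, per color label, the clique vertices that become happy
    if that label corresponds to the clique's fixed color.

    All names are hypothetical; outside_labels[v] lists the labels of the
    neighbours of v outside the clique.
    """
    weights: Counter = Counter()
    for v in clique:
        labels = set(outside_labels.get(v, []))
        if len(labels) == 1:
            # Happy iff this single label is matched to the clique's color.
            weights[labels.pop()] += 1
        # len(labels) == 0: always happy, not counted.
        # len(labels) >= 2: never happy.
    return weights
```

Adding these counters over all such cliques gives, for the fixed color, the contribution to the weight of each pair of a color variable and that color.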
Summing up these values over all clique components, we get a weighted bipartite graph . The left part of the graph corresponds to the color variables , and the right part corresponds to the colors . An edge between in the left part and in the right part is assigned a weight equal to the number of happy vertices in in the case that is put in correspondence with (not counting vertices that are happy independently of this choice). Some color variables may already be assigned a color by , and the graph should reflect that. That is, for each with assigned, there is only one edge incident to in , and this edge is . For every other color variable, all possible edges incident to it are present in . Clearly, a maximum-weight matching in that saturates all color variables yields an optimal way to assign colors to the color variables.
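One standard way to compute such a matching is the Hungarian method; the sketch below is an illustration under our own assumptions and is not taken from Fig. 1. The name `max_weight_assignment` and the matrix encoding are hypothetical: `weight[i][j]` is the non-negative weight of the edge between the i-th color variable and the j-th color, and `None` marks an absent edge (for a color variable already fixed to another color). Absent edges are given a penalty larger than the total weight, so they are used only if no saturating matching exists.

```python
from typing import List, Optional


def max_weight_assignment(weight: List[List[Optional[int]]]) -> Optional[List[int]]:
    """Return match[i] = color index for every color variable i, maximizing the
    total weight, or None if no matching saturates all color variables.
    Hypothetical encoding: weight[i][j] >= 0, or None if the edge is absent."""
    if not weight:
        return []
    n, m = len(weight), len(weight[0])
    if n > m:
        return None  # more color variables than colors: no injective assignment
    big = 1 + sum(w for row in weight for w in row if w is not None)
    # Minimization form: negate weights, penalize absent edges.
    cost = [[big if w is None else -w for w in row] for row in weight]
    inf = float("inf")
    u = [0] * (n + 1)    # row potentials (1-based)
    v = [0] * (m + 1)    # column potentials (1-based)
    p = [0] * (m + 1)    # p[j] = row matched to column j, 0 if none
    way = [0] * (m + 1)  # predecessor column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the augmenting path back to the artificial column 0
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            match[p[j] - 1] = j - 1
    if any(weight[i][match[i]] is None for i in range(n)):
        return None  # every saturating assignment needs an absent edge
    return match
```

For instance, with `weight = [[None, 5, None], [4, 6, 1]]` the first color variable is forced to the color with index 1, so the second one takes the color with index 0 even though its heaviest edge goes to index 1.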
The algorithm constructs graph and finds a maximum-weight matching in in polynomial time. If gets connected to in , the algorithm extends with . Since no longer contains unassigned vertices, the optimal coloring can be simply constructed from the values of , and . Pseudo-code of the procedure that handles a single pair of guesses is presented in Fig. 1.