 # Three-dimensional matching is NP-Hard

The standard proof of NP-Hardness of 3DM provides a power-4 reduction of 3SAT to 3DM. In this note, we provide a linear-time reduction. Under the exponential time hypothesis, this reduction improves the runtime lower bound from $2^{o(\sqrt[4]{m})}$ (under the standard reduction) to $2^{o(m)}$.


## 1 Introduction

In this note, we first establish the hardness of the following decision problem.

###### Definition 1 (3DM).

Input: Sets $W$, $X$ and $Y$ and a set of matches $M \subseteq W \times X \times Y$ of size $m$.

Output: YES if there exists $M' \subseteq M$ such that each element of $W$, $X$ and $Y$ appears exactly once in $M'$. NO otherwise.
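The decision problem above can be made concrete with a small exhaustive checker. This is only a sketch for tiny instances (it takes exponential time in the worst case); the function name `has_perfect_3d_matching` and the convention that each match is stored as a `(w, x, y)` tuple are assumptions made for illustration, not part of the note.

```python
def has_perfect_3d_matching(W, X, Y, M):
    """Return True iff some subset of M covers every element of W, X and Y exactly once.

    Each match in M is assumed to be a tuple (w, x, y) with w in W, x in X, y in Y.
    """
    W, X, Y = list(dict.fromkeys(W)), set(X), set(Y)
    if not (len(W) == len(X) == len(Y)):
        return False  # a perfect matching needs equally sized sides

    # Group the usable matches by their W-element.
    by_w = {w: [] for w in W}
    for (w, x, y) in M:
        if w in by_w and x in X and y in Y:
            by_w[w].append((x, y))

    def extend(i, used_x, used_y):
        # All W-elements are covered; since |W| = |X| = |Y| and the chosen
        # x's and y's are pairwise distinct, X and Y are covered as well.
        if i == len(W):
            return True
        for x, y in by_w[W[i]]:
            if x not in used_x and y not in used_y:
                if extend(i + 1, used_x | {x}, used_y | {y}):
                    return True
        return False

    return extend(0, frozenset(), frozenset())


# Tiny example: the second instance lacks a match that covers "b".
yes_instance = [(1, "a", "p"), (2, "a", "q"), (2, "b", "q")]
no_instance = [(1, "a", "p"), (2, "a", "q")]
print(has_perfect_3d_matching({1, 2}, {"a", "b"}, {"p", "q"}, yes_instance))
print(has_perfect_3d_matching({1, 2}, {"a", "b"}, {"p", "q"}, no_instance))
```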

To prove that 3DM is NP-Hard, we reduce an instance of 3SAT to the given problem. Next, we define the 3SAT decision problem.

###### Definition 2 (3-SAT).

Input: A boolean formula $\phi$ in 3CNF form with $n$ variables and $m$ clauses.

Output: YES if $\phi$ is satisfiable, NO otherwise.

Given an instance of 3SAT with $n$ variables and $m$ clauses, Garey and Johnson (1979) construct a graph whose size is a degree-four polynomial in $n$ and $m$. Thus, this is a power-4 reduction. In this note, we use a similar but more efficient gadget and provide a linear-time reduction of the 3SAT instance to the given problem.

## 2 Hardness of 3DM

###### Theorem 3.

Three-dimensional matching is an NP-Hard problem.

Figure 1: Part of the graph $G$ constructed for the literal $x_1$. The figure illustrates the case where $x_1$ is part of four different clauses. The triangles (or hyper-edges) $(a_i, b_i, c_i)$ capture the case when $x_1$ is true and the other triangles $(b_i, c'_i, a_{i+1})$ capture the case when $x_1$ is false. Assuming that a clause $C_j = \{x_1, x_2, x_3\}$, the hyper-edges containing $tf_i, tf'_i$ and $t_1, t'_1$ capture different settings. The hyper-edges containing $t_1, t'_1$ ensure that at least one of the literals in the clause is true. The other two ensure that two variables can take either true or false values.

Our reduction is described in Fig. 1. For each variable $x_i$, let $m_i$ be the number of clauses in which $x_i$ or its negation $x'_i$ is present. We construct a “truth-setting” component containing $2m_i$ hyper-edges (or triangles). We add the following hyper-edges to $M$.

$$\{(a_k[i], b_k[i], c_k[i]) : 1 \le k \le m_i\} \cup \{(a_{k+1}[i], b_k[i], c'_k[i]) : 1 \le k \le m_i\}$$

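Note that, for each $k$, one of $(a_k[i], b_k[i], c_k[i])$ or $(a_{k+1}[i], b_k[i], c'_k[i])$ has to be selected in a matching $M'$, since these are the only hyper-edges containing $b_k[i]$. If the former is selected, that corresponds to the variable $x_i$ being assigned true; the latter corresponds to false. This part is the same as the standard construction.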

For every clause $C_j$ we add three types of hyper-edges. The first type ensures that at least one of the literals is true.

$$\{(c_k[i], t_1[j], t'_1[j]) : x'_i \in C_j\} \cup \{(c'_k[i], t_1[j], t'_1[j]) : x_i \in C_j\}$$

The other two types of hyper-edges (connected to the $tf$'s) say that two of the literals can be either true or false. Hence, we connect them to both $c_k[i]$ and $c'_k[i]$.

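$$\begin{aligned} &\{(c_k[i], tf_1[j], tf'_1[j]) : x'_i \text{ or } x_i \in C_j\} \cup \{(c_k[i], tf_2[j], tf'_2[j]) : x_i \text{ or } x'_i \in C_j\} \\ \cup\; &\{(c'_k[i], tf_1[j], tf'_1[j]) : x'_i \text{ or } x_i \in C_j\} \cup \{(c'_k[i], tf_2[j], tf'_2[j]) : x_i \text{ or } x'_i \in C_j\} \end{aligned}$$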

Note that in the construction $k$ refers to the index of the clause $C_j$ in the truth-setting component corresponding to the variable $x_i$. Using the above construction, we get that

$$\begin{aligned} W &= \{c_k[i], c'_k[i]\} \\ X &= \{a_k[i]\} \cup \{t_1[j], tf_1[j], tf_2[j]\} \\ Y &= \{b_k[i]\} \cup \{t'_1[j], tf'_1[j], tf'_2[j]\} \end{aligned}$$

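Hence, we see that $|W| = 2\sum_i m_i = 6m$, since each clause contains three literals and so $\sum_i m_i = 3m$. Now, $|X| = |Y| = \sum_i m_i + 3m = 6m$. And, we have that $|M| = 2\sum_i m_i + 15m = 21m$. Thus, we see that this construction is linear in the number of clauses.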
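The construction above can be sketched in a few lines to check the size bounds on a small formula. This is an illustrative sketch only. It makes three assumptions that the text does not spell out: each clause mentions three distinct variables, the index $k+1$ wraps around so that $a_{m_i+1}[i]$ is $a_1[i]$, and matches are stored as `(w, x, y)` tuples. The function name `build_3dm` and the DIMACS-style clause encoding (`+i` for $x_i$, `-i` for $x'_i$) are also hypothetical.

```python
from collections import defaultdict


def build_3dm(clauses):
    """Build (W, X, Y, M) from 3CNF clauses given as 3-tuples of nonzero ints."""
    occ = defaultdict(int)  # occ[i] ends up as m_i, the occurrence count of x_i
    slots = []              # one (j, i, k, is_positive) entry per literal occurrence
    for j, clause in enumerate(clauses, start=1):
        for lit in clause:
            i = abs(lit)
            occ[i] += 1
            slots.append((j, i, occ[i], lit > 0))

    W, X, Y, M = set(), set(), set(), []

    # Truth-setting components: 2 * m_i hyper-edges per variable.
    for i, m_i in occ.items():
        for k in range(1, m_i + 1):
            a, b = ("a", i, k), ("b", i, k)
            a_next = ("a", i, k % m_i + 1)  # assumed wrap-around for a_{k+1}
            c, c_neg = ("c", i, k), ("c'", i, k)
            W.update((c, c_neg))
            X.add(a)
            Y.add(b)
            M.append((c, a, b))           # selected when x_i is true
            M.append((c_neg, a_next, b))  # selected when x_i is false

    # Clause components: one t_1 edge and four tf edges per literal occurrence.
    for j, i, k, is_positive in slots:
        c, c_neg = ("c", i, k), ("c'", i, k)
        # The t_1 edge uses the element left uncovered when the literal is true.
        M.append((c_neg if is_positive else c, ("t1", j), ("t1'", j)))
        for name in ("tf1", "tf2"):
            for w in (c, c_neg):
                M.append((w, (name, j), (name + "'", j)))

    for j in range(1, len(clauses) + 1):
        for name in ("t1", "tf1", "tf2"):
            X.add((name, j))
            Y.add((name + "'", j))

    return W, X, Y, M


# Two clauses (m = 2): (x1 or x2' or x3) and (x1' or x2 or x4).
W, X, Y, M = build_3dm([(1, -2, 3), (-1, 2, 4)])
print(len(W), len(X), len(Y), len(M))
```

For this two-clause formula the four sizes come out as $6m = 12$ for each of $W$, $X$, $Y$ and $21m = 42$ for $M$.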

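Now, if the 3-SAT formula is satisfiable then there exists a matching for the 3DM problem. If a variable $x_i$ is true in the assignment then add $(a_k[i], b_k[i], c_k[i])$ to $M'$ for every $k$, else add $(a_{k+1}[i], b_k[i], c'_k[i])$. For every clause $C_j$, let $x_i$ (or $x'_i$) be a literal which is set to true in that clause. Add $(c'_k[i], t_1[j], t'_1[j])$ (or $(c_k[i], t_1[j], t'_1[j])$) to $M'$. For the remaining two literals of the clause, add the hyper-edges containing $tf_1[j]$ and $tf_2[j]$ depending upon their assignments. Clearly, $M'$ is a matching.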
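The forward direction can be traced in code: given a satisfying assignment, pick the hyper-edges as described and confirm that no element is used twice. As before, this is a sketch under assumptions not stated in the text (three distinct variables per clause, wrap-around of the index $k+1$, matches stored as `(w, x, y)` tuples); the name `matching_from_assignment` is hypothetical.

```python
from collections import defaultdict


def matching_from_assignment(clauses, assignment):
    """Return the matching M' induced by a satisfying assignment.

    clauses: 3-tuples of nonzero ints (+i is x_i, -i is its negation).
    assignment: dict mapping each variable index to True or False.
    """
    occ = defaultdict(int)
    slots = []
    for j, clause in enumerate(clauses, start=1):
        for lit in clause:
            i = abs(lit)
            occ[i] += 1
            slots.append((j, i, occ[i], lit > 0))

    chosen = []
    # Truth-setting edges: all "c" edges if x_i is true, all "c'" edges otherwise.
    for i, m_i in occ.items():
        for k in range(1, m_i + 1):
            if assignment[i]:
                chosen.append((("c", i, k), ("a", i, k), ("b", i, k)))
            else:
                chosen.append((("c'", i, k), ("a", i, k % m_i + 1), ("b", i, k)))

    # Clause edges: the first true literal takes t_1, the other two take tf_1, tf_2.
    t1_used = set()
    tf_count = defaultdict(int)
    for j, i, k, is_positive in slots:
        # The element of {c, c'} left uncovered by the truth-setting edge.
        free = ("c'", i, k) if assignment[i] else ("c", i, k)
        literal_is_true = assignment[i] == is_positive
        if literal_is_true and j not in t1_used:
            t1_used.add(j)
            name = "t1"
        else:
            tf_count[j] += 1
            if tf_count[j] > 2:
                raise ValueError(f"clause {j} is not satisfied by the assignment")
            name = f"tf{tf_count[j]}"
        chosen.append((free, (name, j), (name + "'", j)))
    return chosen


clauses = [(1, -2, 3), (-1, 2, 4)]
m_prime = matching_from_assignment(clauses, {1: True, 2: True, 3: False, 4: False})
# Each coordinate should contain 6m = 12 pairwise distinct elements.
print(len(m_prime), [len({t[pos] for t in m_prime}) for pos in range(3)])
```

If the assignment leaves some clause unsatisfied, that clause has no literal eligible for the $t_1$ edge, and the sketch raises an error instead of returning an incomplete matching.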

Now, the proof for the other direction is similar. If there exists a matching $M'$, then for each $k$ one of $(a_k[i], b_k[i], c_k[i])$ or $(a_{k+1}[i], b_k[i], c'_k[i])$ has to be selected in $M'$. This defines a truth assignment of the variables. Now, the construction of the clause hyper-edges ensures that every clause is satisfied.

## 3 Exponential Time Hypothesis for 3DM

Before we start the discussion in this section, let us review the definition of the exponential time hypothesis.

Exponential Time Hypothesis (ETH)
There does not exist an algorithm which decides 3-SAT and runs in $2^{o(m)}$ time.

If the exponential time hypothesis is true, the standard reduction of 3-SAT to 3DM (Garey and Johnson, 1979) implies that no algorithm for 3DM runs in $2^{o(\sqrt[4]{m})}$ time. However, using the reduction in Section 2, we get a tighter dependence on $m$, stated as a theorem below.

###### Theorem 4.

If the exponential time hypothesis holds then there does not exist an algorithm which decides the three-dimensional matching problem (3DM) and runs in time $2^{o(m)}$.

###### Proof.

For the sake of contradiction, suppose that such an algorithm exists. Then, composing the linear-size reduction from Section 2 with this algorithm, we get an algorithm for 3SAT that runs in time $2^{o(m)}$, which contradicts the exponential time hypothesis. ∎
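The running-time step in this proof can be unpacked as follows. This is a sketch that assumes the reduction of Section 2 runs in time polynomial in $m$ and outputs a 3DM instance with at most $c\,m$ matches for some constant $c$. The symbols $c$, $T_{\mathrm{3DM}}$ and $T_{\mathrm{3SAT}}$ are introduced here for illustration only.

$$T_{\mathrm{3SAT}}(m) \;\le\; \mathrm{poly}(m) + T_{\mathrm{3DM}}(c\,m) \;=\; \mathrm{poly}(m) + 2^{o(c\,m)} \;=\; 2^{o(m)}$$

The last equality uses that a constant factor inside $o(\cdot)$ is absorbed and that $\mathrm{poly}(m) = 2^{O(\log m)} = 2^{o(m)}$.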

An immediate corollary of this result applies to another popular problem, Exact Cover by 3-sets.

###### Definition 5 (X3C).

Input: A universe $U$ and a collection of subsets $S = \{S_1, \ldots, S_m\}$ such that each $S_i \subseteq U$ and contains exactly three elements.

Output: YES if there exists $S' \subseteq S$ such that each element of $U$ occurs exactly once in $S'$, NO otherwise.

###### Corollary 6.

If the exponential time hypothesis holds then there does not exist an algorithm which decides the exact cover by 3-sets problem (X3C) and runs in time $2^{o(m)}$.

###### Proof.

The proof follows from the trivial reduction of 3DM to X3C where $U = W \cup X \cup Y$ and $S = M$. ∎
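The reduction in this proof amounts to forgetting which coordinate each element came from. A minimal sketch, assuming matches are stored as `(w, x, y)` tuples; the function name `dm3_to_x3c` and the side tags are illustrative choices, used only so that the three sets stay disjoint inside the universe even if they happen to share labels.

```python
def dm3_to_x3c(W, X, Y, M):
    """Map a 3DM instance to an X3C instance (U, S) with |S| = |M|."""
    # Tag every element with its side so that W, X and Y remain disjoint in U.
    U = (
        {("W", w) for w in W}
        | {("X", x) for x in X}
        | {("Y", y) for y in Y}
    )
    # Each match becomes an unordered 3-element subset of U.
    S = [frozenset({("W", w), ("X", x), ("Y", y)}) for (w, x, y) in M]
    return U, S


U, S = dm3_to_x3c({1, 2}, {1, 2}, {1, 2}, [(1, 1, 1), (2, 2, 2), (1, 2, 2)])
print(len(U), len(S), all(len(s) == 3 and s <= U for s in S))
```

Since the number of subsets equals the number of matches, a lower bound in terms of $|M|$ carries over unchanged.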

## References

•  M. R. Garey and D. S. Johnson (1979) Computers and Intractability. Vol. 174, Freeman, San Francisco.
•  S. Kushagra, S. Ben-David, and I. F. Ilyas (2019) Semi-supervised clustering for de-duplication. In The 22nd International Conference on Artificial Intelligence and Statistics, AISTATS 2019, 16-18 April 2019, Naha, Okinawa, Japan, pp. 1659–1667.