# On the Hardness of Energy Minimisation for Crystal Structure Prediction

Crystal Structure Prediction (csp) is one of the central and most challenging problems in materials science and computational chemistry. In csp, the goal is to find a configuration of ions in 3D space that yields the lowest potential energy. Finding an efficient procedure to solve this complex optimisation question is a well-known open problem in computational chemistry. Due to the exponentially large search space, the problem has been referred to in several materials-science papers as “NP-Hard and very challenging”, although without any formal proof. This paper fills a gap in the literature by providing the first set of formally proven NP-Hardness results for a variant of csp with various realistic constraints. In particular, we focus on the problem of removal: the goal is to find a substructure with minimal potential energy by removing a subset of the ions from a given initial structure. Our main contributions are NP-Hardness results for the csp removal problem, new embeddings of combinatorial graph problems into geometrical settings, and a more systematic exploration of the energy function to reveal the complexity of csp. In a wider context, our results contribute to the analysis of computational problems for weighted graphs embedded into three-dimensional Euclidean space.


## 1 Introduction

One of the central and most challenging problems in materials science and computational chemistry is the problem of predicting the structure of a crystal given the set of ions composing it. At a high level, the goal there is to find a configuration of ions in a three-dimensional box that achieves the lowest energy. This problem, termed Crystal Structure Prediction (csp), has remained open due to the complexity of solving it optimally [woodley2008crystal] and the combinatorial explosion following a brute-force approach. There are many previous approaches to this problem, largely based on heuristic techniques [LYAKHOV20101623, doi:10.1002/1521-3765(20020916)8:18<4102::AID-CHEM4102>3.0.CO;2-3, oganov2006crystal, WANG20122063], which have shown some promise; however, they still lack the ability to guarantee optimality and, moreover, they are computationally very demanding.

In the most generic formulation of csp there are many degrees of freedom due to the numerous parameters of the model: the number of ions to place, their positions, and the unique interactions between each type of ion. Furthermore, real crystals are based on periodic tessellations of 3D space with unit cells whose size and shape may also be changed. The search space remains exponential in size even for greatly simplified versions of csp. Due to this, csp has, incorrectly, been referred to in several computational-chemistry papers as “NP-Hard and very challenging” [Oganov2018]. However, from the computational-theory viewpoint the argument that the search must be done in a set of exponential size does not imply NP-Hardness.

On the other hand, the two results which are often mentioned in the context of the NP-Hardness of csp are [Barahona_1982] and [Wille_1985]. In [Barahona_1982], within the context of the Ising model, the authors show NP-Hardness in the model of placing charges on a graph, taking into account only interactions between connected vertices. The reduction works on a grid where each vertex has degree at most 6, making the interaction very local. In [Wille_1985], it was shown that the problem of placing ions for some given positions is in NP by reducing the problem to TSP. However, the reduction goes only one way and does not imply the NP-Hardness of the problem.

In our work, we consider several special variants of csp and provide a few alternative reasons for the hardness of closely related problems. We take inspiration from hard combinatorial problems in graph theory and propose several new embeddings of NP-Hard graph problems into numerical versions of csp, which can be seen as an optimisation problem for weighted geometric graphs with a non-linear objective function. We focus on the problem of removal. Here, the input is a configuration of the ions, and the goal is to remove a subset of the ions such that the interaction energy among the remaining ions is minimised. The problem of removing vertices of a graph so that the resulting subgraph satisfies some specific property has been intensively studied in combinatorial graph theory. In [doi:10.1137/0208049], it was shown that for a large class of properties this problem is NP-Complete. In [doi:10.1137/0210022] and [Yannakakis78node-and] this was extended to further properties, showing NP-Completeness for bipartite graphs and for non-trivial hereditary properties.

The removal problem can be seen as a variant of the combinatorial csp problem, where the positions of the ions correspond to points in a discrete grid or lattice. For example, we may find an optimal structure for csp by placing many copies of the ions that we wish to use to build a new structure in unrealistic positions in the discrete space. This may involve ions “overlapping”, i.e., being unrealistically close. Due to the nature of the energy function, when the goal is to minimise the potential energy, the overlapping ions must be removed. In the variant of the removal problem for which we show NP-Hardness results, the initial configuration (from which we remove ions) is part of the input and has only vacant positions or positions with a single ion in the discrete three-dimensional Euclidean space.

Our contributions. We give the first NP-Hardness results with more realistic constraints for csp [collins2017accelerated], we provide new embeddings of combinatorial graph problems in geometrical settings, and we explore in a more systematic way the energy function that could reveal the computational complexity of csp. Moreover, our results can be seen as part of a more general problem of removing vertices from a weighted graph embedded into 3D Euclidean space.

The main challenge for the Euclidean graphs we consider is that they are complete and that the edges are weighted proportionally to the distance between their vertices. As such, many classical NP-Hard problems are much harder to embed into this setting. Even for some existing hardness results, in both the geometric and the more restricted Euclidean setting, bringing these problems into a bounded number of dimensions often requires non-trivial technical proofs, as the dimension is often part of the input [ageev2014np, 10.1007/978-3-642-00202-1_24]. In some of our constructions we utilise the results on geometric graphs embedded into the plane [ITA_2011__45_3_331_0, CLARK1990165]; however, many problems in this field remain open. We also study the optimisation problems where we change parameters in the energy function and the weights of the nodes, and restrict their ranges. In this paper we consider three specific versions of the removal problem in the Euclidean setting:

• $k$-Charge Removal: Remove exactly $k$ charges to minimise the total energy;

• Minimal-At-Least-$k$-Charge Removal: A generalisation of $k$-charge removal where the removed set is a minimal set of at least $k$ charges minimising the total energy;

• At-Least-$k$-Charge Removal: A generalisation of minimal-at-least-$k$-charge removal where the removed set is of at least $k$ charges but not necessarily minimal, minimising the total energy.

In all cases we require the sum of the positively weighted vertices to equal the magnitude of the sum of the negatively weighted vertices in the set that we are removing. The first version is $k$-charge removal, where for a given $k$ we must remove a set of vertices such that the sum of the vertices with positive weights is equal to $k$. We note in Corollary 1 that determining if there is a solution to this problem when we have an unbounded number of possible charges for the ions is NP-Hard. One generalisation that gets around this problem is minimal-at-least-$k$-charge removal, where the sum of the weights of the positively weighted vertices is greater than or equal to $k$, and the set is minimal, meaning that no strict subset of the removed vertices satisfies the required properties. While every instance of this problem will have a minimal set, we show in Proposition 3 that in the case of unbounded charges it is NP-Hard to verify if any given solution is minimal. We may remove the requirement for a solution to be minimal to avoid this, getting the at-least-$k$-charge removal problem.

We will also consider a variety of settings for these problems, varying the energy function for the crystal, the restrictions on the charges and the restrictions on the number of ion species. We summarise our results in Table 1. Regarding energy functions, we will primarily consider the class of controllable energy functions, which we define in Section 2, and the Buckingham-Coulomb potential function, which is popular in computational chemistry. We show in Proposition 1 that Buckingham-Coulomb belongs to this class. One further energy function we will consider is the Coulomb potential, which is used to calculate the electrostatic potential. We show that, depending on the energy function used and the restrictions on the ion species and charges, we are able to reduce several different combinatorial problems to our problems.

The remainder of this work is organised as follows. In Section 2 we discuss the preliminaries of these problems, providing relevant notation and definitions. In Section 3 we present our results for the general case of the problems, claiming NP-Hardness with Theorems 1 and 2 for controllable energy functions and an unbounded number of ion species. We also consider some natural restrictions to this problem. In Section 4 we consider the restriction of having only two species of ion under the Buckingham-Coulomb potential as our energy function, showing in Theorem 4 that the problem remains NP-Hard under these restrictions. In Section 5 we consider the restriction to only the Coulomb potential, this time with no restrictions on the number of species of ions or the charges of the ions. In Theorem 5 we show that under these restrictions the problem remains NP-Hard.

## 2 Preliminaries

Unit Cell. A crystal is a solid material whose ions are arranged in a highly ordered arrangement, forming a crystal structure that extends in all directions. A crystal structure is described by its unit cell: a region of three-dimensional space within a parallelepiped, representing a period that contains ions in a specific arrangement. The unit cells are stacked in three-dimensional space, tiling the whole space to form a crystal. The unit cell consists of a parallelepiped alongside the arrangement of ions with their species. Each unit cell contains a set of ions within the parallelepiped. Each ion $i$ has a species, e.g. Ti or Sr, and a non-zero charge $q_i$. The species of an ion $i$ will be denoted $S(i)$. In every crystal the unit cell is neutrally charged, i.e., $\sum_i q_i = 0$. An arrangement defines a position within the unit cell for every ion.

Energy. The most frequently used technique to compute the energy of a crystal is by summing the pairwise interactions between all pairs of ions. A positive value for the pairwise interaction means the two ions are repelling, while a negative value means they are attracting each other. Formally, for each pair of species there is a unique set of parameters - called force fields - which are applied to the common energy function alongside the Euclidean distance between the ions. In general, energy is defined via series as a crystal is infinite.

In this paper interaction will be restricted with respect to the energy function to a single unit cell only. The primary reason is that the energy between ions in different unit cells quickly converges, making the energy within a single unit cell a good approximation of the total. A second reason is that it is much quicker to compute the energy for a finite set of ions than it is to compute the convergence over an infinite series.

Each arrangement has a corresponding potential energy, calculated with respect to the given common energy function $U$. The goal is to minimise the potential energy, i.e. to maximise the magnitude of a negative energy. The pairwise interaction between two ions $i$ and $j$ with respect to the energy function $U$ is denoted $U_{ij}$. The value of $U_{ij}$ is defined by the force field of the ions and the Euclidean distance $r_{ij}$ between them, which is included as one of the parameters. The total potential energy for an arrangement of ions is given by the sum of $U_{ij}$ over all unordered pairs of ions $\{i, j\}$.

This paper will consider a general class of energy functions, called the controllable potential functions. All functions in this class are required to be computable in polynomial time for any input. Intuitively, for every function in the class there exists a set of force field parameters that counteracts the effect of the distance parameter $r_{ij}$. Formally, a function $U$ belongs to the class if and only if, for any given value $a$ and any fixed distance $r_{ij}$, there exists a set of force field parameters such that $U_{ij} = a$.

The most popular function for crystal structure prediction, which will be focused on in this paper, is the Buckingham-Coulomb potential [buckingham1938classical], which is the sum of the Buckingham and Coulomb potentials. The Coulomb potential for a pair of ions $i$, $j$ is defined as $U^{C}_{ij} = \frac{q_i q_j}{r_{ij}}$, where $r_{ij}$ is the Euclidean distance between the ions. The Buckingham potential for a pair of ions $i$, $j$ is defined by four parameters. These are the distance $r_{ij}$ and the three force field parameters $A_{S(i),S(j)}$, $B_{S(i),S(j)}$ and $C_{S(i),S(j)}$, which are dependent on the species of the ions. It should be noted that all three parameters are positive values. The energy is calculated as $U^{B}_{ij} = \frac{A_{S(i),S(j)}}{e^{B_{S(i),S(j)} r_{ij}}} - \frac{C_{S(i),S(j)}}{r_{ij}^{6}}$. Therefore the Buckingham-Coulomb potential is given by

$$U^{BC}_{ij} = U^{B}_{ij} + U^{C}_{ij} = \frac{A_{S(i),S(j)}}{e^{B_{S(i),S(j)} r_{ij}}} - \frac{C_{S(i),S(j)}}{r_{ij}^{6}} + \frac{q_i q_j}{r_{ij}}.$$

###### Proposition 1.

There exists a set of parameters for the Buckingham-Coulomb function such that it is in .

###### Proof.

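Given a value $a$ and a pair of ions, $i$ and $j$, at a distance of $r_{ij}$ with arbitrary charges $q_i$ and $q_j$, the parameters may be set so that the potential at a distance of $r_{ij}$ is $a$. $B_{S(i),S(j)}$ is set to 0 and the values of $A_{S(i),S(j)}$ and $C_{S(i),S(j)}$ are set as follows. If $a \geq 0$:

$$A_{S(i),S(j)} = \begin{cases} a, & \text{if } q_i q_j > 0; \\ a + \frac{|q_i q_j|}{r_{ij}}, & \text{otherwise,} \end{cases} \qquad C_{S(i),S(j)} = \begin{cases} q_i q_j r_{ij}^{5}, & \text{if } q_i q_j > 0; \\ 0, & \text{otherwise.} \end{cases}$$

If $a < 0$:

$$A_{S(i),S(j)} = \begin{cases} 0, & \text{if } q_i q_j > 0; \\ \frac{|q_i q_j|}{r_{ij}}, & \text{otherwise,} \end{cases} \qquad C_{S(i),S(j)} = \begin{cases} |a| r_{ij}^{6} + q_i q_j r_{ij}^{5}, & \text{if } q_i q_j > 0; \\ |a| r_{ij}^{6}, & \text{otherwise.} \end{cases}$$

Substituting $B_{S(i),S(j)} = 0$ into the Buckingham-Coulomb potential, the equation becomes

$$U^{BC}_{ij} = A_{S(i),S(j)} - \frac{C_{S(i),S(j)}}{r_{ij}^{6}} + \frac{q_i q_j}{r_{ij}}.$$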

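The Coulomb potential is cancelled out either by adding $q_i q_j r_{ij}^{5}$ to $C_{S(i),S(j)}$ in the case $q_i q_j > 0$, or by adding $\frac{|q_i q_j|}{r_{ij}}$ to $A_{S(i),S(j)}$ in the case $q_i q_j < 0$. In the first case the energy added by the Coulomb potential will be $\frac{q_i q_j}{r_{ij}}$, which will be cancelled by the addition of $q_i q_j r_{ij}^{5}$ when multiplied by the $-\frac{1}{r_{ij}^{6}}$ term applied to $C_{S(i),S(j)}$. Otherwise the Coulomb energy will be $-\frac{|q_i q_j|}{r_{ij}}$, which will be cancelled out by the relevant addition to $A_{S(i),S(j)}$.

By the same arguments, the addition of an $a$ term to $A_{S(i),S(j)}$, or of $|a| r_{ij}^{6}$ to $C_{S(i),S(j)}$ with the appropriate multiplier, will leave just the value of $a$. Note that in the case that $a$ is negative, the magnitude $|a|$ must be added to $C_{S(i),S(j)}$ so that once the multiplier has been applied it will act as a negative value. ∎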
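The parameter choice in the proof of Proposition 1 can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and the split into the cases $a \geq 0$ and $a < 0$ is an assumption, since the case conditions are not legible in the source.

```python
import math


def buckingham_coulomb(A, B, C, q_i, q_j, r):
    """Pair energy A / e^(B r) - C / r^6 + q_i q_j / r."""
    return A / math.exp(B * r) - C / r**6 + q_i * q_j / r


def controllable_parameters(a, q_i, q_j, r):
    """Return (A, B, C) so that the pair energy at distance r equals a.

    Assumption: the first parameter set applies when a >= 0 and the
    second when a < 0.
    """
    qq = q_i * q_j
    B = 0.0  # removes the dependence of the A term on the distance
    if a >= 0:
        if qq > 0:
            A, C = a, qq * r**5
        else:
            A, C = a + abs(qq) / r, 0.0
    else:
        if qq > 0:
            A, C = 0.0, abs(a) * r**6 + qq * r**5
        else:
            A, C = abs(qq) / r, abs(a) * r**6
    return A, B, C


# Check every sign combination of the target value and the charge product.
for a in (3.0, 0.0, -2.5):
    for q_i, q_j in ((2, 1), (2, -1), (-3, -1)):
        r = 1.7
        A, B, C = controllable_parameters(a, q_i, q_j, r)
        energy = buckingham_coulomb(A, B, C, q_i, q_j, r)
        assert math.isclose(energy, a, abs_tol=1e-9)
```

Note that the sketch sets some parameters to zero, exactly as the proof does, so it relies on the parameters being allowed to be non-negative rather than strictly positive.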

Crystals as geometric graphs. Using the above definitions, it can be shown how crystals may be viewed as geometric graphs. Recall that each ion corresponds to a charged point in $\mathbb{R}^3$. Each ion is represented by a weighted vertex, also placed in $\mathbb{R}^3$ at the same position as the ion, giving one vertex per ion. The vertex corresponding to the ion $i$, denoted $v_i$, is assigned a weight of $q_i$, so the weight of a vertex is the charge of its ion. Within any set of vertices, the vertices with a positive weight are distinguished from those with a negative weight; the same distinction is used for a set of ions, separating the positively charged ions from the negatively charged ions.

Between each pair of vertices $v_i$, $v_j$ there is an edge, weighted by the pairwise interaction $U_{ij}$ of the corresponding ions. Note that from its definition $U_{ij}$ will be determined in part by the length of the edge, which will be drawn as a straight line in the space. The energy of a crystal graph can be computed as the sum of the weights of its edges, $\sum_{\{v_i, v_j\} \in E} U_{ij}$.
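As an illustration of the energy of a crystal graph, the sketch below sums a pairwise interaction over all unordered pairs of ions in a single unit cell. It uses the Coulomb potential as the example interaction; the representation of an ion as a `(charge, position)` tuple and the function names are assumptions made for this example only.

```python
import math
from itertools import combinations


def coulomb(q_i, q_j, r):
    """Coulomb pair energy q_i q_j / r."""
    return q_i * q_j / r


def total_energy(ions, pair_energy=coulomb):
    """Sum the pair energy over all unordered pairs of ions.

    ions: list of (charge, (x, y, z)) tuples, one per vertex of the graph.
    """
    total = 0.0
    for (q_i, p_i), (q_j, p_j) in combinations(ions, 2):
        # The edge length is the Euclidean distance between the two ions.
        total += pair_energy(q_i, q_j, math.dist(p_i, p_j))
    return total


# Three collinear ions: two attracting pairs at distance 1 and one
# repelling pair at distance 2.
example = [(1, (0.0, 0.0, 0.0)), (-1, (1.0, 0.0, 0.0)), (1, (2.0, 0.0, 0.0))]
print(total_energy(example))
```

Because the graph is complete, the loop visits a number of pairs quadratic in the number of ions.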

Geometric graphs created from a unit cell will be referred to as crystal graphs. In the remainder of this work crystals will be described in terms of their physical structure where it makes sense to be considering the ions, and as a graph otherwise.

### The k-Charge Removal Problem.

The $k$-charge removal problem, henceforth k-charge removal, will take as input a crystal graph $G$ corresponding to a “dense” initial arrangement of ions, with the goal of removing some vertices in order to minimise the energy of the new subgraph $G'$. It will be assumed that the initial graph $G$ is charge-neutral, as defined in Definition 1. As $G'$ must also be neutral, any set of vertices which is removed must therefore be neutral. Using intuition from chemistry regarding the number of ions within a realistic unit cell, a natural number $k$ of charges to remove is chosen, as defined in Definitions 2 and 3.

###### Definition 1.

A set of vertices is neutral if the weights of its vertices sum to zero.

###### Definition 2.

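A set of k-Charges from a crystal graph is a neutral subset of its vertices in which the weights of the positively weighted vertices sum to $k$ and the weights of the negatively weighted vertices sum to $-k$.

Informally, a set of $k$-charges is a set of vertices with a total weight of 0, while the magnitude of the sum of all positively weighted vertices, and of all negatively weighted vertices, is $k$.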

###### Definition 3.

A removal of a set of vertices $R$ from a graph $G = \{V, E\}$ is the graph $G' = \{V', E'\}$ where $V' = V \setminus R$ and $E'$ is the set of edges in $E$ with no endpoint in $R$.

Informally, a removal is the result of removing a set of vertices and the incident edges from a given graph.

###### Problem 1.

$k$-Charge Removal (k-charge removal)

Instance: A crystal graph $G$, with edges weighted by a given common energy function $U$, and a natural number $k$.

Goal: The set of $k$-charges $R$ from $G$ such that the graph $G' = \{V', E'\}$ created by the removal of $R$ from $G$ minimises $\sum_{\{v_i, v_j\} \in E'} U_{ij}$.

A decision problem can be derived from k-charge removal by asking whether there exists a removal that leaves $G'$ with total energy no greater than some given goal value.
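To make the problem statement concrete, the sketch below solves tiny instances of k-charge removal by exhaustive search. It is an illustration only and takes exponential time; the representation (a list of vertex weights plus a symmetric matrix of pairwise energies) and all function names are assumptions made for this example.

```python
from itertools import combinations


def is_k_charges(weights, removed, k):
    """True if the removed set is neutral with positive weights summing to k."""
    pos = sum(weights[v] for v in removed if weights[v] > 0)
    neg = sum(weights[v] for v in removed if weights[v] < 0)
    return pos == k and neg == -k


def remaining_energy(energy, removed):
    """Sum of edge weights with no endpoint in the removed set."""
    keep = [v for v in range(len(energy)) if v not in removed]
    return sum(energy[i][j] for i, j in combinations(keep, 2))


def best_k_charge_removal(weights, energy, k):
    """Try every subset; return (energy, removed set) or None if infeasible."""
    n = len(weights)
    best = None
    for size in range(n + 1):
        for candidate in combinations(range(n), size):
            removed = set(candidate)
            if is_k_charges(weights, removed, k):
                value = remaining_energy(energy, removed)
                if best is None or value < best[0]:
                    best = (value, removed)
    return best


weights = [1, -1, 1, -1]
energy = [
    [0, -3, 2, -1],
    [-3, 0, -1, 2],
    [2, -1, 0, -2],
    [-1, 2, -2, 0],
]
print(best_k_charge_removal(weights, energy, 1))
```

In this small instance the search keeps the pair of vertices joined by the most negative edge. The decision version compares the returned energy with the goal value.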

### The At-Least-k-Charge Removal Problem.

One generalisation of k-charge removal is the at-least-$k$-charge removal problem, denoted at-least-k-charge removal. This problem takes the same input as k-charge removal; however, rather than looking to remove a set of exactly $k$-charges, it is instead sufficient to remove a neutral set of at-least-$k$-charges. The motivation for this comes from the case that it may not be possible to remove exactly $k$ charges for some given $k$. In this generalisation more than $k$ charges may be removed, provided the cell remains neutral. Note that any removal of exactly $k$ charges will also be valid for this generalisation.

###### Definition 4.

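A set of At-Least-k-Charges from a crystal graph is a neutral subset of its vertices in which the weights of the positively weighted vertices sum to at least $k$ and the weights of the negatively weighted vertices sum to at most $-k$.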

###### Problem 2.

At-Least-$k$-Charge Removal (at-least-k-charge removal)

Instance: A crystal graph $G$, with edges weighted by a given common energy function $U$, and a natural number $k$.

Goal: The set of at-least-$k$-charges $R$ from $G$ such that the graph $G' = \{V', E'\}$ created by the removal of $R$ from $G$ minimises $\sum_{\{v_i, v_j\} \in E'} U_{ij}$.

###### Proposition 2.

A solution to k-charge removal or at-least-k-charge removal can be verified in polynomial time.

###### Proof.

A solution to k-charge removal contains the set of vertices that are removed. This can be verified as a set of $k$-charges or of at-least-$k$-charges by simply summing up the positive and negative weights, checking that the set is neutral and that the positive weights sum to exactly $k$ for k-charge removal or to at least $k$ for at-least-k-charge removal. This takes time linear in the size of the removed set. Similarly, the sum of the edges in the original graph that do not have an endpoint in the removed set can be checked against the goal value. This can be done in time quadratic in the number of vertices, as the graph is complete. Therefore, as no step takes more than quadratic time, a solution to k-charge removal or at-least-k-charge removal can be verified in polynomial time. Hence k-charge removal and at-least-k-charge removal fall into the class of NP problems. ∎
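The verification procedure from the proof can be sketched as follows. This is an illustrative implementation under assumed conventions (vertex weights in a list, pairwise energies in a symmetric matrix, a hypothetical `at_least` flag selecting the variant), not code from the paper.

```python
def verify_removal(weights, energy, removed, k, goal, at_least=False):
    """Check a claimed solution to the decision version in polynomial time."""
    removed = set(removed)
    # Step 1: sum the positive and negative weights of the removed set.
    pos = sum(weights[v] for v in removed if weights[v] > 0)
    neg = sum(weights[v] for v in removed if weights[v] < 0)
    if pos + neg != 0:
        return False  # the removed set is not neutral
    if at_least:
        if pos < k:
            return False
    elif pos != k:
        return False
    # Step 2: sum the edges with no endpoint in the removed set.
    keep = [v for v in range(len(weights)) if v not in removed]
    total = 0
    for a, i in enumerate(keep):
        for j in keep[a + 1:]:
            total += energy[i][j]
    return total <= goal


weights = [1, -1, 1, -1]
energy = [
    [0, -3, 2, -1],
    [-3, 0, -1, 2],
    [2, -1, 0, -2],
    [-1, 2, -2, 0],
]
print(verify_removal(weights, energy, {2, 3}, k=1, goal=-3))
```

The first step touches each removed vertex once and the second touches each remaining pair once, matching the linear and quadratic bounds in the proof.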

### The Minimal-At-Least-k-Charge Removal Problem

An alternative restriction of at-least-k-charge removal is the minimal-at-least-$k$-charge removal problem, denoted minimal-at-least-k-charge removal. This also serves as a generalisation of k-charge removal, where the goal is to get close to a set of $k$-charges while accepting that it may not be possible to reach the exact value. In this problem a minimal set of at-least-$k$-charges is removed.

###### Definition 5.

A set of at-least-$k$-charges is minimal if there exists no strict subset of it that is neutral and whose positively weighted vertices sum to at least $k$.

Informally, Definition 5 means that there is no way of getting closer to a set of $k$-charges from the set without having fewer than $k$ charges. It follows that for a given crystal graph, there may be multiple minimal at-least-$k$-charge sets for a given $k$. A removal of at-least-$k$-charges is minimal if the set of at-least-$k$-charges is minimal. It may be noted that a set of $k$-charges is always a minimal set of at-least-$k$-charges.

###### Problem 3.

Minimal-At-Least-$k$-Charge Removal (minimal-at-least-k-charge removal)

Instance: A crystal graph $G$, with edges weighted by a common energy function $U$, and a natural number $k$.

Goal: The minimal set of at-least-$k$-charges $R$ from $G$ such that the graph $G' = \{V', E'\}$ created by the removal of $R$ from $G$ minimises $\sum_{\{v_i, v_j\} \in E'} U_{ij}$.

###### Proposition 3.

It is NP-Hard to verify whether a set of at-least-$k$-charges is minimal when no bounds are given on the value of the charges.

###### Proof.

This can be shown by a reduction from the subset-sum problem. In the subset-sum problem there is a set of values $S = \{s_1, \dots, s_n\}$ and a goal $t$. The goal is to choose some subset $S' \subseteq S$ such that $\sum_{s \in S'} s = t$. Note that this problem remains NP-complete in the case the input is only positive integers.

Given an instance of subset sum, a crystal graph $G$ with vertex set $V$ is created as follows. For each integer $s_i \in S$ a new vertex with a weight of $s_i$ is created; these are the positively weighted vertices. Two further ions are created, the first having a charge of $-t$ and the second having a charge of $-(\sum_{s \in S} s - t)$; these are the negatively weighted vertices, and they make $V$ neutral. The value $k$ is chosen as the greater of $t$ and $\sum_{s \in S} s - t$.

Given this instance it can be claimed that the only minimal at-least-$k$-charge removal from $G$ is $V$. To disprove that $V$ is minimal there must be some strict subset $R' \subset V$ that is also a set of at-least-$k$-charges. As at least $k$ charges must be removed and $R'$ must be neutral without being all of $V$, any such $R'$ contains only one negatively weighted vertex, the one with a charge of $-k$. Therefore if this claim is false, there must be a set of positively weighted vertices whose weights sum to $k$. If there is such a set then there is also a solution to the subset-sum instance, as either $k = t$ or $k = \sum_{s \in S} s - t$. If $k = t$, the values in the set sum to $t$, satisfying the instance. Conversely, if $k = \sum_{s \in S} s - t$, then the positive ions outside the set sum to $t$, again satisfying the instance. If there is no such subset $R'$ then there is no solution to the subset-sum instance, as such a solution would allow the existence of an $R'$ for this set.

In the other direction, if there is a solution to the subset-sum instance then trivially there must exist such an $R'$ that would make $V$ non-minimal. Similarly, if there is no valid solution to the subset-sum instance then the only minimal set of at-least-$k$-charges is the complete set of ions. Therefore it cannot be determined in polynomial time whether a solution is minimal unless P = NP. Consequently, as minimal-at-least-k-charge removal requires a minimal set of charges, a solution cannot be verified in this way in polynomial time, and membership in NP is not established in the general case. ∎
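The reduction can be explored on small inputs with the sketch below. It rests on an assumption about the construction, since the exact charges are not legible in the source: the two negative ions are given charges $-t$ and $-(\Sigma - t)$, where $\Sigma$ is the sum of all input values, and $k$ is the larger of $t$ and $\Sigma - t$. All names are hypothetical, and the minimality check is a brute-force search over strict subsets.

```python
from itertools import combinations


def build_instance(values, target):
    """Vertex weights and k for a subset-sum instance (0 < target < sum)."""
    total = sum(values)
    # Assumed construction: one positive vertex per value, two negative ions.
    weights = list(values) + [-target, -(total - target)]
    k = max(target, total - target)
    return weights, k


def is_at_least_k_charges(subset_weights, k):
    """Neutral, with positive weights summing to at least k."""
    pos = sum(w for w in subset_weights if w > 0)
    neg = sum(w for w in subset_weights if w < 0)
    return pos + neg == 0 and pos >= k


def is_minimal(weights, k):
    """True if no strict subset of the whole vertex set is at-least-k-charges."""
    n = len(weights)
    for size in range(1, n):  # strict, non-empty subsets only
        for idx in combinations(range(n), size):
            if is_at_least_k_charges([weights[i] for i in idx], k):
                return False
    return True


# 8 is a subset sum of [3, 5, 8], so the whole set is not minimal;
# 4 is not a subset sum, so the whole set is minimal.
print(is_minimal(*build_instance([3, 5, 8], 8)))
print(is_minimal(*build_instance([3, 5, 8], 4)))
```

Under the stated assumption, the whole vertex set is minimal exactly when the subset-sum instance has no solution, which is the equivalence the proof relies on.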

###### Corollary 1.

It is NP-Hard to determine if an instance of k-charge removal has a valid solution in the case there are no bounds on the value of the charges.

###### Proof.

It follows from the arguments of Proposition 3 that an instance of k-charge removal may be constructed for a subset sum instance such that it is only satisfiable if the subset sum instance is. ∎

###### Corollary 2.

A set of $k$-charges may be verified as minimal in polynomial time for charge values bounded by a polynomial size.

###### Proof.

In the case the charges are bounded, a solution to the subset-sum problem may be found in polynomial time, for example in time relative to the upper limit on the values due to Pisinger [pisinger1999linear], or relative to the number of distinct weights and the goal value due to Axiotis and Tzamos [axiotis2018capacitated]. Using these, a set of $k$-charges can be verified as minimal. This is done by checking, for every candidate value of at least $k$, whether there is a strict subset made up of positive charges summing to that value and negative charges summing to its negation. If such a subset exists for any candidate value, then the set is not minimal.

The claimed energy may also be verified by checking the sum of pairwise interactions relative to the goal value, which may trivially be done in polynomial time by the definition of the class of controllable functions. Therefore under these restrictions minimal-at-least-k-charge removal is in NP. ∎
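The minimality check for bounded charges can be sketched with a textbook pseudo-polynomial dynamic programme over reachable sums. This is a stand-in for illustration, not the algorithm of Pisinger or of Axiotis and Tzamos cited above, and the function names are hypothetical. It uses the observation that, for a neutral set with non-zero weights, a strict neutral subset exists with positive sum $s$ exactly when $s$ is below the total positive charge and is reachable by both the positive weights and the magnitudes of the negative weights.

```python
def reachable_sums(values, limit):
    """All subset sums of positive integers `values` that are at most `limit`."""
    sums = {0}
    for v in values:
        sums |= {s + v for s in sums if s + v <= limit}
    return sums


def is_minimal_dp(weights, k):
    """Check minimality of a neutral set of at-least-k-charges."""
    pos = [w for w in weights if w > 0]
    neg = [-w for w in weights if w < 0]
    total = sum(pos)
    if total != sum(neg) or total < k:
        raise ValueError("input must be a neutral set of at-least-k-charges")
    # Sums achievable by the positive side and by the negative side.
    common = reachable_sums(pos, total) & reachable_sums(neg, total)
    # A common sum s with k <= s < total gives a strict neutral subset.
    return not any(k <= s < total for s in common)


print(is_minimal_dp([3, 5, 8, -8, -8], 8))
print(is_minimal_dp([3, 5, 8, -4, -12], 12))
```

The running time is proportional to the number of vertices times the total positive charge, which is polynomial when the charge values are polynomially bounded.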

## 3 NP-Hardness for an unbounded number of ion species

This section will focus on the class of controllable potential functions. It will be assumed that the energy function for all cases is an arbitrary function in this class for which the parameters that make the energy of a pairwise interaction equal to any arbitrary value are known. NP-completeness for k-charge removal, as well as for the generalisations to minimal-at-least-k-charge removal and at-least-k-charge removal, will be shown when there are bounds on the value of the charges (either the quantity of charges or the maximum value). It may be noted that in the case the charges are not bounded, minimal-at-least-k-charge removal will remain NP-Hard; however, as membership in NP is not established in that case, completeness does not follow. Moreover, k-charge removal and minimal-at-least-k-charge removal, when all vertices have a weight of the same fixed magnitude, can be reduced to max-weight-$k$-clique.

###### Theorem 1.

k-charge removal, minimal-at-least-k-charge removal and at-least-k-charge removal are NP-Complete for energy functions in for charges of , for any natural number .

###### Proof.

k-charge removal and at-least-k-charge removal are in NP by Proposition 2, and as the charges are bounded, minimal-at-least-k-charge removal will be in NP by Corollary 2. Hardness is established via a reduction from clique. This is shown by reduction to k-charge removal, noting that any satisfying solution to k-charge removal will also satisfy at-least-k-charge removal and minimal-at-least-k-charge removal.

In the Clique problem, henceforth clique, the input is a graph, , and a natural number, . The goal is to find a clique of size in , or report that no such clique exists. A clique is a set of vertices in a graph such that all vertices in the set are adjacent to each other.

Given an instance of clique, where , an instance, , of k-charge removal is constructed as follows. A unit cell of arbitrary size is chosen. Within this cell unique positions are created at arbitrary places. In the first positive ions are placed and in the last negative ions are placed. Each ion has its own unique specie. Every vertex corresponds to two ions, and with charges and respectively. For two ions and associated with and respectively the parameters are set so as to satisfy the following:

$$U_{ij} = \begin{cases} -1 & v_i = v_j \text{ or } \{v_i, v_j\} \in E \\ \infty & \text{otherwise.} \end{cases}$$

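The definition of the class of controllable functions guarantees that there exist parameters satisfying these conditions irrespective of the positions, and thus the distance $r_{ij}$, of the ions. The goal value is chosen as the negation of the number of edges between the ions that remain when only the ions corresponding to a clique of the required size are kept. The number $k$ of charges to remove is chosen as the total positive charge of the remaining ions, noting that equally many positive and negative ions must be removed.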

From this the corresponding crystal graph is constructed as described in the preliminaries. Each vertex of the clique instance is represented by two vertices of the crystal graph, one for its positive ion and one for its negative ion; where the charge does not matter, either of the two is identified with the vertex of the clique instance to which it corresponds. From the definition of the energy function, $U_{ij} = -1$ if $v_i = v_j$ or $\{v_i, v_j\} \in E$, and $U_{ij} = \infty$ otherwise.

It may now be claimed that will be satisfiable if and only if is satisfiable. First consider the case that is satisfiable. In this case charges may be removed from , leaving only the vertices corresponding to the clique in , denoted . As all vertices in correspond to adjacent vertices in , the energy will simply be multiplied by the number of edges, giving a total energy of , satisfying the k-charge removal instance. Conversely if there does not exist a clique of size in then any subset of vertices of cardinality clearly must contain at least one edge with a weight of , making unsatisfiable.

Assume now that the k-charge removal instance is satisfiable, with a set of removed vertices and a set of remaining vertices. Note that the energy must be at most the goal value, and that the number of remaining vertices is fixed by the number of charges removed. The only negative-weight edges in the crystal graph are those between two vertices representing either the same or adjacent vertices in the clique instance, so the goal can only be achieved when the remaining vertices represent the vertices of a clique of the required size. Similarly, if the k-charge removal instance is unsatisfiable then there cannot be a clique of the required size, as this would imply that the instance is satisfiable. Therefore k-charge removal is NP-Complete in the case that all ions have charges of the same natural-number magnitude. The problem is also NP-Complete for at-least-k-charge removal and minimal-at-least-k-charge removal, as the minimum energy for any at-least-$k$-charge removal is the goal value, which can only be achieved by a $k$-charge removal. ∎
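The construction in the proof of Theorem 1 can be sketched in code. The exact expressions for the goal value and for $k$ are not legible in the source, so the ones below are assumptions derived from the construction: with a clique of size `clique_size` in a graph on `n` vertices, `2 * clique_size` ions remain, every remaining pair contributes $-1$, and `n - clique_size` positive ions of a common charge are removed. All names are hypothetical.

```python
import math
from itertools import combinations


def clique_to_removal(n, edges, clique_size, charge=1):
    """Build weights, pair energies, k and goal from a clique instance.

    Ion i (positive) and ion n + i (negative) both represent vertex i.
    """
    adjacent = {frozenset(e) for e in edges}
    weights = [charge] * n + [-charge] * n
    energy = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i, j in combinations(range(2 * n), 2):
        u, v = i % n, j % n
        same_or_adjacent = u == v or frozenset((u, v)) in adjacent
        energy[i][j] = energy[j][i] = -1.0 if same_or_adjacent else math.inf
    kept = 2 * clique_size
    goal = -(kept * (kept - 1) // 2)  # assumed: -1 per remaining pair
    k = charge * (n - clique_size)    # assumed: positive charge removed
    return weights, energy, k, goal


def kept_energy(energy, n, vertices):
    """Energy left when only the ions of the given graph vertices remain."""
    ions = sorted(vertices) + [n + v for v in sorted(vertices)]
    return sum(energy[i][j] for i, j in combinations(ions, 2))


# A triangle on vertices 0, 1, 2 with a pendant vertex 3 attached to 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
weights, energy, k, goal = clique_to_removal(4, edges, clique_size=3)
print(kept_energy(energy, 4, {0, 1, 2}) <= goal)
print(kept_energy(energy, 4, {0, 1, 3}) <= goal)
```

Keeping the ions of a genuine clique meets the goal, while keeping any pair of non-adjacent vertices introduces an infinite edge weight.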

It can be noted that this may be extended to other graph problems relatively easily. One example of this would be the max-weight k-clique problem. Max-weight k-clique takes as input a weighted graph, a natural number giving the clique size, and a goal value. The problem is to report whether there exists a clique of the given size in which the sum of the weights of the edges is at least the goal value. Using the above construction, a crystal graph may be created from the weighted graph. From this the weights on the edges may be adjusted as follows:

$$U_{ij} = \begin{cases} -wt(v_i, v_j) & v_i \neq v_j \text{ and } \{v_i, v_j\} \in E \\ -c & v_i = v_j \\ \infty & \text{otherwise,} \end{cases}$$

where $wt(v_i, v_j)$ denotes the weight of the edge between vertices $v_i$ and $v_j$ in the weighted graph and $c$ is a suitably chosen constant. The goal value for the k-charge removal instance is chosen accordingly. The correctness of this reduction follows from the arguments in Theorem 1.

###### Theorem 2.

k-charge removal remains NP-Hard for a set of allowed charges with unique magnitude and a controllable energy function.

###### Proof.

The construction of Theorem 1 may be extended to the case the set of charges is limited to any set of allowed charges. Two charges are chosen from this set, and such that and such that is minimised. The same steps as in Theorem 1 are followed for the construction to get an initial crystal graph and . Note that will have a charge deficiency of , meaning that some set of vertices must be added to make the cell neutral. To handle the deficiency, two sets of dummy vertices with charges of and will be created.

The first set will be to deal with the deficiency that would be left from a clique of size . To construct these, a natural number is chosen such that there exists a pair of natural numbers and such that and . Using these, vertices with a charge of and vertices with a charge of and with a charge of are added. From the definition of , the energy between them and all ions in and between each other can be set as .

The second set of dummy vertices will be to counteract the overall deficiency in the initial unit cell. A natural number is chosen such that there exists a pair of natural numbers and where and . vertices with a charge of and vertices with a charge of are added. The potential energy between them and all other vertices, including the set of previously added dummy vertices, is .

To ensure that the best set of ions to be left with will be a clique of size as well as all of the dummy vertices added in the first step, the following is done. The goal energy remains the same as from Theorem 1. It can be seen that the only way to achieve this is to leave vertices corresponding to a clique of size at least . As there are charges for one set, and the goal is to be left with , a value is chosen to remove as . In the case that exactly -charges are removed, either the dummy vertices or some other vertices corresponding to a clique of size greater than , ensuring the set remains neutral will be left. In the at-least- case, some dummy vertices may also be removed provided the cell remains neutral.

From the arguments in Theorem 1 it can be seen that this will be sufficient to ensure the new instance is satisfiable if and only if the original clique instance is. Therefore these problems are NP-Hard, even in the case that there are distinct charges and , . ∎

While there are simple reductions to other NP-Complete problems such as Integer Programming, embedding this problem into many classical problems is made difficult due to the problem of maintaining the neutrality of the unit cell. To this end, Theorem 3 shows how a restricted version of k-charge removal may be embedded into max-weight k-clique.

###### Theorem 3.

k-charge removal can be reduced to max-weight k-clique in polynomial time, under the restriction that charges are limited and the energy function is computable within polynomial time.

###### Proof.

Note that, given charges of , a valid solution to minimal-at-least-k-charge removal will either be valid for k-charge removal, or there will be no valid solutions to k-charge removal. Taking as input an instance of minimal-at-least-k-charge removal with charges of with the corresponding crystal graph, it will be claimed that this instance may be represented as an instance of the weighted generalisation of clique.

In weighted -clique, denoted weighted-k-clique, the input is a weighted graph, a goal value , and a natural number . An instance of weighted-k-clique is satisfiable if and only if there exists a clique of size such that the sum of the weights of the edges in the clique is at least .
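The decision problem just defined can be made concrete with a brute-force checker. The sketch below is exponential in the clique size and is meant only to pin down the definition; the function name and the edge representation (a dictionary keyed by `frozenset` pairs) are illustrative choices, not part of the paper.

```python
from itertools import combinations

def has_weighted_k_clique(vertices, weights, k, goal):
    """Return True if some k-clique has total edge weight at least goal.

    weights maps frozenset({u, v}) to the weight of edge {u, v}; a pair
    missing from the dictionary is treated as a non-edge.
    """
    for subset in combinations(vertices, k):
        pairs = [frozenset(p) for p in combinations(subset, 2)]
        if all(p in weights for p in pairs):
            if sum(weights[p] for p in pairs) >= goal:
                return True
    return False

# Hypothetical example: a triangle a-b-c plus a heavy pendant edge c-d.
vertices = ["a", "b", "c", "d"]
weights = {
    frozenset(("a", "b")): 1,
    frozenset(("b", "c")): 2,
    frozenset(("a", "c")): 3,
    frozenset(("c", "d")): 10,
}
```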

Given an instance of minimal-at-least-k-charge removal , an instance of weighted-k-clique is created as follows. A value is chosen as rounded down to the nearest natural number. Note that if , then there is no valid solution to k-charge removal, however there may still be some valid solution to minimal-at-least-k-charge removal. A new graph is created which will initially be empty. For each pair of vertices with different charges a new associated vertex in is created. An edge is created between each new vertex if and only if the corresponding vertices are all unique, i.e. given the set of vertices and an edge would be placed between the new vertex representing and the one representing , but not from either to the vertex representing . Given two connected vertices corresponding to vertices and , the edge between them is assigned a value of . The intuition behind this is for the edge to maintain the weights of the edges in . is added to this so that within a clique of size , the edge between the two vertices is fully represented. An example of this construction is shown in Figure 1, omitting weights for legibility.

It may now be claimed that any clique of size will correspond to a neutrally weighted subset where . This is shown by noting that vertices are only connected if they do not represent a common vertex. As such a clique of size must contain unique positively weighted and unique negatively weighted vertices for the corresponding vertices to be connected as a clique. Therefore by selecting any clique of size in this graph, it can be seen that there will be a valid structure left with exactly unique positively weighted and unique negatively weighted vertices. From the definition of this will correspond to a subgraph of after a minimal removal of .
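The structural part of this construction (without edge weights, as in Figure 1) can be sketched as follows. The sketch assumes ions are split into a positive and a negative list with unit-magnitude charges; `build_pair_graph` and `is_clique` are hypothetical helper names.

```python
from itertools import combinations, product

def build_pair_graph(positive_ions, negative_ions):
    """Create one node per (positive, negative) ion pair.

    Two nodes are joined only when they share no ion, so a clique can never
    use the same ion twice.
    """
    nodes = list(product(positive_ions, negative_ions))
    edges = {
        frozenset((u, v))
        for u, v in combinations(nodes, 2)
        if u[0] != v[0] and u[1] != v[1]
    }
    return nodes, edges

def is_clique(node_subset, edges):
    """Check that every two nodes of the subset are joined by an edge."""
    return all(frozenset((u, v)) in edges for u, v in combinations(node_subset, 2))

nodes, edges = build_pair_graph(["p1", "p2"], ["n1", "n2"])
```

In this toy instance the only cliques of size two are `{(p1, n1), (p2, n2)}` and `{(p1, n2), (p2, n1)}`, each of which uses two distinct positive and two distinct negative ions and is therefore charge neutral.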

It may now be claimed that a maximum weight clique of size will correspond to the best subset of ions after a charge removal. Note that a clique with total weight corresponds to a set of ions with total energy . It is a straightforward extension to see that a maximum weight clique will correspond to a minimum energy subset of ions. This can be seen by noting that by choosing as the size of the clique, the corresponding arrangement will have . From the definition of , this requires , which will satisfy the requirements for a charge removal. Conversely the definition of ensures that the removal must be minimal. Therefore the optimal solution to the weighted-k-clique instance must correspond to an optimal solution to the minimal-at-least-k-charge removal instance. Similarly any valid solution to the weighted-k-clique instance will correspond to some solution to the minimal-at-least-k-charge removal instance. ∎

## 4 Bounded number of species with Buckingham-Coulomb potential

In Section 3 NP-Hardness was shown for the case that there was an unbounded number of species, and NP-completeness in the case that there is a bounded number of charges. This will now be strengthened by considering instances with only two unique species. Only the Buckingham-Coulomb potential function with charges of will be considered in this section. All three problems will again be considered, noting that for charges of k-charge removal is equivalent to minimal-at-least-k-charge removal. NP-Hardness will be shown by a reduction from independent-set on penny graphs, adapting it to the Euclidean setting of a crystal graph of ions within a unit cell. The Independent Set problem, denoted independent-set, takes as input a graph, , and a natural number . The goal is to find an independent set, i.e. a set of vertices such that no two are adjacent, of size in , or report that one does not exist. Penny graphs are the class of graphs where each vertex may be drawn as a unit circle such that no two circles overlap, and an edge between two vertices exists if and only if the corresponding circles are tangent, i.e. they intersect at only a single point. Finding an independent set on this class of graphs was shown to be NP-Hard by Cerioli et al. [ITA_2011__45_3_331_0]. The NP-Hardness result for this problem comes from a reduction from max-degree 3 Planar Vertex cover, which was shown to be NP-Complete by Garey and Johnson [doi:10.1137/0132071].
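The two definitions above (penny graph and independent set) can be illustrated with a short sketch. It assumes the penny graph is given as a list of circle centres with a common radius and uses a small numerical tolerance for tangency; the function names are illustrative.

```python
import math
from itertools import combinations

def penny_graph_edges(centres, radius, tol=1e-9):
    """Return index pairs (i, j) whose circles are tangent.

    Two circles of equal radius are tangent exactly when their centres are
    2 * radius apart; tol absorbs floating-point error.
    """
    edges = set()
    for (i, p), (j, q) in combinations(enumerate(centres), 2):
        if abs(math.dist(p, q) - 2 * radius) <= tol:
            edges.add((i, j))
    return edges

def is_independent_set(subset, edges):
    """True if no edge has both endpoints inside subset."""
    chosen = set(subset)
    return not any(i in chosen and j in chosen for i, j in edges)

# Hypothetical realisation: three pennies in a row.
centres = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
edges = penny_graph_edges(centres, radius=0.5)
```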

Construction of the k-charge removal instance: Starting with an instance of independent-set on a maximum degree 3 planar graph, containing the graph and a natural number , an instance of k-charge removal is created as follows. Theorem 1.2 from Cerioli et al. is used to create a new penny graph realisation, , and a new natural number . The class of graphs created by this process will be denoted as the long orthogonal penny graphs. For the realisation the radius of each circle is chosen as .

A region of space in with a height of at least and a width and length allowing to be drawn is created. This space will be the parallelepiped for the unit cell. In this space, two copies of are drawn such that one is placed higher than the other. For every circle in two ions are created, one in the lower copy of and the other in the higher copy. Each ion is labelled with the vertex from it corresponds to. In this context pair refers to the two ions in the new crystal graph , labelled with the same vertex from . Two pairs are neighbouring if they represent vertices that are adjacent in . The lower ions are assigned the positive species and the upper ions the negative. An example of this arrangement is provided in Figure 2. Note that the minimum distance between two pairs in the same plane that are non-adjacent for circles with a radius of will be , as shown in Figure 3.
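A minimal sketch of this two-layer arrangement is given below. It assumes unit-magnitude charges and takes the vertical separation between the two copies as a parameter `height` whose default of 1.0 is an assumption made for illustration; the dictionary layout and the name `build_ion_pairs` are likewise hypothetical.

```python
def build_ion_pairs(centres, height=1.0):
    """Place a positive ion below and a negative ion above each circle centre.

    centres are the (x, y) centres of the penny graph realisation; both ions
    of a pair carry the index of the vertex they represent as their label.
    """
    ions = []
    for label, (x, y) in enumerate(centres):
        ions.append({"label": label, "charge": +1, "position": (x, y, 0.0)})
        ions.append({"label": label, "charge": -1, "position": (x, y, height)})
    return ions

ions = build_ion_pairs([(0.0, 0.0), (2.0, 0.0)])
```

Because every vertex contributes one ion of each sign, the full arrangement is charge neutral, and it stays neutral whenever whole pairs are removed.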

The positive and negative species are assigned charges of and respectively. From these species there are parameters for the interaction between two ions of the positive species, two ions of the negative species, and between one ion of the positive species and one of the negative species. For brevity, 1 and 2 will be used to denote the positive and negative species respectively. Under this construction, the interaction between two ions of the positive species is the same as between two ions of the negative species. Therefore the parameters that may be set are and .

Let , being the number of charges that are required to be removed to be left with an independent set of size . Note that as the charge of each ion has a magnitude of one, a removal of can only be achieved by removing positive and negative ions. The goal energy for the construction is set as . To simplify the equations regarding the interaction between planes, will be used to denote .

An independent set will be said to be left if the ions left after a removal of charges have labels corresponding to an independent set in . To ensure that an independent set is left of size if and only if one exists, the following three inequalities must be satisfied:

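$$\frac{A_{11}}{e^{B_{11} n}} - \frac{C_{11}}{n^6} + \frac{1}{n} + \frac{A_{12}}{e^{B_{12} \hat{n}}} - \frac{C_{12}}{\hat{n}^6} - \frac{1}{\hat{n}} \geq \left| \frac{A_{12}}{e^{B_{12}}} - C_{12} - 1 \right| \tag{1}$$

$$n^2 \left| \frac{A_{11}}{e^{B_{11} r}} - \frac{C_{11}}{r^6} + \frac{1}{r} + \frac{A_{12}}{e^{B_{12} \hat{r}}} - \frac{C_{12}}{\hat{r}^6} - \frac{1}{\hat{r}} \right| \leq \left| \frac{A_{12}}{e^{B_{12}}} - C_{12} - 1 \right|, \quad r \geq \sqrt{2} n \tag{2}$$

$$\frac{A_{11}}{e^{B_{11} r}} - \frac{C_{11}}{r^6} + \frac{1}{r} + \frac{A_{12}}{e^{B_{12} \hat{r}}} - \frac{C_{12}}{\hat{r}^6} - \frac{1}{\hat{r}} > 0, \quad r \geq \sqrt{2} n \tag{3}$$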
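The three inequalities can be evaluated numerically for a candidate parameter set. The sketch below makes several assumptions that should be checked against the definitions earlier in the paper: the potential is taken as $A/e^{Br} - C/r^6 + q_1 q_2 / r$, the hatted distance is taken as $\sqrt{r^2 + h^2}$ for a vertical separation `height` (default 1.0), and Inequalities (2) and (3) are only sampled at the supplied distances rather than verified for every $r \geq \sqrt{2}n$. All function and parameter names are hypothetical.

```python
import math

def buckingham_coulomb(r, a, b, c, q1, q2):
    """Assumed form of the Buckingham-Coulomb energy at distance r."""
    return a / math.exp(b * r) - c / r**6 + q1 * q2 / r

def inequalities_hold(n, same, cross, r_values, height=1.0):
    """Evaluate Inequalities (1)-(3) for given parameters.

    same  = (A11, B11, C11): parameters between ions of the same species.
    cross = (A12, B12, C12): parameters between ions of different species.
    r_values: sample in-plane distances, each expected to be >= sqrt(2) * n.
    """
    a11, b11, c11 = same
    a12, b12, c12 = cross

    def between_pairs(r):
        # Same-plane repulsion plus cross-plane attraction for two pairs
        # whose in-plane distance is r.
        r_hat = math.sqrt(r * r + height * height)
        return (buckingham_coulomb(r, a11, b11, c11, 1, 1)
                + buckingham_coulomb(r_hat, a12, b12, c12, 1, -1))

    # Magnitude of the energy inside a single pair (distance 1).
    within_pair = abs(buckingham_coulomb(1.0, a12, b12, c12, 1, -1))

    ineq1 = between_pairs(n) >= within_pair
    ineq2 = all(n**2 * abs(between_pairs(r)) <= within_pair for r in r_values)
    ineq3 = all(between_pairs(r) > 0 for r in r_values)
    return ineq1, ineq2, ineq3
```

A sampled check of this kind can only refute a parameter choice; the argument that the inequalities hold for all admissible distances is the subject of Lemma 2.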

In Lemmas 1, 2 and 3 it will be assumed that when removing charges, it will be preferable to choose pairs over any other set of charges in the arrangement. This will be proven in Lemma 4.

###### Lemma 1.

Inequalities (1) and (2) are sufficient to ensure that an independent set is left if one exists.

###### Proof.

It can be seen that Inequality (1) will ensure that if there are two pairs corresponding to points that intersect, removing one of the pairs will always decrease the total energy. Inequality (2) complements this by ensuring that given a pair corresponding to a vertex with no adjacent neighbours, the total energy would increase by removing it. This holds even in the case that all other pairs are at a distance of . It can be seen that these two inequalities combined will mean that the global minimum total energy for any subset will be the maximum independent set. Note that the total energy will decrease with the cardinality of the given independent set. ∎

###### Lemma 2.

There exist, for any structure created from a long orthogonal penny graph, some parameters such that Inequalities (1), (2) and (3) are satisfied.

###### Proof.

Values are chosen for and such that the energy for any pair of ions of opposite charge at a distance of 1 is . This is achieved by choosing a value of for , 0 for , and for . Note that these values for and will lead to a constant interaction of between ions on separate planes. This simplifies the energy equation to

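$$U_{BC}(r) = \frac{A_{11}}{e^{B_{11} r}} - \frac{C_{11}}{r^6} + \frac{1}{r} + \frac{1}{2n^2} - \frac{1}{2n^2 \hat{r}^6} - \frac{1}{\hat{r}}.$$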
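As a quick consistency check of the cross-species terms, assume the parameter choice suggested by the simplified equation, namely $A_{12} = C_{12} = \frac{1}{2n^2}$ and $B_{12} = 0$ (an assumption read off the equation, not a quoted value). The energy of two oppositely charged ions at distance 1 would then be

$$\frac{A_{12}}{e^{B_{12} \cdot 1}} - \frac{C_{12}}{1^6} - \frac{1}{1} = \frac{1}{2n^2} - \frac{1}{2n^2} - 1 = -1,$$

so the right-hand side of Inequalities (1) and (2) would evaluate to $1$ under this choice.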

To satisfy Inequality 1, . This may be satisfied by choosing values for , , and such that , noting that for all positive distances greater than 1. This can be satisfied by solving the equation , choosing .

Inequality (2) requires that at a distance of at least the total energy is no more than . This can be satisfied at a distance of by ensuring that the , which can be satisfied after substituting in the appropriate value for with , which simplifies to . Finally, consider the value of . Note that the value of both and depend greatly on , with a small increase in leading to a very rapid increase in the value of and a rapid decrease in . Similarly, the value of the energy given by rapidly decreases initially before converging at approximately 0, which can be seen by noting that the first derivative with respect to of this equation for a given is . As such by choosing a suitably large Inequality (2) can be easily satisfied, one obvious choice for this would be .

Note that both and will strictly decrease, therefore if and both Inequalities (2) and (3) are satisfied. From the value of this becomes , which will be positive and less than for any . Note that for any , hence it is clear that for . Considering at a distance of , the equation becomes , from the previous arguments it follows that this will be considerably less than for .

Finally it can be noted that due to the constant term there is a positive value for any distance greater than , satisfying Inequality (3). Using these values, it has now been shown how to choose a set of parameters which satisfy the Inequalities. ∎

###### Lemma 3.

Given pairs, the energy will be less than if and only if the pairs correspond to an independent set of size , for .

###### Proof.

Given pairs, the energy between the ions in each pair will be , for a total of . Inequality (2) ensures that the maximum energy gained from pairs of ions corresponding to non-intersecting circles will be at most . Inequality (3) ensures that having charges on the same plane will lead to a slight positive charge. From this it follows that the maximum energy of a set of ions corresponding to an independent set will be . Conversely, from Inequality (1) it is known that if there is a pair of intersecting circles the total energy must be greater than .

It can be noted for at-least-k-charge removal that if greater than pairs were removed this energy could not be achieved as the minimum energy would be for the interaction within pairs. As there is a positive interaction between pairs, the total energy must be slightly greater than this for any . Therefore it can be seen that the total energy will be less than if and only if there is an independent set of size left. It can be noted that under the choice of variables from Lemma 2, the upper bound of will be . ∎

###### Lemma 4.

When removing charges from the construction from a long orthogonal penny graph, it is always preferable to remove pairs provided that Inequalities (1)-(3) hold.

###### Proof.

Assume that this statement is false. Then there must be some assignment where it is preferable to remove some set of at least two vertices, and , that do not form a pair with any ions that have been removed. Assume that there are positive and negative vertices in the graph. If instead was left in, while was removed, the remaining energy would change by at least . From the arguments in Lemma 1 and the construction in Lemma 2 it can be seen that this will lead to a decrease in total energy, making it preferable and therefore contradicting the assumption. Note that given a positively weighted vertex of the maximum degree, in this case 3, it could contribute at most which will have a magnitude less than 1 for any . Therefore, by contradiction this holds. ∎

###### Theorem 4.

k-charge removal, minimal-at-least-k-charge removal and at-least-k-charge removal are NP-Complete when limited to only two species of ion and restricted to the Buckingham-Coulomb potential energy function.

###### Proof.

Building on the results from Lemmas 1, 2, 3, and 4, NP-Completeness will now be shown. Lemma 1 shows that, provided Inequalities (1) and (2) hold, the optimal solution will be to leave an independent set. From Lemma 2 it follows that these inequalities are satisfiable for any graph under the given construction, noting that the assignment of parameters gives an energy of within pairs. Lemma 3 shows that the upper bound is reachable if and only if an independent set has been left. It follows from Lemma 4 that the assumption that it is preferable to remove a set of pairs over any other set of charges holds when the inequalities also do.

Therefore there will be a satisfiable instance of k-charge removal or any generalisation if and only if the instance of independent set for the maximum degree 3 planar graph instance is satisfiable. Conversely if the independent set instance is satisfiable, the corresponding k-charge removal instance can be satisfied by leaving the vertices corresponding to the independent set in the long orthogonal penny graph construction. Hence under these restrictions all three problems will be NP-complete. Note that this may be extended to charges of for any given . ∎

## 5 Restriction to the Coulomb potential with unbounded charges

The final case that will be considered in this work is when the energy function is the Coulomb potential. NP-Hardness for this case will be shown by a reduction from knapsack to at-least-k-charge removal. Note that with an unbounded number of charges this problem is not in NP for minimal-at-least-k-charge removal due to Proposition 3 and is trivially NP-Hard for k-charge removal due to Corollary 1. This reduction requires using an unbounded number of charges, thus it follows from Proposition 3 that it is NP-Hard to verify if a solution to an instance of at-least-k-charge removal is minimal.

In this reduction an instance of at-least-k-charge removal is constructed such that the set of ions left will correspond to the items for the knapsack instance if and only if there is a set satisfying the knapsack instance.

###### Theorem 5.

at-least-k-charge removal and minimal-at-least-k-charge removal remain NP-Hard when the energy function is limited to the Coulomb potential.

###### Proof.

In the knapsack problem, henceforth knapsack, the input is a bag with capacity , and a set of items . Each item has a weight , and a value . In this problem the goal is to find the subset such that is maximised conditional on . Alternatively this may be phrased as a decision problem by taking some goal value and asking if there is an such that .
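The decision version of knapsack described above can be sketched with the textbook dynamic programme over capacities. The sketch assumes non-negative integer weights and an integer capacity, which is a restriction chosen for the illustration rather than one stated in the text; the function name is hypothetical.

```python
def knapsack_decision(items, capacity, goal):
    """Return True if some subset fits in capacity with total value >= goal.

    items is a list of (weight, value) tuples with non-negative integer
    weights; each item may be used at most once.
    """
    # best[cap] holds the highest value achievable with total weight <= cap.
    best = [0] * (capacity + 1)
    for weight, value in items:
        # Iterate capacities downwards so each item is counted only once.
        for cap in range(capacity, weight - 1, -1):
            best[cap] = max(best[cap], best[cap - weight] + value)
    return best[capacity] >= goal

# Hypothetical instance: three items given as (weight, value).
items = [(2, 3), (3, 4), (4, 5)]
```

The running time is pseudo-polynomial (proportional to the capacity), which is consistent with the problem being NP-Hard when the capacity is encoded in binary.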

NP-completeness for at-least-k-charge removal and minimal-at-least-k-charge removal will be shown by a reduction from the knapsack problem. Given an instance, , of the knapsack problem as described above, an instance, , of at-least-k-charge removal is created as follows. For every , two vertices are created, denoted and and labelled with the corresponding item. These are assigned a weight of to and to .

The values and will be defined such that is some value such that there does not exist any pair of items, and , such that but , and is less than the smallest unit of precision for the value of the items. Using this, is defined as some value satisfying the inequality , where is the weight of the heaviest item. This ensures that is some distance such that if all vertices are at least away from each other there will be a difference of no more than in energy, which will be sufficient to ensure that vertices at that distance may be safely ignored.

These vertices are now placed such that for each item such that the distance between the two vertices and will give a potential energy of . Recall that , therefore this can be achieved by placing them at a distance of . Each pair of vertices representing an item is placed in a line so that the distance between any two pairs is no less than . An example of this construction is provided in Figure 4.
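The placement step can be sketched as follows, under assumptions that are labelled here because the exact values are given by the paper's formulas: the Coulomb energy between charges $q_1$ and $q_2$ at distance $r$ is taken as $q_1 q_2 / r$ with units omitted, an item of weight $w$ and value $v$ is represented by charges $+w$ and $-w$, and the in-pair distance is then $w^2 / v$ so that the pair energy equals $-v$. The names `place_item_pairs`, `coulomb_energy` and `separation` are hypothetical.

```python
def place_item_pairs(items, separation):
    """Lay out one oppositely charged ion pair per item along a line.

    items is a list of (weight, value) tuples with positive entries;
    separation is the minimum gap kept between consecutive pairs.
    """
    ions = []
    x = 0.0
    for label, (weight, value) in enumerate(items):
        gap = weight**2 / value  # assumed distance giving pair energy -value
        ions.append({"label": label, "charge": +weight, "x": x})
        ions.append({"label": label, "charge": -weight, "x": x + gap})
        x += gap + separation    # start the next pair 'separation' further on
    return ions

def coulomb_energy(ion_a, ion_b):
    """Coulomb energy q_a * q_b / r between two ions on the line."""
    return ion_a["charge"] * ion_b["charge"] / abs(ion_a["x"] - ion_b["x"])

# Hypothetical instance with two items given as (weight, value).
ions = place_item_pairs([(2, 4), (3, 3)], separation=100.0)
```

With a large `separation` the interaction between ions of different items becomes small relative to the in-pair energies, which is the role played by the distance bound in the construction.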

The value is chosen as , ensuring that there will be no more than charges left after removing , corresponding to a valid assignment for the knapsack instance. Finally, the goal value will be chosen as .

It follows from this construction that any removal of at-least- charges will be a valid packing in terms of the capacity.

If the at-least-k-charge removal instance is satisfiable then there must be some valid packing of no more than energy. As the interaction between vertices corresponding to different items is trivially small, the only way to achieve this is to choose a set of vertex pairs with an energy between them no more than . As the energy between pairs is equal to the value of the items, the only way this can be achieved is to have items corresponding to a packing with value at least . Conversely if the at-least-k-charge removal instance is not satisfiable, there does not exist a packing of value by the same arguments.

Similarly if the knapsack instance is satisfiable then the at-least-k-charge removal instance may be satisfied by removing all vertices not corresponding to a satisfying packing of the knapsack instance. Finally if the knapsack instance is not satisfiable then by the previous arguments the at-least-k-charge removal instance also cannot be satisfied. Therefore this problem is NP-Complete. Note that as the weights on all items will be positive, with a corresponding negative energy in , given a non-minimal satisfying solution there will exist some minimal satisfying solution. Therefore minimal-at-least-k-charge removal will also be NP-Hard. ∎

## 6 Conclusions and future work

In this work we have presented the new problem of k-charge removal, and a class of functions for which the general case is NP-Complete. We have also shown that the problem remains NP-Complete under both the restriction that we have only two species of ions and the Buckingham-Coulomb energy function and the restriction that we only use the Coulomb potential on an unbounded number of ion species.

One obvious question would be if approximation results can be gained for this problem. We would submit that while it seems likely that the general case is APX-hard, under a bound on the number of ions this problem may well be approximable within a reasonable factor.

From a chemistry standpoint, while we have made progress towards physical constructions there is still a lot that could be done in this regard. As such, investigation into the restrictions of having more realistic physical values remains an important unexplored direction. Another question would be if we can investigate the convergence of these interactions, particularly the Coulomb potential, over a periodic structure to more fully understand the energy function.