# Starting CLuP with polytope relaxation

The Controlled Loosening-up (CLuP) mechanism that we recently introduced in <cit.> is a generic concept that can be utilized to solve a large class of problems in polynomial time. Since it relies in its core on an iterative procedure, the key to its excellent performance lies in a typically very small number of iterations needed to execute the entire algorithm. In a separate paper <cit.>, we presented a detailed complexity analysis that indeed confirms the relatively small number of iterations. Since both papers, <cit.> and <cit.>, are introductory papers on the topic, we made sure to limit the initial discussion just to the core of the algorithm and consequently focused only on the algorithm's most basic version. On numerous occasions though, we emphasized that various improvements and further upgrades are possible. In this paper we present a first step in this direction and discuss a very simple upgrade that can be introduced on top of the basic CLuP mechanism. It relates to how CLuP is started and suggests the well-known polytope-relaxation heuristic (see, e.g. <cit.>) as the starting point. We refer to this variant of CLuP as CLuP-plt and proceed with the presentation of its complexity analysis. As in <cit.>, a particular type of complexity analysis, the complexity analysis per iteration level, is chosen and presented through the algorithm's application on the well-known MIMO ML detection problem. As expected, the analysis confirms that CLuP-plt performs even better than the original CLuP. In some of the most interesting regimes it often achieves an excellent performance within the first three iterations. We also complement the theoretical findings with a solid set of numerical experiments.


## 1 Introduction

To handle the famous MIMO ML detection problem, we presented in [23] the so-called Controlled Loosening-up (CLuP) algorithm. Since the CLuP algorithm will be the main topic of this paper as well, and since we will study its behavior when applied to solving MIMO ML detection problems, we first briefly recall the basics of the MIMO ML setup. The model of interest is

$$y=Ax_{sol}+\sigma v, \qquad (1)$$

where $y\in\mathbb{R}^{m}$ is the output vector, $A\in\mathbb{R}^{m\times n}$ is the system matrix, $x_{sol}\in\mathbb{R}^{n}$ is the input vector, $v\in\mathbb{R}^{m}$ is the noise vector at the output, and $\sigma$ is a scaling factor that determines the ratio of the useful signal and the noise (the so-called SNR (signal-to-noise ratio)). It goes without saying that this type of system modeling is among the most useful/popular in various scientific/engineering fields (a particularly popular application of this model in the fields of information theory and signal processing is its utilization in modeling of multi-antenna systems).
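As a concrete illustration, the model in (1) with the normalization used throughout the paper can be generated in a few lines (a minimal numpy sketch; the concrete dimensions, seed, and variable names are ours, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 50, 40            # input/output dimensions; alpha = m/n stays constant
sigma = 0.2              # noise scaling (inverse SNR)

A = rng.standard_normal((m, n))                    # system matrix, i.i.d. N(0,1)
v = rng.standard_normal(m)                         # output noise, i.i.d. N(0,1)
x_sol = rng.choice([-1.0, 1.0], n) / np.sqrt(n)    # binary input, unit norm

y = A @ x_sol + sigma * v                          # the model in (1)
```

With this scaling the useful signal $Ax_{sol}$ and the noise $\sigma v$ have comparable per-component magnitudes, which is what makes $\sigma$ play the role of the (inverse) SNR.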

Also, we will here continue the trend that we started in [23] and [22] and consider a statistical setup where both $A$ and $v$ are comprised of i.i.d. standard normal random variables. A similar continuation of the trend from [23] and [22] regarding the so-called linear regime will be in place as well. That means that in this paper we will also view $m$ and $n$ as large but with a constant proportionality between them, i.e. we will assume that $m=\alpha n$, where $\alpha$ is a number that doesn't change as both $m$ and $n$ grow. The following optimization problem is the simplest yet most fundamental version of the MIMO ML detection problem

$$\hat{x}=\arg\min_{x\in\mathcal{X}}\|y-Ax\|_2, \qquad (2)$$

where $\mathcal{X}$ is the set of all available input vectors $x$. Now, many interesting scenarios/variants of the MIMO ML problem appear depending on the structure of $\mathcal{X}$ (for example, LASSO/SOCP variants of (2) often seen in statistics, machine learning, and compressed sensing are just a tiny subset of many very popular scenarios of interest; more on these considerations can be found in e.g. [13, 2, 26, 3, 1, 27, 10]). Here, we follow in the footsteps of [23, 22] and consider the standard information theory/wireless communications binary scenario, which assumes $\mathcal{X}=\left\{-\frac{1}{\sqrt{n}},\frac{1}{\sqrt{n}}\right\}^n$. It goes almost without saying that $x_{sol}\in\mathcal{X}$ is naturally assumed as well. We will also, without a loss of generality, assume even further that all components of $x_{sol}$ are equal to $\frac{1}{\sqrt{n}}$.
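To make the combinatorial nature of (2) concrete, here is a brute-force exact ML solver for a toy instance (an exhaustive search over all $2^n$ points of $\mathcal{X}$; this is clearly feasible only for very small $n$, which is precisely the point — the instance sizes and names below are ours):

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, m, sigma = 10, 8, 0.3
A = rng.standard_normal((m, n))
x_sol = rng.choice([-1.0, 1.0], n) / np.sqrt(n)
y = A @ x_sol + sigma * rng.standard_normal(m)

# enumerate all 2^n points of X = {-1/sqrt(n), +1/sqrt(n)}^n  -- exponential cost
best, best_obj = None, np.inf
for signs in itertools.product([-1.0, 1.0], repeat=n):
    x = np.array(signs) / np.sqrt(n)
    obj = np.linalg.norm(y - A @ x)
    if obj < best_obj:
        best, best_obj = x, obj
```

Since $x_{sol}$ itself belongs to $\mathcal{X}$, the returned objective can never exceed $\|y-Ax_{sol}\|_2=\|\sigma v\|_2$; doubling $n$ squares the search space, which is the hardness the polynomial heuristics discussed next try to avoid.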

The above problem (2) can be solved either exactly or approximately (for more on various relaxing heuristics see, e.g. [6, 9, 28, 5]). What makes it particularly interesting is that in the above mentioned binary scenario, (2) is typically viewed as a very hard combinatorial optimization type of problem. As such it has obviously been a topic of interest in various research communities over at least the last half a century. Many excellent algorithms and algorithmic heuristics have been introduced over this period of time. As a detailed discussion of such developments is better suited for review papers, we here just mention in passing that some of the very best results relevant to the perspective of interest here can be found in e.g. [24, 25, 4, 7, 8]. In addition, we also emphasize the two probably most important points: 1) the problem in (2) is hard if one wants/needs to solve it exactly, and 2) polynomial heuristics typically offer an approximate solution that trails the exact one by a solid margin in almost all interesting scenarios. In [23] and [22] we introduced the above mentioned CLuP mechanism as a way of attacking the MIMO ML on the exact level. Compared to [24, 25], which also attacked the MIMO ML on the exact level, CLuP did so by running only a few (fairly often just a handful) of the simplest possible quadratic programming type of iterations. In [22], this rather remarkable property was analytically characterized. Here we provide a similar characterization for a slightly different, upgraded variant of CLuP. Before we proceed with the characterization of this new CLuP variant, we first recall CLuP's basics.

### 1.1 CLuP’s basics

As is by now well-known from [23, 22], CLuP is effectively a very simple iterative procedure that in its core form assumes choosing a starting $x^{(0)}$ and a radius $r$, and running the following

$$x^{(i+1)}=\frac{x^{(i+1,s)}}{\|x^{(i+1,s)}\|_2}\quad\mbox{with}\quad x^{(i+1,s)}=\arg\min_{x} -\left(x^{(i)}\right)^Tx \quad\mbox{subject to}\quad \|y-Ax\|_2\leq r,\; x\in\left[-\frac{1}{\sqrt{n}},\frac{1}{\sqrt{n}}\right]^n. \qquad (3)$$

As one can guess, the choice of $x^{(0)}$ and $r$ has a very strong effect on the way the algorithm progresses. For the simplest possible choice of $x^{(0)}$ (each component of $x^{(0)}$ is generated with equal likelihood as $-\frac{1}{\sqrt{n}}$ or $\frac{1}{\sqrt{n}}$), we show in Figure 1 both the theoretical and the simulated CLuP performance.

Without going into too much detail, we just briefly mention that as $r$ increases (more on the definition and importance of $r$ can be found in [23]), CLuP's performance gets closer to the ML. Further detailed explanations related to the figure can be found in [23]. Those among other things include a discussion regarding the appearance of a vertical line (the so-called line of corrections). We of course skip repeating such a discussion and just mention that in this paper (similarly to [22]) we will be interested in the regimes above the line, i.e. in the regimes where the SNR is to the right of the line.

What is of a bit more interest for the present paper though (and what can't exactly be seen from Figure 1) is the complexity of the above CLuP algorithm. The analysis of CLuP's complexity was of course the main topic of [22]. The remarkable CLuP property that it fairly often runs not only in a polynomial but in a fixed number of iterations was fully characterized through such an analysis. What may have escaped attention in [22] is the fact that not only is the number of CLuP's iterations fixed and small, it is actually achieved without any effort to make the algorithm even the tiniest bit more complex than its most basic version. That in the first place meant that in [22] we analyzed CLuP's complexity by assuming that the starting $x^{(0)}$ is basically completely random and as such in a way completely disconnected from the problem at hand. On the other hand, it seems rather natural that a bit more clever choice of $x^{(0)}$ could help CLuP achieve even better performance. There is a ton of possible choices for $x^{(0)}$, and the next natural question is which of these choices would be the best, or at least better than the random one. Such a discussion requires a careful analysis and we will present it in a separate companion paper. To ensure that the initial discussion in this direction is as simple as possible, we here focus on a particular choice of the starting $x^{(0)}$ that we view as pretty much the simplest and most natural one after the fully random one considered in [23, 22].

### 1.2 CLuP-plt

The new CLuP variant that we consider in this paper (and to which we refer as CLuP-plt) assumes simply generating $x^{(0)}$ as the solution to the standard polytope-relaxation heuristic (see, e.g. [24, 25]) of the original ML problem (2)

$$x^{(0,plt)}=\arg\min_{x} \|y-Ax\|_2 \quad\mbox{subject to}\quad x\in\left[-\frac{1}{\sqrt{n}},\frac{1}{\sqrt{n}}\right]^n. \qquad (4)$$
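Since (4) is just a box-constrained least squares problem, it can be solved directly with an off-the-shelf convex solver. A minimal sketch using scipy's `lsq_linear` (which minimizes $\frac{1}{2}\|Ax-y\|_2^2$, yielding the same minimizer; the instance sizes and names are ours):

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(3)
n, m, sigma = 40, 32, 0.3
A = rng.standard_normal((m, n))
x_sol = rng.choice([-1.0, 1.0], n) / np.sqrt(n)
y = A @ x_sol + sigma * rng.standard_normal(m)

# (4): least squares over the box [-1/sqrt(n), 1/sqrt(n)]^n
b = 1.0 / np.sqrt(n)
x0_plt = lsq_linear(A, y, bounds=(-b, b)).x
```

Because $x_{sol}$ is itself box-feasible, the relaxation's residual can never exceed $\|\sigma v\|_2$, which is what makes $x^{(0,plt)}$ a sensible warm start.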

Then one can define CLuP-plt as

$$x^{(i+1)}=\frac{x^{(i+1,s)}}{\|x^{(i+1,s)}\|_2}\quad\mbox{with}\quad x^{(i+1,s)}=\arg\min_{x} -\left(x^{(i)}\right)^Tx \quad\mbox{subject to}\quad \|y-Ax\|_2\leq r,\; x\in\left[-\frac{1}{\sqrt{n}},\frac{1}{\sqrt{n}}\right]^n, \qquad (5)$$

where $i$ starts from zero and $x^{(0)}=x^{(0,plt)}$. Alternatively, one can increment the indices and start counting the iterations by first setting

$$x^{(1,s)}=x^{(0,plt)} \quad\mbox{and}\quad x^{(1)}=\frac{x^{(1,s)}}{\|x^{(1,s)}\|_2}=\frac{x^{(0,plt)}}{\|x^{(0,plt)}\|_2}, \qquad (6)$$

and then continuing with (5) for $i\geq 1$. To be in alignment with what we have done in [22], and to accurately account for (4) as the first iteration of the algorithm (as we should), we will rely on (6) and (5) with $i\geq 1$.

The main idea behind the CLuP-plt introduced above is that the initial $x^{(0)}$ is expected to be closer to the targeted optimal solution and as such might help getting to the optimum faster. Below we provide an analysis that confirms these expectations. We will formally focus on the algorithm's complexity, which, due to its iterative nature, amounts to handling the number of iterations. However, we will present a particular type of analysis that we typically refer to as the complexity analysis per iteration level, where we basically fully characterize all system parameters and how they change through each of the running iterations. Such an analysis is of course way more demanding than a mere computation of the total number of needed iterations.
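To make the overall procedure concrete, here is a minimal end-to-end sketch of (4)-(6): the polytope relaxation as the start, followed by a few iterations of (5) solved with a generic constrained solver (scipy's SLSQP). The choice of the radius `r` below is our own crude heuristic, not the paper's prescription, and the final rounding to $\mathcal{X}$ is ours as well:

```python
import numpy as np
from scipy.optimize import lsq_linear, minimize

rng = np.random.default_rng(4)
n, m, sigma = 30, 24, 0.3
A = rng.standard_normal((m, n))
x_sol = rng.choice([-1.0, 1.0], n) / np.sqrt(n)
y = A @ x_sol + sigma * rng.standard_normal(m)
b = 1.0 / np.sqrt(n)

# (4): polytope relaxation as the starting point (plays the role of x^(1,s), cf. (6))
x_s = lsq_linear(A, y, bounds=(-b, b)).x

# radius: our heuristic choice, slightly above the relaxation's residual
r = 1.1 * np.linalg.norm(y - A @ x_s)

for _ in range(3):                          # a few iterations of (5)
    c = x_s / np.linalg.norm(x_s)           # x^(i), the normalized previous iterate
    res = minimize(lambda x: -c @ x, x_s, method='SLSQP',
                   bounds=[(-b, b)] * n,
                   constraints=[{'type': 'ineq',
                                 'fun': lambda x: r**2 - np.sum((y - A @ x)**2)}])
    x_s = res.x

x_hat = np.where(x_s >= 0, 1.0, -1.0) / np.sqrt(n)   # rounding to X (our choice)
```

Each iteration is a linear objective over the intersection of a ball and a box, i.e. exactly the simple quadratically constrained program the paper refers to; a general-purpose solver is used here only for illustration.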

Through the presentation below we will see that the analysis of CLuP-plt can be designed so that it to a large degree parallels what we have done when we analyzed the complexity of the original CLuP in [22]. We will therefore try to avoid repeating many explanations that are to a large degree similar or even identical to the corresponding ones in [22] and instead focus on the key differences. Also, while we will emphasize it on multiple occasions, we do mention here as well that we chose a very simple upgrade to showcase the potential of CLuP's core mechanism. Since we will be utilizing the main concepts of the analysis from [22] in some of our companion papers as well, we also found this particular upgrade to be a very convenient choice for quickly getting fully familiar with all the key steps of [22]. In a way, through the reconsideration in this paper we will essentially bring those steps (which at first may appear complicated) to the level of a routine. This will turn out to be particularly useful when we switch to discussing a bit more advanced structures.

Paralleling what was done in [22], the presentation will be split into several parts. The characterization of the algorithm's first iteration will be briefly discussed at the beginning, and in the second part we will move to the second and higher iterations. We will also present a large set of simulation results and observe that they are in rather nice agreement with the theoretical findings.

## 2 Complexity analysis of CLuP-plt – first iteration

As mentioned above, to facilitate the exposition and the ease of following it, we will try to parallel as much as possible the flow of the presentation from [22]. That means that the core of the complexity analysis will again be the so-called complexity analysis on a per iteration level.

We start things off by noting that a combination of (1) and (4) gives the following version of the CLuP-plt's first iteration

$$x^{(0,plt)}=\arg\min_{x} \|\sigma v+A(x_{sol}-x)\|_2 \quad\mbox{subject to}\quad x\in\left[-\frac{1}{\sqrt{n}},\frac{1}{\sqrt{n}}\right]^n, \qquad (7)$$

which, with the cosmetic change of variables $z=x_{sol}-x$, easily becomes

$$z^{(0,plt)}=x_{sol}-x^{(0,plt)}=\arg\min_{z} \|\sigma v+Az\|_2 \quad\mbox{subject to}\quad z\in\left[0,\frac{2}{\sqrt{n}}\right]^n. \qquad (8)$$
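The change of variables behind (8) is easy to sanity-check numerically: with all components of $x_{sol}$ equal to $1/\sqrt{n}$, $z=x_{sol}-x$ lands in $[0,2/\sqrt{n}]^n$ and the two objectives coincide (a small numpy check; the toy dimensions are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, sigma = 20, 16, 0.5
A = rng.standard_normal((m, n))
v = rng.standard_normal(m)
x_sol = np.ones(n) / np.sqrt(n)           # all components 1/sqrt(n), as assumed
y = A @ x_sol + sigma * v

x = rng.uniform(-1/np.sqrt(n), 1/np.sqrt(n), n)   # any box-feasible point
z = x_sol - x

assert np.all(z >= 0) and np.all(z <= 2/np.sqrt(n))
assert np.isclose(np.linalg.norm(y - A @ x), np.linalg.norm(sigma*v + A @ z))
```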

Following considerations from [23, 22], and ultimately those from [18, 11, 19, 12, 17, 13, 14, 15, 16, 21, 20], and utilizing the concentration strategy, we set $\|z\|_2^2=c_{1,z}$ and instead of (8) consider

$$\xi_{p,1}(\alpha,\sigma,c_{1,z})=\lim_{n\rightarrow\infty}\frac{1}{\sqrt{n}}\mathbb{E}\min_{z} \|\sigma v+Az\|_2 \quad\mbox{subject to}\quad \|z\|_2^2=c_{1,z},\; z\in\left[0,\frac{2}{\sqrt{n}}\right]^n. \qquad (9)$$

It is now not that hard to note that the problem in (9) is conceptually identical to the one in equation (7) in [22]. In fact, it can be thought of as a special case of the one from [22], with the corresponding components appearing in equation (7) in [22] being equal to zero. This basically means that one can completely repeat the rest of the analysis from the second section of [22]. The only substantial difference is that the corresponding variable from [22]'s second section will now be zero. In particular, instead of [22]'s equation (16) one now has for the optimizing $z_i$

$$z_i=\frac{1}{\sqrt{n}}\min\left(\max\left(0,-\frac{h_i}{2\gamma}\right),2\right). \qquad (10)$$
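Under our reading, the clamped form in (10) is just the minimizer of the scalar quadratic $h_iw+\gamma w^2$ over $w\in[0,2]$ (with $w=\sqrt{n}z_i$); a quick brute-force grid check of that reading:

```python
import numpy as np

def w_closed(h, g):
    # the clamped form from (10), scaled by sqrt(n)
    return min(max(0.0, -h / (2 * g)), 2.0)

# brute-force minimization of h*w + g*w^2 on a fine grid of w in [0, 2]
g = 0.7
grid = np.linspace(0.0, 2.0, 20001)
for h in (-3.5, -1.0, -0.1, 0.4, 2.0):
    brute = grid[np.argmin(h * grid + g * grid**2)]
    assert abs(brute - w_closed(h, g)) < 1e-3
```

The unconstrained stationary point is $w=-h/(2\gamma)$; the clamp simply projects it onto $[0,2]$, which is why negative $h$ of large magnitude yields the boundary value $2$.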

Moreover, analogously to [22]’s equations (18) and (19) one now has

$$\begin{array}{rcl}
I_{1,1}(\gamma) & = & -\left(e^{-\frac{(4\gamma)^2}{2}}(-4\gamma)+\sqrt{\frac{\pi}{2}}\,\mathrm{erf}\left(2\sqrt{2}\gamma\right)\right)\Big/\left(4\sqrt{2\pi}\gamma\right) \\
I_{2,1}(\gamma) & = & 2\gamma\,\mathrm{erfc}\left(\frac{4\gamma}{\sqrt{2}}\right)-2e^{-\frac{(4\gamma)^2}{2}}\Big/\sqrt{2\pi},
\end{array} \qquad (11)$$

and

$$\xi^{(1)}_{RD}(\alpha,\sigma;c_{1,z},\gamma)=\sqrt{\alpha}\sqrt{c_{1,z}+\sigma^2}+I_{1,1}(\gamma)+I_{2,1}(\gamma)-\gamma c_{1,z}. \qquad (12)$$
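The quantities in (11) can be cross-checked numerically. Under our reading of (10)-(12), $I_{1,1}(\gamma)+I_{2,1}(\gamma)$ equals the Gaussian expectation $\mathbb{E}_h\big(hw(h)+\gamma w(h)^2\big)$ with $w(h)=\min(\max(0,-h/(2\gamma)),2)$; the sketch below (our transcription and our consistency check, not the paper's code) verifies this by direct quadrature:

```python
from math import erf, erfc, exp, sqrt, pi
from scipy.integrate import quad

def I11(g):
    # transcription of I_{1,1} in (11)
    return -(exp(-0.5 * (4*g)**2) * (-4*g) + sqrt(pi/2) * erf(2*sqrt(2)*g)) \
           / (4 * sqrt(2*pi) * g)

def I21(g):
    # transcription of I_{2,1} in (11)
    return 2*g*erfc(4*g/sqrt(2)) - 2*exp(-0.5 * (4*g)**2) / sqrt(2*pi)

def w(h, g):
    # per-component optimizer from (10), scaled by sqrt(n)
    return min(max(0.0, -h / (2*g)), 2.0)

def I_direct(g):
    # E_h[h*w(h) + g*w(h)^2] over a standard normal h
    f = lambda h: (h*w(h, g) + g*w(h, g)**2) * exp(-h*h/2) / sqrt(2*pi)
    val, _ = quad(f, -12.0, 12.0, points=[-4*g, 0.0], limit=200)
    return val

for g in (0.5, 1.0, 1.5):
    assert abs(I11(g) + I21(g) - I_direct(g)) < 5e-6
```

The `points` argument tells the quadrature about the kinks of $w(\cdot)$ at $-4\gamma$ and $0$, where the clamp switches branches.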

The following theorem summarizes what we presented above.

###### Theorem 1.

(CLuP-plt – RDT estimate – first iteration) Let $\xi_{p,1}(\alpha,\sigma,c_{1,z})$ and $\xi^{(1)}_{RD}(\alpha,\sigma;c_{1,z},\gamma)$ be as in (9) and (12), respectively. Then

$$\xi_{p,1}(\alpha,\sigma,c_{1,z})=\max_{\gamma}\xi^{(1)}_{RD}(\alpha,\sigma;c_{1,z},\gamma). \qquad (13)$$

Consequently,

$$\min_{c_{1,z}}\xi_{p,1}(\alpha,\sigma,c_{1,z})=\min_{c_{1,z}}\max_{\gamma}\xi^{(1)}_{RD}(\alpha,\sigma;c_{1,z},\gamma). \qquad (14)$$
###### Proof.

Follows automatically from [22] and ultimately the RDT mechanisms from [18, 19, 12, 13, 14, 15, 16] (as in [22], the strong random duality is trivially in place here as well). ∎

We also mention in passing that one can trivially first solve the optimization over $c_{1,z}$ and effectively transform/simplify the above optimization problem to an optimization over $\gamma$ only. However, to maintain parallelism with [22] and ultimately with what we will present below, we avoided doing so.
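For completeness, here is how that simplification goes under our reading of (12) (the inner optimization over $c_{1,z}$ is solvable in closed form):

```latex
\frac{\partial \xi^{(1)}_{RD}}{\partial c_{1,z}}
  =\frac{\sqrt{\alpha}}{2\sqrt{c_{1,z}+\sigma^2}}-\gamma=0
  \quad\Longrightarrow\quad
  c_{1,z}=\frac{\alpha}{4\gamma^2}-\sigma^2,
```

so substituting this $c_{1,z}$ back into (12) leaves a function of $\gamma$ alone (provided the stationary point falls inside the relevant range $[0,4]$).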

### 2.1 CLuP-plt – first iteration summary

Since the above theorem is very similar to the corresponding one in [22], we below continue to follow in the footsteps of [22] and, in a summarized way, formalize how it can be utilized to finally obtain all of the algorithm's key parameters in the first iteration.

Summary of the CLuP-plt’s first iteration

We first solve

$$\{\hat{\gamma}^{(1)},\hat{c}^{(1)}_{1,z}\}=\arg\min_{0\leq c_{1,z}\leq 4}\max_{\gamma}\xi^{(1)}_{RD}(\alpha,\sigma;c_{1,z},\gamma) \qquad (15)$$

and then as in [22]’s equations (23) define

$$\begin{array}{rcl}
s_{x,1}(\gamma) & = & \frac{1}{2\gamma\sqrt{2\pi}}\left(1-e^{-\frac{(4\gamma)^2}{2}}\right) \\
s_{xsq,1}(\gamma) & = & -I_{1,1}(\gamma)/\gamma \\
s_{x,2}(\gamma) & = & 2\left(0.5\,\mathrm{erfc}\left(\frac{4\gamma}{\sqrt{2}}\right)\right) \\
s_{xsq,2}(\gamma) & = & 2s_{x,2}(\gamma).
\end{array} \qquad (16)$$

Moreover, analogously to [22]’s (24),(25), and (26) we now have

$$\begin{array}{rcl}
\sqrt{n}\mathbb{E}z_i & = & s_{x,1}(\hat{\gamma}^{(1)})+s_{x,2}(\hat{\gamma}^{(1)}) \\
n\mathbb{E}z_i^2 & = & s_{xsq,1}(\hat{\gamma}^{(1)})+s_{xsq,2}(\hat{\gamma}^{(1)}),
\end{array} \qquad (17)$$

and, since $x=x_{sol}-z$, also

$$\begin{array}{rcl}
\sqrt{n}\mathbb{E}x_i & = & 1-\left(s_{x,1}(\hat{\gamma}^{(1)})+s_{x,2}(\hat{\gamma}^{(1)})\right) \\
n\mathbb{E}x_i^2 & = & s_{xsq,1}(\hat{\gamma}^{(1)})+s_{xsq,2}(\hat{\gamma}^{(1)})+2\sqrt{n}\mathbb{E}x_i-1,
\end{array} \qquad (18)$$

and finally

$$\begin{array}{rcl}
\mathbb{E}\left((x_{sol})^Tx\right) & = & 1-\left(s_{x,1}(\hat{\gamma}^{(1)})+s_{x,2}(\hat{\gamma}^{(1)})\right) \\
\mathbb{E}\|x\|_2^2 & = & s_{xsq,1}(\hat{\gamma}^{(1)})+s_{xsq,2}(\hat{\gamma}^{(1)})+2\mathbb{E}\left((x_{sol})^Tx\right)-1.
\end{array} \qquad (19)$$

As in [22], the strong random duality ensures that the above are not only the expected values but also the concentrating points of the corresponding quantities (the concentration is of course exponential in $n$). As in [22]'s (27), one can also obtain for the probability of error

$$p^{(1)}_{err}=1-\mathbb{P}\left(z_i\leq\frac{1}{\sqrt{n}}\right)=1-\frac{1}{2}\mathrm{erfc}\left(-\frac{2\hat{\gamma}^{(1)}}{\sqrt{2}}\right). \qquad (20)$$
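Formula (20), together with the first-iteration values recalled later in (33) ($\hat{\gamma}^{(1)}=1.2233$, $p^{(1)}_{err}=0.0072$), admits a quick self-consistency check with nothing but the standard library:

```python
from math import erfc, sqrt

gamma_hat_1 = 1.2233   # \hat{gamma}^{(1)}, the value reported in (33)

# (20): p_err^{(1)} = 1 - 0.5*erfc(-2*gamma/sqrt(2))
p_err = 1 - 0.5 * erfc(-2 * gamma_hat_1 / sqrt(2))

print(round(p_err, 4))   # → 0.0072, matching (33)
```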

The theoretical values for all key system parameters that can be obtained utilizing the above Theorem 1 are shown in Table 1 for the considered SNR ([db]).

To maintain the parallelism with [22] and with what we will present below, we artificially keep two additional parameters, $\hat{s}^{(1)}$ and $\hat{\nu}^{(1)}$, and assign the value $0$ to them.

## 3 Summary of the CLuP-plt’s second iteration analysis

The move from the first to the second iteration is of course of critical importance for understanding all the later moves from the $k$-th to the $(k+1)$-th iteration for $k\geq 2$. CLuP's second iteration assumes computing $x^{(2)}$ as

$$x^{(2)}=\frac{x^{(2,s)}}{\|x^{(2,s)}\|_2}\quad\mbox{with}\quad x^{(2,s)}=\arg\min_{x} -\left(x^{(1)}\right)^Tx \quad\mbox{subject to}\quad \|y-Ax\|_2\leq r,\; x\in\left[-\frac{1}{\sqrt{n}},\frac{1}{\sqrt{n}}\right]^n, \qquad (21)$$

where we recall from (6) that $x^{(1)}=x^{(0,plt)}/\|x^{(0,plt)}\|_2$. One can then also rewrite (21) in the following way

$$\min_{z} \left(x^{(1)}\right)^Tz \quad\mbox{subject to}\quad \|\sigma v+Az\|_2\leq r,\; z\in\left[0,\frac{2}{\sqrt{n}}\right]^n. \qquad (22)$$

Utilizing once again the concentration strategy, we set $\|z\|_2^2=c_{2,z}$ and $(x^{(1,s)})^Tz=s_2$ and consider

$$\xi_{p,2}(\alpha,\sigma,c_{2,z},s_2)=\lim_{n\rightarrow\infty}\frac{1}{\sqrt{n}}\mathbb{E}\min_{z} \|\sigma v+Az\|_2 \quad\mbox{subject to}\quad \|z\|_2^2=c_{2,z},\; \left(x^{(1,s)}\right)^Tz=s_2,\; z\in\left[0,\frac{2}{\sqrt{n}}\right]^n. \qquad (23)$$

The above problem is structurally identical to [22]'s (31). One can then repeat all the steps between [22]'s (31) and (56) to arrive at the following set of equations that determine the optimizing $z^{(2)}$ and $x^{(2,s)}$

$$\begin{array}{rcl}
z^{(2)}_i & = & \frac{1}{\sqrt{n}}\min\left(\max\left(0,-\frac{h^{(1,p)}_i+\nu x^{(1,s)}_i+\nu_2}{2\gamma}\right),2\right) \\
x^{(2,s)}_i & = & \frac{1}{\sqrt{n}}-z^{(2)}_i=\frac{1}{\sqrt{n}}\left(1-\min\left(\max\left(0,-\frac{h^{(1,p)}_i+\nu x^{(1,s)}_i+\nu_2}{2\gamma}\right),2\right)\right),
\end{array} \qquad (24)$$

where one also recalls from (6)

$$x^{(1,s)}_i=x^{(0,plt)}_i=\frac{1}{\sqrt{n}}-z^{(1)}_i=\frac{1}{\sqrt{n}}\left(1-\min\left(\max\left(0,-\frac{h_i}{2\hat{\gamma}^{(1)}}\right),2\right)\right). \qquad (25)$$

As in [22], $h^{(1,p)}$ is the appropriately correlated combination of $h^{(1)}$ and $h$, and the components of both $h^{(1)}$ and $h$ are i.i.d. standard normals. Setting

$$I^{(2)}_1(\gamma,\nu,\nu_2)=\mathbb{E}\left(\left(h^{(1,p)}_i+\nu x^{(1,s)}_i+\nu_2\right)z^{(2)}_i+\gamma\left(z^{(2)}_i\right)^2\right), \qquad (26)$$

where the term under the expectation is taken as zero if negative. Analogously to [22]'s (60) one can define

$$\begin{array}{rcl}
\xi^{(2)}_{RD}(\alpha,\sigma;p^{(1)},q^{(1)},c_{2,z},s_2,s_3,\gamma,\nu,\nu_2) & = & \sqrt{\alpha}\sqrt{c_{2,z}+\sigma^2}\left(q^{(1)}p^{(1)}+\sqrt{1-(q^{(1)})^2}\sqrt{1-(p^{(1)})^2}\right) \\
& & +I^{(2)}_1(\gamma,\nu,\nu_2)-\nu s_2-\nu_2 s_3-\gamma c_{2,z}.
\end{array} \qquad (27)$$

Finally, from [22]’s (76)-(78) one has

$$\begin{array}{ll}
\phi^{(2)}_b=\arg\min_{s,d^{(2)}_1,d^{(2)}_2} & s \\
\mbox{subject to} & \max_{p^{(1)}}\min_{0\leq c_{2,z}\leq 4}\max_{\gamma,\nu,\nu_2}\xi^{(2)}_{RD}(\alpha,\sigma;p^{(1)},q^{(1)},c_{2,z},s_2,s_3,\gamma,\nu,\nu_2)=r \\
& s_2=d^{(1)}_1+s\sqrt{d^{(1)}_2} \\
& s_3=1-d^{(2)}_1 \\
& c_{2,z}=d^{(2)}_2-2d^{(2)}_1+1 \\
& q^{(1)}=\frac{s_3-s_2+\sigma^2}{\sqrt{c_{2,z}+\sigma^2}\sqrt{\hat{c}_{1,z}+\sigma^2}},
\end{array} \qquad (28)$$

and

$$p^{(2)}_{err}=1-\int\int\frac{\mathrm{sign}(x^{(2,s)}_i)+1}{2}\,e^{-\frac{(h^{(1)}_i)^2+h_i^2}{2}}\frac{dh^{(1)}_idh_i}{2\pi}. \qquad (29)$$

To obtain the remaining key parameters one can utilize

$$\begin{array}{rcl}
\hat{d}^{(2)}_2 & = & \int\int \left(x^{(2,s)}_i\right)^2 e^{-\frac{(h^{(1)}_i)^2+h_i^2}{2}}\frac{dh^{(1)}_idh_i}{2\pi} \\
\hat{d}^{(2)}_1 & = & \int\int x^{(2,s)}_i\, e^{-\frac{(h^{(1)}_i)^2+h_i^2}{2}}\frac{dh^{(1)}_idh_i}{2\pi} \\
\hat{s}^{(2)}_2 & = & \int\int x^{(1,s)}_i z^{(2)}_i\, e^{-\frac{(h^{(1)}_i)^2+h_i^2}{2}}\frac{dh^{(1)}_idh_i}{2\pi}.
\end{array} \qquad (30)$$

To make things easier to follow, one can define a set of the key output parameters at the end of the second iteration (of course, it goes without emphasizing that $x^{(2)}$ is the main output of the second iteration). This set consists of four critical plus eight auxiliary parameters

$$\phi^{(2)}=\left\{p^{(2)}_{err},\hat{s}^{(2)},\hat{d}^{(2)}_2,\hat{d}^{(2)}_1,\hat{\nu}^{(2)},\hat{\nu}^{(2)}_2,\hat{\gamma}^{(2)},\hat{p}^{(1)},\hat{q}^{(1)},\hat{c}^{(1)}_{2,z},\hat{s}^{(2)}_2,\hat{s}^{(2)}_3\right\}, \qquad (31)$$

where

$$\begin{array}{lcl}
p^{(2)}_{err} & - & \mbox{probability of error after the second iteration} \\
\hat{s}^{(2)}=\mathbb{E}\left((x^{(1)})^Tx^{(2,s)}\right) & - & \mbox{objective value after the second iteration} \\
\hat{d}^{(2)}_2=\mathbb{E}\|x^{(2,s)}\|_2^2 & - & \mbox{squared norm after the second iteration} \\
\hat{d}^{(2)}_1=\mathbb{E}\left(x_{sol}^Tx^{(2,s)}\right) & - & \mbox{inner product with $x_{sol}$ after the second iteration},
\end{array} \qquad (32)$$

with the last three quantities being not only the expected values but also the concentrating ones. Before proceeding with the numerical results for the second iteration, we recall the output of the first iteration

$$\phi^{(1)}=\left\{p^{(1)}_{err},\hat{s}^{(1)},\hat{d}^{(1)}_2,\hat{d}^{(1)}_1,\hat{\nu}^{(1)},\hat{\gamma}^{(1)},\hat{c}^{(1)}_{1,z}\right\}=\left\{0.0072,-0,0.7574,0.8369,0,1.2233,0.835\right\}. \qquad (33)$$

The theoretical values for the output parameters after the second iteration (i.e. for the parameters from (31) that are obtained through the discussion presented above, for the same setup as in Table 1) are included in Table 2.

One can also characterize the remaining auxiliary parameters from $\phi^{(2)}$, relying on the equality constraints in (28). Table 3 shows the results for these parameters, which can be obtained through both the equality constraints in (28) and (30).

## 4 Summary of the CLuP-plt’s (k+1)-th iteration analysis

The heart of the analysis mechanism is the move from the first to the second iteration. Such a move is conceptually identical to the move from any $k$-th to $(k+1)$-th iteration. However, there are still a few technical differences that require special attention. These differences are of course the main reason why we separately discuss a generic move from the $k$-th to the $(k+1)$-th iteration for any $k\geq 2$. On the other hand, we have already faced a similar situation in [22] and all the results obtained there in this regard can be reutilized. We start by recalling that CLuP's $(k+1)$-th iteration is basically the following optimization problem

$$x^{(k+1)}=\frac{x^{(k+1,s)}}{\|x^{(k+1,s)}\|_2}\quad\mbox{with}\quad x^{(k+1,s)}=\arg\min_{x} -\left(x^{(k)}\right)^Tx \quad\mbox{subject to}\quad \|y-Ax\|_2\leq r,\; x\in\left[-\frac{1}{\sqrt{n}},\frac{1}{\sqrt{n}}\right]^n. \qquad (34)$$

This is of course structurally identical to (85) in [22]. One can then again utilize the Random Duality Theory and repeat all the steps between (85) and (108) in [22] to arrive at the following for the optimizing $z^{(k+1)}$ and $x^{(k+1,s)}$

$$\begin{array}{rcl}
z^{(k+1)}_i & = & \frac{1}{\sqrt{n}}\min\left(\max\left(0,-\frac{h^{(k,p)}_i+\sum_{j=1}^{k}\tilde{\nu}_jx^{(j,s)}_i+\nu_2}{2\gamma}\right),2\right) \\
x^{(k+1,s)}_i & = & \frac{1}{\sqrt{n}}-z^{(k+1)}_i=\frac{1}{\sqrt{n}}\left(1-\min\left(\max\left(0,-\frac{h^{(k,p)}_i+\sum_{j=1}^{k}\tilde{\nu}_jx^{(j,s)}_i+\nu_2}{2\gamma}\right),2\right)\right),
\end{array} \qquad (35)$$

where $x^{(j,s)}$, $1\leq j\leq k$, are obtained as the optimizing variables after each of the first $k$ iterations. One can also set as in [22]'s (110)

$$I^{(k+1)}_1(\gamma,\nu,\nu_2,\hat{\nu}^{(1)})=\mathbb{E}\left(\left(h^{(k,p)}_i+\sum_{j=1}^{k}\tilde{\nu}_jx^{(j,s)}_i+\nu_2\right)z^{(k+1)}_i+\gamma\left(z^{(k+1)}_i\right)^2\right), \qquad (36)$$

where the term under the expectation is assumed zero if negative. Moreover, one can also set as in [22]'s (111)

$$\begin{array}{rcl}
\xi^{(k+1)}_{RD}(\alpha,\sigma;P^{(k+1)},Q^{(k+1)},c_{2,z},s_{2,j},s_3,\gamma,\tilde{\nu}_j,\nu_2) & = & \sqrt{\alpha}\sqrt{c_{2,z}+\sigma^2}f^{(k+1)}_{sph}+I^{(k+1)}_1(\gamma,\nu,\nu_2,\hat{\nu}^{(1)}) \\
& & -\sum_{j=1}^{k}\tilde{\nu}_js_{2,j}-\nu_2s_3-\gamma c_{2,z},
\end{array} \qquad (37)$$

where $P^{(k+1)}$ and $Q^{(k+1)}$ are as in [22]'s (90) and $f^{(k+1)}_{sph}$ is as in [22]'s (101). We also note from [22]'s (112)-(114) that the key output parameters after the $k$-th iteration are

$$x^{(j,s)}_i,\;z^{(j)}_i,\;\lambda^{(j-1)},\quad 1\leq j\leq k, \qquad (38)$$

and

$$\phi^{(k)}=\{p^{(k)}_{err},\ldots$$