# A Hierarchical-based Greedy Algorithm for Echelon-Ferrers Construction

Echelon-Ferrers is one of the important techniques that help researchers improve lower bounds for subspace codes. Unfortunately, exact computation of the Echelon-Ferrers construction is limited by computation time. In this paper, we show how to attain codes of larger size for a given minimum distance $d=4$ or $6$ by a hierarchical-based greedy algorithm for the Echelon-Ferrers construction introduced in [6]. About 63 of the resulting constant-dimension subspace codes are better than the previously best known codes.

## 1 Introduction

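Subspace coding was proposed by R. Koetter and F. R. Kschischang in [21] to correct errors and erasures in random network coding. The projective space of order $n$ over the finite field $\mathbb{F}_q$, denoted $\mathcal{P}_q(n)$, is the set of all subspaces of the vector space $\mathbb{F}_q^n$. The set of all $k$-dimensional subspaces of an $n$-dimensional $\mathbb{F}_q$-vector space will be denoted by $\mathcal{G}_q(k,n)$. Its cardinality is given by the Gaussian binomial coefficient

$$|\mathcal{G}_q(k,n)| = \begin{cases} \dfrac{(q^n-1)(q^{n-1}-1)\cdots(q^{n-k+1}-1)}{(q^k-1)(q^{k-1}-1)\cdots(q-1)} & \text{if } 0\le k\le n;\\[2mm] 0 & \text{otherwise.} \end{cases}$$

Thus, $\mathcal{P}_q(n) = \bigcup_{k=0}^{n} \mathcal{G}_q(k,n)$.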
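As a small numerical illustration of the Gaussian binomial coefficient, the following sketch evaluates the formula directly with integer arithmetic. The function names `gaussian_binomial` and `projective_space_size` are chosen here for illustration and do not come from the paper.

```python
def gaussian_binomial(n: int, k: int, q: int) -> int:
    """Number of k-dimensional subspaces of an n-dimensional space over F_q."""
    if k < 0 or k > n:
        return 0
    num, den = 1, 1
    for i in range(k):
        num *= q ** (n - i) - 1   # (q^n - 1)(q^(n-1) - 1)...(q^(n-k+1) - 1)
        den *= q ** (i + 1) - 1   # (q - 1)(q^2 - 1)...(q^k - 1)
    return num // den             # the quotient is always an integer


def projective_space_size(n: int, q: int) -> int:
    """Total number of subspaces of F_q^n, summed over all dimensions."""
    return sum(gaussian_binomial(n, k, q) for k in range(n + 1))


if __name__ == "__main__":
    print(gaussian_binomial(4, 2, 2))
    print(projective_space_size(3, 2))
```

For example, the binary case $n=4$, $k=2$ gives $(15\cdot 7)/(3\cdot 1)=35$ subspaces.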

A widely used distance measure for subspace codes (motivated by an information-theoretic analysis of the Kötter-Kschischang-Silva model, see e.g. [25]) is the subspace distance

$$d_S(U,W) := \dim(U+W) - \dim(U\cap W) = 2\cdot\dim(U+W) - \dim(U) - \dim(W),$$

where $U$ and $W$ are subspaces of $\mathbb{F}_q^n$.
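The second form of the subspace distance only needs ranks, so it is easy to evaluate by machine. The sketch below does this for $q=2$, assuming each subspace is given by a spanning set of rows encoded as integer bitmasks; the encoding and the helper names are illustrative choices, not part of the paper.

```python
def rank_gf2(rows):
    """Rank over F_2 of a matrix whose rows are given as integer bitmasks."""
    basis = {}  # leading bit position -> stored row with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]  # cancel the leading bit and keep reducing
    return len(basis)


def subspace_distance(U, W):
    """d_S(U, W) = 2*dim(U + W) - dim(U) - dim(W) for row spaces over F_2."""
    return 2 * rank_gf2(list(U) + list(W)) - rank_gf2(U) - rank_gf2(W)


if __name__ == "__main__":
    U = [0b1000, 0b0100]
    W = [0b0010, 0b0001]
    print(subspace_distance(U, W))
```

Two planes of $\mathbb{F}_2^4$ that intersect trivially, as in the usage example, are at distance 4.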

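A set $\mathcal{C}$ of subspaces of $\mathbb{F}_q^n$ is called a subspace code. The minimum distance of $\mathcal{C}$ is given by $d = \min\{d_S(U,W) : U, W \in \mathcal{C},\ U \neq W\}$. If the dimension of the codewords is fixed as $k$, we use the notation $(n, \#\mathcal{C}, d, k)_q$ and call $\mathcal{C}$ a constant dimension code (CDC for short). For fixed ambient parameters $q$, $n$, $k$, and $d$, the main problem of subspace coding asks for the determination of the maximum possible size $A_q(n,d,k)$ of an $(n, \star, \ge d, k)_q$ subspace code.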

In this paper we give a greedy algorithm for the Echelon-Ferrers construction. About 63 new constant-dimension subspace codes of larger size for minimum distance $d=4$ or $6$ are listed in Tables 2 and 3.

The remaining part of this paper is structured as follows. The currently implemented lower bounds and constructions are described in Section 2. The preliminaries are outlined in Section 3. Constant dimension codes (CDCs) obtained by our algorithm are treated in Section 4. Finally, we draw a conclusion in Section 5.

## 2 Previous constructions

The lower and upper bounds on $A_q(n,d,k)$ have been intensively studied in recent years, see e.g. [7]. The report [15] describes the theoretical basis of an online database, found at http://subspacecodes.uni-bayreuth.de and maintained by a research team at the University of Bayreuth, that collects up-to-date information on the best lower and upper bounds for subspace codes.

Lifted MRD codes (we omit the details here, see Subsection 3.2) are one type of building block of the Echelon-Ferrers construction, see Subsection 3.3. The latter is a nice interplay between the subspace distance, the rank distance and the Hamming distance. Another construction based on similar ideas is the so-called coset construction [16]. The most effective general recursive construction is the linkage construction and its generalization. According to the report [15], the lower bound with the highest score is the improved linkage construction, which currently yields the best known lower bound for 69.1% of the constant dimension code parameters in the database. The linkage construction obtains large codes as the row spaces of matrices $(U \mid M)$, where $U$ ranges over generator matrices of the codewords of a given code $\mathcal{C}$ and $M$ ranges over an MRD code $\mathcal{M}$. The size of the resulting code is the size of $\mathcal{C}$ times the size of the MRD code. By performing a tighter analysis of the occurring subspace distances, the papers [24, 13, heinlein2017asymptotic] showed that codewords from a code in a smaller ambient space can be added as well.

The expurgation-augmentation method, which starts with a lifted MRD code and then adds and removes codewords, was invented by Thomas Honold. A starting point is a computer-free construction of a lower bound, see [22]. Subsequent studies give further bounds in [19, Theorem 2], [17], and [18, Theorem 4].

New subspace codes from two parallel versions of maximum rank distance codes were introduced by Xu and Chen [26]. The problem of determining the size of the constructed constant dimension codes was reduced to finding a suitable sufficient condition that restricts the number of roots of an expression in two $q$-polynomials over the extension field $\mathbb{F}_{q^n}$:

$$\text{If } 2t \ge n, \text{ then } A_q(2n, 2(n-t), n) \ge q^{n(t+1)} + \sum_{r=n-t}^{n} A_r(Q(q,n,t)).$$

Geometric concepts like the Segre variety and the Veronese variety were also used to obtain constructions for constant dimension codes:

###### Theorem 1 ([5, Theorem 3.11 and 3.8])

If

is odd, then

, using .

If is even, then , using if is odd and if is even.

In general, the exact determination of $A_q(n,d,k)$ is a hard problem, both theoretically and algorithmically. The exact calculation for the Echelon-Ferrers construction is constrained by computation time [8, 14, 9]. A greedy-type approach has been considered by Alexander Shishkin, see [23] and also [2]. It is implemented as greedy_multicomponent. In [12, 11] the authors considered block designs as skeleton codes. The paper [4] describes an algorithm that tackles the integer linear optimization problems representing the $q$-packing design construction by means of a metaheuristic approach, and gives some improvements on code sizes. With a stochastic maximum weight clique algorithm and a systematic consideration of groups, the authors of [3] give some new lower bounds on $A_q(n,d,k)$.

## 3 Preliminaries

### 3.1 Basic Notation

Let $U$ be a $k$-dimensional subspace of $\mathbb{F}_q^n$. We represent $U$ by the matrix $E(U)$ in reduced row echelon form, such that the rows of $E(U)$ form a basis of $U$. The identifying vector of $U$, denoted by $v(U)$, is the binary vector of length $n$ and weight $k$, where the ones of $v(U)$ are exactly in the positions where $E(U)$ has the leading coefficients (the pivots).

In this section we give the definitions of two structures which are useful in describing a subspace of $\mathbb{F}_q^n$. The reduced row echelon form is a standard way to describe a linear subspace. The Ferrers diagram is a standard way to describe a partition of a given positive integer into positive integers.

A matrix is said to be in row echelon form if each nonzero row has more leading zeroes than the previous row.

A $k \times n$ matrix with rank $k$ is in reduced row echelon form if the following conditions are satisfied.

• The leading coefficient of a row is always to the right of the leading coefficient of the previous row.

• All leading coefficients are ones.

• Every leading coefficient is the only nonzero entry in its column.

A $k$-dimensional subspace $U$ of $\mathbb{F}_q^n$ can be represented by a $k \times n$ generator matrix whose rows form a basis for $U$. We usually represent a codeword of a projective space code by such a matrix. There is exactly one such matrix in reduced row echelon form and it will be denoted by $E(U)$.

A Ferrers diagram represents partitions as patterns of dots with the $i$-th row having the same number of dots as the $i$-th term in the partition. A Ferrers diagram satisfies the following conditions.

• The number of dots in a row is at most the number of dots in the previous row.

• All the dots are shifted to the right of the diagram.

The number of rows (columns) of the Ferrers diagram $\mathcal{F}$ is the number of dots in the rightmost column (top row) of $\mathcal{F}$. If the number of rows in the Ferrers diagram is $m$ and the number of columns is $\eta$, we say that it is an $m \times \eta$ Ferrers diagram.

Recall that the Hamming metric on $\mathbb{F}_2^n$ is defined as $d_H(u,v) = \mathrm{wt}(u+v)$, where $\mathrm{wt}(x)$ denotes the number of nonzero entries in the vector $x$. The following results are useful tools for constructions of subspace codes.

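For subspaces $U, W$ of $\mathbb{F}_q^n$ with identifying vectors $v(U), v(W)$ and reduced row echelon forms $E(U), E(W)$ we have

• $d_S(U,W) \ge d_H(v(U), v(W))$,

• if $v(U) = v(W)$, then $d_S(U,W) = 2\, d_R(E(U), E(W))$, where $d_R$ is the rank distance of Subsection 3.2.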
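To make the identifying vector concrete, here is a minimal sketch for $q=2$ that computes the reduced row echelon form by Gaussian elimination and reads off the pivot positions. Matrices are plain lists of 0/1 rows, and all function names are illustrative assumptions rather than notation from the paper.

```python
def rref_gf2(rows):
    """Reduced row echelon form over F_2; zero rows are dropped."""
    m = [r[:] for r in rows]
    n = len(m[0]) if m else 0
    pivot_row = 0
    for col in range(n):
        # Look for a row at or below pivot_row with a one in this column.
        src = next((r for r in range(pivot_row, len(m)) if m[r][col]), None)
        if src is None:
            continue
        m[pivot_row], m[src] = m[src], m[pivot_row]
        # Clear the column everywhere else so the pivot is its only nonzero entry.
        for r in range(len(m)):
            if r != pivot_row and m[r][col]:
                m[r] = [a ^ b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m[:pivot_row]


def identifying_vector(rows):
    """Binary vector with ones exactly at the pivot columns of the RREF."""
    v = [0] * len(rows[0])
    for row in rref_gf2(rows):
        v[row.index(1)] = 1
    return v


def hamming_distance(u, v):
    """Number of positions in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))


if __name__ == "__main__":
    U = [[1, 1, 0, 1], [0, 1, 1, 0]]
    W = [[1, 1, 0, 0], [1, 1, 0, 1]]
    print(identifying_vector(U), identifying_vector(W))
    print(hamming_distance(identifying_vector(U), identifying_vector(W)))
```

The Hamming distance printed at the end is a lower bound on the subspace distance of the two row spaces, by the first item above.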

### 3.2 Lifted MRD codes

A prominent code construction uses maximum rank distance (MRD) codes. For matrices $A, B \in \mathbb{F}_q^{m\times n}$ the rank distance is defined via $d_R(A,B) := \operatorname{rk}(A-B)$.

###### Theorem 2

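(see [10]) Let $q$ be a prime power, let $m, n \ge d$ be positive integers, and let $\mathcal{C} \subseteq \mathbb{F}_q^{m\times n}$ be a rank-metric code with minimum rank distance $d$. Then $\#\mathcal{C} \le q^{\max\{n,m\}\cdot(\min\{n,m\}-d+1)}$.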

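Codes attaining this upper bound are called maximum rank distance (MRD) codes. They exist for all (suitable) choices of parameters. Using a $k \times k$ identity matrix $I_k$ as a prefix one obtains the so-called lifted MRD codes. For any two codewords $A$ and $B$ of an MRD code, the subspaces $U_A$ and $U_B$ spanned by the rows of $(I_k \mid A)$ and $(I_k \mid B)$ are the same if and only if $A = B$. The intersection $U_A \cap U_B$ is the set $\{(x, xA) : x \in \mathbb{F}_q^k,\ x(A-B) = 0\}$. Thus $\dim(U_A \cap U_B) = k - \operatorname{rk}(A-B)$ and $d_S(U_A, U_B) = 2\operatorname{rk}(A-B)$. The distance of this CDC is therefore $2d$, where $d$ is the minimum rank distance of the MRD code. A CDC constructed as above is called a lifted MRD code.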
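The lifting step can be checked on a toy case. The sketch below uses a standard small example that is not taken from the paper: the field $\mathbb{F}_4$ written as four $2\times 2$ matrices over $\mathbb{F}_2$ (powers of the companion matrix of $x^2+x+1$ together with zero), which form an MRD code with minimum rank distance 2. Rows are encoded as integer bitmasks, and the helper names are illustrative.

```python
from itertools import combinations


def rank_gf2(rows):
    """Rank over F_2 of a matrix whose rows are integer bitmasks."""
    basis = {}
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]
    return len(basis)


def lift(matrix_rows, k, width):
    """Rows of (I_k | A) as bitmasks; A has k rows of `width` bits each."""
    return [(1 << (width + k - 1 - i)) | matrix_rows[i] for i in range(k)]


def subspace_distance(U, W):
    return 2 * rank_gf2(U + W) - rank_gf2(U) - rank_gf2(W)


# 0, I, M, M^2 with M = [[0, 1], [1, 1]]; every nonzero difference is invertible.
MRD = [[0b00, 0b00], [0b10, 0b01], [0b01, 0b11], [0b11, 0b10]]
LIFTED = [lift(A, 2, 2) for A in MRD]

RANK_DISTANCES = {rank_gf2([a ^ b for a, b in zip(A, B)])
                  for A, B in combinations(MRD, 2)}
SUBSPACE_DISTANCES = {subspace_distance(U, W)
                      for U, W in combinations(LIFTED, 2)}

if __name__ == "__main__":
    print(RANK_DISTANCES, SUBSPACE_DISTANCES)
```

In this example every pair of matrices has rank distance 2 and every pair of lifted planes in $\mathbb{F}_2^4$ has subspace distance 4, matching the factor of two between the two metrics.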

### 3.3 Echelon-Ferrers

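The authors of [9] presented the multi-level construction, which is based on lifted MRD codes. Let us briefly review the construction in Theorem 3. Let $1 \le k \le n$ be integers and $v \in \mathbb{F}_2^n$ a binary vector of weight $k$. By $\mathrm{EF}_q(v)$ we denote the set of all $k \times n$ matrices over $\mathbb{F}_q$ that are in reduced row echelon form with pivot columns given by $v$.

###### Theorem 3

(see [9]) For integers $k, n, \delta$ with $1 \le k \le n$ and $1 \le \delta \le \min\{k, n-k\}$, let $\mathcal{B}$ be a binary constant weight code of length $n$, weight $k$, and minimum Hamming distance $2\delta$. For each $b \in \mathcal{B}$ let $\mathcal{C}_b$ be a code in $\mathrm{EF}_q(b)$ with minimum rank distance at least $\delta$. Then, $\bigcup_{b\in\mathcal{B}} \mathcal{C}_b$ is a constant dimension code of dimension $k$ having a subspace distance of at least $2\delta$.

The code $\mathcal{B}$ is also called the skeleton code. For each $\mathcal{C}_b$ we have the following upper bound: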

###### Theorem 4

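(see [9]) Let $\mathcal{F}$ be the Ferrers diagram of an identifying vector $v$ and let $\mathcal{C}$ be a subspace code whose codewords all have identifying vector $v$ and which has a subspace distance of at least $2\delta$. Then

$$\#\mathcal{C} \le q^{\min\{\nu_i \,:\, 0\le i\le \delta-1\}},$$

where $\nu_i$ is the number of dots in $\mathcal{F}$ which are neither contained in the first $i$ rows nor contained in the rightmost $\delta-1-i$ columns.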
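The exponent in Theorem 4 can be computed directly from an identifying vector. The sketch below assumes the usual reading of the Ferrers diagram: the row belonging to a pivot has one dot for every zero of $v$ to the right of that pivot, and rows are right-aligned. The function names are illustrative and not from the paper.

```python
def ferrers_row_lengths(v):
    """Dots per row of the Ferrers diagram of v, top row first."""
    lengths = []
    for pos, bit in enumerate(v):
        if bit == 1:
            # Free entries of this row: non-pivot columns right of the pivot.
            lengths.append(sum(1 for b in v[pos + 1:] if b == 0))
    return lengths


def dimension_bound(v, delta):
    """min over 0 <= i <= delta-1 of nu_i, the exponent in Theorem 4."""
    lengths = ferrers_row_lengths(v)
    best = None
    for i in range(delta):
        cut = delta - 1 - i  # number of rightmost columns removed
        # Rows are right-aligned, so removing `cut` columns shortens each row.
        nu_i = sum(max(length - cut, 0) for length in lengths[i:])
        best = nu_i if best is None else min(best, nu_i)
    return best


if __name__ == "__main__":
    print(dimension_bound([1, 1, 1, 0, 0, 0, 0], 2))
```

For $v = 1110000$ and $\delta = 2$ the diagram is a full $3\times 4$ rectangle, $\nu_0 = 9$ and $\nu_1 = 8$, so the exponent is 8, which agrees with the MRD bound of Theorem 2 for $3\times 4$ matrices with rank distance 2.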

The authors of [9] conjecture that Theorem 4 is tight for all parameters $q$, $\mathcal{F}$, and $\delta$. Constructions settling the conjecture in several cases are given in [8].

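Let $c(v)$ denote the maximum size of a known MRD code for the identifying vector $v$ matching distance $d$. The optimal Echelon-Ferrers construction can be modeled as an ILP:

$$\max \sum_{v\in\mathbb{F}_2^n} c(v)\cdot x_v \quad \text{s.t.} \quad x_a + x_b \le 1 \;\; \forall a \ne b \in \mathbb{F}_2^n : d_H(a,b) < d, \qquad x_v \in \{0,1\} \;\; \forall v \in \mathbb{F}_2^n.$$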

This is implemented as echelon_ferrers. However, the evaluation of this ILP is only feasible for moderately sized parameters. The Echelon-Ferrers construction has also been refined using pending dots [6].

We are now ready to give the formal definition of the problem addressed in this paper.

###### Definition 1 (Problem Definition)

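Given $q$, $n$, $d$, and $k$, there are $\binom{n}{k}$ different identifying vectors in total, each corresponding to a certain dimension. Among these vectors, we need to choose a set $\mathcal{B}$ of binary vectors that maximizes the size of the resulting code $\bigcup_{b\in\mathcal{B}} \mathcal{C}_b$.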

## 4 Greedy Algorithm

In this section, we present the details of our construction, the greedy algorithm. We first briefly review the classic recursive backtracking procedure that exhaustively enumerates all maximal cliques in an undirected graph $G$. We then describe the greedy algorithm in the rest of the section.

### 4.1 Classic Maximal Clique Enumeration (MCE)

A classic maximal clique enumeration (MCE) algorithm relies on recursive calls to a backtracking procedure, which is illustrated in Algorithm 1. We denote the set of neighbors of a vertex $v$ by $N(v)$. The algorithm takes a graph $G=(V,E)$ as input and initially invokes the procedure with an empty clique $C$, the candidate set $P=V$, and an empty set $X$. The basic idea is to backtrack recursively, adding a vertex from the set of candidate vertices $P$ to grow the current clique $C$. A vertex $v$ is a candidate for $C$ if and only if $v$ is a neighbor of all vertices in $C$. Each time $C$ is augmented by a vertex $v$, we refine $P$ by keeping only the vertices that are also neighbors of $v$. When $P$ becomes empty, $C$ cannot be grown further. At this point, we need to check whether $C$ is indeed maximal. To this end, we maintain a set $X$ of vertices that are neighbors of all vertices in $C$ and have already been output as part of some maximal clique, i.e., the recursive procedure has earlier output some maximal clique $C'$ with $C \cup \{x\} \subseteq C'$ for $x \in X$. Thus, if $X$ is not empty, $C$ is not a maximal clique; otherwise, we output $C$ as a maximal clique.
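The backtracking procedure described above corresponds to the basic Bron-Kerbosch scheme without pivoting. Since Algorithm 1 itself is not reproduced here, the following is only a sketch under that assumption; the function name and the adjacency-dictionary input format are illustrative choices.

```python
def enumerate_maximal_cliques(adj):
    """Yield every maximal clique of a graph given as {vertex: set_of_neighbours}."""

    def expand(clique, cand, excluded):
        # No candidates and nothing excluded: the clique cannot be extended.
        if not cand and not excluded:
            yield sorted(clique)
            return
        for v in list(cand):
            # Keep only vertices adjacent to v in both bookkeeping sets.
            yield from expand(clique | {v}, cand & adj[v], excluded & adj[v])
            # v has been fully explored: move it from candidates to excluded.
            cand = cand - {v}
            excluded = excluded | {v}

    yield from expand(set(), set(adj), set())


if __name__ == "__main__":
    # Triangle 0-1-2 with a pendant edge 2-3.
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(sorted(enumerate_maximal_cliques(graph)))
```

A branch that ends with an empty candidate set but a nonempty excluded set produces no output, which is exactly the maximality check described in the text.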

The worst-case time complexity of the algorithm is analyzed in [1, 20]. The time taken to compute and output the set of all maximal cliques is acceptable when the vertex set $V$ is small. The following algorithm makes use of this feature. On a standard PC, when the size of $V$ is under 80, the classic maximal clique enumeration algorithm finishes in a few minutes.

### 4.2 Algorithm

As mentioned in Section 3.3, the optimal Echelon-Ferrers construction of a code can be modeled as an integer linear program (ILP). Since the evaluation of this ILP is only feasible for moderately sized parameters, we present a hierarchical-based greedy algorithm (Algorithm 2), described in the following. The greedy algorithm iteratively maintains a set of identifying vectors. It starts by initializing the set of all identifying vectors, denoted by $\mathcal{V}$, and computing the dimension of each vector by Theorem 4; we then sort $\mathcal{V}$ in descending order of dimension. We put the identifying vector with maximal dimension into the result set $\mathcal{R}$. Next, we proceed from the second largest dimension down to dimension 0. Each dimension is treated as a layer. For each dimension $i$, the algorithm collects the set $\mathcal{V}_i$ of vectors of dimension $i$ that are compatible with $\mathcal{R}$. That is, for each vector $v \in \mathcal{V}_i$ and each $r \in \mathcal{R}$, we have $d_H(v, r) \ge d$. Then MCE is called on $\mathcal{V}_i$, where two vectors are adjacent if their Hamming distance is at least $d$, to generate all maximal cliques. Finally, we add the best clique to $\mathcal{R}$. In some cases $\mathcal{V}_i$ is empty because no vector satisfies the compatibility condition.
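The layered procedure can be sketched as follows. This is a simplified illustration rather than the authors' Algorithm 2: it treats the top layer like every other layer, breaks ties between equally large cliques by taking the first one found (no lookahead), and estimates the code size under the assumption that the bound of Theorem 4 is attained for every chosen vector. All function names are hypothetical.

```python
from itertools import combinations


def ferrers_dimension(v, delta):
    """Exponent min_i nu_i of Theorem 4 for the identifying vector v."""
    lengths = [sum(1 for b in v[p + 1:] if b == 0)
               for p, bit in enumerate(v) if bit]
    return min(sum(max(l - (delta - 1 - i), 0) for l in lengths[i:])
               for i in range(delta))


def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))


def maximal_cliques(adj):
    """Basic backtracking enumeration of all maximal cliques."""
    def expand(clique, cand, excluded):
        if not cand and not excluded:
            yield sorted(clique)
            return
        for v in list(cand):
            yield from expand(clique | {v}, cand & adj[v], excluded & adj[v])
            cand = cand - {v}
            excluded = excluded | {v}
    yield from expand(set(), set(adj), set())


def greedy_skeleton(n, k, d):
    """Pick identifying vectors layer by layer, highest dimension first."""
    delta = d // 2
    layers = {}
    for ones in combinations(range(n), k):
        v = tuple(1 if i in ones else 0 for i in range(n))
        layers.setdefault(ferrers_dimension(v, delta), []).append(v)
    result = []
    for dim in sorted(layers, reverse=True):
        # Keep only vectors far enough from everything already chosen.
        compatible = [v for v in layers[dim]
                      if all(hamming(v, r) >= d for r in result)]
        if not compatible:
            continue
        adj = {v: {w for w in compatible if w != v and hamming(v, w) >= d}
               for v in compatible}
        # Within one layer every vector contributes q^dim, so size decides.
        result.extend(max(maximal_cliques(adj), key=len))
    return result


def estimated_size(skeleton, q, d):
    """Code size if the bound of Theorem 4 is met for each skeleton vector."""
    return sum(q ** ferrers_dimension(v, d // 2) for v in skeleton)


if __name__ == "__main__":
    skeleton = greedy_skeleton(4, 2, 4)
    print(skeleton, estimated_size(skeleton, 2, 4))
```

For $n=4$, $k=2$, $d=4$ the sketch selects the vectors $1100$ and $0011$, giving an estimated size of $q^2+1$.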

In the above algorithm, the way the clique is chosen is critical for the resulting solution. Suppose that several cliques were computed in the previous step. We add the clique yielding the largest number of codewords to the result set. If several cliques attain the same largest number of codewords, we evaluate the impact of each on the subsequent selection after it joins the result set. To this end, suppose that one such clique was added to the result set; we take the vectors in a bounded range of the following lower dimensions that are compatible with the new result set and invoke Algorithm 1 again to generate all possible cliques. Among all candidates, we pick the one that maximizes the total number of codewords. The parameter bounding this range ensures that the lookahead finishes in acceptable time.

###### Example 1

Let $q$ be any prime power. After applying the greedy algorithm to the set of all identifying vectors, we obtain 100 identifying vectors, 24 of which are shown in Table 1. Table LABEL:A-q-5-4 gives some new lower bounds obtained in this way.

It has been proved that for general diagrams $\mathcal{F}$, the bound of Theorem 4 is attained for $\delta \le 2$ (see [9, 8] for more details). The improvements on CDCs achieved by our greedy algorithm are given in Tables 2 and 3. All the codes are attached in the supplementary material.

## 5 Discussion

The Echelon-Ferrers construction is an important method for constructing constant dimension codes. One of its outstanding advantages is that it can be applied to a wide range of parameters. In this paper, we give a greedy algorithm for the Echelon-Ferrers construction. About 63 improvements are obtained by our greedy algorithm. It would be interesting to see whether the greedy algorithm of this paper can be improved to obtain larger codes.

