List-decoding homomorphism codes with arbitrary codomains

06/08/2018
by   László Babai, et al.
The University of Chicago

The codewords of the homomorphism code aHom(G,H) are the affine homomorphisms between two finite groups, G and H, generalizing Hadamard codes. Following the work of Goldreich and Levin (1989), Grigorescu et al. (2006), Dinur et al. (2008), and Guo and Sudan (2014), we further expand the range of groups for which local list-decoding is possible up to mindist, the minimum distance of the code. In particular, for the first time, we do not require either G or H to be solvable. Specifically, we demonstrate a poly(1/ε) bound on the list size, i.e., on the number of codewords within distance (mindist-ε) from any received word, when G is either abelian or an alternating group, and H is an arbitrary (finite or infinite) group. We conjecture that a similar bound holds for all finite simple groups as domains; the alternating groups serve as the first test case. The abelian vs. arbitrary result then permits us to adapt previous techniques to obtain efficient local list-decoding for this case. We also obtain efficient local list-decoding for the permutation representations of alternating groups (i.e., when the codomain is a symmetric group S_m) under the restriction that the domain G=A_n is paired with codomain H=S_m satisfying m < 2^{n-1}/√n. The limitations on the codomain in the latter case arise from severe technical difficulties stemming from the need to solve the homomorphism extension (HomExt) problem in certain cases; these are addressed in a separate paper (Wuu 2018). However, we also introduce an intermediate "semi-algorithmic" model we call Certificate List-Decoding that bypasses the HomExt bottleneck and works in the alternating vs. arbitrary setting. A certificate list-decoder produces partial homomorphisms that uniquely extend to the homomorphisms in the list.

1 Introduction

1.1 Brief history

Let $G$ and $H$ be finite groups, to be referred to as the domain and the codomain, respectively. A map $\varphi : G \to H$ is an affine homomorphism if it is a translate of a homomorphism, i.e., if there exists a homomorphism $\varphi_0 : G \to H$ and an element $h \in H$ such that $\varphi(g) = h \cdot \varphi_0(g)$ for all $g \in G$. We write $\mathrm{Hom}(G,H)$ and $\mathrm{aHom}(G,H)$ to denote the set of homomorphisms and affine homomorphisms, respectively. Let $H^G$ denote the set of all functions $f : G \to H$.

We view $\mathrm{aHom}(G,H)$ as a (nonlinear) code within the code space $H^G$ (the space of possible “received words”) and refer to this class of codes as homomorphism codes.
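As a small illustration, the following Python sketch enumerates the codewords of a homomorphism code in the simplest setting, cyclic groups written additively. The restriction to cyclic groups and the helper names `homs`, `affine_homs`, and `is_affine` are assumptions made here for illustration; they are not taken from the paper.

```python
from itertools import product

def homs(n, m):
    """All homomorphisms Z_n -> Z_m, each given as its table of values.

    The map x -> k*x is well defined on Z_n exactly when n*k = 0 in Z_m.
    """
    return [tuple((k * x) % m for x in range(n))
            for k in range(m) if (n * k) % m == 0]

def affine_homs(n, m):
    """All translates h + phi of homomorphisms phi: Z_n -> Z_m (the codewords)."""
    return sorted({tuple((h + v) % m for v in phi)
                   for phi in homs(n, m) for h in range(m)})

def is_affine(f, n, m):
    """Check f(a - b + c) = f(a) - f(b) + f(c) for all a, b, c in Z_n."""
    return all(f[(a - b + c) % n] == (f[a] - f[b] + f[c]) % m
               for a, b, c in product(range(n), repeat=3))
```

For instance, `affine_homs(4, 2)` returns four words of length 4 over the alphabet Z_2, and each of them passes `is_affine`, while a generic word such as `(0, 0, 0, 1)` does not.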

Homomorphism codes are candidates for efficient local list-decoding up to minimum distance (mindist) and in many cases it is known that their minimum distance is (asymptotically) equal to the list-decoding bound.

This line of work goes back to the celebrated paper by Goldreich and Levin (1989) [GL89] who found local list-decoders for Hadamard codes, i.e., for homomorphism codes with domain $\mathbb{Z}_2^n$ and codomain $\mathbb{Z}_2$. This result was extended to homomorphism codes of abelian groups (both the domain and the codomain abelian) by Grigorescu, Kopparty, and Sudan (2006) [GKS06] and Dinur, Grigorescu, Kopparty, and Sudan (2008) [DGKS08] and to the case of supersolvable domain and nilpotent codomain by Guo and Sudan (2014) [GS14], cf. [BGSW18].

While homomorphism codes have low (logarithmic) rates, they tend to have remarkable list-decoding properties.

In particular, in all cases studied so far (including the present paper), for an arbitrary received word $f \in H^G$, and any $\varepsilon > 0$, the number of codewords within radius $(\mathrm{mindist} - \varepsilon)$ is bounded by $\mathrm{poly}(1/\varepsilon)$ (as opposed to some faster-growing function of $1/\varepsilon$, as permitted in the theory of list-decoding). This is an essential feature for the complexity-theoretic application (hard-core predicates) by Goldreich and Levin.

We call the $\mathrm{poly}(1/\varepsilon)$ bound economical, and a homomorphism code permitting such a bound combinatorially economically list-decodable (CombEcon).

By efficient decoding we mean $\mathrm{poly}(\log|G|, 1/\varepsilon)$ queries to the received word and $\mathrm{poly}(\log|G|, \log|H|, 1/\varepsilon)$ additional work. We call a CombEcon code AlgEcon (algorithmically economically list-decodable) if it permits efficient decoding in this sense. So the cited results show that homomorphism codes with abelian domain and codomain, and more generally with supersolvable domain and nilpotent codomain, are CombEcon and AlgEcon.

In all work on the subject, this efficiency depends on the computational representation of the groups used (presentation in terms of generators and relators, black-box access, permutation groups, matrix groups). We shall make the representation required explicit in all algorithmic results.

1.2 Our contribution – combinatorial bounds

In this paper we further expand the range of groups for which efficient local list-decoding is possible up to the minimum distance. In particular, for the first time, we do not require either $G$ or $H$ to be solvable. In fact, in our combinatorial and semi-algorithmic results (see below), the codomain $H$ is an arbitrary (finite or infinite) group. We say that a class of finite groups is universally CombEcon if, for every group $G$ in the class and arbitrary (finite or infinite) $H$, the code $\mathrm{aHom}(G,H)$ is CombEcon. This paper is the first to demonstrate the existence of significant universally CombEcon classes.

Convention 1.1.

When speaking of a homomorphism code $\mathrm{aHom}(G,H)$, the domain $G$ will always be a finite group, but the codomain $H$ will, in general, not be restricted to be finite.

Theorem 1.2 (Main combinatorial result).

Finite abelian and alternating groups are universally CombEcon.

We explain this result in detail. By “distance” in a code we mean normalized Hamming distance.

(Restatement of Theorem 1.2.) Let the domain $G$ be a finite abelian or alternating group and $H$ an arbitrary (finite or infinite) group. Let mindist denote the minimum distance of the homomorphism code $\mathrm{aHom}(G,H)$ and let $\varepsilon > 0$. Let $f \in H^G$ be an arbitrary received word. Then the number of codewords within $(\mathrm{mindist} - \varepsilon)$ of $f$ is at most $\mathrm{poly}(1/\varepsilon)$.


The degree of the polynomial in the $\mathrm{poly}(1/\varepsilon)$ expression for abelian domains is where is the degree in the corresponding abelian→abelian result (currently  [GS14]). For alternating domains $A_n$, we prove a bound of 9 on the degree of the polynomial; with additional work, this can be improved to 7.

Our choice of the alternating groups as the domain is our test case of what we believe is a general phenomenon valid for all finite simple groups.

Conjecture 1.3.

The class of finite simple groups is universally CombEcon.

The following problem is also open.

Problem 1.4.

Is the class of finite groups universally CombEcon?

We suspect the answer is “no.”

Theorem 1.2 also holds for a hierarchy of wider classes of finite groups we call shallow random generation groups or “SRG groups” (see Sec. 4.4). This class includes the alternating groups. The defining feature of these groups is that a bounded number of random elements generate, with extremely high probability, a “shallow” subgroup, i.e., a subgroup at bounded distance from the top of the subgroup lattice.

Our new combinatorial tools allow us to play on the relatively well-understood top layers of the subgroup lattice of the domain, avoiding the dependence on the codomain, a bottleneck in previous work.

Remark 1.5.

Our results list-decode certain classes of codes up to distance $(\mathrm{mindist} - \varepsilon)$ for positive $\varepsilon$. In many cases, mindist is the list-decoding boundary; examples show that the length of the list may blow up when $\varepsilon$ is set to zero. Classes of such examples with abelian domain and codomain were found by Guo and Sudan [GS14]. We add classes of examples with alternating domains (Section 9.4).

1.3 Our contribution – algorithms

On the algorithmic front, the combinatorial bound in the abelian→arbitrary case permits us to adapt the algorithm of [GKS06] to obtain efficient local list-decoding. We say that a class of finite groups is universally AlgEcon if, for every group $G$ in the class and arbitrary finite $H$, the code $\mathrm{aHom}(G,H)$ is AlgEcon. The validity of such a statement depends not only on the class but also on the representation of the domain and codomain.

Corollary 1.6.

Let $G$ be a finite abelian group and $H$ an arbitrary finite group. Under suitable assumptions on the representation of $G$ and $H$, the homomorphism code $\mathrm{aHom}(G,H)$ is AlgEcon.

In other words, abelian groups are universally AlgEcon.

In fact, the algorithm is so efficient that in the unit-cost black-box-access model for $H$ (elements of $H$ can be named and operations on them performed at unit cost) the work required is only $\mathrm{poly}(\log|G|, 1/\varepsilon)$. (The cost does not depend on $H$; indeed, in this case, infinite $H$ is also allowed.)

We need to clarify the “suitable representation.” It suffices to assume that $G$ is a finite abelian group given in any presentation by generators and relators, assuming in addition that a superset of the prime divisors of the order of $G$ is available. Without the prime divisors, we need a factoring oracle. We need black-box access to $H$.

A permutation representation of degree $m$ of a group $G$ is a homomorphism $G \to S_m$, where the codomain is the symmetric group of degree $m$. We also obtain efficient local list-decoding for the permutation representations of alternating groups under a rather generous restriction on the size of the permutation domain.

Theorem 1.7 (Main algorithmic result).

Let $G = A_n$ be the alternating group and $H = S_m$ the symmetric group of degree $m$. Then $\mathrm{aHom}(G,H)$ is AlgEcon, assuming $m < 2^{n-1}/\sqrt{n}$.

The limitations on the codomain arise from severe technical difficulties encountered.

In contrast to all previous work, in the alternating case the minimum distance does not necessarily correspond to a subgroup of smallest index (modulo the “irrelevant kernel,” see Sec. 4.2). This necessitates the introduction of the homomorphism extension problem, a problem of interest in its own right, which remains the principal bottleneck for algorithmic progress. The problem was solved by Wuu [Wuu18] in the special case stated above.

To bypass the HomExt bottleneck, we introduce a new model we call Certificate List-Decoding. In this model the output is a short ($\mathrm{poly}(1/\varepsilon)$) list of partial maps from $G$ to $H$ that includes, for each affine homomorphism $\varphi$ within $(\mathrm{mindist} - \varepsilon)$ of the received word, a certificate of $\varphi$, i.e., a partial affine homomorphism that uniquely extends to $\varphi$.

We say that a homomorphism code is economically certificate-list-decodable (CertEcon) if such a list can be efficiently generated.

Note that, by definition, AlgEcon $\Rightarrow$ CertEcon $\Rightarrow$ CombEcon.

We say that a class of finite groups is universally CertEcon if, for every group $G$ in the class and arbitrary (finite or infinite) $H$, the code $\mathrm{aHom}(G,H)$ is CertEcon.

Theorem 1.8 (Main semi-algorithmic result).

Alternating groups are universally CertEcon.

In fact we show that SRG groups are universally CertEcon.

Finally, we show that certificate list-decoding, combined with a HomExt oracle for the top layers of the subgroup lattice of $G$, suffices for list-decoding $\mathrm{aHom}(G,H)$.

This is the route we take to proving Theorem 1.7.

We give more formal statements of these results in Section 4.

1.4 The structure of the paper

Much of our conceptual framework can be interpreted for codes in general, not just for homomorphism codes. In Section 2 we develop the general terminology. This includes the notions of economy in local list-decoding as well as the new concepts of certificate-list decoding (Sec. 2.4), our semi-algorithmic intermediate concept, and mean-list decoding, our main tool for domain relaxation (Sec. 2.5), motivated by Guo and Sudan’s use of repeated codes [GS14]. We also introduce subword extenders, which constitute the bridge between certificate-list decoding and algorithmic list-decoding (Sec. 2.6).

In Section 3 we present notation and terminology from group theory and computational group theory, including our access models, i. e., computational representations of groups (black-box, generator-relator presentations, etc.).

Section 4 gives formal statements of our results and occasional minor proofs that contribute to the conceptual development. The section includes a discussion of shallow-random-generation (SRG) groups (Section 4.4). Section 4.7 explains the role of the Homomorphism Extension problem in bridging the gap between certificate-list decoding and algorithmic list-decoding.

Section 5 describes a simple combinatorial lemma (“Bipartite covering lemma”) and applies it in two separate contexts: connecting mean-list-size to list-size, from which we infer our domain relaxation principle, and the equivalence (both combinatorial and algorithmic) of $\mathrm{Hom}(G,H)$ and $\mathrm{aHom}(G,H)$.

Section 6 outlines our basic strategy for the combinatorial bounds. It indicates the differences between the approach to abelian domains and to alternating (and SRG) domains. We indicate that the same strategy also produces certificate-list-decoders.

Section 7 describes the tools for the combinatorial bounds. We compare one of our tools, a sphere packing argument via a strong negative correlation inequality, to the Johnson bound.

Section 8 gives the full technical development of our results for abelian domains.

The rest of the paper, Sections 9 to 11, provides the proofs for alternating domains and their generalizations, the SRG groups.

We give two proofs that alternating groups are CombEcon.

The first proof, in Section 9, is based on a sphere packing argument and is non-constructive, but the method applies under quite general circumstances. The second, in Section 11, depends on structure specific to the alternating groups (or more generally, to SRG groups). That proof directly translates to a semi-algorithmic result (CertEcon) and, under restrictions of the codomain, also provides an algorithmic result (AlgEcon).

2 Terminology for general codes

2.1 List-decoding

We introduce some terminology that applies to codes in general and not just homomorphism codes.

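Let $\Sigma$ be an alphabet and $\Omega$ a set we think of as the set of positions. We view $\Sigma^\Omega$, the set of $\Omega \to \Sigma$ functions, as our code space; we call its elements words. We write $\mathrm{dist}(u,w)$ for the normalized Hamming distance between two words $u, w \in \Sigma^\Omega$ (so $0 \le \mathrm{dist}(u,w) \le 1$) and refer to it simply as distance. Let $\mathcal{C} \subseteq \Sigma^\Omega$ be a code; we call its elements codewords.

We write $\mathrm{mindist}(\mathcal{C})$ (or simply mindist) for the minimum distance between distinct codewords in $\mathcal{C}$.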

Words we wish to decode are referred to in the literature as received words. We refer to the set of codewords within a specified distance of a received word as “the list” and denote it by . We write .
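The notions of distance, minimum distance, and "the list" can be made concrete by brute force on a toy code. The sketch below is only an illustration: the names `dist`, `mindist`, `the_list`, and `CODE` are hypothetical, words are value tables (tuples), and "within a specified distance" is read as a non-strict inequality, which is an assumption.

```python
def dist(u, w):
    """Normalized Hamming distance between two words of equal length."""
    assert len(u) == len(w)
    return sum(a != b for a, b in zip(u, w)) / len(u)

def mindist(code):
    """Minimum distance between distinct codewords (needs at least two codewords)."""
    return min(dist(u, w) for i, u in enumerate(code) for w in code[i + 1:])

def the_list(code, received, radius):
    """All codewords within the given distance of the received word."""
    return [c for c in code if dist(c, received) <= radius]

# Toy code: the four affine homomorphisms Z_4 -> Z_2, written as value tables.
CODE = [(0, 0, 0, 0), (0, 1, 0, 1), (1, 0, 1, 0), (1, 1, 1, 1)]
```

On this toy code the minimum distance is 1/2, and the received word `(0, 0, 0, 1)` has two codewords within distance 1/4. A brute-force scan like this reads every position of every codeword, so it is neither local nor efficient in the sense used in this paper.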

2.2 Combinatorial list-decoding

The list-decoding problem splits into a combinatorial and an algorithmic part.

The combinatorial problem, to which we refer as combinatorial list-decoding, asks to bound the size of the list. Typically, we take and we wish to obtain a bound , that depends only on and the class of codes under discussion ().

We say that a class of codes is CombEcon (“combinatorially economically list-decodable”) if the list size is bounded by $\mathrm{poly}(1/\varepsilon)$ at distance $\mathrm{mindist} - \varepsilon$. (With some abuse of terminology, we shall refer to a code as a CombEcon code if the class of codes is understood from the context.)

2.3 Algorithmic list-decoding

We shall describe algorithms with certain performance guarantees typically guaranteeing properties of the output with specified probability.

A list-decoder is an algorithm that, given the received word and the distance , lists a superset of the list . Typically, we take and we wish to produce a list of size for some that depends only on and the class of codes under discussion ().

Adapting the terminology of [GKS06] and [DGKS08], we say that a local algorithm is a probabilistic algorithm that has oracle access to the received word .

We say that is an AlgEcon (“algorithmically economically list-decodable”) class of codes if there exists a local list-decoder with the following features.
Input: mindist, , oracle access to .
Notation: .
Output: A list of codewords in of length .
Guarantee: With probability , we have .
Cost:

  (i) queries to the received word .

  (ii) amount of work.

Access: The meaning of this definition depends also on the access model to and . We shall clarify this in each application.

Strong AlgEcon

In the unit cost model for , we charge unit cost to name an element of .

We say that is a strong AlgEcon code if there exists a list-decoder satisfying the conditions of AlgEcon, except with (ii) replaced by the following.

  (ii’) amount of work in the unit cost model for .

Typically, elements of are encoded by strings of length and therefore (ii’) implies (ii) with linear dependence on . The AlgEcon results proved in prior work [DGKS08, GS14, BGSW18] are actually strong AlgEcon results for those classes of pairs of groups. Our AlgEcon result for abelian domain is also strong AlgEcon (see Section 4.3). On the other hand, our AlgEcon result for alternating domain does not meet the “strong” requirement.

Remark 2.1.

The unit cost model can also be used in the case of infinite . In fact, our AlgEcon result for abelian domains holds even for infinite codomains in the unit cost model, i.e., it satisfies (ii’).

2.4 Certificate list-decoding

In the light of technical difficulties arising from algorithmic list-decoding, we introduce a new type of list-decoding that is intermediate between the combinatorial and algorithmic. We call it “certificate list-decoding.” We shall refer to results of this type as “semi-algorithmic.”

A partial map from to , denoted , is a map of a subset of to . In particular, .

Definition 2.2 (Certificate).

We say that a partial map is a certificate for the codeword if and is the unique codeword that extends . A certificate for the code is a certificate for some codeword in .
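A brute-force check of the certificate property on a small explicit code may help fix the definition. In this sketch a partial map is modeled as a Python dict from positions to symbols; the function names and the toy code are assumptions introduced for illustration, and the codewords are assumed to be pairwise distinct.

```python
def extends(word, partial):
    """True if the word agrees with the partial map wherever the latter is defined."""
    return all(word[pos] == sym for pos, sym in partial.items())

def is_certificate(partial, codeword, code):
    """True if `codeword` is the unique codeword in `code` extending `partial`."""
    extensions = [c for c in code if extends(c, partial)]
    return extensions == [codeword]

# Toy code: the four affine homomorphisms Z_4 -> Z_2, written as value tables.
CODE = [(0, 0, 0, 0), (0, 1, 0, 1), (1, 0, 1, 0), (1, 1, 1, 1)]
```

Here the partial map `{0: 0, 1: 1}` is a certificate for `(0, 1, 0, 1)`, whereas `{0: 0}` is not a certificate of any codeword because two codewords extend it.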

Definition 2.3 (Certificate-list).

We say that a list of partial maps is a certificate-list for the set of codewords if contains a certificate for each codeword in . A certificate-list for up to distance of the received word is a certificate-list for the list .

Remark 2.4.

Note that we permit the certificate-list to contain redundancies (more than one certificate for the same codeword) and irrelevant items (partial functions that are not certificates of any codeword in , or not even certificates of any codeword at all).

Definition 2.5.

A certificate-list-decoder is an algorithm that, given the received word $f$ and the distance $\rho$, constructs a certificate-list of $\mathcal{C}$ up to distance $\rho$ of $f$.

Definition 2.6.

We say that is a CertEcon (“certificate-economically list-decodable”) code if there exists a local certificate-list-decoder with the following features.
Input: , oracle access to .
Notation: Again, let .
Output: A list of partial maps of length .
Guarantee: With probability , we have that is a certificate-list for .
Cost:

  (i) queries to the received word .

  (ii) amount of work.

Access: The meaning of this definition depends also on the access model to and . We shall clarify this in each application.

Remark 2.7.

Note that mindist is not part of the input. In our results, we are likely to find a certificate of up to distance of the received word , regardless of the actual value of mindist. We note that, depending on the access model, we may not be able to find mindist.

Remark 2.8.

CertEcon is intermediate between AlgEcon and CombEcon. Indeed, CertEcon implies CombEcon, by the length bound of the Output and the Guarantee. Moreover, AlgEcon implies CertEcon, as the AlgEcon Output satisfies the definition of a certificate, under the same Guarantee and Cost bound.

Strong CertEcon
Definition 2.9.

We say that is a strong CertEcon code if there exists a certificate-list-decoder satisfying the conditions of CertEcon, except with (ii) replaced by the following.

  (ii”) amount of work in the unit cost model for .

All CertEcon results in this paper are actually strong CertEcon results.

Remark 2.10.

Strong CertEcon does not follow from AlgEcon, though it does follow from strong AlgEcon.

Remark 2.11.

As in the AlgEcon context, the unit cost model can also be used in the case of infinite . In fact, all our CertEcon results hold for infinite codomain in the unit cost model, i.e., they satisfy (ii”).

2.5 Mean-list-decoding

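Let $\mathcal{F} = \{f_i : i \in I\}$ be a family of received words $f_i \in \Sigma^\Omega$. By the size $|\mathcal{F}|$ we mean the size $|I|$ of the index set $I$. The average distance of a word $w$ to $\mathcal{F}$ is the average distance of $w$ to elements of $\mathcal{F}$, given by $\mathbb{E}_{i \in I}[\mathrm{dist}(w, f_i)]$. (The expectation is taken with respect to the uniform distribution over $I$.)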

Definition 2.12 (Mean-lists).

We define the mean list as the set of codewords within a specified average distance of the received words , i.e.,

(1)

We write for the maximum mean-list size for a given distance .
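The mean-list can likewise be computed by brute force on a toy example. The following sketch assumes a finite family given as a Python list (so the uniform average is an ordinary mean) and reads "within a specified average distance" as a non-strict inequality; the names `dist`, `mean_list`, and `CODE` are hypothetical.

```python
def dist(u, w):
    """Normalized Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(u, w)) / len(u)

def mean_list(code, family, radius):
    """Codewords whose average distance to the family of received words is at most radius."""
    return [c for c in code
            if sum(dist(c, f) for f in family) / len(family) <= radius]

# Toy code: the four affine homomorphisms Z_4 -> Z_2, written as value tables.
CODE = [(0, 0, 0, 0), (0, 1, 0, 1), (1, 0, 1, 0), (1, 1, 1, 1)]
```

With a family consisting of a single received word, `mean_list` reduces to the ordinary list.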

This concept was inspired by the use of repeated codes by Guo and Sudan [GS14], see Remark 5.4.

As we shall see, the mean-list-decoding concept helps expand the scope of our results, without making them more difficult to prove. We adapt the above terminology to the context of mean-list-decoding.

Combinatorial.

We wish to bound mean-list size by for some that depends only on and the class of codes under discussion (). We say that the class of codes is CombEconM (“combinatorially economically mean-list-decodable”) if for .

Algorithmic.

We say that the class of codes is AlgEconM (“algorithmically economically mean-list-decodable”) if it satisfies the definition of AlgEcon classes of codes, with the following modifications.

For each , the received word is replaced by a family of received words and the list becomes . Oracle access to means that, given and , the oracle returns . Condition (ii) is replaced by the following.

  (ii-M) amount of work.

Note that the number of queries to the family remains .

Certificate.

We say that a class of codes is CertEconM (“certificate economically mean-list-decodable”) if it satisfies the definition of CertEcon codes, with the same modifications as AlgEconM.

Theorem 2.13.

For a class of codes, we have the following.

  1. is CombEconM if and only if it is CombEcon.

  2. is AlgEconM if and only if it is AlgEcon.

  3. is CertEconM if and only if it is CertEcon.

For more detailed statements and proofs see Section 5.2.

Remark 2.14 (Significance of mean-list-decoding).

Dinur et al. show the CombEcon and AlgEcon list-decodability of abelian→abelian homomorphism codes [DGKS08]. We shall see that Theorem 2.13 quickly leads to the conclusion of CombEcon list-decodability of arbitrary→abelian homomorphism codes. The same inference can be made about AlgEcon list-decodability, assuming natural conditions about representation of the domain group. See Section 5.2 for details.

Strong mean-list-decoding

We say that is a strong AlgEconM code if it satisfies the definition of AlgEconM, except with (ii-M) replaced by (ii’-M) below. Similarly, we say that is a strong CertEconM code if it satisfies the definition of CertEconM, except with (ii-M) replaced by (ii’-M) below.

  (ii’-M) amount of work in the unit cost model for and unit sampling cost model for .

In the unit sampling cost model for , we charge unit cost for naming any and for generating a uniform random .

2.6 Subword extension

In this section we introduce terminology to formalize our strategy to advance from certificate-list-decoding to algorithmic list-decoding (Observation 2.24 below).

Definition 2.15 (Subword extension problem).

Let be a code. The subword extension problem asks, given a partial map , whether extends to a codeword in .

A subword extender is an algorithm that answers this question and returns a codeword in extending , if one exists.

Observation 2.16.

A certificate-list-decoder and a subword extender combine to a list-decoder.
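The combination in this observation can be sketched as follows: run the subword extender on every partial map in the certificate-list and collect the codewords it returns. This is a minimal illustration, not the paper's algorithm. The extender below is a brute-force stand-in over an explicit toy code, partial maps are dicts from positions to symbols, and all names are hypothetical.

```python
def brute_force_extender(code):
    """Return a subword extender for an explicitly listed code (illustration only)."""
    def extend(partial):
        # Return some codeword extending the partial map, or None if none exists.
        for c in code:
            if all(c[pos] == sym for pos, sym in partial.items()):
                return c
        return None
    return extend

def list_decode(certificate_list, extend):
    """Turn a certificate-list into a list of codewords using a subword extender."""
    out = []
    for gamma in certificate_list:
        c = extend(gamma)
        if c is not None and c not in out:   # drop failures and duplicates
            out.append(c)
    return out

# Toy code: the four affine homomorphisms Z_4 -> Z_2, written as value tables.
CODE = [(0, 0, 0, 0), (0, 1, 0, 1), (1, 0, 1, 0), (1, 1, 1, 1)]
```

Since a certificate has a unique extension, the extender necessarily returns the certified codeword, so the output contains every codeword that has a certificate in the input list. Redundant and irrelevant items in the certificate-list are tolerated; they at most add extra codewords to the output.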

Remark 2.17.

This observation describes our two-phase plan to prove algorithmic list-decodability results for homomorphism codes with alternating domains. In the case of homomorphism codes, the subword extension problem corresponds to the homomorphism extension problem (see Section 4.9). The algorithmic difficulty of the homomorphism extension problem is a major bottleneck to further progress.

In fact, the plan suggested by this observation is too ambitious. We have no hope to solve the subword extension problem in cases of interest for all subwords.

Therefore, we relax the subword extender concept; correspondingly, we strengthen the notion of certificates required.

Let be a set of partial maps.

Definition 2.18 (-subword extender).

The -subword extension problem asks to solve the subword extension problem on inputs from . A -subword extender is an algorithm that takes as input any partial map and returns a yes/no answer; and in the case of a “yes” answer, it also returns a codeword , such that

  • if then the answer is “yes” if and only if extends to a codeword, and in this case, is a codeword that extends .

Remark 2.19.

Note that is not required to decide whether . must correctly decide extendability of for all ; in case , the algorithm may return an arbitrary answer.

Definition 2.20 (-certificate).

A -certificate is a certificate that belongs to .

Definition 2.21 (-certificate-list).

We say that a list of partial maps is a –certificate-list for the set of codewords if contains a -certificate for each codeword in . A -certificate-list for up to distance of the received word is a -certificate-list for the list .

Remark 2.22.

Note that, as mentioned in Remark 2.4, we permit the -certificate-list to contain redundancies and irrelevant items, including partial functions that do not belong to .

Definition 2.23.

A -certificate-list-decoder is an algorithm that, given the received word and the distance , constructs a -certificate-list of up to distance of .

Our overall strategy for the case when is “far from abelian” is summarized in the following observation.

Observation 2.24.

For any set of partial maps, a -certificate-list-decoder and a -subword extender combine to a list-decoder.

Definition 2.25.

We say that is a -CertEcon (“-certificate-economically list-decodable”) code if there exists a local -certificate-list-decoder with the features listed in Definition 2.6.

Definition 2.26.

We say that is a strong -CertEcon code if there exists a strong -certificate-list-decoder, i. e., a -certificate-list-decoder that is a strong certificate-list-decoder (see Definition 2.9).

2.7 Minimum distance versus maximum agreement

Recall that our code space is $\Sigma^\Omega$, the set of $\Omega \to \Sigma$ functions. In the theory of error-correcting codes, the usual measure of distance between two functions (strings) is the (normalized) Hamming distance, the fraction of symbols on which they differ. Following [GKS06], we find it convenient to consider the measure complementary to normalized Hamming distance, the (normalized) agreement,

$$\mathrm{agr}(f,g) = \frac{1}{|\Omega|}\,\bigl|\{\omega \in \Omega : f(\omega) = g(\omega)\}\bigr|, \tag{2}$$

the fraction of positions on which the two functions agree.

Definition 2.27.

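The maximum agreement of the code $\mathcal{C}$ is given by

$$\Lambda_{\mathcal{C}} = \max_{\varphi, \psi \in \mathcal{C},\ \varphi \neq \psi} \mathrm{agr}(\varphi, \psi).$$

Fact 2.28.

The minimum distance is the complement of the maximum agreement, i.e.,

$$\mathrm{mindist}(\mathcal{C}) = 1 - \Lambda_{\mathcal{C}}.$$

So, the codewords within distance $(\mathrm{mindist} - \varepsilon)$ of a received word $f$ are the same as the codewords with agreement at least $\Lambda_{\mathcal{C}} + \varepsilon$ with $f$.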
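The translation between the two viewpoints is a one-line computation. The derivation below uses notation assumed here for illustration: $\mathrm{dist}$ and $\mathrm{agr}$ for normalized distance and agreement, $\Lambda$ for the maximum agreement, $\varphi$ for a codeword, and $f$ for the received word. "Within distance" is read as a non-strict inequality.

```latex
\begin{align*}
\mathrm{dist}(f,\varphi) &= 1 - \mathrm{agr}(f,\varphi), \\
\mathrm{dist}(f,\varphi) \le \mathrm{mindist} - \varepsilon
  &\iff 1 - \mathrm{agr}(f,\varphi) \le (1 - \Lambda) - \varepsilon \\
  &\iff \mathrm{agr}(f,\varphi) \ge \Lambda + \varepsilon .
\end{align*}
```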

Classes of examples for the infeasibility of list-decoding outside this range were provided by Guo and Sudan [GS14] for abelian domain and codomain, and we provide such classes for alternating domain (see Section 9.4), so the list-decoding radius is mindist for those classes.

3 Preliminaries

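Let $\Omega$ be a set. For any subset $S \subseteq \Omega$, define the density of $S$ in $\Omega$ by $\mu_\Omega(S) = |S|/|\Omega|$. We call $\Omega$ the “ambient set” and write $\mu(S)$ when $\Omega$ is understood. The ambient set will generally be a group $G$.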

3.1 Groups

In this paper we will denote the class of all groups (finite or infinite) by . We write to denote the class of finite abelian groups and for the class of (finite) alternating groups.

Our group theory reference is [Rob95]. We review some definitions and facts.

Let $G$ be a group. We write $H \le G$ to express that $H$ is a subgroup; we write $N \trianglelefteq G$ if $N$ is a normal subgroup. We refer to cosets of subgroups of $G$ as subcosets. For the subcoset $aH$ of $G$ (where $H \le G$), let $|G : aH| = |G : H|$ denote the index of $aH$ in $G$. For a subset $S$ of a group $G$, the subgroup $\langle S \rangle$ generated by $S$ is the smallest subgroup of $G$ containing $S$. If $\langle S \rangle = G$, then $S$ generates $G$. A subset $S \subseteq G$ is affine-closed if $(\forall a, b, c \in S)(ab^{-1}c \in S)$. An affine-closed subset is either empty or it is a subcoset. The intersection of affine-closed subsets is affine-closed. The affine closure $\langle S \rangle_{\mathrm{aff}}$, affinely generated by $S$, is the smallest affine-closed subset containing $S$. Note that the affine closure of the empty set is empty. The affine closure of a nonempty set is a subcoset; indeed, for any $a \in S$, we have that $\langle S \rangle_{\mathrm{aff}} = a \langle a^{-1} S \rangle$.
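The affine closure of a small subset can be computed by repeatedly adding products of the form a·b⁻¹·c until nothing new appears. The sketch below is an illustration under assumptions made here: group elements are permutations of {0, ..., k-1} stored as tuples, the product p·q means "apply q first, then p", and the function names are hypothetical. The loop is cubic in the size of the closure, so it is only suitable for tiny examples.

```python
from itertools import product

def compose(p, q):
    """Product p*q of permutations given as tuples: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Inverse of a permutation given as a tuple."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def affine_closure(subset):
    """Smallest superset of `subset` closed under (a, b, c) -> a * b^-1 * c."""
    closed = set(subset)
    while True:
        new = {compose(compose(a, inverse(b)), c)
               for a, b, c in product(closed, repeat=3)} - closed
        if not new:
            return closed
        closed |= new
```

For example, the affine closure of two transpositions in S_3 is the coset consisting of all three transpositions, in line with the statement that the affine closure of a nonempty set is a subcoset.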

3.2 Homomorphism codes

3.2.1 Affine homomorphisms as codewords

Let $G$ be a finite group and $H$ a group. Denote the set of homomorphisms from $G$ to $H$ by $\mathrm{Hom}(G,H)$.

Definition 3.1.

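Let $S$ and $T$ be affine-closed subsets of $G$ and $H$, resp. A function $\varphi : S \to T$ is an affine homomorphism if

$$(\forall a, b, c \in S)\ \bigl(\varphi(ab^{-1}c) = \varphi(a)\,\varphi(b)^{-1}\varphi(c)\bigr).$$

We write $\mathrm{aHom}(S,T)$ to denote the set of affine homomorphisms from $S$ to $T$.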

Fact 3.2.

Let and . Let and . A function is an affine homomorphism if and only if there exists and such that

(3)

for every . The element and the homomorphism are unique.

The analogous statement also holds with on the right of .

Definition 3.3.

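For sets $A, B$ and functions $f, g : A \to B$, the equalizer $\mathrm{Eq}(f,g)$ is the subset of $A$ on which $f$ and $g$ agree, i.e., $\mathrm{Eq}(f,g) = \{a \in A : f(a) = g(a)\}$.

More generally, if $F$ is a collection of functions from $A$ to $B$, then the equalizer $\mathrm{Eq}(F)$ is the set $\{a \in A : (\forall f, g \in F)(f(a) = g(a))\}$.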

Fact 3.4.
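  • If $\varphi, \psi \in \mathrm{Hom}(G,H)$ then $\mathrm{Eq}(\varphi,\psi) \le G$.

  • If $\varphi, \psi \in \mathrm{aHom}(G,H)$ then $\mathrm{Eq}(\varphi,\psi)$ is affine-closed. Moreover, if $\varphi_0, \psi_0$ are the corresponding homomorphisms (see (3)) then either $\mathrm{Eq}(\varphi,\psi)$ is empty or $\mathrm{Eq}(\varphi,\psi) = a \cdot \mathrm{Eq}(\varphi_0,\psi_0)$ for any $a \in \mathrm{Eq}(\varphi,\psi)$.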
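A quick numerical check of this fact in the simplest setting may be useful. The sketch assumes cyclic groups in additive notation, so that "affine-closed" means closed under (a, b, c) -> a - b + c; the function names are hypothetical and words are value tables.

```python
def equalizer(f, g):
    """Set of positions on which two value tables of equal length agree."""
    return {x for x in range(len(f)) if f[x] == g[x]}

def is_affine_closed(subset, n):
    """Check closure of a subset of Z_n under (a, b, c) -> a - b + c (mod n)."""
    return all((a - b + c) % n in subset
               for a in subset for b in subset for c in subset)

# Two affine homomorphisms Z_6 -> Z_3: x -> x + 1 and x -> 2x.
phi = tuple((x + 1) % 3 for x in range(6))
psi = tuple((2 * x) % 3 for x in range(6))
```

In this example the equalizer of `phi` and `psi` is `{1, 4}`, the translate by 1 of the subgroup `{0, 3}` on which the underlying homomorphisms x -> x and x -> 2x agree.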

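Recall that the (normalized) agreement between two functions $f, g : G \to H$ is given by

$$\mathrm{agr}(f,g) = \frac{|\mathrm{Eq}(f,g)|}{|G|}.$$

Specializing Def. 2.27 to homomorphism codes, we write

$$\Lambda_{G,H} = \Lambda_{\mathrm{aHom}(G,H)}$$

for the maximum agreement of $\mathrm{aHom}(G,H)$. In other words,

$$\Lambda_{G,H} = \max_{\varphi, \psi \in \mathrm{aHom}(G,H),\ \varphi \neq \psi} \frac{|\mathrm{Eq}(\varphi,\psi)|}{|G|}.$$

If the groups $G$ and $H$ are understood, we often write $\Lambda$ in place of $\Lambda_{G,H}$. Using this terminology, the min distance of the homomorphism code $\mathrm{aHom}(G,H)$ is $1 - \Lambda_{G,H}$.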

The following statement appears in [Guo15, Prop. 3.5]. We include the proof for completeness.

Proposition 3.5 (Guo).

Let be groups. The maximum agreement can equivalently be defined with replaced by , i. e.,

Here we use the convention that the maximum of the empty set (of nonnegative numbers) is zero. Otherwise we would need to make the additional assumption .

Proof.

Let .

Obviously . Now let . So, by Eq. (3), there exist and be such that for all we have and . It follows that if then . Hence is either zero or equal to , proving that . ∎

Corollary 3.6.

Let be a finite group and a group. Then, , the largest density of a proper subgroup of .

Fact 3.7.

Let and be groups and a subset. If and for all , then . ∎

Corollary 3.8.

Let be a finite group, a group, and , such that . If are such that for all , then .

Remark 3.9 (Why affine?).

The reader may ask why we (and all prior work) consider affine homomorphisms rather than homomorphisms. The reason is that affine homomorphisms are simply the more natural objects in this context. To begin with, this object is more homogeneous. For instance, for finite $H$, under random affine homomorphisms, the images of any element $g \in G$ are uniformly distributed over $H$.

This uniformity also serves as an inductive tool: when extending the domain from a subgroup $G_0$ to a group $G$, the action of any homomorphism can be split into actions on the cosets of $G_0$ in $G$. Those actions are affine homomorphisms. On the other hand we also note that list-decoding of $\mathrm{Hom}(G,H)$ and $\mathrm{aHom}(G,H)$ are essentially equivalent tasks; see Section 5.5.

3.3 Computational representations of groups and homomorphisms

In this section we discuss the models of access to groups required by our algorithms. The choice of the model significantly impacts the running time and even the feasibility of an algorithm.

The models include oracle models (black-box access, black-box groups), generator-relator presentations, and various explicit models such as permutation groups, matrix groups, direct products of cyclic groups of known orders.

Recall that our domain groups are always finite but the codomain may be infinite (Convention 1.1).

Recall also that homomorphisms will be represented by the list of their values on a set of generators.

3.3.1 Black-box models

If the codomain is infinite, and even if it is finite but very large, the black-box-group model with its fixed-length encoding [BS84a] is not appropriate (see “encoded groups” below). We start with an extension of that model.

Definition 3.10 (Black-box access).

An unencoded black-box representation of a (finite or infinite) group is an ordered -tuple

where

  • is a (possibly infinite) set;

  • with ;

  • with for all ;

  • with for ; and

  • with if and only if is the identity in .

We say that an algorithm has black-box access to the group if the algorithm can store elements of and query the functions (oracles) . We say that is given as an (unencoded) black-box group if in addition a list of generators of is given.
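A minimal sketch of what black-box access might look like in code is given below. The class name, field names, and the toy group are assumptions made for illustration. The point it tries to convey is that element names need not be unique, so equality of group elements has to be decided through the identity oracle.

```python
from dataclasses import dataclass
from typing import Callable, Hashable

@dataclass(frozen=True)
class BlackBoxAccess:
    """Oracles acting on opaque names; several names may denote one group element."""
    mult: Callable[[Hashable, Hashable], Hashable]
    inv: Callable[[Hashable], Hashable]
    is_identity: Callable[[Hashable], bool]

    def equal(self, u, v) -> bool:
        # Names need not be unique, so equality goes through the identity oracle.
        return self.is_identity(self.mult(u, self.inv(v)))

def cyclic_access(n: int) -> BlackBoxAccess:
    """Toy example: Z_n, where every integer u is a name for the residue u mod n."""
    return BlackBoxAccess(
        mult=lambda u, v: u + v,
        inv=lambda u: -u,
        is_identity=lambda u: u % n == 0,
    )

Z12 = cyclic_access(12)
```

In this toy model the names 5 and 17 denote the same element of Z_12, which `Z12.equal(5, 17)` detects with one multiplication, one inversion, and one identity query.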

Remark 3.11.

We emphasize that the difference between black-box access to a group and the group being given as a black-box group is that in the latter model, a list of generators of is given, whereas no elements of may be a priori known in the former.

If then we talk about an encoded group, of encoding length . This of course implies that is finite, namely, . (This is the model introduced in [BS84b].)

In an abuse of notation, when black-box access to a group is given, we may refer to elements of by their images under , we may write in place of , we may write in place of , and we may write in place of .

Access to domain and codomain. In general we shall not need generators of the codomain, , just black-box access. On the other hand, we do need generators of the domain, ; homomorphisms will be defined by their values on a set of generators. So our access to the domain will be assumed to be at least as strong as an (encoded) black-box group.

The black-box unit cost model. The (unencoded) black-box access model is particularly well suited to the unit-cost model where we assume that we can copy and store an element of and query an oracle at unit cost. We shall analyze our algorithms in the unit-cost model for the codomain . This essentially counts the operations performed in , so its bit-cost will incur an additional factor of (if is finite and nearly optimally encoded).

Random generation. In encoded black-box groups, independent nearly uniform random elements can be generated in time polynomial in the encoding length [Bab91].

Remark 3.12.

Black-box groups have been studied in a substantial body of literature, both in the theory of computing and in computational group theory (see the references in [BBS09]). It is common to make additional access assumptions to a black-box group (assume additional oracles) such as an oracle for the order of the elements.

Given a black-box group , we cannot determine the order or the order of a given element . In fact, even with an oracle for the order of elements, and cannot be distinguished in fewer than

randomized black-box identity queries.

To avoid such obstacles, it is common to assume additional information beyond black-box access. In finding for abelian domain one needs to decide if a given prime divides . To accomplish this, we assume additional information about the group such as the order or the list of primes dividing .

3.3.2 Generator-relator presentation, homomorphism checking

By “presentation” of a group we mean generator-relator presentation.

For a group given by a presentation, basic questions, such as whether the group has order 1, are undecidable. However, special types of presentations, such as polycyclic presentations of finite solvable groups, are often helpful. Note, however, that it is not known how to efficiently perform group operations in a finite solvable group given by a polycyclic presentation, so such presentations cannot answer basic black-box queries.

Any presentation, however, can be used for homomorphism checking, a critical operation in decoding homomorphism codes.

Proposition 3.13 (Homomorphism checking).

Let be a list of generators of . Assume a presentation of is given in terms of . Let be a function. Then extends to a homomorphism if and only if the list satisfies the relations.

Note that this gives an efficient way to check whether extends to a homomorphism if the relators are short or are given as short straight-line programs, assuming black-box access to the codomain.
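The check described in Proposition 3.13 can be sketched as follows. The example assumes the standard presentation of S_3 on two generators and a codomain of permutations stored as tuples; the word encoding and all function names are choices made for this illustration.

```python
def compose(p, q):
    """Product of permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    out = [0] * len(p)
    for i, j in enumerate(p):
        out[j] = i
    return tuple(out)

def evaluate(word, images, identity):
    """Evaluate a word [(generator, +1 or -1), ...] on the proposed images."""
    value = identity
    for gen, exp in word:
        factor = images[gen] if exp == 1 else inverse(images[gen])
        value = compose(value, factor)
    return value

def extends_to_homomorphism(relators, images, identity):
    """The map on generators extends iff every relator evaluates to the identity."""
    return all(evaluate(w, images, identity) == identity for w in relators)

# S_3 = <a, b | a^2, b^3, (ab)^2>
relators = [
    [("a", 1), ("a", 1)],
    [("b", 1), ("b", 1), ("b", 1)],
    [("a", 1), ("b", 1), ("a", 1), ("b", 1)],
]
e = (0, 1, 2, 3)
good = {"a": (1, 0, 2, 3), "b": (0, 1, 2, 3)}   # a -> transposition, b -> identity
bad = {"a": (1, 0, 2, 3), "b": (1, 2, 3, 0)}    # b -> 4-cycle, violates b^3
```

Only multiplication, inversion, and equality with the identity are used in the codomain, which is what black-box access provides.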

Definition 3.14.

Let be a group and a list of elements of . A straight-line program in from to is a sequence of elements of such that each is either a member of or a product of the form for some or for some . We say that the element is given in terms of by the straight-line program .
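To make the definition concrete, here is a small evaluator for straight-line programs. The instruction encoding (`"gen"`, `"mul"`, `"inv"` tuples) is an assumption of this sketch. The example program reaches a power with exponent 2^10 in 11 steps by repeated squaring, which shows how a short program can describe a long word.

```python
def run_slp(program, generators, mult, inv):
    """Evaluate a straight-line program and return all intermediate elements.

    Each instruction is ("gen", i), ("mul", j, k) or ("inv", j), where j and k
    index earlier positions of the program.
    """
    values = []
    for step in program:
        if step[0] == "gen":
            values.append(generators[step[1]])
        elif step[0] == "mul":
            values.append(mult(values[step[1]], values[step[2]]))
        elif step[0] == "inv":
            values.append(inv(values[step[1]]))
        else:
            raise ValueError(f"unknown instruction {step!r}")
    return values

def power_of_two_program(t):
    """Program of length t + 1 reaching g^(2^t) by repeated squaring."""
    return [("gen", 0)] + [("mul", i, i) for i in range(t)]

# Toy group: nonzero integers mod the prime 1009 under multiplication, generator 3.
p = 1009
result = run_slp(
    power_of_two_program(10),
    [3],
    lambda x, y: x * y % p,
    lambda x: pow(x, -1, p),
)[-1]
```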

The following is well known.

Proposition 3.15.

Let be a permutation group and a set of generators of . Given , a presentation of in terms of can be computed in polynomial time, where the relators returned are represented as straight-line programs.

3.3.3 Abelian groups

The invariant factor decomposition of a finite abelian group is a decomposition as a direct product of cyclic groups, , where for each , the integer divides . Such a decomposition can be further split into a direct product of cyclic groups of prime power order; the result is a primary decomposition.

Any abelian presentation of a finitely generated abelian group can be converted into an invariant factor decomposition in polynomial time, using the Smith normal form. However, moving to a primary decomposition requires factoring the order of the group; to this end, knowing a superset of the prime divisors of the order suffices. All prior algorithmic results as well as those of the present paper on homomorphism codes with abelian domain require the primary decomposition.
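The step from an invariant factor decomposition to a primary decomposition can be sketched as below, assuming a list of primes containing every prime divisor of the group order is supplied. The function name and interface are hypothetical; the splitting itself is the Chinese Remainder Theorem applied to each cyclic factor.

```python
def primary_decomposition(invariant_factors, primes):
    """Split Z_{d_1} x ... x Z_{d_k} into cyclic factors of prime power order.

    `primes` must contain every prime dividing the group order; extra primes
    are harmless. Returns the sorted list of prime powers.
    """
    parts = []
    for d in invariant_factors:
        rest = d
        for p in primes:
            q = 1
            while rest % p == 0:   # extract the full power of p dividing d
                rest //= p
                q *= p
            if q > 1:
                parts.append(q)
        if rest != 1:
            raise ValueError(f"{d} has a prime factor outside the supplied set")
    return sorted(parts)

# Z_2 x Z_12 (invariant factors 2 | 12) splits as Z_2 x Z_3 x Z_4.
parts = primary_decomposition([2, 12], [2, 3, 5])
```

No factoring is performed: the procedure only uses trial division by the supplied primes, which is why a superset of the prime divisors suffices.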

4 Formal statements

4.1 List-decoding homomorphism codes

Let be a class of pairs of groups. We say that is CombEcon if the class of codes is CombEcon. We define CertEcon and AlgEcon classes of pairs of groups analogously.

Denote by the class of all groups, finite or infinite. Recall that we say that a class of finite groups is universally CombEcon if is CombEcon. We define universally CertEcon and universally AlgEcon analogously, under access models to be specified.

A common feature of the prior work reviewed in Section 1.1 is that each class of pairs of groups considered was CombEcon and AlgEcon.

The present work continues to maintain this feature.

All previously existing results put structural restrictions both on the domain and the codomain. In particular, they were restricted to subclasses of the solvable groups. In this paper we extend the economical list-decodability (both combinatorial and algorithmic) in the following three directions.

  1. We give a general principle for removing certain types of constraints on the domain (see Section 4.2). It will follow that the previously known results extend to arbitrary domains.

  2. We find universally economically list-decodable classes of groups.

    Specifically, abelian and alternating groups are universally CombEcon. Moreover, abelian groups are universally AlgEcon, and alternating groups are universally CertEcon, under modest access assumptions.

  3. We exhibit the first (nontrivial) classes of examples where the domain is not solvable.

We note that no CombEcon bounds appear to be known for the much-studied classical linear codes (Reed–Solomon, Reed–Muller, BCH) (cf., e.g., [BL15]).

The CombEcon bound for Hadamard codes is quadratic [GL89]. For abelian and nilpotent groups, it currently has degree 105 [DGKS08, GS14].

4.2 Extending the domain: the irrelevant kernel

In the prior work reviewed, both the domain and the codomain were abelian or close to abelian (nilpotent or supersolvable). It is natural to ask how to further relax the structural constraints on the groups involved.

We point out that structural constraints such as nilpotence or solvability (or any other hereditary property) play a very different role when imposed on the domain than when imposed on the codomain. For instance, a combinatorial list-decoding bound on abelian→abelian homomorphism codes implies the same bound for arbitrary→abelian homomorphism codes. This is shown by reducing list-decoding for arbitrary and abelian to mean-list-decoding , where is the commutator subgroup of , so is the largest abelian quotient of . A similar argument extends the bounds for nilpotent→nilpotent homomorphism codes to arbitrary→nilpotent, working through the largest nilpotent quotient of .

Similar results hold for certificate and algorithmic list-decoding.

In general, we can replace by its relevant quotient , where is the irrelevant kernel (intersection of the kernels of all homomorphisms), see Sec. 5.4.

While this observation extends the reach of the results of Dinur et al. [DGKS08] and Guo and Sudan [GS14], it also shows that, in a sense, the gains by extending the class of groups serving as the domains, without relaxing the structural constraints on the codomains, are virtual, and the main impediment to extending these results to wider classes of pairs of groups is the structural constraints on the codomain.

Our main contribution is the elimination of all constraints on the codomain.

This also opens up the question of meaningfully (as opposed to “virtually”) removing structural constraints on the domain side. Of particular interest becomes the case where the domain is a finite simple group and the codomain is arbitrary. We initiate this direction by studying the class of alternating groups as domains.

Definition 4.1 (Irrelevant kernel).

Let and be groups. The -irrelevant kernel (or “irrelevant kernel” if and are clear) is the intersection of the kernels of all homomorphisms, i.e.,

(4)

We call elements and subgroups of the irrelevant kernel irrelevant.

For instance, if is abelian, then the commutator subgroup is irrelevant.
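For very small groups the irrelevant kernel can be computed by exhaustive search, which makes the commutator-subgroup example tangible. The sketch below is a brute-force illustration only (it enumerates all maps between the two groups); the domain S_3 and the cyclic codomains are choices made for this example.

```python
from itertools import permutations, product

def compose(p, q):
    """Product of permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def irrelevant_kernel(G, mult_G, H, mult_H, id_H):
    """Intersect the kernels of all homomorphisms G -> H (exhaustive search)."""
    G = list(G)
    kernel = set(G)
    for values in product(H, repeat=len(G)):
        f = dict(zip(G, values))
        # Keep only the maps that respect multiplication.
        if all(f[mult_G(x, y)] == mult_H(f[x], f[y]) for x in G for y in G):
            kernel &= {x for x in G if f[x] == id_H}
    return kernel

S3 = list(permutations(range(3)))
N2 = irrelevant_kernel(S3, compose, [0, 1], lambda a, b: (a + b) % 2, 0)
N3 = irrelevant_kernel(S3, compose, [0, 1, 2], lambda a, b: (a + b) % 3, 0)
```

With codomain Z_2 the only homomorphisms are the trivial map and the sign map, so the irrelevant kernel is the alternating group A_3, which is also the commutator subgroup of S_3. With codomain Z_3 only the trivial homomorphism exists, so all of S_3 is irrelevant.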

Theorem 4.2.

Let be an irrelevant normal subgroup of . Then, . Moreover,

  1. if is CombEcon then is CombEcon;

  2. if is CertEcon then is CertEcon;

  3. if is AlgEcon then is AlgEcon.

For items 2 and 3 we need to make suitable assumptions on access to the domain.

For the proofs and discussion, see Section 5.2. The proofs rely on mean-list-decoding (Theorem 2.13).

Corollary 4.3.

The code is AlgEcon for any finite group and any finite nilpotent group .

Proof.

Combine Theorem 4.2 and the main result of [GS14]. For abelian , use [DGKS08] instead. ∎

4.3 List-decoding: abelian→arbitrary

We state our main result about abelian domains.

Theorem 4.4.

If is a finite abelian group, then is universally CombEcon and strong AlgEcon list-decodable.

The degree of the list-size bound is where is the bound for abelian→abelian homomorphism codes (currently 105 [GS14]).

The proof of the CombEcon bound is based on the following structural result that says the range of all relevant homomorphisms is covered by a small number of finite abelian subgroups of .

Theorem 4.5 (Structure of range).

Let be a finite abelian group, an arbitrary group (finite or infinite), a function, and . Then there exists a set of finite abelian subgroups of with such that for all , there is such that .

Access model.

We need to clarify how the algorithm accesses the domain and codomain. Following [DGKS08, GS14, BGSW18], we assume the domain is given explicitly as a direct product of cyclic groups of prime power order. We remark that representing the domain in terms of a presentation by generators and abelian relations would suffice, if we are also given a superset of the prime divisors of the order of the domain. Without that additional information, factoring would be required (see Section 3.3.3). We only require black-box access to the codomain (see Definition 3.10).

Pointer.

We prove Theorems 4.4 and 4.5 in Section 8. The essential new result is the CombEcon bound, proved in Section 8.2. The algorithm is an adaptation of the algorithm of [DGKS08, GS14], based on our CombEcon bound. This adaptation will be discussed in Section 8.3.

4.4 Shallow random generation and list-decodability

We shall consider groups with the property that a bounded number of random elements tend to generate a subgroup of bounded depth (see Definitions 4.9 and 4.10 below). This class includes the alternating groups. We show that groups in this class are CombEcon, and under minimal assumptions on access they are also CertEcon.

It will be useful to consider an -independent lower bound on the quantity .

Definition 4.6.

We define .

Observation 4.7.

For simple groups the following three quantities are equal: (a)  , (b)  , and (c) the largest fraction of elements of fixed by an automorphism.

Observation 4.8.

For , , we have .

The depth of a subgroup in a group is the length of the longest subgroup chain . We say that a subgroup is “shallow” if its depth is bounded. It follows from a result of [Babai1989] that already a pair of elements in generates a subgroup of depth at most 6. This is the property that we generalize.

Definition 4.9 (Shallow random generation).

Let . We say that a finite group is -shallow generating if

(5)

Definition 4.10 (SRG groups).

We say that a class of finite groups has shallow random generation ( is SRG) if there exist such that all are -shallow generating.

Lemma 4.11.

The alternating groups are SRG groups. In particular, for sufficiently large , the alternating group is -shallow generating.

We prove this lemma in Section 10.1. We note that certain classes of Lie type simple groups are also SRG. We shall elaborate on this in a separate paper.

Now we can state one of the main results of this paper.

Theorem 4.12.

If is an SRG group, then is universally CombEcon list-decodable.

For the case of alternating groups, we show that the degree of the list-size bound is at most 9; with further work this can be reduced to 7.

Theorem 4.13.

If is an SRG group, then is universally strong CertEcon list-decodable.

In fact, SRG groups are universally strong -CertEcon list-decodable (see Section 2.6 for the definition of -certificates and Section 4.5 for the definition of ). This restriction on the type of certificates we obtain is necessary for extensions to AlgEcon results (cf. comment before Definition 2.18). Section 4.5 discusses -certificates in the context of homomorphism codes. A formal statement of the -CertEcon result is given in Section 4.6.

Access model.

For the CertEcon results, we assume access to (nearly) uniform random elements of the domain. We do not multiply elements of the domain, so we do not need black-box access to the domain. However, representing the domain as an encoded black-box group suffices for random generation (see Sec. 3.3.1).

We need no access to the codomain.

Pointers.

We prove the CombEcon result in Section 11.1 and the CertEcon result in Section 11.2. For alternating groups we also give another, non-algorithmic, proof of the CombEcon result in Section 9. That proof relies on a generic sphere packing argument to split the sphere into more tractable bins (see Lemma 7.3 and Section 9.2).

4.5 Certificate list-decoding for homomorphism codes

First we translate the concepts associated with certificate list-decoding (Section 2.4) to the context of homomorphism codes. A certificate is a partial map that extends uniquely to an affine homomorphism .

A subword extender is an algorithm that extends a partial map to a full homomorphism if possible.

Recall that for a subset , we denote by the density of in . For notational simplicity, we write for .

Notation 4.14.

Let (resp. ) be the set of partial maps such that (resp. ).

Recall that we have introduced certificate list-decoding as an intermediate step towards algorithmic list-decoding, to address technical difficulties that arise in algorithmic list-decoding in the alternating case. Our plan is to apply Observation 2.24 on subword extension with .

Observation 4.15.

If a partial map belongs to , then extends to at most one affine homomorphism in .

We will find -certificate-list-decoders for a large class of homomorphism codes, and we wish to find corresponding -subword-extenders.

Let be a partial map. We present three conditions on , then discuss their relationships to each other as well as to list-decoding.

  1. If extends to an affine homomorphism in , then the extension is unique, i.e., is a certificate for some affine homomorphism.

  2. .

  3. The affine closure of is .

Clearly, Condition (3) implies Condition (2), which implies Condition (1). Implications in the other direction do not hold in general. In particular, neither reverse implication holds for the alternating groups.

Algorithmic list-decoding requires a list of full affine homomorphisms. (Recall that affine homomorphisms are represented as partial maps satisfying Condition (3).)

Certificate list-decoding requires the list of partial maps to satisfy Condition (1). Our CertEcon algorithms actually return certificates satisfying Condition (2), i. e., they are -certificate-list-decoders.

In the case of abelian , Condition (3) is equivalent to Condition (1) if the irrelevant kernel is trivial (see Definition 4.1). So, in this case certificate list-decoding and algorithmic list-decoding are equivalent. We introduced the mean-list-decoding machinery to address the case of nontrivial irrelevant kernel (see Theorems 2.13 and 5.18).

4.6 Certificate list-decoding: SRG→arbitrary

Recall that, in the context of list-decoding , denotes the set of partial maps such that , where . We state the promised strengthening of Theorem 4.13.

Theorem 4.16 (SRG certificate, abridged).

If is an SRG group, then is universally strong -CertEcon list-decodable.

Access model.

We assume access to (nearly) uniform random elements of the domain. We do not multiply elements of the domain. We remark that representing the domain as a black-box group would suffice for random generation [Bab91].

We need no access to the codomain. We obtain elements of the codomain by querying the received word. We shall not perform any group operations in the codomain.

Actually our conclusion is much stronger than what would be implied by Theorem 4.16.

Theorem 4.17 (SRG certificate, unabridged).

Let be a -shallow generating group and an arbitrary group. We have a local algorithm with the following features.

Input: Values .
Output: A set of -tuples in , where

Cost: amount of work.
Performance guarantee: For every received word , with probability at least , the set is a -certificate-list for up to distance of .

Access model.

Same as in Theorem 4.16.

Pointer.

The proof of Theorems 4.16 and 4.17 can be found in Section 11.2.

Remark 4.18.

Given that is -shallow generating (Lemma 4.11), Theorem 4.17 applies to with . We think of being given in its natural permutation representation. We note that a representation of as a black-box group would suffice, because the natural permutation representation of an alternating group can be efficiently extracted from a black-box group representation [BBtoAlt].

4.7 Algorithmic list-decoding, assuming certificate list-decoding and homomorphism extension

In the light of Observation 2.24 (a certificate-list-decoder and a subword extender combine to a list-decoder) and the CertEcon results stated above, we need subword extenders for homomorphism codes.

The homomorphism extension problem is the same as the subword extension problem for . We shall see below that it can also be used to solve the subword extension problem for .

The Homomorphism Extension Problem asks whether a partial map extends to a homomorphism on the whole group. As before, let .

Definition 4.19.

(Homomorphism Extension, )
Instance: A partial map .
Solution: A homomorphism that extends , i. e., .

The Homomorphism Extension Decision Problem asks whether a solution exists. The Homomorphism Extension Search Problem asks to determine whether a solution exists and, if so, to find one.

Let denote the subgroup of generated by the domain of . The Homomorphism Extension problem splits into the following two questions.

  • (a) Does extend to an homomorphism? (If such an extension exists, it is clearly unique.)

  • (b) Given an homomorphism, does it extend to a homomorphism?

Question (a) can be solved efficiently if a presentation of is available in terms of the set of generators and we have black-box access to (see Prop. 3.13). Such a presentation can always be found efficiently if is given as a permutation group (Prop. 3.15).

The difficult problem is to extend a homomorphism from to . For we are only able to do this when has polynomial index in . Therefore we consider the threshold version of the problem.

Definition 4.20.

(Homomorphism Extension with Threshold, )
Instance: A number and a partial map satisfying .
Solution: A homomorphism that extends , i. e., .

Note that, if , then an oracle for can answer the queries.

Next we reduce the extension problem for affine homomorphisms to the HomExt problem, i. e., the extension problem for homomorphisms.

Proposition 4.21.

Let and be groups to which we are given black-box access. Then, a subword extender for can be implemented in -time in the unit-cost model for , assuming we are given an oracle for the search problem — that is, we have a subword extender for .

Proof.

Let be a partial map. If then the map to the identity element of extends . Otherwise, fix . Let by . Then, extends to a homomorphism if and only if extends to an affine homomorphism , with for all . ∎
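The reduction in this proof can be illustrated in a toy setting. The sketch below assumes cyclic groups Z_n and Z_m in additive notation and uses one standard normalization (translate the partial map so that a fixed domain point goes to the identity); the brute-force `hom_ext_cyclic` is a stand-in for the HomExt search oracle, and all names are hypothetical.

```python
def hom_ext_cyclic(n, m, partial):
    """Toy HomExt search oracle for Z_n -> Z_m.

    Returns a with x -> a*x (mod m) extending `partial`, or None if no
    homomorphism extends it. Brute force; stands in for a real oracle.
    """
    for a in range(m):
        if (n * a) % m == 0 and all((a * x) % m == y % m for x, y in partial.items()):
            return a
    return None

def affine_ext_cyclic(n, m, partial):
    """Extend a partial map to an affine homomorphism x -> a*x + b, or return None.

    Keys of `partial` are assumed to be distinct residues in range(n).
    """
    if not partial:
        return (0, 0)                      # the constant map to the identity
    x0, y0 = next(iter(partial.items()))   # fix one point of the domain
    # Translate so that the fixed point maps identity to identity.
    shifted = {(x - x0) % n: (y - y0) % m for x, y in partial.items()}
    a = hom_ext_cyclic(n, m, shifted)      # single oracle call
    if a is None:
        return None
    return (a, (y0 - a * x0) % m)          # phi(x) = y0 + a*(x - x0)

ext = affine_ext_cyclic(6, 6, {1: 3, 2: 5})
none = affine_ext_cyclic(6, 6, {0: 0, 1: 1, 2: 3})
```

The extender makes one oracle call plus a linear number of group operations on the values of the partial map, in line with the unit-cost accounting used in the proposition.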

Since Theorem 4.13 guarantees -certificate-lists, we need only provide a -subword extender (see Observation 2.24). In this case, the HomExt oracle may be relaxed to account for this restriction on certificates. Further elaborating on the comment after Remark 2.17, we note that this relaxation is critical to our application to the alternating group. While we are able to solve for partial maps whose domain generates subgroups of polynomial index, we see little hope of solving it for all partial maps on .

The next result is the -relaxation of Proposition 4.21.

Proposition 4.22.

Let and be groups to which we are given black-box access. Suppose we are given an oracle for the search problem. Then, a -subword extender for can be implemented in time in the unit-cost model for .

The proof is the same as that of Proposition 4.21.

Remark 4.23.

In practice we may not be able to determine the value of , while we may be able to determine a rather large lower bound . So, we instead ask for an oracle for . This is the procedure we follow in this paper for the alternating group.

Corollary 4.24.

Let be an SRG group and be an arbitrary group. Under the assumptions of Proposition 4.22, is AlgEcon.

Proof.

Combine Proposition 4.22 and Theorem 4.16. ∎

4.8 Homomorphism extension from alternating groups

The following theorem addresses the HomExt Search Problem for the permutation representations of the alternating groups. This is the main result of [Wuu18].

Theorem 4.25 (Wuu).

Let , and . If , then the search problem can be solved in time.

Remark 4.26.

In fact, under the assumptions of Theorem 4.25, the number of extensions can be counted in time.

This result is proved by looking at the orbits in of the group generated by the domain of the partial function, then deciding how they may combine to form orbits of . We reformulate HomExt with symmetric codomain as an exponentially large instance of a generalized Subset Sum Problem to which we have oracle access. The technical assumption guarantees that the arising instance of generalized Subset Sum is tractable. Answering oracle queries amounts to solving certain problems of computational group theory such as the conjugacy problem for permutation groups.

4.9 Algorithmic list-decoding: alternating→symmetric, restricted cases

We need one more ingredient before we can prove our main algorithmic result.

Lemma 4.27.

Let . Let and let be a group. If , then either or .

Proof.

We note that , since the identity automorphism of and the automorphism that sends to its conjugation by the transposition agree on which has index (In fact, ).

Suppose , so is nontrivial. Since is simple, contains an isomorphic copy of (The image of a nontrivial homomorphism is isomorphic to ). So, . By Fact 3.4 and the Jordan-Liebeck Theorem (see Section 9.1), or .

We remark that Guo [Guo15, Proposition 6.1] proved that for .