 # Popularity of patterns over d-equivalence classes of words and permutations

Two words of the same length are d-equivalent if they have the same descent set and the same underlying alphabet. In particular, two permutations of the same length are d-equivalent if they have the same descent set. The popularity of a pattern in a set of words is the overall number of copies of the pattern within the words of the set. We show the far-from-trivial fact that two patterns are d-equivalent if and only if they are equipopular over any d-equivalence class, and this equipopularity does not follow obviously from a trivial equidistribution.


## 1 Introduction and notation

We consider words over the set of positive integers; permutations are particular words. For , denotes the alphabet and the underlying alphabet of a word is the set of symbols occurring in the word. For instance, the -ary words and have different underlying alphabets, namely and . A descent in a word $w = w_1 w_2 \ldots w_n$ is a position $i$ with $w_i > w_{i+1}$ and the descent set of $w$, denoted , is the set of all such $i$; ascent and ascent set, , are defined similarly. Two words of the same length are d-equivalent if they have the same descent set and the same underlying alphabet. For instance the words and have the same descent set but not the same underlying alphabet (thus they are not d-equivalent), whereas and are d-equivalent (the common underlying alphabet is ). If and are d-equivalent and is a permutation, it follows that so is . A d-equivalence class is a maximal set of d-equivalent words. For a word $w = w_1 w_2 \ldots w_n$ the reverse of $w$ is the word $w_n w_{n-1} \ldots w_1$; and the complement of is the word with the maximal entry of .
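The notions above can be made concrete with a minimal sketch. Words are represented as tuples of positive integers and positions are 1-indexed as in the text. The function names, the sample words, and the reading of the complement as $w_i \mapsto m + 1 - w_i$ (with $m$ the maximal entry) are assumptions made for illustration only.

```python
def descent_set(w):
    """Positions i (1-indexed) with w_i > w_{i+1}."""
    return {i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1]}


def ascent_set(w):
    """Positions i (1-indexed) with w_i < w_{i+1} (assumed strict, like descents)."""
    return {i + 1 for i in range(len(w) - 1) if w[i] < w[i + 1]}


def underlying_alphabet(w):
    """Set of symbols occurring in the word."""
    return set(w)


def d_equivalent(u, v):
    """Same length, same descent set, same underlying alphabet."""
    return (
        len(u) == len(v)
        and descent_set(u) == descent_set(v)
        and underlying_alphabet(u) == underlying_alphabet(v)
    )


def reverse(w):
    """The word read from right to left."""
    return tuple(reversed(w))


def complement(w):
    """Assumed reading: replace each entry x by m + 1 - x, m the maximal entry."""
    m = max(w)
    return tuple(m + 1 - x for x in w)


# Hypothetical sample words (not the ones used in the paper).
u, v, x = (1, 3, 2, 3), (2, 3, 1, 3), (1, 4, 2, 4)
print(d_equivalent(u, v))  # same descent set and same alphabet
print(d_equivalent(u, x))  # same descent set, different alphabet
```

Here `u` and `x` share the descent set but not the underlying alphabet, which is exactly the situation the text singles out as not being d-equivalent.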

A pattern is a word with the property that if occurs in it, then so does , for any with , and the reduction of , , is the unique pattern order isomorphic with . The pattern occurs in the word , , if has a subword order isomorphic with , see for instance Kitaev’s seminal book  on this topic. The number of occurrences of the pattern in the word is denoted by . For a set of words , becomes an integer valued statistic on and the overall number of occurrences of within the words of is called the popularity of in ; more formally, the popularity of in is .
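As a rough illustration of reduction, pattern occurrence and popularity, the following brute-force sketch enumerates all subwords of the right length and compares reductions. The function names and the sample data are hypothetical, and the exhaustive enumeration is chosen for clarity, not efficiency.

```python
from itertools import combinations


def reduction(w):
    """Unique pattern order-isomorphic with w: rank the distinct symbols from 1."""
    ranks = {x: r for r, x in enumerate(sorted(set(w)), start=1)}
    return tuple(ranks[x] for x in w)


def occurrences(pattern, w):
    """Number of subwords of w that are order-isomorphic with the pattern."""
    p = reduction(pattern)
    return sum(
        1
        for positions in combinations(range(len(w)), len(p))
        if reduction(tuple(w[i] for i in positions)) == p
    )


def popularity(pattern, words):
    """Overall number of occurrences of the pattern within a set of words."""
    return sum(occurrences(pattern, w) for w in words)


print(reduction((2, 5, 2, 7)))
print(occurrences((1, 2, 1), (1, 3, 1, 2, 1)))
print(popularity((2, 1), [(2, 1, 3), (3, 2, 1)]))
```

Two words with repeated symbols are order-isomorphic exactly when their reductions coincide, which is why comparing reductions is enough here.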

The equidistribution of two patterns implies their equipopularity, and there has recently been growing interest in patterns that have the same popularity but not the same distribution on particular classes of words or permutations; see for instance [1, 2, 5, 8].

In this paper we show that two patterns are d-equivalent if and only if they have the same popularity on any d-equivalence class. Specializing to permutations, we obtain that two permutations of the same length have the same descent set if and only if they have the same popularity on any descent-set equivalence class of permutations.
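The permutation case of this statement can be checked by brute force on small instances. The sketch below is only a sanity check under our own choice of parameters: the patterns 132, 231, 213 and the length 4 are illustrative, not taken from the paper, and the helper names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations, permutations


def descent_set(w):
    """Positions i (1-indexed) with w_i > w_{i+1}."""
    return frozenset(i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1])


def reduction(w):
    """Rank the distinct symbols of w from 1, keeping their relative order."""
    ranks = {x: r for r, x in enumerate(sorted(set(w)), start=1)}
    return tuple(ranks[x] for x in w)


def occurrences(pattern, w):
    """Number of subwords of w order-isomorphic with the pattern."""
    return sum(
        1
        for pos in combinations(range(len(w)), len(pattern))
        if reduction(tuple(w[i] for i in pos)) == pattern
    )


def popularity_by_descent_class(pattern, n):
    """Map each descent set D to the popularity of the pattern on the
    permutations of length n whose descent set is D."""
    pop = defaultdict(int)
    for w in permutations(range(1, n + 1)):
        pop[descent_set(w)] += occurrences(pattern, w)
    return dict(pop)


p132 = popularity_by_descent_class((1, 3, 2), 4)
p231 = popularity_by_descent_class((2, 3, 1), 4)
p213 = popularity_by_descent_class((2, 1, 3), 4)

# 132 and 231 share the descent set {2}; 213 has descent set {1}.
print(p132 == p231)
print(p132 == p213)
```

For length 4, the patterns 132 and 231 agree on every descent class, while 132 and 213 already differ on the class with descent set {3}.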

## 2 Preliminary notions and results

### 2.1 d-equivalence vs. f-equivalence

Here we show that any two d-equivalent patterns can be obtained from each other by a sequence of f-transformations, which are ‘small changes’ preserving the d-equivalence class.

#### Lexicographically smallest pattern in a d-equivalence class

For our purposes we need the lexicographically minimal pattern d-equivalent with a given pattern , which in turn requires two other particular patterns and that we define below.

The descent word of a length $n$ word $w$ is the binary word $b = b_1 b_2 \ldots b_n$ where $b_i = 1$ if and only if $i$ is a descent in $w$ (and so, $b_n$ is a redundant $0$). The minimal arity of a pattern having descent word $b$ is one more than the maximal number of consecutive $1$s in $b$, and we denote by $\alpha$ the lexicographically smallest pattern having minimal arity and descent word $b$. It is easy to see that the pattern $\alpha = \alpha_1 \alpha_2 \ldots \alpha_n$ is defined as: for each $i$, $1 \leq i \leq n$,

$$\alpha_i = \min\{j : j \geq i,\ b_j = 0\} - i + 1.$$

###### Example 1.

With and , we have and (the descents are in bold).
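The formula for $\alpha_i$ translates directly into code. In this sketch the descent word is a tuple of 0s and 1s ending with 0; the function names and the sample descent words are hypothetical and are not the ones of Example 1.

```python
def descent_word(w):
    """Binary word b with b_i = 1 iff i is a descent of w; b_n is a redundant 0."""
    return tuple(1 if w[i] > w[i + 1] else 0 for i in range(len(w) - 1)) + (0,)


def alpha(b):
    """alpha_i = min{ j : j >= i, b_j = 0 } - i + 1, with 1-indexed positions."""
    n = len(b)
    assert b[-1] == 0, "the last letter of a descent word is always 0"
    result = []
    for i in range(1, n + 1):
        j = next(j for j in range(i, n + 1) if b[j - 1] == 0)
        result.append(j - i + 1)
    return tuple(result)


b = (1, 0, 0, 1, 1, 0)  # hypothetical descent word
a = alpha(b)
print(a)
print(descent_word(a) == b)
```

Each maximal block of consecutive 1s followed by its closing 0 yields a decreasing run ending in 1, so the arity is one more than the longest block of 1s, as stated in the text.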

The maximal arity of a pattern having descent word $b$ is $n$, and we denote by $\omega$ the lexicographically smallest $n$-ary pattern having descent word $b$, which is necessarily a length $n$ permutation. We divide the descent word $b$ into runs: the length maximal factors of the form $1 \ldots 1 0$ with at least one occurrence of $1$ are descent runs and, for convenience, we call the remaining length maximal $0$s factors (if any) ascent runs. We define an order relation on $\{1, 2, \ldots, n\}$: for two integers $i$ and $j$, $i \neq j$, we say that $i$ precedes $j$, with respect to the binary word $b$, if

• $i$ and $j$ are in two distinct runs in $b$, and $i < j$, or

• $i$ and $j$ are in the same ascent run in $b$, and $i < j$, or

• $i$ and $j$ are in the same descent run in $b$, and $i > j$.

The desired permutation is precisely that induced by this order relation:

$$\omega_i = \text{the rank of } i \text{ in } \{1, 2, \ldots, n\} \text{ in the precedence order},$$

and $\omega$ is at the same time the lexicographically minimal word of maximal arity (that is $n$) having descent word $b$ and, as we will see below, $\omega$ defines an order in which we cover the entries of a pattern with descent word $b$.

###### Example 2.

If is as in the previous example, then (runs are separated by dots) and (the descents are in bold).
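The construction of $\omega$ from the precedence order can be sketched as follows. We assume that positions are listed in increasing order across distinct runs and inside ascent runs, and in decreasing order inside descent runs; this reading is chosen because it yields the lexicographically minimal permutation with the given descent word, as the text requires. All names and the sample descent word are hypothetical.

```python
from itertools import permutations


def descent_word(w):
    """Binary word b with b_i = 1 iff i is a descent of w; b_n is a redundant 0."""
    return tuple(1 if w[i] > w[i + 1] else 0 for i in range(len(w) - 1)) + (0,)


def omega(b):
    """omega_i = rank of position i in the (assumed) precedence order."""
    n = len(b)
    assert b[-1] == 0, "the last letter of a descent word is always 0"
    order = []  # positions 1..n listed in precedence order
    i = 0
    while i < n:
        if b[i] == 1:
            j = i
            while b[j] == 1:  # terminates because b ends with 0
                j += 1
            # descent run occupies positions i+1..j+1; list it in reverse
            order.extend(range(j + 1, i, -1))
            i = j + 1
        else:
            order.append(i + 1)  # a position inside an ascent run
            i += 1
    rank = {pos: r for r, pos in enumerate(order, start=1)}
    return tuple(rank[pos] for pos in range(1, n + 1))


def brute_force_omega(b):
    """Lexicographically smallest permutation with descent word b (exhaustive)."""
    n = len(b)
    return min(p for p in permutations(range(1, n + 1)) if descent_word(p) == b)


b = (1, 0, 0, 1, 1, 0)  # hypothetical descent word
print(omega(b))
print(omega(b) == brute_force_omega(b))
```

The exhaustive helper is only there to cross-check the run-based construction on short descent words.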

Now for an arbitrary arity $q$ (not necessarily its minimal value, or its maximal value $n$), we construct the lexicographically minimal pattern $\beta$ where each symbol in $\{1, 2, \ldots, q\}$ occurs at least once and $i$ is a descent in $\beta$ if and only if $b_i = 1$, and in this construction the above defined patterns $\alpha$ and $\omega$ are involved. Moreover, if $q$ reaches its minimal value, then $\beta = \alpha$ and if $q = n$, then $\beta = \omega$. The pattern $\beta$ is obtained by covering its entries in the order given by $\omega$: the first entries (in this order) are taken from $\alpha$ and the last ones are increasing integers to guarantee that all symbols in $\{1, 2, \ldots, q\}$ occur in $\beta$. Formally:

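$$\beta_{\omega_i} = \begin{cases} \alpha_{\omega_i} & \text{if } \max\{\alpha_{\omega_1}, \alpha_{\omega_2}, \ldots, \alpha_{\omega_i}\} \geq q - (n - i), \\ q - (n - i) & \text{elsewhere.} \end{cases}$$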

With these notations, it follows that if and then

 βωi={αωiif i

and the entries are consecutive integers in increasing order.

###### Example 3.

Continuing the previous example with , if , then the above construction gives ; and if , then it gives (the entries taken from are in bold).
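The formula for $\beta$ can be sketched by covering the positions $\omega_1, \omega_2, \ldots, \omega_n$ in turn while tracking the running maximum of the entries of $\alpha$ seen so far. The helpers `alpha` and `omega` follow the definitions above (with the precedence order read as position order, reversed inside descent runs, which is our assumption); the names and the sample descent word are hypothetical.

```python
def alpha(b):
    """alpha_i = min{ j : j >= i, b_j = 0 } - i + 1 (1-indexed)."""
    n = len(b)
    return tuple(
        next(j for j in range(i, n + 1) if b[j - 1] == 0) - i + 1
        for i in range(1, n + 1)
    )


def omega(b):
    """Lexicographically minimal permutation with descent word b."""
    n, order, i = len(b), [], 0
    while i < n:
        if b[i] == 1:
            j = i
            while b[j] == 1:  # b ends with 0, so this stops
                j += 1
            order.extend(range(j + 1, i, -1))  # descent run, reversed
            i = j + 1
        else:
            order.append(i + 1)
            i += 1
    rank = {pos: r for r, pos in enumerate(order, start=1)}
    return tuple(rank[pos] for pos in range(1, n + 1))


def beta(b, q):
    """q-ary pattern built from alpha and omega by the case formula."""
    n = len(b)
    a, o = alpha(b), omega(b)
    assert max(a) <= q <= n, "q must lie between the minimal and maximal arity"
    result = [0] * n
    running_max = 0
    for i in range(1, n + 1):
        pos = o[i - 1]  # the position omega_i (1-indexed)
        running_max = max(running_max, a[pos - 1])
        threshold = q - (n - i)
        result[pos - 1] = a[pos - 1] if running_max >= threshold else threshold
    return tuple(result)


b = (1, 0, 0, 1, 1, 0)  # hypothetical descent word
for q in range(3, 7):
    print(q, beta(b, q))
```

On this sample the two extreme arities give back `alpha(b)` and `omega(b)`, in line with the remark that $\beta$ interpolates between the two.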

For a pattern with descent word , by a slight abuse of notation we denote by the pattern , by the pattern (permutation) ; and in addition if has arity , then we denote by the pattern . Note that

• the pattern is lexicographically minimal in its d-equivalence class and so are and ,

• the four patterns , , and have the same descent set,

• and are d-equivalent, but , and are not necessarily d-equivalent since they can have different underlying alphabets (or equivalently in this case, different arity).

#### f-equivalent patterns

For later use we need the following rather technical notion: given two d-equivalent patterns, we say that the second is an f-transformation of the first if it can be obtained from the first by either

• increasing or decreasing one of its entries by 1, or

• interchanging two of its entries with consecutive values.
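A possible way to test whether one pattern is an f-transformation of another is sketched below. It assumes that the single-entry change is by exactly 1 (the amount is not legible in the source text), and all function names and sample patterns are hypothetical.

```python
def descent_set(w):
    """Positions i (1-indexed) with w_i > w_{i+1}."""
    return {i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1]}


def is_pattern(w):
    """A pattern uses every symbol between 1 and its maximal entry."""
    return set(w) == set(range(1, max(w) + 1))


def d_equivalent(u, v):
    """Same length, same descent set, same underlying alphabet."""
    return (
        len(u) == len(v)
        and descent_set(u) == descent_set(v)
        and set(u) == set(v)
    )


def is_f_transformation(s, t):
    """True if t is obtained from the d-equivalent pattern s by changing one
    entry by 1 (assumed amount) or by swapping two entries with consecutive
    values."""
    if not (is_pattern(s) and is_pattern(t) and d_equivalent(s, t)):
        return False
    diff = [i for i in range(len(s)) if s[i] != t[i]]
    if len(diff) == 1:
        i = diff[0]
        return abs(s[i] - t[i]) == 1
    if len(diff) == 2:
        i, j = diff
        return s[i] == t[j] and s[j] == t[i] and abs(s[i] - s[j]) == 1
    return False


print(is_f_transformation((1, 3, 2, 3), (2, 3, 1, 3)))  # swap of the values 1 and 2
print(is_f_transformation((1, 2, 1, 2), (1, 2, 1, 1)))  # one entry decreased by 1
print(is_f_transformation((1, 3, 2, 3), (1, 3, 2, 1)))  # descent set changes
```

Requiring d-equivalence up front mirrors the definition, in which only changes that stay inside the d-equivalence class count.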

Actually, the f-transformation is a symmetric binary relation on a set of d-equivalent patterns, and two patterns are said to be f-equivalent if they belong to the same equivalence class with respect to the transitive closure of the f-transformation. Below we prove that the notions of d-equivalence and f-equivalence coincide, which is stated in Corollary 1 of Theorem 1.

The order induced by is related to the descent word of , however we have the following.

###### Proposition 1.

Let be a pattern. If and are such that , then implies .

###### Proof.

If and , then and are not in the same descent run of , so . ∎

The next proposition says that, under certain conditions, decreasing an entry in the pattern produces a -equivalent pattern lexicographically smaller than .

###### Proposition 2.

Let be a pattern. If , and

• there is an such that for any , ,

• ,

• the entry occurs at least twice in ,

then the word with for any except is a pattern -equivalent with , lexicographically smaller than .

###### Proposition 3.

Let be a length pattern. If , and

• there is an such that for any , ,

• ,

• the entry occurs once in ,

• the entry occurs at least once in the set ,

then there is a with and , and the word with for any , except and is a pattern -equivalent with , lexicographically smaller than .

###### Proof.

Let be the largest element of the set with . It is enough to choose . ∎

Note that, in the two propositions above, is obtained by an f-transformation of .

###### Proposition 4.

Let be a -ary length pattern. If , and is such that each of occurs once in , then

• is a sequence of consecutive integers ending by .

In addition, if for any , , then

• each of occurs once in .

###### Proof.

If condition 1. is violated, then is not the lexicographically smallest pattern in its -equivalence class, which is a contradiction. If condition 2. is violated, then cannot be a -ary pattern, again a contradiction. ∎

###### Proposition 5.

Let be a length pattern. If , and

• there is an such that for any , ,

• ,

• the entry occurs once in ,

• the entry does not occur in the set ,

then there is a pattern which is -equivalent with and lexicographically smaller than .

###### Proof.

First we prove that at least one of the entries occurs at least twice in . Indeed, if these entries occur once in so are the entries in the set . But since and have the same arity and for any it follows that the two sets and are equal, and by the point 1. of Proposition 4, they are formed by consecutive integers. This is a contradiction since is not the minimal element of (otherwise ) and does not occur in .

Now we prove the statement according to the following two (non exclusive) cases: (i) there is an integer larger than in the set that occurs at least twice in , or (ii) there is at least one integer smaller than in the set .

Case (i). If there is an integer larger than in the set that occurs twice in , let be the smallest of them and let with be the set of occurrences of in . The entry occurs once in and let define as: if is not in the same descent run as , then , and otherwise. It follows that the pattern with for all , except , is lexicographically smaller than and is obtained from by an -transformation.

Case (ii). If there is an integer smaller than in the set , then let be the largest of them and let be the largest element of with . Necessarily occurs at least twice in ; otherwise, since does not occur in , the pattern with with for any , except and for an appropriate with , is -equivalent with and lexicographically smaller than , which is in contradiction with for any .

Since occurs at least twice in it follows that the pattern with for any , except , is -equivalent with (and lexicographically larger than ) and the entry occurs at least twice in . Now two subcases can occur: or .

When , since it follows that is a pattern satisfying Proposition 3, and the pattern with for any , except and , is -equivalent with (and thus with ) and is lexicographically smaller than . The patterns and are -equivalent, and the statement holds.

When , since the entry occurs twice in we can find as previously a pattern -equivalent with (and so with ) where the entry occurs twice in . Iterating this procedure, we obtain a sequence of -equivalent patterns with for all and for some . As above, the pattern with for any , except and is -equivalent with (and thus with ) and lexicographically smaller than . Moreover, each is obtained from by an -transformation and the statement holds. ∎

By Propositions 2, 3 and 5 we have the following theorem.

###### Theorem 1.

Any pattern is f-equivalent with .

###### Proof.

Let be a pattern with . Then is in one of the cases stated in Propositions 2, 3 or 5, and according to these propositions there exists a pattern f-equivalent (and thus d-equivalent) with and lexicographically smaller than , and eventually is f-equivalent with the lexicographically smallest pattern in its d-equivalence class, that is with . ∎

###### Corollary 1.

Two patterns are d-equivalent if and only if they are f-equivalent.

###### Proof.

By definition, f-equivalence implies d-equivalence. Conversely, if two patterns and are d-equivalent, then , and by the previous theorem is f-equivalent with and is f-equivalent with . Finally and are f-equivalent. ∎

### 2.2 Bijection ψ

In the following we need a bijection on onto itself, that we denote by $\psi$, and satisfying:

(a) $\psi$ preserves the underlying alphabet,

(b) the number of occurrences of the largest entry is the same in and in , and the same holds for the smallest entry in and in ,

(c) $\psi$ transforms descent set into ascent set, that is, for any word .

In particular, when is a permutation, the complement transformation satisfies the three properties above, which in general is no longer true for arbitrary words, and we propose a bijection which satisfies these properties for any word, not necessarily a permutation. Its construction is based on the bijection $\phi$ on words defined in , which in turn is built on the Foata and Schützenberger  bijection on permutations. The bijection $\phi$ in  satisfies for any word :

(i) is a rearrangement of the symbols of ,

(ii) , and

(iii) .

See  for the definition of the set valued statistic that we will not use here and for a weaker version of satisfying only (ii) and (iii) above. Note that from (i) it follows that $\phi$ preserves the underlying alphabet.

Based on the properties (i) and (ii) of $\phi$ it is easy to check that $\psi$ defined as

$$\psi = r \circ \phi \tag{1}$$

satisfies the above desiderata (a)–(c). Indeed, properties (a) and (b) follow from (i), and property (c) follows from (ii). Property (iii) is a deep and remarkable feature of $\phi$ (that we will not use) and in some sense our bijection is over-endowed. For instance, , and (see ), and thus , and .
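The remark that the complement satisfies the three desiderata on permutations but not on arbitrary words can be illustrated as follows. The sketch does not implement $\psi$ or $\phi$; it only tests a candidate image against our reading of (a) to (c). The complement is assumed to map each entry $x$ to $m + 1 - x$ with $m$ the maximal entry, and the names and sample words are hypothetical.

```python
def descent_set(w):
    """Positions i (1-indexed) with w_i > w_{i+1}."""
    return {i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1]}


def ascent_set(w):
    """Positions i (1-indexed) with w_i < w_{i+1}."""
    return {i + 1 for i in range(len(w) - 1) if w[i] < w[i + 1]}


def complement(w):
    """Assumed reading: replace each entry x by m + 1 - x, m the maximal entry."""
    m = max(w)
    return tuple(m + 1 - x for x in w)


def check_desiderata(w, image):
    """Return the truth values of (a), (b), (c) for a candidate image of w."""
    a = set(w) == set(image)
    b = (
        w.count(max(w)) == image.count(max(image))
        and w.count(min(w)) == image.count(min(image))
    )
    c = descent_set(image) == ascent_set(w)
    return a, b, c


for w in [(2, 4, 1, 3), (1, 1, 2), (1, 2, 4)]:
    print(w, complement(w), check_desiderata(w, complement(w)))
```

On the permutation all three checks succeed, whereas the word with a repeated symbol breaks (b) and the word with a gap in its alphabet breaks (a), which is the motivation for a dedicated bijection on words.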

### 2.3 Pattern trace and word substitution

For a word and a set we denote by the subword of .

Let be a length word over , , and be the set . We say that is a trace of the pattern if and have the same relative order ($<$, $=$, or $>$) as and have whenever . Equivalently, is a trace of if the words and are order-isomorphic. In particular, when does not contain , then ; and formed only by ’s is a trace of any pattern. It can happen that is a trace of several patterns. For instance, for two patterns and of the same length, if the trace of is such that implies , then is a trace of as well.

With a trace of a pattern and as above, for a word and a set of positions in we say that is a trace of in at if (and so, ), and a trace of in at can be seen as a partial occurrence of the pattern in with playing the role of ‘wild’ symbol. It can happen that several occurrences of in a word have trace at , and we denote by the number of these occurrences, and thus becomes an integer valued statistic on words.

###### Example 4.

If and are two patterns, then and are traces of both and . Furthermore, if , then

• is an occurrence in of with trace at ,

• and are occurrences in of with trace at , and .

See Table 1 in Appendix for other examples.

For a word and two pairs of integer and we denote by the length-maximal subword of with and . Alternatively, is the length-maximal subword of with entries in .

If the word has different symbols and is a word with the underlying alphabet , then there is a unique word with

• for any with or , and

• , and and have the same underlying alphabet.

Indeed, is obtained from by replacing the subword of by an appropriate word order-isomorphic with . With these notations we call the -substitution by in . In particular, if , then the -substitution by in is itself, and we have the following easy to understand fact.

###### Fact 1.

If and are two -equivalent words, then so are and the -substitution by in .

See Example 5 where the -substitution by in is (the replaced elements are in italic and represented by in the corresponding diagrams).

## 3 Proof of the main results

The main result of this article is Theorem 4. Prior to its proof, Lemmata 1 and 2 below establish some equidistribution results, and Corollary 2 of Theorems 2 and 3 says that if two patterns are an f-transformation of each other, then they have the same popularity on any d-equivalence class.

###### Lemma 1.

Let and be two d-equivalent patterns with for any , except for some . Let also be a trace of both and with one symbol and be a element subset of . Then on any d-equivalence class the statistics and have the same distribution.

###### Proof.

For any d-equivalence class we give a bijection with .

Since and differ in position it follows that . In addition, since and are patterns, the entry occurs at least twice in (otherwise is no longer a pattern) and so does in . Let be the symbol in playing the role of and be that playing the role of , and we define the interval and the interval as follows. If , then

• if , then ,

• if , then ,

• elsewhere, .

Now let be the word and be the word , with $\psi$ defined in relation (1) and the complement operation. The desired word is the -substitution by in . Indeed, and have the same descent set and the same underlying alphabet, and thus they are d-equivalent. By Fact 1 the transformation turns the word into a d-equivalent word . In addition, by property (b) of $\psi$, it follows that the number of the largest entries in is the same as that of the smallest entries in , and vice versa. Thus, for any , (with the convention and ) transforms any occurrence

$$w_{p_1} \ldots w_{p_{i-1}}\, w_j\, w_{p_i} \ldots w_{p_{k-1}} = t_1 \ldots t_{i-1}\, x\, t_{i+1} \ldots t_k$$

of in with trace at into an occurrence

$$v_{p_1} \ldots v_{p_{i-1}}\, v_j\, v_{p_i} \ldots v_{p_{k-1}} = t_1 \ldots t_{i-1}\, y\, t_{i+1} \ldots t_k,$$

of in with trace at . This transformation is reversible, indeed the -substitution by in gives the word , and so it is a bijection. ∎

###### Example 5.

We represent words as diagrams identifying words with the set of points . Let be the word (the left-hand side diagram in this representation), and be two patterns as in Lemma 1, be a common trace of and , and be the set . In the diagram representation of , the entries , and of occurring in positions belonging to are represented by   symbol. Following the notations in the proof of Lemma 1, the interval is , is , is the subword (represented by symbols in the left-hand side diagram), is the word , is , see the examples at the end of the Section 1. Finally, in the right-hand side diagram is the image of through the bijection in the proof of Lemma 1, and we have . Indeed, occurs twice in with trace at , namely in positions and in positions ; and so does in with trace at , namely in positions and in positions .

[Figure: diagram of the word $w$ (left-hand side) and of its image $v$ (right-hand side), $w \mapsto v$.]

The next lemma is the counterpart of Lemma 1 where the patterns differ in two positions, with the additional requirement that the two different entries occur once in each pattern.

###### Lemma 2.

Let and be two -equivalent patterns such that there are and , , with

• for any , except and ,

• ,

• each of and occurs once in (or, equivalently, and occur once in ).

Let also be a trace of both and with two symbols and be a subset of of cardinality . Then on any d-equivalence class the statistics and have the same distribution.

###### Proof.

To a certain extent the proof is similar to that of Lemma 1: we give a bijection with on any d-equivalence class.
Since and differ in positions and and contains two symbols, it follows that , and since and are d-equivalent and are not consecutive positions in . We define three intervals , and . Let be the element subset.

• If is the smallest entry in then . Otherwise let be the largest entry in smaller than , and be the entry in playing the role of , and finally . Similarly, if is the largest entry in then . Otherwise let be the smallest entry in larger than , and be the entry in playing the role of , and finally .

• If , then , otherwise ; and .

• ; and if , then , otherwise .

Now we define the announced bijection , where is obtained by constructing the words , and by applying the following steps.

1. Let be the word and , with defined in relation (1), and be the -substitution by in ;

2. let be the word and , and be the -substitution by in ;

3. let be the word and , with the complement operation, and be the -substitution by in ;

and finally . Note that the first two steps can be performed in arbitrary order since the substitution operations act on different entries of (on the disjoint intervals and ). As in the proof of Lemma 1, taking into consideration the properties of $\psi$, transforms any occurrence of in with trace at into an occurrence of in with trace at , and is reversible and so it is a bijection. ∎

Note that in the previous proof, unlike in that of Lemma 1, the property (b) of the bijection $\psi$ is not used.

See Table 1 in Appendix for an example of the equidistribution stated in Lemma 2.

###### Example 6.

Let be the word (the left-hand side diagram below), and be two patterns as in Lemma 2, be a common trace of and , and be the set . In the diagram representation of , the entries of occurring in positions belonging to are represented by   symbols. Following the notations in the proof of Lemma 2, the interval is , is , is , is the subword (represented by symbols in the left-hand side diagram), is the subword (represented by symbols), is and , is and . Finally, in the right-hand side diagram is the image of through the bijection in the proof of Lemma 2, and we have . Indeed, occurs three times in with trace at , namely in positions , in positions , and in positions ; and so does in with trace at , namely in positions , in positions , and in positions