 # Descent distribution on Catalan words avoiding a pattern of length at most three

Catalan words are particular growth-restricted words over the set of non-negative integers, and they represent yet another combinatorial class counted by the Catalan numbers. We study the distribution of descents on the sets of Catalan words avoiding a pattern of length at most three: for each such pattern $p$ we provide a bivariate generating function where the coefficient of $x^n y^k$ in its series expansion is the number of length $n$ Catalan words with $k$ descents and avoiding $p$. As a byproduct, we enumerate the set of Catalan words avoiding $p$, and we provide the popularity of descents on this set. Some of the obtained enumerating sequences are not yet recorded in the On-line Encyclopedia of Integer Sequences.



## 1 Introduction and notation

Combinatorial objects counted by the Catalan numbers are very classical in combinatorics, with a variety of applications in, among others, Biology, Chemistry, and Physics. A length $n$ Catalan word is a word $w = w_1 w_2 \cdots w_n$ over the set of non-negative integers with $w_1 = 0$, and

$$0 \le w_i \le w_{i-1} + 1,$$

for $i = 2, 3, \ldots, n$. We denote by $\mathcal{C}_n$ the set of length $n$ Catalan words, and $\mathcal{C} = \cup_{n \ge 0} \mathcal{C}_n$. For example, $\mathcal{C}_2 = \{00, 01\}$ and $\mathcal{C}_3 = \{000, 001, 010, 011, 012\}$. It is well known that the cardinality of $\mathcal{C}_n$ is given by the $n$th Catalan number $\frac{1}{n+1}\binom{2n}{n}$, see for instance [12, exercise 6.19., p. 222], which is the general term of the sequence A000108 in the On-line Encyclopedia of Integer Sequences (OEIS) . See also  where Catalan words are considered in the context of the exhaustive generation of Gray codes for growth-restricted words.
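The growth restriction above translates directly into an exhaustive generator. The following is a minimal sketch, not taken from the paper: the function names `catalan_words` and `catalan_number` are illustrative, and words are represented as tuples of integers.

```python
from math import comb

def catalan_words(n):
    """Return all length-n Catalan words as tuples, in lexicographic order."""
    if n == 0:
        return [()]
    words = [(0,)]  # every non-empty Catalan word starts with 0
    for _ in range(n - 1):
        # a word ending in the letter a may be extended by any letter in 0..a+1
        words = [w + (a,) for w in words for a in range(w[-1] + 2)]
    return words

def catalan_number(n):
    """The n-th Catalan number, binomial(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

if __name__ == "__main__":
    for n in range(8):
        print(n, len(catalan_words(n)), catalan_number(n))
```

For small lengths the two printed columns agree, which is the counting statement recalled above.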

A pattern $p$ is a word satisfying the property that if $x$ appears in $p$, then all integers in the interval $[0, x]$ also appear in $p$. We say that a word $w = w_1 w_2 \cdots w_n$ contains the pattern $p = p_1 p_2 \cdots p_k$ if there is a subsequence $w_{i_1} w_{i_2} \cdots w_{i_k}$ of $w$, $i_1 < i_2 < \cdots < i_k$, which is order-isomorphic to $p$. For example, the Catalan word $01012312301$ contains seven occurrences of the pattern $110$ and four occurrences of the pattern $210$. A word avoids the pattern $p$ whenever it does not contain any occurrence of $p$. We denote by $\mathcal{C}_n(p)$ the set of length $n$ Catalan words avoiding the pattern $p$, and $\mathcal{C}(p) = \cup_{n \ge 0} \mathcal{C}_n(p)$. For instance, , and . For a set of words, the popularity of a pattern $p$ is the overall number of occurrences of $p$ within all words of the set, see  where this notion was introduced, and [1, 7, 10, 2] for some related results.
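Pattern containment can be checked by brute force over all subsequences. The sketch below is illustrative only: the helper names `is_occurrence` and `occurrences` are assumptions, and the sample word is simply a Catalan word chosen to exercise the definition.

```python
from itertools import combinations

def is_occurrence(sub, pattern):
    """True if `sub` is order-isomorphic to `pattern` (both of the same length)."""
    k = len(pattern)
    return all(
        (sub[i] < sub[j]) == (pattern[i] < pattern[j])
        and (sub[i] == sub[j]) == (pattern[i] == pattern[j])
        for i in range(k)
        for j in range(i + 1, k)
    )

def occurrences(word, pattern):
    """Number of subsequences of `word` that are occurrences of `pattern`."""
    # combinations() picks positions in increasing order, so each tuple is a subsequence
    return sum(is_occurrence(s, pattern) for s in combinations(word, len(pattern)))

if __name__ == "__main__":
    word = (0, 1, 0, 1, 2, 3, 1, 2, 3, 0, 1)  # sample Catalan word
    print(occurrences(word, (1, 1, 0)), occurrences(word, (2, 1, 0)))
```

A word avoids a pattern exactly when `occurrences` returns zero; the check is exponential in the pattern length, which is harmless for patterns of length at most three.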

A descent in a word $w = w_1 w_2 \cdots w_n$ is an occurrence $w_i w_{i+1}$ such that $w_i > w_{i+1}$. Alternatively, a descent is an occurrence of the consecutive pattern $\underline{10}$ (i.e., the entries corresponding to an occurrence of $10$ are required to be adjacent). We denote by $d(w)$ the number of descents of $w$, thus the popularity of descents on a set $S$ of words is $\sum_{w \in S} d(w)$. The distribution of the number of descents has been widely studied on several classes of combinatorial objects such as permutations and words, since descents have some particular interpretations in fields such as Coxeter groups or the theory of lattice paths [3, 6].

The main goal of this paper is to study the descent distribution on Catalan words (see Table 1 for some numerical values). More specifically, for each pattern $p$ of length at most three, we give the distribution of descents on the sets $\mathcal{C}_n(p)$ of length $n$ Catalan words avoiding $p$. We denote by $C_p(x, y)$ the bivariate generating function in which the coefficient of $x^n y^k$ is the number of words in $\mathcal{C}_n(p)$ with $k$ descents. Plugging

• $y = 1$ into $C_p(x, y)$, we deduce the generating function $C_p(x)$ for the set $\mathcal{C}(p)$, and

• $y = 1$ into $\frac{\partial C_p(x, y)}{\partial y}$, we deduce the generating function for the popularity of descents in $\mathcal{C}(p)$.
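The coefficients of such a bivariate generating function can be tabulated by brute force for small lengths. The following sketch handles unrestricted Catalan words; the function names are illustrative and the enumeration is exponential, so it is only meant for small $n$.

```python
from collections import Counter

def catalan_words(n):
    """All length-n Catalan words as tuples."""
    if n == 0:
        return [()]
    words = [(0,)]
    for _ in range(n - 1):
        words = [w + (a,) for w in words for a in range(w[-1] + 2)]
    return words

def descents(w):
    """Number of positions i with w[i] > w[i+1]."""
    return sum(a > b for a, b in zip(w, w[1:]))

def descent_distribution(n):
    """List whose k-th entry is the number of length-n Catalan words with k descents."""
    counts = Counter(descents(w) for w in catalan_words(n))
    return [counts[k] for k in range(max(counts) + 1)]

if __name__ == "__main__":
    for n in range(1, 8):
        print(n, descent_distribution(n))
```

Summing a row gives the number of words of that length (the substitution $y = 1$), and the weighted sum $\sum_k k \cdot \text{row}[k]$ gives the popularity of descents (the derivative in $y$ evaluated at $y = 1$).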

From the definition at the beginning of this section it follows that a Catalan word is either the empty word, or it can uniquely be written as $0(u+1)v$, where $u$ and $v$ are in turn Catalan words, and $(u+1)$ is obtained from $u$ by adding one to each of its entries. We call this recursive decomposition the first return decomposition of a Catalan word, and it will be crucial in our further study. It follows that $C(x)$, the generating function for the cardinality of $\mathcal{C}_n$, satisfies:

$$C(x) = 1 + x \cdot C(x)^2,$$

which corresponds precisely to the sequence of Catalan numbers.
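The first return decomposition can be computed by cutting a word at the second occurrence of the letter 0. This is a small sketch under that reading of the decomposition; the names `first_return_decomposition` and `compose` are illustrative.

```python
def first_return_decomposition(w):
    """Split a non-empty Catalan word w into (u, v) such that w = 0(u+1)v."""
    # v starts at the second occurrence of 0 in w, or is empty if there is none
    cut = next((i for i in range(1, len(w)) if w[i] == 0), len(w))
    u = tuple(a - 1 for a in w[1:cut])  # letters strictly between the two zeros, shifted down
    v = tuple(w[cut:])
    return u, v

def compose(u, v):
    """Inverse operation: build the word 0(u+1)v from two Catalan words."""
    return (0,) + tuple(a + 1 for a in u) + tuple(v)

if __name__ == "__main__":
    w = (0, 1, 2, 1, 0, 1)
    u, v = first_return_decomposition(w)
    print(u, v, compose(u, v) == w)
```

Since every pair $(u, v)$ yields exactly one word of length $|u| + |v| + 1$, the decomposition gives the equation $C(x) = 1 + x \cdot C(x)^2$ term by term.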

We conclude this section by explaining how Catalan words are naturally related to two classical combinatorial classes counted by the Catalan numbers.

### Catalan words vs. Dyck words

A Dyck word is a word over $\{U, D\}$ with the same number of $U$'s and $D$'s, and with the property that all of its prefixes contain no more $D$'s than $U$'s. Alternatively, a Dyck word can be represented as a lattice path starting at $(0, 0)$, ending at $(2n, 0)$, and never going below the $x$-axis, consisting of up steps $U = (1, 1)$ and down steps $D = (1, -1)$. There is a direct bijection between the set of Dyck words of semilength $n$ and $\mathcal{C}_n$: the Catalan word is the sequence of the lowest ordinates of the up steps of the Dyck word, in lattice path representation. For instance, the image through this bijection of the Dyck word of semilength is . Note that the above bijection gives a one-to-one correspondence between occurrences of the consecutive pattern $DDU$ in Dyck words and descents in Catalan words.
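The bijection with Dyck words is easy to make concrete. The sketch below assumes Dyck words are written as strings over the letters `U` (up step) and `D` (down step); these letter names and the function names are choices made here for illustration.

```python
def dyck_to_catalan(dyck):
    """Record the starting height of every up step of a Dyck word."""
    height, word = 0, []
    for step in dyck:
        if step == "U":
            word.append(height)
            height += 1
        else:
            height -= 1
    return tuple(word)

def catalan_to_dyck(w):
    """Inverse map: go down to level a before each up step that starts at level a."""
    steps, height = [], 0
    for a in w:
        steps.append("D" * (height - a))
        steps.append("U")
        height = a + 1
    steps.append("D" * height)  # return to the x-axis
    return "".join(steps)

if __name__ == "__main__":
    dyck = "UUDUDDUD"
    w = dyck_to_catalan(dyck)
    print(w, catalan_to_dyck(w), dyck.count("DDU"))
```

Two consecutive up steps are separated by at least two down steps exactly when the second one starts strictly lower than the first, which is why occurrences of the factor `DDU` match descents.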

### Catalan words vs. binary trees

In  the author introduced an integer sequence representation for binary trees, called left-distance sequence. For a binary tree $T$, consider the following labeling of its nodes: the root is labeled by $0$, a left child by the label of its parent, and a right child by the label of its parent, plus one. The left-distance sequence of $T$ is obtained by traversing $T$ in inorder (i.e., visiting recursively the left subtree, the root and then the right subtree of $T$) and collecting the labels of the nodes. In  it is shown that, for a given length, the set of left-distance sequences is precisely that of same length Catalan words. Moreover, the induced bijection between Catalan words and binary trees gives a one-to-one correspondence between descents in Catalan words and particular nodes (left-child nodes having a right child) in binary trees.
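The labeling and the inorder traversal can be sketched in a few lines. The representation is an assumption made for illustration: a binary tree is either `None` (empty) or a pair `(left, right)` of subtrees, and `all_trees` is a hypothetical helper that enumerates all trees with a given number of nodes.

```python
def left_distance_sequence(tree, label=0):
    """Inorder list of labels: a left child inherits the label, a right child adds one."""
    if tree is None:
        return ()
    left, right = tree
    return (
        left_distance_sequence(left, label)
        + (label,)
        + left_distance_sequence(right, label + 1)
    )

def all_trees(n):
    """All binary trees with n nodes, as nested (left, right) pairs."""
    if n == 0:
        return [None]
    return [
        (left, right)
        for k in range(n)
        for left in all_trees(k)
        for right in all_trees(n - 1 - k)
    ]

if __name__ == "__main__":
    # root whose left child has a right child: one descent expected
    tree = ((None, (None, None)), None)
    print(left_distance_sequence(tree))
    print(sorted(left_distance_sequence(t) for t in all_trees(3)))
```

The second printed line lists the sequences obtained from the five binary trees with three nodes, which can be compared with the Catalan words of length three.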

The remainder of the paper is organized as follows. In Section 2, we study the distribution of descents on the set $\mathcal{C}$ of Catalan words. As a byproduct, we deduce the popularity of descents in $\mathcal{C}$. We consider also similar results for the obvious cases of Catalan words avoiding a pattern of length two. In Section 3, we study the distribution and the popularity of descents on Catalan words avoiding each pattern of length three.

## 2 The sets $\mathcal{C}$ and $\mathcal{C}(p)$ for $p \in \{00, 01, 10\}$

Here we consider both unrestricted Catalan words and those avoiding a length two pattern. We denote by $C(x, y)$ the bivariate generating function where the coefficient of $x^n y^k$ of its series expansion is the number of length $n$ Catalan words with $k$ descents. When we restrict to Catalan words avoiding the pattern $p$, the corresponding generating function is denoted by $C_p(x, y)$.

###### Theorem 1.

We have

$$C(x, y) = \frac{1 - 2x + 2xy - \sqrt{1 - 4x + 4x^2 - 4x^2 y}}{2xy}.$$
###### Proof.

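Let $w = 0(u+1)v$ be the first return decomposition of a non-empty Catalan word $w$ with $u, v \in \mathcal{C}$. If $u$ (resp. $v$) is empty then the number of descents in $w$ is the same as that of $v$ (resp. $u$); otherwise, writing $d(\cdot)$ for the number of descents, we have $d(w) = d(u) + d(v) + 1$ since there is a descent between $(u+1)$ and $v$. So, we obtain the functional equation $C(x, y) = 1 + x C(x, y) + x (C(x, y) - 1) + xy (C(x, y) - 1)^2$ which gives the desired result. ∎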
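The passage from the case analysis to the closed form of Theorem 1 can be spelled out as follows. This is a sketch, not the authors' computation: the auxiliary series $D = C(x, y) - 1$, which counts the non-empty Catalan words, is introduced here only to shorten the algebra.

```latex
% Assumption: D = C(x,y) - 1 is the series of non-empty Catalan words.
\begin{align*}
D &= x + 2xD + xy\,D^2
  && \text{(both factors empty; exactly one non-empty; both non-empty)}\\
0 &= xy\,D^2 - (1 - 2x)\,D + x\\
D &= \frac{(1 - 2x) - \sqrt{(1 - 2x)^2 - 4x^2 y}}{2xy}
  && \text{(the root that is a power series in } x\text{)}\\
C(x, y) &= 1 + D
  = \frac{1 - 2x + 2xy - \sqrt{1 - 4x + 4x^2 - 4x^2 y}}{2xy}.
\end{align*}
```

The other root of the quadratic has a pole at $x = 0$, so it cannot be a generating function.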

As expected, $C(x, 1)$ is the generating function for the Catalan numbers, and $\left.\frac{\partial C(x, y)}{\partial y}\right|_{y=1}$ is the generating function for the descent popularity on $\mathcal{C}$, and we have the next corollary.

###### Corollary 1.

The popularity of descents on the set $\mathcal{C}_n$ is $\binom{2n-2}{n-3}$, and its generating function is $\frac{1 - 4x + 2x^2 - (1 - 2x)\sqrt{1 - 4x}}{2x\sqrt{1 - 4x}}$ (sequence A002694 in ).
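The binomial expression for the popularity can be compared with a direct count for small lengths. The following is a brute-force sketch with illustrative function names; it assumes the popularity of descents on length $n$ words equals $\binom{2n-2}{n-3}$ for $n \ge 3$ and is zero for shorter words.

```python
from math import comb

def catalan_words(n):
    """All length-n Catalan words as tuples."""
    if n == 0:
        return [()]
    words = [(0,)]
    for _ in range(n - 1):
        words = [w + (a,) for w in words for a in range(w[-1] + 2)]
    return words

def descents(w):
    """Number of positions i with w[i] > w[i+1]."""
    return sum(a > b for a, b in zip(w, w[1:]))

def descent_popularity(n):
    """Total number of descents over all length-n Catalan words."""
    return sum(descents(w) for w in catalan_words(n))

if __name__ == "__main__":
    for n in range(3, 10):
        print(n, descent_popularity(n), comb(2 * n - 2, n - 3))
```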

Catalan words of odd lengths encompass a smaller size Catalan structure. This result is stated in the next corollary, see the bold entries in Table 1.

###### Corollary 2.

Catalan words of length $2n+1$ with $n$ descents are enumerated by the $n$th Catalan number $\frac{1}{n+1}\binom{2n}{n}$.

###### Proof.

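Clearly, the maximal number of descents in a length $n$ Catalan word is $\lfloor \frac{n-1}{2} \rfloor$. Let $w$ be a Catalan word of length $2n+1$, $n \ge 1$, with $n$ descents. We necessarily have $w = 0(u+1)v$ with $u \in \mathcal{C}_k$, $v \in \mathcal{C}_\ell$, and $k + \ell = 2n$. Since the length of $w$ is odd, $k$ and $\ell$ have the same parity. If $k$ and $\ell$ are both even, then $w$ has at most $n - 1$ descents, which gives a contradiction. So, $k$ and $\ell$ are both odd, and $u$ and $v$ have $\frac{k-1}{2}$ and $\frac{\ell-1}{2}$ descents, respectively. Thus the generating function $A(x)$ where the coefficient of $x^n$ is the number of Catalan words of length $2n+1$ with $n$ descents satisfies $A(x) = 1 + x A(x)^2$ which is the generating function for the Catalan numbers. ∎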
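Corollary 2 lends itself to a quick numerical check. The sketch below counts, by exhaustive enumeration, the Catalan words of length $2n+1$ having $n$ descents; the function names are illustrative and the range of $n$ is kept small because the enumeration grows quickly.

```python
from math import comb

def catalan_words(n):
    """All length-n Catalan words as tuples."""
    if n == 0:
        return [()]
    words = [(0,)]
    for _ in range(n - 1):
        words = [w + (a,) for w in words for a in range(w[-1] + 2)]
    return words

def descents(w):
    """Number of positions i with w[i] > w[i+1]."""
    return sum(a > b for a, b in zip(w, w[1:]))

def extremal_count(n):
    """Number of Catalan words of length 2n+1 with exactly n descents."""
    return sum(descents(w) == n for w in catalan_words(2 * n + 1))

if __name__ == "__main__":
    for n in range(5):
        print(n, extremal_count(n), comb(2 * n, n) // (n + 1))
```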

There are three patterns of length two, namely $00$, $01$ and $10$, and Catalan words avoiding such a pattern do not have descents, thus the corresponding bivariate generating functions collapse into one variable ones.

###### Theorem 2.

For $p \in \{00, 01\}$, we have $C_p(x, y) = \frac{1}{1-x}$.

###### Proof.

If $p = 00$ (resp. $p = 01$) then $012 \cdots (n-1)$ (resp. $0^n$) is the unique non-empty Catalan word of length $n$ avoiding $p$, and the statement follows. ∎

###### Theorem 3.

We have $C_{10}(x, y) = \frac{1-x}{1-2x}$, which is the generating function for the sequence $2^{n-1}$ (sequence A011782 in ).

###### Proof.

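A non-empty Catalan word avoiding the pattern $10$ is of the form $0(u+1)$ for $u \in \mathcal{C}(10)$, or $0v$ with non-empty $v \in \mathcal{C}(10)$. So, we have the functional equation $C_{10}(x) = 1 + x C_{10}(x) + x (C_{10}(x) - 1)$, which gives $C_{10}(x) = \frac{1-x}{1-2x}$. ∎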

## 3 The sets $\mathcal{C}(p)$ for a length three pattern $p$

Here we turn our attention to patterns of length three. There are thirteen such patterns, and we give the distribution and the popularity of descents on Catalan words avoiding each of them. Some of the obtained results are summarized in Tables 2 and 3.
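The thirteen patterns, and the first few values of the number of Catalan words avoiding each of them, can be listed by exhaustive search. The sketch below is only a sanity-check tool with illustrative names (`patterns`, `contains`, `count_avoiders`); it is not the method used in the proofs.

```python
from itertools import combinations, product

def catalan_words(n):
    """All length-n Catalan words as tuples."""
    if n == 0:
        return [()]
    words = [(0,)]
    for _ in range(n - 1):
        words = [w + (a,) for w in words for a in range(w[-1] + 2)]
    return words

def patterns(k):
    """Words of length k whose set of letters is an initial segment {0, ..., m}."""
    return [p for p in product(range(k), repeat=k)
            if set(p) == set(range(max(p) + 1))]

def contains(word, pattern):
    """True if some subsequence of `word` is order-isomorphic to `pattern`."""
    k = len(pattern)
    return any(
        all((s[i] < s[j]) == (pattern[i] < pattern[j])
            and (s[i] == s[j]) == (pattern[i] == pattern[j])
            for i in range(k) for j in range(i + 1, k))
        for s in combinations(word, k)
    )

def count_avoiders(pattern, n):
    """Number of length-n Catalan words avoiding `pattern`."""
    return sum(not contains(w, pattern) for w in catalan_words(n))

if __name__ == "__main__":
    for p in patterns(3):
        print("".join(map(str, p)), [count_avoiders(p, n) for n in range(1, 7)])
```

Each printed row can be looked up in the OEIS and compared with the sequences named in the corollaries of this section.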

###### Theorem 4.

For $p \in \{012, 001\}$, we have

$$C_p(x, y) = \frac{1 - x + x^2 - x^2 y}{1 - 2x + x^2 - x^2 y}.$$
###### Proof.

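A non-empty word $w \in \mathcal{C}(012)$ has its first return decomposition $w = 0(u+1)v$ where $u = 0^k$ for some $k \ge 0$ and $v \in \mathcal{C}(012)$. If $k = 0$ or $v$ is empty, then the number of descents in $w$ is the same as that of $v$; otherwise, $w$ has one more descent than $v$ (there is a descent between $(u+1)$ and $v$). So, we obtain the functional equation $C_p(x, y) = 1 + x C_p(x, y) + \frac{x^2}{1-x} + \frac{x^2 y}{1-x} (C_p(x, y) - 1)$ which gives the desired result.

A non-empty word $w \in \mathcal{C}(001)$ has the form $w = 0(u+1)0^k$ where $k \ge 0$ and $u \in \mathcal{C}(001)$. If $k = 0$ or $u$ is empty, then the number of descents in $w$ is the same as that of $u$; otherwise, $w$ has one more descent than $u$. So, we obtain the functional equation $C_p(x, y) = 1 + \frac{x}{1-x} + x (C_p(x, y) - 1) + \frac{x^2 y}{1-x} (C_p(x, y) - 1)$ which gives the desired result. ∎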

Considering the previous theorem and the coefficient of $x^n$ in $C_p(x, 1)$ and in $\left.\frac{\partial C_p(x, y)}{\partial y}\right|_{y=1}$, we obtain the next corollary.

###### Corollary 3.

For $p \in \{012, 001\}$, we have $C_p(x) = \frac{1-x}{1-2x}$, and the popularity of descents on the set $\mathcal{C}_n(p)$ is $(n-2) \cdot 2^{n-3}$ (sequence A001787 in ).
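The statements of Theorem 4 and Corollary 3 can be checked numerically for small lengths. The sketch below assumes that the two patterns concerned are 012 and 001 and that the popularity on length $n$ words is $(n-2) \cdot 2^{n-3}$ for $n \ge 3$; all function names are illustrative.

```python
from collections import Counter
from itertools import combinations

def catalan_words(n):
    """All length-n Catalan words as tuples."""
    if n == 0:
        return [()]
    words = [(0,)]
    for _ in range(n - 1):
        words = [w + (a,) for w in words for a in range(w[-1] + 2)]
    return words

def descents(w):
    """Number of positions i with w[i] > w[i+1]."""
    return sum(a > b for a, b in zip(w, w[1:]))

def avoids(word, pattern):
    """True if no subsequence of `word` is order-isomorphic to `pattern`."""
    k = len(pattern)
    return not any(
        all((s[i] < s[j]) == (pattern[i] < pattern[j])
            and (s[i] == s[j]) == (pattern[i] == pattern[j])
            for i in range(k) for j in range(i + 1, k))
        for s in combinations(word, k)
    )

def descent_distribution(pattern, n):
    """Counter mapping k to the number of avoiders of length n with k descents."""
    return Counter(descents(w) for w in catalan_words(n) if avoids(w, pattern))

if __name__ == "__main__":
    for n in range(1, 9):
        d012 = descent_distribution((0, 1, 2), n)
        d001 = descent_distribution((0, 0, 1), n)
        popularity = sum(k * c for k, c in d012.items())
        print(n, dict(d012), d012 == d001, sum(d012.values()), popularity)
```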

As in the case of length two patterns, a Catalan word avoiding $010$ does not have descents, and we have the next theorem.

###### Theorem 5.

If $p = 010$, then $C_p(x, y) = \frac{1-x}{1-2x}$ which is the generating function for the sequence $2^{n-1}$ (sequence A011782 in ).

###### Proof.

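A non-empty word $w \in \mathcal{C}(010)$ can be written either as $0(u+1)$ with $u \in \mathcal{C}(010)$, or as $0v$ with non-empty $v \in \mathcal{C}(010)$. So, we deduce $C_p(x) = 1 + x C_p(x) + x (C_p(x) - 1)$, and the statement holds. ∎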

###### Theorem 6.

For , we have

$$C_p(x, y) = \frac{1 - 4x + 6x^2 - x^2 y - 4x^3 + 3x^3 y + x^4 - x^4 y}{(1-x)(1-2x)(1 - 2x + x^2 - x^2 y)}.$$
###### Proof.

Let be a non-empty word in , and let its first return decomposition with . Note that belongs to . We distinguish two cases: (1) does not contain , and (2) otherwise.
In the case (1), (i.e., for some ), and . If (resp. ), then the number of descents in is the same as that of (resp. ); otherwise, we have . So, this case contributes to with .

In the case (2), and . If then and have the same number of descents; otherwise, we have . So, this case contributes to with .

Taking into account these two disjoint cases, and adding the empty word, we deduce the functional equation , which after calculation gives the result.∎

###### Corollary 4.

For , we have which is the generating function for the sequence (sequence A005183 in ). The popularity of descents on the set is with the generating function (sequence A001793 in ).

###### Theorem 7.

For , we have

$$C_p(x, y) = \frac{1 - 3x + 3x^2 - 2x^2 y - x^3 + x^3 y}{(1-x)(1 - 3x + 2x^2 - 2x^2 y)}.$$
###### Proof.

Let be a non-empty word in , and let its first return decomposition with . If is empty, then for some and we have . If is empty, then for some and we have . If and are both non-empty, then and . We deduce the functional equation . Finally, by Theorem 4 we obtain the desired result.

Let be a non-empty word in , and let its first return decomposition with . If is empty, then for some and we have . If is empty, then for some and we have . If and are both non-empty, then and we distinguish two cases: (1) does not contain , and (2) otherwise. In the case (1), we have and ; in the case (2), contains 1 and and . Combining the previous cases, the functional equation becomes , which gives the desired result. ∎

###### Corollary 5.

For , we have which is the generating function of the sequence (sequence A007051 in ). The popularity of descents on the set is with the generating function (sequence A027471 in ).

###### Theorem 8.

For $p \in \{101, 120\}$, we have

$$C_p(x, y) = \frac{1 - 2x + x^2 - x^2 y}{1 - 3x + 2x^2 - x^2 y}.$$
###### Proof.

Let be a non-empty word in , and let be its first return decomposition where . If is empty, then for some and we have ; if is empty, then for some and we have ; if and are not empty, then , and . We deduce the functional equation which gives the result.

Let be a non-empty word in , and let be its first return decomposition where . If is empty, then for some and ; if is empty, then for some and ; if and are not empty, then and and . We deduce the functional equation which gives the result. ∎

###### Corollary 6.

For , we have and the coefficient of in its series expansion is the th term of the Fibonacci sequence (see A001519 in ). The popularity of descents on the set is given by which is the coefficient of in the series expansion of (sequence A001870 in ).

###### Theorem 9.

For $p = 011$, we have

$$C_p(x, y) = \frac{1 - 2x + 2x^2 - x^3 + x^3 y}{(1-x)^3}.$$
###### Proof.

Let be a non-empty word in , and let its first return decomposition where . If (resp. ) is empty, then we have (resp. ); if and are non-empty, then and . We deduce the functional equation which gives the result. ∎

###### Corollary 7.

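For $p = 011$, we have $C_p(x) = \frac{1 - 2x + 2x^2}{(1-x)^3}$ and the coefficient of $x^n$ in its series expansion is $\frac{n(n-1)}{2} + 1$ (sequence A000124 in ). The popularity of descents on the set $\mathcal{C}_n(p)$ is given by $\frac{(n-1)(n-2)}{2}$ which is the coefficient of $x^n$ in the series expansion of $\frac{x^3}{(1-x)^3}$ (sequence A000217 in ).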

###### Theorem 10.

For $p = 000$, we have

$$C_p(x, y) = \frac{1 - x^2 - x^2 y}{1 - x - 2x^2 - x^2 y + x^3 + x^4 - x^4 y}.$$
###### Proof.

Let be a non-empty word in , and let its first return decomposition where . We distinguish two cases: (1) is empty, and (2) otherwise.

In the case (1), we have for some and . So, the generating function for the Catalan words in this case is .

In the case (2), we set for some and we have . We distinguish three sub-cases: (2.a) is empty, (2.b) is non-empty and is empty, and (2.c) and are both non-empty.

In the case (2.a), we have with . So, the generating function for the Catalan words belonging to this case is .

In the case (2.b), we have with . So, the generating function for the corresponding Catalan words is .

In the case (2.c), we have where and are non-empty Catalan words such that is a Catalan word lying in the case (2). If , then ; if , then . So, the generating function for the corresponding Catalan words is .

Considering , the obtained functional equations give the result. ∎

###### Corollary 8.

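For $p = 000$, we have $C_p(x) = \frac{1 - 2x^2}{1 - x - 3x^2 + x^3}$ and the generating function for the popularity of descents in the sets $\mathcal{C}_n(p)$, $n \ge 0$, is

$$\frac{x^3 (1-x)(1+2x)(1+x)}{(1 - x - 3x^2 + x^3)^2}.$$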
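The popularity generating function above follows from Theorem 10 by differentiating with respect to $y$ and setting $y = 1$. The computation below is a sketch; the names $N$ and $D$ for the numerator and denominator of $C_p(x, y)$ are introduced here only for illustration.

```latex
% Assumption: N and D denote the numerator and denominator in Theorem 10.
\begin{align*}
N &= 1 - x^2 - x^2 y, &
D &= 1 - x - 2x^2 - x^2 y + x^3 + x^4 - x^4 y,\\
\frac{\partial N}{\partial y} &= -x^2, &
\frac{\partial D}{\partial y} &= -x^2 - x^4,\\
N\big|_{y=1} &= 1 - 2x^2, &
D\big|_{y=1} &= 1 - x - 3x^2 + x^3,
\end{align*}
\begin{align*}
\left.\frac{\partial}{\partial y}\frac{N}{D}\right|_{y=1}
 &= \frac{-x^2\,(1 - x - 3x^2 + x^3) + (1 - 2x^2)(x^2 + x^4)}{(1 - x - 3x^2 + x^3)^2}\\
 &= \frac{x^3 + 2x^4 - x^5 - 2x^6}{(1 - x - 3x^2 + x^3)^2}
  = \frac{x^3 (1-x)(1+x)(1+2x)}{(1 - x - 3x^2 + x^3)^2}.
\end{align*}
```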

Note that the sequences defined by the two generating functions in Corollary 8 do not appear in .

###### Theorem 11.

For $p = 100$, we have

$$C_p(x, y) = \frac{1 - 2x - x^2 y + x^3}{1 - 3x + x^2 - x^2 y + 2x^3}.$$
###### Proof.

For , we define as the set of Catalan words avoiding with exactly zeros, and let be the generating function for .

A Catalan word is of the form with . Since we have , the generating function for these words satisfies .

A Catalan word , , is of the form with . Since we have , the generating function for these words satisfies .

A Catalan word has one of the three following forms:

(1) with ; we have , and the generating function for these Catalan words is .

(2) with ; we have , and the generating function for these Catalan words is .

(3) where and are non-empty and for some (i.e., with ). So, there are possible choices for , namely and . If , then ; if and , then ; if and , then . So, the generating function for these words is , which is .

Taking into account all previous cases, we obtain the following functional equations:

A simple calculation gives the desired result. ∎

###### Corollary 9.

For , we have , which is the generating function for the sequence (see A057960 in ), and the generating function for the popularity of descents in the sets , , is

$$\frac{x^3 (1 - x - x^2)}{(1 - 3x + 2x^3)^2}.$$

For , we have

 Cp(x)=