1 Introduction
The convolution is a well-known and very useful method, which is not only closely linked to signal processing (e.g. [12]) but is also used to multiply polynomials (see [5, p. 905]) and large numbers (e.g. [11] (written in German)) in quasilinear time. The convolution can be efficiently computed with the fast Fourier transform or its counterpart in residue class rings, the number-theoretic transform:
Theorem 1.
Let and be two sequences. The sequence with can be computed in operations.
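To make the statement concrete, the following minimal sketch computes a convolution directly from the usual definition, in which the output entry with index k is the sum of all products of entries whose indices add up to k. The names `convolve`, `a` and `b` are illustrative assumptions and not notation from the text. The double loop is a quadratic reference implementation only; it does not reach the quasilinear bound of the theorem.

```python
def convolve(a, b):
    """Acyclic convolution: c[k] is the sum of a[i] * b[j] over all i + j == k."""
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            c[i + j] += a_i * b_j
    return c


# Multiplying the polynomials 1 + 2x and 3 + 4x + 5x^2 via their coefficients.
print(convolve([1, 2], [3, 4, 5]))  # [3, 10, 13, 10]
```

The same function doubles as a polynomial multiplication routine, which is the application mentioned at the start of this section.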
The most well-known proofs use additions and multiplications of arbitrary complex numbers. However, with the finite register lengths of real-world computers, one must either cope with the round-off errors or do all calculations in a different ring. In Appendix A.1, we show that a suitable ring is only dependent on and can be found in time if the generalized Riemann hypothesis is true.
The convolution can also be interpreted geometrically: Let and be sequences. Then the convolution calculates the partial sums
where is the square given by .
This paper extends this geometric interpretation and shows that if is an arbitrary convex polygon with vertices and perimeter , the partial sums can be calculated in time.
We also use this extended method to solve an open problem concerning a string pattern called the cadence. A cadence is given by an arithmetic progression of occurrences of the same character in a string such that the progression cannot be extended to either side without extending the string as well. For example, in the string the indices corresponding to the “1”s form a cadence. On the other hand, in the string the indices corresponding to the “1”s do not form a cadence since, for example, the index is still inside of the string.
3-cadences can be found naïvely in quadratic time. In [2], a quasilinear-time algorithm for detecting the existence of 3-cadences was proposed, but this algorithm also reports false positives such as the aforementioned string .
This paper fixes this issue and also extends the algorithm to the slightly more general notion of partial-cadences. The resulting extended algorithm also allows counting those partial-cadences and only needs time. Using a method presented by Amir et al. in [2], this implies that all partial-cadences can be counted in time.
Furthermore, we show that the output of the counting algorithm also allows for finding partial-cadences in time.
This paper also gives similar results for subcadences.
For the time complexity, we assume that arithmetic operations with bits can be done in constant time. In particular, we want to be able to get the remainder of a division by a prime in constant time.
Also, in this paper, we assume a suitable alphabet. That is, the characters are given by sufficiently small integers, so that a given character of the string can be read in constant time and the characters can be sorted.
2 (Sub)Cadences and Their Definitions
The term cadence in the context of strings dates back to 1964 and was first introduced by Gardelle and Guilbaud in [6] (written in French). Since then, at least two other, slightly different and non-equivalent definitions have been given by Lothaire in [9] and Amir et al. in [2].
This paper uses the most restrictive definition of the cadence, which was introduced by Amir et al. in [2], and also uses their definition of the subcadence, which is equivalent to Gardelle’s cadence in [6] and Lothaire’s arithmetic cadence in [9].
A string of length is the concatenation of characters from an alphabet .
Definition 1.
A subcadence is a triple of positive integers such that
holds.
In this paper, cadences are additionally required to start and end close to the boundaries of the string:
Definition 2.
A cadence is a subcadence such that the inequalities and hold.
Since for any subcadence the inequality holds, for any cadence holds. This implies and thereby . It is therefore sufficient to omit the variable of a cadence and just denote this cadence by the pair .
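The two definitions can be illustrated by a small checker. This is a sketch under stated assumptions: the symbols of the definitions did not survive in the text above, so the code assumes the convention of Amir et al. [2], with 1-based positions i, i+d, ..., i+(k-1)d carrying the same character, and with a cadence additionally satisfying i - d <= 0 and i + k*d > n for a string of length n. The function names and the example strings are hypothetical.

```python
def is_sub_cadence(s, i, d, k=3):
    """True if s has equal characters at the 1-based positions i, i+d, ..., i+(k-1)d."""
    n = len(s)
    if i < 1 or d < 1 or i + (k - 1) * d > n:
        return False
    return all(s[i - 1 + j * d] == s[i - 1] for j in range(k))


def is_cadence(s, i, d, k=3):
    """A sub-cadence that cannot be extended: i - d <= 0 and i + k*d > len(s)."""
    return is_sub_cadence(s, i, d, k) and i - d <= 0 and i + k * d > len(s)


def naive_cadences(s, k=3):
    """Enumerate all pairs (i, d) forming a k-cadence by trying every candidate."""
    n = len(s)
    return [(i, d) for d in range(1, n + 1) for i in range(1, d + 1)
            if is_cadence(s, i, d, k)]


print(naive_cadences("10101"))          # [(1, 2)]
print(is_sub_cadence("0010101", 3, 2))  # True
print(is_cadence("0010101", 3, 2))      # False, since position 3 - 2 = 1 is inside the string
```

The enumeration tries quadratically many pairs (i, d), which corresponds to the naive approach mentioned in the introduction.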
Remark 1 (Comparison of the Definitions).

The cadence as defined by Lothaire is just an ordered sequence of unequal indices such that the corresponding characters are equal.

The cadence as defined by Gardelle and Guilbaud additionally requires the sequence to be an arithmetic sequence.

The cadence as defined by Amir et al. and as used in this paper additionally requires that the cadence cannot be extended in any direction without extending the string as well.
For the analysis of cadences with errors, we need two more definitions:
Definition 3.
A cadence with at most errors is a tuple of integers such that and and hold and such that there are different integers with and
A particularly interesting case of cadences with errors is given by the partial-cadences in which we know all positions where an error is allowed:
Definition 4.
For some different integers with , a partial-cadence is a triple of positive integers with and such that
hold.
3 3-Subcadences and Rectangular Convolutions
Lothaire showed over 20 years ago that sufficiently large strings are guaranteed to have subcadences of a given length:
Theorem 2 (Existence of Subcadences (Lothaire [9])).
Let be an alphabet and an integer. There exists an integer such that every string containing at least characters has at least one subcadence
However, this theorem does not provide the number of subcadences of a given string.
In this section, we will show that subcadences with a given character of a string of length can be efficiently counted in time. We will also show that arbitrary subcadences of a string of length can be counted in time and that both counting algorithms make it possible to output different subcadences in additional time if at least different subcadences exist.
Let be a character. We will now count all subcadences with character .
Let be a subcadence. Since holds, the position of the middle occurrence of only depends on the sum of the index of first occurrence and the index of the third occurrence but does not depend on the individual indices of those two positions. Therefore, it is possible to determine the candidates for the middle occurrences with the convolution of the candidates of the first occurrence and the candidates of the third occurrence.
Let the sequence be given by the indicator function for in :
With this definition, the product is if and only if and otherwise is . Therefore counts in how many ways the index lies in the middle of two . These partial sums can be calculated in time by convolution.
If is odd or holds, the index cannot be the middle index of a subcadence. If holds, the indicator function is , and therefore holds as well. Since is not a subcadence, the output element contains one false positive. Additionally, for with and , the output element counts the combination as well as . Therefore,
counts exactly the number of subcadences with character such that the second occurrence of has index . The sum of the is the number of total subcadences with character .
Also, for each , all those subcadences can be found in time by checking for each index whether holds.
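The counting step above can be sketched as follows. The sketch assumes 0-based indices and uses the hypothetical names `delta` for the indicator sequence and `counts` for the per-middle-index result. A quadratic `convolve` stands in for the fast convolution, so the code illustrates the correction for false positives and double counting rather than the running time.

```python
def convolve(a, b):
    """Quadratic stand-in for an FFT- or NTT-based convolution."""
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            c[i + j] += a_i * b_j
    return c


def count_sub_cadences_by_middle(s, sigma):
    """counts[m] = number of 3-subcadences of sigma whose middle occurrence is at index m."""
    delta = [1 if ch == sigma else 0 for ch in s]
    # c[k] counts the ordered pairs (i, j) with i + j == k and s[i] == s[j] == sigma.
    c = convolve(delta, delta)
    counts = [0] * len(s)
    for m in range(len(s)):
        if delta[m]:
            # c[2m] contains the false positive (m, m) once and every real pair twice.
            counts[m] = (c[2 * m] - 1) // 2
    return counts


print(count_sub_cadences_by_middle("aaaa", "a"))  # [0, 1, 1, 0]
```

Summing the returned list gives the total number of 3-subcadences with the chosen character.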
If the character is rare, we can also follow the idea of Amir et al. in [2] for detecting cadences with rare characters: If all occurrences of the character are known, the can be computed in time by computing every pair of those occurrences. Therefore:
Theorem 3.
For every character , the subcadences with can be counted in time. Also, if all occurrences of are known, the subcadences with can be counted in time.
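For rare characters, the pair-based alternative can be sketched as below. The function name and the dictionary result are assumptions made for illustration. The sketch looks up the middle position directly in a set instead of first building the convolution output, which is a small simplification of the approach described above.

```python
def count_sub_cadences_from_occurrences(occurrences):
    """occurrences: sorted 0-based positions of one character.

    Returns a dict mapping each middle index to the number of 3-subcadences
    that have their second occurrence there.
    """
    positions = set(occurrences)
    counts = {}
    for x, first in enumerate(occurrences):
        for third in occurrences[x + 1:]:
            if (first + third) % 2 == 0:
                middle = (first + third) // 2
                if middle in positions:
                    counts[middle] = counts.get(middle, 0) + 1
    return counts


print(count_sub_cadences_from_occurrences([0, 1, 2, 3]))  # {1: 1, 2: 1}
```

The number of inspected pairs is quadratic in the number of occurrences and independent of the length of the string.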
Following the proof in [2], we can get all occurrences of every character by sorting the input string in time. This implies that the algorithm needs at most
time.
Theorem 4.
The number of all subcadences can be counted in
Theorem 5.
After counting at least subcadences, it is possible to output subcadences in time.
4 Non-Rectangular Convolutions
In this section, we will extend the geometric interpretation of the convolution and show that for convex polygons with vertices and perimeter it is possible to calculate the partial sums
in time.
Imagine a grid in which every point with integer coordinates has the value . We do not need the convolution in order to determine the sum of the function values in a given rectangle since we can use the simple factorization in time. However, the convolution provides the partial sums on the diagonals in almost the same time of .
We will now extend this geometric interpretation first to triangles with a vertical cathetus and a horizontal cathetus, then to arbitrary triangles and lastly to convex polygons. In order to do this, we will divide the given polygon into polygons and such that for each integer point the equality
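The difference between the two kinds of sums can be seen in a small sketch. It assumes that the value at the grid point (i, j) is the product a[i] * b[j]; the names `rectangle_total` and `diagonal_sums` are illustrative only, and the diagonal sums are computed with a plain double loop instead of a fast convolution.

```python
def rectangle_total(a, b):
    """Sum of a[i] * b[j] over the whole rectangle, via the factorization (sum a) * (sum b)."""
    return sum(a) * sum(b)


def diagonal_sums(a, b):
    """c[k] = sum of a[i] * b[j] over the points of the rectangle with i + j == k."""
    c = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            c[i + j] += a_i * b_j
    return c


a, b = [1, 2, 3], [4, 5]
print(rectangle_total(a, b))  # 54
print(diagonal_sums(a, b))    # [4, 13, 22, 15]
```

The diagonal sums add up to the rectangle total, but they carry strictly more information, which is what the following constructions exploit.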
holds, and we define
By construction, holds. However, if the edges and vertices of the polygons and contain integer points, we need to carefully decide for each of these polygons which edges and vertices are supposed to be included in the polygons and which are excluded from the polygons.
Lemma 1.
Let be a triangle with a vertical cathetus and a horizontal cathetus and perimeter . Let also the sequences and be given.
Then the partial sums
can be calculated in time.
Proof.
The proof will be symmetrical with regard to horizontal and vertical mirroring. Therefore, without loss of generality, we will assume that is oriented as in Figure 1.
In the following proof, we assume that both catheti are included in the polygon and that the hypotenuse as well as its endpoints are excluded. If this is not the expected behavior, we can traverse the edges in time and, for each integer point on the edge, decrease or increase the corresponding by if necessary.
If is at most one, there is at most one integer point in the triangle, and this point can be found in constant time. In this case, we only have to increase by .
If is bigger than one, we will separate the triangle into three disjoint parts as seen in Figure 1.

The triangle of points with x-coordinate of at least ,

the triangle of points with y-coordinate of at least and

the red rectangle of points with x-coordinate of at most and y-coordinate of at most .
There are no integers bigger than but smaller than nor integers bigger than but smaller than . Therefore, each integer point in is in exactly one of the three parts.
For the red rectangle, we can calculate the convolution and thereby get the corresponding partial sums in time. The partial sums corresponding to the subtriangles are calculated recursively. Increasing the by the partial results leads to the final result.
Hence, the algorithm takes
time. ∎
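The recursion of the proof can be sketched as follows. The sketch makes several assumptions that are not taken from the text: the triangle is described by the integer points with x >= cx, y >= cy and A*x + B*y < C for positive integers A, B, C, the split position `xm` is chosen as the midpoint of the x-range, and a quadratic `convolve` stands in for the fast convolution. It is therefore an illustration of the rectangle-plus-two-triangles decomposition, not the exact split used in the proof.

```python
def convolve(a, b):
    """Quadratic stand-in for an FFT- or NTT-based convolution."""
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            c[i + j] += a_i * b_j
    return c


def triangle_sums(a, b, A, B, C, cx=0, cy=0, out=None):
    """Add a[x] * b[y] to out[x + y] for all integer points with
    x >= cx, y >= cy and A*x + B*y < C (A, B > 0, cx, cy >= 0)."""
    if out is None:
        out = [0] * (len(a) + len(b) - 1)
    # An excluded corner means the sub-triangle holds no integer point.
    if A * cx + B * cy >= C or cx >= len(a) or cy >= len(b):
        return out
    x_max = (C - B * cy - 1) // A         # largest x on the bottom row
    xm = (cx + x_max) // 2 + 1            # rectangle covers the columns cx .. xm - 1
    ym = (C - A * (xm - 1) - 1) // B      # largest y in column xm - 1
    # The rectangle [cx, xm - 1] x [cy, ym] lies completely inside the triangle.
    for k, value in enumerate(convolve(a[cx:xm], b[cy:ym + 1])):
        out[cx + cy + k] += value
    triangle_sums(a, b, A, B, C, xm, cy, out)      # sub-triangle to the right
    triangle_sums(a, b, A, B, C, cx, ym + 1, out)  # sub-triangle on top
    return out


print(triangle_sums([1, 1, 1, 1], [1, 1, 1], 1, 1, 3))  # [1, 2, 3, 0, 0, 0]
```

Both recursive calls work on at most half of the x-range of their parent, so the recursion depth is logarithmic in the size of the triangle.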
We will now further extend this result to arbitrary triangles:
Lemma 2.
Let a triangle with perimeter and sequences and be given.
Then the partial sums
can be calculated in time.
Proof.
Let be the minimal and maximal x-coordinates and y-coordinates of the three vertices of the polygon . As in the last lemma, we first initialize the output vector .
Similarly to the last lemma, we can remove/add edges and vertices in linear time with respect to . Since the number of edges and vertices is constant, we ignore them for the sake of simplicity.
Let be the rectangle . Since has four edges but only has three vertices, at least one of the vertices of is also a vertex of . Without loss of generality, this vertex is .
 Case 1:

The opposing vertex in also coincides with a vertex of (as in the left hand side of Figure 2):
Without loss of generality, we can assume that the third vertex of is above the diagonal from to . In this case, the partial sums corresponding to are given by the sum of the partial sums of the red triangles and the partial sums of the blue rectangle minus the partial sums of the lighter triangle.
There are only three triangles and one rectangle involved, and each of those polygons has perimeter . Furthermore, all triangles have a vertical cathetus and a horizontal cathetus. Therefore, using Lemma 1, we can calculate all partial sums in time.
 Case 2:

The opposing vertex in does not coincide with a vertex of (as in the right hand side of Figure 2):
In this case, one vertex of lies on the right edge of and one vertex of lies on the upper edge of .
The wanted partial sums are in this case the difference of the partial sums of the rectangle and of the partial sums of the three red triangles. Again, we can calculate all partial sums in time.
Since both cases require time, this concludes the proof. ∎
Now we will extend this algorithm to convex polygons by dissecting them into triangles with sufficiently small perimeter.
Theorem 6.
Let be a convex polygon with vertices and perimeter . Let also the sequences and be given.
Then the partial sums
can be calculated in time.
Proof.
As in the last two lemmata, we define to be the minimal and maximal x-coordinates and y-coordinates of the vertices of . Also, we first initialize the output vector . We further assume that none of the edges and vertices of is included in .
If is a triangle, then this theorem reduces to Lemma 2 and there is nothing left to prove.
If is a quadrilateral , as in the left hand side of Figure 3, then it can be partitioned into the triangles and where the edge is included in exactly one triangle and all other edges are excluded. The triangle inequality proves that and hold. Therefore, both triangles have a perimeter of at most . This implies that the partial sums can be calculated in
If is a polygon with more than four vertices, as in the right hand side of Figure 3, it can be partitioned into

the polygon , which is given by the odd vertices without its edges,

the red triangles with including the edge but excluding the other edges and the vertices,

if is even, the triangle including the edge but excluding the other edges and the vertices.
By construction and triangle inequality, the perimeter of is at most . This, however, also implies that the total perimeter of the triangles is at most . The inequality
implies that the algorithm needs time plus the time we need for processing . Since each step almost halves the number of vertices, we need steps. This results in a total time complexity of . ∎
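The dissection used in this proof can be sketched as a repeated "peeling" step. The sketch assumes that the polygon is given as a list of vertices in counter-clockwise order with 0-based indices, and the names `peel`, `triangulate` and `twice_area` are hypothetical. It only produces the triangles; the bookkeeping of included and excluded edges and the partial sums themselves are omitted.

```python
def peel(vertices):
    """One step: cut off a triangle at every second vertex.

    Returns the cut-off triangles and the remaining polygon, which consists
    of every second vertex of the input.
    """
    n = len(vertices)
    triangles = [(vertices[2 * i], vertices[2 * i + 1], vertices[(2 * i + 2) % n])
                 for i in range(n // 2)]
    return triangles, vertices[0::2]


def triangulate(vertices):
    """Dissect a convex polygon into triangles by repeated peeling."""
    triangles = []
    polygon = list(vertices)
    while len(polygon) > 3:
        ears, polygon = peel(polygon)
        triangles.extend(ears)
    if len(polygon) == 3:
        triangles.append(tuple(polygon))
    return triangles


def twice_area(points):
    """Shoelace formula: twice the signed area of a polygon."""
    return sum(x1 * y2 - x2 * y1
               for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]))


hexagon = [(0, 0), (4, 0), (6, 2), (6, 5), (3, 7), (0, 4)]
parts = triangulate(hexagon)
print(len(parts))                                             # 4
print(sum(twice_area(t) for t in parts) == twice_area(hexagon))  # True
```

Each peeling step roughly halves the number of vertices, which matches the logarithmic number of steps in the proof.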
5 (a,b,c)-Partial-k-Cadences
In this section, we will show how the non-rectangular convolution helps counting the partial-cadences with a given character in . We will further show that all partial-cadences can be counted in time and that both counting algorithms make it possible to output of those partial-cadences in time.
As a special case, these results also hold for cadences.
We further conclude from these results that the existence of cadences with at most errors can be detected in time.
Without loss of generality, we will only deal with the case in this section.
Lemma 3.
Three positions , and form a partial-cadence if and only if

the equation holds,

the equation holds and

the inequalities
(1) (2) (3) (4)
Proof.
Define and . Then and . Furthermore, the equation holds if and only if and holds if and only if is an integer.
Additionally, using and , the four inequalities can be simplified to , , and .
Therefore, the lemma follows from the definition of the partial-cadence. ∎
The four inequalities hold if the points lie inside the convex quadrilateral given, as shown in Figure 4, by the corners
including the vertex and the edges between and as well as between and but excluding all other vertices and the edges between and as well as between and .
For given and , the third occurrence can be calculated with the equation directly without calculating and first. The corresponding partial sums
can be calculated by using the partial sums
with and a polygon , which is derived from by stretching the first coordinate by and the second coordinate by . The perimeter of is at most times the perimeter of . Using the quadrilateral with perimeter
the polygon has perimeter . This proves the following three theorems.
Theorem 7.
For every character , the partial-cadences with can be counted in time. Also, if all occurrences of are known, the partial-cadences with can be counted in time.
Theorem 8.
The number of all partial-cadences can be counted in
Theorem 9.
After counting at least partial-cadences, it is possible to output partial-cadences in time.
Since every cadence is a partial-cadence, we also obtain the special case:
Corollary 1.
For every character , the cadences with can be counted in time. Also, if all occurrences of are known, the cadences with can be counted in time.
Therefore, the number of all cadences can be counted in
Also, after counting at least cadences, it is possible to output cadences in time.
Taking the sum over all possible triples , we can also search for cadences with at most errors. It can be checked in
time whether the given string has a cadence with at most errors. However, since cadences with fewer than errors are counted more than once, it seems to be difficult to determine the exact number of cadences with at most errors.
6 Conclusion
This paper extends convolutions to arbitrary convex polygons. One might wonder whether these convolutions could be sped up or be further extended to non-convex polygons.
Instead of just partitioning the interior of the polygon into triangles, it is also possible to express a polygon as the difference of a slightly bigger but less complex polygon and a triangle. However, if the algorithm presented in this paper is adapted to non-convex polygons, it can generate self-intersecting polygons. While the time complexity stays the same for these polygons, it becomes hard to ensure that every vertex and every edge of the polygon is counted exactly once.
Another approach is given by Levcopoulos and Lingas in [7]. Their paper shows that any simple polygon can be decomposed into convex components in quasilinear time with only logarithmic blow-up. It also shows that if the input polygon is rectilinear, this partition only contains axis-aligned rectangles. Since the convolution handles rectangles more quickly and more easily than triangles, this saves a logarithm. However, in general, it is not obvious how to transform arbitrary polygons into equivalent simple rectilinear polygons in quasilinear time without blowing up the number of vertices too much.
The non-rectangular convolution, unlike the usual convolution, allows defining a dependence between the indices of the convolved sequences. This dependence is not usable in applications like the multiplication of polynomials, and for many signal processing applications this extended method does not seem to bring any benefits either. However, in order to count the partial-cadences, this dependence was essential. The non-rectangular convolution may also have future applications in image processing and convolutional neural networks.
In terms of cadences, this paper presents algorithms to count and find subcadences, cadences and partial-cadences with three elements. However, if there are linearly many positions of partial-cadences, the knowledge of those partial-cadences does not lead to a subquadratic-time algorithm for determining the existence of cadences. On the other hand, it is also not shown that this problem needs quadratic time.
Also, the time complexity for finding cadences is quite pessimistic. If there are many cadences, it is very likely that quite a few of these cadences share one of their occurrences. These occurrences can be found in time. On the other hand, in the string , for example, there are linearly many cadences but every second occurrence and every third occurrence only occurs in at most one of those cadences.
7 Acknowledgements
The first author discovered an error in the algorithm for determining the existence of 3-cadences in “String cadences” by Amir et al., which led to false positives. Travis Gagie explained this error to the second author at the CPM conference in Pisa. He also claimed that this problem should be solvable. Juliusz Straszyński showed that subcadences beginning and ending in given intervals can efficiently be detected by convolution. Amihood Amir noted that we can also efficiently count these subcadences, which allows “subtractive” methods as used for arbitrary triangles.
Appendix A Appendix
A.1 Convolutions
It is well known that the discrete convolution can be calculated with complex arithmetic operations. However, if the convolution is calculated with the fast Fourier transform, the finite register lengths introduce round-off errors. These errors can propagate and accumulate throughout the calculation.
Therefore, in order to calculate the convolution of integer sequences, it seems more convenient to use the number theoretic transform, which is the generalization of the fast Fourier transform from the field of the complex numbers to certain residue class rings.
In this section, we will show that after some precomputation in time it is possible to calculate these convolutions in time.
Agarwal and Burrus show in [1] that the cyclic convolution of two integer vectors of length can be efficiently computed modulo a prime if is a multiple of .
Linnik proves in [8] that there are constants and such that for each , with , there is a prime of the form with . While Linnik himself did not provide the values of and , there are some upper bounds: For example, Xylouris proves in [13] that there is a such that for each , with , there is a prime of the form with . More explicitly, Bach and Sorenson present in [4] that if the generalized Riemann hypothesis holds, for each , with , there is a prime of the form with .
As a result, for each , there is a prime with . This also implies that the length of is at most times the length of . Therefore, such a prime number is a good modulus for the convolution of length or any of its divisors. It is left to show that such a prime can be efficiently found.
Theorem 10.
Let be an integer. A prime with can be found in time.
Proof.
The main idea is to use the sieve of Eratosthenes to first find all primes up to and then sieve only the numbers up to that are congruent to modulo with these primes.
On the one hand, since holds, all numbers left after the second sieving are primes. On the other hand, the result of Bach and Sorenson in [4] guarantees that if the generalized Riemann hypothesis holds, there is a prime left. Also, by construction, all primes left fulfill this theorem.
It remains to be shown that this algorithm can be done in time.
For the usual sieve of Eratosthenes, one prepares a Boolean array for the first numbers. Then, for each number that has not been marked as non-prime, every multiple is marked as non-prime. Afterwards, all unmarked numbers are returned. The majority of the time is spent on the marking. This takes
time. The last equality is given by Mertens in [10, p. 46] (written in German) and the inequality .
For the second part, we have a much larger interval of numbers. However, since we only have to consider the first residue class, only every th number has to be considered. Therefore we need
markings. Using the extended Euclidean algorithm, for every prime , we can find the smallest such that in time. Summing up over all primes, this takes
time.
This concludes the proof. ∎
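The two-stage sieve of this proof can be sketched as follows. The search bound `limit` is a caller-supplied assumption, because the concrete bound of the theorem is not reproduced in the text above, and the function names are hypothetical. The modular inverse `pow(n, -1, q)` (Python 3.8 or later) takes the place of the extended Euclidean algorithm.

```python
from math import isqrt


def primes_up_to(limit):
    """Ordinary sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for q in range(2, isqrt(limit) + 1):
        if sieve[q]:
            for multiple in range(q * q, limit + 1, q):
                sieve[multiple] = False
    return [q for q, is_prime in enumerate(sieve) if is_prime]


def find_prime_in_residue_class(n, limit):
    """Smallest prime p <= limit with p % n == 1, or None if there is none."""
    max_k = (limit - 1) // n
    composite = [False] * (max_k + 1)      # composite[k] refers to k * n + 1
    for q in primes_up_to(isqrt(limit)):
        if n % q == 0:
            continue                       # k * n + 1 is never divisible by q
        first_k = (-pow(n, -1, q)) % q     # smallest k with q dividing k * n + 1
        for k in range(first_k, max_k + 1, q):
            if k * n + 1 != q:             # do not cross out the prime q itself
                composite[k] = True
    for k in range(1, max_k + 1):
        if not composite[k]:
            return k * n + 1
    return None


print(find_prime_in_residue_class(8, 100))       # 17
print(find_prime_in_residue_class(1024, 20000))  # 12289
```

Only every n-th number is represented in the second array, which is what keeps the second sieving stage cheap despite the larger range.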
Remark 2.
The prime number theorem states that the number of primes smaller than asymptotically behaves like . Dirichlet’s prime number theorem states that for a given and a sufficiently large , the prime numbers are evenly distributed in all residue classes with .
Therefore, for a given and sufficiently large , we should expect circa prime numbers of the form that are smaller than . One might therefore hope that it is possible to guess logarithmically many numbers smaller than in the right residue class, and then test in time whether this number is prime.
However, the “sufficient largeness” of depends on . Therefore, these theorems do not provide the number of suitable primes smaller than, for example, . Also, since the generation of suitable primes can be done in quasilinear time, the randomized shortcut is not necessary.
It is not only possible to find a suitable modulus for the number theoretic transform, but we can also find a suitable th root:
Theorem 11.
Let be a prime with and .
A th root of unity modulo can be found in time.
Proof.
Let for an odd number .
Firstly, we will show that a residue is a th root of unity modulo if and only if is a quadratic nonresidue modulo .
Since is prime, there is a primitive root modulo .
Let . Then has the order . Therefore, has order if and only if is odd. On the other hand, if is even, then is a quadratic residue, and if is odd, then is a quadratic nonresidue. This implies that is a th root of unity modulo if and only if is a quadratic nonresidue modulo .
Ankeny shows in [3] that if the generalized Riemann hypothesis holds, there is a quadratic non-residue in the first residue classes. For any residue it can be tested with multiplications and modulo operations whether has order . As a by-product we get . If and only if has order , the power has order .
Therefore, a th root of unity modulo can be found in time. ∎
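The search for a root of unity can be sketched as below. The sketch assumes that the desired order `n` is a power of two dividing p - 1, and it detects quadratic non-residues with Euler's criterion; the function name is hypothetical. It simply tries the residues 2, 3, ... in order, relying on the bound cited above to find a non-residue quickly.

```python
def find_root_of_unity(p, n):
    """Return an element of order n modulo the prime p.

    Assumes that n is a power of two and that n divides p - 1.
    """
    assert (n & (n - 1)) == 0 and (p - 1) % n == 0
    for x in range(2, p):
        # Euler's criterion: x is a quadratic non-residue iff x^((p-1)/2) == -1 (mod p).
        if pow(x, (p - 1) // 2, p) == p - 1:
            return pow(x, (p - 1) // n, p)
    raise ValueError("no quadratic non-residue found")


w = find_root_of_unity(17, 8)
print(pow(w, 8, 17), pow(w, 4, 17))  # 1 16
```

Since n is a power of two, checking that the n-th power is 1 and the (n/2)-th power is -1 confirms that the returned element has order exactly n.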
Therefore, we can efficiently compute the integer convolution with the help of the number-theoretic transform.
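A convolution based on the number-theoretic transform can be sketched as follows. The modulus `P = 998244353` with primitive root `G = 3` is a commonly used choice that is assumed here for illustration; it is not the modulus constructed in Theorem 10. The results equal the true integer convolution only as long as all true output values are non-negative and smaller than `P`.

```python
P = 998244353   # prime of the form 119 * 2**23 + 1 (assumed for this sketch)
G = 3           # a primitive root modulo P


def ntt(values, root):
    """Recursive radix-2 transform; len(values) must be a power of two and
    root must have order len(values) modulo P."""
    n = len(values)
    if n == 1:
        return list(values)
    even = ntt(values[0::2], root * root % P)
    odd = ntt(values[1::2], root * root % P)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % P
        out[k] = (even[k] + t) % P
        out[k + n // 2] = (even[k] - t) % P
        w = w * root % P
    return out


def convolve_mod(a, b):
    """Acyclic convolution of a and b modulo P."""
    if not a or not b:
        return []
    size = 1
    while size < len(a) + len(b) - 1:
        size *= 2
    root = pow(G, (P - 1) // size, P)
    fa = ntt(list(a) + [0] * (size - len(a)), root)
    fb = ntt(list(b) + [0] * (size - len(b)), root)
    product = [x * y % P for x, y in zip(fa, fb)]
    inverse = ntt(product, pow(root, -1, P))   # inverse transform uses root^-1
    size_inv = pow(size, -1, P)                # ... and a division by the length
    return [v * size_inv % P for v in inverse[:len(a) + len(b) - 1]]


print(convolve_mod([1, 2], [3, 4, 5]))  # [3, 10, 13, 10]
```

Padding both inputs to a power of two that is at least the output length turns the cyclic convolution of the transform into the acyclic one.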
Theorem 12.
For a given integer , we can find a modulus and a suitable root in time such that it is possible to calculate the acyclic convolution modulo of two sequences of length in time afterwards.
Proof.
The acyclic convolution of sequences of length can be derived from a cyclic convolution of sequences with lengths of at least . Therefore, it is sufficient to prepare with .
For this length, the last two theorems state that a suitable modulus and a suitable th root of unity can be found in .
Afterwards, for every