1 Introduction
String matching is one of the fundamental problems in computer science, and it can be applied to many practical problems. In many applications string matching has variants derived from exact matching (which can be collectively called generalized matching), such as order-preserving matching [19, 20, 22], parameterized matching [4, 7, 8], jumbled matching [9], overlap matching [3], pattern matching with swaps [2], and so on. These problems are characterized by the way a match is defined, which depends on the application domain of the problem. In financial markets, for example, people want to find patterns in the time series data of stock prices. In this case, they are more interested in patterns of price fluctuation than in the exact prices themselves [15]. Therefore, we need a definition of match which is appropriate to handle such cases.
The Cartesian tree [27] is a tree data structure that represents an array, focusing only on the results of comparisons between the numeric values in the array. In this paper we introduce a new metric of match, called Cartesian tree matching, in which two strings match if they have the same Cartesian trees. If we model the time series stock prices as a numeric string, we can find a desired pattern in the data by solving a Cartesian tree matching problem. For example, let's assume that the pattern we want to find looks like the picture on the left of Figure 1, which is a common pattern called the head-and-shoulder [15] (in fact there are two versions of the head-and-shoulder: one is the picture in Figure 1 and the other is the picture reversed). The picture on the right of Figure 1 is the Cartesian tree corresponding to the pattern on the left. Cartesian tree matching finds every position of the text which has the same Cartesian tree as the picture on the right of Figure 1.
Even though order-preserving matching [19, 20, 22] can also be applied to finding patterns in time series data, Cartesian tree matching may be more appropriate than order-preserving matching in finding patterns. For instance, let's assume that we are looking for the pattern in Figure 1 in time series stock prices. An important characteristic of the pattern is that the price hits the bottom (head), and it has two shoulders before and after the head. But the relative order between the two shoulders (i.e., which one is higher) does not matter. If we model this pattern in order-preserving matching, then order-preserving matching imposes a relative order between the two shoulders. Moreover, it imposes an unnecessary order between the two valleys. Hence, order-preserving matching may not be able to find such a pattern in time series data. In contrast, the pattern in Figure 1 can be represented by one Cartesian tree, and therefore Cartesian tree matching is a more appropriate metric in such cases.
In this paper we define string matching problems based on Cartesian tree matching: single pattern matching for a text of length $n$ and a pattern of length $m$, and multiple pattern matching for a text of length $n$ and $k$ patterns of total length $m$, and we present efficient algorithms for them. We also define an index data structure called the Cartesian suffix tree, as in the cases of parameterized matching and order-preserving matching [8, 13], and present an efficient algorithm to build the Cartesian suffix tree. To obtain efficient algorithms for Cartesian tree matching, we define a representation of the Cartesian tree, called the parent-distance representation.
In Section 2 we give basic definitions for Cartesian tree matching. In Section 3 we propose an $O(n+m)$ time algorithm for single pattern matching. In Section 4 we present an $O((n+m)\log k)$ deterministic time or $O(n+m)$ randomized time algorithm for multiple pattern matching. In Section 5 we define the Cartesian suffix tree, and present an $O(n)$ randomized time algorithm to build the Cartesian suffix tree of a string of length $n$.
2 Problem Definition
2.1 Basic notations
A string is a sequence of characters in an alphabet $\Sigma$, which is a set of integers. We assume that the comparison between any two characters can be done in $O(1)$ time. For a string $S$, $S[i]$ represents the $i$-th character of $S$, and $S[i..j]$ represents the substring of $S$ starting at position $i$ and ending at position $j$.
2.2 Cartesian tree matching
A string $S[1..n]$ can be associated with its corresponding Cartesian tree $CT(S)$ according to the following rules [27]:

If $S$ is an empty string, $CT(S)$ is an empty tree.

If $S[1..n]$ is not empty and $S[i]$ is the minimum value among the values in $S$, $CT(S)$ is the tree with $S[i]$ as the root, $CT(S[1..i-1])$ as the left subtree, and $CT(S[i+1..n])$ as the right subtree. If there are two or more minimum values, we choose the leftmost one as the root.
Since each character in string $S$ corresponds to a node in Cartesian tree $CT(S)$, we can treat each character as a node in the Cartesian tree.
Cartesian tree matching is the problem of finding all the positions in the text at which the text has the same Cartesian tree as a given pattern. Formally, we define it as follows:
Definition 2.1 (Cartesian tree matching). Given two strings, text $T[1..n]$ and pattern $P[1..m]$, find every position $1 \le i \le n-m+1$ such that $CT(T[i..i+m-1]) = CT(P)$.
For example, let's consider a sample text $T$. If we search for the pattern $P$ in Figure 1, we can find a match at position 5 of the text, i.e., $CT(T[5..5+m-1]) = CT(P)$. Note that the matched text is not a match in order-preserving matching [20, 22], because the relative order between two of its characters is different from that between the corresponding two characters of $P$, but it is a match in Cartesian tree matching.
3 Single Pattern Matching in $O(n+m)$ Time
3.1 Parent-distance representation
In order to solve Cartesian tree matching without building every possible Cartesian tree, we propose an efficient representation to store the information about Cartesian trees, called the parent-distance representation.
Definition 3.1 (Parent-distance representation). Given a string $S[1..n]$, the parent-distance representation of $S$ is an integer string $PD(S)[1..n]$, which is defined as follows:

$$PD(S)[i] = \begin{cases} i - \max\{\, j \mid 1 \le j < i,\; S[j] \le S[i] \,\} & \text{if such } j \text{ exists} \\ 0 & \text{otherwise.} \end{cases}$$
For example, the parent-distance representation of string $S = (3, 1, 2, 5, 4)$ is $PD(S) = (0, 0, 1, 1, 2)$. Note that $S[i - PD(S)[i]]$ in Definition 3.1 represents the parent of $S[i]$ in Cartesian tree $CT(S[1..i])$. Furthermore, if there is no such $j$, $S[i]$ is the root of Cartesian tree $CT(S[1..i])$.
Theorem 3.1 shows that the parent-distance representation has a one-to-one mapping to the Cartesian tree, so it can substitute for the Cartesian tree without any loss of information.
Theorem 3.1. Two strings $S_1$ and $S_2$ have the same Cartesian trees if and only if $S_1$ and $S_2$ have the same parent-distance representations.
Proof.
If two strings have different lengths, they have different Cartesian trees and different parent-distance representations, so the theorem holds. Therefore, we need only consider the case where $S_1$ and $S_2$ have the same length. Let $n$ be the length of $S_1$ and $S_2$. We prove the theorem by induction on $n$.
If $n = 1$, $S_1$ and $S_2$ always have the same Cartesian trees, each with only one node. Furthermore, they have the same parent-distance representation $(0)$. Therefore, the theorem holds when $n = 1$.
Let's assume that the theorem holds for strings of length $n - 1$, and show that it holds for strings of length $n$.
Assume that $S_1$ and $S_2$ have the same Cartesian trees (i.e., $CT(S_1) = CT(S_2)$). There are two cases.

If $S_1[n]$ and $S_2[n]$ are not roots of the Cartesian trees, let $S_1[a]$ be the parent of $S_1[n]$, and $S_2[b]$ the parent of $S_2[n]$. Since $CT(S_1) = CT(S_2)$, we have $a = b$. If we remove $S_1[n]$ from Cartesian tree $CT(S_1)$, we obtain the tree $CT(S_1[1..n-1])$, where the left subtree of $S_1[n]$ is attached to its parent $S_1[a]$. If we remove $S_2[n]$ from $CT(S_2)$, we obtain $CT(S_2[1..n-1])$ in the same way. Since $CT(S_1) = CT(S_2)$, we get $CT(S_1[1..n-1]) = CT(S_2[1..n-1])$, and therefore $PD(S_1[1..n-1]) = PD(S_2[1..n-1])$ by the induction hypothesis. Since $PD(S_1)[n] = n - a$ and $PD(S_2)[n] = n - b$ (and $a = b$), we have $PD(S_1) = PD(S_2)$.

If $S_1[n]$ and $S_2[n]$ are roots, we remove $S_1[n]$ and $S_2[n]$ to get $CT(S_1[1..n-1])$ and $CT(S_2[1..n-1])$. Since $CT(S_1) = CT(S_2)$, we have $CT(S_1[1..n-1]) = CT(S_2[1..n-1])$, and therefore $PD(S_1[1..n-1]) = PD(S_2[1..n-1])$ by the induction hypothesis. Since $PD(S_1)[n] = PD(S_2)[n] = 0$ in this case, we get $PD(S_1) = PD(S_2)$.
Assume that $S_1$ and $S_2$ have the same parent-distance representations (i.e., $PD(S_1) = PD(S_2)$). Since $PD(S_1[1..n-1]) = PD(S_2[1..n-1])$, we have $CT(S_1[1..n-1]) = CT(S_2[1..n-1])$ by the induction hypothesis. From $CT(S_1[1..n-1])$, we can derive $CT(S_1)$ as follows. If $PD(S_1)[n] \neq 0$, let $a$ be $n - PD(S_1)[n]$. We insert $S_1[n]$ into $CT(S_1[1..n-1])$ so that the parent of $S_1[n]$ is $S_1[a]$ and the original right subtree of $S_1[a]$ becomes the left subtree of $S_1[n]$. If $PD(S_1)[n] = 0$, $S_1[n]$ is the root of $CT(S_1)$ and $CT(S_1[1..n-1])$ becomes the left subtree of $S_1[n]$. We derive $CT(S_2)$ from $CT(S_2[1..n-1])$ in the same way. Since $CT(S_1[1..n-1]) = CT(S_2[1..n-1])$ and $PD(S_1)[n] = PD(S_2)[n]$, we can conclude that $CT(S_1) = CT(S_2)$.
Therefore, we have proved that there is a onetoone mapping between Cartesian trees and parentdistance representations. ∎
3.2 Computing parent-distance representation
Given a string $S[1..n]$, we can compute the parent-distance representation $PD(S)$ in linear time using a stack, as in [13, 14]. The main idea is that if two characters $S[j]$ and $S[k]$ with $j < k$ satisfy $S[j] > S[k]$, then $S[j]$ cannot be the parent of $S[i]$ for any $i > k$. Therefore, while scanning $S$ from left to right, we only store the characters that have no such later smaller character; the characters stored in this way form a non-decreasing subsequence of $S$. When we consider a new value, therefore, we can pop the values that are larger than the new value, find the parent of the new value at the top of the stack, and push the new value and its index onto the stack. Algorithm 1 describes the algorithm to compute $PD(S)$.
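To make this concrete, here is a minimal Python sketch of the stack-based computation; the function name, the 0-indexing, and the example string are ours rather than the paper's notation:

```python
def parent_distance(S):
    """Compute PD(S) in O(n) time with a stack (a sketch of Algorithm 1).
    PD is 0-indexed here: PD[i] corresponds to PD(S)[i+1] in the text."""
    PD = [0] * len(S)
    stack = []  # (value, index) pairs; the values are kept non-decreasing
    for i, v in enumerate(S):
        # pop values larger than v: they can never be a parent again
        while stack and stack[-1][0] > v:
            stack.pop()
        if stack:  # nearest previous value <= v is the parent of S[i]
            PD[i] = i - stack[-1][1]
        stack.append((v, i))
    return PD

assert parent_distance([3, 1, 2, 5, 4]) == [0, 0, 1, 1, 2]
```

Each element is pushed and popped at most once, which gives the linear running time.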
Furthermore, given the parent-distance representation $PD(S)$ of string $S$, we can compute the parent-distance representation of any substring $S[i..j]$ easily. To compute $PD(S[i..j])[k-i+1]$ for $i \le k \le j$, we need only check whether the parent of $S[k]$ is within $S[i..j]$ or not (i.e., the parent is outside if $PD(S)[k] \ge k - i + 1$):
$$PD(S[i..j])[k-i+1] = \begin{cases} 0 & \text{if } PD(S)[k] \ge k - i + 1 \\ PD(S)[k] & \text{otherwise.} \end{cases} \qquad (1)$$
For example, the parent-distance representation of string $S = (3, 1, 2, 5, 4)$ is $PD(S) = (0, 0, 1, 1, 2)$. For $S[2..5]$, we can use the above equation and compute the value at each position in constant time, getting $PD(S[2..5]) = (0, 1, 1, 2)$.
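Equation (1) translates into a constant-time lookup; the following sketch (our names, with 1-based $i$ and $k$ as in the text and a 0-indexed PD array) makes the window adjustment explicit:

```python
def pd_in_window(PD, i, k):
    """Parent distance of S[k] within the window S[i..j], for any j >= k.
    PD is the 0-indexed parent-distance representation of the full string;
    i and k are 1-based positions as in Equation (1)."""
    return PD[k - 1] if PD[k - 1] < k - i + 1 else 0

PD = [0, 0, 1, 1, 2]  # PD(S) for S = (3, 1, 2, 5, 4)
assert [pd_in_window(PD, 2, k) for k in range(2, 6)] == [0, 1, 1, 2]
```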
3.3 Failure function
We can define a failure function similar to the one used in the KMP algorithm [21].
Definition 3.2 (Failure function). The failure function of a string $S[1..n]$ is an integer string $F[1..n]$ such that:

$$F[i] = \max\{\, k \mid 0 \le k < i,\; CT(S[1..k]) = CT(S[i-k+1..i]) \,\}.$$
That is, $F[i]$ is the largest $k < i$ such that the prefix of length $k$ and the suffix of length $k$ of $S[1..i]$ have the same Cartesian trees. For example, assuming that $S = (3, 1, 2, 5, 4)$, the corresponding failure function is $F = (0, 1, 1, 1, 2)$. We can see that $CT(S[1..2]) = CT(S[4..5])$ from $F[5] = 2$. We will present an algorithm to compute the failure function of a given string in Section 3.5.
3.4 Text search
As in the original KMP text search algorithm, we can use the failure function in order to achieve linear time text search: scan the text from left to right, and use the failure function every time we find a mismatch between the text and the pattern. We apply this idea to Cartesian tree matching.
In order to perform a text search using $O(m)$ space, we compute the parent-distance representation of the text online as we read the text, so that we don't need to store the parent-distance representation of the whole text, which would cost $O(n)$ space. Furthermore, among the text characters which are matched with the pattern, we only have to store the elements that form a non-decreasing subsequence, using a deque (instead of the stack of Section 3.2) in order to delete elements in front. Using this idea, we can keep the size of the deque always smaller than or equal to $m$. Therefore, we can perform the text search using $O(m)$ space. Algorithm 2 shows the text search algorithm of Cartesian tree matching. In line 9 we need to compute the parent-distance value of the current text character within the current window. If the deque is empty, the value is 0. Otherwise, let $j$ be the index stored at the back of the deque (after larger values have been popped); then the value is the distance from the current position to $j$. This computation takes constant time. Just before line 14, we do not compare parent-distance values when the current match is empty, because a single character always matches. Therefore, we can safely perform line 14.
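For intuition, here is a simplified sketch of the search loop. Unlike Algorithm 2, it precomputes $PD(T)$ with the stack of Section 3.2 and applies Equation (1) for the windowed values, so it uses $O(n)$ space rather than $O(m)$; `parent_distance` is the sketch above, and `compute_failure` is sketched in Section 3.5. All names and the 0-indexing are ours:

```python
def ct_match(T, P):
    """Report all 0-indexed positions i with CT(T[i..i+m-1]) = CT(P)."""
    pd_t, pd_p = parent_distance(T), parent_distance(P)
    F = compute_failure(pd_p)
    m, cnt, out = len(P), 0, []  # cnt = number of matched characters
    for i in range(len(T)):
        # windowed parent distance of T[i] (Equation (1)); it must be
        # recomputed inside the loop because the window shrinks with cnt
        while cnt > 0 and (pd_t[i] if pd_t[i] <= cnt else 0) != pd_p[cnt]:
            cnt = F[cnt - 1]
        cnt += 1  # a window of length 1 always matches
        if cnt == m:
            out.append(i - m + 1)
            cnt = F[m - 1]
    return out

# T[2..3] = (3, 4) has the same Cartesian tree as P = (1, 2)
assert ct_match([5, 3, 4], [1, 2]) == [1]
```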
3.5 Computing failure function
We compute the failure function in a way similar to the text search, as in the KMP algorithm. However, we can compute the parent-distance representation of the pattern in $O(m)$ time before we compute the failure function. Hence we don't need a deque, and the computation is slightly simpler than the text search. Algorithm 3 shows the procedure to compute the failure function.
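A minimal sketch of this procedure under our 0-indexed conventions (so that `F[i]` is the failure value of the prefix of length $i+1$):

```python
def compute_failure(PD):
    """Failure function from the parent-distance representation PD of the
    pattern (a sketch of Algorithm 3; 0-indexed)."""
    m = len(PD)
    F = [0] * m
    k = 0  # length of the current candidate border
    for i in range(1, m):
        # shrink the border until the new character extends it; the windowed
        # parent distance (Equation (1)) is recomputed as the window shrinks
        while k > 0 and (PD[i] if PD[i] <= k else 0) != PD[k]:
            k = F[k - 1]
        k += 1  # a window of length 1 always matches (both values are 0)
        F[i] = k
    return F

assert compute_failure([0, 0, 1, 1, 2]) == [0, 1, 1, 1, 2]  # S = (3,1,2,5,4)
```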
3.6 Correctness and time complexity
Since our algorithms for Cartesian tree matching (the text search and the computation of the failure function) follow the KMP algorithm, it is easy to see that our algorithm correctly finds all occurrences (in the sense of Cartesian tree matching) of the pattern in the text. Since our algorithm checks one character of the parent-distance representation in constant time, it takes $O(n)$ time for the text search and $O(m)$ time to compute the failure function, as in the KMP algorithm. Therefore, our algorithm requires $O(n+m)$ time for Cartesian tree matching using $O(m)$ space.
3.7 Cartesian tree signature
There is an alternative representation of Cartesian trees, called the Cartesian tree signature [14]. The Cartesian tree signature of $S[1..n]$ is an array $X[1..n]$ such that $X[i]$ equals the number of elements popped from the stack in the $i$-th iteration of Algorithm 1. Furthermore, the Cartesian tree signature can be represented as a bit string of length less than $2n$, which is a succinct representation of a Cartesian tree. For example, the Cartesian tree signature of string $S = (3, 1, 2, 5, 4)$ is $X = (0, 1, 0, 0, 1)$.
We can use this representation to perform Cartesian tree matching. While we compute the Cartesian tree signature, we store one more array $D[1..n]$, which is defined as follows: if $S[i]$ is never popped from the stack, $D[i] = 0$; otherwise, let $j$ be the iteration of Algorithm 1 at which $S[i]$ is popped from the stack, and $D[i] = j$. For string $S = (3, 1, 2, 5, 4)$, we have $X = (0, 1, 0, 0, 1)$ and $D = (2, 0, 0, 5, 0)$.
Using array $D$, we can delete one character at the front of string $S$ in constant time. In order to get the Cartesian tree signature and its corresponding array $D$ for $S[2..n]$, we do the following. If $D[1] \neq 0$, we decrease $X[D[1]]$ by one and erase $X[1]$ from $X$. If $D[1] = 0$, we just erase $X[1]$. After that, we delete $D[1]$ from $D$ (re-indexing the remaining entries accordingly). For example, if we want to delete the first character of $S = (3, 1, 2, 5, 4)$, we decrease $X[2]$ by one, and delete $X[1]$ and $D[1]$. This results in $X = (0, 0, 0, 1)$ and $D = (0, 0, 4, 0)$ for $S[2..5] = (1, 2, 5, 4)$. These arrays are the correct Cartesian tree signature and its corresponding array of $S[2..n]$. In this way, we can perform Algorithm 2 using the Cartesian tree signature. Computing the failure function can also be done in a similar way.
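A small sketch of this bookkeeping, with the array names $X$ and $D$ as above (0-indexed, using `None` instead of 0 for "never popped"):

```python
def signature(S):
    """Cartesian tree signature X and pop-iteration array D (0-indexed):
    X[i] = number of elements popped at iteration i of Algorithm 1,
    D[i] = iteration at which S[i] is popped (None if never popped)."""
    n = len(S)
    X, D = [0] * n, [None] * n
    stack = []  # indices whose values form a non-decreasing subsequence
    for i in range(n):
        while stack and S[stack[-1]] > S[i]:
            D[stack.pop()] = i
            X[i] += 1
        stack.append(i)
    return X, D

def delete_front(X, D):
    """Signature and D-array of S[2..n] from those of S[1..n]. The update
    itself touches a single entry of X; the list slicing here is only for
    clarity of the sketch."""
    X2 = X[1:]
    D2 = [None if j is None else j - 1 for j in D[1:]]
    if D[0] is not None:
        X2[D[0] - 1] -= 1  # the pop of S[1] no longer happens
    return X2, D2

X, D = signature([3, 1, 2, 5, 4])  # X = [0,1,0,0,1], D = [1,None,None,4,None]
assert delete_front(X, D) == ([0, 0, 0, 1], [None, None, 3, None])
```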
Note that the Cartesian tree signature can represent a Cartesian tree using less space than the parentdistance representation, but it needs an auxiliary array to perform string matching, which uses the same space as the parentdistance representation. For Cartesian tree matching, therefore, it uses more space than Algorithm 2.
4 Multiple Pattern Matching in $O((n+m)\log k)$ Time
In this section we extend Cartesian tree matching to the case of multiple patterns. Definition 4.1 gives the formal definition of multiple pattern matching.

Definition 4.1 (Multiple pattern Cartesian tree matching). Given a text $T[1..n]$ and patterns $P_1, P_2, \ldots, P_k$, where the total length of the patterns is $m$, multiple pattern Cartesian tree matching is to find every position in the text which matches at least one pattern, i.e., which has the same Cartesian tree as that of at least one pattern.

We modify the Aho-Corasick algorithm [1], using the parent-distance representation defined in Section 3.1, to do multiple pattern matching in $O((n+m)\log k)$ time.
4.1 Constructing the Aho-Corasick automaton
Instead of using the patterns themselves in the Aho-Corasick automaton, we use their parent-distance representations to make the automaton. Each node in the automaton corresponds to a prefix of the parent-distance representation of some pattern. We maintain two integers $idx$ and $d$ for every node $v$, such that node $v$ corresponds to the parent-distance representation of the pattern prefix $P_{idx}[1..d]$. If there is more than one possible index, we store the smallest one. Each node also has a state transition function $g$, which takes an integer as input and returns the next node, or reports that there is no such node. We can construct the trie and the state transition function for every node in $O(m \log k)$ time, assuming that we use a balanced binary search tree to implement the transition function of each node. Figure 2 shows an Aho-Corasick automaton for three patterns, where we use the parent-distance representations of the patterns to construct the automaton.
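A sketch of the trie construction follows; the node layout and names are ours, and a Python dict plays the role of the per-node transition function, which corresponds to the randomized hashing variant rather than the balanced-BST one. `parent_distance` is the sketch from Section 3.2:

```python
class Node:
    """Trie node: it represents PD(P_idx[1..d])."""
    def __init__(self, idx=0, d=0):
        self.next = {}    # transition function g: parent-distance value -> node
        self.fail = None  # failure link, filled in later
        self.idx, self.d = idx, d
        self.out = []     # ids of patterns matched when this node is reached

def build_trie(patterns):
    pds = [parent_distance(P) for P in patterns]
    root = Node()
    for idx, pd in enumerate(pds):
        v = root
        for d, c in enumerate(pd, start=1):
            if c not in v.next:
                v.next[c] = Node(idx, d)  # first creator has the smallest idx
            v = v.next[c]
        v.out.append(idx)  # pattern P_idx ends at this node
    return root, pds
```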
The failure function of the Aho-Corasick automaton is defined as follows. Let $v$ be a node in the automaton, and let $Q$ be the pattern prefix $P_{idx}[1..d]$ that node $v$ represents in the trie. Let $Q'$ be the longest proper suffix of $Q$ which matches (in the sense of Cartesian tree matching) a prefix of some pattern. The failure function of $v$ is defined as the node corresponding to that prefix. The dotted lines in Figure 2 show the failure function of each node. Note that the parent-distance representation of a suffix of $Q$ may not be a suffix of the parent-distance representation of $Q$. For example, $(2, 5, 1)$ has the parent-distance representation $(0, 1, 0)$, but its suffix $(5, 1)$ has the parent-distance representation $(0, 0)$, which is not a suffix of $(0, 1, 0)$.
Algorithm 4 computes the failure function of the trie. As in the original Aho-Corasick algorithm, we traverse the trie in breadth-first order (except the root) and compute the failure function. The main difference between Algorithm 4 and the Aho-Corasick algorithm is at line 13, where we decide the next character to match. According to the definition of the trie, a node $u$ corresponds to the parent-distance representation of $P_{idx}[1..d]$, and so the parent of $u$ corresponds to the parent-distance representation of $P_{idx}[1..d-1]$. In the while loop from line 10 to 16, the current node $v$ corresponds to the parent-distance representation of some suffix of $P_{idx}[1..d-1]$, because $v$ is a node that can be reached from the parent of $u$ by following failure links. Since $v$ corresponds to some string of length $d(v)$ (the depth of $v$), we can conclude that $v$ represents $P_{idx}[d-d(v)..d-1]$. We want to check whether $P_{idx}[d-d(v)..d]$ matches some node in the trie, so we should check whether $v$ has a transition using the parent-distance value of $P_{idx}[d]$ within the window $P_{idx}[d-d(v)..d]$, which is given by Equation (1). If $v$ has this transition, the resulting node corresponds to $P_{idx}[d-d(v)..d]$, and we can conclude that it is the failure function of $u$. If $v$ doesn't have such a transition, there is no node that represents $P_{idx}[d-d(v)..d]$, and thus we have to continue the loop.
For example, suppose that we compute the failure function of a node $u$ with values $idx$ and $d$ in Figure 2. We begin the while loop at the failure function of the parent of $u$, compute the character given by Equation (1) for the current window length $d(v)+1$, and check whether the corresponding transition of $v$ exists. If there is no such transition, we continue the while loop after replacing $v$ by its failure function, recomputing the character for the new, smaller window. If the transition exists, it gives the failure function of $u$. Note that the character to match may change during the while loop, which is not the case in the Aho-Corasick algorithm.
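The following sketch implements this breadth-first computation (again with dict transitions; `root` and `pds` come from `build_trie` above):

```python
from collections import deque

def build_failure(root, pds):
    """Failure links via BFS (a sketch of Algorithm 4). The edge character
    is re-derived with Equation (1) at every candidate node v, because the
    windowed parent distance depends on v's depth d(v)."""
    root.fail = root
    queue = deque()
    for child in root.next.values():
        child.fail = root  # depth-1 nodes fail to the root
        queue.append(child)
    while queue:
        u = queue.popleft()
        for child in u.next.values():
            pd_full = pds[child.idx][child.d - 1]  # PD(P_idx)[d], 0-indexed
            v = u.fail
            while True:
                c = pd_full if pd_full <= v.d else 0  # Equation (1)
                if c in v.next:
                    child.fail = v.next[c]
                    break
                if v is root:
                    child.fail = root
                    break
                v = v.fail
            child.out += child.fail.out  # inherit the output function
            queue.append(child)
```

Since every parent-distance representation begins with 0, the root always has a transition on 0, so the loop terminates with a failure link of depth at least 1 for every node of depth at least 2.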
While computing the failure function, we can also compute the output function in the same way as in the Aho-Corasick algorithm. The output function of a node $v$ is the set of patterns which match some suffix of the string that $v$ represents. This function is used to output all possible matches at the node.
4.2 Multiple pattern matching
Using the automaton defined above, we can solve multiple pattern Cartesian tree matching in $O((n+m)\log k)$ time. The text search algorithm is essentially the same as that of the Aho-Corasick algorithm: we follow the trie and use the failure links in case of mismatches. As in the single pattern case, we compute the parent-distance representation of the text online in the same way as Algorithm 2 (using a deque) to ensure $O(m)$ space. The time complexity of our multiple pattern Cartesian tree matching is $O((n+m)\log k)$ using $O(m)$ space, where the $\log k$ factor is due to the binary search tree in each node: since there can be at most $k$ outgoing edges from each node, we can perform an operation on the binary search tree in $O(\log k)$ time. Combined with the time-complexity analysis of the Aho-Corasick algorithm, this shows that our algorithm has a time complexity of $O((n+m)\log k)$. We can reduce the time complexity further to $O(n+m)$ randomized time by using a hash table instead of a binary search tree [12].
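A sketch of the search loop tying the pieces together; as in Section 3.4, it precomputes $PD(T)$ for clarity, which costs $O(n)$ space instead of the $O(m)$ achieved with the deque:

```python
def ac_search(T, patterns):
    """Report (position, pattern id) pairs, 0-indexed."""
    root, pds = build_trie(patterns)
    build_failure(root, pds)
    pd_t = parent_distance(T)
    v, out = root, []
    for i in range(len(T)):
        while True:
            c = pd_t[i] if pd_t[i] <= v.d else 0  # windowed value, Equation (1)
            if c in v.next:
                v = v.next[c]
                break
            v = v.fail  # shrink the window and recompute c
        for idx in v.out:
            out.append((i - len(patterns[idx]) + 1, idx))
    return out

# (5,3) matches P_1 = (2,1), and (3,4) matches P_0 = (1,2)
assert ac_search([5, 3, 4], [[1, 2], [2, 1]]) == [(0, 1), (1, 0)]
```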
5 Cartesian Suffix Tree in $O(n)$ Randomized Time
In this section we apply the notion of Cartesian tree matching to the suffix tree, as in the cases of parameterized matching and order-preserving matching [8, 13]. We first define the Cartesian suffix tree, and show that it can be built in $O(n)$ randomized time or $O(n \log n)$ worst-case time using the result of Cole and Hariharan [12].
5.1 Defining Cartesian suffix tree
The Cartesian suffix tree is an index data structure that allows us to find an occurrence (in the sense of Cartesian tree matching) of a given pattern of length $m$ in $O(m)$ randomized time or $O(m \log n)$ worst-case time, where $n$ is the length of the text string. In order to store the information of Cartesian suffix trees efficiently, we again use the parent-distance representation from Section 3.1. Definition 5.1 gives the formal definition of the Cartesian suffix tree.
Definition 5.1 (Cartesian suffix tree). Given a string $S[1..n]$, the Cartesian suffix tree of $S$ is a compacted trie built with $PD(S[i..n])\$$ for every $1 \le i \le n$ (where the special character $\$$ is concatenated to the end of $PD(S[i..n])$) and the string $\$$.
Note that we append a special character $\$$ to the end of each parent-distance representation to ensure that no string in the collection is a prefix of another string.
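For illustration, the collection of strings in Definition 5.1 can be materialized naively as follows (quadratic in total size; the construction of Section 5.2 never builds it explicitly, using the character oracle instead). `parent_distance` is the sketch from Section 3.2:

```python
def quasi_suffix_collection(S):
    """All strings PD(S[i..n])$ for 1 <= i <= n, plus the string $."""
    return [parent_distance(S[i:]) + ['$'] for i in range(len(S))] + [['$']]
```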
Figure 3 shows an example Cartesian suffix tree. Each edge actually stores a suffix number together with start and end positions, instead of the parent-distance representation itself: an edge whose label is a range of some $PD(S[i..n])\$$ stores the suffix number $i$ and the start and end positions of that range. (If an edge can be labeled by more than one suffix, any one of them may be stored.)
5.2 Constructing Cartesian suffix tree
There are several algorithms that efficiently construct the suffix tree, such as McCreight's algorithm [24] and Ukkonen's algorithm [26]. However, the distinct right context property [16, 8] should hold in order to apply these algorithms, which means that the suffix link of every internal node should point to an explicit node. The Cartesian suffix tree does not have the distinct right context property: in Figure 3, one internal node does not satisfy it, because there is no explicit node corresponding to the parent-distance representation reached by its suffix link.
In order to handle this issue, we use an algorithm due to Cole and Hariharan [12]. This algorithm can construct a compacted trie for a quasi-suffix collection, which satisfies the following properties:

A quasi-suffix collection is a set of strings $s_1, s_2, \ldots, s_N$, where the length of $s_i$ is $N - i + 1$ (i.e., the lengths decrease by one).

For any two different strings $s_i$ and $s_j$, $s_i$ is not a prefix of $s_j$.

For any $i$ and $j$, if $s_i$ and $s_j$ have a common prefix of length $l$, then $s_{i+1}$ and $s_{j+1}$ have a common prefix of length at least $l - 1$.
The collection of parent-distance representations in Definition 5.1 satisfies all of the above properties. The first two properties are trivial. Furthermore, if $PD(S[i..n])\$$ and $PD(S[j..n])\$$ have a common prefix of length $l$, i.e., $CT(S[i..i+l-1]) = CT(S[j..j+l-1])$, we can show that $PD(S[i+1..n])$ and $PD(S[j+1..n])$ have a common prefix of length $l-1$ by Equation (1). Therefore, $s_{i+1}$ and $s_{j+1}$ have a common prefix of length $l-1$ or more, showing that the third property holds.
One more property we need in order to apply Cole and Hariharan's algorithm is a character oracle, which returns the $k$-th character of any string $s_i$ in the collection in constant time. We can do this in constant time using Equation (1), once the parent-distance representation $PD(S)$ of the whole string is computed.
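A sketch of the oracle (1-based $i$ and $k$ as in the text, with the 0-indexed PD array of the sketches above; the terminator is modeled by the character '$'):

```python
def oracle(PD, n, i, k):
    """k-th character of s_i = PD(S[i..n])$ in O(1) time, given PD = PD(S)."""
    if k == n - i + 2:  # one position past PD(S[i..n]): the terminator
        return '$'
    # Equation (1): the parent distance survives iff the parent falls
    # inside the window, i.e., PD(S)[i+k-1] < k (1-based)
    return PD[i + k - 2] if PD[i + k - 2] < k else 0
```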
Since we have all the properties needed to apply Cole and Hariharan's algorithm, we can construct the Cartesian suffix tree in $O(n)$ randomized time using $O(n)$ space [12]. In the worst case, it can be built in $O(n \log n)$ time by using a binary search tree instead of a hash table to store the children of each node in the suffix tree, because the alphabet size of the parent-distance representations is $O(n)$. We can also modify our algorithm to construct the Cartesian suffix tree online, using the ideas in [23, 25].
6 Conclusion
We have defined Cartesian tree matching and the parent-distance representation of a Cartesian tree. We developed an $O(n+m)$ time algorithm for single pattern matching, and an $O((n+m)\log k)$ deterministic time or $O(n+m)$ randomized time algorithm for multiple pattern matching. Finally, we defined an index data structure called the Cartesian suffix tree, and showed that it can be constructed in $O(n)$ randomized time. We believe that the notion of Cartesian tree matching, which is a new metric for string matching and indexing over numeric strings, can be used in many applications.
There have been many works on approximate generalized matching. For example, there are results on approximate order-preserving matching [11], approximate jumbled matching [10], approximate swapped matching [5], and approximate parameterized matching [6, 18]. There are also results on computing the period of a generalized string, such as computing the period in the order-preserving model [17]. Since Cartesian tree matching is first introduced in this paper, many problems, including approximate matching and computing the period in the Cartesian tree matching model, are future research topics.
Acknowledgments
S.G. Park and K. Park were supported by the Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2018000551, Framework of Practical Algorithms for NP-hard Graph Problems). A. Amir and G.M. Landau were partially supported by the Israel Science Foundation grant 571/14, and Grant No. 2014028 from the United States-Israel Binational Science Foundation (BSF).
References
 [1] Alfred V. Aho and Margaret J. Corasick. Efficient string matching: An aid to bibliographic search. Commun. ACM, 18(6):333–340, 1975. URL: https://doi.org/10.1145/360825.360855, doi:10.1145/360825.360855.
 [2] Amihood Amir, Yonatan Aumann, Gad M. Landau, Moshe Lewenstein, and Noa Lewenstein. Pattern matching with swaps. J. Algorithms, 37(2):247–266, 2000. URL: https://doi.org/10.1006/jagm.2000.1120, doi:10.1006/jagm.2000.1120.
[3] Amihood Amir, Richard Cole, Ramesh Hariharan, Moshe Lewenstein, and Ely Porat. Overlap matching. Inf. Comput., 181(1):57–74, 2003. URL: https://doi.org/10.1016/S0890-5401(02)00035-4, doi:10.1016/S0890-5401(02)00035-4.
[4] Amihood Amir, Martin Farach, and S. Muthukrishnan. Alphabet dependence in parameterized matching. Inf. Process. Lett., 49(3):111–115, 1994. URL: https://doi.org/10.1016/0020-0190(94)90086-8, doi:10.1016/0020-0190(94)90086-8.
[5] Amihood Amir, Moshe Lewenstein, and Ely Porat. Approximate swapped matching. Inf. Process. Lett., 83(1):33–39, 2002. URL: https://doi.org/10.1016/S0020-0190(01)00302-7, doi:10.1016/S0020-0190(01)00302-7.
 [6] Alberto Apostolico, Péter L. Erdös, and Moshe Lewenstein. Parameterized matching with mismatches. J. Discrete Algorithms, 5(1):135–140, 2007. URL: https://doi.org/10.1016/j.jda.2006.03.014, doi:10.1016/j.jda.2006.03.014.
[7] Brenda S. Baker. A theory of parameterized pattern matching: algorithms and applications. In Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, May 16-18, 1993, San Diego, CA, USA, pages 71–80, 1993. URL: https://doi.org/10.1145/167088.167115, doi:10.1145/167088.167115.
 [8] Brenda S. Baker. Parameterized duplication in strings: Algorithms and an application to software maintenance. SIAM J. Comput., 26(5):1343–1362, 1997. URL: https://doi.org/10.1137/S0097539793246707, doi:10.1137/S0097539793246707.
 [9] Peter Burcsi, Ferdinando Cicalese, Gabriele Fici, and Zsuzsanna Lipták. Algorithms for jumbled pattern matching in strings. Int. J. Found. Comput. Sci., 23(2):357–374, 2012. URL: https://doi.org/10.1142/S0129054112400175, doi:10.1142/S0129054112400175.
[10] Peter Burcsi, Ferdinando Cicalese, Gabriele Fici, and Zsuzsanna Lipták. On approximate jumbled pattern matching in strings. Theory Comput. Syst., 50(1):35–51, 2012. URL: https://doi.org/10.1007/s00224-011-9344-5, doi:10.1007/s00224-011-9344-5.
[11] Tamanna Chhabra, Emanuele Giaquinta, and Jorma Tarhio. Filtration algorithms for approximate order-preserving matching. In String Processing and Information Retrieval - 22nd International Symposium, SPIRE 2015, London, UK, September 1-4, 2015, Proceedings, pages 177–187, 2015. URL: https://doi.org/10.1007/978-3-319-23826-5_18, doi:10.1007/978-3-319-23826-5_18.
[12] Richard Cole and Ramesh Hariharan. Faster suffix tree construction with missing suffix links. In Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, May 21-23, 2000, Portland, OR, USA, pages 407–415, 2000. URL: https://doi.org/10.1145/335305.335352, doi:10.1145/335305.335352.
 [13] Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Marcin Kubica, Alessio Langiu, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. Orderpreserving indexing. Theor. Comput. Sci., 638:122–135, 2016. URL: https://doi.org/10.1016/j.tcs.2015.06.050, doi:10.1016/j.tcs.2015.06.050.
[14] Erik D. Demaine, Gad M. Landau, and Oren Weimann. On Cartesian trees and range minimum queries. Algorithmica, 68(3):610–625, 2014. URL: https://doi.org/10.1007/s00453-012-9683-x, doi:10.1007/s00453-012-9683-x.
[15] Tak-Chung Fu, Korris Fu-Lai Chung, Robert Wing Pong Luk, and Chak-man Ng. Stock time series pattern matching: Template-based vs. rule-based approaches. Eng. Appl. of AI, 20(3):347–364, 2007. URL: https://doi.org/10.1016/j.engappai.2006.07.003, doi:10.1016/j.engappai.2006.07.003.
 [16] Raffaele Giancarlo. A generalization of the suffix tree to square matrices, with applications. SIAM J. Comput., 24(3):520–562, 1995. URL: https://doi.org/10.1137/S0097539792231982, doi:10.1137/S0097539792231982.
 [17] Garance Gourdel, Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, Arseny M. Shur, and Tomasz Walen. String periods in the orderpreserving model. In 35th Symposium on Theoretical Aspects of Computer Science, STACS 2018, February 28 to March 3, 2018, Caen, France, pages 38:1–38:16, 2018. URL: https://doi.org/10.4230/LIPIcs.STACS.2018.38, doi:10.4230/LIPIcs.STACS.2018.38.
[18] Carmit Hazay, Moshe Lewenstein, and Dina Sokol. Approximate parameterized matching. In Algorithms - ESA 2004, 12th Annual European Symposium, Bergen, Norway, September 14-17, 2004, Proceedings, pages 414–425, 2004. URL: https://doi.org/10.1007/978-3-540-30140-0_38, doi:10.1007/978-3-540-30140-0_38.
[19] Jinil Kim, Amihood Amir, Joong Chae Na, Kunsoo Park, and Jeong Seop Sim. On representations of ternary order relations in numeric strings. Mathematics in Computer Science, 11(2):127–136, 2017. URL: https://doi.org/10.1007/s11786-016-0282-0, doi:10.1007/s11786-016-0282-0.
 [20] Jinil Kim, Peter Eades, Rudolf Fleischer, SeokHee Hong, Costas S. Iliopoulos, Kunsoo Park, Simon J. Puglisi, and Takeshi Tokuyama. Orderpreserving matching. Theor. Comput. Sci., 525:68–79, 2014. URL: https://doi.org/10.1016/j.tcs.2013.10.006, doi:10.1016/j.tcs.2013.10.006.
 [21] Donald E. Knuth, James H. Morris Jr., and Vaughan R. Pratt. Fast pattern matching in strings. SIAM J. Comput., 6(2):323–350, 1977. URL: https://doi.org/10.1137/0206024, doi:10.1137/0206024.
 [22] Marcin Kubica, Tomasz Kulczynski, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. A linear time algorithm for consecutive permutation pattern matching. Inf. Process. Lett., 113(12):430–433, 2013. URL: https://doi.org/10.1016/j.ipl.2013.03.015, doi:10.1016/j.ipl.2013.03.015.
 [23] Taehyung Lee, Joong Chae Na, and Kunsoo Park. Online construction of parameterized suffix trees for large alphabets. Inf. Process. Lett., 111(5):201–207, 2011. URL: https://doi.org/10.1016/j.ipl.2010.11.017, doi:10.1016/j.ipl.2010.11.017.
[24] Edward M. McCreight. A space-economical suffix tree construction algorithm. J. ACM, 23(2):262–272, 1976. URL: https://doi.org/10.1145/321941.321946, doi:10.1145/321941.321946.
[25] Joong Chae Na, Raffaele Giancarlo, and Kunsoo Park. On-line construction of two-dimensional suffix trees in O(n log n) time. Algorithmica, 48(2):173–186, 2007. URL: https://doi.org/10.1007/s00453-007-0063-x, doi:10.1007/s00453-007-0063-x.
[26] Esko Ukkonen. On-line construction of suffix trees. Algorithmica, 14(3):249–260, 1995. URL: https://doi.org/10.1007/BF01206331, doi:10.1007/BF01206331.
 [27] Jean Vuillemin. A unifying look at data structures. Commun. ACM, 23(4):229–239, 1980. URL: https://doi.org/10.1145/358841.358852, doi:10.1145/358841.358852.