Succinct Data Structures for Families of Interval Graphs

We consider the problem of designing succinct data structures for interval graphs with $n$ vertices while supporting degree, adjacency, neighborhood and shortest path queries in optimal time in the $\Theta(\log n)$-bit word RAM model. The degree query reports the number of edges incident to a given vertex in constant time, the adjacency query returns true if there is an edge between two given vertices in constant time, the neighborhood query reports the set of all adjacent vertices in time proportional to the degree of the queried vertex, and the shortest path query returns a shortest path in time proportional to its length; thus the running times of these queries are optimal. Towards showing succinctness, we first show that at least $n \log n - 2n \log\log n - O(n)$ bits are necessary to represent any unlabeled interval graph $G$ with $n$ vertices, answering an open problem of Yang and Pippenger [Proc. Amer. Math. Soc. 2017]. This is complemented by a data structure of size $n \log n + O(n)$ bits that not only supports the aforementioned queries optimally but is also capable of executing various combinatorial algorithms (like proper coloring, maximum independent set etc.) on the input interval graph efficiently. Finally, we extend our ideas to other variants of interval graphs, for example, proper/unit interval graphs, k-proper and k-improper interval graphs, and circular-arc graphs, and design succinct/compact data structures for these graph classes as well, along with supporting queries on them efficiently.

1 Introduction

A simple undirected graph $G$ is called an interval graph if its vertices can be assigned to intervals on the real line so that two vertices are adjacent in $G$ if and only if their assigned intervals intersect. The set of intervals assigned to the vertices of $G$ is called a realization of $G$. These graphs were first introduced by Hajós [25], who also asked for a characterization of them. The same problem was also asked, independently, by Benser [4] while studying the structure of genes. Interval graphs naturally appear in a variety of contexts, for example, operations research and scheduling theory [3], biology, especially in physical mapping of DNA [35], temporal reasoning [21] and many more. We refer the reader to [20, 19] for a thorough treatment of interval graphs and their applications. Eventually answering the question of Hajós [25], several researchers came up with different characterizations of interval graphs, including linear time algorithms for recognizing them; see, for example, [20, Chapter 8] for characterizations, and [5] and [24] for linear time algorithms. Moreover, exploiting the special structure of interval graphs, many problems that are NP-hard on general graphs are shown to have polynomial time algorithms on interval graphs [19]. These include computing a maximum independent set, reporting a proper coloring, returning a maximum clique etc. In spite of the many applications of interval graphs in practically motivated problems, we are not aware of any study of interval graphs from the point of view of succinct data structures, where the goal is to store a set of objects using the information-theoretic minimum number of bits while still being able to support the relevant set of queries efficiently; this is what we focus on in this paper. We also assume the usual model of computation, namely a $\Theta(\log n)$-bit word RAM model where $n$ is the size of the input.

1.1 Related Work

There already exists a large body of work on representing various classes of graphs succinctly. This is motivated partly by theoretical curiosity and partly by practical needs, as these combinatorial structures arise quite often in various applications. A partial list of such special graph classes would be trees [28], planar graphs [1], chordal graphs [29], and partial $k$-trees [15], among others, while succinct encoding of arbitrary graphs is also considered in the literature [16]. For interval graphs, other than the algorithmic works mentioned earlier, there have been many attempts at exactly counting the number of unlabeled interval graphs [26, 27], and the state-of-the-art result is due to [34], which is what we improve in this work. For the variants of interval graphs that we study in this paper, there also exists a fairly large number of algorithmic as well as structural results. See [20, 19] for details.

1.2 Our Results and Paper Organization

Given an unlabeled interval graph $G$ with $n$ vertices, in Section 3 we first show that at least $n \log n - 2n \log\log n - O(n)$ bits are necessary to represent $G$, answering an open problem of Yang and Pippenger [34]. More specifically, Yang and Pippenger [34] showed a lower bound of $(n \log n)/3 + O(n)$ bits for representing any unlabeled interval graph and asked whether this lower bound can be further improved. Augmenting this lower bound, in Section 4 we also propose a succinct representation of $G$ using $n \log n + O(n)$ bits while still being able to support the relevant queries optimally, where the queries are defined as follows. For any two vertices $u, v \in V$,

  • $\mathsf{degree}(v)$: returns the number of vertices that are adjacent to $v$ in $G$,

  • $\mathsf{adjacent}(u, v)$: returns true if $u$ and $v$ are adjacent in $G$, and false otherwise,

  • $\mathsf{neighborhood}(v)$: returns all the vertices that are adjacent to $v$ in $G$, and

  • $\mathsf{spath}(u, v)$: returns a shortest path between $u$ and $v$ in $G$.

We show that all these queries can be supported optimally using our succinct data structure for interval graphs. More precisely, for any two vertices $u, v \in V$, we can answer $\mathsf{degree}(v)$ and $\mathsf{adjacent}(u, v)$ queries in $O(1)$ time, $\mathsf{neighborhood}(v)$ queries in $O(\mathsf{degree}(v))$ time, and $\mathsf{spath}(u, v)$ queries in $O(|\mathsf{spath}(u, v)|)$ time. Furthermore, we also show how one can implement various fundamental graph algorithms on interval graphs, for example depth-first search (DFS), breadth-first search (BFS), computing a maximum independent set, determining a maximum clique etc., both time and space efficiently using our succinct representation for interval graphs. In Section 5, we extend our ideas to other variants of interval graphs, for example, proper/unit interval graphs, k-proper and k-improper interval graphs, and circular-arc graphs, and design succinct data structures for these graph classes as well, along with supporting queries on them efficiently. For definitions of these graphs, see Section 5. Finally, we conclude in Section 6 with some remarks on possible future directions. The preliminary data structures and graph theoretic terminology used throughout this paper are listed in Section 2.

2 Preliminaries

We will use the following data structures in the rest of this paper.

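Rank and Select queries: Let $A = a_1 a_2 \ldots a_n$ be a sequence of size $n$ over an alphabet $\Sigma = \{0, 1, \ldots, \sigma - 1\}$. Then for $1 \le i \le n$ and $\alpha \in \Sigma$, one can define rank and select queries as follows.

  • $\mathsf{rank}_\alpha(A, i)$ = the number of occurrences of $\alpha$ in $a_1 \ldots a_i$.

  • $\mathsf{select}_\alpha(A, i)$ = the position $j$ where $a_j$ is the $i$-th $\alpha$ in $A$.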

The following lemma shows that these operations can be supported efficiently using optimal space.

Lemma 2.1 ([11, 22]).

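Given a sequence $A$ of size $n$ over an alphabet $\Sigma = \{0, 1, \ldots, \sigma - 1\}$, for any $\alpha \in \Sigma$, there exist data structures as follows.

  • when $\sigma = 2$, an $(n + o(n))$-bit data structure which answers $\mathsf{rank}_\alpha$ and $\mathsf{select}_\alpha$ queries on $A$ in $O(1)$ time.

  • when $\sigma > 2$, an $(n \log \sigma + o(n \log \sigma))$-bit data structure which answers $\mathsf{rank}_\alpha$ queries on $A$ in $O(\log\log \sigma)$ time and $\mathsf{select}_\alpha$ queries on $A$ in $O(1)$ time.

Note that one can access any element of the input sequence (at a given index) in $O(1)$ (resp. $O(\log\log \sigma)$) time with the $(n + o(n))$ (resp. $(n \log \sigma + o(n \log \sigma))$)-bit data structure of Lemma 2.1.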
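To make the semantics of the two queries concrete, the following is a minimal Python sketch. It is a naive linear-scan stand-in, not the succinct structures of Lemma 2.1, and the function names `rank` and `select` and the example sequence are illustrative assumptions.

```python
def rank(A, alpha, i):
    """Number of occurrences of alpha in A[1..i] (positions are 1-indexed)."""
    return sum(1 for a in A[:i] if a == alpha)


def select(A, alpha, i):
    """Position (1-indexed) of the i-th occurrence of alpha in A, or None."""
    seen = 0
    for pos, a in enumerate(A, start=1):
        if a == alpha:
            seen += 1
            if seen == i:
                return pos
    return None


# Hypothetical example sequence over the binary alphabet.
A = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
print(rank(A, 1, 5))    # 3: ones at positions 1, 2, 4
print(select(A, 0, 2))  # 5: zeros at positions 3, 5, ...
```

Both functions take linear time here; the lemma is about achieving the same answers in constant or near-constant time within succinct space.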

Range Maximum Queries: Given a sequence $A$ of size $n$, for $1 \le i \le j \le n$, the Range Maximum Query on range $[i, j]$ (denoted by $\mathsf{RMax}_A(i, j)$) returns the position $i \le k \le j$ such that $A[k]$ is a maximum value in $A[i \ldots j]$ (if there is a tie, we return the leftmost such position). One can define the Range Minimum Query on range $[i, j]$ ($\mathsf{RMin}_A(i, j)$) analogously. The following lemma shows that there exist data structures which can answer these queries efficiently using optimal space.

Lemma 2.2 ([7, 18]).

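Given a sequence $A$ of size $n$ and for any $1 \le i \le j \le n$,

  1. for any parameter $1 \le c \le n$, there exists a data structure of size $O(n/c)$ bits, in addition to storing the sequence $A$, which supports $\mathsf{RMax}_A(i, j)$ and $\mathsf{RMin}_A(i, j)$ queries in $O(c)$ time while supporting access on $A$ in $O(1)$ time.

  2. there exists a data structure of size $2n + o(n)$ bits (that does not store the sequence $A$) which supports $\mathsf{RMax}_A(i, j)$ or $\mathsf{RMin}_A(i, j)$ queries in $O(1)$ time.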

Graph Terminology and Input Representation: We will assume the knowledge of basic graph theoretic terminology as given in [13] and basic graph algorithms as given in [12]. Throughout this paper, $G = (V, E)$ will denote a simple undirected graph with the vertex set $V$ of cardinality $n$ and the edge set $E$ having cardinality $m$. We call $G$ an interval graph if (a) with every vertex we can associate a closed interval on the real line, and (b) two vertices share an edge if and only if the corresponding intervals are not disjoint (see Figure 1 for an example). It is well known that given an interval graph $G$ with $n$ vertices, one can assign intervals to vertices such that every end point is a distinct integer from $1$ to $2n$ using $O(n \log n)$ time [26], and in the rest of this paper, we deal exclusively with such representations. Moreover, for a vertex $v \in V$, we refer to $I_v = [l_v, r_v]$ as the interval corresponding to $v$.

3 Counting the number of unlabeled interval graphs

This section deals with counting unlabeled interval graphs on $n$ vertices; let $I_n$ denote this quantity. (This is the sequence with id A005975 in the On–Line Encyclopedia of Integer Sequences [32].) Initial values of this sequence are given by Hanlon [26], but he did not prove an asymptotic form for enumerating the sequence. Answering a question posed by Hanlon [26], Yang and Pippenger [34] proved that the generating function $\sum_{n \ge 1} I_n x^n$ diverges for any $x \ne 0$, and they established the bounds

$$\frac{n \log n}{3} + O(n) \le \log I_n \le n \log n + O(n). \tag{1}$$

The upper bound in (1) follows from $I_n \le (2n - 1)!!$, where the right hand side is the number of perfect matchings on $2n$ points on a line. For the lower bound, the authors showed $I_{3k} \ge k!/3^{3k}$ by finding an injection from $S_k$, the set of permutations of length $k$, to three-colored interval graphs of size $3k$. Furthermore, they left it open whether the leading terms of the lower and upper bounds in (1) can be matched, which is what we show in the affirmative by improving the lower bound. In other words, we find the asymptotic value of $\log I_n$. In what follows, for a set $S$, we denote by $\binom{S}{k}$ the set of $k$-subsets of $S$.

Theorem 1.

Let $I_n$ be the number of unlabeled interval graphs with $n$ vertices. As $n \to \infty$, we have

$$\log I_n \ge n \log n - 2n \log\log n - O(n). \tag{2}$$
Proof.

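We consider certain interval graphs on $n$ vertices with colored vertices. Let $k$ be a positive integer smaller than $n/2$ and $\epsilon$ a positive constant smaller than $1/2$. For $j \in \{1, \ldots, k\}$, let $B_j$ and $R_j$ denote the intervals $[-j - \epsilon, -j + \epsilon]$ and $[j - \epsilon, j + \epsilon]$, respectively. These $2k$ pairwise-disjoint intervals will make up $2k$ vertices in the graphs we consider. Now let $W$ denote the set of $k^2$ closed intervals with one endpoint in $\{-k, \ldots, -1\}$ and the other in $\{1, \ldots, k\}$. We color $B_1, \ldots, B_k$ with blue, $R_1, \ldots, R_k$ with red, and the intervals in $W$ with white.

Together with $\mathcal{S} = \{B_1, \ldots, B_k, R_1, \ldots, R_k\}$, each $J \in \binom{W}{n - 2k}$ gives an $n$-vertex, three-colored interval graph. For a given $J$, let $G_J$ denote the colored interval graph whose vertices correspond to the $n$ intervals in $\mathcal{S} \cup J$, and let $\mathcal{G}$ denote the set of all $G_J$.

Now let $G_J \in \mathcal{G}$. For a white vertex $w$ of $G_J$, the pair $(d_B(w), d_R(w))$, which represents the numbers of blue and red neighbors of $w$, uniquely determines the interval corresponding to $w$; this is the interval $[-d_B(w), d_R(w)]$. In other words, $J$ can be recovered from $G_J$ uniquely. Thus $|\mathcal{G}| = \binom{k^2}{n - 2k}$. Since there are at most $3^n$ ways to color the vertices of an interval graph with blue, red, and white, we have

$$I_n \ge \frac{|\mathcal{G}|}{3^n} = 3^{-n} \binom{k^2}{n - 2k}$$

for any $k < n/2$. Setting $k = \lfloor n / \log n \rfloor$, using the bound $\binom{a}{b} \ge (a/b)^b$, and taking logarithms, we get $\log I_n \ge n \log n - 2n \log\log n - O(n)$.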
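The logarithm computation behind this last step can be spelled out as follows. This is a sketch under stated assumptions: $I_n$ denotes the number of unlabeled interval graphs on $n$ vertices, the counting argument gives $I_n \ge 3^{-n}\binom{k^2}{n-2k}$, the choice is $k = n/\log n$ with rounding ignored, and the standard estimate $\binom{a}{b} \ge (a/b)^b$ (valid for $1 \le b \le a$) is used.

```latex
\begin{align*}
\log I_n &\ge \log\binom{k^2}{n-2k} - n\log 3 \\
         &\ge (n-2k)\,\log\frac{k^2}{n-2k} - n\log 3 \\
         &\ge (n-2k)\,(2\log k - \log n) - n\log 3 .
\end{align*}
With $k = n/\log n$ we have $2\log k - \log n = \log n - 2\log\log n$, hence
\begin{align*}
\log I_n &\ge \Bigl(n - \frac{2n}{\log n}\Bigr)\bigl(\log n - 2\log\log n\bigr) - n\log 3 \\
         &= n\log n - 2n\log\log n - 2n + \frac{4n\log\log n}{\log n} - n\log 3 \\
         &= n\log n - 2n\log\log n - O(n).
\end{align*}
```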

Remark.

Yang and Pippenger [34] also posed the question whether $I_n \ge n!/C^n$ for some constant $C$ or not. According to Theorem 1, this boils down to getting rid of the $2n \log\log n$ term in (2). Such a result would imply that the exponential generating function $\sum_{n \ge 1} I_n x^n / n!$ has a finite radius of convergence. (As noted in [34], the bound $I_n \le (2n - 1)!!$ implies that the radius of convergence of this series is at least $1/2$.)

4 Succinct representation of interval graphs

In this section, we introduce a succinct $(n \log n + (3 + \epsilon) n + o(n))$-bit representation of an unlabeled interval graph $G$ on $n$ vertices, for a constant $\epsilon > 0$, and show that the navigational queries (degree, adjacent, neighborhood, and spath queries) and some basic graph algorithms (BFS, DFS, PEO traversals, proper coloring, computing the size of a maximum clique and a maximum independent set etc.) on $G$ can be answered/executed efficiently using our representation of $G$.

4.1 Succinct Representation of $G$

We first label the vertices of $G$ using the integers from $1$ to $n$, as described in the following. By the assumption in Section 2, the vertices in $G$ can be represented by $n$ intervals $\mathcal{I} = \{I_1 = [l_1, r_1], \ldots, I_n = [l_n, r_n]\}$ where all the endpoints in $\mathcal{I}$ are distinct integers in the range $[1, 2n]$. Since there are $2n$ distinct endpoints for the $n$ intervals in $\mathcal{I}$, every integer in $[1, 2n]$ corresponds to a unique $l_i$ or $r_i$ for some $1 \le i \le n$. We assign the labels to the vertices in $G$ based on the sorted order of the left endpoints of their corresponding intervals, i.e., for any two vertices $u, v \in V$, $u < v$ if and only if $l_u < l_v$.

Figure 1: Example of the interval graph and its representation.

Now we describe the representation of $G$. Let $S = s_1 s_2 \ldots s_{2n}$ be the binary sequence of length $2n$ such that for $1 \le i \le 2n$, $s_i = 1$ if $i \in \{l_1, \ldots, l_n\}$ (i.e., if $i$ corresponds to the left end point of an interval in $\mathcal{I}$), and $s_i = 0$ otherwise. If $i = l_j$ or $i = r_j$, we say that $s_i$ corresponds to the interval $I_j$. We represent the sequence $S$ using the data structure of Lemma 2.1, using $2n + o(n)$ bits to support rank and select queries on $S$ in $O(1)$ time. Next, we store the sequence $r = r_1 r_2 \ldots r_n$, and for some fixed constant $\epsilon > 0$, we also store an $\epsilon n$-bit data structure of Lemma 2.2(1) (with $c = O(1/\epsilon)$) to support RMax and RMin queries on $r$ in $O(1)$ time. Using the representations of $S$ and $r$, it is easy to show that for any vertex $v$, we can return its corresponding interval $I_v = [l_v, r_v]$ in $O(1)$ time by computing $l_v = \mathsf{select}_1(S, v)$, and $r_v$ can be accessed from the sequence $r$. Thus, the total space usage of our representation is $n \log n + (3 + \epsilon) n + o(n)$ bits. See Figure 1 for an example.
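The construction of the two sequences can be sketched in Python as follows. This is an illustrative, non-succinct sketch: plain lists stand in for the rank/select and range-maximum structures, and the names `build`, `select1`, and `interval` as well as the example intervals are assumptions made for the example.

```python
def build(intervals):
    """Return (S, r) for intervals given as (l, r) pairs whose 2n endpoints
    are distinct integers in 1..2n."""
    ordered = sorted(intervals)      # vertex v = v-th interval by left endpoint
    n = len(ordered)
    S = [0] * (2 * n)
    for left, _ in ordered:
        S[left - 1] = 1              # mark left endpoints with 1
    r = [right for _, right in ordered]
    return S, r


def select1(S, v):
    """Position (1-indexed) of the v-th 1 in S (naive scan)."""
    ones = [pos for pos, bit in enumerate(S, start=1) if bit == 1]
    return ones[v - 1]


def interval(S, r, v):
    """Recover I_v = [l_v, r_v] from the representation."""
    return select1(S, v), r[v - 1]


# Hypothetical input: five intervals given in arbitrary order.
S, r = build([(4, 8), (1, 3), (9, 10), (2, 5), (6, 7)])
print(S)                  # [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
print(r)                  # [3, 5, 8, 7, 10]
print(interval(S, r, 3))  # (4, 8)
```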

4.2 Supporting Navigational Queries

In this section, we show that degree, adjacent, neighborhood, and spath queries on $G$ can be answered in asymptotically optimal time using the representation described in Section 4.1.

$\mathsf{degree}(v)$ query: We count the number of vertices in $G$ which are not adjacent to $v$; this set is a disjoint union of two sets: (i) the set of intervals that end before the starting point $l_v$, and (ii) the set of intervals that start after the end point $r_v$. Using our representation, the cardinalities of these two sets can be computed as follows. The number of intervals $I_u$ with $r_u < l_v$ is given by $\mathsf{rank}_0(S, l_v)$. Similarly, the number of intervals $I_u$ with $l_u > r_v$ is given by $n - \mathsf{rank}_1(S, r_v)$. Therefore, we can answer the $\mathsf{degree}(v)$ query in $O(1)$ time by returning $(n - 1) - \mathsf{rank}_0(S, l_v) - (n - \mathsf{rank}_1(S, r_v)) = \mathsf{rank}_1(S, r_v) - \mathsf{rank}_0(S, l_v) - 1$.

$\mathsf{adjacent}(u, v)$ query: Since we can compute the intervals $I_u$ and $I_v$ in $O(1)$ time, the query can be answered in $O(1)$ time by checking $r_u < l_v$ or $r_v < l_u$ ($u$ and $v$ are not adjacent if and only if one of these conditions is satisfied).
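The degree and adjacency computations above can be sketched as follows. This is a hedged illustration with naive rank/select helpers; the bit sequence `S`, the right-endpoint sequence `r`, and the function names are assumptions chosen for the example (the graph is the path 1-2-3-4 plus the isolated vertex 5).

```python
S = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]   # S[i-1] = 1 iff position i is a left endpoint
r = [3, 5, 8, 7, 10]                 # r[v-1] = right endpoint of vertex v
n = len(r)


def rank(bit, i):
    """Occurrences of bit in S[1..i]."""
    return sum(1 for b in S[:i] if b == bit)


def select1(v):
    """Position of the v-th 1 in S, i.e. the left endpoint l_v."""
    return [pos for pos, b in enumerate(S, start=1) if b == 1][v - 1]


def interval(v):
    return select1(v), r[v - 1]


def degree(v):
    l_v, r_v = interval(v)
    ends_before = rank(0, l_v)        # intervals with r_u < l_v
    starts_after = n - rank(1, r_v)   # intervals with l_u > r_v
    return (n - 1) - ends_before - starts_after


def adjacent(u, v):
    l_u, r_u = interval(u)
    l_v, r_v = interval(v)
    return u != v and not (r_u < l_v or r_v < l_u)


print([degree(v) for v in range(1, n + 1)])  # [1, 2, 2, 1, 0]
print(adjacent(2, 3), adjacent(1, 3))        # True False
```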

$\mathsf{neighborhood}(v)$ query: The set of all neighbors of a vertex $v$ can be reported by considering all the intervals $I_u$ whose left end points are within the range $[1, r_v]$ and returning all such $u$'s with $r_u > l_v$ (i.e., those which start to the left of $r_v$ and end after $l_v$). With our data structure, this query can be supported by returning the set $\{u \ne v \mid 1 \le u \le \mathsf{rank}_1(S, r_v) \text{ and } r_u > l_v\}$. Using the RMax structure stored on $r$, this can be supported in $O(\mathsf{degree}(v))$ time. Note that, given a threshold value $t$ and a query range $[i, j]$ of the sequence $r$, the range max data structure can be used to report all the elements $r_k$ within the range such that $r_k > t$, in $O(1)$ time per element, using the following recursive procedure. Compute the position $p = \mathsf{RMax}_r(i, j)$. If $r_p > t$, then return $r_p$, and recurse on the subintervals $[i, p - 1]$ and $[p + 1, j]$; else stop.
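The recursive threshold-reporting procedure can be sketched as below. This is an illustrative sketch in which a linear scan (`rmax`) stands in for the constant-time range-maximum structure; the sequences `S` and `r` and all function names are assumptions made for the example.

```python
S = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]   # 1 marks a left endpoint
r = [3, 5, 8, 7, 10]                 # right endpoints, indexed by vertex label


def rank1(i):
    return sum(S[:i])


def select1(v):
    return [pos for pos, b in enumerate(S, start=1) if b == 1][v - 1]


def rmax(i, j):
    """Leftmost position of a maximum of r[i..j] (1-indexed, inclusive)."""
    return max(range(i, j + 1), key=lambda p: (r[p - 1], -p))


def report_above(i, j, t):
    """All positions p in [i, j] with r[p] > t, in increasing order."""
    if i > j:
        return []
    p = rmax(i, j)
    if r[p - 1] <= t:
        return []                     # the maximum fails, so everything fails
    return report_above(i, p - 1, t) + [p] + report_above(p + 1, j, t)


def neighborhood(v):
    l_v, r_v = select1(v), r[v - 1]
    k = rank1(r_v)                    # vertices 1..k start before r_v
    return [u for u in report_above(1, k, l_v) if u != v]


print(neighborhood(3))  # [2, 4]
```

Each recursive call either reports one position or stops, which is why the number of range-maximum queries is proportional to the number of reported elements.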

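$\mathsf{spath}(u, v)$ query: We first define the SUCC query as described in [10]. For an interval $I_v$, $\mathsf{SUCC}(v)$ returns the interval $I_{v'}$ such that $l_{v'} < r_v$ and there is no $v''$ with $l_{v''} < r_v$ and $r_{v''} > r_{v'}$. To answer the $\mathsf{spath}(u, v)$ query, let $P$ be the shortest path from $u$ to $v$, initialized as empty (without loss of generality, we assume that $u \le v$). If $u$ and $v$ are identical, we simply add $u$ to $P$ and return $P$. If not, we first add $u$ to $P$ and consider two cases as follows [10].

  • If $u$ is adjacent to $v$, add $v$ to $P$ and return $P$.

  • If $u$ is not adjacent to $v$, we perform the $\mathsf{spath}(\mathsf{SUCC}(u), v)$ query recursively.

Since we can answer adjacent queries in $O(1)$ time, it is enough to show how to answer SUCC queries in $O(1)$ time. Let $k$ be the number of vertices $u'$ which satisfy $l_{u'} < r_u$, which can be computed in $O(1)$ time as $\mathsf{rank}_1(S, r_u)$. Then by the definition of the SUCC query, $I_{u'}$ with $u' = \mathsf{RMax}_r(1, k)$ gives the answer of $\mathsf{SUCC}(u)$ if $r_{u'} > r_u$ (if not, there is no vertex in $G$ adjacent to $u$ whose interval extends beyond $r_u$). Therefore we can answer the SUCC query in $O(1)$ time, which implies that the $\mathsf{spath}(u, v)$ query can be answered in $O(|\mathsf{spath}(u, v)|)$ time.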
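The SUCC-based path construction can be sketched as follows. It is a hedged illustration written iteratively rather than recursively, with naive helpers in place of the succinct structures; the example data, the convention of returning `None` for vertices in different components, and the function names are assumptions.

```python
S = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]   # 1 marks a left endpoint
r = [3, 5, 8, 7, 10]                 # right endpoints, indexed by vertex label


def rank1(i):
    return sum(S[:i])


def select1(v):
    return [pos for pos, b in enumerate(S, start=1) if b == 1][v - 1]


def adjacent(u, v):
    l_u, r_u, l_v, r_v = select1(u), r[u - 1], select1(v), r[v - 1]
    return u != v and not (r_u < l_v or r_v < l_u)


def succ(u):
    """Vertex starting before r_u with the largest right endpoint, if it
    extends beyond r_u; otherwise None."""
    k = rank1(r[u - 1])
    best = max(range(1, k + 1), key=lambda p: (r[p - 1], -p))
    return best if r[best - 1] > r[u - 1] else None


def spath(u, v):
    """A shortest path from u to v as a list of vertices, or None if
    u and v are not connected (assumed convention for this sketch)."""
    if u > v:
        path = spath(v, u)
        return None if path is None else path[::-1]
    path = [u]
    while u != v:
        if adjacent(u, v):
            path.append(v)
            break
        u = succ(u)
        if u is None:
            return None
        path.append(u)
    return path


print(spath(1, 4))  # [1, 2, 3, 4]
```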

In Appendix A we discuss how to support some basic graph algorithms (BFS, DFS, PEO traversals, proper coloring, computing the size of a maximum clique, a maximum independent set and a minimum vertex cover) efficiently on $G$ with the above set of operations along with the representation of Section 4.1.

5 Representation of some related families of interval graphs

In this section, we propose space-efficient representations for proper interval graphs, $k$-proper and $k$-improper interval graphs, and circular-arc graphs. Since these graphs are restrictions or extensions (i.e., sub/super-classes) of interval graphs, we can represent them by modifying the representation in Section 4.1 (to make the representation asymptotically optimal in terms of space). We also show that navigation queries on these graph classes can be answered efficiently with the modified representation.

5.1 Proper interval graphs

An interval graph $G$ is proper if there exists an interval representation of $G$ such that for any two vertices $u, v \in V$, $I_u \not\subset I_v$ and $I_v \not\subset I_u$ (we call such an interval representation of $G$ a proper representation of $G$). It is also known that proper interval graphs are equivalent to unit interval graphs, which have an interval representation in which every interval has the same length [31].

Now we consider how to represent a proper interval graph $G$ with $n$ vertices while supporting navigational queries efficiently on $G$. We first obtain an interval representation $\mathcal{I}$ of $G$ where the intervals satisfy the property of proper interval graphs. We then assign labels to the vertices of $G$ based on the sorted order of the left end points of their corresponding intervals, as described in Section 4.1. Let $S$ be the bit sequence obtained from this representation, as defined in Section 4.1. Then by the definition of $\mathcal{I}$, there are no two vertices $u, v \in V$ with $l_u < l_v$ and $r_u > r_v$ (if so, $I_v \subset I_u$). Thus by Lemma 2.1, for any vertex $v$ we can compute $l_v$ and $r_v$ in $O(1)$ time by $\mathsf{select}_1(S, v)$ and $\mathsf{select}_0(S, v)$ respectively, using $2n + o(n)$ bits. Also note that $r$ is a strictly increasing sequence when $G$ is a proper interval graph, and hence one can support RMax queries on $r$ in $O(1)$ time without maintaining any data structure, by simply returning the rightmost position of the query range. Thus, we obtain the following theorem.
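Because left and right endpoints appear in the same order in a proper representation, the bit sequence alone determines every interval. The following minimal sketch illustrates this with a naive `select`; the example sequence and the function names are assumptions made for illustration.

```python
# Hypothetical proper representation: [1,3], [2,5], [4,7], [6,8].
S = [1, 1, 0, 1, 0, 1, 0, 0]   # 1 = left endpoint, 0 = right endpoint


def select(bit, v):
    """Position (1-indexed) of the v-th occurrence of bit in S."""
    return [pos for pos, b in enumerate(S, start=1) if b == bit][v - 1]


def interval(v):
    # The v-th left endpoint and the v-th right endpoint belong to vertex v.
    return select(1, v), select(0, v)


def rmax(i, j):
    """Range maximum on the (implicit) increasing sequence r: rightmost position."""
    return j


print([interval(v) for v in range(1, 5)])  # [(1, 3), (2, 5), (4, 7), (6, 8)]
```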

Theorem 2.

Given a proper interval graph or unit interval graph $G$ with $n$ vertices, there exists a $(2n + o(n))$-bit representation of $G$ which answers $\mathsf{degree}(v)$ and $\mathsf{adjacent}(u, v)$ queries in $O(1)$ time, $\mathsf{neighborhood}(v)$ queries in $O(\mathsf{degree}(v))$ time, and $\mathsf{spath}(u, v)$ queries in $O(|\mathsf{spath}(u, v)|)$ time, for any vertices $u, v \in V$.

It is known that there are asymptotically $c \cdot 4^n / n^{3/2}$ non-isomorphic unlabeled unit interval graphs with $n$ vertices, for some constant $c > 0$ [17], and hence $2n - O(\log n)$ bits is an information-theoretic lower bound on representing an arbitrary proper interval graph. Thus our representation in Theorem 2 gives a succinct representation for proper interval graphs.

5.2 $k$-proper and $k$-improper interval graphs

One can generalize the proper interval graph to the following two sub-classes of interval graphs. For an interval graph $G$ with $n$ vertices, $G$ is a $k$-proper interval graph (resp. $k$-improper interval graph) if there exists an interval representation $\mathcal{I}$ such that for any vertex $v \in V$, $I_v$ is contained by (resp. contains) at most $k$ intervals in $\mathcal{I}$ other than $I_v$. We call such an interval representation of $G$ the $k$-proper representation (resp. $k$-improper representation) of $G$. Note that every proper interval graph is both a 0-proper and a 0-improper interval graph. The graph in Figure 1 is a 2-proper and a 3-improper interval graph. Now we consider how to represent a $k$-proper interval graph $G$ with $n$ vertices and support navigation queries efficiently on $G$. We first represent $G$ $k$-properly by $n$ intervals, and assign the labels to the vertices of $G$ based on the sorted order of their left end points, as described in Section 4.1. As in the representation of Section 4.1, we first maintain the data structure for supporting rank and select queries on $S$ in $O(1)$ time, using $2n + o(n)$ bits in total. We also maintain the $(2n + o(n))$-bit data structure of Lemma 2.2 on $r$ for supporting RMax queries on $r$ in $O(1)$ time. Next, to access $r$ without using $n \log n$ bits, we define the sequence $S' = s'_1 s'_2 \ldots s'_{2n}$ of size $2n$ over the alphabet $\{0_0, \ldots, 0_k, 1_0, \ldots, 1_k\}$ such that $s'_i = 1_j$ (resp. $0_j$) if $s_i = 1$ (resp. $0$) and its corresponding interval is contained by $j$ intervals in $\mathcal{I}$. Now for any $0 \le j \le k$, let $\mathcal{I}_j \subseteq \mathcal{I}$ be the set of all intervals whose two end points are marked $1_j$ and $0_j$ in $S'$. It is easy to show that each $\mathcal{I}_j$ corresponds to a proper interval graph, since two intervals that are each contained by exactly $j$ intervals cannot contain one another. For example, the graph in Figure 1 is a 2-proper interval graph. By Lemma 2.1, we can maintain $S'$ using $2n \log(2k + 2) + o(n \log k)$ bits while supporting rank and select queries in $O(\log\log k)$ and $O(1)$ time respectively. Then for any vertex $v$, we can answer its corresponding interval $I_v = [l_v, r_v]$ in $O(\log\log k)$ time by $l_v = \mathsf{select}_1(S, v)$ and $r_v = \mathsf{select}_{0_j}(S', \mathsf{rank}_{1_j}(S', l_v))$, where $s'_{l_v} = 1_j$. Thus, we obtain the following theorem.

Theorem 3.

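Given a $k$-proper interval graph $G$ with $n$ vertices, there exists a $(2n \log k + O(n) + o(n \log k))$-bit representation of $G$ which answers $\mathsf{degree}(v)$ and $\mathsf{adjacent}(u, v)$ queries in $O(\log\log k)$ time, $\mathsf{neighborhood}(v)$ queries in $O(\mathsf{degree}(v) \cdot \log\log k)$ time, and $\mathsf{spath}(u, v)$ queries in $O(|\mathsf{spath}(u, v)| \cdot \log\log k)$ time, for any vertices $u, v \in V$.

Note that we can represent $k$-improper interval graphs in the same space with the same query times as in Theorem 3 by changing the definition of $S'$ to be $s'_i = 1_j$ (resp. $0_j$) if $s_i = 1$ (resp. $0$) and its corresponding interval contains $j$ intervals in $\mathcal{I}$.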
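One way to read the $k$-proper construction is that, within each class of intervals contained by exactly $j$ others, left and right endpoints occur in the same order, so the $t$-th left-endpoint symbol of class $j$ pairs with the $t$-th right-endpoint symbol of class $j$. The sketch below illustrates that reading; the pairing rule as coded, the tuple encoding `(bit, j)` of the labeled symbols, and the example intervals are assumptions made for illustration, and naive scans stand in for the rank/select structure.

```python
# Hypothetical intervals, sorted by left endpoint; (6, 7) lies inside (4, 8).
intervals = [(1, 3), (2, 5), (4, 8), (6, 7), (9, 10)]
n = len(intervals)


def containers(v):
    """Number of other intervals that contain interval v (0-indexed)."""
    l, r = intervals[v]
    return sum(1 for a, b in intervals if a < l and r < b)


# Labeled sequence: (1, j) for a left endpoint and (0, j) for a right endpoint
# of an interval contained by exactly j other intervals.
S_prime = [None] * (2 * n)
for v, (l, r) in enumerate(intervals):
    j = containers(v)
    S_prime[l - 1] = (1, j)
    S_prime[r - 1] = (0, j)


def rank(sym, i):
    return sum(1 for s in S_prime[:i] if s == sym)


def select(sym, t):
    return [pos for pos, s in enumerate(S_prime, start=1) if s == sym][t - 1]


def right_endpoint(l_v):
    """Recover r_v from l_v using only the labeled sequence."""
    _, j = S_prime[l_v - 1]
    t = rank((1, j), l_v)          # l_v is the t-th left endpoint of class j
    return select((0, j), t)       # its partner is the t-th right endpoint of class j


print([right_endpoint(l) for l, _ in intervals])  # [3, 5, 8, 7, 10]
```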

5.3 Circular-arc graphs

In this section, we propose a succinct representation for circular-arc graphs, and show how to support navigation queries efficiently on the representation. A circular-arc graph $G$ is a graph whose vertices can be assigned to arcs on a circle so that two vertices are adjacent in $G$ if and only if their assigned arcs intersect. It is easy to see that every interval graph is a circular-arc graph. Thus, by Theorem 1, we need at least $n \log n - 2n \log\log n - O(n)$ bits to represent an arbitrary circular-arc graph $G$.

Suppose that $G$ is represented by the circle $C$ together with $n$ arcs of $C$. For an arc, we define its start point to be the unique point on it such that the arc continues from that point in the clockwise direction but stops in the anti-clockwise direction; similarly, we define its end point to be the unique point on it such that the arc stops in the clockwise direction but continues in the anti-clockwise direction. As in the case of interval graphs, we assume, without loss of generality, that all the start and end points of all the arcs are distinct. We label the vertices of $G$ with the integers from $1$ to $n$ as described below. We first select an arbitrary arc, and label the vertex (and the arc) corresponding to this arc by $1$. We then traverse the circle from the start point of that arc in the clockwise direction, and label the remaining vertices and arcs in the order in which their start points are encountered during the traversal, and finish the traversal when we return to the start point of the first arc. We also map all the start and end points of all arcs, in the order in which they are encountered in the above traversal, into the range $[1, 2n]$ (since the start and end points of all the arcs are distinct). With the above defined labeling of the arcs, and the numbering of their start and end points, let $l_i$ and $r_i$ be the start and end points of the arc labeled $i$, for $1 \le i \le n$. Now the arcs can be thought of as two types of intervals in the range $[1, 2n]$; we call an interval $I_i$ normal if $l_i < r_i$ (i.e., we traverse $l_i$ prior to $r_i$), and reversed otherwise. A normal interval $I_i$ corresponds to the interval $[l_i, r_i]$, while a reversed interval $I_i$ actually corresponds to the union of the two intervals $[l_i, 2n]$ and $[1, r_i]$. See Figure 2 for an example; intervals numbered 4 and 7 are reversed, while the others are normal. Our representation of $G$ consists of the following substructures.

Figure 2: Example of the circular graph and its representation.
  1. Define a binary sequence $S = s_1 s_2 \ldots s_{2n}$ of length $2n$ such that for $1 \le i \le 2n$, $s_i = 1$ (resp. $s_i = 0$) if the $i$-th end point encountered during the traversal of $C$ is in $\{l_1, \ldots, l_n\}$ (resp. $\{r_1, \ldots, r_n\}$). Now, construct a sequence $S' = s'_1 s'_2 \ldots s'_{2n}$ of size $2n$ over an alphabet of size $4$ such that, for all $1 \le i \le 2n$, $s'_i$ encodes $s_i$ together with whether the position $i$ corresponds to an end point of a reversed interval or of a normal interval. We represent $S'$ using the structure of Lemma 2.1, using $4n + o(n)$ bits, so that we can answer rank and select queries on $S'$ in $O(1)$ time. In addition, we also store auxiliary structures (of $o(n)$ bits) on top of $S'$ to support rank and select queries on $S$ (without explicitly storing $S$; note that one can efficiently reconstruct any subsequence of $S$ from $S'$).

  2. To store the interval end points efficiently, we introduce two 2-dimensional grids of points, and , defined as follows. Suppose there are vertices in which correspond to normal intervals (and vertices correspond to reversed intervals). Then let be a set of points on the 2-dimensional grid which consist of , for all with . Similarly let be a set of points on the 2-dimensional grid which consist of , for all with . Given a set of points on 2-dimensional grid, we define the following queries

    • : returns with .

    • : returns number of points in within the rectangular range .

    We represent and using bits in total, such that and queries can be supported in time [6].

Using these data structures, when the vertex is given, we can answer and in time by , and if (i.e., if is the left end point of a normal interval), and otherwise (i.e., if if is the left end point of a reversed interval). Finally, let be a sequence such that for , with . Similarly, let be a sequence such that for , with . Then we maintain the data structure of Lemma 2.2 on and , using a total of bits, to support RMax queries on each of them. Thus, the overall representation takes bits in total. Now we prove the following theorem (See Appendix B for the proof).

Theorem 4.

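Given a circular-arc graph $G$ with $n$ vertices, there exists an $(n \log n + o(n \log n))$-bit representation of $G$ which answers $\mathsf{degree}(v)$ and $\mathsf{adjacent}(u, v)$ queries in $O(\log n / \log\log n)$ time, $\mathsf{neighborhood}(v)$ queries in $O(\mathsf{degree}(v) \cdot \log n / \log\log n)$ time, and $\mathsf{spath}(u, v)$ queries in $O(|\mathsf{spath}(u, v)| \cdot \log n / \log\log n)$ time for any two vertices $u, v \in V$.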
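The view of arcs as normal or reversed intervals over the $2n$ numbered end points can be sketched as follows. This only illustrates the adjacency test on explicitly stored arcs, not the succinct structure of the theorem; the function names and the example arcs are assumptions.

```python
def pieces(l, r, n):
    """Sub-intervals of [1, 2n] covered by the arc running clockwise from l to r."""
    if l < r:                       # normal interval
        return [(l, r)]
    return [(l, 2 * n), (1, r)]     # reversed interval wraps past point 2n


def arcs_intersect(a, b, n):
    """True iff arcs a = (l_a, r_a) and b = (l_b, r_b) share a point."""
    return any(x1 <= y2 and y1 <= x2
               for x1, x2 in pieces(a[0], a[1], n)
               for y1, y2 in pieces(b[0], b[1], n))


# Hypothetical example with n = 4 arcs on points 1..8; arc 4 is reversed.
n = 4
arcs = {1: (1, 2), 2: (3, 6), 3: (5, 7), 4: (8, 4)}
print(arcs_intersect(arcs[1], arcs[4], n))  # True
print(arcs_intersect(arcs[3], arcs[4], n))  # False
```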

6 Conclusion and Final Remarks

We considered the problem of succinctly encoding an unlabeled interval graph with $n$ vertices so as to support adjacency, degree, neighborhood and shortest path queries. To this end, we designed a succinct data structure that supports these queries optimally. We also showed how one can implement various combinatorial algorithms on interval graphs using our succinct data structure in a time- and space-efficient manner. Extending these ideas, we finally presented succinct/compact data structures for several other variants of interval graphs. For some of these variants, the query times of our data structures are super-constant and hence non-optimal; we leave as an open problem whether one can design data structures supporting these queries in constant time.

References

  • [1] L. C. Aleardi, O. Devillers, and G. Schaeffer. Succinct representations of planar maps. Theor. Comput. Sci., 408(2-3):174–187, 2008.
  • [2] N. Banerjee, S. Chakraborty, V. Raman, and S. R. Satti. Space efficient linear time algorithms for BFS, DFS and applications. Theory Comput. Syst., 62(8):1736–1762, 2018.
  • [3] A. Bar-Noy, R. Bar-Yehuda, A. Freund, J. Naor, and B. Schieber. A unified approach to approximating resource allocation and scheduling. J. ACM, 48(5):1069–1090, 2001.
  • [4] S. Benser. On the topology of the genetic fine structure. Proc. Nat. Acad. Sci., 45:1607–1620.
  • [5] K. S. Booth and G. S. Lueker. Testing for the consecutive ones property, interval graphs, and graph planarity using PQ-tree algorithms. J. Comput. Syst. Sci., 13(3):335–379, 1976.
  • [6] P. Bose, M. He, A. Maheshwari, and P. Morin. Succinct orthogonal range search structures on a grid with applications to text indexing. In WADS, pages 98–109, 2009.
  • [7] G. S. Brodal, P. Davoodi, and S. S. Rao. On space efficient two dimensional range minimum data structures. Algorithmica, 63(4):815–830, 2012.
  • [8] S. Chakraborty, A. Mukherjee, V. Raman, and S. R. Satti. A framework for in-place graph algorithms. In 26th ESA, pages 13:1–13:16, 2018.
  • [9] S. Chakraborty, V. Raman, and S. R. Satti. Biconnectivity, st-numbering and other applications of DFS using O(n) bits. J. Comput. Syst. Sci., 90:63–79, 2017.
  • [10] D. Z. Chen, D. T. Lee, R. Sridhar, and C. N. Sekharan. Solving the all-pair shortest path query problem on interval and circular-arc graphs. Networks, 31(4):249–258, 1998.
  • [11] D. R. Clark and J. I. Munro. Efficient suffix trees on secondary storage. In SODA, pages 383–391, 1996.
  • [12] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms (3. ed.). MIT Press, 2009.
  • [13] R. Diestel. Graph Theory, 4th Edition, volume 173 of Graduate texts in mathematics. Springer, 2012.
  • [14] Y. Dodis, M. Patrascu, and M. Thorup. Changing base without losing space. In STOC, pages 593–602, 2010.
  • [15] A. Farzan and S. Kamali. Compact navigation and distance oracles for graphs with small treewidth. Algorithmica, 69(1):92–116, 2014.
  • [16] A. Farzan and J. I. Munro. Succinct encoding of arbitrary graphs. Theor. Comput. Sci., 513:38–52, 2013.
  • [17] S. R. Finch. Mathematical Constants. 2003.
  • [18] J. Fischer and V. Heun. Space-efficient preprocessing schemes for range minimum queries on static arrays. SIAM J. Comput., 40(2):465–492, 2011.
  • [19] M. C. Golumbic. Interval graphs and related topics. Discrete Mathematics, 55(2):113–121, 1985.
  • [20] M. C. Golumbic. Algorithmic Graph Theory and Perfect Graphs. 2004.
  • [21] M. C. Golumbic and R. Shamir. Complexity and algorithms for reasoning about time: A graph-theoretic approach. J. ACM, 40(5):1108–1133, 1993.
  • [22] A. Golynski, J. I. Munro, and S. S. Rao. Rank/select operations on large alphabets: a tool for text indexing. In SODA, pages 368–373, 2006.
  • [23] U. I. Gupta, D. T. Lee, and Joseph Y.-T. Leung. Efficient algorithms for interval graphs and circular-arc graphs. Networks, 12(4):459–467, 1982.
  • [24] M. Habib, R. M. McConnell, C. Paul, and L. Viennot. Lex-BFS and partition refinement, with applications to transitive orientation, interval graph recognition and consecutive ones testing. Theor. Comput. Sci., 234(1-2):59–84, 2000.
  • [25] G. Hajós. Über eine Art von Graphen. Int. Math. Nachr., 11:1607–1620.
  • [26] P. Hanlon. Counting interval graphs. Transactions of the American Mathematical Society, 272(2):383–426, 1982.
  • [27] S. Klavzar and M. Petkovsek. Intersection graphs of halflines and halfplanes. Discrete Mathematics, 66(1-2):133–137, 1987.
  • [28] J. I. Munro and V. Raman. Succinct representation of balanced parentheses and static trees. SIAM J. Comput., 31(3):762–776, 2001.
  • [29] J. I. Munro and K. Wu. Succinct data structures for chordal graphs. In ISAAC, pages 67:1–67:12, 2018.
  • [30] G. Navarro. Wavelet trees for all. J. Discrete Algorithms, 25:2–20, 2014.
  • [31] F. S. Roberts. "Indifference graphs", in Harary, Frank, Proof Techniques in Graph Theory. 1969.
  • [32] N. J. E. Sloane. The on–line encyclopedia of integer sequences. http://oeis.org.
  • [33] M. Thorup. Integer priority queues with decrease key in constant time and the single source shortest paths problem. J. Comput. Syst. Sci., 69(3):330–353, 2004.
  • [34] J. C. Yang and N. Pippenger. On the enumeration of interval graphs. Proc. Amer. Math. Soc. Ser. B, 4(1):1–3, 2017.
  • [35] P. Zhang, E. A. Schon, E. Cayanis, S. G. Fischer, P. E. Bourne, J. Weiss, and S. Kistelr. An algorithm based on graph theory for the assembly of contigs in physical mapping of DNA. Bioinformatics, 10(3):309–317, 06 1994.

Appendix A Some graph algorithms on the succinct representation of interval graphs

Depth-first search (DFS) and breadth-first search (BFS): DFS and BFS are the two most widely used graph search methods, serving as the backbone of many other important graph algorithms. In what follows, we show that listing the vertices in ascending order of their labels, i.e., $1, 2, \dots, n$, gives both a DFS and a BFS vertex ordering of $G$. Note that $G$ may have more than one valid DFS or BFS ordering; here we are interested in any one of them. Moreover, along the lines of recent papers [2, 9, 8], we are interested only in the ordering of the vertices in the DFS and BFS traversals, i.e., the order in which the vertices are visited for the first time during the traversal of the input graph $G$, not in reporting the final DFS/BFS tree. Towards this, we show the following.

Theorem 5.

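Given an interval graph $G$ with $n$ vertices, suppose we label the vertices of $G$ from $1$ to $n$ such that, for any two vertices $u$ and $v$, $u < v$ if and only if the left endpoint of the interval of $u$ precedes the left endpoint of the interval of $v$. Then the ascending order from $1$ to $n$ gives a valid DFS and BFS ordering of $G$.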

Proof.

We consider only the DFS traversal (the case of BFS can be proved by a similar argument). We proceed by induction on the number of visited vertices. Since the traversal can start from an arbitrary vertex of $G$, the statement holds when the traversal starts with vertex $1$. Next, suppose that we have already visited the vertices $1, \dots, i$ with $i < n$ (the case $i = n$ is trivial), and suppose for contradiction that in every valid DFS traversal there exists a vertex $j > i+1$ which is visited prior to $i+1$. This implies that there exists at least one vertex $k \le i$ such that $k$ is adjacent to $j$ but not to $i+1$, contradicting the fact that the left endpoint of $i+1$ precedes the left endpoint of $j$: an interval that starts before $i+1$ and intersects $j$ must also contain the left endpoint of $i+1$. Therefore there exists a valid DFS traversal which visits the vertex $i+1$ after visiting the vertex $i$. ∎
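As an illustration of Theorem 5, the following minimal sketch checks the claim on a small example. It works on explicit adjacency lists built from closed intervals rather than on the succinct representation, uses 0-based labels, and assumes a connected graph; the function names `build_adjacency`, `bfs_order` and `dfs_order` are hypothetical and chosen only for this example.

```python
from collections import deque


def build_adjacency(intervals):
    """Sort closed intervals by left endpoint and label them 0..n-1.

    Returns the sorted intervals and adjacency lists in ascending label order.
    """
    ivs = sorted(intervals)
    n = len(ivs)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # j starts no earlier than i, so they intersect iff j starts
            # before i ends.
            if ivs[j][0] <= ivs[i][1]:
                adj[i].append(j)
                adj[j].append(i)
    return ivs, adj


def bfs_order(adj, start=0):
    """Order of first visits in a BFS that scans neighbors by ascending label."""
    seen = [False] * len(adj)
    seen[start] = True
    order = []
    queue = deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                queue.append(v)
    return order


def dfs_order(adj, start=0):
    """Order of first visits in a DFS that scans neighbors by ascending label."""
    seen = [False] * len(adj)
    order = []

    def visit(u):
        seen[u] = True
        order.append(u)
        for v in adj[u]:
            if not seen[v]:
                visit(v)

    visit(start)
    return order
```

On a connected example such as `[(1, 4), (2, 3), (3, 8), (5, 6), (7, 9)]`, both traversals should visit the vertices in label order. The recursive DFS is only meant for small inputs.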

Perfect Elimination Ordering (PEO): A PEO of a graph $G$, if it exists, is an ordering of the vertices of $G$ such that, for each vertex $v$, $v$ and the neighbors of $v$ that occur before $v$ in the order form a clique [20]. If we order the vertices by sorting their corresponding intervals by left endpoint, then the resulting vertex order is a PEO, since the predecessor set of every vertex forms a clique: every earlier neighbor of a vertex contains the left endpoint of that vertex's interval. Thus, generating a PEO of the given interval graph from our representation is trivial.
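The PEO property of the left-endpoint order can be checked directly on a small instance. The sketch below is a brute-force verification over explicit closed intervals, not an operation on the succinct representation; the name `is_peo_by_left_endpoint` is hypothetical.

```python
def is_peo_by_left_endpoint(intervals):
    """Check that ordering closed intervals by left endpoint is a PEO.

    For every vertex, its neighbors that occur earlier in the order must be
    pairwise adjacent (together with the vertex itself they form a clique).
    """
    ivs = sorted(intervals)
    for j, (lj, _rj) in enumerate(ivs):
        # Earlier intervals start no later than lj, so they are adjacent
        # to j exactly when they end at or after lj.
        earlier = [ivs[i] for i in range(j) if ivs[i][1] >= lj]
        for a in range(len(earlier)):
            for b in range(a + 1, len(earlier)):
                (la, ra), (lb, rb) = earlier[a], earlier[b]
                if max(la, lb) > min(ra, rb):  # the two intervals are disjoint
                    return False
    return True
```

Since every earlier neighbor contains the point `lj`, the inner check is expected to succeed for any set of intervals; the function only makes that argument testable.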

Maximum Independent Set (MIS) and Minimum Vertex Cover (MVC): To compute an MIS, we simulate the greedy algorithm of [23], which works as follows. We maintain an output set and a set of discarded intervals, both initially empty. We first find the vertex whose right endpoint is the leftmost among the right endpoints of all intervals that have not been discarded. If such a vertex exists, we add it to the output set and discard its interval together with the intervals of all vertices adjacent to it. We repeat this procedure until no such vertex exists, and return the output set. An MVC can then be computed from the MIS by returning the complement of the MIS, in time. (For the graph in Figure 1, MIS = and MVC = .)

We now show how this algorithm can be implemented on our representation of $G$ in time linear in the size of the input. We first initialize the output set to empty, compute (which returns the interval with the smallest right endpoint among all the intervals), and add the corresponding vertex to the output set. The greedy algorithm then picks the next interval with the smallest right endpoint in the range of the sequence . In general, suppose is the last vertex added to the output set. Then we compute , and add the returned vertex to the output set, if it exists. Thus, we can compute an MIS in time linear in the size of the MIS.
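The greedy rule can be sketched on explicit closed intervals as follows. This is an illustration only: it sorts the intervals by right endpoint instead of issuing range queries on the succinct representation, so it does not reproduce the running time discussed above. The names `max_independent_set` and `min_vertex_cover` are hypothetical.

```python
def max_independent_set(intervals):
    """Greedy MIS on closed intervals: repeatedly take the smallest right endpoint.

    Returns the indices (into `intervals`) of the chosen vertices.
    """
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][1])
    chosen = []
    last_right = None
    for i in order:
        left, right = intervals[i]
        # Intervals are closed, so touching intervals are adjacent:
        # the next choice must start strictly after the last chosen one ends.
        if last_right is None or left > last_right:
            chosen.append(i)
            last_right = right
    return chosen


def min_vertex_cover(intervals):
    """The complement of a maximum independent set is a minimum vertex cover."""
    mis = set(max_independent_set(intervals))
    return [i for i in range(len(intervals)) if i not in mis]
```

Skipping every interval that starts at or before the last chosen right endpoint plays the role of discarding the neighbors of the chosen vertex.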

Computing a Maximum Clique: To find a maximum clique in $G$, we define a sequence of counters, one per endpoint position in left-to-right order, where (i) the first counter is $1$, and (ii) every later counter equals its predecessor plus $1$ if the endpoint at that position is a left endpoint, and its predecessor minus $1$ otherwise. By this definition, if the counter at position $i$ equals $k$, there are exactly $k$ vertices in $G$ whose intervals have left endpoint at most $i$ and right endpoint larger than $i$. Thus all such vertices form a clique. This gives the following algorithm for computing a maximum clique in $G$. While constructing the sequence in time, we maintain an index $i$ at which the counter is largest. We then scan all the intervals and return those whose left endpoint is at most $i$ and whose right endpoint is larger than $i$. Therefore we can compute a maximum clique in $G$ in time in total.
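The counter sequence amounts to a sweep over the endpoints. The sketch below is a minimal version on explicit closed intervals; it sorts the endpoints itself (an assumption made for self-containment, since the succinct representation already stores them in order), and it places left endpoints before right endpoints at equal coordinates so that touching intervals count as intersecting. The name `maximum_clique` is hypothetical.

```python
def maximum_clique(intervals):
    """Return indices of the intervals in one maximum clique.

    Sweeps the endpoints left to right, adding 1 at a left endpoint and
    subtracting 1 at a right endpoint, and remembers where the count peaks.
    """
    events = sorted(
        (point, kind, idx)
        for idx, (left, right) in enumerate(intervals)
        for point, kind in ((left, 0), (right, 1))  # kind 0 = left, 1 = right
    )
    depth = 0
    best_depth = 0
    best_point = None
    for point, kind, _idx in events:
        if kind == 0:
            depth += 1
            if depth > best_depth:
                best_depth, best_point = depth, point
        else:
            depth -= 1
    if best_point is None:
        return []
    # Every interval containing the peak point belongs to the clique.
    return [i for i, (l, r) in enumerate(intervals) if l <= best_point <= r]
```

The final scan over all intervals mirrors the last step of the algorithm described above.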

Computing a Proper Coloring: It is well known that the greedy algorithm yields an optimal proper coloring of $G$ if we process the vertices of $G$ in the order of their corresponding intervals' left endpoints [20]. Thus, we simply implement this greedy coloring on $G$, from vertex $1$ to vertex $n$, as follows. We maintain one value per vertex, which stores the color of that vertex. Since each can be stored using bits, we can maintain all of them using bits in total, where $m$ denotes the number of edges in $G$. To access the color of a given vertex in time, we also store a parallel bit vector which stores a $1$ at the beginning of each vertex's color and a $0$ in all other positions, together with an auxiliary data structure to support select queries on it. Now initialize all to and scan the vertices from $1$ to $n$. When we visit a vertex, we perform the query and choose the minimum color in . Since we use time for each query to assign the color of , we can assign the colors of all vertices in $G$ in time, using extra bits of space.

An alternative way to implement the greedy coloring on $G$ is to use a priority queue. In this case, we first compute the chromatic number of $G$. Since $G$ is an interval graph, we can compute it in time on our representation by computing the size of a maximum clique of $G$. Now we initialize to and insert to the priority queue , and scan from left to right. Suppose we currently access an endpoint, which corresponds to some vertex (we can compute the index in time). If it is a left endpoint, we assign the minimum element of the queue to the vertex as its color and delete that element from the queue. Otherwise, we insert the color of the vertex back into the queue. Note that we perform exactly insert operations and delete operations on the queue. Therefore we can compute a proper coloring of $G$ in time using bits of space, using the integer priority queue structure of [33].

Note that these two solutions use bits of space. With bits, we cannot store the colors of all the vertices simultaneously (unless the graph is sparse), and this poses a challenge for the greedy algorithm. We leave open the problem of finding a proper coloring of interval graphs using extra bits.
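The priority-queue variant of the greedy coloring can be sketched as follows. This is an illustration on explicit closed intervals, under two assumptions that differ from the text: the endpoints are sorted inside the function, and Python's binary heap (`heapq`) stands in for the integer priority queue of [33], so the time and space bounds discussed above are not reproduced. The name `color_intervals` is hypothetical.

```python
import heapq


def color_intervals(intervals):
    """Color closed intervals with the minimum number of colors (1..chi).

    A color is taken from the queue at a left endpoint and returned to it
    at the matching right endpoint.
    """
    events = sorted(
        (point, kind, v)
        for v, (left, right) in enumerate(intervals)
        for point, kind in ((left, 0), (right, 1))  # kind 0 = left, 1 = right
    )
    # The chromatic number of an interval graph equals its maximum clique
    # size, which is the peak depth of the endpoint sweep.
    depth = chi = 0
    for _point, kind, _v in events:
        depth += 1 if kind == 0 else -1
        chi = max(chi, depth)

    free = list(range(1, chi + 1))  # a sorted list is already a valid min-heap
    color = [0] * len(intervals)
    for _point, kind, v in events:
        if kind == 0:
            color[v] = heapq.heappop(free)   # smallest available color
        else:
            heapq.heappush(free, color[v])   # release the color of v
    return color
```

Because at most `chi` intervals are open at any time, the queue is never empty when a left endpoint is processed.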

Appendix B Proof of Theorem 4

Theorem 4.

Given a circular-arc graph $G$ with $n$ vertices, there exists a -bit representation of $G$ which answers degree and adjacency queries in time, neighborhood queries in time, and shortest path queries in time for any two vertices .

Proof.

Suppose we have the -bit representation described in Section 5.3. We now consider the following queries, extending the proof in Section 4.2.

Degree query: To answer the query, we compute (i) the number of adjacent vertices with , and (ii) the number of adjacent vertices with , and return their sum. We consider two cases based on and as follows.

  • We can count the number of vertices in (i) in time by returning , as in Section 4.2. Next, we classify the vertices in (ii) into three cases as 1) , 2) , and 3) or , and return the sum of the three counts. First, the number of vertices in cases 1) and 2) can be obtained in time by returning and , respectively. To count the number of vertices in case 3), we first count the number of start and end points between and by returning . We then subtract the number of vertices whose start and end points both lie between and , which is where . Thus we can count the number of vertices in this case in time.

  • We classify the vertices in (i) into three cases as 1) , 2) , and 3) or , and return the sum of the three counts. This can be answered in time by the same argument as above. To count the vertices in (ii), we simply return , since all vertices corresponding to reverse intervals cross in , i.e., all such vertices form a clique in $G$.

Adjacency query: This can be answered in time by checking , , , and .
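For intuition, adjacency of two arcs can be tested with four endpoint-containment checks. The sketch below is an assumption about how such a test may look on explicit arcs; it is not necessarily the set of four conditions used in the proof, and it does not touch the succinct representation. An arc `(s, e)` runs clockwise from `s` to `e` and is treated as closed; it is normal when `s <= e` and reverse (wrapping around the end of the circle) when `s > e`. The names `arc_contains` and `arcs_adjacent` are hypothetical.

```python
def arc_contains(arc, p):
    """Return True if point p lies on the closed arc (s, e)."""
    s, e = arc
    if s <= e:                 # normal arc: an ordinary interval
        return s <= p <= e
    return p >= s or p <= e    # reverse arc: wraps around the circle


def arcs_adjacent(a, b):
    """Two arcs intersect iff one of them contains an endpoint of the other."""
    return (
        arc_contains(a, b[0])
        or arc_contains(a, b[1])
        or arc_contains(b, a[0])
        or arc_contains(b, a[1])
    )
```

For example, the reverse arc `(7, 2)` intersects `(1, 3)` but not `(3, 5)`, and any two reverse arcs intersect because both contain the wrap-around point.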

Neighborhood query: We only describe how to report the vertices adjacent to when the corresponding interval of is normal. The case when the interval is reverse can be handled similarly. First, we can return all the vertices with in time using the same argument as in Section 4.2. Next, the set of vertices adjacent to with is a disjoint union of the following two sets: 1) the set of all vertices with , and 2) the set of all vertices with . We can report all the vertices in in time by returning , which takes time per element. Finally, the vertices in are exactly the vertices in with . Using the data structure RMax on with a query range on , these vertices can be reported in time per element by the same procedure used to answer the neighborhood queries on interval graphs. Thus, we can answer