This paper investigates the problem of encoding a given graph $G$ into a binary string $S$ with the requirement that $S$ can be decoded to reconstruct $G$. The problem has been studied generally with two primary objectives. One is to minimize the length of $S$, while the other is to minimize the time needed to compute and decode $S$. In light of these goals, a coding scheme is efficient if its encoding and decoding procedures both take polynomial time. A coding scheme is succinct if the length of $S$ is not much larger than its information-theoretic tight bound, i.e., the shortest length over all possible coding schemes.
As the two primary objectives are often in conflict, a number of coding schemes with different trade-offs have been proposed from practical and theoretical perspectives. The most well-known efficient succinct scheme is the folklore scheme of encoding a rooted ordered $n$-vertex tree into a string of balanced pairs of left and right parentheses, which uses $2(n-1)$ bits. Since the total number of such trees is at least $\frac{1}{2(n-1)}\cdot\frac{(2n-2)!}{(n-1)!\,(n-1)!}$, the minimum number of bits needed to differentiate these trees is the logarithm (all logarithms are of base 2) of this quantity, which is $2n - o(n)$ by Stirling's approximation. Thus, 2 bits per edge is an information-theoretic tight bound for encoding rooted ordered trees. The standard adjacency-list encoding of a graph is widely useful but requires $2m\lceil \log n\rceil$ bits where $m$ and $n$ are the numbers of edges and vertices, respectively [3]. For certain graph families, Kannan, Naor and Rudich [10] gave schemes that encode each vertex with $O(\log n)$ bits and support $O(\log n)$-time testing of adjacency between two vertices. For connected planar graphs, Jacobson [9] gave a $\Theta(n)$-bit encoding which supports traversal in $\Theta(\log n)$ time per vertex visited. This result was recently improved by Munro and Raman [17]; their schemes encode binary trees, rooted ordered trees and planar graphs succinctly and support several graph operations in constant time. For dense graphs and complement graphs, Kao, Occhiogrosso, and Teng [14] devised two compressed representations from adjacency lists to speed up basic graph techniques such as breadth-first search and depth-first search. Galperin and Wigderson [6] and Papadimitriou and Yannakakis [19] investigated complexity issues arising from encoding a graph by a small circuit that computes its adjacency matrix. For labeled planar graphs, Itai and Rodeh [8] gave an encoding procedure that requires $\frac{3}{2}n\log n + O(n)$ bits. For unlabeled general graphs, Naor [18] gave an encoding of $\frac{1}{2}n^2 - n\log n + O(n)$ bits, which is optimal to the second order.
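The folklore parenthesis scheme mentioned above can be sketched as follows. This is a minimal illustration, not code from the paper: the function names `encode_tree` and `decode_tree` and the dictionary-of-child-lists representation are assumptions made for the example. Each tree edge contributes one left parenthesis when the traversal descends and one right parenthesis when it returns, so an $n$-vertex tree yields $2(n-1)$ symbols.

```python
def encode_tree(children, root):
    """Encode a rooted ordered tree as a balanced parenthesis string.

    `children` maps every vertex to the ordered list of its children
    (hypothetical representation chosen for this sketch).
    """
    out = []
    # Iterative depth-first traversal; each stack entry is (vertex, child iterator).
    stack = [(root, iter(children.get(root, [])))]
    while stack:
        _, it = stack[-1]
        child = next(it, None)
        if child is None:
            stack.pop()
            if stack:              # returning along an edge (not leaving the root)
                out.append(")")
        else:
            out.append("(")        # descending along an edge
            stack.append((child, iter(children.get(child, []))))
    return "".join(out)


def decode_tree(code):
    """Rebuild the tree; vertices are renamed 0, 1, 2, ... in preorder."""
    children = {0: []}
    stack = [0]
    next_id = 1
    for ch in code:
        if ch == "(":
            children[stack[-1]].append(next_id)
            children[next_id] = []
            stack.append(next_id)
            next_id += 1
        else:
            stack.pop()
    return children


tree = {0: [1, 4], 1: [2, 3], 2: [], 3: [], 4: []}
code = encode_tree(tree, 0)
print(code, len(code))  # 8 symbols for a 5-vertex tree
```

Writing a left parenthesis as one bit value and a right parenthesis as the other gives the 2-bits-per-edge binary string.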
Our work aims to minimize the number of bits needed to encode an embedded planar graph $G$ which is unlabeled and undirected. We assume that $G$ has $n$ vertices, $m$ edges, and $f$ faces but has no self-loop or multiple edge. (See [2, 7, 16] for the graph-theoretic terminology used in this paper.) Note that if polynomial time for encoding and decoding is not required, then any given graph in a large family can be encoded with the information-theoretic minimum number of bits by brute-force enumeration. This paper focuses on schemes that use only $O(n)$ time for both encoding and decoding.
For a general planar graph $G$, Turán [21] gave an encoding using $4m$ bits asymptotically. This space complexity was improved by Keeler and Westbrook [15] to about $3.58m$ bits. They also gave encoding algorithms for several important classes of planar graphs. In particular, they showed that if $G$ is triangulated, it can be encoded in about $1.53m$ bits. If $G$ is triconnected, it can be encoded using $3m$ bits. In this paper, these latter two results are improved as follows. If $G$ is triangulated, it can be encoded using $\frac{4}{3}m - 1$ bits. It is interesting that rooted ordered trees require 2 bits per edge, while the seemingly more complex plane triangulations need fewer bits. Note that Tutte [22] gave an enumeration theorem that yields an information-theoretic tight bound of roughly $1.08m$ bits for plane triangulations that may contain multiple edges. If $G$ is triconnected, we can encode it using at most $(2.5 + 2\log 3)\min\{n, f\} - 7$ bits, which is at most $2.835m$ bits. Both of our coding schemes are intuitive and simple. They require only $O(n)$ time for encoding as well as decoding. The schemes make new uses of the canonical orderings of planar graphs, which were originally introduced by de Fraysseix, Pach and Pollack [4] and extended by Kant [11]. These structures and closely related ones have also proven useful for drawing planar graphs in organized and compact manners [12, 13, 20].
2 A Coding Scheme for Plane Triangulations
This section assumes that $G$ is a plane triangulation. Thus, $n \ge 3$ and $G$ has $m = 3n - 6$ edges.
Let $v_1, v_2, \ldots, v_n$ be an ordering of the vertices of $G$, where $v_1, v_2, v_n$ are the three exterior vertices of $G$ in the counterclockwise order. After fixing such an ordering, let $G_k$ be the subgraph of $G$ induced by $v_1, v_2, \ldots, v_k$. Let $H_k$ be the exterior face of $G_k$. Let $G - G_k$ be the subgraph of $G$ obtained by removing $v_1, v_2, \ldots, v_k$. Our coding scheme uses a special kind of ordering defined as follows (see [4]). An ordering $v_1, v_2, \ldots, v_n$ of $G$ is canonical if the following statements hold for every $k = 3, \ldots, n$:
$G_k$ is biconnected, and its exterior face $H_k$ is a cycle containing the edge $(v_1, v_2)$.
The vertex $v_k$ is on the exterior face of $G_k$, and the set of its neighbors in $G_{k-1}$ forms a subinterval of the path $H_{k-1} - \{(v_1, v_2)\}$ and consists of at least two vertices. Furthermore, if $k < n$, $v_k$ has at least one neighbor in $G - G_k$. Note that the case $k = 3$ is somewhat ambiguous due to degeneracy, and $H_2$ is regarded as the edge $(v_1, v_2)$ itself.
Figure 1 illustrates a canonical ordering of a plane triangulation. Note that every plane triangulation has a canonical ordering which can be computed in $O(n)$ time [4]. A canonical ordering of $G$ can be viewed as an order in which $G$ is reconstructed from a single edge $(v_1, v_2)$ step by step. At step $k$ with $3 \le k \le n$, the vertex $v_k$ and the edges between $v_k$ and its lower ordered neighbors are added into the graph. For the sake of enhancing intuition, we call $H_{k-1}$ the contour of $G_{k-1}$; denote its vertices by $c_1 (= v_1), c_2, \ldots, c_t (= v_2)$ in the consecutive order along the cycle $H_{k-1}$; and visualize them as arranged from left to right above the edge $(v_1, v_2)$ in the plane. When the vertex $v_k$ is added to $G_{k-1}$ to construct $G_k$, let $c_p, c_{p+1}, \ldots, c_q$ be the neighbors of $v_k$ on the contour $H_{k-1}$. After $v_k$ is added, the vertices $c_{p+1}, \ldots, c_{q-1}$ are no longer contour vertices. Thus, we say that these vertices are covered by $v_k$. The edge $(c_p, v_k)$ is the left edge of $v_k$; the edge $(c_q, v_k)$ is the right edge of $v_k$; the edges $(c_\ell, v_k)$ with $p < \ell < q$ are the internal edges of $v_k$.
There is no published reference for the following folklore lemma; for the sake of completeness, we include its proof here.
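Lemma 2. Let $v_1, v_2, \ldots, v_n$ be a canonical ordering of $G$. Let $T_1$ (respectively, $T_2$) be the collection of the left (respectively, right) edges of $v_k$ for $3 \le k \le n$; similarly, let $T_3$ be that of the internal edges of $v_k$ for $3 \le k \le n$.
1. $T_1$ is a tree spanning over $G - \{v_2\}$.
2. $T_2$ is a tree spanning over $G - \{v_1\}$.
3. $T_3$ is a tree spanning over $G - \{v_1, v_2\}$.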
The statements are proved separately as follows.
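Statement 1. For $3 \le k \le n$, let $T_{1,k}$ be the collection of the left edges of $v_j$ for $3 \le j \le k$. We prove by induction on $k$ the claim that $T_{1,k}$ is a tree spanning over $G_k - \{v_2\}$. Then, since $T_1 = T_{1,n}$, the claim implies the statement. For the base case $k = 3$, the claim trivially holds. The induction hypothesis is that the claim holds for $k - 1$. The induction step is to prove the claim for $k$. $T_{1,k}$ is obtained from $T_{1,k-1}$ by adding the left edge $(c_p, v_k)$ of $v_k$. By the induction hypothesis, $T_{1,k-1}$ is a tree spanning over $G_{k-1} - \{v_2\}$. Since $c_p$ is the leftmost neighbor of $v_k$ on $H_{k-1}$, $c_p$ is some $v_j$ with $j < k$ and $j \ne 2$. Thus, $T_{1,k-1}$ contains $c_p$, and $T_{1,k}$ is a tree spanning over $G_k - \{v_2\}$.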
Statement 3. has vertices and edges. The edges are not in . Thus, since and have edges each, has edges. Then, since is acyclic and does not contain and , is a spanning tree of .
A canonical ordering is rightmost if for all $k$ and $j$ with $3 \le k < j \le n$ such that the neighbors of $v_j$ on $H_{j-1}$ are all in $H_{k-1}$, the leftmost neighbor of $v_j$ appears before that of $v_k$ when traversing $H_{k-1}$ from $v_1$ to $v_2$ in the clockwise direction. Intuitively speaking, if more than one vertex can be added to $G_{k-1}$, we always add the rightmost one. The ordering in Figure 1 is rightmost. A rightmost canonical ordering is symmetric to a leftmost one in [11] and can be computed from $G$ in linear time similarly.
Let be a rightmost canonical ordering of . Let be as in Lemma 2 for this ordering. Let be the tree . In Figure 1, is indicated by the thick lines. Our coding scheme uses extensively. The rightmost depth-first search of proceeds as follows. We start at and traverse the edge first. Afterwards, if two or more vertices can be visited from , we choose the rightmost one. More precisely, let be the path in from to and then to . Let be the set of edges between and the available vertices. We visit a new vertex through the edge in that is next to in the counterclockwise cyclic order around formed by and the edges in . Note that the order in which the vertices are visited by the rightmost depth-first search is the rightmost canonical ordering that defines .
We are now ready to describe the encoding $S$ of $G$ as the concatenation of two binary strings $S_1$ and $S_2$ as follows.
$S_1$ is the binary string that encodes the tree $T$ using the folklore parenthesis coding scheme where 0 and 1 correspond to the left and right parentheses, respectively. In this encoding, $T$ is rooted at $v_1$, and the branches are ordered the same as their endpoints are in the rightmost canonical ordering. Since $T$ contains $n$ vertices, $S_1$ has $2(n-1)$ bits.
$S_2$ encodes the number of contour vertices covered by each $v_k$ with $3 \le k \le n$. First, we create a string of $n - 2$ copies of 0. The $(k-2)$-th 0 corresponds to $v_k$. If $v_k$ covers $d$ vertices, we insert $d$ copies of 1 before the corresponding 0. For example, the string $S_2$ for Figure 1 is:
Since each vertex $v_k$ with $3 \le k \le n - 1$ is covered exactly once, $S_2$ has $n - 3$ copies of 1. So $|S_2| = 2n - 5$ bits. Hence, $|S| = |S_1| + |S_2| = 4n - 7$ bits.
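The construction of the second string can be sketched as below. This is only an illustration under stated assumptions: the names `encode_cover_counts` and `decode_cover_counts` are hypothetical, the input is taken to be the list of cover counts for $v_3, \ldots, v_n$ in order, and the marker bit is taken to be 0 with the inserted bits being 1.

```python
def encode_cover_counts(covers):
    """covers[i] is the number of contour vertices covered by v_{i+3}.

    Each vertex contributes its count in unary (copies of 1) followed by
    a single marker bit 0.
    """
    return "".join("1" * c + "0" for c in covers)


def decode_cover_counts(s2):
    """Inverse of encode_cover_counts: read unary runs terminated by 0."""
    covers, run = [], 0
    for bit in s2:
        if bit == "1":
            run += 1
        else:
            covers.append(run)
            run = 0
    return covers


# A 5-vertex example: v_3 covers nothing, v_4 covers v_3, v_5 covers v_4.
s2 = encode_cover_counts([0, 1, 1])
print(s2, len(s2))  # length 2n - 5 for n = 5
```

Because every marker terminates exactly one unary run, the counts can be read back without knowing them in advance.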
We next describe how to decode $S$ to reconstruct $G$. Given $S$, we can uniquely determine $n$ from the length of $S$. Subsequently, we can uniquely determine $S_1$ and $S_2$. From $S_1$, we can reconstruct $T$. From $T$, we can recover the ordering $v_1, v_2, \ldots, v_n$. Then, we draw the edge $(v_1, v_2)$ and perform a loop of steps indexed by $k$ with $3 \le k \le n$ where step $k$ processes $v_k$. Before $v_k$ is processed, $G_{k-1}$ and its contour $H_{k-1}$ have been constructed. At step $k$, we add $v_k$ and the edges between $v_k$ and its lower ordered neighbors into $G_{k-1}$ to construct $G_k$ as follows. From $T$, we can identify the leftmost neighbor $c_p$ of $v_k$ on the contour $H_{k-1}$, because $c_p$ is simply the parent of $v_k$ in $T$. From $S_2$, we can determine the number $d$ of vertices covered by $v_k$. Thus, we add the edges $(c_p, v_k), (c_{p+1}, v_k), \ldots, (c_q, v_k)$ into $G_{k-1}$; note that $q = p + d + 1$. This gives us the subgraph $G_k$ and completes step $k$.
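The decoding loop described above can be sketched as follows. The sketch assumes the two pieces of information have already been extracted from the strings: for each added vertex, its parent in the tree (its leftmost contour neighbor) and the number of contour vertices it covers. The function name `reconstruct`, the integer vertex labels, and the list-based contour are assumptions for illustration. Locating the parent with `list.index` costs linear time per step, so this version is not linear overall; a linear-time implementation would keep direct pointers into the contour.

```python
def reconstruct(parents, covers):
    """Rebuild the edge list of a plane triangulation.

    parents[i]: leftmost contour neighbor of vertex i + 3
    covers[i]:  number of contour vertices covered by vertex i + 3
    Vertices are labeled 1, 2, ..., n in canonical order.
    """
    edges = [(1, 2)]
    contour = [1, 2]                      # left to right above the edge (1, 2)
    for i, (par, d) in enumerate(zip(parents, covers)):
        v = i + 3
        p = contour.index(par)            # position of the leftmost neighbor
        q = p + d + 1                     # position of the rightmost neighbor
        if q >= len(contour):
            raise ValueError("cover count runs past the end of the contour")
        for u in contour[p:q + 1]:        # left edge, internal edges, right edge
            edges.append((u, v))
        contour[p + 1:q] = [v]            # covered vertices leave the contour
    return edges, contour


# K4: vertex 3 covers nothing, vertex 4 covers vertex 3.
edges, contour = reconstruct([1, 1], [0, 1])
print(edges)
print(contour)
```

In this small case the six edges of the 4-vertex triangulation are produced and the final contour consists of the three exterior vertices.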
It is straightforward to carry out these encoding and decoding procedures in linear time. Also, we can save 1 bit by deleting the last 0 in $S_2$. Since $v_3$ covers no vertex, for $n \ge 4$, we can save another bit by deleting the first 0 in $S_2$. Note that for $n = 3$, the last 0 in $S_2$ is also the first and cannot be deleted twice, but we can simply encode the 3-vertex plane triangulation with zero bits without ambiguity. Thus, we have the following theorem.
A plane triangulation of $m$ edges and $n$ vertices with $n \ge 3$ can be encoded using $\frac{4}{3}m - 1$ bits. Both encoding and decoding take $O(n)$ time.
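The bit count in the theorem can be checked by a short computation. The following is a sketch under the reading that the tree code $S_1$ has $2(n-1)$ bits and that $S_2$ has one marker bit for each of $v_3, \ldots, v_n$ plus one bit for each covered vertex $v_3, \ldots, v_{n-1}$; these counts are reconstructed from the description of the scheme, and $m = 3n - 6$ is the edge count of a plane triangulation.

```latex
\begin{align*}
|S_1| &= 2(n-1), \\
|S_2| &= (n-2) + (n-3) = 2n - 5, \\
|S_1| + |S_2| - 2 &= 4n - 9 \qquad \text{(after deleting two bits of $S_2$, for $n \ge 4$)}, \\
\tfrac{4}{3}m - 1 &= \tfrac{4}{3}(3n - 6) - 1 = 4n - 9 .
\end{align*}
```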
3 A Coding Scheme for Triconnected Plane Graphs
This section assumes that is triconnected. To avoid triviality, let .
Let be an ordering of the vertices of where are on the exterior face of , and and are neighbors of . Let be the subgraph of induced by . Let be the exterior face of . Let be the subgraph of obtained by removing . Our coding scheme for triconnected plane graphs uses an ordering defined as follows.
[see ]An ordering of a triconnected plane graph is canonical if the integer interval can be partitioned into subintervals each satisfying either set of properties below:
The integer is . The vertex is on the exterior face of and has at least two neighbors in . is biconnected and its exterior face contains the edge . If , has at least one neighbor in .
The integer is at least . The sequence is a chain on the exterior face of and has exactly two neighbors in , one for and the other for , which are on the exterior face of . is biconnected and its exterior face contains the edge . Every vertex among has at least one neighbor in .
As in §2, we similarly define a rightmost canonical ordering of . Figure 2 shows a rightmost canonical ordering of a triconnected plane graph. Given a triconnected plane graph, we can find a rightmost canonical ordering in linear time . With a rightmost canonical ordering, can be reconstructed from a single edge through a sequence of steps indexed by . There are two possible cases at step , which correspond to the two sets of properties in Definition 3 and are used throughout this section.
Case 1: A single vertex is added.
Case 2: A chain of vertices is added.
While reconstructing , we collect a set of edges as follows. Initially, consists of the edge . Let be the vertices of , which are ordered consecutively along the boundary cycle of and are arranged from left to right above the edge in the plane.
Case 1. Let and with be the leftmost and rightmost neighbors of in , respectively. After is added, are no longer contour vertices; these vertices are covered at step . The edge is included in .
Case 2. Let and with be the neighbors of and in , respectively. After are added, are no longer contour vertices; these vertices are covered at step . The edges are included in .
In Figure 2, the edges in are indicated by the thick lines. By an argument similar to the proof of Lemma 2(1), is a spanning tree of . As in §2, we similarly define the rightmost depth-first search in . Note that the order in which the vertices of are visited by the rightmost depth-first search is the rightmost canonical ordering that defines .
We are now ready to describe the encoding of by means of . We further divide Case 1 into three subcases.
Case 1a: No vertex is covered at step .
Case 1b: At least one vertex is covered at step and the leftmost covered vertex is adjacent to .
Case 1c: At least one vertex is covered at step and the leftmost covered vertex is not adjacent to .
Let be the number of steps for reconstructing . Let and be the numbers of steps of Cases 1a, 1b, 1c, and 2, respectively. We first consider the case to encode with Scheme I; afterwards, we modify Scheme I into Scheme II for the case .
In Scheme I, the encoding of is the concatenation of three strings , and . is the folklore parentheses encoding of , which is rooted and ordered in the same way as in §2. Since has vertices, has bits.
To construct , first let where each is a binary string that corresponds to the step of reconstructing based on the ordering . is determined as follows. The following two cases both assume that vertices are covered at step .
Case 1. Note that . The string has symbols corresponding to with , respectively. If the edge is present in , the symbol in corresponding to is 1; otherwise, the symbol is 0. Note that in Case 1a, since no vertex is covered, is empty.
Case 2. The string consists of copies of followed by copies of .
For example, the string for Figure 2 is:
is a binary representation of defined as follows. A step of Case 1 adds one vertex to and correspondingly includes one in ; similarly, a step of Case 2 adds vertices to and includes one and copies of in . Since exactly vertices are added, the total number of these symbols is . Each symbol in not yet counted corresponds to a vertex covered at the steps. Since each with is covered at most once and are never covered, the total number of these latter symbols is at most . Thus has at most
symbols. For the sake of unambiguous decoding, we pad it with copies of 1 at its end to have exactly $2n - 5$ symbols. Since it uses 3 distinct symbols, we treat it as an integer of base 3 and convert it to a binary integer. Again, for the sake of unambiguous decoding, we use exactly $\lceil (2n-5)\log 3 \rceil$ bits for this binary integer by padding copies of 0 at its beginning. The resulting binary string is the desired .
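The base-3 to binary conversion with fixed widths can be sketched as follows. This is a minimal illustration: the function names `pack_ternary` and `unpack_ternary` are hypothetical, the three symbols are assumed to be written as the digits 0, 1 and 2, and the target length is passed in explicitly rather than derived from the graph. For a target length $L \ge 1$, the width used is the number of bits of $3^L - 1$, which equals $\lceil L \log 3 \rceil$.

```python
def pack_ternary(symbols, length):
    """Pad a ternary string with 1s to `length` symbols and store it in binary.

    Assumes 1 <= length and len(symbols) <= length. The result always has the
    same width for a given length, so the decoder can find its boundaries.
    """
    padded = symbols + "1" * (length - len(symbols))
    width = (3 ** length - 1).bit_length()      # equals ceil(length * log2(3))
    return format(int(padded, 3), "b").zfill(width)


def unpack_ternary(bits, length):
    """Recover the padded ternary string of `length` symbols."""
    value = int(bits, 2)
    digits = []
    for _ in range(length):
        value, d = divmod(value, 3)
        digits.append(str(d))
    return "".join(reversed(digits))


bits = pack_ternary("2010", 6)
print(bits, len(bits))
print(unpack_ternary(bits, 6))
```

The decoder recovers the padded string; identifying which trailing symbols are padding relies on the other parts of the code, as the text notes.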
For the sake of decoding, we also need to know whether any given is of Case 1 or 2. Thus, let where if step is of Case 1 and otherwise. To save space, note that some bits can be deleted as follows without incurring ambiguity. If step is of Case 1a, is deleted because is empty and only a string of Case 1a can be empty. If step is of Case 1b, is deleted because starts with 1, while the strings of Case 2 start with 0. If step is of Case 1c or 2, remains in . For example, the string for Figure 2 consists of , , , . Thus, has bits, which can be bounded as follows. A step of Case 1 adds one vertex into and a step of Case 2 adds at least two vertices. Since vertices are added over the steps, . Since Scheme I assumes , .
Since , bits. This completes the description of the encoding procedure of Scheme I.
Next we describe how to decode to reconstruct . This decoding assumes that both and are given. Thus, we can uniquely determine , and . Then we convert to . From we can recover all with . From and all , we can recover all with . From , we reconstruct . From , we find the ordering . Afterwards, we draw the edge and perform a loop of steps as follows. Each step is indexed by and corresponds to step of reconstructing using the rightmost canonical ordering.
If , step is of Case 1. Thus, a vertex is added at this step where is the smallest ordered vertex not added into the current graph yet. From , we can determine the leftmost neighbor of in the contour because is the parent of in . From , we know the number of vertices covered by and hence the rightmost neighbor of in the contour . From , we also know which of the covered vertices are connected to . These corresponding edges are added to .
If , step is of Case 2. Thus, a chain is added at this step where is the smallest ordered vertex not added into the current graph yet. The integer can be determined from the string by counting its leading copies of . From , we also know the number of vertices covered at step , which is the count of in . Thus, we know the neighbor of in the contour . The chain is added accordingly.
This completes the decoding procedure of Scheme I. It is straightforward to implement the whole Scheme I in time. If , we use Scheme II to encode , which is identical to Scheme I with the following differences. If step is of Case 2, consists of copies of 1 followed by copies of 0. Also, all bits for steps of Cases 1a and 1c are omitted from without incurring ambiguity since their corresponding strings either are empty or start with 0 while the strings of Cases 1b and 2 start with 1. We use one extra bit to encode whether we use Scheme I or II. Thus we have the following lemma.
Any triconnected plane graph with vertices can be encoded using at most bits. Both encoding and decoding take time. The decoding procedure assumes that both and are given.
We can improve Lemma 3 as follows. Let be the dual of . has vertices, edges and faces. Since is triconnected, is also triconnected. Furthermore, if , then and has no self-loop or multiple edge. Thus, we can use the coding scheme of Lemma 3 to encode with at most bits. Since can be uniquely determined from , to encode , it suffices to encode . To make shorter, for the case , if , we encode using at most bits; otherwise, we encode using at most bits. This new encoding has at most bits. Since , the bit count is at most by Euler’s formula . For the sake of decoding, we use one extra bit to denote whether we encode or its dual. Note that if , we can simply encode using zero bit without ambiguity. Thus we have proved the following theorem.
Any triconnected plane graph with $n$ vertices, $m$ edges and $f$ faces can be encoded using at most $(2.5 + 2\log 3)\min\{n, f\} - 7$ bits. Both encoding and decoding take $O(n)$ time. The decoding procedure assumes that $S$ is given together with $n$ or $f$ as appropriate.
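The step from a bound in terms of $\min\{n, f\}$ to a bound in terms of $m$ uses only Euler's formula. The following worked computation is a sketch that takes the theorem's bound to be $(2.5 + 2\log 3)\min\{n, f\} - 7$ bits, with logarithms of base 2 so that $\log 3 \approx 1.585$.

```latex
\begin{align*}
n + f = m + 2 \quad &\Longrightarrow \quad \min\{n, f\} \le \frac{m + 2}{2}, \\
(2.5 + 2\log 3)\min\{n, f\} - 7
  &\le \frac{2.5 + 2\log 3}{2}\, m + (2.5 + 2\log 3) - 7 \\
  &\approx 2.835\, m - 1.33 \;<\; 2.835\, m .
\end{align*}
```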
Remark. There are several ways to improve this coding scheme so that the decoding does not require as input. One is to use well-known data compression techniques to encode and append it to the beginning of using bits [1, 5]. Another is to pad with copies of at its end so that it has exactly bits. Then, since , given alone, we can uniquely determine or and proceed with the original decoding procedure. With the strings , we can unambiguously identify the padded bits.
4 Open Problems
This paper leaves several problems open. Since plane triangulations are useful in many application areas, it would be particularly helpful to encode them in $O(n)$ time using close to $1.08m$ bits. Similarly, it would be significant to obtain a linear-time coding scheme for triconnected plane graphs using close to $2m$ bits. Note that Tutte [23] proved an information-theoretic tight bound of $2m$ bits for triconnected plane graphs that may contain multiple edges and self-loops. More generally, it would be of interest to encode graphs in a given family in polynomial time using their information-theoretic minimum number of bits. Solving these problems will most likely lead to the discovery of new structural properties of graphs.
Acknowledgments. The authors are grateful to anonymous referees for helpful comments.
[1] T. C. Bell, J. G. Cleary, and I. H. Witten, Text Compression, Prentice-Hall, Englewood Cliffs, NJ, 1990.
[2] C. Berge, Graphs, North-Holland, New York, NY, second revised ed., 1985.
[3] T. H. Cormen, C. L. Leiserson, and R. L. Rivest, Introduction to Algorithms, MIT Press, Cambridge, MA, 1990.
[4] H. de Fraysseix, J. Pach, and R. Pollack, How to draw a planar graph on a grid, Combinatorica, 10 (1990), pp. 41–51.
[5] P. Elias, Universal codeword sets and representations of the integers, IEEE Transactions on Information Theory, IT-21 (1975), pp. 194–203.
[6] H. Galperin and A. Wigderson, Succinct representations of graphs, Information and Control, 56 (1983), pp. 183–198.
[7] F. Harary, Graph Theory, Addison-Wesley, Reading, MA, 1972.
[8] A. Itai and M. Rodeh, Representation of graphs, Acta Informatica, 17 (1982), pp. 215–219.
[9] G. Jacobson, Space-efficient static trees and graphs, in Proceedings of the IEEE Thirtieth Annual Symposium on Foundations of Computer Science, 1989, pp. 549–554.
[10] S. Kannan, M. Naor, and S. Rudich, Implicit representation of graphs, SIAM Journal on Discrete Mathematics, 5 (1992), pp. 596–603.
[11] G. Kant, Drawing planar graphs using the lmc-ordering, in Proceedings of the 33rd Annual IEEE Symposium on Foundations of Computer Science, 1992, pp. 101–110.
[12] G. Kant and X. He, Regular edge labeling of 4-connected plane graphs and its applications in graph drawing problems, Theoretical Computer Science, 172 (1997), pp. 175–193.
[13] M. Y. Kao, M. Fürer, X. He, and B. Raghavachari, Optimal parallel algorithms for straight-line grid embeddings of planar graphs, SIAM Journal on Discrete Mathematics, 7 (1994), pp. 632–646.
[14] M. Y. Kao, N. Occhiogrosso, and S. H. Teng, Simple and efficient compression schemes for dense and complement graphs, Journal of Combinatorial Optimization, (1999). To appear.
[15] K. Keeler and J. Westbrook, Short encodings of planar graphs and maps, Discrete Applied Mathematics, 58 (1995), pp. 239–252.
[16] L. Lovász, An Algorithmic Theory of Numbers, Graphs and Convexity, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1986.
[17] J. I. Munro and V. Raman, Succinct representation of balanced parentheses, static trees and planar graphs, in Proceedings of the 38th Annual IEEE Symposium on the Foundations of Computer Science, 1997, pp. 118–126.
[18] M. Naor, Succinct representations of general unlabeled graphs, Discrete Applied Mathematics, 28 (1990), pp. 303–307.
[19] C. H. Papadimitriou and M. Yannakakis, A note on succinct representations of graphs, Information and Control, 71 (1986), pp. 181–185.
[20] W. Schnyder, Embedding planar graphs on the grid, in Proceedings of the 1st Annual ACM-SIAM Symposium on Discrete Algorithms, 1990, pp. 138–148.
[21] G. Turán, On the succinct representation of graphs, Discrete Applied Mathematics, 8 (1984), pp. 289–294.
[22] W. T. Tutte, A census of planar triangulations, Canadian Journal of Mathematics, 14 (1962), pp. 21–38.
[23] W. T. Tutte, A census of planar maps, Canadian Journal of Mathematics, 15 (1963), pp. 249–271.