The smallest grammar problem revisited
In a seminal paper of Charikar et al. on the smallest grammar problem, the authors derive upper and lower bounds on the approximation ratios for several grammar-based compressors, but in all cases there is a gap between the lower and upper bound. Here the gaps for LZ78 and BISECTION are closed by showing that the approximation ratio of LZ78 is Θ((n/log n)^(2/3)), whereas the approximation ratio of BISECTION is Θ(√(n/log n)). In addition, the lower bound for RePair is improved from Ω(√(log n)) to Ω(log n/log log n). Finally, results of Arpe and Reischuk relating grammar-based compression for arbitrary alphabets and binary alphabets are improved.
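For orientation, the LZ78 compressor mentioned above builds on the standard LZ78 factorization, in which each phrase is the shortest prefix of the remaining input that has not yet appeared as a phrase. The following Python sketch is illustrative only and is not taken from the paper; the function name lz78_factorize is hypothetical. Under the usual construction, each phrase extends an earlier phrase by one symbol, so the resulting grammar has size proportional to the number of phrases.

```python
def lz78_factorize(text):
    """Split text into LZ78 phrases (illustrative sketch, not the paper's code).

    Each phrase is the shortest prefix of the remaining input that has not
    occurred as an earlier phrase.
    """
    seen = set()      # phrases produced so far
    phrases = []      # factorization in order
    current = ""      # phrase currently being extended
    for ch in text:
        current += ch
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ""
    if current:
        # The final phrase may repeat an earlier one when the input runs out.
        phrases.append(current)
    return phrases


if __name__ == "__main__":
    print(lz78_factorize("abababab"))
```

Concatenating the returned phrases always reproduces the input, which gives a quick sanity check for the sketch.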