Data Compression with Finite Windows

著者: Edward R. Fiala, Daniel H. Greene
タイトル: Data Compression with Finite Windows
ページ: 490-505
日時: April 1989
コメント: LZFGとして知られる手法である。データ構造として PATRICIA Trie\cite{Morrison:PATRICIA}を使っている。
感想: The research of Ziv and Lempel led to a family of compression methods based on the idea of replacing a group of characters by a reference to an earlier occurrence of that same group. The authors explore one particular form of this compression method which restricts the references so that they can only point into a finite sliding window of text. Their aim is clearly to develop a practical compression algorithm. In particular, the authors attempt to achieve good compression performance, but at the same time they want fast compression and decompression algorithms with reasonable storage requirements. As far as compression performance is concerned, the algorithms are definitely successful; the figures reported in the paper indicate that they achieve compression similar to or better than competing methods. The compression performance of these algorithms is also good on relatively small files, whereas many other methods achieve their best performance only on very large files. Unfortunately, the authors do not provide all the data needed to compare execution time requirements. I was particularly disappointed to find that they make no speed comparison with the ZLW (Ziv-Lempel-Welch) method, which is provided on most UNIX systems as the compress command and is in widespread use. \\ The quality of the paper is uneven. The first half gives a readable introduction to the use of suffix trees in implementing the window-based compression algorithm and provides theorems to show that they have linear-time update costs. But pragmatic considerations then take over, and all the details of the implemented algorithms will likely overwhelm the reader. Although such details are important when extracting the last drop of compression performance, they definitely detract from the paper's readability. \\ Finally, I cannot conclude this review without complaining about the three graphs reproduced in the paper. Each graph includes a curve labeled H1 that purports to show the entropy of the file as determined by first-order Markov statistics. The curves are not corrected to compensate for small sample sizes, however, and are therefore meaningless. It would have made more sense for the authors to use the Cleary-Witten method to generate the data for a first-order Markov model. Another problem with these graphs is that the numbers marked on the vertical axes do not agree at all well with the numbers printed alongside the curves. \\ -R. N. Horspool, Victoria, B.C., Canada
カテゴリ: Compress

Category: Compress Journal: cacm Comment: LZFGとして知られる手法である。データ構造として PATRICIA Trie\cite{Morrison:PATRICIA}を使っている。 Number: 4 Bibtype: Article Author: Edward R. Fiala Daniel H. Greene Pages: 490-505 Month: apr Review: The research of Ziv and Lempel led to a family of compression methods based on the idea of replacing a group of characters by a reference to an earlier occurrence of that same group. The authors explore one particular form of this compression method which restricts the references so that they can only point into a finite sliding window of text. Their aim is clearly to develop a practical compression algorithm. In particular, the authors attempt to achieve good compression performance, but at the same time they want fast compression and decompression algorithms with reasonable storage requirements. As far as compression performance is concerned, the algorithms are definitely successful; the figures reported in the paper indicate that they achieve compression similar to or better than competing methods. The compression performance of these algorithms is also good on relatively small files, whereas many other methods achieve their best performance only on very large files. Unfortunately, the authors do not provide all the data needed to compare execution time requirements. I was particularly disappointed to find that they make no speed comparison with the ZLW (Ziv-Lempel-Welch) method, which is provided on most UNIX systems as the compress command and is in widespread use. \\ The quality of the paper is uneven. The first half gives a readable introduction to the use of suffix trees in implementing the window-based compression algorithm and provides theorems to show that they have linear-time update costs. But pragmatic considerations then take over, and all the details of the implemented algorithms will likely overwhelm the reader. Although such details are important when extracting the last drop of compression performance, they definitely detract from the paper's readability. \\ Finally, I cannot conclude this review without complaining about the three graphs reproduced in the paper. Each graph includes a curve labeled H1 that purports to show the entropy of the file as determined by first-order Markov statistics. The curves are not corrected to compensate for small sample sizes, however, and are therefore meaningless. It would have made more sense for the authors to use the Cleary-Witten method to generate the data for a first-order Markov model. Another problem with these graphs is that the numbers marked on the vertical axes do not agree at all well with the numbers printed alongside the curves. \\ -R. N. Horspool, Victoria, B.C., Canada Title: Data Compression with Finite Windows Year: 1989 Volume: 32