Context Tree Weighting (CTW)

www.data-compression.info
The Data Compression Resource on the Internet

Context Tree Weighting (CTW)

The CTW-algorithm is a binary technique, invented by Frans Willems, Yuri Shtarkov and Tjalling Tjalkens, which was later extended for byte processing.
The main idea is to weight the probabilities of several branches inside a tree in order to get a good probability estimation for the next symbol.
The algorithm is optimal in the sense that it achieves the Rissanen (1984) lower bound.

ATTENTION: In contrast to PPM or BWCA, the CTW-method is patented!!! The patents are assigned to Koninklijke KPN NV. More information about the Intellectual Property Rights can be found at http://www.ele.tue.nl/ctw/ipr.html. Therefore, if you plan to develop compression algorithm based on the CTW-method for commercial utilization, get in contact with KPN NV, or maybe use a patentfree algorithm.

Publications

Logo

Title

Description

Text Compression by Context Tree Weighting

This paper from the DCC 1997 by Jan Aberg and Yuri Shtarkov describes different modifications of the context tree weighting algorithm. The combination of CTW with PPM is studied and a gain of 0.091 bps on the Calgary Corpus compared with PPMD achieved.

On Prediction Using Variable Order Markov Models

The paper by Ron Begleiter, Ran El-Yaniv and Golan Yona from 2004 compares several lossless compression algorithms, including LZ, PPM, CTW and PST on the example of biological, textual and musical data.

Superior Guarantees for Sequential Prediction

This paper by Ron Begleiter and Ran El-Yaniv from 2006 presents theoretical and experimental results for hierarchical applications of a state-of-the-art CTW variant (decomposed CTW).

Implementing the Context Tree Weighting Method for Text Compression

Kunihiko Sadakane, Takumi Okazaki and Hiroshi Imai extend in their article from 2000 the CTW-method for processing multi-alphabet sequences. Furthermore, a method to optimize a parameter for binary alphabets is reveiled.
This CTW approach achieves a compression rate of 2.27 bps for the Calgary Corpus.

Implementing the Context-Tree Weighting Method: Arithmetic Coding

An arithmetic coding procedure adapted for the CTW-method is presenmted by Tjalling Tjalkens and Frans Willems in thier paper from 1997. Instead of an estimated probability Pe(s) and a weighted probability Pw(s) the logarithm of the ratio Pe(s)/Pw(os)Pw(1s) is stored in each node s of the context tree, which leads to a considerable storage reduction.

A Context-Tree Weighting Method for Text-Generating Sources

Tjalling Tjalkens, Paul Volf and Frans Willems extend the CTW-method for compressing ASCII- and Byte-files. The ideas that make this possible, i.e. binary decomposition, zero-redundancy estimator, and weighting only at symbol-boundaries, were presented in the DCC paper from 1997.

The Context-Tree Weighting Method: Extentensions

In his paper from 1994 Frans Willems modifies the basic binary CTW-method such that the past symbols are not needed by the encoder and the decoder. He describes how to make the context-tree depth D infinite, which results in optimal redundancy behavior for all tree sources, while the number of records in the context tree is not larger than 2T-1, where T is the length of the source sequence. For this extended context-tree weighting algorithm he shows that with probability one the compression ratio is not larger than the source entropy for source sequence length T approaching infinity for stationary and ergodic sources.

The Context Tree Weighting Method: Basic Properties

The basic and theoretical principles of the CTW-method are described in this paper from 1995 by Frans Willems, Yuri Shtarkov and Tjalling Tjalkens. Here CTW is used for binary sources.
Frans Willems, Yuri Shtarkov and Tjalling Tjalkens received for this paper the 1996 Paper Award of the IEEE Information Theory Society. The award was presented at the IEEE International Symposium on Infomation Theory in Ulm, Germany, 29.06.1997.

Context Weighting for General Finite-Context Sources

Frans Willems, Yuri Shtarkov and Tjalling Tjalkens extend in their paper from 1996 the CTW-method to sources with structures,which are more complex than tree structures. The generalized algorithms are again recursive, and they also achieve optimal redundancy behavior for sources in the model class for which the algorithm is designed.

Coding for a Binary Independent Piecewise-Identically Distributed Source

Frans Willems extends the CTW-method in his article from 1996 for efficient coding of sequences generated by piecewise stationary sources. The transition pattern can be considered as part of the mechanism that selects the parameters of the source.

Reflections on "The Context Tree Weighting Method: Basic Properties"

Frans Willems, Yuri Shtarkov and Tjalling Tjalkens have been asked by Michelle Effros, editor of the IEEE Information Theory Newsletter, to write a reflection contribution for their 1996 Paper Award of the IEEE Information Theory Society. The paper "Reflections on: Context-Tree Weighting: Basic Properties" contains a mini-course on universal source coding and an introduction to the CTW-method.

Complexity Reduction of the Context-Tree Weighting Method

Frans Willems and Tjalling Tjalkens present in their aerticle from 1997 a method to decrease the storage and communication complexity of the CTW-method. The method is based on combining the estimated probability of a node in the context tree and weighted probabilities of its children in one single variable. The variable is represented by its logarithm.

The Context-Tree Weighting Project

A CTW research site from 1997 by Frans Willems with a list of the main CTW papers. One of the best sources on information about CTW.

CTW website

The original CTW site from Frans Willems with many informations on CTW implementations. The site contains papers, manuals, source codes and results.

People

Logo

Name

Description

Jan Aberg

Jan Aberg made his doctoral thesis about Universal Source Coding Perspective on PPM at the Lund University, Sweden.

Ron Begleiter

Ron is a graduate student at the Technion - Israel Institute of Technology, Isreal. He is interested in machine learning and sequence prediction.

Ran El-Yaniv

Ran is a professor at the Technion - Israel Institute of Technology, Israel. His research interests include machine learning and data mining, online algorithms, competitive analysis and computational finance.

Hiroshi Imai

Hiroshi Imai works at the Department of Information Science, University of Tokyo, Japan, on the ERATO Quantum Computation & Information Project.

Takumi Okazaki

Takumi Okazaki has published together with Hiroshi Imai and Kunihiko Sadakane a paper about a CTW-method for multi-alphabets.

Kunihiko Sadakane

Kunihiko has developed a very fast suffix sorting algorithm with logarithmic number of passes. He wrote his dissertation about suffix sorting.

Yuri Shtarkov

Yuri Shtarkov is a researcher at the Institute for Problems of Information Transmission (IPPI), Russia. His current interests include source coding and data compression.

Tjalling Tjalkens

Tjalling Tjalkens works at the Electrical Engineering Department of the University of Eindhoven, Netherlands. His research interests are Source coding and Channel coding.

Paul Volf

Paul Volf was a doctoral student of Frans Willems at the University of Eindhoven, Netherlands.

Frans Willems

Frans Willems works at the Electrical Engineering Department of the University of Eindhoven, Netherlands. His research interests are Source coding, Channel coding and Coding for Networks.

Golan Yona

The research interests of Golan, assistant professor at the Technion - Israel Institute of Technology, Israel, and adjunct assistant professor at Cornell University, United States of America, are computational molecular biology and machine learning.

Source Code

Logo

Title

Description

VMM Algorithms Source Code

The source code in JAVA and MatLab of Ron Begleiter for CTW papers.

CTW (Context Tree Weighting) version 0.1

Source code in C from the original CTW site.

Copyright © 2002-2022 Dr.-Ing. Jürgen Abel, Lechstraße 1, 41469 Neuß, Germany. All rights reserved.