


Context Tree Weighting (CTW)

The CTWalgorithm is a binary technique, invented by Frans Willems, Yuri Shtarkov and Tjalling Tjalkens, which was later extended for byte processing. The main idea is to weight the probabilities of several branches inside a tree in order to get a good probability estimation for the next symbol. The algorithm is optimal in the sense that it achieves the Rissanen (1984) lower bound.
ATTENTION: In contrast to PPM or BWCA, the CTWmethod is patented!!! The patents are assigned to Koninklijke KPN NV. More information about the Intellectual Property Rights can be found at http://www.ele.tue.nl/ctw/ipr.html. Therefore, if you plan to develop compression algorithm based on the CTWmethod for commercial utilization, get in contact with KPN NV, or maybe use a patentfree algorithm.
Logo

Title

Description


Text Compression by Context Tree Weighting

This paper from the DCC 1997 by Jan Aberg and Yuri Shtarkov describes different modifications of the context tree weighting algorithm. The combination of CTW with PPM is studied and a gain of 0.091 bps on the Calgary Corpus compared with PPMD achieved.


On Prediction Using Variable Order Markov Models

The paper by Ron Begleiter, Ran ElYaniv and Golan Yona from 2004 compares several lossless compression algorithms, including LZ, PPM, CTW and PST on the example of biological, textual and musical data.


Superior Guarantees for Sequential Prediction

This paper by Ron Begleiter and Ran ElYaniv from 2006 presents theoretical and experimental results for hierarchical applications of a stateoftheart CTW variant (decomposed CTW).


Implementing the Context Tree Weighting Method for Text Compression

Kunihiko Sadakane, Takumi Okazaki and Hiroshi Imai extend in their article from 2000 the CTWmethod for processing multialphabet sequences. Furthermore, a method to optimize a parameter for binary alphabets is reveiled. This CTW approach achieves a compression rate of 2.27 bps for the Calgary Corpus.


Implementing the ContextTree Weighting Method: Arithmetic Coding

An arithmetic coding procedure adapted for the CTWmethod is presenmted by Tjalling Tjalkens and Frans Willems in thier paper from 1997. Instead of an estimated probability Pe(s) and a weighted probability Pw(s) the logarithm of the ratio Pe(s)/Pw(os)Pw(1s) is stored in each node s of the context tree, which leads to a considerable storage reduction.


A ContextTree Weighting Method for TextGenerating Sources

Tjalling Tjalkens, Paul Volf and Frans Willems extend the CTWmethod for compressing ASCII and Bytefiles. The ideas that make this possible, i.e. binary decomposition, zeroredundancy estimator, and weighting only at symbolboundaries, were presented in the DCC paper from 1997.


The ContextTree Weighting Method: Extentensions

In his paper from 1994 Frans Willems modifies the basic binary CTWmethod such that the past symbols are not needed by the encoder and the decoder. He describes how to make the contexttree depth D infinite, which results in optimal redundancy behavior for all tree sources, while the number of records in the context tree is not larger than 2T1, where T is the length of the source sequence. For this extended contexttree weighting algorithm he shows that with probability one the compression ratio is not larger than the source entropy for source sequence length T approaching infinity for stationary and ergodic sources.


The Context Tree Weighting Method: Basic Properties

The basic and theoretical principles of the CTWmethod are described in this paper from 1995 by Frans Willems, Yuri Shtarkov and Tjalling Tjalkens. Here CTW is used for binary sources. Frans Willems, Yuri Shtarkov and Tjalling Tjalkens received for this paper the 1996 Paper Award of the IEEE Information Theory Society. The award was presented at the IEEE International Symposium on Infomation Theory in Ulm, Germany, 29.06.1997.


Context Weighting for General FiniteContext Sources

Frans Willems, Yuri Shtarkov and Tjalling Tjalkens extend in their paper from 1996 the CTWmethod to sources with structures,which are more complex than tree structures. The generalized algorithms are again recursive, and they also achieve optimal redundancy behavior for sources in the model class for which the algorithm is designed.


Coding for a Binary Independent PiecewiseIdentically Distributed Source

Frans Willems extends the CTWmethod in his article from 1996 for efficient coding of sequences generated by piecewise stationary sources. The transition pattern can be considered as part of the mechanism that selects the parameters of the source.


Reflections on "The Context Tree Weighting Method: Basic Properties"

Frans Willems, Yuri Shtarkov and Tjalling Tjalkens have been asked by Michelle Effros, editor of the IEEE Information Theory Newsletter, to write a reflection contribution for their 1996 Paper Award of the IEEE Information Theory Society. The paper "Reflections on: ContextTree Weighting: Basic Properties" contains a minicourse on universal source coding and an introduction to the CTWmethod.


Complexity Reduction of the ContextTree Weighting Method

Frans Willems and Tjalling Tjalkens present in their aerticle from 1997 a method to decrease the storage and communication complexity of the CTWmethod. The method is based on combining the estimated probability of a node in the context tree and weighted probabilities of its children in one single variable. The variable is represented by its logarithm.


The ContextTree Weighting Project

A CTW research site from 1997 by Frans Willems with a list of the main CTW papers. One of the best sources on information about CTW.


CTW website

The original CTW site from Frans Willems with many informations on CTW implementations. The site contains papers, manuals, source codes and results.

Logo

Name

Description


Jan Aberg

Jan Aberg made his doctoral thesis about Universal Source Coding Perspective on PPM at the Lund University, Sweden.


Ron Begleiter

Ron is a graduate student at the Technion  Israel Institute of Technology, Isreal. He is interested in machine learning and sequence prediction.


Ran ElYaniv

Ran is a professor at the Technion  Israel Institute of Technology, Israel. His research interests include machine learning and data mining, online algorithms, competitive analysis and computational finance.


Hiroshi Imai

Hiroshi Imai works at the Department of Information Science, University of Tokyo, Japan, on the ERATO Quantum Computation & Information Project.


Takumi Okazaki

Takumi Okazaki has published together with Hiroshi Imai and Kunihiko Sadakane a paper about a CTWmethod for multialphabets.


Kunihiko Sadakane

Kunihiko has developed a very fast suffix sorting algorithm with logarithmic number of passes. He wrote his dissertation about suffix sorting.


Yuri Shtarkov

Yuri Shtarkov is a researcher at the Institute for Problems of Information Transmission (IPPI), Russia. His current interests include source coding and data compression.


Tjalling Tjalkens

Tjalling Tjalkens works at the Electrical Engineering Department of the University of Eindhoven, Netherlands. His research interests are Source coding and Channel coding.


Paul Volf

Paul Volf was a doctoral student of Frans Willems at the University of Eindhoven, Netherlands.


Frans Willems

Frans Willems works at the Electrical Engineering Department of the University of Eindhoven, Netherlands. His research interests are Source coding, Channel coding and Coding for Networks.


Golan Yona

The research interests of Golan, assistant professor at the Technion  Israel Institute of Technology, Israel, and adjunct assistant professor at Cornell University, United States of America, are computational molecular biology and machine learning.




