|
|
|
The basic idea of the MTF scheme is to maintain a list, which represents the symbols of an alphabet. Frequenty used symbols are located near the front of the list. A symbol is encoded as the current index of that symbol in the list, id est as the number of symbols that precede it in the list. Every time the symbol occurs the index is output and the symbol is moved to the front of the list, id est to index 0. The MTF scheme is a typical representative of a List Update Algorithm (LUA) and is used as a Global Structure Transformation (GST) inside the Burrows-Wheeler Compression Algorithm (BWCA).
Logo
|
Title
|
Description
|
|
Modifications of the Burrows and Wheeler Data Compression Algorithm
|
This paper of Bernhard Balkenhol, Stefan Kurtz and Yuri Shtarkov from 1999 describes a modification of the MTF algorithm, which moves the next symbol to the second place in the list instead of the first place, except the old position of the new symbol was 0 or 1, then it is moved to the first place. This BWCA approach achieves a compression rate of 2.30 bps for the Calgary Corpus.
|
|
One attempt of a compression algorithm using the BWT
|
Another important paper of Bernhard Balkenhol and Yuri Shtarkov from 1999 with focus on improvements of the Burrows-Wheeler Compression Algorithm (BWCA). It includes some interesting aspects for the RLE field too. An important property of the output of the BWT is the presents of many runs, which results in overestimating the probability of symbols outside the run. This problem is called "Pressure of Runs" and can be decreased by RLE. Besides that, a modification of the MTF algorithm is described which moves the next symbol to the second place in the list instead of the first place, except the old position of the new symbol was 0 or 1. If the old position of the new symbol was 1 it is moved to the first place only if the last output number was different from 0. This paper is my favourite paper from Bernhard. This BWCA approach achieves a compression rate of 2.26 bps for the Calgary Corpus.
|
|
A locally adaptive data compression scheme
|
The original MTF paper of Jon Bentley, Daniel Sleator, Robert Tarjan and Victor Wei from 1986. The paper descibes a word based MTF scheme as a data compression scheme, that exploits locality of reference. It is based on a simple heuristic for self-organizing sequential search.
|
|
A Block-Sorting Lossless Data Compression Algorithm
|
Michael Burrows and David Wheeler introduce the Burrows-Wheeler Transform (BWT) and the Burrows-Wheeler Compression Algorithm (BWCA) in a research report from 1994 at the Digital Systems Research Center. The paper explains the transformation and why the output sequence compresses well. Some variants of the algorithm, such as a modified MTF, are named too. This paper is a "must" for everyone interested in BWCAs. This BWCA approach achieves a compression rate of 2.43 bps for the Calgary Corpus.
|
|
Move To Front by Campos
|
A short description about the Move To Front algorithm from 1999 written by Arturo Campos with a little example and source code.
|
|
Second step algorithms in the Burrows-Wheeler compression algorithm
|
This publication of Sebastian Deorowicz in "Software-Practice and Experience" from 2002 give a quite complete overview of the post BWT stages used within the Burrows-Wheeler Compression Algorithm. Besides his own Weighted Frequency Count Algorithm and other post BWT stages, Sebastian describes briefly but clearly variantions of the Move To Front scheme, the Inversion Frequencies algorithm and the Distance Coding algorithm with the 3 properties of the newsgroup posting. This is one of my favourite BWCA papers. This BWCA approach achieves a compression rate of 2.25 bps for the Calgary Corpus.
|
|
Block Sorting Text Compression - Final Report
|
This technical paper is one of Peter’s most important BWCA papers published in 1996. Peter describes the basic transformation by an example and several improvements of the basic algorithm, like hierarchical coding models, structured coding models and Wheeler’s Run Length Encoding for zeros (RLE0). He also gives some sort and MTF improvements and a interpretation of the BWT output sequence. This is a "must", if you are interested in the BWCA field. This BWCA approach achieves a compression rate of 2.34 bps for the Calgary Corpus.
|
|
Burrows Wheeler - Alternatives to Move to Front
|
The paper from 2003 by Peter Fenwick, Mark Titchener and Michelle Lorenz describes a cached MTF modeling scheme with a foreground (more frequent symbols) and background (less frequent symbols) model. It also introduces the so called T-entropy as a measurement for entropies after each BWCA stage.
|
|
A Fast Block-sorting Algorithm for lossless Data Compression
|
Michael Schindler developed a sorting algoritm, which had a limited order context in order to speed up the sorting process in his paper from 1996. He also describes a MTF scheme, which works like the MTF-1 scheme of Bernhard Balkenhol, Stefan Kurtz and Yuri Shtarkov from 1999.
|
Logo
|
Name
|
Description
|
|
Bernhard Balkenhol
|
Bernhard studied mathematics at the University of Bielefeld and works for the mediaWays GmbH in Germany. He is interested in information theory, data compression and encryption and has published several compression articles, which are available on his internet site.
|
|
Jon Bentley
|
Beside 200 published articles Jon Bentley is the author of the famous programming books "Programming Pearls", which is a collection of 13 essays that were published in Communications of the ACM in the early 80s, and "More Programming Pearls". Jon Bentley received in 2000 the Dr. Dobb's "Excellence in Programming Award". The award is presented to individuals who, in the spirit of innovation and cooperation, have made significant contributions to the advancement of software development.
|
|
Michael Burrows
|
Michael Burrows is the author of the research report, which introduces the Burrows-Wheeler Transform (BWT).
|
|
Arturo Campos
|
Arturo Campos is a student and programmer, interested in data compression, and has written several articles about data compression.
|
|
Sebastian Deorowicz
|
Sebastian is the author of the Weigthed Frequency Count algorithm (WFC) and is a doctoral student of the Silesian University of Technology, Poland. He just finished his dissertation.
|
|
Peter Fenwick
|
Peter is a Professor at the University of Auckland, New Zealand. His primary interests are in computer architecture (including arithmetic), communications and networks, and text compression. Beside many other papers Peter has published several papers for "symbol ranking" text compression and in the Burrows-Wheeler compression field with a very nice paper "Block Sorting Text Compression - Final Report".
|
|
Stefan Kurtz
|
Stefan Kurtz works at the University of Bielefeld, Germany. His research interests are algorithms and datastructures, suffix trees, functional programming and data compression.
|
|
Michelle Lorenz
|
Michelle Lorenz is a Master of Science student at the Computer Science Department, University of Auckland, New Zealand.
|
|
Michael Schindler
|
Michael Schindler is an independent compression consultant in Austria and the author of szip and a range coder.
|
|
Yuri Shtarkov
|
Yuri Shtarkov is a researcher at the Institute for Problems of Information Transmission (IPPI), Russia. His current interests include source coding and data compression.
|
|
Daniel Sleator
|
Daniel Sleator is Professor at the Carnegie Mellon University, United States of America. His interts are splay trees, link parsers, music analyzers and chess.
|
|
Robert Tarjan
|
Robert Tarjan is Professor at the Princeton University, United States of America.
|
|
Mark Titchener
|
Mark Titchener works at the Computer Science Department, University of Auckland, New Zealand. He is interested in deterministic information theory, measurement of entropy, data communications and coding and Data compression.
|
|
Victor Wei
|
Victor Wei is Professor at the Chinese University of Hong Kong, Hong Kong.
|
|
David Wheeler
|
David Wheeler works as a Professor at the University of Cambridge, United Kingdom.
|
|
|
|
|