www.data-compression.info
The Data Compression Resource on the Internet

Contents

 Run Length Encoding (RLE)


Run Length Encoding (RLE) is a simple and popular data compression algorithm. It is based on the idea to replace a long sequence of the same symbol by a shorter sequence and is a good introduction into the data compression field for newcomers.
The sequence of length l of a repeated symbol 's' is replaced by a shorter sequence, usually containing one or more symbols of 's', a length information and sometimes an escape symbol.
RLE algorithms differ from each other mainly in three points: the threshold t, the marking of the start of a run and the coding of the length information. If l is smaller than t, the run stays unchanged, and if l is greater or equal t, the run is replaced. The start of a run can be indicated by a threshold run or an escape symbol 'c'. If a threshold run is used the start is characterized by a small sequence of 's', which has a length greater or equal than t. If an escape symbol 'c' indicates the start of a run, 's' is normally put behind 'c' in order to characterize the run symbol. 'c' must not be an element of the alphabet or occurrences of 'c' have to be coded in a way, that they are not mixed up with the start of a run.

 Publications


Logo

Title

Description

One attempt of a compression algorithm using the BWT

One attempt of a compression algorithm using the BWT
 

Another important paper of Bernhard Balkenhol and Yuri Shtarkov from 1999 with focus on improvements of the Burrows-Wheeler Compression Algorithm (BWCA).
It includes some interesting aspects for the RLE field too. An important property of the output of the BWT is the presents of many runs, which results in overestimating the probability of symbols outside the run. This problem is called "Pressure of Runs" and can be decreased by RLE. Besides that, a modification of the MTF algorithm is described which moves the next symbol to the second place in the list instead of the first place, except the old position of the new symbol was 0 or 1. If the old position of the new symbol was 1 it is moved to the first place only if the last output number was different from 0.
This paper is my favourite paper from Bernhard.
This BWCA approach achieves a compression rate of 2.26 bps for the Calgary Corpus.
 

Run Length Encoding

Run Length Encoding
 

This tutorial from 1999 of Arturo Campos describes a simple RLE algorithm, which is based on a threshold run instead of an escape symbol.
 

RLE - Run Length Encoding

RLE - Run Length Encoding
 

The Data Compression Reference Center presents a tutorial from 2000 about RLE, which is based on an escape symbol. The tutorial includes a flow chart of the RLE algorithm together with some commercial examples.
 

The RLE compression algorithm

The RLE compression algorithm
 

A short introduction from 2000 by Laurens Leurs of RLE, which presents a simple and very typical example of RLE in praxis, based on an escape symbol.
 

A Run Length Encoding Scheme For Block Sort Transformed Data

A Run Length Encoding Scheme For Block Sort Transformed Data
 

Michael Maniscalco describes in 2000 a RLE algorithm for usage within the Burrows-Wheeler Compression Allgorithm, which is based on a fixed length threshold run and a symbol run which is about half as long as the original run.
 

A Second Modified Run Length Encoding Scheme for Blocksort Transformed Data

A Second Modified Run Length Encoding Scheme for Blocksort Transformed Data
 

This RLE algorithm from 2001 by Michael Maniscalco is based on a variable length threshold run, which defines the length of the binary representation of the threshold run and a mantissa part, which is stored in a separate data buffer.
 

 People


Logo

Name

Description

Bernhard Balkenhol

Bernhard Balkenhol
 

Bernhard studied mathematics at the University of Bielefeld and works for the mediaWays GmbH in Germany. He is interested in information theory, data compression and encryption and has published several compression articles, which are available on his internet site.
 

Arturo Campos

Arturo Campos
 

Arturo Campos is a student and programmer, interested in data compression, and has written several articles about data compression.
 

Laurens Leurs

Laurens Leurs
 

Laurens Leurs studied Industrial Design and works for the presales support of a Belgian prepress company.
 

Michael Maniscalco

Michael Maniscalco
 

Michael has focused his personal work on data compression and works as a software developer for Lexon Technologies in North Andover, United States of America.
 

Yuri Shtarkov

Yuri Shtarkov
 

Yuri Shtarkov is a researcher at the Institute for Problems of Information Transmission (IPPI), Russia. His current interests include source coding and data compression.
 

 Source Code


Logo

Title

Description

A Run Length Encoding Scheme For Block Sort Transformed Data

A Run Length Encoding Scheme For Block Sort Transformed Data
 

Source code in C for a sample coder from Michael Maniscalco.
 

 

Copyright © 2002-2022 Dr.-Ing. Jürgen Abel, Lechstraße 1, 41469 Neuß, Germany. All rights reserved.