www.data-compression.info
The Data Compression Resource on the Internet

Contents

 Distance Coding (DC)


Distance Coding (DC) is an algorithm proposed by Edgar Binder in 2000. There is no official paper published about DC yet, but some messages from Edgar Binder in the comp.compression newsgroup are available.
DC is a replacement of the Move To Front stage (MTF) within the Burrows-Wheeler Compression Algorithm. Similar to the IF stage, the output of the DC stage consits of distances of indices, which can be greater than 255.
The DC stage scans the input sequence sequentially for each symbol C and outputs the distance between the current index and the next occurence of C.

 Publications


Logo

Title

Description

Distance Coding newsgroup message

Distance Coding newsgroup message
 

The famous newsgroup posting from 2000 by Edgar Binder, where he describes the DC algorithm by an example together with 3 important properties of the DC output sequence.
 

Second step algorithms in the Burrows-Wheeler compression algorithm

Second step algorithms in the Burrows-Wheeler compression algorithm
 

This publication of Sebastian Deorowicz in "Software-Practice and Experience" from 2002 give a quite complete overview of the post BWT stages used within the Burrows-Wheeler Compression Algorithm. Besides his own Weighted Frequency Count Algorithm and other post BWT stages, Sebastian describes briefly but clearly variantions of the Move To Front scheme, the Inversion Frequencies algorithm and the Distance Coding algorithm with the 3 properties of the newsgroup posting. This is one of my favourite BWCA papers.
This BWCA approach achieves a compression rate of 2.25 bps for the Calgary Corpus.
 

An analysis of the second step algorithms in the Burrows-Wheeler compression algorithm

An analysis of the second step algorithms in the Burrows-Wheeler compression algorithm
 

A  quite similar publication from 2000 to the article "Second step algorithms in the Burrows-Wheeler compression algorithm" from Sebastian Deorowicz, describing his Weighted Frequency Count Algorithm.
This BWCA approach achieves a compression rate of 2.25 bps for the Calgary Corpus.
 

 People


Logo

Name

Description

Edgar Binder

Edgar Binder
 

Edgar Binder is the author of the Distance Coding algorithm (DC) and a member of the Snarl-up software team.
 

Sebastian Deorowicz

Sebastian Deorowicz
 

Sebastian is the author of the Weigthed Frequency Count algorithm (WFC) and is a doctoral student of the Silesian University of Technology, Poland. He just finished his dissertation.
 

 Source Code


Logo

Title

Description

 

Copyright © 2002-2022 Dr.-Ing. Jürgen Abel, Lechstraße 1, 41469 Neuß, Germany. All rights reserved.