www.data-compression.info
The Data Compression Resource on the Internet

Contents

 Artithmetic Coding (AC)


Arithmetic coding (AC) is a special kind of entropy coding. Unlike Huffman coding, arithmetic coding doesn´t use a discrete number of bits for each symbol to compress. It reaches for every source almost the optimum compression in the sense of the Shannon theorem and is well suitable for adaptive models. The biggest drawbak of the arithmetic coding is it´s low speed since of several needed multiplications and divisions for each symbol. The main idea behind arithmetic coding is to assign to each symbol an interval. Starting with the interval [0..1), each interval is devided in several subintervals, which sizes are proportional to the current probability of the corresponding symbols of the alphabet. The subinterval from the coded symbol is then taken as the interval for the next symbol. The output is the interval of the last symbol. Implementations write bits of this interval sequence as soon as they are certain.
A fast variant of arithmetic coding, which uses less multiplications and divisions, is a range coder, which works byte oriented. The compression rate of a range coder is only a little bit less than pure arithmetic coding, and the difference in many real implementation is not noticeable.

 Publications

Logo

Title

Description

Using Arithmetic Coding for Reduction of Resulting Simulation Data Size on Massively Parallel GPGPUs

Using Arithmetic Coding for Reduction of Resulting Simulation Data Size on Massively Parallel GPGPUs
 

A paper about a block-parallel arithmetic encoder, which reduces the number of data transfers between the GPGPU and the host PC.
 

Compression: Algorithms: Statistical Coders

Compression: Algorithms: Statistical Coders
 

A good introduction into entropy coding is article from Charles Bloom in 1996. The process of statistical coding is explained with many simple examples.
 

New Techniques in Context Modeling and Arithmetic Encoding

New Techniques in Context Modeling and Arithmetic Encoding
 

Charles Bloom presents 1996 several new techniques on high order context modeling, low order context modeling, and order-0 arithmetic coding. Emphasis is placed on economy of memory and speed. Performance is found to be significantly better than previous methods.
 

Arithmetische Kodierung (Proseminar Datenkompression)

Arithmetische Kodierung (Proseminar Datenkompression)
 

A well structured description of the ideas, background and implementation of arithmetic coding in German from 2002 by Eric Bodden, Malte Clasen and Joachim Kneis. Good explanation of the renormalisation process and with complete source code. Very recommendable for German readers.
 

Arithmetic Coding revealed

Arithmetic Coding revealed
 

A translated and updated version of the former German paper about the principles of arithmetic coding by Eric Bodden, Malte Clasen and Joachim Kneis from 2004, now available in English.
 

Arithmetic Coding by Campos

Arithmetic Coding by Campos
 

A short description about arithmetic coding from 1999 written by Arturo Campos with a little example.
 

Arithmetic Coding by the Data Compression Reference Center

Arithmetic Coding by the Data Compression Reference Center
 

A brief description of arithmetic coding from 2000. Easy to read, with figures and examples.
 

The Design and Analysis of Efficient Lossless Data Compression Systems

The Design and Analysis of Efficient Lossless Data Compression Systems
 

The thesis of Paul Howard from 1993 about data compression algorithms with emphasis on arithmetic coding, text and image compression.
 

Arithmetic Coding for Data Compression

Arithmetic Coding for Data Compression
 

Paul Howard and Jeffrey Vitter describe an efficient implementation which uses table lookups in the article from 1994.
 

Analysis of Arithmetic Coding for Data Compression

Analysis of Arithmetic Coding for Data Compression
 

In their article from 1992 Paul Howard and Jeffrey Vitter analyse arithmetic coding and entroduce the concept of weighted entropy.
 

Practical Implementations of Arithmetic Coding

Practical Implementations of Arithmetic Coding
 

A tutorial on arithmetic coding  from 1992 by Paul Howard and Jeffrey Vitter with table lookups for higher speed.
 

Data Compression (Tutorial)

Data Compression (Tutorial)
 

A basic paper from Debra Lelewer and Daniel Hirschberg about fundametal concepts of data compression, intended as a tutorial from 1987. Contains many small examples.
 

Lossless Compression Algorithms (Entropy Encoding)

Lossless Compression Algorithms (Entropy Encoding)
 

Several nice and short articles written by Dave Marshall from 2001 about entropy coding with many examples.
 

Range encoding: an algorithm for removing redundancy from a digitised message

Range encoding: an algorithm for removing redundancy from a digitised message
 

Range encoding was first proposed by this paper from G. Martin in 1979, which describes the algorithm not very clearly.
 

Arithmetic Coding Revisited

Arithmetic Coding Revisited
 

Together with the CACM87 paper this 1998 paper from Alistair Moffat, Radford Neal and Ian Witten is very well known. Improves the CACM87 implementation by using fewer multiplications and a wider range of symbol probabilities.
 

Arithmetic Coding + Statistical Modeling = Data Compression

Arithmetic Coding + Statistical Modeling = Data Compression
 

Mark Nelson's article about arithmetic coding from 1991. The concepts are easy to understand and accompanied by a simple "BILL GATES" example. Source code for Billyboy is available.
 

Anatomy of Range Encoder

Anatomy of Range Encoder
 

A very intersting and usefull site from Andrew Polar in 2007 about range encoding with technical details of arithmetic and range encoders and some patent issues. It contains links to the source code of RanCode.cpp, an range encoder written for research purposes.
 

Arithmetic Coding for Data Compression

Arithmetic Coding for Data Compression
 

This ACM paper from 1987, written by Ian Witten, Radford Neal and John Cleary, is the definite front-runner of all arithmetic coding papers. The article is quite short but comes with full source code for the famous CACM87 AC implementation.
 


 People

Logo

Name

Description

Ana Balevic

Ana Balevic
 

Ana Balevic is interested in parallel computing, parallelization concepts and last but not least compression algorithms.
 

Charles Bloom

Charles Bloom
 

Charles Bloom has published many papers about data compression and is author of PPMZ2, a very strong compression algorithm (2.141 bps on the Calgary Corpus)
 

Eric Bodden

Eric Bodden
 

Eric is a student of the RWTH Aachen, Germany, and currently studying at the University of Kent at Canterbury. He started a small online business called Communic Arts in November 1999.
 

Arturo Campos

Arturo Campos
 

Arturo Campos is a student and programmer, interested in data compression, and has written several articles about data compression.
 

Bob Carpenter

Bob Carpenter
 

Bob Carpenter is a natural language scientist and software architect, working at his own company Alias I, United States of America.
 

Malte Clasen

Malte Clasen
 

Malte Clasen is a student of the RWTH Aachen, Germany, and is known as "the update" in the demoscene, a community of people whose target is to demonstrate their coding, drawing and composing skills in small programs called demos that have no purpose except posing.
 

John Cleary

John Cleary
 

John Cleary works at the University of Waikato, New Zealand, and has published several well known papers together with Ian Witten and Timothy Bell.
 

Daniel Hirschberg

Daniel Hirschberg
 

Daniel Hirschberg is working at the University of California, United States of America. He is interested in the theory of design and analysis of algorithms.
 

Paul Howard

Paul Howard
 

Paul Howard is working at the Eastern Michigan University, United States of America, and is engaged in the arithmetic coding filed since 10 years.
 

Joachim Kneis

Joachim Kneis
 

Joachim Kneis studies Computer Science at the RWTH Aachen, Germany, and like to play "Unreal Tournament".
 

Mikael Lundqvist

Mikael Lundqvist
 

Mikael is interested in data compression, experimental electronic music and has written a BWT implementation, an improved range coder, a faster sort algorithm and a modified MTF scheme.
 

Dave Marshall

Dave Marshall
 

Dave Marshall works at the Cardiff University, United Kingdom. He is interested in music and has several compression articles on his multimedia internet site.
 

G. Martin

G. Martin
 

G. Martin is the author of the first range coder paper presented on the Data Recording Conference in 1979.
 

Alistair Moffat

Alistair Moffat
 

Alistair Moffat is working at the University of Melbourne, Australia. Together with Ian Witten and Timothy Bell he is author of the book "Managing Gigabytes".
 

Radford Neal

Radford Neal
 

Radford Neal works at the University of Toronto, Canada. He is one of the authors of the CACM87 implementation, which sets the standard in aritmetic coding.
 

Mark Nelson

Mark Nelson
 

Mark is the author of the famous compression site www.datacompression.info and has published articles in the data compression field for over ten years. He is an editor of the Dr. Dobb's Journal and author of the book "The Data Compression Book". He lives in the friendly Lone Star State Texas ("All My Ex's"...).
 

Andrew Polar

Andrew Polar
 

Andrew Polar achieved a Ph.D. in automatic control theory and conducted lectures in Arithmetic, Algebra also some programming. He is currently in commercial software development.
 

Amir Said

Amir Said
 

Amir works as a senior researcher at the Hewlett-Packard Laboratories, Palo Alto, United States of America, on imaging, image and video coding, signal processing, and security. He is coauthor of the Lossless Compression Handbook, and has released free C++ source code of fast arithmetic coding implementation.
 

Michael Schindler

Michael Schindler
 

Michael Schindler is an independent compression consultant in Austria and the author of szip and a range coder.
 

Jeffrey Vitter

Jeffrey Vitter
 

Jeffrey Vitter works at the Purdue University, United States of America. He published several data compression papers, some of them together with Paul Howard.
 

Ian Witten

Ian Witten
 

Ian is working at the University of Waikato, New Zealand. Together with John Cleary and Timothy Bell he published "Modeling for Text Compression".
 


 Source Code


Logo

Title

Description

Arithmetische Kodierung (Proseminar Datenkompression)

Arithmetische Kodierung (Proseminar Datenkompression)
 

The source code from the paper of Eric Bodden, Malte Clasen and Joachim Kneis.
 

Arithmetic Coding by Campos

Arithmetic Coding by Campos
 

A little pseudo source code from Arturo Campos.
 

Range coder by Campos

Range coder by Campos
 

A little pseudo source code from Arturo Campos.
 

Compression via Arithmetic Coding in Java

Compression via Arithmetic Coding in Java
 

A JAVA implementation from Bob Carpenter for a generic arithemtic coder and decoder, along with byte stream models that are subclasses of Java's I/O streams.
 

CACM87

CACM87
 

The standard CACM 1987 implementation of arithmetic coding in three different versions from John Cleary, Radford Neal and Ian Witten.
 

Range Coder by Lundqvist

Range Coder by Lundqvist
 

The range coder implementation from Dmitry Subbotin, improved by Mikael Lundqvist. A range coder is working similary to an arithmetic coder but uses less renormalisations and a faster byte output.
 

Arithmetic Coding + Statistical Modeling = Data Compression

Arithmetic Coding + Statistical Modeling = Data Compression
 

The source code for the arithmetic coding article from Mark Nelson.
 

RanCode by Polar

RanCode by Polar
 

RanCode is a fast and efficient range encoder by Andrew Polar.
 

FastAC

FastAC by Said
 

A fast arithmetic coding implementation by Amir Said in C++.
 

Arithmetic Coding + Statistical Modeling = Data Compression

Range Coder by Schindler
 

Range coder source code from Michael Schindler, which is one of my favourite range coder implementations. A range coder is working similary to an arithmetic coder but uses less renormalisations and a faster byte output.
 

 

Copyright © 2002-2022 Dr.-Ing. Jürgen Abel, Lechstraße 1, 41469 Neuß, Germany. All rights reserved.