www.data-compression.info
The Data Compression Resource on the Internet

Contents

 Inversion Frequencies (IF)


The Inversion Frequencies Algorithm (IF) is used as a replacement of the Move To Front stage (MTF) within the Burrows-Wheeler Compression algorithm and was introduced by Arnavut and Magliveras in 1997. Its aim is to obtain a output sequence, which is better compressible than the output sequence of the MTF stage.
The IF stage scans the input sequence for each alphabet symbol C, and if the current element of the input sequence is equal to C, the number of symbols greater than C between the current index and the index of the last occurrence of C is output.
Since the difference of the indices can be greater than 255, the bit size of the output values is usually 20 bit or more. One important property of IF is the fact, that the output values for the last symbol of the alphabet are all 0. Since a sequence of zeroes does not give any information, it can be omited. Therefore, the output sequence of the IF stage is shorter than the input sequence. On the other hand the IF stage needs substantially more overhead than the MTF stage to be transmitted like the symbol distribution or terminator symbols, which makes IF less attractive for small amounts of compressible data.

 Publications


Logo

Title

Description

Lexical Permutation Sorting Algorithm

Lexical Permutation Sorting Algorithm
 

This article by Ziya Arnavut and Spyros Magliveras from 1996 describes a "Lexical Permutation Sorting Algorithm" (LPSA), which is a generalization of the Burrows-Wheeler Compression Algorithm (BWCA).
 

Block Sorting and Compression

Block Sorting and Compression
 

This article by Ziya Arnavut and Spyros Magliveras from 1997 describes a "Lexical Permutation Sorting Algorithm" (LPSA), which is a generalization of the Burrows-Wheeler Compression Algorithm (BWCA), and a new post-BWT stage, called Inversion Frequencies (IF).
 

Move-to-Front and Inversion Coding

Move-to-Front and Inversion Coding
 

This conference proceeding paper from Ziya Arnavut at DCC 2000 compares the MTF stage with the IF stage. The IF stage, called inversion coding (ranking) technique, yields better compression gain than the recency ranking (MTF coder) for almost all the test data files.
 

Investigation of Block-Sorting of Multiset Permutations

Investigation of Block-Sorting of Multiset Permutations
 

Ziya and Meral Arnavut investigate in this 2004 article special cases of permutations, called multiset permutations, together with a transform different to the BWT, called Linear Ordering Transformation (LOT). LOT sorts rows of a cyclic rotated matrix with respect to the first element of each row.
 

Inversion Coder

Inversion Coder
 

A paper from 2004 by Ziya Arnavut presents and compares different inversion ranking schemes (IF) with recency ranking schemes (MTF). For large files, image and DNA files, IF achieves better compression rate than MTF schemes.
 

Second step algorithms in the Burrows-Wheeler compression algorithm

Second step algorithms in the Burrows-Wheeler compression algorithm
 

This publication of Sebastian Deorowicz in "Software-Practice and Experience" from 2002 give a quite complete overview of the post BWT stages used within the Burrows-Wheeler Compression Algorithm. Besides his own Weighted Frequency Count Algorithm and other post BWT stages, Sebastian describes briefly but clearly variantions of the Move To Front scheme, the Inversion Frequencies algorithm and the Distance Coding algorithm with the 3 properties of the newsgroup posting. This is one of my favourite BWCA papers.
This BWCA approach achieves a compression rate of 2.25 bps for the Calgary Corpus.
 

 People


Logo

Name

Description

Meral Arnavut

Meral Arnavut
 

Meral Arnavut is an assistant orofessor at the Department of Mathematical Sciences at the State University of New York, Fredonia, United States of America. Her research interest is in Commutative Algebra, Ring Theory,  Number Theory, Combinatorics and Data Compression.
 

Ziya Arnavut

Ziya Arnavut
 

Ziya Arnavut is a associate professor at the State University of New York, Fredonia, United States of America. He published several papers about data compression, particularly lossless image compression. He developed the Inversion Coder, which compresses very well DNA, image, audio and other signal-based data, in addition to large text files. His research interests are in data compression, image processing, multimedia, communications and combinatorial algorithms.
 

Sebastian Deorowicz

Sebastian Deorowicz
 

Sebastian is the author of the Weigthed Frequency Count algorithm (WFC) and is a doctoral student of the Silesian University of Technology, Poland. He just finished his dissertation.
 

Spyros Magliveras

Spyros Magliveras
 

Spyros Magliveras is a professor of Mathematics at the Florida Atlantic University, United States of America. His main research interests are in cryptology, algebraic combinatorics, coding theory, and algorithms and their complexity, particularly those related to computational group theory.
 

 Source Code


Logo

Title

Description

 

Copyright © 2002-2022 Dr.-Ing. Jürgen Abel, Lechstraße 1, 41469 Neuß, Germany. All rights reserved.