Adaptive coding

From Wikipedia, the free encyclopedia

Adaptive coding refers to variants of entropy encoding methods of lossless data compression. They are particularly suited to streaming data, because they adapt to localized changes in the characteristics of the data and do not require a first pass over the data to calculate a probability model. The cost of these advantages is that the encoder and decoder must be more complex in order to keep their states synchronized, and more computational power is needed to keep adapting the encoder and decoder state.

Almost all data compression methods involve the use of a model, a prediction of the composition of the data. When the data matches the prediction made by the model, the encoder can usually transmit the content of the data at a lower information cost by making reference to the model. This general statement is somewhat misleading, because general data compression algorithms include the popular LZW and LZ77 algorithms, which are hardly comparable to the compression techniques typically called adaptive. Run-length encoding, and typical JPEG compression with run-length encoding and predefined Huffman codes, do not transmit a model. Many other methods adapt their model to the current file and must transmit it in addition to the encoded data, because both the encoder and the decoder need to use the model.

In adaptive coding, the encoder and decoder are instead equipped with a predefined meta-model that determines how they will alter their models in response to the actual content of the data. Otherwise they start with a blank slate, meaning that no initial model needs to be transmitted. As the data is transmitted, both encoder and decoder adapt their models, so that, unless the character of the data changes radically, the model becomes better adapted to the data it is handling and compresses it more efficiently, approaching the efficiency of static coding.
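The convergence toward static efficiency can be illustrated with a minimal sketch. It does not implement a real coder. It only sums the ideal code length, -log2(p), that an entropy coder would approach for each symbol. The function names, the 256-symbol byte alphabet, and the choice of starting every count at 1 are assumptions made for this illustration, not part of any particular standard.

```python
import math
from collections import Counter


def static_cost_bits(data: bytes) -> float:
    """Ideal code length when the exact symbol frequencies are known up front.

    The cost of transmitting the model itself is NOT included here.
    """
    counts = Counter(data)
    total = len(data)
    return sum(-math.log2(counts[s] / total) for s in data)


def adaptive_cost_bits(data: bytes, alphabet_size: int = 256) -> float:
    """Ideal code length when the model starts blank and learns as it goes.

    Assumption for this sketch: every symbol starts with a count of 1,
    and the count of a symbol is incremented after it has been coded.
    """
    counts = [1] * alphabet_size  # blank slate: all symbols equally likely
    total = alphabet_size
    bits = 0.0
    for s in data:
        bits += -math.log2(counts[s] / total)  # cost under the current model
        counts[s] += 1                         # adapt the model afterwards
        total += 1
    return bits


if __name__ == "__main__":
    for repeats in (20, 2000):
        data = b"abracadabra" * repeats
        ratio = adaptive_cost_bits(data) / static_cost_bits(data)
        print(len(data), "symbols: adaptive/static cost ratio =", round(ratio, 3))
```

Under these assumptions the adaptive cost is never below the static cost, and the ratio between the two moves toward 1 as the input grows, as long as the statistics of the data stay stable.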

Adaptive method

Encoder

  1. Initialize the data model as per agreement.
  2. While there is more data to send
    1. Encode the next symbol using the data model and send it.
    2. Modify the data model based on the last symbol.

Decoder

  1. Initialize the data model as per agreement.
  2. While there is more data to receive
    1. Decode the next symbol using the data model and output it.
    2. Modify the data model based on the decoded symbol.
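The encoder and decoder loops above can be sketched as follows. This is an illustrative example under stated assumptions, not a reference implementation. The model is a table of symbol counts, and a Huffman code is rebuilt from the counts before every symbol. Practical adaptive Huffman coders update the tree incrementally instead. The alphabet `ALPHABET`, the initial count of 1 per symbol, and all function names are hypothetical choices standing in for the "agreement" between the two sides.

```python
import heapq
from typing import Dict

ALPHABET = "abcdr"  # hypothetical alphabet agreed on by both sides


def build_code(counts: Dict[str, int]) -> Dict[str, str]:
    """Build a Huffman code from counts with deterministic tie-breaking,
    so that encoder and decoder always derive the same code."""
    # Heap entries: (weight, unique tie-breaker, symbols in this subtree)
    heap = [(w, i, [s]) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    code = {s: "" for s in counts}
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        for s in left:
            code[s] = "0" + code[s]
        for s in right:
            code[s] = "1" + code[s]
        heapq.heappush(heap, (w0 + w1, next_id, left + right))
        next_id += 1
    return code


def initial_model() -> Dict[str, int]:
    """Step 1: the agreed starting model (assumed: every count is 1)."""
    return {s: 1 for s in ALPHABET}


def encode(message: str) -> str:
    counts = initial_model()
    out = []
    for symbol in message:
        out.append(build_code(counts)[symbol])  # encode with current model
        counts[symbol] += 1                     # then update the model
    return "".join(out)


def decode(bits: str) -> str:
    counts = initial_model()
    out = []
    pos = 0
    while pos < len(bits):
        inverse = {cw: s for s, cw in build_code(counts).items()}
        end = pos + 1
        while bits[pos:end] not in inverse:     # extend until a codeword matches
            end += 1
            if end > len(bits):
                raise ValueError("truncated or corrupt bit stream")
        symbol = inverse[bits[pos:end]]
        out.append(symbol)
        counts[symbol] += 1                     # same update as the encoder
        pos = end
    return "".join(out)


if __name__ == "__main__":
    bits = encode("abracadabra")
    print(bits, decode(bits))
```

The decoder can stay synchronized only because it applies exactly the same model update as the encoder, in the same order. This is the reason the code is always built from the counts as they were before the current symbol.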

Any adaptive coding method has a corresponding static model method, in which the data model is precalculated and then transmitted with the data.

Static method

Encoder

  1. Initialize the data model based on a first pass over the data.
  2. Transmit the data model.
  3. While there is more data to send
    1. Encode the next symbol using the data model and send it.

Decoder

  1. Receive the data model.
  2. While there is more data to receive
    1. Decode the next symbol using the data model and output it.
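For comparison, the static steps above can be sketched in the same style. This is a minimal two-pass example under stated assumptions. The model is a table of symbol counts, the code is a Huffman code derived from it, and "transmitting the model" is represented simply by returning the counts next to the bit string. The function names are hypothetical.

```python
import heapq
from collections import Counter
from typing import Dict, Tuple


def huffman_code(counts: Dict[str, int]) -> Dict[str, str]:
    """Build a Huffman code from counts with deterministic tie-breaking."""
    if len(counts) == 1:
        # A single symbol still needs a non-empty codeword.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, symbols in this subtree)
    heap = [(w, i, [s]) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    code = {s: "" for s in counts}
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        for s in left:
            code[s] = "0" + code[s]
        for s in right:
            code[s] = "1" + code[s]
        heapq.heappush(heap, (w0 + w1, next_id, left + right))
        next_id += 1
    return code


def encode_static(message: str) -> Tuple[Dict[str, int], str]:
    counts = dict(Counter(message))             # first pass: build the model
    code = huffman_code(counts)
    bits = "".join(code[s] for s in message)    # second pass: encode
    return counts, bits                         # the model travels with the data


def decode_static(counts: Dict[str, int], bits: str) -> str:
    inverse = {cw: s for s, cw in huffman_code(counts).items()}
    out = []
    current = ""
    for b in bits:
        current += b
        if current in inverse:                  # a complete codeword was read
            out.append(inverse[current])
            current = ""
    return "".join(out)


if __name__ == "__main__":
    model, bits = encode_static("abracadabra")
    print(model, bits, decode_static(model, bits))
```

Here the model never changes during coding, so the decoder needs no update step. The price is the first pass over the data and the extra cost of sending the counts.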

Examples

Adaptive image coding was used by the Cassini-Huygens craft to relay images from Saturn. Only about 5% of the images show any visual signs of damage. Because the spacecraft has an error-correcting flash drive and long intervals between image-taking events, damaged images like this can be present. It is assumed that the proportion of damaged and unrecoverable images from the Cassini mission is about 0.01% or less.

Image caption: The Cassini camera was pointing toward Dione at a distance of approximately 548,210 kilometers. The image was taken using the CL1 and CL2 filters on May 17, 2010.

Cassini Lossless Compression

  • Both converted (8-bit) and unconverted (12-bit) data can be losslessly compressed. The Cassini hardware data compressor uses a modified Huffman encoding scheme as part of its adaptive compressor.
  • Each compressed image can be reconstructed on the ground with no loss to the information content of the image, provided the image entropy does not exceed the threshold where 2:1 compression is reached.
  • Due to camera problems and the need to reduce file size, the image coding scheme is slightly modified so that each compressed line is effectively bandwidth-limited in the number of bits available to encode it.