Skip to content
Brendan G Bohannon edited this page Jul 3, 2020 · 5 revisions

BT FastBHT Image (BTIC 2f)

  • FOURCC: 'bt2f' ?
This format will be similar in concept to the original JPEG format, but will make a number of design changes mostly with the intention of increasing the speed (reducing the computational cost) of a decoder.
  • Will limit symbol lengths to 12 bits.
  • Will use lower cost transforms.
Image TLV:
  • 0x00..0x1F: 8K, TWOCC
  • 0x20..0x3F: 8K, FOURCC
  • 0x40..0x5F: 512M, TWOCC
  • 0x60..0x7F: 512M, FOURCC
  • 0x80..0x9F: 2M, TWOCC
  • 0xA0..0xBF: 2M, FOURCC
TwoCC:
  • 'IX': Image Data
  • 'HX': Header
  • 'HT': Huffman Tables
  • 'QT': Quantization Tables
Header:
  • u16 width; //image frame width
  • u16 height; //image frame height
  • u16 flags; //some control flags
  • byte clrs; //colorspace
  • byte mbty; //macroblock type
The frame size will be internally padded up to a multiple of the size of a macroblock, though the width and height will represent the non-padded resolution.

Note that only a fixed set of macroblock layouts will be supported with this format. This is mostly done in the name of making the decoder simpler.

Table of Contents

Bitstream

Bitstream

  • LSB first (like in Deflate).
  • Huffman symbols are 1-12 bits.
    • Rather than 1-15 as in Deflate.
    • This allows simplifying the table encoding some.
    • Also allows for a single-stage lookup table to be smaller.
The use of a 12-bit lookup allows the decoder table for Huffman symbols to be 8K, with two 8K tables being able to more easily fit in cache while still being reasonably effective for entropy coding.

Several lump types will use a bitstream rather than byte-oriented data.

Colorspace

Colorspace:

  • 0: GDbDr
    • Y=G, U=B-G, V=R-G
  • 1: RCT
    • Y=(2G+R+B)/4, U=B-G, V=R-G
  • 2: Approx YUV
    • Y=(8G+5R+3B)/16, U=B-Y, V=R-Y
The primary colorspace will be GDbDr, which will be defined as:
  • Y=G, U=B-G, V=R-G
  • G=Y, B=Y+U, R=Y+V
The RCT and Approx-YUV spaces may also be used. These may offer slightly better decorellation at the expense of higher computational cost.

RCT:

  • Y=(2G+R+B)/4, U=B-G, V=R-G
  • G=Y-(U+V)/4, R=G+V, B=G+U
The Approx-YUV space will function as an approximation of a more traditional YUV variant.
  • Y=(8G+5R+3B)/16, U=B-Y, V=R-Y
  • R=Y+V, B=Y+U, G=(8Y-5V-3U)/8
Values in the Y plane will thus nominally range between 0 and 255, whereas values in the U and V planes will be -256 to 256. Though, the actual range used by an encoder may be less than this.

The U and V values will be stored without scaling or a DC bias.

This format will not require value-range clamping on the decoder, thus it will be the encoder's responsibility to make sure that decoded values do not overflow. Whether overflowed values wrap or saturate will be undefined.

Possible: Use of range clamping may depend on colorspace.

Quantization Tables

The Quantization Tables lump will consist of Quantization Tables.

These will consist of a tag byte followed by the table data:

  • EOT(0): End-Of-Tables Marker
  • Y (1): Quantization Table for Y.
  • UV(2): Quantization Table for UV.
Each quantization table will consist of 64 bytes which will represent scale factors for each coefficient. These will be in the same order as those used in the block transform.

Packed Huffman Tables

The Huffman Tables Lump will consist of multiple tables each preceded by a 4 bit prefix. If a prefix value of 0 is encountered, this means that the end of the Huffman Tables lump has been reached.

Major Tables (4-bit):

  • EOT(0): End-Of-Tables Marker
  • DC(1): DC Coefficient Table
  • AC(2): AC Coefficient Table
Additional Huffman tables may be added later, and for the time being will follow the same basic format.

Tables 12..15 will be reseved for expansion to some other table format.

Huffman tables will encode a series of lengths, but the tables will not be entropy coded. Each table will encode up to 256 lengths.

Table contents will be encoded via 4 bit codes:

  • 0: Non-encoded symbol
  • 1..C: Symbol Length, 1..12 bits.
  • D: Reserved for longer symbol lengths (2 bits follow):
    • Followed by a 2-bit suffix (encoding lengths of 0..2=13..15 bits)
  • E: 3..18 zeroes (4 bits follow).
  • F: Escaped Run, 2 bit code follows:
    • 0: 19..82 zeroes (6 bits follow).
    • 1: 4..66 repeats of prior length (6 bits follow).
      • Bias is 3, with 0/3 encoding an "End of Table".
    • 2: Reserved
    • 3: Reserved
A table does not necessarily end with an End of Table marker, but may also end upon reaching its total number of lengths. A table may not encode more than 256 lengths.

If an End of Table marker is seen, any remaining lengths before the end of the table are filled with zeroes.

Symbols will be assigned code patterns such that:

  • Shorter symbols precede longer ones;
  • Symbols with the same length will be be ordered by symbol value.
Symbols will be effectively transposed in the bitstream, such that the high-order bit of the symbol's codeword appears in LSB position in the bitstream.

Macroblock

Macroblock Type (Main Header):

  • 0: 4:2:0, YYYYUV, 16x16 pixels (BHT)
  • 1: 4:4:4, YUV, 8x8 pixels (BHT)
A macroblock will be the basic unit of image compression. A macroblock will consist of 1 or more Y, U, and V blocks, with a layout defined in terms of the chroma subsampling mode. This format will assume fixed subsampling modes, with the image frame composed of an integer number of macroblocks.

For each coded block, the preceding block from the same plane will indicate a predicted value for DC, with the initial values for each plane being set to 0.

For 4:2:0 mode, the four Y blocks will also be encoded in Hilbert order, or:

  • 0 3
  • 1 2
Each block will consist of 64 coefficients encoded in "zigzag" order:
  • 0 1 5 6 14 15 27 28
  • 2 4 7 13 16 26 29 42
  • 3 8 12 17 25 30 41 43
  • 9 11 18 24 31 40 44 53
  • 10 19 23 32 39 45 52 54
  • 20 22 33 38 46 51 55 60
  • 21 34 37 47 50 56 59 61
  • 35 36 48 49 57 58 62 63
DC coefficients, labeled as 0, will be Huffman encoded first using the DC table. The remaining coefficients will use the AC table.

Each AC coefficient will be encoded using a Z3V5 scheme, where Z will be the high 3 bits and will encode the number of zeroes. The V field in the low 5 bits will correspond to the Value, and will serve as a prefix for a variable-length coding.

The prefix will be followed by 0 or more "extra bits". Prefix, Extra Bits, Value/Range:

  • 00/01, 0, 0 / 1
  • 02/03, 0, 2 / 3
  • 04/05, 1, 4.. 7
  • 06/07, 2, 8.. 15
  • 08/09, 3, 16.. 31
  • 0A/0B, 4, 32.. 63
  • 0C/0D, 5, 64.. 127
  • 0E/0F, 6, 128.. 255
  • 10/11, 7, 256.. 511
  • 12/13, 8, 512.. 1023
  • 14/15, 9, 1024.. 2047
  • 16/17, 10, 2048.. 4095
  • 18/19, 11, 4096.. 8191
  • 1A/1B, 12, 8192..16383
  • 1C/1D, 13, 16384..32767
  • 1E/1F, 14, 32768..65535
For encoding coefficient values, the sign will be folded into the LSB, and thus the unsigned sequence (0, 1, 2, 3, 4, 5, 6, 7, ...) will map to the signed sequence (0, -1, 1, -2, 2, -3, 3, -4, ...).

Block Haar Transform

The transform used will be referred to as the Block Haar Transform. This is essentially a block oriented version of a Haar Wavelet.

The 2D transform will be composed of layering horizontal and vertical transforms. The horizontal transform will be on the side facing the image pixel data, with the vertical transform on the interior (coefficient) side.

The transform will work by being applied to pairs of values, which will map (A, B) to (A+B, (A+B)/2-B).

Mapping (I0..I7) to (O0..O7):

  • A0=I0+I1, O4=(A0>>1)-I1
  • A1=I2+I3, O5=(A1>>1)-I3
  • A2=I4+I5, O6=(A2>>1)-I5
  • A3=I6+I7, O7=(A3>>1)-I7
  • B0=A0+A1, O2=(B0>>1)-A1
  • B1=A2+A3, O3=(B1>>1)-A3
  • O0=B0+B1, O1=(C0>>1)-B1
Decoding will inverse this mapping and rebuild the original series of values, mapping (C, D) to (C-(C/2-D), C/2-D).

Where, substituting the forward transform into the inverse:

  • ((A+B)-((A+B)/2-((A+B)/2-B)), (A+B)/2-((A+B)/2-B)) =>
  • ((A+B)-(A+B)/2+(A+B)/2-B), (A+B)/2-(A+B)/2+B) =>
  • ((A+B)-B, B) =>
  • (A, B)
This transform is mostly used because it is fast.

Note that the block transform will not contain an implicit DC bias.

Clone this wiki locally