cannam@86: % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*- cannam@86: %!TEX root = Vorbis_I_spec.tex cannam@86: % $Id$ cannam@86: \section{Probability Model and Codebooks} \label{vorbis:spec:codebook} cannam@86: cannam@86: \subsection{Overview} cannam@86: cannam@86: Unlike practically every other mainstream audio codec, Vorbis has no cannam@86: statically configured probability model, instead packing all entropy cannam@86: decoding configuration, VQ and Huffman, into the bitstream itself in cannam@86: the third header, the codec setup header. This packed configuration cannam@86: consists of multiple 'codebooks', each containing a specific cannam@86: Huffman-equivalent representation for decoding compressed codewords as cannam@86: well as an optional lookup table of output vector values to which a cannam@86: decoded Huffman value is applied as an offset, generating the final cannam@86: decoded output corresponding to a given compressed codeword. cannam@86: cannam@86: \subsubsection{Bitwise operation} cannam@86: The codebook mechanism is built on top of the vorbis bitpacker. Both cannam@86: the codebooks themselves and the codewords they decode are unrolled cannam@86: from a packet as a series of arbitrary-width values read from the cannam@86: stream according to \xref{vorbis:spec:bitpacking}. cannam@86: cannam@86: cannam@86: cannam@86: cannam@86: \subsection{Packed codebook format} cannam@86: cannam@86: For purposes of the examples below, we assume that the storage cannam@86: system's native byte width is eight bits. This is not universally cannam@86: true; see \xref{vorbis:spec:bitpacking} for discussion cannam@86: relating to non-eight-bit bytes. cannam@86: cannam@86: \subsubsection{codebook decode} cannam@86: cannam@86: A codebook begins with a 24 bit sync pattern, 0x564342: cannam@86: cannam@86: \begin{Verbatim}[commandchars=\\\{\}] cannam@86: byte 0: [ 0 1 0 0 0 0 1 0 ] (0x42) cannam@86: byte 1: [ 0 1 0 0 0 0 1 1 ] (0x43) cannam@86: byte 2: [ 0 1 0 1 0 1 1 0 ] (0x56) cannam@86: \end{Verbatim} cannam@86: cannam@86: 16 bit \varname{[codebook\_dimensions]} and 24 bit \varname{[codebook\_entries]} fields: cannam@86: cannam@86: \begin{Verbatim}[commandchars=\\\{\}] cannam@86: cannam@86: byte 3: [ X X X X X X X X ] cannam@86: byte 4: [ X X X X X X X X ] [codebook\_dimensions] (16 bit unsigned) cannam@86: cannam@86: byte 5: [ X X X X X X X X ] cannam@86: byte 6: [ X X X X X X X X ] cannam@86: byte 7: [ X X X X X X X X ] [codebook\_entries] (24 bit unsigned) cannam@86: cannam@86: \end{Verbatim} cannam@86: cannam@86: Next is the \varname{[ordered]} bit flag: cannam@86: cannam@86: \begin{Verbatim}[commandchars=\\\{\}] cannam@86: cannam@86: byte 8: [ X ] [ordered] (1 bit) cannam@86: cannam@86: \end{Verbatim} cannam@86: cannam@86: Each entry, numbering a cannam@86: total of \varname{[codebook\_entries]}, is assigned a codeword length. cannam@86: We now read the list of codeword lengths and store these lengths in cannam@86: the array \varname{[codebook\_codeword\_lengths]}. Decode of lengths is cannam@86: according to whether the \varname{[ordered]} flag is set or unset. cannam@86: cannam@86: \begin{itemize} cannam@86: \item cannam@86: If the \varname{[ordered]} flag is unset, the codeword list is not cannam@86: length ordered and the decoder needs to read each codeword length cannam@86: one-by-one. cannam@86: cannam@86: The decoder first reads one additional bit flag, the cannam@86: \varname{[sparse]} flag. This flag determines whether or not the cannam@86: codebook contains unused entries that are not to be included in the cannam@86: codeword decode tree: cannam@86: cannam@86: \begin{Verbatim}[commandchars=\\\{\}] cannam@86: byte 8: [ X 1 ] [sparse] flag (1 bit) cannam@86: \end{Verbatim} cannam@86: cannam@86: The decoder now performs for each of the \varname{[codebook\_entries]} cannam@86: codebook entries: cannam@86: cannam@86: \begin{Verbatim}[commandchars=\\\{\}] cannam@86: cannam@86: 1) if([sparse] is set) \{ cannam@86: cannam@86: 2) [flag] = read one bit; cannam@86: 3) if([flag] is set) \{ cannam@86: cannam@86: 4) [length] = read a five bit unsigned integer; cannam@86: 5) codeword length for this entry is [length]+1; cannam@86: cannam@86: \} else \{ cannam@86: cannam@86: 6) this entry is unused. mark it as such. cannam@86: cannam@86: \} cannam@86: cannam@86: \} else the sparse flag is not set \{ cannam@86: cannam@86: 7) [length] = read a five bit unsigned integer; cannam@86: 8) the codeword length for this entry is [length]+1; cannam@86: cannam@86: \} cannam@86: cannam@86: \end{Verbatim} cannam@86: cannam@86: \item cannam@86: If the \varname{[ordered]} flag is set, the codeword list for this cannam@86: codebook is encoded in ascending length order. Rather than reading cannam@86: a length for every codeword, the encoder reads the number of cannam@86: codewords per length. That is, beginning at entry zero: cannam@86: cannam@86: \begin{Verbatim}[commandchars=\\\{\}] cannam@86: 1) [current\_entry] = 0; cannam@86: 2) [current\_length] = read a five bit unsigned integer and add 1; cannam@86: 3) [number] = read \link{vorbis:spec:ilog}{ilog}([codebook\_entries] - [current\_entry]) bits as an unsigned integer cannam@86: 4) set the entries [current\_entry] through [current\_entry]+[number]-1, inclusive, cannam@86: of the [codebook\_codeword\_lengths] array to [current\_length] cannam@86: 5) set [current\_entry] to [number] + [current\_entry] cannam@86: 6) increment [current\_length] by 1 cannam@86: 7) if [current\_entry] is greater than [codebook\_entries] ERROR CONDITION; cannam@86: the decoder will not be able to read this stream. cannam@86: 8) if [current\_entry] is less than [codebook\_entries], repeat process starting at 3) cannam@86: 9) done. cannam@86: \end{Verbatim} cannam@86: cannam@86: \end{itemize} cannam@86: cannam@86: After all codeword lengths have been decoded, the decoder reads the cannam@86: vector lookup table. Vorbis I supports three lookup types: cannam@86: \begin{enumerate} cannam@86: \item cannam@86: No lookup cannam@86: \item cannam@86: Implicitly populated value mapping (lattice VQ) cannam@86: \item cannam@86: Explicitly populated value mapping (tessellated or 'foam' cannam@86: VQ) cannam@86: \end{enumerate} cannam@86: cannam@86: cannam@86: The lookup table type is read as a four bit unsigned integer: cannam@86: \begin{Verbatim}[commandchars=\\\{\}] cannam@86: 1) [codebook\_lookup\_type] = read four bits as an unsigned integer cannam@86: \end{Verbatim} cannam@86: cannam@86: Codebook decode precedes according to \varname{[codebook\_lookup\_type]}: cannam@86: \begin{itemize} cannam@86: \item cannam@86: Lookup type zero indicates no lookup to be read. Proceed past cannam@86: lookup decode. cannam@86: \item cannam@86: Lookup types one and two are similar, differing only in the cannam@86: number of lookup values to be read. Lookup type one reads a list of cannam@86: values that are permuted in a set pattern to build a list of vectors, cannam@86: each vector of order \varname{[codebook\_dimensions]} scalars. Lookup cannam@86: type two builds the same vector list, but reads each scalar for each cannam@86: vector explicitly, rather than building vectors from a smaller list of cannam@86: possible scalar values. Lookup decode proceeds as follows: cannam@86: cannam@86: \begin{Verbatim}[commandchars=\\\{\}] cannam@86: 1) [codebook\_minimum\_value] = \link{vorbis:spec:float32:unpack}{float32\_unpack}( read 32 bits as an unsigned integer) cannam@86: 2) [codebook\_delta\_value] = \link{vorbis:spec:float32:unpack}{float32\_unpack}( read 32 bits as an unsigned integer) cannam@86: 3) [codebook\_value\_bits] = read 4 bits as an unsigned integer and add 1 cannam@86: 4) [codebook\_sequence\_p] = read 1 bit as a boolean flag cannam@86: cannam@86: if ( [codebook\_lookup\_type] is 1 ) \{ cannam@86: cannam@86: 5) [codebook\_lookup\_values] = \link{vorbis:spec:lookup1:values}{lookup1\_values}(\varname{[codebook\_entries]}, \varname{[codebook\_dimensions]} ) cannam@86: cannam@86: \} else \{ cannam@86: cannam@86: 6) [codebook\_lookup\_values] = \varname{[codebook\_entries]} * \varname{[codebook\_dimensions]} cannam@86: cannam@86: \} cannam@86: cannam@86: 7) read a total of [codebook\_lookup\_values] unsigned integers of [codebook\_value\_bits] each; cannam@86: store these in order in the array [codebook\_multiplicands] cannam@86: \end{Verbatim} cannam@86: \item cannam@86: A \varname{[codebook\_lookup\_type]} of greater than two is reserved cannam@86: and indicates a stream that is not decodable by the specification in this cannam@86: document. cannam@86: cannam@86: \end{itemize} cannam@86: cannam@86: cannam@86: An 'end of packet' during any read operation in the above steps is cannam@86: considered an error condition rendering the stream undecodable. cannam@86: cannam@86: \paragraph{Huffman decision tree representation} cannam@86: cannam@86: The \varname{[codebook\_codeword\_lengths]} array and cannam@86: \varname{[codebook\_entries]} value uniquely define the Huffman decision cannam@86: tree used for entropy decoding. cannam@86: cannam@86: Briefly, each used codebook entry (recall that length-unordered cannam@86: codebooks support unused codeword entries) is assigned, in order, the cannam@86: lowest valued unused binary Huffman codeword possible. Assume the cannam@86: following codeword length list: cannam@86: cannam@86: \begin{Verbatim}[commandchars=\\\{\}] cannam@86: entry 0: length 2 cannam@86: entry 1: length 4 cannam@86: entry 2: length 4 cannam@86: entry 3: length 4 cannam@86: entry 4: length 4 cannam@86: entry 5: length 2 cannam@86: entry 6: length 3 cannam@86: entry 7: length 3 cannam@86: \end{Verbatim} cannam@86: cannam@86: Assigning codewords in order (lowest possible value of the appropriate cannam@86: length to highest) results in the following codeword list: cannam@86: cannam@86: \begin{Verbatim}[commandchars=\\\{\}] cannam@86: entry 0: length 2 codeword 00 cannam@86: entry 1: length 4 codeword 0100 cannam@86: entry 2: length 4 codeword 0101 cannam@86: entry 3: length 4 codeword 0110 cannam@86: entry 4: length 4 codeword 0111 cannam@86: entry 5: length 2 codeword 10 cannam@86: entry 6: length 3 codeword 110 cannam@86: entry 7: length 3 codeword 111 cannam@86: \end{Verbatim} cannam@86: cannam@86: cannam@86: \begin{note} cannam@86: Unlike most binary numerical values in this document, we cannam@86: intend the above codewords to be read and used bit by bit from left to cannam@86: right, thus the codeword '001' is the bit string 'zero, zero, one'. cannam@86: When determining 'lowest possible value' in the assignment definition cannam@86: above, the leftmost bit is the MSb. cannam@86: \end{note} cannam@86: cannam@86: It is clear that the codeword length list represents a Huffman cannam@86: decision tree with the entry numbers equivalent to the leaves numbered cannam@86: left-to-right: cannam@86: cannam@86: \begin{center} cannam@86: \includegraphics[width=10cm]{hufftree} cannam@86: \captionof{figure}{huffman tree illustration} cannam@86: \end{center} cannam@86: cannam@86: cannam@86: As we assign codewords in order, we see that each choice constructs a cannam@86: new leaf in the leftmost possible position. cannam@86: cannam@86: Note that it's possible to underspecify or overspecify a Huffman tree cannam@86: via the length list. In the above example, if codeword seven were cannam@86: eliminated, it's clear that the tree is unfinished: cannam@86: cannam@86: \begin{center} cannam@86: \includegraphics[width=10cm]{hufftree-under} cannam@86: \captionof{figure}{underspecified huffman tree illustration} cannam@86: \end{center} cannam@86: cannam@86: cannam@86: Similarly, in the original codebook, it's clear that the tree is fully cannam@86: populated and a ninth codeword is impossible. Both underspecified and cannam@86: overspecified trees are an error condition rendering the stream cannam@86: undecodable. Take special care that a codebook with a single used cannam@86: entry is handled properly; it consists of a single codework of zero cannam@86: bits and 'reading' a value out of such a codebook always returns the cannam@86: single used value and sinks zero bits. cannam@86: cannam@86: Codebook entries marked 'unused' are simply skipped in the assigning cannam@86: process. They have no codeword and do not appear in the decision cannam@86: tree, thus it's impossible for any bit pattern read from the stream to cannam@86: decode to that entry number. cannam@86: cannam@86: cannam@86: cannam@86: \paragraph{VQ lookup table vector representation} cannam@86: cannam@86: Unpacking the VQ lookup table vectors relies on the following values: cannam@86: \begin{programlisting} cannam@86: the [codebook\_multiplicands] array cannam@86: [codebook\_minimum\_value] cannam@86: [codebook\_delta\_value] cannam@86: [codebook\_sequence\_p] cannam@86: [codebook\_lookup\_type] cannam@86: [codebook\_entries] cannam@86: [codebook\_dimensions] cannam@86: [codebook\_lookup\_values] cannam@86: \end{programlisting} cannam@86: cannam@86: \bigskip cannam@86: cannam@86: Decoding (unpacking) a specific vector in the vector lookup table cannam@86: proceeds according to \varname{[codebook\_lookup\_type]}. The unpacked cannam@86: vector values are what a codebook would return during audio packet cannam@86: decode in a VQ context. cannam@86: cannam@86: \paragraph{Vector value decode: Lookup type 1} cannam@86: cannam@86: Lookup type one specifies a lattice VQ lookup table built cannam@86: algorithmically from a list of scalar values. Calculate (unpack) the cannam@86: final values of a codebook entry vector from the entries in cannam@86: \varname{[codebook\_multiplicands]} as follows (\varname{[value\_vector]} cannam@86: is the output vector representing the vector of values for entry number cannam@86: \varname{[lookup\_offset]} in this codebook): cannam@86: cannam@86: \begin{Verbatim}[commandchars=\\\{\}] cannam@86: 1) [last] = 0; cannam@86: 2) [index\_divisor] = 1; cannam@86: 3) iterate [i] over the range 0 ... [codebook\_dimensions]-1 (once for each scalar value in the value vector) \{ cannam@86: cannam@86: 4) [multiplicand\_offset] = ( [lookup\_offset] divided by [index\_divisor] using integer cannam@86: division ) integer modulo [codebook\_lookup\_values] cannam@86: cannam@86: 5) vector [value\_vector] element [i] = cannam@86: ( [codebook\_multiplicands] array element number [multiplicand\_offset] ) * cannam@86: [codebook\_delta\_value] + [codebook\_minimum\_value] + [last]; cannam@86: cannam@86: 6) if ( [codebook\_sequence\_p] is set ) then set [last] = vector [value\_vector] element [i] cannam@86: cannam@86: 7) [index\_divisor] = [index\_divisor] * [codebook\_lookup\_values] cannam@86: cannam@86: \} cannam@86: cannam@86: 8) vector calculation completed. cannam@86: \end{Verbatim} cannam@86: cannam@86: cannam@86: cannam@86: \paragraph{Vector value decode: Lookup type 2} cannam@86: cannam@86: Lookup type two specifies a VQ lookup table in which each scalar in cannam@86: each vector is explicitly set by the \varname{[codebook\_multiplicands]} cannam@86: array in a one-to-one mapping. Calculate [unpack] the cannam@86: final values of a codebook entry vector from the entries in cannam@86: \varname{[codebook\_multiplicands]} as follows (\varname{[value\_vector]} cannam@86: is the output vector representing the vector of values for entry number cannam@86: \varname{[lookup\_offset]} in this codebook): cannam@86: cannam@86: \begin{Verbatim}[commandchars=\\\{\}] cannam@86: 1) [last] = 0; cannam@86: 2) [multiplicand\_offset] = [lookup\_offset] * [codebook\_dimensions] cannam@86: 3) iterate [i] over the range 0 ... [codebook\_dimensions]-1 (once for each scalar value in the value vector) \{ cannam@86: cannam@86: 4) vector [value\_vector] element [i] = cannam@86: ( [codebook\_multiplicands] array element number [multiplicand\_offset] ) * cannam@86: [codebook\_delta\_value] + [codebook\_minimum\_value] + [last]; cannam@86: cannam@86: 5) if ( [codebook\_sequence\_p] is set ) then set [last] = vector [value\_vector] element [i] cannam@86: cannam@86: 6) increment [multiplicand\_offset] cannam@86: cannam@86: \} cannam@86: cannam@86: 7) vector calculation completed. cannam@86: \end{Verbatim} cannam@86: cannam@86: cannam@86: cannam@86: cannam@86: cannam@86: cannam@86: cannam@86: cannam@86: cannam@86: \subsection{Use of the codebook abstraction} cannam@86: cannam@86: The decoder uses the codebook abstraction much as it does the cannam@86: bit-unpacking convention; a specific codebook reads a cannam@86: codeword from the bitstream, decoding it into an entry number, and then cannam@86: returns that entry number to the decoder (when used in a scalar cannam@86: entropy coding context), or uses that entry number as an offset into cannam@86: the VQ lookup table, returning a vector of values (when used in a context cannam@86: desiring a VQ value). Scalar or VQ context is always explicit; any call cannam@86: to the codebook mechanism requests either a scalar entry number or a cannam@86: lookup vector. cannam@86: cannam@86: Note that VQ lookup type zero indicates that there is no lookup table; cannam@86: requesting decode using a codebook of lookup type 0 in any context cannam@86: expecting a vector return value (even in a case where a vector of cannam@86: dimension one) is forbidden. If decoder setup or decode requests such cannam@86: an action, that is an error condition rendering the packet cannam@86: undecodable. cannam@86: cannam@86: Using a codebook to read from the packet bitstream consists first of cannam@86: reading and decoding the next codeword in the bitstream. The decoder cannam@86: reads bits until the accumulated bits match a codeword in the cannam@86: codebook. This process can be though of as logically walking the cannam@86: Huffman decode tree by reading one bit at a time from the bitstream, cannam@86: and using the bit as a decision boolean to take the 0 branch (left in cannam@86: the above examples) or the 1 branch (right in the above examples). cannam@86: Walking the tree finishes when the decode process hits a leaf in the cannam@86: decision tree; the result is the entry number corresponding to that cannam@86: leaf. Reading past the end of a packet propagates the 'end-of-stream' cannam@86: condition to the decoder. cannam@86: cannam@86: When used in a scalar context, the resulting codeword entry is the cannam@86: desired return value. cannam@86: cannam@86: When used in a VQ context, the codeword entry number is used as an cannam@86: offset into the VQ lookup table. The value returned to the decoder is cannam@86: the vector of scalars corresponding to this offset.