annotate src/libvorbis-1.3.3/doc/03-codebook.tex @ 168:ceec0dd9ec9c

Replace these with versions built using an older toolset (so as to avoid ABI compatibilities when linking on Ubuntu 14.04 for packaging purposes)
author Chris Cannam <cannam@all-day-breakfast.com>
date Fri, 07 Feb 2020 11:51:13 +0000
parents 98c1576536ae
children
rev   line source
cannam@86 1 % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
cannam@86 2 %!TEX root = Vorbis_I_spec.tex
cannam@86 3 % $Id$
cannam@86 4 \section{Probability Model and Codebooks} \label{vorbis:spec:codebook}
cannam@86 5
cannam@86 6 \subsection{Overview}
cannam@86 7
cannam@86 8 Unlike practically every other mainstream audio codec, Vorbis has no
cannam@86 9 statically configured probability model, instead packing all entropy
cannam@86 10 decoding configuration, VQ and Huffman, into the bitstream itself in
cannam@86 11 the third header, the codec setup header. This packed configuration
cannam@86 12 consists of multiple 'codebooks', each containing a specific
cannam@86 13 Huffman-equivalent representation for decoding compressed codewords as
cannam@86 14 well as an optional lookup table of output vector values to which a
cannam@86 15 decoded Huffman value is applied as an offset, generating the final
cannam@86 16 decoded output corresponding to a given compressed codeword.
cannam@86 17
cannam@86 18 \subsubsection{Bitwise operation}
cannam@86 19 The codebook mechanism is built on top of the vorbis bitpacker. Both
cannam@86 20 the codebooks themselves and the codewords they decode are unrolled
cannam@86 21 from a packet as a series of arbitrary-width values read from the
cannam@86 22 stream according to \xref{vorbis:spec:bitpacking}.
cannam@86 23
cannam@86 24
cannam@86 25
cannam@86 26
cannam@86 27 \subsection{Packed codebook format}
cannam@86 28
cannam@86 29 For purposes of the examples below, we assume that the storage
cannam@86 30 system's native byte width is eight bits. This is not universally
cannam@86 31 true; see \xref{vorbis:spec:bitpacking} for discussion
cannam@86 32 relating to non-eight-bit bytes.
cannam@86 33
cannam@86 34 \subsubsection{codebook decode}
cannam@86 35
cannam@86 36 A codebook begins with a 24 bit sync pattern, 0x564342:
cannam@86 37
cannam@86 38 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86 39 byte 0: [ 0 1 0 0 0 0 1 0 ] (0x42)
cannam@86 40 byte 1: [ 0 1 0 0 0 0 1 1 ] (0x43)
cannam@86 41 byte 2: [ 0 1 0 1 0 1 1 0 ] (0x56)
cannam@86 42 \end{Verbatim}
cannam@86 43
cannam@86 44 16 bit \varname{[codebook\_dimensions]} and 24 bit \varname{[codebook\_entries]} fields:
cannam@86 45
cannam@86 46 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86 47
cannam@86 48 byte 3: [ X X X X X X X X ]
cannam@86 49 byte 4: [ X X X X X X X X ] [codebook\_dimensions] (16 bit unsigned)
cannam@86 50
cannam@86 51 byte 5: [ X X X X X X X X ]
cannam@86 52 byte 6: [ X X X X X X X X ]
cannam@86 53 byte 7: [ X X X X X X X X ] [codebook\_entries] (24 bit unsigned)
cannam@86 54
cannam@86 55 \end{Verbatim}
cannam@86 56
cannam@86 57 Next is the \varname{[ordered]} bit flag:
cannam@86 58
cannam@86 59 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86 60
cannam@86 61 byte 8: [ X ] [ordered] (1 bit)
cannam@86 62
cannam@86 63 \end{Verbatim}
cannam@86 64
cannam@86 65 Each entry, numbering a
cannam@86 66 total of \varname{[codebook\_entries]}, is assigned a codeword length.
cannam@86 67 We now read the list of codeword lengths and store these lengths in
cannam@86 68 the array \varname{[codebook\_codeword\_lengths]}. Decode of lengths is
cannam@86 69 according to whether the \varname{[ordered]} flag is set or unset.
cannam@86 70
cannam@86 71 \begin{itemize}
cannam@86 72 \item
cannam@86 73 If the \varname{[ordered]} flag is unset, the codeword list is not
cannam@86 74 length ordered and the decoder needs to read each codeword length
cannam@86 75 one-by-one.
cannam@86 76
cannam@86 77 The decoder first reads one additional bit flag, the
cannam@86 78 \varname{[sparse]} flag. This flag determines whether or not the
cannam@86 79 codebook contains unused entries that are not to be included in the
cannam@86 80 codeword decode tree:
cannam@86 81
cannam@86 82 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86 83 byte 8: [ X 1 ] [sparse] flag (1 bit)
cannam@86 84 \end{Verbatim}
cannam@86 85
cannam@86 86 The decoder now performs for each of the \varname{[codebook\_entries]}
cannam@86 87 codebook entries:
cannam@86 88
cannam@86 89 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86 90
cannam@86 91 1) if([sparse] is set) \{
cannam@86 92
cannam@86 93 2) [flag] = read one bit;
cannam@86 94 3) if([flag] is set) \{
cannam@86 95
cannam@86 96 4) [length] = read a five bit unsigned integer;
cannam@86 97 5) codeword length for this entry is [length]+1;
cannam@86 98
cannam@86 99 \} else \{
cannam@86 100
cannam@86 101 6) this entry is unused. mark it as such.
cannam@86 102
cannam@86 103 \}
cannam@86 104
cannam@86 105 \} else the sparse flag is not set \{
cannam@86 106
cannam@86 107 7) [length] = read a five bit unsigned integer;
cannam@86 108 8) the codeword length for this entry is [length]+1;
cannam@86 109
cannam@86 110 \}
cannam@86 111
cannam@86 112 \end{Verbatim}
cannam@86 113
cannam@86 114 \item
cannam@86 115 If the \varname{[ordered]} flag is set, the codeword list for this
cannam@86 116 codebook is encoded in ascending length order. Rather than reading
cannam@86 117 a length for every codeword, the encoder reads the number of
cannam@86 118 codewords per length. That is, beginning at entry zero:
cannam@86 119
cannam@86 120 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86 121 1) [current\_entry] = 0;
cannam@86 122 2) [current\_length] = read a five bit unsigned integer and add 1;
cannam@86 123 3) [number] = read \link{vorbis:spec:ilog}{ilog}([codebook\_entries] - [current\_entry]) bits as an unsigned integer
cannam@86 124 4) set the entries [current\_entry] through [current\_entry]+[number]-1, inclusive,
cannam@86 125 of the [codebook\_codeword\_lengths] array to [current\_length]
cannam@86 126 5) set [current\_entry] to [number] + [current\_entry]
cannam@86 127 6) increment [current\_length] by 1
cannam@86 128 7) if [current\_entry] is greater than [codebook\_entries] ERROR CONDITION;
cannam@86 129 the decoder will not be able to read this stream.
cannam@86 130 8) if [current\_entry] is less than [codebook\_entries], repeat process starting at 3)
cannam@86 131 9) done.
cannam@86 132 \end{Verbatim}
cannam@86 133
cannam@86 134 \end{itemize}
cannam@86 135
cannam@86 136 After all codeword lengths have been decoded, the decoder reads the
cannam@86 137 vector lookup table. Vorbis I supports three lookup types:
cannam@86 138 \begin{enumerate}
cannam@86 139 \item
cannam@86 140 No lookup
cannam@86 141 \item
cannam@86 142 Implicitly populated value mapping (lattice VQ)
cannam@86 143 \item
cannam@86 144 Explicitly populated value mapping (tessellated or 'foam'
cannam@86 145 VQ)
cannam@86 146 \end{enumerate}
cannam@86 147
cannam@86 148
cannam@86 149 The lookup table type is read as a four bit unsigned integer:
cannam@86 150 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86 151 1) [codebook\_lookup\_type] = read four bits as an unsigned integer
cannam@86 152 \end{Verbatim}
cannam@86 153
cannam@86 154 Codebook decode precedes according to \varname{[codebook\_lookup\_type]}:
cannam@86 155 \begin{itemize}
cannam@86 156 \item
cannam@86 157 Lookup type zero indicates no lookup to be read. Proceed past
cannam@86 158 lookup decode.
cannam@86 159 \item
cannam@86 160 Lookup types one and two are similar, differing only in the
cannam@86 161 number of lookup values to be read. Lookup type one reads a list of
cannam@86 162 values that are permuted in a set pattern to build a list of vectors,
cannam@86 163 each vector of order \varname{[codebook\_dimensions]} scalars. Lookup
cannam@86 164 type two builds the same vector list, but reads each scalar for each
cannam@86 165 vector explicitly, rather than building vectors from a smaller list of
cannam@86 166 possible scalar values. Lookup decode proceeds as follows:
cannam@86 167
cannam@86 168 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86 169 1) [codebook\_minimum\_value] = \link{vorbis:spec:float32:unpack}{float32\_unpack}( read 32 bits as an unsigned integer)
cannam@86 170 2) [codebook\_delta\_value] = \link{vorbis:spec:float32:unpack}{float32\_unpack}( read 32 bits as an unsigned integer)
cannam@86 171 3) [codebook\_value\_bits] = read 4 bits as an unsigned integer and add 1
cannam@86 172 4) [codebook\_sequence\_p] = read 1 bit as a boolean flag
cannam@86 173
cannam@86 174 if ( [codebook\_lookup\_type] is 1 ) \{
cannam@86 175
cannam@86 176 5) [codebook\_lookup\_values] = \link{vorbis:spec:lookup1:values}{lookup1\_values}(\varname{[codebook\_entries]}, \varname{[codebook\_dimensions]} )
cannam@86 177
cannam@86 178 \} else \{
cannam@86 179
cannam@86 180 6) [codebook\_lookup\_values] = \varname{[codebook\_entries]} * \varname{[codebook\_dimensions]}
cannam@86 181
cannam@86 182 \}
cannam@86 183
cannam@86 184 7) read a total of [codebook\_lookup\_values] unsigned integers of [codebook\_value\_bits] each;
cannam@86 185 store these in order in the array [codebook\_multiplicands]
cannam@86 186 \end{Verbatim}
cannam@86 187 \item
cannam@86 188 A \varname{[codebook\_lookup\_type]} of greater than two is reserved
cannam@86 189 and indicates a stream that is not decodable by the specification in this
cannam@86 190 document.
cannam@86 191
cannam@86 192 \end{itemize}
cannam@86 193
cannam@86 194
cannam@86 195 An 'end of packet' during any read operation in the above steps is
cannam@86 196 considered an error condition rendering the stream undecodable.
cannam@86 197
cannam@86 198 \paragraph{Huffman decision tree representation}
cannam@86 199
cannam@86 200 The \varname{[codebook\_codeword\_lengths]} array and
cannam@86 201 \varname{[codebook\_entries]} value uniquely define the Huffman decision
cannam@86 202 tree used for entropy decoding.
cannam@86 203
cannam@86 204 Briefly, each used codebook entry (recall that length-unordered
cannam@86 205 codebooks support unused codeword entries) is assigned, in order, the
cannam@86 206 lowest valued unused binary Huffman codeword possible. Assume the
cannam@86 207 following codeword length list:
cannam@86 208
cannam@86 209 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86 210 entry 0: length 2
cannam@86 211 entry 1: length 4
cannam@86 212 entry 2: length 4
cannam@86 213 entry 3: length 4
cannam@86 214 entry 4: length 4
cannam@86 215 entry 5: length 2
cannam@86 216 entry 6: length 3
cannam@86 217 entry 7: length 3
cannam@86 218 \end{Verbatim}
cannam@86 219
cannam@86 220 Assigning codewords in order (lowest possible value of the appropriate
cannam@86 221 length to highest) results in the following codeword list:
cannam@86 222
cannam@86 223 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86 224 entry 0: length 2 codeword 00
cannam@86 225 entry 1: length 4 codeword 0100
cannam@86 226 entry 2: length 4 codeword 0101
cannam@86 227 entry 3: length 4 codeword 0110
cannam@86 228 entry 4: length 4 codeword 0111
cannam@86 229 entry 5: length 2 codeword 10
cannam@86 230 entry 6: length 3 codeword 110
cannam@86 231 entry 7: length 3 codeword 111
cannam@86 232 \end{Verbatim}
cannam@86 233
cannam@86 234
cannam@86 235 \begin{note}
cannam@86 236 Unlike most binary numerical values in this document, we
cannam@86 237 intend the above codewords to be read and used bit by bit from left to
cannam@86 238 right, thus the codeword '001' is the bit string 'zero, zero, one'.
cannam@86 239 When determining 'lowest possible value' in the assignment definition
cannam@86 240 above, the leftmost bit is the MSb.
cannam@86 241 \end{note}
cannam@86 242
cannam@86 243 It is clear that the codeword length list represents a Huffman
cannam@86 244 decision tree with the entry numbers equivalent to the leaves numbered
cannam@86 245 left-to-right:
cannam@86 246
cannam@86 247 \begin{center}
cannam@86 248 \includegraphics[width=10cm]{hufftree}
cannam@86 249 \captionof{figure}{huffman tree illustration}
cannam@86 250 \end{center}
cannam@86 251
cannam@86 252
cannam@86 253 As we assign codewords in order, we see that each choice constructs a
cannam@86 254 new leaf in the leftmost possible position.
cannam@86 255
cannam@86 256 Note that it's possible to underspecify or overspecify a Huffman tree
cannam@86 257 via the length list. In the above example, if codeword seven were
cannam@86 258 eliminated, it's clear that the tree is unfinished:
cannam@86 259
cannam@86 260 \begin{center}
cannam@86 261 \includegraphics[width=10cm]{hufftree-under}
cannam@86 262 \captionof{figure}{underspecified huffman tree illustration}
cannam@86 263 \end{center}
cannam@86 264
cannam@86 265
cannam@86 266 Similarly, in the original codebook, it's clear that the tree is fully
cannam@86 267 populated and a ninth codeword is impossible. Both underspecified and
cannam@86 268 overspecified trees are an error condition rendering the stream
cannam@86 269 undecodable. Take special care that a codebook with a single used
cannam@86 270 entry is handled properly; it consists of a single codework of zero
cannam@86 271 bits and 'reading' a value out of such a codebook always returns the
cannam@86 272 single used value and sinks zero bits.
cannam@86 273
cannam@86 274 Codebook entries marked 'unused' are simply skipped in the assigning
cannam@86 275 process. They have no codeword and do not appear in the decision
cannam@86 276 tree, thus it's impossible for any bit pattern read from the stream to
cannam@86 277 decode to that entry number.
cannam@86 278
cannam@86 279
cannam@86 280
cannam@86 281 \paragraph{VQ lookup table vector representation}
cannam@86 282
cannam@86 283 Unpacking the VQ lookup table vectors relies on the following values:
cannam@86 284 \begin{programlisting}
cannam@86 285 the [codebook\_multiplicands] array
cannam@86 286 [codebook\_minimum\_value]
cannam@86 287 [codebook\_delta\_value]
cannam@86 288 [codebook\_sequence\_p]
cannam@86 289 [codebook\_lookup\_type]
cannam@86 290 [codebook\_entries]
cannam@86 291 [codebook\_dimensions]
cannam@86 292 [codebook\_lookup\_values]
cannam@86 293 \end{programlisting}
cannam@86 294
cannam@86 295 \bigskip
cannam@86 296
cannam@86 297 Decoding (unpacking) a specific vector in the vector lookup table
cannam@86 298 proceeds according to \varname{[codebook\_lookup\_type]}. The unpacked
cannam@86 299 vector values are what a codebook would return during audio packet
cannam@86 300 decode in a VQ context.
cannam@86 301
cannam@86 302 \paragraph{Vector value decode: Lookup type 1}
cannam@86 303
cannam@86 304 Lookup type one specifies a lattice VQ lookup table built
cannam@86 305 algorithmically from a list of scalar values. Calculate (unpack) the
cannam@86 306 final values of a codebook entry vector from the entries in
cannam@86 307 \varname{[codebook\_multiplicands]} as follows (\varname{[value\_vector]}
cannam@86 308 is the output vector representing the vector of values for entry number
cannam@86 309 \varname{[lookup\_offset]} in this codebook):
cannam@86 310
cannam@86 311 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86 312 1) [last] = 0;
cannam@86 313 2) [index\_divisor] = 1;
cannam@86 314 3) iterate [i] over the range 0 ... [codebook\_dimensions]-1 (once for each scalar value in the value vector) \{
cannam@86 315
cannam@86 316 4) [multiplicand\_offset] = ( [lookup\_offset] divided by [index\_divisor] using integer
cannam@86 317 division ) integer modulo [codebook\_lookup\_values]
cannam@86 318
cannam@86 319 5) vector [value\_vector] element [i] =
cannam@86 320 ( [codebook\_multiplicands] array element number [multiplicand\_offset] ) *
cannam@86 321 [codebook\_delta\_value] + [codebook\_minimum\_value] + [last];
cannam@86 322
cannam@86 323 6) if ( [codebook\_sequence\_p] is set ) then set [last] = vector [value\_vector] element [i]
cannam@86 324
cannam@86 325 7) [index\_divisor] = [index\_divisor] * [codebook\_lookup\_values]
cannam@86 326
cannam@86 327 \}
cannam@86 328
cannam@86 329 8) vector calculation completed.
cannam@86 330 \end{Verbatim}
cannam@86 331
cannam@86 332
cannam@86 333
cannam@86 334 \paragraph{Vector value decode: Lookup type 2}
cannam@86 335
cannam@86 336 Lookup type two specifies a VQ lookup table in which each scalar in
cannam@86 337 each vector is explicitly set by the \varname{[codebook\_multiplicands]}
cannam@86 338 array in a one-to-one mapping. Calculate [unpack] the
cannam@86 339 final values of a codebook entry vector from the entries in
cannam@86 340 \varname{[codebook\_multiplicands]} as follows (\varname{[value\_vector]}
cannam@86 341 is the output vector representing the vector of values for entry number
cannam@86 342 \varname{[lookup\_offset]} in this codebook):
cannam@86 343
cannam@86 344 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86 345 1) [last] = 0;
cannam@86 346 2) [multiplicand\_offset] = [lookup\_offset] * [codebook\_dimensions]
cannam@86 347 3) iterate [i] over the range 0 ... [codebook\_dimensions]-1 (once for each scalar value in the value vector) \{
cannam@86 348
cannam@86 349 4) vector [value\_vector] element [i] =
cannam@86 350 ( [codebook\_multiplicands] array element number [multiplicand\_offset] ) *
cannam@86 351 [codebook\_delta\_value] + [codebook\_minimum\_value] + [last];
cannam@86 352
cannam@86 353 5) if ( [codebook\_sequence\_p] is set ) then set [last] = vector [value\_vector] element [i]
cannam@86 354
cannam@86 355 6) increment [multiplicand\_offset]
cannam@86 356
cannam@86 357 \}
cannam@86 358
cannam@86 359 7) vector calculation completed.
cannam@86 360 \end{Verbatim}
cannam@86 361
cannam@86 362
cannam@86 363
cannam@86 364
cannam@86 365
cannam@86 366
cannam@86 367
cannam@86 368
cannam@86 369
cannam@86 370 \subsection{Use of the codebook abstraction}
cannam@86 371
cannam@86 372 The decoder uses the codebook abstraction much as it does the
cannam@86 373 bit-unpacking convention; a specific codebook reads a
cannam@86 374 codeword from the bitstream, decoding it into an entry number, and then
cannam@86 375 returns that entry number to the decoder (when used in a scalar
cannam@86 376 entropy coding context), or uses that entry number as an offset into
cannam@86 377 the VQ lookup table, returning a vector of values (when used in a context
cannam@86 378 desiring a VQ value). Scalar or VQ context is always explicit; any call
cannam@86 379 to the codebook mechanism requests either a scalar entry number or a
cannam@86 380 lookup vector.
cannam@86 381
cannam@86 382 Note that VQ lookup type zero indicates that there is no lookup table;
cannam@86 383 requesting decode using a codebook of lookup type 0 in any context
cannam@86 384 expecting a vector return value (even in a case where a vector of
cannam@86 385 dimension one) is forbidden. If decoder setup or decode requests such
cannam@86 386 an action, that is an error condition rendering the packet
cannam@86 387 undecodable.
cannam@86 388
cannam@86 389 Using a codebook to read from the packet bitstream consists first of
cannam@86 390 reading and decoding the next codeword in the bitstream. The decoder
cannam@86 391 reads bits until the accumulated bits match a codeword in the
cannam@86 392 codebook. This process can be though of as logically walking the
cannam@86 393 Huffman decode tree by reading one bit at a time from the bitstream,
cannam@86 394 and using the bit as a decision boolean to take the 0 branch (left in
cannam@86 395 the above examples) or the 1 branch (right in the above examples).
cannam@86 396 Walking the tree finishes when the decode process hits a leaf in the
cannam@86 397 decision tree; the result is the entry number corresponding to that
cannam@86 398 leaf. Reading past the end of a packet propagates the 'end-of-stream'
cannam@86 399 condition to the decoder.
cannam@86 400
cannam@86 401 When used in a scalar context, the resulting codeword entry is the
cannam@86 402 desired return value.
cannam@86 403
cannam@86 404 When used in a VQ context, the codeword entry number is used as an
cannam@86 405 offset into the VQ lookup table. The value returned to the decoder is
cannam@86 406 the vector of scalars corresponding to this offset.