annotate src/libvorbis-1.3.3/doc/08-residue.tex @ 83:ae30d91d2ffe

Replace these with versions built using an older toolset (so as to avoid ABI compatibilities when linking on Ubuntu 14.04 for packaging purposes)
author Chris Cannam
date Fri, 07 Feb 2020 11:51:13 +0000
parents 05aa0afa9217
children
rev   line source
Chris@1 1 % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
Chris@1 2 %!TEX root = Vorbis_I_spec.tex
Chris@1 3 % $Id$
Chris@1 4 \section{Residue setup and decode} \label{vorbis:spec:residue}
Chris@1 5
Chris@1 6 \subsection{Overview}
Chris@1 7
Chris@1 8 A residue vector represents the fine detail of the audio spectrum of
Chris@1 9 one channel in an audio frame after the encoder subtracts the floor
Chris@1 10 curve and performs any channel coupling. A residue vector may
Chris@1 11 represent spectral lines, spectral magnitude, spectral phase or
Chris@1 12 hybrids as mixed by channel coupling. The exact semantic content of
Chris@1 13 the vector does not matter to the residue abstraction.
Chris@1 14
Chris@1 15 Whatever the exact qualities, the Vorbis residue abstraction codes the
Chris@1 16 residue vectors into the bitstream packet, and then reconstructs the
Chris@1 17 vectors during decode. Vorbis makes use of three different encoding
Chris@1 18 variants (numbered 0, 1 and 2) of the same basic vector encoding
Chris@1 19 abstraction.
Chris@1 20
Chris@1 21
Chris@1 22
Chris@1 23 \subsection{Residue format}
Chris@1 24
Chris@1 25 Residue format partitions each vector in the vector bundle into chunks,
Chris@1 26 classifies each chunk, encodes the chunk classifications and finally
Chris@1 27 encodes the chunks themselves using the the specific VQ arrangement
Chris@1 28 defined for each selected classification.
Chris@1 29 The exact interleaving and partitioning vary by residue encoding number,
Chris@1 30 however the high-level process used to classify and encode the residue
Chris@1 31 vector is the same in all three variants.
Chris@1 32
Chris@1 33 A set of coded residue vectors are all of the same length. High level
Chris@1 34 coding structure, ignoring for the moment exactly how a partition is
Chris@1 35 encoded and simply trusting that it is, is as follows:
Chris@1 36
Chris@1 37 \begin{itemize}
Chris@1 38 \item Each vector is partitioned into multiple equal sized chunks
Chris@1 39 according to configuration specified. If we have a vector size of
Chris@1 40 \emph{n}, a partition size \emph{residue\_partition\_size}, and a total
Chris@1 41 of \emph{ch} residue vectors, the total number of partitioned chunks
Chris@1 42 coded is \emph{n}/\emph{residue\_partition\_size}*\emph{ch}. It is
Chris@1 43 important to note that the integer division truncates. In the below
Chris@1 44 example, we assume an example \emph{residue\_partition\_size} of 8.
Chris@1 45
Chris@1 46 \item Each partition in each vector has a classification number that
Chris@1 47 specifies which of multiple configured VQ codebook setups are used to
Chris@1 48 decode that partition. The classification numbers of each partition
Chris@1 49 can be thought of as forming a vector in their own right, as in the
Chris@1 50 illustration below. Just as the residue vectors are coded in grouped
Chris@1 51 partitions to increase encoding efficiency, the classification vector
Chris@1 52 is also partitioned into chunks. The integer elements of each scalar
Chris@1 53 in a classification chunk are built into a single scalar that
Chris@1 54 represents the classification numbers in that chunk. In the below
Chris@1 55 example, the classification codeword encodes two classification
Chris@1 56 numbers.
Chris@1 57
Chris@1 58 \item The values in a residue vector may be encoded monolithically in a
Chris@1 59 single pass through the residue vector, but more often efficient
Chris@1 60 codebook design dictates that each vector is encoded as the additive
Chris@1 61 sum of several passes through the residue vector using more than one
Chris@1 62 VQ codebook. Thus, each residue value potentially accumulates values
Chris@1 63 from multiple decode passes. The classification value associated with
Chris@1 64 a partition is the same in each pass, thus the classification codeword
Chris@1 65 is coded only in the first pass.
Chris@1 66
Chris@1 67 \end{itemize}
Chris@1 68
Chris@1 69
Chris@1 70 \begin{center}
Chris@1 71 \includegraphics[width=\textwidth]{residue-pack}
Chris@1 72 \captionof{figure}{illustration of residue vector format}
Chris@1 73 \end{center}
Chris@1 74
Chris@1 75
Chris@1 76
Chris@1 77 \subsection{residue 0}
Chris@1 78
Chris@1 79 Residue 0 and 1 differ only in the way the values within a residue
Chris@1 80 partition are interleaved during partition encoding (visually treated
Chris@1 81 as a black box--or cyan box or brown box--in the above figure).
Chris@1 82
Chris@1 83 Residue encoding 0 interleaves VQ encoding according to the
Chris@1 84 dimension of the codebook used to encode a partition in a specific
Chris@1 85 pass. The dimension of the codebook need not be the same in multiple
Chris@1 86 passes, however the partition size must be an even multiple of the
Chris@1 87 codebook dimension.
Chris@1 88
Chris@1 89 As an example, assume a partition vector of size eight, to be encoded
Chris@1 90 by residue 0 using codebook sizes of 8, 4, 2 and 1:
Chris@1 91
Chris@1 92 \begin{programlisting}
Chris@1 93
Chris@1 94 original residue vector: [ 0 1 2 3 4 5 6 7 ]
Chris@1 95
Chris@1 96 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ]
Chris@1 97
Chris@1 98 codebook dimensions = 4 encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ]
Chris@1 99
Chris@1 100 codebook dimensions = 2 encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ]
Chris@1 101
Chris@1 102 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
Chris@1 103
Chris@1 104 \end{programlisting}
Chris@1 105
Chris@1 106 It is worth mentioning at this point that no configurable value in the
Chris@1 107 residue coding setup is restricted to a power of two.
Chris@1 108
Chris@1 109
Chris@1 110
Chris@1 111 \subsection{residue 1}
Chris@1 112
Chris@1 113 Residue 1 does not interleave VQ encoding. It represents partition
Chris@1 114 vector scalars in order. As with residue 0, however, partition length
Chris@1 115 must be an integer multiple of the codebook dimension, although
Chris@1 116 dimension may vary from pass to pass.
Chris@1 117
Chris@1 118 As an example, assume a partition vector of size eight, to be encoded
Chris@1 119 by residue 0 using codebook sizes of 8, 4, 2 and 1:
Chris@1 120
Chris@1 121 \begin{programlisting}
Chris@1 122
Chris@1 123 original residue vector: [ 0 1 2 3 4 5 6 7 ]
Chris@1 124
Chris@1 125 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ]
Chris@1 126
Chris@1 127 codebook dimensions = 4 encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ]
Chris@1 128
Chris@1 129 codebook dimensions = 2 encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ]
Chris@1 130
Chris@1 131 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
Chris@1 132
Chris@1 133 \end{programlisting}
Chris@1 134
Chris@1 135
Chris@1 136
Chris@1 137 \subsection{residue 2}
Chris@1 138
Chris@1 139 Residue type two can be thought of as a variant of residue type 1.
Chris@1 140 Rather than encoding multiple passed-in vectors as in residue type 1,
Chris@1 141 the \emph{ch} passed in vectors of length \emph{n} are first
Chris@1 142 interleaved and flattened into a single vector of length
Chris@1 143 \emph{ch}*\emph{n}. Encoding then proceeds as in type 1. Decoding is
Chris@1 144 as in type 1 with decode interleave reversed. If operating on a single
Chris@1 145 vector to begin with, residue type 1 and type 2 are equivalent.
Chris@1 146
Chris@1 147 \begin{center}
Chris@1 148 \includegraphics[width=\textwidth]{residue2}
Chris@1 149 \captionof{figure}{illustration of residue type 2}
Chris@1 150 \end{center}
Chris@1 151
Chris@1 152
Chris@1 153 \subsection{Residue decode}
Chris@1 154
Chris@1 155 \subsubsection{header decode}
Chris@1 156
Chris@1 157 Header decode for all three residue types is identical.
Chris@1 158 \begin{programlisting}
Chris@1 159 1) [residue\_begin] = read 24 bits as unsigned integer
Chris@1 160 2) [residue\_end] = read 24 bits as unsigned integer
Chris@1 161 3) [residue\_partition\_size] = read 24 bits as unsigned integer and add one
Chris@1 162 4) [residue\_classifications] = read 6 bits as unsigned integer and add one
Chris@1 163 5) [residue\_classbook] = read 8 bits as unsigned integer
Chris@1 164 \end{programlisting}
Chris@1 165
Chris@1 166 \varname{[residue\_begin]} and
Chris@1 167 \varname{[residue\_end]} select the specific sub-portion of
Chris@1 168 each vector that is actually coded; it implements akin to a bandpass
Chris@1 169 where, for coding purposes, the vector effectively begins at element
Chris@1 170 \varname{[residue\_begin]} and ends at
Chris@1 171 \varname{[residue\_end]}. Preceding and following values in
Chris@1 172 the unpacked vectors are zeroed. Note that for residue type 2, these
Chris@1 173 values as well as \varname{[residue\_partition\_size]}apply to
Chris@1 174 the interleaved vector, not the individual vectors before interleave.
Chris@1 175 \varname{[residue\_partition\_size]} is as explained above,
Chris@1 176 \varname{[residue\_classifications]} is the number of possible
Chris@1 177 classification to which a partition can belong and
Chris@1 178 \varname{[residue\_classbook]} is the codebook number used to
Chris@1 179 code classification codewords. The number of dimensions in book
Chris@1 180 \varname{[residue\_classbook]} determines how many
Chris@1 181 classification values are grouped into a single classification
Chris@1 182 codeword. Note that the number of entries and dimensions in book
Chris@1 183 \varname{[residue\_classbook]}, along with
Chris@1 184 \varname{[residue\_classifications]}, overdetermines to
Chris@1 185 possible number of classification codewords.
Chris@1 186 If \varname{[residue\_classifications]}\^{}\varname{[residue\_classbook]}.dimensions
Chris@1 187 exceeds \varname{[residue\_classbook]}.entries, the
Chris@1 188 bitstream should be regarded to be undecodable.
Chris@1 189
Chris@1 190 Next we read a bitmap pattern that specifies which partition classes
Chris@1 191 code values in which passes.
Chris@1 192
Chris@1 193 \begin{programlisting}
Chris@1 194 1) iterate [i] over the range 0 ... [residue\_classifications]-1 {
Chris@1 195
Chris@1 196 2) [high\_bits] = 0
Chris@1 197 3) [low\_bits] = read 3 bits as unsigned integer
Chris@1 198 4) [bitflag] = read one bit as boolean
Chris@1 199 5) if ( [bitflag] is set ) then [high\_bits] = read five bits as unsigned integer
Chris@1 200 6) vector [residue\_cascade] element [i] = [high\_bits] * 8 + [low\_bits]
Chris@1 201 }
Chris@1 202 7) done
Chris@1 203 \end{programlisting}
Chris@1 204
Chris@1 205 Finally, we read in a list of book numbers, each corresponding to
Chris@1 206 specific bit set in the cascade bitmap. We loop over the possible
Chris@1 207 codebook classifications and the maximum possible number of encoding
Chris@1 208 stages (8 in Vorbis I, as constrained by the elements of the cascade
Chris@1 209 bitmap being eight bits):
Chris@1 210
Chris@1 211 \begin{programlisting}
Chris@1 212 1) iterate [i] over the range 0 ... [residue\_classifications]-1 {
Chris@1 213
Chris@1 214 2) iterate [j] over the range 0 ... 7 {
Chris@1 215
Chris@1 216 3) if ( vector [residue\_cascade] element [i] bit [j] is set ) {
Chris@1 217
Chris@1 218 4) array [residue\_books] element [i][j] = read 8 bits as unsigned integer
Chris@1 219
Chris@1 220 } else {
Chris@1 221
Chris@1 222 5) array [residue\_books] element [i][j] = unused
Chris@1 223
Chris@1 224 }
Chris@1 225 }
Chris@1 226 }
Chris@1 227
Chris@1 228 6) done
Chris@1 229 \end{programlisting}
Chris@1 230
Chris@1 231 An end-of-packet condition at any point in header decode renders the
Chris@1 232 stream undecodable. In addition, any codebook number greater than the
Chris@1 233 maximum numbered codebook set up in this stream also renders the
Chris@1 234 stream undecodable. All codebooks in array [residue\_books] are
Chris@1 235 required to have a value mapping. The presence of codebook in array
Chris@1 236 [residue\_books] without a value mapping (maptype equals zero) renders
Chris@1 237 the stream undecodable.
Chris@1 238
Chris@1 239
Chris@1 240
Chris@1 241 \subsubsection{packet decode}
Chris@1 242
Chris@1 243 Format 0 and 1 packet decode is identical except for specific
Chris@1 244 partition interleave. Format 2 packet decode can be built out of the
Chris@1 245 format 1 decode process. Thus we describe first the decode
Chris@1 246 infrastructure identical to all three formats.
Chris@1 247
Chris@1 248 In addition to configuration information, the residue decode process
Chris@1 249 is passed the number of vectors in the submap bundle and a vector of
Chris@1 250 flags indicating if any of the vectors are not to be decoded. If the
Chris@1 251 passed in number of vectors is 3 and vector number 1 is marked 'do not
Chris@1 252 decode', decode skips vector 1 during the decode loop. However, even
Chris@1 253 'do not decode' vectors are allocated and zeroed.
Chris@1 254
Chris@1 255 Depending on the values of \varname{[residue\_begin]} and
Chris@1 256 \varname{[residue\_end]}, it is obvious that the encoded
Chris@1 257 portion of a residue vector may be the entire possible residue vector
Chris@1 258 or some other strict subset of the actual residue vector size with
Chris@1 259 zero padding at either uncoded end. However, it is also possible to
Chris@1 260 set \varname{[residue\_begin]} and
Chris@1 261 \varname{[residue\_end]} to specify a range partially or
Chris@1 262 wholly beyond the maximum vector size. Before beginning residue
Chris@1 263 decode, limit \varname{[residue\_begin]} and
Chris@1 264 \varname{[residue\_end]} to the maximum possible vector size
Chris@1 265 as follows. We assume that the number of vectors being encoded,
Chris@1 266 \varname{[ch]} is provided by the higher level decoding
Chris@1 267 process.
Chris@1 268
Chris@1 269 \begin{programlisting}
Chris@1 270 1) [actual\_size] = current blocksize/2;
Chris@1 271 2) if residue encoding is format 2
Chris@1 272 3) [actual\_size] = [actual\_size] * [ch];
Chris@1 273 4) [limit\_residue\_begin] = maximum of ([residue\_begin],[actual\_size]);
Chris@1 274 5) [limit\_residue\_end] = maximum of ([residue\_end],[actual\_size]);
Chris@1 275 \end{programlisting}
Chris@1 276
Chris@1 277 The following convenience values are conceptually useful to clarifying
Chris@1 278 the decode process:
Chris@1 279
Chris@1 280 \begin{programlisting}
Chris@1 281 1) [classwords\_per\_codeword] = [codebook\_dimensions] value of codebook [residue\_classbook]
Chris@1 282 2) [n\_to\_read] = [limit\_residue\_end] - [limit\_residue\_begin]
Chris@1 283 3) [partitions\_to\_read] = [n\_to\_read] / [residue\_partition\_size]
Chris@1 284 \end{programlisting}
Chris@1 285
Chris@1 286 Packet decode proceeds as follows, matching the description offered earlier in the document.
Chris@1 287 \begin{programlisting}
Chris@1 288 1) allocate and zero all vectors that will be returned.
Chris@1 289 2) if ([n\_to\_read] is zero), stop; there is no residue to decode.
Chris@1 290 3) iterate [pass] over the range 0 ... 7 {
Chris@1 291
Chris@1 292 4) [partition\_count] = 0
Chris@1 293
Chris@1 294 5) while [partition\_count] is less than [partitions\_to\_read]
Chris@1 295
Chris@1 296 6) if ([pass] is zero) {
Chris@1 297
Chris@1 298 7) iterate [j] over the range 0 .. [ch]-1 {
Chris@1 299
Chris@1 300 8) if vector [j] is not marked 'do not decode' {
Chris@1 301
Chris@1 302 9) [temp] = read from packet using codebook [residue\_classbook] in scalar context
Chris@1 303 10) iterate [i] descending over the range [classwords\_per\_codeword]-1 ... 0 {
Chris@1 304
Chris@1 305 11) array [classifications] element [j],([i]+[partition\_count]) =
Chris@1 306 [temp] integer modulo [residue\_classifications]
Chris@1 307 12) [temp] = [temp] / [residue\_classifications] using integer division
Chris@1 308
Chris@1 309 }
Chris@1 310
Chris@1 311 }
Chris@1 312
Chris@1 313 }
Chris@1 314
Chris@1 315 }
Chris@1 316
Chris@1 317 13) iterate [i] over the range 0 .. ([classwords\_per\_codeword] - 1) while [partition\_count]
Chris@1 318 is also less than [partitions\_to\_read] {
Chris@1 319
Chris@1 320 14) iterate [j] over the range 0 .. [ch]-1 {
Chris@1 321
Chris@1 322 15) if vector [j] is not marked 'do not decode' {
Chris@1 323
Chris@1 324 16) [vqclass] = array [classifications] element [j],[partition\_count]
Chris@1 325 17) [vqbook] = array [residue\_books] element [vqclass],[pass]
Chris@1 326 18) if ([vqbook] is not 'unused') {
Chris@1 327
Chris@1 328 19) decode partition into output vector number [j], starting at scalar
Chris@1 329 offset [limit\_residue\_begin]+[partition\_count]*[residue\_partition\_size] using
Chris@1 330 codebook number [vqbook] in VQ context
Chris@1 331 }
Chris@1 332 }
Chris@1 333
Chris@1 334 20) increment [partition\_count] by one
Chris@1 335
Chris@1 336 }
Chris@1 337 }
Chris@1 338 }
Chris@1 339
Chris@1 340 21) done
Chris@1 341
Chris@1 342 \end{programlisting}
Chris@1 343
Chris@1 344 An end-of-packet condition during packet decode is to be considered a
Chris@1 345 nominal occurrence. Decode returns the result of vector decode up to
Chris@1 346 that point.
Chris@1 347
Chris@1 348
Chris@1 349
Chris@1 350 \subsubsection{format 0 specifics}
Chris@1 351
Chris@1 352 Format zero decodes partitions exactly as described earlier in the
Chris@1 353 'Residue Format: residue 0' section. The following pseudocode
Chris@1 354 presents the same algorithm. Assume:
Chris@1 355
Chris@1 356 \begin{itemize}
Chris@1 357 \item \varname{[n]} is the value in \varname{[residue\_partition\_size]}
Chris@1 358 \item \varname{[v]} is the residue vector
Chris@1 359 \item \varname{[offset]} is the beginning read offset in [v]
Chris@1 360 \end{itemize}
Chris@1 361
Chris@1 362
Chris@1 363 \begin{programlisting}
Chris@1 364 1) [step] = [n] / [codebook\_dimensions]
Chris@1 365 2) iterate [i] over the range 0 ... [step]-1 {
Chris@1 366
Chris@1 367 3) vector [entry\_temp] = read vector from packet using current codebook in VQ context
Chris@1 368 4) iterate [j] over the range 0 ... [codebook\_dimensions]-1 {
Chris@1 369
Chris@1 370 5) vector [v] element ([offset]+[i]+[j]*[step]) =
Chris@1 371 vector [v] element ([offset]+[i]+[j]*[step]) +
Chris@1 372 vector [entry\_temp] element [j]
Chris@1 373
Chris@1 374 }
Chris@1 375
Chris@1 376 }
Chris@1 377
Chris@1 378 6) done
Chris@1 379
Chris@1 380 \end{programlisting}
Chris@1 381
Chris@1 382
Chris@1 383
Chris@1 384 \subsubsection{format 1 specifics}
Chris@1 385
Chris@1 386 Format 1 decodes partitions exactly as described earlier in the
Chris@1 387 'Residue Format: residue 1' section. The following pseudocode
Chris@1 388 presents the same algorithm. Assume:
Chris@1 389
Chris@1 390 \begin{itemize}
Chris@1 391 \item \varname{[n]} is the value in
Chris@1 392 \varname{[residue\_partition\_size]}
Chris@1 393 \item \varname{[v]} is the residue vector
Chris@1 394 \item \varname{[offset]} is the beginning read offset in [v]
Chris@1 395 \end{itemize}
Chris@1 396
Chris@1 397
Chris@1 398 \begin{programlisting}
Chris@1 399 1) [i] = 0
Chris@1 400 2) vector [entry\_temp] = read vector from packet using current codebook in VQ context
Chris@1 401 3) iterate [j] over the range 0 ... [codebook\_dimensions]-1 {
Chris@1 402
Chris@1 403 4) vector [v] element ([offset]+[i]) =
Chris@1 404 vector [v] element ([offset]+[i]) +
Chris@1 405 vector [entry\_temp] element [j]
Chris@1 406 5) increment [i]
Chris@1 407
Chris@1 408 }
Chris@1 409
Chris@1 410 6) if ( [i] is less than [n] ) continue at step 2
Chris@1 411 7) done
Chris@1 412 \end{programlisting}
Chris@1 413
Chris@1 414
Chris@1 415
Chris@1 416 \subsubsection{format 2 specifics}
Chris@1 417
Chris@1 418 Format 2 is reducible to format 1. It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode.
Chris@1 419
Chris@1 420
Chris@1 421 Format 2 handles 'do not decode' vectors differently than residue 0 or
Chris@1 422 1; if all vectors are marked 'do not decode', no decode occurrs.
Chris@1 423 However, if at least one vector is to be decoded, all the vectors are
Chris@1 424 decoded. We then request normal format 1 to decode a single vector
Chris@1 425 representing all output channels, rather than a vector for each
Chris@1 426 channel. After decode, deinterleave the vector into independent vectors, one for each output channel. That is:
Chris@1 427
Chris@1 428 \begin{enumerate}
Chris@1 429 \item If all vectors 0 through \emph{ch}-1 are marked 'do not decode', allocate and clear a single vector \varname{[v]}of length \emph{ch*n} and skip step 2 below; proceed directly to the post-decode step.
Chris@1 430 \item Rather than performing format 1 decode to produce \emph{ch} vectors of length \emph{n} each, call format 1 decode to produce a single vector \varname{[v]} of length \emph{ch*n}.
Chris@1 431 \item Post decode: Deinterleave the single vector \varname{[v]} returned by format 1 decode as described above into \emph{ch} independent vectors, one for each outputchannel, according to:
Chris@1 432 \begin{programlisting}
Chris@1 433 1) iterate [i] over the range 0 ... [n]-1 {
Chris@1 434
Chris@1 435 2) iterate [j] over the range 0 ... [ch]-1 {
Chris@1 436
Chris@1 437 3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j])
Chris@1 438
Chris@1 439 }
Chris@1 440 }
Chris@1 441
Chris@1 442 4) done
Chris@1 443 \end{programlisting}
Chris@1 444
Chris@1 445 \end{enumerate}
Chris@1 446
Chris@1 447
Chris@1 448
Chris@1 449
Chris@1 450
Chris@1 451
Chris@1 452