annotate src/libvorbis-1.3.3/doc/08-residue.tex @ 56:af97cad61ff0

Add updated build of PortAudio for OSX
author Chris Cannam <cannam@all-day-breakfast.com>
date Tue, 03 Jan 2017 15:10:52 +0000
parents 05aa0afa9217
children
rev   line source
Chris@1 1 % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
Chris@1 2 %!TEX root = Vorbis_I_spec.tex
Chris@1 3 % $Id$
Chris@1 4 \section{Residue setup and decode} \label{vorbis:spec:residue}
Chris@1 5
Chris@1 6 \subsection{Overview}
Chris@1 7
Chris@1 8 A residue vector represents the fine detail of the audio spectrum of
Chris@1 9 one channel in an audio frame after the encoder subtracts the floor
Chris@1 10 curve and performs any channel coupling. A residue vector may
Chris@1 11 represent spectral lines, spectral magnitude, spectral phase or
Chris@1 12 hybrids as mixed by channel coupling. The exact semantic content of
Chris@1 13 the vector does not matter to the residue abstraction.
Chris@1 14
Chris@1 15 Whatever the exact qualities, the Vorbis residue abstraction codes the
Chris@1 16 residue vectors into the bitstream packet, and then reconstructs the
Chris@1 17 vectors during decode. Vorbis makes use of three different encoding
Chris@1 18 variants (numbered 0, 1 and 2) of the same basic vector encoding
Chris@1 19 abstraction.
Chris@1 20
Chris@1 21
Chris@1 22
Chris@1 23 \subsection{Residue format}
Chris@1 24
Chris@1 25 Residue format partitions each vector in the vector bundle into chunks,
Chris@1 26 classifies each chunk, encodes the chunk classifications and finally
Chris@1 27 encodes the chunks themselves using the the specific VQ arrangement
Chris@1 28 defined for each selected classification.
Chris@1 29 The exact interleaving and partitioning vary by residue encoding number,
Chris@1 30 however the high-level process used to classify and encode the residue
Chris@1 31 vector is the same in all three variants.
Chris@1 32
Chris@1 33 A set of coded residue vectors are all of the same length. High level
Chris@1 34 coding structure, ignoring for the moment exactly how a partition is
Chris@1 35 encoded and simply trusting that it is, is as follows:
Chris@1 36
Chris@1 37 \begin{itemize}
Chris@1 38 \item Each vector is partitioned into multiple equal sized chunks
Chris@1 39 according to configuration specified. If we have a vector size of
Chris@1 40 \emph{n}, a partition size \emph{residue\_partition\_size}, and a total
Chris@1 41 of \emph{ch} residue vectors, the total number of partitioned chunks
Chris@1 42 coded is \emph{n}/\emph{residue\_partition\_size}*\emph{ch}. It is
Chris@1 43 important to note that the integer division truncates. In the below
Chris@1 44 example, we assume an example \emph{residue\_partition\_size} of 8.
Chris@1 45
Chris@1 46 \item Each partition in each vector has a classification number that
Chris@1 47 specifies which of multiple configured VQ codebook setups are used to
Chris@1 48 decode that partition. The classification numbers of each partition
Chris@1 49 can be thought of as forming a vector in their own right, as in the
Chris@1 50 illustration below. Just as the residue vectors are coded in grouped
Chris@1 51 partitions to increase encoding efficiency, the classification vector
Chris@1 52 is also partitioned into chunks. The integer elements of each scalar
Chris@1 53 in a classification chunk are built into a single scalar that
Chris@1 54 represents the classification numbers in that chunk. In the below
Chris@1 55 example, the classification codeword encodes two classification
Chris@1 56 numbers.
Chris@1 57
Chris@1 58 \item The values in a residue vector may be encoded monolithically in a
Chris@1 59 single pass through the residue vector, but more often efficient
Chris@1 60 codebook design dictates that each vector is encoded as the additive
Chris@1 61 sum of several passes through the residue vector using more than one
Chris@1 62 VQ codebook. Thus, each residue value potentially accumulates values
Chris@1 63 from multiple decode passes. The classification value associated with
Chris@1 64 a partition is the same in each pass, thus the classification codeword
Chris@1 65 is coded only in the first pass.
Chris@1 66
Chris@1 67 \end{itemize}
Chris@1 68
Chris@1 69
Chris@1 70 \begin{center}
Chris@1 71 \includegraphics[width=\textwidth]{residue-pack}
Chris@1 72 \captionof{figure}{illustration of residue vector format}
Chris@1 73 \end{center}
Chris@1 74
Chris@1 75
Chris@1 76
Chris@1 77 \subsection{residue 0}
Chris@1 78
Chris@1 79 Residue 0 and 1 differ only in the way the values within a residue
Chris@1 80 partition are interleaved during partition encoding (visually treated
Chris@1 81 as a black box--or cyan box or brown box--in the above figure).
Chris@1 82
Chris@1 83 Residue encoding 0 interleaves VQ encoding according to the
Chris@1 84 dimension of the codebook used to encode a partition in a specific
Chris@1 85 pass. The dimension of the codebook need not be the same in multiple
Chris@1 86 passes, however the partition size must be an even multiple of the
Chris@1 87 codebook dimension.
Chris@1 88
Chris@1 89 As an example, assume a partition vector of size eight, to be encoded
Chris@1 90 by residue 0 using codebook sizes of 8, 4, 2 and 1:
Chris@1 91
Chris@1 92 \begin{programlisting}
Chris@1 93
Chris@1 94 original residue vector: [ 0 1 2 3 4 5 6 7 ]
Chris@1 95
Chris@1 96 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ]
Chris@1 97
Chris@1 98 codebook dimensions = 4 encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ]
Chris@1 99
Chris@1 100 codebook dimensions = 2 encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ]
Chris@1 101
Chris@1 102 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
Chris@1 103
Chris@1 104 \end{programlisting}
Chris@1 105
Chris@1 106 It is worth mentioning at this point that no configurable value in the
Chris@1 107 residue coding setup is restricted to a power of two.
Chris@1 108
Chris@1 109
Chris@1 110
Chris@1 111 \subsection{residue 1}
Chris@1 112
Chris@1 113 Residue 1 does not interleave VQ encoding. It represents partition
Chris@1 114 vector scalars in order. As with residue 0, however, partition length
Chris@1 115 must be an integer multiple of the codebook dimension, although
Chris@1 116 dimension may vary from pass to pass.
Chris@1 117
Chris@1 118 As an example, assume a partition vector of size eight, to be encoded
Chris@1 119 by residue 0 using codebook sizes of 8, 4, 2 and 1:
Chris@1 120
Chris@1 121 \begin{programlisting}
Chris@1 122
Chris@1 123 original residue vector: [ 0 1 2 3 4 5 6 7 ]
Chris@1 124
Chris@1 125 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ]
Chris@1 126
Chris@1 127 codebook dimensions = 4 encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ]
Chris@1 128
Chris@1 129 codebook dimensions = 2 encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ]
Chris@1 130
Chris@1 131 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
Chris@1 132
Chris@1 133 \end{programlisting}
Chris@1 134
Chris@1 135
Chris@1 136
Chris@1 137 \subsection{residue 2}
Chris@1 138
Chris@1 139 Residue type two can be thought of as a variant of residue type 1.
Chris@1 140 Rather than encoding multiple passed-in vectors as in residue type 1,
Chris@1 141 the \emph{ch} passed in vectors of length \emph{n} are first
Chris@1 142 interleaved and flattened into a single vector of length
Chris@1 143 \emph{ch}*\emph{n}. Encoding then proceeds as in type 1. Decoding is
Chris@1 144 as in type 1 with decode interleave reversed. If operating on a single
Chris@1 145 vector to begin with, residue type 1 and type 2 are equivalent.
Chris@1 146
Chris@1 147 \begin{center}
Chris@1 148 \includegraphics[width=\textwidth]{residue2}
Chris@1 149 \captionof{figure}{illustration of residue type 2}
Chris@1 150 \end{center}
Chris@1 151
Chris@1 152
Chris@1 153 \subsection{Residue decode}
Chris@1 154
Chris@1 155 \subsubsection{header decode}
Chris@1 156
Chris@1 157 Header decode for all three residue types is identical.
Chris@1 158 \begin{programlisting}
Chris@1 159 1) [residue\_begin] = read 24 bits as unsigned integer
Chris@1 160 2) [residue\_end] = read 24 bits as unsigned integer
Chris@1 161 3) [residue\_partition\_size] = read 24 bits as unsigned integer and add one
Chris@1 162 4) [residue\_classifications] = read 6 bits as unsigned integer and add one
Chris@1 163 5) [residue\_classbook] = read 8 bits as unsigned integer
Chris@1 164 \end{programlisting}
Chris@1 165
Chris@1 166 \varname{[residue\_begin]} and
Chris@1 167 \varname{[residue\_end]} select the specific sub-portion of
Chris@1 168 each vector that is actually coded; it implements akin to a bandpass
Chris@1 169 where, for coding purposes, the vector effectively begins at element
Chris@1 170 \varname{[residue\_begin]} and ends at
Chris@1 171 \varname{[residue\_end]}. Preceding and following values in
Chris@1 172 the unpacked vectors are zeroed. Note that for residue type 2, these
Chris@1 173 values as well as \varname{[residue\_partition\_size]}apply to
Chris@1 174 the interleaved vector, not the individual vectors before interleave.
Chris@1 175 \varname{[residue\_partition\_size]} is as explained above,
Chris@1 176 \varname{[residue\_classifications]} is the number of possible
Chris@1 177 classification to which a partition can belong and
Chris@1 178 \varname{[residue\_classbook]} is the codebook number used to
Chris@1 179 code classification codewords. The number of dimensions in book
Chris@1 180 \varname{[residue\_classbook]} determines how many
Chris@1 181 classification values are grouped into a single classification
Chris@1 182 codeword. Note that the number of entries and dimensions in book
Chris@1 183 \varname{[residue\_classbook]}, along with
Chris@1 184 \varname{[residue\_classifications]}, overdetermines to
Chris@1 185 possible number of classification codewords.
Chris@1 186 If \varname{[residue\_classifications]}\^{}\varname{[residue\_classbook]}.dimensions
Chris@1 187 exceeds \varname{[residue\_classbook]}.entries, the
Chris@1 188 bitstream should be regarded to be undecodable.
Chris@1 189
Chris@1 190 Next we read a bitmap pattern that specifies which partition classes
Chris@1 191 code values in which passes.
Chris@1 192
Chris@1 193 \begin{programlisting}
Chris@1 194 1) iterate [i] over the range 0 ... [residue\_classifications]-1 {
Chris@1 195
Chris@1 196 2) [high\_bits] = 0
Chris@1 197 3) [low\_bits] = read 3 bits as unsigned integer
Chris@1 198 4) [bitflag] = read one bit as boolean
Chris@1 199 5) if ( [bitflag] is set ) then [high\_bits] = read five bits as unsigned integer
Chris@1 200 6) vector [residue\_cascade] element [i] = [high\_bits] * 8 + [low\_bits]
Chris@1 201 }
Chris@1 202 7) done
Chris@1 203 \end{programlisting}
Chris@1 204
Chris@1 205 Finally, we read in a list of book numbers, each corresponding to
Chris@1 206 specific bit set in the cascade bitmap. We loop over the possible
Chris@1 207 codebook classifications and the maximum possible number of encoding
Chris@1 208 stages (8 in Vorbis I, as constrained by the elements of the cascade
Chris@1 209 bitmap being eight bits):
Chris@1 210
Chris@1 211 \begin{programlisting}
Chris@1 212 1) iterate [i] over the range 0 ... [residue\_classifications]-1 {
Chris@1 213
Chris@1 214 2) iterate [j] over the range 0 ... 7 {
Chris@1 215
Chris@1 216 3) if ( vector [residue\_cascade] element [i] bit [j] is set ) {
Chris@1 217
Chris@1 218 4) array [residue\_books] element [i][j] = read 8 bits as unsigned integer
Chris@1 219
Chris@1 220 } else {
Chris@1 221
Chris@1 222 5) array [residue\_books] element [i][j] = unused
Chris@1 223
Chris@1 224 }
Chris@1 225 }
Chris@1 226 }
Chris@1 227
Chris@1 228 6) done
Chris@1 229 \end{programlisting}
Chris@1 230
Chris@1 231 An end-of-packet condition at any point in header decode renders the
Chris@1 232 stream undecodable. In addition, any codebook number greater than the
Chris@1 233 maximum numbered codebook set up in this stream also renders the
Chris@1 234 stream undecodable. All codebooks in array [residue\_books] are
Chris@1 235 required to have a value mapping. The presence of codebook in array
Chris@1 236 [residue\_books] without a value mapping (maptype equals zero) renders
Chris@1 237 the stream undecodable.
Chris@1 238
Chris@1 239
Chris@1 240
Chris@1 241 \subsubsection{packet decode}
Chris@1 242
Chris@1 243 Format 0 and 1 packet decode is identical except for specific
Chris@1 244 partition interleave. Format 2 packet decode can be built out of the
Chris@1 245 format 1 decode process. Thus we describe first the decode
Chris@1 246 infrastructure identical to all three formats.
Chris@1 247
Chris@1 248 In addition to configuration information, the residue decode process
Chris@1 249 is passed the number of vectors in the submap bundle and a vector of
Chris@1 250 flags indicating if any of the vectors are not to be decoded. If the
Chris@1 251 passed in number of vectors is 3 and vector number 1 is marked 'do not
Chris@1 252 decode', decode skips vector 1 during the decode loop. However, even
Chris@1 253 'do not decode' vectors are allocated and zeroed.
Chris@1 254
Chris@1 255 Depending on the values of \varname{[residue\_begin]} and
Chris@1 256 \varname{[residue\_end]}, it is obvious that the encoded
Chris@1 257 portion of a residue vector may be the entire possible residue vector
Chris@1 258 or some other strict subset of the actual residue vector size with
Chris@1 259 zero padding at either uncoded end. However, it is also possible to
Chris@1 260 set \varname{[residue\_begin]} and
Chris@1 261 \varname{[residue\_end]} to specify a range partially or
Chris@1 262 wholly beyond the maximum vector size. Before beginning residue
Chris@1 263 decode, limit \varname{[residue\_begin]} and
Chris@1 264 \varname{[residue\_end]} to the maximum possible vector size
Chris@1 265 as follows. We assume that the number of vectors being encoded,
Chris@1 266 \varname{[ch]} is provided by the higher level decoding
Chris@1 267 process.
Chris@1 268
Chris@1 269 \begin{programlisting}
Chris@1 270 1) [actual\_size] = current blocksize/2;
Chris@1 271 2) if residue encoding is format 2
Chris@1 272 3) [actual\_size] = [actual\_size] * [ch];
Chris@1 273 4) [limit\_residue\_begin] = maximum of ([residue\_begin],[actual\_size]);
Chris@1 274 5) [limit\_residue\_end] = maximum of ([residue\_end],[actual\_size]);
Chris@1 275 \end{programlisting}
Chris@1 276
Chris@1 277 The following convenience values are conceptually useful to clarifying
Chris@1 278 the decode process:
Chris@1 279
Chris@1 280 \begin{programlisting}
Chris@1 281 1) [classwords\_per\_codeword] = [codebook\_dimensions] value of codebook [residue\_classbook]
Chris@1 282 2) [n\_to\_read] = [limit\_residue\_end] - [limit\_residue\_begin]
Chris@1 283 3) [partitions\_to\_read] = [n\_to\_read] / [residue\_partition\_size]
Chris@1 284 \end{programlisting}
Chris@1 285
Chris@1 286 Packet decode proceeds as follows, matching the description offered earlier in the document.
Chris@1 287 \begin{programlisting}
Chris@1 288 1) allocate and zero all vectors that will be returned.
Chris@1 289 2) if ([n\_to\_read] is zero), stop; there is no residue to decode.
Chris@1 290 3) iterate [pass] over the range 0 ... 7 {
Chris@1 291
Chris@1 292 4) [partition\_count] = 0
Chris@1 293
Chris@1 294 5) while [partition\_count] is less than [partitions\_to\_read]
Chris@1 295
Chris@1 296 6) if ([pass] is zero) {
Chris@1 297
Chris@1 298 7) iterate [j] over the range 0 .. [ch]-1 {
Chris@1 299
Chris@1 300 8) if vector [j] is not marked 'do not decode' {
Chris@1 301
Chris@1 302 9) [temp] = read from packet using codebook [residue\_classbook] in scalar context
Chris@1 303 10) iterate [i] descending over the range [classwords\_per\_codeword]-1 ... 0 {
Chris@1 304
Chris@1 305 11) array [classifications] element [j],([i]+[partition\_count]) =
Chris@1 306 [temp] integer modulo [residue\_classifications]
Chris@1 307 12) [temp] = [temp] / [residue\_classifications] using integer division
Chris@1 308
Chris@1 309 }
Chris@1 310
Chris@1 311 }
Chris@1 312
Chris@1 313 }
Chris@1 314
Chris@1 315 }
Chris@1 316
Chris@1 317 13) iterate [i] over the range 0 .. ([classwords\_per\_codeword] - 1) while [partition\_count]
Chris@1 318 is also less than [partitions\_to\_read] {
Chris@1 319
Chris@1 320 14) iterate [j] over the range 0 .. [ch]-1 {
Chris@1 321
Chris@1 322 15) if vector [j] is not marked 'do not decode' {
Chris@1 323
Chris@1 324 16) [vqclass] = array [classifications] element [j],[partition\_count]
Chris@1 325 17) [vqbook] = array [residue\_books] element [vqclass],[pass]
Chris@1 326 18) if ([vqbook] is not 'unused') {
Chris@1 327
Chris@1 328 19) decode partition into output vector number [j], starting at scalar
Chris@1 329 offset [limit\_residue\_begin]+[partition\_count]*[residue\_partition\_size] using
Chris@1 330 codebook number [vqbook] in VQ context
Chris@1 331 }
Chris@1 332 }
Chris@1 333
Chris@1 334 20) increment [partition\_count] by one
Chris@1 335
Chris@1 336 }
Chris@1 337 }
Chris@1 338 }
Chris@1 339
Chris@1 340 21) done
Chris@1 341
Chris@1 342 \end{programlisting}
Chris@1 343
Chris@1 344 An end-of-packet condition during packet decode is to be considered a
Chris@1 345 nominal occurrence. Decode returns the result of vector decode up to
Chris@1 346 that point.
Chris@1 347
Chris@1 348
Chris@1 349
Chris@1 350 \subsubsection{format 0 specifics}
Chris@1 351
Chris@1 352 Format zero decodes partitions exactly as described earlier in the
Chris@1 353 'Residue Format: residue 0' section. The following pseudocode
Chris@1 354 presents the same algorithm. Assume:
Chris@1 355
Chris@1 356 \begin{itemize}
Chris@1 357 \item \varname{[n]} is the value in \varname{[residue\_partition\_size]}
Chris@1 358 \item \varname{[v]} is the residue vector
Chris@1 359 \item \varname{[offset]} is the beginning read offset in [v]
Chris@1 360 \end{itemize}
Chris@1 361
Chris@1 362
Chris@1 363 \begin{programlisting}
Chris@1 364 1) [step] = [n] / [codebook\_dimensions]
Chris@1 365 2) iterate [i] over the range 0 ... [step]-1 {
Chris@1 366
Chris@1 367 3) vector [entry\_temp] = read vector from packet using current codebook in VQ context
Chris@1 368 4) iterate [j] over the range 0 ... [codebook\_dimensions]-1 {
Chris@1 369
Chris@1 370 5) vector [v] element ([offset]+[i]+[j]*[step]) =
Chris@1 371 vector [v] element ([offset]+[i]+[j]*[step]) +
Chris@1 372 vector [entry\_temp] element [j]
Chris@1 373
Chris@1 374 }
Chris@1 375
Chris@1 376 }
Chris@1 377
Chris@1 378 6) done
Chris@1 379
Chris@1 380 \end{programlisting}
Chris@1 381
Chris@1 382
Chris@1 383
Chris@1 384 \subsubsection{format 1 specifics}
Chris@1 385
Chris@1 386 Format 1 decodes partitions exactly as described earlier in the
Chris@1 387 'Residue Format: residue 1' section. The following pseudocode
Chris@1 388 presents the same algorithm. Assume:
Chris@1 389
Chris@1 390 \begin{itemize}
Chris@1 391 \item \varname{[n]} is the value in
Chris@1 392 \varname{[residue\_partition\_size]}
Chris@1 393 \item \varname{[v]} is the residue vector
Chris@1 394 \item \varname{[offset]} is the beginning read offset in [v]
Chris@1 395 \end{itemize}
Chris@1 396
Chris@1 397
Chris@1 398 \begin{programlisting}
Chris@1 399 1) [i] = 0
Chris@1 400 2) vector [entry\_temp] = read vector from packet using current codebook in VQ context
Chris@1 401 3) iterate [j] over the range 0 ... [codebook\_dimensions]-1 {
Chris@1 402
Chris@1 403 4) vector [v] element ([offset]+[i]) =
Chris@1 404 vector [v] element ([offset]+[i]) +
Chris@1 405 vector [entry\_temp] element [j]
Chris@1 406 5) increment [i]
Chris@1 407
Chris@1 408 }
Chris@1 409
Chris@1 410 6) if ( [i] is less than [n] ) continue at step 2
Chris@1 411 7) done
Chris@1 412 \end{programlisting}
Chris@1 413
Chris@1 414
Chris@1 415
Chris@1 416 \subsubsection{format 2 specifics}
Chris@1 417
Chris@1 418 Format 2 is reducible to format 1. It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode.
Chris@1 419
Chris@1 420
Chris@1 421 Format 2 handles 'do not decode' vectors differently than residue 0 or
Chris@1 422 1; if all vectors are marked 'do not decode', no decode occurrs.
Chris@1 423 However, if at least one vector is to be decoded, all the vectors are
Chris@1 424 decoded. We then request normal format 1 to decode a single vector
Chris@1 425 representing all output channels, rather than a vector for each
Chris@1 426 channel. After decode, deinterleave the vector into independent vectors, one for each output channel. That is:
Chris@1 427
Chris@1 428 \begin{enumerate}
Chris@1 429 \item If all vectors 0 through \emph{ch}-1 are marked 'do not decode', allocate and clear a single vector \varname{[v]}of length \emph{ch*n} and skip step 2 below; proceed directly to the post-decode step.
Chris@1 430 \item Rather than performing format 1 decode to produce \emph{ch} vectors of length \emph{n} each, call format 1 decode to produce a single vector \varname{[v]} of length \emph{ch*n}.
Chris@1 431 \item Post decode: Deinterleave the single vector \varname{[v]} returned by format 1 decode as described above into \emph{ch} independent vectors, one for each outputchannel, according to:
Chris@1 432 \begin{programlisting}
Chris@1 433 1) iterate [i] over the range 0 ... [n]-1 {
Chris@1 434
Chris@1 435 2) iterate [j] over the range 0 ... [ch]-1 {
Chris@1 436
Chris@1 437 3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j])
Chris@1 438
Chris@1 439 }
Chris@1 440 }
Chris@1 441
Chris@1 442 4) done
Chris@1 443 \end{programlisting}
Chris@1 444
Chris@1 445 \end{enumerate}
Chris@1 446
Chris@1 447
Chris@1 448
Chris@1 449
Chris@1 450
Chris@1 451
Chris@1 452