Mercurial > hg > sv-dependency-builds
comparison src/libvorbis-1.3.3/doc/08-residue.tex @ 1:05aa0afa9217
Bring in flac, ogg, vorbis
| author | Chris Cannam |
|---|---|
| date | Tue, 19 Mar 2013 17:37:49 +0000 |
| parents | |
| children |
comparison
equal
deleted
inserted
replaced
| 0:c7265573341e | 1:05aa0afa9217 |
|---|---|
| 1 % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*- | |
| 2 %!TEX root = Vorbis_I_spec.tex | |
| 3 % $Id$ | |
| 4 \section{Residue setup and decode} \label{vorbis:spec:residue} | |
| 5 | |
| 6 \subsection{Overview} | |
| 7 | |
| 8 A residue vector represents the fine detail of the audio spectrum of | |
| 9 one channel in an audio frame after the encoder subtracts the floor | |
| 10 curve and performs any channel coupling. A residue vector may | |
| 11 represent spectral lines, spectral magnitude, spectral phase or | |
| 12 hybrids as mixed by channel coupling. The exact semantic content of | |
| 13 the vector does not matter to the residue abstraction. | |
| 14 | |
| 15 Whatever the exact qualities, the Vorbis residue abstraction codes the | |
| 16 residue vectors into the bitstream packet, and then reconstructs the | |
| 17 vectors during decode. Vorbis makes use of three different encoding | |
| 18 variants (numbered 0, 1 and 2) of the same basic vector encoding | |
| 19 abstraction. | |
| 20 | |
| 21 | |
| 22 | |
| 23 \subsection{Residue format} | |
| 24 | |
| 25 Residue format partitions each vector in the vector bundle into chunks, | |
| 26 classifies each chunk, encodes the chunk classifications and finally | |
| 27 encodes the chunks themselves using the the specific VQ arrangement | |
| 28 defined for each selected classification. | |
| 29 The exact interleaving and partitioning vary by residue encoding number, | |
| 30 however the high-level process used to classify and encode the residue | |
| 31 vector is the same in all three variants. | |
| 32 | |
| 33 A set of coded residue vectors are all of the same length. High level | |
| 34 coding structure, ignoring for the moment exactly how a partition is | |
| 35 encoded and simply trusting that it is, is as follows: | |
| 36 | |
| 37 \begin{itemize} | |
| 38 \item Each vector is partitioned into multiple equal sized chunks | |
| 39 according to configuration specified. If we have a vector size of | |
| 40 \emph{n}, a partition size \emph{residue\_partition\_size}, and a total | |
| 41 of \emph{ch} residue vectors, the total number of partitioned chunks | |
| 42 coded is \emph{n}/\emph{residue\_partition\_size}*\emph{ch}. It is | |
| 43 important to note that the integer division truncates. In the below | |
| 44 example, we assume an example \emph{residue\_partition\_size} of 8. | |
| 45 | |
| 46 \item Each partition in each vector has a classification number that | |
| 47 specifies which of multiple configured VQ codebook setups are used to | |
| 48 decode that partition. The classification numbers of each partition | |
| 49 can be thought of as forming a vector in their own right, as in the | |
| 50 illustration below. Just as the residue vectors are coded in grouped | |
| 51 partitions to increase encoding efficiency, the classification vector | |
| 52 is also partitioned into chunks. The integer elements of each scalar | |
| 53 in a classification chunk are built into a single scalar that | |
| 54 represents the classification numbers in that chunk. In the below | |
| 55 example, the classification codeword encodes two classification | |
| 56 numbers. | |
| 57 | |
| 58 \item The values in a residue vector may be encoded monolithically in a | |
| 59 single pass through the residue vector, but more often efficient | |
| 60 codebook design dictates that each vector is encoded as the additive | |
| 61 sum of several passes through the residue vector using more than one | |
| 62 VQ codebook. Thus, each residue value potentially accumulates values | |
| 63 from multiple decode passes. The classification value associated with | |
| 64 a partition is the same in each pass, thus the classification codeword | |
| 65 is coded only in the first pass. | |
| 66 | |
| 67 \end{itemize} | |
| 68 | |
| 69 | |
| 70 \begin{center} | |
| 71 \includegraphics[width=\textwidth]{residue-pack} | |
| 72 \captionof{figure}{illustration of residue vector format} | |
| 73 \end{center} | |
| 74 | |
| 75 | |
| 76 | |
| 77 \subsection{residue 0} | |
| 78 | |
| 79 Residue 0 and 1 differ only in the way the values within a residue | |
| 80 partition are interleaved during partition encoding (visually treated | |
| 81 as a black box--or cyan box or brown box--in the above figure). | |
| 82 | |
| 83 Residue encoding 0 interleaves VQ encoding according to the | |
| 84 dimension of the codebook used to encode a partition in a specific | |
| 85 pass. The dimension of the codebook need not be the same in multiple | |
| 86 passes, however the partition size must be an even multiple of the | |
| 87 codebook dimension. | |
| 88 | |
| 89 As an example, assume a partition vector of size eight, to be encoded | |
| 90 by residue 0 using codebook sizes of 8, 4, 2 and 1: | |
| 91 | |
| 92 \begin{programlisting} | |
| 93 | |
| 94 original residue vector: [ 0 1 2 3 4 5 6 7 ] | |
| 95 | |
| 96 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ] | |
| 97 | |
| 98 codebook dimensions = 4 encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ] | |
| 99 | |
| 100 codebook dimensions = 2 encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ] | |
| 101 | |
| 102 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ] | |
| 103 | |
| 104 \end{programlisting} | |
| 105 | |
| 106 It is worth mentioning at this point that no configurable value in the | |
| 107 residue coding setup is restricted to a power of two. | |
| 108 | |
| 109 | |
| 110 | |
| 111 \subsection{residue 1} | |
| 112 | |
| 113 Residue 1 does not interleave VQ encoding. It represents partition | |
| 114 vector scalars in order. As with residue 0, however, partition length | |
| 115 must be an integer multiple of the codebook dimension, although | |
| 116 dimension may vary from pass to pass. | |
| 117 | |
| 118 As an example, assume a partition vector of size eight, to be encoded | |
| 119 by residue 0 using codebook sizes of 8, 4, 2 and 1: | |
| 120 | |
| 121 \begin{programlisting} | |
| 122 | |
| 123 original residue vector: [ 0 1 2 3 4 5 6 7 ] | |
| 124 | |
| 125 codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ] | |
| 126 | |
| 127 codebook dimensions = 4 encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ] | |
| 128 | |
| 129 codebook dimensions = 2 encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ] | |
| 130 | |
| 131 codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ] | |
| 132 | |
| 133 \end{programlisting} | |
| 134 | |
| 135 | |
| 136 | |
| 137 \subsection{residue 2} | |
| 138 | |
| 139 Residue type two can be thought of as a variant of residue type 1. | |
| 140 Rather than encoding multiple passed-in vectors as in residue type 1, | |
| 141 the \emph{ch} passed in vectors of length \emph{n} are first | |
| 142 interleaved and flattened into a single vector of length | |
| 143 \emph{ch}*\emph{n}. Encoding then proceeds as in type 1. Decoding is | |
| 144 as in type 1 with decode interleave reversed. If operating on a single | |
| 145 vector to begin with, residue type 1 and type 2 are equivalent. | |
| 146 | |
| 147 \begin{center} | |
| 148 \includegraphics[width=\textwidth]{residue2} | |
| 149 \captionof{figure}{illustration of residue type 2} | |
| 150 \end{center} | |
| 151 | |
| 152 | |
| 153 \subsection{Residue decode} | |
| 154 | |
| 155 \subsubsection{header decode} | |
| 156 | |
| 157 Header decode for all three residue types is identical. | |
| 158 \begin{programlisting} | |
| 159 1) [residue\_begin] = read 24 bits as unsigned integer | |
| 160 2) [residue\_end] = read 24 bits as unsigned integer | |
| 161 3) [residue\_partition\_size] = read 24 bits as unsigned integer and add one | |
| 162 4) [residue\_classifications] = read 6 bits as unsigned integer and add one | |
| 163 5) [residue\_classbook] = read 8 bits as unsigned integer | |
| 164 \end{programlisting} | |
| 165 | |
| 166 \varname{[residue\_begin]} and | |
| 167 \varname{[residue\_end]} select the specific sub-portion of | |
| 168 each vector that is actually coded; it implements akin to a bandpass | |
| 169 where, for coding purposes, the vector effectively begins at element | |
| 170 \varname{[residue\_begin]} and ends at | |
| 171 \varname{[residue\_end]}. Preceding and following values in | |
| 172 the unpacked vectors are zeroed. Note that for residue type 2, these | |
| 173 values as well as \varname{[residue\_partition\_size]}apply to | |
| 174 the interleaved vector, not the individual vectors before interleave. | |
| 175 \varname{[residue\_partition\_size]} is as explained above, | |
| 176 \varname{[residue\_classifications]} is the number of possible | |
| 177 classification to which a partition can belong and | |
| 178 \varname{[residue\_classbook]} is the codebook number used to | |
| 179 code classification codewords. The number of dimensions in book | |
| 180 \varname{[residue\_classbook]} determines how many | |
| 181 classification values are grouped into a single classification | |
| 182 codeword. Note that the number of entries and dimensions in book | |
| 183 \varname{[residue\_classbook]}, along with | |
| 184 \varname{[residue\_classifications]}, overdetermines to | |
| 185 possible number of classification codewords. | |
| 186 If \varname{[residue\_classifications]}\^{}\varname{[residue\_classbook]}.dimensions | |
| 187 exceeds \varname{[residue\_classbook]}.entries, the | |
| 188 bitstream should be regarded to be undecodable. | |
| 189 | |
| 190 Next we read a bitmap pattern that specifies which partition classes | |
| 191 code values in which passes. | |
| 192 | |
| 193 \begin{programlisting} | |
| 194 1) iterate [i] over the range 0 ... [residue\_classifications]-1 { | |
| 195 | |
| 196 2) [high\_bits] = 0 | |
| 197 3) [low\_bits] = read 3 bits as unsigned integer | |
| 198 4) [bitflag] = read one bit as boolean | |
| 199 5) if ( [bitflag] is set ) then [high\_bits] = read five bits as unsigned integer | |
| 200 6) vector [residue\_cascade] element [i] = [high\_bits] * 8 + [low\_bits] | |
| 201 } | |
| 202 7) done | |
| 203 \end{programlisting} | |
| 204 | |
| 205 Finally, we read in a list of book numbers, each corresponding to | |
| 206 specific bit set in the cascade bitmap. We loop over the possible | |
| 207 codebook classifications and the maximum possible number of encoding | |
| 208 stages (8 in Vorbis I, as constrained by the elements of the cascade | |
| 209 bitmap being eight bits): | |
| 210 | |
| 211 \begin{programlisting} | |
| 212 1) iterate [i] over the range 0 ... [residue\_classifications]-1 { | |
| 213 | |
| 214 2) iterate [j] over the range 0 ... 7 { | |
| 215 | |
| 216 3) if ( vector [residue\_cascade] element [i] bit [j] is set ) { | |
| 217 | |
| 218 4) array [residue\_books] element [i][j] = read 8 bits as unsigned integer | |
| 219 | |
| 220 } else { | |
| 221 | |
| 222 5) array [residue\_books] element [i][j] = unused | |
| 223 | |
| 224 } | |
| 225 } | |
| 226 } | |
| 227 | |
| 228 6) done | |
| 229 \end{programlisting} | |
| 230 | |
| 231 An end-of-packet condition at any point in header decode renders the | |
| 232 stream undecodable. In addition, any codebook number greater than the | |
| 233 maximum numbered codebook set up in this stream also renders the | |
| 234 stream undecodable. All codebooks in array [residue\_books] are | |
| 235 required to have a value mapping. The presence of codebook in array | |
| 236 [residue\_books] without a value mapping (maptype equals zero) renders | |
| 237 the stream undecodable. | |
| 238 | |
| 239 | |
| 240 | |
| 241 \subsubsection{packet decode} | |
| 242 | |
| 243 Format 0 and 1 packet decode is identical except for specific | |
| 244 partition interleave. Format 2 packet decode can be built out of the | |
| 245 format 1 decode process. Thus we describe first the decode | |
| 246 infrastructure identical to all three formats. | |
| 247 | |
| 248 In addition to configuration information, the residue decode process | |
| 249 is passed the number of vectors in the submap bundle and a vector of | |
| 250 flags indicating if any of the vectors are not to be decoded. If the | |
| 251 passed in number of vectors is 3 and vector number 1 is marked 'do not | |
| 252 decode', decode skips vector 1 during the decode loop. However, even | |
| 253 'do not decode' vectors are allocated and zeroed. | |
| 254 | |
| 255 Depending on the values of \varname{[residue\_begin]} and | |
| 256 \varname{[residue\_end]}, it is obvious that the encoded | |
| 257 portion of a residue vector may be the entire possible residue vector | |
| 258 or some other strict subset of the actual residue vector size with | |
| 259 zero padding at either uncoded end. However, it is also possible to | |
| 260 set \varname{[residue\_begin]} and | |
| 261 \varname{[residue\_end]} to specify a range partially or | |
| 262 wholly beyond the maximum vector size. Before beginning residue | |
| 263 decode, limit \varname{[residue\_begin]} and | |
| 264 \varname{[residue\_end]} to the maximum possible vector size | |
| 265 as follows. We assume that the number of vectors being encoded, | |
| 266 \varname{[ch]} is provided by the higher level decoding | |
| 267 process. | |
| 268 | |
| 269 \begin{programlisting} | |
| 270 1) [actual\_size] = current blocksize/2; | |
| 271 2) if residue encoding is format 2 | |
| 272 3) [actual\_size] = [actual\_size] * [ch]; | |
| 273 4) [limit\_residue\_begin] = maximum of ([residue\_begin],[actual\_size]); | |
| 274 5) [limit\_residue\_end] = maximum of ([residue\_end],[actual\_size]); | |
| 275 \end{programlisting} | |
| 276 | |
| 277 The following convenience values are conceptually useful to clarifying | |
| 278 the decode process: | |
| 279 | |
| 280 \begin{programlisting} | |
| 281 1) [classwords\_per\_codeword] = [codebook\_dimensions] value of codebook [residue\_classbook] | |
| 282 2) [n\_to\_read] = [limit\_residue\_end] - [limit\_residue\_begin] | |
| 283 3) [partitions\_to\_read] = [n\_to\_read] / [residue\_partition\_size] | |
| 284 \end{programlisting} | |
| 285 | |
| 286 Packet decode proceeds as follows, matching the description offered earlier in the document. | |
| 287 \begin{programlisting} | |
| 288 1) allocate and zero all vectors that will be returned. | |
| 289 2) if ([n\_to\_read] is zero), stop; there is no residue to decode. | |
| 290 3) iterate [pass] over the range 0 ... 7 { | |
| 291 | |
| 292 4) [partition\_count] = 0 | |
| 293 | |
| 294 5) while [partition\_count] is less than [partitions\_to\_read] | |
| 295 | |
| 296 6) if ([pass] is zero) { | |
| 297 | |
| 298 7) iterate [j] over the range 0 .. [ch]-1 { | |
| 299 | |
| 300 8) if vector [j] is not marked 'do not decode' { | |
| 301 | |
| 302 9) [temp] = read from packet using codebook [residue\_classbook] in scalar context | |
| 303 10) iterate [i] descending over the range [classwords\_per\_codeword]-1 ... 0 { | |
| 304 | |
| 305 11) array [classifications] element [j],([i]+[partition\_count]) = | |
| 306 [temp] integer modulo [residue\_classifications] | |
| 307 12) [temp] = [temp] / [residue\_classifications] using integer division | |
| 308 | |
| 309 } | |
| 310 | |
| 311 } | |
| 312 | |
| 313 } | |
| 314 | |
| 315 } | |
| 316 | |
| 317 13) iterate [i] over the range 0 .. ([classwords\_per\_codeword] - 1) while [partition\_count] | |
| 318 is also less than [partitions\_to\_read] { | |
| 319 | |
| 320 14) iterate [j] over the range 0 .. [ch]-1 { | |
| 321 | |
| 322 15) if vector [j] is not marked 'do not decode' { | |
| 323 | |
| 324 16) [vqclass] = array [classifications] element [j],[partition\_count] | |
| 325 17) [vqbook] = array [residue\_books] element [vqclass],[pass] | |
| 326 18) if ([vqbook] is not 'unused') { | |
| 327 | |
| 328 19) decode partition into output vector number [j], starting at scalar | |
| 329 offset [limit\_residue\_begin]+[partition\_count]*[residue\_partition\_size] using | |
| 330 codebook number [vqbook] in VQ context | |
| 331 } | |
| 332 } | |
| 333 | |
| 334 20) increment [partition\_count] by one | |
| 335 | |
| 336 } | |
| 337 } | |
| 338 } | |
| 339 | |
| 340 21) done | |
| 341 | |
| 342 \end{programlisting} | |
| 343 | |
| 344 An end-of-packet condition during packet decode is to be considered a | |
| 345 nominal occurrence. Decode returns the result of vector decode up to | |
| 346 that point. | |
| 347 | |
| 348 | |
| 349 | |
| 350 \subsubsection{format 0 specifics} | |
| 351 | |
| 352 Format zero decodes partitions exactly as described earlier in the | |
| 353 'Residue Format: residue 0' section. The following pseudocode | |
| 354 presents the same algorithm. Assume: | |
| 355 | |
| 356 \begin{itemize} | |
| 357 \item \varname{[n]} is the value in \varname{[residue\_partition\_size]} | |
| 358 \item \varname{[v]} is the residue vector | |
| 359 \item \varname{[offset]} is the beginning read offset in [v] | |
| 360 \end{itemize} | |
| 361 | |
| 362 | |
| 363 \begin{programlisting} | |
| 364 1) [step] = [n] / [codebook\_dimensions] | |
| 365 2) iterate [i] over the range 0 ... [step]-1 { | |
| 366 | |
| 367 3) vector [entry\_temp] = read vector from packet using current codebook in VQ context | |
| 368 4) iterate [j] over the range 0 ... [codebook\_dimensions]-1 { | |
| 369 | |
| 370 5) vector [v] element ([offset]+[i]+[j]*[step]) = | |
| 371 vector [v] element ([offset]+[i]+[j]*[step]) + | |
| 372 vector [entry\_temp] element [j] | |
| 373 | |
| 374 } | |
| 375 | |
| 376 } | |
| 377 | |
| 378 6) done | |
| 379 | |
| 380 \end{programlisting} | |
| 381 | |
| 382 | |
| 383 | |
| 384 \subsubsection{format 1 specifics} | |
| 385 | |
| 386 Format 1 decodes partitions exactly as described earlier in the | |
| 387 'Residue Format: residue 1' section. The following pseudocode | |
| 388 presents the same algorithm. Assume: | |
| 389 | |
| 390 \begin{itemize} | |
| 391 \item \varname{[n]} is the value in | |
| 392 \varname{[residue\_partition\_size]} | |
| 393 \item \varname{[v]} is the residue vector | |
| 394 \item \varname{[offset]} is the beginning read offset in [v] | |
| 395 \end{itemize} | |
| 396 | |
| 397 | |
| 398 \begin{programlisting} | |
| 399 1) [i] = 0 | |
| 400 2) vector [entry\_temp] = read vector from packet using current codebook in VQ context | |
| 401 3) iterate [j] over the range 0 ... [codebook\_dimensions]-1 { | |
| 402 | |
| 403 4) vector [v] element ([offset]+[i]) = | |
| 404 vector [v] element ([offset]+[i]) + | |
| 405 vector [entry\_temp] element [j] | |
| 406 5) increment [i] | |
| 407 | |
| 408 } | |
| 409 | |
| 410 6) if ( [i] is less than [n] ) continue at step 2 | |
| 411 7) done | |
| 412 \end{programlisting} | |
| 413 | |
| 414 | |
| 415 | |
| 416 \subsubsection{format 2 specifics} | |
| 417 | |
| 418 Format 2 is reducible to format 1. It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode. | |
| 419 | |
| 420 | |
| 421 Format 2 handles 'do not decode' vectors differently than residue 0 or | |
| 422 1; if all vectors are marked 'do not decode', no decode occurrs. | |
| 423 However, if at least one vector is to be decoded, all the vectors are | |
| 424 decoded. We then request normal format 1 to decode a single vector | |
| 425 representing all output channels, rather than a vector for each | |
| 426 channel. After decode, deinterleave the vector into independent vectors, one for each output channel. That is: | |
| 427 | |
| 428 \begin{enumerate} | |
| 429 \item If all vectors 0 through \emph{ch}-1 are marked 'do not decode', allocate and clear a single vector \varname{[v]}of length \emph{ch*n} and skip step 2 below; proceed directly to the post-decode step. | |
| 430 \item Rather than performing format 1 decode to produce \emph{ch} vectors of length \emph{n} each, call format 1 decode to produce a single vector \varname{[v]} of length \emph{ch*n}. | |
| 431 \item Post decode: Deinterleave the single vector \varname{[v]} returned by format 1 decode as described above into \emph{ch} independent vectors, one for each outputchannel, according to: | |
| 432 \begin{programlisting} | |
| 433 1) iterate [i] over the range 0 ... [n]-1 { | |
| 434 | |
| 435 2) iterate [j] over the range 0 ... [ch]-1 { | |
| 436 | |
| 437 3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j]) | |
| 438 | |
| 439 } | |
| 440 } | |
| 441 | |
| 442 4) done | |
| 443 \end{programlisting} | |
| 444 | |
| 445 \end{enumerate} | |
| 446 | |
| 447 | |
| 448 | |
| 449 | |
| 450 | |
| 451 | |
| 452 |
