cannam@86: % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*- cannam@86: %!TEX root = Vorbis_I_spec.tex cannam@86: % $Id$ cannam@86: \section{Residue setup and decode} \label{vorbis:spec:residue} cannam@86: cannam@86: \subsection{Overview} cannam@86: cannam@86: A residue vector represents the fine detail of the audio spectrum of cannam@86: one channel in an audio frame after the encoder subtracts the floor cannam@86: curve and performs any channel coupling. A residue vector may cannam@86: represent spectral lines, spectral magnitude, spectral phase or cannam@86: hybrids as mixed by channel coupling. The exact semantic content of cannam@86: the vector does not matter to the residue abstraction. cannam@86: cannam@86: Whatever the exact qualities, the Vorbis residue abstraction codes the cannam@86: residue vectors into the bitstream packet, and then reconstructs the cannam@86: vectors during decode. Vorbis makes use of three different encoding cannam@86: variants (numbered 0, 1 and 2) of the same basic vector encoding cannam@86: abstraction. cannam@86: cannam@86: cannam@86: cannam@86: \subsection{Residue format} cannam@86: cannam@86: Residue format partitions each vector in the vector bundle into chunks, cannam@86: classifies each chunk, encodes the chunk classifications and finally cannam@86: encodes the chunks themselves using the the specific VQ arrangement cannam@86: defined for each selected classification. cannam@86: The exact interleaving and partitioning vary by residue encoding number, cannam@86: however the high-level process used to classify and encode the residue cannam@86: vector is the same in all three variants. cannam@86: cannam@86: A set of coded residue vectors are all of the same length. High level cannam@86: coding structure, ignoring for the moment exactly how a partition is cannam@86: encoded and simply trusting that it is, is as follows: cannam@86: cannam@86: \begin{itemize} cannam@86: \item Each vector is partitioned into multiple equal sized chunks cannam@86: according to configuration specified. If we have a vector size of cannam@86: \emph{n}, a partition size \emph{residue\_partition\_size}, and a total cannam@86: of \emph{ch} residue vectors, the total number of partitioned chunks cannam@86: coded is \emph{n}/\emph{residue\_partition\_size}*\emph{ch}. It is cannam@86: important to note that the integer division truncates. In the below cannam@86: example, we assume an example \emph{residue\_partition\_size} of 8. cannam@86: cannam@86: \item Each partition in each vector has a classification number that cannam@86: specifies which of multiple configured VQ codebook setups are used to cannam@86: decode that partition. The classification numbers of each partition cannam@86: can be thought of as forming a vector in their own right, as in the cannam@86: illustration below. Just as the residue vectors are coded in grouped cannam@86: partitions to increase encoding efficiency, the classification vector cannam@86: is also partitioned into chunks. The integer elements of each scalar cannam@86: in a classification chunk are built into a single scalar that cannam@86: represents the classification numbers in that chunk. In the below cannam@86: example, the classification codeword encodes two classification cannam@86: numbers. cannam@86: cannam@86: \item The values in a residue vector may be encoded monolithically in a cannam@86: single pass through the residue vector, but more often efficient cannam@86: codebook design dictates that each vector is encoded as the additive cannam@86: sum of several passes through the residue vector using more than one cannam@86: VQ codebook. Thus, each residue value potentially accumulates values cannam@86: from multiple decode passes. The classification value associated with cannam@86: a partition is the same in each pass, thus the classification codeword cannam@86: is coded only in the first pass. cannam@86: cannam@86: \end{itemize} cannam@86: cannam@86: cannam@86: \begin{center} cannam@86: \includegraphics[width=\textwidth]{residue-pack} cannam@86: \captionof{figure}{illustration of residue vector format} cannam@86: \end{center} cannam@86: cannam@86: cannam@86: cannam@86: \subsection{residue 0} cannam@86: cannam@86: Residue 0 and 1 differ only in the way the values within a residue cannam@86: partition are interleaved during partition encoding (visually treated cannam@86: as a black box--or cyan box or brown box--in the above figure). cannam@86: cannam@86: Residue encoding 0 interleaves VQ encoding according to the cannam@86: dimension of the codebook used to encode a partition in a specific cannam@86: pass. The dimension of the codebook need not be the same in multiple cannam@86: passes, however the partition size must be an even multiple of the cannam@86: codebook dimension. cannam@86: cannam@86: As an example, assume a partition vector of size eight, to be encoded cannam@86: by residue 0 using codebook sizes of 8, 4, 2 and 1: cannam@86: cannam@86: \begin{programlisting} cannam@86: cannam@86: original residue vector: [ 0 1 2 3 4 5 6 7 ] cannam@86: cannam@86: codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ] cannam@86: cannam@86: codebook dimensions = 4 encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ] cannam@86: cannam@86: codebook dimensions = 2 encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ] cannam@86: cannam@86: codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ] cannam@86: cannam@86: \end{programlisting} cannam@86: cannam@86: It is worth mentioning at this point that no configurable value in the cannam@86: residue coding setup is restricted to a power of two. cannam@86: cannam@86: cannam@86: cannam@86: \subsection{residue 1} cannam@86: cannam@86: Residue 1 does not interleave VQ encoding. It represents partition cannam@86: vector scalars in order. As with residue 0, however, partition length cannam@86: must be an integer multiple of the codebook dimension, although cannam@86: dimension may vary from pass to pass. cannam@86: cannam@86: As an example, assume a partition vector of size eight, to be encoded cannam@86: by residue 0 using codebook sizes of 8, 4, 2 and 1: cannam@86: cannam@86: \begin{programlisting} cannam@86: cannam@86: original residue vector: [ 0 1 2 3 4 5 6 7 ] cannam@86: cannam@86: codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ] cannam@86: cannam@86: codebook dimensions = 4 encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ] cannam@86: cannam@86: codebook dimensions = 2 encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ] cannam@86: cannam@86: codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ] cannam@86: cannam@86: \end{programlisting} cannam@86: cannam@86: cannam@86: cannam@86: \subsection{residue 2} cannam@86: cannam@86: Residue type two can be thought of as a variant of residue type 1. cannam@86: Rather than encoding multiple passed-in vectors as in residue type 1, cannam@86: the \emph{ch} passed in vectors of length \emph{n} are first cannam@86: interleaved and flattened into a single vector of length cannam@86: \emph{ch}*\emph{n}. Encoding then proceeds as in type 1. Decoding is cannam@86: as in type 1 with decode interleave reversed. If operating on a single cannam@86: vector to begin with, residue type 1 and type 2 are equivalent. cannam@86: cannam@86: \begin{center} cannam@86: \includegraphics[width=\textwidth]{residue2} cannam@86: \captionof{figure}{illustration of residue type 2} cannam@86: \end{center} cannam@86: cannam@86: cannam@86: \subsection{Residue decode} cannam@86: cannam@86: \subsubsection{header decode} cannam@86: cannam@86: Header decode for all three residue types is identical. cannam@86: \begin{programlisting} cannam@86: 1) [residue\_begin] = read 24 bits as unsigned integer cannam@86: 2) [residue\_end] = read 24 bits as unsigned integer cannam@86: 3) [residue\_partition\_size] = read 24 bits as unsigned integer and add one cannam@86: 4) [residue\_classifications] = read 6 bits as unsigned integer and add one cannam@86: 5) [residue\_classbook] = read 8 bits as unsigned integer cannam@86: \end{programlisting} cannam@86: cannam@86: \varname{[residue\_begin]} and cannam@86: \varname{[residue\_end]} select the specific sub-portion of cannam@86: each vector that is actually coded; it implements akin to a bandpass cannam@86: where, for coding purposes, the vector effectively begins at element cannam@86: \varname{[residue\_begin]} and ends at cannam@86: \varname{[residue\_end]}. Preceding and following values in cannam@86: the unpacked vectors are zeroed. Note that for residue type 2, these cannam@86: values as well as \varname{[residue\_partition\_size]}apply to cannam@86: the interleaved vector, not the individual vectors before interleave. cannam@86: \varname{[residue\_partition\_size]} is as explained above, cannam@86: \varname{[residue\_classifications]} is the number of possible cannam@86: classification to which a partition can belong and cannam@86: \varname{[residue\_classbook]} is the codebook number used to cannam@86: code classification codewords. The number of dimensions in book cannam@86: \varname{[residue\_classbook]} determines how many cannam@86: classification values are grouped into a single classification cannam@86: codeword. Note that the number of entries and dimensions in book cannam@86: \varname{[residue\_classbook]}, along with cannam@86: \varname{[residue\_classifications]}, overdetermines to cannam@86: possible number of classification codewords. cannam@86: If \varname{[residue\_classifications]}\^{}\varname{[residue\_classbook]}.dimensions cannam@86: exceeds \varname{[residue\_classbook]}.entries, the cannam@86: bitstream should be regarded to be undecodable. cannam@86: cannam@86: Next we read a bitmap pattern that specifies which partition classes cannam@86: code values in which passes. cannam@86: cannam@86: \begin{programlisting} cannam@86: 1) iterate [i] over the range 0 ... [residue\_classifications]-1 { cannam@86: cannam@86: 2) [high\_bits] = 0 cannam@86: 3) [low\_bits] = read 3 bits as unsigned integer cannam@86: 4) [bitflag] = read one bit as boolean cannam@86: 5) if ( [bitflag] is set ) then [high\_bits] = read five bits as unsigned integer cannam@86: 6) vector [residue\_cascade] element [i] = [high\_bits] * 8 + [low\_bits] cannam@86: } cannam@86: 7) done cannam@86: \end{programlisting} cannam@86: cannam@86: Finally, we read in a list of book numbers, each corresponding to cannam@86: specific bit set in the cascade bitmap. We loop over the possible cannam@86: codebook classifications and the maximum possible number of encoding cannam@86: stages (8 in Vorbis I, as constrained by the elements of the cascade cannam@86: bitmap being eight bits): cannam@86: cannam@86: \begin{programlisting} cannam@86: 1) iterate [i] over the range 0 ... [residue\_classifications]-1 { cannam@86: cannam@86: 2) iterate [j] over the range 0 ... 7 { cannam@86: cannam@86: 3) if ( vector [residue\_cascade] element [i] bit [j] is set ) { cannam@86: cannam@86: 4) array [residue\_books] element [i][j] = read 8 bits as unsigned integer cannam@86: cannam@86: } else { cannam@86: cannam@86: 5) array [residue\_books] element [i][j] = unused cannam@86: cannam@86: } cannam@86: } cannam@86: } cannam@86: cannam@86: 6) done cannam@86: \end{programlisting} cannam@86: cannam@86: An end-of-packet condition at any point in header decode renders the cannam@86: stream undecodable. In addition, any codebook number greater than the cannam@86: maximum numbered codebook set up in this stream also renders the cannam@86: stream undecodable. All codebooks in array [residue\_books] are cannam@86: required to have a value mapping. The presence of codebook in array cannam@86: [residue\_books] without a value mapping (maptype equals zero) renders cannam@86: the stream undecodable. cannam@86: cannam@86: cannam@86: cannam@86: \subsubsection{packet decode} cannam@86: cannam@86: Format 0 and 1 packet decode is identical except for specific cannam@86: partition interleave. Format 2 packet decode can be built out of the cannam@86: format 1 decode process. Thus we describe first the decode cannam@86: infrastructure identical to all three formats. cannam@86: cannam@86: In addition to configuration information, the residue decode process cannam@86: is passed the number of vectors in the submap bundle and a vector of cannam@86: flags indicating if any of the vectors are not to be decoded. If the cannam@86: passed in number of vectors is 3 and vector number 1 is marked 'do not cannam@86: decode', decode skips vector 1 during the decode loop. However, even cannam@86: 'do not decode' vectors are allocated and zeroed. cannam@86: cannam@86: Depending on the values of \varname{[residue\_begin]} and cannam@86: \varname{[residue\_end]}, it is obvious that the encoded cannam@86: portion of a residue vector may be the entire possible residue vector cannam@86: or some other strict subset of the actual residue vector size with cannam@86: zero padding at either uncoded end. However, it is also possible to cannam@86: set \varname{[residue\_begin]} and cannam@86: \varname{[residue\_end]} to specify a range partially or cannam@86: wholly beyond the maximum vector size. Before beginning residue cannam@86: decode, limit \varname{[residue\_begin]} and cannam@86: \varname{[residue\_end]} to the maximum possible vector size cannam@86: as follows. We assume that the number of vectors being encoded, cannam@86: \varname{[ch]} is provided by the higher level decoding cannam@86: process. cannam@86: cannam@86: \begin{programlisting} cannam@86: 1) [actual\_size] = current blocksize/2; cannam@86: 2) if residue encoding is format 2 cannam@86: 3) [actual\_size] = [actual\_size] * [ch]; cannam@86: 4) [limit\_residue\_begin] = maximum of ([residue\_begin],[actual\_size]); cannam@86: 5) [limit\_residue\_end] = maximum of ([residue\_end],[actual\_size]); cannam@86: \end{programlisting} cannam@86: cannam@86: The following convenience values are conceptually useful to clarifying cannam@86: the decode process: cannam@86: cannam@86: \begin{programlisting} cannam@86: 1) [classwords\_per\_codeword] = [codebook\_dimensions] value of codebook [residue\_classbook] cannam@86: 2) [n\_to\_read] = [limit\_residue\_end] - [limit\_residue\_begin] cannam@86: 3) [partitions\_to\_read] = [n\_to\_read] / [residue\_partition\_size] cannam@86: \end{programlisting} cannam@86: cannam@86: Packet decode proceeds as follows, matching the description offered earlier in the document. cannam@86: \begin{programlisting} cannam@86: 1) allocate and zero all vectors that will be returned. cannam@86: 2) if ([n\_to\_read] is zero), stop; there is no residue to decode. cannam@86: 3) iterate [pass] over the range 0 ... 7 { cannam@86: cannam@86: 4) [partition\_count] = 0 cannam@86: cannam@86: 5) while [partition\_count] is less than [partitions\_to\_read] cannam@86: cannam@86: 6) if ([pass] is zero) { cannam@86: cannam@86: 7) iterate [j] over the range 0 .. [ch]-1 { cannam@86: cannam@86: 8) if vector [j] is not marked 'do not decode' { cannam@86: cannam@86: 9) [temp] = read from packet using codebook [residue\_classbook] in scalar context cannam@86: 10) iterate [i] descending over the range [classwords\_per\_codeword]-1 ... 0 { cannam@86: cannam@86: 11) array [classifications] element [j],([i]+[partition\_count]) = cannam@86: [temp] integer modulo [residue\_classifications] cannam@86: 12) [temp] = [temp] / [residue\_classifications] using integer division cannam@86: cannam@86: } cannam@86: cannam@86: } cannam@86: cannam@86: } cannam@86: cannam@86: } cannam@86: cannam@86: 13) iterate [i] over the range 0 .. ([classwords\_per\_codeword] - 1) while [partition\_count] cannam@86: is also less than [partitions\_to\_read] { cannam@86: cannam@86: 14) iterate [j] over the range 0 .. [ch]-1 { cannam@86: cannam@86: 15) if vector [j] is not marked 'do not decode' { cannam@86: cannam@86: 16) [vqclass] = array [classifications] element [j],[partition\_count] cannam@86: 17) [vqbook] = array [residue\_books] element [vqclass],[pass] cannam@86: 18) if ([vqbook] is not 'unused') { cannam@86: cannam@86: 19) decode partition into output vector number [j], starting at scalar cannam@86: offset [limit\_residue\_begin]+[partition\_count]*[residue\_partition\_size] using cannam@86: codebook number [vqbook] in VQ context cannam@86: } cannam@86: } cannam@86: cannam@86: 20) increment [partition\_count] by one cannam@86: cannam@86: } cannam@86: } cannam@86: } cannam@86: cannam@86: 21) done cannam@86: cannam@86: \end{programlisting} cannam@86: cannam@86: An end-of-packet condition during packet decode is to be considered a cannam@86: nominal occurrence. Decode returns the result of vector decode up to cannam@86: that point. cannam@86: cannam@86: cannam@86: cannam@86: \subsubsection{format 0 specifics} cannam@86: cannam@86: Format zero decodes partitions exactly as described earlier in the cannam@86: 'Residue Format: residue 0' section. The following pseudocode cannam@86: presents the same algorithm. Assume: cannam@86: cannam@86: \begin{itemize} cannam@86: \item \varname{[n]} is the value in \varname{[residue\_partition\_size]} cannam@86: \item \varname{[v]} is the residue vector cannam@86: \item \varname{[offset]} is the beginning read offset in [v] cannam@86: \end{itemize} cannam@86: cannam@86: cannam@86: \begin{programlisting} cannam@86: 1) [step] = [n] / [codebook\_dimensions] cannam@86: 2) iterate [i] over the range 0 ... [step]-1 { cannam@86: cannam@86: 3) vector [entry\_temp] = read vector from packet using current codebook in VQ context cannam@86: 4) iterate [j] over the range 0 ... [codebook\_dimensions]-1 { cannam@86: cannam@86: 5) vector [v] element ([offset]+[i]+[j]*[step]) = cannam@86: vector [v] element ([offset]+[i]+[j]*[step]) + cannam@86: vector [entry\_temp] element [j] cannam@86: cannam@86: } cannam@86: cannam@86: } cannam@86: cannam@86: 6) done cannam@86: cannam@86: \end{programlisting} cannam@86: cannam@86: cannam@86: cannam@86: \subsubsection{format 1 specifics} cannam@86: cannam@86: Format 1 decodes partitions exactly as described earlier in the cannam@86: 'Residue Format: residue 1' section. The following pseudocode cannam@86: presents the same algorithm. Assume: cannam@86: cannam@86: \begin{itemize} cannam@86: \item \varname{[n]} is the value in cannam@86: \varname{[residue\_partition\_size]} cannam@86: \item \varname{[v]} is the residue vector cannam@86: \item \varname{[offset]} is the beginning read offset in [v] cannam@86: \end{itemize} cannam@86: cannam@86: cannam@86: \begin{programlisting} cannam@86: 1) [i] = 0 cannam@86: 2) vector [entry\_temp] = read vector from packet using current codebook in VQ context cannam@86: 3) iterate [j] over the range 0 ... [codebook\_dimensions]-1 { cannam@86: cannam@86: 4) vector [v] element ([offset]+[i]) = cannam@86: vector [v] element ([offset]+[i]) + cannam@86: vector [entry\_temp] element [j] cannam@86: 5) increment [i] cannam@86: cannam@86: } cannam@86: cannam@86: 6) if ( [i] is less than [n] ) continue at step 2 cannam@86: 7) done cannam@86: \end{programlisting} cannam@86: cannam@86: cannam@86: cannam@86: \subsubsection{format 2 specifics} cannam@86: cannam@86: Format 2 is reducible to format 1. It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode. cannam@86: cannam@86: cannam@86: Format 2 handles 'do not decode' vectors differently than residue 0 or cannam@86: 1; if all vectors are marked 'do not decode', no decode occurrs. cannam@86: However, if at least one vector is to be decoded, all the vectors are cannam@86: decoded. We then request normal format 1 to decode a single vector cannam@86: representing all output channels, rather than a vector for each cannam@86: channel. After decode, deinterleave the vector into independent vectors, one for each output channel. That is: cannam@86: cannam@86: \begin{enumerate} cannam@86: \item If all vectors 0 through \emph{ch}-1 are marked 'do not decode', allocate and clear a single vector \varname{[v]}of length \emph{ch*n} and skip step 2 below; proceed directly to the post-decode step. cannam@86: \item Rather than performing format 1 decode to produce \emph{ch} vectors of length \emph{n} each, call format 1 decode to produce a single vector \varname{[v]} of length \emph{ch*n}. cannam@86: \item Post decode: Deinterleave the single vector \varname{[v]} returned by format 1 decode as described above into \emph{ch} independent vectors, one for each outputchannel, according to: cannam@86: \begin{programlisting} cannam@86: 1) iterate [i] over the range 0 ... [n]-1 { cannam@86: cannam@86: 2) iterate [j] over the range 0 ... [ch]-1 { cannam@86: cannam@86: 3) output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j]) cannam@86: cannam@86: } cannam@86: } cannam@86: cannam@86: 4) done cannam@86: \end{programlisting} cannam@86: cannam@86: \end{enumerate} cannam@86: cannam@86: cannam@86: cannam@86: cannam@86: cannam@86: cannam@86: