Mercurial > hg > sv-dependency-builds
comparison src/libvorbis-1.3.3/doc/03-codebook.tex @ 1:05aa0afa9217
Bring in flac, ogg, vorbis
author | Chris Cannam |
---|---|
date | Tue, 19 Mar 2013 17:37:49 +0000 |
parents | |
children |
comparison
equal
deleted
inserted
replaced
0:c7265573341e | 1:05aa0afa9217 |
---|---|
1 % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*- | |
2 %!TEX root = Vorbis_I_spec.tex | |
3 % $Id$ | |
4 \section{Probability Model and Codebooks} \label{vorbis:spec:codebook} | |
5 | |
6 \subsection{Overview} | |
7 | |
8 Unlike practically every other mainstream audio codec, Vorbis has no | |
9 statically configured probability model, instead packing all entropy | |
10 decoding configuration, VQ and Huffman, into the bitstream itself in | |
11 the third header, the codec setup header. This packed configuration | |
12 consists of multiple 'codebooks', each containing a specific | |
13 Huffman-equivalent representation for decoding compressed codewords as | |
14 well as an optional lookup table of output vector values to which a | |
15 decoded Huffman value is applied as an offset, generating the final | |
16 decoded output corresponding to a given compressed codeword. | |
17 | |
18 \subsubsection{Bitwise operation} | |
19 The codebook mechanism is built on top of the vorbis bitpacker. Both | |
20 the codebooks themselves and the codewords they decode are unrolled | |
21 from a packet as a series of arbitrary-width values read from the | |
22 stream according to \xref{vorbis:spec:bitpacking}. | |
23 | |
24 | |
25 | |
26 | |
27 \subsection{Packed codebook format} | |
28 | |
29 For purposes of the examples below, we assume that the storage | |
30 system's native byte width is eight bits. This is not universally | |
31 true; see \xref{vorbis:spec:bitpacking} for discussion | |
32 relating to non-eight-bit bytes. | |
33 | |
34 \subsubsection{codebook decode} | |
35 | |
36 A codebook begins with a 24 bit sync pattern, 0x564342: | |
37 | |
38 \begin{Verbatim}[commandchars=\\\{\}] | |
39 byte 0: [ 0 1 0 0 0 0 1 0 ] (0x42) | |
40 byte 1: [ 0 1 0 0 0 0 1 1 ] (0x43) | |
41 byte 2: [ 0 1 0 1 0 1 1 0 ] (0x56) | |
42 \end{Verbatim} | |
43 | |
44 16 bit \varname{[codebook\_dimensions]} and 24 bit \varname{[codebook\_entries]} fields: | |
45 | |
46 \begin{Verbatim}[commandchars=\\\{\}] | |
47 | |
48 byte 3: [ X X X X X X X X ] | |
49 byte 4: [ X X X X X X X X ] [codebook\_dimensions] (16 bit unsigned) | |
50 | |
51 byte 5: [ X X X X X X X X ] | |
52 byte 6: [ X X X X X X X X ] | |
53 byte 7: [ X X X X X X X X ] [codebook\_entries] (24 bit unsigned) | |
54 | |
55 \end{Verbatim} | |
56 | |
57 Next is the \varname{[ordered]} bit flag: | |
58 | |
59 \begin{Verbatim}[commandchars=\\\{\}] | |
60 | |
61 byte 8: [ X ] [ordered] (1 bit) | |
62 | |
63 \end{Verbatim} | |
64 | |
65 Each entry, numbering a | |
66 total of \varname{[codebook\_entries]}, is assigned a codeword length. | |
67 We now read the list of codeword lengths and store these lengths in | |
68 the array \varname{[codebook\_codeword\_lengths]}. Decode of lengths is | |
69 according to whether the \varname{[ordered]} flag is set or unset. | |
70 | |
71 \begin{itemize} | |
72 \item | |
73 If the \varname{[ordered]} flag is unset, the codeword list is not | |
74 length ordered and the decoder needs to read each codeword length | |
75 one-by-one. | |
76 | |
77 The decoder first reads one additional bit flag, the | |
78 \varname{[sparse]} flag. This flag determines whether or not the | |
79 codebook contains unused entries that are not to be included in the | |
80 codeword decode tree: | |
81 | |
82 \begin{Verbatim}[commandchars=\\\{\}] | |
83 byte 8: [ X 1 ] [sparse] flag (1 bit) | |
84 \end{Verbatim} | |
85 | |
86 The decoder now performs for each of the \varname{[codebook\_entries]} | |
87 codebook entries: | |
88 | |
89 \begin{Verbatim}[commandchars=\\\{\}] | |
90 | |
91 1) if([sparse] is set) \{ | |
92 | |
93 2) [flag] = read one bit; | |
94 3) if([flag] is set) \{ | |
95 | |
96 4) [length] = read a five bit unsigned integer; | |
97 5) codeword length for this entry is [length]+1; | |
98 | |
99 \} else \{ | |
100 | |
101 6) this entry is unused. mark it as such. | |
102 | |
103 \} | |
104 | |
105 \} else the sparse flag is not set \{ | |
106 | |
107 7) [length] = read a five bit unsigned integer; | |
108 8) the codeword length for this entry is [length]+1; | |
109 | |
110 \} | |
111 | |
112 \end{Verbatim} | |
113 | |
114 \item | |
115 If the \varname{[ordered]} flag is set, the codeword list for this | |
116 codebook is encoded in ascending length order. Rather than reading | |
117 a length for every codeword, the encoder reads the number of | |
118 codewords per length. That is, beginning at entry zero: | |
119 | |
120 \begin{Verbatim}[commandchars=\\\{\}] | |
121 1) [current\_entry] = 0; | |
122 2) [current\_length] = read a five bit unsigned integer and add 1; | |
123 3) [number] = read \link{vorbis:spec:ilog}{ilog}([codebook\_entries] - [current\_entry]) bits as an unsigned integer | |
124 4) set the entries [current\_entry] through [current\_entry]+[number]-1, inclusive, | |
125 of the [codebook\_codeword\_lengths] array to [current\_length] | |
126 5) set [current\_entry] to [number] + [current\_entry] | |
127 6) increment [current\_length] by 1 | |
128 7) if [current\_entry] is greater than [codebook\_entries] ERROR CONDITION; | |
129 the decoder will not be able to read this stream. | |
130 8) if [current\_entry] is less than [codebook\_entries], repeat process starting at 3) | |
131 9) done. | |
132 \end{Verbatim} | |
133 | |
134 \end{itemize} | |
135 | |
136 After all codeword lengths have been decoded, the decoder reads the | |
137 vector lookup table. Vorbis I supports three lookup types: | |
138 \begin{enumerate} | |
139 \item | |
140 No lookup | |
141 \item | |
142 Implicitly populated value mapping (lattice VQ) | |
143 \item | |
144 Explicitly populated value mapping (tessellated or 'foam' | |
145 VQ) | |
146 \end{enumerate} | |
147 | |
148 | |
149 The lookup table type is read as a four bit unsigned integer: | |
150 \begin{Verbatim}[commandchars=\\\{\}] | |
151 1) [codebook\_lookup\_type] = read four bits as an unsigned integer | |
152 \end{Verbatim} | |
153 | |
154 Codebook decode precedes according to \varname{[codebook\_lookup\_type]}: | |
155 \begin{itemize} | |
156 \item | |
157 Lookup type zero indicates no lookup to be read. Proceed past | |
158 lookup decode. | |
159 \item | |
160 Lookup types one and two are similar, differing only in the | |
161 number of lookup values to be read. Lookup type one reads a list of | |
162 values that are permuted in a set pattern to build a list of vectors, | |
163 each vector of order \varname{[codebook\_dimensions]} scalars. Lookup | |
164 type two builds the same vector list, but reads each scalar for each | |
165 vector explicitly, rather than building vectors from a smaller list of | |
166 possible scalar values. Lookup decode proceeds as follows: | |
167 | |
168 \begin{Verbatim}[commandchars=\\\{\}] | |
169 1) [codebook\_minimum\_value] = \link{vorbis:spec:float32:unpack}{float32\_unpack}( read 32 bits as an unsigned integer) | |
170 2) [codebook\_delta\_value] = \link{vorbis:spec:float32:unpack}{float32\_unpack}( read 32 bits as an unsigned integer) | |
171 3) [codebook\_value\_bits] = read 4 bits as an unsigned integer and add 1 | |
172 4) [codebook\_sequence\_p] = read 1 bit as a boolean flag | |
173 | |
174 if ( [codebook\_lookup\_type] is 1 ) \{ | |
175 | |
176 5) [codebook\_lookup\_values] = \link{vorbis:spec:lookup1:values}{lookup1\_values}(\varname{[codebook\_entries]}, \varname{[codebook\_dimensions]} ) | |
177 | |
178 \} else \{ | |
179 | |
180 6) [codebook\_lookup\_values] = \varname{[codebook\_entries]} * \varname{[codebook\_dimensions]} | |
181 | |
182 \} | |
183 | |
184 7) read a total of [codebook\_lookup\_values] unsigned integers of [codebook\_value\_bits] each; | |
185 store these in order in the array [codebook\_multiplicands] | |
186 \end{Verbatim} | |
187 \item | |
188 A \varname{[codebook\_lookup\_type]} of greater than two is reserved | |
189 and indicates a stream that is not decodable by the specification in this | |
190 document. | |
191 | |
192 \end{itemize} | |
193 | |
194 | |
195 An 'end of packet' during any read operation in the above steps is | |
196 considered an error condition rendering the stream undecodable. | |
197 | |
198 \paragraph{Huffman decision tree representation} | |
199 | |
200 The \varname{[codebook\_codeword\_lengths]} array and | |
201 \varname{[codebook\_entries]} value uniquely define the Huffman decision | |
202 tree used for entropy decoding. | |
203 | |
204 Briefly, each used codebook entry (recall that length-unordered | |
205 codebooks support unused codeword entries) is assigned, in order, the | |
206 lowest valued unused binary Huffman codeword possible. Assume the | |
207 following codeword length list: | |
208 | |
209 \begin{Verbatim}[commandchars=\\\{\}] | |
210 entry 0: length 2 | |
211 entry 1: length 4 | |
212 entry 2: length 4 | |
213 entry 3: length 4 | |
214 entry 4: length 4 | |
215 entry 5: length 2 | |
216 entry 6: length 3 | |
217 entry 7: length 3 | |
218 \end{Verbatim} | |
219 | |
220 Assigning codewords in order (lowest possible value of the appropriate | |
221 length to highest) results in the following codeword list: | |
222 | |
223 \begin{Verbatim}[commandchars=\\\{\}] | |
224 entry 0: length 2 codeword 00 | |
225 entry 1: length 4 codeword 0100 | |
226 entry 2: length 4 codeword 0101 | |
227 entry 3: length 4 codeword 0110 | |
228 entry 4: length 4 codeword 0111 | |
229 entry 5: length 2 codeword 10 | |
230 entry 6: length 3 codeword 110 | |
231 entry 7: length 3 codeword 111 | |
232 \end{Verbatim} | |
233 | |
234 | |
235 \begin{note} | |
236 Unlike most binary numerical values in this document, we | |
237 intend the above codewords to be read and used bit by bit from left to | |
238 right, thus the codeword '001' is the bit string 'zero, zero, one'. | |
239 When determining 'lowest possible value' in the assignment definition | |
240 above, the leftmost bit is the MSb. | |
241 \end{note} | |
242 | |
243 It is clear that the codeword length list represents a Huffman | |
244 decision tree with the entry numbers equivalent to the leaves numbered | |
245 left-to-right: | |
246 | |
247 \begin{center} | |
248 \includegraphics[width=10cm]{hufftree} | |
249 \captionof{figure}{huffman tree illustration} | |
250 \end{center} | |
251 | |
252 | |
253 As we assign codewords in order, we see that each choice constructs a | |
254 new leaf in the leftmost possible position. | |
255 | |
256 Note that it's possible to underspecify or overspecify a Huffman tree | |
257 via the length list. In the above example, if codeword seven were | |
258 eliminated, it's clear that the tree is unfinished: | |
259 | |
260 \begin{center} | |
261 \includegraphics[width=10cm]{hufftree-under} | |
262 \captionof{figure}{underspecified huffman tree illustration} | |
263 \end{center} | |
264 | |
265 | |
266 Similarly, in the original codebook, it's clear that the tree is fully | |
267 populated and a ninth codeword is impossible. Both underspecified and | |
268 overspecified trees are an error condition rendering the stream | |
269 undecodable. Take special care that a codebook with a single used | |
270 entry is handled properly; it consists of a single codework of zero | |
271 bits and 'reading' a value out of such a codebook always returns the | |
272 single used value and sinks zero bits. | |
273 | |
274 Codebook entries marked 'unused' are simply skipped in the assigning | |
275 process. They have no codeword and do not appear in the decision | |
276 tree, thus it's impossible for any bit pattern read from the stream to | |
277 decode to that entry number. | |
278 | |
279 | |
280 | |
281 \paragraph{VQ lookup table vector representation} | |
282 | |
283 Unpacking the VQ lookup table vectors relies on the following values: | |
284 \begin{programlisting} | |
285 the [codebook\_multiplicands] array | |
286 [codebook\_minimum\_value] | |
287 [codebook\_delta\_value] | |
288 [codebook\_sequence\_p] | |
289 [codebook\_lookup\_type] | |
290 [codebook\_entries] | |
291 [codebook\_dimensions] | |
292 [codebook\_lookup\_values] | |
293 \end{programlisting} | |
294 | |
295 \bigskip | |
296 | |
297 Decoding (unpacking) a specific vector in the vector lookup table | |
298 proceeds according to \varname{[codebook\_lookup\_type]}. The unpacked | |
299 vector values are what a codebook would return during audio packet | |
300 decode in a VQ context. | |
301 | |
302 \paragraph{Vector value decode: Lookup type 1} | |
303 | |
304 Lookup type one specifies a lattice VQ lookup table built | |
305 algorithmically from a list of scalar values. Calculate (unpack) the | |
306 final values of a codebook entry vector from the entries in | |
307 \varname{[codebook\_multiplicands]} as follows (\varname{[value\_vector]} | |
308 is the output vector representing the vector of values for entry number | |
309 \varname{[lookup\_offset]} in this codebook): | |
310 | |
311 \begin{Verbatim}[commandchars=\\\{\}] | |
312 1) [last] = 0; | |
313 2) [index\_divisor] = 1; | |
314 3) iterate [i] over the range 0 ... [codebook\_dimensions]-1 (once for each scalar value in the value vector) \{ | |
315 | |
316 4) [multiplicand\_offset] = ( [lookup\_offset] divided by [index\_divisor] using integer | |
317 division ) integer modulo [codebook\_lookup\_values] | |
318 | |
319 5) vector [value\_vector] element [i] = | |
320 ( [codebook\_multiplicands] array element number [multiplicand\_offset] ) * | |
321 [codebook\_delta\_value] + [codebook\_minimum\_value] + [last]; | |
322 | |
323 6) if ( [codebook\_sequence\_p] is set ) then set [last] = vector [value\_vector] element [i] | |
324 | |
325 7) [index\_divisor] = [index\_divisor] * [codebook\_lookup\_values] | |
326 | |
327 \} | |
328 | |
329 8) vector calculation completed. | |
330 \end{Verbatim} | |
331 | |
332 | |
333 | |
334 \paragraph{Vector value decode: Lookup type 2} | |
335 | |
336 Lookup type two specifies a VQ lookup table in which each scalar in | |
337 each vector is explicitly set by the \varname{[codebook\_multiplicands]} | |
338 array in a one-to-one mapping. Calculate [unpack] the | |
339 final values of a codebook entry vector from the entries in | |
340 \varname{[codebook\_multiplicands]} as follows (\varname{[value\_vector]} | |
341 is the output vector representing the vector of values for entry number | |
342 \varname{[lookup\_offset]} in this codebook): | |
343 | |
344 \begin{Verbatim}[commandchars=\\\{\}] | |
345 1) [last] = 0; | |
346 2) [multiplicand\_offset] = [lookup\_offset] * [codebook\_dimensions] | |
347 3) iterate [i] over the range 0 ... [codebook\_dimensions]-1 (once for each scalar value in the value vector) \{ | |
348 | |
349 4) vector [value\_vector] element [i] = | |
350 ( [codebook\_multiplicands] array element number [multiplicand\_offset] ) * | |
351 [codebook\_delta\_value] + [codebook\_minimum\_value] + [last]; | |
352 | |
353 5) if ( [codebook\_sequence\_p] is set ) then set [last] = vector [value\_vector] element [i] | |
354 | |
355 6) increment [multiplicand\_offset] | |
356 | |
357 \} | |
358 | |
359 7) vector calculation completed. | |
360 \end{Verbatim} | |
361 | |
362 | |
363 | |
364 | |
365 | |
366 | |
367 | |
368 | |
369 | |
370 \subsection{Use of the codebook abstraction} | |
371 | |
372 The decoder uses the codebook abstraction much as it does the | |
373 bit-unpacking convention; a specific codebook reads a | |
374 codeword from the bitstream, decoding it into an entry number, and then | |
375 returns that entry number to the decoder (when used in a scalar | |
376 entropy coding context), or uses that entry number as an offset into | |
377 the VQ lookup table, returning a vector of values (when used in a context | |
378 desiring a VQ value). Scalar or VQ context is always explicit; any call | |
379 to the codebook mechanism requests either a scalar entry number or a | |
380 lookup vector. | |
381 | |
382 Note that VQ lookup type zero indicates that there is no lookup table; | |
383 requesting decode using a codebook of lookup type 0 in any context | |
384 expecting a vector return value (even in a case where a vector of | |
385 dimension one) is forbidden. If decoder setup or decode requests such | |
386 an action, that is an error condition rendering the packet | |
387 undecodable. | |
388 | |
389 Using a codebook to read from the packet bitstream consists first of | |
390 reading and decoding the next codeword in the bitstream. The decoder | |
391 reads bits until the accumulated bits match a codeword in the | |
392 codebook. This process can be though of as logically walking the | |
393 Huffman decode tree by reading one bit at a time from the bitstream, | |
394 and using the bit as a decision boolean to take the 0 branch (left in | |
395 the above examples) or the 1 branch (right in the above examples). | |
396 Walking the tree finishes when the decode process hits a leaf in the | |
397 decision tree; the result is the entry number corresponding to that | |
398 leaf. Reading past the end of a packet propagates the 'end-of-stream' | |
399 condition to the decoder. | |
400 | |
401 When used in a scalar context, the resulting codeword entry is the | |
402 desired return value. | |
403 | |
404 When used in a VQ context, the codeword entry number is used as an | |
405 offset into the VQ lookup table. The value returned to the decoder is | |
406 the vector of scalars corresponding to this offset. |