comparison src/libvorbis-1.3.3/doc/01-introduction.tex @ 1:05aa0afa9217
Bring in flac, ogg, vorbis
author | Chris Cannam |
---|---|
date | Tue, 19 Mar 2013 17:37:49 +0000 |
% -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
%!TEX root = Vorbis_I_spec.tex
% $Id$
\section{Introduction and Description} \label{vorbis:spec:intro}

\subsection{Overview}

This document provides a high-level description of the Vorbis codec's
construction. A bit-by-bit specification appears beginning in
\xref{vorbis:spec:codec}.
The later sections assume a high-level
understanding of the Vorbis decode process, which is
provided here.

\subsubsection{Application}
Vorbis is a general purpose perceptual audio CODEC intended to allow
maximum encoder flexibility, thus allowing it to scale competitively
over an exceptionally wide range of bitrates. At the high
quality/bitrate end of the scale (CD or DAT rate stereo, 16/24 bits)
it is in the same league as MPEG-2 and MPC. Similarly, the 1.0
encoder can encode high-quality CD and DAT rate stereo at below 48kbps
without resampling to a lower rate. Vorbis is also intended for
lower and higher sample rates (from 8kHz telephony to 192kHz digital
masters) and a range of channel representations (monaural,
polyphonic, stereo, quadraphonic, 5.1, ambisonic, or up to 255
discrete channels).


\subsubsection{Classification}
Vorbis I is a forward-adaptive monolithic transform CODEC based on the
Modified Discrete Cosine Transform. The codec is structured to allow
addition of a hybrid wavelet filterbank in Vorbis II to offer better
transient response and reproduction using a transform better suited to
localized time events.


\subsubsection{Assumptions}

The Vorbis CODEC design assumes a complex, psychoacoustically-aware
encoder and simple, low-complexity decoder. Vorbis decode is
computationally simpler than mp3, although it does require more
working memory as Vorbis has no static probability model; the vector
codebooks used in the first stage of decoding from the bitstream are
packed in their entirety into the Vorbis bitstream headers. In
packed form, these codebooks occupy only a few kilobytes; the extent
to which they are pre-decoded into a cache is the dominant factor in
decoder memory usage.


Vorbis provides none of its own framing, synchronization or protection
against errors; it is solely a method of accepting input audio,
dividing it into individual frames and compressing these frames into
raw, unformatted `packets'. The decoder then accepts these raw
packets in sequence, decodes them, synthesizes audio frames from
them, and reassembles the frames into a facsimile of the original
audio stream. Vorbis is a free-form variable bit rate (VBR) codec and packets have no
minimum size, maximum size, or fixed/expected size. Packets
are designed so that they may be truncated (or padded) and remain
decodable; this is not to be considered an error condition and is used
extensively in bitrate management in peeling. Both the transport
mechanism and decoder must allow that a packet may be any size, or
end before or after packet decode expects.

Vorbis packets are thus intended to be used with a transport mechanism
that provides free-form framing, sync, positioning and error correction
in accordance with these design assumptions, such as Ogg (for file
transport) or RTP (for network multicast). For purposes of a few
examples in this document, we will assume that Vorbis is to be
embedded in an Ogg stream specifically, although this is by no means a
requirement or fundamental assumption in the Vorbis design.

The specification for embedding Vorbis into
an Ogg transport stream is in \xref{vorbis:over:ogg}.



\subsubsection{Codec Setup and Probability Model}

Vorbis' heritage is as a research CODEC and its current design
reflects a desire to allow multiple decades of continuous encoder
improvement before running out of room within the codec specification.
For these reasons, configurable aspects of codec setup intentionally
lean toward the extreme of forward adaptive.

The single most controversial design decision in Vorbis (and the most
unusual for a Vorbis developer to keep in mind) is that the entire
probability model of the codec, the Huffman and VQ codebooks, is
packed into the bitstream header along with extensive CODEC setup
parameters (often several hundred fields). This makes it impossible,
unlike with the MPEG audio layers, to embed a simple frame type
flag in each audio packet, or to begin decode at any frame in the stream
without having previously fetched the codec setup header.


\begin{note}
Vorbis \emph{can} initiate decode at any arbitrary packet within a
bitstream so long as the codec has been initialized/setup with the
setup headers.
\end{note}

Thus, Vorbis headers are both required for decode to begin and
relatively large as bitstream headers go. The header size is
unbounded, although for streaming a rule-of-thumb of 4kB or less is
recommended (and Xiph.Org's Vorbis encoder follows this suggestion).

Our own design work indicates the primary liability of the
required header is in mindshare; it is an unusual design and thus
causes some amount of complaint among engineers as this runs against
current design trends (and also points out limitations in some
existing software/interface designs, such as Windows' ACM codec
framework). However, we find that it does not fundamentally limit
Vorbis' suitable application space.


\subsubsection{Format Specification}
The Vorbis format is well-defined by its decode specification; any
encoder that produces packets that are correctly decoded by the
reference Vorbis decoder described below may be considered a proper
Vorbis encoder. A decoder must faithfully and completely implement
the specification defined below (except where noted) to be considered
a proper Vorbis decoder.

\subsubsection{Hardware Profile}
Although Vorbis decode is computationally simple, it may still run
into specific limitations of an embedded design. For this reason,
embedded designs are allowed to deviate in limited ways from the
`full' decode specification yet still be certified compliant. These
optional omissions are labelled in the spec where relevant.


\subsection{Decoder Configuration}

Decoder setup consists of configuration of multiple, self-contained
component abstractions that perform specific functions in the decode
pipeline. Each different component instance of a specific type is
semantically interchangeable; decoder configuration consists both of
internal component configuration, as well as arrangement of specific
instances into a decode pipeline. Componentry arrangement is roughly
as follows:

\begin{center}
\includegraphics[width=\textwidth]{components}
\captionof{figure}{decoder pipeline configuration}
\end{center}

\subsubsection{Global Config}
Global codec configuration consists of a few audio related fields
(sample rate, channels), Vorbis version (always `0' in Vorbis I),
bitrate hints, and the lists of component instances. All other
configuration is in the context of specific components.

\subsubsection{Mode}

Each Vorbis frame is coded according to a master `mode'. A bitstream
may use one or many modes.

The mode mechanism is used to encode a frame according to one of
multiple possible methods with the intention of choosing a method best
suited to that frame. Switching between modes is, for example, how the
frame size is changed from frame to frame. The mode number of a frame
serves as a top level configuration switch for all other specific
aspects of frame decode.

A `mode' configuration consists of a frame size setting, window type
(always 0, the Vorbis window, in Vorbis I), transform type (always
type 0, the MDCT, in Vorbis I) and a mapping number. The mapping
number specifies which mapping configuration instance to use for
low-level packet decode and synthesis.


\subsubsection{Mapping}

A mapping contains a channel coupling description and a list of
`submaps' that bundle sets of channel vectors together for grouped
encoding and decoding. These submaps are not references to external
components; the submap list is internal and specific to a mapping.

A `submap' is a configuration/grouping that applies to a subset of
floor and residue vectors within a mapping. The submap functions as a
last layer of indirection such that specific special floor or residue
settings can be applied not only to all the vectors in a given mode,
but also specific vectors in a specific mode. Each submap specifies
the proper floor and residue instance number to use for decoding that
submap's spectral floor and spectral residue vectors.

As an example:

Assume a Vorbis stream that contains six channels in the standard 5.1
format. The sixth channel, as is normal in 5.1, is bass only.
Therefore it would be wasteful to encode a full-spectrum version of it
as with the other channels. The submapping mechanism can be used to
apply a full range floor and residue encoding to channels 0 through 4,
and a bass-only representation to the bass channel, thus saving space.
In this example, channels 0-4 belong to submap 0 (which indicates use
of a full-range floor) and channel 5 belongs to submap 1, which uses a
bass-only representation.


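% Illustrative sketch, not part of the normative text.

The 5.1 example can be written down as a small lookup table. The
following Python sketch is illustrative only: the names \texttt{mux}
and \texttt{submaps} and the floor/residue instance numbers are
assumptions made for this example, not fields read from a real
bitstream.

\begin{Verbatim}
# Hypothetical mapping for a 5.1 stream: index = channel, value = submap.
mux = [0, 0, 0, 0, 0, 1]

# Each submap names the floor and residue instance it uses
# (the instance numbers here are made up for illustration).
submaps = [
    {"floor": 0, "residue": 0},  # submap 0: full-range representation
    {"floor": 1, "residue": 1},  # submap 1: bass-only representation
]


def floor_for_channel(channel):
    """Return the floor instance number used to decode this channel."""
    return submaps[mux[channel]]["floor"]


def channels_in_submap(submap):
    """Return the channels whose vectors are grouped under this submap."""
    return [ch for ch, s in enumerate(mux) if s == submap]
\end{Verbatim}

Here channel 5 resolves to floor instance 1 through submap 1, while
channels 0 through 4 share the settings of submap 0.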
\subsubsection{Floor}

Vorbis encodes a spectral `floor' vector for each PCM channel. This
vector is a low-resolution representation of the audio spectrum for
the given channel in the current frame, generally used akin to a
whitening filter. It is named a `floor' because the Xiph.Org
reference encoder has historically used it as a unit-baseline for
spectral resolution.

A floor encoding may be of two types. Floor 0 uses a packed LSP
representation on a dB amplitude scale and Bark frequency scale.
Floor 1 represents the curve as a piecewise linear interpolated
representation on a dB amplitude scale and linear frequency scale.
The two floors are semantically interchangeable in
encoding/decoding. However, floor type 1 provides more stable
inter-frame behavior, and so is the preferred choice in all
coupled-stereo and high bitrate modes. Floor 1 is also considerably
less expensive to decode than floor 0.

Floor 0 is not to be considered deprecated, but it is of limited
modern use. No known Vorbis encoder past Xiph.Org's own beta 4 makes
use of floor 0.

The values coded/decoded by a floor are both compactly formatted and
make use of entropy coding to save space. For this reason, a floor
configuration generally refers to multiple codebooks in the codebook
component list. Entropy coding is thus provided as an abstraction,
and each floor instance may choose from any and all available
codebooks when coding/decoding.


\subsubsection{Residue}
The spectral residue is the fine structure of the audio spectrum
once the floor curve has been subtracted out. In simplest terms, it
is coded in the bitstream using cascaded (multi-pass) vector
quantization according to one of three specific packing/coding
algorithms numbered 0 through 2. The packing algorithm details are
configured by residue instance. As with the floor components, the
final VQ/entropy encoding is provided by external codebook instances
and each residue instance may choose from any and all available
codebooks.

\subsubsection{Codebooks}

Codebooks are a self-contained abstraction that performs entropy
decoding and, optionally, uses the entropy-decoded integer value as an
offset into an index of output value vectors, returning the indicated
vector of values.

The entropy coding in a Vorbis I codebook is provided by a standard
Huffman binary tree representation. This tree is tightly packed using
one of several methods, depending on whether codeword lengths are
ordered or unordered, or the tree is sparse.

The codebook vector index is similarly packed according to index
characteristic. Most commonly, the vector index is encoded as a
single list of possible values that are then permuted into
a list of n-dimensional rows (lattice VQ).



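% Illustrative sketch, not part of the normative text.

The lattice arrangement can be pictured as reading the entropy-decoded
entry number as a multi-digit number whose digits each select one value
from the single list. The Python sketch below shows only that idea; the
function and parameter names are hypothetical, and the normative
unpacking, including its optional sequencing of values, is given in the
codebook section of the specification.

\begin{Verbatim}
def lattice_lookup(entry, values, dimensions, minimum=0.0, delta=1.0):
    """Build one output vector by permuting a single list of values.

    The entry number is treated as a base-len(values) number; each
    digit picks the value for one dimension of the output vector.
    """
    n = len(values)
    vector = []
    divisor = 1
    for _ in range(dimensions):
        offset = (entry // divisor) % n
        vector.append(values[offset] * delta + minimum)
        divisor *= n
    return vector
\end{Verbatim}

With three possible values and two dimensions, the nine entries 0
through 8 enumerate every pair of values, so the codebook needs to
store only the three values rather than nine explicit rows.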
\subsection{High-level Decode Process}

\subsubsection{Decode Setup}

Before decoding can begin, a decoder must initialize using the
bitstream headers matching the stream to be decoded. Vorbis uses
three header packets; all are required, in-order, by this
specification. Once set up, decode may begin at any audio packet
belonging to the Vorbis stream. In Vorbis I, all packets after the
three initial headers are audio packets.

The header packets are, in order, the identification
header, the comment header, and the setup header.

\paragraph{Identification Header}
The identification header identifies the bitstream as Vorbis and gives
the Vorbis version and the simple audio characteristics of the stream,
such as sample rate and number of channels.

\paragraph{Comment Header}
The comment header includes user text comments (``tags'') and a vendor
string for the application/library that produced the bitstream. The
encoding and proper use of the comment header is described in \xref{vorbis:spec:comment}.

\paragraph{Setup Header}
The setup header includes extensive CODEC setup information as well as
the complete VQ and Huffman codebooks needed for decode.


\subsubsection{Decode Procedure}

The decoding and synthesis procedure for all audio packets is
fundamentally the same.
\begin{enumerate}
\item decode packet type flag
\item decode mode number
\item decode window shape (long windows only)
\item decode floor
\item decode residue into residue vectors
\item inverse channel coupling of residue vectors
\item generate floor curve from decoded floor data
\item compute dot product of floor and residue, producing audio spectrum vector
\item inverse monolithic transform of audio spectrum vector, always an MDCT in Vorbis I
\item overlap/add left-hand output of transform with right-hand output of previous frame
\item store right-hand data from transform of current frame for future lapping
\item if not first frame, return results of overlap/add as audio result of current frame
\end{enumerate}

Note that clever rearrangement of the synthesis arithmetic is
possible; as an example, one can take advantage of symmetries in the
MDCT to store the right-hand transform data of a partial MDCT for a
50\% inter-frame buffer space savings, and then complete the transform
later before overlap/add with the next frame. This optimization
produces entirely equivalent output and is naturally perfectly legal.
The decoder must be \emph{entirely mathematically equivalent} to the
specification; it need not be a literal semantic implementation.

\paragraph{Packet type decode}

Vorbis I uses four packet types. The first three packet types mark each
of the three Vorbis headers described above. The fourth packet type
marks an audio packet. All other packet types are reserved; packets
marked with a reserved type should be ignored.

Following the three header packets, all packets in a Vorbis I stream
are audio. The first step of audio packet decode is to read and
verify the packet type; \emph{a non-audio packet when audio is expected
indicates stream corruption or a non-compliant stream. The decoder
must ignore the packet and not attempt decoding it to
audio}.




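% Illustrative sketch, not part of the normative text.

As a rough illustration of the four packet types, the Python sketch
below classifies a packet from its first byte. The type values used
here (1, 3 and 5 for the identification, comment and setup headers, and
a cleared low bit for audio) are taken from the header and audio packet
sections later in the specification; the function name and the returned
labels are hypothetical.

\begin{Verbatim}
HEADER_TYPES = {1: "identification", 3: "comment", 5: "setup"}


def classify_packet(packet):
    """Classify a raw Vorbis I packet (bytes) by its leading type field."""
    if not packet:
        return "empty"
    first = packet[0]
    if first & 1 == 0:
        # Audio packets begin with a single zero bit (bits are packed
        # starting from the least significant bit of the first byte).
        return "audio"
    # Header packets carry their type in the whole first byte.
    return HEADER_TYPES.get(first, "reserved")
\end{Verbatim}

A decoder that has finished header setup would decode only packets
classified as audio and ignore everything else.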
\paragraph{Mode decode}
Vorbis allows an encoder to set up multiple, numbered packet `modes',
as described earlier, all of which may be used in a given Vorbis
stream. The mode is encoded as an integer used as a direct offset into
the mode instance index.


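% Illustrative sketch, not part of the normative text.

The width of the mode number field depends on how many modes the setup
header declares. The audio packet decode section specifies it as
\texttt{ilog(mode\_count - 1)} bits, where \texttt{ilog} returns the
position of the highest set bit. A minimal Python sketch of that
arithmetic follows; the helper name \texttt{mode\_field\_bits} is
hypothetical.

\begin{Verbatim}
def ilog(x):
    """Number of bits needed to represent x; 0 for zero or negative x."""
    return x.bit_length() if x > 0 else 0


def mode_field_bits(mode_count):
    """Bits used to code the mode number when mode_count modes exist."""
    return ilog(mode_count - 1)
\end{Verbatim}

A stream with a single mode therefore spends no bits on the mode
number, and a stream with two modes (for example one short-window and
one long-window mode) spends one bit per packet.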
\paragraph{Window shape decode (long windows only)} \label{vorbis:spec:window}

Vorbis frames may be one of two PCM sample sizes specified during
codec setup. In Vorbis I, legal frame sizes are powers of two from 64
to 8192 samples. Aside from coupling, Vorbis handles channels as
independent vectors and these frame sizes are in samples per channel.

Vorbis uses an overlapping transform, namely the MDCT, to blend one
frame into the next, avoiding most inter-frame block boundary
artifacts. The MDCT output of one frame is windowed according to MDCT
requirements, overlapped 50\% with the output of the previous frame and
added. The window shape assures seamless reconstruction.

This is easy to visualize in the case of equal-sized windows:

\begin{center}
\includegraphics[width=\textwidth]{window1}
\captionof{figure}{overlap of two equal-sized windows}
\end{center}

And slightly more complex in the case of overlapping unequal-sized
windows:

\begin{center}
\includegraphics[width=\textwidth]{window2}
\captionof{figure}{overlap of a long and a short window}
\end{center}

In the unequal-sized window case, the window shape of the long window
must be modified for seamless lapping as above. It is possible to
correctly infer window shape to be applied to the current window from
knowing the sizes of the current, previous and next window. It is
legal for a decoder to use this method. However, in the case of a long
window (short windows require no modification), Vorbis also codes two
flag bits to specify pre- and post- window shape. Although not
strictly necessary for function, this minor redundancy allows a packet
to be fully decoded to the point of lapping entirely independently of
any other packet, allowing easier abstraction of decode layers as well
as allowing a greater level of easy parallelism in encode and
decode.

A description of valid window functions for use with an inverse MDCT
can be found in \cite{Sporer/Brandenburg/Edler}. Vorbis windows
all use the slope function
\[ y = \sin\left(\frac{\pi}{2}\,\sin^2\left(\frac{x+0.5}{n}\,\pi\right)\right) . \]



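% Illustrative derivation, not part of the normative text.

A short check shows why this slope laps seamlessly. Assume, for this
sketch only, that $n$ is the length of a full window in the equal-sized
case, so that the rising and falling slopes that overlap are $n/2$
samples apart, and write $\theta = (x+0.5)\pi/n$. Since
\[ \sin^2\left(\theta + \frac{\pi}{2}\right) = \cos^2\theta , \]
the falling slope evaluates to
\[ y(x + n/2) = \sin\left(\frac{\pi}{2}\cos^2\theta\right)
   = \sin\left(\frac{\pi}{2} - \frac{\pi}{2}\sin^2\theta\right)
   = \cos\left(\frac{\pi}{2}\sin^2\theta\right) , \]
and therefore
\[ y(x)^2 + y(x + n/2)^2
   = \sin^2\left(\frac{\pi}{2}\sin^2\theta\right)
   + \cos^2\left(\frac{\pi}{2}\sin^2\theta\right) = 1 . \]
The squared window values of two overlapping frames thus always sum to
one, which is the power-complementary condition an MDCT window needs
for the overlap/add to reconstruct the signal.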
\paragraph{Floor decode}
Each floor is encoded/decoded in channel order; however, each floor
belongs to a `submap' that specifies which floor configuration to
use. All floors are decoded before residue decode begins.


\paragraph{Residue decode}

Although the number of residue vectors equals the number of channels,
channel coupling may mean that the raw residue vectors extracted
during decode do not map directly to specific channels. When channel
coupling is in use, some vectors will correspond to coupled magnitude
or angle. The coupling relationships are described in the codec setup
and may differ from frame to frame, due to different mode numbers.

Vorbis codes residue vectors in groups by submap; the coding is done
in submap order from submap 0 through n-1. This differs from floors
which are coded using a configuration provided by submap number, but
are coded individually in channel order.



\paragraph{Inverse channel coupling}

A detailed discussion of stereo in the Vorbis codec can be found in
the document \href{stereo.html}{Stereo Channel Coupling in the
Vorbis CODEC}. Vorbis is not limited to only stereo coupling, but
the stereo document also gives a good overview of the generic coupling
mechanism.

Vorbis coupling applies to pairs of residue vectors at a time;
decoupling is done in-place a pair at a time in the order and using
the vectors specified in the current mapping configuration. The
decoupling operation is the same for all pairs, converting square
polar representation (where one vector is magnitude and the second
angle) back to Cartesian representation.

After each pair of vectors on the coupling list has been decoupled, in
order, the resulting residue vectors represent the fine spectral detail
of each output channel.



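% Illustrative sketch, not part of the normative text.

The square polar to Cartesian conversion is a four-way case analysis on
the signs of the magnitude and angle values. The Python sketch below
follows the cases given in the audio packet decode section of the
specification; the function names are hypothetical, and the vectors are
assumed to be plain lists of equal length.

\begin{Verbatim}
def decouple(m, a):
    """Convert one (magnitude, angle) pair to its two Cartesian values.

    Returns the new values for the magnitude slot and the angle slot.
    """
    if m > 0:
        if a > 0:
            return m, m - a
        return m + a, m
    if a > 0:
        return m, m + a
    return m - a, m


def decouple_in_place(magnitude, angle):
    """Decouple a pair of residue vectors element by element, in place."""
    for i in range(len(magnitude)):
        magnitude[i], angle[i] = decouple(magnitude[i], angle[i])
\end{Verbatim}

An angle of zero yields two equal values, which is how a coupled pair
represents content that is identical in both channels.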
\paragraph{Generate floor curve}

The decoder may choose to generate the floor curve at any appropriate
time. It is reasonable to generate the output curve when the floor
data is decoded from the raw packet, or it can be generated after
inverse coupling and applied to the spectral residue directly,
combining generation and the dot product into one step and eliminating
some working space.

Both floor 0 and floor 1 generate a linear-range, linear-domain output
vector to be multiplied (dot product) by the linear-range,
linear-domain spectral residue.



\paragraph{Compute floor/residue dot product}

This step is straightforward; for each output channel, the decoder
multiplies the floor curve and residue vectors element by element,
producing the finished audio spectrum of each channel.

% TODO/FIXME: The following two paragraphs have identical twins
% in section 4 (under "dot product")
One point is worth mentioning about this dot product; a common mistake
in a fixed point implementation might be to assume that a 32 bit
fixed-point representation for floor and residue and direct
multiplication of the vectors is sufficient for acceptable spectral
depth in all cases because it happens to mostly work with the current
Xiph.Org reference encoder.

However, floor vector values can span \~{}140dB (\~{}24 bits unsigned), and
the audio spectrum vector should represent a minimum of 120dB (\~{}21
bits with sign), even when output is to a 16 bit PCM device. For the
residue vector to represent full scale if the floor is nailed to
$-140$dB, it must be able to span 0 to $+140$dB. For the residue vector
to reach full scale if the floor is nailed at 0dB, it must be able to
represent $-140$dB to $+0$dB. Thus, in order to handle full range
dynamics, a residue vector may span $-140$dB to $+140$dB entirely within
spec. A 280dB range is approximately 48 bits with sign; thus the
residue vector must be able to represent a 48 bit range and the dot
product must be able to handle an effective 48 bit times 24 bit
multiplication. This range may be achieved using large (64 bit or
larger) integers, or implementing a movable binary point
representation.



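% Illustrative arithmetic, not part of the normative text.

The bit widths quoted above follow from the usual rule that one bit of
linear amplitude corresponds to $20\log_{10}2 \approx 6.02$dB. As a
rough check, rounding up to whole bits and adding one bit where a sign
is needed:
\[ 140 / 6.02 \approx 23.3 \;\Rightarrow\; 24 \mbox{ bits unsigned (floor)}, \]
\[ 120 / 6.02 \approx 19.9 \;\Rightarrow\; 20 + 1 = 21 \mbox{ bits with sign (audio spectrum)}, \]
\[ 280 / 6.02 \approx 46.5 \;\Rightarrow\; 47 + 1 = 48 \mbox{ bits with sign (residue)}. \]
A full-precision product of a 48 bit residue value and a 24 bit floor
value needs up to $48 + 24 = 72$ bits, which is why a plain 32 bit
multiply cannot cover the whole range.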
\paragraph{Inverse monolithic transform (MDCT)}

The audio spectrum is converted back into time domain PCM audio via an
inverse Modified Discrete Cosine Transform (MDCT). A detailed
description of the MDCT is available in \cite{Sporer/Brandenburg/Edler}.

Note that the PCM produced directly from the MDCT is not yet finished
audio; it must be lapped with surrounding frames using an appropriate
window (such as the Vorbis window) before the MDCT can be considered
orthogonal.



\paragraph{Overlap/add data}
Windowed MDCT output is overlapped and added with the right hand data
of the previous window such that the 3/4 point of the previous window
is aligned with the 1/4 point of the current window (as illustrated in
the window overlap diagram). At this point, the audio data between the
center of the previous frame and the center of the current frame is
now finished and ready to be returned.


\paragraph{Cache right hand data}
The decoder must cache the right hand portion of the current frame to
be lapped with the left hand portion of the next frame.



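% Illustrative sketch, not part of the normative text.

The two steps above can be sketched for the simplest case. The Python
function below assumes equal-sized frames whose samples have already
been windowed, and represents each frame as a plain list; the function
name and calling convention are hypothetical, and the unequal-sized
case, where only part of the long window overlaps, is deliberately left
out.

\begin{Verbatim}
def lap(previous_right, current_windowed):
    """Overlap/add one frame and return (finished_audio, right_half).

    previous_right is the cached right half of the previous frame, or
    None for the first frame, which only primes the decoder.
    """
    half = len(current_windowed) // 2
    left, right = current_windowed[:half], current_windowed[half:]
    if previous_right is None:
        return [], right
    finished = [p + c for p, c in zip(previous_right, left)]
    return finished, right
\end{Verbatim}

A decoder loop would keep the second return value and pass it back in
as \texttt{previous\_right} when the next frame arrives.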
\paragraph{Return finished audio data}

The overlapped portion produced from overlapping the previous and
current frame data is finished data to be returned by the decoder.
This data spans from the center of the previous window to the center
of the current window. In the case of same-sized windows, the amount
of data to return is one-half block consisting of and only of the
overlapped portions. When overlapping a short and long window, much of
the returned range is not actually overlap. This does not damage
transform orthogonality. Pay attention however to returning the
correct data range; the amount of data to be returned is:

\begin{Verbatim}[commandchars=\\\{\}]
window\_blocksize(previous\_window)/4+window\_blocksize(current\_window)/4
\end{Verbatim}

from the center of the previous window to the center of the current
window.

Data is not returned from the first frame; it must be used to `prime'
the decode engine. The encoder accounts for this priming when
calculating PCM offsets; after the first frame, the proper PCM output
offset is `0' (as no data has been returned yet).
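
% Illustrative sketch, not part of the normative text.

The rule for the amount of returned data can be checked with a few
block sizes. The Python helper below is a direct transcription of the
expression above; its name is hypothetical, and it assumes block sizes
that are legal Vorbis I frame sizes, so the divisions are exact.

\begin{Verbatim}
def returned_samples(previous_blocksize, current_blocksize):
    """Samples per channel finished by lapping two consecutive frames."""
    return previous_blocksize // 4 + current_blocksize // 4
\end{Verbatim}

Two long frames of 2048 samples return 1024 samples, the one-half block
described above. A 2048-sample frame followed by a 256-sample frame
returns 576 samples, and two 256-sample frames return 128.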