Chris@1
|
1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
|
Chris@1
|
2 <html>
|
Chris@1
|
3 <head>
|
Chris@1
|
4
|
Chris@1
|
5 <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-15"/>
|
Chris@1
|
6 <title>Ogg Documentation</title>
|
Chris@1
|
7
|
Chris@1
|
8 <style type="text/css">
|
Chris@1
|
9 body {
|
Chris@1
|
10 margin: 0 18px 0 18px;
|
Chris@1
|
11 padding-bottom: 30px;
|
Chris@1
|
12 font-family: Verdana, Arial, Helvetica, sans-serif;
|
Chris@1
|
13 color: #333333;
|
Chris@1
|
14 font-size: .8em;
|
Chris@1
|
15 }
|
Chris@1
|
16
|
Chris@1
|
17 a {
|
Chris@1
|
18 color: #3366cc;
|
Chris@1
|
19 }
|
Chris@1
|
20
|
Chris@1
|
21 img {
|
Chris@1
|
22 border: 0;
|
Chris@1
|
23 }
|
Chris@1
|
24
|
Chris@1
|
25 #xiphlogo {
|
Chris@1
|
26 margin: 30px 0 16px 0;
|
Chris@1
|
27 }
|
Chris@1
|
28
|
Chris@1
|
29 #content p {
|
Chris@1
|
30 line-height: 1.4;
|
Chris@1
|
31 }
|
Chris@1
|
32
|
Chris@1
|
33 h1, h1 a, h2, h2 a, h3, h3 a, h4, h4 a {
|
Chris@1
|
34 font-weight: bold;
|
Chris@1
|
35 color: #ff9900;
|
Chris@1
|
36 margin: 1.3em 0 8px 0;
|
Chris@1
|
37 }
|
Chris@1
|
38
|
Chris@1
|
39 h1 {
|
Chris@1
|
40 font-size: 1.3em;
|
Chris@1
|
41 }
|
Chris@1
|
42
|
Chris@1
|
43 h2 {
|
Chris@1
|
44 font-size: 1.2em;
|
Chris@1
|
45 }
|
Chris@1
|
46
|
Chris@1
|
47 h3 {
|
Chris@1
|
48 font-size: 1.1em;
|
Chris@1
|
49 }
|
Chris@1
|
50
|
Chris@1
|
51 li {
|
Chris@1
|
52 line-height: 1.4;
|
Chris@1
|
53 }
|
Chris@1
|
54
|
Chris@1
|
55 #copyright {
|
Chris@1
|
56 margin-top: 30px;
|
Chris@1
|
57 line-height: 1.5em;
|
Chris@1
|
58 text-align: center;
|
Chris@1
|
59 font-size: .8em;
|
Chris@1
|
60 color: #888888;
|
Chris@1
|
61 clear: both;
|
Chris@1
|
62 }
|
Chris@1
|
63 </style>
|
Chris@1
|
64
|
Chris@1
|
65 </head>
|
Chris@1
|
66
|
Chris@1
|
67 <body>
|
Chris@1
|
68
|
Chris@1
|
69 <div id="xiphlogo">
|
Chris@1
|
70 <a href="http://www.xiph.org/"><img src="fish_xiph_org.png" alt="Fish Logo and Xiph.org"/></a>
|
Chris@1
|
71 </div>
|
Chris@1
|
72
|
Chris@1
|
73 <h1>Page Multiplexing and Ordering in a Physical Ogg Stream</h1>
|
Chris@1
|
74
|
Chris@1
|
75 <p>The low-level mechanisms of an Ogg stream (as described in the Ogg
|
Chris@1
|
76 Bitstream Overview) provide means for mixing multiple logical streams
|
Chris@1
|
77 and media types into a single linear-chronological stream. This
|
Chris@1
|
78 document specifies the high-level arrangement and use of page
|
Chris@1
|
79 structure to multiplex multiple streams of mixed media type within a
|
Chris@1
|
80 physical Ogg stream.</p>
|
Chris@1
|
81
|
Chris@1
|
82 <h2>Design Elements</h2>
|
Chris@1
|
83
|
Chris@1
|
84 <p>The design and arrangement of the Ogg container format is governed by
|
Chris@1
|
85 several high-level design decisions that form the reasoning behind
|
Chris@1
|
86 specific low-level design decisions.</p>
|
Chris@1
|
87
|
Chris@1
|
88 <h3>Linear media</h3>
|
Chris@1
|
89
|
Chris@1
|
90 <p>The Ogg bitstream is intended to encapsulate chronological,
|
Chris@1
|
91 time-linear mixed media into a single delivery stream or file. The
|
Chris@1
|
92 design is such that an application can always encode and/or decode a
|
Chris@1
|
93 full-featured bitstream in one pass with no seeking and minimal
|
Chris@1
|
94 buffering. Seeking to provide optimized encoding (such as two-pass
|
Chris@1
|
95 encoding) or interactive decoding (such as scrubbing or instant
|
Chris@1
|
96 replay) is not disallowed or discouraged; however, no bitstream feature
|
Chris@1
|
97 must require nonlinear operation on the bitstream.</p>
|
Chris@1
|
98
|
Chris@1
|
99 <h3>Multiplexing</h3>
|
Chris@1
|
100
|
Chris@1
|
101 <p>Ogg bitstreams multiplex multiple logical streams into a single
|
Chris@1
|
102 physical stream at the page level. Each page contains an abstract
|
Chris@1
|
103 time stamp (the Granule Position) that represents an absolute time
|
Chris@1
|
104 landmark within the stream. After the pages representing stream
|
Chris@1
|
105 headers (all logical stream headers occur at the beginning of a
|
Chris@1
|
106 physical bitstream section before any logical stream data), logical
|
Chris@1
|
107 stream data pages are arranged in a physical bitstream in strict
|
Chris@1
|
108 non-decreasing order by chronological absolute time as
|
Chris@1
|
109 specified by the granule position.</p>
|
Chris@1
|
110
|
Chris@1
|
111 <p>The only exception to arranging pages in non-decreasing time order
|
Chris@1
|
112 by granule position is those pages that do not set the granule
|
Chris@1
|
113 position value. This is a special case when exceptionally large
|
Chris@1
|
114 packets span multiple pages; the specifics of handling this special
|
Chris@1
|
115 case are described later under 'Continuous and Discontinuous
|
Chris@1
|
116 Streams'.</p>
|
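<p>As a rough illustration of the ordering rule, the following Python sketch merges the data pages of several logical streams by absolute time. The <tt>Page</tt> tuple, its fields and the <tt>multiplex</tt> function are hypothetical names invented for this example. It assumes every page already carries a time decoded from its granule position, and it leaves out header pages and pages with an unset granule position.</p>

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Page(NamedTuple):
    """Hypothetical stand-in for an Ogg data page (not a libogg type)."""
    stream: str   # name of the logical stream the page belongs to
    time: float   # absolute time in seconds, decoded from the granule position


def multiplex(*streams: Iterable[Page]) -> Iterator[Page]:
    """Merge per-stream page sequences into one sequence ordered by time.

    Each input must already be in non-decreasing time order, which the
    granule position rules require within a single logical stream.
    """
    return heapq.merge(*streams, key=lambda page: page.time)


audio = [Page("audio", 0.5), Page("audio", 1.0), Page("audio", 1.5)]
video = [Page("video", 0.4), Page("video", 1.2)]
muxed = list(multiplex(audio, video))
```

<p>The merged sequence is non-decreasing rather than strictly increasing, since pages of different streams may map to the same time.</p>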
Chris@1
|
117
|
Chris@1
|
118 <h3>Seeking</h3>
|
Chris@1
|
119
|
Chris@1
|
120 <p>Ogg is designed to use an interpolated bisection search to
|
Chris@1
|
121 implement exact positional seeking. Interpolated bisection search is
|
Chris@1
|
122 a spec-mandated mechanism.</p>
|
Chris@1
|
123
|
Chris@1
|
124 <p><i>An index may improve objective performance, but it seldom
|
Chris@1
|
125 improves subjective performance outside of a few high-latency use
|
Chris@1
|
126 cases and adds no additional functionality as bisection search
|
Chris@1
|
127 delivers the same functionality for both one- and two-pass stream
|
Chris@1
|
128 types. For these reasons, use of indexes is discouraged, except in
|
Chris@1
|
129 cases where an index provides demonstrable and noticeable performance
|
Chris@1
|
130 improvement.</i></p>
|
Chris@1
|
131
|
Chris@1
|
132 <p>Seek operations are by absolute time; a direct bisection search must
|
Chris@1
|
133 find the exact time position requested. Information in the Ogg
|
Chris@1
|
134 bitstream is arranged such that all information to be presented for
|
Chris@1
|
135 playback from the desired seek point will occur at or after the
|
Chris@1
|
136 desired seek point. Seek operations are neither 'fuzzy' nor
|
Chris@1
|
137 heuristic.</p>
|
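<p>The idea of an exact, non-heuristic seek can be sketched as follows. This is a plain bisection, not the interpolated variant, and it searches an in-memory list of page times, whereas a real demultiplexer would bisect over byte offsets in the physical stream. The function name <tt>seek_page</tt> and the list-of-times representation are assumptions made for this example.</p>

```python
from typing import Sequence


def seek_page(page_times: Sequence[float], target: float) -> int:
    """Return the index of the first page whose time is >= target.

    page_times must be in non-decreasing order. Returns len(page_times)
    if every page lies before the target.
    """
    lo, hi = 0, len(page_times)
    while lo < hi:
        mid = (lo + hi) // 2
        if page_times[mid] < target:
            lo = mid + 1   # the target lies strictly after page mid
        else:
            hi = mid       # page mid is at or after the target
    return lo


times = [0.0, 0.5, 1.0, 1.0, 1.5, 2.0]
index = seek_page(times, 1.2)
```

<p>Because pages are ordered by time, the search needs only a logarithmic number of probes and always lands on the same page for the same request.</p>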
Chris@1
|
138
|
Chris@1
|
139 <p><i>Although key frame handling in video appears to be an exception to
|
Chris@1
|
140 "all needed playback information lies ahead of a given seek",
|
Chris@1
|
141 key frames can still be handled directly within this indexless
|
Chris@1
|
142 framework. Seeking to a key frame in video (as well as seeking in other
|
Chris@1
|
143 media types with analogous constraints) is handled as two seeks: first
|
Chris@1
|
144 a seek to the desired time which extracts state information that
|
Chris@1
|
145 decodes to the time of the last key frame, followed by a second seek
|
Chris@1
|
146 directly to the key frame. The location of the previous key frame is
|
Chris@1
|
147 embedded as state information in the granulepos; this mechanism is
|
Chris@1
|
148 described in more detail later.</i></p>
|
Chris@1
|
149
|
Chris@1
|
150 <h3>Continuous and Discontinuous Streams</h3>
|
Chris@1
|
151
|
Chris@1
|
152 <p>Logical streams within a physical Ogg stream belong to one of two
|
Chris@1
|
153 categories, "Continuous" streams and "Discontinuous" streams.
|
Chris@1
|
154 Although these are discussed in more detail later, the distinction is
|
Chris@1
|
155 important to a high-level understanding of how to buffer an Ogg
|
Chris@1
|
156 stream.</p>
|
Chris@1
|
157
|
Chris@1
|
158 <p>A stream that provides a gapless, time-continuous media type with a
|
Chris@1
|
159 fine-grained timebase is considered to be 'Continuous'. A continuous
|
Chris@1
|
160 stream should never be starved of data. Clear examples of continuous
|
Chris@1
|
161 data types include broadcast audio and video.</p>
|
Chris@1
|
162
|
Chris@1
|
163 <p>A stream that delivers data in a potentially irregular pattern or with
|
Chris@1
|
164 widely spaced timing gaps is considered to be 'Discontinuous'. A
|
Chris@1
|
165 discontinuous stream may be best thought of as data representing
|
Chris@1
|
166 scattered events; although they happen in order, they are typically
|
Chris@1
|
167 unconnected data often located far apart. One possible example of a
|
Chris@1
|
168 discontinuous stream type would be captioning. Although it's
|
Chris@1
|
169 possible to design captions as a continuous stream type, it's most
|
Chris@1
|
170 natural to think of captions as widely spaced pieces of text with
|
Chris@1
|
171 little happening between.</p>
|
Chris@1
|
172
|
Chris@1
|
173 <p>The fundamental design distinction between continuous and
|
Chris@1
|
174 discontinuous streams concerns buffering.</p>
|
Chris@1
|
175
|
Chris@1
|
176 <h3>Buffering</h3>
|
Chris@1
|
177
|
Chris@1
|
178 <p>Because a continuous stream is, by definition, gapless, Ogg buffering
|
Chris@1
|
179 is based on the simple premise of never allowing any active continuous
|
Chris@1
|
180 stream to starve for data during decode; buffering proceeds ahead
|
Chris@1
|
181 until all continuous streams in a physical stream have data ready to
|
Chris@1
|
182 decode on demand.</p>
|
Chris@1
|
183
|
Chris@1
|
184 <p>Discontinuous stream data may occur on a fairly regular basis, but the
|
Chris@1
|
185 timing of, for example, a specific caption is impossible to predict
|
Chris@1
|
186 with certainty in most captioning systems. Thus the buffering system
|
Chris@1
|
187 should take discontinuous data 'as it comes' rather than working ahead
|
Chris@1
|
188 (for a potentially unbounded period) to look for future discontinuous
|
Chris@1
|
189 data. As such, discontinuous streams are ignored when managing
|
Chris@1
|
190 buffering; their pages simply 'fall out' of the stream when continuous
|
Chris@1
|
191 streams are handled properly.</p>
|
Chris@1
|
192
|
Chris@1
|
193 <p>Buffering requirements need not be explicitly declared or managed for
|
Chris@1
|
194 the encoded stream; the decoder simply reads as much data as is
|
Chris@1
|
195 necessary to keep all continuous stream types gapless (also ensuring
|
Chris@1
|
196 discontinuous data arrives in time) and no more, resulting in optimum
|
Chris@1
|
197 implicit buffer usage for a given stream. Because all pages of all
|
Chris@1
|
198 data types are stamped with absolute timing information within the
|
Chris@1
|
199 stream, inter-stream synchronization timing is always explicitly
|
Chris@1
|
200 maintained without the need for explicitly declared buffer-ahead
|
Chris@1
|
201 hinting.</p>
|
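<p>The buffering premise can be sketched as a read loop that stops as soon as every continuous stream has data queued. The function <tt>fill_buffers</tt>, the tuple layout of a page and the stream names are hypothetical and chosen only for illustration; a real decoder would queue page payloads rather than bare times.</p>

```python
from collections import deque
from typing import Deque, Dict, Iterator, Set, Tuple

PageRef = Tuple[str, float]   # (stream name, absolute time); hypothetical layout


def fill_buffers(pages: Iterator[PageRef],
                 continuous: Set[str],
                 queues: Dict[str, Deque[float]]) -> int:
    """Read pages until every continuous stream has data queued.

    Discontinuous pages are queued as they are encountered, but they never
    cause further reading. Returns the number of pages consumed.
    """
    consumed = 0
    while any(not queues.get(name) for name in continuous):
        try:
            stream, time = next(pages)
        except StopIteration:
            break   # end of the physical stream
        queues.setdefault(stream, deque()).append(time)
        consumed += 1
    return consumed


physical = iter([("video", 0.00), ("captions", 0.00), ("video", 0.04),
                 ("audio", 0.05), ("video", 0.08)])
queues: Dict[str, Deque[float]] = {}
consumed = fill_buffers(physical, {"audio", "video"}, queues)
```

<p>In this run the caption page is picked up only because it sits in front of the first audio page; the loop never reads ahead to look for captions.</p>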
Chris@1
|
202
|
Chris@1
|
203 <p>Further details, mechanisms and reasons for the differing arrangement
|
Chris@1
|
204 and behavior of continuous and discontinuous streams are discussed
|
Chris@1
|
205 later.</p>
|
Chris@1
|
206
|
Chris@1
|
207 <h3>Whole-stream navigation</h3>
|
Chris@1
|
208
|
Chris@1
|
209 <p>Ogg is designed so that the simplest navigation operations treat the
|
Chris@1
|
210 physical Ogg stream as a whole summary of its streams, rather than
|
Chris@1
|
211 navigating each interleaved stream as a separate entity.</p>
|
Chris@1
|
212
|
Chris@1
|
213 <p>First Example: seeking to a desired time position in a multiplexed (or
|
Chris@1
|
214 unmultiplexed) Ogg stream can be accomplished through a bisection
|
Chris@1
|
215 search on time position of all pages in the stream (as encoded in the
|
Chris@1
|
216 granule position). More powerful searches (such as a key frame-aware
|
Chris@1
|
217 seek within video) are also possible with additional search
|
Chris@1
|
218 complexity, but similar computational complexity.</p>
|
Chris@1
|
219
|
Chris@1
|
220 <p>Second Example: A bitstream section may consist of three multiplexed
|
Chris@1
|
221 streams of differing lengths. The result of multiplexing these
|
Chris@1
|
222 streams should be thought of as a single mixed stream with a length
|
Chris@1
|
223 equal to the longest of the three component streams. Although it is
|
Chris@1
|
224 also possible to think of the multiplexed results as three concurrent
|
Chris@1
|
225 streams of different lengths and it is possible to recover the three
|
Chris@1
|
226 original streams, it will also become obvious that once multiplexed,
|
Chris@1
|
227 it isn't possible to find the internal lengths of the component
|
Chris@1
|
228 streams without a linear search of the whole bitstream section.
|
Chris@1
|
229 However, it is possible to find the length of the whole bitstream
|
Chris@1
|
230 section easily (in near-constant time per section) just as it is for a
|
Chris@1
|
231 single-media unmultiplexed stream.</p>
|
Chris@1
|
232
|
Chris@1
|
233 <h2>Granule Position</h2>
|
Chris@1
|
234
|
Chris@1
|
235 <h3>Description</h3>
|
Chris@1
|
236
|
Chris@1
|
237 <p>The Granule Position is a signed 64-bit field appearing in the header
|
Chris@1
|
238 of every Ogg page. Although the granule position represents absolute
|
Chris@1
|
239 time within a logical stream, its value does not necessarily directly
|
Chris@1
|
240 encode a simple timestamp. It may represent samples elapsed (as in
|
Chris@1
|
241 Vorbis), a simple timestamp, or a more complex bit-division encoding
|
Chris@1
|
242 (such as in Theora). The exact encoding of the granule position is up
|
Chris@1
|
243 to a specific codec.</p>
|
Chris@1
|
244
|
Chris@1
|
245 <p>The granule position is governed by the following rules:</p>
|
Chris@1
|
246
|
Chris@1
|
247 <ul>
|
Chris@1
|
248
|
Chris@1
|
249 <li>Granule Position must always increase forward or remain equal from
|
Chris@1
|
250 page to page, be unset, or be zero for a header page. The absolute
|
Chris@1
|
251 time to which any correct sequence of granule position maps must
|
Chris@1
|
252 similarly always increase forward or remain equal. <i>(A codec may
|
Chris@1
|
253 make use of data, such as a control sequence, that only affects codec
|
Chris@1
|
254 working state without producing data and thus without advancing granule
|
Chris@1
|
255 position and time. Although the packet sequence number increases in
|
Chris@1
|
256 this case, the granule position, and thus the time position, do
|
Chris@1
|
257 not.)</i></li>
|
Chris@1
|
258
|
Chris@1
|
259 <li>Granule position may only be unset if there is no packet defining a
|
Chris@1
|
260 time boundary on the page (that is, if no packet in a continuous
|
Chris@1
|
261 stream ends on the page, or no packet in a discontinuous stream begins
|
Chris@1
|
262 on the page. This will be discussed in more detail under Continuous
|
Chris@1
|
263 and Discontinuous streams).</li>
|
Chris@1
|
264
|
Chris@1
|
265 <li>A codec must be able to translate a given granule position value
|
Chris@1
|
266 to a unique, deterministic absolute time value through direct
|
Chris@1
|
267 calculation. A codec is not required to be able to translate an
|
Chris@1
|
268 absolute time value into a unique granule position value.</li>
|
Chris@1
|
269
|
Chris@1
|
270 <li>Codecs shall choose a granule position definition that gives the
|
Chris@1
|
271 codec a means to seek as directly as possible to an immediately
|
Chris@1
|
272 decodable point; for example, the bit-divided granule position encoding
|
Chris@1
|
273 of Theora allows the codec to seek efficiently to a key frame without using
|
Chris@1
|
274 an index. That is, additional information other than absolute time
|
Chris@1
|
275 may be encoded into a granule position value so long as the granule
|
Chris@1
|
276 position obeys the above points.</li>
|
Chris@1
|
277
|
Chris@1
|
278 </ul>
|
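<p>The first two rules can be checked mechanically for the pages of one logical stream. The sketch below assumes the unset granule position is stored as -1, as in the Ogg encapsulation format; the function name <tt>granule_order_ok</tt> is hypothetical, and the check covers only the raw field values, not the mapping to absolute time.</p>

```python
from typing import Iterable

UNSET = -1   # value assumed to mark "no granule position on this page"


def granule_order_ok(granule_positions: Iterable[int]) -> bool:
    """Check that set granule positions never decrease from page to page."""
    last = 0   # header pages carry a granule position of zero
    for gp in granule_positions:
        if gp == UNSET:
            continue   # unset pages do not take part in the ordering
        if gp < last:
            return False
        last = gp
    return True


valid = granule_order_ok([0, 0, 100, UNSET, 100, 250])
invalid = granule_order_ok([0, 300, 200])
```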
Chris@1
|
279
|
Chris@1
|
280 <h4>Example: timestamp</h4>
|
Chris@1
|
281
|
Chris@1
|
282 <p>In general, a codec/stream type should choose the simplest granule
|
Chris@1
|
283 position encoding that addresses its requirements. The examples here
|
Chris@1
|
284 are by no means exhaustive of the possibilities within Ogg.</p>
|
Chris@1
|
285
|
Chris@1
|
286 <p>A simple granule position could encode a timestamp directly. For
|
Chris@1
|
287 example, a granule position that encoded milliseconds from beginning
|
Chris@1
|
288 of stream would allow a logical stream length of over 100,000,000,000
|
Chris@1
|
289 days before beginning a new logical stream (to avoid the granule
|
Chris@1
|
290 position wrapping).</p>
|
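<p>The figure quoted above follows from simple arithmetic on the largest value a signed 64-bit field can hold. The variable names are chosen for this example only.</p>

```python
MAX_GRANULE = 2**63 - 1            # largest signed 64-bit value
MS_PER_DAY = 24 * 60 * 60 * 1000   # 86,400,000 milliseconds per day

# Whole days that fit before a millisecond counter would wrap;
# roughly 1.07e11, that is, a little over 100,000,000,000.
days = MAX_GRANULE // MS_PER_DAY
```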
Chris@1
|
291
|
Chris@1
|
292 <h4>Example: framestamp</h4>
|
Chris@1
|
293
|
Chris@1
|
294 <p>A simple millisecond timestamp granule encoding might suit many stream
|
Chris@1
|
295 types, but millisecond resolution is inappropriate for, e.g., most
|
Chris@1
|
296 audio encodings where exact single-sample resolution is generally a
|
Chris@1
|
297 requirement. A millisecond is too large a granule and often does
|
Chris@1
|
298 not represent an integer number of samples.</p>
|
Chris@1
|
299
|
Chris@1
|
300 <p>In the event that audio frames are always encoded as the same number of
|
Chris@1
|
301 samples, the granule position could simply be a linear count of frames
|
Chris@1
|
302 since beginning of stream. This has the advantages of being exact and
|
Chris@1
|
303 efficient. Position in time would simply be <tt>[granule_position] *
|
Chris@1
|
304 [samples_per_frame] / [samples_per_second]</tt>.</p>
|
Chris@1
|
305
|
Chris@1
|
306 <h4>Example: samplestamp (Vorbis)</h4>
|
Chris@1
|
307
|
Chris@1
|
308 <p>Frame counting is insufficient in codecs such as Vorbis where an audio
|
Chris@1
|
309 frame [packet] encodes a variable number of samples. In Vorbis's
|
Chris@1
|
310 case, the granule position is a count of the number of raw samples
|
Chris@1
|
311 from the beginning of stream; the absolute time of
|
Chris@1
|
312 a granule position is <tt>[granule_position] /
|
Chris@1
|
313 [samples_per_second]</tt>.</p>
|
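<p>Both the framestamp and the samplestamp mapping reduce to one division. The sketch below uses exact rational arithmetic so that no rounding enters the result; the function names are hypothetical and the sample rate of 44100 is only an example value.</p>

```python
from fractions import Fraction


def framestamp_time(granule_position: int, samples_per_frame: int,
                    samples_per_second: int) -> Fraction:
    """Time in seconds when the granule position counts fixed-size frames."""
    return Fraction(granule_position * samples_per_frame, samples_per_second)


def samplestamp_time(granule_position: int, samples_per_second: int) -> Fraction:
    """Time in seconds when the granule position counts raw samples."""
    return Fraction(granule_position, samples_per_second)


frame_based = framestamp_time(100, 1024, 44100)   # 100 frames of 1024 samples
sample_based = samplestamp_time(22050, 44100)     # half a second of samples
```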
Chris@1
|
314
|
Chris@1
|
315 <h4>Example: bit-divided framestamp (Theora)</h4>
|
Chris@1
|
316
|
Chris@1
|
317 <p>Some video codecs may be able to use the simple framestamp scheme for
|
Chris@1
|
318 granule position. However, most modern video codecs introduce at
|
Chris@1
|
319 least the following complications:</p>
|
Chris@1
|
320
|
Chris@1
|
321 <ul>
|
Chris@1
|
322
|
Chris@1
|
323 <li>video frames are relatively far apart compared to audio samples;
|
Chris@1
|
324 for this reason, the point at which a video frame changes to the next
|
Chris@1
|
325 frame is usually a strictly defined offset within the frame 'period'.
|
Chris@1
|
326 That is, video at 50fps could just as easily define frame transitions
|
Chris@1
|
327 &lt;.015, .035, .055...&gt; as at &lt;.00, .02, .04...&gt;.</li>
|
Chris@1
|
328
|
Chris@1
|
329 <li>frame rates often include drop-frames, leap-frames or other
|
Chris@1
|
330 rational-but-non-integer timings.</li>
|
Chris@1
|
331
|
Chris@1
|
332 <li>Decode must begin at a 'key frame' or 'I frame'. Key frames usually
|
Chris@1
|
333 occur relatively seldom.</li>
|
Chris@1
|
334
|
Chris@1
|
335 </ul>
|
Chris@1
|
336
|
Chris@1
|
337 <p>The first two points can be handled straightforwardly via the fact
|
Chris@1
|
338 that the codec has complete control over mapping granule position to
|
Chris@1
|
339 absolute time; non-integer frame rates and offsets can be set in the
|
Chris@1
|
340 codec's initial header, and the rest is just arithmetic.</p>
|
Chris@1
|
341
|
Chris@1
|
342 <p>The third point appears trickier at first glance, but it too can be
|
Chris@1
|
343 handled through the granule position mapping mechanism. Here we
|
Chris@1
|
344 arrange the granule position in such a way that granule positions of
|
Chris@1
|
345 key frames are easy to find. Divide the granule position into two
|
Chris@1
|
346 fields; the most-significant bits are an absolute frame counter, but
|
Chris@1
|
347 it's only updated at each key frame. The least significant bits encode
|
Chris@1
|
348 the number of frames since the last key frame. In this way, each
|
Chris@1
|
349 granule position encodes the absolute time of the current frame
|
Chris@1
|
350 as well as the absolute time of the last key frame.</p>
|
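<p>A minimal sketch of such a two-field layout follows. The width of the low field is passed as a hypothetical <tt>shift</tt> parameter and the function names are invented for this example; an actual codec defines its own field width and its own mapping from frame index to absolute time.</p>

```python
from typing import Tuple


def pack_granulepos(keyframe_index: int, frames_since_keyframe: int,
                    shift: int) -> int:
    """Combine the last key frame's index and the offset into one value."""
    if frames_since_keyframe >= 1 << shift:
        raise ValueError("offset does not fit in the low field")
    return (keyframe_index << shift) | frames_since_keyframe


def unpack_granulepos(granulepos: int, shift: int) -> Tuple[int, int]:
    """Split a value into (last key frame index, frames since key frame)."""
    keyframe_index = granulepos >> shift
    frames_since_keyframe = granulepos & ((1 << shift) - 1)
    return keyframe_index, frames_since_keyframe


def frame_index(granulepos: int, shift: int) -> int:
    """Absolute index of the current frame."""
    keyframe_index, offset = unpack_granulepos(granulepos, shift)
    return keyframe_index + offset


gp = pack_granulepos(120, 7, shift=6)   # key frame at 120, current frame 127
```

<p>Because the key frame counter occupies the most-significant bits, the packed values still increase or remain equal from frame to frame, as the granule position rules require.</p>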
Chris@1
|
351
|
Chris@1
|
352 <p>Seeking to the most recent preceding key frame is then accomplished by
|
Chris@1
|
353 first seeking to the original desired point, inspecting the granulepos
|
Chris@1
|
354 of the resulting video page, extracting from that granulepos the
|
Chris@1
|
355 absolute time of the desired key frame, and then seeking directly to
|
Chris@1
|
356 that key frame's page. Of course, it's still possible for an
|
Chris@1
|
357 application to ignore key frames and use a simpler seeking algorithm
|
Chris@1
|
358 (decode would be unable to present decoded video until the next
|
Chris@1
|
359 key frame). Surprisingly many player applications do choose the
|
Chris@1
|
360 simpler approach.</p>
|
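<p>The two-step seek can be sketched over an in-memory list of page granule positions. The sketch assumes one video frame per page, a hypothetical low-field width <tt>SHIFT</tt>, and plain bisection on frame index; all function names are invented for this example.</p>

```python
from typing import Sequence

SHIFT = 6   # hypothetical width of the "frames since key frame" field


def frame_of(granulepos: int) -> int:
    """Absolute frame index encoded by a bit-divided granule position."""
    return (granulepos >> SHIFT) + (granulepos & ((1 << SHIFT) - 1))


def first_page_at_or_after(pages: Sequence[int], frame: int) -> int:
    """Bisect for the first page whose frame index is >= frame."""
    lo, hi = 0, len(pages)
    while lo < hi:
        mid = (lo + hi) // 2
        if frame_of(pages[mid]) < frame:
            lo = mid + 1
        else:
            hi = mid
    return lo


def seek_to_keyframe(pages: Sequence[int], target_frame: int) -> int:
    """Return the index of the page holding the key frame for target_frame."""
    if not pages:
        raise ValueError("no pages to seek in")
    # First seek: land on the requested frame (clamped to the last page).
    landed = min(first_page_at_or_after(pages, target_frame), len(pages) - 1)
    # The high field of that page names the most recent key frame.
    keyframe = pages[landed] >> SHIFT
    # Second seek: go directly to the key frame's page.
    return first_page_at_or_after(pages, keyframe)


# Twelve frames with a key frame every eight frames (frames 0 and 8).
pages = [((f - f % 8) << SHIFT) | (f % 8) for f in range(12)]
start = seek_to_keyframe(pages, 10)
```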
Chris@1
|
361
|
Chris@1
|
362 <h3>Granule position, packets and pages</h3>
|
Chris@1
|
363
|
Chris@1
|
364 <p>Although each packet of data in a logical stream theoretically has a
|
Chris@1
|
365 specific granule position, only one granule position is encoded
|
Chris@1
|
366 per page. It is possible to encode a logical stream such that each
|
Chris@1
|
367 page contains only a single packet (so that granule positions are
|
Chris@1
|
368 preserved for each packet); however, a one-to-one packet/page mapping
|
Chris@1
|
369 is not intended to be the general case.</p>
|
Chris@1
|
370
|
Chris@1
|
371 <p>Because Ogg functions at the page, not packet, level, this
|
Chris@1
|
372 once-per-page time information provides Ogg with the finest-grained
|
Chris@1
|
373 time information it can use. Ogg passes this granule positioning data
|
Chris@1
|
374 to the codec (along with the packets extracted from a page); it is the
|
Chris@1
|
375 responsibility of codecs to track timing information at granularities
|
Chris@1
|
376 finer than a single page.</p>
|
Chris@1
|
377
|
Chris@1
|
378 <h3>Start-time and end-time positioning</h3>
|
Chris@1
|
379
|
Chris@1
|
380 <p>A granule position represents the <em>instantaneous time location
|
Chris@1
|
381 between two pages</em>. However, continuous streams and discontinuous
|
Chris@1
|
382 streams differ on whether the granulepos represents the end-time of
|
Chris@1
|
383 the data on a page or the start-time. Continuous streams are
|
Chris@1
|
384 'end-time' encoded; the granulepos represents the point in time
|
Chris@1
|
385 immediately after the last data decoded from a page. Discontinuous
|
Chris@1
|
386 streams are 'start-time' encoded; the granulepos represents the point
|
Chris@1
|
387 in time of the first data decoded from the page.</p>
|
Chris@1
|
388
|
Chris@1
|
389 <p>An Ogg stream type is declared continuous or discontinuous by its
|
Chris@1
|
390 codec. A given codec may support both continuous and discontinuous
|
Chris@1
|
391 operation so long as any given logical stream is continuous or
|
Chris@1
|
392 discontinuous for its entirety and the codec is able to ascertain (and
|
Chris@1
|
393 inform the Ogg layer) as to which after decoding the initial stream
|
Chris@1
|
394 header. The majority of codecs will always be continuous (such as
|
Chris@1
|
395 Vorbis) or discontinuous (such as Writ).</p>
|
Chris@1
|
396
|
Chris@1
|
397 <p>Start- and end-time encoding do not affect multiplexing sort-order;
|
Chris@1
|
398 pages are still sorted by the absolute time a given granulepos maps to
|
Chris@1
|
399 regardless of whether that granulepos represents start- or
|
Chris@1
|
400 end-time.</p>
|
Chris@1
|
401
|
Chris@1
|
402 <h2>Multiplex/Demultiplex Division of Labor</h2>
|
Chris@1
|
403
|
Chris@1
|
404 <p>The Ogg multiplex/demultiplex layer provides mechanisms for encoding
|
Chris@1
|
405 raw packets into Ogg pages, decoding Ogg pages back into the original
|
Chris@1
|
406 codec packets, determining the logical structure of an Ogg stream, and
|
Chris@1
|
407 navigating through and synchronizing with an Ogg stream at a desired
|
Chris@1
|
408 stream location. Strict multiplex/demultiplex operations are entirely
|
Chris@1
|
409 in the Ogg domain and require no intervention from codecs.</p>
|
Chris@1
|
410
|
Chris@1
|
411 <p>Implementation of more complex operations does require codec
|
Chris@1
|
412 knowledge, however. Unlike other framing systems, Ogg maintains
|
Chris@1
|
413 strict separation between framing and the framed bitstream data; Ogg
|
Chris@1
|
414 does not replicate codec-specific information in the page/framing
|
Chris@1
|
415 data, nor does Ogg blur the line between framing and stream
|
Chris@1
|
416 data/metadata. Because Ogg is fully data-agnostic toward the data it
|
Chris@1
|
417 frames, operations which require specifics of bitstream data (such as
|
Chris@1
|
418 'seek to key frame') also require interaction with the codec layer
|
Chris@1
|
419 (because, in this example, the Ogg layer is not aware of the concept
|
Chris@1
|
420 of key frames). This is different from systems that blur the
|
Chris@1
|
421 separation between framing and stream data in order to simplify the
|
Chris@1
|
422 separation of code. The Ogg system purposely keeps the distinction in
|
Chris@1
|
423 data simple so that later codec innovations are not constrained by
|
Chris@1
|
424 framing design.</p>
|
Chris@1
|
425
|
Chris@1
|
426 <p>For this reason, however, complex seeking operations require
|
Chris@1
|
427 interaction with the codecs in order to decode the granule position of
|
Chris@1
|
428 a given stream type back to absolute time or in order to find
|
Chris@1
|
429 'decodable points' such as key frames in video.</p>
|
Chris@1
|
430
|
Chris@1
|
431 <h2>Unsorted Discussion Points</h2>
|
Chris@1
|
432
|
Chris@1
|
433 <p>flushes around key frames? RFC suggestion: repaginating or building a
|
Chris@1
|
434 stream this way is nice but not required</p>
|
Chris@1
|
435
|
Chris@1
|
436 <h2>Appendix A: multiplexing examples</h2>
|
Chris@1
|
437
|
Chris@1
|
438 <div id="copyright">
|
Chris@1
|
439 The Xiph Fish Logo is a
|
Chris@1
|
440 trademark (&trade;) of Xiph.Org.<br/>
|
Chris@1
|
441
|
Chris@1
|
442 These pages © 1994 - 2005 Xiph.Org. All rights reserved.
|
Chris@1
|
443 </div>
|
Chris@1
|
444
|
Chris@1
|
445 </body>
|
Chris@1
|
446 </html>
|