annotate src/libogg-1.3.0/doc/oggstream.html @ 1:05aa0afa9217

Bring in flac, ogg, vorbis
author Chris Cannam
date Tue, 19 Mar 2013 17:37:49 +0000
Chris@1 1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
Chris@1 2 <html>
Chris@1 3 <head>
Chris@1 4
Chris@1 5 <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-15"/>
Chris@1 6 <title>Ogg Documentation</title>
Chris@1 7
Chris@1 8 <style type="text/css">
Chris@1 9 body {
Chris@1 10 margin: 0 18px 0 18px;
Chris@1 11 padding-bottom: 30px;
Chris@1 12 font-family: Verdana, Arial, Helvetica, sans-serif;
Chris@1 13 color: #333333;
Chris@1 14 font-size: .8em;
Chris@1 15 }
Chris@1 16
Chris@1 17 a {
Chris@1 18 color: #3366cc;
Chris@1 19 }
Chris@1 20
Chris@1 21 img {
Chris@1 22 border: 0;
Chris@1 23 }
Chris@1 24
Chris@1 25 #xiphlogo {
Chris@1 26 margin: 30px 0 16px 0;
Chris@1 27 }
Chris@1 28
Chris@1 29 #content p {
Chris@1 30 line-height: 1.4;
Chris@1 31 }
Chris@1 32
Chris@1 33 h1, h1 a, h2, h2 a, h3, h3 a {
Chris@1 34 font-weight: bold;
Chris@1 35 color: #ff9900;
Chris@1 36 margin: 1.3em 0 8px 0;
Chris@1 37 }
Chris@1 38
Chris@1 39 h1 {
Chris@1 40 font-size: 1.3em;
Chris@1 41 }
Chris@1 42
Chris@1 43 h2 {
Chris@1 44 font-size: 1.2em;
Chris@1 45 }
Chris@1 46
Chris@1 47 h3 {
Chris@1 48 font-size: 1.1em;
Chris@1 49 }
Chris@1 50
Chris@1 51 li {
Chris@1 52 line-height: 1.4;
Chris@1 53 }
Chris@1 54
Chris@1 55 #copyright {
Chris@1 56 margin-top: 30px;
Chris@1 57 line-height: 1.5em;
Chris@1 58 text-align: center;
Chris@1 59 font-size: .8em;
Chris@1 60 color: #888888;
Chris@1 61 clear: both;
Chris@1 62 }
Chris@1 63
Chris@1 64 .caption {
Chris@1 65 color: #000000;
Chris@1 66 background-color: #aabbff;
Chris@1 67 margin: 1em;
Chris@1 68 margin-left: 2em;
Chris@1 69 margin-right: 2em;
Chris@1 70 padding: 1em;
Chris@1 71 padding-bottom: 0em;
Chris@1 72 overflow: hidden;
Chris@1 73 }
Chris@1 74
Chris@1 75 .caption p {
Chris@1 76 clear: none;
Chris@1 77 }
Chris@1 78
Chris@1 79 .caption img {
Chris@1 80 display: block;
Chris@1 81 margin: 0px;
Chris@1 82 margin-left: auto;
Chris@1 83 margin-right: auto;
Chris@1 84 margin-bottom: 1.5em;
Chris@1 85 background-color: #ffffff;
Chris@1 86 padding: 10px;
Chris@1 87 }
Chris@1 88
Chris@1 89 #thepage {
Chris@1 90 margin-left: auto;
Chris@1 91 margin-right: auto;
Chris@1 92 width: 840px;
Chris@1 93 }
Chris@1 94
Chris@1 95 </style>
Chris@1 96
Chris@1 97 </head>
Chris@1 98
Chris@1 99 <body>
Chris@1 100 <div id="thepage">
Chris@1 101
Chris@1 102 <div id="xiphlogo">
Chris@1 103 <a href="http://www.xiph.org/"><img src="fish_xiph_org.png" alt="Fish Logo and Xiph.org"/></a>
Chris@1 104 </div>
Chris@1 105
Chris@1 106 <h1>Ogg bitstream overview</h1>
Chris@1 107
<p>This document serves as a starting point for understanding the design
Chris@1 109 and implementation of the Ogg container format. If you're new to Ogg
Chris@1 110 or merely want a high-level technical overview, start reading here.
Chris@1 111 Other documents linked from the <a href="index.html">index page</a>
Chris@1 112 give distilled technical descriptions and references of the container
Chris@1 113 mechanisms. This document is intended to aid understanding.
Chris@1 114
Chris@1 115 <h2>Container format design points</h2>
Chris@1 116
Chris@1 117 <p>Ogg is intended to be a simplest-possible container, concerned only
Chris@1 118 with framing, ordering, and interleave. It can be used as a stream delivery
Chris@1 119 mechanism, for media file storage, or as a building block toward
Chris@1 120 implementing a more complex, non-linear container (for example, see
Chris@1 121 the <a href="skeleton.html">Skeleton</a> or <a
Chris@1 122 href="http://en.wikipedia.org/wiki/Annodex">Annodex/CMML</a>).
Chris@1 123
Chris@1 124 <p>The Ogg container is not intended to be a monolithic
Chris@1 125 'kitchen-sink'. It exists only to frame and deliver in-order stream
Chris@1 126 data and as such is vastly simpler than most other containers.
Chris@1 127 Elementary and multiplexed streams are both constructed entirely from a
single building block (an Ogg page) composed of eight fields
totalling twenty-seven bytes (the page header), a list of packet lengths
(up to 255 bytes), and payload data (up to 65025 bytes). The structure
Chris@1 131 of every page is the same. There are no optional fields or alternate
Chris@1 132 encodings.
Chris@1 133
Chris@1 134 <p>Stream and media metadata is contained in Ogg and not built into
Chris@1 135 the Ogg container itself. Metadata is thus compartmentalized and
Chris@1 136 layered rather than part of a monolithic design, an especially good
Chris@1 137 idea as no two groups seem able to agree on what a complete or
Chris@1 138 complete-enough metadata set should be. In this way, the container and
Chris@1 139 container implementation are isolated from unnecessary metadata design
Chris@1 140 flux.
Chris@1 141
Chris@1 142 <h3>Streaming</h3>
Chris@1 143
Chris@1 144 <p>The Ogg container is primarily a streaming format,
Chris@1 145 encapsulating chronological, time-linear mixed media into a single
Chris@1 146 delivery stream or file. The design is such that an application can
Chris@1 147 always encode and/or decode all features of a bitstream in one pass
Chris@1 148 with no seeking and minimal buffering. Seeking to provide optimized
Chris@1 149 encoding (such as two-pass encoding) or interactive decoding (such as
scrubbing or instant replay) is neither disallowed nor discouraged; however,
no container feature requires nonlinear access to the bitstream.
Chris@1 152
Chris@1 153 <h3>Variable Bit Rate, Variable Payload Size</h3>
Chris@1 154
Chris@1 155 <p>Ogg is designed to contain any size data payload with bounded,
Chris@1 156 predictable efficiency. Ogg packets have no maximum size and a
Chris@1 157 zero-byte minimum size. There is no restriction on size changes from
Chris@1 158 packet to packet. Variable size packets do not require the use of any
Chris@1 159 optional or additional container features. There is no optimal
Chris@1 160 suggested packet size, though special consideration was paid to make
Chris@1 161 sure 50-200 byte packets were no less efficient than larger packet
sizes. The original design criterion was a 2% overhead at 50 byte
packets, dropping to a maximum working overhead of 1% with larger
packets, and a typical working overhead of 0.5-0.7% for most practical
uses.
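<p>The overhead figures can be sanity-checked with a little arithmetic. The
sketch below makes simplifying assumptions that are not part of the text
above: every packet has the same size, pages are filled completely, no packet
spans a page boundary, and each packet uses <code>size // 255 + 1</code>
entries in the packet length list (the lacing rule from the framing
specification). Real encoders usually flush pages earlier, so measured
overhead will differ.</p>

```python
HEADER_BYTES = 27     # fixed page header
MAX_ENTRIES = 255     # entries available in one page's packet length list


def page_overhead(packet_size: int) -> float:
    """Fraction of a completely full page that is container overhead.

    Assumes fixed-size packets and that no packet spans two pages.
    """
    if not 0 < packet_size < 255 * 255:
        raise ValueError("sketch only covers packets that fit in one page")
    entries_per_packet = packet_size // 255 + 1
    packets = MAX_ENTRIES // entries_per_packet
    overhead = HEADER_BYTES + packets * entries_per_packet
    payload = packets * packet_size
    return overhead / (overhead + payload)
```

<p>Under these assumptions 50 byte packets come out at a little over 2%
(282 overhead bytes for 12750 payload bytes), and 4000 byte packets at
under 0.5%.</p>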
Chris@1 166
Chris@1 167 <h3>Simple pagination</h3>
Chris@1 168
Chris@1 169 <p>Ogg is a byte-aligned container with no context-dependent, optional
Chris@1 170 or variable-length fields. Ogg requires no repacking of codec data.
Chris@1 171 The page structure is written out in-line as packet data is submitted
Chris@1 172 to the streaming abstraction. In addition, it is possible to
Chris@1 173 implement both Ogg mux and demux as MT-hot zero-copy abstractions (as
Chris@1 174 is done in the Tremor sourcebase).
Chris@1 175
Chris@1 176 <h3>Capture</h3>
Chris@1 177
Chris@1 178 <p>Ogg is designed for efficient and immediate stream capture with
Chris@1 179 high confidence. Although packets have no size limit in Ogg, pages
Chris@1 180 are a maximum of just under 64kB meaning that any Ogg stream can be
Chris@1 181 captured with confidence after seeing 128kB of data or less [worst
Chris@1 182 case; typical figure is 6kB] from any random starting point in the
Chris@1 183 stream.
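<p>Capture can be sketched as: scan for the <code>OggS</code> pattern, read
the header and packet length list to find where the page should end, then
confirm the page checksum. The code below is an illustrative sketch rather
than the libogg implementation; <code>make_page</code> is a hypothetical
helper that exists only so the example is self-contained. The checksum
parameters (polynomial 0x04c11db7, zero initial value, no final XOR, computed
with the checksum field zeroed) are those given in the framing
specification.</p>

```python
import struct

HEADER = struct.Struct("<4sBBqIIIB")   # 27-byte fixed page header
CRC_OFFSET = 22                        # byte offset of the checksum field


def ogg_crc(data: bytes) -> int:
    """Bitwise CRC-32: polynomial 0x04c11db7, initial value 0, not reflected."""
    crc = 0
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ 0x04C11DB7) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


def make_page(serial: int, sequence: int, granule: int, payload: bytes) -> bytes:
    """Hypothetical helper: build a page holding one packet under 255 bytes."""
    if len(payload) >= 255:
        raise ValueError("helper only handles short packets")
    head = HEADER.pack(b"OggS", 0, 0, granule, serial, sequence, 0, 1)
    page = bytearray(head + bytes([len(payload)]) + payload)
    page[CRC_OFFSET:CRC_OFFSET + 4] = struct.pack("<I", ogg_crc(bytes(page)))
    return bytes(page)


def capture(buf: bytes, start: int = 0) -> int:
    """Return the offset of the first verified page at or after `start`, or -1."""
    pos = buf.find(b"OggS", start)
    while pos != -1:
        if pos + HEADER.size <= len(buf):
            segments = buf[pos + 26]
            table = buf[pos + 27:pos + 27 + segments]
            end = pos + 27 + segments + sum(table)
            if len(table) == segments and end <= len(buf):
                page = bytearray(buf[pos:end])
                stored = struct.unpack_from("<I", page, CRC_OFFSET)[0]
                # The checksum is computed with its own field set to zero.
                page[CRC_OFFSET:CRC_OFFSET + 4] = b"\0\0\0\0"
                if ogg_crc(bytes(page)) == stored:
                    return pos
        # False alarm or incomplete page: keep scanning.
        pos = buf.find(b"OggS", pos + 1)
    return -1
```

<p>A stray <code>OggS</code> in payload data fails the checksum test and is
skipped, which is what makes capture from an arbitrary offset reliable.</p>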
Chris@1 184
Chris@1 185 <h3>Seeking</h3>
Chris@1 186
Chris@1 187 <p>Ogg implements simple coarse- and fine-grained seeking by design.
Chris@1 188
Chris@1 189 <p>Coarse seeking may be performed by simply 'moving the tone arm' to a
Chris@1 190 new position and 'dropping the needle'. Rapid capture with
Chris@1 191 accompanying timecode from any location in an Ogg file is guaranteed
Chris@1 192 by the stream design. From the acquisition of the first timecode,
Chris@1 193 all data needed to play back from that time code forward is ahead of
Chris@1 194 the stream cursor.
Chris@1 195
Chris@1 196 <p>Ogg implements full sample-granularity seeking using an
Chris@1 197 interpolated bisection search built on the capture and timecode
Chris@1 198 mechanisms used by coarse seeking. As above, once a search finds
Chris@1 199 the desired timecode, all data needed to play back from that time code
Chris@1 200 forward is ahead of the stream cursor.
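<p>The interpolated bisection search can be sketched over a simplified model
of a stream. Here a list of <code>(byte_offset, timestamp)</code> pairs stands
in for the file, and a <code>bisect_left</code> lookup stands in for 'drop the
needle at a byte offset and capture the next page'. The function name, the
model, and the fallback to plain bisection after an unproductive probe are
all choices made for this example, not a description of any particular
implementation.</p>

```python
from bisect import bisect_left


def seek(pages, total_bytes, target):
    """Return the index of the first page with timestamp >= target, or None.

    `pages` is a list of (byte_offset, timestamp) pairs in stream order and
    `total_bytes` is the length of the modelled file.
    """
    offsets = [offset for offset, _ in pages]
    lo, hi = 0, total_bytes                  # byte window still to be searched
    lo_time = 0                              # assumes the stream starts at time 0
    hi_time = pages[-1][1] if pages else 0
    best = None
    use_midpoint = False
    while lo < hi:
        span = hi_time - lo_time
        if use_midpoint or span <= 0:
            guess = (lo + hi) // 2
        else:
            # Guess an offset by assuming time advances linearly with bytes.
            fraction = min(max((target - lo_time) / span, 0.0), 1.0)
            guess = min(lo + int(fraction * (hi - lo)), hi - 1)
        i = bisect_left(offsets, guess)      # capture the first page from `guess`
        use_midpoint = i == len(pages) or offsets[i] >= hi
        if use_midpoint:
            hi = guess                       # no unseen page in [guess, hi)
        elif pages[i][1] < target:
            lo, lo_time = offsets[i] + 1, pages[i][1]        # answer lies later
        else:
            hi, hi_time, best = offsets[i], pages[i][1], i   # here or earlier
    return best
```

<p>Each probe either raises the lower bound past a page that is too early or
lowers the upper bound, so the window always shrinks and the search needs no
index.</p>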
Chris@1 201
Chris@1 202 <p>Both coarse and fine seeking use the page structure and sequencing
Chris@1 203 inherent to the Ogg format. All Ogg streams are fully seekable from
Chris@1 204 creation; seekability is unaffected by truncation or missing data, and
Chris@1 205 is tolerant of gross corruption. Seek operations are neither 'fuzzy' nor
Chris@1 206 heuristic.
Chris@1 207
Chris@1 208 <p>Seeking without use of an index is a major point of the Ogg
design. There are two primary reasons why Ogg transport forgoes an index:
Chris@1 210
Chris@1 211 <ol>
Chris@1 212
Chris@1 213 <li>An index is only marginally useful in Ogg for the complexity
Chris@1 214 added; it adds no new functionality and seldom improves performance
Chris@1 215 noticeably. Empirical testing shows that indexless interpolation
Chris@1 216 search does not require many more seeks in practice than using an
Chris@1 217 index would.
Chris@1 218
Chris@1 219 <li>'Optional' indexes encourage lazy implementations that can seek
Chris@1 220 only when indexes are present, or that implement indexless seeking
Chris@1 221 only by building an internal index after reading the entire file
Chris@1 222 beginning to end. This has been the fate of other containers that
Chris@1 223 specify optional indexing.
Chris@1 224
Chris@1 225 </ol>
Chris@1 226
Chris@1 227 <p>In addition, it must be possible to create an Ogg stream in a
Chris@1 228 single pass. Although an optional index can simply be tacked on the
Chris@1 229 end of the created stream, some software groups object to
Chris@1 230 end-positioned indexes and claim to be unwilling to support indexes
Chris@1 231 not located at the stream beginning.
Chris@1 232
Chris@1 233 <p><i>All this said, it's become clear that an optional index is a
Chris@1 234 demanded feature. For this reason, the <a
Chris@1 235 href="http://wiki.xiph.org/Ogg_Index">OggSkeleton now defines a
Chris@1 236 proposed index.</a></i>
Chris@1 237
Chris@1 238 <h3>Simple multiplexing</h3>
Chris@1 239
Chris@1 240 <p>Ogg multiplexes streams by interleaving pages from multiple elementary streams into a
Chris@1 241 multiplexed stream in time order. The multiplexed pages are not
Chris@1 242 altered. Muxing an Ogg AV stream out of separate audio,
Chris@1 243 video and data streams is akin to shuffling several decks of cards
Chris@1 244 together into a single deck; the cards themselves remain unchanged.
Chris@1 245 Demultiplexing is similarly simple (as the cards are marked).
Chris@1 246
Chris@1 247 <p>The goal of this design is to make the mux/demux operation as
Chris@1 248 trivial as possible to allow live streaming systems to build and
Chris@1 249 rebuild streams on the fly with minimal CPU usage and no additional
Chris@1 250 storage or latency requirements.
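<p>The card-shuffling analogy maps directly onto a merge of sorted sequences.
In this sketch a page is modelled as a
<code>(timestamp, serial_number, data)</code> tuple; that representation and
the function names are assumptions made for the example.</p>

```python
import heapq
from collections import defaultdict


def mux(*streams):
    """Interleave pages from several elementary streams in timestamp order.

    Each input stream is a list of (timestamp, serial_number, data) pages
    that is already in timestamp order. Pages pass through unmodified.
    """
    return list(heapq.merge(*streams, key=lambda page: page[0]))


def demux(pages):
    """Sort multiplexed pages back into per-stream lists by serial number."""
    streams = defaultdict(list)
    for page in pages:
        streams[page[1]].append(page)
    return dict(streams)
```

<p>Because neither direction touches page contents, demultiplexing a
multiplexed stream gives back exactly the pages that went in.</p>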
Chris@1 251
Chris@1 252 <h3>Continuous and Discontinuous Media</h3>
Chris@1 253
Chris@1 254 <p>Ogg streams belong to one of two categories, "Continuous" streams and
Chris@1 255 "Discontinuous" streams.
Chris@1 256
Chris@1 257 <p>A stream that provides a gapless, time-continuous media type with a
Chris@1 258 fine-grained timebase is considered to be 'Continuous'. A continuous
Chris@1 259 stream should never be starved of data. Examples of continuous data
Chris@1 260 types include broadcast audio and video.
Chris@1 261
Chris@1 262 <p>A stream that delivers data in a potentially irregular pattern or
Chris@1 263 with widely spaced timing gaps is considered to be 'Discontinuous'. A
Chris@1 264 discontinuous stream may be best thought of as data representing
Chris@1 265 scattered events; although they happen in order, they are typically
unconnected data often located far apart. One example of a
discontinuous stream type is captioning such as <a
Chris@1 268 href="http://wiki.xiph.org/OggKate">Ogg Kate</a>. Although it's
Chris@1 269 possible to design captions as a continuous stream type, it's most
Chris@1 270 natural to think of captions as widely spaced pieces of text with
Chris@1 271 little happening between.
Chris@1 272
Chris@1 273 <p>The fundamental reason for distinction between continuous and
Chris@1 274 discontinuous streams concerns buffering.
Chris@1 275
Chris@1 276 <h3>Buffering</h3>
Chris@1 277
Chris@1 278 <p>A continuous stream is, by definition, gapless. Ogg buffering is based
Chris@1 279 on the simple premise of never allowing an active continuous stream
Chris@1 280 to starve for data during decode; buffering works ahead until all
Chris@1 281 continuous streams in a physical stream have data ready and no further.
Chris@1 282
Chris@1 283 <p>Discontinuous stream data is not assumed to be predictable. The
Chris@1 284 buffering design takes discontinuous data 'as it comes' rather than
Chris@1 285 working ahead to look for future discontinuous data for a potentially
Chris@1 286 unbounded period. Thus, the buffering process makes no attempt to fill
Chris@1 287 discontinuous stream buffers; their pages simply 'fall out' of the
Chris@1 288 stream when continuous streams are handled properly.
Chris@1 289
Chris@1 290 <p>Buffering requirements in this design need not be explicitly
Chris@1 291 declared or managed in the encoded stream. The decoder simply reads as
Chris@1 292 much data as is necessary to keep all continuous stream types gapless
Chris@1 293 and no more, with discontinuous data processed as it arrives in the
Chris@1 294 continuous data. Buffering is implicitly optimal for the given
Chris@1 295 stream. Because all pages of all data types are stamped with absolute
Chris@1 296 timing information within the stream, inter-stream synchronization
Chris@1 297 timing is always maintained without the need for explicitly declared
Chris@1 298 buffer-ahead hinting.
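<p>One way to read the buffering rule is as a loop that keeps pulling pages
until no continuous stream is empty. The sketch below models pages as
<code>(serial_number, timestamp, data)</code> tuples; the function name and
the per-stream queues are assumptions for illustration only.</p>

```python
from collections import defaultdict, deque


def fill_buffers(page_source, continuous, buffers):
    """Read pages until every continuous stream has at least one page queued.

    `page_source` is an iterator of (serial_number, timestamp, data) pages in
    multiplexed order, `continuous` is the set of serial numbers declared
    continuous by their codecs, and `buffers` is a defaultdict(deque) keyed
    by serial number. Returns False if the source ran dry first.
    """
    while any(not buffers[serial] for serial in continuous):
        page = next(page_source, None)
        if page is None:
            return False
        # Discontinuous pages are queued as they turn up; nothing waits for them.
        buffers[page[0]].append(page)
    return True
```

<p>Note that the loop condition never mentions discontinuous streams: their
pages are simply collected along the way.</p>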
Chris@1 299
Chris@1 300 <h3>Codec metadata</h3>
Chris@1 301
Chris@1 302 <p>Ogg does not replicate codec-specific metadata into the mux layer
Chris@1 303 in an attempt to make the mux and codec layer implementations 'fully
Chris@1 304 separable'. Things like specific timebase, keyframing strategy, frame
Chris@1 305 duration, etc, do not appear in the Ogg container. The mux layer is,
Chris@1 306 instead, expected to query a codec through a centralized interface,
Chris@1 307 left to the implementation, for this data when it is needed.
Chris@1 308
Chris@1 309 <p>Though modern design wisdom usually prefers to predict all possible
Chris@1 310 needs of current and future codecs then embed these dependencies and
Chris@1 311 the required metadata into the container itself, this strategy
Chris@1 312 increases container specification complexity, fragility, and rigidity.
Chris@1 313 The mux and codec code becomes more independent, but the
Chris@1 314 specifications become logically less independent. A codec can't do
Chris@1 315 what a container hasn't already provided for. Novel codecs are harder
Chris@1 316 to support, and you can do fewer useful things with the ones you've
Chris@1 317 already got (eg, try to make a good splitter without using any codecs.
Chris@1 318 Such a splitter is limited to splitting at keyframes only, or building
Chris@1 319 yet another new mechanism into the container layer to mark what frames
Chris@1 320 to skip displaying).
Chris@1 321
Chris@1 322 <p>Ogg's design goes the opposite direction, where the specification
Chris@1 323 is to be as simple, easy to understand, and 'proofed' against novel
Chris@1 324 codecs as possible. When an Ogg mux layer requires codec-specific
Chris@1 325 information, it queries the codec (or a codec stub). This trades a
Chris@1 326 more complex implementation for a simpler, more flexible
Chris@1 327 specification.
Chris@1 328
Chris@1 329 <h3>Stream structure metadata</h3>
Chris@1 330
Chris@1 331 <p>The Ogg container itself does not define a metadata system for
Chris@1 332 declaring the structure and interrelations between multiple media
Chris@1 333 types in a muxed stream. That is, the Ogg container itself does not
specify data like 'which stream is the subtitle stream?' or 'which
Chris@1 335 video stream is the primary angle?'. This metadata still exists, but
Chris@1 336 is stored by the Ogg container rather than being built into the Ogg
Chris@1 337 container itself. Xiph specifies the 'Skeleton' metadata format for Ogg
Chris@1 338 streams, but this decoupling of container and stream structure
Chris@1 339 metadata means it is possible to use Ogg with any metadata
Chris@1 340 specification without altering the container itself, or without stream
Chris@1 341 structure metadata at all.
Chris@1 342
Chris@1 343 <h3>Frame accurate absolute position</h3>
Chris@1 344
Chris@1 345 <p>Every Ogg page is stamped with a 64 bit 'granule position' that
Chris@1 346 serves as an absolute timestamp for mux and seeking. A few nifty
Chris@1 347 little tricks are usually also embedded in the granpos state, but
Chris@1 348 we'll leave those aside for the moment (strictly speaking, they're
Chris@1 349 part of each codec's mapping, not Ogg).
Chris@1 350
<p>As mentioned above, granule positions are mapped into
Chris@1 352 absolute timestamps by the codec, rather than being a hard timestamp.
Chris@1 353 This allows maximally efficient use of the available 64 bits to
Chris@1 354 address every sample/frame position without approximation while
Chris@1 355 supporting new and previously unknown timebase encodings without
Chris@1 356 needing to extend or update the mux layer. When a codec needs a novel
Chris@1 357 timebase, it simply brings the code for that mapping along with it.
Chris@1 358 This is not a theoretical curiosity; new, wholly novel timebases were
Chris@1 359 deployed with the adoption of both Theora and Dirac. "Rolling INTRA"
Chris@1 360 (keyframeless video) also benefits from novel use of the granule
Chris@1 361 position.
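<p>To make the idea of a codec-supplied mapping concrete, here are two
simplified mappings from granule position to time. They are in the spirit of
an audio codec that counts samples and a video codec that packs a keyframe
number and a frame offset into one value, but they are illustrative only and
do not reproduce the exact rules of any codec's mapping specification.</p>

```python
from fractions import Fraction


def sample_granule_to_time(granule: int, sample_rate: int) -> Fraction:
    """Audio-style mapping: the granule position counts samples."""
    return Fraction(granule, sample_rate)


def split_granule_to_time(granule: int, shift: int, frame_rate: Fraction) -> Fraction:
    """Video-style mapping (simplified).

    The high bits hold the frame number of the most recent keyframe and the
    low `shift` bits count frames since that keyframe.
    """
    keyframe = granule >> shift
    offset = granule & ((1 << shift) - 1)
    return (keyframe + offset) / frame_rate
```

<p>The mux layer only ever needs the resulting time; how the 64 bits are
spent is entirely the codec's business.</p>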
Chris@1 362
Chris@1 363 <h2>Ogg stream arrangement</h2>
Chris@1 364
Chris@1 365 <h3>Packets, pages, and bitstreams</h3>
Chris@1 366
Chris@1 367 <p>Ogg codecs place raw compressed data into <em>packets</em>.
Chris@1 368 Packets are octet payloads containing the data needed for a single
Chris@1 369 decompressed unit, eg, one video frame. Packets have no maximum size
Chris@1 370 and may be zero length. They do not generally have any framing
Chris@1 371 information; strung together, the unframed packets form a <em>logical
Chris@1 372 bitstream</em> of codec data with no internal landmarks.
Chris@1 373
Chris@1 374 <div class="caption">
<img src="packets.png" alt="packets"/>
Chris@1 376
Chris@1 377 <p> Packets of raw codec data are not typically internally framed.
Chris@1 378 When they are strung together into a stream without any container to
Chris@1 379 provide framing, they lose their individual boundaries. Seek and
Chris@1 380 capture are not possible within an unframed stream, and for many
Chris@1 381 codecs with variable length payloads and/or early-packet termination
Chris@1 382 (such as Vorbis), it may become impossible to recover the original
Chris@1 383 frame boundaries even if the stream is scanned linearly from
Chris@1 384 beginning to end.
Chris@1 385
Chris@1 386 </div>
Chris@1 387
Chris@1 388 <p>Logical bitstream packets are grouped and framed into Ogg pages
Chris@1 389 along with a unique stream <em>serial number</em> to produce a
Chris@1 390 <em>physical bitstream</em>. An <em>elementary stream</em> is a
Chris@1 391 physical bitstream containing only a single logical bitstream. Each
Chris@1 392 page is a self contained entity, although a packet may be split and
Chris@1 393 encoded across one or more pages. The page decode mechanism is
Chris@1 394 designed to recognize, verify and handle single pages at a time from
Chris@1 395 the overall bitstream.
Chris@1 396
Chris@1 397 <div class="caption">
<img src="pages.png" alt="pages"/>
Chris@1 399
Chris@1 400 <p> The primary purpose of a container is to provide framing for raw
Chris@1 401 packets, marking the packet boundaries so the exact packets can be
Chris@1 402 retrieved for decode later. The container also provides secondary
Chris@1 403 functions such as capture, timestamping, sequencing, stream
Chris@1 404 identification and so on. Not all of these functions are represented in the diagram.
Chris@1 405
Chris@1 406 <p>In the Ogg container, pages do not necessarily contain
Chris@1 407 integer numbers of packets. Packets may span across page boundaries
Chris@1 408 or even multiple pages. This is necessary as pages have a maximum
Chris@1 409 possible size in order to provide capture guarantees, but packet
Chris@1 410 size is unbounded.
Chris@1 411 </div>
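<p>Packet boundaries are recorded in each page's list of packet lengths. The
sketch below follows the lacing rule from the framing specification: a packet
is written as a run of 255s followed by one value below 255, so a value below
255 always ends a packet. The function names are invented for this
example.</p>

```python
def lacing_values(packet_length: int) -> list:
    """Length-list entries for one packet.

    As many 255s as fit, then the remainder. The remainder is written even
    when it is zero, so an entry below 255 always marks the end of a packet.
    """
    return [255] * (packet_length // 255) + [packet_length % 255]


def packet_lengths(entries: list) -> tuple:
    """Recover packet sizes from one page's length list.

    Returns (sizes, carried): the sizes of the packets that end on this page
    and the byte count of a trailing packet that continues on the next page.
    """
    sizes, running = [], 0
    for value in entries:
        running += value
        if value < 255:
            sizes.append(running)
            running = 0
    return sizes, running
```

<p>A page whose list ends in 255 therefore finishes mid-packet, and the
remaining bytes are picked up from the following page of the same stream.</p>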
Chris@1 412
Chris@1 413
Chris@1 414 <p><a href="framing.html">Ogg Bitstream Framing</a> specifies
Chris@1 415 the page format of an Ogg bitstream, the packet coding process
Chris@1 416 and elementary bitstreams in detail.
Chris@1 417
Chris@1 418 <h3>Multiplexed bitstreams</h3>
Chris@1 419
Chris@1 420 <p>Multiple logical/elementary bitstreams can be combined into a single
Chris@1 421 <em>multiplexed bitstream</em> by interleaving whole pages from each
Chris@1 422 contributing elementary stream in time order. The result is a single
Chris@1 423 physical stream that multiplexes and frames multiple logical streams.
Chris@1 424 Each logical stream is identified by the unique stream serial number
Chris@1 425 stamped in its pages. A physical stream may include a 'meta-header'
Chris@1 426 (such as the <a href="skeleton.html">Ogg Skeleton</a>) comprising its
Chris@1 427 own Ogg page at the beginning of the physical stream. A decoder
Chris@1 428 recovers the original logical/elementary bitstreams out of the
Chris@1 429 physical bitstream by taking the pages in order from the physical
Chris@1 430 bitstream and redirecting them into the appropriate logical decoding
Chris@1 431 entity.
Chris@1 432
Chris@1 433 <div class="caption">
<img src="multiplex1.png" alt="multiplex1"/>
Chris@1 435
<p>Multiple media types are multiplexed into a single Ogg stream by
Chris@1 437 interleaving the pages from each elementary physical stream.
Chris@1 438
Chris@1 439 </div>
Chris@1 440
Chris@1 441 <p><a href="ogg-multiplex.html">Ogg Bitstream Multiplexing</a> specifies
Chris@1 442 proper multiplexing of an Ogg bitstream in detail.
Chris@1 443
Chris@1 444 <h3>Chaining</h3>
Chris@1 445
Chris@1 446 <p>Multiple Ogg physical bitstreams may be concatenated into a single new
Chris@1 447 stream; this is <em>chaining</em>. The bitstreams do not overlap; the
Chris@1 448 final page of a given logical bitstream is immediately followed by the
Chris@1 449 initial page of the next.</p>
Chris@1 450
Chris@1 451 <p>Each logical bitstream in a chain must have a unique serial number
Chris@1 452 within the scope of the full physical bitstream, not only within a
Chris@1 453 particular <em>link</em> or <em>segment</em> of the chain.</p>
Chris@1 454
Chris@1 455 <h3>Continuous and discontinuous streams</h3>
Chris@1 456
Chris@1 457 <p>Within Ogg, each stream must be declared (by the codec) to be
Chris@1 458 continuous- or discontinuous-time. Most codecs treat all streams they
Chris@1 459 use as either inherently continuous- or discontinuous-time, although
Chris@1 460 this is not a requirement. A codec may, as part of its mapping, choose
Chris@1 461 according to data in the initial header.
Chris@1 462
Chris@1 463 <p>Continuous-time pages are stamped by end-time, discontinuous pages
Chris@1 464 are stamped by begin-time. Pages in a multiplexed stream are
Chris@1 465 interleaved in order of the time stamp regardless of stream type.
Chris@1 466 Both continuous and discontinuous logical streams are used to seek
Chris@1 467 within a physical stream, however only continuous streams are used to
Chris@1 468 determine buffering depth; because discontinuous streams are stamped
Chris@1 469 by start time, they will always 'fall out' at the proper time when
Chris@1 470 buffering the continuous streams. See 'Examples' for an illustration
Chris@1 471 of the buffering mechanism.
Chris@1 472
Chris@1 473 <h2>Multiplexing Requirements</h2>
Chris@1 474
Chris@1 475 <p>Multiplexing requirements within Ogg are straightforward. When
Chris@1 476 constructing a single-link (unchained) physical bitstream consisting
Chris@1 477 of multiple elementary streams:
Chris@1 478
Chris@1 479 <ol>
Chris@1 480
Chris@1 481 <li><p> The initial header for each stream appears in sequence, each
Chris@1 482 header on a single page. All initial headers must appear with no
Chris@1 483 intervening data (no auxiliary header pages or packets, no data pages
Chris@1 484 or packets). Order of the initial headers is unspecified. The
Chris@1 485 'beginning of stream' flag is set on each initial header.
Chris@1 486
Chris@1 487 <li><p> All auxiliary headers for all streams must follow. Order
Chris@1 488 is unspecified. The final auxiliary header of each stream must flush
Chris@1 489 its page.
Chris@1 490
Chris@1 491 <li><p>Data pages for each stream follow, interleaved in time order.
Chris@1 492
Chris@1 493 <li><p>The final page of each stream sets the 'end of stream' flag.
Chris@1 494 Unlike initial pages, terminal pages for the logical bitstreams need
Chris@1 495 not occur contiguously; indeed it may not be possible for them to do so.
</ol>
Chris@1 497
<p>Each grouped bitstream must have a unique serial number within the
Chris@1 499 scope of the physical bitstream.</p>
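<p>The ordering rules above lend themselves to a mechanical check. The sketch
below covers only the parts that can be seen from the page flags and serial
numbers: initial headers grouped at the start, unique serial numbers, and an
'end of stream' page for every stream. Pages are modelled as
<code>(serial_number, is_first, is_last)</code> tuples, and auxiliary headers
and time ordering are deliberately left out. All names are hypothetical.</p>

```python
def check_link(pages):
    """Check page ordering for one unchained, multiplexed link.

    Each page is a (serial_number, is_first, is_last) tuple. Returns a list
    of problems; an empty list means the modelled rules hold.
    """
    problems = []
    started, ended = set(), set()
    headers_done = False
    for index, (serial, is_first, is_last) in enumerate(pages):
        if is_first:
            if headers_done:
                problems.append(f"page {index}: initial header after other pages")
            if serial in started:
                problems.append(f"page {index}: serial {serial} reused")
            started.add(serial)
        else:
            headers_done = True
            if serial not in started:
                problems.append(f"page {index}: stream {serial} has no initial header")
        if serial in ended:
            problems.append(f"page {index}: stream {serial} already ended")
        if is_last:
            ended.add(serial)
    for serial in sorted(started - ended):
        problems.append(f"stream {serial} never sets 'end of stream'")
    return problems
```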
Chris@1 500
<h3>Chaining and multiplexing</h3>
Chris@1 502
Chris@1 503 <p>Multiplexed and/or unmultiplexed bitstreams may be chained
Chris@1 504 consecutively. Such a physical bitstream obeys all the rules of both
Chris@1 505 chained and multiplexed streams. Each link, when unchained, must
Chris@1 506 stand on its own as a valid physical bitstream. Chained streams do
Chris@1 507 not mix or interleave; a new segment may not begin until all streams
Chris@1 508 in the preceding segment have terminated. </p>
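<p>Since links never overlap, a chained stream can be cut back into its links
by watching the stream flags. In this sketch pages are again modelled as
<code>(serial_number, is_first, is_last)</code> tuples, and a new link is
taken to begin at the first 'beginning of stream' page seen once every stream
of the current link has ended. The function name and the page model are
assumptions for illustration.</p>

```python
def split_links(pages):
    """Split a chained stream into a list of links (each a list of pages)."""
    links, current, open_streams = [], [], set()
    for page in pages:
        serial, is_first, is_last = page
        # A fresh initial header with nothing left open starts the next link.
        if is_first and current and not open_streams:
            links.append(current)
            current = []
        current.append(page)
        if is_first:
            open_streams.add(serial)
        if is_last:
            open_streams.discard(serial)
    if current:
        links.append(current)
    return links
```

<p>Each list returned should then stand on its own as a valid physical
bitstream, which is the property the paragraph above requires.</p>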
Chris@1 509
Chris@1 510 <h2>Codec Mapping Requirements</h2>
Chris@1 511
Chris@1 512 <p>Each codec is allowed some freedom in deciding how its logical
Chris@1 513 bitstream is encapsulated into an Ogg bitstream (even if it is a
Chris@1 514 trivial mapping, eg, 'plop the packets in and go'). This is the
Chris@1 515 codec's <em>mapping</em>. Ogg imposes a few mapping requirements
Chris@1 516 on any codec.
Chris@1 517
Chris@1 518 <ol>
Chris@1 519
Chris@1 520 <li><p>The <a href="framing.html">framing specification</a> defines
Chris@1 521 'beginning of stream' and 'end of stream' page markers via a header
Chris@1 522 flag (it is possible for a stream to consist of a single page). A
Chris@1 523 correct stream always consists of an integer number of pages, an easy
Chris@1 524 requirement given the variable size nature of pages.</p>
Chris@1 525
Chris@1 526 <li><p>The first page of an elementary Ogg bitstream consists of a single,
Chris@1 527 small 'initial header' packet that must include sufficient information
to identify the exact codec type. From this initial header, the codec
Chris@1 529 must also be able to determine its timebase and whether or not it is a
Chris@1 530 continuous- or discontinuous-time stream. The initial header must fit
Chris@1 531 on a single page. If a codec makes use of auxiliary headers (for
Chris@1 532 example, Vorbis uses two auxiliary headers), these headers must follow
Chris@1 533 the initial header immediately. The last header finishes its page;
Chris@1 534 data begins on a fresh page.
Chris@1 535
<p>As an example, Ogg Vorbis places the name and revision of the
Vorbis codec, the audio rate and the audio quality into this initial
header. Vorbis comments and detailed codec setup appear in the larger
auxiliary headers.</p>
Chris@1 540
Chris@1 541 <li><p>Granule positions must be translatable to an exact absolute
Chris@1 542 time value. As described above, the mux layer is permitted to query a
Chris@1 543 codec or codec stub plugin to perform this mapping. It is not
Chris@1 544 necessary for an absolute time to be mappable into a single unique
Chris@1 545 granule position value.
Chris@1 546
Chris@1 547 <li><p>Codecs are not required to use a fixed duration-per-packet (for
example, Vorbis does not). The mux layer is permitted to query a
Chris@1 549 codec or codec stub plugin for the time duration of a packet.
Chris@1 550
Chris@1 551 <li><p>Although an absolute time need not be translatable to a unique
Chris@1 552 granule position, a codec must be able to determine the unique granule
Chris@1 553 position of the current packet using the granule position of a
preceding packet.
Chris@1 555
Chris@1 556 <li><p>Packets and pages must be arranged in ascending
Chris@1 557 granule-position and time order.
Chris@1 558
Chris@1 559 </ol>
Chris@1 560
Chris@1 561 <h2>Examples</h2>
Chris@1 562
Chris@1 563 <em>[More to come shortly; this section is currently being revised and expanded]</em>
Chris@1 564
Chris@1 565 <p>Below, we present an example of a multiplexed and chained bitstream:</p>
Chris@1 566
Chris@1 567 <p><img src="stream.png" alt="stream"/></p>
Chris@1 568
Chris@1 569 <p>In this example, we see pages from five total logical bitstreams
Chris@1 570 multiplexed into a physical bitstream. Note the following
Chris@1 571 characteristics:</p>
Chris@1 572
Chris@1 573 <ol>
Chris@1 574 <li>Multiplexed bitstreams in a given link begin together; all of the
Chris@1 575 initial pages must appear before any data pages. When concurrently
Chris@1 576 multiplexed groups are chained, the new group does not begin until all
Chris@1 577 the bitstreams in the previous group have terminated.</li>
Chris@1 578
Chris@1 579 <li>The ordering of pages of concurrently multiplexed bitstreams is
governed by timestamp (not shown here); there is no regular
Chris@1 581 interleaving order. Pages within a logical bitstream appear in
Chris@1 582 sequence order.</li>
Chris@1 583 </ol>
Chris@1 584
Chris@1 585 <div id="copyright">
Chris@1 586 The Xiph Fish Logo is a
Chris@1 587 trademark (&trade;) of Xiph.Org.<br/>
Chris@1 588
Chris@1 589 These pages &copy; 1994 - 2010 Xiph.Org. All rights reserved.
Chris@1 590 </div>
Chris@1 591
Chris@1 592 </div>
Chris@1 593 </body>
Chris@1 594 </html>