annotate src/libogg-1.3.0/doc/rfc3533.txt @ 23:619f715526df sv_v2.1

Update Vamp plugin SDK to 2.5
author Chris Cannam
date Thu, 09 May 2013 10:52:46 +0100
Chris@1 7 Network Working Group S. Pfeiffer
Chris@1 8 Request for Comments: 3533 CSIRO
Chris@1 9 Category: Informational May 2003
Chris@1 10
Chris@1 11
Chris@1 12 The Ogg Encapsulation Format Version 0
Chris@1 13
Chris@1 14 Status of this Memo
Chris@1 15
Chris@1 16 This memo provides information for the Internet community. It does
Chris@1 17 not specify an Internet standard of any kind. Distribution of this
Chris@1 18 memo is unlimited.
Chris@1 19
Chris@1 20 Copyright Notice
Chris@1 21
Chris@1 22 Copyright (C) The Internet Society (2003). All Rights Reserved.
Chris@1 23
Chris@1 24 Abstract
Chris@1 25
Chris@1 26 This document describes the Ogg bitstream format version 0, which is
Chris@1 27 a general, freely-available encapsulation format for media streams.
Chris@1 28 It is able to encapsulate any kind and number of video and audio
Chris@1 29 encoding formats as well as other data streams in a single bitstream.
Chris@1 30
Chris@1 31 Terminology
Chris@1 32
Chris@1 33 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
Chris@1 34 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
Chris@1 35 document are to be interpreted as described in BCP 14, RFC 2119 [2].
Chris@1 36
Chris@1 37 Table of Contents
Chris@1 38
Chris@1 39 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 2
Chris@1 40 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 2
Chris@1 41 3. Requirements for a generic encapsulation format . . . . . . . 3
Chris@1 42 4. The Ogg bitstream format . . . . . . . . . . . . . . . . . . . 3
Chris@1 43 5. The encapsulation process . . . . . . . . . . . . . . . . . . 6
Chris@1 44 6. The Ogg page format . . . . . . . . . . . . . . . . . . . . . 9
Chris@1 45 7. Security Considerations . . . . . . . . . . . . . . . . . . . 11
Chris@1 46 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Chris@1 47 A. Glossary of terms and abbreviations . . . . . . . . . . . . . 13
Chris@1 48 B. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 14
Chris@1 49 Author's Address . . . . . . . . . . . . . . . . . . . . . . . 14
Chris@1 50 Full Copyright Statement . . . . . . . . . . . . . . . . . . . 15
Chris@1 51
Chris@1 52
Chris@1 53
Chris@1 54
Chris@1 55
Chris@1 56
Chris@1 57
Chris@1 58 Pfeiffer Informational [Page 1]
Chris@1 59
Chris@1 60 RFC 3533 OGG May 2003
Chris@1 61
Chris@1 62
Chris@1 63 1. Introduction
Chris@1 64
Chris@1 65 The Ogg bitstream format has been developed as a part of a larger
Chris@1 66 project aimed at creating a set of components for the coding and
Chris@1 67 decoding of multimedia content (codecs) which are to be freely
Chris@1 68 available and freely re-implementable, both in software and in
Chris@1 69 hardware for the computing community at large, including the Internet
Chris@1 70 community. It is the intention of the Ogg developers represented by
Chris@1 71 Xiph.Org that it be usable without intellectual property concerns.
Chris@1 72
Chris@1 73 This document describes the Ogg bitstream format and how to use it to
Chris@1 74 encapsulate one or several media bitstreams created by one or several
Chris@1 75 encoders. The Ogg transport bitstream is designed to provide
Chris@1 76 framing, error protection and seeking structure for higher-level
Chris@1 77 codec streams that consist of raw, unencapsulated data packets, such
Chris@1 78 as the Vorbis audio codec or the upcoming Tarkin and Theora video
Chris@1 79 codecs. It is capable of interleaving different binary media and
Chris@1 80 other time-continuous data streams that are prepared by an encoder as
Chris@1 81 a sequence of data packets. Ogg provides enough information to
Chris@1 82 properly separate data back into such encoder created data packets at
Chris@1 83 the original packet boundaries without relying on decoding to find
Chris@1 84 packet boundaries.
Chris@1 85
Chris@1 86 Please note that the MIME type application/ogg has been registered
Chris@1 87 with the IANA [1].
Chris@1 88
Chris@1 89 2. Definitions
Chris@1 90
Chris@1 91 For describing the Ogg encapsulation process, a set of terms will be
Chris@1 92 used whose meaning needs to be well understood. Therefore, some of
Chris@1 93 the most fundamental terms are defined now before we start with the
Chris@1 94 description of the requirements for a generic media stream
Chris@1 95 encapsulation format, the process of encapsulation, and the concrete
Chris@1 96 format of the Ogg bitstream. See the Appendix for a more complete
Chris@1 97 glossary.
Chris@1 98
Chris@1 99 The result of an Ogg encapsulation is called the "Physical (Ogg)
Chris@1 100 Bitstream". It encapsulates one or several encoder-created
Chris@1 101 bitstreams, which are called "Logical Bitstreams". A logical
Chris@1 102 bitstream, provided to the Ogg encapsulation process, has a
Chris@1 103 structure, i.e., it is split up into a sequence of so-called
Chris@1 104 "Packets". The packets are created by the encoder of that logical
Chris@1 105 bitstream and represent meaningful entities for that encoder only
Chris@1 106 (e.g., an uncompressed stream may use video frames as packets). They
Chris@1 107 do not contain boundary information - strung together they appear to
Chris@1 108 be streams of random bytes with no landmarks.
Chris@1 109
Chris@1 119 Please note that the term "packet" is not used in this document to
Chris@1 120 signify entities for transport over a network.
Chris@1 121
Chris@1 122 3. Requirements for a generic encapsulation format
Chris@1 123
Chris@1 124 The design idea behind Ogg was to provide a generic, linear media
Chris@1 125 transport format to enable both file-based storage and stream-based
Chris@1 126 transmission of one or several interleaved media streams independent
Chris@1 127 of the encoding format of the media data. Such an encapsulation
Chris@1 128 format needs to provide:
Chris@1 129
Chris@1 130 o framing for logical bitstreams.
Chris@1 131
Chris@1 132 o interleaving of different logical bitstreams.
Chris@1 133
Chris@1 134 o detection of corruption.
Chris@1 135
Chris@1 136 o recapture after a parsing error.
Chris@1 137
Chris@1 138 o position landmarks for direct random access of arbitrary positions
Chris@1 139 in the bitstream.
Chris@1 140
Chris@1 141 o streaming capability (i.e., no seeking is needed to build a 100%
Chris@1 142 complete bitstream).
Chris@1 143
Chris@1 144 o small overhead (i.e., use no more than approximately 1-2% of
Chris@1 145 bitstream bandwidth for packet boundary marking, high-level
Chris@1 146 framing, sync and seeking).
Chris@1 147
Chris@1 148 o simplicity to enable fast parsing.
Chris@1 149
Chris@1 150 o simple concatenation mechanism of several physical bitstreams.
Chris@1 151
Chris@1 152 All of these requirements have been taken into account in the design
Chris@1 153 of Ogg. Ogg supports framing and interleaving of logical
Chris@1 154 bitstreams, seeking landmarks, detection of corruption, and stream
Chris@1 155 resynchronisation after a parsing error with no more than
Chris@1 156 approximately 1-2% overhead. It is a generic framework to perform
Chris@1 157 encapsulation of time-continuous bitstreams. It does not know any
Chris@1 158 specifics about the codec data that it encapsulates and is thus
Chris@1 159 independent of any media codec.
Chris@1 160
Chris@1 161 4. The Ogg bitstream format
Chris@1 162
Chris@1 163 A physical Ogg bitstream consists of multiple logical bitstreams
Chris@1 164 interleaved in so-called "Pages". Whole pages are taken in order
Chris@1 165 from multiple logical bitstreams multiplexed at the page level. The
Chris@1 166 logical bitstreams are identified by a unique serial number in the
Chris@1 175 header of each page of the physical bitstream. This unique serial
Chris@1 176 number is created randomly and does not have any connection to the
Chris@1 177 content or encoder of the logical bitstream it represents. Pages of
Chris@1 178 all logical bitstreams are concurrently interleaved, but they need
Chris@1 179 not be in a regular order - they are only required to be consecutive
Chris@1 180 within the logical bitstream. Ogg demultiplexing reconstructs the
Chris@1 181 original logical bitstreams from the physical bitstream by taking the
Chris@1 182 pages in order from the physical bitstream and redirecting them into
Chris@1 183 the appropriate logical decoding entity.
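
The demultiplexing step described above can be sketched as follows. This is a minimal illustration rather than part of the specification: the (serial_number, payload) tuple layout and the name demultiplex are assumptions made for the example.

```python
from collections import defaultdict


def demultiplex(pages):
    """Route pages to their logical bitstream by serial number.

    `pages` is an iterable of (serial_number, payload) tuples taken in
    order from the physical bitstream (hypothetical layout).
    """
    streams = defaultdict(list)
    for serial, payload in pages:
        # Pages keep their physical order within each logical bitstream.
        streams[serial].append(payload)
    return dict(streams)


# Two logical bitstreams interleaved in an irregular order.
physical = [(0xA1, b"a0"), (0xB2, b"b0"), (0xA1, b"a1"),
            (0xA1, b"a2"), (0xB2, b"b1")]
logical = demultiplex(physical)
```

Because pages are appended in the order in which they are read, each logical bitstream receives its pages consecutively even though the interleaving itself is irregular.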
Chris@1 184
Chris@1 185 Each Ogg page contains only one type of data as it belongs to one
Chris@1 186 logical bitstream only. Pages are of variable size and have a page
Chris@1 187 header containing encapsulation and error recovery information. Each
Chris@1 188 logical bitstream in a physical Ogg bitstream starts with a special
Chris@1 189 start page (bos=beginning of stream) and ends with a special page
Chris@1 190 (eos=end of stream).
Chris@1 191
Chris@1 192 The bos page contains information to uniquely identify the codec type
Chris@1 193 and MAY contain information to set up the decoding process. The bos
Chris@1 194 page SHOULD also contain information about the encoded media - for
Chris@1 195 example, for audio, it should contain the sample rate and number of
Chris@1 196 channels. By convention, the first bytes of the bos page contain
Chris@1 197 magic data that uniquely identifies the required codec. It is the
Chris@1 198 responsibility of anyone fielding a new codec to make sure it is
Chris@1 199 possible to reliably distinguish his/her codec from all other codecs
Chris@1 200 in use. There is no fixed way to detect the end of the codec-
Chris@1 201 identifying marker. The format of the bos page is dependent on the
Chris@1 202 codec and therefore MUST be given in the encapsulation specification
Chris@1 203 of that logical bitstream type. Ogg also allows but does not require
Chris@1 204 secondary header packets after the bos page for logical bitstreams
Chris@1 205 and these must also precede any data packets in any logical
Chris@1 206 bitstream. These subsequent header packets are framed into an
Chris@1 207 integral number of pages, which will not contain any data packets.
Chris@1 208 So, a physical bitstream begins with the bos pages of all logical
Chris@1 209 bitstreams containing one initial header packet per page, followed by
Chris@1 210 the subsidiary header packets of all streams, followed by pages
Chris@1 211 containing data packets.
Chris@1 212
Chris@1 213 The encapsulation specification for one or more logical bitstreams is
Chris@1 214 called a "media mapping". An example of a media mapping is "Ogg
Chris@1 215 Vorbis", which uses the Ogg framework to encapsulate Vorbis-encoded
Chris@1 216 audio data for stream-based storage (such as files) and transport
Chris@1 217 (such as TCP streams or pipes). Ogg Vorbis provides the name and
Chris@1 218 revision of the Vorbis codec, the audio rate and the audio quality on
Chris@1 219 the Ogg Vorbis bos page. It also uses two additional header pages
Chris@1 220 per logical bitstream. The Ogg Vorbis bos page starts with the byte
Chris@1 221 0x01, followed by "vorbis" (a total of 7 bytes of identifier).
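
As an illustration of codec identification by magic bytes, the sketch below checks the first packet of a bos page against the 7-byte Ogg Vorbis identifier given above. The function name and the lookup table are assumptions for the example; a real demultiplexer would register one entry per supported media mapping.

```python
# Identifier from the text: byte 0x01 followed by "vorbis" (7 bytes).
KNOWN_MAGIC = {
    b"\x01vorbis": "vorbis",
}


def identify_codec(first_packet):
    """Return a codec name for the first packet of a bos page, or None."""
    for magic, name in KNOWN_MAGIC.items():
        if first_packet.startswith(magic):
            return name
    return None
```

Since there is no fixed way to detect the end of the identifying marker, the check compares prefixes of known length instead of searching for a terminator.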
Chris@1 222
Chris@1 231 Ogg knows two types of multiplexing: concurrent multiplexing (so-
Chris@1 232 called "Grouping") and sequential multiplexing (so-called
Chris@1 233 "Chaining"). Grouping defines how to interleave several logical
Chris@1 234 bitstreams page-wise in the same physical bitstream. Grouping is for
Chris@1 235 example needed for interleaving a video stream with several
Chris@1 236 synchronised audio tracks using different codecs in different logical
Chris@1 237 bitstreams. Chaining, on the other hand, is defined to provide a
Chris@1 238 simple mechanism to concatenate physical Ogg bitstreams, as is often
Chris@1 239 needed for streaming applications.
Chris@1 240
Chris@1 241 In grouping, all bos pages of all logical bitstreams MUST appear
Chris@1 242 together at the beginning of the Ogg bitstream. The media mapping
Chris@1 243 specifies the order of the initial pages. For example, the grouping
Chris@1 244 of a specific Ogg video and Ogg audio bitstream may specify that the
Chris@1 245 physical bitstream MUST begin with the bos page of the logical video
Chris@1 246 bitstream, followed by the bos page of the audio bitstream. Unlike
Chris@1 247 bos pages, eos pages for the logical bitstreams need not all occur
Chris@1 248 contiguously. Eos pages may be 'nil' pages, that is, pages
Chris@1 249 containing no content but simply a page header with position
Chris@1 250 information and the eos flag set in the page header. Each grouped
Chris@1 251 logical bitstream MUST have a unique serial number within the scope
Chris@1 252 of the physical bitstream.
Chris@1 253
Chris@1 254 In chaining, complete logical bitstreams are concatenated. The
Chris@1 255 bitstreams do not overlap, i.e., the eos page of a given logical
Chris@1 256 bitstream is immediately followed by the bos page of the next. Each
Chris@1 257 chained logical bitstream MUST have a unique serial number within the
Chris@1 258 scope of the physical bitstream.
Chris@1 259
Chris@1 260 It is possible to consecutively chain groups of concurrently
Chris@1 261 multiplexed bitstreams. The groups, when unchained, MUST stand on
Chris@1 262 their own as a valid concurrently multiplexed bitstream. The
Chris@1 263 following diagram shows a schematic example of such a physical
Chris@1 264 bitstream that obeys all the rules of both grouped and chained
Chris@1 265 multiplexed bitstreams.
Chris@1 266
Chris@1 267                  physical bitstream with pages of
Chris@1 268          different logical bitstreams grouped and chained
Chris@1 269    -------------------------------------------------------------
Chris@1 270    |*A*|*B*|*C*|A|A|C|B|A|B|#A#|C|...|B|C|#B#|#C#|*D*|D|...|#D#|
Chris@1 271    -------------------------------------------------------------
Chris@1 272     bos bos bos             eos           eos eos bos       eos
Chris@1 273
Chris@1 274 In this example, there are two chained physical bitstreams, the first
Chris@1 275 of which is a grouped stream of three logical bitstreams A, B, and C.
Chris@1 276 The second physical bitstream is chained after the end of the grouped
Chris@1 277 bitstream, which ends after the last eos page of all its grouped
Chris@1 278 logical bitstreams. As can be seen, grouped bitstreams begin
Chris@1 287 together - all of the bos pages MUST appear before any data pages.
Chris@1 288 It can also be seen that pages of concurrently multiplexed bitstreams
Chris@1 289 need not conform to a regular order. And it can be seen that a
Chris@1 290 grouped bitstream can end long before the other bitstreams in the
Chris@1 291 group end.
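
The grouping and chaining rules can be checked mechanically. The following sketch replays the page sequence of the diagram (with the "..." runs omitted) and splits it into its chained parts. The token notation, the (serial, bos, eos) tuples, and both function names are assumptions made for this example. The sketch treats a group as finished once every stream opened so far has reached its eos page.

```python
def parse_diagram(text):
    """Turn diagram tokens into (serial, bos, eos) tuples.

    '*X*' marks a bos page and '#X#' an eos page, as in the diagram.
    """
    return [(token.strip("*#"), token.startswith("*"), token.startswith("#"))
            for token in text.split()]


def split_chain(pages):
    """Split a page sequence into chained links, checking the bos rule."""
    links, current, open_streams, seen_data = [], [], set(), False
    for serial, bos, eos in pages:
        if bos:
            if seen_data:
                raise ValueError("bos page after a data page in the same group")
            open_streams.add(serial)
        elif serial not in open_streams:
            raise ValueError("page belongs to a stream that is not open")
        else:
            seen_data = True
        current.append(serial)
        if eos:
            open_streams.discard(serial)
            if not open_streams:
                # Every grouped stream has ended: the next page starts a new link.
                links.append(current)
                current, seen_data = [], False
    if current:
        links.append(current)  # truncated final link
    return links


pages = parse_diagram("*A* *B* *C* A A C B A B #A# C B C #B# #C# *D* D #D#")
links = split_chain(pages)
```

For the diagram's sequence, the first link holds the pages of A, B, and C, and the second link holds the pages of D.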
Chris@1 292
Chris@1 293 Ogg does not know any specifics about the codec data except that each
Chris@1 294 logical bitstream belongs to a different codec, the data from the
Chris@1 295 codec comes in order and has position markers (so-called "Granule
Chris@1 296 positions"). Ogg does not have a concept of 'time': it only knows
Chris@1 297 about sequentially increasing, unitless position markers. An
Chris@1 298 application can only get temporal information through higher layers
Chris@1 299 which have access to the codec APIs to assign and convert granule
Chris@1 300 positions or time.
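
To make the division of labour concrete, here is a hedged sketch of what a higher layer might do for an audio mapping whose granule position counts PCM samples. That interpretation, the function name, and the sample_rate parameter are assumptions; the actual conversion is defined by each media mapping, not by Ogg.

```python
def granule_to_seconds(granule_position, sample_rate):
    """Convert a granule position to seconds for a sample-counting codec.

    Assumes the granule position is a running count of PCM samples;
    other mappings define it differently.
    """
    if granule_position < 0:
        # A negative value carries no position information.
        return None
    return granule_position / sample_rate
```

Ogg itself never performs this step: it only stores and compares the unitless numbers.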
Chris@1 301
Chris@1 302 A specific definition of a media mapping using Ogg may put further
Chris@1 303 constraints on its specific use of the Ogg bitstream format. For
Chris@1 304 example, a specific media mapping may require that all the eos pages
Chris@1 305 for all grouped bitstreams need to appear in direct sequence. An
Chris@1 306 example of a media mapping is the specification of "Ogg Vorbis".
Chris@1 307 Another example is the upcoming "Ogg Theora" specification which
Chris@1 308 encapsulates Theora-encoded video data and usually comes multiplexed
Chris@1 309 with a Vorbis stream for an Ogg containing synchronised audio and
Chris@1 310 video. As Ogg does not specify temporal relationships between the
Chris@1 311 encapsulated concurrently multiplexed bitstreams, the temporal
Chris@1 312 synchronisation between the audio and video stream will be specified
Chris@1 313 in this media mapping. To enable streaming, pages from various
Chris@1 314 logical bitstreams will typically be interleaved in chronological
Chris@1 315 order.
Chris@1 316
Chris@1 317 5. The encapsulation process
Chris@1 318
Chris@1 319 The process of multiplexing different logical bitstreams happens at
Chris@1 320 the level of pages as described above. The bitstreams provided by
Chris@1 321 encoders are however handed over to Ogg as so-called "Packets" with
Chris@1 322 packet boundaries dependent on the encoding format. The process of
Chris@1 323 encapsulating packets into pages will be described now.
Chris@1 324
Chris@1 325 From Ogg's perspective, packets can be of any arbitrary size. A
Chris@1 326 specific media mapping will define how to group or break up packets
Chris@1 327 from a specific media encoder. As Ogg pages have a maximum size of
Chris@1 328 about 64 kBytes, sometimes a packet has to be distributed over
Chris@1 329 several pages. To simplify that process, Ogg divides each packet
Chris@1 330 into 255 byte long chunks plus a final shorter chunk. These chunks
Chris@1 331 are called "Ogg Segments". They are only a logical construct and do
Chris@1 332 not have a header for themselves.
Chris@1 333
Chris@1 343 A group of contiguous segments is wrapped into a variable length page
Chris@1 344 preceded by a header. A segment table in the page header tells about
Chris@1 345 the "Lacing values" (sizes) of the segments included in the page. A
Chris@1 346 flag in the page header tells whether a page contains a packet
Chris@1 347 continued from a previous page. Note that a lacing value of 255
Chris@1 348 implies that a second lacing value follows in the packet, and a value
Chris@1 349 of less than 255 marks the end of the packet after that many
Chris@1 350 additional bytes. A packet of 255 bytes (or a multiple of 255 bytes)
Chris@1 351 is terminated by a lacing value of 0. Note also that a 'nil' (zero
Chris@1 352 length) packet is not an error; it consists of nothing more than a
Chris@1 353 lacing value of zero in the header.
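
The lacing rule can be written down in a few lines. This is a minimal sketch (the function name is an assumption) that returns the lacing values for a packet of a given length.

```python
def lacing_values(packet_length):
    """Return the lacing values that describe one packet.

    Every full 255-byte segment contributes a value of 255; the final
    value is always below 255 and terminates the packet.
    """
    full_segments, remainder = divmod(packet_length, 255)
    return [255] * full_segments + [remainder]
```

For example, a 600-byte packet yields 255, 255, 90; a 255-byte packet yields 255, 0; and a zero-length packet yields the single value 0.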
Chris@1 354
Chris@1 355 The encoding is optimized for speed and the expected case of the
Chris@1 356 majority of packets being between 50 and 200 bytes large. This is a
Chris@1 357 design justification rather than a recommendation. This encoding
Chris@1 358 avoids imposing a maximum packet size and keeps the overhead on
Chris@1 359 small packets to a minimum. In contrast, e.g., simply using
Chris@1 360 two bytes at the head of every packet and having a max packet size of
Chris@1 361 32 kBytes would always penalize small packets (< 255 bytes, the
Chris@1 362 typical case) with twice the segmentation overhead. Using the lacing
Chris@1 363 values as suggested, small packets see the minimum possible byte-
Chris@1 364 aligned overhead (1 byte) and large packets (>512 bytes) see a fairly
Chris@1 365 constant ~0.5% overhead on encoding space.
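
The overhead argument can be reproduced with simple arithmetic. The sketch below counts only the segment-table bytes needed for one packet and ignores the fixed part of the page header, so the total overhead of a real stream is somewhat higher. The function name is an assumption.

```python
def lacing_overhead(packet_length):
    """Segment-table bytes for one packet, as a fraction of its size."""
    if packet_length <= 0:
        raise ValueError("overhead is only defined for non-empty packets")
    # One lacing value per full 255-byte segment plus one terminating value.
    lacing_bytes = packet_length // 255 + 1
    return lacing_bytes / packet_length


small = lacing_overhead(100)      # one byte for a 100-byte packet
large = lacing_overhead(10000)    # 40 bytes for a 10000-byte packet
```

A fixed two-byte length prefix would cost the 100-byte packet twice as much, which is the penalty the paragraph above refers to.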
Chris@1 366
Chris@1 399 The following diagram shows a schematic example of a media mapping
Chris@1 400 using Ogg and grouped logical bitstreams:
Chris@1 401
Chris@1 402  logical bitstream with packet boundaries
Chris@1 403  -----------------------------------------------------------------
Chris@1 404  > |       packet_1             | packet_2         | packet_3 |  <
Chris@1 405  -----------------------------------------------------------------
Chris@1 406
Chris@1 407                      |segmentation (logically only)
Chris@1 408                      v
Chris@1 409
Chris@1 410         packet_1 (5 segments)       packet_2 (4 segs)   p_3 (2 segs)
Chris@1 411     ------------------------------ -------------------- ------------
Chris@1 412  .. |seg_1|seg_2|seg_3|seg_4|s_5 | |seg_1|seg_2|seg_3|| |seg_1|s_2 | ..
Chris@1 413     ------------------------------ -------------------- ------------
Chris@1 414
Chris@1 415                      | page encapsulation
Chris@1 416                      v
Chris@1 417
Chris@1 418   page_1 (packet_1 data)  page_2 (pket_1 data) page_3 (packet_2 data)
Chris@1 419  ------------------------  ----------------  ------------------------
Chris@1 420  |H|------------------- |  |H|----------- |  |H|------------------- |
Chris@1 421  |D||seg_1|seg_2|seg_3| |  |D|seg_4|s_5 | |  |D||seg_1|seg_2|seg_3| | ...
Chris@1 422  |R|------------------- |  |R|----------- |  |R|------------------- |
Chris@1 423  ------------------------  ----------------  ------------------------
Chris@1 424
Chris@1 425                      |
Chris@1 426  pages of            |
Chris@1 427  other    --------|  |
Chris@1 428  logical         -------
Chris@1 429  bitstreams      | MUX |
Chris@1 430                  -------
Chris@1 431                     |
Chris@1 432                     v
Chris@1 433
Chris@1 434                page_1  page_2          page_3
Chris@1 435        ------  ------  -------  -----  -------
Chris@1 436   ...  ||   |  ||   |  ||    |  ||  |  ||    |  ...
Chris@1 437        ------  ------  -------  -----  -------
Chris@1 438                physical Ogg bitstream
Chris@1 439
Chris@1 440 In this example we take a snapshot of the encapsulation process of
Chris@1 441 one logical bitstream. We can see part of that bitstream's
Chris@1 442 subdivision into packets as provided by the codec. The Ogg
Chris@1 443 encapsulation process chops up the packets into segments. The
Chris@1 444 packets in this example are rather large such that packet_1 is split
Chris@1 445 into 5 segments - 4 segments with 255 bytes and a final smaller one.
Chris@1 446 Packet_2 is split into 4 segments - 3 segments with 255 bytes and a
Chris@1 455 final very small one - and packet_3 is split into two segments. The
Chris@1 456 encapsulation process then creates pages, which are quite small in
Chris@1 457 this example. Page_1 consists of the first three segments of
Chris@1 458 packet_1, page_2 contains the remaining 2 segments from packet_1, and
Chris@1 459 page_3 contains the first three segments of packet_2. Finally, this
Chris@1 460 logical bitstream is multiplexed into a physical Ogg bitstream with
Chris@1 461 pages of other logical bitstreams.
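
The inverse operation, recovering packets from the pages of one logical bitstream, follows directly from the lacing rule. The sketch below is an illustration under stated assumptions: each page is modelled as a (lacing_values, body) tuple, the function name is hypothetical, and the packet sizes (1060 and 775 bytes) are chosen only so that they match the segment counts of the example.

```python
def packets_from_pages(pages):
    """Rebuild packets from the pages of a single logical bitstream.

    Returns the completed packets and the bytes of a packet that is
    still unfinished after the last page.
    """
    packets, partial = [], bytearray()
    for lacing, body in pages:
        offset = 0
        for size in lacing:
            partial += body[offset:offset + size]
            offset += size
            if size < 255:
                # A lacing value below 255 ends the current packet.
                packets.append(bytes(partial))
                partial = bytearray()
    return packets, bytes(partial)


packet_1 = bytes(4 * 255 + 40)      # 5 segments: 4 full ones and a short one
packet_2 = bytes([1]) * (3 * 255 + 10)  # 4 segments

pages = [
    ([255, 255, 255], packet_1[:765]),   # page_1: first three segments
    ([255, 40], packet_1[765:]),         # page_2: remaining two segments
    ([255, 255, 255], packet_2[:765]),   # page_3: first three segments
]
complete, unfinished = packets_from_pages(pages)
```

After page_3 only packet_1 is complete; the first three segments of packet_2 remain buffered until a later page delivers a lacing value below 255.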
Chris@1 462
Chris@1 463 6. The Ogg page format
Chris@1 464
Chris@1 465 A physical Ogg bitstream consists of a sequence of concatenated
Chris@1 466 pages. Pages are of variable size, usually 4-8 kB, maximum 65307
Chris@1 467 bytes. A page header contains all the information needed to
Chris@1 468 demultiplex the logical bitstreams out of the physical bitstream and
Chris@1 469 to perform basic error recovery, and it provides landmarks for seeking. Each page
Chris@1 470 is a self-contained entity such that the page decode mechanism can
Chris@1 471 recognize, verify, and handle single pages at a time without
Chris@1 472 requiring the overall bitstream.
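
Because every page header opens with the four-byte capture pattern "OggS", a decoder that has lost its place can scan for it. The sketch below (function name assumed) only lists candidate offsets; the same four bytes may occur inside page content, so each candidate still has to be confirmed by its checksum.

```python
CAPTURE_PATTERN = b"OggS"


def find_sync_candidates(data):
    """Return every offset at which the capture pattern occurs."""
    positions, start = [], 0
    while True:
        index = data.find(CAPTURE_PATTERN, start)
        if index < 0:
            return positions
        positions.append(index)
        start = index + 1


damaged = b"junk" + b"OggS" + bytes(10) + b"OggS" + bytes(3)
candidates = find_sync_candidates(damaged)
```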
Chris@1 473
Chris@1 474 The Ogg page header has the following format:
Chris@1 475
Chris@1 476     0                   1                   2                   3
Chris@1 477     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
Chris@1 478    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Chris@1 479    | capture_pattern: Magic number for page start "OggS"           | 0-3
Chris@1 480    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Chris@1 481    | version       | header_type   | granule_position              | 4-7
Chris@1 482    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Chris@1 483    |                                                               | 8-11
Chris@1 484    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Chris@1 485    |                               | bitstream_serial_number       | 12-15
Chris@1 486    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Chris@1 487    |                               | page_sequence_number          | 16-19
Chris@1 488    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Chris@1 489    |                               | CRC_checksum                  | 20-23
Chris@1 490    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Chris@1 491    |                               |page_segments  | segment_table | 24-27
Chris@1 492    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Chris@1 493    | ...                                                           | 28-
Chris@1 494    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Chris@1 495
Chris@1 496 The LSb (least significant bit) comes first in the Bytes. Fields
Chris@1 497 with more than one byte length are encoded LSB (least significant
Chris@1 498 byte) first.
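
The layout above maps onto a little-endian structure of 27 fixed bytes followed by the segment table. The following sketch unpacks it with Python's struct module. The dictionary keys and the function name are assumptions, and the granule position is read as a signed value so that the two's complement value -1 survives.

```python
import struct

# capture, version, header_type, granule, serial, sequence, crc, n_segments
FIXED_HEADER = struct.Struct("<4sBBqIIIB")  # "<" = little-endian, no padding


def parse_page_header(data):
    """Decode the page header found at the start of `data`."""
    (capture, version, header_type, granule, serial,
     sequence, crc, n_segments) = FIXED_HEADER.unpack_from(data)
    if capture != b"OggS":
        raise ValueError("data does not start with the capture pattern")
    table = data[FIXED_HEADER.size:FIXED_HEADER.size + n_segments]
    if len(table) != n_segments:
        raise ValueError("segment table is truncated")
    return {
        "version": version,
        "header_type": header_type,
        "granule_position": granule,
        "serial_number": serial,
        "page_sequence_number": sequence,
        "crc_checksum": crc,
        "segment_table": list(table),
    }


# A hand-built header with two lacing values, for illustration only.
example = FIXED_HEADER.pack(b"OggS", 0, 0x02, -1, 0x1234, 7, 0, 2) + bytes([255, 10])
header = parse_page_header(example)
```

Note that the serial number 0x1234 is stored as the bytes 34 12 00 00, least significant byte first.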
Chris@1 499
Chris@1 511 The fields in the page header have the following meaning:
Chris@1 512
Chris@1 513 1. capture_pattern: a 4 Byte field that signifies the beginning of a
Chris@1 514 page. It contains the magic numbers:
Chris@1 515
Chris@1 516 0x4f 'O'
Chris@1 517
Chris@1 518 0x67 'g'
Chris@1 519
Chris@1 520 0x67 'g'
Chris@1 521
Chris@1 522 0x53 'S'
Chris@1 523
Chris@1 524 It helps a decoder to find the page boundaries and regain
Chris@1 525 synchronisation after parsing a corrupted stream. Once the
Chris@1 526 capture pattern is found, the decoder verifies page sync and
Chris@1 527 integrity by computing and comparing the checksum.
Chris@1 528
Chris@1 529 2. stream_structure_version: 1 Byte signifying the version number of
Chris@1 530 the Ogg file format used in this stream (this document specifies
Chris@1 531 version 0).
Chris@1 532
Chris@1 533 3. header_type_flag: the bits in this 1 Byte field identify the
Chris@1 534 specific type of this page.
Chris@1 535
Chris@1 536 * bit 0x01
Chris@1 537
Chris@1 538 set: page contains data of a packet continued from the previous
Chris@1 539 page
Chris@1 540
Chris@1 541 unset: page contains a fresh packet
Chris@1 542
Chris@1 543 * bit 0x02
Chris@1 544
Chris@1 545 set: this is the first page of a logical bitstream (bos)
Chris@1 546
Chris@1 547 unset: this page is not a first page
Chris@1 548
Chris@1 549 * bit 0x04
Chris@1 550
Chris@1 551 set: this is the last page of a logical bitstream (eos)
Chris@1 552
Chris@1 553 unset: this page is not a last page
Chris@1 554
Chris@1 555 4. granule_position: an 8 Byte field containing position information.
Chris@1 556 For example, for an audio stream, it MAY contain the total number
Chris@1 557 of PCM samples encoded after including all frames finished on this
Chris@1 558 page. For a video stream it MAY contain the total number of video
Chris@1 567 frames encoded after this page. This is a hint for the decoder
Chris@1 568 and gives it some timing and position information. Its meaning is
Chris@1 569 dependent on the codec for that logical bitstream and specified in
Chris@1 570 a specific media mapping. A special value of -1 (in two's
Chris@1 571 complement) indicates that no packets finish on this page.
Chris@1 572
Chris@1 573 5. bitstream_serial_number: a 4 Byte field containing the unique
Chris@1 574 serial number by which the logical bitstream is identified.
Chris@1 575
Chris@1 576 6. page_sequence_number: a 4 Byte field containing the sequence
Chris@1 577 number of the page so the decoder can identify page loss. This
Chris@1 578 sequence number is increasing on each logical bitstream
Chris@1 579 separately.
Chris@1 580
Chris@1 581 7. CRC_checksum: a 4 Byte field containing a 32 bit CRC checksum of
Chris@1 582 the page (including header with zero CRC field and page content).
Chris@1 583 The generator polynomial is 0x04c11db7.
Chris@1 584
Chris@1 585 8. number_page_segments: 1 Byte giving the number of segment entries
Chris@1 586 encoded in the segment table.
Chris@1 587
Chris@1 588 9. segment_table: number_page_segments Bytes containing the lacing
Chris@1 589 values of all segments in this page. Each Byte contains one
Chris@1 590 lacing value.
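
The three header_type_flag bits listed in item 3 can be decoded with simple masks. This is a small sketch; the constant and function names are assumptions.

```python
CONTINUED = 0x01  # page continues a packet from the previous page
BOS = 0x02        # first page of a logical bitstream
EOS = 0x04        # last page of a logical bitstream


def describe_header_type(header_type):
    """Decode the header_type_flag byte into named booleans."""
    return {
        "continued": bool(header_type & CONTINUED),
        "bos": bool(header_type & BOS),
        "eos": bool(header_type & EOS),
    }
```

The bits are independent, so a page may, for instance, both continue a packet and be the last page of its bitstream (value 0x05).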
Chris@1 591
Chris@1 592 The total header size in bytes is given by:
Chris@1 593 header_size = number_page_segments + 27 [Byte]
Chris@1 594
Chris@1 595 The total page size in Bytes is given by:
Chris@1 596 page_size = header_size + sum(lacing_values: 1..number_page_segments)
Chris@1 597 [Byte]
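
The two size formulas and the checksum can be combined to verify a page. The sketch below is hedged in one important respect: this document gives only the generator polynomial, so the remaining CRC parameters (initial value 0, most significant bit first, no final XOR) are assumptions taken from the libogg reference implementation. All function names are hypothetical.

```python
POLYNOMIAL = 0x04C11DB7
MAX_PAGE_SIZE = 27 + 255 + 255 * 255  # full segment table of full segments


def _build_table():
    table = []
    for byte in range(256):
        reg = byte << 24
        for _ in range(8):
            if reg & 0x80000000:
                reg = ((reg << 1) ^ POLYNOMIAL) & 0xFFFFFFFF
            else:
                reg = (reg << 1) & 0xFFFFFFFF
        table.append(reg)
    return table


CRC_TABLE = _build_table()


def ogg_crc(data):
    """CRC-32 with initial value 0, MSB first, no final XOR (assumed)."""
    crc = 0
    for byte in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ CRC_TABLE[(crc >> 24) ^ byte]
    return crc


def page_size(data):
    """Apply the header_size and page_size formulas to a page start."""
    header_size = data[26] + 27
    return header_size + sum(data[27:header_size])


def page_checksum(page):
    """Checksum of a whole page with the CRC field (bytes 22-25) zeroed."""
    return ogg_crc(page[:22] + bytes(4) + page[26:])


def verify_page(data):
    size = page_size(data)
    page = bytes(data[:size])
    stored = int.from_bytes(page[22:26], "little")
    return len(page) == size and stored == page_checksum(page)


# Build a one-segment bos page carrying the five bytes "hello".
page = (b"OggS" + bytes([0, 0x02]) + bytes(8) + (1).to_bytes(4, "little")
        + bytes(4) + bytes(4) + bytes([1, 5]) + b"hello")
page = page[:22] + page_checksum(page).to_bytes(4, "little") + page[26:]
corrupted = page[:-1] + b"!"
```

The constant MAX_PAGE_SIZE evaluates to 65307, the maximum page size stated at the start of this section: 27 fixed header bytes, 255 lacing values, and 255 segments of 255 bytes each.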
Chris@1 598
Chris@1 599 7. Security Considerations
Chris@1 600
Chris@1 601 The Ogg encapsulation format is a container format and only
Chris@1 602 encapsulates content (such as Vorbis-encoded audio). It does not
Chris@1 603 provide for any generic encryption or signing of itself or its
Chris@1 604 contained content bitstreams. However, it encapsulates any kind of
Chris@1 605 content bitstream as long as there is a codec for it, and is thus
Chris@1 606 able to contain encrypted and signed content data. It is also
Chris@1 607 possible to add an external security mechanism that encrypts or signs
Chris@1 608 an Ogg physical bitstream and thus provides content confidentiality
Chris@1 609 and authenticity.
Chris@1 610
Chris@1 611 As Ogg encapsulates binary data, it is possible to include executable
Chris@1 612 content in an Ogg bitstream. This can be an issue with applications
Chris@1 613 that are implemented using the Ogg format, especially when Ogg is
Chris@1 614 used for streaming or file transfer in a networking scenario. As
Chris@1 623 such, Ogg does not pose a threat there. However, an application
Chris@1 624 decoding Ogg and its encapsulated content bitstreams has to ensure
Chris@1 625 correct handling of manipulated bitstreams, of buffer overflows and
Chris@1 626 the like.
Chris@1 627
Chris@1 628 8. References
Chris@1 629
Chris@1 630 [1] Walleij, L., "The application/ogg Media Type", RFC 3534, May
Chris@1 631 2003.
Chris@1 632
Chris@1 633 [2] Bradner, S., "Key words for use in RFCs to Indicate Requirement
Chris@1 634 Levels", BCP 14, RFC 2119, March 1997.
Chris@1 635
Chris@1 679 Appendix A. Glossary of terms and abbreviations
Chris@1 680
Chris@1 681 bos page: The initial page (beginning of stream) of a logical
Chris@1 682 bitstream which contains information to identify the codec type
Chris@1 683 and other decoding-relevant information.
Chris@1 684
Chris@1 685 chaining (or sequential multiplexing): Concatenation of two or more
Chris@1 686 complete physical Ogg bitstreams.
Chris@1 687
Chris@1 688 eos page: The final page (end of stream) of a logical bitstream.
Chris@1 689
Chris@1 690 granule position: An increasing position number for a specific
Chris@1 691 logical bitstream stored in the page header. Its meaning is
Chris@1 692 dependent on the codec for that logical bitstream and specified in
Chris@1 693 a specific media mapping.
Chris@1 694
Chris@1 695 grouping (or concurrent multiplexing): Interleaving of pages of
Chris@1 696 several logical bitstreams into one complete physical Ogg
Chris@1 697 bitstream under the restriction that all bos pages of all grouped
Chris@1 698 logical bitstreams MUST appear before any data pages.
Chris@1 699
Chris@1 700 lacing value: An entry in the segment table of a page header
Chris@1 701 representing the size of the related segment.
Chris@1 702
Chris@1 703 logical bitstream: A sequence of bits being the result of an encoded
Chris@1 704 media stream.
Chris@1 705
Chris@1 706 media mapping: A specific use of the Ogg encapsulation format
Chris@1 707 together with a specific (set of) codec(s).
Chris@1 708
Chris@1 709 (Ogg) packet: A subpart of a logical bitstream that is created by the
Chris@1 710 encoder for that bitstream and represents a meaningful entity for
Chris@1 711 the encoder, but only a sequence of bits to the Ogg encapsulation.
Chris@1 712
Chris@1 713 (Ogg) page: A physical bitstream consists of a sequence of Ogg pages,
Chris@1 714 each containing data of one logical bitstream only. A page usually
Chris@1 715 contains a group of contiguous segments of one packet only, but
Chris@1 716 sometimes packets are too large and need to be split over several
Chris@1 717 pages.
Chris@1 718
Chris@1 719 physical (Ogg) bitstream: The sequence of bits resulting from an Ogg
Chris@1 720 encapsulation of one or several logical bitstreams. It consists
Chris@1 721 of a sequence of pages from the logical bitstreams with the
Chris@1 722 restriction that the pages of one logical bitstream MUST come in
Chris@1 723 their correct temporal order.
Chris@1 724
Chris@1 735 (Ogg) segment: The Ogg encapsulation process splits each packet into
Chris@1 736 chunks of 255 bytes plus a last fractional chunk of less than 255
Chris@1 737 bytes. These chunks are called segments.
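
As a non-normative illustration of the segment and lacing value
definitions above, the sketch below computes the lacing values for a
packet of a given length and splits a packet into its segments. It
assumes the rule from the encapsulation process that a lacing value
below 255 ends a packet, so a packet whose length is an exact
multiple of 255 ends with a zero-length segment. The function names
are hypothetical.

```python
def lacing_values(packet_len: int) -> list[int]:
    """Return the lacing values describing a packet of `packet_len` bytes."""
    if packet_len < 0:
        raise ValueError("packet length must be non-negative")
    full, rest = divmod(packet_len, 255)
    # `full` segments of 255 bytes, then one final segment of 0..254 bytes.
    return [255] * full + [rest]


def split_segments(packet: bytes) -> list[bytes]:
    """Split a packet into segments matching lacing_values(len(packet))."""
    segments = []
    pos = 0
    for size in lacing_values(len(packet)):
        segments.append(packet[pos:pos + size])
        pos += size
    return segments
```

For example, a 600-byte packet yields the lacing values 255, 255 and
90, while a 255-byte packet yields 255 followed by 0.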
Chris@1 738
Chris@1 739 Appendix B. Acknowledgements
Chris@1 740
Chris@1 741 The author gratefully acknowledges the work that Christopher
Chris@1 742 Montgomery and the Xiph.Org foundation have done in defining the Ogg
Chris@1 743 multimedia project and as part of it the open file format described
Chris@1 744 in this document. The author hopes that providing this document to
Chris@1 745 the Internet community will help in promoting the Ogg multimedia
Chris@1 746 project at http://www.xiph.org/. Many thanks also for the many
Chris@1 747 technical and typo corrections that C. Montgomery and the Ogg
Chris@1 748 community provided as feedback to this RFC.
Chris@1 749
Chris@1 750 Author's Address
Chris@1 751
Chris@1 752 Silvia Pfeiffer
Chris@1 753 CSIRO, Australia
Chris@1 754 Locked Bag 17
Chris@1 755 North Ryde, NSW 2113
Chris@1 756 Australia
Chris@1 757
Chris@1 758 Phone: +61 2 9325 3141
Chris@1 759 EMail: Silvia.Pfeiffer@csiro.au
Chris@1 760 URI: http://www.cmis.csiro.au/Silvia.Pfeiffer/
Chris@1 761
Chris@1 791 Full Copyright Statement
Chris@1 792
Chris@1 793 Copyright (C) The Internet Society (2003). All Rights Reserved.
Chris@1 794
Chris@1 795 This document and translations of it may be copied and furnished to
Chris@1 796 others, and derivative works that comment on or otherwise explain it
Chris@1 797 or assist in its implementation may be prepared, copied, published
Chris@1 798 and distributed, in whole or in part, without restriction of any
Chris@1 799 kind, provided that the above copyright notice and this paragraph are
Chris@1 800 included on all such copies and derivative works. However, this
Chris@1 801 document itself may not be modified in any way, such as by removing
Chris@1 802 the copyright notice or references to the Internet Society or other
Chris@1 803 Internet organizations, except as needed for the purpose of
Chris@1 804 developing Internet standards in which case the procedures for
Chris@1 805 copyrights defined in the Internet Standards process must be
Chris@1 806 followed, or as required to translate it into languages other than
Chris@1 807 English.
Chris@1 808
Chris@1 809 The limited permissions granted above are perpetual and will not be
Chris@1 810 revoked by the Internet Society or its successors or assigns.
Chris@1 811
Chris@1 812 This document and the information contained herein is provided on an
Chris@1 813 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
Chris@1 814 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
Chris@1 815 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
Chris@1 816 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
Chris@1 817 MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Chris@1 818
Chris@1 819 Acknowledgement
Chris@1 820
Chris@1 821 Funding for the RFC Editor function is currently provided by the
Chris@1 822 Internet Society.