Network Working Group                                        S. Pfeiffer
Request for Comments: 3533                                         CSIRO
Category: Informational                                         May 2003


                The Ogg Encapsulation Format Version 0

Status of this Memo

   This memo provides information for the Internet community. It does
   not specify an Internet standard of any kind. Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

   This document describes the Ogg bitstream format version 0, which is
   a general, freely-available encapsulation format for media streams.
   It is able to encapsulate any kind and number of video and audio
   encoding formats as well as other data streams in a single bitstream.

Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, RFC 2119 [2].

Table of Contents

   1. Introduction
   2. Definitions
   3. Requirements for a generic encapsulation format
   4. The Ogg bitstream format
   5. The encapsulation process
   6. The Ogg page format
   7. Security Considerations
   8. References
   A. Glossary of terms and abbreviations
   B. Acknowledgements
      Author's Address
      Full Copyright Statement

Pfeiffer                     Informational                      [Page 1]

RFC 3533                          OGG                           May 2003

1. Introduction

   The Ogg bitstream format has been developed as a part of a larger
   project aimed at creating a set of components for the coding and
   decoding of multimedia content (codecs) which are to be freely
   available and freely re-implementable, both in software and in
   hardware for the computing community at large, including the Internet
   community. It is the intention of the Ogg developers represented by
   Xiph.Org that it be usable without intellectual property concerns.

   This document describes the Ogg bitstream format and how to use it to
   encapsulate one or several media bitstreams created by one or several
   encoders. The Ogg transport bitstream is designed to provide
   framing, error protection and seeking structure for higher-level
   codec streams that consist of raw, unencapsulated data packets, such
   as the Vorbis audio codec or the upcoming Tarkin and Theora video
   codecs. It is capable of interleaving different binary media and
   other time-continuous data streams that are prepared by an encoder as
   a sequence of data packets. Ogg provides enough information to
   properly separate data back into such encoder created data packets at
   the original packet boundaries without relying on decoding to find
   packet boundaries.

   Please note that the MIME type application/ogg has been registered
   with the IANA [1].

2. Definitions

   For describing the Ogg encapsulation process, a set of terms will be
   used whose meaning needs to be well understood. Therefore, some of
   the most fundamental terms are defined now before we start with the
   description of the requirements for a generic media stream
   encapsulation format, the process of encapsulation, and the concrete
   format of the Ogg bitstream. See the Appendix for a more complete
   glossary.

   The result of an Ogg encapsulation is called the "Physical (Ogg)
   Bitstream". It encapsulates one or several encoder-created
   bitstreams, which are called "Logical Bitstreams". A logical
   bitstream, provided to the Ogg encapsulation process, has a
   structure, i.e., it is split up into a sequence of so-called
   "Packets". The packets are created by the encoder of that logical
   bitstream and represent meaningful entities for that encoder only
   (e.g., an uncompressed stream may use video frames as packets). They
   do not contain boundary information - strung together they appear to
   be streams of random bytes with no landmarks.

   Please note that the term "packet" is not used in this document to
   signify entities for transport over a network.

3. Requirements for a generic encapsulation format

   The design idea behind Ogg was to provide a generic, linear media
   transport format to enable both file-based storage and stream-based
   transmission of one or several interleaved media streams independent
   of the encoding format of the media data. Such an encapsulation
   format needs to provide:

   o  framing for logical bitstreams.

   o  interleaving of different logical bitstreams.

   o  detection of corruption.

   o  recapture after a parsing error.

   o  position landmarks for direct random access of arbitrary positions
      in the bitstream.

   o  streaming capability (i.e., no seeking is needed to build a 100%
      complete bitstream).

   o  small overhead (i.e., use no more than approximately 1-2% of
      bitstream bandwidth for packet boundary marking, high-level
      framing, sync and seeking).

   o  simplicity to enable fast parsing.

   o  simple concatenation mechanism of several physical bitstreams.

   All of these requirements have been taken into account in the design
   of Ogg. Ogg supports framing and interleaving of logical
   bitstreams, seeking landmarks, detection of corruption, and stream
   resynchronisation after a parsing error with no more than
   approximately 1-2% overhead. It is a generic framework to perform
   encapsulation of time-continuous bitstreams. It does not know any
   specifics about the codec data that it encapsulates and is thus
   independent of any media codec.

4. The Ogg bitstream format

   A physical Ogg bitstream consists of multiple logical bitstreams
   interleaved in so-called "Pages". Whole pages are taken in order
   from multiple logical bitstreams multiplexed at the page level. The
   logical bitstreams are identified by a unique serial number in the
   header of each page of the physical bitstream. This unique serial
   number is created randomly and does not have any connection to the
   content or encoder of the logical bitstream it represents.
   Pages of all logical bitstreams are concurrently interleaved, but
   they need not be in a regular order - they are only required to be
   consecutive within the logical bitstream. Ogg demultiplexing
   reconstructs the original logical bitstreams from the physical
   bitstream by taking the pages in order from the physical bitstream
   and redirecting them into the appropriate logical decoding entity.

   Each Ogg page contains only one type of data as it belongs to one
   logical bitstream only. Pages are of variable size and have a page
   header containing encapsulation and error recovery information. Each
   logical bitstream in a physical Ogg bitstream starts with a special
   start page (bos=beginning of stream) and ends with a special page
   (eos=end of stream).

   The bos page contains information to uniquely identify the codec type
   and MAY contain information to set up the decoding process. The bos
   page SHOULD also contain information about the encoded media - for
   example, for audio, it should contain the sample rate and number of
   channels. By convention, the first bytes of the bos page contain
   magic data that uniquely identifies the required codec. It is the
   responsibility of anyone fielding a new codec to make sure it is
   possible to reliably distinguish his/her codec from all other codecs
   in use. There is no fixed way to detect the end of the
   codec-identifying marker. The format of the bos page is dependent on
   the codec and therefore MUST be given in the encapsulation
   specification of that logical bitstream type. Ogg also allows but
   does not require secondary header packets after the bos page for
   logical bitstreams and these must also precede any data packets in
   any logical bitstream.
   These subsequent header packets are framed into an integral number
   of pages, which will not contain any data packets. So, a physical
   bitstream begins with the bos pages of all logical bitstreams
   containing one initial header packet per page, followed by the
   subsidiary header packets of all streams, followed by pages
   containing data packets.

   The encapsulation specification for one or more logical bitstreams is
   called a "media mapping". An example for a media mapping is "Ogg
   Vorbis", which uses the Ogg framework to encapsulate Vorbis-encoded
   audio data for stream-based storage (such as files) and transport
   (such as TCP streams or pipes). Ogg Vorbis provides the name and
   revision of the Vorbis codec, the audio rate and the audio quality on
   the Ogg Vorbis bos page. It also uses two additional header pages
   per logical bitstream. The Ogg Vorbis bos page starts with the byte
   0x01, followed by "vorbis" (a total of 7 bytes of identifier).

   Ogg knows two types of multiplexing: concurrent multiplexing
   (so-called "Grouping") and sequential multiplexing (so-called
   "Chaining"). Grouping defines how to interleave several logical
   bitstreams page-wise in the same physical bitstream. Grouping is for
   example needed for interleaving a video stream with several
   synchronised audio tracks using different codecs in different logical
   bitstreams. Chaining on the other hand, is defined to provide a
   simple mechanism to concatenate physical Ogg bitstreams, as is often
   needed for streaming applications.

   In grouping, all bos pages of all logical bitstreams MUST appear
   together at the beginning of the Ogg bitstream.
   The media mapping specifies the order of the initial pages. For
   example, the grouping of a specific Ogg video and Ogg audio
   bitstream may specify that the physical bitstream MUST begin with
   the bos page of the logical video bitstream, followed by the bos
   page of the audio bitstream. Unlike bos pages, eos pages for the
   logical bitstreams need not all occur contiguously. Eos pages may be
   'nil' pages, that is, pages containing no content but simply a page
   header with position information and the eos flag set in the page
   header. Each grouped logical bitstream MUST have a unique serial
   number within the scope of the physical bitstream.

   In chaining, complete logical bitstreams are concatenated. The
   bitstreams do not overlap, i.e., the eos page of a given logical
   bitstream is immediately followed by the bos page of the next. Each
   chained logical bitstream MUST have a unique serial number within the
   scope of the physical bitstream.

   It is possible to consecutively chain groups of concurrently
   multiplexed bitstreams. The groups, when unchained, MUST stand on
   their own as a valid concurrently multiplexed bitstream. The
   following diagram shows a schematic example of such a physical
   bitstream that obeys all the rules of both grouped and chained
   multiplexed bitstreams.

               physical bitstream with pages of
          different logical bitstreams grouped and chained
      -------------------------------------------------------------
      |*A*|*B*|*C*|A|A|C|B|A|B|#A#|C|...|B|C|#B#|#C#|*D*|D|...|#D#|
      -------------------------------------------------------------
       bos bos bos             eos           eos eos bos       eos

   In this example, there are two chained physical bitstreams, the first
   of which is a grouped stream of three logical bitstreams A, B, and C.
   The second physical bitstream is chained after the end of the grouped
   bitstream, which ends after the last eos page of all its grouped
   logical bitstreams. As can be seen, grouped bitstreams begin
   together - all of the bos pages MUST appear before any data pages.
   It can also be seen that pages of concurrently multiplexed bitstreams
   need not conform to a regular order. And it can be seen that a
   grouped bitstream can end long before the other bitstreams in the
   group end.

   Ogg does not know any specifics about the codec data except that each
   logical bitstream belongs to a different codec, the data from the
   codec comes in order and has position markers (so-called "Granule
   positions"). Ogg does not have a concept of 'time': it only knows
   about sequentially increasing, unitless position markers. An
   application can only get temporal information through higher layers
   which have access to the codec APIs to assign and convert granule
   positions or time.

   A specific definition of a media mapping using Ogg may put further
   constraints on its specific use of the Ogg bitstream format.
   For example, a specific media mapping may require that all the eos
   pages for all grouped bitstreams need to appear in direct sequence.
   An example for a media mapping is the specification of "Ogg Vorbis".
   Another example is the upcoming "Ogg Theora" specification which
   encapsulates Theora-encoded video data and usually comes multiplexed
   with a Vorbis stream for an Ogg containing synchronised audio and
   video. As Ogg does not specify temporal relationships between the
   encapsulated concurrently multiplexed bitstreams, the temporal
   synchronisation between the audio and video stream will be specified
   in this media mapping. To enable streaming, pages from various
   logical bitstreams will typically be interleaved in chronological
   order.

5. The encapsulation process

   The process of multiplexing different logical bitstreams happens at
   the level of pages as described above. The bitstreams provided by
   encoders are however handed over to Ogg as so-called "Packets" with
   packet boundaries dependent on the encoding format. The process of
   encapsulating packets into pages will be described now.

   From Ogg's perspective, packets can be of any arbitrary size. A
   specific media mapping will define how to group or break up packets
   from a specific media encoder. As Ogg pages have a maximum size of
   about 64 kBytes, sometimes a packet has to be distributed over
   several pages. To simplify that process, Ogg divides each packet
   into 255 byte long chunks plus a final shorter chunk. These chunks
   are called "Ogg Segments". They are only a logical construct and do
   not have a header for themselves.
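
   As an illustration only, the segmentation step can be sketched in
   Python. The names split_into_segments and SEGMENT_SIZE are
   hypothetical and are not part of Ogg or of any library. The sketch
   assumes that the final shorter chunk is empty when the packet length
   is an exact multiple of 255.

```python
SEGMENT_SIZE = 255  # chunk length named in the text


def split_into_segments(packet: bytes) -> list[bytes]:
    """Split one packet into 255-byte segments plus a final shorter one.

    The final segment is always shorter than 255 bytes. It is empty when
    the packet length is zero or an exact multiple of 255 (assumption
    stated in the lead-in).
    """
    full = len(packet) // SEGMENT_SIZE
    segments = [
        packet[i * SEGMENT_SIZE:(i + 1) * SEGMENT_SIZE] for i in range(full)
    ]
    # Whatever is left over (0..254 bytes) becomes the closing segment.
    segments.append(packet[full * SEGMENT_SIZE:])
    return segments
```

   Under these assumptions a 600 byte packet yields segments of 255,
   255 and 90 bytes. Joining the segments in order gives back the
   original packet, so the segmentation is purely logical.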

   A group of contiguous segments is wrapped into a variable length page
   preceded by a header. A segment table in the page header tells about
   the "Lacing values" (sizes) of the segments included in the page. A
   flag in the page header tells whether a page contains a packet
   continued from a previous page. Note that a lacing value of 255
   implies that a second lacing value follows in the packet, and a value
   of less than 255 marks the end of the packet after that many
   additional bytes. A packet of 255 bytes (or a multiple of 255 bytes)
   is terminated by a lacing value of 0. Note also that a 'nil' (zero
   length) packet is not an error; it consists of nothing more than a
   lacing value of zero in the header.

   The encoding is optimized for speed and the expected case of the
   majority of packets being between 50 and 200 bytes large. This is a
   design justification rather than a recommendation. This encoding
   avoids imposing a maximum packet size and also keeps the overhead on
   small packets minimal. In contrast, e.g., simply using two bytes at
   the head of every packet and having a max packet size of 32 kBytes
   would always penalize small packets (< 255 bytes, the typical case)
   with twice the segmentation overhead. Using the lacing values as
   suggested, small packets see the minimum possible byte-aligned
   overhead (1 byte) and large packets (>512 bytes) see a fairly
   constant ~0.5% overhead on encoding space.
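
   The lacing rules above can be sketched in Python as follows. This is
   a minimal illustration, not a reference implementation: the function
   names lacing_values, split_page_body and lacing_overhead are
   hypothetical, and the sketch ignores the header flag that marks a
   packet continued from a previous page.

```python
def lacing_values(packet_len: int) -> list[int]:
    """Lacing values for one packet: 255 per full segment, then one < 255."""
    return [255] * (packet_len // 255) + [packet_len % 255]


def split_page_body(segment_table: list[int], body: bytes):
    """Recover packets from a page body using only the segment table.

    Returns (complete_packets, unfinished). `unfinished` holds the bytes
    of a packet whose last lacing value on this page was 255, i.e. a
    packet that continues on a following page; it is None when the page
    ends on a packet boundary.
    """
    packets, current, pos = [], bytearray(), 0
    open_packet = False
    for value in segment_table:
        current += body[pos:pos + value]
        pos += value
        if value < 255:
            # A lacing value below 255 terminates the packet.
            packets.append(bytes(current))
            current = bytearray()
            open_packet = False
        else:
            # A lacing value of 255 means another value follows.
            open_packet = True
    return packets, (bytes(current) if open_packet else None)


def lacing_overhead(packet_len: int) -> float:
    """Segment table bytes spent per payload byte (packet_len > 0)."""
    return len(lacing_values(packet_len)) / packet_len
```

   With these definitions a 100 byte packet costs a single lacing byte
   (1% overhead), while a 1000 byte packet costs four lacing bytes (0.4%
   overhead), in line with the figures quoted above.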

   The following diagram shows a schematic example of a media mapping
   using Ogg and grouped logical bitstreams:

          logical bitstream with packet boundaries
 -----------------------------------------------------------------
 > |       packet_1             | packet_2         | packet_3 |  <
 -----------------------------------------------------------------

                     |segmentation (logically only)
                     v

      packet_1 (5 segments)          packet_2 (4 segs)    p_3 (2 segs)
     ------------------------------ -------------------- ------------
 ..  |seg_1|seg_2|seg_3|seg_4|s_5 | |seg_1|seg_2|seg_3|| |seg_1|s_2 | ..
     ------------------------------ -------------------- ------------

                     | page encapsulation
                     v

 page_1 (packet_1 data)   page_2 (pket_1 data)   page_3 (packet_2 data)
------------------------  ----------------  ------------------------
|H|------------------- |  |H|----------- |  |H|------------------- |
|D||seg_1|seg_2|seg_3| |  |D|seg_4|s_5 | |  |D||seg_1|seg_2|seg_3| | ...
|R|------------------- |  |R|----------- |  |R|------------------- |
------------------------  ----------------  ------------------------

                    |
 pages of           |
 other    --------| |
 logical         -------
 bitstreams      | MUX |
                 -------
                    |
                    v

              page_1  page_2          page_3
      ------  ------  -------  -----  -------
 ...  ||   |  ||   |  ||    |  ||  |  ||    |  ...
      ------  ------  -------  -----  -------
              physical Ogg bitstream

   In this example we take a snapshot of the encapsulation process of
   one logical bitstream. We can see part of that bitstream's
   subdivision into packets as provided by the codec. The Ogg
   encapsulation process chops up the packets into segments. The
   packets in this example are rather large such that packet_1 is split
   into 5 segments - 4 segments with 255 bytes and a final smaller one.
   Packet_2 is split into 4 segments - 3 segments with 255 bytes and a
   final very small one - and packet_3 is split into two segments. The
   encapsulation process then creates pages, which are quite small in
   this example. Page_1 consists of the first three segments of
   packet_1, page_2 contains the remaining 2 segments from packet_1, and
   page_3 contains the first three segments of packet_2. Finally, this
   logical bitstream is multiplexed into a physical Ogg bitstream with
   pages of other logical bitstreams.

6. The Ogg page format

   A physical Ogg bitstream consists of a sequence of concatenated
   pages. Pages are of variable size, usually 4-8 kB, maximum 65307
   bytes. A page header contains all the information needed to
   demultiplex the logical bitstreams out of the physical bitstream and
   to perform basic error recovery, and it provides landmarks for
   seeking. Each page is a self-contained entity such that the page
   decode mechanism can recognize, verify, and handle single pages at a
   time without requiring the overall bitstream.
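
   To illustrate how a single page can be recognized and verified on
   its own, the following Python sketch parses one page using the header
   layout specified in the remainder of this section. It is a sketch
   under stated assumptions, not a reference decoder: the names
   ogg_crc32 and parse_page are hypothetical, and the CRC parameters
   other than the generator polynomial (initial value 0, no bit
   reflection, no final XOR) are assumptions that this document does not
   state.

```python
import struct

CAPTURE_PATTERN = b"OggS"
FIXED_HEADER_SIZE = 27      # header bytes that precede the segment table
CRC_OFFSET = 22             # CRC_checksum occupies header bytes 22-25
MAX_PAGE_SIZE = 27 + 255 + 255 * 255   # header + table + 255 full segments


def ogg_crc32(data: bytes) -> int:
    """Bitwise CRC-32 with generator polynomial 0x04c11db7.

    Assumed, not stated in this document: initial value 0, no bit
    reflection, no final XOR.
    """
    crc = 0
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            crc = ((crc << 1) ^ 0x04C11DB7) if crc & 0x80000000 else (crc << 1)
            crc &= 0xFFFFFFFF
    return crc


def parse_page(buf: bytes, offset: int = 0) -> dict:
    """Parse the page starting at `offset`; raise ValueError if malformed."""
    fixed = buf[offset:offset + FIXED_HEADER_SIZE]
    if len(fixed) < FIXED_HEADER_SIZE or fixed[:4] != CAPTURE_PATTERN:
        raise ValueError("no complete page header at this offset")
    # Multi-byte fields are stored least significant byte first ("<").
    version, flags, granule, serial, sequence, crc, n_segments = struct.unpack(
        "<BBqIIIB", fixed[4:])
    table_start = offset + FIXED_HEADER_SIZE
    segment_table = buf[table_start:table_start + n_segments]
    if len(segment_table) < n_segments:
        raise ValueError("truncated segment table")
    page_size = FIXED_HEADER_SIZE + n_segments + sum(segment_table)
    page = buf[offset:offset + page_size]
    if len(page) < page_size:
        raise ValueError("truncated page body")
    # The checksum covers the whole page with the CRC field set to zero.
    zeroed = page[:CRC_OFFSET] + b"\x00" * 4 + page[CRC_OFFSET + 4:]
    return {
        "version": version,
        "continued": bool(flags & 0x01),
        "bos": bool(flags & 0x02),
        "eos": bool(flags & 0x04),
        "granule_position": granule,  # -1: no packet finishes on this page
        "serial": serial,
        "sequence": sequence,
        "segment_table": list(segment_table),
        "page_size": page_size,
        "crc_ok": ogg_crc32(zeroed) == crc,
    }
```

   A decoder that has lost synchronisation could search for the capture
   pattern, call parse_page at each candidate offset, and accept the
   first page whose checksum matches. MAX_PAGE_SIZE evaluates to 65307,
   the maximum page size given above.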

   The Ogg page header has the following format:

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | capture_pattern: Magic number for page start "OggS"           | 0-3
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | version       | header_type   | granule_position              | 4-7
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                                                               | 8-11
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                               | bitstream_serial_number       | 12-15
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                               | page_sequence_number          | 16-19
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                               | CRC_checksum                  | 20-23
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                               |page_segments  | segment_table | 24-27
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | ...                                                           | 28-
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The LSb (least significant bit) comes first in the Bytes. Fields
   with more than one byte length are encoded LSB (least significant
   byte) first.

   The fields in the page header have the following meaning:

   1. capture_pattern: a 4 Byte field that signifies the beginning of a
      page. It contains the magic numbers:

            0x4f 'O'

            0x67 'g'

            0x67 'g'

            0x53 'S'

      It helps a decoder to find the page boundaries and regain
      synchronisation after parsing a corrupted stream.
      Once the capture pattern is found, the decoder verifies page sync
      and integrity by computing and comparing the checksum.

   2. stream_structure_version: 1 Byte signifying the version number of
      the Ogg file format used in this stream (this document specifies
      version 0).

   3. header_type_flag: the bits in this 1 Byte field identify the
      specific type of this page.

      *  bit 0x01

         set: page contains data of a packet continued from the previous
            page

         unset: page contains a fresh packet

      *  bit 0x02

         set: this is the first page of a logical bitstream (bos)

         unset: this page is not a first page

      *  bit 0x04

         set: this is the last page of a logical bitstream (eos)

         unset: this page is not a last page

   4. granule_position: an 8 Byte field containing position information.
      For example, for an audio stream, it MAY contain the total number
      of PCM samples encoded after including all frames finished on this
      page. For a video stream it MAY contain the total number of video
      frames encoded after this page. This is a hint for the decoder
      and gives it some timing and position information. Its meaning is
      dependent on the codec for that logical bitstream and specified in
      a specific media mapping. A special value of -1 (in two's
      complement) indicates that no packets finish on this page.

   5. bitstream_serial_number: a 4 Byte field containing the unique
      serial number by which the logical bitstream is identified.

   6. page_sequence_number: a 4 Byte field containing the sequence
      number of the page so the decoder can identify page loss. This
      sequence number is increasing on each logical bitstream
      separately.

   7. CRC_checksum: a 4 Byte field containing a 32 bit CRC checksum of
      the page (including header with zero CRC field and page content).
      The generator polynomial is 0x04c11db7.

   8. number_page_segments: 1 Byte giving the number of segment entries
      encoded in the segment table.

   9. segment_table: number_page_segments Bytes containing the lacing
      values of all segments in this page. Each Byte contains one
      lacing value.

   The total header size in bytes is given by:
   header_size = number_page_segments + 27 [Byte]

   The total page size in Bytes is given by:
   page_size = header_size + sum(lacing_values: 1..number_page_segments)
   [Byte]

7. Security Considerations

   The Ogg encapsulation format is a container format and only
   encapsulates content (such as Vorbis-encoded audio). It does not
   provide for any generic encryption or signing of itself or its
   contained content bitstreams. However, it encapsulates any kind of
   content bitstream as long as there is a codec for it, and is thus
   able to contain encrypted and signed content data. It is also
   possible to add an external security mechanism that encrypts or signs
   an Ogg physical bitstream and thus provides content confidentiality
   and authenticity.

   As Ogg encapsulates binary data, it is possible to include executable
   content in an Ogg bitstream. This can be an issue with applications
   that are implemented using the Ogg format, especially when Ogg is
   used for streaming or file transfer in a networking scenario. As
   such, Ogg does not pose a threat there. However, an application
   decoding Ogg and its encapsulated content bitstreams has to ensure
   correct handling of manipulated bitstreams, of buffer overflows and
   the like.

8. References

   [1] Walleij, L., "The application/ogg Media Type", RFC 3534, May
       2003.

   [2] Bradner, S., "Key words for use in RFCs to Indicate Requirement
       Levels", BCP 14, RFC 2119, March 1997.

Appendix A. Glossary of terms and abbreviations

   bos page: The initial page (beginning of stream) of a logical
      bitstream which contains information to identify the codec type
      and other decoding-relevant information.

   chaining (or sequential multiplexing): Concatenation of two or more
      complete physical Ogg bitstreams.

   eos page: The final page (end of stream) of a logical bitstream.

   granule position: An increasing position number for a specific
      logical bitstream stored in the page header. Its meaning is
      dependent on the codec for that logical bitstream and specified in
      a specific media mapping.

   grouping (or concurrent multiplexing): Interleaving of pages of
      several logical bitstreams into one complete physical Ogg
      bitstream under the restriction that all bos pages of all grouped
      logical bitstreams MUST appear before any data pages.

   lacing value: An entry in the segment table of a page header
      representing the size of the related segment.

   logical bitstream: A sequence of bits being the result of an encoded
      media stream.

   media mapping: A specific use of the Ogg encapsulation format
      together with a specific (set of) codec(s).

   (Ogg) packet: A subpart of a logical bitstream that is created by the
      encoder for that bitstream and represents a meaningful entity for
      the encoder, but only a sequence of bits to the Ogg encapsulation.

   (Ogg) page: A physical bitstream consists of a sequence of Ogg pages
      containing data of one logical bitstream only. It usually
      contains a group of contiguous segments of one packet only, but
      sometimes packets are too large and need to be split over several
      pages.

   physical (Ogg) bitstream: The sequence of bits resulting from an Ogg
      encapsulation of one or several logical bitstreams. It consists
      of a sequence of pages from the logical bitstreams with the
      restriction that the pages of one logical bitstream MUST come in
      their correct temporal order.

   (Ogg) segment: The Ogg encapsulation process splits each packet into
      chunks of 255 bytes plus a last fractional chunk of less than 255
      bytes. These chunks are called segments.

Appendix B. Acknowledgements

   The author gratefully acknowledges the work that Christopher
   Montgomery and the Xiph.Org foundation have done in defining the Ogg
   multimedia project and as part of it the open file format described
   in this document. The author hopes that providing this document to
   the Internet community will help in promoting the Ogg multimedia
   project at http://www.xiph.org/. Many thanks also for the many
   technical and typo corrections that C. Montgomery and the Ogg
   community provided as feedback to this RFC.

Author's Address

   Silvia Pfeiffer
   CSIRO, Australia
   Locked Bag 17
   North Ryde, NSW 2113
   Australia

   Phone: +61 2 9325 3141
   EMail: Silvia.Pfeiffer@csiro.au
   URI:   http://www.cmis.csiro.au/Silvia.Pfeiffer/

Full Copyright Statement

   Copyright (C) The Internet Society (2003). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.
   However, this document itself may not be modified in any way, such
   as by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the
   purpose of developing Internet standards in which case the
   procedures for copyrights defined in the Internet Standards process
   must be followed, or as required to translate it into languages
   other than English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.