Network Working Group                                        S. Pfeiffer
Request for Comments: 3533                                         CSIRO
Category: Informational                                         May 2003


                 The Ogg Encapsulation Format Version 0

Status of this Memo

   This memo provides information for the Internet community. It does
   not specify an Internet standard of any kind. Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

   This document describes the Ogg bitstream format version 0, which is
   a general, freely-available encapsulation format for media streams.
   It is able to encapsulate any kind and number of video and audio
   encoding formats as well as other data streams in a single bitstream.

Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, RFC 2119 [2].

Table of Contents

   1. Introduction
   2. Definitions
   3. Requirements for a generic encapsulation format
   4. The Ogg bitstream format
   5. The encapsulation process
   6. The Ogg page format
   7. Security Considerations
   8. References
   A. Glossary of terms and abbreviations
   B. Acknowledgements
      Author's Address
      Full Copyright Statement

Pfeiffer                     Informational                      [Page 1]

RFC 3533                          OGG                           May 2003

1. Introduction

   The Ogg bitstream format has been developed as a part of a larger
   project aimed at creating a set of components for the coding and
   decoding of multimedia content (codecs) which are to be freely
   available and freely re-implementable, both in software and in
   hardware for the computing community at large, including the Internet
   community. It is the intention of the Ogg developers represented by
   Xiph.Org that it be usable without intellectual property concerns.

   This document describes the Ogg bitstream format and how to use it to
   encapsulate one or several media bitstreams created by one or several
   encoders. The Ogg transport bitstream is designed to provide
   framing, error protection and seeking structure for higher-level
   codec streams that consist of raw, unencapsulated data packets, such
   as the Vorbis audio codec or the upcoming Tarkin and Theora video
   codecs. It is capable of interleaving different binary media and
   other time-continuous data streams that are prepared by an encoder as
   a sequence of data packets. Ogg provides enough information to
   properly separate data back into such encoder-created data packets at
   the original packet boundaries without relying on decoding to find
   packet boundaries.

   Please note that the MIME type application/ogg has been registered
   with the IANA [1].

2. Definitions

   For describing the Ogg encapsulation process, a set of terms will be
   used whose meaning needs to be well understood. Therefore, some of
   the most fundamental terms are defined now before we start with the
   description of the requirements for a generic media stream
   encapsulation format, the process of encapsulation, and the concrete
   format of the Ogg bitstream. See the Appendix for a more complete
   glossary.

   The result of an Ogg encapsulation is called the "Physical (Ogg)
   Bitstream". It encapsulates one or several encoder-created
   bitstreams, which are called "Logical Bitstreams". A logical
   bitstream, provided to the Ogg encapsulation process, has a
   structure, i.e., it is split up into a sequence of so-called
   "Packets". The packets are created by the encoder of that logical
   bitstream and represent meaningful entities for that encoder only
   (e.g., an uncompressed stream may use video frames as packets). They
   do not contain boundary information - strung together they appear to
   be streams of random bytes with no landmarks.

   Please note that the term "packet" is not used in this document to
   signify entities for transport over a network.

3. Requirements for a generic encapsulation format

   The design idea behind Ogg was to provide a generic, linear media
   transport format to enable both file-based storage and stream-based
   transmission of one or several interleaved media streams independent
   of the encoding format of the media data. Such an encapsulation
   format needs to provide:

   o  framing for logical bitstreams.

   o  interleaving of different logical bitstreams.

   o  detection of corruption.

   o  recapture after a parsing error.

   o  position landmarks for direct random access of arbitrary positions
      in the bitstream.

   o  streaming capability (i.e., no seeking is needed to build a 100%
      complete bitstream).

   o  small overhead (i.e., use no more than approximately 1-2% of
      bitstream bandwidth for packet boundary marking, high-level
      framing, sync and seeking).

   o  simplicity to enable fast parsing.

   o  simple concatenation mechanism of several physical bitstreams.

   All of these design considerations have been taken into account for
   Ogg. Ogg supports framing and interleaving of logical bitstreams,
   seeking landmarks, detection of corruption, and stream
   resynchronisation after a parsing error with no more than
   approximately 1-2% overhead. It is a generic framework to perform
   encapsulation of time-continuous bitstreams. It does not know any
   specifics about the codec data that it encapsulates and is thus
   independent of any media codec.

4. The Ogg bitstream format

   A physical Ogg bitstream consists of multiple logical bitstreams
   interleaved in so-called "Pages". Whole pages are taken in order
   from multiple logical bitstreams multiplexed at the page level. The
   logical bitstreams are identified by a unique serial number in the
   header of each page of the physical bitstream. This unique serial
   number is created randomly and does not have any connection to the
   content or encoder of the logical bitstream it represents. Pages of
   all logical bitstreams are concurrently interleaved, but they need
   not be in a regular order - they are only required to be consecutive
   within the logical bitstream. Ogg demultiplexing reconstructs the
   original logical bitstreams from the physical bitstream by taking the
   pages in order from the physical bitstream and redirecting them into
   the appropriate logical decoding entity.
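
   The demultiplexing step described above can be sketched as follows.
   This is only an illustration: the (serial_number, payload) tuple
   used to stand in for a page, and the function name demultiplex, are
   assumptions made for this sketch and are not defined by this
   document.

```python
from collections import defaultdict


def demultiplex(pages):
    """Route pages to their logical bitstream, keeping physical order.

    `pages` is an iterable of (serial_number, payload) tuples. This
    representation is hypothetical; a real demultiplexer would read the
    serial number from each page header.
    """
    streams = defaultdict(list)
    for serial, payload in pages:
        # Pages of one logical bitstream stay in the order in which
        # they appear in the physical bitstream.
        streams[serial].append(payload)
    return dict(streams)


# Two logical bitstreams interleaved in an irregular order.
physical = [(0xA, b"a0"), (0xB, b"b0"), (0xA, b"a1"),
            (0xA, b"a2"), (0xB, b"b1")]
logical = demultiplex(physical)
```

   Because routing depends only on the serial number, the interleaving
   pattern of the physical bitstream does not affect the result.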

   Each Ogg page contains only one type of data as it belongs to one
   logical bitstream only. Pages are of variable size and have a page
   header containing encapsulation and error recovery information. Each
   logical bitstream in a physical Ogg bitstream starts with a special
   start page (bos=beginning of stream) and ends with a special page
   (eos=end of stream).

   The bos page contains information to uniquely identify the codec type
   and MAY contain information to set up the decoding process. The bos
   page SHOULD also contain information about the encoded media - for
   example, for audio, it should contain the sample rate and number of
   channels. By convention, the first bytes of the bos page contain
   magic data that uniquely identifies the required codec. It is the
   responsibility of anyone fielding a new codec to make sure it is
   possible to reliably distinguish that codec from all other codecs in
   use. There is no fixed way to detect the end of the
   codec-identifying marker. The format of the bos page is dependent on
   the codec and therefore MUST be given in the encapsulation
   specification of that logical bitstream type. Ogg also allows, but
   does not require, secondary header packets after the bos page for
   logical bitstreams, and these must also precede any data packets in
   any logical bitstream. These subsequent header packets are framed
   into an integral number of pages, which will not contain any data
   packets. So, a physical bitstream begins with the bos pages of all
   logical bitstreams containing one initial header packet per page,
   followed by the subsidiary header packets of all streams, followed by
   pages containing data packets.

   The encapsulation specification for one or more logical bitstreams is
   called a "media mapping". An example for a media mapping is "Ogg
   Vorbis", which uses the Ogg framework to encapsulate Vorbis-encoded
   audio data for stream-based storage (such as files) and transport
   (such as TCP streams or pipes). Ogg Vorbis provides the name and
   revision of the Vorbis codec, the audio rate and the audio quality on
   the Ogg Vorbis bos page. It also uses two additional header pages
   per logical bitstream. The Ogg Vorbis bos page starts with the byte
   0x01, followed by "vorbis" (a total of 7 bytes of identifier).
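
   As an illustration of codec identification by magic data, the
   7-byte Ogg Vorbis identifier mentioned above could be checked as
   shown below. The function name and the idea of passing in the first
   packet of the bos page as a byte string are assumptions made for
   this sketch.

```python
# Byte 0x01 followed by the ASCII string "vorbis": 7 bytes in total.
VORBIS_BOS_MAGIC = b"\x01vorbis"


def looks_like_vorbis_bos(first_packet):
    """Return True if the packet starts with the Ogg Vorbis identifier.

    `first_packet` is assumed to hold the bytes of the first packet on
    a bos page (a hypothetical input for this sketch).
    """
    return first_packet.startswith(VORBIS_BOS_MAGIC)


is_vorbis = looks_like_vorbis_bos(b"\x01vorbis" + bytes(23))
```

   A player that supports several codecs would typically try one such
   check per known media mapping and reject streams that match none.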

   Ogg knows two types of multiplexing: concurrent multiplexing
   (so-called "Grouping") and sequential multiplexing (so-called
   "Chaining"). Grouping defines how to interleave several logical
   bitstreams page-wise in the same physical bitstream. Grouping is, for
   example, needed for interleaving a video stream with several
   synchronised audio tracks using different codecs in different logical
   bitstreams. Chaining, on the other hand, is defined to provide a
   simple mechanism to concatenate physical Ogg bitstreams, as is often
   needed for streaming applications.

   In grouping, all bos pages of all logical bitstreams MUST appear
   together at the beginning of the Ogg bitstream. The media mapping
   specifies the order of the initial pages. For example, the grouping
   of a specific Ogg video and Ogg audio bitstream may specify that the
   physical bitstream MUST begin with the bos page of the logical video
   bitstream, followed by the bos page of the audio bitstream. Unlike
   bos pages, eos pages for the logical bitstreams need not all occur
   contiguously. Eos pages may be 'nil' pages, that is, pages
   containing no content but simply a page header with position
   information and the eos flag set in the page header. Each grouped
   logical bitstream MUST have a unique serial number within the scope
   of the physical bitstream.

   In chaining, complete logical bitstreams are concatenated. The
   bitstreams do not overlap, i.e., the eos page of a given logical
   bitstream is immediately followed by the bos page of the next. Each
   chained logical bitstream MUST have a unique serial number within the
   scope of the physical bitstream.

   It is possible to consecutively chain groups of concurrently
   multiplexed bitstreams. The groups, when unchained, MUST stand on
   their own as a valid concurrently multiplexed bitstream. The
   following diagram shows a schematic example of such a physical
   bitstream that obeys all the rules of both grouped and chained
   multiplexed bitstreams.

               physical bitstream with pages of
          different logical bitstreams grouped and chained
      -------------------------------------------------------------
      |*A*|*B*|*C*|A|A|C|B|A|B|#A#|C|...|B|C|#B#|#C#|*D*|D|...|#D#|
      -------------------------------------------------------------
       bos bos bos             eos           eos eos bos       eos

   In this example, there are two chained physical bitstreams, the first
   of which is a grouped stream of three logical bitstreams A, B, and C.
   The second physical bitstream is chained after the end of the grouped
   bitstream, which ends after the last eos page of all its grouped
   logical bitstreams. As can be seen, grouped bitstreams begin
   together - all of the bos pages MUST appear before any data pages.
   It can also be seen that pages of concurrently multiplexed bitstreams
   need not conform to a regular order. And it can be seen that a
   grouped bitstream can end long before the other bitstreams in the
   group end.
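
   The ordering rules for grouping and chaining can be sketched as a
   small checker. The sketch works on a simplified, hypothetical
   representation in which each page is a (serial, kind) tuple with
   kind being "bos", "data" or "eos"; it ignores the case of a logical
   bitstream whose single page is both bos and eos.

```python
def check_page_order(pages):
    """Check a (serial, kind) page sequence against the ordering rules.

    Rules covered: serial numbers are unique within the physical
    bitstream, all bos pages of a group precede its other pages, and a
    new group may only start once every bitstream of the current group
    has seen its eos page.
    """
    seen = set()          # every serial number used so far
    open_streams = set()  # bitstreams that have begun but not ended
    in_bos_run = True     # inside the run of bos pages opening a group
    for serial, kind in pages:
        if kind == "bos":
            if serial in seen:
                return False  # serial number reused
            if not in_bos_run:
                if open_streams:
                    return False  # bos page in the middle of a group
                in_bos_run = True  # a new chained group begins
            seen.add(serial)
            open_streams.add(serial)
        else:
            if serial not in open_streams:
                return False  # page without bos, or after eos
            in_bos_run = False
            if kind == "eos":
                open_streams.remove(serial)
    return not open_streams


# The page sequence of the diagram above, with "..." runs left out.
diagram = [("A", "bos"), ("B", "bos"), ("C", "bos"),
           ("A", "data"), ("A", "data"), ("C", "data"), ("B", "data"),
           ("A", "data"), ("B", "data"), ("A", "eos"), ("C", "data"),
           ("B", "data"), ("C", "data"), ("B", "eos"), ("C", "eos"),
           ("D", "bos"), ("D", "data"), ("D", "eos")]
diagram_ok = check_page_order(diagram)
```

   A sequence in which a bos page appears while another bitstream of
   the same group is still open is rejected by this check.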

   Ogg does not know any specifics about the codec data except that each
   logical bitstream belongs to a different codec, and that the data
   from the codec comes in order and has position markers (so-called
   "Granule positions"). Ogg does not have a concept of 'time': it only
   knows about sequentially increasing, unitless position markers. An
   application can only get temporal information through higher layers
   which have access to the codec APIs to assign and convert granule
   positions or time.

   A specific definition of a media mapping using Ogg may put further
   constraints on its specific use of the Ogg bitstream format. For
   example, a specific media mapping may require that all the eos pages
   for all grouped bitstreams need to appear in direct sequence. An
   example for a media mapping is the specification of "Ogg Vorbis".
   Another example is the upcoming "Ogg Theora" specification which
   encapsulates Theora-encoded video data and usually comes multiplexed
   with a Vorbis stream for an Ogg containing synchronised audio and
   video. As Ogg does not specify temporal relationships between the
   encapsulated concurrently multiplexed bitstreams, the temporal
   synchronisation between the audio and video stream will be specified
   in this media mapping. To enable streaming, pages from various
   logical bitstreams will typically be interleaved in chronological
   order.

5. The encapsulation process

   The process of multiplexing different logical bitstreams happens at
   the level of pages as described above. The bitstreams provided by
   encoders are however handed over to Ogg as so-called "Packets" with
   packet boundaries dependent on the encoding format. The process of
   encapsulating packets into pages will be described now.

   From Ogg's perspective, packets can be of any arbitrary size. A
   specific media mapping will define how to group or break up packets
   from a specific media encoder. As Ogg pages have a maximum size of
   about 64 kBytes, sometimes a packet has to be distributed over
   several pages. To simplify that process, Ogg divides each packet
   into 255 byte long chunks plus a final shorter chunk. These chunks
   are called "Ogg Segments". They are only a logical construct and do
   not have a header for themselves.

   A group of contiguous segments is wrapped into a variable length page
   preceded by a header. A segment table in the page header lists the
   "Lacing values" (sizes) of the segments included in the page. A
   flag in the page header tells whether a page contains a packet
   continued from a previous page. Note that a lacing value of 255
   implies that a further lacing value follows for the same packet, and
   a value of less than 255 marks the end of the packet after that many
   additional bytes. A packet of 255 bytes (or a multiple of 255 bytes)
   is terminated by a lacing value of 0. Note also that a 'nil' (zero
   length) packet is not an error; it consists of nothing more than a
   lacing value of zero in the header.
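
   The lacing rule can be illustrated with a short sketch that derives
   the lacing values for a packet of a given length. The function name
   is an assumption made for this illustration.

```python
def lacing_values(packet_length):
    """Return the lacing values that describe one packet.

    Every full 255 byte segment contributes a lacing value of 255. The
    final value is always less than 255 and terminates the packet; it
    is 0 when the length is a multiple of 255, including a 'nil'
    packet of length zero.
    """
    full_segments, remainder = divmod(packet_length, 255)
    return [255] * full_segments + [remainder]


examples = {n: lacing_values(n) for n in (0, 100, 255, 600)}
```

   In every case the lacing values add up to the packet length, and
   only the last value of a packet is below 255.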

   The encoding is optimized for speed and the expected case of the
   majority of packets being between 50 and 200 bytes large. This is a
   design justification rather than a recommendation. This encoding
   both avoids imposing a maximum packet size and imposes minimal
   overhead on small packets. In contrast, e.g., simply using two bytes
   at the head of every packet and having a max packet size of 32 kBytes
   would always penalize small packets (< 255 bytes, the typical case)
   with twice the segmentation overhead. Using the lacing values as
   suggested, small packets see the minimum possible byte-aligned
   overhead (1 byte) and large packets (>512 bytes) see a fairly
   constant ~0.5% overhead on encoding space.
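
   The overhead argument can be made concrete with a little
   arithmetic. The sketch below counts only the lacing bytes of a
   packet and ignores the fixed part of the page header; the function
   names and sample sizes are chosen for this illustration.

```python
def lacing_bytes(packet_length):
    """Number of one-byte lacing values needed for a packet."""
    return packet_length // 255 + 1


def lacing_overhead(packet_length):
    """Lacing bytes as a fraction of the packet payload."""
    return lacing_bytes(packet_length) / packet_length


# Print size, lacing byte count and relative overhead for a few sizes.
for size in (50, 200, 513, 4096, 65536):
    percent = 100 * lacing_overhead(size)
    print(size, lacing_bytes(size), "%.2f%%" % percent)
```

   Packets shorter than 255 bytes need a single lacing byte, half of
   what a fixed two-byte length field would cost. For large packets
   the ratio tends towards 1/255, or roughly 0.4%, which is in line
   with the approximate figure quoted above.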

   The following diagram shows a schematic example of a media mapping
   using Ogg and grouped logical bitstreams:

          logical bitstream with packet boundaries
 -----------------------------------------------------------------
 > |       packet_1             | packet_2         | packet_3 |  <
 -----------------------------------------------------------------

                     |segmentation (logically only)
                     v

      packet_1 (5 segments)          packet_2 (4 segs)    p_3 (2 segs)
     ------------------------------ -------------------- ------------
 ..  |seg_1|seg_2|seg_3|seg_4|s_5 | |seg_1|seg_2|seg_3|| |seg_1|s_2 | ..
     ------------------------------ -------------------- ------------

                     | page encapsulation
                     v

 page_1 (packet_1 data)   page_2 (pket_1 data)   page_3 (packet_2 data)
------------------------  ----------------  ------------------------
|H|------------------- |  |H|----------- |  |H|------------------- |
|D||seg_1|seg_2|seg_3| |  |D|seg_4|s_5 | |  |D||seg_1|seg_2|seg_3| | ...
|R|------------------- |  |R|----------- |  |R|------------------- |
------------------------  ----------------  ------------------------

                    |
pages of            |
other    --------|  |
logical         -------
bitstreams      | MUX |
                -------
                   |
                   v

              page_1  page_2          page_3
      ------  ------  -------  -----  -------
 ...  ||   |  ||   |  ||    |  ||  |  ||    |  ...
      ------  ------  -------  -----  -------
              physical Ogg bitstream

   In this example we take a snapshot of the encapsulation process of
   one logical bitstream. We can see part of that bitstream's
   subdivision into packets as provided by the codec. The Ogg
   encapsulation process chops up the packets into segments. The
   packets in this example are rather large such that packet_1 is split
   into 5 segments - 4 segments with 255 bytes and a final smaller one.
   Packet_2 is split into 4 segments - 3 segments with 255 bytes and a
   final very small one - and packet_3 is split into two segments. The
   encapsulation process then creates pages, which are quite small in
   this example. Page_1 consists of the first three segments of
   packet_1, page_2 contains the remaining 2 segments from packet_1, and
   page_3 contains the first three segments of packet_2. Finally, this
   logical bitstream is multiplexed into a physical Ogg bitstream with
   pages of other logical bitstreams.
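
   The reverse direction, recovering packet boundaries from the
   segment tables of consecutive pages of one logical bitstream, can
   be sketched as follows. The concrete segment sizes (80, 5 and 45
   bytes for the short segments) and the fourth page are assumptions
   made for this illustration; the diagram does not give them.

```python
def packet_lengths(segment_tables):
    """Recover packet lengths from the segment tables of several pages.

    `segment_tables` holds one list of lacing values per page, in page
    order. Returns (lengths, pending), where `pending` is the number of
    bytes of a packet that is still unfinished after the last page.
    """
    lengths = []
    current = 0
    for table in segment_tables:
        for lacing in table:
            current += lacing
            if lacing < 255:
                # A value below 255 ends the packet, possibly on a
                # later page than the one on which it started.
                lengths.append(current)
                current = 0
    return lengths, current


pages = [
    [255, 255, 255],  # page_1: first three segments of packet_1
    [255, 80],        # page_2: remaining two segments of packet_1
    [255, 255, 255],  # page_3: first three segments of packet_2
    [5, 255, 45],     # assumed next page: end of packet_2, packet_3
]
result = packet_lengths(pages)
```

   Note that the running total is carried across page boundaries; this
   is what allows a packet to span several pages.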

6. The Ogg page format

   A physical Ogg bitstream consists of a sequence of concatenated
   pages. Pages are of variable size, usually 4-8 kB, maximum 65307
   bytes. A page header contains all the information needed to
   demultiplex the logical bitstreams out of the physical bitstream and
   to perform basic error recovery, and it provides landmarks for
   seeking. Each page is a self-contained entity such that the page
   decode mechanism can recognize, verify, and handle single pages at a
   time without requiring the overall bitstream.

   The Ogg page header has the following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | capture_pattern: Magic number for page start "OggS"           | 0-3
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | version       | header_type   | granule_position              | 4-7
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               | 8-11
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               | bitstream_serial_number       | 12-15
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               | page_sequence_number          | 16-19
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               | CRC_checksum                  | 20-23
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |page_segments  | segment_table | 24-27
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | ...                                                           | 28-
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The LSb (least significant bit) comes first in the Bytes. Fields
   with more than one byte length are encoded LSB (least significant
   byte) first.
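
   Reading this layout can be sketched with the Python struct module.
   The dictionary keys and the function name are assumptions made for
   this illustration. The flag bits used are those given for the
   header_type_flag field in this section, and the granule position is
   read as a signed value so that the special value -1 is visible as
   such.

```python
import struct

# Little-endian and unpadded: capture_pattern (4 bytes), version,
# header_type, granule_position (signed 8 bytes), serial number,
# page sequence number, CRC (4 bytes each), number of segments.
HEADER_FORMAT = "<4sBBqIIIB"
FIXED_HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 27 bytes


def parse_page_header(data):
    """Parse the page header found at the start of `data`."""
    if len(data) < FIXED_HEADER_SIZE:
        raise ValueError("truncated page header")
    (capture, version, header_type, granule, serial, sequence,
     crc, n_segments) = struct.unpack_from(HEADER_FORMAT, data)
    if capture != b"OggS":
        raise ValueError("capture pattern not found")
    table = data[FIXED_HEADER_SIZE:FIXED_HEADER_SIZE + n_segments]
    if len(table) != n_segments:
        raise ValueError("truncated segment table")
    return {
        "version": version,
        "continued": bool(header_type & 0x01),
        "bos": bool(header_type & 0x02),
        "eos": bool(header_type & 0x04),
        "granule_position": granule,
        "serial": serial,
        "sequence": sequence,
        "crc": crc,
        "segment_table": list(table),
    }


# A hand-built bos page header with one segment of 30 bytes.
sample = struct.pack(HEADER_FORMAT, b"OggS", 0, 0x02, 0,
                     0x1234, 0, 0, 1) + bytes([30])
header = parse_page_header(sample)
```

   The sketch only checks the capture pattern and the lengths; a real
   decoder would also verify the checksum before trusting the fields.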

   The fields in the page header have the following meaning:

   1. capture_pattern: a 4 Byte field that signifies the beginning of a
      page. It contains the magic numbers:

            0x4f 'O'

            0x67 'g'

            0x67 'g'

            0x53 'S'

      It helps a decoder to find the page boundaries and regain
      synchronisation after parsing a corrupted stream. Once the
      capture pattern is found, the decoder verifies page sync and
      integrity by computing and comparing the checksum.

   2. stream_structure_version: 1 Byte signifying the version number of
      the Ogg file format used in this stream (this document specifies
      version 0).

   3. header_type_flag: the bits in this 1 Byte field identify the
      specific type of this page.

      *  bit 0x01

         set: page contains data of a packet continued from the previous
            page

         unset: page contains a fresh packet

      *  bit 0x02

         set: this is the first page of a logical bitstream (bos)

         unset: this page is not a first page

      *  bit 0x04

         set: this is the last page of a logical bitstream (eos)

         unset: this page is not a last page

   4. granule_position: an 8 Byte field containing position information.
      For example, for an audio stream, it MAY contain the total number
      of PCM samples encoded after including all frames finished on this
      page. For a video stream it MAY contain the total number of video
      frames encoded after this page. This is a hint for the decoder
      and gives it some timing and position information. Its meaning is
      dependent on the codec for that logical bitstream and specified in
      a specific media mapping. A special value of -1 (in two's
      complement) indicates that no packets finish on this page.

   5. bitstream_serial_number: a 4 Byte field containing the unique
      serial number by which the logical bitstream is identified.

   6. page_sequence_number: a 4 Byte field containing the sequence
      number of the page so the decoder can identify page loss. This
      sequence number is increasing on each logical bitstream
      separately.

   7. CRC_checksum: a 4 Byte field containing a 32 bit CRC checksum of
      the page (including header with zero CRC field and page content).
      The generator polynomial is 0x04c11db7.

   8. number_page_segments: 1 Byte giving the number of segment entries
      encoded in the segment table.

   9. segment_table: number_page_segments Bytes containing the lacing
      values of all segments in this page. Each Byte contains one
      lacing value.
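
   The checksum of field 7 can be sketched as a bitwise CRC. This
   document gives only the generator polynomial and states that the
   CRC field is zeroed for the computation. The remaining parameters
   used below (initial value zero, most significant bit first, no
   final XOR) are assumptions believed to match the libogg reference
   implementation and should be checked against it. The byte offsets
   22 to 25 of the CRC field follow from the field sizes listed above.

```python
GENERATOR = 0x04C11DB7


def ogg_crc32(data):
    """Bitwise CRC-32, MSB first, initial value 0, no final XOR.

    These parameters are assumptions; see the accompanying text.
    """
    crc = 0
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ GENERATOR) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


def page_crc_ok(page):
    """Verify the stored checksum of a complete page."""
    stored = int.from_bytes(page[22:26], "little")
    # The checksum is computed with the CRC field set to zero.
    zeroed = page[:22] + bytes(4) + page[26:]
    return ogg_crc32(zeroed) == stored


def set_page_crc(page):
    """Return a copy of `page` with its CRC field filled in."""
    zeroed = page[:22] + bytes(4) + page[26:]
    crc = ogg_crc32(zeroed)
    return zeroed[:22] + crc.to_bytes(4, "little") + zeroed[26:]
```

   A table-driven variant is usually preferred for speed; the bitwise
   form is shown here because it maps directly onto the polynomial.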

   The total header size in bytes is given by:
   header_size = number_page_segments + 27 [Byte]

   The total page size in Bytes is given by:
   page_size = header_size + sum(lacing_values: 1..number_page_segments)
   [Byte]
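
   The two formulas translate directly into code. The sketch below
   also derives the maximum page size of 65307 bytes stated at the
   beginning of this section, which is reached with 255 segments of
   255 bytes each. The function names are chosen for this
   illustration.

```python
def header_size(segment_table):
    """27 fixed header bytes plus one byte per lacing value."""
    return 27 + len(segment_table)


def page_size(segment_table):
    """Header size plus the sum of all lacing values."""
    return header_size(segment_table) + sum(segment_table)


# Largest possible page: 255 segments, each 255 bytes long.
largest = page_size([255] * 255)
```

   Written out, the maximum is 27 + 255 + 255 * 255 = 65307 bytes.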
|
Chris@1
|
598
|
Chris@1
|
599 7. Security Considerations
|
Chris@1
|
600
|
Chris@1
|
601 The Ogg encapsulation format is a container format and only
|
Chris@1
|
602 encapsulates content (such as Vorbis-encoded audio). It does not
|
Chris@1
|
603 provide for any generic encryption or signing of itself or its
|
Chris@1
|
604 contained content bitstreams. However, it encapsulates any kind of
|
Chris@1
|
605 content bitstream as long as there is a codec for it, and is thus
|
Chris@1
|
606 able to contain encrypted and signed content data. It is also
|
Chris@1
|
607 possible to add an external security mechanism that encrypts or signs
|
Chris@1
|
608 an Ogg physical bitstream and thus provides content confidentiality
|
Chris@1
|
609 and authenticity.
|
Chris@1
|
610
|
Chris@1
|
611 As Ogg encapsulates binary data, it is possible to include executable
|
Chris@1
|
612 content in an Ogg bitstream. This can be an issue with applications
|
Chris@1
|
613 that are implemented using the Ogg format, especially when Ogg is
|
Chris@1
|
614 used for streaming or file transfer in a networking scenario. As
|
Chris@1
|
623 such, Ogg itself does not pose a threat there. However, an
|
Chris@1
|
624 application decoding Ogg and its encapsulated content bitstreams has
|
Chris@1
|
625 to handle manipulated bitstreams correctly and guard against buffer
|
Chris@1
|
626 overflows and the like.
|
Chris@1
|
627
|
Chris@1
|
628 8. References
|
Chris@1
|
629
|
Chris@1
|
630 [1] Walleij, L., "The application/ogg Media Type", RFC 3534, May
|
Chris@1
|
631 2003.
|
Chris@1
|
632
|
Chris@1
|
633 [2] Bradner, S., "Key words for use in RFCs to Indicate Requirement
|
Chris@1
|
634 Levels", BCP 14, RFC 2119, March 1997.
|
Chris@1
|
635
|
Chris@1
|
679 Appendix A. Glossary of terms and abbreviations
|
Chris@1
|
680
|
Chris@1
|
681 bos page: The initial page (beginning of stream) of a logical
|
Chris@1
|
682 bitstream which contains information to identify the codec type
|
Chris@1
|
683 and other decoding-relevant information.
|
Chris@1
|
684
|
Chris@1
|
685 chaining (or sequential multiplexing): Concatenation of two or more
|
Chris@1
|
686 complete physical Ogg bitstreams.
|
Chris@1
|
687
|
Chris@1
|
688 eos page: The final page (end of stream) of a logical bitstream.
|
Chris@1
|
689
|
Chris@1
|
690 granule position: An increasing position number for a specific
|
Chris@1
|
691 logical bitstream stored in the page header. Its meaning is
|
Chris@1
|
692 dependent on the codec for that logical bitstream and specified in
|
Chris@1
|
693 a specific media mapping.
|
Chris@1
|
694
|
Chris@1
|
695 grouping (or concurrent multiplexing): Interleaving of pages of
|
Chris@1
|
696 several logical bitstreams into one complete physical Ogg
|
Chris@1
|
697 bitstream under the restriction that all bos pages of all grouped
|
Chris@1
|
698 logical bitstreams MUST appear before any data pages.
|
Chris@1
|
699
|
Chris@1
|
700 lacing value: An entry in the segment table of a page header
|
Chris@1
|
701 representing the size of the related segment.
|
Chris@1
|
702
|
Chris@1
|
703 logical bitstream: The sequence of bits that results from encoding a
|
Chris@1
|
704 media stream.
|
Chris@1
|
705
|
Chris@1
|
706 media mapping: A specific use of the Ogg encapsulation format
|
Chris@1
|
707 together with a specific (set of) codec(s).
|
Chris@1
|
708
|
Chris@1
|
709 (Ogg) packet: A subpart of a logical bitstream that is created by the
|
Chris@1
|
710 encoder for that bitstream and represents a meaningful entity for
|
Chris@1
|
711 the encoder, but only a sequence of bits to the Ogg encapsulation.
|
Chris@1
|
712
|
Chris@1
|
713 (Ogg) page: A physical bitstream consists of a sequence of Ogg pages,
|
Chris@1
|
714 each containing data of one logical bitstream only. A page usually
|
Chris@1
|
715 contains a group of contiguous segments of one packet only, but
|
Chris@1
|
716 sometimes packets are too large and need to be split over several
|
Chris@1
|
717 pages.
|
Chris@1
|
718
|
Chris@1
|
719 physical (Ogg) bitstream: The sequence of bits resulting from an Ogg
|
Chris@1
|
720 encapsulation of one or several logical bitstreams. It consists
|
Chris@1
|
721 of a sequence of pages from the logical bitstreams with the
|
Chris@1
|
722 restriction that the pages of one logical bitstream MUST come in
|
Chris@1
|
723 their correct temporal order.
|
Chris@1
|
724
|
Chris@1
|
735 (Ogg) segment: The Ogg encapsulation process splits each packet into
|
Chris@1
|
736 chunks of 255 bytes plus a last fractional chunk of less than 255
|
Chris@1
|
737 bytes. These chunks are called segments.
|
Chris@1
|
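The relation between a packet and its segments, as defined in the glossary entries for lacing value and segment, can be sketched in Python as follows. The function name lacing_values is hypothetical. The sketch assumes that the last fractional chunk may be empty, so that a packet whose length is an exact multiple of 255 Bytes ends with a lacing value of 0.

```python
from typing import List


def lacing_values(packet_length: int) -> List[int]:
    """Return the lacing values for a packet of the given length in Bytes."""
    if packet_length < 0:
        raise ValueError("packet length must not be negative")
    full_segments, last_segment = divmod(packet_length, 255)
    # Full segments of 255 Bytes, then one final segment of 0..254 Bytes.
    return [255] * full_segments + [last_segment]
```

For example, a packet of 600 Bytes gives the lacing values 255, 255, 90, and a packet of 510 Bytes gives 255, 255, 0. The final value below 255 is what lets a decoder find the end of the packet.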
738
|
Chris@1
|
739 Appendix B. Acknowledgements
|
Chris@1
|
740
|
Chris@1
|
741 The author gratefully acknowledges the work that Christopher
|
Chris@1
|
742 Montgomery and the Xiph.Org foundation have done in defining the Ogg
|
Chris@1
|
743 multimedia project and as part of it the open file format described
|
Chris@1
|
744 in this document. The author hopes that providing this document to
|
Chris@1
|
745 the Internet community will help in promoting the Ogg multimedia
|
Chris@1
|
746 project at http://www.xiph.org/. Many thanks also for the many
|
Chris@1
|
747 technical and typo corrections that C. Montgomery and the Ogg
|
Chris@1
|
748 community provided as feedback to this RFC.
|
Chris@1
|
749
|
Chris@1
|
750 Author's Address
|
Chris@1
|
751
|
Chris@1
|
752 Silvia Pfeiffer
|
Chris@1
|
753 CSIRO, Australia
|
Chris@1
|
754 Locked Bag 17
|
Chris@1
|
755 North Ryde, NSW 2113
|
Chris@1
|
756 Australia
|
Chris@1
|
757
|
Chris@1
|
758 Phone: +61 2 9325 3141
|
Chris@1
|
759 EMail: Silvia.Pfeiffer@csiro.au
|
Chris@1
|
760 URI: http://www.cmis.csiro.au/Silvia.Pfeiffer/
|
Chris@1
|
761
|
Chris@1
|
791 Full Copyright Statement
|
Chris@1
|
792
|
Chris@1
|
793 Copyright (C) The Internet Society (2003). All Rights Reserved.
|
Chris@1
|
794
|
Chris@1
|
795 This document and translations of it may be copied and furnished to
|
Chris@1
|
796 others, and derivative works that comment on or otherwise explain it
|
Chris@1
|
797 or assist in its implementation may be prepared, copied, published
|
Chris@1
|
798 and distributed, in whole or in part, without restriction of any
|
Chris@1
|
799 kind, provided that the above copyright notice and this paragraph are
|
Chris@1
|
800 included on all such copies and derivative works. However, this
|
Chris@1
|
801 document itself may not be modified in any way, such as by removing
|
Chris@1
|
802 the copyright notice or references to the Internet Society or other
|
Chris@1
|
803 Internet organizations, except as needed for the purpose of
|
Chris@1
|
804 developing Internet standards in which case the procedures for
|
Chris@1
|
805 copyrights defined in the Internet Standards process must be
|
Chris@1
|
806 followed, or as required to translate it into languages other than
|
Chris@1
|
807 English.
|
Chris@1
|
808
|
Chris@1
|
809 The limited permissions granted above are perpetual and will not be
|
Chris@1
|
810 revoked by the Internet Society or its successors or assigns.
|
Chris@1
|
811
|
Chris@1
|
812 This document and the information contained herein is provided on an
|
Chris@1
|
813 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
|
Chris@1
|
814 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
|
Chris@1
|
815 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
|
Chris@1
|
816 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
|
Chris@1
|
817 MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
|
Chris@1
|
818
|
Chris@1
|
819 Acknowledgement
|
Chris@1
|
820
|
Chris@1
|
821 Funding for the RFC Editor function is currently provided by the
|
Chris@1
|
822 Internet Society.
|
Chris@1
|
843
|