annotate src/libvorbis-1.3.3/doc/a1-encapsulation-ogg.tex @ 88:fe7c3a0b0259

Add some MinGW builds
author Chris Cannam <cannam@all-day-breakfast.com>
date Wed, 20 Mar 2013 13:49:36 +0000
parents 98c1576536ae
cannam@86 1 % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
cannam@86 2 %!TEX root = Vorbis_I_spec.tex
cannam@86 3 % $Id$
cannam@86 4 \section{Embedding Vorbis into an Ogg stream} \label{vorbis:over:ogg}
cannam@86 5
cannam@86 6 \subsection{Overview}
cannam@86 7
cannam@86 8 This document describes using Ogg logical and physical transport
cannam@86 9 streams to encapsulate Vorbis compressed audio packet data into file
cannam@86 10 form.
cannam@86 11
cannam@86 12 The \xref{vorbis:spec:intro} provides an overview of the construction
cannam@86 13 of Vorbis audio packets.
cannam@86 14
cannam@86 15 The \href{oggstream.html}{Ogg
cannam@86 16 bitstream overview} and \href{framing.html}{Ogg logical
cannam@86 17 bitstream and framing spec} provide detailed descriptions of Ogg
cannam@86 18 transport streams. This specification document assumes a working
cannam@86 19 knowledge of the concepts covered in these named background
cannam@86 20 documents. Please read them first.
cannam@86 21
cannam@86 22 \subsubsection{Restrictions}
cannam@86 23
cannam@86 24 The Ogg/Vorbis I specification currently dictates that Ogg/Vorbis
cannam@86 25 streams use Ogg transport streams in degenerate, unmultiplexed
cannam@86 26 form only. That is:
cannam@86 27
cannam@86 28 \begin{itemize}
cannam@86 29 \item
cannam@86 30 A meta-headerless Ogg file encapsulates the Vorbis I packets.
cannam@86 31
cannam@86 32 \item
cannam@86 33 The Ogg stream may be chained, i.e., contain multiple, contiguous logical streams (links).
cannam@86 34
cannam@86 35 \item
cannam@86 36 The Ogg stream must be unmultiplexed (only one stream, a Vorbis audio stream, per link).
cannam@86 37
cannam@86 38 \end{itemize}
cannam@86 39
cannam@86 40
cannam@86 41 This does not mean that it is currently impossible to multiplex
cannam@86 42 Vorbis with other media types into a multi-stream Ogg file. At the
cannam@86 43 time this document was written, Ogg was becoming a popular container
cannam@86 44 for low-bitrate movies consisting of DivX video and Vorbis audio.
cannam@86 45 However, a 'Vorbis I audio file' is taken to imply Vorbis audio
cannam@86 46 existing alone within a degenerate Ogg stream. A compliant 'Vorbis
cannam@86 47 audio player' is not required to implement Ogg support beyond the
cannam@86 48 specific support of Vorbis within a degenerate Ogg stream (naturally,
cannam@86 49 application authors are encouraged to support full multiplexed Ogg
cannam@86 50 handling).
cannam@86 51
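The restrictions above can be checked from Ogg page headers alone. The
following Python sketch is illustrative only and is not part of any
reference implementation: the names \literal{iter\_pages},
\literal{is\_unmultiplexed} and \literal{make\_page} are hypothetical, page
checksums are not verified, and no resynchronisation is attempted after
corrupt data. It relies on the page header layout and on the 'beginning of
stream' and 'end of stream' flag bits (0x02 and 0x04) defined in the Ogg
framing spec. It does not check that the stream actually carries Vorbis.

```python
import struct

# Fixed part of an Ogg page header (27 bytes); the segment table follows it.
# Fields: capture pattern, version, flags, granule position, serial number,
# page sequence number, CRC, number of segments.
_HEADER = struct.Struct("<4sBBqIIIB")
FLAG_BOS = 0x02  # page begins a logical stream
FLAG_EOS = 0x04  # page ends a logical stream


def iter_pages(data):
    """Yield (flags, granule, serial) for each Ogg page in `data`.

    Simplification: CRCs are not checked and corrupt input is not skipped.
    """
    pos = 0
    while pos < len(data):
        magic, version, flags, granule, serial, _seq, _crc, nsegs = (
            _HEADER.unpack_from(data, pos)
        )
        if magic != b"OggS" or version != 0:
            raise ValueError("not an Ogg page at offset %d" % pos)
        lacing = data[pos + 27 : pos + 27 + nsegs]
        pos += 27 + nsegs + sum(lacing)  # header, segment table, body
        yield flags, granule, serial


def is_unmultiplexed(data):
    """Return True if every link holds exactly one logical stream."""
    active = None  # serial number of the link currently being read
    for flags, _granule, serial in iter_pages(data):
        if flags & FLAG_BOS:
            if active is not None:
                return False  # a second stream starts before the first ends
            active = serial
        elif serial != active:
            return False  # page belongs to some other logical stream
        if flags & FLAG_EOS:
            active = None  # link finished; a chained link may follow
    return True


def make_page(flags, granule, serial, seq, payload=b""):
    """Hypothetical helper for experiments: one-segment page, zero CRC."""
    if len(payload) >= 255:
        raise ValueError("helper only builds single-segment pages")
    header = _HEADER.pack(b"OggS", 0, flags, granule, serial, seq, 0, 1)
    return header + bytes([len(payload)]) + payload
```

Two chained links with distinct serial numbers pass this check, whereas a
file in which a second 'beginning of stream' page appears before the first
stream has ended does not.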
cannam@86 52
cannam@86 53
cannam@86 54
cannam@86 55 \subsubsection{MIME type}
cannam@86 56
cannam@86 57 The MIME type of Ogg files depends on the context. Specifically, complex
cannam@86 58 multimedia and applications should use \literal{application/ogg},
cannam@86 59 while visual media should use \literal{video/ogg}, and audio
cannam@86 60 \literal{audio/ogg}. Vorbis data encapsulated in Ogg may appear
cannam@86 61 in any of those types. RTP-encapsulated Vorbis should use
cannam@86 62 \literal{audio/vorbis} and \literal{audio/vorbis-config}.
cannam@86 63
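As a minimal sketch of the selection rule for Ogg files described above
(RTP is not covered), an application might choose the type as follows.
The function name \literal{ogg\_mime\_type} and its parameters are
hypothetical, and the fallback for content that is neither audio nor
video is an assumption rather than something this document specifies.

```python
def ogg_mime_type(has_video=False, has_audio=False, is_complex=False):
    """Pick a MIME type for an Ogg file from a coarse content description."""
    if is_complex:
        return "application/ogg"  # complex multimedia and applications
    if has_video:
        return "video/ogg"  # visual media
    if has_audio:
        return "audio/ogg"  # audio
    # Assumption: fall back to the general type when nothing else applies.
    return "application/ogg"
```

A Vorbis I audio file as defined in this document would therefore
normally be labelled \literal{audio/ogg}.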
cannam@86 64
cannam@86 65 \subsection{Encapsulation}
cannam@86 66
cannam@86 67 Ogg encapsulation of a Vorbis packet stream is straightforward.
cannam@86 68
cannam@86 69 \begin{itemize}
cannam@86 70
cannam@86 71 \item
cannam@86 72 The first Vorbis packet (the identification header), which
cannam@86 73 uniquely identifies a stream as Vorbis audio, is placed alone in the
cannam@86 74 first page of the logical Ogg stream. This results in a first Ogg
cannam@86 75 page of exactly 58 bytes (a 28-byte page header followed by the 30-byte identification header) at the very beginning of the logical stream.
cannam@86 76
cannam@86 77
cannam@86 78 \item
cannam@86 79 This first page is marked 'beginning of stream' in the page flags.
cannam@86 80
cannam@86 81
cannam@86 82 \item
cannam@86 83 The second and third Vorbis packets (comment and setup
cannam@86 84 headers) may span one or more pages beginning on the second page of
cannam@86 85 the logical stream. However many pages they span, the third header
cannam@86 86 packet finishes the page on which it ends. The next (first audio) packet
cannam@86 87 must begin on a fresh page.
cannam@86 88
cannam@86 89
cannam@86 90 \item
cannam@86 91 The granule position of these first pages containing only headers is zero.
cannam@86 92
cannam@86 93
cannam@86 94 \item
cannam@86 95 The first audio packet of the logical stream begins a fresh Ogg page.
cannam@86 96
cannam@86 97
cannam@86 98 \item
cannam@86 99 Packets are placed into Ogg pages in order until the end of stream.
cannam@86 100
cannam@86 101
cannam@86 102 \item
cannam@86 103 The last page is marked 'end of stream' in the page flags.
cannam@86 104
cannam@86 105
cannam@86 106 \item
cannam@86 107 Vorbis packets may span page boundaries.
cannam@86 108
cannam@86 109
cannam@86 110 \item
cannam@86 111 The granule position of pages containing Vorbis audio is in units
cannam@86 112 of PCM audio samples (per channel; a stereo stream's granule position
cannam@86 113 does not increment at twice the speed of a mono stream).
cannam@86 114
cannam@86 115
cannam@86 116 \item
cannam@86 117 The granule position of a page represents the end PCM sample
cannam@86 118 position of the last packet \emph{completed} on that
cannam@86 119 page. The 'last PCM sample' is the last complete sample returned by
cannam@86 120 decode, not an internal sample awaiting lapping with a
cannam@86 121 subsequent block. A page that is entirely spanned by a single
cannam@86 122 packet (that completes on a subsequent page) has no granule
cannam@86 123 position, and the granule position is set to '-1'.
cannam@86 124
cannam@86 125
cannam@86 126 Note that the last decoded (fully lapped) PCM sample from a packet
cannam@86 127 is not necessarily the middle sample from that block. If, e.g., the
cannam@86 128 current Vorbis packet encodes a "long block" and the next Vorbis
cannam@86 129 packet encodes a "short block", the last decodable sample from the
cannam@86 130 current packet will be at position (3*long\_block\_length/4) -
cannam@86 131 (short\_block\_length/4).
cannam@86 132
cannam@86 133
cannam@86 134 \item
cannam@86 135 The granule (PCM) position of the first page need not indicate
cannam@86 136 that the stream started at position zero. Although the granule
cannam@86 137 position belongs to the last completed packet on the page and a
cannam@86 138 valid granule position must be positive, by
cannam@86 139 inference it may indicate that the PCM position of the beginning
cannam@86 140 of audio is positive or negative.
cannam@86 141
cannam@86 142
cannam@86 143 \begin{itemize}
cannam@86 144 \item
cannam@86 145 A positive starting value simply indicates that this stream begins at
cannam@86 146 some positive time offset, potentially within a larger
cannam@86 147 program. This is a common case when connecting to the middle
cannam@86 148 of a broadcast stream.
cannam@86 149
cannam@86 150 \item
cannam@86 151 A negative value indicates that
cannam@86 152 output samples preceding time zero should be discarded during
cannam@86 153 decoding; this technique is used to allow sample-granularity
cannam@86 154 editing of the stream start time of already-encoded Vorbis
cannam@86 155 streams. The number of samples to be discarded must not exceed
cannam@86 156 the overlap-add span of the first two audio packets.
cannam@86 157
cannam@86 158 \end{itemize}
cannam@86 159
cannam@86 160
cannam@86 161 In both of these cases in which the initial audio PCM starting
cannam@86 162 offset is nonzero, the second finished audio packet must flush the
cannam@86 163 page on which it appears and the third packet must begin a fresh page.
cannam@86 164 This allows the decoder to always be able to perform PCM position
cannam@86 165 adjustments before needing to return any PCM data from synthesis,
cannam@86 166 resulting in correct positioning information without any additional
cannam@86 167 seeking logic.
cannam@86 168
cannam@86 169
cannam@86 170 \begin{note}
cannam@86 171 Failure to do so should, at worst, cause a
cannam@86 172 decoder implementation to return incorrect positioning information
cannam@86 173 for seeking operations at the very beginning of the stream.
cannam@86 174 \end{note}
cannam@86 175
cannam@86 176
cannam@86 177 \item
cannam@86 178 A granule position on the final page in a stream that indicates
cannam@86 179 less audio data than the final packet would normally return is used to
cannam@86 180 end the stream on other than even frame boundaries. The difference
cannam@86 181 between the actual available data returned and the declared amount
cannam@86 182 indicates how many trailing samples to discard from the decoding
cannam@86 183 process.
cannam@86 184
cannam@86 185 \end{itemize}
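
The granule position rules in the list above reduce to simple
arithmetic. The Python sketch below is illustrative only, and all
function names in it are hypothetical. It assumes the overlap-add
behaviour described in the decode section of the Vorbis I specification:
the first audio packet returns no PCM, and each later packet returns a
quarter of the previous block size plus a quarter of the current block
size. The list passed to \literal{packet\_sample\_counts} must start at
the first audio packet of the logical stream.

```python
def packet_sample_counts(blocksizes):
    """Samples returned by decode for each audio packet, in stream order.

    Assumption: the first packet only primes the overlap (0 samples) and
    every later packet returns prev/4 + cur/4 samples.
    """
    counts = []
    prev = None
    for cur in blocksizes:
        counts.append(0 if prev is None else prev // 4 + cur // 4)
        prev = cur
    return counts


def start_offset(first_page_granule, first_page_counts):
    """PCM position at which audio begins, inferred from the first audio page.

    The page granule position is the end position of the last packet
    completed on the page, so subtracting the samples decoded from that
    page gives the position of the first sample.
    """
    return first_page_granule - sum(first_page_counts)


def leading_discard(offset):
    """Samples to drop at the start when the inferred offset is negative."""
    return max(0, -offset)


def trailing_discard(prev_granule, final_granule, final_page_counts):
    """Samples to drop at the end when the final page declares less audio
    than its packets actually decode to."""
    decoded_end = prev_granule + sum(final_page_counts)
    return max(0, decoded_end - final_granule)
```

As an example with assumed block sizes, a first audio page holding two
2048-sample blocks decodes to 1024 samples. A granule position of 1024
on that page means the stream starts at zero, whereas a granule position
of 524 implies a start offset of $-500$, so the first 500 decoded
samples are discarded. Likewise, if the page before the last ends at
10240, the final page decodes 2048 samples, and its granule position is
12000, the last 288 decoded samples are dropped.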