annotate src/libvorbis-1.3.3/doc/a1-encapsulation-ogg.tex @ 56:af97cad61ff0

Add updated build of PortAudio for OSX
author Chris Cannam <cannam@all-day-breakfast.com>
date Tue, 03 Jan 2017 15:10:52 +0000
parents 05aa0afa9217
% -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
%!TEX root = Vorbis_I_spec.tex
% $Id$
\section{Embedding Vorbis into an Ogg stream} \label{vorbis:over:ogg}

\subsection{Overview}

This document describes using Ogg logical and physical transport
streams to encapsulate Vorbis compressed audio packet data into file
form.

The \xref{vorbis:spec:intro} provides an overview of the construction
of Vorbis audio packets.

The \href{oggstream.html}{Ogg
bitstream overview} and \href{framing.html}{Ogg logical
bitstream and framing spec} provide detailed descriptions of Ogg
transport streams. This specification document assumes a working
knowledge of the concepts covered in these named background
documents. Please read them first.

\subsubsection{Restrictions}

The Ogg/Vorbis I specification currently dictates that Ogg/Vorbis
streams use Ogg transport streams in degenerate, unmultiplexed
form only. That is:

\begin{itemize}
\item
A meta-headerless Ogg file encapsulates the Vorbis I packets.

\item
The Ogg stream may be chained, i.e., contain multiple, contiguous logical streams (links).

\item
The Ogg stream must be unmultiplexed (only one stream, a Vorbis audio stream, per link).

\end{itemize}
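
As an illustration of the restrictions above, the following Python sketch
checks that a sequence of page headers describes a chained but
unmultiplexed stream. The \literal{PageInfo} record and the checking
function are hypothetical and not part of any Ogg library. The sketch
assumes that the serial number and the 'beginning of stream' and 'end of
stream' flags have already been parsed from each page header, and it
deliberately rejects a stream that is truncated before its final 'end of
stream' page.

```python
from typing import Iterable, NamedTuple, Optional


class PageInfo(NamedTuple):
    """Hypothetical summary of one Ogg page header."""
    serial: int  # logical stream serial number
    bos: bool    # 'beginning of stream' flag
    eos: bool    # 'end of stream' flag


def is_unmultiplexed_chain(pages: Iterable[PageInfo]) -> bool:
    """Return True if the pages form one or more links, each carrying
    exactly one logical stream, laid end to end."""
    current: Optional[int] = None  # serial of the link being read, if any
    seen_any = False
    for page in pages:
        seen_any = True
        if current is None:
            # A new link must open with a 'beginning of stream' page.
            if not page.bos:
                return False
            current = page.serial
        elif page.bos or page.serial != current:
            # A second stream starting, or a foreign serial, inside an
            # open link means the link is multiplexed (or damaged).
            return False
        if page.eos:
            # The link is closed; the next page must start a new link.
            current = None
    # Strict by assumption: an empty or truncated stream is rejected.
    return seen_any and current is None
```

A player that only supports Vorbis I audio files could use a check of this
kind to decide early whether a file is within the scope it handles.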

This is not to say that it is not currently possible to multiplex
Vorbis with other media types into a multi-stream Ogg file. At the
time this document was written, Ogg was becoming a popular container
for low-bitrate movies consisting of DivX video and Vorbis audio.
However, a 'Vorbis I audio file' is taken to imply Vorbis audio
existing alone within a degenerate Ogg stream. A compliant 'Vorbis
audio player' is not required to implement Ogg support beyond the
specific support of Vorbis within a degenerate Ogg stream (naturally,
application authors are encouraged to support full multiplexed Ogg
handling).

\subsubsection{MIME type}

The MIME type of an Ogg file depends on the context. Specifically, complex
multimedia and applications should use \literal{application/ogg},
while visual media should use \literal{video/ogg}, and audio
\literal{audio/ogg}. Vorbis data encapsulated in Ogg may appear
in any of those types. RTP-encapsulated Vorbis should use
\literal{audio/vorbis} together with \literal{audio/vorbis-config}.

\subsection{Encapsulation}

Ogg encapsulation of a Vorbis packet stream is straightforward.

\begin{itemize}

\item
The first Vorbis packet (the identification header), which
uniquely identifies a stream as Vorbis audio, is placed alone in the
first page of the logical Ogg stream. This results in a first Ogg
page of exactly 58 bytes (a 27-byte page header, a one-entry segment
table, and the 30-byte identification header) at the very beginning of
the logical stream.

\item
This first page is marked 'beginning of stream' in the page flags.

\item
The second and third Vorbis packets (comment and setup
headers) may span one or more pages beginning on the second page of
the logical stream. However many pages they span, the third header
packet finishes the page on which it ends. The next (first audio) packet
must begin on a fresh page.

\item
The granule position of these first pages containing only headers is zero.

\item
The first audio packet of the logical stream begins a fresh Ogg page.

\item
Packets are placed into Ogg pages in order until the end of stream.

\item
The last page is marked 'end of stream' in the page flags.

\item
Vorbis packets may span page boundaries.

\item
The granule position of pages containing Vorbis audio is in units
of PCM audio samples (per channel; a stereo stream's granule position
does not increment at twice the speed of a mono stream).

\item
The granule position of a page represents the end PCM sample
position of the last packet \emph{completed} on that
page. The 'last PCM sample' is the last complete sample returned by
decode, not an internal sample awaiting lapping with a
subsequent block. A page that is entirely spanned by a single
packet (that completes on a subsequent page) has no granule
position, and the granule position is set to '-1'.

Note that the last decoded (fully lapped) PCM sample from a packet
is not necessarily the middle sample from that block. If, e.g., the
current Vorbis packet encodes a "long block" and the next Vorbis
packet encodes a "short block", the last decodable sample from the
current packet will be at position (3*long\_block\_length/4) -
(short\_block\_length/4).

\item
The granule (PCM) position of the first page need not indicate
that the stream started at position zero. Although the granule
position belongs to the last completed packet on the page and a
valid granule position must be non-negative, by
inference it may indicate that the PCM position of the beginning
of audio is positive or negative.

\begin{itemize}
\item
A positive starting value simply indicates that this stream begins at
some positive time offset, potentially within a larger
program. This is a common case when connecting to the middle
of a broadcast stream.

\item
A negative value indicates that
output samples preceding time zero should be discarded during
decoding; this technique is used to allow sample-granularity
editing of the stream start time of already-encoded Vorbis
streams. The number of samples to be discarded must not exceed
the overlap-add span of the first two audio packets.

\end{itemize}

In both of these cases in which the initial audio PCM starting
offset is nonzero, the second finished audio packet must flush the
page on which it appears and the third packet must begin a fresh page.
This allows the decoder to always be able to perform PCM position
adjustments before needing to return any PCM data from synthesis,
resulting in correct positioning information without any additional
seeking logic.

\begin{note}
Failure to do so should, at worst, cause a
decoder implementation to return incorrect positioning information
for seeking operations at the very beginning of the stream.
\end{note}

\item
A granule position on the final page in a stream that indicates
less audio data than the final packet would normally return is used to
end the stream on other than even frame boundaries. The difference
between the actual available data returned and the declared amount
indicates how many trailing samples to discard from the decoding
process.

\end{itemize}
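
The granule position rules above can be sketched in Python as follows.
This is a minimal illustration, not the reference decoder. The function
names are hypothetical, the block size of every audio packet is assumed to
be known already, and the per-packet sample count (one quarter of the
previous block size plus one quarter of the current block size, with no
output from the first audio packet) is taken from the decode description
in the main Vorbis I specification.

```python
from typing import List, Sequence


def samples_per_packet(blocksizes: Sequence[int]) -> List[int]:
    """PCM samples (per channel) returned as each audio packet is decoded."""
    returned: List[int] = []
    prev = None
    for cur in blocksizes:
        # The first audio packet only primes the overlap and returns nothing;
        # later packets return prev/4 + cur/4 samples.
        returned.append(0 if prev is None else prev // 4 + cur // 4)
        prev = cur
    return returned


def start_offset(first_page_granule: int, blocksizes_on_page: Sequence[int]) -> int:
    """PCM position of the first returned sample, inferred from the
    granule position of the first audio page and the packets completed on it."""
    return first_page_granule - sum(samples_per_packet(blocksizes_on_page))


def leading_discard(offset: int) -> int:
    """Samples to drop at the start when the inferred offset is negative."""
    return max(0, -offset)


def trailing_discard(decoded_end_position: int, final_page_granule: int) -> int:
    """Samples to drop at the end when the final page declares less audio
    than decoding the final packet would normally return."""
    return max(0, decoded_end_position - final_page_granule)
```

For example, with two 2048-sample blocks on the first audio page and a page
granule position of 1000, decoding returns 1024 samples, so the inferred
starting position is $-24$ and the decoder would drop the first 24 samples
it produces.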