diff src/libvorbis-1.3.3/doc/a1-encapsulation-ogg.tex @ 1:05aa0afa9217

Bring in flac, ogg, vorbis
author Chris Cannam
date Tue, 19 Mar 2013 17:37:49 +0000
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/src/libvorbis-1.3.3/doc/a1-encapsulation-ogg.tex	Tue Mar 19 17:37:49 2013 +0000
@@ -0,0 +1,185 @@
+% -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
+%!TEX root = Vorbis_I_spec.tex
+% $Id$
+\section{Embedding Vorbis into an Ogg stream} \label{vorbis:over:ogg}
+
+\subsection{Overview}
+
+This document describes using Ogg logical and physical transport
+streams to encapsulate Vorbis compressed audio packet data into file
+form.
+
+The \xref{vorbis:spec:intro} provides an overview of the construction
+of Vorbis audio packets.
+
+The \href{oggstream.html}{Ogg
+bitstream overview} and \href{framing.html}{Ogg logical
+bitstream and framing spec} provide detailed descriptions of Ogg
+transport streams. This specification document assumes a working
+knowledge of the concepts covered in these named background
+documents.  Please read them first.
+
+\subsubsection{Restrictions}
+
+The Ogg/Vorbis I specification currently dictates that Ogg/Vorbis
+streams use Ogg transport streams in degenerate, unmultiplexed
+form only. That is:
+
+\begin{itemize}
+ \item
+  A meta-headerless Ogg file encapsulates the Vorbis I packets.
+
+ \item
+  The Ogg stream may be chained, i.e., contain multiple, contiguous logical streams (links).
+
+ \item
+  The Ogg stream must be unmultiplexed (only one stream, a Vorbis audio stream, per link).
+
+\end{itemize}
+
+
+This does not mean that Vorbis cannot currently be multiplexed with
+other media types into a multi-stream Ogg file.  At the
+time this document was written, Ogg was becoming a popular container
+for low-bitrate movies consisting of DivX video and Vorbis audio.
+However, a 'Vorbis I audio file' is taken to imply Vorbis audio
+existing alone within a degenerate Ogg stream.  A compliant 'Vorbis
+audio player' is not required to implement Ogg support beyond the
+specific support of Vorbis within a degenerate Ogg stream (naturally,
+application authors are encouraged to support full multiplexed Ogg
+handling).
+
+
+
+
+\subsubsection{MIME type}
+
+The MIME type of Ogg files depends on the context.  Specifically, complex
+multimedia and applications should use \literal{application/ogg},
+while visual media should use \literal{video/ogg}, and audio
+\literal{audio/ogg}.  Vorbis data encapsulated in Ogg may appear
+in any of those types.  RTP-encapsulated Vorbis should use
+\literal{audio/vorbis} + \literal{audio/vorbis-config}.
+
+
+\subsection{Encapsulation}
+
+Ogg encapsulation of a Vorbis packet stream is straightforward.
+
+\begin{itemize}
+
+\item
+  The first Vorbis packet (the identification header), which
+  uniquely identifies a stream as Vorbis audio, is placed alone in the
+  first page of the logical Ogg stream.  This results in a first Ogg
+  page of exactly 58 bytes at the very beginning of the logical stream.
+
+
+\item
+  This first page is marked 'beginning of stream' in the page flags.
+
+
+\item
+  The second and third Vorbis packets (comment and setup
+  headers) may span one or more pages beginning on the second page of
+  the logical stream.  However many pages they span, the third header
+  packet finishes the page on which it ends.  The next (first audio) packet
+  must begin on a fresh page.
+
+
+\item
+  The granule position of these first pages containing only headers is zero.
+
+
+\item
+  The first audio packet of the logical stream begins a fresh Ogg page.
+
+
+\item
+  Packets are placed into Ogg pages in order until the end of stream.
+
+
+\item
+  The last page is marked 'end of stream' in the page flags.
+
+
+\item
+  Vorbis packets may span page boundaries.
+
+
+\item
+  The granule position of pages containing Vorbis audio is in units
+  of PCM audio samples (per channel; a stereo stream's granule position
+  does not increment at twice the speed of a mono stream).
+
+
+\item
+  The granule position of a page represents the end PCM sample
+  position of the last packet \emph{completed} on that
+  page.  The 'last PCM sample' is the last complete sample returned by
+  decode, not an internal sample awaiting lapping with a
+  subsequent block.  A page that is entirely spanned by a single
+  packet (that completes on a subsequent page) has no granule
+  position, and the granule position is set to '-1'.
+
+
+  Note that the last decoded (fully lapped) PCM sample from a packet
+  is not necessarily the middle sample from that block. If, e.g., the
+  current Vorbis packet encodes a "long block" and the next Vorbis
+  packet encodes a "short block", the last decodable sample from the
+  current packet will be at position (3*long\_block\_length/4) -
+  (short\_block\_length/4).
+
+
+\item
+    The granule (PCM) position of the first page need not indicate
+    that the stream started at position zero.  Although the granule
+    position belongs to the last completed packet on the page and a
+    valid granule position must be positive, by
+    inference it may indicate that the PCM position of the beginning
+    of audio is positive or negative.
+
+
+  \begin{itemize}
+    \item
+        A positive starting value simply indicates that this stream begins at
+        some positive time offset, potentially within a larger
+        program. This is a common case when connecting to the middle
+        of a broadcast stream.
+
+    \item
+        A negative value indicates that
+        output samples preceding time zero should be discarded during
+        decoding; this technique is used to allow sample-granularity
+        editing of the stream start time of already-encoded Vorbis
+        streams.  The number of samples to be discarded must not exceed
+        the overlap-add span of the first two audio packets.
+
+  \end{itemize}
+
+
+    In both of these cases in which the initial audio PCM starting
+    offset is nonzero, the second finished audio packet must flush the
+    page on which it appears and the third packet begin a fresh page.
+    This allows the decoder to always be able to perform PCM position
+    adjustments before needing to return any PCM data from synthesis,
+    resulting in correct positioning information without any additional
+    seeking logic.
+
+
+  \begin{note}
+    Failure to do so should, at worst, cause a
+    decoder implementation to return incorrect positioning information
+    for seeking operations at the very beginning of the stream.
+  \end{note}
+
+
+\item
+  A granule position on the final page in a stream that indicates
+  less audio data than the final packet would normally return is used to
+  end the stream on other than even frame boundaries.  The difference
+  between the actual available data returned and the declared amount
+  indicates how many trailing samples to discard from the decoding
+  process.
+
+\end{itemize}
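
The granule-position rules in the list above reduce to a little integer arithmetic, which the following Python sketch illustrates. It is not part of the changeset and is not taken from libvorbis: all function names are hypothetical. It also assumes the per-packet sample count from the Vorbis I decode description (the first audio packet only primes the overlap and returns nothing; each later packet returns a quarter of the previous block size plus a quarter of the current one), which this section does not restate.

```python
from typing import List


def samples_per_packet(blocksizes: List[int]) -> List[int]:
    """PCM samples (per channel) returned by each audio packet.

    Assumption (from the Vorbis I decode description, not this section):
    the first audio packet returns no samples; every later packet
    returns prev_blocksize/4 + cur_blocksize/4 samples.
    """
    out = [0] * len(blocksizes)
    for i in range(1, len(blocksizes)):
        out[i] = blocksizes[i - 1] // 4 + blocksizes[i] // 4
    return out


def last_decodable_long_to_short(long_len: int, short_len: int) -> int:
    """Position of the last fully lapped sample in a long block that is
    followed by a short block, as given in the text:
    (3*long_block_length/4) - (short_block_length/4)."""
    return 3 * long_len // 4 - short_len // 4


def start_position(first_page_granule: int,
                   first_page_blocksizes: List[int]) -> int:
    """Infer the PCM position at which audio begins.

    The page granule position is the end position of the last packet
    completed on the page, so subtracting the samples those packets
    return gives the starting position (it may be negative).
    """
    return first_page_granule - sum(samples_per_packet(first_page_blocksizes))


def leading_discard(start: int) -> int:
    """Samples before time zero that a decoder should drop."""
    return max(0, -start)


def trailing_discard(start: int, all_blocksizes: List[int],
                     final_granule: int) -> int:
    """Samples to drop at the end when the final page declares less
    audio than the packets would normally return."""
    decoded_end = start + sum(samples_per_packet(all_blocksizes))
    return max(0, decoded_end - final_granule)


if __name__ == "__main__":
    # Illustrative values only: block sizes of 256 and 2048 samples.
    start = start_position(1000, [2048, 2048])
    print(start, leading_discard(start))
    print(trailing_discard(start, [2048, 2048, 2048, 256], 2500))
```

In this made-up example the first audio page holds two long-block packets and carries granule position 1000, so audio is inferred to start at position -24 and the decoder drops the first 24 samples. A final granule position of 2500 against 2600 decoded samples trims 100 trailing samples.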