Ogg Vorbis Documentation

Ogg Vorbis encoding format documentation

As of writing, not all of the document links below are live. They will be populated as we complete the documents.

Documents

Vorbis packet structure
Temporal envelope shaping and blocksize
Time domain segmentation and MDCT transform
The resolution floor
MDCT-domain fine structure

The Vorbis probability model
The Vorbis bitpacker

Programming with libvorbis

Description

Ogg Vorbis is a general purpose compressed audio format for high quality (44.1-48.0kHz, 16+ bit, polyphonic) audio and music at moderate fixed and variable bitrates (40-80 kb/s/channel). This places Vorbis in the same class as audio representations including MPEG-1 audio layer 3, MPEG-4 audio (AAC and TwinVQ), and PAC.

Vorbis is the first of a planned family of Ogg multimedia coding formats being developed as part of the Xiph.Org Foundation's Ogg multimedia project. See http://www.xiph.org/ for more information.

Vorbis technical documents

A Vorbis encoder takes in overlapping (but contiguous) short-time segments of audio data. The encoder analyzes the content of the audio to determine an optimal compact representation; this phase of encoding is known as analysis. For each short-time block of sound, the encoder then packs an efficient representation of the signal, as determined by analysis, into a raw packet much smaller than the size required by the original signal; this phase is coding. Lastly, in a streaming environment, the raw packets are structured into a continuous stream of octets; this last phase is streaming. Note that the stream of octets is referred to as both a 'byte-' and a 'bit-'stream; the latter usage is acceptable because the stream of octets is a physical representation of a true logical bit-by-bit stream.

A Vorbis decoder performs the mirror-image process: extracting the original sequence of raw packets from an Ogg stream (stream decomposition), reconstructing the signal representation from the raw data in the packet (decoding), and then reconstituting an audio signal from the decoded representation (synthesis).

The Programming with libvorbis documents discuss use of the reference Vorbis codec library (libvorbis) produced by the Xiph.Org Foundation.

The data representations and algorithms necessary at each step to encode and decode Ogg Vorbis bitstreams are described by the documents below in sufficient detail to construct a complete Vorbis codec. Note that at the time of writing, Vorbis is still in a 'Request For Comments' stage; although development is well advanced, input from the multimedia community is welcome.

Vorbis analysis and synthesis

Analysis begins by separating an input audio stream into individual, overlapping short-time segments of audio data. These segments are then transformed into an alternate representation, seeking to represent the original signal in a more efficient form that codes into a smaller number of bytes. The analysis and transformation stage is the most complex element of producing a Vorbis bitstream.

The corresponding synthesis step in the decoder is simpler; there is no analysis to perform, merely a mechanical, deterministic reconstruction of the original audio data from the transform-domain representation.
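
The segmentation, transform and reconstruction steps described above can be sketched in a few lines. This is a minimal illustration, not the Vorbis algorithm: it assumes a fixed block size, a plain sine window and a direct O(N^2) MDCT, and it performs no psychoacoustic analysis or quantization. The function names (sine_window, mdct, imdct, analyze, synthesize) are hypothetical and are not part of libvorbis.

```python
import math


def sine_window(block_size):
    """Sine window; satisfies w[n]^2 + w[n + block_size/2]^2 == 1."""
    return [math.sin(math.pi * (n + 0.5) / block_size) for n in range(block_size)]


def mdct(block):
    """Direct MDCT: 2*m time samples in, m coefficients out."""
    m = len(block) // 2
    return [
        sum(block[n] * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
            for n in range(2 * m))
        for k in range(m)
    ]


def imdct(coeffs):
    """Direct inverse MDCT: m coefficients in, 2*m aliased time samples out."""
    m = len(coeffs)
    return [
        (2.0 / m) * sum(coeffs[k] * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
                        for k in range(m))
        for n in range(2 * m)
    ]


def analyze(signal, m):
    """Cut the signal into 50%-overlapped, windowed blocks of 2*m samples."""
    if len(signal) % m != 0:
        raise ValueError("signal length must be a multiple of m")
    window = sine_window(2 * m)
    # Pad with m zeros on each side so every real sample lies in two blocks.
    padded = [0.0] * m + list(signal) + [0.0] * m
    spectra = []
    for start in range(0, len(padded) - m, m):
        block = padded[start:start + 2 * m]
        spectra.append(mdct([w * x for w, x in zip(window, block)]))
    return spectra


def synthesize(spectra, m):
    """Window each inverse-transformed block again and overlap-add."""
    window = sine_window(2 * m)
    out = [0.0] * (m * (len(spectra) + 1))
    for i, coeffs in enumerate(spectra):
        block = imdct(coeffs)
        for n in range(2 * m):
            out[i * m + n] += window[n] * block[n]
    return out[m:-m]  # drop the zero padding added by analyze()
```

Each block of 2*m samples yields only m coefficients, yet overlap-adding neighbouring blocks cancels the aliasing each block introduces, so synthesize(analyze(signal, m), m) returns the input up to floating-point rounding. In a real encoder the coefficients would be shaped and quantized between the two calls.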

Vorbis packet structure: Describes the basic analysis components necessary to produce Vorbis packets and the structure of the packet itself.
Temporal envelope shaping and blocksize: Use of temporal envelope shaping and variable blocksize to minimize time-domain energy leakage during wide dynamic range and spectral energy swings. Also discusses time-related principles of psychoacoustics.
Time domain segmentation and MDCT transform: Division of time domain data into individual overlapped, windowed short-time vectors and transformation using the MDCT.
The resolution floor: Use of frequency domain psychoacoustics, and the MDCT-domain noise, masking and resolution floors.
MDCT-domain fine structure: Production, quantization and massaging of MDCT-spectrum fine structure.

Vorbis coding and decoding

Coding and decoding convert the transform-domain representation of the original audio produced by analysis to and from a bitwise packed raw data packet. Coding and decoding consist of two logically orthogonal concepts, back-end coding and bitpacking.

Back-end coding uses a probability model to represent the raw numbers of the audio representation in as few physical bits as possible; familiar examples of back-end coding include Huffman coding and Vector Quantization.
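
As an illustration of how a probability model shortens the representation, the sketch below builds a textbook Huffman code from observed symbol counts. It is a generic example, not the Vorbis probability model; the helper names (build_huffman_code, encode, decode) are hypothetical, and codewords are kept as strings of '0' and '1' for readability.

```python
import heapq
from collections import Counter


def build_huffman_code(symbols):
    """Return {symbol: codeword string} with shorter words for likelier symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # Degenerate case: a single symbol still needs a one-bit codeword.
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, unique tie-breaker, {symbol: partial codeword}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every codeword inside them.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, code):
    """Concatenate the codeword of each symbol."""
    return "".join(code[s] for s in symbols)


def decode(bits, code):
    """Walk the bit string; the prefix-free property makes matches unambiguous."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out
```

For sixteen values drawn from four symbols with counts 8, 4, 2 and 2, a fixed two-bit code needs 32 bits, while the code built here needs 28.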

Bitpacking arranges the variable-sized words of the back-end coding into a vector of octets without wasting space. The octets produced by coding a single short-time audio segment form one raw Vorbis packet.
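
A minimal sketch of the idea follows. It assumes words are written least-significant bit first, filling each octet from its low bit upward; the Vorbis bitpacker document remains the normative reference for bit order and padding. The BitPacker and BitReader classes are hypothetical helpers, not the libvorbis API.

```python
class BitPacker:
    """Append variable-width unsigned words to a growing octet buffer."""

    def __init__(self):
        self.buffer = bytearray()
        self.bit_count = 0  # total number of bits written so far

    def write(self, value, width):
        if value < 0 or value >> width:
            raise ValueError("value does not fit in the requested width")
        for i in range(width):
            if self.bit_count % 8 == 0:
                self.buffer.append(0)  # start a fresh octet
            bit = (value >> i) & 1  # assumption: LSb of the word goes first
            self.buffer[-1] |= bit << (self.bit_count % 8)
            self.bit_count += 1

    def to_bytes(self):
        # Unused high bits of the final octet are left as zero.
        return bytes(self.buffer)


class BitReader:
    """Read words back; the caller must know each word's width."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # absolute bit position

    def read(self, width):
        value = 0
        for i in range(width):
            byte = self.data[self.pos // 8]
            value |= ((byte >> (self.pos % 8)) & 1) << i
            self.pos += 1
        return value
```

Writing the words 12 (4 bits), 7 (3 bits), 17 (7 bits) and 6969 (13 bits) produces 27 bits, stored in the four octets 0xFC 0x48 0xCE 0x06. Words cross octet boundaries freely, and only the final octet carries padding.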

The Vorbis probability model
The Vorbis bitpacker: Arrangement of variable bit-length words into an octet-aligned packet.

Vorbis streaming and stream decomposition

Vorbis packets contain the raw, bitwise-compressed representation of a snippet of audio. These packets contain no structure and cannot be strung together directly into a stream; for streamed transmission and storage, Vorbis packets are encoded into an Ogg bitstream.

Ogg bitstream overview: High-level description of Ogg logical bitstreams, how logical bitstreams (of mixed media types) can be combined into physical bitstreams, and restrictions on logical-to-physical mapping. Note that this document is not specific to Ogg Vorbis.
Ogg logical bitstream and framing spec: Low-level, complete specification of Ogg logical bitstream pages. Note that this document is not specific to Ogg Vorbis.
Vorbis bitstream mapping: Specifically describes mapping Vorbis data into an Ogg physical bitstream.
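
To give a feel for the framing that Ogg adds around raw packets, the sketch below reads the fixed part of an Ogg page header and recovers packet lengths from the segment (lacing) table. It is an illustration only: the field layout follows the usual description of an Ogg page and should be checked against the framing spec listed above, the checksum is read but not verified, and the function names (parse_page_header, packet_lengths) are hypothetical.

```python
import struct

# Little-endian, unpadded: capture pattern, version, flags, granule position,
# stream serial number, page sequence number, CRC, number of segments.
HEADER_FORMAT = "<4sBBqIIIB"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 27 octets


def parse_page_header(data):
    """Parse one page header from the start of data; the CRC is not checked."""
    (capture, version, flags, granule, serial,
     sequence, crc, n_segments) = struct.unpack_from(HEADER_FORMAT, data, 0)
    if capture != b"OggS":
        raise ValueError("missing OggS capture pattern")
    lacing = list(data[HEADER_SIZE:HEADER_SIZE + n_segments])
    if len(lacing) != n_segments:
        raise ValueError("truncated segment table")
    return {
        "version": version,
        "continued": bool(flags & 0x01),
        "first_page": bool(flags & 0x02),
        "last_page": bool(flags & 0x04),
        "granule_position": granule,
        "serial": serial,
        "sequence": sequence,
        "crc": crc,
        "lacing": lacing,
        "header_size": HEADER_SIZE + n_segments,
        "body_size": sum(lacing),
    }


def packet_lengths(lacing):
    """A lacing value below 255 ends a packet; 255 means it continues."""
    lengths, current = [], 0
    for value in lacing:
        current += value
        if value < 255:
            lengths.append(current)
            current = 0
    # A packet still open here continues on the next page and is not reported.
    return lengths
```

The segment table is what lets a decoder find packet boundaries that the raw packets themselves do not carry: a table of 255, 10, 30 describes one packet of 265 octets followed by one of 30.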
