<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-15"/>
<title>Ogg Vorbis Documentation</title>

<style type="text/css">
body {
  margin: 0 18px 0 18px;
  padding-bottom: 30px;
  font-family: Verdana, Arial, Helvetica, sans-serif;
  color: #333333;
  font-size: .8em;
}

a {
  color: #3366cc;
}

img {
  border: 0;
}

#xiphlogo {
  margin: 30px 0 16px 0;
}

#content p {
  line-height: 1.4;
}

h1, h1 a, h2, h2 a, h3, h3 a {
  font-weight: bold;
  color: #ff9900;
  margin: 1.3em 0 8px 0;
}

h1 {
  font-size: 1.3em;
}

h2 {
  font-size: 1.2em;
}

h3 {
  font-size: 1.1em;
}

li {
  line-height: 1.4;
}

#copyright {
  margin-top: 30px;
  line-height: 1.5em;
  text-align: center;
  font-size: .8em;
  color: #888888;
  clear: both;
}
</style>

</head>

<body>

<div id="xiphlogo">
  <a href="http://www.xiph.org/"><img src="fish_xiph_org.png" alt="Fish Logo and Xiph.Org"/></a>
</div>

<h1>Ogg Vorbis encoding format documentation</h1>

<p><img src="wait.png" alt="wait"/>At the time of writing, not all of the document
links below are live. They will be populated as we complete the documents.</p>

<h2>Documents</h2>

<ul>
<li><a href="packet.html">Vorbis packet structure</a></li>
<li><a href="envelope.html">Temporal envelope shaping and blocksize</a></li>
<li><a href="mdct.html">Time domain segmentation and MDCT transform</a></li>
<li><a href="resolution.html">The resolution floor</a></li>
<li><a href="residuals.html">MDCT-domain fine structure</a></li>
</ul>

<ul>
<li><a href="probmodel.html">The Vorbis probability model</a></li>
<li><a href="bitpack.html">The Vorbis bitpacker</a></li>
</ul>

<ul>
<li><a href="oggstream.html">Ogg bitstream overview</a></li>
<li><a href="framing.html">Ogg logical bitstream and framing spec</a></li>
<li><a href="vorbis-stream.html">Vorbis packet->Ogg bitstream mapping</a></li>
</ul>

<ul>
<li><a href="programming.html">Programming with libvorbis</a></li>
</ul>

<h2>Description</h2>

<p>Ogg Vorbis is a general-purpose compressed audio format
for high-quality (44.1-48.0 kHz, 16+ bit, polyphonic) audio and music
at moderate fixed and variable bitrates (40-80 kb/s/channel). This
places Vorbis in the same class as audio representations including
MPEG-1 audio layer 3, MPEG-4 audio (AAC and TwinVQ), and PAC.</p>

<p>Vorbis is the first of a planned family of Ogg multimedia coding
formats being developed as part of the Xiph.Org Foundation's Ogg multimedia
project. See <a href="http://www.xiph.org/">http://www.xiph.org/</a>
for more information.</p>

<h2>Vorbis technical documents</h2>

<p>A Vorbis encoder takes in overlapping (but contiguous) short-time
segments of audio data. The encoder analyzes the content of the audio
to determine an optimal compact representation; this phase of encoding
is known as <em>analysis</em>. For each short-time block of sound,
the encoder then packs an efficient representation of the signal, as
determined by analysis, into a raw packet much smaller than the size
required by the original signal; this phase is <em>coding</em>.
Lastly, in a streaming environment, the raw packets are
structured into a continuous stream of octets; this last phase is
<em>streaming</em>. Note that the stream of octets is referred to both
as a 'byte-' and 'bit-'stream; the latter usage is acceptable as the
stream of octets is a physical representation of a true logical
bit-by-bit stream.</p>

<p>A Vorbis decoder performs the mirror-image process of extracting the
original sequence of raw packets from an Ogg stream (<em>stream
decomposition</em>), reconstructing the signal representation from the
raw data in the packet (<em>decoding</em>) and then reconstituting an
audio signal from the decoded representation (<em>synthesis</em>).</p>

<p>The <a href="programming.html">Programming with libvorbis</a>
documents discuss use of the reference Vorbis codec library
(libvorbis) produced by the Xiph.Org Foundation.</p>

<p>The data representations and algorithms necessary at each step to
encode and decode Ogg Vorbis bitstreams are described by the documents
below in sufficient detail to construct a complete Vorbis codec.
Note that at the time of writing, Vorbis is still in a 'Request For
Comments' stage of development; despite being in advanced stages of
development, input from the multimedia community is welcome.</p>

<h3>Vorbis analysis and synthesis</h3>

<p>Analysis begins by separating an input audio stream into individual,
overlapping short-time segments of audio data. These segments are
then transformed into an alternate representation, seeking to
represent the original signal in a more efficient form that codes into
a smaller number of bytes. The analysis and transformation stage is
the most complex element of producing a Vorbis bitstream.</p>

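<p>The segmentation step described above can be illustrated with a minimal
Python sketch. It is not taken from libvorbis: the helper name
<code>split_overlapping</code>, the fixed block size, the 50% overlap and the
zero-padding of the final block are all assumptions made for illustration, and
no windowing or transform is applied.</p>

```python
def split_overlapping(samples, block_size):
    """Return 50%-overlapping blocks covering `samples`.

    Hypothetical helper: a real encoder would also window each block
    and may vary the block size from one segment to the next.
    """
    if block_size < 2 or block_size % 2:
        raise ValueError("block_size must be a positive even number")
    hop = block_size // 2  # each block starts half a block after the previous one
    blocks = []
    for start in range(0, max(len(samples) - hop, 1), hop):
        block = list(samples[start:start + block_size])
        block.extend([0] * (block_size - len(block)))  # zero-pad a short final block
        blocks.append(block)
    return blocks


blocks = split_overlapping(list(range(10)), 4)
for block in blocks:
    print(block)
```

<p>Every sample other than those at the very start and end of the input lands
in two neighbouring blocks, which is what allows an overlapped transform to
blend adjacent segments on reconstruction.</p>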
<p>The corresponding synthesis step in the decoder is simpler; there is
no analysis to perform, merely a mechanical, deterministic
reconstruction of the original audio data from the transform-domain
representation.</p>

<ul>
<li><a href="packet.html">Vorbis packet structure</a>:
Describes the basic analysis components necessary to produce Vorbis
packets and the structure of the packet itself.</li>
<li><a href="envelope.html">Temporal envelope shaping and blocksize</a>:
Use of temporal envelope shaping and variable blocksize to minimize
time-domain energy leakage during wide dynamic range and spectral energy
swings. Also discusses time-related principles of psychoacoustics.</li>
<li><a href="mdct.html">Time domain segmentation and MDCT transform</a>:
Division of time domain data into individual overlapped, windowed
short-time vectors and transformation using the MDCT.</li>
<li><a href="resolution.html">The resolution floor</a>: Use of frequency
domain psychoacoustics, and the MDCT-domain noise, masking and resolution
floors.</li>
<li><a href="residuals.html">MDCT-domain fine structure</a>: Production,
quantization and massaging of MDCT-spectrum fine structure.</li>
</ul>

<h3>Vorbis coding and decoding</h3>

<p>Coding and decoding convert the transform-domain representation of
the original audio produced by analysis to and from a bitwise packed
raw data packet. Coding and decoding consist of two logically
orthogonal concepts, <em>back-end coding</em> and <em>bitpacking</em>.</p>

<p><em>Back-end coding</em> uses a probability model to represent the raw numbers
of the audio representation in as few physical bits as possible;
familiar examples of back-end coding include Huffman coding and Vector
Quantization.</p>

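<p>As an illustration of the idea behind back-end coding, the following Python
sketch builds a textbook Huffman code from symbol counts and compares its cost
with a fixed-width encoding. It is a generic example, not the Vorbis
probability model; the function name <code>huffman_code_lengths</code> and the
sample values are assumptions made for illustration.</p>

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over `symbols`."""
    counts = Counter(symbols)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): 1}  # a lone symbol still needs one bit
    # Heap entries: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, a = heapq.heappop(heap)
        n2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


# Hypothetical quantized values: small numbers are far more common.
values = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
lengths = huffman_code_lengths(values)
total_bits = sum(lengths[v] for v in values)
print(lengths, total_bits)
```

<p>With these counts the frequent value 0 receives a one-bit code, so the 16
values cost 28 bits in total rather than the 32 bits a fixed two-bit code
would need.</p>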
<p><em>Bitpacking</em> arranges the variable-sized words of the back-end
coding into a vector of octets without wasting space. The octets
produced by coding a single short-time audio segment form one raw Vorbis
packet.</p>

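<p>The bitpacking idea can be sketched in a few lines of Python. This is a
minimal sketch, not the libvorbis bitpacker: the function names
<code>pack_words</code> and <code>unpack_words</code> are hypothetical, and the
least-significant-bit-first ordering is an assumption of the example rather
than something stated on this page.</p>

```python
def pack_words(words):
    """Pack (value, bit_width) pairs into octets, least significant bit first."""
    acc = 0    # bit accumulator
    nbits = 0  # number of valid bits currently held in acc
    out = bytearray()
    for value, width in words:
        if width < 0 or value < 0 or value >> width:
            raise ValueError("value does not fit in the given width")
        acc |= value << nbits
        nbits += width
        while nbits >= 8:  # flush every completed octet
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)  # final partial octet, padded with zero bits
    return bytes(out)


def unpack_words(data, widths):
    """Inverse of pack_words for a known sequence of bit widths."""
    acc = int.from_bytes(data, "little")
    values = []
    for width in widths:
        values.append(acc & ((1 << width) - 1))
        acc >>= width
    return values


words = [(5, 3), (1, 1), (300, 10), (2, 2)]
packet = pack_words(words)
print(len(packet), unpack_words(packet, [3, 1, 10, 2]))
```

<p>The four words occupy 3 + 1 + 10 + 2 = 16 bits, so they fit exactly into two
octets; storing each word in its own whole number of octets would take five.</p>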
<ul>
<li><a href="probmodel.html">The Vorbis probability model</a></li>
<li><a href="bitpack.html">The Vorbis bitpacker</a>: Arrangement of
variable bit-length words into an octet-aligned packet.</li>
</ul>

<h3>Vorbis streaming and stream decomposition</h3>

<p>Vorbis packets contain the raw, bitwise-compressed representation of a
snippet of audio. These packets contain no structure and cannot be
strung together directly into a stream; for streamed transmission and
storage, Vorbis packets are encoded into an Ogg bitstream.</p>

<ul>
<li><a href="oggstream.html">Ogg bitstream overview</a>: High-level
description of Ogg logical bitstreams, how logical bitstreams
(of mixed media types) can be combined into physical bitstreams, and
restrictions on logical-to-physical mapping. Note that this document is
not specific to Ogg Vorbis.</li>
<li><a href="framing.html">Ogg logical bitstream and framing
spec</a>: Low-level, complete specification of Ogg logical
bitstream pages. Note that this document is not specific to Ogg
Vorbis.</li>
<li><a href="vorbis-stream.html">Vorbis bitstream mapping</a>:
Specifically describes mapping Vorbis data into an
Ogg physical bitstream.</li>
</ul>

<div id="copyright">
  The Xiph Fish Logo is a
  trademark (™) of Xiph.Org.<br/>

  These pages © 1994 - 2005 Xiph.Org. All rights reserved.
</div>

</body>
</html>