comparison src/libvorbis-1.3.3/doc/vorbisenc/overview.html @ 1:05aa0afa9217
Bring in flac, ogg, vorbis
author | Chris Cannam |
---|---|
date | Tue, 19 Mar 2013 17:37:49 +0000 |
parents | |
children |
0:c7265573341e | 1:05aa0afa9217 |
---|---|
1 <html> | |
2 | |
3 <head> | |
4 <title>libvorbisenc - API Overview</title> | |
5 <link rel=stylesheet href="style.css" type="text/css"> | |
6 </head> | |
7 | |
8 <body bgcolor=white text=black link="#5555ff" alink="#5555ff" vlink="#5555ff"> | |
9 <table border=0 width=100%> | |
10 <tr> | |
11 <td><p class=tiny>libvorbisenc documentation</p></td> | |
12 <td align=right><p class=tiny>libvorbisenc version 1.3.2 - 20101101</p></td> | |
13 </tr> | |
14 </table> | |
15 | |
16 <h1>Libvorbisenc API Overview</h1> | |
17 | |
18 <p>Libvorbisenc is an encoding convenience library intended to | |
19 encapsulate the elaborate setup that libvorbis requires for encoding. | |
20 Libvorbisenc gives easy access to all high-level adjustments an | |
21 application may require when encoding and also exposes some low-level | |
22 tuning parameters to allow applications to make detailed adjustments | |
23 to the encoding process. <p> | |
24 | |
25 All the <b>libvorbisenc</b> routines are declared in "vorbis/vorbisenc.h". | |
26 | |
27 <em>Note: libvorbis and libvorbisenc always | |
28 encode in a single pass. Thus, all possible encoding setups will work | |
29 properly with live input and produce streams that decode properly when | |
30 streamed. See the subsection titled <a href="#BBR">"managed bitrate | |
31 modes"</a> for details on setting limits on bitrate usage when Vorbis | |
32 streams are used in a limited-bandwidth environment.</em> | |
33 | |
34 <h2>workflow</h2> | |
35 | |
36 <p>Libvorbisenc is used only during encoder setup; its function | |
37 is to automate initialization of a multitude of settings in a | |
38 <tt>vorbis_info</tt> structure which libvorbis then uses as a reference | |
39 during the encoding process. Libvorbisenc plays no part in the | |
40 encoding process after setup. | |
41 | |
42 <p>Encode setup using libvorbisenc consists of three steps: | |
43 | |
44 <ol> | |
45 <li>high-level initialization of a <tt>vorbis_info</tt> structure by | |
46 calling one of <a | |
47 href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a | |
48 href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a> | |
49 with the basic input audio parameters (rate and channels) and the | |
50 basic desired encoded audio output parameters (VBR quality or ABR/CBR | |
51 bitrate)<p> | |
52 | |
53 <li>optional adjustment of the basic setup defaults using <a | |
54 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a><p> | |
55 | |
56 <li>calling <a | |
57 href="vorbis_encode_setup_init.html">vorbis_encode_setup_init()</a> to | |
58 finalize the high-level setup into the detailed low-level reference | |
59 values needed by libvorbis to encode audio. The <tt>vorbis_info</tt> | |
60 structure is then ready to use for encoding by libvorbis.<p> | |
61 | |
62 </ol> | |
63 | |
64 These three steps can be collapsed into a single call by using <a | |
65 href="vorbis_encode_init_vbr.html">vorbis_encode_init_vbr</a> to set up a | |
66 quality-based VBR stream or <a | |
67 href="vorbis_encode_init.html">vorbis_encode_init</a> to set up a managed | |
68 bitrate (ABR or CBR) stream.<p> | |
69 | |
70 <h2>adjustable encoding parameters</h2> | |
71 | |
72 <h3>input audio parameters</h3> | |
73 | |
74 <p> | |
75 <table border=1 color=black width=50% cellspacing=0 cellpadding=7> | |
76 <tr bgcolor=#cccccc> | |
77 <td><b>parameter</b></td> | |
78 <td><b>description</b></td> | |
79 </tr> | |
80 <tr valign=top> | |
81 <td>sampling rate</td> | |
82 <td> | |
83 The sampling rate (in samples per second) of the input audio. Common examples are 8000 for telephony, 44100 for CD audio and 48000 for DAT. Note that a mono sample (one center value) and a stereo sample (one left value and one right value) each count as a single sample. | |
84 | |
85 </td> | |
86 </tr> | |
87 <tr valign=top> | |
88 <td>channels</td> | |
89 <td> | |
90 | |
91 The number of channels encoded in each input sample. By default, | |
92 stereo input modes (two channels) are 'coupled' by Vorbis 1.1 such | |
93 that the stereo relationship between the samples is taken into account | |
94 when encoding. Stereo coupling may be disabled by using <a | |
95 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a | |
96 href="vorbis_encode_ctl.html#OV_ECTL_COUPLE_SET">OV_ECTL_COUPLE_SET</a>. | |
97 | |
98 </td> | |
99 </tr> | |
100 </table> | |
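The definition of a sample given in the table (one value per channel) can be illustrated with a small self-contained sketch. The function names `sample_count` and `duration_seconds` are hypothetical helpers for this example only, and an interleaved input buffer is assumed.

```c
#include <stddef.h>

/* Number of samples in an interleaved buffer holding value_count
   individual values. One sample contains one value per channel. */
size_t sample_count(size_t value_count, int channels)
{
    if (channels <= 0)
        return 0;
    return value_count / (size_t)channels;
}

/* Playing time in seconds of that buffer at the given sampling rate. */
double duration_seconds(size_t value_count, int channels, long rate)
{
    if (rate <= 0)
        return 0.0;
    return (double)sample_count(value_count, channels) / (double)rate;
}
```

For instance, 88200 interleaved stereo values at 44100 samples per second amount to 44100 samples, which is one second of audio.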
101 | |
102 <h3>quality and VBR modes</h3> | |
103 | |
104 Vorbis is natively a VBR codec; a user requests a given constant | |
105 <em>quality</em> and the encoder keeps the encoding quality constant | |
106 while allowing the bitrate to vary. 'Quality' modes (Variable BitRate) | |
107 will always produce the most consistent encoding results as well as | |
108 the highest quality for the amount of bits used. | |
109 | |
110 <p> | |
111 <table border=1 color=black width=50% cellspacing=0 cellpadding=7> | |
112 <tr bgcolor=#cccccc> | |
113 <td><b>parameter</b></td> | |
114 <td><b>description</b></td> | |
115 </tr> | |
116 <tr valign=top> | |
117 <td>quality</td> | |
118 <td> | |
119 A decimal float value requesting a desired quality. Libvorbisenc 1.1 allows quality requests in the range of -0.1 (lowest quality, smallest files) through +1.0 (highest quality, largest files). Quality -0.1 is intended as an ultra-low setting in which low bitrate is much more important than quality consistency. Quality settings 0.0 and above are intended to produce consistent results at all times. | |
120 | |
121 </td> | |
122 </tr> | |
123 </table> | |
124 | |
125 <a name="BBR"></a> | |
126 <h3>managed bitrate modes</h3> | |
127 | |
128 Although the Vorbis codec is natively VBR, libvorbis includes | |
129 infrastructure for 'managing' the bitrate of streams by setting | |
130 minimum and maximum usage constraints, as well as functionality for | |
131 nudging a stream toward a desired average value. These features | |
132 should <em>only</em> be used when there is a requirement to limit | |
133 bitrate in some way. Although the difference is usually slight, | |
134 managed bitrate modes will always produce output inferior to VBR | |
135 (given equal bitrate usage). Setting overly or impossibly tight | |
136 bitrate management requirements can affect output quality dramatically | |
137 for the worse.<p> | |
138 | |
139 Beginning in libvorbis 1.1, bitrate management is implemented using a | |
140 <em>bit-reservoir</em> algorithm. The encoder has a fixed-size | |
141 reservoir used as a 'savings account' in encoding. When a frame is | |
142 smaller than the target rate, the unused bits go into the reservoir so | |
143 that they may be used by future frames. When a frame is larger than | |
144 the target rate, it draws 'banked' bits out of the reservoir. Encoding | |
145 is managed so that the reservoir never goes negative (when a maximum | |
146 bitrate is specified) or fills beyond a fixed limit (when a minimum | |
147 bitrate is specified). An 'average bitrate' request is used as the | |
148 set-point in a long-range bitrate tracker which adjusts the encoder's | |
149 aggressiveness up or down depending on whether frames are coming | |
150 in larger or smaller than the requested average point. | |
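The 'savings account' behavior described above can be modeled with a toy simulation. This is not the libvorbis implementation: it assumes a single per-frame target (as if the minimum and maximum rates were equal), and the type `reservoir` and function `reservoir_frame` are hypothetical names for this sketch.

```c
/* Toy bit reservoir: 'fill' is the number of banked bits, kept in the
   range 0..size. */
typedef struct {
    long size; /* reservoir capacity in bits */
    long fill; /* bits currently banked */
} reservoir;

/* Account for one frame that the encoder would like to code with
   wanted_bits, against a per-frame budget of target_bits. Returns the
   frame size actually allowed and updates the reservoir. */
long reservoir_frame(reservoir *r, long target_bits, long wanted_bits)
{
    long fill = r->fill + (target_bits - wanted_bits);
    long emitted = wanted_bits;

    if (fill < 0) {
        /* Reservoir would go negative: shrink the frame by the deficit. */
        emitted += fill;
        fill = 0;
    } else if (fill > r->size) {
        /* Reservoir would overflow: enlarge the frame by the excess. */
        emitted += fill - r->size;
        fill = r->size;
    }
    r->fill = fill;
    return emitted;
}
```

In this model a small frame banks its unused bits, a large frame spends them, and the frame size is only forced up or down once the reservoir hits one of its limits.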
151 | |
152 <p> | |
153 <table border=1 color=black width=50% cellspacing=0 cellpadding=7> | |
154 <tr bgcolor=#cccccc> | |
155 <td><b>parameter</b></td> | |
156 <td><b>description</b></td> | |
157 </tr> | |
158 <tr valign=top> | |
159 <td>maximum bitrate</td> <td> The maximum allowed bitrate, set in bits | |
160 per second. If the bitrate would otherwise rise such that oversized | |
161 frames would underflow the bit-reservoir by consuming banked bits, | |
162 bitrate management will force the encoder to use fewer bits per frame | |
163 by encoding with a more aggressive psychoacoustic model.<p> This | |
164 setting is a hard limit; the bitstream will never be allowed, under | |
165 any circumstances, to increase above the specified bitrate over the | |
166 average period set by the reservoir; it may momentarily rise over if | |
167 inspected on a granularity much finer than the average period across | |
168 the reservoir. Normally, the encoder will conserve bits gracefully by | |
169 using more aggressive psychoacoustics to shrink a frame when forced | |
170 to. However, if the encoder runs out of means of gracefully shrinking | |
171 a frame, it will simply take the smallest frame it can otherwise | |
172 generate and truncate it to the maximum allowed length. Note that | |
173 this is not an error and although it will obviously adversely affect | |
174 audio quality, a Vorbis decoder will be able to decode a truncated | |
175 frame into audio. | |
176 | |
177 </td> | |
178 </tr> | |
179 | |
180 <tr valign=top> | |
181 <td>average bitrate</td> | |
182 | |
183 <td> | |
184 | |
185 The average desired bitrate of a stream, set | |
186 in bits per second. Average bitrate is tracked via a reservoir like | |
187 minimum and maximum bitrate; however, the averaging reservoir does not | |
188 impose a hard limit; it is used to nudge the bitrate toward the | |
189 desired average by slowly adjusting the psychoacoustic aggressiveness. | |
190 As such, the reservoir size does not affect the average bitrate | |
191 behavior. Because this setting alone is not used to impose hard | |
192 bitrate limits, the bitrate of a stream produced using only the | |
193 <tt>average bitrate</tt> constraint will track the average over time | |
194 but not necessarily adhere strictly to that average for any given | |
195 period. Should a strict localized average be required, <tt>average | |
196 bitrate</tt> should be used along with <tt>minimum bitrate</tt> and | |
197 <tt>maximum bitrate</tt>. | |
198 </td> | |
199 | |
200 </tr> | |
201 | |
202 <tr valign=top> | |
203 <td>minimum bitrate</td> | |
204 <td> | |
205 The minimum allowed bitrate, set in bits per second. If | |
206 the bitrate would otherwise fall such that undersized frames would | |
207 overflow the bit-reservoir with unused bits, bitrate management will | |
208 force the encoder to use more bits per frame by encoding with a less | |
209 aggressive psychoacoustic model.<p> This setting is a hard limit; the | |
210 bitstream will never be allowed, under any circumstances, to drop | |
211 below the specified bitrate over the average period set by the | |
212 reservoir; it may momentarily fall under if inspected on a granularity | |
213 much finer than the average period across the reservoir. Normally, | |
214 the encoder will fill out undersized frames with additional useful | |
215 coding information by increasing the perceived quality of the stream. | |
216 If the encoder runs out of useful ways to consume more bits, it will | |
217 pad frames out with zeroes. | |
218 </td> | |
219 </tr> | |
220 | |
221 <tr valign=top> | |
222 <td>reservoir size</td> <td> The size of the minimum/maximum bitrate | |
223 tracking reservoir, set in bits. The reservoir is used as a 'bit | |
224 bank' to average out localized surges and dips in bitrate while | |
225 providing predictable, guaranteed buffering behavior for streams to be | |
226 used in situations with constrained transport bandwidth. The default | |
227 setting is two seconds of average bitrate.<p> | |
228 | |
229 When a single frame is larger than the maximum allowed overall | |
230 bitrate, the bits are 'borrowed' from the bitrate reservoir; if the | |
231 reservoir contains insufficient bits to cover the deficit, the encoder | |
232 must find some way to reduce the frame size. <p> | |
233 | |
234 When a frame is under the minimum limit, the surplus bits are placed | |
235 into the reservoir, banking them for future use. If the reservoir is | |
236 already full of banked bits, the encoder is forced to find some way to | |
237 make the frame larger.<p> | |
238 | |
239 If the frame size is between the minimum and maximum rates (thus | |
240 implying the minimum and maximum allowed rates are different), the | |
241 reservoir gravitates toward a fill point configured by the | |
242 <tt>reservoir bias</tt> setting described next. If the reservoir is | |
243 fuller than the fill point (a 'surplus of surplus'), the encoder will | |
244 consume a number of bits from the reservoir equal to the number of | |
245 bits by which the frame exceeds minimum size. If the reservoir is | |
246 emptier than the fill point (a 'surplus of deficit'), bits are returned | |
247 to the reservoir equaling the current frame's number of bits under the | |
248 maximum frame size. The idea of the fill point is to buffer against | |
249 both underruns and overruns, by trying to hold the reservoir to a | |
250 middle course. | |
251 </td> | |
252 </tr> | |
253 | |
254 <tr valign=top> | |
255 <td>reservoir bias</td> | |
256 | |
257 <td> | |
258 | |
259 Reservoir bias is a setting between 0.0 and 1.0 that biases bitrate | |
260 management toward smoothing bitrate spikes (0.0) or bitrate drops | |
261 (1.0); the default setting is 0.1.<p> | |
262 | |
263 Using settings toward 0.0 causes the bitrate manager to hoard bits in | |
264 the bit reservoir such that there is a large pool of banked surplus to | |
265 draw upon during short spikes in bitrate. As a result, the encoder | |
266 will react less aggressively and less drastically to curtail frame size | |
267 during brief surges in bitrate.<p> | |
268 | |
269 Using settings toward 1.0 causes the bitrate manager to empty the bit | |
270 reservoir such that there is a large buffer available to store surplus | |
271 bits during sudden drops in bitrate. As a result, the encoder will | |
272 react less aggressively and less drastically to support minimum frame | |
273 sizes during drops in bitrate and will tend not to store any extra | |
274 bits in the reservoir for future bitrate spikes.<p> | |
275 | |
276 </td> | |
277 </tr> | |
278 | |
279 <tr valign=top> | |
280 <td>average track damping</td> | |
281 <td> | |
282 | |
283 A decimal value, in seconds, that controls how quickly the average | |
284 bitrate tracker is allowed to slew from enforcing minimum frame sizes | |
285 to maximum frame sizes and vice versa. Default value is 1.5 | |
286 seconds.<p> | |
287 | |
288 When the 'average bitrate' setting is in use, the average bitrate | |
289 tracker uses an unbounded reservoir to track overall bitrate-to-date | |
290 in the stream. When bitrates are too low, the tracker will try to | |
291 nudge bitrates up and when the bitrate is too high, nudge it down. | |
292 The damping value regulates the maximum strength of the nudge; it | |
293 describes, in seconds, how quickly the tracker may transition from an | |
294 extreme nudge in one direction to an extreme nudge in the other.<p> | |
295 | |
296 </td> | |
297 </tr> | |
298 | |
299 </table> | |
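The reservoir defaults in the table lend themselves to a short arithmetic sketch. The first helper follows the stated default of two seconds of average bitrate. The second is only an assumed linear reading of the <tt>reservoir bias</tt> description (0.0 keeps the reservoir full of banked bits, 1.0 keeps it empty); libvorbis may compute its fill point differently, and both function names are hypothetical.

```c
/* Default reservoir size per the text: two seconds of average bitrate. */
long default_reservoir_bits(long average_bitrate)
{
    return 2L * average_bitrate;
}

/* ASSUMPTION: linear mapping from bias to the number of banked bits the
   reservoir gravitates toward. Bias is clamped to 0.0..1.0 and the
   result is rounded to the nearest bit. */
long banked_fill_point(long reservoir_bits, double bias)
{
    if (bias < 0.0)
        bias = 0.0;
    if (bias > 1.0)
        bias = 1.0;
    return (long)((1.0 - bias) * (double)reservoir_bits + 0.5);
}
```

Under these assumptions, a 128000 bits-per-second average gives a 256000-bit reservoir, and the default bias of 0.1 would hold it at about 230400 banked bits.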
300 | |
301 <h3>encoding model adjustments</h3> | |
302 | |
303 The <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> call provides | |
304 a generalized interface for making encoding setup adjustments to the | |
305 basic high-level setup provided by <a | |
306 href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a | |
307 href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>. | |
308 In reality, these two calls use <a | |
309 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> internally, and <a | |
310 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can be used to adjust | |
311 most of the parameters set by other calls.<p> | |
312 | |
313 In Vorbis 1.1, <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can | |
314 adjust the following additional parameters not described elsewhere: | |
315 | |
316 <p> | |
317 <table border=1 color=black width=50% cellspacing=0 cellpadding=7> | |
318 <tr bgcolor=#cccccc> | |
319 <td><b>parameter</b></td> | |
320 <td><b>description</b></td> | |
321 </tr> | |
322 <tr valign=top> | |
323 <td>management mode</td> <td> Configures whether bitrate | |
324 management is in use. Normally, this value is set implicitly | |
325 during encoding setup; however, the supported means of selecting a | |
326 quality mode by bitrate (that is, requesting a true VBR stream, but | |
327 doing so by asking for an approximate bitrate) is to use <a | |
328 href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a> | |
329 and then to explicitly turn off bitrate management by calling <a | |
330 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a | |
331 href="vorbis_encode_ctl.html#OV_ECTL_RATEMANAGE2_SET">OV_ECTL_RATEMANAGE2_SET</a>. | |
332 </td> | |
333 </tr> | |
334 | |
335 <tr valign=top> | |
336 <td>coupling</td> <td> Stereo streams (and, in the future, surround | |
337 streams) are normally encoded assuming the channels form a stereo | |
338 image and that lossy-stereo modelling is appropriate; this is called | |
339 'coupling'. Stereo coupling may be explicitly enabled or disabled. | |
340 </td> | |
341 </tr> | |
342 <tr valign=top> | |
343 <td>lowpass</td> <td> Sets the hard lowpass of a given encoding mode; | |
344 this may be used to conserve a few bits in high-rate audio that has | |
345 limited bandwidth, or in testing of the encoder's acoustic model. The | |
346 encoder is generally already configured with ideal lowpasses (if any | |
347 at all) for given modes; use of this parameter is strongly discouraged | |
348 if the point is to try to 'improve' a given encoding mode for general | |
349 encoding. | |
350 </td> | |
351 </tr> | |
352 | |
353 <tr valign=top> | |
354 <td>impulse coding aggressiveness</td> <td>By default, libvorbis | |
355 attempts to compromise between preventing wide bitrate swings and | |
356 high-resolution impulse coding (which is required for the crispest | |
357 possible attacks, but also requires a relatively large momentary | |
358 bitrate increase). This parameter allows an application to tune the | |
359 compromise or eliminate it; a value of 0.0 indicates normal behavior | |
360 while a value of -15.0 requests maximum possible impulse | |
361 resolution.</td> | |
362 </tr> | |
363 | |
364 </table> | |
365 | |
366 | |
367 <br><br> | |
368 <hr noshade> | |
369 <table border=0 width=100%> | |
370 <tr valign=top> | |
371 <td><p class=tiny>copyright © 2000-2010 Xiph.Org</p></td> | |
372 <td align=right><p class=tiny><a href="http://www.xiph.org/ogg/vorbis/index.html">Ogg Vorbis</a></p></td> | |
373 </tr><tr> | |
374 <td><p class=tiny>libvorbisenc documentation</p></td> | |
375 <td align=right><p class=tiny>libvorbisenc version 1.3.2 - 20101101</p></td> | |
376 </tr> | |
377 </table> | |
378 | |
379 </body> | |
380 | |
381 </html> | |
382 |