annotate src/libvorbis-1.3.3/doc/vorbisenc/overview.html @ 1:05aa0afa9217

Bring in flac, ogg, vorbis
author Chris Cannam
date Tue, 19 Mar 2013 17:37:49 +0000
rev   line source
Chris@1 1 <html>
Chris@1 2
Chris@1 3 <head>
Chris@1 4 <title>libvorbisenc - API Overview</title>
Chris@1 5 <link rel=stylesheet href="style.css" type="text/css">
Chris@1 6 </head>
Chris@1 7
Chris@1 8 <body bgcolor=white text=black link="#5555ff" alink="#5555ff" vlink="#5555ff">
Chris@1 9 <table border=0 width=100%>
Chris@1 10 <tr>
Chris@1 11 <td><p class=tiny>libvorbisenc documentation</p></td>
Chris@1 12 <td align=right><p class=tiny>libvorbisenc version 1.3.2 - 20101101</p></td>
Chris@1 13 </tr>
Chris@1 14 </table>
Chris@1 15
Chris@1 16 <h1>Libvorbisenc API Overview</h1>
Chris@1 17
Chris@1 18 <p>Libvorbisenc is an encoding convenience library intended to
Chris@1 19 encapsulate the elaborate setup that libvorbis requires for encoding.
Chris@1 20 Libvorbisenc gives easy access to all high-level adjustments an
Chris@1 21 application may require when encoding and also exposes some low-level
Chris@1 22 tuning parameters to allow applications to make detailed adjustments
Chris@1 23 to the encoding process. <p>
Chris@1 24
Chris@1 25 All the <b>libvorbisenc</b> routines are declared in "vorbis/vorbisenc.h".
Chris@1 26
Chris@1 27 <em>Note: libvorbis and libvorbisenc always
Chris@1 28 encode in a single pass. Thus, all possible encoding setups will work
Chris@1 29 properly with live input and produce streams that decode properly when
Chris@1 30 streamed. See the subsection titled <a href="#BBR">"managed bitrate
Chris@1 31 modes"</a> for details on setting limits on bitrate usage when Vorbis
Chris@1 32 streams are used in a limited-bandwidth environment.</em>
Chris@1 33
Chris@1 34 <h2>workflow</h2>
Chris@1 35
Chris@1 36 <p>Libvorbisenc is used only during encoder setup; its function
Chris@1 37 is to automate initialization of a multitude of settings in a
Chris@1 38 <tt>vorbis_info</tt> structure which libvorbis then uses as a reference
Chris@1 39 during the encoding process. Libvorbisenc plays no part in the
Chris@1 40 encoding process after setup.
Chris@1 41
Chris@1 42 <p>Encode setup using libvorbisenc consists of three steps:
Chris@1 43
Chris@1 44 <ol>
Chris@1 45 <li>high-level initialization of a <tt>vorbis_info</tt> structure by
Chris@1 46 calling one of <a
Chris@1 47 href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a
Chris@1 48 href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>
Chris@1 49 with the basic input audio parameters (rate and channels) and the
Chris@1 50 basic desired encoded audio output parameters (VBR quality or ABR/CBR
Chris@1 51 bitrate)<p>
Chris@1 52
Chris@1 53 <li>optional adjustment of the basic setup defaults using <a
Chris@1 54 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a><p>
Chris@1 55
Chris@1 56 <li>calling <a
Chris@1 57 href="vorbis_encode_setup_init.html">vorbis_encode_setup_init()</a> to
Chris@1 58 finalize the high-level setup into the detailed low-level reference
Chris@1 59 values needed by libvorbis to encode audio. The <tt>vorbis_info</tt>
Chris@1 60 structure is then ready to use for encoding by libvorbis.<p>
Chris@1 61
Chris@1 62 </ol>
Chris@1 63
Chris@1 64 These three steps can be collapsed into a single call by using <a
Chris@1 65 href="vorbis_encode_init_vbr.html">vorbis_encode_init_vbr</a> to set up a
Chris@1 66 quality-based VBR stream or <a
Chris@1 67 href="vorbis_encode_init.html">vorbis_encode_init</a> to set up a managed
Chris@1 68 bitrate (ABR or CBR) stream.<p>
Chris@1 69
Chris@1 70 <h2>adjustable encoding parameters</h2>
Chris@1 71
Chris@1 72 <h3>input audio parameters</h3>
Chris@1 73
Chris@1 74 <p>
Chris@1 75 <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
Chris@1 76 <tr bgcolor=#cccccc>
Chris@1 77 <td><b>parameter</b></td>
Chris@1 78 <td><b>description</b></td>
Chris@1 79 </tr>
Chris@1 80 <tr valign=top>
Chris@1 81 <td>sampling rate</td>
Chris@1 82 <td>
Chris@1 83 The sampling rate (in samples per second) of the input audio. Common examples are 8000 for telephony, 44100 for CD audio, and 48000 for DAT. Note that a mono sample (one center value) and a stereo sample (one left value and one right value) each count as a single sample.
Chris@1 84
Chris@1 85 </td>
Chris@1 86 </tr>
Chris@1 87 <tr valign=top>
Chris@1 88 <td>channels</td>
Chris@1 89 <td>
Chris@1 90
Chris@1 91 The number of channels encoded in each input sample. By default,
Chris@1 92 stereo input modes (two channels) are 'coupled' by Vorbis 1.1 such
Chris@1 93 that the stereo relationship between the samples is taken into account
Chris@1 94 when encoding. Stereo coupling may be disabled by using <a
Chris@1 95 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a
Chris@1 96 href="vorbis_encode_ctl.html#OV_ECTL_COUPLE_SET">OV_ECTL_COUPLE_SET</a>.
Chris@1 97
Chris@1 98 </td>
Chris@1 99 </tr>
Chris@1 100 </table>
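<p>Because one sample covers all channels, the amount of raw input per second is the sampling rate multiplied by the channel count. The sketch below illustrates that arithmetic. The helper name and the idea of passing a per-value byte width are assumptions made for illustration; they are not part of libvorbisenc.</p>

```c
#include <assert.h>

/* Hypothetical helper (not a libvorbisenc call): bytes of raw PCM
 * consumed per second of audio.
 *   rate            - samples per second (one sample spans all channels)
 *   channels        - values per sample
 *   bytes_per_value - storage width of one channel value (assumed here) */
static long pcm_bytes_per_second(long rate, long channels, long bytes_per_value)
{
    assert(rate > 0 && channels > 0 && bytes_per_value > 0);
    return rate * channels * bytes_per_value;
}
```

<p>For example, 44100 Hz stereo input stored as 2-byte values needs 44100 * 2 * 2 bytes per second.</p>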
Chris@1 101
Chris@1 102 <h3>quality and VBR modes</h3>
Chris@1 103
Chris@1 104 Vorbis is natively a VBR codec; a user requests a given constant
Chris@1 105 <em>quality</em> and the encoder keeps the encoding quality constant
Chris@1 106 while allowing the bitrate to vary. 'Quality' modes (Variable BitRate)
Chris@1 107 will always produce the most consistent encoding results as well as
Chris@1 108 the highest quality for the number of bits used.
Chris@1 109
Chris@1 110 <p>
Chris@1 111 <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
Chris@1 112 <tr bgcolor=#cccccc>
Chris@1 113 <td><b>parameter</b></td>
Chris@1 114 <td><b>description</b></td>
Chris@1 115 </tr>
Chris@1 116 <tr valign=top>
Chris@1 117 <td>quality</td>
Chris@1 118 <td>
Chris@1 119 A decimal float value requesting a desired quality. Libvorbisenc 1.1 allows quality requests in the range of -0.1 (lowest quality, smallest files) through +1.0 (highest-quality, largest files). Quality -0.1 is intended as an ultra-low setting in which low bitrate is much more important than quality consistency. Quality settings 0.0 and above are intended to produce consistent results at all times.
Chris@1 120
Chris@1 121 </td>
Chris@1 122 </tr>
Chris@1 123 </table>
Chris@1 124
Chris@1 125 <a name="BBR"></a>
Chris@1 126 <h3>managed bitrate modes</h3>
Chris@1 127
Chris@1 128 Although the Vorbis codec is natively VBR, libvorbis includes
Chris@1 129 infrastructure for 'managing' the bitrate of streams by setting
Chris@1 130 minimum and maximum usage constraints, as well as functionality for
Chris@1 131 nudging a stream toward a desired average value. These features
Chris@1 132 should <em>only</em> be used when there is a requirement to limit
Chris@1 133 bitrate in some way. Although the difference is usually slight,
Chris@1 134 managed bitrate modes will always produce output inferior to VBR
Chris@1 135 (given equal bitrate usage). Setting overly or impossibly tight
Chris@1 136 bitrate management requirements can affect output quality dramatically
Chris@1 137 for the worse.<p>
Chris@1 138
Chris@1 139 Beginning in libvorbis 1.1, bitrate management is implemented using a
Chris@1 140 <em>bit-reservoir</em> algorithm. The encoder has a fixed-size
Chris@1 141 reservoir used as a 'savings account' in encoding. When a frame is
Chris@1 142 smaller than the target rate, the unused bits go into the reservoir so
Chris@1 143 that they may be used by future frames. When a frame is larger than
Chris@1 144 the target rate, it draws 'banked' bits out of the reservoir. Encoding
Chris@1 145 is managed so that the reservoir never goes negative (when a maximum
Chris@1 146 bitrate is specified) or fills beyond a fixed limit (when a minimum
Chris@1 147 bitrate is specified). An 'average bitrate' request is used as the
Chris@1 148 set-point in a long-range bitrate tracker which adjusts the encoder's
Chris@1 149 aggressiveness up or down depending on whether frames are coming
Chris@1 150 in larger or smaller than the requested average point.
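<p>The 'savings account' behavior described above can be sketched as a toy model. This is not libvorbis code: the struct, the function name, and the simplification of enforcing both limits around a single per-frame target are assumptions made for illustration only.</p>

```c
#include <assert.h>

/* Toy model of a bit reservoir; names and layout are hypothetical. */
typedef struct {
    long size;    /* reservoir capacity in bits */
    long fill;    /* banked bits, always kept within 0..size */
    long target;  /* target bits per frame */
} reservoir;

/* Account for one frame that would like to use wanted_bits.
 * Returns the number of bits the frame actually gets to use. */
static long reservoir_frame(reservoir *r, long wanted_bits)
{
    long fill;
    long used = wanted_bits;

    assert(r->size >= 0 && r->fill >= 0 && r->fill <= r->size);

    /* Small frames bank their unused bits; large frames withdraw. */
    fill = r->fill + (r->target - wanted_bits);

    if (fill < 0) {
        /* Reservoir would go negative: shrink the frame by the shortfall. */
        used += fill;              /* fill is negative here */
        fill = 0;
    } else if (fill > r->size) {
        /* Reservoir would overflow: enlarge the frame by the excess. */
        used += fill - r->size;
        fill = r->size;
    }

    r->fill = fill;
    return used;
}
```

<p>In this model a frame is only resized when the reservoir alone cannot absorb the difference, which is why brief surges and dips pass through untouched.</p>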
Chris@1 151
Chris@1 152 <p>
Chris@1 153 <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
Chris@1 154 <tr bgcolor=#cccccc>
Chris@1 155 <td><b>parameter</b></td>
Chris@1 156 <td><b>description</b></td>
Chris@1 157 </tr>
Chris@1 158 <tr valign=top>
Chris@1 159 <td>maximum bitrate</td> <td> The maximum allowed bitrate, set in bits
Chris@1 160 per second. If the bitrate would otherwise rise such that oversized
Chris@1 161 frames would underflow the bit-reservoir by consuming banked bits,
Chris@1 162 bitrate management will force the encoder to use fewer bits per frame
Chris@1 163 by encoding with a more aggressive psychoacoustic model.<p> This
Chris@1 164 setting is a hard limit; the bitstream will never be allowed, under
Chris@1 165 any circumstances, to increase above the specified bitrate over the
Chris@1 166 average period set by the reservoir; it may momentarily rise over if
Chris@1 167 inspected on a granularity much finer than the average period across
Chris@1 168 the reservoir. Normally, the encoder will conserve bits gracefully by
Chris@1 169 using more aggressive psychoacoustics to shrink a frame when forced
Chris@1 170 to. However, if the encoder runs out of means of gracefully shrinking
Chris@1 171 a frame, it will simply take the smallest frame it can otherwise
Chris@1 172 generate and truncate it to the maximum allowed length. Note that
Chris@1 173 this is not an error and although it will obviously adversely affect
Chris@1 174 audio quality, a Vorbis decoder will be able to decode a truncated
Chris@1 175 frame into audio.
Chris@1 176
Chris@1 177 </td>
Chris@1 178 </tr>
Chris@1 179
Chris@1 180 <tr valign=top>
Chris@1 181 <td>average bitrate</td>
Chris@1 182
Chris@1 183 <td>
Chris@1 184
Chris@1 185 The average desired bitrate of a stream, set
Chris@1 186 in bits per second. Average bitrate is tracked via a reservoir, like
Chris@1 187 minimum and maximum bitrate; however, the averaging reservoir does not
Chris@1 188 impose a hard limit; it is used to nudge the bitrate toward the
Chris@1 189 desired average by slowly adjusting the psychoacoustic aggressiveness.
Chris@1 190 As such, the reservoir size does not affect the average bitrate
Chris@1 191 behavior. Because this setting alone is not used to impose hard
Chris@1 192 bitrate limits, the bitrate of a stream produced using only the
Chris@1 193 <tt>average bitrate</tt> constraint will track the average over time
Chris@1 194 but not necessarily adhere strictly to that average for any given
Chris@1 195 period. Should a strict localized average be required, <tt>average
Chris@1 196 bitrate</tt> should be used along with <tt>minimum bitrate</tt> and
Chris@1 197 <tt>maximum bitrate</tt>.
Chris@1 198 </td>
Chris@1 199
Chris@1 200 </tr>
Chris@1 201
Chris@1 202 <tr valign=top>
Chris@1 203 <td>minimum bitrate</td>
Chris@1 204 <td>
Chris@1 205 The minimum allowed bitrate, set in bits per second. If
Chris@1 206 the bitrate would otherwise fall such that undersized frames would
Chris@1 207 overflow the bit-reservoir with unused bits, bitrate management will
Chris@1 208 force the encoder to use more bits per frame by encoding with a less
Chris@1 209 aggressive psychoacoustic model.<p> This setting is a hard limit; the
Chris@1 210 bitstream will never be allowed, under any circumstances, to drop
Chris@1 211 below the specified bitrate over the average period set by the
Chris@1 212 reservoir; it may momentarily fall under if inspected on a granularity
Chris@1 213 much finer than the average period across the reservoir. Normally,
Chris@1 214 the encoder will fill out undersized frames with additional useful
Chris@1 215 coding information by increasing the perceived quality of the stream.
Chris@1 216 If the encoder runs out of useful ways to consume more bits, it will
Chris@1 217 pad frames out with zeroes.
Chris@1 218 </td>
Chris@1 219 </tr>
Chris@1 220
Chris@1 221 <tr valign=top>
Chris@1 222 <td>reservoir size</td> <td> The size of the minimum/maximum bitrate
Chris@1 223 tracking reservoir, set in bits. The reservoir is used as a 'bit
Chris@1 224 bank' to average out localized surges and dips in bitrate while
Chris@1 225 providing predictable, guaranteed buffering behavior for streams to be
Chris@1 226 used in situations with constrained transport bandwidth. The default
Chris@1 227 setting is two seconds of average bitrate.<p>
Chris@1 228
Chris@1 229 When a single frame is larger than the maximum allowed overall
Chris@1 230 bitrate, the bits are 'borrowed' from the bitrate reservoir; if the
Chris@1 231 reservoir contains insufficient bits to cover the deficit, the encoder
Chris@1 232 must find some way to reduce the frame size. <p>
Chris@1 233
Chris@1 234 When a frame is under the minimum limit, the surplus bits are placed
Chris@1 235 into the reservoir, banking them for future use. If the reservoir is
Chris@1 236 already full of banked bits, the encoder is forced to find some way to
Chris@1 237 make the frame larger.<p>
Chris@1 238
Chris@1 239 If the frame size is between the minimum and maximum rates (thus
Chris@1 240 implying the minimum and maximum allowed rates are different), the
Chris@1 241 reservoir gravitates toward a fill point configured by the
Chris@1 242 <tt>reservoir bias</tt> setting described next. If the reservoir is
Chris@1 243 fuller than the fill point (a 'surplus of surplus'), the encoder will
Chris@1 244 consume a number of bits from the reservoir equal to the number of
Chris@1 245 bits by which the frame exceeds minimum size. If the reservoir is
Chris@1 246 emptier than the fill point (a 'surplus of deficit'), bits are returned
Chris@1 247 to the reservoir equaling the current frame's number of bits under the
Chris@1 248 maximum frame size. The idea of the fill point is to buffer against
Chris@1 249 both underruns and overruns, by trying to hold the reservoir to a
Chris@1 250 middle course.
Chris@1 251 </td>
Chris@1 252 </tr>
Chris@1 253
Chris@1 254 <tr valign=top>
Chris@1 255 <td>reservoir bias</td>
Chris@1 256
Chris@1 257 <td>
Chris@1 258
Chris@1 259 Reservoir bias is a setting between 0.0 and 1.0 that biases bitrate
Chris@1 260 management toward smoothing bitrate spikes (0.0) or bitrate dips
Chris@1 261 (1.0); the default setting is 0.1.<p>
Chris@1 262
Chris@1 263 Using settings toward 0.0 causes the bitrate manager to hoard bits in
Chris@1 264 the bit reservoir such that there is a large pool of banked surplus to
Chris@1 265 draw upon during short spikes in bitrate. As a result, the encoder
Chris@1 266 will react less aggressively and less drastically to curtail frame size
Chris@1 267 during brief surges in bitrate.<p>
Chris@1 268
Chris@1 269 Using settings toward 1.0 causes the bitrate manager to empty the bit
Chris@1 270 reservoir such that there is a large buffer available to store surplus
Chris@1 271 bits during sudden drops in bitrate. As a result, the encoder will
Chris@1 272 react less aggressively and less drastically to support minimum frame
Chris@1 273 sizes during drops in bitrate and will tend not to store any extra
Chris@1 274 bits in the reservoir for future bitrate spikes.<p>
Chris@1 275
Chris@1 276 </td>
Chris@1 277 </tr>
Chris@1 278
Chris@1 279 <tr valign=top>
Chris@1 280 <td>average track damping</td>
Chris@1 281 <td>
Chris@1 282
Chris@1 283 A decimal value, in seconds, that controls how quickly the average
Chris@1 284 bitrate tracker is allowed to slew from enforcing minimum frame sizes
Chris@1 285 to maximum frame sizes and vice versa. Default value is 1.5
Chris@1 286 seconds.<p>
Chris@1 287
Chris@1 288 When the 'average bitrate' setting is in use, the average bitrate
Chris@1 289 tracker uses an unbounded reservoir to track overall bitrate-to-date
Chris@1 290 in the stream. When bitrates are too low, the tracker will try to
Chris@1 291 nudge bitrates up and when the bitrate is too high, nudge it down.
Chris@1 292 The damping value regulates the maximum strength of the nudge; it
Chris@1 293 describes, in seconds, how quickly the tracker may transition from an
Chris@1 294 extreme nudge in one direction to an extreme nudge in the other.<p>
Chris@1 295
Chris@1 296 </td>
Chris@1 297 </tr>
Chris@1 298
Chris@1 299 </table>
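<p>The reservoir size is given in bits, so it is often easier to reason about it as a duration at the average bitrate. The sketch below shows the arithmetic behind the documented default of two seconds of average bitrate. Both helper names are hypothetical and are not part of the libvorbisenc API.</p>

```c
#include <assert.h>

/* Hypothetical helper: reservoir size in bits for the documented
 * default of two seconds of average bitrate (bits per second). */
static long default_reservoir_bits(long average_bps)
{
    assert(average_bps >= 0);
    return 2 * average_bps;
}

/* Hypothetical helper: how many seconds of stream, at the average
 * bitrate, a reservoir of the given size spans. */
static double reservoir_seconds(long reservoir_bits, long average_bps)
{
    assert(average_bps > 0);
    return (double)reservoir_bits / (double)average_bps;
}
```

<p>A 128000 bit-per-second average therefore implies a 256000-bit default reservoir; halving the reservoir to 128000 bits tightens the averaging period to one second.</p>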
Chris@1 300
Chris@1 301 <h3>encoding model adjustments</h3>
Chris@1 302
Chris@1 303 The <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> call provides
Chris@1 304 a generalized interface for making encoding setup adjustments to the
Chris@1 305 basic high-level setup provided by <a
Chris@1 306 href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a
Chris@1 307 href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>.
Chris@1 308 In reality, these two calls use <a
Chris@1 309 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> internally, and <a
Chris@1 310 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can be used to adjust
Chris@1 311 most of the parameters set by other calls.<p>
Chris@1 312
Chris@1 313 In Vorbis 1.1, <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can
Chris@1 314 adjust the following additional parameters not described elsewhere:
Chris@1 315
Chris@1 316 <p>
Chris@1 317 <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
Chris@1 318 <tr bgcolor=#cccccc>
Chris@1 319 <td><b>parameter</b></td>
Chris@1 320 <td><b>description</b></td>
Chris@1 321 </tr>
Chris@1 322 <tr valign=top>
Chris@1 323 <td>management mode</td> <td> Configures whether bitrate
Chris@1 324 management is in use. Normally, this value is set implicitly
Chris@1 325 during encoding setup; however, the supported means of selecting a
Chris@1 326 quality mode by bitrate (that is, requesting a true VBR stream, but
Chris@1 327 doing so by asking for an approximate bitrate) is to use <a
Chris@1 328 href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>
Chris@1 329 and then to explicitly turn off bitrate management by calling <a
Chris@1 330 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a
Chris@1 331 href="vorbis_encode_ctl.html#OV_ECTL_RATEMANAGE2_SET">OV_ECTL_RATEMANAGE2_SET</a>.
Chris@1 332 </td>
Chris@1 333 </tr>
Chris@1 334
Chris@1 335 <tr valign=top>
Chris@1 336 <td>coupling</td> <td> Stereo streams (and, in the future, surround
Chris@1 337 streams) are normally encoded assuming the channels form a stereo
Chris@1 338 image and that lossy-stereo modelling is appropriate; this is called
Chris@1 339 'coupling'. Stereo coupling may be explicitly enabled or disabled.
Chris@1 340 </td>
Chris@1 341 </tr>
Chris@1 342 <tr valign=top>
Chris@1 343 <td>lowpass</td> <td> Sets the hard lowpass of a given encoding mode;
Chris@1 344 this may be used to conserve a few bits in high-rate audio that has
Chris@1 345 limited bandwidth, or in testing of the encoder's acoustic model. The
Chris@1 346 encoder is generally already configured with ideal lowpasses (if any
Chris@1 347 at all) for given modes; use of this parameter is strongly discouraged
Chris@1 348 if the point is to try to 'improve' a given encoding mode for general
Chris@1 349 encoding.
Chris@1 350 </td>
Chris@1 351 </tr>
Chris@1 352
Chris@1 353 <tr valign=top>
Chris@1 354 <td>impulse coding aggressiveness</td> <td>By default, libvorbis
Chris@1 355 attempts to compromise between preventing wide bitrate swings and
Chris@1 356 high-resolution impulse coding (which is required for the crispest
Chris@1 357 possible attacks, but also requires a relatively large momentary
Chris@1 358 bitrate increase). This parameter allows an application to tune the
Chris@1 359 compromise or eliminate it; a value of 0.0 indicates normal behavior
Chris@1 360 while a value of -15.0 requests maximum possible impulse
Chris@1 361 resolution.</td>
Chris@1 362 </tr>
Chris@1 363
Chris@1 364 </table>
Chris@1 365
Chris@1 366
Chris@1 367 <br><br>
Chris@1 368 <hr noshade>
Chris@1 369 <table border=0 width=100%>
Chris@1 370 <tr valign=top>
Chris@1 371 <td><p class=tiny>copyright &copy; 2000-2010 Xiph.Org</p></td>
Chris@1 372 <td align=right><p class=tiny><a href="http://www.xiph.org/ogg/vorbis/index.html">Ogg Vorbis</a></p></td>
Chris@1 373 </tr><tr>
Chris@1 374 <td><p class=tiny>libvorbisenc documentation</p></td>
Chris@1 375 <td align=right><p class=tiny>libvorbisenc version 1.3.2 - 20101101</p></td>
Chris@1 376 </tr>
Chris@1 377 </table>
Chris@1 378
Chris@1 379 </body>
Chris@1 380
Chris@1 381 </html>
Chris@1 382