annotate src/libvorbis-1.3.3/doc/vorbisenc/overview.html @ 86:98c1576536ae

Bring in flac, ogg, vorbis
author Chris Cannam <cannam@all-day-breakfast.com>
date Tue, 19 Mar 2013 17:37:49 +0000
rev   line source
cannam@86 1 <html>
cannam@86 2
cannam@86 3 <head>
cannam@86 4 <title>libvorbisenc - API Overview</title>
cannam@86 5 <link rel=stylesheet href="style.css" type="text/css">
cannam@86 6 </head>
cannam@86 7
cannam@86 8 <body bgcolor=white text=black link="#5555ff" alink="#5555ff" vlink="#5555ff">
cannam@86 9 <table border=0 width=100%>
cannam@86 10 <tr>
cannam@86 11 <td><p class=tiny>libvorbisenc documentation</p></td>
cannam@86 12 <td align=right><p class=tiny>libvorbisenc version 1.3.2 - 20101101</p></td>
cannam@86 13 </tr>
cannam@86 14 </table>
cannam@86 15
cannam@86 16 <h1>Libvorbisenc API Overview</h1>
cannam@86 17
cannam@86 18 <p>Libvorbisenc is an encoding convenience library intended to
cannam@86 19 encapsulate the elaborate setup that libvorbis requires for encoding.
cannam@86 20 Libvorbisenc gives easy access to all high-level adjustments an
cannam@86 21 application may require when encoding and also exposes some low-level
cannam@86 22 tuning parameters to allow applications to make detailed adjustments
cannam@86 23 to the encoding process. <p>
cannam@86 24
cannam@86 25 All the <b>libvorbisenc</b> routines are declared in "vorbis/vorbisenc.h".
cannam@86 26
cannam@86 27 <em>Note: libvorbis and libvorbisenc always
cannam@86 28 encode in a single pass. Thus, all possible encoding setups will work
cannam@86 29 properly with live input and produce streams that decode properly when
cannam@86 30 streamed. See the subsection titled <a href="#BBR">"managed bitrate
cannam@86 31 modes"</a> for details on setting limits on bitrate usage when Vorbis
cannam@86 32 streams are used in a limited-bandwidth environment.</em>
cannam@86 33
cannam@86 34 <h2>workflow</h2>
cannam@86 35
cannam@86 36 <p>Libvorbisenc is used only during encoder setup; its function
cannam@86 37 is to automate initialization of a multitude of settings in a
cannam@86 38 <tt>vorbis_info</tt> structure which libvorbis then uses as a reference
cannam@86 39 during the encoding process. Libvorbisenc plays no part in the
cannam@86 40 encoding process after setup.
cannam@86 41
cannam@86 42 <p>Encode setup using libvorbisenc consists of three steps:
cannam@86 43
cannam@86 44 <ol>
cannam@86 45 <li>high-level initialization of a <tt>vorbis_info</tt> structure by
cannam@86 46 calling one of <a
cannam@86 47 href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a
cannam@86 48 href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>
cannam@86 49 with the basic input audio parameters (rate and channels) and the
cannam@86 50 basic desired encoded audio output parameters (VBR quality or ABR/CBR
cannam@86 51 bitrate)<p>
cannam@86 52
cannam@86 53 <li>optional adjustment of the basic setup defaults using <a
cannam@86 54 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a><p>
cannam@86 55
cannam@86 56 <li>calling <a
cannam@86 57 href="vorbis_encode_setup_init.html">vorbis_encode_setup_init()</a> to
cannam@86 58 finalize the high-level setup into the detailed low-level reference
cannam@86 59 values needed by libvorbis to encode audio. The <tt>vorbis_info</tt>
cannam@86 60 structure is then ready to use for encoding by libvorbis.<p>
cannam@86 61
cannam@86 62 </ol>
cannam@86 63
cannam@86 64 These three steps can be collapsed into a single call by using <a
cannam@86 65 href="vorbis_encode_init_vbr.html">vorbis_encode_init_vbr()</a> to set up a
cannam@86 66 quality-based VBR stream or <a
cannam@86 67 href="vorbis_encode_init.html">vorbis_encode_init()</a> to set up a managed
cannam@86 68 bitrate (ABR or CBR) stream.<p>
cannam@86 69
cannam@86 70 <h2>adjustable encoding parameters</h2>
cannam@86 71
cannam@86 72 <h3>input audio parameters</h3>
cannam@86 73
cannam@86 74 <p>
cannam@86 75 <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
cannam@86 76 <tr bgcolor=#cccccc>
cannam@86 77 <td><b>parameter</b></td>
cannam@86 78 <td><b>description</b></td>
cannam@86 79 </tr>
cannam@86 80 <tr valign=top>
cannam@86 81 <td>sampling rate</td>
cannam@86 82 <td>
cannam@86 83 The sampling rate (in samples per second) of the input audio. Common examples are 8000 for telephony, 44100 for CD audio and 48000 for DAT. Note that a mono sample (one center value) and a stereo sample (one left value and one right value) each count as a single sample.
cannam@86 84
cannam@86 85 </td>
cannam@86 86 </tr>
cannam@86 87 <tr valign=top>
cannam@86 88 <td>channels</td>
cannam@86 89 <td>
cannam@86 90
cannam@86 91 The number of channels encoded in each input sample. By default,
cannam@86 92 stereo input modes (two channels) are 'coupled' by Vorbis 1.1 such
cannam@86 93 that the stereo relationship between the samples is taken into account
cannam@86 94 when encoding. Stereo coupling may be disabled by using <a
cannam@86 95 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a
cannam@86 96 href="vorbis_encode_ctl.html#OV_ECTL_COUPLE_SET">OV_ECTL_COUPLE_SET</a>.
cannam@86 97
cannam@86 98 </td>
cannam@86 99 </tr>
cannam@86 100 </table>
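<p>To make the definition of a sample concrete, the sketch below counts samples in an interleaved PCM buffer, where one sample holds one value per channel. The function names are hypothetical and the interleaved layout is an assumption made for illustration.

```c
#include <stddef.h>

/* Number of samples in an interleaved buffer holding value_count
   individual values. One sample contains one value per channel, so a
   stereo sample consumes two values. Returns 0 if channels is 0. */
static size_t pcm_sample_count(size_t value_count, unsigned channels)
{
    return channels ? value_count / channels : 0;
}

/* Duration in seconds of sample_count samples at the given sampling
   rate (samples per second). Returns 0.0 for a non-positive rate. */
static double pcm_duration_seconds(size_t sample_count, long rate)
{
    return rate > 0 ? (double)sample_count / (double)rate : 0.0;
}
```

For example, 88200 interleaved stereo values are 44100 samples, which is one second of audio at a 44100 Hz sampling rate.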
cannam@86 101
cannam@86 102 <h3>quality and VBR modes</h3>
cannam@86 103
cannam@86 104 Vorbis is natively a VBR codec; a user requests a given constant
cannam@86 105 <em>quality</em> and the encoder keeps the encoding quality constant
cannam@86 106 while allowing the bitrate to vary. 'Quality' modes (Variable BitRate)
cannam@86 107 will always produce the most consistent encoding results as well as
cannam@86 108 the highest quality for the number of bits used.
cannam@86 109
cannam@86 110 <p>
cannam@86 111 <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
cannam@86 112 <tr bgcolor=#cccccc>
cannam@86 113 <td><b>parameter</b></td>
cannam@86 114 <td><b>description</b></td>
cannam@86 115 </tr>
cannam@86 116 <tr valign=top>
cannam@86 117 <td>quality</td>
cannam@86 118 <td>
cannam@86 119 A floating-point value requesting a desired quality. Libvorbisenc 1.1 allows quality requests in the range of -0.1 (lowest quality, smallest files) through +1.0 (highest quality, largest files). Quality -0.1 is intended as an ultra-low setting in which low bitrate is much more important than quality consistency. Quality settings 0.0 and above are intended to produce consistent results at all times.
cannam@86 120
cannam@86 121 </td>
cannam@86 122 </tr>
cannam@86 123 </table>
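<p>An application that accepts a quality value from a user may want to keep it inside the documented range before handing it to the setup call. The helper below is a hypothetical sketch of that check; it is not part of libvorbisenc.

```c
/* Hypothetical helper: clamps a requested quality to the documented
   range of -0.1 (lowest) through 1.0 (highest). A NaN input is
   returned unchanged, so callers should validate input separately. */
static float clamp_vorbis_quality(float q)
{
    if (q < -0.1f)
        return -0.1f;
    if (q > 1.0f)
        return 1.0f;
    return q;
}
```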
cannam@86 124
cannam@86 125 <a name="BBR"></a>
cannam@86 126 <h3>managed bitrate modes</h3>
cannam@86 127
cannam@86 128 Although the Vorbis codec is natively VBR, libvorbis includes
cannam@86 129 infrastructure for 'managing' the bitrate of streams by setting
cannam@86 130 minimum and maximum usage constraints, as well as functionality for
cannam@86 131 nudging a stream toward a desired average value. These features
cannam@86 132 should <em>only</em> be used when there is a requirement to limit
cannam@86 133 bitrate in some way. Although the difference is usually slight,
cannam@86 134 managed bitrate modes will always produce output inferior to VBR
cannam@86 135 (given equal bitrate usage). Setting overly or impossibly tight
cannam@86 136 bitrate management requirements can affect output quality dramatically
cannam@86 137 for the worse.<p>
cannam@86 138
cannam@86 139 Beginning in libvorbis 1.1, bitrate management is implemented using a
cannam@86 140 <em>bit-reservoir</em> algorithm. The encoder has a fixed-size
cannam@86 141 reservoir used as a 'savings account' in encoding. When a frame is
cannam@86 142 smaller than the target rate, the unused bits go into the reservoir so
cannam@86 143 that they may be used by future frames. When a frame is larger than
cannam@86 144 the target rate, it draws 'banked' bits out of the reservoir. Encoding
cannam@86 145 is managed so that the reservoir never goes negative (when a maximum
cannam@86 146 bitrate is specified) or fills beyond a fixed limit (when a minimum
cannam@86 147 bitrate is specified). An 'average bitrate' request is used as the
cannam@86 148 set-point in a long-range bitrate tracker which adjusts the encoder's
cannam@86 149 aggressiveness up or down depending on whether frames are coming
cannam@86 150 in larger or smaller than the requested average point.
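<p>The 'savings account' behavior can be illustrated with a small simulation. This is a simplified model written for this overview, not the libvorbis implementation: it assumes a single per-frame target (as if minimum and maximum bitrate were equal), and the type and function names are hypothetical.

```c
/* Simplified model of a bit reservoir; not the libvorbis data structure. */
typedef struct {
    long capacity; /* reservoir size, in bits */
    long fill;     /* banked bits currently available */
} bit_reservoir;

/* Accounts for one frame. `wanted` is the size the encoder would like
   to use and `target` is the per-frame budget, both in bits. Returns
   the frame size the reservoir permits and updates the fill level. */
static long reservoir_account(bit_reservoir *r, long wanted, long target)
{
    /* A small frame deposits its unused bits; a large one withdraws. */
    long fill = r->fill + (target - wanted);
    long granted = wanted;

    if (fill < 0) {
        /* Overdrawn: the frame must shrink by the shortfall. */
        granted += fill;
        fill = 0;
    } else if (fill > r->capacity) {
        /* Overflowing: the frame must grow to absorb the excess. */
        granted += fill - r->capacity;
        fill = r->capacity;
    }

    r->fill = fill;
    return granted;
}
```

With a 1000-bit reservoir holding 200 banked bits and a 500-bit target, a frame that wants 900 bits is cut to 700: the target plus everything in the bank.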
cannam@86 151
cannam@86 152 <p>
cannam@86 153 <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
cannam@86 154 <tr bgcolor=#cccccc>
cannam@86 155 <td><b>parameter</b></td>
cannam@86 156 <td><b>description</b></td>
cannam@86 157 </tr>
cannam@86 158 <tr valign=top>
cannam@86 159 <td>maximum bitrate</td> <td> The maximum allowed bitrate, set in bits
cannam@86 160 per second. If the bitrate would otherwise rise such that oversized
cannam@86 161 frames would underflow the bit-reservoir by consuming banked bits,
cannam@86 162 bitrate management will force the encoder to use fewer bits per frame
cannam@86 163 by encoding with a more aggressive psychoacoustic model.<p> This
cannam@86 164 setting is a hard limit; the bitstream will never be allowed, under
cannam@86 165 any circumstances, to increase above the specified bitrate over the
cannam@86 166 average period set by the reservoir; it may momentarily rise above it if
cannam@86 167 inspected at a granularity much finer than the average period across
cannam@86 168 the reservoir. Normally, the encoder will conserve bits gracefully by
cannam@86 169 using more aggressive psychoacoustics to shrink a frame when forced
cannam@86 170 to. However, if the encoder runs out of means of gracefully shrinking
cannam@86 171 a frame, it will simply take the smallest frame it can otherwise
cannam@86 172 generate and truncate it to the maximum allowed length. Note that
cannam@86 173 this is not an error and although it will obviously adversely affect
cannam@86 174 audio quality, a Vorbis decoder will be able to decode a truncated
cannam@86 175 frame into audio.
cannam@86 176
cannam@86 177 </td>
cannam@86 178 </tr>
cannam@86 179
cannam@86 180 <tr valign=top>
cannam@86 181 <td>average bitrate</td>
cannam@86 182
cannam@86 183 <td>
cannam@86 184
cannam@86 185 The average desired bitrate of a stream, set
cannam@86 186 in bits per second. Average bitrate is tracked via a reservoir, like
cannam@86 187 minimum and maximum bitrate; however, the averaging reservoir does not
cannam@86 188 impose a hard limit; it is used to nudge the bitrate toward the
cannam@86 189 desired average by slowly adjusting the psychoacoustic aggressiveness.
cannam@86 190 As such, the reservoir size does not affect the average bitrate
cannam@86 191 behavior. Because this setting alone is not used to impose hard
cannam@86 192 bitrate limits, the bitrate of a stream produced using only the
cannam@86 193 <tt>average bitrate</tt> constraint will track the average over time
cannam@86 194 but not necessarily adhere strictly to that average for any given
cannam@86 195 period. Should a strict localized average be required, <tt>average
cannam@86 196 bitrate</tt> should be used along with <tt>minimum bitrate</tt> and
cannam@86 197 <tt>maximum bitrate</tt>.
cannam@86 198 </td>
cannam@86 199
cannam@86 200 </tr>
cannam@86 201
cannam@86 202 <tr valign=top>
cannam@86 203 <td>minimum bitrate</td>
cannam@86 204 <td>
cannam@86 205 The minimum allowed bitrate, set in bits per second. If
cannam@86 206 the bitrate would otherwise fall such that undersized frames would
cannam@86 207 overflow the bit-reservoir with unused bits, bitrate management will
cannam@86 208 force the encoder to use more bits per frame by encoding with a less
cannam@86 209 aggressive psychoacoustic model.<p> This setting is a hard limit; the
cannam@86 210 bitstream will never be allowed, under any circumstances, to drop
cannam@86 211 below the specified bitrate over the average period set by the
cannam@86 212 reservoir; it may momentarily fall below it if inspected at a granularity
cannam@86 213 much finer than the average period across the reservoir. Normally,
cannam@86 214 the encoder will fill out undersized frames with additional useful
cannam@86 215 coding information by increasing the perceived quality of the stream.
cannam@86 216 If the encoder runs out of useful ways to consume more bits, it will
cannam@86 217 pad frames out with zeroes.
cannam@86 218 </td>
cannam@86 219 </tr>
cannam@86 220
cannam@86 221 <tr valign=top>
cannam@86 222 <td>reservoir size</td> <td> The size of the minimum/maximum bitrate
cannam@86 223 tracking reservoir, set in bits. The reservoir is used as a 'bit
cannam@86 224 bank' to average out localized surges and dips in bitrate while
cannam@86 225 providing predictable, guaranteed buffering behavior for streams to be
cannam@86 226 used in situations with constrained transport bandwidth. The default
cannam@86 227 setting is two seconds of average bitrate.<p>
cannam@86 228
cannam@86 229 When a single frame is larger than the maximum allowed overall
cannam@86 230 bitrate, the bits are 'borrowed' from the bitrate reservoir; if the
cannam@86 231 reservoir contains insufficient bits to cover the deficit, the encoder
cannam@86 232 must find some way to reduce the frame size. <p>
cannam@86 233
cannam@86 234 When a frame is under the minimum limit, the surplus bits are placed
cannam@86 235 into the reservoir, banking them for future use. If the reservoir is
cannam@86 236 already full of banked bits, the encoder is forced to find some way to
cannam@86 237 make the frame larger.<p>
cannam@86 238
cannam@86 239 If the frame size is between the minimum and maximum rates (thus
cannam@86 240 implying the minimum and maximum allowed rates are different), the
cannam@86 241 reservoir gravitates toward a fill point configured by the
cannam@86 242 <tt>reservoir bias</tt> setting described next. If the reservoir is
cannam@86 243 fuller than the fill point (a 'surplus of surplus'), the encoder will
cannam@86 244 consume a number of bits from the reservoir equal to the number of
cannam@86 245 bits by which the frame exceeds the minimum size. If the reservoir is
cannam@86 246 emptier than the fill point (a 'surplus of deficit'), bits are returned
cannam@86 247 to the reservoir equaling the current frame's number of bits under the
cannam@86 248 maximum frame size. The idea of the fill point is to buffer against
cannam@86 249 both underruns and overruns, by trying to hold the reservoir to a
cannam@86 250 middle course.
cannam@86 251 </td>
cannam@86 252 </tr>
cannam@86 253
cannam@86 254 <tr valign=top>
cannam@86 255 <td>reservoir bias</td>
cannam@86 256
cannam@86 257 <td>
cannam@86 258
cannam@86 259 Reservoir bias is a setting between 0.0 and 1.0 that biases bitrate
cannam@86 260 management toward smoothing bitrate spikes (0.0) or bitrate drops
cannam@86 261 (1.0); the default setting is 0.1.<p>
cannam@86 262
cannam@86 263 Using settings toward 0.0 causes the bitrate manager to hoard bits in
cannam@86 264 the bit reservoir such that there is a large pool of banked surplus to
cannam@86 265 draw upon during short spikes in bitrate. As a result, the encoder
cannam@86 266 will react less aggressively and less drastically to curtail frame size
cannam@86 267 during brief surges in bitrate.<p>
cannam@86 268
cannam@86 269 Using settings toward 1.0 causes the bitrate manager to empty the bit
cannam@86 270 reservoir such that there is a large buffer available to store surplus
cannam@86 271 bits during sudden drops in bitrate. As a result, the encoder will
cannam@86 272 react less aggressively and less drastically to support minimum frame
cannam@86 273 sizes during drops in bitrate and will tend not to store any extra
cannam@86 274 bits in the reservoir for future bitrate spikes.<p>
cannam@86 275
cannam@86 276 </td>
cannam@86 277 </tr>
cannam@86 278
cannam@86 279 <tr valign=top>
cannam@86 280 <td>average track damping</td>
cannam@86 281 <td>
cannam@86 282
cannam@86 283 A decimal value, in seconds, that controls how quickly the average
cannam@86 284 bitrate tracker is allowed to slew from enforcing minimum frame sizes
cannam@86 285 to maximum frame sizes and vice versa. The default value is 1.5
cannam@86 286 seconds.<p>
cannam@86 287
cannam@86 288 When the 'average bitrate' setting is in use, the average bitrate
cannam@86 289 tracker uses an unbounded reservoir to track overall bitrate-to-date
cannam@86 290 in the stream. When the bitrate is too low, the tracker will try to
cannam@86 291 nudge it up; when the bitrate is too high, it will try to nudge it down.
cannam@86 292 The damping value regulates the maximum strength of the nudge; it
cannam@86 293 describes, in seconds, how quickly the tracker may transition from an
cannam@86 294 extreme nudge in one direction to an extreme nudge in the other.<p>
cannam@86 295
cannam@86 296 </td>
cannam@86 297 </tr>
cannam@86 298
cannam@86 299 </table>
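<p>The relationship between average bitrate, reservoir size and reservoir bias can be sketched numerically. The first function follows the stated default of two seconds of average bitrate. The second is an assumption: it interpolates linearly between a full bank at bias 0.0 and an empty bank at bias 1.0, which matches the description above but is not taken from the libvorbis source. Both function names are hypothetical.

```c
/* Default reservoir size in bits: two seconds of the average bitrate. */
static long default_reservoir_bits(long average_bps)
{
    return 2 * average_bps;
}

/* Assumed fill point, expressed as banked bits: bias 0.0 hoards a full
   reservoir, bias 1.0 keeps it empty, with linear interpolation in
   between. Out-of-range bias values are clamped to [0.0, 1.0]. */
static long banked_fill_point(long reservoir_bits, double bias)
{
    if (bias < 0.0)
        bias = 0.0;
    if (bias > 1.0)
        bias = 1.0;
    /* Add 0.5 before truncating so the result rounds to nearest. */
    return (long)((1.0 - bias) * (double)reservoir_bits + 0.5);
}
```

Under these assumptions, a 128000 bit-per-second average gives a 256000-bit reservoir, and the default bias of 0.1 aims to keep 230400 of those bits banked.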
cannam@86 300
cannam@86 301 <h3>encoding model adjustments</h3>
cannam@86 302
cannam@86 303 The <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> call provides
cannam@86 304 a generalized interface for making encoding setup adjustments to the
cannam@86 305 basic high-level setup provided by <a
cannam@86 306 href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a
cannam@86 307 href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>.
cannam@86 308 In reality, these two calls use <a
cannam@86 309 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> internally, and <a
cannam@86 310 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can be used to adjust
cannam@86 311 most of the parameters set by other calls.<p>
cannam@86 312
cannam@86 313 In Vorbis 1.1, <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can
cannam@86 314 adjust the following additional parameters not described elsewhere:
cannam@86 315
cannam@86 316 <p>
cannam@86 317 <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
cannam@86 318 <tr bgcolor=#cccccc>
cannam@86 319 <td><b>parameter</b></td>
cannam@86 320 <td><b>description</b></td>
cannam@86 321 </tr>
cannam@86 322 <tr valign=top>
cannam@86 323 <td>management mode</td> <td> Configures whether bitrate
cannam@86 324 management is in use. Normally, this value is set implicitly
cannam@86 325 during encoding setup; however, the supported means of selecting a
cannam@86 326 quality mode by bitrate (that is, requesting a true VBR stream, but
cannam@86 327 doing so by asking for an approximate bitrate) is to use <a
cannam@86 328 href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>
cannam@86 329 and then to explicitly turn off bitrate management by calling <a
cannam@86 330 href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a
cannam@86 331 href="vorbis_encode_ctl.html#OV_ECTL_RATEMANAGE2_SET">OV_ECTL_RATEMANAGE2_SET</a>.
cannam@86 332 </td>
cannam@86 333 </tr>
cannam@86 334
cannam@86 335 <tr valign=top>
cannam@86 336 <td>coupling</td> <td> Stereo encodings (and, in the future, surround
cannam@86 337 encodings) are normally encoded assuming the channels form a stereo
cannam@86 338 image and that lossy-stereo modelling is appropriate; this is called
cannam@86 339 'coupling'. Stereo coupling may be explicitly enabled or disabled.
cannam@86 340 </td>
cannam@86 341 </tr>
cannam@86 342 <tr valign=top>
cannam@86 343 <td>lowpass</td> <td> Sets the hard lowpass of a given encoding mode;
cannam@86 344 this may be used to conserve a few bits in high-rate audio that has
cannam@86 345 limited bandwidth, or in testing of the encoder's acoustic model. The
cannam@86 346 encoder is generally already configured with ideal lowpasses (if any
cannam@86 347 at all) for given modes; use of this parameter is strongly discouraged
cannam@86 348 if the point is to try to 'improve' a given encoding mode for general
cannam@86 349 encoding.
cannam@86 350 </td>
cannam@86 351 </tr>
cannam@86 352
cannam@86 353 <tr valign=top>
cannam@86 354 <td>impulse coding aggressiveness</td> <td>By default, libvorbis
cannam@86 355 attempts to compromise between preventing wide bitrate swings and
cannam@86 356 high-resolution impulse coding (which is required for the crispest
cannam@86 357 possible attacks, but also requires a relatively large momentary
cannam@86 358 bitrate increase). This parameter allows an application to tune the
cannam@86 359 compromise or eliminate it. A value of 0.0 indicates normal behavior
cannam@86 360 while a value of -15.0 requests maximum possible impulse
cannam@86 361 resolution.</td>
cannam@86 362 </tr>
cannam@86 363
cannam@86 364 </table>
cannam@86 365
cannam@86 366
cannam@86 367 <br><br>
cannam@86 368 <hr noshade>
cannam@86 369 <table border=0 width=100%>
cannam@86 370 <tr valign=top>
cannam@86 371 <td><p class=tiny>copyright &copy; 2000-2010 Xiph.Org</p></td>
cannam@86 372 <td align=right><p class=tiny><a href="http://www.xiph.org/ogg/vorbis/index.html">Ogg Vorbis</a></p></td>
cannam@86 373 </tr><tr>
cannam@86 374 <td><p class=tiny>libvorbisenc documentation</p></td>
cannam@86 375 <td align=right><p class=tiny>libvorbisenc version 1.3.2 - 20101101</p></td>
cannam@86 376 </tr>
cannam@86 377 </table>
cannam@86 378
cannam@86 379 </body>
cannam@86 380
cannam@86 381 </html>
cannam@86 382