annotate src/libvorbis-1.3.3/doc/07-floor1.tex @ 36:55ece8862b6d

Merge
author Chris Cannam
date Wed, 11 Mar 2015 13:32:44 +0000
parents 05aa0afa9217
children
rev   line source
Chris@1 1 % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
Chris@1 2 %!TEX root = Vorbis_I_spec.tex
Chris@1 3 % $Id$
Chris@1 4 \section{Floor type 1 setup and decode} \label{vorbis:spec:floor1}
Chris@1 5
Chris@1 6 \subsection{Overview}
Chris@1 7
Chris@1 8 Vorbis floor type one uses a piecewise straight-line representation to
Chris@1 9 encode a spectral envelope curve. The representation plots this curve
Chris@1 10 mechanically on a linear frequency axis and a logarithmic (dB)
Chris@1 11 amplitude axis. The integer plotting algorithm used is similar to
Chris@1 12 Bresenham's algorithm.
Chris@1 13
Chris@1 14
Chris@1 15
Chris@1 16 \subsection{Floor 1 format}
Chris@1 17
Chris@1 18 \subsubsection{model}
Chris@1 19
Chris@1 20 Floor type one represents a spectral curve as a series of
Chris@1 21 line segments. Synthesis constructs a floor curve using iterative
Chris@1 22 prediction in a process roughly equivalent to the following simplified
Chris@1 23 description:
Chris@1 24
Chris@1 25 \begin{itemize}
Chris@1 26 \item the first line segment (base case) is a logical line spanning
Chris@1 27 from x_0,y_0 to x_1,y_1 where in the base case x_0=0 and x_1=[n], the
Chris@1 28 full range of the spectral floor to be computed.
Chris@1 29
Chris@1 30 \item the induction step chooses a point x_new within an existing
Chris@1 31 logical line segment and produces a y_new value at that point computed
Chris@1 32 from the existing line's y value at x_new (as plotted by the line) and
Chris@1 33 a difference value decoded from the bitstream packet.
Chris@1 34
Chris@1 35 \item floor computation produces two new line segments, one running from
Chris@1 36 x_0,y_0 to x_new,y_new and from x_new,y_new to x_1,y_1. This step is
Chris@1 37 performed logically even if y_new represents no change to the
Chris@1 38 amplitude value at x_new so that later refinement is additionally
Chris@1 39 bounded at x_new.
Chris@1 40
Chris@1 41 \item the induction step repeats, using a list of x values specified in
Chris@1 42 the codec setup header at floor 1 initialization time. Computation
Chris@1 43 is completed at the end of the x value list.
Chris@1 44
Chris@1 45 \end{itemize}
Chris@1 46
Chris@1 47
Chris@1 48 Consider the following example, with values chosen for ease of
Chris@1 49 understanding rather than representing typical configuration:
Chris@1 50
Chris@1 51 For the below example, we assume a floor setup with an [n] of 128.
Chris@1 52 The list of selected X values in increasing order is
Chris@1 53 0,16,32,48,64,80,96,112 and 128. In list order, the values interleave
Chris@1 54 as 0, 128, 64, 32, 96, 16, 48, 80 and 112. The corresponding
Chris@1 55 list-order Y values as decoded from an example packet are 110, 20, -5,
Chris@1 56 -45, 0, -25, -10, 30 and -10. We compute the floor in the following
Chris@1 57 way, beginning with the first line:
Chris@1 58
Chris@1 59 \begin{center}
Chris@1 60 \includegraphics[width=8cm]{floor1-1}
Chris@1 61 \captionof{figure}{graph of example floor}
Chris@1 62 \end{center}
Chris@1 63
Chris@1 64 We now draw new logical lines to reflect the correction to new_Y, and
Chris@1 65 iterate for X positions 32 and 96:
Chris@1 66
Chris@1 67 \begin{center}
Chris@1 68 \includegraphics[width=8cm]{floor1-2}
Chris@1 69 \captionof{figure}{graph of example floor}
Chris@1 70 \end{center}
Chris@1 71
Chris@1 72 Although the new Y value at X position 96 is unchanged, it is still
Chris@1 73 used later as an endpoint for further refinement. From here on, the
Chris@1 74 pattern should be clear; we complete the floor computation as follows:
Chris@1 75
Chris@1 76 \begin{center}
Chris@1 77 \includegraphics[width=8cm]{floor1-3}
Chris@1 78 \captionof{figure}{graph of example floor}
Chris@1 79 \end{center}
Chris@1 80
Chris@1 81 \begin{center}
Chris@1 82 \includegraphics[width=8cm]{floor1-4}
Chris@1 83 \captionof{figure}{graph of example floor}
Chris@1 84 \end{center}
Chris@1 85
Chris@1 86 A more efficient algorithm with carefully defined integer rounding
Chris@1 87 behavior is used for actual decode, as described later. The actual
Chris@1 88 algorithm splits Y value computation and line plotting into two steps
Chris@1 89 with modifications to the above algorithm to eliminate noise
Chris@1 90 accumulation through integer roundoff/truncation.
Chris@1 91
Chris@1 92
Chris@1 93
Chris@1 94 \subsubsection{header decode}
Chris@1 95
Chris@1 96 A list of floor X values is stored in the packet header in interleaved
Chris@1 97 format (used in list order during packet decode and synthesis). This
Chris@1 98 list is split into partitions, and each partition is assigned to a
Chris@1 99 partition class. X positions 0 and [n] are implicit and do not belong
Chris@1 100 to an explicit partition or partition class.
Chris@1 101
Chris@1 102 A partition class consists of a representation vector width (the
Chris@1 103 number of Y values which the partition class encodes at once), a
Chris@1 104 'subclass' value representing the number of alternate entropy books
Chris@1 105 the partition class may use in representing Y values, the list of
Chris@1 106 [subclass] books and a master book used to encode which alternate
Chris@1 107 books were chosen for representation in a given packet. The
Chris@1 108 master/subclass mechanism is meant to be used as a flexible
Chris@1 109 representation cascade while still using codebooks only in a scalar
Chris@1 110 context.
Chris@1 111
Chris@1 112 \begin{Verbatim}[commandchars=\\\{\}]
Chris@1 113
Chris@1 114 1) [floor1\_partitions] = read 5 bits as unsigned integer
Chris@1 115 2) [maximum\_class] = -1
Chris@1 116 3) iterate [i] over the range 0 ... [floor1\_partitions]-1 \{
Chris@1 117
Chris@1 118 4) vector [floor1\_partition\_class\_list] element [i] = read 4 bits as unsigned integer
Chris@1 119
Chris@1 120 \}
Chris@1 121
Chris@1 122 5) [maximum\_class] = largest integer scalar value in vector [floor1\_partition\_class\_list]
Chris@1 123 6) iterate [i] over the range 0 ... [maximum\_class] \{
Chris@1 124
Chris@1 125 7) vector [floor1\_class\_dimensions] element [i] = read 3 bits as unsigned integer and add 1
Chris@1 126 8) vector [floor1\_class\_subclasses] element [i] = read 2 bits as unsigned integer
Chris@1 127 9) if ( vector [floor1\_class\_subclasses] element [i] is nonzero ) \{
Chris@1 128
Chris@1 129 10) vector [floor1\_class\_masterbooks] element [i] = read 8 bits as unsigned integer
Chris@1 130
Chris@1 131 \}
Chris@1 132
Chris@1 133 11) iterate [j] over the range 0 ... (2 exponent [floor1\_class\_subclasses] element [i]) - 1 \{
Chris@1 134
Chris@1 135 12) array [floor1\_subclass\_books] element [i],[j] =
Chris@1 136 read 8 bits as unsigned integer and subtract one
Chris@1 137 \}
Chris@1 138 \}
Chris@1 139
Chris@1 140 13) [floor1\_multiplier] = read 2 bits as unsigned integer and add one
Chris@1 141 14) [rangebits] = read 4 bits as unsigned integer
Chris@1 142 15) vector [floor1\_X\_list] element [0] = 0
Chris@1 143 16) vector [floor1\_X\_list] element [1] = 2 exponent [rangebits];
Chris@1 144 17) [floor1\_values] = 2
Chris@1 145 18) iterate [i] over the range 0 ... [floor1\_partitions]-1 \{
Chris@1 146
Chris@1 147 19) [current\_class\_number] = vector [floor1\_partition\_class\_list] element [i]
Chris@1 148 20) iterate [j] over the range 0 ... ([floor1\_class\_dimensions] element [current\_class\_number])-1 \{
Chris@1 149 21) vector [floor1\_X\_list] element ([floor1\_values]) =
Chris@1 150 read [rangebits] bits as unsigned integer
Chris@1 151 22) increment [floor1\_values] by one
Chris@1 152 \}
Chris@1 153 \}
Chris@1 154
Chris@1 155 23) done
Chris@1 156 \end{Verbatim}
Chris@1 157
Chris@1 158 An end-of-packet condition while reading any aspect of a floor 1
Chris@1 159 configuration during setup renders a stream undecodable. In addition,
Chris@1 160 a \varname{[floor1\_class\_masterbooks]} or
Chris@1 161 \varname{[floor1\_subclass\_books]} scalar element greater than the
Chris@1 162 highest numbered codebook configured in this stream is an error
Chris@1 163 condition that renders the stream undecodable. Vector
Chris@1 164 [floor1\_x\_list] is limited to a maximum length of 65 elements; a
Chris@1 165 setup indicating more than 65 total elements (including elements 0 and
Chris@1 166 1 set prior to the read loop) renders the stream undecodable. All
Chris@1 167 vector [floor1\_x\_list] element values must be unique within the
Chris@1 168 vector; a non-unique value renders the stream undecodable.
Chris@1 169
Chris@1 170 \subsubsection{packet decode} \label{vorbis:spec:floor1-decode}
Chris@1 171
Chris@1 172 Packet decode begins by checking the \varname{[nonzero]} flag:
Chris@1 173
Chris@1 174 \begin{Verbatim}[commandchars=\\\{\}]
Chris@1 175 1) [nonzero] = read 1 bit as boolean
Chris@1 176 \end{Verbatim}
Chris@1 177
Chris@1 178 If \varname{[nonzero]} is unset, that indicates this channel contained
Chris@1 179 no audio energy in this frame. Decode immediately returns a status
Chris@1 180 indicating this floor curve (and thus this channel) is unused this
Chris@1 181 frame. (A return status of 'unused' is different from decoding a
Chris@1 182 floor that has all points set to minimum representation amplitude,
Chris@1 183 which happens to be approximately -140dB).
Chris@1 184
Chris@1 185
Chris@1 186 Assuming \varname{[nonzero]} is set, decode proceeds as follows:
Chris@1 187
Chris@1 188 \begin{Verbatim}[commandchars=\\\{\}]
Chris@1 189 1) [range] = vector \{ 256, 128, 86, 64 \} element ([floor1\_multiplier]-1)
Chris@1 190 2) vector [floor1\_Y] element [0] = read \link{vorbis:spec:ilog}{ilog}([range]-1) bits as unsigned integer
Chris@1 191 3) vector [floor1\_Y] element [1] = read \link{vorbis:spec:ilog}{ilog}([range]-1) bits as unsigned integer
Chris@1 192 4) [offset] = 2;
Chris@1 193 5) iterate [i] over the range 0 ... [floor1\_partitions]-1 \{
Chris@1 194
Chris@1 195 6) [class] = vector [floor1\_partition\_class] element [i]
Chris@1 196 7) [cdim] = vector [floor1\_class\_dimensions] element [class]
Chris@1 197 8) [cbits] = vector [floor1\_class\_subclasses] element [class]
Chris@1 198 9) [csub] = (2 exponent [cbits])-1
Chris@1 199 10) [cval] = 0
Chris@1 200 11) if ( [cbits] is greater than zero ) \{
Chris@1 201
Chris@1 202 12) [cval] = read from packet using codebook number
Chris@1 203 (vector [floor1\_class\_masterbooks] element [class]) in scalar context
Chris@1 204 \}
Chris@1 205
Chris@1 206 13) iterate [j] over the range 0 ... [cdim]-1 \{
Chris@1 207
Chris@1 208 14) [book] = array [floor1\_subclass\_books] element [class],([cval] bitwise AND [csub])
Chris@1 209 15) [cval] = [cval] right shifted [cbits] bits
Chris@1 210 16) if ( [book] is not less than zero ) \{
Chris@1 211
Chris@1 212 17) vector [floor1\_Y] element ([j]+[offset]) = read from packet using codebook
Chris@1 213 [book] in scalar context
Chris@1 214
Chris@1 215 \} else [book] is less than zero \{
Chris@1 216
Chris@1 217 18) vector [floor1\_Y] element ([j]+[offset]) = 0
Chris@1 218
Chris@1 219 \}
Chris@1 220 \}
Chris@1 221
Chris@1 222 19) [offset] = [offset] + [cdim]
Chris@1 223
Chris@1 224 \}
Chris@1 225
Chris@1 226 20) done
Chris@1 227 \end{Verbatim}
Chris@1 228
Chris@1 229 An end-of-packet condition during curve decode should be considered a
Chris@1 230 nominal occurrence; if end-of-packet is reached during any read
Chris@1 231 operation above, floor decode is to return 'unused' status as if the
Chris@1 232 \varname{[nonzero]} flag had been unset at the beginning of decode.
Chris@1 233
Chris@1 234
Chris@1 235 Vector \varname{[floor1\_Y]} contains the values from packet decode
Chris@1 236 needed for floor 1 synthesis.
Chris@1 237
Chris@1 238
Chris@1 239
Chris@1 240 \subsubsection{curve computation} \label{vorbis:spec:floor1-synth}
Chris@1 241
Chris@1 242 Curve computation is split into two logical steps; the first step
Chris@1 243 derives final Y amplitude values from the encoded, wrapped difference
Chris@1 244 values taken from the bitstream. The second step plots the curve
Chris@1 245 lines. Also, although zero-difference values are used in the
Chris@1 246 iterative prediction to find final Y values, these points are
Chris@1 247 conditionally skipped during final line computation in step two.
Chris@1 248 Skipping zero-difference values allows a smoother line fit.
Chris@1 249
Chris@1 250 Although some aspects of the below algorithm look like inconsequential
Chris@1 251 optimizations, implementors are warned to follow the details closely.
Chris@1 252 Deviation from implementing a strictly equivalent algorithm can result
Chris@1 253 in serious decoding errors.
Chris@1 254
Chris@1 255 {\em Additional note:} Although \varname{[floor1\_final\_Y]} values in
Chris@1 256 the prediction loop and at the end of step 1 are inherently limited by
Chris@1 257 the prediction algorithm to [0, \varname{[range]}), it is possible to
Chris@1 258 abuse the setup and codebook machinery to produce negative or
Chris@1 259 over-range results. We suggest that decoder implementations guard
Chris@1 260 the values in vector \varname{[floor1\_final\_Y]} by clamping each
Chris@1 261 element to [0, \varname{[range]}) after step 1. Variants of this
Chris@1 262 suggestion are acceptable as valid floor1 setups cannot produce
Chris@1 263 out of range values.
Chris@1 264
Chris@1 265 \begin{description}
Chris@1 266 \item[step 1: amplitude value synthesis]
Chris@1 267
Chris@1 268 Unwrap the always-positive-or-zero values read from the packet into
Chris@1 269 +/- difference values, then apply to line prediction.
Chris@1 270
Chris@1 271 \begin{Verbatim}[commandchars=\\\{\}]
Chris@1 272 1) [range] = vector \{ 256, 128, 86, 64 \} element ([floor1\_multiplier]-1)
Chris@1 273 2) vector [floor1\_step2\_flag] element [0] = set
Chris@1 274 3) vector [floor1\_step2\_flag] element [1] = set
Chris@1 275 4) vector [floor1\_final\_Y] element [0] = vector [floor1\_Y] element [0]
Chris@1 276 5) vector [floor1\_final\_Y] element [1] = vector [floor1\_Y] element [1]
Chris@1 277 6) iterate [i] over the range 2 ... [floor1\_values]-1 \{
Chris@1 278
Chris@1 279 7) [low\_neighbor\_offset] = \link{vorbis:spec:low:neighbor}{low\_neighbor}([floor1\_X\_list],[i])
Chris@1 280 8) [high\_neighbor\_offset] = \link{vorbis:spec:high:neighbor}{high\_neighbor}([floor1\_X\_list],[i])
Chris@1 281
Chris@1 282 9) [predicted] = \link{vorbis:spec:render:point}{render\_point}( vector [floor1\_X\_list] element [low\_neighbor\_offset],
Chris@1 283 vector [floor1\_final\_Y] element [low\_neighbor\_offset],
Chris@1 284 vector [floor1\_X\_list] element [high\_neighbor\_offset],
Chris@1 285 vector [floor1\_final\_Y] element [high\_neighbor\_offset],
Chris@1 286 vector [floor1\_X\_list] element [i] )
Chris@1 287
Chris@1 288 10) [val] = vector [floor1\_Y] element [i]
Chris@1 289 11) [highroom] = [range] - [predicted]
Chris@1 290 12) [lowroom] = [predicted]
Chris@1 291 13) if ( [highroom] is less than [lowroom] ) \{
Chris@1 292
Chris@1 293 14) [room] = [highroom] * 2
Chris@1 294
Chris@1 295 \} else [highroom] is not less than [lowroom] \{
Chris@1 296
Chris@1 297 15) [room] = [lowroom] * 2
Chris@1 298
Chris@1 299 \}
Chris@1 300
Chris@1 301 16) if ( [val] is nonzero ) \{
Chris@1 302
Chris@1 303 17) vector [floor1\_step2\_flag] element [low\_neighbor\_offset] = set
Chris@1 304 18) vector [floor1\_step2\_flag] element [high\_neighbor\_offset] = set
Chris@1 305 19) vector [floor1\_step2\_flag] element [i] = set
Chris@1 306 20) if ( [val] is greater than or equal to [room] ) \{
Chris@1 307
Chris@1 308 21) if ( [highroom] is greater than [lowroom] ) \{
Chris@1 309
Chris@1 310 22) vector [floor1\_final\_Y] element [i] = [val] - [lowroom] + [predicted]
Chris@1 311
Chris@1 312 \} else [highroom] is not greater than [lowroom] \{
Chris@1 313
Chris@1 314 23) vector [floor1\_final\_Y] element [i] = [predicted] - [val] + [highroom] - 1
Chris@1 315
Chris@1 316 \}
Chris@1 317
Chris@1 318 \} else [val] is less than [room] \{
Chris@1 319
Chris@1 320 24) if ([val] is odd) \{
Chris@1 321
Chris@1 322 25) vector [floor1\_final\_Y] element [i] =
Chris@1 323 [predicted] - (([val] + 1) divided by 2 using integer division)
Chris@1 324
Chris@1 325 \} else [val] is even \{
Chris@1 326
Chris@1 327 26) vector [floor1\_final\_Y] element [i] =
Chris@1 328 [predicted] + ([val] / 2 using integer division)
Chris@1 329
Chris@1 330 \}
Chris@1 331
Chris@1 332 \}
Chris@1 333
Chris@1 334 \} else [val] is zero \{
Chris@1 335
Chris@1 336 27) vector [floor1\_step2\_flag] element [i] = unset
Chris@1 337 28) vector [floor1\_final\_Y] element [i] = [predicted]
Chris@1 338
Chris@1 339 \}
Chris@1 340
Chris@1 341 \}
Chris@1 342
Chris@1 343 29) done
Chris@1 344
Chris@1 345 \end{Verbatim}
Chris@1 346
Chris@1 347
Chris@1 348
Chris@1 349 \item[step 2: curve synthesis]
Chris@1 350
Chris@1 351 Curve synthesis generates a return vector \varname{[floor]} of length
Chris@1 352 \varname{[n]} (where \varname{[n]} is provided by the decode process
Chris@1 353 calling to floor decode). Floor 1 curve synthesis makes use of the
Chris@1 354 \varname{[floor1\_X\_list]}, \varname{[floor1\_final\_Y]} and
Chris@1 355 \varname{[floor1\_step2\_flag]} vectors, as well as [floor1\_multiplier]
Chris@1 356 and [floor1\_values] values.
Chris@1 357
Chris@1 358 Decode begins by sorting the scalars from vectors
Chris@1 359 \varname{[floor1\_X\_list]}, \varname{[floor1\_final\_Y]} and
Chris@1 360 \varname{[floor1\_step2\_flag]} together into new vectors
Chris@1 361 \varname{[floor1\_X\_list]'}, \varname{[floor1\_final\_Y]'} and
Chris@1 362 \varname{[floor1\_step2\_flag]'} according to ascending sort order of the
Chris@1 363 values in \varname{[floor1\_X\_list]}. That is, sort the values of
Chris@1 364 \varname{[floor1\_X\_list]} and then apply the same permutation to
Chris@1 365 elements of the other two vectors so that the X, Y and step2\_flag
Chris@1 366 values still match.
Chris@1 367
Chris@1 368 Then compute the final curve in one pass:
Chris@1 369
Chris@1 370 \begin{Verbatim}[commandchars=\\\{\}]
Chris@1 371 1) [hx] = 0
Chris@1 372 2) [lx] = 0
Chris@1 373 3) [ly] = vector [floor1\_final\_Y]' element [0] * [floor1\_multiplier]
Chris@1 374 4) iterate [i] over the range 1 ... [floor1\_values]-1 \{
Chris@1 375
Chris@1 376 5) if ( [floor1\_step2\_flag]' element [i] is set ) \{
Chris@1 377
Chris@1 378 6) [hy] = [floor1\_final\_Y]' element [i] * [floor1\_multiplier]
Chris@1 379 7) [hx] = [floor1\_X\_list]' element [i]
Chris@1 380 8) \link{vorbis:spec:render:line}{render\_line}( [lx], [ly], [hx], [hy], [floor] )
Chris@1 381 9) [lx] = [hx]
Chris@1 382 10) [ly] = [hy]
Chris@1 383 \}
Chris@1 384 \}
Chris@1 385
Chris@1 386 11) if ( [hx] is less than [n] ) \{
Chris@1 387
Chris@1 388 12) \link{vorbis:spec:render:line}{render\_line}( [hx], [hy], [n], [hy], [floor] )
Chris@1 389
Chris@1 390 \}
Chris@1 391
Chris@1 392 13) if ( [hx] is greater than [n] ) \{
Chris@1 393
Chris@1 394 14) truncate vector [floor] to [n] elements
Chris@1 395
Chris@1 396 \}
Chris@1 397
Chris@1 398 15) for each scalar in vector [floor], perform a lookup substitution using
Chris@1 399 the scalar value from [floor] as an offset into the vector \link{vorbis:spec:floor1:inverse:dB:table}{[floor1\_inverse\_dB\_static\_table]}
Chris@1 400
Chris@1 401 16) done
Chris@1 402
Chris@1 403 \end{Verbatim}
Chris@1 404
Chris@1 405 \end{description}