% -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
%!TEX root = Vorbis_I_spec.tex
% $Id$
\section{Floor type 0 setup and decode} \label{vorbis:spec:floor0}

\subsection{Overview}

Vorbis floor type zero uses Line Spectral Pair (LSP, also known as
Line Spectral Frequency or LSF) representation to encode a smooth
spectral envelope curve as the frequency response of the LSP filter.
This representation is equivalent to a traditional all-pole infinite
impulse response filter as would be used in linear predictive coding
(LPC); LSP representation may be converted to LPC representation and
vice versa.

\subsection{Floor 0 format}

Floor zero configuration consists of six integer fields and a list of
VQ codebooks used to code and decode the LSP filter coefficient
values of each frame.

\subsubsection{header decode}

Configuration information for instances of floor zero is decoded from
the codec setup header (third packet). Configuration decode proceeds
as follows:

\begin{Verbatim}[commandchars=\\\{\}]
  1) [floor0\_order] = read an unsigned integer of 8 bits
  2) [floor0\_rate] = read an unsigned integer of 16 bits
  3) [floor0\_bark\_map\_size] = read an unsigned integer of 16 bits
  4) [floor0\_amplitude\_bits] = read an unsigned integer of 6 bits
  5) [floor0\_amplitude\_offset] = read an unsigned integer of 8 bits
  6) [floor0\_number\_of\_books] = read an unsigned integer of 4 bits and add 1
  7) array [floor0\_book\_list] = read a list of [floor0\_number\_of\_books] unsigned integers of 8 bits each
\end{Verbatim}

An end-of-packet condition during any of these bitstream reads renders
this stream undecodable. In addition, any element of the array
\varname{[floor0\_book\_list]} that is greater than the maximum codebook
number for this bitstream is an error condition that also renders the
stream undecodable.
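
The header reads above can be sketched in Python as follows. This is a
minimal illustration under stated assumptions, not a reference
implementation: the \texttt{BitReader} class, the
\texttt{UndecodableStream} exception, the
\texttt{read\_floor0\_header} function and its
\texttt{codebook\_count} parameter are hypothetical names introduced
here. The reader assumes that unsigned integers are packed
least-significant-bit first and that codebooks are numbered from zero.

```python
class UndecodableStream(Exception):
    """Raised when the setup header cannot be decoded."""


class BitReader:
    """Hypothetical LSb-first bit reader over a bytes object."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, in bits

    def read(self, nbits):
        if self.pos + nbits > 8 * len(self.data):
            raise EOFError("end of packet")
        value = 0
        for i in range(nbits):
            byte = self.data[(self.pos + i) >> 3]
            bit = (byte >> ((self.pos + i) & 7)) & 1
            value |= bit << i
        self.pos += nbits
        return value


def read_floor0_header(reader, codebook_count):
    """Read the seven floor0 setup fields in the order listed above."""
    try:
        header = {
            "order": reader.read(8),
            "rate": reader.read(16),
            "bark_map_size": reader.read(16),
            "amplitude_bits": reader.read(6),
            "amplitude_offset": reader.read(8),
            "number_of_books": reader.read(4) + 1,
        }
        header["book_list"] = [
            reader.read(8) for _ in range(header["number_of_books"])
        ]
    except EOFError as exc:
        # End-of-packet inside the setup header is fatal for the stream.
        raise UndecodableStream("end of packet in floor0 header") from exc
    # Assumption: codebooks are numbered 0 .. codebook_count - 1.
    if any(book > codebook_count - 1 for book in header["book_list"]):
        raise UndecodableStream("floor0 book number out of range")
    return header
```

Both failure modes described above map onto the same exception in this
sketch, since the text treats either one as rendering the stream
undecodable.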
\subsubsection{packet decode} \label{vorbis:spec:floor0-decode}

Extracting a floor0 curve from an audio packet consists of first
decoding the curve amplitude and \varname{[floor0\_order]} LSP
coefficient values from the bitstream, and then computing the floor
curve, which is defined as the frequency response of the decoded LSP
filter.

Packet decode proceeds as follows:
\begin{Verbatim}[commandchars=\\\{\}]
  1) [amplitude] = read an unsigned integer of [floor0\_amplitude\_bits] bits
  2) if ( [amplitude] is greater than zero ) \{
       3) [coefficients] is an empty, zero-length vector
       4) [booknumber] = read an unsigned integer of \link{vorbis:spec:ilog}{ilog}( [floor0\_number\_of\_books] ) bits
       5) if ( [booknumber] is greater than the highest numbered decode codebook ) then packet is undecodable
       6) [last] = zero
       7) vector [temp\_vector] = read vector from bitstream using codebook number [floor0\_book\_list] element [booknumber] in VQ context
       8) add the scalar value [last] to each scalar in vector [temp\_vector]
       9) [last] = the value of the last scalar in vector [temp\_vector]
      10) concatenate [temp\_vector] onto the end of the [coefficients] vector
      11) if ( length of vector [coefficients] is less than [floor0\_order] ) continue at step 7

     \}

 12) done.

\end{Verbatim}

Take note of the following properties of decode:
\begin{itemize}
 \item An \varname{[amplitude]} value of zero must result in a return code that indicates this channel is unused in this frame (the output of the channel will be all zeroes in synthesis). Several later stages of decode do not occur for an unused channel.
 \item An end-of-packet condition during decode should be considered a
nominal occurrence; if end-of-packet is reached during any read
operation above, floor decode is to return `unused' status as if the
\varname{[amplitude]} value had read zero at the beginning of decode.

 \item The book number used for decode
can, in fact, be stored in the bitstream in \link{vorbis:spec:ilog}{ilog}( \varname{[floor0\_number\_of\_books]} -
1 ) bits. Nevertheless, the above specification is correct and values
greater than the maximum possible book value are reserved.

 \item The number of scalars read into the vector \varname{[coefficients]}
may be greater than \varname{[floor0\_order]}, the number actually
required for curve computation. For example, if the VQ codebook used
for the floor currently being decoded has a
\varname{[codebook\_dimensions]} value of three and
\varname{[floor0\_order]} is ten, the only way to fill all the needed
scalars in \varname{[coefficients]} is to read a total of twelve
scalars as four vectors of three scalars each. This is not an error
condition, and care must be taken not to allow a buffer overflow in
decode. The extra values are not used and may be ignored or discarded.
\end{itemize}
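
The packet decode steps and the properties listed above can be sketched
in Python as follows. This is only an illustration: \texttt{ilog} is
written from its usual definition (position of the highest set bit),
and \texttt{decode\_floor0\_packet}, \texttt{read\_bits},
\texttt{read\_vector} and the dictionary keys are hypothetical names
introduced here. The sketch assumes that \varname{[last]} carries over
from one decoded vector to the next (the only reading under which steps
8 and 9 have any effect), that step 5 rejects a
\varname{[booknumber]} that indexes past the end of
\varname{[floor0\_book\_list]}, and that surplus scalars are simply
dropped.

```python
def ilog(x):
    """Position of the highest set bit; 0 for non-positive input."""
    return x.bit_length() if x > 0 else 0


def decode_floor0_packet(read_bits, read_vector, header):
    """Return (amplitude, coefficients), or None when the channel is unused.

    read_bits(n) and read_vector(codebook_number) are hypothetical
    callables; both are assumed to raise EOFError at end-of-packet.
    """
    try:
        amplitude = read_bits(header["amplitude_bits"])
        if amplitude == 0:
            return None  # unused channel in this frame
        booknumber = read_bits(ilog(header["number_of_books"]))
        # Assumption: step 5 rejects indices past the end of book_list.
        if booknumber >= header["number_of_books"]:
            raise ValueError("undecodable packet: reserved book number")
        codebook = header["book_list"][booknumber]
        coefficients = []
        last = 0.0
        while len(coefficients) < header["order"]:
            # Offset each scalar by the last value of the previous vector.
            temp_vector = [value + last for value in read_vector(codebook)]
            last = temp_vector[-1]
            coefficients.extend(temp_vector)
    except EOFError:
        return None  # end-of-packet is nominal: report 'unused'
    # Scalars beyond floor0_order are not needed; this sketch drops them.
    return amplitude, coefficients[: header["order"]]
```

Truncating the result to \varname{[floor0\_order]} entries is one way
to honour the buffer-overflow caution; a decoder that writes into a
fixed-size array would instead need to bound the copy.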
\subsubsection{curve computation} \label{vorbis:spec:floor0-synth}

Given an \varname{[amplitude]} integer and \varname{[coefficients]}
vector from packet decode as well as the \varname{[floor0\_order]},
\varname{[floor0\_rate]}, \varname{[floor0\_bark\_map\_size]},
\varname{[floor0\_amplitude\_bits]} and
\varname{[floor0\_amplitude\_offset]} values from floor setup, and an
output vector size \varname{[n]} specified by the decode process, we
compute a floor output vector.

If the value \varname{[amplitude]} is zero, the return value is a
length \varname{[n]} vector with all-zero scalars. Otherwise, begin by
assuming the following definitions for the given vector to be
synthesized:

\begin{displaymath}
  \mathrm{map}_i = \left\{
    \begin{array}{ll}
       \min (
         \mathtt{floor0\texttt{\_}bark\texttt{\_}map\texttt{\_}size} - 1,
         \mathrm{foobar}
       ) & \textrm{for } i \in [0,n-1] \\
       -1 & \textrm{for } i = n
    \end{array}
  \right.
\end{displaymath}

where

\begin{displaymath}
\mathrm{foobar} =
  \left\lfloor
    \mathrm{bark}\left(\frac{\mathtt{floor0\texttt{\_}rate} \cdot i}{2n}\right) \cdot \frac{\mathtt{floor0\texttt{\_}bark\texttt{\_}map\texttt{\_}size}} {\mathrm{bark}(.5 \cdot \mathtt{floor0\texttt{\_}rate})}
  \right\rfloor
\end{displaymath}

and

\begin{displaymath}
  \mathrm{bark}(x) = 13.1 \arctan (.00074x) + 2.24 \arctan (.0000000185x^2 + .0001x)
\end{displaymath}
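
As an illustration, the map definition can be transcribed into Python
as follows. The function names \texttt{bark} and \texttt{floor0\_map}
and their parameter names are hypothetical, and the \texttt{bark}
function simply follows the formula as written above. The returned list
has \varname{[n]} + 1 entries, the last being the $-1$ terminator.

```python
import math


def bark(x):
    """Bark-scale value of a frequency x in Hz, as written in the text."""
    return (13.1 * math.atan(0.00074 * x)
            + 2.24 * math.atan(0.0000000185 * x * x + 0.0001 * x))


def floor0_map(n, rate, bark_map_size):
    """Return the n + 1 map elements for an output vector of size n."""
    scale = bark_map_size / bark(0.5 * rate)
    mapping = []
    for i in range(n):
        # "foobar" in the text: the scaled, floored Bark position of bin i.
        foobar = math.floor(bark(rate * i / (2 * n)) * scale)
        mapping.append(min(bark_map_size - 1, foobar))
    mapping.append(-1)  # map element n is defined as -1
    return mapping
```

Because \texttt{bark} is non-decreasing for non-negative input, the
first \varname{[n]} map elements are non-decreasing, so equal values
always appear in consecutive runs.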

These definitions are used to synthesize the LSP curve on a Bark-scale
frequency axis and then map the result to a linear-scale frequency axis.
Similarly, the calculation below synthesizes the output LSP curve
\varname{[output]} on a log (dB) amplitude scale, mapping it to linear
amplitude in the last step:

\begin{enumerate}
 \item \varname{[i]} = 0
 \item \varname{[$\omega$]} = $\pi$ * map element \varname{[i]} / \varname{[floor0\_bark\_map\_size]}
 \item if ( \varname{[floor0\_order]} is odd ) \{
  \begin{enumerate}
   \item calculate \varname{[p]} and \varname{[q]} according to:
\begin{eqnarray*}
p & = & (1 - \cos^2\omega)\prod_{j=0}^{\frac{\mathtt{floor0\texttt{\_}order}-3}{2}} 4 (\cos([\mathtt{coefficients}]_{2j+1}) - \cos \omega)^2 \\
q & = & \frac{1}{4} \prod_{j=0}^{\frac{\mathtt{floor0\texttt{\_}order}-1}{2}} 4 (\cos([\mathtt{coefficients}]_{2j}) - \cos \omega)^2
\end{eqnarray*}

  \end{enumerate}
  \} else ( \varname{[floor0\_order]} is even ) \{
  \begin{enumerate}[resume]
   \item calculate \varname{[p]} and \varname{[q]} according to:
\begin{eqnarray*}
p & = & \frac{(1 - \cos\omega)}{2} \prod_{j=0}^{\frac{\mathtt{floor0\texttt{\_}order}-2}{2}} 4 (\cos([\mathtt{coefficients}]_{2j+1}) - \cos \omega)^2 \\
q & = & \frac{(1 + \cos\omega)}{2} \prod_{j=0}^{\frac{\mathtt{floor0\texttt{\_}order}-2}{2}} 4 (\cos([\mathtt{coefficients}]_{2j}) - \cos \omega)^2
\end{eqnarray*}

  \end{enumerate}
  \}

 \item calculate \varname{[linear\_floor\_value]} according to:
\begin{displaymath}
\exp \left( .11512925 \left(\frac{\mathtt{amplitude} \cdot \mathtt{floor0\texttt{\_}amplitude\texttt{\_}offset}}{(2^{\mathtt{floor0\texttt{\_}amplitude\texttt{\_}bits}}-1)\sqrt{p+q}}
              - \mathtt{floor0\texttt{\_}amplitude\texttt{\_}offset} \right) \right)
\end{displaymath}

 \item \varname{[iteration\_condition]} = map element \varname{[i]}
 \item \varname{[output]} element \varname{[i]} = \varname{[linear\_floor\_value]}
 \item increment \varname{[i]}
 \item if ( map element \varname{[i]} is equal to \varname{[iteration\_condition]} ) continue at step 5
 \item if ( \varname{[i]} is less than \varname{[n]} ) continue at step 2
 \item done
\end{enumerate}
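
The curve computation steps above can be sketched in Python as
follows. This is a minimal, unoptimized illustration, not a reference
implementation: the function name \texttt{floor0\_curve} and all
parameter names are hypothetical, and the map vector (with its trailing
$-1$ element) is assumed to have been computed beforehand and passed in
as \texttt{map\_}.

```python
import math


def floor0_curve(amplitude, coefficients, map_, n, order,
                 bark_map_size, amplitude_bits, amplitude_offset):
    """Return the length-n floor vector on a linear amplitude scale."""
    if amplitude == 0:
        return [0.0] * n  # unused channel: all-zero output
    output = [0.0] * n
    i = 0
    while i < n:
        cos_w = math.cos(math.pi * map_[i] / bark_map_size)
        if order % 2 == 1:
            p = 1.0 - cos_w * cos_w
            for j in range((order - 3) // 2 + 1):
                p *= 4.0 * (math.cos(coefficients[2 * j + 1]) - cos_w) ** 2
            q = 0.25
            for j in range((order - 1) // 2 + 1):
                q *= 4.0 * (math.cos(coefficients[2 * j]) - cos_w) ** 2
        else:
            p = (1.0 - cos_w) / 2.0
            q = (1.0 + cos_w) / 2.0
            for j in range((order - 2) // 2 + 1):
                p *= 4.0 * (math.cos(coefficients[2 * j + 1]) - cos_w) ** 2
                q *= 4.0 * (math.cos(coefficients[2 * j]) - cos_w) ** 2
        # 0.11512925 is ln(10)/20, converting dB to a linear amplitude.
        linear_floor_value = math.exp(0.11512925 * (
            amplitude * amplitude_offset
            / (((1 << amplitude_bits) - 1) * math.sqrt(p + q))
            - amplitude_offset))
        iteration_condition = map_[i]
        # Reuse the value for every bin that shares this map element;
        # the trailing -1 in map_ ends the final run.
        while True:
            output[i] = linear_floor_value
            i += 1
            if map_[i] != iteration_condition:
                break
    return output
```

The inner loop corresponds to steps 5 through 8: $p$ and $q$ are
evaluated once per distinct map value rather than once per output
element.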