Mercurial > hg > sv-dependency-builds
comparison src/libvorbis-1.3.3/doc/02-bitpacking.tex @ 1:05aa0afa9217
Bring in flac, ogg, vorbis
author | Chris Cannam |
---|---|
date | Tue, 19 Mar 2013 17:37:49 +0000 |
parents | |
children |
comparison
equal
deleted
inserted
replaced
0:c7265573341e | 1:05aa0afa9217 |
---|---|
1 % -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*- | |
2 %!TEX root = Vorbis_I_spec.tex | |
3 % $Id$ | |
4 \section{Bitpacking Convention} \label{vorbis:spec:bitpacking} | |
5 | |
6 \subsection{Overview} | |
7 | |
8 The Vorbis codec uses relatively unstructured raw packets containing | |
9 arbitrary-width binary integer fields. Logically, these packets are a | |
10 bitstream in which bits are coded one-by-one by the encoder and then | |
11 read one-by-one in the same monotonically increasing order by the | |
12 decoder. Most current binary storage arrangements group bits into a | |
13 native word size of eight bits (octets), sixteen bits, thirty-two bits | |
14 or, less commonly other fixed word sizes. The Vorbis bitpacking | |
15 convention specifies the correct mapping of the logical packet | |
16 bitstream into an actual representation in fixed-width words. | |
17 | |
18 | |
19 \subsubsection{octets, bytes and words} | |
20 | |
21 In most contemporary architectures, a 'byte' is synonymous with an | |
22 'octet', that is, eight bits. This has not always been the case; | |
23 seven, ten, eleven and sixteen bit 'bytes' have been used. For | |
24 purposes of the bitpacking convention, a byte implies the native, | |
25 smallest integer storage representation offered by a platform. On | |
26 modern platforms, this is generally assumed to be eight bits (not | |
27 necessarily because of the processor but because of the | |
28 filesystem/memory architecture. Modern filesystems invariably offer | |
29 bytes as the fundamental atom of storage). A 'word' is an integer | |
30 size that is a grouped multiple of this smallest size. | |
31 | |
32 The most ubiquitous architectures today consider a 'byte' to be an | |
33 octet (eight bits) and a word to be a group of two, four or eight | |
34 bytes (16, 32 or 64 bits). Note however that the Vorbis bitpacking | |
35 convention is still well defined for any native byte size; Vorbis uses | |
36 the native bit-width of a given storage system. This document assumes | |
37 that a byte is one octet for purposes of example. | |
38 | |
39 \subsubsection{bit order} | |
40 | |
41 A byte has a well-defined 'least significant' bit (LSb), which is the | |
42 only bit set when the byte is storing the two's complement integer | |
43 value +1. A byte's 'most significant' bit (MSb) is at the opposite | |
44 end of the byte. Bits in a byte are numbered from zero at the LSb to | |
45 $n$ ($n=7$ in an octet) for the | |
46 MSb. | |
47 | |
48 | |
49 | |
50 \subsubsection{byte order} | |
51 | |
52 Words are native groupings of multiple bytes. Several byte orderings | |
53 are possible in a word; the common ones are 3-2-1-0 ('big endian' or | |
54 'most significant byte first' in which the highest-valued byte comes | |
55 first), 0-1-2-3 ('little endian' or 'least significant byte first' in | |
56 which the lowest value byte comes first) and less commonly 3-1-2-0 and | |
57 0-2-1-3 ('mixed endian'). | |
58 | |
59 The Vorbis bitpacking convention specifies storage and bitstream | |
60 manipulation at the byte, not word, level, thus host word ordering is | |
61 of a concern only during optimization when writing high performance | |
62 code that operates on a word of storage at a time rather than by byte. | |
63 Logically, bytes are always coded and decoded in order from byte zero | |
64 through byte $n$. | |
65 | |
66 | |
67 | |
68 \subsubsection{coding bits into byte sequences} | |
69 | |
70 The Vorbis codec has need to code arbitrary bit-width integers, from | |
71 zero to 32 bits wide, into packets. These integer fields are not | |
72 aligned to the boundaries of the byte representation; the next field | |
73 is written at the bit position at which the previous field ends. | |
74 | |
75 The encoder logically packs integers by writing the LSb of a binary | |
76 integer to the logical bitstream first, followed by next least | |
77 significant bit, etc, until the requested number of bits have been | |
78 coded. When packing the bits into bytes, the encoder begins by | |
79 placing the LSb of the integer to be written into the least | |
80 significant unused bit position of the destination byte, followed by | |
81 the next-least significant bit of the source integer and so on up to | |
82 the requested number of bits. When all bits of the destination byte | |
83 have been filled, encoding continues by zeroing all bits of the next | |
84 byte and writing the next bit into the bit position 0 of that byte. | |
85 Decoding follows the same process as encoding, but by reading bits | |
86 from the byte stream and reassembling them into integers. | |
87 | |
88 | |
89 | |
90 \subsubsection{signedness} | |
91 | |
92 The signedness of a specific number resulting from decode is to be | |
93 interpreted by the decoder given decode context. That is, the three | |
94 bit binary pattern 'b111' can be taken to represent either 'seven' as | |
95 an unsigned integer, or '-1' as a signed, two's complement integer. | |
96 The encoder and decoder are responsible for knowing if fields are to | |
97 be treated as signed or unsigned. | |
98 | |
99 | |
100 | |
101 \subsubsection{coding example} | |
102 | |
103 Code the 4 bit integer value '12' [b1100] into an empty bytestream. | |
104 Bytestream result: | |
105 | |
106 \begin{Verbatim}[commandchars=\\\{\}] | |
107 | | |
108 V | |
109 | |
110 7 6 5 4 3 2 1 0 | |
111 byte 0 [0 0 0 0 1 1 0 0] <- | |
112 byte 1 [ ] | |
113 byte 2 [ ] | |
114 byte 3 [ ] | |
115 ... | |
116 byte n [ ] bytestream length == 1 byte | |
117 | |
118 \end{Verbatim} | |
119 | |
120 | |
121 Continue by coding the 3 bit integer value '-1' [b111]: | |
122 | |
123 \begin{Verbatim}[commandchars=\\\{\}] | |
124 | | |
125 V | |
126 | |
127 7 6 5 4 3 2 1 0 | |
128 byte 0 [0 1 1 1 1 1 0 0] <- | |
129 byte 1 [ ] | |
130 byte 2 [ ] | |
131 byte 3 [ ] | |
132 ... | |
133 byte n [ ] bytestream length == 1 byte | |
134 \end{Verbatim} | |
135 | |
136 | |
137 Continue by coding the 7 bit integer value '17' [b0010001]: | |
138 | |
139 \begin{Verbatim}[commandchars=\\\{\}] | |
140 | | |
141 V | |
142 | |
143 7 6 5 4 3 2 1 0 | |
144 byte 0 [1 1 1 1 1 1 0 0] | |
145 byte 1 [0 0 0 0 1 0 0 0] <- | |
146 byte 2 [ ] | |
147 byte 3 [ ] | |
148 ... | |
149 byte n [ ] bytestream length == 2 bytes | |
150 bit cursor == 6 | |
151 \end{Verbatim} | |
152 | |
153 | |
154 Continue by coding the 13 bit integer value '6969' [b110 11001110 01]: | |
155 | |
156 \begin{Verbatim}[commandchars=\\\{\}] | |
157 | | |
158 V | |
159 | |
160 7 6 5 4 3 2 1 0 | |
161 byte 0 [1 1 1 1 1 1 0 0] | |
162 byte 1 [0 1 0 0 1 0 0 0] | |
163 byte 2 [1 1 0 0 1 1 1 0] | |
164 byte 3 [0 0 0 0 0 1 1 0] <- | |
165 ... | |
166 byte n [ ] bytestream length == 4 bytes | |
167 | |
168 \end{Verbatim} | |
169 | |
170 | |
171 | |
172 | |
173 \subsubsection{decoding example} | |
174 | |
175 Reading from the beginning of the bytestream encoded in the above example: | |
176 | |
177 \begin{Verbatim}[commandchars=\\\{\}] | |
178 | | |
179 V | |
180 | |
181 7 6 5 4 3 2 1 0 | |
182 byte 0 [1 1 1 1 1 1 0 0] <- | |
183 byte 1 [0 1 0 0 1 0 0 0] | |
184 byte 2 [1 1 0 0 1 1 1 0] | |
185 byte 3 [0 0 0 0 0 1 1 0] bytestream length == 4 bytes | |
186 | |
187 \end{Verbatim} | |
188 | |
189 | |
190 We read two, two-bit integer fields, resulting in the returned numbers | |
191 'b00' and 'b11'. Two things are worth noting here: | |
192 | |
193 \begin{itemize} | |
194 \item Although these four bits were originally written as a single | |
195 four-bit integer, reading some other combination of bit-widths from the | |
196 bitstream is well defined. There are no artificial alignment | |
197 boundaries maintained in the bitstream. | |
198 | |
199 \item The second value is the | |
200 two-bit-wide integer 'b11'. This value may be interpreted either as | |
201 the unsigned value '3', or the signed value '-1'. Signedness is | |
202 dependent on decode context. | |
203 \end{itemize} | |
204 | |
205 | |
206 | |
207 | |
208 \subsubsection{end-of-packet alignment} | |
209 | |
210 The typical use of bitpacking is to produce many independent | |
211 byte-aligned packets which are embedded into a larger byte-aligned | |
212 container structure, such as an Ogg transport bitstream. Externally, | |
213 each bytestream (encoded bitstream) must begin and end on a byte | |
214 boundary. Often, the encoded bitstream is not an integer number of | |
215 bytes, and so there is unused (uncoded) space in the last byte of a | |
216 packet. | |
217 | |
218 Unused space in the last byte of a bytestream is always zeroed during | |
219 the coding process. Thus, should this unused space be read, it will | |
220 return binary zeroes. | |
221 | |
222 Attempting to read past the end of an encoded packet results in an | |
223 'end-of-packet' condition. End-of-packet is not to be considered an | |
224 error; it is merely a state indicating that there is insufficient | |
225 remaining data to fulfill the desired read size. Vorbis uses truncated | |
226 packets as a normal mode of operation, and as such, decoders must | |
227 handle reading past the end of a packet as a typical mode of | |
228 operation. Any further read operations after an 'end-of-packet' | |
229 condition shall also return 'end-of-packet'. | |
230 | |
231 | |
232 | |
233 \subsubsection{reading zero bits} | |
234 | |
235 Reading a zero-bit-wide integer returns the value '0' and does not | |
236 increment the stream cursor. Reading to the end of the packet (but | |
237 not past, such that an 'end-of-packet' condition has not triggered) | |
238 and then reading a zero bit integer shall succeed, returning 0, and | |
239 not trigger an end-of-packet condition. Reading a zero-bit-wide | |
240 integer after a previous read sets 'end-of-packet' shall also fail | |
241 with 'end-of-packet'. | |
242 | |
243 | |
244 | |
245 | |
246 | |
247 |