sv-dependency-builds: src/libvorbis-1.3.3/doc/02-bitpacking.tex annotate

annotate src/libvorbis-1.3.3/doc/02-bitpacking.tex @ 124:e3d5853d5918

Current stable PortAudio source

author	Chris Cannam <cannam@all-day-breakfast.com>
date	Tue, 18 Oct 2016 13:11:05 +0100
parents	98c1576536ae
children

rev	line source
cannam@86	1 % -- mode: latex; TeX-master: "Vorbis_I_spec"; --
cannam@86	2 %!TEX root = Vorbis_I_spec.tex
cannam@86	3 % $Id$
cannam@86	4 \section{Bitpacking Convention} \label{vorbis:spec:bitpacking}
cannam@86	5
cannam@86	6 \subsection{Overview}
cannam@86	7
cannam@86	8 The Vorbis codec uses relatively unstructured raw packets containing
cannam@86	9 arbitrary-width binary integer fields. Logically, these packets are a
cannam@86	10 bitstream in which bits are coded one-by-one by the encoder and then
cannam@86	11 read one-by-one in the same monotonically increasing order by the
cannam@86	12 decoder. Most current binary storage arrangements group bits into a
cannam@86	13 native word size of eight bits (octets), sixteen bits, thirty-two bits
cannam@86	14 or, less commonly other fixed word sizes. The Vorbis bitpacking
cannam@86	15 convention specifies the correct mapping of the logical packet
cannam@86	16 bitstream into an actual representation in fixed-width words.
cannam@86	17
cannam@86	18
cannam@86	19 \subsubsection{octets, bytes and words}
cannam@86	20
cannam@86	21 In most contemporary architectures, a 'byte' is synonymous with an
cannam@86	22 'octet', that is, eight bits. This has not always been the case;
cannam@86	23 seven, ten, eleven and sixteen bit 'bytes' have been used. For
cannam@86	24 purposes of the bitpacking convention, a byte implies the native,
cannam@86	25 smallest integer storage representation offered by a platform. On
cannam@86	26 modern platforms, this is generally assumed to be eight bits (not
cannam@86	27 necessarily because of the processor but because of the
cannam@86	28 filesystem/memory architecture. Modern filesystems invariably offer
cannam@86	29 bytes as the fundamental atom of storage). A 'word' is an integer
cannam@86	30 size that is a grouped multiple of this smallest size.
cannam@86	31
cannam@86	32 The most ubiquitous architectures today consider a 'byte' to be an
cannam@86	33 octet (eight bits) and a word to be a group of two, four or eight
cannam@86	34 bytes (16, 32 or 64 bits). Note however that the Vorbis bitpacking
cannam@86	35 convention is still well defined for any native byte size; Vorbis uses
cannam@86	36 the native bit-width of a given storage system. This document assumes
cannam@86	37 that a byte is one octet for purposes of example.
cannam@86	38
cannam@86	39 \subsubsection{bit order}
cannam@86	40
cannam@86	41 A byte has a well-defined 'least significant' bit (LSb), which is the
cannam@86	42 only bit set when the byte is storing the two's complement integer
cannam@86	43 value +1. A byte's 'most significant' bit (MSb) is at the opposite
cannam@86	44 end of the byte. Bits in a byte are numbered from zero at the LSb to
cannam@86	45 $n$ ($n=7$ in an octet) for the
cannam@86	46 MSb.
cannam@86	47
cannam@86	48
cannam@86	49
cannam@86	50 \subsubsection{byte order}
cannam@86	51
cannam@86	52 Words are native groupings of multiple bytes. Several byte orderings
cannam@86	53 are possible in a word; the common ones are 3-2-1-0 ('big endian' or
cannam@86	54 'most significant byte first' in which the highest-valued byte comes
cannam@86	55 first), 0-1-2-3 ('little endian' or 'least significant byte first' in
cannam@86	56 which the lowest value byte comes first) and less commonly 3-1-2-0 and
cannam@86	57 0-2-1-3 ('mixed endian').
cannam@86	58
cannam@86	59 The Vorbis bitpacking convention specifies storage and bitstream
cannam@86	60 manipulation at the byte, not word, level, thus host word ordering is
cannam@86	61 of a concern only during optimization when writing high performance
cannam@86	62 code that operates on a word of storage at a time rather than by byte.
cannam@86	63 Logically, bytes are always coded and decoded in order from byte zero
cannam@86	64 through byte $n$.
cannam@86	65
cannam@86	66
cannam@86	67
cannam@86	68 \subsubsection{coding bits into byte sequences}
cannam@86	69
cannam@86	70 The Vorbis codec has need to code arbitrary bit-width integers, from
cannam@86	71 zero to 32 bits wide, into packets. These integer fields are not
cannam@86	72 aligned to the boundaries of the byte representation; the next field
cannam@86	73 is written at the bit position at which the previous field ends.
cannam@86	74
cannam@86	75 The encoder logically packs integers by writing the LSb of a binary
cannam@86	76 integer to the logical bitstream first, followed by next least
cannam@86	77 significant bit, etc, until the requested number of bits have been
cannam@86	78 coded. When packing the bits into bytes, the encoder begins by
cannam@86	79 placing the LSb of the integer to be written into the least
cannam@86	80 significant unused bit position of the destination byte, followed by
cannam@86	81 the next-least significant bit of the source integer and so on up to
cannam@86	82 the requested number of bits. When all bits of the destination byte
cannam@86	83 have been filled, encoding continues by zeroing all bits of the next
cannam@86	84 byte and writing the next bit into the bit position 0 of that byte.
cannam@86	85 Decoding follows the same process as encoding, but by reading bits
cannam@86	86 from the byte stream and reassembling them into integers.
cannam@86	87
cannam@86	88
cannam@86	89
cannam@86	90 \subsubsection{signedness}
cannam@86	91
cannam@86	92 The signedness of a specific number resulting from decode is to be
cannam@86	93 interpreted by the decoder given decode context. That is, the three
cannam@86	94 bit binary pattern 'b111' can be taken to represent either 'seven' as
cannam@86	95 an unsigned integer, or '-1' as a signed, two's complement integer.
cannam@86	96 The encoder and decoder are responsible for knowing if fields are to
cannam@86	97 be treated as signed or unsigned.
cannam@86	98
cannam@86	99
cannam@86	100
cannam@86	101 \subsubsection{coding example}
cannam@86	102
cannam@86	103 Code the 4 bit integer value '12' [b1100] into an empty bytestream.
cannam@86	104 Bytestream result:
cannam@86	105
cannam@86	106 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86	107 \|
cannam@86	108 V
cannam@86	109
cannam@86	110 7 6 5 4 3 2 1 0
cannam@86	111 byte 0 [0 0 0 0 1 1 0 0] <-
cannam@86	112 byte 1 [ ]
cannam@86	113 byte 2 [ ]
cannam@86	114 byte 3 [ ]
cannam@86	115 ...
cannam@86	116 byte n [ ] bytestream length == 1 byte
cannam@86	117
cannam@86	118 \end{Verbatim}
cannam@86	119
cannam@86	120
cannam@86	121 Continue by coding the 3 bit integer value '-1' [b111]:
cannam@86	122
cannam@86	123 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86	124 \|
cannam@86	125 V
cannam@86	126
cannam@86	127 7 6 5 4 3 2 1 0
cannam@86	128 byte 0 [0 1 1 1 1 1 0 0] <-
cannam@86	129 byte 1 [ ]
cannam@86	130 byte 2 [ ]
cannam@86	131 byte 3 [ ]
cannam@86	132 ...
cannam@86	133 byte n [ ] bytestream length == 1 byte
cannam@86	134 \end{Verbatim}
cannam@86	135
cannam@86	136
cannam@86	137 Continue by coding the 7 bit integer value '17' [b0010001]:
cannam@86	138
cannam@86	139 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86	140 \|
cannam@86	141 V
cannam@86	142
cannam@86	143 7 6 5 4 3 2 1 0
cannam@86	144 byte 0 [1 1 1 1 1 1 0 0]
cannam@86	145 byte 1 [0 0 0 0 1 0 0 0] <-
cannam@86	146 byte 2 [ ]
cannam@86	147 byte 3 [ ]
cannam@86	148 ...
cannam@86	149 byte n [ ] bytestream length == 2 bytes
cannam@86	150 bit cursor == 6
cannam@86	151 \end{Verbatim}
cannam@86	152
cannam@86	153
cannam@86	154 Continue by coding the 13 bit integer value '6969' [b110 11001110 01]:
cannam@86	155
cannam@86	156 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86	157 \|
cannam@86	158 V
cannam@86	159
cannam@86	160 7 6 5 4 3 2 1 0
cannam@86	161 byte 0 [1 1 1 1 1 1 0 0]
cannam@86	162 byte 1 [0 1 0 0 1 0 0 0]
cannam@86	163 byte 2 [1 1 0 0 1 1 1 0]
cannam@86	164 byte 3 [0 0 0 0 0 1 1 0] <-
cannam@86	165 ...
cannam@86	166 byte n [ ] bytestream length == 4 bytes
cannam@86	167
cannam@86	168 \end{Verbatim}
cannam@86	169
cannam@86	170
cannam@86	171
cannam@86	172
cannam@86	173 \subsubsection{decoding example}
cannam@86	174
cannam@86	175 Reading from the beginning of the bytestream encoded in the above example:
cannam@86	176
cannam@86	177 \begin{Verbatim}[commandchars=\\\{\}]
cannam@86	178 \|
cannam@86	179 V
cannam@86	180
cannam@86	181 7 6 5 4 3 2 1 0
cannam@86	182 byte 0 [1 1 1 1 1 1 0 0] <-
cannam@86	183 byte 1 [0 1 0 0 1 0 0 0]
cannam@86	184 byte 2 [1 1 0 0 1 1 1 0]
cannam@86	185 byte 3 [0 0 0 0 0 1 1 0] bytestream length == 4 bytes
cannam@86	186
cannam@86	187 \end{Verbatim}
cannam@86	188
cannam@86	189
cannam@86	190 We read two, two-bit integer fields, resulting in the returned numbers
cannam@86	191 'b00' and 'b11'. Two things are worth noting here:
cannam@86	192
cannam@86	193 \begin{itemize}
cannam@86	194 \item Although these four bits were originally written as a single
cannam@86	195 four-bit integer, reading some other combination of bit-widths from the
cannam@86	196 bitstream is well defined. There are no artificial alignment
cannam@86	197 boundaries maintained in the bitstream.
cannam@86	198
cannam@86	199 \item The second value is the
cannam@86	200 two-bit-wide integer 'b11'. This value may be interpreted either as
cannam@86	201 the unsigned value '3', or the signed value '-1'. Signedness is
cannam@86	202 dependent on decode context.
cannam@86	203 \end{itemize}
cannam@86	204
cannam@86	205
cannam@86	206
cannam@86	207
cannam@86	208 \subsubsection{end-of-packet alignment}
cannam@86	209
cannam@86	210 The typical use of bitpacking is to produce many independent
cannam@86	211 byte-aligned packets which are embedded into a larger byte-aligned
cannam@86	212 container structure, such as an Ogg transport bitstream. Externally,
cannam@86	213 each bytestream (encoded bitstream) must begin and end on a byte
cannam@86	214 boundary. Often, the encoded bitstream is not an integer number of
cannam@86	215 bytes, and so there is unused (uncoded) space in the last byte of a
cannam@86	216 packet.
cannam@86	217
cannam@86	218 Unused space in the last byte of a bytestream is always zeroed during
cannam@86	219 the coding process. Thus, should this unused space be read, it will
cannam@86	220 return binary zeroes.
cannam@86	221
cannam@86	222 Attempting to read past the end of an encoded packet results in an
cannam@86	223 'end-of-packet' condition. End-of-packet is not to be considered an
cannam@86	224 error; it is merely a state indicating that there is insufficient
cannam@86	225 remaining data to fulfill the desired read size. Vorbis uses truncated
cannam@86	226 packets as a normal mode of operation, and as such, decoders must
cannam@86	227 handle reading past the end of a packet as a typical mode of
cannam@86	228 operation. Any further read operations after an 'end-of-packet'
cannam@86	229 condition shall also return 'end-of-packet'.
cannam@86	230
cannam@86	231
cannam@86	232
cannam@86	233 \subsubsection{reading zero bits}
cannam@86	234
cannam@86	235 Reading a zero-bit-wide integer returns the value '0' and does not
cannam@86	236 increment the stream cursor. Reading to the end of the packet (but
cannam@86	237 not past, such that an 'end-of-packet' condition has not triggered)
cannam@86	238 and then reading a zero bit integer shall succeed, returning 0, and
cannam@86	239 not trigger an end-of-packet condition. Reading a zero-bit-wide
cannam@86	240 integer after a previous read sets 'end-of-packet' shall also fail
cannam@86	241 with 'end-of-packet'.
cannam@86	242
cannam@86	243
cannam@86	244
cannam@86	245
cannam@86	246
cannam@86	247

Mercurial > hg > sv-dependency-builds

annotate src/libvorbis-1.3.3/doc/02-bitpacking.tex @ 124:e3d5853d5918