% -*- mode: latex; TeX-master: "Vorbis_I_spec"; -*-
%!TEX root = Vorbis_I_spec.tex
% $Id$
\section{Bitpacking Convention} \label{vorbis:spec:bitpacking}

\subsection{Overview}

The Vorbis codec uses relatively unstructured raw packets containing
arbitrary-width binary integer fields.  Logically, these packets are a
bitstream in which bits are coded one-by-one by the encoder and then
read one-by-one in the same monotonically increasing order by the
decoder.  Most current binary storage arrangements group bits into a
native word size of eight bits (octets), sixteen bits, thirty-two bits
or, less commonly, other fixed word sizes.  The Vorbis bitpacking
convention specifies the correct mapping of the logical packet
bitstream into an actual representation in fixed-width words.


\subsubsection{octets, bytes and words}

In most contemporary architectures, a 'byte' is synonymous with an
'octet', that is, eight bits.  This has not always been the case;
seven, ten, eleven and sixteen bit 'bytes' have been used.  For
purposes of the bitpacking convention, a byte implies the native,
smallest integer storage representation offered by a platform.  On
modern platforms, this is generally assumed to be eight bits, not
necessarily because of the processor but because of the
filesystem/memory architecture: modern filesystems invariably offer
bytes as the fundamental atom of storage.  A 'word' is an integer
size that is a grouped multiple of this smallest size.

The most ubiquitous architectures today consider a 'byte' to be an
octet (eight bits) and a word to be a group of two, four or eight
bytes (16, 32 or 64 bits).  Note, however, that the Vorbis bitpacking
convention is still well defined for any native byte size; Vorbis uses
the native bit-width of a given storage system.  This document assumes
that a byte is one octet for purposes of example.

\subsubsection{bit order}

A byte has a well-defined 'least significant' bit (LSb), which is the
only bit set when the byte is storing the two's complement integer
value +1.  A byte's 'most significant' bit (MSb) is at the opposite
end of the byte.  Bits in a byte are numbered from zero at the LSb to
$n$ ($n=7$ in an octet) for the MSb.



\subsubsection{byte order}

Words are native groupings of multiple bytes.  Several byte orderings
are possible in a word; the common ones are 3-2-1-0 ('big endian' or
'most significant byte first' in which the highest-valued byte comes
first), 0-1-2-3 ('little endian' or 'least significant byte first' in
which the lowest-valued byte comes first) and, less commonly, 3-1-2-0
and 0-2-1-3 ('mixed endian').

The Vorbis bitpacking convention specifies storage and bitstream
manipulation at the byte, not word, level.  Host word ordering is
therefore of concern only during optimization, when writing
high-performance code that operates on a word of storage at a time
rather than byte by byte.  Logically, bytes are always coded and
decoded in order from byte zero through byte $n$.
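
As an illustration of why word ordering matters only for such
optimizations, the following Python sketch loads four bytes as one
32-bit word in two different byte orders.  It is not part of the
specification; the helper \texttt{peek} and the sample packet are
hypothetical.  Because byte zero holds the lowest-numbered bits, only
the word assembled with byte zero as the least significant byte agrees
with the byte-wise definition.

```python
import struct

packet = bytes([0x01, 0x00, 0x00, 0x00])   # only logical bit 0 is set

(le_word,) = struct.unpack("<I", packet)   # byte 0 is least significant
(be_word,) = struct.unpack(">I", packet)   # byte 0 is most significant


def peek(word, cursor, width):
    """Extract `width` bits starting at bit index `cursor` of `word`."""
    return (word >> cursor) & ((1 << width) - 1)


bit0_le = peek(le_word, 0, 1)   # 1: agrees with the byte-wise definition
bit0_be = peek(be_word, 0, 1)   # 0: the set bit landed at bit 24 instead
```

A word-at-a-time implementation would therefore have to assemble its
words explicitly in this order rather than rely on the host's native
layout.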



\subsubsection{coding bits into byte sequences}

The Vorbis codec needs to code arbitrary bit-width integers, from
zero to 32 bits wide, into packets.  These integer fields are not
aligned to the boundaries of the byte representation; the next field
is written at the bit position at which the previous field ends.

The encoder logically packs integers by writing the LSb of a binary
integer to the logical bitstream first, followed by the next-least
significant bit, and so on, until the requested number of bits has
been coded.  When packing the bits into bytes, the encoder begins by
placing the LSb of the integer to be written into the least
significant unused bit position of the destination byte, followed by
the next-least significant bit of the source integer and so on up to
the requested number of bits.  When all bits of the destination byte
have been filled, encoding continues by zeroing all bits of the next
byte and writing the next bit into bit position 0 of that byte.
Decoding follows the same process as encoding, but by reading bits
from the byte stream and reassembling them into integers.
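
The packing procedure described above can be sketched in a few lines
of Python.  This is a minimal illustration assuming eight-bit bytes;
the class name \texttt{BitWriter} and its attributes are hypothetical
and do not come from any reference implementation.

```python
class BitWriter:
    """Hypothetical LSb-first bit packer (illustrative names only)."""

    def __init__(self):
        self.buf = bytearray()  # bytes written so far
        self.bit = 0            # next free bit position in the last byte

    def write(self, value, nbits):
        """Append the low `nbits` bits of `value`, least significant first."""
        if not 0 <= nbits <= 32:
            raise ValueError("field width must be 0..32 bits")
        for i in range(nbits):
            if self.bit == 0:
                self.buf.append(0)             # start a new, zeroed byte
            if (value >> i) & 1:
                self.buf[-1] |= 1 << self.bit  # set the next unused bit
            self.bit = (self.bit + 1) % 8


w = BitWriter()
w.write(0b101, 3)   # occupies bits 0..2 of byte 0
w.write(0b11, 2)    # continues at bit 3; no alignment padding is added
packed = bytes(w.buf)
```

After the two writes, byte 0 holds the bit pattern 00011101: the
second field begins directly where the first one ended.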



\subsubsection{signedness}

The signedness of a specific number resulting from decode is to be
interpreted by the decoder given decode context.  For example, the
three-bit binary pattern 'b111' can be taken to represent either
'seven' as an unsigned integer, or '-1' as a signed, two's complement
integer.  The encoder and decoder are responsible for knowing if
fields are to be treated as signed or unsigned.
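
The two readings of the same bit pattern can be illustrated with a
short Python sketch.  The helper \texttt{as\_signed} is a hypothetical
name; it applies the usual two's complement sign-extension identity to
a field that was read as an unsigned value.

```python
def as_signed(raw, width):
    """Reinterpret an unsigned `width`-bit field as two's complement."""
    if width == 0:
        return 0
    sign = 1 << (width - 1)      # weight of the field's top bit
    return (raw ^ sign) - sign   # sign-extension identity


unsigned = 0b111                 # 7 when taken as unsigned
signed = as_signed(0b111, 3)     # -1 when taken as signed
```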



\subsubsection{coding example}

Code the 4 bit integer value '12' [b1100] into an empty bytestream.
Bytestream result:

\begin{Verbatim}[commandchars=\\\{\}]
              |
              V

        7 6 5 4 3 2 1 0
byte 0 [0 0 0 0 1 1 0 0]  <-
byte 1 [               ]
byte 2 [               ]
byte 3 [               ]
             ...
byte n [               ]  bytestream length == 1 byte

\end{Verbatim}


Continue by coding the 3 bit integer value '-1' [b111]:

\begin{Verbatim}[commandchars=\\\{\}]
        |
        V

        7 6 5 4 3 2 1 0
byte 0 [0 1 1 1 1 1 0 0]  <-
byte 1 [               ]
byte 2 [               ]
byte 3 [               ]
             ...
byte n [               ]  bytestream length == 1 byte
\end{Verbatim}


Continue by coding the 7 bit integer value '17' [b0010001]:

\begin{Verbatim}[commandchars=\\\{\}]
          |
          V

        7 6 5 4 3 2 1 0
byte 0 [1 1 1 1 1 1 0 0]
byte 1 [0 0 0 0 1 0 0 0]  <-
byte 2 [               ]
byte 3 [               ]
             ...
byte n [               ]  bytestream length == 2 bytes
                          bit cursor == 6
\end{Verbatim}


Continue by coding the 13 bit integer value '6969' [b110 11001110 01]:

\begin{Verbatim}[commandchars=\\\{\}]
                |
                V

        7 6 5 4 3 2 1 0
byte 0 [1 1 1 1 1 1 0 0]
byte 1 [0 1 0 0 1 0 0 0]
byte 2 [1 1 0 0 1 1 1 0]
byte 3 [0 0 0 0 0 1 1 0]  <-
             ...
byte n [               ]  bytestream length == 4 bytes

\end{Verbatim}
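
The four steps above can be cross-checked with a short Python sketch.
Rather than packing bit by bit, it uses an equivalent view that holds
when bytes are octets: the whole bitstream is treated as one integer
whose bit $i$ is logical bit $i$, and that integer is then stored least
significant byte first.  The variable names are illustrative only.

```python
# (value, width) pairs from the coding example, in coding order
fields = [(12, 4), (-1, 3), (17, 7), (6969, 13)]

stream = 0   # whole bitstream as one integer; logical bit i is bit i
cursor = 0   # number of bits coded so far
for value, width in fields:
    mask = (1 << width) - 1                 # keep only the low `width` bits
    stream |= (value & mask) << cursor      # place the field at the cursor
    cursor += width

nbytes = (cursor + 7) // 8                  # round up to whole bytes
packet = stream.to_bytes(nbytes, "little")  # byte 0 holds bits 0..7

print(packet.hex(), cursor % 8)             # fc48ce06 3
```

The result is the byte sequence FC 48 CE 06 (hexadecimal), matching
bytes 0 through 3 in the final diagram, with the cursor at bit 3 of
byte 3.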




\subsubsection{decoding example}

Reading from the beginning of the bytestream encoded in the above example:

\begin{Verbatim}[commandchars=\\\{\}]
                      |
                      V

        7 6 5 4 3 2 1 0
byte 0 [1 1 1 1 1 1 0 0]  <-
byte 1 [0 1 0 0 1 0 0 0]
byte 2 [1 1 0 0 1 1 1 0]
byte 3 [0 0 0 0 0 1 1 0]  bytestream length == 4 bytes

\end{Verbatim}


We read two two-bit integer fields, resulting in the returned numbers
'b00' and 'b11'.  Two things are worth noting here:

\begin{itemize}
\item Although these four bits were originally written as a single
four-bit integer, reading some other combination of bit-widths from the
bitstream is well defined.  There are no artificial alignment
boundaries maintained in the bitstream.

\item The second value is the
two-bit-wide integer 'b11'.  This value may be interpreted either as
the unsigned value '3', or the signed value '-1'.  Signedness is
dependent on decode context.
\end{itemize}
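
The two reads above can be reproduced with a minimal Python sketch.
The function \texttt{read\_bits} is a hypothetical helper that assumes
eight-bit bytes and, for brevity, performs no end-of-packet checking.

```python
def read_bits(data, cursor, width):
    """Read `width` bits LSb first, starting at bit index `cursor`.

    Returns (value, new_cursor).  No end-of-packet handling here.
    """
    value = 0
    for i in range(width):
        pos = cursor + i
        bit = (data[pos // 8] >> (pos % 8)) & 1
        value |= bit << i
    return value, cursor + width


packet = bytes([0xFC, 0x48, 0xCE, 0x06])   # bytes 0..3 from the example

a, pos = read_bits(packet, 0, 2)     # a == 0b00
b, pos = read_bits(packet, pos, 2)   # b == 0b11

# The same two bits taken as a signed two-bit value: 3 - 4 == -1
b_signed = b - (1 << 2) if b & 0b10 else b
```

Continuing from the same cursor with widths of 3, 7 and 13 bits would
return 7, 17 and 6969, the remaining fields as they were coded.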




\subsubsection{end-of-packet alignment}

The typical use of bitpacking is to produce many independent
byte-aligned packets which are embedded into a larger byte-aligned
container structure, such as an Ogg transport bitstream.  Externally,
each bytestream (encoded bitstream) must begin and end on a byte
boundary.  Often, the encoded bitstream is not an integer number of
bytes, and so there is unused (uncoded) space in the last byte of a
packet.

Unused space in the last byte of a bytestream is always zeroed during
the coding process.  Thus, should this unused space be read, it will
return binary zeroes.

Attempting to read past the end of an encoded packet results in an
'end-of-packet' condition.  End-of-packet is not to be considered an
error; it is merely a state indicating that there is insufficient
remaining data to fulfill the desired read size.  Vorbis uses truncated
packets as a normal mode of operation, and as such, decoders must
handle reading past the end of a packet as a typical mode of
operation.  Any further read operations after an 'end-of-packet'
condition shall also return 'end-of-packet'.



\subsubsection{reading zero bits}

Reading a zero-bit-wide integer returns the value '0' and does not
increment the stream cursor.  Reading to the end of the packet (but
not past, such that an 'end-of-packet' condition has not triggered)
and then reading a zero-bit-wide integer shall succeed, returning 0,
and not trigger an end-of-packet condition.  Reading a zero-bit-wide
integer after a previous read has set 'end-of-packet' shall also fail
with 'end-of-packet'.
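
The end-of-packet and zero-bit rules can be sketched together in
Python.  The class \texttt{BitReader} is hypothetical, and two of its
choices are assumptions of this sketch rather than requirements of the
specification: \texttt{None} stands in for the 'end-of-packet'
condition, and a read that fails leaves the cursor where it was.

```python
class BitReader:
    """Hypothetical packet reader; None stands in for 'end-of-packet'."""

    def __init__(self, packet):
        self.packet = bytes(packet)
        self.cursor = 0      # index of the next unread bit
        self.eop = False     # sticky end-of-packet flag

    def read(self, width):
        if self.eop or self.cursor + width > 8 * len(self.packet):
            self.eop = True  # every later read also reports end-of-packet
            return None
        value = 0
        for i in range(width):
            pos = self.cursor + i
            value |= ((self.packet[pos // 8] >> (pos % 8)) & 1) << i
        self.cursor += width
        return value


r = BitReader(bytes([0x06]))
first = r.read(3)     # 6: the three coded bits
padding = r.read(5)   # 0: unused space in the last byte reads as zeroes
at_end = r.read(0)    # 0: a zero-bit read at the end still succeeds
past = r.read(1)      # None: not enough data left
sticky = r.read(0)    # None: end-of-packet persists, even for zero bits
```

The single sticky flag covers both rules: a zero-bit read never asks
for more data than remains, so it can only fail once an earlier read
has already set the flag.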
Chris@1: