cannam@86: Ogg Vorbis Documentation
cannam@86:
cannam@86: Ogg Vorbis I format specification: comment field and header specification
cannam@86:
cannam@86: Overview
cannam@86:
cannam@86: The Vorbis text comment header is the second (of three) header
cannam@86: packets that begin a Vorbis bitstream. It is meant for short, text
cannam@86: comments, not arbitrary metadata; arbitrary metadata belongs in a
cannam@86: separate logical bitstream (usually an XML stream type) that provides
cannam@86: greater structure and machine parseability.
cannam@86:
cannam@86: The comment field is meant to be used much like someone jotting a
cannam@86: quick note on the bottom of a CD-R. It should be a little information to
cannam@86: remember the disc by and explain it to others; a short, to-the-point
cannam@86: text note that may run longer than a couple of words, but isn't going to be
cannam@86: more than a short paragraph. The essentials, in other words, whatever
cannam@86: they turn out to be, e.g.:
cannam@86:
cannam@86:
cannam@86: "Honest Bob and the Factory-to-Dealer-Incentives, _I'm Still Around_,
cannam@86: opening for Moxy Früvous, 1997"
cannam@86:
cannam@86:
cannam@86: Comment encoding
cannam@86:
cannam@86: Structure
cannam@86:
cannam@86: The comment header is logically a list of eight-bit-clean vectors; the
cannam@86: number of vectors is bounded at 2^32-1 and the length of each vector
cannam@86: is limited to 2^32-1 bytes. The vector length is encoded; the vector
cannam@86: contents themselves are not null terminated. In addition to the vector
cannam@86: list, there is a single vector for the vendor name (also eight-bit-clean,
cannam@86: with its length encoded in 32 bits). For example, the 1.0 release of libvorbis
cannam@86: set the vendor string to "Xiph.Org libVorbis I 20020717".
cannam@86:
cannam@86: The comment header is decoded as follows:
cannam@86:
cannam@86:
cannam@86: 1) [vendor_length] = read an unsigned integer of 32 bits
cannam@86: 2) [vendor_string] = read a UTF-8 vector as [vendor_length] octets
cannam@86: 3) [user_comment_list_length] = read an unsigned integer of 32 bits
cannam@86: 4) iterate [user_comment_list_length] times {
cannam@86:
cannam@86: 5) [length] = read an unsigned integer of 32 bits
cannam@86: 6) this iteration's user comment = read a UTF-8 vector as [length] octets
cannam@86:
cannam@86: }
cannam@86:
cannam@86: 7) [framing_bit] = read a single bit as boolean
cannam@86: 8) if ( [framing_bit] unset or end of packet ) then ERROR
cannam@86: 9) done.
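
The decode steps above can be sketched in Python. This is a minimal illustration, not the reference implementation: the function name `parse_comment_body` is hypothetical, the input is assumed to be a byte string that begins exactly at [vendor_length] (anything that precedes it in the packet is assumed to have been consumed by the caller), and the 32-bit integers are read as little-endian, which is what the LSB-first bit packing yields for octet-aligned fields.

```python
import struct


def parse_comment_body(data: bytes):
    """Decode steps 1-9 from bytes that start at [vendor_length] (assumption)."""
    pos = 0

    def read_u32():
        # Steps 1, 3 and 5: unsigned 32-bit integer, little-endian.
        nonlocal pos
        if pos + 4 > len(data):
            raise ValueError("end of packet while reading a 32-bit length")
        (value,) = struct.unpack_from("<I", data, pos)
        pos += 4
        return value

    def read_vector(length):
        # Steps 2 and 6: a length-prefixed UTF-8 vector, not null terminated.
        nonlocal pos
        if pos + length > len(data):
            raise ValueError("end of packet while reading a vector")
        raw = data[pos:pos + length]
        pos += length
        return raw.decode("utf-8")

    vendor = read_vector(read_u32())
    count = read_u32()
    comments = [read_vector(read_u32()) for _ in range(count)]

    # Steps 7-8: the framing bit is the least significant bit of the next octet.
    if pos >= len(data) or not (data[pos] & 0x01):
        raise ValueError("framing bit unset or end of packet")
    return vendor, comments
```

Checking each length against the remaining data before slicing keeps a corrupt length field from being silently accepted as a truncated string.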
cannam@86:
cannam@86:
cannam@86: Content vector format
cannam@86:
cannam@86: The comment vectors are structured similarly to a UNIX environment variable.
cannam@86: That is, comment fields consist of a field name and a corresponding value and
cannam@86: look like:
cannam@86:
cannam@86:
cannam@86: comment[0]="ARTIST=me";
cannam@86: comment[1]="TITLE=the sound of Vorbis";
cannam@86:
cannam@86:
cannam@86:
cannam@86: - A case-insensitive field name that may consist of ASCII 0x20 through
cannam@86: 0x7D, 0x3D ('=') excluded. ASCII 0x41 through 0x5A inclusive (A-Z) is
cannam@86: to be considered equivalent to ASCII 0x61 through 0x7A inclusive
cannam@86: (a-z).
cannam@86: - The field name is immediately followed by ASCII 0x3D ('=');
cannam@86: this equals sign is used to terminate the field name.
cannam@86: - 0x3D is followed by the 8 bit clean UTF-8 encoded value of the
cannam@86: field contents to the end of the field.
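
As a rough sketch of these three rules, the Python function below splits one already-decoded comment string into a field name and a value. The name `split_comment` is hypothetical, and folding the field name to upper case is just one way of making later comparisons case-insensitive; the text above does not prescribe a canonical case. Whether an empty field name is acceptable is not stated above, so the sketch lets it through.

```python
def split_comment(comment: str):
    """Split 'NAME=value' into (NAME, value), with NAME folded to upper case."""
    # The first '=' terminates the field name; later '=' belong to the value.
    name, sep, value = comment.partition("=")
    if not sep:
        raise ValueError("no '=' terminator after the field name")
    # Field names may use ASCII 0x20 through 0x7D; '=' cannot occur here
    # because the split above already stopped at the first one.
    if any(not (0x20 <= ord(ch) <= 0x7D) for ch in name):
        raise ValueError("field name contains a character outside 0x20-0x7D")
    # A-Z and a-z are equivalent in field names, so fold to one case.
    return name.upper(), value
```

For instance, `split_comment("title=the sound of Vorbis")` yields the name `TITLE` and the value `the sound of Vorbis`.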
cannam@86:
cannam@86:
cannam@86: Field names
cannam@86:
cannam@86: Below is a proposed, minimal list of standard field names with a
cannam@86: description of intended use. No single or group of field names is
cannam@86: mandatory; a comment header may contain one, all or none of the names
cannam@86: in this list.
cannam@86:
cannam@86:
cannam@86:
cannam@86: - TITLE
cannam@86:   Track/Work name
cannam@86:
cannam@86: - VERSION
cannam@86:   The version field may be used to differentiate multiple
cannam@86:   versions of the same track title in a single collection
cannam@86:   (e.g. remix info)
cannam@86:
cannam@86: - ALBUM
cannam@86:   The collection name to which this track belongs
cannam@86:
cannam@86: - TRACKNUMBER
cannam@86:   The track number of this piece if part of a specific larger collection or album
cannam@86:
cannam@86: - ARTIST
cannam@86:   The artist generally considered responsible for the work. In popular music
cannam@86:   this is usually the performing band or singer. For classical music it would be
cannam@86:   the composer. For an audio book it would be the author of the original text.
cannam@86:
cannam@86: - PERFORMER
cannam@86:   The artist(s) who performed the work. In classical music this would be the
cannam@86:   conductor, orchestra, soloists. In an audio book it would be the actor who did
cannam@86:   the reading. In popular music this is typically the same as the ARTIST and
cannam@86:   is omitted.
cannam@86:
cannam@86: - COPYRIGHT
cannam@86:   Copyright attribution, e.g., '2001 Nobody's Band' or '1999 Jack Moffitt'
cannam@86:
cannam@86: - LICENSE
cannam@86:   License information, e.g., 'All Rights Reserved', 'Any
cannam@86:   Use Permitted', a URL to a license such as a Creative Commons license
cannam@86:   ("www.creativecommons.org/blahblah/license.html") or the EFF Open
cannam@86:   Audio License ('distributed under the terms of the Open Audio
cannam@86:   License. see http://www.eff.org/IP/Open_licenses/eff_oal.html for
cannam@86:   details'), etc.
cannam@86:
cannam@86: - ORGANIZATION
cannam@86:   Name of the organization producing the track (i.e.
cannam@86:   the 'record label')
cannam@86:
cannam@86: - DESCRIPTION
cannam@86:   A short text description of the contents
cannam@86:
cannam@86: - GENRE
cannam@86:   A short text indication of music genre
cannam@86:
cannam@86: - DATE
cannam@86:   Date the track was recorded
cannam@86:
cannam@86: - LOCATION
cannam@86:   Location where track was recorded
cannam@86:
cannam@86: - CONTACT
cannam@86:   Contact information for the creators or distributors of the track.
cannam@86:   This could be a URL, an email address, or the physical address of
cannam@86:   the producing label.
cannam@86:
cannam@86: - ISRC
cannam@86:   ISRC number for the track; see the
cannam@86:   ISRC intro page for more information on ISRC numbers.
cannam@86:
cannam@86:
cannam@86:
cannam@86: Implications
cannam@86:
cannam@86:
cannam@86:
cannam@86: Encoding
cannam@86:
cannam@86: The comment header comprises the entirety of the second bitstream
cannam@86: header packet. Unlike the first bitstream header packet, it is
cannam@86: generally not the only packet on the second page, and it need not fit
cannam@86: within the second bitstream page. The length of the comment header
cannam@86: packet is (practically) unbounded. The comment header packet is not
cannam@86: optional; it must be present in the bitstream even if it is
cannam@86: effectively empty.
cannam@86:
cannam@86: The comment header is encoded as follows (as per Ogg's standard
cannam@86: bitstream mapping, which places the least significant bit of the word
cannam@86: being coded into the least significant available bit of the current
cannam@86: bitstream octet first):
cannam@86:
cannam@86:
cannam@86: - Vendor string length (32 bit unsigned quantity specifying number of octets)
cannam@86: - Vendor string ([vendor string length] octets coded from beginning of string
cannam@86: to end of string, not null terminated)
cannam@86: - Number of comment fields (32 bit unsigned quantity specifying number of fields)
cannam@86: - Comment field 0 length (if [Number of comment fields]>0; 32 bit unsigned
cannam@86: quantity specifying number of octets)
cannam@86: - Comment field 0 ([Comment field 0 length] octets coded from beginning of
cannam@86: string to end of string, not null terminated)
cannam@86: - Comment field 1 length (if [Number of comment fields]>1...)...
cannam@86:
cannam@86:
cannam@86: This is actually somewhat easier to describe in code; an implementation of the
cannam@86: above can be found in vorbis/lib/info.c, in the functions
cannam@86: _vorbis_pack_comment() and _vorbis_unpack_comment().
cannam@86:
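
For readers without the library source at hand, the field layout listed above can be sketched in a few lines of Python. This is an illustrative sketch and not a transcription of the libvorbis functions: the name `pack_comment_body` is hypothetical, the output starts at the vendor string length (anything that precedes it in the packet is left to the caller), and the 32-bit quantities are written little-endian, which is what the LSB-first mapping produces for octet-aligned fields. The trailing framing bit is not part of the list above; it is appended here because the decode procedure requires it to be set.

```python
import struct


def pack_comment_body(vendor: str, comments: list) -> bytes:
    """Pack the vendor string and comment fields in the order listed above."""

    def vector(text):
        # 32-bit unsigned octet count, then the octets, not null terminated.
        raw = text.encode("utf-8")
        return struct.pack("<I", len(raw)) + raw

    body = vector(vendor)
    # Number of comment fields as a 32-bit unsigned quantity.
    body += struct.pack("<I", len(comments))
    for comment in comments:
        body += vector(comment)
    # Framing bit (set), stored in the least significant bit of a final octet.
    body += b"\x01"
    return body
```

An effectively empty header is then `pack_comment_body("", [])`, which is nine octets: two zero lengths and the framing octet.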
cannam@86:
cannam@86: The Xiph Fish Logo is a
cannam@86: trademark (™) of Xiph.Org.
cannam@86:
cannam@86: These pages © 1994 - 2005 Xiph.Org. All rights reserved.
cannam@86:
cannam@86:
cannam@86:
cannam@86: