Chris@1: Ogg Vorbis Documentation
Chris@1:
Chris@1: Ogg Vorbis I format specification: comment field and header specification
Chris@1:
Chris@1: Overview
Chris@1:
Chris@1: The Vorbis text comment header is the second (of three) header
Chris@1: packets that begin a Vorbis bitstream. It is meant for short, text
Chris@1: comments, not arbitrary metadata; arbitrary metadata belongs in a
Chris@1: separate logical bitstream (usually an XML stream type) that provides
Chris@1: greater structure and machine parseability.
Chris@1:
Chris@1: The comment field is meant to be used much like someone jotting a
Chris@1: quick note on the bottom of a CD-R. It should be a little information to
Chris@1: remember the disc by and explain it to others; a short, to-the-point
Chris@1: text note that may run longer than a couple of words, but isn't going to be
Chris@1: more than a short paragraph. The essentials, in other words, whatever
Chris@1: they turn out to be, e.g.:
Chris@1:
Chris@1:
Chris@1: "Honest Bob and the Factory-to-Dealer-Incentives, _I'm Still Around_,
Chris@1: opening for Moxy Früvous, 1997"
Chris@1:
Chris@1:
Chris@1: Comment encoding
Chris@1:
Chris@1: Structure
Chris@1:
Chris@1: The comment header is logically a list of eight-bit-clean vectors; the
Chris@1: number of vectors is limited to 2^32-1 and the length of each vector
Chris@1: is limited to 2^32-1 bytes. The vector length is encoded explicitly; the
Chris@1: vector contents themselves are not null-terminated. In addition to the
Chris@1: vector list, there is a single vector for the vendor name (also
Chris@1: eight-bit clean, with its length encoded in 32 bits). For example, the
Chris@1: 1.0 release of libvorbis set the vendor string to
Chris@1: "Xiph.Org libVorbis I 20020717".
Chris@1:
Chris@1: The comment header is decoded as follows:
Chris@1:
Chris@1:
Chris@1: 1) [vendor_length] = read an unsigned integer of 32 bits
Chris@1: 2) [vendor_string] = read a UTF-8 vector as [vendor_length] octets
Chris@1: 3) [user_comment_list_length] = read an unsigned integer of 32 bits
Chris@1: 4) iterate [user_comment_list_length] times {
Chris@1:      5) [length] = read an unsigned integer of 32 bits
Chris@1:      6) this iteration's user comment = read a UTF-8 vector as [length] octets
Chris@1:    }
Chris@1: 7) [framing_bit] = read a single bit as boolean
Chris@1: 8) if ( [framing_bit] unset or end of packet ) then ERROR
Chris@1: 9) done.
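
The decode steps above can be sketched in Python as follows. This is a minimal illustration, not the libvorbis implementation: the function name decode_comment_body is hypothetical, and the input is assumed to begin exactly at [vendor_length], with any bytes that precede it in the packet already removed. Because every field before the framing bit is a whole number of octets, the 32-bit integers are read here as little-endian words and the framing bit is taken as the least significant bit of the octet that follows the last comment.

```python
import struct


def decode_comment_body(data: bytes):
    """Decode a comment header body following steps 1-9.

    Assumption: `data` starts at [vendor_length]. Returns a tuple
    (vendor_string, list_of_comment_strings).
    """
    pos = 0

    def read_u32() -> int:
        # Steps 1, 3 and 5: a 32-bit unsigned integer, least significant octet first.
        nonlocal pos
        if pos + 4 > len(data):
            raise ValueError("end of packet while reading a 32-bit integer")
        (value,) = struct.unpack_from("<I", data, pos)
        pos += 4
        return value

    def read_vector(length: int) -> str:
        # Steps 2 and 6: `length` octets, not null terminated, decoded as UTF-8.
        nonlocal pos
        if pos + length > len(data):
            raise ValueError("end of packet while reading a vector")
        raw = data[pos:pos + length]
        pos += length
        return raw.decode("utf-8")

    vendor_length = read_u32()
    vendor_string = read_vector(vendor_length)

    user_comment_list_length = read_u32()
    comments = []
    for _ in range(user_comment_list_length):
        length = read_u32()
        comments.append(read_vector(length))

    # Steps 7 and 8: the framing bit must be present and set.
    if pos >= len(data) or not (data[pos] & 0x01):
        raise ValueError("framing bit unset or end of packet")

    return vendor_string, comments
```

A truncated packet or a cleared framing bit raises ValueError, which corresponds to the ERROR outcome in step 8.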
Chris@1:
Chris@1:
Chris@1: Content vector format
Chris@1:
Chris@1: The comment vectors are structured similarly to a UNIX environment variable.
Chris@1: That is, comment fields consist of a field name and a corresponding value and
Chris@1: look like:
Chris@1:
Chris@1:
Chris@1: comment[0]="ARTIST=me";
Chris@1: comment[1]="TITLE=the sound of Vorbis";
Chris@1:
Chris@1:
Chris@1:
Chris@1: - A case-insensitive field name that may consist of ASCII 0x20 through
Chris@1: 0x7D, 0x3D ('=') excluded. ASCII 0x41 through 0x5A inclusive (A-Z) is
Chris@1: to be considered equivalent to ASCII 0x61 through 0x7A inclusive
Chris@1: (a-z).
Chris@1: - The field name is immediately followed by ASCII 0x3D ('=');
Chris@1: this equals sign is used to terminate the field name.
Chris@1: - 0x3D is followed by the 8-bit clean, UTF-8 encoded value of the
Chris@1: field contents, which runs to the end of the field.
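
The three rules above can be illustrated with a small Python helper. This is a sketch under stated assumptions: the name split_comment is hypothetical, and returning the field name in upper case is just one convenient way to honour the case-insensitivity rule, not something the format requires.

```python
def split_comment(comment: str):
    """Split one comment vector into (field_name, value).

    Assumption: the field name is normalised to upper case so that
    names differing only in ASCII case compare equal.
    """
    # The first '=' (0x3D) terminates the field name; later ones belong to the value.
    name, separator, value = comment.partition("=")
    if not separator:
        raise ValueError("comment has no '=' terminating the field name")

    # Field names may use ASCII 0x20 through 0x7D; '=' is already excluded
    # because the split happens at its first occurrence.
    for ch in name:
        if not 0x20 <= ord(ch) <= 0x7D:
            raise ValueError("illegal character in field name: %r" % ch)

    # Every character is ASCII at this point, so upper() only maps a-z to A-Z.
    return name.upper(), value
```

For instance, split_comment("artist=me") and split_comment("ARTIST=me") yield the same field name, and any '=' after the first one stays part of the value.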
Chris@1:
Chris@1:
Chris@1: Field names
Chris@1:
Chris@1: Below is a proposed, minimal list of standard field names with a
Chris@1: description of intended use. No single field name or group of field
Chris@1: names is mandatory; a comment header may contain one, all or none of
Chris@1: the names in this list.
Chris@1:
Chris@1:
Chris@1:
Chris@1: - TITLE
Chris@1:     Track/Work name
Chris@1:
Chris@1: - VERSION
Chris@1:     The version field may be used to differentiate multiple
Chris@1:     versions of the same track title in a single collection
Chris@1:     (e.g., remix info).
Chris@1:
Chris@1: - ALBUM
Chris@1:     The collection name to which this track belongs
Chris@1:
Chris@1: - TRACKNUMBER
Chris@1:     The track number of this piece if part of a specific larger collection or album
Chris@1:
Chris@1: - ARTIST
Chris@1:     The artist generally considered responsible for the work. In popular music
Chris@1:     this is usually the performing band or singer. For classical music it would be
Chris@1:     the composer. For an audio book it would be the author of the original text.
Chris@1:
Chris@1: - PERFORMER
Chris@1:     The artist(s) who performed the work. In classical music this would be the
Chris@1:     conductor, orchestra, soloists. In an audio book it would be the actor who did
Chris@1:     the reading. In popular music this is typically the same as the ARTIST and
Chris@1:     is omitted.
Chris@1:
Chris@1: - COPYRIGHT
Chris@1:     Copyright attribution, e.g., '2001 Nobody's Band' or '1999 Jack Moffitt'
Chris@1:
Chris@1: - LICENSE
Chris@1:     License information, e.g., 'All Rights Reserved', 'Any
Chris@1:     Use Permitted', a URL to a license such as a Creative Commons license
Chris@1:     ("www.creativecommons.org/blahblah/license.html") or the EFF Open
Chris@1:     Audio License ('distributed under the terms of the Open Audio
Chris@1:     License. see http://www.eff.org/IP/Open_licenses/eff_oal.html for
Chris@1:     details'), etc.
Chris@1:
Chris@1: - ORGANIZATION
Chris@1:     Name of the organization producing the track (i.e.,
Chris@1:     the 'record label')
Chris@1:
Chris@1: - DESCRIPTION
Chris@1:     A short text description of the contents
Chris@1:
Chris@1: - GENRE
Chris@1:     A short text indication of music genre
Chris@1:
Chris@1: - DATE
Chris@1:     Date the track was recorded
Chris@1:
Chris@1: - LOCATION
Chris@1:     Location where track was recorded
Chris@1:
Chris@1: - CONTACT
Chris@1:     Contact information for the creators or distributors of the track.
Chris@1:     This could be a URL, an email address, or the physical address of
Chris@1:     the producing label.
Chris@1:
Chris@1: - ISRC
Chris@1:     ISRC number for the track; see the
Chris@1:     ISRC intro page for more information on ISRC numbers.
Chris@1:
Chris@1:
Chris@1:
Chris@1: Implications
Chris@1:
Chris@1:
Chris@1:
Chris@1: Encoding
Chris@1:
Chris@1: The comment header comprises the entirety of the second bitstream
Chris@1: header packet. Unlike the first bitstream header packet, it is
Chris@1: generally not the only packet on the second page, and it may extend
Chris@1: beyond the second bitstream page. The length of the comment header
Chris@1: packet is (practically) unbounded. The comment header packet is not
Chris@1: optional; it must be present in the bitstream even if it is
Chris@1: effectively empty.
Chris@1:
Chris@1: The comment header is encoded as follows (as per Ogg's standard
Chris@1: bitstream mapping which renders least-significant-bit of the word to be
Chris@1: coded into the least significant available bit of the current
Chris@1: bitstream octet first):
Chris@1:
Chris@1:
Chris@1: - Vendor string length (32 bit unsigned quantity specifying number of octets)
Chris@1: - Vendor string ([vendor string length] octets coded from beginning of string
Chris@1: to end of string, not null terminated)
Chris@1: - Number of comment fields (32 bit unsigned quantity specifying number of fields)
Chris@1: - Comment field 0 length (if [Number of comment fields]>0; 32 bit unsigned
Chris@1: quantity specifying number of octets)
Chris@1: - Comment field 0 ([Comment field 0 length] octets coded from beginning of
Chris@1: string to end of string, not null terminated)
Chris@1: - Comment field 1 length (if [Number of comment fields]>1...)...
Chris@1: - Framing bit (a single set bit, as checked in steps 7 and 8 of the decode procedure)
Chris@1:
Chris@1:
Chris@1: This is actually somewhat easier to describe in code; an implementation of the
Chris@1: above can be found in vorbis/lib/info.c, in _vorbis_pack_comment() and
Chris@1: _vorbis_unpack_comment().
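
As an illustration of the field layout listed above, here is a minimal Python sketch of the encoding side. It is not a transcription of the libvorbis code: the function name encode_comment_body is hypothetical, and the output covers only the fields from the vendor string length onward. Since each field is a whole number of octets, the least-significant-bit-first mapping amounts to writing each 32-bit quantity as a little-endian word. The final octet carries the framing bit that the decode procedure checks; filling the rest of that octet with zero bits is an assumption of this sketch.

```python
import struct


def encode_comment_body(vendor, comments):
    """Encode a vendor string and a list of comment strings.

    Assumption: only the fields from the vendor string length onward
    are produced; anything that precedes them in the packet is not.
    """

    def vector(text):
        # A 32-bit unsigned octet count followed by the octets themselves,
        # coded from beginning to end of string, not null terminated.
        raw = text.encode("utf-8")
        return struct.pack("<I", len(raw)) + raw

    out = bytearray()
    out += vector(vendor)                       # vendor string length + vendor string
    out += struct.pack("<I", len(comments))     # number of comment fields
    for comment in comments:
        out += vector(comment)                  # comment field N length + comment field N
    out.append(0x01)                            # framing bit set, remaining bits zero
    return bytes(out)
```

An empty comment list is valid input: the result then holds the vendor vector, a zero field count and the framing bit, which matches the requirement that the packet be present even when it is effectively empty.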
Chris@1:
Chris@1:
Chris@1: The Xiph Fish Logo is a
Chris@1: trademark (™) of Xiph.Org.
Chris@1:
Chris@1: These pages © 1994 - 2005 Xiph.Org. All rights reserved.