sv-dependency-builds: src/libvorbis-1.3.3/doc/vorbis-clip.txt annotate

annotate src/libvorbis-1.3.3/doc/vorbis-clip.txt @ 169:223a55898ab9 tip default

Add null config files

author	Chris Cannam <cannam@all-day-breakfast.com>
date	Mon, 02 Mar 2020 14:03:47 +0000
parents	98c1576536ae
children

rev	line source
cannam@86	1 Topic:
cannam@86	2
cannam@86	3 Sample granularity editing of a Vorbis file; inferred arbitrary sample
cannam@86	4 length starting offsets / PCM stream lengths
cannam@86	5
cannam@86	6 Overview:
cannam@86	7
cannam@86	8 Vorbis, like mp3, is a frame-based* audio compression where audio is
cannam@86	9 broken up into discrete short time segments. These segments are
cannam@86	10 'atomic'; that is, one must recover the entire short time segment from
cannam@86	11 the frame packet; there's no way to recover only a part of the PCM time
cannam@86	12 segment from part of the coded packet without expanding the entire
cannam@86	13 packet and then discarding a portion of the resulting PCM audio.
cannam@86	14
cannam@86	15 * In mp3, the data segment representing a given time period is called
cannam@86	16 a 'frame'; the roughly equivalent Vorbis construct is a 'packet'.
cannam@86	17
cannam@86	18 Thus, when we edit a Vorbis stream, the finest physical editing
cannam@86	19 granularity is on these packet boundaries. (The mp3 case is
cannam@86	20 somewhat more complex: mp3 editing is more complicated than just
cannam@86	21 snipping on a frame boundary because time data can be spread
cannam@86	22 backward or forward over frames. In Vorbis, packets are all
cannam@86	23 stand-alone.) Thus, at the physical packet level, Vorbis is still
cannam@86	24 limited to streams that contain an integral number of packets.
cannam@86	25
cannam@86	26 However, Vorbis streams may still exactly represent and be edited to a
cannam@86	27 PCM stream of arbitrary length and starting offset without padding the
cannam@86	28 beginning or end of the decoded stream or requiring that the desired
cannam@86	29 edit points be packet aligned. Vorbis makes use of Ogg stream
cannam@86	30 framing, and this framing provides time-stamping data, called a
cannam@86	31 'granule position'; our starting offset and finished stream length may
cannam@86	32 be inferred from correct usage of the granule position data.
cannam@86	33
cannam@86	34 Time stamping mechanism:
cannam@86	35
cannam@86	36 Vorbis packets are bundled into Ogg pages (note that pages do not
cannam@86	37 necessarily contain integral numbers of packets, but that isn't
cannam@86	38 important in this discussion. More about Ogg framing can be found in
cannam@86	39 ogg/doc/framing.html). Each page that contains a packet boundary is
cannam@86	40 stamped with the absolute sample-granularity offset of the data, that
cannam@86	41 is, 'complete samples-to-date' up to the last completed packet of that
cannam@86	42 page. (The same mechanism is used for, e.g., video, where the number
cannam@86	43 represents complete 2-D frames, and so on.)
cannam@86	44
cannam@86	45 (It's possible but rare for a packet to span more than two pages such
cannam@86	46 that page[s] in the middle have no packet boundary; these pages have
cannam@86	47 a granule position of '-1'.)
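
The stamping rule above can be illustrated with a minimal sketch. It
assumes we already know how many PCM samples each completed packet
contributes (in real Vorbis this depends on the block sizes and the
overlap); the function name and the input layout are hypothetical and
are not part of libogg or libvorbis.

```python
def page_granule_positions(pages):
    """Return the granule position stamped on each page.

    pages: list of lists. Each inner list holds the sample counts of the
    packets that *complete* on that page (empty if no packet ends there).
    This layout is an assumption made for illustration only.
    """
    total = 0          # complete samples-to-date
    stamps = []
    for completed in pages:
        if not completed:
            # No packet boundary on this page: stamp it with -1.
            stamps.append(-1)
        else:
            total += sum(completed)
            stamps.append(total)
    return stamps


# The first audio packet only primes the overlap/add buffer, so it is
# modelled here as contributing 0 samples.
stamps = page_granule_positions([[0, 512], [], [1024, 1024]])
```

In this example the middle page carries only the interior of a long
packet, so it receives -1, and the following page resumes the running
total.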
cannam@86	48
cannam@86	49 This granule position mechanism in Ogg is used by Vorbis to indicate when the
cannam@86	50 PCM data intended to be represented in a Vorbis segment begins a
cannam@86	51 number of samples into the data represented by the first packet[s]
cannam@86	52 and/or ends before the physical PCM data represented in the last
cannam@86	53 packet[s].
cannam@86	54
cannam@86	55 File length a non-integral number of frames:
cannam@86	56
cannam@86	57 A file to be encoded in Vorbis will probably not encode into an
cannam@86	58 integral number of packets; such a file is encoded with the last
cannam@86	59 packet containing 'extra'* samples. These samples are not padding; they
cannam@86	60 will be discarded in decode.
cannam@86	61
cannam@86	62 *(For best results, the encoder should use extra samples that preserve
cannam@86	63 the character of the last frame. Simply setting them to zero will
cannam@86	64 introduce a 'cliff' that's hard to encode, resulting in spread-frame
cannam@86	65 noise. Libvorbis extrapolates the last frame past the end of data to
cannam@86	66 produce the extra samples. Even simply duplicating the last value is
cannam@86	67 better than clamping the signal to zero).
cannam@86	68
cannam@86	69 The encoder indicates to the decoder that the file is actually shorter
cannam@86	70 than all of the samples ('original' + 'extra') by setting the granule
cannam@86	71 position in the last page to a short value, that is, the last
cannam@86	72 timestamp is the original length of the file discarding extra samples.
cannam@86	73 The decoder will see that the number of samples it has decoded in the
cannam@86	74 last page is too many; it is 'original' + 'extra', where the
cannam@86	75 granulepos says that through the last packet we only have 'original'
cannam@86	76 number of samples. The decoder then ignores the 'extra' samples.
cannam@86	77 This behavior is to occur only when the end-of-stream bit is set in
cannam@86	78 the page (indicating last page of the logical stream).
cannam@86	79
cannam@86	80 Note that it is not legal for the granule position of the last page to
cannam@86	81 indicate that there are more samples in the file than actually exist;
cannam@86	82 however, implementations should handle such an illegal file gracefully
cannam@86	83 in the interests of robust programming.
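
As a rough sketch of the end-of-stream rule described in this section,
the decoder-side trimming might look like the following. The function
and parameter names are hypothetical; this is not how libvorbis
structures its decode loop, only an illustration of the arithmetic.

```python
def trim_end(pcm, samples_before_page, granulepos, eos):
    """Drop the 'extra' samples at the end of the last page.

    pcm: samples decoded from the packets completed on this page.
    samples_before_page: stream position before this page's samples.
    granulepos: granule position stamped on this page.
    eos: True if the end-of-stream bit is set on this page.
    """
    decoded_end = samples_before_page + len(pcm)
    # Trim only on the last page of the logical stream, and only when the
    # stamp is short ('original' < 'original' + 'extra').
    if eos and 0 <= granulepos < decoded_end:
        keep = max(granulepos - samples_before_page, 0)
        return pcm[:keep]
    # A stamp claiming more samples than exist is illegal; tolerate it by
    # returning what was actually decoded.
    return pcm


last_page = list(range(10))
kept = trim_end(last_page, 100, 106, eos=True)
```

Here the page decodes to positions 100 through 109, but the granule
position says the stream ends at 106, so only the first six samples of
the page are kept.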
cannam@86	84
cannam@86	85 Beginning point not on integral packet boundary:
cannam@86	86
cannam@86	87 It is possible that we will want the PCM data represented by a Vorbis
cannam@86	88 stream to begin at a position later than where the decoded PCM data
cannam@86	89 really begins after an integral packet boundary, a situation analogous
cannam@86	90 to the above description where the PCM data does not end at an
cannam@86	91 integral packet boundary. The easiest example is taking a clip out of
cannam@86	92 a larger Vorbis stream, and choosing a beginning point of the clip
cannam@86	93 that is not on a packet boundary; we need to ignore a few samples to
cannam@86	94 get the desired beginning point.
cannam@86	95
cannam@86	96 The process of marking the desired beginning point is similar to
cannam@86	97 marking an arbitrary ending point. If the encoder wishes sample zero
cannam@86	98 to be some location past the actual beginning of data, it associates a
cannam@86	99 'short' granule position value with the completion of the second*
cannam@86	100 audio packet. The granule position is associated with the second
cannam@86	101 packet simply by making sure the second packet completes its page.
cannam@86	102
cannam@86	103 *(We associate the short value with the second packet for two reasons.
cannam@86	104 a) The first packet only primes the overlap/add buffer. No data is
cannam@86	105 returned before decoding the second packet; this places the decision
cannam@86	106 information at the point of decision. b) Placing the short value on
cannam@86	107 the first packet would make the value negative (as the first packet
cannam@86	108 normally represents position zero); a negative value would break the
cannam@86	109 requirement that granule positions increase; the headers have
cannam@86	110 position values of zero.)
cannam@86	111
cannam@86	112 The decoder sees that on the first page that will return
cannam@86	113 data from the overlap/add queue, we have more samples than the granule
cannam@86	114 position accounts for, and discards the 'surplus' from the beginning
cannam@86	115 of the queue.
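
The discard step at the beginning of the stream can be sketched the
same way. This assumes the caller has collected the samples returned
for the first page that yields data; the function name is hypothetical.

```python
def trim_start(pcm, granulepos):
    """Discard the 'surplus' samples at the beginning of the stream.

    pcm: samples returned for the first page that yields data.
    granulepos: granule position stamped on that page.
    """
    if granulepos < 0:
        # No usable stamp on this page (assumed not to happen here).
        return pcm
    surplus = len(pcm) - granulepos
    if surplus > 0:
        # Short value: more samples decoded than the stamp accounts for.
        return pcm[surplus:]
    # Exact or long value: nothing to discard.
    return pcm


first_page = list(range(10))
kept = trim_start(first_page, 7)
```

With ten decoded samples and a granule position of 7, the first three
samples are surplus and sample zero of the clip is the fourth decoded
sample.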
cannam@86	116
cannam@86	117 Note that short granule values (indicating less than the actually
cannam@86	118 returned amount of data) are not legal in the Vorbis spec outside of
cannam@86	119 indicating beginning and ending sample positions. However, decoders
cannam@86	120 should, at minimum, tolerate inadvertent short values elsewhere in the
cannam@86	121 stream (just as they should tolerate out-of-order/non-increasing
cannam@86	122 granulepos values, although this too is illegal).
cannam@86	123
cannam@86	124 Beginning point at arbitrary positive timestamp (no 'zero' sample):
cannam@86	125
cannam@86	126 It's also possible that the granule position of the first page of an
cannam@86	127 audio stream is a 'long value', that is, a value larger than the
cannam@86	128 amount of PCM audio decoded. This implies only that we are starting
cannam@86	129 playback at some point into the logical stream, a potentially common
cannam@86	130 occurrence in streaming applications where the decoder may be
cannam@86	131 connecting into a live stream. The decoder should not treat the long
cannam@86	132 value specially.
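
Taken together, the short and long cases on the first data page reduce
to one subtraction. The sketch below is an illustration under that
reading; the function name and the returned pair are hypothetical.

```python
def stream_start(granulepos, decoded_count):
    """Infer where playback starts from the first data page.

    granulepos: granule position on the first page that returns data.
    decoded_count: number of PCM samples decoded through that page.
    Returns (position of the first sample kept, samples to discard).
    """
    offset = granulepos - decoded_count
    if offset < 0:
        # Short value: start at sample zero after discarding the surplus.
        return 0, -offset
    # Exact or long value: keep everything; a long value only means that
    # we joined the logical stream partway through.
    return offset, 0


live = stream_start(48000, 1024)
clip = stream_start(700, 1024)
```

In the 'live' case nothing is discarded and the first decoded sample is
taken to sit at position 46976 in the logical stream.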
cannam@86	133
cannam@86	134 A long value elsewhere in the stream would normally occur only when a
cannam@86	135 page is lost or out of sequence, as indicated by the page's sequence
cannam@86	136 number. A long value under any other situation is not legal; however,
cannam@86	137 a decoder should tolerate both possibilities.
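
One way a tolerant decoder might classify granule positions elsewhere
in the stream is sketched below. The labels and the function are
hypothetical; the point is only that every case is reported and none
aborts decoding.

```python
def classify_page(prev_granulepos, prev_seq, granulepos, seq, samples_on_page):
    """Classify a mid-stream page's granule position.

    prev_granulepos / prev_seq: stamp and sequence number of the previous
    stamped page. samples_on_page: samples completed on the current page.
    """
    if granulepos == -1:
        return "no-boundary"        # no packet completes on this page
    expected = prev_granulepos + samples_on_page
    if granulepos == expected:
        return "ok"
    if granulepos > expected:
        # A long value is expected only when pages were lost or reordered,
        # which shows up as a jump in the page sequence number.
        return "gap" if seq != prev_seq + 1 else "illegal-long"
    return "illegal-short"          # tolerated, but not legal mid-stream


status = classify_page(1000, 4, 4096, 7, 1024)
```

A caller would typically log the illegal cases and continue decoding
rather than reject the stream.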
cannam@86	138
cannam@86	139
