camir-aes2014: toolboxes/mp3readwrite/demo

annotate toolboxes/mp3readwrite/demo_mp3readwrite.m @ 0:e9a9cd732c1e tip

first hg version after svn

author	wolffd
date	Tue, 10 Feb 2015 15:05:51 +0000

rev	line source
wolffd@0	1 %% MP3 reading and writing
wolffd@0	2 %
wolffd@0	3 % These functions, mp3read and mp3write, aim to exactly duplicate
wolffd@0	4 % the operation of wavread and wavwrite for accessing soundfiles,
wolffd@0	5 % except the soundfiles are in MPEG Audio Layer 3 (MP3) compressed
wolffd@0	6 % format. All the hard work is done by external binaries written
wolffd@0	7 % by others: mp3info to query the format of existing mp3 files,
wolffd@0	8 % mpg123 to decode mp3 files, and lame to encode audio files.
wolffd@0	9 % Binaries for these tools are widely available (and may be
wolffd@0	10 % included in this distribution).
wolffd@0	11 %
wolffd@0	12 % These functions were originally developed for access to very
wolffd@0	13 % large mp3 files (i.e. many hours long), and so avoid creating
wolffd@0	14 % the entire uncompressed audio stream if possible. mp3read
wolffd@0	15 % allows you to specify the range of frames you want to read
wolffd@0	16 % (as a second argument), and mp3read will construct an mpg123
wolffd@0	17 % command that skips blocks to decode only the part of the file
wolffd@0	18 % that is required. This can be much quicker (and require less
wolffd@0	19 % memory/temporary disk) than decoding the whole file.
wolffd@0	20 %
wolffd@0	21 % mpg123 also provides for "on the fly" downsampling and conversion
wolffd@0	22 % to mono, both of which are supported as extra options in mp3read.
wolffd@0	23 %
wolffd@0	24 % mpg123 can read MP3s across the network. This is supported
wolffd@0	25 % if the FILE argument is a URL (e.g. beginning 'http://...').
wolffd@0	26 %
wolffd@0	27 % mp3info sometimes gets the file size wrong (as returned by the
wolffd@0	28 % mp3read(...'size') syntax). I'm not sure when this happens
wolffd@0	29 % exactly, but it's probably a result of VBR files. In the worst
wolffd@0	30 % case, figuring the number of samples in such a file requires
wolffd@0	31 % scanning through the whole file, and mp3info doesn't usually do
wolffd@0	32 % this.
wolffd@0	33 %
wolffd@0	34 % For more information, including advice on handling MP4 files,
wolffd@0	35 % see http://labrosa.ee.columbia.edu/matlab/mp3read.html
wolffd@0	36
wolffd@0	37 %% Example usage
wolffd@0	38 % Here, we read a wav file in, then write it out as an MP3, then
wolffd@0	39 % read the resulting MP3 back in, and compare it to the original
wolffd@0	40 % file.
wolffd@0	41
wolffd@0	42 % Read an audio waveform
wolffd@0	43 [d,sr] = wavread('piano.wav');
wolffd@0	44 % Save to mp3 (default settings)
wolffd@0	45 mp3write(d,sr,'piano.mp3');
wolffd@0	46 % Read it back again
wolffd@0	47 [d2,sr] = mp3read('piano.mp3');
wolffd@0	48 % mp3 encoding adds some extra padding at each end; we attempt
wolffd@0	49 % to cut it off at the start, but can't do that at the end, because
wolffd@0	50 % mp3read doesn't know how long the original was. But we do, so
wolffd@0	51 % chop it down to be the same length as the original
wolffd@0	52 d2 = d2(1:length(d),:);
wolffd@0	53 % What is the SNR (distortion)?
wolffd@0	54 ddiff = d - d2;
wolffd@0	55 disp(['SNR is ',num2str(10*log10(sum(d(:).^2)/sum(ddiff(:).^2))),' dB']);
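wolffd@0	56 % Do they look similar? (spectrogram of the original on top;
wolffd@0	57 % original and read-back waveforms, first channel, overlaid below)
wolffd@0	58 subplot(211)
wolffd@0	59 specgram(d(:,1),1024,sr);
wolffd@0	60 subplot(212)
wolffd@0	60 plot(1:5000,d(10000+(1:5000),1),1:5000,d2(10000+(1:5000),1));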
wolffd@0	61 % Yes, pretty close
wolffd@0	62 %
wolffd@0	63 % NB: lame followed by mpg123 causes a little attenuation; you
wolffd@0	64 % can get a better match by scaling up the read-back waveform:
wolffd@0	65 ddiff = d - 1.052*d2;
wolffd@0	66 disp(['SNR is ',num2str(10*log10(sum(d(:).^2)/sum(ddiff(:).^2))),' dB']);
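
% Partial reads and decoder options (a sketch, not part of the original
% demo). It assumes the argument order mp3read(FILE,N,MONO,DOWNSAMP), with
% N = [first last] selecting a sample range as in wavread, N = 0 meaning
% "read the whole file", MONO = 1 forcing mono and DOWNSAMP = 2 halving
% the sample rate; check "help mp3read" in your copy before relying on
% this. The names d3, sr3, d4 and sr4 are illustrative only.
[d3,sr3] = mp3read('piano.mp3',[1 sr]);   % roughly the first second only
[d4,sr4] = mp3read('piano.mp3',0,1,2);    % whole file, mono, half rate
disp(['Partial read: ',num2str(size(d3,1)),' samples at ',num2str(sr3),' Hz']);
disp(['Mono read: ',num2str(size(d4,2)),' channel(s) at ',num2str(sr4),' Hz']);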
wolffd@0	67
wolffd@0	68 %% Delay, size, and alignment
wolffd@0	69 %
wolffd@0	70 % In mid-2006 I noticed that mp3read followed by mp3write followed by
wolffd@0	71 % mp3read effectively delayed the waveform by 2257 samples (at 44
wolffd@0	72 % kHz). So I introduced code to discard the first 2257 samples to ensure
wolffd@0	73 % that the waveforms remained time aligned. As best I could understand,
wolffd@0	74 % mpg123 (v 0.5.9) was including the "warm-up" samples from the
wolffd@0	75 % synthesis filterbank which are more properly discarded.
wolffd@0	76 %
wolffd@0	77 % Then in late 2009 I noticed that some chord recognition code, which
wolffd@0	78 % used mp3read to read files which were then segmented on the basis of
wolffd@0	79 % some hand-marked timings, suddenly started getting much poorer
wolffd@0	80 % results. It turned out that I had upgraded my version of mpg123 to v
wolffd@0	81 % 1.9.0, and the warm-up samples had been fixed in this version. So my
wolffd@0	82 % code was discarding 2257 good samples, and the data was skewed 51 ms
wolffd@0	83 % early relative to the hand labels.
wolffd@0	84 %
wolffd@0	85 % Hence, the current version of mp3read does not
wolffd@0	86 % discard any samples by default, which is appropriate for the recent
wolffd@0	87 % versions of mpg123 included here. But if you know you're running an
wolffd@0	88 % old (v 0.5.9) mpg123, you should edit the mp3read.m source to set
wolffd@0	89 % the flag MPG123059 = 1.
wolffd@0	90 %
wolffd@0	91 % Note also that the 'size' query relies on the number of
wolffd@0	92 % blocks reported by mp3info. However, many mp3 files include
wolffd@0	93 % additional information about the size of the file in the
wolffd@0	94 % so-called Xing header, embedded in the first frame, which can
wolffd@0	95 % specify that a certain number of samples from start and end
wolffd@0	96 % should additionally be dropped. mp3info doesn't read that,
wolffd@0	97 % and there's no way for my code to probe it except by running
wolffd@0	98 % mpg123. Hence, the results of mp3read(fn,'size') may sometimes
wolffd@0	99 % overestimate the length of the actual vector you'll get if
wolffd@0	100 % you read the whole file.
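
% Worked check of the figures above (a sketch, not part of the original
% demo). "44 kHz" is taken to mean 44100 Hz, which is an assumption; the
% names delay_samples, sr_assumed, delay_ms, siz and dfull are illustrative.
delay_samples = 2257;
sr_assumed = 44100;                         % assumed sample rate in Hz
delay_ms = 1000*delay_samples/sr_assumed;   % delay in milliseconds
disp(['2257 samples at 44100 Hz is ',num2str(delay_ms,'%.1f'),' ms']);
% Compare the 'size' estimate with the length actually decoded, using the
% piano.mp3 written earlier in this demo (skipped if it is not present).
if exist('piano.mp3','file')
  siz = mp3read('piano.mp3','size');        % [samples channels], via mp3info
  dfull = mp3read('piano.mp3');             % full decode via mpg123
  disp(['size reports ',num2str(siz(1)),' samples; full read returns ', ...
        num2str(size(dfull,1))]);
end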
wolffd@0	101
wolffd@0	102 %% External binaries
wolffd@0	103 % The m files rely on three external binaries, each of which is
wolffd@0	104 % available for Linux, Mac OS X, and Windows:
wolffd@0	105 %
wolffd@0	106 % mpg123 is a high-performance mp3 decoder. Its home page is
wolffd@0	107 % http://www.mpg123.de/ .
wolffd@0	108 %
wolffd@0	109 % mp3info is a utility to read technical information on an mp3
wolffd@0	110 % file. Its home page is http://www.ibiblio.org/mp3info/ .
wolffd@0	111 %
wolffd@0	112 % lame is an open-source MP3 encoder. Its home page is
wolffd@0	113 % http://lame.sourceforge.net/ .
wolffd@0	114 %
wolffd@0	115 % The various authors of these packages are gratefully acknowledged
wolffd@0	116 % for doing all the hard work to make these Matlab functions possible.
wolffd@0	117
wolffd@0	118 %% Installation
wolffd@0	119 % The two routines, mp3read.m and mp3write.m, will look for their
wolffd@0	120 % binaries (mpg123 and mp3info for mp3read; lame for mp3write) in
wolffd@0	121 % the directory where the m files themselves are installed. Binaries
wolffd@0	122 % for different architectures are distinguished by their extension,
wolffd@0	123 % which is the standard Matlab computer code, e.g. ".mac" for Mac
wolffd@0	124 % PPC OS X, ".glnx86" for i386-linux. The exception is Windows,
wolffd@0	125 % where the binaries have the extension ".exe".
wolffd@0	126 %
wolffd@0	127 % Temporary files will be written to (a) the directory named by the
wolffd@0	128 % environment variable TMPDIR, if it is set, (b) /tmp if it exists,
wolffd@0	129 % or (c) the current directory. This can easily be changed by
wolffd@0	130 % editing the m files.
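
% The lookup rules above can be sketched as follows. This is an
% illustration under stated assumptions, not the code in mp3read.m: it
% assumes the extension is the lower-cased output of Matlab's "computer"
% function, and the names ext, bindir, mpg123_path and tmpdir are
% hypothetical.
ext = lower(computer);                      % e.g. 'glnx86' or 'mac'
if ispc
  ext = 'exe';                              % Windows binaries use ".exe"
end
bindir = fileparts(which('mp3read'));       % directory holding mp3read.m
mpg123_path = fullfile(bindir,['mpg123.',ext]);
tmpdir = getenv('TMPDIR');                  % (a) environment variable
if isempty(tmpdir)
  if exist('/tmp','dir') == 7
    tmpdir = '/tmp';                        % (b) /tmp, if it exists
  else
    tmpdir = pwd;                           % (c) current directory
  end
end
disp(['Decoder expected at ',mpg123_path,'; temporary files in ',tmpdir]);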
wolffd@0	131
wolffd@0	132 % Last updated: $Date: 2009/03/15 18:29:58 $
wolffd@0	133 % Dan Ellis <dpwe@ee.columbia.edu>
