%% MP3 reading and writing
%
% These functions, mp3read and mp3write, aim to exactly duplicate
% the operation of wavread and wavwrite for accessing soundfiles,
% except the soundfiles are in MPEG-Audio Layer 3 (MP3) compressed
% format.  All the hard work is done by external binaries written
% by others: mp3info to query the format of existing mp3 files,
% mpg123 to decode mp3 files, and lame to encode audio files.
% Binaries for these tools are widely available (and may be
% included in this distribution).
%
% These functions were originally developed for access to very
% large mp3 files (i.e. many hours long), and so avoid creating
% the entire uncompressed audio stream if possible.  mp3read
% allows you to specify the range of frames you want to read
% (as a second argument), and mp3read will construct an mpg123
% command that skips blocks to decode only the part of the file
% that is required.  This can be much quicker (and require less
% memory/temporary disk) than decoding the whole file.
%
% mpg123 also provides for "on the fly" downsampling and conversion
% to mono, which are supported as extra options in mp3read.
%
% mpg123 can read MP3s across the network.  This is supported
% if the FILE argument is a URL (e.g. beginning 'http://...').
%
% mp3info sometimes gets the file size wrong (as returned by the
% mp3read(...'size') syntax).  I'm not sure when this happens
% exactly, but it's probably a result of VBR files.  In the worst
% case, figuring the number of samples in such a file requires
% scanning through the whole file, and mp3info doesn't usually do
% this.
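%
% As a rough sketch of the size query and partial read described
% above (shown as comments rather than run, since it needs an
% existing MP3; 'long.mp3' is a placeholder name, and the
% two-element [first last] range form is assumed by analogy with
% wavread, so check "help mp3read" before relying on it):
%
%   siz = mp3read('long.mp3','size');          % size as reported via mp3info
%   [y,fs] = mp3read('long.mp3',[1 441000]);   % decode only the first 441000 frames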
%
% For more information, including advice on handling MP4 files,
% see http://labrosa.ee.columbia.edu/matlab/mp3read.html

%% Example usage
% Here, we read a wav file in, then write it out as an MP3, then
% read the resulting MP3 back in, and compare it to the original
% file.

% Read an audio waveform
[d,sr] = wavread('piano.wav');
% Save to mp3 (default settings)
mp3write(d,sr,'piano.mp3');
% Read it back again
[d2,sr] = mp3read('piano.mp3');
% mp3 encoding involves some extra padding at each end; we attempt
% to cut it off at the start, but can't do that at the end, because
% mp3read doesn't know how long the original was.  But we do, so..
% Chop it down to be the same length as the original
d2 = d2(1:length(d),:);
% What is the SNR (distortion)?
ddiff = d - d2;
disp(['SNR is ',num2str(10*log10(sum(d(:).^2)/sum(ddiff(:).^2))),' dB']);
% Do they look similar?
subplot(211)
specgram(d(:,1),1024,sr);
subplot(212)
% Overlay 5000 samples of the first channel of the original and the copy
plot(1:5000,d(10000+(1:5000),1),1:5000,d2(10000+(1:5000),1));
% Yes, pretty close
%
% NB: lame followed by mpg123 causes a little attenuation; you
% can get a better match by scaling up the read-back waveform:
ddiff = d - 1.052*d2;
disp(['SNR is ',num2str(10*log10(sum(d(:).^2)/sum(ddiff(:).^2))),' dB']);

%% Delay, size, and alignment
%
% In mid-2006 I noticed that mp3read followed by mp3write followed by
% mp3read effectively delayed the waveform by 2257 samples (at 44.1
% kHz).  So I introduced code to discard the first 2257 samples to ensure
% that the waveforms remained time aligned.
% As best I could understand,
% mpg123 (v 0.5.9) was including the "warm-up" samples from the
% synthesis filterbank which are more properly discarded.
%
% Then in late 2009 I noticed that some chord recognition code, which
% used mp3read to read files which were then segmented on the basis of
% some hand-marked timings, suddenly started getting much poorer
% results.  It turned out that I had upgraded my version of mpg123 to v
% 1.9.0, and the warm-up samples had been fixed in this version.  So my
% code was discarding 2257 *good* samples, and the data was skewed 51 ms
% early relative to the hand labels.
%
% Hence, the current version of mp3read does not
% discard any samples by default -- appropriate for the recent versions
% of mpg123 included here.  But if you know you're running an old, v
% 0.5.9, mpg123, you should edit the mp3read.m source to set the flag
% MPG123059 = 1.
%
% Note also that the 'size' function relies on the number of
% blocks reported by mp3info.  However, many mp3 files include
% additional information about the size of the file in the
% so-called Xing header, embedded in the first frame, which can
% specify that a certain number of samples from start and end
% should additionally be dropped.  mp3info doesn't read that,
% and there's no way for my code to probe it except by running
% mpg123.  Hence, the results of mp3read(fn,'size') may sometimes
% overestimate the length of the actual vector you'll get if
% you read the whole file.

%% External binaries
% The m files rely on three external binaries, each of which is
% available for Linux, Mac OS X, or Windows:
%
% *mpg123* is a high-performance mp3 decoder.
% Its home page is
% http://www.mpg123.de/ .
%
% *mp3info* is a utility to read technical information on an mp3
% file.  Its home page is http://www.ibiblio.org/mp3info/ .
%
% *lame* is an open-source MP3 encoder.  Its home page is
% http://lame.sourceforge.net/ .
%
% The various authors of these packages are gratefully acknowledged
% for doing all the hard work to make these Matlab functions possible.

%% Installation
% The two routines, mp3read.m and mp3write.m, will look for their
% binaries (mpg123 and mp3info for mp3read; lame for mp3write) in
% the same directory where they are installed.  Binaries for
% different architectures are distinguished by their extension,
% which is the standard Matlab computer code e.g. ".mac" for Mac
% PPC OS X, ".glnx86" for i386-linux.  The exception is Windows,
% where the binaries have the extension ".exe".
%
% Temporary files
% will be written to (a) a directory taken from the environment
% variable TMPDIR (b) /tmp if it exists, or (c) the current
% directory.  This can easily be changed by editing the m files.

% Last updated: $Date: 2009/03/15 18:29:58 $
% Dan Ellis
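
%% Sketch: locating the binaries and the temporary directory
% The lookup rules described under Installation could be written
% roughly as below.  This is an illustrative sketch under stated
% assumptions, not the code in mp3read.m: the variable names
% (bindir, ext, mpg123path, tmpdir) are made up here, the check that
% TMPDIR actually exists is an addition, and the real functions may
% differ in detail.

% Directory holding mp3read.m; fall back to the current directory
% if mp3read is not on the path.
bindir = fileparts(which('mp3read'));
if isempty(bindir)
  bindir = pwd;
end
% Binary extension: the Matlab computer code in lower case, except
% on Windows, where it is always "exe".
if ispc
  ext = 'exe';
else
  ext = lower(computer);
end
mpg123path = fullfile(bindir, ['mpg123.' ext]);
% Temporary directory: TMPDIR, then /tmp, then the current directory.
tmpdir = getenv('TMPDIR');
if isempty(tmpdir) || ~exist(tmpdir,'dir')
  if exist('/tmp','dir')
    tmpdir = '/tmp';
  else
    tmpdir = pwd;
  end
end
disp(['Would look for ' mpg123path ', with temporary files in ' tmpdir]);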