%% MP3 reading and writing
%
% These functions, mp3read and mp3write, aim to exactly duplicate
% the operation of wavread and wavwrite for accessing soundfiles,
% except the soundfiles are in MPEG-Audio layer 3 (MP3) compressed
% format.  All the hard work is done by external binaries written
% by others: mp3info to query the format of existing mp3 files,
% mpg123 to decode mp3 files, and lame to encode audio files.
% Binaries for these programs are widely available (and may be
% included in this distribution).
%
% These functions were originally developed for access to very
% large mp3 files (i.e. many hours long), and so avoid creating
% the entire uncompressed audio stream if possible.  mp3read
% allows you to specify the range of frames you want to read
% (as a second argument), and mp3read will construct an mpg123
% command that skips blocks to decode only the part of the file
% that is required.  This can be much quicker (and require less
% memory/temporary disk) than decoding the whole file.
%
% mpg123 also provides for "on the fly" downsampling and conversion
% to mono, which are supported as extra options in mp3read.
%
% mpg123 can read MP3s across the network.  This is supported
% if the FILE argument is a URL (e.g. beginning 'http://...').
%
% mp3info sometimes gets the file size wrong (as returned by the
% mp3read(...'size') syntax).  I'm not sure when this happens
% exactly, but it's probably a result of VBR files.  In the worst
% case, figuring the number of samples in such a file requires
% scanning through the whole file, and mp3info doesn't usually do
% this.
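%
% A minimal sketch of the range and 'size' syntaxes described above,
% assuming they mirror wavread's: a second argument [first last]
% selects sample frames, and 'size' returns [samples channels].
% The test tone, the variable names, and the temporary file name are
% made up for illustration; check "help mp3read" for the exact
% argument list, including the mono and downsampling options.
sr_demo = 44100;
tone = 0.5*sin(2*pi*440*(0:2*sr_demo-1)'/sr_demo);  % 2 s of 440 Hz, mono
tmpmp3 = [tempname,'.mp3'];
mp3write(tone,sr_demo,tmpmp3);
% Ask for the size without decoding the audio.  This figure comes from
% mp3info, so treat it as an estimate rather than an exact count.
siz = mp3read(tmpmp3,'size');
disp(['Reported size: ',num2str(siz(1)),' samples x ',num2str(siz(2)),' channel(s)']);
% Decode only the second half-second of the file
[seg,sr_seg] = mp3read(tmpmp3,[sr_demo/2+1, sr_demo]);
disp(['Read ',num2str(size(seg,1)),' samples at ',num2str(sr_seg),' Hz']);
% Clean up the temporary file
delete(tmpmp3);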
%
% For more information, including advice on handling MP4 files,
% see http://labrosa.ee.columbia.edu/matlab/mp3read.html

%% Example usage
% Here, we read a wav file in, then write it out as an MP3, then
% read the resulting MP3 back in, and compare it to the original
% file.

% Read an audio waveform
[d,sr] = wavread('piano.wav');
% Save to mp3 (default settings)
mp3write(d,sr,'piano.mp3');
% Read it back again
[d2,sr] = mp3read('piano.mp3');
% mp3 encoding involves some extra padding at each end; we attempt
% to cut it off at the start, but can't do that at the end, because
% mp3read doesn't know how long the original was.  But we do, so..
% Chop it down to be the same length as the original
% (size(d,1) counts sample frames regardless of the channel count)
d2 = d2(1:size(d,1),:);
% What is the SNR (distortion)?
ddiff = d - d2;
disp(['SNR is ',num2str(10*log10(sum(d(:).^2)/sum(ddiff(:).^2))),' dB']);
% Do they look similar?
% Top: spectrogram of the original (first channel)
subplot(211)
specgram(d(:,1),1024,sr);
% Bottom: 5000 samples of original and read-back waveforms overlaid
subplot(212)
plot(1:5000,d(10000+(1:5000),1),1:5000,d2(10000+(1:5000),1));
% Yes, pretty close
%
% NB: lame followed by mpg123 causes a little attenuation; you
% can get a better match by scaling up the read-back waveform:
ddiff = d - 1.052*d2;
disp(['SNR is ',num2str(10*log10(sum(d(:).^2)/sum(ddiff(:).^2))),' dB']);

%% Delay, size, and alignment
%
% In mid-2006 I noticed that mp3read followed by mp3write followed by
% mp3read effectively delayed the waveform by 2257 samples (at 44.1
% kHz). So I introduced code to discard the first 2257 samples to ensure
% that the waveforms remained time aligned.
% As best I could understand, mpg123 (v 0.5.9) was including the
% "warm-up" samples from the synthesis filterbank which are more
% properly discarded.
%
% Then in late 2009 I noticed that some chord recognition code, which
% used mp3read to read files which were then segmented on the basis of
% some hand-marked timings, suddenly started getting much poorer
% results.  It turned out that I had upgraded my version of mpg123 to v
% 1.9.0, and the warm-up samples had been fixed in this version.  So my
% code was discarding 2257 *good* samples, and the data was skewed 51 ms
% early relative to the hand labels.
%
% Hence, the current version of mp3read does not
% discard any samples by default -- appropriate for the recent versions
% of mpg123 included here.  But if you know you're running an old, v
% 0.5.9, mpg123, you should edit the mp3read.m source to set the flag
% MPG123059 = 1.
%
% Note also that the 'size' function relies on the number of
% blocks reported by mp3info.  However, many mp3 files include
% additional information about the size of the file in the
% so-called Xing header, embedded in the first frame, which can
% specify that a certain number of samples from start and end
% should additionally be dropped.  mp3info doesn't read that,
% and there's no way for my code to probe it except by running
% mpg123.  Hence, the results of mp3read(fn,'size') may sometimes
% overestimate the length of the actual vector you'll get if
% you read the whole file.

%% External binaries
% The m files rely on three external binaries, each of which is
% available for Linux, Mac OS X, or Windows:
%
% *mpg123* is a high-performance mp3 decoder.
% Its home page is http://www.mpg123.de/ .
%
% *mp3info* is a utility to read technical information on an mp3
% file.  Its home page is http://www.ibiblio.org/mp3info/ .
%
% *lame* is an open-source MP3 encoder.  Its home page is
% http://lame.sourceforge.net/ .
%
% The various authors of these packages are gratefully acknowledged
% for doing all the hard work to make these Matlab functions possible.

%% Installation
% The two routines, mp3read.m and mp3write.m, will look for their
% binaries (mpg123 and mp3info for mp3read; lame for mp3write) in
% the same directory where they are installed.  Binaries for
% different architectures are distinguished by their extension,
% which is the standard Matlab computer code e.g. ".mac" for Mac
% PPC OS X, ".glnx86" for i386-linux.  The exception is Windows,
% where the binaries have the extension ".exe".
%
% Temporary files
% will be written to (a) a directory taken from the environment
% variable TMPDIR (b) /tmp if it exists, or (c) the current
% directory.  This can easily be changed by editing the m files.

% Last updated: $Date: 2009/03/15 18:29:58 $
% Dan Ellis
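
% A minimal sketch of the lookup rules described in the Installation
% section above.  This is not the code from mp3read.m; the variable
% names are made up for illustration, and lower-casing the output of
% computer() to form the extension is an assumption based on the
% ".mac" and ".glnx86" examples.
% Binary name: same directory as mp3read.m, extension from computer()
ext = lower(computer);
if ispc
  ext = 'exe';                           % Windows is the exception
end
bindir = fileparts(which('mp3read'));    % empty if mp3read is not on the path
mpg123bin = fullfile(bindir,['mpg123.',ext]);
disp(['Would look for decoder at: ',mpg123bin]);
% Temporary directory: (a) $TMPDIR, (b) /tmp if it exists, (c) current dir
tmpdir = getenv('TMPDIR');
if isempty(tmpdir)
  if exist('/tmp','dir')
    tmpdir = '/tmp';
  else
    tmpdir = pwd;
  end
end
disp(['Temporary files would go to: ',tmpdir]);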