diff toolboxes/mp3readwrite/demo_mp3readwrite.m @ 0:e9a9cd732c1e tip

first hg version after svn
author wolffd
date Tue, 10 Feb 2015 15:05:51 +0000
parents
children
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/toolboxes/mp3readwrite/demo_mp3readwrite.m	Tue Feb 10 15:05:51 2015 +0000
@@ -0,0 +1,133 @@
+%% MP3 reading and writing
+%
+% These functions, mp3read and mp3write, aim to exactly duplicate 
+% the operation of wavread and wavwrite for accessing soundfiles, 
+% except the soundfiles are in Mpeg-Audio layer 3 (MP3) compressed 
+% format.  All the hard work is done by external binaries written 
+% by others: mp3info to query the format of existing mp3 files, 
+% mpg123 to decode mp3 files, and lame to encode audio files.
+% Binaries for these tools are widely available (and may be
+% included in this distribution).  
+%
+% These functions were originally developed for access to very 
+% large mp3 files (i.e. many hours long), and so avoid creating 
+% the entire uncompressed audio stream if possible.  mp3read 
+% allows you to specify the range of frames you want to read 
+% (as a second argument), and mp3read will construct an mpg123 
+% command that skips blocks to decode only the part of the file 
+% that is required.  This can be much quicker (and require less 
+% memory/temporary disk) than decoding the whole file.
+%
+% mpg123 also provides for "on the fly" downsampling and conversion 
+% to mono, both of which are supported as extra options in mp3read.
+%
+% mpg123 can read MP3s across the network.  This is supported 
+% if the FILE argument is a URL (e.g. beginning 'http://...').
+%
+% mp3info sometimes gets the file size wrong (as returned by the
+% mp3read(...,'size') syntax).  I'm not sure exactly when this
+% happens, but it is probably caused by variable-bit-rate (VBR)
+% files.  In the worst case, determining the number of samples in
+% such a file requires scanning the whole file, which mp3info does
+% not usually do.
+%
+% For more information, including advice on handling MP4 files, 
+% see http://labrosa.ee.columbia.edu/matlab/mp3read.html
+
+%% Example usage
+% Here, we read a wav file in, then write it out as an MP3, then 
+% read the resulting MP3 back in, and compare it to the original 
+% file.
+
+% Read an audio waveform
+[d,sr] = wavread('piano.wav');
+% Save to mp3 (default settings)
+mp3write(d,sr,'piano.mp3');
+% Read it back again
+[d2,sr] = mp3read('piano.mp3');
+% mp3 encoding involves some extra padding at each end; we attempt 
+% to cut it off at the start, but can't do that at the end, because 
+% mp3read doesn't know how long the original was.  But we do, so
+% chop it down to be the same length (in samples) as the original.
+d2 = d2(1:size(d,1),:);
+% What is the SNR (distortion)?
+ddiff = d - d2;
+disp(['SNR is ',num2str(10*log10(sum(d(:).^2)/sum(ddiff(:).^2))),' dB']);
+% Do they look similar?
+subplot(211)
+specgram(d(:,1),1024,sr);
+subplot(212)
+plot(1:5000,d(10000+(1:5000),1),1:5000,d2(10000+(1:5000),1));
+% Yes, pretty close
+%
+% NB: lame followed by mpg123 causes a little attenuation; you 
+% can get a better match by scaling up the read-back waveform:
+ddiff = d - 1.052*d2;
+disp(['SNR is ',num2str(10*log10(sum(d(:).^2)/sum(ddiff(:).^2))),' dB']);
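+
+% Reading part of a file: a minimal sketch of the range and 'size'
+% syntaxes described in the introduction.  It assumes that mp3read
+% follows the wavread convention of taking [first last] sample
+% indices as its second argument and of returning [samples channels]
+% for 'size'; check "help mp3read" before relying on this.
+siz = mp3read('piano.mp3','size');
+% Read at most the first second of audio (assumed range form)
+nlast = min(siz(1), sr);
+[dseg,srseg] = mp3read('piano.mp3',[1 nlast]);
+disp(['Read ',num2str(size(dseg,1)),' of ',num2str(siz(1)),' samples']);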
+
+%% Delay, size, and alignment
+%
+% In mid-2006 I noticed that mp3read followed by mp3write followed by
+% mp3read effectively delayed the waveform by 2257 samples (at 44.1
+% kHz). So I introduced code to discard the first 2257 samples to ensure
+% that the waveforms remained time-aligned. As best I could understand,
+% mpg123 (v 0.5.9) was including the "warm-up" samples from the
+% synthesis filterbank, which are more properly discarded.
+%
+% Then in late 2009 I noticed that some chord recognition code, which
+% used mp3read to read files which were then segmented on the basis of
+% some hand-marked timings, suddenly started getting much poorer
+% results. It turned out that I had upgraded my version of mpg123 to
+% v 1.9.0, and the warm-up sample problem had been fixed in that
+% version. So my code was discarding 2257 *good* samples, and the data
+% was skewed 51 ms early relative to the hand labels.
+%
+% Hence, the current version of mp3read does not discard any samples
+% by default, which is appropriate for the recent versions of mpg123
+% included here. But if you know you're running an old mpg123
+% (v 0.5.9), you should edit the mp3read.m source to set the flag
+% MPG123059 = 1.
+% 
+% Note also that the 'size' query relies on the number of 
+% blocks reported by mp3info.  However, many mp3 files include 
+% additional information about the size of the file in the
+% so-called Xing header, embedded in the first frame, which can 
+% specify that a certain number of samples from start and end 
+% should additionally be dropped.  mp3info doesn't read that, 
+% and there's no way for my code to probe it except by running 
+% mpg123.  Hence, the results of mp3read(fn,'size') may sometimes 
+% overestimate the length of the actual vector you'll get if 
+% you read the whole file.
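+
+% A worked check of the figures above, as a minimal sketch.  It
+% assumes a 44.1 kHz sampling rate and assumes that the 'size' result
+% is [samples channels], as in wavread.
+delay_samples = 2257;
+sr_nominal = 44100;
+delay_ms = 1000*delay_samples/sr_nominal;   % about 51 ms
+disp(['Discarding ',num2str(delay_samples),' samples shifts the audio by ', ...
+      num2str(delay_ms),' ms']);
+% Compare the length estimated via mp3info with the decoded length
+if exist('piano.mp3','file')
+  siz = mp3read('piano.mp3','size');
+  dfull = mp3read('piano.mp3');
+  disp(['size reports ',num2str(siz(1)),' samples; decoding gives ', ...
+        num2str(size(dfull,1))]);
+end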
+
+%% External binaries
+% The m files rely on three external binaries, each of which is
+% available for Linux, Mac OS X, and Windows:
+%
+% *mpg123* is a high-performance mp3 decoder.  Its home page is
+% http://www.mpg123.de/ .  
+%
+% *mp3info* is a utility to read technical information on an mp3
+% file. Its home page is http://www.ibiblio.org/mp3info/ .  
+%
+% *lame* is an open-source MP3 encoder.  Its home page is
+% http://lame.sourceforge.net/ .
+%
+% The various authors of these packages are gratefully acknowledged 
+% for doing all the hard work to make these Matlab functions possible.
+
+%% Installation
+% The two routines, mp3read.m and mp3write.m, will look for their 
+% binaries (mpg123 and mp3info for mp3read; lame for mp3write) in 
+% the same directory where they are installed.  Binaries for
+% different architectures are distinguished by their extension,
+% which is the lower-cased output of Matlab's computer function,
+% e.g. ".mac" for Mac PPC OS X or ".glnx86" for i386 Linux.  The
+% exception is Windows, where the binaries have the extension ".exe".
+%
+% Temporary files will be written to (a) the directory named by the
+% environment variable TMPDIR, (b) /tmp if it exists, or (c) the
+% current directory.  This can easily be changed by editing the
+% m files.
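+
+% The lookup rules described above can be sketched as follows.  This
+% is an illustration of the stated rules, not the code in mp3read.m;
+% the variable names (ext, bindir, tmpdir) are hypothetical.
+ext = lower(computer);            % e.g. 'mac' or 'glnx86'
+if ispc, ext = 'exe'; end         % Windows binaries use ".exe"
+bindir = fileparts(which('mp3read'));
+disp(['Decoder expected at ',fullfile(bindir,['mpg123.',ext])]);
+% Temporary directory: TMPDIR, then /tmp, then the current directory
+tmpdir = getenv('TMPDIR');
+if isempty(tmpdir)
+  if exist('/tmp','dir'), tmpdir = '/tmp'; else, tmpdir = pwd; end
+end
+disp(['Temporary files would go to ',tmpdir]);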
+
+% Last updated: $Date: 2009/03/15 18:29:58 $
+% Dan Ellis <dpwe@ee.columbia.edu>