comparison toolboxes/mp3readwrite/demo_mp3readwrite.m @ 0:e9a9cd732c1e tip
first hg version after svn
author | wolffd |
---|---|
date | Tue, 10 Feb 2015 15:05:51 +0000 |
parents | |
children |
-1:000000000000 | 0:e9a9cd732c1e |
---|---|
1 %% MP3 reading and writing | |
2 % | |
3 % These functions, mp3read and mp3write, aim to exactly duplicate | |
4 % the operation of wavread and wavwrite for accessing soundfiles, | |
5 % except the soundfiles are in Mpeg-Audio layer 3 (MP3) compressed | |
6 % format. All the hard work is done by external binaries written | |
7 % by others: mp3info to query the format of existing mp3 files, | |
8 % mpg123 to decode mp3 files, and lame to encode audio files. | |
9 % Binaries for these programs are widely available (and may be | |
10 % included in this distribution). | |
11 % | |
12 % These functions were originally developed for access to very | |
13 % large mp3 files (i.e. many hours long), and so avoid creating | |
14 % the entire uncompressed audio stream if possible. mp3read | |
15 % allows you to specify the range of frames you want to read | |
16 % (as a second argument), and mp3read will construct an mpg123 | |
17 % command that skips blocks to decode only the part of the file | |
18 % that is required. This can be much quicker (and require less | |
19 % memory/temporary disk) than decoding the whole file. | |
20 % | |
21 % mpg123 also provides for "on the fly" downsampling and conversion | |
22 % to mono, which are supported as extra options in mp3read. | |
23 % | |
24 % mpg123 can read MP3s across the network. This is supported | |
25 % if the FILE argument is a URL (e.g. beginning 'http://...'). | |
26 % | |
27 % mp3info sometimes gets the file size wrong (as returned by the | |
28 % mp3read(...,'size') syntax). I'm not sure when this happens | |
29 % exactly, but it's probably a result of VBR files. In the worst | |
30 % case, figuring the number of samples in such a file requires | |
31 % scanning through the whole file, and mp3info doesn't usually do | |
32 % this. | |
33 % | |
34 % For more information, including advice on handling MP4 files, | |
35 % see http://labrosa.ee.columbia.edu/matlab/mp3read.html | |
36 | |
37 %% Example usage | |
38 % Here, we read a wav file in, then write it out as an MP3, then | |
39 % read the resulting MP3 back in, and compare it to the original | |
40 % file. | |
41 | |
42 % Read an audio waveform | |
43 [d,sr] = wavread('piano.wav'); | |
44 % Save to mp3 (default settings) | |
45 mp3write(d,sr,'piano.mp3'); | |
46 % Read it back again | |
47 [d2,sr] = mp3read('piano.mp3'); | |
48 % mp3 encoding involves some extra padding at each end; we attempt | |
49 % to cut it off at the start, but can't do that at the end, because | |
50 % mp3read doesn't know how long the original was. But we do, so.. | |
51 % Chop it down to be the same length as the original | |
52 d2 = d2(1:size(d,1),:); | |
53 % What is the SNR (distortion)? | |
54 ddiff = d - d2; | |
55 disp(['SNR is ',num2str(10*log10(sum(d(:).^2)/sum(ddiff(:).^2))),' dB']); | |
56 % Do they look similar? | |
57 subplot(211) | |
58 specgram(d(:,1),1024,sr); | |
59 subplot(212) | |
60 plot(1:5000,d(10000+(1:5000),1),1:5000,d2(10000+(1:5000),1)); | |
61 % Yes, pretty close | |
62 % | |
63 % NB: lame followed by mpg123 causes a little attenuation; you | |
64 % can get a better match by scaling up the read-back waveform: | |
65 ddiff = d - 1.052*d2; | |
66 disp(['SNR is ',num2str(10*log10(sum(d(:).^2)/sum(ddiff(:).^2))),' dB']); | |
67 | |
68 %% Delay, size, and alignment | |
69 % | |
70 % In mid-2006 I noticed that mp3read followed by mp3write followed by | |
71 % mp3read effectively delayed the waveform by 2257 samples (at 44.1 | |
72 % kHz). So I introduced code to discard the first 2257 samples to ensure | |
73 % that the waveforms remained time aligned. As best I could understand, | |
74 % mpg123 (v 0.5.9) was including the "warm-up" samples from the | |
75 % synthesis filterbank which are more properly discarded. | |
76 % | |
77 % Then in late 2009 I noticed that some chord recognition code, which | |
78 % used mp3read to read files which were then segmented on the basis of | |
79 % some hand-marked timings, suddenly started getting much poorer | |
80 % results. It turned out that I had upgraded my version of mpg123 to v | |
81 % 1.9.0, and the warm-up samples had been fixed in this version. So my | |
82 % code was discarding 2257 *good* samples, and the data was skewed 51ms | |
83 % early relative to the hand labels. | |
84 % | |
85 % Hence, the current version of mp3read does not | |
86 % discard any samples by default -- appropriate for the recent versions | |
87 % of mpg123 included here. But if you know you're running an old | |
88 % mpg123 (v 0.5.9), you should edit the mp3read.m source to set the flag | |
89 % MPG123059 = 1. | |
90 % | |
91 % Note also that the 'size' function relies on the number of | |
92 % blocks reported by mp3info. However, many mp3 files include | |
93 % additional information about the size of the file in the | |
94 % so-called Xing header, embedded in the first frame, which can | |
95 % specify that a certain number of samples from start and end | |
96 % should additionally be dropped. mp3info doesn't read that, | |
97 % and there's no way for my code to probe it except by running | |
98 % mpg123. Hence, the results of mp3read(fn,'size') may sometimes | |
99 % overestimate the length of the actual vector you'll get if | |
100 % you read the whole file. | |
101 | |
102 %% External binaries | |
103 % The m files rely on three external binaries, each of which is | |
104 % available for Linux, Mac OS X, and Windows: | |
105 % | |
106 % *mpg123* is a high-performance mp3 decoder. Its home page is | |
107 % http://www.mpg123.de/ . | |
108 % | |
109 % *mp3info* is a utility to read technical information on an mp3 | |
110 % file. Its home page is http://www.ibiblio.org/mp3info/ . | |
111 % | |
112 % *lame* is an open-source MP3 encoder. Its home page is | |
113 % http://lame.sourceforge.net/ . | |
114 % | |
115 % The various authors of these packages are gratefully acknowledged | |
116 % for doing all the hard work to make these Matlab functions possible. | |
117 | |
118 %% Installation | |
119 % The two routines, mp3read.m and mp3write.m, will look for their | |
120 % binaries (mpg123 and mp3info for mp3read; lame for mp3write) in | |
121 % the same directory where they are installed. Binaries for | |
122 % different architectures are distinguished by their extension, | |
123 % which is the (lowercased) Matlab computer() string, e.g. ".mac" for Mac | |
124 % PPC OS X, ".glnx86" for i386-linux. The exception is Windows, | |
125 % where the binaries have the extension ".exe". | |
126 % | |
127 % Temporary files | |
128 % will be written to (a) a directory taken from the environment | |
129 % variable TMPDIR, (b) /tmp if it exists, or (c) the current | |
130 % directory. This can easily be changed by editing the m files. | |
131 | |
132 % Last updated: $Date: 2009/03/15 18:29:58 $ | |
133 % Dan Ellis <dpwe@ee.columbia.edu> |
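
The SNR check, the 1.052 gain correction, and the 2257-sample delay described in the demo can be explored without wavread, lame, or mpg123. The following is a minimal sketch on a synthetic signal. It assumes a made-up "codec" that only delays, attenuates, and adds a small tone. Real MP3 coding behaves differently, so the printed figures are illustrative only. The variable names (true_delay, atten, distortion, est_delay, g) are hypothetical and are not part of mp3read or mp3write.

```matlab
% Sketch only: a synthetic stand-in for the wav -> mp3 -> wav round trip.
% Assumption: the fake "codec" below just delays, attenuates, and adds a
% small tone; it is NOT lame/mpg123 and its numbers are illustrative.
sr = 44100;                             % sample rate (Hz)
N = 16384;                              % length of the test signal (samples)
t = (0:N-1)' / sr;                      % time axis (s)
d = sin(2*pi*(100*t + 5000*t.^2));      % linear chirp used as the "original"

true_delay = 2257;                      % delay quoted in the text (samples)
atten = 1/1.052;                        % attenuation implied by the 1.052 factor
distortion = 0.01*cos(2*pi*7000*t);     % hypothetical coding error
d2 = [zeros(true_delay,1); atten*(d + distortion)];

% Estimate the delay: slide a window of the original along d2 and keep
% the lag with the largest inner product (a plain cross-correlation).
M = 8192;                               % window length (samples)
maxlag = 4096;                          % largest lag tested (samples)
c = zeros(maxlag+1,1);
for k = 0:maxlag
  c(k+1) = sum(d(1:M) .* d2(k+(1:M)));
end
[~, idx] = max(c);
est_delay = idx - 1;
disp(['Estimated delay: ',num2str(est_delay),' samples = ', ...
      num2str(1000*est_delay/sr),' ms']);

% Drop the delayed samples and trim to the original length.
d2a = d2(est_delay + (1:N));

% SNR without any gain correction (same formula as the demo).
ddiff = d - d2a;
snr_raw = 10*log10(sum(d.^2)/sum(ddiff.^2));

% Least-squares gain that best maps d2a onto d, then SNR again.
g = (d2a'*d) / (d2a'*d2a);
ddiff = d - g*d2a;
snr_scaled = 10*log10(sum(d.^2)/sum(ddiff.^2));

disp(['Gain is ',num2str(g)]);
disp(['SNR is ',num2str(snr_raw),' dB unscaled, ', ...
      num2str(snr_scaled),' dB after scaling']);
```

Here g is the least-squares scale factor for this synthetic pair. The demo's fixed 1.052 plays the same role for the lame and mpg123 round trip. At 44.1 kHz, 2257 samples corresponds to about 51 ms, which matches the skew reported against the hand labels.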