Implement maxpooling and unpooling aspect
author | Dan Stowell <danstowell@users.sourceforge.net> |
---|---|
date | Wed, 13 Jan 2016 09:56:16 +0000 |
parents | 73317239d6d1 |
children |
Spectrogram auto-encoder
(c) Dan Stowell 2016.


A simple example of an autoencoder set up for spectrograms, with two convolutional layers - thought of as one "encoding" layer and one "decoding" layer.

It's meant to be a fairly minimal example of doing this in Theano, using the Lasagne framework to make things easier.

By default it simply makes a training set from different chunks of the same single spectrogram (from the supplied wave file). This is not a good training set!

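As a rough illustration of what "chunks of the same single spectrogram" can mean, here is a minimal sketch in plain Python. It is not taken from autoencoder-specgram.py: the function name, the random choice of start frames, and the list-of-lists layout (frequency bins x time frames) are all assumptions made for the example.

```python
import random


def make_training_chunks(specgram, chunk_len, num_chunks, seed=0):
    """Cut `num_chunks` excerpts of `chunk_len` frames from one spectrogram.

    `specgram` is a list of frequency rows, each a list of per-frame values
    (num_bins x num_frames). Excerpts are contiguous in time and may overlap.
    NOTE: hypothetical helper, not part of the actual script.
    """
    num_frames = len(specgram[0])
    if chunk_len > num_frames:
        raise ValueError("chunk_len exceeds the number of frames")
    rng = random.Random(seed)
    chunks = []
    for _ in range(num_chunks):
        # randint is inclusive at both ends, so the last valid start is allowed
        start = rng.randint(0, num_frames - chunk_len)
        chunks.append([row[start:start + chunk_len] for row in specgram])
    return chunks


# Toy "spectrogram": 4 frequency bins x 20 time frames.
toy_specgram = [[float(b * 100 + t) for t in range(20)] for b in range(4)]
training_set = make_training_chunks(toy_specgram, chunk_len=8, num_chunks=5)
```

Because every excerpt comes from the same recording (and excerpts can overlap), the examples are strongly correlated with each other, which is one reason this is a poor training set.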
Notable (potentially unusual) things about this implementation:
 * Data is not pre-whitened; instead, a custom layer (NormalisationLayer) normalises the mean and variance of the data for us. This is because I want the spectrogram to be normalised when it is input but not normalised when it is output.
 * It's a convolutional net, but only along the time axis; along the frequency axis it's fully-connected.
 * There's no maxpooling/downsampling.
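The "normalised on input, not normalised on output" idea from the first point can be sketched without any framework. This is only an illustration, not the code of NormalisationLayer: the function names are hypothetical, and computing the statistics per frequency bin is an assumption (the README only says the mean and variance are normalised).

```python
import math


def fit_normaliser(chunks):
    """Estimate a mean and standard deviation per frequency bin.

    `chunks` is a list of excerpts, each num_bins x num_frames.
    ASSUMPTION: per-bin statistics; the real layer may do this differently.
    """
    num_bins = len(chunks[0])
    means, stds = [], []
    for b in range(num_bins):
        values = [v for chunk in chunks for v in chunk[b]]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        means.append(mean)
        stds.append(math.sqrt(var) or 1.0)  # avoid dividing by zero
    return means, stds


def normalise(chunk, means, stds):
    """Applied on the way in: zero mean, unit variance per bin."""
    return [[(v - m) / s for v in row] for row, m, s in zip(chunk, means, stds)]


def denormalise(chunk, means, stds):
    """Applied on the way out: map back to the original spectrogram scale."""
    return [[v * s + m for v in row] for row, m, s in zip(chunk, means, stds)]


chunks = [[[1.0, 3.0], [10.0, 10.0]]]  # one excerpt, 2 bins x 2 frames
means, stds = fit_normaliser(chunks)
normed = normalise(chunks[0], means, stds)
restored = denormalise(normed, means, stds)
```

Keeping the statistics inside the model, rather than pre-whitening the data files, means the output can be compared directly against the un-normalised spectrogram.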


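To make "convolutional only along the time axis, fully-connected along frequency" concrete: each filter spans every frequency bin and slides over time only, so each filter produces a one-dimensional output. The sketch below is plain Python for illustration, not the Lasagne code used in the script; the function name and the "valid" (no padding) border handling are assumptions, and it computes cross-correlation (no kernel flip), with no bias or nonlinearity.

```python
def conv_time_only(specgram, filters):
    """Slide full-height filters along the time axis only.

    specgram: num_bins x num_frames.
    filters:  list of kernels, each num_bins x filter_len.
    Returns num_filters x (num_frames - filter_len + 1) activations.
    NOTE: hypothetical helper for illustration.
    """
    num_bins = len(specgram)
    num_frames = len(specgram[0])
    outputs = []
    for kernel in filters:
        if len(kernel) != num_bins:
            raise ValueError("each filter must span all frequency bins")
        filter_len = len(kernel[0])
        activations = []
        for t in range(num_frames - filter_len + 1):
            acc = 0.0
            # Sum over ALL bins (fully-connected in frequency) ...
            for b in range(num_bins):
                # ... and over a short window of frames (local in time).
                for k in range(filter_len):
                    acc += specgram[b][t + k] * kernel[b][k]
            activations.append(acc)
        outputs.append(activations)
    return outputs


toy = [[1.0, 2.0, 3.0, 4.0],
       [0.0, 1.0, 0.0, 1.0]]           # 2 bins x 4 frames
kernel = [[1.0, 1.0],
          [1.0, 0.0]]                  # 2 bins x 2 frames
feature_maps = conv_time_only(toy, [kernel])
```

In a 2D convolution layer, the same effect comes from a filter whose height equals the number of frequency bins, which leaves nothing to slide over in that direction.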
SYSTEM REQUIREMENTS
===================

* Python
* Theano (NOTE: please check the Lasagne page for preferred Theano version)
* Lasagne https://github.com/Lasagne/Lasagne
* Matplotlib
* scikits.audiolab

Tested on Ubuntu 14.04 with Python 2.7.

USAGE
=====

    python autoencoder-specgram.py

It creates a "pdf" folder and puts plots in there (multi-page PDFs) as it goes along.
There's a "progress" pdf which gets repeatedly overwritten - you should see the output quality gradually getting better.

Look in userconfig.py for configuration options.