Spectrogram auto-encoder
(c) Dan Stowell 2016.


A simple example of an autoencoder set up for spectrograms, with two convolutional layers - thought of as one "encoding" layer and one "decoding" layer.

It's meant to be a fairly minimal example of doing this in Theano, using the Lasagne framework to make things easier.

By default it simply makes a training set from different chunks of the same single spectrogram (from the supplied wave file). This is not a good training set!

Notable (potentially unusual) things about this implementation:
 * Data is not pre-whitened, instead we use a custom layer (NormalisationLayer) to normalise the mean-and-variance of the data for us. This is because I want the spectrogram to be normalised when it is input but not normalised when it is output.
 * It's a convolutional net but only along the time axis; along the frequency axis it's fully-connected.
 * There's no maxpooling/downsampling.


SYSTEM REQUIREMENTS
===================

* Python
* Theano (NOTE: please check the Lasagne page for preferred Theano version)
* Lasagne https://github.com/Lasagne/Lasagne
* Matplotlib
* scikits.audiolab

Tested on Ubuntu 14.04 with Python 2.7.

USAGE
=====

    python autoencoder-specgram.py

It creates a "pdf" folder and puts plots in there (multi-page PDFs) as it goes along.
There's a "progress" pdf which gets repeatedly overwritten - you should see the output quality gradually getting better.

Look in userconfig.py for configuration options.
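
To make two of the ideas above more concrete (building a training set from chunks of one spectrogram, and normalising on the way in while leaving the output in the original scale), here is a rough sketch in plain Python. It is not the code from autoencoder-specgram.py: the function names (make_chunks, fit_normalisation, normalise, denormalise), the toy data, and the choice of per-frequency-bin statistics are all assumptions made for illustration, and the real NormalisationLayer may compute its statistics differently. It avoids Theano and Lasagne so it can be run on its own.

```python
import math


def make_chunks(specgram, chunk_len, hop):
    """Cut a spectrogram into fixed-length chunks along the time axis.

    `specgram` is a list of frequency-bin rows, each a list of values over time
    (hypothetical layout chosen for this sketch).
    """
    n_frames = len(specgram[0])
    chunks = []
    for start in range(0, n_frames - chunk_len + 1, hop):
        chunks.append([row[start:start + chunk_len] for row in specgram])
    return chunks


def fit_normalisation(chunks):
    """Estimate a mean and standard deviation per frequency bin (an assumption)."""
    means, stds = [], []
    for b in range(len(chunks[0])):
        values = [v for chunk in chunks for v in chunk[b]]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        means.append(mean)
        stds.append(math.sqrt(var) or 1.0)  # fall back to 1.0 for constant bins
    return means, stds


def normalise(chunk, means, stds):
    """Applied on input: zero mean, unit variance per frequency bin."""
    return [[(v - m) / s for v in row] for row, m, s in zip(chunk, means, stds)]


def denormalise(chunk, means, stds):
    """Applied on output: map back to the original spectrogram scale."""
    return [[v * s + m for v in row] for row, m, s in zip(chunk, means, stds)]


# Toy "spectrogram": 2 frequency bins x 8 time frames (made-up numbers).
specgram = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    [10.0, 10.0, 12.0, 12.0, 10.0, 10.0, 12.0, 12.0],
]

chunks = make_chunks(specgram, chunk_len=4, hop=2)
means, stds = fit_normalisation(chunks)
normalised = [normalise(c, means, stds) for c in chunks]
restored = [denormalise(c, means, stds) for c in normalised]
```

Because the chunks overlap and all come from one recording, they are highly correlated, which is one reason this kind of training set is a poor one. Keeping the inverse mapping alongside the forward one is what lets a network see normalised input while its output is compared against the un-normalised spectrogram.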