tomwalters@0: .TH GENSGM 1 "11 May 1995" 
tomwalters@0: .LP 
tomwalters@0: .SH NAME 
tomwalters@0: .LP 
tomwalters@0: gensgm \- generate auditory spectrogram 
tomwalters@0: .LP 
tomwalters@0: .SH SYNOPSIS 
tomwalters@0: .LP 
tomwalters@0: gensgm [ option=value | -option ] [ filename ] 
tomwalters@0: .LP 
tomwalters@0: .SH DESCRIPTION 
tomwalters@0: .LP 
tomwalters@0: The gensgm module of the AIM software performs a time-domain spectral
tomwalters@0: analysis using a bank of auditory filters, and summarises the
tomwalters@0: information in an auditory spectrogram, that is, a spectrogram with
tomwalters@0: auditory frequency resolution and temporal resolution, rather than the
tomwalters@0: fixed frequency and temporal resolution of traditional speech
tomwalters@0: preprocessors.  The spectral analysis converts the input wave into an
tomwalters@0: array of filtered waves, one for each channel of a gammatone auditory
tomwalters@0: filterbank. The surface of the array of filtered waves is AIM's
tomwalters@0: representation of basilar membrane motion (BMM) as a function of
tomwalters@0: time. The auditory spectrogram is a plot of a sequence of spectral
tomwalters@0: slices extracted from the envelope of the BMM every 'frstep_epn'
tomwalters@0: ms. The envelope is calculated continuously, by rectifying,
tomwalters@0: compressing, and lowpass filtering the individual BMM waves as they
tomwalters@0: flow from the filterbank.
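.LP
The envelope calculation described above can be illustrated for a single channel with the following minimal sketch. It is not the AIM implementation: the function name envelope_frames, the use of log1p as the logarithmic compressor, and the one-pole form of each lowpass stage are all assumptions made for illustration. Only the order of operations (rectify, compress, lowpass filter, then take a value every frame step) and the default values (two stages, 8 ms time constant, 10 ms frame step) come from this manual entry.

```python
import math


def envelope_frames(wave, sample_rate_hz, tup_ms=8.0, stages=2, frstep_ms=10.0):
    """Rectify, compress, lowpass filter and sample one channel's BMM wave.

    Hypothetical helper for illustration; not part of AIM.
    """
    # Per-sample decay of a one-pole leaky integrator with time constant tup_ms.
    decay = math.exp(-1000.0 / (sample_rate_hz * tup_ms))
    gain = 1.0 - decay  # gives each stage unit gain for a constant input
    # Number of input samples between successive spectral frames.
    step = max(1, round(sample_rate_hz * frstep_ms / 1000.0))

    state = [0.0] * stages
    frames = []
    for n, x in enumerate(wave):
        # Half-wave rectification: negative values are screened out.
        rectified = max(x, 0.0)
        # Logarithmic compression; log1p is an assumption that maps 0 to 0.
        y = math.log1p(rectified)
        # Cascade of identical leaky integrators (lowpass stages).
        for k in range(stages):
            state[k] = decay * state[k] + gain * y
            y = state[k]
        # Keep one envelope value per frame step.
        if (n + 1) % step == 0:
            frames.append(y)
    return frames


if __name__ == "__main__":
    # 100 ms of a constant input sampled at 1 kHz gives 10 frames.
    print(len(envelope_frames([1.0] * 100, 1000.0)))
```

Applying the same function to every channel of the filterbank and stacking the resulting frame sequences would give one spectral slice per frame step, which is the structure of the spectrogram described here.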
tomwalters@0: .LP
tomwalters@0: The frequency resolution of the analysis varies with the centre
tomwalters@0: frequency of the channel as in the auditory system, and the
tomwalters@0: distribution of channels across frequency is chosen to match that in
tomwalters@0: the auditory system (Patterson and Moore, 1986).  Thus, the auditory
tomwalters@0: spectrogram is a greyscale plot of the activity in each channel
tomwalters@0: (shades of black) as a function of time (the abscissa) and the centre
tomwalters@0: frequency of the auditory filter (the ordinate) in ERBs.  The
tomwalters@0: representation is referred to as an auditory spectrogram (SGM) to
tomwalters@0: distinguish it from more traditional spectrograms based on Fourier,
tomwalters@0: LPC or cepstral analysis. In AIM, the suffix 'sgm' is used to
tomwalters@0: distinguish this spectral representation from the other spectral
tomwalters@0: representations provided by the software ('asa' auditory spectral
tomwalters@0: analysis, 'cgm' cochleogram, and 'epn' excitation pattern).
tomwalters@0: .LP
tomwalters@0: The spectral analysis performed by gensgm is the same as that
tomwalters@0: performed by genbmm (see manaim genbmm). The primary differences are in
tomwalters@0: the display defaults and the inclusion of the Compression and Leaky
tomwalters@0: Integration modules used to produce the spectral slices that form the
tomwalters@0: spectrogram. As a result, this manual entry is restricted to
tomwalters@0: describing the option values that differ from those in genbmm and the
tomwalters@0: additional options required to control the Compression and Leaky
tomwalters@0: Integration.
tomwalters@0: .LP
tomwalters@0: .SH DISPLAY DEFAULTS
tomwalters@0: .LP
tomwalters@0: The default values for three of the display options are reset to
tomwalters@0: produce a spectrographic format rather than a landscape. Specifically,
tomwalters@0: display=greyscale, bottom=0 and top=2500. The number of channels is
tomwalters@0: set to 128 for compatibility with the auditory spectrum modules,
tomwalters@0: genasa and genepn.  When using AIM as a preprocessor for speech
tomwalters@0: recognition the number of channels would typically be reduced to
tomwalters@0: between 24 and 32.  Use option 'downsample' if it is necessary to
tomwalters@0: reduce the output to fewer than 24 channels across the speech range.
tomwalters@0: .LP
tomwalters@0: .SH COMPRESSION AND LEAKY INTEGRATION
tomwalters@0: .LP
tomwalters@0: Compression and lowpass filtering are activated and the neural
tomwalters@0: encoding stage that comes between them is turned off:
tomwalters@0: .LP
tomwalters@0: .SS "Compression"
tomwalters@0: .PP
tomwalters@0: Auditory spectra are usually produced via the functional route in
tomwalters@0: AIM. In this case, compress is set to on.
tomwalters@0: .LP
tomwalters@0: .TP 13
tomwalters@0: compress
tomwalters@0: Logarithmic compressor switch
tomwalters@0: .RS
tomwalters@0: Switch. Default: on.
tomwalters@0: .RE
tomwalters@0: .RS
tomwalters@0: .LP
tomwalters@0: Note: The compressor in the functional route of AIM is logarithmic and
tomwalters@0: it screens out negative BMM values before compression. This rectifies
tomwalters@0: the wave during the compression process and so the separate rectify
tomwalters@0: option is left off. 
tomwalters@0: .RE
tomwalters@0: .LP
tomwalters@0: .RS
tomwalters@0: .LP
tomwalters@0: Note: The compressor in the physiological route of AIM is an integral
tomwalters@0: part of the tlf module, so when using this route to produce auditory
tomwalters@0: spectra, turn off the logarithmic compressor (i.e. compress=off). The
tomwalters@0: compressor in tlf does not screen out negative values so it is also
tomwalters@0: important to set rectify=on.
tomwalters@0: .RE
tomwalters@0: .RS
tomwalters@0: .LP
tomwalters@0: Full wave rectification is produced if rectify is set to 2. This
tomwalters@0: option value leads to smoother spectrograms. It is also useful when
tomwalters@0: calculating envelopes with genasa.
tomwalters@0: .RE
tomwalters@0: .LP
tomwalters@0: .SS "Transduction"
tomwalters@0: .PP
tomwalters@0: .LP
tomwalters@0: .TP 13
tomwalters@0: transduction
tomwalters@0: Neural transduction switch (at, meddis, off)
tomwalters@0: .RS
tomwalters@0: Switch. Default: off.
tomwalters@0: .RE
tomwalters@0: .LP
tomwalters@0: .SS "Leaky Integration"
tomwalters@0: .PP
tomwalters@0: .LP
tomwalters@0: .TP 13
tomwalters@0: stages_idt
tomwalters@0: Number of stages of lowpass filtering
tomwalters@0: .RS
tomwalters@0: Default unit: scalar. Default value: 2.
tomwalters@0: .RE
tomwalters@0: .TP 13
tomwalters@0: tup_idt
tomwalters@0: The time constant for each filter stage
tomwalters@0: .RS
tomwalters@0: Default unit: ms. Default value: 8 ms.
tomwalters@0: .RE
tomwalters@0: .LP 
tomwalters@0: The Equivalent Rectangular Duration (ERD) of a two-stage lowpass
tomwalters@0: filter is about 1.6 times the time constant of each stage, or
tomwalters@0: 12.8 ms in the current case.
tomwalters@0: .TP 13
tomwalters@0: frstep_epn
tomwalters@0: The time between successive spectral frames
tomwalters@0: .RS
tomwalters@0: Default unit: ms. Default value: 10 ms.
tomwalters@0: .RE
tomwalters@0: .LP
tomwalters@0: With a frstep_epn of 10 ms, gensgm will produce spectral frames at a
tomwalters@0: rate of 100 per second.
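.LP
The ERD and frame rate figures quoted above follow from simple arithmetic, reproduced in the sketch below. The function names are hypothetical, and the factor of 1.6 is taken directly from the text for the two-stage case only; it should not be assumed to hold for other values of stages_idt.

```python
# Factor relating ERD to the per-stage time constant for a two-stage
# lowpass filter, as quoted in this manual entry.
ERD_FACTOR_TWO_STAGE = 1.6


def two_stage_erd_ms(tup_idt_ms):
    """Approximate ERD (ms) of a two-stage lowpass filter."""
    return ERD_FACTOR_TWO_STAGE * tup_idt_ms


def frames_per_second(frstep_epn_ms):
    """Spectral frame rate for a given frame step in ms."""
    return 1000.0 / frstep_epn_ms


if __name__ == "__main__":
    print(two_stage_erd_ms(8.0))    # default tup_idt of 8 ms
    print(frames_per_second(10.0))  # default frstep_epn of 10 ms
```

With the defaults, the ERD (about 12.8 ms) is somewhat longer than the 10 ms frame step, so the integration windows of successive frames can be expected to overlap slightly.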
tomwalters@0: .LP
tomwalters@0: .TP 13
tomwalters@0: downsample
tomwalters@0: The time between successive spectral frames. 
tomwalters@0: .RS
tomwalters@0: Default unit: ms. Default value: 10 ms.
tomwalters@0: .RE
tomwalters@0: .LP
tomwalters@0: Downsample is simply another name for frstep_epn, provided to
tomwalters@0: facilitate a different mode of thinking about time-series data.
tomwalters@0: .LP
tomwalters@0: .SH FILES
tomwalters@0: .LP
tomwalters@0: .TP 13
tomwalters@0:  .gensgmrc 
tomwalters@0: The options file for gensgm.
tomwalters@0: .LP
tomwalters@0: .SH SEE ALSO
tomwalters@0: .LP
tomwalters@0: genasa, genbmm, genepn, gencgm
tomwalters@0: .LP
tomwalters@0: .SH BUGS
tomwalters@0: .LP
tomwalters@0: None currently known.
tomwalters@0: .SH COPYRIGHT
tomwalters@0: .LP
tomwalters@0: Copyright (c) Applied Psychology Unit, Medical Research Council, 1995
tomwalters@0: .LP
tomwalters@0: Permission to use, copy, modify, and distribute this software without fee 
tomwalters@0: is hereby granted for research purposes, provided that this copyright 
tomwalters@0: notice appears in all copies and in all supporting documentation, and that 
tomwalters@0: the software is not redistributed for any fee (except for a nominal 
tomwalters@0: shipping charge). Anyone wanting to incorporate all or part of this 
tomwalters@0: software in a commercial product must obtain a license from the Medical 
tomwalters@0: Research Council.
tomwalters@0: .LP
tomwalters@0: The MRC makes no representations about the suitability of this 
tomwalters@0: software for any purpose.  It is provided "as is" without express or 
tomwalters@0: implied warranty.
tomwalters@0: .LP
tomwalters@0: THE MRC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING 
tomwalters@0: ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL 
tomwalters@0: THE A.P.U. BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES 
tomwalters@0: OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, 
tomwalters@0: WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, 
tomwalters@0: ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS 
tomwalters@0: SOFTWARE.
tomwalters@0: .LP
tomwalters@0: .SH ACKNOWLEDGEMENTS
tomwalters@0: .LP
tomwalters@0: The AIM software was developed for Unix workstations by John
tomwalters@0: Holdsworth and Mike Allerhand of the MRC APU, under the direction of
tomwalters@0: Roy Patterson. The physiological version of AIM was developed by
tomwalters@0: Christian Giguere. The options handler is by Paul Manson. The revised
tomwalters@0: SAI module is by Jay Datta. Michael Akeroyd extended the PostScript
tomwalters@0: facilities and developed the xreview routine for auditory image
tomwalters@0: cartoons.
tomwalters@0: .LP
tomwalters@0: The project was supported by the MRC and grants from the U.K. Defence
tomwalters@0: Research Agency, Farnborough (Research Contract 2239); the EEC Esprit
tomwalters@0: BR Programme, Project ACTS (3207); and the U.K. Hearing Research Trust.
tomwalters@0: