comparison man/man1/gensgm.1 @ 0:5242703e91d3 tip

Initial checkin for AIM92 aimR8.2 (last updated May 1997).
author tomwalters
date Fri, 20 May 2011 15:19:45 +0100
1 .TH GENSGM 1 "11 May 1995"
2 .LP
3 .SH NAME
4 .LP
5 gensgm \- generate auditory spectrogram
6 .LP
7 .SH SYNOPSIS
8 .LP
9 gensgm [ option=value | -option ] [ filename ]
10 .LP
11 .SH DESCRIPTION
12 .LP
13 The gensgm module of the AIM software performs a time-domain spectral
14 analysis using a bank of auditory filters, and summarises the
15 information in an auditory spectrogram, that is, a spectrogram with
16 auditory frequency resolution and temporal resolution, rather than the
17 fixed frequency and temporal resolution of traditional speech
18 preprocessors. The spectral analysis converts the input wave into an
19 array of filtered waves, one for each channel of a gammatone auditory
20 filterbank. The surface of the array of filtered waves is AIM's
21 representation of basilar membrane motion (BMM) as a function of
22 time. The auditory spectrogram is a plot of a sequence of spectral
23 slices extracted from the envelope of the BMM every 'frstep_epn'
24 ms. The envelope is calculated continuously, by rectifying,
25 compressing, and lowpass filtering the individual BMM waves as they
26 flow from the filterbank.
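The envelope-and-slice procedure described above can be sketched in Python for illustration. This is a minimal sketch under stated assumptions, not AIM's implementation: the function names, the one-pole form of each lowpass stage, the log floor of 1.0 and the sample rate are hypothetical, while the two 8 ms stages and the 10 ms frame step follow the gensgm defaults documented under COMPRESSION AND LEAKY INTEGRATION.

```python
import math

def envelope(bmm, fs_hz, stages=2, tau_ms=8.0):
    """Rectify, log-compress and lowpass filter one BMM channel (a sketch)."""
    # Compress and rectify in one step: values <= 1.0 are screened out and
    # map to 0 (the floor of 1.0 is an assumption that keeps log() >= 0).
    out = [math.log(x) if x > 1.0 else 0.0 for x in bmm]
    # Each stage is assumed to be a one-pole leaky integrator whose
    # time constant is tau_ms.
    a = math.exp(-1000.0 / (fs_hz * tau_ms))
    for _ in range(stages):
        state, smoothed = 0.0, []
        for x in out:
            state = a * state + (1.0 - a) * x
            smoothed.append(state)
        out = smoothed
    return out

def spectral_slices(channels, fs_hz, frstep_ms=10.0):
    """Sample every channel envelope once per frstep_ms to form the frames."""
    step = int(round(fs_hz * frstep_ms / 1000.0))
    envs = [envelope(ch, fs_hz) for ch in channels]
    n = len(envs[0])
    # One frame (a list with one value per channel) every `step` samples.
    return [[env[i] for env in envs] for i in range(step - 1, n, step)]
```

With a hypothetical sample rate of 20 kHz, one second of input gives 100 frames, each holding one value per filterbank channel.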
27 .LP
28 The frequency resolution of the analysis varies with the centre
29 frequency of the channel as in the auditory system, and the
30 distribution of channels across frequency is chosen to match that in
31 the auditory system (Patterson and Moore, 1986). Thus, the auditory
32 spectrogram is a greyscale plot of the activity in each channel
33 (shades of black) as a function of time (the abscissa) and the centre
34 frequency of the auditory filter (the ordinate) in ERBs. The
35 representation is referred to as an auditory spectrogram (SGM) to
36 distinguish it from more traditional spectrograms based on Fourier,
37 LPC or cepstral analysis. In AIM, the suffix 'sgm' is used to
38 distinguish this spectral representation from the other spectral
39 representations provided by the software ('asa' auditory spectral
40 analysis, 'cgm' cochleogram, and 'epn' excitation pattern).
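As an illustration of an ordinate measured in ERBs, the sketch below spaces channel centre frequencies equally on an ERB-rate scale. It assumes the Glasberg and Moore (1990) ERB-rate formula, which is not necessarily the formula AIM uses (this entry cites Patterson and Moore, 1986). The function names and the 100 Hz to 6000 Hz range are hypothetical; only the figure of 128 channels comes from this entry.

```python
import math

def hz_to_erb_rate(f_hz):
    """ERB-rate at f_hz (assumed formula: Glasberg and Moore, 1990)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_rate_to_hz(e):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def centre_frequencies(f_min_hz, f_max_hz, n_channels):
    """Centre frequencies equally spaced on the ERB-rate scale."""
    lo, hi = hz_to_erb_rate(f_min_hz), hz_to_erb_rate(f_max_hz)
    step = (hi - lo) / (n_channels - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_channels)]

# Hypothetical frequency range; 128 channels as in the gensgm default.
cfs = centre_frequencies(100.0, 6000.0, 128)
```

Equal steps in ERB-rate give closely spaced channels at low frequencies and widely spaced channels at high frequencies, which is the kind of distribution the paragraph above describes.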
41 .LP
42 The spectral analysis performed by gensgm is the same as that
43 performed by genbmm (see manaim genbmm). The primary differences are in
44 the display defaults and the inclusion of the Compression and Leaky
45 Integration modules used to produce the spectral slices that form the
46 spectrogram. As a result, this manual entry is restricted to
47 describing the option values that differ from those in genbmm and the
48 additional options required to control the Compression and Leaky
49 Integration.
50 .LP
51 .SH DISPLAY DEFAULTS
52 .LP
53 The default values for three of the display options are reset to
54 produce a spectrographic format rather than a landscape. Specifically,
55 display=greyscale, bottom=0 and top=2500. The number of channels is
56 set to 128 for compatibility with the auditory spectrum modules,
57 genasa and genepn. When using AIM as a preprocessor for speech
58 recognition the number of channels would typically be reduced to
59 between 24 and 32. Use option 'downsample' if it is necessary to
60 reduce the output to fewer than 24 channels across the speech range.
61 .LP
62 .SH COMPRESSION AND LEAKY INTEGRATION
63 .LP
64 Compression and lowpass filtering are activated and the neural
65 encoding stage that comes between them is turned off:
66 .LP
67 .SS "Compression"
68 .PP
69 Auditory spectra are usually produced via the functional route in
70 AIM. In this case, compress is set to on.
71 .LP
72 .TP 13
73 compress
74 Logarithmic compressor switch
75 .RS
76 Switch. Default: on.
77 .RE
78 .RS
79 .LP
80 Note: The compressor in the functional route of AIM is logarithmic and
81 it screens out negative BMM values before compression. This rectifies
82 the wave during the compression process and so the separate rectify
83 option is left off.
84 .RE
85 .LP
86 .RS
87 .LP
88 Note: The compressor in the physiological route of AIM is an integral
89 part of the tlf module, so when using this route to produce auditory
90 spectra, turn off the logarithmic compressor (i.e. compress=off). The
91 compressor in tlf does not screen out negative values, so it is also
92 important to set rectify=on.
93 .RE
94 .RS
95 .LP
96 Full wave rectification is produced if rectify is set to 2. This
97 option value leads to smoother spectrograms. It is also useful when
98 calculating envelopes with genasa.
99 .RE
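The difference between the two rectify settings mentioned above can be sketched as follows. This is an illustration only: the function is hypothetical, it assumes that rectify=on means half-wave rectification (the entry states only that rectify=2 is full-wave), and the pass-through behaviour for any other value is likewise an assumption.

```python
def rectify(samples, mode):
    """Rectify a BMM wave: mode "on" is assumed half-wave, mode 2 is full-wave."""
    if mode == 2:
        # Full wave: negative half-cycles are folded up.
        return [abs(x) for x in samples]
    if mode == "on":
        # Half wave (assumed): negative half-cycles are discarded.
        return [max(x, 0.0) for x in samples]
    # Assumed behaviour when rectification is off: pass the wave through.
    return list(samples)

wave = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
half = rectify(wave, "on")   # [0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
full = rectify(wave, 2)      # [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
```

In this toy case the full-wave output has energy in every half-cycle, so a following lowpass filter has less ripple to remove, which is consistent with the smoother spectrograms noted above.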
100 .LP
101 .SS "Transduction"
102 .PP
103 .LP
104 .TP 13
105 transduction
106 Neural transduction switch (at, meddis, off)
107 .RS
108 Switch. Default: off.
109 .RE
110 .LP
111 .SS "Leaky Integration"
112 .PP
113 .LP
114 .TP 13
115 stages_idt
116 Number of stages of lowpass filtering
117 .RS
118 Default unit: scalar. Default value: 2
119 .RE
120 .TP 13
121 tup_idt
122 The time constant for each filter stage
123 .RS
124 Default unit: ms. Default value: 8 ms.
125 .RE
126 .LP
127 The Equivalent Rectangular Duration (ERD) of a two-stage lowpass
128 filter is about 1.6 times the time constant of each stage, or
129 12.8 ms in the current case.
130 .TP 13
131 frstep_epn
132 The time between successive spectral frames
133 .RS
134 Default unit: ms. Default value: 10 ms.
135 .RE
136 .LP
137 With a frstep_epn of 10 ms, gensgm will produce spectral frames at a
138 rate of 100 per second.
139 .LP
140 .TP 13
141 downsample
142 The time between successive spectral frames.
143 .RS
144 Default unit: ms. Default value: 10 ms.
145 .RE
146 .LP
147 Downsample is simply another name for frstep_epn, provided to
148 facilitate a different mode of thinking about time-series data.
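The arithmetic behind the default values in this section can be checked with a short sketch. The helper names are hypothetical, and the factor of 1.6 is simply the approximate value quoted in this entry for a two-stage filter; it is not derived here.

```python
def frames_per_second(frstep_ms):
    """Frame rate implied by the time between spectral frames."""
    return 1000.0 / frstep_ms

def erd_two_stage_ms(tup_ms, factor=1.6):
    """Approximate ERD of a two-stage lowpass filter.

    The factor 1.6 is taken from this manual entry, not computed.
    """
    return factor * tup_ms

rate = frames_per_second(10.0)   # default frstep_epn of 10 ms
erd = erd_two_stage_ms(8.0)      # default tup_idt of 8 ms
```

With the defaults, `rate` is 100 frames per second and `erd` is about 12.8 ms, matching the figures given above.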
149 .LP
150 .SH FILES
151 .LP
152 .TP 13
153 .gensgmrc
154 The options file for gensgm.
155 .LP
156 .SH SEE ALSO
157 .LP
158 genasa, genbmm, genepn, gencgm
159 .LP
160 .SH BUGS
161 .LP
162 None currently known.
163 .SH COPYRIGHT
164 .LP
165 Copyright (c) Applied Psychology Unit, Medical Research Council, 1995
166 .LP
167 Permission to use, copy, modify, and distribute this software without fee
168 is hereby granted for research purposes, provided that this copyright
169 notice appears in all copies and in all supporting documentation, and that
170 the software is not redistributed for any fee (except for a nominal
171 shipping charge). Anyone wanting to incorporate all or part of this
172 software in a commercial product must obtain a license from the Medical
173 Research Council.
174 .LP
175 The MRC makes no representations about the suitability of this
176 software for any purpose. It is provided "as is" without express or
177 implied warranty.
178 .LP
179 THE MRC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
180 ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL
181 THE A.P.U. BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES
182 OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
183 WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
184 ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
185 SOFTWARE.
186 .LP
187 .SH ACKNOWLEDGEMENTS
188 .LP
189 The AIM software was developed for Unix workstations by John
190 Holdsworth and Mike Allerhand of the MRC APU, under the direction of
191 Roy Patterson. The physiological version of AIM was developed by
192 Christian Giguere. The options handler is by Paul Manson. The revised
193 SAI module is by Jay Datta. Michael Akeroyd extended the postscript
194 facilities and developed the xreview routine for auditory image
195 cartoons.
196 .LP
197 The project was supported by the MRC and grants from the U.K. Defence
198 Research Agency, Farnborough (Research Contract 2239); the EEC Esprit
199 BR Programme, Project ACTS (3207); and the U.K. Hearing Research Trust.
200