.TH GENCGM 1 "11 May 1995"
.LP
.SH NAME
.LP
gencgm \- generate a cochleogram
.LP
.SH SYNOPSIS
.LP
gencgm [ option=value | -option ] [ filename ]
.LP
.SH DESCRIPTION
.LP
Gencgm converts the input wave into a simulated neural activity
pattern (NAP) and summarises the NAP as a sequence of excitation
patterns (EPNs) that collectively form a 'cochleogram' (CGM). The
operation takes place in three stages: spectral analysis, neural
encoding, and temporal integration. In the spectral analysis stage,
the input wave is converted into an array of filtered waves, one for
each channel of a gammatone auditory filterbank. The surface of the
array of filtered waves is AIM's representation of basilar membrane
motion (BMM) as a function of time (manaim genbmm). In the neural
encoding stage, compression, adaptation and suppression are used to
convert each wave from the filterbank into a simulation of the
aggregate neural response to that wave. The array of responses is
AIM's simulation of the neural activity pattern (NAP) in the auditory
nerve at about the level of the cochlear nucleus (manaim gennap).
Finally, the NAP is converted into a sequence of excitation patterns
(EPNs) by calculating the envelope of the NAP and extracting spectral
slices from the envelope every 'frstep_epn' ms. The envelope is
calculated continuously, by lowpass filtering the individual channels
of the NAP as they flow from the cochlea simulation.
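.LP
The temporal integration stage described above can be sketched as
follows. This is only an illustrative sketch, not the AIM source code:
the function names, the sample rate argument and the one-pole
discretisation are assumptions, while the defaults mirror the
documented values of stages_idt, tup_idt and frstep_epn.
.nf

```python
import math

def leaky_integrate(samples, sample_rate_hz, tup_ms=8.0, stages=2):
    """Cascade `stages` one-pole lowpass filters over one NAP channel."""
    # Per-sample decay for a time constant of tup_ms (assumed discretisation).
    decay = math.exp(-1000.0 / (sample_rate_hz * tup_ms))
    out = list(samples)
    for _ in range(stages):
        state = 0.0
        for i, x in enumerate(out):
            state = decay * state + (1.0 - decay) * x
            out[i] = state
    return out

def excitation_patterns(nap, sample_rate_hz, frstep_ms=10.0, tup_ms=8.0, stages=2):
    """Return one spectral slice (one value per channel) every frstep_ms."""
    step = int(round(sample_rate_hz * frstep_ms / 1000.0))
    # Envelope of each channel, computed over the whole signal.
    envelopes = [leaky_integrate(ch, sample_rate_hz, tup_ms, stages) for ch in nap]
    n = len(envelopes[0]) if envelopes else 0
    # Take a slice across channels at the end of every frame step.
    return [[env[t] for env in envelopes] for t in range(step - 1, n, step)]

# Hypothetical two-channel NAP: one second at an assumed 20 kHz sample rate.
nap = [[1.0] * 20000, [0.0] * 20000]
frames = excitation_patterns(nap, sample_rate_hz=20000)
```

.fi
With these assumed inputs, each element of frames is one excitation
pattern, and the list of patterns taken together is the cochleogram.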
.LP
When the sequence of excitation patterns is presented in spectrogram
format, it is referred to as a 'cochleogram' (CGM). The spectrogram
format has time on the abscissa (x-axis), filter centre-frequency on
the ordinate (y-axis), and activity level as the degree of black in
the display. In AIM, the suffix 'cgm' is used to distinguish this
spectral representation from the other spectral representations
provided by the software ('asa' auditory spectral analysis, 'sgm'
auditory spectrogram, and 'epn' excitation pattern).
.LP
The NAP generated by gencgm is the same as that produced by gennap
(manaim gennap). The primary differences are in the display defaults
and the inclusion of the Leaky Integration used to construct the
excitation patterns that form the cochleogram. As a result, this
manual entry is restricted to describing the option values that differ
from those in gennap and the additional options required to control
the Leaky Integration.
.LP
.SH DISPLAY DEFAULTS
.LP
The default values for three of the display options are reset to
produce a spectrographic format rather than a landscape. Specifically,
display=greyscale, bottom=0 and top=2500. The number of channels is
set to 128 for compatibility with the auditory spectrum modules,
genasa and genepn. When using AIM as a preprocessor for speech
recognition, the number of channels would typically be reduced to
between 24 and 32. Use option 'downsample' if it is necessary to
reduce the output to fewer than 24 channels across the speech range.
.LP
.SH COMPRESSION AND LEAKY INTEGRATION
.LP
Compression and lowpass filtering are activated after the neural
encoding stage:
.LP
.SS "Compression"
.PP
Cochleograms are usually produced via the functional route in AIM. In
this case, compress is set to on.
.LP
.TP 13
compress
Logarithmic compressor switch
.RS
Switch. Default: on.
.RE
.RS
.LP
Note: The compressor in the functional route of AIM is logarithmic and
it screens out negative BMM values before compression. This rectifies
the wave during the compression process and so the separate rectify
option is left off.
.RE
.LP
.RS
.LP
Note: The compressor in the physiological route of AIM is an integral
part of the tlf module, so when using this route to produce a
cochleogram, turn off the logarithmic compressor
(i.e. compress=off). The compressor in tlf does not screen out
negative values so it is also important to set rectify=on.
.RE
.RS
.LP
Full-wave rectification is produced if rectify is set to 2. This will
lead to a smoother cochleogram from both the physiological and the
functional versions of AIM.
.RE
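.LP
The option settings in the notes above can be assembled into command
lines of the form given in the SYNOPSIS. The sketch below is a
hypothetical helper, not part of AIM; 'input_wave' is a placeholder
filename, and the options that select the physiological route itself
are not shown because they are not described in this entry.
.nf

```python
def gencgm_command(filename=None, **options):
    """Build an argument list in the `gencgm option=value ... filename` form."""
    args = ["gencgm"]
    for name, value in options.items():
        args.append(f"{name}={value}")
    if filename is not None:
        args.append(filename)
    return args

# Physiological route: logarithmic compressor off, rectification on.
physiological = gencgm_command("input_wave", compress="off", rectify="on")
# Full-wave rectification, with rectify set to 2.
full_wave = gencgm_command("input_wave", rectify=2)

print(" ".join(physiological))
print(" ".join(full_wave))
```

.fi
The resulting lists could be passed to a process launcher or joined
with spaces for use at a shell prompt.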
.LP
.SS "Transduction"
.PP
.LP
.TP 13
transduction
Neural transduction switch (at, meddis, off)
.RS
Switch. Default: at.
.RE
.LP
.SS "Leaky Integration"
.PP
.LP
.TP 13
stages_idt
Number of stages of lowpass filtering
.RS
Default unit: scalar. Default value: 2.
.RE
.TP 13
tup_idt
The time constant for each filter stage
.RS
Default unit: ms. Default value: 8 ms.
.RE
.LP
The Equivalent Rectangular Duration (ERD) of a two-stage lowpass
filter is about 1.6 times the time constant of each stage, or
12.8 ms in the current case.
.TP 13
frstep_epn
The time between successive spectral frames
.RS
Default unit: ms. Default value: 10 ms.
.RE
.LP
With a frstep_epn of 10 ms, gencgm will produce spectral frames at a
rate of 100 per second.
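.LP
The two figures quoted above follow from simple arithmetic on the
default option values. The sketch below reproduces them; the function
names are illustrative only, and the factor of 1.6 is the approximate
two-stage ratio quoted in this entry rather than a value derived here.
.nf

```python
def equivalent_rectangular_duration_ms(tup_ms, factor=1.6):
    """ERD of the two-stage filter, using the approximate factor quoted above."""
    return factor * tup_ms

def frames_per_second(frstep_ms):
    """Spectral frame rate for a frame step given in milliseconds."""
    return 1000.0 / frstep_ms

erd_ms = equivalent_rectangular_duration_ms(8.0)   # default tup_idt of 8 ms
frame_rate = frames_per_second(10.0)               # default frstep_epn of 10 ms
```

.fi
Changing tup_idt or frstep_epn on the command line scales these two
values in the same way.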
.LP
.TP 13
downsample
The time between successive spectral frames.
.RS
Default unit: ms. Default value: 10 ms.
.RE
.LP
Downsample is simply another name for frstep_epn, provided to
facilitate a different mode of thinking about time-series data.
.LP
.SH FILES
.LP
.TP 13
.gencgmrc
The options file for gencgm.
.LP
.SH SEE ALSO
.LP
gensgm, genasa, genepn, gennap, genbmm
.LP
.SH BUGS
.LP
None currently known.
.SH COPYRIGHT
.LP
Copyright (c) Applied Psychology Unit, Medical Research Council, 1995
.LP
Permission to use, copy, modify, and distribute this software without fee
is hereby granted for research purposes, provided that this copyright
notice appears in all copies and in all supporting documentation, and that
the software is not redistributed for any fee (except for a nominal
shipping charge). Anyone wanting to incorporate all or part of this
software in a commercial product must obtain a license from the Medical
Research Council.
.LP
The MRC makes no representations about the suitability of this
software for any purpose. It is provided "as is" without express or
implied warranty.
.LP
THE MRC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL
THE A.P.U. BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES
OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
SOFTWARE.
.LP
.SH ACKNOWLEDGEMENTS
.LP
The AIM software was developed for Unix workstations by John
Holdsworth and Mike Allerhand of the MRC APU, under the direction of
Roy Patterson. The physiological version of AIM was developed by
Christian Giguere. The options handler is by Paul Manson. The revised
SAI module is by Jay Datta. Michael Akeroyd extended the PostScript
facilities and developed the xreview routine for auditory image
cartoons.
.LP
The project was supported by the MRC and grants from the U.K. Defence
Research Agency, Farnborough (Research Contract 2239); the EEC Esprit
BR Programme, Project ACTS (3207); and the U.K. Hearing Research Trust.