comparison man/man1/gensgm.1 @ 0:5242703e91d3 tip
Initial checkin for AIM92 aimR8.2 (last updated May 1997).
author | tomwalters |
---|---|
date | Fri, 20 May 2011 15:19:45 +0100 |
parents | |
children |
-1:000000000000 | 0:5242703e91d3 |
---|---|
1 .TH GENSGM 1 "11 May 1995" | |
2 .LP | |
3 .SH NAME | |
4 .LP | |
5 gensgm \- generate auditory spectrogram | |
6 .LP | |
7 .SH SYNOPSIS | |
8 .LP | |
9 gensgm [ option=value | -option ] [ filename ] | |
10 .LP | |
11 .SH DESCRIPTION | |
12 .LP | |
13 The gensgm module of the AIM software performs a time-domain spectral | |
14 analysis using a bank of auditory filters, and summarises the | |
15 information in an auditory spectrogram, that is, a spectrogram with | |
16 auditory frequency resolution and temporal resolution, rather than the | |
17 fixed frequency and temporal resolution of traditional speech | |
18 preprocessors. The spectral analysis converts the input wave into an | |
19 array of filtered waves, one for each channel of a gammatone auditory | |
20 filterbank. The surface of the array of filtered waves is AIM's | |
21 representation of basilar membrane motion (BMM) as a function of | |
22 time. The auditory spectrogram is a plot of a sequence of spectral | |
23 slices extracted from the envelope of the BMM every 'frstep_epn' | |
24 ms. The envelope is calculated continuously, by rectifying, | |
25 compressing, and lowpass filtering the individual BMM waves as they | |
26 flow from the filterbank. | |
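The envelope calculation described above can be sketched in a few lines of Python for a single channel. This is only an illustration under stated assumptions, not the AIM implementation: the function name, the floor of 1.0 applied before the logarithm, and the use of simple first-order leaky integrators are all hypothetical choices; only the defaults (two stages, 8 ms time constant, 10 ms frame step) come from this manual entry.

```python
import math

def envelope_frames(bmm_wave, sample_rate_hz, tau_ms=8.0, stages=2, frame_step_ms=10.0):
    """Rectify, log-compress, lowpass filter and frame one BMM channel (sketch)."""
    # Screen out negative values and compress logarithmically.
    # The floor of 1.0 (so that silence maps to 0) is an assumption.
    y = [math.log(max(x, 1.0)) for x in bmm_wave]
    # Cascade of first-order leaky integrators, each with time constant tau_ms.
    alpha = 1.0 - math.exp(-1000.0 / (sample_rate_hz * tau_ms))
    for _ in range(stages):
        state = 0.0
        smoothed = []
        for v in y:
            state += alpha * (v - state)
            smoothed.append(state)
        y = smoothed
    # Keep one envelope value every frame_step_ms to form the spectral slices.
    step = max(1, round(sample_rate_hz * frame_step_ms / 1000.0))
    return y[step - 1::step]
```

Applying the function to every channel of the filterbank output and stacking the results gives one column of the spectrogram per frame.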
27 .LP | |
28 The frequency resolution of the analysis varies with the centre | |
29 frequency of the channel as in the auditory system, and the | |
30 distribution of channels across frequency is chosen to match that in | |
31 the auditory system (Patterson and Moore, 1986). Thus, the auditory | |
32 spectrogram is a greyscale plot of the activity in each channel | |
33 (shades of black) as a function of time (the abscissa) and the centre | |
34 frequency of the auditory filter (the ordinate) in ERBs. The | |
35 representation is referred to as an auditory spectrogram (SGM) to | |
36 distinguish it from more traditional spectrograms based on Fourier, | |
37 LPC or cepstral analysis. In AIM, the suffix 'sgm' is used to | |
38 distinguish this spectral representation from the other spectral | |
39 representations provided by the software ('asa' auditory spectral | |
40 analysis, 'cgm' cochleogram, and 'epn' excitation pattern). | |
41 .LP | |
42 The spectral analysis performed by gensgm is the same as that | |
43 performed by genbmm (manaim genbmm). The primary differences are in | |
44 the display defaults and the inclusion of the Compression and Leaky | |
45 Integration modules used to produce the spectral slices that form the | |
46 spectrogram. As a result, this manual entry is restricted to | |
47 describing the option values that differ from those in genbmm and the | |
48 additional options required to control the Compression and Leaky | |
49 Integration. | |
50 .LP | |
51 .SH DISPLAY DEFAULTS | |
52 .LP | |
53 The default values for three of the display options are reset to | |
54 produce a spectrographic format rather than a landscape. Specifically, | |
55 display=greyscale, bottom=0 and top=2500. The number of channels is | |
56 set to 128 for compatibility with the auditory spectrum modules, | |
57 genasa and genepn. When using AIM as a preprocessor for speech | |
58 recognition the number of channels would typically be reduced to | |
59 between 24 and 32. Use option 'downsample' if it is necessary to | |
60 reduce the output to fewer than 24 channels across the speech range. | |
61 .LP | |
62 .SH COMPRESSION AND LEAKY INTEGRATION | |
63 .LP | |
64 Compression and lowpass filtering are activated and the neural | |
65 encoding stage that comes between them is turned off: | |
66 .LP | |
67 .SS "Compression" | |
68 .PP | |
69 Auditory spectra are usually produced via the functional route in | |
70 AIM. In this case, compress is set on. | |
71 .LP | |
72 .TP 13 | |
73 compress | |
74 Logarithmic compressor switch | |
75 .RS | |
76 Switch. Default: on. | |
77 .RE | |
78 .RS | |
79 .LP | |
80 Note: The compressor in the functional route of AIM is logarithmic and | |
81 it screens out negative BMM values before compression. This rectifies | |
82 the wave during the compression process and so the separate rectify | |
83 option is left off. | |
84 .RE | |
85 .LP | |
86 .RS | |
87 .LP | |
88 Note: The compressor in the physiological route of AIM is an integral | |
89 part of the tlf module, so when using this route to produce auditory | |
90 spectra, turn off the logarithmic compressor (i.e. compress=off). The | |
91 compressor in tlf does not screen out negative values so it is also | |
92 important to set rectify=on. | |
93 .RE | |
94 .RS | |
95 .LP | |
96 Full wave rectification is produced if rectify is set to 2. This | |
97 option value leads to smoother spectrograms. It is also useful when | |
98 calculating envelopes with genasa. | |
99 .RE | |
100 .LP | |
101 .SS "Transduction" | |
102 .PP | |
103 .LP | |
104 .TP 13 | |
105 transduction | |
106 Neural transduction switch (at, meddis, off) | |
107 .RS | |
108 Switch. Default: off. | |
109 .RE | |
110 .LP | |
111 .SS "Leaky Integration" | |
112 .PP | |
113 .LP | |
114 .TP 13 | |
115 stages_idt | |
116 Number of stages of lowpass filtering | |
117 .RS | |
118 Default unit: scalar. Default value: 2 | |
119 .RE | |
120 .TP 13 | |
121 tup_idt | |
122 The time constant for each filter stage | |
123 .RS | |
124 Default unit: ms. Default value: 8 ms. | |
125 .RE | |
126 .LP | |
127 The Equivalent Rectangular Duration (ERD) of a two-stage lowpass | |
128 filter is about 1.6 times the time constant of each stage, or | |
129 12.8 ms in the current case. | |
130 .TP 13 | |
131 frstep_epn | |
132 The time between successive spectral frames | |
133 .RS | |
134 Default unit: ms. Default value: 10 ms. | |
135 .RE | |
136 .LP | |
137 With a frstep_epn of 10 ms, gensgm will produce spectral frames at a | |
138 rate of 100 per second. | |
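The two figures quoted above (an ERD of 12.8 ms and 100 frames per second) follow directly from the default option values. A minimal Python sketch of the arithmetic is given here; the function names are hypothetical, and the factor of 1.6 is taken from this manual entry and applies only to the two-stage filter.

```python
def erd_ms(tup_idt_ms, factor=1.6):
    """Approximate ERD of the two-stage lowpass filter, in ms."""
    # factor=1.6 is the value quoted in this manual entry for two stages.
    return factor * tup_idt_ms

def frames_per_second(frstep_epn_ms):
    """Spectral frame rate for a given frame step in ms."""
    return 1000.0 / frstep_epn_ms

print(erd_ms(8.0))              # ERD for the default tup_idt of 8 ms
print(frames_per_second(10.0))  # frame rate for the default frstep_epn of 10 ms
```

With the defaults, the ERD is slightly longer than the frame step, so successive frames draw on overlapping stretches of the envelope.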
139 .LP | |
140 .TP 13 | |
141 downsample | |
142 The time between successive spectral frames. | |
143 .RS | |
144 Default unit: ms. Default value: 10 ms. | |
145 .RE | |
146 .LP | |
147 Downsample is simply another name for frstep_epn, provided to | |
148 facilitate a different mode of thinking about time-series data. | |
149 .LP | |
150 .SH FILES | |
151 .LP | |
152 .TP 13 | |
153 .gensgmrc | |
154 The options file for gensgm. | |
155 .LP | |
156 .SH SEE ALSO | |
157 .LP | |
158 genasa, genbmm, genepn, gencgm | |
159 .LP | |
160 .SH BUGS | |
161 .LP | |
162 None currently known. | |
163 .SH COPYRIGHT | |
164 .LP | |
165 Copyright (c) Applied Psychology Unit, Medical Research Council, 1995 | |
166 .LP | |
167 Permission to use, copy, modify, and distribute this software without fee | |
168 is hereby granted for research purposes, provided that this copyright | |
169 notice appears in all copies and in all supporting documentation, and that | |
170 the software is not redistributed for any fee (except for a nominal | |
171 shipping charge). Anyone wanting to incorporate all or part of this | |
172 software in a commercial product must obtain a license from the Medical | |
173 Research Council. | |
174 .LP | |
175 The MRC makes no representations about the suitability of this | |
176 software for any purpose. It is provided "as is" without express or | |
177 implied warranty. | |
178 .LP | |
179 THE MRC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING | |
180 ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL | |
181 THE A.P.U. BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES | |
182 OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, | |
183 WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, | |
184 ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS | |
185 SOFTWARE. | |
186 .LP | |
187 .SH ACKNOWLEDGEMENTS | |
188 .LP | |
189 The AIM software was developed for Unix workstations by John | |
190 Holdsworth and Mike Allerhand of the MRC APU, under the direction of | |
191 Roy Patterson. The physiological version of AIM was developed by | |
192 Christian Giguere. The options handler is by Paul Manson. The revised | |
193 SAI module is by Jay Datta. Michael Akeroyd extended the postscript | |
194 facilities and developed the xreview routine for auditory image | |
195 cartoons. | |
196 .LP | |
197 The project was supported by the MRC and grants from the U.K. Defence | |
198 Research Agency, Farnborough (Research Contract 2239); the EEC Esprit | |
199 BR Programme, Project ACTS (3207); and the U.K. Hearing Research Trust. | |
200 |