comparison docs/aimInstructions @ 0:5242703e91d3 tip

Initial checkin for AIM92 aimR8.2 (last updated May 1997).
author tomwalters
date Fri, 20 May 2011 15:19:45 +0100
1 INSTRUCTIONS AND OPTIONS FOR THE AUDITORY IMAGE MODEL.
2
3 This is the introduction for those wanting to use AIM to process waves
4 once the software package has been compiled, installed and tested. It
5 is assumed that the user has read the introductory article by
6 Patterson, Allerhand and Giguere (1995) and/or viewed the Overview of
7 the Auditory Image Model described at the beginning of ReadMe_bin.
8
9 Begin by typing gen -help at the prompt in an xterm window.
10
11 > gen -help
12
13 This prints general usage information on the standard output as follows:
14
15 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
16 "AIM MRC-APU Release R6.28 [gen] Tue Apr 18 17:35:15 1995"
17
18 Usage: gen??? [options] [file_name]
19
20 where ??? is one of the following abbreviations
21
22 wav wave
23 bmm basilar membrane motion
24 nap neural activity pattern
25 sai stabilized auditory image
26 spl spiral version of auditory image
27 sgm spectrogram
28 cgm cochleogram
29 asa auditory spectral analysis
30 epn excitation pattern
31
32 [file_name] is a headerless wave file (2-byte binary integers).
33
34 [options] are the parameters, options and switches that control
35 the AIM instructions and the AIM tools.
36
37 Help with options: gen??? [-help | -help=all | -help=option]
38 Path for options files (.gen???rc) = .:~ (or setenv OPTIONSPATH)
39
40
41 Processes Applied by AIM and Routes Through the Model:
42
43 Processes                          Auditory   Speech   Spectral
44                                    Route      Route    Route
45 --------------------------------   --------   ------   -------
46 Display input wave                 genwav     genwav   genwav
47 Auditory filtering (gtf/tlf)       genbmm
48 Compression and rectification
49 Neural encoding (2D-AT/haircell)   gennap
50 Temporal integration (LP filter)              gensgm   genasa
51                                               gencgm   genepn
52 Strobed temporal integration       gensai
53 Spiral version of auditory image   genspl
54
55
56 Output: gen??? output=on file_name
57
58 Output is written to file: file_name.???
59
60 The format for 2-dimensional output is by columns, with the lowest
61 channel first in each column (bmm, nap, sgm, cgm, asa, epn).
62 The format for auditory image output is by rows, for each image frame
63 in succession, with the row of the lowest channel first (sai, spl).
64
65
66 The Auditory Image Model was developed at the Applied Psychology Unit
67 of the Medical Research Council, 15 Chaucer Road, Cambridge, U.K.
68
69 Copyright(c) Applied Psychology Unit, Medical Research Council, 1988-1995.
70 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
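
The column layout described in the help output for 2-dimensional
files (bmm, nap, sgm, cgm, asa, epn) can be sketched in Python as
follows. This is only an illustration: it assumes the samples have
already been read into a flat list, the function names are
hypothetical and not part of AIM, and the number of channels must be
known from the options used for the analysis.

```python
def split_columns(flat, n_channels):
    """Group a flat sample sequence into columns of n_channels values.

    Each column holds one time point, lowest channel first.
    """
    if n_channels <= 0 or len(flat) % n_channels != 0:
        raise ValueError("length must be a positive multiple of n_channels")
    return [list(flat[i:i + n_channels])
            for i in range(0, len(flat), n_channels)]


def channel_trace(flat, n_channels, channel):
    """Return the time series of one channel (0 = lowest channel)."""
    return [column[channel] for column in split_columns(flat, n_channels)]


# Made-up data: three channels, two time points.
flat = [10, 20, 30, 11, 21, 31]
columns = split_columns(flat, 3)
lowest = channel_trace(flat, 3, 0)
print(columns)
print(lowest)
```

Auditory image files (sai, spl) are described as being written by
rows within each frame instead, so the same grouping would not apply
to them without change.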
71
72
73 I. USAGE: INSTRUCTIONS, WAVE FILES, AND OPTIONS.
74
75 Instructions: The auditory processing is performed by a single large
76 program (gen) with multiple entry points, having names of the form
77 gen???. The entry-point names function as instructions, so the
78 program is run by typing an entry point name on the command line,
79 followed by options that control the details of the processing, and
80 then the name of the file containing the waveform to be processed.
81 The output will be displayed on the screen in a window created by AIM.
82 For example, in the directory aim/bin, type:
83
84 > genwav samplerate=22.05kHz start_wav=182ms length_wav=64 ../waves/hat
85
86 This will display a section of the wave in the file 'hat' which is
87 stored in the aim/waves directory. The wave was digitised at a
88 sampling rate of 22.05 kHz, so the option 'samplerate' is set to
89 22.05kHz. The options 'start_wav' and 'length_wav' determine the
90 subset of the wave to be displayed.
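
As a rough check on what 'start_wav' and 'length_wav' select, the
times in the example can be converted to sample counts. The Python
sketch below assumes that length_wav=64 is in milliseconds (the
default time unit) and that times are rounded to the nearest sample;
the rounding rule actually used by AIM is not stated here.

```python
SAMPLERATE_HZ = 22050.0   # samplerate=22.05kHz
START_MS = 182.0          # start_wav=182ms
LENGTH_MS = 64.0          # length_wav=64, assumed to be milliseconds


def ms_to_samples(duration_ms, samplerate_hz):
    """Convert a time in ms to a whole number of samples (assumed rounding)."""
    return int(round(duration_ms * samplerate_hz / 1000.0))


first_sample = ms_to_samples(START_MS, SAMPLERATE_HZ)
n_samples = ms_to_samples(LENGTH_MS, SAMPLERATE_HZ)
byte_offset = 2 * first_sample   # 2-byte samples, headerless file

print(first_sample, n_samples, byte_offset)
```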
91
92 Each of the entry points into gen has a name which is an abbreviation
93 of the functions provided to that point in the auditory image model.
94 For example, the first module applies spectral analysis to the input
95 wave and the output of the analysis is available at entry point
96 'genbmm' which stands for 'generate basilar membrane motion'. The
97 entry points are listed in the usage section of the gen -help output
98 and there is a manual entry for each, accessible via 'manaim
99 gen???'. Together, this set of manual pages constitutes the primary
100 documentation for AIM and they are the first place to look for help
101 after this document.
102
103 Wave Files: The input waves should be in the form of headerless binary
104 files (2-byte integers). Use genwav to check that the bytes are in the
105 right order. If they are not, set option swap=on and AIM will reverse
106 the bytes, temporarily, before beginning the analysis. Alternatively,
107 use the aimtool 'swab' to switch the bytes in the file permanently.
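
For readers who want to inspect a wave file outside AIM, the
following Python sketch reads a headerless file of 2-byte integers
and optionally reverses the bytes of each sample in memory, in the
spirit of swap=on. The function name and the temporary test file are
hypothetical; nothing here is AIM code.

```python
import array
import os
import tempfile


def read_headerless_wave(path, swap=False):
    """Read a headerless file of 2-byte integers in the machine's byte order.

    swap=True reverses the two bytes of every sample after reading.
    The file on disk is left unchanged.
    """
    samples = array.array("h")
    if samples.itemsize != 2:
        raise RuntimeError("this sketch needs a platform with 2-byte shorts")
    with open(path, "rb") as handle:
        data = handle.read()
    samples.frombytes(data[: len(data) - len(data) % 2])  # drop a trailing odd byte
    if swap:
        samples.byteswap()
    return samples


# Demonstration on a temporary file (made-up data, not an AIM wave).
with tempfile.NamedTemporaryFile(suffix=".raw", delete=False) as tmp:
    tmp.write(array.array("h", [1, -2, 300]).tobytes())
plain = read_headerless_wave(tmp.name).tolist()
swapped = read_headerless_wave(tmp.name, swap=True).tolist()
os.remove(tmp.name)

print(plain)
print(swapped)
```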
108
109 A. Options:
110
111 The precise behaviour of the program is determined by options. There
112 are AUDITORY OPTIONS which control the operation of the auditory
113 processing modules in AIM, such as the number of channels in the
114 auditory filterbank (channels_afb) or the decay rate of the auditory
115 image (decay_ai). These options are described in the manual pages for
116 the individual instructions (gen???). There are also NON-AUDITORY
117 OPTIONS which a) specify the form of the input wave (e.g. samplerate
118 and swap_wav), b) govern the position and characteristics of the
119 output display and its contents, and c) specify the form and
120 destination of the output. These options are common to all entry
121 points and so they appear at the head of the options list before the
122 auditory options. For convenience, they are described in the manual
123 entry for genwav, the starting point for any analysis.
124
125 The relevant options at any entry point can be listed on the screen by
126 typing the entry point name with the -help option. For example,
127
128 > genwav -help
129
130 If the screen size is too small to display the entire list you can make the
131 display pause between screens by typing:
132
133 > genwav -help | more
134
135 Alternatively, you can print the listing using
136
137 > genwav -help | lpr
138
139
140 All of the options from title up to dB_wave are for display and flow
141 control. They determine how the program output appears on the screen,
142 the source of the program input, and the destination of output. The
143 options have default values chosen for 'normal' operation and display
144 given the entry point. You can override the default values, either
145 temporarily, by supplying a value on the command line when you run the
146 program, or permanently by creating a file with your own option
147 values. Both the default value and the current value for each option
148 are listed by the help option.
149
150
151 B. Option Values:
152
153 Some of the options take numeric values (as in samplerate), others are
154 file names (as in input_wave). Others act as simple switches or flags
155 (e.g. swap_wav); these options can be set/unset using on/off or 0/1.
156
157 Numeric options can include appropriate units; so, for example, a
158 frequency can be specified either as 20000Hz or as 20kHz. There are
159 also default units for options. For example, Hz is the default unit
160 for frequencies and ms is the default unit for times. When the
161 default unit applies, it need not be specified on the command
162 line.
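
The unit convention can be illustrated with a small Python sketch. It
is not the AIM options handler: the function names are hypothetical,
only the units mentioned in this document (Hz, kHz, ms) are
recognised, and special values such as 'remainder' are not handled.

```python
import re

_FREQ_UNITS = {"": 1.0, "hz": 1.0, "khz": 1000.0}   # result in Hz
_TIME_UNITS = {"": 1.0, "ms": 1.0}                  # result in ms

_VALUE_RE = re.compile(r"^\s*([-+]?(?:\d+(?:\.\d*)?|\.\d+))\s*([A-Za-z]*)\s*$")


def _parse(text, units, kind):
    """Split text into number and unit, then scale to the default unit."""
    match = _VALUE_RE.match(text)
    if match is None:
        raise ValueError("not a numeric %s value: %r" % (kind, text))
    unit = match.group(2).lower()
    if unit not in units:
        raise ValueError("unknown %s unit: %r" % (kind, match.group(2)))
    return float(match.group(1)) * units[unit]


def parse_frequency(text):
    """Return a frequency in Hz; a bare number is taken to be Hz."""
    return _parse(text, _FREQ_UNITS, "frequency")


def parse_time(text):
    """Return a time in ms; a bare number is taken to be ms."""
    return _parse(text, _TIME_UNITS, "time")


print(parse_frequency("20kHz"), parse_frequency("20000Hz"), parse_frequency("20000"))
print(parse_time("182ms"), parse_time("64"))
```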
163
164 There are also some special option values like 'remainder' which can be
165 used to specify 'the rest of the wave' with option 'length_wave', and
166 the special value 'centre' used with options x0_win and y0_win, to
167 centre the display window on the screen. (This assumes that your
168 window manager does not override placement of windows by
169 applications.)
170
171
172 C. Changing Option Values:
173
174 You can alter the value of one or more of the options by specifying it
175 and the required value on the command line between the entry point
176 name and the name of the input file. For example:
177
178 > genwav width_win=440 ../waves/hat
179
180 will produce a waveform display for the wave in file hat with a width of
181 440 screen pixels.
182
183 Switches and flags that just take the values on and off, or 0 and 1, can be
184 switched on by specifying the option name, prefixed by a minus sign to
185 distinguish them from a file name. For example,
186
187 > genwav -swap ../waves/hat_br
188
189
190
191 II. PROCESSES APPLIED BY AIM AND ROUTES THROUGH THE MODEL
192
193 There are three routes through AIM depending on the purpose of the
194 analysis and the form in which the output is to be displayed. The
195 routes and the set of processes they apply are shown in the output of
196 gen -help.
197
198 All of the routes through AIM begin with genwav which displays the
199 contents of the file to be analysed as a magnitude versus time
200 plot. It is recommended that all analyses should begin with genwav in
201 order to a) confirm that the file does indeed contain the wave you
202 wish to analyse, b) confirm that the file is headerless and that the
203 bytes are in the right order, c) choose a subsection of the wave for
204 analysis, and d) check whether it is a 12-bit wave or a 16-bit
205 wave. Headers usually appear as a brief bit of black, or noise with a
206 large amplitude, at the left-hand edge of the plot. If the bytes are
207 in the wrong order the entire wave usually looks like noise with a
208 large amplitude.
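
The reason a byte-reversed wave looks like large-amplitude noise can
be seen with a few lines of Python. The sketch below builds a quiet
sinusoid (peak value 100, an arbitrary choice), reverses the bytes of
every 2-byte sample, and compares the peak values. It is an
illustration only and does not use AIM.

```python
import array
import math

# A quiet test tone: 64 samples, peak value 100 (arbitrary made-up data).
quiet = array.array(
    "h", [int(round(100 * math.sin(2 * math.pi * n / 32))) for n in range(64)]
)

# Reverse the two bytes of every sample, as happens when the byte order is wrong.
reversed_bytes = array.array("h", quiet)
reversed_bytes.byteswap()

peak_quiet = max(abs(s) for s in quiet)
peak_reversed = max(abs(s) for s in reversed_bytes)

print(peak_quiet, peak_reversed)
```

The low-order byte of each sample, which changes rapidly from sample
to sample, becomes the high-order byte, so even a quiet wave fills
most of the 16-bit range.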
209
210 The AUDITORY ROUTE simulates basilar membrane motion (BMM), neural
211 activity patterns (NAPs), and auditory images either in rectangular
212 format (SAI) or in spiral format (SPL). Its purpose is to support
213 time-domain modelling of peripheral auditory processing; that is, to
214 simulate the phase-locked, time-interval patterns produced by the
215 cochlea and the conversion of the time-interval patterns into auditory
216 images by strobed temporal integration or autocorrelation (Patterson
217 et al., 1992; Meddis and Hewitt, 1991). For demonstrations of the
218 functional and physiological versions of the auditory route through
219 AIM try
220
221 > aimdemo_gtf_all ../waves/cegc and
222 > aimdemo_tlf_all ../waves/cegc
223
224 [[[ Does the SAI work for the physiological route? Does it use acgram? ]]]
225
226
227 The SPEECH ROUTE produces auditory spectrograms and cochleograms which
228 can be stored and used as input for automatic speech recognition. They
229 are 'auditory' spectrograms and cochleograms inasmuch as the centre
230 frequencies of the channels in the filterbank are equally spaced on an
231 ERB-scale, or Bark scale, rather than being linearly spaced as in
232 traditional preprocessors (Robinson et al., 1990; Giguere and Woodland,
233 1994; Patterson et al., 1994). There are currently no aimdemo scripts
234 for the speech routes through AIM.
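
To show what 'equally spaced on an ERB-scale' means in practice, the
Python sketch below spaces centre frequencies evenly on the ERB-rate
scale of Glasberg and Moore (1990). This formula is an assumption
made for illustration; the scale and constants used inside AIM may
differ. The frequency range and the number of channels are also
made-up values, not AIM defaults.

```python
import math


def hz_to_erb_rate(f_hz):
    """ERB-rate (number of ERBs below f_hz), Glasberg and Moore (1990)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37


def erb_spaced_centres(f_min_hz, f_max_hz, n_channels):
    """Centre frequencies equally spaced in ERB-rate, lowest channel first."""
    if n_channels < 2:
        raise ValueError("need at least two channels")
    low = hz_to_erb_rate(f_min_hz)
    high = hz_to_erb_rate(f_max_hz)
    step = (high - low) / (n_channels - 1)
    return [erb_rate_to_hz(low + i * step) for i in range(n_channels)]


# Hypothetical settings: 8 channels between 100 Hz and 6 kHz.
centres = erb_spaced_centres(100.0, 6000.0, 8)
for cf in centres:
    print("%8.1f Hz" % cf)
```

The channels are close together at low frequencies and progressively
further apart at high frequencies, in contrast to the linear spacing
of a conventional spectrogram.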
235
236
237 The SPECTRAL ROUTE produces representations of the distribution of
238 activity across frequency in the auditory system, either at the output
239 of the filterbank (genasa) or the output of the full cochlea
240 simulation (genepn). These representations have been variously
241 referred to as 'excitation patterns' (Zwicker, 1974; Moore and
242 Glasberg, 1983), 'central auditory spectra' (Srulovicz and Goldstein,
243 1983), or simply 'auditory spectra' (Patterson, 1994). In AIM, to
244 distinguish the spectral representation at the output of the
245 filterbank, from the representation provided at the output of the full
246 cochlea simulation, the former is referred to as an 'auditory spectral
247 analysis' (genasa) and the latter is referred to as an 'excitation
248 pattern' (genepn). For demonstrations of the functional and
249 physiological versions of the spectral route through AIM, try
250
251 > aimdemo_gtf_spectra ../waves/cegc and
252 > aimdemo_tlf_spectra ../waves/cegc
253
254
255
256 III. INDIVIDUAL INSTRUCTIONS FOR THE AUDITORY ROUTE THROUGH AIM
257
258 The following is a list of the instructions that form the basis of
259 aimdemo_gtf_all. They show the output at successive stages of the
260 functional version of the auditory route through AIM.
261
262 > genwav length=32ms ../waves/cegc
263 > genstp length=32ms ../waves/cegc
264 > genbmm length=32ms ../waves/cegc
265 > gennap length=32ms ../waves/cegc
266 > gensai start=200 length=300ms ../waves/cegc
267 > genspl start=200 length=300ms pensize=2 ../waves/cegc
268
269
270 genwav shows the time waveform,
271 genstp shows the pressure wave in the middle ear bone
272 that drives the cochlea (the stapes).
273 genbmm shows simulated basilar membrane motion,
274 gennap shows a simulated neural activity pattern,
275 gensai shows a simulated stabilized auditory image,
276 genspl shows a spiral mapping of the auditory image.
277
278
279 Each of the instructions should create an X window and present a wave,
280 a landscape display, or a spiral display of cegc. The stimulus files
281 were created on a DECstation. On a SUN, swap the bytes on the command
282 line. For example:
283
284 > genwav swap=on length=32ms ../waves/cegc
285
286 The wave in the file 'cegc' is a set of click trains for the notes of
287 a major triad and the octave -- hence the name C-E-G-C. We use click
288 trains as test and demonstration stimuli for several reasons: They
289 activate all channels of the model and they do so with roughly equal
290 energy. The clicks elicit impulse responses from the model which show
291 the processing in its simplest broadband form. The simple temporal
292 structure of the click train immediately reveals any temporal
293 alignment problems in the software.
294
295 For the first 300 ms, the inter-click interval in cegc is 8 ms; then
296 for 300 ms, the inter-click interval decreases linearly to (4/5)*8
297 ms. For convenience, we refer to the first note as 'C', and so,
298 relative to this C, the glide takes the note up to 'E'. The gliding
299 portions of cegc illustrate the dynamic properties of AIM. The
300 inter-click interval stays at (4/5)*8 ms for 300 ms and then glides
301 linearly to (2/3)*8 ms which, relative to the starting C, is the note
302 G. The inter-click interval stays at (2/3)*8 ms for 300 ms and then
303 glides linearly to (1/2)*8 ms which is the final note and the octave
304 of the original C.
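
The arithmetic behind the note names can be checked with a short
Python sketch. It converts the four steady inter-click intervals into
click rates and evaluates the first glide, which is stated to last
300 ms. The names used are hypothetical, and the durations of the
later glides are not assumed.

```python
from fractions import Fraction

BASE_ICI_MS = Fraction(8)
INTERVAL_RATIOS = [
    ("C", Fraction(1)),
    ("E", Fraction(4, 5)),
    ("G", Fraction(2, 3)),
    ("C (octave)", Fraction(1, 2)),
]

# Click rate in Hz is 1000 divided by the inter-click interval in ms.
click_rates_hz = {}
for name, ratio in INTERVAL_RATIOS:
    interval_ms = BASE_ICI_MS * ratio
    click_rates_hz[name] = float(1000 / interval_ms)
    print("%-10s interval %6.3f ms  rate %7.2f Hz"
          % (name, float(interval_ms), click_rates_hz[name]))


def ici_first_600ms(t_ms):
    """Inter-click interval (ms) during the first 600 ms of cegc."""
    if not 0 <= t_ms <= 600:
        raise ValueError("only the first 600 ms are modelled here")
    if t_ms < 300:
        return BASE_ICI_MS                      # steady C
    end = BASE_ICI_MS * Fraction(4, 5)          # target interval for E
    return BASE_ICI_MS + (end - BASE_ICI_MS) * Fraction(t_ms - 300, 300)


midway = float(ici_first_600ms(450))
print(midway)
```

The rates are in the ratios 1 : 5/4 : 3/2 : 2, which are the
frequency ratios of a just-intoned major triad and its octave.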
305
306
307 IV. INSTRUCTION AND OPTION SYNTAX
308
309 > gen??? [-option -option=value option=value ... -update] inputfile
310
311 The input file is assumed to be a headerless binary file; that is, it
312 is assumed to contain short integers (2-byte words).
313
314
315 The options handler accepts the following three formats:
316
317 -option # turns option "on" (Unix convention)
318 option=value # sets option to the string "value" (standard math)
319 -option=value # sets option to the string "value" (mixed unix/math)
320
321 So, -swap, swap=on, and -swap=on, all have the same effect.
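
The three accepted formats can be summarised by the following Python
sketch. It is not the AIM options handler, only an illustration of
the syntax rules listed above. The function name is hypothetical, and
it is assumed that any argument with neither a leading '-' nor an '='
is the input file.

```python
def parse_command_line(args):
    """Split gen???-style arguments into an options dict and an input file."""
    options = {}
    input_file = None
    for arg in args:
        body = arg[1:] if arg.startswith("-") else arg
        if "=" in body:
            # option=value or -option=value
            name, value = body.split("=", 1)
            options[name] = value
        elif arg.startswith("-"):
            # -option turns the option on
            options[body] = "on"
        else:
            # assumed to be the input file name
            input_file = arg
    return options, input_file


for form in ("-swap", "swap=on", "-swap=on"):
    print(parse_command_line([form, "../waves/hat"]))

print(parse_command_line(
    ["samplerate=22.05kHz", "start_wav=182ms", "length_wav=64", "../waves/hat"]
))
```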
322
323 The options handler also recognises two special options: update and help.
324
325 -update (or update=on)
326
327 This causes the options and values on the command line to be stored
328 in an 'options file' that is named according to the Unix convention
329 ".<program_name>rc". All of the gen??? instructions search for a
330 file of this form when they are executed, so once an option has been
331 'updated' with a specific value, that value will be used in all
332 subsequent occurrences of that instruction in that directory. (Note:
333 The .gen???rc files are 'hidden' because the Unix command 'ls' does not
334 print files beginning with '.' unless the ls option '-a' is
335 present.)
336
337 CAUTION: Be careful with options files in the user's home
338 directory. They will be invoked by instructions executed in any
339 directory that does not have its own specific options file for that
340 instruction and this can be very confusing.
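
The search behaviour behind this caution can be sketched in Python.
The sketch assumes the default search path '.:~' shown in the gen
-help output and assumes that the first matching file wins. The
function name is hypothetical, and the demonstration uses temporary
directories in place of a real working directory and home directory.

```python
import os
import tempfile


def find_options_file(program, search_path=".:~"):
    """Return the first .<program>rc found along a colon-separated path."""
    for directory in search_path.split(":"):
        candidate = os.path.join(os.path.expanduser(directory),
                                 "." + program + "rc")
        if os.path.isfile(candidate):
            return candidate
    return None


with tempfile.TemporaryDirectory() as root:
    work = os.path.join(root, "work")   # stands in for the current directory
    home = os.path.join(root, "home")   # stands in for the home directory
    os.mkdir(work)
    os.mkdir(home)
    search = work + ":" + home

    # Only the home directory has an options file: it is picked up.
    open(os.path.join(home, ".genbmmrc"), "w").close()
    picked_home = find_options_file("genbmm", search) == os.path.join(home, ".genbmmrc")

    # Once the working directory has its own file, that one takes over.
    open(os.path.join(work, ".genbmmrc"), "w").close()
    picked_work = find_options_file("genbmm", search) == os.path.join(work, ".genbmmrc")

    nothing_found = find_options_file("gensai", search) is None

print(picked_home, picked_work, nothing_found)
```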
341
342 -help (or help=on)
343
344 This causes the current options file to be printed on the screen (stdout).
345
346 The help option accepts any option name as an argument, and prints the
347 value of that single option on the screen. So
348
349 > genbmm help=channels
350
351 will cause the current number of channels to be printed on the screen.
352
353 The help option also accepts the string 'all'.
354
355 > genbmm help=all
356
357 will cause all options required for BMM output to be printed on the
358 screen (stdout).
359
360 The options and values used in an analysis of BMM can be recorded
361 in the file foo with the following instruction:
362
363 > genbmm help=all > foo
364
365