comparison docs/aimInstructions @ 0:5242703e91d3 tip
Initial checkin for AIM92 aimR8.2 (last updated May 1997).
author | tomwalters |
---|---|
date | Fri, 20 May 2011 15:19:45 +0100 |
parents | |
children |
-1:000000000000 | 0:5242703e91d3 |
---|---|
1 INSTRUCTIONS AND OPTIONS FOR THE AUDITORY IMAGE MODEL. | |
2 | |
3 This is the introduction for those wanting to use AIM to process waves | |
4 once the software package has been compiled, installed and tested. It | |
5 is assumed that the user has read the introductory article by | |
6 Patterson, Allerhand and Giguere (1995) and/or viewed the Overview of | |
7 the Auditory Image Model described at the beginning of ReadMe_bin. | |
8 | |
9 Begin by typing gen -help at the prompt in an xterm window. | |
10 | |
11 > gen -help | |
12 | |
13 This prints general usage information on the standard output as follows: | |
14 | |
15 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | |
16 "AIM MRC-APU Release R6.28 [gen] Tue Apr 18 17:35:15 1995" | |
17 | |
18 Usage: gen??? [options] [file_name] | |
19 | |
20 where ??? is one of the following abbreviations | |
21 | |
22 wav wave | |
23 bmm basilar membrane motion | |
24 nap neural activity pattern | |
25 sai stabilized auditory image | |
26 spl spiral version of auditory image | |
27 sgm spectrogram | |
28 cgm cochleogram | |
29 asa auditory spectral analysis | |
30 epn excitation pattern | |
31 | |
32 [file_name] is a headerless wave file (2-byte binary integers). | |
33 | |
34 [options] are the parameters, options and switches that control | |
35 the AIM instructions and the AIM tools. | |
36 | |
37 Help with options: gen??? [-help | -help=all | -help=option] | |
38 Path for options files (.gen???rc) = .:~ (or setenv OPTIONSPATH) | |
39 | |
40 | |
41 Processes Applied by AIM and Routes Through the Model: | |
42 | |
43 Processes Auditory Speech Spectral | |
44 Route Route Route | |
45 -------------------------------- -------- ------ ------- | |
46 Display input wave genwav genwav genwav | |
47 Auditory filtering (gtf/tlf) genbmm | |
48 Compression and rectification | |
49 Neural encoding (2D-AT/haircell) gennap | |
50 Temporal integration (LP filter) gensgm genasa | |
51 gencgm genepn | |
52 Strobed temporal integration gensai | |
53 Spiral version of auditory image genspl | |
54 | |
55 | |
56 Output: gen??? output=on file_name | |
57 | |
58 Output is written to file: file_name.??? | |
59 | |
60 The format for 2-dimensional output is by columns, with the lowest | |
61 channel first in each column (bmm, nap, sgm, cgm, asa, epn). | |
62 The format for auditory image output is by rows, for each image frame | |
63 in succession, with the row of the lowest channel first (sai, spl). | |
64 | |
65 | |
66 The Auditory Image Model was developed at the Applied Psychology Unit | |
67 of the Medical Research Council, 15 Chaucer Road, Cambridge, U.K. | |
68 | |
69 Copyright(c) Applied Psychology Unit, Medical Research Council, 1988-1995. | |
70 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - | |
71 | |
72 | |
73 I. USAGE: INSTRUCTIONS, WAVE FILES, AND OPTIONS. | |
74 | |
75 Instructions: The auditory processing is performed by a single large | |
76 program (gen) with multiple entry points, having names of the form | |
77 gen???. The entry-point names function as instructions, so the | |
78 program is run by typing an entry point name on the command line, | |
79 followed by options that control the details of the processing, and | |
80 then the name of the file containing the waveform to be processed. | |
81 The output will be displayed on the screen in a window created by AIM. | |
82 For example, in the directory aim/bin, type: | |
83 | |
84 > genwav samplerate=22.05kHz start_wav=182ms length_wav=64 ../waves/hat | |
85 | |
86 This will display a section of the wave in the file 'hat' which is | |
87 stored in the aim/waves directory. The wave was digitised at a | |
88 sampling rate of 22.05 kHz, so the option 'samplerate' is set to | |
89 22.05kHz. The options 'start_wav' and 'length_wav' determine the | |
90 subset of the wave to be displayed. | |
91 | |
92 Each of the entry points into gen has a name which is an abbreviation | |
93 of the functions provided to that point in the auditory image model. | |
94 For example, the first module applies spectral analysis to the input | |
95 wave and the output of the analysis is available at entry point | |
96 'genbmm' which stands for 'generate basilar membrane motion'. The | |
97 entry points are listed in the usage section of the gen -help output | |
98 and there is a manual entry for each, accessible via 'manaim | |
99 gen???'. Together this set of manual pages constitutes the primary | |
100 documentation for AIM and they are the first place to look for help | |
101 after this document. | |
102 | |
103 Wave Files: The input waves should be in the form of headerless binary | |
104 files (2-byte integers). Use genwav to check that the bytes are in the | |
105 right order. If they are not, set option swap=on and AIM will reverse | |
106 the bytes, temporarily, before beginning the analysis. Alternatively, | |
107 use the aimtool 'swab' to switch the bytes in the file permanently. | |
108 | |
109 A. Options: | |
110 | |
111 The precise behaviour of the program is determined by options. There | |
112 are AUDITORY OPTIONS which control the operation of the auditory | |
113 processing modules in AIM, such as the number of channels in the | |
114 auditory filterbank (channels_afb) or the decay rate of the auditory | |
115 image (decay_ai). These options are described in the manual pages for | |
116 the individual instructions (gen???). There are also NON-AUDITORY | |
117 OPTIONS which a) specify the form of the input wave (e.g. samplerate | |
118 and swap_wav), b) govern the position and characteristics of the | |
119 output display and its contents, and c) specify the form and | |
120 destination of the output. These options are common to all entry | |
121 points and so they appear at the head of the options list before the | |
122 auditory options. For convenience, they are described in the manual | |
123 entry for genwav, the starting point for any analysis. | |
124 | |
125 The relevant options at any entry point can be listed on the screen by | |
126 typing the entry point name with the -help option. For example, | |
127 | |
128 > genwav -help | |
129 | |
130 If the screen size is too small to display the entire list you can make the | |
131 display pause between screens by typing: | |
132 | |
133 > genwav -help | more | |
134 | |
135 Alternatively, you can print the listing using: | |
136 | |
137 > genwav -help | lpr | |
138 | |
139 | |
140 All of the options from title up to dB_wave are for display and flow | |
141 control. They determine how the program output appears on the screen, | |
142 the source of the program input, and the destination of output. The | |
143 options have default values chosen for 'normal' operation and display | |
144 given the entry point. You can override the default values, either | |
145 temporarily, by supplying a value on the command line when you run the | |
146 program, or permanently by creating a file with your own option | |
147 values. Both the default value and the current value for each option | |
148 are listed by the help option. | |
149 | |
150 | |
151 B. Option Values: | |
152 | |
153 Some of the options take numeric values (as in samplerate), others are | |
154 file names (as in input_wave). Others act as simple switches or flags | |
155 (e.g. swap_wav); these options can be set/unset using on/off or 0/1. | |
156 | |
157 Numeric options can include appropriate units; so, for example, a | |
158 frequency can be specified either as 20000Hz or as 20kHz. There are | |
159 also default units for options. So, for example, Hz is the default | |
160 unit for frequencies and ms is the default unit for times; when | |
161 these units apply, they need not be specified on the command | |
162 line. | |
163 | |
164 There are also some special option values like 'remainder' which can be | |
165 used to specify 'the rest of the wave' with option 'length_wave', and | |
166 the special value 'centre' used with options x0_win and y0_win, to | |
167 centre the display window on the screen. (This assumes that your | |
168 window manager does not override placement of windows by | |
169 applications.) | |
170 | |
171 | |
172 C. Changing Option Values: | |
173 | |
174 You can alter the value of one or more of the options by specifying it | |
175 and the required value on the command line between the entry point | |
176 name and the name of the input file. For example: | |
177 | |
178 > genwav width_win=440 ../waves/hat | |
179 | |
180 will produce a waveform display for the wave in file hat with a width of | |
181 440 screen pixels. | |
182 | |
183 Switches and flags that just take the values on and off, or 0 and 1, can be | |
184 switched on by specifying the option name, prefixed by a minus sign to | |
185 distinguish them from a file name. For example, | |
186 | |
187 > genwav -swap ../waves/hat_br | |
188 | |
189 | |
190 | |
191 II. PROCESSES APPLIED BY AIM AND ROUTES THROUGH THE MODEL | |
192 | |
193 There are three routes through AIM depending on the purpose of the | |
194 analysis and the form in which the output is to be displayed. The | |
195 routes and the set of processes they apply are shown in the output of | |
196 gen -help. | |
197 | |
198 All of the routes through AIM begin with genwav which displays the | |
199 contents of the file to be analysed as a magnitude versus time | |
200 plot. It is recommended that all analyses should begin with genwav in | |
201 order to a) confirm that the file does indeed contain the wave you | |
202 wish to analyse, b) confirm that the file is headerless and that the | |
203 bytes are in the right order, c) choose a subsection of the wave for | |
204 analysis, and d) check whether it is a 12-bit wave or a 16-bit | |
205 wave. Headers usually appear as a brief bit of black, or noise with a | |
206 large amplitude, at the left-hand edge of the plot. If the bytes are | |
207 in the wrong order the entire wave usually looks like noise with a | |
208 large amplitude. | |
209 | |
210 The AUDITORY ROUTE simulates basilar membrane motion (BMM), neural | |
211 activity patterns (NAPs), and auditory images either in rectangular | |
212 format (SAI) or in spiral format (SPL). Its purpose is to support | |
213 time-domain modelling of peripheral auditory processing; that is, to | |
214 simulate the phase-locked, time-interval patterns produced by the | |
215 cochlea and the conversion of the time-interval patterns into auditory | |
216 images by strobed temporal integration or autocorrelation (Patterson | |
217 et al., 1992; Meddis and Hewitt, 1991). For demonstrations of the | |
218 functional and physiological versions of the auditory route through | |
219 AIM try | |
220 | |
221 > aimdemo_gtf_all ../waves/cegc and | |
222 > aimdemo_tlf_all ../waves/cegc | |
223 | |
224 [[[ Does the SAI work for the physiological route? Does it use acgram? ]]] | |
225 | |
226 | |
227 The SPEECH ROUTE produces auditory spectrograms and cochleograms which | |
228 can be stored and used as input for automatic speech recognition. They | |
229 are 'auditory' spectrograms and cochleograms inasmuch as the centre | |
230 frequencies of the channels in the filterbank are equally spaced on an | |
231 ERB-scale, or Bark scale, rather than being linearly spaced as in | |
232 a traditional preprocessor (Robinson et al., 1990; Giguere and Woodland, | |
233 1994; Patterson et al., 1994). There are currently no aimdemo scripts | |
234 for the speech routes through AIM. | |
235 | |
236 | |
237 The SPECTRAL ROUTE produces representations of the distribution of | |
238 activity across frequency in the auditory system, either at the output | |
239 of the filterbank (genasa) or the output of the full cochlea | |
240 simulation (genepn). These representations have been variously | |
241 referred to as 'excitation patterns' (Zwicker, 1974; Moore and | |
242 Glasberg, 1983), 'central auditory spectra' (Srulovicz and Goldstein, | |
243 1983), or simply 'auditory spectra' (Patterson, 1994). In AIM, to | |
244 distinguish the spectral representation at the output of the | |
245 filterbank, from the representation provided at the output of the full | |
246 cochlea simulation, the former is referred to as an 'auditory spectral | |
247 analysis' (genasa) and the latter is referred to as an 'excitation | |
248 pattern' (genepn). For demonstrations of the functional and | |
249 physiological versions of the spectral route through AIM, try | |
250 | |
251 > aimdemo_gtf_spectra ../waves/cegc and | |
252 > aimdemo_tlf_spectra ../waves/cegc | |
253 | |
254 | |
255 | |
256 III. INDIVIDUAL INSTRUCTIONS FOR THE AUDITORY ROUTE THROUGH AIM | |
257 | |
258 The following is a list of the instructions that form the basis of | |
259 aimdemo_gtf_all. They show the output at successive stages of the | |
260 functional version of the auditory route through AIM. | |
261 | |
262 > genwav length=32ms ../waves/cegc | |
263 > genstp length=32ms ../waves/cegc | |
264 > genbmm length=32ms ../waves/cegc | |
265 > gennap length=32ms ../waves/cegc | |
266 > gensai start=200 length=300ms ../waves/cegc | |
267 > genspl start=200 length=300ms pensize=2 ../waves/cegc | |
268 | |
269 | |
270 genwav shows the time waveform, | |
271 genstp shows the pressure wave in the middle ear bone | |
272 that drives the cochlea (the stapes). | |
273 genbmm shows simulated basilar membrane motion, | |
274 gennap shows a simulated neural activity pattern, | |
275 gensai shows a simulated stabilized auditory image, | |
276 genspl shows a spiral mapping of the auditory image. | |
277 | |
278 | |
279 Each of the instructions should create an X window and present a wave, | |
280 a landscape display, or a spiral display of cegc. The stimulus files | |
281 were created on a DECstation. On a SUN, swap the bytes on the command | |
282 line. For example: | |
283 | |
284 > genwav swap=on length=32ms ../waves/cegc | |
285 | |
286 The wave in the file 'cegc' is a set of click trains for the notes of | |
287 a major triad and the octave -- hence the name C-E-G-C. We use click | |
288 trains as test and demonstration stimuli for several reasons: They | |
289 activate all channels of the model and they do so with roughly equal | |
290 energy. The clicks elicit impulse responses from the model which show | |
291 the processing in its simplest broadband form. The simple temporal | |
292 structure of the click train immediately reveals any temporal | |
293 alignment problems in the software. | |
294 | |
295 For the first 300 ms, the inter-click interval in cegc is 8 ms; then | |
296 for 300 ms, the inter-click interval decreases linearly to (4/5)*8 | |
297 ms. For convenience, we refer to the first note as 'C', and so, | |
298 relative to this C, the glide takes the note up to 'E'. The gliding | |
299 portions of cegc illustrate the dynamic properties of AIM. The | |
300 inter-click interval stays at (4/5)*8 ms for 300 ms and then glides | |
301 linearly to (2/3)*8 ms which, relative to the starting C, is the note | |
302 G. The inter-click interval stays at (2/3)*8 ms for 300 ms and then | |
303 glides linearly to (1/2)*8 ms which is the final note and the octave | |
304 of the original C. | |
305 | |
306 | |
307 IV. INSTRUCTION AND OPTION SYNTAX | |
308 | |
309 > gen??? [-option -option=value option=value ... -update] inputfile | |
310 | |
311 The input file is assumed to be a headerless binary file; that is, it | |
312 is assumed to contain short integers (2-byte words). | |
313 | |
314 | |
315 The options handler accepts the following three formats: | |
316 | |
317 -option # turns option "on" (Unix convention) | |
318 option=value # sets option to the string "value" (standard math) | |
319 -option=value # sets option to the string "value" (mixed unix/math) | |
320 | |
321 So, -swap, swap=on, and -swap=on, all have the same effect. | |
322 | |
323 The options handler also recognises two special options: update and help. | |
324 | |
325 -update (or update=on) | |
326 | |
327 This causes the options and values on the command line to be stored | |
328 in an 'options file' that is named according to the Unix convention | |
329 ".<program_name>rc". All of the gen??? instructions search for a | |
330 file of this form when they are executed, so once an option has been | |
331 'updated' with a specific value, that value will be used in all | |
332 subsequent occurrences of that instruction in that directory. (Note: | |
333 The .gen???rc files are 'hidden' because the unix command 'ls' does not | |
334 print files beginning with '.' unless the ls option '-a' is | |
335 present.) | |
336 | |
337 CAUTION: Be careful with options files in the user's home | |
338 directory. They will be invoked by instructions executed in any | |
339 directory that does not have its own specific options file for that | |
340 instruction and this can be very confusing. | |
341 | |
342 -help (or help=on) | |
343 | |
344 This causes the current options file to be printed on the screen (stdout). | |
345 | |
346 The help option accepts any option name as an argument, and prints the | |
347 value of that single option on the screen. So | |
348 | |
349 > genbmm help=channels | |
350 | |
351 will cause the current number of channels to be printed on the screen. | |
352 | |
353 The help option also accepts the string 'all'. | |
354 | |
355 > genbmm help=all | |
356 | |
357 will cause all options required for BMM output to be printed on the | |
358 screen (stdout). | |
359 | |
360 The options and values used in an analysis of BMM can be recorded | |
361 in the file foo with the following instruction: | |
362 | |
363 > genbmm help=all > foo | |
364 | |
365 |
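
The byte-order check described under "Wave Files" and in Section II can also be approximated outside AIM before running genwav. The following is a minimal sketch in Python, not part of AIM: the function names read_wave, mean_abs and guess_swap are hypothetical, and the heuristic simply relies on the observation above that a wave read in the wrong byte order usually looks like noise with a large amplitude. It assumes a headerless file of 2-byte signed integers and a platform where a C short is 2 bytes. It does not detect headers.

```python
import array
import sys


def read_wave(path, swap=False):
    """Read a headerless file of 2-byte integers; optionally reverse each byte pair."""
    samples = array.array("h")  # "h" = signed short in the machine's native byte order
    assert samples.itemsize == 2, "this sketch assumes 2-byte shorts"
    with open(path, "rb") as f:
        data = f.read()
    # Ignore a trailing odd byte so the data divides evenly into 2-byte words.
    samples.frombytes(data[: len(data) - len(data) % 2])
    if swap:
        samples.byteswap()
    return samples


def mean_abs(samples):
    """Mean absolute sample value; returns 0 for an empty wave."""
    return sum(abs(s) for s in samples) / max(len(samples), 1)


def guess_swap(path):
    """Heuristic (an assumption, not AIM's method): prefer the byte order
    that gives the smaller mean amplitude, since a wave read in the wrong
    order tends to look like large-amplitude noise."""
    native = mean_abs(read_wave(path))
    swapped = mean_abs(read_wave(path, swap=True))
    return swapped < native


if __name__ == "__main__" and len(sys.argv) > 1:
    if guess_swap(sys.argv[1]):
        print("bytes look reversed on this machine; try swap=on")
    else:
        print("byte order looks right on this machine")
```

If the script suggests that the bytes are reversed, the swap option described above (for example, genwav swap=on ../waves/cegc) applies the reversal temporarily inside AIM. The result is only a guess, so it is still worth confirming with the genwav display.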