annotate docs/aimInstructions @ 0:5242703e91d3 tip

Initial checkin for AIM92 aimR8.2 (last updated May 1997).
author tomwalters
date Fri, 20 May 2011 15:19:45 +0100
rev   line source
tomwalters@0 1 INSTRUCTIONS AND OPTIONS FOR THE AUDITORY IMAGE MODEL.
tomwalters@0 2
tomwalters@0 3 This is the introduction for those wanting to use AIM to process waves
tomwalters@0 4 once the software package has been compiled, installed and tested. It
tomwalters@0 5 is assumed that the user has read the introductory article by
tomwalters@0 6 Patterson, Allerhand and Giguere (1995) and/or viewed the Overview of
tomwalters@0 7 the Auditory Image Model described at the beginning of ReadMe_bin.
tomwalters@0 8
tomwalters@0 9 Begin by typing gen -help at the prompt in an xterm window.
tomwalters@0 10
tomwalters@0 11 > gen -help
tomwalters@0 12
tomwalters@0 13 This prints general usage information on the standard output as follows:
tomwalters@0 14
tomwalters@0 15 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
tomwalters@0 16 "AIM MRC-APU Release R6.28 [gen] Tue Apr 18 17:35:15 1995"
tomwalters@0 17
tomwalters@0 18 Usage: gen??? [options] [file_name]
tomwalters@0 19
tomwalters@0 20 where ??? is one of the following abbreviations
tomwalters@0 21
tomwalters@0 22 wav    wave
tomwalters@0 23 bmm    basilar membrane motion
tomwalters@0 24 nap    neural activity pattern
tomwalters@0 25 sai    stabilized auditory image
tomwalters@0 26 spl    spiral version of auditory image
tomwalters@0 27 sgm    spectrogram
tomwalters@0 28 cgm    cochleogram
tomwalters@0 29 asa    auditory spectral analysis
tomwalters@0 30 epn    excitation pattern
tomwalters@0 31
tomwalters@0 32 [file_name] is a headerless wave file (2-byte binary integers).
tomwalters@0 33
tomwalters@0 34 [options] are the parameters, options and switches that control
tomwalters@0 35 the AIM instructions and the AIM tools.
tomwalters@0 36
tomwalters@0 37 Help with options: gen??? [-help | -help=all | -help=option]
tomwalters@0 38 Path for options files (.gen???rc) = .:~ (or setenv OPTIONSPATH)
tomwalters@0 39
tomwalters@0 40
tomwalters@0 41 Processes Applied by AIM and Routes Through the Model:
tomwalters@0 42
tomwalters@0 43 Processes                          Auditory   Speech   Spectral
tomwalters@0 44                                    Route      Route    Route
tomwalters@0 45 --------------------------------   --------   ------   -------
tomwalters@0 46 Display input wave                 genwav     genwav   genwav
tomwalters@0 47 Auditory filtering (gtf/tlf)       genbmm
tomwalters@0 48 Compression and rectification
tomwalters@0 49 Neural encoding (2D-AT/haircell)   gennap
tomwalters@0 50 Temporal integration (LP filter)              gensgm   genasa
tomwalters@0 51                                               gencgm   genepn
tomwalters@0 52 Strobed temporal integration       gensai
tomwalters@0 53 Spiral version of auditory image   genspl
tomwalters@0 54
tomwalters@0 55
tomwalters@0 56 Output: gen??? output=on file_name
tomwalters@0 57
tomwalters@0 58 Output is written to file: file_name.???
tomwalters@0 59
tomwalters@0 60 The format for 2-dimensional output is by columns, with the lowest
tomwalters@0 61 channel first in each column (bmm, nap, sgm, cgm, asa, epn).
tomwalters@0 62 The format for auditory image output is by rows, for each image frame
tomwalters@0 63 in succession, with the row of the lowest channel first (sai, spl).
tomwalters@0 64
tomwalters@0 65
tomwalters@0 66 The Auditory Image Model was developed at the Applied Psychology Unit
tomwalters@0 67 of the Medical Research Council, 15 Chaucer Road, Cambridge, U.K.
tomwalters@0 68
tomwalters@0 69 Copyright(c) Applied Psychology Unit, Medical Research Council, 1988-1995.
tomwalters@0 70 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
tomwalters@0 71
tomwalters@0 72
tomwalters@0 73 I. USAGE: INSTRUCTIONS, WAVE FILES, AND OPTIONS.
tomwalters@0 74
tomwalters@0 75 Instructions: The auditory processing is performed by a single large
tomwalters@0 76 program (gen) with multiple entry points, having names of the form
tomwalters@0 77 gen???. The entry-point names function as instructions, so the
tomwalters@0 78 program is run by typing an entry point name on the command line,
tomwalters@0 79 followed by options that control the details of the processing, and
tomwalters@0 80 then the name of the file containing the waveform to be processed.
tomwalters@0 81 The output will be displayed on the screen in a window created by AIM.
tomwalters@0 82 For example, in the directory aim/bin, type:
tomwalters@0 83
tomwalters@0 84 > genwav samplerate=22.05kHz start_wav=182ms length_wav=64 ../waves/hat
tomwalters@0 85
tomwalters@0 86 This will display a section of the wave in the file 'hat' which is
tomwalters@0 87 stored in the aim/waves directory. The wave was digitised at a
tomwalters@0 88 sampling rate of 22.05 kHz, so the option 'samplerate' is set to
tomwalters@0 89 22.05kHz. The options 'start_wav' and 'length_wav' determine the
tomwalters@0 90 subset of the wave to be displayed.
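The relation between 'samplerate', 'start_wav' and 'length_wav' can be sketched in Python. This is not AIM code: the helper names ms_to_samples and read_segment are hypothetical, rounding to the nearest sample is an assumption, and the 2-byte integers are assumed to be in the machine's native byte order.

```python
from array import array

BYTES_PER_SAMPLE = 2  # headerless file of 2-byte integers


def ms_to_samples(time_ms, samplerate_hz):
    """Convert a time in ms to a sample count (nearest-sample rounding is an assumption)."""
    return int(round(time_ms * samplerate_hz / 1000.0))


def read_segment(path, samplerate_hz, start_ms, length_ms):
    """Return the samples in [start_ms, start_ms + length_ms) as an array of shorts."""
    first = ms_to_samples(start_ms, samplerate_hz)
    count = ms_to_samples(length_ms, samplerate_hz)
    with open(path, "rb") as wave_file:
        wave_file.seek(first * BYTES_PER_SAMPLE)
        raw = wave_file.read(count * BYTES_PER_SAMPLE)
    raw = raw[: len(raw) - len(raw) % BYTES_PER_SAMPLE]  # drop a trailing odd byte
    samples = array("h")
    samples.frombytes(raw)
    return samples
```

Under these assumptions, samplerate=22.05kHz with start_wav=182ms and length_wav=64 selects about 1411 samples starting near sample 4013.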
tomwalters@0 91
tomwalters@0 92 Each of the entry points into gen has a name which is an abbreviation
tomwalters@0 93 of the functions provided to that point in the auditory image model.
tomwalters@0 94 For example, the first module applies spectral analysis to the input
tomwalters@0 95 wave and the output of the analysis is available at entry point
tomwalters@0 96 'genbmm' which stands for 'generate basilar membrane motion'. The
tomwalters@0 97 entry points are listed in the usage section of the gen -help output
tomwalters@0 98 and there is a manual entry for each, accessible via 'manaim
tomwalters@0 99 gen???'. Together, these manual pages constitute the primary
tomwalters@0 100 documentation for AIM, and they are the first place to look for help
tomwalters@0 101 after this document.
tomwalters@0 102
tomwalters@0 103 Wave Files: The input waves should be in the form of headerless binary
tomwalters@0 104 files (2-byte integers). Use genwav to check that the bytes are in the
tomwalters@0 105 right order. If they are not, set option swap=on and AIM will reverse
tomwalters@0 106 the bytes temporarily, before beginning the analysis. Alternatively,
tomwalters@0 107 use the aimtool 'swab' to switch the bytes in the file permanently.
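What byte swapping does to a headerless file of 2-byte integers can be illustrated with a short Python sketch. It is not the implementation of swap=on or of 'swab'. The function names are hypothetical, and probably_swapped is only a rough heuristic based on the observation that a wave read in the wrong byte order tends to look like large-amplitude noise.

```python
from array import array


def swap_bytes(raw):
    """Return a copy of raw with the two bytes of every 16-bit sample exchanged."""
    samples = array("h")
    samples.frombytes(raw)  # raises ValueError if len(raw) is odd
    samples.byteswap()
    return samples.tobytes()


def mean_abs(raw):
    """Mean absolute sample value, reading raw as native-order 16-bit integers."""
    samples = array("h")
    samples.frombytes(raw)
    return sum(abs(s) for s in samples) / max(len(samples), 1)


def probably_swapped(raw):
    """Heuristic only: guess that the byte order with the smaller mean amplitude is right."""
    return mean_abs(swap_bytes(raw)) < mean_abs(raw)
```

A visual check with genwav remains the more reliable test; the heuristic can be fooled by waves that really do have a large amplitude.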
tomwalters@0 108
tomwalters@0 109 A. Options:
tomwalters@0 110
tomwalters@0 111 The precise behaviour of the program is determined by options. There
tomwalters@0 112 are AUDITORY OPTIONS which control the operation of the auditory
tomwalters@0 113 processing modules in AIM, such as the number of channels in the
tomwalters@0 114 auditory filterbank (channels_afb) or the decay rate of the auditory
tomwalters@0 115 image (decay_ai). These options are described in the manual pages for
tomwalters@0 116 the individual instructions (gen???). There are also NON-AUDITORY
tomwalters@0 117 OPTIONS which a) specify the form of the input wave (e.g. samplerate
tomwalters@0 118 and swap_wav), b) govern the position and characteristics of the
tomwalters@0 119 output display and its contents, and c) specify the form and
tomwalters@0 120 destination of the output. These options are common to all entry
tomwalters@0 121 points and so they appear at the head of the options list before the
tomwalters@0 122 auditory options. For convenience, they are described in the manual
tomwalters@0 123 entry for genwav, the starting point for any analysis.
tomwalters@0 124
tomwalters@0 125 The relevant options at any entry point can be listed on the screen by
tomwalters@0 126 typing the entry point name with the -help option. For example,
tomwalters@0 127
tomwalters@0 128 > genwav -help
tomwalters@0 129
tomwalters@0 130 If the screen size is too small to display the entire list you can make the
tomwalters@0 131 display pause between screens by typing:
tomwalters@0 132
tomwalters@0 133 > genwav -help | more
tomwalters@0 134
tomwalters@0 135 Alternatively, you can print the listing using:
tomwalters@0 136
tomwalters@0 137 > genwav -help | lpr
tomwalters@0 138
tomwalters@0 139
tomwalters@0 140 All of the options from title up to dB_wave are for display and flow
tomwalters@0 141 control. They determine how the program output appears on the screen,
tomwalters@0 142 the source of the program input, and the destination of output. The
tomwalters@0 143 options have default values chosen for 'normal' operation and display
tomwalters@0 144 given the entry point. You can override the default values, either
tomwalters@0 145 temporarily, by supplying a value on the command line when you run the
tomwalters@0 146 program, or permanently by creating a file with your own option
tomwalters@0 147 values. Both the default value and the current value for each option
tomwalters@0 148 are listed by the help option.
tomwalters@0 149
tomwalters@0 150
tomwalters@0 151 B. Option Values:
tomwalters@0 152
tomwalters@0 153 Some of the options take numeric values (as in samplerate), others are
tomwalters@0 154 file names (as in input_wave). Others act as simple switches or flags
tomwalters@0 155 (e.g. swap_wav); these options can be set/unset using on/off or 0/1.
tomwalters@0 156
tomwalters@0 157 Numeric options can include appropriate units; so, for example, a
tomwalters@0 158 frequency can be specified either as 20000Hz or as 20kHz. There are
tomwalters@0 159 also default units for options. So, for example, Hz is the default
tomwalters@0 160 unit for frequencies and ms is the default unit for times, and when
tomwalters@0 161 these default units apply, they need not be specified on the command
tomwalters@0 162 line.
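How a value with optional units reduces to a number in the default unit can be sketched as follows. This is an illustration in Python, not AIM's options handler: parse_value and the UNITS table are hypothetical, and the 's' entry in particular is an assumption, since only Hz, kHz and ms are mentioned in this document.

```python
import re

# Scale factors to the default unit of each kind (Hz for frequencies, ms for times).
# This table is an assumption for illustration; AIM may accept other units.
UNITS = {
    "Hz": ("frequency", 1.0),
    "kHz": ("frequency", 1000.0),
    "ms": ("time", 1.0),
    "s": ("time", 1000.0),
}


def parse_value(text, kind):
    """Return the numeric value of text in the default unit for kind."""
    match = re.fullmatch(r"\s*([-+]?\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*", text)
    if match is None:
        raise ValueError("not a number with optional unit: %r" % text)
    number, unit = float(match.group(1)), match.group(2)
    if unit == "":
        return number  # no unit given, so the default unit applies
    unit_kind, scale = UNITS.get(unit, (None, None))
    if unit_kind != kind:
        raise ValueError("unit %r is not a %s unit" % (unit, kind))
    return number * scale
```

In this sketch parse_value("20kHz", "frequency") and parse_value("20000Hz", "frequency") give the same number, and parse_value("64", "time") is read as 64 ms.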
tomwalters@0 163
tomwalters@0 164 There are also some special option values like 'remainder' which can be
tomwalters@0 165 used to specify 'the rest of the wave' with option 'length_wave', and
tomwalters@0 166 the special value 'centre' used with options x0_win and y0_win, to
tomwalters@0 167 centre the display window on the screen. (This assumes that your
tomwalters@0 168 window manager does not override placement of windows by
tomwalters@0 169 applications.)
tomwalters@0 170
tomwalters@0 171
tomwalters@0 172 C. Changing Option Values:
tomwalters@0 173
tomwalters@0 174 You can alter the value of one or more of the options by specifying it
tomwalters@0 175 and the required value on the command line between the entry point
tomwalters@0 176 name and the name of the input file. For example:
tomwalters@0 177
tomwalters@0 178 > genwav width_win=440 ../waves/hat
tomwalters@0 179
tomwalters@0 180 will produce a waveform display for the wave in file hat with a width of
tomwalters@0 181 440 screen pixels.
tomwalters@0 182
tomwalters@0 183 Switches and flags that just take the values on and off, or 0 and 1, can be
tomwalters@0 184 switched on by specifying the option name, prefixed by a minus sign to
tomwalters@0 185 distinguish them from a file name. For example,
tomwalters@0 186
tomwalters@0 187 > genwav -swap ../waves/hat_br
tomwalters@0 188
tomwalters@0 189
tomwalters@0 190
tomwalters@0 191 II. PROCESSES APPLIED BY AIM AND ROUTES THROUGH THE MODEL
tomwalters@0 192
tomwalters@0 193 There are three routes through AIM depending on the purpose of the
tomwalters@0 194 analysis and the form in which the output is to be displayed. The
tomwalters@0 195 routes and the set of processes they apply are shown in the output of
tomwalters@0 196 gen -help.
tomwalters@0 197
tomwalters@0 198 All of the routes through AIM begin with genwav which displays the
tomwalters@0 199 contents of the file to be analysed as a magnitude versus time
tomwalters@0 200 plot. It is recommended that all analyses should begin with genwav in
tomwalters@0 201 order to a) confirm that the file does indeed contain the wave you
tomwalters@0 202 wish to analyse, b) confirm that the file is headerless and that the
tomwalters@0 203 bytes are in the right order, c) choose a subsection of the wave for
tomwalters@0 204 analysis, and d) check whether it is a 12-bit wave or a 16-bit
tomwalters@0 205 wave. Headers usually appear as a brief bit of black, or noise with a
tomwalters@0 206 large amplitude, at the left-hand edge of the plot. If the bytes are
tomwalters@0 207 in the wrong order the entire wave usually looks like noise with a
tomwalters@0 208 large amplitude.
tomwalters@0 209
tomwalters@0 210 The AUDITORY ROUTE simulates basilar membrane motion (BMM), neural
tomwalters@0 211 activity patterns (NAPs), and auditory images either in rectangular
tomwalters@0 212 format (SAI) or in spiral format (SPL). Its purpose is to support
tomwalters@0 213 time-domain modelling of peripheral auditory processing; that is, to
tomwalters@0 214 simulate the phase-locked, time-interval patterns produced by the
tomwalters@0 215 cochlea and the conversion of the time-interval patterns into auditory
tomwalters@0 216 images by strobed temporal integration or autocorrelation (Patterson
tomwalters@0 217 et al., 1992; Meddis and Hewitt, 1991). For demonstrations of the
tomwalters@0 218 functional and physiological versions of the auditory route through
tomwalters@0 219 AIM try
tomwalters@0 220
tomwalters@0 221 > aimdemo_gtf_all ../waves/cegc and
tomwalters@0 222 > aimdemo_tlf_all ../waves/cegc
tomwalters@0 223
tomwalters@0 224 [[[ Does the SAI work for the physiological route? Does it use acgram? ]]]
tomwalters@0 225
tomwalters@0 226
tomwalters@0 227 The SPEECH ROUTE produces auditory spectrograms and cochleograms which
tomwalters@0 228 can be stored and used as input for automatic speech recognition. They
tomwalters@0 229 are 'auditory' spectrograms and cochleograms inasmuch as the centre
tomwalters@0 230 frequencies of the channels in the filterbank are equally spaced on an
tomwalters@0 231 ERB-scale, or Bark scale, rather than being linearly spaced as in
tomwalters@0 232 traditional preprocessors (Robinson et al., 1990; Giguere and Woodland,
tomwalters@0 233 1994; Patterson et al., 1994). There are currently no aimdemo scripts
tomwalters@0 234 for the speech route through AIM.
tomwalters@0 235
tomwalters@0 236
tomwalters@0 237 The SPECTRAL ROUTE produces representations of the distribution of
tomwalters@0 238 activity across frequency in the auditory system, either at the output
tomwalters@0 239 of the filterbank (genasa) or the output of the full cochlea
tomwalters@0 240 simulation (genepn). These representations have been variously
tomwalters@0 241 referred to as 'excitation patterns' (Zwicker, 1974; Moore and
tomwalters@0 242 Glasberg, 1983), 'central auditory spectra' (Srulovicz and Goldstein,
tomwalters@0 243 1983), or simply 'auditory spectra' (Patterson, 1994). In AIM, to
tomwalters@0 244 distinguish the spectral representation at the output of the
tomwalters@0 245 filterbank, from the representation provided at the output of the full
tomwalters@0 246 cochlea simulation, the former is referred to as an 'auditory spectral
tomwalters@0 247 analysis' (genasa) and the latter is referred to as an 'excitation
tomwalters@0 248 pattern' (genepn). For demonstrations of the functional and
tomwalters@0 249 physiological versions of the spectral route through AIM, try
tomwalters@0 250
tomwalters@0 251 > aimdemo_gtf_spectra ../waves/cegc and
tomwalters@0 252 > aimdemo_tlf_spectra ../waves/cegc
tomwalters@0 253
tomwalters@0 254
tomwalters@0 255
tomwalters@0 256 III. INDIVIDUAL INSTRUCTIONS FOR THE AUDITORY ROUTE THROUGH AIM
tomwalters@0 257
tomwalters@0 258 The following is a list of the instructions that form the basis of
tomwalters@0 259 aimdemo_gtf_all. They show the output at successive stages of the
tomwalters@0 260 functional version of the auditory route through AIM.
tomwalters@0 261
tomwalters@0 262 > genwav length=32ms ../waves/cegc
tomwalters@0 263 > genstp length=32ms ../waves/cegc
tomwalters@0 264 > genbmm length=32ms ../waves/cegc
tomwalters@0 265 > gennap length=32ms ../waves/cegc
tomwalters@0 266 > gensai start=200 length=300ms ../waves/cegc
tomwalters@0 267 > genspl start=200 length=300ms pensize=2 ../waves/cegc
tomwalters@0 268
tomwalters@0 269
tomwalters@0 270 genwav shows the time waveform,
tomwalters@0 271 genstp shows the pressure wave in the middle ear bone
tomwalters@0 272 that drives the cochlea (the stapes).
tomwalters@0 273 genbmm shows simulated basilar membrane motion,
tomwalters@0 274 gennap shows a simulated neural activity pattern,
tomwalters@0 275 gensai shows a simulated stabilized auditory image,
tomwalters@0 276 genspl shows a spiral mapping of the auditory image.
tomwalters@0 277
tomwalters@0 278
tomwalters@0 279 Each of the instructions should create an X window and present a wave,
tomwalters@0 280 a landscape display, or a spiral display of cegc. The stimulus files
tomwalters@0 281 were created on a DECstation. On a SUN, swap the bytes on the command
tomwalters@0 282 line. For example:
tomwalters@0 283
tomwalters@0 284 > genwav swap=on length=32ms ../waves/cegc
tomwalters@0 285
tomwalters@0 286 The wave in the file 'cegc' is a set of click trains for the notes of
tomwalters@0 287 a major triad and the octave -- hence the name C-E-G-C. We use click
tomwalters@0 288 trains as test and demonstration stimuli for several reasons: They
tomwalters@0 289 activate all channels of the model and they do so with roughly equal
tomwalters@0 290 energy. The clicks elicit impulse responses from the model which show
tomwalters@0 291 the processing in its simplest broadband form. The simple temporal
tomwalters@0 292 structure of the click train immediately reveals any temporal
tomwalters@0 293 alignment problems in the software.
tomwalters@0 294
tomwalters@0 295 For the first 300 ms, the inter-click interval in cegc is 8 ms; then
tomwalters@0 296 for 300 ms, the inter-click interval decreases linearly to (4/5)*8
tomwalters@0 297 ms. For convenience, we refer to the first note as 'C', and so,
tomwalters@0 298 relative to this C, the glide takes the note up to 'E'. The gliding
tomwalters@0 299 portions of cegc illustrate the dynamic properties of AIM. The
tomwalters@0 300 inter-click interval stays at (4/5)*8 ms for 300 ms and then glides
tomwalters@0 301 linearly to (2/3)*8 ms which, relative to the starting C, is the note
tomwalters@0 302 G. The inter-click interval stays at (2/3)*8 ms for 300 ms and then
tomwalters@0 303 glides linearly to (1/2)*8 ms which is the final note and the octave
tomwalters@0 304 of the original C.
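The arithmetic for the four steady notes can be worked through in a few lines of Python. The sketch uses only the intervals given in the text above; the function name note_table and the label 'C (octave)' are introduced here for illustration.

```python
from fractions import Fraction

BASE_INTERVAL_MS = Fraction(8)  # inter-click interval of the first note

# Interval of each steady note as a fraction of the base interval.
INTERVAL_RATIOS = [
    ("C", Fraction(1)),
    ("E", Fraction(4, 5)),
    ("G", Fraction(2, 3)),
    ("C (octave)", Fraction(1, 2)),
]


def note_table():
    """Return (note, interval in ms, click rate in Hz) for each steady note."""
    rows = []
    for note, ratio in INTERVAL_RATIOS:
        interval_ms = BASE_INTERVAL_MS * ratio
        rate_hz = 1000 / interval_ms  # clicks per second
        rows.append((note, float(interval_ms), float(rate_hz)))
    return rows
```

The click rates come out as 125, 156.25, 187.5 and 250 Hz, that is, in the ratios 1 : 5/4 : 3/2 : 2 of a major triad plus the octave.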
tomwalters@0 305
tomwalters@0 306
tomwalters@0 307 IV. INSTRUCTION AND OPTION SYNTAX
tomwalters@0 308
tomwalters@0 309 > gen??? [-option -option=value option=value ... -update] inputfile
tomwalters@0 310
tomwalters@0 311 The input file is assumed to be a headerless binary file; that is, it
tomwalters@0 312 is assumed to contain short integers (2-byte words).
tomwalters@0 313
tomwalters@0 314
tomwalters@0 315 The options handler accepts the following three formats:
tomwalters@0 316
tomwalters@0 317 -option # turns option "on" (Unix convention)
tomwalters@0 318 option=value # sets option to the string "value" (standard math)
tomwalters@0 319 -option=value # sets option to the string "value" (mixed unix/math)
tomwalters@0 320
tomwalters@0 321 So, -swap, swap=on, and -swap=on, all have the same effect.
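The three formats can be illustrated with a minimal Python sketch. It is not the AIM options handler: parse_command_line is a hypothetical name, and treating every bare word without '=' as the input file name is a simplifying assumption (so a file name containing '=' would be misread here).

```python
def parse_command_line(args):
    """Split args into ({option: value}, input_file) using the three formats."""
    options = {}
    input_file = None
    for arg in args:
        if "=" in arg:
            # option=value and -option=value are treated alike
            name, value = arg.lstrip("-").split("=", 1)
            options[name] = value
        elif arg.startswith("-"):
            # -option on its own switches the option on
            options[arg.lstrip("-")] = "on"
        else:
            # a bare word without '=' is taken as the input file name
            input_file = arg
    return options, input_file
```

For example, parse_command_line(["-swap", "width_win=440", "../waves/hat"]) yields the options swap=on and width_win=440 together with the file name ../waves/hat.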
tomwalters@0 322
tomwalters@0 323 The options handler also recognises two special options: update and help.
tomwalters@0 324
tomwalters@0 325 -update (or update=on)
tomwalters@0 326
tomwalters@0 327 This causes the options and values on the command line to be stored
tomwalters@0 328 in an 'options file' that is named according to the Unix convention
tomwalters@0 329 ".<program_name>rc". All of the gen??? instructions search for a
tomwalters@0 330 file of this form when they are executed, so once an option has been
tomwalters@0 331 'updated' with a specific value, that value will be used in all
tomwalters@0 332 subsequent occurrences of that instruction in that directory. (Note:
tomwalters@0 333 The .gen???rc files are 'hidden' because the unix command 'ls' does not
tomwalters@0 334 print files beginning with '.' unless the ls option '-a' is
tomwalters@0 335 present.)
tomwalters@0 336
tomwalters@0 337 CAUTION: Be careful with options files in the user's home
tomwalters@0 338 directory. They will be invoked by instructions executed in any
tomwalters@0 339 directory that does not have its own specific options file for that
tomwalters@0 340 instruction and this can be very confusing.
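The reason a home-directory options file can take effect unexpectedly is the search order. The Python sketch below assumes the default path '.:~' shown in the gen -help output and assumes that OPTIONSPATH, when set, is a colon-separated list of the same form. The function find_options_file is hypothetical and only mimics the lookup; it is not AIM code.

```python
import os


def find_options_file(program, env=None, cwd="."):
    """Return the first '.<program>rc' found on the search path, or None."""
    env = os.environ if env is None else env
    search_path = env.get("OPTIONSPATH", ".:~")  # assumed colon-separated
    for entry in search_path.split(":"):
        directory = cwd if entry == "." else os.path.expanduser(entry)
        candidate = os.path.join(directory, "." + program + "rc")
        if os.path.isfile(candidate):
            return candidate  # earlier entries take precedence
    return None
```

With the default path, find_options_file("genbmm") returns ./.genbmmrc if it exists and falls back to ~/.genbmmrc otherwise, which is the situation the caution above describes.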
tomwalters@0 341
tomwalters@0 342 -help (or help=on)
tomwalters@0 343
tomwalters@0 344 This causes the current options file to be printed on the screen (stdout).
tomwalters@0 345
tomwalters@0 346 The help option accepts any option name as an argument, and prints the
tomwalters@0 347 value of that single option on the screen. So
tomwalters@0 348
tomwalters@0 349 > genbmm help=channels
tomwalters@0 350
tomwalters@0 351 will cause the current number of channels to be printed on the screen.
tomwalters@0 352
tomwalters@0 353 The help option also accepts the string 'all'.
tomwalters@0 354
tomwalters@0 355 > genbmm help=all
tomwalters@0 356
tomwalters@0 357 will cause all options required for BMM output to be printed on the
tomwalters@0 358 screen (stdout).
tomwalters@0 359
tomwalters@0 360 The options and values used in an analysis of BMM can be recorded
tomwalters@0 361 in the file foo with the following instruction:
tomwalters@0 362
tomwalters@0 363 > genbmm help=all > foo
tomwalters@0 364
tomwalters@0 365