INSTRUCTIONS AND OPTIONS FOR THE AUDITORY IMAGE MODEL.

This is the introduction for those wanting to use AIM to process waves
once the software package has been compiled, installed and tested. It
is assumed that the user has read the introductory article by
Patterson, Allerhand and Giguere (1995) and/or viewed the Overview of
the Auditory Image Model described at the beginning of ReadMe_bin.

Begin by typing gen -help at the prompt in an xterm window.

> gen -help

This prints general usage information on the standard output as follows:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"AIM MRC-APU Release R6.28 [gen] Tue Apr 18 17:35:15 1995"

Usage:   gen??? [options] [file_name]

  where ??? is one of the following abbreviations

      wav   wave
      bmm   basilar membrane motion
      nap   neural activity pattern
      sai   stabilized auditory image
      spl   spiral version of auditory image
      sgm   spectrogram
      cgm   cochleogram
      asa   auditory spectral analysis
      epn   excitation pattern

  [file_name] is a headerless wave file (2-byte binary integers).

  [options] are the parameters, options and switches that control
            the AIM instructions and the AIM tools.

Help with options:   gen??? [-help | -help=all | -help=option]
Path for options files (.gen???rc) = .:~  (or setenv OPTIONSPATH)


Processes Applied by AIM and Routes Through the Model:

  Processes                          Auditory   Speech   Spectral
                                     Route      Route    Route
  --------------------------------   --------   ------   --------
  Display input wave                 genwav     genwav   genwav
  Auditory filtering (gtf/tlf)       genbmm
  Compression and rectification
  Neural encoding (2D-AT/haircell)   gennap
  Temporal integration (LP filter)              gensgm   genasa
                                                gencgm   genepn
  Strobed temporal integration       gensai
  Spiral version of auditory image   genspl


Output:   gen??? output=on file_name

  Output is written to file: file_name.???

  The format for 2-dimensional output is by columns, with the lowest
  channel first in each column (bmm, nap, sgm, cgm, asa, epn).
  The format for auditory image output is by rows, for each image frame
  in succession, with the row of the lowest channel first (sai, spl).


The Auditory Image Model was developed at the Applied Psychology Unit
of the Medical Research Council, 15 Chaucer Road, Cambridge, U.K.

Copyright(c) Applied Psychology Unit, Medical Research Council, 1988-1995.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -


I. USAGE: INSTRUCTIONS, WAVE FILES, AND OPTIONS.

Instructions: The auditory processing is performed by a single large
program (gen) with multiple entry points, having names of the form
gen???.
The entry-point names function as instructions, so the
program is run by typing an entry-point name on the command line,
followed by options that control the details of the processing, and
then the name of the file containing the waveform to be processed.
The output will be displayed on the screen in a window created by AIM.
For example, in the directory aim/bin, type:

> genwav samplerate=22.05kHz start_wav=182ms length_wav=64 ../waves/hat

This will display a section of the wave in the file 'hat', which is
stored in the aim/waves directory. The wave was digitised at a
sampling rate of 22.05 kHz, so the option 'samplerate' is set to
22.05kHz. The options 'start_wav' and 'length_wav' determine the
subset of the wave to be displayed.

Each of the entry points into gen has a name which is an abbreviation
of the functions provided up to that point in the auditory image model.
For example, the first module applies spectral analysis to the input
wave, and the output of the analysis is available at entry point
'genbmm', which stands for 'generate basilar membrane motion'. The
entry points are listed in the usage section of the gen -help output
and there is a manual entry for each, accessible via 'manaim
gen???'. Together, these manual pages constitute the primary
documentation for AIM, and they are the first place to look for help
after this document.

Wave Files: The input waves should be in the form of headerless binary
files (2-byte integers). Use genwav to check that the bytes are in the
right order.
If they are not, set option swap=on and AIM will reverse
the bytes, temporarily, before beginning the analysis. Alternatively,
use the aimtool 'swab' to switch the bytes in the file permanently.

A. Options:

The precise behaviour of the program is determined by options. There
are AUDITORY OPTIONS which control the operation of the auditory
processing modules in AIM, such as the number of channels in the
auditory filterbank (channels_afb) or the decay rate of the auditory
image (decay_ai). These options are described in the manual pages for
the individual instructions (gen???). There are also NON-AUDITORY
OPTIONS which a) specify the form of the input wave (e.g. samplerate
and swap_wav), b) govern the position and characteristics of the
output display and its contents, and c) specify the form and
destination of the output. These options are common to all entry
points and so they appear at the head of the options list before the
auditory options. For convenience, they are described in the manual
entry for genwav, the starting point for any analysis.

The relevant options at any entry point can be listed on the screen by
typing the entry-point name with the -help option. For example,

> genwav -help

If the screen is too small to display the entire list, you can make the
display pause between screens by typing:

> genwav -help | more

Alternatively, you can print the listing using

> genwav -help | lpr


All of the options from title up to dB_wave are for display and flow
control.
They determine how the program output appears on the screen,
the source of the program input, and the destination of output. The
options have default values chosen for 'normal' operation and display
given the entry point. You can override the default values, either
temporarily, by supplying a value on the command line when you run the
program, or permanently, by creating a file with your own option
values. Both the default value and the current value for each option
are listed by the help option.


B. Option Values:

Some of the options take numeric values (as in samplerate); others are
file names (as in input_wave). Others act as simple switches or flags
(e.g. swap_wav); these options can be set/unset using on/off or 1/0.

Numeric options can include appropriate units; so, for example, a
frequency can be specified either as 20000Hz or as 20kHz. There are
also default units for options. For example, Hz is the default
unit for frequencies and ms is the default unit for times; when these
units apply, they need not be specified on the command line.

There are also some special option values, like 'remainder', which can
be used to specify 'the rest of the wave' with option 'length_wave', and
the special value 'centre', used with options x0_win and y0_win to
centre the display window on the screen. (This assumes that your
window manager does not override placement of windows by
applications.)


C. Changing Option Values:

You can alter the value of one or more of the options by specifying it
and the required value on the command line between the entry-point
name and the name of the input file. For example:

> genwav width_win=440 ../waves/hat

will produce a waveform display for the wave in file hat with a width of
440 screen pixels.

Switches and flags that just take the values on and off, or 1 and 0, can be
switched on by specifying the option name, prefixed by a minus sign to
distinguish it from a file name. For example,

> genwav -swap ../waves/hat_br



II. PROCESSES APPLIED BY AIM AND ROUTES THROUGH THE MODEL

There are three routes through AIM depending on the purpose of the
analysis and the form in which the output is to be displayed. The
routes and the set of processes they apply are shown in the output of
gen -help.

All of the routes through AIM begin with genwav, which displays the
contents of the file to be analysed as a magnitude versus time
plot. It is recommended that all analyses begin with genwav in
order to a) confirm that the file does indeed contain the wave you
wish to analyse, b) confirm that the file is headerless and that the
bytes are in the right order, c) choose a subsection of the wave for
analysis, and d) check whether it is a 12-bit wave or a 16-bit
wave. Headers usually appear as a brief bit of black, or noise with a
large amplitude, at the left-hand edge of the plot.
If the bytes are
in the wrong order, the entire wave usually looks like noise with a
large amplitude.

The AUDITORY ROUTE simulates basilar membrane motion (BMM), neural
activity patterns (NAPs), and auditory images either in rectangular
format (SAI) or in spiral format (SPL). Its purpose is to support
time-domain modelling of peripheral auditory processing; that is, to
simulate the phase-locked, time-interval patterns produced by the
cochlea and the conversion of the time-interval patterns into auditory
images by strobed temporal integration or autocorrelation (Patterson
et al., 1992; Meddis and Hewitt, 1991). For demonstrations of the
functional and physiological versions of the auditory route through
AIM, try

> aimdemo_gtf_all ../waves/cegc          and
> aimdemo_tlf_all ../waves/cegc

[[[ Does the SAI work for the physiological route? Does it use acgram? ]]]


The SPEECH ROUTE produces auditory spectrograms and cochleograms which
can be stored and used as input for automatic speech recognition. They
are 'auditory' spectrograms and cochleograms inasmuch as the centre
frequencies of the channels in the filterbank are equally spaced on an
ERB scale, or Bark scale, rather than being linearly spaced as in
traditional preprocessors (Robinson et al., 1990; Giguere and Woodland,
1994; Patterson et al., 1994). There are currently no aimdemo scripts
for the speech route through AIM.
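
The idea of channels 'equally spaced on an ERB scale' can be
illustrated with a short Python sketch. It is not taken from AIM: it
assumes the widely used ERB-rate formula of Glasberg and Moore (1990),
and the function names, frequency range and channel count are
hypothetical values chosen only for illustration. The formula and
defaults that AIM itself uses are documented in the manual pages.

```python
import math


def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to ERB-rate (assumed Glasberg and Moore 1990 form)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37


def erb_spaced_centre_frequencies(f_min_hz, f_max_hz, n_channels):
    """Return n_channels centre frequencies equally spaced in ERB-rate."""
    if n_channels < 1:
        raise ValueError("n_channels must be at least 1")
    if n_channels == 1:
        return [f_min_hz]
    e_min = hz_to_erb_rate(f_min_hz)
    e_max = hz_to_erb_rate(f_max_hz)
    step = (e_max - e_min) / (n_channels - 1)
    return [erb_rate_to_hz(e_min + i * step) for i in range(n_channels)]


if __name__ == "__main__":
    # Illustrative range and channel count, not AIM defaults.
    for cf in erb_spaced_centre_frequencies(100.0, 6000.0, 8):
        print(f"{cf:8.1f} Hz")
```

The channels come out closely packed at low frequencies and
progressively wider apart in Hz at high frequencies, which is the
difference from a linearly spaced preprocessor.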


The SPECTRAL ROUTE produces representations of the distribution of
activity across frequency in the auditory system, either at the output
of the filterbank (genasa) or the output of the full cochlea
simulation (genepn). These representations have been variously
referred to as 'excitation patterns' (Zwicker, 1974; Moore and
Glasberg, 1983), 'central auditory spectra' (Srulovicz and Goldstein,
1983), or simply 'auditory spectra' (Patterson, 1994). In AIM, to
distinguish the spectral representation at the output of the
filterbank, from the representation provided at the output of the full
cochlea simulation, the former is referred to as an 'auditory spectral
analysis' (genasa) and the latter is referred to as an 'excitation
pattern' (genepn). For demonstrations of the functional and
physiological versions of the spectral route through AIM, try

> aimdemo_gtf_spectra ../waves/cegc      and
> aimdemo_tlf_spectra ../waves/cegc



III. INDIVIDUAL INSTRUCTIONS FOR THE AUDITORY ROUTE THROUGH AIM

The following is a list of the instructions that form the basis of
aimdemo_gtf_all. They show the output at successive stages of the
functional version of the auditory route through AIM.

> genwav length=32ms ../waves/cegc
> genstp length=32ms ../waves/cegc
> genbmm length=32ms ../waves/cegc
> gennap length=32ms ../waves/cegc
> gensai start=200 length=300ms ../waves/cegc
> genspl start=200 length=300ms pensize=2 ../waves/cegc


genwav   shows the time waveform,
genstp   shows the pressure wave in the middle ear bone
         that drives the cochlea (the stapes),
genbmm   shows simulated basilar membrane motion,
gennap   shows a simulated neural activity pattern,
gensai   shows a simulated stabilized auditory image,
genspl   shows a spiral mapping of the auditory image.


Each of the instructions should create an X window and present a wave,
a landscape display, or a spiral display of cegc. The stimulus files
were created on a DECstation. On a SUN, swap the bytes on the command
line. For example:

> genwav swap=on length=32ms ../waves/cegc

The wave in the file 'cegc' is a set of click trains for the notes of
a major triad and the octave -- hence the name C-E-G-C. We use click
trains as test and demonstration stimuli for several reasons: They
activate all channels of the model and they do so with roughly equal
energy. The clicks elicit impulse responses from the model which show
the processing in its simplest broadband form. The simple temporal
structure of the click train immediately reveals any temporal
alignment problems in the software.
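
A click train with a fixed inter-click interval, stored in the
headerless 2-byte format that AIM reads, can be sketched in Python as
follows. This is an illustration only, not the program used to create
cegc: the 20 kHz sampling rate, the click amplitude, the function names
and the output file name are all assumptions. The byte-order argument
shows what the swap option compensates for: '<' writes the low byte
first (as on a DECstation) and '>' writes the high byte first (as on a
SUN).

```python
import struct


def click_train(sample_rate_hz, interval_ms, duration_ms, amplitude=10000):
    """Return a list of samples that is zero except for periodic clicks."""
    n_samples = int(round(sample_rate_hz * duration_ms / 1000.0))
    period = int(round(sample_rate_hz * interval_ms / 1000.0))
    if period < 1:
        raise ValueError("interval is shorter than one sample")
    samples = [0] * n_samples
    for i in range(0, n_samples, period):
        samples[i] = amplitude
    return samples


def to_headerless_bytes(samples, byte_order="<"):
    """Pack samples as 2-byte signed integers with no header.

    byte_order is '<' for little-endian or '>' for big-endian.
    """
    return struct.pack(f"{byte_order}{len(samples)}h", *samples)


if __name__ == "__main__":
    # Assumed values: 20 kHz sampling rate, 8 ms interval, 300 ms duration.
    samples = click_train(20000, 8.0, 300.0)
    with open("clicks_demo", "wb") as f:  # hypothetical file name
        f.write(to_headerless_bytes(samples, "<"))
```

A file written with the wrong byte order for the host machine is what
genwav displays as large-amplitude noise.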

For the first 300 ms, the inter-click interval in cegc is 8 ms; then,
for 300 ms, the inter-click interval decreases linearly to (4/5)*8
ms. For convenience, we refer to the first note as 'C', and so,
relative to this C, the glide takes the note up to 'E'. The gliding
portions of cegc illustrate the dynamic properties of AIM. The
inter-click interval stays at (4/5)*8 ms for 300 ms and then glides
linearly to (2/3)*8 ms which, relative to the starting C, is the note
G. The inter-click interval stays at (2/3)*8 ms for 300 ms and then
glides linearly to (1/2)*8 ms, which is the final note and the octave
of the original C.


IV. INSTRUCTION AND OPTION SYNTAX

> gen??? [-option -option=value option=value ... -update] inputfile

The input file is assumed to be a headerless binary file; that is, it
is assumed to contain short integers (2-byte words).


The options handler accepts the following three formats:

  -option        # turns option "on"                   (Unix convention)
  option=value   # sets option to the string "value"   (standard math)
  -option=value  # sets option to the string "value"   (mixed unix/math)

So, -swap, swap=on, and -swap=on all have the same effect.

The options handler also recognises two special options: update and help.

-update (or update=on)

  This causes the options and values on the command line to be stored
  in an 'options file' that is named according to the Unix convention
  ".gen???rc". All of the gen???
  instructions search for a
  file of this form when they are executed, so once an option has been
  'updated' with a specific value, that value will be used in all
  subsequent occurrences of that instruction in that directory. (Note:
  The .gen???rc files are 'hidden' because the unix command 'ls' does
  not print files beginning with '.' unless the ls option '-a' is
  present.)

  CAUTION: Be careful with options files in the user's home
  directory. They will be invoked by instructions executed in any
  directory that does not have its own specific options file for that
  instruction, and this can be very confusing.

-help (or help=on)

  This causes the current options file to be printed on the screen (stdout).

  The help option accepts any option name as an argument, and prints the
  value of that single option on the screen. So

  > genbmm help=channels

  will cause the current number of channels to be printed on the screen.

  The help option also accepts the string 'all'.

  > genbmm help=all

  will cause all options required for BMM output to be printed on the
  screen (stdout).

  The options and values used in an analysis of BMM can be recorded
  in the file foo with the following instruction.

  > genbmm help=all > foo
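
The three option formats listed in Section IV can be summarised with a
small Python sketch. It is a hypothetical illustration, not the AIM
options handler: the function name is invented, and it deliberately
ignores units, special values, options files, and any other behaviour
of the real handler. It only shows why -swap, swap=on and -swap=on are
equivalent.

```python
def parse_args(argv):
    """Split a gen??? style argument list into (options, input_file).

    Assumed rules, following the three formats in Section IV:
      -option        -> options["option"] = "on"
      option=value   -> options["option"] = "value"
      -option=value  -> options["option"] = "value"
    Any other argument is treated as the input file name.
    """
    options = {}
    input_file = None
    for arg in argv:
        if "=" in arg:
            name, value = arg.split("=", 1)
            if name.startswith("-"):
                name = name[1:]  # the leading minus sign is optional here
            options[name] = value
        elif arg.startswith("-"):
            options[arg[1:]] = "on"  # a bare switch is turned on
        else:
            input_file = arg
    return options, input_file


if __name__ == "__main__":
    print(parse_args(["-swap", "length=32ms", "../waves/cegc"]))
```

Because values are kept as strings, a value such as 22.05kHz passes
through unchanged; interpreting the units would be a separate step.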