view docs/aimInstructions @ 0:5242703e91d3 tip
Initial checkin for AIM92 aimR8.2 (last updated May 1997).
author | tomwalters |
---|---|
date | Fri, 20 May 2011 15:19:45 +0100 |
INSTRUCTIONS AND OPTIONS FOR THE AUDITORY IMAGE MODEL.

This is the introduction for those wanting to use AIM to process waves once the software package has been compiled, installed and tested. It is assumed that the user has read the introductory article by Patterson, Allerhand and Giguere (1995) and/or viewed the Overview of the Auditory Image Model described at the beginning of ReadMe_bin.

Begin by typing gen -help at the prompt in an xterm window.

> gen -help

This prints general usage information on the standard output as follows:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

"AIM MRC-APU Release R6.28 [gen] Tue Apr 18 17:35:15 1995"

Usage: gen??? [options] [file_name]

where ??? is one of the following abbreviations

    wav   wave
    bmm   basilar membrane motion
    nap   neural activity pattern
    sai   stabilized auditory image
    spl   spiral version of auditory image
    sgm   spectrogram
    cgm   cochleogram
    asa   auditory spectral analysis
    epn   excitation pattern

[file_name] is a headerless wave file (2-byte binary integers).
[options] are the parameters, options and switches that control
the AIM instructions and the AIM tools.

Help with options: gen??? [-help | -help=all | -help=option]
Path for options files (.gen???rc) = .:~ (or setenv OPTIONSPATH)

Processes Applied by AIM and Routes Through the Model:

Processes                          Auditory  Speech  Spectral
                                   Route     Route   Route
--------------------------------   --------  ------  -------
Display input wave                 genwav    genwav  genwav
Auditory filtering (gtf/tlf)       genbmm
Compression and rectification
Neural encoding (2D-AT/haircell)   gennap
Temporal integration (LP filter)             gensgm  genasa
                                             gencgm  genepn
Strobed temporal integration       gensai
Spiral version of auditory image   genspl

Output: gen??? output=on file_name
Output is written to file: file_name.???

The format for 2-dimensional output is by columns, with the lowest channel first in each column (bmm, nap, sgm, cgm, asa, epn).
The format for auditory image output is by rows, for each image frame in succession, with the row of the lowest channel first (sai, spl).

The Auditory Image Model was developed at the Applied Psychology Unit of the Medical Research Council, 15 Chaucer Road, Cambridge, U.K.

Copyright(c) Applied Psychology Unit, Medical Research Council, 1988-1995.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

I. USAGE: INSTRUCTIONS, WAVE FILES, AND OPTIONS.

Instructions: The auditory processing is performed by a single large program (gen) with multiple entry points, having names of the form gen???. The entry-point names function as instructions, so the program is run by typing an entry-point name on the command line, followed by options that control the details of the processing, and then the name of the file containing the waveform to be processed. The output will be displayed on the screen in a window created by AIM. For example, in the directory aim/bin, type:

> genwav samplerate=22.05kHz start_wav=182ms length_wav=64 ../waves/hat

This will display a section of the wave in the file 'hat', which is stored in the aim/waves directory. The wave was digitised at a sampling rate of 22.05 kHz, so the option 'samplerate' is set to 22.05kHz. The options 'start_wav' and 'length_wav' determine the subset of the wave to be displayed.

Each of the entry points into gen has a name which is an abbreviation of the functions provided up to that point in the auditory image model. For example, the first module applies spectral analysis to the input wave, and the output of the analysis is available at entry point 'genbmm', which stands for 'generate basilar membrane motion'. The entry points are listed in the usage section of the gen -help output and there is a manual entry for each, accessible via 'manaim gen???'. Together, this set of manual pages constitutes the primary documentation for AIM, and it is the first place to look for help after this document.
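As an illustration of how 'samplerate', 'start_wav' and 'length_wav' pick out a subset of a headerless wave, the following Python sketch reads a file of 2-byte integers and returns the requested window. It is not part of AIM: the function name read_wave_segment and its parameters are hypothetical, the rounding of times to sample indices is an assumption, and the unitless length is taken to be in ms (the default unit for times). The 'swap' argument mirrors the idea of reversing the byte order on input.

```python
import array


def read_wave_segment(path, samplerate_hz, start_ms, length_ms, swap=False):
    """Return the samples of a headerless 2-byte integer wave file that
    fall in the window [start_ms, start_ms + length_ms).

    Hypothetical helper for illustration only; AIM's own rounding and
    byte-order handling may differ.
    """
    with open(path, "rb") as f:
        data = f.read()

    # Ignore a trailing odd byte, which cannot form a 2-byte sample.
    data = data[: len(data) - (len(data) % 2)]

    samples = array.array("h")  # signed 2-byte integers, native byte order
    samples.frombytes(data)
    if swap:
        samples.byteswap()      # reverse the two bytes of every sample

    # Convert times in ms to sample indices (rounding is an assumption).
    first = int(round(start_ms * samplerate_hz / 1000.0))
    count = int(round(length_ms * samplerate_hz / 1000.0))
    return samples[first:first + count]
```

For the genwav example above, the corresponding call would be read_wave_segment("../waves/hat", 22050, 182, 64).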

Wave Files: The input waves should be in the form of headerless binary files (2-byte integers). Use genwav to check that the bytes are in the right order. If they are not, set option swap=on and AIM will reverse the bytes, temporarily, before beginning the analysis. Alternatively, use the aimtool 'swab' to switch the bytes in the file permanently.

A. Options: The precise behaviour of the program is determined by options. There are AUDITORY OPTIONS which control the operation of the auditory processing modules in AIM, such as the number of channels in the auditory filterbank (channels_afb) or the decay rate of the auditory image (decay_ai). These options are described in the manual pages for the individual instructions (gen???). There are also NON-AUDITORY OPTIONS which a) specify the form of the input wave (e.g. samplerate and swap_wav), b) govern the position and characteristics of the output display and its contents, and c) specify the form and destination of the output. These options are common to all entry points and so they appear at the head of the options list, before the auditory options. For convenience, they are described in the manual entry for genwav, the starting point for any analysis.

The relevant options at any entry point can be listed on the screen by typing the entry-point name with the -help option. For example,

> genwav -help

If the screen is too small to display the entire list, you can make the display pause between screens by typing:

> genwav -help | more

Alternatively, you can print the listing using

> genwav -help | lpr

All of the options from title up to dB_wave are for display and flow control. They determine how the program output appears on the screen, the source of the program input, and the destination of output. The options have default values chosen for 'normal' operation and display given the entry point.
You can override the default values, either temporarily, by supplying a value on the command line when you run the program, or permanently, by creating a file with your own option values. Both the default value and the current value for each option are listed by the help option.

B. Option Values: Some options take numeric values (as in samplerate); others take file names (as in input_wave); still others act as simple switches or flags (e.g. swap_wav), which can be set or unset using on/off or 0/1. Numeric options can include appropriate units; so, for example, a frequency can be specified either as 20000Hz or as 20kHz. There are also default units for options. For example, Hz is the default unit for frequencies and ms is the default unit for times, and when these units apply they need not be specified on the command line. There are also some special option values, like 'remainder', which can be used to specify 'the rest of the wave' with option 'length_wave', and 'centre', which is used with options x0_win and y0_win to centre the display window on the screen. (This assumes that your window manager does not override placement of windows by applications.)

C. Changing Option Values: You can alter the value of one or more of the options by specifying it and the required value on the command line between the entry-point name and the name of the input file. For example:

> genwav width_win=440 ../waves/hat

will produce a waveform display for the wave in file hat with a width of 440 screen pixels. Switches and flags that just take the values on and off, or 0 and 1, can be switched on by specifying the option name, prefixed by a minus sign to distinguish it from a file name. For example,

> genwav -swap ../wave/hat_br

II. PROCESSES APPLIED BY AIM AND ROUTES THROUGH THE MODEL

There are three routes through AIM depending on the purpose of the analysis and the form in which the output is to be displayed.
The routes and the set of processes they apply are shown in the output of gen -help.

All of the routes through AIM begin with genwav, which displays the contents of the file to be analysed as a magnitude versus time plot. It is recommended that all analyses begin with genwav in order to a) confirm that the file does indeed contain the wave you wish to analyse, b) confirm that the file is headerless and that the bytes are in the right order, c) choose a subsection of the wave for analysis, and d) check whether it is a 12-bit wave or a 16-bit wave. Headers usually appear as a brief bit of black, or noise with a large amplitude, at the left-hand edge of the plot. If the bytes are in the wrong order, the entire wave usually looks like noise with a large amplitude.

The AUDITORY ROUTE simulates basilar membrane motion (BMM), neural activity patterns (NAPs), and auditory images either in rectangular format (SAI) or in spiral format (SPL). Its purpose is to support time-domain modelling of peripheral auditory processing; that is, to simulate the phase-locked, time-interval patterns produced by the cochlea and the conversion of the time-interval patterns into auditory images by strobed temporal integration or autocorrelation (Patterson et al., 1992; Meddis and Hewitt, 1991). For demonstrations of the functional and physiological versions of the auditory route through AIM, try

> aimdemo_gtf_all ../waves/cegc

and

> aimdemo_tlf_all ../waves/cegc

[[[ Does the SAI work for the physiological route? Does it use acgram? ]]]

The SPEECH ROUTE produces auditory spectrograms and cochleograms which can be stored and used as input for automatic speech recognition. They are 'auditory' spectrograms and cochleograms inasmuch as the centre frequencies of the channels in the filterbank are equally spaced on an ERB scale, or Bark scale, rather than being linearly spaced as in a traditional preprocessor (Robinson et al., 1990; Giguere and Woodland, 1994; Patterson et al., 1994).
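To make 'equally spaced on an ERB scale' concrete, the following Python sketch computes a set of centre frequencies that are evenly spaced in ERB-rate units. It is an illustration only, under stated assumptions: the ERB-rate formula is the widely used one from Glasberg and Moore (1990), which may not be the exact mapping AIM applies, and the function names and the example frequency range are hypothetical rather than AIM options.

```python
import math


def hz_to_erb_rate(f_hz):
    """ERB-rate (in ERB units) for a frequency in Hz.

    Assumption: Glasberg and Moore (1990) formula; AIM may differ.
    """
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37


def erb_spaced_centre_frequencies(f_min_hz, f_max_hz, n_channels):
    """Centre frequencies equally spaced on the ERB-rate scale.

    Requires n_channels >= 2. The lowest channel comes first.
    """
    lo = hz_to_erb_rate(f_min_hz)
    hi = hz_to_erb_rate(f_max_hz)
    step = (hi - lo) / (n_channels - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_channels)]
```

With this spacing the gap in Hz between neighbouring channels grows with frequency, whereas a linearly spaced filterbank keeps the gap constant.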
There are currently no aimdemo scripts for the speech routes through AIM.

The SPECTRAL ROUTE produces representations of the distribution of activity across frequency in the auditory system, either at the output of the filterbank (genasa) or at the output of the full cochlea simulation (genepn). These representations have been variously referred to as 'excitation patterns' (Zwicker, 1974; Moore and Glasberg, 1983), 'central auditory spectra' (Srulovicz and Goldstein, 1983), or simply 'auditory spectra' (Patterson, 1994). In AIM, to distinguish the spectral representation at the output of the filterbank from the representation provided at the output of the full cochlea simulation, the former is referred to as an 'auditory spectral analysis' (genasa) and the latter is referred to as an 'excitation pattern' (genepn). For demonstrations of the functional and physiological versions of the spectral route through AIM, try

> aimdemo_gtf_spectra ../waves/cegc

and

> aimdemo_tlf_spectra ../waves/cegc

III. INDIVIDUAL INSTRUCTIONS FOR THE AUDITORY ROUTE THROUGH AIM

The following is a list of the instructions that form the basis of aimdemo_gtf_all. They show the output at successive stages of the functional version of the auditory route through AIM.

> genwav length=32ms ../waves/cegc
> genstp length=32ms ../waves/cegc
> genbmm length=32ms ../waves/cegc
> gennap length=32ms ../waves/cegc
> gensai start=200 length=300ms ../waves/cegc
> genspl start=200 length=300ms pensize=2 ../waves/cegc

genwav shows the time waveform; genstp shows the pressure wave in the middle ear bone that drives the cochlea (the stapes); genbmm shows simulated basilar membrane motion; gennap shows a simulated neural activity pattern; gensai shows a simulated stabilized auditory image; and genspl shows a spiral mapping of the auditory image.

Each of the instructions should create an X window and present a wave, a landscape display, or a spiral display of cegc. The stimulus files were created on a DECstation.
On a SUN, swap the bytes on the command line. For example:

> genwav swap=on length=32ms ../waves/cegc

The wave in the file 'cegc' is a set of click trains for the notes of a major triad and the octave, hence the name C-E-G-C. We use click trains as test and demonstration stimuli for several reasons: They activate all channels of the model and they do so with roughly equal energy. The clicks elicit impulse responses from the model which show the processing in its simplest broadband form. The simple temporal structure of the click train immediately reveals any temporal alignment problems in the software.

For the first 300 ms, the inter-click interval in cegc is 8 ms; then for 300 ms, the inter-click interval decreases linearly to (4/5)*8 ms. For convenience, we refer to the first note as 'C', and so, relative to this C, the glide takes the note up to 'E'. The gliding portions of cegc illustrate the dynamic properties of AIM. The inter-click interval stays at (4/5)*8 ms for 300 ms and then glides linearly to (2/3)*8 ms which, relative to the starting C, is the note G. The inter-click interval stays at (2/3)*8 ms for 300 ms and then glides linearly to (1/2)*8 ms, which is the final note and the octave of the original C.

IV. INSTRUCTION AND OPTION SYNTAX

> gen??? [-option -option=value option=value ... -update] inputfile

The input file is assumed to be a headerless binary file; that is, it is assumed to contain short integers (2-byte words).

The options handler accepts the following three formats:

    -option         # turns option "on"                  (Unix convention)
    option=value    # sets option to the string "value"  (standard math)
    -option=value   # sets option to the string "value"  (mixed unix/math)

So, -swap, swap=on, and -swap=on all have the same effect.

The options handler also recognises two special options: update and help.

-update (or update=on)

This causes the options and values on the command line to be stored in an 'options file' that is named according to the Unix convention ".<program_name>rc". All of the gen??? instructions search for a file of this form when they are executed, so once an option has been 'updated' with a specific value, that value will be used in all subsequent occurrences of that instruction in that directory. (Note: The .gen???rc files are 'hidden' because the Unix command 'ls' does not print files beginning with '.' unless the ls option '-a' is present.)

CAUTION: Be careful with options files in the user's home directory. They will be invoked by instructions executed in any directory that does not have its own specific options file for that instruction, and this can be very confusing.

-help (or help=on)

This causes the current options file to be printed on the screen (stdout). The help option accepts any option name as an argument, and prints the value of that single option on the screen. So

> genbmm help=channels

will cause the current number of channels to be printed on the screen. The help option also accepts the string 'all'.

> genbmm help=all

will cause all options required for BMM output to be printed on the screen (stdout). The options and values used in an analysis of BMM can be recorded in the file foo with the following instruction.

> genbmm help=all > foo
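
The option syntax of Section IV and the options-file naming convention can be summarised in a short Python sketch. It is not the AIM options handler: the function names parse_gen_args and options_file_candidates are hypothetical, file names containing '=' are not handled, and the assumption that directories are searched in the order listed in the path (".:~" by default, or the OPTIONSPATH environment variable) is inferred from the gen -help output and the caution above rather than confirmed.

```python
import os


def parse_gen_args(args):
    """Split a gen??? argument list into (options, input_file).

    Handles the three documented formats:
      -option        -> option set to "on"
      option=value   -> option set to "value"
      -option=value  -> option set to "value"
    Any other argument is taken to be the input file name.
    """
    options = {}
    input_file = None
    for arg in args:
        name, sep, value = arg.partition("=")
        if name.startswith("-"):
            name = name[1:]          # drop the leading minus sign
            options[name] = value if sep else "on"
        elif sep:
            options[name] = value
        else:
            input_file = arg
    return options, input_file


def options_file_candidates(program, search_path=None):
    """List the options files that would be looked for, in search order.

    Assumption: the path is a colon-separated list of directories and
    the file is named ".<program_name>rc" in each of them.
    """
    if search_path is None:
        search_path = os.environ.get("OPTIONSPATH", ".:~")
    return [
        os.path.join(os.path.expanduser(directory), "." + program + "rc")
        for directory in search_path.split(":")
        if directory
    ]
```

For example, parse_gen_args(["-swap", "../waves/hat"]), parse_gen_args(["swap=on", "../waves/hat"]) and parse_gen_args(["-swap=on", "../waves/hat"]) all give the same result, matching the equivalence stated in Section IV.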