
INSTRUCTIONS AND OPTIONS FOR THE AUDITORY IMAGE MODEL.

This is the introduction for those wanting to use AIM to process waves
once the software package has been compiled, installed and tested. It
is assumed that the user has read the introductory article by
Patterson, Allerhand and Giguere (1995) and/or viewed the Overview of
the Auditory Image Model described at the beginning of ReadMe_bin.

Begin by typing  gen -help  at the prompt in an xterm window.

> gen -help

This prints general usage information on the standard output as follows:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
"AIM MRC-APU Release R6.28   [gen]    Tue Apr 18 17:35:15 1995"

Usage: gen???  [options]  [file_name]

   where ??? is one of the following abbreviations

   wav     wave
   bmm     basilar membrane motion
   nap     neural activity pattern
   sai     stabilized auditory image
   spl     spiral version of auditory image
   sgm     spectrogram
   cgm     cochleogram
   asa     auditory spectral analysis
   epn     excitation pattern

   [file_name] is a headerless wave file (2-byte binary integers).

   [options] are the parameters, options and switches that control
       the AIM instructions and the AIM tools.

   Help with options:  gen???  [-help | -help=all | -help=option]
   Path for options files (.gen???rc) = .:~ (or setenv OPTIONSPATH)


Processes Applied by AIM and Routes Through the Model:

   Processes                           Auditory   Speech   Spectral
                                        Route      Route    Route
   --------------------------------    --------   ------   -------
   Display input wave                   genwav    genwav    genwav
   Auditory filtering (gtf/tlf)         genbmm
   Compression and rectification
   Neural encoding (2D-AT/haircell)     gennap
   Temporal integration (LP filter)               gensgm    genasa
                                                  gencgm    genepn
   Strobed temporal integration         gensai
   Spiral version of auditory image     genspl


Output:   gen???  output=on  file_name

   Output is written to file:  file_name.???

   The format for 2-dimensional output is by columns, with the lowest
   channel first in each column (bmm, nap, sgm, cgm, asa, epn).
   The format for auditory image output is by rows, for each image frame
   in succession, with the row of the lowest channel first (sai, spl).


The Auditory Image Model was developed at the Applied Psychology Unit
of the Medical Research Council, 15 Chaucer Road, Cambridge, U.K.

Copyright(c) Applied Psychology Unit, Medical Research Council, 1988-1995.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -


I. USAGE: INSTRUCTIONS, WAVE FILES, AND OPTIONS.

Instructions: The auditory processing is performed by a single large
program (gen) with multiple entry points, having names of the form
gen???.  The entry-point names function as instructions, so the
program is run by typing an entry point name on the command line,
followed by options that control the details of the processing, and
then the name of the file containing the waveform to be processed.
The output will be displayed on the screen in a window created by AIM.
For example, in the directory aim/bin, type:

> genwav samplerate=22.05kHz start_wav=182ms length_wav=64 ../waves/hat

This will display a section of the wave in the file 'hat' which is
stored in the aim/waves directory. The wave was digitised at a
sampling rate of 22.05 kHz, so the option 'samplerate' is set to
22.05kHz. The options 'start_wav' and 'length_wav' determine the
subset of the wave to be displayed.
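
As a rough illustration of what these two options select, the sketch
below converts the start time and length to sample and byte positions
in a headerless file of 2-byte integers. It assumes that length_wav=64
means 64 ms (ms being the default unit for times) and that times are
rounded to the nearest sample; the rounding rule and the function names
are assumptions for illustration, not AIM's own code.

```python
BYTES_PER_SAMPLE = 2  # headerless file of 2-byte integers


def ms_to_samples(time_ms, samplerate_hz):
    """Convert a time in milliseconds to a whole number of samples."""
    return round(time_ms * samplerate_hz / 1000.0)


def segment_byte_range(start_ms, length_ms, samplerate_hz):
    """Return (first_byte, byte_count) of the selected segment in the file."""
    first_sample = ms_to_samples(start_ms, samplerate_hz)
    n_samples = ms_to_samples(length_ms, samplerate_hz)
    return first_sample * BYTES_PER_SAMPLE, n_samples * BYTES_PER_SAMPLE


# start_wav=182ms, length_wav=64 (ms assumed), samplerate=22.05kHz
first_byte, byte_count = segment_byte_range(182, 64, 22050)
print(first_byte, byte_count)  # 8026 2822
```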

Each of the entry points into gen has a name which is an abbreviation
of the functions provided to that point in the auditory image model.
For example, the first module applies spectral analysis to the input
wave and the output of the analysis is available at entry point
'genbmm' which stands for 'generate basilar membrane motion'. The
entry points are listed in the usage section of the gen -help output
and there is a manual entry for each, accessible via 'manaim
gen???'. Together, this set of manual pages constitutes the primary
documentation for AIM and is the first place to look for help
after this document.

Wave Files: The input waves should be in the form of headerless binary
files (2-byte integers). Use genwav to check that the bytes are in the
right order. If they are not, set option swap=on and AIM will reverse
the bytes, temporarily, before beginning the analysis. Alternatively,
use the aimtool 'swab' to switch the bytes in the file permanently.
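
The effect of reading 2-byte integers in the wrong byte order can be
seen in a minimal Python sketch. It uses the standard 'array' module
and is only an illustration of byte swapping in general, not of how
AIM or swab implement it; the function name swap_bytes is hypothetical.

```python
from array import array


def swap_bytes(samples):
    """Return a copy of 16-bit samples with the two bytes of each value exchanged."""
    swapped = array("h", samples)  # "h" = signed 2-byte integer
    swapped.byteswap()
    return swapped


quiet = array("h", [1, -2, 3, 0])   # a very low-amplitude wave
wrong_order = swap_bytes(quiet)
print(list(wrong_order))            # [256, -257, 768, 0]
print(list(swap_bytes(wrong_order)))  # [1, -2, 3, 0]
```

Small sample values become large ones when the bytes are exchanged,
which is why a wave read in the wrong byte order tends to look like
large-amplitude noise in genwav.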

A. Options:

The precise behaviour of the program is determined by options.  There
are AUDITORY OPTIONS which control the operation of the auditory
processing modules in AIM, such as the number of channels in the
auditory filterbank (channels_afb) or the decay rate of the auditory
image (decay_ai).  These options are described in the manual pages for
the individual instructions (gen???).  There are also NON-AUDITORY
OPTIONS which a) specify the form of the input wave (e.g. samplerate
and swap_wav), b) govern the position and characteristics of the
output display and its contents, and c) specify the form and
destination of the output.  These options are common to all entry
points and so they appear at the head of the options list before the
auditory options. For convenience, they are described in the manual
entry for genwav, the starting point for any analysis.

The relevant options at any entry point can be listed on the screen by
typing the entry point name with the -help option.  For example,

> genwav -help

If the screen size is too small to display the entire list you can make the
display pause between screens by typing:

> genwav -help | more

Alternatively, you can print the listing using

> genwav -help | lpr


All of the options from title up to dB_wave are for display and flow
control.  They determine how the program output appears on the screen,
the source of the program input, and the destination of output. The
options have default values chosen for 'normal' operation and display
given the entry point. You can override the default values, either
temporarily, by supplying a value on the command line when you run the
program, or permanently by creating a file with your own option
values. Both the default value and the current value for each option
are listed by the help option.
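
The layering of values described above can be pictured with a small
Python sketch. It assumes that a command-line value takes precedence
over a value from an options file, which in turn takes precedence over
the built-in default; that ordering is inferred from the text rather
than taken from the AIM source. The function name and the default
values shown are hypothetical.

```python
def effective_options(defaults, options_file, command_line):
    """Merge option values, with later sources overriding earlier ones."""
    merged = dict(defaults)
    merged.update(options_file)   # permanent overrides from an options file
    merged.update(command_line)   # temporary overrides for this run only
    return merged


defaults = {"width_win": "640", "samplerate": "20kHz"}   # hypothetical defaults
options_file = {"samplerate": "22.05kHz"}
command_line = {"width_win": "440"}

current = effective_options(defaults, options_file, command_line)
print(current["width_win"], current["samplerate"])  # 440 22.05kHz
```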


B. Option Values:

Some of the options take numeric values (as in samplerate), others are
file names (as in input_wave).  Others act as simple switches or flags
(e.g. swap_wav); these options can be set/unset using on/off or 0/1.

Numeric options can include appropriate units; so, for example, a
frequency can be specified either as 20000Hz or as 20kHz. There are
also default units for options: Hz is the default unit for
frequencies and ms is the default unit for times. When these default
units apply, they need not be specified on the command line.
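
A minimal Python sketch of how values with optional units might be
interpreted is given below. It covers only the units named in this
section (Hz, kHz and ms); the function name, the unit tables and the
treatment of upper and lower case are assumptions for illustration,
not a description of the AIM options handler.

```python
import re

# Multipliers to the default unit. Only the units named in the text are
# listed; any others that AIM accepts are not covered by this sketch.
FREQUENCY_UNITS = {"": 1.0, "hz": 1.0, "khz": 1000.0}   # default unit: Hz
TIME_UNITS = {"": 1.0, "ms": 1.0}                        # default unit: ms

_VALUE = re.compile(r"([-+]?(?:\d+\.?\d*|\.\d+))([A-Za-z]*)")


def parse_value(text, units):
    """Return the numeric value of 'text' expressed in the default unit."""
    match = _VALUE.fullmatch(text.strip())
    if match is None:
        raise ValueError("not a numeric option value: %r" % text)
    number, unit = match.groups()
    try:
        scale = units[unit.lower()]
    except KeyError:
        raise ValueError("unknown unit: %r" % unit)
    return float(number) * scale


print(parse_value("20kHz", FREQUENCY_UNITS))    # 20000.0
print(parse_value("20000Hz", FREQUENCY_UNITS))  # 20000.0
print(parse_value("20000", FREQUENCY_UNITS))    # 20000.0
print(parse_value("182ms", TIME_UNITS))         # 182.0
```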

There are also some special option values like 'remainder' which can be
used to specify 'the rest of the wave' with option 'length_wave', and
the special value 'centre' used with options x0_win and y0_win, to
centre the display window on the screen. (This assumes that your
window manager does not override placement of windows by
applications.)


C. Changing Option Values:

You can alter the value of one or more of the options by specifying it
and the required value on the command line between the entry point
name and the name of the input file. For example:

> genwav width_win=440 ../waves/hat

will produce a waveform display for the wave in file hat with a width of
440 screen pixels.

Switches and flags that just take the values on and off, or 0 and 1, can be
switched on by specifying the option name, prefixed by a minus sign to
distinguish them from a file name.  For example,

> genwav -swap ../waves/hat_br


II. PROCESSES APPLIED BY AIM AND ROUTES THROUGH THE MODEL

There are three routes through AIM depending on the purpose of the
analysis and the form in which the output is to be displayed. The
routes and the set of processes they apply are shown in the output of
gen -help.

All of the routes through AIM begin with genwav which displays the
contents of the file to be analysed as a magnitude versus time
plot. It is recommended that all analyses should begin with genwav in
order to a) confirm that the file does indeed contain the wave you
wish to analyse, b) confirm that the file is headerless and that the
bytes are in the right order, c) choose a subsection of the wave for
analysis, and d) check whether it is a 12-bit wave or a 16-bit
wave. Headers usually appear as a brief bit of black, or noise with a
large amplitude, at the left-hand edge of the plot. If the bytes are
in the wrong order the entire wave usually looks like noise with a
large amplitude.

The AUDITORY ROUTE simulates basilar membrane motion (BMM), neural
activity patterns (NAPs), and auditory images either in rectangular
format (SAI) or in spiral format (SPL). Its purpose is to support
time-domain modelling of peripheral auditory processing; that is, to
simulate the phase-locked, time-interval patterns produced by the
cochlea and the conversion of the time-interval patterns into auditory
images by strobed temporal integration or autocorrelation (Patterson
et al., 1992; Meddis and Hewitt, 1991).  For demonstrations of the
functional and physiological versions of the auditory route through
AIM try

> aimdemo_gtf_all ../waves/cegc      and
> aimdemo_tlf_all ../waves/cegc

[[[ Does the SAI work for the physiological route? Does it use acgram? ]]]


The SPEECH ROUTE produces auditory spectrograms and cochleograms which
can be stored and used as input for automatic speech recognition. They
are 'auditory' spectrograms and cochleograms inasmuch as the centre
frequencies of the channels in the filterbank are equally spaced on an
ERB-scale, or Bark scale, rather than being linearly spaced as in
traditional preprocessors (Robinson et al., 1990; Giguere and Woodland,
1994; Patterson et al., 1994).  There are currently no aimdemo scripts
for the speech route through AIM.
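
To illustrate what 'equally spaced on an ERB-scale' means, the Python
sketch below places channel centre frequencies at equal steps of
ERB-rate. It uses the widely quoted Glasberg and Moore (1990)
ERB-rate formula as an assumption; the formula and frequency range
used by this release of AIM may differ, and the function names are
hypothetical.

```python
import math


def hz_to_erb_rate(frequency_hz):
    """ERB-rate (number of ERBs below frequency_hz), Glasberg and Moore (1990)."""
    return 21.4 * math.log10(4.37 * frequency_hz / 1000.0 + 1.0)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37


def centre_frequencies(low_hz, high_hz, channels):
    """Centre frequencies equally spaced in ERB-rate (needs channels >= 2)."""
    low = hz_to_erb_rate(low_hz)
    high = hz_to_erb_rate(high_hz)
    step = (high - low) / (channels - 1)
    return [erb_rate_to_hz(low + i * step) for i in range(channels)]


for frequency in centre_frequencies(100.0, 6000.0, 5):
    print(round(frequency, 1))
```

The gaps in Hz between neighbouring channels grow with frequency,
whereas a linearly spaced filterbank would have equal gaps.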


The SPECTRAL ROUTE produces representations of the distribution of
activity across frequency in the auditory system, either at the output
of the filterbank (genasa) or the output of the full cochlea
simulation (genepn). These representations have been variously
referred to as 'excitation patterns' (Zwicker, 1974; Moore and
Glasberg, 1983), 'central auditory spectra' (Srulovicz and Goldstein,
1983), or simply 'auditory spectra' (Patterson, 1994). In AIM, to
distinguish the spectral representation at the output of the
filterbank, from the representation provided at the output of the full
cochlea simulation, the former is referred to as an 'auditory spectral
analysis' (genasa) and the latter is referred to as an 'excitation
pattern' (genepn).  For demonstrations of the functional and
physiological versions of the spectral route through AIM, try

> aimdemo_gtf_spectra ../waves/cegc      and
> aimdemo_tlf_spectra ../waves/cegc


III. INDIVIDUAL INSTRUCTIONS FOR THE AUDITORY ROUTE THROUGH AIM

The following is a list of the instructions that form the basis of
aimdemo_gtf_all. They show the output at successive stages of the
functional version of the auditory route through AIM.

> genwav length=32ms ../waves/cegc
> genstp length=32ms ../waves/cegc
> genbmm length=32ms ../waves/cegc
> gennap length=32ms ../waves/cegc
> gensai start=200 length=300ms ../waves/cegc
> genspl start=200 length=300ms pensize=2 ../waves/cegc


genwav shows the time waveform,
genstp shows the pressure wave in the middle ear bone
       that drives the cochlea (the stapes).
genbmm shows simulated basilar membrane motion,
gennap shows a simulated neural activity pattern,
gensai shows a simulated stabilized auditory image,
genspl shows a spiral mapping of the auditory image.


Each of the instructions should create an X window and present a wave,
a landscape display, or a spiral display of cegc.  The stimulus files
were created on a DECstation. On a SUN, swap the bytes on the command
line. For example:

> genwav swap=on length=32ms ../waves/cegc

The wave in the file 'cegc' is a set of click trains for the notes of
a major triad and the octave -- hence the name C-E-G-C. We use click
trains as test and demonstration stimuli for several reasons: They
activate all channels of the model and they do so with roughly equal
energy. The clicks elicit impulse responses from the model which show
the processing in its simplest broadband form. The simple temporal
structure of the click train immediately reveals any temporal
alignment problems in the software.

For the first 300 ms, the inter-click interval in cegc is 8 ms; then
for 300 ms, the inter-click interval decreases linearly to (4/5)*8
ms. For convenience, we refer to the first note as 'C', and so,
relative to this C, the glide takes the note up to 'E'.  The gliding
portions of cegc illustrate the dynamic properties of AIM.  The
inter-click interval stays at (4/5)*8 ms for 300 ms and then glides
linearly to (2/3)*8 ms which, relative to the starting C, is the note
G.  The inter-click interval stays at (2/3)*8 ms for 300 ms and then
glides linearly to (1/2)*8 ms which is the final note and the octave
of the original C.
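
The arithmetic behind the note names can be checked with a few lines of
Python. The sketch converts each steady inter-click interval to a click
rate in Hz and to a frequency ratio relative to the starting C; the
variable and function names are chosen for illustration only.

```python
from fractions import Fraction

BASE_INTERVAL_MS = 8  # inter-click interval of the first note

# Inter-click interval of each steady note as a fraction of the first.
INTERVAL_RATIOS = [
    ("C", Fraction(1)),
    ("E", Fraction(4, 5)),
    ("G", Fraction(2, 3)),
    ("C (octave)", Fraction(1, 2)),
]


def click_rate_hz(interval_ratio):
    """Click rate in Hz for an inter-click interval of interval_ratio * 8 ms."""
    return 1000 / (BASE_INTERVAL_MS * interval_ratio)


for note, ratio in INTERVAL_RATIOS:
    rate = click_rate_hz(ratio)
    print(note, float(rate), "Hz, frequency ratio", rate / click_rate_hz(1))
```

The rates come out as 125, 156.25, 187.5 and 250 Hz, that is,
frequency ratios of 1, 5/4, 3/2 and 2 relative to the first note: a
major third, a fifth and an octave.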


IV. INSTRUCTION AND OPTION SYNTAX

> gen??? [-option -option=value option=value ... -update] inputfile

The input file is assumed to be a headerless binary file; that is, it
is assumed to contain short integers (2-byte words).


The options handler accepts the following three formats:

-option       # turns option "on" (Unix convention)
 option=value # sets option to the string "value" (standard math)
-option=value # sets option to the string "value" (mixed unix/math)

So, -swap, swap=on, and -swap=on, all have the same effect.
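
The equivalence of the three formats can be expressed as a short
Python sketch that reduces each format to a (name, value) pair. It is
a hypothetical illustration of the syntax described here, not the AIM
options handler itself, and the function name is an assumption.

```python
def parse_option(argument):
    """Normalise one command-line option to a (name, value) pair."""
    has_minus = argument.startswith("-")
    text = argument[1:] if has_minus else argument
    name, separator, value = text.partition("=")
    if not separator:
        if not has_minus:
            # A bare word with no '-' and no '=' is a file name, not an option.
            raise ValueError("not an option: %r" % argument)
        value = "on"  # -option turns the option on
    return name, value


for argument in ["-swap", "swap=on", "-swap=on"]:
    print(parse_option(argument))  # ('swap', 'on') each time
```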

The options handler also recognises two special options: update and help.

-update (or update=on)

   This causes the options and values on the command line to be stored
   in an 'options file' that is named according to the Unix convention
   ".<program_name>rc".  All of the gen??? instructions search for a
   file of this form when they are executed, so once an option has been
   'updated' with a specific value, that value will be used in all
   subsequent occurrences of that instruction in that directory. (Note:
   The .gen???rc files are 'hidden' because the Unix command 'ls' does
   not print files beginning with '.' unless the ls option '-a' is
   present.)

   CAUTION: Be careful with options files in the user's home
   directory. They will be invoked by instructions executed in any
   directory that does not have its own specific options file for that
   instruction and this can be very confusing.
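
   The lookup that makes a home-directory options file apply everywhere
   can be sketched in Python as follows. The sketch assumes that the
   search path ".:~" shown by gen -help is a colon-separated list that
   is searched in order (current directory first, then the home
   directory), and that OPTIONSPATH uses the same format; both points
   are inferred from the text, and the function name is hypothetical.

```python
import os


def find_options_file(entry_point, search_path=".:~"):
    """Return the first options file found for entry_point, or None."""
    file_name = "." + entry_point + "rc"          # e.g. .genbmmrc
    for directory in search_path.split(":"):
        candidate = os.path.join(os.path.expanduser(directory), file_name)
        if os.path.isfile(candidate):
            return candidate
    return None


# Use OPTIONSPATH if it is set, otherwise the default path ".:~".
print(find_options_file("genbmm", os.environ.get("OPTIONSPATH", ".:~")))
```

   Under these assumptions, a .genbmmrc in the current directory hides
   one in the home directory, and the home-directory file is used
   wherever no local file exists.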

-help (or help=on)

   This causes the current options file to be printed on the screen (stdout).

   The help option accepts any option name as an argument, and prints the
   value of that single option on the screen. So

> genbmm help=channels

   will cause the current number of channels to be printed on the screen.

   The help option also accepts the string 'all'.

> genbmm help=all

   will cause all options required for BMM output to be printed on the
   screen (stdout).

   The options and values used in an analysis of BMM can be recorded
   in the file foo with the following instruction:

> genbmm help=all > foo
author	tomwalters
date	Fri, 20 May 2011 15:19:45 +0100