Revised for JASA, 3 April 95


Time-domain modelling of peripheral auditory processing:
A modular architecture and a software platform*

Roy D. Patterson and Mike H. Allerhand
MRC Applied Psychology Unit, 15 Chaucer Road, Cambridge CB2 2EF, UK

Christian Giguère
Laboratory of Experimental Audiology, University Hospital Utrecht,
3508 GA Utrecht, The Netherlands

(Received December 1994; revised 31 March 1995)

A software package with a modular architecture has been developed to
support perceptual modelling of the fine-grain spectro-temporal
information observed in the auditory nerve. The package contains both
functional and physiological modules to simulate auditory spectral
analysis, neural encoding and temporal integration, including new
forms of periodicity-sensitive temporal integration that generate
stabilized auditory images. Combinations of the modules enable the
user to approximate a wide variety of existing, time-domain, auditory
models. Sequences of auditory images can be replayed to produce
cartoons of auditory perceptions that illustrate the dynamic response
of the auditory system to everyday sounds.

PACS numbers: 43.64.Bt, 43.66.Ba, 43.71.An

Running head: Auditory Image Model Software


INTRODUCTION

Several years ago, we developed a functional model of the cochlea to
simulate the phase-locked activity that complex sounds produce in the
auditory nerve. The purpose was to investigate the role of the
fine-grain timing information in auditory perception generally
(Patterson et al., 1992a; Patterson and Akeroyd, 1995), and in speech
perception in particular (Patterson, Holdsworth and Allerhand, 1992b).
The architecture of the resulting Auditory Image Model (AIM) is shown
in the left-hand column of Fig. 1. The responses of the three modules
to the vowel in 'hat' are shown in the three panels of Fig. 2.
Briefly, the spectral analysis stage converts the sound wave into the
model's representation of basilar membrane motion (BMM). For the vowel
in 'hat', each glottal cycle generates a version of the basic vowel
structure in the BMM (top panel). The neural encoding stage
stabilizes the BMM in level and sharpens features like vowel formants,
to produce a simulation of the neural activity pattern (NAP) that the
sound produces in the auditory nerve (middle panel). The temporal
integration stage stabilizes the repeating structure in the NAP and
produces a simulation of our perception of the vowel (bottom panel),
referred to as the auditory image. Sequences of simulated images can
be generated at regular intervals and replayed as an animated cartoon
to show the dynamic behaviour of the auditory images produced by
everyday sounds.
An earlier version of the AIM software was made available to
collaborators via the Internet. From there it spread to the speech and
music communities, indicating a more general interest in auditory
models than we had originally anticipated. This has prompted us to
prepare documentation and a formal release of the software (AIM R7).

A number of users wanted to compare the outputs from the functional
model, which is almost level independent, with those from
physiological models of the cochlea, which are fundamentally level
dependent. Others wanted to compare the auditory images produced by
strobed temporal integration with correlograms. As a result, we have
installed alternative modules for each of the three main stages, as
shown in the right-hand column of Fig. 1. The alternative spectral
analysis module is a non-linear, transmission-line filterbank based on
Giguère and Woodland (1994a). The neural encoding module is based on
the inner haircell model of Meddis (1988). The temporal integration
module generates correlograms like those of Slaney and Lyon (1990) or
Meddis and Hewitt (1991), using the algorithm proposed by Allerhand
and Patterson (1992). The responses of the three modules to the vowel
in 'hat' are shown in Fig. 3 for the case where the level of the vowel
is 60 dB SPL. The patterns are broadly similar to those of the
functional modules but the details differ, particularly at the output
of the third stage. The differences grow more pronounced when the
level of the vowel is reduced to 30 dB SPL or increased to 90 dB SPL.
Figures 2 and 3 together illustrate how the software can be used to
compare and contrast different auditory models. The new modules also
open the way to time-domain simulation of hearing impairment and of
distortion products of cochlear origin.

Switches were installed to enable the user to shift from the
functional to the physiological version of AIM at the output of each
stage of the model. This architecture also enables the system to
implement other popular auditory models, such as the
gammatone-filterbank, Meddis-haircell, correlogram models proposed by
Assmann and Summerfield (1990), Meddis and Hewitt (1991), and Brown
and Cooke (1994). The remainder of this letter describes the
integrated software package, with emphasis on the functional and
physiological routes and on practical aspects of obtaining the
software package.*
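To make the switching architecture concrete, the C sketch below shows
how a three-stage cascade with a route switch at the output of each
stage might be organized. It is purely illustrative: the type and
function names are our inventions rather than identifiers from the AIM
sources, and the stubs merely announce themselves where the real
modules would transform multi-channel data.

    #include <stdio.h>

    /* A stand-in for AIM's multi-channel data blocks (BMM, NAP, image). */
    typedef struct { int nchan, nsamp; float *data; } Signal;
    typedef void (*Module)(Signal *s);

    /* Trivial stubs so the sketch compiles and runs; the names follow
     * Fig. 1 but are not the identifiers used in the AIM sources. */
    static void gammatone_fbank(Signal *s) { (void)s; puts("gammatone filterbank -> BMM"); }
    static void transline_fbank(Signal *s) { (void)s; puts("transmission-line filterbank -> BMM"); }
    static void adaptive_thresh(Signal *s) { (void)s; puts("adaptive thresholding -> NAP"); }
    static void meddis_haircell(Signal *s) { (void)s; puts("Meddis haircell -> NAP"); }
    static void strobed_ti(Signal *s)      { (void)s; puts("strobed integration -> auditory image"); }
    static void autocorr(Signal *s)        { (void)s; puts("autocorrelation -> correlogram"); }

    int main(void)
    {
        /* One switch per stage: 0 selects the functional module, 1 the
         * physiological one.  The 0,1,1 setting below corresponds to the
         * gammatone-filterbank, Meddis-haircell, correlogram models
         * cited above. */
        const int physio[3] = { 0, 1, 1 };

        Module stage[3];
        stage[0] = physio[0] ? transline_fbank : gammatone_fbank;
        stage[1] = physio[1] ? meddis_haircell : adaptive_thresh;
        stage[2] = physio[2] ? autocorr : strobed_ti;

        Signal s = { 75, 0, 0 };          /* a 75-channel example block */
        for (int i = 0; i < 3; i++)
            stage[i](&s);                 /* cascade the selected route */
        return 0;
    }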
I. THE AUDITORY IMAGE MODEL

A. The spectral analysis stage

Spectral analysis is performed by a bank of auditory filters which
converts a digitized wave into an array of filtered waves like those
shown in the top panels of Figs. 2 and 3. The set of waves is AIM's
representation of basilar membrane motion. The software distributes
the filters linearly along a frequency scale measured in Equivalent
Rectangular Bandwidths (ERBs). The ERB scale was proposed by Glasberg
and Moore (1990) based on physiological research summarized in
Greenwood (1990) and psychoacoustic research summarized in Patterson
and Moore (1986). The constants of the ERB function can also be set to
produce a reasonable approximation to the Bark scale. Options enable
the user to specify the number of channels in the filterbank and the
minimum and maximum filter center frequencies.

AIM provides both a functional auditory filter and a physiological
auditory filter for generating the BMM: the former is a linear,
gammatone filter (Patterson et al., 1992a); the latter is a
non-linear, transmission-line filter (Giguère and Woodland, 1994a).
The impulse response of the gammatone filter provides an excellent fit
to the impulse response of primary auditory neurons in cats, and its
amplitude characteristic is very similar to that of the 'roex' filter
commonly used to represent the human auditory filter. The motivation
for the gammatone filterbank and the available implementations are
summarized in Patterson (1994a). The input wave is passed through an
optional middle-ear filter adapted from Lutman and Martin (1979).

In the physiological version, a 'wave digital filter' is used to
implement the classical, one-dimensional, transmission-line
approximation to cochlear hydrodynamics. A feedback circuit
representing the fast motile response of the outer haircells generates
level-dependent basilar membrane motion (Giguère and Woodland, 1994a).
The filterbank generates combination tones of the type f1-n(f2-f1),
which propagate to the appropriate channel, and it has the potential
to generate cochlear echoes. Options enable the user to customize the
transmission-line filter by specifying the feedback gain and
saturation level of the outer-haircell circuit. The middle-ear filter
forms an integral part of the simulation in this case; together, it
and the transmission-line filterbank provide a bi-directional model of
auditory spectral analysis.

The upper panels of Figs. 2 and 3 show the responses of the two
filterbanks to the vowel in 'hat'. They have 75 channels covering the
frequency range 100 to 6000 Hz (3.3 to 30.6 ERBs). In the
high-frequency channels, the filters are broad and the glottal pulses
generate impulse responses which decay relatively quickly. In the
low-frequency channels, the filters are narrow and so they resolve
individual harmonics, which appear as continuous waves. The rightward
skew in the low-frequency channels is the 'phase lag,' or 'propagation
delay,' of the cochlea, which arises because the narrower
low-frequency filters respond more slowly to input. The transmission-line
filterbank shows more ringing in the valleys than the gammatone
filterbank because of its dynamic signal compression; as amplitude
decreases, the damping of the basilar membrane is reduced to increase
sensitivity and frequency resolution.
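For reference, the ERB function of Glasberg and Moore (1990) is
ERB(f) = 24.7(4.37f/1000 + 1) Hz, and the corresponding ERB-rate scale
is E(f) = 21.4 log10(4.37f/1000 + 1). The C sketch below distributes
center frequencies uniformly on this scale for the 75-channel,
100-6000 Hz configuration just described. The constants are Glasberg
and Moore's; the code itself is an illustration, not an excerpt from
the AIM sources.

    #include <math.h>
    #include <stdio.h>

    /* ERB of the auditory filter at centre frequency f in Hz,
     * after Glasberg and Moore (1990). */
    static double erb(double f)      { return 24.7 * (4.37 * f / 1000.0 + 1.0); }

    /* ERB-rate scale: the number of ERBs below frequency f. */
    static double erb_rate(double f) { return 21.4 * log10(4.37 * f / 1000.0 + 1.0); }

    /* Inverse of erb_rate: the frequency at a point on the ERB scale. */
    static double erb_rate_inv(double e)
    {
        return (pow(10.0, e / 21.4) - 1.0) * 1000.0 / 4.37;
    }

    int main(void)
    {
        /* 75 channels spaced uniformly in ERB-rate between 100 Hz and
         * 6000 Hz (about 3.3 to 30.6 ERBs), as in Figs. 2 and 3. */
        const int nchan = 75;
        const double lo = erb_rate(100.0), hi = erb_rate(6000.0);

        for (int i = 0; i < nchan; i++) {
            double fc = erb_rate_inv(lo + (hi - lo) * i / (nchan - 1));
            printf("channel %2d: fc = %7.1f Hz, ERB = %6.1f Hz\n",
                   i, fc, erb(fc));
        }
        return 0;
    }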
B. The neural encoding stage

The second stage of AIM simulates the mechanical/neural transduction
process performed by the inner haircells. It converts the BMM into a
neural activity pattern (NAP), which is AIM's representation of the
afferent activity in the auditory nerve. Two alternative simulations
are provided for generating the NAP: a bank of two-dimensional
adaptive-thresholding units (Holdsworth and Patterson, 1993), or a
bank of inner haircell simulators (Meddis, 1988).

The adaptive-thresholding mechanism is a functional representation of
neural encoding. It begins by rectifying and compressing the BMM; it
then applies adaptation in time and suppression across frequency. The
adaptation and suppression are coupled, and they jointly sharpen
features like vowel formants in the compressed BMM representation.
Briefly, an adaptive threshold value is maintained for each channel
and updated at the sampling rate. The new value is the largest of a)
the previous value reduced by a fast-acting temporal decay factor, b)
the previous value reduced by a longer-term temporal decay factor, c)
the adapted level in the channel immediately above, reduced by a
frequency-spread factor, or d) the adapted level in the channel
immediately below, reduced by the same frequency-spread factor. The
mechanism produces output whenever the input exceeds the adaptive
threshold, and the output level is the difference between the input
and the adaptive threshold. The parameters that control the spread of
activity in time and frequency are options in AIM.
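A literal, single-update rendering of this rule in C might look as
follows. Two points are our assumptions rather than part of the
published description: 'reduced by a factor' is read as
multiplication, and the threshold is made to track inputs that exceed
it (the description above leaves implicit how the threshold rises).
The constants are placeholders, not AIM's defaults.

    /* One update, at the sampling rate, of the adaptive threshold in
     * every channel, following rules a)-d) above. */
    static void adaptive_threshold_step(
        const float *bmm,  /* rectified, compressed BMM, one sample/channel */
        float *thresh,     /* adaptive threshold state, one value/channel   */
        float *nap,        /* output NAP sample, one value/channel          */
        int nchan)
    {
        const float fast_decay  = 0.99f;   /* a) fast-acting temporal decay */
        const float slow_decay  = 0.999f;  /* b) longer-term temporal decay */
        const float freq_spread = 0.9f;    /* c,d) spread across frequency  */

        float below = 0.0f;  /* pre-update threshold of the channel below */
        for (int c = 0; c < nchan; c++) {
            float old = thresh[c];
            float t = old * fast_decay;                          /* a) */
            if (old * slow_decay > t) t = old * slow_decay;      /* b) */
            if (c + 1 < nchan && thresh[c + 1] * freq_spread > t)
                t = thresh[c + 1] * freq_spread;                 /* c) */
            if (below * freq_spread > t)
                t = below * freq_spread;                         /* d) */

            /* Output only where the input exceeds the threshold; the
             * output level is the difference between the two. */
            if (bmm[c] > t) {
                nap[c] = bmm[c] - t;
                thresh[c] = bmm[c];  /* assumed: threshold tracks peaks */
            } else {
                nap[c] = 0.0f;
                thresh[c] = t;
            }
            below = old;
        }
    }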
The Meddis (1988) module simulates the operation of an individual
inner haircell; specifically, it simulates the flow of
neurotransmitter across three reservoirs that are postulated to exist
in and around the haircell. The module reproduces important properties
of single afferent fibres, such as two-component time adaptation and
phase-locking. The transmitter-flow equations are solved using the
wave-digital-filter algorithm described in Giguère and Woodland
(1994a). There is one haircell simulator for each channel of the
filterbank. Options allow the user to shift the entire rate-intensity
function to a higher or lower level, and to specify the type of fibre
(medium or high spontaneous rate).

The middle panels of Figs. 2 and 3 show the NAPs obtained with
adaptive thresholding and the Meddis module in response to BMMs from
the gammatone and transmission-line filterbanks, respectively. The
phase lag of the BMM is preserved in the NAP. The positive
half-cycles of the BMM waves have been sharpened in time, an effect
which is more obvious in the adaptive-thresholding NAP. Sharpening is
also evident in the frequency dimension of the adaptive-thresholding
NAP. The individual 'haircells' are not coupled across channels in the
Meddis module, so there is no frequency sharpening in this case. The
physiological NAP reveals that the activity between glottal pulses in
the high-frequency channels is due to the strong sixth harmonic in the
first formant of the vowel.


C. The temporal integration stage

Periodic sounds give rise to static, rather than oscillating,
perceptions, indicating that temporal integration is applied to the
NAP in the production of our initial perception of a sound -- our
auditory image. Traditionally, auditory temporal integration is
represented by a simple leaky-integration process, and AIM provides a
bank of lowpass filters to enable the user to generate auditory
spectra (Patterson, 1994a) and auditory spectrograms (Patterson et
al., 1992b). However, the leaky integrator removes the phase-locked
fine structure observed in the NAP, and this conflicts with perceptual
data indicating that the fine structure plays an important role in
determining sound quality and source identification (Patterson, 1994b;
Patterson and Akeroyd, 1995). As a result, AIM includes two modules
which preserve much of the time-interval information in the NAP during
temporal integration, and which produce a better representation of our
auditory images. In the functional version of AIM, this is
accomplished with strobed temporal integration (Patterson et al.,
1992a,b); in the physiological version, it is accomplished with a bank
of autocorrelators (Slaney and Lyon, 1990; Meddis and Hewitt, 1991).

In the case of strobed temporal integration (STI), a bank of delay
lines is used to form a buffer store for the NAP, one delay line per
channel, and as the NAP proceeds along the buffer it decays linearly
with time, at about 2.5%/ms. Each channel of the buffer is assigned a
strobe unit which monitors activity in that channel, looking for local
maxima in the stream of NAP pulses. When one is found, the unit
initiates temporal integration in that channel; that is, it transfers
a copy of the NAP at that instant to the corresponding channel of an
image buffer and adds it point-for-point with whatever is already
there. The local maximum itself is mapped to the 0-ms point in the
image buffer. The multi-channel version of this STI process produces
AIM's representation of our auditory image of a sound. Periodic and
quasi-periodic sounds cause regular strobing, which leads to simulated
auditory images that are static, or nearly static, and which have the
same temporal resolution as the NAP. Dynamic sounds are represented
as a sequence of auditory image frames. If the rate of change in a
sound is not too rapid, as in diphthongs, features are seen to move
smoothly as the sound proceeds, much as characters move smoothly in
animated cartoons.
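A single channel of this process might be sketched in C as below. The
buffer length, the strobe criterion (a sample above a floor and larger
than its neighbours), and the per-sample approximation of the linear
2.5%/ms decay are our simplifications of the mechanism described
above, not the AIM implementation.

    #include <string.h>

    #define SRATE  20000                 /* Hz; the rate of the aim/waves sounds */
    #define IMGLEN (SRATE * 35 / 1000)   /* 35-ms buffers (length is our choice) */

    /* One channel of strobed temporal integration: a delay line for
     * the decaying NAP, an image buffer, and a two-sample history for
     * the strobe unit's local-maximum test. */
    typedef struct {
        float nap[IMGLEN];     /* delay line; nap[0] is the newest sample */
        float image[IMGLEN];   /* image buffer; index 0 is the 0-ms point */
        float prev1, prev2;    /* last two NAP samples, for peak picking  */
    } StiChan;

    static void sti_step(StiChan *c, float x, float strobe_floor)
    {
        /* Linear decay of about 2.5 %/ms, approximated per sample. */
        const float decay = 0.025f * 1000.0f / SRATE;

        /* Age the delay line and admit the new NAP sample. */
        memmove(c->nap + 1, c->nap, (IMGLEN - 1) * sizeof(float));
        for (int i = 1; i < IMGLEN; i++)
            c->nap[i] *= 1.0f - decay;
        c->nap[0] = x;

        /* Strobe on a local maximum in the NAP stream: transfer a copy
         * of the delay line to the image buffer, adding it
         * point-for-point, with the local maximum itself mapped to the
         * 0-ms point of the image. */
        if (c->prev1 > strobe_floor && c->prev1 > c->prev2 && c->prev1 >= x)
            for (int i = 0; i + 1 < IMGLEN; i++)
                c->image[i] += c->nap[i + 1];   /* nap[1] holds the maximum */

        c->prev2 = c->prev1;
        c->prev1 = x;
    }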
An alternative form of temporal integration is provided by the
correlogram (Slaney and Lyon, 1990; Meddis and Hewitt, 1991). It
extracts periodicity information and preserves intra-period fine
structure by autocorrelating each channel of the NAP; the correlogram
is the multi-channel version of this process. It was originally
introduced as a model of pitch perception (Licklider, 1951), with a
neural wiring diagram to illustrate that it was physiologically
plausible. To date, however, there is no physiological evidence for
autocorrelation in the auditory system, and the installation of the
module in the physiological route was a matter of convenience. The
current implementation is a recursive, or running, autocorrelation
(sketched below); a functionally equivalent FFT-based method is also
provided (Allerhand and Patterson, 1992). A comparison of the
correlogram in the bottom panel of Fig. 3 with the auditory image in
the bottom panel of Fig. 2 shows that the vowel structure is more
symmetric, and the level contrasts larger, in the correlogram. It is
not yet known whether one of the representations is more realistic or
more useful. The present purpose is to note that the software package
can be used to compare auditory representations in a way not
previously possible.
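A running autocorrelation is essentially a leaky average, per channel
and per lag, of the product of the NAP with a delayed copy of itself.
The C sketch below shows the generic technique; the lag range and
smoothing coefficient are illustrative, and the exact weighting used
in AIM R7 may differ.

    #define NLAG 400   /* lags 0-20 ms at 20 kHz (range is our choice) */

    /* One channel of a running (recursive) autocorrelogram: each lag
     * bin is a leaky average of the product of the current NAP sample
     * with a delayed copy of itself. */
    typedef struct {
        float hist[NLAG];   /* recent NAP samples; hist[0] is the newest */
        float acf[NLAG];    /* running autocorrelation, one bin per lag  */
    } AcfChan;

    static void acf_step(AcfChan *c, float x, float alpha /* e.g. 0.002 */)
    {
        /* Age the sample history and admit the new NAP sample. */
        for (int i = NLAG - 1; i > 0; i--)
            c->hist[i] = c->hist[i - 1];
        c->hist[0] = x;

        /* Leaky integration of x(t) * x(t - lag) for every lag. */
        for (int lag = 0; lag < NLAG; lag++)
            c->acf[lag] += alpha * (x * c->hist[lag] - c->acf[lag]);
    }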
II. THE SOFTWARE/HARDWARE PLATFORM

i. The software package: The code is distributed as a compressed
archive (in unix tar format), and can be obtained via ftp from the
address ftp.mrc-apu.cam.ac.uk (Name=anonymous; Password=). All the
software is contained in a single archive: pub/aim/aim.tar.Z. The
associated text file pub/aim/ReadMe contains instructions for
installing and compiling the software. The AIM package consists of a
makefile and several sub-directories. Five of these (filter, glib,
model, stitch and wdf) contain the C code for AIM. An aim/tools
directory contains C code for ancillary software tools, which are
provided for pre/post-processing of model input/output; the functions
offered include stimulus generation, signal processing, and data
manipulation. An aim/man directory contains on-line manual pages
describing AIM and the software tools. An aim/scripts directory
contains demonstration scripts for a guided tour through the model.
Sounds used to test and demonstrate the model are provided in the
aim/waves directory. These sounds were sampled at 20 kHz, and each
sample is a 2-byte number in little-endian byte order; a tool is
provided to swap the byte order when necessary.

ii. System requirements: The software is written in C. The code
generated by the native C compilers included with Ultrix (version 4.3a
and above) and SunOS (version 4.1.3 and above) has been extensively
tested; the code from the GNU C compiler (version 2.5.7 and above) is
also reliable. The total disc usage of the AIM source code is about
700 kbytes. The package also includes 500 kbytes of sources for
ancillary software tools, and 200 kbytes of documentation. The
executable programs occupy about 1000 kbytes, and the executables for
the ancillary tools occupy 7000 kbytes. About 800 kbytes of temporary
space are required for object files during compilation. The graphical
interface uses X11 (R4 and above) with either the OpenWindows or Motif
user interface. The programs can be compiled using the base Xlib
library (libX11.a), and will run on both 1-bit (mono) and multi-plane
(colour or greyscale) displays.

iii. Compilation and operation: The makefile includes targets to
compile the source code for AIM and the associated tools on a range of
machines (DEC, SUN, SGI, HP); the targets differ only in the pathnames
for the local X11 base library (libX11.a) and header files (X11/X.h
and X11/Xlib.h). AIM can be compiled without the display code if the
graphics interface is not required or if X11 is not available (make
noplot). The executable for AIM is called gen. Compilation also
generates symbolic links to gen, such as genbmm, gennap and gensai,
which are used to select the desired output (BMM, NAP or SAI). The
links and the executables for the aim/tools are installed in the
aim/bin directory after compilation. Options are specified as
name=value on the command line; unspecified options are assigned
default values. The model output takes the form of binary data routed
by default to the model's graphical displays. Output can also be
routed to plotting hardware or to other post-processing software.


III. APPLICATIONS AND SUMMARY

In hearing research, the functional version of AIM has been used to
model phase perception (Patterson, 1987), octave perception (Patterson
et al., 1993), and timbre perception (Patterson, 1994b). The
physiological version has been used to simulate cochlear hearing loss
(Giguère, Woodland, and Robinson, 1993; Giguère and Woodland, 1994b),
and combination tones of cochlear origin (Giguère, Kunov, and
Smoorenburg, 1995). In speech research, the functional version has
been used to explain syllabic stress (Allerhand et al., 1992), and
both versions have been used as preprocessors for speech recognition
systems (e.g. Patterson, Anderson, and Allerhand, 1994; Giguère et
al., 1993). In summary, the AIM software package provides a modular
architecture for time-domain computational studies of peripheral
auditory processing.


* Instructions for acquiring the software package electronically are
presented in Section II. This document refers to AIM R7, which is the
first official release.
ACKNOWLEDGEMENTS

The gammatone filterbank, adaptive thresholding, and much of the
software platform were written by John Holdsworth; the options handler
is by Paul Manson, and the revised STI module by Jay Datta. Michael
Akeroyd extended the PostScript facilities and developed the xreview
routine for auditory image cartoons. The software development was
supported by grants from DRA Farnborough (U.K.), Esprit BR 3207 (EEC),
and the Hearing Research Trust (U.K.). We thank Malcolm Slaney and
Michael Akeroyd for helpful comments on an earlier version of the
paper.


Allerhand, M., and Patterson, R.D. (1992). "Correlograms and auditory
images," Proc. Inst. Acoust. 14, 281-288.

Allerhand, M., Butterfield, S., Cutler, A., and Patterson, R.D.
(1992). "Assessing syllable strength via an auditory model," Proc.
Inst. Acoust. 14, 297-304.

Assmann, P.F., and Summerfield, Q. (1990). "Modelling the perception
of concurrent vowels: Vowels with different fundamental frequencies,"
J. Acoust. Soc. Am. 88, 680-697.

Brown, G.J., and Cooke, M. (1994). "Computational auditory scene
analysis," Computer Speech and Language 8, 297-336.

Giguère, C., Woodland, P.C., and Robinson, A.J. (1993). "Application
of an auditory model to the computer simulation of hearing impairment:
Preliminary results," Can. Acoust. 21, 135-136.

Giguère, C., and Woodland, P.C. (1994a). "A computational model of the
auditory periphery for speech and hearing research. I: Ascending
path," J. Acoust. Soc. Am. 95, 331-342.

Giguère, C., and Woodland, P.C. (1994b). "A computational model of the
auditory periphery for speech and hearing research. II: Descending
paths," J. Acoust. Soc. Am. 95, 343-349.

Giguère, C., Kunov, H., and Smoorenburg, G.F. (1995). "Computational
modelling of psycho-acoustic combination tones and distortion-product
otoacoustic emissions," 15th Int. Cong. on Acoustics, Trondheim
(Norway), 26-30 June.

Glasberg, B.R., and Moore, B.C.J. (1990). "Derivation of auditory
filter shapes from notched-noise data," Hear. Res. 47, 103-138.

Greenwood, D.D. (1990). "A cochlear frequency-position function for
several species - 29 years later," J. Acoust. Soc. Am. 87, 2592-2605.

Holdsworth, J.W., and Patterson, R.D. (1991). "Analysis of waveforms,"
UK Patent No. GB 2-234-078-A (23.1.91). London: UK Patent Office.

Licklider, J.C.R. (1951). "A duplex theory of pitch perception,"
Experientia 7, 128-133.

Lutman, M.E., and Martin, A.M. (1979). "Development of an
electroacoustic analogue model of the middle ear and acoustic reflex,"
J. Sound Vib. 64, 133-157.

Meddis, R. (1988). "Simulation of auditory-neural transduction:
Further studies," J. Acoust. Soc. Am. 83, 1056-1063.

Meddis, R., and Hewitt, M.J. (1991). "Modelling the perception of
concurrent vowels with different fundamental frequencies," J. Acoust.
Soc. Am. 91, 233-245.

Patterson, R.D. (1987). "A pulse ribbon model of monaural phase
perception," J. Acoust. Soc. Am. 82, 1560-1586.

Patterson, R.D. (1994a). "The sound of a sinusoid: Spectral models,"
J. Acoust. Soc. Am. 96, 1409-1418.

Patterson, R.D. (1994b). "The sound of a sinusoid: Time-interval
models," J. Acoust. Soc. Am. 96, 1419-1428.

Patterson, R.D., and Akeroyd, M.A. (1995). "Time-interval patterns and
sound quality," in Advances in Hearing Research: Proceedings of the
10th International Symposium on Hearing, edited by G. Manley, G.
Klump, C. Köppl, H. Fastl, and H. Oeckinghaus (World Scientific,
Singapore) (in press).

Patterson, R.D., Anderson, T., and Allerhand, M. (1994). "The auditory
image model as a preprocessor for spoken language," in Proc. Third
ICSLP, Yokohama, Japan, 1395-1398.

Patterson, R.D., Milroy, R., and Allerhand, M. (1993). "What is the
octave of a harmonically rich note?" in Proc. 2nd Int. Conf. on Music
and the Cognitive Sciences, edited by I. Cross and I. Deliège
(Harwood, Switzerland), 69-81.

Patterson, R.D., and Moore, B.C.J. (1986). "Auditory filters and
excitation patterns as representations of frequency resolution," in
Frequency Selectivity in Hearing, edited by B.C.J. Moore (Academic,
London), pp. 123-177.

Patterson, R.D., Holdsworth, J., and Allerhand, M. (1992b). "Auditory
models as preprocessors for speech recognition," in The Auditory
Processing of Speech: From the Auditory Periphery to Words, edited by
M.E.H. Schouten (Mouton de Gruyter, Berlin), 67-83.

Patterson, R.D., Robinson, K., Holdsworth, J., McKeown, D., Zhang, C.,
and Allerhand, M. (1992a). "Complex sounds and auditory images," in
Auditory Physiology and Perception, edited by Y. Cazals, L. Demany,
and K. Horner (Pergamon, Oxford), 429-446.

Slaney, M., and Lyon, R.F. (1990). "A perceptual pitch detector," in
Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, Albuquerque,
New Mexico, April 1990.


Figure 1. The three-stage structure of the AIM software package.
Left-hand column: functional route; right-hand column: physiological
route. For each module, the figure shows the function (bold type), the
implementation (in the rectangle), and the simulation it produces
(italics).

Figure 2. Responses of the model to the vowel in 'hat' processed
through the functional route: (top) basilar membrane motion, (middle)
neural activity pattern, and (bottom) auditory image.

Figure 3. Responses of the model to the vowel in 'hat' processed
through the physiological route: (top) basilar membrane motion,
(middle) neural activity pattern, and (bottom) autocorrelogram image.