Revised for JASA, 3 April 95


Time-domain modelling of peripheral auditory processing:
A modular architecture and a software platform*

Roy D. Patterson and Mike H. Allerhand
MRC Applied Psychology Unit, 15 Chaucer Road, Cambridge CB2 2EF, UK

Christian Giguère
Laboratory of Experimental Audiology, University Hospital Utrecht,
3508 GA Utrecht, The Netherlands

(Received December 1994; revised 31 March 1995)

A software package with a modular architecture has been developed to
support perceptual modelling of the fine-grain spectro-temporal
information observed in the auditory nerve. The package contains both
functional and physiological modules to simulate auditory spectral
analysis, neural encoding and temporal integration, including new
forms of periodicity-sensitive temporal integration that generate
stabilized auditory images. Combinations of the modules enable the
user to approximate a wide variety of existing, time-domain, auditory
models. Sequences of auditory images can be replayed to produce
cartoons of auditory perceptions that illustrate the dynamic response
of the auditory system to everyday sounds.

PACS numbers: 43.64.Bt, 43.66.Ba, 43.71.An

Running head: Auditory Image Model Software


INTRODUCTION

Several years ago, we developed a functional model of the cochlea to
simulate the phase-locked activity that complex sounds produce in the
auditory nerve. The purpose was to investigate the role of the
fine-grain timing information in auditory perception generally
(Patterson et al., 1992a; Patterson and Akeroyd, 1995), and in speech
perception in particular (Patterson, Holdsworth and Allerhand, 1992b).
The architecture of the resulting Auditory Image Model (AIM) is shown
in the left-hand column of Fig. 1. The responses of the three modules
to the vowel in 'hat' are shown in the three panels of Fig. 2.
Briefly, the spectral analysis stage converts the sound wave into the
model's representation of basilar membrane motion (BMM). For the vowel
in 'hat', each glottal cycle generates a version of the basic vowel
structure in the BMM (top panel). The neural encoding stage
stabilizes the BMM in level and sharpens features like vowel formants,
to produce a simulation of the neural activity pattern (NAP) that the
sound produces in the auditory nerve (middle panel). The temporal
integration stage stabilizes the repeating structure in the NAP and
produces a simulation of our perception of the vowel (bottom panel),
referred to as the auditory image. Sequences of simulated images can
be generated at regular intervals and replayed as an animated cartoon
to show the dynamic behaviour of the auditory images produced by
everyday sounds.
An earlier version of the AIM software was made available to
collaborators via the Internet. From there it spread to the speech and
music communities, indicating a more general interest in auditory
models than we had originally anticipated. This has prompted us to
prepare documentation and a formal release of the software (AIM R7).

A number of users wanted to compare the outputs from the functional
model, which is almost level independent, with those from
physiological models of the cochlea, which are fundamentally level
dependent. Others wanted to compare the auditory images produced by
strobed temporal integration with correlograms. As a result, we have
installed alternative modules for each of the three main stages, as
shown in the right-hand column of Fig. 1. The alternative spectral
analysis module is a non-linear, transmission-line filterbank based on
Giguère and Woodland (1994a). The neural encoding module is based on
the inner haircell model of Meddis (1988). The temporal integration
module generates correlograms like those of Slaney and Lyon (1990) or
Meddis and Hewitt (1991), using the algorithm proposed by Allerhand
and Patterson (1992). The responses of the three modules to the vowel
in 'hat' are shown in Fig. 3 for the case where the level of the vowel
is 60 dB SPL. The patterns are broadly similar to those of the
functional modules but the details differ, particularly at the output
of the third stage. The differences grow more pronounced when the
level of the vowel is reduced to 30 dB SPL or increased to 90 dB SPL.
Figures 2 and 3 together illustrate how the software can be used to
compare and contrast different auditory models. The new modules also
open the way to time-domain simulation of hearing impairment and of
distortion products of cochlear origin.

Switches were installed to enable the user to shift from the
functional to the physiological version of AIM at the output of each
stage of the model. This architecture also enables the system to
implement other popular auditory models, such as the
gammatone-filterbank, Meddis-haircell, correlogram models proposed by
Assmann and Summerfield (1990), Meddis and Hewitt (1991), and Brown
and Cooke (1994). The remainder of this letter describes the
integrated software package, with emphasis on the functional and
physiological routes and on practical aspects of obtaining the
software package.*
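To make the switching architecture concrete, the C sketch below shows
how a three-stage cascade with a route switch at the output of each
stage might be organized. It is purely illustrative: the type and
function names are our inventions rather than identifiers from the AIM
sources, and the stubs merely announce themselves where the real
modules would transform multi-channel data.

    #include <stdio.h>

    /* A stand-in for AIM's multi-channel data blocks (BMM, NAP, image). */
    typedef struct { int nchan, nsamp; float *data; } Signal;
    typedef void (*Module)(Signal *s);

    /* Trivial stubs so the sketch compiles and runs; the names follow
     * Fig. 1 but are not the identifiers used in the AIM sources. */
    static void gammatone_fbank(Signal *s) { (void)s; puts("gammatone filterbank -> BMM"); }
    static void transline_fbank(Signal *s) { (void)s; puts("transmission-line filterbank -> BMM"); }
    static void adaptive_thresh(Signal *s) { (void)s; puts("adaptive thresholding -> NAP"); }
    static void meddis_haircell(Signal *s) { (void)s; puts("Meddis haircell -> NAP"); }
    static void strobed_ti(Signal *s)      { (void)s; puts("strobed integration -> auditory image"); }
    static void autocorr(Signal *s)        { (void)s; puts("autocorrelation -> correlogram"); }

    int main(void)
    {
        /* One switch per stage: 0 selects the functional module, 1 the
         * physiological one.  The 0,1,1 setting below corresponds to the
         * gammatone-filterbank, Meddis-haircell, correlogram models
         * cited above. */
        const int physio[3] = { 0, 1, 1 };

        Module stage[3];
        stage[0] = physio[0] ? transline_fbank : gammatone_fbank;
        stage[1] = physio[1] ? meddis_haircell : adaptive_thresh;
        stage[2] = physio[2] ? autocorr : strobed_ti;

        Signal s = { 75, 0, 0 };          /* a 75-channel example block */
        for (int i = 0; i < 3; i++)
            stage[i](&s);                 /* cascade the selected route */
        return 0;
    }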
I. THE AUDITORY IMAGE MODEL

A. The spectral analysis stage

Spectral analysis is performed by a bank of auditory filters which
converts a digitized wave into an array of filtered waves like those
shown in the top panels of Figs. 2 and 3. The set of waves is AIM's
representation of basilar membrane motion. The software distributes
the filters linearly along a frequency scale measured in Equivalent
Rectangular Bandwidths (ERBs). The ERB scale was proposed by Glasberg
and Moore (1990) based on physiological research summarized in
Greenwood (1990) and psychoacoustic research summarized in Patterson
and Moore (1986). The constants of the ERB function can also be set to
produce a reasonable approximation to the Bark scale. Options enable
the user to specify the number of channels in the filterbank and the
minimum and maximum filter center frequencies.

AIM provides both a functional auditory filter and a physiological
auditory filter for generating the BMM: the former is a linear,
gammatone filter (Patterson et al., 1992a); the latter is a
non-linear, transmission-line filter (Giguère and Woodland, 1994a).
The impulse response of the gammatone filter provides an excellent fit
to the impulse response of primary auditory neurons in cats, and its
amplitude characteristic is very similar to that of the 'roex' filter
commonly used to represent the human auditory filter. The motivation
for the gammatone filterbank and the available implementations are
summarized in Patterson (1994a). The input wave is passed through an
optional middle-ear filter adapted from Lutman and Martin (1979).

In the physiological version, a 'wave digital filter' is used to
implement the classical, one-dimensional, transmission-line
approximation to cochlear hydrodynamics. A feedback circuit
representing the fast motile response of the outer haircells generates
level-dependent basilar membrane motion (Giguère and Woodland, 1994a).
The filterbank generates combination tones of the type f1-n(f2-f1),
which propagate to the appropriate channel, and it has the potential
to generate cochlear echoes. Options enable the user to customize the
transmission-line filter by specifying the feedback gain and
saturation level of the outer-haircell circuit. The middle-ear filter
forms an integral part of the simulation in this case; together, it
and the transmission-line filterbank provide a bi-directional model of
auditory spectral analysis.

The upper panels of Figs. 2 and 3 show the responses of the two
filterbanks to the vowel in 'hat'. They have 75 channels covering the
frequency range 100 to 6000 Hz (3.3 to 30.6 ERBs). In the
high-frequency channels, the filters are broad and the glottal pulses
generate impulse responses which decay relatively quickly. In the
low-frequency channels, the filters are narrow and so they resolve
individual harmonics, which appear as continuous waves. The rightward
skew in the low-frequency channels is the 'phase lag,' or 'propagation
delay,' of the cochlea, which arises because the narrower
low-frequency filters respond more slowly to input. The transmission-line
filterbank shows more ringing in the valleys than the gammatone
filterbank because of its dynamic signal compression; as amplitude
decreases, the damping of the basilar membrane is reduced to increase
sensitivity and frequency resolution.
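For reference, the ERB function of Glasberg and Moore (1990) is
ERB(f) = 24.7(4.37f/1000 + 1) Hz, and the corresponding ERB-rate scale
is E(f) = 21.4 log10(4.37f/1000 + 1). The C sketch below distributes
center frequencies uniformly on this scale for the 75-channel,
100-6000 Hz configuration just described. The constants are Glasberg
and Moore's; the code itself is an illustration, not an excerpt from
the AIM sources.

    #include <math.h>
    #include <stdio.h>

    /* ERB of the auditory filter at centre frequency f in Hz,
     * after Glasberg and Moore (1990). */
    static double erb(double f)      { return 24.7 * (4.37 * f / 1000.0 + 1.0); }

    /* ERB-rate scale: the number of ERBs below frequency f. */
    static double erb_rate(double f) { return 21.4 * log10(4.37 * f / 1000.0 + 1.0); }

    /* Inverse of erb_rate: the frequency at a point on the ERB scale. */
    static double erb_rate_inv(double e)
    {
        return (pow(10.0, e / 21.4) - 1.0) * 1000.0 / 4.37;
    }

    int main(void)
    {
        /* 75 channels spaced uniformly in ERB-rate between 100 Hz and
         * 6000 Hz (about 3.3 to 30.6 ERBs), as in Figs. 2 and 3. */
        const int nchan = 75;
        const double lo = erb_rate(100.0), hi = erb_rate(6000.0);

        for (int i = 0; i < nchan; i++) {
            double fc = erb_rate_inv(lo + (hi - lo) * i / (nchan - 1));
            printf("channel %2d: fc = %7.1f Hz, ERB = %6.1f Hz\n",
                   i, fc, erb(fc));
        }
        return 0;
    }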
B. The neural encoding stage

The second stage of AIM simulates the mechanical/neural transduction
process performed by the inner haircells. It converts the BMM into a
neural activity pattern (NAP), which is AIM's representation of the
afferent activity in the auditory nerve. Two alternative simulations
are provided for generating the NAP: a bank of two-dimensional
adaptive-thresholding units (Holdsworth and Patterson, 1993), or a
bank of inner haircell simulators (Meddis, 1988).

The adaptive-thresholding mechanism is a functional representation of
neural encoding. It begins by rectifying and compressing the BMM; it
then applies adaptation in time and suppression across frequency. The
adaptation and suppression are coupled, and they jointly sharpen
features like vowel formants in the compressed BMM representation.
Briefly, an adaptive threshold value is maintained for each channel
and updated at the sampling rate. The new value is the largest of a)
the previous value reduced by a fast-acting temporal decay factor, b)
the previous value reduced by a longer-term temporal decay factor, c)
the adapted level in the channel immediately above, reduced by a
frequency-spread factor, or d) the adapted level in the channel
immediately below, reduced by the same frequency-spread factor. The
mechanism produces output whenever the input exceeds the adaptive
threshold, and the output level is the difference between the input
and the adaptive threshold. The parameters that control the spread of
activity in time and frequency are options in AIM.
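A literal, single-update rendering of this rule in C might look as
follows. Two points are our assumptions rather than part of the
published description: 'reduced by a factor' is read as
multiplication, and the threshold is made to track inputs that exceed
it (the description above leaves implicit how the threshold rises).
The constants are placeholders, not AIM's defaults.

    /* One update, at the sampling rate, of the adaptive threshold in
     * every channel, following rules a)-d) above. */
    static void adaptive_threshold_step(
        const float *bmm,  /* rectified, compressed BMM, one sample/channel */
        float *thresh,     /* adaptive threshold state, one value/channel   */
        float *nap,        /* output NAP sample, one value/channel          */
        int nchan)
    {
        const float fast_decay  = 0.99f;   /* a) fast-acting temporal decay */
        const float slow_decay  = 0.999f;  /* b) longer-term temporal decay */
        const float freq_spread = 0.9f;    /* c,d) spread across frequency  */

        float below = 0.0f;  /* pre-update threshold of the channel below */
        for (int c = 0; c < nchan; c++) {
            float old = thresh[c];
            float t = old * fast_decay;                          /* a) */
            if (old * slow_decay > t) t = old * slow_decay;      /* b) */
            if (c + 1 < nchan && thresh[c + 1] * freq_spread > t)
                t = thresh[c + 1] * freq_spread;                 /* c) */
            if (below * freq_spread > t)
                t = below * freq_spread;                         /* d) */

            /* Output only where the input exceeds the threshold; the
             * output level is the difference between the two. */
            if (bmm[c] > t) {
                nap[c] = bmm[c] - t;
                thresh[c] = bmm[c];  /* assumed: threshold tracks peaks */
            } else {
                nap[c] = 0.0f;
                thresh[c] = t;
            }
            below = old;
        }
    }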
The Meddis (1988) module simulates the operation of an individual
inner haircell; specifically, it simulates the flow of
neurotransmitter across three reservoirs that are postulated to exist
in and around the haircell. The module reproduces important properties
of single afferent fibres, such as two-component time adaptation and
phase-locking. The transmitter-flow equations are solved using the
wave-digital-filter algorithm described in Giguère and Woodland
(1994a). There is one haircell simulator for each channel of the
filterbank. Options allow the user to shift the entire rate-intensity
function to a higher or lower level, and to specify the type of fibre
(medium or high spontaneous rate).

The middle panels of Figs. 2 and 3 show the NAPs obtained with
adaptive thresholding and the Meddis module in response to BMMs from
the gammatone and transmission-line filterbanks, respectively. The
phase lag of the BMM is preserved in the NAP. The positive
half-cycles of the BMM waves have been sharpened in time, an effect
which is more obvious in the adaptive-thresholding NAP. Sharpening is
also evident in the frequency dimension of the adaptive-thresholding
NAP. The individual 'haircells' are not coupled across channels in the
Meddis module, so there is no frequency sharpening in this case. The
physiological NAP reveals that the activity between glottal pulses in
the high-frequency channels is due to the strong sixth harmonic in the
first formant of the vowel.


C. The temporal integration stage

Periodic sounds give rise to static, rather than oscillating,
perceptions, indicating that temporal integration is applied to the
NAP in the production of our initial perception of a sound -- our
auditory image. Traditionally, auditory temporal integration is
represented by a simple leaky-integration process, and AIM provides a
bank of lowpass filters to enable the user to generate auditory
spectra (Patterson, 1994a) and auditory spectrograms (Patterson et
al., 1992b). However, the leaky integrator removes the phase-locked
fine structure observed in the NAP, and this conflicts with perceptual
data indicating that the fine structure plays an important role in
determining sound quality and source identification (Patterson, 1994b;
Patterson and Akeroyd, 1995). As a result, AIM includes two modules
which preserve much of the time-interval information in the NAP during
temporal integration, and which produce a better representation of our
auditory images. In the functional version of AIM, this is
accomplished with strobed temporal integration (Patterson et al.,
1992a,b); in the physiological version, it is accomplished with a bank
of autocorrelators (Slaney and Lyon, 1990; Meddis and Hewitt, 1991).

In the case of strobed temporal integration (STI), a bank of delay
lines is used to form a buffer store for the NAP, one delay line per
channel, and as the NAP proceeds along the buffer it decays linearly
with time, at about 2.5%/ms. Each channel of the buffer is assigned a
strobe unit which monitors activity in that channel, looking for local
maxima in the stream of NAP pulses. When one is found, the unit
initiates temporal integration in that channel; that is, it transfers
a copy of the NAP at that instant to the corresponding channel of an
image buffer and adds it point-for-point with whatever is already
there. The local maximum itself is mapped to the 0-ms point in the
image buffer. The multi-channel version of this STI process produces
AIM's representation of our auditory image of a sound. Periodic and
quasi-periodic sounds cause regular strobing, which leads to simulated
auditory images that are static, or nearly static, and which have the
same temporal resolution as the NAP. Dynamic sounds are represented
as a sequence of auditory image frames. If the rate of change in a
sound is not too rapid, as in diphthongs, features are seen to move
smoothly as the sound proceeds, much as characters move smoothly in
animated cartoons.
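A single channel of this process might be sketched in C as below. The
buffer length, the strobe criterion (a sample above a floor and larger
than its neighbours), and the per-sample approximation of the linear
2.5%/ms decay are our simplifications of the mechanism described
above, not the AIM implementation.

    #include <string.h>

    #define SRATE  20000                 /* Hz; the rate of the aim/waves sounds */
    #define IMGLEN (SRATE * 35 / 1000)   /* 35-ms buffers (length is our choice) */

    /* One channel of strobed temporal integration: a delay line for
     * the decaying NAP, an image buffer, and a two-sample history for
     * the strobe unit's local-maximum test. */
    typedef struct {
        float nap[IMGLEN];     /* delay line; nap[0] is the newest sample */
        float image[IMGLEN];   /* image buffer; index 0 is the 0-ms point */
        float prev1, prev2;    /* last two NAP samples, for peak picking  */
    } StiChan;

    static void sti_step(StiChan *c, float x, float strobe_floor)
    {
        /* Linear decay of about 2.5 %/ms, approximated per sample. */
        const float decay = 0.025f * 1000.0f / SRATE;

        /* Age the delay line and admit the new NAP sample. */
        memmove(c->nap + 1, c->nap, (IMGLEN - 1) * sizeof(float));
        for (int i = 1; i < IMGLEN; i++)
            c->nap[i] *= 1.0f - decay;
        c->nap[0] = x;

        /* Strobe on a local maximum in the NAP stream: transfer a copy
         * of the delay line to the image buffer, adding it
         * point-for-point, with the local maximum itself mapped to the
         * 0-ms point of the image. */
        if (c->prev1 > strobe_floor && c->prev1 > c->prev2 && c->prev1 >= x)
            for (int i = 0; i + 1 < IMGLEN; i++)
                c->image[i] += c->nap[i + 1];   /* nap[1] holds the maximum */

        c->prev2 = c->prev1;
        c->prev1 = x;
    }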
An alternative form of temporal integration is provided by the
correlogram (Slaney and Lyon, 1990; Meddis and Hewitt, 1991). It
extracts periodicity information and preserves intra-period fine
structure by autocorrelating each channel of the NAP; the correlogram
is the multi-channel version of this process. It was originally
introduced as a model of pitch perception (Licklider, 1951), with a
neural wiring diagram to illustrate that it was physiologically
plausible. To date, however, there is no physiological evidence for
autocorrelation in the auditory system, and the installation of the
module in the physiological route was a matter of convenience. The
current implementation is a recursive, or running, autocorrelation
(sketched below); a functionally equivalent FFT-based method is also
provided (Allerhand and Patterson, 1992). A comparison of the
correlogram in the bottom panel of Fig. 3 with the auditory image in
the bottom panel of Fig. 2 shows that the vowel structure is more
symmetric, and the level contrasts larger, in the correlogram. It is
not yet known whether one of the representations is more realistic or
more useful. The present purpose is to note that the software package
can be used to compare auditory representations in a way not
previously possible.
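A running autocorrelation is essentially a leaky average, per channel
and per lag, of the product of the NAP with a delayed copy of itself.
The C sketch below shows the generic technique; the lag range and
smoothing coefficient are illustrative, and the exact weighting used
in AIM R7 may differ.

    #define NLAG 400   /* lags 0-20 ms at 20 kHz (range is our choice) */

    /* One channel of a running (recursive) autocorrelogram: each lag
     * bin is a leaky average of the product of the current NAP sample
     * with a delayed copy of itself. */
    typedef struct {
        float hist[NLAG];   /* recent NAP samples; hist[0] is the newest */
        float acf[NLAG];    /* running autocorrelation, one bin per lag  */
    } AcfChan;

    static void acf_step(AcfChan *c, float x, float alpha /* e.g. 0.002 */)
    {
        /* Age the sample history and admit the new NAP sample. */
        for (int i = NLAG - 1; i > 0; i--)
            c->hist[i] = c->hist[i - 1];
        c->hist[0] = x;

        /* Leaky integration of x(t) * x(t - lag) for every lag. */
        for (int lag = 0; lag < NLAG; lag++)
            c->acf[lag] += alpha * (x * c->hist[lag] - c->acf[lag]);
    }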
II. THE SOFTWARE/HARDWARE PLATFORM

i. The software package: The code is distributed as a compressed
archive (in unix tar format), and can be obtained via ftp from the
address ftp.mrc-apu.cam.ac.uk (Name=anonymous; Password=). All the
software is contained in a single archive: pub/aim/aim.tar.Z. The
associated text file pub/aim/ReadMe contains instructions for
installing and compiling the software. The AIM package consists of a
makefile and several sub-directories. Five of these (filter, glib,
model, stitch and wdf) contain the C code for AIM. An aim/tools
directory contains C code for ancillary software tools, which are
provided for pre/post-processing of model input/output; the functions
offered include stimulus generation, signal processing, and data
manipulation. An aim/man directory contains on-line manual pages
describing AIM and the software tools. An aim/scripts directory
contains demonstration scripts for a guided tour through the model.
Sounds used to test and demonstrate the model are provided in the
aim/waves directory. These sounds were sampled at 20 kHz, and each
sample is a 2-byte number in little-endian byte order; a tool is
provided to swap the byte order when necessary.

ii. System requirements: The software is written in C. The code
generated by the native C compilers included with Ultrix (version 4.3a
and above) and SunOS (version 4.1.3 and above) has been extensively
tested; the code from the GNU C compiler (version 2.5.7 and above) is
also reliable. The total disc usage of the AIM source code is about
700 kbytes. The package also includes 500 kbytes of sources for
ancillary software tools, and 200 kbytes of documentation. The
executable programs occupy about 1000 kbytes, and the executables for
the ancillary tools occupy 7000 kbytes. About 800 kbytes of temporary
space are required for object files during compilation. The graphical
interface uses X11 (R4 and above) with either the OpenWindows or Motif
user interface. The programs can be compiled using the base Xlib
library (libX11.a), and will run on both 1-bit (mono) and multi-plane
(colour or greyscale) displays.

iii. Compilation and operation: The makefile includes targets to
compile the source code for AIM and the associated tools on a range of
machines (DEC, SUN, SGI, HP); the targets differ only in the pathnames
for the local X11 base library (libX11.a) and header files (X11/X.h
and X11/Xlib.h). AIM can be compiled without the display code if the
graphics interface is not required or if X11 is not available (make
noplot). The executable for AIM is called gen. Compilation also
generates symbolic links to gen, such as genbmm, gennap and gensai,
which are used to select the desired output (BMM, NAP or SAI). The
links and the executables for the aim/tools are installed in the
aim/bin directory after compilation. Options are specified as
name=value on the command line; unspecified options are assigned
default values. The model output takes the form of binary data routed
by default to the model's graphical displays. Output can also be
routed to plotting hardware or to other post-processing software.


III. APPLICATIONS AND SUMMARY

In hearing research, the functional version of AIM has been used to
model phase perception (Patterson, 1987), octave perception (Patterson
et al., 1993), and timbre perception (Patterson, 1994b). The
physiological version has been used to simulate cochlear hearing loss
(Giguère, Woodland, and Robinson, 1993; Giguère and Woodland, 1994b),
and combination tones of cochlear origin (Giguère, Kunov, and
Smoorenburg, 1995). In speech research, the functional version has
been used to explain syllabic stress (Allerhand et al., 1992), and
both versions have been used as preprocessors for speech recognition
systems (e.g. Patterson, Anderson, and Allerhand, 1994; Giguère et
al., 1993). In summary, the AIM software package provides a modular
architecture for time-domain computational studies of peripheral
auditory processing.


* Instructions for acquiring the software package electronically are
presented in Section II. This document refers to AIM R7, which is the
first official release.
ACKNOWLEDGEMENTS

The gammatone filterbank, adaptive thresholding, and much of the
software platform were written by John Holdsworth; the options handler
is by Paul Manson, and the revised STI module by Jay Datta. Michael
Akeroyd extended the PostScript facilities and developed the xreview
routine for auditory image cartoons. The software development was
supported by grants from DRA Farnborough (U.K.), Esprit BR 3207 (EEC),
and the Hearing Research Trust (U.K.). We thank Malcolm Slaney and
Michael Akeroyd for helpful comments on an earlier version of the
paper.


Allerhand, M., and Patterson, R.D. (1992). "Correlograms and auditory
images," Proc. Inst. Acoust. 14, 281-288.

Allerhand, M., Butterfield, S., Cutler, A., and Patterson, R.D.
(1992). "Assessing syllable strength via an auditory model," Proc.
Inst. Acoust. 14, 297-304.

Assmann, P.F., and Summerfield, Q. (1990). "Modelling the perception
of concurrent vowels: Vowels with different fundamental frequencies,"
J. Acoust. Soc. Am. 88, 680-697.

Brown, G.J., and Cooke, M. (1994). "Computational auditory scene
analysis," Computer Speech and Language 8, 297-336.

Giguère, C., Woodland, P.C., and Robinson, A.J. (1993). "Application
of an auditory model to the computer simulation of hearing impairment:
Preliminary results," Can. Acoust. 21, 135-136.

Giguère, C., and Woodland, P.C. (1994a). "A computational model of the
auditory periphery for speech and hearing research. I: Ascending
path," J. Acoust. Soc. Am. 95, 331-342.

Giguère, C., and Woodland, P.C. (1994b). "A computational model of the
auditory periphery for speech and hearing research. II: Descending
paths," J. Acoust. Soc. Am. 95, 343-349.

Giguère, C., Kunov, H., and Smoorenburg, G.F. (1995). "Computational
modelling of psycho-acoustic combination tones and distortion-product
otoacoustic emissions," 15th Int. Cong. on Acoustics, Trondheim
(Norway), 26-30 June.

Glasberg, B.R., and Moore, B.C.J. (1990). "Derivation of auditory
filter shapes from notched-noise data," Hear. Res. 47, 103-138.

Greenwood, D.D. (1990). "A cochlear frequency-position function for
several species - 29 years later," J. Acoust. Soc. Am. 87, 2592-2605.

Holdsworth, J.W., and Patterson, R.D. (1991). "Analysis of waveforms,"
UK Patent No. GB 2-234-078-A (23.1.91). London: UK Patent Office.

Licklider, J.C.R. (1951). "A duplex theory of pitch perception,"
Experientia 7, 128-133.

Lutman, M.E., and Martin, A.M. (1979). "Development of an
electroacoustic analogue model of the middle ear and acoustic reflex,"
J. Sound Vib. 64, 133-157.

Meddis, R. (1988). "Simulation of auditory-neural transduction:
Further studies," J. Acoust. Soc. Am. 83, 1056-1063.

Meddis, R., and Hewitt, M.J. (1991). "Modelling the perception of
concurrent vowels with different fundamental frequencies," J. Acoust.
Soc. Am. 91, 233-245.

Patterson, R.D. (1987). "A pulse ribbon model of monaural phase
perception," J. Acoust. Soc. Am. 82, 1560-1586.

Patterson, R.D. (1994a). "The sound of a sinusoid: Spectral models,"
J. Acoust. Soc. Am. 96, 1409-1418.

Patterson, R.D. (1994b). "The sound of a sinusoid: Time-interval
models," J. Acoust. Soc. Am. 96, 1419-1428.

Patterson, R.D., and Akeroyd, M.A. (1995). "Time-interval patterns and
sound quality," in Advances in Hearing Research: Proceedings of the
10th International Symposium on Hearing, edited by G. Manley, G.
Klump, C. Köppl, H. Fastl, and H. Oeckinghaus (World Scientific,
Singapore) (in press).

Patterson, R.D., Anderson, T., and Allerhand, M. (1994). "The auditory
image model as a preprocessor for spoken language," in Proc. Third
ICSLP, Yokohama, Japan, 1395-1398.

Patterson, R.D., Milroy, R., and Allerhand, M. (1993). "What is the
octave of a harmonically rich note?" in Proc. 2nd Int. Conf. on Music
and the Cognitive Sciences, edited by I. Cross and I. Deliège
(Harwood, Switzerland), 69-81.

Patterson, R.D., and Moore, B.C.J. (1986). "Auditory filters and
excitation patterns as representations of frequency resolution," in
Frequency Selectivity in Hearing, edited by B.C.J. Moore (Academic,
London), pp. 123-177.

Patterson, R.D., Holdsworth, J., and Allerhand, M. (1992b). "Auditory
models as preprocessors for speech recognition," in The Auditory
Processing of Speech: From the Auditory Periphery to Words, edited by
M.E.H. Schouten (Mouton de Gruyter, Berlin), 67-83.

Patterson, R.D., Robinson, K., Holdsworth, J., McKeown, D., Zhang, C.,
and Allerhand, M. (1992a). "Complex sounds and auditory images," in
Auditory Physiology and Perception, edited by Y. Cazals, L. Demany,
and K. Horner (Pergamon, Oxford), 429-446.

Slaney, M., and Lyon, R.F. (1990). "A perceptual pitch detector," in
Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, Albuquerque,
New Mexico, April 1990.


Figure 1. The three-stage structure of the AIM software package.
Left-hand column: functional route; right-hand column: physiological
route. For each module, the figure shows the function (bold type), the
implementation (in the rectangle), and the simulation it produces
(italics).

Figure 2. Responses of the model to the vowel in 'hat' processed
through the functional route: (top) basilar membrane motion, (middle)
neural activity pattern, and (bottom) auditory image.

Figure 3. Responses of the model to the vowel in 'hat' processed
through the physiological route: (top) basilar membrane motion,
(middle) neural activity pattern, and (bottom) autocorrelogram image.