Initial checkin for AIM92 aimR8.2 (last updated May 1997).
author | tomwalters |
---|---|
date | Fri, 20 May 2011 15:19:45 +0100 |
docs/aimStrobeCriterion (text)
scripts/aimStrobeCriterion (figures)

STROBED TEMPORAL INTEGRATION AND THE STABILISED AUDITORY IMAGE

Roy D. Patterson, Jay Datta and Mike Allerhand
MRC Applied Psychology Unit
15 Chaucer Road, Cambridge, CB2 2EF UK
email: roy.patterson, jay.datta or mike.allerhand @mrc-apu.cam.ac.uk

2 August 1995

ABSTRACT

This document describes the Strobed Temporal Integration mechanism used to convert neural activity patterns into stabilised auditory images. The specific version of the Auditory Image Model is AIM R7, as described in Patterson, Allerhand, and Giguere (1995).

INTRODUCTION

When a periodic sound occurs with a pitch in the musical range, the cochlea produces a detailed, multi-channel, time-interval pattern that repeats once per cycle of the wave. The auditory images that we hear in response to periodic sounds are perfectly stable. That is, despite the fact that the level of activity in the neural activity pattern (NAP) is fluctuating over a large range within the course of each cycle, the loudness of the sound is fixed. This indicates that some form of temporal integration is applied to the NAP prior to our initial perception of the sound.

The auditory images of periodic sounds can have a very rich timbre, or sound quality, that can reveal a great deal about the sound source such as the quality of the musical instrument or the finesse of the musician. This suggests that much of the detailed time-interval information produced by the cochlea is preserved in the stabilised auditory image.

The fact that we hear stable auditory images with rich sound quality presents auditory theorists with a problem. The temporal integration mechanism in traditional auditory models is a low-pass filter that removes the fine-grain time-interval information from the internal representation of the sound -- time interval information that appears to be required for timbre perception. Strobed temporal integration was introduced to solve this problem.
At one and the same time, it performs the temporal integration necessary to produce stable auditory images and it preserves the majority of the time-interval information observed in the neural activity pattern (NAP) produced by the cochlea.

It is not a difficult problem to produce a high-resolution, stabilised version of the NAP provided you know the moment in time at which the pattern in the NAP will repeat. For example, consider the NAP of the first note of the wave CEGC in Figure 0.1 from Patterson et al. (1992). The wave is a train of clicks separated by 8-ms gaps; the upper channels of the NAP show that the response is a sequence of filter impulse responses spaced at 8 ms intervals. A stabilised representation of the NAP can be produced by setting up an image buffer that has the same number of channels as the NAP, and simply transferring a copy of the pattern in each channel of the NAP to the corresponding channel of the image buffer once every 8 ms. In the NAP, the pattern flows from right to left as time progresses, and since the cycles are continually entering the NAP from the right hand side and exiting the NAP from the left hand side, the pattern after every 8 ms is identical to the pattern 8 ms ago. So if the transfer from the NAP to the auditory image is performed every 8 ms exactly, successive contributions from the NAP to the image are all identical.

In the image buffer, activity does not move from right to left, it simply decays into the floor exponentially over time with a half life of about 30 ms. When a new contribution arrives from the NAP, it is added point for point with whatever is currently in the corresponding channel of the image buffer. In the current example, after a copy of the NAP arrives in the auditory image, and during the 30 ms over which it would decay to half its original value, three more copies of the NAP pattern arrive and are added into the auditory image.
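
The bookkeeping in this click-train example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not AIM code: the function name strobed_integration, the 1-kHz sampling rate (one sample per ms), the 32-ms image width and the idealised one-sample clicks are hypothetical choices made for the sketch; only the 8-ms period and the 30-ms half life come from the text. The strobe times are supplied by hand, which presumes exactly the knowledge of the period that the strobe criteria described later are designed to avoid.

```python
import numpy as np

def strobed_integration(nap, strobe_times, fs_hz, image_len, half_life_ms=30.0):
    """Add copies of one NAP channel into an exponentially decaying image buffer.

    image[k] holds activity that occurred k samples before a strobe point.
    """
    # Per-sample decay factor that halves the image every half_life_ms.
    decay = 0.5 ** (1.0 / (half_life_ms * 1e-3 * fs_hz))
    image = np.zeros(image_len)
    strobes = set(strobe_times)
    for t in range(len(nap)):
        image *= decay                       # the image decays continuously
        if t in strobes and t + 1 >= image_len:
            # Copy the most recent image_len samples, newest sample first,
            # and add them point for point to what is already in the image.
            image += nap[t - image_len + 1 : t + 1][::-1]
    return image

# Idealised click train: one unit-height click every 8 ms, sampled at 1 kHz.
fs_hz = 1000
nap = np.zeros(401)
nap[::8] = 1.0
strobe_times = range(32, 401, 8)             # strobe once per 8-ms cycle
image = strobed_integration(nap, strobe_times, fs_hz, image_len=32)
print(np.flatnonzero(image), image[0])
```

In this sketch the image contains activity only at time intervals of 0, 8, 16 and 24 ms, and its level settles just below 1/(1 - 0.5^(8/30)), about 5.9 times the height of a single contribution, because each copy has lost only about 17% of its level by the time the next one arrives.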
Thus, for typical musical notes and typical vowels, the rate of temporal integration from the NAP into the auditory image is high and there is little time between successive integration events for the image itself to decay. This is the source of the stability of the auditory image. Provided the integration is performed once per cycle of the sound, the majority of the time-interval information in the NAP will be preserved in the auditory image, thereby providing a solution to the problem of how to produce stable images without removing the fine-grain time-interval information associated with sound quality.

The auditory image produced by this process is shown in Figure 0.2 from Patterson et al. (1992). The transfer is performed on each channel of the NAP separately and it is performed at the point in the cycle where the activity in the NAP is a maximum. The maximum of the most recent cycle to arrive in the NAP is added into the auditory image at the 0-ms point, and as a result, the NAP peaks are aligned vertically in the auditory image. This passive alignment process explains the loss of global phase information observed empirically (see Patterson, 1987, for a review).

Thus it would appear that the problem of converting the oscillating NAP into a stabilised, high-resolution image reduces to the problem of finding the pitch of the sound and performing temporal integration at multiples of the pitch period. There are now a number of computational auditory models with a proven ability to extract the pitch of complex sounds (see Brown and Cooke, 1994, for a review) and they could be used to direct strobed temporal integration.
However, experiments with vowels (McKeown and Patterson, 1995; Robinson and Patterson, 1995b) and musical notes (Robinson and Patterson, 1995a) indicate that 4 to 8 cycles of the sound are required to produce an accurate estimate of the pitch, whereas the sound quality information necessary to identify a vowel or a musical instrument can be extracted from one cycle of the wave. This suggests that, if the auditory system does use strobed temporal integration to produce a stable, high-resolution auditory image, it does it with a mechanism that operates more locally in time than pitch extraction mechanisms.

This is the background that led to the development of the strobed temporal integration mechanism in the auditory image model. In Sections 1 and 2 of this document, following Allerhand and Patterson (1992), we describe two simple criteria for selecting strobe points in the NAP and show that they produce auditory images that are very similar to the correlograms produced by Assmann and Summerfield (1990), Slaney and Lyon (1990), or Meddis and Hewitt (1991a, 1991b). The structures that arise in this form of auditory image are much more symmetric than the corresponding structures in the NAP (Allerhand and Patterson, 1992). There is mounting evidence, however, that the auditory system is highly sensitive to temporal asymmetry (Patterson, 1994a, 1994b; Akeroyd and Patterson, 1995; Irino and Patterson, 1996), and so the loss of asymmetry associated with the simple strobe criterion seems likely to limit the value of this representation of our perceptions. In the remaining Sections, an ordered sequence of restrictions is added to the simple criteria for initiating temporal integration, to restore asymmetry to the structures that arise in the auditory image.

1. Strobe on Every Non-Zero Point in the NAP.

The initial criterion is very simple; temporal integration is initiated on each and every non-zero point in the NAP.
In AIM software, the option that determines which strobe criterion will be used is 'stcrit_ai' and it is set equal to one for this simplest strobe criterion.

Allerhand and Patterson (1992) showed that when temporal integration from the NAP to the auditory image is initiated on each and every non-zero point in the NAP function, the result is very similar to a correlogram -- a representation that is commonly used in time-domain models of hearing to extract the pitch of complex periodic sounds (see Brown and Cooke, 1994, for a review). For example, compare the auditory image with stcrit_ai=1 (Figure 1.1) and the correlogram (Figure 1.2) of the first note of the sound cegc. Both figures show stabilised representations of the time-interval pattern that the sound produces in the NAP, and in both cases, the individual channels have been aligned vertically on the largest peak in the NAP function.

The patterns in the auditory image and the correlogram both differ from the pattern in the NAP in one important way; there is a reflection of the NAP pulses associated with the ringing of the auditory filters, on the side opposite to where they originally appear. That is, autocorrelation and STI with stcrit_ai=1 reduce the temporal asymmetry observed in the NAP. The asymmetry information is not entirely removed but it is largely removed. Experiments with sounds that have asymmetric temporal modulation show that listeners are sensitive to temporal asymmetry (Patterson, 1994a, 1994b; Akeroyd and Patterson, 1995; Irino and Patterson, 1996), and so the removal of asymmetry information seems likely to prove a disadvantage when attempting to explain auditory perception.

The autocorrelation process is symmetric in time by its very nature. Mechanical processes that produce sound in the world are typically asymmetric in time because they usually have some inertia. Resonators struck impulsively ring after the pulse and not before.
This principle also applies to the processes that analyse the sound in the auditory system. The impulse response of the auditory filter rises faster than it falls; the adaptation process in the inner hair cell adapts up faster at the onset of a sound than it adapts down after the sound passes. So asymmetry is the norm in the world and it is not surprising that the auditory system is sensitive to it.

2. Strobe on the Peak of Each NAP pulse.

When temporal integration is initiated on every non-zero NAP point, the successive NAP functions that are transferred to the auditory image are highly correlated. This suggests that we could attain essentially the same auditory image for vastly less computation by restricting temporal integration to the larger points on the individual NAP pulses. This leads, in turn, to the suggestion that temporal integration be limited to the peak of the individual NAP pulses. The result of this restriction is illustrated in Figure 2.1 which shows the auditory image of the first note of CEGC with this more restricted strobe criterion. Since the peak restriction greatly reduces the rate of temporal integration, the absolute levels of structures in this form of auditory image are considerably lower than those in the previous form of image. The pattern of time intervals, however, is very similar in the two forms of auditory image. They both preserve a detailed representation of the time-interval pattern in the NAP, and they both lose much of the asymmetry in the NAP.

3. Avoid Strobing in the Temporal Shadow after a large NAP Pulse.

The loss of asymmetry in the click-train structure of the auditory image arises when temporal integration is initiated on the smaller NAP pulses associated with the ringing of the auditory filters after each click in the train.
This can be demonstrated by introducing a fixed strobe threshold below which NAP peaks do not initiate temporal integration, and progressively raising this strobe threshold to exclude more and more of the lower level NAP pulses. (In AIM, a fixed threshold is set with option stthresh_ai and stcrit_ai=1.) The auditory image becomes less and less symmetric and more and more like the original NAP pattern for the click train as the strobe threshold is increased. Fixed thresholds of this sort are not realistic for simulating the operation of the auditory system, firstly because the strobe threshold eventually exceeds the largest NAP pulse and temporal integration ceases entirely, and secondly because, in the natural environment, the levels of sounds are constantly changing. Nevertheless, the example illustrates how NAP asymmetry is lost with simple strobe criteria. The problem with autocorrelation is similar; the correlation values at lags associated with the smaller NAP pulses introduce symmetric reflections into the structures that appear in the correlogram.

An alternative means of restricting temporal integration to the larger pulses in the NAP of the click train is to use an adaptive strobe threshold which is temporally asymmetric. In the simplest case, when the strobe unit monitoring a NAP channel encounters a pulse, strobe threshold is set to the full height of the NAP pulse. But following the peak, the threshold does not fall as fast as the NAP function; rather, it is restricted to decaying at a fixed percentage of the peak height per ms. In AIM, the rate of decay is set to 5% per ms, so the threshold decays faster after larger peaks, and in the absence of further NAP peaks, returns to 0 in 20 ms. The NAP function for the 1.0-kHz channel of the NAP is presented in Figure 3.1 along with the adaptive threshold function. Together they illustrate what is referred to as the "temporal shadow criterion" for strobed temporal integration.
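
The adaptive threshold just described can be sketched as follows. This is a minimal illustration, not the AIM implementation: the function names nap_peaks and temporal_shadow_strobes, the simple three-point peak picker and the synthetic NAP (a large pulse every 8 ms followed by three smaller "ringing" pulses, sampled once per ms) are assumptions made for the example. The linear decay of 5% of the peak height per ms follows the text.

```python
def nap_peaks(nap):
    """Indices of local maxima, taken here as the peaks of the NAP pulses."""
    return [i for i in range(1, len(nap) - 1)
            if nap[i] > 0 and nap[i - 1] < nap[i] >= nap[i + 1]]

def temporal_shadow_strobes(nap, fs_hz, decay_pct_per_ms=5.0):
    """Strobe on NAP peaks that exceed a linearly decaying adaptive threshold."""
    samples_per_ms = fs_hz / 1000.0
    threshold = 0.0
    step = 0.0                       # threshold drop per sample
    peaks = set(nap_peaks(nap))
    strobes = []
    for i, level in enumerate(nap):
        threshold = max(0.0, threshold - step)
        if i in peaks and level > threshold:
            strobes.append(i)
            threshold = level        # threshold jumps to the full peak height
            # Decay a fixed percentage of THIS peak's height per ms, so the
            # threshold returns to zero 100 / decay_pct_per_ms ms later.
            step = level * (decay_pct_per_ms / 100.0) / samples_per_ms
    return strobes

# Synthetic channel: a pulse of height 10 every 8 ms, each followed by
# smaller pulses of height 6, 3 and 1 that stand in for filter ringing.
nap = [0] + [10, 0, 6, 0, 3, 0, 1, 0] * 5
every_peak = nap_peaks(nap)
shadow = temporal_shadow_strobes(nap, fs_hz=1000)
print(len(every_peak), shadow)
```

With these values the every-peak criterion strobes on all 20 pulses, whereas the temporal shadow criterion strobes only on the five large pulses: the ringing pulses fall inside the shadow cast by the preceding large pulse, and the threshold has dropped to 6 by the time the next large pulse arrives 8 ms later.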
In the figure, the vertical lines below the abscissa of the NAP function mark the NAP pulses that initiate temporal integration. They show that the first NAP pulse strobes temporal integration and strobe threshold is set to the peak height. It immediately begins to decay, but then it encounters another NAP pulse that exceeds strobe threshold and so the process of strobing temporal integration and raising strobe threshold is promptly repeated. At this point, however, strobe threshold is high relative to the NAP pulses, and it is falling more slowly than the NAP pulses, so the algorithm proceeds through the rest of the cycle without encountering another NAP pulse from the ringing part of the NAP function. In this way, the strobe mechanism is synchronised to the period of the sound even though no explicit information about the pitch of the sound is provided to the strobe mechanism. It is the auditory image with the temporal shadow criterion (stcrit_ai=3) that was presented originally in Figure 0.2.

The 'temporal shadow criterion' produces stable auditory images with accurate asymmetry for a wide variety of naturally occurring sounds like vowels and musical notes. The reason is that the NAPs of these sounds have a restricted range of periods and within those periods the asymmetry is typically characterised by the rapid-rise/slow-fall form. There are, however, periodic sounds with very low pitch and NAP functions that rise slowly over the course of the period and fall rapidly at the end of the period, and the perceptions produced by these sounds indicate that the auditory strobe mechanism is somewhat more sophisticated than the temporal shadow strobe mechanism. These "ramped" sounds are the subject of the next section.

4. Avoid Temporal Integration on NAP Peaks Followed by Larger NAP Peaks.

A pair of sounds that illustrate the limitations of the temporal shadow criterion are presented in Figures 4.1a and 4.2a; the former is an exponentially damped sinusoid that repeats every 25 ms, the latter is an exponentially ramped sinusoid with the same envelope period. The carrier frequency in this case is 800 Hz and the half life of the exponential is 4 ms. The half life is on the same order as the exponential decay of the impulse response of a gammatone auditory filter with a centre frequency in the region of 800 Hz. The example is taken from Patterson (1994a).

The neural activity patterns produced by the damped and ramped sinusoids are shown in Figures 4.1b and 4.2b, respectively. The frequency range of the filterbank is from an octave below the carrier frequency to an octave above the carrier frequency. The highest and lowest channels in Figure 4.1b show the transient response of the filterbank to the onset of the damped sinusoid, and similarly the high- and low-frequency channels in Figure 4.2b show the transient response of the filterbank to the offset of the ramped sinusoid. In the high-frequency channels, the onset response of the damped sinusoid and the offset response of the ramped sinusoid are composed of impulse responses from the individual auditory filters. The centre section of each figure shows the response to the carrier. Here we see that the asymmetry in the waveform is preserved in the NAP: in Figure 4.1b, the carrier component is at its highest level just as the transient response ends and the carrier component decays away over the course of the period; in Figure 4.2b, the carrier activity rises over the course of the ramped cycle and ends at its peak level in the transient response.

Auditory images of these damped and ramped sinusoids are presented in Figures 4.3 and 4.4, respectively.
The upper rows show the images obtained when the strobe initiates temporal integration on every peak in the NAP; the middle rows show the images obtained with the temporal shadow criterion. The images in the upper row illustrate the problem of preserving NAP asymmetry during temporal integration. When the mechanism strobes on every peak, the temporal asymmetry observed in the NAP of the damped sinusoid is actually reversed in the auditory image of the damped sinusoid (Figure 4.3a). In the case of the ramped sinusoid, the asymmetry observed in the NAP is largely lost in the image of the ramped sinusoid (Figure 4.4a); there is activity at all time intervals in the central channels, whereas there is a gap in activity in the NAP of the ramped sinusoid, once per cycle, just after the abrupt reduction in amplitude. It is also the case that there are irregular fringes along the edges of the main structure in the auditory image of the ramped sinusoid (Figure 4.4a). This provides further evidence that the time interval pattern in the NAP is being disrupted by the temporal integration process in the construction of the auditory image.

The introduction of the temporal shadow criterion for initiating temporal integration produces a dramatic improvement in the auditory image of the damped sinusoid (Figure 4.3b). The structure in the image is highly asymmetric and, once the alignment process is taken into account, the structure in the image is seen to be a very faithful reproduction of that in the NAP.

The imposition of the temporal shadow criterion improves the auditory image of the ramped sound (Figure 4.4b) inasmuch as it eliminates the fringes seen in Figure 4.4a. But it does not solve the asymmetry problem. The structure in the auditory image of Figure 4.4b is still more symmetric than it is asymmetric, whereas the structure in the corresponding NAP is highly asymmetric.
The source of the problem is illustrated in Figures 4.5a and 4.6a which show the NAPs and adaptive thresholds for 80-ms segments of the damped and ramped sinusoids, respectively. The vertical markers below the abscissa in Figure 4.5a show that after the first cycle, the strobe mechanism is synchronised to the period of the wave and initiates temporal integration once per cycle on the largest NAP peak. So this criterion preserves the asymmetry of the damped sound in its auditory image. In contrast, Figure 4.6a shows that on the way up the ramped portion of each cycle, the rising NAP pulses repeatedly exceed the adaptive threshold resulting in repeated initiation of temporal integration. Since, in this region of the cycle, the mechanism initiates temporal integration on every cycle of the carrier, the auditory image does not preserve the asymmetry observed in the corresponding NAP. The irregular fringe is reduced because the mechanism reliably skips the portion of the cycle where the level of activity in the NAP is changing most rapidly.

The high rate of strobing revealed in Figure 4.6a means that the level of activity in the ramped auditory image of Figure 4.4b is considerably greater than that in the damped image (Figure 4.3b). It does not show in those Figures because they have been normalised for display purposes. In terms of the auditory model, however, the greater overall level in the image of the ramped sound would lead to the prediction that ramped sounds are considerably louder than damped sounds, and this is not the case; they have roughly equal loudness. All of these observations taken together suggest that the strobe rate should be limited and that the limitation should favour larger NAP peaks, closer to the local maximum.

The solution in this case is to delay temporal integration a few milliseconds after each suprathreshold NAP pulse, to determine whether another, larger, NAP pulse is about to occur.
Specifically, when a NAP peak is identified, it is labeled as a potential strobe point, but the initiation of temporal integration is delayed for several milliseconds. In AIM, the value is set with option 'stlag_ai'. If, during this time, no new larger NAP pulses are encountered, the candidate strobe point is used to initiate temporal integration. If a larger NAP pulse is encountered, it becomes the new strobe candidate and replaces the previous strobe candidate, the strobe lag is reset to stlag_ai ms and the process begins again.

The auditory images of damped and ramped sinusoids produced with this 'local-max' strobe criterion are shown in Figures 4.3c and 4.4c, respectively. The strobe lag restriction has virtually no effect on the auditory image of the damped sinusoid, but it improves the image of the ramped sinusoid markedly. The asymmetry observed in the NAP of the ramped sinusoid is now preserved in its auditory image.

The NAP functions and the adaptive thresholds for the damped and ramped sinusoids are shown in Figures 4.5b and 4.6b, respectively. A comparison of the strobe points for the damped sinusoid under the temporal shadow criterion (Figure 4.5a) and the local max criterion (Figure 4.5b) shows that there is one small difference; the very first strobe point under the temporal shadow criterion is omitted under the local max criterion because a larger NAP pulse follows it within stlag_ai ms. So the second NAP pulse replaces the first as the strobe candidate.

In the case of the ramped sinusoid, shifting to the local max criterion has a dramatic effect. The NAP functions and adaptive thresholds in Figures 4.6a and 4.6b are identical, but most of the strobe points identified under the temporal shadow criterion (Figure 4.6a) are immediately followed by larger NAP pulses as we proceed up the ramp. As a result the majority of the candidate pulses are repressed in favour of the one that occurs at the offset of the ramp.
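
The candidate-replacement logic of the local max criterion can be sketched as below. This is a hedged illustration rather than the AIM code: the function name local_max_strobes and the representation of the input as a time-ordered list of (time in ms, height) pairs for suprathreshold NAP peaks are assumptions, as are the idealised test patterns, in which ramp peaks are spaced 2 ms apart for simple arithmetic rather than at the 1.25-ms period of an 800-Hz carrier. Two further simplifications are made: a smaller peak that arrives while a candidate is still waiting is ignored, a case the text does not settle, and each strobe is reported at the time of its candidate pulse although a running system would issue it one lag later.

```python
def local_max_strobes(peaks, lag_ms):
    """Select strobe points from suprathreshold NAP peaks.

    peaks  -- list of (time_ms, height) pairs in time order
    lag_ms -- how long a candidate waits for a larger pulse (cf. stlag_ai)
    """
    strobes = []
    candidate = None                      # (time_ms, height) awaiting the lag
    for time_ms, height in peaks:
        if candidate is not None and time_ms - candidate[0] > lag_ms:
            # The lag expired without a larger pulse: the candidate strobes.
            strobes.append(candidate[0])
            candidate = None
        if candidate is None or height > candidate[1]:
            # A new or larger pulse becomes the candidate; the lag restarts.
            candidate = (time_ms, height)
    if candidate is not None:
        strobes.append(candidate[0])      # flush the final candidate
    return strobes

# Two 25-ms cycles of a "ramped" pattern: ten rising peaks, then a gap.
ramped = [(start + 2 * k, k + 1) for start in (0, 25) for k in range(10)]
# A "damped" pattern after the temporal shadow stage: one peak per cycle.
damped = [(0, 10), (25, 10), (50, 10)]

ramped_strobes = local_max_strobes(ramped, lag_ms=5)
damped_strobes = local_max_strobes(damped, lag_ms=5)
print(ramped_strobes, damped_strobes)
```

In this sketch the twenty suprathreshold peaks of the ramped pattern yield one strobe per cycle, at the last and largest peak of each ramp (18 ms and 43 ms), while the damped pattern is unaffected and continues to strobe once per cycle.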
So, with the exception of the onset of the sound, the mechanism synchronises to the period of the sound and there is one strobe per cycle of the sound. The local max criterion also leads to damped and ramped auditory images with roughly the same level of activity in the auditory image, and so it is also a better predictor of the loudness of these sounds.

Finally, note that the strobe lag restricts the maximum strobe rate of the mechanism. This is important because, without it, the level of a sinusoid would increase with its frequency in the auditory image.

5. Limiting the Lag of the Local Max Criterion.

In the second experiment with damped and ramped sinusoids (Patterson, 1994b), the longest envelope period was 100-ms, and in that condition, the distinction between damped and ramped sinusoids is audible for half lives as long as 64 ms. In channels near the carrier frequency, the NAP function produced by the ramped sinusoid is a long, slowly rising, sequence of peaks. The local-max strobe criterion delays temporal integration to the end of the ramp and initiates temporal integration once per cycle, as previously, with the 25-ms envelope stimuli.

The example, however, raises the question of what would happen in the case of a very long duration slowly rising tone, say a tone that rises from absolute threshold to 80 dB SPL over the course of 5 seconds. A listener would undoubtedly hear the sound shortly after it comes on, and hear its loudness increase progressively over the course of the 5-second rise. The local-max strobe mechanism would initiate temporal integration once, shortly after the onset of the sound, because of overshoot in the neural encoding stage of AIM. But thereafter, it would suppress temporal integration throughout the rise of the NAP function and strobe once at the end of the rise. Thus the auditory image would be empty at a time when we know the listener would hear the tone.
To solve this problem, the strobe lag of the local max mechanism is limited to twice the stlag_ai value; that is, after a NAP pulse becomes a strobe candidate, either that NAP pulse or a larger one must initiate temporal integration within the next 2*stlag_ai ms. So the strobe lag restricts not only the maximum strobe rate for static sinusoids, but also the minimum strobe rate for slowly increasing sinusoids.

6. Aperiodic Strobing and Irregularity in the Auditory Image.

To this point, the discussion of strobe criteria has focussed on activity in the carrier channel of the NAP and auditory image, and the relationship between strobe criteria and the preservation of NAP asymmetry through temporal integration. It was noted in passing that, away from the carrier channel, auditory images of ramped sounds have fringes of irregular activity for all strobe criteria prior to the local max criterion. We might expect such fringes to impart a roughness or noisy quality to the perception of ramped sounds, but typically they are static and clear. In this final Section, the activity produced by a ramped sinusoid in the 640 Hz channel of the NAP and auditory image is examined, to illustrate the relationship between strobe restrictions and the fringe of irregularity in the auditory image.

The NAP produced in the 640 Hz channel of the filterbank by a ramped sinusoid with an 800-Hz carrier, a 25-ms envelope period, and a 4-ms half life is shown in Figure 6.1. The level of the ramped sinusoid rises rapidly, relative to the decay rate of the impulse response of the auditory filter and, as a result, the activity in the rising part of the NAP is dominated by carrier-period time intervals (Patterson, 1994a). When the amplitude of the ramped sinusoid drops abruptly, the energy stored in the filter decays away in a wave with periods appropriate to the centre frequency of the channel.
Now consider the activity produced by this NAP in the 640-Hz channel of the auditory image for strobe criteria 2, 3 and 4, the 'every peak', 'temporal shadow' and 'local max' criteria, respectively. Figure 6.2a shows the case where there is no adaptive threshold and the mechanism strobes on the peak of every NAP pulse. This is the version of STI most similar to autocorrelation. Strobing on every peak causes carrier periods from the ramp to be mixed with centre-frequency periods after the offset of the ramp. This is the source of the irregularity in Fig. 6.2a, and the source of the irregular fringe in the full auditory image (Fig. 4.4a) (Allerhand and Patterson, 1992).

The activity produced with the temporal shadow criterion is shown in Figure 6.2b. The adaptive threshold function and the strobe points shown with the NAP in Fig. 6.1 were generated with the temporal shadow criterion. In this case, the mechanism initiates temporal integration on each peak in the ramped portion of the NAP, but it skips the peaks associated with the ringing of the filter after the ramp terminates. Strobing occurs in synchrony with the carrier periods in the ramped portion of the NAP and this removes the irregularity from the ramped portion of the auditory image between 0 ms and about 10 ms. There is still irregularity in the region from 0 to -10 ms, and in the region from 25 to 15 ms, because strobing in synchrony with the carrier period mixes carrier periods and centre frequency periods in this region of the image.

A further improvement occurs when the local max criterion is introduced and strobing on successive carrier periods of the ramped section of the NAP is suppressed. The activity in the 640-Hz channel of the image is shown in Figure 6.2c. The irregular activity has been removed; the image shows carrier periods to the left of the 0-ms point and centre frequency periods to the right of the 0-ms point.
Thus, strobing on local maxima synchronises temporal integration to the period of the wave and preserves not only the basic asymmetry of the NAP, but also the contrasting time interval patterns associated with different sections of the NAP cycle.

REFERENCES

Akeroyd, M.A. and Patterson, R.D. (1995). "Discrimination of wideband noises modulated by a temporally asymmetric function," J. Acoust. Soc. Am. (in press).

Assmann, P.F. and Summerfield, Q. (1990). "Modelling the perception of concurrent vowels: Vowels with different fundamental frequencies," J. Acoust. Soc. Am. 88, 680-697.

Brown, G.J. and Cooke, M. (1994). "Computational auditory scene analysis," Computer Speech and Language 8, 297-336.

Irino, T. and Patterson, R.D. (1996). "Temporal asymmetry in the auditory system," J. Acoust. Soc. Am. (revision submitted August 95).

McKeown, D. and Patterson, R.D. (1995). "The time course of auditory segregation: concurrent vowels that vary in duration," J. Acoust. Soc. Am. (in press).

Meddis, R. and Hewitt, M.J. (1991a). "Virtual pitch and phase sensitivity of a computer model of the auditory periphery: I pitch identification," J. Acoust. Soc. Am. 89, 2866-82.

Meddis, R. and Hewitt, M.J. (1991b). "Virtual pitch and phase sensitivity of a computer model of the auditory periphery: II phase sensitivity," J. Acoust. Soc. Am. 89, 2883-94.

Patterson, R.D. (1987b). "A pulse ribbon model of monaural phase perception," J. Acoust. Soc. Am. 82, 1560-1586.

Patterson, R.D., Robinson, K., Holdsworth, J., McKeown, D., Zhang, C. and Allerhand M. (1992). "Complex sounds and auditory images," In: Auditory physiology and perception, Y Cazals, L. Demany, K. Horner (eds), Pergamon, Oxford, 429-446.

Patterson, R.D. (1994a). "The sound of a sinusoid: Spectral models," J. Acoust. Soc. Am. 96, 1409-1418.

Patterson, R.D. (1994b). "The sound of a sinusoid: Time-interval models," J. Acoust. Soc. Am. 96, 1419-1428.

Patterson, R.D. and Akeroyd, M.A. (1995). "Time-interval patterns and sound quality," in: Advances in Hearing Research: Proceedings of the 10th International Symposium on Hearing, G. Manley, G. Klump, C. Koppl, H. Fastl, & H. Oeckinghaus, (Eds). World Scientific, Singapore, (in press).

Patterson, R.D., Allerhand, M., and Giguere, C. (1995). "Time-domain modelling of peripheral auditory processing: A modular architecture and a software platform," J. Acoust. Soc. Am. 98, (in press).

Robinson, K.L. and Patterson, R.D. (1995a). "The duration required to identify the instrument, the octave, or the pitch-chroma of a musical note," Music Perception (in press).

Robinson, K.L. and Patterson, R.D. (1995b). "The stimulus duration required to identify vowels, their octave, and their pitch-chroma," J. Acoust. Soc. Am. 98, (in press).

Slaney, M. and Lyon, R.F. (1990). "A perceptual pitch detector," in Proc. IEEE Int. Conf. Acoust. Speech Signal Processing, Albuquerque, New Mexico.

===========================================================================

#!/bin/sh
# script/aimStrobeCriterion
# Annotated script for generating the figures in docs/aimStrobeCriterion

echo "FIGURES FOR SECTION 0"
mv .gennaprc .oldgennaprc        # a safety precaution
mv .gensairc .oldgensairc        # a safety precaution
echo | gennap powc=off -update   # make sure that powc is off
echo | gensai powc=off -update   # make sure that powc is off

echo
echo "FIGURES FOR SECTION 0"
echo "Figure 0.1: Neural Activity Pattern (NAP) of cegc"
gennap input=cegc_br top=3000 swap=off bits=12 gain_gtf=4   # all default values
echo "Figure 0.2: Stabilised Auditory Image (SAI) of cegc"
gensai stcrit=3 input=cegc_br length=100ms frstep_aid=96ms top=2500

echo
echo "FIGURES FOR SECTION 1"
echo "Figure 1.1 SAI of cegc strobing on every non-zero point in the NAP"
echo " (stcrit_ai=1). This one is slow to calculate."
gensai stcrit_ai=1 top=17000 input=cegc_br length=100ms frstep_aid=96ms
# Top has to be raised because this strobe criterion causes constant
# temporal integration.
echo "Figure 1.2: SAI via autocorrelation -- a correlogram"
echo | gennap input=cegc_br display=off length=125ms top=3000 output=stdout > cegc_br_gtf.nap
#gennap -use start=48 display=on cegc_br_gtf    # optional display of the NAP
# After making a NAP with display=off, gennap -use requires you to set display=on.
acgram start=50 wid=70ms lag=35ms frames=1 scale=.02 cegc_br_gtf.nap > cegc_gtf.sai
gensai -use top=5000 input=cegc_gtf
rm cegc_br_gtf.nap cegc_gtf.sai

echo
echo "FIGURES FOR SECTION 2"
echo "Figure 2.1: SAI of cegc strobing on the peak of every NAP pulse"
echo "            (stcrit_ai=2)"
gensai stcrit_ai=2 top=10000 input=cegc_br length=100ms frstep_aid=96ms

echo
echo "FIGURES FOR SECTION 3"
echo "Demonstration of preservation of asymmetry when stthresh is elevated"
# Note stthresh only operates when stcrit_ai=1.
gensai stcrit_ai=1 top=5000 input=cegc_br length=68ms frstep_aid=66ms stthresh_ai=5000
echo "Figure 3.1: NAP of cegc with temporal shadow criterion (stcrit_ai=3)"
echo "            Single Channel NAP with Strobe Threshold and Strobe Points below NAP"
StrobeCriterionDisplay cegc_br 1000 100 3 2.5 17000 2000
# Type 'StrobeCriterionDisplay -help' for a listing of the options and
# their order.
# Control of Xplots:
#   Click mouse button 1 to display coordinates of points.
#   Click mouse button 2 to redraw.
#   Click mouse button 3 to remove the display (i.e. quit).

echo
echo "FIGURES FOR SECTION 4"
echo "Figure 4.1a: Waveform of Damped Sinusoid (4 cycles)"
genwav top=14000 bottom=-14000 length=100ms input=dr_f8_t4_d swap=on
echo "Figure 4.2a: Waveform of Ramped Sinusoid (4 cycles)"
genwav top=14000 bottom=-14000 length=100ms input=dr_f8_t4_r swap=on
echo "Figure 4.1b: NAP of the Damped Sinusoid (2 cycles)"
gennap input=dr_f8_t4_d gain_gtf=0.0626 bits=16 top=2000 mincf=400 maxcf=1600 swap=on length=110ms output=stdout display=off > damped.nap
gennap -use start=50 leng=50 display=on damped
echo "Figure 4.2b: NAP of the Ramped Sinusoid (2 cycles)"
gennap input=dr_f8_t4_r gain_gtf=0.0626 bits=16 top=2000 mincf=400 maxcf=1600 swap=on length=110ms output=stdout display=off > ramped.nap
gennap -use start=60 leng=50 display=on ramped
rm damped.nap ramped.nap
echo "Figure 4.3a: SAI of the Damped Sinusoid strobing on every NAP peak"
echo "             (stcrit_ai=2)"
gensai input=dr_f8_t4_d gain_gtf=0.0625 bits=16 top=7000 mincf=400 maxcf=1600 swap=on length=140ms frstep_aid=135ms stcrit=2 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5
echo "Figure 4.4a: SAI of the Ramped Sinusoid strobing on every NAP peak"
echo "             (stcrit_ai=2)"
gensai input=dr_f8_t4_r gain_gtf=0.0625 bits=16 top=7000 mincf=400 maxcf=1600 swap=on length=140ms frstep_aid=135ms stcrit=2 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5
echo "Figure 4.3b: SAI of the Damped Sinusoid with temporal shadow criterion"
echo "             (stcrit_ai=3)"
gensai input=dr_f8_t4_d gain_gtf=0.0625 bits=16 top=1000 mincf=400 maxcf=1600 swap=on length=140ms frstep_aid=135ms stcrit=3 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5
echo "Figure 4.4b: SAI of the Ramped Sinusoid with temporal shadow criterion"
echo "             (stcrit_ai=3)"
gensai input=dr_f8_t4_r gain_gtf=0.0625 bits=16 top=2000 mincf=400 maxcf=1600 swap=on length=140ms frstep_aid=135ms stcrit=3 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5
echo "Figure 4.3c: SAI of the Damped Sinusoid with the local max criterion"
echo "             (stcrit_ai=4)"
gensai input=dr_f8_t4_d gain_gtf=0.0625 bits=16 top=800 mincf=400 maxcf=1600 swap=on length=140ms frstep_aid=135ms stcrit=4 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5
echo "Figure 4.4c: SAI of the Ramped Sinusoid with the local max criterion"
echo "             (stcrit_ai=4)"
gensai input=dr_f8_t4_r gain_gtf=0.0625 bits=16 top=800 mincf=400 maxcf=1600 swap=on length=140ms frstep_aid=135ms stcrit=4 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5
echo | gennap swap=on bits=16 gain_gtf=0.0625 -update
echo | gensai swap=on bits=16 gain_gtf=0.0625 -update
echo "Figure 4.5a: NAP of Damped Sinusoid, temporal shadow criterion (stcrit_ai=3)"
echo "             Single Channel NAP with Strobe Threshold and Strobe Points below NAP"
StrobeCriterionDisplay dr_f8_t4_d 800 120 3 2.5 14000 2400
echo "Figure 4.5b: NAP of Damped Sinusoid, local max criterion (stcrit_ai=4)"
echo "             Single Channel NAP with Strobe Threshold and Strobe Points below NAP"
StrobeCriterionDisplay dr_f8_t4_d 800 120 4 2.5 14000 2400
echo "Figure 4.6a: NAP of Ramped Sinusoid, temporal shadow criterion (stcrit_ai=3)"
echo "             Single Channel NAP with Strobe Threshold and Strobe Points below NAP"
StrobeCriterionDisplay dr_f8_t4_r 800 120 3 2.5 7500 2400
echo "Figure 4.6b: NAP of Ramped Sinusoid, local max criterion (stcrit_ai=4)"
echo "             Single Channel NAP with Strobe Threshold and Strobe Points below NAP"
StrobeCriterionDisplay dr_f8_t4_r 800 120 4 2.5 7500 2400

echo
echo "FIGURES FOR SECTION 5"

echo
echo "FIGURES FOR SECTION 6"
echo "Figure 6.1: NAP of Ramped Sinusoid, temporal shadow criterion (stcrit_ai=3)"
echo "            Single Channel NAP with Strobe Threshold and Strobe Points below NAP"
StrobeCriterionDisplay dr_f8_t4_r 640 120 3 2.5 7000 2000
echo "Figure 6.2a: SAI of Ramped Sinusoid in channel centred on 640Hz (stcrit_ai=2)"
gensai input=dr_f8_t4_r swap=on gain_gtf=0.0625 bits=16 top=32000 mincf=640Hz chan=1 start=10ms length=110ms frstep_aid=100ms stcrit=2 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5
echo "Figure 6.2b: SAI of Ramped Sinusoid in channel centred on 640Hz (stcrit_ai=3)"
gensai input=dr_f8_t4_r swap=on gain_gtf=0.0625 bits=16 top=10000 mincf=640Hz chan=1 start=10ms length=110ms frstep_aid=100ms stcrit=3 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5
echo "Figure 6.2c: SAI of Ramped Sinusoid in channel centred on 640Hz (stcrit_ai=4)"
gensai input=dr_f8_t4_r swap=on gain_gtf=0.0625 bits=16 top=1200 mincf=640Hz chan=1 start=10ms length=110ms frstep_aid=100ms stcrit=4 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5
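
The script renames .gennaprc and .gensairc as a safety precaution at the start but does not move them back at the end. A minimal sketch of a restore step follows, assuming the saved copies are still in the current directory under the names .oldgennaprc and .oldgensairc. The function name restore_rc is hypothetical and is not part of AIM. Note that restoring overwrites any option files written by the -update calls in the script.

```shell
#!/bin/sh
# Hypothetical helper (not part of AIM): move the saved option files
# .oldgennaprc and .oldgensairc back to their original names.
restore_rc() {
    for name in gennaprc gensairc; do
        # Restore only when a saved copy exists, so a missing file
        # is skipped rather than treated as an error.
        if [ -f ".old$name" ]; then
            mv -f ".old$name" ".$name"
        fi
    done
}
```

Calling restore_rc as the last line of the figure script would be one way to leave the working directory as it was found.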