diff docs/aimStrobeCriterion @ 0:5242703e91d3 tip
Initial checkin for AIM92 aimR8.2 (last updated May 1997).
author: tomwalters
date: Fri, 20 May 2011 15:19:45 +0100
docs/aimStrobeCriterion (text)
scripts/aimStrobeCriterion (figures)


STROBED TEMPORAL INTEGRATION AND THE STABILISED AUDITORY IMAGE

Roy D. Patterson, Jay Datta and Mike Allerhand
MRC Applied Psychology Unit
15 Chaucer Road, Cambridge, CB2 2EF UK

email: roy.patterson, jay.datta or mike.allerhand @mrc-apu.cam.ac.uk

2 August 1995


ABSTRACT

This document describes the Strobed Temporal Integration
mechanism used to convert neural activity patterns into stabilised
auditory images. The specific version of the Auditory Image Model is
AIM R7, as described in Patterson, Allerhand, and Giguere (1995).



INTRODUCTION

When a periodic sound occurs with a pitch in the musical
range, the cochlea produces a detailed, multi-channel, time-interval
pattern that repeats once per cycle of the wave. The auditory images
that we hear in response to periodic sounds are perfectly stable.
That is, despite the fact that the level of activity in the neural
activity pattern (NAP) is fluctuating over a large range within the
course of each cycle, the loudness of the sound is fixed. This indicates
that some form of temporal integration is applied to the NAP prior to
our initial perception of the sound. The auditory images of periodic
sounds can have a very rich timbre, or sound quality, that can reveal
a great deal about the sound source such as the quality of the musical
instrument or the finesse of the musician. This suggests that much of
the detailed time-interval information produced by the cochlea is
preserved in the stabilised auditory image.

The fact that we hear stable auditory images with rich sound
quality presents auditory theorists with a problem.
The temporal
integration mechanism in traditional auditory models is a low-pass
filter that removes the fine-grain time-interval information from the
internal representation of the sound -- time-interval information that
appears to be required for timbre perception. Strobed temporal
integration was introduced to solve this problem. At one and the same
time, it performs the temporal integration necessary to produce stable
auditory images and it preserves the majority of the time-interval
information observed in the neural activity pattern (NAP) produced by
the cochlea.

It is not a difficult problem to produce a high-resolution,
stabilised version of the NAP provided you know the moment in time at
which the pattern in the NAP will repeat. For example, consider the
NAP of the first note of the wave CEGC in Figure 0.1 from Patterson et
al. (1992). The wave is a train of clicks separated by 8-ms gaps; the
upper channels of the NAP show that the response is a sequence of
filter impulse responses spaced at 8-ms intervals. A stabilised
representation of the NAP can be produced by setting up an image
buffer that has the same number of channels as the NAP, and simply
transferring a copy of the pattern in each channel of the NAP to the
corresponding channel of the image buffer once every 8 ms. In the
NAP, the pattern flows from right to left as time progresses, and
since the cycles are continually entering the NAP from the right hand
side and exiting the NAP from the left hand side, the pattern after
every 8 ms is identical to the pattern 8 ms ago. So if the transfer
from the NAP to the auditory image is performed every 8 ms exactly,
successive contributions from the NAP to the image are all identical.

In the image buffer, activity does not move from right to
left, it simply decays into the floor exponentially over time with a
half life of about 30 ms.
When a new contribution arrives from the
NAP, it is added point for point with whatever is currently in the
corresponding channel of the image buffer. In the current example,
after a copy of the NAP arrives in the auditory image, and during the
30 ms over which it would decay to half its original value, three more
copies of the NAP pattern arrive and are added into the auditory
image. Thus, for typical musical notes and typical vowels, the rate
of temporal integration from the NAP into the auditory image is high
and there is little time between successive integration events for the
image itself to decay. This is the source of the stability of the
auditory image.

Provided the integration is performed once per cycle of the
sound, the majority of the time-interval information in the NAP will
be preserved in the auditory image, thereby providing a solution to
the problem of how to produce stable images without removing the
fine-grain time-interval information associated with sound quality.
The auditory image produced by this process is shown in Figure 0.2
from Patterson et al. (1992). The transfer is performed on each
channel of the NAP separately and it is performed at the point in the
cycle where the activity in the NAP is a maximum. The maximum of the
most recent cycle to arrive in the NAP is added into the auditory
image at the 0-ms point, and as a result, the NAP peaks are aligned
vertically in the auditory image. This passive alignment process
explains the loss of global phase information observed empirically
(see Patterson, 1987, for a review).

Thus it would appear that the problem of converting the
oscillating NAP into a stabilised, high-resolution image reduces to
the problem of finding the pitch of the sound and performing temporal
integration at multiples of the pitch period.
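
As a minimal sketch of the periodic transfer described above, assuming the period of the sound is already known, the following Python fragment adds a copy of one NAP channel to a decaying image buffer once per cycle. The function names and the four-sample pattern are illustrative assumptions and are not part of AIM; the 30-ms half life and the 8-ms period are the values quoted in the text.

```python
def decay_factor(dt_ms, half_life_ms=30.0):
    """Fraction of image activity left after dt_ms of exponential decay."""
    return 0.5 ** (dt_ms / half_life_ms)


def integrate_periodic(nap_pattern, period_ms, n_cycles, half_life_ms=30.0):
    """Add one copy of nap_pattern to a decaying image buffer every period.

    nap_pattern stands in for one channel of the NAP over one cycle
    (an illustrative assumption; real NAP channels are much longer).
    """
    image = [0.0] * len(nap_pattern)
    k = decay_factor(period_ms, half_life_ms)
    for _ in range(n_cycles):
        # The old image decays for one period, then the new copy is
        # added point for point.
        image = [old * k + new for old, new in zip(image, nap_pattern)]
    return image


# Hypothetical one-cycle pattern: a pulse followed by decaying ringing.
nap_pattern = [0.0, 1.0, 0.5, 0.25]
image = integrate_periodic(nap_pattern, period_ms=8.0, n_cycles=50)

# Level of the stabilised image relative to a single contribution.
steady_state_gain = image[1] / nap_pattern[1]
print(steady_state_gain)
```

Under these assumptions the image settles at roughly six times the level of a single contribution, and the shape of the pattern within the cycle is unchanged, because every contribution is identical and only the overall level is affected by the decay.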
There are now a number
of computational auditory models with a proven ability to extract the
pitch of complex sounds (see Brown and Cooke, 1994, for a review) and
they could be used to direct strobed temporal integration. However,
experiments with vowels (McKeown and Patterson, 1995; Robinson and
Patterson, 1995a) and musical notes (Robinson and Patterson, 1995b)
indicate that 4 to 8 cycles of the sound are required to produce an
accurate estimate of the pitch, whereas the sound quality information
necessary to identify a vowel or a musical instrument can be extracted
from one cycle of the wave. This suggests that, if the auditory
system does use strobed temporal integration to produce a stable,
high-resolution auditory image, it does it with a mechanism that operates
more locally in time than pitch extraction mechanisms. This is the
background that led to the development of the strobed temporal
integration mechanism in the auditory image model.

In Sections 1 and 2 of this document, following Allerhand and
Patterson (1992), we describe two simple criteria for selecting strobe
points in the NAP and show that they produce auditory images that are
very similar to the correlograms produced by Assmann and Summerfield
(1990), Slaney and Lyon (1990), or Meddis and Hewitt (1991a, 1991b).
The structures that arise in this form of auditory image are much more
symmetric than the corresponding structures in the NAP (Allerhand and
Patterson, 1992). There is mounting evidence, however, that the
auditory system is highly sensitive to temporal asymmetry (Patterson,
1994a, 1994b; Akeroyd and Patterson, 1995; Irino and Patterson, 1996),
and so the loss of asymmetry associated with the simple strobe
criterion seems likely to limit the value of this representation of
our perceptions.
In the remaining Sections, an ordered sequence of
restrictions is added to the simple criteria for initiating temporal
integration, to restore asymmetry to the structures that arise in the
auditory image.


1. Strobe on Every Non-Zero Point in the NAP.

The initial criterion is very simple; temporal integration is
initiated on each and every non-zero point in the NAP. In AIM
software, the option that determines which strobe criterion will be
used is 'stcrit_ai' and it is set equal to one for this simplest
strobe criterion. Allerhand and Patterson (1992) showed that when
temporal integration from the NAP to the auditory image is initiated
on each and every non-zero point in the NAP function, the result is
very similar to a correlogram -- a representation that is commonly
used in time-domain models of hearing to extract the pitch of complex
periodic sounds (see Brown and Cooke, 1994, for a review). For
example, compare the auditory image with stcrit_ai=1 (Figure 1.1) and
the correlogram (Figure 1.2) of the first note of the sound cegc.
Both figures show stabilised representations of the time-interval
pattern that the sound produces in the NAP, and in both cases, the
individual channels have been aligned vertically on the largest peak
in the NAP function. The patterns in the auditory image and the
correlogram both differ from the pattern in the NAP in one important
way; there is a reflection of the NAP pulses associated with the
ringing of the auditory filters, on the side opposite to where they
originally appear. That is, autocorrelation and STI with stcrit_ai=1
reduce the temporal asymmetry observed in the NAP. The asymmetry
information is not entirely removed but it is largely removed.
Experiments with sounds that have asymmetric temporal modulation show
that listeners are sensitive to temporal asymmetry (Patterson, 1994a,
1994b; Akeroyd and Patterson, 1995; Irino and Patterson, 1996), and so
the removal of asymmetry information seems likely to prove a
disadvantage when attempting to explain auditory perception.

The autocorrelation process is symmetric in time by its very
nature. Mechanical processes that produce sound in the world are
typically asymmetric in time because they usually have some inertia.
Resonators struck impulsively ring after the pulse and not before.
This principle also applies to the processes that analyse the sound in
the auditory system. The impulse response of the auditory filter
rises faster than it falls; the adaptation process in the inner
hair cell adapts up faster at the onset of a sound than it adapts down
after the sound passes. So asymmetry is the norm in the world and it
is not surprising that the auditory system is sensitive to it.


2. Strobe on the Peak of Each NAP Pulse.

When temporal integration is initiated on every non-zero NAP
point, the successive NAP functions that are transferred to the
auditory image are highly correlated. This suggests that we could
attain essentially the same auditory image for vastly less computation
by restricting temporal integration to the larger points on the
individual NAP pulses. This leads, in turn, to the suggestion that
temporal integration be limited to the peak of the individual NAP
pulses. The result of this restriction is illustrated in Figure 2.1
which shows the auditory image of the first note of CEGC with this
more restricted strobe criterion. Since the peak restriction greatly
reduces the rate of temporal integration, the absolute levels of
structures in this form of auditory image are considerably lower than
those in the previous form of image.
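
The reduction in strobe rate can be illustrated with a small sketch that counts the strobe points selected by the two criteria on a single NAP channel. The sample values and both function names are hypothetical; the only thing taken from the text is the distinction between strobing on every non-zero point and strobing on the peak of each pulse.

```python
def nonzero_points(nap):
    """Criterion 1: every non-zero sample of the NAP is a strobe point."""
    return [i for i, value in enumerate(nap) if value > 0]


def pulse_peaks(nap):
    """Criterion 2: only the peak of each NAP pulse is a strobe point.

    A peak is taken to be a positive sample that is larger than the
    sample before it and at least as large as the sample after it
    (an assumption made for this sketch).
    """
    peaks = []
    for i in range(1, len(nap) - 1):
        if nap[i] > 0 and nap[i] > nap[i - 1] and nap[i] >= nap[i + 1]:
            peaks.append(i)
    return peaks


# Hypothetical channel containing two NAP pulses.
nap = [0, 1, 3, 1, 0, 0, 2, 5, 2, 0]
print(len(nonzero_points(nap)), len(pulse_peaks(nap)))  # 6 strobes versus 2
```

With fewer strobes per pulse, fewer copies of the NAP are summed into the image, which is consistent with the lower absolute levels noted above.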
The pattern of time intervals,
however, is very similar in the two forms of auditory image. They
both preserve a detailed representation of the time-interval pattern
in the NAP, and they both lose much of the asymmetry in the NAP.


3. Avoid Strobing in the Temporal Shadow after a Large NAP Pulse.

The loss of asymmetry in the click-train structure of the
auditory image arises when temporal integration is initiated on the
smaller NAP pulses associated with the ringing of the auditory filters
after each click in the train. This can be demonstrated by
introducing a fixed strobe threshold below which NAP peaks do not
initiate temporal integration, and progressively raising this strobe
threshold to exclude more and more of the lower level NAP pulses. (In
AIM, a fixed threshold is set with option stthresh_ai and
stcrit_ai=1.) The auditory image becomes less and less symmetric and
more and more like the original NAP pattern for the click train as the
strobe threshold is increased. Fixed thresholds of this sort are not
realistic for simulating the operation of the auditory system, firstly
because the strobe threshold eventually exceeds the largest NAP pulse
and temporal integration ceases entirely, and secondly because, in the
natural environment, the levels of sounds are constantly changing.
Nevertheless, the example illustrates how NAP asymmetry is lost with
simple strobe criteria. The problem with autocorrelation is similar;
the correlation values at lags associated with the smaller NAP pulses
introduce symmetric reflections into the structures that appear in the
correlogram.

An alternative means of restricting temporal integration to
the larger pulses in the NAP of the click train is to use an adaptive
strobe threshold which is temporally asymmetric. In the simplest
case, when the strobe unit monitoring a NAP channel encounters a
pulse, strobe threshold is set to the full height of the NAP pulse.
But following the peak, the threshold does not fall as fast as the NAP
function; rather, it is restricted to decaying at a fixed percentage of
the peak height per ms. In AIM, the rate of decay is set to 5% per
ms, so the threshold decays faster after larger peaks, and in the
absence of further NAP peaks, returns to 0 in 20 ms. The NAP function
for the 1.0-kHz channel of the NAP is presented in Figure 3.1 along
with the adaptive threshold function. Together they illustrate what is
referred to as the "temporal shadow criterion" for strobed temporal
integration.

In the figure, the vertical lines below the abscissa of the
NAP function mark the NAP pulses that initiate temporal integration.
They show that the first NAP pulse strobes temporal integration and
strobe threshold is set to the peak height. It immediately begins to
decay, but then it encounters another NAP pulse that exceeds strobe
threshold and so the process of strobing temporal integration and
raising strobe threshold is promptly repeated. At this point,
however, strobe threshold is high relative to the NAP pulses and it
is falling more slowly than the NAP pulses, so the
algorithm proceeds through the rest of the cycle without encountering
another NAP pulse from the ringing part of the NAP function. In this
way, the strobe mechanism is synchronised to the period of the sound
even though no explicit information about the pitch of the sound is
provided to the strobe mechanism. It is the auditory image with the
temporal shadow criterion that was presented originally in Figure
0.2 (stcrit_ai=3).

The 'temporal shadow criterion' produces stable auditory
images with accurate asymmetry for a wide variety of naturally
occurring sounds like vowels and musical notes. The reason is that
the NAPs of these sounds have a restricted range of periods and within
those periods the asymmetry is typically characterised by the
rapid-rise/slow-fall form.
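
The adaptive threshold can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the AIM implementation: the function name, the peak test, the sample spacing, and the synthetic NAP values are all hypothetical, while the linear decay of 5% of the peak height per ms is the value quoted above.

```python
def temporal_shadow_strobes(nap, dt_ms, decay_pct_per_ms=5.0):
    """Return the sample indices at which temporal integration is strobed.

    The threshold jumps to the height of each strobing peak and then
    falls linearly by decay_pct_per_ms percent of that height per ms.
    """
    strobes = []
    threshold = 0.0
    decay_step = 0.0  # threshold drop per sample after the last strobe
    for i in range(1, len(nap) - 1):
        threshold = max(0.0, threshold - decay_step)
        is_peak = nap[i] > nap[i - 1] and nap[i] >= nap[i + 1]
        if is_peak and nap[i] > threshold:
            strobes.append(i)
            threshold = nap[i]
            decay_step = nap[i] * decay_pct_per_ms / 100.0 * dt_ms
    return strobes


# Hypothetical channel sampled every 1 ms: a click response every 8 ms,
# each followed by three smaller pulses from the ringing of the filter.
cycle = [100, 0, 60, 0, 30, 0, 10, 0]
nap = [0] + cycle * 3 + [0]
print(temporal_shadow_strobes(nap, dt_ms=1.0))  # [1, 9, 17]
```

In this example the ringing pulses fall in the shadow of the large pulse that precedes them, so the mechanism strobes once per 8-ms cycle without being told the period.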
There are, however, periodic sounds with
very low pitch and NAP functions that rise slowly over the course of
the period and fall rapidly at the end of the period, and the
perceptions produced by these sounds indicate that the auditory strobe
mechanism is somewhat more sophisticated than the temporal shadow
strobe mechanism. These "ramped" sounds are the subject of the next
section.


4. Avoid Temporal Integration on NAP Peaks Followed by Larger NAP Peaks.

A pair of sounds that illustrates the limitations of the
temporal shadow criterion is presented in Figures 4.1a and 4.2a; the
former is an exponentially damped sinusoid that repeats every 25 ms,
the latter is an exponentially ramped sinusoid with the same envelope
period. The carrier frequency in this case is 800 Hz and the half
life of the exponential is 4 ms. The half life is on the same order
as the exponential decay of the impulse response of a gammatone
auditory filter with a centre frequency in the region of 800 Hz. The
example is taken from Patterson (1994a).

The neural activity patterns produced by the damped and ramped
sinusoids are shown in Figures 4.1b and 4.2b, respectively. The
frequency range of the filterbank is from an octave below the carrier
frequency to an octave above the carrier frequency. The highest and
lowest channels in Figure 4.1b show the transient response of the
filterbank to the onset of the damped sinusoid, and similarly the
high- and low-frequency channels in Figure 4.2b show the transient
response of the filterbank to the offset of the ramped sinusoid. In
the high-frequency channels, the onset response of the damped sinusoid
and the offset response of the ramped sinusoid are composed of impulse
responses from the individual auditory filters. The centre section of
each figure shows the response to the carrier.
Here we see that the
asymmetry in the waveform is preserved in the NAP: in Figure 4.1b, the
carrier component is at its highest level just as the transient
response ends and the carrier component decays away over the course of
the period; in Figure 4.2b, the carrier activity rises over the course
of the ramped cycle and ends at its peak level in the transient
response.

Auditory images of these damped and ramped sinusoids are
presented in Figures 4.3 and 4.4, respectively. The upper rows show
the images obtained when the strobe initiates temporal integration on
every peak in the NAP; the middle rows show the images obtained with
the temporal shadow criterion. The images in the upper row illustrate
the problem of preserving NAP asymmetry during temporal integration.
When the mechanism strobes on every peak, the temporal asymmetry
observed in the NAP of the damped sinusoid is actually reversed in the
auditory image of the damped sinusoid (Figure 4.3a). In the case of
the ramped sinusoid, the asymmetry observed in the NAP is largely lost
in the image of the ramped sinusoid (Figure 4.4a); there is activity
at all time intervals in the central channels, whereas there is a gap
in activity in the NAP of the ramped sinusoid, once per cycle, just
after the abrupt reduction in amplitude. It is also the case that
there are irregular fringes along the edges of the main structure in
the auditory image of the ramped sinusoid (Figure 4.4a). This
provides further evidence that the time-interval pattern in the NAP is
being disrupted by the temporal integration process in the
construction of the auditory image.

The introduction of the temporal shadow criterion for
initiating temporal integration produces a dramatic improvement in the
auditory image of the damped sinusoid (Figure 4.3b).
The structure in
the image is highly asymmetric and, once the alignment process is
taken into account, the structure in the image is seen to be a very
faithful reproduction of that in the NAP. The imposition of the
temporal shadow criterion improves the auditory image of the ramped
sound (Figure 4.4b), inasmuch as it eliminates the fringes seen in
Figure 4.4a. But it does not solve the asymmetry problem. The
structure in the auditory image of Figure 4.4b is still more symmetric
than it is asymmetric, whereas the structure in the corresponding NAP
is highly asymmetric.

The source of the problem is illustrated in Figures 4.5a and
4.6a which show the NAPs and adaptive thresholds for 80-ms segments of
the damped and ramped sinusoids, respectively. The vertical markers
below the abscissa in Figure 4.5a show that after the first cycle, the
strobe mechanism is synchronised to the period of the wave and
initiates temporal integration once per cycle on the largest NAP peak.
So this criterion preserves the asymmetry of the damped sound in its
auditory image. In contrast, Figure 4.6a shows that on the way up the
ramped portion of each cycle, the rising NAP pulses repeatedly exceed
the adaptive threshold, resulting in repeated initiation of temporal
integration. Since, in this region of the cycle, the mechanism
initiates temporal integration on every carrier cycle, the auditory
image does not preserve the asymmetry observed in the corresponding NAP.
The irregular fringe is reduced because the mechanism reliably skips the
portion of the cycle where the level of activity in the NAP is
changing most rapidly.

The high rate of strobing revealed in Figure 4.6a means that
the level of activity in the ramped auditory image of Figure 4.4b is
considerably greater than that in the damped image (Figure 4.3b). It
does not show in those Figures because they have been normalised for
display purposes.
In terms of the auditory model, however, the
greater overall level in the image of the ramped sound would lead to
the prediction that ramped sounds are considerably louder than damped
sounds, and this is not the case; they have roughly equal loudness.
All of these observations taken together suggest that the strobe rate
should be limited and that the limitation should favour larger NAP
peaks, closer to the local maximum.

The solution in this case is to delay temporal integration a
few milliseconds after each suprathreshold NAP pulse, to determine
whether another, larger, NAP pulse is about to occur. Specifically,
when a NAP peak is identified, it is labeled as a potential strobe
point, but the initiation of temporal integration is delayed for
several milliseconds. In AIM, the value is set with option
'stlag_ai'. If, during this time, no new larger NAP pulses are
encountered, the candidate strobe point is used to initiate temporal
integration. If a larger NAP pulse is encountered, it becomes the new
strobe candidate and replaces the previous strobe candidate, the
strobe lag is reset to stlag_ai ms and the process begins again. The
auditory images of damped and ramped sinusoids produced with this
'local-max' strobe criterion are shown in Figures 4.3c and 4.4c,
respectively. The strobe lag restriction has virtually no effect on
the auditory image of the damped sinusoid, but it improves the image
of the ramped sinusoid markedly. The asymmetry observed in the NAP of
the ramped sinusoid is now preserved in its auditory image.

The NAP functions and the adaptive thresholds for the damped
and ramped sinusoids are shown in Figures 4.5b and 4.6b, respectively.
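
The strobe lag described above can be sketched as an extension of the adaptive threshold. The following Python fragment is an illustration under stated assumptions rather than the AIM code: the function name, the parameter stlag_ms (standing in for the stlag_ai option), the 5-ms lag value, the handling of a candidate still pending at the end of the signal, and the synthetic ramped NAP are all hypothetical. It also assumes that a suprathreshold peak that is not larger than the pending candidate is simply ignored, which the text does not specify.

```python
def local_max_strobes(nap, dt_ms, stlag_ms, decay_pct_per_ms=5.0):
    """Strobe on a suprathreshold peak only if no larger peak follows
    it within stlag_ms; otherwise the larger peak takes its place."""
    strobes = []
    threshold = 0.0
    decay_step = 0.0
    candidate = None  # index of the pending strobe candidate
    deadline = 0.0    # time (ms) at which the candidate is confirmed
    for i in range(1, len(nap) - 1):
        t = i * dt_ms
        if candidate is not None and t >= deadline:
            strobes.append(candidate)  # lag expired: strobe on candidate
            candidate = None
        threshold = max(0.0, threshold - decay_step)
        is_peak = nap[i] > nap[i - 1] and nap[i] >= nap[i + 1]
        if is_peak and nap[i] > threshold:
            if candidate is None or nap[i] > nap[candidate]:
                candidate = i              # new or larger candidate
                deadline = t + stlag_ms    # strobe lag is reset
                threshold = nap[i]
                decay_step = nap[i] * decay_pct_per_ms / 100.0 * dt_ms
    if candidate is not None:
        strobes.append(candidate)  # assumption: flush at end of signal
    return strobes


# Hypothetical ramped channel sampled every 1 ms with a 25-ms period:
# four rising pulses, then an abrupt offset.
cycle = [0] * 25
cycle[16], cycle[18], cycle[20], cycle[22] = 10, 20, 40, 80
nap = cycle * 2 + [0]
print(local_max_strobes(nap, dt_ms=1.0, stlag_ms=5.0))  # [22, 47]
```

In this example each rising pulse is replaced as the candidate by the larger pulse that follows it within the lag, so only the final pulse of each ramp initiates temporal integration.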
A comparison of the strobe points for the damped sinusoid under the
temporal shadow criterion (Figure 4.5a) and the local max criterion
(Figure 4.5b) shows that there is one small difference; the very first
strobe point under the temporal shadow criterion is omitted under the
local max criterion because a larger NAP pulse follows it within
stlag_ai ms. So the second NAP pulse replaces the first as the strobe
candidate. In the case of the ramped sinusoid, shifting to the local
max criterion has a dramatic effect. The NAP functions and adaptive
thresholds in Figures 4.6a and 4.6b are identical, but most of the
strobe points identified under the temporal shadow criterion (Figure
4.6a) are immediately followed by larger NAP pulses as we proceed up
the ramp. As a result the majority of the candidate pulses are
repressed in favour of the one that occurs at the offset of the ramp.
So, with the exception of the onset of the sound, the mechanism
synchronises to the period of the sound and there is one strobe per
cycle of the sound. The local max criterion also leads to damped and
ramped auditory images with roughly the same level of activity in the
auditory image, and so it is also a better predictor of the loudness
of these sounds. Finally, note that the strobe lag restricts the
maximum strobe rate of the mechanism. This is important because,
without it, the level of a sinusoid would increase with its frequency
in the auditory image.


5. Limiting the Lag of the Local Max Criterion.

In the second experiment with damped and ramped sinusoids
(Patterson, 1994b), the longest envelope period was 100 ms, and in
that condition, the distinction between damped and ramped sinusoids is
audible for half lives as long as 64 ms. In channels near the carrier
frequency, the NAP function produced by the ramped sinusoid is a long,
slowly rising sequence of peaks.
The local-max strobe criterion
delays temporal integration to the end of the ramp and initiates
temporal integration once per cycle, as previously, with the 25-ms
envelope stimuli. The example, however, raises the question of what
would happen in the case of a very long duration slowly rising tone,
say a tone that rises from absolute threshold to 80 dB SPL over the
course of 5 seconds. A listener would undoubtedly hear the sound
shortly after it comes on, and hear its loudness increase
progressively over the course of the 5-second rise. The local-max
strobe mechanism would initiate temporal integration once, shortly
after the onset of the sound, because of overshoot in the neural
encoding stage of AIM. But thereafter, it would suppress temporal
integration throughout the rise of the NAP function and strobe once at
the end of the rise. Thus the auditory image would be empty at a time
when we know the listener would hear the tone. To solve this problem,
the strobe lag of the local max mechanism is limited to twice the
stlag_ai value; that is, after a NAP pulse becomes a strobe candidate,
either that NAP pulse or a larger one must initiate temporal
integration within the next 2*stlag_ai ms. So the strobe lag restricts
not only the maximum strobe rate for static sinusoids, but also the
minimum strobe rate for slowly increasing sinusoids.


6. Aperiodic Strobing and Irregularity in the Auditory Image.

To this point, the discussion of strobe criteria has focussed
on activity in the carrier channel of the NAP and auditory image, and
the relationship between strobe criteria and the preservation of NAP
asymmetry through temporal integration. It was noted in passing,
that, away from the carrier channel, auditory images of ramped sounds
have fringes of irregular activity, for all strobe criteria prior to
the local max criterion.
We might expect such fringes to impart a
roughness or noisy quality to the perception of ramped sounds, but
typically they are static and clear. In this final Section, the
activity produced by a ramped sinusoid in the 640-Hz channel of the
NAP and auditory image is examined, to illustrate the relationship
between strobe restrictions and the fringe of irregularity in the
auditory image.

The NAP produced in the 640-Hz channel of the filterbank by a
ramped sinusoid with an 800-Hz carrier, a 25-ms envelope period, and a
4-ms half life is shown in Figure 6.1. The level of the ramped
sinusoid rises rapidly, relative to the decay rate of the impulse
response of the auditory filter and, as a result, the activity in the
rising part of the NAP is dominated by carrier-period time intervals
(Patterson, 1994a). When the amplitude of the ramped sinusoid drops
abruptly, the energy stored in the filter decays away in a wave with
periods appropriate to the centre frequency of the channel. Now
consider the activity produced by this NAP in the 640-Hz channel of
the auditory image for strobe criteria 2, 3 and 4, the 'every peak',
'temporal shadow', and 'local max' criteria, respectively.

Figure 6.2a shows the case where there is no adaptive
threshold and the mechanism strobes on the peak of every NAP pulse.
This is the version of STI most similar to autocorrelation. Strobing
on every peak causes carrier periods from the ramp to be mixed with
centre-frequency periods after the offset of the ramp. This is the
source of the irregularity in Fig. 6.2a, and the source of the
irregular fringe in the full auditory image (Fig. 4.4a) (Allerhand and
Patterson, 1992).

The activity produced with the temporal shadow criterion is
shown in Figure 6.2b. The adaptive threshold function and the
strobe points shown with the NAP in Fig. 6.1 were generated with the
temporal shadow criterion.
In this case, the mechanism initiates
temporal integration on each peak in the ramped portion of the NAP,
but it skips the peaks associated with the ringing of the filter after
the ramp terminates. Strobing occurs in synchrony with the carrier
periods in the ramped portion of the NAP and this removes the
irregularity from the ramped portion of the auditory image between 0
ms and about 10 ms. There is still irregularity in the region from 0
to -10 ms, and in the region from 25 to 15 ms, because strobing in
synchrony with the carrier period mixes carrier periods and centre
frequency periods in this region of the image.

A further improvement occurs when the local max criterion is
introduced and strobing on successive carrier periods of the ramped
section of the NAP is suppressed. The activity in the 640-Hz channel
of the image is shown in Figure 6.2c. The irregular activity has been
removed; the image shows carrier periods to the left of the 0-ms point
and centre frequency periods to the right of the 0-ms point. Thus,
strobing on local maxima synchronises temporal integration to the
period of the wave and preserves not only the basic asymmetry of the
NAP, but also the contrasting time-interval patterns associated with
different sections of the NAP cycle.



REFERENCES

Akeroyd, M.A. and Patterson, R.D. (1995). "Discrimination of wideband
  noises modulated by a temporally asymmetric function,"
  J. Acoust. Soc. Am. (in press).

Assmann, P.F. and Summerfield, Q. (1990). "Modelling the perception of
  concurrent vowels: Vowels with different fundamental frequencies,"
  J. Acoust. Soc. Am. 88, 680-697.

Brown, G.J. and Cooke, M. (1994). "Computational auditory scene
  analysis," Computer Speech and Language 8, 297-336.

Irino, T. and Patterson, R.D. (1996). "Temporal asymmetry in the
  auditory system," J. Acoust. Soc. Am. (revision submitted
  August 95).

McKeown, D. and Patterson, R.D. (1995). "The time course of auditory
  segregation: concurrent vowels that vary in duration,"
  J. Acoust. Soc. Am. (in press).

Meddis, R. and Hewitt, M.J. (1991a). "Virtual pitch and phase
  sensitivity of a computer model of the auditory periphery: I
  pitch identification," J. Acoust. Soc. Am. 89, 2866-82.

Meddis, R. and Hewitt, M.J. (1991b). "Virtual pitch and phase
  sensitivity of a computer model of the auditory periphery: II
  phase sensitivity," J. Acoust. Soc. Am. 89, 2883-94.

Patterson, R.D. (1987b). "A pulse ribbon model of monaural
  phase perception," J. Acoust. Soc. Am. 82, 1560-1586.

Patterson, R.D., Robinson, K., Holdsworth, J., McKeown, D., Zhang,
  C. and Allerhand M. (1992). "Complex sounds and auditory images,"
  In: Auditory physiology and perception, Y Cazals, L. Demany,
  K. Horner (eds), Pergamon, Oxford, 429-446.

Patterson, R.D. (1994a). "The sound of a sinusoid: Spectral models,"
  J. Acoust. Soc. Am. 96, 1409-1418.

Patterson, R.D. (1994b). "The sound of a sinusoid: Time-interval
  models," J. Acoust. Soc. Am. 96, 1419-1428.

Patterson, R.D. and Akeroyd, M.A. (1995). "Time-interval patterns and
  sound quality," in: Advances in Hearing Research: Proceedings of
  the 10th International Symposium on Hearing, G. Manley, G. Klump,
  C. Koppl, H. Fastl, & H. Oeckinghaus, (Eds). World Scientific,
  Singapore, (in press).

Patterson, R.D., Allerhand, M., and Giguere, C. (1995). "Time-domain
  modelling of peripheral auditory processing: A modular architecture
  and a software platform," J. Acoust. Soc. Am. 98, (in press).

Robinson, K.L. and Patterson, R.D. (1995a). "The duration required to
  identify the instrument, the octave, or the pitch-chroma of a
  musical note," Music Perception (in press).

Robinson, K.L. and Patterson, R.D. (1995b). "The stimulus duration required
  to identify vowels, their octave, and their pitch-chroma," J. Acoust. Soc.
  Am. 98, (in press).

Slaney, M. and Lyon, R.F. (1990). "A perceptual pitch detector," in
  Proc. IEEE Int. Conf. Acoust. Speech Signal Processing,
  Albuquerque, New Mexico.




===========================================================================
#!/bin/sh

# scripts/aimStrobeCriterion
# Annotated script for generating the figures in docs/aimStrobeCriterion

mv .gennaprc .oldgennaprc       # a safety precaution
mv .gensairc .oldgensairc       # a safety precaution
echo | gennap powc=off -update  # make sure that powc is off
echo | gensai powc=off -update  # make sure that powc is off

echo
echo "FIGURES FOR SECTION 0"
echo "Figure 0.1: Neural Activity Pattern (NAP) of cegc"
gennap input=cegc_br top=3000 swap=off bits=12 gain_gtf=4 # all default values

echo "Figure 0.2: Stabilised Auditory Image (SAI) of cegc"
gensai stcrit=3 input=cegc_br length=100ms frstep_aid=96ms top=2500

echo
echo "FIGURES FOR SECTION 1"

echo "Figure 1.1: SAI of cegc strobing on every non-zero point in the NAP"
echo "   (stcrit_ai=1). This one is slow to calculate."
gensai stcrit_ai=1 top=17000 input=cegc_br length=100ms frstep_aid=96ms

# Top has to be raised because this strobe criterion causes constant
# temporal integration.


echo "Figure 1.2: SAI via autocorrelation -- a correlogram"
echo | gennap input=cegc_br display=off length=125ms top=3000 output=stdout > cegc_br_gtf.nap
#gennap -use start=48 display=on cegc_br_gtf # optional display of the NAP
# After making a NAP with display=off, gennap -use requires you to set display=on.
+ +acgram start=50 wid=70ms lag=35ms frames=1 scale=.02 cegc_br_gtf.nap > cegc_gtf.sai +gensai -use top=5000 input=cegc_gtf + +rm cegc_br_gtf.nap cegc_gtf.sai + +echo +echo "FIGURES FOR SECTION 2" + +echo "Figure 2.1: SAI of cegc strobing on the peak of every NAP pulse" +echo " (stcrit_ai=2)" +gensai stcrit_ai=2 top=10000 input=cegc_br length=100ms frstep_aid=96ms + +echo +echo "FIGURES FOR SECTION 3" + +echo "Demonstration of preservation of asymmetry when stthresh is elevated" +# Note stthresh only operates when stcrit_ai=1. +gensai stcrit_ai=1 top=5000 input=cegc_br length=68ms frstep_aid=66ms stthresh_ai=5000 + +echo "Figure 3.1: NAP of cegc with temporal shadow criterion (stcrit_ai=3)" +echo " Single Channel NAP with Strobe Threshold and Strobe Points below NAP" +StrobeCriterionDisplay cegc_br 1000 100 3 2.5 17000 2000 + +# Type 'StrobeCriterionDisplay -help' for a listing of the options and +# their order. +# Control of Xplots: +# Click mouse button 1 to display coordinates of points. +# Click mouse button 2 to redraw. +# Click mouse button 3 to remove the display (i.e. quit). 
+ +echo +echo "FIGURES FOR SECTION 4" + +echo "Figure 4.1a: Waveform of Damped Sinusoid (4 cycles)" +genwav top=14000 bottom=-14000 length=100ms input=dr_f8_t4_d swap=on + +echo "Figure 4.2a: Waveform of Ramped Sinusoid (4 cycles)" +genwav top=14000 bottom=-14000 length=100ms input=dr_f8_t4_r swap=on + +echo "Figure 4.1b: NAP of the Damped Sinusoid (2 cycles)" +gennap input=dr_f8_t4_d gain_gtf=0.0626 bits=16 top=2000 mincf=400 maxcf=1600 swap=on length=110ms output=stdout display=off > damped.nap +gennap -use start=50 leng=50 display=on damped + +echo "Figure 4.2b: NAP of the Ramped Sinusoid (2 cycles)" +gennap input=dr_f8_t4_r gain_gtf=0.0626 bits=16 top=2000 mincf=400 maxcf=1600 swap=on length=110ms output=stdout display=off > ramped.nap +gennap -use start=60 leng=50 display=on ramped + +rm damped.nap ramped.nap + +echo "Figure 4.3a: SAI of the Damped Sinusoid strobing on every NAP peak" +echo " (stcrit_ai=2)" +gensai input=dr_f8_t4_d gain_gtf=0.0625 bits=16 top=7000 mincf=400 maxcf=1600 swap=on length=140ms frstep_aid=135ms stcrit=2 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5 + +echo "Figure 4.4a: SAI of the Ramped Sinusoid strobing on every NAP peak" +echo " (stcrit_ai=2)" +gensai input=dr_f8_t4_r gain_gtf=0.0625 bits=16 top=7000 mincf=400 maxcf=1600 swap=on length=140ms frstep_aid=135ms stcrit=2 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5 + +echo "Figure 4.3b: SAI of the Damped Sinusoid with temporal shadow criterion" +echo " (stcrit_ai=3)" +gensai input=dr_f8_t4_d gain_gtf=0.0625 bits=16 top=1000 mincf=400 maxcf=1600 swap=on length=140ms frstep_aid=135ms stcrit=3 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5 + +echo "Figure 4.4b: SAI of the Ramped Sinusoid with temporal shadow criterion" +echo " (stcrit_ai=3)" +gensai input=dr_f8_t4_r gain_gtf=0.0625 bits=16 top=2000 mincf=400 maxcf=1600 swap=on length=140ms frstep_aid=135ms stcrit=3 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5 + +echo "Figure 4.3c: SAI of the Damped Sinusoid with the local max criterion" 
+echo " (stcrit_ai=4)" +gensai input=dr_f8_t4_d gain_gtf=0.0625 bits=16 top=800 mincf=400 maxcf=1600 swap=on length=140ms frstep_aid=135ms stcrit=4 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5 + +echo "Figure 4.4c: SAI of the Ramped Sinusoid with the local max criterion" +echo " (stcrit_ai=4)" +gensai input=dr_f8_t4_r gain_gtf=0.0625 bits=16 top=800 mincf=400 maxcf=1600 swap=on length=140ms frstep_aid=135ms stcrit=4 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5 + +echo | gennap swap=on bits=16 gain_gtf=0.0625 -update +echo | gensai swap=on bits=16 gain_gtf=0.0625 -update + + +echo "Figure 4.5a: NAP of Damped Sinusoid, temporal shadow criterion (stcrit_ai=3)" +echo " Single Channel NAP with Strobe Threshold and Strobe Points below NAP" +StrobeCriterionDisplay dr_f8_t4_d 800 120 3 2.5 14000 2400 + +echo "Figure 4.5b: NAP of Damped Sinusoid, local max criterion (stcrit_ai=4)" +echo " Single Channel NAP with Strobe Threshold and Strobe Points below NAP" +StrobeCriterionDisplay dr_f8_t4_d 800 120 4 2.5 14000 2400 + +echo "Figure 4.6a: NAP of Ramped Sinusoid, temporal shadow criterion (stcrit_ai=3)" +echo " Single Channel NAP with Strobe Threshold and Strobe Points below NAP" +StrobeCriterionDisplay dr_f8_t4_r 800 120 3 2.5 7500 2400 + +echo "Figure 4.6b: NAP of Ramped Sinusoid, local max criterion (stcrit_ai=4)" +echo " Single Channel NAP with Strobe Threshold and Strobe Points below NAP" +StrobeCriterionDisplay dr_f8_t4_r 800 120 4 2.5 7500 2400 + +echo +echo "FIGURES FOR SECTION 5" + +echo +echo "FIGURES FOR SECTION 6" + +echo "Figure 6.1: NAP of Ramped Sinusoid, temporal shadow criterion (stcrit_ai=3)" +echo " Single Channel NAP with Strobe Threshold and Strobe Points below NAP" +StrobeCriterionDisplay dr_f8_t4_r 640 120 3 2.5 7000 2000 + +echo "Figure 6.2a: SAI of Ramped Sinusoid in channel centred on 640Hz (stcrit_ai=2)" +gensai input=dr_f8_t4_r swap=on gain_gtf=0.0625 bits=16 top=32000 mincf=640Hz chan=1 start=10ms length=110ms frstep_aid=100ms stcrit=2 pwid=30ms 
nwid=-10ms stlag=10ms stdecay=2.5 + +echo "Figure 6.2b: SAI of Ramped Sinusoid in channel centred on 640Hz (stcrit_ai=3)" +gensai input=dr_f8_t4_r swap=on gain_gtf=0.0625 bits=16 top=10000 mincf=640Hz chan=1 start=10ms length=110ms frstep_aid=100ms stcrit=3 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5 +echo "Figure 6.2c: SAI of Ramped Sinusoid in channel centred on 640Hz (stcrit_ai=4)" +gensai input=dr_f8_t4_r swap=on gain_gtf=0.0625 bits=16 top=1200 mincf=640Hz chan=1 start=10ms length=110ms frstep_aid=100ms stcrit=4 pwid=30ms nwid=-10ms stlag=10ms stdecay=2.5 + +
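The script moves .gennaprc and .gensairc aside at the start as a safety precaution, but it does not put them back. A minimal sketch of a closing step follows, assuming the user wants the saved option files restored once the figures have been generated. The function name restore_rc is hypothetical and is not part of AIM; the step overwrites any option files written by the -update calls above.

```shell
# Hypothetical clean-up step (an assumption, not part of the original script).
# Restore the option files that were saved as .oldgennaprc and .oldgensairc.
restore_rc() {
    dir=${1:-.}                      # directory holding the rc files
    for name in gennaprc gensairc; do
        # Only restore when a saved copy actually exists.
        if [ -f "$dir/.old$name" ]; then
            mv "$dir/.old$name" "$dir/.$name"
        fi
    done
}

restore_rc .
```

If the updated option files should be kept instead, this step can simply be omitted.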