diff man/man1/gensai.1 @ 0:5242703e91d3 tip
Initial checkin for AIM92 aimR8.2 (last updated May 1997).
author: tomwalters
date: Fri, 20 May 2011 15:19:45 +0100
parents: (none)
children: (none)
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/man/man1/gensai.1 Fri May 20 15:19:45 2011 +0100 @@ -0,0 +1,643 @@ +.TH GENSAI 1 "26 May 1995" +.LP +.SH NAME +.LP +gensai \- generate stabilised auditory image +.LP +.SH SYNOPSIS/SYNTAX +.LP +gensai [ option=value | -option ] filename +.LP +.SH DESCRIPTION +.LP + +Periodic sounds give rise to static, rather than oscillating, +perceptions indicating that temporal integration is applied to the NAP +in the production of our initial perception of a sound -- our auditory +image. Traditionally, auditory temporal integration is represented by +a simple leaky integration process and AIM provides a bank of lowpass +filters to enable the user to generate auditory spectra (Patterson, +1994a) and auditory spectrograms (Patterson et al., 1992b). However, +the leaky integrator removes the phase-locked fine structure observed +in the NAP, and this conflicts with perceptual data indicating that +the fine structure plays an important role in determining sound +quality and source identification (Patterson, 1994b; Patterson and +Akeroyd, 1995). As a result, AIM includes two modules which preserve +much of the time-interval information in the NAP during temporal +integration, and which produce a better representation of our auditory +images. In the functional version of AIM, this is accomplished with +strobed temporal integration (Patterson et al., 1992a,b), and this is +the topic of this manual entry. + +.LP + +In the physiological version of AIM, the auditory image is constructed +with a bank of autocorrelators (Slaney and Lyon, 1990; Meddis and +Hewitt, 1991). The autocorrelation module is an aimTool rather than +an integral part of the main program 'gen'. The appropriate tool is +'acgram'. Type 'manaim acgram' for the documentation. The module +extracts periodicity information and preserves intra-period fine +structure by autocorrelating each channel of the NAP separately. 
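The per-channel autocorrelation just described can be sketched in a few lines. This is an illustrative sketch only, not the 'acgram' implementation; the function name `correlogram` and the (channels, samples) array layout are assumptions:

```python
import numpy as np

def correlogram(nap, max_lag):
    """Autocorrelate each channel of a NAP separately.

    nap: array of shape (channels, samples), one row per filter channel.
    Returns one short-term autocorrelation function per channel, i.e.
    a single correlogram frame of shape (channels, max_lag).
    """
    channels, samples = nap.shape
    frame = np.zeros((channels, max_lag))
    for ch in range(channels):
        x = nap[ch]
        for lag in range(max_lag):
            # similarity of the channel with a delayed copy of itself;
            # a channel with period P gives peaks at lags 0, P, 2P, ...
            frame[ch, lag] = np.dot(x[:samples - lag], x[lag:])
    return frame
```

For a channel carrying a pulse train with a 20-sample period, the frame shows maxima at lags 0, 20, 40, and so on: periodicity is extracted while the intra-period fine structure of each channel is retained.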
The +correlogram is the multi-channel version +of this process. It was +originally introduced as a model of pitch perception (Licklider, +1951). It is not yet known whether STI or autocorrelation is more +realistic, or more efficient, as a means of simulating our perceived +auditory images. At present, the purpose is to provide a software +package that can be used to compare these auditory representations in +a way not previously possible. + +.RE +.LP +.SH STROBED TEMPORAL INTEGRATION +.PP + +In strobed temporal integration, a bank of delay lines is used to form +a buffer store for the NAP, one delay line per channel, and as the NAP +proceeds along the buffer it decays linearly with time, at about 2.5 +%/ms. Each channel of the buffer is assigned a strobe unit which +monitors activity in that channel looking for local maxima in the +stream of NAP pulses. When one is found, the unit initiates temporal +integration in that channel; that is, it transfers a copy of the NAP +at that instant to the corresponding channel of an image buffer and +adds it point-for-point with whatever is already there. The local +maximum itself is mapped to the 0-ms point in the image buffer. The +multi-channel version of this STI process is AIM's representation of +our auditory image of a sound. Periodic and quasi-periodic sounds +cause regular strobing which leads to simulated auditory images that +are static, or nearly static, but with the same temporal resolution as +the NAP. Dynamic sounds are represented as a sequence of auditory +image frames. If the rate of change in a sound is not too rapid, as in +diphthongs, features are seen to move smoothly as the sound proceeds, +much as objects move smoothly in animated cartoons. + +.LP +It is important to emphasise that the triggering is done on a +channel by channel basis and that triggering is asynchronous +across channels, inasmuch as the major peaks in one channel occur +at different times from the major peaks in other channels. 
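The strobe-and-add process described above can be sketched as follows. This is a minimal sketch of the mechanism, not the code in 'gen'; the function name, the per-sample threshold decay, and the threshold reset factor are all assumptions:

```python
import numpy as np

def strobed_integration(nap, width, thresh_decay=0.05, reset_gain=1.1):
    """Strobed temporal integration of a NAP, one channel at a time.

    nap:   array of shape (channels, samples) of NAP activity.
    width: number of time-interval points kept in the image; index 0 is
           the 0-ms point, larger indices are larger positive intervals.
    """
    channels, samples = nap.shape
    image = np.zeros((channels, width))
    for ch in range(channels):            # strobing is channel-independent
        x = nap[ch]
        threshold = 0.0
        for t in range(1, samples - 1):
            threshold *= 1.0 - thresh_decay
            local_max = x[t] > x[t - 1] and x[t] >= x[t + 1]
            if local_max and x[t] > threshold:
                # strobe: transfer a copy of the NAP up to this local
                # maximum into the image buffer, adding point-for-point,
                # with the maximum itself mapped to the 0-ms point
                n = min(width, t + 1)
                image[ch, :n] += x[t::-1][:n]
                threshold = reset_gain * x[t]   # reset above the peak
    return image
```

A periodic NAP strobes once per period, so successive copies land in register and pile up into a static image; each channel strobes on its own peaks, so the process is asynchronous across channels.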
It +is this aspect of the triggering process that causes the +alignment of the auditory image and which accounts for the loss +of phase information in the auditory system (Patterson, 1987). + +.LP + +The auditory image has the same vertical dimension as the neural +activity pattern (filter centre frequency). The continuous time +dimension of the neural activity pattern becomes a local, +time-interval dimension in the auditory image; specifically, it is +"the time interval between a given pulse and the succeeding strobe +pulse". In order to preserve the direction of asymmetry of features +that appear in the NAP, the time-interval origin is plotted towards +the right-hand edge of the image, with increasing, positive time +intervals proceeding towards the left. + +.LP +.SH OPTIONS +.LP +.SS "Display options for the auditory image" +.PP + +The options that control the positioning of the window in which the +auditory image appears are the same as those used to set up the +earlier windows, as are the options that control the level of the +image within the display. In addition, there are three new options +that are required to present this new auditory representation. The +options are frstep_aid, pwid_aid, and nwid_aid; the suffix "_aid" +means "auditory image display". These options are described here +before the options that control the image construction process itself, +as they occur first in the options list. There are also three extra +display options for presenting the auditory image in its spiral form; +these options have the suffix "_spd" for "spiral display"; they are +described in the manual entry for 'genspl'. + +.LP +.TP 17 +frstep_aid +The frame step interval, or the update interval for the auditory image display +.RS +Default units: ms. Default value: 16 ms. +.RE +.RS + +Conceptually, the auditory image exists continuously in time. The +simulation of the image produced by AIM is not continuous; rather it +is like an animated cartoon. 
Frames of the cartoon are calculated at +discrete points in time, and then the sequence of frames is replayed +to reveal the dynamics of the sound, or the lack of dynamics in the +case of periodic sounds. When the sound is changing at a rate where +we hear smooth glides, the structures in the simulated auditory image +move much like objects in a cartoon. frstep_aid determines the time +interval between frames of the auditory image cartoon. Frames are +calculated at time zero and integer multiples of segment_sai. + +.RE + +The default value (16 ms) is reasonable for musical sounds and speech +sounds. For a detailed examination of the development of the image of +brief transient sounds, frstep_aid should be decreased to 4 or even 2 +ms. +.LP +.TP 16 +pwidth_sai + +The maximum positive time interval presented in the display of the +auditory image (to the left of 0 ms). + +.RS +Default units: ms. Default value: 35 ms. +.RE +.LP +.TP 16 +nwidth_sai + +The maximum negative time interval presented in the display of the +auditory image (to the right of 0 ms). + +.RS +Default units: ms. Default value: -5 ms. +.RE + +.LP +.TP 12 +animate +Present the frames of the simulated auditory image as a cartoon. +.RS +Switch. Default off. +.RE +.RS + +With reasonable resolution and a reasonable frame rate, the auditory +cartoon for a second of sound will require on the order of 1 Mbyte of +storage. As a result, auditory cartoons are only stored at the +specific request of the user. When the animate flag is set to `on', +the bit maps that constitute the frames of the auditory cartoon are +stored in computer memory. They can then be replayed as an auditory +cartoon by pressing `carriage return'. To exit the instruction, type +"q" for `quit' or "control c". The bit maps are discarded unless +option bitmap=on. + +.RE +.LP +.SS "Storage options for the auditory image " +.PP + +A record of the auditory image can be stored in two ways depending on +the purpose for which it is stored. 
The actual numerical values of +the auditory image can be stored as previously, by setting output=on. +In this case, a file with a .sai suffix will be created in accordance +with the conventions of the software. These values can be recalled +for further processing with the aimTools. In this regard the SAI +module is like any previous module. + +.LP +It is also possible to store the bit maps which are displayed on +the screen for the auditory image cartoon. The bit maps require +less storage space and reload more quickly, so this is the +preferred mode of storage when one simply wants to review the +visual image. +.LP +.TP 10 +bitmap +Produce a bit-map storage file +.RS +Switch. Default value: off. +.RE +.RS + +When the bitmap option is set to `on', the bit maps are stored in a +file with the suffix .ctn. The bitmaps are reloaded into memory using +the commands review, or xreview, followed by the file name without the +suffix .ctn. The auditory image can then be replayed, as with animate, +by typing `carriage return'. xreview is the newer and preferred +display routine. It enables the user to select subsets of the cartoon +and to change the rate of play via a convenient control window. + + + +.LP +The strobe mechanism is relatively simple. A trigger threshold +value is maintained for each channel and when a NAP pulse exceeds +the threshold a trigger pulse is generated at the time associated +with the maximum of the peak. The threshold value is then reset +to a value somewhat above the height of the current NAP peak and +the threshold value decays exponentially with time thereafter. + + + +There are six options with the suffix "_ai", short for +'auditory image'. Four of these control STI itself -- stdecay_ai, +stcrit_ai, stthresh_ai and decay_ai. The option stinfo_ai is a switch +that causes the software to produce information about the current STI +analysis for demonstration or diagnostic purposes. 
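The adaptive threshold described above (exceed the threshold, strobe at the peak, reset above the peak, decay exponentially) can be traced for a single channel with a few lines of code. This sketch returns only the strobe times, to show which NAP peaks trigger and which are suppressed; the reset factor is an assumption, since the text says only that the threshold is reset "somewhat above" the peak:

```python
def strobe_times(nap_channel, ms_per_sample=1.0, stdecay=5.0, reset_gain=1.1):
    """Return the indices at which one NAP channel strobes.

    stdecay: threshold decay rate in %/ms (cf. the stdecay_ai default
    of 5 %/ms); ms_per_sample converts it to a per-sample factor.
    """
    decay = 1.0 - (stdecay / 100.0) * ms_per_sample
    threshold = 0.0
    times = []
    for t, v in enumerate(nap_channel):
        threshold *= decay                       # threshold decays away
        prev = nap_channel[t - 1] if t > 0 else 0.0
        nxt = nap_channel[t + 1] if t + 1 < len(nap_channel) else 0.0
        if v > threshold and v > prev and v >= nxt:
            times.append(t)                      # strobe on this peak
            threshold = reset_gain * v           # reset above the peak
    return times
```

A large peak followed closely by a smaller one strobes only once: the reset threshold has not decayed far enough when the smaller peak arrives, so the smaller peak is ignored, whereas a pulse train with well-separated peaks strobes on every pulse.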
The final option, +napdecay_ai, controls the decay rate for the NAP while it flows down +the NAP buffer. + +.LP +.TP 17 +napdecay_ai +Decay rate for the neural activity pattern (NAP) +.RS +Default units: %/ms. Default value: 2.5 %/ms. +.RE +.RS + +napdecay_ai determines the rate at which the information in the neural +activity pattern decays as it proceeds along the auditory buffer that +stores the NAP prior to temporal integration. +.RE + + +.LP +.TP 16 +stdecay_ai +Strobe threshold decay rate +.RS +Default units: %/ms. Default value: 5 %/ms. +.RE +.RS +stdecay_ai determines the rate at which the strobe threshold decays. +.RE +.LP +General purpose pitch mechanisms based on peak picking are +notoriously difficult to design, and the trigger mechanism just +described would not work well on an arbitrary acoustic waveform. +The reason that this simple trigger mechanism is sufficient for +the construction of the auditory image is that NAP functions are +highly constrained. The microstructure reveals a function that +rises from zero to a local maximum smoothly and returns smoothly +back to zero where it stays for more than half of a period of the +centre frequency of that channel. On the longer time scale, the +amplitude of successive peaks changes only relatively slowly with +respect to time. As a result, for periodic sounds there tends +to be one clear maximum per period in all but the lowest channels +where there is an integer number of maxima per period. The +simplicity of the NAP functions follows from the fact that the +acoustic waveform has passed through a narrow band filter and so +it has a limited number of degrees of freedom. In all but the +highest frequency channels, the output of the auditory filter +resembles a modulated sine wave whose frequency is near the +centre frequency of the filter. 
Thus the neural activity pattern +is largely restricted to a set of peaks which are modified +versions of the positive halves of a sine wave, and the remaining +degrees of freedom appear as relatively slow changes in peak +amplitude and relatively small changes in peak time (or phase). +.LP +.LP +When the acoustic input terminates, the auditory image must +decay. In the ASP model the form of the decay is exponential and +the decay rate is determined by decay_ai. +.LP +.TP 18 +decay_ai +SAI decay time constant +.RS +Default units: ms. Default value: 30 ms. +.RE +.RS +decay_ai determines the rate at which the auditory image decays. +.RE +.RS + +In addition, decay_ai determines the rate at which the strength of the +auditory image increases and the level to which it asymptotes if the +sound continues indefinitely. In an exponential process, the asymptote +is reached when the increment provided by each new cycle of the sound +equals the amount that the image decays over the same period. + +.RE +.SH MOTIVATION +.LP +.SS "Auditory temporal integration: The problem " +.PP +Image stabilisation and temporal smearing. +.LP +When the input to the auditory system is a periodic sound like +pt_8ms or ae_8ms, the output of the cochlea is a rapidly flowing +neural activity pattern on which the information concerning the +source repeats every 8 ms. Consider the display problem that +would arise if one attempted to present a one second sample of +either pt_8ms or ae_8ms with the resolution and format of Figure +5.2. In that figure each 8 ms period of the sound occupies about +4 cm of width. There are 125 repetitions of the period in one +second and so a paper version of the complete NAP would be 5 +metres in length. If the NAP were presented as a real-time flow +process, the paper would have to move past a typical window at +the rate of 5 metres per second! At this rate, the temporal +detail within the cycle would be lost. 
The image would be stable +but the information would be reduced to horizontal banding. The +fine-grain temporal information is lost because the integration +time of the visual system is long with respect to the rate of +flow of information when the record is moving at 5 metres a +second. +.LP +Traditional models of auditory temporal integration are similar +to visual models. They assume that we hear a stable auditory +image in response to a periodic sound because the neural activity +is passed through a temporal weighting function that integrates +over time. The output does not fluctuate if the integration time +is long enough. Unfortunately, the simple model of temporal +integration does not work for the auditory system. If the output +is to be stable, the integrator must integrate over 10 or more +cycles of the sound. We hear stable images for pitches as low +as, say 50 cycles per second, which suggests that the integration +time of the auditory system would have to be 200 ms at the +minimum. Such an integrator would cause far more smearing of +auditory information than we know occurs. For example, phase +shifts that produce small changes half way through the period of +a pulse train are often audible (see Patterson, 1987, for a +review). Small changes of this sort would be obscured by lengthy +temporal integration. +.LP +Thus the problem in modelling auditory temporal integration is +to determine how the auditory system can integrate information +to form a stable auditory image without losing the fine-grain +temporal information within the individual cycles of periodic +sounds. In visual terms, the problem is how to present a neural +activity pattern at a rate of 5 metres per second while at the +same time enabling the viewer to see features within periods +greater than about 4 ms. +.LP +.SS "Periodic sounds and information packets. 
" +.PP +Now consider temporal integration from an information processing +perspective, and in particular, the problem of preserving formant +information in the auditory image. The shape of the neural +activity pattern within the period of a vowel sound provides +information about the resonances of the vocal tract (see Figure +3.6), and thus the identity of the vowel. The information about +the source arrives in packets whose duration is the period of the +source. Many of the sounds in speech and music have the property +that the source information changes relatively slowly when +compared with the repetition rate of the source wave (i.e. the +pitch). Thus, from an information processing point of view, one +would like to combine source information from neighbouring +packets, while at the same time taking care not to smear the +source information contained within the individual packets. In +short, one would like to perform quantised temporal integration, +integrating over cycles but not within cycles of the sound. +.LP +.SH EXAMPLES +.LP +This first pair of examples is intended to illustrate the +dominant forms of motion that appear in the auditory image, and +the fact that shapes can be tracked across the image provided the +rate of change is not excessive. The first example is a pitch +glide for a note with fixed timbre. The second example involves +formant motion (a form of timbre glide) in a monotone voice (i.e. +for a relatively fixed pitch). +.LP +.SS "A pitch glide in the auditory image " +.PP +Up to this point, we have focussed on the way that TQTI can +convert a fast flowing NAP pattern into a stabilised auditory +image. The mechanism is not, however, limited to continuous or +stationary sounds. The data file cegc contains pulse trains that +produce pitches near the musical notes C3, E3, G3, and C4, along +with glides from one note to the next. The notes are relatively +long and the pitch glides are relatively slow. 
As a result, each +note forms a stabilised auditory image and there is smooth motion +from one note image to the next. The stimulus file cegc is +intended to support several examples including ones involving the +spiral representation of the auditory image and its relationship +to musical consonance in the next chapter. For brevity, the +current example is limited to the transition from C to E near the +start of the file. The pitch of musical notes is determined by +the lower harmonics when they are present and so the command for +the example is: +.LP +gensai mag=16 min=100 max=2000 start=100 length=600 +duration_sai=32 cegc +.LP +In point of fact, the pulse train associated with the first note +has a period of 8 ms like pt_8ms and so this "C" is actually a +little below the musical note C3. Since the initial C is the +same as pt_8ms, the onset of the first note is the same as in the +previous example; however, four cycles of the pulse train pattern +build up in the window because it has been set to show 32 ms of +'auditory image time'. During the transition, the period of the +stimulus decreases from 32/4 ms down to 32/5 ms, and so the image +stabilises with five cycles in the window. The period of E is +4/5 that of C. +.LP +During the transition, in the lower channels associated with the +first and second harmonic, the individual SAI pulses march from +left to right in time and, at the same time, they move up in +frequency as the energy of these harmonics moves out of lower +filters and into higher filters. In these low channels the +motion is relatively smooth because the SAI pulse has a duration +which is a significant proportion of the period of the sound. As +the pitch rises and the periods get shorter, each new NAP cycle +contributes a NAP pulse which is shifted a little to the right +relative to the corresponding SAI pulse. This increases the +leading edge of the SAI pulse without contributing to the lagging +edge. 
As a result, the leading edge builds, the lagging edge +decays, and the SAI pulse moves to the right. The SAI pulses are +asymmetric during the motion, with the trailing edge shallower +than the leading edge, and the effect is greater towards the left +edge of the image because the discrepancies over four cycles are +larger than the discrepancies over one cycle. The effects are +larger for the second harmonic than for the first harmonic +because the width of the pulses of the second harmonic is a +smaller proportion of the period. During the pitch glide the SAI +pulses have a reduced peak height because the activity is +distributed over more channels and over longer durations. +.LP +The SAI pulses associated with the higher harmonics are +relatively narrow with regard to the changes in period during the +pitch glide. As a result there is more blurring of the image +during the glide in the higher channels. Towards the right-hand +edge, for the column that shows correlations over one cycle, the +blurring is minimal. Towards the left-hand edge the details of +the pattern are blurred and we see mainly activity moving in +vertical bands from left to right. When the glide terminates the +fine structure reforms from right to left across the image and +the stationary image for the note E appears. +.LP +The details of the motion are more readily observed when the +image is played in slow motion. If the disc space is available +(about 1.3 Mbytes), it is useful to generate a cegc.img file +using the image option. The auditory image can then be played +in slow motion using the review command and the slow-down option +"-". +.LP +.LP +.SS "Formant motion in the auditory image " +.PP +The vowels of speech are quasi-periodic sounds and the period for +the average male speaker is on the order of 8 ms. As the +articulators change the shape of the vocal tract during speech, +formants appear in the auditory image and move about. 
The +position and motion of the formants represent the speech +information conveyed by the voiced parts of speech. When the +speaker uses a monotone voice, the pitch remains relatively +steady and the motion of the formants is essentially in the +vertical dimension. An example of monotone voiced speech is +provided in the file leo, which is the acoustic waveform of the +word 'leo'. The auditory image of leo can be produced using the +command +.LP +gensai mag=12 segment=40 duration_sai=20 leo +.LP +The dominant impression on first observing the auditory image of +leo is the motion in the formation of the "e" sound, the +transition from "e" to "o", and the formation of the "o" sound. +.LP +The vocal cords come on at the start of the "l" sound but the +tip of the tongue is pressed against the roof of the mouth just +behind the teeth and so it restricts the air flow and the start +of the "l" does not contain much energy. As a result, in the +auditory image, the presence of the "l" is primarily observed in +the transition from the "l" to the "e". That is, as the three +formants in the auditory image of the "e" come on and grow +stronger, the second formant glides into its "e" position from +below, indicating that the second formant was recently at a lower +frequency for the previous sound. +.LP +In the "e", the first formant is low, centred on the third +harmonic at the bottom of the auditory image. The second formant +is high, up near the third formant. The lower portion of the +fourth formant shows along the upper edge of the image. +Recognition systems that ignore temporal fine structure often +have difficulty determining whether a high frequency +concentration of energy is a single broad formant or a pair of +narrower formants close together. This makes it more difficult +to distinguish "e". 
In the auditory image, information about the +pulsing of the vocal cords is maintained and the temporal +fluctuation of the formant shapes makes it easier to distinguish +that there are two overlapping formants rather than a single +large formant. +.LP +As the "e" changes into the "o", the second formant moves back +down onto the eighth harmonic and the first formant moves up to +a position between the third and fourth harmonics. The third and +fourth formants remain relatively fixed in frequency but they +become softer as the "o" takes over. During the transition, the +second formant becomes fuzzy and moves down a set of vertical +ridges at multiples of the period. +.LP +.LP +.SS "The vowel triangle: aiua " +.PP +In speech research, the vowels are specified by the centre +frequencies of their formants. The first two formants carry the +most information and it is common to see sets of vowels +represented on a graph whose axes are the centre frequencies of +the first and second formant. Not all combinations of these +formant frequencies occur in speech; rather, the vowels occupy a +triangular region within this vowel space and the points of the +triangle are represented by /a/ as in paw, /i/ as in beet, and /u/ as +in toot. The file aiua contains a synthetic speech wave that +provides a tour around the vowel triangle from /a/ to /i/ to /u/ +and back to /a/, and there are smooth transitions from one vowel +to the next. The auditory image of aiua can be generated using +the command +.LP +gensai mag=12 segment=40 duration=20 aiua +.LP +The initial vowel /a/ has a high first formant centred on the +fifth harmonic and a low second formant centred between the +seventh and eighth harmonics (for these low formants the harmonic +number can be determined by counting the number of SAI peaks in +one period of the image). The third formant is at the top of the +image and it is reasonably strong, although relatively short in +duration. 
As the sound changes from /a/ to /i/, the first formant +moves successively down through the low harmonics and comes to +rest on the second harmonic. At the same time the second formant +moves all the way up to a position adjacent to the third formant, +similar to the "e" in leo. All three of the formants are +relatively strong. During the transition from the /i/ to the +/u/, the third formant becomes much weaker. The second formant +moves down onto the seventh harmonic and it remains relatively +weak. The first formant remains centred on the second harmonic +and it is relatively strong. Finally, the formants return to +their /a/ positions. +.LP +.LP +.SS "Speaker separation in the auditory image " +.PP +One of the more intriguing aspects of speech recognition is our +ability to hear out one voice in the presence of competing voices +-- the proverbial cocktail party phenomenon. It is assumed that +we use pitch differences to help separate the voices. In support +of this view, several researchers have presented listeners with +pairs of vowels and shown that they can discriminate the vowels +better when they have different pitches (Summerfield & Assmann, +1989). The final example involves a double vowel stimulus, /a/ +with /i/, and it shows that stable images of the dominant +formants of both vowels appear in the image. The file dblvow +(double vowel) contains seven double-vowel pulses. The amplitude +of the /a/ is fixed at a moderate level; the amplitude of the +/i/ begins at a level 12 dB greater than that of the /a/ and it +decreases 4 dB with each successive pulse, and so they are equal +in level in the fourth pulse. Each pulse is 200 ms in duration +with 20 ms rise and fall times that are included within the 200 +ms. There are 80 ms silent gaps between pulses and a gap of 80 +ms at the start of the file. 
The auditory image can be generated +with the command +.LP +gensai mag=12 samplerate=10000 segment=40 duration=20 dblvow +.LP +The pitches of the /a/ and the /i/ are 100 and 125 Hz, respectively. +The image reveals a strong first formant centred on the second +harmonic of 125 Hz (8 ms), and strong third and fourth formants +with a period of 8 ms (125 Hz). These are the formants of the +/i/, which is the stronger of the two vowels at this point. In +between the first and second formants of the /i/ are the first +and second formants of the /a/ at a somewhat lower level. The +formants of the /a/ show their proper period, 10 ms. The +triggering mechanism can stabilise the formants of both vowels +at their proper periods because the triggering is done on a +channel by channel basis. The upper formants of the /a/ fall in +the same channels as the upper formants of the /i/ and since they +are much weaker, they are suppressed by the /i/ formants. +.LP +As the example proceeds, the formants of the /i/ become +progressively weaker. In the image of the fifth burst of the +double vowel we see evidence of both the upper formants of the +/i/ and the upper formants of the /a/ in the same channel. +Finally, in the last burst the first formant of the /i/ has +disappeared from the lowest channels entirely. There is still +some evidence of /i/ in the region of the upper formants but it +is the formants of the /a/ that now dominate in the high frequency +region. +.LP +.SH SEE ALSO +.LP +.SH COPYRIGHT +.LP +Copyright (c) Applied Psychology Unit, Medical Research Council, 1995 +.LP +Permission to use, copy, modify, and distribute this software without fee +is hereby granted for research purposes, provided that this copyright +notice appears in all copies and in all supporting documentation, and that +the software is not redistributed for any fee (except for a nominal +shipping charge). 
Anyone wanting to incorporate all or part of this +software in a commercial product must obtain a license from the Medical +Research Council. +.LP +The MRC makes no representations about the suitability of this +software for any purpose. It is provided "as is" without express or +implied warranty. +.LP +THE MRC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING +ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL +THE A.P.U. BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES +OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, +WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, +ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS +SOFTWARE. +.LP +.SH ACKNOWLEDGEMENTS +.LP +The AIM software was developed for Unix workstations by John +Holdsworth and Mike Allerhand of the MRC APU, under the direction of +Roy Patterson. The physiological version of AIM was developed by +Christian Giguere. The options handler is by Paul Manson. The revised +SAI module is by Jay Datta. Michael Akeroyd extended the PostScript +facilities and developed the xreview routine for auditory image +cartoons. +.LP +The project was supported by the MRC and grants from the U.K. Defence +Research Agency, Farnborough (Research Contract 2239); the EEC Esprit +BR Programme, Project ACTS (3207); and the U.K. Hearing Research Trust.