diff man/man1/gensai.1 @ 0:5242703e91d3 tip
Initial checkin for AIM92 aimR8.2 (last updated May 1997).
author: tomwalters
date: Fri, 20 May 2011 15:19:45 +0100
parents: (none)
children: (none)
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/man/man1/gensai.1 Fri May 20 15:19:45 2011 +0100 @@ -0,0 +1,643 @@ +.TH GENSAI 1 "26 May 1995" +.LP +.SH NAME +.LP +gensai \- generate stabilised auditory image +.LP +.SH SYNOPSIS/SYNTAX +.LP +gensai [ option=value | -option ] filename +.LP +.SH DESCRIPTION +.LP + +Periodic sounds give rise to static, rather than oscillating, +perceptions indicating that temporal integration is applied to the NAP +in the production of our initial perception of a sound -- our auditory +image. Traditionally, auditory temporal integration is represented by +a simple leaky integration process and AIM provides a bank of lowpass +filters to enable the user to generate auditory spectra (Patterson, +1994a) and auditory spectrograms (Patterson et al., 1992b). However, +the leaky integrator removes the phase-locked fine structure observed +in the NAP, and this conflicts with perceptual data indicating that +the fine structure plays an important role in determining sound +quality and source identification (Patterson, 1994b; Patterson and +Akeroyd, 1995). As a result, AIM includes two modules which preserve +much of the time-interval information in the NAP during temporal +integration, and which produce a better representation of our auditory +images. In the functional version of AIM, this is accomplished with +strobed temporal integration (Patterson et al., 1992a,b), and this is +the topic of this manual entry. + +.LP + +In the physiological version of AIM, the auditory image is constructed +with a bank of autocorrelators (Slaney and Lyon, 1990; Meddis and +Hewitt, 1991). The autocorrelation module is an aimTool rather than +an integral part of the main program 'gen'. The appropriate tool is +'acgram'. Type 'manaim acgram' for the documentation. The module +extracts periodicity information and preserves intra-period fine +structure by autocorrelating each channel of the NAP separately. 
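The per-channel autocorrelation just described can be sketched in a few lines. This is an illustrative sketch only, not the 'acgram' implementation; the function name `correlogram` and the (channels, samples) array layout are assumptions:

```python
import numpy as np

def correlogram(nap, max_lag):
    """Autocorrelate each channel of a NAP separately.

    nap: array of shape (channels, samples), one row per filter channel.
    Returns one short-term autocorrelation function per channel, i.e.
    a single correlogram frame of shape (channels, max_lag).
    """
    channels, samples = nap.shape
    frame = np.zeros((channels, max_lag))
    for ch in range(channels):
        x = nap[ch]
        for lag in range(max_lag):
            # similarity of the channel with a delayed copy of itself;
            # a channel with period P gives peaks at lags 0, P, 2P, ...
            frame[ch, lag] = np.dot(x[:samples - lag], x[lag:])
    return frame
```

For a channel carrying a pulse train with a 20-sample period, the frame shows maxima at lags 0, 20, 40, and so on: periodicity is extracted while the intra-period fine structure of each channel is retained.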
The +correlogram is the multi-channel version +of this process. It was +originally introduced as a model of pitch perception (Licklider, +1951). It is not yet known whether STI or autocorrelation is more +realistic, or more efficient, as a means of simulating our perceived +auditory images. At present, the purpose is to provide a software +package that can be used to compare these auditory representations in +a way not previously possible. + +.RE +.LP +.SH STROBED TEMPORAL INTEGRATION +.PP + +In strobed temporal integration, a bank of delay lines is used to form +a buffer store for the NAP, one delay line per channel, and as the NAP +proceeds along the buffer it decays linearly with time, at about 2.5 +%/ms. Each channel of the buffer is assigned a strobe unit which +monitors activity in that channel looking for local maxima in the +stream of NAP pulses. When one is found, the unit initiates temporal +integration in that channel; that is, it transfers a copy of the NAP +at that instant to the corresponding channel of an image buffer and +adds it point-for-point with whatever is already there. The local +maximum itself is mapped to the 0-ms point in the image buffer. The +multi-channel version of this STI process is AIM's representation of +our auditory image of a sound. Periodic and quasi-periodic sounds +cause regular strobing which leads to simulated auditory images that +are static, or nearly static, but with the same temporal resolution as +the NAP. Dynamic sounds are represented as a sequence of auditory +image frames. If the rate of change in a sound is not too rapid, as in +diphthongs, features are seen to move smoothly as the sound proceeds, +much as objects move smoothly in animated cartoons. + +.LP +It is important to emphasise that the triggering is done on a +channel by channel basis and that triggering is asynchronous +across channels, inasmuch as the major peaks in one channel occur +at different times from the major peaks in other channels. 
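The strobe-and-add process described above can be sketched as follows. This is a minimal sketch of the mechanism, not the code in 'gen'; the function name, the per-sample threshold decay, and the threshold reset factor are all assumptions:

```python
import numpy as np

def strobed_integration(nap, width, thresh_decay=0.05, reset_gain=1.1):
    """Strobed temporal integration of a NAP, one channel at a time.

    nap:   array of shape (channels, samples) of NAP activity.
    width: number of time-interval points kept in the image; index 0 is
           the 0-ms point, larger indices are larger positive intervals.
    """
    channels, samples = nap.shape
    image = np.zeros((channels, width))
    for ch in range(channels):            # strobing is channel-independent
        x = nap[ch]
        threshold = 0.0
        for t in range(1, samples - 1):
            threshold *= 1.0 - thresh_decay
            local_max = x[t] > x[t - 1] and x[t] >= x[t + 1]
            if local_max and x[t] > threshold:
                # strobe: transfer a copy of the NAP up to this local
                # maximum into the image buffer, adding point-for-point,
                # with the maximum itself mapped to the 0-ms point
                n = min(width, t + 1)
                image[ch, :n] += x[t::-1][:n]
                threshold = reset_gain * x[t]   # reset above the peak
    return image
```

A periodic NAP strobes once per period, so successive copies land in register and pile up into a static image; each channel strobes on its own peaks, so the process is asynchronous across channels.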
It +is this aspect of the triggering process that causes the +alignment of the auditory image and which accounts for the loss +of phase information in the auditory system (Patterson, 1987). + +.LP + +The auditory image has the same vertical dimension as the neural +activity pattern (filter centre frequency). The continuous time +dimension of the neural activity pattern becomes a local, +time-interval dimension in the auditory image; specifically, it is +"the time interval between a given pulse and the succeeding strobe +pulse". In order to preserve the direction of asymmetry of features +that appear in the NAP, the time-interval origin is plotted towards +the right-hand edge of the image, with increasing, positive time +intervals proceeding towards the left. + +.LP +.SH OPTIONS +.LP +.SS "Display options for the auditory image" +.PP + +The options that control the positioning of the window in which the +auditory image appears are the same as those used to set up the +earlier windows, as are the options that control the level of the +image within the display. In addition, there are three new options +that are required to present this new auditory representation. The +options are frstep_aid, pwid_aid, and nwid_aid; the suffix "_aid" +means "auditory image display". These options are described here +before the options that control the image construction process itself, +as they occur first in the options list. There are also three extra +display options for presenting the auditory image in its spiral form; +these options have the suffix "_spd" for "spiral display"; they are +described in the manual entry for 'genspl'. + +.LP +.TP 17 +frstep_aid +The frame step interval, or the update interval for the auditory image display +.RS +Default units: ms. Default value: 16 ms. +.RE +.RS + +Conceptually, the auditory image exists continuously in time. The +simulation of the image produced by AIM is not continuous; rather it +is like an animated cartoon. 
Frames of the cartoon are calculated at +discrete points in time, and then the sequence of frames is replayed +to reveal the dynamics of the sound, or the lack of dynamics in the +case of periodic sounds. When the sound is changing at a rate where +we hear smooth glides, the structures in the simulated auditory image +move much like objects in a cartoon. frstep_aid determines the time +interval between frames of the auditory image cartoon. Frames are +calculated at time zero and integer multiples of segment_sai. + +.RE + +The default value (16 ms) is reasonable for musical sounds and speech +sounds. For a detailed examination of the development of the image of +brief transient sounds, frstep_aid should be decreased to 4 or even 2 +ms. +.LP +.TP 16 +pwidth_sai + +The maximum positive time interval presented in the display of the +auditory image (to the left of 0 ms). + +.RS +Default units: ms. Default value: 35 ms. +.RE +.LP +.TP 16 +nwidth_sai + +The maximum negative time interval presented in the display of the +auditory image (to the right of 0 ms). + +.RS +Default units: ms. Default value: -5 ms. +.RE + +.LP +.TP 12 +animate +Present the frames of the simulated auditory image as a cartoon. +.RS +Switch. Default off. +.RE +.RS + +With reasonable resolution and a reasonable frame rate, the auditory +cartoon for a second of sound will require on the order of 1 Mbyte of +storage. As a result, auditory cartoons are only stored at the +specific request of the user. When the animate flag is set to `on', +the bit maps that constitute the frames of the auditory cartoon are +stored in computer memory. They can then be replayed as an auditory +cartoon by pressing `carriage return'. To exit the instruction, type +"q" for `quit' or "control c". The bit maps are discarded unless +option bitmap=on. + +.RE +.LP +.SS "Storage options for the auditory image " +.PP + +A record of the auditory image can be stored in two ways depending on +the purpose for which it is stored. 
The actual numerical values of +the auditory image can be stored as previously, by setting output=on. +In this case, a file with a .sai suffix will be created in accordance +with the conventions of the software. These values can be recalled +for further processing with the aimTools. In this regard the SAI +module is like any previous module. + +.LP +It is also possible to store the bit maps which are displayed on +the screen for the auditory image cartoon. The bit maps require +less storage space and reload more quickly, so this is the +preferred mode of storage when one simply wants to review the +visual image. +.LP +.TP 10 +bitmap +Produce a bit-map storage file +.RS +Switch. Default value: off. +.RE +.RS + +When the bitmap option is set to `on', the bit maps are stored in a +file with the suffix .ctn. The bitmaps are reloaded into memory using +the commands review, or xreview, followed by the file name without the +suffix .ctn. The auditory image can then be replayed, as with animate, +by typing `carriage return'. xreview is the newer and preferred +display routine. It enables the user to select subsets of the cartoon +and to change the rate of play via a convenient control window. + + + +.LP +The strobe mechanism is relatively simple. A trigger threshold +value is maintained for each channel and when a NAP pulse exceeds +the threshold a trigger pulse is generated at the time associated +with the maximum of the peak. The threshold value is then reset +to a value somewhat above the height of the current NAP peak and +the threshold value decays exponentially with time thereafter. + + + +There are six options with the suffix "_ai", short for +'auditory image'. Four of these control STI itself -- stdecay_ai, +stcrit_ai, stthresh_ai and decay_ai. The option stinfo_ai is a switch +that causes the software to produce information about the current STI +analysis for demonstration or diagnostic purposes. 
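The adaptive threshold described above (exceed the threshold, strobe at the peak, reset above the peak, decay exponentially) can be traced for a single channel with a few lines of code. This sketch returns only the strobe times, to show which NAP peaks trigger and which are suppressed; the reset factor is an assumption, since the text says only that the threshold is reset "somewhat above" the peak:

```python
def strobe_times(nap_channel, ms_per_sample=1.0, stdecay=5.0, reset_gain=1.1):
    """Return the indices at which one NAP channel strobes.

    stdecay: threshold decay rate in %/ms (cf. the stdecay_ai default
    of 5 %/ms); ms_per_sample converts it to a per-sample factor.
    """
    decay = 1.0 - (stdecay / 100.0) * ms_per_sample
    threshold = 0.0
    times = []
    for t, v in enumerate(nap_channel):
        threshold *= decay                       # threshold decays away
        prev = nap_channel[t - 1] if t > 0 else 0.0
        nxt = nap_channel[t + 1] if t + 1 < len(nap_channel) else 0.0
        if v > threshold and v > prev and v >= nxt:
            times.append(t)                      # strobe on this peak
            threshold = reset_gain * v           # reset above the peak
    return times
```

A large peak followed closely by a smaller one strobes only once: the reset threshold has not decayed far enough when the smaller peak arrives, so the smaller peak is ignored, whereas a pulse train with well-separated peaks strobes on every pulse.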
The final option, +napdecay_ai, controls the decay rate for the NAP while it flows down +the NAP buffer. + +.LP +.TP 17 +napdecay_ai +Decay rate for the neural activity pattern (NAP) +.RS +Default units: %/ms. Default value: 2.5 %/ms. +.RE +.RS + +napdecay_ai determines the rate at which the information in the neural +activity pattern decays as it proceeds along the auditory buffer that +stores the NAP prior to temporal integration. +.RE + + +.LP +.TP 16 +stdecay_ai +Strobe threshold decay rate +.RS +Default units: %/ms. Default value: 5 %/ms. +.RE +.RS +stdecay_ai determines the rate at which the strobe threshold decays. +.RE +.LP +General purpose pitch mechanisms based on peak picking are +notoriously difficult to design, and the trigger mechanism just +described would not work well on an arbitrary acoustic waveform. +The reason that this simple trigger mechanism is sufficient for +the construction of the auditory image is that NAP functions are +highly constrained. The microstructure reveals a function that +rises from zero to a local maximum smoothly and returns smoothly +back to zero where it stays for more than half of a period of the +centre frequency of that channel. On the longer time scale, the +amplitude of successive peaks changes only relatively slowly with +respect to time. As a result, for periodic sounds there tends +to be one clear maximum per period in all but the lowest channels +where there is an integer number of maxima per period. The +simplicity of the NAP functions follows from the fact that the +acoustic waveform has passed through a narrow band filter and so +it has a limited number of degrees of freedom. In all but the +highest frequency channels, the output of the auditory filter +resembles a modulated sine wave whose frequency is near the +centre frequency of the filter. 
Thus the neural activity pattern +is largely restricted to a set of peaks which are modified +versions of the positive halves of a sine wave, and the remaining +degrees of freedom appear as relatively slow changes in peak +amplitude and relatively small changes in peak time (or phase). +.LP +.LP +When the acoustic input terminates, the auditory image must +decay. In the ASP model the form of the decay is exponential and +the decay rate is determined by decay_ai. +.LP +.TP 18 +decay_ai +SAI decay time constant +.RS +Default units: ms. Default value: 30 ms. +.RE +.RS +decay_ai determines the rate at which the auditory image decays. +.RE +.RS + +In addition, decay_ai determines the rate at which the strength of the +auditory image increases and the level to which it asymptotes if the +sound continues indefinitely. In an exponential process, the asymptote +is reached when the increment provided by each new cycle of the sound +equals the amount that the image decays over the same period. + +.RE +.SH MOTIVATION +.LP +.SS "Auditory temporal integration: The problem " +.PP +Image stabilisation and temporal smearing. +.LP +When the input to the auditory system is a periodic sound like +pt_8ms or ae_8ms, the output of the cochlea is a rapidly flowing +neural activity pattern on which the information concerning the +source repeats every 8 ms. Consider the display problem that +would arise if one attempted to present a one second sample of +either pt_8ms or ae_8ms with the resolution and format of Figure +5.2. In that figure each 8 ms period of the sound occupies about +4 cm of width. There are 125 repetitions of the period in one +second and so a paper version of the complete NAP would be 5 +metres in length. If the NAP were presented as a real-time flow +process, the paper would have to move past a typical window at +the rate of 5 metres per second! At this rate, the temporal +detail within the cycle would be lost. 
The image would be stable +but the information would be reduced to horizontal banding. The +fine-grain temporal information is lost because the integration +time of the visual system is long with respect to the rate of +flow of information when the record is moving at 5 metres a +second. +.LP +Traditional models of auditory temporal integration are similar +to visual models. They assume that we hear a stable auditory +image in response to a periodic sound because the neural activity +is passed through a temporal weighting function that integrates +over time. The output does not fluctuate if the integration time +is long enough. Unfortunately, the simple model of temporal +integration does not work for the auditory system. If the output +is to be stable, the integrator must integrate over 10 or more +cycles of the sound. We hear stable images for pitches as low +as, say 50 cycles per second, which suggests that the integration +time of the auditory system would have to be 200 ms at the +minimum. Such an integrator would cause far more smearing of +auditory information than we know occurs. For example, phase +shifts that produce small changes half way through the period of +a pulse train are often audible (see Patterson, 1987, for a +review). Small changes of this sort would be obscured by lengthy +temporal integration. +.LP +Thus the problem in modelling auditory temporal integration is +to determine how the auditory system can integrate information +to form a stable auditory image without losing the fine-grain +temporal information within the individual cycles of periodic +sounds. In visual terms, the problem is how to present a neural +activity pattern at a rate of 5 metres per second while at the +same time enabling the viewer to see features within periods +greater than about 4 ms. +.LP +.SS "Periodic sounds and information packets. 
" +.PP +Now consider temporal integration from an information processing +perspective, and in particular, the problem of preserving formant +information in the auditory image. The shape of the neural +activity pattern within the period of a vowel sound provides +information about the resonances of the vocal tract (see Figure +3.6), and thus the identity of the vowel. The information about +the source arrives in packets whose duration is the period of the +source. Many of the sounds in speech and music have the property +that the source information changes relatively slowly when +compared with the repetition rate of the source wave (i.e. the +pitch). Thus, from an information processing point of view, one +would like to combine source information from neighbouring +packets, while at the same time taking care not to smear the +source information contained within the individual packets. In +short, one would like to perform quantised temporal integration, +integrating over cycles but not within cycles of the sound. +.LP +.SH EXAMPLES +.LP +This first pair of examples is intended to illustrate the +dominant forms of motion that appear in the auditory image, and +the fact that shapes can be tracked across the image provided the +rate of change is not excessive. The first example is a pitch +glide for a note with fixed timbre. The second example involves +formant motion (a form of timbre glide) in a monotone voice (i.e. +for a relatively fixed pitch). +.LP +.SS "A pitch glide in the auditory image " +.PP +Up to this point, we have focussed on the way that TQTI can +convert a fast flowing NAP pattern into a stabilised auditory +image. The mechanism is not, however, limited to continuous or +stationary sounds. The data file cegc contains pulse trains that +produce pitches near the musical notes C3, E3, G3, and C4, along +with glides from one note to the next. The notes are relatively +long and the pitch glides are relatively slow. 
As a result, each +note forms a stabilised auditory image and there is smooth motion +from one note image to the next. The stimulus file cegc is +intended to support several examples including ones involving the +spiral representation of the auditory image and its relationship +to musical consonance in the next chapter. For brevity, the +current example is limited to the transition from C to E near the +start of the file. The pitch of musical notes is determined by +the lower harmonics when they are present and so the command for +the example is: +.LP +gensai mag=16 min=100 max=2000 start=100 length=600 +duration_sai=32 cegc +.LP +In point of fact, the pulse train associated with the first note +has a period of 8 ms like pt_8ms and so this "C" is actually a +little below the musical note C3. Since the initial C is the +same as pt_8ms, the onset of the first note is the same as in the +previous example; however, four cycles of the pulse train pattern +build up in the window because it has been set to show 32 ms of +'auditory image time'. During the transition, the period of the +stimulus decreases from 32/4 ms down to 32/5 ms, and so the image +stabilises with five cycles in the window. The period of E is +4/5 that of C. +.LP +During the transition, in the lower channels associated with the +first and second harmonic, the individual SAI pulses march from +left to right in time and, at the same time, they move up in +frequency as the energy of these harmonics moves out of lower +filters and into higher filters. In these low channels the +motion is relatively smooth because the SAI pulse has a duration +which is a significant proportion of the period of the sound. As +the pitch rises and the periods get shorter, each new NAP cycle +contributes a NAP pulse which is shifted a little to the right +relative to the corresponding SAI pulse. This increases the +leading edge of the SAI pulse without contributing to the lagging +edge. 
As a result, the leading edge builds, the lagging edge +decays, and the SAI pulse moves to the right. The SAI pulses are +asymmetric during the motion, with the trailing edge shallower +than the leading edge, and the effect is greater towards the left +edge of the image because the discrepancies over four cycles are +larger than the discrepancies over one cycle. The effects are +larger for the second harmonic than for the first harmonic +because the width of the pulses of the second harmonic is a +smaller proportion of the period. During the pitch glide the SAI +pulses have a reduced peak height because the activity is +distributed over more channels and over longer durations. +.LP +The SAI pulses associated with the higher harmonics are +relatively narrow with regard to the changes in period during the +pitch glide. As a result there is more blurring of the image +during the glide in the higher channels. Towards the right-hand +edge, for the column that shows correlations over one cycle, the +blurring is minimal. Towards the left-hand edge the details of +the pattern are blurred and we see mainly activity moving in +vertical bands from left to right. When the glide terminates the +fine structure reforms from right to left across the image and +the stationary image for the note E appears. +.LP +The details of the motion are more readily observed when the +image is played in slow motion. If the disc space is available +(about 1.3 Mbytes), it is useful to generate a cegc.img file +using the image option. The auditory image can then be played +in slow motion using the review command and the slow-down option +"-". +.LP +.LP +.SS "Formant motion in the auditory image " +.PP +The vowels of speech are quasi-periodic sounds and the period for +the average male speaker is on the order of 8 ms. As the +articulators change the shape of the vocal tract during speech, +formants appear in the auditory image and move about. 
The +position and motion of the formants represent the speech +information conveyed by the voiced parts of speech. When the +speaker uses a monotone voice, the pitch remains relatively +steady and the motion of the formants is essentially in the +vertical dimension. An example of monotone voiced speech is +provided in the file leo, which is the acoustic waveform of the +word 'leo'. The auditory image of leo can be produced using the +command +.LP +gensai mag=12 segment=40 duration_sai=20 leo +.LP +The dominant impression on first observing the auditory image of +leo is the motion in the formation of the "e" sound, the +transition from "e" to "o", and the formation of the "o" sound. +.LP +The vocal cords come on at the start of the "l" sound but the +tip of the tongue is pressed against the roof of the mouth just +behind the teeth and so it restricts the air flow and the start +of the "l" does not contain much energy. As a result, in the +auditory image, the presence of the "l" is primarily observed in +the transition from the "l" to the "e". That is, as the three +formants in the auditory image of the "e" come on and grow +stronger, the second formant glides into its "e" position from +below, indicating that the second formant was recently at a lower +frequency for the previous sound. +.LP +In the "e", the first formant is low, centred on the third +harmonic at the bottom of the auditory image. The second formant +is high, up near the third formant. The lower portion of the +fourth formant shows along the upper edge of the image. +Recognition systems that ignore temporal fine structure often +have difficulty determining whether a high frequency +concentration of energy is a single broad formant or a pair of +narrower formants close together. This makes it more difficult +to distinguish "e". 
In the auditory image, information about the +pulsing of the vocal cords is maintained and the temporal +fluctuation of the formant shapes makes it easier to distinguish +that there are two overlapping formants rather than a single +large formant. +.LP +As the "e" changes into the "o", the second formant moves back +down onto the eighth harmonic and the first formant moves up to +a position between the third and fourth harmonics. The third and +fourth formants remain relatively fixed in frequency but they +become softer as the "o" takes over. During the transition, the +second formant becomes fuzzy and moves down a set of vertical +ridges at multiples of the period. +.LP +.LP +.SS "The vowel triangle: aiua " +.PP +In speech research, the vowels are specified by the centre +frequencies of their formants. The first two formants carry the +most information and it is common to see sets of vowels +represented on a graph whose axes are the centre frequencies of +the first and second formant. Not all combinations of these +formant frequencies occur in speech; rather, the vowels occupy a +triangular region within this vowel space and the points of the +triangle are represented by /a/ as in paw, /i/ as in beet, and /u/ as +in toot. The file aiua contains a synthetic speech wave that +provides a tour around the vowel triangle from /a/ to /i/ to /u/ +and back to /a/, and there are smooth transitions from one vowel +to the next. The auditory image of aiua can be generated using +the command +.LP +gensai mag=12 segment=40 duration=20 aiua +.LP +The initial vowel /a/ has a high first formant centred on the +fifth harmonic and a low second formant centred between the +seventh and eighth harmonics (for these low formants the harmonic +number can be determined by counting the number of SAI peaks in +one period of the image). The third formant is at the top of the +image and it is reasonably strong, although relatively short in +duration. 
As the sound changes from /a/ to /i/, the first formant +moves successively down through the low harmonics and comes to +rest on the second harmonic. At the same time the second formant +moves all the way up to a position adjacent to the third formant, +similar to the "e" in leo. All three of the formants are +relatively strong. During the transition from the /i/ to the +/u/, the third formant becomes much weaker. The second formant +moves down onto the seventh harmonic and it remains relatively +weak. The first formant remains centred on the second harmonic +and it is relatively strong. Finally, the formants return to +their /a/ positions. +.LP +.LP +.SS "Speaker separation in the auditory image " +.PP +One of the more intriguing aspects of speech recognition is our +ability to hear out one voice in the presence of competing voices +-- the proverbial cocktail party phenomenon. It is assumed that +we use pitch differences to help separate the voices. In support +of this view, several researchers have presented listeners with +pairs of vowels and shown that they can discriminate the vowels +better when they have different pitches (Summerfield & Assmann, +1989). The final example involves a double vowel stimulus, /a/ +with /i/, and it shows that stable images of the dominant +formants of both vowels appear in the image. The file dblvow +(double vowel) contains seven double-vowel pulses. The amplitude +of the /a/ is fixed at a moderate level; the amplitude of the +/i/ begins at a level 12 dB greater than that of the /a/ and it +decreases 4 dB with each successive pulse, and so they are equal +in level in the fourth pulse. Each pulse is 200 ms in duration +with 20 ms rise and fall times that are included within the 200 +ms. There are 80 ms silent gaps between pulses and a gap of 80 +ms at the start of the file. 
The auditory image can be generated +with the command +.LP +gensai mag=12 samplerate=10000 segment=40 duration=20 dblvow +.LP +The pitches of the /a/ and the /i/ are 100 and 125 Hz, respectively. +The image reveals a strong first formant centred on the second +harmonic of 125 Hz (8 ms), and strong third and fourth formants +with a period of 8 ms (125 Hz). These are the formants of the +/i/, which is the stronger of the two vowels at this point. In +between the first and second formants of the /i/ are the first +and second formants of the /a/ at a somewhat lower level. The +formants of the /a/ show their proper period, 10 ms. The +triggering mechanism can stabilise the formants of both vowels +at their proper periods because the triggering is done on a +channel by channel basis. The upper formants of the /a/ fall in +the same channels as the upper formants of the /i/ and since they +are much weaker, they are suppressed by the /i/ formants. +.LP +As the example proceeds, the formants of the /i/ become +progressively weaker. In the image of the fifth burst of the +double vowel we see evidence of both the upper formants of the +/i/ and the upper formants of the /a/ in the same channel. +Finally, in the last burst the first formant of the /i/ has +disappeared from the lowest channels entirely. There is still +some evidence of /i/ in the region of the upper formants but it +is the formants of the /a/ that now dominate in the high frequency +region. +.LP +.SH SEE ALSO +.LP +.SH COPYRIGHT +.LP +Copyright (c) Applied Psychology Unit, Medical Research Council, 1995 +.LP +Permission to use, copy, modify, and distribute this software without fee +is hereby granted for research purposes, provided that this copyright +notice appears in all copies and in all supporting documentation, and that +the software is not redistributed for any fee (except for a nominal +shipping charge). 
Anyone wanting to incorporate all or part of this +software in a commercial product must obtain a license from the Medical +Research Council. +.LP +The MRC makes no representations about the suitability of this +software for any purpose. It is provided "as is" without express or +implied warranty. +.LP +THE MRC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING +ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL +THE A.P.U. BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES +OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, +WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, +ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS +SOFTWARE. +.LP +.SH ACKNOWLEDGEMENTS +.LP +The AIM software was developed for Unix workstations by John +Holdsworth and Mike Allerhand of the MRC APU, under the direction of +Roy Patterson. The physiological version of AIM was developed by +Christian Giguere. The options handler is by Paul Manson. The revised +SAI module is by Jay Datta. Michael Akeroyd extended the PostScript +facilities and developed the xreview routine for auditory image +cartoons. +.LP +The project was supported by the MRC and grants from the U.K. Defence +Research Agency, Farnborough (Research Contract 2239); the EEC Esprit +BR Programme, Project ACTS (3207); and the U.K. Hearing Research Trust.