.TH GENSPL 1 "8 September 1993"
.LP
.SH NAME
.LP
genspl \- spiral auditory image of a pulse train
.LP
.SH SYNOPSIS/SYNTAX
.LP
genspl [ option=value | -option ] filename
.LP
.SH DESCRIPTION
.LP
Since the spiral auditory image is just a different view of the
auditory image, it includes all of the flags associated
previously with the gensai command. In the ASP software, the
spiral auditory image is presented in cartoon form, similar to
the presentation of the linear auditory image. The spiral view
of the auditory image is a global view of the sound that
emphasises pitch and de-emphasises timbre. It is a distant
perspective taken in order to view the longer term correlations
that arise in periodic sounds. It is difficult to represent the
functions of the SAI visually in a spiral form; the fine detail
of the functions would be lost in the spiral perspective.
Accordingly, in the spiral perspective each of the separate SAI
pulses is replaced by a dot positioned at the time of the peak
of the pulse. Previously, this representation was referred to
as a pulse ribbon (Patterson, 1987a).
.LP
Conceptually, the spiral auditory image is a set of concentric
spirals, one for each channel of the auditory image. The highest
frequency channel is on the inside with the smallest radius; the
lowest frequency channel is on the outside with the largest
radius. The spiral lines are omitted for clarity, leaving just
the dots. The presence of bars shows that the same period exists
in a range of filter channels.
Note, however, that this
information about correlation across channels appears on the same
spoke as the information indicating that the pattern repeats on
the auditory image in time. Thus the multi-channel spiral maps
both spectral and temporal information concerning the
periodicity of the sound onto a single spatial vector -- a spoke
of the spiral. It is this property that enables the spiral
representation to explain octave perception (Patterson, 1990).
.LP
.SS "A pitch glide in the spiral auditory image "
.PP
The spiral auditory image, like its linear counterpart, is not
limited to periodic sounds. When the pitch of a sound glides
smoothly from one note to another the pattern on the spiral
auditory image rotates smoothly from one position to another, and
when the pitch changes abruptly from one note to another, the
spiral pattern dissolves at the end of the first note and forms
again in a different orientation at the start of the second note.
.LP
The spiral spokes grow from the centre outwards as the
correlation across cycles grows. For the note C3, four spokes
form: the vertical spoke contains information about periods
separated by 1, 2, 4 and 8 cycles; the spoke at 25 minutes past
the hour contains information about periods separated by 3 and 6
cycles; the remaining two spokes at 40 and 10 minutes past the
hour contain information about periods separated by 5 and 7
cycles, respectively.
.LP
As the pitch of the note changes from C3 to E3, the pattern
rotates 20 minutes, and the spoke that was previously at 40
minutes moves into the vertical position.
Then, as the pitch
glides from E3 to G3, the spoke which was at 25 minutes in C3
moves into the vertical position. As the pitch glides on up from
G3 to C4 the longest spoke of the pattern returns to the vertical
position completing one revolution as the pitch rises an octave.
Note, however, that each of the spokes has been extended by one
circuit towards the centre of the spiral. Thus, in the ASP model,
octaves are perceived to be similar because they produce spoke
patterns with the same orientation on the spiral auditory image
and the notes of the major triad are those with a spoke that
coincides with the main spoke of the tonic. A theory of musical
consonance based on the coincidence of spokes in spiral auditory
images is presented in Patterson (1986).
.LP
.LP
.SH OPTIONS
.LP
.SS "Display options for the spiral auditory image "
.PP
The options that control the position of the spiral image window
on the screen are the same as for all previous windows.
Furthermore, since the spiral auditory image is a cartoon just
like the linear auditory image, it may be generated, stored,
animated, and reviewed in the same way as the linear auditory
image. In addition, there are six new display options for the
spiral view of the auditory image.
.LP
.TP 11
spiral
Switch to spiral auditory image
.RS
Switch: Default, off.
.RE
.RS
When spiral is set to "on" the time dimension of the auditory image is plotted as a spiral and the SAI function is replaced with dots positioned at the peaks of the pulses in the SAI function.
.RE
.TP 13
form_spl
The form of the spiral time line
.RS
Switch: Default, archimedian.
.RE
.RS
The software offers two visual representations of the underlying logarithmic spiral, both of which have the base 2.
.RE
Both representations gather doublings in time onto a specific
spoke of the spiral, and so both have the general property that
.LP
q = log2(t/T)     (6.1)
.LP
q is the angle between the horizontal axis and the radius drawn
to a point on the spiral. T is the sampling period (the
reciprocal of the sampling rate) and
t is "auditory image time", both in seconds. Every time t doubles
q increases by 1, and so the integer part of q (the characteristic
of the logarithm) specifies the circuit of the spiral. The
fractional part of the logarithm (the mantissa) specifies the
angle within the circuit, and in this case, the angle is measured
in revolutions, or circuits.
.LP
The archimedian spiral is like a coil of rope; that is, the radius
increases by the thickness of the rope on each successive
circuit. The form of the archimedian spiral is
.LP
r = aq = a log2(t/T)     (6.2)
.LP
where r is the radius from the centre of the spiral to a point
on the spiral.
The logarithmic spiral has the form
.LP
r = 2^q = 2^(log2(t/T)) = t/T     (6.3)
.LP
The logarithmic version of the spiral has the advantage that
image time is linear along the path of the spiral. However, it
has the disadvantage that it expands rapidly, and so the current
default is archimedian.
.LP
.LP
.TP 16
dotsize_spl
The size of the dots on the spiral
.RS
Default units, pixels: Default value, 2 pixels.
.RE
.RS
The dots plotted on the spiral are actually small squares and the value dotsize_spl determines the number of pixels along the side of the square.
.RE
.TP 13
axis_spl
Spiral axis, or time line
.RS
Switch: Default, off
.RE
.RS
When the axis_spl switch is set to "on", a spiral axis, or time line, is plotted. It is presented on the outside of the circuit, one channel below the lowest filter channel, just as in the linear image. The default value for axis_spl is "off" because the spiral axis contains a large number of points and it is slow to calculate and plot.
.RE
Note: The length of spiral displayed in the window is determined
by duration_sai. This is the same duration_sai as for the linear
image. The size of the spiral display is scaled so that the
radius associated with duration_sai fits inside the rectangle
specified for the window. The spiral does not have to be
presented in a square window and in some instances rectangular
windows are quite effective for giving a sense of depth.
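.LP
The two spiral forms in equations 6.1 to 6.3 can be sketched in a few lines of Python. This is only an illustration of the formulas, not the genspl plotting code: the function name spiral_point, the scale factor argument a, and the label "logarithmic" for the second form are assumptions made for the example.
.nf
```python
import math


def spiral_point(t, T, form="archimedian", a=1.0):
    """Map image time t (seconds) to a point on the base-2 spiral.

    T is the sampling period in seconds. The names used here are
    hypothetical and are not taken from the genspl source.
    Returns (q, circuit, angle_rev, r, x, y).
    """
    q = math.log2(t / T)           # equation 6.1
    circuit = math.floor(q)        # integer part: which circuit
    angle_rev = q - circuit        # fractional part: angle in revolutions
    if form == "archimedian":
        r = a * q                  # equation 6.2
    elif form == "logarithmic":
        r = 2.0 ** q               # equation 6.3, which equals t / T
    else:
        raise ValueError("form must be 'archimedian' or 'logarithmic'")
    theta = 2.0 * math.pi * angle_rev
    return q, circuit, angle_rev, r, r * math.cos(theta), r * math.sin(theta)
```
.fi
.LP
For example, spiral_point(8.0, 1.0, "logarithmic") lies on circuit 3 at angle 0 with radius 8, so an image time of eight sampling periods falls on the horizontal spoke.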
.LP
.TP 13
zero_spl
Spiral start point and spiral orientation
.RS
Default units: revolutions. Default value 4.072 revolutions.
.RE
.RS
This parameter determines the minimum "auditory image time" that appears on the spiral, and thus it determines the zero point on the spiral.
.RE
The parameter zero_spl has two primary uses: Firstly, it enables
the user to determine the orientation of the main spoke of the
spiral for a given combination of sampling rate and stimulus
period. Without the parameter zero_spl, the orientation of the
spiral would be fixed by the sampling rate and period of the
sound. Periods that are an exact power-of-2 times the base
period, T, would appear on the spoke proceeding horizontally
from the centre of the spiral towards the right. By removing a
portion of a circuit the orientation of the spiral can be set to
suit the user. A reduction in zero_spl of 0.25 will rotate the
main spoke from horizontal to vertical.
.LP
The second purpose of zero_spl is to enable the user to adjust
the image to the period being displayed; that is, to focus on the
octave of the current sound. For example, when the sound has a
long period, like 8 ms, the activity produced by the sound falls
in the outer circuits of the spiral. If zero_spl is set to a
small value (<2) the centre of the display will be largely blank.
The short circuits associated with higher octaves can be removed
by setting zero_spl to a larger value, say 4, in which case a
sound with an 8 ms period will fill the display.
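.LP
The effect of zero_spl on spoke orientation can be checked numerically. The sketch below is hedged: it assumes that zero_spl is simply subtracted from q of equation 6.1, that the angle runs counterclockwise from the horizontal axis, and that clock positions are read clockwise from the vertical. The function name spoke_clock_minutes is hypothetical and is not part of the ASP software.
.nf
```python
import math


def spoke_clock_minutes(image_time_s, samplerate_hz, zero_spl=4.072):
    """Clock-face position, in minutes past the hour, of an image time.

    ASSUMPTION: the plotted angle is the fractional part of
    (q - zero_spl), measured counterclockwise from the horizontal axis.
    """
    q = math.log2(image_time_s * samplerate_hz)  # equation 6.1, T = 1/samplerate
    angle_rev = (q - zero_spl) % 1.0             # revolutions, counterclockwise
    # Vertical is a quarter turn from horizontal; clock minutes run clockwise.
    return ((0.25 - angle_rev) % 1.0) * 60.0


if __name__ == "__main__":
    for cycles in (1, 2, 3, 4, 5, 6, 7, 8):
        minutes = spoke_clock_minutes(cycles * 0.008, 20000.0)
        print(cycles, round(minutes, 1))
```
.fi
.LP
Under these assumptions, an image time of 8 ms at a 20 kHz sampling rate falls on the vertical spoke, and image times of 3, 5 and 7 times 8 ms fall at roughly 25, 41 and 12 minutes past the hour, close to the 25, 40 and 10 minutes quoted for C3 in the DESCRIPTION section.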
.LP
The one parameter zero_spl can be used to both scale and rotate
the spiral simultaneously; integer changes in the parameter cause
a scaling without rotation. The default value, 4.072, assigns a
vertical spoke to a period of 8 ms (and its base-2 relatives)
when the sampling rate is 20 kHz (or a base-2 relative).
.LP
.TP 18
dotthresh_spl
Threshold value for the production of a spiral dot
.RS
Unit: SAI strength. Default value, 50 SAI units.
.RE
.RS
This threshold specifies the value that a pulse in the SAI must reach or exceed in order for it to be presented as a dot in the spiral image.
.RE
.LP
.SH EXAMPLES
.LP
In order to understand the spiral mapping, look at the auditory
image of C3 and imagine the pulse ribbon that would be formed by
replacing each SAI pulse with a dot and extending the duration
of the image to 70 ms so that it will accommodate eight cycles
of the note. The spiral view is produced by compressing the pulse
ribbon vertically, stretching it horizontally, and then wrapping
it counterclockwise into a spiral, with the right-hand edge at
the centre of the spiral and the left-hand edge at the end of the
outer circuit. The dots from vertical columns of pulses in the
linear auditory image merge into short bars in the spiral view
because of the vertical compression; the bars fall along spokes
radiating from the centre of the spiral.
The dots from the arches
of pulses on either side of the vertical column in the linear
auditory image appear in a stretched form like "wings" in the
spiral auditory image.
.LP
In the case of C3 four of the bars are aligned on one spoke of
the spiral (the vertical spoke); they represent the strong
correlations that occur in the auditory image for cycles of the
original sound separated by 1, 2, 4, and 8 cycles. In this way,
much of the information that is distributed across the temporal
dimension of the linear auditory image is gathered together into
a single spatial vector.
.LP
.LP
The wave cegc provides an example of how the spiral auditory
image follows pitch glides from one note to another. One
reasonable version of the spiral pitch glide is provided by the
command
.LP
.LP
genspl width=600 height=550 duration_sai=70 zero_spl=5.072 cegc
.LP
.LP
.SS "The separation of pitch and timbre in the auditory image. "
.PP
The file vowgld contains a synthetic speech waveform that
combines both formant motion and pitch motion; the formant motion
is a rapid tour around the vowel triangle as in aiua, and the
pitch motion is C3, E3, G3 and C4. A linear auditory image of
vowgld can be generated with the command
.LP
.LP
gensai width=420 height=420 mag=12 segment=40 duration_sai=20
spiral=off vowgld
.LP
.LP
The motion in the linear auditory image is similar to that
observed with aiua in Chapter 5.
That is, the formants move
vertically as the vowels change from one to the next. In this
example, however, there is pitch motion and the period decreases
by a factor of 2 as the example proceeds. The pitch change is
observed primarily as horizontal motion that is largely
independent of the formant motion. In point of fact, the
resolved harmonics in the lower half of the auditory image are
rising in frequency as the example proceeds but this does not
seem to interfere with the perception of either the vertical
motion of the formants or the horizontal shrinking of the
period.
.LP
Although the rise in pitch can be observed in the linear auditory
image it is not the dominant perception; rather, it is the
formant motion that dominates in this microscopic view of the
auditory image. A spiral auditory image of vowgld can be
generated with the command
.LP
.LP
gensai width=420 height=420 segment=40 duration_sai=70 spiral=on
zero_spl=5.072 vowgld
.LP
.LP
The motion in the spiral auditory image is dominated by the
rotation of the spokes, that is, the pitch motion. The motion
of the formants is represented in the spiral image in the sense
that there is more sparkle in the information that is not on the
main spoke pattern. This sparkle is caused by the formant energy
changing channels as the formants move from channel to channel
within one circuit of the spiral. But the fact that the motion
in successive circuits is coordinated is not apparent in this
macro view of the auditory image.
.LP
A more dramatic example of the enhancement of pitch and the
repression of timbre can be produced by generating a spiral
auditory image for aiua in which the pitch is fixed and the
vowels range around the vowel triangle. The formant information
on the spokes changes as the vowel tour proceeds but the position
of the spokes remains fixed. The vowel information on the spokes
rushes around in three discrete transitions but there is no
particular pattern to the motion.
.LP
Thus, in the ASP model, pitch and timbre are just two views of
the same auditory image; pitch effects are observed when we stand
back and take a macroscopic view of the auditory image; timbre
details are observed when we move in close and take a microscopic
view of the auditory image.
.LP
.LP
The review program has the capacity to present two auditory
images simultaneously. If linear and spiral auditory images of
vowgld are generated and stored using image=on, they can be
replayed simultaneously and compared using the command
.LP
review vowgld_l vowgld_s
.LP
Caution: this requires the user to produce separate image files
(vowgld_l.img, vowgld_s.img) either by producing the images from
copies of vowgld with different names, or by renaming the
auditory images as they are produced. If two different auditory
images are produced from the same file, the second will overwrite
the first even though one has a linear format and one a spiral
format.
.LP
.SS "Multiple pitches in the spiral auditory image "
.PP
It is generally assumed that when two people are speaking at the
same time, the listener uses the differences in the pitches of
the two voices to assist in separating the speakers. The final
example in this chapter shows that the pitches of the /a/ and the
/o/ in dblvow appear separately in the spiral auditory image, and
that it would be reasonable to use the spiral to separate the
channels associated with the two vowels and thereby assist
speaker tracking. The spiral auditory image can be generated by
the command
.LP
.LP
gensai width=600 height=550 samplerate=10000 spiral=on
duration=90 dblvow
.LP
.LP
The main spokes of the /a/ and the /i/ appear at angles of 40 and
0 minutes past the hour, respectively, corresponding to periods
of 10 and 8 ms. Over the course of the example, the main spoke
of the /i/ fades considerably while the main spoke of the /a/
increases somewhat.
.LP
The second spokes of the /a/ and /i/ patterns appear at 5 and 25
minutes, respectively, and their strength changes predictably
as the example proceeds. If either vowel were presented on its
own there would be more than two spokes in the pattern of each
vowel. The presence of the second vowel represses spokes beyond
the second in the patterns of both vowels.
.LP
.LP
.SH BUGS
.LP
Note: the current version of the software (release 3, June 1990)
incorrectly adds linear axes to hardcopy figures. Apologies.
.LP
.SH COPYRIGHT
.LP
Copyright (c) Applied Psychology Unit, Medical Research Council, 1995
.LP
Permission to use, copy, modify, and distribute this software without fee
is hereby granted for research purposes, provided that this copyright
notice appears in all copies and in all supporting documentation, and that
the software is not redistributed for any fee (except for a nominal
shipping charge). Anyone wanting to incorporate all or part of this
software in a commercial product must obtain a license from the Medical
Research Council.
.LP
The MRC makes no representations about the suitability of this
software for any purpose. It is provided "as is" without express or
implied warranty.
.LP
THE MRC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL
THE A.P.U. BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES
OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
SOFTWARE.
.LP
.SH ACKNOWLEDGEMENTS
.LP
The AIM software was developed for Unix workstations by John
Holdsworth and Mike Allerhand of the MRC APU, under the direction of
Roy Patterson. The physiological version of AIM was developed by
Christian Giguere. The options handler is by Paul Manson. The revised
SAI module is by Jay Datta.
Michael Akeroyd extended the postscript
facilities and developed the xreview routine for auditory image
cartoons.
.LP
The project was supported by the MRC and grants from the U.K. Defense
Research Agency, Farnborough (Research Contract 2239); the EEC Esprit
BR Programme, Project ACTS (3207); and the U.K. Hearing Research Trust.