diff man/man1/genwav.1 @ 0:5242703e91d3 tip
Initial checkin for AIM92 aimR8.2 (last updated May 1997).
author | tomwalters |
---|---|
date | Fri, 20 May 2011 15:19:45 +0100 |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/man/man1/genwav.1 Fri May 20 15:19:45 2011 +0100 @@ -0,0 +1,594 @@ +.TH GENWAV 1 "11 May 1995" +.LP +.SH NAME +.LP +genwav \- display the wave in filename. +.LP +.SH SYNOPSIS +.LP +genwav [ option=value | -option ] [ filename ] +.LP +.SH DESCRIPTION +.LP + +Genwav sets up an X window and displays a segment of the input wave +in the window. The size of the window and the size of the wave are +determined by options, as are a number of other input/output +functions. These options have no direct bearing on the auditory +processing performed by AIM. For convenience, these Non-Auditory +options are associated with the instruction genwav (the one +non-auditory instruction), and they are listed at the top of the +options tables prior to the auditory options. + +.LP +There are three classes of Non-Auditory options: +.LP +I) DISPLAY OPTIONS that determine the format of the auditory representations +of sound on the screen, or on paper when printed. +.LP +II) OUTPUT OPTIONS that determine the format and content of files used +to store the auditory representations of sounds. +.LP +III) INPUT OPTIONS that determine how the wave in the input file should +be interpreted. +.LP +The output options are presented before the input options so that the +input options will be adjacent to the filterbank options in the +options tables produced by genbmm and subsequent instructions. + +.SS +I. DISPLAY OPTIONS +.LP + +The AIM modules produce output in the form of a set of functions, one +for each channel of the auditory filterbank. For example, the output +of genbmm is a set of functions that simulate basilar membrane motion +produced in response to the input wave. By default, the AIM software +puts an X window up on the computer screen and displays the output in +the window. This section describes the options that control these +displays. 
+ +.LP +The display options are: title, display, x0_win, y0_win, width_win, +height_win, view, top, bottom, overlap, headroom, +magnification, pensize, hiddenline. +.LP +A. The Display Window Title, Position, and Size +.RS 3 + +.LP +title Title of output display. +.RS 5 + Character string. Default: input file name. +.RE +.LP +The title of the output being displayed. If no title is given, the +display bears the name of the file of the input wave. + +.LP +display Display output on screen +.RS 5 + Switch. Default: on. +.RE +.LP + +Normally this switch is on and a bitmap of the output is displayed in +a graphical window on the computer screen. The switch is provided +because the time taken to create the displays is considerable, and it +is useful to turn the display off when using AIM as a preprocessor for +speech recognition. + +.LP +x0_win Left edge of window +.RS 5 + Unit: pixels. Default: centre. +.RE +.LP +The left edge of the window into which the display will be drawn, +relative to the left edge of the screen (i.e. the x-coordinate of the +window within the screen). A value of centre will cause centring in +the horizontal dimension (provided the window manager does not +override). +.LP +y0_win Lower edge of window +.RS 5 + Unit: pixels. Default: centre. +.RE +.LP +The lower edge of the window into which the display will be drawn, +relative to the lower edge of the screen (i.e. the y-coordinate of the +window within the screen). A value of centre will cause centring in +the vertical dimension (provided the window manager does not +override). +.LP +Taken as a pair, x0_win and y0_win determine the origin of the window, +relative to the screen origin which is assumed to be the lower left +corner of the screen. +.LP +width_win Window width +.RS 5 + Unit: pixels. Default: 640. +.RE +.LP +The width of the window into which the display will be drawn. +.LP +height_win Window height +.RS 6 + Unit: pixels. Default: 480. 
+.RE +.LP +The height of the window into which the display will be drawn. +.RE + + +.LP +B. Display Controls +.RS 3 +.LP +top The largest positive value visible in the display +.LP + Scalar. Default value: 1024 (for genwav) +.LP +Each of the functions in the multi-channel output of a module is +displayed in a transparent window. Provided the channel density is not +too low, the functions are related and the set of functions produces a +display that looks like a complex landscape. Top determines the +largest positive value that will appear in the transparent windows of +the individual functions, so top must be as large as the largest value +in the full set of functions. Increasing top has the effect of moving +the viewer farther up above the landscape. +.LP +bottom The largest negative value visible in the +.RS 5 + display +.RE +.RS 5 + Scalar. Default value: -1024 (for genwav) +.RE +.LP +Bottom determines the largest negative value that will appear in the +transparent windows of the individual functions, so bottom must be as +large in the negative direction as the largest negative value in the +full set of functions. Increasing bottom in the negative direction has +the effect of deepening the valleys in the landscape. +.LP +overlap The overlap of transparent windows of the +.RS 5 + individual functions +.RE +.RS 5 + Scalar: percentage. Default value: 50% +.RE +.LP +The fact that the output functions are related means that they +fit up under each other in the display in a way that concentrates the +lines on the landscape and improves the display. +.LP +headroom Display with headroom for the uppermost channel +.RS 5 + Scalar: percentage. Default value: 0% +.RE +.LP +Because of the overlap of the transparent windows, part of the +uppermost transparent window is hidden by the upper edge of the +display window. This can cause truncation of the waves in the upper +channels. 
To avoid truncation, headroom enables the user to specify +that the highest channel ought to be centred below the upper edge of +the window. The value specified is taken to be the percentage of the +window between the zero line of the upper channel and the upper edge +of the window. +.LP +magnification Display magnification +.RS 9 + Scalar. Default: 1.0. +.RE +.LP +The degree to which the amplitude of the functions in the display +should be magnified before being displayed. This parameter is merely +for adjusting the visual contrast of the display. The magnification +option is a multiplier, so a value of 1 implies drawing to scale, +while a value of 10 implies ten times (10x) the size of values in the +module output and 0.1 implies one tenth of the output size. +Magnification is related to, but separate from, the gain options which +affect the values of the output functions and the values stored in any +output files. Magnification is an alternative means of controlling the +size of the functions in the display -- alternative to top and bottom. +.LP +pensize The size of the lines in the displays and the +.RS 5 + dots on the spiral +.RE +.RS 5 + Unit: pixels. Default: 1. +.RE +.LP +This option allows the user to specify the thickness of the lines in +the display and the size of the dots on spiral auditory images. It +also affects the lines and dots in postscript plots. It is provided +primarily for use with printers which have much more resolution than +computer screens. On laser printers a value of 3-5 gives reasonable +line thickness. On the screen, a line width greater than 1 produces +slow drawing, and a jagged, blurred display. +.LP +hiddenline Draw with overlapping parts of functions +.RS 5 + hidden +.RE +.RS 5 + Switch. Default: on. +.RE +.LP +This switch specifies whether or not a 'hidden line' algorithm should +be used when drawing the display. It also affects printed displays. 
+In almost all cases, hiddenline results in more attractive displays of +waveforms, and it often makes complex displays easier to understand, +so the default is 'on'. Note: hiddenline almost doubles the drawing +time so it is sometimes useful to switch it off on slower machines. +.LP + +.SS +II. OUTPUT OPTIONS +.RS 3 +.LP +The output options are listed and described before the input options +so that the input options will be adjacent to the filterbank options +in the listings produced by genbmm and subsequent modules. The output +options are downchannel, erase_ctn, animate_ctn, bitmap_ctn, +postscript, output, and header. +.LP +downchannel Average adjacent channels of multichannel +.RS 7 + representations +.RE +.RS 7 + Units: Number of averagings. +.RE +.RS 7 + Default value: 0. +.RE +.LP + +There is interaction between channels in the transmission-line +filterbank of the physiological version of AIM, and in the neural +encoding of the functional version of AIM. The minimum channel +density for these processes to operate properly is four channels per +ERB and 2 channels per ERB, respectively. For broadband signals like +speech this means that the minimum number of channels is on the order +of 128 and 64, respectively. This channel density can produce +cluttered displays, and more importantly, it is far too many channels +for current speech recognition systems which typically use 12-24 +channels. This is not just a computer power problem; the recognition +systems actually perform less well with extra channels. Accordingly, +the option 'downchannel' provides the option of reducing the channel +density at output, so that AIM can operate with the appropriate +channel density and still provide output that is compatible with +displays and speech recognition systems. + +.LP +Downchannel averages pairs of adjacent channels and the option value +specifies how many times it should execute the averaging process. 
Each +averaging reduces the number of channels by a factor of 2, so for +proper transmission-line filtering and an output file with 16 +channels, set channels_afb=128 and downchannel=3 (three successive +halvings of the number of channels). + + +.LP +A. Animated Cartoons +.LP +.RS 3 +Four of the AIM instructions produce output in the form of sequences +of spectral frames (gensgm, gencgm, genasa and genepn). Bitmap +versions of the displays of the frames can be stored by AIM and +replayed by review and xreview. When the sequence of frames is played +rapidly, it appears as an animated cartoon that shows the dynamic +behaviour of the spectrum of the sound. +.LP +Similarly, the AIM instructions for auditory images (gensai and +genspl) produce sequences of landscape frames, and bitmap versions of +the landscape displays can also be stored by AIM and replayed by +review and xreview. Indeed, it was the desire to produce auditory +image cartoons that led to the development of much of the AIM software +package. The animated cartoons of auditory images show the dynamic +behaviour of features in the images, like the motion of formants in +diphthongs and the motion of notes in a melody. +.LP +This section describes the options that control the construction and +storage of sequences of bitmaps; there is a separate manual entry for +the xreview routine that replays the bitmaps ( 'manaim xreview'). + + +.LP +erase_ctn Erase the current frame before presenting +.RS 7 + the next frame +.RE +.RS 7 + Switch. Default value: on. +.RE +.LP + +Normally, when presenting a sequence of frames as an animated cartoon, +one wants to erase the current frame before presenting the next. When +the frames are spectra, however, the set of frames can together form a +meaningful display; for example, the set of rising spectra produced at +the onset of a sound produces a contour map of the onset. The option +erase_ctn enables the user to observe the full set of spectra +simultaneously. 
(See aimdemo_gtf_spectra or aimdemo_tlf_spectra ). + +.LP +animate_ctn Store frames in memory and replay all of +.RS 7 + them as a cartoon +.RE +.RS 7 + Switch. Default value: off. +.RE +.LP +When this option is on, AIM stores the bitmaps of the frames it +produces in the memory of the machine and replays them rapidly when +the instruction is complete. Type RETURN to animate the cartoon again; +type 'q RETURN' to exit the instruction. (This option was important +when machines were slower and before the availability of review and +xreview. It is now largely obsolete.) +.LP +bitmap_ctn Store bitmaps of frames in a file for +.RS 7 + replay as a cartoon +.RE +.RS 7 + Switch. Default value: off. +.RE +.LP +When this option is on, bitmaps of the frames produced for the input +in file_name will be stored in file_name.ctn. The sequence of frames can later be replayed using either +.LP +> review file_name or +.LP +> xreview file_name +.LP +Both of these programs enable the user to vary the rate of animation, +the section of the sequence to be viewed, etc. The xreview version has a +window interface with useful information and is the preferred version +in most cases. +.RE + +.RS 3 +B. Output Files for Printing and Postprocessing + +.LP +postscript Produce printer-ready output +.RS 7 + Switch. Default value: off. +.RE +.LP +This switch causes AIM to produce a printer-ready version of the +displays it presents on the computer screen. For example, the NAP of +a 32-ms section of cegc can be printed using +.LP +> gennap length=32 postscript=on cegc | lpr -Plw +.LP +where 'lpr' is the Unix printer-driver and the 'lw' of -Plw specifies +the destination printer. You may need to check the name of your +system's printer driver and laser printer. +.LP +Alternatively the postscript version of the display may be directed to a +file using an instruction like +.LP +> gennap length=32 postscript=on cegc > cegc_nap.ps +.LP +and printed later at the user's convenience. 
In this example, the file +name cegc_nap.ps is not generated by AIM; the '_nap.ps' suffix is +added by the user following standard conventions to indicate that the file +contains a NAP in postscript form. + +.RS 3 +.LP +THREE POSTSCRIPT CAUTIONS: +.LP +Postscript files of landscape displays from AIM are very large. As a +result, we recommend +.LP +a) that you NOT switch postscript on without redirecting the output to +a file, as it will cause the output to be displayed on the screen in a +seemingly endless display, +.LP +b) that you be careful NOT to print postscript files on a printer +which does not understand the Postscript language, as it can cause the +printer to put out an extremely long file, one column per page! +.LP +c) that you NOT set postscript=on in an options file as it will +generate large files in the directory without your noticing. +.RE + +.LP +output Generate an output file +.RS 3 + Switch. Default value: off. +.RE +.LP +This switch causes the array of functions that defines AIM's +simulation of basilar membrane motion, or a neural activity pattern, +or an auditory image, to be stored in a file for subsequent processing +by the aimtools or other, user defined, operators. By convention, the +file is given the same name as the input file, but with a suffix +reflecting the entry point, to distinguish it from the input file on +the one hand and from other output files on the other hand. The naming +system enables the user to construct and store a set of output files +for one input file without the need to specify a sequence of file +names. The suffixes are those used to identify the modules in the +listing produced by 'gen -help'. So, for example, the following +command line: +.LP +> gennap output=on length=32 cegc +.LP +will produce an output file named cegc.nap containing a multiplexed +version of the functions that define the NAP of the first 32 ms of +cegc. 
+.LP +The spectrographic representations produced by gensgm and gencgm can +be stored in the same way, as can the sequences of spectra produced by +genasa and genepn. It is the output files of genasa and gencgm that +are used to interface AIM with speech recognition systems (Robinson et +al., 1990; Patterson et al., 1995; Giguere and Woodland, 1994a). +Details of the file formats are presented in docs/aimFileFormat. +.LP +header Put a header on the output file +.RS 3 + Flag. Default value: on. +.RE +.LP +By default, a header is prepended to each output file so that +subsequent processors have access to the history of the file. Details +of the header structure are presented in docs/aimFileFormat. +.LP +.RE + +.SS +III. INPUT OPTIONS +.LP +The input options enable the user to process a subsection of the input +wave, and to specify characteristics of the wave. +.LP +The input options are: input_wave, start_wave, length_wave, +samplerate, swap_wave, bits_wave, dB_wave. +.LP +input_wave Default input wave name +.RS 13 +Filename. Default value: none. +.RE +.LP +The name of the wave file to process. This option permits simple +repetitive processing of the same input file without repetitive typing. It +also enables one to circumvent the Unix convention of having the filename +last on the command line. This option is overridden if the user supplies a +wave file name at the end of the command line. +.LP +start_wave Start point in wave +.RS 13 +Default unit: ms. Default value: 0. +.RE +.LP +The point in the input wave at which processing should begin. The +start_wave option is expressed in milliseconds and its default value is the +beginning of the file (i.e. 0 ms into the file). +.LP +length_wave Length of wave +.RS 13 +Default unit: ms. Default value: remainder. +.RE +.LP +The number of milliseconds of the wave that ought to be processed, +beyond the start point. 
The special value 'remainder' indicates that +the entire length of the wave from the start point to the end of the +file should be processed. +.LP +samplerate Input wave sample rate +.RS 13 +Default unit: Hertz. Default value: 20,000 Hz. +.RE +.LP +The rate at which the input wave was sampled. +.LP +swap_wave Swap the bytes in each binary pair of the +.RS 13 +input file +.RE +.RS 13 +Switch. Default: off. +.RE +.LP +The order of the bytes in short integers varies between manufacturers. +Specifically the order for Sun and HP is opposite that for DEC, SGI and +IBM. The default setting (off) is for the latter byte order. +.LP +bits_wave Bits in the input wave +.RS 13 +Unit: bits. Default: 12. (Only alternative: 16.) +.RE +.LP +The number of significant bits in each (16-bit) word of the input +wave. Note that gain_gtf or gain_tlf should be changed to 0.0625 when +the number of bits is set to 16 to avoid overflow. +.LP +dB_wave Scaling of the input wave +.RS 13 +(for physiological route only) +.RE +.RS 13 +Units: dB. Default: 60 dB +.RE +.LP +This option enables the user to specify the relative level of +the input wave in decibels. It is particularly useful for +investigating the level-dependent properties of the +physiological version of AIM. +.LP +The functional route is level-independent and dB_wave is +ignored no matter what its value. +.LP +dB_wave can also be used to scale the input wave in absolute +units, i.e. sound-pressure level (dB SPL), using the following +equation: +.LP +dB_wave = dBSPL - 20*log10(RMS/200) +.LP +where RMS is the root-mean-square amplitude of the input wave, +or the portion of the wave of interest, and dBSPL is the +desired sound-pressure level scaling (in dB). For +example, to scale to 60 dB SPL a wave with an RMS amplitude +of 467.3, dB_wave should be set to 52.6. +.LP +Note: The RMS value of a stored input wave can be calculated using +the tools provided with the AIM software. 
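The dB_wave equation given for absolute scaling can be checked numerically. What follows is a minimal Python sketch and not part of the AIM software: the function names `rms` and `db_wave_for_spl` are hypothetical, the logarithm is assumed to be base 10 (the base that reproduces the 52.6 figure in the worked example), and the sample values are assumed to be already available as a sequence of numbers.

```python
import math

REFERENCE_RMS = 200.0  # reference amplitude taken from the man page equation


def rms(samples):
    """Root-mean-square amplitude of a sequence of sample values."""
    if not samples:
        raise ValueError("need at least one sample")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def db_wave_for_spl(target_db_spl, rms_amplitude):
    """Return dB_wave = dBSPL - 20*log10(RMS/200).

    The base-10 logarithm is an assumption; the man page only writes 'log'.
    """
    if rms_amplitude <= 0:
        raise ValueError("RMS amplitude must be positive")
    return target_db_spl - 20.0 * math.log10(rms_amplitude / REFERENCE_RMS)


if __name__ == "__main__":
    # Worked example from the text: 60 dB SPL, RMS amplitude 467.3
    print(round(db_wave_for_spl(60.0, 467.3), 1))
```

A wave whose RMS amplitude equals the reference value of 200 needs no correction, so dB_wave is then simply the desired dB SPL.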
+ + +.LP +.RE + +.SH FILES +.LP + .genwavrc The options file for genwav. +.SH SEE ALSO +.LP +genbmm +.SH BUGS +.LP +.SH COPYRIGHT +.LP +Copyright (c) Applied Psychology Unit, Medical Research Council, 1995 +.LP +Permission to use, copy, modify, and distribute this software without fee +is hereby granted for research purposes, provided that this copyright +notice appears in all copies and in all supporting documentation, and that +the software is not redistributed for any fee (except for a nominal +shipping charge). Anyone wanting to incorporate all or part of this +software in a commercial product must obtain a license from the Medical +Research Council. +.LP +The MRC makes no representations about the suitability of this +software for any purpose. It is provided "as is" without express or +implied warranty. +.LP +THE MRC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING +ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL +THE A.P.U. BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES +OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, +WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, +ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS +SOFTWARE. +.LP +.SH ACKNOWLEDGEMENTS +.LP +The AIM software was developed for Unix workstations by John +Holdsworth and Mike Allerhand of the MRC APU, under the direction of +Roy Patterson. The physiological version of AIM was developed by +Christian Giguere. The options handler is by Paul Manson. The revised +SAI module is by Jay Datta. Michael Akeroyd extended the postscript +facilities and developed the xreview routine for auditory image +cartoons. +.LP +The project was supported by the MRC and grants from the U.K. Defense +Research Agency, Farnborough (Research Contract 2239); the EEC Esprit +BR Programme, Project ACTS (3207); and the U.K. Hearing Research Trust. +