view docs/aimFileFormats @ 0:5242703e91d3 tip
Initial checkin for AIM92 aimR8.2 (last updated May 1997).
author | tomwalters |
---|---|
date | Fri, 20 May 2011 15:19:45 +0100 |
Appendix C Output File Formats

This Appendix describes the format of output from the ASP modules, for users who wish to process this output with another program of their own. For example, gensai produces a stabilised auditory image which is suitable as input to a speech recognition system. In most cases the output file will probably require an additional processing step before it may be used by another program.

Each ASP module produces an output file in response to the -output option. These output files consist of a header (containing information concerning the makeup and history of the file) followed by a series of 16-bit data items which represent the actual data contained in the file.

There are two obvious ways in which such a data file may be used as input to a user's own program. Firstly, the user's program may be adapted to read and interpret the header of ASP output files before actually processing the data in them. Secondly, the header may be removed and the relevant information entered into the user's program manually before it proceeds to process the data section of the ASP output file.

The first of these techniques is preferable in many ways, providing consistency and insulating the user from the details of ASP headers, although it may require some effort to alter the user's program to process these headers correctly.

The second method may be preferable if the user is merely interested in testing the ASP software's suitability as a preprocessor for their program, or if the program in question cannot be altered (either because it was purchased from someone else, or because it would be too complicated to change it). The second method merely requires a reliable way to separate an ASP output file into header and data sections, and the user must then 'translate' the information in the header into the terms understood by the parameters of their program.
Head and Tail

For users who wish to remove the header part of an ASP output file, the software includes two programs called head and tail. Each takes an ASP output file as its argument and produces another file: head produces a file containing the header section, and tail produces a file containing the data section. The header is adapted to resemble very closely an ASP module's options file, and ought to be easily understood by anyone familiar with the ASP modules and the way in which they are used. As far as translation of concepts from ASP to another program is concerned, we can only volunteer this documentation as a guide to the interpretation of each ASP option.

Reading the header directly from within the user's own program requires a rather more detailed understanding of the output file structure. Whichever method the user chooses in combining ASP output files with their own programs, some information regarding the types and format of information in ASP output files will be needed. Discussion of the structure of the output files will be divided into two sections, one on the header and the other on the data itself.

The header

Each ASP module's output file contains a header which provides information about the origin and history of the data in the file. This information is used by subsequent modules in the ASP hierarchy, as described in Appendix B.
The header is a sequence of ASCII characters (and is thus readable, so even using more on the output file would produce sensible text for the duration of the header). This sequence is divided into three parts.

  1  An identifying line.
  2  A series of option setting lines.
  3  A NULL (ASCII 0) character.

The identifying line is a means of determining, from the first few characters in a file, whether that file is in fact an ASP output file. It is also used for determining the length of the header as a whole; thus, each output file begins with a string such as:

  header_bytes=0000709

where header_bytes is the identifying string (which actually descended from the 'pipes' signal processing language) and the header is 709 bytes long. The length of the header includes the identifying line and the final NULL character.

Each option setting line is of the form:

  <option name>=<option value>

and these lines taken together will closely resemble a normal ASP options file, although there may be some options which are not familiar to even an experienced ASP user. These options are 'hidden' ones which are passed between ASP modules, but which are of no interest to users of the software. They may, however, be of interest to programmers who wish to take ASP output as input to their own programs, so a complete list of these hidden options is given at the end of this Appendix.

The data

The structure of the actual data section of an ASP output file depends on the ASP module which produced it. Since we are then committed to describing each of the output structures in detail, we will begin with the data structure that ASP modules expect in input waveforms.

Input waveform structure

For convenience, the ASP software assumes that input waveforms do not have headers of any form. The only information about the structure of an input waveform that is required relates to the samplerate at which the waveform was generated or collected, and the number of significant bits that are in the data.
These two attributes are taken as options which must be entered by the user, but they could just as easily be viewed as the contents of a suitable header for a waveform. The actual data section of a waveform consists of a single stream of (16-bit, or 'short') integers whose length determines the actual length, in points, of the waveform. To view only the waveform, the user can use genwav.

Basilar membrane motion output structure

The genbmm module takes a waveform of the structure defined above and produces the basilar membrane motion corresponding to this input waveform by using a gammatone filter bank. The output data is expressed as the multiplexed output of a series of channels, one per channel in the filter bank. Thus the first n data points (where n is the number of channels) in the output correspond to the first output data point for each of the n channels, beginning with the highest frequency channel and ending with the lowest. The second batch of n data points in the output file corresponds to the second data point for each of the n channels in turn, and so on. There will be m of these n-data item segments, where m is the length of the input file in data points, and the length of the data section of an sgm output file will thus be m x n data points, which may obviously be quite large.

Prepended to this data is a header which contains information related to the way in which the data was produced. Some of this information was supplied by the user of the genbmm program (e.g. samplerate, length_wave, mincf_afb, etc.) and some was derived from other sources (e.g. channels and pointsperchannel, which were computed by genbmm from the mincf_afb, maxcf_afb, dencf_afb, and length_wave options).

Cochleogram output structure

The output of gencgm is of identical structure to that of genbmm, although of course the header will contain extra information related to how the cochleogram was generated.
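The multiplexed layout shared by genbmm and gencgm output can be illustrated with a short Python sketch. It is not part of the ASP software: the function name demultiplex is hypothetical, and the sketch assumes the 16-bit data items have already been read into a list of integers with the header removed.

```python
def demultiplex(samples, n_channels):
    """Split channel-interleaved data points into one list per channel.

    samples    : flat list of data points, n_channels per time step
    n_channels : n, the number of filter bank channels

    Result index 0 is the first channel written in each time step, which
    the text above describes as the highest frequency channel.
    """
    if n_channels <= 0:
        raise ValueError("n_channels must be positive")
    if len(samples) % n_channels != 0:
        raise ValueError("data length is not a multiple of n_channels")
    # Every n-th data point, starting at offset ch, belongs to channel ch.
    return [samples[ch::n_channels] for ch in range(n_channels)]


if __name__ == "__main__":
    # n = 3 channels, m = 2 time steps, so m x n = 6 data points in total.
    interleaved = [10, 20, 30, 11, 21, 31]
    for index, channel in enumerate(demultiplex(interleaved, 3)):
        print(index, channel)
```

Each returned list has m entries, so the total number of data points is still m x n; only the ordering changes.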
Stabilised auditory image output structure

The output of gensai is a series of 'frames', one per graphical image that is displayed by gensai and revsai. Each of these frames has the structure of an entire data section of a genbmm or gencgm output file. Thus the .sai data is structured as a series of multiplexed waveforms. The size of the data section of a .sai file will be m x n x f data points, where f is the number of frames generated by gensai, and is thus likely to be huge. Again, the header will have additional information related to the generation of the stabilised auditory image.
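For programmers taking the first approach (reading the header in their own program), the following Python sketch shows one way the header layout and the frame structure described in this Appendix might be handled. It is an illustration under stated assumptions, not the ASP implementation: the function names are hypothetical, the identifying line is assumed to end with a newline, the 16-bit data items are assumed to be little-endian (the text does not specify a byte order), and the frame size is assumed to be the product of the channels and pointsperchannel options.

```python
import struct

MAGIC = b"header_bytes="  # identifying string at the start of every ASP output file


def split_header(raw):
    """Return (options, data) for the raw bytes of an ASP output file."""
    if not raw.startswith(MAGIC):
        raise ValueError("not an ASP output file: no header_bytes line")
    # Assumption: the identifying line is terminated by a newline.
    first_line = raw.partition(b"\n")[0]
    header_len = int(first_line[len(MAGIC):].decode("ascii"))
    # header_len counts the identifying line and the final NULL character.
    header, data = raw[:header_len], raw[header_len:]
    if not header.endswith(b"\0"):
        raise ValueError("header does not end with a NULL character")
    options = {}
    for line in header[:-1].decode("ascii").splitlines():
        name, sep, value = line.partition("=")
        if sep:
            options[name.strip()] = value.strip()
    return options, data


def read_samples(data, byteorder="<"):
    """Unpack 16-bit signed data items. Little-endian is an assumption."""
    count = len(data) // 2
    return list(struct.unpack("%s%dh" % (byteorder, count), data[:count * 2]))


def split_frames(samples, frame_size):
    """Cut a flat list of data points into consecutive frames."""
    return [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]


def build_demo_file():
    """Build a tiny synthetic file purely for demonstration."""
    body = b"channels=2\npointsperchannel=3\n"
    total = len(MAGIC) + 7 + 1 + len(body) + 1  # digits, newline, body, NULL
    header = MAGIC + b"%07d\n" % total + body + b"\0"
    return header + struct.pack("<12h", *range(12))


if __name__ == "__main__":
    options, data = split_header(build_demo_file())
    samples = read_samples(data)
    # Assumption: one frame holds channels x pointsperchannel data points.
    size = int(options["channels"]) * int(options["pointsperchannel"])
    print(options)
    print(split_frames(samples, size))
```

Keeping the header split separate from the data decoding mirrors the division of labour between head and tail, and lets the byte order be changed in one place if the assumption proves wrong for a given machine.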