Mercurial > hg > aimc
view trunk/matlab/bmm/carfac/CARFAC_Design_Doc.txt @ 704:e9855b95cd04
Small cleanup of eigen usage in SAI implementation.
author | ronw@google.com |
---|---|
date | Tue, 16 Jul 2013 19:56:11 +0000 |
parents | e4c2162baca8 |
children |
line wrap: on
line source
CARFAC Design Doc by "Richard F. Lyon" <dicklyon@google.com> updated 24 May 2012 (v.237) The CAR-FAC (cascade of asymmetric resonators with fast-acting compression) is a cochlear model implemented as an efficient sound processor, for mono, stereo, or multi-channel sound inputs. This file describes aspects of the software design. The implementation will include equivalent Matlab, C++, Python, and perhaps other versions. They should all be based on the same set of classes (or structs) and functions, with roughly equivalent functionality and names where possible. The examples here are Matlab, in which the structs are typeless, but in other languages each will be class or type. The top-level class is CARFAC. A CARFAC object knows how to design its details from a modest set of parameters, and knows how to process sound signals to produce "neural activity patterns" (NAPs). It includes various sub-objects representing the parameters, designs, and states of different parts of the CAR-FAC model. The CARFAC class includes a vector of "ear" objects -- one for mono, two for stereo, or more. The three main subsystems of the EAR are the cascade of asymmetric resonators (CAR), the inner hair cell (IHC), and the automatic gain control (AGC). These are not intended to work independently, but each part has three groups of data associated with it. The CARFAC stores instances of the "params" that drive the design, and the "coeffs" and "state" are stored per ear: CAR_params, CAR_coeffs, CAR_state IHC_params, IHC_coeffs, IHC_state AGC_params, AGC_coeffs, AGC_state These names can be used both for the classes, and slightly modified for the member variables, arguments, temps, etc. The params are inputs that specify the system; the coeffs are things like filter coefficients, things computed once and used at run time; the state is whatever internal state is needed between running the model on samples or segments of sound waveforms, such as the state variables of the digital filters. At construction ("design") time, params are provided, and ears with coeffs are created. The created CARFAC stores the params that it was designed from; there is no state yet: CF = CARFAC_Design(CAR_params, IHC_params, AGC_params) On "channels" and "ears": Stages of the cascade are called "channels", and there are n_ch of them (n_ch is determined from other parameters, but we could also make a function to pick params to get a desired n_ch). Since we already used "channels" for that, sound input channels (one for monaural, two for binaural, or more) are called "ears", and there are n_ears of them. The parameter n_ears is set when state is initialized, but not earlier at design time; it is not in params since it doesn't affect coefficient design). Multi-ear designs usually have the same coeffs across several ears, but always need separate state. The coeffs are kept separate per ear in case someone wants to modify one or both to simulate asymmetric hearing loss or such. The only place the ears interact is in the AGC. The function that closes the AGC loop can "mix" their states, making an inter-ear coupling. Other than that, the functions that update the CAR and IHC and AGC states are simple single-ear functions (methods of the ear class?). Data size and performance: The coeffs and states are several kilobytes each, since they store a handful (10 or so) of floating-point values (at 4 or 8 bytes each) for each channel (typically 60 to 100 channels); that 240 to 800 bytes per coefficient and per state variable. That's the entire data memory footprint; most of it is accessed at every sample time (hopefully it will all fit and have good hit rate in a typical 32 KB L1 d-cache). In Matlab we use doubles (8-byte), but in C++ we intend to use floats (4-byte) and SSE (via Eigen) for higher performance. Alternate implementations are OK. Run-time strategy: To support real-time applications, sound can be processed in short segments, producing segments of NAPs: [NAPs, CF] = CARFAC_Run_Segment(CF, input_waves) Here the plurals ("NAPs" and "input_waves") suggest multiple ears. These can be vectors, with length 1 for mono, or Matlab multi-D arrays with a singleton last dimension. The CF's states are updated such that there will be no glitch when a next segment is processed. Segments are of arbitrary positive-integer length, not necessarily equal. It is not inefficient to use a segment length of 1 sample if results are needed with very low latency. Internally, CARFAC_Run updates each part of the state, one sample at a time. First the CAR, IHC, and AGC are updated ("stepped") for all ears: [car_out, CAR_state] = CARFAC_CAR_Step(sample, CAR_coeffs, CAR_state) [ihc_out, IHC_state] = CARFAC_IHC_Step(car_out, IHC_coeffs, IHC_state) [AGC_state, updated] = CARFAC_AGC_Step(ihc_out, AGC_coeffs, AGC_state) The AGC filter mostly runs at a lower sample rate (by a factor of 8 by default). Usually it just accumulates its input and returns quickly. The boolean "updated" indicates whether the AGC actually did some work and has a new output that needs to be used to "close the loop" and modify the CAR state. When it's true, there's one more step, involving both AGC and CAR states, so it's a function on the CARFAC: CF = CARFAC_Close_AGC_Loop(CF) In Matlab, these functions return a modified copy of the state or CARFAC; that's why we have the multiple returned values. In languages that do more than just call-by-value, a reference will be passed so the state can be modified in place instead. C++ Eigen strategy: The mapping from Matlab's "parallel" operations on vectors of values into C++ code will be done by using Eigen (http://eigen.tuxfamily.org/) arrays, which support similar source-level parallel arithmetic, so we don't have to write loops over channels. Eigen also allows fairly efficient compilation to SSE or Arm/Neon instructions, which can do four float operations per cycle. So we should be able to easily get to efficient code on various platforms. If there are similar strategies available in other languages, we should use them. Python may have a route to using Eigen (http://eigen.tuxfamily.org/index.php?title=PythonInterop).