view trunk/matlab/bmm/carfac/CARFAC_Design_Doc.txt @ 546:e5ae2ed4249c

(none)
author Ulf.Hammarqvist@gmail.com
date Sat, 31 Mar 2012 17:36:22 +0000
parents 95a11cca4619
children fb602edc2d55
line wrap: on
line source
CARFAC Design Doc
by "Richard F. Lyon" <dicklyon@google.com>


The CAR-FAC (cascade of asymmetric resonators with fast-acting
compression) is a cochlear model implemented as an efficient sound
processor, for mono, stereo, or multi-channel sound inputs.  This file
describes aspects of the software design.

The implementation will include equivalent Maltab, C++, Python, and
perhaps other versions.  They should all be based on the same set of
classes (or structs) and functions, with roughly equivalent
functionality and names where possible.  The examples here are Matlab,
in which the structs are typeless, but in other languages each will be
class or type.

The top-level class is CARFAC.  A CARFAC object knows how to design
its details from a modest set of parameters, and knows how to process
sound signals to produce "neural activity patterns" (NAPs).  It
includes various sub-objects representing the parameters, designs, and
states of different parts of the CAR-FAC model.

The three main subsystems of the CARFAC are the cascade of asymmetric
resonators (CAR), the inner hair cell (IHC), and the automatic gain
control (AGC).  These are not intended to work independently, so may
not have corresponding classes to combine all their aspects (in the
Matlab so far, they don't).  Rather, each part has three classes
associated with it, and the CARFAC stores instances of all of these as
members:

  CAR_params, CAR_coeffs, CAR_state
  IHC_params, IHC_coeffs, IHC_state
  AGC_params, AGC_coeffs, AGC_state

These names can be used both for the classes, and slightly modified
for the member variables, arguments, temps, etc.

 -- (presently we have "filters" where I say "CAR"; I'll change it) --

The params are inputs that specify the system; the coeffs are things
like filter coefficients, things computed once and used at run time;
the state is whatever internal state is needed between running the
model on samples or segments of sound waveforms, such as the state
variables of the digital filters.

At construction ("design") time, params are provided, and coeffs are
created.  The created CARFAC stores the params that it was designed
from, and the coeffs that will be used at run time; there is no state
yet:

  CF = CARFAC_Design(CAR_params, IHC_params, AGC_params)


On "channels" and "ears":

Stages of the cascade are called "channels", and there are n_ch of
them (n_ch is determined from other parameters, but we could also make
a function to pick params to get a desired n_ch).  Since we already
used "channels" for that, sound input channels (one for monaural, two
for binaural, or more) are called "ears", and there are n_ears of them.
The parameter n_ears is set when state is initialized, but
not earlier at design time; it is not in params since it doesn't
affect coefficient design).

Multi-ear designs always share the same coeffs, but need separate state.
The only  place the ears interact is in the AGC.  The AGC_state is
one object that holds the state of all the ears, so they can be
concurrently updated.  For the CAR and IHC, though, the state object
represents a single ear's state, and the CARFAC stores arrays (vectors) 
of state objects for the different ears.  The functions that update the
CAR and IHC states are then simple single-ear functions.


Data size and performance:

The coeffs and states are several kilobytes each, since they store a
handful (5 to 10) of floating-point values (at 4 or 8 bytes each) for
each channel (typically 60 to 100 channels); that 240 to 800 bytes per
coefficient and per state variable.  That's the entire data memory
footprint; most of it is accessed at every sample time (hopefully it
will all fit and have good hit rate in a typical 32 KB L1 d-cache).
In Matlab we use doubles (8-byte), but in C++ we intend to use floats
(4-byte) and SSE (via Eigen) for higher performance.  Alternate
implementations are OK.


Run-time strategy:

To support real-time applications, sound can be processed in short
segments, producing segments of NAPs:

  [NAPs, CF] = CARFAC_Run_Segment(CF, input_waves)

Here the plurals ("NAPs" and "input_waves") suggest multiple ears.  
These can be vectors, with length 1 for mono, or Matlab
multi-D arrays with a singleton last dimension.

The CF's states are updated such that there will be no glitch when a next
segment is processed.  Segments are of arbitrary positive-integer
length, not necessarily equal.  It is not inefficient to use a segment
length of 1 sample if results are needed with very low latency.

Internally, CARFAC_Run updates each part of the state, one sample at a
time.  First the CAR and IHC are updated ("stepped") for all ears:

  [car_out, CAR_state] = CARFAC_CAR_Step(sample,  CAR_coeffs, CAR_state)
  [ihc_out, IHC_state] = CARFAC_IHC_Step(car_out, IHC_coeffs, IHC_state)

Then the AGC is updated, for all ears at once (note the plural
"ihc_outs", representing the collection of n_ears):

  [AGC_state, updated] = CARFAC_AGC_Step(AGC_coeffs, ihc_outs, AGC_state)

The AGC filter mostly runs at a lower sample rate (by a factor of 8 by 
default). Usually it just accumulates its input and returns quickly. The 
boolean "updated" indicates whether the AGC actually did some work and 
has a new output that needs to be used to "close the loop" and modify 
the CAR state. When it's true, there's one more step, involving both AGC 
and CAR states, so it's a function on the CARFAC: 

  CF = CARFAC_Close_AGC_Loop(CF)

In Matlab, these functions return a modified copy of the state or
CARFAC; that's why we have the multiple returned values.  In languages
that do more than just call-by-value, a reference will be passed so
the state can be modified in place instead.


C++ Eigen strategy:

The mapping from Matlab's "parallel" operations on vectors of values
into C++ code will be done by using Eigen
(http://eigen.tuxfamily.org/) arrays, which support similar
source-level parallel arithmetic, so we don't have to write loops over
channels.  Eigen also allows fairly efficient compilation to SSE or
Arm/Neon instructions, which can do four float operations per cycle.
So we should be able to easily get to efficient code on various
platforms.

If there are similar strategies available in other languages, we should 
use them.  Python may have a route to using Eigen
(http://eigen.tuxfamily.org/index.php?title=PythonInterop).