annotate trunk/matlab/bmm/carfac/CARFAC_Design_Doc.txt @ 682:10dc41e4d2f2

More small style revisions to C++ CARFAC, adjusted struct member variable naming, header guards and #include structure.
author alexbrandmeyer
date Wed, 29 May 2013 15:37:28 +0000
parents e4c2162baca8
children
rev   line source
dicklyon@534 1 CARFAC Design Doc
dicklyon@534 2 by "Richard F. Lyon" <dicklyon@google.com>
dicklyon@567 3 updated 24 May 2012 (v.237)
dicklyon@534 4
dicklyon@534 5 The CAR-FAC (cascade of asymmetric resonators with fast-acting
dicklyon@534 6 compression) is a cochlear model implemented as an efficient sound
dicklyon@534 7 processor, for mono, stereo, or multi-channel sound inputs. This file
dicklyon@534 8 describes aspects of the software design.
dicklyon@534 9
dicklyon@563 10 The implementation will include equivalent Matlab, C++, Python, and
dicklyon@534 11 perhaps other versions. They should all be based on the same set of
dicklyon@534 12 classes (or structs) and functions, with roughly equivalent
dicklyon@534 13 functionality and names where possible. The examples here are Matlab,
dicklyon@534 14 in which the structs are typeless, but in other languages each will be
dicklyon@534 15 a class or type.
dicklyon@534 16
dicklyon@534 17 The top-level class is CARFAC. A CARFAC object knows how to design
dicklyon@534 18 its details from a modest set of parameters, and knows how to process
dicklyon@534 19 sound signals to produce "neural activity patterns" (NAPs). It
dicklyon@534 20 includes various sub-objects representing the parameters, designs, and
dicklyon@534 21 states of different parts of the CAR-FAC model.
dicklyon@534 22
dicklyon@563 23 The CARFAC class includes a vector of "ear" objects -- one for mono,
dicklyon@563 24 two for stereo, or more.
dicklyon@563 25
dicklyon@563 26 The three main subsystems of each ear are the cascade of asymmetric
dicklyon@534 27 resonators (CAR), the inner hair cell (IHC), and the automatic gain
dicklyon@565 28 control (AGC). These are not intended to work independently, but
dicklyon@565 29 each part has three groups of data associated with it. The CARFAC
dicklyon@565 30 stores instances of the "params" that drive the design, and the
dicklyon@565 31 "coeffs" and "state" are stored per ear:
dicklyon@534 32
dicklyon@534 33 CAR_params, CAR_coeffs, CAR_state
dicklyon@534 34 IHC_params, IHC_coeffs, IHC_state
dicklyon@534 35 AGC_params, AGC_coeffs, AGC_state
dicklyon@534 36
dicklyon@534 37 These names can be used both for the classes, and slightly modified
dicklyon@534 38 for the member variables, arguments, temps, etc.
dicklyon@534 39
dicklyon@534 40 The params are inputs that specify the system; the coeffs are things
dicklyon@534 41 like filter coefficients, things computed once and used at run time;
dicklyon@534 42 the state is whatever internal state is needed between running the
dicklyon@534 43 model on samples or segments of sound waveforms, such as the state
dicklyon@534 44 variables of the digital filters.
dicklyon@534 45
dicklyon@565 46 At construction ("design") time, params are provided, and ears with coeffs
dicklyon@565 47 are created. The created CARFAC stores the params that it was designed
dicklyon@565 48 from; there is no state yet:
dicklyon@534 49
dicklyon@534 50 CF = CARFAC_Design(CAR_params, IHC_params, AGC_params)
dicklyon@534 51
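The params/coeffs/state split and the design step can be sketched in Python as
follows. This is only an illustration under stated assumptions: it covers the
CAR part alone (IHC and AGC would follow the same pattern), and the field names
(n_ch, pole_radius, r, z), the n_ears argument, and the helper functions
carfac_design and carfac_init_state are hypothetical, not the actual CARFAC API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CARParams:
    # Hypothetical fields; the real CAR_params would hold more.
    n_ch: int = 4
    pole_radius: float = 0.9


@dataclass
class CARCoeffs:
    r: List[float]  # one value per channel, computed once at design time


@dataclass
class CARState:
    z: List[float]  # one filter state variable per channel


@dataclass
class Ear:
    car_coeffs: CARCoeffs
    car_state: Optional[CARState] = None  # no state until it is initialized


@dataclass
class CARFAC:
    car_params: CARParams  # the params the design was made from
    ears: List[Ear] = field(default_factory=list)


def design_car_coeffs(params: CARParams) -> CARCoeffs:
    # Placeholder design rule (assumption): same coefficient in every channel.
    return CARCoeffs(r=[params.pole_radius] * params.n_ch)


def carfac_design(car_params: CARParams, n_ears: int = 1) -> CARFAC:
    # Each ear gets its own coeffs object so it can be modified independently.
    ears = [Ear(car_coeffs=design_car_coeffs(car_params)) for _ in range(n_ears)]
    return CARFAC(car_params=car_params, ears=ears)


def carfac_init_state(cf: CARFAC) -> CARFAC:
    # State is created separately from design, one state object per ear.
    for ear in cf.ears:
        ear.car_state = CARState(z=[0.0] * len(ear.car_coeffs.r))
    return cf
```

Keeping a separate coeffs object per ear costs a little memory but lets one ear
be altered without touching the other.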
dicklyon@534 52
dicklyon@534 53 On "channels" and "ears":
dicklyon@534 54
dicklyon@534 55 Stages of the cascade are called "channels", and there are n_ch of
dicklyon@534 56 them (n_ch is determined from other parameters, but we could also make
dicklyon@534 57 a function to pick params to get a desired n_ch). Since we already
dicklyon@534 58 used "channels" for that, sound input channels (one for monaural, two
dicklyon@534 59 for binaural, or more) are called "ears", and there are n_ears of them.
dicklyon@534 60 The parameter n_ears is set when state is initialized, but
dicklyon@534 61 not earlier at design time (it is not in params since it doesn't
dicklyon@534 62 affect coefficient design).
dicklyon@534 63
dicklyon@565 64 Multi-ear designs usually have the same coeffs across several ears, but
dicklyon@565 65 always need separate state.
dicklyon@565 66 The coeffs are kept separate per ear in case someone wants to
dicklyon@565 67 modify one or both to simulate asymmetric hearing loss or such.
dicklyon@565 68
dicklyon@565 69 The only place the ears interact is in the AGC. The function that
dicklyon@565 70 closes the AGC loop can "mix" their states, making an inter-ear
dicklyon@565 71 coupling. Other than that, the functions that update the CAR and IHC
dicklyon@565 72 and AGC states are simple single-ear functions (methods of the ear
dicklyon@565 73 class?).
dicklyon@534 74
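One plausible form of the inter-ear "mix" is to pull each ear's AGC state
toward the across-ear mean, channel by channel. The Python sketch below assumes
exactly that; the function name mix_agc_states and the mix_coeff parameter are
hypothetical, and the real coupling rule may differ.

```python
def mix_agc_states(agc_states, mix_coeff):
    """Move each ear's per-channel AGC state toward the across-ear mean.

    agc_states: list with one entry per ear, each a list of per-channel values.
    mix_coeff:  0.0 leaves the ears independent, 1.0 makes them identical.
    Returns new lists; the inputs are not modified.
    """
    n_ears = len(agc_states)
    n_ch = len(agc_states[0])
    mean = [sum(ear[ch] for ear in agc_states) / n_ears for ch in range(n_ch)]
    return [
        [x + mix_coeff * (m - x) for x, m in zip(ear, mean)]
        for ear in agc_states
    ]
```

With a single ear the mean equals the state itself, so the function is a no-op
in the mono case.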
dicklyon@534 75
dicklyon@534 76 Data size and performance:
dicklyon@534 77
dicklyon@534 78 The coeffs and states are several kilobytes each, since they store a
dicklyon@565 79 handful (10 or so) of floating-point values (at 4 or 8 bytes each) for
dicklyon@534 80 each channel (typically 60 to 100 channels); that's 240 to 800 bytes per
dicklyon@534 81 coefficient and per state variable. That's the entire data memory
dicklyon@534 82 footprint; most of it is accessed at every sample time (hopefully it
dicklyon@534 83 will all fit and have good hit rate in a typical 32 KB L1 d-cache).
dicklyon@534 84 In Matlab we use doubles (8-byte), but in C++ we intend to use floats
dicklyon@534 85 (4-byte) and SSE (via Eigen) for higher performance. Alternate
dicklyon@534 86 implementations are OK.
dicklyon@534 87
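The size estimate above can be checked with a few lines of Python. The helper
name footprint_bytes is hypothetical, and the value of 10 per-channel values is
the rough "10 or so" figure from the text, not an exact count.

```python
def footprint_bytes(n_ch, bytes_per_value, values_per_channel=10):
    """Return (bytes per coefficient or state variable, total bytes)."""
    per_variable = n_ch * bytes_per_value
    total = per_variable * values_per_channel
    return per_variable, total


small = footprint_bytes(60, 4)    # 60 channels of 4-byte floats
large = footprint_bytes(100, 8)   # 100 channels of 8-byte doubles
l1_dcache = 32 * 1024             # the 32 KB L1 d-cache size used in the text
```

Here small is (240, 2400) and large is (800, 8000), so each group of coeffs or
states is a few kilobytes, well under the 32768-byte cache size.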
dicklyon@534 88
dicklyon@534 89 Run-time strategy:
dicklyon@534 90
dicklyon@534 91 To support real-time applications, sound can be processed in short
dicklyon@534 92 segments, producing segments of NAPs:
dicklyon@534 93
dicklyon@534 94 [NAPs, CF] = CARFAC_Run_Segment(CF, input_waves)
dicklyon@534 95
dicklyon@534 96 Here the plurals ("NAPs" and "input_waves") suggest multiple ears.
dicklyon@534 97 These can be vectors, with length 1 for mono, or Matlab
dicklyon@534 98 multi-D arrays with a singleton last dimension.
dicklyon@534 99
dicklyon@534 100 The CF's states are updated such that there will be no glitch when the next
dicklyon@534 101 segment is processed. Segments are of arbitrary positive-integer
dicklyon@534 102 length, not necessarily equal. It is not inefficient to use a segment
dicklyon@534 103 length of 1 sample if results are needed with very low latency.
dicklyon@534 104
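The "no glitch" property means that splitting the input into segments of any
lengths gives the same output as processing it in one piece, provided the state
is carried from one call to the next. The Python sketch below illustrates this
with a toy one-pole smoothing filter standing in for the whole model; the
function run_segment and its coefficient are assumptions for illustration only.

```python
def run_segment(state, coeff, segment):
    """Process one segment sample by sample; return (outputs, updated state)."""
    outputs = []
    for sample in segment:
        state = state + coeff * (sample - state)  # toy stand-in for the model
        outputs.append(state)
    return outputs, state


wave = [1.0, 0.0, 0.5, -0.25, 2.0, 1.0, 0.0]

# Whole signal in a single call.
whole, _ = run_segment(0.0, 0.5, wave)

# Same signal in unequal segments (lengths 1, 3, 3), carrying the state along.
pieces = []
state = 0.0
for seg in (wave[:1], wave[1:4], wave[4:]):
    out, state = run_segment(state, 0.5, seg)
    pieces.extend(out)
```

Because the same arithmetic runs in the same order either way, pieces and whole
are identical, including for a segment length of 1.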
dicklyon@534 105 Internally, CARFAC_Run_Segment updates each part of the state, one sample at a
dicklyon@565 106 time. First the CAR, IHC, and AGC are updated ("stepped") for all ears:
dicklyon@534 107
dicklyon@534 108 [car_out, CAR_state] = CARFAC_CAR_Step(sample, CAR_coeffs, CAR_state)
dicklyon@534 109 [ihc_out, IHC_state] = CARFAC_IHC_Step(car_out, IHC_coeffs, IHC_state)
dicklyon@565 110 [AGC_state, updated] = CARFAC_AGC_Step(ihc_out, AGC_coeffs, AGC_state)
dicklyon@534 111
dicklyon@534 112 The AGC filter mostly runs at a lower sample rate (by a factor of 8 by
dicklyon@534 113 default). Usually it just accumulates its input and returns quickly. The
dicklyon@534 114 boolean "updated" indicates whether the AGC actually did some work and
dicklyon@534 115 has a new output that needs to be used to "close the loop" and modify
dicklyon@534 116 the CAR state. When it's true, there's one more step, involving both AGC
dicklyon@534 117 and CAR states, so it's a function on the CARFAC:
dicklyon@534 118
dicklyon@534 119 CF = CARFAC_Close_AGC_Loop(CF)
dicklyon@534 120
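The decimated AGC step and its "updated" flag can be sketched in Python as
below. This is a simplified illustration: the state fields (accum, phase,
output) are hypothetical, and a plain average stands in for the real AGC
filtering. The counter in run marks where CARFAC_Close_AGC_Loop would be called.

```python
def agc_step(ihc_out, agc_state, decimation=8):
    """Accumulate input; every `decimation` samples, produce a new output.

    Returns (new_state, updated), where new_state is a modified copy.
    """
    agc_state = dict(agc_state)  # copy, mirroring Matlab's value semantics
    agc_state["accum"] += ihc_out
    agc_state["phase"] += 1
    updated = agc_state["phase"] >= decimation
    if updated:
        # Placeholder for the real AGC filter: average of accumulated input.
        agc_state["output"] = agc_state["accum"] / decimation
        agc_state["accum"] = 0.0
        agc_state["phase"] = 0
    return agc_state, updated


def run(samples):
    agc_state = {"accum": 0.0, "phase": 0, "output": 0.0}
    closes = 0
    for s in samples:
        agc_state, updated = agc_step(s, agc_state)
        if updated:
            closes += 1  # here the loop would be closed on the CAR state
    return agc_state, closes
```

On most samples agc_step only adds to the accumulator and returns, which is why
the AGC adds little cost per sample.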
dicklyon@534 121 In Matlab, these functions return a modified copy of the state or
dicklyon@534 122 CARFAC; that's why we have the multiple returned values. In languages
dicklyon@534 123 that do more than just call-by-value, a reference will be passed so
dicklyon@534 124 the state can be modified in place instead.
dicklyon@534 125
dicklyon@534 126
dicklyon@534 127 C++ Eigen strategy:
dicklyon@534 128
dicklyon@534 129 The mapping from Matlab's "parallel" operations on vectors of values
dicklyon@534 130 into C++ code will be done by using Eigen
dicklyon@534 131 (http://eigen.tuxfamily.org/) arrays, which support similar
dicklyon@534 132 source-level parallel arithmetic, so we don't have to write loops over
dicklyon@534 133 channels. Eigen also allows fairly efficient compilation to SSE or
dicklyon@534 134 ARM NEON instructions, which operate on four floats at a time.
dicklyon@534 135 So we should be able to easily get to efficient code on various
dicklyon@534 136 platforms.
dicklyon@534 137
dicklyon@534 138 If there are similar strategies available in other languages, we should
dicklyon@534 139 use them. Python may have a route to using Eigen
dicklyon@534 140 (http://eigen.tuxfamily.org/index.php?title=PythonInterop).