Idyom » History » Version 5
Marcus Pearce, 2012-02-02 02:02 PM
| 1 | 1 | Marcus Pearce | h1. idyom |
|---|---|---|---|
| 2 | 1 | Marcus Pearce | |
| 3 | 1 | Marcus Pearce | h2. Loading the system |
| 4 | 1 | Marcus Pearce | |
| 5 | 1 | Marcus Pearce | <pre> |
| 6 | 1 | Marcus Pearce | CL-USER> (asdf:oos 'asdf:load-op 'idyom) |
| 7 | 1 | Marcus Pearce | ... |
| 8 | 1 | Marcus Pearce | </pre> |
| 9 | 1 | Marcus Pearce | |
| 10 | 5 | Marcus Pearce | You will need to set the value of the variable <code>*root-dir*</code> in the file "apps/apps.lisp":https://code.soundsoftware.ac.uk/projects/idyom/repository/entry/apps/apps.lisp to point to the location of the idyom repository; this is so that files can be saved in the <code>data</code> directory within the repository. |
| 11 | 1 | Marcus Pearce | |
| 12 | 1 | Marcus Pearce | h2. Directory Structure |
| 13 | 1 | Marcus Pearce | |
| 14 | 1 | Marcus Pearce | The layout of the code is as follows: |
| 15 | 1 | Marcus Pearce | |
| 16 | 1 | Marcus Pearce | * idyom.asd: the system definition |
| 17 | 1 | Marcus Pearce | * utils/: general-purpose utilities |
| 18 | 1 | Marcus Pearce | * amuse/: an interface to amuse, a music data interface |
| 19 | 1 | Marcus Pearce | ** amuse-interface.lisp: look here to make this less dependent on amuse-mtp |
| 20 | 1 | Marcus Pearce | ** viewpoint-extensions.lisp: extensions to the viewpoint framework useful for modelling |
| 21 | 1 | Marcus Pearce | * ppm/: the multiple-viewpoint system modelling framework |
| 22 | 1 | Marcus Pearce | ** params.lisp: user-level parameters to the statistical modelling |
| 23 | 1 | Marcus Pearce | ** multiple-viewpoint-system.lisp: modelling with multiple viewpoints |
| 24 | 1 | Marcus Pearce | ** prediction-sets.lisp: representing and combining distributions, computing entropy and information content |
| 25 | 1 | Marcus Pearce | * apps/: top-level applications for users |
| 26 | 1 | Marcus Pearce | ** resampling.lisp: the main user interface - estimating information content by cross-validation |
| 27 | 1 | Marcus Pearce | ** viewpoint-selection.lisp: does what it says on the tin |
| 28 | 1 | Marcus Pearce | * data/: directory used internally for storing data |
| 29 | 1 | Marcus Pearce | ** cache/: for storing cache files |
| 30 | 1 | Marcus Pearce | ** models/: for storing model files |
| 31 | 1 | Marcus Pearce | ** resampling/: for storing resampling sets |
| 32 | 1 | Marcus Pearce | |
| 33 | 1 | Marcus Pearce | h2. Usage |
| 34 | 1 | Marcus Pearce | |
| 35 | 2 | Marcus Pearce | h3. Top-level function <code>MAIN:IDYOM</code> |
| 36 | 1 | Marcus Pearce | |
| 37 | 2 | Marcus Pearce | The main workhorse is the function <code>MAIN:IDYOM</code>, which accepts the following arguments (the first three are required; the remainder are optional keyword arguments): |
| 38 | 1 | Marcus Pearce | |
| 39 | 2 | Marcus Pearce | _Required parameters_ |
| 40 | 2 | Marcus Pearce | |
| 41 | 1 | Marcus Pearce | * dataset-id: a dataset id |
| 42 | 1 | Marcus Pearce | ** e.g., 1 |
| 43 | 1 | Marcus Pearce | * basic-attributes: a list of basic attributes to predict |
| 44 | 1 | Marcus Pearce | ** e.g., '(cpitch bioi) |
| 45 | 1 | Marcus Pearce | * attributes: a list of attributes to use in prediction |
| 46 | 1 | Marcus Pearce | ** e.g., '((cpintfref cpint) bioi) |
| 47 | 2 | Marcus Pearce | |
| 48 | 2 | Marcus Pearce | _Parameters controlling the statistical modelling_ |
| 49 | 2 | Marcus Pearce | |
| 50 | 1 | Marcus Pearce | * pretraining-ids: a list of dataset-ids to pretrain the long-term models |
| 51 | 1 | Marcus Pearce | ** e.g., '(0 1 7) |
| 52 | 1 | Marcus Pearce | * k: the number of cross-validation folds to use, given as an integer or the keyword <code>:full</code> |
| 53 | 1 | Marcus Pearce | ** 1 = no cross-validation, and therefore no training set unless the models are pretrained |
| 54 | 1 | Marcus Pearce | ** :full = as many folds as there are compositions in the dataset |
| 55 | 1 | Marcus Pearce | ** default = 10 |
| 56 | 1 | Marcus Pearce | * resampling-indices: restricts the modelling to a particular subset of the resampling folds |
| 57 | 1 | Marcus Pearce | * models: whether to use the short-term or long-term models or both |
| 58 | 1 | Marcus Pearce | ** :stm - short-term model only |
| 59 | 1 | Marcus Pearce | ** :ltm - long-term model only |
| 60 | 1 | Marcus Pearce | ** :ltm+ - the long-term model trained incrementally on the test set |
| 61 | 1 | Marcus Pearce | ** :both - :stm + :ltm |
| 62 | 1 | Marcus Pearce | ** :both+ - :stm + :ltm+ (this is the default) |
| 63 | 1 | Marcus Pearce | * ltm-order-bound: the order bound for the long-term model (the default <code>nil</code> means no order bound, otherwise an integer indicates the bound in number of events) |
| 64 | 1 | Marcus Pearce | * ltm-mixtures: whether to use mixtures for the LTM (default <code>t</code>) |
| 65 | 1 | Marcus Pearce | * ltm-update-exclusion: whether to use update exclusion for the LTM (default <code>nil</code>) |
| 66 | 1 | Marcus Pearce | * ltm-escape: the escape method to use for the LTM (<code>:a :b :c :d :x</code> - default <code>:c</code>) |
| 67 | 1 | Marcus Pearce | * stm-order-bound: the order bound to use for the short-term model (default <code>nil</code>) |
| 68 | 1 | Marcus Pearce | * stm-mixtures: whether to use mixtures for the STM (default <code>t</code>) |
| 69 | 1 | Marcus Pearce | * stm-update-exclusion: whether to use update exclusion for the STM (default <code>t</code>) |
| 70 | 1 | Marcus Pearce | * stm-escape: the escape method for the STM (default <code>:x</code>) |
| 71 | 1 | Marcus Pearce | |
| 72 | 1 | Marcus Pearce | See "Pearce [2005, chapter 6]":http://webprojects.eecs.qmul.ac.uk/marcusp/papers/Pearce2005.pdf for a description and explanation of these parameters. |
| 73 | 2 | Marcus Pearce | |
| 74 | 2 | Marcus Pearce | _Parameters controlling viewpoint selection_ |
| 75 | 2 | Marcus Pearce | |
| 76 | 2 | Marcus Pearce | * dp: the number of decimal places to use when comparing information contents in viewpoint selection |
| 77 | 2 | Marcus Pearce | ** full floating point precision is used if this is <code>nil</code> (the default) |
| 78 | 2 | Marcus Pearce | * max-links: the maximum number of links to use when creating linked viewpoints in viewpoint selection |
| 79 | 2 | Marcus Pearce | ** the default is 2 |
| 80 | 2 | Marcus Pearce | |
| 81 | 2 | Marcus Pearce | _Parameters controlling the output_ |
| 82 | 2 | Marcus Pearce | |
| 83 | 2 | Marcus Pearce | * output-path: a string indicating a directory in which to write the output |
| 84 | 3 | Marcus Pearce | ** if this is <code>nil</code>, output is written only to the console |
| 85 | 2 | Marcus Pearce | * detail: an integer which determines how the information content is averaged in the output: |
| 86 | 2 | Marcus Pearce | ** 1: averaged over the entire dataset |
| 87 | 2 | Marcus Pearce | ** 2: and also averaged over each composition |
| 88 | 2 | Marcus Pearce | ** 3: and also for each event in each composition |
| 89 | 2 | Marcus Pearce | |
| 90 | 2 | Marcus Pearce | h3. Subsidiary functions |
| 91 | 2 | Marcus Pearce | |
| 92 | 2 | Marcus Pearce | The function <code>MAIN:IDYOM</code> in turn passes its arguments on to a number of sub-functions, which can also be used independently. |
| 93 | 2 | Marcus Pearce | |
| 94 | 2 | Marcus Pearce | <code>RESAMPLING:DATASET-PREDICTION</code> accepts the following arguments (all but the first three are optional, keyword arguments): |
| 95 | 2 | Marcus Pearce | |
| 96 | 2 | Marcus Pearce | * dataset-id: a dataset id |
| 97 | 2 | Marcus Pearce | ** e.g., 1 |
| 98 | 2 | Marcus Pearce | * basic-attributes: a list of basic attributes to predict |
| 99 | 2 | Marcus Pearce | ** e.g., '(cpitch bioi) |
| 100 | 2 | Marcus Pearce | * attributes: a list of attributes to use in prediction |
| 101 | 2 | Marcus Pearce | ** e.g., '((cpintfref cpint) bioi) |
| 102 | 2 | Marcus Pearce | * pretraining-ids: a list of dataset-ids to pretrain the long-term models |
| 103 | 2 | Marcus Pearce | ** e.g., '(0 1 7) |
| 104 | 2 | Marcus Pearce | * k: the number of cross-validation folds to use, given as an integer or the keyword <code>:full</code> |
| 105 | 2 | Marcus Pearce | ** 1 = no cross-validation, and therefore no training set unless the models are pretrained |
| 106 | 2 | Marcus Pearce | ** :full = as many folds as there are compositions in the dataset |
| 107 | 2 | Marcus Pearce | ** default = 10 |
| 108 | 2 | Marcus Pearce | * resampling-indices: restricts the modelling to a particular subset of the resampling folds |
| 109 | 2 | Marcus Pearce | * models: whether to use the short-term or long-term models or both |
| 110 | 2 | Marcus Pearce | ** :stm - short-term model only |
| 111 | 2 | Marcus Pearce | ** :ltm - long-term model only |
| 112 | 2 | Marcus Pearce | ** :ltm+ - the long-term model trained incrementally on the test set |
| 113 | 2 | Marcus Pearce | ** :both - :stm + :ltm |
| 114 | 2 | Marcus Pearce | ** :both+ - :stm + :ltm+ (this is the default) |
| 115 | 2 | Marcus Pearce | * ltm-order-bound: the order bound for the long-term model (the default <code>nil</code> means no order bound, otherwise an integer indicates the bound in number of events) |
| 116 | 2 | Marcus Pearce | * ltm-mixtures: whether to use mixtures for the LTM (default <code>t</code>) |
| 117 | 2 | Marcus Pearce | * ltm-update-exclusion: whether to use update exclusion for the LTM (default <code>nil</code>) |
| 118 | 2 | Marcus Pearce | * ltm-escape: the escape method to use for the LTM (<code>:a :b :c :d :x</code> - default <code>:c</code>) |
| 119 | 2 | Marcus Pearce | * stm-order-bound: the order bound to use for the short-term model (default <code>nil</code>) |
| 120 | 2 | Marcus Pearce | * stm-mixtures: whether to use mixtures for the STM (default <code>t</code>) |
| 121 | 2 | Marcus Pearce | * stm-update-exclusion: whether to use update exclusion for the STM (default <code>t</code>) |
| 122 | 2 | Marcus Pearce | * stm-escape: the escape method for the STM (default <code>:x</code>) |
| 123 | 1 | Marcus Pearce | |
| 124 | 1 | Marcus Pearce | <code>RESAMPLING:OUTPUT-INFORMATION-CONTENT</code> takes the output of <code>RESAMPLING:DATASET-PREDICTION</code> and returns the average information content. It takes the following arguments: |
| 125 | 1 | Marcus Pearce | |
| 126 | 1 | Marcus Pearce | * predictions: the output of <code>RESAMPLING:DATASET-PREDICTION</code> |
| 127 | 1 | Marcus Pearce | * detail: an integer which determines how the information content is averaged (these are returned as multiple values): |
| 128 | 1 | Marcus Pearce | ** 1: averaged over the entire dataset |
| 129 | 1 | Marcus Pearce | ** 2: and also averaged over each composition |
| 130 | 1 | Marcus Pearce | ** 3: and also for each event in each composition |
| 131 | 1 | Marcus Pearce | |
| 132 | 1 | Marcus Pearce | <code>RESAMPLING:FORMAT-INFORMATION-CONTENT</code> takes the output of <code>RESAMPLING:DATASET-PREDICTION</code> and writes it to file. It takes the following arguments: |
| 133 | 1 | Marcus Pearce | |
| 134 | 1 | Marcus Pearce | * predictions: the output of <code>RESAMPLING:DATASET-PREDICTION</code> |
| 135 | 1 | Marcus Pearce | * file: a string denoting a file |
| 136 | 1 | Marcus Pearce | * dataset-id: an integer reflecting the dataset-id |
| 137 | 1 | Marcus Pearce | * detail: an integer which determines how the information content is averaged in the file: |
| 138 | 1 | Marcus Pearce | ** 1: averaged over the entire dataset |
| 139 | 1 | Marcus Pearce | ** 2: and also averaged over each composition |
| 140 | 1 | Marcus Pearce | ** 3: and also for each event in each composition |
| 141 | 1 | Marcus Pearce | |
| 142 | 1 | Marcus Pearce | h2. Examples |
| 143 | 1 | Marcus Pearce | |
| 144 | 1 | Marcus Pearce | h3. To get mean information contents for each melody of dataset 0 in a list |
| 145 | 1 | Marcus Pearce | |
| 146 | 1 | Marcus Pearce | <pre> |
| 147 | 1 | Marcus Pearce | CL-USER> (resampling:output-information-content |
| 148 | 1 | Marcus Pearce | (resampling:dataset-prediction 0 '(cpitch) '(cpintfref cpint)) |
| 149 | 1 | Marcus Pearce | 2) |
| 150 | 1 | Marcus Pearce | 2.493305 |
| 151 | 1 | Marcus Pearce | (2.1368716 2.8534691 2.6938546 2.6491673 2.4993074 2.6098127 2.7728052 2.772861 |
| 152 | 1 | Marcus Pearce | 2.5921957 2.905856 2.3591626 2.957503 2.4042292 2.7562473 2.3996017 2.8073587 |
| 153 | 1 | Marcus Pearce | 2.114944 1.7434102 2.2310295 2.6374347 2.361792 1.9476132 2.501488 2.5472867 |
| 154 | 1 | Marcus Pearce | 2.1056154 2.8225484 2.134257 2.9162033 3.0715692 2.9012227 2.7291088 2.866882 |
| 155 | 1 | Marcus Pearce | 2.8795822 2.4571223 2.9277062 2.7861307 2.6623116 2.3304622 2.4217033 |
| 156 | 1 | Marcus Pearce | 2.0556943 2.4048684 2.914848 2.7182267 3.0894585 2.873869 1.8821808 2.640174 |
| 157 | 1 | Marcus Pearce | 2.8165438 2.5423129 2.3011856 3.1477294 2.655349 2.5216308 2.0667994 3.2579045 |
| 158 | 1 | Marcus Pearce | 2.573013 2.6035044 2.202191 2.622113 2.2621205 2.3617425 2.7526956 2.3281655 |
| 159 | 1 | Marcus Pearce | 2.9357266 2.3372407 3.1848125 2.67367 2.1906006 2.7835917 2.6332111 3.206142 |
| 160 | 1 | Marcus Pearce | 2.1426969 2.194259 2.415167 1.9769101 2.0870917 2.7844474 2.2373738 2.772138 |
| 161 | 1 | Marcus Pearce | 2.9702199 1.724408 2.473073 2.2464263 2.2452457 2.688889 2.6299863 2.2223835 |
| 162 | 1 | Marcus Pearce | 2.8082614 2.673671 2.7693706 2.3369458 2.5016947 2.3837066 2.3682225 2.795649 |
| 163 | 1 | Marcus Pearce | 2.9063463 2.5880773 2.0457468 1.8635312 2.4522712 1.5877498 2.8802161 |
| 164 | 1 | Marcus Pearce | 2.7988417 2.3125513 1.7245895 2.2404804 2.1694546 2.365556 1.5905867 1.3827317 |
| 165 | 1 | Marcus Pearce | 2.2706041 3.023884 2.2864542 2.1259797 2.713626 2.1967313 2.5721254 2.5812547 |
| 166 | 1 | Marcus Pearce | 2.8233812 2.3134546 2.6203637 2.945946 2.601433 2.1920888 2.3732007 2.440137 |
| 167 | 1 | Marcus Pearce | 2.4291563 2.3676903 2.734724 3.0283954 2.8076048 2.7796154 2.326931 2.1779459 |
| 168 | 1 | Marcus Pearce | 2.2570527 2.2688026 1.3976555 2.030298 2.640235 2.568248 2.6338177 2.157162 |
| 169 | 1 | Marcus Pearce | 2.3915367 2.7873137 2.3088667 2.2176988 2.4402564 2.8062992 2.784044 2.4296925 |
| 170 | 1 | Marcus Pearce | 2.3520193 2.6146257) |
| 171 | 1 | Marcus Pearce | </pre> |
| 172 | 1 | Marcus Pearce | |
| 173 | 1 | Marcus Pearce | h3. To write the information contents for each note of each melody in dataset 0 to a file |
| 174 | 1 | Marcus Pearce | |
| 175 | 1 | Marcus Pearce | <pre> |
| 176 | 1 | Marcus Pearce | CL-USER> (resampling:format-information-content |
| 177 | 1 | Marcus Pearce | (resampling:dataset-prediction 0 '(cpitch) '(cpintfref cpint)) |
| 178 | 1 | Marcus Pearce | "/tmp/foo.dat" |
| 179 | 1 | Marcus Pearce | 0 |
| 180 | 1 | Marcus Pearce | 3) |
| 181 | 1 | Marcus Pearce | </pre> |
| 182 | 1 | Marcus Pearce | |
| 183 | 1 | Marcus Pearce | h3. To simulate the experiments of Conklin & Witten (1995) |
| 184 | 1 | Marcus Pearce | |
| 185 | 1 | Marcus Pearce | <pre> |
| 186 | 1 | Marcus Pearce | CL-USER> (resampling:conkwit95) |
| 187 | 1 | Marcus Pearce | Simulation of the experiments of Conklin & Witten (1995, Table 4). |
| 188 | 1 | Marcus Pearce | System 1; Mean Information Content: 2.33 |
| 189 | 1 | Marcus Pearce | System 2; Mean Information Content: 2.36 |
| 190 | 1 | Marcus Pearce | System 3; Mean Information Content: 2.09 |
| 191 | 1 | Marcus Pearce | System 4; Mean Information Content: 2.01 |
| 192 | 1 | Marcus Pearce | System 5; Mean Information Content: 2.08 |
| 193 | 1 | Marcus Pearce | System 6; Mean Information Content: 1.90 |
| 194 | 1 | Marcus Pearce | System 7; Mean Information Content: 1.88 |
| 195 | 1 | Marcus Pearce | System 8; Mean Information Content: 1.86 |
| 196 | 1 | Marcus Pearce | NIL |
| 197 | 1 | Marcus Pearce | </pre> |
| 198 | 1 | Marcus Pearce | |
| 199 | 1 | Marcus Pearce | Compare with "Conklin & Witten [1995, JNMR, table 4]":http://www.sc.ehu.es/ccwbayes/members/conklin/papers/jnmr95.pdf |
| 200 | 1 | Marcus Pearce | |
| 201 | 1 | Marcus Pearce | h3. Viewpoint Selection |
| 202 | 1 | Marcus Pearce | |
| 203 | 1 | Marcus Pearce | Two functions are supplied for searching a space of viewpoints, <code>run-hill-climber</code> and <code>run-best-first</code>, each of which takes four arguments: |
| 204 | 1 | Marcus Pearce | |
| 205 | 1 | Marcus Pearce | * a list of viewpoints: the algorithm searches through the space of combinations of these viewpoints |
| 206 | 1 | Marcus Pearce | * a start state (usually nil, the empty viewpoint system) |
| 207 | 1 | Marcus Pearce | * an evaluation function returning a numeric performance metric: e.g., the mean information content of the dataset returned by <code>dataset-prediction</code> |
| 208 | 1 | Marcus Pearce | * a symbol describing which way to optimise the metric: <code>:desc</code> means lower values are better; <code>:asc</code> means greater values are better |
| 209 | 1 | Marcus Pearce | |
| 210 | 1 | Marcus Pearce | Here is an example: |
| 211 | 1 | Marcus Pearce | |
| 212 | 1 | Marcus Pearce | <pre> |
| 213 | 1 | Marcus Pearce | CL-USER> (viewpoint-selection:run-hill-climber |
| 214 | 1 | Marcus Pearce | '(:cpitch :cpintfref :cpint :contour) |
| 215 | 1 | Marcus Pearce | nil |
| 216 | 1 | Marcus Pearce | #'(lambda (viewpoints) |
| 217 | 1 | Marcus Pearce | (utils:round-to-nearest-decimal-place |
| 218 | 1 | Marcus Pearce | (resampling:output-information-content |
| 219 | 1 | Marcus Pearce | (resampling:dataset-prediction 0 '(cpitch) viewpoints :k 10 :models :both+) |
| 220 | 1 | Marcus Pearce | 1) |
| 221 | 1 | Marcus Pearce | 2)) |
| 222 | 1 | Marcus Pearce | :desc) |
| 223 | 1 | Marcus Pearce | |
| 224 | 1 | Marcus Pearce | ============================================================================= |
| 225 | 1 | Marcus Pearce | System Score |
| 226 | 1 | Marcus Pearce | ----------------------------------------------------------------------------- |
| 227 | 1 | Marcus Pearce | NIL NIL |
| 228 | 1 | Marcus Pearce | (CPITCH) 2.52 |
| 229 | 1 | Marcus Pearce | (CPINT CPITCH) 2.43 |
| 230 | 1 | Marcus Pearce | (CPINTFREF CPINT CPITCH) 2.38 |
| 231 | 1 | Marcus Pearce | ============================================================================= |
| 232 | 1 | Marcus Pearce | #S(VIEWPOINT-SELECTION::RECORD :STATE (:CPINTFREF :CPINT :CPITCH) :WEIGHT 2.38) |
| 233 | 1 | Marcus Pearce | </pre> |
| 234 | 1 | Marcus Pearce | |
| 235 | 1 | Marcus Pearce | Since this can be quite a time-consuming process, there are also functions for caching the results: |
| 236 | 1 | Marcus Pearce | |
| 237 | 1 | Marcus Pearce | <pre> |
| 238 | 1 | Marcus Pearce | (initialise-vs-cache) |
| 239 | 1 | Marcus Pearce | (load-vs-cache filename package) |
| 240 | 1 | Marcus Pearce | (store-vs-cache filename package) |
| 241 | 1 | Marcus Pearce | </pre> |
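
As a rough illustration of the three <code>detail</code> levels described for <code>RESAMPLING:OUTPUT-INFORMATION-CONTENT</code>, the following self-contained Common Lisp sketch averages information content over a toy dataset and returns the results as multiple values. It is not IDyOM's code: the names <code>information-content</code>, <code>mean</code> and <code>summarise-information-content</code> are hypothetical, the input is a plain list of event probabilities per composition rather than the output of <code>RESAMPLING:DATASET-PREDICTION</code>, and it assumes that information content is measured in bits and that the dataset-level figure is the mean of the per-composition means.

```lisp
;;; Illustrative sketch only: not IDyOM's implementation.
;;; INFORMATION-CONTENT, MEAN and SUMMARISE-INFORMATION-CONTENT are
;;; hypothetical names introduced for this example.

(defun information-content (probability)
  "Information content, in bits, of an event with the given probability."
  (- (log probability 2)))

(defun mean (numbers)
  "Arithmetic mean of a non-empty list of numbers."
  (/ (reduce #'+ numbers) (length numbers)))

(defun summarise-information-content (compositions &optional (detail 3))
  "COMPOSITIONS is a list of compositions, each a list of event probabilities.
Returns one, two or three values depending on DETAIL (1, 2 or 3)."
  (let* (;; Per-event information content, one list per composition.
         (event-ics (mapcar (lambda (events)
                              (mapcar #'information-content events))
                            compositions))
         ;; Mean information content of each composition.
         (composition-ics (mapcar #'mean event-ics))
         ;; Assumption: the dataset figure is the mean of the composition means.
         (dataset-ic (mean composition-ics)))
    (ecase detail
      (1 dataset-ic)
      (2 (values dataset-ic composition-ics))
      (3 (values dataset-ic composition-ics event-ics)))))
```

With two toy compositions whose events have probabilities (0.5 0.25) and (0.125 0.125), calling <code>(summarise-information-content '((0.5 0.25) (0.125 0.125)) 2)</code> should return a dataset mean of approximately 2.25 bits and per-composition means of approximately 1.5 and 3.0 bits, up to floating-point rounding. Use <code>multiple-value-bind</code> or <code>multiple-value-list</code> to capture the secondary values.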