Idyom » History » Version 51
Marcus Pearce, 2014-06-16 11:16 AM
h1. Running IDyOM

{{>toc}}

h2. <code>*idyom:idyom*</code>

The main workhorse function is <code>idyom:idyom</code>, which has three required arguments and a number of optional keyword arguments.

h3. Required parameters

* @dataset-id@: a dataset id, e.g. 1.
* @target-viewpoints@: a list of basic viewpoints to predict, e.g. <code>'(:cpitch :bioi)</code>.
* @source-viewpoints@: a list of viewpoints to use in prediction, e.g. <code>'((:cpintfref :cpint) :bioi)</code>.
** Passing <code>:select</code> will trigger viewpoint selection (see further options below).

See the [[List of viewpoints]] for a description of the various viewpoints available in IDyOM. A simple call to IDyOM would be:
<pre>
CL-USER> (idyom:idyom 0 '(cpitch) '(cpitch))
2.2490792
(1.9049941 2.427845 2.0234334 1.7971386 1.8213106 1.9313766 2.3758402 1.8310248
...
</pre>
This predicts the pitch values in dataset 0, based on previous pitches (cpitch). IDyOM computes the information content (IC) for each note and by default returns two values: the first is the mean note IC for the dataset, the second is a list of mean note ICs for the individual compositions. The first value is calculated as the mean of the second.
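
Because the two results come back as separate return values, a caller that wants both would typically bind them with @multiple-value-bind@. The following is a minimal sketch, assuming IDyOM is loaded, dataset 0 exists in the database, and the results are ordinary Lisp multiple values (as the REPL output above suggests). @summarise-pitch-ic@ and its variable names are hypothetical and not part of IDyOM:

```lisp
;; Sketch only: assumes IDyOM is loaded and the dataset is in the database.
;; SUMMARISE-PITCH-IC is a hypothetical helper, not an IDyOM function.
(defun summarise-pitch-ic (dataset-id)
  "Run IDyOM on DATASET-ID, print a short summary, and return both values."
  (multiple-value-bind (dataset-mean composition-means)
      (idyom:idyom dataset-id '(cpitch) '(cpitch))
    (format t "Dataset mean IC: ~A~%" dataset-mean)
    (format t "Number of compositions: ~D~%" (length composition-means))
    (values dataset-mean composition-means)))

;; Example use:
;; (summarise-pitch-ic 0)
```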

h3. Statistical modelling parameters

See "Pearce [2005, chapter 6]":http://webprojects.eecs.qmul.ac.uk/marcusp/papers/Pearce2005.pdf for further description and explanation of these parameters.

* @models@: the type of IDyOM model to use. Options are:
** @:stm@ - short-term model only, trained on the current composition;
** @:ltm@ - long-term model only, trained on the pretraining and resampling training data;
** @:ltm+@ - the long-term model, with additional incremental training on the test set;
** @:both@ - a combination of :stm and :ltm;
** @:both+@ - a combination of :stm and :ltm+ (this is the default).

The LTM and STM can be configured using the @ltmo@ and @stmo@ parameters. Each accepts a property list with the following properties; the default is used for any property that is omitted, and for all of them if no property list is supplied:
* @:order-bound@: an integer indicating the bound on the order of the model, i.e. the number of past events used by the model. The default is @nil@, no bound.
* @:mixtures@: whether to use mixtures for the model. (Default @t@.)
* @:update-exclusion@: whether to use update exclusion. (LTM default @nil@, STM default @t@.)
* @:escape@: the model's escape method. One of @:a :b :c :d :x@. (LTM default @:c@, STM default @:x@.)

For example, the following command would combine the STM and LTM, without incremental training for the latter and with an STM order bound of 4:
<pre>
CL-USER> (idyom:idyom 1 '(cpitch) '(cpitch) :models :both :stmo '(:order-bound 4))
</pre>
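
Both property lists can be supplied in the same call. The sketch below uses only the properties documented above and assumes IDyOM is loaded and dataset 1 is in the database. The particular values (for instance an LTM order bound of 2) are arbitrary illustrations, not recommended settings:

```lisp
;; Sketch only: assumes IDyOM is loaded and dataset 1 is in the database.
;; Any property left out of a list keeps its documented default.
(idyom:idyom 1 '(cpitch) '(cpitch)
             :models :both+
             ;; LTM: bounded order, escape method C, no update exclusion
             :ltmo '(:order-bound 2 :escape :c :update-exclusion nil)
             ;; STM: bounded order, escape method X, mixtures enabled
             :stmo '(:order-bound 4 :escape :x :mixtures t))
```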

h3. Training parameters

When using IDyOM to estimate note IC for a given dataset, the long-term models can be trained on other datasets (pretraining) and/or on the current dataset via resampling (cross-validation). In the latter case, the dataset is partitioned into a training set (used to train the LTMs) and a test set (for which note IC is computed). This split is called a fold, and the modelling process can be repeated with a number of different folds in order to model the entire dataset.

* @pretraining-ids@: a list of dataset ids used to pretrain the long-term models (done before resampling). Note that if @pretraining-ids@ are supplied for an STM (i.e., <code>:models :stm</code>), the pretraining datasets are used to set the viewpoint domains (alphabets) for the models but not to train the models themselves (because they are short-term models).
* @k@: the number of resampling (cross-validation) folds to use. The default value is 10.
** @1@ = no resampling, but also no training set unless the models are pretrained;
** @:full@ = as many folds as there are compositions in the dataset.
* @resampling-indices@: a list of numbers designating which resampling folds to use, i.e. a subset of @[0, 1, ..., k - 1]@. By default, all folds are used.

*Note* that cross-validation only applies to the dataset being analysed (i.e., the one specified by the <code>dataset-id</code> argument). If a value of k=1 is supplied, the long-term models are not trained, unless a pretraining set is used.
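
As an illustration of how these parameters combine, the two calls below are a sketch rather than a recommended configuration. They assume IDyOM is loaded; the dataset ids 1, 2 and 3 are placeholders for datasets that exist in your own database:

```lisp
;; Sketch only: dataset ids 1, 2 and 3 are placeholders.

;; Pretrain the LTM on datasets 2 and 3, then model dataset 1 with
;; 5-fold cross-validation, running only the first two folds (0 and 1).
(idyom:idyom 1 '(cpitch) '(cpitch)
             :models :ltm
             :pretraining-ids '(2 3)
             :k 5
             :resampling-indices '(0 1))

;; No resampling: with :k 1 the LTM is trained on the pretraining
;; datasets only, and dataset 1 is used purely as the test set.
(idyom:idyom 1 '(cpitch) '(cpitch)
             :models :ltm
             :pretraining-ids '(2 3)
             :k 1)
```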

h3. Viewpoint selection parameters

* @basis@: Identifies a set of viewpoints to be used in viewpoint selection, i.e. it will attempt to find the 'best' viewpoint system combining these, including by linking them. The parameter can be a list or one of the following keywords:
** @:pitch-full@ - The basis is a list of viewpoints useful for predicting pitch in Western music: cpitch, cpitch-class, tessitura, cpint, cpint-size, cpcint, cpcint-size, contour, newcontour, cpintfip, cpintfref, inscale.
** @:pitch-short@ - A shorter version of the above: cpitch, cpitch-class, cpint, cpint-size, contour, newcontour.
** @:bioi@ - For predicting Inter-Onset Interval (IOI): bioi, bioi-ratio, bioi-contour.
** @:onset@ - For predicting onset: onset, ioi, ioi-ratio, ioi-contour, metaccent.
** @:auto@ - The basis is chosen to be the set of viewpoints that are defined in terms of one or more of the target viewpoints. This is the default.
* @dp@: the number of decimal places to use when comparing information contents in viewpoint selection. Full floating point precision is used if this is @nil@ (the default).
* @max-links@: the maximum number of links to use when creating linked viewpoints in viewpoint selection. The default is 2.
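
To give a feel for how @basis@ and @max-links@ shape the search space, the sketch below enumerates each basis viewpoint on its own plus every combination of up to @max-links@ basis viewpoints. This is not IDyOM's own code: @combinations@ and @candidate-viewpoints@ are hypothetical helpers, and the assumption that every combination becomes a candidate is inferred only from the Viewpoint Selection example on this page, where six basis viewpoints with a maximum of 2 links yield 21 candidates. IDyOM may order or filter candidates differently.

```lisp
;; Sketch only: hypothetical helpers, not part of IDyOM.
(defun combinations (items n)
  "Return all N-element combinations of ITEMS, preserving their order."
  (cond ((zerop n) (list nil))
        ((null items) nil)
        (t (append
            ;; combinations that include the first item
            (mapcar (lambda (tail) (cons (first items) tail))
                    (combinations (rest items) (1- n)))
            ;; combinations that skip the first item
            (combinations (rest items) n)))))

(defun candidate-viewpoints (basis max-links)
  "Single viewpoints from BASIS, then linked viewpoints of 2..MAX-LINKS elements."
  (append (copy-list basis)
          (loop for n from 2 to max-links
                append (combinations basis n))))

;; The :pitch-short basis with the default max-links of 2:
(defparameter *candidates*
  (candidate-viewpoints
   '(cpitch cpitch-class cpint cpint-size contour newcontour) 2))

;; 6 single viewpoints + 15 pairs = 21 candidates
(length *candidates*)
```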

h3. Output parameters

* @output-path@: a string indicating a directory in which to write the output
** see [[IDyOM output]] for an explanation of the output files
** if a value of <code>nil</code> is given, output is written to the console (see example below)
* @detail@: an integer which determines how the information content is averaged in the output:
** 1: averaged over the entire dataset
** 2: and also averaged over each composition
** 3: and also with raw IC values for each event in each composition

h2. Examples

h3. Mean melody IC

To get the mean information content (IC) of each composition in dataset 0 as a list, use <code>:detail 2</code>. The first value returned is the average IC for the whole dataset; the second is a list of average ICs, one for each composition in the dataset. If <code>:detail 3</code> is specified, the output contains a third value: a list of lists of ICs, one for each event in each composition in the dataset.

<pre>
CL-USER> (idyom:idyom 0 '(cpitch) '(cpintfref cpint) :detail 2)
2.493305
(2.1368716 2.8534691 2.6938546 2.6491673 2.4993074 2.6098127 2.7728052 2.772861
2.5921957 2.905856 2.3591626 2.957503 2.4042292 2.7562473 2.3996017 2.8073587
2.114944 1.7434102 2.2310295 2.6374347 2.361792 1.9476132 2.501488 2.5472867
2.1056154 2.8225484 2.134257 2.9162033 3.0715692 2.9012227 2.7291088 2.866882
2.8795822 2.4571223 2.9277062 2.7861307 2.6623116 2.3304622 2.4217033
2.0556943 2.4048684 2.914848 2.7182267 3.0894585 2.873869 1.8821808 2.640174
2.8165438 2.5423129 2.3011856 3.1477294 2.655349 2.5216308 2.0667994 3.2579045
2.573013 2.6035044 2.202191 2.622113 2.2621205 2.3617425 2.7526956 2.3281655
2.9357266 2.3372407 3.1848125 2.67367 2.1906006 2.7835917 2.6332111 3.206142
2.1426969 2.194259 2.415167 1.9769101 2.0870917 2.7844474 2.2373738 2.772138
2.9702199 1.724408 2.473073 2.2464263 2.2452457 2.688889 2.6299863 2.2223835
2.8082614 2.673671 2.7693706 2.3369458 2.5016947 2.3837066 2.3682225 2.795649
2.9063463 2.5880773 2.0457468 1.8635312 2.4522712 1.5877498 2.8802161
2.7988417 2.3125513 1.7245895 2.2404804 2.1694546 2.365556 1.5905867 1.3827317
2.2706041 3.023884 2.2864542 2.1259797 2.713626 2.1967313 2.5721254 2.5812547
2.8233812 2.3134546 2.6203637 2.945946 2.601433 2.1920888 2.3732007 2.440137
2.4291563 2.3676903 2.734724 3.0283954 2.8076048 2.7796154 2.326931 2.1779459
2.2570527 2.2688026 1.3976555 2.030298 2.640235 2.568248 2.6338177 2.157162
2.3915367 2.7873137 2.3088667 2.2176988 2.4402564 2.8062992 2.784044 2.4296925
2.3520193 2.6146257)
</pre>

h3. Write note IC to file

To write the information contents for each note of each melody in dataset 0 to a file:

<pre>
CL-USER> (idyom:idyom 0 '(cpitch) '(cpintfref cpint) :detail 3 :output-path "/tmp/")
</pre>

h3. Viewpoint Selection

To run viewpoint selection for the short-term model on dataset 17, comparing information contents to 3 decimal places:

<pre>
CL-USER> (idyom:idyom 17 '(cpitch) :select :models :stm :dp 3)
Selecting viewpoints for the STM model on dataset 17 predicting viewpoints (CPITCH).
Generating candidate viewpoints from: (CPITCH CPITCH-CLASS CPINT
CPINT-SIZE CONTOUR NEWCONTOUR)
Max. links 2, whitelist (ANY), blacklist NIL
Candidate viewpoints: (CPITCH CPITCH-CLASS CPINT CPINT-SIZE CONTOUR
NEWCONTOUR (CONTOUR NEWCONTOUR)
(CPINT-SIZE NEWCONTOUR) (CPINT-SIZE CONTOUR)
(CPINT NEWCONTOUR) (CPINT CONTOUR)
(CPINT CPINT-SIZE) (CPITCH-CLASS NEWCONTOUR)
(CPITCH-CLASS CONTOUR) (CPITCH-CLASS CPINT-SIZE)
(CPITCH-CLASS CPINT) (CPITCH NEWCONTOUR)
(CPITCH CONTOUR) (CPITCH CPINT-SIZE) (CPITCH CPINT)
(CPITCH CPITCH-CLASS))

Selected system NIL, mean IC = NIL

Selected system ((CPITCH-CLASS CONTOUR)), mean IC = 3.0302427
=======================================================================================
The selected viewpoint system with a mean IC of 3.0302427 is ((CPITCH-CLASS
CONTOUR))
3.0302427
(3.169925 3.169925 3.0849624 3.0849624 2.9886398 2.9886398 2.8774438 2.8774438)
((3.169925 3.169925) (3.169925 3.169925) (3.169925 3.0) (3.169925 3.0)
(3.169925 2.807355) (3.169925 2.807355) (3.169925 2.5849626)
(3.169925 2.5849626))
</pre>
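
The three values printed at the end of this output nest in the way described for the return values of <code>idyom:idyom</code>: averaging each inner list of event ICs gives the per-composition means, and averaging those gives the dataset mean. The following self-contained sketch checks this using the event ICs copied from the output above. @mean@ and the variable names are illustrative helpers, not part of IDyOM:

```lisp
;; Event-level ICs copied from the third value in the output above.
(defparameter *event-ics*
  '((3.169925 3.169925) (3.169925 3.169925) (3.169925 3.0) (3.169925 3.0)
    (3.169925 2.807355) (3.169925 2.807355) (3.169925 2.5849626)
    (3.169925 2.5849626)))

(defun mean (numbers)
  "Arithmetic mean of a non-empty list of numbers."
  (/ (reduce #'+ numbers) (length numbers)))

;; Corresponds to the second value: one mean IC per composition.
(defparameter *composition-means* (mapcar #'mean *event-ics*))

;; Corresponds to the first value: the mean of the composition means.
(defparameter *dataset-mean* (mean *composition-means*))

(format t "Composition means: ~A~%" *composition-means*)
(format t "Dataset mean: ~A~%" *dataset-mean*)
```

Up to floating-point rounding, the results agree with the second and first values printed by IDyOM (3.0302427 for the dataset mean).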

h3. Conklin & Witten (1995)

To simulate the experiments of Conklin & Witten (1995):

<pre>
CL-USER> (idyom:conkwit95)
Simulation of the experiments of Conklin & Witten (1995, Table 4).
System 1; Mean Information Content: 2.33
System 2; Mean Information Content: 2.36
System 3; Mean Information Content: 2.09
System 4; Mean Information Content: 2.01
System 5; Mean Information Content: 2.08
System 6; Mean Information Content: 1.90
System 7; Mean Information Content: 1.88
System 8; Mean Information Content: 1.86
NIL
</pre>

Compare with "Conklin & Witten [1995, JNMR, table 4]":http://www.sc.ehu.es/ccwbayes/members/conklin/papers/jnmr95.pdf