
Marcus Pearce, 2018-08-01 02:11 PM

h1. Running IDyOM

{{>toc}}

h2. <code>*idyom:idyom*</code>

The top-level entry point is <code>idyom:idyom</code>, which takes three required arguments and a number of optional keyword arguments.

h3. Required parameters

* @dataset-id@: a dataset id (an integer, e.g. 0)
* @target-viewpoints@: a list of basic viewpoints to predict, e.g. <code>'(cpitch)</code> or <code>'(cpitch onset)</code>
* @source-viewpoints@: a list of viewpoints to use in prediction, e.g. <code>'((cpintfref cpint) ioi)</code> to use two viewpoints (cpintfref linked with cpint, and ioi), or <code>'(cpintfref cpint ioi)</code> to use three unlinked viewpoints (cpintfref, cpint and ioi)
** Passing <code>:select</code> instead of a list triggers viewpoint selection (see the viewpoint selection parameters below)
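
The difference between linked and unlinked source viewpoints comes down to list nesting. As a rough illustration only, the plain Common Lisp sketch below labels each element of a source viewpoint list. @describe-source-viewpoints@ is a hypothetical helper written for this page and is not part of IDyOM.

<pre>
;; Hypothetical helper (not part of IDyOM): label each element of a
;; source-viewpoints list as a single viewpoint or a linked viewpoint.
(defun describe-source-viewpoints (sources)
  (mapcar (lambda (source)
            (if (listp source)
                (list :linked source)    ; a nested list links its members
                (list :single source)))  ; a bare symbol is used on its own
          sources))

(describe-source-viewpoints '((cpintfref cpint) ioi))
;; => ((:LINKED (CPINTFREF CPINT)) (:SINGLE IOI))

(describe-source-viewpoints '(cpintfref cpint ioi))
;; => ((:SINGLE CPINTFREF) (:SINGLE CPINT) (:SINGLE IOI))
</pre>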

See the [[List of viewpoints]] for a description of the various viewpoints available in IDyOM.

A simple call to IDyOM would be:
<pre>
CL-USER> (idyom:idyom 18 '(cpitch) '(cpitch))
3.4519181
(3.7459166 3.7913148 3.5499783 3.2027783 3.5338237 3.4903128 3.4759412 2.8252819)
((3.586735 3.8160942 6.8622913 3.3496706 3.3494937 2.0084002 2.4969249 6.04035
  2.5135455 3.611938 5.617945 4.6973023 3.1194212 4.153834 2.328312 2.863982
  3.5802953 2.198141 4.9777308)
 (3.7181098 6.350471 4.8110385 4.712717 3.4230165 3.5964172 1.7951053 1.3934463
  5.252618 3.8527348 3.8151293 1.9286131 6.499048 4.1175685 1.8474439
  3.5475667)
 (4.031454 2.9693763 4.0899825 2.9261 1.24961 4.785756 2.1267474 2.8533628
  3.0932114 4.859682 3.4375515 3.6497543 5.485611 5.512378 3.1457896 2.5832813)
 (3.8783064 4.2615805 3.4476812 3.516016 3.12956 3.3235698 2.8970401 4.3073235
  3.7609475 2.2025976 3.6883376 2.3482933 1.623888 1.1030108 3.648819
  4.1074767)
 (3.8407595 4.2985744 4.4947147 4.7583337 4.5309863 3.1522655 3.7750213
  5.0408797 3.7760868 3.78026 2.1053405 4.7487717 3.6750562 3.6836185 1.5774668
  4.2355194 2.109648 2.879462 1.1504707 2.2890074 4.3080635)
 (3.7692716 3.8107364 4.005191 2.9590254 1.5087044 1.722277 4.741388 4.64542
  1.9565314 4.9309998 4.805511 3.6190488 3.2736812 1.834998 3.5858262 5.5998607
  3.7004514 2.1804154 1.924814 4.200107 4.5223045)
 (3.8725564 4.1996803 3.89821 3.8978398 2.9178553 1.5882304 1.413055 2.2741494
  3.9745965 3.4585989 7.8948627 2.3402674 2.172695 6.2782865 3.0781124
  1.6451169 1.7057719 5.9570546)
 (3.9283562 1.877443 2.0237117 4.214842 4.316505 2.8062139 2.4793828 2.6004548
  5.554006 2.514725 1.8137112 1.560705 2.7592404 2.4539397 3.0249727 1.2541932
  1.048938 1.128769 6.32025))
</pre>
This predicts the pitch values in dataset 18 (containing 8 short melodies) from the preceding pitches (i.e., the target and source viewpoints are both <code>cpitch</code>). IDyOM computes the information content (IC) of each note and, by default, returns three values: the first is the mean note IC for the whole dataset, the second is a list of mean ICs for the individual compositions, and the third is a list of lists containing the IC of each note in each composition. The <code>:detail</code> parameter controls the level of detail in the output, while the <code>:output-path</code> parameter allows the user to generate a spreadsheet containing detailed model output (see below for further details).
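
The three return values are related in a simple way. In the example above, each composition mean is the mean of that composition's note ICs, and the first value agrees, to within floating-point rounding, with the mean of the eight composition means. The sketch below reproduces that relationship in plain Common Lisp using made-up IC values. @mean@ and @summarise-ics@ are hypothetical helpers for illustration, not IDyOM functions.

<pre>
;; Hypothetical helpers (not part of IDyOM).
(defun mean (numbers)
  (/ (reduce #'+ numbers) (length numbers)))

(defun summarise-ics (note-ics)
  "Return three values shaped like the default output described above:
the mean of the composition means, the list of composition means, and
the note-level ICs themselves."
  (let ((composition-means (mapcar #'mean note-ics)))
    (values (mean composition-means) composition-means note-ics)))

;; Two made-up compositions with 2 and 3 notes.
(multiple-value-bind (dataset-mean composition-means note-ics)
    (summarise-ics '((1.0 3.0) (2.0 4.0 6.0)))
  (declare (ignore note-ics))
  (list dataset-mean composition-means))
;; => (3.0 (2.0 4.0))
</pre>

Note that in this sketch the dataset value is the mean of the composition means (3.0), which differs from the mean over all five notes (3.2) whenever compositions have different lengths.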

h3. Statistical modelling parameters

See "Pearce [2005, chapter 6]":http://webprojects.eecs.qmul.ac.uk/marcusp/papers/Pearce2005.pdf for further description and explanation of these parameters.

* @models@: the type of IDyOM model to use. Options are:
** @:stm@ - the short-term model only, trained on the current composition;
** @:ltm@ - the long-term model only, trained on the pretraining and resampling training data;
** @:ltm+@ - the long-term model, with additional incremental training on the test set;
** @:both@ - a combination of @:stm@ and @:ltm@;
** @:both+@ - a combination of @:stm@ and @:ltm+@ (this is the default).

The LTM and STM can be configured using the @ltmo@ and @stmo@ parameters respectively. Each accepts a property list with the following properties; the default is used for any property that is omitted, and for all properties if no property list is supplied:

* @:order-bound@: an integer giving the bound on the order of the model, i.e. the number of past events used by the model. The default is @nil@ (no bound).
* @:mixtures@: whether to use mixtures for the model. (Default @t@.)
* @:update-exclusion@: whether to use update exclusion. (LTM default @nil@, STM default @t@.)
* @:escape@: the model's escape method, one of @:a@, @:b@, @:c@, @:d@ or @:x@. (LTM default @:c@, STM default @:x@.)

For example, the following command combines the STM and LTM, without incremental training for the latter and with an STM order bound of 4:
<pre>
CL-USER> (idyom:idyom 1 '(cpitch) '(cpitch) :models :both :stmo '(:order-bound 4))
</pre>
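
To make the "defaults are used if a property is omitted" rule concrete, here is a minimal sketch of how a partial property list could be completed from the documented defaults. The names @*ltm-defaults*@, @*stm-defaults*@ and @resolve-options@ are hypothetical; this is not IDyOM's own option handling, only an illustration of the behaviour described above.

<pre>
;; Defaults as documented above. The variable and function names are
;; hypothetical; this is not IDyOM's own code.
(defparameter *ltm-defaults*
  '(:order-bound nil :mixtures t :update-exclusion nil :escape :c))
(defparameter *stm-defaults*
  '(:order-bound nil :mixtures t :update-exclusion t :escape :x))

(defun resolve-options (supplied defaults)
  "Return a full property list, taking each value from SUPPLIED when the
property is present there and from DEFAULTS otherwise."
  (loop for (key default) on defaults by #'cddr
        append (list key (getf supplied key default))))

(resolve-options '(:order-bound 4) *stm-defaults*)
;; => (:ORDER-BOUND 4 :MIXTURES T :UPDATE-EXCLUSION T :ESCAPE :X)

(resolve-options '() *ltm-defaults*)
;; => (:ORDER-BOUND NIL :MIXTURES T :UPDATE-EXCLUSION NIL :ESCAPE :C)
</pre>

Because @getf@ only falls back to its default when the property is absent, an explicit @nil@ such as <code>'(:mixtures nil)</code> is kept rather than replaced by the default.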

h3. Training parameters

When using IDyOM to estimate note IC for a given dataset, the long-term models can be trained on other datasets (pretraining) and/or on the current dataset via resampling (cross-validation). In the latter case, the dataset is partitioned into a training set (used to train the LTMs) and a test set (for which note IC is computed). This split is called a fold, and the modelling process can be repeated with a number of different folds in order to model the entire dataset.

* @pretraining-ids@: a list of dataset ids used to pretrain the long-term models (done before resampling). Note that if @pretraining-ids@ are supplied for an STM (i.e., <code>:models :stm</code>), the pretraining datasets are used to set the viewpoint domains (alphabets) for the models but not to train the models themselves (because they are short-term models).
* @k@: the number of resampling (cross-validation) folds to use. The default value is 10.
** @1@ = no resampling, but also no training set unless the models are pretrained;
** @:full@ = as many folds as there are compositions in the dataset.
* @resampling-indices@: a list of numbers designating which resampling folds to use, i.e. a subset of @[0, 1, ..., k - 1]@. By default, all folds are used.

*Note* that cross-validation only applies to the dataset being analysed (i.e., the one specified by the <code>dataset-id</code> argument). If a value of k=1 is supplied, the long-term models are not trained unless a pretraining set is used.
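
The fold structure described above can be sketched in a few lines of plain Common Lisp. This is an illustration under stated assumptions, not IDyOM's resampling code: @make-folds@ and @select-folds@ are hypothetical names, and compositions are assigned to folds round-robin by index here, whereas IDyOM divides the dataset randomly.

<pre>
;; Hypothetical sketch (not IDyOM's code): split N compositions, indexed
;; from 0, into K folds. K may be :full, meaning one fold per composition.
(defun make-folds (n k)
  (let ((k (if (eq k :full) n k))
        (ids (loop for i below n collect i)))
    (loop for fold below k
          collect (flet ((in-test-p (i) (= (mod i k) fold)))
                    (list :test (remove-if-not #'in-test-p ids)
                          :train (remove-if #'in-test-p ids))))))

;; Keep only the folds named by RESAMPLING-INDICES (a subset of 0..k-1).
(defun select-folds (folds resampling-indices)
  (mapcar (lambda (index) (nth index folds)) resampling-indices))

(make-folds 6 3)
;; => ((:TEST (0 3) :TRAIN (1 2 4 5))
;;     (:TEST (1 4) :TRAIN (0 2 3 5))
;;     (:TEST (2 5) :TRAIN (0 1 3 4)))

(make-folds 3 1)
;; => ((:TEST (0 1 2) :TRAIN NIL))
</pre>

The second call shows why k=1 leaves the long-term model untrained unless a pretraining set is supplied: the single fold puts every composition in the test set and none in the training set.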

h3. Viewpoint selection parameters

These parameters only have an effect when the source viewpoint argument is <code>:select</code>, which triggers viewpoint selection: a search for an optimal set of viewpoints using a hill-climbing procedure.

* @basis@: the set of viewpoints to be used in viewpoint selection, i.e. the search attempts to find the 'best' viewpoint system that combines these viewpoints, including by linking them. The parameter can be a list or one of the following keywords:
** @:pitch-full@ - viewpoints useful for predicting pitch in Western music: cpitch, cpitch-class, tessitura, cpint, cpint-size, cpcint, cpcint-size, contour, newcontour, cpintfip, cpintfref, inscale;
** @:pitch-short@ - a shorter version of the above: cpitch, cpitch-class, cpint, cpint-size, contour, newcontour;
** @:bioi@ - for predicting inter-onset interval (IOI): bioi, bioi-ratio, bioi-contour;
** @:onset@ - for predicting onset: onset, ioi, ioi-ratio, ioi-contour, metaccent;
** @:auto@ - the set of viewpoints that are defined in terms of one or more of the target viewpoints (this is the default).
* @dp@: the number of decimal places to use when comparing information contents in viewpoint selection. Full floating-point precision is used if this is @nil@ (the default).
* @max-links@: the maximum number of links to use when creating linked viewpoints in viewpoint selection. The default is 2.
* @min-links@: the minimum number of links to use when creating linked viewpoints in viewpoint selection. The default is 2.
* @viewpoint-selection-output@: a file path to which output is written for every viewpoint system considered during viewpoint selection. The default is @nil@, meaning that no files are written.
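
To give a feel for hill climbing and for the role of @dp@, here is a deliberately simplified sketch in plain Common Lisp. It is not IDyOM's search: it only ever adds viewpoints, it treats a lower mean IC as better (an assumption of this sketch), and @round-to@, @select-viewpoints@ and @toy-score@ are hypothetical names. The scoring function is a made-up stand-in for the mean IC that a real model would produce.

<pre>
;; Hypothetical sketch, not IDyOM's implementation.
(defun round-to (x dp)
  "Round X to DP decimal places (as a rational); return X unchanged if DP is NIL."
  (if dp
      (let ((scale (expt 10 dp)))
        (/ (round (* x scale)) scale))
      x))

(defun select-viewpoints (candidates score &key dp)
  "Greedy forward search: on each pass add the candidate that gives the
lowest SCORE, and stop when no addition lowers the (rounded) score."
  (let ((selected '())
        (best nil))
    (loop
      (let ((improved nil))
        (dolist (candidate candidates)
          (unless (member candidate selected :test #'equal)
            (let* ((system (cons candidate selected))
                   (ic (round-to (funcall score system) dp)))
              (when (or (null best) (< ic best))
                (setf best ic
                      improved system)))))
        (if improved
            (setf selected improved)
            (return (values selected best)))))))

;; Made-up score: each viewpoint lowers the score by a fixed amount, and
;; every extra viewpoint costs 1/5, so adding viewpoints stops paying off.
(defun toy-score (system)
  (let ((benefit '((cpitch . 1/2) (cpint . 3/10) (contour . 1/10))))
    (+ (- 3 (reduce #'+ system
                    :key (lambda (v) (cdr (assoc v benefit)))))
       (* 1/5 (length system)))))

(select-viewpoints '(cpitch cpint contour) #'toy-score)
;; => (CPINT CPITCH), 13/5

(select-viewpoints '(cpitch cpint contour) #'toy-score :dp 0)
;; => (CPITCH), 3
</pre>

With @:dp 0@ all the toy scores round to 3, so after the first viewpoint no addition counts as an improvement and the search stops earlier. This is the kind of effect a small @dp@ can have when candidate systems differ only slightly.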

h3. Output parameters

* <code>output-path</code>: a string indicating a directory in which to write the output
** see [[IDyOM output]] for an explanation of the output files
** if a value of <code>nil</code> is given, the information content (IC) is written to the console (see the examples below)
* <code>detail</code>: an integer that determines how the information content is averaged in the output:
** 1: averaged over the entire dataset
** 2: as 1, and also averaged over each composition
** 3: as 2, and also with raw IC values for each event in each composition
* <code>overwrite</code>: whether to overwrite an output file that already exists. The default (<code>nil</code>) is not to overwrite; pass <code>t</code> to overwrite output files.
* <code>separator</code>: a string defining the character used to delimit columns in the output file (the default is <code>" "</code>; use <code>","</code> for CSV)

h3. Caching parameters

* <code>use-resampling-set-cache?</code>: a Boolean (<code>t</code> or <code>nil</code>) specifying whether to cache resampling sets
** default: <code>t</code> (so that the random division of the dataset into k folds is stored and reused)
* <code>use-ltms-cache?</code>: a Boolean (<code>t</code> or <code>nil</code>) controlling whether long-term models are stored and reused
** default: <code>t</code>

h2. Examples

h3. Mean melody IC

To get the mean information content (IC) of each composition in dataset 0 as a list, call <code>idyom:idyom</code> with <code>:detail 2</code>. The first value returned is the mean IC for the whole dataset; the second is a list of mean ICs, one for each composition in the dataset. If <code>:detail 3</code> were specified instead, the output would contain a third value: a list of lists holding the IC of each event in each composition in the dataset.

<pre>
CL-USER> (idyom:idyom 0 '(cpitch) '(cpintfref cpint) :detail 2)
2.493305
(2.1368716 2.8534691 2.6938546 2.6491673 2.4993074 2.6098127 2.7728052 2.772861
 2.5921957 2.905856 2.3591626 2.957503 2.4042292 2.7562473 2.3996017 2.8073587
 2.114944 1.7434102 2.2310295 2.6374347 2.361792 1.9476132 2.501488 2.5472867
 2.1056154 2.8225484 2.134257 2.9162033 3.0715692 2.9012227 2.7291088 2.866882
 2.8795822 2.4571223 2.9277062 2.7861307 2.6623116 2.3304622 2.4217033
 2.0556943 2.4048684 2.914848 2.7182267 3.0894585 2.873869 1.8821808 2.640174
 2.8165438 2.5423129 2.3011856 3.1477294 2.655349 2.5216308 2.0667994 3.2579045
 2.573013 2.6035044 2.202191 2.622113 2.2621205 2.3617425 2.7526956 2.3281655
 2.9357266 2.3372407 3.1848125 2.67367 2.1906006 2.7835917 2.6332111 3.206142
 2.1426969 2.194259 2.415167 1.9769101 2.0870917 2.7844474 2.2373738 2.772138
 2.9702199 1.724408 2.473073 2.2464263 2.2452457 2.688889 2.6299863 2.2223835
 2.8082614 2.673671 2.7693706 2.3369458 2.5016947 2.3837066 2.3682225 2.795649
 2.9063463 2.5880773 2.0457468 1.8635312 2.4522712 1.5877498 2.8802161
 2.7988417 2.3125513 1.7245895 2.2404804 2.1694546 2.365556 1.5905867 1.3827317
 2.2706041 3.023884 2.2864542 2.1259797 2.713626 2.1967313 2.5721254 2.5812547
 2.8233812 2.3134546 2.6203637 2.945946 2.601433 2.1920888 2.3732007 2.440137
 2.4291563 2.3676903 2.734724 3.0283954 2.8076048 2.7796154 2.326931 2.1779459
 2.2570527 2.2688026 1.3976555 2.030298 2.640235 2.568248 2.6338177 2.157162
 2.3915367 2.7873137 2.3088667 2.2176988 2.4402564 2.8062992 2.784044 2.4296925
 2.3520193 2.6146257)
</pre>
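
A list of composition means like the one above is often easier to read once it is ranked. The following plain Common Lisp sketch pairs each mean with its position in the list (counting from 0) and sorts by IC. @rank-by-mean-ic@ is a hypothetical helper, not an IDyOM function, and the input is just the first four values from the output above.

<pre>
;; Hypothetical helper (not part of IDyOM).
(defun rank-by-mean-ic (composition-means)
  "Return (position . mean-ic) pairs, highest mean IC first."
  (let ((indexed (loop for ic in composition-means
                       for index from 0
                       collect (cons index ic))))
    (sort indexed #'> :key #'cdr)))

(mapcar #'car
        (rank-by-mean-ic '(2.1368716 2.8534691 2.6938546 2.6491673)))
;; => (1 2 3 0)
</pre>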

h3. Write note IC to file

To write the information content of each note of each melody in dataset 0 to a file:

<pre>
CL-USER> (idyom:idyom 0 '(cpitch) '((cpintfref cpint)) :detail 3 :output-path "/tmp/")
</pre>

See [[IDyOM output]] for a description of the output files.

h3. Viewpoint selection

To select source viewpoints for the STM on dataset 17, predicting <code>cpitch</code> and comparing information contents to 3 decimal places:

<pre>
CL-USER> (idyom:idyom 17 '(cpitch) :select :models :stm :dp 3)
Selecting viewpoints for the STM model on dataset 17 predicting viewpoints (CPITCH).
Generating candidate viewpoints from: (CPITCH CPITCH-CLASS CPINT
                                       CPINT-SIZE CONTOUR NEWCONTOUR)
Max. links 2, whitelist (ANY), blacklist NIL
Candidate viewpoints: (CPITCH CPITCH-CLASS CPINT CPINT-SIZE CONTOUR
                       NEWCONTOUR (CONTOUR NEWCONTOUR)
                       (CPINT-SIZE NEWCONTOUR) (CPINT-SIZE CONTOUR)
                       (CPINT NEWCONTOUR) (CPINT CONTOUR)
                       (CPINT CPINT-SIZE) (CPITCH-CLASS NEWCONTOUR)
                       (CPITCH-CLASS CONTOUR) (CPITCH-CLASS CPINT-SIZE)
                       (CPITCH-CLASS CPINT) (CPITCH NEWCONTOUR)
                       (CPITCH CONTOUR) (CPITCH CPINT-SIZE) (CPITCH CPINT)
                       (CPITCH CPITCH-CLASS))

Selected system NIL, mean IC = NIL

Selected system ((CPITCH-CLASS CONTOUR)), mean IC = 3.0302427
 =======================================================================================
The selected viewpoint system with a mean IC of 3.0302427 is ((CPITCH-CLASS
                                                               CONTOUR))
3.0302427
(3.169925 3.169925 3.0849624 3.0849624 2.9886398 2.9886398 2.8774438 2.8774438)
((3.169925 3.169925) (3.169925 3.169925) (3.169925 3.0) (3.169925 3.0)
 (3.169925 2.807355) (3.169925 2.807355) (3.169925 2.5849626)
 (3.169925 2.5849626))
</pre>
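
The candidate list in the log above is consistent with taking each of the six basis viewpoints on its own plus every pair of them as a linked viewpoint (6 + 15 = 21 candidates, with "Max. links 2"). The sketch below reproduces that count in plain Common Lisp. @pairs@ and @candidate-viewpoints@ are hypothetical helpers, and the ordering of the result differs from IDyOM's log; this is an illustration of the combinatorics, not IDyOM's candidate generator.

<pre>
;; Hypothetical helpers (not part of IDyOM).
(defun pairs (items)
  "All unordered pairs of distinct ITEMS, as two-element lists."
  (loop for (a . tail) on items
        append (loop for b in tail collect (list a b))))

(defun candidate-viewpoints (basis)
  "Every basis viewpoint on its own, followed by every linked pair."
  (append basis (pairs basis)))

(pairs '(cpitch cpint contour))
;; => ((CPITCH CPINT) (CPITCH CONTOUR) (CPINT CONTOUR))

(length (candidate-viewpoints
         '(cpitch cpitch-class cpint cpint-size contour newcontour)))
;; => 21
</pre>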

h3. Conklin & Witten (1995)

To simulate the experiments of Conklin & Witten (1995):

<pre>
CL-USER> (idyom:conkwit95)
Simulation of the experiments of Conklin & Witten (1995, Table 4).
System 1; Mean Information Content: 2.33
System 2; Mean Information Content: 2.36
System 3; Mean Information Content: 2.09
System 4; Mean Information Content: 2.01
System 5; Mean Information Content: 2.08
System 6; Mean Information Content: 1.90
System 7; Mean Information Content: 1.88
System 8; Mean Information Content: 1.86
NIL
</pre>

Compare with "Conklin & Witten [1995, JNMR, table 4]":http://www.sc.ehu.es/ccwbayes/members/conklin/papers/jnmr95.pdf

h3. Identifying melodic grouping boundaries (Pearce et al., 2010, _Perception_, 39, 1367-1391)

This involves applying a peak-picking algorithm to the note-level output of <code>idyom</code> (its third return value). For example:

<pre>
CL-USER> (multiple-value-bind (d1 d2 d3) 
             (idyom:idyom 19 '(cpitch) '(cpitch) :k 6 :pretraining-ids '(3) :models :both+)
           (declare (ignore d1 d2))
           (mapcar #'segmentation:peak-picker d3))
((0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0)
 (0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0
  0 0 0 0 0 0 1 0 0)
 (0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0)
 (0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0)
 (0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0)
 (0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0))
CL-USER> 
</pre>

In the output, a 1 indicates that the event follows a predicted grouping boundary (e.g., it is the first event in a new phrase), while a 0 indicates that it does not.
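
For readers unfamiliar with peak picking, the sketch below shows one simple variant in plain Common Lisp. It is not the code of <code>segmentation:peak-picker</code>: it marks a note when its IC is higher than the previous note's IC and also exceeds the mean of all preceding ICs by @k@ standard deviations. The names @mean-of@, @sd-of@ and @simple-peak-picker@, the unweighted mean, and the threshold value @k@ are all assumptions made for illustration.

<pre>
;; Hypothetical sketch; not the implementation of segmentation:peak-picker.
(defun mean-of (numbers)
  (/ (reduce #'+ numbers) (length numbers)))

(defun sd-of (numbers)
  "Population standard deviation of NUMBERS."
  (let ((m (mean-of numbers)))
    (sqrt (mean-of (mapcar (lambda (x) (expt (- x m) 2)) numbers)))))

(defun simple-peak-picker (ics &key (k 1.28))
  "Return a list of 0s and 1s, one per IC value. A 1 marks a value that
is greater than its predecessor and greater than the mean of all earlier
values plus K standard deviations. The first two values are never marked."
  (loop for ic in ics
        for i from 0
        for previous = (subseq ics 0 i)
        collect (if (and (>= i 2)
                         (> ic (nth (1- i) ics))
                         (> ic (+ (mean-of previous)
                                  (* k (sd-of previous)))))
                    1
                    0)))

;; Made-up ICs for one short melody: the fourth note is a clear peak.
(simple-peak-picker '(1.0 1.0 1.0 5.0 1.0 1.0))
;; => (0 0 0 1 0 0)
</pre>

As in the example above, such a function can be mapped over the third return value of <code>idyom:idyom</code> to obtain one list of 0s and 1s per composition.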