
Marcus Pearce, 2014-07-24 02:27 PM

h1. Running IDyOM

{{>toc}}

h2. <code>*idyom:idyom*</code>

The main workhorse function is <code>idyom:idyom</code>, which takes three required arguments and a number of optional keyword arguments.

h3. Required parameters

* @dataset-id@: a dataset id, e.g. @1@.
* @target-viewpoints@: a list of basic viewpoints to predict, e.g. <code>'(:cpitch :bioi)</code>.
* @source-viewpoints@: a list of viewpoints to use in prediction, e.g. <code>'((:cpintfref :cpint) :bioi)</code>.
** Passing <code>:select</code> instead of a list triggers viewpoint selection (see the viewpoint selection parameters below).

See the [[List of viewpoints]] for a description of the various viewpoints available in IDyOM.  A simple call to IDyOM would be:
<pre>
CL-USER> (idyom:idyom 0 '(cpitch) '(cpitch))
2.2490792
(1.9049941 2.427845 2.0234334 1.7971386 1.8213106 1.9313766 2.3758402 1.8310248 ...)
((
</pre>
This predicts the pitch values in dataset 0 from the preceding pitches (i.e., the target and source viewpoints are both <code>cpitch</code>). IDyOM computes the information content (IC) of each note and by default returns three values: the mean note IC for the dataset, a list of mean note ICs for the individual compositions, and a list of lists containing the IC of each note in each composition.
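Because the three results are returned as multiple values, assigning the call to a single variable keeps only the first of them. The sketch below shows one way to capture all three with @multiple-value-bind@. To keep it runnable without IDyOM or a database, @example-result@ is a hypothetical stand-in for the real call, and its numbers are illustrative only.

<pre>
;; Hypothetical stand-in for (idyom:idyom ...): returns three values shaped
;; like the ones described above. The numbers are illustrative only.
(defun example-result ()
  (values 2.0                        ; mean note IC for the dataset
          '(2.0 2.0)                 ; mean note IC per composition
          '((1.0 3.0) (2.0 2.0))))   ; note ICs per composition

;; Binding a multiple-value call to one variable keeps only the first value.
(defparameter *first-only* (example-result))

;; MULTIPLE-VALUE-BIND names all three values.
(defun summarise ()
  (multiple-value-bind (dataset-mean composition-means note-ics)
      (example-result)
    (list :dataset-mean dataset-mean
          :n-compositions (length composition-means)
          :n-notes (reduce #'+ (mapcar #'length note-ics)))))

;; (summarise) => (:DATASET-MEAN 2.0 :N-COMPOSITIONS 2 :N-NOTES 4)
</pre>

With IDyOM loaded, replacing @(example-result)@ with an actual @idyom:idyom@ call should work in the same way.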

h3. Statistical modelling parameters

See "Pearce [2005, chapter 6]":http://webprojects.eecs.qmul.ac.uk/marcusp/papers/Pearce2005.pdf for further description and explanation of these parameters.

* @models@: the type of IDyOM model to use.  Options are:
** @:stm@ - short-term model only, trained on the current composition;
** @:ltm@ - long-term model only, trained on the pretraining and resampling training data;
** @:ltm+@ - the long-term model, with additional incremental training on the test set;
** @:both@ - a combination of @:stm@ and @:ltm@;
** @:both+@ - a combination of @:stm@ and @:ltm+@ (this is the default).

The LTM and STM can be configured using the @ltmo@ and @stmo@ parameters respectively.  Each accepts a property list with the following properties; the default is used for any property that is omitted, or for all of them if no property list is supplied:
* @:order-bound@: an integer bounding the order of the model, i.e. the number of past events used by the model.  The default is @nil@ (no bound).
* @:mixtures@: whether to use mixtures for the model.  (Default @t@.)
* @:update-exclusion@: whether to use update exclusion.  (LTM default @nil@, STM default @t@.)
* @:escape@: the model's escape method.  One of @:a@, @:b@, @:c@, @:d@, @:x@.  (LTM default @:c@, STM default @:x@.)

For example, the following command combines the STM and the LTM, with no incremental training for the LTM and an order bound of 4 for the STM:
<pre>
CL-USER> (idyom:idyom 1 '(cpitch) '(cpitch) :models :both :stmo '(:order-bound 4))
</pre>
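The rule that an omitted property falls back to its default can be pictured with plain property-list lookup. The sketch below is not IDyOM's own code: @*stm-defaults*@ and @stm-option@ are hypothetical names, and the default values are simply the STM ones listed above.

<pre>
;; STM defaults as listed above: no order bound, mixtures on,
;; update exclusion on, escape method :X. Hypothetical variable name.
(defparameter *stm-defaults*
  '(:order-bound nil :mixtures t :update-exclusion t :escape :x))

;; Look PROPERTY up in the user-supplied plist STMO; GETF returns its third
;; argument only when the property is absent, so an explicit NIL is respected.
(defun stm-option (stmo property)
  (getf stmo property (getf *stm-defaults* property)))

;; (stm-option '(:order-bound 4) :order-bound)  => 4
;; (stm-option '(:order-bound 4) :escape)       => :X
</pre>

Under this reading, <code>:stmo '(:order-bound 4)</code> changes only the order bound and leaves the other STM settings at their defaults.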

h3. Training parameters

When using IDyOM to estimate note IC for a given dataset, the long-term models can be trained on other datasets (pretraining) and/or on the current dataset via resampling (cross-validation).  In the latter case, the dataset is partitioned into a training set (used to train the LTMs) and a test set (for which note IC is computed).  This split is called a fold, and the modelling process is repeated over a number of different folds so that the entire dataset is modelled.

* @pretraining-ids@: a list of dataset ids used to pretrain the long-term models (done before resampling).  If @pretraining-ids@ are supplied for an STM (i.e., <code>:models :stm</code>), the pretraining datasets are used only to set the viewpoint domains (alphabets) of the models, not to train them (because they are short-term models).
* @k@: the number of resampling (cross-validation) folds to use.  The default value is 10.
** @1@ = no resampling, and therefore no training set unless the models are pretrained;
** @:full@ = as many folds as there are compositions in the dataset.
* @resampling-indices@: a list of numbers designating which resampling folds to use, i.e. a subset of @[0, 1, ..., k - 1]@.  By default, all folds are used.

*Note* that cross-validation only applies to the dataset being analysed (i.e., the one specified by the <code>dataset-id</code> argument).  If @k@ is 1, the long-term models are not trained unless a pretraining set is used.
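To make the fold terminology concrete, here is a sketch that splits composition indices into @k@ test sets and derives the training set for one fold. The round-robin assignment and the function names are assumptions made for illustration; how IDyOM itself assigns compositions to folds is not described on this page.

<pre>
;; Assign composition indices 0..N-1 to K test sets in round-robin order.
;; Returns a list of K lists; element I is the test set of fold I.
(defun make-test-sets (n k)
  (let ((folds (make-list k :initial-element nil)))
    (dotimes (i n)
      (push i (nth (mod i k) folds)))
    (mapcar #'reverse folds)))

;; Training set for fold FOLD: every composition outside that fold's test set.
(defun training-set (n k fold)
  (let ((test (nth fold (make-test-sets n k))))
    (loop for i below n
          unless (member i test) collect i)))

;; (make-test-sets 5 2)  => ((0 2 4) (1 3))
;; (training-set 5 2 0)  => (1 3)
;; (training-set 5 1 0)  => NIL   ; k = 1 leaves nothing to train on
;; (make-test-sets 3 3)  => ((0) (1) (2))   ; one composition per fold
</pre>

The last two calls mirror the special cases above: with @k@ equal to 1 the training set is empty, and setting @k@ to the number of compositions corresponds to @:full@.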

h3. Viewpoint selection parameters

* @basis@: identifies the set of viewpoints to be used in viewpoint selection, which attempts to find the 'best' viewpoint system that can be built from them, including by linking them.  The parameter can be a list of viewpoints or one of the following keywords:
** @:pitch-full@ - viewpoints useful for predicting pitch in Western music: cpitch, cpitch-class, tessitura, cpint, cpint-size, cpcint, cpcint-size, contour, newcontour, cpintfip, cpintfref, inscale.
** @:pitch-short@ - a shorter version of the above: cpitch, cpitch-class, cpint, cpint-size, contour, newcontour.
** @:bioi@ - for predicting inter-onset interval (IOI): bioi, bioi-ratio, bioi-contour.
** @:onset@ - for predicting onset: onset, ioi, ioi-ratio, ioi-contour, metaccent.
** @:auto@ - the set of viewpoints that are defined in terms of one or more of the target viewpoints.  This is the default.
* @dp@: the number of decimal places to use when comparing information contents in viewpoint selection.  Full floating-point precision is used if this is @nil@ (the default).
* @max-links@: the maximum number of links to use when creating linked viewpoints in viewpoint selection.  The default is 2.
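One way to picture the effect of @dp@ is that two mean ICs count as equal when they agree after rounding to that many decimal places. The helpers below are a hypothetical sketch of such a comparison, assuming that a lower mean IC is the better one; they are not IDyOM's implementation.

<pre>
;; Round X to DP decimal places; a DP of NIL leaves X untouched.
;; The rounded result is an exact rational, so comparisons are exact.
(defun round-to-dp (x dp)
  (if (null dp)
      x
      (let ((scale (expt 10 dp)))
        (/ (round (* x scale)) scale))))

;; True when A is strictly lower than B at the given precision.
(defun ic-lower-p (a b dp)
  (< (round-to-dp a dp) (round-to-dp b dp)))

;; (ic-lower-p 3.0301 3.0302 3)    => NIL  ; equal to 3 decimal places
;; (ic-lower-p 3.0301 3.0302 nil)  => T    ; full floating-point precision
</pre>

In this reading, a smaller @dp@ makes the comparison less sensitive to tiny differences in IC.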

h3. Output parameters

* @output-path@: a string naming a directory in which to write the output.
** See [[IDyOM output]] for an explanation of the output files.
** If a value of <code>nil</code> is given, output is written to the console (see the examples below).
* @detail@: an integer that determines how the information content is averaged in the output:
** 1: averaged over the entire dataset;
** 2: as 1, and also averaged over each composition;
** 3: as 2, and also the raw IC values for each event in each composition.

h2. Examples

h3. Mean melody IC

To get the mean information content (IC) of each composition in dataset 0 as a list, call IDyOM with <code>:detail 2</code>.  The first value returned is the average IC for the whole dataset; the second is a list of average ICs, one for each composition in the dataset.  If <code>:detail 3</code> were specified instead, the output would contain a third value: a list of lists of ICs for each event in each composition in the dataset.

<pre>
CL-USER> (idyom:idyom 0 '(cpitch) '(cpintfref cpint) :detail 2)
2.493305
(2.1368716 2.8534691 2.6938546 2.6491673 2.4993074 2.6098127 2.7728052 2.772861
 2.5921957 2.905856 2.3591626 2.957503 2.4042292 2.7562473 2.3996017 2.8073587
 2.114944 1.7434102 2.2310295 2.6374347 2.361792 1.9476132 2.501488 2.5472867
 2.1056154 2.8225484 2.134257 2.9162033 3.0715692 2.9012227 2.7291088 2.866882
 2.8795822 2.4571223 2.9277062 2.7861307 2.6623116 2.3304622 2.4217033
 2.0556943 2.4048684 2.914848 2.7182267 3.0894585 2.873869 1.8821808 2.640174
 2.8165438 2.5423129 2.3011856 3.1477294 2.655349 2.5216308 2.0667994 3.2579045
 2.573013 2.6035044 2.202191 2.622113 2.2621205 2.3617425 2.7526956 2.3281655
 2.9357266 2.3372407 3.1848125 2.67367 2.1906006 2.7835917 2.6332111 3.206142
 2.1426969 2.194259 2.415167 1.9769101 2.0870917 2.7844474 2.2373738 2.772138
 2.9702199 1.724408 2.473073 2.2464263 2.2452457 2.688889 2.6299863 2.2223835
 2.8082614 2.673671 2.7693706 2.3369458 2.5016947 2.3837066 2.3682225 2.795649
 2.9063463 2.5880773 2.0457468 1.8635312 2.4522712 1.5877498 2.8802161
 2.7988417 2.3125513 1.7245895 2.2404804 2.1694546 2.365556 1.5905867 1.3827317
 2.2706041 3.023884 2.2864542 2.1259797 2.713626 2.1967313 2.5721254 2.5812547
 2.8233812 2.3134546 2.6203637 2.945946 2.601433 2.1920888 2.3732007 2.440137
 2.4291563 2.3676903 2.734724 3.0283954 2.8076048 2.7796154 2.326931 2.1779459
 2.2570527 2.2688026 1.3976555 2.030298 2.640235 2.568248 2.6338177 2.157162
 2.3915367 2.7873137 2.3088667 2.2176988 2.4402564 2.8062992 2.784044 2.4296925
 2.3520193 2.6146257)
</pre>

h3. Write note IC to file

To write the information content of each note of each melody in dataset 0 to a file:

<pre>
CL-USER> (idyom:idyom 0 '(cpitch) '(cpintfref cpint) :detail 3 :output-path "/tmp/")
</pre>

See [[IDyOM Output]] for a description of the output files.

h3. Viewpoint Selection

To select source viewpoints for predicting <code>cpitch</code> in dataset 17 with the short-term model, comparing information contents to 3 decimal places:

<pre>
CL-USER> (idyom:idyom 17 '(cpitch) :select :models :stm :dp 3)
Selecting viewpoints for the STM model on dataset 17 predicting viewpoints (CPITCH).
Generating candidate viewpoints from: (CPITCH CPITCH-CLASS CPINT
                                       CPINT-SIZE CONTOUR NEWCONTOUR)
Max. links 2, whitelist (ANY), blacklist NIL
Candidate viewpoints: (CPITCH CPITCH-CLASS CPINT CPINT-SIZE CONTOUR
                       NEWCONTOUR (CONTOUR NEWCONTOUR)
                       (CPINT-SIZE NEWCONTOUR) (CPINT-SIZE CONTOUR)
                       (CPINT NEWCONTOUR) (CPINT CONTOUR)
                       (CPINT CPINT-SIZE) (CPITCH-CLASS NEWCONTOUR)
                       (CPITCH-CLASS CONTOUR) (CPITCH-CLASS CPINT-SIZE)
                       (CPITCH-CLASS CPINT) (CPITCH NEWCONTOUR)
                       (CPITCH CONTOUR) (CPITCH CPINT-SIZE) (CPITCH CPINT)
                       (CPITCH CPITCH-CLASS))

Selected system NIL, mean IC = NIL

Selected system ((CPITCH-CLASS CONTOUR)), mean IC = 3.0302427
 =======================================================================================
The selected viewpoint system with a mean IC of 3.0302427 is ((CPITCH-CLASS
                                                               CONTOUR))
3.0302427
(3.169925 3.169925 3.0849624 3.0849624 2.9886398 2.9886398 2.8774438 2.8774438)
((3.169925 3.169925) (3.169925 3.169925) (3.169925 3.0) (3.169925 3.0)
 (3.169925 2.807355) (3.169925 2.807355) (3.169925 2.5849626)
 (3.169925 2.5849626))
</pre>
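In the transcript above, the 21 candidate viewpoints are the six basis viewpoints plus every unordered pair of them, which is consistent with a maximum of 2 links. The sketch below reproduces that count; the function names are hypothetical, and the ordering differs from IDyOM's printout.

<pre>
;; All unordered pairs drawn from ITEMS, each pair as a two-element list.
(defun all-pairs (items)
  (loop for (a . rest) on items
        append (loop for b in rest collect (list a b))))

;; Single viewpoints plus linked pairs.
(defun candidate-viewpoints (basis)
  (append basis (all-pairs basis)))

;; (all-pairs '(a b c)) => ((A B) (A C) (B C))
;; (length (candidate-viewpoints
;;          '(cpitch cpitch-class cpint cpint-size contour newcontour)))
;; => 21
</pre>

Six single viewpoints and fifteen pairs give the 21 candidates listed in the transcript.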

h3. Conklin & Witten (1995)

To simulate the experiments of Conklin & Witten (1995):

<pre>
CL-USER> (idyom:conkwit95)
Simulation of the experiments of Conklin & Witten (1995, Table 4).
System 1; Mean Information Content: 2.33
System 2; Mean Information Content: 2.36
System 3; Mean Information Content: 2.09
System 4; Mean Information Content: 2.01
System 5; Mean Information Content: 2.08
System 6; Mean Information Content: 1.90
System 7; Mean Information Content: 1.88
System 8; Mean Information Content: 1.86
NIL
</pre>

Compare with "Conklin & Witten [1995, JNMR, table 4]":http://www.sc.ehu.es/ccwbayes/members/conklin/papers/jnmr95.pdf