Jeremy Gow, 2013-02-22 02:43 PM


Running IDyOM

idyom:idyom

The main workhorse function is idyom:idyom, which has three required arguments and a number of optional keyword arguments.

Required parameters

  • dataset-id: a dataset id, e.g. 1.
  • target-viewpoints: a list of basic viewpoints to predict, e.g. '(:cpitch :bioi)
  • source-viewpoints: a list of viewpoints to use in prediction, e.g. '((:cpintfref :cpint) :bioi)
    • Passing :select will trigger viewpoint selection (see further options below)

See the List of viewpoints for a description of the various viewpoints available in IDyOM. So a simple call to IDyOM would be:

CL-USER> (idyom:idyom 1 '(:cpitch) '(:cpitch :cpint))
2.2490792
(1.9049941 2.427845 2.0234334 1.7971386 1.8213106 1.9313766 2.3758402 1.8310248
...

This predicts the pitch values in dataset 1 from previous pitches (cpitch) and pitch intervals (cpint). IDyOM computes the information content (IC) of each note and, by default, returns two values: first the dataset mean IC (the mean of the composition mean ICs), and second a list of composition mean ICs (the mean note IC for each composition).
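
Because the result comes back as two values, a caller needs multiple-value-bind (or similar) to capture the list of composition means as well as the dataset mean. The following is a minimal sketch: STAND-IN-IDYOM and SUMMARISE-ICS are hypothetical names, not part of IDyOM, and the numbers are illustrative rather than real results. The stand-in only exists so that the snippet runs without a database.

```lisp
;; STAND-IN-IDYOM is a hypothetical placeholder, not an IDyOM function.
;; It mimics the two return values described above.
(defun stand-in-idyom ()
  (values 2.5 '(2.0 3.0 2.5)))   ; illustrative numbers, not real results

;; With IDyOM loaded, replace (stand-in-idyom) with a real call such as
;; (idyom:idyom 1 '(:cpitch) '(:cpitch :cpint)).
(defun summarise-ics ()
  (multiple-value-bind (dataset-mean composition-means) (stand-in-idyom)
    (list :dataset-mean dataset-mean
          :n-compositions (length composition-means)
          :highest-composition-mean (reduce #'max composition-means))))
```

Calling the function without multiple-value-bind (for example inside setf or as an argument to another function) keeps only the first value, the dataset mean IC.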

Statistical modelling parameters

  • pretraining-ids: a list of dataset-ids to pretrain the long-term models
    • e.g., '(0 1 7)
  • k: an integer designating the number of cross-validation folds to use
    • 1 = no cross-validation, but also no training set unless the models are pretrained;
    • :full = as many folds as there are compositions in the dataset
    • default = 10
  • resampling-indices: you can limit the modelling to a particular set of resampling folds
  • models: whether to use the short-term or long-term models or both
    • :stm - short-term model only
    • :ltm - long-term model only
    • :ltm+ - the long-term model trained incrementally on the test set
    • :both - :stm + :ltm
    • :both+ - :stm + :ltm+ (this is the default)
  • ltm-order-bound: the order bound for the long-term model (the default nil means no order bound, otherwise an integer indicates the bound in number of events)
  • ltm-mixtures: whether to use mixtures for the LTM (default t)
  • ltm-update-exclusion: whether to use update exclusion for the LTM (default nil)
  • ltm-escape: the escape method to use for the LTM (:a :b :c :d :x - default :c)
  • stm-order-bound: the order bound to use for the short-term model (default nil)
  • stm-mixtures: whether to use mixtures for the STM (default t)
  • stm-update-exclusion: whether to use update exclusion for the STM (default t)
  • stm-escape: the escape method for the STM (default :x)

See Pearce [2005, chapter 6] for a description and explanation of these parameters.
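
The k and models options above can be read as two small lookups. The sketch below restates them as code, purely as a reading aid: FOLD-COUNT and MODEL-COMPONENTS are hypothetical helper names, not IDyOM functions, and the way IDyOM interprets these options internally may differ.

```lisp
;; Hypothetical helpers that restate the option descriptions above.

(defun fold-count (k n-compositions)
  "Number of cross-validation folds implied by K for a dataset
containing N-COMPOSITIONS compositions."
  (if (eq k :full)
      n-compositions   ; :full = one fold per composition
      k))              ; otherwise K is the fold count itself (default 10)

(defun model-components (models)
  "The model components selected by the MODELS option."
  (ecase models
    (:stm   '(:stm))         ; short-term model only
    (:ltm   '(:ltm))         ; long-term model only
    (:ltm+  '(:ltm+))        ; LTM trained incrementally on the test set
    (:both  '(:stm :ltm))
    (:both+ '(:stm :ltm+)))) ; the default
```

For example, (fold-count :full 185) gives 185 and (model-components :both+) gives (:STM :LTM+).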

Viewpoint selection parameters

  • dp: the number of decimal places to use when comparing information contents in viewpoint selection
    • full floating point precision is used if this is nil (the default)
  • max-links: the maximum number of links to use when creating linked viewpoints in viewpoint selection
    • the default is 2

Output parameters

  • output-path: a string indicating a directory in which to write the output
    • output is only written to the console if this is nil
  • detail: an integer which determines how the information content is averaged in the output:
    • 1: averaged over the entire dataset
    • 2: and also averaged over each composition
    • 3: and also for each event in each composition

resampling:idyom-resample

The top-level function in turn passes its arguments on to a number of sub-functions, which can also be used independently.

RESAMPLING:DATASET-PREDICTION accepts the following arguments (all but the first three are optional keyword arguments):

  • dataset-id: a dataset id
    • e.g., 1
  • basic-attributes: a list of basic attributes to predict
    • e.g., '(cpitch bioi)
  • attributes: a list of attributes to use in prediction
    • e.g., '((cpintfref cpint) bioi)
  • pretraining-ids: a list of dataset-ids to pretrain the long-term models
    • e.g., '(0 1 7)
  • k: an integer designating the number of cross-validation folds to use
    • 1 = no cross-validation, but also no training set unless the models are pretrained;
    • :full = as many folds as there are compositions in the dataset
    • default = 10
  • resampling-indices: you can limit the modelling to a particular set of resampling folds
  • models: whether to use the short-term or long-term models or both
    • :stm - short-term model only
    • :ltm - long-term model only
    • :ltm+ - the long-term model trained incrementally on the test set
    • :both - :stm + :ltm
    • :both+ - :stm + :ltm+ (this is the default)
  • ltm-order-bound: the order bound for the long-term model (the default nil means no order bound, otherwise an integer indicates the bound in number of events)
  • ltm-mixtures: whether to use mixtures for the LTM (default t)
  • ltm-update-exclusion: whether to use update exclusion for the LTM (default nil)
  • ltm-escape: the escape method to use for the LTM (:a :b :c :d :x - default :c)
  • stm-order-bound: the order bound to use for the short-term model (default nil)
  • stm-mixtures: whether to use mixtures for the STM (default t)
  • stm-update-exclusion: whether to use update exclusion for the STM (default t)
  • stm-escape: the escape method for the STM (default :x)

resampling:output-information-content

RESAMPLING:OUTPUT-INFORMATION-CONTENT takes the output of RESAMPLING:DATASET-PREDICTION and returns the average information content. It takes the following arguments:
  • predictions: the output of RESAMPLING:DATASET-PREDICTION
  • detail: an integer which determines how the information content is averaged (these are returned as multiple values):
    • 1: averaged over the entire dataset
    • 2: and also averaged over each composition
    • 3: and also for each event in each composition
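
The three detail levels can be illustrated with a small stand-alone sketch. It assumes the input is simply a list of compositions, each a list of per-event ICs; the real predictions object returned by RESAMPLING:DATASET-PREDICTION has its own structure, and MEAN and SUMMARISE-EVENT-ICS are hypothetical names used only here.

```lisp
(defun mean (numbers)
  "Arithmetic mean of a non-empty list of numbers."
  (/ (reduce #'+ numbers) (length numbers)))

(defun summarise-event-ics (event-ics detail)
  "EVENT-ICS is a list of compositions, each a list of per-event ICs.
Returns one, two or three values depending on DETAIL."
  (let* ((composition-means (mapcar #'mean event-ics))
         ;; The dataset mean is the mean of the composition means.
         (dataset-mean (mean composition-means)))
    (ecase detail
      (1 dataset-mean)
      (2 (values dataset-mean composition-means))
      (3 (values dataset-mean composition-means event-ics)))))

;; Two toy compositions with illustrative ICs:
;; (summarise-event-ics '((1.0 3.0) (2.0 4.0 6.0)) 2)
;; returns 3.0 and (2.0 4.0) as two values.
```

Note that the dataset mean here weights every composition equally, regardless of how many events it contains.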

resampling:format-information-content

RESAMPLING:FORMAT-INFORMATION-CONTENT takes the output of RESAMPLING:DATASET-PREDICTION and writes it to file. It takes the following arguments:
  • predictions: the output of RESAMPLING:DATASET-PREDICTION
  • file: a string denoting a file
  • dataset-id: an integer reflecting the dataset-id
  • detail: an integer which determines how the information content is averaged (these are returned as multiple values):
    • 1: averaged over the entire dataset
    • 2: and also averaged over each composition
    • 3: and also for each event in each composition

Examples

Mean melody IC

To get the mean information content of each melody in dataset 0 as a list:

CL-USER> (resampling:output-information-content 
          (resampling:dataset-prediction 0 '(cpitch) '(cpintfref cpint))
          2)
2.493305
(2.1368716 2.8534691 2.6938546 2.6491673 2.4993074 2.6098127 2.7728052 2.772861
 2.5921957 2.905856 2.3591626 2.957503 2.4042292 2.7562473 2.3996017 2.8073587
 2.114944 1.7434102 2.2310295 2.6374347 2.361792 1.9476132 2.501488 2.5472867
 2.1056154 2.8225484 2.134257 2.9162033 3.0715692 2.9012227 2.7291088 2.866882
 2.8795822 2.4571223 2.9277062 2.7861307 2.6623116 2.3304622 2.4217033
 2.0556943 2.4048684 2.914848 2.7182267 3.0894585 2.873869 1.8821808 2.640174
 2.8165438 2.5423129 2.3011856 3.1477294 2.655349 2.5216308 2.0667994 3.2579045
 2.573013 2.6035044 2.202191 2.622113 2.2621205 2.3617425 2.7526956 2.3281655
 2.9357266 2.3372407 3.1848125 2.67367 2.1906006 2.7835917 2.6332111 3.206142
 2.1426969 2.194259 2.415167 1.9769101 2.0870917 2.7844474 2.2373738 2.772138
 2.9702199 1.724408 2.473073 2.2464263 2.2452457 2.688889 2.6299863 2.2223835
 2.8082614 2.673671 2.7693706 2.3369458 2.5016947 2.3837066 2.3682225 2.795649
 2.9063463 2.5880773 2.0457468 1.8635312 2.4522712 1.5877498 2.8802161
 2.7988417 2.3125513 1.7245895 2.2404804 2.1694546 2.365556 1.5905867 1.3827317
 2.2706041 3.023884 2.2864542 2.1259797 2.713626 2.1967313 2.5721254 2.5812547
 2.8233812 2.3134546 2.6203637 2.945946 2.601433 2.1920888 2.3732007 2.440137
 2.4291563 2.3676903 2.734724 3.0283954 2.8076048 2.7796154 2.326931 2.1779459
 2.2570527 2.2688026 1.3976555 2.030298 2.640235 2.568248 2.6338177 2.157162
 2.3915367 2.7873137 2.3088667 2.2176988 2.4402564 2.8062992 2.784044 2.4296925
 2.3520193 2.6146257)

Write note IC to file

To write the information content of each note of each melody in dataset 0 to a file:

CL-USER> (resampling:format-information-content 
          (resampling:dataset-prediction 0 '(cpitch) '(cpintfref cpint))
          "/tmp/foo.dat" 
          0
          3)

Conklin & Witten (1995)

To simulate the experiments of Conklin & Witten (1995):

CL-USER> (resampling:conkwit95)
Simulation of the experiments of Conklin & Witten (1995, Table 4).
System 1; Mean Information Content: 2.33 
System 2; Mean Information Content: 2.36 
System 3; Mean Information Content: 2.09 
System 4; Mean Information Content: 2.01 
System 5; Mean Information Content: 2.08 
System 6; Mean Information Content: 1.90 
System 7; Mean Information Content: 1.88 
System 8; Mean Information Content: 1.86 
NIL

Compare these results with Conklin & Witten [1995, JNMR, table 4].

Viewpoint Selection

Two functions are supplied for searching a space of viewpoints, run-hill-climber and run-best-first, each of which takes four arguments:

  • a list of viewpoints: the algorithm searches through the space of combinations of these viewpoints
  • a start state (usually nil, the empty viewpoint system)
  • an evaluation function returning a numeric performance metric: e.g., the mean information content of the dataset returned by dataset-prediction
  • a symbol describing which way to optimise the metric: :desc means lower values are better, :asc means greater values are better

Here is an example:

CL-USER> (viewpoint-selection:run-hill-climber 
          '(:cpitch :cpintfref :cpint :contour)
          nil
          #'(lambda (viewpoints)
              (utils:round-to-nearest-decimal-place 
               (resampling:output-information-content 
                (resampling:dataset-prediction 0 '(cpitch) viewpoints :k 10 :models :both+) 
                1)
               2))
          :desc)

 =============================================================================
   System                                                Score
 -----------------------------------------------------------------------------
   NIL                                                   NIL
   (CPITCH)                                              2.52
   (CPINT CPITCH)                                        2.43
   (CPINTFREF CPINT CPITCH)                              2.38
 =============================================================================
#S(VIEWPOINT-SELECTION::RECORD :STATE (:CPINTFREF :CPINT :CPITCH) :WEIGHT 2.38)
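
To make the role of the four arguments concrete, here is a generic greedy forward-selection search written as a stand-alone sketch. It is not the implementation of run-hill-climber, which may explore the space differently. BETTER-P, GREEDY-SELECT, *TOY-GAINS* and TOY-SCORE are hypothetical names, and the toy scores are invented so that the snippet runs without a dataset.

```lisp
(defun better-p (new old direction)
  "True when score NEW improves on OLD; OLD may be NIL (no score yet)."
  (or (null old)
      (ecase direction
        (:desc (< new old))     ; lower values are better
        (:asc  (> new old)))))  ; greater values are better

(defun greedy-select (candidates start evaluate direction)
  "Repeatedly add the single candidate that most improves the score.
Returns the final state and its score as two values."
  (let ((state start)
        (score (and start (funcall evaluate start))))
    (loop
      (let ((best-state nil) (best-score nil))
        ;; Try extending the current state by each unused candidate.
        (dolist (v (set-difference candidates state))
          (let* ((next (cons v state))
                 (s (funcall evaluate next)))
            (when (better-p s best-score direction)
              (setf best-state next best-score s))))
        ;; Stop when no extension improves on the current score.
        (if (and best-state (better-p best-score score direction))
            (setf state best-state score best-score)
            (return (values state score)))))))

;; Invented per-viewpoint gains; a real evaluation function would call
;; dataset-prediction as in the example above.
(defparameter *toy-gains*
  '((:cpitch . 1/2) (:cpint . 3/10) (:cpintfref . 1/10) (:contour . -1/10)))

(defun toy-score (viewpoints)
  (- 3 (reduce #'+ viewpoints
               :key (lambda (v) (cdr (assoc v *toy-gains*))))))

;; (greedy-select '(:cpitch :cpintfref :cpint :contour) nil #'toy-score :desc)
;; returns (:CPINTFREF :CPINT :CPITCH) and 21/10.
```

As in the real output above, the search starts from the empty system, adds one viewpoint at a time, and stops once no addition lowers the score; here :contour is rejected because it makes the toy score worse.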

Since this can be quite a time-consuming process, there are also functions for caching the results:

(initialise-vs-cache)
(load-vs-cache filename package)
(store-vs-cache filename package)
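
The idea behind such a cache can be sketched as simple memoisation of the evaluation function. This illustrates the general technique only and is not how the vs-cache functions above are implemented. *SCORE-CACHE*, CACHE-KEY and CACHED-EVALUATE are hypothetical names, and the sketch assumes a flat list of viewpoint symbols whose order does not affect the score.

```lisp
(defvar *score-cache* (make-hash-table :test #'equal)
  "Maps a viewpoint-system key to its previously computed score.")

(defun cache-key (viewpoints)
  "Order-independent key: the viewpoint names, sorted alphabetically."
  (sort (mapcar #'symbol-name viewpoints) #'string<))

(defun cached-evaluate (evaluate viewpoints)
  "Return the cached score for VIEWPOINTS, calling EVALUATE only on a miss."
  (let ((key (cache-key viewpoints)))
    (multiple-value-bind (score found) (gethash key *score-cache*)
      (if found
          score
          (setf (gethash key *score-cache*)
                (funcall evaluate viewpoints))))))
```

With this wrapper, evaluating '(:cpitch :cpint) and then '(:cpint :cpitch) runs the expensive evaluation function once.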