% draft.tex @ changeset 70:2cb06db0d271
% author: samer
% date: Sat, 17 Mar 2012 18:06:03 +0000
\section{Introduction}
\label{s:Intro}
The relationship between
Shannon's \cite{Shannon48} information theory and music and art in general has been the
subject of some interest since the 1950s
\cite{Youngblood58,CoonsKraehenbuehl1958,Moles66,Meyer67,Cohen1962}.
The general thesis is that perceptible qualities and subjective states
like uncertainty, surprise, complexity, tension, and interestingness
are closely related to information-theoretic quantities like
entropy, relative entropy, and mutual information.

immediately, after some delay, or modified as the music unfolds.
In this paper, we explore this ``Information Dynamics'' view of music,
discussing the theory behind it and some emerging applications.

\subsection{Expectation and surprise in music}
The idea that the musical experience is strongly shaped by the generation
and playing out of strong and weak expectations was put forward by, amongst others,
music theorists L. B. Meyer \cite{Meyer67} and Narmour \cite{Narmour77}, but was
recognised much earlier; for example,
it was elegantly put by Hanslick \cite{Hanslick1854} in the
nineteenth century:
We suppose that when we listen to music, expectations are created on the basis
of our familiarity with various styles of music and our ability to
detect and learn statistical regularities in the music as they emerge.
There is experimental evidence that human listeners are able to internalise
statistical knowledge about musical structure, \eg
\cite{SaffranJohnsonAslin1999}, and also
that statistical models can form an effective basis for computational
analysis of music, \eg
\cite{ConklinWitten95,PonsfordWigginsMellish1999,Pearce2005}.

% \subsection{Music and information theory}
With a probabilistic framework for music modelling and prediction in hand,
we can compute various
\comment{
which provides us with a number of measures, such as entropy
and mutual information, which are suitable for quantifying states of
uncertainty and surprise, and thus could potentially enable us to build
quantitative models of the listening process described above. They are
% listener, a temporal programme of varying
% levels of uncertainty, ambiguity and surprise.


\subsection{Information dynamic approach}
Our working hypothesis is that, as an intelligent, predictive
agent (to which we will refer as `it') listens to a piece of music, it maintains
a dynamically evolving probabilistic belief state that enables it to make predictions
about how the piece will continue, relying on both its previous experience
of music and the emerging themes of the piece. As events unfold, it revises
this belief state, which includes predictive
distributions over possible future events. These
% distributions and changes in distributions
can be characterised in terms of a handful of information-theoretic
measures such as entropy and relative entropy. By tracing the
evolution of these measures, we obtain a representation which captures much
of the significant structure of the music.

One consequence of this approach is that regardless of the details of
the sensory input or even which sensory modality is being processed, the resulting
analysis is in terms of the same units: quantities of information (bits) and
rates of information flow (bits per second). The information-theoretic
concepts in terms of which the analysis is framed are universal to all sorts
of data.
\mathcal{I}_t = \sum_{\fut{x}_t \in \X^*}
p(\fut{x}_t|x_t,\past{x}_t) \log \frac{ p(\fut{x}_t|x_t,\past{x}_t) }{ p(\fut{x}_t|\past{x}_t) },
\end{equation}
where the sum is to be taken over the set of infinite sequences $\X^*$.
Note that it is quite possible for an event to be surprising but not informative
in a predictive sense.
As with the surprisingness, the observer can compute its \emph{expected} IPI
at time $t$, which reduces to a mutual information $I(X_t;\fut{X}_t|\ev(\past{X}_t=\past{x}_t))$
conditioned on the observed past. This could be used, for example, as an estimate
of the attentional resources which should be directed at this stream of data, which may
be in competition with other sensory streams.
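As a concrete illustration, under the additional assumption (ours, for this sketch only) that the observer's model is a first order Markov chain, the infinite future in the definition of $\mathcal{I}_t$ collapses to the single next variable, and the IPI of an observation becomes a one-step Kullback-Leibler divergence:

```python
import numpy as np

# Hypothetical two-symbol chain; columns are conditional distributions,
# following the convention a[i, j] = P(X_t = i | X_{t-1} = j).
a = np.array([[0.9, 0.2],
              [0.1, 0.8]])

def ipi(a, x_prev, x_t):
    """Instantaneous predictive information (nats) of observing x_t after x_prev.

    Under the Markov assumption, the chain rule for KL divergence telescopes:
    conditioning on x_{t+1} screens off everything earlier, so the divergence
    between the two distributions over infinite futures reduces to the
    divergence between one-step predictions for X_{t+1}."""
    p_given_now = a[:, x_t]              # p(x_{t+1} | x_t)
    p_given_past = (a @ a)[:, x_prev]    # p(x_{t+1} | x_{t-1}), marginalising x_t
    return float(np.sum(p_given_now * np.log(p_given_now / p_given_past)))
```

Being a KL divergence, this is always non-negative: even an improbable (highly surprising) event can carry little predictive information if the state it leads to is itself unpredictable.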
\node at (barycentric cs:p2=1,p1=1) [shape=rectangle,fill=black!45,inner sep=1pt]{$\rho_\mu$};
\path (p2) +(90:3em) node {$X_0$};
\path (p1) +(-3em,0em) node {\shortstack{infinite\\past}};
\path (p1) +(-4em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
\end{tikzpicture}}%
\\[1em]
\subfig{(b) excess entropy}{%
\newcommand\blob{\longblob}
\begin{tikzpicture}
\coordinate (p1) at (-\offs,0em);
\coordinate (p2) at (\offs,0em);
\path (0,0) node (future) {$E$};
\path (p1) +(-2em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
\path (p2) +(2em,\rad) node [anchor=south] {$X_0,\ldots$};
\end{tikzpicture}%
}%
\\[1em]
\subfig{(c) predictive information rate $b_\mu$}{%
\begin{tikzpicture}%[baseline=-1em]
\newcommand\rc{2.1em}
\newcommand\throw{2.5em}
\coordinate (p1) at (210:1.5em);
\path (p3) +(3em,0em) node {\shortstack{infinite\\future}};
\path (p1) +(-3em,0em) node {\shortstack{infinite\\past}};
\path (p1) +(-4em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
\path (p3) +(4em,\rad) node [anchor=south] {$X_1,\ldots$};
\end{tikzpicture}}%
\\[0.25em]
\end{tabular}
\caption{
I-diagrams for several information measures in
stationary random processes. Each circle or oval represents a random
variable or sequence of random variables relative to time $t=0$. Overlapped areas
% information} rate.


\subsection{First and higher order Markov chains}
\label{s:markov}
In \cite{AbdallahPlumbley2009} we derived
expressions for all the information measures described in \secrf{surprise-info-seq} for
ergodic first order Markov chains (\ie those that have a unique stationary
distribution).
% The derivation is greatly simplified by the dependency structure
% of the Markov chain: for the purpose of the analysis, the `past' and `future'
% segments $\past{X}_t$ and $\fut{X}_t$ can be collapsed to just the previous
% and next variables $X_{t-1}$ and $X_{t+1}$ respectively.
We also showed that
the PIR can be expressed simply in terms of entropy rates:
if we let $a$ denote the $K\times K$ transition matrix of a Markov chain over
the alphabet $\{1,\ldots,K\}$, such that
$a_{ij} = \Pr(\ev(X_t=i)|\ev(X_{t-1}=j))$, and let $h:\reals^{K\times K}\to \reals$ be
the entropy rate function such that $h(a)$ is the entropy rate of a Markov chain
with transition matrix $a$, then the PIR is
\begin{equation}
b_\mu = h(a^2) - h(a),
\end{equation}
where $a^2$, the transition matrix squared, is the transition matrix
of the `skip one' Markov chain obtained by jumping two steps at a time
along the original chain.
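This identity is straightforward to check numerically. The following sketch (ours, not from the cited derivation) assumes the column convention for $a$ given above, so that each column of the matrix is a conditional distribution:

```python
import numpy as np

def entropy_rate(a, eps=1e-300):
    """Entropy rate h(a) in nats of an ergodic Markov chain whose transition
    matrix has columns as conditional distributions: a[i,j] = P(X_t=i | X_{t-1}=j)."""
    w, v = np.linalg.eig(a)
    pi = np.real(v[:, np.argmax(np.real(w))])  # stationary distribution: a @ pi = pi
    pi = pi / pi.sum()
    col_entropy = -np.sum(a * np.log(a + eps), axis=0)  # entropy of each column
    return float(np.dot(pi, col_entropy))

def predictive_information_rate(a):
    """b_mu = h(a^2) - h(a): the entropy rate of the `skip one' chain
    minus the entropy rate of the original chain."""
    return entropy_rate(a @ a) - entropy_rate(a)
```

Note that $b_\mu$ vanishes at both extremes: for a deterministic chain (a permutation matrix) both entropy rates are zero, while for an i.i.d. chain $a^2 = a$ up to mixing and the two rates coincide.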

Second and higher order Markov chains can be treated in a similar way by transforming
to a first order representation of the high order Markov chain. With
an $N$th order model, this is done by forming a new alphabet of size $K^N$
consisting of all possible $N$-tuples of symbols from the base alphabet.
An observation $\hat{x}_t$ in this new model encodes a block of $N$ observations
$(x_{t+1},\ldots,x_{t+N})$ from the base model.
The new Markov chain is parameterised by a sparse $K^N\times K^N$
transition matrix $\hat{a}$, in terms of which the PIR is
\begin{equation}
h_\mu = h(\hat{a}), \qquad b_\mu = h({\hat{a}^{N+1}}) - N h({\hat{a}}),
\end{equation}
where $\hat{a}^{N+1}$ is the $(N+1)$th power of the first order transition matrix.
Other information measures can also be computed for the high-order Markov chain, including
the multi-information rate $\rho_\mu$ and the excess entropy $E$. (These are identical
for first order Markov chains, but for order $N$ chains, $E$ can be up to $N$ times larger
than $\rho_\mu$.)
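The construction of the first order representation can be sketched as follows (our illustration; the function names and the conditional table `cond` are hypothetical, not from the text). A useful sanity check is that a chain which is genuinely first order, embedded as an order-$N$ model, reproduces the first order $b_\mu$:

```python
import numpy as np
from itertools import product

def entropy_rate(a, eps=1e-300):
    """h(a) in nats; columns of a are conditional distributions."""
    w, v = np.linalg.eig(a)
    pi = np.real(v[:, np.argmax(np.real(w))])
    pi = pi / pi.sum()
    return float(-np.dot(pi, np.sum(a * np.log(a + eps), axis=0)))

def block_transition_matrix(cond, K, N):
    """First order representation of an N-th order chain over K symbols.
    cond[(k, ctx)] = P(next symbol = k | previous N symbols = ctx), with ctx
    a tuple ordered oldest first. States of the new chain are N-blocks; one
    step shifts the block along by one symbol, so each column of the
    K^N x K^N matrix has at most K nonzero entries."""
    states = list(product(range(K), repeat=N))
    index = {s: i for i, s in enumerate(states)}
    ahat = np.zeros((K**N, K**N))
    for s in states:
        for k in range(K):
            ahat[index[s[1:] + (k,)], index[s]] = cond[(k, s)]
    return ahat

def pir_high_order(ahat, N):
    """b_mu = h(ahat^(N+1)) - N h(ahat)."""
    return entropy_rate(np.linalg.matrix_power(ahat, N + 1)) - N * entropy_rate(ahat)
```

Embedding a first order chain this way leaves $h_\mu$ unchanged, and the formula above then reduces to $b_\mu = h(a^2) - h(a)$, as expected.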
650 655
651 In our early experiments with visualising and sonifying sequences sampled from 656 In our experiments with visualising and sonifying sequences sampled from
652 first order Markov chains \cite{AbdallahPlumbley2009}, we found that 657 first order Markov chains \cite{AbdallahPlumbley2009}, we found that
653 the measures $h_\mu$, $\rho_\mu$ and $b_\mu$ are related to perceptible 658 the measures $h_\mu$, $\rho_\mu$ and $b_\mu$ correspond to perceptible
654 characteristics, and that the kinds of transition matrices maximising or minimising 659 characteristics, and that the transition matrices maximising or minimising
655 each of these quantities are quite distinct. High entropy rates are associated 660 each of these quantities are quite distinct. High entropy rates are associated
656 with completely uncorrelated sequences with no recognisable temporal structure, 661 with completely uncorrelated sequences with no recognisable temporal structure
657 along with low $\rho_\mu$ and $b_\mu$. 662 (and low $\rho_\mu$ and $b_\mu$).
658 High values of $\rho_\mu$ are associated with long periodic cycles, low $h_\mu$ 663 High values of $\rho_\mu$ are associated with long periodic cycles (and low $h_\mu$
659 and low $b_\mu$. High values of $b_\mu$ are associated with intermediate values 664 and $b_\mu$). High values of $b_\mu$ are associated with intermediate values
660 of $\rho_\mu$ and $h_\mu$, and recognisable, but not completely predictable, 665 of $\rho_\mu$ and $h_\mu$, and recognisable, but not completely predictable,
661 temporal structures. These relationships are visible in \figrf{mtriscat} in 666 temporal structures. These relationships are visible in \figrf{mtriscat} in
662 \secrf{composition}, where we pick up the thread with an application of 667 \secrf{composition}, where we pick up this thread again, with an application of
663 information dynamics in a compositional aid. 668 information dynamics in a compositional aid.
664 669
665 670
666 \section{Information Dynamics in Analysis} 671 \section{Information Dynamics in Analysis}
667 672
673 \subsection{Musicological Analysis}
674 \label{s:minimusic}
675
\begin{fig}{twopages}
\colfig[0.96]{matbase/fig9471}\\ % update from mbc paper
% \colfig[0.97]{matbase/fig72663}\\ % later update from mbc paper (Keith's new picks)
\vspace*{0.5em}
\colfig[0.97]{matbase/fig13377} % rule based analysis
\caption{Analysis of \emph{Two Pages}.
The thick vertical lines are the part boundaries as indicated in
the score by the composer.
The thin grey lines
indicate changes in the melodic `figures' of which the piece is
constructed. In the `model information rate' panel, the black asterisks
mark the six most surprising moments selected by Keith Potter.
The bottom two panels show two rule-based boundary strength analyses.
All information measures are in nats.
Note that the boundary marked in the score at around note 5,400 is known to be
anomalous; on the basis of a listening analysis, some musicologists have
placed the boundary a few bars later, in agreement with our analysis
\cite{PotterEtAl2007}.
}
\end{fig}

In \cite{AbdallahPlumbley2009}, we analysed two pieces of music in the minimalist style
by Philip Glass: \emph{Two Pages} (1969) and \emph{Gradus} (1968).
The analysis was done using a first-order Markov chain model, with the
enhancement that the transition matrix of the model was allowed to
evolve dynamically as the notes were processed, and was tracked (in
a Bayesian way) as a \emph{distribution} over possible transition matrices,
rather than a point estimate. Some results are summarised in \figrf{twopages}:
the upper four plots show the dynamically evolving subjective information
measures as described in \secrf{surprise-info-seq}, computed using a point
estimate of the current transition matrix; the fifth plot (the `model information rate')
shows the information in each observation about the transition matrix.
In \cite{AbdallahPlumbley2010b}, we showed that this `model information rate'
is actually a component of the true IPI when the transition
matrix is being learned online, and was neglected when we computed the IPI from
the transition matrix as if it were a constant.
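A minimal sketch of this kind of dynamically evolving model (ours, for illustration only; the model actually used in \cite{AbdallahPlumbley2009} differs in detail) keeps an independent Dirichlet belief over each column of the transition matrix, and measures the `model information' in an observation as the KL divergence from the posterior to the prior over the relevant column:

```python
import numpy as np
from scipy.special import gammaln, digamma

def dirichlet_kl(alpha_post, alpha_prior):
    """KL( Dir(alpha_post) || Dir(alpha_prior) ) in nats, in closed form."""
    a0, b0 = alpha_post.sum(), alpha_prior.sum()
    return float(gammaln(a0) - gammaln(b0)
                 - np.sum(gammaln(alpha_post) - gammaln(alpha_prior))
                 + np.dot(alpha_post - alpha_prior,
                          digamma(alpha_post) - digamma(a0)))

class DynamicMarkovModel:
    """Online Bayesian tracking of a K x K transition matrix: an independent
    Dirichlet belief (vector of pseudo-counts) per predecessor symbol."""

    def __init__(self, K, strength=1.0):
        self.alpha = np.full((K, K), strength)  # alpha[:, j]: counts for context j

    def observe(self, prev, cur):
        col = self.alpha[:, prev]
        surprise = -np.log(col[cur] / col.sum())  # -log predictive probability
        updated = col.copy()
        updated[cur] += 1.0
        model_info = dirichlet_kl(updated, col)   # information about the parameters
        self.alpha[:, prev] = updated
        return surprise, model_info
```

As the counts for a context accumulate, `model_info` shrinks: once the model is confident about a column of the transition matrix, further observations say little about the parameters even when they remain surprising as events.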

The peaks of the surprisingness and both components of the IPI
show good correspondence with the structure of the piece, both as marked in the score
and as analysed by musicologist Keith Potter, who was asked to mark the six
`most surprising moments' of the piece (shown as asterisks in the fifth plot).

In contrast, the analyses shown in the lower two plots of \figrf{twopages},
obtained using two rule-based music segmentation algorithms, while clearly
\emph{reflecting} the structure of the piece, do not \emph{segment} the piece:
the boundary strength functions show no tendency to peak at
the boundaries in the piece.
show that the first note of each bar is, on average, significantly more surprising
and informative than the others, up to the 64-note level, whereas at the 128-note
level, the dominant periodicity appears to remain at 64 notes.

\begin{fig}{metre}
% \scalebox{1}{%
\begin{tabular}{cc}
\colfig[0.45]{matbase/fig36859} & \colfig[0.48]{matbase/fig88658} \\
\colfig[0.45]{matbase/fig48061} & \colfig[0.48]{matbase/fig46367} \\
\colfig[0.45]{matbase/fig99042} & \colfig[0.47]{matbase/fig87490}
% \colfig[0.46]{matbase/fig56807} & \colfig[0.48]{matbase/fig27144} \\
% \colfig[0.48]{matbase/fig9142} & \colfig[0.48]{matbase/fig27751}

\end{tabular}%
% }
\caption{Metrical analysis by computing average surprisingness and
IPI of notes at different periodicities (\ie hypothetical
bar lengths) and phases (\ie positions within a bar).
}
\end{fig}

\subsection{Real-valued signals and audio analysis}
Using analogous definitions based on the differential entropy
\cite{CoverThomas}, the methods outlined
in \secrf{surprise-info-seq} and \secrf{process-info}
can be reformulated for random variables taking values in a continuous domain.
Information-dynamic methods may thus be applied to expressive parameters of music
such as dynamics, timing and timbre, which are readily quantified on a continuous scale.

Dubnov \cite{Dubnov2006} considers the class of stationary Gaussian
processes, for which the entropy rate may be obtained analytically
from the power spectral density of the signal. Dubnov found that the
multi-information rate (which he refers to as `information rate') can be
expressed as a function of the \emph{spectral flatness measure}. Thus, for a given variance,
Gaussian processes with maximal multi-information rate are those with maximally
non-flat spectra. These essentially consist of a single
sinusoidal component and hence are completely predictable once
the parameters of the sinusoid have been inferred.
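Concretely, the spectral flatness relationship can be sketched as follows (our illustration, assuming a PSD discretised on a uniform frequency grid; the AR(1) example and its closed form $\rho_\mu = -\frac{1}{2}\log(1-\phi^2)$ are our own check, not Dubnov's example):

```python
import numpy as np

def multi_information_rate(psd):
    """Multi-information rate (nats) of a stationary Gaussian process, from
    samples of its power spectral density on a uniform grid over [-pi, pi):
    rho = -0.5 * log(flatness), flatness = geometric mean / arithmetic mean."""
    flatness = np.exp(np.mean(np.log(psd))) / np.mean(psd)
    return -0.5 * np.log(flatness)

# AR(1) example: x_t = phi * x_{t-1} + e_t has PSD proportional to
# 1 / |1 - phi exp(-i w)|^2; analytically rho = -0.5 * log(1 - phi^2).
phi = 0.5
w = np.linspace(-np.pi, np.pi, 2**16, endpoint=False)
psd = 1.0 / np.abs(1.0 - phi * np.exp(-1j * w)) ** 2
rho = multi_information_rate(psd)
```

A perfectly flat spectrum (white noise) gives flatness 1 and hence $\rho_\mu = 0$, while a spectrum concentrating on a single component drives the flatness towards zero and $\rho_\mu$ towards infinity, consistent with the sinusoid limit described above.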
% Local stationarity is assumed, which may be achieved by windowing or
% change point detection \cite{Dubnov2008}.

We are currently working towards methods for the computation of the predictive information
rate in some restricted classes of Gaussian processes, including finite-order
autoregressive models and processes with power-law (or $1/f$) spectra,
which have previously been investigated in relation to their aesthetic properties
\cite{Voss75,TaylorSpeharVan-Donkelaar2011}.

805 815
806 816
807 \subsection{Beat Tracking} 817 \subsection{Beat Tracking}
808 818
809 A probabilistic method for drum tracking was presented by Robertson 819 A probabilistic method for drum tracking was presented by Robertson
810 \cite{Robertson11c}. The algorithm is used to synchronise a music 820 \cite{Robertson11c}. The system infers a beat grid (a sequence
811 sequencer to a live drummer. The expected beat time of the sequencer is 821 of approximately regular beat times) given audio inputs from a
812 represented by a click track, and the algorithm takes as input event 822 live drummer, for the purpose of synchronising a music
813 times for discrete kick and snare drum events relative to this click 823 sequencer with the drummer.
814 track. These are obtained using dedicated microphones for each drum and 824 The times of kick and snare drum events are obtained
815 using a percussive onset detector \cite{puckette98}. The drum tracker 825 using dedicated microphones for each drum and a percussive onset detector
816 continually updates distributions for tempo and phase on receiving a new 826 \cite{puckette98}. These event times are then sent
817 event time. We can thus quantify the information contributed of an event 827 to the beat tracker, which maintains a probabilistic belief state in
818 by measuring the difference between the system's prior distribution and 828 the form of distributions over the tempo and phase of the beat grid.
819 the posterior distribution using the Kullback-Leiber divergence. 829 Every time an event is received, these distributions are updated
820 830 with respect to a probabilistic model which accounts both for tempo and phase
821 Here, we have calculated the KL divergence and entropy for kick and 831 variations and the emission of drum events at musically plausible times
822 snare events in sixteen files. The analysis of information rates can be 832 relative to the beat grid.
823 considered \emph{subjective}, in that it measures how the drum tracker's 833 %continually updates distributions for tempo and phase on receiving a new
824 probability distributions change, and these are contingent upon the 834 %event time
825 model used as well as external properties in the signal. We expect, 835
826 however, that following periods of increased uncertainty, such as fills 836 The use of a probabilistic belief state means we can compute entropies
827 or expressive timing, the information contained in an individual event 837 representing the system's uncertainty about the beat grid, and quantify
828 increases. We also examine whether the information is dependent upon 838 the amount of information in each event about the beat grid as the KL divergence
829 metrical position. 839 between prior and posterior distributions. Though this is not strictly the
830 840 instantaneous predictive information (IPI) as described in \secrf{surprise-info-seq}
831 % !!! FIXME 841 (the information gained is not directly about future event times), we can treat
842 it as a proxy for the IPI, in the manner of the `model information rate'
843 described in \secrf{minimusic}, which has a similar status.
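As a toy illustration of this kind of measurement (not Robertson's actual implementation), consider a discretised belief over beat phase: a drum onset reweights the prior by a likelihood, and the KL divergence from posterior to prior gives the information, in bits, carried by that event. The likelihood values here are invented for illustration.

```python
import math

def entropy_bits(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def kl_bits(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)

# Discretised belief over beat phase (8 bins), initially maximally uncertain.
prior = [1 / 8] * 8
# Hypothetical likelihood of the observed onset time under each phase bin.
likelihood = [0.40, 0.20, 0.05, 0.05, 0.05, 0.05, 0.05, 0.15]
posterior = [p * l for p, l in zip(prior, likelihood)]
z = sum(posterior)
posterior = [x / z for x in posterior]

event_info = kl_bits(posterior, prior)  # information in this event, in bits
```

The entropy of the posterior is lower than that of the prior (the instantaneous drop at each event), while `event_info` is strictly positive whenever the event changes the belief at all.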
844
845 \begin{fig*}{drumfig}
846 % \includegraphics[width=0.9\linewidth]{drum_plots/file9-track.eps}% \\
847 \includegraphics[width=0.97\linewidth]{drum_plots/file11-track.eps} \\
848 % \includegraphics[width=0.9\linewidth]{newplots/file8-track.eps}
849 \caption{Information dynamic analysis derived from audio recordings of
850 drumming, obtained by applying a Bayesian beat tracking system to the
851 sequence of detected kick and snare drum events. The grey line shows the system's
852 varying level of uncertainty (entropy) about the tempo and phase of the
853 beat grid, while the stem plot shows the amount of information in each
854 drum event about the beat grid. The entropy drops instantaneously at each
855 event and rises gradually between events.
856 }
857 \end{fig*}
858
859 We carried out the analysis on 16 recordings; an example
860 is shown in \figrf{drumfig}. There we can see variations in the
861 entropy in the upper graph and the information in each drum event in the lower
862 stem plot. At certain points in time, unusually large amounts of information
863 arrive; these may be related to fills and other rhythmic irregularities, which
864 are often followed by an emphatic return to a steady beat at the beginning
865 of the next bar---this is something we are currently investigating.
866 We also analysed the pattern of information flow
867 on a cyclic metre, much as in \figrf{metre}. All the recordings we
868 analysed are audibly in 4/4 metre, but we found no
869 evidence of a general tendency for greater amounts of information to arrive
870 at metrically strong beats, which suggests that the rhythmic accuracy of the
871 drummers does not vary systematically across each bar. It is possible that metrical information
872 existing in the pattern of kick and snare events might emerge in an information
873 dynamic analysis using a model that attempts to predict the time and type of
874 the next drum event, rather than just inferring the beat grid as the current model does.
875 %The analysis of information rates can b
876 %considered \emph{subjective}, in that it measures how the drum tracker's
877 %probability distributions change, and these are contingent upon the
878 %model used as well as external properties in the signal.
879 %We expect,
880 %however, that following periods of increased uncertainty, such as fills
881 %or expressive timing, the information contained in an individual event
882 %increases. We also examine whether the information is dependent upon
883 %metrical position.
884
832 885
833 \section{Information dynamics as compositional aid} 886 \section{Information dynamics as compositional aid}
834 \label{s:composition} 887 \label{s:composition}
835 888
836 The use of stochastic processes in music composition has been widespread for 889 The use of stochastic processes in music composition has been widespread for
839 can drive the \emph{generative} phase of the creative process, information dynamics 892 can drive the \emph{generative} phase of the creative process, information dynamics
840 can serve as a novel framework for a \emph{selective} phase, by 893 can serve as a novel framework for a \emph{selective} phase, by
841 providing a set of criteria to be used in judging which of the 894 providing a set of criteria to be used in judging which of the
842 generated materials 895 generated materials
843 are of value. This alternation of generative and selective phases has been 896 are of value. This alternation of generative and selective phases has been
844 noted by art theorist Margaret Boden \cite{Boden1990}. 897 noted before \cite{Boden1990}.
845 898 %
846 Information-dynamic criteria can also be used as \emph{constraints} on the 899 Information-dynamic criteria can also be used as \emph{constraints} on the
847 generative processes, for example, by specifying a certain temporal profile 900 generative processes, for example, by specifying a certain temporal profile
848 of surprisingness and uncertainty the composer wishes to induce in the listener 901 of surprisingness and uncertainty the composer wishes to induce in the listener
849 as the piece unfolds. 902 as the piece unfolds.
850 %stochastic and algorithmic processes: ; outputs can be filtered to match a set of 903 %stochastic and algorithmic processes: ; outputs can be filtered to match a set of
867 Processes with high PIR maintain a certain kind of balance between 920 Processes with high PIR maintain a certain kind of balance between
868 predictability and unpredictability in such a way that the observer must continually 921 predictability and unpredictability in such a way that the observer must continually
869 pay attention to each new observation as it occurs in order to make the best 922 pay attention to each new observation as it occurs in order to make the best
870 possible predictions about the evolution of the sequence. This balance between predictability 923 possible predictions about the evolution of the sequence. This balance between predictability
871 and unpredictability is reminiscent of the inverted `U' shape of the Wundt curve (see \figrf{wundt}), 924 and unpredictability is reminiscent of the inverted `U' shape of the Wundt curve (see \figrf{wundt}),
872 which summarises the observations of Wundt that the greatest aesthetic value in art 925 which summarises the observations of Wundt \cite{Wundt1897} that stimuli are most
873 is to be found at intermediate levels of disorder, where there is a balance between 926 pleasing at intermediate levels of novelty or disorder, where there is a balance between
874 `order' and `chaos'. 927 `order' and `chaos'.
875 928
876 Using the methods of \secrf{markov}, we found \cite{AbdallahPlumbley2009} 929 Using the methods of \secrf{markov}, we found \cite{AbdallahPlumbley2009}
877 a similar shape when plotting entropy rate against PIR---this is visible in the 930 a similar shape when plotting entropy rate against PIR---this is visible in the
878 upper envelope of the scatter plot in \figrf{mtriscat}, which is a 3-D scatter plot of 931 upper envelope of the scatter plot in \figrf{mtriscat}, which is a 3-D scatter plot of
881 The coordinates of the `information space' are entropy rate ($h_\mu$), redundancy ($\rho_\mu$), and 934 The coordinates of the `information space' are entropy rate ($h_\mu$), redundancy ($\rho_\mu$), and
882 predictive information rate ($b_\mu$). The points along the `redundancy' axis correspond 935 predictive information rate ($b_\mu$). The points along the `redundancy' axis correspond
883 to periodic Markov chains. Those along the `entropy' axis produce uncorrelated sequences 936 to periodic Markov chains. Those along the `entropy' axis produce uncorrelated sequences
884 with no temporal structure. Processes with high PIR are to be found at intermediate 937 with no temporal structure. Processes with high PIR are to be found at intermediate
885 levels of entropy and redundancy. 938 levels of entropy and redundancy.
886 These observations led us to construct the `Melody Triangle' as a graphical interface 939 These observations led us to construct the `Melody Triangle', a graphical interface
887 for exploring the melodic patterns generated by each of the Markov chains represented 940 for exploring the melodic patterns generated by each of the Markov chains represented
888 as points in \figrf{mtriscat}. 941 as points in \figrf{mtriscat}.
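For a first-order Markov chain, all three coordinates can be computed directly from the transition matrix. The sketch below follows our reading of the definitions: entropy rate as the expected entropy of the next-symbol distribution, redundancy as marginal entropy minus entropy rate, and PIR as the excess of the two-step entropy rate over the one-step rate; the stationary distribution is found by a simple power iteration from uniform, which is adequate for illustration but not fully general.

```python
import math

def _H(p):
    """Entropy in bits of a probability vector."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def stationary(P, iters=500):
    """Stationary distribution by power iteration from the uniform
    distribution (illustrative; assumes the iteration settles)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def entropy_rate(P, pi=None):
    """h_mu: expected entropy of the next-state distribution, in bits."""
    pi = stationary(P) if pi is None else pi
    return sum(pi[i] * _H(P[i]) for i in range(len(P)))

def redundancy(P):
    """rho_mu: marginal entropy H(pi) minus entropy rate h_mu."""
    pi = stationary(P)
    return _H(pi) - entropy_rate(P, pi)

def pir(P):
    """b_mu for a first-order chain: two-step entropy rate minus h_mu."""
    pi = stationary(P)
    return entropy_rate(matmul(P, P), pi) - entropy_rate(P, pi)
```

The period-2 chain `[[0, 1], [1, 0]]` has zero entropy rate and redundancy of exactly 1 bit, matching the first concentration of points on the redundancy axis; an unbiased coin-flip chain has entropy rate 1 bit and zero redundancy and PIR.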
942
943
944 %It is possible to apply information dynamics to the generation of content, such as to the composition of musical materials.
945
946 %For instance a stochastic music generating process could be controlled by modifying
947 %constraints on its output in terms of predictive information rate or entropy
948 %rate.
889 949
890 \begin{fig}{wundt} 950 \begin{fig}{wundt}
891 \raisebox{-4em}{\colfig[0.43]{wundt}} 951 \raisebox{-4em}{\colfig[0.43]{wundt}}
892 % {\ \shortstack{{\Large$\longrightarrow$}\\ {\scriptsize\emph{exposure}}}\ } 952 % {\ \shortstack{{\Large$\longrightarrow$}\\ {\scriptsize\emph{exposure}}}\ }
893 {\ {\large$\longrightarrow$}\ } 953 {\ {\large$\longrightarrow$}\ }
898 in a move to the left along the curve \cite{Berlyne71}. 958 in a move to the left along the curve \cite{Berlyne71}.
899 } 959 }
900 \end{fig} 960 \end{fig}
901 961
902 962
903 %It is possible to apply information dynamics to the generation of content, such as to the composition of musical materials.
904
905 %For instance a stochastic music generating process could be controlled by modifying
906 %constraints on its output in terms of predictive information rate or entropy
907 %rate.
908
909
910 963
911 \subsection{The Melody Triangle} 964 \subsection{The Melody Triangle}
912 965
913 The Melody Triangle is an exploratory interface for the discovery of melodic 966 The Melody Triangle is an interface for the discovery of melodic
914 content, where the input---positions within a triangle---directly map to information 967 materials, where the input---positions within a triangle---directly map to information
915 theoretic properties of the output. 968 theoretic properties of the output.
916 %The measures---entropy rate, redundancy and 969 %The measures---entropy rate, redundancy and
917 %predictive information rate---form a criteria with which to filter the output 970 %predictive information rate---form a criteria with which to filter the output
918 %of the stochastic processes used to generate sequences of notes. 971 %of the stochastic processes used to generate sequences of notes.
919 These measures 972 %These measures
920 address notions of expectation and surprise in music, and as such the Melody 973 %address notions of expectation and surprise in music, and as such the Melody
921 Triangle is a means of interfacing with a generative process in terms of the 974 %Triangle is a means of interfacing with a generative process in terms of the
922 predictability of its output. 975 %predictability of its output.
923
924
925 \begin{fig}{mtriscat}
926 \colfig[0.9]{mtriscat}
927 \caption{The population of transition matrices distributed along three axes of
928 redundancy, entropy rate and predictive information rate (all measured in bits).
929 The concentrations of points along the redundancy axis correspond
930 to Markov chains which are roughly periodic with periods of 2 (redundancy 1 bit),
931 3, 4, \etc all the way to period 8 (redundancy 3 bits). The colour of each point
932 represents its PIR---note that the highest values are found at intermediate entropy
933 and redundancy, and that the distribution as a whole makes a curved triangle. Although
934 not visible in this plot, it is largely hollow in the middle.}
935 \end{fig}
936 976
937 The triangle is populated with first order Markov chain transition 977 The triangle is populated with first order Markov chain transition
938 matrices as illustrated in \figrf{mtriscat}. 978 matrices as illustrated in \figrf{mtriscat}.
939 The distribution of transition matrices plotted in this space forms an arch shape 979 The distribution of transition matrices in this space forms a relatively thin
940 that is fairly thin. Thus, it is a reasonable simplification to project out the 980 curved sheet. Thus, it is a reasonable simplification to project out the
941 third dimension (the PIR) and present an interface that is just two dimensional. 981 third dimension (the PIR) and present an interface that is just two dimensional.
942 The right-angled triangle is rotated, reflected and stretched to form an equilateral triangle with 982 The right-angled triangle is rotated, reflected and stretched to form an equilateral triangle with
943 the $h_\mu=0, \rho_\mu=0$ vertex at the top, the `redundancy' axis down the left-hand 983 the $h_\mu=0, \rho_\mu=0$ vertex at the top, the `redundancy' axis down the left-hand
944 side, and the `entropy rate' axis down the right, as shown in \figrf{TheTriangle}. 984 side, and the `entropy rate' axis down the right, as shown in \figrf{TheTriangle}.
945 This is our `Melody Triangle' and 985 This is our `Melody Triangle' and
946 forms the interface by which the system is controlled. 986 forms the interface by which the system is controlled.
947 %Using this interface thus involves a mapping to information space; 987 %Using this interface thus involves a mapping to information space;
948 The user selects a position within the triangle, the point is mapped into the 988 The user selects a point within the triangle; this is mapped into the
949 information space, and a corresponding transition matrix is returned. The third dimension, 989 information space and the nearest transition matrix is used to generate
950 though not visible, is implicitly there, as transition matrices retrieved from 990 a sequence of values which are then sonified either as pitched notes or percussive
991 sounds. By choosing the position within the triangle, the user can control the
992 output at the level of its `collative' properties, with access to the variety
993 of patterns as described above and in \secrf{markov}.
994 %and information-theoretic criteria related to predictability
995 %and information flow
996 Though the interface is 2D, the third dimension (PIR) is implicitly present, as
997 transition matrices retrieved from
951 along the centre line of the triangle will tend to have higher PIR. 998 along the centre line of the triangle will tend to have higher PIR.
952 999 We hypothesise that, under
953 Each corner corresponds to three different extremes of predictability and
954 unpredictability, which could be loosely characterised as `periodicity', `noise'
955 and `repetition'. Melodies from the `noise' corner (high $h_\mu$, low $\rho_\mu$
956 and $b_\mu$) have no discernible pattern;
957 Melodies along the `periodicity'
958 to `repetition' edge are all deterministic loops that get shorter as we approach
959 the `repetition' corner, until each is just one repeating note. The
960 areas in between will tend to have higher PIR, and we hypothesise that, under
961 the appropriate conditions, these will be perceived as more `interesting' or 1000 the appropriate conditions, these will be perceived as more `interesting' or
962 `melodic.' 1001 `melodic.'
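The point-to-matrix lookup just described can be as simple as a nearest-neighbour search over the precomputed population, using squared Euclidean distance in the (entropy rate, redundancy) plane; this sketch is our guess at a minimal mechanism, not the published implementation.

```python
def nearest_matrix(point, population):
    """Return the transition matrix whose (h_mu, rho_mu) coordinates lie
    closest (in squared Euclidean distance) to the requested 2-D point.
    `population` is a list of ((h, rho), matrix) pairs."""
    def sqdist(item):
        (h, rho), _ = item
        return (h - point[0]) ** 2 + (rho - point[1]) ** 2
    return min(population, key=sqdist)[1]

# Illustrative population with labels standing in for actual matrices.
population = [((0.0, 1.0), "period-2"),
              ((1.0, 0.0), "noise"),
              ((0.5, 0.5), "mid")]
```

For example, a point near the `entropy rate' axis such as `(0.9, 0.1)` retrieves the high-entropy, low-redundancy `"noise"` entry.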
1002
1003 %The corners correspond to three different extremes of predictability and
1004 %unpredictability, which could be loosely characterised as `periodicity', `noise'
1005 %and `repetition'. Melodies from the `noise' corner (high $h_\mu$, low $\rho_\mu$
1006 %and $b_\mu$) have no discernible pattern;
1007 %those along the `periodicity'
1008 %to `repetition' edge are all cyclic patterns that get shorter as we approach
1009 %the `repetition' corner, until each is just one repeating note. Those along the
1010 %opposite edge consist of independent random notes from non-uniform distributions.
1011 %Areas between the left and right edges will tend to have higher PIR,
1012 %and we hypothesise that, under
1013 %the appropriate conditions, these will be perceived as more `interesting' or
1014 %`melodic.'
963 %These melodies have some level of unpredictability, but are not completely random. 1015 %These melodies have some level of unpredictability, but are not completely random.
964 % Or, conversely, are predictable, but not entirely so. 1016 % Or, conversely, are predictable, but not entirely so.
965
966 \begin{fig}{TheTriangle}
967 \colfig[0.8]{TheTriangle.pdf}
968 \caption{The Melody Triangle}
969 \end{fig}
970 1017
971 %PERHAPS WE SHOULD FOREGO TALKING ABOUT THE 1018 %PERHAPS WE SHOULD FOREGO TALKING ABOUT THE
972 %INSTALLATION VERSION OF THE TRIANGLE? 1019 %INSTALLATION VERSION OF THE TRIANGLE?
973 %feels a bit like a tangent, and could do with the space.. 1020 %feels a bit like a tangent, and could do with the space..
974 The Melody Triangle exists in two incarnations; a standard screen based interface 1021 The Melody Triangle exists in two incarnations: a screen-based interface
975 where a user moves tokens in and around a triangle on screen, and a multi-user 1022 where a user moves tokens in and around a triangle on screen, and a multi-user
976 interactive installation where a Kinect camera tracks individuals in a space and 1023 interactive installation where a Kinect camera tracks individuals in a space and
977 maps their positions in physical space to the triangle. In the latter each visitor 1024 maps their positions in physical space to the triangle. In the latter each visitor
978 that enters the installation generates a melody and can collaborate with their 1025 that enters the installation generates a melody and can collaborate with their
979 co-visitors to generate musical textures. This makes the interaction physically engaging 1026 co-visitors to generate musical textures. This makes the interaction physically engaging
980 and (as our experience with visitors both young and old has demonstrated) more playful. 1027 and (as our experience with visitors both young and old has demonstrated) more playful.
981 %Additionally visitors can change the 1028 %Additionally visitors can change the
982 %tempo, register, instrumentation and periodicity of their melody with body gestures. 1029 %tempo, register, instrumentation and periodicity of their melody with body gestures.
983 1030
984 As a screen based interface the Melody Triangle can serve as a composition tool. 1031
1032 \begin{fig}{mtriscat}
1033 \colfig[0.9]{mtriscat}
1034 \caption{The population of transition matrices in the 3D space of
1035 entropy rate ($h_\mu$), redundancy ($\rho_\mu$) and PIR ($b_\mu$),
1036 all in bits.
1037 The concentrations of points along the redundancy axis correspond
1038 to Markov chains which are roughly periodic with periods of 2 (redundancy 1 bit),
1039 3, 4, \etc all the way to period 7 (redundancy 2.8 bits). The colour of each point
1040 represents its PIR---note that the highest values are found at intermediate entropy
1041 and redundancy, and that the distribution as a whole makes a curved triangle. Although
1042 not visible in this plot, it is largely hollow in the middle.}
1043 \end{fig}
1044
1045
1046 The screen based interface can serve as a compositional tool.
985 %%A triangle is drawn on the screen, screen space thus mapped to the statistical 1047 %%A triangle is drawn on the screen, screen space thus mapped to the statistical
986 %space of the Melody Triangle. 1048 %space of the Melody Triangle.
987 A number of tokens, each representing a 1049 A number of tokens, each representing a
988 melody, can be dragged in and around the triangle. For each token, a sequence of symbols with 1050 sonification stream or `voice', can be dragged in and around the triangle.
989 statistical properties that correspond to the token's position is generated. These 1051 For each token, a sequence of symbols is sampled using the corresponding
990 symbols are then mapped to notes of a scale or percussive sounds. 1052 transition matrix, which
991 However they could easily be mapped to other musical processes, possibly over 1053 %statistical properties that correspond to the token's position is generated. These
1054 %symbols
1055 are then mapped to notes of a scale or percussive sounds%
1056 \footnote{The sampled sequence could easily be mapped to other musical processes, possibly over
992 different time scales, such as chords, dynamics and timbres. It would also be possible 1057 different time scales, such as chords, dynamics and timbres. It would also be possible
993 to map the symbols to visual or kinetic outputs. 1058 to map the symbols to visual or other outputs.}%
1059 . Keyboard commands give control over other musical parameters such
1060 as pitch register and inter-onset interval.
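The sampling and sonification of a single token can be sketched as follows; the scale mapping and MIDI note numbers are illustrative choices, not those of the installation.

```python
import random

def sample_chain(P, n, rng):
    """Sample n symbols from a first-order Markov chain with
    row-stochastic transition matrix P (given as a list of rows)."""
    k = len(P)
    state = rng.randrange(k)
    seq = []
    for _ in range(n):
        seq.append(state)
        state = rng.choices(range(k), weights=P[state])[0]
    return seq

# Hypothetical mapping of symbols onto a C major scale (MIDI note numbers).
SCALE = [60, 62, 64, 65, 67, 69, 71, 72]

def to_pitches(seq):
    return [SCALE[s % len(SCALE)] for s in seq]
```

A token near the `periodicity' corner corresponds to a near-deterministic matrix, so the sampled melody loops; one near the `noise' corner yields uncorrelated notes.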
994 %The possibilities afforded by the Melody Triangle in these other domains remains to be investigated.}. 1061 %The possibilities afforded by the Melody Triangle in these other domains remains to be investigated.}.
995 Additionally keyboard commands give control over other musical parameters such 1062 %
996 as pitch register and note duration. 1063 The system is capable of generating quite intricate musical textures when multiple tokens
997 1064 are in the triangle, but unlike other computer aided composition tools or programming
998 The Melody Triangle can generate intricate musical textures when multiple tokens 1065 environments, the composer excercises control at the abstract level of information-dynamic
999 are in the triangle. Unlike other computer aided composition tools or programming 1066 properties.
1000 environments, here the composer engages with music on a high and abstract level; 1067 %the interface relating to subjective expectation and predictability.
1001 the interface relating to subjective expectation and predictability. 1068
1002 1069 \begin{fig}{TheTriangle}
1003 1070 \colfig[0.7]{TheTriangle.pdf}
1004 1071 \caption{The Melody Triangle}
1005 1072 \end{fig}
1006
1007 \begin{fig}{mtri-results}
1008 \def\scat#1{\colfig[0.42]{mtri/#1}}
1009 \def\subj#1{\scat{scat_dwells_subj_#1} & \scat{scat_marks_subj_#1}}
1010 \begin{tabular}{cc}
1011 % \subj{a} \\
1012 \subj{b} \\
1013 \subj{c} \\
1014 \subj{d}
1015 \end{tabular}
1016 \caption{Dwell times and mark positions from user trials with the
1017 on-screen Melody Triangle interface, for two subjects. The left-hand column shows
1018 the positions in a 2D information space (entropy rate vs multi-information rate
1019 in bits) where each spent their time; the area of each circle is proportional
1020 to the time spent there. The right-hand column shows point which subjects
1021 `liked'; the area of the circles here is proportional to the duration spent at
1022 that point before the point was marked.}
1023 \end{fig}
1024 1073
1025 \comment{ 1074 \comment{
1026 \subsection{Information Dynamics as Evaluative Feedback Mechanism} 1075 \subsection{Information Dynamics as Evaluative Feedback Mechanism}
1027 %NOT SURE THIS SHOULD BE HERE AT ALL..? 1076 %NOT SURE THIS SHOULD BE HERE AT ALL..?
1028 Information measures on a stream of symbols can form a feedback mechanism; a 1077 Information measures on a stream of symbols can form a feedback mechanism; a
1043 characteristics of sonified Markov chains and subjective musical preference. 1092 characteristics of sonified Markov chains and subjective musical preference.
1044 We carried out a pilot study with six participants, who were asked 1093 We carried out a pilot study with six participants, who were asked
1045 to use a simplified form of the user interface (a single controllable token, 1094 to use a simplified form of the user interface (a single controllable token,
1046 and no rhythmic, registral or timbral controls) under two conditions: 1095 and no rhythmic, registral or timbral controls) under two conditions:
1047 one where a single sequence was sonified under user control, and another 1096 one where a single sequence was sonified under user control, and another
1048 where an addition sequence was sonified in a different register, as if generated 1097 where an additional sequence was sonified in a different register, as if generated
1049 by a fixed invisible in one of four regions of the triangle. In addition, subjects 1098 by a fixed invisible token in one of four regions of the triangle. In addition, subjects
1050 were asked to press a key if they `liked' what they were hearing. 1099 were asked to press a key if they `liked' what they were hearing.
1051 1100
1052 We recorded subjects' behaviour as well as points which they marked 1101 We recorded subjects' behaviour as well as points which they marked
1053 with a key press. 1102 with a key press.
1054 Some results for three of the subjects are shown in \figrf{mtri-results}. Though 1103 Some results for two of the subjects are shown in \figrf{mtri-results}. Though
1055 we have not been able to detect any systematic across-subjects preference for any particular 1104 we have not been able to detect any systematic across-subjects preference for any particular
1056 region of the triangle, subjects do seem to exhibit distinct kinds of exploratory behaviour. 1105 region of the triangle, subjects do seem to exhibit distinct kinds of exploratory behaviour.
1057 Our initial hypothesis, that subjects would linger longer in regions of the triangle 1106 Our initial hypothesis, that subjects would linger longer in regions of the triangle
1058 that produced aesthetically preferable sequences, and that this tend to be towards the 1107 that produced aesthetically preferable sequences, and that this would tend to be towards the
1059 centre line of the triangle for all subjects, was not confirmed. However, it is possible 1108 centre line of the triangle for all subjects, was not confirmed. However, it is possible
1060 that the design of the experiment encouraged an initial exploration of the space (sometimes 1109 that the design of the experiment encouraged an initial exploration of the space (sometimes
1061 very systematic, as for subject c) aimed at \emph{understanding} the parameter space and 1110 very systematic, as for subject c) aimed at \emph{understanding} %the parameter space and
1062 how the system works, rather than finding musical sequences. It is also possible that the 1111 how the system works, rather than finding musical patterns. It is also possible that the
1063 system encourages users to create musically interesting output by \emph{moving the token}, 1112 system encourages users to create musically interesting output by \emph{moving the token},
1064 rather than finding a particular spot in the triangle which produces a musically interesting 1113 rather than finding a particular spot in the triangle which produces a musically interesting
1065 pattern by itself. 1114 sequence by itself.
1115
1116 \begin{fig}{mtri-results}
1117 \def\scat#1{\colfig[0.42]{mtri/#1}}
1118 \def\subj#1{\scat{scat_dwells_subj_#1} & \scat{scat_marks_subj_#1}}
1119 \begin{tabular}{cc}
1120 % \subj{a} \\
1121 % \subj{b} \\
1122 \subj{c} \\
1123 \subj{d}
1124 \end{tabular}
1125 \caption{Dwell times and mark positions from user trials with the
1126 on-screen Melody Triangle interface, for two subjects. The left-hand column shows
1127 the positions in a 2D information space (entropy rate vs multi-information rate
1128 in bits) where each spent their time; the area of each circle is proportional
1129 to the time spent there. The right-hand column shows points which subjects
1130 `liked'; the area of the circles here is proportional to the duration spent at
1131 that point before the point was marked.}
1132 \end{fig}
1066 1133
1067 Comments collected from the subjects 1134 Comments collected from the subjects
1068 %during and after the experiment 1135 %during and after the experiment
1069 suggest that 1136 suggest that
1070 the information-dynamic characteristics of the patterns were readily apparent 1137 the information-dynamic characteristics of the patterns were readily apparent
1071 to most: several noticed the main organisation of the triangle, 1138 to most: several noticed the main organisation of the triangle,
1072 with repetative notes at the top, cyclic patterns along one edge, and unpredictable 1139 with repetitive notes at the top, cyclic patterns along one edge, and unpredictable
1073 notes towards the opposite corner. Some described their consciously systematic exploration of the space. 1140 notes towards the opposite corner. Some described their systematic exploration of the space.
1074 Two felt that the right side was `more controllable' than the left (a direct consequence 1141 Two felt that the right side was `more controllable' than the left (a consequence
1075 of their ability to return to a particular distinctive pattern and recognise it 1142 of their ability to return to a particular distinctive pattern and recognise it
1076 as one heard previously). Two said that the trial was too long and became bored towards the end, 1143 as one heard previously). Two reported that they became bored towards the end,
1077 but another felt there wasn't enough time to get to hear out the patterns properly. 1144 but another felt there wasn't enough time to `hear out' the patterns properly.
1078 One subject did not `enjoy' the patterns in the lower region, but another said the lower 1145 One subject did not `enjoy' the patterns in the lower region, but another said the lower
1079 central regions were more `melodic' and `interesting'. 1146 central regions were more `melodic' and `interesting'.
1080 1147
1081 We plan to continue the trials with a slightly less restricted user interface in order 1148 We plan to continue the trials with a slightly less restricted user interface in order
1082 to make the experience more enjoyable and thereby give subjects longer to use the interface; 1149 to make the experience more enjoyable and thereby give subjects longer to use the interface;
%and frequencies, only lighting when it heard these. As the Musicolour would
%`get bored', the musician would have to change and vary their playing, eliciting
%new and unexpected outputs in trying to keep the Musicolour interested.


\section{Conclusions}

% !!! FIXME
%We reviewed our information dynamics approach to the modelling of the perception
We have looked at several emerging areas of application of the methods and
ideas of information dynamics to various problems in music analysis, perception
and cognition, including musicological analysis of symbolic music, audio analysis,
rhythm processing and compositional and creative tasks. The approach has proved
successful in musicological analysis, and though our initial data on
rhythm processing and aesthetic preference are inconclusive, there is still
plenty of work to be done in this area: wherever there are probabilistic models,
information dynamics can shed light on their behaviour.


\section*{Acknowledgments}
This work is supported by EPSRC Doctoral Training Centre EP/G03723X/1 (HE),
GR/S82213/01 and EP/E045235/1 (SA), an EPSRC DTA Studentship (PF), an RAEng/EPSRC Research Fellowship 10216/88 (AR), an EPSRC Leadership Fellowship, EP/G007144/1