comparison draft.tex @ 70:2cb06db0d271
FINISHED!
author: samer
date: Sat, 17 Mar 2012 18:06:03 +0000
parents: 3fa185431bbc
children: 9135f6fb1a68
\section{Introduction}
\label{s:Intro}
The relationship between
Shannon's \cite{Shannon48} information theory and music and art in general has been the
subject of some interest since the 1950s
\cite{Youngblood58,CoonsKraehenbuehl1958,Moles66,Meyer67,Cohen1962}.
The general thesis is that perceptible qualities and subjective states
like uncertainty, surprise, complexity, tension, and interestingness
are closely related to information-theoretic quantities like
entropy, relative entropy, and mutual information.

immediately, after some delay, or modified as the music unfolds.
In this paper, we explore this ``Information Dynamics'' view of music,
discussing the theory behind it and some emerging applications.

\subsection{Expectation and surprise in music}
The idea that the musical experience is strongly shaped by the generation
and playing out of strong and weak expectations was put forward by, amongst others,
music theorists L. B. Meyer \cite{Meyer67} and Narmour \citep{Narmour77}, but was
recognised much earlier; for example,
it was elegantly put by Hanslick \cite{Hanslick1854} in the
nineteenth century:
We suppose that when we listen to music, expectations are created on the basis
of our familiarity with various styles of music and our ability to
detect and learn statistical regularities in the music as they emerge.
There is experimental evidence that human listeners are able to internalise
statistical knowledge about musical structure, \eg
% \citep{SaffranJohnsonAslin1999,EerolaToiviainenKrumhansl2002}, and also
\citep{SaffranJohnsonAslin1999}, and also
that statistical models can form an effective basis for computational
analysis of music, \eg
\cite{ConklinWitten95,PonsfordWigginsMellish1999,Pearce2005}.

% \subsection{Music and information theory}
With a probabilistic framework for music modelling and prediction in hand,
we can %are in a position to
compute various
\comment{
which provides us with a number of measures, such as entropy
and mutual information, which are suitable for quantifying states of
uncertainty and surprise, and thus could potentially enable us to build
quantitative models of the listening process described above. They are
% listener, a temporal programme of varying
% levels of uncertainty, ambiguity and surprise.


\subsection{Information dynamic approach}
Our working hypothesis is that, as an intelligent, predictive
agent (to which we will refer as `it') listens to a piece of music, it maintains
a dynamically evolving probabilistic belief state that enables it to make predictions
about how the piece will continue, relying on both its previous experience
of music and the emerging themes of the piece. As events unfold, it revises
this belief state, which includes predictive
distributions over possible future events. These
% distributions and changes in distributions
can be characterised in terms of a handful of information-theoretic
measures such as entropy and relative entropy. By tracing the
evolution of these measures, we obtain a representation which captures much
of the significant structure of the music.

One consequence of this approach is that regardless of the details of
the sensory input or even which sensory modality is being processed, the resulting
analysis is in terms of the same units: quantities of information (bits) and
rates of information flow (bits per second). The information-theoretic
concepts in terms of which the analysis is framed are universal to all sorts
of data.
\mathcal{I}_t = \sum_{\fut{x}_t \in \X^*}
p(\fut{x}_t|x_t,\past{x}_t) \log \frac{ p(\fut{x}_t|x_t,\past{x}_t) }{ p(\fut{x}_t|\past{x}_t) },
\end{equation}
where the sum is to be taken over the set of infinite sequences $\X^*$.
Note that it is quite possible for an event to be surprising but not informative
in a predictive sense.
As with the surprisingness, the observer can compute its \emph{expected} IPI
at time $t$, which reduces to a mutual information $I(X_t;\fut{X}_t|\ev(\past{X}_t=\past{x}_t))$
conditioned on the observed past. This could be used, for example, as an estimate
of the attentional resources which should be directed at this stream of data, which may
be in competition with other sensory streams.
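
For a first-order Markov chain, the conditioning on the infinite past collapses to the previous symbol, so the expected IPI given an observed past can be evaluated directly from the transition matrix. The following sketch is our illustration (not code from the paper), assuming a column-stochastic matrix with $a_{ij} = \Pr(X_t=i|X_{t-1}=j)$:

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats, ignoring zero-probability entries."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def expected_ipi(a, j):
    """Expected IPI I(X_t; X_{t+1} | X_{t-1} = j) for a first-order
    Markov chain with column-stochastic a[i, j] = P(X_t = i | X_{t-1} = j)."""
    a2 = a @ a                                 # two-step transition matrix
    h_future = entropy(a2[:, j])               # H(X_{t+1} | past = j)
    h_future_given_now = sum(a[i, j] * entropy(a[:, i]) for i in range(a.shape[0]))
    return h_future - h_future_given_now

# deterministic alternation: the past already fixes the future,
# so observing X_t carries no news about X_{t+1}
a_det = np.array([[0.0, 1.0], [1.0, 0.0]])
print(abs(expected_ipi(a_det, 0)))             # 0.0

# a noisy 'sticky' chain: each observation carries some predictive news
a_noisy = np.array([[0.9, 0.1], [0.1, 0.9]])
print(round(expected_ipi(a_noisy, 0), 4))      # 0.1463
```

The deterministic case illustrates the point made above: an event can be surprising (under a different matrix) yet carry no predictive information when the future is already determined.
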
\node at (barycentric cs:p2=1,p1=1) [shape=rectangle,fill=black!45,inner sep=1pt]{$\rho_\mu$};
\path (p2) +(90:3em) node {$X_0$};
\path (p1) +(-3em,0em) node {\shortstack{infinite\\past}};
\path (p1) +(-4em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
\end{tikzpicture}}%
\\[1em]
\subfig{(b) excess entropy}{%
\newcommand\blob{\longblob}
\begin{tikzpicture}
\coordinate (p1) at (-\offs,0em);
\coordinate (p2) at (\offs,0em);
\path (0,0) node (future) {$E$};
\path (p1) +(-2em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
\path (p2) +(2em,\rad) node [anchor=south] {$X_0,\ldots$};
\end{tikzpicture}%
}%
\\[1em]
\subfig{(c) predictive information rate $b_\mu$}{%
\begin{tikzpicture}%[baseline=-1em]
\newcommand\rc{2.1em}
\newcommand\throw{2.5em}
\coordinate (p1) at (210:1.5em);
\path (p3) +(3em,0em) node {\shortstack{infinite\\future}};
\path (p1) +(-3em,0em) node {\shortstack{infinite\\past}};
\path (p1) +(-4em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
\path (p3) +(4em,\rad) node [anchor=south] {$X_1,\ldots$};
\end{tikzpicture}}%
\\[0.25em]
\end{tabular}
\caption{
I-diagrams for several information measures in
stationary random processes. Each circle or oval represents a random
variable or sequence of random variables relative to time $t=0$. Overlapped areas
% information} rate.


\subsection{First and higher order Markov chains}
\label{s:markov}
% First order Markov chains are the simplest non-trivial models to which information
% dynamics methods can be applied.
In \cite{AbdallahPlumbley2009} we derived
expressions for all the information measures described in \secrf{surprise-info-seq} for
ergodic first order Markov chains (\ie those that have a unique stationary
distribution).
% The derivation is greatly simplified by the dependency structure
% of the Markov chain: for the purpose of the analysis, the `past' and `future'
% segments $\past{X}_t$ and $\fut{X}_t$ can be collapsed to just the previous
% and next variables $X_{t-1}$ and $X_{t+1}$ respectively.
We also showed that
the PIR can be expressed simply in terms of entropy rates:
if we let $a$ denote the $K\times K$ transition matrix of a Markov chain over
the alphabet $\{1,\ldots,K\}$, such that
$a_{ij} = \Pr(\ev(X_t=i)|\ev(X_{t-1}=j))$, and let $h:\reals^{K\times K}\to \reals$ be
the entropy rate function such that $h(a)$ is the entropy rate of a Markov chain
with transition matrix $a$, then the PIR is
\begin{equation}
b_\mu = h(a^2) - h(a),
\end{equation}
where $a^2$, the transition matrix squared, is the transition matrix
of the `skip one' Markov chain obtained by jumping two steps at a time
along the original chain.
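
This identity is easy to check numerically. The sketch below is our illustration (not code from \cite{AbdallahPlumbley2009}): it computes $h(a)$ as the stationary-weighted entropy of the columns of $a$, then evaluates $b_\mu = h(a^2) - h(a)$:

```python
import numpy as np

def stationary(a):
    """Stationary distribution: the eigenvector of a for eigenvalue 1."""
    w, v = np.linalg.eig(a)
    pi = np.real(v[:, np.argmax(np.real(w))])
    return pi / pi.sum()

def entropy_rate(a):
    """h(a) in nats, with a[i, j] = P(X_t = i | X_{t-1} = j)."""
    logs = np.where(a > 0, np.log(np.where(a > 0, a, 1.0)), 0.0)
    return -np.sum(stationary(a) * np.sum(a * logs, axis=0))

def pir(a):
    """Predictive information rate b_mu = h(a^2) - h(a)."""
    return entropy_rate(a @ a) - entropy_rate(a)

a = np.array([[0.9, 0.1], [0.1, 0.9]])    # 'sticky' two-state chain
print(round(pir(a), 4))                   # 0.1463

uniform = np.full((3, 3), 1 / 3)          # iid sequence: the past never helps
print(pir(uniform))                       # ~0: no predictive information
```

The two extremes bracket the behaviour discussed below: an iid process has zero PIR, while partially predictable chains sit in between.
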

Second and higher order Markov chains can be treated in a similar way by transforming
to a first order representation of the high order Markov chain. With
an $N$th order model, this is done by forming a new alphabet of size $K^N$
consisting of all possible $N$-tuples of symbols from the base alphabet.
An observation $\hat{x}_t$ in this new model encodes a block of $N$ observations
$(x_{t+1},\ldots,x_{t+N})$ from the base model.
% The next
% observation $\hat{x}_{t+1}$ encodes the block of $N$ obtained by shifting the previous
% block along by one step.
The new Markov chain is parameterised by a sparse $K^N\times K^N$
transition matrix $\hat{a}$, in terms of which the PIR is
\begin{equation}
h_\mu = h(\hat{a}), \qquad b_\mu = h({\hat{a}^{N+1}}) - N h({\hat{a}}),
\end{equation}
where $\hat{a}^{N+1}$ is the $(N+1)$th power of the first order transition matrix.
Other information measures can also be computed for the high-order Markov chain, including
the multi-information rate $\rho_\mu$ and the excess entropy $E$. (These are identical
for first order Markov chains, but for order $N$ chains, $E$ can be up to $N$ times larger
than $\rho_\mu$.)
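
The embedding itself is mechanical: index all $N$-tuples and shift the context one symbol at a time. The sketch below is our illustration with a hypothetical order-2 binary chain (the next symbol copies the symbol two steps back with probability 0.9); the entropy-rate helper assumes a column-stochastic matrix:

```python
import numpy as np
from itertools import product

def entropy_rate(a):
    """h(a) in nats for a column-stochastic transition matrix a."""
    w, v = np.linalg.eig(a)
    pi = np.real(v[:, np.argmax(np.real(w))])
    pi = pi / pi.sum()
    logs = np.where(a > 0, np.log(np.where(a > 0, a, 1.0)), 0.0)
    return -np.sum(pi * np.sum(a * logs, axis=0))

def embed(p_next, K, N):
    """First-order K^N x K^N representation of an order-N chain over K symbols.
    p_next(y, ctx) = P(next symbol = y | previous N symbols = ctx)."""
    states = list(product(range(K), repeat=N))
    idx = {s: i for i, s in enumerate(states)}
    a_hat = np.zeros((K ** N, K ** N))
    for s in states:
        for y in range(K):
            a_hat[idx[s[1:] + (y,)], idx[s]] = p_next(y, s)  # shift context by one
    return a_hat

# hypothetical order-2 binary chain: X_t repeats X_{t-2} with probability 0.9
a_hat = embed(lambda y, ctx: 0.9 if y == ctx[0] else 0.1, K=2, N=2)
h_mu = entropy_rate(a_hat)                                         # ~0.325 nats
b_mu = entropy_rate(np.linalg.matrix_power(a_hat, 3)) - 2 * h_mu   # N + 1 = 3
```

This toy process is two interleaved sticky chains, so its PIR matches that of the corresponding first order chain, as expected.
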

In our experiments with visualising and sonifying sequences sampled from
first order Markov chains \cite{AbdallahPlumbley2009}, we found that
the measures $h_\mu$, $\rho_\mu$ and $b_\mu$ correspond to perceptible
characteristics, and that the transition matrices maximising or minimising
each of these quantities are quite distinct. High entropy rates are associated
with completely uncorrelated sequences with no recognisable temporal structure
(and low $\rho_\mu$ and $b_\mu$).
High values of $\rho_\mu$ are associated with long periodic cycles (and low $h_\mu$
and $b_\mu$). High values of $b_\mu$ are associated with intermediate values
of $\rho_\mu$ and $h_\mu$, and recognisable, but not completely predictable,
temporal structures. These relationships are visible in \figrf{mtriscat} in
\secrf{composition}, where we pick up this thread again, with an application of
information dynamics in a compositional aid.


\section{Information Dynamics in Analysis}

\subsection{Musicological Analysis}
\label{s:minimusic}

\begin{fig}{twopages}
\colfig[0.96]{matbase/fig9471}\\ % update from mbc paper
% \colfig[0.97]{matbase/fig72663}\\ % later update from mbc paper (Keith's new picks)
\vspace*{0.5em}
\colfig[0.97]{matbase/fig13377} % rule based analysis
\caption{Analysis of \emph{Two Pages}.
The thick vertical lines are the part boundaries as indicated in
the score by the composer.
The thin grey lines
indicate changes in the melodic `figures' of which the piece is
constructed. In the `model information rate' panel, the black asterisks
mark the six most surprising moments selected by Keith Potter.
The bottom two panels show two rule-based boundary strength analyses.
All information measures are in nats.
Note that the boundary marked in the score at around note 5,400 is known to be
anomalous; on the basis of a listening analysis, some musicologists have
placed the boundary a few bars later, in agreement with our analysis
\cite{PotterEtAl2007}.
}
\end{fig}

In \cite{AbdallahPlumbley2009}, we analysed two pieces of music in the minimalist style
by Philip Glass: \emph{Two Pages} (1969) and \emph{Gradus} (1968).
The analysis was done using a first-order Markov chain model, with the
enhancement that the transition matrix of the model was allowed to
evolve dynamically as the notes were processed, and was tracked (in
a Bayesian way) as a \emph{distribution} over possible transition matrices,
rather than a point estimate. Some results are summarised in \figrf{twopages}:
the upper four plots show the dynamically evolving subjective information
measures as described in \secrf{surprise-info-seq}, computed using a point
estimate of the current transition matrix; the fifth plot (the `model information rate')
shows the information in each observation about the transition matrix.
In \cite{AbdallahPlumbley2010b}, we showed that this `model information rate'
is actually a component of the true IPI when the transition
matrix is being learned online, and was neglected when we computed the IPI from
the transition matrix as if it were a constant.

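The evolving-model idea can be caricatured in a few lines. The sketch below is a simplified illustration, not the model of \cite{AbdallahPlumbley2009}: it keeps independent Dirichlet counts per context and scores each note by its predictive surprisal under the marginal (P\'olya) predictive distribution:

```python
import numpy as np

class BayesMarkov:
    """Dirichlet belief over each column of a K x K transition matrix.
    counts[i, j] tracks observed transitions j -> i (add-one prior)."""
    def __init__(self, K):
        self.counts = np.ones((K, K))   # Dirichlet(1, ..., 1) prior per column
        self.prev = None

    def observe(self, x):
        """Return surprisal -log p(x | past) in nats, then update the belief."""
        if self.prev is None:
            self.prev = x
            return 0.0
        col = self.counts[:, self.prev]
        surprisal = -np.log(col[x] / col.sum())   # Polya predictive probability
        self.counts[x, self.prev] += 1.0          # Bayesian count update
        self.prev = x
        return surprisal

model = BayesMarkov(K=2)
seq = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
surprisals = [model.observe(x) for x in seq]
# strict alternation becomes steadily less surprising as the belief sharpens
```

A full treatment would also measure how much each note changes the belief over matrices (the model information rate); this sketch only tracks the predictive surprisal.
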
The peaks of the surprisingness and both components of the IPI
show good correspondence with the structure of the piece, both as marked in the score
and as analysed by musicologist Keith Potter, who was asked to mark the six
`most surprising moments' of the piece (shown as asterisks in the fifth plot). %%
% \footnote{%
% Note that the boundary marked in the score at around note 5,400 is known to be
% anomalous; on the basis of a listening analysis, some musicologists have
% placed the boundary a few bars later, in agreement with our analysis
% \cite{PotterEtAl2007}.}
%
In contrast, the analyses shown in the lower two plots of \figrf{twopages},
obtained using two rule-based music segmentation algorithms, while clearly
\emph{reflecting} the structure of the piece, do not \emph{segment} the piece,
showing no tendency for the boundary strength function to peak at
the boundaries in the piece.
show that the first note of each bar is, on average, significantly more surprising
and informative than the others, up to the 64-note level, whereas at the 128-note
level, the dominant periodicity appears to remain at 64 notes.

\begin{fig}{metre}
% \scalebox{1}{%
\begin{tabular}{cc}
\colfig[0.45]{matbase/fig36859} & \colfig[0.48]{matbase/fig88658} \\
\colfig[0.45]{matbase/fig48061} & \colfig[0.48]{matbase/fig46367} \\
\colfig[0.45]{matbase/fig99042} & \colfig[0.47]{matbase/fig87490}
% \colfig[0.46]{matbase/fig56807} & \colfig[0.48]{matbase/fig27144} \\
% \colfig[0.48]{matbase/fig9142} & \colfig[0.48]{matbase/fig27751}

\end{tabular}%
% }
\caption{Metrical analysis by computing average surprisingness and
IPI of notes at different periodicities (\ie hypothetical
bar lengths) and phases (\ie positions within a bar).
}
\end{fig}

\subsection{Real-valued signals and audio analysis}
Using analogous definitions based on the differential entropy
\cite{CoverThomas}, the methods outlined
in \secrf{surprise-info-seq} and \secrf{process-info}
can be reformulated for random variables taking values in a continuous domain.
Information-dynamic methods may thus be applied to expressive parameters of music
such as dynamics, timing and timbre, which are readily quantified on a continuous scale.

% \subsection{Audio based content analysis}
% Using analogous definitions of differential entropy, the methods outlined
% in the previous section are equally applicable to continuous random variables.
% In the case of music, where expressive properties such as dynamics, tempo,
% timing and timbre are readily quantified on a continuous scale, the information
% dynamic framework may also be considered.

Dubnov \cite{Dubnov2006} considers the class of stationary Gaussian
processes, for which the entropy rate may be obtained analytically
from the power spectral density of the signal. Dubnov found that the
multi-information rate (which he refers to as `information rate') can be
expressed as a function of the \emph{spectral flatness measure}. Thus, for a given variance,
Gaussian processes with maximal multi-information rate are those with maximally
non-flat spectra. These essentially consist of a single
sinusoidal component and hence are completely predictable once
the parameters of the sinusoid have been inferred.
% Local stationarity is assumed, which may be achieved by windowing or
% change point detection \cite{Dubnov2008}.
%TODO
778 | 785 |
779 We are currently working towards methods for the computation of predictive information | 786 We are currently working towards methods for the computation of predictive information |
780 rate in some restricted classes of Gaussian processes including finite-order | 787 rate in some restricted classes of Gaussian processes including finite-order |
781 autoregressive models and processes with power-law spectra (fractionally integrated Gaussian noise). | 788 autoregressive models and processes with power-law (or $1/f$) spectra, |
789 which have previously been investigated in relation to their aesthetic properties | |
790 \cite{Voss75,TaylorSpeharVan-Donkelaar2011}. | |
782 | 791 |
792 % (fractionally integrated Gaussian noise). | |
783 % %(fBm (continuous), fiGn discrete time) possible reference: | 793 % %(fBm (continuous), fiGn discrete time) possible reference: |
784 % @book{palma2007long, | 794 % @book{palma2007long, |
785 % title={Long-memory time series: theory and methods}, | 795 % title={Long-memory time series: theory and methods}, |
786 % author={Palma, W.}, | 796 % author={Palma, W.}, |
787 % volume={662}, | 797 % volume={662}, |
805 | 815 |
806 | 816 |
807 \subsection{Beat Tracking} | 817 \subsection{Beat Tracking} |
808 | 818 |
809 A probabilistic method for drum tracking was presented by Robertson | 819 A probabilistic method for drum tracking was presented by Robertson |
810 \cite{Robertson11c}. The algorithm is used to synchronise a music | 820 \cite{Robertson11c}. The system infers a beat grid (a sequence |
811 sequencer to a live drummer. The expected beat time of the sequencer is | 821 of approximately regular beat times) given audio inputs from a |
812 represented by a click track, and the algorithm takes as input event | 822 live drummer, for the purpose of synchronising a music |
813 times for discrete kick and snare drum events relative to this click | 823 sequencer with the drummer. |
814 track. These are obtained using dedicated microphones for each drum and | 824 The times of kick and snare drum events are obtained |
815 using a percussive onset detector \cite{puckette98}. The drum tracker | 825 using dedicated microphones for each drum and a percussive onset detector |
816 continually updates distributions for tempo and phase on receiving a new | 826 \cite{puckette98}. These event times are then sent |
817 event time. We can thus quantify the information contributed of an event | 827 to the beat tracker, which maintains a probabilistic belief state in |
818 by measuring the difference between the system's prior distribution and | 828 the form of distributions over the tempo and phase of the beat grid. |
819 the posterior distribution using the Kullback-Leiber divergence. | 829 Every time an event is received, these distributions are updated |
820 | 830 with respect to a probabilistic model which accounts both for tempo and phase |
821 Here, we have calculated the KL divergence and entropy for kick and | 831 variations and the emission of drum events at musically plausible times |
822 snare events in sixteen files. The analysis of information rates can be | 832 relative to the beat grid. |
823 considered \emph{subjective}, in that it measures how the drum tracker's | 833 %continually updates distributions for tempo and phase on receiving a new |
824 probability distributions change, and these are contingent upon the | 834 %event time |
825 model used as well as external properties in the signal. We expect, | 835 |
826 however, that following periods of increased uncertainty, such as fills | 836 The use of a probabilistic belief state means we can compute entropies |
827 or expressive timing, the information contained in an individual event | 837 representing the system's uncertainty about the beat grid, and quantify |
828 increases. We also examine whether the information is dependent upon | 838 the amount of information in each event about the beat grid as the KL divergence |
829 metrical position. | 839 between prior and posterior distributions. Though this is not strictly the |
830 | 840 instantaneous predictive information (IPI) as described in \secrf{surprise-info-seq} |
831 % !!! FIXME | 841 (the information gained is not directly about future event times), we can treat |
842 it as a proxy for the IPI, in the manner of the `model information rate' | |
843 described in \secrf{minimusic}, which has a similar status. | |
844 | |
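The event-wise information measure is a discrete KL divergence between the belief state just before and just after an update. A minimal sketch over a hypothetical discretised tempo--phase grid (the distributions below are illustrative, not taken from the system):

```python
import numpy as np

def information_in_event(prior, posterior):
    # KL(posterior || prior) in bits, over a discretised tempo/phase grid;
    # both distributions are assumed strictly positive and normalised
    prior = np.asarray(prior, dtype=float)
    posterior = np.asarray(posterior, dtype=float)
    return float(np.sum(posterior * np.log2(posterior / prior)))

# hypothetical example: a broad prior belief sharpened by one drum event
prior = np.full(8, 1 / 8)
posterior = np.array([0.02, 0.02, 0.05, 0.6, 0.25, 0.03, 0.02, 0.01])
```

An uninformative event (posterior equal to prior) carries zero bits; the sharper the update, the larger the divergence.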
845 \begin{fig*}{drumfig} | |
846 % \includegraphics[width=0.9\linewidth]{drum_plots/file9-track.eps}% \\ | |
847 \includegraphics[width=0.97\linewidth]{drum_plots/file11-track.eps} \\ | |
848 % \includegraphics[width=0.9\linewidth]{newplots/file8-track.eps} | |
849 \caption{Information dynamic analysis derived from audio recordings of | |
850 drumming, obtained by applying a Bayesian beat tracking system to the | |
851 sequence of detected kick and snare drum events. The grey line show the system's | |
852 varying level of uncertainty (entropy) about the tempo and phase of the | |
853 beat grid, while the stem plot shows the amount of information in each | |
854 drum event about the beat grid. The entropy drops instantaneously at each | |
855 event and rises gradually between events. | |
856 } | |
857 \end{fig*} | |
858 | |
859 We carried out the analysis on 16 recordings; an example | |
860 is shown in \figrf{drumfig}. There we can see variations in the | |
861 entropy in the upper graph and the information in each drum event in the lower | |
862 stem plot. At certain points in time, unusually large amounts of information | |
863 arrive; these may be related to fills and other rhythmic irregularities, which | |
864 are often followed by an emphatic return to a steady beat at the beginning | |
865 of the next bar---this is something we are currently investigating. | |
866 We also analysed the pattern of information flow | |
867 on a cyclic metre, much as in \figrf{metre}. All the recordings we | |
868 analysed are audibly in 4/4 metre, but we found no | |
869 evidence of a general tendency for greater amounts of information to arrive | |
870 at metrically strong beats, which suggests that the rhythmic accuracy of the | |
871 drummers does not vary systematically across each bar. It is possible that metrical information | |
872 existing in the pattern of kick and snare events might emerge in an information | |
873 dynamic analysis using a model that attempts to predict the time and type of | |
874 the next drum event, rather than just inferring the beat grid as the current model does. | |
875 %The analysis of information rates can b | |
876 %considered \emph{subjective}, in that it measures how the drum tracker's | |
877 %probability distributions change, and these are contingent upon the | |
878 %model used as well as external properties in the signal. | |
879 %We expect, | |
880 %however, that following periods of increased uncertainty, such as fills | |
881 %or expressive timing, the information contained in an individual event | |
882 %increases. We also examine whether the information is dependent upon | |
883 %metrical position. | |
884 | |
832 | 885 |
833 \section{Information dynamics as compositional aid} | 886 \section{Information dynamics as compositional aid} |
834 \label{s:composition} | 887 \label{s:composition} |
835 | 888 |
836 The use of stochastic processes in music composition has been widespread for | 889 The use of stochastic processes in music composition has been widespread for |
839 can drive the \emph{generative} phase of the creative process, information dynamics | 892 can drive the \emph{generative} phase of the creative process, information dynamics |
840 can serve as a novel framework for a \emph{selective} phase, by | 893 can serve as a novel framework for a \emph{selective} phase, by |
841 providing a set of criteria to be used in judging which of the | 894 providing a set of criteria to be used in judging which of the |
842 generated materials | 895 generated materials |
843 are of value. This alternation of generative and selective phases has been | 896 are of value. This alternation of generative and selective phases has been |
844 noted by art theorist Margaret Boden \cite{Boden1990}. | 897 noted before \cite{Boden1990}. |
845 | 898 % |
846 Information-dynamic criteria can also be used as \emph{constraints} on the | 899 Information-dynamic criteria can also be used as \emph{constraints} on the |
847 generative processes, for example, by specifying a certain temporal profile | 900 generative processes, for example, by specifying a certain temporal profile |
848 of surprisingness and uncertainty the composer wishes to induce in the listener | 901 of surprisingness and uncertainty the composer wishes to induce in the listener |
849 as the piece unfolds. | 902 as the piece unfolds. |
850 %stochastic and algorithmic processes: ; outputs can be filtered to match a set of | 903 %stochastic and algorithmic processes: ; outputs can be filtered to match a set of |
867 Processes with high PIR maintain a certain kind of balance between | 920 Processes with high PIR maintain a certain kind of balance between |
868 predictability and unpredictability in such a way that the observer must continually | 921 predictability and unpredictability in such a way that the observer must continually |
869 pay attention to each new observation as it occurs in order to make the best | 922 pay attention to each new observation as it occurs in order to make the best |
870 possible predictions about the evolution of the sequence. This balance between predictability | 923 possible predictions about the evolution of the sequence. This balance between predictability |
871 and unpredictability is reminiscent of the inverted `U' shape of the Wundt curve (see \figrf{wundt}), | 924 and unpredictability is reminiscent of the inverted `U' shape of the Wundt curve (see \figrf{wundt}), |
872 which summarises the observations of Wundt that the greatest aesthetic value in art | 925 which summarises the observations of Wundt \cite{Wundt1897} that stimuli are most |
873 is to be found at intermediate levels of disorder, where there is a balance between | 926 pleasing at intermediate levels of novelty or disorder, where there is a balance between |
874 `order' and `chaos'. | 927 `order' and `chaos'. |
875 | 928 |
876 Using the methods of \secrf{markov}, we found \cite{AbdallahPlumbley2009} | 929 Using the methods of \secrf{markov}, we found \cite{AbdallahPlumbley2009} |
877 a similar shape when plotting entropy rate against PIR---this is visible in the | 930 a similar shape when plotting entropy rate against PIR---this is visible in the |
878 upper envelope of the scatter plot in \figrf{mtriscat}, which is a 3-D scatter plot of | 931 upper envelope of the scatter plot in \figrf{mtriscat}, which is a 3-D scatter plot of |
881 The coordinates of the `information space' are entropy rate ($h_\mu$), redundancy ($\rho_\mu$), and | 934 The coordinates of the `information space' are entropy rate ($h_\mu$), redundancy ($\rho_\mu$), and |
882 predictive information rate ($b_\mu$). The points along the `redundancy' axis correspond | 935 predictive information rate ($b_\mu$). The points along the `redundancy' axis correspond |
883 to periodic Markov chains. Those along the `entropy' axis produce uncorrelated sequences | 936 to periodic Markov chains. Those along the `entropy' axis produce uncorrelated sequences |
884 with no temporal structure. Processes with high PIR are to be found at intermediate | 937 with no temporal structure. Processes with high PIR are to be found at intermediate |
885 levels of entropy and redundancy. | 938 levels of entropy and redundancy. |
886 These observations led us to construct the `Melody Triangle' as a graphical interface | 939 These observations led us to construct the `Melody Triangle', a graphical interface |
887 for exploring the melodic patterns generated by each of the Markov chains represented | 940 for exploring the melodic patterns generated by each of the Markov chains represented |
888 as points in \figrf{mtriscat}. | 941 as points in \figrf{mtriscat}. |
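The three coordinates can be computed directly from a transition matrix. A sketch in bits, assuming an irreducible first-order chain; for such a chain the PIR reduces to the difference between the next-symbol entropy conditioned on the state two steps back (the entropy rate of $T^2$ under the stationary distribution of $T$) and the ordinary entropy rate:

```python
import numpy as np

def stationary(T):
    # stationary distribution pi with pi = pi @ T (T row-stochastic)
    w, v = np.linalg.eig(T.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return pi / pi.sum()

def _H(p):
    # Shannon entropy in bits, with 0 log 0 taken as 0
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def entropy_rate(T, pi=None):
    # h_mu: expected entropy of the next-symbol distribution
    pi = stationary(T) if pi is None else pi
    return float(sum(pi[i] * _H(T[i]) for i in range(len(pi))))

def redundancy(T):
    # rho_mu: marginal entropy minus entropy rate
    pi = stationary(T)
    return _H(pi) - entropy_rate(T, pi)

def pir(T):
    # b_mu: H(next | two steps back) - H(next | one step back)
    pi = stationary(T)
    return entropy_rate(T @ T, pi) - entropy_rate(T, pi)
```

A period-2 chain such as `[[0, 1], [1, 0]]` gives $h_\mu = 0$, $\rho_\mu = 1$ bit and $b_\mu = 0$; a uniform chain gives $h_\mu = 1$ bit with zero redundancy and PIR; only intermediate matrices score high on PIR, as in the scatter plot.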
942 | |
943 | |
944 %It is possible to apply information dynamics to the generation of content, such as to the composition of musical materials. | |
945 | |
946 %For instance a stochastic music generating process could be controlled by modifying | |
947 %constraints on its output in terms of predictive information rate or entropy | |
948 %rate. | |
889 | 949 |
890 \begin{fig}{wundt} | 950 \begin{fig}{wundt} |
891 \raisebox{-4em}{\colfig[0.43]{wundt}} | 951 \raisebox{-4em}{\colfig[0.43]{wundt}} |
892 % {\ \shortstack{{\Large$\longrightarrow$}\\ {\scriptsize\emph{exposure}}}\ } | 952 % {\ \shortstack{{\Large$\longrightarrow$}\\ {\scriptsize\emph{exposure}}}\ } |
893 {\ {\large$\longrightarrow$}\ } | 953 {\ {\large$\longrightarrow$}\ } |
898 in a move to the left along the curve \cite{Berlyne71}. | 958 in a move to the left along the curve \cite{Berlyne71}. |
899 } | 959 } |
900 \end{fig} | 960 \end{fig} |
901 | 961 |
902 | 962 |
903 %It is possible to apply information dynamics to the generation of content, such as to the composition of musical materials. | |
904 | |
905 %For instance a stochastic music generating process could be controlled by modifying | |
906 %constraints on its output in terms of predictive information rate or entropy | |
907 %rate. | |
908 | |
909 | |
910 | 963 |
911 \subsection{The Melody Triangle} | 964 \subsection{The Melody Triangle} |
912 | 965 |
913 The Melody Triangle is an exploratory interface for the discovery of melodic | 966 The Melody Triangle is an interface for the discovery of melodic |
914 content, where the input---positions within a triangle---directly map to information | 967 materials, where the input---positions within a triangle---directly map to information |
915 theoretic properties of the output. | 968 theoretic properties of the output. |
916 %The measures---entropy rate, redundancy and | 969 %The measures---entropy rate, redundancy and |
917 %predictive information rate---form a criteria with which to filter the output | 970 %predictive information rate---form a criteria with which to filter the output |
918 %of the stochastic processes used to generate sequences of notes. | 971 %of the stochastic processes used to generate sequences of notes. |
919 These measures | 972 %These measures |
920 address notions of expectation and surprise in music, and as such the Melody | 973 %address notions of expectation and surprise in music, and as such the Melody |
921 Triangle is a means of interfacing with a generative process in terms of the | 974 %Triangle is a means of interfacing with a generative process in terms of the |
922 predictability of its output. | 975 %predictability of its output. |
923 | |
924 | |
925 \begin{fig}{mtriscat} | |
926 \colfig[0.9]{mtriscat} | |
927 \caption{The population of transition matrices distributed along three axes of | |
928 redundancy, entropy rate and predictive information rate (all measured in bits). | |
929 The concentrations of points along the redundancy axis correspond | |
930 to Markov chains which are roughly periodic with periods of 2 (redundancy 1 bit), | |
931 3, 4, \etc all the way to period 8 (redundancy 3 bits). The colour of each point | |
932 represents its PIR---note that the highest values are found at intermediate entropy | |
933 and redundancy, and that the distribution as a whole makes a curved triangle. Although | |
934 not visible in this plot, it is largely hollow in the middle.} | |
935 \end{fig} | |
936 | 976 |
937 The triangle is populated with first order Markov chain transition | 977 The triangle is populated with first order Markov chain transition |
938 matrices as illustrated in \figrf{mtriscat}. | 978 matrices as illustrated in \figrf{mtriscat}. |
939 The distribution of transition matrices plotted in this space forms an arch shape | 979 The distribution of transition matrices in this space forms a relatively thin |
940 that is fairly thin. Thus, it is a reasonable simplification to project out the | 980 curved sheet. Thus, it is a reasonable simplification to project out the |
941 third dimension (the PIR) and present an interface that is just two dimensional. | 981 third dimension (the PIR) and present an interface that is just two dimensional. |
942 The right-angled triangle is rotated, reflected and stretched to form an equilateral triangle with | 982 The right-angled triangle is rotated, reflected and stretched to form an equilateral triangle with |
943 the $h_\mu=0, \rho_\mu=0$ vertex at the top, the `redundancy' axis down the left-hand | 983 the $h_\mu=0, \rho_\mu=0$ vertex at the top, the `redundancy' axis down the left-hand |
944 side, and the `entropy rate' axis down the right, as shown in \figrf{TheTriangle}. | 984 side, and the `entropy rate' axis down the right, as shown in \figrf{TheTriangle}. |
945 This is our `Melody Triangle' and | 985 This is our `Melody Triangle' and |
946 forms the interface by which the system is controlled. | 986 forms the interface by which the system is controlled. |
947 %Using this interface thus involves a mapping to information space; | 987 %Using this interface thus involves a mapping to information space; |
948 The user selects a position within the triangle, the point is mapped into the | 988 The user selects a point within the triangle; this is mapped into the |
949 information space, and a corresponding transition matrix is returned. The third dimension, | 989 information space and the nearest transition matrix is used to generate |
950 though not visible, is implicitly there, as transition matrices retrieved from | 990 a sequence of values which are then sonified either as pitched notes or percussive |
991 sounds. By choosing the position within the triangle, the user can control the | |
992 output at the level of its `collative' properties, with access to the variety | |
993 of patterns as described above and in \secrf{markov}. | |
994 %and information-theoretic criteria related to predictability | |
995 %and information flow | |
996 Though the interface is 2D, the third dimension (PIR) is implicitly present, as | |
997 transition matrices retrieved from | |
951 along the centre line of the triangle will tend to have higher PIR. | 998 along the centre line of the triangle will tend to have higher PIR. |
952 | 999 We hypothesise that, under |
953 Each corner corresponds to three different extremes of predictability and | |
954 unpredictability, which could be loosely characterised as `periodicity', `noise' | |
955 and `repetition'. Melodies from the `noise' corner (high $h_\mu$, low $\rho_\mu$ | |
956 and $b_\mu$) have no discernible pattern; | |
957 Melodies along the `periodicity' | |
958 to `repetition' edge are all deterministic loops that get shorter as we approach | |
959 the `repetition' corner, until each is just one repeating note. The | |
960 areas in between will tend to have higher PIR, and we hypothesise that, under | |
961 the appropriate conditions, these will be perceived as more `interesting' or | 1000 the appropriate conditions, these will be perceived as more `interesting' or |
962 `melodic.' | 1001 `melodic.' |
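The point-to-matrix mapping can be sketched as a nearest-neighbour lookup over the precomputed population of matrices (the names and the use of a squared Euclidean metric in the 2-D information space are illustrative assumptions):

```python
import numpy as np

def nearest_matrix(point, coords, matrices):
    # point: the user's position mapped to (entropy rate, redundancy);
    # coords: N x 2 array of the same measures for each stored matrix;
    # returns the transition matrix whose coordinates are closest
    d = np.sum((np.asarray(coords) - np.asarray(point)) ** 2, axis=1)
    return matrices[int(np.argmin(d))]
```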
1002 | |
1003 %The corners correspond to three different extremes of predictability and | |
1004 %unpredictability, which could be loosely characterised as `periodicity', `noise' | |
1005 %and `repetition'. Melodies from the `noise' corner (high $h_\mu$, low $\rho_\mu$ | |
1006 %and $b_\mu$) have no discernible pattern; | |
1007 %those along the `periodicity' | |
1008 %to `repetition' edge are all cyclic patterns that get shorter as we approach | |
1009 %the `repetition' corner, until each is just one repeating note. Those along the | |
1010 %opposite edge consist of independent random notes from non-uniform distributions. | |
1011 %Areas between the left and right edges will tend to have higher PIR, | |
1012 %and we hypothesise that, under | |
1013 %the appropriate conditions, these will be perceived as more `interesting' or | |
1014 %`melodic.' | |
963 %These melodies have some level of unpredictability, but are not completely random. | 1015 %These melodies have some level of unpredictability, but are not completely random. |
964 % Or, conversely, are predictable, but not entirely so. | 1016 % Or, conversely, are predictable, but not entirely so. |
965 | |
966 \begin{fig}{TheTriangle} | |
967 \colfig[0.8]{TheTriangle.pdf} | |
968 \caption{The Melody Triangle} | |
969 \end{fig} | |
970 | 1017 |
971 %PERHAPS WE SHOULD FOREGO TALKING ABOUT THE | 1018 %PERHAPS WE SHOULD FOREGO TALKING ABOUT THE |
972 %INSTALLATION VERSION OF THE TRIANGLE? | 1019 %INSTALLATION VERSION OF THE TRIANGLE? |
973 %feels a bit like a tangent, and could do with the space.. | 1020 %feels a bit like a tangent, and could do with the space.. |
974 The Melody Triangle exists in two incarnations; a standard screen based interface | 1021 The Melody Triangle exists in two incarnations: a screen-based interface |
975 where a user moves tokens in and around a triangle on screen, and a multi-user | 1022 where a user moves tokens in and around a triangle on screen, and a multi-user |
976 interactive installation where a Kinect camera tracks individuals in a space and | 1023 interactive installation where a Kinect camera tracks individuals in a space and |
977 maps their positions in physical space to the triangle. In the latter each visitor | 1024 maps their positions in physical space to the triangle. In the latter each visitor |
978 that enters the installation generates a melody and can collaborate with their | 1025 that enters the installation generates a melody and can collaborate with their |
979 co-visitors to generate musical textures. This makes the interaction physically engaging | 1026 co-visitors to generate musical textures. This makes the interaction physically engaging |
980 and (as our experience with visitors both young and old has demonstrated) more playful. | 1027 and (as our experience with visitors both young and old has demonstrated) more playful. |
981 %Additionally visitors can change the | 1028 %Additionally visitors can change the |
982 %tempo, register, instrumentation and periodicity of their melody with body gestures. | 1029 %tempo, register, instrumentation and periodicity of their melody with body gestures. |
983 | 1030 |
984 As a screen based interface the Melody Triangle can serve as a composition tool. | 1031 |
1032 \begin{fig}{mtriscat} | |
1033 \colfig[0.9]{mtriscat} | |
1034 \caption{The population of transition matrices in the 3D space of | |
1035 entropy rate ($h_\mu$), redundancy ($\rho_\mu$) and PIR ($b_\mu$), | |
1036 all in bits. | |
1037 The concentrations of points along the redundancy axis correspond | |
1038 to Markov chains which are roughly periodic with periods of 2 (redundancy 1 bit), | |
1039 3, 4, \etc all the way to period 7 (redundancy 2.8 bits). The colour of each point | |
1040 represents its PIR---note that the highest values are found at intermediate entropy | |
1041 and redundancy, and that the distribution as a whole makes a curved triangle. Although | |
1042 not visible in this plot, it is largely hollow in the middle.} | |
1043 \end{fig} | |
1044 | |
1045 | |
1046 The screen-based interface can serve as a compositional tool. | |
985 %%A triangle is drawn on the screen, screen space thus mapped to the statistical | 1047 %%A triangle is drawn on the screen, screen space thus mapped to the statistical |
986 %space of the Melody Triangle. | 1048 %space of the Melody Triangle. |
987 A number of tokens, each representing a | 1049 A number of tokens, each representing a |
988 melody, can be dragged in and around the triangle. For each token, a sequence of symbols with | 1050 sonification stream or `voice', can be dragged in and around the triangle. |
989 statistical properties that correspond to the token's position is generated. These | 1051 For each token, a sequence of symbols is sampled using the corresponding |
990 symbols are then mapped to notes of a scale or percussive sounds. | 1052 transition matrix; the symbols |
991 However they could easily be mapped to other musical processes, possibly over | 1053 %statistical properties that correspond to the token's position is generated. These |
1054 %symbols | |
1055 are then mapped to notes of a scale or percussive sounds% | |
1056 \footnote{The sampled sequence could easily be mapped to other musical processes, possibly over | |
992 different time scales, such as chords, dynamics and timbres. It would also be possible | 1057 different time scales, such as chords, dynamics and timbres. It would also be possible |
993 to map the symbols to visual or kinetic outputs. | 1058 to map the symbols to visual or other outputs.}% |
1059 . Keyboard commands give control over other musical parameters such | |
1060 as pitch register and inter-onset interval. | |
994 %The possibilities afforded by the Melody Triangle in these other domains remains to be investigated.}. | 1061 %The possibilities afforded by the Melody Triangle in these other domains remains to be investigated.}. |
995 Additionally keyboard commands give control over other musical parameters such | 1062 % |
996 as pitch register and note duration. | 1063 The system is capable of generating quite intricate musical textures when multiple tokens |
997 | 1064 are in the triangle, but unlike other computer aided composition tools or programming |
998 The Melody Triangle can generate intricate musical textures when multiple tokens | 1065 environments, the composer excercises control at the abstract level of information-dynamic |
999 are in the triangle. Unlike other computer aided composition tools or programming | 1066 environments, the composer exercises control at the abstract level of information-dynamic |
1000 environments, here the composer engages with music on a high and abstract level; | 1067 %the interface relating to subjective expectation and predictability. |
1001 the interface relating to subjective expectation and predictability. | 1068 |
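The per-token generation step can be sketched as sampling a Markov chain and mapping symbols to pitches (the scale mapping below is an illustrative assumption, not the system's actual sonification):

```python
import numpy as np

def sample_sequence(T, n, seed=0):
    # sample n symbols from a first-order Markov chain with
    # row-stochastic transition matrix T
    rng = np.random.default_rng(seed)
    k = T.shape[0]
    s = [int(rng.integers(k))]
    for _ in range(n - 1):
        s.append(int(rng.choice(k, p=T[s[-1]])))
    return s

# hypothetical mapping of symbols onto a C major scale (MIDI note numbers)
SCALE = [60, 62, 64, 65, 67, 69, 71, 72]

def sonify(symbols):
    return [SCALE[s % len(SCALE)] for s in symbols]
```

Each token in the triangle would run one such sampler from its own transition matrix; register and inter-onset interval are then independent per-token parameters.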
1002 | 1069 \begin{fig}{TheTriangle} |
1003 | 1070 \colfig[0.7]{TheTriangle.pdf} |
1004 | 1071 \caption{The Melody Triangle} |
1005 | 1072 \end{fig} |
1006 | |
1007 \begin{fig}{mtri-results} | |
1008 \def\scat#1{\colfig[0.42]{mtri/#1}} | |
1009 \def\subj#1{\scat{scat_dwells_subj_#1} & \scat{scat_marks_subj_#1}} | |
1010 \begin{tabular}{cc} | |
1011 % \subj{a} \\ | |
1012 \subj{b} \\ | |
1013 \subj{c} \\ | |
1014 \subj{d} | |
1015 \end{tabular} | |
1016 \caption{Dwell times and mark positions from user trials with the | |
1017 on-screen Melody Triangle interface, for two subjects. The left-hand column shows | |
1018 the positions in a 2D information space (entropy rate vs multi-information rate | |
1019 in bits) where each spent their time; the area of each circle is proportional | |
1020 to the time spent there. The right-hand column shows point which subjects | |
1021 `liked'; the area of the circles here is proportional to the duration spent at | |
1022 that point before the point was marked.} | |
1023 \end{fig} | |
1024 | 1073 |
1025 \comment{ | 1074 \comment{ |
1026 \subsection{Information Dynamics as Evaluative Feedback Mechanism} | 1075 \subsection{Information Dynamics as Evaluative Feedback Mechanism} |
1027 %NOT SURE THIS SHOULD BE HERE AT ALL..? | 1076 %NOT SURE THIS SHOULD BE HERE AT ALL..? |
1028 Information measures on a stream of symbols can form a feedback mechanism; a | 1077 Information measures on a stream of symbols can form a feedback mechanism; a |
1043 characteristics of sonified Markov chains and subjective musical preference. | 1092 characteristics of sonified Markov chains and subjective musical preference. |
1044 We carried out a pilot study with six participants, who were asked | 1093 We carried out a pilot study with six participants, who were asked |
1045 to use a simplified form of the user interface (a single controllable token, | 1094 to use a simplified form of the user interface (a single controllable token, |
1046 and no rhythmic, registral or timbral controls) under two conditions: | 1095 and no rhythmic, registral or timbral controls) under two conditions: |
1047 one where a single sequence was sonified under user control, and another | 1096 one where a single sequence was sonified under user control, and another |
1048 where an addition sequence was sonified in a different register, as if generated | 1097 where an additional sequence was sonified in a different register, as if generated |
1049 by a fixed invisible in one of four regions of the triangle. In addition, subjects | 1098 by a fixed invisible token in one of four regions of the triangle. In addition, subjects |
1050 were asked to press a key if they `liked' what they were hearing. | 1099 were asked to press a key if they `liked' what they were hearing. |
1051 | 1100 |
1052 We recorded subjects' behaviour as well as points which they marked | 1101 We recorded subjects' behaviour as well as points which they marked |
1053 with a key press. | 1102 with a key press. |
1054 Some results for three of the subjects are shown in \figrf{mtri-results}. Though | 1103 Some results for two of the subjects are shown in \figrf{mtri-results}. Though |
1055 we have not been able to detect any systematic across-subjects preference for any particular | 1104 we have not been able to detect any systematic across-subjects preference for any particular |
1056 region of the triangle, subjects do seem to exhibit distinct kinds of exploratory behaviour. | 1105 region of the triangle, subjects do seem to exhibit distinct kinds of exploratory behaviour. |
1057 Our initial hypothesis, that subjects would linger longer in regions of the triangle | 1106 Our initial hypothesis, that subjects would linger longer in regions of the triangle |
1058 that produced aesthetically preferable sequences, and that this tend to be towards the | 1107 that produced aesthetically preferable sequences, and that this would tend to be towards the |
1059 centre line of the triangle for all subjects, was not confirmed. However, it is possible | 1108 centre line of the triangle for all subjects, was not confirmed. However, it is possible |
1060 that the design of the experiment encouraged an initial exploration of the space (sometimes | 1109 that the design of the experiment encouraged an initial exploration of the space (sometimes |
1061 very systematic, as for subject c) aimed at \emph{understanding} the parameter space and | 1110 very systematic, as for subject c) aimed at \emph{understanding} %the parameter space and |
1062 how the system works, rather than finding musical sequences. It is also possible that the | 1111 how the system works, rather than finding musical patterns. It is also possible that the |
1063 system encourages users to create musically interesting output by \emph{moving the token}, | 1112 system encourages users to create musically interesting output by \emph{moving the token}, |
1064 rather than finding a particular spot in the triangle which produces a musically interesting | 1113 rather than finding a particular spot in the triangle which produces a musically interesting |
1065 pattern by itself. | 1114 sequence by itself. |

\begin{fig}{mtri-results}
	\def\scat#1{\colfig[0.42]{mtri/#1}}
	\def\subj#1{\scat{scat_dwells_subj_#1} & \scat{scat_marks_subj_#1}}
	\begin{tabular}{cc}
%		\subj{a} \\
%		\subj{b} \\
		\subj{c} \\
		\subj{d}
	\end{tabular}
	\caption{Dwell times and mark positions from user trials with the
		on-screen Melody Triangle interface, for two subjects. The left-hand column shows
		the positions in a 2D information space (entropy rate vs.\ multi-information rate,
		in bits) where each subject spent their time; the area of each circle is proportional
		to the time spent there. The right-hand column shows the points that subjects
		`liked'; the area of each circle here is proportional to the time spent at
		that point before it was marked.}
\end{fig}

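For reference, the two coordinates of the information space used in the figure above can be written explicitly for a stationary first-order Markov chain; this is a restatement of standard definitions, and the notation (transition matrix $a_{ij} = \Pr(X_{t+1}{=}i \mid X_t{=}j)$ with stationary distribution $\pi$) may differ in detail from that used elsewhere in this paper:
\begin{equation}
	h = -\sum_j \pi_j \sum_i a_{ij} \log_2 a_{ij},
	\qquad
	\rho = -\sum_i \pi_i \log_2 \pi_i - h,
\end{equation}
where $h$ is the entropy rate and $\rho$ the multi-information rate, which for a first-order chain reduces to the marginal entropy minus the entropy rate.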
Comments collected from the subjects
%during and after the experiment
suggest that
the information-dynamic characteristics of the patterns were readily apparent
to most: several noticed the main organisation of the triangle,
with repetitive notes at the top, cyclic patterns along one edge, and unpredictable
notes towards the opposite corner. Some described their systematic exploration of the space.
Two felt that the right side was `more controllable' than the left (a consequence
of their ability to return to a particular distinctive pattern and recognise it
as one heard previously). Two reported that they became bored towards the end,
but another felt there wasn't enough time to `hear out' the patterns properly.
One subject did not `enjoy' the patterns in the lower region, but another said the lower
central regions were more `melodic' and `interesting'.

We plan to continue the trials with a slightly less restricted user interface in order
to make the experience more enjoyable and thereby give subjects longer to use the interface;
%and frequencies, only lighting when it heard these. As the Musicolour would
%`get bored', the musician would have to change and vary their playing, eliciting
%new and unexpected outputs in trying to keep the Musicolour interested.


\section{Conclusions}

We have looked at several emerging areas of application of the methods and
ideas of information dynamics to various problems in music analysis, perception
and cognition, including musicological analysis of symbolic music, audio analysis,
rhythm processing, and compositional and creative tasks. The approach has proved
successful in musicological analysis, and though our initial data on
rhythm processing and aesthetic preference are inconclusive, there is still
plenty of work to be done in this area: wherever there are probabilistic models,
information dynamics can shed light on their behaviour.


\section*{Acknowledgments}
This work is supported by EPSRC Doctoral Training Centre EP/G03723X/1 (HE),
GR/S82213/01 and EP/E045235/1 (SA), an EPSRC DTA Studentship (PF), an RAEng/EPSRC Research Fellowship 10216/88 (AR), an EPSRC Leadership Fellowship, EP/G007144/1