\documentclass[conference,a4paper]{IEEEtran}
\usepackage{cite}
\usepackage[cmex10]{amsmath}
\usepackage{graphicx}
\usepackage{amssymb}
\usepackage{epstopdf}
\usepackage{url}
\usepackage{listings}
%\usepackage[expectangle]{tools}
\usepackage{tools}
\usepackage{tikz}
\usetikzlibrary{calc}
\usetikzlibrary{matrix}
\usetikzlibrary{patterns}
\usetikzlibrary{arrows}

\let\citep=\cite
\newcommand{\colfig}[2][1]{\includegraphics[width=#1\linewidth]{figs/#2}}%
\newcommand\preals{\reals_+}
\newcommand\X{\mathcal{X}}
\newcommand\Y{\mathcal{Y}}
\newcommand\domS{\mathcal{S}}
\newcommand\A{\mathcal{A}}
\newcommand\Data{\mathcal{D}}
\newcommand\rvm[1]{\mathrm{#1}}
\newcommand\sps{\,.\,}
\newcommand\Ipred{\mathcal{I}_{\mathrm{pred}}}
\newcommand\Ix{\mathcal{I}}
\newcommand\IXZ{\overline{\underline{\mathcal{I}}}}
\newcommand\x{\vec{x}}
\newcommand\Ham[1]{\mathcal{H}_{#1}}
\newcommand\subsets[2]{[#1]^{(k)}}
\def\bet(#1,#2){#1..#2}


\def\ev(#1=#2){#1\!\!=\!#2}
\newcommand\rv[1]{\Omega \to #1}
\newcommand\ceq{\!\!=\!}
\newcommand\cmin{\!-\!}
\newcommand\modulo[2]{#1\!\!\!\!\!\mod#2}

\newcommand\sumitoN{\sum_{i=1}^N}
\newcommand\sumktoK{\sum_{k=1}^K}
\newcommand\sumjtoK{\sum_{j=1}^K}
\newcommand\sumalpha{\sum_{\alpha\in\A}}
\newcommand\prodktoK{\prod_{k=1}^K}
\newcommand\prodjtoK{\prod_{j=1}^K}

\newcommand\past[1]{\overset{\rule{0pt}{0.2em}\smash{\leftarrow}}{#1}}
\newcommand\fut[1]{\overset{\rule{0pt}{0.1em}\smash{\rightarrow}}{#1}}
\newcommand\parity[2]{P^{#1}_{2,#2}}

%\usepackage[parfill]{parskip}

\begin{document}
\title{Cognitive Music Modelling: an Information Dynamics Approach}

\author{
\IEEEauthorblockN{Samer A. Abdallah, Henrik Ekeus, Peter Foster}
\IEEEauthorblockN{Andrew Robertson and Mark D. Plumbley}
\IEEEauthorblockA{Centre for Digital Music\\
Queen Mary University of London\\
Mile End Road, London E1 4NS\\
Email:}}

\maketitle
\begin{abstract}
As we listen to music, we take in information and use it to continually
build and revise predictive models of what is going to happen next.
Measures of this information are closely related to how we perceive music,
so an information-theoretic approach to music cognition is a fruitful
avenue of research. In this paper, we review the theoretical foundations
of information dynamics and discuss a few emerging areas of application.
\end{abstract}


\section{Introduction}
\label{s:Intro}

\subsection{Expectation and surprise in music}
One of the effects of listening to music is to create
expectations of what is to come next, which may be fulfilled
immediately, after some delay, or not at all as the case may be.
This is the thesis put forward by, amongst others, music theorists
L. B.
Meyer \cite{Meyer67} and Narmour \citep{Narmour77}, but was
recognised much earlier; for example,
it was elegantly put by Hanslick \cite{Hanslick1854} in the
nineteenth century:
\begin{quote}
`The most important factor in the mental process which accompanies the
act of listening to music, and which converts it to a source of pleasure,
is \ldots the intellectual satisfaction
which the listener derives from continually following and anticipating
the composer's intentions---now, to see his expectations fulfilled, and
now, to find himself agreeably mistaken.'
%It is a matter of course that
%this intellectual flux and reflux, this perpetual giving and receiving
%takes place unconsciously, and with the rapidity of lightning-flashes.'
\end{quote}
An essential aspect of this is that music is experienced as a phenomenon
that `unfolds' in time, rather than being apprehended as a static object
presented in its entirety. Meyer argued that musical experience depends
on how we change and revise our conceptions \emph{as events happen}, on
how expectation and prediction interact with occurrence, and that, to a
large degree, the way to understand the effect of music is to focus on
this `kinetics' of expectation and surprise.

Prediction and expectation are essentially probabilistic concepts
and can be treated mathematically using probability theory.
We suppose that when we listen to music, expectations are created on the basis
of our familiarity with various styles of music and our ability to
detect and learn statistical regularities in the music as they emerge.
There is experimental evidence that human listeners are able to internalise
statistical knowledge about musical structure, \eg
\citep{SaffranJohnsonAslin1999,EerolaToiviainenKrumhansl2002}, and also
that statistical models can form an effective basis for computational
analysis of music, \eg
\cite{ConklinWitten95,PonsfordWigginsMellish1999,Pearce2005}.


\comment{
The business of making predictions and assessing surprise is essentially
one of reasoning under conditions of uncertainty and manipulating
degrees of belief about the various propositions which may or may not
hold, and, as has been argued elsewhere \cite{Cox1946,Jaynes27}, best
quantified in terms of Bayesian probability theory.
Thus, we suppose that
when we listen to music, expectations are created on the basis of our
familiarity with various stylistic norms that apply to music in general,
the particular style (or styles) of music that seem best to fit the piece
we are listening to, and
the emerging structures peculiar to the current piece. There is
experimental evidence that human listeners are able to internalise
statistical knowledge about musical structure, \eg
\citep{SaffranJohnsonAslin1999,EerolaToiviainenKrumhansl2002}, and also
that statistical models can form an effective basis for computational
analysis of music, \eg
\cite{ConklinWitten95,PonsfordWigginsMellish1999,Pearce2005}.
}

\subsection{Music and information theory}
With a probabilistic framework for music modelling and prediction in hand,
we are in a position to apply Shannon's quantitative information theory
\cite{Shannon48}.
\comment{
which provides us with a number of measures, such as entropy
and mutual information, which are suitable for quantifying states of
uncertainty and surprise, and thus could potentially enable us to build
quantitative models of the listening process described above. They are
what Berlyne \cite{Berlyne71} called `collative variables' since they are
to do with patterns of occurrence rather than medium-specific details.
Berlyne sought to show that the collative variables are closely related to
perceptual qualities like complexity, tension, interestingness,
and even aesthetic value, not just in music, but in other temporal
or visual media.
The relevance of information theory to music and art has
also been addressed by researchers from the 1950s onwards
\cite{Youngblood58,CoonsKraehenbuehl1958,Cohen1962,HillerBean66,Moles66,Meyer67}.
}
The relationship between information theory and music and art in general has been the
subject of some interest since the 1950s
\cite{Youngblood58,CoonsKraehenbuehl1958,HillerBean66,Moles66,Meyer67,Cohen1962}.
The general thesis is that perceptible qualities and subjective
states like uncertainty, surprise, complexity, tension, and interestingness
are closely related to
information-theoretic quantities like entropy, relative entropy,
and mutual information.
% and are major determinants of the overall experience.
Berlyne \cite{Berlyne71} called such quantities `collative variables', since
they are to do with patterns of occurrence rather than medium-specific details,
and developed the ideas of `information aesthetics' in an experimental setting.
% Berlyne's `new experimental aesthetics', the `information-aestheticians'.

% Listeners then experience greater or lesser levels of surprise
% in response to departures from these norms.
% By careful manipulation
% of the material, the composer can thus define, and induce within the
% listener, a temporal programme of varying
% levels of uncertainty, ambiguity and surprise.


\subsection{Information dynamic approach}

Bringing the various strands together, our working hypothesis is that as a
listener (to which we will refer as `it') listens to a piece of music, it maintains
a dynamically evolving probabilistic model that enables it to make predictions
about how the piece will continue, relying on both its previous experience
of music and the immediate context of the piece. As events unfold, it revises
its probabilistic belief state, which includes predictive
distributions over possible future events. These
% distributions and changes in distributions
can be characterised in terms of a handful of information-theoretic
measures such as entropy and relative entropy.
By tracing the
evolution of these measures, we obtain a representation which captures much
of the significant structure of the music.

One of the consequences of this approach is that regardless of the details of
the sensory input or even which sensory modality is being processed, the resulting
analysis is in terms of the same units: quantities of information (bits) and
rates of information flow (bits per second). The probabilistic and
information-theoretic concepts in terms of which the analysis is framed are
universal to all sorts of data.
In addition, when adaptive probabilistic models are used, expectations are
created mainly in response to \emph{patterns} of occurrence,
rather than the details of which specific things occur.
Together, these suggest that an information dynamic analysis captures a
high level of \emph{abstraction}, and could be used to
make structural comparisons between different temporal media,
such as music, film, animation, and dance.
% analyse and compare information
% flow in different temporal media regardless of whether they are auditory,
% visual or otherwise.

Another consequence is that the information dynamic approach gives us a principled way
to address the notion of \emph{subjectivity}, since the analysis is dependent on the
probability model the observer starts off with, which may depend on prior experience
or other factors, and which may change over time. Thus, inter-subject variability and
variation in subjects' responses over time are
fundamental to the theory.

%modelling the creative process, which often alternates between generative
%and selective or evaluative phases \cite{Boden1990}, and would have
%applications in tools for computer aided composition.


\section{Theoretical review}

\subsection{Entropy and information}
Let $X$ denote some variable whose value is initially unknown to our
hypothetical observer. We will treat $X$ mathematically as a random variable,
with a value to be drawn from some set $\X$ and a
probability distribution representing the observer's beliefs about the
true value of $X$.
In this case, the observer's uncertainty about $X$ can be quantified
as the entropy of the random variable, $H(X)$. For a discrete variable
with probability mass function $p:\X \to [0,1]$, this is
\begin{equation}
	H(X) = \sum_{x\in\X} -p(x) \log p(x) = \expect{-\log p(X)},
\end{equation}
where $\expect{}$ is the expectation operator. The negative log-probability
$\ell(x) = -\log p(x)$ of a particular value $x$ can usefully be thought of as
the \emph{surprisingness} of the value $x$ should it be observed, and
hence the entropy is the expected surprisingness.

Now suppose that the observer receives some new data $\Data$ that
causes a revision of its beliefs about $X$.
The \emph{information}
in this new data \emph{about} $X$ can be quantified as the
Kullback-Leibler (KL) divergence between the prior and posterior
distributions $p(x)$ and $p(x|\Data)$ respectively:
\begin{equation}
	\mathcal{I}_{\Data\to X} = D(p_{X|\Data} || p_{X})
		= \sum_{x\in\X} p(x|\Data) \log \frac{p(x|\Data)}{p(x)}.
\end{equation}
When there are multiple variables $X_1, X_2$
\etc which the observer believes to be dependent, then the observation of
one may change its beliefs and hence yield information about the
others. The joint and conditional entropies as described in any
textbook on information theory (\eg \cite{CoverThomas}) then quantify
the observer's expected uncertainty about groups of variables given the
values of others. In particular, the \emph{mutual information}
$I(X_1;X_2)$ is both the expected information
in an observation of $X_2$ about $X_1$ and the expected reduction
in uncertainty about $X_1$ after observing $X_2$:
\begin{equation}
	I(X_1;X_2) = H(X_1) - H(X_1|X_2),
\end{equation}
where $H(X_1|X_2) = H(X_1,X_2) - H(X_2)$ is the conditional entropy
of $X_1$ given $X_2$. A little algebra shows that $I(X_1;X_2)=I(X_2;X_1)$,
and so the mutual information is symmetric in its arguments. A conditional
form of the mutual information can be formulated analogously:
\begin{equation}
	I(X_1;X_2|X_3) = H(X_1|X_3) - H(X_1|X_2,X_3).
\end{equation}
These relationships between the various entropies and mutual
informations are conveniently visualised in Venn diagram-like \emph{information diagrams}
or I-diagrams \cite{Yeung1991} such as the one in \figrf{venn-example}.
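
To make these definitions concrete, the short Python sketch below (an
illustrative aside using hypothetical toy numbers, not code from any system
described in this paper) computes the entropy of a discrete distribution,
the surprisingness of a single value, the information in some data $\Data$
about $X$ as a KL divergence, and the mutual information of two variables
from their joint distribution.
\begin{lstlisting}[language=Python,basicstyle=\small\ttfamily]
import numpy as np

def entropy(p):
    """Entropy in bits of a discrete distribution p."""
    p = np.asarray(p, dtype=float)
    nz = p > 0                        # by convention 0 log 0 = 0
    return float(-np.sum(p[nz] * np.log2(p[nz])))

def kl(post, prior):
    """D(post || prior) in bits: information in the data about X."""
    post = np.asarray(post, dtype=float)
    prior = np.asarray(prior, dtype=float)
    nz = post > 0
    return float(np.sum(post[nz] * np.log2(post[nz] / prior[nz])))

def mutual_information(joint):
    """I(X1;X2) = H(X1) + H(X2) - H(X1,X2), from a joint table."""
    joint = np.asarray(joint, dtype=float)
    p1, p2 = joint.sum(axis=1), joint.sum(axis=0)
    return entropy(p1) + entropy(p2) - entropy(joint.ravel())

# Toy example: a uniform prior over four values of X and a
# posterior after observing some hypothetical data D.
prior = [0.25, 0.25, 0.25, 0.25]
posterior = [0.7, 0.1, 0.1, 0.1]
print(entropy(prior))        # 2.0 bits of initial uncertainty
print(-np.log2(0.25))        # surprisingness of any one value
print(kl(posterior, prior))  # information in D about X

joint = np.array([[0.3, 0.1],
                  [0.1, 0.5]])       # hypothetical joint distribution
print(mutual_information(joint))     # I(X1;X2) in bits
\end{lstlisting}
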
\begin{fig}{venn-example}
	\newcommand\rad{2.2em}%
	\newcommand\circo{circle (3.4em)}%
	\newcommand\labrad{4.3em}
	\newcommand\bound{(-6em,-5em) rectangle (6em,6em)}
	\newcommand\colsep{\ }
	\newcommand\clipin[1]{\clip (#1) \circo;}%
	\newcommand\clipout[1]{\clip \bound (#1) \circo;}%
	\newcommand\cliptwo[3]{%
		\begin{scope}
			\clipin{#1};
			\clipin{#2};
			\clipout{#3};
			\fill[black!30] \bound;
		\end{scope}
	}%
	\newcommand\clipone[3]{%
		\begin{scope}
			\clipin{#1};
			\clipout{#2};
			\clipout{#3};
			\fill[black!15] \bound;
		\end{scope}
	}%
	\begin{tabular}{c@{\colsep}c}
		\begin{tikzpicture}[baseline=0pt]
			\coordinate (p1) at (90:\rad);
			\coordinate (p2) at (210:\rad);
			\coordinate (p3) at (-30:\rad);
			\clipone{p1}{p2}{p3};
			\clipone{p2}{p3}{p1};
			\clipone{p3}{p1}{p2};
			\cliptwo{p1}{p2}{p3};
			\cliptwo{p2}{p3}{p1};
			\cliptwo{p3}{p1}{p2};
			\begin{scope}
				\clip (p1) \circo;
				\clip (p2) \circo;
				\clip (p3) \circo;
				\fill[black!45] \bound;
			\end{scope}
			\draw (p1) \circo;
			\draw (p2) \circo;
			\draw (p3) \circo;
			\path
				(barycentric cs:p3=1,p1=-0.2,p2=-0.1) +(0ex,0) node {$I_{3|12}$}
				(barycentric cs:p1=1,p2=-0.2,p3=-0.1) +(0ex,0) node {$I_{1|23}$}
				(barycentric cs:p2=1,p3=-0.2,p1=-0.1) +(0ex,0) node {$I_{2|13}$}
				(barycentric cs:p3=1,p2=1,p1=-0.55) +(0ex,0) node {$I_{23|1}$}
				(barycentric cs:p1=1,p3=1,p2=-0.55) +(0ex,0) node {$I_{13|2}$}
				(barycentric cs:p2=1,p1=1,p3=-0.55) +(0ex,0) node {$I_{12|3}$}
				(barycentric cs:p3=1,p2=1,p1=1) node {$I_{123}$}
			;
			\path
				(p1) +(140:\labrad) node {$X_1$}
				(p2) +(-140:\labrad) node {$X_2$}
				(p3) +(-40:\labrad) node {$X_3$};
		\end{tikzpicture}
		&
		\parbox{0.5\linewidth}{
			\small
			\begin{align*}
				I_{1|23} &= H(X_1|X_2,X_3) \\
				I_{13|2} &= I(X_1;X_3|X_2) \\
				I_{1|23} + I_{13|2} &= H(X_1|X_2) \\
				I_{12|3} + I_{123} &= I(X_1;X_2)
			\end{align*}
		}
	\end{tabular}
	\caption{
		I-diagram visualisation of entropies and mutual informations
		for three random variables $X_1$, $X_2$ and $X_3$. The areas of
		the three circles represent $H(X_1)$, $H(X_2)$ and $H(X_3)$ respectively.
		The total shaded area is the joint entropy $H(X_1,X_2,X_3)$.
		The central area $I_{123}$ is the co-information \cite{McGill1954}.
		Some other information measures are indicated in the legend.
	}
\end{fig}


\subsection{Surprise and information in sequences}
\label{s:surprise-info-seq}

Suppose that $(\ldots,X_{-1},X_0,X_1,\ldots)$ is a sequence of
random variables, infinite in both directions,
and that $\mu$ is the associated probability measure over all
realisations of the sequence---in the following, $\mu$ will simply serve
as a label for the process.
We can identify a number of information-theoretic
measures meaningful in the context of a sequential observation of the sequence, during
which, at any time $t$, the sequence of variables can be divided into a `present' $X_t$, a `past'
$\past{X}_t \equiv (\ldots, X_{t-2}, X_{t-1})$, and a `future'
$\fut{X}_t \equiv (X_{t+1},X_{t+2},\ldots)$.
The actually observed value of $X_t$ will be written as $x_t$, and
the sequence of observations up to but not including $x_t$ as
$\past{x}_t$.
% Since the sequence is assumed stationary, we can without loss of generality,
% assume that $t=0$ in the following definitions.

The in-context surprisingness of the observation $X_t=x_t$ is a function
of both $x_t$ and the context $\past{x}_t$:
\begin{equation}
	\ell(x_t|\past{x}_t) = - \log p(x_t|\past{x}_t).
\end{equation}
However, before $X_t$ is observed to be $x_t$, the observer can compute
its \emph{expected} surprisingness as a measure of its uncertainty about
the very next event; this may be written as an entropy
$H(X_t|\ev(\past{X}_t = \past{x}_t))$, but note that this is
conditional on the \emph{event} $\ev(\past{X}_t=\past{x}_t)$, not
the \emph{variables} $\past{X}_t$ as in the conventional conditional entropy.

The surprisingness $\ell(x_t|\past{x}_t)$ and expected surprisingness
$H(X_t|\ev(\past{X}_t=\past{x}_t))$
are subjective information dynamic measures, since they are based on the observer's
subjective probability model in the context of the actually observed sequence
$\past{x}_t$---they characterise what it is like to `be in the observer's shoes'.
If we view the observer as a purely passive or reactive agent, this would
probably be sufficient, but for active agents such as humans or animals, it is
often necessary to \emph{anticipate} future events in order, for example, to plan the
most effective course of action. It makes sense for such observers to be
concerned about the predictive probability distribution over future events,
$p(\fut{x}_t|\past{x}_t)$. When an observation $\ev(X_t=x_t)$ is made in this context,
the \emph{instantaneous predictive information} (IPI) is the information in the
event $\ev(X_t=x_t)$ about the entire future of the sequence $\fut{X}_t$.
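
The following Python sketch illustrates how these quantities behave in the
simplest setting, assuming a fixed, hand-specified first-order Markov model
(the adaptive models used later in this paper are deliberately left out of
this minimal illustration). For each symbol it reports the in-context
surprisingness and the expected surprisingness, \ie the entropy of the
predictive distribution computed just before the observation.
\begin{lstlisting}[language=Python,basicstyle=\small\ttfamily]
import numpy as np

# Hypothetical first-order Markov model over 3 symbols:
# a[i, j] = p(next symbol = i | current symbol = j).
a = np.array([[0.8, 0.1, 0.2],
              [0.1, 0.8, 0.2],
              [0.1, 0.1, 0.6]])

def surprisingness_trace(seq, a):
    """(-log2 p(x_t|x_{t-1}), H(X_t|past) in bits) for each t > 0."""
    out = []
    for prev, cur in zip(seq[:-1], seq[1:]):
        predictive = a[:, prev]               # p(X_t | past)
        surprise = -np.log2(predictive[cur])  # surprisingness of x_t
        # entropy of the prediction (all probabilities > 0 here)
        expected = -np.sum(predictive * np.log2(predictive))
        out.append((surprise, expected))
    return out

seq = [0, 0, 1, 1, 2, 0]                      # a toy observation sequence
for s, h in surprisingness_trace(seq, a):
    print(f"surprise {s:.2f} bits, expected {h:.2f} bits")
\end{lstlisting}
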
\subsection{Information measures for stationary random processes}

The \emph{entropy rate} of the process is the entropy of the next variable
$X_t$ given all the previous ones:
\begin{equation}
	\label{eq:entro-rate}
	h_\mu = H(X_0|\past{X}_0).
\end{equation}
The entropy rate gives a measure of the overall randomness
or unpredictability of the process.

The \emph{multi-information rate} $\rho_\mu$ (following Dubnov's \cite{Dubnov2006}
notation for what he called the `information rate') is the mutual
information between the `past' and the `present':
\begin{equation}
	\label{eq:multi-info}
	\rho_\mu = I(\past{X}_0;X_0) = H(X_0) - h_\mu.
\end{equation}
It is a measure of how much the context of an observation (that is,
the observation of previous elements of the sequence) helps in predicting
or reducing the surprisingness of the current observation.

The \emph{excess entropy} \cite{CrutchfieldPackard1983}
is the mutual information between
the entire `past' and the entire `future':
\begin{equation}
	E = I(\past{X}_0; X_0,\fut{X}_0).
\end{equation}


\begin{fig}{predinfo-bg}
	\newcommand\subfig[2]{\shortstack{#2\\[0.75em]#1}}
	\newcommand\rad{1.8em}%
	\newcommand\ovoid[1]{%
		++(-#1,\rad)
		-- ++(2 * #1,0em) arc (90:-90:\rad)
		-- ++(-2 * #1,0em) arc (270:90:\rad)
	}%
	\newcommand\axis{2.75em}%
	\newcommand\olap{0.85em}%
	\newcommand\offs{3.6em}
	\newcommand\colsep{\hspace{5em}}
	\newcommand\longblob{\ovoid{\axis}}
	\newcommand\shortblob{\ovoid{1.75em}}
	\begin{tabular}{c@{\colsep}c}
		\subfig{(a) excess entropy}{%
			\newcommand\blob{\longblob}
			\begin{tikzpicture}
				\coordinate (p1) at (-\offs,0em);
				\coordinate (p2) at (\offs,0em);
				\begin{scope}
					\clip (p1) \blob;
					\clip (p2) \blob;
					\fill[lightgray] (-1,-1) rectangle (1,1);
				\end{scope}
				\draw (p1) +(-0.5em,0em) node{\shortstack{infinite\\past}} \blob;
				\draw (p2) +(0.5em,0em) node{\shortstack{infinite\\future}} \blob;
				\path (0,0) node (future) {$E$};
				\path (p1) +(-2em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
				\path (p2) +(2em,\rad) node [anchor=south] {$X_0,\ldots$};
			\end{tikzpicture}%
		}%
		\\[1.25em]
		\subfig{(b) predictive information rate $b_\mu$}{%
			\begin{tikzpicture}%[baseline=-1em]
				\newcommand\rc{2.1em}
				\newcommand\throw{2.5em}
				\coordinate (p1) at (210:1.5em);
				\coordinate (p2) at (90:0.7em);
				\coordinate (p3) at (-30:1.5em);
				\newcommand\bound{(-7em,-2.6em) rectangle (7em,3.0em)}
				\newcommand\present{(p2) circle (\rc)}
				\newcommand\thepast{(p1) ++(-\throw,0) \ovoid{\throw}}
				\newcommand\future{(p3) ++(\throw,0) \ovoid{\throw}}
				\newcommand\fillclipped[2]{%
					\begin{scope}[even odd rule]
						\foreach \thing in {#2} {\clip \thing;}
						\fill[black!#1] \bound;
					\end{scope}%
				}%
				\fillclipped{30}{\present,\future,\bound \thepast}
				\fillclipped{15}{\present,\bound \future,\bound \thepast}
				\draw \future;
				\fillclipped{45}{\present,\thepast}
				\draw \thepast;
				\draw \present;
				\node at (barycentric cs:p2=1,p1=-0.17,p3=-0.17) {$r_\mu$};
				\node at (barycentric cs:p1=-0.4,p2=1.0,p3=1) {$b_\mu$};
				\node at (barycentric cs:p3=0,p2=1,p1=1.2) [shape=rectangle,fill=black!45,inner sep=1pt]{$\rho_\mu$};
				\path (p2) +(140:3em) node {$X_0$};
				% \node at (barycentric cs:p3=0,p2=1,p1=1) {$\rho_\mu$};
				\path (p3) +(3em,0em) node {\shortstack{infinite\\future}};
				\path (p1) +(-3em,0em) node {\shortstack{infinite\\past}};
				\path (p1) +(-4em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
				\path (p3) +(4em,\rad) node [anchor=south] {$X_1,\ldots$};
			\end{tikzpicture}}%
		\\[0.5em]
	\end{tabular}
	\caption{
		I-diagrams for several information measures in
		stationary random processes. Each circle or oval represents a random
		variable or sequence of random variables relative to time $t=0$. Overlapped areas
		correspond to various mutual informations as in \Figrf{venn-example}.
		In (b), the circle represents the `present'. Its total area is
		$H(X_0)=\rho_\mu+r_\mu+b_\mu$, where $\rho_\mu$ is the multi-information
		rate, $r_\mu$ is the residual entropy rate, and $b_\mu$ is the predictive
		information rate. The entropy rate is $h_\mu = r_\mu+b_\mu$.
	}
\end{fig}

The \emph{predictive information rate} (or PIR) \cite{AbdallahPlumbley2009}
is the average information in one observation about the infinite future given the infinite past,
and is defined as a conditional mutual information:
\begin{equation}
	\label{eq:PIR}
	b_\mu = I(X_0;\fut{X}_0|\past{X}_0) = H(\fut{X}_0|\past{X}_0) - H(\fut{X}_0|X_0,\past{X}_0).
\end{equation}
Equation \eqrf{PIR} can be read as the average reduction
in uncertainty about the future on learning $X_0$, given the past.
Due to the symmetry of the mutual information, it can also be written
as
\begin{equation}
%	\IXZ_t
	I(X_0;\fut{X}_0|\past{X}_0) = h_\mu - r_\mu,
%	\label{<++>}
\end{equation}
% If $X$ is stationary, then
where $r_\mu = H(X_0|\fut{X}_0,\past{X}_0)$
is the \emph{residual} \cite{AbdallahPlumbley2010},
or \emph{erasure} \cite{VerduWeissman2006}, entropy rate.
These relationships are illustrated in \Figrf{predinfo-bg}, along with
several of the information measures we have discussed so far.


James et al.\ \cite{JamesEllisonCrutchfield2011} study the predictive information
rate and also examine some related measures. In particular, they identify
$\sigma_\mu$, the difference between the multi-information rate and the excess
entropy, as an interesting quantity that measures the predictive benefit of
model-building (that is, maintaining an internal state summarising past
observations in order to make better predictions). They also identify
$w_\mu = \rho_\mu + b_{\mu}$, which they call the \emph{local exogenous
information} rate.

\begin{fig}{wundt}
	\raisebox{-4em}{\colfig[0.43]{wundt}}
%	{\ \shortstack{{\Large$\longrightarrow$}\\ {\scriptsize\emph{exposure}}}\ }
	{\ {\large$\longrightarrow$}\ }
	\raisebox{-4em}{\colfig[0.43]{wundt2}}
	\caption{
		The Wundt curve relating randomness/complexity with
		perceived value. Repeated exposure sometimes results
		in a move to the left along the curve \cite{Berlyne71}.
	}
\end{fig}
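
To gain some intuition for these quantities, consider two extreme kinds of
stationary process, with information measured in bits. For a process whose
elements are independent and uniformly distributed over $K$ symbols, the
past is of no help in predicting the present or the future, so
\begin{equation*}
	h_\mu = H(X_0) = \log_2 K, \qquad \rho_\mu = E = b_\mu = 0, \qquad r_\mu = h_\mu.
\end{equation*}
Conversely, for a deterministic cycle through $m$ distinct symbols observed
at a random phase, the past determines everything that follows, so
\begin{equation*}
	h_\mu = 0, \qquad H(X_0) = \rho_\mu = E = \log_2 m, \qquad b_\mu = r_\mu = 0.
\end{equation*}
The predictive information rate vanishes in both cases: neither pure noise
nor pure repetition yields observations that carry new information about the
future, a point that will recur when we discuss the Melody Triangle below.
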
\subsection{First and higher order Markov chains}
First order Markov chains are the simplest non-trivial models to which information
dynamics methods can be applied. In \cite{AbdallahPlumbley2009} we derived
expressions for all the information measures introduced above for
irreducible stationary Markov chains (\ie those that have a unique stationary
distribution). The derivation is greatly simplified by the dependency structure
of the Markov chain: for the purpose of the analysis, the `past' and `future'
segments $\past{X}_t$ and $\fut{X}_t$ can be reduced to just the previous
and next variables $X_{t-1}$ and $X_{t+1}$ respectively. We also showed that
the predictive information rate can be expressed simply in terms of entropy rates:
if we let $a$ denote the $K\times K$ transition matrix of a Markov chain over
an alphabet $\{1,\ldots,K\}$, such that
$a_{ij} = \Pr(\ev(X_t=i)|\ev(X_{t-1}=j))$, and let $h:\reals^{K\times K}\to \reals$ be
the entropy rate function such that $h(a)$ is the entropy rate of a Markov chain
with transition matrix $a$, then the predictive information rate $b(a)$ is
\begin{equation}
	b(a) = h(a^2) - h(a),
\end{equation}
where $a^2$, the transition matrix squared, is the transition matrix
of the `skip one' Markov chain obtained by jumping two steps at a time
along the original chain.

Second and higher order Markov chains can be treated in a similar way by transforming
to a first order representation of the high order Markov chain. If we are dealing
with an $N$th order model, this is done by forming a new alphabet of size $K^N$
consisting of all possible $N$-tuples of symbols from the base alphabet of size $K$.
An observation $\hat{x}_t$ in this new model represents a block of $N$ observations
$(x_{t+1},\ldots,x_{t+N})$ from the base model. The next
observation $\hat{x}_{t+1}$ represents the block of $N$ observations obtained by shifting the previous
block along by one step. The new Markov chain is parameterised by a sparse $K^N\times K^N$
transition matrix $\hat{a}$. The entropy rate of the first order system is the same
as the entropy rate of the original order $N$ system, and its PIR is
\begin{equation}
	b(\hat{a}) = h(\hat{a}^{N+1}) - N h(\hat{a}),
\end{equation}
where $\hat{a}^{N+1}$ is the $(N+1)$th power of the first order transition matrix.
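
The following sketch (a minimal re-implementation of these formulae, not the
code used in \cite{AbdallahPlumbley2009}) computes the entropy rate, the
multi-information rate $\rho_\mu = H(X_0) - h_\mu$ and the predictive
information rate $b(a) = h(a^2) - h(a)$ of a first-order chain directly from
its transition matrix; the example matrix is a hypothetical one.
\begin{lstlisting}[language=Python,basicstyle=\small\ttfamily]
import numpy as np

def stationary(a):
    """Stationary distribution of a column-stochastic matrix a."""
    # Convention: a[i, j] = Pr(X_t = i | X_{t-1} = j), columns sum to 1.
    vals, vecs = np.linalg.eig(a)
    v = np.real(vecs[:, np.argmax(np.real(vals))])  # eigenvalue-1 vector
    return v / v.sum()

def entropy(p):
    """Entropy in bits of a discrete distribution (zeros ignored)."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def entropy_rate(a, pi):
    """h(a): expected entropy of the next symbol given the current one."""
    return float(sum(pi[j] * entropy(a[:, j]) for j in range(len(pi))))

def markov_measures(a):
    """Return (h, rho, b) in bits for transition matrix a."""
    pi = stationary(a)
    h = entropy_rate(a, pi)
    rho = entropy(pi) - h             # multi-information rate: H(X_0) - h
    b = entropy_rate(a @ a, pi) - h   # PIR: h(a^2) - h(a)
    return h, rho, b

# Hypothetical 3-state example; each column sums to 1.
a = np.array([[0.1, 0.8, 0.3],
              [0.6, 0.1, 0.3],
              [0.3, 0.1, 0.4]])
print(markov_measures(a))
\end{lstlisting}
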
\section{Information Dynamics in Analysis}

\begin{fig}{twopages}
	\colfig[0.96]{matbase/fig9471} % update from mbc paper
%	\colfig[0.97]{matbase/fig72663}\\ % later update from mbc paper (Keith's new picks)
	\vspace*{1em}
	\colfig[0.97]{matbase/fig13377} % rule based analysis
	\caption{Analysis of \emph{Two Pages}.
		The thick vertical lines are the part boundaries as indicated in
		the score by the composer.
		The thin grey lines
		indicate changes in the melodic `figures' of which the piece is
		constructed. In the `model information rate' panel, the black asterisks
		mark the
		six most surprising moments selected by Keith Potter.
		The bottom panel shows a rule-based boundary strength analysis computed
		using Cambouropoulos' LBDM.
		All information measures are in nats and time is in notes.
	}
\end{fig}

\subsection{Musicological Analysis}
In \cite{AbdallahPlumbley2009}, methods based on the theory described above
were used to analyse two pieces of music in the minimalist style
by Philip Glass: \emph{Two Pages} (1969) and \emph{Gradus} (1968).
The analysis was done using a first-order Markov chain model, with the
enhancement that the transition matrix of the model was allowed to
evolve dynamically as the notes were processed, and was tracked (in
a Bayesian way) as a \emph{distribution} over possible transition matrices,
rather than a point estimate. The results are summarised in \figrf{twopages}:
the upper four plots show the dynamically evolving subjective information
measures as described in \secrf{surprise-info-seq}, computed using a point
estimate of the current transition matrix, while the fifth plot (the `model information rate')
measures the information in each observation about the transition matrix itself.
In \cite{AbdallahPlumbley2010b}, we showed that this `model information rate'
is actually a component of the true IPI in
a time-varying Markov chain, which was neglected when we computed the IPI from
point estimates of the transition matrix as if the transition probabilities
were constant.

The peaks of the surprisingness and both components of the predictive information
show good correspondence with the structure of the piece, both as marked in the score
and as analysed by musicologist Keith Potter, who was asked to mark the six
`most surprising moments' of the piece (shown as asterisks in the fifth plot)%
\footnote{%
	Note that the boundary marked in the score at around note 5,400 is known to be
	anomalous; on the basis of a listening analysis, some musicologists [ref] have
	placed the boundary a few bars later, in agreement with our analysis.}.

In contrast, the analyses shown in the lower two plots of \figrf{twopages},
obtained using two rule-based music segmentation algorithms, while clearly
\emph{reflecting} the structure of the piece, do not \emph{segment} it:
the boundary strength functions show no tendency to peak at
the boundaries in the piece.


\begin{fig}{metre}
%	\scalebox{1}[1]{%
	\begin{tabular}{cc}
		\colfig[0.45]{matbase/fig36859} & \colfig[0.48]{matbase/fig88658} \\
		\colfig[0.45]{matbase/fig48061} & \colfig[0.48]{matbase/fig46367} \\
		\colfig[0.45]{matbase/fig99042} & \colfig[0.47]{matbase/fig87490}
%		\colfig[0.46]{matbase/fig56807} & \colfig[0.48]{matbase/fig27144} \\
%		\colfig[0.46]{matbase/fig87574} & \colfig[0.48]{matbase/fig13651} \\
%		\colfig[0.44]{matbase/fig19913} & \colfig[0.46]{matbase/fig66144} \\
%		\colfig[0.48]{matbase/fig73098} & \colfig[0.48]{matbase/fig57141} \\
%		\colfig[0.48]{matbase/fig25703} & \colfig[0.48]{matbase/fig72080} \\
%		\colfig[0.48]{matbase/fig9142} & \colfig[0.48]{matbase/fig27751}
	\end{tabular}%
%	}
	\caption{Metrical analysis by computing the average surprisingness and
		informativeness of notes at different periodicities (\ie hypothetical
		bar lengths) and phases (\ie positions within a bar).
	}
\end{fig}

\subsection{Content analysis/Sound Categorisation}
Overview of information-theoretic approaches to music content analysis.
\begin{itemize}
	\item Continuous domain information
	\item Audio based music expectation modelling
	\item Proposed model for Gaussian processes
\end{itemize}
\emph{Peter}


\subsection{Beat Tracking}
\emph{Andrew}


\section{Information dynamics as compositional aid}

In addition to applying information dynamics to analysis, it is also possible to apply it to the generation of content, such as the composition of musical materials.
The outputs of algorithmic or stochastic processes can be filtered to match a set of criteria defined in terms of the information dynamics model, these criteria thus becoming a means of interfacing with the generative process.
For instance, a stochastic music generating process could be controlled by modifying constraints on its output in terms of predictive information rate or entropy rate.

The use of stochastic processes for the composition of musical material has been widespread for decades -- for instance, Iannis Xenakis applied probabilistic mathematical models to the creation of musical materials \cite{Xenakis:1992ul}.
Information dynamics can serve as a novel framework for the exploration of the possibilities of such processes at the high and abstract level of expectation, randomness and predictability.

\subsection{The Melody Triangle}
The Melody Triangle is an exploratory interface for the discovery of melodic content, where the input -- positions within a triangle -- directly maps to information-theoretic measures of the output.
The measures -- entropy rate, redundancy and predictive information rate -- form the criteria with which to filter the output of the stochastic processes used to generate sequences of notes.
These measures address notions of expectation and surprise in music, and as such the Melody Triangle is a means of interfacing with a generative process in terms of the predictability of its output.

The triangle is `populated' with possible parameter values for melody generators.
These are plotted in a 3D statistical space of redundancy, entropy rate and predictive information rate.
In our case we generated thousands of transition matrices, representing first-order Markov chains, by a random sampling method.
In figure \ref{InfoDynEngine} we see a representation of how these matrices are distributed in the 3D statistical space; each one of these points corresponds to a transition matrix.

The distribution of transition matrices plotted in this space forms an arch shape that is fairly thin.
It thus becomes a reasonable approximation to pretend that it is just a sheet in two dimensions, and so we stretch out this curved arc into a flat triangle.
It is this triangular sheet that is our `Melody Triangle', and it forms the interface by which the system is controlled.
Using this interface thus involves a mapping to statistical space: a user selects a position within the triangle, and a corresponding transition matrix is returned.
Figure \ref{TheTriangle} shows how the triangle maps to different measures of redundancy, entropy rate and predictive information rate.
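
One way such a population could be generated is sketched below; the use of
Dirichlet-distributed columns is an assumption made purely for illustration,
since the details of the random sampling method are not reproduced here,
and the helper functions mirror the Markov chain sketch given earlier.
\begin{lstlisting}[language=Python,basicstyle=\small\ttfamily]
import numpy as np

# Compact versions of the helpers from the earlier Markov chain sketch.
def entropy(p):
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def stationary(a):
    vals, vecs = np.linalg.eig(a)
    v = np.real(vecs[:, np.argmax(np.real(vals))])
    return v / v.sum()

def measures(a):
    """(redundancy, entropy rate, PIR) in bits; columns of a sum to 1."""
    pi, a2 = stationary(a), a @ a
    h = sum(pi[j] * entropy(a[:, j]) for j in range(len(pi)))
    h2 = sum(pi[j] * entropy(a2[:, j]) for j in range(len(pi)))
    return (entropy(pi) - h, h, h2 - h)

K = 8                                 # alphabet size, e.g. notes of a scale
rng = np.random.default_rng(0)
population = []
for _ in range(5000):
    # Dirichlet-distributed columns (sparser for smaller alpha); this
    # particular sampling scheme is an illustrative assumption.
    alpha = rng.uniform(0.05, 1.0)
    a = rng.dirichlet(alpha * np.ones(K), size=K).T   # columns sum to 1
    population.append(measures(a) + (a,))
# Each entry holds the coordinates of one transition matrix in the space
# of redundancy, entropy rate and PIR; selecting a point on the flattened
# triangle amounts to returning a matrix whose coordinates lie nearby.
\end{lstlisting}
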
The three corners correspond to three different extremes of predictability and unpredictability, which could be loosely characterised as `periodicity', `noise' and `repetition'.
Melodies from the `noise' corner have no discernible pattern; they have high entropy rate, low predictive information rate and low redundancy.
These melodies are essentially totally random.
Melodies along the `periodicity' to `repetition' edge are all deterministic loops that get shorter as we approach the `repetition' corner, until they reduce to a single repeating note.
It is the areas in between the extremes that provide the more `interesting' melodies.
These melodies have some level of unpredictability, but are not completely random.
Or, conversely, they are predictable, but not entirely so.

The Melody Triangle exists in two incarnations: a standard screen-based interface, where a user moves tokens in and around a triangle on screen, and a multi-user interactive installation, where a Kinect camera tracks individuals in a space and maps their positions in physical space to the triangle.
In the latter, each visitor entering the installation generates a melody, and can collaborate with their co-visitors to generate musical textures -- a playful yet informative way to explore expectation and surprise in music.
Additionally, different gestures can be detected to change the tempo, register, instrumentation and periodicity of the output melody.

\begin{figure}
\centering
\includegraphics[width=\linewidth]{figs/mtriscat}
\caption{The population of transition matrices distributed along three axes of
	redundancy, entropy rate and predictive information rate (all measured in bits).
	The concentrations of points along the redundancy axis correspond
	to Markov chains which are roughly periodic with periods of 2 (redundancy 1 bit),
	3, 4, \etc all the way to period 8 (redundancy 3 bits). The colour of each point
	represents its PIR---note that the highest values are found at intermediate entropy
	and redundancy, and that the distribution as a whole makes a curved triangle. Although
	not visible in this plot, it is largely hollow in the middle.
	\label{InfoDynEngine}}
\end{figure}

As a screen-based interface, the Melody Triangle can serve as a composition tool.
A triangle is drawn on the screen, with screen space thus mapped to the statistical space of the Melody Triangle.
A number of round tokens, each representing a melody, can be dragged in and around the triangle.
When a token is dragged into the triangle, the system starts generating a sequence of symbols with statistical properties that correspond to the position of the token.
These symbols are then mapped to notes of a scale.
Keyboard input allows control over additional parameters.

\begin{figure}
\centering
\includegraphics[width=0.9\linewidth]{figs/TheTriangle.pdf}
\caption{The Melody Triangle\label{TheTriangle}}
\end{figure}
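
The sketch below is an illustrative reconstruction of this final step (the
choice of scale and the example transition matrix are hypothetical, not those
of the actual implementation): it samples a symbol sequence from a selected
transition matrix and maps the symbols to MIDI note numbers of a scale.
\begin{lstlisting}[language=Python,basicstyle=\small\ttfamily]
import numpy as np

MAJOR_SCALE = (60, 62, 64, 65, 67, 69, 71, 72)  # C major, MIDI numbers

def generate_melody(a, length=16, scale=MAJOR_SCALE, seed=None):
    """Sample from the Markov chain with column-stochastic matrix a
    and map each symbol to a note of the given scale."""
    rng = np.random.default_rng(seed)
    k = a.shape[0]
    state = rng.integers(k)                    # arbitrary starting symbol
    notes = []
    for _ in range(length):
        state = rng.choice(k, p=a[:, state])   # next symbol given current
        notes.append(scale[state % len(scale)])
    return notes

# A nearly deterministic cycle 0 -> 1 -> 2 -> 0 (close to the
# `periodicity' corner), with a small chance of deviating.
a = np.array([[0.05, 0.05, 0.90],
              [0.90, 0.05, 0.05],
              [0.05, 0.90, 0.05]])
print(generate_melody(a, seed=1))
\end{lstlisting}
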
In this mode, the Melody Triangle is a compositional tool.
It can assist a composer in the creation not only of melodies but also, by placing multiple tokens in the triangle, of intricate musical textures.
Unlike other computer-aided composition tools or programming environments, here the composer engages with music on the high and abstract level of expectation, randomness and predictability.



\subsection{Information Dynamics as Evaluative Feedback Mechanism}
%NOT SURE THIS SHOULD BE HERE AT ALL..?


Information measures on a stream of symbols could form a feedback mechanism: a rudimentary `critic' of sorts.
For instance, symbol-by-symbol measures of predictive information rate, entropy rate and redundancy could tell us whether a stream of symbols is, at the current moment, `boring', either because it is too repetitive or because it is too chaotic.
Such feedback would be oblivious to longer-term and larger-scale structures, but it could nonetheless provide valuable insight into the short-term properties of a work.
This could not only be used for the evaluation of pre-composed streams of symbols, but could also provide real-time feedback in an improvisatory setup.


\section{Musical Preference and Information Dynamics}
We are carrying out a study to investigate the relationship between musical preference and the information dynamics models, with the experimental interface being a simplified version of the screen-based Melody Triangle.
Participants are asked to use this music pattern generator under various experimental conditions in a composition task.
The data collected includes usage statistics of the system: where in the triangle participants place the tokens, how long they leave them there, and the state of the system when they indicate, by pressing a key, that they like what they are hearing.
As such, the experiments will help us identify any correlation between the information-theoretic properties of a stream and its perceived aesthetic worth.


%\emph{comparable system} Gordon Pask's Musicolour (1953) applied a similar notion
%of boredom in its design. The Musicolour would react to audio input through a
%microphone by flashing coloured lights. Rather than a direct mapping of sound
%to light, Pask designed the device to be a partner to a performing musician. It
%would adapt its lighting pattern based on the rhythms and frequencies it would
%hear, quickly `learning' to flash in time with the music. However Pask endowed
%the device with the ability to `be bored'; if the rhythmic and frequency content
%of the input remained the same for too long it would listen for other rhythms
%and frequencies, only lighting when it heard these. As the Musicolour would
%`get bored', the musician would have to change and vary their playing, eliciting
%new and unexpected outputs in trying to keep the Musicolour interested.


\section{Conclusion}

\bibliographystyle{unsrt}
{\bibliography{all,c4dm,nime}}
\end{document}