cip2012: draft.tex annotate

annotate draft.tex @ 44:244b74fb707d

Added acknowledgments (copied from NIME paper, need a couple more for Andrew/Peter)

author	Henrik Ekeus <hekeus@eecs.qmul.ac.uk>
date	Thu, 15 Mar 2012 16:26:55 +0000
parents	3f643e9fead0
children	4d348a206e94 df41539257ba

rev	line source
samer@41	1 \documentclass[conference]{IEEEtran}
samer@4	2 \usepackage{cite}
samer@4	3 \usepackage[cmex10]{amsmath}
samer@4	4 \usepackage{graphicx}
samer@4	5 \usepackage{amssymb}
samer@4	6 \usepackage{epstopdf}
samer@4	7 \usepackage{url}
samer@4	8 \usepackage{listings}
samer@18	9 %\usepackage[expectangle]{tools}
samer@9	10 \usepackage{tools}
samer@18	11 \usepackage{tikz}
samer@18	12 \usetikzlibrary{calc}
samer@18	13 \usetikzlibrary{matrix}
samer@18	14 \usetikzlibrary{patterns}
samer@18	15 \usetikzlibrary{arrows}
samer@9	16
samer@9	17 \let\citep=\cite
samer@33	18 \newcommand{\colfig}[2][1]{\includegraphics[width=#1\linewidth]{figs/#2}}%
samer@18	19 \newcommand\preals{\reals_+}
samer@18	20 \newcommand\X{\mathcal{X}}
samer@18	21 \newcommand\Y{\mathcal{Y}}
samer@18	22 \newcommand\domS{\mathcal{S}}
samer@18	23 \newcommand\A{\mathcal{A}}
samer@25	24 \newcommand\Data{\mathcal{D}}
samer@18	25 \newcommand\rvm[1]{\mathrm{#1}}
samer@18	26 \newcommand\sps{\,.\,}
samer@18	27 \newcommand\Ipred{\mathcal{I}_{\mathrm{pred}}}
samer@18	28 \newcommand\Ix{\mathcal{I}}
samer@18	29 \newcommand\IXZ{\overline{\underline{\mathcal{I}}}}
samer@18	30 \newcommand\x{\vec{x}}
samer@18	31 \newcommand\Ham[1]{\mathcal{H}_{#1}}
samer@18	32 \newcommand\subsets[2]{[#1]^{(k)}}
samer@18	33 \def\bet(#1,#2){#1..#2}
samer@18	34
samer@18	35
samer@18	36 \def\ev(#1=#2){#1\!\!=\!#2}
samer@18	37 \newcommand\rv[1]{\Omega \to #1}
samer@18	38 \newcommand\ceq{\!\!=\!}
samer@18	39 \newcommand\cmin{\!-\!}
samer@18	40 \newcommand\modulo[2]{#1\!\!\!\!\!\mod#2}
samer@18	41
samer@18	42 \newcommand\sumitoN{\sum_{i=1}^N}
samer@18	43 \newcommand\sumktoK{\sum_{k=1}^K}
samer@18	44 \newcommand\sumjtoK{\sum_{j=1}^K}
samer@18	45 \newcommand\sumalpha{\sum_{\alpha\in\A}}
samer@18	46 \newcommand\prodktoK{\prod_{k=1}^K}
samer@18	47 \newcommand\prodjtoK{\prod_{j=1}^K}
samer@18	48
samer@18	49 \newcommand\past[1]{\overset{\rule{0pt}{0.2em}\smash{\leftarrow}}{#1}}
samer@18	50 \newcommand\fut[1]{\overset{\rule{0pt}{0.1em}\smash{\rightarrow}}{#1}}
samer@18	51 \newcommand\parity[2]{P^{#1}_{2,#2}}
samer@4	52
samer@4	53 %\usepackage[parfill]{parskip}
samer@4	54
samer@4	55 \begin{document}
samer@41	56 \title{Cognitive Music Modelling: an\\Information Dynamics Approach}
samer@4	57
samer@4	58 \author{
hekeus@16	59 \IEEEauthorblockN{Samer A. Abdallah, Henrik Ekeus, Peter Foster}
hekeus@16	60 \IEEEauthorblockN{Andrew Robertson and Mark D. Plumbley}
samer@4	61 \IEEEauthorblockA{Centre for Digital Music\\
samer@4	62 Queen Mary University of London\\
samer@41	63 Mile End Road, London E1 4NS}}
samer@4	64
samer@4	65 \maketitle
samer@18	66 \begin{abstract}
samer@18	67 People take in information when perceiving music. With it they continually
samer@18	68 build predictive models of what is going to happen. There is a relationship
samer@18	69 between information measures and how we perceive music. An information
samer@18	70 theoretic approach to music cognition is thus a fruitful avenue of research.
samer@18	71 In this paper, we review the theoretical foundations of information dynamics
samer@18	72 and discuss a few emerging areas of application.
hekeus@16	73 \end{abstract}
samer@4	74
samer@4	75
samer@25	76 \section{Introduction}
samer@9	77 \label{s:Intro}
samer@9	78
samer@25	79 \subsection{Expectation and surprise in music}
samer@18	80 One of the effects of listening to music is to create
samer@18	81 expectations of what is to come next, which may be fulfilled
samer@9	82 immediately, after some delay, or not at all as the case may be.
samer@9	83 This is the thesis put forward by, amongst others, music theorists
samer@18	84 L. B. Meyer \cite{Meyer67} and Narmour \citep{Narmour77}, but was
samer@18	85 recognised much earlier; for example,
samer@9	86 it was elegantly put by Hanslick \cite{Hanslick1854} in the
samer@9	87 nineteenth century:
samer@9	88 \begin{quote}
samer@9	89 `The most important factor in the mental process which accompanies the
samer@9	90 act of listening to music, and which converts it to a source of pleasure,
samer@18	91 is \ldots the intellectual satisfaction
samer@9	92 which the listener derives from continually following and anticipating
samer@9	93 the composer's intentions---now, to see his expectations fulfilled, and
samer@18	94 now, to find himself agreeably mistaken.
samer@18	95 %It is a matter of course that
samer@18	96 %this intellectual flux and reflux, this perpetual giving and receiving
samer@18	97 %takes place unconsciously, and with the rapidity of lightning-flashes.'
samer@9	98 \end{quote}
samer@9	99 An essential aspect of this is that music is experienced as a phenomenon
samer@9	100 that `unfolds' in time, rather than being apprehended as a static object
samer@9	101 presented in its entirety. Meyer argued that musical experience depends
samer@9	102 on how we change and revise our conceptions \emph{as events happen}, on
samer@9	103 how expectation and prediction interact with occurrence, and that, to a
samer@9	104 large degree, the way to understand the effect of music is to focus on
samer@9	105 this `kinetics' of expectation and surprise.
samer@9	106
samer@25	107 Prediction and expectation are essentially probabilistic concepts
samer@25	108 and can be treated mathematically using probability theory.
samer@25	109 We suppose that when we listen to music, expectations are created on the basis
samer@25	110 of our familiarity with various styles of music and our ability to
samer@25	111 detect and learn statistical regularities in the music as they emerge,
samer@25	112 There is experimental evidence that human listeners are able to internalise
samer@25	113 statistical knowledge about musical structure, \eg
samer@25	114 \citep{SaffranJohnsonAslin1999,EerolaToiviainenKrumhansl2002}, and also
samer@25	115 that statistical models can form an effective basis for computational
samer@25	116 analysis of music, \eg
samer@25	117 \cite{ConklinWitten95,PonsfordWigginsMellish1999,Pearce2005}.
samer@25	118
samer@25	119
samer@25	120 \comment{
samer@9	121 The business of making predictions and assessing surprise is essentially
samer@9	122 one of reasoning under conditions of uncertainty and manipulating
samer@9	123 degrees of belief about the various proposition which may or may not
samer@9	124 hold, and, as has been argued elsewhere \cite{Cox1946,Jaynes27}, best
samer@9	125 quantified in terms of Bayesian probability theory.
samer@9	126 Thus, we suppose that
samer@9	127 when we listen to music, expectations are created on the basis of our
samer@24	128 familiarity with various stylistic norms that apply to music in general,
samer@24	129 the particular style (or styles) of music that seem best to fit the piece
samer@24	130 we are listening to, and
samer@9	131 the emerging structures peculiar to the current piece. There is
samer@9	132 experimental evidence that human listeners are able to internalise
samer@9	133 statistical knowledge about musical structure, \eg
samer@9	134 \citep{SaffranJohnsonAslin1999,EerolaToiviainenKrumhansl2002}, and also
samer@9	135 that statistical models can form an effective basis for computational
samer@9	136 analysis of music, \eg
samer@9	137 \cite{ConklinWitten95,PonsfordWigginsMellish1999,Pearce2005}.
samer@25	138 }
samer@9	139
samer@9	140 \subsection{Music and information theory}
samer@24	141 With a probabilistic framework for music modelling and prediction in hand,
samer@25	142 we are in a position to apply Shannon's quantitative information theory
samer@25	143 \cite{Shannon48}.
samer@25	144 \comment{
samer@25	145 which provides us with a number of measures, such as entropy
samer@25	146 and mutual information, which are suitable for quantifying states of
samer@25	147 uncertainty and surprise, and thus could potentially enable us to build
samer@25	148 quantitative models of the listening process described above. They are
samer@25	149 what Berlyne \cite{Berlyne71} called `collative variables' since they are
samer@25	150 to do with patterns of occurrence rather than medium-specific details.
samer@25	151 Berlyne sought to show that the collative variables are closely related to
samer@25	152 perceptual qualities like complexity, tension, interestingness,
samer@25	153 and even aesthetic value, not just in music, but in other temporal
samer@25	154 or visual media.
samer@25	155 The relevance of information theory to music and art has
samer@25	156 also been addressed by researchers from the 1950s onwards
samer@25	157 \cite{Youngblood58,CoonsKraehenbuehl1958,Cohen1962,HillerBean66,Moles66,Meyer67}.
samer@25	158 }
samer@9	159 The relationship between information theory and music and art in general has been the
samer@9	160 subject of some interest since the 1950s
samer@9	161 \cite{Youngblood58,CoonsKraehenbuehl1958,HillerBean66,Moles66,Meyer67,Cohen1962}.
samer@9	162 The general thesis is that perceptible qualities and subjective
samer@9	163 states like uncertainty, surprise, complexity, tension, and interestingness
samer@9	164 are closely related to
samer@9	165 information-theoretic quantities like entropy, relative entropy,
samer@9	166 and mutual information.
samer@9	167 % and are major determinants of the overall experience.
samer@9	168 Berlyne \cite{Berlyne71} called such quantities `collative variables', since
samer@9	169 they are to do with patterns of occurrence rather than medium-specific details,
samer@9	170 and developed the ideas of `information aesthetics' in an experimental setting.
samer@9	171 % Berlyne's `new experimental aesthetics', the `information-aestheticians'.
samer@9	172
samer@9	173 % Listeners then experience greater or lesser levels of surprise
samer@9	174 % in response to departures from these norms.
samer@9	175 % By careful manipulation
samer@9	176 % of the material, the composer can thus define, and induce within the
samer@9	177 % listener, a temporal programme of varying
samer@9	178 % levels of uncertainty, ambiguity and surprise.
samer@9	179
samer@9	180
samer@9	181 \subsection{Information dynamic approach}
samer@9	182
samer@24	183 Bringing the various strands together, our working hypothesis is that as a
samer@24	184 listener (to which will refer as `it') listens to a piece of music, it maintains
samer@25	185 a dynamically evolving probabilistic model that enables it to make predictions
samer@24	186 about how the piece will continue, relying on both its previous experience
samer@24	187 of music and the immediate context of the piece. As events unfold, it revises
samer@25	188 its probabilistic belief state, which includes predictive
samer@25	189 distributions over possible future events. These
samer@25	190 % distributions and changes in distributions
samer@25	191 can be characterised in terms of a handful of information
samer@25	192 theoretic-measures such as entropy and relative entropy. By tracing the
samer@24	193 evolution of a these measures, we obtain a representation which captures much
samer@25	194 of the significant structure of the music.
samer@25	195
samer@25	196 One of the consequences of this approach is that regardless of the details of
samer@25	197 the sensory input or even which sensory modality is being processed, the resulting
samer@25	198 analysis is in terms of the same units: quantities of information (bits) and
samer@25	199 rates of information flow (bits per second). The probabilistic and information
samer@25	200 theoretic concepts in terms of which the analysis is framed are universal to all sorts
samer@25	201 of data.
samer@25	202 In addition, when adaptive probabilistic models are used, expectations are
samer@25	203 created mainly in response to to \emph{patterns} of occurence,
samer@25	204 rather the details of which specific things occur.
samer@25	205 Together, these suggest that an information dynamic analysis captures a
samer@25	206 high level of \emph{abstraction}, and could be used to
samer@25	207 make structural comparisons between different temporal media,
samer@25	208 such as music, film, animation, and dance.
samer@25	209 % analyse and compare information
samer@25	210 % flow in different temporal media regardless of whether they are auditory,
samer@25	211 % visual or otherwise.
samer@9	212
samer@25	213 Another consequence is that the information dynamic approach gives us a principled way
samer@24	214 to address the notion of \emph{subjectivity}, since the analysis is dependent on the
samer@24	215 probability model the observer starts off with, which may depend on prior experience
samer@24	216 or other factors, and which may change over time. Thus, inter-subject variablity and
samer@24	217 variation in subjects' responses over time are
samer@24	218 fundamental to the theory.
samer@9	219
samer@18	220 %modelling the creative process, which often alternates between generative
samer@18	221 %and selective or evaluative phases \cite{Boden1990}, and would have
samer@18	222 %applications in tools for computer aided composition.
samer@18	223
samer@18	224
samer@18	225 \section{Theoretical review}
samer@18	226
samer@34	227 \subsection{Entropy and information}
samer@41	228 \label{s:entro-info}
samer@41	229
samer@34	230 Let $X$ denote some variable whose value is initially unknown to our
samer@34	231 hypothetical observer. We will treat $X$ mathematically as a random variable,
samer@36	232 with a value to be drawn from some set $\X$ and a
samer@34	233 probability distribution representing the observer's beliefs about the
samer@34	234 true value of $X$.
samer@34	235 In this case, the observer's uncertainty about $X$ can be quantified
samer@34	236 as the entropy of the random variable $H(X)$. For a discrete variable
samer@36	237 with probability mass function $p:\X \to [0,1]$, this is
samer@34	238 \begin{equation}
samer@41	239 H(X) = \sum_{x\in\X} -p(x) \log p(x), % = \expect{-\log p(X)},
samer@34	240 \end{equation}
samer@41	241 % where $\expect{}$ is the expectation operator.
samer@41	242 The negative-log-probability
samer@34	243 $\ell(x) = -\log p(x)$ of a particular value $x$ can usefully be thought of as
samer@34	244 the \emph{surprisingness} of the value $x$ should it be observed, and
samer@41	245 hence the entropy is the expectation of the surprisingness $\expect \ell(X)$.
samer@34	246
samer@34	247 Now suppose that the observer receives some new data $\Data$ that
samer@34	248 causes a revision of its beliefs about $X$. The \emph{information}
samer@34	249 in this new data \emph{about} $X$ can be quantified as the
samer@34	250 Kullback-Leibler (KL) divergence between the prior and posterior
samer@34	251 distributions $p(x)$ and $p(x\|\Data)$ respectively:
samer@34	252 \begin{equation}
samer@34	253 \mathcal{I}_{\Data\to X} = D(p_{X\|\Data} \|\| p_{X})
samer@36	254 = \sum_{x\in\X} p(x\|\Data) \log \frac{p(x\|\Data)}{p(x)}.
samer@41	255 \label{eq:info}
samer@34	256 \end{equation}
samer@34	257 When there are multiple variables $X_1, X_2$
samer@34	258 \etc which the observer believes to be dependent, then the observation of
samer@34	259 one may change its beliefs and hence yield information about the
samer@34	260 others. The joint and conditional entropies as described in any
samer@34	261 textbook on information theory (\eg \cite{CoverThomas}) then quantify
samer@34	262 the observer's expected uncertainty about groups of variables given the
samer@34	263 values of others. In particular, the \emph{mutual information}
samer@34	264 $I(X_1;X_2)$ is both the expected information
samer@34	265 in an observation of $X_2$ about $X_1$ and the expected reduction
samer@34	266 in uncertainty about $X_1$ after observing $X_2$:
samer@34	267 \begin{equation}
samer@34	268 I(X_1;X_2) = H(X_1) - H(X_1\|X_2),
samer@34	269 \end{equation}
samer@34	270 where $H(X_1\|X_2) = H(X_1,X_2) - H(X_2)$ is the conditional entropy
samer@34	271 of $X_2$ given $X_1$. A little algebra shows that $I(X_1;X_2)=I(X_2;X_1)$
samer@34	272 and so the mutual information is symmetric in its arguments. A conditional
samer@34	273 form of the mutual information can be formulated analogously:
samer@34	274 \begin{equation}
samer@34	275 I(X_1;X_2\|X_3) = H(X_1\|X_3) - H(X_1\|X_2,X_3).
samer@34	276 \end{equation}
samer@34	277 These relationships between the various entropies and mutual
samer@34	278 informations are conveniently visualised in Venn diagram-like \emph{information diagrams}
samer@34	279 or I-diagrams \cite{Yeung1991} such as the one in \figrf{venn-example}.
samer@34	280
samer@18	281 \begin{fig}{venn-example}
samer@18	282 \newcommand\rad{2.2em}%
samer@18	283 \newcommand\circo{circle (3.4em)}%
samer@18	284 \newcommand\labrad{4.3em}
samer@18	285 \newcommand\bound{(-6em,-5em) rectangle (6em,6em)}
samer@18	286 \newcommand\colsep{\ }
samer@18	287 \newcommand\clipin[1]{\clip (#1) \circo;}%
samer@18	288 \newcommand\clipout[1]{\clip \bound (#1) \circo;}%
samer@18	289 \newcommand\cliptwo[3]{%
samer@18	290 \begin{scope}
samer@18	291 \clipin{#1};
samer@18	292 \clipin{#2};
samer@18	293 \clipout{#3};
samer@18	294 \fill[black!30] \bound;
samer@18	295 \end{scope}
samer@18	296 }%
samer@18	297 \newcommand\clipone[3]{%
samer@18	298 \begin{scope}
samer@18	299 \clipin{#1};
samer@18	300 \clipout{#2};
samer@18	301 \clipout{#3};
samer@18	302 \fill[black!15] \bound;
samer@18	303 \end{scope}
samer@18	304 }%
samer@18	305 \begin{tabular}{c@{\colsep}c}
samer@18	306 \begin{tikzpicture}[baseline=0pt]
samer@18	307 \coordinate (p1) at (90:\rad);
samer@18	308 \coordinate (p2) at (210:\rad);
samer@18	309 \coordinate (p3) at (-30:\rad);
samer@18	310 \clipone{p1}{p2}{p3};
samer@18	311 \clipone{p2}{p3}{p1};
samer@18	312 \clipone{p3}{p1}{p2};
samer@18	313 \cliptwo{p1}{p2}{p3};
samer@18	314 \cliptwo{p2}{p3}{p1};
samer@18	315 \cliptwo{p3}{p1}{p2};
samer@18	316 \begin{scope}
samer@18	317 \clip (p1) \circo;
samer@18	318 \clip (p2) \circo;
samer@18	319 \clip (p3) \circo;
samer@18	320 \fill[black!45] \bound;
samer@18	321 \end{scope}
samer@18	322 \draw (p1) \circo;
samer@18	323 \draw (p2) \circo;
samer@18	324 \draw (p3) \circo;
samer@18	325 \path
samer@18	326 (barycentric cs:p3=1,p1=-0.2,p2=-0.1) +(0ex,0) node {$I_{3\|12}$}
samer@18	327 (barycentric cs:p1=1,p2=-0.2,p3=-0.1) +(0ex,0) node {$I_{1\|23}$}
samer@18	328 (barycentric cs:p2=1,p3=-0.2,p1=-0.1) +(0ex,0) node {$I_{2\|13}$}
samer@18	329 (barycentric cs:p3=1,p2=1,p1=-0.55) +(0ex,0) node {$I_{23\|1}$}
samer@18	330 (barycentric cs:p1=1,p3=1,p2=-0.55) +(0ex,0) node {$I_{13\|2}$}
samer@18	331 (barycentric cs:p2=1,p1=1,p3=-0.55) +(0ex,0) node {$I_{12\|3}$}
samer@18	332 (barycentric cs:p3=1,p2=1,p1=1) node {$I_{123}$}
samer@18	333 ;
samer@18	334 \path
samer@18	335 (p1) +(140:\labrad) node {$X_1$}
samer@18	336 (p2) +(-140:\labrad) node {$X_2$}
samer@18	337 (p3) +(-40:\labrad) node {$X_3$};
samer@18	338 \end{tikzpicture}
samer@18	339 &
samer@18	340 \parbox{0.5\linewidth}{
samer@18	341 \small
samer@18	342 \begin{align*}
samer@18	343 I_{1\|23} &= H(X_1\|X_2,X_3) \\
samer@18	344 I_{13\|2} &= I(X_1;X_3\|X_2) \\
samer@18	345 I_{1\|23} + I_{13\|2} &= H(X_1\|X_2) \\
samer@18	346 I_{12\|3} + I_{123} &= I(X_1;X_2)
samer@18	347 \end{align*}
samer@18	348 }
samer@18	349 \end{tabular}
samer@18	350 \caption{
samer@30	351 I-diagram visualisation of entropies and mutual informations
samer@18	352 for three random variables $X_1$, $X_2$ and $X_3$. The areas of
samer@18	353 the three circles represent $H(X_1)$, $H(X_2)$ and $H(X_3)$ respectively.
samer@18	354 The total shaded area is the joint entropy $H(X_1,X_2,X_3)$.
samer@18	355 The central area $I_{123}$ is the co-information \cite{McGill1954}.
samer@18	356 Some other information measures are indicated in the legend.
samer@18	357 }
samer@18	358 \end{fig}
samer@30	359
samer@30	360
samer@36	361 \subsection{Surprise and information in sequences}
samer@36	362 \label{s:surprise-info-seq}
samer@30	363
samer@36	364 Suppose that $(\ldots,X_{-1},X_0,X_1,\ldots)$ is a sequence of
samer@30	365 random variables, infinite in both directions,
samer@36	366 and that $\mu$ is the associated probability measure over all
samer@36	367 realisations of the sequence---in the following, $\mu$ will simply serve
samer@30	368 as a label for the process. We can indentify a number of information-theoretic
samer@30	369 measures meaningful in the context of a sequential observation of the sequence, during
samer@36	370 which, at any time $t$, the sequence of variables can be divided into a `present' $X_t$, a `past'
samer@30	371 $\past{X}_t \equiv (\ldots, X_{t-2}, X_{t-1})$, and a `future'
samer@30	372 $\fut{X}_t \equiv (X_{t+1},X_{t+2},\ldots)$.
samer@41	373 We will write the actually observed value of $X_t$ as $x_t$, and
samer@36	374 the sequence of observations up to but not including $x_t$ as
samer@36	375 $\past{x}_t$.
samer@36	376 % Since the sequence is assumed stationary, we can without loss of generality,
samer@36	377 % assume that $t=0$ in the following definitions.
samer@36	378
samer@41	379 The in-context surprisingness of the observation $X_t=x_t$ depends on
samer@41	380 both $x_t$ and the context $\past{x}_t$:
samer@36	381 \begin{equation}
samer@41	382 \ell_t = - \log p(x_t\|\past{x}_t).
samer@36	383 \end{equation}
samer@36	384 However, before $X_t$ is observed to be $x_t$, the observer can compute
samer@36	385 its \emph{expected} surprisingness as a measure of its uncertainty about
samer@36	386 the very next event; this may be written as an entropy
samer@36	387 $H(X_t\|\ev(\past{X}_t = \past{x}_t))$, but note that this is
samer@36	388 conditional on the \emph{event} $\ev(\past{X}_t=\past{x}_t)$, not
samer@36	389 \emph{variables} $\past{X}_t$ as in the conventional conditional entropy.
samer@36	390
samer@41	391 The surprisingness $\ell_t$ and expected surprisingness
samer@36	392 $H(X_t\|\ev(\past{X}_t=\past{x}_t))$
samer@41	393 can be understood as \emph{subjective} information dynamic measures, since they are
samer@41	394 based on the observer's probability model in the context of the actually observed sequence
samer@36	395 $\past{x}_t$---they characterise what it is like to `be in the observer's shoes'.
samer@36	396 If we view the observer as a purely passive or reactive agent, this would
samer@36	397 probably be sufficient, but for active agents such as humans or animals, it is
samer@36	398 often necessary to \emph{aniticipate} future events in order, for example, to plan the
samer@36	399 most effective course of action. It makes sense for such observers to be
samer@36	400 concerned about the predictive probability distribution over future events,
samer@36	401 $p(\fut{x}_t\|\past{x}_t)$. When an observation $\ev(X_t=x_t)$ is made in this context,
samer@41	402 the \emph{instantaneous predictive information} (IPI) $\mathcal{I}_t$ at time $t$
samer@41	403 is the information in the event $\ev(X_t=x_t)$ about the entire future of the sequence $\fut{X}_t$,
samer@41	404 \emph{given} the observed past $\past{X}_t=\past{x}_t$.
samer@41	405 Referring to the definition of information \eqrf{info}, this is the KL divergence
samer@41	406 between prior and posterior distributions over possible futures, which written out in full, is
samer@41	407 \begin{equation}
samer@41	408 \mathcal{I}_t = \sum_{\fut{x}_t \in \X^*}
samer@41	409 p(\fut{x}_t\|x_t,\past{x}_t) \log \frac{ p(\fut{x}_t\|x_t,\past{x}_t) }{ p(\fut{x}_t\|\past{x}_t) },
samer@41	410 \end{equation}
samer@41	411 where the sum is to be taken over the set of infinite sequences $\X^*$.
samer@41	412 As with the surprisingness, the observer can compute its \emph{expected} IPI
samer@41	413 at time $t$, which reduces to a mutual information $I(X_t;\fut{X}_t\|\ev(\past{X}_t=\past{x}_t))$
samer@41	414 conditioned on the observed past. This could be used, for example, as an estimate
samer@41	415 of attentional resources which should be directed at this stream of data, which may
samer@41	416 be in competition with other sensory streams.
samer@36	417
samer@36	418 \subsection{Information measures for stationary random processes}
samer@43	419 \label{s:process-info}
samer@30	420
samer@18	421
samer@18	422 \begin{fig}{predinfo-bg}
samer@18	423 \newcommand\subfig[2]{\shortstack{#2\\[0.75em]#1}}
samer@18	424 \newcommand\rad{1.8em}%
samer@18	425 \newcommand\ovoid[1]{%
samer@18	426 ++(-#1,\rad)
samer@18	427 -- ++(2 * #1,0em) arc (90:-90:\rad)
samer@18	428 -- ++(-2 * #1,0em) arc (270:90:\rad)
samer@18	429 }%
samer@18	430 \newcommand\axis{2.75em}%
samer@18	431 \newcommand\olap{0.85em}%
samer@18	432 \newcommand\offs{3.6em}
samer@18	433 \newcommand\colsep{\hspace{5em}}
samer@18	434 \newcommand\longblob{\ovoid{\axis}}
samer@18	435 \newcommand\shortblob{\ovoid{1.75em}}
samer@18	436 \begin{tabular}{c@{\colsep}c}
samer@43	437 \subfig{(a) multi-information and entropy rates}{%
samer@43	438 \begin{tikzpicture}%[baseline=-1em]
samer@43	439 \newcommand\rc{1.75em}
samer@43	440 \newcommand\throw{2.5em}
samer@43	441 \coordinate (p1) at (180:1.5em);
samer@43	442 \coordinate (p2) at (0:0.3em);
samer@43	443 \newcommand\bound{(-7em,-2.6em) rectangle (7em,3.0em)}
samer@43	444 \newcommand\present{(p2) circle (\rc)}
samer@43	445 \newcommand\thepast{(p1) ++(-\throw,0) \ovoid{\throw}}
samer@43	446 \newcommand\fillclipped[2]{%
samer@43	447 \begin{scope}[even odd rule]
samer@43	448 \foreach \thing in {#2} {\clip \thing;}
samer@43	449 \fill[black!#1] \bound;
samer@43	450 \end{scope}%
samer@43	451 }%
samer@43	452 \fillclipped{30}{\present,\bound \thepast}
samer@43	453 \fillclipped{15}{\present,\bound \thepast}
samer@43	454 \fillclipped{45}{\present,\thepast}
samer@43	455 \draw \thepast;
samer@43	456 \draw \present;
samer@43	457 \node at (barycentric cs:p2=1,p1=-0.3) {$h_\mu$};
samer@43	458 \node at (barycentric cs:p2=1,p1=1) [shape=rectangle,fill=black!45,inner sep=1pt]{$\rho_\mu$};
samer@43	459 \path (p2) +(90:3em) node {$X_0$};
samer@43	460 \path (p1) +(-3em,0em) node {\shortstack{infinite\\past}};
samer@43	461 \path (p1) +(-4em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
samer@43	462 \end{tikzpicture}}%
samer@43	463 \\[1.25em]
samer@43	464 \subfig{(b) excess entropy}{%
samer@18	465 \newcommand\blob{\longblob}
samer@18	466 \begin{tikzpicture}
samer@18	467 \coordinate (p1) at (-\offs,0em);
samer@18	468 \coordinate (p2) at (\offs,0em);
samer@18	469 \begin{scope}
samer@18	470 \clip (p1) \blob;
samer@18	471 \clip (p2) \blob;
samer@18	472 \fill[lightgray] (-1,-1) rectangle (1,1);
samer@18	473 \end{scope}
samer@18	474 \draw (p1) +(-0.5em,0em) node{\shortstack{infinite\\past}} \blob;
samer@18	475 \draw (p2) +(0.5em,0em) node{\shortstack{infinite\\future}} \blob;
samer@18	476 \path (0,0) node (future) {$E$};
samer@18	477 \path (p1) +(-2em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
samer@18	478 \path (p2) +(2em,\rad) node [anchor=south] {$X_0,\ldots$};
samer@18	479 \end{tikzpicture}%
samer@18	480 }%
samer@18	481 \\[1.25em]
samer@43	482 \subfig{(c) predictive information rate $b_\mu$}{%
samer@18	483 \begin{tikzpicture}%[baseline=-1em]
samer@18	484 \newcommand\rc{2.1em}
samer@18	485 \newcommand\throw{2.5em}
samer@18	486 \coordinate (p1) at (210:1.5em);
samer@18	487 \coordinate (p2) at (90:0.7em);
samer@18	488 \coordinate (p3) at (-30:1.5em);
samer@18	489 \newcommand\bound{(-7em,-2.6em) rectangle (7em,3.0em)}
samer@18	490 \newcommand\present{(p2) circle (\rc)}
samer@18	491 \newcommand\thepast{(p1) ++(-\throw,0) \ovoid{\throw}}
samer@18	492 \newcommand\future{(p3) ++(\throw,0) \ovoid{\throw}}
samer@18	493 \newcommand\fillclipped[2]{%
samer@18	494 \begin{scope}[even odd rule]
samer@18	495 \foreach \thing in {#2} {\clip \thing;}
samer@18	496 \fill[black!#1] \bound;
samer@18	497 \end{scope}%
samer@18	498 }%
samer@43	499 \fillclipped{80}{\future,\thepast}
samer@18	500 \fillclipped{30}{\present,\future,\bound \thepast}
samer@18	501 \fillclipped{15}{\present,\bound \future,\bound \thepast}
samer@18	502 \draw \future;
samer@18	503 \fillclipped{45}{\present,\thepast}
samer@18	504 \draw \thepast;
samer@18	505 \draw \present;
samer@18	506 \node at (barycentric cs:p2=1,p1=-0.17,p3=-0.17) {$r_\mu$};
samer@18	507 \node at (barycentric cs:p1=-0.4,p2=1.0,p3=1) {$b_\mu$};
samer@18	508 \node at (barycentric cs:p3=0,p2=1,p1=1.2) [shape=rectangle,fill=black!45,inner sep=1pt]{$\rho_\mu$};
samer@18	509 \path (p2) +(140:3em) node {$X_0$};
samer@18	510 % \node at (barycentric cs:p3=0,p2=1,p1=1) {$\rho_\mu$};
samer@18	511 \path (p3) +(3em,0em) node {\shortstack{infinite\\future}};
samer@18	512 \path (p1) +(-3em,0em) node {\shortstack{infinite\\past}};
samer@18	513 \path (p1) +(-4em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
samer@18	514 \path (p3) +(4em,\rad) node [anchor=south] {$X_1,\ldots$};
samer@18	515 \end{tikzpicture}}%
samer@18	516 \\[0.5em]
samer@18	517 \end{tabular}
samer@18	518 \caption{
samer@30	519 I-diagrams for several information measures in
samer@18	520 stationary random processes. Each circle or oval represents a random
samer@18	521 variable or sequence of random variables relative to time $t=0$. Overlapped areas
samer@18	522 correspond to various mutual information as in \Figrf{venn-example}.
samer@33	523 In (b), the circle represents the `present'. Its total area is
samer@33	524 $H(X_0)=\rho_\mu+r_\mu+b_\mu$, where $\rho_\mu$ is the multi-information
samer@18	525 rate, $r_\mu$ is the residual entropy rate, and $b_\mu$ is the predictive
samer@43	526 information rate. The entropy rate is $h_\mu = r_\mu+b_\mu$. The small dark
samer@43	527 region below $X_0$ in (c) is $\sigma_\mu = E-\rho_\mu$.
samer@18	528 }
samer@18	529 \end{fig}
samer@18	530
samer@41	531 If we step back, out of the observer's shoes as it were, and consider the
samer@41	532 random process $(\ldots,X_{-1},X_0,X_1,\dots)$ as a statistical ensemble of
samer@41	533 possible realisations, and furthermore assume that it is stationary,
samer@41	534 then it becomes possible to define a number of information-theoretic measures,
samer@41	535 closely related to those described above, but which characterise the
samer@41	536 process as a whole, rather than on a moment-by-moment basis. Some of these,
samer@41	537 such as the entropy rate, are well-known, but others are only recently being
samer@41	538 investigated. (In the following, the assumption of stationarity means that
samer@41	539 the measures defined below are independent of $t$.)
samer@41	540
samer@41	541 The \emph{entropy rate} of the process is the entropy of the next variable
samer@41	542 $X_t$ given all the previous ones.
samer@41	543 \begin{equation}
samer@41	544 \label{eq:entro-rate}
samer@41	545 h_\mu = H(X_t\|\past{X}_t).
samer@41	546 \end{equation}
samer@41	547 The entropy rate gives a measure of the overall randomness
samer@41	548 or unpredictability of the process.
samer@41	549
samer@41	550 The \emph{multi-information rate} $\rho_\mu$ (following Dubnov's \cite{Dubnov2006}
samer@41	551 notation for what he called the `information rate') is the mutual
samer@41	552 information between the `past' and the `present':
samer@41	553 \begin{equation}
samer@41	554 \label{eq:multi-info}
samer@41	555 \rho_\mu = I(\past{X}_t;X_t) = H(X_t) - h_\mu.
samer@41	556 \end{equation}
samer@41	557 It is a measure of how much the context of an observation (that is,
samer@41	558 the observation of previous elements of the sequence) helps in predicting
samer@41	559 or reducing the suprisingness of the current observation.
samer@41	560
samer@41	561 The \emph{excess entropy} \cite{CrutchfieldPackard1983}
samer@41	562 is the mutual information between
samer@41	563 the entire `past' and the entire `future':
samer@41	564 \begin{equation}
samer@41	565 E = I(\past{X}_t; X_t,\fut{X}_t).
samer@41	566 \end{equation}
samer@43	567 Both the excess entropy and the multi-information rate can be thought
samer@43	568 of as measures of \emph{redundancy}, quantifying the extent to which
samer@43	569 the same information is to be found in all parts of the sequence.
samer@41	570
samer@41	571
samer@30	572 The \emph{predictive information rate} (or PIR) \cite{AbdallahPlumbley2009}
samer@30	573 is the average information in one observation about the infinite future given the infinite past,
samer@30	574 and is defined as a conditional mutual information:
samer@18	575 \begin{equation}
samer@18	576 \label{eq:PIR}
samer@41	577 b_\mu = I(X_t;\fut{X}_t\|\past{X}_t) = H(\fut{X}_t\|\past{X}_t) - H(\fut{X}_t\|X_t,\past{X}_t).
samer@18	578 \end{equation}
samer@18	579 Equation \eqrf{PIR} can be read as the average reduction
samer@18	580 in uncertainty about the future on learning $X_t$, given the past.
samer@18	581 Due to the symmetry of the mutual information, it can also be written
samer@18	582 as
samer@18	583 \begin{equation}
samer@18	584 % \IXZ_t
samer@43	585 b_\mu = H(X_t\|\past{X}_t) - H(X_t\|\past{X}_t,\fut{X}_t) = h_\mu - r_\mu,
samer@18	586 % \label{<++>}
samer@18	587 \end{equation}
samer@18	588 % If $X$ is stationary, then
samer@41	589 where $r_\mu = H(X_t\|\fut{X}_t,\past{X}_t)$,
samer@34	590 is the \emph{residual} \cite{AbdallahPlumbley2010},
samer@34	591 or \emph{erasure} \cite{VerduWeissman2006} entropy rate.
samer@18	592 These relationships are illustrated in \Figrf{predinfo-bg}, along with
samer@18	593 several of the information measures we have discussed so far.
samer@18	594
samer@18	595
samer@25	596 James et al \cite{JamesEllisonCrutchfield2011} study the predictive information
samer@25	597 rate and also examine some related measures. In particular they identify the
samer@25	598 $\sigma_\mu$, the difference between the multi-information rate and the excess
samer@25	599 entropy, as an interesting quantity that measures the predictive benefit of
samer@25	600 model-building (that is, maintaining an internal state summarising past
samer@43	601 observations in order to make better predictions).
samer@43	602 % They also identify
samer@43	603 % $w_\mu = \rho_\mu + b_{\mu}$, which they call the \emph{local exogenous
samer@43	604 % information} rate.
samer@34	605
samer@4	606
samer@36	607 \subsection{First and higher order Markov chains}
samer@36	608 First order Markov chains are the simplest non-trivial models to which information
samer@36	609 dynamics methods can be applied. In \cite{AbdallahPlumbley2009} we derived
samer@41	610 expressions for all the information measures described in \secrf{surprise-info-seq} for
samer@36	611 irreducible stationary Markov chains (\ie that have a unique stationary
samer@36	612 distribution). The derivation is greatly simplified by the dependency structure
samer@36	613 of the Markov chain: for the purpose of the analysis, the `past' and `future'
samer@41	614 segments $\past{X}_t$ and $\fut{X}_t$ can be collapsed to just the previous
samer@36	615 and next variables $X_{t-1}$ and $X_{t-1}$ respectively. We also showed that
samer@36	616 the predictive information rate can be expressed simply in terms of entropy rates:
samer@36	617 if we let $a$ denote the $K\times K$ transition matrix of a Markov chain over
samer@36	618 an alphabet of $\{1,\ldots,K\}$, such that
samer@36	619 $a_{ij} = \Pr(\ev(X_t=i\|X_{t-1}=j))$, and let $h:\reals^{K\times K}\to \reals$ be
samer@36	620 the entropy rate function such that $h(a)$ is the entropy rate of a Markov chain
samer@36	621 with transition matrix $a$, then the predictive information rate $b(a)$ is
samer@36	622 \begin{equation}
samer@36	623 b(a) = h(a^2) - h(a),
samer@36	624 \end{equation}
samer@36	625 where $a^2$, the transition matrix squared, is the transition matrix
samer@36	626 of the `skip one' Markov chain obtained by jumping two steps at a time
samer@36	627 along the original chain.
samer@36	628
samer@36	629 Second and higher order Markov chains can be treated in a similar way by transforming
samer@36	630 to a first order representation of the high order Markov chain. If we are dealing
samer@36	631 with an $N$th order model, this is done forming a new alphabet of size $K^N$
samer@41	632 consisting of all possible $N$-tuples of symbols from the base alphabet.
samer@41	633 An observation $\hat{x}_t$ in this new model encodes a block of $N$ observations
samer@36	634 $(x_{t+1},\ldots,x_{t+N})$ from the base model. The next
samer@41	635 observation $\hat{x}_{t+1}$ encodes the block of $N$ obtained by shifting the previous
samer@36	636 block along by one step. The new Markov of chain is parameterised by a sparse $K^N\times K^N$
samer@41	637 transition matrix $\hat{a}$. Adopting the label $\mu$ for the order $N$ system,
samer@41	638 we obtain:
samer@36	639 \begin{equation}
samer@41	640 h_\mu = h(\hat{a}), \qquad b_\mu = h({\hat{a}^{N+1}}) - N h({\hat{a}}),
samer@36	641 \end{equation}
samer@36	642 where $\hat{a}^{N+1}$ is the $(N+1)$th power of the first order transition matrix.
samer@41	643 Other information measures can also be computed for the high-order Markov chain, including
samer@41	644 the multi-information rate $\rho_\mu$ and the excess entropy $E$. These are identical
samer@41	645 for first order Markov chains, but for order $N$ chains, $E$ can be up to $N$ times larger
samer@41	646 than $\rho_\mu$.
samer@43	647
samer@43	648 [Something about what kinds of Markov chain maximise $h_\mu$ (uncorrelated `white'
samer@43	649 sequences, no temporal structure), $\rho_\mu$ and $E$ (periodic) and $b_\mu$. We return
samer@43	650 this in \secrf{composition}.]
samer@36	651
samer@36	652
hekeus@16	653 \section{Information Dynamics in Analysis}
samer@4	654
samer@24	655 \begin{fig}{twopages}
samer@33	656 \colfig[0.96]{matbase/fig9471} % update from mbc paper
samer@33	657 % \colfig[0.97]{matbase/fig72663}\\ % later update from mbc paper (Keith's new picks)
samer@24	658 \vspace*{1em}
samer@24	659 \colfig[0.97]{matbase/fig13377} % rule based analysis
samer@24	660 \caption{Analysis of \emph{Two Pages}.
samer@24	661 The thick vertical lines are the part boundaries as indicated in
samer@24	662 the score by the composer.
samer@24	663 The thin grey lines
samer@24	664 indicate changes in the melodic `figures' of which the piece is
samer@24	665 constructed. In the `model information rate' panel, the black asterisks
samer@24	666 mark the
samer@24	667 six most surprising moments selected by Keith Potter.
samer@24	668 The bottom panel shows a rule-based boundary strength analysis computed
samer@24	669 using Cambouropoulos' LBDM.
samer@24	670 All information measures are in nats and time is in notes.
samer@24	671 }
samer@24	672 \end{fig}
samer@24	673
samer@36	674 \subsection{Musicological Analysis}
samer@36	675 In \cite{AbdallahPlumbley2009}, methods based on the theory described above
samer@36	676 were used to analysis two pieces of music in the minimalist style
samer@36	677 by Philip Glass: \emph{Two Pages} (1969) and \emph{Gradus} (1968).
samer@36	678 The analysis was done using a first-order Markov chain model, with the
samer@36	679 enhancement that the transition matrix of the model was allowed to
samer@36	680 evolve dynamically as the notes were processed, and was tracked (in
samer@36	681 a Bayesian way) as a \emph{distribution} over possible transition matrices,
samer@36	682 rather than a point estimate. The results are summarised in \figrf{twopages}:
samer@36	683 the upper four plots show the dynamically evolving subjective information
samer@36	684 measures as described in \secrf{surprise-info-seq} computed using a point
samer@36	685 estimate of the current transition matrix, but the fifth plot (the `model information rate')
samer@36	686 measures the information in each observation about the transition matrix.
samer@36	687 In \cite{AbdallahPlumbley2010b}, we showed that this `model information rate'
samer@36	688 is actually a component of the true IPI in
samer@36	689 a time-varying Markov chain, which was neglected when we computed the IPI from
samer@36	690 point estimates of the transition matrix as if the transition probabilities
samer@36	691 were constant.
samer@36	692
samer@36	693 The peaks of the surprisingness and both components of the predictive information
samer@36	694 show good correspondence with structure of the piece both as marked in the score
samer@36	695 and as analysed by musicologist Keith Potter, who was asked to mark the six
samer@36	696 `most surprising moments' of the piece (shown as asterisks in the fifth plot)%
samer@36	697 \footnote{%
samer@36	698 Note that the boundary marked in the score at around note 5,400 is known to be
samer@36	699 anomalous; on the basis of a listening analysis, some musicologists [ref] have
samer@36	700 placed the boundary a few bars later, in agreement with our analysis.}.
samer@36	701
samer@36	702 In contrast, the analyses shown in the lower two plots of \figrf{twopages},
samer@36	703 obtained using two rule-based music segmentation algorithms, while clearly
samer@37	704 \emph{reflecting} the structure of the piece, do not \emph{segment} the piece,
samer@37	705 with no tendency to peaking of the boundary strength function at
samer@36	706 the boundaries in the piece.
samer@36	707
samer@36	708
samer@24	709 \begin{fig}{metre}
samer@33	710 % \scalebox{1}[1]{%
samer@24	711 \begin{tabular}{cc}
samer@33	712 \colfig[0.45]{matbase/fig36859} & \colfig[0.48]{matbase/fig88658} \\
samer@33	713 \colfig[0.45]{matbase/fig48061} & \colfig[0.48]{matbase/fig46367} \\
samer@33	714 \colfig[0.45]{matbase/fig99042} & \colfig[0.47]{matbase/fig87490}
samer@24	715 % \colfig[0.46]{matbase/fig56807} & \colfig[0.48]{matbase/fig27144} \\
samer@24	716 % \colfig[0.46]{matbase/fig87574} & \colfig[0.48]{matbase/fig13651} \\
samer@24	717 % \colfig[0.44]{matbase/fig19913} & \colfig[0.46]{matbase/fig66144} \\
samer@24	718 % \colfig[0.48]{matbase/fig73098} & \colfig[0.48]{matbase/fig57141} \\
samer@24	719 % \colfig[0.48]{matbase/fig25703} & \colfig[0.48]{matbase/fig72080} \\
samer@24	720 % \colfig[0.48]{matbase/fig9142} & \colfig[0.48]{matbase/fig27751}
samer@24	721
samer@24	722 \end{tabular}%
samer@33	723 % }
samer@24	724 \caption{Metrical analysis by computing average surprisingness and
samer@24	725 informative of notes at different periodicities (\ie hypothetical
samer@24	726 bar lengths) and phases (\ie positions within a bar).
samer@24	727 }
samer@24	728 \end{fig}
samer@24	729
peterf@39	730 \subsection{Content analysis/Sound Categorisation}.
samer@42	731 Using analogous definitions of differential entropy, the methods outlined
samer@42	732 in the previous section are equally applicable to continuous random variables.
samer@42	733 In the case of music, where expressive properties such as dynamics, tempo,
samer@42	734 timing and timbre are readily quantified on a continuous scale, the information
samer@42	735 dynamic framework thus may also be considered.
peterf@39	736
samer@42	737 In \cite{Dubnov2006}, Dubnov considers the class of stationary Gaussian
samer@42	738 processes. For such processes, the entropy rate may be obtained analytically
samer@42	739 from the power spectral density of the signal, allowing the multi-information
samer@42	740 rate to be subsequently obtained. Local stationarity is assumed, which may
samer@42	741 be achieved by windowing or change point detection \cite{Dubnov2008}. %TODO
samer@42	742 mention non-gaussian processes extension Similarly, the predictive information
samer@42	743 rate may be computed using a Gaussian linear formulation CITE. In this view,
samer@42	744 the PIR is a function of the correlation between random innovations supplied
samer@42	745 to the stochastic process. %Dubnov, MacAdams, Reynolds (2006) %Bailes and
samer@42	746 Dean (2009)
peterf@39	747
peterf@26	748 \begin{itemize}
peterf@39	749 \item Continuous domain information
peterf@39	750 \item Audio based music expectation modelling
peterf@39	751 \item Proposed model for Gaussian processes
peterf@26	752 \end{itemize}
peterf@26	753
samer@4	754
samer@4	755 \subsection{Beat Tracking}
samer@4	756
samer@43	757 A probabilistic method for drum tracking was presented by Robertson
samer@43	758 \cite{Robertson11c}. The algorithm is used to synchronise a music
samer@43	759 sequencer to a live drummer. The expected beat time of the sequencer is
samer@43	760 represented by a click track, and the algorithm takes as input event
samer@43	761 times for discrete kick and snare drum events relative to this click
samer@43	762 track. These are obtained using dedicated microphones for each drum and
samer@43	763 using a percussive onset detector (Puckette 1998). The drum tracker
samer@43	764 continually updates distributions for tempo and phase on receiving a new
samer@43	765 event time. We can thus quantify the information contributed of an event
samer@43	766 by measuring the difference between the system's prior distribution and
samer@43	767 the posterior distribution using the Kullback-Leiber divergence.
samer@43	768
samer@43	769 Here, we have calculated the KL divergence and entropy for kick and
samer@43	770 snare events in sixteen files. The analysis of information rates can be
samer@43	771 considered \emph{subjective}, in that it measures how the drum tracker's
samer@43	772 probability distributions change, and these are contingent upon the
samer@43	773 model used as well as external properties in the signal. We expect,
samer@43	774 however, that following periods of increased uncertainty, such as fills
samer@43	775 or expressive timing, the information contained in an individual event
samer@43	776 increases. We also examine whether the information is dependent upon
samer@43	777 metrical position.
samer@43	778
samer@4	779
samer@24	780 \section{Information dynamics as compositional aid}
samer@43	781 \label{s:composition}
samer@43	782
samer@43	783 \begin{fig}{wundt}
samer@43	784 \raisebox{-4em}{\colfig[0.43]{wundt}}
samer@43	785 % {\ \shortstack{{\Large$\longrightarrow$}\\ {\scriptsize\emph{exposure}}}\ }
samer@43	786 {\ {\large$\longrightarrow$}\ }
samer@43	787 \raisebox{-4em}{\colfig[0.43]{wundt2}}
samer@43	788 \caption{
samer@43	789 The Wundt curve relating randomness/complexity with
samer@43	790 perceived value. Repeated exposure sometimes results
samer@43	791 in a move to the left along the curve \cite{Berlyne71}.
samer@43	792 }
samer@43	793 \end{fig}
hekeus@13	794
samer@42	795 In addition to applying information dynamics to analysis, it is also possible
samer@42	796 to apply it to the generation of content, such as to the composition of musical
samer@42	797 materials. The outputs of algorithmic or stochastic processes can be filtered
samer@42	798 to match a set of criteria defined in terms of the information dynamics model,
samer@42	799 this criteria thus becoming a means of interfacing with the generative process.
samer@42	800 For instance a stochastic music generating process could be controlled by modifying
samer@42	801 constraints on its output in terms of predictive information rate or entropy
samer@42	802 rate.
hekeus@13	803
samer@42	804 The use of stochastic processes for the composition of musical material has been
samer@42	805 widespread for decades---for instance Iannis Xenakis applied probabilistic
samer@42	806 mathematical models to the creation of musical materials\cite{Xenakis:1992ul}.
samer@42	807 Information dynamics can serve as a novel framework for the exploration of the
samer@42	808 possibilities of such processes at the high and abstract level of expectation,
samer@42	809 randomness and predictability.
hekeus@13	810
samer@23	811 \subsection{The Melody Triangle}
samer@23	812
samer@34	813 \begin{figure}
samer@34	814 \centering
samer@34	815 \includegraphics[width=\linewidth]{figs/mtriscat}
samer@34	816 \caption{The population of transition matrices distributed along three axes of
samer@34	817 redundancy, entropy rate and predictive information rate (all measured in bits).
samer@34	818 The concentrations of points along the redundancy axis correspond
samer@34	819 to Markov chains which are roughly periodic with periods of 2 (redundancy 1 bit),
samer@34	820 3, 4, \etc all the way to period 8 (redundancy 3 bits). The colour of each point
samer@34	821 represents its PIR---note that the highest values are found at intermediate entropy
samer@34	822 and redundancy, and that the distribution as a whole makes a curved triangle. Although
samer@34	823 not visible in this plot, it is largely hollow in the middle.
samer@34	824 \label{InfoDynEngine}}
samer@34	825 \end{figure}
samer@23	826
samer@42	827 The Melody Triangle is an exploratory interface for the discovery of melodic
samer@42	828 content, where the input---positions within a triangle---directly map to information
samer@42	829 theoretic measures of the output. The measures---entropy rate, redundancy and
samer@42	830 predictive information rate---form a criteria with which to filter the output
samer@42	831 of the stochastic processes used to generate sequences of notes. These measures
samer@42	832 address notions of expectation and surprise in music, and as such the Melody
samer@42	833 Triangle is a means of interfacing with a generative process in terms of the
samer@42	834 predictability of its output.
samer@41	835
samer@42	836 The triangle is `populated' with possible parameter values for melody generators.
samer@43	837 These are plotted in a 3D information space of $\rho_\mu$ (redundancy), $h_\mu$ (entropy rate) and
samer@43	838 $b_\mu$ (predictive information rate), as defined in \secrf{process-info}.
samer@43	839 In our case we generated thousands of transition matrices, representing first-order
samer@42	840 Markov chains, by a random sampling method. In figure \ref{InfoDynEngine} we
samer@43	841 see a representation of how these matrices are distributed in the 3d statistical
samer@42	842 space; each one of these points corresponds to a transition matrix.
samer@41	843
samer@43	844 The distribution of transition matrices plotted in this space forms an arch shape
samer@42	845 that is fairly thin. It thus becomes a reasonable approximation to pretend that
samer@42	846 it is just a sheet in two dimensions; and so we stretch out this curved arc into
samer@42	847 a flat triangle. It is this triangular sheet that is our `Melody Triangle' and
samer@42	848 forms the interface by which the system is controlled. Using this interface
samer@42	849 thus involves a mapping to statistical space; a user selects a position within
samer@42	850 the triangle, and a corresponding transition matrix is returned. Figure
samer@42	851 \ref{TheTriangle} shows how the triangle maps to different measures of redundancy,
samer@42	852 entropy rate and predictive information rate.
samer@41	853
samer@41	854
samer@42	855 Each corner corresponds to three different extremes of predictability and
samer@42	856 unpredictability, which could be loosely characterised as `periodicity', `noise'
samer@42	857 and `repetition'. Melodies from the `noise' corner have no discernible pattern;
samer@42	858 they have high entropy rate, low predictive information rate and low redundancy.
samer@42	859 These melodies are essentially totally random. A melody along the `periodicity'
samer@42	860 to `repetition' edge are all deterministic loops that get shorter as we approach
samer@42	861 the `repetition' corner, until it becomes just one repeating note. It is the
samer@42	862 areas in between the extremes that provide the more `interesting' melodies.
samer@42	863 These melodies have some level of unpredictability, but are not completely random.
samer@42	864 Or, conversely, are predictable, but not entirely so.
samer@41	865
samer@41	866 \begin{figure}
samer@41	867 \centering
samer@41	868 \includegraphics[width=0.9\linewidth]{figs/TheTriangle.pdf}
samer@41	869 \caption{The Melody Triangle\label{TheTriangle}}
samer@41	870 \end{figure}
samer@41	871
samer@41	872
samer@42	873 The Melody Triangle exists in two incarnations; a standard screen based interface
samer@42	874 where a user moves tokens in and around a triangle on screen, and a multi-user
samer@42	875 interactive installation where a Kinect camera tracks individuals in a space and
samer@42	876 maps their positions in physical space to the triangle. In the latter visitors
samer@42	877 entering the installation generates a melody, and could collaborate with their
samer@42	878 co-visitors to generate musical textures---a playful yet informative way to
samer@42	879 explore expectation and surprise in music. Additionally different gestures could
samer@42	880 be detected to change the tempo, register, instrumentation and periodicity of
samer@42	881 the output melody.
samer@41	882
samer@23	883 As a screen based interface the Melody Triangle can serve as composition tool.
samer@42	884 A triangle is drawn on the screen, screen space thus mapped to the statistical
samer@42	885 space of the Melody Triangle. A number of round tokens, each representing a
samer@42	886 melody can be dragged in and around the triangle. When a token is dragged into
samer@42	887 the triangle, the system will start generating the sequence of symbols with
samer@42	888 statistical properties that correspond to the position of the token. These
samer@42	889 symbols are then mapped to notes of a scale.
samer@42	890 Keyboard input allow for control over additionally parameters.
samer@23	891
samer@42	892 The Melody Triangle is can assist a composer in the creation not only of melodies,
samer@42	893 but, by placing multiple tokens in the triangle, can generate intricate musical
samer@42	894 textures. Unlike other computer aided composition tools or programming
samer@42	895 environments, here the composer engages with music on the high and abstract level
samer@42	896 of expectation, randomness and predictability.
hekeus@35	897
hekeus@35	898
hekeus@38	899
hekeus@38	900 \subsection{Information Dynamics as Evaluative Feedback Mechanism}
hekeus@38	901 %NOT SURE THIS SHOULD BE HERE AT ALL..?
hekeus@38	902
hekeus@38	903
samer@42	904 Information measures on a stream of symbols can form a feedback mechanism; a
samer@42	905 rudamentary `critic' of sorts. For instance symbol by symbol measure of predictive
samer@42	906 information rate, entropy rate and redundancy could tell us if a stream of symbols
samer@42	907 is currently `boring', either because it is too repetitive, or because it is too
samer@42	908 chaotic. Such feedback would be oblivious to more long term and large scale
samer@42	909 structures, but it nonetheless could be provide a composer valuable insight on
samer@42	910 the short term properties of a work. This could not only be used for the
samer@42	911 evaluation of pre-composed streams of symbols, but could also provide real-time
samer@42	912 feedback in an improvisatory setup.
hekeus@38	913
hekeus@13	914 \section{Musical Preference and Information Dynamics}
samer@42	915 We are carrying out a study to investigate the relationship between musical
samer@42	916 preference and the information dynamics models, the experimental interface a
samer@42	917 simplified version of the screen-based Melody Triangle. Participants are asked
samer@42	918 to use this music pattern generator under various experimental conditions in a
samer@42	919 composition task. The data collected includes usage statistics of the system:
samer@42	920 where in the triangle they place the tokens, how long they leave them there and
samer@42	921 the state of the system when users, by pressing a key, indicate that they like
samer@42	922 what they are hearing. As such the experiments will help us identify any
samer@42	923 correlation between the information theoretic properties of a stream and its
samer@42	924 perceived aesthetic worth.
hekeus@16	925
samer@4	926
hekeus@38	927 %\emph{comparable system} Gordon Pask's Musicolor (1953) applied a similar notion
hekeus@38	928 %of boredom in its design. The Musicolour would react to audio input through a
hekeus@38	929 %microphone by flashing coloured lights. Rather than a direct mapping of sound
hekeus@38	930 %to light, Pask designed the device to be a partner to a performing musician. It
hekeus@38	931 %would adapt its lighting pattern based on the rhythms and frequencies it would
hekeus@38	932 %hear, quickly `learning' to flash in time with the music. However Pask endowed
hekeus@38	933 %the device with the ability to `be bored'; if the rhythmic and frequency content
hekeus@38	934 %of the input remained the same for too long it would listen for other rhythms
hekeus@38	935 %and frequencies, only lighting when it heard these. As the Musicolour would
hekeus@38	936 %`get bored', the musician would have to change and vary their playing, eliciting
hekeus@38	937 %new and unexpected outputs in trying to keep the Musicolour interested.
samer@4	938
hekeus@13	939
samer@4	940 \section{Conclusion}
samer@4	941
hekeus@44	942 \section{acknowledgments}
hekeus@44	943 This work is supported by EPSRC Doctoral Training Centre EP/G03723X/1 (HE), GR/S82213/01 and EP/E045235/1(SA), an EPSRC Leadership Fellowship, EP/G007144/1 (MDP) and EPSRC IDyOM2 EP/H013059/1.
hekeus@44	944
samer@9	945 \bibliographystyle{unsrt}
samer@43	946 {\bibliography{all,c4dm,nime,andrew}}
samer@4	947 \end{document}

Mercurial > hg > cip2012

annotate draft.tex @ 44:244b74fb707d