\documentclass[conference,a4paper]{IEEEtran}
\usepackage{cite}
\usepackage[cmex10]{amsmath}
\usepackage{graphicx}
\usepackage{amssymb}
\usepackage{epstopdf}
\usepackage{url}
\usepackage{listings}
%\usepackage[expectangle]{tools}
\usepackage{tools}
\usepackage{tikz}
\usetikzlibrary{calc}
\usetikzlibrary{matrix}
\usetikzlibrary{patterns}
\usetikzlibrary{arrows}

\let\citep=\cite
\newcommand{\colfig}[2][1]{\includegraphics[width=#1\linewidth]{figs/#2}}%
\newcommand\preals{\reals_+}
\newcommand\X{\mathcal{X}}
\newcommand\Y{\mathcal{Y}}
\newcommand\domS{\mathcal{S}}
\newcommand\A{\mathcal{A}}
\newcommand\rvm[1]{\mathrm{#1}}
\newcommand\sps{\,.\,}
\newcommand\Ipred{\mathcal{I}_{\mathrm{pred}}}
\newcommand\Ix{\mathcal{I}}
\newcommand\IXZ{\overline{\underline{\mathcal{I}}}}
\newcommand\x{\vec{x}}
\newcommand\Ham[1]{\mathcal{H}_{#1}}
\newcommand\subsets[2]{[#1]^{(k)}}
\def\bet(#1,#2){#1..#2}


\def\ev(#1=#2){#1\!\!=\!#2}
\newcommand\rv[1]{\Omega \to #1}
\newcommand\ceq{\!\!=\!}
\newcommand\cmin{\!-\!}
\newcommand\modulo[2]{#1\!\!\!\!\!\mod#2}

\newcommand\sumitoN{\sum_{i=1}^N}
\newcommand\sumktoK{\sum_{k=1}^K}
\newcommand\sumjtoK{\sum_{j=1}^K}
\newcommand\sumalpha{\sum_{\alpha\in\A}}
\newcommand\prodktoK{\prod_{k=1}^K}
\newcommand\prodjtoK{\prod_{j=1}^K}

\newcommand\past[1]{\overset{\rule{0pt}{0.2em}\smash{\leftarrow}}{#1}}
\newcommand\fut[1]{\overset{\rule{0pt}{0.1em}\smash{\rightarrow}}{#1}}
\newcommand\parity[2]{P^{#1}_{2,#2}}

%\usepackage[parfill]{parskip}

\begin{document}
\title{Cognitive Music Modelling: an Information Dynamics Approach}

\author{
\IEEEauthorblockN{Samer A. Abdallah, Henrik Ekeus, Peter Foster}
\IEEEauthorblockN{Andrew Robertson and Mark D. Plumbley}
\IEEEauthorblockA{Centre for Digital Music\\
Queen Mary University of London\\
Mile End Road, London E1 4NS\\
Email:}}

\maketitle
\begin{abstract}
When we perceive music, we take in information and use it to continually
build predictive models of what is going to happen next. Information
measures computed from these models are related to how we perceive music,
which makes an information-theoretic approach to music cognition a
fruitful avenue of research. In this paper, we review the theoretical
foundations of information dynamics and discuss a few emerging areas of
application.
\end{abstract}


\section{Expectation and surprise in music}
\label{s:Intro}

One of the effects of listening to music is to create
expectations of what is to come next, which may be fulfilled
immediately, after some delay, or not at all as the case may be.
This is the thesis put forward by, amongst others, music theorists
L. B. Meyer \cite{Meyer67} and Narmour \citep{Narmour77}, but was
recognised much earlier; for example,
it was elegantly put by Hanslick \cite{Hanslick1854} in the
nineteenth century:
\begin{quote}
`The most important factor in the mental process which accompanies the
act of listening to music, and which converts it to a source of pleasure,
is \ldots the intellectual satisfaction
which the listener derives from continually following and anticipating
the composer's intentions---now, to see his expectations fulfilled, and
now, to find himself agreeably mistaken.'
%It is a matter of course that
%this intellectual flux and reflux, this perpetual giving and receiving
%takes place unconsciously, and with the rapidity of lightning-flashes.'
\end{quote}
An essential aspect of this is that music is experienced as a phenomenon
that `unfolds' in time, rather than being apprehended as a static object
presented in its entirety. Meyer argued that musical experience depends
on how we change and revise our conceptions \emph{as events happen}, on
how expectation and prediction interact with occurrence, and that, to a
large degree, the way to understand the effect of music is to focus on
this `kinetics' of expectation and surprise.

The business of making predictions and assessing surprise is essentially
one of reasoning under conditions of uncertainty and manipulating
degrees of belief about the various propositions which may or may not
hold, and, as has been argued elsewhere \cite{Cox1946,Jaynes27}, is best
quantified in terms of Bayesian probability theory.
Thus, we suppose that
when we listen to music, expectations are created on the basis of our
familiarity with various stylistic norms, that is, using models that
encode the statistics of music in general, the particular styles of
music that seem best to fit the piece we happen to be listening to, and
the emerging structures peculiar to the current piece. There is
experimental evidence that human listeners are able to internalise
statistical knowledge about musical structure, \eg
\citep{SaffranJohnsonAslin1999,EerolaToiviainenKrumhansl2002}, and also
that statistical models can form an effective basis for computational
analysis of music, \eg
\cite{ConklinWitten95,PonsfordWigginsMellish1999,Pearce2005}.

\subsection{Music and information theory}
Given a probabilistic framework for music modelling and prediction,
it is a small step to apply quantitative information theory \cite{Shannon48} to
the models at hand.
The relationship between information theory and music and art in general has been the
subject of some interest since the 1950s
\cite{Youngblood58,CoonsKraehenbuehl1958,HillerBean66,Moles66,Meyer67,Cohen1962}.
The general thesis is that perceptible qualities and subjective
states like uncertainty, surprise, complexity, tension, and interestingness
are closely related to
information-theoretic quantities like entropy, relative entropy,
and mutual information.
% and are major determinants of the overall experience.
Berlyne \cite{Berlyne71} called such quantities `collative variables', since
they are to do with patterns of occurrence rather than medium-specific details,
and developed the ideas of `information aesthetics' in an experimental setting.
% Berlyne's `new experimental aesthetics', the `information-aestheticians'.

% Listeners then experience greater or lesser levels of surprise
% in response to departures from these norms.
% By careful manipulation
% of the material, the composer can thus define, and induce within the
% listener, a temporal programme of varying
% levels of uncertainty, ambiguity and surprise.


Previous work in this area \cite{Berlyne74} treated the various
information-theoretic quantities
such as entropy as if they were intrinsic properties of the stimulus---subjects
were presented with a sequence of tones with `high entropy', or a visual pattern
with `low entropy'. These values were determined from some known `objective'
probability model of the stimuli,%
\footnote{%
The notion of objective probabilities and whether or not they can
usefully be said to exist is the subject of some debate, with advocates of
subjective probabilities including de Finetti \cite{deFinetti}.
Accordingly, we will treat the concept of `true' or `objective' probability
models with a grain of salt and not rely on them in our
theoretical development.}%
or from simple statistical analyses such as
computing empirical distributions. Our approach is explicitly to consider the role
of the observer in perception, and more specifically, to consider estimates of
entropy \etc with respect to \emph{subjective} probabilities.
\subsection{Information dynamic approach}

Bringing the various strands together, our working hypothesis is that
as a listener (to which we will refer, gender-neutrally, as `it')
listens to a piece of music, it maintains a dynamically evolving statistical
model that enables it to make predictions about how the piece will
continue, relying on both its previous experience of music and the immediate
context of the piece.
As events unfold, it revises its model and hence its probabilistic belief state,
which includes predictive distributions over future observations.
These distributions and changes in distributions can be characterised in terms of a handful of
information-theoretic measures such as entropy and relative entropy.
% to measure uncertainty and information. %, that is, changes in predictive distributions maintained by the model.
By tracing the evolution of these measures, we obtain a representation
which captures much of the significant structure of the
music.
This approach has a number of features which we list below.

\emph{Abstraction}:
Because it is sensitive mainly to \emph{patterns} of occurrence,
rather than the details of which specific things occur,
it operates at a level of abstraction removed from the details of the sensory
experience and the medium through which it was received, suggesting that the
same approach could, in principle, be used to analyse and compare information
flow in different temporal media regardless of whether they are auditory,
visual or otherwise.

\emph{Generality}:
The approach is applicable to any probabilistic model.

\emph{Subjectivity}:
Since the analysis is dependent on the probability model the observer brings to the
problem, which may depend on prior experience or other factors, and which may change
over time, inter-subject variability and variation in subjects' responses over time are
fundamental to the theory. It is essentially a theory of subjective response.

%modelling the creative process, which often alternates between generative
%and selective or evaluative phases \cite{Boden1990}, and would have
%applications in tools for computer aided composition.


\section{Theoretical review}

In this section, we summarise the definitions of some of the relevant quantities
in information dynamics and show how they can be computed in some simple probabilistic
models (namely, first and higher-order Markov chains, and Gaussian processes [Peter?]).

\begin{fig}{venn-example}
\newcommand\rad{2.2em}%
\newcommand\circo{circle (3.4em)}%
\newcommand\labrad{4.3em}
\newcommand\bound{(-6em,-5em) rectangle (6em,6em)}
\newcommand\colsep{\ }
\newcommand\clipin[1]{\clip (#1) \circo;}%
\newcommand\clipout[1]{\clip \bound (#1) \circo;}%
\newcommand\cliptwo[3]{%
\begin{scope}
\clipin{#1};
\clipin{#2};
\clipout{#3};
\fill[black!30] \bound;
\end{scope}
}%
\newcommand\clipone[3]{%
\begin{scope}
\clipin{#1};
\clipout{#2};
\clipout{#3};
\fill[black!15] \bound;
\end{scope}
}%
\begin{tabular}{c@{\colsep}c}
\begin{tikzpicture}[baseline=0pt]
\coordinate (p1) at (90:\rad);
\coordinate (p2) at (210:\rad);
\coordinate (p3) at (-30:\rad);
\clipone{p1}{p2}{p3};
\clipone{p2}{p3}{p1};
\clipone{p3}{p1}{p2};
\cliptwo{p1}{p2}{p3};
\cliptwo{p2}{p3}{p1};
\cliptwo{p3}{p1}{p2};
\begin{scope}
\clip (p1) \circo;
\clip (p2) \circo;
\clip (p3) \circo;
\fill[black!45] \bound;
\end{scope}
\draw (p1) \circo;
\draw (p2) \circo;
\draw (p3) \circo;
\path
(barycentric cs:p3=1,p1=-0.2,p2=-0.1) +(0ex,0) node {$I_{3|12}$}
(barycentric cs:p1=1,p2=-0.2,p3=-0.1) +(0ex,0) node {$I_{1|23}$}
(barycentric cs:p2=1,p3=-0.2,p1=-0.1) +(0ex,0) node {$I_{2|13}$}
(barycentric cs:p3=1,p2=1,p1=-0.55) +(0ex,0) node {$I_{23|1}$}
(barycentric cs:p1=1,p3=1,p2=-0.55) +(0ex,0) node {$I_{13|2}$}
(barycentric cs:p2=1,p1=1,p3=-0.55) +(0ex,0) node {$I_{12|3}$}
(barycentric cs:p3=1,p2=1,p1=1) node {$I_{123}$}
;
\path
(p1) +(140:\labrad) node {$X_1$}
(p2) +(-140:\labrad) node {$X_2$}
(p3) +(-40:\labrad) node {$X_3$};
\end{tikzpicture}
&
\parbox{0.5\linewidth}{
\small
\begin{align*}
I_{1|23} &= H(X_1|X_2,X_3) \\
I_{13|2} &= I(X_1;X_3|X_2) \\
I_{1|23} + I_{13|2} &= H(X_1|X_2) \\
I_{12|3} + I_{123} &= I(X_1;X_2)
\end{align*}
}
\end{tabular}
\caption{
Venn diagram visualisation of entropies and mutual informations
for three random variables $X_1$, $X_2$ and $X_3$. The areas of
the three circles represent $H(X_1)$, $H(X_2)$ and $H(X_3)$ respectively.
The total shaded area is the joint entropy $H(X_1,X_2,X_3)$.
The central area $I_{123}$ is the co-information \cite{McGill1954}.
Some other information measures are indicated in the legend.
}
\end{fig}
[Adopting notation of recent Binding information paper.]
\subsection{`Anatomy of a bit' stuff}
Entropy rates, redundancy, predictive information etc.
Information diagrams.

\begin{fig}{predinfo-bg}
\newcommand\subfig[2]{\shortstack{#2\\[0.75em]#1}}
\newcommand\rad{1.8em}%
\newcommand\ovoid[1]{%
++(-#1,\rad)
-- ++(2 * #1,0em) arc (90:-90:\rad)
-- ++(-2 * #1,0em) arc (270:90:\rad)
}%
\newcommand\axis{2.75em}%
\newcommand\olap{0.85em}%
\newcommand\offs{3.6em}
\newcommand\colsep{\hspace{5em}}
\newcommand\longblob{\ovoid{\axis}}
\newcommand\shortblob{\ovoid{1.75em}}
\begin{tabular}{c@{\colsep}c}
\subfig{(a) excess entropy}{%
\newcommand\blob{\longblob}
\begin{tikzpicture}
\coordinate (p1) at (-\offs,0em);
\coordinate (p2) at (\offs,0em);
\begin{scope}
\clip (p1) \blob;
\clip (p2) \blob;
\fill[lightgray] (-1,-1) rectangle (1,1);
\end{scope}
\draw (p1) +(-0.5em,0em) node{\shortstack{infinite\\past}} \blob;
\draw (p2) +(0.5em,0em) node{\shortstack{infinite\\future}} \blob;
\path (0,0) node (future) {$E$};
\path (p1) +(-2em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
\path (p2) +(2em,\rad) node [anchor=south] {$X_0,\ldots$};
\end{tikzpicture}%
}%
\\[1.25em]
\subfig{(b) predictive information rate $b_\mu$}{%
\begin{tikzpicture}%[baseline=-1em]
\newcommand\rc{2.1em}
\newcommand\throw{2.5em}
\coordinate (p1) at (210:1.5em);
\coordinate (p2) at (90:0.7em);
\coordinate (p3) at (-30:1.5em);
\newcommand\bound{(-7em,-2.6em) rectangle (7em,3.0em)}
\newcommand\present{(p2) circle (\rc)}
\newcommand\thepast{(p1) ++(-\throw,0) \ovoid{\throw}}
\newcommand\future{(p3) ++(\throw,0) \ovoid{\throw}}
\newcommand\fillclipped[2]{%
\begin{scope}[even odd rule]
\foreach \thing in {#2} {\clip \thing;}
\fill[black!#1] \bound;
\end{scope}%
}%
\fillclipped{30}{\present,\future,\bound \thepast}
\fillclipped{15}{\present,\bound \future,\bound \thepast}
\draw \future;
\fillclipped{45}{\present,\thepast}
\draw \thepast;
\draw \present;
\node at (barycentric cs:p2=1,p1=-0.17,p3=-0.17) {$r_\mu$};
\node at (barycentric cs:p1=-0.4,p2=1.0,p3=1) {$b_\mu$};
\node at (barycentric cs:p3=0,p2=1,p1=1.2) [shape=rectangle,fill=black!45,inner sep=1pt]{$\rho_\mu$};
\path (p2) +(140:3em) node {$X_0$};
% \node at (barycentric cs:p3=0,p2=1,p1=1) {$\rho_\mu$};
\path (p3) +(3em,0em) node {\shortstack{infinite\\future}};
\path (p1) +(-3em,0em) node {\shortstack{infinite\\past}};
\path (p1) +(-4em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
\path (p3) +(4em,\rad) node [anchor=south] {$X_1,\ldots$};
\end{tikzpicture}}%
\\[0.5em]
\end{tabular}
\caption{
Venn diagram representation of several information measures for
stationary random processes. Each circle or oval represents a random
variable or sequence of random variables relative to time $t=0$. Overlapped areas
correspond to various mutual informations, as in \Figrf{venn-example}.
In (b), the circle represents the `present'. Its total area is
$H(X_0)=H(1)=\rho_\mu+r_\mu+b_\mu$, where $\rho_\mu$ is the multi-information
rate, $r_\mu$ is the residual entropy rate, and $b_\mu$ is the predictive
information rate. The entropy rate is $h_\mu = r_\mu+b_\mu$.
}
\end{fig}

\paragraph{Predictive information rate}
In previous work \cite{AbdallahPlumbley2009}, we introduced
% examined several
% information-theoretic measures that could be used to characterise
% not only random processes (\ie, an ensemble of possible sequences),
% but also the dynamic progress of specific realisations of such processes.
% One of these measures was
%
the \emph{predictive information rate}
(PIR), which is the average information
in one observation about the infinite future given the infinite past.
If $\past{X}_t=(\ldots,X_{t-2},X_{t-1})$ denotes the variables
before time $t$,
and $\fut{X}_t = (X_{t+1},X_{t+2},\ldots)$ denotes
those after $t$,
the PIR at time $t$ is defined as a conditional mutual information:
\begin{equation}
\label{eq:PIR}
\IXZ_t \define I(X_t;\fut{X}_t|\past{X}_t) = H(\fut{X}_t|\past{X}_t) - H(\fut{X}_t|X_t,\past{X}_t).
\end{equation}
% (The underline/overline notation follows that of \cite[\S 3]{AbdallahPlumbley2009}.)
% Hence, $\Ix_t$ quantifies the \emph{new}
% information gained about the future from the observation at time $t$.
Equation \eqrf{PIR} can be read as the average reduction
in uncertainty about the future on learning $X_t$, given the past.
Due to the symmetry of the mutual information, it can also be written
as
\begin{equation}
% \IXZ_t
I(X_t;\fut{X}_t|\past{X}_t) = H(X_t|\past{X}_t) - H(X_t|\fut{X}_t,\past{X}_t).
% \label{<++>}
\end{equation}
% If $X$ is stationary, then
Now, in the shift-invariant case, $H(X_t|\past{X}_t)$
is the familiar entropy rate $h_\mu$, but $H(X_t|\fut{X}_t,\past{X}_t)$,
the conditional entropy of one variable given \emph{all} the others
in the sequence, future as well as past, is what
we called the \emph{residual entropy rate} $r_\mu$ in \cite{AbdallahPlumbley2010};
it was previously identified by Verd{\'u} and Weissman \cite{VerduWeissman2006} as the
\emph{erasure entropy rate}.
% It is not expressible in terms of the block entropy function $H(\cdot)$.
It can be defined as the limit
\begin{equation}
\label{eq:residual-entropy-rate}
r_\mu \define \lim_{N\tends\infty} H(X_{\bet(-N,N)}) - H(X_{\bet(-N,-1)},X_{\bet(1,N)}).
\end{equation}
The second term, $H(X_{\bet(1,N)},X_{\bet(-N,-1)})$,
is the joint entropy of two non-adjacent blocks each of length $N$ with a
gap between them,
and cannot be expressed as a function of block entropies alone.
% In order to associate it with the concept of \emph{binding information} which
% we will define in \secrf{binding-info}, we
Thus, the shift-invariant PIR (which we will write as $b_\mu$) is the difference between
the entropy rate and the erasure entropy rate: $b_\mu = h_\mu - r_\mu$.
These relationships are illustrated in \Figrf{predinfo-bg}, along with
several of the information measures we have discussed so far.


\subsection{First order Markov chains}
These are the simplest non-trivial models to which information dynamics methods
can be applied. In \cite{AbdallahPlumbley2009}, we showed that the predictive information
rate can be expressed simply in terms of the entropy rate of the Markov chain.
If we let $a$ denote the transition matrix of the Markov chain, and $h_a$ its
entropy rate, then its predictive information rate $b_a$ is
\begin{equation}
b_a = h_{a^2} - h_a,
\end{equation}
where $a^2 = aa$, the transition matrix squared, is the transition matrix
of the `skip one' Markov chain obtained by leaving out every other observation.

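As an illustration, the following is a minimal sketch of our own (not code from
any system described here) that computes $h_a$ and $b_a$ numerically for a given
transition matrix, using the standard Markov chain entropy rate formula
$h_a = -\sum_i \pi_i \sum_j a_{ij} \log_2 a_{ij}$ with stationary distribution
$\pi$; the row-stochastic convention and all names are our own choices.
\begin{lstlisting}[language=Python]
import numpy as np

def stationary(a):
    """Stationary distribution of a row-stochastic transition matrix a."""
    evals, evecs = np.linalg.eig(a.T)
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    return pi / pi.sum()

def entropy_rate(a):
    """h_a = -sum_i pi_i sum_j a_ij log2(a_ij), in bits."""
    pi, h = stationary(a), 0.0
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            if a[i, j] > 0:
                h -= pi[i] * a[i, j] * np.log2(a[i, j])
    return h

def predictive_information_rate(a):
    """b_a = h_{a^2} - h_a for a first-order Markov chain."""
    return entropy_rate(a @ a) - entropy_rate(a)

# example: a two-state chain
a = np.array([[0.9, 0.1],
              [0.5, 0.5]])
print(entropy_rate(a), predictive_information_rate(a))
\end{lstlisting}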
\subsection{Higher order Markov chains}
Second and higher order Markov chains can be treated in a similar way by transforming
to a first order representation of the high order Markov chain. If we are dealing
with an $N$th order model, this is done by forming a new alphabet of possible observations
consisting of all possible $N$-tuples of symbols from the base alphabet. An observation
in this new model represents a block of $N$ observations from the base model. The next
observation represents the block of $N$ obtained by shifting the previous block along
by one step. The new Markov chain is parameterised by a sparse $K^N\times K^N$
transition matrix $\hat{a}$. The predictive information rate can then be expressed
in terms of entropy rates of powers of $\hat{a}$:
\begin{equation}
b_{\hat{a}} = h_{\hat{a}^{N+1}} - N h_{\hat{a}},
\end{equation}
where $\hat{a}^{N+1}$ is the $(N+1)$th power of the transition matrix.



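To make the construction concrete, here is a sketch (again our own illustration,
under assumptions of our choosing) that lifts an order-2 chain over a $K$-symbol
alphabet, given a conditional table $p[i,j,k] = P(X_t = k \mid X_{t-2} = i, X_{t-1} = j)$,
to its first-order representation $\hat{a}$ and applies the formula above; the
pair-to-index encoding is an arbitrary choice, and \texttt{entropy\_rate} is
reused from the previous listing.
\begin{lstlisting}[language=Python]
import numpy as np
# reuses entropy_rate() from the previous listing

def lift_order2(p):
    """First-order transition matrix over pairs (x_{t-1}, x_t), given an
    order-2 conditional table p[i, j, k] = P(k | previous pair (i, j))."""
    K = p.shape[0]
    a_hat = np.zeros((K * K, K * K))
    for i in range(K):
        for j in range(K):
            for k in range(K):
                # state (i, j) can only move to a state of the form (j, k)
                a_hat[i * K + j, j * K + k] = p[i, j, k]
    return a_hat

def pir_order_n(a_hat, N):
    """b = h_{a_hat^(N+1)} - N * h_{a_hat} for an order-N chain."""
    a_pow = np.linalg.matrix_power(a_hat, N + 1)
    return entropy_rate(a_pow) - N * entropy_rate(a_hat)
\end{lstlisting}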
\section{Information Dynamics in Analysis}

\subsection{Musicological Analysis}
refer to the work with the analysis of minimalist pieces

\subsection{Content analysis/Sound Categorisation}
Using information dynamics it is possible to segment music; such segmentations
can then be used to search large data sets and to determine musical structure
for the purposes of playlist navigation and search.
\emph{Peter}

\subsection{Beat Tracking}
\emph{Andrew}


\section{Information Dynamics as Design Tool}

In addition to applying information dynamics to analysis, it is also possible to use this approach in design, such as the composition of musical materials.
By providing a framework for linking information-theoretic measures to the control of generative processes, it becomes possible to steer the output of these processes to match criteria defined by these measures.
For instance, the outputs of a stochastic musical process could be filtered to match constraints defined by a set of information-theoretic measures.

The use of stochastic processes for the generation of musical material has been widespread for decades -- Iannis Xenakis applied probabilistic mathematical models to the creation of musical materials, including the formulation of a theory of Markovian Stochastic Music.
However, we can use information dynamics measures to explore and interface with such processes at the high and abstract level of expectation, randomness and predictability.
The Melody Triangle is such a system.

\subsection{The Melody Triangle}
The Melody Triangle is an exploratory interface for the discovery of melodic content, where the input -- positions within a triangle -- directly maps to information-theoretic measures associated with the output.
The measures are the entropy rate, redundancy and predictive information rate of the random process used to generate the sequence of notes.
These are all related to the predictability of the sequence and as such address the notions of expectation and surprise in the perception of music.\emph{self-plagiarised}

Before the Melody Triangle can be used, it has to be `populated' with possible parameter values for the melody generators.
These are then plotted in a 3-D statistical space of redundancy, entropy rate and predictive information rate.
In our case we generated thousands of transition matrices, representing first-order Markov chains, by a random sampling method. In figure \ref{InfoDynEngine} we see a representation of how these matrices are distributed in the 3-D statistical space; each one of these points corresponds to a transition matrix, as outlined in the sketch below.\emph{self-plagiarised}

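The following sketch (our own reconstruction, not the code used for the
installation) illustrates one way such a population step could be carried out:
each candidate matrix is drawn row-by-row from a Dirichlet distribution, and its
three coordinates are computed with the functions from the earlier listings.
Here redundancy is taken to be the multi-information rate $\rho_a = H(\pi) - h_a$,
which appears consistent with the periodic-chain redundancies described in the
caption of figure \ref{InfoDynEngine}; the alphabet size, Dirichlet parameter and
sample count are arbitrary choices.
\begin{lstlisting}[language=Python]
import numpy as np
# reuses stationary(), entropy_rate() and predictive_information_rate()
# from the earlier listings

def marginal_entropy(a):
    """H(pi): entropy of the stationary distribution, in bits."""
    pi = stationary(a)
    pi = pi[pi > 0]
    return float(-np.sum(pi * np.log2(pi)))

def sample_population(n_points=5000, K=8, alpha=0.1, seed=0):
    """Draw random row-stochastic matrices and return their coordinates
    (redundancy, entropy rate, PIR) in the statistical space."""
    rng = np.random.default_rng(seed)
    coords, matrices = [], []
    for _ in range(n_points):
        a = rng.dirichlet(alpha * np.ones(K), size=K)  # one row per state
        h = entropy_rate(a)
        rho = marginal_entropy(a) - h  # redundancy as multi-information rate
        coords.append((rho, h, predictive_information_rate(a)))
        matrices.append(a)
    return np.array(coords), matrices
\end{lstlisting}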
\begin{figure}
\centering
\includegraphics[width=\linewidth]{figs/mtriscat}
\caption{The population of transition matrices distributed along three axes of
redundancy, entropy rate and predictive information rate (all measured in bits).
The concentrations of points along the redundancy axis correspond
to Markov chains which are roughly periodic with periods of 2 (redundancy 1 bit),
3, 4, \etc all the way to period 8 (redundancy 3 bits). The colour of each point
represents its PIR---note that the highest values are found at intermediate entropy
and redundancy, and that the distribution as a whole makes a curved triangle. Although
not visible in this plot, it is largely hollow in the middle.
\label{InfoDynEngine}}
\end{figure}


When we look at the distribution of transition matrices plotted in this space, we see that it forms an arch shape that is fairly thin.
It thus becomes a reasonable approximation to pretend that it is just a sheet in two dimensions; and so we stretch out this curved arc into a flat triangle.
It is this triangular sheet that is our `Melody Triangle' and forms the interface by which the system is controlled. \emph{self-plagiarised}

When the Melody Triangle is used, regardless of whether it is as a screen-based system or as an interactive installation, it involves a mapping to this statistical space.
When the user, through the interface, selects a position within the triangle, the corresponding transition matrix is returned.
Figure \ref{TheTriangle} shows how the triangle maps to different measures of redundancy, entropy rate and predictive information rate.\emph{self-plagiarised}
\begin{figure}
\centering
\includegraphics[width=\linewidth]{figs/TheTriangle.pdf}
\caption{The Melody Triangle\label{TheTriangle}}
\end{figure}
The three corners correspond to different extremes of predictability and unpredictability, which could be loosely characterised as `periodicity', `noise' and `repetition'.
Melodies from the `noise' corner have no discernible pattern; they have high entropy rate, low predictive information rate and low redundancy.
These melodies are essentially totally random.
Melodies along the `periodicity' to `repetition' edge are all deterministic loops that get shorter as we approach the `repetition' corner, until they become just one repeating note.
It is the areas in between the extremes that provide the more `interesting' melodies.
That is, those that have some level of unpredictability, but are not completely random. Or, conversely, that are predictable, but not entirely so.
This triangular space allows for an intuitive exploration of expectation and surprise in temporal sequences based on a simple model of how one might guess the next event given the previous one.\emph{self-plagiarised}



Any number of interfaces could be developed for the Melody Triangle.
We have developed two: a standard screen-based interface where a user moves tokens with a mouse in and around a triangle on screen, and a multi-user interactive installation where a Kinect camera tracks individuals in a space and maps their positions in the space to the triangle.
Each visitor would generate a melody, and could collaborate with their co-visitors to generate musical textures -- a playful yet informative way to explore expectation and surprise in music.

As a screen-based interface, the Melody Triangle can serve as a composition tool.
A triangle is drawn on the screen, with screen space thus mapped to the statistical space of the Melody Triangle.
A number of round tokens, each representing a melody, can be dragged in and around the triangle.
When a token is dragged into the triangle, the system will start generating the sequence of notes with statistical properties that correspond to its position in the triangle.\emph{self-plagiarised}

In this mode, the Melody Triangle can be used as a kind of composition assistant for the generation of interesting musical textures and melodies.
However, unlike other computer-aided composition tools or programming environments, here the composer engages with music on the high and abstract level of expectation, randomness and predictability.\emph{self-plagiarised}


Additionally, the Melody Triangle serves as an effective tool for experimental investigations into musical preference and its relationship to the information dynamics models.

%As the Melody Triangle essentially operates on a stream of symbols, it it is possible to apply the melody triangle to the design of non-sonic content.

\section{Musical Preference and Information Dynamics}
We carried out a preliminary study that sought to identify any correlation between aesthetic preference and the information-theoretic measures of the Melody Triangle.
In this study participants were asked to use the screen-based interface, but it was simplified so that all they could do was move tokens around.
To help discount visual biases, the axes of the triangle would be randomly rearranged for each participant.\emph{self-plagiarised}

The study was divided into two parts: the first investigated musical preference with respect to single melodies at different tempos.
In the second part of the study, a background melody was playing and the participants were asked to find a second melody that `works well' with the background melody.
For each participant this was done four times, each with a different background melody from four different areas of the Melody Triangle.
For all parts of the study the participants were asked to `mark', by pressing the space bar, whenever they liked what they were hearing.\emph{self-plagiarised}

\emph{todo - results}

\section{Information Dynamics as Evaluative Feedback Mechanism}

\emph{todo - code the info dyn evaluator :) }

It is possible to use information dynamics measures to develop a kind of `critic' that would evaluate a stream of symbols.
For instance, we could develop a system to notify us if a stream of symbols is becoming boring, either because it is too repetitive or too chaotic.
This could be used to evaluate both pre-composed streams of symbols, or could even be used to provide real-time feedback in an improvisatory setup; a minimal sketch of such a critic is given below.

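One possible sketch of such a critic (purely illustrative; the adaptive first-order
model, windowing scheme and thresholds are our own assumptions rather than an
implemented system) maintains a running model of the incoming symbols and flags
the stream when its recent average surprisal is either very low (too repetitive)
or very high (too chaotic).
\begin{lstlisting}[language=Python]
import numpy as np
from collections import deque

class BoredomCritic:
    """Illustrative sketch of an information-dynamics 'critic'."""

    def __init__(self, K, window=32, low=0.2, high=0.9):
        self.counts = np.ones((K, K))    # add-one smoothed bigram counts
        self.K = K
        self.window = deque(maxlen=window)
        self.low, self.high = low, high  # thresholds as fractions of log2(K)
        self.prev = None

    def observe(self, x):
        """Update the model with symbol x and record its surprisal."""
        if self.prev is not None:
            row = self.counts[self.prev]
            surprisal = -np.log2(row[x] / row.sum())  # -log2 P(x | prev)
            self.window.append(surprisal)
            self.counts[self.prev, x] += 1
        self.prev = x

    def verdict(self):
        """Classify the recent stream as repetitive, chaotic or interesting."""
        if len(self.window) < self.window.maxlen:
            return 'undecided'
        mean = np.mean(self.window) / np.log2(self.K)  # roughly in [0, 1]
        if mean < self.low:
            return 'boring: too repetitive'
        if mean > self.high:
            return 'boring: too chaotic'
        return 'interesting'
\end{lstlisting}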
\emph{comparable system} Gordon Pask's Musicolour (1953) applied a similar notion of boredom in its design.
The Musicolour would react to audio input through a microphone by flashing coloured lights.
Rather than a direct mapping of sound to light, Pask designed the device to be a partner to a performing musician.
It would adapt its lighting pattern based on the rhythms and frequencies it would hear, quickly `learning' to flash in time with the music.
However, Pask endowed the device with the ability to `be bored'; if the rhythmic and frequency content of the input remained the same for too long, it would listen for other rhythms and frequencies, only lighting when it heard these.
As the Musicolour would `get bored', the musician would have to change and vary their playing, eliciting new and unexpected outputs in trying to keep the Musicolour interested.

In a similar vein, our \emph{Information Dynamics Critic} (name?) allows for an evaluative measure of an input stream, however with a more sophisticated notion of boredom that \dots




\section{Conclusion}

\bibliographystyle{unsrt}
{\bibliography{all,c4dm,nime}}
\end{document}