cip2012: draft.tex annotate

annotate draft.tex @ 23:f9a67e19a66b

Reformatted long lines to approx 80 characters.

author	samer
date	Mon, 12 Mar 2012 20:00:25 +0000
parents	739b2444a4ac
children	79ede31feb20

rev	line source
samer@18	1 \documentclass[conference,a4paper]{IEEEtran}
samer@4	2 \usepackage{cite}
samer@4	3 \usepackage[cmex10]{amsmath}
samer@4	4 \usepackage{graphicx}
samer@4	5 \usepackage{amssymb}
samer@4	6 \usepackage{epstopdf}
samer@4	7 \usepackage{url}
samer@4	8 \usepackage{listings}
samer@18	9 %\usepackage[expectangle]{tools}
samer@9	10 \usepackage{tools}
samer@18	11 \usepackage{tikz}
samer@18	12 \usetikzlibrary{calc}
samer@18	13 \usetikzlibrary{matrix}
samer@18	14 \usetikzlibrary{patterns}
samer@18	15 \usetikzlibrary{arrows}
samer@9	16
samer@9	17 \let\citep=\cite
samer@18	18 \newcommand{\colfig}[2][1]{\includegraphics[width=#1\linewidth]{figs/#2}}%
samer@18	19 \newcommand\preals{\reals_+}
samer@18	20 \newcommand\X{\mathcal{X}}
samer@18	21 \newcommand\Y{\mathcal{Y}}
samer@18	22 \newcommand\domS{\mathcal{S}}
samer@18	23 \newcommand\A{\mathcal{A}}
samer@18	24 \newcommand\rvm[1]{\mathrm{#1}}
samer@18	25 \newcommand\sps{\,.\,}
samer@18	26 \newcommand\Ipred{\mathcal{I}_{\mathrm{pred}}}
samer@18	27 \newcommand\Ix{\mathcal{I}}
samer@18	28 \newcommand\IXZ{\overline{\underline{\mathcal{I}}}}
samer@18	29 \newcommand\x{\vec{x}}
samer@18	30 \newcommand\Ham[1]{\mathcal{H}_{#1}}
samer@18	31 \newcommand\subsets[2]{[#1]^{(k)}}
samer@18	32 \def\bet(#1,#2){#1..#2}
samer@18	33
samer@18	34
samer@18	35 \def\ev(#1=#2){#1\!\!=\!#2}
samer@18	36 \newcommand\rv[1]{\Omega \to #1}
samer@18	37 \newcommand\ceq{\!\!=\!}
samer@18	38 \newcommand\cmin{\!-\!}
samer@18	39 \newcommand\modulo[2]{#1\!\!\!\!\!\mod#2}
samer@18	40
samer@18	41 \newcommand\sumitoN{\sum_{i=1}^N}
samer@18	42 \newcommand\sumktoK{\sum_{k=1}^K}
samer@18	43 \newcommand\sumjtoK{\sum_{j=1}^K}
samer@18	44 \newcommand\sumalpha{\sum_{\alpha\in\A}}
samer@18	45 \newcommand\prodktoK{\prod_{k=1}^K}
samer@18	46 \newcommand\prodjtoK{\prod_{j=1}^K}
samer@18	47
samer@18	48 \newcommand\past[1]{\overset{\rule{0pt}{0.2em}\smash{\leftarrow}}{#1}}
samer@18	49 \newcommand\fut[1]{\overset{\rule{0pt}{0.1em}\smash{\rightarrow}}{#1}}
samer@18	50 \newcommand\parity[2]{P^{#1}_{2,#2}}
samer@4	51
samer@4	52 %\usepackage[parfill]{parskip}
samer@4	53
samer@4	54 \begin{document}
samer@4	55 \title{Cognitive Music Modelling: an Information Dynamics Approach}
samer@4	56
samer@4	57 \author{
hekeus@16	58 \IEEEauthorblockN{Samer A. Abdallah, Henrik Ekeus, Peter Foster}
hekeus@16	59 \IEEEauthorblockN{Andrew Robertson and Mark D. Plumbley}
samer@4	60 \IEEEauthorblockA{Centre for Digital Music\\
samer@4	61 Queen Mary University of London\\
hekeus@16	62 Mile End Road, London E1 4NS\\
hekeus@16	63 Email:}}
samer@4	64
samer@4	65 \maketitle
samer@18	66 \begin{abstract}
samer@18	67 People take in information when perceiving music. With it they continually
samer@18	68 build predictive models of what is going to happen. There is a relationship
samer@18	69 between information measures and how we perceive music. An information
samer@18	70 theoretic approach to music cognition is thus a fruitful avenue of research.
samer@18	71 In this paper, we review the theoretical foundations of information dynamics
samer@18	72 and discuss a few emerging areas of application.
hekeus@16	73 \end{abstract}
samer@4	74
samer@4	75
samer@9	76 \section{Expectation and surprise in music}
samer@9	77 \label{s:Intro}
samer@9	78
samer@18	79 One of the effects of listening to music is to create
samer@18	80 expectations of what is to come next, which may be fulfilled
samer@9	81 immediately, after some delay, or not at all as the case may be.
samer@9	82 This is the thesis put forward by, amongst others, music theorists
samer@18	83 L. B. Meyer \cite{Meyer67} and Narmour \citep{Narmour77}, but was
samer@18	84 recognised much earlier; for example,
samer@9	85 it was elegantly put by Hanslick \cite{Hanslick1854} in the
samer@9	86 nineteenth century:
samer@9	87 \begin{quote}
samer@9	88 `The most important factor in the mental process which accompanies the
samer@9	89 act of listening to music, and which converts it to a source of pleasure,
samer@18	90 is \ldots the intellectual satisfaction
samer@9	91 which the listener derives from continually following and anticipating
samer@9	92 the composer's intentions---now, to see his expectations fulfilled, and
samer@18	93 now, to find himself agreeably mistaken.
samer@18	94 %It is a matter of course that
samer@18	95 %this intellectual flux and reflux, this perpetual giving and receiving
samer@18	96 %takes place unconsciously, and with the rapidity of lightning-flashes.'
samer@9	97 \end{quote}
samer@9	98 An essential aspect of this is that music is experienced as a phenomenon
samer@9	99 that `unfolds' in time, rather than being apprehended as a static object
samer@9	100 presented in its entirety. Meyer argued that musical experience depends
samer@9	101 on how we change and revise our conceptions \emph{as events happen}, on
samer@9	102 how expectation and prediction interact with occurrence, and that, to a
samer@9	103 large degree, the way to understand the effect of music is to focus on
samer@9	104 this `kinetics' of expectation and surprise.
samer@9	105
samer@9	106 The business of making predictions and assessing surprise is essentially
samer@9	107 one of reasoning under conditions of uncertainty and manipulating
samer@9	108 degrees of belief about the various proposition which may or may not
samer@9	109 hold, and, as has been argued elsewhere \cite{Cox1946,Jaynes27}, best
samer@9	110 quantified in terms of Bayesian probability theory.
samer@9	111 Thus, we suppose that
samer@9	112 when we listen to music, expectations are created on the basis of our
samer@9	113 familiarity with various stylistic norms %, that is, using models that
samer@9	114 encode the statistics of music in general, the particular styles of
samer@9	115 music that seem best to fit the piece we happen to be listening to, and
samer@9	116 the emerging structures peculiar to the current piece. There is
samer@9	117 experimental evidence that human listeners are able to internalise
samer@9	118 statistical knowledge about musical structure, \eg
samer@9	119 \citep{SaffranJohnsonAslin1999,EerolaToiviainenKrumhansl2002}, and also
samer@9	120 that statistical models can form an effective basis for computational
samer@9	121 analysis of music, \eg
samer@9	122 \cite{ConklinWitten95,PonsfordWigginsMellish1999,Pearce2005}.
samer@9	123
samer@9	124 \subsection{Music and information theory}
samer@9	125 Given a probabilistic framework for music modelling and prediction,
samer@9	126 it is a small step to apply quantitative information theory \cite{Shannon48} to
samer@9	127 the models at hand.
samer@9	128 The relationship between information theory and music and art in general has been the
samer@9	129 subject of some interest since the 1950s
samer@9	130 \cite{Youngblood58,CoonsKraehenbuehl1958,HillerBean66,Moles66,Meyer67,Cohen1962}.
samer@9	131 The general thesis is that perceptible qualities and subjective
samer@9	132 states like uncertainty, surprise, complexity, tension, and interestingness
samer@9	133 are closely related to
samer@9	134 information-theoretic quantities like entropy, relative entropy,
samer@9	135 and mutual information.
samer@9	136 % and are major determinants of the overall experience.
samer@9	137 Berlyne \cite{Berlyne71} called such quantities `collative variables', since
samer@9	138 they are to do with patterns of occurrence rather than medium-specific details,
samer@9	139 and developed the ideas of `information aesthetics' in an experimental setting.
samer@9	140 % Berlyne's `new experimental aesthetics', the `information-aestheticians'.
samer@9	141
samer@9	142 % Listeners then experience greater or lesser levels of surprise
samer@9	143 % in response to departures from these norms.
samer@9	144 % By careful manipulation
samer@9	145 % of the material, the composer can thus define, and induce within the
samer@9	146 % listener, a temporal programme of varying
samer@9	147 % levels of uncertainty, ambiguity and surprise.
samer@9	148
samer@9	149
samer@9	150 Previous work in this area \cite{Berlyne74} treated the various
samer@9	151 information theoretic quantities
samer@9	152 such as entropy as if they were intrinsic properties of the stimulus---subjects
samer@9	153 were presented with a sequence of tones with `high entropy', or a visual pattern
samer@9	154 with `low entropy'. These values were determined from some known `objective'
samer@9	155 probability model of the stimuli,%
samer@9	156 \footnote{%
samer@9	157 The notion of objective probabalities and whether or not they can
samer@9	158 usefully be said to exist is the subject of some debate, with advocates of
samer@9	159 subjective probabilities including de Finetti \cite{deFinetti}.
samer@9	160 Accordingly, we will treat the concept of a `true' or `objective' probability
samer@9	161 models with a grain of salt and not rely on them in our
samer@9	162 theoretical development.}%
samer@9	163 or from simple statistical analyses such as
samer@9	164 computing emprical distributions. Our approach is explicitly to consider the role
samer@9	165 of the observer in perception, and more specifically, to consider estimates of
samer@9	166 entropy \etc with respect to \emph{subjective} probabilities.
samer@9	167 \subsection{Information dynamic approach}
samer@9	168
samer@9	169 Bringing the various strands together, our working hypothesis is that
samer@9	170 as a listener (to which will refer gender neutrally as `it')
samer@9	171 listens to a piece of music, it maintains a dynamically evolving statistical
samer@9	172 model that enables it to make predictions about how the piece will
samer@9	173 continue, relying on both its previous experience of music and the immediate
samer@9	174 context of the piece.
samer@9	175 As events unfold, it revises its model and hence its probabilistic belief state,
samer@9	176 which includes predictive distributions over future observations.
samer@9	177 These distributions and changes in distributions can be characterised in terms of a handful of information
samer@9	178 theoretic-measures such as entropy and relative entropy.
samer@9	179 % to measure uncertainty and information. %, that is, changes in predictive distributions maintained by the model.
samer@9	180 By tracing the evolution of a these measures, we obtain a representation
samer@9	181 which captures much of the significant structure of the
samer@9	182 music.
samer@9	183 This approach has a number of features which we list below.
samer@9	184
samer@18	185 \emph{Abstraction}:
samer@9	186 Because it is sensitive mainly to \emph{patterns} of occurence,
samer@9	187 rather the details of which specific things occur,
samer@9	188 it operates at a level of abstraction removed from the details of the sensory
samer@9	189 experience and the medium through which it was received, suggesting that the
samer@9	190 same approach could, in principle, be used to analyse and compare information
samer@9	191 flow in different temporal media regardless of whether they are auditory,
samer@9	192 visual or otherwise.
samer@9	193
samer@18	194 \emph{Generality} applicable to any probabilistic model.
samer@9	195
samer@18	196 \emph{Subjectivity}:
samer@9	197 Since the analysis is dependent on the probability model the observer brings to the
samer@9	198 problem, which may depend on prior experience or other factors, and which may change
samer@9	199 over time, inter-subject variablity and variation in subjects' responses over time are
samer@9	200 fundamental to the theory. It is essentially a theory of subjective response
samer@9	201
samer@18	202 %modelling the creative process, which often alternates between generative
samer@18	203 %and selective or evaluative phases \cite{Boden1990}, and would have
samer@18	204 %applications in tools for computer aided composition.
samer@18	205
samer@18	206
samer@18	207 \section{Theoretical review}
samer@18	208
samer@18	209 In this section, we summarise the definitions of some of the relevant quantities
samer@18	210 in information dynamics and show how they can be computed in some simple probabilistic
samer@18	211 models (namely, first and higher-order Markov chains, and Gaussian processes [Peter?]).
samer@18	212
samer@18	213 \begin{fig}{venn-example}
samer@18	214 \newcommand\rad{2.2em}%
samer@18	215 \newcommand\circo{circle (3.4em)}%
samer@18	216 \newcommand\labrad{4.3em}
samer@18	217 \newcommand\bound{(-6em,-5em) rectangle (6em,6em)}
samer@18	218 \newcommand\colsep{\ }
samer@18	219 \newcommand\clipin[1]{\clip (#1) \circo;}%
samer@18	220 \newcommand\clipout[1]{\clip \bound (#1) \circo;}%
samer@18	221 \newcommand\cliptwo[3]{%
samer@18	222 \begin{scope}
samer@18	223 \clipin{#1};
samer@18	224 \clipin{#2};
samer@18	225 \clipout{#3};
samer@18	226 \fill[black!30] \bound;
samer@18	227 \end{scope}
samer@18	228 }%
samer@18	229 \newcommand\clipone[3]{%
samer@18	230 \begin{scope}
samer@18	231 \clipin{#1};
samer@18	232 \clipout{#2};
samer@18	233 \clipout{#3};
samer@18	234 \fill[black!15] \bound;
samer@18	235 \end{scope}
samer@18	236 }%
samer@18	237 \begin{tabular}{c@{\colsep}c}
samer@18	238 \begin{tikzpicture}[baseline=0pt]
samer@18	239 \coordinate (p1) at (90:\rad);
samer@18	240 \coordinate (p2) at (210:\rad);
samer@18	241 \coordinate (p3) at (-30:\rad);
samer@18	242 \clipone{p1}{p2}{p3};
samer@18	243 \clipone{p2}{p3}{p1};
samer@18	244 \clipone{p3}{p1}{p2};
samer@18	245 \cliptwo{p1}{p2}{p3};
samer@18	246 \cliptwo{p2}{p3}{p1};
samer@18	247 \cliptwo{p3}{p1}{p2};
samer@18	248 \begin{scope}
samer@18	249 \clip (p1) \circo;
samer@18	250 \clip (p2) \circo;
samer@18	251 \clip (p3) \circo;
samer@18	252 \fill[black!45] \bound;
samer@18	253 \end{scope}
samer@18	254 \draw (p1) \circo;
samer@18	255 \draw (p2) \circo;
samer@18	256 \draw (p3) \circo;
samer@18	257 \path
samer@18	258 (barycentric cs:p3=1,p1=-0.2,p2=-0.1) +(0ex,0) node {$I_{3\|12}$}
samer@18	259 (barycentric cs:p1=1,p2=-0.2,p3=-0.1) +(0ex,0) node {$I_{1\|23}$}
samer@18	260 (barycentric cs:p2=1,p3=-0.2,p1=-0.1) +(0ex,0) node {$I_{2\|13}$}
samer@18	261 (barycentric cs:p3=1,p2=1,p1=-0.55) +(0ex,0) node {$I_{23\|1}$}
samer@18	262 (barycentric cs:p1=1,p3=1,p2=-0.55) +(0ex,0) node {$I_{13\|2}$}
samer@18	263 (barycentric cs:p2=1,p1=1,p3=-0.55) +(0ex,0) node {$I_{12\|3}$}
samer@18	264 (barycentric cs:p3=1,p2=1,p1=1) node {$I_{123}$}
samer@18	265 ;
samer@18	266 \path
samer@18	267 (p1) +(140:\labrad) node {$X_1$}
samer@18	268 (p2) +(-140:\labrad) node {$X_2$}
samer@18	269 (p3) +(-40:\labrad) node {$X_3$};
samer@18	270 \end{tikzpicture}
samer@18	271 &
samer@18	272 \parbox{0.5\linewidth}{
samer@18	273 \small
samer@18	274 \begin{align*}
samer@18	275 I_{1\|23} &= H(X_1\|X_2,X_3) \\
samer@18	276 I_{13\|2} &= I(X_1;X_3\|X_2) \\
samer@18	277 I_{1\|23} + I_{13\|2} &= H(X_1\|X_2) \\
samer@18	278 I_{12\|3} + I_{123} &= I(X_1;X_2)
samer@18	279 \end{align*}
samer@18	280 }
samer@18	281 \end{tabular}
samer@18	282 \caption{
samer@18	283 Venn diagram visualisation of entropies and mutual informations
samer@18	284 for three random variables $X_1$, $X_2$ and $X_3$. The areas of
samer@18	285 the three circles represent $H(X_1)$, $H(X_2)$ and $H(X_3)$ respectively.
samer@18	286 The total shaded area is the joint entropy $H(X_1,X_2,X_3)$.
samer@18	287 The central area $I_{123}$ is the co-information \cite{McGill1954}.
samer@18	288 Some other information measures are indicated in the legend.
samer@18	289 }
samer@18	290 \end{fig}
samer@18	291 [Adopting notation of recent Binding information paper.]
samer@18	292 \subsection{'Anatomy of a bit' stuff}
samer@18	293 Entropy rates, redundancy, predictive information etc.
samer@18	294 Information diagrams.
samer@18	295
samer@18	296 \begin{fig}{predinfo-bg}
samer@18	297 \newcommand\subfig[2]{\shortstack{#2\\[0.75em]#1}}
samer@18	298 \newcommand\rad{1.8em}%
samer@18	299 \newcommand\ovoid[1]{%
samer@18	300 ++(-#1,\rad)
samer@18	301 -- ++(2 * #1,0em) arc (90:-90:\rad)
samer@18	302 -- ++(-2 * #1,0em) arc (270:90:\rad)
samer@18	303 }%
samer@18	304 \newcommand\axis{2.75em}%
samer@18	305 \newcommand\olap{0.85em}%
samer@18	306 \newcommand\offs{3.6em}
samer@18	307 \newcommand\colsep{\hspace{5em}}
samer@18	308 \newcommand\longblob{\ovoid{\axis}}
samer@18	309 \newcommand\shortblob{\ovoid{1.75em}}
samer@18	310 \begin{tabular}{c@{\colsep}c}
samer@18	311 \subfig{(a) excess entropy}{%
samer@18	312 \newcommand\blob{\longblob}
samer@18	313 \begin{tikzpicture}
samer@18	314 \coordinate (p1) at (-\offs,0em);
samer@18	315 \coordinate (p2) at (\offs,0em);
samer@18	316 \begin{scope}
samer@18	317 \clip (p1) \blob;
samer@18	318 \clip (p2) \blob;
samer@18	319 \fill[lightgray] (-1,-1) rectangle (1,1);
samer@18	320 \end{scope}
samer@18	321 \draw (p1) +(-0.5em,0em) node{\shortstack{infinite\\past}} \blob;
samer@18	322 \draw (p2) +(0.5em,0em) node{\shortstack{infinite\\future}} \blob;
samer@18	323 \path (0,0) node (future) {$E$};
samer@18	324 \path (p1) +(-2em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
samer@18	325 \path (p2) +(2em,\rad) node [anchor=south] {$X_0,\ldots$};
samer@18	326 \end{tikzpicture}%
samer@18	327 }%
samer@18	328 \\[1.25em]
samer@18	329 \subfig{(b) predictive information rate $b_\mu$}{%
samer@18	330 \begin{tikzpicture}%[baseline=-1em]
samer@18	331 \newcommand\rc{2.1em}
samer@18	332 \newcommand\throw{2.5em}
samer@18	333 \coordinate (p1) at (210:1.5em);
samer@18	334 \coordinate (p2) at (90:0.7em);
samer@18	335 \coordinate (p3) at (-30:1.5em);
samer@18	336 \newcommand\bound{(-7em,-2.6em) rectangle (7em,3.0em)}
samer@18	337 \newcommand\present{(p2) circle (\rc)}
samer@18	338 \newcommand\thepast{(p1) ++(-\throw,0) \ovoid{\throw}}
samer@18	339 \newcommand\future{(p3) ++(\throw,0) \ovoid{\throw}}
samer@18	340 \newcommand\fillclipped[2]{%
samer@18	341 \begin{scope}[even odd rule]
samer@18	342 \foreach \thing in {#2} {\clip \thing;}
samer@18	343 \fill[black!#1] \bound;
samer@18	344 \end{scope}%
samer@18	345 }%
samer@18	346 \fillclipped{30}{\present,\future,\bound \thepast}
samer@18	347 \fillclipped{15}{\present,\bound \future,\bound \thepast}
samer@18	348 \draw \future;
samer@18	349 \fillclipped{45}{\present,\thepast}
samer@18	350 \draw \thepast;
samer@18	351 \draw \present;
samer@18	352 \node at (barycentric cs:p2=1,p1=-0.17,p3=-0.17) {$r_\mu$};
samer@18	353 \node at (barycentric cs:p1=-0.4,p2=1.0,p3=1) {$b_\mu$};
samer@18	354 \node at (barycentric cs:p3=0,p2=1,p1=1.2) [shape=rectangle,fill=black!45,inner sep=1pt]{$\rho_\mu$};
samer@18	355 \path (p2) +(140:3em) node {$X_0$};
samer@18	356 % \node at (barycentric cs:p3=0,p2=1,p1=1) {$\rho_\mu$};
samer@18	357 \path (p3) +(3em,0em) node {\shortstack{infinite\\future}};
samer@18	358 \path (p1) +(-3em,0em) node {\shortstack{infinite\\past}};
samer@18	359 \path (p1) +(-4em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
samer@18	360 \path (p3) +(4em,\rad) node [anchor=south] {$X_1,\ldots$};
samer@18	361 \end{tikzpicture}}%
samer@18	362 \\[0.5em]
samer@18	363 \end{tabular}
samer@18	364 \caption{
samer@18	365 Venn diagram representation of several information measures for
samer@18	366 stationary random processes. Each circle or oval represents a random
samer@18	367 variable or sequence of random variables relative to time $t=0$. Overlapped areas
samer@18	368 correspond to various mutual information as in \Figrf{venn-example}.
samer@18	369 In (c), the circle represents the `present'. Its total area is
samer@18	370 $H(X_0)=H(1)=\rho_\mu+r_\mu+b_\mu$, where $\rho_\mu$ is the multi-information
samer@18	371 rate, $r_\mu$ is the residual entropy rate, and $b_\mu$ is the predictive
samer@18	372 information rate. The entropy rate is $h_\mu = r_\mu+b_\mu$.
samer@18	373 }
samer@18	374 \end{fig}
samer@18	375
samer@18	376 \paragraph{Predictive information rate}
samer@18	377 In previous work \cite{AbdallahPlumbley2009}, we introduced
samer@18	378 % examined several
samer@18	379 % information-theoretic measures that could be used to characterise
samer@18	380 % not only random processes (\ie, an ensemble of possible sequences),
samer@18	381 % but also the dynamic progress of specific realisations of such processes.
samer@18	382 % One of these measures was
samer@18	383 %
samer@18	384 the \emph{predictive information rate}
samer@18	385 (PIR), which is the average information
samer@18	386 in one observation about the infinite future given the infinite past.
samer@18	387 If $\past{X}_t=(\ldots,X_{t-2},X_{t-1})$ denotes the variables
samer@18	388 before time $t$,
samer@18	389 and $\fut{X}_t = (X_{t+1},X_{t+2},\ldots)$ denotes
samer@18	390 those after $t$,
samer@18	391 the PIR at time $t$ is defined as a conditional mutual information:
samer@18	392 \begin{equation}
samer@18	393 \label{eq:PIR}
samer@18	394 \IXZ_t \define I(X_t;\fut{X}_t\|\past{X}_t) = H(\fut{X}_t\|\past{X}_t) - H(\fut{X}_t\|X_t,\past{X}_t).
samer@18	395 \end{equation}
samer@18	396 % (The underline/overline notation follows that of \cite[\S 3]{AbdallahPlumbley2009}.)
samer@18	397 % Hence, $\Ix_t$ quantifies the \emph{new}
samer@18	398 % information gained about the future from the observation at time $t$.
samer@18	399 Equation \eqrf{PIR} can be read as the average reduction
samer@18	400 in uncertainty about the future on learning $X_t$, given the past.
samer@18	401 Due to the symmetry of the mutual information, it can also be written
samer@18	402 as
samer@18	403 \begin{equation}
samer@18	404 % \IXZ_t
samer@18	405 I(X_t;\fut{X}_t\|\past{X}_t) = H(X_t\|\past{X}_t) - H(X_t\|\fut{X}_t,\past{X}_t).
samer@18	406 % \label{<++>}
samer@18	407 \end{equation}
samer@18	408 % If $X$ is stationary, then
samer@18	409 Now, in the shift-invariant case, $H(X_t\|\past{X}_t)$
samer@18	410 is the familiar entropy rate $h_\mu$, but $H(X_t\|\fut{X}_t,\past{X}_t)$,
samer@18	411 the conditional entropy of one variable given \emph{all} the others
samer@18	412 in the sequence, future as well as past, is what
samer@18	413 we called the \emph{residual entropy rate} $r_\mu$ in \cite{AbdallahPlumbley2010},
samer@18	414 but was previously identified by Verd{\'u} and Weissman \cite{VerduWeissman2006} as the
samer@18	415 \emph{erasure entropy rate}.
samer@18	416 % It is not expressible in terms of the block entropy function $H(\cdot)$.
samer@18	417 It can be defined as the limit
samer@18	418 \begin{equation}
samer@18	419 \label{eq:residual-entropy-rate}
samer@18	420 r_\mu \define \lim_{N\tends\infty} H(X_{\bet(-N,N)}) - H(X_{\bet(-N,-1)},X_{\bet(1,N)}).
samer@18	421 \end{equation}
samer@18	422 The second term, $H(X_{\bet(1,N)},X_{\bet(-N,-1)})$,
samer@18	423 is the joint entropy of two non-adjacent blocks each of length $N$ with a
samer@18	424 gap between them,
samer@18	425 and cannot be expressed as a function of block entropies alone.
samer@18	426 % In order to associate it with the concept of \emph{binding information} which
samer@18	427 % we will define in \secrf{binding-info}, we
samer@18	428 Thus, the shift-invariant PIR (which we will write as $b_\mu$) is the difference between
samer@18	429 the entropy rate and the erasure entropy rate: $b_\mu = h_\mu - r_\mu$.
samer@18	430 These relationships are illustrated in \Figrf{predinfo-bg}, along with
samer@18	431 several of the information measures we have discussed so far.
samer@18	432
samer@18	433
samer@18	434 \subsection{First order Markov chains}
samer@18	435 These are the simplest non-trivial models to which information dynamics methods
samer@18	436 can be applied. In \cite{AbdallahPlumbley2009} we, showed that the predictive information
samer@18	437 rate can be expressed simply in terms of the entropy rate of the Markov chain.
samer@18	438 If we let $a$ denote the transition matrix of the Markov chain, and $h_a$ it's
samer@18	439 entropy rate, then its predictive information rate $b_a$ is
samer@18	440 \begin{equation}
samer@18	441 b_a = h_{a^2} - h_a,
samer@18	442 \end{equation}
samer@18	443 where $a^2 = aa$, the transition matrix squared, is the transition matrix
samer@18	444 of the `skip one' Markov chain obtained by leaving out every other observation.
samer@18	445
samer@18	446 \subsection{Higher order Markov chains}
samer@18	447 Second and higher order Markov chains can be treated in a similar way by transforming
samer@18	448 to a first order representation of the high order Markov chain. If we are dealing
samer@18	449 with an $N$th order model, this is done forming a new alphabet of possible observations
samer@18	450 consisting of all possible $N$-tuples of symbols from the base alphabet. An observation
samer@18	451 in this new model represents a block of $N$ observations from the base model. The next
samer@18	452 observation represents the block of $N$ obtained by shift the previous block along
samer@18	453 by one step. The new Markov of chain is parameterised by a sparse $K^N\times K^N$
samer@18	454 transition matrix $\hat{a}$.
samer@18	455 \begin{equation}
samer@18	456 b_{\hat{a}} = h_{\hat{a}^{N+1}} - N h_{\hat{a}},
samer@18	457 \end{equation}
samer@18	458 where $\hat{a}^{N+1}$ is the $N+1$th power of the transition matrix.
samer@18	459
samer@9	460
samer@4	461
hekeus@16	462 \section{Information Dynamics in Analysis}
samer@4	463
hekeus@16	464 \subsection{Musicological Analysis}
samer@4	465 refer to the work with the analysis of minimalist pieces
samer@4	466
samer@23	467 \subsection{Content analysis/Sound Categorisation}.
samer@23	468 Using Information Dynamics it is possible to segment music. From there we
samer@23	469 can then use this to search large data sets. Determine musical structure for
samer@23	470 the purpose of playlist navigation and search.
hekeus@16	471 \emph{Peter}
samer@4	472
samer@4	473 \subsection{Beat Tracking}
hekeus@16	474 \emph{Andrew}
samer@4	475
samer@4	476
hekeus@13	477 \section{Information Dynamics as Design Tool}
hekeus@13	478
samer@23	479 In addition to applying information dynamics to analysis, it is also possible
samer@23	480 use this approach in design, such as the composition of musical materials. By
samer@23	481 providing a framework for linking information theoretic measures to the control
samer@23	482 of generative processes, it becomes possible to steer the output of these processes
samer@23	483 to match a criteria defined by these measures. For instance outputs of a
samer@23	484 stochastic musical process could be filtered to match constraints defined by a
samer@23	485 set of information theoretic measures.
hekeus@13	486
samer@23	487 The use of stochastic processes for the generation of musical material has been
samer@23	488 widespread for decades -- Iannis Xenakis applied probabilistic mathematical
samer@23	489 models to the creation of musical materials, including to the formulation of a
samer@23	490 theory of Markovian Stochastic Music. However we can use information dynamics
samer@23	491 measures to explore and interface with such processes at the high and abstract
samer@23	492 level of expectation, randomness and predictability. The Melody Triangle is
samer@23	493 such a system.
hekeus@13	494
samer@23	495 \subsection{The Melody Triangle}
samer@23	496 The Melody Triangle is an exploratory interface for the discovery of melodic
samer@23	497 content, where the input -- positions within a triangle -- directly map to
samer@23	498 information theoretic measures associated with the output.
samer@23	499 The measures are the entropy rate, redundancy and predictive information rate
samer@23	500 of the random process used to generate the sequence of notes.
samer@23	501 These are all related to the predictability of the the sequence and as such
samer@23	502 address the notions of expectation and surprise in the perception of
samer@23	503 music.\emph{self-plagiarised}
hekeus@13	504
samer@23	505 Before the Melody Triangle can used, it has to be `populated' with possible
samer@23	506 parameter values for the melody generators. These are then plotted in a 3d
samer@23	507 statistical space of redundancy, entropy rate and predictive information rate.
samer@23	508 In our case we generated thousands of transition matrixes, representing first-order
samer@23	509 Markov chains, by a random sampling method. In figure \ref{InfoDynEngine} we see
samer@23	510 a representation of how these matrixes are distributed in the 3d statistical
samer@23	511 space; each one of these points corresponds to a transition
samer@23	512 matrix.\emph{self-plagiarised}
hekeus@17	513
hekeus@17	514 \begin{figure}
hekeus@17	515 \centering
samer@21	516 \includegraphics[width=\linewidth]{figs/mtriscat}
samer@21	517 \caption{The population of transition matrices distributed along three axes of
samer@21	518 redundancy, entropy rate and predictive information rate (all measured in bits).
samer@21	519 The concentrations of points along the redundancy axis correspond
samer@21	520 to Markov chains which are roughly periodic with periods of 2 (redundancy 1 bit),
samer@21	521 3, 4, \etc all the way to period 8 (redundancy 3 bits). The colour of each point
samer@21	522 represents its PIR---note that the highest values are found at intermediate entropy
samer@21	523 and redundancy, and that the distribution as a whole makes a curved triangle. Although
samer@21	524 not visible in this plot, it is largely hollow in the middle.
samer@21	525 \label{InfoDynEngine}}
hekeus@17	526 \end{figure}
hekeus@17	527
samer@4	528
samer@23	529 When we look at the distribution of transition matrixes plotted in this space,
samer@23	530 we see that it forms an arch shape that is fairly thin. It thus becomes a
samer@23	531 reasonable approximation to pretend that it is just a sheet in two dimensions;
samer@23	532 and so we stretch out this curved arc into a flat triangle. It is this triangular
samer@23	533 sheet that is our `Melody Triangle' and forms the interface by which the system
samer@23	534 is controlled. \emph{self-plagiarised}
samer@4	535
samer@23	536 When the Melody Triangle is used, regardless of whether it is as a screen based
samer@23	537 system, or as an interactive installation, it involves a mapping to this statistical
samer@23	538 space. When the user, through the interface, selects a position within the
samer@23	539 triangle, the corresponding transition matrix is returned. Figure \ref{TheTriangle}
samer@23	540 shows how the triangle maps to different measures of redundancy, entropy rate
samer@23	541 and predictive information rate.\emph{self-plagiarised}
hekeus@17	542 \begin{figure}
hekeus@17	543 \centering
samer@21	544 \includegraphics[width=\linewidth]{figs/TheTriangle.pdf}
hekeus@17	545 \caption{The Melody Triangle\label{TheTriangle}}
hekeus@17	546 \end{figure}
samer@23	547 Each corner corresponds to three different extremes of predictability and
samer@23	548 unpredictability, which could be loosely characterised as `periodicity', `noise'
samer@23	549 and `repetition'. Melodies from the `noise' corner have no discernible pattern;
samer@23	550 they have high entropy rate, low predictive information rate and low redundancy.
samer@23	551 These melodies are essentially totally random. A melody along the `periodicity'
samer@23	552 to `repetition' edge are all deterministic loops that get shorter as we approach
samer@23	553 the `repetition' corner, until it becomes just one repeating note. It is the
samer@23	554 areas in between the extremes that provide the more `interesting' melodies. That
samer@23	555 is, those that have some level of unpredictability, but are not completely ran-
samer@23	556 dom. Or, conversely, that are predictable, but not entirely so. This triangular
samer@23	557 space allows for an intuitive explorationof expectation and surprise in temporal
samer@23	558 sequences based on a simple model of how one might guess the next event given
samer@23	559 the previous one.\emph{self-plagiarised}
samer@23	560
samer@4	561
samer@23	562
samer@23	563 Any number of interfaces could be developed for the Melody Triangle. We have
samer@23	564 developed two; a standard screen based interface where a user moves tokens with
samer@23	565 a mouse in and around a triangle on screen, and a multi-user interactive
samer@23	566 installation where a Kinect camera tracks individuals in a space and maps their
samer@23	567 positions in the space to the triangle.
samer@23	568 Each visitor would generate a melody, and could collaborate with their co-visitors
samer@23	569 to generate musical textures -- a playful yet informative way to explore
samer@23	570 expectation and surprise in music.
samer@23	571
samer@23	572 As a screen based interface the Melody Triangle can serve as composition tool.
samer@23	573 A triangle is drawn on the screen, screen space thus mapped to the statistical
samer@23	574 space of the Melody Triangle.
samer@23	575 A number of round tokens, each representing a melody can be dragged in and
samer@23	576 around the triangle. When a token is dragged into the triangle, the system
samer@23	577 will start generating the sequence of notes with statistical properties that
samer@23	578 correspond to its position in the triangle.\emph{self-plagiarised}
samer@23	579
samer@23	580 In this mode, the Melody Triangle can be used as a kind of composition assistant
samer@23	581 for the generation of interesting musical textures and melodies. However unlike
samer@23	582 other computer aided composition tools or programming environments, here the
samer@23	583 composer engages with music on the high and abstract level of expectation,
samer@23	584 randomness and predictability.\emph{self-plagiarised}
samer@23	585
hekeus@13	586
hekeus@13	587 Additionally the Melody Triangle serves as an effective tool for experimental investigations into musical preference and their relationship to the information dynamics models.
samer@4	588
hekeus@13	589 %As the Melody Triangle essentially operates on a stream of symbols, it it is possible to apply the melody triangle to the design of non-sonic content.
hekeus@13	590
hekeus@13	591 \section{Musical Preference and Information Dynamics}
samer@23	592 We carried out a preliminary study that sought to identify any correlation between
samer@23	593 aesthetic preference and the information theoretical measures of the Melody
samer@23	594 Triangle. In this study participants were asked to use the screen based interface
samer@23	595 but it was simplified so that all they could do was move tokens around. To help
samer@23	596 discount visual biases, the axes of the triangle would be randomly rearranged
samer@23	597 for each participant.\emph{self-plagiarised}
hekeus@16	598
samer@23	599 The study was divided in to two parts, the first investigated musical preference
samer@23	600 with respect to single melodies at different tempos. In the second part of the
samer@23	601 study, a background melody is playing and the participants are asked to continue
samer@23	602 playing with the system under the implicit assumption that they will try to find
samer@23	603 a second melody that works well with the background melody. For each participant
samer@23	604 this was done four times, each with a different background melody from four
samer@23	605 different areas of the Melody Triangle. For all parts of the study the participants
samer@23	606 were asked to signal, by pressing the space bar, whenever they liked what they
samer@23	607 were hearing.\emph{self-plagiarised}
samer@4	608
hekeus@13	609 \emph{todo - results}
samer@4	610
hekeus@13	611 \section{Information Dynamics as Evaluative Feedback Mechanism}
hekeus@13	612
hekeus@13	613 \emph{todo - code the info dyn evaluator :) }
samer@4	614
samer@23	615 It is possible to use information dynamics measures to develop a kind of `critic'
samer@23	616 that would evaluate a stream of symbols. For instance we could develop a system
samer@23	617 to notify us if a stream of symbols is too boring, either because they are too
samer@23	618 repetitive or too chaotic. This could be used to evaluate both pre-composed
samer@23	619 streams of symbols, or could even be used to provide real-time feedback in an
samer@23	620 improvisatory setup.
hekeus@13	621
samer@23	622 \emph{comparable system} Gordon Pask's Musicolor (1953) applied a similar notion
samer@23	623 of boredom in its design. The Musicolour would react to audio input through a
samer@23	624 microphone by flashing coloured lights. Rather than a direct mapping of sound
samer@23	625 to light, Pask designed the device to be a partner to a performing musician. It
samer@23	626 would adapt its lighting pattern based on the rhythms and frequencies it would
samer@23	627 hear, quickly `learning' to flash in time with the music. However Pask endowed
samer@23	628 the device with the ability to `be bored'; if the rhythmic and frequency content
samer@23	629 of the input remained the same for too long it would listen for other rhythms
samer@23	630 and frequencies, only lighting when it heard these. As the Musicolour would
samer@23	631 `get bored', the musician would have to change and vary their playing, eliciting
samer@23	632 new and unexpected outputs in trying to keep the Musicolour interested.
samer@4	633
samer@23	634 In a similar vein, our \emph{Information Dynamics Critic}(name?) allows for an
samer@23	635 evaluative measure of an input stream, however containing a more sophisticated
samer@23	636 notion of boredom that \dots
samer@23	637
hekeus@13	638
hekeus@13	639
hekeus@13	640
samer@4	641 \section{Conclusion}
samer@4	642
samer@9	643 \bibliographystyle{unsrt}
hekeus@16	644 {\bibliography{all,c4dm,nime}}
samer@4	645 \end{document}

Mercurial > hg > cip2012

annotate draft.tex @ 23:f9a67e19a66b