\documentclass[conference,a4paper]{IEEEtran}
\usepackage{cite}
\usepackage[cmex10]{amsmath}
\usepackage{graphicx}
\usepackage{amssymb}
\usepackage{epstopdf}
\usepackage{url}
\usepackage{listings}
%\usepackage[expectangle]{tools}
\usepackage{tools}
\usepackage{tikz}
\usetikzlibrary{calc}
\usetikzlibrary{matrix}
\usetikzlibrary{patterns}
\usetikzlibrary{arrows}

\let\citep=\cite
\newcommand{\colfig}[2][1]{\includegraphics[width=#1\linewidth]{ifigs/#2}}%
\newcommand\preals{\reals_+}
\newcommand\X{\mathcal{X}}
\newcommand\Y{\mathcal{Y}}
\newcommand\domS{\mathcal{S}}
\newcommand\A{\mathcal{A}}
\newcommand\rvm[1]{\mathrm{#1}}
\newcommand\sps{\,.\,}
\newcommand\Ipred{\mathcal{I}_{\mathrm{pred}}}
\newcommand\Ix{\mathcal{I}}
\newcommand\IXZ{\overline{\underline{\mathcal{I}}}}
\newcommand\x{\vec{x}}
\newcommand\Ham[1]{\mathcal{H}_{#1}}
\newcommand\subsets[2]{[#1]^{(k)}}
\def\bet(#1,#2){#1..#2}


\def\ev(#1=#2){#1\!\!=\!#2}
\newcommand\rv[1]{\Omega \to #1}
\newcommand\ceq{\!\!=\!}
\newcommand\cmin{\!-\!}
\newcommand\modulo[2]{#1\!\!\!\!\!\mod#2}

\newcommand\sumitoN{\sum_{i=1}^N}
\newcommand\sumktoK{\sum_{k=1}^K}
\newcommand\sumjtoK{\sum_{j=1}^K}
\newcommand\sumalpha{\sum_{\alpha\in\A}}
\newcommand\prodktoK{\prod_{k=1}^K}
\newcommand\prodjtoK{\prod_{j=1}^K}

\newcommand\past[1]{\overset{\rule{0pt}{0.2em}\smash{\leftarrow}}{#1}}
\newcommand\fut[1]{\overset{\rule{0pt}{0.1em}\smash{\rightarrow}}{#1}}
\newcommand\parity[2]{P^{#1}_{2,#2}}

%\usepackage[parfill]{parskip}

\begin{document}
\title{Cognitive Music Modelling: an Information Dynamics Approach}

\author{
\IEEEauthorblockN{Samer A. Abdallah, Henrik Ekeus, Peter Foster}
\IEEEauthorblockN{Andrew Robertson and Mark D. Plumbley}
\IEEEauthorblockA{Centre for Digital Music\\
Queen Mary University of London\\
Mile End Road, London E1 4NS\\
Email:}}

\maketitle
\begin{abstract}
When perceiving music, people continually take in information and use it to
build predictive models of what is going to happen next. Information-theoretic
measures of these models relate closely to how we perceive music, so an
information-theoretic approach to music cognition is a fruitful avenue of
research. In this paper, we review the theoretical foundations of information
dynamics and discuss a few emerging areas of application.
\end{abstract}


\section{Expectation and surprise in music}
\label{s:Intro}

One of the effects of listening to music is to create
expectations of what is to come next, which may be fulfilled
immediately, after some delay, or not at all as the case may be.
This is the thesis put forward by, amongst others, music theorists
L. B. Meyer \cite{Meyer67} and Narmour \citep{Narmour77}, but it was
recognised much earlier; for example,
it was elegantly put by Hanslick \cite{Hanslick1854} in the
nineteenth century:
\begin{quote}
`The most important factor in the mental process which accompanies the
act of listening to music, and which converts it to a source of pleasure,
is \ldots the intellectual satisfaction
which the listener derives from continually following and anticipating
the composer's intentions---now, to see his expectations fulfilled, and
now, to find himself agreeably mistaken.'
%It is a matter of course that
%this intellectual flux and reflux, this perpetual giving and receiving
%takes place unconsciously, and with the rapidity of lightning-flashes.'
\end{quote}
An essential aspect of this is that music is experienced as a phenomenon
that `unfolds' in time, rather than being apprehended as a static object
presented in its entirety. Meyer argued that musical experience depends
on how we change and revise our conceptions \emph{as events happen}, on
how expectation and prediction interact with occurrence, and that, to a
large degree, the way to understand the effect of music is to focus on
this `kinetics' of expectation and surprise.

The business of making predictions and assessing surprise is essentially
one of reasoning under conditions of uncertainty and manipulating
degrees of belief about the various propositions which may or may not
hold, and, as has been argued elsewhere \cite{Cox1946,Jaynes27}, is best
quantified in terms of Bayesian probability theory.
Thus, we suppose that
when we listen to music, expectations are created on the basis of our
familiarity with various stylistic norms that apply to music in general,
the particular style (or styles) of music that seem best to fit the piece
we are listening to, and
the emerging structures peculiar to the current piece. There is
experimental evidence that human listeners are able to internalise
statistical knowledge about musical structure, \eg
\citep{SaffranJohnsonAslin1999,EerolaToiviainenKrumhansl2002}, and also
that statistical models can form an effective basis for computational
analysis of music, \eg
\cite{ConklinWitten95,PonsfordWigginsMellish1999,Pearce2005}.

\subsection{Music and information theory}
With a probabilistic framework for music modelling and prediction in hand,
we are in a position to apply quantitative information theory \cite{Shannon48}.
The relationship between information theory and music and art in general has been the
subject of some interest since the 1950s
\cite{Youngblood58,CoonsKraehenbuehl1958,HillerBean66,Moles66,Meyer67,Cohen1962}.
The general thesis is that perceptible qualities and subjective
states like uncertainty, surprise, complexity, tension, and interestingness
are closely related to
information-theoretic quantities like entropy, relative entropy,
and mutual information.
% and are major determinants of the overall experience.
Berlyne \cite{Berlyne71} called such quantities `collative variables', since
they are to do with patterns of occurrence rather than medium-specific details,
and developed the ideas of `information aesthetics' in an experimental setting.
% Berlyne's `new experimental aesthetics', the `information-aestheticians'.

% Listeners then experience greater or lesser levels of surprise
% in response to departures from these norms.
% By careful manipulation
% of the material, the composer can thus define, and induce within the
% listener, a temporal programme of varying
% levels of uncertainty, ambiguity and surprise.


Previous work in this area \cite{Berlyne74} treated the various
information-theoretic quantities
such as entropy as if they were intrinsic properties of the stimulus---subjects
were presented with a sequence of tones with `high entropy', or a visual pattern
with `low entropy'. These values were determined from some known `objective'
probability model of the stimuli,%
\footnote{%
The notion of objective probabilities and whether or not they can
usefully be said to exist is the subject of some debate, with advocates of
subjective probabilities including de Finetti \cite{deFinetti}.}
or from simple statistical analyses such as
computing empirical distributions. Our approach is explicitly to consider the role
of the observer in perception, and more specifically, to consider estimates of
entropy \etc with respect to \emph{subjective} probabilities.
\subsection{Information dynamic approach}

Bringing the various strands together, our working hypothesis is that as a
listener (to which we will refer as `it') listens to a piece of music, it maintains
a dynamically evolving statistical model that enables it to make predictions
about how the piece will continue, relying on both its previous experience
of music and the immediate context of the piece. As events unfold, it revises
its model and hence its probabilistic belief state, which includes predictive
distributions over future observations. These distributions and changes in
distributions can be characterised in terms of a handful of information-theoretic
measures such as entropy and relative entropy. By tracing the
evolution of these measures, we obtain a representation which captures much
of the significant structure of the music, but does so at a high level of
\emph{abstraction}, since it is sensitive mainly to \emph{patterns} of occurrence,
rather than the details of which specific things occur or even the sensory modality
through which they are detected. This suggests that the
same approach could, in principle, be used to analyse and compare information
flow in different temporal media regardless of whether they are auditory,
visual or otherwise.

In addition, the information dynamic approach gives us a principled way
to address the notion of \emph{subjectivity}, since the analysis is dependent on the
probability model the observer starts off with, which may depend on prior experience
or other factors, and which may change over time. Thus, inter-subject variability and
variation in subjects' responses over time are
fundamental to the theory.

%modelling the creative process, which often alternates between generative
%and selective or evaluative phases \cite{Boden1990}, and would have
%applications in tools for computer aided composition.


\section{Theoretical review}

\subsection{Entropy and information in sequences}
In this section, we summarise the definitions of some of the relevant quantities
in information dynamics and show how they can be computed in some simple probabilistic
models (namely, first and higher-order Markov chains, and Gaussian processes [Peter?]).

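For reference, the entropies and mutual informations used below are the standard
ones: for discrete random variables $X$, $Y$ and $Z$,
\begin{align*}
H(X) &= -\sum_x p(x) \log p(x), &
H(X|Y) &= -\sum_{x,y} p(x,y) \log p(x|y), \\
I(X;Y) &= H(X) - H(X|Y), &
I(X;Y|Z) &= H(X|Z) - H(X|Y,Z),
\end{align*}
where $p$ denotes the relevant (joint) probability distribution. These are the
quantities visualised in \Figrf{venn-example}.
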
\begin{fig}{venn-example}
\newcommand\rad{2.2em}%
\newcommand\circo{circle (3.4em)}%
\newcommand\labrad{4.3em}
\newcommand\bound{(-6em,-5em) rectangle (6em,6em)}
\newcommand\colsep{\ }
\newcommand\clipin[1]{\clip (#1) \circo;}%
\newcommand\clipout[1]{\clip \bound (#1) \circo;}%
\newcommand\cliptwo[3]{%
\begin{scope}
\clipin{#1};
\clipin{#2};
\clipout{#3};
\fill[black!30] \bound;
\end{scope}
}%
\newcommand\clipone[3]{%
\begin{scope}
\clipin{#1};
\clipout{#2};
\clipout{#3};
\fill[black!15] \bound;
\end{scope}
}%
\begin{tabular}{c@{\colsep}c}
\begin{tikzpicture}[baseline=0pt]
\coordinate (p1) at (90:\rad);
\coordinate (p2) at (210:\rad);
\coordinate (p3) at (-30:\rad);
\clipone{p1}{p2}{p3};
\clipone{p2}{p3}{p1};
\clipone{p3}{p1}{p2};
\cliptwo{p1}{p2}{p3};
\cliptwo{p2}{p3}{p1};
\cliptwo{p3}{p1}{p2};
\begin{scope}
\clip (p1) \circo;
\clip (p2) \circo;
\clip (p3) \circo;
\fill[black!45] \bound;
\end{scope}
\draw (p1) \circo;
\draw (p2) \circo;
\draw (p3) \circo;
\path
(barycentric cs:p3=1,p1=-0.2,p2=-0.1) +(0ex,0) node {$I_{3|12}$}
(barycentric cs:p1=1,p2=-0.2,p3=-0.1) +(0ex,0) node {$I_{1|23}$}
(barycentric cs:p2=1,p3=-0.2,p1=-0.1) +(0ex,0) node {$I_{2|13}$}
(barycentric cs:p3=1,p2=1,p1=-0.55) +(0ex,0) node {$I_{23|1}$}
(barycentric cs:p1=1,p3=1,p2=-0.55) +(0ex,0) node {$I_{13|2}$}
(barycentric cs:p2=1,p1=1,p3=-0.55) +(0ex,0) node {$I_{12|3}$}
(barycentric cs:p3=1,p2=1,p1=1) node {$I_{123}$}
;
\path
(p1) +(140:\labrad) node {$X_1$}
(p2) +(-140:\labrad) node {$X_2$}
(p3) +(-40:\labrad) node {$X_3$};
\end{tikzpicture}
&
\parbox{0.5\linewidth}{
\small
\begin{align*}
I_{1|23} &= H(X_1|X_2,X_3) \\
I_{13|2} &= I(X_1;X_3|X_2) \\
I_{1|23} + I_{13|2} &= H(X_1|X_2) \\
I_{12|3} + I_{123} &= I(X_1;X_2)
\end{align*}
}
\end{tabular}
\caption{
Information diagram visualisation of entropies and mutual informations
for three random variables $X_1$, $X_2$ and $X_3$. The areas of
the three circles represent $H(X_1)$, $H(X_2)$ and $H(X_3)$ respectively.
The total shaded area is the joint entropy $H(X_1,X_2,X_3)$.
The central area $I_{123}$ is the co-information \cite{McGill1954}.
Some other information measures are indicated in the legend.
}
\end{fig}
[Adopting notation of recent Binding information paper.]
\subsection{`Anatomy of a bit' stuff}
Here we consider entropy rates, redundancy and predictive information,
and their representation in information diagrams.

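These quantities can be defined for a stationary process as limits of block
entropies and informations; the following standard definitions (included here
for reference, not as new material) fix the notation used in \Figrf{predinfo-bg}:
\begin{align*}
h_\mu &= \lim_{N\tends\infty} H(X_0|X_{-N},\ldots,X_{-1}), \\
\rho_\mu &= \lim_{N\tends\infty} I(X_0;X_{-N},\ldots,X_{-1}) = H(X_0) - h_\mu, \\
E &= \lim_{N\tends\infty} I(X_{-N},\ldots,X_{-1};X_0,\ldots,X_{N-1}),
\end{align*}
where $h_\mu$ is the entropy rate, $\rho_\mu$ the multi-information (redundancy)
rate, and $E$ the excess entropy.
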
\begin{fig}{predinfo-bg}
\newcommand\subfig[2]{\shortstack{#2\\[0.75em]#1}}
\newcommand\rad{1.8em}%
\newcommand\ovoid[1]{%
++(-#1,\rad)
-- ++(2 * #1,0em) arc (90:-90:\rad)
-- ++(-2 * #1,0em) arc (270:90:\rad)
}%
\newcommand\axis{2.75em}%
\newcommand\olap{0.85em}%
\newcommand\offs{3.6em}
\newcommand\colsep{\hspace{5em}}
\newcommand\longblob{\ovoid{\axis}}
\newcommand\shortblob{\ovoid{1.75em}}
\begin{tabular}{c@{\colsep}c}
\subfig{(a) excess entropy}{%
\newcommand\blob{\longblob}
\begin{tikzpicture}
\coordinate (p1) at (-\offs,0em);
\coordinate (p2) at (\offs,0em);
\begin{scope}
\clip (p1) \blob;
\clip (p2) \blob;
\fill[lightgray] (-1,-1) rectangle (1,1);
\end{scope}
\draw (p1) +(-0.5em,0em) node{\shortstack{infinite\\past}} \blob;
\draw (p2) +(0.5em,0em) node{\shortstack{infinite\\future}} \blob;
\path (0,0) node (future) {$E$};
\path (p1) +(-2em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
\path (p2) +(2em,\rad) node [anchor=south] {$X_0,\ldots$};
\end{tikzpicture}%
}%
\\[1.25em]
\subfig{(b) predictive information rate $b_\mu$}{%
\begin{tikzpicture}%[baseline=-1em]
\newcommand\rc{2.1em}
\newcommand\throw{2.5em}
\coordinate (p1) at (210:1.5em);
\coordinate (p2) at (90:0.7em);
\coordinate (p3) at (-30:1.5em);
\newcommand\bound{(-7em,-2.6em) rectangle (7em,3.0em)}
\newcommand\present{(p2) circle (\rc)}
\newcommand\thepast{(p1) ++(-\throw,0) \ovoid{\throw}}
\newcommand\future{(p3) ++(\throw,0) \ovoid{\throw}}
\newcommand\fillclipped[2]{%
\begin{scope}[even odd rule]
\foreach \thing in {#2} {\clip \thing;}
\fill[black!#1] \bound;
\end{scope}%
}%
\fillclipped{30}{\present,\future,\bound \thepast}
\fillclipped{15}{\present,\bound \future,\bound \thepast}
\draw \future;
\fillclipped{45}{\present,\thepast}
\draw \thepast;
\draw \present;
\node at (barycentric cs:p2=1,p1=-0.17,p3=-0.17) {$r_\mu$};
\node at (barycentric cs:p1=-0.4,p2=1.0,p3=1) {$b_\mu$};
\node at (barycentric cs:p3=0,p2=1,p1=1.2) [shape=rectangle,fill=black!45,inner sep=1pt]{$\rho_\mu$};
\path (p2) +(140:3em) node {$X_0$};
% \node at (barycentric cs:p3=0,p2=1,p1=1) {$\rho_\mu$};
\path (p3) +(3em,0em) node {\shortstack{infinite\\future}};
\path (p1) +(-3em,0em) node {\shortstack{infinite\\past}};
\path (p1) +(-4em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
\path (p3) +(4em,\rad) node [anchor=south] {$X_1,\ldots$};
\end{tikzpicture}}%
\\[0.5em]
\end{tabular}
\caption{
Venn diagram representation of several information measures for
stationary random processes. Each circle or oval represents a random
variable or sequence of random variables relative to time $t=0$. Overlapped areas
correspond to various mutual informations as in \Figrf{venn-example}.
In (b), the circle represents the `present'. Its total area is
$H(X_0)=H(1)=\rho_\mu+r_\mu+b_\mu$, where $\rho_\mu$ is the multi-information
rate, $r_\mu$ is the residual entropy rate, and $b_\mu$ is the predictive
information rate. The entropy rate is $h_\mu = r_\mu+b_\mu$.
}
\end{fig}

\paragraph{Predictive information rate}
In previous work \cite{AbdallahPlumbley2009}, we introduced
% examined several
% information-theoretic measures that could be used to characterise
% not only random processes (\ie, an ensemble of possible sequences),
% but also the dynamic progress of specific realisations of such processes.
% One of these measures was
%
the \emph{predictive information rate}
(PIR), which is the average information
in one observation about the infinite future given the infinite past.
If $\past{X}_t=(\ldots,X_{t-2},X_{t-1})$ denotes the variables
before time $t$,
and $\fut{X}_t = (X_{t+1},X_{t+2},\ldots)$ denotes
those after $t$,
the PIR at time $t$ is defined as a conditional mutual information:
\begin{equation}
\label{eq:PIR}
\IXZ_t \define I(X_t;\fut{X}_t|\past{X}_t) = H(\fut{X}_t|\past{X}_t) - H(\fut{X}_t|X_t,\past{X}_t).
\end{equation}
% (The underline/overline notation follows that of \cite[\S 3]{AbdallahPlumbley2009}.)
% Hence, $\Ix_t$ quantifies the \emph{new}
% information gained about the future from the observation at time $t$.
Equation \eqrf{PIR} can be read as the average reduction
in uncertainty about the future on learning $X_t$, given the past.
Due to the symmetry of the mutual information, it can also be written
as
\begin{equation}
% \IXZ_t
I(X_t;\fut{X}_t|\past{X}_t) = H(X_t|\past{X}_t) - H(X_t|\fut{X}_t,\past{X}_t).
% \label{<++>}
\end{equation}
% If $X$ is stationary, then
Now, in the shift-invariant case, $H(X_t|\past{X}_t)$
is the familiar entropy rate $h_\mu$, but $H(X_t|\fut{X}_t,\past{X}_t)$,
the conditional entropy of one variable given \emph{all} the others
in the sequence, future as well as past, is what
we called the \emph{residual entropy rate} $r_\mu$ in \cite{AbdallahPlumbley2010};
it was previously identified by Verd{\'u} and Weissman \cite{VerduWeissman2006} as the
\emph{erasure entropy rate}.
% It is not expressible in terms of the block entropy function $H(\cdot)$.
It can be defined as the limit
\begin{equation}
\label{eq:residual-entropy-rate}
r_\mu \define \lim_{N\tends\infty} H(X_{\bet(-N,N)}) - H(X_{\bet(-N,-1)},X_{\bet(1,N)}).
\end{equation}
The second term, $H(X_{\bet(1,N)},X_{\bet(-N,-1)})$,
is the joint entropy of two non-adjacent blocks each of length $N$ with a
gap between them,
and cannot be expressed as a function of block entropies alone.
% In order to associate it with the concept of \emph{binding information} which
% we will define in \secrf{binding-info}, we
Thus, the shift-invariant PIR (which we will write as $b_\mu$) is the difference between
the entropy rate and the erasure entropy rate: $b_\mu = h_\mu - r_\mu$.
These relationships are illustrated in \Figrf{predinfo-bg}, along with
several of the information measures we have discussed so far.


\begin{fig}{wundt}
\raisebox{-4em}{\colfig[0.43]{wundt}}
% {\ \shortstack{{\Large$\longrightarrow$}\\ {\scriptsize\emph{exposure}}}\ }
{\ {\large$\longrightarrow$}\ }
\raisebox{-4em}{\colfig[0.43]{wundt2}}
\caption{
The Wundt curve relating randomness/complexity with
perceived value. Repeated exposure sometimes results
in a move to the left along the curve \cite{Berlyne71}.
}
\end{fig}


\subsection{First order Markov chains}
These are the simplest non-trivial models to which information dynamics methods
can be applied. In \cite{AbdallahPlumbley2009} we showed that the predictive information
rate can be expressed simply in terms of entropy rates associated with the Markov chain.
If we let $a$ denote the transition matrix of the Markov chain, and $h_a$ its
entropy rate, then its predictive information rate $b_a$ is
\begin{equation}
b_a = h_{a^2} - h_a,
\end{equation}
where $a^2 = aa$, the transition matrix squared, is the transition matrix
of the `skip one' Markov chain obtained by leaving out every other observation.
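As a concrete illustration (our sketch, not code from \cite{AbdallahPlumbley2009};
it assumes a row-stochastic transition matrix and an ergodic chain), $h_a$ and
$b_a$ can be computed as follows:
\begin{lstlisting}[language=Python,basicstyle=\small\ttfamily]
import numpy as np

def stationary(a):
    # stationary distribution of a row-stochastic matrix
    # (ergodic chain assumed)
    vals, vecs = np.linalg.eig(a.T)
    v = np.abs(np.real(vecs[:, np.argmin(np.abs(vals - 1.0))]))
    return v / v.sum()

def entropy_rate(a):
    # h_a in bits: stationary average of per-state transition entropies
    p = stationary(a)
    logs = np.log2(np.where(a > 0, a, 1.0))   # 0 log 0 treated as 0
    return float(-(p[:, None] * a * logs).sum())

def pir(a):
    # predictive information rate b_a = h_{a^2} - h_a
    return entropy_rate(a @ a) - entropy_rate(a)
\end{lstlisting}
Both a near-deterministic cycle and a uniform (IID) transition matrix give
$b_a \approx 0$; intermediate matrices give positive values.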

\subsection{Higher order Markov chains}
Second and higher order Markov chains can be treated in a similar way by transforming
to a first order representation of the high order Markov chain. If we are dealing
with an $N$th order model, this is done by forming a new alphabet of possible observations
consisting of all possible $N$-tuples of symbols from the base alphabet. An observation
in this new model represents a block of $N$ observations from the base model. The next
observation represents the block of $N$ observations obtained by shifting the previous block along
by one step. The new Markov chain is parameterised by a sparse $K^N\times K^N$
transition matrix $\hat{a}$. The predictive information rate of the higher order chain is then
\begin{equation}
b_{\hat{a}} = h_{\hat{a}^{N+1}} - N h_{\hat{a}},
\end{equation}
where $\hat{a}^{N+1}$ is the $(N+1)$th power of the transition matrix.
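For illustration (again our sketch, with a hypothetical input format:
\texttt{p\_next} maps each length-$N$ context tuple to a distribution over the
next symbol), the first-order embedding described above can be built as:
\begin{lstlisting}[language=Python,basicstyle=\small\ttfamily]
import itertools
import numpy as np

def embed(p_next, K, N):
    # first-order transition matrix over N-tuples of a K-symbol alphabet
    states = list(itertools.product(range(K), repeat=N))
    index = {s: i for i, s in enumerate(states)}
    a_hat = np.zeros((K**N, K**N))
    for s in states:
        for x in range(K):
            t = s[1:] + (x,)   # shift the block along by one step
            a_hat[index[s], index[t]] = p_next[s][x]
    return a_hat               # at most K non-zero entries per row

# b = entropy_rate(np.linalg.matrix_power(a_hat, N + 1)) \
#     - N * entropy_rate(a_hat)
\end{lstlisting}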



\section{Information Dynamics in Analysis}

\subsection{Musicological Analysis}
refer to the work with the analysis of minimalist pieces

\begin{fig}{twopages}
% \colfig[0.96]{matbase/fig9471} % update from mbc paper
\colfig[0.97]{matbase/fig72663}\\ % later update from mbc paper (Keith's new picks)
\vspace*{1em}
\colfig[0.97]{matbase/fig13377} % rule based analysis
\caption{Analysis of \emph{Two Pages}.
The thick vertical lines are the part boundaries as indicated in
the score by the composer.
The thin grey lines
indicate changes in the melodic `figures' of which the piece is
constructed. In the `model information rate' panel, the black asterisks
mark the
six most surprising moments selected by Keith Potter.
The bottom panel shows a rule-based boundary strength analysis computed
using Cambouropoulos' LBDM.
All information measures are in nats and time is in notes.
}
\end{fig}

\begin{fig}{metre}
\scalebox{1}[0.8]{%
\begin{tabular}{cc}
\colfig[0.45]{matbase/fig36859} & \colfig[0.45]{matbase/fig88658} \\
\colfig[0.45]{matbase/fig48061} & \colfig[0.45]{matbase/fig46367} \\
\colfig[0.45]{matbase/fig99042} & \colfig[0.45]{matbase/fig87490}
% \colfig[0.46]{matbase/fig56807} & \colfig[0.48]{matbase/fig27144} \\
% \colfig[0.46]{matbase/fig87574} & \colfig[0.48]{matbase/fig13651} \\
% \colfig[0.44]{matbase/fig19913} & \colfig[0.46]{matbase/fig66144} \\
% \colfig[0.48]{matbase/fig73098} & \colfig[0.48]{matbase/fig57141} \\
% \colfig[0.48]{matbase/fig25703} & \colfig[0.48]{matbase/fig72080} \\
% \colfig[0.48]{matbase/fig9142} & \colfig[0.48]{matbase/fig27751}

\end{tabular}%
}
\caption{Metrical analysis by computing average surprisingness and
informativeness of notes at different periodicities (\ie hypothetical
bar lengths) and phases (\ie positions within a bar).
}
\end{fig}

\subsection{Content analysis/Sound Categorisation}
Using information dynamics it is possible to segment music; the resulting
segmentation can then be used to search large data sets and to determine
musical structure for the purposes of playlist navigation and search.
\emph{Peter}

\subsection{Beat Tracking}
\emph{Andrew}


\section{Information dynamics as compositional aid}

In addition to applying information dynamics to analysis, it is also possible to
use this approach in design, such as the composition of musical materials. By
providing a framework for linking information-theoretic measures to the control
of generative processes, it becomes possible to steer the output of these processes
to match criteria defined by these measures. For instance, the outputs of a
stochastic musical process could be filtered to match constraints defined by a
set of information-theoretic measures.

The use of stochastic processes for the generation of musical material has been
widespread for decades -- Iannis Xenakis applied probabilistic mathematical
models to the creation of musical materials, including the formulation of a
theory of Markovian Stochastic Music. However, we can use information dynamics
measures to explore and interface with such processes at the high and abstract
level of expectation, randomness and predictability. The Melody Triangle is
such a system.

\subsection{The Melody Triangle}
The Melody Triangle is an exploratory interface for the discovery of melodic
content, where the input -- positions within a triangle -- directly maps to
information-theoretic measures associated with the output.
The measures are the entropy rate, redundancy and predictive information rate
of the random process used to generate the sequence of notes.
These are all related to the predictability of the sequence and as such
address the notions of expectation and surprise in the perception of
music.\emph{self-plagiarised}

Before the Melody Triangle can be used, it has to be `populated' with possible
parameter values for the melody generators. These are then plotted in a 3D
statistical space of redundancy, entropy rate and predictive information rate.
In our case we generated thousands of transition matrices, representing first-order
Markov chains, by a random sampling method. In figure \ref{InfoDynEngine} we see
a representation of how these matrices are distributed in the 3D statistical
space; each one of these points corresponds to a transition
matrix.\emph{self-plagiarised}

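A sketch of how such a population can be generated (our reconstruction; the
Dirichlet concentration, alphabet size and sample count are illustrative
assumptions, not the values used for the installation), reusing the functions
from the first-order Markov chain sketch above:
\begin{lstlisting}[language=Python,basicstyle=\small\ttfamily]
import numpy as np
# reuses stationary(), entropy_rate() and pir() from the earlier sketch

def coordinates(a):
    # (redundancy, entropy rate, PIR) of a first-order chain, in bits
    p = stationary(a)
    h = entropy_rate(a)
    H0 = float(-np.sum(p * np.log2(np.where(p > 0, p, 1.0))))
    return H0 - h, h, pir(a)   # redundancy = H(X_0) - h_a

rng = np.random.default_rng(0)
# each sampled matrix has independent Dirichlet-distributed rows
samples = [rng.dirichlet(np.full(8, 0.1), size=8) for _ in range(1000)]
points = np.array([coordinates(a) for a in samples])
\end{lstlisting}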
\begin{figure}
\centering
\includegraphics[width=\linewidth]{figs/mtriscat}
\caption{The population of transition matrices distributed along three axes of
redundancy, entropy rate and predictive information rate (all measured in bits).
The concentrations of points along the redundancy axis correspond
to Markov chains which are roughly periodic with periods of 2 (redundancy 1 bit),
3, 4, \etc all the way to period 8 (redundancy 3 bits). The colour of each point
represents its PIR---note that the highest values are found at intermediate entropy
and redundancy, and that the distribution as a whole makes a curved triangle. Although
not visible in this plot, it is largely hollow in the middle.
\label{InfoDynEngine}}
\end{figure}


When we look at the distribution of transition matrices plotted in this space,
we see that it forms an arch shape that is fairly thin. It thus becomes a
reasonable approximation to pretend that it is just a sheet in two dimensions;
and so we stretch out this curved arc into a flat triangle. It is this triangular
sheet that is our `Melody Triangle' and forms the interface by which the system
is controlled. \emph{self-plagiarised}

When the Melody Triangle is used, regardless of whether it is as a screen based
system, or as an interactive installation, it involves a mapping to this statistical
space. When the user, through the interface, selects a position within the
triangle, the corresponding transition matrix is returned. Figure \ref{TheTriangle}
shows how the triangle maps to different measures of redundancy, entropy rate
and predictive information rate.\emph{self-plagiarised}
\begin{figure}
\centering
\includegraphics[width=0.85\linewidth]{figs/TheTriangle.pdf}
\caption{The Melody Triangle\label{TheTriangle}}
\end{figure}
The three corners correspond to three different extremes of predictability and
unpredictability, which could be loosely characterised as `periodicity', `noise'
and `repetition'. Melodies from the `noise' corner have no discernible pattern;
they have high entropy rate, low predictive information rate and low redundancy.
These melodies are essentially totally random. Melodies along the `periodicity'
to `repetition' edge are all deterministic loops that get shorter as we approach
the `repetition' corner, until they become just one repeating note. It is the
areas in between the extremes that provide the more `interesting' melodies. That
is, those that have some level of unpredictability, but are not completely random.
Or, conversely, that are predictable, but not entirely so. This triangular
space allows for an intuitive exploration of expectation and surprise in temporal
sequences based on a simple model of how one might guess the next event given
the previous one.\emph{self-plagiarised}



Any number of interfaces could be developed for the Melody Triangle. We have
developed two: a standard screen based interface where a user moves tokens with
a mouse in and around a triangle on screen, and a multi-user interactive
installation where a Kinect camera tracks individuals in a space and maps their
positions in the space to the triangle.
Each visitor would generate a melody, and could collaborate with their co-visitors
to generate musical textures -- a playful yet informative way to explore
expectation and surprise in music.

As a screen based interface the Melody Triangle can serve as a composition tool.
A triangle is drawn on the screen, with screen space thus mapped to the statistical
space of the Melody Triangle.
A number of round tokens, each representing a melody, can be dragged in and
around the triangle. When a token is dragged into the triangle, the system
starts generating a sequence of notes with statistical properties that
correspond to its position in the triangle.\emph{self-plagiarised}

In this mode, the Melody Triangle can be used as a kind of composition assistant
for the generation of interesting musical textures and melodies. However, unlike
other computer aided composition tools or programming environments, here the
composer engages with music on the high and abstract level of expectation,
randomness and predictability.\emph{self-plagiarised}


Additionally, the Melody Triangle serves as an effective tool for experimental
investigations into musical preference and its relationship to the information
dynamics models.

%As the Melody Triangle essentially operates on a stream of symbols, it it is possible to apply the melody triangle to the design of non-sonic content.

\section{Musical Preference and Information Dynamics}
We carried out a preliminary study that sought to identify any correlation between
aesthetic preference and the information-theoretic measures of the Melody
Triangle. In this study participants were asked to use the screen based interface,
but it was simplified so that all they could do was move tokens around. To help
discount visual biases, the axes of the triangle were randomly rearranged
for each participant.\emph{self-plagiarised}

The study was divided into two parts; the first investigated musical preference
with respect to single melodies at different tempos. In the second part of the
study, a background melody was played and the participants were asked to continue
playing with the system under the implicit assumption that they would try to find
a second melody that works well with the background melody. For each participant
this was done four times, each with a different background melody from four
different areas of the Melody Triangle. For all parts of the study the participants
were asked to signal, by pressing the space bar, whenever they liked what they
were hearing.\emph{self-plagiarised}

\emph{todo - results}

\section{Information Dynamics as Evaluative Feedback Mechanism}

\emph{todo - code the info dyn evaluator :) }

It is possible to use information dynamics measures to develop a kind of `critic'
that would evaluate a stream of symbols. For instance, we could develop a system
to notify us if a stream of symbols is too boring, either because it is too
repetitive or too chaotic. This could be used to evaluate pre-composed
streams of symbols, or to provide real-time feedback in an
improvisatory setup.
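As a minimal sketch of one way such a critic might work (our illustration only;
the critic envisaged here would use a more sophisticated notion of boredom, and
the threshold is arbitrary), one could fit a first-order model to the recent
symbol stream and raise a flag when its predictive information rate is low,
since both highly repetitive and highly chaotic chains have near-zero PIR:
\begin{lstlisting}[language=Python,basicstyle=\small\ttfamily]
import numpy as np
# reuses pir() from the first-order Markov chain sketch

def fit_transition_matrix(symbols, K):
    # add-one-smoothed first-order model of an integer symbol stream
    counts = np.ones((K, K))
    for i, j in zip(symbols, symbols[1:]):
        counts[i, j] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def too_boring(symbols, K, threshold=0.1):
    # flag streams whose fitted model has low PIR (threshold in bits)
    return pir(fit_transition_matrix(symbols, K)) < threshold
\end{lstlisting}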

\emph{comparable system} Gordon Pask's Musicolour (1953) applied a similar notion
of boredom in its design. The Musicolour would react to audio input through a
microphone by flashing coloured lights. Rather than a direct mapping of sound
to light, Pask designed the device to be a partner to a performing musician. It
would adapt its lighting pattern based on the rhythms and frequencies it would
hear, quickly `learning' to flash in time with the music. However, Pask endowed
the device with the ability to `be bored'; if the rhythmic and frequency content
of the input remained the same for too long, it would listen for other rhythms
and frequencies, only lighting when it heard these. As the Musicolour would
`get bored', the musician would have to change and vary their playing, eliciting
new and unexpected outputs in trying to keep the Musicolour interested.

In a similar vein, our \emph{Information Dynamics Critic} (name?) allows for an
evaluative measure of an input stream, but with a more sophisticated
notion of boredom that \dots




\section{Conclusion}

\bibliographystyle{unsrt}
{\bibliography{all,c4dm,nime}}
\end{document}
|