annotate draft.tex @ 16:d5f63ea0f266

fixed abstract tag, tidy doc structure
author Henrik Ekeus <hekeus@eecs.qmul.ac.uk>
date Tue, 06 Mar 2012 15:21:35 +0000
parents 317db6d6f433
children e47aaea2ac28
rev   line source
samer@4 1 \documentclass[conference]{IEEEtran}
samer@4 2 \usepackage{cite}
samer@4 3 \usepackage[cmex10]{amsmath}
samer@4 4 \usepackage{graphicx}
samer@4 5 \usepackage{amssymb}
samer@4 6 \usepackage{epstopdf}
samer@4 7 \usepackage{url}
samer@4 8 \usepackage{listings}
samer@9 9 \usepackage{tools}
samer@9 10
samer@9 11 \let\citep=\cite
samer@9 12 \def\squash{}
samer@4 13
samer@4 14 %\usepackage[parfill]{parskip}
samer@4 15
samer@4 16 \begin{document}
samer@4 17 \title{Cognitive Music Modelling: an Information Dynamics Approach}
samer@4 18
samer@4 19 \author{
hekeus@16 20 \IEEEauthorblockN{Samer A. Abdallah, Henrik Ekeus, Peter Foster}
hekeus@16 21 \IEEEauthorblockN{Andrew Robertson and Mark D. Plumbley}
samer@4 22 \IEEEauthorblockA{Centre for Digital Music\\
samer@4 23 Queen Mary University of London\\
hekeus@16 24 Mile End Road, London E1 4NS\\
hekeus@16 25 Email:}}
samer@4 26
samer@4 27 \maketitle
hekeus@16 28 \begin{abstract}People take in information when perceiving music. With it they continually build predictive models of what is going to happen. There is a relationship between information measures and how we perceive music. An information theoretic approach to music cognition is thus a fruitful avenue of research.
hekeus@16 29 \end{abstract}
samer@4 30
samer@4 31
samer@9 32 \section{Expectation and surprise in music}
samer@9 33 \label{s:Intro}
samer@9 34
samer@9 35 One of the more salient effects of listening to music is to create
samer@9 36 \emph{expectations} of what is to come next, which may be fulfilled
samer@9 37 immediately, after some delay, or not at all as the case may be.
samer@9 38 This is the thesis put forward by, amongst others, music theorists
samer@9 39 L. B. Meyer \cite{Meyer67} and Narmour \citep{Narmour77}.
samer@9 40 In fact, %the gist of
samer@9 41 this insight predates Meyer quite considerably; for example,
samer@9 42 it was elegantly put by Hanslick \cite{Hanslick1854} in the
samer@9 43 nineteenth century:
samer@9 44 \begin{quote}
samer@9 45 `The most important factor in the mental process which accompanies the
samer@9 46 act of listening to music, and which converts it to a source of pleasure,
samer@9 47 is %\ldots
samer@9 48 frequently overlooked. We here refer to the intellectual satisfaction
samer@9 49 which the listener derives from continually following and anticipating
samer@9 50 the composer's intentions---now, to see his expectations fulfilled, and
samer@9 51 now, to find himself agreeably mistaken. It is a matter of course that
samer@9 52 this intellectual flux and reflux, this perpetual giving and receiving
samer@9 53 takes place unconsciously, and with the rapidity of lightning-flashes.'
samer@9 54 \end{quote}
samer@9 55
samer@9 56 An essential aspect of this is that music is experienced as a phenomenon
samer@9 57 that `unfolds' in time, rather than being apprehended as a static object
samer@9 58 presented in its entirety. Meyer argued that musical experience depends
samer@9 59 on how we change and revise our conceptions \emph{as events happen}, on
samer@9 60 how expectation and prediction interact with occurrence, and that, to a
samer@9 61 large degree, the way to understand the effect of music is to focus on
samer@9 62 this `kinetics' of expectation and surprise.
samer@9 63
samer@9 64 The business of making predictions and assessing surprise is essentially
samer@9 65 one of reasoning under conditions of uncertainty and manipulating
samer@9 66 degrees of belief about the various propositions which may or may not
samer@9 67 hold, and, as has been argued elsewhere \cite{Cox1946,Jaynes27}, is best
samer@9 68 quantified in terms of Bayesian probability theory.
samer@9 69 % Thus, we assume that musical schemata are encoded as probabilistic %
samer@9 70 %\citep{Meyer56} models, and
samer@9 71 Thus, we suppose that
samer@9 72 when we listen to music, expectations are created on the basis of our
samer@9 73 familiarity with various stylistic norms, that is, using models that
samer@9 74 encode the statistics of music in general, the particular styles of
samer@9 75 music that seem best to fit the piece we happen to be listening to, and
samer@9 76 the emerging structures peculiar to the current piece. There is
samer@9 77 experimental evidence that human listeners are able to internalise
samer@9 78 statistical knowledge about musical structure, \eg
samer@9 79 \citep{SaffranJohnsonAslin1999,EerolaToiviainenKrumhansl2002}, and also
samer@9 80 that statistical models can form an effective basis for computational
samer@9 81 % analysis of music, \eg \cite{Pearce2005}.
samer@9 82 analysis of music, \eg
samer@9 83 \cite{ConklinWitten95,PonsfordWigginsMellish1999,Pearce2005}.
samer@9 84 % \cite{Ferrand2002}. Dubnov and Assayag PSTs?
samer@9 85
samer@9 86 \squash
samer@9 87 \subsection{Music and information theory}
samer@9 88 Given a probabilistic framework for music modelling and prediction,
samer@9 89 it is a small step to apply quantitative information theory \cite{Shannon48} to
samer@9 90 the models at hand.
samer@9 91 The relationship between information theory and music and art in general has been the
samer@9 92 subject of some interest since the 1950s
samer@9 93 \cite{Youngblood58,CoonsKraehenbuehl1958,HillerBean66,Moles66,Meyer67,Cohen1962}.
samer@9 94 The general thesis is that perceptible qualities and subjective
samer@9 95 states like uncertainty, surprise, complexity, tension, and interestingness
samer@9 96 are closely related to
samer@9 97 information-theoretic quantities like entropy, relative entropy,
samer@9 98 and mutual information.
samer@9 99 % and are major determinants of the overall experience.
samer@9 100 Berlyne \cite{Berlyne71} called such quantities `collative variables', since
samer@9 101 they are to do with patterns of occurrence rather than medium-specific details,
samer@9 102 and developed the ideas of `information aesthetics' in an experimental setting.
samer@9 103 % Berlyne's `new experimental aesthetics', the `information-aestheticians'.
samer@9 104
samer@9 105 % Listeners then experience greater or lesser levels of surprise
samer@9 106 % in response to departures from these norms.
samer@9 107 % By careful manipulation
samer@9 108 % of the material, the composer can thus define, and induce within the
samer@9 109 % listener, a temporal programme of varying
samer@9 110 % levels of uncertainty, ambiguity and surprise.
samer@9 111
samer@9 112
samer@9 113 Previous work in this area \cite{Berlyne74} treated the various
samer@9 114 information-theoretic quantities
samer@9 115 such as entropy as if they were intrinsic properties of the stimulus---subjects
samer@9 116 were presented with a sequence of tones with `high entropy', or a visual pattern
samer@9 117 with `low entropy'. These values were determined from some known `objective'
samer@9 118 probability model of the stimuli,%
samer@9 119 \footnote{%
samer@9 120 The notion of objective probabilities and whether or not they can
samer@9 121 usefully be said to exist is the subject of some debate, with advocates of
samer@9 122 subjective probabilities including de Finetti \cite{deFinetti}.
samer@9 123 Accordingly, we will treat the concept of `true' or `objective' probability
samer@9 124 models with a grain of salt and not rely on it in our
samer@9 125 theoretical development.}%
samer@9 126 % since probabilities are almost always a function of the state of knowledge of the observer
samer@9 127 or from simple statistical analyses such as
samer@9 128 computing empirical distributions. Our approach is explicitly to consider the role
samer@9 129 of the observer in perception, and more specifically, to consider estimates of
samer@9 130 entropy \etc with respect to \emph{subjective} probabilities.
samer@9 131 % !!REV - DONE - explain use of quoted `objective'
samer@9 132
samer@9 133 % !!REV - previous work on information theory in music
samer@9 134 More recent work on using information-theoretic concepts to analyse music
samer@9 135 includes Simon's \cite{Simon2005} assessments of the entropy of
samer@9 136 jazz improvisations and Dubnov's
samer@9 137 \cite{Dubnov2006,DubnovMcAdamsReynolds2006,Dubnov2008}
samer@9 138 investigations of the `information rate' of musical processes, which is related
samer@9 139 to the notion of redundancy in a communications channel.
samer@9 140 Dubnov's work in particular is informed by similar concerns to our own
samer@9 141 and we will discuss the relationship between it and our work at
samer@9 142 several points later in this paper
samer@9 143 (see \secrf{Redundancy}, \secrf{methods} and \secrf{RelatedWork}).
samer@9 144
samer@9 145
samer@9 146 % !!REV - DONE - rephrase, check grammar (now there are too many 'one's!)
samer@9 147 \squash
samer@9 148 \subsection{Information dynamics approach}
samer@9 149
samer@9 150 Bringing the various strands together, our working hypothesis is that
samer@9 151 as a listener (to which we will refer gender-neutrally as `it')
samer@9 152 listens to a piece of music, it maintains a dynamically evolving statistical
samer@9 153 model that enables it to make predictions about how the piece will
samer@9 154 continue, relying on both its previous experience of music and the immediate
samer@9 155 context of the piece.
samer@9 156 As events unfold, it revises its model and hence its probabilistic belief state,
samer@9 157 which includes predictive distributions over future observations.
samer@9 158 These distributions and changes in distributions can be characterised in terms of a handful of
samer@9 159 information-theoretic measures such as entropy and relative entropy.
samer@9 160 % to measure uncertainty and information. %, that is, changes in predictive distributions maintained by the model.
samer@9 161 By tracing the evolution of these measures, we obtain a representation
samer@9 162 which captures much of the significant structure of the
samer@9 163 music.
samer@9 164 This approach has a number of features which we list below.
samer@9 165
samer@9 166 (1) \emph{Abstraction}:
samer@9 167 Because it is sensitive mainly to \emph{patterns} of occurrence,
samer@9 168 rather than the details of which specific things occur,
samer@9 169 it operates at a level of abstraction removed from the details of the sensory
samer@9 170 experience and the medium through which it was received, suggesting that the
samer@9 171 same approach could, in principle, be used to analyse and compare information
samer@9 172 flow in different temporal media regardless of whether they are auditory,
samer@9 173 visual or otherwise.
samer@9 174
samer@9 175 (2) \emph{Generality}:
samer@9 176 This approach does not prescribe which probabilistic models should be used---the
samer@9 177 choice can be guided by standard model selection criteria such as Bayes
samer@9 178 factors \cite{KassRaftery1995}, \etc
samer@9 179
samer@9 180 (3) \emph{Richness}:
samer@9 181 It may be effective to use a model with time-dependent latent
samer@9 182 variables, such as a hidden Markov model. In these cases, we can track changes
samer@9 183 in beliefs about the hidden variables as well as the observed ones, adding
samer@9 184 another layer of richness to the description while maintaining the same
samer@9 185 level of abstraction.
samer@9 186 For example, harmony (\ie, the `current chord') in music is not stated explicitly, but rather
samer@9 187 must be inferred from the musical surface; nonetheless, a sense of harmonic
samer@9 188 progression is an important aspect of many styles of music.
samer@9 189
samer@9 190 (4) \emph{Subjectivity}:
samer@9 191 Since the analysis is dependent on the probability model the observer brings to the
samer@9 192 problem, which may depend on prior experience or other factors, and which may change
samer@9 193 over time, inter-subject variability and variation in subjects' responses over time are
samer@9 194 fundamental to the theory. It is essentially a theory of subjective response.
samer@9 195
samer@9 196 % !!REV - clarify aims of paper.
samer@9 197 Having outlined the basic ideas, our aims in pursuing this line of thought
samer@9 198 are threefold: firstly, to propose dynamic information-based measures which
samer@9 199 are coherent from a theoretical point of view and consistent with the general
samer@9 200 principles of probabilistic inference, with possible applications in
samer@9 201 regulating machine learning systems;
samer@9 202 % when heuristics are required to manage intractible models or limited computational resources.
samer@9 203 secondly, to construct computational models of what human brains are doing
samer@9 204 in response to music, on the basis that our brains implement, or at least
samer@9 205 approximate, optimal probabilistic inference under the relevant constraints;
samer@9 206 and thirdly, to construct a computational model of a certain restricted
samer@9 207 field of aesthetic judgements (namely judgements related to formal structure)
samer@9 208 that may shed light on what makes a stimulus interesting or aesthetically
samer@9 209 pleasing. This would be of particular relevance to understanding and
samer@9 210 modelling the creative process, which often alternates between generative
samer@9 211 and selective or evaluative phases \cite{Boden1990}, and would have
samer@9 212 applications in tools for computer aided composition.
samer@9 213
samer@4 214
hekeus@16 215 \section{Information Dynamics in Analysis}
samer@4 216
hekeus@16 217 \subsection{Musicological Analysis}
samer@4 218 refer to the work with the analysis of minimalist pieces
samer@4 219
hekeus@16 220 \subsection{Content analysis/Sound Categorisation} Using information dynamics it is possible to segment music. The resulting segmentation can then be used to search large data sets and to determine musical structure for the purposes of playlist navigation and search.
hekeus@16 221 \emph{Peter}
samer@4 222
samer@4 223 \subsection{Beat Tracking}
hekeus@16 224 \emph{Andrew}
samer@4 225
samer@4 226
hekeus@13 227 \section{Information Dynamics as Design Tool}
hekeus@13 228
hekeus@16 229 In addition to applying information dynamics to analysis, it is also possible to use this approach in design, such as the composition of musical materials.
hekeus@13 230 By providing a framework for linking information-theoretic measures to the control of generative processes, it becomes possible to steer the output of these processes to match criteria defined by these measures.
hekeus@13 231 For instance, the outputs of a stochastic musical process could be filtered to match constraints defined by a set of information-theoretic measures.
hekeus@13 232
hekeus@13 233 The use of stochastic processes for the generation of musical material has been widespread for decades -- Iannis Xenakis applied probabilistic mathematical models to the creation of musical materials, including the formulation of a theory of Markovian Stochastic Music.
hekeus@13 234 However, we can use information dynamics measures to explore and interface with such processes at the high and abstract level of expectation, randomness and predictability.
hekeus@13 235 The Melody Triangle is such a system.
hekeus@13 236
hekeus@13 237 \subsection{The Melody Triangle} The Melody Triangle is an exploratory interface for the discovery of melodic content, where the input -- a position within a triangle -- maps directly to information-theoretic measures associated with the output.
hekeus@13 238 The measures are the entropy rate, redundancy and predictive information rate of the random process used to generate the sequence of notes.
hekeus@13 239 These are all related to the predictability of the sequence and as such address the notions of expectation and surprise in the perception of music. \emph{self-plagiarised}
hekeus@13 240
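As a sketch of how these three measures might be written down for the first-order Markov chains used here, assume (this notation is ours for illustration and is not fixed elsewhere in this draft) a chain over $N$ symbols with transition matrix $a$, where $a_{ij} = \Pr(X_{t+1}=i \mid X_t=j)$, and a unique stationary distribution $\pi$ satisfying $a\pi = \pi$. Taking `redundancy' to mean the mutual information between the present symbol and the past, the standard expressions under these assumptions would be
\begin{align}
  H(\pi) &= -\sum_{i=1}^{N} \pi_i \log \pi_i, \\
  h(a) &= -\sum_{j=1}^{N} \pi_j \sum_{i=1}^{N} a_{ij} \log a_{ij}, \\
  \rho(a) &= H(\pi) - h(a), \\
  b(a) &= h(a^2) - h(a),
\end{align}
where $h(a)$ is the entropy rate, $\rho(a)$ the redundancy, $b(a)$ the predictive information rate, and $h(a^2)$ denotes the same expression as $h(a)$ evaluated on the two-step transition matrix $a^2$, which shares the stationary distribution $\pi$.
As a quick check, a uniform transition matrix ($a_{ij} = 1/N$ for all $i,j$) gives $h = \log N$, $\rho = 0$ and $b = 0$, while a deterministic cycle through all $N$ symbols gives $h = 0$, $\rho = \log N$ and $b = 0$.
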
hekeus@13 241 Before the Melody Triangle can be used, it has to be `populated' with possible parameter values for the melody generators.
hekeus@13 242 These are then plotted in a 3D statistical space of redundancy, entropy rate and predictive information rate.
hekeus@13 243 In our case we generated thousands of transition matrices, representing first-order Markov chains, by a random sampling method. In figure x we see a representation of how these matrices are distributed in the 3D statistical space; each one of these points corresponds to a transition matrix. \emph{self-plagiarised}
samer@4 244
hekeus@13 245 When we look at the distribution of transition matrices plotted in this space, we see that it forms an arch shape that is fairly thin.
hekeus@13 246 It thus becomes a reasonable approximation to treat it as a sheet in two dimensions, and so we stretch out this curved arc into a flat triangle.
hekeus@13 247 It is this triangular sheet that is our `Melody Triangle' and forms the interface by which the system is controlled. \emph{self-plagiarised}
samer@4 248
hekeus@13 249 When the Melody Triangle is used, whether as a screen-based system or as an interactive installation, it involves a mapping to this statistical space.
hekeus@13 250 When the user, through the interface, selects a position within the triangle, the corresponding transition matrix is returned.
hekeus@16 251 Figure x shows how the triangle maps to different measures of redundancy, entropy rate and predictive information rate.\emph{self-plagiarised}
samer@4 252
hekeus@13 253 The three corners correspond to three different extremes of predictability and unpredictability, which could be loosely characterised as `periodicity', `noise' and `repetition'.
hekeus@13 254 Melodies from the `noise' corner have no discernible pattern; they have high entropy rate, low predictive information rate and low redundancy.
hekeus@13 255 These melodies are essentially totally random.
hekeus@13 256 Melodies along the `periodicity' to `repetition' edge are all deterministic loops that get shorter as we approach the `repetition' corner, until they become just one repeating note.
hekeus@13 257 It is the areas in between the extremes that provide the more `interesting' melodies.
hekeus@13 258 These are melodies that have some level of unpredictability but are not completely random or, conversely, that are predictable but not entirely so.
hekeus@13 259 This triangular space allows for an intuitive exploration of expectation and surprise in temporal sequences based on a simple model of how one might guess the next event given the previous one. \emph{self-plagiarised}
samer@4 260
hekeus@13 261
hekeus@13 262
hekeus@13 263 Any number of interfaces could be developed for the Melody Triangle.
hekeus@13 264 We have developed two: a standard screen-based interface where a user moves tokens with a mouse in and around a triangle on screen, and a multi-user interactive installation where a Kinect camera tracks individuals in a space and maps their positions in the space to the triangle.
hekeus@13 265 Each visitor would generate a melody, and could collaborate with their co-visitors to generate musical textures -- a playful yet informative way to explore expectation and surprise in music.
hekeus@13 266
hekeus@13 267 As a screen-based interface, the Melody Triangle can serve as a composition tool.
hekeus@13 268 A triangle is drawn on the screen, so that screen space is mapped to the statistical space of the Melody Triangle.
hekeus@13 269 A number of round tokens, each representing a melody, can be dragged in and around the triangle.
hekeus@13 270 When a token is dragged into the triangle, the system will start generating a sequence of notes with statistical properties that correspond to its position in the triangle. \emph{self-plagiarised}
hekeus@13 271
hekeus@13 272 In this mode, the Melody Triangle can be used as a kind of composition assistant for the generation of interesting musical textures and melodies.
hekeus@13 273 However, unlike other computer-aided composition tools or programming environments, here the composer engages with music on the high and abstract level of expectation, randomness and predictability. \emph{self-plagiarised}
hekeus@13 274
hekeus@13 275
hekeus@13 276 Additionally, the Melody Triangle serves as an effective tool for experimental investigations into musical preference and its relationship to the information dynamics models.
samer@4 277
hekeus@13 278 %As the Melody Triangle essentially operates on a stream of symbols, it it is possible to apply the melody triangle to the design of non-sonic content.
hekeus@13 279
hekeus@13 280 \section{Musical Preference and Information Dynamics}
hekeus@13 281 We carried out a preliminary study that sought to identify any correlation between aesthetic preference and the information-theoretic measures of the Melody Triangle.
hekeus@13 282 In this study, participants were asked to use a simplified version of the screen-based interface in which all they could do was move tokens around.
hekeus@13 283 To help discount visual biases, the axes of the triangle would be randomly rearranged for each participant.\emph{self-plagiarised}
hekeus@16 284
hekeus@16 285 The study was divided into two parts. The first investigated musical preference with respect to single melodies at different tempos.
hekeus@13 286 In the second part of the study, a background melody was playing and the participants were asked to find a second melody that `works well' with the background melody.
hekeus@13 287 For each participant this was done four times, each with a different background melody from four different areas of the Melody Triangle.
hekeus@13 288 For all parts of the study the participants were asked to `mark', by pressing the space bar, whenever they liked what they were hearing. \emph{self-plagiarised}
samer@4 289
hekeus@13 290 \emph{todo - results}
samer@4 291
hekeus@13 292 \section{Information Dynamics as Evaluative Feedback Mechanism}
hekeus@13 293
hekeus@13 294 \emph{todo - code the info dyn evaluator :) }
samer@4 295
hekeus@13 296 It is possible to use information dynamics measures to develop a kind of `critic' that would evaluate a stream of symbols.
hekeus@13 297 For instance, we could develop a system to notify us if a stream of symbols is too boring, either because it is too repetitive or because it is too chaotic.
hekeus@13 298 This could be used to evaluate pre-composed streams of symbols, or even to provide real-time feedback in an improvisatory setup.
hekeus@13 299
hekeus@13 300 \emph{comparable system} Gordon Pask's Musicolour (1953) applied a similar notion of boredom in its design.
hekeus@13 301 The Musicolour would react to audio input through a microphone by flashing coloured lights.
hekeus@13 302 Rather than a direct mapping of sound to light, Pask designed the device to be a partner to a performing musician.
hekeus@13 303 It would adapt its lighting pattern based on the rhythms and frequencies it would hear, quickly `learning' to flash in time with the music.
hekeus@13 304 However, Pask endowed the device with the ability to `be bored'; if the rhythmic and frequency content of the input remained the same for too long, it would listen for other rhythms and frequencies, only lighting when it heard these.
hekeus@13 305 As the Musicolour would `get bored', the musician would have to change and vary their playing, eliciting new and unexpected outputs in trying to keep the Musicolour interested.
samer@4 306
hekeus@13 307 In a similar vein, our \emph{Information Dynamics Critic} (name?) allows for an evaluative measure of an input stream, but with a more sophisticated notion of boredom that \dots
hekeus@13 308
hekeus@13 309
hekeus@13 310
hekeus@13 311
samer@4 312 \section{Conclusion}
samer@4 313
samer@9 314 \bibliographystyle{unsrt}
hekeus@16 315 {\bibliography{all,c4dm,nime}}
samer@4 316 \end{document}