annotate draft.tex @ 9:a76c1edacdde

Intro words in draft.tex
author samer
date Tue, 06 Mar 2012 12:13:58 +0000
parents f35b863a8d1a
children 317db6d6f433
rev   line source
samer@4 1 \documentclass[conference]{IEEEtran}
samer@4 2 \usepackage{cite}
samer@4 3 \usepackage[cmex10]{amsmath}
samer@4 4 \usepackage{graphicx}
samer@4 5 \usepackage{amssymb}
samer@4 6 \usepackage{epstopdf}
samer@4 7 \usepackage{url}
samer@4 8 \usepackage{listings}
samer@9 9 \usepackage{tools}
samer@9 10
samer@9 11 \let\citep=\cite
samer@9 12 \def\squash{}
samer@4 13
samer@4 14 %\usepackage[parfill]{parskip}
samer@4 15
samer@4 16 \begin{document}
samer@4 17 \title{Cognitive Music Modelling: an Information Dynamics Approach}
samer@4 18
samer@4 19 \author{
samer@4 20 \IEEEauthorblockN{Samer A. Abdallah, Henrik Ekeus,}
samer@4 21 \IEEEauthorblockN{Peter Foster and Mark D. Plumbley}
samer@4 22 \IEEEauthorblockA{Centre for Digital Music\\
samer@4 23 Queen Mary University of London\\
samer@4 24 Mile End Road, London E1 4NS}}
samer@4 25
samer@4 26 \maketitle
samer@4 27 \abstract{People take in information when perceiving music. With it they continually build predictive models of what is going to happen. There is a relationship between information measures and how we perceive music. An information theoretic approach to music cognition is thus a fruitful avenue of research.
samer@4 28 }
samer@4 29
samer@4 30
samer@9 31 \section{Expectation and surprise in music}
samer@9 32 \label{s:Intro}
samer@9 33
samer@9 34 One of the more salient effects of listening to music is to create
samer@9 35 \emph{expectations} of what is to come next, which may be fulfilled
samer@9 36 immediately, after some delay, or not at all as the case may be.
samer@9 37 This is the thesis put forward by, amongst others, music theorists
samer@9 38 L. B. Meyer \cite{Meyer67} and Narmour \citep{Narmour77}.
samer@9 39 In fact, %the gist of
samer@9 40 this insight predates Meyer quite considerably; for example,
samer@9 41 it was elegantly put by Hanslick \cite{Hanslick1854} in the
samer@9 42 nineteenth century:
samer@9 43 \begin{quote}
samer@9 44 `The most important factor in the mental process which accompanies the
samer@9 45 act of listening to music, and which converts it to a source of pleasure,
samer@9 46 is %\ldots
samer@9 47 frequently overlooked. We here refer to the intellectual satisfaction
samer@9 48 which the listener derives from continually following and anticipating
samer@9 49 the composer's intentions---now, to see his expectations fulfilled, and
samer@9 50 now, to find himself agreeably mistaken. It is a matter of course that
samer@9 51 this intellectual flux and reflux, this perpetual giving and receiving
samer@9 52 takes place unconsciously, and with the rapidity of lightning-flashes.'
samer@9 53 \end{quote}
samer@9 54
samer@9 55 An essential aspect of this is that music is experienced as a phenomenon
samer@9 56 that `unfolds' in time, rather than being apprehended as a static object
samer@9 57 presented in its entirety. Meyer argued that musical experience depends
samer@9 58 on how we change and revise our conceptions \emph{as events happen}, on
samer@9 59 how expectation and prediction interact with occurrence, and that, to a
samer@9 60 large degree, the way to understand the effect of music is to focus on
samer@9 61 this `kinetics' of expectation and surprise.
samer@9 62
samer@9 63 The business of making predictions and assessing surprise is essentially
samer@9 64 one of reasoning under conditions of uncertainty and manipulating
samer@9 65 degrees of belief about the various propositions which may or may not
samer@9 66 hold, and is, as has been argued elsewhere \cite{Cox1946,Jaynes27}, best
samer@9 67 quantified in terms of Bayesian probability theory.
samer@9 68 % Thus, we assume that musical schemata are encoded as probabilistic %
samer@9 69 %\citep{Meyer56} models, and
samer@9 70 Thus, we suppose that
samer@9 71 when we listen to music, expectations are created on the basis of our
samer@9 72 familiarity with various stylistic norms, that is, using models that
samer@9 73 encode the statistics of music in general, the particular styles of
samer@9 74 music that seem best to fit the piece we happen to be listening to, and
samer@9 75 the emerging structures peculiar to the current piece. There is
samer@9 76 experimental evidence that human listeners are able to internalise
samer@9 77 statistical knowledge about musical structure, \eg
samer@9 78 \citep{SaffranJohnsonAslin1999,EerolaToiviainenKrumhansl2002}, and also
samer@9 79 that statistical models can form an effective basis for computational
samer@9 80 % analysis of music, \eg \cite{Pearce2005}.
samer@9 81 analysis of music, \eg
samer@9 82 \cite{ConklinWitten95,PonsfordWigginsMellish1999,Pearce2005}.
samer@9 83 % \cite{Ferrand2002}. Dubnov and Assayag PSTs?
samer@9 84
samer@9 85 \squash
samer@9 86 \subsection{Music and information theory}
samer@9 87 Given a probabilistic framework for music modelling and prediction,
samer@9 88 it is a small step to apply quantitative information theory \cite{Shannon48} to
samer@9 89 the models at hand.
samer@9 90 The relationship between information theory and music and art in general has been the
samer@9 91 subject of some interest since the 1950s
samer@9 92 \cite{Youngblood58,CoonsKraehenbuehl1958,HillerBean66,Moles66,Meyer67,Cohen1962}.
samer@9 93 The general thesis is that perceptible qualities and subjective
samer@9 94 states like uncertainty, surprise, complexity, tension, and interestingness
samer@9 95 are closely related to
samer@9 96 information-theoretic quantities like entropy, relative entropy,
samer@9 97 and mutual information.
samer@9 98 % and are major determinants of the overall experience.
samer@9 99 Berlyne \cite{Berlyne71} called such quantities `collative variables', since
samer@9 100 they are to do with patterns of occurrence rather than medium-specific details,
samer@9 101 and developed the ideas of `information aesthetics' in an experimental setting.
samer@9 102 % Berlyne's `new experimental aesthetics', the `information-aestheticians'.
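
For concreteness, the standard definitions of these quantities for discrete
random variables are recalled here; the symbols $X$, $Y$, $p$ and $q$ are our
own notational assumptions and are not fixed by the text above. For a variable
$X$ with distribution $p$, a second distribution $q$ over the same values, and
a joint distribution $p(x,y)$ over $X$ and $Y$,
\begin{align}
H(X) &= -\sum_x p(x) \log p(x), \\
D(p \| q) &= \sum_x p(x) \log \frac{p(x)}{q(x)}, \\
I(X;Y) &= \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}.
\end{align}
As a small check, a fair coin has $H(X) = \log 2$, that is, one bit when
logarithms are taken to base 2, and $I(X;Y) = 0$ whenever $X$ and $Y$ are
independent.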
samer@9 103
samer@9 104 % Listeners then experience greater or lesser levels of surprise
samer@9 105 % in response to departures from these norms.
samer@9 106 % By careful manipulation
samer@9 107 % of the material, the composer can thus define, and induce within the
samer@9 108 % listener, a temporal programme of varying
samer@9 109 % levels of uncertainty, ambiguity and surprise.
samer@9 110
samer@9 111
samer@9 112 Previous work in this area \cite{Berlyne74} treated the various
samer@9 113 information theoretic quantities
samer@9 114 such as entropy as if they were intrinsic properties of the stimulus---subjects
samer@9 115 were presented with a sequence of tones with `high entropy', or a visual pattern
samer@9 116 with `low entropy'. These values were determined from some known `objective'
samer@9 117 probability model of the stimuli,%
samer@9 118 \footnote{%
samer@9 119 The notion of objective probabilities and whether or not they can
samer@9 120 usefully be said to exist is the subject of some debate, with advocates of
samer@9 121 subjective probabilities including de Finetti \cite{deFinetti}.
samer@9 122 Accordingly, we will treat the concept of `true' or `objective' probability
samer@9 123 models with a grain of salt and not rely on them in our
samer@9 124 theoretical development.}%
samer@9 125 % since probabilities are almost always a function of the state of knowledge of the observer
samer@9 126 or from simple statistical analyses such as
samer@9 127 computing empirical distributions. Our approach is explicitly to consider the role
samer@9 128 of the observer in perception, and more specifically, to consider estimates of
samer@9 129 entropy \etc with respect to \emph{subjective} probabilities.
samer@9 130 % !!REV - DONE - explain use of quoted `objective'
samer@9 131
samer@9 132 % !!REV - previous work on information theory in music
samer@9 133 More recent work on using information-theoretic concepts to analyse music
samer@9 134 includes Simon's \cite{Simon2005} assessments of the entropy of
samer@9 135 jazz improvisations and Dubnov's
samer@9 136 \cite{Dubnov2006,DubnovMcAdamsReynolds2006,Dubnov2008}
samer@9 137 investigations of the `information rate' of musical processes, which is related
samer@9 138 to the notion of redundancy in a communications channel.
samer@9 139 Dubnov's work in particular is informed by similar concerns to our own
samer@9 140 and we will discuss the relationship between it and our work at
samer@9 141 several points later in this paper
samer@9 142 (see \secrf{Redundancy}, \secrf{methods} and \secrf{RelatedWork}).
samer@9 143
samer@9 144
samer@9 145 % !!REV - DONE - rephrase, check grammar (now there are too many 'one's!)
samer@9 146 \squash
samer@9 147 \subsection{Information dynamics approach}
samer@9 148
samer@9 149 Bringing the various strands together, our working hypothesis is that
samer@9 150 as a listener (to which we will refer gender-neutrally as `it')
samer@9 151 listens to a piece of music, it maintains a dynamically evolving statistical
samer@9 152 model that enables it to make predictions about how the piece will
samer@9 153 continue, relying on both its previous experience of music and the immediate
samer@9 154 context of the piece.
samer@9 155 As events unfold, it revises its model and hence its probabilistic belief state,
samer@9 156 which includes predictive distributions over future observations.
samer@9 157 These distributions and changes in distributions can be characterised in terms of a handful of information-theoretic
samer@9 158 measures such as entropy and relative entropy.
samer@9 159 % to measure uncertainty and information. %, that is, changes in predictive distributions maintained by the model.
samer@9 160 By tracing the evolution of these measures, we obtain a representation
samer@9 161 which captures much of the significant structure of the
samer@9 162 music.
samer@9 163 This approach has a number of features which we list below.
samer@9 164
samer@9 165 (1) \emph{Abstraction}:
samer@9 166 Because it is sensitive mainly to \emph{patterns} of occurrence,
samer@9 167 rather than the details of which specific things occur,
samer@9 168 it operates at a level of abstraction removed from the details of the sensory
samer@9 169 experience and the medium through which it was received, suggesting that the
samer@9 170 same approach could, in principle, be used to analyse and compare information
samer@9 171 flow in different temporal media regardless of whether they are auditory,
samer@9 172 visual or otherwise.
samer@9 173
samer@9 174 (2) \emph{Generality}:
samer@9 175 This approach does not prescribe which probabilistic models should be used---the
samer@9 176 choice can be guided by standard model selection criteria such as Bayes
samer@9 177 factors \cite{KassRaftery1995}, \etc
samer@9 178
samer@9 179 (3) \emph{Richness}:
samer@9 180 It may be effective to use a model with time-dependent latent
samer@9 181 variables, such as a hidden Markov model. In these cases, we can track changes
samer@9 182 in beliefs about the hidden variables as well as the observed ones, adding
samer@9 183 another layer of richness to the description while maintaining the same
samer@9 184 level of abstraction.
samer@9 185 For example, harmony (\ie, the `current chord') in music is not stated explicitly, but rather
samer@9 186 must be inferred from the musical surface; nonetheless, a sense of harmonic
samer@9 187 progression is an important aspect of many styles of music.
samer@9 188
samer@9 189 (4) \emph{Subjectivity}:
samer@9 190 Since the analysis is dependent on the probability model the observer brings to the
samer@9 191 problem, which may depend on prior experience or other factors, and which may change
samer@9 192 over time, inter-subject variability and variation in subjects' responses over time are
samer@9 193 fundamental to the theory. It is essentially a theory of subjective response.
samer@9 194
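As a rough sketch of how such measures might be written down (the notation is
an illustrative assumption, not one fixed by the text so far), suppose the
observer has seen events $x_1,\ldots,x_{t-1}$, abbreviated $x_{1:t-1}$, and
holds the predictive distribution $p(x_t \mid x_{1:t-1})$ for the next event.
Its uncertainty before the event, and its surprise on observing $x_t$, could
then be measured by
\begin{align}
H_t &= -\sum_{x} p(x \mid x_{1:t-1}) \log p(x \mid x_{1:t-1}), \\
s_t &= -\log p(x_t \mid x_{1:t-1}),
\end{align}
and the information that the event provides about some other variable $Z$
(for example a hidden state or a future observation) by the relative entropy
between the beliefs held after and before the observation,
\begin{equation}
\Delta_t = \sum_z p(z \mid x_{1:t}) \log \frac{p(z \mid x_{1:t})}{p(z \mid x_{1:t-1})}.
\end{equation}
Each quantity depends on the observer's model through $p$, so two observers
with different models will in general assign different values to the same
sequence.
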
samer@9 195 % !!REV - clarify aims of paper.
samer@9 196 Having outlined the basic ideas, our aims in pursuing this line of thought
samer@9 197 are threefold: firstly, to propose dynamic information-based measures which
samer@9 198 are coherent from a theoretical point of view and consistent with the general
samer@9 199 principles of probabilistic inference, with possible applications in
samer@9 200 regulating machine learning systems;
samer@9 201 % when heuristics are required to manage intractible models or limited computational resources.
samer@9 202 secondly, to construct computational models of what human brains are doing
samer@9 203 in response to music, on the basis that our brains implement, or at least
samer@9 204 approximate, optimal probabilistic inference under the relevant constraints;
samer@9 205 and thirdly, to construct a computational model of a certain restricted
samer@9 206 field of aesthetic judgements (namely judgements related to formal structure)
samer@9 207 that may shed light on what makes a stimulus interesting or aesthetically
samer@9 208 pleasing. This would be of particular relevance to understanding and
samer@9 209 modelling the creative process, which often alternates between generative
samer@9 210 and selective or evaluative phases \cite{Boden1990}, and would have
samer@9 211 applications in tools for computer aided composition.
samer@9 212
samer@4 213 \section{Information Dynamics Approach}
samer@4 214
samer@4 215 \subsection{Re-iterate core hypothesis}
samer@4 216
samer@4 217 \subsection{models/parameters/observations}
samer@4 218 The grouping of elements into past, present and future.
samer@4 219 \subsection{Information measures}
samer@4 220 Predictive information rate as a measure of structure
samer@4 221 Crutchfield papers, anatomy of a bit
samer@4 222 \subsection{Case of this approach being good at modelling music cognition}
samer@4 223 Inverted U
samer@4 224 \section{Applications}
samer@4 225 \subsection{In Analysis}
samer@4 226 refer to the work with the analysis of minimalist pieces
samer@4 227
samer@4 228 Content analysis - Sound Categorisation. Using Information Dynamics it is possible to segment music. From there we can then use this to search large data sets. Determine musical structure for the purpose of playlist navigation and search. (Peter)
samer@4 229
samer@4 230 \subsection{Beat Tracking}
samer@4 231 Bayesian belief can be used to predict when things happen (as opposed to just what happens). Information Dynamics of?
samer@4 232
samer@4 233
samer@4 234
samer@4 235 \subsection{Information Dynamics as Design Tool}
samer@4 236 \subsubsection{The Melody Triangle}
samer@4 237 \emph{What the Melody Triangle is\dots}
samer@4 238
samer@4 239
samer@4 240 \emph{The Melody Triangle as Composition Assistant\dots}
samer@4 241
samer@4 242 \emph{comparable tools} The use of stochastic processes for the generation of musical material has been widespread for decades. Just as Information Theory was coming of age, Iannis Xenakis applied probabilistic mathematical models to the creation of musical materials, including the formulation of a theory of Markovian Stochastic Music. With the Melody Triangle, similar processes generate the content; however, we are able to explore and interface with these processes at the high and abstract level of expectation, randomness and predictability.
samer@4 243
samer@4 244 \emph{Using the Melody Triangle for the generation of non-sonic content (maybe)}
samer@4 245
samer@4 246 \subsection{Information Dynamics as Evaluative Feedback Mechanism}
samer@4 247
samer@4 248
samer@4 249 \emph{comparable system} Gordon Pask's Musicolour (1953) applied a similar notion of boredom in its design.
samer@4 250 The Musicolour would react to audio input through a microphone by flashing coloured lights. Rather than a direct mapping of sound to light, Pask designed the device to be a partner to a performing musician. It would adapt its lighting pattern based on the rhythms and frequencies it would hear, quickly `learning' to flash in time with the music. However, Pask endowed the device with the ability to `be bored'; if the rhythmic and frequency content of the input remained the same for too long it would listen for other rhythms and frequencies, only lighting when it heard these. As the Musicolour would `get bored', the musician would have to change and vary their playing, eliciting new and unexpected outputs in trying to keep the Musicolour interested.
samer@4 251
samer@4 252 In a similar vein, our \emph{Information Dynamics Critic} (name?) allows for an evaluative measure of an input stream, but with a more sophisticated notion of boredom that \dots
samer@4 253
samer@4 254 \subsection{Musical Preference and Information Dynamics}
samer@4 255 Any results from this study
samer@4 256 \section{Conclusion}
samer@4 257
samer@9 258 \bibliographystyle{unsrt}
samer@9 259 {\bibliography{all,c4dm}}
samer@4 260 \end{document}