annotate mume2012/mume2012.tex @ 58:a63c438b3f65 tip

Squeezed it into the 6 page limit
author Henrik Ekeus <hekeus@eecs.qmul.ac.uk>
date Tue, 11 Jun 2013 15:17:21 +0100
parents 508760245ab1
children
rev   line source
hekeus@52 1 %File: formatting-instruction.tex
hekeus@52 2 \documentclass[letterpaper]{article}
hekeus@52 3 \usepackage{aaai}
hekeus@52 4 \usepackage{times}
hekeus@52 5 \usepackage{helvet}
hekeus@52 6 \usepackage{courier}
hekeus@52 7 \usepackage{tools}
hekeus@52 8 \usepackage{url}
hekeus@52 9
hekeus@52 10
hekeus@52 11
hekeus@52 12 \let\citep=\cite
hekeus@52 13 \newcommand{\colfig}[2][1]{\includegraphics[width=#1\linewidth]{figs/#2}}%
hekeus@53 14 \newcommand\past[1]{\overset{\rule{0pt}{0.2em}\smash{\leftarrow}}{#1}}
hekeus@53 15 \newcommand\fut[1]{\overset{\rule{0pt}{0.1em}\smash{\rightarrow}}{#1}}
hekeus@52 16 \frenchspacing
hekeus@52 17 \pdfinfo{
hekeus@53 18 /Title (The Melody Triangle: Exploring Pattern and Predictability in Music )
hekeus@52 19 /Subject (Musical Metacreation, Interfaces)
hekeus@52 20 /Author (Henrik Ekeus, Samer A. Abdallah, Mark D. Plumbley, Peter W. McOwan)}
hekeus@52 21 \setcounter{secnumdepth}{0}
hekeus@52 22
hekeus@52 23 % The file aaai.sty is the style file for AAAI Press
hekeus@52 24 % proceedings, working notes, and technical reports.
hekeus@52 25 %
hekeus@53 26 \title{The Melody Triangle:\\ Exploring Pattern and Predictability in Music}
hekeus@52 27 \author{Henrik Ekeus, Samer A. Abdallah, Mark D. Plumbley, Peter W. McOwan\\
hekeus@52 28 Centre for Digital Music, Queen Mary University of London,\\
hekeus@52 29 London E1 4NS, UK\\
hekeus@52 30 }
hekeus@52 31 \begin{document}
hekeus@52 32 \maketitle
hekeus@52 33 \begin{abstract}
hekeus@52 34 \begin{quote}
hekeus@52 35
hekeus@53 36 The Melody Triangle is an interface for the discovery of melodic materials, where the input -- positions within a triangle -- maps directly to information-theoretic properties of the output. A model of human expectation and surprise in the perception of music, \emph{information dynamics}, is used to `map out' a musical generative system's parameter space. This enables a user to explore the possibilities afforded by a generative algorithm, in this case Markov chains, not by directly selecting parameters, but by specifying the subjective \emph{predictability} of the output sequence. We describe some of the relevant ideas from information dynamics and how the Melody Triangle is defined in terms of these. We describe its incarnation as a screen-based performance tool and compositional aid for the generation of musical textures, in which the user exercises control at the abstract level of randomness and predictability, and report some pilot studies carried out with it. We also briefly outline a multi-user installation, where collaboration in a performative setting provides a playful yet informative way to explore expectation and surprise in music, and a forthcoming mobile phone version of the Melody Triangle.
hekeus@52 37
hekeus@52 38 \end{quote}
hekeus@52 39 \end{abstract}
hekeus@52 40
hekeus@52 41 \noindent
hekeus@52 42
hekeus@52 43
hekeus@53 44 \comment{
hekeus@53 45
hekeus@53 46 %%%How we created the transition matrixes and created the triangle.
hekeus@53 47
hekeus@53 48 }
hekeus@53 49
hekeus@53 50 \section{Introduction}
hekeus@53 51
hekeus@53 52 The use of generative stochastic processes in music composition has been widespread for
hekeus@53 53 decades---for instance Iannis Xenakis applied probabilistic mathematical models
hekeus@53 54 to the creation of musical materials~\cite{Xenakis:1992ul}. However, it can sometimes be difficult for a composer to find desirable parameters and navigate the possibilities of a generative algorithm intuitively.
hekeus@53 55
hekeus@53 56 The Melody Triangle is an interface for the discovery of melodic content where the parameter space of a stochastic generative musical process, the Markov chain, is `mapped out' according to the \emph{predictability} of the output. The Melody Triangle was developed in the context of \emph{information dynamics}~\cite{CIP}, an information-theoretic approach to modelling human expectation and surprise in the perception of music. %Information dynamics considers a series of sequential information measures, and these define the interface to the Melody Triangle.
hekeus@53 57 Users of the Melody Triangle do not select the parameters of the generative process directly; rather, they provide input in the form of a position within a triangle, and this maps to the information-theoretic properties of an output melody.
hekeus@53 58 For instance, one corner of the triangle returns completely random melodies, while another area yields entirely predictable and periodic patterns, with the triangle as a whole covering a spectrum of predictability of the output melodies.
hekeus@53 59
hekeus@53 60 In this paper we outline the concepts and ideas behind information dynamics, and describe the information measures that led to the development of the Melody Triangle. We describe its physical realisations: a multi-user interactive installation where visitors use their bodies and gestures to generate musical materials, and a screen-based interface. We outline some pilot studies carried out with the screen interface, as well as some qualitative feedback from music practitioners exploring its potential as a performance or composition tool. Finally, we outline a forthcoming mobile phone version of the Melody Triangle.
hekeus@52 61
hekeus@52 62 \section{Information Dynamics}
hekeus@52 63 \label{s:Intro}
hekeus@52 64 The relationship between
hekeus@53 65 Shannon's \shortcite{Shannon48} information theory and music and art in general has been the
hekeus@52 66 subject of some interest since the 1950s
hekeus@52 67 \cite{Youngblood58,CoonsKraehenbuehl1958,Moles66,Meyer67,Cohen1962}.
hekeus@52 68 The general thesis is that perceptible qualities and subjective states
hekeus@52 69 like uncertainty, surprise, complexity, tension, and interestingness
hekeus@52 70 are closely related to information-theoretic quantities like
hekeus@52 71 entropy, relative entropy, and mutual information.
hekeus@52 72
hekeus@52 73 Music is an inherently dynamic process. The idea that the musical experience is strongly shaped by the generation
hekeus@52 74 and playing out of strong and weak expectations was put forward by, amongst others,
hekeus@53 75 music theorists L. B. Meyer \shortcite{Meyer:1967} and Narmour \shortcite{Narmour:1977}.
hekeus@52 76
hekeus@52 77
hekeus@52 78
hekeus@52 79 An essential aspect of this is that music is experienced as a phenomenon
hekeus@52 80 that unfolds in time, rather than being apprehended as a static object
hekeus@52 81 presented in its entirety. Meyer argued that the experience depends
hekeus@52 82 on how we change and revise our conceptions \emph{as events happen}, on
hekeus@52 83 how expectation and prediction interact with occurrence, and that, to a
hekeus@52 84 large degree, the way to understand the effect of music is to focus on
hekeus@52 85 this `kinetics' of expectation and surprise.
hekeus@52 86
hekeus@52 87
hekeus@52 88 Prediction and expectation are essentially probabilistic concepts
hekeus@52 89 and can be treated mathematically using probability theory.
hekeus@52 90 We suppose that when we listen to music, expectations are created on the basis
hekeus@52 91 of our familiarity with various styles of music and our ability to
hekeus@52 92 detect and learn statistical regularities in the music as they emerge.
hekeus@52 93 There is experimental evidence that human listeners are able to internalise
hekeus@52 94 statistical knowledge about musical structure
hekeus@52 95 \cite{SaffranJohnsonAslin1999}, and also
hekeus@52 96 that statistical models can form an effective basis for computational
hekeus@52 97 analysis of music
hekeus@52 98 \cite{ConklinWitten95,PonsfordWigginsMellish1999,Pearce2005}.
hekeus@52 99
hekeus@52 100
hekeus@53 101 Information dynamics considers several different kinds of predictability in musical patterns, how these might be quantified using the tools of information theory,
hekeus@52 102 %human listeners might perceive these,
hekeus@52 103 and how they shape or affect the listening experience. Our working hypothesis is that listeners maintain a dynamically evolving probabilistic belief state that enables them to make predictions about how a piece of music will continue.
hekeus@52 104
hekeus@52 105
hekeus@52 106 They do this using both the immediate context of the piece and previous musical experience, such as a familiarity with musical styles and conventions. As the music unfolds, listeners continually revise this belief state, which includes predictive
hekeus@52 107 distributions over possible future events. These changes in probabilistic beliefs can be associated with
hekeus@52 108 quantities of information; these are the focus of information dynamics.
hekeus@52 109
hekeus@52 110
hekeus@53 111 In the next section we briefly describe the information measures that we use to define the Melody Triangle; a more complete overview of information dynamics and some of its applications can be found in \cite{Abdallah:2009p4089} and \cite{CIP}.
hekeus@52 112
hekeus@52 113
hekeus@53 114
hekeus@53 115
hekeus@52 116
hekeus@52 117
hekeus@52 118
hekeus@52 119
hekeus@52 120
hekeus@52 121
hekeus@52 122
hekeus@52 123 \subsection{Sequential Information Measures}\label{sec:Sequential_Information_Measures}
hekeus@53 124
hekeus@53 125 Consider a sequence of symbols from the viewpoint of an observer at a certain time, and split the
hekeus@53 126 sequence into a single symbol in the \emph{present} ($X_t$), an infinite \emph{past} ($\past{X}_t$) and the
hekeus@53 127 infinite \emph{future} ($\fut{X}_t$).
hekeus@53 128
hekeus@53 129
hekeus@53 130 The \emph{entropy rate} of a random process is a well-known, basic measure of its randomness or
hekeus@53 131 unpredictability. The entropy rate is the entropy, \emph{H}, of the \emph{present} given the \emph{past}:
hekeus@52 132 \begin{equation}
hekeus@53 133 %\mathrm{EntropyRate} = H( \mathrm{Present} | \mathrm{Past}).
hekeus@53 134 \label{eq:entro-rate}
hekeus@53 135 h_\mu = H(X_t|\past{X}_t).
hekeus@52 136 \end{equation}
hekeus@52 137 That is, it represents our average uncertainty about the present symbol \emph{given}
hekeus@52 138 that we have observed everything before it. Processes with zero entropy rate can
hekeus@52 139 be predicted perfectly given enough of the preceding context.
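
As a minimal worked sketch, assume the process is a stationary first-order Markov chain over $K$ symbols, with transition probabilities $a_{ij} = \Pr(X_{t+1}=i \mid X_t=j)$ and a stationary distribution $\pi$ satisfying $\pi_i = \sum_j a_{ij}\pi_j$ (the symbols $a$, $\pi$ and $K$ are notation introduced for this illustration only). Because only the most recent symbol is informative about the next one, the entropy rate reduces to
\begin{equation}
  h_\mu = H(X_{t+1}|X_t) = -\sum_{j=1}^{K} \pi_j \sum_{i=1}^{K} a_{ij} \log_2 a_{ij},
\end{equation}
with the convention $0 \log_2 0 = 0$. For a deterministic cycle every $a_{ij}$ is either 0 or 1, so $h_\mu = 0$; for independent, uniformly distributed symbols $a_{ij} = 1/K$ and $h_\mu = \log_2 K$ bits.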
hekeus@52 140
hekeus@53 141 The \emph{multi-information rate} $\rho_\mu$ \cite{Dubnov2004}
hekeus@53 142 is the mutual
hekeus@53 143 information, \emph{I}, between the `past' and the `present':
hekeus@53 144 \begin{equation}
hekeus@53 145 \label{eq:multi-info}
hekeus@53 146 \rho_\mu = I(\past{X}_t;X_t) = H(X_t) - H(X_t|\past{X}_t).
hekeus@53 147 %\mathrm{Redundancy} = H( \mathrm{Present} ) - H(\mathrm{Present} | \mathrm{Past}).
hekeus@53 148 \end{equation}
hekeus@53 149
hekeus@53 150 Multi-information rate can be thought of as a measure of \emph{redundancy}, quantifying the extent to which the same information is to be found in all parts of the sequence.
hekeus@53 151 It is a measure of how much the predictability of the process depends on knowing the
hekeus@52 152 preceding context. It is the difference between the entropy of a single element of the
hekeus@52 153 sequence in isolation (imagine choosing a note from a musical score at random with your
hekeus@52 154 eyes closed and then trying to guess the note) and its entropy after taking into account
hekeus@52 155 the preceding context.
hekeus@53 156 If the previous symbols reduce our uncertainty about the present symbol a great deal, then
hekeus@52 157 the redundancy is high. For example, if we know that a sequence consists of a repeating
hekeus@52 158 cycle such as $ \ldots b, c, d, a, b, c, d, a \ldots$, but we don't know which was the first
hekeus@53 159 symbol, then the redundancy is high, as $H(X_t)$ is high (because we
hekeus@53 160 have no idea about the present symbol in isolation), but $H(X_t|\past{X}_t)$
hekeus@52 161 is zero, because knowing the previous symbol immediately tells us what the present symbol is.
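
To put numbers on this example (a worked sketch, assuming all four phases of the cycle are equally likely to the observer), the marginal distribution of $X_t$ is uniform over the four symbols, so
\begin{equation}
  \rho_\mu = H(X_t) - H(X_t|\past{X}_t) = \log_2 4 - 0 = 2\ \mathrm{bits}.
\end{equation}
By the same reasoning, a deterministic cycle of period $N$ through $N$ distinct symbols has a redundancy of $\log_2 N$ bits: 1 bit for $N=2$, and $\log_2 7 \approx 2.8$ bits for $N=7$.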
hekeus@52 162
hekeus@53 163 The \emph{predictive information rate} (PIR) \cite{Abdallah:2009p4089} brings in our uncertainty about the future. It is a
hekeus@52 164 measure of how much each symbol reduces our uncertainty about the future as it is
hekeus@52 165 observed, \emph{given} that we have observed the past:
hekeus@52 166 \begin{equation}
hekeus@53 167 \label{eq:PIR}
hekeus@53 168 b_\mu = I(X_t;\fut{X}_t|\past{X}_t) = H(\fut{X}_t|\past{X}_t) - H(\fut{X}_t|X_t,\past{X}_t).
hekeus@53 169 %\mathrm{PIR} = H(\mathrm{Future} | \mathrm{Past}) - H(\mathrm{Future} | \mathrm{Present}, \mathrm{Past}).
hekeus@52 170 \end{equation}
hekeus@53 171 It is a measure of the mutual information between the `present' and the `future' given the `past'. In other words, it is a measure of the \emph{new} information in each symbol.
hekeus@52 172 %Notice that if the past completely determines both the present and the future (as in the cyclic
hekeus@52 173 %pattern above) the PIR is zero, since the present symbol brings no new information. However,
hekeus@52 174 %if the symbols in a sequence are generated completely independently, e.g. by rolling a die for each
hekeus@52 175 %one, then again, the present symbol provides no information about the future and the PIR
hekeus@52 176 %is zero.
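
For a stationary first-order Markov chain the PIR takes a simple form. As a sketch, write $a_{ij} = \Pr(X_{t+1}=i \mid X_t=j)$ for the transition matrix and $\pi$ for its stationary distribution (notation introduced for this illustration only). By the Markov property, nothing before $X_t$ adds information about the future once $X_t$ is known, and $X_t$ adds nothing about events after $X_{t+1}$ once $X_{t+1}$ is known, so the definition collapses to
\begin{equation}
  b_\mu = I(X_t; X_{t+1}|X_{t-1}) = H(X_{t+1}|X_{t-1}) - H(X_{t+1}|X_t).
\end{equation}
The second term is the entropy rate $h_\mu$. The first is the same conditional entropy evaluated for the two-step transition matrix $a^2$, with entries $[a^2]_{ij} = \sum_k a_{ik}a_{kj}$, under the same $\pi$. Both terms vanish for a deterministic cycle, and both equal $H(X_t)$ when the symbols are independent, so $b_\mu = 0$ in either case.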
hekeus@52 177
hekeus@53 178
hekeus@53 179
hekeus@52 180 The behaviour of the predictive information rate makes it interesting from a compositional point of view. The definition
hekeus@52 181 of the PIR is such that it is low both for extremely regular processes, such as constant
hekeus@52 182 or periodic sequences, \emph{and} for extremely random processes, where each symbol
hekeus@52 183 is chosen independently of the others, in a kind of `white noise'. In the former case,
hekeus@52 184 the pattern, once established, is completely predictable and therefore there is no
hekeus@52 185 \emph{new} information in subsequent observations. In the latter case, the randomness
hekeus@52 186 and independence of all elements of the sequence means that, though potentially surprising,
hekeus@52 187 each observation carries no information about the ones to come.
hekeus@52 188
hekeus@52 189 Processes with high PIR maintain a certain kind of balance between
hekeus@52 190 predictability and unpredictability in such a way that the observer must continually
hekeus@52 191 pay attention to each new observation as it occurs in order to make the best
hekeus@52 192 possible predictions about the evolution of the sequence. This balance between predictability
hekeus@53 193 and unpredictability is reminiscent of the inverted `U' shape of the Wundt curve (see \Figrf{wundt}),
hekeus@53 194 which summarises the observations of Wundt \shortcite{Wundt1897} that stimuli are most
hekeus@52 195 pleasing at intermediate levels of novelty or disorder, where there is a balance between
hekeus@52 196 `order' and `chaos'.
hekeus@52 197
hekeus@52 198 \begin{fig}{wundt}
hekeus@52 199 \raisebox{-4em}{\colfig[0.43]{wundt}}
hekeus@52 200 % {\ \shortstack{{\Large$\longrightarrow$}\\ {\scriptsize\emph{exposure}}}\ }
hekeus@52 201 {\ {\large$\longrightarrow$}\ }
hekeus@52 202 \raisebox{-4em}{\colfig[0.43]{wundt2}}
hekeus@52 203 \caption{
hekeus@52 204 The Wundt curve relating randomness/complexity with
hekeus@52 205 perceived value. Repeated exposure sometimes results
hekeus@52 206 in a move to the left along the curve \cite{Berlyne71}.
hekeus@52 207 }
hekeus@52 208 \end{fig}
hekeus@52 209
hekeus@52 210
hekeus@53 211 A similar shape is visible in the upper envelope of the plot in \Figrf{mtriscat}, which is a 3-D scatter plot of
hekeus@53 212 the information measures for several thousand
hekeus@53 213 first-order Markov chain transition matrices generated by a random sampling method.
hekeus@53 214 The coordinates of the `information space' are entropy rate ($h_\mu$), redundancy ($\rho_\mu$), and
hekeus@53 215 predictive information rate ($b_\mu$). The points along the `redundancy' axis correspond
hekeus@53 216 to periodic Markov chains. Those along the `entropy' axis produce uncorrelated sequences
hekeus@53 217 with no temporal structure. Processes with high PIR are to be found at intermediate
hekeus@53 218 levels of entropy and redundancy.
hekeus@52 219
hekeus@53 220 These observations led us to construct the `Melody Triangle'.
hekeus@52 221
hekeus@52 222 \begin{figure}
hekeus@52 223 \centering
hekeus@53 224 \includegraphics[width=0.23\textwidth]{figs/PeriodicMatrix.pdf}
hekeus@53 225 \includegraphics[width=0.23\textwidth]{figs/NonPeriodicMatrix.pdf}
hekeus@53 226 \caption{Two transition matrices representing Markov chains. The shading represents the probability of transition from one symbol to the next (white=0, black=1). The current symbol is along the bottom, and the next symbol is along the left. The left-hand matrix has no uncertainty; it represents a periodic pattern (a,d,c,b,a,d,c,b,a,d,c,b,a\dots). The right-hand matrix contains unpredictability but is nonetheless not completely without perceivable structure (we know, for instance, that any `b' will always be followed by an `a' and preceded by a `c'); it has a higher entropy rate. \label{TransitionMatrixes}}
hekeus@52 227 \end{figure}
hekeus@52 228
hekeus@52 229
hekeus@52 230
hekeus@52 231 \begin{fig}{mtriscat}
hekeus@53 232 \colfig[1]{mtriscat}
hekeus@52 233 \caption{The population of transition matrices in the 3D space of
hekeus@53 234 entropy rate ($h_\mu$), redundancy ($\rho_\mu$) and predictive information rate ($b_\mu$),
hekeus@53 235 all in bits. Note that the distribution as a whole makes a curved triangle. Although
hekeus@53 236 not visible in this plot, it is largely hollow in the middle.
hekeus@52 237 The concentrations of points along the redundancy axis correspond
hekeus@52 238 to Markov chains which are roughly periodic with periods of 2 (redundancy 1 bit),
hekeus@52 239 3, 4, \etc all the way to period 7 (redundancy 2.8 bits). Note that the highest PIR values are found at intermediate entropy
hekeus@53 240 and redundancy. \label{InfoDynEngine}}
hekeus@52 241 \end{fig}
hekeus@52 242
hekeus@52 243
hekeus@52 244
hekeus@52 245
hekeus@53 246 \section{The Melody Triangle}\label{makingthetriangle}
hekeus@52 247
hekeus@52 248
hekeus@52 249
hekeus@53 250 %Before the Melody Triangle can used, it has to be populated with possible parameter values for the melody generators. These are then plotted in a 3D statistical space of redundancy, entropy rate and predictive information rate. In our case we generated thousands of transition matrixes, representing first-order Markov chains, by a random sampling method. In Fig \ref{InfoDynEngine} we see a representation of how these matrixes are distributed in the 3D statistical space; each one of these points corresponds to a transition matrix.
hekeus@53 251 %
hekeus@53 252 The Melody Triangle is an interface that is designed around this natural distribution of Markov chain transition
hekeus@53 253 matrices in the information space of entropy rate ($h_\mu$), redundancy ($\rho_\mu$) and predictive information rate ($b_\mu$), as illustrated in \Figrf{mtriscat}.
hekeus@52 254
hekeus@52 255
hekeus@52 256 The distribution of transition matrices in this space forms a relatively thin
hekeus@53 257 curved sheet. Thus, it is a reasonable simplification to project out the
hekeus@52 258 third dimension (the PIR) and present an interface that is just two dimensional.
hekeus@53 259
hekeus@53 260
hekeus@52 261 The resulting right-angled triangle is rotated, reflected and stretched to form an equilateral triangle with
hekeus@52 262 the `redundancy'/`entropy rate' vertex at the top, the `redundancy' axis down the left-hand
hekeus@53 263 side, and the `entropy rate' axis down the right, as shown in \Figrf{TheTriangle}.
hekeus@52 264 This is our `Melody Triangle' and
hekeus@52 265 forms the interface by which the system is controlled.
hekeus@52 266
hekeus@52 267 \begin{fig}{TheTriangle}
hekeus@52 268 \colfig[1]{TheTriangle.pdf}
hekeus@52 269 \caption{The Melody Triangle}
hekeus@52 270 \end{fig}
hekeus@52 271
hekeus@53 272 \subsection{Usage}
hekeus@53 273
hekeus@52 274 %Using this interface thus involves a mapping to information space;
hekeus@52 275 The user selects a point within the triangle; this is mapped into the
hekeus@52 276 information space, and the nearest transition matrix is used to generate
hekeus@52 277 a sequence of values, which are then sonified either as pitched notes or percussive
hekeus@52 278 sounds.
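
One way to make `nearest' precise, offered here as an illustrative assumption rather than a description of the implementation, is Euclidean distance in the entropy rate/redundancy plane. If the library holds matrices indexed by $k$ with precomputed coordinates $(h_\mu^{(k)}, \rho_\mu^{(k)})$, and the selected point maps to $(h^\ast, \rho^\ast)$, the chosen matrix would be
\begin{equation}
  k^\ast = \arg\min_k \left[ \left(h_\mu^{(k)} - h^\ast\right)^2 + \left(\rho_\mu^{(k)} - \rho^\ast\right)^2 \right].
\end{equation}
Other metrics, or a weighting between the two axes, would serve equally well under this sketch; the indices $k$, $k^\ast$ and the target coordinates are hypothetical names.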
hekeus@52 279
hekeus@53 280 Though the interface is 2D, the third dimension (predictive information rate) is implicitly present, as
hekeus@52 281 transition matrices retrieved from
hekeus@52 282 along the centre line of the triangle will tend to have higher PIR.
hekeus@52 283
hekeus@53 284 As shown in \Figrf{TheTriangle}, the corners correspond to three different extremes of predictability and
hekeus@52 285 unpredictability, which could be loosely characterised as `periodicity', `noise'
hekeus@53 286 and `repetition'. Melodies from the `noise' corner (high $h_\mu$, low $\rho_\mu$
hekeus@53 287 and low $b_\mu$) have no discernible pattern;
hekeus@52 288 those along the `periodicity'
hekeus@52 289 to `repetition' edge are all cyclic patterns that get shorter as we approach
hekeus@52 290 the `repetition' corner, until each is just one repeating note. Those along the
hekeus@52 291 opposite edge consist of independent random notes from non-uniform distributions.
hekeus@53 292 Areas between the left and right edges will tend to have higher predictive information rate ($b_\mu$),
hekeus@52 293 and we hypothesise that, under
hekeus@52 294 the appropriate conditions, these will be perceived as more `interesting' or
hekeus@52 295 `melodic.'
hekeus@52 296 These melodies have some level of unpredictability, but are not completely random;
hekeus@52 297 or, conversely, they are predictable, but not entirely so.
hekeus@52 298
hekeus@52 299 Given coordinates corresponding to a point in the triangle, we select from a pre-built
hekeus@52 300 library of random processes, choosing one whose entropy rate and redundancy match the desired
hekeus@52 301 values. The implementations discussed in this paper use first-order Markov chains as the content generator,
hekeus@52 302 since it is easy to compute the theoretically exact values of entropy rate, redundancy and predictive
hekeus@52 303 information rate given the transition matrix of the Markov chain. However, in principle, any generative system could be used to create the library of sequences, given an appropriate probabilistic listener model supporting
hekeus@52 304 the estimation of entropy rate and redundancy.
hekeus@52 305
hekeus@52 306
hekeus@52 307 The Markov chain based implementation generates streams of symbols in the abstract; the alphabet of symbols is then mapped to a set of distinct sounds, such as pitched notes in a scale or a set of percussive
hekeus@52 308 sounds. Further, by layering these streams, intricate musical textures can be created. The selection of
hekeus@52 309 notes or sounds is arbitrary, as long as they are all distinguishable.
hekeus@52 310 %)le is not a part of the Melody Triangle's core functionality, i
hekeus@52 311 Indeed, the symbols could even be mapped to non-sonic outputs such as visible shapes, colours, or movements.
hekeus@52 312
hekeus@53 313 %The physical interface to the Triangle has so far been realised in two forms: as an interactive installation and as a screen based interface. Currently a mobile phone version is under development.
hekeus@52 314
hekeus@53 315 \section{Interfaces}
hekeus@53 316
hekeus@53 317
hekeus@53 318
hekeus@53 319 \subsection{Interface 1: The Interactive Installation}
hekeus@53 320 \begin{figure}
hekeus@53 321 \centering
hekeus@53 322 \includegraphics[width=0.5\textwidth]{figures/kinnect.pdf}
hekeus@53 323 \caption{The depth map as seen by the Kinect camera in the interactive installation version of the Melody Triangle. The bounding box outlines the blobs detected by OpenNI.\label{Kinect}}
hekeus@53 324 \end{figure}
hekeus@53 325 The Melody Triangle was first implemented as a multi-user
hekeus@53 326 interactive installation. It has been exhibited at the Brighton Science Festival 2012, Digital Shoreditch as well as at The British Science Festival 2011. A Kinect\footnote{http://www.xbox.com/en-GB/Kinect} camera tracks individuals in a space, the range of its depth sensors naturally forming a triangle.
hekeus@53 327
hekeus@53 328 As visitors/users come into the range of the camera, they start generating a melody, with the statistical properties of this melody determined by the mapping of physical space to the statistical space of the Melody Triangle. Thus, by exploring the physical space, the participant changes the predictability of the generated melodic content. When multiple people are in the space they can cooperate to create interweaving melodies, forming intricate polyphonic textures.
hekeus@53 329
hekeus@53 330 % Additionally the visitors can change the periodicity, register and instrumentation of their melody with body gestures. When multiple people are in the space they can cooperate to create interweaving melodies, forming intricate polyphonic textures.
hekeus@53 331 This makes the interaction physically engaging
hekeus@53 332 and (as our experience with visitors both young and old has demonstrated) more playful.
hekeus@53 333 %Additionally visitors can change the
hekeus@53 334 %tempo, register, instrumentation and periodicity of their melody with body gestures.
hekeus@53 335
hekeus@53 336
hekeus@53 337
hekeus@53 338
hekeus@53 339 \subsubsection{Tracking and Control}
hekeus@53 340
hekeus@53 341 Tracking and control were done using the OpenNI libraries' API\footnote{http://OpenNi.org/} and high-level middleware for tracking with Kinect. This provided reliable blob tracking of humanoid forms in 2D space. By triangulating this with the Kinect's depth map it became possible to get reliable coordinates of visitors' positions in the space.
hekeus@53 342
hekeus@53 343 By detecting the bounding box of the 2D blobs of individuals in the space, and then normalising these based on the distance in the depth map, it became possible to work out if an individual had an arm stretched out or if they were crouching. With this it was possible to define a series of gestures for controlling the system without the use of any controllers (see Table~\ref{gestures}). Thus, for instance, sticking out one's left arm quickly doubles the tempo of the melody. Pulling one's left arm in at the same time as sticking the right arm out shifts the melody onto the off-beat. Sending out both arms changes the instrument being `played', and crouching decreases the volume of the melody.
hekeus@53 344
hekeus@53 345 \begin{table}
hekeus@53 346 \centering
hekeus@53 347 %\includegraphics[width=0.5\textwidth]{InstructionsText.pdf}
hekeus@53 348 \caption{Gestures and their resulting effect\label{gestures}}
hekeus@53 349 \begin{tabular}{ l c l }
hekeus@53 350 Left arm & Right arm & Effect\\
hekeus@53 351 \hline
hekeus@53 352 out & static & double tempo \\
hekeus@53 353 in & static & halve tempo \\
hekeus@53 354 static & out & triple tempo \\
hekeus@53 355 static & in & one-third tempo\\
hekeus@53 356 out & in & shift to off-beat \\
hekeus@53 357 out & out & change instrument\\
hekeus@53 358 in & in & reset tempo\\
hekeus@53 359 \end{tabular}
hekeus@53 360 \end{table}
hekeus@53 361
hekeus@53 362 \begin{figure}
hekeus@53 363 \centering
hekeus@53 364 \includegraphics[width=0.5\textwidth]{figures/InstructionsImage2.pdf}
hekeus@53 365 \caption{Gestures and their resulting effect \label{gestures2}}
hekeus@53 366 \end{figure}
hekeus@53 367
hekeus@53 368 \subsubsection{Observations}
hekeus@53 369 Although visitors needed some initial instructions, they were then quickly able to collaboratively design musical textures. For example, one person would lay down a predictable repeating bass line by keeping to the periodicity/repetition side of the room, while a companion would generate a freer melodic line by staying nearer the `noise' part of the space.
hekeus@53 370
hekeus@53 371
hekeus@53 372 The collaborative nature of this installation is an area that merits attention. Because no single user could control the whole narrative, participants would communicate verbally and direct each other towards the goals of learning to use the system and finding interesting musical textures. This collaboration added an element of playfulness and enjoyment that was clearly apparent.
hekeus@53 373
hekeus@53 374 As an artefact, this installation occupies an ambiguous role in terms of purpose; it sits in a nebulous middle ground between instrument, art installation and technical demonstration. It is clear, however, that it has proved very effective as a vehicle for communicating ideas related to expectation, pattern and predictability in music to the general public.
hekeus@53 375
hekeus@53 376 However, we were also interested in carrying out some studies under more controlled circumstances, and in the Melody Triangle's potential as a compositional aid or music performance interface. To this end we developed a screen-based user interface to the Melody Triangle.
hekeus@53 377
hekeus@53 378
hekeus@53 379 \subsection{Interface 2: The Screen Interface}
hekeus@53 380 %The screen based interface can serve as a compositional tool or performance interface.
hekeus@52 381
hekeus@52 382 \begin{fig}{melTriScreenShot}
hekeus@52 383 \colfig[1]{melTriScreenShot}
hekeus@53 384 \caption{Screen shot of the Melody Triangle screen UI. The transition matrices currently being played are displayed on the right. Each token flashes whenever a note from its melody is rendered. }
hekeus@52 385 \end{fig}
hekeus@52 386
hekeus@52 387 %%A triangle is drawn on the screen, screen space thus mapped to the statistical
hekeus@52 388 %space of the Melody Triangle.
hekeus@53 389 In the screen-based interface, a number of tokens, each representing a
hekeus@52 390 sonification stream or `voice', can be dragged in and around the triangle.
hekeus@52 391 For each token, a sequence of symbols is sampled using the corresponding
hekeus@52 392 transition matrix, and the symbols are then mapped to notes of a scale or percussive sounds%
hekeus@52 393 \footnote{The sampled sequence could easily be mapped to other musical processes, possibly over
hekeus@52 394 different time scales, such as chords, dynamics and timbres. It would also be possible
hekeus@52 395 to map the symbols to visual or other outputs.}%
hekeus@52 396 . Keyboard commands give control over other musical parameters such
hekeus@52 397 as the pitch register, volume, scale, inter-onset interval and instrument for each voice.
hekeus@52 398 %The possibilities afforded by the Melody Triangle in these other domains remains to be investigated.}.
hekeus@52 399 %
hekeus@52 400 The system is capable of generating quite intricate musical textures when multiple tokens
hekeus@52 401 are in the triangle. The overlapping and interweaving of melodies of varying periodicity and predictability is well suited to making content that could stylistically be characterised as `minimalism'.
hekeus@52 402
hekeus@52 403 This interface is quite unlike other computer aided composition tools or programming
hekeus@52 404 environments, as here the composer exercises control at the abstract level of information-dynamic
hekeus@52 405 properties.
hekeus@53 406 A video of the interface in use can be viewed at \url{http://bit.ly/My49lT}.
hekeus@52 407 %the interface relating to subjective expectation and predictability.
hekeus@52 408
hekeus@52 409
hekeus@52 410
hekeus@52 411
hekeus@52 412
hekeus@52 413
hekeus@52 414 \section{User trials with the Melody Triangle}
hekeus@53 415 %We are currently in the process of using the screen-based Melody Triangle user interface to investigate the relationship between the information-dynamic characteristics of sonified Markov chains and subjective musical preference.
hekeus@52 416 We carried out a pilot study with six participants, who were asked
hekeus@52 417 to use a simplified form of the user interface (a single controllable token,
hekeus@52 418 and no rhythmic, registral or timbral controls) under two conditions:
hekeus@52 419 one where a single sequence was sonified under user control, and another
hekeus@52 420 where an additional sequence was sonified in a different register, as if generated
hekeus@52 421 by a fixed invisible token in one of four regions of the triangle. In addition, subjects
hekeus@52 422 were asked to press a key if they `liked' what they were hearing.
hekeus@52 423
hekeus@53 424
hekeus@53 425 Our hypothesis was that users would linger longer in areas of the triangle that produce more aesthetically desirable sequences, and that these would tend to be the areas of high predictive information rate, that is, areas along the middle and lower edge of the triangle.
hekeus@53 426
hekeus@53 427
hekeus@52 428 We recorded subjects' behaviour as well as points which they marked
hekeus@52 429 with a key press.
hekeus@53 430
hekeus@53 431
hekeus@53 432 \subsection{Results}
hekeus@53 433 Some results for four of the subjects are shown in \Figrf{mtri-results}. We have not been able to detect any systematic across-subject preference for any particular region of the triangle.
hekeus@52 434
hekeus@52 435 \begin{fig}{mtri-results}
hekeus@52 436 \def\scat#1{\colfig[0.42]{mtri/#1}}
hekeus@52 437 \def\subj#1{\scat{scat_dwells_subj_#1} & \scat{scat_marks_subj_#1}}
hekeus@52 438 \begin{tabular}{cc}
hekeus@53 439 \subj{a} \\
hekeus@52 440 \subj{b} \\
hekeus@52 441 \subj{c} \\
hekeus@52 442 \subj{d}
hekeus@52 443 \end{tabular}
hekeus@52 444 \caption{Dwell times and mark positions from user trials with the
hekeus@53 445 on-screen Melody Triangle interface, for four subjects. The left-hand column shows
hekeus@52 446 the positions in a 2D information space (entropy rate vs redundancy
hekeus@52 447 in bits) where each spent their time; the area of each circle is proportional
hekeus@52 448 to the time spent there. The right-hand column shows points which subjects
hekeus@52 449 `liked'; the area of the circles here is proportional to the duration spent at
hekeus@52 450 that point before the point was marked.}
hekeus@52 451 \end{fig}
hekeus@52 452
hekeus@53 453
hekeus@52 454 Comments collected from the subjects
hekeus@52 455 %during and after the experiment
hekeus@52 456 suggest that
hekeus@52 457 the information-dynamic characteristics of the patterns were readily apparent
hekeus@52 458 to most: several noticed the main organisation of the triangle,
hekeus@52 459 with repetitive notes at the top, cyclic patterns along one edge, and unpredictable
hekeus@52 460 notes towards the opposite corner. Some described their systematic exploration of the space.
hekeus@52 461 Two felt that the right side was `more controllable' than the left (a consequence
hekeus@52 462 of their ability to return to a particular distinctive pattern and recognise it
hekeus@52 463 as one heard previously). Two reported that they became bored towards the end,
hekeus@52 464 but another felt there wasn't enough time to `hear out' the patterns properly.
hekeus@53 465 One subject did not `enjoy' the patterns in the lower region, but another said the lower
hekeus@53 466 central regions were more `melodic' and `interesting'.
hekeus@53 467
hekeus@53 468 \subsection{Discussion}
hekeus@53 469 Our initial hypothesis, that subjects would linger longer in regions of the triangle
hekeus@53 470 that produced aesthetically preferable sequences, and that this would tend to be towards the
hekeus@53 471 centre line of the triangle for all subjects, was not confirmed.
hekeus@53 472 However, the subjects did seem to exhibit distinct kinds of exploratory behaviour.
hekeus@53 473 It is possible
hekeus@53 474 that the design of the experiment encouraged an initial exploration of the space (sometimes
hekeus@53 475 very systematic, as for subject (c)) aimed at \emph{understanding} %the parameter space and
hekeus@53 476 how the system works, rather than finding musical patterns. It is also possible that the
hekeus@53 477 system encourages users to create musically interesting output by \emph{moving the token},
hekeus@53 478 rather than finding a particular spot in the triangle which produces a musically interesting
hekeus@53 479 sequence by itself.
hekeus@53 480
hekeus@52 481
hekeus@52 482 We plan to continue the trials with a slightly less restricted user interface in order
hekeus@52 483 to make the experience more enjoyable and thereby give subjects longer to use the interface;
hekeus@52 484 this may allow them to get beyond the initial exploratory phase and give a clearer
hekeus@52 485 picture of their aesthetic preferences. In addition, we plan to conduct a
hekeus@52 486 study under more restrictive conditions, where subjects will have no control over the patterns
hekeus@52 487 other than to signal (a) which of two alternatives they prefer in a forced
hekeus@52 488 choice paradigm, and (b) when they are bored of listening to a given sequence.
hekeus@52 489
hekeus@53 490 \section{Qualitative Feedback}
hekeus@52 491
hekeus@53 492 In parallel with the pilot study, we collected qualitative feedback from users of the screen interface, with a view to establishing which features would be desired in any further development of the interface, for instance as a VST instrument for inclusion in a standard audio production environment.
hekeus@53 493 Unlike in the pilot study, where participants knew nothing about the interface beforehand and were asked to `explore' with as little instruction as possible, here the potential users, who are music practitioners, were first taught how to use the system. They were then given some time to play and experiment, after which feedback and criticism of the system were sought in informal discussion.
hekeus@52 494
hekeus@52 495
hekeus@53 496 %\begin{quote} ``You can hear really quick what the different areas (of the triangle) are like.''\end{quote}
hekeus@53 497 %\begin{quote}``I would love to try it out as a performance tool"\end{quote}
hekeus@53 498
hekeus@53 499
hekeus@53 500
hekeus@53 501 % I love it, I'm really having fun, it's really good.
hekeus@53 502 % for me, as I like generative music, this is amazing, because I like to randomise things and listen to it, and ask, what can I do with it? and then have this kind of choices to edit probabilities I really like that.
hekeus@53 503
hekeus@53 504 %"it would work great with a live-scoring system allowing users to control real musicians"
hekeus@53 505
hekeus@53 506 % hear what is cool that you have control several things within one interface, I can make multiple loops and they all have different characterisitcs and I don't have to adjust like five numbers in different plugins in different places, it's in one thing, and that's what I like most, it's kind of like a macro thing.
hekeus@53 507
hekeus@53 508 %
hekeus@53 509
hekeus@53 510 Some of the points collected were:
hekeus@52 511 \begin{itemize}
hekeus@52 512 \item The subjects were very quick to get to grips with the properties of the different areas of the triangle, and found it quite intuitive.
hekeus@52 513 \item The more periodic/predictable half of the triangle was used considerably more.
hekeus@53 514 \item They expressed interest in its potential as a live performance interface for electronic music.
hekeus@53 515 \item Some users desired the ability to map the output of the triangle to parameters other than notes, such as the control of filters and effect parameters.
hekeus@52 516 \end{itemize}
hekeus@52 517
hekeus@53 518 Some of the users indicated that the Melody Triangle could integrate well into their musical practice. Some comments are provided here:
hekeus@53 519 \begin{quote}``If it was a kind of VST instrument, I would use it really a lot, definitely! Because there are not that many around that make this kind of stuff. I always love if something is generative or stochastic to generate things I would not come up with, but to generate a lot of them in a short amount of time and I'm the creative catalyst that just picks them.. and then have this kind of choices to edit probabilities, I really like that.''\end{quote}
hekeus@52 520
hekeus@53 521 \begin{quote}
hekeus@53 522 ``Here what is cool is that .. I can make multiple loops and they all have different characteristics and I don't have to adjust like five numbers in different places, it's in one thing, and that's what I like most, it's kind of like a macro [interface].''
hekeus@53 523 \end{quote}
hekeus@52 524
hekeus@53 525 \begin{quote}``I would use it as an idea generator .. what I probably would do is I would run this, maybe I would select some random sounds and maybe I would try around and develop some motifs, and see `oh I like that!' and would record that as midi and move on.''\end{quote}
hekeus@53 526
hekeus@53 527 Stochastic processes have often been used to generate musical materials. While such processes
hekeus@53 528 can drive the \emph{generative} phase of the creative process, these comments suggest that information dynamics and the Melody Triangle
hekeus@53 529 can serve as a novel framework for a \emph{selective} phase;
hekeus@53 530 %providing a set of criteria to be used in judging which of the
hekeus@53 531 helping composers discover generated materials that
hekeus@53 532 are of value. This alternation of generative and selective phases has been
hekeus@53 533 noted before \cite{Boden1990}.
hekeus@53 534 %
hekeus@53 535 %Information-dynamic criteria can also be used as \emph{constraints} on the
hekeus@53 536 %generative processes, for example, by specifying a certain temporal profile
hekeus@53 537 %of suprisingness and uncertainty the composer wishes to induce in the listener
hekeus@53 538 %as the piece unfolds.
hekeus@53 539
hekeus@53 540 % Here the users don't select parameters directly, but instead through the triangle interface, the users define the information-theoretic properties of the output.
hekeus@53 541
hekeus@53 542
hekeus@53 543
hekeus@53 544 %\begin{quote}
hekeus@53 545 %``for me, as I like generative music, this is amazing, because I like to randomise things and listen to it, and ask, what can I do with it? and then have this kind of choices to edit probabilities I really like that.''
hekeus@53 546 %\end{quote}
hekeus@53 547
hekeus@53 548
hekeus@53 549
hekeus@53 550 % you pick one thing that you like, your main theme, and then you add another one
hekeus@53 551 %
hekeus@53 552 % I liked randomised
hekeus@53 553 % It is really intuitive to..
hekeus@53 554 %
hekeus@53 555 % for me, as I like generative music, this is amazing, because I like to randomise things and listen to it, and ask, what can I do with it? and then have this kind of choices to edit probabilities I really like that.
hekeus@53 556 % as an interface for one person that works very well
hekeus@53 557 % if you like, if you are really exploring what happens if I do that and that it works great.
hekeus@53 558 %
hekeus@53 559 % to be honest,
hekeus@53 560 % multi-polimetric stuff,
hekeus@53 561 %
hekeus@53 562 %
hekeus@53 563 % it would be really fun if someone controls this and real musicians on acoustic instruments.
hekeus@53 564 % pre-listen, fade things in , fade things out..
hekeus@53 565 % it would work great with a live-scoring system allowing users to control real musicians
hekeus@53 566 %
hekeus@53 567 %
hekeus@53 568 % hear what is cool that you have control several things within one interface, I can make multiple loops and they all have different characterisitcs and I don't have to adjust like five numbers in different plugins in different [journals?], it's in one thing, and that's what I like most, it's kind of like a macro thing.
hekeus@53 569 %
hekeus@53 570 %in terms of playing around, need a way to quickly select instruments..
hekeus@53 571 %
hekeus@53 572 % it is interesting for several guys .. for normal users as well as for performance, i think you can do quite a lot with it
hekeus@53 573 %[composition]
hekeus@53 574 % i would use it all the time, so if you could make me a copy I'll take it home
hekeus@53 575 %as soon as you can make it a stand-alone version I would really use it, cus I do this kind of stuff a lot, and I like to have a randomise button and I like to have this kind of.. just as an idea generator.
hekeus@53 576 %
hekeus@53 577 %what i probably would do is I would run this, maybe I would select some random sounds and maybe I would try around and develop some motifs, and see oh I like that would record that as midi and move on.
hekeus@53 578 %and it could be also nice if you already have something..
hekeus@53 579 %
hekeus@53 580 % So I would love to use that, maybe run it in synch with the stuff I already have, and just play around and see oh well that could fit or try out different sounds, try out different rhythms and just play around with it, and say 'oh I like that', and probably would record that.
hekeus@53 581 % If it was a kind of VST instrument, I would use it really a lot, definitely! Cuz there are not that many around that make this kind of stuff, I always love if something is generative or stochastic, to use it as an idea generator, to generate things I would not come up with, but to generate a lot of them in a short amount of time and I'm the creative catalyst that just picks them.
hekeus@53 582 %
hekeus@53 583 % it is a cool idea generally to say.. this kind of.. relative qualitative parameters.. to say I want it to have it a little bit like that, but a little bit different.. so this would be the really cool feature, say you are playing a synthesiser and say wow that sounds is good but I want it you know a little bit different than that and then try different options of that sound and say yeah that is the one.. because this difficult to achieve.
hekeus@53 584 %
hekeus@53 585 % As i really like this kind of stuff I'd use it straight forward to create notes, and then think about what kind of other process I have to control where I could use that..
hekeus@53 586 %
hekeus@53 587 % you sequence some other parameters
hekeus@53 588 %
hekeus@53 589
hekeus@53 590
hekeus@53 591 \section{The Mobile App}
hekeus@53 592
hekeus@53 593 In order to extend our study of musical preferences to a wider audience, the Melody Triangle is being implemented as an Android mobile phone application. The audio engine is developed in libpd\footnote{\url{http://libpd.cc/}}, a port of the open source Pure Data programming environment. The app will allow users to use the phone's touch screen to drag tokens around the triangle and generate musical textures. Usage statistics will be collected on the phone and periodically uploaded to our servers for analysis.
hekeus@53 594
hekeus@53 595
hekeus@53 596 % It is essentially a user friendly generative, perpetual minimalist music making app. Additionally it will collect usage statistics, and keep a record of the users' favourite positions in the triangle and periodically upload these to our servers.
hekeus@52 597 \begin{fig}{mobile}
hekeus@52 598 \colfig[1]{mobile}
hekeus@52 599 \caption{The Melody Triangle mobile phone app.}
hekeus@52 600 \end{fig}
hekeus@52 601
hekeus@52 602
hekeus@53 603 \section{Conclusion}
hekeus@53 604 %the MT
hekeus@53 605 %the infromation theory
hekeus@53 606 %the interactive installation
hekeus@53 607 %the screen based interface
hekeus@53 608 %the pilot study
hekeus@53 609 %the qualitative feedback
hekeus@53 610 %the mobile app
hekeus@53 611
hekeus@53 612 %We presented the Melody Triangle; an interface for the discovery of melodic content that emerged out of research in \emph{information dynamics} -- an information theoretic approach to modelling human expectation and surprise. It allows a user to explore the melodic properties of Markov chains by identifying positino
hekeus@53 613
hekeus@53 614 We presented the Melody Triangle, an interface for the discovery of melodic content where the input -- positions within a triangle -- corresponds to the predictability of the output melodies. The Melody Triangle is contextualised in \emph{information dynamics}, an information theoretic approach to modelling human expectation and surprise.
hekeus@53 615
hekeus@53 616 We outlined the relevant ideas behind information dynamics and described three key information theoretic measures: entropy rate, redundancy and \emph{predictive information rate}, which describes the information that the current observation provides about the future and that is not already known from past observations. We described how the natural distribution of randomly generated Markov chains in terms of these measures led us to design the Melody Triangle, and outlined its two physical incarnations.
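
For a first-order Markov chain, the three measures named above can be computed directly from the transition matrix. The following is a minimal sketch rather than the code used in our system: it assumes a row-stochastic, irreducible transition matrix, reports all quantities in bits, and uses hypothetical function names. It relies on the standard identities for a stationary chain with stationary distribution $\pi$: the entropy rate is $h = H(X_{t+1} \mid X_t)$, the redundancy is $\rho = H(\pi) - h$, and the predictive information rate is $b = H(X_{t+1} \mid X_{t-1}) - h$, where the first term is the entropy rate of the two-step transition matrix.

```python
import numpy as np


def entropy_bits(p):
    """Shannon entropy in bits, treating 0 log 0 as 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))


def stationary(a):
    """Stationary distribution of a row-stochastic, irreducible matrix.

    Solves pi = pi @ a together with sum(pi) = 1 by least squares.
    """
    k = a.shape[0]
    lhs = np.vstack([a.T - np.eye(k), np.ones((1, k))])
    rhs = np.zeros(k + 1)
    rhs[-1] = 1.0
    pi, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    pi = np.clip(pi, 0.0, None)  # remove tiny negative round-off
    return pi / pi.sum()


def cond_entropy(a, pi):
    """Average row entropy of `a`, weighted by the distribution `pi`."""
    return float(sum(pi[i] * entropy_bits(a[i]) for i in range(len(pi))))


def measures(a):
    """Return (entropy rate, redundancy, predictive information rate)."""
    a = np.asarray(a, dtype=float)
    pi = stationary(a)
    h = cond_entropy(a, pi)             # H(X_{t+1} | X_t)
    rho = entropy_bits(pi) - h          # H(pi) - h
    pir = cond_entropy(a @ a, pi) - h   # H(X_{t+1} | X_{t-1}) - h
    return h, rho, pir


if __name__ == "__main__":
    print(measures(np.full((4, 4), 0.25)))           # maximally random chain
    print(measures(np.roll(np.eye(4), 1, axis=1)))   # purely cyclic chain
```

Both extremes give zero predictive information rate: the uniform chain has maximal entropy rate and no redundancy, while the cyclic chain has zero entropy rate and maximal redundancy. Chains between these extremes can have a positive predictive information rate.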
hekeus@53 617
hekeus@53 618 The first is a multi-user installation where collaboration in a performative setting provides a playful yet informative way to explore expectation and surprise in music.
hekeus@53 619
hekeus@53 620 The second is a screen-based interface where the Melody Triangle can be used as a musical performance interface or compositional aid for the generation of musical textures; the user exercises control at the abstract level of randomness and predictability. We outlined some qualitative feedback gathered from users of the system; it was generally positive and indicates that the Melody Triangle could be useful as a performance tool and composition aid. We described a pilot study where the screen-based interface was used under experimental conditions to determine how the information dynamics measures might relate to musical preference. Although the results were inconclusive, we plan to continue this work under different experimental setups. Finally, we outlined a forthcoming mobile phone version of the Melody Triangle that, when released, will collect data from its users with a view to helping us identify any relationship between human musical preferences and the information-dynamic model of human expectation and surprise.
hekeus@53 621
hekeus@53 622
hekeus@52 623
hekeus@52 624 \bibliography{all,c4dm,all2}\bibliographystyle{aaai}
hekeus@52 625 \end{document}