annotate mume2012/mume2012_review.tex @ 58:a63c438b3f65 tip

Squeezed it into the 6 page limit

author	Henrik Ekeus <hekeus@eecs.qmul.ac.uk>
date	Tue, 11 Jun 2013 15:17:21 +0100
parents	70bfa77c1476
children

rev	line source
hekeus@54	1 \documentclass[letterpaper]{article}
hekeus@54	2 \usepackage{aaai}
hekeus@54	3 \usepackage{times}
hekeus@54	4 \usepackage{helvet}
hekeus@54	5 \usepackage{courier}
hekeus@54	6 \usepackage{tools} %custom
hekeus@54	7
hekeus@54	8
hekeus@54	9
hekeus@54	10
hekeus@54	11 \let\citep=\cite
hekeus@54	12 \newcommand{\colfig}[2][1]{\includegraphics[width=#1\linewidth]{figs/#2}}%
hekeus@54	13 \newcommand\past[1]{\overset{\rule{0pt}{0.2em}\smash{\leftarrow}}{#1}}
hekeus@54	14 \newcommand\fut[1]{\overset{\rule{0pt}{0.1em}\smash{\rightarrow}}{#1}}
hekeus@54	15 \frenchspacing
hekeus@54	16 \pdfinfo{
hekeus@54	17 /Title (The Melody Triangle: Exploring Pattern and Predictability in Music )
hekeus@54	18 /Subject (Musical Metacreation, Interfaces)
hekeus@54	19 /Author (Henrik Ekeus, Samer A. Abdallah, Mark D. Plumbley, Peter W. McOwan)}
hekeus@54	20 \setcounter{secnumdepth}{0}
hekeus@54	21
hekeus@54	22 % The file aaai.sty is the style file for AAAI Press
hekeus@54	23 % proceedings, working notes, and technical reports.
hekeus@54	24 %
hekeus@54	25 \title{The Melody Triangle:\\ Exploring Pattern and Predictability in Music}
hekeus@54	26 \author{Henrik Ekeus, Samer A. Abdallah, Mark D. Plumbley, Peter W. McOwan\\
hekeus@54	27 Centre for Digital Music, Queen Mary University of London,\\
hekeus@54	28 London E1 4NS, UK\\
hekeus@54	29 }
hekeus@54	30 \begin{document}
hekeus@54	31 \maketitle
hekeus@54	32 \begin{abstract}
hekeus@54	33 \begin{quote}
hekeus@54	34
hekeus@54	35 The Melody Triangle is an interface for the discovery of melodic materials, where the input -- a position within a triangle -- maps directly to information-theoretic properties of the output. A model of human expectation and surprise in the perception of music, \emph{information dynamics}, is used to `map out' a musical generative system's parameter space. This enables a user to explore the possibilities afforded by a generative algorithm, in this case Markov chains, not by directly selecting parameters, but by specifying the subjective \emph{predictability} of the output sequence. We describe some of the relevant ideas from information dynamics and how the Melody Triangle is defined in terms of these. We describe its incarnation as a screen-based performance tool and compositional aid for the generation of musical textures, the user's control at the abstract level of randomness and predictability, and some pilot studies carried out with it. We also briefly outline a multi-user installation, where collaboration in a performative setting provides a playful yet informative way to explore expectation and surprise in music, and a forthcoming mobile phone version of the Melody Triangle.
hekeus@54	36
hekeus@54	37 \end{quote}
hekeus@54	38 \end{abstract}
hekeus@54	39
hekeus@54	40 \noindent
hekeus@54	41
hekeus@54	42 \section{Introduction}
hekeus@54	43
hekeus@54	44 The use of generative stochastic processes in music composition has been widespread for
hekeus@54	45 decades---for instance Iannis Xenakis applied probabilistic mathematical models
hekeus@54	46 to the creation of musical materials~\cite{Xenakis:1992ul}. However, it can sometimes be difficult for a composer to find desirable parameters and navigate the possibilities of a generative algorithm intuitively.
hekeus@54	47
hekeus@54	48 The Melody Triangle is an interface for the discovery of melodic content where the parameter space of a stochastic generative musical process, the Markov chain, is `mapped out' according to the \emph{predictability} of the output. The Melody Triangle was developed in the context of \emph{information dynamics}~\cite{CIP}, an information-theoretic approach to modelling human expectation and surprise in the perception of music.
hekeus@54	49 Users of the Melody Triangle do not select the parameters of the generative process directly; rather, they provide input in the form of a position within a triangle, and this maps to the information-theoretic properties of an output melody.
hekeus@54	50 For instance, one corner of the triangle returns completely random melodies, while another area yields entirely predictable and periodic patterns, with the triangle as a whole covering a spectrum of predictability of the output melodies.
hekeus@54	51
hekeus@54	52 In this paper we outline the concepts and ideas behind information dynamics, and describe the information measures that led to the development of the Melody Triangle. We describe its physical realisations: a multi-user interactive installation in which visitors use their bodies and gestures to generate musical materials, and a screen-based interface. We outline some pilot studies carried out with the screen interface, as well as some qualitative feedback from music practitioners exploring its potential as a performance or composition tool. Finally we outline a forthcoming mobile phone version of the Melody Triangle.
hekeus@54	53
hekeus@54	54 \section{Information Dynamics}
hekeus@54	55 \label{s:Intro}
hekeus@54	56 The relationship between
hekeus@54	57 Shannon's \shortcite{Shannon48} information theory and music and art in general has been the
hekeus@54	58 subject of some interest since the 1950s
hekeus@54	59 \cite{Youngblood58,CoonsKraehenbuehl1958,Moles66,Meyer67,Cohen1962}.
hekeus@54	60 The general thesis is that perceptible qualities and subjective states
hekeus@54	61 like uncertainty, surprise, complexity, tension, and interestingness
hekeus@54	62 are closely related to information-theoretic quantities like
hekeus@54	63 entropy, relative entropy, and mutual information.
hekeus@54	64
hekeus@54	65 Music is an inherently dynamic process. The idea that the musical experience is strongly shaped by the generation
hekeus@54	66 and playing out of strong and weak expectations was put forward by, amongst others,
hekeus@54	67 music theorists L. B. Meyer \shortcite{Meyer:1967} and Narmour \shortcite{Narmour:1977}.
hekeus@54	68
hekeus@54	69 An essential aspect of this is that music is experienced as a phenomenon
hekeus@54	70 that unfolds in time, rather than being apprehended as a static object
hekeus@54	71 presented in its entirety. Meyer argued that the experience depends
hekeus@54	72 on how we change and revise our conceptions \emph{as events happen}, on
hekeus@54	73 how expectation and prediction interact with occurrence, and that, to a
hekeus@54	74 large degree, the way to understand the effect of music is to focus on
hekeus@54	75 this `kinetics' of expectation and surprise.
hekeus@54	76
hekeus@54	77 Prediction and expectation are essentially probabilistic concepts
hekeus@54	78 and can be treated mathematically using probability theory.
hekeus@54	79 We suppose that when we listen to music, expectations are created on the basis
hekeus@54	80 of our familiarity with various styles of music and our ability to
hekeus@54	81 detect and learn statistical regularities in the music as they emerge.
hekeus@54	82 There is experimental evidence that human listeners are able to internalise
hekeus@54	83 statistical knowledge about musical structure
hekeus@54	84 \cite{SaffranJohnsonAslin1999}, and also
hekeus@54	85 that statistical models can form an effective basis for computational
hekeus@54	86 analysis of music
hekeus@54	87 \cite{ConklinWitten95,PonsfordWigginsMellish1999,Pearce2005}.
hekeus@54	88
hekeus@54	89 Information dynamics considers several different kinds of predictability in musical patterns, how these might be quantified using the tools of information theory,
hekeus@54	90 and how they shape or affect the listening experience. Our working hypothesis is that listeners maintain a dynamically evolving probabilistic belief state that enables them to make predictions about how a piece of music will continue.
hekeus@54	91
hekeus@54	92 They do this using both the immediate context of the piece and previous musical experience, such as a familiarity with musical styles and conventions. As the music unfolds, listeners continually revise this belief state, which includes predictive
hekeus@54	93 distributions over possible future events. These changes in probabilistic beliefs can be associated with
hekeus@54	94 quantities of information; these are the focus of information dynamics.
hekeus@54	95
hekeus@54	96 In the next section we briefly describe the information measures that we use to define the Melody Triangle; a more complete overview of information dynamics and some of its applications can be found in \cite{Abdallah:2009p4089} and \cite{CIP}.
hekeus@54	97
hekeus@54	98 \subsection{Sequential Information Measures}\label{sec:Sequential_Information_Measures}
hekeus@54	99
hekeus@54	100 Consider a sequence of symbols from the viewpoint of an observer at a certain time, and split the
hekeus@54	101 sequence into a single symbol in the \emph{present} ($X_t$), an infinite \emph{past} ($\past{X}_t$) and the
hekeus@54	102 infinite \emph{future} ($\fut{X}_t$). The symbols arrive at a constant, uniform rate.
hekeus@54	103
hekeus@54	104 The \emph{entropy rate} of a random process is a well-known, basic measure of its randomness or
hekeus@54	105 unpredictability. The entropy rate is the entropy, \emph{H}, of the \emph{present} given the \emph{past}:
hekeus@54	106 \begin{equation}
hekeus@54	107 \label{eq:entro-rate}
hekeus@54	108 h_\mu = H(X_t\|\past{X}_t).
hekeus@54	109 \end{equation}
hekeus@54	110 That is, it represents our average uncertainty about the present symbol \emph{given}
hekeus@54	111 that we have observed everything before it. Processes with zero entropy rate can
hekeus@54	112 be predicted perfectly given enough of the preceding context.
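As an illustration for the first-order Markov chains used by the Melody Triangle, and assuming notation that is ours rather than defined above (a transition matrix $a$ with entries $a_{ij} = \Pr(X_{t+1}=i \| X_t=j)$, so that each column sums to one, and a stationary distribution $\pi$ satisfying $\pi = a\pi$), the entropy rate reduces to the standard expression
\[
h_\mu = -\sum_j \pi_j \sum_i a_{ij} \log_2 a_{ij},
\]
with the convention $0 \log_2 0 = 0$. Each column of $a$ is the predictive distribution over the next symbol, so $h_\mu$ is the entropy of that distribution averaged over the current symbol.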
hekeus@54	113
hekeus@54	114 The \emph{multi-information rate} $\rho_\mu$ \cite{Dubnov2004}
hekeus@54	115 is the mutual
hekeus@54	116 information, \emph{I}, between the `past' and the `present':
hekeus@54	117 \begin{equation}
hekeus@54	118 \label{eq:multi-info}
hekeus@54	119 \rho_\mu = I(\past{X}_t;X_t) = H(X_t) - H(X_t\|\past{X}_t).
hekeus@54	120 \end{equation}
hekeus@54	121
hekeus@54	122 The multi-information rate can be thought of as a measure of \emph{redundancy}, quantifying the extent to which the same information is to be found in all parts of the sequence.
hekeus@54	123 It is a measure of how much the predictability of the process depends on knowing the
hekeus@54	124 preceding context. It is the difference between the entropy of a single element of the
hekeus@54	125 sequence in isolation (imagine choosing a note from a musical score at random with your
hekeus@54	126 eyes closed and then trying to guess the note) and its entropy after taking into account
hekeus@54	127 the preceding context.
hekeus@54	128 If the previous symbols reduce our uncertainty about the present symbol a great deal, then
hekeus@54	129 the redundancy is high. For example, if we know that a sequence consists of a repeating
hekeus@54	130 cycle such as $ \ldots b, c, d, a, b, c, d, a \ldots$, but we don't know which was the first
hekeus@54	131 symbol, then the redundancy is high, as $H(X_t)$ is high (because we
hekeus@54	132 have no idea about the present symbol in isolation), but $H(X_t\|\past{X}_t)$
hekeus@54	133 is zero, because knowing the previous symbol immediately tells us what the present symbol is.
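For this four-symbol cycle the numbers can be worked out directly, taking the four symbols to be equally likely when the phase of the cycle is unknown:
\[
H(X_t) = \log_2 4 = 2\ \mathrm{bits}, \qquad
H(X_t\|\past{X}_t) = 0, \qquad
\rho_\mu = 2 - 0 = 2\ \mathrm{bits}.
\]
More generally, a cycle of period $N$ over $N$ distinct symbols has $\rho_\mu = \log_2 N$ bits, which gives 1 bit for period 2 and about 2.8 bits for period 7.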
hekeus@54	134
hekeus@54	135 The \emph{predictive information rate} (PIR) \cite{Abdallah:2009p4089} brings in our uncertainty about the future. It is a
hekeus@54	136 measure of how much each symbol reduces our uncertainty about the future as it is
hekeus@54	137 observed, \emph{given} that we have observed the past:
hekeus@54	138 \begin{equation}
hekeus@54	139 \label{eq:PIR}
hekeus@54	140 b_\mu = I(X_t;\fut{X}_t\|\past{X}_t) = H(\fut{X}_t\|\past{X}_t) - H(\fut{X}_t\|X_t,\past{X}_t).
hekeus@54	141 \end{equation}
hekeus@54	142 It is a measure of the mutual information between the `present' and the `future' given the `past'. In other words, it is a measure of the \emph{new} information in each symbol.
hekeus@54	143
hekeus@54	144 The behaviour of the predictive information rate makes it interesting from a compositional point of view. The definition
hekeus@54	145 of the PIR is such that it is low both for extremely regular processes, such as constant
hekeus@54	146 or periodic sequences, \emph{and} low for extremely random processes, where each symbol
hekeus@54	147 is chosen independently of the others, in a kind of `white noise'. In the former case,
hekeus@54	148 the pattern, once established, is completely predictable and therefore there is no
hekeus@54	149 \emph{new} information in subsequent observations. In the latter case, the randomness
hekeus@54	150 and independence of all elements of the sequence means that, though potentially surprising,
hekeus@54	151 each observation carries no information about the ones to come.
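Both limiting cases can be checked for first-order Markov chains. As a sketch, assuming notation that is ours rather than defined above (a transition matrix $a$ with $a_{ij} = \Pr(X_{t+1}=i \| X_t=j)$ and a stationary distribution $\pi = a\pi$), the Markov property reduces the PIR to the difference between two-step and one-step uncertainty,
\[
b_\mu = I(X_t;X_{t+1}\|X_{t-1}) = H(X_{t+1}\|X_{t-1}) - H(X_{t+1}\|X_t),
\]
which in terms of the transition matrix reads
\[
b_\mu = -\sum_j \pi_j \sum_i (a^2)_{ij} \log_2 (a^2)_{ij} + \sum_j \pi_j \sum_i a_{ij} \log_2 a_{ij},
\]
with the convention $0 \log_2 0 = 0$. If $a$ is a permutation matrix (a periodic chain), then $a^2$ is also a permutation matrix, both sums vanish, and $b_\mu = 0$. If every column of $a$ equals $\pi$ (independent symbols), then $a^2 = a$, the two sums cancel, and again $b_\mu = 0$.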
hekeus@54	152
hekeus@54	153 Processes with high PIR maintain a certain kind of balance between
hekeus@54	154 predictability and unpredictability in such a way that the observer must continually
hekeus@54	155 pay attention to each new observation as it occurs in order to make the best
hekeus@54	156 possible predictions about the evolution of the sequence. This balance between predictability
hekeus@54	157 and unpredictability is reminiscent of the inverted `U' shape of the Wundt curve (see \Figrf{wundt}),
hekeus@54	158 which summarises the observations of Wundt \shortcite{Wundt1897} that stimuli are most
hekeus@54	159 pleasing at intermediate levels of novelty or disorder, where there is a balance between
hekeus@54	160 `order' and `chaos'.
hekeus@54	161
hekeus@54	162 \begin{fig}{wundt}
hekeus@54	163 \raisebox{-4em}{\colfig[0.43]{wundt}}
hekeus@54	164 {\ {\large$\longrightarrow$}\ }
hekeus@54	165 \raisebox{-4em}{\colfig[0.43]{wundt2}}
hekeus@54	166 \caption{
hekeus@54	167 The Wundt curve relating randomness/complexity with
hekeus@54	168 perceived value. Repeated exposure sometimes results
hekeus@54	169 in a move to the left along the curve \cite{Berlyne71}.
hekeus@54	170 }
hekeus@54	171 \end{fig}
hekeus@54	172
hekeus@54	173 A similar shape is visible in the upper envelope of the plot in \Figrf{mtriscat}, which is a 3-D scatter plot of
hekeus@54	174 the information measures for several thousand
hekeus@54	175 first-order Markov chain transition matrices generated by a random sampling method.
hekeus@54	176 The coordinates of the `information space' are entropy rate ($h_\mu$), redundancy ($\rho_\mu$), and
hekeus@54	177 predictive information rate ($b_\mu$). The points along the `redundancy' axis correspond
hekeus@54	178 to periodic Markov chains. Those along the `entropy' axis produce uncorrelated sequences
hekeus@54	179 with no temporal structure. Processes with high PIR are to be found at intermediate
hekeus@54	180 levels of entropy and redundancy.
hekeus@54	181
hekeus@54	182 These observations led us to construct the `Melody Triangle'.
hekeus@54	183
hekeus@54	184 \begin{figure}
hekeus@54	185 \centering
hekeus@54	186 \includegraphics[width=0.49\linewidth]{figs/PeriodicMatrix.pdf}
hekeus@54	187 \includegraphics[width=0.49\linewidth]{figs/NonPeriodicMatrix.pdf}
hekeus@54	188 \caption{Two transition matrices representing Markov chains. The shade of gray represents the probabilities of transition from one symbol to the next (white=0, black=1). The current symbol is along the bottom, and the next symbol is along the left. The left-hand matrix has no uncertainty; it represents a periodic pattern (a,d,c,b,a,d,c,b,a,d,c,b,a\dots). The right-hand matrix contains unpredictability but is nonetheless not completely without perceivable structure (we know, for instance, that any `b' will always be followed by an `a' and preceded by a `c'); it has a higher entropy rate. \label{TransitionMatrixes}}
hekeus@54	189 \end{figure}
hekeus@54	190
hekeus@54	191 \begin{fig}{mtriscat}
hekeus@54	192 \colfig[1]{mtriscat}
hekeus@54	193 \caption{The population of transition matrices in the 3D space of
hekeus@54	194 entropy rate ($h_\mu$), redundancy ($\rho_\mu$) and predictive information rate ($b_\mu$),
hekeus@54	195 all in bits. Note that the distribution as a whole makes a curved triangle. Although
hekeus@54	196 not visible in this plot, it is largely hollow in the middle.
hekeus@54	197 The concentrations of points along the redundancy axis correspond
hekeus@54	198 to Markov chains which are roughly periodic with periods of 2 (redundancy 1 bit),
hekeus@54	199 3, 4, \etc all the way to period 7 (redundancy 2.8 bits). Note that the highest PIR values are found at intermediate entropy
hekeus@54	200 and redundancy. \label{InfoDynEngine}}
hekeus@54	201 \end{fig}
hekeus@54	202
hekeus@54	203 \section{The Melody Triangle}\label{makingthetriangle}
hekeus@54	204
hekeus@54	205 The Melody Triangle is an interface that is designed around this natural distribution of Markov chain transition
hekeus@54	206 matrices in the information space of entropy rate ($h_\mu$), redundancy ($\rho_\mu$) and predictive information rate ($b_\mu$), as illustrated in \Figrf{mtriscat}.
hekeus@54	207
hekeus@54	208 The distribution of transition matrices in this space forms a relatively thin
hekeus@54	209 curved sheet. Thus, it is a reasonable simplification to project out the
hekeus@54	210 third dimension (the PIR) and present an interface that is just two dimensional.
hekeus@54	211
hekeus@54	212 The right-angled triangle is rotated, reflected and stretched to form an equilateral triangle with
hekeus@54	213 the `redundancy'/`entropy rate' vertex at the top, the `redundancy' axis down the left-hand
hekeus@54	214 side, and the `entropy rate' axis down the right, as shown in \Figrf{TheTriangle}.
hekeus@54	215 This is our `Melody Triangle' and
hekeus@54	216 forms the interface by which the system is controlled.
hekeus@54	217
hekeus@54	218 \begin{fig}{TheTriangle}
hekeus@54	219 \colfig[1]{TheTriangle.pdf}
hekeus@54	220 \caption{The Melody Triangle}
hekeus@54	221 \end{fig}
hekeus@54	222
hekeus@54	223 \subsection{Usage}
hekeus@54	224
hekeus@54	225 The user selects a point within the triangle; this is mapped into the
hekeus@54	226 information space and the nearest transition matrix is used to generate
hekeus@54	227 a sequence of values which are then sonified either as pitched notes or percussive
hekeus@54	228 sounds.
hekeus@54	229
hekeus@54	230 Though the interface is 2D, the third dimension (predictive information rate) is implicitly present, as
hekeus@54	231 transition matrices retrieved from
hekeus@54	232 along the centre line of the triangle will tend to have higher PIR.
hekeus@54	233
hekeus@54	234 As shown in \Figrf{TheTriangle}, the corners correspond to three different extremes of predictability and
hekeus@54	235 unpredictability, which could be loosely characterised as `periodicity', `noise'
hekeus@54	236 and `repetition'. Melodies from the `noise' corner (high $h_\mu$, low $\rho_\mu$
hekeus@54	237 and low $b_\mu$) have no discernible pattern;
hekeus@54	238 those along the `periodicity'
hekeus@54	239 to `repetition' edge are all cyclic patterns that get shorter as we approach
hekeus@54	240 the `repetition' corner, until each is just one repeating note. Those along the
hekeus@54	241 opposite edge consist of independent random notes from non-uniform distributions.
hekeus@54	242 Areas between the left and right edges will tend to have higher predictive information rate ($b_\mu$),
hekeus@54	243 and we hypothesise that, under
hekeus@54	244 the appropriate conditions, these will be perceived as more `interesting' or `melodic.'
hekeus@54	245 These melodies have some level of unpredictability, but are not completely random;
hekeus@54	246 or, conversely, they are predictable, but not entirely so.
hekeus@54	247
hekeus@54	248 Given coordinates corresponding to a point in the triangle, we select from a pre-built
hekeus@54	249 library of random processes, choosing one whose entropy rate and redundancy match the desired
hekeus@54	250 values. The implementations discussed in this paper use first-order Markov chains as the content generator,
hekeus@54	251 since it is easy to compute the theoretically exact values of entropy rate, redundancy and predictive
hekeus@54	252 information rate given the transition matrix of the Markov chain. However, in principle, any generative system could be used to create the library of sequences, given an appropriate probabilistic listener model supporting
hekeus@54	253 the estimation of entropy rate and redundancy.
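One way to read this matching step is as a nearest-neighbour search over the library. As a sketch, assuming a Euclidean distance in the plane of entropy rate and redundancy (the choice of metric is our assumption and is not specified here), with target values $(h^\ast, \rho^\ast)$ obtained from the selected point and library entries indexed by $k$ with precomputed values $(h_k, \rho_k)$, the selected process would be
\[
k^\ast = \arg\min_k \left[ (h_k - h^\ast)^2 + (\rho_k - \rho^\ast)^2 \right].
\]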
hekeus@54	254
hekeus@54	255 The Markov chain based implementation generates streams of symbols in the abstract; the alphabet of symbols is then mapped to a set of distinct sounds, such as pitched notes in a scale. Further, by layering these streams, intricate musical textures can be created. The Melody Triangle does not take into account the statistical experience of our exposure to tonal music. Even if a particular stream of symbols is periodic and predictable, in mapping to the chromatic scale there is a chance that the melody may conflict with culturally defined expectations. A mapping to the diatonic scale, however, is less likely to lead to such conflicts, and mappings to the pentatonic scale even less so. Indeed, the symbols can be mapped to a set of percussive sounds, and even to non-sonic outputs such as visible shapes, colours, or movements.
hekeus@54	256
hekeus@54	257 The information measures that define the Melody Triangle assume a constant rate of symbols, and thus the notes of each output melody proceed at a uniform rate. Although the placing of events in time has a strong effect on expectations, surprise and satisfaction in music, the system does not, as yet, address this temporal dimension.
hekeus@54	258
hekeus@54	259 \section{Interfaces}
hekeus@54	260
hekeus@54	261 \subsection{Interface 1: The Interactive Installation}
hekeus@54	262 \begin{figure}
hekeus@54	263 \centering
hekeus@54	264 \includegraphics[width=1\linewidth]{figs/kinnect.pdf}
hekeus@54	265 \caption{The depth map as seen by the Kinect camera in the interactive installation version of the Melody Triangle. The bounding box outlines the blobs detected by OpenNI.\label{Kinect}}
hekeus@54	266 \end{figure}
hekeus@54	267 The Melody Triangle was first implemented as a multi-user
hekeus@54	268 interactive installation. It has been exhibited at the Brighton Science Festival 2012, Digital Shoreditch as well as at The British Science Festival 2011. A Kinect\footnote{http://www.xbox.com/en-GB/Kinect} camera tracks individuals in a space, the range of its depth sensors naturally forming a triangle.
hekeus@54	269
hekeus@54	270 As visitors/users come into the range of the camera, they start generating a melody, with the statistical properties of this melody determined by the mapping of physical space to the statistical space of the Melody Triangle. Thus, by exploring the physical space, the participant changes the predictability of the generated melodic content. When multiple people are in the space they can cooperate to create interweaving melodies, forming intricate polyphonic textures.
hekeus@54	271
hekeus@54	272 This makes the interaction physically engaging and (as our experience with visitors both young and old has demonstrated) more playful.
hekeus@54	273
hekeus@54	274 \subsubsection{Tracking and Control}
hekeus@54	275
hekeus@54	276 Tracking and control were done using the OpenNI libraries' API\footnote{http://OpenNi.org/} and high-level middleware for tracking with Kinect. This provided reliable blob tracking of humanoid forms in 2D space. By triangulating this with the Kinect's depth map it became possible to get reliable coordinates of visitors' positions in the space.
hekeus@54	277
hekeus@54	278 By detecting the bounding box of the 2D blobs of individuals in the space, and then normalising these based on the distance from the depth map, it became possible to work out if an individual had an arm stretched out or if they were crouching. With this it was possible to define a series of gestures for controlling the system without the use of any controllers (see Table~\ref{gestures}). Thus, for instance, sticking out one's left arm quickly would double the tempo of the melody. Pulling one's left arm in at the same time as sticking the right arm out would shift the melody onto the offbeat. Sending out both arms would change the instrument being `played', and crouching would decrease the volume of the melody.
hekeus@54	279
hekeus@54	280 \begin{table}
hekeus@54	281 \centering
hekeus@54	282 \caption{Gestures and their resulting effect\label{gestures}}
hekeus@54	283 \begin{tabular}{ l c l }
hekeus@54	284 left arm & right arm & meaning\\
hekeus@54	285 \hline
hekeus@54	286 out & static & double tempo \\
hekeus@54	287 in & static & halve tempo \\
hekeus@54	288 static & out & triple tempo \\
hekeus@54	289 static & in & one-third tempo\\
hekeus@54	290 out & in & shift to off-beat \\
hekeus@54	291 out & out & change instrument\\
hekeus@54	292 in & in & reset tempo\\
hekeus@54	293 \end{tabular}
hekeus@54	294 \end{table}
hekeus@54	295
hekeus@54	296 \begin{figure}
hekeus@54	297 \centering
hekeus@54	298 \includegraphics[width=1\linewidth]{figs/InstructionsImage3.pdf}
hekeus@54	299 \caption{Gestures and their resulting effect \label{gestures2}}
hekeus@54	300 \end{figure}
hekeus@54	301
hekeus@54	302
hekeus@54	303
hekeus@54	304 \subsubsection{Observations}
hekeus@54	305 Although visitors would need some initial instructions, they were then quickly able to collaboratively design musical textures. For example, one person would lay down a predictable repeating bass line by keeping themselves to the periodicity/repetition side of the room, while a companion would generate a freer melodic line by being nearer the `noise' part of the space.
hekeus@54	306
hekeus@54	307 The collaborative nature of this installation is an area that merits attention. Because no single user could control the whole narrative, the participants would communicate verbally and direct each other, both in learning to use the system and in finding interesting musical textures. This collaboration added an element of playfulness and enjoyment that was clearly apparent.
hekeus@54	308
hekeus@54	309 As an artefact this installation occupies an ambiguous role in terms of purpose; it is in a nebulous middle ground between instrument, art installation and technical demonstration. It is clear, however, that as a vehicle for communicating ideas related to expectation, pattern and predictability in music to the general public, it has proved very effective.
hekeus@54	310
hekeus@54	311 However, we were interested in carrying out some studies under more controlled circumstances. Additionally, we are interested in the Melody Triangle's potential as a compositional aid or music performance interface. To this end we developed a screen-based user interface to the Melody Triangle.
hekeus@54	312
hekeus@54	313
hekeus@54	314 \subsection{Interface 2: The Screen Interface}
hekeus@54	315
hekeus@54	316 \begin{fig}{melTriScreenShot}
hekeus@54	317 \colfig[1]{melTriScreenShot}
hekeus@54	318 \caption{Screen shot of the Melody Triangle screen UI. On the right, the transition matrices currently being played are displayed. Each token flashes whenever a note from its melody is rendered. }
hekeus@54	319 \end{fig}
hekeus@54	320
hekeus@54	321 In the screen-based interface, a number of tokens, each representing a
hekeus@54	322 sonification stream or `voice', can be dragged in and around the triangle.
hekeus@54	323 For each token, a sequence of symbols is sampled using the corresponding
hekeus@54	324 transition matrix; the symbols are then mapped to notes of a scale or percussive sounds%
hekeus@54	325 \footnote{The sampled sequence could easily be mapped to other musical processes, possibly over
hekeus@54	326 different time scales, such as chords, dynamics and timbres. It would also be possible
hekeus@54	327 to map the symbols to visual or other outputs.}%
hekeus@54	328 . Keyboard commands give control over other musical parameters such
hekeus@54	329 as the pitch register, volume, scale, inter-onset interval and instrument for each voice.
hekeus@54	330 %The possibilities afforded by the Melody Triangle in these other domains remains to be investigated.
hekeus@54	331 The system is capable of generating quite intricate musical textures when multiple tokens
hekeus@54	332 are in the triangle. The overlapping and interweaving of melodies of varying periodicity and predictability is well suited to making content that could stylistically be characterised as `minimalism'.
hekeus@54	333
hekeus@54	334 This interface is quite unlike other computer aided composition tools or programming
hekeus@54	335 environments, as here the composer exercises control at the abstract level of information-dynamic
hekeus@54	336 properties.
hekeus@54	337 A video of the interface in use can be viewed at \emph{http://bit.ly/My49lT}.
hekeus@54	338
hekeus@54	339
hekeus@54	340
hekeus@54	341
hekeus@54	342
hekeus@54	343
hekeus@54	344 \section{User trials with the Melody Triangle}
hekeus@54	345 We carried out a pilot study with six participants who were asked
hekeus@54	346 to use a simplified form of the user interface (a single controllable token,
hekeus@54	347 and no rhythmic, registral or timbral controls) under two conditions:
hekeus@54	348 one where a single sequence was sonified under user control, and another
hekeus@54	349 where an additional sequence was sonified in a different register, as if generated
hekeus@54	350 by a fixed invisible token in one of four regions of the triangle. In addition, subjects
hekeus@54	351 were asked to press a key if they `liked' what they were hearing.
hekeus@54	352
hekeus@54	353
hekeus@54	354 Our hypothesis was that users would linger longer in areas of the triangle that produce more aesthetically desirable sequences, and that these would tend to be the areas of the triangle with high predictive information rate, that is, areas along the middle and lower edge of the triangle.
hekeus@54	355
hekeus@54	356
hekeus@54	357 We recorded subjects' behaviour as well as points which they marked
hekeus@54	358 with a key press. After the study the participants were surveyed with the Goldsmiths Musical Sophistication Index~\cite{Mullensiefen:2011ts} to elicit their prior musical experience, which varied broadly. The sample size, however, was too small to establish any statistically significant correlations between the collected data and the index.
hekeus@54	359
hekeus@54	360 \subsection{Results}
hekeus@54	361 Some results for four of the subjects are shown in \Figrf{mtri-results}. We have not been able to detect any systematic across-subject preference for any particular region of the triangle.
hekeus@54	362
hekeus@54	363 \begin{fig}{mtri-results}
hekeus@54	364 \def\scat#1{\colfig[0.42]{mtri/#1}}
hekeus@54	365 \def\subj#1{\scat{scat_dwells_subj_#1} & \scat{scat_marks_subj_#1}}
hekeus@54	366 \begin{tabular}{cc}
hekeus@54	367 \subj{a} \\
hekeus@54	368 \subj{b} \\
hekeus@54	369 \subj{c} \\
hekeus@54	370 \subj{d}
hekeus@54	371 \end{tabular}
hekeus@54	372 \caption{Dwell times and mark positions from user trials with the
hekeus@54	373 on-screen Melody Triangle interface, for four subjects. The left-hand column shows
hekeus@54	374 the positions in a 2D information space (entropy rate vs redundancy
hekeus@54	375 in bits) where each spent their time; the area of each circle is proportional
hekeus@54	376 to the time spent there. The right-hand column shows points which subjects
hekeus@54	377 `liked'; the area of the circles here is proportional to the duration spent at
hekeus@54	378 that point before the point was marked.}
hekeus@54	379 \end{fig}
hekeus@54	380
hekeus@54	381
hekeus@54	382 Comments collected from the subjects
hekeus@54	383 suggest that
hekeus@54	384 the information-dynamic characteristics of the patterns were readily apparent
hekeus@54	385 to most: several noticed the main organisation of the triangle,
hekeus@54	386 with repetitive notes at the top, cyclic patterns along one edge, and unpredictable
hekeus@54	387 notes towards the opposite corner. Some described their systematic exploration of the space.
hekeus@54	388 Two felt that the right side was `more controllable' than the left (a consequence
hekeus@54	389 of their ability to return to a particular distinctive pattern and recognise it
hekeus@54	390 as one heard previously). Two reported that they became bored towards the end,
hekeus@54	391 but another felt there wasn't enough time to `hear out' the patterns properly.
hekeus@54	392 One subject did not `enjoy' the patterns in the lower region, but another said the lower
hekeus@54	393 central regions were more `melodic' and `interesting'.
hekeus@54	394
hekeus@54	395 \subsection{Discussion}
hekeus@54	396 Our initial hypothesis, that subjects would linger longer in regions of the triangle
hekeus@54	397 that produced aesthetically preferable sequences, and that this would tend to be towards the
hekeus@54	398 centre line of the triangle for all subjects, was not confirmed.
hekeus@54	399 However, the subjects did seem to exhibit distinct kinds of exploratory behaviour.
hekeus@54	400 It is possible
hekeus@54	401 that the design of the experiment encouraged an initial exploration of the space (sometimes
hekeus@54	402 very systematic, as for subject (c)) aimed at \emph{understanding}
hekeus@54	403 how the system works, rather than finding musical patterns. It is also possible that the
hekeus@54	404 system encourages users to create musically interesting output by \emph{moving the token},
hekeus@54	405 rather than finding a particular spot in the triangle which produces a musically interesting
hekeus@54	406 sequence by itself.
hekeus@54	407
hekeus@54	408 We plan to continue the trials with a slightly less restricted user interface in order
hekeus@54	409 to make the experience more enjoyable and thereby give subjects longer to use the interface;
hekeus@54	410 this may allow them to get beyond the initial exploratory phase and give a clearer
hekeus@54	411 picture of their aesthetic preferences. In addition, we plan to conduct a
hekeus@54	412 study under more restrictive conditions, where subjects will have no control over the patterns
hekeus@54	413 other than to signal (a) which of two alternatives they prefer in a forced
hekeus@54	414 choice paradigm, and (b) when they are bored of listening to a given sequence.
hekeus@54	415
hekeus@54	416 \section{Qualitative Feedback}
hekeus@54	417
hekeus@54	418 In parallel to the pilot study, we collected qualitative feedback from potential users of the screen interface. Four participants were interviewed, all practicing musicians who use computers in music production or in performance. The aim was to establish which features would be desired in any further development of the interface, for instance as a VST instrument for inclusion in a standard audio production environment.
hekeus@54	419
hekeus@54	420 Unlike in the pilot study, where participants knew nothing about the interface beforehand and were asked to `explore' with as little instruction as possible, here the potential users were first taught how to use the system. They were then given some time to play and experiment, and feedback and criticism of the system were sought in informal discussion. As part of a broader conversation, they were asked whether they could identify the different areas of the triangle, which features of the system they liked and disliked, and whether they could see themselves using the system as part of their musical practice, and if so how.
hekeus@54	421
hekeus@54	422 Some of the points collected were:
hekeus@54	423 \begin{itemize}
hekeus@54	424 \item The subjects were very quick to get to grips with the properties of the different areas of the triangle, and found it quite intuitive.
hekeus@54	425 \item The more periodic/predictable half of the triangle was used considerably more by all participants.
hekeus@54	426 \item Some expressed interest in its potential as a live performance interface for electronic music.
hekeus@54	427 \item All users desired more control over the mapping of symbols to notes, and some desired the ability to map the output of the triangle to other parameters such as to the control of filters and effect parameters.
hekeus@54	428 \end{itemize}
hekeus@54	429
hekeus@54	430 Two of the users indicated that the Melody Triangle could integrate well into their musical practice; one was unsure, and the other said it would not, expressing frustration at having little control over the musical style of the output.
hekeus@54	431 Some comments are provided here:
hekeus@54	432 \begin{quote}``If it was a kind of VST instrument, I would use it really a lot, definitely! Because there are not that many around that make this kind of stuff. I always love if something is generative or stochastic to generate things I would not come up with, but to generate a lot of them in a short amount of time and I'm the creative catalyst that just picks them.. and then have this kind of choices to edit probabilities, I really like that.''\end{quote}
hekeus@54	433
hekeus@54	434 \begin{quote}
hekeus@54	435 ``Here what is cool is that .. I can make multiple loops and they all have different characteristics and I don't have to adjust like five numbers in different places, it's in one thing, and that's what I like most, it's kind of like a macro [interface].''
hekeus@54	436 \end{quote}
hekeus@54	437
hekeus@54	438 \begin{quote}``I would use it as an idea generator .. what I probably would do is I would run this, maybe I would select some random sounds and maybe I would try around and develop some motifs, and see `oh I like that!' and would record that as midi and move on.''\end{quote}
hekeus@54	439
hekeus@54	440 Stochastic processes have often been used to generate musical materials. While such processes can drive the \emph{generative} phase of the creative process, these comments suggest that information dynamics and the Melody Triangle can serve as a novel framework for a \emph{selective} phase, helping composers discover generated materials that are of value. This alternation of generative and selective phases has been noted before~\cite{Boden1990}.
hekeus@54	441
hekeus@54	442 \section{The Mobile App}
hekeus@54	443
hekeus@54	444 In order to further our study into musical preferences with a wider audience, the Melody Triangle is being implemented as an Android mobile phone application. The research motivation is to use the app as a means of collecting large quantities of crowd-sourced data, providing us with a larger data set than could be realistically achieved through individual studies.
hekeus@54	445
hekeus@54	446 The audio engine is developed in libpd\footnote{http://libpd.cc/}, a port of the open source Pure Data programming environment. The app will allow users to use the phone's touch screen to drag tokens around the triangle and generate musical textures. Usage statistics will be collected on the phone and periodically uploaded to our servers for analysis.
hekeus@54	447
hekeus@54	448
hekeus@54	449 \begin{fig}{mobile}
hekeus@54	450 \colfig[1]{mobile}
hekeus@54	451 \caption{The Melody Triangle mobile phone app }
hekeus@54	452 \end{fig}
hekeus@54	453
hekeus@54	454
hekeus@54	455 \section{Conclusion}
hekeus@54	456 We presented the Melody Triangle, an interface for the discovery of melodic content where the input -- positions within a triangle -- corresponds to the predictability of the output melodies. The Melody Triangle is contextualised in \emph{information dynamics}, an information theoretic approach to modelling human expectation and surprise.
hekeus@54	457
hekeus@54	458 We outlined the relevant ideas behind information dynamics and described three key information theoretic measures: entropy rate, redundancy and \emph{predictive information rate}, which describes the information that the current observation provides about the future and that is not already known from past observations. We described how the natural distribution of randomly generated Markov chains in terms of these measures led us to design the Melody Triangle, and outlined its two physical incarnations.
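
As a reminder of how these three measures can be computed, the following is a sketch for the special case of a stationary, irreducible first-order Markov chain. The notation ($a$, $\pi$, $h$, $\rho$, $b$) is assumed here for illustration and may differ from that used elsewhere in the paper. Let $a_{ij} = \Pr(X_{t+1}=i \mid X_t=j)$ be the transition matrix and $\pi$ its stationary distribution, so that $a\pi = \pi$. With logarithms taken to base 2 and the convention $0 \log_2 0 = 0$, all quantities are in bits:
\[ h(a) = -\sum_{j} \pi_j \sum_{i} a_{ij} \log_2 a_{ij} \]
\[ \rho = H(\pi) - h(a), \qquad H(\pi) = -\sum_{i} \pi_i \log_2 \pi_i \]
\[ b = I(X_t ; X_{t+1} \mid X_{t-1}) = h(a^2) - h(a) \]
Here $h(a)$ is the entropy rate, $\rho$ the redundancy and $b$ the predictive information rate. The term $h(a^2)$ denotes the same expression as $h(a)$ evaluated with the two-step transition matrix $a^2$, whose entries are $\Pr(X_{t+1}=i \mid X_{t-1}=j)$ and which shares the stationary distribution $\pi$. The last equality follows from the Markov property, since $H(X_{t+1} \mid X_t, X_{t-1}) = H(X_{t+1} \mid X_t)$.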
hekeus@54	459
hekeus@54	460 The first is a multi-user installation where collaboration in a performative setting provides a playful yet informative way to explore expectation and surprise in music.
hekeus@54	461
hekeus@54	462 The second is a screen based interface where the Melody Triangle can be used as a musical performance interface or compositional aid for the generation of musical textures, with the user's control operating at the abstract level of randomness and predictability. We outlined some qualitative feedback gathered from users of the system, which indicates that the Melody Triangle could be useful as a performance tool or composition aid. We described a pilot study where the screen-based interface was used under experimental conditions to determine how the information dynamics measures might relate to musical preference. Although the results were inconclusive, we plan to continue this work under different experimental setups. Finally, we outlined a forthcoming mobile phone version of the Melody Triangle that, when released, will collect data from its users with a view to helping us identify any relationship between human musical preferences and the information-dynamic model of human expectation and surprise.
hekeus@54	463
hekeus@54	464 \section{Acknowledgments}
hekeus@54	465 This work is supported by an EPSRC Doctoral Training Centre EP/G03723X/1 (HE), GR/S82213/01 and \\EP/E045235/1 (SA), an EPSRC Leadership Fellowship, \\EP/G007144/1 (MDP) and EPSRC IDyOM2 EP/H013059/1.
hekeus@54	466
hekeus@54	467 \bibliography{all,c4dm,all2}\bibliographystyle{aaai}
hekeus@54	468 \end{document}
