\documentclass[letterpaper]{article}
\usepackage{aaai}
\usepackage{times}
\usepackage{helvet}
\usepackage{courier}
\usepackage{tools} %custom




\let\citep=\cite
\newcommand{\colfig}[2][1]{\includegraphics[width=#1\linewidth]{figs/#2}}%
\newcommand\past[1]{\overset{\rule{0pt}{0.2em}\smash{\leftarrow}}{#1}}
\newcommand\fut[1]{\overset{\rule{0pt}{0.1em}\smash{\rightarrow}}{#1}}
\frenchspacing
\pdfinfo{
/Title (The Melody Triangle: Exploring Pattern and Predictability in Music )
/Subject (Musical Metacreation, Interfaces)
/Author (Henrik Ekeus, Samer A. Abdallah, Mark D. Plumbley, Peter W. McOwan)}
\setcounter{secnumdepth}{0}

% The file aaai.sty is the style file for AAAI Press
% proceedings, working notes, and technical reports.
%
\title{The Melody Triangle:\\ Exploring Pattern and Predictability in Music}
\author{Henrik Ekeus, Samer A. Abdallah, Mark D. Plumbley, Peter W. McOwan\\
Centre for Digital Music, Queen Mary University of London,\\
London E1 4NS, UK\\
}
\begin{document}
\maketitle
\begin{abstract}
\begin{quote}

The Melody Triangle is an interface for the discovery of melodic materials, where the input -- positions within a triangle -- maps directly to information-theoretic properties of the output. A model of human expectation and surprise in the perception of music, \emph{information dynamics}, is used to `map out' a musical generative system's parameter space. This enables a user to explore the possibilities afforded by a generative algorithm, in this case Markov chains, not by directly selecting parameters, but by specifying the subjective \emph{predictability} of the output sequence. We describe some of the relevant ideas from information dynamics and how the Melody Triangle is defined in terms of these. We describe its incarnation as a screen-based performance tool and compositional aid for the generation of musical textures, in which the user exercises control at the abstract level of randomness and predictability, and report some pilot studies carried out with it. We also briefly outline a multi-user installation, where collaboration in a performative setting provides a playful yet informative way to explore expectation and surprise in music, and a forthcoming mobile phone version of the Melody Triangle.

\end{quote}
\end{abstract}

\noindent

\section{Introduction}

The use of generative stochastic processes in music composition has been widespread for
decades; for instance, Iannis Xenakis applied probabilistic mathematical models
to the creation of musical materials~\cite{Xenakis:1992ul}. However, it can sometimes be difficult for a composer to find desirable parameters and navigate the possibilities of a generative algorithm intuitively.

The Melody Triangle is an interface for the discovery of melodic content where the parameter space of a stochastic generative musical process, the Markov chain, is `mapped out' according to the \emph{predictability} of the output. The Melody Triangle was developed in the context of \emph{information dynamics}~\cite{CIP}, an information-theoretic approach to modelling human expectation and surprise in the perception of music.
Users of the Melody Triangle do not select the parameters of the generative process directly; rather, they provide input in the form of a position within a triangle, and this maps to the information-theoretic properties of an output melody.
For instance, one corner of the triangle returns completely random melodies, while another area yields entirely predictable and periodic patterns; the triangle as a whole covers a spectrum of predictability of the output melodies.

In this paper we outline the concepts and ideas behind information dynamics, and describe the information measures that led to the development of the Melody Triangle. We describe its physical realisations: a multi-user interactive installation where visitors use their bodies and gestures to generate musical materials, and a screen-based interface. We outline some pilot studies carried out with the screen interface, as well as some qualitative feedback from music practitioners exploring its potential as a performance or composition tool. Finally, we outline a forthcoming mobile phone version of the Melody Triangle.

\section{Information Dynamics}
\label{s:Intro}
The relationship between
Shannon's \shortcite{Shannon48} information theory and music and art in general has been the
subject of some interest since the 1950s
\cite{Youngblood58,CoonsKraehenbuehl1958,Moles66,Meyer67,Cohen1962}.
The general thesis is that perceptible qualities and subjective states
like uncertainty, surprise, complexity, tension, and interestingness
are closely related to information-theoretic quantities like
entropy, relative entropy, and mutual information.

Music is an inherently dynamic process. The idea that the musical experience is strongly shaped by the generation
and playing out of strong and weak expectations was put forward by, amongst others,
music theorists L. B. Meyer \shortcite{Meyer:1967} and Narmour \shortcite{Narmour:1977}.

An essential aspect of this is that music is experienced as a phenomenon
that unfolds in time, rather than being apprehended as a static object
presented in its entirety. Meyer argued that the experience depends
on how we change and revise our conceptions \emph{as events happen}, on
how expectation and prediction interact with occurrence, and that, to a
large degree, the way to understand the effect of music is to focus on
this `kinetics' of expectation and surprise.

Prediction and expectation are essentially probabilistic concepts
and can be treated mathematically using probability theory.
We suppose that when we listen to music, expectations are created on the basis
of our familiarity with various styles of music and our ability to
detect and learn statistical regularities in the music as they emerge.
There is experimental evidence that human listeners are able to internalise
statistical knowledge about musical
structure~\cite{SaffranJohnsonAslin1999}, and also
that statistical models can form an effective basis for computational
analysis of
music~\cite{ConklinWitten95,PonsfordWigginsMellish1999,Pearce2005}.

Information dynamics considers several different kinds of predictability in musical patterns, how these might be quantified using the tools of information theory,
and how they shape or affect the listening experience. Our working hypothesis is that listeners maintain a dynamically evolving probabilistic belief state that enables them to make predictions about how a piece of music will continue.

They do this using both the immediate context of the piece and their previous musical experience, such as a familiarity with musical styles and conventions. As the music unfolds, listeners continually revise this belief state, which includes predictive
distributions over possible future events. These changes in probabilistic beliefs can be associated with
quantities of information; these are the focus of information dynamics.

In the next section we briefly describe the information measures that we use to define the Melody Triangle; a more complete overview of information dynamics and some of its applications can be found in \cite{Abdallah:2009p4089} and \cite{CIP}.

\subsection{Sequential Information Measures}\label{sec:Sequential_Information_Measures}

Consider a sequence of symbols from the viewpoint of an observer at a certain time, and split the
sequence into a single symbol in the \emph{present} ($X_t$), an infinite \emph{past} ($\past{X}_t$) and an
infinite \emph{future} ($\fut{X}_t$). The symbols arrive at a constant, uniform rate.

The \emph{entropy rate} of a random process is a well-known, basic measure of its randomness or
unpredictability. The entropy rate is the entropy, $H$, of the \emph{present} given the \emph{past}:
\begin{equation}
\label{eq:entro-rate}
h_\mu = H(X_t|\past{X}_t).
\end{equation}
That is, it represents our average uncertainty about the present symbol \emph{given}
that we have observed everything before it. Processes with zero entropy rate can
be predicted perfectly given enough of the preceding context.
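
As a concrete illustration, consider a stationary first-order Markov chain, the kind of process used later in this paper. The notation in this paragraph is ours and is introduced only for illustration: let $a_{ij} = \Pr(X_{t+1}=i \mid X_t=j)$ be the transition matrix, with the current symbol indexing columns, and let $\pi$ be its stationary distribution, so that $\sum_j a_{ij}\pi_j = \pi_i$. Because the present then depends on the past only through the previous symbol, the entropy rate reduces to
\begin{equation}
h_\mu = H(X_t|X_{t-1}) = -\sum_j \pi_j \sum_i a_{ij} \log_2 a_{ij},
\end{equation}
measured in bits, with the convention $0 \log_2 0 = 0$. If every column of $a$ contains a single 1, so that each symbol determines its successor, every term vanishes and $h_\mu = 0$.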

The \emph{multi-information rate} $\rho_\mu$ \cite{Dubnov2004}
is the mutual
information, $I$, between the `past' and the `present':
\begin{equation}
\label{eq:multi-info}
\rho_\mu = I(\past{X}_t;X_t) = H(X_t) - H(X_t|\past{X}_t).
\end{equation}

The multi-information rate can be thought of as a measure of \emph{redundancy}, quantifying the extent to which the same information is to be found in all parts of the sequence.
It is a measure of how much the predictability of the process depends on knowing the
preceding context. It is the difference between the entropy of a single element of the
sequence in isolation (imagine choosing a note from a musical score at random with your
eyes closed and then trying to guess the note) and its entropy after taking into account
the preceding context.
If the previous symbols reduce our uncertainty about the present symbol a great deal, then
the redundancy is high. For example, if we know that a sequence consists of a repeating
cycle such as $ \ldots b, c, d, a, b, c, d, a \ldots$, but we don't know which was the first
symbol, then the redundancy is high, as $H(X_t)$ is high (because we
have no idea about the present symbol in isolation), but $H(X_t|\past{X}_t)$
is zero, because knowing the previous symbol immediately tells us what the present symbol is.
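
To make the arithmetic of this example explicit: each of the four symbols occurs equally often, so $H(X_t) = \log_2 4$ bits, and therefore
\begin{equation}
\rho_\mu = \log_2 4 - 0 = 2\ \mathrm{bits}.
\end{equation}
By the same reasoning, a strictly periodic sequence that cycles through $N$ distinct symbols (the symbol $N$ is our notation) has a redundancy of $\log_2 N$ bits, whereas a sequence of independent, identically distributed symbols has $H(X_t|\past{X}_t) = H(X_t)$ and hence zero redundancy.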

The \emph{predictive information rate} (PIR) \cite{Abdallah:2009p4089} brings in our uncertainty about the future. It is a
measure of how much each symbol reduces our uncertainty about the future as it is
observed, \emph{given} that we have observed the past:
\begin{equation}
\label{eq:PIR}
b_\mu = I(X_t;\fut{X}_t|\past{X}_t) = H(\fut{X}_t|\past{X}_t) - H(\fut{X}_t|X_t,\past{X}_t).
\end{equation}
It is the mutual information between the `present' and the `future' given the `past'. In other words, it is a measure of the \emph{new} information in each symbol.

The behaviour of the predictive information rate makes it interesting from a compositional point of view. The definition
of the PIR is such that it is low both for extremely regular processes, such as constant
or periodic sequences, \emph{and} for extremely random processes, where each symbol
is chosen independently of the others, in a kind of `white noise'. In the former case,
the pattern, once established, is completely predictable and therefore there is no
\emph{new} information in subsequent observations. In the latter case, the randomness
and independence of all elements of the sequence means that, though potentially surprising,
each observation carries no information about the ones to come.
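
For a stationary first-order Markov chain these two limiting cases can be checked directly. The following is a sketch that relies only on the Markov assumption: given the past, the present symbol $X_t$ can inform the future only through the next symbol $X_{t+1}$, and the past matters only through $X_{t-1}$, so the definition reduces to
\begin{equation}
b_\mu = H(X_{t+1}|X_{t-1}) - H(X_{t+1}|X_t).
\end{equation}
The second term is the entropy rate $h_\mu$; the first is the uncertainty of a two-step prediction. For a strictly periodic chain both terms are zero, and for independent symbols both equal $H(X_t)$, so in either case $b_\mu = 0$. The PIR can be large only when one-step predictions are markedly sharper than two-step predictions.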

Processes with high PIR maintain a certain kind of balance between
predictability and unpredictability in such a way that the observer must continually
pay attention to each new observation as it occurs in order to make the best
possible predictions about the evolution of the sequence. This balance between predictability
and unpredictability is reminiscent of the inverted `U' shape of the Wundt curve (see \Figrf{wundt}),
which summarises the observations of Wundt \shortcite{Wundt1897} that stimuli are most
pleasing at intermediate levels of novelty or disorder, where there is a balance between
`order' and `chaos'.

\begin{fig}{wundt}
\raisebox{-4em}{\colfig[0.43]{wundt}}
{\ {\large$\longrightarrow$}\ }
\raisebox{-4em}{\colfig[0.43]{wundt2}}
\caption{
The Wundt curve relating randomness/complexity with
perceived value. Repeated exposure sometimes results
in a move to the left along the curve \cite{Berlyne71}.
}
\end{fig}

A similar shape is visible in the upper envelope of the plot in \Figrf{mtriscat}, which is a 3-D scatter plot of
the information measures for several thousand
first-order Markov chain transition matrices generated by a random sampling method.
The coordinates of the `information space' are entropy rate ($h_\mu$), redundancy ($\rho_\mu$), and
predictive information rate ($b_\mu$). The points along the `redundancy' axis correspond
to periodic Markov chains. Those along the `entropy' axis produce uncorrelated sequences
with no temporal structure. Processes with high PIR are to be found at intermediate
levels of entropy and redundancy.

These observations led us to construct the `Melody Triangle'.

\begin{figure}
\centering
\includegraphics[width=0.49\linewidth]{figs/PeriodicMatrix.pdf}
\includegraphics[width=0.49\linewidth]{figs/NonPeriodicMatrix.pdf}
\caption{Two transition matrices representing Markov chains. The shade of gray represents the probability of transition from one symbol to the next (white=0, black=1). The current symbol is along the bottom, and the next symbol is along the left. The left-hand matrix has no uncertainty; it represents a periodic pattern (a,d,c,b,a,d,c,b,a,d,c,b,a\dots). The right-hand matrix contains unpredictability but is nonetheless not completely without perceivable structure (we know, for instance, that any `b' will always be followed by an `a' and preceded by a `c'); it has a higher entropy rate. \label{TransitionMatrixes}}
\end{figure}

\begin{fig}{mtriscat}
\colfig[1]{mtriscat}
\caption{The population of transition matrices in the 3D space of
entropy rate ($h_\mu$), redundancy ($\rho_\mu$) and predictive information rate ($b_\mu$),
all in bits. Note that the distribution as a whole makes a curved triangle. Although
not visible in this plot, it is largely hollow in the middle.
The concentrations of points along the redundancy axis correspond
to Markov chains which are roughly periodic with periods of 2 (redundancy 1 bit),
3, 4, \etc all the way to period 7 (redundancy 2.8 bits). Note that the highest PIR values are found at intermediate entropy
and redundancy. \label{InfoDynEngine}}
\end{fig}

\section{The Melody Triangle}\label{makingthetriangle}

The Melody Triangle is an interface that is designed around this natural distribution of Markov chain transition
matrices in the information space of entropy rate ($h_\mu$), redundancy ($\rho_\mu$) and predictive information rate ($b_\mu$), as illustrated in \Figrf{mtriscat}.

The distribution of transition matrices in this space forms a relatively thin
curved sheet. Thus, it is a reasonable simplification to project out the
third dimension (the PIR) and present an interface that is just two-dimensional.

The resulting right-angled triangle is rotated, reflected and stretched to form an equilateral triangle with
the `redundancy'/`entropy rate' vertex at the top, the `redundancy' axis down the left-hand
side, and the `entropy rate' axis down the right, as shown in \Figrf{TheTriangle}.
This is our `Melody Triangle' and
forms the interface by which the system is controlled.
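
One way to make this mapping concrete, offered as a sketch rather than as a description of the implementation, is through barycentric coordinates. Since $h_\mu + \rho_\mu = H(X_t)$, which cannot exceed $\log_2 N$ bits for an alphabet of $N$ symbols, every process lies in the right-angled triangle $h_\mu \ge 0$, $\rho_\mu \ge 0$, $h_\mu + \rho_\mu \le \log_2 N$. Write a point in the equilateral triangle as a weighted average of its corners, with weights $\lambda_0$ (top), $\lambda_\rho$ (bottom left) and $\lambda_h$ (bottom right) that are non-negative and sum to one. Rotation, reflection and stretching are affine operations and preserve these weights, so the selected point corresponds to
\begin{equation}
\rho_\mu = \lambda_\rho \log_2 N, \qquad h_\mu = \lambda_h \log_2 N.
\end{equation}
Here $N$ and the weights $\lambda$ are our notation, and the straight-edged triangle is an idealisation of the curved distribution in \Figrf{mtriscat}.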

\begin{fig}{TheTriangle}
\colfig[1]{TheTriangle.pdf}
\caption{The Melody Triangle}
\end{fig}

\subsection{Usage}

The user selects a point within the triangle; this is mapped into the
information space and the nearest transition matrix is used to generate
a sequence of values, which are then sonified either as pitched notes or percussive
sounds.

Though the interface is 2D, the third dimension (predictive information rate) is implicitly present, as
transition matrices retrieved from
along the centre line of the triangle will tend to have higher PIR.

As shown in \Figrf{TheTriangle}, the corners correspond to three different extremes of predictability and
unpredictability, which could be loosely characterised as `periodicity', `noise'
and `repetition'. Melodies from the `noise' corner (high $h_\mu$, low $\rho_\mu$
and low $b_\mu$) have no discernible pattern;
those along the `periodicity'
to `repetition' edge are all cyclic patterns that get shorter as we approach
the `repetition' corner, until each is just one repeating note. Those along the
opposite edge consist of independent random notes from non-uniform distributions.
Areas between the left and right edges will tend to have higher predictive information rate ($b_\mu$),
and we hypothesise that, under
the appropriate conditions, these will be perceived as more `interesting' or `melodic'.
These melodies have some level of unpredictability, but are not completely random;
conversely, they are predictable, but not entirely so.

Given coordinates corresponding to a point in the triangle, we select from a pre-built
library of random processes, choosing one whose entropy rate and redundancy most closely match the desired
values. The implementations discussed in this paper use first-order Markov chains as the content generator,
since it is easy to compute the theoretically exact values of entropy rate, redundancy and predictive
information rate given the transition matrix of the Markov chain. However, in principle, any generative system could be used to create the library of sequences, given an appropriate probabilistic listener model supporting
the estimation of entropy rate and redundancy.
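
The matching step can be written in one line. As a sketch, suppose the library holds $K$ processes whose measures $(h_k, \rho_k)$ have been computed in advance, and let $(h^*, \rho^*)$ be the target values obtained from the selected point; the index $k$, the library size $K$ and the use of Euclidean distance are our assumptions, made for illustration. The selected process is then
\begin{equation}
k^* = \arg\min_{1 \le k \le K} \left[ (h_k - h^*)^2 + (\rho_k - \rho^*)^2 \right].
\end{equation}
Any other distance on the information space could be substituted without changing the idea.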

The Markov chain based implementation generates streams of symbols in the abstract; the alphabet of symbols is then mapped to a set of distinct sounds, such as pitched notes in a scale. Further, by layering these streams, intricate musical textures can be created. The Melody Triangle does not take into account the statistical experience of our exposure to tonal music. Even if a particular stream of symbols is periodic and predictable, in mapping to the chromatic scale there is a chance that the melody may conflict with culturally defined expectations. A mapping to the diatonic scale, however, is less likely to lead to such conflicts, and mappings to the pentatonic scale even less so. Indeed, the symbols can be mapped to a set of percussive sounds, and even to non-sonic outputs such as visible shapes, colours, or movements.

The information measures that define the Melody Triangle assume a constant rate of symbols, and thus the notes of each output melody proceed at a uniform rate. Although the placing of events in time has a strong effect on expectations, surprise and satisfaction in music, the system does not, as yet, address this temporal dimension.

\section{Interfaces}

\subsection{Interface 1: The Interactive Installation}
\begin{figure}
\centering
\includegraphics[width=1\linewidth]{figs/kinnect.pdf}
\caption{The depth map as seen by the Kinect camera in the interactive installation version of the Melody Triangle. The bounding box outlines the blobs detected by OpenNI.\label{Kinect}}
\end{figure}
The Melody Triangle was first implemented as a multi-user
interactive installation. It has been exhibited at the Brighton Science Festival 2012, Digital Shoreditch, and the British Science Festival 2011. A Kinect\footnote{http://www.xbox.com/en-GB/Kinect} camera tracks individuals in a space, the range of its depth sensors naturally forming a triangle.

As visitors come into the range of the camera, they start generating a melody whose statistical properties are determined by the mapping of physical space to the statistical space of the Melody Triangle. Thus, by exploring the physical space, the participant changes the predictability of the generated melodic content. When multiple people are in the space they can cooperate to create interweaving melodies, forming intricate polyphonic textures.

This makes the interaction physically engaging and (as our experience with visitors both young and old has demonstrated) more playful.

\subsubsection{Tracking and Control}

Tracking and control were implemented using the OpenNI libraries' API\footnote{http://OpenNi.org/} and high-level middleware for tracking with Kinect. This provided reliable blob tracking of humanoid forms in 2D space. By triangulating this with the Kinect's depth map it became possible to get reliable coordinates of visitors' positions in the space.

By detecting the bounding boxes of the 2D blobs of individuals in the space, and then normalising these according to their distance in the depth map, it became possible to work out whether an individual had an arm stretched out or was crouching. With this it was possible to define a series of gestures for controlling the system without the use of any controllers (see Table~\ref{gestures}). Thus, for instance, sticking out one's left arm quickly would double the tempo of the melody. Pulling one's left arm in at the same time as sticking the right arm out would shift the melody onto the offbeat. Sending out both arms would change the instrument being `played', and crouching would decrease the volume of the melody.

\begin{table}
\centering
\caption{Gestures and their resulting effect\label{gestures}}
\begin{tabular}{ l c l }
left arm & right arm & meaning\\
\hline
out & static & double tempo \\
in & static & halve tempo \\
static & out & triple tempo \\
static & in & one-third tempo\\
out & in & shift to off-beat \\
out & out & change instrument\\
in & in & reset tempo\\
\end{tabular}
\end{table}

\begin{figure}
\centering
\includegraphics[width=1\linewidth]{figs/InstructionsImage3.pdf}
\caption{Gestures and their resulting effect \label{gestures2}}
\end{figure}



\subsubsection{Observations}
Although visitors needed some initial instructions, they were then quickly able to design musical textures collaboratively. For example, one person would lay down a predictable repeating bass line by keeping to the periodicity/repetition side of the room, while a companion would generate a freer melodic line by staying nearer the `noise' part of the space.

The collaborative nature of this installation is an area that merits attention. Because no single user could control the whole narrative, the participants would communicate verbally and direct each other towards the goals of learning to use the system and finding interesting musical textures. This collaboration added an element of playfulness and enjoyment that was clearly apparent.

As an artefact this installation occupies an ambiguous role in terms of purpose; it is in a nebulous middle ground between instrument, art installation and technical demonstration. It is clear, however, that as a vehicle for communicating ideas related to expectation, pattern and predictability in music to the general public, it has proved very effective.

We were also interested in carrying out some studies under more controlled circumstances, and in the Melody Triangle's potential as a compositional aid or music performance interface. To this end we developed a screen-based user interface to the Melody Triangle.


\subsection{Interface 2: The Screen Interface}

\begin{fig}{melTriScreenShot}
\colfig[1]{melTriScreenShot}
\caption{Screen shot of the Melody Triangle screen UI. The transition matrices currently being played are displayed on the right. Each token flashes whenever a note from its melody is rendered.}
\end{fig}

In the screen-based interface, a number of tokens, each representing a
sonification stream or `voice', can be dragged in and around the triangle.
For each token, a sequence of symbols is sampled using the corresponding
transition matrix, and the symbols are then mapped to notes of a scale or percussive sounds%
\footnote{The sampled sequence could easily be mapped to other musical processes, possibly over
different time scales, such as chords, dynamics and timbres. It would also be possible
to map the symbols to visual or other outputs.}%
. Keyboard commands give control over other musical parameters such
as the pitch register, volume, scale, inter-onset interval and instrument for each voice.
%The possibilities afforded by the Melody Triangle in these other domains remains to be investigated.
The system is capable of generating quite intricate musical textures when multiple tokens
are in the triangle. The overlapping and interweaving of melodies of varying periodicities and predictability is well suited to making content that could stylistically be characterised as `minimalism'.

This interface is quite unlike other computer-aided composition tools or programming
environments, as here the composer exercises control at the abstract level of information-dynamic
properties.
A video of the interface in use can be viewed at \emph{http://bit.ly/My49lT}.


\section{User trials with the Melody Triangle}
We carried out a pilot study with six participants who were asked
to use a simplified form of the user interface (a single controllable token,
and no rhythmic, registral or timbral controls) under two conditions:
one where a single sequence was sonified under user control, and another
where an additional sequence was sonified in a different register, as if generated
by a fixed invisible token in one of four regions of the triangle. In addition, subjects
were asked to press a key if they `liked' what they were hearing. The subj


Our hypothesis was that users would linger longer in areas of the triangle that produce more aesthetically desirable sequences, and that these would tend to be the areas of high predictive information rate, that is, areas along the middle and lower edge of the triangle.


We recorded subjects' behaviour as well as points which they marked
with a key press. After the study the participants were surveyed with the Goldsmiths Musical Sophistication Index~\cite{Mullensiefen:2011ts} to elicit their prior musical experience, which varied broadly. The sample size, however, was too small to establish any statistically significant correlations between the collected data and the index.

hekeus@54
|
\subsection{Results}
Results for four of the six subjects are shown in \Figrf{mtri-results}. We were not able to detect any systematic across-subject preference for any particular region of the triangle.

\begin{fig}{mtri-results}
	\def\scat#1{\colfig[0.42]{mtri/#1}}
	\def\subj#1{\scat{scat_dwells_subj_#1} & \scat{scat_marks_subj_#1}}
	\begin{tabular}{cc}
		\subj{a} \\
		\subj{b} \\
		\subj{c} \\
		\subj{d}
	\end{tabular}
	\caption{Dwell times and mark positions from user trials with the
	on-screen Melody Triangle interface, for four subjects. The left-hand column shows
	the positions in a 2D information space (entropy rate vs redundancy,
	in bits) where each subject spent their time; the area of each circle is proportional
	to the time spent there. The right-hand column shows the points which subjects
	`liked'; the area of the circles here is proportional to the time spent at
	that point before it was marked.}
\end{fig}


Comments collected from the subjects suggest that
the information-dynamic characteristics of the patterns were readily apparent
to most: several noticed the main organisation of the triangle,
with repetitive notes at the top, cyclic patterns along one edge, and unpredictable
notes towards the opposite corner. Some described their systematic exploration of the space.
Two felt that the right side was `more controllable' than the left (a consequence
of their ability to return to a particular distinctive pattern and recognise it
as one heard previously). Two reported that they became bored towards the end,
but another felt there was not enough time to `hear out' the patterns properly.
One subject did not `enjoy' the patterns in the lower region, but another said the lower
central regions were more `melodic' and `interesting'.

\subsection{Discussion}
Our initial hypothesis, that subjects would linger longer in regions of the triangle
that produced aesthetically preferable sequences, and that this would tend to be towards the
centre line of the triangle for all subjects, was not confirmed.
However, the subjects did seem to exhibit distinct kinds of exploratory behaviour.
It is possible
that the design of the experiment encouraged an initial exploration of the space (sometimes
very systematic, as for subject (c)) aimed at \emph{understanding}
how the system works, rather than at finding musical patterns. It is also possible that the
system encourages users to create musically interesting output by \emph{moving the token},
rather than by finding a particular spot in the triangle which produces a musically interesting
sequence by itself.

We plan to continue the trials with a slightly less restricted user interface in order
to make the experience more enjoyable, so that subjects use the interface for longer;
this may allow them to get beyond the initial exploratory phase and give a clearer
picture of their aesthetic preferences. In addition, we plan to conduct a
study under more restrictive conditions, where subjects will have no control over the patterns
other than to signal (a) which of two alternatives they prefer in a forced
choice paradigm, and (b) when they are bored of listening to a given sequence.

\section{Qualitative Feedback}

In parallel with the pilot study, we collected qualitative feedback from potential users of the screen interface. Four participants were interviewed, all practising musicians who use computers in music production or performance. The aim was to establish which features would be desirable in any further development of the interface, for instance as a VST instrument for inclusion in a standard audio production environment.

Unlike in the pilot study, where participants knew nothing about the interface beforehand and were asked to `explore' with as little instruction as possible, here the potential users were first taught how to use the system. They were then given some time to play and experiment, and feedback and criticism of the system were sought in informal discussion. As part of a broader conversation, they were asked whether they could identify the different areas of the triangle, which features of the system they liked and disliked, and whether they could see themselves using the system as part of their musical practice and, if so, how.

Some of the points collected were:
\begin{itemize}
\item The subjects were very quick to get to grips with the properties of the different areas of the triangle, and found it quite intuitive.
\item The more periodic/predictable half of the triangle was used considerably more by all participants.
\item Some expressed interest in its potential as a live performance interface for electronic music.
\item All users desired more control over the mapping of symbols to notes, and some desired the ability to map the output of the triangle to other parameters, such as the control of filters and effects.
\end{itemize}


Two of the users indicated that the Melody Triangle could integrate well into their musical practice; one was unsure, and the other said it would not, expressing frustration at having little control over the musical style of the output.
Some comments are reproduced here:
\begin{quote}``If it was a kind of VST instrument, I would use it really a lot, definitely! Because there are not that many around that make this kind of stuff. I always love if something is generative or stochastic to generate things I would not come up with, but to generate a lot of them in a short amount of time and I'm the creative catalyst that just picks them\ldots\ and then have this kind of choices to edit probabilities, I really like that.''\end{quote}

\begin{quote}
``Here what is cool is that\ldots\ I can make multiple loops and they all have different characteristics and I don't have to adjust like five numbers in different places, it's in one thing, and that's what I like most, it's kind of like a macro [interface].''
\end{quote}

\begin{quote}``I would use it as an idea generator\ldots\ what I probably would do is I would run this, maybe I would select some random sounds and maybe I would try around and develop some motifs, and see `oh I like that!' and would record that as MIDI and move on.''\end{quote}

Stochastic processes have often been used to generate musical materials. While such processes can drive the \emph{generative} phase of the creative process, these comments suggest that information dynamics and the Melody Triangle can serve as a novel framework for a \emph{selective} phase, helping composers discover generated materials that are of value. This alternation of generative and selective phases has been noted before~\cite{Boden1990}.

\section{The Mobile App}

To extend our study of musical preferences to a wider audience, the Melody Triangle is being implemented as an Android mobile phone application. The research motivation is to use the app to collect large quantities of crowd-sourced data, providing a larger data set than could realistically be obtained through individual studies.

The audio engine is developed with libpd\footnote{http://libpd.cc/}, an embeddable port of the open source Pure Data programming environment. The app will allow users to use the phone's touch screen to drag tokens around the triangle and generate musical textures. Usage statistics will be collected on the phone and periodically uploaded to our servers for analysis.

\begin{fig}{mobile}
	\colfig[1]{mobile}
	\caption{The Melody Triangle mobile phone app.}
\end{fig}

\section{Conclusion}
We presented the Melody Triangle, an interface for the discovery of melodic content where the input -- positions within a triangle -- corresponds to the predictability of the output melodies. The Melody Triangle is grounded in \emph{information dynamics}, an information theoretic approach to modelling human expectation and surprise.

We outlined the relevant ideas behind information dynamics and described three key information theoretic measures: entropy rate, redundancy and \emph{predictive information rate}, the last of which measures the information that the current observation provides about the future beyond what is already known from past observations. We described how the natural distribution of randomly generated Markov chains in terms of these measures led us to design the Melody Triangle, and outlined its two physical incarnations.

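
As a reminder of what these three measures involve, they can be written out for an ergodic first-order Markov chain. This is only a sketch in notation assumed here: the symbols $a$, $\pi$, $h$, $\rho$ and $b$ are illustrative and may differ from those used in the earlier sections. Let $a_{ij} = \Pr(X_{t+1} = i \mid X_t = j)$ be the transition matrix and $\pi$ its stationary distribution, so that $a\pi = \pi$, with the convention $0 \log 0 = 0$. The entropy rate, redundancy and predictive information rate can then be written as
\[ h(a) = -\sum_{j} \pi_j \sum_{i} a_{ij} \log a_{ij}, \]
\[ \rho = H(\pi) - h(a), \qquad H(\pi) = -\sum_{i} \pi_i \log \pi_i, \]
\[ b = h(a^2) - h(a), \]
where $a^2$ is the two-step transition matrix, which has the same stationary distribution $\pi$. Two limiting cases illustrate the behaviour: a deterministic cycle through $N$ states has $h = 0$, $\rho = \log N$ and $b = 0$, while independent uniform noise over $N$ symbols has $h = \log N$, $\rho = 0$ and $b = 0$. A high predictive information rate therefore requires a chain that is neither fully predictable nor fully random.
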

The first is a multi-user installation where collaboration in a performative setting provides a playful yet informative way to explore expectation and surprise in music.

The second is a screen-based interface where the Melody Triangle can be used as a musical performance interface or compositional aid for the generation of musical textures; the user's control is at the abstract level of randomness and predictability. We outlined some qualitative feedback gathered from users of the system, which indicates that the Melody Triangle could be useful as a performance tool or composition aid. We described a pilot study where the screen-based interface was used under experimental conditions to determine how the information dynamics measures might relate to musical preference. Although the results were inconclusive, we plan to continue this work under different experimental setups. Finally, we outlined a forthcoming mobile phone version of the Melody Triangle that, when released, will collect data from its users with a view to helping us identify any relationship between human musical preferences and the information-dynamic model of human expectation and surprise.

\section{Acknowledgments}
This work is supported by an EPSRC Doctoral Training Centre EP/G03723X/1 (HE), GR/S82213/01 and \\EP/E045235/1 (SA), an EPSRC Leadership Fellowship, \\EP/G007144/1 (MDP) and EPSRC IDyOM2 EP/H013059/1.

\bibliography{all,c4dm,all2}\bibliographystyle{aaai}
\end{document}