changeset 11:0d5cdc287fa4
Sensible abstract. Some (but not very much) more coherence.
author    Henrik Ekeus <hekeus@eecs.qmul.ac.uk>
date      Sat, 04 Feb 2012 00:38:55 +0000
parents   24fead62b853
children  d26e4561d036
files     nime2012/mtriange.pdf nime2012/mtriange.tex
diffstat  2 files changed, 35 insertions(+), 23 deletions(-)
--- a/nime2012/mtriange.tex	Fri Feb 03 21:38:41 2012 +0000
+++ b/nime2012/mtriange.tex	Sat Feb 04 00:38:55 2012 +0000
@@ -25,47 +25,59 @@
 \begin{document}
 \maketitle
 \begin{abstract}
-The Melody Triangle is a Markov-chain based musical pattern generator who's input parameters correspond to information theoretic measures of the system's own output. The primary user input, positions within a triangle, directly maps to the amount of entropy, redundancy and \emph{predictive information rate}\cite{Abdallah} of output melodies. Predictive information rate is a \emph{time-varying} information measure. Developed as part of the Information Dynamics of Music project(IDyOM)\footnote{http://www.idyom.org/}, it characterises temporal structure and is a way of modelling expectation and surprise in the perception of music.
+The Melody Triangle is a Markov-chain based melody generator where the input, positions within a triangle, maps directly to information-theoretic measures of its output: the amount of entropy, redundancy and \emph{predictive information rate}\cite{Abdallah} in the melody. Predictive information rate is a \emph{time-varying} information measure developed as part of the Information Dynamics of Music project (IDyOM)\footnote{http://www.idyom.org/}. It characterises temporal structure and is a way of modelling expectation and surprise in the perception of music.
-We describe the Information Dynamics model and how it forms the basis of the Melody Triangle. We outline the interfaces to the system - a multi-user installation where collaboration in a performative setting provides a playful yet informative way to explore expectation and surprise in music, and a screen based interface where the Melody Triangle becomes compositional tool that allows for the generation of intricate musical textures using an abstract, high-level description of predictability. Finally we outline a study where participants used the screen-based interface under experimental conditions to allow us to determine the relationship between the Information Dynamics models and musical preference. We found that\dots
+We describe the Information Dynamics model and how it forms the basis of the Melody Triangle. We outline two interfaces and uses of the system. The first is a multi-user installation where collaboration in a performative setting provides a playful yet informative way to explore expectation and surprise in music. The second is a screen-based interface where the Melody Triangle becomes a compositional tool that allows for the generation of intricate musical textures from an abstract, high-level description of predictability. Finally we outline a study in which the screen-based interface was used under experimental conditions to determine how the three measures of predictive information rate, entropy and redundancy might relate to musical preference. We found that\dots
 \end{abstract}
 \keywords{Information dynamics, Markov chains, Collaborative performance, Aleatoric composition}
-\section{Introduction}
+\section{Information dynamics}
- Music involve patterns in time, and when listening to music we create expectations of what is to come next, and composers commonly, consciously or not, play with this expectation by setting up expectations which may, or may not be fulfilled, and thus manipulate expectations and surprise in the listener[ref]. The research into Information Dynamics explores several different kinds of predictability in musical patterns, how human listeners might perceive these, and how they shape or affect the listening experience.
+ Music involves patterns in time, and when listening to music we create expectations of what is to come next. Composers commonly, consciously or not, play with this by setting up expectations which may, or may not, be fulfilled. This manipulation of expectation and surprise in the listener has been articulated by [ref]. [little more background on expectation]
+
+The research into Information Dynamics explores several different kinds of predictability in musical patterns, how human listeners might perceive these, and how they shape or affect the listening experience. [more on IDyOM]
-\section{Information Dynamics and the Triangle }
-(some background on IDyOM and Markov chains)
+\subsection{Predictive Information Rate}
+[todo quick summary of predictive information rate]
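+% A sketch of the standard definition, added as a placeholder for the summary
+% called for above; it assumes a stationary first-order Markov chain $X_t$, and
+% $h(a)$ denotes the entropy rate of a chain with transition matrix $a$ (defined
+% in the next subsection).
+The predictive information rate is the average information a symbol carries about the unseen future given the known past, $b = I(X_t ; X_{>t} \mid X_{<t})$. For a first-order Markov chain with transition matrix $a$ this reduces to
+\begin{equation}
+b = H(X_{t+1} \mid X_{t-1}) - H(X_{t+1} \mid X_t) = h(a^2) - h(a),
+\end{equation}
+where $a^2$ is the transition matrix of the chain observed two steps at a time.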
+\subsection{Redundancy and Entropy}
+[todo quick summary of the redundancy and entropy measures used]
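+% A sketch of the standard definitions, added as a placeholder for the summary
+% called for above; it assumes a stationary first-order Markov chain over $N$
+% symbols with transition matrix $a$, where $a_{ij}$ is the probability that
+% symbol $i$ follows symbol $j$, and fixed point (stationary) distribution $\pi$.
+The entropy rate is the expected uncertainty about the next symbol given the current one,
+\begin{equation}
+h(a) = -\sum_{j=1}^{N} \pi_j \sum_{i=1}^{N} a_{ij} \log a_{ij},
+\end{equation}
+and the redundancy is the drop in uncertainty once the context is known,
+\begin{equation}
+\rho = H(\pi) - h(a), \qquad H(\pi) = -\sum_{i=1}^{N} \pi_i \log \pi_i.
+\end{equation}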
+\section{The Melody Triangle}
+%%%How we created the transition matrices and created the triangle.
-\begin{figure*}[t]
+Given a stream of symbols we can calculate values for redundancy, entropy and predictive information rate as they happen. The Melody Triangle works `backwards': given values for redundancy, entropy and predictive information rate, it returns a stream of symbols that matches those measures.
+
+First-order Markov chains, represented as transition matrices, generate the streams. By mapping the symbols to individual notes, melodies are generated, and by layering multiple instances of this process, complex musical textures can be built up. The choice of notes and scale is not part of the Melody Triangle's core functionality; the symbols could be mapped to anything, even non-sonic outputs.
+
+\subsection{Transition Matrices}
+\begin{figure}
 \centering
 \includegraphics[width=0.3\textwidth]{PeriodicMatrix.png}
 \includegraphics[width=0.3\textwidth]{NonDeterministicMatrix_bw.png}
-\caption{Two transition matrixies. The colour represents the probabilities of transition from one state to the next (black=0, white=1). The left matrix has no uncertainty. It represents a periodic pattern. The right hand matrix contains unpredictability but nonetheless is not completely without perceivable structure.\label{TransitionMatrixes}}
-\end{figure*}
+\caption{Two transition matrices. The colour represents the probability of transition from one state to the next (black=0, white=1). The current symbol is along the bottom, and in this case there are twelve possibilities (mapped to a chromatic scale). The upper matrix has no uncertainty; it represents a periodic pattern. We can see, for example, that symbol 4 must follow symbol 3, and that symbol 10 must follow symbol 4, and so on. The lower matrix contains unpredictability but is nonetheless not completely without perceivable structure; it is of a higher entropy.\label{TransitionMatrixes}}
+\end{figure}
-The Information Dynamics model operates on discreet symbols, only at the output stage is any symbol mapped to a particular note. Each stream of symbols is at any one time defined by a transition matrix. A transition matrix defines the probabilistic distribution for the symbol following the current one. In fig.\ref{TransitionMatrixes} we see two transition matrixes, the one on the left has no uncertainty and therefore outlines a periodic pattern. The one on the right has unpredictability but is nonetheless not completely without perceivable structure, it is of a higher entropy. The current symbol is along the bottom, and in this case there are twelve possibilities (mapped to a chromatic scale). In the left hand matrix, we can see for example that symbol 4 must follow symbol 3, and that symbol 10 must follow symbol 4, and so on.
-\begin{figure*}[t]
+
+
+\begin{figure}
 \centering
-\includegraphics[width=0.75\textwidth]{MatrixDistribution.png}
+\includegraphics[width=0.5\textwidth]{MatrixDistribution.png}
 \caption{The population of transition matrixes distributed along three axes of redundancy, entropy rate and predictive
 information rate. Note how the distribution makes a triangle-like plane floating in 3d space.\label{InfoDynEngine}}
-\end{figure*}
- Hundreds of transition matrixes are generated, and they are then placed in a 3d statistical space based on 3 information measures calculated from the matrix, these are redundancy, entropy rate, and predictive-information rate [see [cite]]. In fig.\ref{InfoDynEngine} on the right, we see a representation of these matrixes distributed; each one of these points corresponds to a transition matrix. Entropy rate is the average uncertainty for the next symbol as we go through the sequence. A looping sequence has 0 entropy, a sequence that is difficult to predict has high entropy rate. Entropy rate is an average of `surprisingness' over time. Redundancy tells us the difference in uncertainty before we look at the context (the fixed point distribution) and the uncertainty after we look at context. For instance a matrix with high redundancy, such as one that represents a long periodic sequence, would have high uncertainty before we look at the context but as soon as we look at the previous symbol, the uncertainty drops to zero because we now know what is coming next. Predictive information rate tell us the average reduction in uncertainty upon perceiving a symbol; a system with high predictive information rate means that each symbol tells you more about the next one. If we imagine a purely periodic sequence, each symbol tells you nothing about the next one, that we didn't already know as we already know how the pattern is going. Similarly with a seemingly uncorrelated sequence, seeing the next symbol does not tell us anymore because they are completely independent anyway; there is no pattern. There is a subset of transition matrixes that have high predictive information rate, and it is neither the periodic ones, nor the completely un-corellated ones. Rather they tend to yield output that have certain characteristic patterns, however a listener can't necessarily know when they occur. However a certain sequence of symbols might tell us about which one of the characteristics patterns will show up next. Each symbols tell a us little bit about the future but nothing about the infinite future, we only learn about that as time goes on; there is continual building of prediction. When we look at the distribution of matrixes generated by a random sampling method in this 3d space of entropy rate, redundancy and predictive information rate, it forms an arch shape that is fairly thin, and it thus becomes a reasonable approximation to pretend that it is just a sheet in two dimensions(see fig.\ref{InfoDynEngine}). It is this triangular sheetfig.\ref{TheTriangle} that is then mapped either to the screen, or in the case of the interactive installation, physical space. Each corner corresponding to three different extremes of predictability/unpredictability, which could be loosely characterised as periodicity, noise and repetition.
+\end{figure}
+ Hundreds of transition matrices are randomly generated and then placed in a 3d statistical space according to the three information measures calculated from each matrix: redundancy, entropy rate, and predictive information rate [see [cite]]. In fig.\ref{InfoDynEngine} we see a representation of these matrices distributed; each point corresponds to a transition matrix. Entropy rate is the average uncertainty about the next symbol as we go through the sequence: a looping sequence has zero entropy rate, while a sequence that is difficult to predict has a high entropy rate. Entropy rate is thus an average of `surprisingness' over time. Redundancy is the difference between the uncertainty before we look at the context (under the fixed point distribution) and the uncertainty after we look at the context. For instance, a matrix with high redundancy, such as one that represents a long periodic sequence, has high uncertainty before we look at the context, but as soon as we see the previous symbol the uncertainty drops to zero, because we then know what is coming next. Predictive information rate tells us the average reduction in uncertainty about the future upon perceiving a symbol; in a system with high predictive information rate, each symbol tells us more about what comes next. In a purely periodic sequence each symbol tells us nothing we did not already know, since we already know how the pattern goes; similarly, in a seemingly uncorrelated sequence, seeing the next symbol tells us nothing more, because the symbols are completely independent and there is no pattern. There is a subset of transition matrices with high predictive information rate, and it comprises neither the periodic ones nor the completely uncorrelated ones. Rather, these matrices tend to yield output with certain characteristic patterns, though a listener cannot necessarily know when they will occur; a given sequence of symbols might, however, indicate which of the characteristic patterns will show up next. Each symbol tells us a little about the future, but nothing about the infinite future; we only learn about that as time goes on, so there is a continual building of prediction. When we look at the distribution of matrices generated by this random sampling method in the 3d space of entropy rate, redundancy and predictive information rate, it forms an arch shape that is fairly thin, so it is a reasonable approximation to treat it as a two-dimensional sheet (see fig.\ref{InfoDynEngine}). It is this triangular sheet (fig.\ref{TheTriangle}) that is then mapped either to the screen or, in the case of the interactive installation, to physical space. Each corner corresponds to one of three extremes of predictability/unpredictability, which can be loosely characterised as periodicity, noise and repetition.
-\begin{figure*}[t]
+\begin{figure}
 \centering
 \includegraphics[width=0.5\textwidth]{TheTriangle.pdf}
 \caption{The Melody Triangle \label{TheTriangle}}
-\end{figure*}
+\end{figure}
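+% An illustrative check, not in the original text: evaluating the definitions
+% above at the corners, assuming an alphabet of $N=12$ symbols.
+As a sanity check of the extremes: a cyclic permutation matrix (each symbol deterministically followed by the next, through all twelve) has $h = 0$ and $\rho = \log 12$, and since $a^2$ is also a permutation its predictive information rate is zero; a uniform matrix ($a_{ij} = 1/12$) has $h = \log 12$ and $\rho = 0$, and again zero predictive information rate, since $h(a^2) = h(a)$. Matrices with high predictive information rate therefore lie away from both of these corners.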
@@ -87,26 +99,26 @@
 \subsubsection{Tracking and Control}
 It uses the OpenNI library's API and high-level middleware for tracking with the Kinect. This provided reliable blob tracking of humanoid forms in 2d space. By then triangulating this against the Kinect's depth map it became possible to get reliable coordinates of visitors' positions in the space.
-\begin{figure*}[t]
+\begin{figure}
 \centering
 \includegraphics[width=1\textwidth]{kinnect.pdf}
 \caption{On the left we see the depth map as seen by the Kinect, with bounding boxes outlining the blobs detected by OpenNI. On the right is a bird's-eye view of the positions of the individuals tracked in the space. These are the positions that are mapped to the statistical space of the information dynamics engine.\label{Kinnect}}
-\end{figure*}
+\end{figure}
 This system was then further extended to detect gestures. By detecting the bounding box of the 2d blob of each individual in the space, and normalising it against the distance given by the depth map, it became possible to work out whether an individual had an arm stretched out or was crouching. With this it was possible to define a series of gestures for controlling the system without the use of any hand-held controllers. For instance, quickly sticking out one's left arm doubles the tempo of the melody; pulling the left arm in while sticking the right arm out shifts the melody onto the offbeat; extending both arms changes the instrument.
-\begin{figure*}[t]
+\begin{figure}
 \centering
 \includegraphics[width=0.5\textwidth]{InstructionsText.pdf}
 %\caption{\label{Kinnect}}
-\end{figure*}
+\end{figure}
-\begin{figure*}[t]
+\begin{figure}
 \centering
 \includegraphics[width=1.0\textwidth]{InstructionsImage.pdf}
 \caption{Control Gestures\label{Gestures}}
-\end{figure*}
+\end{figure}
 \subsection{The Screen Based Interface}