changeset 13:5e2c50450ace
Improved the bits about the triangle. Next I will rewrite the user-interfaces section.
author      Henrik Ekeus <hekeus@eecs.qmul.ac.uk>
date        Sat, 04 Feb 2012 15:08:36 +0000
parents     d26e4561d036
children    2f88a91ad01f
files       nime2012/kinnect.pdf nime2012/mtriange.pdf nime2012/mtriange.tex
diffstat    3 files changed, 31 insertions(+), 19 deletions(-)
\begin{abstract}
The Melody Triangle is a Markov-chain based melody generator where the input - positions within a triangle - directly map to information-theoretic measures of its output. These measures are the entropy rate, redundancy and \emph{predictive information rate}\cite{Abdallah} of the melody. Predictive information rate is a \emph{time-varying} information measure developed as part of the Information Dynamics of Music project (IDyOM)\footnote{http://www.idyom.org/}. It characterises temporal structure and is a way of modelling expectation and surprise in the perception of music.
We describe the Information Dynamics model and how it forms the basis of the Melody Triangle, and outline two interfaces and uses of the system. The first is a multi-user installation where collaboration in a performative setting provides a playful yet informative way to explore expectation and surprise in music.
The second is a screen-based interface where the Melody Triangle becomes a compositional tool for the generation of intricate musical textures using an abstract, high-level description of predictability. Finally we outline a study in which the screen-based interface was used under experimental conditions to determine how the three measures of predictive information rate, entropy and redundancy might relate to musical preference. We found that\dots
\end{abstract}

\keywords{Information dynamics, Markov chains, Collaborative performance, Aleatoric composition}

\section{Information dynamics}
Music involves patterns in time. When listening to music we continually build and re-evaluate expectations of what is to come next. Composers, consciously or not, commonly play with this, setting up expectations that may or may not be fulfilled. This manipulation of expectation and surprise in the listener has been articulated by [ref]. [little more background on expectation]

The research into Information Dynamics explores several different kinds of predictability in musical patterns, how human listeners might perceive these, and how they shape or affect the listening experience. [more on IDyOM project]

\section{The Melody Triangle}
%%%How we created the transition matrices and the triangle.
The Melody Triangle is based on first-order Markov chains, represented as transition matrices, that generate streams of symbols. By mapping the symbols to individual notes, melodies are generated; layering several such streams of notes can result in intricate musical textures. The choice of notes or scale is not part of the Melody Triangle's core functionality; in fact the symbols could be mapped to anything, even non-sonic outputs.

Any sequence of symbols can be analysed and information-theoretic measures taken from it. The novelty of the Melody Triangle lies in that we go `backwards': given desired values for these measures, as determined from the user interface, we return a stream of symbols that matches those measures. The information measures used are redundancy, entropy rate and predictive information rate.

\subsection{Information measures}
\subsubsection{Redundancy}
[todo - a more formal description] Redundancy is the difference between our uncertainty about the next symbol before we look at the context (the fixed-point distribution) and our uncertainty after we look at the context.
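One way the formal description might be filled in, following our reading of \cite{Abdallah} (to be checked): for a first-order chain with transition matrix $a_{ij} = P(X_{t+1}=j \mid X_t=i)$ and stationary distribution $\pi$, the redundancy is the marginal entropy minus the entropy rate,
\begin{equation}
\rho = H(\pi) - h, \qquad H(\pi) = -\sum_i \pi_i \log \pi_i, \qquad h = -\sum_i \pi_i \sum_j a_{ij} \log a_{ij},
\end{equation}
where $h$, the entropy rate, is discussed in the next subsection.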
For instance, a transition matrix with high redundancy, such as one representing a long periodic sequence, gives high uncertainty before we look at the context; but as soon as we look at the previous symbol the uncertainty drops to zero, because we now know what is coming next.

\subsubsection{Entropy rate}
[todo - a more formal description] Entropy rate is the average uncertainty about the next symbol as we go through the sequence. A looping sequence has an entropy rate of zero; a sequence that is difficult to predict has a high entropy rate. Entropy rate is an average of `surprisingness' over time.

\subsubsection{Predictive Information Rate}
[todo - a more formal description] Predictive information rate tells us the average reduction in uncertainty about the future upon perceiving each symbol; in a system with a high predictive information rate, each symbol tells us more about the next. In a purely periodic sequence, each symbol tells us nothing about the next that we did not already know, since we already know how the pattern goes. Similarly, in a seemingly uncorrelated sequence, seeing the next symbol tells us nothing more because the symbols are completely independent; there is no pattern. There is a subset of transition matrices with a high predictive information rate, and it contains neither the periodic ones nor the completely uncorrelated ones. Rather, these matrices tend to yield output with certain characteristic patterns, though a listener cannot necessarily know when those patterns will occur. A given sequence of symbols might, however, tell us which of the characteristic patterns will show up next. Each symbol tells us a little about the future but nothing about the infinite future; we only learn about that as time goes on, so there is a continual building of prediction.

\begin{figure}
\centering
\includegraphics[width=0.2\textwidth]{PeriodicMatrix.png}
\includegraphics[width=0.2\textwidth]{NonDeterministicMatrix_bw.png}
\caption{Two transition matrices. The shade represents the probability of transition from one symbol to the next (black=0, white=1). The current symbol is along the bottom, and in this case there are twelve possibilities (mapped to a chromatic scale). The left-hand matrix has no uncertainty; it represents a periodic pattern. We can see, for example, that symbol 4 must follow symbol 3, that symbol 10 must follow symbol 4, and so on. The right-hand matrix contains unpredictability but is nonetheless not completely without perceivable structure; it has a higher entropy rate. \label{TransitionMatrixes}}
\end{figure}

\begin{figure}
\centering
\includegraphics[width=0.5\textwidth]{MatrixDistribution.png}
\caption{The population of transition matrices distributed along the three axes of redundancy, entropy rate and predictive information rate. Note how the distribution forms a triangle-like plane floating in 3D space.
\label{InfoDynEngine}}
\end{figure}

\subsection{Making the triangle}
We generate hundreds of transition matrices, representing first-order Markov chains, by a random sampling method. These are then plotted in a 3D statistical space of redundancy, entropy rate and predictive information rate. In figure \ref{InfoDynEngine} we see how these matrices are distributed; each point corresponds to one transition matrix.

\begin{figure}[h]
\centering
\includegraphics[width=0.5\textwidth]{TheTriangle.pdf}
\caption{The Melody Triangle [todo fix shading to be more triangular] \label{TheTriangle}}
\end{figure}

When we look at the distribution of these randomly generated transition matrices plotted in this space, we see that it forms an arch shape that is fairly thin. It is thus a reasonable approximation to treat it as a two-dimensional sheet, and so we stretch this curved arc out into a flat triangle.
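The sampling-and-plotting step just described can be sketched in code. This is a speculative reconstruction, not the authors' implementation: the measure formulas follow our reading of \cite{Abdallah} (for a first-order chain the predictive information rate reduces to $H(X_{t+1}\mid X_{t-1}) - H(X_{t+1}\mid X_t)$, computed from the two-step matrix), and Dirichlet-distributed rows are only one plausible way to sample the matrix population, which the text does not specify.

```python
import numpy as np

def stationary(a):
    """Stationary distribution pi of transition matrix a,
    where a[i, j] = P(next = j | current = i) and rows sum to 1."""
    vals, vecs = np.linalg.eig(a.T)
    v = np.real(vecs[:, np.argmax(np.real(vals))])  # eigenvector for eigenvalue 1
    return v / v.sum()

def H(p):
    """Shannon entropy in bits, ignoring zero-probability entries."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def measures(a):
    """(entropy rate, redundancy, predictive information rate) of a
    first-order Markov chain, in bits per symbol."""
    pi = stationary(a)
    h = np.sum([pi[i] * H(a[i]) for i in range(len(pi))])        # entropy rate
    rho = H(pi) - h                                              # redundancy
    a2 = a @ a                                                   # two-step transitions
    b = np.sum([pi[i] * H(a2[i]) for i in range(len(pi))]) - h   # PIR
    return h, rho, b

# Populate the statistical space: hundreds of random 12-symbol matrices
# (Dirichlet rows are an assumption, not the paper's stated method).
rng = np.random.default_rng(0)
population = [rng.dirichlet(np.full(12, 0.1), size=12) for _ in range(500)]
cloud = np.array([measures(a) for a in population])  # one 3D point per matrix
```

Sanity checks match the prose: a periodic 3-cycle has zero entropy rate and redundancy $\log_2 3$, while a fully uncorrelated (uniform) matrix has entropy rate $\log_2 3$ and zero redundancy; both have zero predictive information rate.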
It is this triangular sheet that is our `Melody Triangle', and it forms the interface by which the system is controlled.

Whether the Melody Triangle is used as a screen-based system or as an interactive installation, it involves a mapping into this statistical space. As can be seen in figure \ref{TheTriangle}, a position within the triangle maps to particular values of redundancy, entropy rate and predictive information rate.

[todo improve, include example melodies?] Each corner corresponds to a different extreme of predictability or unpredictability, which could be loosely characterised as periodicity, noise and repetition. Melodies from the `noise' corner have no discernible pattern; they have a high entropy rate, a low predictive information rate and low redundancy, and are essentially totally random. Melodies along the `periodicity' to `repetition' edge are all deterministic loops that get shorter as we approach the `repetition' corner, until each becomes a single repeating note. It is the areas in between that provide the more interesting melodies: those that have some level of unpredictability but are not completely random, and conversely those that are predictable but not entirely so. This triangular space allows for an intuitive exploration of expectation and surprise in temporal sequences.

\subsection{The Multi-User Installation}

\begin{figure}
\centering
\includegraphics[width=0.5\textwidth]{kinnect.pdf}
\caption{The depth map as seen by the Kinect; the bounding boxes outline the blobs detected by OpenNI.\label{Kinnect}}
\end{figure}

[old text - rewrite!] The statistical properties of the melody are based on where in the physical room the participant is standing, as this position is mapped into the statistical space (see below).
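How a position (on screen or in the room) selects a concrete transition matrix is not detailed above. A minimal nearest-neighbour sketch, with entirely hypothetical names and a toy catalogue standing in for the real sampled matrices, might look like:

```python
import math

def nearest_matrix(position, catalogue):
    """Return the stored matrix whose triangle coordinates lie closest
    to `position`, an (x, y) point in the flattened triangle.
    `catalogue` maps (x, y) coordinates to transition matrices."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    return catalogue[min(catalogue, key=lambda q: dist(position, q))]

# Toy catalogue: the three corner archetypes only.
corners = {(0.0, 0.0): "noise", (1.0, 0.0): "repetition", (0.5, 1.0): "periodicity"}
```

A user standing near the repetition corner, e.g. `nearest_matrix((0.9, 0.1), corners)`, would be served the `"repetition"` entry.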
By exploring the physical space, participants thus explore the predictability of the melodic and rhythmic patterns, based on a simple model of how one might guess the next musical event given the previous one.

\subsubsection{Tracking and Control}
The installation uses the OpenNI library's API and high-level middleware for tracking with the Kinect. This provides reliable blob tracking of humanoid forms in 2D space. By triangulating these blobs against the Kinect's depth map, it becomes possible to obtain reliable coordinates for visitors' positions in the space.

This system was further extended to detect gestures. By finding the bounding box of an individual's 2D blob and normalising it by the distance given by the depth map, it became possible to work out whether an individual had an arm stretched out or was crouching.
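The depth normalisation could work roughly as follows. This is a speculative sketch, not the installation's code: the pinhole conversion uses a typical Kinect focal length, and the size thresholds are illustrative guesses.

```python
def classify_gesture(bbox_w_px, bbox_h_px, depth_m, focal_px=525.0):
    """Classify a tracked person's pose from their 2D bounding box.

    Pixel sizes are converted to approximate metres with a pinhole
    model, size_m ~= size_px * depth_m / focal_px; 525 px is a typical
    Kinect focal length and the thresholds are illustrative only.
    """
    w_m = bbox_w_px * depth_m / focal_px
    h_m = bbox_h_px * depth_m / focal_px
    if w_m > 1.0:      # much wider than a torso: an arm is stretched out
        return "arm outstretched"
    if h_m < 1.2:      # well below standing height: crouching
        return "crouching"
    return "standing"
```

The point of dividing by depth is that a raw pixel bounding box shrinks as a visitor walks away from the sensor, so thresholds in pixels would misclassify distant visitors; in metric units the same thresholds apply anywhere in the room.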