comparison draft.tex @ 46:df41539257ba

Added some graphics for the melody triangle trial.
author samer
date Thu, 15 Mar 2012 20:05:35 +0000
parents 244b74fb707d
children 9a0d400bc827
both $x_t$ and the context $\past{x}_t$:
\begin{equation}
	\ell_t = - \log p(x_t|\past{x}_t).
\end{equation}
However, before $X_t$ is observed to be $x_t$, the observer can compute
the \emph{expected} surprisingness as a measure of its uncertainty about
the very next event; this may be written as an entropy
$H(X_t|\ev(\past{X}_t = \past{x}_t))$, but note that this is
conditional on the \emph{event} $\ev(\past{X}_t=\past{x}_t)$, not
\emph{variables} $\past{X}_t$ as in the conventional conditional entropy.
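For a discrete alphabet $\X$, this expected surprisingness is simply the average
of $\ell_t$ under the observer's predictive distribution for the next symbol,
stated here explicitly for concreteness:
\begin{equation}
	H(X_t|\ev(\past{X}_t = \past{x}_t)) = - \sum_{x \in \X} p(x|\past{x}_t) \log p(x|\past{x}_t).
\end{equation}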
% ...
\begin{equation}
	\mathcal{I}_t = \sum_{\fut{x}_t \in \X^*}
		p(\fut{x}_t|x_t,\past{x}_t) \log \frac{ p(\fut{x}_t|x_t,\past{x}_t) }{ p(\fut{x}_t|\past{x}_t) },
\end{equation}
where the sum is to be taken over the set of infinite sequences $\X^*$.
Note that it is quite possible for an event to be surprising but not informative
in a predictive sense.
As with the surprisingness, the observer can compute its \emph{expected} IPI
at time $t$, which reduces to a mutual information $I(X_t;\fut{X}_t|\ev(\past{X}_t=\past{x}_t))$
conditioned on the observed past. This could be used, for example, as an estimate
of the attentional resources that should be directed at this stream of data, which may
be in competition with other sensory streams.
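Written out for a discrete alphabet, this expected IPI is just the average of
$\mathcal{I}_t$ over the predictive distribution for the next symbol:
\begin{equation}
	I(X_t;\fut{X}_t|\ev(\past{X}_t=\past{x}_t))
		= \sum_{x \in \X} p(x|\past{x}_t)
			\sum_{\fut{x}_t \in \X^*} p(\fut{x}_t|x,\past{x}_t)
			\log \frac{ p(\fut{x}_t|x,\past{x}_t) }{ p(\fut{x}_t|\past{x}_t) }.
\end{equation}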
% ...
The \emph{entropy rate} of the process is the entropy of the next variable
$X_t$ given all the previous ones:
\begin{equation}
	\label{eq:entro-rate}
	h_\mu = H(X_t|\past{X}_t).
\end{equation}
The entropy rate gives a measure of the overall surprisingness
or unpredictability of the process.

The \emph{multi-information rate} $\rho_\mu$ (following Dubnov's \cite{Dubnov2006}
notation for what he called the `information rate') is the mutual
information between the `past' and the `present':
% ...
of as measures of \emph{redundancy}, quantifying the extent to which
the same information is to be found in all parts of the sequence.

The \emph{predictive information rate} (or PIR) \cite{AbdallahPlumbley2009}
is the mutual information between the present and the infinite future given the infinite
past:
\begin{equation}
	\label{eq:PIR}
	b_\mu = I(X_t;\fut{X}_t|\past{X}_t) = H(\fut{X}_t|\past{X}_t) - H(\fut{X}_t|X_t,\past{X}_t).
\end{equation}
Equation \eqrf{PIR} can be read as the average reduction
% ...
or \emph{erasure} \cite{VerduWeissman2006} entropy rate.
These relationships are illustrated in \Figrf{predinfo-bg}, along with
several of the information measures we have discussed so far.

James et al.\ \cite{JamesEllisonCrutchfield2011} review several of these
information measures and introduce some new related ones.
In particular they identify $\sigma_\mu = I(\past{X}_t;\fut{X}_t|X_t)$,
the mutual information between the past and the future given the present,
as an interesting quantity that measures the predictive benefit of
model-building (that is, maintaining an internal state summarising past
observations in order to make better predictions). It is shown as the
small dark region below the circle in \figrf{predinfo-bg}(c).
By comparing with \figrf{predinfo-bg}(b), we can see that
$\sigma_\mu = E - \rho_\mu$.
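This identity follows from the chain rule for mutual information, assuming that
the excess entropy $E$ is the mutual information between the infinite past and
the present together with the infinite future (the reading suggested by
\figrf{predinfo-bg}):
\begin{equation}
	E = I(\past{X}_t; X_t, \fut{X}_t)
		= I(\past{X}_t; X_t) + I(\past{X}_t; \fut{X}_t | X_t)
		= \rho_\mu + \sigma_\mu.
\end{equation}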
% They also identify
% $w_\mu = \rho_\mu + b_{\mu}$, which they call the \emph{local exogenous
% information} rate.

% ...
expressions for all the information measures described in \secrf{surprise-info-seq} for
irreducible stationary Markov chains (\ie that have a unique stationary
distribution). The derivation is greatly simplified by the dependency structure
of the Markov chain: for the purpose of the analysis, the `past' and `future'
segments $\past{X}_t$ and $\fut{X}_t$ can be collapsed to just the previous
and next variables $X_{t-1}$ and $X_{t+1}$ respectively. We also showed that
the predictive information rate can be expressed simply in terms of entropy rates:
if we let $a$ denote the $K\times K$ transition matrix of a Markov chain over
an alphabet $\{1,\ldots,K\}$, such that
$a_{ij} = \Pr(\ev(X_t=i)|\ev(X_{t-1}=j))$, and let $h:\reals^{K\times K}\to \reals$ be
the entropy rate function such that $h(a)$ is the entropy rate of a Markov chain
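The entropy rate function referred to here has the standard explicit form,
writing $\pi$ for the chain's unique stationary distribution (so that $\pi = a\pi$
with $a$ as defined above):
\begin{equation}
	h(a) = - \sum_{j=1}^{K} \pi_j \sum_{i=1}^{K} a_{ij} \log a_{ij}.
\end{equation}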
% ...
obtained using two rule-based music segmentation algorithms, while clearly
\emph{reflecting} the structure of the piece, do not \emph{segment} the piece:
the boundary strength functions show no tendency to peak at
the boundaries in the piece.

The complete analysis of \emph{Gradus} can be found in \cite{AbdallahPlumbley2009},
but \figrf{metre} illustrates the result of a metrical analysis: the piece was divided
into bars of 32, 64 and 128 notes. In each case, the average surprisingness and
IPI for the first, second, third \etc notes in each bar were computed. The plots
show that the first note of each bar is, on average, significantly more surprising
and informative than the others, up to the 64-note level, whereas at the 128-note
level, the dominant periodicity appears to remain at 64 notes.
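The per-bar averaging described above is straightforward to reproduce; the
following is a minimal sketch (the variable and function names are hypothetical,
not those used for the published analysis):
\begin{verbatim}
import numpy as np

def mean_by_phase(values, bar_length):
    """Average a per-note surprisal or IPI sequence over each position
    (phase) within a hypothetical bar of the given length."""
    n = (len(values) // bar_length) * bar_length  # drop any incomplete final bar
    bars = np.asarray(values[:n]).reshape(-1, bar_length)
    return bars.mean(axis=0)  # one average per position within the bar

# Profiles for the three hypothetical bar lengths considered in the text:
# profiles = {L: mean_by_phase(surprisal, L) for L in (32, 64, 128)}
\end{verbatim}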
\begin{fig}{metre}
% \scalebox{1}[1]{%
	\begin{tabular}{cc}
		\colfig[0.45]{matbase/fig36859} & \colfig[0.48]{matbase/fig88658} \\
% ...
		informative of notes at different periodicities (\ie hypothetical
		bar lengths) and phases (\ie positions within a bar).
	}
\end{fig}
\subsection{Content analysis/Sound Categorisation}
Using analogous definitions of differential entropy, the methods outlined
in the previous section are equally applicable to continuous random variables.
In the case of music, where expressive properties such as dynamics, tempo,
timing and timbre are readily quantified on a continuous scale, the information
dynamics framework may thus also be applied.
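For a continuous random variable $X$ with density $p$, the relevant analogue of
entropy is the differential entropy, written here with the same symbol $H$ for
simplicity:
\begin{equation}
	H(X) = - \int p(x) \log p(x) \,\mathrm{d}x,
\end{equation}
with conditional entropies and mutual informations obtained in the same way by
replacing sums over the alphabet with integrals.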
% ...
The triangle is `populated' with possible parameter values for melody generators.
These are plotted in a 3D information space of $\rho_\mu$ (redundancy), $h_\mu$ (entropy rate) and
$b_\mu$ (predictive information rate), as defined in \secrf{process-info}.
In our case we generated thousands of transition matrices, representing first-order
Markov chains, by a random sampling method. Figure \ref{InfoDynEngine} shows
how these matrices are distributed in the 3D information space;
each of these points corresponds to a transition matrix.

The distribution of transition matrices plotted in this space forms an arch shape
that is fairly thin. It thus becomes a reasonable approximation to treat it as
a two-dimensional sheet, and so we stretch out this curved arc into
a flat triangle. It is this triangular sheet that is our `Melody Triangle' and
forms the interface by which the system is controlled. Using this interface
thus involves a mapping to information space; a user selects a position within
the triangle, and a corresponding transition matrix is returned. Figure
\ref{TheTriangle} shows how the triangle maps to different measures of redundancy,
entropy rate and predictive information rate.
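To make the construction concrete, the sketch below samples random column-stochastic
transition matrices and places each one in the $(h_\mu, \rho_\mu, b_\mu)$ space.
The Dirichlet sampling scheme and the function names are assumptions made for the
sake of illustration rather than the system's actual implementation; the identities
$\rho_\mu = H(\pi) - h(a)$ and $b_\mu = h(a^2) - h(a)$, where $\pi$ is the stationary
distribution, follow from collapsing the past and future to $X_{t-1}$ and $X_{t+1}$
as described in \secrf{process-info}.
\begin{verbatim}
import numpy as np

def entropy(p):
    """Entropy (in bits) of a discrete distribution; 0 log 0 is taken as 0."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def stationary(a):
    """Stationary distribution of a column-stochastic transition matrix a."""
    vals, vecs = np.linalg.eig(a)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])  # eigenvector for eigenvalue 1
    return pi / pi.sum()

def entropy_rate(a, pi):
    """h(a): average entropy of the next-symbol distribution under pi."""
    return sum(pi[j] * entropy(a[:, j]) for j in range(a.shape[1]))

def info_coords(a):
    """Return (h_mu, rho_mu, b_mu) for a first-order Markov chain."""
    pi = stationary(a)
    h = entropy_rate(a, pi)
    rho = entropy(pi) - h            # multi-information rate
    b = entropy_rate(a @ a, pi) - h  # predictive information rate
    return h, rho, b

# Sample transition matrices with Dirichlet-distributed columns (an assumption;
# the sampling scheme used to populate the triangle is not specified here).
rng = np.random.default_rng(0)
K = 8
points = [info_coords(rng.dirichlet(np.full(K, 0.1), size=K).T)
          for _ in range(1000)]
\end{verbatim}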
% ...
explore expectation and surprise in music. Additionally, different gestures could
be detected to change the tempo, register, instrumentation and periodicity of
the output melody.

As a screen-based interface, the Melody Triangle can serve as a composition tool.
A triangle is drawn on the screen, with screen space thus mapped to the information
space of the Melody Triangle. A number of round tokens, each representing a
melody, can be dragged in and around the triangle. When a token is dragged into
the triangle, the system will start generating the sequence of symbols with
statistical properties that correspond to the position of the token. These
symbols are then mapped to notes of a scale.
% ...

\subsection{Information Dynamics as Evaluative Feedback Mechanism}
%NOT SURE THIS SHOULD BE HERE AT ALL..?

\begin{fig}{mtri-results}
	\def\scat#1{\colfig[0.42]{mtri/#1}}
	\def\subj#1{\scat{scat_dwells_subj_#1} & \scat{scat_marks_subj_#1}}
	\begin{tabular}{cc}
		\subj{a} \\
		\subj{b} \\
		\subj{c} \\
		\subj{d}
	\end{tabular}
	\caption{Dwell times and mark positions from user trials with the
	on-screen Melody Triangle interface. The left-hand column shows
	the positions in a 2D information space (entropy rate vs multi-information rate
	in bits) where subjects spent their time; the area of each circle is proportional
	to the time spent there. The right-hand column shows the points which subjects
	`liked'.}
\end{fig}

Information measures on a stream of symbols can form a feedback mechanism; a
rudimentary `critic' of sorts. For instance, symbol-by-symbol measures of predictive
information rate, entropy rate and redundancy could tell us if a stream of symbols
is currently `boring', either because it is too repetitive, or because it is too
% ...
the state of the system when users, by pressing a key, indicate that they like
what they are hearing. As such, the experiments will help us identify any
correlation between the information-theoretic properties of a stream and its
perceived aesthetic worth.
Some initial results for four subjects are shown in \figrf{mtri-results}. Though
subjects seem to exhibit distinct kinds of exploratory behaviour, we have
not been able to show any systematic across-subjects preference for any particular
region of the triangle.

Subjects' comments: several noticed the main organisation of the triangle:
repetitive notes at the top, cyclic patterns along the right edge, and unpredictable
notes towards the bottom left (a,c,f). Some did systematic exploration.
Some (a,f) felt that the right side was more `controllable' than the left---a direct
consequence of their ability to return to a particular periodic pattern and recognise
it as one heard previously. Some (a,e) felt the trial was too long and became
bored towards the end.
One subject (f) felt there wasn't enough time to hear out the patterns properly.
One subject (b) didn't enjoy the lower region whereas another (d) said the lower
regions were more `melodic' and `interesting'.
%\emph{comparable system} Gordon Pask's Musicolor (1953) applied a similar notion
%of boredom in its design. The Musicolour would react to audio input through a
%microphone by flashing coloured lights. Rather than a direct mapping of sound
%to light, Pask designed the device to be a partner to a performing musician. It