comparison draft.tex @ 46:df41539257ba

Added some graphics for the melody triangle trial.
author samer
date Thu, 15 Mar 2012 20:05:35 +0000
parents 244b74fb707d
children 9a0d400bc827
both $x_t$ and the context $\past{x}_t$:
\begin{equation}
	\ell_t = - \log p(x_t|\past{x}_t).
\end{equation}
However, before $X_t$ is observed to be $x_t$, the observer can compute
the \emph{expected} surprisingness as a measure of its uncertainty about
the very next event; this may be written as an entropy
$H(X_t|\ev(\past{X}_t = \past{x}_t))$, but note that this is
conditional on the \emph{event} $\ev(\past{X}_t=\past{x}_t)$, not
\emph{variables} $\past{X}_t$ as in the conventional conditional entropy.
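For a discrete alphabet $\X$, this expected surprisingness is simply the average
of $\ell_t$ under the observer's predictive distribution for the next symbol,
stated here explicitly for concreteness:
\begin{equation}
	H(X_t|\ev(\past{X}_t = \past{x}_t)) = - \sum_{x \in \X} p(x|\past{x}_t) \log p(x|\past{x}_t).
\end{equation}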
% ...
\begin{equation}
	\mathcal{I}_t = \sum_{\fut{x}_t \in \X^*}
		p(\fut{x}_t|x_t,\past{x}_t) \log \frac{ p(\fut{x}_t|x_t,\past{x}_t) }{ p(\fut{x}_t|\past{x}_t) },
\end{equation}
where the sum is to be taken over the set of infinite sequences $\X^*$.
Note that it is quite possible for an event to be surprising but not informative
in a predictive sense.
As with the surprisingness, the observer can compute its \emph{expected} IPI
at time $t$, which reduces to a mutual information $I(X_t;\fut{X}_t|\ev(\past{X}_t=\past{x}_t))$
conditioned on the observed past. This could be used, for example, as an estimate
of the attentional resources that should be directed at this stream of data, which may
be in competition with other sensory streams.
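Written out for a discrete alphabet, this expected IPI is just the average of
$\mathcal{I}_t$ over the predictive distribution for the next symbol:
\begin{equation}
	I(X_t;\fut{X}_t|\ev(\past{X}_t=\past{x}_t))
		= \sum_{x \in \X} p(x|\past{x}_t)
			\sum_{\fut{x}_t \in \X^*} p(\fut{x}_t|x,\past{x}_t)
			\log \frac{ p(\fut{x}_t|x,\past{x}_t) }{ p(\fut{x}_t|\past{x}_t) }.
\end{equation}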
% ...
The \emph{entropy rate} of the process is the entropy of the next variable
$X_t$ given all the previous ones:
\begin{equation}
	\label{eq:entro-rate}
	h_\mu = H(X_t|\past{X}_t).
\end{equation}
The entropy rate gives a measure of the overall surprisingness
or unpredictability of the process.

The \emph{multi-information rate} $\rho_\mu$ (following Dubnov's \cite{Dubnov2006}
notation for what he called the `information rate') is the mutual
information between the `past' and the `present':
% ...
of as measures of \emph{redundancy}, quantifying the extent to which
the same information is to be found in all parts of the sequence.

The \emph{predictive information rate} (or PIR) \cite{AbdallahPlumbley2009}
is the mutual information between the present and the infinite future given the infinite
past:
\begin{equation}
	\label{eq:PIR}
	b_\mu = I(X_t;\fut{X}_t|\past{X}_t) = H(\fut{X}_t|\past{X}_t) - H(\fut{X}_t|X_t,\past{X}_t).
\end{equation}
Equation \eqrf{PIR} can be read as the average reduction
% ...
or \emph{erasure} \cite{VerduWeissman2006} entropy rate.
These relationships are illustrated in \Figrf{predinfo-bg}, along with
several of the information measures we have discussed so far.

James et al.\ \cite{JamesEllisonCrutchfield2011} review several of these
information measures and introduce some new related ones.
In particular they identify $\sigma_\mu = I(\past{X}_t;\fut{X}_t|X_t)$,
the mutual information between the past and the future given the present,
as an interesting quantity that measures the predictive benefit of
model-building (that is, maintaining an internal state summarising past
observations in order to make better predictions). It is shown as the
small dark region below the circle in \figrf{predinfo-bg}(c).
By comparing with \figrf{predinfo-bg}(b), we can see that
$\sigma_\mu = E - \rho_\mu$.
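This identity follows from the chain rule for mutual information, assuming that
the excess entropy $E$ is the mutual information between the infinite past and
the present together with the infinite future (the reading suggested by
\figrf{predinfo-bg}):
\begin{equation}
	E = I(\past{X}_t; X_t, \fut{X}_t)
		= I(\past{X}_t; X_t) + I(\past{X}_t; \fut{X}_t | X_t)
		= \rho_\mu + \sigma_\mu.
\end{equation}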
% They also identify
% $w_\mu = \rho_\mu + b_{\mu}$, which they call the \emph{local exogenous
% information} rate.

% ...
expressions for all the information measures described in \secrf{surprise-info-seq} for
irreducible stationary Markov chains (\ie that have a unique stationary
distribution). The derivation is greatly simplified by the dependency structure
of the Markov chain: for the purpose of the analysis, the `past' and `future'
segments $\past{X}_t$ and $\fut{X}_t$ can be collapsed to just the previous
and next variables $X_{t-1}$ and $X_{t+1}$ respectively. We also showed that
the predictive information rate can be expressed simply in terms of entropy rates:
if we let $a$ denote the $K\times K$ transition matrix of a Markov chain over
an alphabet $\{1,\ldots,K\}$, such that
$a_{ij} = \Pr(\ev(X_t=i)|\ev(X_{t-1}=j))$, and let $h:\reals^{K\times K}\to \reals$ be
the entropy rate function such that $h(a)$ is the entropy rate of a Markov chain
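The entropy rate function referred to here has the standard explicit form,
writing $\pi$ for the chain's unique stationary distribution (so that $\pi = a\pi$
with $a$ as defined above):
\begin{equation}
	h(a) = - \sum_{j=1}^{K} \pi_j \sum_{i=1}^{K} a_{ij} \log a_{ij}.
\end{equation}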
% ...
obtained using two rule-based music segmentation algorithms, while clearly
\emph{reflecting} the structure of the piece, do not \emph{segment} the piece:
the boundary strength functions show no tendency to peak at
the boundaries in the piece.

The complete analysis of \emph{Gradus} can be found in \cite{AbdallahPlumbley2009},
but \figrf{metre} illustrates the result of a metrical analysis: the piece was divided
into bars of 32, 64 and 128 notes. In each case, the average surprisingness and
IPI for the first, second, third \etc notes in each bar were computed. The plots
show that the first note of each bar is, on average, significantly more surprising
and informative than the others, up to the 64-note level, whereas at the 128-note
level, the dominant periodicity appears to remain at 64 notes.
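The per-bar averaging described above is straightforward to reproduce; the
following is a minimal sketch (the variable and function names are hypothetical,
not those used for the published analysis):
\begin{verbatim}
import numpy as np

def mean_by_phase(values, bar_length):
    """Average a per-note surprisal or IPI sequence over each position
    (phase) within a hypothetical bar of the given length."""
    n = (len(values) // bar_length) * bar_length  # drop any incomplete final bar
    bars = np.asarray(values[:n]).reshape(-1, bar_length)
    return bars.mean(axis=0)  # one average per position within the bar

# Profiles for the three hypothetical bar lengths considered in the text:
# profiles = {L: mean_by_phase(surprisal, L) for L in (32, 64, 128)}
\end{verbatim}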
\begin{fig}{metre}
% \scalebox{1}[1]{%
	\begin{tabular}{cc}
		\colfig[0.45]{matbase/fig36859} & \colfig[0.48]{matbase/fig88658} \\
% ...
		informative of notes at different periodicities (\ie hypothetical
		bar lengths) and phases (\ie positions within a bar).
	}
\end{fig}
\subsection{Content analysis/Sound Categorisation}
Using analogous definitions of differential entropy, the methods outlined
in the previous section are equally applicable to continuous random variables.
In the case of music, where expressive properties such as dynamics, tempo,
timing and timbre are readily quantified on a continuous scale, the information
dynamics framework may thus also be applied.
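For a continuous random variable $X$ with density $p$, the relevant analogue of
entropy is the differential entropy, written here with the same symbol $H$ for
simplicity:
\begin{equation}
	H(X) = - \int p(x) \log p(x) \,\mathrm{d}x,
\end{equation}
with conditional entropies and mutual informations obtained in the same way by
replacing sums over the alphabet with integrals.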
% ...
The triangle is `populated' with possible parameter values for melody generators.
These are plotted in a 3D information space of $\rho_\mu$ (redundancy), $h_\mu$ (entropy rate) and
$b_\mu$ (predictive information rate), as defined in \secrf{process-info}.
In our case we generated thousands of transition matrices, representing first-order
Markov chains, by a random sampling method. Figure \ref{InfoDynEngine} shows
how these matrices are distributed in the 3D information space;
each of these points corresponds to a transition matrix.

The distribution of transition matrices plotted in this space forms an arch shape
that is fairly thin. It thus becomes a reasonable approximation to treat it as
a two-dimensional sheet, and so we stretch out this curved arc into
a flat triangle. It is this triangular sheet that is our `Melody Triangle' and
forms the interface by which the system is controlled. Using this interface
thus involves a mapping to information space; a user selects a position within
the triangle, and a corresponding transition matrix is returned. Figure
\ref{TheTriangle} shows how the triangle maps to different measures of redundancy,
entropy rate and predictive information rate.
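To make the construction concrete, the sketch below samples random column-stochastic
transition matrices and places each one in the $(h_\mu, \rho_\mu, b_\mu)$ space.
The Dirichlet sampling scheme and the function names are assumptions made for the
sake of illustration rather than the system's actual implementation; the identities
$\rho_\mu = H(\pi) - h(a)$ and $b_\mu = h(a^2) - h(a)$, where $\pi$ is the stationary
distribution, follow from collapsing the past and future to $X_{t-1}$ and $X_{t+1}$
as described in \secrf{process-info}.
\begin{verbatim}
import numpy as np

def entropy(p):
    """Entropy (in bits) of a discrete distribution; 0 log 0 is taken as 0."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def stationary(a):
    """Stationary distribution of a column-stochastic transition matrix a."""
    vals, vecs = np.linalg.eig(a)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])  # eigenvector for eigenvalue 1
    return pi / pi.sum()

def entropy_rate(a, pi):
    """h(a): average entropy of the next-symbol distribution under pi."""
    return sum(pi[j] * entropy(a[:, j]) for j in range(a.shape[1]))

def info_coords(a):
    """Return (h_mu, rho_mu, b_mu) for a first-order Markov chain."""
    pi = stationary(a)
    h = entropy_rate(a, pi)
    rho = entropy(pi) - h            # multi-information rate
    b = entropy_rate(a @ a, pi) - h  # predictive information rate
    return h, rho, b

# Sample transition matrices with Dirichlet-distributed columns (an assumption;
# the sampling scheme used to populate the triangle is not specified here).
rng = np.random.default_rng(0)
K = 8
points = [info_coords(rng.dirichlet(np.full(K, 0.1), size=K).T)
          for _ in range(1000)]
\end{verbatim}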
% ...
explore expectation and surprise in music. Additionally, different gestures could
be detected to change the tempo, register, instrumentation and periodicity of
the output melody.

As a screen-based interface, the Melody Triangle can serve as a composition tool.
A triangle is drawn on the screen, with screen space thus mapped to the information
space of the Melody Triangle. A number of round tokens, each representing a
melody, can be dragged in and around the triangle. When a token is dragged into
the triangle, the system will start generating the sequence of symbols with
statistical properties that correspond to the position of the token. These
symbols are then mapped to notes of a scale.
% ...

\subsection{Information Dynamics as Evaluative Feedback Mechanism}
%NOT SURE THIS SHOULD BE HERE AT ALL..?

\begin{fig}{mtri-results}
	\def\scat#1{\colfig[0.42]{mtri/#1}}
	\def\subj#1{\scat{scat_dwells_subj_#1} & \scat{scat_marks_subj_#1}}
	\begin{tabular}{cc}
		\subj{a} \\
		\subj{b} \\
		\subj{c} \\
		\subj{d}
	\end{tabular}
	\caption{Dwell times and mark positions from user trials with the
	on-screen Melody Triangle interface. The left-hand column shows
	the positions in a 2D information space (entropy rate vs multi-information rate
	in bits) where subjects spent their time; the area of each circle is proportional
	to the time spent there. The right-hand column shows the points which subjects
	`liked'.}
\end{fig}

Information measures on a stream of symbols can form a feedback mechanism; a
rudimentary `critic' of sorts. For instance, symbol-by-symbol measures of predictive
information rate, entropy rate and redundancy could tell us if a stream of symbols
is currently `boring', either because it is too repetitive, or because it is too
% ...
the state of the system when users, by pressing a key, indicate that they like
what they are hearing. As such, the experiments will help us identify any
correlation between the information-theoretic properties of a stream and its
perceived aesthetic worth.
Some initial results for four subjects are shown in \figrf{mtri-results}. Though
subjects seem to exhibit distinct kinds of exploratory behaviour, we have
not been able to show any systematic across-subjects preference for any particular
region of the triangle.

Subjects' comments: several noticed the main organisation of the triangle:
repetitive notes at the top, cyclic patterns along the right edge, and unpredictable
notes towards the bottom left (a,c,f). Some did systematic exploration.
Some (a,f) felt that the right side was more `controllable' than the left---a direct
consequence of their ability to return to a particular periodic pattern and recognise
it as one heard previously. Some (a,e) felt the trial was too long and became
bored towards the end.
One subject (f) felt there wasn't enough time to hear out the patterns properly.
One subject (b) didn't enjoy the lower region whereas another (d) said the lower
regions were more `melodic' and `interesting'.
%\emph{comparable system} Gordon Pask's Musicolor (1953) applied a similar notion
%of boredom in its design. The Musicolour would react to audio input through a
%microphone by flashing coloured lights. Rather than a direct mapping of sound
%to light, Pask designed the device to be a partner to a performing musician. It