comparison draft.tex @ 46:df41539257ba
Added some graphics for the melody triangle trial.
author | samer |
---|---|
date | Thu, 15 Mar 2012 20:05:35 +0000 |
parents | 244b74fb707d |
children | 9a0d400bc827 |
44:244b74fb707d | 46:df41539257ba |
---|---|
380 both $x_t$ and the context $\past{x}_t$: | 380 both $x_t$ and the context $\past{x}_t$: |
381 \begin{equation} | 381 \begin{equation} |
382 \ell_t = - \log p(x_t|\past{x}_t). | 382 \ell_t = - \log p(x_t|\past{x}_t). |
383 \end{equation} | 383 \end{equation} |
384 However, before $X_t$ is observed to be $x_t$, the observer can compute | 384 However, before $X_t$ is observed to be $x_t$, the observer can compute |
385 its \emph{expected} surprisingness as a measure of its uncertainty about | 385 the \emph{expected} surprisingness as a measure of its uncertainty about |
386 the very next event; this may be written as an entropy | 386 the very next event; this may be written as an entropy |
387 $H(X_t|\ev(\past{X}_t = \past{x}_t))$, but note that this is | 387 $H(X_t|\ev(\past{X}_t = \past{x}_t))$, but note that this is |
388 conditional on the \emph{event} $\ev(\past{X}_t=\past{x}_t)$, not | 388 conditional on the \emph{event} $\ev(\past{X}_t=\past{x}_t)$, not |
389 \emph{variables} $\past{X}_t$ as in the conventional conditional entropy. | 389 \emph{variables} $\past{X}_t$ as in the conventional conditional entropy. |
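For a countable alphabet $\X$, this event-conditioned entropy is simply the expected surprisingness under the observer's predictive distribution; written out (a standard expansion, using the notation above):
\begin{equation}
  H(X_t|\ev(\past{X}_t=\past{x}_t)) = - \sum_{x \in \X} p(x|\past{x}_t) \log p(x|\past{x}_t).
\end{equation}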
390 | 390 |
407 \begin{equation} | 407 \begin{equation} |
408 \mathcal{I}_t = \sum_{\fut{x}_t \in \X^*} | 408 \mathcal{I}_t = \sum_{\fut{x}_t \in \X^*} |
409 p(\fut{x}_t|x_t,\past{x}_t) \log \frac{ p(\fut{x}_t|x_t,\past{x}_t) }{ p(\fut{x}_t|\past{x}_t) }, | 409 p(\fut{x}_t|x_t,\past{x}_t) \log \frac{ p(\fut{x}_t|x_t,\past{x}_t) }{ p(\fut{x}_t|\past{x}_t) }, |
410 \end{equation} | 410 \end{equation} |
411 where the sum is to be taken over the set of infinite sequences $\X^*$. | 411 where the sum is to be taken over the set of infinite sequences $\X^*$. |
412 Note that it is quite possible for an event to be surprising but not informative | |
413 in a predictive sense. | |
412 As with the surprisingness, the observer can compute its \emph{expected} IPI | 414 As with the surprisingness, the observer can compute its \emph{expected} IPI |
413 at time $t$, which reduces to a mutual information $I(X_t;\fut{X}_t|\ev(\past{X}_t=\past{x}_t))$ | 415 at time $t$, which reduces to a mutual information $I(X_t;\fut{X}_t|\ev(\past{X}_t=\past{x}_t))$ |
414 conditioned on the observed past. This could be used, for example, as an estimate | 416 conditioned on the observed past. This could be used, for example, as an estimate |
415 of attentional resources which should be directed at this stream of data, which may | 417 of attentional resources which should be directed at this stream of data, which may |
416 be in competition with other sensory streams. | 418 be in competition with other sensory streams. |
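Written out in the same way (a standard expectation; the notation $\mathcal{I}_t(x)$ is introduced here only for convenience), the expected IPI is
\begin{equation}
  I(X_t;\fut{X}_t|\ev(\past{X}_t=\past{x}_t)) = \sum_{x \in \X} p(x|\past{x}_t)\, \mathcal{I}_t(x),
\end{equation}
where $\mathcal{I}_t(x)$ denotes the sum in the equation above evaluated with $x_t$ replaced by $x$.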
542 $X_t$ given all the previous ones. | 544 $X_t$ given all the previous ones. |
543 \begin{equation} | 545 \begin{equation} |
544 \label{eq:entro-rate} | 546 \label{eq:entro-rate} |
545 h_\mu = H(X_t|\past{X}_t). | 547 h_\mu = H(X_t|\past{X}_t). |
546 \end{equation} | 548 \end{equation} |
547 The entropy rate gives a measure of the overall randomness | 549 The entropy rate gives a measure of the overall surprisingness |
548 or unpredictability of the process. | 550 or unpredictability of the process. |
549 | 551 |
550 The \emph{multi-information rate} $\rho_\mu$ (following Dubnov's \cite{Dubnov2006} | 552 The \emph{multi-information rate} $\rho_\mu$ (following Dubnov's \cite{Dubnov2006} |
551 notation for what he called the `information rate') is the mutual | 553 notation for what he called the `information rate') is the mutual |
552 information between the `past' and the `present': | 554 information between the `past' and the `present': |
568 of as measures of \emph{redundancy}, quantifying the extent to which | 570 of as measures of \emph{redundancy}, quantifying the extent to which |
569 the same information is to be found in all parts of the sequence. | 571 the same information is to be found in all parts of the sequence. |
570 | 572 |
571 | 573 |
572 The \emph{predictive information rate} (or PIR) \cite{AbdallahPlumbley2009} | 574 The \emph{predictive information rate} (or PIR) \cite{AbdallahPlumbley2009} |
573 is the average information in one observation about the infinite future given the infinite past, | 575 is the mutual information between the present and the infinite future given the infinite |
574 and is defined as a conditional mutual information: | 576 past: |
575 \begin{equation} | 577 \begin{equation} |
576 \label{eq:PIR} | 578 \label{eq:PIR} |
577 b_\mu = I(X_t;\fut{X}_t|\past{X}_t) = H(\fut{X}_t|\past{X}_t) - H(\fut{X}_t|X_t,\past{X}_t). | 579 b_\mu = I(X_t;\fut{X}_t|\past{X}_t) = H(\fut{X}_t|\past{X}_t) - H(\fut{X}_t|X_t,\past{X}_t). |
578 \end{equation} | 580 \end{equation} |
579 Equation \eqrf{PIR} can be read as the average reduction | 581 Equation \eqrf{PIR} can be read as the average reduction |
591 or \emph{erasure} \cite{VerduWeissman2006} entropy rate. | 593 or \emph{erasure} \cite{VerduWeissman2006} entropy rate. |
592 These relationships are illustrated in \Figrf{predinfo-bg}, along with | 594 These relationships are illustrated in \Figrf{predinfo-bg}, along with |
593 several of the information measures we have discussed so far. | 595 several of the information measures we have discussed so far. |
594 | 596 |
595 | 597 |
596 James et al \cite{JamesEllisonCrutchfield2011} study the predictive information | 598 James et al \cite{JamesEllisonCrutchfield2011} review several of these |
597 rate and also examine some related measures. In particular they identify the | 599 information measures and introduce some new related ones. |
598 $\sigma_\mu$, the difference between the multi-information rate and the excess | 600 In particular they identify $\sigma_\mu = I(\past{X}_t;\fut{X}_t|X_t)$, |
599 entropy, as an interesting quantity that measures the predictive benefit of | 601 the mutual information between the past and the future given the present, |
602 as an interesting quantity that measures the predictive benefit of | |
600 model-building (that is, maintaining an internal state summarising past | 603 model-building (that is, maintaining an internal state summarising past |
601 observations in order to make better predictions). | 604 observations in order to make better predictions). It is shown as the |
605 small dark region below the circle in \figrf{predinfo-bg}(c). | |
606 By comparing with \figrf{predinfo-bg}(b), we can see that | |
607 $\sigma_\mu = E - \rho_\mu$. | |
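This identity follows from the chain rule for mutual information, assuming (as the figure suggests) that $E$ here denotes the information shared between the past and the combined present and future:
\begin{equation}
  E = I(\past{X}_t; X_t,\fut{X}_t) = I(\past{X}_t;X_t) + I(\past{X}_t;\fut{X}_t|X_t) = \rho_\mu + \sigma_\mu.
\end{equation}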
602 % They also identify | 608 % They also identify |
603 % $w_\mu = \rho_\mu + b_{\mu}$, which they call the \emph{local exogenous | 609 % $w_\mu = \rho_\mu + b_{\mu}$, which they call the \emph{local exogenous |
604 % information} rate. | 610 % information} rate. |
605 | 611 |
606 | 612 |
610 expressions for all the information measures described in \secrf{surprise-info-seq} for | 616 expressions for all the information measures described in \secrf{surprise-info-seq} for |
611 irreducible stationary Markov chains (\ie that have a unique stationary | 617 irreducible stationary Markov chains (\ie that have a unique stationary |
612 distribution). The derivation is greatly simplified by the dependency structure | 618 distribution). The derivation is greatly simplified by the dependency structure |
613 of the Markov chain: for the purpose of the analysis, the `past' and `future' | 619 of the Markov chain: for the purpose of the analysis, the `past' and `future' |
614 segments $\past{X}_t$ and $\fut{X}_t$ can be collapsed to just the previous | 620 segments $\past{X}_t$ and $\fut{X}_t$ can be collapsed to just the previous |
615 and next variables $X_{t-1}$ and $X_{t-1}$ respectively. We also showed that | 621 and next variables $X_{t-1}$ and $X_{t+1}$ respectively. We also showed that |
616 the predictive information rate can be expressed simply in terms of entropy rates: | 622 the predictive information rate can be expressed simply in terms of entropy rates: |
617 if we let $a$ denote the $K\times K$ transition matrix of a Markov chain over | 623 if we let $a$ denote the $K\times K$ transition matrix of a Markov chain over |
618 an alphabet of $\{1,\ldots,K\}$, such that | 624 an alphabet of $\{1,\ldots,K\}$, such that |
619 $a_{ij} = \Pr(\ev(X_t=i)|\ev(X_{t-1}=j))$, and let $h:\reals^{K\times K}\to \reals$ be | 625 $a_{ij} = \Pr(\ev(X_t=i)|\ev(X_{t-1}=j))$, and let $h:\reals^{K\times K}\to \reals$ be |
620 the entropy rate function such that $h(a)$ is the entropy rate of a Markov chain | 626 the entropy rate function such that $h(a)$ is the entropy rate of a Markov chain |
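As a minimal illustrative sketch (not the authors' implementation), the Markov-chain quantities introduced here can be computed directly from a transition matrix. The sketch assumes the column-stochastic convention $a_{ij} = \Pr(X_t=i|X_{t-1}=j)$ used above, an irreducible aperiodic chain, and logarithms to base 2. With the past and future collapsed to $X_{t-1}$ and $X_{t+1}$ as noted above, $b_\mu = H(X_{t+1}|X_{t-1}) - H(X_{t+1}|X_t)$, which equals $h(a^2) - h(a)$ since $a^2$ is the two-step transition matrix.
\begin{verbatim}
import numpy as np

def stationary(a):
    # Stationary distribution pi with a @ pi = pi; a is column-stochastic,
    # i.e. a[i, j] = Pr(X_t = i | X_{t-1} = j).
    vals, vecs = np.linalg.eig(a)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return pi / pi.sum()

def entropy_rate(a):
    # h(a): entropy of the next-symbol distribution, averaged over states
    # visited according to the stationary distribution (in bits).
    pi = stationary(a)
    safe = np.where(a > 0, a, 1.0)          # zero entries contribute nothing
    column_entropy = -np.sum(a * np.log2(safe), axis=0)
    return float(column_entropy @ pi)

def predictive_information_rate(a):
    # b_mu = H(X_{t+1} | X_{t-1}) - H(X_{t+1} | X_t) = h(a @ a) - h(a),
    # since a @ a is the two-step transition matrix.
    return entropy_rate(a @ a) - entropy_rate(a)

# example: a 3-symbol chain (each column sums to 1)
a = np.array([[0.90, 0.10, 0.30],
              [0.05, 0.80, 0.30],
              [0.05, 0.10, 0.40]])
print(entropy_rate(a), predictive_information_rate(a))
\end{verbatim}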
703 obtained using two rule-based music segmentation algorithms, while clearly | 709 obtained using two rule-based music segmentation algorithms, while clearly |
704 \emph{reflecting} the structure of the piece, do not \emph{segment} the piece, | 710 \emph{reflecting} the structure of the piece, do not \emph{segment} the piece, |
705 with no tendency to peaking of the boundary strength function at | 711 with no tendency to peaking of the boundary strength function at |
706 the boundaries in the piece. | 712 the boundaries in the piece. |
707 | 713 |
714 The complete analysis of \emph{Gradus} can be found in \cite{AbdallahPlumbley2009}, | |
715 but \figrf{metre} illustrates the result of a metrical analysis: the piece was divided | |
716 into bars of 32, 64 and 128 notes. In each case, the average surprisingness and | |
717 IPI for the first, second, third \etc notes in each bar were computed. The plots | |
718 show that the first note of each bar is, on average, significantly more surprising | |
719 and informative than the others, up to the 64-note level, whereas at the 128-note | 
720 level, the dominant periodicity appears to remain at 64 notes. | |
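A small sketch (not the authors' code) of the averaging step described here: given a sequence of per-note surprisingness or IPI values, fold it into hypothetical bars of a chosen length and average over bars at each position.
\begin{verbatim}
import numpy as np

def per_position_means(values, bar_length):
    # values: 1-D array of per-note surprisingness (or IPI) values;
    # returns the mean value at each position 1..bar_length within a bar.
    n = (len(values) // bar_length) * bar_length   # drop any incomplete final bar
    return np.asarray(values)[:n].reshape(-1, bar_length).mean(axis=0)

# e.g. per_position_means(surprisal, 64) for the 64-note analysis
\end{verbatim}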
708 | 721 |
709 \begin{fig}{metre} | 722 \begin{fig}{metre} |
710 % \scalebox{1}[1]{% | 723 % \scalebox{1}[1]{% |
711 \begin{tabular}{cc} | 724 \begin{tabular}{cc} |
712 \colfig[0.45]{matbase/fig36859} & \colfig[0.48]{matbase/fig88658} \\ | 725 \colfig[0.45]{matbase/fig36859} & \colfig[0.48]{matbase/fig88658} \\ |
725 informative of notes at different periodicities (\ie hypothetical | 738 informative of notes at different periodicities (\ie hypothetical |
726 bar lengths) and phases (\ie positions within a bar). | 739 bar lengths) and phases (\ie positions within a bar). |
727 } | 740 } |
728 \end{fig} | 741 \end{fig} |
729 | 742 |
730 \subsection{Content analysis/Sound Categorisation}. | 743 \subsection{Content analysis/Sound Categorisation} |
731 Using analogous definitions of differential entropy, the methods outlined | 744 Using analogous definitions of differential entropy, the methods outlined |
732 in the previous section are equally applicable to continuous random variables. | 745 in the previous section are equally applicable to continuous random variables. |
733 In the case of music, where expressive properties such as dynamics, tempo, | 746 In the case of music, where expressive properties such as dynamics, tempo, |
734 timing and timbre are readily quantified on a continuous scale, the information | 747 timing and timbre are readily quantified on a continuous scale, the information |
735 dynamic framework may thus also be applied. | 748 dynamic framework may thus also be applied. |
836 The triangle is `populated' with possible parameter values for melody generators. | 849 The triangle is `populated' with possible parameter values for melody generators. |
837 These are plotted in a 3D information space of $\rho_\mu$ (redundancy), $h_\mu$ (entropy rate) and | 850 These are plotted in a 3D information space of $\rho_\mu$ (redundancy), $h_\mu$ (entropy rate) and |
838 $b_\mu$ (predictive information rate), as defined in \secrf{process-info}. | 851 $b_\mu$ (predictive information rate), as defined in \secrf{process-info}. |
839 In our case we generated thousands of transition matrices, representing first-order | 852 In our case we generated thousands of transition matrices, representing first-order |
840 Markov chains, by a random sampling method. In figure \ref{InfoDynEngine} we | 853 Markov chains, by a random sampling method. In figure \ref{InfoDynEngine} we |
841 see a representation of how these matrices are distributed in the 3d statistical | 854 see a representation of how these matrices are distributed in the 3D information |
842 space; each one of these points corresponds to a transition matrix. | 855 space; each one of these points corresponds to a transition matrix. |
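One way to carry out such a sampling (an illustrative assumption; the precise sampling scheme is not specified here) is to draw each column of the transition matrix from a Dirichlet distribution and compute the three coordinates, reusing the functions from the earlier Markov-chain sketch. For a first-order chain the multi-information rate reduces to $H(\pi) - h(a)$, with $\pi$ the stationary distribution.
\begin{verbatim}
import numpy as np
# uses stationary(), entropy_rate() and predictive_information_rate()
# from the earlier Markov-chain sketch

def random_transition_matrix(k, concentration=0.5, seed=None):
    # Each column is an independent Dirichlet draw over k symbols,
    # giving a column-stochastic first-order Markov chain.
    rng = np.random.default_rng(seed)
    return rng.dirichlet([concentration] * k, size=k).T

def info_coordinates(a):
    # (rho_mu, h_mu, b_mu): redundancy, entropy rate, predictive information rate.
    pi = stationary(a)
    safe = np.where(pi > 0, pi, 1.0)
    h_pi = -float(np.sum(pi * np.log2(safe)))   # entropy of the stationary distribution
    h = entropy_rate(a)
    return h_pi - h, h, predictive_information_rate(a)

# populate the space: a few thousand sampled chains over (say) 7 symbols
points = [info_coordinates(random_transition_matrix(7)) for _ in range(5000)]
\end{verbatim}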
843 | 856 |
844 The distribution of transition matrices plotted in this space forms an arch shape | 857 The distribution of transition matrices plotted in this space forms an arch shape |
845 that is fairly thin. It thus becomes a reasonable approximation to treat it as | 858 that is fairly thin. It thus becomes a reasonable approximation to treat it as |
846 just a two-dimensional sheet, and so we stretch out this curved arc into | 859 just a two-dimensional sheet, and so we stretch out this curved arc into |
847 a flat triangle. It is this triangular sheet that is our `Melody Triangle' and | 860 a flat triangle. It is this triangular sheet that is our `Melody Triangle' and |
848 forms the interface by which the system is controlled. Using this interface | 861 forms the interface by which the system is controlled. Using this interface |
849 thus involves a mapping to statistical space; a user selects a position within | 862 thus involves a mapping to information space; a user selects a position within |
850 the triangle, and a corresponding transition matrix is returned. Figure | 863 the triangle, and a corresponding transition matrix is returned. Figure |
851 \ref{TheTriangle} shows how the triangle maps to different measures of redundancy, | 864 \ref{TheTriangle} shows how the triangle maps to different measures of redundancy, |
852 entropy rate and predictive information rate. | 865 entropy rate and predictive information rate. |
853 | 866 |
854 | 867 |
879 explore expectation and surprise in music. Additionally different gestures could | 892 explore expectation and surprise in music. Additionally different gestures could |
880 be detected to change the tempo, register, instrumentation and periodicity of | 893 be detected to change the tempo, register, instrumentation and periodicity of |
881 the output melody. | 894 the output melody. |
882 | 895 |
883 As a screen-based interface the Melody Triangle can serve as a composition tool. | 896 As a screen-based interface the Melody Triangle can serve as a composition tool. |
884 A triangle is drawn on the screen, with screen space thus mapped to the statistical | 897 A triangle is drawn on the screen, with screen space thus mapped to the information |
885 space of the Melody Triangle. A number of round tokens, each representing a | 898 space of the Melody Triangle. A number of round tokens, each representing a |
886 melody, can be dragged in and around the triangle. When a token is dragged into | 899 melody, can be dragged in and around the triangle. When a token is dragged into |
887 the triangle, the system will start generating the sequence of symbols with | 900 the triangle, the system will start generating the sequence of symbols with |
888 statistical properties that correspond to the position of the token. These | 901 statistical properties that correspond to the position of the token. These |
889 symbols are then mapped to notes of a scale. | 902 symbols are then mapped to notes of a scale. |
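A hypothetical sketch of these last two steps (the scale, alphabet size and sampling loop here are illustrative assumptions, not the actual implementation): sample a symbol sequence from the chosen transition matrix and map each symbol onto a scale degree.
\begin{verbatim}
import numpy as np

SCALE = [60, 62, 64, 65, 67, 69, 71]   # hypothetical mapping: C major, MIDI numbers

def generate_notes(a, length, seed=None):
    # a is column-stochastic: a[:, j] is the next-symbol distribution from symbol j.
    rng = np.random.default_rng(seed)
    k = a.shape[0]
    state = rng.integers(k)
    notes = []
    for _ in range(length):
        state = rng.choice(k, p=a[:, state])
        notes.append(SCALE[state % len(SCALE)])
    return notes
\end{verbatim}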
898 | 911 |
899 | 912 |
900 \subsection{Information Dynamics as Evaluative Feedback Mechanism} | 913 \subsection{Information Dynamics as Evaluative Feedback Mechanism} |
901 %NOT SURE THIS SHOULD BE HERE AT ALL..? | 914 %NOT SURE THIS SHOULD BE HERE AT ALL..? |
902 | 915 |
916 \begin{fig}{mtri-results} | |
917 \def\scat#1{\colfig[0.42]{mtri/#1}} | |
918 \def\subj#1{\scat{scat_dwells_subj_#1} & \scat{scat_marks_subj_#1}} | |
919 \begin{tabular}{cc} | |
920 \subj{a} \\ | |
921 \subj{b} \\ | |
922 \subj{c} \\ | |
923 \subj{d} | |
924 \end{tabular} | |
925 \caption{Dwell times and mark positions from user trials with the | |
926 on-screen Melody Triangle interface. The left-hand column shows | |
927 the positions in a 2D information space (entropy rate vs multi-information rate | |
928 in bits) where subjects spent their time; the area of each circle is proportional | 
929 to the time spent there. The right-hand column shows the points that subjects | 
930 `liked'.} | |
931 \end{fig} | |
903 | 932 |
904 Information measures on a stream of symbols can form a feedback mechanism; a | 933 Information measures on a stream of symbols can form a feedback mechanism; a |
905 rudimentary `critic' of sorts. For instance, symbol-by-symbol measures of predictive | 934 rudimentary `critic' of sorts. For instance, symbol-by-symbol measures of predictive |
906 information rate, entropy rate and redundancy could tell us if a stream of symbols | 935 information rate, entropy rate and redundancy could tell us if a stream of symbols |
907 is currently `boring', either because it is too repetitive, or because it is too | 936 is currently `boring', either because it is too repetitive, or because it is too |
921 the state of the system when users, by pressing a key, indicate that they like | 950 the state of the system when users, by pressing a key, indicate that they like |
922 what they are hearing. As such the experiments will help us identify any | 951 what they are hearing. As such the experiments will help us identify any |
923 correlation between the information theoretic properties of a stream and its | 952 correlation between the information theoretic properties of a stream and its |
924 perceived aesthetic worth. | 953 perceived aesthetic worth. |
925 | 954 |
955 Some initial results for four subjects are shown in \figrf{mtri-results}. Though | |
956 subjects seem to exhibit distinct kinds of exploratory behaviour, we have | |
957 not been able to show any systematic across-subjects preference for any particular | |
958 region of the triangle. | |
959 | |
960 Subjects' comments: several noticed the main organisation of the triangle: | |
961 repetitive notes at the top, cyclic patterns along the right edge, and unpredictable | 
962 notes towards the bottom left (a,c,f). Some did systematic exploration. | |
963 Some (a,f) felt that the right side was more `controllable' than the left, a direct consequence | 
964 of their ability to return to a particular periodic pattern and recognise it | 
965 as one heard previously. Some (a,e) felt the trial was too long and became | |
966 bored towards the end. | |
967 One subject (f) felt there wasn't enough time to get to hear out the patterns properly. | |
968 One subject (b) didn't enjoy the lower region whereas another (d) said the lower | |
969 regions were more `melodic' and `interesting'. | |
926 | 970 |
927 %\emph{comparable system} Gordon Pask's Musicolor (1953) applied a similar notion | 971 %\emph{comparable system} Gordon Pask's Musicolor (1953) applied a similar notion |
928 %of boredom in its design. The Musicolour would react to audio input through a | 972 %of boredom in its design. The Musicolour would react to audio input through a |
929 %microphone by flashing coloured lights. Rather than a direct mapping of sound | 973 %microphone by flashing coloured lights. Rather than a direct mapping of sound |
930 %to light, Pask designed the device to be a partner to a performing musician. It | 974 %to light, Pask designed the device to be a partner to a performing musician. It |