comparison draft.tex @ 43:3f643e9fead0

Added Andrew's bits, added to fig 2, fixed some spellings, added some section crossrefs.
author samer
date Thu, 15 Mar 2012 15:08:46 +0000
parents 1161caf0bdda
children 244b74fb707d
conditioned on the observed past. This could be used, for example, as an estimate
of attentional resources which should be directed at this stream of data, which may
be in competition with other sensory streams.

\subsection{Information measures for stationary random processes}
\label{s:process-info}

\begin{fig}{predinfo-bg}
\newcommand\subfig[2]{\shortstack{#2\\[0.75em]#1}}
\newcommand\rad{1.8em}%
\newcommand\offs{3.6em}
\newcommand\colsep{\hspace{5em}}
\newcommand\longblob{\ovoid{\axis}}
\newcommand\shortblob{\ovoid{1.75em}}
\begin{tabular}{c@{\colsep}c}
\subfig{(a) multi-information and entropy rates}{%
\begin{tikzpicture}%[baseline=-1em]
\newcommand\rc{1.75em}
\newcommand\throw{2.5em}
\coordinate (p1) at (180:1.5em);
\coordinate (p2) at (0:0.3em);
\newcommand\bound{(-7em,-2.6em) rectangle (7em,3.0em)}
\newcommand\present{(p2) circle (\rc)}
\newcommand\thepast{(p1) ++(-\throw,0) \ovoid{\throw}}
\newcommand\fillclipped[2]{%
\begin{scope}[even odd rule]
\foreach \thing in {#2} {\clip \thing;}
\fill[black!#1] \bound;
\end{scope}%
}%
\fillclipped{30}{\present,\bound \thepast}
\fillclipped{15}{\present,\bound \thepast}
\fillclipped{45}{\present,\thepast}
\draw \thepast;
\draw \present;
\node at (barycentric cs:p2=1,p1=-0.3) {$h_\mu$};
\node at (barycentric cs:p2=1,p1=1) [shape=rectangle,fill=black!45,inner sep=1pt]{$\rho_\mu$};
\path (p2) +(90:3em) node {$X_0$};
\path (p1) +(-3em,0em) node {\shortstack{infinite\\past}};
\path (p1) +(-4em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
\end{tikzpicture}}%
\\[1.25em]
\subfig{(b) excess entropy}{%
\newcommand\blob{\longblob}
\begin{tikzpicture}
\coordinate (p1) at (-\offs,0em);
\coordinate (p2) at (\offs,0em);
\begin{scope}
\path (p1) +(-2em,\rad) node [anchor=south] {$\ldots,X_{-1}$};
\path (p2) +(2em,\rad) node [anchor=south] {$X_0,\ldots$};
\end{tikzpicture}%
}%
\\[1.25em]
\subfig{(c) predictive information rate $b_\mu$}{%
\begin{tikzpicture}%[baseline=-1em]
\newcommand\rc{2.1em}
\newcommand\throw{2.5em}
\coordinate (p1) at (210:1.5em);
\coordinate (p2) at (90:0.7em);
\begin{scope}[even odd rule]
\foreach \thing in {#2} {\clip \thing;}
\fill[black!#1] \bound;
\end{scope}%
}%
\fillclipped{80}{\future,\thepast}
\fillclipped{30}{\present,\future,\bound \thepast}
\fillclipped{15}{\present,\bound \future,\bound \thepast}
\draw \future;
\fillclipped{45}{\present,\thepast}
\draw \thepast;
variable or sequence of random variables relative to time $t=0$. Overlapped areas
correspond to various mutual informations, as in \Figrf{venn-example}.
In (c), the circle represents the `present'. Its total area is
$H(X_0)=\rho_\mu+r_\mu+b_\mu$, where $\rho_\mu$ is the multi-information
rate, $r_\mu$ is the residual entropy rate, and $b_\mu$ is the predictive
information rate. The entropy rate is $h_\mu = r_\mu+b_\mu$. The small dark
region below $X_0$ in (c) is $\sigma_\mu = E-\rho_\mu$.
}
\end{fig}

If we step back, out of the observer's shoes as it were, and consider the
random process $(\ldots,X_{-1},X_0,X_1,\ldots)$ as a statistical ensemble of
The \emph{excess entropy} $E$ is the mutual information between
the entire `past' and the entire `future':
\begin{equation}
E = I(\past{X}_t; X_t,\fut{X}_t).
\end{equation}
Both the excess entropy and the multi-information rate can be thought
of as measures of \emph{redundancy}, quantifying the extent to which
the same information is to be found in all parts of the sequence.

The \emph{predictive information rate} (or PIR) \cite{AbdallahPlumbley2009}
is the average information in one observation about the infinite future given the infinite past,
and is defined as a conditional mutual information:
\begin{equation}
b_\mu = I(X_t;\fut{X}_t|\past{X}_t),
\end{equation}
that is, the expected reduction
in uncertainty about the future on learning $X_t$, given the past.
Due to the symmetry of the mutual information, it can also be written
as
\begin{equation}
% \IXZ_t
b_\mu = H(X_t|\past{X}_t) - H(X_t|\past{X}_t,\fut{X}_t) = h_\mu - r_\mu,
% \label{<++>}
\end{equation}
% If $X$ is stationary, then
where $r_\mu = H(X_t|\fut{X}_t,\past{X}_t)$
is the \emph{residual} entropy rate \cite{AbdallahPlumbley2010}.
James et al.\ \cite{JamesEllisonCrutchfield2011} study the predictive information
rate and also examine some related measures. In particular, they identify
$\sigma_\mu$, the difference between the multi-information rate and the excess
entropy, as an interesting quantity that measures the predictive benefit of
model-building (that is, maintaining an internal state summarising past
observations in order to make better predictions).
% They also identify
% $w_\mu = \rho_\mu + b_{\mu}$, which they call the \emph{local exogenous
% information} rate.
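
In summary, the quantities introduced above are related by the identities
already stated in the text and in \Figrf{predinfo-bg}:
\begin{equation}
H(X_0) = \rho_\mu + r_\mu + b_\mu, \qquad
h_\mu = r_\mu + b_\mu, \qquad
\sigma_\mu = E - \rho_\mu.
\end{equation}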


\subsection{First and higher order Markov chains}
First-order Markov chains are the simplest non-trivial models to which information
dynamics methods can be applied. In \cite{AbdallahPlumbley2009} we derived

where $\hat{a}^{N+1}$ is the $(N+1)$th power of the first-order transition matrix.
Other information measures can also be computed for the higher-order Markov chain, including
the multi-information rate $\rho_\mu$ and the excess entropy $E$. These are identical
for first-order Markov chains, but for order-$N$ chains, $E$ can be up to $N$ times larger
than $\rho_\mu$.
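
As a concrete illustration, the following minimal numerical sketch (the helper
functions are illustrative, not the implementation of
\cite{AbdallahPlumbley2009}) computes $h_\mu$, $\rho_\mu$ and $b_\mu$ for a
first-order chain with transition matrix $a$: $h_\mu$ is the expected row
entropy under the stationary distribution, $\rho_\mu = H(X_0) - h_\mu$, and,
in line with the $\hat{a}^{N+1}$ expression above, $b_\mu$ reduces to the
entropy rate of the two-step matrix $a^2$ minus that of $a$.
\begin{verbatim}
import numpy as np

def stationary(P):
    # stationary distribution pi with pi P = pi, i.e. the eigenvector
    # of P^T whose eigenvalue is 1
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return pi / pi.sum()

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def entropy_rate(P, pi):
    # expected entropy of the next state given the current one
    return sum(pi[i] * entropy(P[i]) for i in range(len(pi)))

def markov_measures(P):
    # returns (h_mu, rho_mu, b_mu) in bits for a first-order chain P
    pi = stationary(P)
    h = entropy_rate(P, pi)
    rho = entropy(pi) - h             # multi-information rate
    b = entropy_rate(P @ P, pi) - h   # predictive information rate
    return h, rho, b

# example: a 'sticky' three-state chain
P = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
print(markov_measures(P))
\end{verbatim}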

Different kinds of Markov chain maximise different measures: $h_\mu$ is largest
for uncorrelated `white' sequences with no temporal structure, while $\rho_\mu$
and $E$ are largest for periodic sequences. The chains that maximise $b_\mu$ are
less obvious; we return to this in \secrf{composition}.


\section{Information Dynamics in Analysis}

\begin{fig}{twopages}
\end{itemize}


\subsection{Beat Tracking}

A probabilistic method for drum tracking was presented by Robertson
\cite{Robertson11c}. The algorithm is used to synchronise a music
sequencer to a live drummer. The expected beat time of the sequencer is
represented by a click track, and the algorithm takes as input event
times for discrete kick and snare drum events relative to this click
track. These are obtained using dedicated microphones for each drum and
a percussive onset detector (Puckette 1998). The drum tracker
continually updates its distributions for tempo and phase on receiving a new
event time. We can thus quantify the information contributed by an event
by measuring the difference between the system's prior and posterior
distributions using the Kullback-Leibler divergence.
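
The per-event information described here can be computed directly from the
tracker's distributions. The sketch below assumes, purely for illustration,
that the prior and posterior over tempo and phase are available as normalised
arrays on a discretised grid; the function names are illustrative rather than
part of the drum tracker's interface.
\begin{verbatim}
import numpy as np

def kl_divergence(posterior, prior, eps=1e-12):
    # KL(posterior || prior) in bits, for distributions over the same
    # discretised tempo-phase grid (each array sums to 1)
    p = np.clip(posterior, eps, None)
    q = np.clip(prior, eps, None)
    return float(np.sum(p * np.log2(p / q)))

def dist_entropy(p, eps=1e-12):
    p = np.clip(p, eps, None)
    return float(-np.sum(p * np.log2(p)))

# hypothetical usage, given (prior, posterior) pairs for each
# kick or snare event:
# info_per_event = [kl_divergence(post, pre) for pre, post in updates]
\end{verbatim}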

Here, we have calculated the KL divergence and entropy for kick and
snare events in sixteen files. The analysis of information rates can be
considered \emph{subjective}, in that it measures how the drum tracker's
probability distributions change, and these are contingent upon the
model used as well as on properties of the signal itself. We expect,
however, that following periods of increased uncertainty, such as fills
or expressive timing, the information contained in an individual event
increases. We also examine whether the information depends upon
metrical position.


\section{Information dynamics as a compositional aid}
\label{s:composition}

\begin{fig}{wundt}
\raisebox{-4em}{\colfig[0.43]{wundt}}
% {\ \shortstack{{\Large$\longrightarrow$}\\ {\scriptsize\emph{exposure}}}\ }
{\ {\large$\longrightarrow$}\ }
\raisebox{-4em}{\colfig[0.43]{wundt2}}
\caption{
The Wundt curve relating randomness/complexity to
perceived value. Repeated exposure sometimes results
in a move to the left along the curve \cite{Berlyne71}.
}
\end{fig}

In addition to applying information dynamics to analysis, it is also possible
to apply it to the generation of content, such as the composition of musical
materials. The outputs of algorithmic or stochastic processes can be filtered
to match a set of criteria defined in terms of the information dynamics model,

address notions of expectation and surprise in music, and as such the Melody
Triangle is a means of interfacing with a generative process in terms of the
predictability of its output.

The triangle is `populated' with possible parameter values for melody generators.
These are plotted in a 3D information space of $\rho_\mu$ (redundancy), $h_\mu$ (entropy rate) and
$b_\mu$ (predictive information rate), as defined in \secrf{process-info}.
In our case we generated thousands of transition matrices, representing first-order
Markov chains, by a random sampling method. In figure~\ref{InfoDynEngine} we
see how these matrices are distributed in this 3D information
space; each point corresponds to one transition matrix.
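
One simple way to populate such a space, shown here only as an illustrative
sketch rather than the exact scheme used, is to draw each row of a transition
matrix from a Dirichlet distribution and then compute the chain's
$(h_\mu, \rho_\mu, b_\mu)$ coordinates using, for example, the
\texttt{markov\_measures} sketch given earlier.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def random_transition_matrix(n, alpha):
    # each row drawn from a symmetric Dirichlet; small alpha gives
    # sparse, nearly deterministic rows, large alpha near-uniform rows
    return rng.dirichlet(alpha * np.ones(n), size=n)

# markov_measures() as defined in the earlier first-order Markov sketch;
# varying alpha spreads the sampled chains across the space
points = [markov_measures(random_transition_matrix(7, a))
          for a in 10 ** rng.uniform(-2, 1, size=5000)]
\end{verbatim}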

The transition matrices plotted in this space form an arch-shaped
distribution that is fairly thin. It is therefore a reasonable approximation
to treat it as a two-dimensional sheet, and so we stretch this curved arc out into
a flat triangle. It is this triangular sheet that is our `Melody Triangle' and
forms the interface by which the system is controlled. Using this interface
thus involves a mapping to statistical space; a user selects a position within


\section{Conclusion}

\bibliographystyle{unsrt}
{\bibliography{all,c4dm,nime,andrew}}
\end{document}