annotate docs/spec/spec.tex @ 605:d7eb0a7440ad

Tidying and POST enhancement * Ensured that the storage module is closed/freed when errors occur * Now reads in the query via POST if provided (allows for other clients)
author mas01mj
date Tue, 18 Aug 2009 14:23:32 +0000
parents dfeb5ef768da
children 365d3c500ef3
rev   line source
mas02md@547 1 \documentclass[11pt]{article}
mas02md@547 2 \usepackage{oz,times}
mas02md@547 3
mas02md@547 4 \usepackage{tikz}
mas02md@547 5 \usetikzlibrary{arrows}
mas02md@547 6 \usetikzlibrary{calc}
mas02md@547 7
mas02md@547 8 \title{A Formal Framework for Content-based Retrieval}
mas02md@547 9 \author{Mark d'Inverno and Christophe Rhodes}
mas02md@547 10
mas02md@547 11
mas02md@547 12 \begin{document}
mas02md@547 13 \maketitle
mas02md@547 14
mas02md@547 15 %%\begin{gendef}[X]
mas02md@547 16 %% concat : (\seq (\seq X)) \fun (\seq X)
mas02md@547 17 %%\where
mas02md@547 18 %% \forall xs : \seq X; xss : \seq (\seq X) @ \\
mas02md@547 19 %%\t2 concat~ ((\langle xs \rangle) \cat xss) = xs \cat
mas02md@547 20 %%(concat~xss)
mas02md@547 21 %%\end{gendef}
mas02md@547 22
mas02md@547 23 %% \begin{gendef}[X]
mas02md@547 24 %% nfront : \nat \fun (\seq X) \fun \seq (\seq X) \\
mas02md@547 25 %% \where
mas02md@547 26 %% \forall n : \nat ; xs : \seq X @ \\
mas02md@547 27 %% n > \# xs \implies nfront ~n~xs = \langle \rangle \land \\
mas02md@547 28 %% n \leq \# xs \implies nfront~n~xs = \\
mas02md@547 29 %% \t2 (\langle (0 \upto n-1) \dres xs \rangle) \cat (nfront~n~(tail~xs))
mas02md@547 30 %% \end{gendef}
mas02md@547 31
mas02md@547 32 \section{The System Instance}
mas02md@547 33
mas02md@562 34 \newcommand{\LET}{\mathrel{\sf Let}}
mas02md@562 35
mas02md@547 36 \newcommand{\mylet}{\methrel{\sf Let}}
mas02md@547 37 \newcommand{\FV}{\mathrel{~FV}}
mas02md@547 38 \newcommand{\V}{\mathrel{~FV^{d}}}
mas02md@547 39 \newcommand{\U}{\mathrel{~FV^{1}}}
mas02md@547 40 \newcommand{\R}{\mathrel{~R}}
mas02md@547 41
mas02md@547 42 \newcommand{\mylog}{\mathrel{ln}}
mas02md@547 43
mas02md@547 44 \begin{enumerate}
mas02md@547 45
mas02md@547 46 \item \textsf{Track} - original media.
mas02md@547 47
mas02md@547 48 We define the set of all such tracks.
mas02md@547 49
mas02md@547 50 \begin{flushright}
mas02md@547 51 \begin{tikzpicture}
mas02md@547 52 \input track.tikz
mas02md@547 53 \end{tikzpicture}
mas02md@547 54 \end{flushright}
mas02md@547 55
mas02md@547 56 \begin{zed}
mas02md@547 57 [Track]
mas02md@547 58 \end{zed}
mas02md@547 59
mas02md@547 60 %% \begin{zed}
mas02md@547 61 %%\R == \nat
mas02md@547 62 %% \end{zed}
mas02md@547 63
mas02md@547 64 %% \begin{axdef}
mas02md@547 65 %% \mylog : \R \fun \R
mas02md@547 66 %% \end{axdef}
mas02md@547 67
mas02md@547 68 % For every track it is possible to determine the lenth.
mas02md@547 69
mas02md@547 70 % \begin{axdef}
mas02md@547 71 % lenth : Track \fun Time
mas02md@547 72 % \end{axdef}
mas02md@547 73
mas02md@547 74 % For every track it is possible to determine the lenth.
mas02md@547 75
mas02md@547 76 % \begin{axdef}
mas02md@547 77 % lenth : Track \fun Time
mas02md@547 78 % \end{axdef}
mas02md@547 79
mas02md@547 80 \item \textsf{Catalogue} - a non-empty list of tracks.
mas02md@547 81
mas02md@547 82 % A catalogue is typically generated for a specific \emph{use case}.
mas02md@547 83
mas02md@547 84 \begin{flushright}
mas02md@547 85 \begin{tikzpicture}
mas02md@547 86 \draw (-3,-3) rectangle (3,3);
mas02md@547 87 \foreach \y/\xscale in {2/1,1/1.02,0/0.87,-1/0.95} {
mas02md@547 88 \begin{scope}[yshift=\y cm]
mas02md@547 89 \begin{scope}[xscale=\xscale]
mas02md@547 90 \input track.tikz
mas02md@547 91 \end{scope}
mas02md@547 92 \end{scope}
mas02md@547 93 }
mas02md@547 94 \draw node at (0,-2) {\vdots};
mas02md@547 95 \end{tikzpicture}
mas02md@547 96 \end{flushright}
mas02md@547 97
mas02md@547 98 \begin{zed}
mas02md@547 99 Catalogue == \seq_1 Track
mas02md@547 100 \end{zed}
mas02md@547 101
mas02md@547 102 \item \textsf{Collection} - a set of catalogues where each track in each catalogue is associated with a unique name.
mas02md@547 103
mas02md@547 104 In a collection, each track has a unique \textsf{TrackName}. We define the injective function $name$ to do this.
mas02md@547 105
mas02md@547 106 \begin{zed}
mas02md@547 107 [TrackName]
mas02md@547 108 \end{zed}
mas02md@547 109
mas02md@547 110 \begin{schema}{Collection}
mas02md@547 111 collection : \power Catalogue \\
mas02md@547 112 tracks : \power Track \\
mas02md@547 113 name : Track \inj TrackName
mas02md@547 114 \where
mas02md@547 115 tracks = \{ cat : collection; t : Track | t \in (\ran cat) @ t \} \\
mas02md@547 116 \dom name = tracks
mas02md@547 117 \end{schema}
mas02md@547 118
mas02md@547 119 \item \textsf{Time} - the set of non-negative reals.
mas02md@547 120
mas02md@547 121
mas02md@547 122 \begin{zed}
mas02md@547 123 Time == \R
mas02md@547 124 \end{zed}
mas02md@547 125
mas02md@547 126
mas02md@547 127 \item \textsf{Interval} - defined as a continuous set of real numbers, represented as an ordered pair of reals with the second of the pair strictly greater.
mas02md@547 128
mas02md@547 129 \begin{flushright}
mas02md@547 130 \begin{tikzpicture}[>=stealth]
mas02md@547 131 \draw (0,0.8) rectangle (0.4,1);
mas02md@547 132 \fill (0.2,0.9) ellipse (0.05);
mas02md@547 133 \draw[->] (0.2,0.9) -- (0.1,0.2);
mas02md@547 134 \draw[->] (0.2,0.9) -- (0.55,0.2);
mas02md@547 135 \draw[very thick] (0.1,0.2) -- (0.55,0.2);
mas02md@547 136 \draw[o->] (-0.5,0.2) -- (1,0.2) node[anchor=north west] {\small $\mathbb{R}$};
mas02md@547 137 \end{tikzpicture}
mas02md@547 138 \end{flushright}
mas02md@547 139
mas02md@547 140 \begin{zed}
mas02md@547 141 Interval == \{ t_1, t_2 : Time | t_2 > t_1 @ (t_1, t_2) \}
mas02md@547 142 \end{zed}
mas02md@547 143
mas02md@547 144 \item \textsf{Interval Index} - list of intervals such that the start of each successive interval strictly increases. Intervals may be overlapping.
mas02md@547 145
mas02md@547 146 The predicate below states that for any consecutive intervals in an interval index $i_j$ and $i_{j+1}$, that $i_j$ starts before $i_{j+1}$.
mas02md@547 147
mas02md@547 148 \begin{flushright}
mas02md@547 149 \begin{tikzpicture}[>=stealth]
mas02md@547 150 \draw (0,0.8) rectangle (1.6,1);
mas02md@547 151 \draw (0.4,0.8) -- (0.4,1);
mas02md@547 152 \draw (0.8,0.8) -- (0.8,1);
mas02md@547 153 \draw (1.2,0.8) -- (1.2,1);
mas02md@547 154 \fill (0.2,0.9) ellipse (0.05);
mas02md@547 155 \fill (0.6,0.9) ellipse (0.05);
mas02md@547 156 \fill (1.0,0.9) ellipse (0.05);
mas02md@547 157 \fill (1.4,0.9) ellipse (0.05);
mas02md@547 158 \draw[->] (0.2,0.9) -- (0.1,0.2);
mas02md@547 159 \draw[->] (0.2,0.9) -- (0.55,0.2);
mas02md@547 160 \draw[->] (0.6,0.9) -- (0.55,0.2);
mas02md@547 161 \draw[->] (0.6,0.9) -- (0.7,0.2);
mas02md@547 162 \draw[->] (1.0,0.9) -- (0.9,0.2);
mas02md@547 163 \draw[->] (1.0,0.9) -- (1.3,0.2);
mas02md@547 164 \draw[->] (1.4,0.9) -- (1.2,0.2);
mas02md@547 165 \draw[->] (1.4,0.9) -- (1.5,0.2);
mas02md@547 166 \draw[o->] (-0.5,0.2) -- (2,0.2) node[anchor=north west] {\small $\mathbb{R}$};
mas02md@547 167 \end{tikzpicture}
mas02md@547 168 \end{flushright}
mas02md@547 169
mas02md@547 170 \begin{zed}
mas02md@547 171 IntervalIndex == \iseq Interval \\ \\ \\
mas02md@547 172 \end{zed}
mas02md@547 173
mas02md@547 174 \[ \forall i : IntervalIndex; i_j, i_{j+1} : Interval @ \\
mas02md@547 175 \t3 \langle i_j, i_{j+1} \rangle \inseq i \implies first~ i_j \leq first ~ i_{j+1}
mas02md@547 176 \]
mas02md@547 177
mas02md@547 178 \item \textsf{Continuous Interval Index - CII} - there are no consecutive intervals $i_j$ and $i_{j+1}$ in the index where the second element of $i_j$ is strictly less than the first element of $i_{j+1}$.
mas02md@547 179
mas02md@547 180 \begin{flushright}
mas02md@547 181 \begin{tikzpicture}[>=stealth]
mas02md@547 182 \draw (0,0.8) rectangle (1.6,1);
mas02md@547 183 \draw (0.4,0.8) -- (0.4,1);
mas02md@547 184 \draw (0.8,0.8) -- (0.8,1);
mas02md@547 185 \draw (1.2,0.8) -- (1.2,1);
mas02md@547 186 \fill (0.2,0.9) ellipse (0.05);
mas02md@547 187 \fill (0.6,0.9) ellipse (0.05);
mas02md@547 188 \fill (1.0,0.9) ellipse (0.05);
mas02md@547 189 \fill (1.4,0.9) ellipse (0.05);
mas02md@547 190 \draw[->] (0.2,0.9) -- (0.1,0.2);
mas02md@547 191 \draw[->] (0.2,0.9) -- (0.55,0.2);
mas02md@547 192 \draw[->] (0.6,0.9) -- (0.55,0.2);
mas02md@547 193 \draw[->] (0.6,0.9) -- (0.9,0.2);
mas02md@547 194 \draw[->] (1.0,0.9) -- (0.7,0.2);
mas02md@547 195 \draw[->] (1.0,0.9) -- (1.3,0.2);
mas02md@547 196 \draw[->] (1.4,0.9) -- (1.2,0.2);
mas02md@547 197 \draw[->] (1.4,0.9) -- (1.5,0.2);
mas02md@547 198 \draw[o->] (-0.5,0.2) -- (2,0.2) node[anchor=north west] {\small $\mathbb{R}$};
mas02md@547 199 \end{tikzpicture}
mas02md@547 200 \end{flushright}
mas02md@547 201
mas02md@547 202 %%unchecked
mas02md@547 203 \begin{zed}
mas02md@562 204 ContIntIndex == \\
mas02md@562 205 \t1 \{ ii : IntervalIndex; i_j, i_{j+1}: Interval | \\
mas02md@562 206 \t3 \langle i_j, i_{j+1} \rangle \inseq ii \implies second ~ i_j \geq first ~ i_{j+1}@ ii \}
mas02md@562 207 \end{zed}
mas02md@547 208
mas02md@562 209 % In fact this enables us to specify such a data structure simply as a sequence of increasing times as we can assume the track starts at 0.
mas02md@562 210
mas02md@562 211 % \begin{zed}
mas02md@562 212 % ContIntIndex == \iseq Time \\ \\ \\
mas02md@562 213 % \end{zed}
mas02md@562 214
mas02md@562 215 %% \begin{zed}
mas02md@562 216 %% ContIntIndex == IntervalIndex
mas02md@562 217 %% \end{zed}
mas02md@547 218
mas02md@547 219 \item A \textsf{Duration} is an amount of time.
mas02md@547 220
mas02md@547 221 \begin{zed}
mas02md@547 222 Duration == Time
mas02md@547 223 \end{zed}
mas02md@547 224
mas02md@547 225 \item The \textsf{Track Duration} is the duration of a track. It is the difference between the \emph{end} time and \emph{start} time of the track.
mas02md@547 226
mas02md@547 227 \begin{axdef}
mas02md@547 228 trackduration : Track \fun Duration
mas02md@547 229 \end{axdef}
mas02md@547 230
mas02md@547 231 \item \textsf{Interval Duration} is the duration of an interval. It is calculated by subtracting the first element of the interval from the second element.
mas02md@547 232
mas02md@547 233 \begin{axdef}
mas02md@547 234 intervalduration : Interval \fun Duration
mas02md@547 235 \where
mas02md@547 236 \forall i : Interval @ intervalduration ~ i = second ~ i - first ~ i
mas02md@547 237 \end{axdef}
mas02md@547 238
mas02md@547 239
mas02md@562 240 \item \textsf{Index Duration} is the duration of a Continuous Interval Index which is simply the second element of the last interval pair.
mas02md@547 241
mas02md@547 242 \begin{axdef}
mas02md@562 243 indexduration : ContIntIndex \fun Time
mas02md@547 244 \where
mas02md@562 245 \forall cii : ContIntIndex @ indexduration~cii = first (last~cii)
mas02md@547 246 \end{axdef}
mas02md@547 247
mas02md@547 248 \item \textsf{Segmenter} - a process which computes an interval index for any track, represented as a function which maps any track to an interval index.
mas02md@547 249
mas02md@547 250 \begin{flushright}
mas02md@547 251 \begin{tikzpicture}
mas02md@547 252 \input interval-index.tikz
mas02md@547 253 \input segmented-track.tikz
mas02md@547 254 \end{tikzpicture}
mas02md@547 255 \end{flushright}
mas02md@547 256
mas02md@562 257
mas02md@562 258 In this specification we will consider segmenters that produce a continuous interval index.
mas02md@562 259
mas02md@547 260 \begin{zed}
mas02md@562 261 Segmenter == Track \fun ContIntIndex \\
mas02md@547 262 \end{zed}
mas02md@547 263
mas02md@547 264 \item The \textsf{Length} of an interval index is the number of intervals contained within it.
mas02md@547 265
mas02md@547 266 \begin{axdef}
mas02md@547 267 length : IntervalIndex \fun \nat
mas02md@547 268 \where
mas02md@547 269 \forall ii : IntervalIndex @ length~ii = \# ii
mas02md@547 270 \end{axdef}
mas02md@547 271
mas02md@547 272 % \section{Features}
mas02md@547 273
mas02md@547 274 \item \textsf{Feature Name} - a name describing a feature.
mas02md@547 275
mas02md@547 276 \begin{zed}
mas02md@547 277 [FeatureName]
mas02md@547 278 \end{zed}
mas02md@547 279
mas02md@547 280 \item \textsf{Dimension} - a natural number.
mas02md@547 281
mas02md@547 282 \begin{zed}
mas02md@547 283 Dimension == \nat
mas02md@547 284 \end{zed}
mas02md@547 285
mas02md@547 286 \item \textsf{Feature Kind} - a feature name and associated set of parameters.
mas02md@547 287
mas02md@547 288 \begin{zed}
mas02md@547 289 [Const, Var]
mas02md@547 290 \end{zed}
mas02md@547 291
mas02md@547 292 We define the binding type as the set of functions between variables and constants.
mas02md@547 293
mas02md@547 294 \begin{zed}
mas02md@547 295 Binding == \power (Var \cross Const)
mas02md@547 296 \end{zed}
mas02md@547 297
mas02md@547 298 Every feature kind has a dimension (not necessarily fixed).
mas02md@547 299
mas02md@547 300 \begin{schema}{FeatureKind}
mas02md@547 301 name : FeatureName \\
mas02md@547 302 parameters : \power Var \\
mas02md@547 303 dparameters : \power Var \\
mas02md@547 304 binding : Binding \\
mas02md@547 305 ddimension : Binding \pfun \nat \\
mas02md@547 306 fdimension : Dimension \\
mas02md@547 307 \where
mas02md@547 308 dparameters \subseteq parameters \\
mas02md@547 309 \forall b : Binding | b \in (\dom ddimension) @ dparameters \subseteq (\dom b)
mas02md@547 310 \end{schema}
mas02md@547 311
mas02md@547 312 \item A \textsf{Feature} - sometimes referred to as a Fully Qualified Feature - is a feature kind with all parameters bound.
mas02md@547 313
mas02md@547 314 \item \textsf{Feature Dimension} - every feature has a fixed dimension.
mas02md@547 315
mas02md@547 316 \begin{schema}{Feature}
mas02md@547 317 FeatureKind \\
mas02md@547 318 \where
mas02md@547 319 \dom binding = parameters \\
mas02md@547 320 fdimension = ddimension~binding
mas02md@547 321 \end{schema}
mas02md@547 322
mas02md@547 323 \item \textsf{Unit Feature} - any feature with a dimension of 1.
mas02md@547 324
mas02md@547 325 \begin{schema}{UnitFeature}
mas02md@547 326 Feature \\
mas02md@547 327 \where
mas02md@547 328 fdimension = 1
mas02md@547 329 \end{schema}
mas02md@547 330
mas02md@547 331 \item \textsf{Feature Vector} - a non-empty sequence of real numbers.
mas02md@547 332
mas02md@547 333 \begin{zed}
mas02md@547 334 \FV == \seq_1 \R
mas02md@547 335 \end{zed}
mas02md@547 336
mas02md@547 337 % DEFINE ALL FEATURE VECTORS HERE
mas02md@547 338
mas02md@547 339 %% \begin{zed}
mas02md@547 340 %% \V == \FV \\
mas02md@547 341 %% \U == \FV
mas02md@547 342 %% \end{zed}
mas02md@547 343
mas02md@547 344 \newcommand{\Vdsl}{\mathrel{~FV^{d * sl}}}
mas02md@547 345
mas02md@547 346 %% \begin{zed}
mas02md@547 347 %% \Vdsl == \FV
mas02md@547 348 %% \end{zed}
mas02md@547 349
mas02md@547 350 \newcommand{\Vd}{\mathrel{~FV^{d}}}
mas02md@547 351
mas02md@547 352 %% \begin{zed}
mas02md@547 353 %% \Vd == \FV
mas02md@547 354 %% \end{zed}
mas02md@547 355
mas02md@547 356 \newcommand{\Z}{\mathrel{~FV^{1}}}
mas02md@547 357
mas02md@547 358 %% \begin{zed}
mas02md@547 359 %% \Z == \FV
mas02md@547 360 %% \end{zed}
mas02md@547 361
mas02md@547 362 \newcommand{\Vsl}{\mathrel{~FV^{sl}}}
mas02md@547 363
mas02md@547 364 %% \begin{zed}
mas02md@547 365 %% \Vsl == \FV
mas02md@547 366 %% \end{zed}
mas02md@547 367
mas02md@547 368
mas02md@547 369 \item \textsf{Feature Vector Dimension} - the length of a feature vector.
mas02md@547 370
mas02md@547 371 \begin{axdef}
mas02md@547 372 dimension : \FV \fun \nat
mas02md@547 373 \where
mas02md@547 374 \forall fv : \FV @ dimension~fv = \# fv
mas02md@547 375 \end{axdef}
mas02md@547 376
mas02md@547 377 \item \textsf{Unit Vector} - any feature vector with a dimension of 1.
mas02md@547 378
mas02md@547 379 \begin{zed}
mas02md@547 380 UnitVector == \{ fv : \FV | dimension~fv = 1 \}
mas02md@547 381 \end{zed}
mas02md@547 382
mas02md@547 383 \item \textsf{Extractor} - a process which computes a feature vector of fixed dimension for any interval of any track, represented formally as a function that takes a track and an interval and returns a feature vector.
mas02md@547 384
mas02md@547 385 \begin{zed}
mas02md@547 386 Extractor == Track \fun Interval \fun \FV
mas02md@547 387 \end{zed}
mas02md@547 388
mas02md@547 389 \item \textsf{Unit Extractor} - a subclass of Extractor where the feature vector dimension is equal to 1.
mas02md@547 390
mas02md@547 391 \begin{zed}
mas02md@547 392 UnitExtractor == \\
mas02md@547 393 \t1 \{ e : Extractor | \forall t : Track; i : Interval @ dimension (e~t~i) = 1 \}
mas02md@547 394 \end{zed}
mas02md@547 395
mas02md@547 396 \item \textsf{Bel Unit Extractor} - a subclass of Unit Extractor where the function outputs on the bel scale (between minus infinity and zero).
mas02md@547 397
mas02md@547 398 \begin{zed}
mas02md@547 399 BelUnitExtractor == UnitExtractor
mas02md@547 400 \end{zed}
mas02md@547 401
mas02md@547 402 \item \textsf{Feature Extractors} - the set of all extractors associated with a feature.
mas02md@547 403
mas02md@547 404 As this data structure can be updated over time - by adding and removing new features, and adding and removing extractors for a feature as new ones arise and old ones are not used - we use a schema. The predicate states that the feature vector dimension computed by an extractor must be equal to the associated feature dimension. Every feature has a unique associated unit feature and every extractor has a unique associated unit extractor. Once a user defines a feature, and one its associated extractors, the unit feature and extractor are generated automatically.
mas02md@547 405
mas02md@547 406 \begin{schema}{FeatureExtractors}
mas02md@547 407 featureextractors : Feature \pfun (\power Extractor) \\
mas02md@547 408 features : \power Feature \\
mas02md@547 409 extractors : \power Extractor \\
mas02md@547 410 \where
mas02md@547 411 \dom featureextractors = features \\
mas02md@547 412 \bigcup (\ran featureextractors) = extractors \\
mas02md@547 413 \forall f : Feature ; e : Extractor ; t : Track; i : Interval \\
mas02md@547 414 \t1 @ e \in featureextractors(f) \implies \\
mas02md@547 415 \t2 f.fdimension = dimension (e~ t ~ i) \\
mas02md@547 416 \end{schema}
mas02md@547 417
mas02md@547 418 The range of the real contained in the bel unit vector is less than 0.
mas02md@547 419
mas02md@547 420 \begin{zed}
mas02md@547 421 \forall t : Track; int : Interval; ux : BelUnitExtractor @ \\
mas02md@547 422 \t5 head (ux~t~int) \leq 0
mas02md@547 423 \end{zed}
mas02md@547 424
mas02md@547 425 \item \textsf{Track Feature Vectors} - given a segmenter and an extractor the sequence of feature vectors for each interval of the track
mas02md@547 426
mas02md@547 427 \begin{flushright}
mas02md@547 428 \begin{tikzpicture}
mas02md@547 429 \input track-feature-vectors.tikz
mas02md@547 430 \end{tikzpicture}
mas02md@547 431 \end{flushright}
mas02md@547 432
mas02md@547 433 The function $extract$ applies an extractor to each interval of a track to return a list of feature vectors. This requires the generic function map, which takes a function and applies it to every element of a list. A definition can be found in the appendix.
mas02md@547 434
mas02md@547 435 %%\begin{gendef}[X,Y]
mas02md@547 436 %% map : (X \fun Y) \fun (\seq X) \fun (\seq Y) \\
mas02md@547 437 %%\where
mas02md@547 438 %% \forall f : X \fun Y; x : X; xs,ys : \seq X @ \\
mas02md@547 439 %%\t1 map~f~\langle \rangle = \langle \rangle \land \\
mas02md@547 440 %%\t1 map~f~(xs \cat ys) = map~f~xs \cat map~f~ys
mas02md@547 441 %%\end{gendef}
mas02md@547 442
mas02md@547 443 \begin{axdef}
mas02md@547 444 extract : Segmenter \fun Extractor \fun Track \fun \seq \FV
mas02md@547 445 \where
mas02md@547 446 \forall seg : Segmenter; fx : Extractor; t : Track @ \\
mas02md@547 447 \t5 extract~seg~fx~t = map ~ (fx~t) (seg~t)
mas02md@547 448 \end{axdef}
mas02md@547 449
mas02md@547 450
mas02md@547 451 \item \textsf{Track Unit Vectors} - defined as above but specifically for unit extractors
mas02md@547 452
mas02md@547 453 \begin{flushright}
mas02md@547 454 \begin{tikzpicture}
mas02md@547 455 \input interval-index.tikz
mas02md@547 456 \input segmented-track.tikz
mas02md@547 457 % \foreach \x in {-2.5,-2,-1.4,-0.5,0,0.8,1.3,1.8} {
mas02md@547 458 % \begin{scope}[xshift=\x cm,yshift=-0.4 cm]
mas02md@547 459 \foreach \x in {-1.8,-1.4,...,1.1} {
mas02md@547 460 \begin{scope}[xshift=\x cm,yshift=1.1 cm]
mas02md@547 461 \draw (-0.1,0) rectangle (0.1,0.2);
mas02md@547 462 \end{scope}
mas02md@547 463 }
mas02md@547 464 \begin{scope}[yshift=1.1 cm]
mas02md@547 465 \draw[|-|,>=stealth] (-2.3,0) -- (-2.3,0.2);
mas02md@547 466 \draw node[anchor=south east] at (-2.4,0.1) {\small $d=1$};
mas02md@547 467 \end{scope}
mas02md@547 468 \end{tikzpicture}
mas02md@547 469 \end{flushright}
mas02md@547 470
mas02md@562 471 \item \textsf{Catalogue Feature Vectors} - a sequence of track feature vectors for each track in the instance ($features$)
mas02md@547 472
mas02md@562 473 \item \textsf{Catalogue Unit Vectors} - a sequence of track unit vectors for each track in the instance ($unitfeatures$)
mas02md@547 474
mas02md@562 475 \item \textsf{Catalogue Index} - a sequence of continuous interval indexes (one for each track).
mas02md@562 476
mas02md@562 477 An example can be seen below
mas02md@562 478
mas02md@562 479 \[ \langle
mas02md@562 480 \langle (0,r_{11}), (r_{11},r_{12}), (r_{12}, r_{13}) \rangle,
mas02md@562 481 \langle (0,r_{21}), (r_{21},r_{22}), (r_{22}, r_{23}, (r_{23}, r_{24}) \rangle,
mas02md@562 482 \langle (0,r_{31}), (r_{31},r_{32}), (r_{32}, r_{33}) \rangle,
mas02md@562 483 \rangle \]
mas02md@562 484
mas02md@562 485 \item \textsf{Instance} - a catalogue, segmenter, feature, extractor, unit feature, unit extractor, dimension, feature vectors, catal
mas02md@547 486
mas02md@547 487 % \emph{In our usage (see \textsf{Absolute} \and \textsf{Relative} in section \ref{s:refining}), we require certain semantics of the \textsf{Unit Feature} (and so of the values returned by its \textit{Unit Extractors}. Specifically, the values returned must be interpretable on a logarithmic scale in Bels, with the maximum possible value being the reference threshold at 0B.}
mas02md@547 488
mas02md@547 489 \begin{schema}{Instance}
mas02md@547 490 cat : Catalogue \\
mas02md@547 491 % tracks : \power Track \\
mas02md@547 492 seg : Segmenter \\
mas02md@547 493 f : Feature \\
mas02md@547 494 x : Extractor \\
mas02md@547 495 uf : UnitFeature \\
mas02md@547 496 ux : UnitExtractor \\
mas02md@562 497 d : Dimension \\
mas02md@562 498 index : \seq ContIntIndex \\
mas02md@562 499 features : \seq ~ ( \seq \V) \\
mas02md@562 500 unitfeatures : \seq ~ (\seq \U) \\
mas02md@547 501 \where
mas02md@547 502 d = f.fdimension \\
mas02md@562 503 index = map~seg~cat \\
mas02md@562 504 features = map ~ (extract~seg~x) ~ cat \\
mas02md@562 505 unitfeatures = map ~ (extract~seg~ux) ~ cat \\
mas02md@547 506 \end{schema}
mas02md@547 507
mas02md@562 508 \item \textsf{Singleton Instances} - the set of System instances which contain only one track
mas02md@547 509
mas02md@562 510 \begin{schema}{SingletonInstance}
mas02md@562 511 Instance \\
mas02md@562 512 \where
mas02md@562 513 \# cat = 1
mas02md@562 514 \end{schema}
mas02md@562 515
mas02md@562 516 \item \textsf{System Instances} - the set of system instances.
mas02md@562 517
mas02md@562 518 Note that every instance must contain a catalogue in the music collection but not all catalogues in the collection are necessarily part of an instance.
mas02md@547 519
mas02md@547 520 \begin{schema}{SystemInstances}
mas02md@547 521 Collection \\
mas02md@547 522 instances : \power Instance
mas02md@547 523 \where
mas02md@547 524 \{ i : instances @ i.cat \} \subseteq collection
mas02md@547 525 \end{schema}
mas02md@547 526
mas02md@562 527 \section{Search Vectors}
mas02md@547 528 Searching takes place using concatenations of feature vectors to build \emph{search} vectors. For a particular query all search vectors will have a fixed length equal to some multiple (which we will refer to as $sl$) of the dimension of the orignal feature vectors ($d$).
mas02md@547 529
mas02md@547 530 In the schema below, the function $makesearchvs$ takes a natual number $sl$ and a sequence of feature vectors (associated with a track) to create a sequence of search vectors. The first element of the returned sequence is the concatenation of the first $sl$ feature vectors, the second element is also the concatenation the next $sl$ feature vectors but starting from the second element, and so on until all such sequences are formed. It should be clear that if the original sequence contains $n$ feature vectors then the output will contain $n-sl+1$ vectors.
mas02md@547 531
mas02md@547 532 \begin{flushright}
mas02md@547 533 \begin{tikzpicture}[>=stealth]
mas02md@547 534 \begin{scope}[xshift=-6 cm]
mas02md@547 535 \input track-feature-vectors.tikz
mas02md@547 536 \begin{scope}[yshift=1.1 cm]
mas02md@547 537 \begin{scope}[yshift=1.1 cm,xshift=-1.8cm]
mas02md@547 538 \input fv1.tikz
mas02md@547 539 \draw (-0.1,0) rectangle (0.1,1);
mas02md@547 540 \draw (-0.1,0.2) -- (0.1,0.2);
mas02md@547 541 \draw (-0.1,0.4) -- (0.1,0.4);
mas02md@547 542 \draw (-0.1,0.6) -- (0.1,0.6);
mas02md@547 543 \draw (-0.1,0.8) -- (0.1,0.8);
mas02md@547 544 \end{scope}
mas02md@547 545 \begin{scope}[yshift=2.2 cm,xshift=-1.8cm]
mas02md@547 546 \input fv2.tikz
mas02md@547 547 \draw (-0.1,0) rectangle (0.1,1);
mas02md@547 548 \draw (-0.1,0.2) -- (0.1,0.2);
mas02md@547 549 \draw (-0.1,0.4) -- (0.1,0.4);
mas02md@547 550 \draw (-0.1,0.6) -- (0.1,0.6);
mas02md@547 551 \draw (-0.1,0.8) -- (0.1,0.8);
mas02md@547 552 \end{scope}
mas02md@547 553 \draw[->] (-1.4,1) arc(0:90:0.4 and 0.6);
mas02md@547 554 \draw[->] (-1.0,1) arc(0:90:0.8 and 1.7);
mas02md@547 555 \draw[->] (-1.0,1) arc(0:90:0.4 and 0.6);
mas02md@547 556 \draw[->] (-0.6,1) arc(0:90:0.8 and 1.7);
mas02md@547 557 \end{scope}
mas02md@547 558 \end{scope}
mas02md@547 559 \input interval-index.tikz
mas02md@547 560 \input segmented-track.tikz
mas02md@547 561 % \foreach \x in {-2.5,-2,-1.4,-0.5,0,0.8,1.3,1.8} {
mas02md@547 562 % \begin{scope}[xshift=\x cm,yshift=-0.4 cm]
mas02md@547 563 \foreach \x/\f in {-1.8/fv0,-1.4/fv1,-1.0/fv2,-0.6/fv3,-0.2/fv4,0.2/fv5,0.6/fv6,1.0/fv7} {
mas02md@547 564 \begin{scope}[yshift=1.1 cm,xshift=\x cm]
mas02md@547 565 \input \f.tikz
mas02md@547 566 \end{scope}
mas02md@547 567 }
mas02md@547 568 \foreach \x/\f in {-1.8/fv1,-1.4/fv2,-1.0/fv3,-0.6/fv4,-0.2/fv5,0.2/fv6,0.6/fv7} {
mas02md@547 569 \begin{scope}[yshift=2.1 cm,xshift=\x cm]
mas02md@547 570 \input \f.tikz
mas02md@547 571 \end{scope}
mas02md@547 572 }
mas02md@547 573 \foreach \x/\f in {-1.8/fv2,-1.4/fv3,-1.0/fv4,-0.6/fv5,-0.2/fv6,0.2/fv7} {
mas02md@547 574 \begin{scope}[yshift=3.1 cm,xshift=\x cm]
mas02md@547 575 \input \f.tikz
mas02md@547 576 \end{scope}
mas02md@547 577 }
mas02md@547 578 \foreach \x in {-1.8,-1.4,...,0.3} {
mas02md@547 579 \begin{scope}[xshift=\x cm,yshift=1.1 cm]
mas02md@547 580 \draw (-0.1,0) rectangle (0.1,3);
mas02md@547 581 \foreach \y in {0.2,0.4,...,2.9} {
mas02md@547 582 \draw (-0.1,\y) -- (0.1,\y);
mas02md@547 583 }
mas02md@547 584 \end{scope}
mas02md@547 585 }
mas02md@547 586 \begin{scope}[yshift=1.1 cm]
mas02md@547 587 \draw[<->,>=stealth] (-2.1,0) -- (-2.1,3);
mas02md@547 588 \draw node[anchor=east] at (-2.3,1.5) {$d \times sl$};
mas02md@547 589 \end{scope}
mas02md@547 590 \begin{scope}[yshift=1.1 cm]
mas02md@547 591 \draw[<->,>=stealth] (-2,3.2) -- (0.4,3.2);
mas02md@547 592 \draw node at (-0.8,3.6) {$n - sl + 1$};
mas02md@547 593 \end{scope}
mas02md@547 594 \begin{scope}[xshift=0.6 cm,yshift=1.1 cm]
mas02md@547 595 \draw (-0.1,0) rectangle (0.1,2);
mas02md@547 596 \foreach \y in {0.2,0.4,...,1.9} {
mas02md@547 597 \draw (-0.1,\y) -- (0.1,\y);
mas02md@547 598 }
mas02md@547 599 \draw[dashed] (-0.1,0) -- (-0.3,1);
mas02md@547 600 \draw[dashed] (-0.1,1) -- (-0.3,2);
mas02md@547 601 \draw[dashed] (-0.1,2) -- (-0.3,3);
mas02md@547 602 \end{scope}
mas02md@547 603 \begin{scope}[xshift=1.0 cm,yshift=1.1 cm]
mas02md@547 604 \draw (-0.1,0) rectangle (0.1,1);
mas02md@547 605 \foreach \y in {0.2,0.4,...,0.9} {
mas02md@547 606 \draw (-0.1,\y) -- (0.1,\y);
mas02md@547 607 }
mas02md@547 608 \draw[dashed] (-0.1,0) -- (-0.3,1);
mas02md@547 609 \draw[dashed] (-0.1,1) -- (-0.3,2);
mas02md@547 610 \end{scope}
mas02md@547 611 \foreach \x in {-1.4,-1.0,...,0.3} {
mas02md@547 612 \begin{scope}[xshift=\x cm,yshift=1.1 cm]
mas02md@547 613 \draw[dashed] (-0.1,0) -- (-0.3,1);
mas02md@547 614 \draw[dashed] (-0.1,1) -- (-0.3,2);
mas02md@547 615 \draw[dashed] (-0.1,2) -- (-0.3,3);
mas02md@547 616 \end{scope}
mas02md@547 617 }
mas02md@547 618 \end{tikzpicture}
mas02md@547 619 \end{flushright}
mas02md@547 620
mas02md@547 621 \begin{axdef}
mas02md@547 622 makesearchvs : \nat \fun \seq \Vd \fun \seq \Vdsl
mas02md@547 623 \where
mas02md@547 624 \forall xs : \seq \FV ; sl : \nat @ \\
mas02md@547 625 \t1 sl > \# xs \implies makesearchvs~sl~xs = \langle \rangle \land \\
mas02md@547 626 \t1 sl \leq \# xs \implies makesearchvs~sl~xs = \\
mas02md@547 627 \t2 concat (\langle (0 \upto sl-1) \dres xs \rangle) \cat makesearchvs~sl (tail~xs)
mas02md@547 628 \end{axdef}
mas02md@547 629
mas02md@547 630
mas02md@562 631 \item \textsf{Instance Search Vectors}
mas02md@547 632
mas02md@562 633 For any natural number less than the length of feature vectors for each track we can calculate the search vectors for an given by applying the $makesearchvs$ function to each of the sequences of feature vectors. We call this natural number \emph{search length} ($sl$) and we can use $map$ to apply $makesearchvs ~ sl$ to each of these sequences.
mas02md@547 634
mas02md@562 635 \begin{schema}{SearchVectors}
mas02md@562 636 i? : Instance \\
mas02md@562 637 sl? : \nat \\
mas02md@562 638 searchvs! : \seq (\seq \Vdsl) \\
mas02md@562 639 unitvs! : \seq (\seq \Vsl) \\
mas02md@547 640 \where
mas02md@562 641 searchvs! = map ~ (makesearchvs ~ sl?) ~ i?.features \\
mas02md@562 642 unitvs! = map ~ (makesearchvs ~ sl?) ~ i?.features \\
mas02md@547 643 \end{schema}
mas02md@547 644
mas02md@547 645 \item \textsf{Hopped Search Vectors}
mas02md@547 646
mas02md@547 647 Rather then generating all possible search vectors we may wish to general vectors starting at equally distanced intervals in the interval index of the tracks in an instance. Re refer to the size of this interval as $hop$.
mas02md@547 648
mas02md@547 649 \begin{axdef}
mas02md@547 650 makehopsearchvs : \nat \fun \nat \fun \seq \Vd \fun \seq \Vdsl
mas02md@547 651 \where
mas02md@547 652 \forall xs : \seq \FV ; sl : \nat ; hop : \nat @ \\
mas02md@547 653 \t1 sl > \# xs \implies makehopsearchvs~sl~hop~xs = \langle \rangle \land \\
mas02md@547 654 \t1 sl \leq \# xs \implies makehopsearchvs~sl~hop~xs = \\
mas02md@547 655 \t2 concat (\langle (0 \upto sl-1) \dres xs \rangle) \cat \\
mas02md@547 656 \t2 makehopsearchvs~sl~hop~ ((0 \upto (hop -1)) \ndres xs)
mas02md@547 657 \end{axdef}
mas02md@547 658
mas02md@562 659 This enables us to define a hop size when generating search vectors
mas02md@547 660
mas02md@562 661 \begin{schema}{HopSearchVectors}
mas02md@562 662 SearchVectors \\
mas02md@547 663 hop? : \nat \\
mas02md@562 664 hopsearchvs! : \seq (\seq \Vdsl) \\
mas02md@562 665 hopunitvs! : \seq (\seq \Vsl) \\
mas02md@547 666 \where
mas02md@562 667 hopsearchvs! = map ~ (makehopsearchvs ~i?.d ~ hop?) ~ i?.features \\
mas02md@562 668 hopunitvs! = map ~ (makehopsearchvs ~ i?.d ~ hop?) ~ i?.features \\
mas02md@547 669 \end{schema}
mas02md@547 670
mas02md@562 671 \item \textsf{Search Vector Power}
mas02md@562 672
mas02md@562 673 For each Search Vector in an instances search vectors we take the associated unit values and take the arithmetic mean.
mas02md@562 674
mas02md@562 675 \begin{axdef}
mas02md@562 676 average : \V \fun \R
mas02md@562 677 \end{axdef}
mas02md@562 678
mas02md@562 679 \begin{schema}{Powers}
mas02md@562 680 SearchVectors \\
mas02md@562 681 powers : \seq (\seq \R)
mas02md@562 682 \where
mas02md@562 683 powers = map~(map ~ average) ~ unitvs! \\
mas02md@562 684 \end{schema}
mas02md@562 685
mas02md@562 686 \item \textsf{Search Vector Duration}
mas02md@562 687
mas02md@562 688 The duration of the sequence of a the original feature vectors from which the search vector is composed.
mas02md@562 689
mas02md@562 690 \begin{enumerate}
mas02md@562 691
mas02md@562 692 \item \textsf{Single Search Vector Duration}
mas02md@562 693
mas02md@562 694 This is defined as the duration of the sequence for each search vector. So for example if a search vector was made from the combination of 3 feature vectors and those 3 feature vectors came from the following intervals $ (3,8), (8,11), (11,13) $ then the duration of the search vector is $13-3=10$.
mas02md@562 695
mas02md@562 696 \item \textsf{Track Search Vector Durations}
mas02md@562 697
mas02md@562 698 Next we define a function at the level of track search vectors. We define a function which takes a sequence of search vectors for that track, the search length that was used to create them, and the track interval index and returns a sequence of natural numbers each which is the duration of the corresponding search vector
mas02md@562 699
mas02md@562 700 \begin{axdef}
mas02md@562 701 durationsTsv : \seq \Vdsl \fun ContIntIndex \fun \nat \fun \seq \R
mas02md@562 702 \where
mas02md@562 703 \forall svs : \seq \Vdsl; cii : ContIntIndex; sl : \nat @ \\
mas02md@562 704 \t1 durationsTsv~svs~cii~sl = \\
mas02md@562 705 \t2 \{ n : \nat | n \in 1 \upto n - sl + 1 @ \\
mas02md@562 706 \t3 (n, first (cii (n + sl)) - first (cii (n)) ) \}
mas02md@562 707 \end{axdef}
mas02md@562 708
mas02md@562 709 \item \textsf{Instance Search Vector Durations}
mas02md@562 710
mas02md@562 711 Finally, we define a function called svdurations which takes a sequence of search vectors and calculate the track search vector durations for each sequence.
mas02md@562 712
mas02md@562 713 \begin{axdef}
mas02md@562 714 durationsIsv : \seq (\seq \Vdsl) \fun (\seq ContIntIndex) \fun \nat \fun \seq (\seq \R) \\
mas02md@562 715 \where
mas02md@562 716 \forall svss : \seq (\seq \Vdsl); sl : \nat; ciis : \seq ContIntIndex @ \\
mas02md@562 717 \t1 durationsIsv ~ svss ~ ciis ~ sl = \\
mas02md@562 718 \t2 \langle durationsTsv ~ (head ~ svss) ~ (head ~ ciis) ~ sl \rangle \cat \\
mas02md@562 719 \t3 durationsIsv ~ (tail ~ svss) ~ (tail ~ ciis) ~ sl \land \\
mas02md@562 720 \t1 durationsIsv ~ \langle \rangle~ \langle \rangle ~ sl = \langle \rangle
mas02md@562 721 \end{axdef}
mas02md@562 722
mas02md@547 723 \end{enumerate}
mas02md@547 724
mas02md@562 725 \begin{schema}{Durations}
mas02md@562 726 SearchVectors \\
mas02md@562 727 svdurations : \seq (\seq \R) \\
mas02md@562 728 \where
mas02md@562 729 svdurations = durationsIsv ~ searchvs! ~ i?.index ~ sl?
mas02md@562 730 \end{schema}
mas02md@547 731
mas02md@562 732 % \end{enumerate}
mas02md@562 733
mas02md@562 734 \section{Making a Query}
mas02md@562 735
mas02md@562 736 \item \textsf{General Instance Query}
mas02md@562 737
mas02md@562 738 Two instances, a source and a target are identified by the user for comparison as well as the search length from which the search vectors are defined for both the source and the sequence. Once the query has been defined we can generate the \emph{source search vectors}, \emph{source unit vectors}, \emph{target search vectors} and \emph{target unit vectors} as specified earlier. We also generate \emph{source powers}, \emph{target powers}, \emph{source search vector durations} and \emph{target search vector durations}.
mas02md@562 739
mas02md@562 740 \emph{What are the pre-conditions regarding the search length? That it is less than the length of any sequence of feature vectors for every track?}
mas02md@562 741
mas02md@562 742 \begin{schema}{GeneralQuery}
mas02md@562 743 SystemInstances \\
mas02md@562 744 src?, tgt? : Instance \\
mas02md@562 745 sl?: \nat \\
mas02md@562 746 srcsearchvs!, tgtsearchvs! : \seq (\seq \Vdsl) \\
mas02md@562 747 srcunitvs!, tgtunitvs! : \seq (\seq \Vsl) \\
mas02md@562 748 sourcepowers, targetpowers : \seq (\seq \R) \\
mas02md@562 749 sourcedurations, targetdurations : \seq (\seq \R) \\
mas02md@562 750 \where
mas02md@562 751 \{ src?, tgt? \} \subseteq instances \\
mas02md@562 752 \end{schema}
mas02md@547 753
mas02md@562 754 \item \textsf{General Instance Self Query}
mas02md@562 755
mas02md@562 756 When an instance is identified as both the source and target we state that it is a self query.
mas02md@562 757
mas02md@562 758 \begin{schema}{GeneralSelfQuery}
mas02md@562 759 GeneralQuery
mas02md@562 760 \where
mas02md@562 761 src? = tgt?
mas02md@562 762 \end{schema}
mas02md@562 763
mas02md@562 764 \item Singleton (Track) Query
mas02md@562 765
mas02md@562 766 A user defines an instance with exactly one track as the source.
mas02md@562 767
mas02md@562 768 \begin{schema}{SingletonQuery}
mas02md@562 769 GeneralQuery
mas02md@562 770 \where
mas02md@562 771 src? \in SingletonInstance
mas02md@562 772 \end{schema}
mas02md@562 773
mas02md@562 774 \item Point (Sequence) Query
mas02md@562 775
mas02md@562 776 The user identifies a specific part of the track in a singleton instance to be used as the source known as a \emph{sequence}. A user defines the sequence by specifying the start and end points of the sequence index. Once these have been defined the sequence, sequence index and sequence length are easily identified.
mas02md@562 777
mas02md@562 778 \begin{enumerate}
mas02md@562 779
mas02md@562 780 \item \textsf{Source track} - the single track contained in the instance.
mas02md@562 781
mas02md@562 782 \item \textsf{Sequence} - a continuous sub-section of the point query track used to make a query.
mas02md@562 783
mas02md@562 784 \begin{zed}
mas02md@562 785 Sequence == Track
mas02md@562 786 \end{zed}
mas02md@562 787
mas02md@562 788 \item \textsf{Sequence Index} - a continuous interval index that defines the sequence according to the segmenter of the instance.
mas02md@562 789
mas02md@562 790 % \emph{Can we discuss this - is it reset so that the first element is indexed by 1 and so on?}
mas02md@562 791
mas02md@562 792 \item \textsf{Sequence Length} - the number of intervals in the sequence index.
mas02md@562 793
mas02md@562 794 \end{enumerate}
mas02md@562 795
mas02md@562 796 \begin{schema}{PointQuery}
mas02md@562 797 SingletonQuery \\
mas02md@562 798 start?, end? : \nat \\
mas02md@562 799 sourcetrack! : Track \\
mas02md@562 800 sequence! : Sequence \\
mas02md@562 801 seqindex! : ContIntIndex \\
mas02md@562 802 sl! : \nat \\
mas02md@562 803 \where
mas02md@562 804 sourcetrack! = head (src?.cat) \\
mas02md@562 805 start? \in \dom (src?.seg (sourcetrack!)) \\
mas02md@562 806 end? \in \dom (src?.seg (sourcetrack!)) \\
mas02md@562 807 seqindex! = (start? \upto end?) \dres (src?.seg~sourcetrack!) \\
mas02md@562 808 sl! = end? - start? \\
mas02md@562 809 \end{schema}
mas02md@562 810
mas02md@562 811 In this case the search vectors will be equal to a sequence containing only one sequence. That sequence will itself consist of only one sequence which will be a search vector of dimension $d * sl$ where $d$ is the dimension of the sequence and $sl$ the length of the sequence.
mas02md@562 812
mas02md@562 813 \[ \langle \langle \Vdsl \rangle \rangle \]
mas02md@562 814
mas02md@562 815 \section{Distance between Search Vectors}
mas02md@547 816
mas02md@547 817 We define the scalar product and distance between vectors of equal dimension.
mas02md@547 818
mas02md@547 819 \begin{axdef}
mas02md@547 820 scalarproduct : \V \fun \V \fun \R \\
mas02md@547 821 distance : \V \fun \V \fun \R \\
mas02md@547 822 \end{axdef}
mas02md@547 823
mas02md@547 824 \begin{zed}
mas02md@547 825 \forall v_1, v_2 : \V @ distance~v_1~v_2 = 2 - scalarproduct~v_1~v_2
mas02md@547 826 \end{zed}
mas02md@547 827
mas02md@562 828 Then we can define a variable which maintains a list of all distances between the source and target search vectors. If we have a sequence of sequence of search vectors in the both the source and the target we will end up with a sequence of sequence of sequence of sequences.
mas02md@547 829
mas02md@562 830 For the purposes of illumination imagine a source and target each with two simple search vectors. Again, we will replace the reals with naturals here.
mas02md@562 831
mas02md@562 832 Let us assume we have the following source search vectors for an instance of two tracks
mas02md@562 833
mas02md@562 834 \[ \langle \langle a_1, a_2, a_3 \rangle, \langle b_1, b_2 \rangle \rangle \]
mas02md@562 835
mas02md@562 836 And the following target search vectors also for an instance of two tracks
mas02md@562 837
mas02md@562 838 \[ \langle \langle x_1, x_2 \rangle, \langle y_1, y_2, y_3 \rangle \rangle \]
mas02md@562 839
mas02md@562 840 Then we need to generate the following data structure
mas02md@562 841
mas02md@562 842 \[ \langle \\ % First track in source with target instance \\
mas02md@562 843 \t1 \langle \\ % First Search Vector with whole of target \\
mas02md@562 844 \t2 \langle \\ % First Search Vector with first track
mas02md@562 845 \t3 \langle a_1 \dot x_1, a_1 \dot x_2 \rangle, \\
mas02md@562 846 \t3 \langle a_1 \dot y_1, a_1 \dot y_2, a_1 \dot y_3 \rangle \\
mas02md@562 847 \t2 \rangle, \\
mas02md@562 848 % Second search vector
mas02md@562 849 \t2 \langle \\
mas02md@562 850 \t3 \langle a_2 \dot x_1, a_2 \dot x_2 \rangle, \\
mas02md@562 851 \t3 \langle a_2 \dot y_1, a_2 \dot y_2, a_2 \dot y_3 \rangle \\
mas02md@562 852 \t2 \rangle, \\
mas02md@562 853 \t2 \langle \\
mas02md@562 854 \t3 \langle a_3 \dot x_1, a_3 \dot x_2 \rangle, \\
mas02md@562 855 \t3 \langle a_3 \dot y_1, a_3 \dot y_2, a_3 \dot y_3 \rangle \\
mas02md@562 856 \t2 \rangle \\
mas02md@562 857 \t1 \rangle, \\
mas02md@562 858 % Second track in source with target instance
mas02md@562 859 \t1 \langle, \\
mas02md@562 860 \t2 \langle \\
mas02md@562 861 \t3 \langle b_1 \dot x_1, b_1 \dot x_2 \rangle, \\
mas02md@562 862 \t3 \langle b_1 \dot y_1, b_1 \dot y_2, b_1 \dot y_3 \rangle \\
mas02md@562 863 \t2 \rangle \\
mas02md@562 864 \t2 \langle \\
mas02md@562 865 \t3 \langle b_2 \dot x_1, b_2 \dot x_2 \rangle, \\
mas02md@562 866 \t3 \langle b_2 \dot y_1, b_2 \dot y_2, b_2 \dot y_3 \rangle \\
mas02md@562 867 \t2 \rangle \\
mas02md@562 868 \t1 \rangle \\
mas02md@562 869 \rangle \\
mas02md@562 870 \]
mas02md@562 871
mas02md@562 872
mas02md@562 873 First let us define the function for a calculating the distances for a single search vector and general target search vectors
mas02md@562 874
mas02md@562 875 \emph{I couldn't do this on one go first time round - too much nesting for me!}
mas02md@562 876
mas02md@562 877 \begin{axdef}
mas02md@562 878 vectordistances : \Vdsl \fun \seq (\seq \Vdsl) \fun \seq (\seq \R)
mas02md@547 879 \where
mas02md@562 880 \forall sv : \Vdsl; targetsv : \seq (\seq \Vdsl) @ \\
mas02md@562 881 \t1 vectordistances~sv~targetsv = map (map (distance~sv)) targetsv
mas02md@562 882 \end{axdef}
mas02md@562 883
mas02md@562 884 Next we define the function for calculating the distances for the search vectors of a track with a target instance.
mas02md@562 885
mas02md@562 886 \begin{axdef}
mas02md@562 887 trackdistances : (\seq \Vdsl) \fun \seq (\seq \Vdsl) \fun \seq (\seq (\seq \R))
mas02md@562 888 \where
mas02md@562 889 \forall tracksv : \seq \Vdsl; targetsv : \seq (\seq \Vdsl) @ \\
mas02md@562 890 \t1 trackdistances~tracksv~targetsv = \\
mas02md@562 891 \t3 \langle vectordistances~(head ~ tracksv)~targetsv \rangle \cat \\
mas02md@562 892 \t4 trackdistances~(tail ~ tracksv) ~ targetsv \land \\
mas02md@562 893 \t1 trackdistances~\langle \rangle ~targetsv = \langle \langle \langle \rangle \rangle \rangle
mas02md@562 894 \end{axdef}
mas02md@562 895
mas02md@562 896 Then we can define a function which calculating the distances between a source and target instance.
mas02md@562 897
mas02md@562 898 \begin{axdef}
mas02md@562 899 instancedistances : \seq (\seq \Vdsl) \fun \seq (\seq \Vdsl) \fun \seq (\seq (\seq (\seq \R)))
mas02md@562 900 \where
mas02md@562 901 \forall sourcesv, targetsv: \seq (\seq \Vdsl) @ \\
mas02md@562 902 \t1 instancedistances~sourcesv~targetsv = \\
mas02md@562 903 \t3 \langle trackdistances~(head ~ sourcesv)~targetsv \rangle \cat \\
mas02md@562 904 \t4 instancedistances~(tail ~ sourcesv) ~ targetsv \land \\
mas02md@562 905 \t1 instancedistances~\langle \langle \rangle \rangle ~targetsv = \langle \langle \langle \langle \rangle \rangle \rangle \rangle
mas02md@562 906 \end{axdef}
mas02md@562 907
mas02md@562 908 The next scheme takes a general query and calculates all the distances.
mas02md@562 909
mas02md@562 910 \begin{schema}{CalculateDistances}
mas02md@562 911 GeneralQuery \\
mas02md@562 912 distances : \seq (\seq (\seq (\seq \R)))
mas02md@562 913 \where
mas02md@562 914 distances = instancedistances~srcsearchvs!~tgtsearchvs!
mas02md@547 915 \end{schema}
mas02md@547 916
mas02md@562 917 The list can be sorted into ascending order and some threshold set (as we shall see) where we believe it is sensible to expect some acoustic or cognitive similarity.
mas02md@547 918
mas02md@562 919 Along with the distance between two search vectors, we also output the source track and index, the target track and index, as well as the duration the source search vector and power of the associated and corresponding unit vectors.
mas02md@547 920
mas02md@562 921 \begin{schema}{Output}
mas02md@562 922 srctrack, srcindex, tgttrack, tgtindex : \nat \\
mas02md@562 923 distance : \R \\
mas02md@562 924 sourceduration, targetduration : \R \\
mas02md@562 925 sourcepower, targetpower : \R \\
mas02md@562 926 \end{schema}
mas02md@547 927
mas02md@562 928 The System outputs are a sequence of outputs ordered according to distance. We define a new variable which specifies the modified output that the user can make which we discuss in the next section.
mas02md@547 929
mas02md@547 930 \begin{schema}{SystemOutput}
mas02md@562 931 CalculateDistances \\
mas02md@547 932 output! : \seq Output \\
mas02md@547 933 modifiedoutput! : \seq Output
mas02md@547 934 \where
mas02md@547 935 \forall o_1, o_2 : Output @ \langle o_1, o_2 \rangle \inseq output! \\
mas02md@547 936 \t3 \implies o_1.distance \leq o_2.distance
mas02md@547 937 \end{schema}
mas02md@547 938
mas02md@562 939 \section{Refining a Search}
mas02md@562 940 \label{s:refining}
mas02md@562 941
mas02md@562 942 There are ways of refining the query.
mas02md@562 943
mas02md@562 944 \begin{enumerate}
mas02md@562 945
mas02md@562 946 \item \textsf{Key List} - specify specific tracks to search over. (Either source or target or both.)
mas02md@562 947
mas02md@562 948 \item \textsf{Radius} - reject distances which are greater than a given real number radius. (Either source or target or both.)
mas02md@562 949
mas02md@562 950 \item \textsf{Absolute} - reject any search vectors where the `power' average is less than a specific absolute value. (Either source or target or either or both.)
mas02md@562 951
mas02md@562 952 \item \textsf{Relative} - reject any search vectors where the `power' average is not within + or - a relative value. (Either source or target or either or both.)
mas02md@562 953
mas02md@562 954 \item \textsf{Duration Ratio} - remove any outputs where the relative durations of the search vectors are not within a specified range.
mas02md@562 955
mas02md@562 956 \item \textsf{Hop Size} - Rather than making search vectors which start with the first feature vectors and then the second feature vector and so on, make sparser search vectors by starting with fv at 1, then fv at $1 + h$, then fv at $1 + 2h$ and so on where h is the hop size. (Either source or target or both with either equal or separate values of hop.)
mas02md@562 957
mas02md@562 958 \end{enumerate}
mas02md@562 959
mas02md@547 960 \begin{enumerate}
mas02md@547 961
mas02md@547 962 \item Keylist
mas02md@547 963
mas02md@562 964 \begin{schema}{KeyListSource}
mas02md@547 965 SystemOutput \\
mas02md@562 966 k? : \power \nat
mas02md@547 967 \where
mas02md@562 968 modifiedoutput! = output! \filter \{ o : Output | o.srctrack \in k? \}
mas02md@547 969 \end{schema}
mas02md@547 970
mas02md@547 971 \item Radius
mas02md@547 972
mas02md@547 973 The system removes any elements which have a distance greater than the radius.
mas02md@547 974
mas02md@547 975 \begin{schema}{Radius}
mas02md@547 976 SystemOutput \\
mas02md@547 977 r? : \R \\
mas02md@547 978 \where
mas02md@547 979 modifiedoutput! = \\
mas02md@547 980 \t1 output! \filter \{ o : Output | o.distance \leq r? \}
mas02md@547 981 \end{schema}
mas02md@547 982
mas02md@562 983 \item Absolute Source
mas02md@547 984
mas02md@547 985 \begin{schema}{Absolute}
mas02md@547 986 SystemOutput \\
mas02md@547 987 a? : \R \\
mas02md@547 988 \where
mas02md@547 989 modifiedoutput! = \\
mas02md@562 990 \t1 output! \filter \{ o : Output | o.sourcepower \geq a? \}
mas02md@547 991 \end{schema}
mas02md@547 992
mas02md@547 993 \item Relative
mas02md@547 994
mas02md@547 995 %% \begin{axdef}
mas02md@547 996 %% abs : \R \fun \R
mas02md@547 997 %% \end{axdef}
mas02md@547 998
mas02md@547 999
mas02md@562 1000
mas02md@547 1001 \begin{schema}{Relative}
mas02md@547 1002 SystemOutput \\
mas02md@547 1003 rel? : \R \\
mas02md@562 1004 % \where
mas02md@562 1005 % modifiedoutput! = \\
mas02md@562 1006 % \t1 output! \filter \{ o : Output | abs (o.power - sourcepower) \leq rel? \}
mas02md@547 1007 \end{schema}
mas02md@547 1008
mas02md@547 1009 \item Duration Ratio
mas02md@547 1010
mas02md@547 1011 %%unchecked
mas02md@547 1012 \begin{schema}{DurationRatio}
mas02md@547 1013 SystemOutput \\
mas02md@547 1014 d? : \R \\
mas02md@547 1015 \where
mas02md@547 1016 modifiedoutput! = \\
mas02md@562 1017 \t1 output! \filter \{ o : Output | \exp ^ {abs \ln (\frac{o.duration}{srcduration})} \leq d? \}
mas02md@547 1018 \end{schema}
mas02md@547 1019
mas02md@547 1020 \item Hop Size
mas02md@547 1021
mas02md@547 1022 \begin{schema}{Hop}
mas02md@547 1023 hop? : \nat \\
mas02md@547 1024 SystemOutput \\
mas02md@562 1025 HopSearchVectors \\
mas02md@547 1026 \end{schema}
mas02md@547 1027
mas02md@547 1028 \end{enumerate}
mas02md@562 1029 \end{enumerate}
mas02md@562 1030 \end{document}
mas02md@547 1031
mas02md@547 1032 \section{Setting Thresholds}
mas02md@547 1033
mas02md@547 1034 \subsection{For an Instance}
mas02md@547 1035
mas02md@547 1036 \begin{enumerate}
mas02md@547 1037 \item Guess several lengths based on understanding of the instance (beat, bar, phrase, section). ($sl$)
mas02md@547 1038 \item Generate all sequences of length ($sl$) from the tracks in the instance
mas02md@547 1039 \item Select pairs of sequences of length ($sl$) which may or may not be from the same track randomly from the instance, compute distance between these pairs, and repeat a fixed number of times.
mas02md@547 1040 \item Tabulate distances
mas02md@547 1041 \item Scary maths
mas02md@547 1042 \item Obtain a global threshold
mas02md@547 1043 \item Set the radius threshold equal to this value
mas02md@547 1044 \end{enumerate}
mas02md@547 1045
mas02md@547 1046 \subsection{For a Source Sequence and a Target Instance (I)}
mas02md@547 1047
mas02md@547 1048 \begin{enumerate}
mas02md@547 1049 \item Take the length of the source sequence ($sl$)
mas02md@547 1050 \item Generate all sequences of length ($sl$) from the tracks in the instance
mas02md@547 1051 \item Select a sequence randomly, calculate the distance form the source sequence, and repeat a fixed number of times.
mas02md@547 1052 \item Tabulate distances
mas02md@547 1053 \item Scary maths
mas02md@547 1054 \item Obtain a threshold for this sequence/instance pair
mas02md@547 1055 \item Set the radius threshold equal to this value
mas02md@547 1056 \end{enumerate}
mas02md@547 1057
mas02md@547 1058 \subsection{For a Source Track and a Target Instance (II)}
mas02md@547 1059
mas02md@547 1060 \begin{enumerate}
mas02md@547 1061 \item Choose a length $sl$ which is less than the track length
mas02md@547 1062 \item Generate all sequences of length ($sl$) from the tracks in the instance
mas02md@547 1063 \item Generate all sequences of length ($sl$) from the source sequence
mas02md@547 1064 \item Select one of these source sequences randomly and one of the sequences from one of the tracks in the instance randomly, calculate the distance form the source sequence, and repeat a fixed number of times.
mas02md@547 1065 \item Tabulate distances
mas02md@547 1066 \item Scary maths
mas02md@547 1067 \item Obtain a threshold for this track/instance pair
mas02md@547 1068 \item Set the radius threshold equal to this value
mas02md@547 1069 \end{enumerate}
mas02md@547 1070
mas02md@547 1071 \appendix
mas02md@547 1072 \section{Auxillary Definitions}
mas02md@547 1073
mas02md@547 1074 \subsection{The function $map$}
mas02md@547 1075
mas02md@547 1076 This requires need the generic function map, which takes a function and applies it to every element of a list.
mas02md@547 1077
mas02md@547 1078 %%unchecked
mas02md@547 1079 \begin{gendef}[X,Y]
mas02md@547 1080 map : (X \fun Y) \fun (\seq X) \fun (\seq Y) \\
mas02md@547 1081 \where
mas02md@547 1082 \forall f : X \fun Y; x : X; xs,ys : \seq X @ \\
mas02md@547 1083 \t1 map~f~\langle \rangle = \langle \rangle \land \\
mas02md@547 1084 \t1 map~f~(xs \cat ys) = map~f~xs \cat map~f~ys
mas02md@547 1085 \end{gendef}
mas02md@547 1086
mas02md@547 1087
mas02md@547 1088 \subsection{The function $concat$}
mas02md@547 1089
mas02md@547 1090 We can do this by making use of the following generic function that takes a list of lists of lists, and concatenates each of the lists together.
mas02md@547 1091
mas02md@547 1092 %%unchecked
mas02md@547 1093 \begin{gendef}[X]
mas02md@547 1094 concat : (\seq (\seq X)) \fun (\seq X)
mas02md@547 1095 \where
mas02md@547 1096 \forall xs : \seq X; xss : \seq (\seq X) @ \\
mas02md@547 1097 \t2 concat~ ((\langle xs \rangle) \cat xss) = xs \cat
mas02md@547 1098 (concat~xss)
mas02md@547 1099 \end{gendef}
mas02md@547 1100
mas02md@547 1101 \section{Probabilty of chosing a track}
mas02md@547 1102
mas02md@547 1103 The longer the track in a target the more sequences it will contain with the same number of intervals with length ($sl$). Therefore the probability of selecting a track is weighted. Once a track has been selected then a sequence within the track is chosen uniformly.
mas02md@547 1104
mas02md@547 1105 %% \begin{axdef}
mas02md@547 1106 %% sum : \power \R \fun \R
mas02md@547 1107 %% \end{axdef}
mas02md@547 1108
mas02md@547 1109 %% \begin{axdef}
mas02md@547 1110 %% pairmax : \R \cross \R \fun \R
mas02md@547 1111 %% \end{axdef}
mas02md@547 1112
mas02md@547 1113 %%unchecked
mas02md@547 1114 \begin{schema}{CalculateProbabilities}
mas02md@547 1115 i : Instance \\
mas02md@547 1116 sl : \nat \\
mas02md@547 1117 allslsequences : \nat \\
mas02md@547 1118 SystemInstances \\
mas02md@547 1119 prob : Track \fun \R \also
mas02md@547 1120 \where
mas02md@547 1121 allslsequences = \\
mas02md@547 1122 \t1 sum \{ t : Track @ pairmax ( (length (i.seg~t) - sl + 1), 0 ) \} \also
mas02md@547 1123 \forall t : Track @ prob~t = \frac{length (i.seg~t) - sl + 1}{ allslsequences}
mas02md@547 1124 \end{schema}
mas02md@547 1125
mas02md@547 1126 This uses a function $subsequences$ which takes a sequence of elements $s$ and a natural number $n$ and returns the number of contiguous sequences of length $n$ and a function $sumseq$ which sums the elements in a list.
mas02md@547 1127
mas02md@547 1128 \section{Examples illustrating functions}
mas02md@547 1129
mas02md@547 1130 \subsection{Track Extract}
mas02md@547 1131
mas02md@547 1132 Given a segmenter, an extractor, and a track, the function $extract$ returns a list of feature vectors for that track. Each feature vector has equal dimension, and each vector corresponds to the equivalent interval in the interval index. As an example, if the feature vectors have dimension $d$, and if there are $m$ intervals defined by the segmenter, and if we adopt the convention that $\V$ represents a vector of length d, this function would return something with the following form:
mas02md@547 1133
mas02md@547 1134 \[ \langle V_{1}^{d}, V_{2}^{d} \dots V_{m}^{d} \rangle \]
mas02md@547 1135
mas02md@547 1136 When we apply this function to every track in a catalogue (which is a list of tracks) we simply determine a list of such lists. If there are $n$ tracks in a catalogue then our data would look something like the following.
mas02md@547 1137
mas02md@562 1138 \[ features = \langle ~~ \langle V_{11}^{d}, V_{12}^{d} \dots V_{1m_{1}}^{d} \rangle \\
mas02md@547 1139 \t3 ~~~ \langle V_{21}^{d}, V_{22}^{d} \dots V_{2m_{2}}^{d} \rangle \\
mas02md@547 1140 \t3 \dots \\
mas02md@547 1141 \t3 \dots \\
mas02md@547 1142 \t3 ~~~ \langle V_{n1}^{d}, V_{n2}^{d} \dots V_{nm_{n}}^{d} \rangle
mas02md@547 1143 ~~ \rangle
mas02md@547 1144 ~~ \]
mas02md@547 1145
mas02md@547 1146 The unit vector data has exactly the same structure but with different vectors all of dimension 1.
mas02md@547 1147
mas02md@562 1148 \[ unitfeatures = \langle ~~ \langle V_{11}^{1}, V_{12}^{1} \dots V_{1m_{1}}^{1} \rangle \\
mas02md@547 1149 \t4 ~~ \langle V_{21}^{1}, V_{22}^{1} \dots V_{2m_{2}}^{1} \rangle \\
mas02md@547 1150 \t4 \dots \\
mas02md@547 1151 \t4 \dots \\
mas02md@547 1152 \t4 ~~~ \langle V_{n1}^{1}, V_{n2}^{1} \dots V_{nm_{n}}^{1} \rangle
mas02md@547 1153 ~~ \rangle
mas02md@547 1154 ~~ \]
mas02md@547 1155
mas02md@547 1156 \subsection{Making Search Vectors}
mas02md@547 1157
mas02md@547 1158 Suppose we had the following feature vectors for a track.
mas02md@547 1159
mas02md@547 1160 \[ \langle V_{1}^{d}, V_{2}^{d} , V_{3}^{d} \rangle \]
mas02md@547 1161
mas02md@547 1162 Also suppose that our sequence length $sl$ had a value of 2 and we applied the function $makesearchvs$. We would then make search vectors as follows.
mas02md@547 1163
mas02md@547 1164 \[ \t3 \langle V_{1}^{d} \cat V_{2}^{d} , V_{2}^{d} \cat V_{3}^{d} \rangle \]
mas02md@547 1165
mas02md@562 1166
mas02md@562 1167 \end{enumerate}
mas02md@547 1168 \end{document}
mas02md@547 1169