Currently supported query types:

  O2_POINT_QUERY
    * dot_product

    Find and report, from the database, up to "pointNN"
    near-neighbours of length-1 query sequences.

  O2_TRACK_QUERY
    * dot_product

    Find, in each track, up to "pointNN" near-neighbours of length-1
    query sequences, reporting the top "trackNN" tracks, ordered by
    the average distance of the pairwise matches.

  O2_SEQUENCE_QUERY
    - radius, + radius
    * euclidean_normed, euclidean

  O2_N_SEQUENCE_QUERY
    - radius, + radius
    * euclidean_normed, euclidean

    Find, in each track, up to "pointNN" near-neighbours of query
    sequences. Report the results from the "trackNN" top tracks,
    where the tracks are ordered by the average distance of the
    retrieved pairwise matches. The difference between SEQUENCE and
    N_SEQUENCE is that the SEQUENCE case reports only the average,
    while the N_SEQUENCE case reports the individual points too.

    (Ordering by average is arbitrary, and it's not hard to construct
    cases where it is suboptimal. The two cases where it is not
    arbitrary are when pointNN is 1, and when trackNN is equal to the
    number of files in the database.)

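To make the ordering-by-average rule concrete, here is a minimal sketch
of how the track ranking could be computed. It is not the libaudiodb
code: the Match struct and rank_tracks() are made-up names for
illustration, and it assumes that a smaller distance means a closer
match.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical match record; not the libaudiodb result type.
struct Match {
    std::string track;  // key of the track holding the target point
    double dist;        // distance of the pairwise match (smaller = closer)
};

typedef std::pair<std::string, double> TrackScore;  // (track, average distance)

static bool lower_average(const TrackScore& a, const TrackScore& b) {
    return a.second < b.second;
}

// Keep the point_nn smallest distances in each track, average them,
// and return the track_nn tracks with the lowest averages.
std::vector<TrackScore> rank_tracks(const std::vector<Match>& matches,
                                    std::size_t point_nn,
                                    std::size_t track_nn) {
    std::map<std::string, std::vector<double> > per_track;
    for (std::size_t i = 0; i < matches.size(); ++i)
        per_track[matches[i].track].push_back(matches[i].dist);

    std::vector<TrackScore> ranked;
    std::map<std::string, std::vector<double> >::iterator it;
    for (it = per_track.begin(); it != per_track.end(); ++it) {
        std::vector<double>& d = it->second;
        std::size_t keep = std::min(point_nn, d.size());
        if (keep == 0) continue;  // point_nn == 0: nothing to average
        std::partial_sort(d.begin(), d.begin() + keep, d.end());
        double sum = 0.0;
        for (std::size_t i = 0; i < keep; ++i) sum += d[i];
        ranked.push_back(TrackScore(it->first, sum / keep));
    }
    std::stable_sort(ranked.begin(), ranked.end(), lower_average);
    if (ranked.size() > track_nn) ranked.resize(track_nn);
    return ranked;
}
```

With matches A: {0.1, 0.9} and B: {0.4, 0.4}, pointNN = 2 ranks B first
(average 0.4 against 0.5) even though A holds the single closest match;
with pointNN = 1, A ranks first. That is the kind of case the
parenthetical note above has in mind.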
  O2_ONE_TO_ONE_N_SEQUENCE_QUERY
    + radius
    * euclidean_normed

    For all applicable query sequences, find and report the closest
    target instance point. Each query sequence is responsible for
    exactly one result.

    (This feels like it should be more orthogonal than a separate
    query type; the restriction on using a target instance point only
    once in a match seems like it should compose with the sequencing
    query above.)

Plan:

We have

  reporter->add_point(),
  reporter->report().

Insert into the whole shebang a new class Accumulator, with methods

  void accumulator->add_point()
  adb_query_results *accumulator->get_points()

The accumulator has to be responsible for keeping track of how many
points (total, or per track) there are so far; ->get_points() has to
make the final decision about which points to preserve. So sadly we
can't be completely on the side of the angels and have only one single
accumulator class, as POINT_QUERY is different from all the others.
(Though maybe we can with a suitably careful use of the "if"
construct).

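One possible shape for this, as a sketch only: an abstract Accumulator
with one implementation that keeps a single global budget of points
(the POINT_QUERY case) and one that keeps a budget per track (the
others). The Hit struct, the class names other than Accumulator, and
the use of std::vector in place of adb_query_results are all
assumptions made for illustration; selecting the top "trackNN" tracks
is left out, and a real version would probably prune in add_point()
rather than store every hit.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical hit record standing in for one entry of the results.
struct Hit {
    std::string track;   // key of the track holding the target point
    std::uint32_t qpos;  // position in the query
    std::uint32_t ipos;  // position in the target track
    double dist;         // smaller is assumed to mean closer
};

static bool closer(const Hit& a, const Hit& b) { return a.dist < b.dist; }

class Accumulator {
public:
    virtual ~Accumulator() {}
    virtual void add_point(const Hit& h) = 0;
    virtual std::vector<Hit> get_points() = 0;
};

// POINT_QUERY style: one budget of point_nn hits across the database.
class GlobalAccumulator : public Accumulator {
public:
    explicit GlobalAccumulator(std::size_t point_nn) : point_nn_(point_nn) {}
    void add_point(const Hit& h) override { hits_.push_back(h); }
    std::vector<Hit> get_points() override {
        std::vector<Hit> out(hits_);
        std::stable_sort(out.begin(), out.end(), closer);
        if (out.size() > point_nn_) out.resize(point_nn_);
        return out;
    }
private:
    std::size_t point_nn_;
    std::vector<Hit> hits_;
};

// Track-based queries: a budget of point_nn hits for each track.
class PerTrackAccumulator : public Accumulator {
public:
    explicit PerTrackAccumulator(std::size_t point_nn) : point_nn_(point_nn) {}
    void add_point(const Hit& h) override { by_track_[h.track].push_back(h); }
    std::vector<Hit> get_points() override {
        std::vector<Hit> out;
        std::map<std::string, std::vector<Hit> >::const_iterator it;
        for (it = by_track_.begin(); it != by_track_.end(); ++it) {
            std::vector<Hit> hits(it->second);
            std::stable_sort(hits.begin(), hits.end(), closer);
            if (hits.size() > point_nn_) hits.resize(point_nn_);
            out.insert(out.end(), hits.begin(), hits.end());
        }
        return out;
    }
private:
    std::size_t point_nn_;
    std::map<std::string, std::vector<Hit> > by_track_;
};
```

The final decision about which points survive lives entirely in
get_points(), which is the division of labour described above.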
We don't have to alter the Reporter class at all. The query loop goes
roughly

  choose point pair
    if(everything OK with point pair)
      accumulator->add_point()
  loop

  results = accumulator->get_points()

  for match in results
    reporter->add_point(match)
  loop

  reporter->report()

This separation is engineered (ha) such that everything after the last
use of the accumulator doesn't need to be in libaudiodb; the return
value from audiodb_query() can be "results" in the above, and then the
command-line binary and SOAP server can do whatever weird mangling to
the results they want to.

We still need to be careful in the accumulator to defend against some
of the weird things that our query implementation might choose to do:
insert the same hit multiple times or some such.
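
A minimal sketch of one such defence, dropping repeated hits inside
add_point(). The Hit struct and DedupingAccumulator are hypothetical
names, and treating (track, qpos, ipos) as the identity of a hit is an
assumption about what "the same hit" means.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical hit record.
struct Hit {
    std::string track;
    std::uint32_t qpos;
    std::uint32_t ipos;
    double dist;
};

class DedupingAccumulator {
    // Assumed identity of a hit: same track, query and target positions.
    typedef std::tuple<std::string, std::uint32_t, std::uint32_t> Key;

public:
    // Returns true if the hit was new, false if it was a repeat.
    bool add_point(const Hit& h) {
        Key k(h.track, h.qpos, h.ipos);
        if (!seen_.insert(k).second)
            return false;  // already accumulated, ignore it
        hits_.push_back(h);
        return true;
    }

    const std::vector<Hit>& get_points() const { return hits_; }

private:
    std::set<Key> seen_;
    std::vector<Hit> hits_;
};
```

The same check could sit in front of either kind of accumulator, so the
query implementation is free to be sloppy about repeats.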