Currently supported query types:

O2_POINT_QUERY
  * dot_product

  Find and report, from the database, up to "pointNN"
  near-neighbours of length-1 query sequences.

O2_TRACK_QUERY
  * dot_product

  Find, in each track, up to "pointNN" near-neighbours of length-1
  query sequences, reporting the top "trackNN" tracks, ordered by
  the average distance of the pairwise matches.

O2_SEQUENCE_QUERY
  - radius, + radius
  * euclidean_normed, euclidean

O2_N_SEQUENCE_QUERY
  - radius, + radius
  * euclidean_normed, euclidean

  Find, in each track, up to "pointNN" near-neighbours of query
  sequences.  Report the results from the "trackNN" top tracks,
  where the tracks are ordered by the average distance of the
  retrieved pairwise matches.  The difference between SEQUENCE and
  N_SEQUENCE is that the SEQUENCE case reports only the average,
  while the N_SEQUENCE reports the individual points too.

  (Ordering by average is arbitrary, and it's not hard to construct
  cases where it is suboptimal.  The two cases where it is not
  arbitrary are when pointNN is 1, and when trackNN is equal to the
  number of files in the database.)

O2_ONE_TO_ONE_N_SEQUENCE_QUERY
  + radius
  * euclidean_normed

  For all applicable query sequences, find and report the closest
  target instance point.  Each query sequence is responsible for
  exactly one result.

  (This feels like it should be more orthogonal than a separate
  query type; the restriction on using a target instance point only
  once in a match seems like it should compose with the sequencing
  query above.)

Plan:

We have

  reporter->add_point(),
  reporter->report().

Insert into the whole shebang a new class Accumulator, with methods

  void accumulator->add_point()
  adb_query_results *accumulator->get_points()

The accumulator has to be responsible for keeping track of how many
points (total, or per track) there are so far; ->get_points() has to
make the final decision about which points to preserve.  So sadly we
can't be completely on the side of the angels and have only one single
accumulator class, as POINT_QUERY is different from all the others.
(Though maybe we can with a suitably careful use of the "if"
construct).

We don't have to alter the Reporter class at all.  The query loop goes
roughly

  choose point pair
  if(everything OK with point pair)
    accumulator->add_point()
  loop

  results = accumulator->get_points()

  for matches in results
    reporter->add_point(match)
  loop

  reporter->report()

This separation is engineered (ha) such that everything after the last
use of the accumulator doesn't need to be in libaudiodb; the return
value from audiodb_query() can be "results" in the above, and then the
command-line binary and SOAP server can do whatever weird mangling to
the results they want to.

We still need to be careful in the accumulator to defend against some
of the weird things that our query implementation might choose to do:
insert the same hit multiple times or some such.
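
As a rough illustration of the plan above, here is a minimal sketch of
what a per-track accumulator for the SEQUENCE-style queries might look
like.  Everything beyond the method names add_point() and get_points()
is an assumption made for the example: the Match record, its field
names, the PerTrackNNAccumulator class and the std::vector return type
are hypothetical stand-ins, not the real adb_query_results layout.  It
also assumes smaller distances are better, so it would not suit the
dot_product queries as written.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <set>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical result record; the real adb_query_results may differ.
struct Match {
  std::string track;  // key of the track holding the target point
  unsigned qpos;      // position in the query
  unsigned ipos;      // position in the target instance
  double dist;        // distance; assumed smaller-is-better
};

// Order by distance first, then by identity, so that a std::set keeps
// matches sorted and ignores a hit that is inserted twice.
inline bool operator<(const Match &a, const Match &b) {
  return std::tie(a.dist, a.track, a.qpos, a.ipos) <
         std::tie(b.dist, b.track, b.qpos, b.ipos);
}

class Accumulator {
public:
  virtual ~Accumulator() {}
  virtual void add_point(const Match &m) = 0;
  virtual std::vector<Match> get_points() const = 0;
};

// Keeps up to point_nn matches per track; get_points() then keeps the
// track_nn tracks with the lowest average distance.
class PerTrackNNAccumulator : public Accumulator {
public:
  PerTrackNNAccumulator(std::size_t point_nn, std::size_t track_nn)
      : point_nn_(point_nn), track_nn_(track_nn) {
    assert(point_nn > 0 && track_nn > 0);
  }

  void add_point(const Match &m) override {
    std::set<Match> &best = per_track_[m.track];
    best.insert(m);  // no-op if this exact hit is already present
    if (best.size() > point_nn_)
      best.erase(std::prev(best.end()));  // drop the current worst
  }

  std::vector<Match> get_points() const override {
    // Rank tracks by the average distance of their retained matches.
    std::vector<std::pair<double, std::string> > ranked;
    for (const auto &kv : per_track_) {
      double sum = 0.0;
      for (const Match &m : kv.second) sum += m.dist;
      ranked.emplace_back(sum / kv.second.size(), kv.first);
    }
    std::sort(ranked.begin(), ranked.end());
    if (ranked.size() > track_nn_) ranked.resize(track_nn_);

    // Emit the surviving matches, best track first.
    std::vector<Match> out;
    for (const auto &r : ranked) {
      const std::set<Match> &best = per_track_.at(r.second);
      out.insert(out.end(), best.begin(), best.end());
    }
    return out;
  }

private:
  std::size_t point_nn_, track_nn_;
  std::map<std::string, std::set<Match> > per_track_;
};
```

In the query loop, each acceptable point pair would go through
add_point(), and the vector returned by get_points() would then be fed
match by match to reporter->add_point().  Note that this sketch only
rejects exact duplicates (same track, positions and distance); a
POINT_QUERY accumulator would presumably need a different subclass, or
the "if" mentioned above.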