changeset 408:f0a69693eaef api-inversion

The lesser of two evils, part 1. Most of the body of audiodb_insert_datum() will apply to "LARGE_ADB"-type insertions: checking for the right flags, checking for enough space free, synchronizing the header. Wouldn't it be nice if we could reuse all that code (or at least the bits that apply) without one horrible almost-identical cut-and-paste job (see batchinsert_large_adb(), or if that's not compelling enough, the four almost-identical query loops from before the Great Refactoring). Well, yes, it would. Sadly C makes it mildly difficult, because its functions are explicitly typed (so we can't pass arbitrary arguments of other types, even if they're ABI-compatible), while its macros are textual (which makes writing and maintaining them horrible). The thought of a union argument was briefly entertained and then discarded as being just Too Weird. So, instead, (ab)use the oldest trick in the book: void *. Define an adb_datum_internal_t which has void * instead of double *; the intention is that this internal data type can be constructed both from an adb_datum_t and some notional adb_reference_t (which looks very much like an adb_insert_t at the time of writing, with char * structure entries representing filenames). This adb_datum_internal_t structure is very much an internals-only thing, so put its definition in the internals header. Call what was previously audiodb_insert_datum() a new function audiodb_insert_datum_internal(), made static so that really no-one is tempted to call it other than ourselves. audiodb_insert_datum() is then trivial in terms of this new function, if stupidly tedious. (If we were playing dangerously, we could just perform a cast, but relying on the fact that sizeof(double *) = sizeof(void *) would almost certainly end up biting when we least expect. Incidental inclusion in this patch, since I noticed it at the time: actually check for the O2_FLAG_L2NORM before scribbling all over the l2norm table. Somewhat unsurprisingly, there are as yet no tests to defend against this (harmless, as it turns out) erroneous behaviour.
author mas01cr
date Tue, 09 Dec 2008 20:53:39 +0000
parents a82a2d9b2451
children 99e6cbad7f76
files audioDB-internals.h insert.cpp
diffstat 2 files changed, 46 insertions(+), 21 deletions(-) [+]
line wrap: on
line diff
--- a/audioDB-internals.h	Tue Dec 09 20:53:34 2008 +0000
+++ b/audioDB-internals.h	Tue Dec 09 20:53:39 2008 +0000
@@ -1,3 +1,12 @@
+typedef struct adb_datum_internal {
+  uint32_t nvectors;
+  uint32_t dim;
+  const char *key;
+  void *data;
+  void *times;
+  void *power;
+} adb_datum_internal_t;
+
 struct adb {
   char *path;
   int fd;
--- a/insert.cpp	Tue Dec 09 20:53:34 2008 +0000
+++ b/insert.cpp	Tue Dec 09 20:53:39 2008 +0000
@@ -48,14 +48,17 @@
  * 10. sync adb->header with disk.
  *
  * Step 10 essentially commits the transaction; until we update
- * header->length, nothing will recognize the newly-written data.
- * In principle, if it fails, we should roll back, which we can in
- * fact do on the assumption that nothing in step 9 can ever fail;
- * on the other hand, if it's failed, then it's unlikely that
- * rolling back by syncing the original header back to disk is going
- * to work desperately well.
+ * header->length, nothing will recognize the newly-written data.  In
+ * principle, if it fails, we should roll back, which we can in fact
+ * do on the assumption that nothing in step 9 can ever fail; on the
+ * other hand, if it's failed, then it's unlikely that rolling back by
+ * syncing the original header back to disk is going to work
+ * desperately well.  We should perhaps take an operating-system lock
+ * around step 10, so that we can't be interrupted part-way through
+ * (except of course for SIGKILL, but if we're hit with that we will
+ * always lose).
  */
-int audiodb_insert_datum(adb_t *adb, adb_datum_t *datum) {
+static int audiodb_insert_datum_internal(adb_t *adb, adb_datum_internal_t *datum) {
 
   off_t size, offset, nfiles;
   double *l2norm_buffer, *lp, *dp;
@@ -131,22 +134,24 @@
   write(adb->fd, datum->key, strlen(datum->key)+1);
 
   /* 8. if O2_FLAG_L2NORM, compute norms and fill in table; */
-  l2norm_buffer = (double *) malloc(datum->nvectors * sizeof(double));
-
-  /* FIXME: shared code with audiodb_norm_existing() */
-  dp = datum->data;
-  lp = l2norm_buffer;
-  for(size_t i = 0; i < datum->nvectors; i++) {
-    *lp = 0;
-    for(unsigned int k = 0; k < datum->dim; k++) {
-      *lp += (*dp)*(*dp);
-      dp++;
+  if(adb->header->flags & O2_FLAG_L2NORM) {
+    l2norm_buffer = (double *) malloc(datum->nvectors * sizeof(double));
+    
+    /* FIXME: shared code with audiodb_norm_existing() */
+    dp = (double *) datum->data;
+    lp = l2norm_buffer;
+    for(size_t i = 0; i < datum->nvectors; i++) {
+      *lp = 0;
+      for(unsigned int k = 0; k < datum->dim; k++) {
+        *lp += (*dp)*(*dp);
+        dp++;
+      }
+      lp++;
     }
-    lp++;
+    lseek(adb->fd, adb->header->l2normTableOffset + offset / datum->dim, SEEK_SET);
+    write(adb->fd, l2norm_buffer, sizeof(double) * datum->nvectors);
+    free(l2norm_buffer);
   }
-  lseek(adb->fd, adb->header->l2normTableOffset + offset / datum->dim, SEEK_SET);
-  write(adb->fd, l2norm_buffer, sizeof(double) * datum->nvectors);
-  free(l2norm_buffer);
 
   adb->keys->insert(datum->key);
   adb->header->numFiles += 1;
@@ -158,6 +163,17 @@
   return 1;
 }
 
+int audiodb_insert_datum(adb_t *adb, adb_datum_t *datum) {
+  adb_datum_internal_t d;
+  d.nvectors = datum->nvectors;
+  d.dim = datum->dim;
+  d.key = datum->key;
+  d.data = datum->data;
+  d.times = datum->times;
+  d.power = datum->power;
+  return audiodb_insert_datum_internal(adb, &d);
+}
+
 bool audioDB::enough_per_file_space_free() {
   unsigned int fmaxfiles, tmaxfiles;
   unsigned int maxfiles;