AudioDBToDo, r1 - 17 Sep 2008 - 11:33:14 - MichaelCasey
* development of functionality

** exposure of all non-write functions over Web Services

radius query-type over SOAP. [DONE]

** matrix of possible queries

Three major types of result ranking are supported:

1 Averaged-Nearest Neighbours (a-NN)
2a Radius-bounded nearest neighbours (r-NN)
2b Approximate radius-bounded near neighbors (ar-NN); this is a-NN with LSH indexing

Averaged-NN and Radius-Bounded-NN use different algorithms to sort tracks, but both report nearest-neighbour points within tracks.
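The difference between the two rankings can be sketched as follows. This is an illustration only: the function names, the choice of k, and the scoring rules (mean of the k smallest point distances for a-NN, count of points inside the radius for r-NN) are assumptions, not a description of what audioDB actually does.

```python
from statistics import mean

def rank_averaged_nn(track_distances, k=3):
    """Rank tracks by the mean of their k smallest point distances (ascending)."""
    scores = {
        track: mean(sorted(dists)[:k])
        for track, dists in track_distances.items() if dists
    }
    return sorted(scores.items(), key=lambda item: item[1])

def rank_radius_nn(track_distances, radius):
    """Rank tracks by how many of their points lie within `radius` (descending)."""
    counts = {
        track: sum(1 for d in dists if d <= radius)
        for track, dists in track_distances.items()
    }
    hits = {track: n for track, n in counts.items() if n > 0}
    return sorted(hits.items(), key=lambda item: -item[1])

# Hypothetical query-to-point distances, grouped by track.
distances = {
    "trackA": [0.1, 0.9, 0.95, 1.0],    # one very close point, the rest far
    "trackB": [0.3, 0.35, 0.4, 0.45],   # consistently fairly close
}

averaged = rank_averaged_nn(distances, k=3)
bounded = rank_radius_nn(distances, radius=0.2)
```

With this data the averaged ranking puts trackB first, while the radius-bounded ranking with radius 0.2 returns only trackA. Both still report per-track nearest points, but the track ordering differs.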

The space of possible (sensible?) queries is larger than this, though working out a sensible abstraction might have to wait for more use cases. The various parameters are also not orthogonal: for example, a silence threshold should be applied to all queries or to none, if it makes sense at all.

Additionally, query by key (filename) might be important. [DONE by Michael]

** results

Need to sort out what the results mean: is the reported value a similarity or a distance score, etc.? Also, is it possible to support NN queries in a non-Euclidean space?

E.g. Embedding Earth-Mover's Distance in L1
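One case where such an embedding is exact: for two one-dimensional histograms with equal total mass on the same unit-spaced bins, the Earth-Mover's Distance equals the L1 distance between their cumulative sums. The sketch below covers only that special case. Embeddings for higher-dimensional distributions are generally approximate and are not shown, and the function names are illustrative.

```python
from itertools import accumulate

def emd_embed_1d(hist):
    """Map a 1-D histogram to its cumulative sums (the L1 embedding)."""
    return list(accumulate(hist))

def l1(u, v):
    """Plain L1 (Manhattan) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def emd_1d(p, q):
    """EMD between equal-mass 1-D histograms, computed as L1 in the embedded space."""
    return l1(emd_embed_1d(p), emd_embed_1d(q))

# Moving one unit of mass across two bins costs 2.
far = emd_1d([1, 0, 0], [0, 0, 1])
# Moving one unit of mass across one bin costs 1.
near = emd_1d([1, 0, 0], [0, 1, 0])
```

Once the vectors are embedded, an ordinary L1 nearest-neighbour search over them ranks items by EMD in this 1-D setting.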

** SOAP / URIs

Define a query data structure that can be serialised (preferably automatically) by SOAP for use in queries. QueryByKey solves most of this, but features, powers and restrict lists (keyLists) are not currently serialised.

Add support for serialising features over Web Services.

If we ever support inserting or other write functionality over SOAP, this will need doing for feature files (the same as queries) and for key lists too.
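As a starting point for discussion, a serialisable query structure might look like the sketch below. Every name in it (the `Query` class, its fields, the XML element names) is hypothetical, and it emits plain XML rather than a real SOAP envelope. It only shows that features, powers and a restrict list can round-trip through a text encoding.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import xml.etree.ElementTree as ET

@dataclass
class Query:
    # All field names are hypothetical, chosen for illustration.
    key: Optional[str] = None                                  # query by key (filename)
    features: List[List[float]] = field(default_factory=list)  # one vector per frame
    powers: List[float] = field(default_factory=list)
    key_list: List[str] = field(default_factory=list)          # restrict list

def to_xml(q: Query) -> str:
    """Serialise a Query to an XML string."""
    root = ET.Element("query")
    if q.key is not None:
        ET.SubElement(root, "key").text = q.key
    for vec in q.features:
        ET.SubElement(root, "featureVector").text = " ".join(repr(x) for x in vec)
    ET.SubElement(root, "powers").text = " ".join(repr(x) for x in q.powers)
    for k in q.key_list:
        ET.SubElement(root, "restrictKey").text = k
    return ET.tostring(root, encoding="unicode")

def from_xml(text: str) -> Query:
    """Rebuild a Query from the XML produced by to_xml."""
    root = ET.fromstring(text)
    return Query(
        key=root.findtext("key"),
        features=[[float(x) for x in (e.text or "").split()]
                  for e in root.findall("featureVector")],
        powers=[float(x) for x in (root.findtext("powers") or "").split()],
        key_list=[e.text or "" for e in root.findall("restrictKey")],
    )

example = Query(key="track1.wav",
                features=[[0.5, 1.25], [2.0, 3.0]],
                powers=[-40.0, -35.5],
                key_list=["a.wav", "b.wav"])
round_tripped = from_xml(to_xml(example))
```

The same structure could carry feature files and key lists for inserts, should write functionality ever be exposed.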

** Memory management tricks

For non-LSH search, investigate whether madvise() tricks improve performance on any OSes. Also, maybe investigate a specialized use of MapViewOfFile on win32 to make it tolerable on that platform.
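A quick way to experiment with access-pattern hints before touching the C++ code is a small script like the one below. It is a sketch under assumptions: it uses Python's `mmap.madvise` (available from Python 3.8 on platforms that provide madvise()), the helper name is made up, and the linear byte scan is only a stand-in for a real non-LSH search.

```python
import mmap
import os
import tempfile

def scan_with_advice(path, advice_name="MADV_SEQUENTIAL"):
    """Map `path` read-only, apply an madvise() hint if available, and sum its bytes."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            advice = getattr(mmap, advice_name, None)
            if advice is not None and hasattr(m, "madvise"):
                m.madvise(advice)          # a hint only; the kernel may ignore it
            total = 0
            for offset in range(0, len(m), mmap.PAGESIZE):
                total += sum(m[offset:offset + mmap.PAGESIZE])
            return total

# Demo on a throwaway file of 10000 bytes, each equal to 1.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes([1]) * 10000)
demo_total = scan_with_advice(tmp.name)
os.unlink(tmp.name)
```

Timing the same scan with `MADV_SEQUENTIAL`, `MADV_RANDOM` and no hint on a realistically sized file would show whether the hint matters on a given OS.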

** LSH

DONE

** RDF (not necessarily related to audioDB)

Export the results of our experiments (kept in an SQL database) as RDF, so that people can infer stuff if they know enough about our methods.

Possibly also write an export routine for exporting an audioDB as RDF. And laugh hollowly as XML parsers fail completely to ingest such a monstrous file.

* architectural issues

** more safety

A couple of areas are not yet safe against runtime faults.

The LARGE_ADB format supports millions of tracks. In the non-LARGE_ADB format, large databases might well end up writing off the end of the various tables (e.g. track, l2norm).

Transactionality is important: the last things to be updated on insert should be the free pointers (dbH->length, dbH->numFiles, maybe others), so that if something goes wrong in the meantime the database is not left in an inconsistent state. [Michael thinks that this is DONE. Needs testing in all cases.]
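The "update the free pointers last" rule can be illustrated with a toy file format. The sketch assumes a made-up layout (an 8-byte length header followed by payload bytes) that is not the audioDB format. It shows only the ordering: write the data, flush it, then publish it by updating the header.

```python
import os
import struct
import tempfile

HEADER = struct.Struct("<Q")   # hypothetical header: one 64-bit "length" field

def create_db(path):
    with open(path, "wb") as f:
        f.write(HEADER.pack(0))

def insert(path, payload):
    """Append `payload`, and only then publish it by updating the header."""
    with open(path, "r+b") as f:
        (length,) = HEADER.unpack(f.read(HEADER.size))
        f.seek(HEADER.size + length)
        f.write(payload)                  # step 1: write past the valid region
        f.flush()
        os.fsync(f.fileno())              # data is on disk before it is published
        f.seek(0)
        f.write(HEADER.pack(length + len(payload)))   # step 2: free pointer last
        f.flush()
        os.fsync(f.fileno())

def read_all(path):
    """Return only the bytes the header declares valid."""
    with open(path, "rb") as f:
        (length,) = HEADER.unpack(f.read(HEADER.size))
        return f.read(length)

with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "demo.db")
    create_db(db)
    insert(db, b"abc")
    with open(db, "ab") as f:             # simulate a crash: data written, header not updated
        f.write(b"GARBAGE")
    after_crash = read_all(db)            # readers still see only committed bytes
    insert(db, b"de")                     # the next insert overwrites the orphaned bytes
    after_recovery = read_all(db)
```

A regression test for the real database could follow the same pattern: interrupt an insert at each step and check that the header still describes a consistent state.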

** API vs command-line

API version 1 coming soon.

But most functionality is still accessed by faking command-line calls. Having the "business logic" run by the constructor is also a little bit weird.

* regression (and other) tests

** Command line interface

There is now broad coverage of the audioDB logic, with the major exceptions of the batch insert command and the specification of different keys on import.

** SOAP

The shell's support for wait() and equivalents is limited, so there are "sleep 1"s dotted around to attempt to avoid race conditions. Find a better way. Similarly, using SO_REUSEADDR in bind() is a hack that ought not to be necessary just to run the same test twice...
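One possible replacement for the "sleep 1" calls is to poll until the server actually accepts connections, and to let the OS pick a free port so that reruns never collide. The helpers below are a sketch with made-up names. The second one applies only if the server under test can be told which port to use, which is an assumption here.

```python
import socket
import time

def wait_for_port(host, port, timeout=5.0, interval=0.05):
    """Poll until a TCP connection to (host, port) succeeds or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)          # not listening yet; try again shortly
    return False

def listen_on_free_port(host="127.0.0.1"):
    """Bind to port 0 so the OS picks an unused port; return (socket, port)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))
    srv.listen(1)
    return srv, srv.getsockname()[1]

# Demo: a listening socket stands in for the server under test.
srv, port = listen_on_free_port()
ready = wait_for_port("127.0.0.1", port)
srv.close()
```

Polling bounds the wait by the actual startup time instead of a fixed second, and a fresh port per run removes the need for SO_REUSEADDR in the tests.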

** Locking

The fcntl() locking should be good enough for our uses. Investigate whether it is in fact robust enough (including that EAGAIN workaround for OS X; read the kernel source to find out where that's coming from and report it if possible).
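For experimenting with the locking behaviour outside the main code, a generic retry loop around a non-blocking fcntl lock looks like the sketch below. What the existing OS X EAGAIN workaround does is not described here, so this is not a reproduction of it. The function names, attempt count and delay are all assumptions.

```python
import errno
import fcntl
import tempfile
import time

def lock_with_retry(fileobj, exclusive=True, attempts=50, delay=0.1):
    """Take a non-blocking fcntl lock, retrying while it reports EAGAIN/EACCES."""
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    for _ in range(attempts):
        try:
            fcntl.lockf(fileobj, mode | fcntl.LOCK_NB)
            return True
        except OSError as exc:
            if exc.errno not in (errno.EAGAIN, errno.EACCES):
                raise                     # a real error, not contention
            time.sleep(delay)
    return False

def unlock(fileobj):
    fcntl.lockf(fileobj, fcntl.LOCK_UN)

# Demo on a throwaway file opened for writing (needed for an exclusive lock).
with tempfile.TemporaryFile() as f:
    got_lock = lock_with_retry(f)
    unlock(f)
```

fcntl locks are held per process, so a robustness test needs at least two processes contending for the same file. A single process will always succeed, as in the demo.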

** Benchmarks

Get together a realistic set of usage cases, preferably testing each of the query types, and benchmark them automatically. This is basically a prerequisite of any performance work.
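A minimal harness for such automatic benchmarks might look like the following. The workloads are stand-ins with no relation to real query costs. In practice each case would invoke an audioDB query of the named type, and the case names here are only labels.

```python
import time
from statistics import median

def benchmark(cases, repeats=5):
    """Run each named callable `repeats` times; return median wall-clock seconds."""
    results = {}
    for name, fn in cases.items():
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            fn()
            timings.append(time.perf_counter() - start)
        results[name] = median(timings)
    return results

# Stand-in workloads; real cases would run one query of each type.
cases = {
    "a-NN": lambda: sorted(range(10000), reverse=True),
    "r-NN": lambda: sum(range(10000)),
}
report = benchmark(cases)
```

Storing the medians per revision would make performance regressions visible alongside the functional tests.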

-- MichaelCasey - 17 Sep 2008
