Most queries run by scientists on our databases take quite some time to execute. Scanning large tables can take tens of minutes, and complex joins between various science tables can run for hours. Unfortunately, indexing can only be applied to such queries in limited circumstances, since we want scientists to run any query they need for their analysis rather than choose from ready-made answers. We cannot build indexes for every possible question a scientist might ask our databases.
Usage statistics from our largest database, the MultiDark Database, make this point clear. Of the more than one million queries submitted to it, around 71% take less than one second. These are the queries we have created indexes for, and they correspond mainly to the example queries given in the documentation. A further 21% of all queries take less than 10 seconds to execute, 4% less than 100 seconds, and the remaining 4% longer. So even though many queries finish within seconds, some take much longer.
Imagine a user of our SQL query facility who starts many long-running queries at once, while another user submits queries with short execution times on the same instance. The long-running queries might end up locking the database, and the user with the short queries has to wait for the long ones to finish. You might argue that a good database system can handle this by executing queries concurrently and switching between the various query tasks from time to time (MS SQL Server is observed to do this). But task switching takes time: new data needs to be read from disk into memory, and things get complicated. Prioritising short queries over longer ones is therefore crucial.
A much stronger argument for queuing long queries, however, is I/O performance (keep in mind that our tables are too large to be kept in memory). A long query with a full table scan performs best if the data on disk can be read sequentially. In such cases we obtain peak I/O read performance of up to 1 GB/s on our MyISAM tables (using RAID 6), depending strongly on the type of query. If multiple table scans are performed by the database simultaneously, the sequential reads become random reads, since various positions on the disk need to be read at the same time, forcing the disks to seek. The best I/O (and thus query execution) performance is therefore achieved when tables are read sequentially and only one query executes at a time. Experiments on our MyISAM tables showed that a mean I/O performance of around 400 MB/s can be achieved with two simultaneously executing queries, and that the average performance does not increase much when MySQL is restricted to executing only one query at a time.
Finally, a scientist clearly does not want to sit on the web application until her query has finished executing. Asynchronous query execution is therefore a MUST for any scientific database.
A query queue inside MySQL
I have decided to use the MySQL plugin API to write a queue daemon that runs from within MySQL. The daemon runs queues based on a priority value, which is determined by the user group and the queue. Different user groups can have different priorities, so that, for instance, an admin’s job is always picked first from the queue. It is further possible to define multiple queues with different priorities and maximum run times, similar to the PBS queueing system. The queue daemon also ensures that a query exceeding its queue’s maximum run time is killed.
At the moment, the queue is implemented in a “highest priority goes first” fashion. In the future, I can imagine making this behaviour more customisable, including some crude form of load balancing. One problem with the current approach is that highly prioritised long jobs execute in parallel; it would be nice to let shorter jobs with lower priority run before the longer ones.
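As a rough sketch, the daemon’s scheduling decision boils down to something like the following query (the table and column names here are illustrative assumptions, not the plugin’s actual schema):

```sql
-- Hypothetical sketch of "highest priority goes first": pick the pending
-- job with the highest priority, falling back to submission order (FIFO)
-- among jobs of equal priority.
SELECT id, query
  FROM mysql.queue_jobs           -- illustrative table name
 WHERE status = 'pending'
 ORDER BY priority DESC,          -- highest priority goes first
          time_submit ASC         -- FIFO within a priority level
 LIMIT 1;
```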
The query queue maintains four tables in the “mysql” database: two tables defining the different queues and user groups, a table holding the currently pending/running jobs, and a job history table. Jobs are submitted through “user defined functions” (UDFs), which insert them into the corresponding table. It is also possible to insert jobs directly into that table, but this is strongly discouraged. Data from the queue, however, is read through simple SQL SELECT statements.
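Submitting a job might then look roughly like this (the UDF name, its arguments, the table names, and the example query are assumptions for illustration; see the project’s documentation for the actual interface):

```sql
-- Submit a query to the "long" queue via a hypothetical UDF; the UDF
-- inserts a corresponding row into the pending-jobs table.
SELECT queue_addJob('long', 'SELECT COUNT(*) FROM MDR1.FOF');

-- Inspect pending/running jobs and past jobs with plain SELECTs:
SELECT * FROM mysql.queue_jobs;
SELECT * FROM mysql.queue_history;
```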
Since jobs executed through the queue run without a client connection they could write results back to, each query has to be wrapped in a CREATE TABLE statement, i.e. results need to be written to a table. The queue tries its best to add this CREATE TABLE statement to the submitted query, but you can also choose to provide it yourself.
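In other words, a submitted query like the first statement below is effectively rewritten into the second before execution (the table, column, and result-table names are made up for illustration):

```sql
-- What the user submits:
SELECT x, y, z FROM particles WHERE snapnum = 85;

-- What the queue effectively executes, so the results survive without
-- a client connection and can be fetched later with a plain SELECT:
CREATE TABLE userdb.job_42_result
  SELECT x, y, z FROM particles WHERE snapnum = 85;
```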
The MySQL query queue can be found on my GitHub account, and I welcome any contributions.