mysqldump-vo: VOTables for MySQL!
Astronomers have been using the FITS data format for a long time to save images and other table style data. This de facto standard was developed by NASA especially for the needs of astronomers and scientists. However, one can argue that FITS is becoming rather old (it has been there since 1993) and might not be adequate anymore. Even though the International Virtual Observatory Alliance (IVOA – a body standardising various data related issues in astronomy) recognises FITS as a standard, it provides a more “modern” approach: VOTables (for description, see standard document).
VOTable is a standardised XML schema for holding multi-dimensional data, such as tables and images. In contrast to the FITS format, it provides the possibility to provide extensive meta data (also standardised and eventually machine readable) for all the data in the VOTable. Additionally it enables hierarchical data structures and allows the data to be either stored in ASCII format, or binary (using base64 encoding).
It is therefore important to provide results from MySQL queries saved in tables in the VOTable format. I have therefore extended the “mysqldump” tool provided with MySQL to export data as VOTables in either ASCII or binary format. Since “mysqldump” already provided XML support, this was a rather straight forward task.
The VOTable enabled version of “mysqldump” I provide is based on the MySQL 5.5 stack and is compatible with all versions greater than 5.5. To compile this tool, the source code of MySQL is needed. You can find all relevant information on the GitHub project page:
mysql_validateSQL: UDF for validating SQL
MySQL has the sometimes annoying practise, that it does not check the syntax of a query without already executing parts of it. This is especially the case, when sub queries are involved. Other databases however allow a query to be checked before it is executed. In large expensive queries, especially in a parallel setting, it is thus important to have the syntax of the query checked before anything is executed.
I therefore created a UDF exposing those parts of the query execution routines, that are responsible for checking the syntax of the query and the availability and ACLs of the tables. Since these routines are from the deep internals of MySQL, they can change from version to version at any time. I have chosen to base the code on the 5.5 version and have tested the UDF on version 5.6 as well. It remains to be seen, how they will work with future versions or MariaDB.
The UDF itself takes the SQL query as string and then returns the MySQL error message where applicable.
mysql_udf: My useful MySQL UDF collection!
My MySQL UDF collection features many UDFs interesting for the astronomical mind and will be extended with time. At the moment, I implemented the Vincentry formula for angular distances and more spherical trigonometric functions will follow (an initial version of my port of pgSphere for MySQL is available on GitHub, however it is not yet fully finished and needs some more testing).
Further, I provide UDFs for generating up to 10 or 20-dimensional hilbert curve keys for faster spatial indexing (as has been used in the MultiDark and Millenium Simulation Database). This makes use of my N-dimensional hilbert curve generation library implementing the Chenyang, Hong, Nengchao 2008 IEEE algorithm.
Additionally I provide UDFs for distributed streamed STDDEV calculations using the Welford 1962 and Chan et al. 1979 algorithms. This UDF is also part of PaQu, the parallel query execution facility.
Other semi useful UDFs provide a sleep function (doing nothing for a specified amount of seconds) and a strrpos implementation.
Many UDFs to make MySQL astronomy proof: MORE TO COME