PL/SQL Adventures with MariaDB

Four years ago MariaDB announced its Oracle compatibility mode; according to its own statements, it allows 80-90% of the existing PL/SQL code to run unmodified on MariaDB.

I was recently tasked with writing a minimal implementation of the DBMS_XMLGEN PL/SQL package for MariaDB. During this endeavor I learned that the above claim is likely hugely exaggerated and that the claimed “compatibility” is more a marketing thing than a one for the real world.

To begin with, the compatibility is achieved mostly by converting PL/SQL syntax into the MariaDB native one. This means that many features that are used via PL/SQL on Oracle remain completely unavailable on MariaDB because the underlying layer lacks one or another function. As we are going to see, this has a huge impact on PL/SQL portability.

Back to my task, DBMS_XMLGEN is a relatively small package that does one basic thing: takes a query, runs it and returns the result as XML – labeling each cell with the column name and grouping rows under a row tag. All-in-all, this should be easy to redo. While the code of the package is proprietary, the interface is not (and after some infamous litigation like SCO vs IBM and Oracle vs Google we know that these specifications can be used to create an alternative, compatible implementation).

One interesting feature of the original a package is that it is stateful. An instance, called context, is created for each incoming query; it can later be amended with various flags that affect the produced XML layout. Naturally, the context was the first thing that needed implementation – and here came the first hiccup. To make any future changes in the internal workings of a package invisible to the apps that use it, it is a common technique to define one or more custom data type; while it is usually mapped internally 1:1 into something standard, this mapping can be altered if needed without the external app knowing of (and adapting itself to) the change. However, MariaDB does not allow for custom data type at all. We had no alternative but to drop the custom data type for the context and used a standard one. Thus, at the very first step, we already lost full compatibility with the original. But little did we know it was just the beginning.

We eventually settled on the context being a temporary table, thus allowing automatic cleanup when the session is closed. However, this brought up the question of running parallel contexts inside the same session. The original package does not say this is possible, but also it does not say it is not; hence we had to be able to do it. We also wanted to avoid altering this table every time support for yet another new parameter was added to the context. We therefore either had to go with one table per context, writing each parameter on its own row (essentially making the table a key/value store), or by using a JSON string. The first was impossible as it meant having dynamic SQL (due to the randomised table name for each context) – but the creation of the context in the original package happens in a stored function; and MariaDB, for whatever reason, does not allow the execution of dynamic SQL anywhere in the path of a function (or a trigger too) – only in procedures. So, our last choice was to use JSON – as MariaDB has reasonable (although not perfect) JSON support… and we could not have guessed that a hidden bug here would bite us later.

Once the context was established, creating the procedures to update it was a breeze. It seemed that the hard part was behind us; it was quite clear what we should do next: run the custom query, get the list of columns and write XML. This would not have taken more than an hour on any decent programming language… but not on the PL/SQL “compatibility” that MariaDB has.

Running the query itself posed a problem that we knew about: it was dynamic SQL and again the XML extraction was done in the original package in a function. We could not do this in MariaDB. The decision we made was to add one more stored procedure that has to be called separately prior to the XML extraction; it would do all the heavy lifting – run the query, produce the XML and store it in the context for later fetching.

The next quite unexpected problem turned out to be the column list. As the query to run could not be known in advance, we expected to retrieve the list of column names from the result set. While this is possible and normal if programming on the SQL client side, it turned out to be impossible in MariaDB PL/SQL. Oracle tackles the issue by providing a package named DBMS_SQL, which MariaDB lacks. More importantly, MariaDB lacks a basic feature for this package – the so-called numeric cursors (we were yet to see how much really MariaDB lags behind Oracle in terms of cursor support). Seeing that it was impossible to obtain the column names from the cursor, we turned to the idea of materialising the query. To our rescue came the CREATE TABLE AS SELECT feature. Indeed, as the queries to be run could always be expected to be data retrieval, simply prepending a “CREATE TABLE AS” opened the possibility to get the result set in a table, from which to extract the column list. But the quest was far from over yet.

As we love the self-cleanup of temporary tables, we naturally attempted to use such one to materialise the query… and failed miserably. It turned out that MariaDB has no way to obtain the list of columns of a temporary table from its INFORMATION_SCHEMA; something that its sibling, MySQL, had had for years. Even more confusingly, MariaDB did have a similar feature for about two months some four years ago, but it was promptly revoked – supposedly to be replaced with something better which in reality never came to be… leaving yet another area in which MariaDB is behind MySQL (hello from my favourite lateral join!). While it is possible to use the SHOW COLUMNS command in a session even against a temporary table, MariaDB (for unknown reason) prohibits its usage in stored procedures. So, we had to give up on the temporary table here and resort to using a standard one, followed by proper cleanup when it is no longer needed.

Having somehow sorted out the retrieval of the column names, we approached what turned out to be the biggest hurdle: the retrieval of the data. Yes, we had a table and we knew the columns names… but we could not read the rows. Why? Since this table was generated by unknown SQL statement, we could not have a static cursor on it. Naturally, we would turn to using a dynamic cursor… just that MariaDB does not have such. Its cursors are only static. Not all is lost, one would say – it is possible to use EXECUTE IMMEDIATE and do a bulk write to a variable array; this would give us all rows when the table layout is not known. But no luck again – MariaDB has no array support whatsoever – not even for static ones.

At this point we seemed to have ran out of ammunition. Yes, it was still possible to do kinky stuff like write the data to a delimited file (and the table layout to another one), then call a user-defined function that will read, process the file, create the XML and give it back to us – but that seemed way too clumsy. We needed something a bit simpler… and we managed to find one – even if turned out to be simpler just a bit.

Since the result from the original query was already present in a table, we decided to repeat the trick we did above: knowing the list of columns it has, do a CREATE TABLE AS SELECT and read all columns, concatenating the data on the fly into one string and adding at the same time the XML tags for each column. This meant yet another table, but being a single-column, we could give this column some hard-coded name and declare a static cursor on it. Bingo! This way we would finally be able to read the XML rows one by one, wrap each one in a row tag and be done. From this final single-column table we only needed the rows, so we again created it as temporary one.

The last step was more tedious than challenging: when doing the CONCAT() of all the columns from the permanent table, we also needed to account for the set NULL handling – and the original PL/SQL package provides thee options: skip NULL cells completely, create an empty XML tag or create an empty tag with a special xsi:nil=”true” attribute. This added a bunch of IFs, IFNULLs and concatenations, but finally, we were done. Proof-reading this SQL query is hardly pleasant, but it does the work. Do you remember the JSON bug mentioned in the beginning? It was the time for one last MariaDB hiccup. The context has two similar settings, one for managing row tags in the XML and another for managing row set tags. There is a default value for each of these and you could also set your custom ones; should you not want such a tag at all, you should set the context property to NULL… and here MariaDB failed us again. A bug in its JSON processing functions mans it does distinguish between the string “null” and a NULL value. As with so many other things, it turned out to be a known bug which MariaDB developers had been contemplating for years whether to fix it or not. Facepalm. The only thing we could do here was to document the bug and the declare it a known issue in our implementation.

As a summary of this journey, here is a list of what we found MariaDB to lack in its PL/SQL support – all omissions that make their claim for “compatibility” sound dubious, at least, in the real world:

  • No DBMS_* packages of any kind. Even the simplest ones you’ll need to re-write from scratch yourself.
  • No custom datatype support; forget about using TYPE.
  • No dynamic cursors. Forget about using OPEN CURSOR FOR.
  • No reference cursors. Forget about passing a cursor from one procedure to another.
  • No numeric cursors. Forget about getting the column list from the cursor.
  • No dynamic SQL in the execution path of a function or a trigger. This seems quite arbitrary, as the dynamic SQL is allowed in the execution path of a stored procedure.
  • No arrays, static or virtual. Forget about reading a row with a layout that is not static (even if you know the column count at runtime and could cat them all into something like a VARCHAR).
  • JSON processing functions cannot tell the string “null” from a NULL value.

Curious what we ended up with? Here it is: https://github.com/assen-totin/dbms_xmlgen

This entry was posted in Нули и единици. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.