
XML Data Format: A Timeless Companion in Web Applications
The XML format has been around for quite a while—since 1998, if memory serves. If you’ve worked on web applications, you’ve certainly come across XML at some point.
XML (eXtensible Markup Language) is a data structure created to make data easily readable by both humans (us) and machines. It’s called a “markup” language because it uses tags to define the start and end of a data element.
Its structure is, in fact, quite simple. Entirely textual—meaning it’s human-readable. It’s widely used for exchanging data between applications or web services, among many other use cases.
There’s been a lot of talk about JSON replacing XML. My opinion? That might happen one day. But as of now, many legacy systems, current applications, and even new projects are still built using XML. I’ve always found XML elegant—but yes, it’s undeniably space-consuming (very verbose). And having to open and close every tag always felt a bit tedious. That said, I believe XML will remain present for quite some time.
How to Define an XML Data Structure
<tag> Jedi </tag> |
<tag> |
Jedi |
</tag> |
<code>001</code> <name>Luke Skywalker</name> |
<code> 001 </code> |
<name> Luke Skywalker </name> |

There is no theoretical limit for an XML. There are XML files with many megabytes (trivial) and even gigabytes (rare). Also, there is no limit to the number of tags and values. However, once you intend to store an XML in a column, obviously the XML size must be compatible with the maximum size of that column.
It is important to emphasize that XML is used for data exchange between processes and/or applications (API calls, for example), storage (for configurations, for example), and for data import and export (a list of issued invoices, for example). Only in the latter case is it conceivable to deal with large amounts of data and/or files.
The XML structure format adopted by MySQL follows the standard defined by the W3C (World Wide Web Consortium) called XPATH (XML Path Language), version 1.0 (1999). Currently, XPATH is at version 3.1. Learn more about XPATH 1.0.
How about swinging our lightsabers a bit? Let’s play with MySQL and XML.
First, let’s create a table to hold our XML:

MySQL malformed XML
Here, I preferred to use a TEXT column, but I could have used VARCHAR or any other data type that stores text properly and efficiently. Whoa! Text? String? But aren’t we talking about XML? Yes, as we said, XML is a TEXTUAL data structure. So, ultimately, it is a well-formatted “string.” Therefore, I can store XML in variables, columns, and files.
In the figure below, we have the assignment of a “string” to a user-defined session variable.


Once this “string” is properly formatted according to XPATH 1.0, let’s call it XML from this point onward.
Note that the tags are id, name, and order. Notice that each “tag” is contained or defined between the less-than and greater-than signs. The assigned value is found between the opening and closing “tag.” The closing “tag” has a “/” before its name, as seen:
Opening/start tag | Assigned value | Closing/end tag |
---|---|---|
<name> | Luke Skywalker | </name> |
Between “tags” we can have a space, multiple spaces, no space, line breaks, tabs. This, in fact, does not matter. Format your XML according to your needs: to facilitate human reading, or to save space to optimize machine reading.
Let’s create an XML with incorrect formatting, leaving a “tag” unclosed:

Clearly, the XML above is invalid because the <midichlorian> tag was opened, a value was assigned, but it was not properly closed with </midichlorian>. Jedi stuff—if it were Sith, would it have worked? Of course not!
Let’s remember that XML is a text, a “string.” However, note that we are using the MySQL session, where there is no XML data type. There are also no XML-type columns.
Therefore, there is no XML validation for columns or session variables. This is what we call weak validation, or no validation at all.
MySQL will always trust that the developer will make their best efforts to deliver correctly formatted XML, according to the implemented standard.
However, there is a less rudimentary way to work with XML, at least forcing validation on its typography. This is achieved using the exclusive data type for variables inside routines such as procedures, functions, and events. Let’s see how this works.

We created a very simple procedure that will show the content of a “tag”:
And we will use it:

In the first case, our XML did not have the expected “midichorians” tag, and it returned blank. Then, we submitted a malformed XML (without closing the “midichorians” tag), and it returned NULL. Additionally, we also received a “warning” message: “Incorrect XML value.”
It’s not a “mega” validation, throwing blinking red errors or beeping, but it is something.
Remember: There is no XML data type. Do your part: format the XML correctly.
This article will continue!
Visit our Blog
Learn more about databases
Learn about monitoring with advanced tools

Have questions about our services? Visit our FAQ
Want to see how we’ve helped other companies? Check out what our clients say in these testimonials!
Discover the History of HTI Tecnologia