What is an XML Database?

An eXtensible Markup Language (XML) database is a software system that permits data storage in XML format. XML is a meta-markup language used to manage data which employs user customizable tags to organize information. The flexibility of the language, which allows the creation of custom data structures and organizational systems, has led to its widespread use to exchange data in multiple forms. XML databases are often used in applications such as informational portals, document exchanges, and product catalogs.

It is generally considered more efficient in terms of data conversion costs to use an XML database due to the widespread use of this language in data transportation. There are two major categories of these databases: XML-enabled databases and Native XML databases (NXD). Each type of XML database is used to store different types of data.

An XML-enabled database funnels data into a traditional relational database in an XML format. The data is translated for storage, and returned to its initial format upon output. This type of database is used to store data-centric documents which include highly structured information, such as patient records, and only use XML for data transfer.

Native XML databases store XML documents as a whole, instead of separating out the data within them, and are designed to store semi-structured information, such as marketing brochures or health data. XML documents that contain semi-structured data are referred to as document-centric. A native XML database does not conform to a certain physical storage model, being able to use relational, hierarchical, or object-oriented structures as well as custom storage formats. It manages documents by grouping them into logical collections, and can set up and manage multiple collections simultaneously. This type of database permits the user to store any type of XML document, regardless of structure, within the same collection. Queries can be constructed across the whole collection, generally making data organization and manipulation more flexible.

An XML database uses a special programming language designed specifically to extract and manipulate XML documents, known as XQuery. The purpose of XQuery is to allow the construction of flexible queries that can extract and manipulate information from XML documents, as well as other sources that can be translated into XML. Some applications in which XQuery can be used include searching text documents on the Web for relevant data and compiling the results, extracting data from databases to be used in application integration, and generating reports on the data contained in an XML database.

XML databases are often employed by organizations that must manage complex and varied content, allowing them to process and reuse the data efficiently for various business objectives. The flexibility of XML documents and databases enables organizations to store and manipulate data across diverse software platforms and environments. Documents can be created and managed so that the same information may be used in different projects, such as manuals or product catalogs, as well as providing multiple output formats to conform to varied end-user requirements.