Nov. 21, 2010, 7:22 a.m.
posted by pitbull
The Quest for Universal Data Access

Given that much of the world's data is stored in a huge variety of data sources, such as HTML pages, file systems, e-mail, documents, spreadsheets, and project files, there has long been a desire for a universal mechanism to access it. Typically this has been achieved by defining a "universal" API for accessing the data regardless of where it lives. Such an API means developers have to learn only a single interface, hides the native storage of the data source, and allows tools and services to evolve independently.

In the Windows DNA era, the universal data access mechanism was OLE-DB; however, it never completely succeeded in this role, primarily because it was a Microsoft technology rather than an industry-agreed standard, and because it was based on COM, which was available only on Microsoft Windows platforms. Writing an OLE-DB provider was also far from trivial.

Then XML came onto the scene and changed everything. Now XML is the universal data access mechanism, and the world of data and data interchange can be regarded as XML documents. The primary driving factors for the growth of XML were its status as an industry-agreed standard, its language and API neutrality, and (best of all) its human-readability and ease of manipulation. These factors led initially to the use of XML parsers as the data access API: data stores wrote their data out in the XML 1.0 format, and consumers then used their favorite XML parser to read the generated XML. Since many parsers were available, virtually any platform could consume this data. Figure 1 illustrates this concept of rewriting the data in the data store to the XML 1.0 format and then accessing it from the application with an XML parser.

Figure 1. Converting to XML 1.0 to allow data interchange
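The round trip described above can be sketched in a few lines. This is an illustration in Python rather than .NET, and everything in it is an assumption made for the example: the ROWS data, the element names, and the two helper functions are hypothetical. The producer writes the store out as XML 1.0 text with one library, and the consumer reads that text with a different parser, which is the point of the interchange format.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical non-XML data store: rows from a project-tracking spreadsheet.
ROWS = [
    {"id": "1", "task": "Write provider", "owner": "pat"},
    {"id": "2", "task": "Review schema", "owner": "lee"},
]


def export_to_xml(rows):
    """Producer side: serialize every row of the store as XML 1.0 text."""
    root = ET.Element("tasks")
    for row in rows:
        item = ET.SubElement(root, "task", id=row["id"])
        ET.SubElement(item, "title").text = row["task"]
        ET.SubElement(item, "owner").text = row["owner"]
    return ET.tostring(root, encoding="unicode")


def read_owners(xml_text):
    """Consumer side: a different parser reads the same text."""
    doc = minidom.parseString(xml_text)
    return [node.firstChild.data for node in doc.getElementsByTagName("owner")]


xml_text = export_to_xml(ROWS)
owners = read_owners(xml_text)
```

Note that the whole store is copied into a string before anyone can read it, which is the cost the next section is concerned with.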
However, mapping large stores of data to XML 1.0 is not feasible, and implementers typically wanted to expose the data not as XML 1.0 text but as an XML data model, through some API that could be consumed directly. This need drove the requirement for XML providers.

XML Providers

To address the growing need for a better XML data access story, System.Xml version 1.x introduced the XML provider model, which translates the data store's interface API into an XML API. This is effectively a mapping between one API and another. It allows non-XML data stores to be exposed as "logical" XML documents: no angle brackets are ever produced when the data is accessed. Figure 2 illustrates this concept of mapping the store's data access API to an XML API without writing the document as XML 1.0.

Figure 2. Exposing a data store as a logical XML document via an XML provider API
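The idea of a logical XML document can be sketched without any XML text at all. The following Python sketch is an illustration only and is not the System.Xml API: the STORE dictionary, the read_events function, and the event names "start", "text", and "end" are all hypothetical. It walks a nested dictionary that stands in for a file system and yields element-style events on demand, roughly the way a pull-based reader hands nodes to its caller.

```python
from typing import Iterator, Tuple

# Hypothetical data store: a directory tree held as nested dicts.
# A dict value is a folder; a str value is the content of a file.
STORE = {
    "docs": {"readme.txt": "hello", "notes.txt": "todo"},
    "src": {"main.py": "print(1)"},
}

Event = Tuple[str, str]


def read_events(name: str, node) -> Iterator[Event]:
    """Lazily yield XML-style events for one node of the store.

    Nothing is serialized: the caller sees elements and text,
    but no angle brackets are ever generated.
    """
    yield ("start", name)
    if isinstance(node, dict):
        for child_name, child in node.items():
            yield from read_events(child_name, child)
    else:
        yield ("text", node)
    yield ("end", name)


events = list(read_events("root", STORE))
```

Because read_events is a generator, a consumer that stops early never touches the rest of the store, which is what makes this kind of mapping workable for large data sources.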
Once data has been exposed via a specific XML API, XML services can be layered on top to provide additional XML processing. These services can themselves be XML APIs, which creates a "pluggable" architecture that can draw on other XML technologies, such as XSLT 1.0 and XQuery. Figure 3 illustrates this layering.

Figure 3. Layering XML technologies onto an XML provider
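A minimal sketch of this layering, again in Python and again under assumptions: rows_provider, settings_provider, and select_text are hypothetical names, and the slash-separated path is a toy stand-in for a real query language such as XPath. Two unrelated stores are exposed through the same event interface, and one service is written once against that interface.

```python
def rows_provider(rows):
    """Hypothetical provider: expose table rows as element events."""
    yield ("start", "rows")
    for row in rows:
        yield ("start", "row")
        for column, value in row.items():
            yield ("start", column)
            yield ("text", str(value))
            yield ("end", column)
        yield ("end", "row")
    yield ("end", "rows")


def settings_provider(settings):
    """Hypothetical provider: expose a flat key/value store the same way."""
    yield ("start", "settings")
    for key, value in settings.items():
        yield ("start", key)
        yield ("text", str(value))
        yield ("end", key)
    yield ("end", "settings")


def select_text(events, path):
    """Layered service: collect the text of every element at a slash path.

    It depends only on the event interface, never on the store behind it.
    """
    target = path.split("/")
    stack, found = [], []
    for kind, value in events:
        if kind == "start":
            stack.append(value)
        elif kind == "end":
            stack.pop()
        elif kind == "text" and stack == target:
            found.append(value)
    return found


ROWS = [{"task": "Write provider", "owner": "pat"},
        {"task": "Review schema", "owner": "lee"}]

owners = select_text(rows_provider(ROWS), "rows/row/owner")
theme = select_text(settings_provider({"theme": "dark"}), "settings/theme")
```

Adding a third store means writing one more provider; select_text, and any other service built on the same events, works over it unchanged.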
System.Xml version 1.x makes it easy to write your own custom XML providers that can be plugged into the overall architecture. These providers are either stream based or random access based. The streaming provider APIs are the XmlReader for reading and the XmlWriter for writing. The random access API is the XPathNavigator.

It is worth noting that the W3C DOM (represented by the XmlDocument class) could be considered a provider, but it suffers from the major limitation that its API is large, and hence neither realistic nor easy to implement. Significantly, it also forces the implementer to build a tree of nodes, which duplicates the data already held in the data store. The DOM API is therefore a very poor XML provider; the other APIs in System.Xml version 1.x do not suffer from these limitations.

If you implement the XmlReader API over a data store, you have a streaming, pull-based API for reading XML. If you implement the XmlWriter API, you can push XML into the data store in a very efficient, noncaching manner. If you implement an XPathNavigator provider API over a data store (e.g., a file system, the CLR metadata in a .NET assembly, or any CLR object graph), you can perform random access data retrieval as well as XPath 1.0 queries and XSLT transformations over that data store.

The advent of XML data providers has started to elevate XML from being just the interchange format on the wire to also being a universal data access mechanism for retrieving data from, and writing updates back to, different and disparate data sources. XQuery will accelerate this process further, as it provides a common data query and aggregation language.

Integration with ADO.NET

ADO.NET already has deeply integrated support for XML, as shown by the DataSet class's ability to read and write XML and to persist its relational table structure as W3C XML schemas. The DataSet has a fixed mapping whereby XML elements and attributes are loaded into tables and columns with the same names.
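The flavor of such a fixed, name-based mapping can be shown with a rough sketch. This is a Python approximation for illustration, not the DataSet class and not its actual inference rules: load_tables and the sample XML are hypothetical. Each child element of the root becomes a row in a table named after the element, and its attributes and simple child elements become columns with the same names.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document.
XML_TEXT = """<orders>
  <order id="1"><customer>pat</customer><total>9.50</total></order>
  <order id="2"><customer>lee</customer><total>4.00</total></order>
</orders>"""


def load_tables(xml_text):
    """Load XML into {table name: [row dicts]} using element names only.

    Assumption: every child of the root is a row, and each row holds
    only attributes and simple (text-only) child elements.
    """
    tables = defaultdict(list)
    for element in ET.fromstring(xml_text):
        row = dict(element.attrib)        # attributes become columns
        for child in element:
            row[child.tag] = child.text   # child elements become columns
        tables[element.tag].append(row)   # element name becomes table name
    return dict(tables)


tables = load_tables(XML_TEXT)
```

The mapping is fixed in the sense that the shape of the tables is dictated entirely by the names in the XML; there is no way in this sketch to route an element to a differently named table or column.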
Also, through the ExecuteXmlReader method on the SqlCommand class (in the System.Data.SqlClient namespace), you can issue a FOR XML SQL command against SQL Server and have the XML returned to the client, relying on the server's own support for generating XML. Much of the new work in System.Xml version 2.0 is about deepening this integration between ADO.NET and XML via the XmlAdapter class and XQuery.


