Second-Generation Legacy to Web Strategies via XML
Part 3: Disciplined XML

by
Don Estes


3.5 XML Validation Over The Life Cycle

XML usage will evolve with the applications that use it over their life cycle. Let's consider that usage in more detail. Some programs will encode and decode complete files or database tables as a batch task; these documents will persist for relatively long periods of time. Other programs will encode or decode documents in a real-time data exchange mode, with applications on the same platform and on different platforms, so that the documents exist only as a byte stream in memory, and only for a few milliseconds. Data elements in programs involved in XML data exchange need to be matched to the global dictionary of valid data tags. Document encoding and decoding routines have to be written and maintained. Schemas have to be developed and maintained within the scope of the relevant schema definition schema. Finally, provision must be made to ensure that, as any of these elements change over time, all remain in concert and that, if any elements do get out of coordination, the discrepancies are detected and corrected before any harm can occur.
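As a minimal sketch of matching data elements to a global dictionary of valid tags, the following Python function reports any element names in a document that the dictionary does not contain. The dictionary contents and the name `unknown_tags` are hypothetical and chosen only for illustration; a real site would load the dictionary from its own repository.

```python
import xml.etree.ElementTree as ET

# Hypothetical global dictionary of valid data tags. A real site would load
# this from its own data dictionary rather than hard-code it.
GLOBAL_TAG_DICTIONARY = {"order", "customer_id", "order_date", "amount"}


def unknown_tags(xml_text, dictionary=GLOBAL_TAG_DICTIONARY):
    """Return a sorted list of element tags that are not in the dictionary."""
    root = ET.fromstring(xml_text)
    # root.iter() visits the root element and every descendant.
    tags_in_document = {element.tag for element in root.iter()}
    return sorted(tags_in_document - set(dictionary))
```

Running a check of this kind whenever an encoding routine or the dictionary changes is one way to notice that the two have drifted apart before the documents reach another application.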

The most important capability of XML with regard to disciplined use is the schema validation procedure. XML documents that persist on disk for extended periods of time can be validated against their respective schemas as a batch process. However, XML documents that are transient in nature also require validation. Therefore, programs that encode and decode these documents must either include a mechanism to capture the transient documents and write them to a persistent disk file for subsequent validation, or call a real-time validation engine as needed. This validation must be present during all testing of new program versions before they enter production. In addition, some percentage of production XML documents, up to 100% if possible, should be validated. Error handling logic must be defined for documents that fail validation, particularly as fields become strongly typed and data validity tests are centralized into XML schemas.
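The capture mechanism for transient documents might look something like the sketch below, which writes a sampled fraction of in-memory documents to a directory so that a later batch job can validate them. The class name `TransientCapture`, the file naming scheme, and the `sample_rate` parameter are assumptions made for illustration; the validation step itself is left to whatever schema validator the site uses.

```python
import random
from pathlib import Path


class TransientCapture:
    """Writes a sampled fraction of in-memory XML documents to disk."""

    def __init__(self, directory, sample_rate=1.0, rng=None):
        if not 0.0 <= sample_rate <= 1.0:
            raise ValueError("sample_rate must be between 0.0 and 1.0")
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.sample_rate = sample_rate
        self.rng = rng or random.Random()
        self.count = 0

    def offer(self, xml_bytes):
        """Persist the document if it is sampled; return its path or None."""
        # random() is in [0.0, 1.0), so a rate of 1.0 captures every document
        # and a rate of 0.0 captures none.
        if self.rng.random() >= self.sample_rate:
            return None
        self.count += 1
        path = self.directory / f"doc-{self.count:08d}.xml"
        path.write_bytes(xml_bytes)
        return path
```

During testing the sample rate would presumably stay at 1.0 so that every document is checked, while in production it could be lowered if validating every document costs too much.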

In addition, some type of versioning procedure is needed to ensure that the validation is occurring as expected. Depending on the site configuration, validation may be performed by a universal routine or by a routine generated with customized logic for a particular schema. The latter will be much faster, but it can fall out of synchronization with revisions to the schema. In both cases, updates to the schema definition schema will require revalidation of the validators.

XML schemas are a passive validation device, since validation occurs only when requested and can be bypassed. Most database systems have active validation, such that discrepancies between the current version of the database and older versions of programs using the database are detected at run time, and defined error procedures are executed. We recommend that XML validation operate like active database validation. This can be implemented by a modest extension to the schema definition schema that creates a version or time stamp attribute for each schema, defined to match an equivalent attribute in each XML document. Each custom validation program will also have the version or time stamp compiled into its object code, so that a mismatch will be reported as an immediate error. In this way, normal XML facilities can be used to provide the active validation that is missing from the XML specification in the name of providing the greatest possible flexibility.
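The version check described above can be sketched in Python as follows. The attribute name `schemaVersion`, the stamp value, and the exception class are hypothetical; the constant stands in for the version or time stamp that would be compiled into a generated validator.

```python
import xml.etree.ElementTree as ET

# Version stamp fixed when this validator was generated from its schema.
# Both the value and the attribute name are hypothetical.
COMPILED_SCHEMA_VERSION = "2001-03-15T00:00:00"
VERSION_ATTRIBUTE = "schemaVersion"


class SchemaVersionMismatch(Exception):
    """Raised when a document and its validator disagree on schema version."""


def check_version(xml_text):
    """Parse the document and fail at once if its version stamp differs."""
    root = ET.fromstring(xml_text)
    found = root.get(VERSION_ATTRIBUTE)  # None if the attribute is absent
    if found != COMPILED_SCHEMA_VERSION:
        raise SchemaVersionMismatch(
            f"expected {COMPILED_SCHEMA_VERSION!r}, found {found!r}"
        )
    return root
```

Because a missing attribute is also treated as a mismatch, a document produced by a program that predates the versioning scheme is rejected rather than silently accepted, which is the behavior the comparison with active database validation suggests.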


Forecross is a registered trademark of Forecross Corporation.
Copyright © 1996-2008 Forecross Corporation
All Rights Reserved.