XML Legacy Migration
Second-Generation Legacy to Web Strategies via XML
Part 3: Disciplined XML

by
Don Estes


3.4 Schema Definition Schema

One of the major differences between a DTD and a schema is that a schema is itself an XML document, whereas a DTD uses a non-XML syntax. Like any other XML document, a schema can therefore be validated against a schema of its own: the schema definition schema. (For mathematical completeness, the schema definition schema must validate against itself.)

Two schema definition schemas are in common use at this time, one from W3C and one from Microsoft. If you write a schema to which your XML documents must conform, you may write it using the W3C definition, the Microsoft definition, or some new schema definition that may be published at any time. In principle, you could write your own schema definition schema, but in practice we think this is unlikely to be an appropriate strategy. Some sites may choose to restrict some of the advanced syntax available in either schema definition in order to enforce the KIS (keep it simple) rule, and this may be an appropriate reason to modify one or the other schema definition for local use. However, we discourage this as well for new XML users, until experience shows what should and should not be used.

Consider the following COBOL record description:

COBOL Record Description
01  MASTER-RECORD.
    03  MASTER-REFERENCE-NO       PIC 9(6).
    03  MASTER-NAME               PIC X(50).
    03  MASTER-CURR-BALANCE       PIC S9(8)V9(2).
    03  MASTER-CURR-TRANS-DATE    PIC 9(8).

If we had a single data record stored in a file of this description, and if we defined appropriate data tags for these items, we could encode that file into this XML document:

XML Document
<?xml version="1.0"?>
<MASTER-FILE>
  <MASTER-RECORD>
    <REFERENCE-NO>000020</REFERENCE-NO>
    <NAME>JOHN SMITH</NAME>
    <CURR-BALANCE>+123.45</CURR-BALANCE>
    <CURR-TRANS-DATE>2000-08-26</CURR-TRANS-DATE>
  </MASTER-RECORD>
</MASTER-FILE>
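
The encoding step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the conversion method used by any particular tool: the fixed-width layout, the `LAYOUT` table, and the function names are hypothetical, and we assume the signed balance is stored with a separate leading sign character (11 characters in all), which the record description itself does not specify.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical fixed-width layout: (XML tag, width in characters), in record order.
# ASSUMPTION: CURR-BALANCE is stored as a leading sign plus 10 digits
# (as with SIGN IS LEADING SEPARATE); the COBOL text does not say so.
LAYOUT = [
    ("REFERENCE-NO", 6),
    ("NAME", 50),
    ("CURR-BALANCE", 11),
    ("CURR-TRANS-DATE", 8),
]

def split_record(record):
    """Cut one fixed-width record into raw field strings, keyed by tag."""
    fields, pos = {}, 0
    for tag, width in LAYOUT:
        fields[tag] = record[pos:pos + width]
        pos += width
    return fields

def record_to_xml(record):
    """Build a MASTER-FILE element holding one MASTER-RECORD."""
    raw = split_record(record)
    sign, digits = raw["CURR-BALANCE"][0], raw["CURR-BALANCE"][1:]
    # V9(2) is an implied decimal point: shift the digits two places.
    balance = Decimal(int(digits)).scaleb(-2)
    ymd = raw["CURR-TRANS-DATE"]  # assumed to be YYYYMMDD
    values = {
        "REFERENCE-NO": raw["REFERENCE-NO"],
        "NAME": raw["NAME"].rstrip(),
        "CURR-BALANCE": f"{sign}{balance}",
        "CURR-TRANS-DATE": f"{ymd[:4]}-{ymd[4:6]}-{ymd[6:]}",
    }
    root = ET.Element("MASTER-FILE")
    rec = ET.SubElement(root, "MASTER-RECORD")
    for tag, _ in LAYOUT:
        ET.SubElement(rec, tag).text = values[tag]
    return root

if __name__ == "__main__":
    sample = "000020" + "JOHN SMITH".ljust(50) + "+0000012345" + "20000826"
    print(ET.tostring(record_to_xml(sample), encoding="unicode"))
```

A production conversion would also need to handle whatever sign and packing conventions the real file uses, along with multiple records per file.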

This document will validate [1] against the following schema, prepared using the W3C syntax:

W3C Syntax Schema
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/1999/XMLSchema">
  <xsd:element name="MASTER-FILE">
    <xsd:complexType content="elementOnly">
      <xsd:element name="MASTER-RECORD" type="MASTER-RECORDType"/>
    </xsd:complexType>
  </xsd:element>
  <xsd:complexType name="MASTER-RECORDType" content="elementOnly">
    <xsd:element name="REFERENCE-NO" type="xsd:positiveInteger"/>
    <xsd:element name="NAME" type="xsd:string"/>
    <xsd:element name="CURR-BALANCE" type="xsd:decimal"/>
    <xsd:element name="CURR-TRANS-DATE" type="xsd:date"/>
  </xsd:complexType>
</xsd:schema>
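
Because a schema is itself an XML document, an ordinary XML parser can read it like any other document. The following Python sketch illustrates the point by parsing the W3C-syntax schema above and listing the elements it declares. The function name `declared_elements` is ours, for illustration only, and the sketch reads the schema without validating anything against it.

```python
import xml.etree.ElementTree as ET

# Namespace used by the schema in the text.
XSD_NS = "http://www.w3.org/1999/XMLSchema"

SCHEMA_TEXT = """
<xsd:schema xmlns:xsd="http://www.w3.org/1999/XMLSchema">
  <xsd:element name="MASTER-FILE">
    <xsd:complexType content="elementOnly">
      <xsd:element name="MASTER-RECORD" type="MASTER-RECORDType"/>
    </xsd:complexType>
  </xsd:element>
  <xsd:complexType name="MASTER-RECORDType" content="elementOnly">
    <xsd:element name="REFERENCE-NO" type="xsd:positiveInteger"/>
    <xsd:element name="NAME" type="xsd:string"/>
    <xsd:element name="CURR-BALANCE" type="xsd:decimal"/>
    <xsd:element name="CURR-TRANS-DATE" type="xsd:date"/>
  </xsd:complexType>
</xsd:schema>
"""

def declared_elements(schema_text):
    """Return (name, type) pairs for every element declaration, in document order."""
    root = ET.fromstring(schema_text.strip())
    element_tag = f"{{{XSD_NS}}}element"
    return [(e.get("name"), e.get("type")) for e in root.iter(element_tag)]

if __name__ == "__main__":
    for name, type_name in declared_elements(SCHEMA_TEXT):
        print(name, type_name)
```

The same approach would not work on a DTD, whose non-XML syntax requires a dedicated parser.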

This document will also validate [2] against the following broadly equivalent schema, prepared using the Microsoft syntax:

Microsoft Syntax Schema
<?xml version="1.0"?>
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <ElementType name="REFERENCE-NO" content="textOnly" dt:type="string"/>
  <ElementType name="NAME" content="textOnly" dt:type="string"/>
  <ElementType name="CURR-BALANCE" content="textOnly" dt:type="float"/>
  <ElementType name="CURR-TRANS-DATE" content="textOnly" dt:type="date"/>
  <ElementType name="MASTER-RECORD" content="eltOnly">
    <element type="REFERENCE-NO"/>
    <element type="NAME"/>
    <element type="CURR-BALANCE"/>
    <element type="CURR-TRANS-DATE"/>
  </ElementType>
  <ElementType name="MASTER-FILE" content="eltOnly">
    <element type="MASTER-RECORD"/>
  </ElementType>
</Schema>
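
The two schemas above can be told apart by their root element and namespace alone. As a rough sketch, a site that wants to know which schema definition a given schema file follows could check this mechanically. The function name `schema_dialect` and the labels it returns are hypothetical. The 2001 namespace in the table is that of the final W3C XML Schema Recommendation, which does not appear in the examples above.

```python
import xml.etree.ElementTree as ET

# Root tags in ElementTree's "{namespace}localname" form, mapped to a label.
DIALECTS = {
    "{http://www.w3.org/1999/XMLSchema}schema": "w3c",
    # Namespace of the final W3C Recommendation; not used in the text's examples.
    "{http://www.w3.org/2001/XMLSchema}schema": "w3c",
    "{urn:schemas-microsoft-com:xml-data}Schema": "microsoft",
}

def schema_dialect(schema_text):
    """Classify a schema document by its root element; 'unknown' if unrecognized."""
    root = ET.fromstring(schema_text)
    return DIALECTS.get(root.tag, "unknown")

if __name__ == "__main__":
    w3c = '<xsd:schema xmlns:xsd="http://www.w3.org/1999/XMLSchema"/>'
    ms = '<Schema xmlns="urn:schemas-microsoft-com:xml-data"/>'
    print(schema_dialect(w3c), schema_dialect(ms))
```

A check of this kind only identifies the schema definition in use. It says nothing about whether the schema itself is valid.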

The "eXtensibility" in XML provides both its greatest benefit and its greatest potential weakness. The very flexibility that gives XML so much of its power to eliminate barriers to the exchange of data also provides the rope with which an organization can hang itself. It's not hard to imagine the confusion that would result from multiple groups going their own way: one using W3C, another using Microsoft, a third using a home-grown schema definition, and a fourth avoiding schemas altogether. There therefore needs to be a central group that establishes global standards to which all groups must adhere, and these standards need to be in place as early as possible, preferably before any significant use of XML. At the same time, care must be taken to avoid establishing an XML Gestapo that inhibits productive use of XML.

While a large IT organization may find it helpful to have someone on staff with an in-depth knowledge of XML, many XML experts are computer science graduates whose understanding of XML is strongly theoretical. These gurus can be very productive if assigned to implementation groups where they work directly with data definitions. However, establishing an ivory-tower group of several gurus at some distance from practical data usage may not return the desired results. People who know the data in depth should be well represented in the standards-setting group, as a helpful balance to those with in-depth theoretical knowledge.

[1] Using Oracle's XML Schema processor for Java, Version 1.0.0, released 7/28/2000, from http://technet.oracle.com/tech/xml/xdk_java.html.

[2] Using MSXML 3.0, released 7/31/2000, from http://msdn.microsoft.com/xml/default.asp.


Forecross is a registered trademark of Forecross Corporation.
Copyright © 1996-2008 Forecross Corporation
All Rights Reserved.